<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>

<feed xmlns="http://purl.org/atom/ns#" version="0.3" xml:lang="en-US">
<link href="https://www.blogger.com/atom/6674541" rel="service.post" title="Search Journal" type="application/atom+xml"/>
<link href="https://www.blogger.com/atom/6674541" rel="service.feed" title="Search Journal" type="application/atom+xml"/>
<title mode="escaped" type="text/html">Search Journal</title>
<tagline mode="escaped" type="text/html">Searchmarking: Experiments with searching, personal search engines, and trusted search communities.</tagline>
<link href="http://www.esoterraka.com/searchmarking/blog/index.html" rel="alternate" title="Search Journal" type="text/html"/>
<id>tag:blogger.com,1999:blog-6674541</id>
<modified>2006-04-24T00:29:18Z</modified>
<generator url="http://www.blogger.com/" version="6.72">Blogger</generator>
<info mode="xml" type="text/html">
<div xmlns="http://www.w3.org/1999/xhtml">This is an Atom formatted XML site feed. It is intended to be viewed in a Newsreader or syndicated to another site. Please visit the <a href="http://help.blogger.com/bin/answer.py?answer=697">Blogger Help</a> for more info.</div>
</info>
<convertLineBreaks xmlns="http://www.blogger.com/atom/ns#">false</convertLineBreaks>
<entry xmlns="http://purl.org/atom/ns#">
<link href="https://www.blogger.com/atom/6674541/112605782859082857" rel="service.edit" title="Index Rebuilt" type="application/atom+xml"/>
<author>
<name>jrb</name>
</author>
<issued>2005-09-06T20:50:00-05:00</issued>
<modified>2005-09-07T01:50:28Z</modified>
<created>2005-09-07T01:50:28Z</created>
<link href="http://www.esoterraka.com/searchmarking/blog/archive/2005_09_01_index.html#112605782859082857" rel="alternate" title="Index Rebuilt" type="text/html"/>
<id>tag:blogger.com,1999:blog-6674541.post-112605782859082857</id>
<title mode="escaped" type="text/html">Index Rebuilt</title>
<content type="application/xhtml+xml" xml:base="http://www.esoterraka.com/searchmarking/blog/index.html" xml:space="preserve">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Took a moment tonight to rebuild the index. It was 15,024 <abbr title="kilobytes">KB</abbr> with 2,307 resources, and is now 14,654 <abbr title="kilobytes">KB</abbr> with 2,266 resources.</p> 

<p>Of some fascination for me is the fact that several resources have now used <abbr title="Robots Exclusion Protocol">REP</abbr> to specifically deny access to my searchmarking crawler. Again I'm left contemplating the ins and outs of <a href="http://www.esoterraka.com/searchmarking/blog/archive/2004_05_01_index.html#108583820386599200">obeying the Robots Exclusion Protocol</a>.</p>
</div>
</content>
<draft xmlns="http://purl.org/atom-blog/ns#">false</draft>
</entry>
<entry xmlns="http://purl.org/atom/ns#">
<link href="https://www.blogger.com/atom/6674541/112604681568286413" rel="service.edit" title="CiteSeer gets $1.2-million grant" type="application/atom+xml"/>
<author>
<name>jrb</name>
</author>
<issued>2005-08-29T18:38:00-05:00</issued>
<modified>2005-09-06T22:51:49Z</modified>
<created>2005-09-06T22:46:55Z</created>
<link href="http://www.esoterraka.com/searchmarking/blog/archive/2005_08_01_index.html#112604681568286413" rel="alternate" title="CiteSeer gets $1.2-million grant" type="text/html"/>
<id>tag:blogger.com,1999:blog-6674541.post-112604681568286413</id>
<title mode="escaped" type="text/html">CiteSeer gets $1.2-million grant</title>
<content type="application/xhtml+xml" xml:base="http://www.esoterraka.com/searchmarking/blog/index.html" xml:space="preserve">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>
<a class="offsite" href="http://live.psu.edu/story/13209" title="Offsite link to live.psu.edu">Penn State Live has announced</a> a $1.2-million grant to fix up the venerable <a class="offsite" href="http://citeseer.ist.psu.edu/" title="Offsite link to citeseer.ist.psu.edu">CiteSeer</a> Scientific Literature Digital Library!</p>

<blockquote cite="http://live.psu.edu/story/13209">
<p>Since its launch in 1997, CiteSeer has provided the public with access to more than 700,000 documents in computer and information sciences. The Next Generation CiteSeer will archive more documents, allow new types of searching, offer CiteSeer as a Web service, include personalized recommendations and searches, and permit synchronous live-object collaboration.<br/>
- <cite>
<a class="offsite" href="http://live.psu.edu/story/13209" title="Offsite link to live.psu.edu">http://live.psu.edu/story/13209</a>
</cite>
</p>
</blockquote>

<p>Hooray! Many thanks to my fine friend at <a class="offsite" href="http://www.fridgebuzz.com" title="Offsite link to fridgebuzz.com">fridgebuzz</a> for the heads-up.</p>
</div>
</content>
<draft xmlns="http://purl.org/atom-blog/ns#">false</draft>
</entry>
<entry xmlns="http://purl.org/atom/ns#">
<link href="https://www.blogger.com/atom/6674541/112166060646210289" rel="service.edit" title="What can't be found with Google" type="application/atom+xml"/>
<author>
<name>jrb</name>
</author>
<issued>2005-07-17T23:16:00-05:00</issued>
<modified>2005-07-18T04:24:35Z</modified>
<created>2005-07-18T04:23:26Z</created>
<link href="http://www.esoterraka.com/searchmarking/blog/archive/2005_07_01_index.html#112166060646210289" rel="alternate" title="What can't be found with Google" type="text/html"/>
<id>tag:blogger.com,1999:blog-6674541.post-112166060646210289</id>
<title mode="escaped" type="text/html">What can't be found with Google</title>
<content type="application/xhtml+xml" xml:base="http://www.esoterraka.com/searchmarking/blog/index.html" xml:space="preserve">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Stumbled across an interesting little site aimed at exposing what <em>can't</em> be found with Google: <a class="offsite" href="http://www.cantfindongoogle.com/">cantfindongoogle.com</a>. From the looks of it, this site could also be called "bad queries for zero results", but nonetheless it does define a few of the unguarded borders of Google's territory.</p>
</div>
</content>
<draft xmlns="http://purl.org/atom-blog/ns#">false</draft>
</entry>
<entry xmlns="http://purl.org/atom/ns#">
<link href="https://www.blogger.com/atom/6674541/111768371978190904" rel="service.edit" title="Index Rebuilt" type="application/atom+xml"/>
<author>
<name>jrb</name>
</author>
<issued>2005-06-01T22:40:00-05:00</issued>
<modified>2005-06-02T03:41:59Z</modified>
<created>2005-06-02T03:41:59Z</created>
<link href="http://www.esoterraka.com/searchmarking/blog/archive/2005_06_01_index.html#111768371978190904" rel="alternate" title="Index Rebuilt" type="text/html"/>
<id>tag:blogger.com,1999:blog-6674541.post-111768371978190904</id>
<title mode="escaped" type="text/html">Index Rebuilt</title>
<content type="application/xhtml+xml" xml:base="http://www.esoterraka.com/searchmarking/blog/index.html" xml:space="preserve">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Took a moment tonight to rebuild my searchmarking index. The index started out as 14,000 <abbr title="kilobytes">KB</abbr>, and contained 2,188 resources. Once complete the index weighed in at 13,770 <abbr title="kilobytes">KB</abbr>, now with 2,142 resources.</p>
<p>I'm very surprised to note that the index currently contains 13 fewer resources than it did at the <a href="/searchmarking/blog/archive/2004_06_01_index.html#108637369745487479">same time last year</a>! This despite the fact that I've searchmarked hundreds of resources over the past year. Once again I am reminded of the phenomenon described in "The decay and failures of web references" by Diomidis Spinellis (available <a class="offsite" href="http://www.dmst.aueb.gr/dds/pubs/jrnl/2003-CACM-URLcite/html/urlcite.html" title="Offsite link to aueb.gr">from aueb.gr</a> or <a class="offsite" href="http://portal.acm.org/citation.cfm?doid=602421.602422" title="Offsite link to an abstract on acm.org. Reading the published article will require digital library access.">from the <abbr title="Association for Computing Machinery">ACM</abbr> digital library</a>).</p>
</div>
</content>
<draft xmlns="http://purl.org/atom-blog/ns#">false</draft>
</entry>
<entry xmlns="http://purl.org/atom/ns#">
<link href="https://www.blogger.com/atom/6674541/111767744099076959" rel="service.edit" title="Teoma replacing Google" type="application/atom+xml"/>
<author>
<name>jrb</name>
</author>
<issued>2005-05-15T20:56:00-05:00</issued>
<modified>2005-06-10T04:38:40Z</modified>
<created>2005-06-02T01:57:20Z</created>
<link href="http://www.esoterraka.com/searchmarking/blog/archive/2005_05_01_index.html#111767744099076959" rel="alternate" title="Teoma replacing Google" type="text/html"/>
<id>tag:blogger.com,1999:blog-6674541.post-111767744099076959</id>
<title mode="escaped" type="text/html">Teoma replacing Google</title>
<content mode="escaped" type="text/html" xml:base="http://www.esoterraka.com/searchmarking/blog/index.html" xml:space="preserve">&lt;p&gt;Since the start of my &lt;a href="/searchmarking/blog/archive/2005_02_01_index.html#110783568794160570"&gt;Google Fast&lt;/a&gt; I've become increasingly enamored with the &lt;a class="offsite" href="http://www.teoma.com" title="Offsite link to teoma.com"&gt;Teoma search engine&lt;/a&gt;. So much so that, 3 months later, I'm finding myself with little reason to turn back to ye olde Google Search. Teoma results have proven every bit as comprehensive and comprehensible:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;search for &amp;quot;asynchronous javascript xmlhttp&amp;quot; : &lt;a class="offsite topicsearch" href="http://s.teoma.com/search?submit.x=0&amp;submit.y=0&amp;submit=Search&amp;q=asynchronous+javascript+xmlhttp&amp;qcat=1&amp;qsrc=1" title="Offsite search for: asynchronous javascript xmlhttp"&gt;Teoma&lt;/a&gt; | &lt;a class="offsite topicsearch" href="http://www.google.ca/search?hl=en&amp;safe=off&amp;q=asynchronous+javascript+xmlhttp&amp;btnG=Search&amp;meta=" title="Offsite search for: asynchronous javascript xmlhttp"&gt;Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;search for &amp;quot;semantic markup&amp;quot; : &lt;a class="offsite topicsearch" href="http://s.teoma.com/search?submit.x=0&amp;submit.y=0&amp;submit=Search&amp;q=semantic+markup&amp;qcat=1&amp;qsrc=1" title="Offsite search for: semantic markup"&gt;Teoma&lt;/a&gt; | &lt;a class="offsite topicsearch" href="http://www.google.ca/search?hl=en&amp;safe=off&amp;q=semantic+markup&amp;btnG=Search&amp;meta=" title="Offsite search for: semantic markup"&gt;Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;search for &amp;quot;standards compliant flash embedding&amp;quot; : &lt;a class="offsite topicsearch" href="http://s.teoma.com/search?q=standards+compliant+flash+embedding&amp;qcat=1&amp;qsrc=0&amp;Search.x=0&amp;Search.y=0&amp;Search=submit" title="Offsite search for: standards compliant flash embedding"&gt;Teoma&lt;/a&gt; | &lt;a class="offsite topicsearch" href="http://www.google.ca/search?hl=en&amp;safe=off&amp;q=standards+compliant+flash+embedding&amp;btnG=Search&amp;meta=" title="Offsite search for: standards compliant flash embedding"&gt;Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;search for &amp;quot;web usability accessibility&amp;quot; : &lt;a class="offsite topicsearch" href="http://s.teoma.com/search?submit.x=28&amp;submit.y=12&amp;submit=Search&amp;q=web+usability+accessibility&amp;qcat=1&amp;qsrc=1" title="Offsite search for: web usability accessibility"&gt;Teoma&lt;/a&gt; | &lt;a class="offsite topicsearch" href="http://www.google.ca/search?hl=en&amp;safe=off&amp;q=web+usability+accessibility&amp;btnG=Search&amp;meta=" title="Offsite search for: web usability accessibility"&gt;Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;search for &amp;quot;themaxx.com filepile.org&amp;quot; : &lt;a class="offsite topicsearch" href="http://s.teoma.com/search?submit.x=0&amp;submit.y=0&amp;submit=Search&amp;q=themaxx.com+filepile.org&amp;qcat=1&amp;qsrc=1" title="Offsite search for: themaxx.com filepile.org"&gt;Teoma&lt;/a&gt; | &lt;a class="offsite topicsearch" href="http://www.google.ca/search?hl=en&amp;safe=off&amp;q=themaxx.com+filepile.org&amp;btnG=Search&amp;meta=" title="Offsite search for: themaxx.com filepile.org"&gt;Google&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Further, Teoma offers &lt;a href="/searchmarking/blog/archive/2005_02_01_index.html#111587565928835807"&gt;clustering&lt;/a&gt;, a feature that's come in handy on several occasions.  No doubt Google will eventually introduce clustering in one form or another - in fact, one might argue that &lt;a class="offsite" href="http://labs.google.com/sets" title="Offsite link to labs.google.com"&gt;Google Sets&lt;/a&gt; are a token gesture to this technology, as was their acquisition of Applied Semantics and its Oingo engine. But until then Teoma has a badge of distinction in its favour.&lt;/p&gt;
&lt;p&gt;Google does offer some conveniences that Teoma lacks. The &lt;a class="offsite" href="http://www.google.com" title="Offsite link to google.com"&gt;ultra-simplicity of its interface&lt;/a&gt; is a prime example. Same goes for &lt;a class="offsite" href="http://www.google.com/sms/" title="Offsite link to google.com"&gt;Google &lt;abbr title="Short Message Service"&gt;SMS&lt;/abbr&gt;&lt;/a&gt;. And thus in some circumstances the venerable sage is still my top choice. But day to day I'm finding myself siding with Teoma.&lt;/p&gt;
&lt;p&gt;Experiment concluded, and so ends my Google Fast. Long live Teoma.&lt;/p&gt;</content>
<draft xmlns="http://purl.org/atom-blog/ns#">false</draft>
</entry>
</feed>
