Searchmarking with JRB

Search Journal



Index Rebuilt 

Took a moment tonight to rebuild the index. It was 15,024 KB with 2,307 resources, and is now 14,654 KB with 2,266 resources.

Of some fascination for me is the fact that several resources have now used REP to specifically deny access to my searchmarking crawler. Again I'm left contemplating the ins and outs of obeying the Robots Exclusion Protocol.
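For anyone curious what obeying REP looks like in practice, here is a minimal sketch using Python's standard urllib.robotparser module. The crawler name "searchmark-bot", the sample robots.txt, and the example.com URLs are all made up for illustration; they are not my crawler's actual user agent or any real site's rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that singles out one crawler by name
# and leaves everything except /private/ open to all other agents.
ROBOTS_TXT = """\
User-agent: searchmark-bot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/page.html"
    print(is_allowed(ROBOTS_TXT, "searchmark-bot", page))  # the named crawler
    print(is_allowed(ROBOTS_TXT, "otherbot", page))        # any other crawler
```

A polite crawler would run a check like this before every fetch and skip any resource where it comes back False, which is exactly how a searchmarked resource can drop out of the index at rebuild time.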

CiteSeer gets $1.2-million grant 

Penn State Live has announced a $1.2-million grant to fix up the venerable CiteSeer Scientific Literature Digital Library!

Since its launch in 1997, CiteSeer has provided the public with access to more than 700,000 documents in computer and information sciences. The Next Generation CiteSeer will archive more documents, allow new types of searching, offer CiteSeer as a Web service, include personalized recommendations and searches, and permit synchronous live-object collaboration.
- http://live.psu.edu/story/13209

Hooray! Many thanks to my fine friend at fridgebuzz for the heads-up.

What can't be found with Google 

Stumbled across an interesting little site aimed at exposing what can't be found with Google: cantfindongoogle.com. From the looks of it, this site could also be called "bad queries for zero results", but it nonetheless defines a few of the unguarded borders of Google's territory.

Index Rebuilt 

Took a moment tonight to rebuild my searchmarking index. The index started out as 14,000 KB, and contained 2,188 resources. Once complete the index weighed in at 13,770 KB, now with 2,142 resources.

I'm very surprised to note that the index currently contains 13 fewer resources than it did at the same time last year! This despite the fact that I've searchmarked hundreds of resources over the past year. Once again I am reminded of the phenomenon described in "The decay and failures of web references" by Diomidis Spinellis (available from aueb.gr or from the ACM digital library).

Teoma replacing Google 

Since the start of my Google Fast I've become increasingly enamored with the Teoma search engine. So much so that three months later I'm finding myself with little reason to turn back to ye olde Google Search. Teoma results have proven every bit as comprehensive and comprehensible.

Further, Teoma offers clustering, a feature that's come in handy on several occasions. No doubt Google will eventually introduce clustering in one form or another - in fact, one might argue that Google Sets are a token gesture to this technology, as was their acquisition of Applied Semantics and its Oingo engine. But until then Teoma has a badge of distinction in its favour.

Google does offer some conveniences that Teoma lacks. The ultra-simplicity of its interface is a prime example. Same goes for Google SMS. And thus in some circumstances the venerable sage is still my top choice. But day to day I'm finding myself siding with Teoma.

Experiment concluded, and so ends my Google Fast. Long live Teoma.
