Day: February 28, 2006

The State of Search

Always searching for the latest in search engine development.

Some interesting work lately on CPAN.

I had not seen KinoSearch before. It is a very interesting “loose port” of Lucene, written in C and Perl. I need to try it out.

Search::QueryParser also looks promising. I was just thinking that such a thing would be helpful for SWISH::HiLiter and/or HTML::HiLiter … or maybe even Swish-e itself.

Search::ContextGraph is older, one of Maciej Ceglowski’s projects from back before he became a full-time painter and world traveller.

Search::Estraier is Dobrica Pavlinusic’s pure Perl implementation of the Hyper Estraier Perl API. I know Dobrica’s a big fan of Hyper Estraier, even over Swish-e and Xapian.

Search::FreeText appears to be abandoned, or at least not actively maintained: it was last updated in 2003. Too bad; the documentation makes it look interesting anyway. It does use DB_File, though, which I know from experience with Perlfect does not scale well beyond about 20K documents.

Search::Xapian has recently been updated. There’s lots of activity on the Xapian project. Xapian is at or near the top of my list of candidates for the Swish3 backends.

Search::InvertedIndex is one I had not seen before, but it looks very interesting. I have been thinking about a SwishQL, a SQL backend for Swish3, and Search::InvertedIndex offers a MySQL backend. Benjamin Franz wrote it; he also wrote CGI::Minimal, which I used for a while with CrayDoc (contributed a patch too, iirc). I’ll have to come back to this one.
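To make the SQL-backend idea a bit more concrete, here is a rough sketch of an inverted index kept in a single SQL table. It is emphatically not Search::InvertedIndex’s schema or API, and not anything that exists in Swish3. The table name postings and the functions build_index and search_all are made up for illustration, and it uses Python with an in-memory SQLite database rather than Perl with MySQL just so it runs on its own.

```python
import sqlite3


def build_index(docs):
    """Build a toy inverted index in an in-memory SQLite database.

    docs maps an integer document id to its text. The schema is a
    hypothetical one: one row per (term, document) pair plus a count.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE postings ("
        " term TEXT NOT NULL,"
        " doc_id INTEGER NOT NULL,"
        " freq INTEGER NOT NULL,"
        " PRIMARY KEY (term, doc_id))"
    )
    for doc_id, text in docs.items():
        # Naive tokenizer: lowercase and split on whitespace.
        counts = {}
        for term in text.lower().split():
            counts[term] = counts.get(term, 0) + 1
        conn.executemany(
            "INSERT INTO postings (term, doc_id, freq) VALUES (?, ?, ?)",
            [(term, doc_id, n) for term, n in counts.items()],
        )
    conn.commit()
    return conn


def search_all(conn, terms):
    """Return ids of documents containing every term (an AND query).

    Results are ordered by total term frequency, then by document id.
    """
    terms = sorted({t.lower() for t in terms})
    if not terms:
        return []
    placeholders = ",".join("?" * len(terms))
    rows = conn.execute(
        f"SELECT doc_id FROM postings WHERE term IN ({placeholders}) "
        "GROUP BY doc_id HAVING COUNT(*) = ? "
        "ORDER BY SUM(freq) DESC, doc_id",
        (*terms, len(terms)),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_index({
        1: "swish e indexes html",
        2: "xapian indexes text",
        3: "swish e and xapian",
    })
    print(search_all(conn, ["swish", "indexes"]))
```

The appeal of this shape is that the database handles storage, and an AND query becomes a GROUP BY with a HAVING clause. Whether it would hold up at real index sizes is exactly the kind of thing I would want to test.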

Search::Indexer. Another one I’d not heard of, but which bears investigation. Author Laurent Dami is a familiar name to me, as he also wrote Search::QueryParser (above) and a handy FormBuilder TT patch I’ve been using.

