I am happy to announce the 1.0.0 release of libswish3.
libswish3 is at the core of multiple Swish3 implementations, and has reached a stable enough API that a 1.0.0 release seems appropriate.
From the README:
There are currently four different implementations available of Swish3.
- swish_xapian (C++ using libxapian, included in libswish3 distribution)
- SWISH::Prog::Xapian (Perl using Search::Xapian)
- SWISH::Prog::Lucy (Perl using Apache Lucy)
- SWISH::Prog::KSx (Perl using KinoSearch)
All the Perl implementations are available from CPAN. They each rely on SWISH::3 (the Perl bindings to libswish3) and the core SWISH::Prog project, a Perl rewrite of the swish-e 2.x C binary and accompanying helper scripts. The SWISH::Prog distribution includes a ‘swish3’ command line interface with options very similar to the swish-e 2.x command line tool.
Xapian, KinoSearch and Apache Lucy all offer robust UTF-8 and incremental indexing support, as well as the ability to scale to many millions of documents across multiple servers.
You can read more about Swish3 at the devel site.
UPDATE: Mailing list announcement here.
Just uploaded several modules to CPAN that together implement a full REST API for KinoSearch indexes, using Search::OpenSearch::Server::Plack.
The modules are:
- Search::OpenSearch 0.11
- Search::OpenSearch::Server 0.05
- Search::OpenSearch::Engine::KSx 0.08
- SWISH::Prog::KSx 0.17
- SWISH::Prog 0.49
One of the three virtues of programming is Laziness. Beware of false laziness. Andy Lester writes aptly about the problem when he describes an interaction with another programmer:
Excuse me while I get up and stretch.
I can vouch for the writer’s experience, though for me it has been less about back pain (though I have that too) than eye strain (going on 7 years now). Biggest of all though has been having children and working from home: that is the interruption formula in a nutshell.
SWISH::3 0.08_04 is passing all tests all over the CPAN testers universe, so that is encouraging.
However, some reports (notably on FreeBSD) show false failures because of a Wstat issue.
I’ve posted about it at PerlMonks and hope someone out there has an easy fix.
Update: finally found a fix for this. The problem is that Perl has its own my_setenv() function that interferes with the native setenv() called by libswish3.c. The fix was to set the magic Perl var PL_use_safe_putenv as shown here. This took many hours of googling to track down. Glad to be done with it (I hope!).
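For readers who have not hit this before, here is a minimal C sketch of the kind of native setenv() call a library might make. The helper names and the variable name in the usage note are hypothetical and are not taken from libswish3.c. The PL_use_safe_putenv assignment appears only as a comment, because it needs the Perl headers to compile, and both its placement and its value are assumptions.

```c
/* Ask for POSIX declarations such as setenv() even under strict -std modes. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not copied from libswish3.c: give an environment
 * variable a default value through the C library's own setenv().
 * Returns 0 on success and -1 on failure, like setenv() itself.
 */
int set_env_default(const char *name, const char *value)
{
    /* Third argument 0: keep any value that is already set. */
    return setenv(name, value, 0);
}

/* Returns 1 if the variable is set to exactly `expected`, otherwise 0. */
int env_equals(const char *name, const char *expected)
{
    const char *actual = getenv(name);
    return actual != NULL && strcmp(actual, expected) == 0;
}

/*
 * In an XS module wrapping such a library, the workaround would be a line
 * like the following, run before the library touches the environment
 * (assumed placement and value; it only compiles against the Perl headers):
 *
 *     PL_use_safe_putenv = 1;
 */
```

Calling set_env_default("SWISH3_DEMO_VAR", "first") and then env_equals("SWISH3_DEMO_VAR", "first") shows the value round-tripping through the C library. It is that native path that Perl's own environment handling can step on when the flag is not set.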
There’s been a ton of work on Swish3 in the last year. I’ve actually started planning a 1.0 release, after 5 years of work.
Lately I’ve been focusing on three things: (1) making the Perl bindings easier to install; (2) indexing of compressed documents; and (3) supporting XInclude of document fragments. The first is accomplished: you can install the entire library via CPAN. The last two are aimed at large doc sets where I want to keep the XML compressed on disk for space reasons, and where I want to re-use subsets of the document collections in building multiple indexes.