One of the things I like most about using other IR libraries as backends is that many of them offer bindings for several other languages. So PHP, Ruby, Python, etc., users get search capability right away.
Of course, if the indexing program is written in Perl instead of C, that adds Perl as a requirement. But hopefully, if the indexing API is well documented, there’s nothing to stop implementations in scripting languages other than Perl.
Take Xapian, for example. It has bindings available in nearly every major scripting language. So if you don’t like the way Swish-e implements the indexing scheme, there’s nothing to stop you from writing your own in your favorite language. At which point, you’re not really using Swish-e any more. But you could mix and match depending on your needs. Use Swish-e’s spider and SWISH::Filter, but your own parser and indexer, for example.
The general idea right now is to get the core C libraries functional, at least for SwishParser and SwishConfig. Then start working on a replacement for the “swish-e” command-line program. I intend to write the replacement in Perl, since that will be much easier to write and performance should only see a small hit from startup costs. I’ll use SWISH::Prog to handle the basic spider/fs stuff, as well as config parsing.
Funny: I don’t think I had that in mind when I originally started SWISH::Prog, but it now seems like a totally obvious fit.
SWISH::Prog::Config just underwent some major surgery. It can now parse version 2 config files using the excellent Config::General, and can convert them to the current SwishConfig XML format.
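To give a rough idea of what that kind of conversion involves, here is a minimal sketch in Python (not the actual Perl code in SWISH::Prog::Config). It assumes the simple "Directive value value ..." line style of version 2 config files. The XML layout, including the `swish` root element, is purely hypothetical and is not the real SwishConfig schema. The function names `parse_v2_config` and `to_xml` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample input in the version 2 "Directive value ..." style.
SAMPLE = """\
# where to write the index
IndexFile index.swish-e
MetaNames title author
"""


def parse_v2_config(text):
    """Parse directive lines into a dict mapping directive -> list of values.

    Simplification: values are split on whitespace, so quoted values that
    contain spaces are not handled here.
    """
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        name = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        # A directive may appear more than once; accumulate its values.
        config.setdefault(name, []).extend(rest.split())
    return config


def to_xml(config):
    """Serialize the parsed config as XML (hypothetical layout, one element per value)."""
    root = ET.Element("swish")  # assumed root name, not the real schema
    for name, values in config.items():
        for value in values:
            ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_xml(parse_v2_config(SAMPLE)))
```

The real module leans on Config::General for the parsing step, which handles quoting, includes, and block syntax that this sketch ignores.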
I’ll probably start with a Xapian backend since that’s fairly stable (though UTF-8 support won’t be official until 1.0). I still need to write the SWISH::Index and SWISH::Search APIs (though the latter will likely look just like SWISH::API).
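The shape of a swappable backend can be sketched roughly like this. It is only an illustration in Python, not the SWISH::Index or SWISH::Search design and not Xapian's API. The `Backend` and `MemoryBackend` names are hypothetical, and a toy in-memory inverted index stands in for a real library such as Xapian.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class Backend(ABC):
    """Hypothetical interface that each IR library wrapper would implement."""

    @abstractmethod
    def add(self, doc_id, text):
        """Index one document."""

    @abstractmethod
    def search(self, query):
        """Return the ids of documents that match every term in the query."""


class MemoryBackend(Backend):
    """Toy stand-in for a real backend: a dict-based inverted index."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of doc ids

    def add(self, doc_id, text):
        # Naive tokenizer: lowercase and split on whitespace.
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        terms = query.lower().split()
        if not terms:
            return []
        # AND semantics: intersect the posting sets of all query terms.
        hits = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self._postings.get(term, set())
        return sorted(hits)


if __name__ == "__main__":
    index = MemoryBackend()
    index.add("a.html", "Swish-e indexing with Perl")
    index.add("b.html", "Xapian indexing")
    print(index.search("perl indexing"))
```

The point of the split is that the spider, filters, and parser only ever talk to the interface, so swapping one IR library for another means writing one new subclass.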
Everything in due time.