Archives: October 2006

They’ve got my number

Those crazy random spam generators appear to have found my number: random sentence fragments about politics and technology seem to fool SpamAssassin and end up in my inbox. Here’s a good one:

My first thought was, its a video web portal, how different can it be? Rick Santorum, left, and Pennsylvania State Treasurer Bob Casey. Foley’s online conversations with teenage male pages, but have largely hunkered down as Republicans beat up each other amid accusations of a less-than-agressive initial response. But in this battle, the only thing left to see is boxing gloves. , figuring out just how bad the collateral damage is going to be for the GOP is akin to trying to guess if the roof is going to blow off of a house in the midst of a storm. Thank you to our event sponsors Backbone Media and BusinessWire. I knew this would be fun! It is a bit too early to say if it is a great service or not, but I have great hopes that it will work out for me.


One of the things I like most about using other IR libraries as backends is that many of them offer bindings for several languages. So PHP, Ruby, Python, etc. users can have search working right away.

Of course, if the indexing program is written in Perl instead of C, Perl itself becomes an added requirement. But hopefully, if the indexing API is well documented, there’s nothing to stop implementations in scripting languages other than Perl.

Take Xapian, for example. It has bindings available in nearly every major scripting language. So if you don’t like the way Swish-e implements the indexing scheme, there’s nothing to stop you from writing your own in your favorite language. At that point you’re not really using Swish-e any more, but you could mix and match depending on your needs: use Swish-e’s spider and SWISH::Filter with your own parser and indexer, for example.
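As a rough illustration of the mix-and-match idea, here is a minimal Python sketch in which each stage of an indexing pipeline is a plain callable, so a spider from one tool could in principle feed a parser and indexer of your own. Everything in it is hypothetical: `fetch_docs`, `simple_parser`, `build_index` and the in-memory inverted index are stand-ins for illustration only, not part of Swish-e or Xapian.

```python
from collections import defaultdict
import re

def fetch_docs():
    """Stand-in for a spider: yields (doc id, raw text) pairs."""
    yield "doc1", "Swish-e indexes documents"
    yield "doc2", "Xapian has bindings for many languages"

def simple_parser(text):
    """Stand-in for a custom parser: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs, parser):
    """Toy indexer: map each term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs:
        for term in parser(text):
            index[term].add(doc_id)
    return index

# Any stage can be swapped for another callable with the same shape.
index = build_index(fetch_docs(), simple_parser)
print(sorted(index["xapian"]))
```

Because `build_index` only assumes an iterable of documents and a function from text to terms, replacing the parser or the document source does not touch the other stages.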


The general idea right now is to get the core C libraries functional, at least for SwishParser and SwishConfig. Then start working on the “swish-e” command line program replacement. I intend to write the replacement in Perl, since that will be much easier to write and performance should only see a small hit from startup costs. I’ll use SWISH::Prog to handle the basic spider/fs stuff, as well as config parsing.

Funny: I don’t think I had that in mind when I originally started SWISH::Prog but it now seems like a totally obvious fit.

SWISH::Prog::Config just underwent some major surgery. It can now parse version 2 config files using the excellent Config::General, and can convert them to the current SwishConfig XML format.
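To give a feel for that kind of conversion, here is a minimal Python sketch that reads directive-per-line config text and emits XML. It is not how SWISH::Prog::Config works (that is Perl and uses Config::General), and the XML layout is an assumption made up for illustration: the real SwishConfig format is not shown here. The helper names `parse_v2_config` and `to_xml` and the `swish` root tag are hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_v2_config(text):
    """Parse 'Directive value ...' lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue
        # Repeated directives accumulate their values.
        config.setdefault(parts[0], []).extend(parts[1:])
    return config

def to_xml(config, root_tag="swish"):
    """Emit one child element per directive value (assumed layout)."""
    root = ET.Element(root_tag)
    for name, values in config.items():
        for value in values:
            ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

sample = """
# example version 2 style config
IndexDir /var/www
MetaNames title author
"""
print(to_xml(parse_v2_config(sample)))
```

The sketch ignores quoting, includes and anything else a real config parser has to deal with; it only shows the two steps of reading directives and writing them back out as elements.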

I’ll probably start with a Xapian backend since that’s fairly stable (though UTF-8 support is still not official till 1.0). Need to write SWISH::Index and SWISH::Search APIs (though the latter will likely look just like SWISH::API).

Everything in due time.