I’ve been actively making noise on the swish-e discussion list for over a year now. Swish-e is a great open source indexing and searching tool. Love it. Loooove it. How’s that for Geek Love?
Part of the power of swish-e (the product is UPPERCASE, the command is lower, and I’m a lazy typist…) is in the libxml2 parser from the GNOME project. That thing flies. I’ve since started using the libxml2 tools in my other work as well.
Part of my work with swish-e has been improving the ranking algorithm. I found a wealth of info on that subject, thanks in part to the success of Google, which makes it easy to find information about what makes Google work so well. How’s that for the tail wagging the dog? Or something like that.
Anyway, this has led me down the road of natural language query and methods of relevance ranking. Pretty dense stuff. My wee brain starts to twist and shudder. But I found this a good start and this even more helpful.
I have an email in to the developer about the open source status of the NITLE Semantic Engine, which looks like a really interesting idea. The author wrote this article about vector ranking, which I found very lucid.