an eddy in the bitstream

Day: September 19, 2008

Swish3 Status 19 Sept 2008

A long hiatus for a full summer and then some contract work.

Some benchmarks of the latest tokenization algorithm show that pure-ASCII tokenization is about 20% faster and UTF-8 tokenization is about 2% slower. So I’ll take it.

The benchmark was performed by using perl/ to generate 100 random “docs” in both encodings (ASCII and random UTF-8), which were then timed using swish_lint with and without the -t option.
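
For illustration, here is a rough Python sketch of that kind of doc generator. It is not the actual script under perl/; the function names, word counts, file names and the sample of non-ASCII letters are all assumptions made up for the example.

```python
import os
import random

ASCII_LETTERS = "abcdefghijklmnopqrstuvwxyz"
# Hypothetical sample of multi-byte letters (Latin-1, Greek, Cyrillic, CJK).
UTF8_LETTERS = ASCII_LETTERS + "éñüßλπжя語文"


def random_doc(alphabet, n_words=500, rng=random):
    """Return one random 'doc': space-separated words of 1 to 12 letters."""
    words = []
    for _ in range(n_words):
        length = rng.randint(1, 12)
        words.append("".join(rng.choice(alphabet) for _ in range(length)))
    return " ".join(words)


def write_docs(directory, alphabet, count=100, seed=0):
    """Write `count` random docs as UTF-8 files and return their paths."""
    rng = random.Random(seed)  # seeded so a benchmark run is repeatable
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"doc{i:03d}.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(random_doc(alphabet, rng=rng))
        paths.append(path)
    return paths
```

Calling `write_docs("ascii", ASCII_LETTERS)` and `write_docs("utf8", UTF8_LETTERS)` would give two sets of 100 files, one per encoding, that can then be fed to the tool being timed.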

Also recently fixed some failing tests on Linux and a memory warning.


Swish3 has a lot more tests than Swish-e (not counting Josh Rabinowitz’s excellent testing package). And as of tonight all Swish3 tests are passing under both Linux 2.6 (CentOS 5) and OS X 10.4. \o/

