an eddy in the bitstream


Perl Modules I cannot live without

When I started using CPAN, one of the hardest things to learn was which of the many modules claiming to implement the same (or a similar) feature set is considered the best (stable, actively supported, secure) for the task. It has taken me lots of trial and error, and I am still gleaning best practices every day.

In no particular order, here are some of the CPAN modules I consider essential, by which I mean that nearly every script or project I develop uses many from this list. (Note that I do not include any of my own CPAN modules here.)

  • Carp (part of the standard Perl install)
  • Path::Class
  • Class::Accessor::Fast (or Object::Tiny)
  • File::Slurp
  • Data::Dump
  • Template::Toolkit
  • LWP
  • XML::LibXML
  • Rose::DB::Object (and DBI of course)
  • DateTime
  • Getopt::Long (for CLI scripts)
  • Pod::Usage (ditto)
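
To give a feel for how a few of these fit together, here is a minimal sketch of a command-line script skeleton built on Getopt::Long, Pod::Usage and Carp (all three ship with Perl). The option names (`--input`, `--verbose`) and the `parse_args` helper are made up for illustration; they are not taken from any particular project.

```perl
use strict;
use warnings;
use Carp;
use Getopt::Long qw(GetOptionsFromArray);
use Pod::Usage;

=head1 SYNOPSIS

  myscript.pl [--input FILE] [--verbose] [--help] [ARGS...]

=cut

# Hypothetical helper: parse options from a list rather than
# straight from @ARGV, so the logic can be exercised from tests.
sub parse_args {
    my @argv = @_;
    my %opt  = ( verbose => 0 );

    # On a bad option, print the SYNOPSIS above and exit with status 2.
    GetOptionsFromArray( \@argv, \%opt, 'input=s', 'verbose!', 'help' )
        or pod2usage(2);
    pod2usage(1) if $opt{help};

    # croak reports the error from the caller's point of view.
    croak "no such file: $opt{input}"
        if defined $opt{input} && !-e $opt{input};

    # Whatever is left in @argv are the non-option arguments.
    return ( \%opt, \@argv );
}

my ( $opts, $args ) = parse_args(@ARGV);
print "options parsed, ", scalar(@$args), " argument(s) left\n"
    if $opts->{verbose};
```

Called as `myscript.pl --verbose foo`, this leaves `foo` in the argument list and sets the `verbose` flag; `--help` prints the SYNOPSIS section of the POD.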

SWISH::Prog take 2

Spent the last week or two totally reworking SWISH::Prog. I reorganized the class layout to mirror the aggregator/parser/indexer/searcher paradigm I described some time ago. It has started to look a little like KinoSearch in that respect, with the addition of the aggregators and parsers (which is of course Swish-e’s contribution to IR).

After mulling/experimenting for several days over how best to write the spider, I have decided to use WWW::Mechanize along with WWW::RobotRules and write it from scratch. Then I’ll provide backwards API compatibility for the Swish-e 2.4 spider.pl script config files, callbacks, etc. This has proved easier than a direct port, and it allows me to provide extensible caching/queueing/user_agent classes rather than hardcoding everything in a single script/library. I toyed with WWW::CheckSite, but making it work with the aggregator API required so many gymnastics that it finally became easier to just write the spider myself. It was a good programming exercise as well. 🙂
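
To illustrate what I mean by pluggable pieces, here is a rough sketch using only core Perl. The `Sketch::Queue` and `Sketch::Spider` classes and the `fetch`/`allowed`/`max_pages` arguments are hypothetical names for this example, not the SWISH::Prog API. In a real spider the `fetch` callback might wrap a WWW::Mechanize object and `allowed` might consult parsed robots.txt rules; here both are stand-ins over an in-memory "site" so the example runs without a network.

```perl
use strict;
use warnings;

# Hypothetical FIFO queue. Any class with the same put/get methods
# could be swapped in to change crawl order or add persistence.
package Sketch::Queue;
sub new { my $class = shift; return bless { items => [] }, $class }
sub put { my ( $self, @urls ) = @_; push @{ $self->{items} }, @urls; return }
sub get { my $self = shift; return shift @{ $self->{items} } }

package Sketch::Spider;
use Carp;

sub new {
    my ( $class, %args ) = @_;
    croak "fetch callback required" unless ref( $args{fetch} ) eq 'CODE';
    return bless {
        fetch     => $args{fetch},                       # url -> list of links
        allowed   => $args{allowed} || sub { 1 },        # url -> boolean
        queue     => $args{queue} || Sketch::Queue->new,
        max_pages => $args{max_pages} || 100,
        seen      => {},
    }, $class;
}

# Breadth-first crawl; returns the URLs actually fetched, in order.
sub crawl {
    my ( $self, $start_url ) = @_;
    my @visited;
    $self->{queue}->put($start_url);
    while ( defined( my $url = $self->{queue}->get ) ) {
        last if @visited >= $self->{max_pages};
        next if $self->{seen}{$url}++;          # skip repeats
        next unless $self->{allowed}->($url);   # skip disallowed URLs
        my @links = $self->{fetch}->($url);
        push @visited, $url;
        $self->{queue}->put(@links);
    }
    return @visited;
}

package main;

# Stand-in "site": each path maps to the links found on that page.
my %site = (
    '/'       => [ '/a', '/b' ],
    '/a'      => [ '/b', '/secret' ],
    '/b'      => ['/'],
    '/secret' => [],
);

my $spider = Sketch::Spider->new(
    fetch   => sub { @{ $site{ $_[0] } || [] } },
    allowed => sub { $_[0] ne '/secret' },
);
my @visited = $spider->crawl('/');
print join( ' ', @visited ), "\n";    # prints "/ /a /b"
```

The point of the design is that the crawl loop never needs to know how pages are fetched, queued or filtered, so each of those can be replaced without touching the others.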


© 2025 peknet
