YA Perl Advent Calendar 2005-12-18

I bring you yet another regular expression treat, Regexp::Assemble. This little tool performs trie optimization: it finds the common prefixes and suffixes of the expressions you give it and assembles them into a single optimized regular expression. It accepts strings or regular expressions, plus a whole slew of options you'll probably never need.
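To make the "common prefixes" idea concrete, here is a minimal core-Perl sketch of prefix factoring with a trie. It is not Regexp::Assemble's actual algorithm (the module also merges common suffixes, which this does not attempt), and the helper names build_trie and trie_to_pattern, along with the word list, are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: insert each word into a nested-hash trie.
# An empty-string key marks "a word may end here".
sub build_trie {
    my @words = @_;
    my %trie;
    for my $word (@words) {
        my $node = \%trie;
        $node = $node->{$_} //= {} for split //, $word;
        $node->{''} = 1;
    }
    return \%trie;
}

# Hypothetical helper: walk the trie and emit a pattern in which every
# shared prefix appears only once.
sub trie_to_pattern {
    my ($node) = @_;
    my @alts;
    my $can_end = 0;
    for my $char (sort keys %$node) {
        if ($char eq '') { $can_end = 1; next; }
        push @alts, quotemeta($char) . trie_to_pattern($node->{$char});
    }
    return '' unless @alts;

    # A lone branch needs no grouping unless the group must be optional.
    my $body = (@alts == 1 && !$can_end)
        ? $alts[0]
        : '(?:' . join('|', @alts) . ')';
    return $can_end ? "$body?" : $body;
}

my $pattern = trie_to_pattern(build_trie(qw(candle candles candy)));
print "$pattern\n";    # prints: cand(?:le(?:s)?|y)

print "matches\n" if 'candles' =~ /\A(?:$pattern)\z/;
```

The shared "cand" is emitted once and only the differing tails become alternatives. That is the same kind of saving visible in the module's output further down, where it is applied to suffixes.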

The sample code at the end gives the following output, which demonstrates the undocumented feature of sucking if you need or want to append to the parts list after compilation: the parts added earlier are gone, and only the new one survives. Update: The author has replied.

Tree Trimmings:
        (?-xism:(?:(?:(?:ornamen|ligh)t|nutcracker|candle)s|popcorn|tinsel))

Doh! We forgot to top the tree!

Tree Trimmings:
        (?-xism:star)

Light'er up!

# Adding a star clobbered our tree. Setting the mutable option prevents this,
# but yields an unoptimized expression.

Tree Trimmings:
        (?-xism:(?:nutcrackers|ornaments|candles|popcorn|lights|tinsel))

Doh! We forgot to top the tree!
Tree Trimmings:
        (?-xism:(?:nutcrackers|ornaments|candles|popcorn|lights|tinsel|star))

Light'er up!

mod18.pl


    use strict;
    use warnings;
    use Regexp::Assemble;

    # Run once with mutable => 0 and once with mutable => 1.
    foreach my $mutable (0 .. 1) {
        my $ra = Regexp::Assemble->new(mutable => $mutable);
        $ra->add(qw(popcorn tinsel lights candles nutcrackers ornaments));
        isDone($ra);
    }

    # Print the assembled pattern; if it lacks a star, add one and try again.
    sub isDone {
        my $ra = shift;

        printf "Tree Trimmings:\n\t%s\n\n", $ra->re();

        # Look for the literal text "star" in the stringified pattern.
        if ( $ra->re() !~ /star/ ) {
            warn "Doh! We forgot to top the tree!\n";
            $ra->add('star');
            isDone($ra);
        }
        else {
            warn "Light'er up!\n";
        }
    }
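One possible way around the clobbering, sketched here as an assumption rather than as the module's recommended approach: keep the word list yourself and assemble a fresh object whenever it changes. It uses only the new, add and re calls from the script above. The helper name assemble_words is hypothetical.

```perl
use strict;
use warnings;
use Regexp::Assemble;

# Hypothetical helper: build a fresh assembler from the full word list each
# time, so nothing depends on the object's state after re() has been called.
sub assemble_words {
    my @words = @_;
    my $ra = Regexp::Assemble->new;
    $ra->add(@words);
    return $ra->re;
}

my @trimmings = qw(popcorn tinsel lights candles nutcrackers ornaments);
my $re = assemble_words(@trimmings);
print "Before the star: $re\n";

# Append to our own list, then reassemble from scratch.
push @trimmings, 'star';
$re = assemble_words(@trimmings);
print "After the star:  $re\n";

print "Light'er up!\n" if 'star' =~ /^$re$/ and 'tinsel' =~ /^$re$/;
```

The cost is reassembling the whole list on every change, which should be negligible for a handful of words but may matter for very large lists.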

See Also

Regexp::List