YA Perl Advent Calendar 2005-12-12

re is a mixed bag: two esoteric security sub-pragmas [1] and a mighty useful regular expression debugger. To understand the debugger output it probably doesn't hurt to have (re)written the perl regular expression engine, or at least to have read the excellent Mastering Regular Expressions. Even if you haven't, the debugger output still easily qualifies as human readable, and when things just don't seem to be working as you expect, tracing a match ought to show you why. There are two flavors of the debug sub-pragma: use re 'debugcolor' for ANSI-capable terminals, and use re 'debug' for terminals without ANSI support.
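Here is a minimal sketch of switching the debugger on for a single statement. It assumes a perl recent enough (5.10 or later, as far as I know) for the debug sub-pragma to honour lexical scope; on older perls the trace may well cover more than the block. The trace is written to STDERR, so a shell redirect such as 2> trace.txt keeps it apart from the normal output. The variable names are only illustrative.

```perl
use strict;
use warnings;

my $text = "I want a hippopotamus for Christmas\n";

{
    # Debugging is requested only inside this block; the trace goes
    # to STDERR, not STDOUT.
    use re 'debug';
    $text =~ s/hippopotamus/hippo/;
}

# This match sits outside the block and should run without a trace
# on perls where the pragma is lexically scoped.
my $found = $text =~ /hippo/ ? 1 : 0;

print $text;
```

Swap 'debug' for 'debugcolor' if your terminal understands ANSI escapes.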

Below you'll find a sample script with a simple regexp, sample output from re 'debugcolor', and other resources. The head of the output shows how the expression was compiled, and the body is a series of progressive attempts to match the expression against a fixed-width window that slides along the original string. In the "colored" output below, bold indicates the text being considered, underline marks text discarded as unmatchable, and reverse video marks a match.
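To make the body of the trace easier to follow, here is a deliberately naive simulation of what it records: at each offset, try every branch in turn, and slide along one character when they all fail. This is not how the real engine is implemented, just a sketch of the behaviour the trace reports; the names @alternatives and @matches are mine.

```perl
use strict;
use warnings;

my $text = "I want a hippopotamus for Christmas\n";
my @alternatives = ('hippopotamus', 'Christ');    # same order as in the trace

my @matches;    # each entry: [start offset, end offset, matched text]
my $pos = 0;
POSITION: while ($pos < length $text) {
    for my $alt (@alternatives) {
        # Compare the alternative with the text at the current offset,
        # roughly what an EXACT node does.
        if (substr($text, $pos, length $alt) eq $alt) {
            push @matches, [ $pos, $pos + length($alt), $alt ];
            $pos += length $alt;    # resume after the match, as s///g does
            next POSITION;
        }
    }
    $pos++;    # every branch failed here; move on one character
}

printf "%-12s matched at %2d, ended at %2d\n", @{$_}[2, 0, 1] for @matches;
```

The offsets it reports, 9 to 21 for hippopotamus and 26 to 32 for Christ, are the same ones that appear in the left-hand column of the trace where EXACT succeeds and CLOSE1 follows.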

1. re 'taint' forces even stronger tainting (captures taken from a tainted string stay tainted), and re 'eval' disables the safety check that disallows (?{ ... }) code blocks in interpolated patterns.
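For the curious, a small sketch of the second of those sub-pragmas, re 'eval'. It interpolates a string containing a (?{ ... }) code block into a pattern twice, once without the pragma and once with it. The variable names are made up for the example, and it says nothing about taint mode, which adds restrictions of its own.

```perl
use strict;
use warnings;

our $hits = 0;
# A pattern fragment held in a plain string, containing an embedded code block.
my $fragment = '(?{ $main::hits++ })';

# Without the pragma, interpolating the fragment is refused at run time.
my $allowed_by_default = eval { my $m = 'abc' =~ /b$fragment/; 1 } ? 1 : 0;
my $error = $@;

# With the pragma in scope, the same interpolation is permitted.
my $allowed_with_pragma = eval {
    use re 'eval';
    my $m = 'abc' =~ /b$fragment/;
    1;
} ? 1 : 0;

print "default: $allowed_by_default, with re 'eval': $allowed_with_pragma\n";
```

The message left in $error should point you at use re 'eval', which is a fair hint that the refusal is deliberate.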

mod12.pl


    use re 'debugcolor';

    # Words to shorten, and what to shorten them to.
    my %abbrev = (
        hippopotamus => 'hippo',
        Christ       => 'X-',
    );

    print $_ = "I want a hippopotamus for Christmas\n";

    # Build one alternation out of the keys.
    my $EXPR = join('|', keys %abbrev);

    # Replace every match in $_ with its abbreviation.
    s/($EXPR)/$abbrev{$1}/g;

    print;

first at 3
   1: OPEN1(3)
   3:   BRANCH(8)
   4:     EXACT <hippopotamus>(12)
   8:   BRANCH(12)
   9:     EXACT <Christ>(12)
  12: CLOSE1(14)
  14: END(0)
minlen 6 
Offsets: [14]
1[1] 0[0] 1[1] 2[12] 0[0] 0[0] 0[0] 14[1] 15[6] 0[0] 0[0] 21[1] 0[0] 22[0] 
Matching REx `(hippopotamus|Christ)' against `I want a hippopotamus for Christmas
'
  Setting an EVAL scope, savestack=7
   0 <I want a hippop> |  1:  OPEN1
   0 <I want a hippop> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   0 <I want a hippop> |  4:    EXACT <hippopotamus>
                              failed...
   0 <I want a hippop> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   1 <I want a hippop> |  1:  OPEN1
   1 <I want a hippop> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   1 <I want a hippop> |  4:    EXACT <hippopotamus>
                              failed...
   1 <I want a hippop> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   2 <I want a hippop> |  1:  OPEN1
   2 <I want a hippop> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   2 <I want a hippop> |  4:    EXACT <hippopotamus>
                              failed...
   2 <I want a hippop> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   3 <I want a hippop> |  1:  OPEN1
   3 <I want a hippop> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   3 <I want a hippop> |  4:    EXACT <hippopotamus>
                              failed...
   3 <I want a hippop> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   4 <I want a hippop> |  1:  OPEN1
   4 <I want a hippop> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   4 <I want a hippop> |  4:    EXACT <hippopotamus>
                              failed...
   4 <I want a hippop> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   5 <I want a hippop> |  1:  OPEN1
   5 <I want a hippop> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   5 <I want a hippop> |  4:    EXACT <hippopotamus>
                              failed...
   5 <I want a hippop> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   6 < want a hippopo> |  1:  OPEN1
   6 < want a hippopo> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   6 < want a hippopo> |  4:    EXACT <hippopotamus>
                              failed...
   6 < want a hippopo> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   7 <want a hippopot> |  1:  OPEN1
   7 <want a hippopot> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   7 <want a hippopot> |  4:    EXACT <hippopotamus>
                              failed...
   7 <want a hippopot> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   8 <ant a hippopota> |  1:  OPEN1
   8 <ant a hippopota> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   8 <ant a hippopota> |  4:    EXACT <hippopotamus>
                              failed...
   8 <ant a hippopota> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=7..17
  Setting an EVAL scope, savestack=7
   9 <nt a hippopotam> |  1:  OPEN1
   9 <nt a hippopotam> |  3:  BRANCH
  Setting an EVAL scope, savestack=17
   9 <nt a hippopotam> |  4:    EXACT <hippopotamus>
  21 <tamus for Chris> | 12:    CLOSE1
  21 <tamus for Chris> | 14:    END
Match successful!
Matching REx `(hippopotamus|Christ)' against ` for Christmas
'
  Setting an EVAL scope, savestack=17
  21 <tamus for Chris> |  1:  OPEN1
  21 <tamus for Chris> |  3:  BRANCH
  Setting an EVAL scope, savestack=27
  21 <tamus for Chris> |  4:    EXACT <hippopotamus>
                              failed...
  21 <tamus for Chris> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=17..27
  Setting an EVAL scope, savestack=17
  22 <amus for Christ> |  1:  OPEN1
  22 <amus for Christ> |  3:  BRANCH
  Setting an EVAL scope, savestack=27
  22 <amus for Christ> |  4:    EXACT <hippopotamus>
                              failed...
  22 <amus for Christ> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=17..27
  Setting an EVAL scope, savestack=17
  23 <mus for Christm> |  1:  OPEN1
  23 <mus for Christm> |  3:  BRANCH
  Setting an EVAL scope, savestack=27
  23 <mus for Christm> |  4:    EXACT <hippopotamus>
                              failed...
  23 <mus for Christm> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=17..27
  Setting an EVAL scope, savestack=17
  24 <us for Christma> |  1:  OPEN1
  24 <us for Christma> |  3:  BRANCH
  Setting an EVAL scope, savestack=27
  24 <us for Christma> |  4:    EXACT <hippopotamus>
                              failed...
  24 <us for Christma> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=17..27
  Setting an EVAL scope, savestack=17
  25 <s for Christmas> |  1:  OPEN1
  25 <s for Christmas> |  3:  BRANCH
  Setting an EVAL scope, savestack=27
  25 <s for Christmas> |  4:    EXACT <hippopotamus>
                              failed...
  25 <s for Christmas> |  9:    EXACT <Christ>
                              failed...
  Clearing an EVAL scope, savestack=17..27
  Setting an EVAL scope, savestack=17
  26 < for Christmas
> |  1:  OPEN1
  26 < for Christmas
> |  3:  BRANCH
  Setting an EVAL scope, savestack=27
  26 < for Christmas
> |  4:    EXACT <hippopotamus>
                              failed...
  26 < for Christmas
> |  9:    EXACT <Christ>
  32 < for Christmas
> | 12:    CLOSE1
  32 < for Christmas
> | 14:    END
Match successful!
String too short [regexec_flags]...
Match failed
I want a hippopotamus for Christmas
I want a hippo for X-mas
Freeing REx: `(hippopotamus|Christ)'
Output transmogrified with HTML::FromANSI, emacs, and HTML tidy.
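
mod12.pl builds its alternation with a bare join over the hash keys, which is fine for these two words. If the keys might contain regexp metacharacters, or one key might be a prefix of another, a slightly more defensive construction could look like the sketch below. This is a variation of my own, not part of the original script, and the names $alternation, $re and $short are invented for it.

```perl
use strict;
use warnings;

my %abbrev = (
    hippopotamus => 'hippo',
    Christ       => 'X-',
);

# Longest keys first, so a short key cannot shadow a longer one that
# starts with it; quotemeta keeps any metacharacters in a key literal.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %abbrev;

my $re = qr/($alternation)/;    # compile once, reuse for every match

my $text = "I want a hippopotamus for Christmas\n";
(my $short = $text) =~ s/$re/$abbrev{$1}/g;

print $short;
```

Sorting the keys also makes the branch order predictable from run to run, which helps when you are comparing traces.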

See Also

YAPE::Regex::Explain
An English explanation of the parsed regular expression.
YAPE::Regex
The framework behind Explain.
Rx
An alternate framework, with actual documentation as a whitepaper at http://perl.plover.com/Rx/
rebug
A visual debugger based on Rx.
Regexp::English
Chromatic's toolkit for writing regular expressions in a verbose, English-like format.