YA Perl Advent Calendar 2005-12-20

If you checked out yesterday's Test::Perl::Critic, you've seen one application of Adam Kennedy's PPI, billed as "parse Perl without perl" and more accurately described as parsing Perl at runtime without the perl compiler. Today's module, Perl::Compare, is another application from the man himself. Perl::Compare is a Perl-aware diff. Where diff -qr might give:
Only in Wubbulous-1.11/: 1.03.patch
Only in Wubbulous-1.11/: blib
Files Wubbulous-1.04/CHANGES and Wubbulous-1.11/CHANGES differ
Files Wubbulous-1.04/Wubbulous.pm and Wubbulous-1.11/Wubbulous.pm differ
Only in Wubbulous-1.11/: Wubbulous.pm~
Only in Wubbulous-1.11/: Makefile
Files Wubbulous-1.04/MANIFEST and Wubbulous-1.11/MANIFEST differ
Only in Wubbulous-1.11/: pm_to_blib
Only in Wubbulous-1.04/t: 01power.t
Only in Wubbulous-1.04/t: 02maxpower.t
Only in Wubbulous-1.04/t: 03eyefish.t
Only in Wubbulous-1.04/t: 04oleo.t
Only in Wubbulous-1.04/t: 05conquer.t
Only in Wubbulous-1.04/t: 06krusty.t
Only in Wubbulous-1.04/t: 07apu.t
Only in Wubbulous-1.11/t: 1.t
Only in Wubbulous-1.11/t: 2.t
Only in Wubbulous-1.11/t: 3.t
Only in Wubbulous-1.11/t: 4.t
Only in Wubbulous-1.11/t: 5.t
Only in Wubbulous-1.11/t: 6.t
Only in Wubbulous-1.11/t: 7.t
Only in Wubbulous-1.11/t: bar
Only in Wubbulous-1.11/t: baz
Only in Wubbulous-1.11/t: foo
Only in Wubbulous-1.11/t: Foo'Bar
Only in Wubbulous-1.11/t: quux
Files Wubbulous-1.04/TODO and Wubbulous-1.11/TODO differ
Perl::Compare reports the following (it can also give you a data structure instead):
! Wubbulous.pm
+ blib/lib/Wubbulous.pm
- t/01power.t
- t/02maxpower.t
- t/03eyefish.t
- t/04oleo.t
- t/05conquer.t
- t/06krusty.t
- t/07apu.t
+ t/1.t
+ t/2.t
+ t/3.t
+ t/4.t
+ t/5.t
+ t/6.t
+ t/7.t
Note that by default Perl::Compare ignores the by-products of editing, and some of the by-products of make-ing the module, such as the Makefile and the man pages in blib (but not the library in blib/lib). Of course, the difference in output so far is mostly cosmetic and comes down to Perl::Compare's filtering. On the other hand, when completed, Perl::Compare ought to be able to recognize the following two snippets as identical:
if( $val == 0 ){
  die "a horrible death\n";
}
elsif( $val == 42 ){
  liveLong && prosper;
}
else{
  work;
}

if( $val == 0 ){
  die("a horrible death\n");
}
elsif( $val == 42 ){
  liveLong && prosper; }
else{
  work
}
While diff -u does not:
--- a.pl        Tue Dec 20 21:59:19 2005
+++ b.pl        Tue Dec 20 21:59:19 2005
@@ -1,9 +1,8 @@
 if( $val == 0 ){
-  die "a horrible death\n";
+  die("a horrible death\n");
 }
 elsif( $val == 42 ){
-  liveLong && prosper;
-}
+  liveLong && prosper; }
 else{
-  work;
+  work
 }
It does this by normalizing the contents of each file before comparing them. PPI currently supports two levels of normalization (or layers, as the documentation calls them), although in my testing I have not been able to get different output from the different levels.
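As a rough illustration of that normalization, here is a minimal sketch that asks PPI directly whether two strings of source normalize to the same thing. The helper name same_after_normalizing is my own invention for illustration, and I am assuming PPI's default layer; this is not a claim about how Perl::Compare is wired up internally, nor that the die-with-parentheses case above is handled yet.

```perl
use strict;
use warnings;
use PPI;

# Hypothetical helper (not part of PPI or Perl::Compare):
# returns 1 if both source strings normalize to equal documents, else 0.
sub same_after_normalizing {
    my ($src_a, $src_b) = @_;

    my @normalized;
    for my $src ($src_a, $src_b) {
        # PPI::Document->new takes a reference to a string of source code
        my $doc = PPI::Document->new(\$src)
            or die "PPI could not parse the source\n";

        # normalized() returns a PPI::Document::Normalized object,
        # assumed here to use the default normalization layer
        my $norm = $doc->normalized
            or die "PPI could not normalize the document\n";

        push @normalized, $norm;
    }

    # equal() is true when the normalized forms match,
    # false when they differ, and undef on error
    return $normalized[0]->equal($normalized[1]) ? 1 : 0;
}

# Two spellings that differ only in whitespace and a comment
my $tidy   = 'my $answer = 42; # the usual' . "\n";
my $sloppy = 'my   $answer   =   42;' . "\n";

print same_after_normalizing($tidy, $sloppy)
    ? "same after normalizing\n"
    : "still different\n";
```

Whitespace and comments are the safe cases to try first; feed it the two if/elsif/else snippets above to see how far the current normalization actually gets.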

Be forewarned though: this is definitely bleeding-edge code, with mismatches between documentation and implementation as well as unlocalized use of $_. I also wonder about one design decision. The (unimplemented) compare against a target filename is meant to treat that file as a store of what to compare, rather than the simple DWIMy reading: the file itself is the target for comparison.

P.S. If you know of a better or more comparable diff output, let me know (jpierce@cpan). I could have sworn there was a means for diff -r to give similar +/- output, but I might be thinking of my own dirdiff.