Perl 2004 Advent Calendar: Devel::Trace

On the 13th day of Advent my True Language brought to me..

Devel::Trace

Over the years I've talked about many modules that can help insert debugging output into your code - Data::Dumper, Data::Dumper::Simple, Hook::LexWrap, YAML and their friends. Modules that allow you to write commands to print out status messages at various points in the code's execution.

The trouble with this is that it's a pain to go though and insert this debugging information by hand. In the case of a simple script, or a small test script you've written to help you debug a particular chunk of code, you end up with more print statements than you did with original code.

One option at this point would be to use The Perl Debugger, though a lot of people consider this too complicated, and in a lot of cases they'd be right. Enter Devel::Trace a simple non-interactive 'debugger' that simply prints out each line of code before it executes it. No more writing lines of lines of print "I got here\n" statements, but equally no more having to carefully step though your program statement by statement. Just a simple quick to use debugging tool, while not being the solve all, a great module to have in your debugging toolbox for when you need it.

[Read the documentation for Devel::Trace on search.cpan.org]

Using Devel::Trace couldn't be simpler. You simply need to invoke it as the debugger for your code. So where you'd normally type:

  perl hello_world.pl

You would instead write:

  perl -d:Trace hello_world.pl

This merrily goes off and runs our program for us, printing out each line as we go. So for something as simple as:

  #!/usr/bin/perl

  # turn on perl's safety features
  use strict;
  use warnings;

  print "Hello World\n";

We get:

  >> test.pl:7: print "Hello World\n";
  Hello World

Printed out. The line with the print statement's the only one being executed. Before the line is executed the filename and the linenumber are prepended to the code that makes up the line and the result is printed to STDERR. Let's make the example a little more complicated by adding a loop:

  #!/usr/bin/perl

  # turn on perl's safety features
  use strict;
  use warnings;

  for (1..5)
  {
    print "Hello World\n";
  }

When we run this under Devel::Trace we get the expected output, with line nine being printing out each time it's executed though the loop:

  bash$ perl -d:Trace test.pl     
  >> test.pl:7: for (1..5)
  >> test.pl:9:   print "Hello World\n";
  Hello World
  >> test.pl:9:   print "Hello World\n";
  Hello World
  >> test.pl:9:   print "Hello World\n";
  Hello World
  >> test.pl:9:   print "Hello World\n";
  Hello World
  >> test.pl:9:   print "Hello World\n";
  Hello World
  bash$

We try to expand our script to add time:

  #!/usr/bin/perl

  # turn on perl's safety features
  use strict;
  use warnings;

  # load our modules
  use DateTime;

  for (1..5)
  {
    my $now = DateTime->now;
    print "Hello World at $now\n";
  }

However when we run it though our simple debugger we get way way too much output out as Devel::Trace blindly follows the code into DateTime:

  >> test.pl:10: for (1..5)
  >> test.pl:12:   my $now = DateTime->now;
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:368: sub now { shift->from_epoch( epoch => (scalar time), @_ ) }
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:337:     my $class = shift;
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:338:     my %p = validate( @_,
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:346:     my %args;
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:349:     my $fraction = $p{epoch} - int( $p{epoch} );
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:350:     $args{nanosecond} = int( $fraction * MAX_NANOSECONDS )
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:355:     @args{ qw( second minute hour day month year ) } =
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:357:     $args{year} += 1900;
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:358:     $args{month}++;
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:360:     my $self = $class->new( %p, %args, time_zone => 'UTC' );
  >> /Library/Perl/5.8.1/darwin-thread-multi-2level/DateTime.pm:164:     my $class = shift;
  ...

The simplest thing we could do in this situation is store the whole of the debug output in a text file that we can examine at our leisure in our editor:

  bash$ perl -d:Trace test.pl 2>file

A slightly more ad-hoc way of dealing with this is to directly filter the output at the time we're creating it with another little Perl script to just print out the lines from the file we're interested in:

  bash$ perl -d:Trace test.pl 2>&1 \
    | perl -ne 'print if (!/^>>/ || /^>> test.pl/)'
  >> test.pl:10: for (1..5)
  >> test.pl:12:   my $now = DateTime->now;
  >> test.pl:13:   print "Hello World at $now\n";
  >> test.pl:12:   my $now = DateTime->now;
  >> test.pl:13:   print "Hello World at $now\n";
  >> test.pl:12:   my $now = DateTime->now;
  >> test.pl:13:   print "Hello World at $now\n";
  >> test.pl:12:   my $now = DateTime->now;
  >> test.pl:13:   print "Hello World at $now\n";
  >> test.pl:12:   my $now = DateTime->now;
  >> test.pl:13:   print "Hello World at $now\n";
  Hello World at 2004-12-08T19:02:19
  Hello World at 2004-12-08T19:02:19
  Hello World at 2004-12-08T19:02:19
  Hello World at 2004-12-08T19:02:19
  Hello World at 2004-12-08T19:02:19

You'll notice that the lines being printed and the output are no longer properly interleaved; This is a side effect of the fact that by default perl buffers STDOUT to be more efficient, holding it back in a buffer, collecting a significant amount of output before printing a sizable chunk in one go. On the other hand STDERR is printed right away, meaning the two get out of sync. We can prevent that by making a slight alteration at the top of our script:

  #!/usr/bin/perl

  $|++; # turn off the buffering

  # turn on perl's safety features
  use strict;
  use warnings;

  # load our modules
  use DateTime;

  for (1..5)
  {
    my $now = DateTime->now;
    print "Hello World at $now\n";
  }

Everything now works:

  travis:~ mark$ perl -d:Trace test.pl 2>&1 \
    | perl -ne 'print if (!/^>>/ || /^>> test.pl/)'
  >> test.pl:3: $|++; # turn off the buffering
  >> test.pl:12: for (1..5)
  >> test.pl:14:   my $now = DateTime->now;
  >> test.pl:15:   print "Hello World at $now\n";
  Hello World at 2004-12-08T19:05:12
  >> test.pl:14:   my $now = DateTime->now;
  >> test.pl:15:   print "Hello World at $now\n";
  Hello World at 2004-12-08T19:05:12
  >> test.pl:14:   my $now = DateTime->now;
  >> test.pl:15:   print "Hello World at $now\n";
  Hello World at 2004-12-08T19:05:12
  >> test.pl:14:   my $now = DateTime->now;
  >> test.pl:15:   print "Hello World at $now\n";
  Hello World at 2004-12-08T19:05:12
  >> test.pl:14:   my $now = DateTime->now;
  >> test.pl:15:   print "Hello World at $now\n";
  Hello World at 2004-12-08T19:05:12

Filtering on the command line is tremendously powerful, but doesn't give us the full flexibility we need. What happens if for some reason we can't filter on command name - such as the large routines we don't want printed out being in the same file as the text we do want printed out? At this stage we do have to resort to putting things in the code, using the special $Devel::Trace::TRACE variable to turn on and off output:

 #!/usr/bin/perl

  # turn on perl's safety features
  use strict;
  use warnings;

  # load our modules
  use DateTime;

  for (1..5)
  {
    $Devel::Trace::TRACE = 0;
    my $now = DateTime->now;
    $Devel::Trace::TRACE = 1;
    print "Hello World at $now\n";
  }

While this makes a terrible example in such a short script (we're back in the same situation we were before, putting so much debug code in our code) it's very useful in a much larger script where we may only care about printing out what's going on in a particular block. If we turn off the output at the start of our code:

  #!/usr/bin/perl

  $Devel::Trace::TRACE = 0;

And then use local to turn it on again when we want to see it:

  sub foo
  {
    local $Devel::Trace::TRACE = 1;
    ...
  }

It'll automatically turn off again when we exit the enclosing lexical block (i.e. it'll only stay on while we're in the foo subroutine or for any code we call from within foo.) Meaning we'll get just the debug info we want. Automatically, with no fuss.

The -n command line option