Perl 2003 Advent Calendar: PPerl

When you execute a program written in Perl, the Perl interpreter (perl) is loaded into memory, then the source files are loaded, parsed and compiled into an internal form that perl can run. This all happens so quickly that if you blink, you'll miss it. In most cases this is fast enough.

Sometimes it's not.

The trouble is that the computer can do other things really quickly too. If you run a Perl script every time you receive a mail then most of the time this is going to be fine, but should you suddenly get a thousand mails delivered at once then your computer's going to have a hard job initialising a thousand Perl interpreters simultaneously.

What would be nice is if, instead of starting from scratch each time we run a program, it was possible to keep the program hanging around in memory ready to run. This is what PPerl does - and it's really simple to use too. In most cases all you need do is change one line of your well-written source code and everything will work.

[Read the documentation for PPerl on search.cpan.org]

The bash shell prompt can be configured to run a command each time it is about to be printed, and to incorporate the output of that command into itself. For example, utilising the unix date command:

  bash-2.05b$ PS1="\$(date) $ "
  Wed Dec  3 21:27:27 GMT 2003 $ ls
  hack_the_planet.pl
  Wed Dec  3 21:27:32 GMT 2003 $

The important thing to remember is that whatever command we put in the prompt must be fast. It's no good sitting around waiting for your command prompt to be drawn when you need to take some emergency action on your box.

There's no reason this technique can't use a Perl script. For example we could write a script that prints out the current load in different colours based on how high a load we have:

  #!/usr/bin/perl

  # turn on perl's safety features
  use strict;
  use warnings;

  use Sys::Load qw(getload);
  use Term::ANSIColor qw(:constants);

  # get the load
  my ($load) = getload;

  # print out the load in the correct colour
  if ($load > 0.8)
   { print RED, "[$load]", RESET }
  elsif ($load > 0.5)
   { print YELLOW, "[$load]", RESET }
  else
   { print GREEN, "[$load]", RESET }

Saving the above example in a file called "myprompt" somewhere in your path and then making it executable means that you can place the following line in your .bashrc to get the prompt to update each time it is displayed:

   PS1="\$(myprompt) $ "

This script runs pretty nippily on my box and there's no way that I can type quickly enough to create any significant load, but I hate to waste CPU cycles on a pretty utility to display CPU load. It's time to get PPerl involved. All we really have to do is change the shebang line at the top of the script from

  #!/usr/bin/perl

to run pperl instead:

  #!/usr/bin/pperl

The first time myprompt is run it spawns a collection of processes on the box, and executes as normal.

  [0.06] $ ps fx | grep myprompt
  13920 ?        S      0:00 /home/mark/bin/myprompt
  15191 ?        S      0:00  \_ /home/mark/bin/myprompt
  15195 ?        S      0:00  \_ /home/mark/bin/myprompt
  15199 ?        S      0:00  \_ /home/mark/bin/myprompt
  15203 ?        S      0:00  \_ /home/mark/bin/myprompt
  15207 ?        S      0:00  \_ /home/mark/bin/myprompt

Now, whenever myprompt is executed, pperl doesn't load perl and parse the script again. Instead it simply communicates with one of the pperl processes from the pool and gets it to rerun the script. This is considerably quicker. Let's find out exactly how much quicker:
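The saving comes from doing the parse-and-compile step once and then reusing the result. As a rough in-process analogy (this is not how PPerl itself is implemented, and $source and $compiled are names made up for this sketch), we can compare re-evaluating a string of source code every time with calling a sub that was compiled from it once:

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw(cmpthese);

# a made-up stand-in for a script: 200 generated statements,
# the last of which evaluates to 200
my $source = join "\n", map { "my \$v$_ = $_;" } 1 .. 200;
$source .= "\n\$v200;";

# parse and compile the source once, keeping the resulting sub around
my $compiled = eval "sub { $source }" or die $@;

# run each variant for at least one CPU second
cmpthese(-1, {
  # parses and compiles the source on every call, then runs it
  reparse => sub { eval $source },
  # only runs the code that was compiled earlier
  reuse   => sub { $compiled->() },
});
```

The gap between the two rows gives a feel for how much of a short script's running time can go on compilation rather than on the work itself.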

  #!/usr/bin/perl
  
  use strict;
  use warnings;
  
  use Benchmark qw(cmpthese);
  
  cmpthese(1000, {
    perl  => "`myprompt_orig`",
    pperl => "`myprompt`",
  });

This isn't as finely controlled as it could be, as in theory the load on the machine could affect the output, but as a rough test it's not bad. Note that because most of the work is done in subprocesses, we want to check the wallclock seconds that this returns, not the CPU usage of the timing process. On my system, it takes about 85 seconds to run a thousand iterations under perl and 80 seconds under pperl.
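To see why wallclock time is the number to watch, here is a minimal sketch that runs a command repeatedly and splits the time into wallclock, CPU used by the timing process, and CPU used by its children. The time_command helper is hypothetical, written just for this illustration, and the command being timed is simply the current perl doing nothing, so swap in your own:

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Time::HiRes qw(time);

# run a command a number of times and report where the time went
sub time_command {
  my ($iterations, @command) = @_;

  my $wall_start = time();
  my ($user0, $sys0, $cuser0, $csys0) = times;

  for (1 .. $iterations) {
    system(@command) == 0
      or die "failed to run @command: $?";
  }

  my ($user1, $sys1, $cuser1, $csys1) = times;

  return {
    wallclock  => time() - $wall_start,
    parent_cpu => ($user1 - $user0) + ($sys1 - $sys0),
    child_cpu  => ($cuser1 - $cuser0) + ($csys1 - $csys0),
  };
}

# $^X is the perl running this sketch; replace with e.g. 'myprompt'
my $timing = time_command(20, $^X, '-e', '1');

printf "%-10s %.3f seconds\n", $_, $timing->{$_}
  for qw(wallclock parent_cpu child_cpu);
```

The parent's own CPU figure should be tiny compared with the wallclock figure. Under pperl the child figure is likely to understate things too, assuming the real work happens in the long-running pool processes rather than in a child of the timing script.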

So the pperl version is slightly quicker. The difference grows as the script loads more code. Let's write a perl script that deliberately loads a few really massive modules that contain a lot of code:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Template;
  use POSIX;
  use Socket;
  use CGI;
  BEGIN { CGI->compile }

  print "Hello World\n";

With this script the perl version takes 313 seconds to run all thousand iterations, but the pperl version only takes 113 seconds - almost three times quicker. So, if we really want to, we can probably get away with formatting our prompt with the Template Toolkit, provided we run under PPerl.
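If you're curious how much code a module drags in at startup, a rough way to find out is to watch %INC, which records every file perl has loaded. This sketch sticks to two core modules so that it should run anywhere; the choice of modules is mine, and you can add Template or CGI to the list if you have them installed:

```perl
#!/usr/bin/perl

use strict;
use warnings;

# core modules picked for this sketch; extend the list as you like
my @modules = qw(POSIX Socket);

# remember what was already loaded before we start
my %already = %INC;

for my $module (@modules) {
  eval "require $module; 1" or die $@;
}

# every new key in %INC is a file perl had to find, read and compile
my @loaded = sort grep { !exists $already{$_} } keys %INC;

print scalar(@loaded), " extra files loaded:\n";
print "  $_\n" for @loaded;
```

Each file on that list has to be compiled again on every run under plain perl, which is the work that a persistent process gets to skip.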