The 2004 Perl Advent Calendar
On the 4th day of Advent my True Language brought to me..
String::ShellQuote

One of Perl's greatest strengths is the ease with which it can interact with other programs on the system, using, when needed, the powerful technique of calling out to the shell, which gives you all the power of the command line from within Perl.

Of course, with this power come great risks. The ability to use shell metacharacters to instruct the computer to perform complex tasks is handy, but if your code mistakenly passes unwanted metacharacters through to the shell then you have a slew of problems, ranging from your code not functioning correctly to allowing a user to maliciously execute arbitrary commands through your code.

String::ShellQuote can help to prevent these risks with the minimum of fuss, escaping the unwanted metacharacters so the shell treats them literally. Using it in the right place simply means that you have one less of the common gotchas to worry about. Your code will be more reliable and more secure.

When you're administering a computer you'll often need to check the log files that the programs running on it create. On a Unix system we can get the last few lines of a log file from the shell by using the tail utility:

  bash$ tail -n 10 thttpd.log

Our shell parses the command and then executes the tail utility, passing the three arguments that were separated by whitespace (the -n and the 10, and the filename) to it.

One of the things that systems administrators have been relying on for years is Perl's ability to easily call out to external programs just like you can from the shell. To access the same tail utility from within Perl we can use the open function with the "-|" (pipe from) mode.

  sub tail
  {
    my $filename = shift;
    my $lines = shift || 5; 
    # run the tail program with the arguments
    # we pass in a list, each in their own scalar
    # (this list form of open requires perl 5.8.0 or later)
    open my $fh, "-|", "tail","-n", $lines, $filename
      or die "Can't fork: $!";
    # read the output in
    my @output = <$fh>;
    # close it, detecting further problems
    close $fh
      or die "Problem closing child: $!/$?";
    # return the output
    return @output;
  }
  print tail("thttpd.log", 10);

Note how, when we called the utility from the shell, we separated all the arguments with spaces. In the Perl code we pass open the same arguments as a list, each argument in its own scalar. Of course, if we want we can just pass the arguments in as one string, building a string that looks just like the command we called from the shell:

    open my $fh, "-|", "tail -n $lines $filename"
      or die "Can't fork: $!";

Given a single string, perl realises that you want it split up. If the string contains shell metacharacters, perl actually runs a copy of the shell and passes it the string to parse and execute; if it contains nothing more exotic than whitespace, perl splits it into words itself and runs the program directly. Going via the shell is slightly slower (it has to run the shell, which runs your program, rather than just running the program directly) but is a lot more powerful. For example, if I wanted to get the last ten lines of all the log files in the directory from the shell I'd type this:

  bash$ tail -n 10 *.log

And I might go as far as telling it to throw away all the debug info it normally prints to STDERR (like not being able to open some files, etc):

  bash$ tail -n 10 *.log 2>/dev/null

Since perl is calling the shell to run tail for it, we can get the same effect within Perl:

    open my $fh, "-|", "tail -n 10 *.log 2>/dev/null"
      or die "Can't fork: $!";

So rather than having to write the Perl code to find all the files that end in .log in the directory we just get the shell to do the work for us.
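For comparison, here is a rough sketch of what doing that work on the Perl side might look like. The name tail_logs is made up for illustration, and this version makes no attempt to redirect STDERR, which is exactly the kind of thing the shell gives us for free.

```perl
use strict;
use warnings;

# Hypothetical helper: the last few lines of every .log file in
# the current directory, without involving the shell
sub tail_logs
{
  my $lines = shift || 10;
  # glob expands the wildcard inside perl itself
  my @files = glob("*.log");
  # with no files at all tail would sit reading STDIN, so bail out
  return unless @files;
  # list form of open: each filename is passed as one argument
  open my $fh, "-|", "tail", "-n", $lines, @files
    or die "Can't fork: $!";
  my @output = <$fh>;
  close $fh;   # exit status ignored for brevity
  return @output;
}

print tail_logs(10);
```

Because the filenames never pass through a shell here, odd characters in them cause no trouble, but we have had to write (and guard) the expansion ourselves.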

The Problem

While having Perl call through to the shell is powerful, if you don't use it with respect then all kinds of bad things can happen. For example, if your filename has a space in it then you need to take extra care with the shell command:

  bash$ tail -n 5 '/var/log/My Site/error.log'

The file path containing the My Site directory has to be put in single quotes to stop the shell splitting the argument on the space character and passing /var/log/My and Site/error.log as two separate arguments to tail.
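The splitting is easy to observe from Perl itself. The following is only a sketch: the helper names (first_line, count_args_list, count_args_string) are invented for this example, and it assumes a Unix-like system and a path to perl that contains no spaces. It asks a child perl process how many arguments it received.

```perl
use strict;
use warnings;

# $^X is the perl that is running this script; the one-liner just
# reports how many arguments the child process received
my $perl    = $^X;
my $counter = 'print scalar(@ARGV), "\n"';

# hypothetical helper: read the single line of output from a child
sub first_line
{
  my $fh   = shift;
  my $line = <$fh>;
  close $fh
    or die "Problem closing child: $!/$?";
  chomp $line;
  return $line;
}

sub count_args_list
{
  # list form: no shell, so each scalar arrives as one argument
  open my $fh, "-|", $perl, "-e", $counter, @_
    or die "Can't fork: $!";
  return first_line($fh);
}

sub count_args_string
{
  my $args = shift;
  # string form: the shell splits $args on unquoted whitespace
  open my $fh, "-|", "$perl -e '$counter' $args"
    or die "Can't fork: $!";
  return first_line($fh);
}

my $path = "/var/log/My Site/error.log";
print count_args_list($path), "\n";        # one argument
print count_args_string($path), "\n";      # two arguments
print count_args_string("'$path'"), "\n";  # one argument again
```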

Of course the same need for escaping is true for Perl as that's just another way of calling the shell. This means that this doesn't quite do what we might expect in all cases:

  # get the last ten lines from $filename, ignoring anything
  # that's printed to STDERR
  open my $fh, "-|", "tail -n 10 $filename 2>/dev/null"
    or die "Can't fork: $!";

If $filename contains spaces then bad things will happen when it is interpolated, as the unescaped spaces confuse the shell. You could be forgiven for thinking that we could have just written this (incorrect) code instead:

  # get the last ten lines from $filename, ignoring anything
  # that's printed to STDERR
  open my $fh, "-|", "tail -n 10 '$filename' 2>/dev/null"
    or die "Can't fork: $!";

Which works fine, until someone puts a ' in $filename. Worse than our code simply not working, consider what would happen if someone being deliberately malicious set the filename to something even odder, passing in more shell commands:

  $filename = "/etc/motd; rm -rf /";

This causes:

  open my $fh, "-|", "tail -n 10 $filename 2>/dev/null"
      or die "Can't fork: $!";

To become:

  open my $fh, "-|", "tail -n 10 /etc/motd; rm -rf / 2>/dev/null"
      or die "Can't fork: $!";

Which will, filesystem permissions allowing, delete everything on the system. Eeek!
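The same effect can be seen safely by swapping the destructive command for a harmless one. In this sketch unsafe_tail is a made-up name wrapping the vulnerable open call from above, and echo stands in for rm -rf /.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the unsafe code above: the filename is
# interpolated straight into the command string
sub unsafe_tail
{
  my $filename = shift;
  open my $fh, "-|", "tail -n 10 $filename 2>/dev/null"
    or die "Can't fork: $!";
  my @output = <$fh>;
  close $fh;   # exit status ignored for brevity
  return @output;
}

# a harmless "attack": the shell runs tail on /dev/null and then
# runs echo as a second, separate command
print unsafe_tail("/dev/null; echo INJECTED");
```

The word INJECTED in the output did not come from any file: it was printed by a command that the "filename" smuggled in.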

The Solution

What we need to do is use String::ShellQuote to escape the characters that we want the shell to treat literally rather than interpret. Using String::ShellQuote imports the shell_quote function, which returns an escaped string built up out of the arguments you pass it. Some example calls:

  use String::ShellQuote;
  print shell_quote("this has many spaces"), "\n";
  print shell_quote("evil; rm -rf /"), "\n";
  print shell_quote("it's more complicated than that"), "\n";

Print:

  'this has many spaces'
  'evil; rm -rf /'
  'it'\''s more complicated than that'
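To see why the last line looks the way it does, here is a rough sketch of the single-quoting rule. It is a simplified illustration, not the module's own code (the real shell_quote handles more cases), and the names simple_quote and shell_sees are invented for this example.

```perl
use strict;
use warnings;

# Simplified illustration only: this is NOT the module's own code
sub simple_quote
{
  my $string = shift;
  # a ' cannot appear inside single quotes, so close the quotes,
  # add a backslash-escaped ', then open the quotes again
  $string =~ s/'/'\\''/g;
  return "'$string'";
}

# hypothetical checker: hand the quoted text to the shell's printf
# and return exactly what the shell passed along
sub shell_sees
{
  my $quoted = shift;
  open my $fh, "-|", "printf '%s' $quoted"
    or die "Can't fork: $!";
  local $/;    # slurp everything in one read
  my $seen = <$fh>;
  close $fh;
  return $seen;
}

for my $string ("this has many spaces", "evil; rm -rf /",
                "it's more complicated than that")
{
  my $quoted = simple_quote($string);
  print "$quoted\n";
  print "round trip ok\n" if shell_sees($quoted) eq $string;
}
```

Once quoted, the shell hands each string on as a single literal argument, semicolons and all.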

So we can use this in our functions to make our lives easier:

  use String::ShellQuote;

  sub tail
  {
    my $filename = shift;
    my $lines    = shift || 5;
    # protect the arguments
    my $args = shell_quote("-n", $lines, $filename);
    # open tail, letting the shell throw away STDERR
    open my $fh, "-|", "tail $args 2>/dev/null"
      or die "Can't fork: $!";
    # read the output in and return it
    return <$fh>;
  }

It's quick and simple, and still gives us access to the shell features where we want them (like the ability to redirect STDERR), without allowing one dodgy filename to bring down our script.

  • The 'open' documentation