Whenever you get information passed back from a CGI form, you never really know what you're going to get. No matter how much client-side verification of the data you do, it's always possible for someone to bypass your carefully crafted page and write their own HTML that submits to the same place.
So your CGI script has to be prepared to deal with whatever it's passed. It has to be untrusting, and check and double check the data. Writing code to do this is boring and time-consuming, and it's very tempting to skip. Anything that makes this easier is very welcome, and CGI::Untaint hits the nail on the head. It provides a framework for creating reusable components that can be used to extract various kinds of data, and does it with the minimum of fuss.
When you use CGI to get parameters, the simplest (but also the sloppiest) way is to do this:
    # create the cgi object
    use CGI;
    my $cgi = CGI->new();

    # extract the data and call error_handling to print out errors
    # if the data can't be extracted (i.e. it was missing)
    my $age = $cgi->param("age");
    error_handling() unless defined($age);
This can put anything in $age at all. You're expecting a round number of years back, but for all you know some idiot's typed "eight" into the form or "12.5", and when you get round to inserting $age into your database it'll all fall over. What you'll need to do is write some code to check this with a regular expression:
    # create the cgi object
    use CGI;
    my $cgi = CGI->new();

    # extract the data and call error_handling to print out errors
    # if the data can't be extracted (i.e. it was missing or malformed)
    my $age = $cgi->param("age");
    error_handling() unless defined($age);
    ($age) = $age =~ m/^(\d+)$/;   # extract it if it's all digits
    error_handling() unless defined($age);
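To see what that regular expression actually lets through, here's a small stand-alone sketch. The extract_age function is a hypothetical helper made up for this illustration (it isn't part of CGI or any other module); it just wraps the same pattern so it can be tried against a few sample inputs.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): returns the digits if the
# whole value is made of digits, or undef otherwise.
sub extract_age {
    my ($raw) = @_;
    return unless defined $raw;

    # same pattern as above; the capture ends up in $age
    my ($age) = $raw =~ m/^(\d+)$/;
    return $age;
}

# try a few values that a form might send back
for my $input ("42", "eight", "12.5", "") {
    my $age = extract_age($input);
    printf "%-7s => %s\n", "'$input'", defined $age ? $age : "rejected";
}
```

Only "42" makes it through; the other three come back as undef and would end up in your error handling.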
And you'll also have to do the same for all the other variables, hoping that the code you just typed hasn't got any subtle bugs in it. What CGI::Untaint allows you to do is use collections of predefined regular expressions to pull things out of the CGI parameters instead.
    # create our untainting object
    use CGI;
    use CGI::Untaint;
    my $cgi     = CGI->new();
    my $untaint = CGI::Untaint->new($cgi->Vars);

    # extract 'age' from the parameters as an integer.
    my $age = $untaint->extract( -as_integer => "age" );
    error_handling() unless defined($age);
This has several advantages. Your code is quicker to develop, as you're writing less of the sticky logic yourself. You're reusing code, so any code you do write you only have to write once. Finally, the code you're using to do the extraction will have been independently tested and checked to make sure it's functioning correctly.
The extract method takes two arguments: how the data is to be extracted (in this case as an integer) and the name of the CGI parameter to be extracted. The first of these actually names the module that tells CGI::Untaint how to do the extraction (so -as_integer means that the CGI::Untaint::integer module should be used, -as_printable means that CGI::Untaint::printable will be used, and so on). These 'handlers' can be thought of as plug-ins to CGI::Untaint, telling it about new ways to extract different types of data. The default handlers that are installed when you install the main CGI-Untaint distribution are:
printable: printable strings, i.e. strings that don't contain control characters
integer: integer numbers, possibly starting with a plus or minus sign
hex: positive hexadecimal numbers (i.e. without a plus or minus sign)
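The naming convention described above, where the flag you pass to extract picks the handler module, can be sketched in a few lines. This is only an illustration of the convention: handler_class_for is a hypothetical function invented here, not something CGI::Untaint provides, and the real module's lookup may well do more than this.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI::Untaint) that mirrors the naming
# convention: "-as_integer" names the CGI::Untaint::integer module.
sub handler_class_for {
    my ($flag) = @_;

    # pull the handler name out of the flag, or complain
    my ($type) = $flag =~ m/^-as_(\w+)$/
        or die "unrecognised extraction flag: $flag\n";

    return "CGI::Untaint::$type";
}

# show which module each of the default flags would name
print "$_ => ", handler_class_for($_), "\n"
    for qw(-as_printable -as_integer -as_hex);
```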
In addition to this basic selection there's a wide collection of modules on the CPAN that can be downloaded and installed. A few of the notable examples are:
Even though there's a great selection of untaint handlers available from the CPAN, sooner or later you're going to find yourself in a situation where you want to check something that there isn't an untaint handler for. For example, you might want to check that a value is one of the options you offered in a drop-down list, and that extraction handler will be unique to your own particular application.
Creating your own handlers is as simple as writing a quick module that inherits from CGI::Untaint::object and defines a method called _untaint_re that returns a reference to a regular expression (as created by qr//). This regular expression should place the result of the extraction into $1. This is all very simple as soon as you see an example. Here's a handler that extracts red, green or blue, and fails for anything else passed to it:
    package CGI::Untaint::red_green_blue;
    use base qw(CGI::Untaint::object);

    # turn on perl's safety functions
    use strict;
    use warnings;

    # define the regular expression that will do the test
    sub _untaint_re { qr/^(red|green|blue)$/ }

    # return true to keep perl happy
    1;
Your module can define further checks by defining an is_valid method. This method is called on the handler object, on which value can be called to get the value that's just been extracted by _untaint_re, and it should return true or false to indicate whether that value was valid or not. Cunningly, value can also be used to assign a new value. For example, here's an expansion of the above handler that doesn't care what case the names of the colours are in:
    package CGI::Untaint::red_green_blue;
    use base qw(CGI::Untaint::object);

    # turn on perl's safety functions
    use strict;
    use warnings;

    # define the regular expression that will do the test
    sub _untaint_re { qr/^(red|green|blue)$/i }

    sub is_valid {
        my $self  = shift;
        my $value = $self->value;

        # make the value lower case.
        $self->value(lc($value));

        # return true as it's valid
        return 1;
    }

    # return true to keep perl happy
    1;
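If it helps to see the combined effect of the case-insensitive pattern and the lower-casing step without any of the framework around it, here's a stand-alone sketch. The normalise_colour function is hypothetical and doesn't use CGI::Untaint at all; it just applies the same two steps by hand, so treat it as a picture of the behaviour rather than of how the module works inside.

```perl
use strict;
use warnings;

# Hypothetical stand-alone function; it does not use CGI::Untaint.
sub normalise_colour {
    my ($raw) = @_;
    return unless defined $raw;

    # step 1: the same pattern as _untaint_re, capturing the match
    my ($value) = $raw =~ m/^(red|green|blue)$/i;
    return unless defined $value;

    # step 2: the same normalisation that is_valid performs
    return lc $value;
}

# try a mixture of cases, plus something that shouldn't get through
for my $input (qw(red rEd BLUE yellow)) {
    my $colour = normalise_colour($input);
    print "$input => ", (defined $colour ? $colour : "rejected"), "\n";
}
```

Here red, rEd and BLUE come back as red, red and blue, while yellow is rejected.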
When you're writing your own extraction handlers you really should check that you can extract what you expect, and can't extract what you shouldn't be able to. The Test::CGI::Untaint module (blatant plug, since I wrote it) can help you here. It defines two tests, is_extractable and unextractable, that check that the extraction handler you name either extracts a value as what you expect or doesn't extract anything at all, respectively.
    #!/usr/bin/perl

    use strict;
    use warnings;

    # start the testing
    use Test::More tests => 7;
    use Test::CGI::Untaint;

    # check if we can extract the basic colours
    is_extractable("red",   "red",   "red_green_blue", "try red");
    is_extractable("green", "green", "red_green_blue", "try green");
    is_extractable("blue",  "blue",  "red_green_blue", "try blue");

    # check the case stuff works
    is_extractable("Red", "red", "red_green_blue", "try Red");
    is_extractable("rEd", "red", "red_green_blue", "try rEd");
    is_extractable("reD", "red", "red_green_blue", "try reD");

    # but not yellow
    unextractable("yellow", "red_green_blue", "try yellow");