2023 twenty-four merry days of Perl

HTML/XSS scrubbing and file upload validation in Catalyst

Catalyst::Plugin::CheckFileUploadTypes - 2023-12-07

At work, we needed to tighten up the security of our Catalyst-powered API, with two main requirements:

  • Stripping HTML/XSS attempts from incoming parameters

  • Validating that file uploads are expected and are the expected type


We found Catalyst::Plugin::HTML::Scrubber, which at first glance looked like it would do at least most of what we needed, automatically scrubbing parameters using HTML::Scrubber.

We're not fans of reinventing wheels when we can avoid it, so I set about adding the extra features we needed - in particular, being able to exempt particular parameters from scrubbing, by name or regex match - and raised a pull request to share that upstream. Unfortunately, the original author doesn't seem to be active in the Perl community any more, and several attempts to contact them failed - so I followed the usual steps to adopt a module: I approached our friendly CPAN admins for help, obtained co-maint, and released a new version.
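
To make the idea of exempting parameters by name or regex concrete, here is a minimal sketch in plain Perl. It is not the plugin's actual code: scrub_params, the parameter names and the $strip_tags stand-in are all hypothetical, and a real application would call HTML::Scrubber's scrub() method rather than the naive regex used here.

```perl
use strict;
use warnings;

# Hypothetical helper, not the plugin's actual code: scrub every parameter
# except those exempted by exact name or by regex match.
sub scrub_params {
    my ($params, $scrub, $exempt) = @_;
    my %clean;
    for my $name (keys %$params) {
        # Count how many exemption rules match this parameter name.
        my $is_exempt = grep {
            ref($_) eq 'Regexp' ? $name =~ $_ : $name eq $_
        } @$exempt;
        $clean{$name} = $is_exempt
            ? $params->{$name}
            : $scrub->($params->{$name});
    }
    return \%clean;
}

# Stand-in scrubber for illustration only; real code would delegate to
# HTML::Scrubber instead of stripping tags with a naive regex.
my $strip_tags = sub { my $v = shift; $v =~ s/<[^>]*>//g; return $v };

my $clean = scrub_params(
    { comment => '<b>hi</b>', raw_html => '<p>kept</p>', tpl_body => '<i>x</i>' },
    $strip_tags,
    [ 'raw_html', qr/^tpl_/ ],    # exempt by exact name and by regex
);
```

Here only comment is scrubbed; raw_html is exempt by name and tpl_body by regex, so both pass through untouched.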

Since then we have added more, including recursive scrubbing of parameters within serialised POST/PUT request bodies.
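
Recursive scrubbing of a deserialised body boils down to walking the data structure and scrubbing every string found in it. The following is a rough sketch under that assumption, not the plugin's implementation: scrub_recursive and the sample body are hypothetical, and the tag-stripping regex again stands in for HTML::Scrubber.

```perl
use strict;
use warnings;

# Hypothetical sketch: walk a decoded request body (e.g. parsed JSON) and
# scrub every string value, however deeply it is nested.
sub scrub_recursive {
    my ($data, $scrub) = @_;
    if (ref($data) eq 'HASH') {
        my %out;
        $out{$_} = scrub_recursive($data->{$_}, $scrub) for keys %$data;
        return \%out;
    }
    if (ref($data) eq 'ARRAY') {
        return [ map { scrub_recursive($_, $scrub) } @$data ];
    }
    # Leave undef and any other kind of reference untouched.
    return $data if !defined $data or ref $data;
    return $scrub->($data);
}

# Stand-in scrubber for illustration only.
my $strip_tags = sub { my $v = shift; $v =~ s/<[^>]*>//g; return $v };

my $body = {
    title => '<script>x</script>Hello',
    tags  => [ '<b>perl</b>', 'catalyst' ],
    meta  => { author => { name => '<i>Bob</i>' } },
};
my $scrubbed = scrub_recursive($body, $strip_tags);
```

The function returns a new structure rather than modifying the original in place, which keeps the unscrubbed body available should an exempted code path need it.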


Next, we needed to add checking of uploaded files. Some API actions do expect uploaded files, but most don't. We wanted to make it easy to centralise that checking so that if an action hasn't specified that it expects file uploads, any attempts to upload files in requests sent to it should be rejected.

It should also be easy for the action to denote which MIME types it expects to receive, without lots of boilerplate code being added to each action.

Naturally, we can't just trust the Content-Type header in the request, because the client could lie to us; the type has to be determined from the content of the file we were actually sent.
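
To show why the content matters more than the header, here is a deliberately simplified sketch of content sniffing by leading "magic" bytes. It is only a stand-in for a real detection library such as File::MMagic: sniff_type and the two-entry signature table are hypothetical and cover just JPEG and PNG.

```perl
use strict;
use warnings;

# Simplified stand-in for a magic-number library: recognise a couple of
# image formats from the bytes at the start of the file.
my @SIGNATURES = (
    [ 'image/jpeg' => "\xFF\xD8\xFF" ],
    [ 'image/png'  => "\x89PNG\x0D\x0A\x1A\x0A" ],
);

sub sniff_type {
    my ($bytes) = @_;
    for my $sig (@SIGNATURES) {
        my ($type, $magic) = @$sig;
        # index() == 0 means the content starts with the signature.
        return $type if index($bytes, $magic) == 0;
    }
    return 'application/octet-stream';    # unknown content
}

# A client claims image/png, but the body is really a shell script.
my $claimed  = 'image/png';
my $detected = sniff_type("#!/bin/sh\necho pwned\n");
my $mismatch = $claimed ne $detected;
```

In this example the claimed and detected types disagree, which is exactly the case where the request should be treated with suspicion.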

We didn't find anything that fitted our needs, so I created Catalyst::Plugin::CheckFileUploadTypes, using subroutine attributes on the actions to mark that they expect uploads, for instance:

use Catalyst qw(CheckFileUploadTypes);

# Actions can declare that they expect to receive file uploads:
sub upload_file : Local ExpectUploads { ... }

# They can also specify that any uploaded files must be of expected types
# (determined from file content by File::MMagic, not what the client said,
# as they could lie to us)
sub upload_image : Local ExpectUploads(image/jpeg image/png) { ... }
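
The decision the attributes drive can be sketched as a small pure function. This is a hypothetical illustration rather than the plugin's internals: check_uploads is an invented name, and the $attrs argument assumes the shape of Catalyst's $c->action->attributes, i.e. a hashref mapping each attribute name to an arrayref of its argument strings.

```perl
use strict;
use warnings;

# Hypothetical decision helper, not the plugin's real internals.
# $attrs is assumed to look like $c->action->attributes; @detected holds
# the content-sniffed MIME types of the files in the request.
sub check_uploads {
    my ($attrs, @detected) = @_;
    return 'ok' unless @detected;    # nothing was uploaded

    return 'rejected: uploads not expected'
        unless exists $attrs->{ExpectUploads};

    # Collect the whitespace-separated types given to the attribute.
    my %allowed;
    for my $arg (@{ $attrs->{ExpectUploads} }) {
        $allowed{$_} = 1 for split ' ', ($arg // '');
    }
    return 'ok' unless %allowed;     # bare ExpectUploads: any type is fine

    for my $type (@detected) {
        return "rejected: unexpected type $type" unless $allowed{$type};
    }
    return 'ok';
}

my $plain_action = { Local => [undef] };
my $image_action = { Local => [undef], ExpectUploads => ['image/jpeg image/png'] };

my $r1 = check_uploads($plain_action, 'image/png');
my $r2 = check_uploads($image_action, 'image/png');
my $r3 = check_uploads($image_action, 'text/plain');
```

An action without the attribute rejects any upload, while an action listing types rejects only files whose detected type is not on the list.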

There was a little bit of fun involved when the app uses Catalyst::Action::REST, in which case we want to look for the attributes on the action suffixed with the HTTP method - for example, index_POST.

More features are planned (and may well have been implemented by the time you read this!) - including:


  • The ability to match wildcard types, e.g. image/* for any type of image

  • Extra heuristics to distinguish more file types - for example, a shell script and an XML file are both reported as text/plain by the underlying File::MMagic, which is not very helpful

  • More options to control how unexpected uploads are handled

  • Callbacks to fire for each uploaded file to perform additional checks on it - for example, running it through a virus checker, generating a hash and checking online services for matches, or other checks on the content of the file
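
As an illustration of what wildcard matching such as image/* could look like, here is a minimal sketch. It is purely hypothetical and says nothing about how the plugin does or will implement the feature; type_matches is an invented name.

```perl
use strict;
use warnings;

# Hypothetical matcher: "image/*" accepts any subtype of image, while any
# other pattern must equal the detected type exactly (case-insensitively).
sub type_matches {
    my ($pattern, $type) = @_;
    my ($major, $minor) = split m{/}, $pattern, 2;

    # Not a wildcard pattern: require an exact match.
    return lc($type) eq lc($pattern)
        unless defined $minor && $minor eq '*';

    # Wildcard: the detected type must start with "major/".
    return index(lc($type), lc("$major/")) == 0;
}

my $any_image  = type_matches('image/*',   'image/png');
my $not_image  = type_matches('image/*',   'text/plain');
my $exact_only = type_matches('image/png', 'image/jpeg');
```

Comparing against "major/" including the slash avoids a pattern like image/* accidentally accepting an unrelated type that merely begins with the same letters.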

Feedback, suggestions and patches welcome!

This article contributed by: David Precious <davidp@preshweb.co.uk>