IO::AtomicFile is one of those modules that has saved my skin more times than I care to mention. It's a simple module that deals with the situation where you're overwriting an existing file, and you want to preserve the existing file right up until the moment you're done writing the new file and you're sure it's a valid replacement.
To do this the module automates the process of writing to a temp file and then renaming that file to the destination. This means that any program trying to read the existing file will get the old one until the new file is completely done.
The other big advantage of this technique is that at any point you can abandon the file you're writing and the old one is still intact. Because of this, IO::AtomicFile lets you write defensive code that behaves well when it encounters errors.
IO::AtomicFile is really easy to use - you just use it exactly like you'd use IO::File. But to explain that, I'd better explain how IO::File works, just in case you haven't used it before.
This is a quick recap of file handling in Perl. Feel free to skip to the next section if this is all familiar to you.
An example program that creates a web page full of random numbers:
    # open the filehandle to index.html
    open FH, ">", "index.html"
      or die "Can't open 'index.html': $!";

    # print the web page
    print FH webpage();

    # close the filehandle.  Will happen automatically at the
    # end of the script if we don't do it ourselves
    close FH;

    # create a page of random numbers
    sub webpage {
      my $output = q{
    <html>
     <head><title>Today's Random Numbers</title></head>
     <body>};

      # create the random numbers
      foreach my $count (1..100) {
        $output .= "Random number $count: " . int(rand(10000)) . "<br />\n";
      }
      $output .= "</body></html>";
      return $output;
    }
Since Perl 5.6.0 we've been able to rewrite the above file operations to keep the filehandle in an ordinary lexical scalar.
    # open the filehandle to index.html
    open my $fh, ">", "index.html"
      or die "Can't open 'index.html': $!";

    # print the webpage
    print {$fh} webpage();

    # close the file.  If we don't do this then it will be closed
    # automatically when $fh goes out of scope.
    close $fh;
This has two advantages: $fh is now just a normal scalar that you can pass around like any other variable, and there's a lot less chance of introducing weird scoping bugs. There's another possibility though - one that works on even older perls than 5.6.0: the IO::File module.
    use IO::File;

    # open the file handle to index.html
    my $fh = IO::File->new("index.html", ">")
      or die "Can't open 'index.html': $!";

    # print the web page to the file handle
    print {$fh} webpage();

    # close the file.  If we don't do this then it will be closed
    # automatically when $fh goes out of scope.
    close $fh;
Note that the order of the arguments is different from the built-in open function: the filename comes first, then the mode.
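To make the difference in argument order concrete, here's a small sketch that writes a file both ways. The scratch directory and the file names are placeholders of my own, there only so the example doesn't touch any real files.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# scratch directory (illustrative only), removed when the script exits
my $dir = tempdir(CLEANUP => 1);

# built-in open: mode first, then the filename
my $builtin_path = catfile($dir, "builtin.html");
open my $fh1, ">", $builtin_path
  or die "Can't open '$builtin_path': $!";
print {$fh1} "written with open\n";
close $fh1 or die "Can't close '$builtin_path': $!";

# IO::File: filename first, then the mode
my $iofile_path = catfile($dir, "iofile.html");
my $fh2 = IO::File->new($iofile_path, ">")
  or die "Can't open '$iofile_path': $!";
print {$fh2} "written with IO::File\n";
$fh2->close or die "Can't close '$iofile_path': $!";
```

Either way you end up with a filehandle in a scalar that you can print to with the same print {$fh} syntax.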
Now the question is, what happens if someone tries to access the webpage at the same time as you're updating it? They could quite possibly (assuming that you're writing quite slowly) read the file as you're creating it and get only half a file. This is where IO::AtomicFile comes in. Simply by replacing IO::File with IO::AtomicFile we get atomic file creation.
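    use IO::AtomicFile;

    # open the file handle to index.html
    my $fh = IO::AtomicFile->new("index.html", ">")
      or die "Can't open 'index.html': $!";

    # print the web page to the file handle
    print {$fh} webpage();

    # close the file using the close method (rather than the built-in
    # close) so that the rename into place happens.  If we don't do
    # this then it will be closed automatically when $fh goes out
    # of scope.
    $fh->close
      or die "Can't install 'index.html': $!";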
What actually happens is that a temporary file, index.html..TMP, is created alongside the original and the output is written there. This file is then renamed to index.html - and this is an 'atomic operation', meaning that it happens, to all intents and purposes, instantaneously. One instant the old file is there, the next instant the new one is in place.
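If you're curious what the write-then-rename technique looks like done by hand, here's a minimal sketch using only core Perl. It isn't IO::AtomicFile's actual implementation: atomic_write is a hypothetical helper of my own, and the temp file name it picks is just an assumption for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# atomic_write is a hypothetical helper, not part of IO::AtomicFile
sub atomic_write {
    my ($path, $content) = @_;

    # temp file lives next to the target; the name is my own choice
    my $temp = "$path.tmp.$$";

    # write the complete new contents to the temp file first
    open my $fh, ">", $temp or die "Can't open '$temp': $!";
    print {$fh} $content    or die "Can't write to '$temp': $!";
    close $fh               or die "Can't close '$temp': $!";

    # swap the finished file into place in a single step
    rename $temp, $path or die "Can't rename '$temp' to '$path': $!";
    return 1;
}

# scratch directory so the sketch doesn't touch real files
my $dir  = tempdir(CLEANUP => 1);
my $page = catfile($dir, "index.html");

atomic_write($page, "old page\n");
atomic_write($page, "new page\n");

# read the file back to see which version is live
open my $in, "<", $page or die "Can't open '$page': $!";
my $seen = do { local $/; <$in> };
close $in;
print $seen;    # prints "new page"
```

Keeping the temp file in the same directory as the target matters here: a rename within one filesystem is a single step, whereas moving a file between filesystems generally isn't.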
This rename happens when the filehandle's close method runs - both when you call it explicitly and when you let $fh go out of scope and it's closed automatically. In other words, you don't normally need to worry about it - it happens transparently.
The best thing about writing to a temporary file rather than directly to the 'live' file is that if at any time anything goes wrong, we can simply back out and give up. This is the actual code that I use to create the Advent Calendar pages.
    # open an atomic file.  This creates a temp file and means that we
    # can both abandon changes made, and that the live version will
    # be replaced suddenly.  (catfile is from File::Spec::Functions)
    my $output_fh = IO::AtomicFile->new(catfile($dir,$file),">")
      or die "Can't open file for writing: $!";

    # try writing the template, and unless it's okay...
    unless ($template->process(catfile(TEMPLATE_DIR,$file),{}, $output_fh))
    {
      # eeek! a problem, okay, don't write that to the real file
      # whatever we do, delete the temp file
      $output_fh->delete;

      # die
      die "Problem with template: " . $template->error();
    }
So, as you can see, we check if the $template->process call had any errors with the unless (it'll return undef if it did) and if it did we abandon the file we've been working on by calling the delete method on the filehandle.
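Here's a self-contained sketch of the same back-out pattern that you can run without Template Toolkit. It assumes IO::AtomicFile is installed from CPAN; build_page is a hypothetical stand-in for $template->process that always fails partway through, and the scratch directory is only there for the example.

```perl
use strict;
use warnings;
use IO::AtomicFile;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# scratch directory so the sketch doesn't touch real files
my $dir  = tempdir(CLEANUP => 1);
my $page = catfile($dir, "index.html");

# put a known-good 'live' page in place first
open my $live, ">", $page or die "Can't open '$page': $!";
print {$live} "good page\n";
close $live or die "Can't close '$page': $!";

# build_page is a hypothetical stand-in for $template->process;
# it writes half a page and then reports failure
sub build_page {
    my ($fh) = @_;
    print {$fh} "half a pa";
    return;    # false: something went wrong
}

my $ok = eval {
    my $fh = IO::AtomicFile->new($page, ">")
      or die "Can't open file for writing: $!";

    unless (build_page($fh)) {
        # abandon the temp file; the live page is never touched
        $fh->delete;
        die "Problem building page\n";
    }

    # only reached on success: rename the temp file into place
    $fh->close or die "Couldn't install new page: $!";
    1;
};
print "abandoned: $@" unless $ok;

# the live file should still hold the old, complete page
open my $in, "<", $page or die "Can't open '$page': $!";
my $content = do { local $/; <$in> };
close $in;
print "live file still says: $content";
```

Wrapping the work in an eval is just one way to show the failure without killing the script; in a real program you might well let the die propagate, as the Advent Calendar code does.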