Jingle Refs, Jingle Refs, Jingle all the way
Santa's organization has two essential problems with their code; It has to work 100% of the time (you don't get a second chance at Christmas) and it has to be really efficient and fast (since there's a lot of children in the world and hence the code has to crunch a heck of a lot of data).
Often these goals can be at odds with each other. Abstractions, while inherently safer than writing the same small snippet of code over and over (each with a chance to make a small mistake) can lead to slower code, introducing extra subroutine calls or data processing that simply wasn't there before.
But not always. Tonight we'll tell you of the wondrous day that a Christmas miracle occurred in the most unlikely of places...the annual code review.
A Worrying Pattern
During the annual code-review, the Wise Old Elf had noticed a common pattern we all maintain: Checking the validity of references using the
There were two problems with this ref checking pattern. Sometimes some elf would get it subtly wrong:
So subtle was that error that it often slipped through into production! Perl isn't smart enough to know that
HSAH probably is a typo for
HASH and not a class name the reference had been blessed into. And since error checking code is notoriously hard to test, the automated test suites that Santa insisted on frequently didn't always catch this problem.
If the elves were worried only about correctness alone the code could be replaced by something like this:
Since under strict a typo in
is_plain_hash would cause a compile time error any fat-fingered elf problems would be caught. However, that's just introduced another subroutine call, and since these ref checks are used all over Santa's code base, the cost of doing that adds up!
The second problem with the pattern was the fact that it was strictly doing more work that it needed to. Consider what the following was actually doing:
Under the hood perl is inspecting bits on the reference to see if it's a
hash value), then building a standard SV string containing the four characters
HASH, then in a separate operation comparing that string character by character with the
HASH string which was passed in in the code. That, of course, all happens really quickly, but it's more work (and more Perl virtual ops) than need to happen just to tell if this was a hashref or not!
So the code review had found two problems: One, an abstraction was needed to make it more reliable. Two, the code needed to involve less stuff to make it faster.
The Voice of an Angel
Ref::Util provides a set of helpful functions to determine what kind of reference a variable is. It abstracts several awkward ref checking patterns into new function. So, as our above example:
Can be written as:
Or maybe we want to check for a blessed array reference, making sure it isn't accidentally blessed? The following:
Can much more succinctly be written with the new
Not writing a lot of code but still being accurate is something Santa appreciates, but that's not the best part....the best part is the speed increase.
Remember when the Elves were worried about the overhead of introducing a new function call? Well, Ref::Util doesn't do that. It introduces a set of new custom ops to do the hard work instead which is much much faster than any kind of function call.
To understand what that means requires an understanding of the internals of perl. When Perl code is run the code is first compiled into a series of "ops" - operations like "add these values", "push to list", or even as complicated as "run this regular expression". These ops are somewhat like the machine code that runs on your actual processor, but much higher level and more complicated. When perl actually runs your code it runs a sort of virtual machine that basically looks at each of these operations in turn and does whatever operation they tell it to do. At a low level what can make perl slow is the number of ops it has to process (since running each op has an overhead) and the complexity of the ops it runs (for example function calls and regular expressions are expensive compared to other more simple ops like add and subtract).
When you call
ref $foo eq 'HASH' there are actually several ops that come into play:
... 6 <2> seq vK/2 ->7 4 <1> ref[t2] sK/1 ->5 3 <#> gvsv[*foo] s ->4 5 <$> const[PV "HASH"] s ->6
Each of these is an operation that perl has to do (create a list, get the value, call ref, etc.). However, when you use Ref::Util, when
is_hashref gets encountered, it is then replaced with the following custom op:
4 <1> is_hashref vK/1 ->5 3 <#> gvsv[*foo] s ->4
While Santa is not exactly sure what all these crazy looking "op-trees" mean, an elf helpfully explained that "it makes the sled go faster". And we don't even need to put racing stripes on the sled!
Less error prone and faster too. Just what the man in red ordered.