2024 twenty-four merry days of Perl Feed

Sleigh Bells and Custom Ops: A Jolly Journey with Ref::Util

Ref::Util - 2024-12-03

Ho ho ho! 🎅 As the festive season wraps us in its warm embrace, let's unwrap a shiny gift for Perl developers that's been nestled under the tree for a few years now: Ref::Util. Just like how some of the best presents are hidden in plain sight, Ref::Util brings a sprinkle of efficiency and clarity to our code, turning "naughty" reference checks into "nice" ones. So grab a mug of hot cocoa, and let's dive into this yuletide tale!

A Christmas Classic Worth Revisiting

If Ref::Util sounds familiar, it might be thanks to the Perl Advent Calendar from 2016, where it was first introduced as a way to reduce common errors in ref-checking and boost performance with custom ops. However, many Perl newcomers and those who missed the original may not yet know about this handy tool. Let’s revisit this modern Christmas classic and explore how Ref::Util can make your Perl code cleaner, faster, and more robust.

The Ghost of Reference Checks Past 🎁

In the days of yore, you might have found yourself writing code like:

if ( ref $sleigh eq 'HASH' ) {
    load_presents($sleigh);
}

But alas! This is like finding coal in your stocking. 🎄 By using ref $sleigh eq 'HASH', we're peeking behind the curtains, exposing Perl's internal workings. We're grabbing the string representation of a reference and hoping it matches our expectations. If $sleigh isn't a reference at all, ref returns an empty string — leaving us out in the cold. (At least it's not undef, which would likely wreak havoc during runtime).

Unwrapping Ref::Util ✨

Enter Ref::Util, our festive hero that's been around for a while but might not have made it onto your wish list yet! With it, we can rewrite our code as:

use Ref::Util qw< is_hashref >;

if ( is_hashref($sleigh) ) {
    load_presents($sleigh);
}

Isn't that as refreshing as a winter's first snow? ❄️ The code is cleaner and more readable, but the true magic lies beneath the surface. Ref::Util doesn't just sprinkle a bit of tinsel on your code — it supercharges it! Ref::Util even provides a sleigh full of functions to check all sorts of references:

  • is_ref($variable)

  • is_scalarref($variable)

  • is_arrayref($variable)

  • is_hashref($variable)

  • is_coderef($variable)

  • is_regexpref($variable)

  • is_globref($variable)

  • is_formatref($variable)

  • is_ioref($variable)

  • is_refref($variable)

And their is_plain_* and is_blessed_* counterparts.

This comprehensive suite lets you check for any type of reference without exposing Perl's internals, making your code more robust and intention-revealing.

Example:

use Ref::Util qw< is_blessed_arrayref >;

if ( is_blessed_arrayref($elves) ) {
    assign_tasks($elves);
}

Instead of:

use Scalar::Util qw< blessed reftype >;
if ( ref_type($elves) eq 'ARRAY' && blessed($elves) ) {
    assign_tasks($elves);
}

(You might consider calling blessed($elves) eq 'ARRAY' but that doesn't differentiate between package name and reference type.)

Performance: The Gift That Keeps on Giving 🎁

Now, you might be thinking, "Sure, the code looks nicer, but is that all?" Oh, dear reader, the true magic lies in the massive speed improvements! Ref::Util doesn't just make your code more readable — it makes it faster.

Let's peek into Santa's workshop and look at some benchmarks:

    Ref::Util::is_plain_arrayref (CustomOP):  2.7951 +/- 0.0062 (0.2%) -- baseline
    ref():                                    5.3350 +/- 0.0180 (0.3%) -- 1.90 times
    Data::Util::is_array_ref:                 5.9074 +/- 0.0075 (0.1%) -- 2.11 times
    ref(), reftype(), !blessed():            15.5450 +/- 0.0310 (0.2%) -- 5.56 times

As you can see, Ref::Util::is_plain_arrayref using Custom Ops is nearly twice as fast as ref() (2.7951 vs. 5.335 seconds) and more than five times faster than combining ref(), Scalar::Util::reftype(), and blessed() checks (2.7951 vs. 15.545 seconds). It's like upgrading from a hand-drawn sleigh to a rocket-powered one! 🚀

Elves Under the Hood: The Magic of Op Codes 🧙‍♀️

So, how does Ref::Util achieve this festive feat? It's all thanks to some Christmas magic called Custom Ops, introduced in Perl 5.14 (released in 2011). Let's unwrap this concept together.

What Are Op Codes and the Optree?

In Perl, your code gets compiled into a tree of operations called the optree. Each operation (op) represents a fundamental action, like addition, subtraction, or fetching a variable. Think of it as the assembly instructions for Santa's elves.

The Traditional Method's Optree

Let's take the following one-liner:

    perl -MO=Concise -e'if ( ref $sleigh eq "HASH" ) {1}'

We can see the following optree, which is printed out by B::Concise when loaded with -MO=Concise:

    7  <@> leave[1 ref] vKP/REFC ->(end)
    1     <0> enter v ->2
    2     <;> nextstate(main 1 -e:1) v:{ ->3
    -     <1> null vK/1 ->7
    6        <|> and(other->7) vK/1 ->7
    5           <2> seq sK/2 ->6
    -              <1> ex-rv2sv sK/1 ->4
    3                 <#> gvsv[*sleigh] s ->4
    4              <$> const[PV "HASH"] s ->5
    -           <@> scope vK ->7
    -              <;> ex-nextstate(main 3 -e:1) v ->-
    -              <0> ex-const v ->-

This involves multiple ops: fetching the variable, getting its reference type as a string, and then comparing that string.

Scalar::Util's reftype Optree

Using Scalar::Util's reftype, the following code:

    perl -MO=Concise -MScalar::Util=reftype -e'if ( reftype($sleigh) eq "HASH" ) {1}'

We get the following optree:

    a  <@> leave[1 ref] vKP/REFC ->(end)
    1     <0> enter v ->2
    2     <;> nextstate(main 49 -e:1) v:{ ->3
    -     <1> null vKP/1 ->a
    9        <|> and(other->a) vK/1 ->a
    8           <2> seq sK/2 ->9
    6              <1> entersub[t1] sKS/TARG ->7
    -                 <1> ex-list sK ->6
    3                    <0> pushmark s ->4
    -                    <1> ex-rv2sv sKM/1 ->5
    4                       <#> gvsv[*sleight] s ->5
    -                    <1> ex-rv2cv sK/1 ->-
    5                       <#> gv[*reftype] s ->6
    7              <$> const[PV "HASH"] s ->8
    -           <@> scope vK ->a
    -              <;> ex-nextstate(main 51 -e:1) v ->-
    -              <0> ex-const v ->-

This involves more ops due to the subroutine call to reftype, adding overhead. Also, notice pushmark which adds additional variables to the stack, adding variables to the function stack that could be retrieved by calling shift. (You read this right - even having parameters to a function includes another opcode.)

Ref::Util with Custom Ops

With Ref::Util on Perl 5.14 or newer, and when using a version of Ref::Util that supports Custom Ops, let's write the following code:

    perl -MO=Concise -MRef::Util=is_hashref -e'if ( is_hashref($sleight) ) {1}'

We get the following optree:

    6  <@> leave[1 ref] vKP/REFC ->(end)
    1     <0> enter v ->2
    2     <;> nextstate(main 73 -e:1) v:{ ->3
    -     <1> null vKP/1 ->6
    5        <|> and(other->6) vK/1 ->6
    4           <1> is_hashref sK/1 ->5
    -              <1> ex-rv2sv sKM/1 ->4
    3                 <#> gvsv[*sleight] s ->4
    -           <@> scope vK ->6
    -              <;> ex-nextstate(main 75 -e:1) v ->-
    -              <0> ex-const v ->-

Notice how is_hashref becomes its own op in the optree, named is_hashref. This Custom Op directly checks the reference type at the opcode level, eliminating unnecessary steps.

How Does This Magic Happen?

  • Custom Ops Implementation (Fastest) 🎅

    When available, Ref::Util uses Custom Ops written in C (using XS). These are new operations added directly into Perl's optree, resulting in minimal overhead and maximal speed. It's like giving the elves new, more efficient tools.

  • XS Implementation (Fast) 🦌

    If Custom Ops aren't available (due to an older Perl version or limitations in the environment), Ref::Util gracefully falls back to an XS implementation without Custom Ops. This means the functions are still written in C for performance but are called as regular functions. It's still faster than pure Perl but not as swift as using Custom Ops.

  • Pure-Perl Implementation 🛷

    When neither Custom Ops nor XS implementations are available (such as on systems without a C compiler), Ref::Util provides a pure-Perl fallback. This version has similar performance to the traditional method but retains the benefits of cleaner syntax and abstraction.

The Spirit of the Season: Cleaner and Faster Code 🎅

Beyond performance, Ref::Util embodies the true spirit of the holidays — making things better for everyone. By using functions like is_hashref, is_arrayref, is_coderef, et. al., our code becomes more expressive, intention-revealing, and helps catch typos in reference checks. It's easier for others (and future you) to read and understand, reducing those "bah humbug" moments during code reviews.

When Feeling Grateful 🙏

There are plenty of people who worked hard to make Ref::Util possible, and they deserve a lot of credit:

  • Vikenty Fesunov

  • Aaron Crane

  • Gonzalo Diethelm

  • Karen Etheridge

  • Yves Orton

  • Steffen Müller

  • Jarkko Hietaniemi

  • Mattia Barbon

  • Zefram

  • Tony Cook

  • Sergey Aleynikov

A New Year's Resolution 🥂

As we bid farewell to the old year and welcome the new, let's make a resolution to write cleaner, faster, and more maintainable Perl code. Ref::Util is a wonderful tool to help us on that journey.

So, next time you find yourself writing ref $variable eq 'TYPE', remember there's a merrier way. Let Ref::Util spread joy in your codebase, and may your days be merry and bright!

Happy Holidays and Happy Coding!

May your CPU cycles be short and your code delights be plenty! 🎉

Gravatar Image This article contributed by: Sawyer X <sawyer.x@gmail.com>