2016 twenty four merry days of Perl Feed

PDF-Reuse + Reduce & Recycle

PDF::Reuse - 2016-12-20

One thing Santa should be glad about is that he doesn't have to deal with billing. Can you imagine having to invoice all the parents of the world? Sadly in the commercial sector we have to worry about sending invoice to clients. And as an overhead - you can't invoice for invoicing after all - we want to automate it all away. What better language to use than Perl?

While invoices these days don't actually have to be printed on paper anymore, they remain simulated paper documents: It's time to get Perl to create a bunch of PDFs for us.

We're building a billing solution for Advertising Agency. It has the usual Estimate → Purchase Order → Invoice cycle, with Detail, Rate, Qty and Amount format. Just to make this more complicated it has varied column information for handling Print Media & Audio/Video Media with more detail on print media position & A/V media program slot and timing. In some cases, each media type will be handled by different legal entities for manage tax related issues. Finally, the requirements were:

  1. 6 work segments (Production, Production with PO, Print Media, Audio/Video, Production with TAX)
  2. Upto 6 different legal entities for each segment
  3. 3 type of estimates
  4. Purchase Order
  5. 3 type of Release Orders for handle media communication
  6. 3 type of invoices

Wow! That's a lot of stuff to consider.

Challenges

We went in search of suitable PDF production module. We mainly looked for:

  1. Ability to tailor build the document
  2. A module that gives production control in pixel level
  3. Flexible options to set page properties
  4. Option to implement signatures
  5. Position control to handle page continuation

After a lengthy search for few days, we finally zeroed in PDF::Reuse. It did not have user friendly function wraps for graphical output, but it offered a developer friendly core functions to control the graphical output in a pixel level. It helped to wrap our custom functions to handle in generic way.

Map to Reuse, Reduce & Recycle

One of PDF::Reuse's main advantage is that it can take an existing PDF - something that our designers can create with their standard PDF tools - and then all we have to do is add the various per invoice details to it.

The code to re-use a PDF with PDF::Reuse is really simple:


1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 

 

#!perl

use PDF::Reuse;
use strict;
use warnings;

# The file we're going to make
prFile('myFile.pdf');

# Take page one from a PDF we already have we're going to reuse
prForm('source.pdf');

# Add some custom text
prText(150, 700, 'Customer Data');

# And finally write it our
prEnd();

 

The feature of reusing the existing PDF and writing content over that, helped to reuse the existing legal entity letter pads in specific company document production. It's avoided manual graphic production work.

A Range of Functions

The PDF::Reuse module provides a bunch of primative functions for adding data to our document. For example, we've already seen using the prText() function to add text to the document. But PDF::Reuse provides a plethora of other functions that can do all kinds of things.

These can be as simple as prPage


1: 
2: 

 

# add a new page to the document
prPage();

 

Or as powerful as prImage:


1: 
2: 
3: 
4: 
5: 
6: 
7: 

 

# insert an existing PDF into the document at the x,y coords:
prImage({
    'file' => $filename,
     'x' => $c_v{x},
     'y' => $c_v{y_data},
     'size' => $scale_factor,
});

 

The problem with these functions is that they're all primative. If we want to create our document in a sensbile fashion we need to build up our own libray on top of these simplistic functions.

Abstracting

We classified the document we wanted to write into three major parts:

  1. The page layout and page background
  2. Header
  3. Content

It looks like a core wrapper function we need is some way to write out a row of row structure to meet individual lines and table like outputs. The wrapper function constructed based on PrText(),PrAdd(),PrImage()


1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 

 

# Data structure input for produce one liner output
# base structure

my $print_data = {
    0 => {
        left_xaxis => 250, # print start x position
        line_height => 25,

# column definition
header => {
# define the first (and only) column
0 => {
                font => 'Helvetica-Bold',
                font_size => 12,
            },
        },

# insert the data
data => [['Production Estimate']],
    },
 };

 

It will produce a title Production Estimate in page center. A data structure that produces several rows look like this:


1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
26: 
27: 
28: 
29: 
30: 
31: 
32: 
33: 
34: 
35: 

 

my $row_data = {
    0 => {
        line_height => 12,

# format definition for 3 columns
header => {
            0 => {
                font => 'Helvetica',
                font_size => 9,
                d_width => 50
            },
            1 => {
                font => 'Helvetica',
                font_size => 9,
                d_width => 10
            },
            2 => {
                font => 'Helvetica-Bold',
                font_size => 9,
                d_width => 90,
                align => 'left'
            },
        },

# data for 3 columns
data=> [
             ['Client:','','<Client Address>' ],
             ['Address:','','<Address 1>' ],
             ['','','<Address 2>' ],
             ['','','<Address 3>' ],
             ['Estimate No.','','<estimate no.>' ],
             ['Date:','','<estimate date>' ]
        ],
    },
 };

 

Writing the code to make this work is actually quite fiddly: We have to do a lot of accounting of how much space each column takes up, and keep track of where we need to place the next line.

Rather than showing you pages and pages of our code, here's an example of a small snippet of code that we use to create the header and shows some of these challenges:


1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
26: 
27: 
28: 
29: 
30: 
31: 
32: 
33: 
34: 
35: 
36: 
37: 
38: 
39: 
40: 
41: 

 

sub header_text{

  my $data = shift;
  my $page = shift;

  my ( $width, $height ) = checking( $page );
  my $y = $height - $page->{margin}{top};

# each data item
for my $key ( sort { $a <=> $b } keys %{ $data } ){

    my $chunk = $data{ $key };
    my $start_x = $chunk->{left_xaxis} || $page->{margin}{left};

# each row
for my $row_data (@{ $chunk->{data} }) {

        my $counter = 0;
        my $x = $start_x;

        for my $column_data (@{ $row_data }) {

            my $meta = $chunk->{header}{ $counter };

            prFont( $meta->{font} || $page->{font} );
            prFontSize( $meta->{font_size} || $page->{font_size} );
            prText( $x, $y, $column_data );

            $x += $meta->{d_width};

            ++$counter;

        } # each column

# reflect current y position
$y -= $chunk->{line_height} || $page->{line_height};

    } # each row
   } # each item
   return $y;
} # end

 

You can see the techniques that we leverage: Keeping track of the changes $x and $y as we render each section, rendering column by column and looking up the corresponding header information for each data section, and using per page defaults when the meta data doesn't have settings in each section.

By abstracting this logic into sections we can deal with the complexity and build up a powerful library to easily create our own pages, making PDF::Reuse very powerful.

In Summary

With our library of functions the creator will produce the PDF document based on given data on page properties & page content. In implementation case different structures are predefined first. During runtime, the bill information dynamically added to the structure. A basic structure created first, then it cloned and blended to different account needs.

PDF:Reuse core functions PrFile, PrText, PrForm, PrImage itself helped to produce business quality documents with scalability and reusability. The modules simple and straight core functions helped us to achieve our custom document generation with ease and control. Even though we've been using it for six year, its continuously satisfying the evolving needs with the basic functions.

Now the documents become a part of our clients business communication also it gave us a business. Thanks PDF::Reuse.

Example Output


Sample Invoice Document

SEE ALSO

Gravatar Image This article contributed by: Raja Renga Bashyam <raja@webstarscg.com>