Mastering Perl by brian d foy

Chapter 18. Modules As Programs

Perl has excellent tools for creating, testing, and distributing modules. On the other hand, Perl’s good for writing standalone programs that don’t need anything else to be useful. I want my programs to be able to use the module development tools and be testable in the same way as modules. To do this, I restructure my programs to turn them into modulinos.

The main Thing

Other languages aren’t as DWIM as Perl, and they make us create a top-level subroutine that serves as the starting point for the application. In C or Java, I have to name this subroutine main:

/* hello_world.c */

#include <stdio.h>

int main ( void ) {
        printf( "Hello C World!\n" );

        return 0;
        }

Perl, in its desire to be helpful, already knows this and does it for me. My entire program is the main routine, which is how Perl ends up with the default package main. When I run my Perl program, Perl starts to execute the code it contains as if I had wrapped my main subroutine around the entire file.

In a module most of the code is in methods or subroutines, so most of it doesn’t immediately execute. I have to call a subroutine to make something happen. Try that with your favorite module; run it from the command line. In most cases, you won’t see anything happen. I can use perldoc’s -l switch to locate the actual module file so I can run it to see nothing happen:

$ perldoc -l Astro::MoonPhase
/usr/local/lib/perl5/site_perl/5.8.7/Astro/MoonPhase.pm
$ perl /usr/local/lib/perl5/site_perl/5.8.7/Astro/MoonPhase.pm

I can write my program as a module and then decide at runtime how to treat the code. If I run my file as a program, it will act just like a program, but if I include it as a module, perhaps in a test suite, then it won’t run the code and it will wait for me to do something. This way I get the benefit of a standalone program while using the development tools for modules.

Backing Up

My first step takes me backward in Perl evolution. I need to get that main routine back and then run it only when I decide I want to run it. For simplicity, I’ll do this with a “Just another Perl hacker” (JAPH) program, but develop something more complex later.

Normally, Perl’s version of “Hello World” is simple, but I use the string “Just another Perl hacker,” instead, and I’ve thrown in package main just for fun. I don’t need the package statement for anything other than reminding the next maintainer what the default package is. I’ll use this idea later:

#!/usr/bin/perl
package main;

print "Just another Perl hacker, \n";

Obviously, when I run this program, I get the string as output. I don’t want that in this case though. I want it to behave more like a module so when I run the file, nothing appears to happen. Perl compiles the code, but doesn’t have anything to execute. I wrap the entire program in its own subroutine:

#!/usr/bin/perl
package main;

sub run {
        print "Just another Perl hacker, \n";
        }

The print statement won’t run until I execute the subroutine, and now I have to figure out when to do that. I have to know how to tell the difference between a program and a module.

Who’s Calling?

The caller built-in tells me about the call stack, which lets me know where I am in Perl’s descent into my program. Programs and modules can use caller, too; I don’t have to use it in a subroutine. If I use caller in the top level of a file I run as a program, it returns nothing because I’m already at the top level. That’s the root of the entire program. For a file I load as a module, caller returns something; when I run the same file as a program, caller returns nothing. That’s all I need to decide how to act depending on how I’m called:

#!/usr/bin/perl
package main;

run() unless caller();

sub run {
        print "Just another Perl hacker, \n";
        }

I’m going to save this program in a file, but now I have to decide how to name it. Its schizophrenic nature doesn’t suggest a file extension, but I want to use this file as a module later, so I could go along with the module file-naming convention, which adds a .pm to the name. That way, I can use it and Perl can find it just as it finds other modules. Still, the terms program and module get in the way because it’s really both. It’s not a module in the usual sense, though, and I think of it as a tiny module, so I call it a modulino.

Now that I have my terms straight, I save my modulino as Japh.pm. It’s in my current directory, so I also want to ensure that Perl will look for modules there (i.e., it has “.” in the search path; perl 5.26 and later leave it out by default, so there I add -I. to the command line). I check the behavior of my modulino. First, I use it as a module. From the command line, I can load a module with the -M switch. I use a “null program,” which I specify with the -e switch. When I load it as a module nothing appears to happen:

$ perl -MJaph -e 0
$

Perl compiles the module and then goes through the statements it can execute immediately. It executes caller, which returns a list of the elements of the program that loaded my modulino. Since this is true, the unless catches it and doesn’t call run(). I’ll do more with this in a moment.

Now I want to run Japh.pm as a program. This time, caller returns nothing because it is at the top level. The unless condition is false, so Perl invokes run() and I see the output. The only difference is how I called the file. As a module it does module things, and as a program it does program things. Here I run it as a script and get output:

$ perl Japh.pm
Just another Perl hacker,
$
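
To see both behaviors side by side without leaving one script, here is a minimal sketch. It assumes a throwaway modulino written to a temporary directory; the package name Local::Demo and the file name demo_modulino.pl are made up for this illustration and are not part of Japh.pm. It loads the file with require, then runs the same file with the current perl binary:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Spec;
use File::Temp qw(tempdir);

# Write a throwaway modulino. Local::Demo and demo_modulino.pl are
# hypothetical names used only for this sketch.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'demo_modulino.pl' );

open my $out, '>', $file or die "Could not write '$file': $!";
print {$out} <<'MODULINO';
package Local::Demo;

run() unless caller();

sub run {
        print "running as a program\n";
        }

1;
MODULINO
close $out or die "Could not close '$file': $!";

# Loaded with require, caller() inside the file returns the details of
# this script, so the file defines run() but does not call it.
require $file;
my $loaded_quietly = defined &Local::Demo::run;

# Run by the same perl binary, the file is the top level, caller()
# returns nothing, and run() fires.
my $program_output = `"$^X" "$file"`;

print "defined after require: ", ( $loaded_quietly ? 'yes' : 'no' ), "\n";
print "output as a program: $program_output";
```

The require line prints nothing of its own; only the second invocation, where the file is the top-level program, produces the message.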

Testing the Program

Now that I have the basic framework of a modulino, I can take advantage of its benefits. Since my program doesn’t execute if I include it as a module, I can load it into a test program without it doing anything. I can use all of the Perl testing framework to test programs, too.

If I write my code well, separating things into small subroutines that only do one thing, I can test each subroutine on its own. Since the run subroutine does its work by printing, I use Test::Output to capture standard output and compare the result:

use Test::More tests => 2;
use Test::Output;

use_ok( 'Japh' );

stdout_is( sub{ main::run() }, "Just another Perl hacker, \n" );

This way, I can test each part of my program until I finally put everything together in my run() subroutine, which now looks more like what I would expect from a program in C, where the main loop calls everything in the right order.
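
Test::Output is not a core module, so here is a rough sketch of the same kind of check using only Test::More and an in-memory filehandle. The stdout_from helper is a hypothetical name of my own, and the run subroutine here is a stand-in so the example runs by itself; in a real test file I would load Japh.pm instead:

```perl
use strict;
use warnings;

use Test::More tests => 1;

# Stand-in for the modulino's run(); a real test would load Japh.pm.
sub run {
        print "Just another Perl hacker, \n";
        }

# Hypothetical helper: collect whatever a code reference prints to the
# currently selected filehandle.
sub stdout_from {
        my $code = shift;

        my $buffer = '';
        open my $fh, '>', \$buffer
                or die "Could not open in-memory filehandle: $!";

        my $old = select $fh;   # print with no filehandle now goes to $fh
        $code->();
        select $old;            # put the previous default handle back
        close $fh;

        return $buffer;
        }

is( stdout_from( \&run ), "Just another Perl hacker, \n",
        'run() prints the message' );
```

This only catches output that goes through the selected filehandle. Anything printed explicitly to STDOUT, or by a child process, needs something like Test::Output.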

Creating the Program Distribution

There are a variety of ways to make a Perl distribution, and we covered these in Chapter 15 of Intermediate Perl. If I start with a program that I already have, I like to use my scriptdist program, which is available on CPAN (and beware, because everyone seems to write this program for themselves at some point). It builds a distribution around the program based on templates I created in ~/.scriptdist, so I can make the distro any way that I like, which also means that you can make it any way that you like, not just my way. At this point, I need the basic tests and a Makefile.PL to control the whole thing, just as I do with normal modules. Everything ends up in a directory named after the program but with .d appended to it. I typically don’t use that directory name for anything other than a temporary placeholder since I immediately import everything into source control. Notice I leave myself a reminder that I have to change into the directory before I do the import. It only took me 50 or 60 times to figure that out:

$ scriptdist Japh.pm
Home directory is /Users/brian
RC directory is /Users/brian/.scriptdist
Processing Japh.pm...
Making directory Japh.pm.d...
Making directory Japh.pm.d/t...
RC directory is /Users/brian/.scriptdist
cwd is /Users/brian/Dev/mastering_perl/trunk/Scripts/Modulinos
Checking for file [.cvsignore]... Adding file [.cvsignore]...
Checking for file [.releaserc]... Adding file [.releaserc]...
Checking for file [Changes]... Adding file [Changes]...
Checking for file [MANIFEST.SKIP]... Adding file [MANIFEST.SKIP]...
Checking for file [Makefile.PL]... Adding file [Makefile.PL]...
Checking for file [t/compile.t]... Adding file [t/compile.t]...
Checking for file [t/pod.t]... Adding file [t/pod.t]...
Checking for file [t/prereq.t]... Adding file [t/prereq.t]...
Checking for file [t/test_manifest]... Adding file [t/test_manifest]...
Adding [Japh.pm]...
Copying script...
Opening input [Japh.pm] for output [Japh.pm.d/Japh.pm]
Copied [Japh.pm] with 0 replacements
Creating MANIFEST...
------------------------------------------------------------------
Remember to commit this directory to your source control system.
In fact, why not do that right now?  Remember, `cvs import` works
from within a directory, not above it.
------------------------------------------------------------------

Inside the Makefile.PL I only have to make a few minor adjustments to the usual module setup so it handles things as a program. I put the name of the program in the anonymous array for EXE_FILES and ExtUtils::MakeMaker will do the rest. When I run make install, the program ends up in the right place (also based on the PREFIX setting). If I want to install a manpage, instead of using MAN3PODS, which is for programming support documentation, I use MAN1PODS, which is for application documentation:

WriteMakefile(
        'NAME'      => $script_name,
        'VERSION'   => '0.10',

        'EXE_FILES' =>  [ $script_name ],

        'PREREQ_PM' => {},

        'MAN1PODS'  => {
                        $script_name => "\$(INST_MAN1DIR)/$script_name.1",
                        },

        clean => { FILES => "*.bak $script_name-*" },
        );

An advantage of EXE_FILES is that ExtUtils::MakeMaker modifies the shebang line to point to the path of the perl binary that I used to run Makefile.PL. I don’t have to worry about the location of perl.

Once I have the basic distribution set up, I start off with some basic tests. I’ll spare you the details since you can look in scriptdist to see what it creates. The compile.t test simply ensures that everything at least compiles. If the program doesn’t compile, there’s no sense going on. The pod.t file checks the program documentation for Pod errors (see Chapter 15 for more details on Pod), and the prereq.t test ensures that I’ve declared all of my prerequisites in Makefile.PL. These are the tests that clear up my most common mistakes (or, at least, the most common ones before I started using these test files with all of my distributions).

Before I get started, I’ll check to ensure everything works correctly. Now that I’m treating my program as a module, I’ll test it every step of the way. The program won’t actually do anything until I run it as a program, though:

$ cd Japh.pm.d
$ perl Makefile.PL; make test
Checking if your kit is complete...
Looks good
Writing Makefile for Japh.pm
cp Japh.pm blib/lib/Japh.pm
cp Japh.pm blib/script/Japh.pm
/usr/local/bin/perl "-MExtUtils::MY" -e "MY->fixin(shift)" blib/script/Japh.pm
/usr/local/bin/perl "-MTest::Manifest" "-e" "run_t_manifest(0, 'blib/lib', 'blib/arch',  )"
Level is
Test::Manifest::test_harness found [t/compile.t t/pod.t t/prereq.t]
t/compile....ok
t/pod........ok
t/prereq.....ok
All tests successful.
Files=3, Tests=4,  6 wallclock secs ( 3.73 cusr +  0.48 csys =  4.21 CPU)

Adding to the Script

Now that I have all of the infrastructure in place, I want to further develop the program. Since I’m treating it as a module, I want to add additional subroutines that I can call when I want it to do the work. These subroutines should be small and easy to test. I might even be able to reuse these subroutines by simply including my modulino in another program. It’s just a module, after all, so why shouldn’t other programs use it?

First, I move away from a hardcoded message. I’ll do this in baby steps to illustrate the development of the modulino, and the first thing I’ll do is move the actual message to its own subroutine. That hides the message to print behind an interface, and later I’ll change how I get the message without having to change the run subroutine. I’ll also be able to test message separately. At the same time, I’ll put the entire program in its own package, which I’ll call Japh. That helps compartmentalize anything I do when I want to test the modulino or use it in another program:

#!/usr/bin/perl

package Japh;

run() unless caller();

sub run {
        print message(), "\n";
        }

sub message {
        'Just another Perl hacker, ';
        }

I can add another test file to the t/ directory now. My first test is simple. I check that I can use the modulino and that my new subroutine is there. I won’t get into testing the actual message yet since I’m about to change that:[61]

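# message.t
use Test::More tests => 2;

use_ok( 'Japh' );

# the modulino now lives in package Japh, so look for the subroutine there
ok( defined &Japh::message, 'message() is defined' );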

Now I want to be able to configure the message. At the moment it’s in English, but maybe I don’t always want that. How am I going to get the message in other languages? I could do all sorts of fancy internationalization things, but for simplicity I’ll create a file that contains the language, the template string for that language, and the locales for that language. Here’s a configuration file that maps the locales to a template string for that language:

en_US "Just another %s hacker, "
eu_ES "apenas otro hacker del %s, "
fr_FR "juste un autre hacker de %s, "
de_DE "gerade ein anderer %s Hacker, "
it_IT "appena un altro hacker del %s, "

I add some bits to read the language file. I need to add a subroutine, get_template, to read the file and return the template string for the right locale. Since I now start with a template string rather than the finished text, message has to fill it in with sprintf. I also add another subroutine, get_topic, to return the type of hacker I am. I won’t branch out into the various ways I can get the topic, although you can see how I’m moving the program away from doing (or saying) one thing to making it much more flexible:

sub run
        {
        my $template = get_template();

        print message( $template ), "\n";
        }

sub message
        {
        my $template = shift;

        return sprintf $template, get_topic();
        }

sub get_topic { 'Perl' }

sub get_template { ... } # body shown later
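
To make the flow from template to message concrete, here is a small sketch that runs by itself. It assumes an in-memory hash in place of the configuration file, so this get_template is only a hypothetical stand-in and not the file-reading version that comes later:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the configuration file: locale => template.
my %templates = (
        en_US => 'Just another %s hacker, ',
        fr_FR => 'juste un autre hacker de %s, ',
        );

sub get_topic { 'Perl' }

# Stand-in only: look the locale up in the hash and fall back to
# US English for anything I don't recognize.
sub get_template {
        my $locale = shift || 'en_US';

        return exists $templates{$locale}
                ? $templates{$locale}
                : $templates{en_US};
        }

sub message {
        my $template = shift;

        return sprintf $template, get_topic();
        }

print message( get_template( 'fr_FR' ) ), "\n";
print message( get_template( 'xx_XX' ) ), "\n";
```

Because message only ever sees a template string, I can swap the hash for a file, a database, or anything else without touching message or run.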

I can add some tests to ensure that my new subroutines still work and also check that the previous tests still work.

Being quite pleased with myself that my modulino now works in many languages and that the message is configurable, I’m disappointed to find out that I’ve just introduced a possible problem. Since the user can decide the format string, he can do anything that printf allows him to do,[62] and that’s quite a bit. I’m using user-defined data to run the program, so I should really turn on taint checking (see Chapter 3), but even better than that, I should get away from the problem rather than trying to put a bandage on it.
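
One mild illustration of the problem, as a sketch: the template below is a made-up example of what someone could put in the configuration file. Nothing here is an exploit. It only shows that the person who writes the format string also controls things like field widths:

```perl
use strict;
use warnings;

sub get_topic { 'Perl' }

# Hypothetical input: a template with a huge field width, the sort of
# thing a careless or hostile user could put in the configuration file.
my $hostile = 'Just another %1000000s hacker, ';

my $message = sprintf $hostile, get_topic();

# The width came from the user, so one short line of configuration
# turned into about a megabyte of output.
print "template length: ", length( $hostile ), "\n";
print "message length:  ", length( $message ), "\n";
```

The program did exactly what the format string told it to do, which is the whole problem: the format string is data I don't control.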

Instead of sprintf, I’ll use the Template module. My format strings will turn into templates:

en_US "Just another [% topic %] hacker, "
eu_ES "apenas otro hacker del [% topic %], "
fr_FR "juste un autre hacker de [% topic %], "
de_DE "gerade ein anderer [% topic %] Hacker, "
it_IT "Solo un altro hacker del [% topic %], "

Inside my modulino, I’ll include the Template module and configure the Template parser so it doesn’t evaluate Perl code. I only need to change message because nothing else needs to know how message does its work:

sub message {
        my $template = shift;

        require Template;

        my $tt = Template->new(
                 INCLUDE_PATH => '',
                 INTERPOLATE  => 0,
                 EVAL_PERL    => 0,
                );

        $tt->process( \$template, { topic => get_topic() }, \ my $cooked );

        return $cooked;
        }
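
As a quick check that the switch helps, here is a sketch that feeds a printf-style conversion through Template. Template is not a core module, so the hypothetical cook helper loads it at runtime and returns nothing if it isn't installed. The template text is made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: process a template string with Template Toolkit,
# or return nothing if the module isn't available.
sub cook {
        my( $template, $topic ) = @_;

        eval { require Template; 1 } or return;

        my $tt = Template->new(
                INTERPOLATE => 0,
                EVAL_PERL   => 0,
                ) or die Template->error;

        $tt->process( \$template, { topic => $topic }, \ my $cooked )
                or die $tt->error;

        return $cooked;
        }

# Outside of the [% ... %] tags, the printf-style conversion is just text.
my $cooked = cook( 'Just another [% topic %] %1000000s hacker, ', 'Perl' );

print defined $cooked ? "$cooked\n" : "Template is not installed\n";
```

With Template in charge, the only part of the string that does anything is the [% topic %] directive, and I decide which variables it can see.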

Now I have a bit of work to do on the distribution side. My modulino now depends on Template so I need to add that to the list of prerequisites. This way, CPAN (or CPANPLUS) will automatically detect the dependency and install it as it installs my modulino. That’s just another benefit of wrapping the program in a distribution:

WriteMakefile(
        ...

        'PREREQ_PM' => {
                Template => '0',
                },

        ...
        );

What happens if there is no configuration file, though? My message subroutine should still do something, so get_template gives it a default template, and it also issues a warning with carp when it can’t open the file:

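use Carp qw(carp);

sub get_template {
        my $default = "Just another [% topic %] hacker, ";

        # an explicit argument wins, then the environment, then US English
        my $locale = shift || $ENV{LANG} || 'en_US';

        my $file = "t/config.txt";

        # declare $fh outside the condition so it's still in scope for the loop
        my $fh;
        unless( open $fh, "<", $file ) {
                carp "Could not open '$file': $!";
                return $default;
                }

        while( <$fh> )
                {
                chomp;
                # skip any line that doesn't look like: locale "template"
                my( $this_locale, $template ) = m/(\S+)\s+"(.*?)"/
                        or next;

                return $template if $this_locale eq $locale;
                }

        return $default;
        }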

You know the drill by now: the new additions to the program require more tests. Again, I’ll leave that up to you.
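
As one possible starting point, here is a sketch of a test for the line-matching part alone. The parse_line helper is hypothetical: it wraps the same pattern that get_template uses so I can test it without a configuration file on disk:

```perl
use strict;
use warnings;

use Test::More tests => 3;

# Hypothetical helper: apply get_template's pattern to one line and
# return the captures, or an empty list if the line doesn't match.
sub parse_line {
        my $line = shift;

        return $line =~ m/(\S+)\s+"(.*?)"/;
        }

my( $locale, $template ) =
        parse_line( 'fr_FR "juste un autre hacker de [% topic %], "' );

is( $locale, 'fr_FR', 'locale is the first field' );
is( $template, 'juste un autre hacker de [% topic %], ',
        'template is the quoted part' );

my @fields = parse_line( '# not a configuration entry' );
is( scalar @fields, 0, 'unrecognized lines produce no fields' );
```

Pulling the pattern into its own small subroutine is a design choice, not a requirement, but it follows the same idea as the rest of the modulino: small pieces that I can test one at a time.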

Finally, I need to test the whole thing as a program. I’ve tested the bits and pieces individually, but do they all work together? To find out, I run the program as an external command with backticks and use the Test::Output module to capture what it prints. I’ll compare that with what I expect. How I do this for programs depends on what the particular program is supposed to actually do. To run my program inside the test file, I wrap it in a subroutine and use the value of $^X for the perl binary I should use. That will be the same perl binary that’s running the tests:

#!/usr/bin/perl

use File::Spec;

use Test::More 'no_plan';
use Test::Output;

my $script = File::Spec->catfile( qw(blib script Japh.pm ) );

sub run_program {
        print `$^X $script`;
        }

{ # test for US English
local $ENV{LANG} = 'en_US'; # change only LANG, not the whole environment

stdout_is( \&run_program, "Just another Perl hacker, \n" );
}

{ # test for Spanish
local $ENV{LANG} = 'eu_ES';

stdout_is( \&run_program, "apenas otro hacker del Perl, \n" );
}

{ # test with no LANG setting
local $ENV{LANG}; # restored when the block ends
delete $ENV{LANG};

stdout_is( \&run_program, "Just another Perl hacker, \n" );
}

{ # test with nonsense LANG setting
local $ENV{LANG} = 'blah blah';

stdout_is( \&run_program, "Just another Perl hacker, \n" );
}

Distributing the Programs

Once I create the program distribution, I can upload it to CPAN (or anywhere else that I like) so other people can download it. To create the archive, I do the same thing I do for modules. First, I run make disttest, which creates a distribution, unwraps it in a new directory, and runs the tests. That ensures that the archive I give out has the necessary files and everything runs properly (well, most of the time):

$ make disttest

After that, I create the archive in whichever format I like:

$ make tardist
==OR==
$ make zipdist

Finally, I upload it to PAUSE and announce it to the world. In real life, however, I use my release utility that comes with Module::Release and this (and much more) all happens in one step.

As a module living on CPAN, my modulino is a candidate for CPAN Testers, the loosely connected group of volunteers and automated computers that test just about every module. They don’t test programs, but my modulino doesn’t look like a program.

There is a little-known area of CPAN called “scripts” where people have uploaded standalone programs without the full distribution support.[63] Kurt Starsinic did some work on it to automatically index the programs by category, and his solution simply looks in the program’s Pod documentation for a section called “SCRIPT CATEGORIES.”[64] If I wanted, I could add my own categories to that section, and the programs archive should automatically index those on its next pass:

=pod SCRIPT CATEGORIES

CPAN/Administrative

=cut

Summary

I can create programs that look like modules. The entire program (outside of third-party modules) exists in a single file. Although it runs just like any other program, I can develop and test it just like a module. I get all the benefits of both forms, including testability, dependency handling, and installation. Since my program is a module, I can easily re-use parts of it in other programs, too.

Further Reading

“How a Script Becomes a Module” originally appeared on Perlmonks: http://www.perlmonks.org/index.pl?node_id=396759.

I also wrote about this idea for The Perl Journal in “Scripts as Modules.” Although it’s the same idea, I chose a completely different topic: turning the RSS feed from TPJ into HTML: http://www.ddj.com/dept/lightlang/184416165.

Denis Kosykh wrote “Test-Driven Development” for The Perl Review 1.0 (Summer 2004): http://www.theperlreview.com/Issues/subscribers.html.



[61] If you like Test-Driven Development, just switch the order of the tests and program changes in this chapter. Make the new tests before you change the program.

[62] The Sys::Syslog module once suffered from this problem, and its bug report explains the situation. See Dyad Security’s notice for details: http://dyadsecurity.com/webmin-0001.html.
