Perl Hacks: Tips & Tools for Programming, Debugging, and Surviving

By chromatic, Damian Conway, Curtis Poe
Table of Contents

Chapter 1: Productivity Hacks
Hacks 1-11
Everyone wants to be more productive. That's probably why you use Perl: to get more work done in less time with less work.
Productivity isn't all about saving time, though. Saving effort is even more important, whether you mean finding the information you want, automating away repeated tasks, or finding ways not to have to think about things that you do all the time. In some ways, this is the notion of relentless automation—finding every little niggling task that always interrupts your current project by being so annoying, difficult, cumbersome, or different and then hiding it behind an alias, a shell script, a process, or whatever.
Here are a few ideas for ways to make your programming life easier and more productive. Try them, enjoy your new sense of free time, and let yourself notice the new points of friction in your life. Then solve them, too!
Add CPAN Shortcuts to Firefox
Keep module documentation and distributions mere keystrokes away.
If Perl has only one advantage over other programming languages, it's the number of modules on the CPAN (http://www.cpan.org/) that solve so many problems effectively. That brings up a smaller problem, though—choosing an appropriate module for the job.
http://search.cpan.org/ helps, but if you visit the site many times a day, the steps to start a search through the web interface can become annoying. Fortunately, the Mozilla family of web browsers, including Mozilla Firefox, let you set up shortcuts that make browsing much easier. These shortcuts are just bookmarked URLs with substitutable sections and keywords, but they're very powerful and useful—almost command-line aliases ("Make the Most of Shell Aliases" [Hack #4]) for your browser.
Here are three of the most useful.
The first technique is to find the module you want. Normally, you could visit the CPAN search site, type the appropriate words in the box, submit the form, and browse through the results. That's too much work though!
Open the bookmark menu in your browser; this is Bookmarks→Manage Bookmarks in Mozilla Firefox. Create a new bookmark. For name, put Search CPAN and for Keyword enter cpan. In the Location box, type:
http://search.cpan.org/search?mode=module;query=%s
Figure 1-1 shows the completed dialog box. Press OK, then go back to the browser. Clear the location bar, then type cpan Acme and hit Enter. This will take you immediately to the first page of search results for modules with Acme in their names.
Figure 1-1: Creating a new keyword bookmark search
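As a rough illustration of what the browser does with such a bookmark, here is a minimal Perl sketch that fills the %s placeholder with the words typed after the keyword. The helper name expand_bookmark and the byte-wise percent-encoding are assumptions made for this example; a real browser may encode some characters differently.

```perl
use strict;
use warnings;

# Hypothetical helper: fill the %s placeholder of a keyword bookmark.
# The escaping is a simple byte-wise percent-encoding chosen for
# illustration only.
sub expand_bookmark
{
    my ($template, $terms) = @_;

    ( my $escaped = $terms )    =~ s/([^A-Za-z0-9_.~-])/sprintf( '%%%02X', ord $1 )/ge;
    ( my $url     = $template ) =~ s/%s/$escaped/;

    return $url;
}

print expand_bookmark(
    'http://search.cpan.org/search?mode=module;query=%s', 'Acme'
), "\n";
```

Typing cpan Acme in the location bar corresponds to calling the helper with 'Acme' as the second argument.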
If you know exactly the name of the module you want, it's more convenient to jump straight to information about that module. Create a new bookmark named
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Put Perldoc to Work
Do more than just read the documentation.
Perl has a huge amount of documentation available through the perldoc utility—and not just from the command line. These docs cover everything from the core language and tutorials through the standard library and any additional modules you install or even write. perldoc can do more, though.
Here are a few switches and options to increase your productivity.
The perlfunc document lists every built-in operator in the language in alphabetical order. If you need to know the order of arguments to substr( ), you could type perldoc perlfunc, and then search for the correct occurrence of substr.
In a decent pager, such as less on a Unix-like system, use the forward slash (/) to begin a search. Type the rest of the name and hit Enter to begin searching. Press n to find the next occurrence and N to find the previous one.
Why search yourself, though? perldoc's -f switch searches perlfunc for you, presenting only the documentation for the named operator. Type instead:
$ perldoc -f substr
The program will launch your favorite pager, showing only the documentation for substr. Handy.
The Perl FAQ is a very useful piece of the core documentation, with a table of contents in perlfaq and nine other documents (perlfaq1 through perlfaq9) full of frequently asked questions and their answers.
Searching every document for your question, however, is more tedious than searching perlfunc. (Do skim perlfaq once in a while to see what questions there are, though.) Fortunately, the -q switch allows you to specify a search pattern for FAQ keywords.
If you remember that somewhere the FAQ explains how to shuffle an array, but you can't remember where, try:
$ perldoc -q shuffle
As with the -f switch, this will launch your favorite pager to view every question with the term
Browse Perl Docs Online
Host your own HTML documentation.
perldoc is a fine way to view the documentation for Perl and all your installed modules and to output them in the file format of your choice ("Put Perldoc to Work" [Hack #2]). perldoc's little brother, podwebserver, is an even handier way to browse documentation—and bookmark it, and search it, and sometimes even hardcopy it, all through whatever web browser you're using this week.
podwebserver provides basically perldoc-as-HTML over HTTP. Sure, you could always just browse the documentation at http://search.cpan.org/—but using podwebserver means that you'll be seeing the documentation for exactly your system's Perl version and module versions.
podwebserver's HTML is compatible with fancy browsers as well as with more lightweight tools such as lynx, elinks, or even the w3m browser in Emacs. In fact, there have been persistent rumors of some users adventurously accessing podwebserver via cell phones, or even using something called "the Micro-Soft Internet Explorer." O'Reilly Media, Inc. can neither confirm nor deny these rumors.
If podwebserver isn't on your system, install the Pod::Webserver module from CPAN.
To run podwebserver, just start it from the command line. You don't need root access:
$ podwebserver
Then start a web browser and browse to http://localhost:8020/. You'll see the index of the installed documentation (Figure 1-2).
Figure 1-2: An index of your Perl documentation
If you don't want to bind the web server to localhost, or if you have something already running on port 8020, use the -H and -p arguments to change the host and port.
$ podwebserver -H windwheel -p 8080
Make the Most of Shell Aliases
Make programming easier by programming your shell.
Perl is a language for people who type. It grew up from the shell to write all kinds of programs, but it still rewards people who don't mind launching programs from the command line.
If you spend your time writing Perl from the command line (whether you write short scripts or full-blown programs), spending a few minutes automating common tasks can save you lots of development time—and even more trouble.
The single most useful shell trick is the realias command. Normally creating a persistent alias means adding something to your .bashrc (or equivalent) file, starting a new shell, testing it, and then repeating the process until you get it right. Wouldn't it be nice to be able to edit and test a new alias in a single process?
Edit your .bashrc file and add a single line:
source ~/.aliases
Then create the file ~/.aliases, containing:
alias realias='$EDITOR ~/.aliases; source ~/.aliases'
If you prefer tcsh, edit your .cshrc file. Then replace the = sign with a single space in all of the alias declarations.
Launch a new shell. Type the command realias and your favorite editor (assuming you have the EDITOR environment variable set, and if you don't, something is weird) will open with your ~/.aliases file. Add a line and save and quit:
alias reperl='perl -de0'
Now type reperl at the command line:
$ reperl

Loading DB routines from perl5db.pl version 1.28
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(-e:1):   0
  DB<1> q
Within a single shell session you've identified a useful command that may be difficult to remember, automated it, and have started to use it productively. Nifty.
What makes a good shell alias for Perl programming? Obviously a command that's difficult to remember, such as the one to put the Perl debugger into pseudo-interactive mode. Another good approach is to alias commands that are lengthy or otherwise difficult to type. One final category is a series of chained commands you find yourself typing often.
Autocomplete Perl Identifiers in Vim
Why type a full identifier if your editor can do it for you?
Good variable and function names are a great boon to productivity and maintainability, but brevity and clarity are often at odds. Instead of wearing out your keys, fingertips, and memory, consider making your text editor do the typing for you.
If you use Vim, you have access to a handy autocompletion mechanism. In insert mode, type one or more letters of an identifier, then hit CTRL-N. The editor will complete your identifier using the first identifier in the same file that starts with the same letter(s). Hitting CTRL-N again gives you the second matching identifier, and so on.
This can be a real timesaver if you use long variable or subroutine names. As long as you've already typed an identifier once in a file, you can autocomplete it ever after, just by typing the first few letters and then CTRL-Ning to the right name:
sub find_the_factorial_of
{
    my ($the_number_whose_factorial_I_want) = @_;

    return 1 if $the_n<CTRL-N> <= 1;

    return $the_n<CTRL-N> * find<CTRL-N>($the_n<CTRL-N> - 1);
}
Unfortunately, Vim's idea of an identifier (in Vim-speak, a "keyword") isn't as broad as Perl's. Specifically, the editor doesn't recognize the colon character as a valid part of an identifier, which is annoying if you happen to like multipart class names, or qualified package variables.
However, it's easy to teach Vim that those intervening double-colons are valid parts of the identifiers. Add them to the editor's list of keyword characters by adding this line to your .vimrc file:
set iskeyword+=:
Then the following works too:
use Sub::Normal;

my $sub = Sub<CTRL-N>->new( );  # Expands to: Sub::Normal->new( )
Of course, you still have to type the full name of Sub::Normal once, as part of the initial use statement. That really isn't as Lazy as it could be. It would be much better if Vim just magically knew about all the Perl modules you have installed and could cleverly autocomplete their names from the very first time you used them.
Use the Best Emacs Mode for Perl
Configure Emacs for easy Perl coding.
While perl-mode is the classic Perl-editing mode that Emacs uses for Perl files by default, most Perl programmers prefer the newer cperl-mode. (The "c" in the name is because its early versions borrowed code from c-mode. It's not actually written in C, nor meant for C.) Enabling it is easy.
cperl-mode is probably already included in your version of Emacs, but you can get an up-to-date version from http://math.berkeley.edu/~ilya/software/emacs/. Save it to an Emacs library directory. Then enable it for .pl and .pm files by adding nine lines to your ~/.emacs file:
(load-library "cperl-mode")
  (add-to-list 'auto-mode-alist '("\\.[Pp][LlMm][Cc]?$" . cperl-mode))
  (while (let ((orig (rassoc 'perl-mode auto-mode-alist)))
              (if orig (setcdr orig 'cperl-mode))))
  (while (let ((orig (rassoc 'perl-mode interpreter-mode-alist)))
           (if orig (setcdr orig 'cperl-mode))))
  (dolist (interpreter '("perl" "perl5" "miniperl" "pugs"))
    (unless (assoc interpreter interpreter-mode-alist)
      (add-to-list 'interpreter-mode-alist (cons interpreter 'cperl-mode))))
What can you do with it?

Put Perldoc at your fingertips

cperl-mode provides a handy function for calling perldoc, but does not associate it with any key by default. To put it at your fingertips, add one line to your .emacs file:
(global-set-key "\M-p" 'cperl-perldoc) ; alt-p
If you want to use Pod::Webserver [Hack #3], use one of the various in-Emacs web browsers:
(global-set-key "\M-p" '(lambda ( ) (interactive)
  (require 'w3m)
  (w3m-goto-url "http://localhost:8020/")
))
If you prefer your normal web browser, just set some particular key to start it up on the Pod::Webserver page:
(global-set-key "\M-p"
  '(lambda ( ) (interactive) (start-process "" nil
  "firefox" "http://localhost:8020/"
  ; Or however you launch your favorite browser, like:
  ;   "gnome-terminal" "-e" "lynx http://localhost:8020/"
  ;   "xterm" "-e" "elinks http://localhost:8020/"
)))
Enforce Local Style
Keep your code clean without editing it by hand.
One of the first barriers to understanding code written by others is that their formatting style may not match yours. This is especially true if you find yourself maintaining code that, at best, has grown with little direction over the years. Whether you work with other developers and want to maintain a consistent set of coding guidelines, or you want to find some structure in a big ball of mud, perltidy can help untangle and bring consistency to even the scariest code.
Install the CPAN module Perl::Tidy. This will also install the perltidy utility. Now you can use it!

From the command line

Run perltidy on a Perl program or module and it will write out a tidied version of that file with a .tdy suffix. For example, given poorly_written_script.pl, perltidy will, if possible, reformat the code and write the new version to poorly_written_script.pl.tdy. You can then run tests against the new code to verify that it performs just as did the previous version (even if it is much easier to read).
Running perltidy on a file such as some_ugly_code.pl produces a version that's no longer, well, ugly. How effective is it? The Perltidy docs offer an example. Before:
$_= <<'EOL';
   $url = URI::URL->new( "http://www/" );   die if $url eq "xXx";
EOL
LOOP:{print(" digits"),redo LOOP if/\G\d+\b[,.;]?\s*/gc;print(" lowercase"),
redo LOOP if/\G[a-z]+\b[,.;]?\s*/gc;print(" UPPERCASE"),redo LOOP
if/\G[A-Z]+\b[,.;]?\s*/gc;print(" Capitalized"),
redo LOOP if/\G[A-Z][a-z]+\b[,.;]?\s*/gc;
print(" MiXeD"),redo LOOP if/\G[A-Za-z]+\b[,.;]?\s*/gc;print(
" alphanumeric"),redo LOOP if/\G[A-Za-z0-9]+\b[,.;]?\s*/gc;print(" line-noise"
),redo LOOP if/\G[^A-Za-z0-9]+/gc;print". That's all!\n";}
After:
$_ = <<'EOL';
   $url = URI::URL->new( "http://www/" );   die if $url eq "xXx";
EOL
LOOP: {
    print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
    print(" lowercase"),    redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc;
    print(" UPPERCASE"),    redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc;
    print(" Capitalized"),  redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc;
    print(" MiXeD"),        redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc;
    print(" alphanumeric"), redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc;
    print(" line-noise"),   redo LOOP if /\G[^A-Za-z0-9]+/gc;
    print ". That's all!\n";
}
Don't Save Bad Perl
Don't even write out your file if the Perl isn't valid!
Perl tests tend to start by checking that your code compiles. Even if the tests don't check, you'll know it pretty quickly as all your code collapses in a string of compiler errors. Then you have to fire up your editor again and track down the problem. It's simple, though, to tell Vim that if your Perl code won't compile, it shouldn't even write it to disk.
Even better, you can load Perl's error messages back into Vim to jump right to the problem spots.
Vim supports filetype plug-ins that alter its behavior based on the type of file being edited. Enable these by adding a line to your .vimrc:
filetype plugin on
Now you can put files in ~/.vim/ftplugin (My Documents\_vimfiles\ftplugin on Windows) and Vim will load them when it needs them. Perl plug-ins start with perl_, so save the following file as perl_synwrite.vim:
" perl_synwrite.vim: check syntax of Perl before writing
" latest version at: http://www.vim.org/scripts/script.php?script_id=896

"" abort if b:did_perl_synwrite is true: already loaded or user pref
if exists("b:did_perl_synwrite")
  finish
endif
let b:did_perl_synwrite = 1

"" set buffer :au pref: if defined globally, inherit; otherwise, false
if (exists("perl_synwrite_au") && !exists("b:perl_synwrite_au"))
  let b:perl_synwrite_au = perl_synwrite_au
elseif !exists("b:perl_synwrite_au")
  let b:perl_synwrite_au = 0
endif

"" set buffer quickfix pref: if defined globally, inherit; otherwise, false
if (exists("perl_synwrite_qf") && !exists("b:perl_synwrite_qf"))
  let b:perl_synwrite_qf = perl_synwrite_qf
elseif !exists("b:perl_synwrite_qf")
  let b:perl_synwrite_qf = 0
endif

"" execute the given do_command if the buffer is syntactically correct perl
"" -- or if do_anyway is true
function! s:PerlSynDo(do_anyway,do_command)
  let command = "!perl -c"

  if (b:perl_synwrite_qf)
    " this env var tells Vi::QuickFix to replace "-" with actual filename
    let $VI_QUICKFIX_SOURCEFILE=expand("%")
    let command = command . " -MVi::QuickFix"
  endif

  " respect taint checking
  if (match(getline(1), "^#!.\\+perl.\\+-T") == 0)
    let command = command . " -T"
  endif

  exec "write" command

  silent! cgetfile " try to read the error file
  if !v:shell_error || a:do_anyway
    exec a:do_command
    set nomod
  endif
endfunction

"" set up the autocommand, if b:perl_synwrite_au is true
if (b:perl_synwrite_au > 0)
  let b:undo_ftplugin = "au! perl_synwrite * " . expand("%")

  augroup perl_synwrite
    exec "au BufWriteCmd,FileWriteCmd " . expand("%") .
      \ " call s:PerlSynDo(0,\"write <afile>\")"
  augroup END
endif

"" the :Write command
command -buffer -nargs=* -complete=file -range=% -bang Write call
    \ s:PerlSynDo("<bang>"=="!","<line1>,<line2>write<bang> <args>")
Automate Checkin Code Reviews
Let Perl::Tidy be your first code review—on every Subversion checkin!
In a multideveloper project, relying on developers to follow the coding standards without fail and to run perltidy against all of their code ("Enforce Local Style" [Hack #7]) before every checkin is unrealistic, especially because this is tedious work. Fortunately, this is an automatable process. If you use Subversion (or Svk), it's easy to write a hook that checks code for tidiness, however you define it.
For various reasons, it's not possible to manipulate the committed files with a pre-commit hook in Subversion. That's why this is a hack.
Within your Subversion repository, copy the hooks/post-commit.tmpl file to hooks/post-commit—unless you already have the file. Remove all code that runs other commands (again, unless you're already using it). Add a single line:
perl /usr/local/bin/check_tidy_file.pl "$REPOS" "$REV"
Adjust the file path appropriately. Make the hooks/post-commit file executable with chmod +x on Unix.
Finally, save the check_tidy_file.pl program to the path you used in the file. The program is:
#!/usr/bin/perl

use strict;
use warnings;

use Perl::Tidy;
               
use File::Temp;
use File::Spec::Functions;

my $svnlook      = '/usr/bin/svnlook';
my $diff         = '/usr/bin/diff -u';

# eat the arguments so as not to confuse Perl::Tidy
my ($repo, $rev) = @ARGV;
@ARGV            = ( );

my @diffs;

for my $changed_file (get_changed_perl_files( $repo, $rev ))
{
    my $source = get_revision( $repo, $rev, $changed_file );
    Perl::Tidy::perltidy( source => \$source, destination => \(my $dest) );
    push @diffs, get_diff( $changed_file, $source, $dest );
}

report_diffs( @diffs );

sub get_changed_perl_files
{
    my ($repo, $rev) = @_;

    my @files;

    for my $change (`$svnlook changed $repo -r $rev`)
    {
        my ($status, $file) =  split( /\s+/, $change );
        next unless $file   =~ /\.p[lm]\z/;
        push @files, $file;
    }

    return @files;
}

sub get_revision
{
    my ($repo, $rev, $file) = @_;
    return scalar `$svnlook cat $repo -r $rev $file`;
}

sub get_diff
{
    my $filename        = shift;
    return if $_[0] eq $_[1];

    my $dir   = File::Temp::tempdir( );
    my @files = map { catdir( $dir, $filename . $_ ) } qw( .orig .tidy );

    for my $file (@files)
    {
        open( my $out, '>', $file ) or die "Couldn't write $file: $!\n";
        print $out shift;
        close $out;
    }

    return scalar `$diff @files`;
}

sub report_diffs
{
    for my $diff (@_)
    {
        warn "Error:\n$diff\n";
    }
}
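To see the filtering step of get_changed_perl_files in isolation, here is a small sketch that works on sample lines instead of calling svnlook. The helper name perl_files_from_changes and the sample paths are hypothetical, and skipping deleted files is an assumption added for illustration (a deleted file has no new revision to tidy); the program above does not do that.

```perl
use strict;
use warnings;

# Hypothetical filter over lines in the style printed by "svnlook changed",
# such as "U   trunk/lib/Foo.pm".  Skipping deletions is an assumption
# added for illustration.
sub perl_files_from_changes
{
    my @files;

    for my $change (@_)
    {
        chomp( my $line = $change );

        # limit the split to two fields so paths containing spaces survive
        my ($status, $file) = split /\s+/, $line, 2;

        next unless defined $file;
        next if     $status =~ /\AD/;        # deleted: nothing to tidy
        next unless $file   =~ /\.p[lm]\z/;

        push @files, $file;
    }

    return @files;
}

# made-up sample input
my @sample = (
    "U   trunk/lib/Foo.pm\n",
    "A   trunk/bin/run.pl\n",
    "D   trunk/lib/Old.pm\n",
    "U   trunk/README\n",
);

print "$_\n" for perl_files_from_changes( @sample );
```

Keeping the parsing separate from the backtick call makes it possible to test the filter without a Subversion repository.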
Run Tests from Within Vim
Run your tests from your editor.
One of the nice things about Perl is the "tweak, run, tweak, run" development cycle. There's no separate compile phase to slow you down. However, you likely find yourself frequently writing tests and madly switching back and forth between the tests and the code. When you run the tests, you may exit the editor or type something like !perl -Ilib/ t/test_program.t in vi's command mode. This breaks the "tweak, run" rhythm.
Perl programmers don't like to slow things down. Instead, consider binding keys in your editor to the chicken-bone voodoo you use to run your test suite.

Binding keys

By running the tests from within the editor, you no longer have to remember how to execute the tests or exit the editor. Just tweak and run. Add the following line to your .vimrc file to run the currently edited test file by typing ,t (comma, tee):
map ,t  <Esc>:!prove -vl %<CR>
This technique uses the prove program to run your tests. prove is a handy little program distributed with Test::Harness and designed to run your tests through it. The switches are v (vee), which tells prove to run in "verbose" mode and show all of the test output, and l (ell), which tells prove to add lib/ to @INC.
If lib/ is not where you typically do your development, use the I switch to add a different path to @INC.
map ,t  <Esc>:!prove -Iwork/ -v %<CR>

Seeing failures

If it's a long test and you get a few failures, it can be difficult to see where the failures were. If that's the case, use ,T (comma capital tee) to pipe the results through your favorite pager.
map ,T  <Esc>:!prove -lv % \| less<CR>
Of course, make sure your editor does not have those keys already mapped to something else. This hack does not recommend breaking existing mappings in your editor.
Run Perl from Emacs
Make Perl and Elisp play nicely together.
Emacs's long and varied history happens to embody much of Perl's "There's More Than One Way To Do It" approach to things. This is especially evident when you run a small bit of Perl code from within Emacs. Here's how to do just that.
Suppose you really need to know the higher bits of the current value of time( ). In Perl, that's print time( ) >> 8;. You could use the shell-command command (normally on Alt-!), and enter:
perl -e 'print time( ) >> 8;'
Emacs will dutifully run that command line and then show the output. Note though that you have to remember to quote and/or backslash-escape the Perl expression according to the rules of your default shell. This quickly becomes maddening if the expression itself contains quotes and/or backslashes or even is several lines long.
An alternative is to start an "Emacs shell" in an Emacs subwindow, then start the Perl debugger in that shell. That is, type alt-x "shell" Enter, and then perl -de1 Enter, and then enter the expression just as if you were running the debugger in a normal terminal window:
% perl -de1

Loading DB routines from perl5db.pl version 1.27
Editor support available.

Enter h or `h h' for help, or `man perldebug' for more help.

main::(-e:1):    1
  DB<1> p time( ) >> 8
4448317
  DB<2>
This means you don't have to escape the Perl expression as you would if you were sending it through a command line, but it does require you to know at least a bit about the Perl debugger and the Emacs shell. It also becomes troublesome in its own way when your expression is several lines long.
A simpler alternative is to save your snippet to a file named delme123.pl and to run that via a command line, but this is a very effective way to fill every directory in reach with files named with the same variant of delme.
I prefer defining a new function just for running Perl code in the Region (what you have selected in Emacs, between the Point and the Mark):
Chapter 2: User Interaction
Hacks 12-18
Without users, there'd be few reasons to write programs. Without users—and this includes you—there'd be few bugs reported for weird error messages, strange behaviors, and classic "What were you thinking and why did it do that?" moments.
Your programs don't have to be that way. You can make your users happy, make your code work where it has to work, and even make pretty graphics with Perl, all by mastering a few tricks and tips. When your program has to interact with a real person somewhere, do it with style. People may not notice when your code just stays out of their way, but you'll know by their happy glows of productivity.
Use $EDITOR As Your UI
Nothing beats your favorite editor for editing text.
If you live on the command line and have a reputation for turning your favorite beverage into code, you're likely pretty handy on the keyboard. If you're a relentless automator, you probably have dozens of little programs and aliases to make your life easier.
Sometimes they need arguments. Yet beyond a certain point, prompting for arguments every time or inventing more and more command-line options just doesn't work anymore. Before you resign yourself to the fate of writing a little GUI or a web frontend, consider using a more comfortable user interface instead—your preferred text editor.
Suppose you have a series of little programs for updating your web site. Your workflow is to create a small YAML file with a new posting, then run that data through a template, update the index, and copy those pages to your server. Instead of copying a blank YAML file (or trying to recreate the necessary fields and formatting by hand), just launch an editor.
For example, a simple news site might have entries that need only a title, the date of posting, and a multiline block of text to run through some formatter. Easy:
use YAML 'DumpFile';
use POSIX 'strftime';

local $YAML::UseBlock = 1;

exit 1 unless -d 'posts';

my @posts = <posts/*.yaml>;
my $file  = 'posts/' . ( @posts + 1 ) . '.yaml';

my $fields =
{
    title => '',
    date  => strftime( '%d %B %Y', localtime( ) ),
    text  => "\n\n",
};

DumpFile( $file, $fields );

system( $ENV{EDITOR}, $file ) == 0
    or die "Error launching $ENV{EDITOR}: $!\n";
Assuming you have the EDITOR environment variable set to your preferred editor, this program creates a new blank post in the posts/ subdirectory with the appropriate id (monotonically increasing, of course), then drops you in your editor to edit the YAML file. It has already populated the date field with the current date in the proper format. Additionally, setting $YAML::UseBlock
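The program relies on EDITOR being set. One possible way to soften that assumption is to pick the editor through a small helper. This is only a sketch: the name choose_editor, the VISUAL-before-EDITOR order, and the vi fallback are all choices made for this example, not something the program above does.

```perl
use strict;
use warnings;

# Hypothetical helper: pick an editor command from an environment hash.
# The VISUAL-then-EDITOR order and the 'vi' fallback are assumptions.
sub choose_editor
{
    my ($env) = @_;

    for my $name (qw( VISUAL EDITOR ))
    {
        my $editor = $env->{$name};
        return $editor if defined $editor && length $editor;
    }

    return 'vi';
}

my $editor = choose_editor( \%ENV );
print "Would launch: $editor\n";
```

Passing the environment as a hash reference rather than reading %ENV inside the helper keeps it easy to test with made-up values.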
Interact Correctly on the Command Line
Be kind to other programs.
Command-line programs that expect input from the keyboard are easy, right? Certainly they're easier than writing good GUI applications, right? Not necessarily. The Unix command line is flexible and powerful, but that flexibility can break naively written programs.
Prompting for interactive input in Perl typically looks like:
print "> ";
while (my $next_cmd = <>)
{
    chomp $next_cmd;
    process($next_cmd);
    print "> ";
}
If your program needs to handle noninteractive situations as well, things get a whole lot more complicated. The usual solution is something like:
print "> " if -t *ARGV && -t select;
while (my $next_cmd = <>)
{
    chomp $next_cmd;
    process($next_cmd);
    print "> " if -t *ARGV && -t select;
}
The -t test checks whether its filehandle argument is connected to a terminal. To handle interactive cases correctly, you need to check both that you're reading from a terminal (-t *ARGV) and that you're writing to one (-t select). It's a common mistake to mess those tests up, and write instead:
print "> " if -t *STDIN && -t *STDOUT;
The problem is that the <> operator doesn't read from STDIN; it reads from ARGV. If there are filenames specified on the command line, those two filehandles aren't the same. Likewise, although print usually writes to STDOUT, it won't if you've explicitly select-ed some other destination. You need to call select with no arguments to get the filehandle which each print will currently target.
Worse still, even the correct version:
print "> " if -t *ARGV && -t select;
doesn't always work correctly. That's because the ARGV filehandle is magically self-opening, but only magically self-opens during the first read operation on it. If you haven't already done at least one <> before you start prompting for input, then the ARGV handle won't be open yet, so the first -t *ARGV test (the one before the while loop) won't be true, and the first prompt won't print.
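One way around the unopened-ARGV problem is to test ARGV only when it is already open, and otherwise to predict what the first <> will read. The following is a rough sketch of that idea using only core Perl; the name looks_interactive is hypothetical, and the logic is a simplification rather than a definitive implementation:

```perl
use strict;
use warnings;

# Return 1 if the program appears to be talking to a person, 0 otherwise.
# Takes an optional output handle; defaults to the currently selected one.
sub looks_interactive
{
    my ($out) = @_;
    $out = select( ) unless defined $out;

    # Not interactive if prompts would not reach a terminal.
    return 0 unless -t $out;

    # If ARGV is already open, test it directly.
    return -t *ARGV ? 1 : 0 if defined fileno( ARGV );

    # ARGV has not opened itself yet. If filenames are waiting on the
    # command line, the first <> will read from a file (unless the name
    # is the magic '-', which means standard input).
    return 0 if @ARGV && $ARGV[0] ne '-';

    # Otherwise the first <> will read from STDIN.
    return -t *STDIN ? 1 : 0;
}

# Usage: prompt only when a person is really at the keyboard.
print "> " if looks_interactive( );
```

Because the check no longer depends on ARGV being open, the prompt before the loop behaves the same way as the prompts inside it.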
Additional content appearing in this section has been removed.
Simplify Your Terminal Interactions
Read data from users correctly, effectively, and without thinking about it.
Even when you know the right way to handle interactive I/O [Hack #13], the resulting code can still be frustratingly messy:
my $offset;
print "Enter an offset: " if is_interactive;
GET_OFFSET:
while (<>)
{
    chomp;
    if (m/\A [+-] \d+ \z/x)
    {
        $offset = $_;
        last GET_OFFSET;
    }
    print "Enter an offset (please enter an integer): "
        if is_interactive;
}
You can achieve exactly the same effect (and much more) with the prompt( ) subroutine provided by the IO::Prompt CPAN module. Instead of all the above infrastructure code, just write:
use IO::Prompt;

my $offset = prompt( "Enter an offset: ", -integer );
prompt( ) prints the string you give it, reads a line from standard input, chomps it, and then tests the input value against any constraint you specify (for example, -integer). If the constraint is not satisfied, the prompt repeats, along with a clarification of what was wrong. When the user finally enters an acceptable value, prompt( ) returns it.
Most importantly, prompt( ) is smart enough not to bother writing out any prompts if the application isn't running interactively, so you don't have to code explicitly for that case.
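To make that behavior concrete, here is a minimal core-Perl sketch of the same print, read, chomp, validate, and repeat cycle. It is not how IO::Prompt is implemented; the subroutine ask_until_valid and its named arguments are invented for illustration, and the in-memory filehandles merely stand in for a user at a terminal:

```perl
use strict;
use warnings;

# Ask a question until the answer matches a pattern.
# Prompts are written only when the caller says the session is interactive.
sub ask_until_valid
{
    my (%args) = @_;
    my ($in, $out, $interactive) = @args{qw( in out interactive )};

    print {$out} $args{prompt} if $interactive;

    while ( defined( my $answer = <$in> ) )
    {
        chomp $answer;
        return $answer if $answer =~ $args{valid};
        print {$out} $args{retry} if $interactive;
    }

    return;    # input ran out before a valid answer arrived
}

# Simulate a user who first types 'abc' and then '-12'.
my $typed = "abc\n-12\n";
my $shown = '';

open my $in,  '<', \$typed or die "Cannot read from scalar: $!\n";
open my $out, '>', \$shown or die "Cannot write to scalar: $!\n";

my $offset = ask_until_valid(
    in          => $in,
    out         => $out,
    interactive => 1,
    prompt      => 'Enter an offset: ',
    retry       => 'Enter an offset (please enter an integer): ',
    valid       => qr/\A [+-]? \d+ \z/x,
);

close $out;

print "offset = $offset\n";
```

Even this stripped-down version needs a fair amount of plumbing, which is the argument for letting a module handle it.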
Infrastructure code is code that doesn't actually contribute to solving your problem, but merely exists to hold your program together. Typically this kind of code implements standard low-level tasks that probably ought to have built-ins dedicated to them. Many modules in the standard library and on CPAN exist solely to provide cleaner alternatives to continually rewriting your own infrastructure code. Discovering and using them can significantly decrease both the size and cruftiness of your code.
prompt( ) has a general mechanism for telling it what kind of input you need and how to ask for that input. For example:
my $hex_num = prompt( "Enter a hex number> ",
          -req => { "A *hex* number please!> " => qr/^[0-9A-F]+$/i }
          );

print "That's ", hex($hex_num), " in base 10\n";
Additional content appearing in this section has been removed.
Alert Your Mac
Schedule GUI alerts from the command line.
Growl (http://www.growl.info/) is a small utility for Mac OS X that allows any application to send notifications to the user. The notifications pop up as a small box in a corner of the screen, overlaid on the current active window (as shown in Figure 2-1).
Figure 2-1: A simple Growl notification
You can send Growl notifications from Perl, thanks to Chris Nandor's Mac::Growl. The first thing you have to do is tell Growl that your script wants to send notifications. The following code registers a script named growlalert and tells Growl that it sends alert notifications:
use Mac::Growl;

Mac::Growl::RegisterNotifications(
    'growlalert', # application name
    [ 'alert' ],  # notifications this app sends
    [ 'alert' ],  # enable these notifications
);
Growl displays a notification to let you know the script has registered successfully (Figure 2-2). You need only register an application once on each machine that uses it.
Figure 2-2: A newly registered application
When you want to send a notification, call PostNotification( ), passing the name of the script, the kind of notification to send, a title, and a description:
Mac::Growl::PostNotification(
    'growlalert', # application name
    'alert',      # type of notification
    "This is a title",
    "This is a description.",
);
This will pop up a notification window (Figure 2-3) and fade it out again after a few seconds.
Figure 2-3: Notification with title and description
You might want a small script that sends you an alert after a time delay. The following command-line utility takes a time period to delay and a message to display in the alert. It calculates the time when the alert should appear, then forks to return control of the terminal window to the user. The forked child sleeps the requested amount of time, and then posts a Growl notification with the message as the title and no description.
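The fork-and-sleep flow described above can be sketched as follows. This is an illustration rather than the book's listing: the subroutine alert_after is a hypothetical name, the delay is assumed to be a plain number of seconds, and the notification is passed in as a callback so that the sketch runs even where Mac::Growl is not installed.

```perl
use strict;
use warnings;

use POSIX '_exit';

# Fork a child that waits $delay seconds and then calls $notify->($message).
# The parent returns the child's process id immediately, which is what
# hands the terminal back to the user.
sub alert_after
{
    my ($delay, $message, $notify) = @_;

    my $pid = fork;
    die "Cannot fork: $!\n" unless defined $pid;

    return $pid if $pid;    # parent: carry on at once

    # child: wait, notify, and leave without running the parent's cleanup
    sleep $delay;
    $notify->( $message );
    _exit( 0 );
}

# Usage with Growl (commented out; requires Mac::Growl and a prior
# registration of 'growlalert' as shown earlier):
#
# alert_after( 300, 'Tea is ready', sub {
#     Mac::Growl::PostNotification( 'growlalert', 'alert', $_[0], '' );
# } );
```

Taking the notifier as a callback also makes the timing logic easy to exercise with an ordinary print or a file write.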
Additional content appearing in this section has been removed.
Interactive Graphical Apps
Paint pretty pictures with Perl.
People often see Perl as a general-purpose language: you start by using it to write short scripts, do administrative tasks, or text processing. If you happen to appreciate it, you end up enjoying its flexibility and power to perform almost anything that doesn't require the speed of compiled binaries.
Consider instead the number one requirement of games. Unless you're exclusively a fan of card games, you'd say "CPU power." Fortunately a crazy guy named David J. Goehrig had the mysterious idea to bind the functions of the C low-level graphical library SDL for the Perl language. The result is an object-oriented approach to SDL called sdlperl.
With SDL you will manipulate surfaces. These are rectangular images, and the most common operation is to copy one onto another; this is blitting. To implement a basic image loader with SDL Perl in just four non-comment lines of code, write:
use SDL::App;

# open a 640x480 window for your application
our $app = SDL::App->new(-width => 640, -height => 480);

# create a surface out of an image file specified on the command-line
our $img = SDL::Surface->new( -name => $ARGV[0] );

# blit the surface onto the window of your application
$img->blit( undef, $app, undef );

# flush all pending screen updates
$app->flip( );

# sleep for 3 seconds to let the user view the image
sleep 3;
You might wonder how to position and crop during a blit. In the previous code, replace the two undef parameter values with instances of SDL::Rect: the first specifies the rectangle to copy from the source surface, and the second specifies the destination rectangle on the target surface. When you pass undef instead, SDL uses top-left positioning and full sizing. Here's a blit replacement that copies a 100x100 area of the source surface, starting at a horizontal offset of 200 pixels:
$img->blit( SDL::Rect->new(
    -width => 100, -height => 100, -x => 200, -y => 0
), $app, undef);
Additional content appearing in this section has been removed.
Collect Configuration Information
Save and re-use configuration information.
Some code you write needs configuration information when you build and install it. For example, consider a program that can use any of several optional and conflicting plug-ins. The user must decide what to use when she builds the module, especially if some of the dependencies themselves have dependencies.
When you run your tests and the code in general, having this information available in one spot is very valuable—you can avoid expensive and tricky checks if you hide everything behind a single, consistent interface.
How do you collect and store this information? Ask the user, and then write it into a simple configuration module!
Both Module::Build and ExtUtils::MakeMaker provide user prompting features to ask questions and get answers. The benefit of this is that they silently accept the defaults during automated installations. Users at the keyboard can still answer a prompt, while users who just want the software to install won't launch the installer, turn away, and return an hour later to find that another prompt has halted the process in the meantime.
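As a small illustration of that default-accepting behavior, the following sketch uses the prompt( ) function that ExtUtils::MakeMaker exports. The question and the default value are made up for the example; setting PERL_MM_USE_DEFAULT simulates an unattended installation:

```perl
use strict;
use warnings;

use ExtUtils::MakeMaker 'prompt';

# Simulate an automated install. MakeMaker's prompt( ) returns the default
# without waiting for input when this variable is true.
local $ENV{PERL_MM_USE_DEFAULT} = 1;

# The question and default here are hypothetical.
my $plugin = prompt( 'Which storage plug-in should I use?', 'sqlite' );

print "Using plug-in: $plugin\n";
```

Run at a terminal without the environment variable, the same call waits for an answer and falls back to the default only if the user presses Enter.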
Module::Build is easier to extend, so here's a simple subclass that allows you to specify questions, default values, and configuration keys before writing out a standard module containing this information:
package Module::Build::Configurator;

use strict;
use warnings;

use base 'Module::Build';

use SUPER;
use File::Path;
use Data::Dumper;
use File::Spec::Functions qw( catfile splitpath );

sub new
{
    my ($class, %args) = @_;
    my $self           = super( );
    my $config         = $self->notes( 'config_data' ) || { };

    for my $question ( @{ $args{config_questions} } )
    {
        my ($q, $name, $default) = map { defined $_ ? $_ : '' } @$question;
        $config->{$name}         = $self->prompt( $q, $default );
    }

    $self->notes( 'config_module', $args{config_module} );
    $self->notes( 'config_data',   $config );
    return $self;
}

sub ACTION_build
{
    $_[0]->write_config( );
    super( );
}

sub write_config
{
    my $self      = shift;
    my $file      = $self->notes( 'config_module' );
    my $data      = $self->notes( 'config_data' );
    my $dump      = Data::Dumper->new( [ $data ], [ 'config_data' ] )->Dump;
    my $file_path = catfile( 'lib', split( /::/, $file . '.pm' ) );

    my $path      = ( splitpath( $file_path ) )[1];
    mkpath( $path ) unless -d $path;

    my $package   = <<END_MODULE;
    package $file;

    my $dump

    sub get_value
    {
        my (\$class, \$key) = \@_;

        return unless exists \$config_data->{ \$key };
        return               \$config_data->{ \$key };
    }

    1;
END_MODULE

    $package =~ s/^\t//gm;

    open( my $fh, '>', $file_path )
        or die "Cannot write config file '$file_path': $!\n";
    print $fh $package;
    close $fh;
}

1;
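To see what write_config( ) produces without running a full build, here is a self-contained sketch that generates the same kind of module text from a hash and compiles it in memory. The module name My::App::Config and the answers are hypothetical stand-ins for real prompt results:

```perl
use strict;
use warnings;

use Data::Dumper;

# Hypothetical answers, standing in for what the prompts would collect.
my $data = { editor => 'vim', use_color => 1 };

# Dump the hash as Perl source assigning to $config_data, with sorted
# keys so that the generated text is stable from run to run.
my $dump = Data::Dumper->new( [ $data ], [ 'config_data' ] )
                       ->Sortkeys( 1 )
                       ->Dump;

# Build the module source. Escaped sigils survive into the generated
# code; $dump is interpolated now, so "my $dump" becomes
# "my $config_data = { ... };".
my $source = <<"END_MODULE";
package My::App::Config;

my $dump
sub get_value
{
    my (\$class, \$key) = \@_;

    return unless exists \$config_data->{ \$key };
    return               \$config_data->{ \$key };
}

1;
END_MODULE

# Compile the generated module in memory instead of writing it to lib/.
eval $source or die "Generated module failed to compile: $@";

print My::App::Config->get_value( 'editor' ), "\n";
```

Code that needs a setting then calls get_value( ) on the generated module instead of repeating the questions or the checks behind them.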
Additional content appearing in this section has been removed.
Rewrite the Web
Use the power of Perl to rewrite the web.
The Greasemonkey extension for Mozilla Firefox and related browsers is a powerful way to modify web pages to your liking. In fact, the Mozilla family projects are customizable in many ways—as long as you like writing C++, JavaScript, or XUL.
If your network doesn't run only Firefox, or if you just prefer to customize the Web with Perl instead of any other language, HTTP::Proxy can help.
For whatever reason (registrar greed, mostly), plenty of useful sites such as Perl Monks have .com and .org domain names. One visitor might use http://www.perlmonks.com/, while the truly blessed saints prefer http://perlmonks.org/. That's all well and good except for the cases where you have logged in to the site through one domain name but not the others. Your HTTP cookie uses the specific domain name for identification.
Thus you may follow a link from somewhere that leads to the correct site with the incorrect domain name. How annoying!
Fixing this with HTTP::Proxy is easy though:
use strict;
use warnings;

use HTTP::Proxy ':log';
use HTTP::Proxy::HeaderFilter::simple;

# start the proxy with the given command-line parameters
my $proxy = HTTP::Proxy->new( @ARGV );

for my $redirect (<DATA>)
{
    chomp $redirect;

    my ($pattern, $destination) = split( /\|/, $redirect );
    my $filter                  = get_filter( $destination );

    $proxy->push_filter( host => $pattern, request => $filter );
}

$proxy->start( );

my %filters;

sub get_filter
{
    my $site = shift;

    return $filters{ $site } ||= HTTP::Proxy::HeaderFilter::simple->new(
        sub
        {
            my ( $self, $headers, $message ) = @_;

            # modify the host part of the request only
            $message->uri( )->host( $site );

            # create a new redirect response
            my $res = HTTP::Response->new(
                301,
                "Moved to $site", 
                [ Location => $message->uri( ) ]
            );

            # and make the proxy send it back to the client
            $self->proxy( )->response( $res );
        }
    );
}

__DATA__
perlmonks.com|perlmonks.org
www.perlmonks.org|perlmonks.org
Additional content appearing in this section has been removed.
Chapter 3: Data Munging
Hacks 19-27
Perl has always been in love with data. No matter where you find it, Perl happily processes and extracts and reports on files, databases, web pages, spreadsheets, other programs, and anything that produces data. Perl's so happy to do this that it even overlooks brute-force, rough manipulations. Hey, pragmatism works!
Perl can be gentle, too. A little subtlety, a little style and finesse, and you can write maintainable, easy-to-understand code that's just as powerful as the wild-eyed forge-ahead-at-all-costs just-do-the-job code. Why? It's often faster and more correct—as well as more secure, more powerful, and shorter.
Sure, slinging data between sources sounds about as glamorous as slinging hash at the local diner, but it doesn't have to be that way. Here are several ideas to munge that yummy data with all of the elegance and style and power and clarity that you know you have.
Treat a File As an Array
Pretend a big stream of data on disk is a nice, malleable Perl data structure.
One of the big disappointments in programming is realizing that, although you can think of a text file as a long list of properly terminated lines, to the computer, it's just a big blob of ones and zeroes. If all you need to do is read the lines of a file and process them in order, you're fine. If you have a big file that you can't load into memory and can't process each line in order...well, good luck.
Fortunately, Mark Jason Dominus's Tie::File module exists, and is even in the core as of Perl 5.8.0. What good is it?
Imagine you have a million-line CSV file of inventory data from a customer that's just not quite right. You can't import it into a spreadsheet, because that's too much data. You need to do some processing, inserting some lines and rearranging others. Importing the data into a little SQLite database won't work either because trying to get the queries right is too troublesome.
Tie::File won't help you write the rules for transforming lines, but it will take the pain out of manipulating the lines of a file. Just:
use Tie::File;

tie my @csv_lines, 'Tie::File', 'big_file.csv'
    or die "Cannot open big_file.csv: $!\n";
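Here is a small, self-contained sketch of what the tie buys you. It builds a three-line temporary file (the product data is made up), edits it through the tied array, and reads the result back from disk:

```perl
use strict;
use warnings;

use Tie::File;
use File::Temp 'tempfile';

# Build a small stand-in for the big CSV file (hypothetical data).
my ($fh, $filename) = tempfile( UNLINK => 1 );
print {$fh} "1,widget,9.99\n", "2,gadget,19.99\n", "3,doohickey,4.50\n";
close $fh or die "Cannot close $filename: $!\n";

tie my @lines, 'Tie::File', $filename
    or die "Cannot tie $filename: $!\n";

$lines[1] = '2,gadget,17.99';               # rewrite a line in place
splice @lines, 1, 0, '1a,sprocket,2.25';    # insert a new line before it
push @lines, '4,thingamajig,1.00';          # append to the end of the file

my $count = @lines;                         # number of lines in the file

untie @lines;                               # flush changes to disk

# Read the file back the ordinary way to confirm the edits landed.
open my $in, '<', $filename or die "Cannot read $filename: $!\n";
my @on_disk = <$in>;
close $in;

print "$count lines on disk:\n", @on_disk;
```

Tie::File handles the line endings itself: elements of the tied array carry no trailing newline, and the module adds one when it writes each record.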
Suppose that your big CSV file contains a list of products and operations. That is, each line is either a list of product data (product id, name, price, supplier, et cetera) or some operation to perform on the previous n products. Operations take the form opname:number. Obviously the file would be easier to process if the operations appeared before the data on which to operate, but you can't always change customer data formats to something sane. In fact, this might be the easiest way to clean the data for other processes.
Tie::File makes this almost trivial:
for my $i ( 0 .. $#csv_lines )
{
    next unless my ($op, $num) = $csv_lines[ $i ] =~ /^(\w+):(\d+)/;
    next unless my $op_sub     = __PACKAGE__->can( 'op_' . $op );

    my $start                  = $i - $num;
    my $end                    = $i - 1;
    my @lines                  = @csv_lines[ $start .. $end ];
    my @newlines               = $op_sub->( @lines );

    splice @csv_lines, $start, $num + 1, @newlines;
}
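The loop finds its handlers by name: an operation called foo dispatches to a subroutine named op_foo in the current package. As a sketch of one such handler, assuming a made-up operation named upcase and a plain array standing in for the tied file:

```perl
use strict;
use warnings;

# Hypothetical operation: upper-case every affected line.
sub op_upcase { return map { uc $_ } @_ }

# Two product lines followed by an operation on the previous two lines.
my @csv_lines = ( '1,widget,9.99', '2,gadget,19.99', 'upcase:2' );

# Parse the operation line and look up its handler by name.
my ($op, $num) = $csv_lines[-1] =~ /^(\w+):(\d+)/;
my $op_sub     = __PACKAGE__->can( 'op_' . $op )
    or die "No handler for operation '$op'\n";

# Hand the affected lines to the handler...
my $start    = $#csv_lines - $num;
my @newlines = $op_sub->( @csv_lines[ $start .. $#csv_lines - 1 ] );

# ...then replace those lines and the operation line with the results.
splice @csv_lines, $start, $num + 1, @newlines;

print "$_\n" for @csv_lines;
```

Adding a new operation is then just a matter of writing another op_ subroutine; the loop skips any operation name it cannot resolve.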
Additional content appearing in this section has been removed.
Read Files Backwards