Program: Sorting Your Mail

The program in Example 10.1 sorts a mailbox by subject by reading input a paragraph at a time, looking for one with a "From" at the start of a line. When it finds one, it searches for the subject, strips it of any "Re: " marks, and stores its lowercased version in the @sub array. Meanwhile, the messages themselves are stored in a corresponding @msgs array. The $msgno variable keeps track of the message number.

Example 10-1. bysub1

#!/usr/bin/perl 
# bysub1 - simple sort by subject
my(@msgs, @sub);
my $msgno = -1;
$/ = '';                    # paragraph reads
while (<>) {
    if (/^From/m) {
        /^Subject:\s*(?:Re:\s*)*(.*)/mi;
        $sub[++$msgno] = lc($1) || '';
    }
    $msgs[$msgno] .= $_;
} 
for my $i (sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msgs)) {
    print $msgs[$i];
}

That sort is only sorting array indices. If the subjects are the same, cmp returns 0, so the second part of the || is taken, which compares the message numbers in the order they originally appeared.

If sort were fed a list like (0,1,2,3), that list would get sorted into a different permutation, perhaps (2,1,3,0). We iterate across them with a for loop to print out each message.

Example 10.2 shows how an awk programmer might code this program, using the -00 switch to read paragraphs instead of lines.

Example 10-2. bysub2

#!/usr/bin/perl -n00
# bysub2 - awkish sort-by-subject BEGIN { $msgno = -1 } $sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From/m; $msg[$msgno] .= $_; END { print @msg[ sort { $sub[$a] ...

Get Perl Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.