List Processing

Much earlier in this chapter, we mentioned that Perl has two main contexts, scalar context (for dealing with singular things) and list context (for dealing with plural things). Many of the traditional operators we've described so far have been strictly scalar in their operation. They always take singular arguments (or pairs of singular arguments for binary operators) and always produce a singular result, even in list context. So if you write this:

@array = (1 + 2, 3 - 4, 5 * 6, 7 / 8);

you know that the list on the right side contains exactly four values, because the ordinary math operators always produce scalar values, even in the list context provided by the assignment to an array.
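To check that claim for yourself, a minimal sketch like the following counts the elements after the assignment. The `use strict` and `my` declarations are additions for tidiness and are not part of the original one-liner.

```perl
use strict;
use warnings;

# Each arithmetic expression yields exactly one scalar, even though
# the parenthesized list is evaluated in list context.
my @array = (1 + 2, 3 - 4, 5 * 6, 7 / 8);

# An array evaluated in scalar context gives its element count.
my $count = @array;

print "$count\n";    # 4
print "@array\n";    # 3 -1 30 0.875
```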

However, other Perl operators can produce either a scalar or a list value, depending on their context. They just "know" whether a scalar or a list is expected of them. But how will you know that? It turns out to be pretty easy to figure out, once you get your mind around a few key concepts.
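As a rough illustration of operators that "know" their context, the sketch below calls two built-ins, reverse and localtime, once in list context and once in scalar context. The variable names are made up for this example.

```perl
use strict;
use warnings;

# reverse in list context reverses the order of the elements;
# in scalar context it concatenates its arguments and reverses
# the characters of the resulting string.
my @as_list   = reverse("ab", "cd");
my $as_scalar = reverse("ab", "cd");

print "@as_list\n";      # cd ab
print "$as_scalar\n";    # dcba

# localtime in list context returns nine separate fields;
# in scalar context it returns one human-readable string.
my @fields = localtime;
my $stamp  = localtime;

print scalar(@fields), " fields\n";    # 9 fields
print "$stamp\n";
```

The only difference between each pair of lines is the thing on the left of the assignment: an array supplies list context, a scalar variable supplies scalar context.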

First, list context has to be provided by something in the "surroundings". In the previous example, the list assignment provides it. Earlier we saw that the list of a foreach loop provides it. The print operator also provides it. But you don't have to learn these one by one.

If you look at the various syntax summaries scattered throughout the rest of the book, you'll see various operators that are defined to take a LIST as an argument. Those are the operators that provide a list context. Throughout this book, LIST is used as a specific technical term to mean "a syntactic construct that provides a list context". For example, if you look up sort, you'll find the syntax summary:

sort LIST

That means that sort provides a list context to its arguments.

Second, at compile time (that is, while Perl is parsing your program and translating to internal opcodes), any operator that takes a LIST provides a list context to each syntactic element of that LIST. So every top-level operator or entity in the LIST knows at compile time that it's supposed to produce the best list it knows how to produce. This means that if you say:

sort @dudes, @chicks, other();

then each of @dudes, @chicks, and other() knows at compile time that it's supposed to produce a list value rather than a scalar value. So the compiler generates internal opcodes that reflect this.
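One way to watch this happen is to have other() report the context it was called in, using the built-in wantarray. This is only a sketch: the body of other() here is hypothetical, invented so that the context becomes visible in the output.

```perl
use strict;
use warnings;

my @dudes  = ("Fred", "Barney");
my @chicks = ("Wilma", "Betty");

# Hypothetical body for other(): wantarray is true when the caller
# wants a list, and false (but defined) when it wants a scalar.
sub other { return wantarray ? "list" : "scalar" }

# Inside sort's LIST, other() is called in list context.
my @seen = sort @dudes, @chicks, other();

# Assigned to a scalar variable, the same call sees scalar context.
my $alone = other();

print "@seen\n";     # Barney Betty Fred Wilma list
print "$alone\n";    # scalar
```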

Later, at run time (when the internal opcodes are actually interpreted), each of those LIST elements produces its list in turn, and then (this is important) all the separate lists are joined together, end to end, into a single list. And that squashed-flat, one-dimensional list is what is finally handed off to the function that wanted the LIST in the first place. So if @dudes contains (Fred,Barney), @chicks contains (Wilma,Betty), and the other() function returns the single-element list (Dino), then the LIST that sort sees is:

(Fred,Barney,Wilma,Betty,Dino)

and the LIST that sort returns is:

(Barney,Betty,Dino,Fred,Wilma)
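The flattening step can be reproduced directly. In this sketch the names are quoted so that they are valid Perl strings, and other() is given a trivial body that returns the single-element list described above.

```perl
use strict;
use warnings;

my @dudes  = ("Fred", "Barney");
my @chicks = ("Wilma", "Betty");

# Stand-in for the other() function from the text.
sub other { return ("Dino") }

# The three lists are joined end to end into one flat list.
my @flat = (@dudes, @chicks, other());

# sort receives that same flattened list as its LIST.
my @sorted = sort @dudes, @chicks, other();

print "@flat\n";      # Fred Barney Wilma Betty Dino
print "@sorted\n";    # Barney Betty Dino Fred Wilma
```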

Some operators produce lists (like keys), while some consume them (like print), and others transform lists into other lists (like sort). Operators in the last category can be considered filters, except that, unlike in the shell, the flow of data is from right to left, since list operators operate on arguments passed in from the right. You can stack up several list operators in a row:

print reverse sort map {lc} keys %hash;

That takes the keys of %hash and returns them to the map function, which lowercases all the keys by applying the lc operator to each of them, and passes them to the sort function, which sorts them, and passes them to the reverse function, which reverses the order of the list elements, and passes them to the print function, which prints them.

As you can see, that's much easier to describe in Perl than in English.
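For comparison, here is a small sketch that runs the stacked form on a sample hash and then spells out the same right-to-left flow one stage at a time. The contents of %hash and the intermediate array names are assumptions made for the example, and lc($_) is written out explicitly where the one-liner relies on the default argument.

```perl
use strict;
use warnings;

# Sample data, invented for this example.
my %hash = (Wilma => 1, FRED => 2, barney => 3);

# The stacked form: data flows from right to left.
my @result = reverse sort map { lc($_) } keys %hash;
print "@result\n";    # wilma fred barney

# The same pipeline, one stage per statement.
my @keys     = keys %hash;                # order is not predictable
my @lower    = map { lc($_) } @keys;      # lowercase each key
my @sorted   = sort @lower;               # barney fred wilma
my @reversed = reverse @sorted;           # wilma fred barney
print "@reversed\n";
```

The sort stage is what makes the output stable, since keys returns the hash keys in no particular order.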

There are many other ways in which list processing produces more natural code. We can't enumerate all the ways here, but for an example, let's go back to regular expressions for a moment. We talked about using a pattern in a scalar context to see whether it matched, but if instead you use a pattern in a list context, it does something else: it pulls out all the backreferences (the substrings captured by the parentheses) as a list. Suppose you're searching through a log file or a mailbox, and you want to parse a string containing a time of the form "12:59:59 am". You might say this:

($hour, $min, $sec, $ampm) = /(\d+):(\d+):(\d+) *(\w+)/;

That's a convenient way to set several variables simultaneously. But you could just as easily say

@hmsa = /(\d+):(\d+):(\d+) *(\w+)/;

and put all four values into one array. Oddly, by decoupling the power of regular expressions from the power of Perl expressions, list context increases the power of the language. We don't often admit it, but Perl is actually an orthogonal language in addition to being a diagonal language. Have your cake, and eat it too.
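To see the time-parsing pattern at work, the following sketch applies it to a sample string in list context, in scalar context, and against a string that does not match. The sample text and the explicit $line variable are assumptions for illustration; the statements above match against $_ instead.

```perl
use strict;
use warnings;

# Sample input, invented for this example.
my $line = "Session opened at 12:59:59 am by root";

# List context: the match returns the four captured substrings.
my ($hour, $min, $sec, $ampm) = $line =~ /(\d+):(\d+):(\d+) *(\w+)/;
my @hmsa                      = $line =~ /(\d+):(\d+):(\d+) *(\w+)/;

# Scalar context: the same match only reports success or failure.
my $matched = $line =~ /(\d+):(\d+):(\d+) *(\w+)/;

# A failed match in list context returns the empty list.
my @none = "no time here" =~ /(\d+):(\d+):(\d+) *(\w+)/;

print "$hour $min $sec $ampm\n";    # 12 59 59 am
print "@hmsa\n";                    # 12 59 59 am
print scalar(@none), "\n";          # 0
```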
