Loop Statements

All loop statements have an optional LABEL in their formal syntax. (You can put a label on any statement, but it has a special meaning to a loop.[74]) If present, the label consists of an identifier followed by a colon. It’s customary to make the label uppercase both to stand out visually and to avoid potential confusion with reserved words. (Perl won’t get confused if you use a label that already has a meaning like if or open, but your readers might.)

while and until Statements

The while statement repeatedly executes the block as long as EXPR is true. If the word while is replaced by the word until, the sense of the test is reversed; that is, it executes the block only as long as EXPR remains false. The conditional is still tested before the first iteration, though.
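As a minimal sketch of that reversal, the two loops below count down the same way, and the last one shows that the test really is checked before the first iteration. The variable names are ours, chosen only for illustration.

```perl
use strict;
use warnings;

my @while_seen;
my $n = 3;
while ($n > 0) {              # runs as long as the test is true
    push @while_seen, $n--;
}

my @until_seen;
my $m = 3;
until ($m <= 0) {             # runs as long as the test is false
    push @until_seen, $m--;
}

my $ran = 0;
until (1) { $ran++ }          # test is already true, so the block never runs

print "while: @while_seen\n";
print "until: @until_seen\n";
print "until (1) ran $ran times\n";
```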

The while or until statement can have an optional extra block: continue. This block is executed every time the block is continued, either by falling off the end of the first block or by an explicit next (a loop-control operator that goes to the next iteration). The continue block is not heavily used in practice, but it’s in here so we can define the three-part loop rigorously in the next section.

Unlike the foreach loop we’ll see in a moment, a while loop has no official “loop variable”.[75] You may, however, declare variables explicitly. A variable declared in the test condition of a while or until statement is visible only in the block or blocks governed by that test. It is not part of the surrounding scope. For example:

while (my $line = <STDIN>) {
    $line = lc $line;
}
continue {
    print $line;   # still visible
}
# $line now out of scope here

Here, the scope of $line extends from its declaration in the control expression throughout the rest of the loop construct, including the continue block, but not beyond. If you want the scope to extend further, declare the variable before the loop.

Three-Part Loops

The three-part loop[76] has three semicolon-separated expressions within its parentheses. These three expressions are interpreted respectively as the initialization, the condition, and the reinitialization of the loop. The parentheses around them and the two semicolons between them are required, but the expressions themselves are optional. The initializer and reinitializer do nothing if omitted. The condition, if omitted, is considered to have a true value. (The values of the initializer and reinitializer don’t matter since they are evaluated only for their side effects.)

The three-part loop can be defined in terms of the corresponding while loop, relocating its three expressions. When you say this:

LABEL:
  for (my $i = 1; $i <= 10; $i++) {
      ...
  }

it gets rearranged internally to work like this:

{
    my $i = 1;
  LABEL:
    while ($i <= 10) {
        ...
    }
    continue {
        $i++;
    }
}

(except that there’s not really an outer block; we just put one there to show how the scope of the my is limited).
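One practical consequence of this rearrangement is that a next inside a three-part loop still runs the reinitialization, just as it would run a continue block. Here is a small sketch of both forms side by side; the arrays @for_seen and @while_seen are only there so the results can be compared.

```perl
use strict;
use warnings;

my @for_seen;
for (my $i = 1; $i <= 5; $i++) {
    next if $i % 2 == 0;      # skip even numbers; $i++ still runs
    push @for_seen, $i;
}

my @while_seen;
{
    my $i = 1;
    while ($i <= 5) {
        next if $i % 2 == 0;  # next jumps to the continue block
        push @while_seen, $i;
    }
    continue {
        $i++;
    }
}

print "for:   @for_seen\n";
print "while: @while_seen\n";
```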

If you want to iterate through two variables simultaneously, just separate the parallel expressions with commas:

my $i;
my $bit;
for ($i = 0, $bit = 1; $i < 32; $i++, $bit <<= 1) {
    say "Bit $i is set" if $mask & $bit;
}
# the values in $i and $bit persist past the loop

Or to declare those variables to be visible only inside the loop:

for (my ($i, $bit) = (0, 1); $i < 32; $i++, $bit <<= 1) {
    say "Bit $i is set" if $mask & $bit;
}
# loop's versions of $i and $bit now out of scope

Besides the normal looping through array indices, the three-part loop can lend itself to many other interesting applications. It doesn’t even need an explicit loop variable. Here’s one example that avoids the problem you get when you explicitly test for end-of-file on an interactive file descriptor, causing your program to appear to hang:

$on_a_tty = -t STDIN && -t STDOUT;
sub prompt { print "yes? " if $on_a_tty }
for ( prompt(); <STDIN>; prompt() ) {
    # do something
}

Another traditional use of the three-part loop is the “infinite loop”. Since all three expressions are optional, and the default condition is true, when you write:

for (;;) {
    ...
}

it is the same as writing:

while (1) {
    ...
}

If the notion of infinite loops bothers you, we should point out that you can always fall out of the loop at any point with an explicit loop-control operator such as last. Of course, if you’re writing the code to control a nuclear cruise missile, you may not actually need an explicit loop exit. The loop will be terminated automatically at the appropriate moment.[77]

foreach Loops

This loop iterates over a list of values by setting the control variable (VAR) to each successive element of the list:

for my VAR (LIST) {
    ...
}

If “my VAR” is omitted, the global $_ is used. You can omit the my, but only when use strict is turned off, so don’t.

For historical reasons, the foreach keyword is a synonym for the for keyword, so you can use for and foreach interchangeably, whichever you think is more readable in a given situation. We tend to prefer for because we are lazy and because it is more readable, especially with the my. (Don’t worry: Perl can easily distinguish for (@ARGV) from for ($i = 0; $i <= $#ARGV; $i++) because the latter contains semicolons.) Here are some examples:

$sum = 0;
for my $value (@array) { $sum += $value }

for my $count (10,9,8,7,6,5,4,3,2,1,"BOOM") {  # do a countdown
    say $count;
    sleep(1);
}

for (reverse "BOOM", 1 .. 10) {                # same thing
    say;
    sleep(1);
}

for my $field (split /:/, $data) {             # any LIST expression
    say "Field contains: '$field'";
}

for my $key (sort keys %hash) {
    say "$key => $hash{$key}";
}

That last one is the canonical way to print out the keys and values of a hash in order sorted by key. See the keys and sort entries in Chapter 27 for more elaborate examples.
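If you want the order to depend on the values instead, you can hand sort a comparison block. The following is only a sketch with a made-up %hash: it orders by descending value and breaks ties by key.

```perl
use strict;
use warnings;

my %hash = (apple => 3, banana => 1, cherry => 3);   # sample data, ours

my @lines;
for my $key (sort { $hash{$b} <=> $hash{$a} or $a cmp $b } keys %hash) {
    push @lines, "$key => $hash{$key}";
}
print "$_\n" for @lines;
```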

There is no way to tell where you are in the list. You may compare adjacent elements by remembering the previous one in a variable, but sometimes you just have to break down and write a three-part loop with subscripts. That’s why we have two different loops, after all.
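Both workarounds can be sketched briefly. The data in @temps is invented; the first loop remembers the previous element, and the second falls back to subscripts when the position itself matters.

```perl
use strict;
use warnings;

my @temps = (61, 64, 64, 59, 70);     # sample readings, ours

# Remember the previous element to compare neighbors.
my @rises;
my $prev;
for my $t (@temps) {
    push @rises, $t if defined $prev && $t > $prev;
    $prev = $t;
}

# Use a three-part loop with subscripts when you need the position.
my @positions;
for (my $i = 1; $i < @temps; $i++) {
    push @positions, $i if $temps[$i] > $temps[$i - 1];
}

print "rises: @rises\n";
print "at indices: @positions\n";
```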

If LIST consists of assignable values (meaning variables, generally, not enumerated constants), you can modify each of those variables by modifying VAR inside the loop. That’s because the loop variable becomes an implicit alias for each item in the list that you’re looping over. Not only can you modify a single array in place, you can also modify multiple arrays and hashes in a single list:

for my $pay (@salaries) {                # grant 8% raises
    $pay *= 1.08;
}

for (@christmas, @easter) {              # change menu
    s/ham/turkey/;
}
s/ham/turkey/ for @christmas, @easter;   # same thing

for ($scalar, @array, values %hash) {
    s/^\s+//;                            # strip leading  whitespace
    s/\s+$//;                            # strip trailing whitespace
}

The loop variable is valid only from within the dynamic or lexical scope of the loop and will be implicitly lexical if the variable was previously declared with my. This renders it invisible to any function defined outside the lexical scope of the variable, even if called from within that loop. However, if no lexical declaration is in scope, the loop variable will be a localized (dynamically scoped) global variable; this allows functions called from within the loop to access that variable. In either case, any previous value the localized variable had before the loop will be restored automatically upon loop exit.
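Here is a short sketch of both cases, assuming a global $Tick and a lexical $n that we made up for the purpose. The function record_tick (also ours) can see the localized global during the loop, and both variables get their old values back afterward.

```perl
use strict;
use warnings;

our $Tick = "before";
my @seen_by_sub;
sub record_tick { push @seen_by_sub, $Tick }   # reads the global $Tick

for $Tick (1 .. 3) {          # no lexical $Tick in scope: localized global
    record_tick();            # sees 1, then 2, then 3
}

my $n = "before";
my @lexical_seen;
for $n (1 .. 3) {             # uses the lexical $n, private to the loop
    push @lexical_seen, $n;
}

print "sub saw: @seen_by_sub\n";
print "after the loops: \$Tick is $Tick, \$n is $n\n";
```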

If you prefer, you may explicitly declare which kind of variable (lexical or global) to use. This makes it easier for maintainers of your code to know what’s really going on; otherwise, they’ll need to search back up through enclosing scopes for a previous declaration to figure out which kind of variable it is:

for my  $i    (1 .. 10) { ... }         # $i always lexical
for our $Tick (1 .. 10) { ... }         # $Tick always global

When a declaration accompanies the loop variable, the shorter for spelling is preferred over foreach, since it reads better in English.

Here’s how a C or Java programmer might first think to code up a particular algorithm in Perl:

for ($i = 0; $i < @ary1; $i++) {
    for ($j = 0; $j < @ary2; $j++) {
        if ($ary1[$i] > $ary2[$j]) {
            last;         # Can't go to outer loop. :-(
        }
        $ary1[$i] += $ary2[$j];
    }
    # this is where that last takes me
}

But here’s how a veteran Perl programmer might do it:

WID: for my $this (@ary1) {
    JET: for my $that (@ary2) {
        next WID if $this > $that;
        $this += $that;
    }
}

See how much easier that was in idiomatic Perl? It’s cleaner, safer, and faster. It’s cleaner because it’s less noisy. It’s safer because if code gets added between the inner and outer loops later on, the new code won’t be accidentally executed, since next (explained below) explicitly iterates the outer loop rather than merely breaking out of the inner one. And it’s faster than the equivalent three-part loop, since the elements are accessed directly instead of through subscripting.

But write it however you like. TMTOWTDI.
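If you want to convince yourself that the two versions do the same work, a sketch like the following runs both on copies of the same data. The wrapper subroutines c_style and perl_style and the sample numbers are our own.

```perl
use strict;
use warnings;

sub c_style {
    my @ary1 = @{ $_[0] };    # work on copies of the caller's arrays
    my @ary2 = @{ $_[1] };
    for (my $i = 0; $i < @ary1; $i++) {
        for (my $j = 0; $j < @ary2; $j++) {
            last if $ary1[$i] > $ary2[$j];
            $ary1[$i] += $ary2[$j];
        }
    }
    return @ary1;
}

sub perl_style {
    my @ary1 = @{ $_[0] };
    my @ary2 = @{ $_[1] };
    WID: for my $this (@ary1) {
        for my $that (@ary2) {
            next WID if $this > $that;
            $this += $that;   # $this aliases the element of @ary1
        }
    }
    return @ary1;
}

my @a = c_style([1, 5, 10], [2, 3, 20]);
my @b = perl_style([1, 5, 10], [2, 3, 20]);
print "@a\n@b\n";
```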

Like the while statement, the foreach statement can also take a continue block. This lets you execute a bit of code at the bottom of each loop iteration no matter whether you got there in the normal course of events or through a next.

Speaking of which, now we can finally say it: next is next.

Loop Control

We mentioned that you can put a LABEL on a loop to give it a name. The loop’s LABEL identifies the loop for the loop-control operators next, last, and redo. The LABEL names the loop as a whole, not just the top of the loop. Hence, a loop-control operator referring to the loop doesn’t actually “go to” the loop label itself. As far as the computer is concerned, the label could just as easily have been placed at the end of the loop. But people like things labelled at the top, for some reason.

Loops are typically named for the item the loop is processing on each iteration. This interacts nicely with the loop-control operators, which are designed to read like English when used with an appropriate label and a statement modifier. The archetypal loop works on lines, so the archetypal loop label is LINE:, and the archetypal loop-control operator is something like this:

next LINE if /^#/;      # discard comments

The syntax for the loop-control operators is:

last LABEL
next LABEL
redo LABEL

The LABEL is optional; if omitted, the operator refers to the innermost enclosing loop. But if you want to jump past more than one level, you must use a LABEL to name the loop you want to affect. That LABEL does not have to be in your lexical scope, though it probably ought to be. But, in fact, the LABEL can be anywhere in your dynamic scope. If this forces you to jump out of an eval or subroutine, Perl issues a warning (upon request).
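For instance, a search through nested data usually wants to stop both loops once it finds a match. This sketch assumes a small made-up @grid and a ROW label of our choosing:

```perl
use strict;
use warnings;

my @grid   = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);   # sample data, ours
my $target = 5;
my ($found_row, $found_col);
my $cells_checked = 0;

ROW: for my $r (0 .. $#grid) {
    for my $c (0 .. $#{ $grid[$r] }) {
        $cells_checked++;
        if ($grid[$r][$c] == $target) {
            ($found_row, $found_col) = ($r, $c);
            last ROW;         # leaves both loops, not just the inner one
        }
    }
}

print "found $target at ($found_row, $found_col) ",
      "after $cells_checked cells\n";
```

A bare last in the same spot would exit only the inner loop, and the outer loop would go on to scan the remaining rows.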

Just as you may have as many return operators in a function as you like, you may have as many loop-control operators in a loop as you like. This is not to be considered wicked or even uncool. During the early days of structured programming, some people insisted that loops and subroutines have only one entry and one exit. The one-entry notion is still a good idea, but the one-exit notion has led people to write a lot of unnatural code. Much of programming consists of traversing decision trees. A decision tree naturally starts with a single trunk but ends with many leaves. Write your code with the number of loop exits (and function returns) that is natural to the problem you’re trying to solve. If you’ve declared your variables with reasonable scopes, everything gets automatically cleaned up at the appropriate moment, no matter how you leave the block.

The last operator immediately exits the loop in question. The continue block, if any, is not executed. The following example bombs out of the loop on the first blank line:

LINE: while (<STDIN>) {
    last LINE if /^$/;      # exit when done with mail header
    ...
}

The next operator skips the rest of the current iteration of the loop and starts the next one. If there is a continue clause on the loop, it is executed just before the condition is reevaluated, just like the third component of a three-part for loop. Thus, it can be used to increment a loop variable, even when a particular iteration of the loop has been interrupted by a next:

LINE: while (<STDIN>) {
    next LINE if /^#/;      # skip comments
    next LINE if /^$/;      # skip blank lines
    ...
} continue {
    $count++;
}
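To watch the continue block do its counting, here is a self-contained variant of that loop. It assumes an in-memory filehandle over a made-up string instead of STDIN, and a lexical $line instead of $_.

```perl
use strict;
use warnings;

my $text = "# comment\n\nalpha\nbeta\n# another\n";   # sample input, ours
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";

my $count = 0;
my @kept;
LINE: while (my $line = <$fh>) {
    next LINE if $line =~ /^#/;     # skip comments
    next LINE if $line =~ /^$/;     # skip blank lines
    chomp $line;
    push @kept, $line;
} continue {
    $count++;                       # runs for skipped lines, too
}
close $fh;

print "kept @kept out of $count lines\n";
```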

The redo operator restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. This operator is often used by programs that want to fib to themselves about what was just input. Suppose you were processing a file that sometimes had a backslash at the end of a line to continue the record on the next line. Here’s how you could use redo for that:

while (<>) {
    chomp;
    if (s/\\$//) {
        $_ .= <>;
        redo unless eof;    # don't read past each file's eof
    }
    # now process $_
}

which is the customary Perl shorthand for the more explicitly (and tediously) written version:

LINE: while (defined($line = <ARGV>)) {
    chomp($line);
    if ($line =~ s/\\$//) {
        $line .= <ARGV>;
        redo LINE unless eof(ARGV);
    }
    # now process $line
}
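The same technique can be tried without any files on disk. This sketch assumes an in-memory filehandle and a sample string of our own, with one continued record in it:

```perl
use strict;
use warnings;

# "first \" continues onto the next line; the other lines stand alone.
my $text = "first \\\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";

my @records;
LINE: while (defined(my $line = <$fh>)) {
    chomp $line;
    if ($line =~ s/\\$//) {
        $line .= <$fh>;               # glue the next line on
        redo LINE unless eof($fh);    # recheck the joined line
    }
    push @records, $line;
}
close $fh;

print "[$_]\n" for @records;
```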

Here’s an example from a real program that uses all three loop-control operators. Although this particular strategy of parsing command-line arguments is less common now that we have the Getopt::* modules bundled with Perl,[78] it’s still a nice illustration of the use of loop-control operators on named, nested loops:

ARG: while (@ARGV && $ARGV[0] =~ s/^-(?=.)//) {
    OPT: for (shift @ARGV) {
         m/^$/       && do {                             next ARG };
         m/^-$/      && do {                             last ARG };
         s/^d//      && do { $Debug_Level++;             redo OPT };
         s/^l//      && do { $Generate_Listing++;        redo OPT };
         s/^i(.*)//  && do { $In_Place = $1 || ".bak";   next ARG };
         say_usage("Unknown option: $_");
    }
}

One more point about loop-control operators. You may have noticed that we are not calling them “statements”. That’s because they aren’t statements—although like any expression, they can be used as statements. You can almost think of them as unary operators that just happen to cause a change in control flow. So you can use them anywhere it makes sense to use them in an expression. In fact, you can even use them where it doesn’t make sense. One sometimes sees this coding error:

open FILE, '<', $file
     or warn "Can't open $file: $!\n", next FILE;   # WRONG

The intent is fine, but the next FILE is being parsed as one of the arguments to warn, which is a list operator. So the next executes before the warn gets a chance to emit the warning. In this case, it’s easily fixed by turning the warn list operator into the warn function call with some suitably situated parentheses:

open FILE, '<', $file
     or warn("Can't open $file: $!\n"), next FILE;   # okay

However, you might find it easier to read this:

unless (open FILE, '<', $file) {
     warn "Can't open $file: $!\n";
     next FILE;
}
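Yet another spelling keeps the or but groups the two actions in a do block, which sidesteps the list-operator problem entirely. The following sketch is ours, not a canonical idiom: it uses lexical filehandles, a temporary file from File::Temp, and a deliberately missing filename, and it collects warnings in @warnings so you can see that the warning is emitted before the next.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# One file that exists and one that (we assume) does not.
my ($tmp_fh, $good) = tempfile(UNLINK => 1);
print {$tmp_fh} "hello\n";
close $tmp_fh;
my $missing = "$good.does-not-exist";

my (@warnings, @opened);
local $SIG{__WARN__} = sub { push @warnings, $_[0] };   # capture warnings

FILE: for my $file ($missing, $good) {
    open(my $fh, '<', $file) or do {
        warn "Can't open $file: $!\n";   # runs first
        next FILE;                       # then skips to the next file
    };
    push @opened, $file;
    close $fh;
}

print "opened ", scalar(@opened), " file(s), ",
      scalar(@warnings), " warning(s)\n";
```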

Bare Blocks as Loops

A BLOCK by itself (labelled or not) is semantically equivalent to a loop that executes once. Thus, you can use last to leave the block or redo to restart the block.[79] Note that this is not true of the blocks in eval {}, sub {}, or, much to everyone’s surprise, do {}. These three are not loop blocks because they’re not BLOCKs by themselves; the keyword in front makes them mere terms in an expression that just happen to include a code block. Since they’re not loop blocks, they cannot be given a label to apply loop controls to. Loop controls may only be used on true loops, just as a return may only be used within a subroutine (well, or an eval).
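As a small sketch of a bare block acting as a once-through loop, the following retries with redo and gives up with last. The PROMPT label and the canned answers in @queue are invented for the example.

```perl
use strict;
use warnings;

my @queue = ("", "", "yes");    # canned answers standing in for user input
my $tries = 0;
my $answer;

PROMPT: {
    $tries++;
    $answer = shift @queue;
    redo PROMPT if $answer eq "" && @queue;   # try again while answers remain
    last PROMPT if $answer eq "";             # nothing usable: leave the block
    $answer = uc $answer;                     # reached only for a real answer
}

print "got '$answer' after $tries tries\n";
```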

Loop controls don’t work in an if or unless, either, since those aren’t loops. But you can always introduce an extra set of braces to give yourself a bare block, which does count as a loop:

if (/pattern/) {{
    last if /alpha/;
    last if /beta/;
    last if /gamma/;
    # do something here only if still in if()
}}

Here’s how a block can be used to let loop-control operators work with a do {} construct. To next or redo a do, put a bare block inside:

do {{
    next if $x == $y;
    # do something here
}} until $x++ > $z;

For last, you have to be more elaborate:

{
    do {
        last if $x == $y ** 2;
        # do something here
    } while $x++ <= $z;
}

And if you want both loop controls available, you’ll have to put a label on those blocks so you can tell them apart:

DO_LAST: {
            do {
DO_NEXT:          {
                    next DO_NEXT if $x == $y;
                    last DO_LAST if $x == $y ** 2;
                    # do something here
                  }
            } while $x++ <= $z;
         }

But certainly by that point (if not before), you’re better off using an ordinary infinite loop with last at the end:

for (;;) {
    unless ($x == $y) {           # a next here would skip the $x++ below
        last if $x == $y ** 2;
        # do something here
    }
    last unless $x++ <= $z;
}
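To check that a plain loop with last at the end can stand in for the doubly labelled do, here is a sketch that runs both shapes on the same numbers and records which values of $x reach the body. The subroutine names and sample arguments are ours. Note that the plain loop writes the skip as a conditional rather than a next, so that the increment at the bottom always runs.

```perl
use strict;
use warnings;

sub with_do_blocks {
    my ($x, $y, $z) = @_;
    my @seen;
    DO_LAST: {
        do {
            DO_NEXT: {
                next DO_NEXT if $x == $y;
                last DO_LAST if $x == $y ** 2;
                push @seen, $x;           # "do something here"
            }
        } while $x++ <= $z;
    }
    return @seen;
}

sub with_plain_loop {
    my ($x, $y, $z) = @_;
    my @seen;
    for (;;) {
        unless ($x == $y) {
            last if $x == $y ** 2;
            push @seen, $x;               # "do something here"
        }
        last unless $x++ <= $z;           # the test moves to the bottom
    }
    return @seen;
}

my @a = with_do_blocks(0, 2, 10);
my @b = with_plain_loop(0, 2, 10);
print "@a\n@b\n";
```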

Loopy Topicalizers

Perl has more than one topicalizer; in addition to given, you can also use a foreach loop as a topicalizer. For example, here’s one way to count how many times a particular string occurs in an array:

use v5.10.1;
my $count = 0;
for (@array) {
    when ("FNORD") { ++$count }
}
print "\@array contains $count copies of 'FNORD'\n";

Or in a more recent version:

use v5.14;
my $count = 0;
for (@array) {
    ++$count when "FNORD";
}
print "\@array contains $count copies of 'FNORD'\n";

At the end of all when blocks inside a foreach loop, there is an implicit break, which, since you’re in a loop, is equivalent to a next. You can override that with an explicit last if you’re only interested in the first match.

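A when only works if the topic is in $_, so you can’t specify a loop variable, or if you do, it must be $_. Note that the lexical my $_ form was removed in v5.24, so the following runs only on v5.10 through v5.22; on later versions, omit the loop variable and let the loop use the global $_:

for my $_ (@answers) {
    say "Life, the Universe, and Everything!" when 42;
}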
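Since when and smartmatching have been marked experimental as of v5.18, you may prefer a spelling that needs neither. This is a swapped-in alternative, not the topicalizer technique itself: a plain eq test in the loop, or grep in scalar context, counts the same matches. The contents of @array are made up.

```perl
use strict;
use warnings;

my @array = qw(FNORD foo FNORD bar);   # sample data, ours

my $count = 0;
for (@array) {
    ++$count if $_ eq "FNORD";         # ordinary statement modifier
}

my $grep_count = grep { $_ eq "FNORD" } @array;   # count of matching elements

print "\@array contains $count copies of 'FNORD'\n";
```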


[74] Prior to v5.14, you couldn’t put a label on a package statement.

[75] A consequence of this is that a while never implicitly localizes any variables in its test condition. This can have “interesting” consequences when while loops are used in conjunction with operators that do implicitly know about global variables such as $_. In particular, see the section Line Input (Angle) Operator in Chapter 2 for how implicit assignment to the global $_ can occur in certain while loops, along with examples of how to deal with the problem.

[76] Also known as for loops, but that’s confusing since Perl has for loops that are not three-part loops, so we avoid that term.

[77] That is, the fallout from the loop tends to occur automatically.

[78] See Mastering Perl for a comparison of the main command-line argument parsing modules.

[79] For reasons that may (or may not) become clear upon reflection, a next also exits the once-through block. There is a slight difference, however: a next will execute a continue block, but a last won’t.
