Learning Perl
Learning Perl, Second Edition

By Randal L. Schwartz, Tom Christiansen
Foreword by Larry Wall

Table of Contents

Chapter 1: Introduction
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
History of Perl
Perl is short for "Practical Extraction and Report Language," although it has also been called a "Pathologically Eclectic Rubbish Lister." There's no point in arguing which one is more correct, because both are endorsed by Larry Wall, Perl's creator and chief architect, implementor, and maintainer. He created Perl when he was trying to produce some reports from a Usenet-news-like hierarchy of files for a bug-reporting system, and awk ran out of steam. Larry, being the lazy programmer that he is, decided to over-kill the problem with a general-purpose tool that he could use in at least one other place. The result was the first version of Perl.
After playing with this version of Perl a bit, adding stuff here and there, Larry released it to the community of Usenet readers, commonly known as "the Net." The users on this ragtag fugitive fleet of systems around the world (tens of thousands of them) gave him feedback, asking for ways to do this, that, or the other, many of which Larry had never envisioned his little Perl handling.
But as a result, Perl grew, and grew, and grew, at about the same rate as the UNIX operating system. (For you newcomers, the entire UNIX kernel used to fit in 32K! And now we're lucky if we can get it in under a few meg.) It grew in features. It grew in portability. What was once a little language now had over a thousand pages of documentation split across dozens of different manpages, a 600-page Nutshell reference book, a handful of Usenet newsgroups with 200,000 subscribers, and now this gentle introduction.
Larry is no longer the sole maintainer of Perl, but retains his executive title of chief architect. And Perl is still growing.
This book was tested with Perl version 5.0 patchlevel 4 (the most recent release as I write this). Everything here should work with 5.0 and future releases of Perl. In fact, Perl 1.0 programs work rather well with recent releases, except for a few odd changes made necessary in the name of progress.
Purpose of Perl
Perl is designed to assist the programmer with common tasks that are probably too heavy or too portability-sensitive for the shell, and yet too weird or short-lived or complicated to code in C or some other UNIX glue language.
Once you become familiar with Perl, you may find yourself spending less time trying to get shell quoting (or C declarations) right, and more time reading Usenet news and downhill snowboarding, because Perl is a great tool for leverage. Perl's powerful constructs allow you to create (with minimal fuss) some very cool one-up solutions or general tools. Also, you can drag those tools along to your next job, because Perl is highly portable and readily available, so you'll have even more time there to read Usenet news and annoy your friends at karaoke bars.
Like any language, Perl can be "write-only"; it's possible to write programs that are impossible to read. But with proper care, you can avoid this common accusation. Yes, sometimes Perl looks like line noise to the uninitiated, but to the seasoned Perl programmer, it looks like checksummed line noise with a mission in life. If you follow the guidelines of this book, your programs should be easy to read and easy to maintain, but they probably won't win any obfuscated Perl contests.
Availability
If you get
perl: not found
when you try to invoke Perl from the shell, your system administrator hasn't caught the fever yet. But even if it's not on your system, you can get it for free (or nearly so).
Perl is distributed under the GNU General Public License, which says something like, "you can distribute binaries of Perl only if you make the source code available at no cost, and if you modify Perl, you have to distribute the source to your modifications as well." And that's essentially free. You can get the source to Perl for the cost of a blank tape or a few megabytes over a wire. And no one can lock Perl up and sell you just binaries for their particular idea of "supported hardware configurations."
In fact, it's not only free, but it runs rather nicely on nearly everything that calls itself UNIX or UNIX-like and has a C compiler. This is because the package comes with an arcane configuration script called Configure that pokes and prods the system directories looking for things it requires, and adjusts the include files and defined symbols accordingly, turning to you for verification of its findings.
Besides UNIX or UNIX-like systems, people have also been addicted enough to Perl to port it to the Amiga, the Atari ST, the Macintosh family, VMS, OS/2, even MS-DOS and Windows NT and Windows 95, and probably even more by the time you read this. The sources for Perl (and many precompiled binaries for non-UNIX architectures) are available from the Comprehensive Perl Archive Network (the CPAN). If you are web-savvy, visit http://www.perl.com/CPAN for one of the many mirrors. If you're absolutely stumped, write bookquestions@oreilly.com and say "Where can I get Perl?!?!"
Basic Concepts
A shell script is nothing more than a sequence of shell commands stuffed into a text file. The file is then "made executable" by turning on the execute bit (via chmod +x filename) and then the name of the file is typed at a shell prompt. Bingo, one shell program. For example, a script to run the date command followed by the who command can be created and executed like this:
% echo date >somescript
% echo who >>somescript
% cat somescript
date
who
% chmod +x somescript
% somescript
[output of date followed by who]
%
Similarly, a Perl program is a bunch of Perl statements and definitions thrown into a file. You then turn on the execute bit and type the name of the file at a shell prompt. However, the file has to indicate that this is a Perl program and not a shell program, so you need an additional step.
Most of the time, this step involves placing the line
            #!/usr/bin/perl
as the first line of the file. But if your Perl is stuck in some nonstandard place, or your system doesn't understand the #! line, you'll have a little more work to do. Check with your Perl installer about this. The examples in this book assume that you use this common mechanism.
Perl is mostly a free-format language like C: whitespace between tokens (elements of the program, like print or +) is optional, unless two tokens put together can be mistaken for another token, in which case whitespace of some kind is mandatory. (Whitespace consists of spaces, tabs, newlines, returns, or formfeeds.) There are a few constructs that require a certain kind of whitespace in a certain place, but they'll be pointed out when we get to them. You can assume that the kind and amount of whitespace between tokens is otherwise arbitrary.
Although nearly any Perl program can be written all on one line, typically a Perl program is indented much like a C program, with nested parts of statements indented more than the surrounding parts. You'll see plenty of examples showing a typical indentation style throughout this book.
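As a small illustration of the free-format rule, the sketch below writes the same assignment three ways. The variable name $total is made up for this example, and the double-quoted "$total\n" relies on variable interpolation, which is covered later in the book.

```perl
#!/usr/bin/perl -w
# The same assignment written three ways; Perl sees identical tokens each time.
$total=1+2;
print "$total\n";    # prints 3

$total = 1 + 2;
print "$total\n";    # prints 3

$total
    =
        1
    +   2;
print "$total\n";    # prints 3
```

The middle form is the one you'd normally write; the other two are there only to show that Perl doesn't care.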
A Stroll Through Perl
We begin our journey through Perl by taking a little stroll. This stroll presents a number of different features by hacking on a small application. The explanations here are extremely brief; each subject area is discussed in much greater detail later in this book. But this little stroll should give you a quick taste for the language, and you can decide if you really want to finish this book rather than read some more Usenet news or run off to the ski slopes.
Let's look at a little program that actually does something. Here is your basic "Hello, world" program:
#!/usr/bin/perl -w
print ("Hello, world!\n");
The first line is the incantation that says this is a Perl program. It's also a comment for Perl; remember that a comment is anything from a pound sign to the end of that line, as in many interpreted programming languages. Unlike all other comments in the program, the one on the first line is special: Perl looks at that line for any optional arguments. In this case, the -w switch was used. This very important switch tells Perl to produce extra warning messages about potentially dangerous constructs. You should always develop your programs under -w.
The second line is the entire executable part of this program. Here we see a print function. The built-in function print starts it off, and in this case has just one argument, a C-like text string. Within this string, the character combination \n stands for a newline character. The print statement is terminated by a semicolon (;). As in C, all simple statements in Perl are terminated by a semicolon.
When you invoke this program, the kernel fires up a Perl interpreter, which parses the entire program (all two lines of it, counting the first, comment line) and then executes the compiled form. The first and only operation is the execution of the print function, which sends its arguments to the output. After the program has completed, the Perl process exits, returning back a successful exit code to the parent shell.
Exercise
Most chapters end with some exercises, for which answers are found in Appendix A. For this stroll, the answers have already been given above.
  1. Type in the example programs, and get them to work. (You'll need to create the secret-word lists as well.) Consult your local Perl guru if you need assistance.
Chapter 2: Scalar Data
What Is Scalar Data?
A scalar is the simplest kind of data that Perl manipulates. A scalar is either a number (like 4 or 3.25e20) or a string of characters (like hello or the Gettysburg Address). Although you may think of numbers and strings as very different things, Perl uses them nearly interchangeably, so we'll describe them together.
A scalar value can be acted upon with operators (like plus or concatenate), generally yielding a scalar result. A scalar value can be stored into a scalar variable. Scalars can be read from files and devices and written out as well.
Numbers
Although a scalar is either a number or a string, it's useful to look at numbers and strings separately for the moment. Numbers first, strings in a minute.
As you'll see in the next few paragraphs, you can specify both integers (whole numbers, like 17 or 342) and floating-point numbers (real numbers with decimal points, like 3.14, or 1.35 times 10 to the 25th power). But internally, Perl computes only with double-precision floating-point values. This means that there are no integer values internal to Perl; an integer constant in the program is treated as the equivalent floating-point value. You probably won't notice the conversion (or care much), but you should stop looking for integer operations (as opposed to floating-point operations), because there aren't any.
A literal is the way a value is represented in the text of the Perl program. You could also call this a constant in your program, but we'll use the term literal. Literals are the way data is represented in the source code of your program as input to the Perl compiler. (Data that is read from or written to files is treated similarly, but not identically.)
Perl accepts the complete set of floating-point literals available to C programmers. Numbers with and without decimal points are allowed (including an optional plus or minus prefix), as well as tacking on a power-of-10 indicator (exponential notation) with E notation. For example:
1.25     # about 1 and a quarter
7.25e45  # 7.25 times 10 to the 45th power (a big number)
-6.5e24  # negative 6.5 times 10 to the 24th
         # (a "big" negative number)
-12e-24  # negative 12 times 10 to the -24th
         # (a very small negative number)
-1.2E-23 # another way to say that
Integer literals are also straightforward, as in:
12
15
-2004
3485
Don't start the number with a 0, because Perl supports octal and hexadecimal (hex) literals. Octal numbers start with a leading 0, and hex numbers start with a leading
Strings
Strings are sequences of characters (like hello). Each character is an 8-bit value from the entire 256 character set (there's nothing special about the NUL character as in some languages).
The shortest possible string has no characters. The longest string fills all of your available memory (although you wouldn't be able to do much with that). This is in accordance with the principle of "no built-in limits" that Perl follows at every opportunity. Typical strings are printable sequences of letters and digits and punctuation in the ASCII 32 to ASCII 126 range. However, the ability to have any character from 0 to 255 in a string means you can create, scan, and manipulate raw binary data as strings, something with which most other utilities would have great difficulty. (For example, you can patch your operating system by reading it into a Perl string, making the change, and writing the result back out.)
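To make the "nothing special about NUL" point concrete, here is a minimal sketch. It borrows the built-in chr and length functions and the dot (concatenation) operator, none of which have been introduced yet, so treat them as assumptions from later chapters; the variable names are invented for the example.

```perl
#!/usr/bin/perl -w
# Build a three-character string whose middle character is NUL (character 0).
$nul    = chr(0);
$binary = "a" . $nul . "b";

# The NUL does not end the string: all three characters are counted.
$count = length($binary);
print "$count\n";    # prints 3
```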
Like numbers, strings have a literal representation (the way you represent the string in a Perl program). Literal strings come in two different flavors: single-quoted strings and double-quoted strings. Another form that looks rather like these two is the back-quoted string (`like this`). This isn't so much a literal string as a way to run external commands and get back their output. This is covered in Chapter 14.
A single-quoted string is a sequence of characters enclosed in single quotes. The single quotes are not part of the string itself; they're just there to let Perl identify the beginning and the ending of the string. Any character between the quote marks (including newline characters, if the string continues onto successive lines) is legal inside a string. Two exceptions: to get a single quote into a single-quoted string, precede it by a backslash. And to get a backslash into a single-quoted string, precede the backslash by a backslash. In other pictures:
'hello'     # five characters: h, e, l, l, o
'don\'t'    # five characters: d, o, n, single-quote, t
''          # the null string (no characters)
'silly\\me' # silly, followed by backslash, followed by me
'hello\n'   # hello followed by backslash followed by n
'hello
there'      # hello, newline, there (11 characters total)
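If you'd like to check those character counts yourself, the built-in length function will do it. This is only a sketch: length, the double-quoted "\n", and giving print more than one value are all borrowed from later parts of the book.

```perl
#!/usr/bin/perl -w
# length() reports how many characters each literal actually contains.
print length('hello'), "\n";        # prints 5
print length('don\'t'), "\n";       # prints 5: the backslash is not stored
print length(''), "\n";             # prints 0
print length('silly\\me'), "\n";    # prints 8: silly (5) + one backslash + me (2)
print length('hello\n'), "\n";      # prints 7: hello, a backslash, and an n
print length('hello
there'), "\n";                      # prints 11: the newline counts as one
```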
Scalar Operators
An operator produces a new value (the result) from one or more other values (the operands). For example, + is an operator because it takes two numbers (the operands, like 5 and 6), and produces a new value (11, the result).
Perl's operators and expressions are generally a superset of those provided in most other ALGOL/Pascal-like programming languages, such as C or Java. An operator expects either numeric or string operands (or possibly a combination of both). If you provide a string operand where a number is expected, or vice versa, Perl automatically converts the operand using fairly intuitive rules, which will be detailed in Section 2.4.4, below.
Perl provides the typical ordinary addition, subtraction, multiplication, and division operators, and so on. For example:
2 + 3     # 2 plus 3, or 5
5.1 - 2.4 # 5.1 minus 2.4, or approximately 2.7
3 * 12    # 3 times 12 = 36
14 / 2    # 14 divided by 2, or 7
10.2 / 0.3 # 10.2 divided by 0.3, or approximately 34
10 / 3    # always floating point divide, so approximately 3.3333333...
Additionally, Perl provides the FORTRAN-like exponentiation operator, which many have yearned for in Pascal and C. The operator is represented by the double asterisk, such as 2**3, which is two to the third power, or eight. (If the result can't fit into a double-precision floating-point number, such as a negative number to a noninteger exponent, or a large number to a large exponent, you'll get a fatal error.)
Perl also supports a modulus operator. The value of the expression 10 % 3 is the remainder when 10 is divided by 3, which is 1. Both values are first reduced to their integer values, so 10.5 % 3.2 is computed as 10 % 3.
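The exponentiation and modulus operators described above can be tried out with a short sketch like this one. The variable names are invented for the example, and the assignments and print statements lean on material from later in the chapter.

```perl
#!/usr/bin/perl -w
$power     = 2 ** 3;        # two to the third power
$remainder = 10 % 3;        # remainder when 10 is divided by 3
$truncated = 10.5 % 3.2;    # operands are cut down to 10 and 3 first

print "$power\n";        # prints 8
print "$remainder\n";    # prints 1
print "$truncated\n";    # prints 1
```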
The logical comparison operators are
Scalar Variables
A variable is a name for a container that holds one or more values. The name of the variable is constant throughout the program, but the value or values contained in that variable typically change over and over again throughout the execution of the program.
A scalar variable holds a single scalar value (representing a number, a string, or a reference). Scalar variable names begin with a dollar sign followed by a letter, and then possibly more letters, or digits, or underscores. Upper- and lowercase letters are distinct: the variable $A is a different variable from $a. And all of the letters, digits, and underscores are significant, so:
$a_very_long_variable_that_ends_in_1
is different from:
$a_very_long_variable_that_ends_in_2
You should generally select variable names that mean something regarding the value of the variable. For example, $xyz123 is probably not very descriptive but $line_length is.
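A quick sketch of the case-sensitivity rule, using two made-up variable names that differ only in capitalization. It uses assignment and print, which are covered in the sections that follow.

```perl
#!/usr/bin/perl -w
$Line_Length = 80;    # capital L's
$line_length = 72;    # lowercase: a separate variable

print "$Line_Length\n";    # prints 80
print "$line_length\n";    # prints 72
```

Names this close together are asking for trouble in a real program; they appear here only to show that Perl keeps them apart.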
Scalar Operators and Functions
The most common operation on a scalar variable is assignment, which is the way to give a value to a variable. The Perl assignment operator is the equal sign (much like C or FORTRAN), which takes a variable name on the left side and gives it the value of the expression on the right, like so:
$a = 17;     # give $a the value of 17
$b = $a + 3; # give $b the current value of $a plus 3 (20)
$b = $b * 2; # give $b the value of $b multiplied by 2 (40)
Notice that the last line uses the $b variable twice: once to get its value (on the right side of the =), and once to define where to put the computed expression (on the left side of the =). This is legal, safe, and rather common. In fact, it's so common that we'll see in a minute that we can write this using a convenient shorthand.
You may have noticed that scalar variables are always specified with the leading $. In shell programming, you use $ to get the value, but leave the $ off to assign a new value. In Java or C, you leave the $ off entirely. If you bounce back and forth a lot, you'll find yourself typing the wrong things occasionally. This is expected. (Our solution was to stop writing shell, awk, and C programs, but that may not work for you.)
A scalar assignment may be used as a value as well as an operation, as in C. In other words, $a=3 has a value, just as $a+3 has a value. The value is the value assigned, so the value of $a=3 is 3. Although this may seem odd at first glance, using an assignment as a value is useful if you wish to assign an intermediate value in an expression to a variable, or if you simply wish to copy the same value to more than one variable. For example:
$b = 4 + ($a = 3);      # assign 3 to $a, then add 4 to that
                        # resulting in $b getting 7
$d = ($c = 5);          # copy 5 into $c, and then also into $d
$d = $c = 5;            # the same thing without parentheses
That last example works because assignment is right-associative.
<STDIN> as a Scalar Value
At this point, if you're a typical code hacker, you're probably wondering how to get a value into a Perl program. Here's the simplest way. Each time you use <STDIN> in a place where a scalar value is expected, Perl reads the next complete text line from standard input (up to the first newline), and uses that string as the value of <STDIN>. Standard input can mean many things, but unless you do something odd, it means the terminal of the user who invoked your program (probably you). If there's nothing waiting to be read (typically the case, unless you type ahead a complete line), the Perl program will stop and wait for you to enter some characters followed by a newline (return).
The string value of <STDIN> typically has a newline on the end of it. Most often, you'll want to get rid of that newline right away (there's a big difference between hello and hello\n). This is where our friend, the chomp function, comes to the rescue. A typical input sequence goes something like this:
$a = <STDIN>;  # get the text
chomp($a);     # get rid of that pesky newline
A common abbreviation for these two lines is:
chomp($a = <STDIN>);
The assignment inside the parentheses continues to refer to $a, even after it has been given a value with <STDIN>. Thus, the chomp function is working on $a. (This is true in general about the assignment operator; an assignment expression can be used wherever a variable is needed, and the actions refer to the variable on the left side of the equal sign.)
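To watch chomp at work without typing anything, the sketch below uses a literal string as a stand-in for a line read from <STDIN>. The variable names and the use of length are assumptions made for the example.

```perl
#!/usr/bin/perl -w
# Stand-in for a line read from <STDIN>: text with a trailing newline.
$line   = "hello\n";
$before = length($line);    # 6: five letters plus the newline

chomp($line);               # remove the trailing newline, if there is one
$after = length($line);     # 5

print "$before $after\n";   # prints 6 5
chomp($line);               # a second chomp finds no newline and changes nothing
print length($line), "\n";  # prints 5
```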
Output with print
So, we get things in with <STDIN>. How do we get things out? With the print function. This function takes the values within its parentheses and puts them out without any embellishment onto standard output. Once again, unless you've done something odd, this will be your terminal. For example:
print("hello world\n"); # say hello world, followed by newline
print "hello world\n";  # same thing
Note that the second example shows the form of print without parentheses. Whether or not to use the parentheses is mostly a matter of style and typing agility, although there are a few cases where you'll need the parentheses to remove ambiguity.
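Here is a sketch of one case where the parentheses matter. When an opening parenthesis comes right after print, Perl takes the parenthesized part as the complete argument list. The arithmetic and the $product variable are invented for this example.

```perl
#!/usr/bin/perl -w
print "hello world\n";     # no parentheses needed here

# The parentheses after print are taken as its complete argument list,
# so this prints 5 and then throws away the result of multiplying by 4.
# Under -w, Perl warns about this line.
print (2 + 3) * 4;
print "\n";

# An outer pair of parentheses makes the whole expression the argument.
$product = (2 + 3) * 4;
print((2 + 3) * 4);        # prints 20
print "\n";
print "$product\n";        # prints 20
```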
We'll see that you can actually give print a list of values, in Section 6.3.1, but we haven't talked about lists yet, so we'll put that off for later.
The Undefined Value
What happens if you use a scalar variable before you give it a value? Nothing serious, and definitely nothing fatal. Variables have the undef value before they are first assigned. This value looks like a zero when used as a number, or the zero-length empty string when used as a string. You will get a warning under Perl's -w switch, though, which is a good way to catch programming errors.
Many operators return undef when the arguments are out of range or don't make sense. If you don't do anything special, you'll get a zero or a null string without major consequences. In practice, this is hardly a problem.
One operation we've seen that returns undef under certain circumstances is <STDIN>. Normally, this returns the next line that was read; however, if there are no more lines to read (such as when you type CTRL-D at the terminal, or when a file has no more data), <STDIN> returns undef as a value. In Chapter 6, we'll see how to test for this and take special action when there is no more data available to read.
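As a preview of that kind of test, here is a minimal sketch that uses Perl's built-in defined function to tell undef apart from a real value. The variable name $never_set is hypothetical, and the if/else statement is borrowed from a later chapter.

```perl
#!/usr/bin/perl -w
# $never_set is deliberately left unassigned at first.
if (defined($never_set)) {
    print "has a value\n";
} else {
    print "still undef\n";    # this branch runs
}

$never_set = 0;               # zero is a defined value, unlike undef
if (defined($never_set)) {
    print "now defined\n";    # this branch runs
}
```

Asking defined about a variable does not trigger the -w warning, because the variable's value is never actually used.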
Exercises
See Appendix A for answers.
  1. Write a program that computes the circumference of a circle with a radius of 12.5. The circumference is 2π times the radius, or about 2 times 3.141592654 times the radius.
  2. Modify the program from the previous exercise to prompt for and accept a radius from the person running the program.
  3. Write a program that prompts for and reads two numbers, and prints out the result of the two numbers multiplied together.
  4. Write a program that reads a string and a number, and prints the string the number of times indicated by the number on separate lines. (Hint: use the "x" operator.)
Chapter 3: Arrays and List Data
What Is a List or Array?
A list is ordered scalar data. An array is a variable that holds a list. Each element of the array is a separate scalar variable with an independent scalar value. These values are ordered; that is, they have a particular sequence from the lowest to the highest element.
Arrays can have any number of elements. The smallest array has no elements, while the largest array can fill all of available memory. Once again, this is in keeping with Perl's philosophy of "no unnecessary limits."
Literal Representation
A list literal (the way you represent the value of a list within your program) consists of comma-separated values enclosed in parentheses. These values form the elements of the list. For example:
(1,2,3)             # array of three values 1, 2, and 3
("fred",4.5)        # two values, "fred" and 4.5
The elements of a list are not necessarily constants; they can be expressions that will be reevaluated each time the literal is used. For example:
($a,17);            # two values: the current value of $a, and 17
($b+$c,$d+$e)       # two values
The empty list (one of no elements) is represented by an empty pair of parentheses:
() # the empty list (zero elements)
An item of the list literal can include the list constructor operator, indicated by two scalar values separated by two consecutive periods. This operator creates a list of values starting at the left scalar value up through the right scalar value, incrementing by one each time. For example:
(1 .. 5)            # same as (1, 2, 3, 4, 5)
(1.2 .. 5.2)        # same as (1, 2, 3, 4, 5); fractions are truncated
(2 .. 6,10,12)      # same as (2,3,4,5,6,10,12)
($a .. $b)          # range determined by current values of $a and $b
Having the right scalar less than the left scalar results in an empty list; you can't count down by switching the order of the values. The range operator works on whole numbers: if either value has a fractional part, Perl 5 truncates it to an integer before building the list:
(1.3 .. 6.1) # same as (1,2,3,4,5,6)
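The empty-list rule is easy to check with whole-number endpoints. The following is a minimal sketch, not part of the original text: the variable names are illustrative, and reverse (a Perl built-in that returns a list in the opposite order) is shown only as one possible way to count down.

```perl
use strict;
use warnings;

my @up    = (1 .. 5);          # (1, 2, 3, 4, 5)
my @empty = (5 .. 1);          # right value is less than left: empty list
my @down  = reverse(1 .. 5);   # (5, 4, 3, 2, 1)

print "up:    @up\n";
print "empty: ", scalar(@empty), " elements\n";
print "down:  @down\n";
```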
List literals with lots of short text strings start to look pretty noisy with all the quotes and commas:
@a = ("fred","barney","betty","wilma"); # ugh!
So there's a shortcut: the "quote word" function, which creates a list from the nonwhitespace parts between the parentheses:
@a = qw(fred barney betty wilma); # better!
@a = qw(
    fred
    barney
    betty
    wilma
);                                # same thing
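To see that the two spellings produce the same list, here is a small sketch. The variable names are made up for illustration, and join (a built-in that glues list elements into one string) is used only as a convenient way to compare the lists.

```perl
use strict;
use warnings;

my @quoted = ("fred", "barney", "betty", "wilma");
my @words  = qw(fred barney betty wilma);

# Compare the two lists by joining each into a single string.
my $same = join(",", @quoted) eq join(",", @words);

print "same elements\n" if $same;
print "count: ", scalar(@words), "\n";   # count: 4
```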
Variables
An array variable holds a single list value (zero or more scalar values). Array variable names are similar to scalar variable names, differing only in the initial character, which is an at sign (@) rather than a dollar sign ($). For example:
@fred # the array variable @fred
@A_Very_Long_Array_Variable_Name
@A_Very_Long_Array_Variable_Name_that_is_different
Note that the array variable @fred is unrelated to the scalar variable $fred. Perl maintains separate namespaces for different types of things.
The value of an array variable that has not yet been assigned is (), the empty list.
An expression can refer to array variables as a whole, or it can examine and modify individual elements of the array.
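The separate namespaces and the empty default value can be sketched as follows. This is an illustration only: the values are invented, and the my declarations are a modern habit rather than something this chapter requires.

```perl
use strict;
use warnings;

my $fred = "scalar value";        # the scalar $fred
my @fred = ("list", "value");     # the array @fred, a separate variable
my @unassigned;                   # never assigned: holds the empty list

print "scalar: $fred\n";
print "array:  @fred\n";
print "unassigned has ", scalar(@unassigned), " elements\n";
```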
Array Operators and Functions
Array functions and operators act on entire arrays. Some return a list, which can then either be used as a value for another array function, or assigned into an array variable.
Probably the most important array operator is the array assignment operator, which gives an array variable a value. It is an equal sign, just like the scalar assignment operator. Perl determines whether the assignment is a scalar assignment or an array assignment by noticing whether the assignment is to a scalar or an array variable. For example:
@fred = (1,2,3); # The fred array gets a three-element literal
@barney = @fred; # now that is copied to @barney
If a scalar value is assigned to an array variable, the scalar value becomes the single element of an array:
@huh = 1; # 1 is promoted to the list (1) automatically
Array variable names may appear in a list literal. When the value of the list is computed, Perl replaces each name with the current values of that array, like so:
@fred = qw(one two);
@barney = (4,5,@fred,6,7); # @barney becomes 
                           # (4,5,"one","two",6,7)
@barney = (8,@barney);     # puts 8 in front of @barney
@barney = (@barney,"last");# and a "last" at the end
                        # @barney is now (8,4,5,"one","two",6,7,"last")
Note that the inserted array elements are at the same level as the rest of the literal: a list cannot contain another list as an element.
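One way to confirm that the inserted elements sit at the same level is to count the result. The sketch below uses hypothetical names (@inner and @outer) and is only an illustration of the flattening described above.

```perl
use strict;
use warnings;

my @inner = qw(one two);
my @outer = (4, 5, @inner, 6, 7);   # @inner's elements are spliced in

print "outer: @outer\n";                          # outer: 4 5 one two 6 7
print "element count: ", scalar(@outer), "\n";    # element count: 6
print "third element: $outer[2]\n";               # third element: one
```

If lists could nest, @outer would have five elements; it has six because the two elements of @inner each take their own slot.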
If a list literal contains only variable references (not expressions), the list literal can also be treated as a variable. In other words, such a list literal can be used on the left side of an assignment. Each scalar variable in the list literal takes on the corresponding value from the list on the right side of the assignment. For example:
($a,$b,$c) = (1,2,3);    # give 1 to $a, 2 to $b, 3 to $c
($a,$b) = ($b,$a);       # swap $a and $b
($d,@fred) = ($a,$b,$c); # give $a to $d, and ($b,$c) to @fred
($e,@fred) = @fred;      # remove first element of @fred to $e
                         # this makes @fred = ($c) and $e = $b
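The sketch below repeats the swap and the scalar-plus-array forms with fresh, illustrative variable names. It also shows two cases the passage above does not cover, based on standard Perl behavior: extra values on the right are silently discarded, and extra variables on the left receive undef.

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);                  # swap: $x is 2, $y is 1

my ($first, @rest) = (10, 20, 30);    # $first is 10, @rest is (20, 30)

my ($p, $q) = (7, 8, 9);              # the extra value 9 is discarded
my ($r, $s) = (7);                    # $s has no matching value: undef

print "x=$x y=$y\n";
print "first=$first rest=@rest\n";
print "p=$p q=$q\n";
print "s is ", (defined($s) ? "defined" : "undef"), "\n";
```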
Scalar and List Context
As you can see, each operator and function is designed to operate on some specified combination of scalars or lists, and returns either a scalar or a list. If an operator or function expects an operand to be a scalar, we say that the operand or argument is being evaluated in a scalar context. Similarly, if an operand or argument is expected to be a list value, we say that it is being evaluated in a list context.
Normally, this is fairly insignificant. But sometimes you get completely different behavior depending on whether you are within a scalar or a list context. For example, @fred returns the contents of the @fred array in a list context, but the length of the same array in a scalar context. These subtleties are mentioned when each operator and function is described.
A scalar value used within a list context is promoted to a single-element array.
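A short sketch may help here. The same array name appears on the right of each assignment, and only the left side (a scalar, or a list in parentheses, or an array) decides the context. The variable names are illustrative.

```perl
use strict;
use warnings;

my @fred = qw(a b c);

my $count   = @fred;    # scalar context: number of elements, 3
my ($first) = @fred;    # list context: first element, "a"
my @copy    = @fred;    # list context: all elements
my @single  = "x";      # scalar promoted to a one-element list

print "count=$count first=$first copy=@copy single=@single\n";
```

Note how the parentheses around $first change the meaning: without them the assignment would be a scalar assignment and $first would receive the length.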
<STDIN> as an Array
One previously seen operation that returns a different value in a list context is <STDIN>. As described earlier, <STDIN> returns the next line of input in a scalar context. However, in a list context, it returns all remaining lines up to end of file. Each line is returned as a separate element of the list. For example:
@a = <STDIN>; # read standard input in a list context
If the person running the program types three lines, then presses CTRL-D (to indicate "end of file"), the array ends up with three elements. Each element will be a string that ends in a newline, corresponding to the three newline-terminated lines entered.
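The difference between the two contexts can be sketched without a keyboard. As an assumption made purely so the example is repeatable, an in-memory filehandle ($in, reading from the string $typed) stands in for STDIN; with real input you would write <STDIN> instead of <$in>.

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for the keyboard so the
# example is repeatable; with real input you would read from <STDIN>.
my $typed = "first\nsecond\nthird\n";
open(my $in, '<', \$typed) or die "cannot open in-memory input: $!";

my $one  = <$in>;    # scalar context: just the next line
my @rest = <$in>;    # list context: all remaining lines
close($in);

print "one:  $one";                              # line keeps its newline
print "rest has ", scalar(@rest), " lines\n";    # rest has 2 lines
print "last: $rest[1]";
```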
Variable Interpolation of Arrays
Like scalars, array values may be interpolated into a double-quoted string. A single element of an array will be replaced by its value, like so:
@fred = ("hello","dolly");
$y = 2;
$x = "This is $fred[1]'s place";     # "This is dolly's place"
$x = "This is $fred[$y-1]'s place";  # same thing
Note that the index expression is evaluated as an ordinary expression, as if it were outside a string. It is not variable interpolated first.
If you want to follow a simple scalar variable reference with a literal left square bracket, you need to delimit the square bracket so it isn't considered part of the array, as follows:
@fred = ("hello","dolly");  # give value to @fred for testing
$fred = "right";
                            # we are trying to say "this is right[1]"
$x = "this is $fred[1]";    # wrong, gives "this is dolly"
$x = "this is ${fred}[1]";  # right (protected by braces)
$x = "this is $fred"."[1]"; # right (different string)
$x = "this is $fred\[1]";   # right (backslash hides it)
Similarly, a list of values from an array variable can be interpolated. The simplest interpolation is an entire array, indicated by giving the array name (including its leading @ character). In this case, the elements are interpolated in sequence with a space character between them, as in:
@fred = ("a","bb","ccc",1,2,3);
$all = "Now for @fred here!";
    # $all gets "Now for a bb ccc 1 2 3 here!"
You can also select a portion of an array with a slice:
@fred = ("a","bb","ccc",1,2,3);
$all = "Now for @fred[2,3] here!";
                                      # $all gets "Now for ccc 1 here!"
$all = "Now for @fred[@fred[4,5]] here!"; # same thing
Once again, you can use any of the quoting mechanisms described earlier if you want to follow an array name reference with a literal left bracket rather than an indexing expression.
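The forms of interpolation described in this section can be placed side by side. This is a sketch with illustrative variable names on the left; the arrays and values reuse the ones from the examples above.

```perl
use strict;
use warnings;

my @fred = ("hello", "dolly");
my $fred = "right";

my $element = "element: $fred[1]";      # array element: "element: dolly"
my $scalar  = "scalar: ${fred}[1]";     # scalar followed by a literal "[1]"
my $whole   = "whole: @fred";           # elements joined with spaces
my $slice   = "slice: @fred[1,0]";      # slice, in the order requested

print "$element\n$scalar\n$whole\n$slice\n";
```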
Exercises
See Appendix A for answers.
  1. Write a program that reads a list of strings on separate lines and prints out the list in reverse order. If you're reading the list from the terminal, you'll probably need to delimit the end of the list by pressing your end-of-file character, probably CTRL-D under UNIX or Plan 9; often CTRL-Z elsewhere.
  2. Write a program that reads a number and then a list of strings (all on separate lines), and then prints one of the lines from the list as selected by the number.
  3. Write a program that reads a list of strings and then selects and prints a random string from the list. To select a random element of @somearray, put
                      srand;
    at the beginning of your program (this initializes the random-number generator), and then use
                      rand(@somearray)
    where you need a random value between zero and one less than the length of @somearray.
Chapter 4: Control Structures
Statement Blocks
A statement block is a sequence of statements, enclosed in matching curly braces. It looks like this:
{
    first_statement;
    second_statement;
    third_statement;
    ...
    last_statement;
}
Perl executes each statement in sequence, from the first to the last. (Later, I'll show you how to alter this execution sequence within a block, but this is good enough for now.)
Syntactically, a block of statements is accepted in place of any single statement, but the reverse is not true.
The final semicolon on the last statement is optional. Thus, you can speak Perl with a C-accent (semicolon present) or Pascal-accent (semicolon absent). To make it easier to add more statements later, we usually suggest omitting the semicolon only when the block is all on one line. Contrast these two if blocks for examples of the two styles:
if ($ready) { $hungry++ }
if ($tired) {
    $sleepy = ($hungry + 1) * 2;
}
The if/unless Statement
Next up in order of complexity is the if statement. This construct takes a control expression (evaluated for its truth) and a block. It may optionally have an else followed by a block as well. In other words, it looks like this:
if (some_expression) {
    true_statement_1;
    true_statement_2;
    true_statement_3;
} else {
    false_statement_1;
    false_statement_2;
    false_statement_3;
}
(If you're a C or Java hacker, you should note that the curly braces are required. This eliminates the need for a "confusing dangling else" rule.)
During execution, Perl evaluates the control expression. If the expression is true, the first block (the true_statement statements above) is executed. If the expression is false, the second block (the false_statement statements above) is executed instead.
But what constitutes true and false? In Perl, the rules are slightly weird, but they give you the expected results. The control expression is evaluated for a string value in scalar context (if it's already a string, no change, but if it's a number, it is converted to a string). If this string is either the empty string (with a length of zero), or a string consisting of the single character "0" (the digit zero), then the value of the expression is false. Anything else is true automatically. Why such funny rules? Because it makes it easy to branch on an emptyish versus nonempty string, as well as a zero versus nonzero number, without having to create two versions of interpreting true and false values. Here are some examples of true and false interpretations:
0       # converts to "0", so false
1-1     # computes to 0, then converts to "0", so false
1       # converts to "1", so true
""      # empty string, so false
"1"     # not "" or "0", so true
"00"    # not "" or "0", so true (this is weird, watch out)
"0.000" # also true for the same reason and warning
undef   # evaluates to "", so false
Practically speaking, interpretation of values as true or false is fairly intuitive. Don't let us scare you.
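If you would rather check these rules than memorize them, a small sketch like the following runs each value through an ordinary if/else. The helper name truth is hypothetical, invented for this illustration; it is not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how an if statement would treat a value.
sub truth {
    my ($value) = @_;
    if ($value) {
        return "true";
    } else {
        return "false";
    }
}

print '0       is ', truth(0),       "\n";   # false
print '1-1     is ', truth(1 - 1),   "\n";   # false
print '""      is ', truth(""),      "\n";   # false
print '"00"    is ', truth("00"),    "\n";   # true
print '"0.000" is ', truth("0.000"), "\n";   # true
print 'undef   is ', truth(undef),   "\n";   # false
```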
The while/until Statement
No programming language would be complete without some form of iteration (repeated execution of a block of statements). Perl can iterate using the while statement:
while (some_expression) {
    statement_1;
    statement_2;
    statement_3;
}
To execute this while statement, Perl evaluates the control expression (some_expression in the example). If its value is true (using Perl's notion of truth), the body of the while statement is evaluated once. This is repeated until the control expression becomes false, at which point Perl goes on to the next statement after the while loop. For example:
print "how old are you? ";
$a = <STDIN>;
chomp($a);
while ($a > 0) {
    print "At one time, you were $a years old.\n";
    $a--;
}
Sometimes it is easier to say "until something is true" rather than "while not this is true." Once again, Perl has the answer. Replacing the while with until yields the desired effect:
until (some_expression) {
    statement_1;
    statement_2;
    statement_3;
}
Note that in both the while and the until form, the body statements will be skipped entirely if the control expression is the termination value to begin with. For example, if a user enters an age of zero or less for the program fragment above, Perl skips over the body of the loop.
It's possible that the control expression never lets the loop exit. This is perfectly legal, and sometimes desired, and thus not considered an error. For example, you might want a loop to repeat as long as you have no error, and then have some error-handling code following the loop. You might use this for a daemon that is meant to run until the system crashes.
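The relationship between the two loop forms can be sketched by writing the same countdown both ways. The starting value of 3 is an assumption standing in for user input, and push (a built-in that appends to an array) is used here only to record what each loop visits.

```perl
use strict;
use warnings;

my $age = 3;   # assumed starting value in place of user input

my @while_seen;
my $n = $age;
while ($n > 0) {           # repeat while the test is true
    push @while_seen, $n;
    $n--;
}

my @until_seen;
$n = $age;
until ($n <= 0) {          # repeat until the test becomes true
    push @until_seen, $n;
    $n--;
}

print "while: @while_seen\n";   # while: 3 2 1
print "until: @until_seen\n";   # until: 3 2 1
```

The until test is simply the negation of the while test, so both loops run the same number of times, and both are skipped entirely when $age starts at zero or below.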
The while/until statement you saw in the previous section tests its condition at the top of every loop, before the loop is entered. If the condition was already false to begin with, the loop won't be executed at all.
But sometimes you don't want to test the condition at the top of the loop. Instead, you want to test it at the bottom. To fill this need, Perl provides the
The for Statement
Another Perl iteration construct is the for statement, which looks suspiciously like C or Java's for statement and works roughly the same way. Here it is:
for ( initial_exp; test_exp; re-init_exp ) {
    statement_1;
    statement_2;
    statement_3;
}
Unraveled into forms we've seen before, this turns out as:
initial_exp;
while (test_exp) {
    statement_1;
    statement_2;
    statement_3;
    re-init_exp;
}
In either case, the initial_exp expression is evaluated first. Th