By Randal L. Schwartz, Tom Phoenix
http://www.cpan.org/ to find one near you.
Most of the time, you can also simply visit http://COUNTRYCODE.cpan.org/ where
COUNTRYCODE is your two-letter official country
code (like on the end of your national domain names). Or, if you
don't have access to the Net, you might find a CD-ROM or
DVD-ROM with all of the useful parts of CPAN on it; check with your
local technical bookstore. Look for a recently minted archive,
though; since CPAN changes daily, an archive from two years ago is an
antique. (Better yet, get a kind friend with Net access to burn you
one with today's CPAN.)
#!/usr/bin/perl
@lines = `perldoc -u -f atan2`;
foreach (@lines) {
  s/\w<([^>]+)>/\U$1/g;
  print;
}
The first line is the #! line, as we saw before. You might need to
change that line for your system, as we discussed earlier.
The second line runs an external command, named within backquotes
("` `"). (The backquote key is often found next to the
number 1 on full-sized American keyboards. Be sure not to confuse the
backquote with the single quote, "'".)
The command we're using is perldoc -u -f
atan2; try typing that at your command line to see what its
output looks like. The
perldoc
command is used on most systems to read
and display the documentation for Perl and its associated extensions
and utilities, so it should normally be available.
This command tells you something about the trigonometric function
atan2; we're using it here just as an
example of an external command whose output we wish to process.
The output of that command is saved in an array variable called
@lines. The next line of code starts a
loop that will process each one of those lines. Inside the loop, the
statements are indented. Although Perl doesn't require this,
good programmers do.
s/\w<([^>]+)>/\U$1/g;. Without going into
too much detail, we'll just say that this can change any line
that has a special marker made with
angle brackets
(ex1-1, for simplicity, since it's exercise 1
in Chapter 1.)
hello
or the Gettysburg Address). Although you may think of
numbers and strings as very different things, Perl uses them nearly
interchangeably.
1.25
255.000
255.0
7.25e45 # 7.25 times 10 to the 45th power (a big number)
-6.5e24 # negative 6.5 times 10 to the 24th
# (a big negative number)
-12e-24 # negative 12 times 10 to the -24th
# (a very small negative number)
-1.2E-23 # another way to say that - the E may be uppercase
0 2001 -40 255 61298040283768
61_298_040_283_768
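The numeric literals above can be checked with a short sketch. The variable names here are hypothetical, and the `use strict` line and `my` declarations are additions of this example rather than something the text has introduced.

```perl
use strict;
use warnings;

# Underscores inside a numeric literal are only there for the human reader.
my $plain    = 61298040283768;
my $readable = 61_298_040_283_768;
print "same integer\n" if $plain == $readable;

# Trailing zeros after the decimal point do not change the value.
print "same number\n" if 255.000 == 255;
```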
hello). Strings may contain any combination of any
characters.
'fred' # those four characters: f, r, e, and d
'barney' # those six characters
'' # the null string (no characters)
'Don\'t let an apostrophe end this string prematurely!'
'the last character of this string is a backslash: \\'
'hello\n' # hello followed by backslash followed by n
'hello
there' # hello, newline, there (11 characters total)
'\'\\' # single quote followed by backslash
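As a rough check of the single-quote rules, the following sketch measures a few such strings with the built-in `length` function. The variable names are made up for illustration, and `use strict` with `my` is an addition of this example.

```perl
use strict;
use warnings;

my $literal = 'hello\n';        # backslash and n stay as two separate characters
my $escaped = 'Don\'t';         # \' yields a single quote
my $slash   = 'ends with \\';   # \\ yields one backslash

print length($literal), "\n";   # 7
print "$escaped\n";             # Don't
print length($slash), "\n";     # 11
```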
-w
option on the command line:
$ perl -w my_program
#! line:
#!/usr/bin/perl -w
#!perl -w
'12fred34' as
if it were a number:
Argument "12fred34" isn't numeric
-w switch could. See the
perllexwarn
manpage for more information on these
warnings.
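A minimal sketch of the warning described above, assuming the `use warnings` pragma in place of the -w switch. It captures the message through Perl's `$SIG{__WARN__}` hook, which the text has not covered, so that the warning can be printed alongside the result. The variable names are hypothetical.

```perl
use strict;
use warnings;

my @caught;
# Collect warnings in @caught instead of letting Perl print them itself.
$SIG{__WARN__} = sub { push @caught, $_[0] };

my $text  = '12fred34';
my $total = $text + 0;    # only the leading digits are used as the number

print "total is $total\n";                     # total is 12
print "Perl warned: $caught[0]" if @caught;
```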
$Fred
is a different variable from $fred. And all of the
letters, digits, and underscores are significant, so:
$a_very_long_variable_that_ends_in_1
$a_very_long_variable_that_ends_in_2
$. In the shell, you use $ to
get the value, but leave the $ off to assign a new
value. In awk or C, you leave the
$ off entirely. If you bounce back and forth a
lot, you'll find yourself typing the wrong things occasionally.
This is expected. (Most Perl programmers would recommend that you
stop writing shell, awk, and C programs, but
that may not work for you.)
$r is probably not very descriptive but
$line_length is. A variable used for only two or
three lines close together may be called something simple, like
$n, but a variable used throughout a program
should probably have a more descriptive name.
$super_bowl is a better name than
$superbowl, since that last one might look like
$superb_owl.
print( )
operator makes this possible. It takes
a scalar argument and puts it out without any embellishment onto
standard output. Unless you've done something odd, this will be
your terminal display. For example:
print "hello world\n"; # say hello world, followed by a newline
print "The answer is ";
print 6 * 7;
print ".\n";
print a series of values,
separated by commas.
print "The answer is ", 6 * 7, ".\n";
$meal = "brontosaurus steak";
$barney = "fred ate a $meal"; # $barney is now "fred ate a brontosaurus steak"
$barney = 'fred ate a ' . $meal; # another way to write that
$barney = "fred ate a $meat"; # $barney is now "fred ate a "
print "$fred"; # unneeded quote marks
print $fred; # better style
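To compare the quoting styles side by side, here is a small sketch. The names `$interpolated`, `$concatenated`, and `$literal` are hypothetical, and `use strict` with `my` is an addition of this example.

```perl
use strict;
use warnings;

my $meal         = 'brontosaurus steak';
my $interpolated = "fred ate a $meal";       # double quotes interpolate $meal
my $concatenated = 'fred ate a ' . $meal;    # same result, built with the dot operator
my $literal      = 'fred ate a $meal';       # single quotes: no interpolation

print "$interpolated\n";
print "interpolation and concatenation agree\n" if $interpolated eq $concatenated;
print "$literal\n";
```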
if control
structure:
if ($name gt 'fred') {
  print "'$name' comes after 'fred' in sorted order.\n";
}
else
keyword provides that as well:
if ($name gt 'fred') {
  print "'$name' comes after 'fred' in sorted order.\n";
} else {
  print "'$name' does not come after 'fred'.\n";
  print "Maybe it's the same string, in fact.\n";
}
if control structure. That's handy if you
want to store a true or false value into a variable, like this:
$is_bigger = $name gt 'fred';
if ($is_bigger) { ... }
undef is false. (We'll see
this a little later in this section.)
'') is false; all other strings
are normally true.
'0', has the same value as
its numeric form: false.
undef,
0, '', or
'0', it's false. All other scalars are
true—including all of the types of scalars that we
haven't told you about yet.
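The truth rules can be tried out with a sketch like the one below. `truth_of` is a hypothetical helper written for this example, not a Perl built-in, and `use strict` with `my` is likewise an addition.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a scalar in a Boolean test.
sub truth_of {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

print "undef is ", truth_of(undef), "\n";   # false
print "0 is ",     truth_of(0),     "\n";   # false
print "'' is ",    truth_of(''),    "\n";   # false
print "'0' is ",   truth_of('0'),   "\n";   # false
print "'00' is ",  truth_of('00'),  "\n";   # true: not the one-character string '0'
print "'0.0' is ", truth_of('0.0'), "\n";   # true, for the same reason
```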
<STDIN>
. Each time you use <STDIN> in a
place where a scalar value is expected, Perl reads the next complete
text line from standard
input
(up to the first newline), and uses that
string as the value of <STDIN>. Standard
input can mean many things, but unless you do something uncommon, it
means the keyboard of the user who invoked your program (probably
you). If there's nothing waiting to be read (typically the
case, unless you type ahead a complete line), the Perl program will
stop and wait for you to enter some characters followed by a newline
(return).
<STDIN> typically has a
newline character on the end of it. So you could do something like this:
$line = <STDIN>;
if ($line eq "\n") {
  print "That was just a blank line!\n";
} else {
  print "That line of input was: $line";
}
chomp operator.
chomp
operator, it seems
terribly overspecialized. It works on a variable. The variable has to
hold a string. And if the string ends in a newline character,
chomp can get rid of the newline. That's
(nearly) all it does. For example:
$text = "a line of text\n"; # Or the same thing from <STDIN>
chomp($text); # Gets rid of the newline character
chomp, because
of a simple rule: any time that you need a variable in Perl, you can
use an assignment instead. First, Perl does the assignment. Then it
uses the variable in whatever way you requested. So the most common
use of chomp looks like this:
chomp($text = <STDIN>); # Read the text, without the newline character

$text = <STDIN>; # Do the same thing...
chomp($text); # ...but in two steps
chomp may not seem
to be the easy way, especially if it seems more complex! If you think
of it as two operations—read a line, then
chomp it—then it's more natural to
write it as two statements. But if you think of it as one
operation—read just the text, not the newline—it's
more natural to write the one statement. And since most other Perl
programmers are going to write it that way, you may as well get used
to it now.
chomp is actually a function. As a function, it
has a return value, which is the number of characters removed. This
number is hardly ever useful:
$food = <STDIN>;
$betty = chomp $food; # gets the value 1 - but we knew that!
chomp with or without
the parentheses. This is another general rule in Perl: except in
cases where it changes the meaning to remove them, parentheses are
always optional.
chomp removes only one. If there's no
newline, it does nothing, and returns zero.
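A short sketch of chomp's return value, assuming Perl's default input record separator (a single newline). The variable names are hypothetical, and `use strict` with `my` is an addition of this example.

```perl
use strict;
use warnings;

my $text = "two newlines\n\n";

my $first  = chomp($text);   # removes one newline, returns 1
my $second = chomp($text);   # removes the remaining newline, returns 1
my $third  = chomp($text);   # nothing left to remove, returns 0

print "$first $second $third\n";   # 1 1 0
print "[$text]\n";                 # [two newlines]
```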
while
loop
repeats a block of code as long as a condition is true:
$count = 0;
while ($count < 10) {
  $count += 1;
  print "count is now $count\n"; # Gives values from 1 to 10
}
if test. Also like the if
control structure, the
block curly braces are required. The
conditional expression is evaluated before the first iteration, so
the loop may be skipped completely, if the condition is initially
false.
undef
value before they are first assigned,
which is just Perl's way of saying "nothing here to look
at—move along, move along." If you try to use this
"nothing" as a "numeric something," it acts
like 0. If you try to use it as a "string something," it
acts like the empty string. But undef is neither a
number nor a string; it's an entirely separate kind of scalar
value.
undef automatically acts like zero when
used as a number, it's easy to make a numeric accumulator that
starts out empty:
# Add up some odd numbers
$n = 1;
while ($n < 10) {
  $sum += $n;
  $n += 2; # On to the next odd number
}
print "The total was $sum.\n";
$sum was
undef before the loop started. The first time
through the loop, $n is one, so the first line
inside the loop adds one to $sum. That's
like adding one to a variable that already holds zero (because
we're using undef as if it were a number).
So now it has the value 1. After that, since
it's been initialized, adding works in the traditional way.
$string .= "more text\n";
$string is undef, this will
act as if it already held the empty string, putting "more
text\n" into that variable. But if it already holds a
string, the new text is simply appended.
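Both kinds of accumulator can be seen together in the sketch below. It declares its variables with `my` under `use strict`, which is an addition of this example; a freshly declared `my` variable starts out as undef, just as described above.

```perl
use strict;
use warnings;

my ($sum, $string);   # both start out undef

my $n = 1;
while ($n < 10) {
    $sum += $n;       # the first pass treats undef as 0
    $n   += 2;
}

$string .= "more text\n";   # undef acts like the empty string
$string .= "even more\n";   # later appends work the usual way

print "The total was $sum.\n";   # 1 + 3 + 5 + 7 + 9 = 25
print $string;
```

The `+=` and `.=` operators are among those that Perl lets you apply to undef without a warning, so this runs quietly even with warnings turned on.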
undef when the arguments are
out of range or don't make sense. If you don't do
anything special, you'll get a zero or a null string without
major consequences. In practice, this is hardly a problem. In fact,
most programmers will rely upon this behavior. But you should know
that when warnings are turned on, Perl will typically warn about
unusual uses of the undefined value, since that may indicate a bug.
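For example, simply copying undef from one variable into another isn't a problem, but trying to print it would generally cause a warning.
One operator that can return undef is the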
line-input operator,
<STDIN>
. Normally, it will return a line
of text. But if there is no more input, such as at end-of-file, it
returns undef to signal this. To
tell whether a value is undef and not the empty
string, use the defined
function, which returns false for
undef, and true for everything else:
$madonna = <STDIN>;
if ( defined($madonna) ) {
  print "The input was $madonna";
} else {
  print "No input available!\n";
}
undef values,
you can use the obscurely named
undef
operator:
$madonna = undef; # As if it had never been touched
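To see how defined tells undef apart from the empty string, here is a minimal sketch. `describe` is a hypothetical helper written for this example, and `use strict` with `my` is an addition.

```perl
use strict;
use warnings;

# Hypothetical helper: say which kind of value a scalar currently holds.
sub describe {
    my ($value) = @_;
    return 'undef' unless defined $value;
    return $value eq '' ? 'empty string' : "string '$value'";
}

my $madonna;                       # never assigned, so undef
print describe($madonna), "\n";    # undef

$madonna = '';
print describe($madonna), "\n";    # empty string

$madonna = 'vogue';
print describe($madonna), "\n";    # string 'vogue'

$madonna = undef;                  # as if it had never been touched
print describe($madonna), "\n";    # undef
```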
undef values, or any
mixture of different scalar values. Nevertheless, it's most
common to have all elements of the same type, such as a list of book
titles (all strings) or a list of cosines (all numbers).
$fred[0] = "yabba";
$fred[1] = "dabba";
$fred[2] = "doo";
"fred") is
in a completely separate namespace from the one scalars use; you could have
a scalar variable named $fred in the same program,
and Perl would treat the two as different things and wouldn't be
confused.
(Your maintenance programmer might be confused, though, so
don't capriciously make all of your variable names the same!)
$fred[2] in
every place where
you could use any other scalar variable like
$fred. For example, you can get the value from an
array element or change that value by the same sorts of expressions
we used in the previous chapter:
print $fred[0];
$fred[2] = "diddley";
$fred[1] .= "whatsis";
$number = 2.71828;
print $fred[$number - 1]; # Same as printing $fred[1]
undef. This is just as with ordinary
scalars; if you've never stored a value into the variable,
it's undef.
$blank = $fred[ 142_857 ]; # unused array element gives undef
$blanc = $mel; # unused scalar $mel also gives undef
undef values.
$rocks[0] = 'bedrock'; # One element...
$rocks[1] = 'slate'; # another...
$rocks[2] = 'lava'; # and another...
$rocks[3] = 'crushed rock'; # and another...
$rocks[99] = 'schist'; # now there are 95 undef elements
rocks that
we've just been using, the last element index is $#rocks. That's
not the same as the number of elements, though, because there's
an element number zero. As seen in the code snippet below, it's
actually possible to assign to this value to change the size of the
array, although this is rare in practice.
$end = $#rocks; # 99, which is the last element's index
$number_of_rocks = $end + 1; # okay, but we'll see a better way later
$#rocks = 2; # Forget all rocks after 'lava'
$#rocks = 99; # add 97 undef elements (the forgotten rocks are
# gone forever)
$rocks[ $#rocks ] = 'hard rock'; # the last rock
$#name value as an
index, like that last example, happens often enough that Larry has
provided a shortcut: negative array indices count from the
end of the array. But don't get the idea that these indices
"wrap around." If you've got three elements in the
array, the valid negative indices are -1 (the last element), -2 (the middle element), and -3 (the first element). In the real world,
nobody seems to use any of these except -1, though.
$rocks[ -1 ] = 'hard rock'; # easier way to do that last example above
$dead_rock = $rocks[-100]; # gets 'bedrock'
$rocks[ -200 ] = 'crystal'; # fatal error!
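The index rules can be exercised on a three-element array, as in this sketch. It uses an `eval` block to trap the fatal error so the program can keep running; `eval`, `use strict`, and `my` are additions of this example, and the variable names are hypothetical.

```perl
use strict;
use warnings;

my @rocks = qw/ bedrock slate lava /;

print "last index: ", $#rocks, "\n";          # 2
print "element count: ", $#rocks + 1, "\n";   # 3
print "last: $rocks[-1]\n";                   # lava
print "first: $rocks[-3]\n";                  # bedrock

# Reading before the start of the array quietly gives undef...
my $beyond = $rocks[-4];
print "reading index -4 gives undef\n" unless defined $beyond;

# ...but assigning there is a fatal error, trapped here with eval.
my $ok = eval { $rocks[-4] = 'crystal'; 1 };
print "assigning to index -4 failed\n" unless $ok;
```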
(1, 2, 3) # list of three values 1, 2, and 3
(1, 2, 3,) # the same three values (the trailing comma is ignored)
("fred", 4.5) # two values, "fred" and 4.5
( ) # empty list - zero elements
(1..100) # list of 100 integers
(1..5) # same as (1, 2, 3, 4, 5)
(1.7..5.7) # same thing - both values are truncated
(5..1) # empty list - .. only counts "uphill"
(0, 2..6, 10, 12) # same as (0, 2, 3, 4, 5, 6, 10, 12)
($a..$b) # range determined by current values of $a and $b
(0..$#rocks) # the indices of the rocks array from the previous section
($a, 17) # two values: the current value of $a, and 17
($b+$c, $d+$e) # two values
("fred", "barney", "betty", "wilma", "dino")
qw
shortcut makes it easy to generate them
without typing a lot of extra quote marks:
qw/ fred barney betty wilma dino / # same as above, but less typing
qw stands for "quoted
words" or "quoted by whitespace," depending upon
whom you ask. Either way, Perl treats it like a single-quoted string
(so, you can't use \n or
$fred inside a qw list as you would in a double-quoted
string). The whitespace (characters like spaces, tabs, and newlines)
will be discarded, and whatever is left becomes the list of items.
Since whitespace
is discarded, here's another (but unusual) way to write that
same list:
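A minimal sketch of that idea, with whitespace chosen arbitrarily for illustration. The array names `@tidy` and `@ragged` are hypothetical, and `use strict` with `my` is an addition of this example.

```perl
use strict;
use warnings;

my @tidy = qw/ fred barney betty wilma dino /;

# Newlines, tabs, and runs of spaces all count as separators inside qw.
my @ragged = qw/
    fred
      barney     betty
    wilma dino
/;

print scalar(@ragged), " items\n";                      # 5 items
print "same list\n" if "@tidy" eq "@ragged";
```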
($fred, $barney, $dino) = ("flintstone", "rubble", undef);
($fred, $barney) = ($barney, $fred); # swap those values
($betty[0], $betty[1]) = ($betty[1], $betty[0]);
undef.
($fred, $barney) = qw< flintstone rubble slate granite >; # two ignored items
($wilma, $dino) = qw[flintstone]; # $dino gets undef
($rocks[0], $rocks[1], $rocks[2], $rocks[3]) = qw/talc mica feldspar quartz/;
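The list assignments above can be run together as a sketch. The `my` declarations and `use strict` are additions of this example; the values come from the snippets above.

```perl
use strict;
use warnings;

# Extra values on the right are silently dropped.
my ($fred, $barney) = qw< flintstone rubble slate granite >;

# The right-hand list is built first, so this swaps the two values.
($fred, $barney) = ($barney, $fred);

# Too few values on the right: the leftover variable gets undef.
my ($wilma, $dino) = qw[flintstone];

print "$fred $barney\n";                         # rubble flintstone
print "wilma is $wilma\n";                       # wilma is flintstone
print "dino is undef\n" unless defined $dino;
```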
@) before the name of the array (and no
index brackets after it) to refer to the entire array at once. You
can read this as "all of the," so
@rocks is "all of the
rocks." This works on either side of the assignment operator:
@rocks = qw/ bedrock slate lava /;
@tiny = ( ); # the empty list
@giant = 1..1e5; # a list with 100,000 elements
@stuff = (@giant, undef, @giant); # a list with 200,001 elements
$dino = "granite";
@quarry = (@rocks, "crushed rock", @tiny, $dino);
@quarry the
five-element list (bedrock, slate, lava, crushed rock,
granite)
@rocks = qw{ flintstone slate rubble };
print "quartz @rocks limestone\n"; # prints five rocks separated by spaces
print "Three rocks are: @rocks.\n";
print "There's nothing in the parens (@empty) here.\n";
$email = "fred@bedrock.edu"; # WRONG! Tries to interpolate @bedrock
$email = "fred\@bedrock.edu"; # Correct
$email = 'fred@bedrock.edu'; # Another way to do that
@fred = qw(hello dolly);
$y = 2;
$x = "This is $fred[1]'s place"; # "This is dolly's place"
$x = "This is $fred[$y-1]'s place"; # same thing
$y contains the string
"2*4", we're still talking
about element 1, not element 7, because "2*4" as a number (the value of $y used in a numeric expression) is just
plain 2.
@fred = qw(eating rocks is wrong);
$fred = "right"; # we are trying to say "this is right[3]"
print "this is $fred[3]\n"; # prints "wrong" using $fred[3]
print "this is ${fred}[3]\n"; # prints "right" (protected by braces)
print "this is $fred"."[3]\n"; # right again (different string)
print "this is $fred\[3]\n"; # right again (backslash hides it)
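The interpolation cases can be collected into strings and compared, as in this sketch. The result variable names are hypothetical, and `use strict` with `my` is an addition of this example.

```perl
use strict;
use warnings;

my @fred = qw(eating rocks is wrong);
my $fred = 'right';

my $whole   = "words: @fred";           # whole array, elements joined by spaces
my $element = "this is $fred[3]";       # element 3 of @fred
my $scalar  = "this is ${fred}[3]";     # scalar $fred followed by literal [3]
my $email   = "fred\@bedrock.edu";      # backslash keeps @bedrock from interpolating

print "$whole\n";     # words: eating rocks is wrong
print "$element\n";   # this is wrong
print "$scalar\n";    # this is right[3]
print "$email\n";     # fred@bedrock.edu
```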
foreach
loop steps through a list of values,
executing one iteration (time through the loop) for each value:
foreach $rock (qw/ bedrock slate lava /) {
  print "One rock is $rock.\n"; # Prints names of three rocks
}
$rock in
that example) takes on a new value from the list for each iteration.
The first time through the loop, it's "bedrock"; the third time, it's
"lava".
@rocks = qw/ bedrock slate lava /;
foreach $rock (@rocks) {
  $rock = "\t$rock"; # put a tab in front of each element of @rocks
  $rock .= "\n"; # put a newline on the end of each
}
print "The rocks are:\n", @rocks; # Each one is indented, on its own line
foreach loop is automatically saved and
restored by Perl. While the loop is running, there's no way to
access or alter that saved value. So after the loop is done, the
variable has the value it had before the loop, or undef if it hadn't had a value. That
means that if you want to name your loop control variable
"$rock", you
don't have to worry that maybe you've already used that
name for another variable.
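Both behaviors, changing the list through the control variable and getting the old value back afterward, show up in this sketch. It declares `$rock` with `my` under `use strict`, an addition of this example; Perl still saves and restores the variable around the loop in that case.

```perl
use strict;
use warnings;

my @rocks = qw/ bedrock slate lava /;
my $rock  = 'granite';         # value held before the loop

foreach $rock (@rocks) {
    $rock .= '!';              # changes the array element itself
}

print "@rocks\n";              # bedrock! slate! lava!
print "$rock\n";               # granite
```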
foreach loop, Perl uses its favorite
default variable, $_. This is
(mostly) just like any other scalar variable, except for its unusual
name. For example:
foreach (1..10) { # Uses $_ by default
  print "I can count to $_!\n";
}
$_ when you don't tell it to use some
other variable or value, thereby saving the programmer from the heavy
labor of having to think up and type a new variable name. So as not
to keep you in suspense, one of those cases is print, which will print $_ if given no other argument:
$_ = "Yabba dabba doo\n";
print; # prints $_ by default
reverse
operator takes a list of values (which
may come from an array) and returns the list in the opposite order.
So if you were disappointed that the range operator, .., only counts upwards, this is the way to
fix it:
@fred = 6..10;
@barney = reverse(@fred); # gets 10, 9, 8, 7, 6
@wilma = reverse 6..10; # gets the same thing, without the other array
@fred = reverse @fred; # puts the result back into the original array
@fred twice. Perl always calculates the
value being assigned (on the right) before it begins the actual
assignment.
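A brief sketch of those reverse calls, written with `my` under `use strict` (both additions of this example):

```perl
use strict;
use warnings;

my @fred   = 6..10;
my @barney = reverse @fred;    # a reversed copy; @fred itself is unchanged

print "@fred\n";               # 6 7 8 9 10
print "@barney\n";             # 10 9 8 7 6

# The right side is computed in full before the assignment happens,
# so using @fred on both sides is safe.
@fred = reverse @fred;
print "@fred\n";               # 10 9 8 7 6
```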
reverse returns the
reversed list; it doesn't affect its ar