Perl has three basic data types: scalars, arrays, and hashes.
Scalars are essentially simple variables. They are preceded by a dollar sign ($).
A scalar is either a number, a string, or a reference.
(A reference is a scalar that points to another piece of data. References are
discussed later in this chapter.)
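If you provide a string where a number is expected or vice versa, Perl
automatically converts the operand: a string is read as the number formed by its leading numeric characters (or 0 if there are none), and a number becomes its printed decimal form.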
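As a minimal sketch of this automatic conversion (the variable names are illustrative, not from the text), the following shows a string used as a number and a number used as a string:

```perl
use strict;
use warnings;

# A numeric string used in arithmetic is converted to a number.
my $sum = "10" + 5;            # 15

# A number used with the string concatenation operator becomes text.
my $label = 42 . " items";     # "42 items"

# Only the leading numeric part of a string is used. Under
# "use warnings" Perl reports this, so the warning is silenced here.
my $partial;
{
    no warnings 'numeric';
    $partial = "3 apples" + 2; # 5
}

print "$sum\n$label\n$partial\n";
```

Leaving warnings enabled in real code is usually preferable, since a partly numeric string often signals bad input.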
Arrays are ordered lists of scalars that you
access with a numeric subscript (subscripts start at 0).
They are preceded by an “at” sign (@).
Hashes are unordered sets of
key/value pairs that you access using
the keys as subscripts.
They are preceded by a percent sign (%).
Perl stores numbers internally as either signed integers or double-precision floating-point values. Numeric literals are specified in any of the following floating-point or integer formats:
12345          # integer
-54321         # negative integer
12345.67       # floating point
6.02E23        # scientific notation
0xffff         # hexadecimal
0377           # octal
4_294_967_296  # underscore for legibility
Since Perl uses the comma as a list separator, you cannot use commas to make
a large number more legible; Perl lets you use an underscore character instead.
The underscore works only within numeric literals specified in your program, not in strings
functioning as numbers or in data read from somewhere else. Similarly, the
leading 0x for hexadecimal and 0 for octal work only for literals.
The automatic conversion of a string to a number does not recognize these
prefixes; you must do an explicit conversion with the hex or oct function.
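The difference between literals and strings can be sketched as follows. The variable names are illustrative; hex and oct are Perl's built-in conversion functions:

```perl
use strict;
use warnings;

my $literal = 0xff;            # hexadecimal literal: 255
my $big     = 4_294_967_296;   # underscores are ignored in a literal

# The same characters inside a string are not treated as hexadecimal:
# automatic conversion stops at the "x" and yields 0.
my $from_string;
{
    no warnings 'numeric';     # silence the "isn't numeric" warning
    $from_string = "0xff" + 0; # 0
}

# Explicit conversion recognizes the prefixes.
my $hex   = hex("0xff");       # 255
my $octal = oct("0377");       # 255
my $auto  = oct("0xff");       # 255: oct also understands a leading 0x

print "$literal $big $from_string $hex $octal $auto\n";
```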
Strings are sequences of characters.
String literals are usually delimited by either single (') or double (") quotes.
Double-quoted string literals
are subject to backslash and variable interpolation, and single-quoted
strings are not (except for \' and \\,
used to put single quotes and backslashes into single-quoted strings).
You can embed newlines directly in your strings.
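A short sketch of the two quoting styles, using an illustrative variable $name:

```perl
use strict;
use warnings;

my $name = 'World';

# Double quotes: $name is interpolated and \n becomes a newline.
my $double = "Hello, $name!\n";

# Single quotes: the text is taken literally, including $name and \n.
my $single = 'Hello, $name!\n';

# The only escapes honored in single quotes are \' and \\.
my $escaped = 'It\'s a back\\slash';   # It's a back\slash

# A newline embedded directly in a string literal.
my $two_lines = "first line
second line";

print $double;
print "$single\n$escaped\n$two_lines\n";
```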
Table 4-1 lists all the backslashed or escape characters that can be used in double-quoted strings.
Table 4-1. Double-Quoted String Representations
Code | Meaning |
---|---|
\n | Newline |
\r | Carriage return |
\t | Horizontal tab |
\f | Form feed |
\b | Backspace |
\a | Alert (bell) |
\e | ESC character |
\033 | ESC in octal |
\x7f | DEL in hexadecimal |
\cC | CTRL-C |
\\ | Backslash |
\" | Double quote |
\u | Force next character to uppercase |
\l | Force next character to lowercase |
\U | Force all following characters to uppercase |
\L | Force all following characters to lowercase |
\Q | Backslash all following non-alphanumeric characters |
\E | End \U, \L, or \Q |
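A few of these escapes in use, as a small sketch (the sample strings are illustrative):

```perl
use strict;
use warnings;

my $word = 'perl';

my $first_up  = "\u$word";          # "Perl": next character uppercased
my $first_low = "\lPERL";           # "pERL": next character lowercased
my $all_up    = "\U$word\E rocks";  # "PERL rocks": \E ends the \U
my $quoted    = "\Qa.b*c\E";        # a\.b\*c : non-alphanumerics backslashed

# Different spellings of the same control characters.
my $same_esc = ("\e" eq "\033") ? 1 : 0;      # 1
my $del      = ("\x7f" eq chr(127)) ? 1 : 0;  # 1
my $ctrl_c   = ("\cC" eq chr(3)) ? 1 : 0;     # 1

print "$first_up $first_low $all_up $quoted\n";
```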
Table 4-2 lists alternative quoting schemes that can be used in Perl.
They reduce the number of commas and quotes you
have to type, and they spare you from escaping characters
such as backslashes when your data contains many of them.
The generic forms allow you to use any non-alphanumeric, non-whitespace
characters as delimiters in place of the slash (/). If the delimiters
are single quotes, no variable interpolation is done on the pattern.
Parentheses, brackets, braces, and angle brackets can be used as delimiters
in their standard opening and closing pairs.
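As a hedged illustration, the sketch below uses two of Perl's standard generic quoting operators, q// (single-quote behavior) and qq// (double-quote behavior). That these are the forms the table lists is an assumption, and the variable names are illustrative:

```perl
use strict;
use warnings;

my $user = 'alice';

# q{} behaves like single quotes: no interpolation, and embedded
# quote characters need no escaping.
my $single_like = q{It's $user's "file"};

# qq{} behaves like double quotes: $user is interpolated.
my $double_like = qq{$user said "hello"};

# Paired delimiters are chosen to avoid clashing with the data.
my $path = q(C:\temp\new);   # backslashes stay literal

print "$single_like\n$double_like\n$path\n";
```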
A list is an ordered group of scalar values. A literal list can be composed as a comma-separated list of values contained in parentheses, for example:
(1,2,3)                 # array of three values 1, 2, and 3
("one","two","three")   # array of three values "one", "two", and "three"
The generic form of list creation uses the quoting operator qw// to
contain a list of values separated by whitespace:
qw/snap crackle pop/
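A brief sketch showing that qw// builds the same list as the quoted, comma-separated form (array names are illustrative):

```perl
use strict;
use warnings;

my @quoted = ("snap", "crackle", "pop");
my @words  = qw/snap crackle pop/;   # same three strings
my @parens = qw(snap crackle pop);   # any paired delimiter works

print "@words\n";                    # elements joined by spaces
```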
A variable always begins with the character that identifies its
type: $, @, or %. Most of the variable names you
create can begin with a letter or underscore, followed by any combination
of letters, digits, or underscores, up to 255 characters in length.
Upper- and lowercase letters are distinct. Variable names that begin
with a digit can only contain digits, and variable names that begin with a
character other than an alphanumeric or underscore can contain only that character.
The latter forms are usually predefined variables in Perl, so it
is best to name your variables beginning with a letter or underscore.
Variables have the undef value before they are first assigned
or when they become “empty.” For scalar variables, undef
evaluates to zero when used as a number, and a zero-length, empty
string ("") when used as a string.
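The behavior of undef can be sketched as follows. The names are illustrative, and the "uninitialized" warning is silenced only to keep the demonstration quiet:

```perl
use strict;
use warnings;

my $unset;                               # declared but never assigned

my ($as_number, $as_string);
{
    no warnings 'uninitialized';
    $as_number = $unset + 0;             # 0
    $as_string = "[" . $unset . "]";     # "[]"
}

# defined() distinguishes undef from a real 0 or empty string.
my $is_defined = defined($unset) ? 1 : 0;   # 0

print "$as_number $as_string $is_defined\n";
```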
Simple variable assignment uses the assignment operator (=) with
the appropriate data. For example:
$age = 26;                              # assigns 26 to $age
@date = (8, 24, 70);                    # assigns the three-element list to @date
%fruit = ('apples', 3, 'oranges', 6);   # assigns the list elements to %fruit in key/value pairs
Scalar variables are always named with an initial $,
even when referring to a scalar value that is part of an array or hash.
Every variable type has its own namespace. You can, without fear of
conflict, use the same name for a scalar variable, an array, or a hash
(or, for that matter, a filehandle, a subroutine name, or a label).
This means that $foo and @foo are two
different variables. It also means that $foo[1] is an element
of @foo, not a part of $foo.
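A small sketch of the separate namespaces, reusing the name foo from the text (the stored values are illustrative):

```perl
use strict;
use warnings;

my $foo = 'a scalar';          # the scalar $foo
my @foo = ('zero', 'one');     # the array @foo, unrelated to $foo
my %foo = (one => 1);          # the hash %foo, unrelated to both

my $element = $foo[1];         # 'one': an element of @foo
my $value   = $foo{one};       # 1: a value from %foo

print "$foo / $element / $value\n";
```

Sharing a name this way is legal, though distinct names are usually easier to read.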
An array is a variable that stores an ordered list of scalar values. Arrays are preceded by an “at” (@) sign.
@numbers = (1,2,3); # Set the array @numbers to (1,2,3)
To refer to a single element of an array, use the dollar sign ($) with the variable
name (it’s a scalar), followed by the index of the element in square brackets (the subscript
operator).
Array elements are numbered starting at 0. Negative indexes count backwards from the
last element in the list (i.e., -1 refers to the last element of the list).
For example, in this list:
@date = (8, 24, 70);
$date[2] is the value of the third element, 70.
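Positive and negative subscripts on the @date array can be sketched like this (the scalar names are illustrative):

```perl
use strict;
use warnings;

my @date = (8, 24, 70);

my $first       = $date[0];    # 8
my $third       = $date[2];    # 70
my $last        = $date[-1];   # 70: same element as $date[2]
my $second_last = $date[-2];   # 24

$date[2] = 1970;               # elements can be assigned individually

print "$first $third $last $second_last @date\n";
```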
A hash is a set of key/value pairs. Hashes are preceded by a percent (%) sign. To refer to a single element of a hash, use the dollar sign ($) with the hash variable name (the element is a scalar), followed by the key associated with the value in curly braces. For example, the hash:
%fruit = ('apples', 3, 'oranges', 6);
has two key/value pairs. If you want to get the value
associated with the key apples, you use $fruit{'apples'}.
It is often more readable to use the => operator in defining
key/value pairs.
The => operator is similar to a comma, but it’s
more visually distinctive, and it also quotes any bare identifiers to
the left of it:
%fruit = ( apples => 3, oranges => 6 );
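Hash access and the => operator together, as a short sketch (the pears key is an illustrative addition):

```perl
use strict;
use warnings;

# => quotes the bare identifiers apples and oranges.
my %fruit = ( apples => 3, oranges => 6 );

my $apples  = $fruit{'apples'};   # 3
my $oranges = $fruit{oranges};    # 6: a bare identifier in braces is quoted too

$fruit{pears} = 2;                # assigning to a new key adds a pair

# Hashes are unordered, so sort the keys for predictable output.
my @names = sort keys %fruit;

print "$apples $oranges @names\n";
```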
Every operation
that you invoke in a Perl script is evaluated in a
specific context, and how that operation behaves may depend on
which context it is being called in. There are two major contexts:
scalar and list.
All operators know which context they are in,
and some return lists in contexts wanting a list, and scalars in
contexts wanting a scalar.
For example, the localtime function returns a nine-element
list in list context:
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime();
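But in a scalar context, localtime returns a formatted date and time
string such as "Thu Sep  9 12:00:00 1999" (the time function is what returns
the number of seconds since January 1, 1970):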
$now = localtime();
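The two behaviors of localtime can be compared in a short sketch. The variable names are illustrative; time is the built-in that returns seconds since the epoch:

```perl
use strict;
use warnings;

# List context: nine elements.
my @parts = localtime();
my ($sec, $min, $hour, $mday, $mon, $year) = localtime();
my $full_year = $year + 1900;     # $year counts from 1900
my $month     = $mon + 1;         # $mon runs from 0 to 11

# Scalar context: a human-readable string.
my $stamp = localtime();

# Seconds since January 1, 1970 come from time(), not localtime().
my $epoch = time();

print scalar(@parts), " elements\n$stamp\n$epoch\n";
```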
Statements that look confusing are easy to evaluate by identifying the proper context. For example, assigning what is commonly a list literal to a scalar variable:
$a = (2, 4, 6, 8);
gives $a the value 8. The context forces the right side to evaluate to a scalar,
and the action of the comma operator in the expression (in the scalar context) returns the
value farthest to the right.
Another type of statement that might be confusing is the evaluation of an array or hash variable as a scalar, for example:
$b = @c;
When an array variable is evaluated as a scalar, the number of elements in the array is
returned. This type of evaluation is useful for finding the number of elements in an
array.
The special $#array form of an array value returns the
index of the last member of the list (one less than the number of elements).
If necessary, you can force a scalar context in the middle of a list by
using the scalar function.
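The context rules described above can be gathered into one sketch. The names are illustrative, and the "void" warning is silenced because Perl otherwise points out that the discarded constants are useless:

```perl
use strict;
use warnings;

my @c = (2, 4, 6, 8);

my $count      = @c;       # 4: an array in scalar context gives its length
my $last_index = $#c;      # 3: index of the last element

# The comma operator in scalar context returns its rightmost value.
my $rightmost;
{
    no warnings 'void';
    $rightmost = (2, 4, 6, 8);   # 8
}

# Parentheses on the left make this a list assignment instead.
my ($first) = @c;          # 2

# scalar() forces scalar context inside a list.
my @mixed = ('size', scalar(@c));   # ('size', 4)

print "$count $last_index $rightmost $first @mixed\n";
```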
In Perl, only subroutines and formats require explicit declaration. Variables (and similar constructs) are automatically created when they are first assigned.
Variable declaration comes into play when you need to limit the scope of a variable’s use. You can do this in two ways:
Dynamic scoping creates temporary objects within a scope. Dynamically scoped constructs are visible globally, but only take action within their defined scopes. Dynamic scoping applies to variables declared with local.
Lexical scoping creates private constructs that are only visible within their scopes. The most frequently seen form of lexically scoped declaration is the declaration of my variables.
Therefore, we can say that a local variable is dynamically scoped,
whereas a my variable is lexically scoped.
Dynamically scoped variables are visible to
functions called from within the block in which they are
declared. Lexically scoped variables, on the other hand, are
totally hidden from the outside world, including any called subroutines,
unless those subroutines are declared within the same scope.
See Section 4.7 later in this chapter for further discussion.
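The difference between local and my can be sketched as follows. All subroutine and variable names here are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

our $mode = 'global';              # package variable, so it can be localized

sub show_mode { return $mode }     # reads whatever $mode currently holds

sub with_local {
    local $mode = 'local';         # temporary value for this dynamic scope
    return show_mode();            # the called subroutine sees 'local'
}

my $secret = 'outer';              # lexical variable at file scope

sub show_secret { return $secret } # bound to the file-scoped $secret

sub with_my {
    my $secret = 'inner';          # a new, private variable
    return show_secret();          # the called subroutine cannot see it
}

my $during_local = with_local();   # 'local'
my $after_local  = show_mode();    # 'global': the old value is restored
my $during_my    = with_my();      # 'outer'

print "$during_local $after_local $during_my\n";
```

In this sketch the local value is restored automatically when with_local returns, while the my variable inside with_my never affects any code outside that subroutine.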