By Jon Orwant, Jarkko Hietaniemi, John Macdonald
Price: $34.95 USD
£24.95 GBP
Cover | Table of Contents | Colophon
$ sign, and all arrays begin with an
@ sign. The other common datatype in Perl is the
hash, denoted with a
% sign. Hashes "map" one set of
scalars (the "keys") to other scalars (the
"values").$brightness = $red * 0.118 + $green * 0.231 + $blue * 0.043;
$brightness any faster?
Surprisingly, yes.
, denoted
and defined as the product of all
the numbers from 1 to
. You could
define a factorial() subroutine without recursion:# factorial($n) computes the factorial of $n,
# using an iterative algorithm.
sub factorial {
my ($n) = shift;
my ($result, $i) = (1, 2);
for ( ; $i <= $n; $i++) {
$result *= $i;
}
return $result;
}# factorial_recursive($n) computes the factorial of $n,
# using a recursive algorithm.
sub factorial_recursive {
my ($n) = shift;
return $n if $n <= 2;
return $n * factorial_recursive($n - 1);
}
,
since computing the factorial of
requires
multiplications. The
recursive implementation is cleaner, and you might suspect faster.
However, it takes four times as long on our
computers, because there's overhead involved whenever you call a
subroutine. The nonrecursive (or
iterative)
subroutine just amasses the factorial in an integer, while the
recursive subroutine has to invoke itself repeatedly—and
subroutine invocations take a lot of time.
algorithm to approximate the factorial. That speed comes at a price:
it's not exact.sub factorial_approx {
return sqrt (1.5707963267949 * $_[0]) *
(($_[0] / 2.71828182845905) ** $_[0]);
}binary_search() accepting $low
and $high as arguments, checking the current word,
adjusting |
Type and Designating Symbol
|
Meaning
| |
|---|---|---|
$scalar
| ||
|
number
|
integer or float
| |
|
string
|
arbitrary length sequence of characters
| |
|
reference
|
"pointer" to another Perl data structure
| |
|
object
|
a Perl data structure that has been blessed into a class (accessed through a reference)
| |
@array
|
an ordered sequence of scalars indexed by integers; arrays are sometimes
called lists, but the two are not quite
identical
| |
%hash
| an unordered collection of scalars selected by strings (also known as associative arrays, and in some languages as dictionaries) | |
# start with a string $date = "98/07/22"; # extract the substrings containing the numeric values ($year, $month, $day) = ($date =~ m[(\d\d)/(\d\d)/(\d\d)]); # but they can just be used as numbers $year += 1900; # Y2K bug! $month = $month_name[$month-1]; # and then again as strings $printable_date = "$month $day, $year";
Point object might contain explicit values for
x- and
y-coordinates, while the
corresponding Point
class might have methods to synthesize
and
coordinates from them. This
approach isolates the rest of the code from the internal
representation; indeed, as long as the methods behave, the underlying
structure can be changed without requiring any change to the rest
of the program. You could change Point to use
angular coordinates internally instead of Cartesian coordinates, and the
x(), y(),
rho(), and theta() methods would
still return the correct values.$address{city}
than have to parse the city out of the middle of
an address string with something like
get_address(line=>4,/^[\s,]+/).
There are many
different data structures that could do the job. We'll now consider a
few alternatives, starting with simple arrays and hashes. We could
use one array per address:
@Watson_Address = (
"Dr. Watson",
"221b Baker St.",
"London",
"NW1",
"England",
);
|
@Sam_Address = (
"Sam Gamgee",
"Bagshot Row",
"Hobbiton",
"The Shire",
);
|
%Watson_Address = (
name => "Dr. Watson",
street => "221b Baker St.",
city => "London",
zone => "NW1",
country => "England",
);
|
%Sam_Address = (
name => "Sam Gamgee",
street => "Bagshot Row",
city => "Hobbiton",
country => "The Shire",
);
|
splicesplice operator used for Perl lists, we
described how splicing elements into or out of the middle of a large
array can be expensive. To cut down the expense of copying large
chunks of an array you can use a linked list. Instead of using memory
as compactly as possible, placing one element right after the previous
one as an array does, a linked list uses a separate structure for each
element. Each of these structures has two fields: the value of
the element and a reference to the next element in the list.
$list = undef;
foreach (reverse 1..5) {
$list = [ $list, $_ * $_ ];
}
[0], is a reference that points
to the next element of
the linked list. The second scalar, undef to denote the end of the list, the
last element points back to the first. Because of the circular link,
the idea of the head and tail
of the list gets fuzzier. The list pointer (e.g.,
$list) is no longer the only way to access the
element at the head of the linked list—you can get to it from
any element by following the right number of links. This means that
you can simply reassign the list pointer to point to a different
element to change which element is to be considered the head.my $p;
{
my $x = "abc";
my $y = "def";
$p = \$x; # the value "abc" now has a count of two
}
# "def" is freed
# "abc" remains in use
$p = 1;
# "abc" is freed$y has gone out of
scope. Its value, "def", had a count of 1 so it
can be freed. $x has also gone out of scope, but
its value "abc" had a count of 2. The count is
decremented to 1 and the value is not freed—it is still
accessible through $p. Later,
$p is reassigned, overwriting the reference to
"abc". This means that the count for
"abc" is decremented. Since its count is now zero,
it is freed.# start a new scope
{
# two variables
my $p1 = 1;
my $p2 = 2;
# point them at each other
$p1 = \$p2;
$p2 = \$p1;
}
# end scope
append() and
prepend() functions about to be described, which insert one or many
elements before or after a specific element. These functions work on
a list that has only a single element so long as it points to
itself. They fail if you have removed that element from another list
without relinking the standalone element to point to itself. (The code
for a singly-linked list earlier
in this chapter overwrites the link field whenever it inserts an
element into a list, so the code will work fine whatever old value was
in the link field.)double that can carry out
doubly-linked list operations. Parts of it are designed to coexist
with the package double_head shown later in this
chapter. The new method is a typical object creation
function. The _link_to method is only for
internal use; it connects two elements as neighbors within a list:http://tpj.com/tpj/programs.)Infinite
lists are helpful for cases in which you'll never be able to look at
all of your elements. Maybe the elements are tough to compute, or maybe
there are simply too
many of them. For example, if your program had an occasional need
to test whether a particular number belongs to an infinite series (prime
numbers or Fibonacci numbers, perhaps), you could keep an infinite
list around and search through it until you find a number that is the
same or larger. As the list expands, the infinite list would cache
all of the values that you've already computed, and would compute
more only if the newly requested number was "deeper" into the
list.next() method. Internally, the link value can have
two forms. When it is a normal reference
pointing
to the next
element, the next() method just returns it
immediately. But when it is a code reference, the
next() method invokes the code. The
code actually creates the next node and returns a reference to it.
Then, the next() method changes the link field of the old
element from the code reference to a normal reference pointing to the
newly found value. Finally, next() returns that new
reference for use by the calling program.
Thus, the new node is remembered and will be returned
immediately on subsequent calls to the next()
method. The new node's link field will usually be a code reference
again—ready to be invoked in its turn, if you choose to
continue advancing through the list when you've dealt with the
current (freshly created) element.
Macdonald in an
address book that contained a million names. After choosing the M
"page" you have only 100,000 names to search. But, after that, it
might take you 100,000 examinations to find the right
element.
checks.
time and it can be removed and the heap updated in
time. Since you don't keep the heap's tree fully ordered,
operations on the heap
can be carried out faster. We will see heaps used as a component of
many algorithms through the rest of this book.|
Binary Heap (worst case)
|
Binomial Heap (worst case)
|
Fibonacci Heap (amortized)
| |
|---|---|---|---|
|
create empty heap
|
θ(1)
|
θ(1)
|
θ(1)
|
|
insert new element
|
θ(log N)
|
θ(log N)
|
θ(1)
|
|
view minimum
|
θ(1)
|
θ(log N)
|
θ(1)
|
|
extract minimum
|
θ(log N)
|
θ(log N)
|
θ(log N)
|
|
union two heaps
|
θ(N)
|
θ(log N)
|
θ(1)
|
|
decrease key
|
θ(log N)
|
θ(log N)
|
θ(1)
|
|
delete element
|
θ(log N)
|
θ(log N)
|
θ(log N)
|
( 1, 3, 2, 4 ). There are
unexplored possibilities for further development here—applying
bidirectional heap ordering to slices of the full array
seems to be worth examining, for example.
operations and perform a number of them together,
making the amortized cost
instead.
The actual algorithms implemented are described in detail in the book
Introduction to Algorithms, by Cormen, Leiserson, and
Rivest.use and specify for the new()
function. In practice, if you need to use one of these modules
(rather than managing existing arrays as described earlier) you will be
best off using Heap::Fibonacci. There are two possible
exceptions. One is if your problem is small enough that the time
required to load the larger Fibonacci package is significant. The
other is if your problem is precisely the wrong size for the memory
management of your operating system: the extra memory requirements of
the Heap::Fibonacci causes significant degradation, but Heap::Binary
is small enough that no degradation occurs. Neither case is especially
likely, so use Heap::Fibonacci.use Heap::Fibonacci; # or Heap::Binary or Heap::Binomial $heap = new Heap::Fibonacci; # or Heap::Binary or Heap::Binomial # Add a value (defined below) into the heap. $heap->add($val); # Look at the smallest value. $val = $heap->minimum; # Remove the smallest value. $val = $heap->extract_minimum; # Merge two heaps - $heap2 will end up empty; all of its # elements will be merged into $heap. $heap->absorb($heap2); # Two operations on an element: # 1. Decrease an item's value. $val->val($new_value); $heap->decrease_key($val); # 2. Remove an element from the heap. $heap->delete($val);
heap method to connect
the user data structure to a separate Elem structure used to
determine its heap order.
Additionally, the routines to apply binary heap ordering to
a user-provided array will be put in a separate module called
Array::Heap.sort function, what comparing
actually means, and how you can code your own sort algorithms with
Perl.Munro, Alice 15 Brigham Road 623-2448 Munro, Alice 48 Hammersley Place 489-1073 Munro, Alicia 62 Evergreen Terrace 623-6099
Munro, Alice 15 Brigham Road 623-2448 Munro, Alice 48 Hammersley Place 489-1073 Munro, Alicia 62 Evergreen Terrace 623-6099
time) than others (usually
time). Some perform much
better on certain input; others work well regardless of the input.sort(). We will
explore it first because we can use it to demonstrate what you need to
think about when orchestrating a sort operation. The important thing
to remember is that