Instead of returning data, a Perl subroutine can return a reference to a subroutine. This is really no different from any other way of passing subroutine references around, except for a somewhat hidden feature involving anonymous subroutines and lexical (my) variables. Consider:
    $greeting = "hello world";
    $rs = sub {
        print $greeting;
    };
    &$rs();   # prints "hello world"
In this example, the anonymous subroutine makes use of the global variable $greeting. No surprises here, right? Now, let’s modify this innocuous example slightly:
    sub generate_greeting {
        my($greeting) = "hello world";
        return sub {print $greeting};
    }
    $rs = generate_greeting();
    &$rs();   # Prints "hello world"
The generate_greeting subroutine returns the reference to an anonymous subroutine, which in turn prints $greeting. The curious thing is that $greeting is a my variable that belongs to generate_greeting. Once generate_greeting finishes executing, you would expect all its local variables to be destroyed. But when you invoke the anonymous subroutine later on, using &$rs(), it manages to still print $greeting. How does it work?
Any other expression in place of the anonymous subroutine definition would have used $greeting right away. A subroutine block, on the other hand, is a package of code to be invoked at a later time, so it keeps track of all the variables it is going to need later on (taking them “to go,” in a manner of speaking). When this subroutine is called subsequently and invokes print "$greeting", the subroutine remembers the value that $greeting had when that subroutine was created.
Let’s modify this a bit more to really understand what this idiom is capable of:
    sub generate_greeting {
        my($greeting) = @_;   # $greeting primed by arguments
        return sub {
            my($subject) = @_;
            print "$greeting $subject \n";
        };
    }
    $rs1 = generate_greeting("hello");
    $rs2 = generate_greeting("my fair");
    # $rs1 and $rs2 are two subroutines holding on to different $greeting's
    &$rs1 ("world");   # prints "hello world"
    &$rs2 ("lady");    # prints "my fair lady"
Instead of hardcoding $greeting, we get it from generate_greeting’s arguments. When generate_greeting is called the first time, the anonymous subroutine that it returns holds onto $greeting’s value. Hence the subroutine referred to by $rs1 behaves somewhat like this:
    $rs1 = sub {
        my ($subject) = @_;
        my $greeting = "hello";
        print "$greeting $subject\n";   # $greeting's value is "hello"
    };
Such a subroutine is known as a closure (the term comes from the LISP world). As you can see, it captures $greeting’s value, and when it is invoked later on, it needs only one parameter.
Like some immigrants to a country who retain the culture and customs of the place in which they were born, closures are subroutines that package all the variables they need from the scope in which they are created.
As it happens, Perl creates closures only over lexical (my) variables and not over global or localized (tagged with local) variables. Let’s take a peek under the covers to understand why this is so.
If you are not interested in the details of how closures work, you can safely go on to the next section without loss of continuity.
Recall that the name of a variable and its value are separate entities. When it first sees $greeting, Perl binds the name “greeting” to a freshly allocated scalar value, setting the value’s reference count to 1 (there’s now an arrow pointing to the value). At the end of the block, Perl disassociates the name from the scalar value and decrements the value’s reference count. In a typical block where you don’t squirrel away references to that value, the value would be deallocated, since the reference count comes down to zero. In this example, however, the anonymous subroutine happens to use $greeting, so it increments that scalar value’s reference count, thus preventing its automatic deallocation when generate_greeting finishes. When generate_greeting is called a second time, the name “greeting” is bound to a whole new scalar value, and so the second closure gets to hang on to its own scalar value.
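The behavior described above can be observed with a small sketch. The make_counter helper and its variable names are hypothetical and are not part of the original examples. The code uses the $ref->() call syntax, which is equivalent to &$ref(). Each call to make_counter binds $count to a fresh scalar, and each returned closure keeps its own scalar alive between calls:

```perl
use strict;
use warnings;

# Hypothetical helper: every call binds $count to a brand-new scalar,
# and the returned closure keeps that scalar alive after the call returns.
sub make_counter {
    my ($count) = @_;
    return sub { return $count++ };   # returns the old value, then increments
}

my $c1 = make_counter(10);
my $c2 = make_counter(100);

# Interleave calls to show that the two closures do not share state.
my @seen = ($c1->(), $c1->(), $c2->(), $c1->());
print "@seen\n";    # 10 11 100 12
```

Because $c1 and $c2 hold on to different scalars, advancing one counter leaves the other untouched.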
Why don’t closures work with local variables?
Recall from Chapter 3 that variables marked local are dynamically scoped (or “temporarily global”). A local variable’s value depends on the call stack at the moment at which it is used. For this reason, if $greeting were declared local, Perl would look up its value when the anonymous subroutine is called (actually when print is called inside it), not when it is defined. You can verify this with a simple test:
    sub generate_greeting {
        local ($greeting) = @_;
        return sub {
            print "$greeting \n";
        }
    }
    $rs = generate_greeting("hello");
    $greeting = "Goodbye";
    &$rs();   # Prints "Goodbye", not "hello"
The anonymous subroutine is not a closure in this case, because it doesn’t hang onto the local value of $greeting (“hello”) at the time of its creation. Once generate_greeting has finished executing, $greeting is back to its old global value, which is what is seen by the anonymous subroutine while executing.
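The contrast between my and local can be put side by side in one sketch. The helper names lexical_greeter and dynamic_greeter are assumptions made for illustration, and the subroutines return strings rather than printing them so that the two results are easy to compare:

```perl
use strict;
use warnings;

our $greeting = "Goodbye";    # package global, visible to dynamic_greeter

# Hypothetical helper: the anonymous sub closes over the lexical $kept.
sub lexical_greeter {
    my ($kept) = @_;
    return sub { return $kept };
}

# Hypothetical helper: local only saves and restores the global $greeting,
# so the anonymous sub sees whatever the global holds when it is called.
sub dynamic_greeter {
    local $greeting = $_[0];
    return sub { return $greeting };
}

my $lexical = lexical_greeter("hello");
my $dynamic = dynamic_greeter("hello");

print $lexical->(), "\n";    # hello
print $dynamic->(), "\n";    # Goodbye
```

By the time $dynamic is called, dynamic_greeter has returned and the global $greeting has been restored, so only the lexical version remembers “hello”.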
It might appear that every time generate_greeting returns an anonymous subroutine, it creates a whole new packet of code internally. That isn’t so. The code for the anonymous subroutine is generated once during compile time. $rs is internally a reference to a “code value,” which in turn keeps track not only of the byte-codes themselves (which it shares with all other subroutine references pointing to the same piece of code), but also all the variables it requires from its environment (each subroutine reference packs its own private context for later use). Chapter 20 does less hand-waving and supplies exact details.
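One way to peek at this sharing is the core B module, which exposes Perl’s internal structures. The following is only a sketch: it assumes that the address B reports for a closure’s ROOT identifies its compiled code, and the variable names $root1 and $root2 are introduced here for illustration. It compares two closures produced by the same generate_greeting:

```perl
use strict;
use warnings;
use B ();

sub generate_greeting {
    my ($greeting) = @_;
    return sub { return $greeting };
}

my $rs1 = generate_greeting("hello");
my $rs2 = generate_greeting("my fair");

# Address of the root of each closure's compiled code, as reported by B
# (assumption: equal addresses mean the code itself is shared).
my $root1 = ${ B::svref_2object($rs1)->ROOT };
my $root2 = ${ B::svref_2object($rs2)->ROOT };

print "distinct code values: ", ($rs1 != $rs2     ? "yes" : "no"), "\n";
print "shared compiled code: ", ($root1 == $root2 ? "yes" : "no"), "\n";
print "private contexts:     ", $rs1->(), " / ", $rs2->(), "\n";
```

If the description above holds for your Perl build, the two references should be distinct code values that point at the same compiled code, while each one returns its own captured $greeting.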
To summarize, a closure is the special case of an anonymous subroutine holding onto data that used to belong to its scope at the time of its creation.