Scoped Declarations

Like global declarations, lexically scoped declarations have an effect at the time of compilation. Unlike global declarations, lexically scoped declarations only apply from the point of the declaration through the end of the innermost enclosing scope (block, file, or eval—whichever comes first). That’s why we call them lexically scoped, though perhaps “textually scoped” would be more accurate, since lexical scoping has little to do with lexicons. But computer scientists the world over know what “lexically scoped” means, so we perpetuate the usage here.

Perl also supports dynamically scoped declarations. A dynamic scope also extends to the end of the innermost enclosing block, but in this case, “enclosing” is defined dynamically at runtime rather than textually at compile time. To put it another way, blocks nest dynamically by invoking other blocks, not by including them. This nesting of dynamic scopes may correlate somewhat to the nesting of lexical scopes, but the two are generally not identical, especially when any subroutines have been invoked.

We mentioned that some aspects of use could be considered global declarations, but other aspects of use are lexically scoped. In particular, use not only imports package symbols, it also implements various magical compiler hints, known as pragmas (or if you’re into classical Greek, pragmata). Most pragmas are lexically scoped, including the strict pragma we mention from time to time. (Hence, if strict is implicitly turned on by use v5.14 at the top of your file, it’s on for the whole rest of the file, even if you switch packages.) See the later section Pragmas.

A package declaration, oddly enough, is itself lexically scoped, despite the fact that a package is a global entity. But a package declaration merely declares the identity of the default package for the rest of the enclosing block or, if you use the optional BLOCK after the package NAMESPACE, for that specific block only. Undeclared identifiers used in variable names[80] are looked up in that package. In a sense, a package is never declared at all, but springs into existence when you refer to something that belongs to that package. It’s all very Perlish.
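As a minimal sketch of that scoping, assuming Perl v5.14 or later for the package NAMESPACE BLOCK form, the following shows the default package reverting once the block closes. The names Hypothetical::Shop, where, and $label are made up for illustration:

```perl
use v5.14;   # turns on strict and allows the package NAME BLOCK form

our $label = "main";            # really $main::label

package Hypothetical::Shop {    # the package applies only inside this block
    our $label = "shop";        # really $Hypothetical::Shop::label
    sub where { return __PACKAGE__ }
}

# Past the closing brace, the default package is main again.
say __PACKAGE__;                    # main
say Hypothetical::Shop::where();    # Hypothetical::Shop
say $label;                         # main
say $Hypothetical::Shop::label;     # shop
```

Nothing ever "created" Hypothetical::Shop beyond mentioning it; the package came into being when something was stored in it.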

Scoped Variable Declarations

Most of the rest of this chapter is about using global variables. Or, rather, it’s about not using global variables. There are various declarations that help you not use global variables—or, at least, not use them foolishly.

We already mentioned the package declaration, which was introduced into Perl long ago to allow globals to be split up into separate packages. This works pretty well for certain kinds of variables. Packages are used by libraries, modules, and classes to store their interface data (and some of their semiprivate data) to avoid conflicting with variables and functions of the same name in your main program or in other modules. If you see someone write $Some::stuff,[81] he’s using the $stuff scalar variable from the package Some. See Chapter 10.

If this were all there were to the matter, Perl programs would quickly become unwieldy as they got longer. Fortunately, Perl’s three scoping declarators make it easy to create completely private variables (using my or state), or to give selective access to global ones (using our). There is also a pseudodeclarator to provide temporary values to global variables (using local). These declarators are placed in front of the variable in question:

my $nose;
our $House;
state $troopers = 0;
local $TV_channel;

If more than one variable is to be declared, the list must be placed in parentheses:

my ($nose, @eyes, %teeth);
our ($House, @Autos, %Kids);
state ($name, $rank, $serno);
local (*Spouse, $phone{HOME});

The my, state, and our declarators may only declare simple scalar, array, or hash variables, while state may only initialize simple scalar variables (although this may contain a reference to anything else you’d like), not arrays or hashes. Since local is not a true declarator, the constraints are somewhat more relaxed: you may also localize, with or without initialization, entire typeglobs and individual elements or slices of arrays and hashes. Each of these modifiers offers a different sort of “confinement” to the variables they modify. To oversimplify slightly: our confines names to a scope, local confines values to a scope, and my confines both names and values to a scope. (And state is just like my, but it defines the scope a bit differently.) Each of these constructs may be assigned to, though they differ in what they actually do with the values since they have different mechanisms for storing values. They also differ somewhat if you don’t (as we didn’t above) assign any values to them: my and local cause the variables in question to start out with values of undef or (), as appropriate; our, on the other hand, leaves the current value of its associated global unchanged. And state, the oddball, starts with whatever value it had the last time we were here.
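To make the differing starting values concrete, here is a small sketch, assuming Perl v5.14 or later. The names $Greeting, inspect, and blanked are hypothetical and exist only for this illustration:

```perl
use v5.14;

our $Greeting = "hello";        # a global that already holds a value

sub inspect {
    our $Greeting;              # no assignment: the global keeps "hello"
    my  $scratch;               # starts out undef on every call
    state $calls;               # undef on the first call, remembered afterward
    $calls++;
    return ($Greeting, defined($scratch) ? "defined" : "undef", $calls);
}

sub blanked {
    local $Greeting;            # temporarily undef, restored on return
    return defined($Greeting) ? "defined" : "undef";
}

say join ", ", inspect();       # hello, undef, 1
say join ", ", inspect();       # hello, undef, 2
say blanked();                  # undef
say $Greeting;                  # hello
```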

Syntactically, my, our, state, and local are simply modifiers (like adjectives) on an lvalue expression. When you assign to an lvalue modified by a declarator, it doesn’t change whether the lvalue is viewed as a scalar or a list. To determine how the assignment will work, just pretend that the declarator isn’t there. So either of:

my ($foo) = <STDIN>;
my @array = <STDIN>;

supplies list context to the righthand side, while this supplies scalar context:

my $foo = <STDIN>;

Declarators bind more tightly (with higher precedence) than the comma does. The following example erroneously declares only one variable, not two, because the list following the declarator is not enclosed in parentheses:

my $foo, $bar = 1;              # WRONG

This has the same effect as:

my $foo;
$bar = 1;

Under strict, you will get an error from that since $bar is not declared.
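For comparison, a short sketch of the parenthesized forms that do declare every variable in the list (the variable names are arbitrary). Note that a parenthesized declaration makes the assignment a list assignment, so a lone value on the right goes only to the first variable:

```perl
use strict;
use warnings;

my ($foo, $bar) = (undef, 1);   # both names declared; only $bar gets a value
my ($left, $right) = (1) x 2;   # both declared, both set to 1
my ($only_first, $unset) = 1;   # list assignment: 1 goes to $only_first alone

print defined($unset) ? "set\n" : "unset\n";    # unset
```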

In general, it’s best to declare a variable in the smallest possible scope that suits it. Since variables declared in a control-flow statement are visible only in the block governed by that statement, their visibility is reduced. It reads better in English this way, too:

sub check_warehouse {
    for my $widget (our @Current_Inventory) {
        say "I have a $widget in stock today.";
    }
}

By far the most frequently seen declarator is my, which declares lexically scoped variables for which both the names and values are stored in the current scope’s temporary scratchpad; these may not be accessed from outside the lexical scope. Always use my unless you know why you want one of the others. Use state if you want the same degree of privacy but you also want the value to persist from call to call.

Closely related is the our declaration, which is also persistent, and also enters a lexically scoped name in the current scope, but implements its persistence by storing its value in a global variable that anyone else can access if they wish. In other words, it’s a global variable masquerading as a lexical.

In addition to global scoping and lexical scoping, we also have what is known as dynamic scoping, implemented by local, which despite the word “local” really deals with global variables and has nothing to do with the local scratchpad. (It would be more aptly named temp, since it temporarily changes the value of an existing variable. You might even see temp in Perl 5 programs someday, if the keyword is borrowed back from Perl 6.)

The newly declared variable (or value, in the case of local) does not show up until the statement after the statement containing the declaration. Thus, you could mirror a variable this way:

my $x = $x;

That initializes the new inner $x with the current value of $x, whether the current meaning of $x is global or lexical.
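A brief sketch of that mirroring with an outer lexical, assuming Perl v5.14 or later for say. The helper variable $inner_result is hypothetical and is there only to carry the inner value out of the block:

```perl
use v5.14;

my $x = 10;
my $inner_result;
{
    my $x = $x;         # the right side is still the outer $x (10)
    $x += 5;            # changes only the inner copy
    $inner_result = $x;
}
say $inner_result;      # 15
say $x;                 # 10
```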

Declaring a lexical variable name hides any previously declared lexical of the same name, whether declared in that scope or an outer scope (although redeclaring a name within the same scope draws a warning if you have those enabled). It also hides any unqualified global variable of the same name, but you can always get to the global variable by explicitly qualifying it with the name of the package the global is in, for example, $PackageName::varname.

Lexically Scoped Variables: my

To help you avoid the maintenance headaches of global variables, Perl provides lexically scoped variables, often called lexicals for short. Unlike globals, lexicals guarantee you privacy. Assuming you don’t hand out references to these private variables that would let them be fiddled with indirectly, you can be certain that every possible access to these private variables is restricted to code within one discrete and easily identifiable section of your program. That’s why we picked the keyword my, after all.

A statement sequence may contain declarations of lexically scoped variables. Such declarations tend to be placed at the front of the statement sequence, but this is not a requirement; you may simply decorate the first use of a variable with a my declarator wherever it occurs (as long as it’s in the outermost scope the variable is used). In addition to declaring variable names at compile time, the declarations act like ordinary runtime statements: each of them is executed within the sequence of statements as if it were an ordinary statement without the declarator:

my $name = "fred";
my @stuff = ("car", "house", "club");
my ($vehicle, $home, $tool) = @stuff;

These lexical variables are totally hidden from the world outside their immediately enclosing scope. Unlike the dynamic scoping effects of local (see below), lexicals are hidden from any subroutine called from their scope. This is true even if the same subroutine is called from itself or elsewhere—each instance of the subroutine gets its own “scratchpad” of lexical variables. Subroutines defined in the scope of a lexical variable, however, can see the variable just like any inner scope would.

Unlike block scopes, file scopes don’t nest; there’s no “enclosing” going on, at least not textually. If you load code from a separate file with do, require, or use, the code in that file cannot access your lexicals, nor can you access lexicals from that file.

However, any scope within a file (or even the file itself) is fair game. It’s often useful to have scopes larger than subroutine definitions, because this lets you share private variables among a limited set of subroutines. This is one way to create variables that a C programmer would think of as “file static”:

{
    my $state = 0;

    sub on     { $state = 1 }
    sub off    { $state = 0 }
    sub toggle { $state = !$state }
}

The eval STRING operator also works as a nested scope, since the code in the eval can see its caller’s lexicals (as long as the names aren’t hidden by identical declarations within the eval’s own scope). Anonymous subroutines can likewise access any lexical variables from their enclosing scopes; if they do so, they’re what are known as closures.[82] Combining those two notions, if a block evals a string that creates an anonymous subroutine, the subroutine becomes a closure with full access to the lexicals of both the eval and the block, even after the eval and the block have exited. See the section Closures in Chapter 8.
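Here is a minimal sketch of both ideas, assuming Perl v5.14 or later. The names make_counter, $from_zero, $from_ten, and $secret are hypothetical:

```perl
use v5.14;

sub make_counter {
    my $count = shift // 0;             # private to this call of make_counter
    return sub { return $count++ };     # the closure keeps $count alive
}

my $from_zero = make_counter();
my $from_ten  = make_counter(10);

$from_zero->() for 1 .. 3;              # advances only its own $count
say $from_zero->();                     # 3
say $from_ten->();                      # 10

my $secret = 41;
say eval '$secret + 1';                 # 42: the eval sees our lexical
```

Each call to make_counter gets a fresh $count, so the two counters never interfere with each other.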

Persistent Lexically Scoped Variables: state

A state variable is a lexically scoped variable, just like my. The only difference is that state variables will never be reinitialized, unlike my variables that are reinitialized each time their enclosing block is entered. This is usually so that a function can have a private variable that retains its old value between calls to that function.

state variables are enabled only when the use feature "state" pragma is in effect. This will be automatically included if you ask to use a version of Perl that’s v5.10 or later:

use v5.14;
sub next_count {
    state $counter = 0;  # first time through, only
    return ++$counter;
}

Unlike my variables, state variables can currently be initialized only as scalars; a state array or hash cannot be initialized in list context (a restriction that Perl v5.28 later lifted). This may sound like a bigger restriction than it actually is, because you can always store a reference to an array or hash in a state variable:

use v5.14;
state $bag    = { };
state $vector = [ ];

unless ($bag->{$item}) { $bag->{$item} = 1 }
push @$vector, $item;
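Put inside a subroutine, the same trick gives you a private, persistent lookup table. This is only a sketch, assuming Perl v5.14 or later; first_sighting and @unique are hypothetical names:

```perl
use v5.14;

sub first_sighting {
    my ($item) = @_;
    state $seen = { };              # initialized once, kept across calls
    return !$seen->{$item}++;       # true only the first time we see $item
}

my @unique = grep { first_sighting($_) } qw(ant bee ant cat bee);
say "@unique";                      # ant bee cat
```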

Lexically Scoped Global Declarations: our

In the old days before use strict, Perl programs would simply access global variables directly. A better way to access globals nowadays is by the our declaration. This declaration is lexically scoped in that it applies only through the end of the current scope. However, unlike the lexically scoped my or the dynamically scoped local, our does not isolate anything to the current lexical or dynamic scope. Instead, it provides “permission” in the current lexical scope to access a variable of the declared name in the current package. Since it declares a lexical name, it hides any previous lexicals of the same name. In this respect, our variables act just like my variables.

If you place an our declaration outside any brace-delimited block, it lasts through the end of the current compilation unit. Often, though, people put it just inside the top of a subroutine definition to indicate that they’re accessing a global variable:

sub check_warehouse {
    our @Current_Inventory;
    my  $widget;
    foreach $widget (@Current_Inventory) {
        say "I have a $widget in stock today.";
    }
}

Since global variables are longer in life and broader in visibility than private variables, we like to use longer and flashier names for them than for temporary variables. This practice alone, if studiously followed, can do nearly as much as use strict can toward discouraging the overuse of global variables, especially in the less prestidigitatorial typists.

Repeated our declarations do not meaningfully nest. Every nested my produces a new variable, and every nested local a new value. But every time you use our, you’re talking about the same global variable, irrespective of nesting. When you assign to an our variable, the effects of that assignment persist after the scope of the declaration. That’s because our never creates values; it just exposes a limited form of access to the global, which lives forever:

our $PROGRAM_NAME = "waiter";
{
    our $PROGRAM_NAME = "server";
    # Code called here sees "server".
}
# Code executed here still sees "server".

Contrast this with what happens under my or local, where, after the block, the outer variable or value becomes visible again:

my $i = 10;
{
    my $i = 99;
}
# Code compiled here sees the outer variable (10).

local $PROGRAM_NAME = "waiter";
{
    local $PROGRAM_NAME = "server";
    # Code called here sees "server".
}
# Code executed here sees restored "waiter" value.

It usually only makes sense to assign to an our declaration once, probably at the very top of the program or module, or, more rarely, when you preface the our with a local of its own:

{
    local our @Current_Inventory = qw(bananas);
    check_warehouse();  # no, we haven't no bananas :-)
}

(But why not just pass it as an argument in this case?)

Dynamically Scoped Variables: local

Using a local operator on a global variable gives it a temporary value each time local is executed, but it does not affect that variable’s global visibility. When the program reaches the end of that dynamic scope, this temporary value is discarded and the original value is restored. But it’s always still a global variable that just happens to hold a temporary value while that block is executing. If you call some other function while your global contains the temporary value and that function accesses that global variable, it sees the temporary value, not the original one. In other words, that other function is in your dynamic scope, even though it’s presumably not in your lexical scope.[83]

This process is called dynamic scoping because the current value of the global variable depends on your dynamic context; that is, it depends on which of your parents in the call chain might have called local. Whoever did so last before calling you controls which value you will see.
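A small sketch of a callee seeing its caller's temporary value, assuming Perl v5.14 or later. The names $Indent, show, and section are hypothetical:

```perl
use v5.14;

our $Indent = 0;                    # a true global

sub show {
    my ($text) = @_;
    my $pad = " " x $Indent;        # reads whatever value is current right now
    return $pad . $text;
}

sub section {
    local $Indent = $Indent + 2;    # temporary value for this dynamic scope
    return show(@_);                # show() is called inside that scope
}

say show("top");                    # "top"
say section("nested");              # "  nested"
say show("top again");              # "top again"
```

Note that show never mentions local at all; what it sees depends entirely on who called it.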

If you have a local that looks like this:

  local $var = $newvalue;

you can think of it purely in terms of runtime assignments:

{
  $oldvalue = $var;
  $var = $newvalue;
}
continue {
  $var = $oldvalue;
}

The difference is that with local the value is restored no matter how you exit the block, even if you prematurely return from that scope.
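To see the restoration happen on abrupt exits, consider this sketch, assuming Perl v5.14 or later. The names $Mode, current_mode, risky, and early are hypothetical:

```perl
use v5.14;

our $Mode = "normal";

sub current_mode { return $Mode }

sub risky {
    local $Mode = "debug";
    die "gave up in " . current_mode() . " mode\n";     # leaves the scope abruptly
}

sub early {
    local $Mode = "quiet";
    return current_mode();          # returns before the end of the block
}

eval { risky() };
print $@;                           # gave up in debug mode
say current_mode();                 # normal
say early();                        # quiet
say current_mode();                 # normal
```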

As with my, you can initialize a local with a copy of the same global variable. Any changes to that variable during the execution of a subroutine (and any others called from within it, which of course can still see the dynamically scoped global) will be thrown away when the subroutine returns. You’d certainly better comment what you are doing, though:

# WARNING: Changes are temporary to this dynamic scope.
local $Some_Global = $Some_Global;

A global variable then is still completely visible throughout your whole program, no matter whether it was explicitly declared with our or just allowed to spring into existence, or whether it’s holding a local value destined to be discarded when the scope exits. In tiny programs, this isn’t so bad, but for large ones, you’ll quickly lose track of where in the code all these global variables are being used. You can forbid accidental use of globals, if you want, through the use strict 'vars' pragma, described in the next section.

Although both my and local confer some degree of protection, by and large you should prefer my over local. Sometimes, though, you have to use local so you can temporarily change the value of an existing global variable, like those listed in Chapter 25. Only alphanumeric identifiers may be lexically scoped, and many of those special variables aren’t strictly alphanumeric. You also need to use local to make temporary changes to a package’s symbol table, as shown in the section Symbol Tables in Chapter 10. Finally, you can use local on a single element or a whole slice of an array or a hash. This even works if the array or hash happens to be a lexical variable, layering local’s dynamic scoping behavior on top of those lexicals. We won’t talk much more about the semantics of local here. See the local entry in Chapter 27 for more information.
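Two of those uses can be sketched briefly, assuming Perl v5.14 or later: localizing one element of a lexical hash, and localizing a punctuation variable (here $", the list separator used when interpolating arrays). The names %config, verbosity, @levels, and dashed are hypothetical:

```perl
use v5.14;

my %config = (verbose => 0);            # a lexical hash

sub verbosity { return $config{verbose} }

my @levels;
{
    local $config{verbose} = 1;         # temporary value for one element
    push @levels, verbosity();          # sees 1
}
push @levels, verbosity();              # the element is back to 0
say "@levels";                          # 1 0

sub dashed {
    local $" = "-";                     # special variables take local, not my
    return "@_";
}

say dashed(1, 2, 3);                    # 1-2-3
say "@{[ 1, 2, 3 ]}";                   # 1 2 3
```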

[80] Also unqualified names of subroutines, filehandles, directory handles, and formats.

[81] Or the archaic $Some'stuff, which probably shouldn’t be encouraged outside of Perl poetry.

[82] As a mnemonic, note the common element between “enclosing scope” and “closure”. (The actual definition of closure comes from a mathematical notion concerning the completeness of sets of values and operations on those values.)

[83] That’s why lexical scopes are sometimes called static scopes: to contrast them with dynamic scopes and emphasize their compile-time determinability. Don’t confuse this use of the term with how static is used in C or C++. The term is heavily overloaded, which is why we avoid it.
