Perl 6 Essentials
By Allison Randal, Dan Sugalski, Leopold Tötsch
June 2003
Pages: 208

Table of Contents

Chapter 1: Project Overview
Conceptual integrity in turn dictates that the design must proceed from one mind, or from a very small number of agreeing resonant minds.
—Frederick Brooks Jr., The Mythical Man Month
Perl 6 is the next major version of Perl. It's a complete rewrite of the interpreter, and a significant update of the language itself. The goal of Perl 6 is to add support for much-needed new features, and still be cleaner, faster, and easier to use.
The Perl 6 project is vast and complex, but it isn't complicated. The project runs on a simple structure with very little management overhead. That's really the only way it could run. The project doesn't have huge cash or time resources. Its only resource is the people who believe in the project enough to spend their off-hours—their "relaxation" time—working to see it completed. This chapter is as much about people as it is about Perl.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Birth of Perl 6
Back on July 18, 2000, the second day of the fourth Perl Conference (TPC 4), a small band of Perl geeks gathered to prepare for a meeting of the Perl 5 Porters later that day. The topic at hand was the current state of the Perl community. Four months had passed since the 5.6.0 release of Perl, and although it introduced some important features, none were revolutionary.
There had been very little forward movement in the previous year. It was generally acknowledged that the Perl 5 codebase had grown difficult to maintain. At the same time, infighting on the perl5-porters list had grown so intense that some of the best developers decided to leave. It was time for a change, but no one was quite sure what to do. They started conservatively with plans to change the organization of Perl development.
An hour into the discussion, around the time most people nod off in any meeting, Jon Orwant (the reserved, universally respected editor of the Perl Journal) stepped quietly into the room and snapped everyone to attention with an entirely uncharacteristic and well-planned gesture. Smash! A coffee mug hit the wall. "We are *@$!-ed (Crash!) unless we can come up with something that will excite the community (Pow!), because everyone's getting bored and going off and doing other things! (Bam!)" (At least, that's basically how Larry tells it. As is usually the case with events like this, no one remembers exactly what Jon said.)
Awakened by this display, the group started to search for a real solution. The language needed room to grow. It needed the freedom to evaluate new features without the obscuring weight of legacy code. The community needed something to believe in, something to get excited about.
Within a few hours the group settled on Perl 6, a complete rewrite of Perl. The plan wasn't just a language change, just an implementation change, or just a social change. It was a paradigm shift. Perl 6 would be the community's rewrite of Perl, and the community's rewrite of itself.
Would Perl 6, particularly Perl 6 as a complete rewrite, have happened without this meeting? Almost certainly. The signs appeared on the lists, in conferences, and in journals months in advance. If it hadn't started that day, it would have happened a week later, or perhaps a few months later, but it would have happened. It was a step the community needed to take.
In the Beginning . . .
Let's pause and consider Perl development up to that fateful meeting. Perl 6 is just another link in the chain. The motivations behind it and the directions it will take are partially guided by history.
Perl was first developed in 1987 by Larry Wall while he was working as a programmer for Unisys. After creating a configuration and monitoring system for a network that spanned the two American coasts, he was faced with the task of assembling usable reports from log files scattered across the network. The available tools simply weren't up to the job. A linguist at heart, Larry set out to create his own programming language, which he called perl. He released the first version of Perl on December 18, 1987. He made it freely available on Usenet (this was before the Internet took over the world, remember), and quickly a community of Perl programmers grew.
The early adopters of Perl were system administrators who had hit the wall with shell scripting, awk, and sed. However, in the mid-1990s Perl's audience exploded with the advent of the Web, as Perl was tailor-made for CGI scripting and other web-related programming.
Meantime, the Perl language itself kept growing, as Larry and others kept adding new features. Probably the most revolutionary change in Perl (until Perl 6, of course) was the addition of packages, modules, and object-oriented programming with Perl 5. While this made the transition period from Perl 4 to Perl 5 unusually long, it breathed new life into the language by providing a modern, modular interface. Before Perl 5, Perl was considered simply a scripting language; after Perl 5, it was considered a full-fledged programming language.
Larry, meanwhile, started taking a back seat in Perl development and allowed others to take responsibility for adding new features and fixing bugs in Perl. The Perl 5 Porters (p5p) mailing list became the central clearinghouse for bug reports and proposed changes to the Perl language, with the "pumpkin holder" (also known as the "pumpking") being the programmer responsible for implementing the patches and distributing them to the rest of the list for review. Larry continued to follow Perl development, but like a parent determined not to smother his children, he stayed out of the day-to-day development, limiting his involvement to situations in which he was truly needed.
The Continuing Mission
Much has changed since the early days of the project. New people join the group and others leave in a regular "changing of the guard" pattern. Plans change as the work progresses, and the demands of the work and the needs of the community become clearer. Today the Perl 6 project has three major parts: language design, internals, and documentation. Each branch is relatively autonomous, though there is a healthy amount of coordination between them.
As with all things Perl, the central command of the language design process is Larry Wall, the creator of the Perl language. Larry is supported by the rest of the design team: Damian Conway, Allison Randal, Dan Sugalski, Hugo van der Sanden, and chromatic. We speak in weekly teleconferences and also meet face-to-face a few times a year to hash out ideas for the design documents, or to work through roadblocks standing in the way of design or implementation. The group is diverse, including programmers-for-hire, Perl trainers, and linguists with a broad spectrum of interests and experiences. This diversity has proved quite valuable in the design process, as each member is able to see problems in the design or potential solutions that the other members missed.

Section 1.3.1.1: Requests for comments (RFCs)

The first step in designing the new language was the RFC (Request For Comments) process. This spurred an initial burst of community involvement. Anyone was free to submit an RFC on any subject, whether it was as small as adding an operator, or as big as reworking OO syntax. Most of the proposals were really quite conservative. The RFCs followed a standard format so they would be easier to read and easier to compare.
Each RFC was subject to peer review, carried out in an intense few weeks around October 2000. One thing the RFC process demonstrated was that the Perl community still wasn't quite ready to move beyond the infighting that had characterized Perl 5 Porters earlier that year.
Chapter 2: Project Development
The culture's (and my own) understanding of large projects that don't follow a benevolent-dictator model is weak. Most such projects fail. A few become spectacularly successful and important (Perl, Apache, KDE). Nobody really understands where the difference lies.
—Eric S. Raymond, The Cathedral and The Bazaar
The Perl community is rich and diverse. There are as many variations in skill sets and skill levels as there are people. Some are coders, some are testers, some are writers, some are teachers, some are theorists. For every skill, there is a task. It's the combination of all the skills that gets the job done. A team of workers all wielding hammers could never build a house. Someone has to cut the wood, sand it, apply plaster, paint it, and install windows, doors, electrical systems, and plumbing.
Language Development
Theoretically, language design is the driving force behind all other parts of the project. In actual practice, Parrot development and documentation frequently affect the direction and focus of design efforts. A design that gave no consideration to what can be implemented efficiently wouldn't be much use. Equally, if the design work followed a strictly linear path, it would be a waste of developer resources. The Parrot project can't afford to go on hold every time it needs information from a future area of design. For example, the design of OO syntax hasn't been completed yet, but the design team took time to define enough of the required semantics so that development can move ahead.
Design work goes in cycles. Each cycle begins with a quiet period. During this time, the list traffic is fairly light, and Larry is rarely seen. It can seem as if the project is stalled, but in fact, this part of the cycle is where the bulk of original design work is done. Larry disappears when he's working on an Apocalypse. It's the most intense and creative phase.
The next phase is internal revision. Larry sends a draft of the Apocalypse to the design team for comments and makes changes based on their suggestions. Sometimes the changes are as simple as typo fixes, but sometimes they entirely alter the shape of the design. Larry repeats this several times before publishing the document. This is a very fast-paced and dynamic phase, but again, low on visible results.
Next is the community review. Usually the first day or two after an Apocalypse comes out are quiet, while the ideas soak in. Then the list begins to fly. Some people suggest changes, while others ask about the design. This phase reflects the most visible progress, but the changes are mostly refinements. The changes introduced at community review polish off the rough edges, add a few new tricks, or make simplifications for the average user. Here the community takes ownership of the design, as both the design and the people change until the two are a comfortable fit. The Synopsis, a summary released by the design team soon after each Apocalypse, assists in the community review by breaking down the ideas from the Apocalypse into a simple list of points.
Parrot Development
Parrot development is the productive core of Perl 6 development. If you want coding action, this is the place to be.
Organization of the Parrot project is lightweight but efficient. It's a meritocracy: people who make valuable contributions are offered more responsibility. Communication is relaxed and informal. As Dan is so fond of saying, "This is far too important to take seriously." It's a bit like a special forces unit: the work gets done not because of tight control from the top, but because the whole team knows what it needs to do and does it.
The cycles in Parrot development center on "point releases." A point release is a version change, such as 0.0.8 to 0.0.9. The pumpking decides when point releases happen and what features are included. Usually one or two solid new features trigger a release.
Development proceeds at a steady pace of bug reports, patches submitted, and patches applied. The pace isn't so much a result of careful planning as it is the law of averages—on any given day, someone, somewhere, is working on Parrot. A release is a spike in that activity, but since Parrot tends to follow the "release early, release often" strategy, the spike is relatively small.
Typically, a few days before a release the pumpking declares a feature freeze and all development efforts center on bug squashing. This periodic cleanup is one of the most valuable aspects of a release.
Just like design work, the first step to participating in Parrot development is joining the list. The topics on p6i tend to stick to practical matters: bug reports, patches, notifications of changes committed to CVS, and questions on coding style. Occasionally there are discussions about how to implement a particular feature. In general, if you have a question about syntax or a speculation about whether Perl 6 should support a particular feature, that question belongs on the language list rather than the internals list.
Chapter 3: Design Philosophy
Today's practicality is often no more than the accepted form of yesterday's theory.
—Kenneth Pike, An Introduction to Tagmemics
At the heart of every language is a core set of ideals that give the language its direction and purpose. If you really want to understand the choices that language designers make—why they choose one feature over another or one way of expressing a feature over another—the best place to start is with the reasoning behind the choices.
Perl 6 has a unique set of influences. It has deep roots in Unix and the children of Unix, which gives it a strong emphasis on utility and practicality. It's grounded in the academic pursuits of computer science and software engineering, which gives it a desire to solve problems the right way, not just the most expedient way. It's heavily steeped in the traditions of linguistics and anthropology, which gives it the goal of comfortable adaptation to human use. These influences and others like them define the shape of Perl and what it will become.
Linguistic and Cognitive Considerations
Perl is a human language. Now, there are significant differences between Perl and languages like English, French, German, etc. For one, it is artificially constructed, not naturally occurring. Its primary use, providing a set of instructions for a machine to follow, covers a limited range of human existence. Even so, Perl is a language humans use for communicating. Many of the same mental processes that go into speaking or writing are duplicated in writing code. The process of learning to use Perl is much like learning to speak a second language. The mental processes involved in reading are also relevant. Even though the primary audience of Perl code is a machine, as often as not humans have to read the code while they're writing it, reviewing it, or maintaining it.
Many Perl design decisions have been heavily influenced by the principles of natural language. The following are some of the most important principles, the ones we come back to over and over again while working on the design and the ones that have had the greatest impact.
The natural tendency in human languages is to keep overall complexity about equivalent, both from one language to the next, and over time as a language changes. Like a waterbed, if you push down the complexity in one part of the language, it increases complexity elsewhere. A language with a rich system of sounds (phonology) might compensate with a simpler syntax. A language with a limited sound system might have a complex way of building words from smaller pieces (morphology). No language is complex in every way, as that would be unusable. Likewise, no language is completely simple, as too few distinctions would render it useless.
The same is true of computer languages. They require a constant balance between complexity and simplicity. Restricting the possible operators to a small set leads to a proliferation of user-defined methods and subroutines. This is not a bad thing, in itself, but it encourages code that is verbose and difficult to read. On the other hand, a language with too many operators encourages code that is heavy in line noise and difficult to read. Somewhere in the middle lies the perfect balance.
Architectural Considerations
The second set of principles governs the overall architecture of Perl 6. These principles are connected to the past, present, and future of Perl, and define the fundamental purpose of Perl 6. No principle stands alone; each is balanced against the others.
Everyone agrees that Perl 6 should still be Perl, but the question is, what exactly does that mean? It doesn't mean Perl 6 will have exactly the same syntax. It doesn't mean Perl 6 will have exactly the same features. If it did, Perl 6 would just be Perl 5. So, the core of the question is what makes Perl "Perl"?

Section 3.2.1.1: True to the original purpose

Perl will stay true to its designer's original intended purpose. Larry wanted a language that would get the job done without getting in his way. The language had to be powerful enough to accomplish complex tasks, but still lightweight and flexible. As Larry is fond of saying, "Perl makes the easy things easy and the hard things possible." The fundamental design philosophy of Perl hasn't changed. In Perl 6, the easy things are a little easier and the hard things are more possible.

Section 3.2.1.2: Familiarity

Perl 6 will be familiar to Perl 5 users. The fundamental syntax is still the same. It's just a little cleaner and a little more consistent. The basic feature set is still the same. It adds some powerful features that will probably change the way we code in Perl, but they aren't required.
Learning Perl 6 will be like American English speakers learning Australian English, not English speakers learning Japanese. Sure, there are some vocabulary changes, and the tone is a little different, but it is still—without any doubt—English.

Section 3.2.1.3: Translatable

Perl 6 will be mechanically translatable from Perl 5. In the long term, this isn't nearly as important as what it will be like to write code in Perl 6. But during the transition phase, automatic translation will be important. It will allow developers to start moving ahead before they understand every subtle nuance of every change. Perl has always been about learning what you need now and learning more as you go.
Chapter 4: Syntax
Language serves not only to express thought but to make possible thoughts which could not exist without it.
—Bertrand Russell
Perl 6 is a work in progress, so the syntax is rapidly changing. This chapter is likely to be outdated by the time you read it. Even so, it provides a good baseline. If you start here, you'll only have to catch up on a few months of changes (starting with the design documents after Apocalypse 6), instead of several years' worth.
Pretend for a moment that you don't know anything about Perl. You heard the language has some neat features, so you thought you might check it out. You go to the store and pick up a copy of Programming Perl because you think this Larry Wall guy might know something about it. It's the latest version, put out for the 6.0.1 release of Perl. It's not a delta document describing the changes; it's an introduction, and you dive in with the curiosity of a kid who got a telescope for his birthday. This chapter is a first glimpse down that telescope.
There's plenty of time later to analyze each feature and decide which you like and which you don't. For now, take a step back and get a feel for the system as a whole, for what it'll be like to work in it.
Variables
The most basic building blocks of a programming language are its nouns, the blobs of data that get sucked in, pushed around, altered in various ways, and spat out to some new location. The blobs of data are values: strings, numbers, etc., or composites of the simpler values. Variables are just named containers for those values. The three kinds of variables in Perl 6 are scalars, arrays, and hashes. Each has an identifying symbol (or sigil) as part of the name of the variable: $ for scalars, @ for arrays, and % for hashes. The sigils provide a valuable visual distinction by making it immediately obvious what kinds of behavior a particular variable is likely to have. But, fundamentally, there's little difference between the three. Each variable is essentially a container for a value, whether that value is single or collective. (This statement is an oversimplification, as you'll soon see.)
Scalars are all-purpose containers. They can hold strings, integers, floating-point numbers, and references to all kinds of objects and built-in types. For example:
$string = "Zaphod's just this guy, you know?";
$int = 42;
$float = 3.14159;
$arrayref = [ "Zaphod", "Ford", "Trillian" ];
$hashref = { "Zaphod" => 362, "Ford" => 1574, "Trillian" => 28 };
$subref = sub { print $string };
$object = Android.new;
A filehandle is just an ordinary object in an ordinary scalar variable. For example:
$filehandle = open $filename;
Array variables hold simple ordered collections of scalar values. Individual values are retrieved from the array by numeric index. The "0" index holds the first value. The @ sigil is part of the name of the variable and stays the same no matter how the variable is used:
@array = ( "Zaphod", "Ford", "Trillian" );

$second_element = @array[1]; # Ford
To get the length of an array—that is, the number of elements in an array—use the .length method. The .last method returns the index of the last element in an array—that is, the highest index in an array.
Operators
Operators provide a simple syntax for manipulating values. Many of the Perl 6 operators will be familiar, especially to Perl programmers.
The = operator is for ordinary assignment. It creates a copy of the values on the right-hand side and assigns them to the variables or data structures on the left-hand side:
$copy = $original;
@copies = @originals;
$copy and $original both have the same value, and @copies has a copy of every element in @originals.
The := operator is for binding assignment. Instead of copying the value from one variable or structure to the other, it creates an alias. An alias is an additional entry in the symbol table with a different name for the one container:
$a := $b;  # $a and $b are aliases
@c := @d;  # @c and @d are aliases
In this example, any change to $a also changes $b, because they're just two separate names for the same container. Binding assignment requires the same number of elements on both sides, so both of these would be an error:
# ($a, $b) := ($c);          # error
# ($a, $b) := ($c, $d, $e);  # error
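Since the Perl 6 syntax is still in flux, the difference between copying and binding can be tried out in another language in the meantime. The following is a rough analogy in Python, not Perl 6. It assumes that building a new list stands in for ordinary assignment with =, and that plain Python name assignment, which makes two names refer to one object, stands in for binding with :=. The variable names are invented for illustration.

```python
# Analogy only: Python, not Perl 6.
originals = ["Zaphod", "Ford", "Trillian"]

# Like `@copies = @originals`: a new container holding copies of the elements.
copies = list(originals)

# Like `@alias := @originals`: a second name for the same container.
alias = originals

originals.append("Marvin")

print(len(copies))         # 3: the copy did not grow
print(len(alias))          # 4: the alias sees the change
print(alias is originals)  # True: two names, one container

# Loosely like the element-count rule for binding: Python unpacking also
# insists that both sides have the same number of elements.
try:
    a, b = ("Zaphod",)
except ValueError as err:
    mismatch = str(err)
    print("error:", mismatch)
```

The last part is only a loose parallel: Python raises its error when the statement runs, and the sketch says nothing about when Perl 6 would detect the mismatch.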
The ::= operator is a variant of the binding operator that binds at compile time.
The binary arithmetic operators are addition (+), subtraction (-), multiplication (*), division (/), modulus (%), and exponentiation (**). Each has a corresponding assignment operator (+=, -=, *=, /=, %=, **=) that combines the arithmetic operation with assignment:
$a = 3 + 5;
$a += 5;     # $a = $a + 5
The unary arithmetic operators are the prefix and postfix autoincrement (++) and autodecrement (--) operators. The prefix operators modify their argument before it's evaluated, and the postfix operators modify it afterward.
Control Structures
The simplest flow of control is linear—one statement follows the next in a straight line to the end of the program. Since this is far too limiting for most situations, languages provide ways to alter the control flow.
Selection executes one set of actions out of many possible sets. The selection control structures are if, unless, and given.

Section 4.3.1.1: The if statement

The if statement checks a condition and executes its associated block only if that condition is true. The condition can be any expression that evaluates to a truth value. Parentheses around the condition are optional:
if $blue {
    print "True Blue.";
}
The if statement can also have an unlimited number of elsif statements that check additional conditions when the preceding conditions are false. The final else statement executes if all preceding if and elsif conditions are false:
if $blue {
    print "True Blue.";
} elsif $green {
    print "Green, green, green they say...";
} else {
    print "Colorless green ideas sleep furiously.";
}

Section 4.3.1.2: The unless statement

The unless statement is the logical opposite of if. Its block executes only when the tested condition is false:
unless $fire {
    print "All's well.";
}
There is no elsunless statement, though else works with unless.

Section 4.3.1.3: The switch statement

The switch statement, written with the given keyword, selects an action by comparing a given expression, the switch, to a series of when statements, the cases. When a case matches the switch, its block is executed:
given $bugblatter {
    when Beast::Trall { close_eyes(); }
    when 'ravenous'   { toss('steak'); }
    when .feeding     { sneak_past(); }
    when /grrr+/      { cover_ears(); }
    when 2            { run_between(); }
    when (3..10)      { run_away(); }
}
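One way to picture how a switch picks its block is as an ordered list of pattern and action pairs, where the first pattern that matches the switch wins. The Python sketch below is an analogy, not Perl 6, and it rests on simplifying assumptions: the matches helper, the switch function, and the rule applied to each kind of pattern are all invented here, and they only approximate whatever matching rules Perl 6 finally adopts.

```python
# Analogy only: Python, not Perl 6. The matching rules below are assumptions.
import re

def matches(value, pattern):
    """Decide whether value satisfies one when-style pattern."""
    if isinstance(pattern, type):          # a class: is value an instance?
        return isinstance(value, pattern)
    if isinstance(pattern, re.Pattern):    # a regex: does it match somewhere?
        return isinstance(value, str) and pattern.search(value) is not None
    if isinstance(pattern, range):         # a range: is value inside it?
        return value in pattern
    if callable(pattern):                  # a test to run against value
        return bool(pattern(value))
    return value == pattern                # anything else: plain equality

def switch(value, cases):
    """Run the action of the first case whose pattern matches value."""
    for pattern, action in cases:
        if matches(value, pattern):
            return action()
    return None

cases = [
    ("ravenous",           lambda: "toss steak"),
    (re.compile(r"grrr+"), lambda: "cover ears"),
    (2,                    lambda: "run between"),
    (range(3, 11),         lambda: "run away"),   # 3 through 10 inclusive
]

print(switch("ravenous", cases))
print(switch("grrrrr", cases))
print(switch(7, cases))
```

Keeping the cases in a list preserves their order, so an earlier case takes priority over a later one that would also match.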
Subroutines
The most basic form of a subroutine is simply the sub keyword, followed by the name of the sub, followed by the block that defines the sub:
sub alert {
    print "We have normality.";
}
In a simple sub, all arguments are passed in the @_ array:
sub sum {
    my $sum;
    for @_ -> $number {
        $sum += $number;
    }
    return $sum;
}
Perl 6 subroutines can define named formal parameters. The parameter list is part of the subroutine definition, often called the "signature" of the subroutine:
sub standardize ($text, $method) {
    my $clean;
    given $method {
        when 'length' { $clean = wrap($text, 72); }
        when 'lower' { $clean = lowercase($text); }
        ...
    }
    return $clean;
}
Subroutine parameter lists are non-flattening. Any array or hash passed into a subroutine is treated as a single parameter. An array in the signature expects to be passed an actual array or arrayref, and a hash expects a hash or hashref:
sub whole (@names, %flags) {
    ...
}

# and elsewhere
whole(@array, %hash);
To get the old-style behavior where the elements of an array (or the pairs of a hash) flatten out into the parameter list, use the flattening operator in the call to the subroutine. Here, $first is bound to @array[0] and $second is bound to @array[1]:
sub flat ($first, $second) {
    ...
}

flat(*@array);
To make an array (or hash) in the parameter list slurp up all the arguments passed to it, use the flattening operator in the signature definition. These are known as variadic parameters because they can take a variable number of arguments. Here, @names[0] is bound to $zaphod, and @names[1] to $ford:
sub slurp (*@names) {
    ...
}

slurp($zaphod, $ford);
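For readers coming from Python, the two uses of the flattening operator have a rough analog in Python's own `*`. The sketch below is an analogy only, not Perl 6: the function names mirror the examples above, while the return values and the sample data are invented for illustration.

```python
# Python analogy only: * at the call site flattens, * in the signature slurps.

def flat(first, second):
    # Two ordinary positional parameters, like ($first, $second).
    return (first, second)

def slurp(*names):
    # A variadic parameter, like (*@names): collects every argument passed.
    return list(names)

array = ["Zaphod", "Ford"]

# Flattening in the call, like flat(*@array).
flattened = flat(*array)

# Slurping in the signature, like slurp($zaphod, $ford).
slurped = slurp("Zaphod", "Ford")

print(flattened)
print(slurped)
```

As in the Perl 6 examples, passing `array` without the `*` would hand the whole list to `flat` as a single argument.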
Subroutines with defined parameter lists don't get an @_ array. In fact, a simple subroutine without a signature actually has an implicit signature of *@_:
Classes and Objects
Class syntax won't really be decided until Apocalypse 12, so this section is the sketchiest in the chapter and the most likely to change. Rather than roll out a lengthy speculation, we focus on the parts that are relatively certain.
Class declarations have two forms. The most basic is a class declaration statement, followed by the code that defines the class. There can be only one class or module declaration statement in a file. All code that follows is defined in the Heart::Gold namespace:
class Heart::Gold;
# class definition follows
...
The other form wraps the declaration and definition into a block. Everything within the class's block is defined in the namespace of the class. You can have as many of these as you like in a file, and embed one class within the block of another:
class Heart::Gold {
    # class definition enclosed
    ...
}
To create a new object from a class, simply call its new method. A default new method is provided in the universal base class Object:
$ship = Heart::Gold.new(length => 150);
Attributes are the data at the core of a class. They are commonly known as instance variables, data members, or instance attributes. They're declared with the has keyword, and always have a "." after the sigil:
class Heart::Gold {
    has $.height;
    has $.length;
    has @.cargo;
    has %.crew;
    ...
}
Attributes also automatically generate their own accessor method with the same name as the attribute:
$obj.height(  ) # returns the value of $.height
By default, all attributes and their accessor methods are private to the class. If you want them to be accessible from outside the class, flag them with the is public trait:
has $.height is public;
Methods are similar to subroutines, but different enough to merit their own keyword,
Grammars and Rules
Perl 6 "regular expressions" are so far beyond the formal definition of regular expressions that we decided it was time for a more meaningful name. We now call them "rules." Perl 6 rules bring the full power of recursive descent parsing to the core of Perl, but are comfortably useful even if you don't know anything about recursive descent parsing. A grammar is a collection of rules, in the same way that a class is a collection of methods.
A rule is just a pattern for matching text. Rules can match right where they're defined, or they can be stored up to match later. Rules can be named or anonymous. They may be defined with variations on the familiar /.../ syntax, or using subroutine-like syntax with the keyword rule. Table 4-2 shows the basic syntax for defining rules.
Table 4-2: Rules
Syntax        Meaning
m/.../        Match a pattern (immediate execution).
s/.../.../    Perform a substitution (immediate execution).
rx/.../       Define an anonymous rule (deferred execution).
/.../         Immediately match or define an anonymous rule, depending on the context.
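The distinction between immediate and deferred execution may be easier to see in a language that runs today. The sketch below uses Python's `re` module as a loose analogy; it covers only ordinary regular expressions, not the grammar features of Perl 6 rules, and the sample text and variable names are invented for illustration.

```python
import re

text = "The bugblatter went grrrr."

# Roughly like m/.../: match a pattern immediately.
found = re.search(r"grr+", text) is not None

# Roughly like s/.../.../: perform a substitution immediately.
quiet = re.sub(r"grr+", "purr", text)

# Roughly like rx/.../: build the pattern now and match with it later.
growl = re.compile(r"grr+")
later = growl.search("grr") is not None

print(found, quiet, later)
```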
Chapter 5: Parrot Internals
"What is the tortoise standing on?"
"You're very clever, young man, very clever," said the old lady. "But it's turtles all the way down!"
—Stephen Hawking, A Brief History of Time
This chapter details the architecture and internal workings of Parrot, the interpreter behind Perl 6. Parrot is a register-based, bytecode-driven, object-oriented, multithreaded, dynamically typed, self-modifying, asynchronous interpreter. While that's an awful lot of buzzwords, the design fits together remarkably well.
Core Design Principles
Three main principles drive the design of Parrot—speed, abstraction, and stability.
Speed is a paramount concern. Parrot absolutely must be as fast as possible, since the engine effectively imposes an upper limit on the speed of any program running on it. It doesn't matter how efficient your program is or how clever your program's algorithms are if the engine it runs on limps along. While Parrot can't make a poorly written program run fast, it could make a well-written program run slowly, a possibility we find entirely unacceptable.
Speed encompasses more than just raw execution time. It extends to resource usage. It's irrelevant how fast the engine can run through its bytecode if it uses so much memory in the process that the system spends half its time swapping to disk. While we're not averse to using resources to gain speed benefits, we try not to use more than we need, and to share what we do use.
Abstraction means that the system is designed so that there's a limit to what anyone needs to keep in their head at any one time. This is very important because Parrot is conceptually very large, as you'll see when you read the rest of the chapter. There's a lot going on, too much to keep the whole thing in mind at once. The design is such that you don't have to remember what everything does and how it all works. This is true regardless of whether you're writing code that runs on top of Parrot or working on one of its internal subsystems.
Parrot also uses abstraction boundaries as places to cheat for speed. As long as it looks like an abstraction is being completely fulfilled, it doesn't matter if it actually is being fulfilled, something we take advantage of in many places within the engine. For example, variables are required to be able to return a string representation of themselves, and each variable type has a "give me your string representation" function we can call. That lets each class have custom stringification code, optimized for that particular type. The engine has no idea what goes on beneath the covers and doesn't care—it just knows to call that function when it needs the string value of a variable.
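The stringification example can be pictured as a table of per-type functions that the engine calls without looking inside. The following is a minimal Python sketch of that idea, not Parrot's actual vtable layout; the table name `STRINGIFY` and the three helper functions are hypothetical.

```python
# Hypothetical sketch: each type supplies its own stringification function,
# and the "engine" only knows how to look it up and call it.

def int_to_string(value):
    return str(value)

def bool_to_string(value):
    return "true" if value else "false"

def list_to_string(value):
    # A type is free to build its representation however it likes.
    return "(" + ", ".join(to_string(item) for item in value) + ")"

# One table entry per type; the names are illustrative, not Parrot's.
STRINGIFY = {
    int: int_to_string,
    bool: bool_to_string,
    list: list_to_string,
}

def to_string(value):
    # The engine does not inspect the value; it dispatches on its exact type.
    return STRINGIFY[type(value)](value)

print(to_string([1, True, [2, 3]]))
```

Because the caller only ever goes through `to_string`, any entry in the table can be swapped for a faster implementation without the rest of the program noticing.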
Parrot's Architecture
The Parrot system is divided into four main parts, each with its own specific task. The diagram in Figure 5-1 shows the parts and the way source code and control flow through Parrot. Each of the four parts is covered briefly here, with the features and parts of the interpreter covered in more detail afterward.
Figure 5-1: Parrot's flow
The flow starts with source code, which is passed into the parser module. The parser processes that source into a form that the compiler module can handle. The compiler module takes the processed source and emits bytecode, which Parrot can directly execute. That bytecode is passed into the optimizer module, which processes the bytecode and produces bytecode that is hopefully faster than what the compiler emitted. Finally, the bytecode is handed off to the interpreter module, which interprets the bytecode. Since compilation and execution are so tightly woven in Perl, the control may well end up back at the parser to parse more code.
Parrot's compiler module also has the capability to freeze bytecode to disk and read that frozen bytecode back again, bypassing the parser and compilation phases entirely. The bytecode can be directly executed, or handed to the optimizer to work on before execution. This may happen if you've loaded in a precompiled library and want Parrot to optimize the combination of your code and the library code. The bytecode loader is interesting in its own right, and also warrants a small section.
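To make the flow concrete, here is a toy four-stage pipeline in Python. It is a sketch under heavy assumptions: it borrows Python's own `ast` module as the parser, handles only addition and multiplication of constants, and uses an invented three-instruction "bytecode" (`set`, `add`, `mul`) that has nothing to do with Parrot's real opcodes.

```python
import ast

OPS = {ast.Add: "add", ast.Mult: "mul"}

def parse(source):
    # Parser: source text -> abstract syntax tree.
    return ast.parse(source, mode="eval").body

def compile_tree(node, code):
    # Compiler: AST -> flat list of register instructions.
    # Each instruction writes a fresh register numbered by its position.
    if isinstance(node, ast.Constant):
        reg = len(code)
        code.append(("set", reg, node.value))
        return reg
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        left = compile_tree(node.left, code)
        right = compile_tree(node.right, code)
        reg = len(code)
        code.append((OPS[type(node.op)], reg, left, right))
        return reg
    raise ValueError("unsupported syntax")

def optimize(code):
    # Optimizer: fold operations whose operands are known constants.
    known = {}
    result = []
    for instr in code:
        op, dst = instr[0], instr[1]
        if op == "set":
            known[dst] = instr[2]
            result.append(instr)
        elif instr[2] in known and instr[3] in known:
            a, b = known[instr[2]], known[instr[3]]
            known[dst] = a + b if op == "add" else a * b
            result.append(("set", dst, known[dst]))
        else:
            result.append(instr)
    # If the final result is a constant, nothing before it is needed.
    if result[-1][0] == "set":
        return [result[-1]]
    return result

def interpret(code):
    # Interpreter: execute the instructions and return the last register.
    regs = {}
    for instr in code:
        op, dst = instr[0], instr[1]
        if op == "set":
            regs[dst] = instr[2]
        elif op == "add":
            regs[dst] = regs[instr[2]] + regs[instr[3]]
        elif op == "mul":
            regs[dst] = regs[instr[2]] * regs[instr[3]]
    return regs[code[-1][1]]

def run(source):
    code = []
    compile_tree(parse(source), code)
    return interpret(optimize(code))

print(run("2 + 3 * 4"))
```

The optimizer consumes and produces the same instruction format, which is what lets a pipeline like this skip it, or apply it to previously saved bytecode, without changing the interpreter.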
The parser module is responsible for taking source code in and turning it into an Abstract Syntax Tree (AST). An AST is a digested form of the program, one that's much more amenable to manipulation. In some systems this task is split into two parts—the lexing and the parsing—but since the tasks are so closely bound, Parrot combines them into a single module.
Lexing (or tokenizing) turns a stream of characters into a stream of tokens. It doesn't assign any meaning to those tokens—that's the job of the parser—but it is smart enough to see that
The Interpreter
The interpreter is the engine that actually runs the code emitted by the parser, compiler, and optimizer modules. The Parrot execution engine is a virtual CPU done completely in software. We've drawn on research in CPU and interpreter design over the past forty years to try to build the best engine to run dynamic languages.
That emphasis on dynamic languages is important. We are not trying to build the fastest C, Forth, Lisp, or Prolog engine. Each class of languages has its own quirks and emphasis, and no single engine will handle all the different types of languages well. Trying to design an engine that works equally well for all languages will get you an engine that executes all of them poorly.
That doesn't mean that we've ignored languages outside our area of primary focus—far from it. We've worked hard to make sure that we can accommodate as many languages as possible without compromising the performance of our core language set. We feel that even though we may not run Prolog or Scheme code as fast as a dedicated engine would, the flexibility Parrot provides to mix and match languages more than makes up for that.
Parrot's core design is that of a register-rich CISC CPU, like many of the CISC machines of the past such as the VAX, Motorola 68000, and IBM System/3x0. Many of the core opcodes (Parrot's basic instructions) perform complex operations. It also bears some resemblance to modern RISC CPUs such as the IBM Power series and DEC Alpha, as it does all its operations on data in registers. Using a core design similar to older systems gives us decades of compiler research to draw on. Most compiler research since the early 1970s deals with targeting register systems of one sort or another.
Using a register architecture as the basis for Parrot goes against the current trends in virtual machines, which favor stack-based approaches. While a stack approach is simpler to implement, a register system provides a richer set of semantics. It's also just more pleasant for us assembly old-timers to write code for. Combined with the decades of sophisticated compiler research, we feel that it's the correct design decision.
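The difference between the two styles shows up even in a toy example. The Python sketch below computes a sum on a minimal stack machine and on a minimal register machine; both instruction sets are invented for illustration and are far simpler than Parrot's, and the instruction counts say nothing about real-world speed.

```python
def run_stack(program, variables):
    # Stack machine: operands are implicit, taken from the top of the stack.
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(variables[args[0]])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "store":
            variables[args[0]] = stack.pop()
    return variables

def run_register(program, registers):
    # Register machine: every instruction names its destination and operands.
    for op, dst, left, right in program:
        if op == "add":
            registers[dst] = registers[left] + registers[right]
    return registers

# a = b + c on the stack machine takes four instructions.
stack_program = [("push", "b"), ("push", "c"), ("add",), ("store", "a")]

# With b and c already held in registers 1 and 2, one instruction suffices.
register_program = [("add", 0, 1, 2)]

print(run_stack(stack_program, {"b": 2, "c": 3}))
print(run_register(register_program, [0, 2, 3]))
```

The stack version is simpler to implement, but the register version states where each value lives, which is the kind of information register-targeting compiler techniques rely on.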
I/O, Events, Signals, and Threads
Parrot has comprehensive support for I/O, threads, and events. These three systems are interrelated, so we'll treat them together. The systems we talk about in this section are less mature than other parts of the engine, so they may change by the time we roll out the final design and implementation.
Parrot's base I/O system is fully asynchronous I/O with callbacks and per-request private data. Since this is massive overkill in many cases, we have a plain vanilla synchronous I/O layer that your programs can use if they don't need the extra power.
Asynchronous I/O is conceptually pretty simple. Your program makes an I/O request. The system takes that request and returns control to your program, which keeps running. Meanwhile the system works on satisfying the I/O request. When the request is satisfied, the system notifies your program in some way. Since there can be multiple requests outstanding, and you can't be sure exactly what your program will be doing when a request is satisfied, programs that make use of asynchronous I/O can be complex.
Synchronous I/O is even simpler. Your program makes a request to the system and then waits until that request is done. There can be only one request in process at a time, and you always know what you're doing (waiting) while the request is being processed. It makes your program much simpler, since you don't have to do any sort of coordination or synchronization.
The big benefit of asynchronous I/O systems is that they generally have a much higher throughput than a synchronous system. They move data around much faster—in some cases three or four times faster. This is because the system can be busy moving data to or from disk while your program is busy processing data that it got from a previous request.
For disk devices, having multiple outstanding requests—especially on a busy system—allows the system to order read and write requests to take better advantage of the underlying hardware. For example, many disk devices have built-in track buffers. No matter how small a request you make to the drive, it always reads a full track. With synchronous I/O, if your program makes two small requests to the same track, and they're separated by a request for some other data, the disk will have to read the full track twice. With asynchronous I/O, on the other hand, the disk may be able to read the track just once, and satisfy the second request from the track buffer.
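The request, continue, and notify cycle described above can be sketched with Python's `asyncio`. This is an illustration of the general model only, not Parrot's I/O API: the `read_request` function, the callback signature, and the use of `asyncio.sleep` to simulate device latency are all assumptions made for the example.

```python
import asyncio

async def read_request(name, delay, callback, private_data):
    # Stand-in for a real device read: the delay simulates I/O latency.
    await asyncio.sleep(delay)
    # Notify the program, handing back the data supplied with the request.
    callback(name, private_data)

async def main():
    completed = []

    def on_done(name, private_data):
        completed.append((name, private_data))

    # Issue both requests without waiting for either one to finish.
    slow = asyncio.create_task(read_request("slow", 0.05, on_done, {"id": 1}))
    fast = asyncio.create_task(read_request("fast", 0.01, on_done, {"id": 2}))

    # The program keeps running while the requests are outstanding.
    completed.append(("program", "still working"))

    # Wait until both outstanding requests have been satisfied.
    await asyncio.gather(slow, fast)
    return completed

result = asyncio.run(main())
print(result)
```

The requests complete in order of their simulated delays rather than in the order they were issued, which is exactly the coordination burden that a synchronous layer spares simpler programs.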
Objects
Perl 5, Perl 6, Python, and Ruby are all object-oriented languages in some form or other, so Parrot has to have core support for objects and classes. Unfortunately, these languages all have somewhat different object systems, which made designing Parrot's object system tricky. It turns out that if you draw the abstraction lines in the right places, supporting the different systems is straightforward. This is especially true if you provide core support for things like method dispatch that the different object systems can use and override.
Parrot's object system is very simple—in fact, a PMC only has to handle method calls to be considered an object. Just handling methods covers well over 90% of the object functionality that most programs use, since the vast majority of object access is via method calls. This means that user code that does the following:
object = some_constructor(1, 2, "foo");  
object.bar(12);
will work just fine, no matter what language the class that backs object is written in, if object even has a class backing it. It could be Perl 5, Perl 6, Python, Ruby, or even Java, C#, or Common Lisp; it doesn't matter.
Objects may override other functionality as well. For example, Python objects use the basic PMC property mechanism to implement object attributes. Both Python and Perl 6 mandate that methods and properties share the same namespace, with methods overriding properties of the same name.
When we refer to Parrot objects we're really talking about Parrot's default base object type. Any PMC type that implements the method call vtable entry is an object as far as Parrot is concerned, but while that's sufficient to use an object, it's not enough to make the objects actually work.
Parrot's standard object uses a slot-based attribute model. Each object is essentially a small array, with one element per attribute in the object's class and superclasses. Each object carries a directory of which slots are used by which classes for which attributes. This directory lets introspective data browsers display objects at runtime, and it lets attributes be added to objects at runtime.
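A slot-based model of this kind might look roughly like the Python sketch below. The class and method names (`SlotClass`, `SlotObject`, `add_attribute`, `describe`) are hypothetical, and the layout is a guess at the general idea rather than Parrot's actual data structure.

```python
class SlotClass:
    def __init__(self, name, attributes, parent=None):
        self.name = name
        # Start from the parent's directory so superclass slots come first.
        self.directory = dict(parent.directory) if parent else {}
        for attribute in attributes:
            self.directory[(name, attribute)] = len(self.directory)

    def new(self):
        return SlotObject(self)

class SlotObject:
    def __init__(self, cls):
        # Each object carries its own directory: (class, attribute) -> slot.
        self.directory = dict(cls.directory)
        self.slots = [None] * len(self.directory)

    def get(self, class_name, attribute):
        return self.slots[self.directory[(class_name, attribute)]]

    def set(self, class_name, attribute, value):
        self.slots[self.directory[(class_name, attribute)]] = value

    def add_attribute(self, class_name, attribute):
        # Runtime addition: extend this object's directory and slot array.
        self.directory[(class_name, attribute)] = len(self.slots)
        self.slots.append(None)

    def describe(self):
        # Introspection: walk the directory to show every attribute.
        return {key: self.slots[i] for key, i in self.directory.items()}

ship = SlotClass("Ship", ["length"])
gold = SlotClass("HeartGold", ["crew"], parent=ship)

obj = gold.new()
obj.set("Ship", "length", 150)
obj.set("HeartGold", "crew", 4)
obj.add_attribute("HeartGold", "cargo")
obj.set("HeartGold", "cargo", "tea")

print(obj.slots)
print(obj.describe())
```

Keying the directory by class as well as attribute name keeps a subclass attribute from colliding with a superclass attribute of the same name.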
Advanced Features
Since the languages Parrot targets (like Perl and Ruby) have sophisticated concepts as core features, it's in Parrot's best interest to have core support for them. This section covers some (but not all) of these features.
It's expected that modern languages have garbage collection built in. The programmer shouldn't have to worry about explicitly cleaning up after dead variables, or even identifying them. For interpreted languages, this requires support from the interpreter engine, so Parrot provides that support.
Parrot has two separate allocation systems built into it. Each allocation system has its own garbage collection scheme. Parrot also has some strict rules over what can be referenced and from where. This allows it to have a more efficient garbage collection system.
The first allocation system is responsible for PMC and string structures. These are fixed-sized objects that Parrot allocates out of arenas, which are pools of identically sized things. Using arenas makes it easy for Parrot to find and track them, and speeds up the detection of dead objects.
Parrot's dead object detection system works by first running through all the arenas and marking all strings and PMCs as dead. It then runs through the stacks and registers, marking all strings and PMCs they reference as alive. Next, it iteratively runs through all the live PMCs and strings and marks everything they reference as alive. Finally, it sweeps through all the arenas looking for newly dead PMCs and strings, which it puts on the free list. At this point, any PMC that has a custom destruction routine, such as an object with a DESTROY method, has its destruction routine called. The dead object detector is triggered whenever Parrot runs out of free objects, and can be explicitly triggered by running code. Often a language compiler will force a dead object sweep when leaving a block or subroutine.
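The four passes described above amount to a mark-and-sweep collector. The Python sketch below follows the same order of steps over a single arena. It is a simplified illustration: the `Obj` class and `collect` function are invented names, the roots list stands in for the stacks and registers, and destruction routines are left out.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []        # other objects this one references
        self.alive = False

def collect(arena, roots):
    # 1. Run through the arena and mark everything as dead.
    for obj in arena:
        obj.alive = False

    # 2. Start from the roots (standing in for stacks and registers).
    worklist = list(roots)

    # 3. Iteratively mark everything reachable from live objects as alive.
    while worklist:
        obj = worklist.pop()
        if not obj.alive:
            obj.alive = True
            worklist.extend(obj.refs)

    # 4. Sweep the arena: anything still dead goes on the free list.
    survivors = [obj for obj in arena if obj.alive]
    free_list = [obj for obj in arena if not obj.alive]
    return survivors, free_list

a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)
c.refs.append(d)
d.refs.append(c)   # c and d reference each other but nothing reaches them

survivors, free_list = collect([a, b, c, d], roots=[a])
print([obj.name for obj in survivors])
print([obj.name for obj in free_list])
```

Because liveness is decided by reachability from the roots, the mutually referencing pair `c` and `d` is still collected.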
Parrot's memory allocation system is used to allocate space for the contents of strings and PMCs. Allocations don't have a fixed size; they come from pools of memory that Parrot maintains. Whenever Parrot runs out of memory in its memory pools, it makes a compacting run—squeezing out unused sections from the pools. When it's done, one end of each pool is entirely actively used memory, and the other end is one single chunk of free memory. This makes allocating memory from the pools faster, as there's no need to walk a free list looking for a segment of memory large enough to satisfy the request for memory. It also makes more efficient use of memory, as there's less overhead than in a traditional memory allocation system.
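A compacting pool of this sort can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Pool` class, its integer handles, and the fixed-size `bytearray` are all invented for the example, and a real implementation would also have to update every pointer into the pool when blocks move.

```python
class Pool:
    def __init__(self, size):
        self.memory = bytearray(size)
        self.top = 0           # everything at or above top is free
        self.blocks = {}       # handle -> (offset, length) of live blocks
        self.next_handle = 0

    def allocate(self, data):
        # Compact only when the free end of the pool is too small.
        if self.top + len(data) > len(self.memory):
            self.compact()
        if self.top + len(data) > len(self.memory):
            raise MemoryError("pool exhausted")
        # Allocation is a pointer bump: no free list to walk.
        offset = self.top
        self.memory[offset:offset + len(data)] = data
        self.top += len(data)
        handle = self.next_handle
        self.next_handle += 1
        self.blocks[handle] = (offset, len(data))
        return handle

    def free(self, handle):
        # Freed space is not reused until the next compaction.
        del self.blocks[handle]

    def read(self, handle):
        offset, length = self.blocks[handle]
        return bytes(self.memory[offset:offset + length])

    def compact(self):
        # Slide live blocks toward the start of the pool, in address order.
        new_top = 0
        by_offset = sorted(self.blocks.items(), key=lambda item: item[1][0])
        for handle, (offset, length) in by_offset:
            self.memory[new_top:new_top + length] = self.memory[offset:offset + length]
            self.blocks[handle] = (new_top, length)
            new_top += length
        self.top = new_top

pool = Pool(8)
first = pool.allocate(b"abcd")
second = pool.allocate(b"ef")
pool.free(first)
third = pool.allocate(b"ghij")   # does not fit until the pool is compacted

print(pool.read(second), pool.read(third), pool.top)
```

After compaction the live data sits at one end of the pool and the free space forms a single chunk at the other, so the next allocation only has to advance `top`.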
Conclusion
We've touched on much of Parrot's core functionality, but certainly not all. Hopefully we've given you enough of a feel for how Parrot works to expand your knowledge with the Parrot documentation and source.
Chapter 6: Parrot Assembly Language
Owner: Sorry squire, I've had a look 'round the back of the shop, and uh, we're right out of parrots.
Customer: I see. I see, I get the picture.
Owner: <pause> I got a slug.
—Monty Python's Flying Circus, "Parrot Sketch"
Parrot assembly (PASM) is an assembly language written for Parrot's virtual CPU. PASM has an interesting mix of features. Because it's an assembly language, it has many low-level features, such as flow control based on branches and jumps, and direct manipulation of values on the software registers and stacks. Basic register operations or branches are generally a single CPU instruction. On the other hand, because it's designed to implement dynamic high-level languages, it has support for many advanced features, such as lexical and global variables, objects, garbage collection, continuations, coroutines, and much more.
Getting Started
The first step before you start playing with PASM code is to get a copy of the source code and compile it. There is some information on this in Section 2.2.2.1. For more information and updates, see http://www.parrotcode.org and the documentation in the distributed code.
The basic steps are:
$ perl Configure.pl
$ make
$ make test
With versions of Parrot later than 0.0.10, you can speed up the testing process significantly by compiling IMCC first (see Section 7.1) and running the tests with IMCC instead of the Parrot assembler:
$ make test IMCC=languages/imcc/imcc
Once you've compiled Parrot, create a small test file in the main parrot directory. We'll call it fjord.pasm.
print "He's pining for the fjords.\n"
end
.pasm is the standard extension for Parrot assembly language source files. Compile it to bytecode, using assemble.pl:
$ ./assemble.pl fjord.pasm --output fjord.pbc
You specify the name of the output bytecode file with the --output (or -o) switch. .pbc is the standard extension for Parrot bytecode. Finally, run the compiled bytecode file through the parrot interpreter:
$ ./parrot fjord.pbc