Computer Science & Perl Programming by Jon Orwant


Chapter 69. Lexical Analysis

Chip Salzenberg

It’s been said that “the only program that can parse Perl is perl.” If you’ve ever tried to get a smart editor like Emacs to properly indent your Perl program, you’ll probably agree. And while Ilya Zakharevich has made great strides with cperl-mode.el, Perl’s syntax is still more complex and exception-ridden than most.

Now, ask yourself: given that Perl’s syntax is riddled with oddities, exceptions, and attempts to do what you mean instead of what you say, what bizarre twists and turns must a program take to understand it? You’re about to find out.

Tokenizing

Lexical analysis consists of turning a source file—a single unbroken stream of characters—into discrete units, called tokens. (That’s why lexical analysis is often called tokenizing.) Tokens are the fundamental units of a programming language. Typical tokens are identifiers like foo, literal strings like "bar", and operator names or symbols like print or +.
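As a rough illustration of the idea (not of how Perl itself does it), the following Python sketch splits a line of toy source text into identifiers, string literals, and symbols. The token categories, the `tokenize` function name, and the regular expressions are all assumptions made for this example; real Perl tokenizing is far more context-sensitive than a fixed table of patterns.

```python
import re

# Hypothetical token categories for a toy language; Perl's real rules are far messier.
TOKEN_SPEC = [
    ("STRING", r'"(?:\\.|[^"\\])*"'),   # double-quoted literal with backslash escapes
    ("IDENT",  r"[A-Za-z_]\w*"),        # identifiers and named operators such as print
    ("NUMBER", r"\d+"),                 # unsigned integer literal
    ("SYMBOL", r"[-+*/=;(),$]"),        # single-character operators and punctuation
    ("SPACE",  r"\s+"),                 # whitespace separates tokens but is not one
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a stream of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('print "bar" + foo;'):
        print(token)
```

Note that this sketch files `print` under the same category as `foo`. Telling a named operator apart from a user-defined identifier needs extra knowledge that a simple pattern table does not have.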

The next stage after lexical analysis, called parsing, takes those tokens and, based on their context, figures out what they mean. After all, foo might be a subroutine name, a filehandle, or even a variable name if it follows a dollar sign. The full glory of parsing is discussed in Chapter 21.
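To make the role of context concrete, here is a small Python sketch that labels each identifier by looking only at the token before it. The `(kind, text)` token format, the `classify_identifiers` name, and the two labels are assumptions for illustration; a real parser weighs far more context than one preceding token.

```python
def classify_identifiers(tokens):
    """Label each IDENT token using the token that immediately precedes it."""
    labels = []
    previous = None
    for kind, text in tokens:
        if kind == "IDENT":
            if previous == ("SYMBOL", "$"):
                # An identifier right after a dollar sign names a variable.
                labels.append((text, "variable name"))
            else:
                # Could still be a subroutine or a filehandle; one token
                # of lookbehind is not enough to decide.
                labels.append((text, "subroutine or filehandle"))
        previous = (kind, text)
    return labels


if __name__ == "__main__":
    # Hand-written tokens for the toy input:  $foo = foo;
    tokens = [
        ("SYMBOL", "$"), ("IDENT", "foo"), ("SYMBOL", "="),
        ("IDENT", "foo"), ("SYMBOL", ";"),
    ]
    for name, meaning in classify_identifiers(tokens):
        print(name, "->", meaning)
```

The same spelling, foo, gets two different labels in one statement, which is why meaning cannot be assigned from the characters alone.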

Lexical analysis of Perl is a seriously hairy job, and toke.c contains some seriously hairy code. (Like the rest of Perl, the tokenizer is written in C.) You’d probably find an exhaustive treatment of its ins and outs to be, well, exhausting. ...
