Molecules

Perl is a free-form language, but that doesn't mean that Perl is totally free of form. As computer folks usually use the term, a free-form language is one in which you can put spaces, tabs, and newlines anywhere you like--except where you can't.

One obvious place you can't put a whitespace character is in the middle of a token. A token is what we call a sequence of characters with a unit of meaning, much like a simple word in natural language. But unlike the typical word, a token might contain other characters besides letters, just as long as they hang together to form a unit of meaning. (In that sense, they're more like molecules, which don't have to be composed of only one particular kind of atom.) For example, numbers and mathematical operators are considered tokens. An identifier is a token that starts with a letter or underscore and contains only letters, digits, and underscores. A token may not contain whitespace characters because this would split the token into two tokens, just as a space in an English word turns it into two words.[2]

Although whitespace is allowed between any two tokens, whitespace is required only between tokens that would otherwise be confused as a single token. All whitespace is equivalent for this purpose. Newlines are distinguished from spaces and tabs only within quoted strings, formats, and certain line-oriented forms of quoting. Specifically, newlines do not terminate statements as they do in certain other languages (such as FORTRAN ...

Get Programming Perl, 3rd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.