Molecules
Perl is a free-form language, but that doesn’t mean that Perl is totally free of form. As computer folks usually use the term, a free-form language is one in which you can put spaces, tabs, and newlines anywhere you like—except where you can’t.
One obvious place you can’t put a whitespace character is in the middle of a token. A token is what we call a sequence of characters with a unit of meaning, much like a simple word in natural language. But unlike the typical word, a token might contain other characters besides letters, just as long as they hang together to form a unit of meaning. (In that sense, they’re more like molecules, which don’t have to be composed of only one particular kind of atom.) For example, numbers and mathematical operators are considered tokens. An identifier is a token that starts with an alphabetic character (typically a letter) or connector punctuation like an underscore and contains only alphabetics, combining marks, digits, and underscores. A token may not contain whitespace characters because this would split the token into two tokens, just as a space in an English word turns it into two words.[37]
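The rule that a token cannot contain whitespace can be checked with a small sketch. The helper `compiles` below is hypothetical (it is not part of Perl or of the text above); it asks Perl to compile a snippet without running it and reports whether that worked.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the snippet compiles, 0 otherwise.
# Wrapping the snippet in an anonymous sub compiles it without running it.
sub compiles {
    my ($src) = @_;
    # Some perls print an extra diagnostic for these errors; silence it here.
    local $SIG{__WARN__} = sub { };
    return defined( eval "sub { $src }" ) ? 1 : 0;
}

# Intact tokens: an identifier containing an underscore and a digit,
# and the "+=" operator.
print compiles('my $total_2 = 1; $total_2 += 1;'), "\n";   # prints 1

# A space inside the identifier splits it into two tokens.
print compiles('my $total _2 = 1;'), "\n";                 # prints 0

# A space inside the operator splits "+=" into "+" and "=".
print compiles('my $n = 1; $n + = 1;'), "\n";              # prints 0
```

In both failing snippets every character is still present. Only the space has turned one unit of meaning into two.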
Although whitespace is allowed between any two tokens, whitespace is required only between tokens that would otherwise be confused as a single token. All whitespace is equivalent for this purpose. Newlines are distinguished from spaces and tabs only within quoted strings, formats, and certain line-oriented forms of quoting. Specifically, newlines ...
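As a rough illustration of the whitespace rule, the sketch below writes the same loop twice. The variable names are made up for the example. In the dense form, the only whitespace is the one space that keeps `foreach` and `my` from running together into a single identifier. The spread form separates the same tokens with a mix of spaces, tabs, and newlines.

```perl
use strict;
use warnings;

# Dense form: the one required space sits between "foreach" and "my".
my@list=(1,2,3);my$dense=0;foreach my$n(@list){$dense+=$n}

# Spread form: the same tokens, separated by spaces, tabs, and newlines.
my $spread
    =
    0 ;
foreach
    my $n ( @list )
{
    $spread	+=	$n
}

print "$dense $spread\n";   # both sums are 6
```

Both loops compute the same sum, since outside of quoted strings Perl does not care which kind of whitespace separates two tokens, or how much of it there is.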