Basic Concepts

A Perl program is a bunch of Perl statements and definitions thrown into a file. You can execute the file by invoking the Perl interpreter with the script name as an argument. You will often see a line

#!/usr/bin/perl

as the first line of a Perl script. This line is a bit of magic employed by UNIX-like operating systems to automatically execute interpreted languages with the correct command interpreter. This line is called a shebang line due to the first two characters: # is sometimes called sharp, and ! is sometimes called bang. This line normally won’t work for Perl-for-Win32 users,[5] although it doesn’t hurt anything since Perl sees lines beginning with # as comments.

The invocation examples that follow assume that you have invoked the Windows NT command interpreter (cmd.exe) and are typing into a console window. You can run Perl scripts from the Explorer or the File Manager (assuming that you’ve associated the script extension with the Perl interpreter) by double-clicking on the script icon to launch it. Throughout this book, we’re going to be discussing standard output and input streams; these are generally assumed to be your console window.

We recommend naming scripts with a .plx extension. Traditionally, Perl modules have a .pm extension, and Perl libraries have a .pl extension. The ActiveState installer prompts you to associate .pl with the interpreter.

You can always execute a script by calling the Perl interpreter with the script as an argument:

> perl myscript.plx
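For concreteness, here is a minimal sketch of what a file saved as myscript.plx might contain. The greeting text and variable name are made up for illustration; the shebang line is included only to show where it goes.

```perl
#!/usr/bin/perl
# The shebang line above is harmless on Win32; it matters mainly on UNIX-like systems.
use strict;
use warnings;

# Hypothetical message, used only for illustration.
my $greeting = "Hello from myscript.plx";
print "$greeting\n";
```

Running it with the command above should print the greeting to the console window.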

You can also associate files with the .plx extension (or another of your choosing) with the Perl interpreter, so that executing

> myscript.plx

will correctly invoke the Perl interpreter and execute your script. This step is normally done for you by the ActiveState installation script[6] for the .pl extension, but if you wish to change the extension or if you’ve got the standard distribution, you can do this step manually. If you’re using Windows NT 4.0 (or greater), the following commands will do the trick (use the full path to your interpreter):

> assoc .plx=Perl
> ftype Perl=c:\myperl\bin\perl.exe %1 %*

If you can’t bear the thought of typing the extension every time you execute a Perl script, you can set the PATHEXT environment variable so that it includes Perl scripts. For example:

> set PATHEXT=%PATHEXT%;.PLX

This setting will let you type

> myscript

without including the file extension. Take care when setting PATHEXT permanently—it also includes executable file types like .COM, .EXE, .BAT, and .CMD. If you inadvertently lose those extensions, you’ll have difficulty invoking applications and script files.
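One way to guard against losing the standard extensions is to check PATHEXT from a small Perl script after changing it. The following is only a sketch: the helper name pathext_has and the fallback sample value are assumptions made for this example, not part of any Perl distribution.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $ext appears in a semicolon-separated
# PATHEXT-style list (the comparison ignores case).
sub pathext_has {
    my ($pathext, $ext) = @_;
    my $wanted = lc $ext;
    return scalar grep { lc($_) eq $wanted } split /;/, $pathext;
}

# Use the real PATHEXT when it is set; otherwise fall back to a sample value.
my $pathext = defined $ENV{PATHEXT} ? $ENV{PATHEXT} : ".COM;.EXE;.BAT;.CMD";

# Report on the standard extensions plus the one added for Perl scripts.
for my $ext (".COM", ".EXE", ".BAT", ".CMD", ".PLX") {
    my $status = pathext_has($pathext, $ext) ? "present" : "MISSING";
    print "$ext: $status\n";
}
```

If any of the standard extensions is reported as MISSING, the PATHEXT setting probably replaced the old value instead of appending to it.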

Perl is mostly a free-format language like C—whitespace between tokens (elements of the program, like print or +) is optional, unless two tokens placed together can be mistaken for another token, in which case whitespace of some kind is mandatory. (Whitespace consists of spaces, tabs, newlines, returns, or formfeeds.) A few constructs require a certain kind of whitespace in a certain place, but they’ll be pointed out when we get to them. You can assume that the kind and amount of whitespace between tokens is otherwise arbitrary.
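As a small illustration of the free-format rule, the first two assignments below are the same expression with very different spacing, and the third shows a spot where whitespace is required. The variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical variables, used only to illustrate the layout rules.
my $compact=1+2*3;          # no optional whitespace at all
my $spaced  =  1
             + 2
             * 3 ;          # the same expression spread over three lines

# Whitespace is mandatory between $compact and the lt operator:
# without it, Perl would read a single variable named $compactlt.
my $less = $compact lt "8" ? "yes" : "no";

print "$compact $spaced $less\n";    # prints: 7 7 yes
```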

Although many interesting Perl programs can be written on one line, typically a Perl program is indented much like a C program, with nested parts of statements indented more than the surrounding parts. You’ll see plenty of examples showing a typical indentation style throughout this book.

Just like a batch file, a Perl program consists of all of the Perl statements of the file taken collectively as one big routine to execute. Perl has no concept of a “main” routine as in C.
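A short sketch may make this concrete. Execution starts at the first statement in the file and runs to the last; a subroutine definition along the way is only a definition and does nothing until it is called. The names @steps and helper are invented for this example.

```perl
use strict;
use warnings;

# Execution simply starts at the first statement in the file.
my @steps;
push @steps, "first statement";

# A subroutine definition is not executed until it is called.
sub helper {
    push @steps, "inside helper";
}

push @steps, "second statement";
helper();
push @steps, "last statement";

# Show the order in which things actually ran.
print "$_\n" for @steps;
```

The output lists the first statement, the second statement, the helper call, and the last statement, in that order.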

Perl comments are single-line comments (like REM in a batch file or // in a C++ or Java file). Anything from an unquoted pound sign (#) to the end-of-line is a comment. There are no C-like multiline comments.
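The sketch below shows both comment placements, plus a pound sign inside quotes, which is ordinary text and does not start a comment. The variable names are hypothetical.

```perl
use strict;
use warnings;

# A whole-line comment.
my $count = 3;    # a trailing comment after a statement

# A pound sign inside quotes is ordinary text, not a comment.
my $label = "item #$count";

print "$label\n";    # prints: item #3
```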

Unlike the command shell, the Perl interpreter completely parses and compiles the program before executing any of it. This means that a syntax error in the script is reported before the program starts rather than partway through a run (code compiled at run time, such as a string passed to eval, is the exception), and that the whitespace and comments simply disappear and won’t slow the program down. In fact, this compilation phase ensures the rapid execution of Perl operations once execution starts, and weakens the case for choosing C as a systems utility language merely on the grounds that C is compiled.
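The parse-everything-first behavior can be observed from inside a script. In the sketch below, eval on a string serves only as a convenient stand-in for running a separate script file: the string holds one valid statement followed by a syntax error, and because the whole string is compiled before any of it runs, the valid statement never executes. The variable names and the sample program text are assumptions made for this example.

```perl
use strict;
use warnings;

my @log;

# Hypothetical program text: one valid statement, then a syntax error.
my $source = 'push @log, "first statement ran"; my $oops = ;';

# eval compiles the entire string before running any of it, so the
# syntax error is reported and the valid push never executes.
my $result = eval $source;
my $error  = $@;

print "compile failed: ", (defined $result ? "no" : "yes"), "\n";
print "statements run: ", scalar(@log), "\n";
```

A command shell running the equivalent batch file would have executed the first command before tripping over the second.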

This compilation does take time—it’s inefficient to have a voluminous Perl program that does one small quick task (out of many potential tasks) and then exits, because the run-time for the program will be dwarfed by the compile time.

So, Perl is like a compiler and an interpreter. It’s a compiler because the program is completely read and parsed before the first statement is executed. It’s an interpreter because no object code sits around filling up disk space. In some ways, it’s the best of both worlds. Admittedly, caching the compiled object code between invocations, or even translating it into native machine code, would be nice. A working version of such a compiler already exists, and is currently scheduled to be bundled into the 5.005 release. See the Perl FAQ for the current status.

Documentation

Throughout this book, we’ll refer to the documentation included with the Perl distributions. The ActiveState port comes with documentation in HTML format; you can find it in the /docs subdirectory of the distribution. When we refer to the documentation, we’ll just refer to the base name of the file without the extension. For example, if we refer to perlfunc, we really mean /docs/Perl/perlfunc.html. Win32-specific documentation is located in the /docs/Perl-Win32 subdirectory, so a reference to win32ext really refers to /docs/Perl-Win32/win32ext.html.

If you have the standard 5.004 distribution, you can use the perldoc command from the command line. perldoc is a batch file wrapper around a Perl script, found in the /bin directory of the distribution. perldoc lets you view documentation pages or module documentation by invoking it as follows:

> perldoc perlfunc

perldoc extracts the documentation from the Perl POD (plain old documentation) format found in the /pod subdirectory of the distribution. If all else fails, you can just read the pod files with your favorite text editor.



[5] However, there are Win32 ports of UNIX shells (e.g., tcsh, ksh, and bash) that do understand shebang lines. If you’re using one of these shells, you can use shebang lines by specifying the path to your Perl interpreter.

[6] This statement is not true if you’re using Windows 95, in which case you’ll have to do the whole thing manually. From an Explorer window, go to View/Options/File Types and add a new type with the .pl extension and the path to the Perl interpreter.
