How Do I Make a Perl Program?

It’s about time you asked (even if you didn’t). Perl programs are text files; you can create and edit them with your favorite text editor. (You don’t need any special development environment, although there are some commercial ones available from various vendors. We’ve never used any of these enough to recommend them.)

You should generally use a programmers’ text editor, rather than an ordinary editor. What’s the difference? Well, a programmers’ text editor will let you do things that programmers need, like indenting or unindenting a block of code, or finding the matching closing curly brace for a given opening curly brace. On Unix systems, the two most popular programmers’ editors are emacs and vi (and their variants and clones). BBEdit and TextMate are good editors for Mac OS X, and a lot of people have said nice things about UltraEdit and PFE (Programmer’s Favorite Editor) on Windows. The perlfaq2 manpage lists several other editors, too. Ask your local expert about text editors on your system.

For the simple programs you’ll write for the exercises in this book, none of which should be more than about 20 or 30 lines of code, any text editor will be fine.

Some beginners try to use a word processor instead of a text editor. We recommend against this—it’s inconvenient at best and impossible at worst. But we won’t try to stop you. Be sure to tell the word processor to save your file as “text only”; the word processor’s own format will almost certainly be unusable. Most word processors will probably also tell you that your Perl program is spelled incorrectly and should use fewer semicolons.

In some cases, you may need to compose the program on one machine, then transfer it to another to run it. If you do this, be sure that the transfer uses “text” or “ASCII” mode, and not “binary” mode. This step is needed because of the different text formats on different machines. Without it, you may get inconsistent results—some versions of Perl actually abort when they detect a mismatch in the line endings.
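If a file has already arrived with the wrong line endings, a few lines of Perl can repair it. What follows is only a sketch under our own assumptions: the helper name normalize_newlines is made up for this illustration, and it treats every carriage return as part of a line ending.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert Windows (CR LF) and old Mac (CR) line
# endings to the Unix newline (LF).
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;
    return $text;
}

# A line of Perl source as it might arrive from a Windows machine.
my $from_windows = "print \"Hello, world!\\n\";\r\n";
my $fixed        = normalize_newlines($from_windows);

my $status = $fixed =~ /\r/ ? "still has carriage returns" : "Unix line endings only";
print "$status\n";
```

A transfer done in text mode makes this repair unnecessary, so treat it as a fallback rather than a routine step.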

A Simple Program

According to the oldest rule in the book, any book about a computer language that has Unix-like roots has to start with showing the “Hello, world” program. So, here it is in Perl:

#!/usr/bin/perl
print "Hello, world!\n";

Let’s imagine that you’ve typed that into your text editor. (Don’t worry yet about what the parts mean and how they work. We’ll see about those in a moment.) You can generally save that program under any name you wish. Perl doesn’t require any special kind of filename or extension, and it’s better not to use an extension at all.[*] But some systems may require an extension like .plx (meaning PerL eXecutable); see your system’s release notes for more information.

You may also need to do something so that your system knows it’s an executable program (that is, a command). What you’ll do depends upon your system; maybe you won’t have to do anything more than save the program in a certain place. (Your current directory will generally be fine.) On Unix systems, you mark a program as being executable using the chmod command, perhaps like this:

$ chmod a+x my_program

The dollar sign (and space) at the start of the line represents the shell prompt, which will probably look different on your system. If you’re used to using chmod with a number like 755 instead of a symbolic parameter like a+x, that’s fine too, of course. Either way, it tells the system that this file is now a program.
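Perl has a built-in chmod of its own, which takes the numeric form of the permissions. The sketch below assumes a Unix-like system and uses a temporary file as a stand-in for my_program, so it won’t touch any of your real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file to stand in for my_program.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "#!/usr/bin/perl\nprint \"Hello, world!\\n\";\n";
close $fh;

# 0755 is an octal number; the leading zero matters.
chmod 0755, $filename or die "Cannot chmod $filename: $!";

# Keep only the permission bits and show them the way chmod expects them.
my $mode = (stat $filename)[2] & 07777;
printf "%s now has mode %04o\n", $filename, $mode;
```

Writing 755 without the leading zero would hand Perl a decimal number, which is a different set of permissions entirely.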

Now you’re ready to run it:

$ ./my_program

The dot and slash at the start of this command mean to find the program in the current working directory. That’s not needed in all cases, but you should use it at the start of each command invocation until you fully understand what it’s doing.[†] If everything worked, it’s a miracle. More often, you’ll find that your program has a bug. Edit and try again, but you don’t need to use chmod each time, as that should “stick” to the file. (Of course, if the bug is that you didn’t use chmod correctly, you’ll probably get a “permission denied” message from your shell.)

There’s another way to write this simple program in Perl 5.10, and we might as well get that out of the way right now. Instead of print, we use say, which does almost the same thing, but with less typing. Since it’s a new feature and you might not be using Perl 5.10 yet, we include a use 5.010 statement that tells Perl that we used new features:

#!/usr/bin/perl

use 5.010;

say "Hello World!";

This program runs only under Perl 5.10 or later. When we introduce Perl 5.10 features in this book, we’ll explicitly say they are new features in the text and include that use 5.010 statement to remind you. Perl actually thinks about the minor version as a three-digit number, so make sure that you say use 5.010 and not use 5.10 (which Perl thinks is 5.100, a version we definitely don’t have yet!).
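To see why the extra zero matters, here is a small sketch that spells out a decimal version number three digits at a time. The helper decimal_to_dotted is our own invention for this illustration and is not part of Perl; the special variable $] at the end is real and holds the version of the perl that is running.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: spell out a decimal version number the way Perl
# reads it, three digits per component after the decimal point.
sub decimal_to_dotted {
    my ($decimal) = @_;
    my ($major, $fraction) = split /\./, $decimal, 2;
    $fraction = '' unless defined $fraction;
    # Pad on the right until there are two full three-digit groups.
    $fraction .= '0' while length($fraction) < 6 || length($fraction) % 3;
    my @groups = map { $_ + 0 } $fraction =~ /(\d{3})/g;
    return join '.', $major, @groups;
}

print decimal_to_dotted('5.010'), "\n";   # 5.10.0
print decimal_to_dotted('5.10'),  "\n";   # 5.100.0

# $] is the version of the running perl in the same decimal form.
print "This perl reports itself as ", $], "\n";
```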

What’s Inside That Program?

Like other “free-form” languages, Perl generally lets you use insignificant whitespace (like spaces, tabs, and newlines) at will to make your program easier to read. Most Perl programs use a fairly standard format, though, much like most of what we show here. We strongly encourage you to properly indent your programs, as that makes your program easier to read; a good text editor will do most of the work for you. Good comments also make a program easier to read. In Perl, comments run from a pound sign (#) to the end of the line. (There are no “block comments” in Perl.[*]) We don’t use many comments in the programs in this book because the surrounding text explains their workings, but you should use comments as needed in your own programs.

So another way (a very strange way, it must be said) to write that same “Hello, world” program might be like this:

#!/usr/bin/perl
    print    # This is a comment
"Hello, world!\n"
  ;    # Don't write your Perl code like this!

That first line is actually a very special comment. On Unix systems,[†] if the very first two characters on the first line of a text file are #!, what follows is the name of the program that actually executes the rest of the file. In this case, the program is stored in the file /usr/bin/perl.

This #! line is actually the least portable part of a Perl program because you’ll need to find out what goes there for each machine. Fortunately, it’s almost always either /usr/bin/perl or /usr/local/bin/perl. If that’s not it, you’ll have to find where your system is hiding perl, then use that path. On Unix systems, you might use a shebang line that finds perl for you:

#!/usr/bin/env perl

If perl is not in any of the directories in your search path, you might have to ask your local system administrator or somebody using the same system as you.
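If you can start perl by name at all, perl itself can tell you where it lives. This sketch assumes you save it as my_program and run it as perl my_program, which works even when the #! line is wrong; the variable name $interpreter is ours.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;

# $^X holds the name that was used to start the perl running this program.
my $interpreter = $^X;
print "Running under: $interpreter\n";

# %Config records where this perl was installed when it was built.
print "Installed as:  $Config{perlpath}\n";
```

Either path is a reasonable candidate for the #! line on that machine, though on some systems $^X may be a relative name rather than a full path.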

On non-Unix systems, it’s traditional (and even useful) to make the first line say #!perl. If nothing else, it tells your maintenance programmer as soon as he gets ready to fix it that it’s a Perl program.

If that #! line is wrong, you’ll generally get an error from your shell. This may be something unexpected, like “file not found.” It’s not your program that’s not found, though; it’s /usr/bin/perl that wasn’t where it should have been. We’d make the message clearer, but it’s not coming from Perl; it’s the shell that’s complaining. (By the way, you should be careful to spell it usr and not user—the folks who invented Unix were lazy typists, so they omitted a lot of letters.)

Another problem you could have is that your system doesn’t support the #! line at all. In that case, your shell (or whatever your system uses) will probably try to run your program all by itself, with results that may disappoint or astonish you. If you can’t figure out what some strange error message is telling you, search for it in the perldiag manpage.

The “main” program consists of all of the ordinary Perl statements (not including anything in subroutines, which you’ll see later). There’s no “main” routine, as there is in languages like C or Java. In fact, many programs don’t even have routines (in the form of subroutines).

There’s also no required variable declaration section, as there is in some other languages. If you’ve always had to declare your variables, you may be startled or unsettled by this at first. But it allows us to write “quick-and-dirty” Perl programs. If your program is only two lines long, you don’t want to have to use one of those lines just to declare your variables. If you really want to declare your variables, that’s a good thing; you’ll see how to do that in Chapter 4.

Most statements are an expression followed by a semicolon. Here’s the one you’ve seen a few times so far:

print "Hello, world!\n";

As you may have guessed by now, this line prints the message Hello, world!. At the end of that message is the shortcut \n, which is probably familiar to you if you’ve used another language like C, C++, or Java; it means a newline character. When that’s printed after the message, the print position drops down to the start of the next line, allowing the following shell prompt to appear on a line of its own, rather than being attached to the message. Every line of output should end with a newline character. We’ll see more about the newline shortcut and other so-called backslash escapes in the next chapter.
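A quick experiment shows what the newline does and what happens without it. This is just a sketch; the variable name $message is ours, and it uses a scalar variable and the length function a little ahead of their proper introduction.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $message = "Hello, world!\n";   # ends with a newline
print $message;

# Without a trailing newline, the next output continues on the same line.
print "No newline after this,";
print " so this lands right beside it.\n";

# The two characters \ and n in the source become one character in the string.
print "The message is ", length($message), " characters long.\n";   # 14
```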

How Do I Compile Perl?

Just run your Perl program. The perl interpreter compiles and then runs your program in one user step:

$ perl my_program

When you run your program, Perl’s internal compiler first runs through your entire source, turning it into internal bytecode, which is an internal data structure representing the program. Perl’s bytecode engine takes over and actually runs the bytecode. If there’s a syntax error on line 200, you’ll get that error message before you start running line 2.[*] If you have a loop that runs 5000 times, it’s compiled just once; the actual loop can then run at top speed. And there’s no runtime penalty for using as many comments and as much whitespace as you need to make your program easy to understand. You can even use calculations involving only constants, and the result is a constant computed once as the program is beginning—not each time through a loop.
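You can watch the two phases happen with a BEGIN block, which Perl runs during compilation. The following is a minimal sketch; the names @order and $seconds_per_day are ours.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @order;   # a package variable, so the BEGIN block below can see it

push @order, 'run time';                # an ordinary statement
BEGIN { push @order, 'compile time' }   # runs as soon as perl has parsed it

# An expression made only of constants is worked out once, during compilation.
my $seconds_per_day = 60 * 60 * 24;

print join(' then ', @order), "\n";     # compile time then run time
print "$seconds_per_day seconds in a day\n";
```

Even though the BEGIN block comes second in the file, its push happens first, because the whole file is compiled before any ordinary statement runs.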

To be sure, this compilation does take time. It’s inefficient to have a voluminous Perl program that does one small quick task (out of many potential tasks, say) and then exits, because the compile time will dwarf the runtime. But the compiler is very fast; normally the compilation will be a tiny percentage of the runtime.

An exception might be if you were writing a program run as a CGI script, where it may be called hundreds or thousands of times every minute. (This is a very high usage rate. If it were called a few hundred or thousand times per day, like most programs on the Web, we probably wouldn’t worry too much about it.) Many of these programs have very short runtimes, so the issue of recompilation may become significant. If this is an issue for you, you’ll want to find a way to keep your program in memory between invocations. The mod_perl extension to the Apache web server (http://perl.apache.org) or Perl modules like CGI::Fast can help you.

What if you could save the compiled bytecode to avoid the overhead of compilation? Or, even better, what if you could turn the bytecode into another language, like C, and then compile that? Well, both of these things are possible in some cases, but they probably won’t make most programs any easier to use, maintain, debug, or install, and they may even make your program slower. Perl 6 should do a lot better in this regard, although it is too soon to tell (as we write this).



[*] Why is it better to have no extension? Imagine that you’ve written a program to calculate bowling scores and you’ve told all of your friends that it’s called bowling.plx. One day you decide to rewrite it in C. Do you still call it by the same name, implying that it’s still written in Perl? Or do you tell everyone that it has a new name? (And don’t call it bowling.c, please!) The answer is that it’s none of their business what language it’s written in, if they’re merely using it. So, it should have simply been called bowling in the first place.

[†] In short, it’s preventing your shell from running another program (or shell built-in) of the same name. A common mistake among beginners is to name their first program test. Many systems already have a program (or shell built-in) with that name; that’s what the beginners run instead of their program.

[*] But there are a number of ways to fake them. See the FAQ (accessible with perldoc perlfaq on most installations).

[†] Most modern ones, anyway. The “shebang” mechanism[‡] was introduced somewhere in the mid-1980s, and that’s pretty ancient, even on the extensively long Unix timeline.

[‡] Pronounced “shebang,” as in “the whole shebang.”

[*] Unless line two happens to be a compile-time operation, like a BEGIN block or a use invocation.
