Perl for Web Site Management
By John Callender
October 2001
Pages: 528



Table of Contents

Chapter 1: Getting Your Tools in Order
As I explained in the Preface, this book is intended for readers who have some experience creating web sites and now want to move to the next level, using the Perl programming language to create larger, more useful sites. In this first chapter, though, we won't be discussing Perl. Instead, we'll talk about some of the other things you'll be using as you make that transition: a good hosting provider, the Unix shell environment, and a text editor for writing your programming code.
If you were setting out to climb a mountain (or even to take a short hike in the backcountry), you'd want to make sure you had everything squared away before you left the trailhead. That's what this chapter is about: getting you set up properly with gear and supplies before you head into the wilderness.
If you already have a web-hosting provider you're happy with, are familiar with using an ssh or Telnet client to log into a Unix shell session, and have a programmer's text editor you're happy with, congratulations! You should probably skip right to the next chapter, where I introduce the Perl programming language. Otherwise, read on.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Open Source Versus Proprietary Software
I assume in this book that you work in a Windows or Macintosh environment for your day-to-day computing. Much of the material I present, however, focuses on learning to work in a very different environment, that of the Unix and Unix-like systems upon which much of the Internet has been built. In particular, I focus on learning to work with Linux, a Unix-like operating system that is especially popular with Internet service providers (among others) for running web servers.
Learning to work with Linux means learning to work in the world of open source software. Open source software (http://www.opensource.org/) is a relatively new name for a relatively old phenomenon. It refers to software whose original instructions, or source code, are made freely available to anyone who uses that software. The opposite of open source is proprietary (or closed source) software, in which users have access only to the compiled binary form of the program—that is, the executable that actually runs on the computer.
Advocates of the open source development model claim many advantages for this approach. Open source developers are part of a voluntary, collaborative effort. Their goal is the creation of simple, flexible tools that are as useful as possible. Standards are open, and interfaces well documented. New features are added according to genuine need. Releases come early and often; bugs are identified and fixed in a matter of days, or even hours.
With proprietary software, say the open source crowd, one is dependent on a closed team of programmers who labor out of sight, insulated from the healthy effects of peer review and massively parallel debugging. Direction is determined by marketing committee. The goal is the maximization of profit, with user needs a sometimes-distant second. Standards that promote flexibility and user choice may be ignored, or even actively subverted.
At this point you may be thinking, who cares? I know I used to think that way. In my former existence as a nonprogramming user of DOS (and later, Windows and Macintosh computers), I dealt strictly with closed source software. Much of it was commercial software, some of it was shareware or freeware, but none of it came with source code included, and frankly, I didn't care. I didn't know how to program, so I wouldn't have known how to modify it even if I'd had the source code, and I didn't have a compiler to turn modified source code into an executable, anyway, so what was the point?
Evaluating a Hosting Provider
I'm assuming in this book that you are already experienced with HTML and with the basics of setting up a web site. That means you probably already have a hosting arrangement of some sort: a web server either within your own company or at an outside Internet Service Provider (ISP). To keep things simple, I'm going to assume that you're hosting with an outside ISP, but in a large company it is quite possible that your "provider" will be an internal company department. This is especially likely if you are developing content for a company intranet, rather than the public Internet. It really doesn't matter, though. The only question is, can your hosting partner, wherever that partner is located, deliver what you need?
As your web efforts grow more ambitious, you need to think carefully about your hosting arrangements. One of the things I've learned as I've gone from maintaining a personal home page to maintaining multiple commercial sites is the number of ways in which a site can outgrow its hosting environment. After a fair amount of (sometimes painful) experience, I've come up with the following list of attributes that I now look for when evaluating a hosting provider for my own web projects.
The first four items in the following list (Unix environment, shell access, cron capability, and CGI scripting/server-side includes) represent the minimum needed to implement the examples in this book. The remaining items may be more or less important depending on what you're trying to do.
Unix environment
The examples in this book assume that your web server is running Unix (loosely defined); specifically, the open source variant of Unix called Linux. Linux web servers are quite popular with ISPs because they deliver powerful, stable performance at a minimal cost in terms of hardware and software.
Shell access
Web Hosting Alternatives
The following is a rough hierarchy of web hosting alternatives, arranged from lowest cost (and capabilities) to highest.
A number of providers offer free web space in return for things like the ability to run ad banners on your site. While they may be suitable for a low-end site, such arrangements are unlikely to offer things like shell access, CGI scripting, or server-side includes. Most also limit disk space fairly severely, requiring you to pay if you want to use more. For the examples in this book, such free web hosts probably will not be sufficient.
Many Internet providers offer personal web space as part of a basic dial-up access package. Others offer relatively inexpensive web hosting without dial-up access. Although most of these providers do not include shell access or CGI scripting/server-side includes as part of the basic package, some do. An account with such a provider would probably be a good choice for someone on a strict budget looking to practice the examples in this book. With some searching, you can find providers that offer this level of access for $20 to $30 per month. For a site that represents a hobby, or a nonprofit community service project, or a demo of something you hope to turn into a commercial site later on, this may be sufficient—but be aware that things like load, reliability, and bandwidth can come back to haunt you.
For a cost ranging from $50 per month to about $250 per month, you can find providers willing to sell you a significant chunk of space on a web server with full CGI scripting, server-side includes, and shell access. You'll still probably be in a shared-server environment, but significantly fewer customers will probably be sharing that space with you. You should hopefully see better server performance, and a better response time on frantic calls to tech support asking why the server's down. For many low-end commercial web sites, this represents a good balance of cost and capability.
Getting Started with SSH/Telnet
You're probably already familiar with the process of making a dial-up PPP connection to the Internet from your Windows PC or Macintosh. Once connected, you then run client software to access various Internet services: a web client (like Netscape Navigator) to access web sites, a mail client (like Netscape's mail reader, or Eudora) to send and receive email, or an FTP client (like WS_FTP or Fetch) to transfer files. Traditionally, a Telnet client is just another piece of software that runs on top of your Internet connection. You use Telnet to log into a shell session on a remote server. Once you're in the shell session, you type text commands into the Telnet window, and those commands are then executed on the remote server and the results sent back to you.
We'll talk more about shell sessions later in this chapter. For now, let's talk a bit more about Telnet.
There's an inherent problem with using Telnet to connect to a remote server. Because Telnet traffic is sent across the network unencrypted, a malicious user located on a network somewhere along the path between you and the web server could easily obtain your username and password and use them to connect to the server as you. For that reason, a growing number of ISPs don't allow customers to make Telnet connections to their servers. Instead, they require customers to use something called ssh (for secure shell), an encrypted protocol that makes it much harder for bad guys to get hold of your login information. Once you've established the connection, an ssh session looks the same from the user perspective as a Telnet session: you get a shell window, where you type in commands and see the results of those commands printed out afterward.
I strongly encourage you to use ssh instead of Telnet. If your ISP doesn't support ssh connections you may have no choice but to use straight Telnet, but in that case you'd probably be well-served to start looking for another ISP.
In order to use ssh (or, if you must, Telnet), you will need a suitable client program. If you're running Windows, you already have one because Windows comes with a Telnet client preinstalled. I've never been happy with the Windows Telnet client, though, and can't recommend it (even without considering the security implications). Instead, I suggest you go to your favorite software-download site (like TUCOWS, at
Meet the Unix Shell
The Unix shell is the program that interprets the things you type at the command-line prompt. It is called the shell because it forms a shell around the computer's lower-level functions so that you, the user, don't have to deal with them directly.
If you've been using personal computers long enough to remember poking around in the DOS command-line environment, guess what? That experience is going to prove useful. In many ways working in DOS is similar to working in a Unix shell session (though Unix fans will say it's similar in the same sense that driving a soapbox racer is similar to driving a Ferrari).
If your computing experience has been limited to using a graphical environment like Windows or the Mac OS, this is going to seem a bit strange at first. If you stick with it, though, you'll come to appreciate the power and flexibility of the command-line interface.
It's a bit like LEGO blocks. The command-line interface is like the LEGOs I grew up with 30 years ago: you got the little square one and the slightly bigger rectangle and the really big rectangle and so on. The individual components were basic, but if you put them together with sufficient imagination you could make a spaceship, or a fire engine, or a skyscraper, or whatever you wanted. Unix commands are designed to be put together with each other in the same way, and before you know it you've got a custom tool to solve whatever your particular problem is.
A graphical interface is more like those fancy LEGO sets you can buy today: premolded plastic pieces that go together in just one way. You get a much snazzier-looking spaceship, but that's all you get. If you want a fire engine you have to buy another kit—and if you want something unusual you're out of luck. As Net user and Unix convert Peter J. Schoenster has written, "With a PC [meaning a Windows PC], I always felt limited by the software available. On Unix, I am limited only by my knowledge."
The hardest thing about working in command-line mode is that, well, you have to learn a lot of commands. You can't rely on being able to click through menus to discover the one you need, or having a dialog box pop up to request more specifics when you need to invoke a command with a particular set of options. It's very much the "open sesame" school of computing: you
Network Troubleshooting
One of the really cool things about the Internet is the way you, or anyone, can see how your traffic is being routed across the Net, and what's happening to it along the way. This comes in very handy when you're experiencing some sort of problem connecting to another site. With just a few seconds of research, you can often tell exactly where the problem lies, and this in turn can tell you if it's something you need to fix yourself, something you need to complain to somebody else about, or something that's essentially out of your control. It also comes in very handy for evaluating the quality of the explanations you get when you bug your ISP about network outages, which in turn can be an important factor in deciding where to host your web site.
The first network utilities we're going to talk about are the ping and traceroute commands. These utilities let you probe a TCP/IP network (like the Internet) to see where your data packets are going, how long it's taking them to get there, and whether any of them are getting lost along the way. (See Packet-Switching 101 if these concepts are new to you.)
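As a minimal sketch of how these two utilities are typically invoked, the following shell function runs each one against a host, skipping any tool that isn't installed. The function name probe_host and the host www.example.com are placeholders made up for this illustration. The -c 4 option, which tells ping to stop after four packets, is the form used on Linux and most other Unix-like systems; check your own system's man page if it complains.

```shell
# probe_host is a hypothetical helper name; pass it the host you want to test.
probe_host() {
    host=$1
    for tool in ping traceroute; do
        # command -v reports whether the shell can find the tool at all.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool is not installed here"
            continue
        fi
        case $tool in
            ping)       ping -c 4 "$host" ;;     # send four packets, then stop
            traceroute) traceroute "$host" ;;    # list each hop along the route
        esac
    done
}

# Example call (www.example.com is a placeholder host):
# probe_host www.example.com
```

Nothing is sent over the network until you call the function yourself, so it is safe to paste the definition into a shell session first and try it afterward.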
A Suitable Text Editor
You'll need one more tool in order to begin working magic with Perl: a text editor.
Actually, you'll probably want two of them because you will be writing your Perl scripts in two different places. Sometimes you will write them on your desktop PC (or Mac), then transfer them to your ISP's Unix server using an FTP program (like WS_FTP for the PC, or Fetch for the Mac). Other times you will create your scripts right there on the Unix machine using a Unix text editor.
Because FTP is an unencrypted protocol, it is prone to the same security problems as Telnet is. For that reason, you may wish to investigate using an encrypted protocol for your file transfers. Martin Prikryl's WinSCP (http://winscp.vse.cz/eng/) offers a nice Windows implementation of the secure scp protocol (which uses ssh for security) to do file transfers. For Mac users, the aforementioned NiftyTelnet (http://andrew2.andrew.cmu.edu/dist/niftytelnet.html) also does scp file transfers. (Mac users running OS X can also use the scp command-line program directly.)
The traditional text editors used in the Unix environment are emacs and vi (the latter pronounced "vee-eye"). Both are extremely powerful and full-featured. Both can also be a bit intimidating for beginners. Because of that, I'm actually going to focus on a simpler (albeit less powerful) editor called pico for this book's text-editing-under-Unix examples. If pico is not available on your system, you may need to buckle down and learn emacs or vi whether you want to or not. In that case, see The Traditional Unix Editors: emacs and vi later in this chapter.
Chapter 2: Getting Started with Perl
Now we're ready to add the centerpiece of your information toolkit: Perl, the "Swiss army chainsaw" for web content creators. This chapter explains how to locate Perl on your system and walks you through running a very simple Perl script. Along the way it explains command paths and gives a quick lesson on Unix file permissions. Next it covers Perl variables and quoting, and it finishes by explaining how to run a Perl CGI script (which means running a script via your web server, with the output of the script being returned to your web browser).
Finding Perl on Your System
The first thing we need to do to get you running Perl scripts is to verify that Perl has already been installed on your server, find out where it is, and check to see what version it is.
Log into a shell session and enter the command which perl. The which command prints out the full path to the program that will run when you enter the program's name by itself:
[jbc@andros jbc]$ which perl
/usr/bin/perl
So, in this case, I now know that the perl interpreter is located at /usr/bin/perl. Your copy of perl may be located somewhere else. Wherever it is, write down the location. You'll need to know it later.
If your web server doesn't have the which command, you can try finding the location of Perl using the similar command whereis, giving it the -b option to limit its output to binary files, as in:
[jbc@andros jbc]$ whereis -b perl
perl: /usr/bin/perl /usr/local/bin/perl
What if the which command doesn't give you any output, and just dumps you back to the shell prompt?
[jbc@andros jbc]$ which perl
[jbc@andros jbc]$
Or what if it gives you an error message?
[jbc@andros jbc]$ which perl
which: no perl in (/usr/bin:/bin:/usr/local/bin)
These tell you that Perl hasn't been installed on your server, or its location is not in your search path, or you don't have permission to run it, or something equally annoying. Contact your ISP's tech support staff and find out what's going on.
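If neither which nor whereis is available, the shell's built-in command -v is another way to ask the same question. The following is a small sketch rather than part of the session shown above; find_program is a helper name made up for this illustration.

```shell
# Print the full path of a program, or say so if it is not in the command path.
find_program() {
    if location=$(command -v "$1"); then
        echo "$1 is at $location"
    else
        echo "$1 was not found in the command path"
    fi
}

find_program perl
find_program sh
```

Because command -v is part of the shell itself, it should work even on a stripped-down server where the separate which program has not been installed.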
Let's assume you did find a copy of perl when you used the which command. The next thing to do is to figure out what version it is. To do that, you run the perl program itself with a -v command-line switch:
[jbc@andros jbc]$ perl -v

This is perl, v5.6.1 built for i586-linux

Copyright 1987-2001, Larry Wall

Perl may be copied only under the terms of either the Artistic License 
or the GNU General Public License, which may be found in the Perl 5 
source kit.

Complete documentation for Perl, including FAQ lists, should be found 
on this system using `man perl' or `perldoc perl'.  If you have access 
to the Internet, point your browser at http://www.perl.com/, the Perl 
Home Page.
Creating the "Hello, world!" Script
It's some kind of unwritten law that the first program you create in a new language should print out the message Hello, world! I'm not sure who originated the practice, but far be it from me to violate it. If you want to modify the following instructions to make your first Perl script say Hello, sailor! or Hey, bignose! or something else you find equally (or probably more) amusing, go right ahead.
For this demonstration I'm going to assume that you'll write this script on the Unix server using the pico text editor. (If your Unix server does not have pico available, you will probably need to look into using emacs or vi; see Chapter 1.) If you want to write the script on your local PC or Mac, that's fine, too; just remember that you'll have to upload it to the Unix server via FTP (ASCII upload, please) before you can test it.
You start up pico by entering the command pico in the Unix shell (clever, eh?). There are some special features of pico you can turn on with command-line options, and we'll be using three of them: -d (which makes your keyboard's Delete key erase the character under your cursor, rather than the character to the cursor's left), -w (which turns off automatic word wrapping), and -z (which allows you to suspend the pico program by typing Ctrl-Z; more about that later).
Although I show pico's command-line options merged together with a single leading hyphen (-dwz), older versions of pico may require you to enter them separately, as -d -w -z.
Let's get started. Enter pico -dwz hello.plx at the command line. This begins your pico editing session, with your cursor at the beginning of a new file called hello.plx:
[jbc@andros jbc]$ pico -dwz hello.plx
         
I typically give .plx filename extensions to my Perl scripts, even on Unix machines, where the filename extension has no formal significance (unlike on DOS/Windows machines, where it is used to indicate the file's type). This is a legacy of my originally learning Perl on a DOS computer. I left the extension on in these examples for two reasons:
The Dot Slash Thing
The next step is to try running the script by entering its name at the Unix command prompt. You may be able to do this by just entering the name of the script (hello.plx) all by itself. Or you may need to precede its name by a period and a forward slash (./). What decides this is whether the dot (.), which you will recall is a shortcut for the current working directory (meaning the directory you are currently in), is in your command path.
The command path is just a list of directories that the shell looks in to find the command whose name you entered. On a DOS system, the current working directory is in your command path by default, but not so under Unix.
If the current working directory isn't in your command path, entering hello.plx by itself at the shell prompt will not work because the Unix shell will not be able to find the script, even though it's right there. Instead, you'll have to enter ./hello.plx, with that initial dot slash (./) telling the Unix shell to look in the current working directory for the command whose name you're typing in.
Having to explicitly enter the ./ before your program's name can actually be a good thing because it makes it less likely that you will accidentally run a different program with the same name in some other directory that is in your command path.
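To see the difference for yourself, here is a rough sketch you can run in a scratch directory. It assumes your system has the mktemp command and that the temporary directory allows scripts to be executed; the script name hello_demo.sh is made up for this example. The sketch deliberately uses a command path that leaves out the current working directory.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Create a tiny executable script (hypothetical name).
printf '#!/bin/sh\necho "Hello from the current directory"\n' > hello_demo.sh
chmod 700 hello_demo.sh

# By name alone, with a command path that lacks the current directory:
# the shell searches only /usr/bin and /bin, so the script is not found.
by_name=$(PATH=/usr/bin:/bin; hello_demo.sh 2>/dev/null) || by_name="not found"

# With ./ in front, the shell is told exactly where the script lives.
by_path=$(./hello_demo.sh)

echo "hello_demo.sh   -> $by_name"
echo "./hello_demo.sh -> $by_path"
```

The first line of output should report that the script was not found, and the second should show the script's greeting.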
You can check to see if the current working directory is in your command path with the printenv command, as follows:
[jbc@andros jbc]$ printenv PATH
/usr/local/bin:/bin:/usr/bin:/usr/sbin:.
If you don't have access to the printenv command, you can also try using the echo command to display the contents of the $PATH shell variable:
[jbc@andros jbc]$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/sbin:.
Using either method should cause the Unix server to print out a colon-separated list of directories that will be searched when you type in a command. If the current working directory (
Unix File Permissions
Back at my Unix command prompt, I try typing the name of the script to run it:
[jbc@andros jbc]$ hello.plx
hello.plx: Permission denied.
Hmm. Welcome to the world of Unix file permissions. This is one of the trickier parts of making the transition to Unix. If you're an impatient person, just type the following command:
[jbc@andros jbc]$ chmod 700 hello.plx
         
and go on to the next topic. If you want to know what's really going on, though (which I strongly recommend, since it will save you much trouble later), keep reading.
In Unix, you can have three different types of permissions with respect to a particular file: read permission, write permission, and execute permission. Read permission lets you read the file (you need me to tell you that?), write permission lets you make changes to the file, and execute permission, in the case of a script or program, lets you actually run it.
So, that's the first half of the permissions story. The second half is this: the three types of permission can be set to "on" or "off" for each of three different sets of people: the file's owner (you, in the case of the scripts you've created), members of the file's group (which we're going to ignore for now), and everyone else in the world (which we're also going to ignore for now).
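The 700 in that chmod command is a compact way of writing those nine on/off switches: one digit each for the owner, the group, and everyone else, where each digit is the sum of 4 (read), 2 (write), and 1 (execute). As a rough sketch, the shell functions below expand such a number into r, w, and x letters. The names digit_to_rwx and mode_to_rwx are invented for this example.

```shell
# Turn one octal digit (0-7) into its three-character rwx form.
digit_to_rwx() {
    d=$1
    out=""
    if [ $((d & 4)) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
    if [ $((d & 2)) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
    if [ $((d & 1)) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    printf '%s' "$out"
}

# Turn a three-digit mode such as 700 into nine permission characters:
# three for the owner, three for the group, three for everyone else.
mode_to_rwx() {
    mode=$1
    owner=${mode%??}    # first digit
    rest=${mode#?}      # last two digits
    group=${rest%?}     # middle digit
    other=${mode#??}    # last digit
    printf '%s%s%s\n' "$(digit_to_rwx "$owner")" "$(digit_to_rwx "$group")" "$(digit_to_rwx "$other")"
}

mode_to_rwx 700   # prints rwx------
mode_to_rwx 644   # prints rw-r--r--
```

So chmod 700 turns on all three permissions for the file's owner and turns everything off for the group and for everyone else.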
Check the permissions on a file by entering ls -l filename, with filename being replaced by the name of the file. To look at the permissions on my hello.plx script:
[jbc@andros jbc]$ ls -l hello.plx
-rw-r--r-- 1 jbc jbc 42 Sep 5 06:55 hello.plx
The first column in the ls -l listing (the -rw-r--r-- part) is what tells you what permissions the file has. That -rw-r--r-- thing can be decoded as follows:
The initial - means it's a file rather than a directory or a link (see Figure 2-2).
Figure 2-2:
Running (and Debugging) the Script
Now that we've made hello.plx executable by its owner, let's try running it again:
[jbc@andros jbc]$ hello.plx
Hello, world!
Congratulations! You've just run your first Perl script.
What if your script didn't run, though, but instead died with some sort of error message? Or what if it successfully ran, but didn't produce the output you were expecting? Then you get to experience the joy of debugging.
You're going to be experiencing a lot of this particular joy. As a novice programmer, you're going to spend the bulk of your time in the debugging phase: tracking down and squashing the silly mistakes that keep your script from running as you intended. (Actually, my understanding from the experienced programmers I know is that they, too, spend a good chunk of their time debugging.)
Sometimes the mistake is obvious: you left off a semicolon, or the closing quotation mark in a quoted string. Other times the mistake is maddeningly obscure (at least until you identify it, at which point it, too, will become obvious): you gave the wrong arguments to a function, or were confused about some aspect of how Perl behaves.
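For the obvious kind of mistake, perl can check a script's syntax without running it if you give it the -c switch. The sketch below assumes perl is in your command path and that your system has the mktemp command. It uses two throwaway files, good_demo.plx and bad_demo.plx, and a helper called check_syntax; all three names are made up for illustration. The second file is missing its closing quotation mark.

```shell
demo_dir=$(mktemp -d)

# A script with correct syntax.
cat > "$demo_dir/good_demo.plx" <<'EOF'
print "Hello, world!\n";
EOF

# The same script with the closing quotation mark left off.
cat > "$demo_dir/bad_demo.plx" <<'EOF'
print "Hello, world!\n;
EOF

# -c compiles the script and reports syntax errors, but does not run it.
check_syntax() {
    if perl -c "$1"; then
        echo "$1: syntax looks fine"
    else
        echo "$1: perl found a syntax problem"
    fi
}

if command -v perl >/dev/null 2>&1; then
    check_syntax "$demo_dir/good_demo.plx"
    check_syntax "$demo_dir/bad_demo.plx"
else
    echo "perl was not found in the command path"
fi
```

The messages perl prints for the broken file name the line where it gave up, which is usually a good place to start looking.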
Debugging is a specialized skill, and it takes practice to get good at it. It's somewhat like car repair. An experienced mechanic can ask a few questions, listen to the engine for a second, and immediately tell you what's wrong with your car and what it will take to fix it. Meanwhile, a novice mechanic will be pulling apart the transmission when the only problem is a broken light on the dashboard.
As you learn about debugging you're sometimes going to feel like that novice mechanic, banging your head against the keyboard for what seems like hours, then finding out the problem was actually in some completely different part of your script.
This is okay. It's a good thing. It's how you learn.
With that said, I'd like to offer some debugging tips gleaned from my own experience, in the hope that they will help you learn somewhat more quickly than you otherwise would.
Perl Documentation
This mention of the perldebtut man page is a good place to talk about the official Perl documentation. There is a very complete set of documentation that comes free with Perl. If everyone were tied down and forced to read every word of it, we'd all know a lot more about Perl.
Unfortunately, the sheer quantity of the Perl documentation, along with the fact that much of it is written for people who are already experienced programmers, can make things tough for accidental programmers. There is a subset of the Perl documentation, though, that you should definitely try to familiarize yourself with now—if only so you'll know where to look for answers later on, when those answers will make more sense to you.
You can read the Perl documentation by entering man perl at the Unix command line. If you are on a system that doesn't have the man command (for example, because you installed Perl locally on your PC or Mac), you can use a utility called perldoc that comes bundled with Perl by entering perldoc perl. (Also, the ActiveState version of Perl installs the Perl documentation as HTML pages accessible under the Start menu.)
The Perl documentation has been split up into numerous sections; you access the appropriate section by entering man sectionname or perldoc sectionname. You'll find more about this, including the list of section names, in that first man perl page.
Some of those Perl manpages are going to be over your head for now, but among the ones you should at least skim through are perl, perlfaq, perltoc, perldata, perlsyn, perlop, perlre, perlrun, and perlfunc. The others are useful, too; it's just that you probably will need to learn some more before you can get much out of them.
You also should know about a neat trick you can do with perldoc: If you enter perldoc -f, followed by the name of a particular Perl function, you will get the part of the perlfunc manpage that describes that function. For example:
Additional content appearing in this section has been removed.
Perl Variables
It's time for you to get acquainted with the idea of a variable. Variables are very important in programming. They provide containers that you use to store information for later retrieval and manipulation. You choose a name for your variable (hopefully picking a nice, descriptive name that will make sense to you later on), then stick a value in it (or a bunch of values, or pairs of values; more on that in a minute). Later, you can get that value (values, pairs of values) back by referring to the variable by name. This actually sounds more complicated than it is. The following examples should help clear things up.
There are three types of variables in Perl. In increasing order of niftiness, they are scalar, array, and hash variables. You'll be using them all, so let's get to know them.
Replace the print statement in your "Hello, world!" script with the following:
$greeting = "What are you looking at?\n";

print $greeting;
This new form of the script uses a variable to hold the string that will be printed. First the string is assigned to the variable using the assignment operator, an equal sign (=). Then we feed the variable (called $greeting) to the print function.
In Perl, variables whose names begin with a dollar sign ($) are used to store a single something: a single number or a single string of text. Programmers call these single-something containers scalar variables.
When I say scalar variables hold a single string of text, I don't mean they necessarily hold only a single letter or word. Except for a very large limit based on the computer's available memory, a scalar variable can hold as long a text string as you like. You could put the entire text of the Encyclopedia Britannica in a scalar variable if you wanted to. It's just that from Perl's perspective, it would just be one thingy.
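To make the single-something idea concrete, here is a small sketch. The script name and variable names are invented for illustration. It stores a number in one scalar and a multi-word string in another, then uses Perl's built-in length function to count the characters in the string, which Perl still treats as one value:

```perl
#!/usr/bin/perl

# scalars.plx -- a scalar holds one value, whether number or string

$answer = 42;                                # a single number
$motto  = "Wherever you go, there you are";  # a single (multi-word) string

# length counts the characters in the one string stored in $motto
$size = length $motto;

print "The answer is $answer.\n";
print "The motto is $size characters long.\n";
```

However many words the string contains, $motto is still one scalar; it is the dollar sign, not the contents, that tells you so.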
A good trick for remembering that the dollar sign refers to a scalar variable is to remember that a dollar sign looks sort of like a letter s, for
Additional content appearing in this section has been removed.
A Bit More About Quoting
So far, whenever we've needed to quote a string we've used double quotes (as in, "this is a double-quoted string"). In fact, Perl also supports the use of single quotes (as in, 'this is a single-quoted string'). It's important for you to understand the difference between the two.
The difference is just this: when it processes a double-quoted string, Perl looks in it for things that look like variables and replaces them with the contents of those variables. This process is called variable interpolation. It also looks for certain sequences beginning with a backslash (\) and replaces them with special characters. The sequences are called backslash escapes, and the process of replacing them with special characters is called backslash interpretation.
When it's processing a single-quoted string Perl doesn't bother doing this. You get the string, just like it's written. (Actually, Perl processes two backslash escapes within a single-quoted string: \', which it interprets as a literal single quote, and \\, which it interprets as a literal backslash. This lets you put literal single quotes and literal backslashes inside your string, which would otherwise be difficult to do.)
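The two single-quote escapes are easy to try out on their own. The following sketch (the script name and the strings are made up for illustration) puts a double-quoted and a single-quoted version of the same text side by side, then uses \' and \\ inside single-quoted strings:

```perl
#!/usr/bin/perl

# escapes.plx -- compare backslash handling in the two kinds of quotes

$name = 'world';

$double = "Hello, $name!\n";    # $name is interpolated; \n becomes a newline
$single = 'Hello, $name!\n';    # taken literally: dollar sign, backslash, n

$apostrophe = 'It\'s a start';  # \' yields a literal single quote
$backslash  = 'C:\\temp';       # \\ yields one literal backslash

print $double;
print $single, "\n";
print $apostrophe, "\n";
print $backslash, "\n";
```

Each single-quoted string is followed by a separate double-quoted "\n" in the print statements, because a \n inside single quotes would not be turned into a newline.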
Let's create a new script called quotes.plx (Example 2-2) to see how this works.
Example 2-2. A script to test how Perl treats single- and double-quoted strings
#!/usr/bin/perl

# quotes.plx -- test handling of single- and double-quoted strings

$veggies = 'rutabagas';

print "I like to eat $veggies.\n";
When you save this script and run it in the shell you should get this:
[jbc@andros jbc]$ quotes.plx
I like to eat rutabagas.
Now modify the script to replace the double quotes in the print statement with single quotes, so the last line becomes:
print 'I like to eat $veggies.\n';
Now when you run the script you should get:
[jbc@andros jbc]$ quotes.plx
Additional content appearing in this section has been removed.
"Hello, world!" as a CGI Script
Before ending this brief introduction to Perl, I want to show you one more thing: the "Hello, world!" script rewritten to run as a CGI script.
I've assumed in this book that you already have a certain amount of web authoring experience, so you may have encountered CGI scripts before, and may even have run a few yourself. In case they're completely new to you, though, here's the lowdown: In simplest terms, a CGI script is a separate program that a web server runs in order to produce a customized page to show to a web user. CGI stands for Common Gateway Interface, which is just a description of a set of relatively simple rules for how the communication between the web server and the separate program will be conducted. CGI scripts can be written in any language that can produce the appropriate sort of output, but Perl is by far the most popular choice because of how easy it is to create CGI scripts in Perl.
CGI scripts are something of a gateway drug for Perl use. Many accidental programmers first come to Perl not because they've decided to learn Perl programming per se, but because they want to create a CGI script (probably to process the output of a web form), and someone has told them that Perl is the way to go. That's okay; Perl doesn't mind. It happily does the task at hand, biding its time until the user is ready for more.
Let's modify hello.plx so that it will run as a CGI script. Every CGI script needs to output a CGI header as the first thing the script outputs. This header, which consists of one or more lines of text followed immediately by a blank line, is checked by the web server, then passed on to the remote user's browser in order to tell that browser what type of document to expect. Most of the time, your script is going to output an HTML document, which means you'll need to output your script's header using something like the following snippet of Perl:
print "Content-type: text/html\n";
print "\n";
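Put together with a page body, a complete CGI version of the script might look something like the sketch below. The filename hello.cgi and the HTML markup are assumptions made for illustration. The header is also written as a single string ending in two newlines, which produces the same output as the two print statements above:

```perl
#!/usr/bin/perl

# hello.cgi -- "Hello, world!" as a CGI script (illustrative sketch)

# The header: one Content-type line, then a blank line to end the header.
$header = "Content-type: text/html\n\n";

# A minimal HTML page to follow the header (assumed markup).
$page = "<HTML>\n"
      . "<HEAD><TITLE>Hello</TITLE></HEAD>\n"
      . "<BODY><H1>Hello, world!</H1></BODY>\n"
      . "</HTML>\n";

print $header, $page;
```

Running a script like this from the shell should simply print the header and the HTML to the screen, which is a handy way to confirm that the blank line is really there before you try it through the web server.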
Additional content appearing in this section has been removed.
Chapter 3: Running a Form-to-Email Gateway
Now that you're able to run real, honest-to-goodness CGI scripts, let's make something useful: a web-form-to-email gateway. This is probably one of the most common uses of CGI scripts, since it satisfies a very common need: you have an HTML form that you want visitors to your web site to fill out, and you want the contents of that form sent to you as an email message.
My goal with this chapter is to get your form-to-email gateway script up and running as quickly as possible. Along the way I explain foreach loops, if blocks, and how Perl evaluates conditional statements for "truth." I also touch on how to open a pipe to another program and print output to that program, as well as how to use the die function to make your script stop dead in its tracks if it notices something unusual happening.
Because you still are at a fairly early stage in your Perl education, I'm going to ask you to take more or less on faith some other Perl features demonstrated in this chapter. I'll explain what they're doing in terms of this chapter's example, but I will stop short (for now) of giving a complete explanation of how you would use them in other circumstances. This is the case with this chapter's treatment of the substitution operator, and the CGI.pm module, for example. Don't worry, though. We'll be covering them more thoroughly in the chapters ahead.
We're going to make life easy for ourselves by writing this script using something called CGI.pm. CGI.pm is a Perl module, which is basically a chunk of prewritten Perl code that you can pull into your script to do lots of useful magic. CGI.pm, as you might guess, is a module specifically designed to do CGI sorts of things. It was created by Lincoln Stein, one of the real heroes of the Perl community, especially for anyone who uses Perl for web work. In this script, we'll be using CGI.pm primarily to decode data submitted from an HTML form. Although it's possible to do that form decoding without using CGI.pm, I encourage you not to try to do that. See Using CGI.pm Versus Manual Form Decoding for an explanation.
Additional content appearing in this section has been removed.
Checking for CGI.pm
Additional content appearing in this section has been removed.
Creating the HTML Form
Among the things Lincoln explains in his documentation is how you can use CGI.pm not only to process the output of HTML forms, but to actually produce those forms in the first place. With this approach, you just send a user to your CGI script, and the first time the user invokes the script it delivers an HTML form with all the values set to their defaults. Then, when the form is submitted back to the same script, it takes the data supplied by the user and does whatever you want it to do.
CGI.pm has lots of nifty features like that, but they tend to be a bit overwhelming for beginners, so in this case we're going to take a more straightforward approach and simply create our form as a standard HTML page, then submit it to our CGI script for processing.
Example 3-1 shows an HTML form you can use for this demonstration. I'm not going to bother explaining what's going on with the table tags and form elements in this web page; again, I'm assuming you already know about those things, or can learn about them elsewhere.
You can download your own copy of this web page from this book's online example repository, at http://www.elanus.net/book/. Or you can just create your own copy of it.
Example 3-1. A page with an HTML form for testing a CGI form-to-email gateway
<HTML>

<HEAD>
<TITLE>Sample Form</TITLE>
</HEAD>

<BODY>

<H1>Sample Form</H1>

<P>Please fill out this form and submit it. Thank you.</P>

<FORM ACTION="mail_form.cgi" METHOD="POST">

<TABLE>

<TR>
<TD ALIGN="right"><STRONG>My name:</STRONG></TD>
<TD><INPUT NAME="name" SIZE=30></TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>Address:</STRONG></TD>
<TD><INPUT NAME="address" SIZE=30></TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>City:</STRONG></TD>
<TD><INPUT NAME="city" SIZE=30></TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>State:</STRONG></TD>
<TD><INPUT NAME="state" SIZE=2></TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>Zip:</STRONG></TD>
<TD><INPUT NAME="zip" SIZE=10></TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>Country:</STRONG></TD>
<TD><INPUT NAME="country" SIZE=10 VALUE="USA"></TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>My email:</STRONG></TD>
<TD><INPUT NAME="email" SIZE=30></TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>My favorite color:
</STRONG></TD>
<TD>
<TABLE BGCOLOR="#CCCCFF" BORDER><TR><TD>
<TABLE>
<TR>
<TD><INPUT NAME="color" TYPE="radio" VALUE="red">
</TD>
<TD>Red</TD>
</TR>
<TR>
<TD><INPUT NAME="color" TYPE="radio" VALUE="green">
</TD>
<TD>Green</TD>
</TR>
<TR>
<TD><INPUT NAME="color" TYPE="radio" VALUE="blue">
</TD>
<TD>Blue</TD>
</TR>
</TABLE>
</TD></TR></TABLE>
</TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>Movies I liked:</STRONG>
</TD>
<TD>
<TABLE BGCOLOR="#CCCCFF" BORDER><TR><TD>
<TABLE>
<TR>
<TD><INPUT NAME="movies" TYPE="checkbox" 
VALUE="Blade Runner"></TD>
<TD><EM>Blade Runner</EM></TD>
</TR>
<TR>
<TD><INPUT NAME="movies" TYPE="checkbox" 
VALUE="Pulp Fiction"></TD>
<TD><EM>Pulp Fiction</EM></TD>
</TR>
<TR>
<TD><INPUT NAME="movies" TYPE="checkbox" 
VALUE="Full Metal Jacket"></TD>
<TD><EM>Full Metal Jacket</EM></TD>
</TR>
</TABLE>
</TD></TR></TABLE>
</TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>When I grow up I want 
to be a(n):</STRONG></TD>
<TD>
<SELECT NAME="grow_up">
<OPTION>Astronaut
<OPTION>Fireman
<OPTION>CGI programmer
</SELECT>
</TD>
</TR>

<TR>
<TD ALIGN="right"><STRONG>My opinion on cucumber 
sandwiches is:</STRONG></TD>
<TD>
<TEXTAREA NAME="sandwiches" ROWS=5 COLS=20 WRAP="virtual">
</TEXTAREA>
</TD>
</TR>

<TR><TD COLSPAN=2>&nbsp;</TD></TR>
<TR>
<TD>&nbsp;</TD>
<TD><INPUT TYPE="submit"> <INPUT TYPE="reset">
</TD>
</TR>
</TABLE>

</FORM>
</BODY>
</HTML>
Additional content appearing in this section has been removed.
The <FORM> Tag's ACTION Attribute
When an HTML form is submitted, the contents of all the form's fields are bundled up and handed off to whatever script is specified in the ACTION attribute of the <FORM> tag. In the previous example page, the form is handed off to a script called mail_form.cgi located in the same directory as the form itself.
If you wanted to, you could have put some path information into the ACTION attribute and handed the form off to a script in a different directory. You could even have given a full URL (http://www.somewhere.com/somepath/somescript.cgi) and handed the form's contents off to a script on a completely different server. But I'm digressing. The point is, the ACTION attribute of the <FORM> tag is what determines where the form data goes when the form is submitted.
Additional content appearing in this section has been removed.
The mail_form.cgi Script
Example 3-2 shows a simple script that will take the output of a web form, bundle it up into an email message, and mail it off to someone. Go ahead and download this script from the book's online example repository, at http://www.elanus.net/book, and stick it in a suitable location on your web server. If you can execute CGI scripts anywhere, you can stick it in the same directory as your HTML form. If you need to put your scripts in a special location, stick it there, and then be sure to modify the ACTION attribute of the <FORM> tag to point to it properly. For example, if you needed to put the script in a top-level directory on your server called cgi-bin, you would edit the <FORM> tag to read: <FORM ACTION="/cgi-bin/mail_form.cgi" METHOD="POST">.
This script is considerably longer than the examples you've seen so far. Don't let that bother you, though. It's all relatively simple Perl, and I'll be explaining the whole thing, line by line.
Example 3-2. A simple web form-to-email gateway script
#!/usr/bin/perl -w

# mail_form.cgi

# bundle up form output and mail it to the specified address

# configuration:

$sendmail  = '/usr/sbin/sendmail'; # where is sendmail?
$recipient = 'forms@example.com';  # who gets the form data?
$sender    = 'forms@example.com';  # default sender?
$site_name = 'my site';            # name of site to return to after
$site_url  = '/return/path/here/'; # URL to return to after

# script proper begins...

use CGI qw(:standard);

# bundle up form submissions into a mail_body

$mail_body = '';

foreach $field (param) {
    foreach $value (param($field)) {
        $mail_body .= "$field: $value\n";
    }
}

# set an appropriate From: address

if ($email = param('email')) {
    # the user supplied an email address
    $email  =~ s/\n/ /g;
    $sender = $email;
}

# send the email message

open MAIL, "|$sendmail -oi -t" or die "Can't open pipe to $sendmail: $!\n";

print MAIL <<"EOF";
To: $recipient
From: $sender
Subject: Sample Web Form Submission

$mail_body
EOF

close MAIL or die "Can't close pipe to $sendmail: $!\n";

# now show the thank-you screen

print header, <<"EOF";
<HTML>
<HEAD>
<TITLE>Thank you</TITLE>
</HEAD>

<BODY>

<H1>Thank you</H1>

<P>Thank you for your form submission. You will be hearing 
from me shortly.</P>

<P>Return to 
<A HREF="$site_url">$site_name</A>.</P>

</BODY>
</HTML>
EOF
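One line in the listing that is easy to try by itself is the substitution applied to the address the visitor typed in. The sketch below uses a made-up input string, standing in for whatever param('email') might return, to show how s/\n/ /g replaces every newline with a space. The likely reason for doing this is to keep the value from spilling onto extra lines in the message headers:

```perl
#!/usr/bin/perl -w

# strip_newlines.plx -- try the newline substitution by itself

# Hypothetical form input containing embedded newlines. The @ signs are
# backslashed so Perl doesn't look for arrays inside the double quotes.
$email = "visitor\@example.com\nCc: someone\@example.com\n";

# Replace each newline with a space; the g modifier means "every match".
$email =~ s/\n/ /g;

print "From: $email\n";
```

After the substitution the whole value sits on one line, so the sneaky Cc: text ends up as part of the From: line instead of becoming a header of its own.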
Additional content appearing in this section has been removed.
Warnings via Perl's -w Switch
Let's go through the script section by section. First comes the usual top-of-the-script stuff, with one change—the shebang line now has a trailing -w:
#!/usr/bin/perl -w
As mentioned in Chapter 2 in the discussion of debugging, -w turns on Perl's warnings feature, which causes the script to complain to standard error if certain suspicious-looking things appear to be going on. We haven't bothered with it before this, since the other scripts so far have been so short and simple, but this one is complex enough that it's worth turning it on.
With that said, you should realize that Perl's warnings feature is mainly a tool to help you while you're writing the script. Once the script is written and working properly, there shouldn't be any warnings.
Beginning with Perl Version 5.6.0, you can enable warnings by putting a statement that says use warnings; near the beginning of your script instead of using the -w shebang-line switch. If your version of Perl is recent enough to support it, the use warnings approach has some minor advantages over the -w switch, so you should probably use it. In this book I'll just be using the -w switch.
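For comparison, here is a rough sketch of the use warnings form. It deliberately does something suspicious (adding 1 to a variable that was never given a value) and, purely for demonstration, catches the resulting warning in an array instead of letting it go to standard error. The variable names are invented, and the script relies on the my declaration and the %SIG hash, two Perl features not covered up to this point:

```perl
#!/usr/bin/perl

use warnings;   # lexical warnings, available from Perl 5.6.0 on

# Collect warnings in an array rather than sending them to standard error,
# so the script can report how many it saw.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $count;                # declared, but never assigned a value
my $total = $count + 1;   # arithmetic on an undefined value: suspicious

print "total is $total\n";
print "warnings seen: ", scalar(@warnings), "\n";
print "first warning: $warnings[0]" if @warnings;
```

In a real script you would leave out the %SIG handler and simply watch standard error (or your web server's error log) for complaints while you are developing.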
Additional content appearing in this section has been removed.
The Configuration Section
Next comes the script's configuration section:
# configuration:

$sendmail  = '/usr/sbin/sendmail'; # where is sendmail?
$recipient = 'forms@example.com';  # who gets the form data?
$sender    = 'forms@example.com';  # default sender?
$site_name = 'my site';            # name of site to return to after
$site_url  = '/return/path/here/'; # URL to return to after
Each line stores some text in a scalar variable. Later on in the script, we will use these variables to plug the text they contain into various places. We do the initialization of the variables up at the top of the script, though, because that makes it easy to modify the script later on if something needs to be changed.
And in fact, you need to do some modifying of this section right now because each value is just for illustration purposes. You will need to replace the strings inside the single quotes on the right side of each assignment with a string that makes sense for your own web server. I'll be walking you through each configuration variable in turn.
The first line in the configuration section is where you give the location on your system where the Unix sendmail program can be found. The script will use this information later on to send the message containing the form data. Use the which command in the shell to find out where on your system sendmail is located (or the whereis command, if which is not supported on your system; see Chapter 2):
[jbc@catlow jbc]$ which sendmail
/usr/sbin/sendmail
Once you've located sendmail, you can put the full path inside the single quotes on the right side of the equal sign to assign it to the $sendmail variable:
$sendmail = '/usr/sbin/sendmail';
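If you would like Perl to double-check the path you just configured, one possibility is the -x file test, which is true when the named file exists and is executable by the user running the script. The following is only a sketch: the script name and the $status variable are invented for illustration, and the if block it uses is explained later in this chapter:

```perl
#!/usr/bin/perl -w

# check_sendmail.plx -- confirm the configured path points at a program

$sendmail = '/usr/sbin/sendmail';   # the path reported by which (may differ)

# -x is Perl's file test for "exists and is executable by this user"
if (-x $sendmail) {
    $status = 'found';
} else {
    $status = 'missing';
}

print "sendmail at $sendmail: $status\n";
```

If this reports the program as missing, go back to the which (or whereis) output and check the path for typos before moving on.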
Calling the variable $sendmail doesn't do anything special within Perl, by the way. Perl doesn't know how you are planning to use this variable, and just treats it as a storage location. You could call it $mail_program_location or $walnuts or anything else you wanted (as long as the variable began with a
Additional content appearing in this section has been removed.
Invoking CGI.pm
After the configuration section of the script comes the following line:
use CGI qw(:standard);
This is where you work the CGI.pm magic. In effect, what you are doing with this line is adding a whole bunch of prewritten Perl code to your script. That code arrives in the form of some new functions that your script now has access to for doing various CGI-related things. As I mentioned earlier, I'm not going to explain that process in detail here. In a few minutes, though, you'll see how easy this makes it to process the submitted form elements received by the script.
Additional content appearing in this section has been removed.
foreach Loops