Search the Catalog
CGI Programming on the World Wide Web

CGI Programming on the World Wide Web

By Shishir Gundavaram
1st Edition March 1996

This book is out of print, but it has been made available online through the O'Reilly Open Books Project.

Previous Chapter 7
Advanced Form Applications

7.3 Quiz/Test Form Application

The application that we are about to discuss allows you to embed special tags within HTML to create quizzes and tests. The program then parses the new tags to create valid forms.

The special tags I designed for the quiz application are shown in Table 7.1.

Table 7.1: Special Tags for Quiz Application




start/end a quiz

<QUESTION>, </QUESTION>, TYPE="Text", TYPE="Multiple"

start/end a question block, text field, multiple choice

<ASK>, </ASK>

start/end the question text


start/end hint text


start/end answer text


start/end response message


start/end multiple choice item

Before I show the application, I'll show you how the tags are used. Here is an example:

<HEAD><TITLE>CGI Quiz/Test Application</TITLE></HEAD>
<H1>World Wide Web Quiz</H1>

The <QUIZ> tag represents the start of the quiz. It is similar to the <FORM> tag. These new tags are similar to traditional HTML, in that they ignore whitespace, and disregard the case of the string. You can also embed other HTML tags through a quiz, with the exception of <FORM>.

<ASK>Who is credited with the invention of the World Wide Web?</ASK>

The <QUESTION> tag supports two types of questions: fill-in-the-blank (or "text"), and multiple choice (or "multiple"). The actual question is displayed by the <ASK> tag. Remember to close the <ASK> tag with </ASK>.

<HINT>WWW was created at CERN</HINT>
<HINT>The inventor now works for <A HREF="">W3C</A>

You can specify hints for the user with the <HINT> tag. Notice the embedded hypertext anchor in the <HINT> tag. The only restriction with specifying hints is that they must all be grouped together in one place within the question.

<ANSWER>Tim Berners-Lee</ANSWER>

The answer to the question is stored within the <ANSWER> and </ANSWER> tags. You can have only one answer.

<RESPONSE Tim Berners-Lee>You got it! You do know the history behind 
the Web.</RESPONSE>
<RESPONSE Marc Andreessen>Sorry. Marc was the project leader for Mosaic
at NCSA. He currently works for Netscape Communications Corp.</RESPONSE>
<RESPONSE WRONG>I guess you do not know how the Web got started.</RESPONSE>
<RESPONSE SKIP>Come on! At least guess!</RESPONSE>

The <RESPONSE> tags display messages depending on the user input. The two defined response types are "wrong" and "skip." These can be used for wrong answers or skipped questions, respectively. Like the <HINT> tags, all the <RESPONSE> tags have to be grouped together.


You have to end each question with the </QUESTION> tag.

<QUESTION TYPE="Multiple">

The " multiple" keyword specifies a multiple-choice question.

<ASK>Which of the following WWW browsers does <B>not</B> support graphics?</ASK>

Notice the use of the HTML tag <B> for emphasis.

<CHOICE A><IMG SRC="/images/mosaic.gif">Mosaic</CHOICE>
<CHOICE B><IMG SRC="/images/netscape.gif">Netscape Navigator</CHOICE>
<CHOICE C><IMG SRC="/images/we.gif">WebExplorer</CHOICE>
<CHOICE D><IMG SRC="/images/lynx.gif">Lynx</CHOICE>
<CHOICE E><IMG SRC="/images/arena.gif">Arena</CHOICE>
<CHOICE F><IMG SRC="/images/cello.gif">Cello</CHOICE>
<HINT>It was developed at the University of Kansas</HINT>

With multiple-choice questions, you can use single characters to represent each choice. The answer can also be specified as a single character. Notice how the <IMG> tags are used to display inline images within the question. The <CHOICE> tags also have to be grouped together.

<RESPONSE A><A HREF=" NCSAMosaicHome.html">
Mosaic</A> was the first graphic browser.</RESPONSE>
<RESPONSE B><A HREF="">Netscape</A> is the most used browser on the market. It supports:<BR>
         In-Line JPEG Images<BR>
         Client Pull and Server Push Animations<BR>
<RESPONSE WRONG>I guess you don't surf the Web regularly.</RESPONSE>
<RESPONSE SKIP>Come on! Are you scared of being wrong?</RESPONSE>

As mentioned before, you can embed plain HTML within any of the new quiz tags.

<QUESTION TYPE="Multiple">
Now, this is an easy question. You have to get this one right!<BR>
<ASK>Which language is preferred for CGI applications?</ASK>
<CHOICE A><A HREF=" perlinfo">Perl</A></CHOICE>
<CHOICE D>Visual Basic</CHOICE>
<CHOICE E>AppleScript</CHOICE>
<RESPONSE A>Good! Perl is well suited for CGI applications. In fact,
this program was written in Perl.</RESPONSE>
<RESPONSE SKIP>I believe you don't know the answer!</RESPONSE>
<RESPONSE WRONG>What? You don't know the answer to this question!</RESPONSE>

Notice the extra text before the <ASK> tag. It will be displayed before the question. There is also a hypertext anchor in one of the choices.


You have to end the quiz with </QUIZ>. Like forms, you can have multiple quizzes in one document, but they cannot be nested inside one another. This document when converted to pure HTML will look like Figure 7.5.

Once the user fills out the quiz, this application will correct it, as shown in Figure 7.6.

Before we go any further, let's look at how a quiz can be accessed:

Welcome to this server. <BR>
If you want to be challenged, take this
<A HREF="/cgi-bin/">quiz</A>

The relative path of the data file has to be passed as extra path information to the program. In this case, the path to the file is /quiz.html. Now, let's look at the CGI program that parses this document, and then corrects the quiz once the user submits it.

$form = 0;
$this_script = $ENV{'SCRIPT_NAME'};
$webmaster = "Shishir Gundavaram (shishir\@bu\.edu)";
$separator = "\034";

The environment variable SCRIPT_NAME returns the relative path to this script, such as "/cgi-bin/". This relative path is used to set the ACTION attribute in the quiz form to point to this program. The program then corrects the quiz and outputs the results.

$exclusive_lock = 2;
$unlock = 8;
$document_root = "/usr/local/bin/httpd_1.4.2/public";
$images_dir = "/images";
$quiz_file = $ENV{'PATH_INFO'};
if ($quiz_file) {
    $full_path = $document_root . $quiz_file;
} else {
    &return_error (500, "CGI Quiz File Error", 
                        "A quiz data file has to be specified.");

The PATH_INFO environment variable contains the relative path to the quiz data file.

open (FILE, "<" . $full_path) || 
        &return_error (500, "CGI Quiz File Error",
                       "Cannot open quiz data file [$full_path].");
flock (FILE, $exclusive_lock);

This is a way to check the specified data file. First, Perl tries to open the data file. If not successful, the second part of the expression is evaluated, and an error is returned. This construct is identical to:

if (! open (FILE, "<" . $full_path) ) {
    &return_error (500, "CGI Quiz Data File Error",
                        "Cannot open quiz data file [$full path].");

Now, let's proceed with the program:

print "Content-type: text/html", "\n\n";

If any form data is present, it is retrieved and stored in the QUIZ associative array. The parse_form_data subroutine is slightly different from what you have seen before. There will be no data in the array when the quiz is first displayed with a GET request. On the other hand, when the quiz is submitted using POST, the form data has to be stored.

Most of the work in this program is performed by a while loop, which does one of three things: It reads a quiz as supplied by a user, it displays the HTML version of a quiz, or it checks answers against those supplied.

while (<FILE>) {
    if (/<\s*quiz\s*>/i) {

The while loop iterates through the data file, storing a line in the Perl default variable $_ each time through the loop. The if statement looks for the <QUIZ> tag. The "\s*" string in the regular expression checks for zero or more spaces before and after the "quiz" string. The "i" at the end of the regular expression makes the search case insensitive.

        $count = 0;

If a <QUIZ> tag was found, the form variable is incremented, representing the number of quizzes in the data file. The count variable is initialized to zero; it is used to keep track of the number of questions within a quiz.

        if ($QUIZ{'cgi_quiz_form'}) {
            $no_correct = $no_wrong = $no_skipped = 0;
            $correct = "Correct! ";
            $wrong = "Wrong! ";
            $skipped = "Skipped! ";

This conditional will be valid only when the form is submitted. In this example, you will see something you have not seen before: a query is attached to the URL in the "ACTION" attribute of the form. The cgi_quiz_form variable represents the quiz number that the program should process.


The print_form_header subroutine outputs the <FORM> tag in the following format:


In actuality, the program name is not "hard coded" into the ACTION attribute; rather, the value of the environment variable SCRIPT_NAME is used. The data file is specified as extra path information, and the quiz that should be corrected is passed as a query through the "variable" cgi_quiz_form. The long name "cgi_quiz_form" ensures that this variable will not interfere with the other variables used in the form.

        while (<FILE>) {
            if (($type) = 
               /<\s*question\s*type\s*=\s*"?([^ ">]+)"?\s*>/i) {

Here is another loop that iterates through the file. The reason for this loop is to look for <QUESTION> tags within a <QUIZ>. If the tag is specified correctly, the question type is stored in the variable type and the count variable is incremented.

Notice the use of the "\s*" throughout the regular expression to allow the user to specify extra whitespace within the tag. Also, the user can omit quote marks for the TYPE attribute, such as:

<QUESTION TYPE=multiple>

and the regular expression will still work correctly, due to the "?" operator, which searches for an optional string. (In Perl 5, you have to use the {0,1} construct instead.)

                while (<FILE>) {
                    if (!/<\s*\/question\s*>/i) {
                        $line = join("", $line, $_);
                    } else {

This embedded while loop serves to store all the information within a question block (i.e., <QUESTION> .. </QUESTION>) in a variable. The loop iterates through the file, and concatenates each line into the line variable.[2] If a </QUESTION> tag is found, the loop is terminated with the last command.

[2] In Perl, there are two ways to perform string concatenation: the "." operator and the join command. The "." operator is less efficient because strings have to be copied back and forth. So you should use the "." operator for simple concatenation only.

                $line =~ s/\n/ /g;

Once the previous while loop terminates, all of the information within the question block is contained in the line variable. In order to treat it as one string for searching purposes, the newline characters are replaced with spaces.

                    ($ask) = ($line =~ /<\s*ask\s*>(.*)<\s*\/ask\s*>/i);

The above expression determines the question title by retrieving the string in the <ASK> .. </ASK> block. The print_question subroutine displays the question. When parentheses are used in a regular expression, the matched string is stored in such variables as $1, $2, and $3. However, when you use a construct such as this, Perl stores the specified matched string inside the parentheses in the variable provided. When using this construct, a common mistake is:

$ask = ($line =~ /<\s*ask\s*>(.*)<\s*\/ask\s*>/i);

If the parentheses around the $ask variable are omitted, the ask variable will contain the value of "1", which is definitely not what you expect. Basically, you are evaluating the ask variable in a scalar context, not in an array context. In other words, the variable will return the number of stored strings.

                $type =~ tr/A-Z/a-z/;
                $variable = join("-", $count, $type);

The specified question type is converted into a lowercase string. In order to identify individual questions in the quiz, an automatic variable name is given to each one (i.e., "1-text", "2-text", "3-multiple", etc.) This name is used to specify the name of the variable in an input field inside a form.

                if ($type =~ /^multiple$/i) {
                    &split_multiple("choice", *choices);
                } elsif ($type =~ /^text$/i) {

If the question is a multiple-choice question, the split_multiple subroutine is called to retrieve the information specified by each <CHOICE> tag and store it in the choices array. The print_radio_buttons subroutine prints the data stored in the choices array. On the other hand, if the question is a fill-in-the-blank question, the print_text_field subroutine is called.

                if ($line =~ /<\s*hint\s*>/i) {
                    &split_multiple("hint", *hints);

The line is searched for any <HINT> tags. If any hints are found, they are printed out.

                if ($QUIZ{'cgi_quiz_form'} == $form) {
                    local ($answer, %quiz_keys, %quiz_values, 
                       @responses, $user_answer);

If a query was specified as part of the ACTION attribute, referring to the quiz to be corrected, and that value matches the form variable, this loop is executed. Various variables are defined to keep track of the user's answers.


This subroutine redefines the correct, wrong, and skipped variables to point to graphic files if the client browser can support graphics.

                    ($answer) = ($line =~ 

The answer specified in the data file is retrieved and stored in the answer variable. The subroutine format_string removes leading and trailing spaces, replaces multiple spaces with a single space, and converts the string to lowercase. This makes it possible for the user's answer to match the answer specified in the data file.

                    $user_answer = $QUIZ{$variable};

The QUIZ associative array contains the form data. The key used to access this array is in the form "question number-question type," such as "1-multiple." Unnecessary spaces are removed from the user's answer as well.

                    &split_multiple("response", *responses);
                    &split_responses(*responses, *quiz_keys, 
                    print "<HR><BR>";

The response messages to be displayed are read and stored in the responses array. The split_responses subroutine creates two associative arrays: quiz_keys and quiz_values. A typical response tag follows this format:


The array quiz_keys is indexed by the "key" value specified above, and the value of the array is also the same "key." The reason for this is to quickly check to see if there is a response message for a particular answer. On the other hand, the quiz_values array contains the "value," indexed by "key."

                     if ($user_answer eq $answer) {
                        print $correct;

If the user's answer equals the one stored in the data file, the message stored in the variable correct is displayed, and a counter is incremented.

                    } elsif ($user_answer eq "") {
                        print $skipped;
                        if ($quiz_keys{'skip'}) {
                            print $quiz_values{'skip'}, " ";

This conditional checks to see if the user skipped the question. If there is a <RESPONSE SKIP> tag, the specified message is displayed.

                    } else {
                        print $wrong;
                        if ($quiz_keys{'wrong'}) {
                            print $quiz_values{'wrong'}, " ";

This checks for a wrong answer. If a <RESPONSE WRONG> tag exists, the appropriate message is displayed.

                    if ($user_answer eq $quiz_keys{$user_answer}) {
                        print $quiz_values{$user_answer}, " ";

If the data file contains a response message for a particular answer, that message is displayed. It is checked using the quiz_keys array, and the value stored in quiz_values is output. An additional space character is displayed after the message, in the case that there are additional messages.

                    print "<BR><HR><BR>";

This concludes the if statement defined above. Remember, this group of statements is executed only if the value of the cgi_quiz_form variable matches the quiz counter, which occurs when the quiz is submitted.

                $line = "";
            } elsif (/<\s*\/quiz\s*>/i) {
            } else {

The line variable contains the information contained within a question block. It is cleared at the end of the loop. If a </QUIZ> tag is found, the enclosing while loop is terminated. On the other hand, if the line from the data file was neither a <QUESTION> nor a </QUIZ> tag, it is assumed to be either HTML or text, and is printed without any processing.


The program jumps to this point if a </QUIZ> tag is found. The print_form_footer subroutine ends the quiz by outputting the Submit and Reset buttons, followed by a </FORM> tag. It will print the buttons only if the program is in question mode.

    } else {

This part of the loop will be executed only if the line is outside the quiz block. It is printed to standard output verbatim.

flock (FILE, $unlock);
close (FILE);

You have to remember to unlock and close the file after all the operations are done.

The print_form_header subroutine outputs the <FORM> tag to start a quiz.

sub print_form_header
    print <<Form_Header;
<FORM ACTION="${this_script}/${quiz_file}?cgi_quiz_form=${form}" METHOD="POST">

The quiz_file variable, which points to this script, is passed as extra path information. Notice the query in the ACTION attribute. When the quiz is submitted, the program will know exactly which quiz it is.

The parse_form_data subroutine examines the form input and parses it into the FORM_DATA array.

sub parse_form_data
    local (*FORM_DATA) = @_;
    local ($query_string, @key_value_pairs, $key_value, $key, $value);
    read (STDIN, $query_string, $ENV{'CONTENT_LENGTH'});
    if ($ENV{'QUERY_STRING'}) {
            $query_string = join("&", $query_string, 
    @key_value_pairs = split (/&/, $query_string);
    foreach $key_value (@key_value_pairs) {
        ($key, $value) = split (/=/, $key_value);
        $value =~ tr/+/ /;
        $value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
        if (defined($FORM_DATA{$key})) {
            $FORM_DATA{$key} = join ("\0", $FORM_DATA{$key}, $value);
        } else {
            $FORM_DATA{$key} = $value;

When you glance through this subroutine, you should notice one difference from the one you have seen before. The POST request method is assumed, and the information is read into query_string. Remember, this subroutine is only called if the POST request method was used--see the main program. The major difference in this program is that queries are joined to the query_string variable, and decoded as one. The only query that is expected is the one that is passed through the ACTION attribute of the form.

The set_browser_graphics subroutine determines if the browser is graphics capable.

sub set_browser_graphics
    local ($nongraphic_browsers, $client_browser);
    $nongraphic_browsers = 'Lynx|CERN-LineMode';
    $client_browser = $ENV{'HTTP_USER_AGENT'};
    if ($client_browser !~ /$nongraphic_browsers/) {
        $correct = "<IMG SRC=\"$images_dir/correct.gif\">";
        $wrong = "<IMG SRC=\"$images_dir/wrong.gif\">";
        $skipped = "<IMG SRC=\"$images_dir/skipped.gif\">";

If the client browser support graphics, the correct, wrong, and skipped variables are re-defined to include a relative path to appropriate images.

The print_question subroutine displays the question number, as well as the question itself, using the global variable $count.

sub print_question
    local ($question) = @_;
    print <<Question;
<H3>Question $count</H3>

The format_string subroutine "formats" the user's answer and the answer specified in the data file to ensure a greater chance of matching.

sub format_string
    local (*string) = @_;
    $string =~ s/^\s*(.*)\b\s*$/$1/;

All leading and trailing spaces are removed. This is a very useful regular expression. You might need to use it frequently when parsing data, as users often inadvertently insert spaces before or after a string.

    $string =~ s/\s+/\s/g;

Multiple spaces are replaced by a single space throughout the string.

    $string =~ tr/A-Z/a-z/;

Finally, the string is converted to lowercase.

At the heart of the program is the split_multiple subroutine. It is used to split multiple <CHOICE>, <RESPONSE>, and <HINT> tags to make the processing easier.

sub split_multiple
    local ($tag, *multiple) = @_;
    local ($info, $first, $loop);

<CHOICE> and <RESPONSE> tags are handled differently than <HINT> tags because they can contain an extra parameter in the tag. Let's first look at the <CHOICE> and <RESPONSE> tags.

    if ( ($tag eq "choice") || ($tag eq "response") ) {
        ($first, $info) = ($line =~ /<\s*$tag\s*([^>]+)>(.*)<\s*\/$tag\s*>/i);
        $info =~ s/<\s*$tag\s*([^>]+)>/$1$separator/ig;
        $info = join("$separator", $first, $info);

Before we discuss the parsing details, let's look at a simple collection of <RESPONSE> tags to illustrate some points. Everything we discuss will also apply to the <CHOICE> tag as well.

<RESPONSE key1>value1</RESPONSE>
<RESPONSE key2>value2</RESPONSE>
<RESPONSE key3>value3</RESPONSE>

The regular expression parses through the string and stores the first parameter, or "key1", in the first variable. And the string starting from "value1" till the last </RESPONSE> tag is stored in the info variable. This is why all the <RESPONSE> tags have to be grouped together in the data file. The substitute command replaces each <RESPONSE key> string with the key value and the separator (defined to be octal 34). Finally, the string stored in info is joined to the first key, and stored again in info. This is very important! If the first key is not stored, it will be lost, because the regular expression stores everything in a response block (i.e., <RESPONSE key1> to the last </RESPONSE>). Now, info will contain:


The subroutine continues:

    } else {
        ($info) = ($line =~ /<\s*$tag\s*>(.*)<\s*\/$tag\s*>/i);
        $info =~ s/<\s*$tag\s*>//ig;

This else construct will be executed for <HINT> tags. The regular expression works the same way as the previous one, except that <HINT> tags do not contain extra parameters. As a result, no extra precautions need to be taken to store those parameters.

    @multiple = split(/<\s*\/$tag\s*>/i, $info);

The split command separates the string in info with the </RESPONSE> delimiter. After this command, the array would look like this:

$multiple[0] = key1\034value1
$multiple[1] = key2\034value2
$multiple[2] = key3\034value3

Other procedures--print_radio_buttons and split_responses--split the string on the "\034" delimiter to access the key and value separately. Since the <HINT> tags do not contain extra parameters, the array would look like this:

$multiple[0] = hint1
$multiple[1] = hint2
$multiple[2] = hint3

There is no need to split the values in the array further.

    for ($loop=0; $loop <= $#multiple; $loop++) {
        $multiple[$loop] =~ s/^\s*(.*)\b\s*$/$1/;

Finally, leading and trailing spaces are removed from each element in the array.

The print_radio_buttons subroutine outputs form elements to create radio buttons for multiple-choice questions.

sub print_radio_buttons
    local (*buttons) = @_;
    local ($loop, $letter, $value, $checked, $user_answer);
    if ($QUIZ{'cgi_quiz_form'}) {
        $user_answer = $QUIZ{$variable};

The user_answer variable exists only when the quiz is submitted. You might have noticed that user_answer was defined earlier in the program. Why is it being defined again? In the main program, the variable is declared after the print_radio_buttons subroutine is called. As a result, the variable is not available to this subroutine.

    for ($loop=0; $loop <= $#buttons; $loop++) {
        ($letter, $value) = split(/$separator/, $buttons[$loop], 2);
        $letter =~ s/^\s*(.*)\b\s*$/$1/;
        $value =~ s/^\s*(.*)\b\s*$/$1/;

The loop iterates through each element of the array, which is stored in the following format:


Each element is split into a separate key and value. Leading and trailing spaces are removed from the key and value separately. You might wonder why this has to be done, considering that the split_multiple subroutine already removed leading and trailing spaces from each element. The reason is that the key and value, once separated, might have their own leading and trailing spaces.

        if ($user_answer eq $letter) {
            $checked = "CHECKED";
        } else {
            $checked = "";
        print <<Radio_Button;
<INPUT TYPE="radio" NAME="$variable" VALUE="$letter" $checked>

When the quiz is submitted, the program checks the answers, and displays the same quiz with the user's original answers, along with right/wrong messages. If the user's answer matches one of the choices, the CHECKED attribute is specified. As a result, the user-selected radio button--or multiple choice--is "checked."

The print_text_field subroutine displays a text field for fill-in-the-blank questions. Again, the information that the user typed is displayed if the program is in correction mode.

sub print_text_field
    local ($default);
    if ($QUIZ{'cgi_quiz_form'}) {
        $default = $QUIZ{$variable};
    } else {
        $default = "";
    print <<Text_Field;
<INPUT TYPE="text" NAME="$variable" SIZE=50 VALUE="$default"><BR>

The print_hints subroutine contains a loop that iterates through the array, and displays each element as an unordered list in HTML.

sub print_hints
    local (*list) = @_;
    local ($loop);
    print "<UL>", "\n";
    for ($loop=0; $loop <= $#list; $loop++) {
        print <<Unordered_List;
    print "</UL>", "\n";

The split_responses subroutine splits all of the responses stored in the array to create a key and a value.

sub split_responses
    local (*all, *index, *message) = @_;
    local ($loop, $key, $value);
    for ($loop=0; $loop <= $#all; $loop++) {
        ($key, $value) = split(/$separator/, $all[$loop], 2);
        $value =~ s/^\s*(.*)\b\s*$/$1/;
        $index{$key} = $key;
        $message{$key} = $value;

The format_string subroutine is called to "format" the key. Leading and trailing spaces are removed from the value. Two associative arrays are created: one to store the key and the other to store the value. Both arrays are indexed by the key.

The print_form_footer subroutine generates the end of the form.

sub print_form_footer
    if (!$QUIZ{'cgi_quiz_form'}) {
        print '<INPUT TYPE="submit" VALUE="Submit Quiz">';
        print '<INPUT TYPE="reset"  VALUE="Clear Answers">';
    } else {
            print <<Status;
Results: $no_correct Correct -- $no_wrong Wrong -- $no_skipped Skipped<BR>
    print "</FORM>";

If the program is in question mode, the Reset and Submit buttons are displayed. Otherwise, the results of the quiz are output. The buttons are not displayed, because you do not want the user to submit a quiz that has the answers! Finally, the </FORM> tag is output.

Believe it or not, we're now finished with the quiz program. This example truly illustrates the power of CGI and forms to create an interactive environment.

Previous Home Next
Survey/Poll and Pie Graphs Book Index Security

Back to: CGI Programming on the World Wide Web Home | O'Reilly Bookstores | How to Order | O'Reilly Contacts
International | About O'Reilly | Affiliated Companies | Privacy Policy

© 2001, O'Reilly & Associates, Inc.