Perl for Web Site Management by John Callender

Parsing the Data

Taking a look at exhibit.txt, we can see that it consists of individual company listings separated by blank lines. Within each company’s listing, the same sequence of lines occurs: the first holds the company name, the next holds the booth number, the next holds the street address, and so on. By splitting up the file wherever we see a blank line, we can isolate individual companies’ information. By counting lines within those sections, we should be well on our way to extracting the relevant data from the file. We can then use pattern-matching operators to help us identify the data contained in lines that otherwise would be ambiguous.
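The splitting-and-counting idea can be tried in isolation first. What follows is only a minimal sketch under stated assumptions, not the book's script: the sample data is invented, the helper name split_records is hypothetical, and only the first three fields (company name, booth number, street address) are taken from the description above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for exhibit.txt; the company
# names, booth numbers, and addresses are invented for illustration.
my $sample = <<'END';
Acme Widgets
Booth 101
123 Main St.

Bolt Brothers
Booth 202
456 Oak Ave.
END

# split_records (a hypothetical helper): break the text into records
# wherever one or more blank lines occur, and return a list of array
# references, one per record, each holding that record's lines.
sub split_records {
    my ($text) = @_;
    my @records;
    my @current;
    foreach my $line (split /\n/, $text) {
        if ($line =~ /^\s*$/) {
            # a blank line closes the current record, if there is one
            push @records, [ @current ] if @current;
            @current = ();
        } else {
            push @current, $line;
        }
    }
    # the last record may not be followed by a blank line
    push @records, [ @current ] if @current;
    return @records;
}

foreach my $record (split_records($sample)) {
    # a line's position within its record tells us what it holds
    my ($co_name, $booth, $address) = @$record;
    print "$co_name | $booth | $address\n";
}
```

Collecting each record's lines into an array first keeps the counting simple: once a record is isolated, a line's index within it identifies the field, and pattern matching is needed only for lines whose position alone is ambiguous.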

Example 5-3 shows our first version of make_exhibit.plx, the script that will do this parsing and HTML-page creation. It uses several Perl features you haven’t seen before, but not to worry; we’ll be going through them all one by one.

Example 5-3. First version of make_exhibit.plx

    #!/usr/bin/perl -w

    # make_exhibit.plx

    # this script reads a pair of data files, extracts information
    # relating to a group of tradeshow exhibitors, and writes
    # out a browseable web-based directory of those exhibitors

    use strict;

    # configuration section:

    my $exhibit_file = './exhibit.txt';

    # script-wide variable:

    my %listing; # key: company name ($co_name).
                 # value: HTML-ized listing for this company.

    # read and parse the main exhibitor file

    my @listing_lines = ( ); # holds current listing's lines for passing
                             # to the &parse_exhibitor subroutine

    ...
