October 2001
To give you an idea of how the WWW::Search module works, we’ll start with a simple script that runs from the command line. This script will let us use arguments to specify which search engine to search, what query to submit to it, and so on. The script, called search_rank.plx, is in Example 18-1.
Example 18-1. Querying search engines from the command line with WWW::Search
#!/usr/bin/perl -w

# search_rank.plx

# using the WWW::Search module, compute the rank of the highest-ranked
# page for a particular site when searching a particular search
# engine for a particular query string.

use strict;
use WWW::Search;
use Getopt::Std;

my %opt;
getopts('s:u:q:m:', \%opt);

unless ($opt{s} and $opt{u} and $opt{q}) {
    die <<"EOF";
Usage: $0 [options]
Required options: -s search_engine
                  -u base_url
                  -q 'search query'
Optional options: -m max_#_to_retrieve (defaults to 50)
EOF
}

my $max = $opt{m} || 50;

my $search = new WWW::Search($opt{s});
$search->maximum_to_retrieve($max);

my $base_url = quotemeta($opt{u});
my $rank     = 0;
my $count    = 1;

$search->native_query(WWW::Search::escape_query($opt{q}));
while (my $result = $search->next_result( )) {
    if (not $rank and $result->url =~ /$base_url/o) {
        $rank = $count;
    }
    print "$count: ", $result->title || $result->url,
        ', ', $result->url, "\n";
    ++$count;
}
print "Rank: $rank\n";

As you scan through this script, the first interesting thing you’ll notice is the use of the Getopt::Std module. This is a standard ...
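To see the option handling on its own, here is a minimal sketch that exercises the same getopts call without contacting any search engine. The @ARGV values (engine name, URL, and query) are made up for illustration; normally the shell fills in @ARGV from the command line.

```perl
#!/usr/bin/perl -w
use strict;
use Getopt::Std;

# Illustrative arguments only; in real use these come from the shell.
local @ARGV = ('-s', 'AltaVista',
               '-u', 'http://www.example.com/',
               '-q', 'perl books');

my %opt;

# A colon after a letter means that switch takes a value,
# which getopts stores in %opt under that letter.
getopts('s:u:q:m:', \%opt) or die "Bad options\n";

# -m was not supplied, so the default of 50 applies.
my $max = $opt{m} || 50;

print "engine=$opt{s} url=$opt{u} query=$opt{q} max=$max\n";
```

Because each letter in the specification string is followed by a colon, every switch expects an argument. A switch without a colon would instead act as a simple on/off flag.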