Chapter 12. Searching the Web Server

Allowing users to search for specific information on your web site is an important and useful feature, one that can spare them the frustration of hunting for particular documents. The concept behind a search application is simple: accept a query from the user, check it against a set of documents, and return those that match. Unfortunately, several issues complicate the matter, the most significant of which is dealing with large document repositories. In such cases, searching through every document in a linear fashion is impractical; it is like looking for a needle in a haystack. The solution is to reduce the amount of data we need to search by doing some of the work in advance.
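To make the contrast concrete, here is a minimal sketch of both approaches, assuming the documents are small enough to hold in memory as a hash of name to text. The function names linear_search, build_index, and lookup are hypothetical and chosen only for illustration. The first function scans every document on each query. The other two do the work in advance by building a word-to-documents table (an inverted index), so that a query becomes a single hash lookup.

```perl
use strict;
use warnings;

# Linear approach: examine every document for every query.
# %docs maps a document name to its text (an assumption of this sketch).
sub linear_search {
    my ($query, %docs) = @_;
    my @hits = grep { index(lc($docs{$_}), lc($query)) >= 0 } keys %docs;
    return sort @hits;
}

# Work done in advance: map each lowercased word to the set of
# documents that contain it.
sub build_index {
    my (%docs) = @_;
    my %index;
    for my $name (keys %docs) {
        for my $word (lc($docs{$name}) =~ /(\w+)/g) {
            $index{$word}{$name} = 1;
        }
    }
    return \%index;
}

# Query time: one hash lookup instead of a scan over all documents.
sub lookup {
    my ($index, $word) = @_;
    my $hits = $index->{ lc($word) } or return ();
    my @names = sort keys %$hits;
    return @names;
}
```

Note that the two are not equivalent: the linear scan matches substrings, while this index matches whole words only. That trade-off is typical of precomputed indexes, which answer quickly but only for the kinds of queries they were built for.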

This chapter will teach you how to implement different types of search engines, ranging from the trivial, which search documents on the fly, to the most complex, which are capable of intelligent searches.

Searching One by One

The first example that we will look at is trivial in that it does not perform the search itself; instead, it passes the query to the fgrep command and processes the results.
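As a rough sketch of that idea, and not the chapter's own script, the following hypothetical helper grep_search hands a query to grep and collects the names of the matching files. It makes several assumptions: a Unix-like system with grep on the PATH, Perl 5.8 or later for the list form of a piped open, and the use of grep -F, which is the POSIX spelling of fgrep. The list form matters here because the query comes from the user: the arguments go straight to the program without passing through a shell, so shell metacharacters in the query are not interpreted.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of the files in @files that
# contain $query as a fixed string, ignoring case.
sub grep_search {
    my ($query, @files) = @_;
    return () unless defined $query && length $query && @files;

    # -F  treat the pattern as a fixed string (what fgrep does)
    # -l  print only the names of files that match
    # -i  ignore case
    # --  end of options, so a query starting with "-" is not read as a flag
    open(my $pipe, '-|', 'grep', '-F', '-l', '-i', '--', $query, @files)
        or die "Cannot run grep: $!";
    chomp(my @matches = <$pipe>);
    close $pipe;    # grep exits non-zero when nothing matches; not an error here
    return @matches;
}
```

A CGI script built along these lines would read the query from the submitted form, call the helper with the list of files to search, and format the returned names as links in the response.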

Before we go any further, here’s the HTML form that we will use to get the information from the user:

<HTML>
<HEAD>
<TITLE>Simple 'Mindless' Search</TITLE>
</HEAD>
<BODY>
<H1>Are you ready to search?</H1>
<P>
<FORM ACTION="/cgi/grep_search1.cgi" METHOD="GET">
<INPUT TYPE="text" NAME="query" ...

