4.4. Code and Code Explanation

Code for the search engine is contained in three files. The administrative interface is saved in the publicly accessible public_files directory as admin.php and should be protected from unauthorized use by including lib/401.php. The front-end code is also saved in the public_files directory, as search.php. The crawler/indexer functionality is saved outside the public area as indexer.php.

4.4.1. Administrative Interface

The administrative interface provides an area to enter addresses that will either be included or excluded from the index, and also maintains the list of stop words. The display consists of an HTML form with two textareas. The processing of the input is done with PHP.

The first HTML textarea provides a place to enter the URLs of documents that will be included in the search engine's retrieval efforts. The second textarea provides a place to enter the list of stop words. Each is pre-populated from the appropriate database records, with each item appearing on a separate line.

<form method="post"
 action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
 <table>
  <tr>
   <td style="vertical-align:top; text-align:right">
    <label for="addresses">Include Addresses</label></td>
   <td><small>Enter addresses to include in crawling, one address per line.</small><br/>
    <textarea name="addresses" id="addresses" rows="5" cols="60"><?php
$query = sprintf('SELECT DOCUMENT_URL FROM %sSEARCH_CRAWL ' .
    'ORDER BY DOCUMENT_URL ASC', DB_TBL_PREFIX);
...
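The one-item-per-line convention used by both textareas can be illustrated with a small round trip: turning submitted text into a clean list, and turning a list back into textarea content. The following is only a minimal sketch under stated assumptions. The function names parse_lines() and render_lines() are hypothetical and do not come from the listing above, the sketch assumes PHP 7 or later, and it leaves out the database queries entirely.

```php
<?php
// Hypothetical helpers for illustration only; these names are not part
// of the book's code.

// Split raw textarea input into a list of unique, non-empty lines.
function parse_lines(string $raw): array
{
    // Browsers may submit \r\n, \r, or \n as the line separator.
    $lines = preg_split('/\r\n|\r|\n/', $raw);

    // Remove surrounding whitespace, then discard blank lines.
    $lines = array_map('trim', $lines);
    $lines = array_filter($lines, function ($line) {
        return $line !== '';
    });

    // Drop duplicates (first occurrence wins) and reindex from 0.
    return array_values(array_unique($lines));
}

// Join items one per line, escaped for output inside a <textarea>.
function render_lines(array $items): string
{
    return implode("\n", array_map('htmlspecialchars', $items));
}
```

Under these assumptions, parse_lines($_POST['addresses']) would yield the list of addresses to store, and render_lines() applied to the rows fetched from the database would produce the text placed between the textarea tags.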
