Chapter 5. How Search Engines Work

We've seen how searchers behave and how they interact with search results. We've decided what queries we want our sites to be found for. How do search engines compile these lists?

The Evolution of Search Engines

In the early days of the Web, directories were built to help users navigate to various Web sites. Generally, these directories were created by hand: people categorized Web sites so users could browse to what they wanted. As the Web got larger, this effort became more difficult, so "Web spiders" were created that "crawled" Web sites. Web spiders, also known as robots, are computer programs that follow links from known Web sites to other Web sites. These robots access those pages, download their contents (into a storage mechanism generically referred to as an "index"), and add the links found on those pages to their list for later crawling.
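In programming terms, that crawl-and-index loop is short. The sketch below is a simplified illustration in Python, not how any real search engine is built; the seed_urls and max_pages parameters and the in-memory dictionary called "index" are purely illustrative.

from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen
import re

def crawl(seed_urls, max_pages=50):
    """Follow links from known pages to new pages, storing each page's
    contents in an in-memory 'index' and queuing its links for later."""
    frontier = deque(seed_urls)   # pages we still need to visit
    index = {}                    # URL -> raw page contents
    while frontier and len(index) < max_pages:
        url = frontier.popleft()
        if url in index:
            continue              # already crawled this page
        try:
            html = urlopen(url).read().decode("utf-8", errors="ignore")
        except Exception:
            continue              # skip pages that fail to load
        index[url] = html
        # Add every link found on this page to the list for later crawling.
        for link in re.findall(r'href="(http[^"]+)"', html):
            frontier.append(urljoin(url, link))
    return index

A real crawler also respects robots.txt, revisits pages on a schedule, and spreads the work across many machines, but the basic follow-download-queue cycle is the same.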

While Web crawlers gave the early search engines a much larger list of sites than manual collection could, they couldn't perform the other manual tasks: figuring out what the pages were about and ranking the best ones first. The search engines began building computer programs to handle these tasks as well. For instance, a program could catalog all the words on a page to help determine what that page was about.
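One simple way to picture that word cataloging is as a lookup table recording which pages each word appears on, so a query can be answered by finding the pages that contain all of its words. The following sketch builds on the hypothetical crawl() function above and is only an illustration of the idea, not any engine's actual indexing pipeline.

import re
from collections import defaultdict

def build_word_catalog(index):
    """Record which pages each word appears on, given the
    URL -> page-contents mapping produced by a crawl."""
    catalog = defaultdict(set)              # word -> set of URLs containing it
    for url, html in index.items():
        text = re.sub(r"<[^>]+>", " ", html)    # strip HTML tags
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            catalog[word].add(url)
    return catalog

def pages_matching(catalog, query):
    """Return the pages that contain every word in the query."""
    sets = [catalog.get(word, set()) for word in query.lower().split()]
    return set.intersection(*sets) if sets else set()

Cataloging words this way tells an engine which pages are about a topic, but it says nothing about which of those pages is best, which is the problem the next section turns to.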

The Introduction of PageRank

Google's "PageRank" algorithm in 1998 was a big step forward in automatically cataloging ...
