Chapter 3. Search Technology

Any search application has three fundamental components. First, an index is created of the information in a set of documents. Next, the words in a search query are matched against this index, and all of the documents that match the words in the query are compiled into a list. Finally, this list of results is ranked in descending order of relevance. To anyone unfamiliar with the technology of search, it all seems so simple, but there is much more involved in the mechanics behind the search process. In reality, any search application consists of a set of modules, each of which carries out a specific task in the search process. Some of these modules may be brought in by the search vendor, and others will be developed internally. The same is true of open source software development.
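The three components can be illustrated with a deliberately small Python sketch. It is not how any particular search product works: the function names (tokenize, build_index, search), the whitespace tokenizer, and the scoring by raw word counts are all assumptions chosen for brevity, and a real application would replace each of them with a far more capable module.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(documents):
    """Indexing: map each word to a Counter of {doc_id: occurrences of that word}."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word][doc_id] += 1
    return index


def search(index, query):
    """Return ids of documents containing every query word, highest score first."""
    words = tokenize(query)
    if not words:
        return []
    # Matching: keep only the documents that contain all of the query words.
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))
    # Ranking: sum the word counts per document, a crude stand-in for relevance.
    scores = {d: sum(index[w][d] for w in words) for d in matches}
    # Sort by descending score, breaking ties by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical three-document collection used only for illustration.
docs = {
    "a": "search index search query",
    "b": "search results ranked by relevance",
    "c": "user interface design",
}
index = build_index(docs)
print(search(index, "search"))
print(search(index, "search relevance"))
```

Even in this toy version the three stages are separate pieces of code, which hints at why a real application is built from modules: the tokenizer, the index structure, and the ranking function can each be swapped out independently.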

Users should not have to know anything about search technology to be able to use it effectively, but understanding the elements of search technology is important in the selection, testing, and management of a search application. This is because one or more of these modules may be especially important in meeting a specific user requirement. It is very much a question of the chain being only as strong as its weakest link. If there are limitations in the way that content is indexed, then it does not matter how elegant the user interface looks: information critical to the operations of the organization could remain invisible. ...

