Keyword query processing and response ranking, described in Chapter 3, depend on computing a measure of similarity between the query and the documents in the collection. Although the vector-space model treats the query on a par with the documents, the query is usually much shorter and prone to ambiguity (the average Web query is only two to three words long). For example, the query star is highly ambiguous: it retrieves documents about astronomy, plants and animals, popular media and sports figures, and American patriotic songs. The vector-space similarity (see Chapter 3) of these documents to the single-word query may carry no hint that documents pertaining to these topics are highly dissimilar from one another. However, if the search clusters

