Drawbacks of the TF-IDF model

Suppose a customer on an e-commerce website is looking for a jacket with a unique design and enters the keywords "unique jacket". What happens at the Solr end?

http://solr.server/solr/clothes/?q=unique+jacket
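As a minimal sketch of how the query string above comes about, the following snippet URL-encodes the customer's keywords. The host `solr.server` and the core `clothes` are simply the placeholders from the example URL, and the variable names are illustrative only.

```python
from urllib.parse import urlencode

# Placeholder host and core, copied from the example URL above.
base_url = "http://solr.server/solr/clothes/"

# The raw keywords as typed by the customer.
params = {"q": "unique jacket"}

# urlencode() encodes the space between the two keywords as "+".
query_url = base_url + "?" + urlencode(params)
print(query_url)
```

The space between the two keywords is encoded as `+`, which is why the request carries `q=unique+jacket`.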

Now, "unique" is a comparatively rare keyword, so fewer items or documents mention it in their description. Let us see how this affects the ranking of our results under the TF-IDF scoring algorithm. The following diagram revisits the scoring algorithm with respect to this query:


A relook at the TF-IDF scoring algorithm
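To make the effect of a rare term concrete, here is a rough numeric sketch. It assumes the term-frequency and inverse-document-frequency formulas of Lucene's classic TF-IDF similarity (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), with idf entering the score squared) and deliberately ignores field norms, boosts, the coordination factor, and query normalization. The corpus size, document frequencies, and the two sample documents are hypothetical numbers chosen only for illustration.

```python
import math


def tf(freq: int) -> float:
    """Term frequency factor: square root of the raw count."""
    return math.sqrt(freq)


def idf(num_docs: int, doc_freq: int) -> float:
    """Inverse document frequency: rarer terms get a larger value."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def simple_score(term_freqs: dict, doc_freqs: dict, num_docs: int) -> float:
    """Sum tf * idf^2 over the query terms (norms and boosts are ignored)."""
    return sum(
        tf(freq) * idf(num_docs, doc_freqs[term]) ** 2
        for term, freq in term_freqs.items()
        if freq > 0
    )


# Hypothetical index statistics: "jacket" is common, "unique" is rare.
NUM_DOCS = 10_000
DOC_FREQS = {"unique": 20, "jacket": 2_000}

# Hypothetical documents: a scarf described as "unique", and a jacket
# whose description mentions "jacket" three times but never "unique".
scarf = {"unique": 1, "jacket": 0}
jacket = {"unique": 0, "jacket": 3}

scarf_score = simple_score(scarf, DOC_FREQS, NUM_DOCS)
jacket_score = simple_score(jacket, DOC_FREQS, NUM_DOCS)

print(f"idf(unique) = {idf(NUM_DOCS, DOC_FREQS['unique']):.3f}")
print(f"idf(jacket) = {idf(NUM_DOCS, DOC_FREQS['jacket']):.3f}")
print(f"scarf score  = {scarf_score:.3f}")
print(f"jacket score = {jacket_score:.3f}")
```

Under these assumed numbers, the item that merely mentions "unique" once outscores an actual jacket, because the high idf of the rare term outweighs the repeated matches on the common term.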

The following parameters in ...
