Overriding similarity

The Similarity class is an abstract class that defines a set of components for score calculation. To steer away from default scoring, we can create a new class that extends DefaultSimilarity (a TFIDFSimilarity subclass) or one of the other Similarity classes. In this section, we will experiment to see how each scoring component affects the overall score.

Let's begin by reviewing Similarity's methods:

  • computeNorm(FieldInvertState): This calculates a normalization value for a Field at indexing time.
  • computeWeight(float, CollectionStatistics, TermStatistics): This returns a SimWeight object to calculate a score. It accepts a boost (float) value for query-time boosting.
  • coord(int, int): This returns a score factor based ...
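
To make the idea of overriding a scoring component concrete, here is a minimal stand-alone sketch. It does not use the Lucene API: TfIdfSketch and FlatTfSketch are hypothetical classes that only mimic the shape of DefaultSimilarity, using the well-known default formulas for tf, idf, and coord, and the score method is a simplified combination that leaves out boosts, norms, and query normalization.

```java
// Stand-alone sketch: these classes only mimic the shape of Lucene's
// DefaultSimilarity. They are hypothetical and do not use the Lucene API.
class TfIdfSketch {
    // Term frequency factor: square root of the raw frequency.
    float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger factor.
    float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Coordination factor: fraction of query terms matched by the document.
    float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Simplified single-term score (assumption: boosts and norms omitted).
    float score(float freq, long docFreq, long numDocs,
                int overlap, int maxOverlap) {
        float idf = idf(docFreq, numDocs);
        return coord(overlap, maxOverlap) * tf(freq) * idf * idf;
    }
}

// Hypothetical override: ignore how often a term repeats in a document.
class FlatTfSketch extends TfIdfSketch {
    @Override
    float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}
```

Because score calls tf through the instance, swapping in FlatTfSketch changes the result for any document where a term occurs more than once, while idf and coord keep their default behavior. Extending one of the real Similarity classes follows the same pattern: override only the component you want to change.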
