Available similarity models
As already mentioned, TF-IDF was the original and default similarity model before Apache Lucene 6.0. Lucene 6.0 changed the default to BM25, which we discussed in detail in the section The changed default text scoring in Lucene: BM25 of Chapter 2, The Improved Query DSL.
Apart from BM25, other similarity models that we can use are:
- TF-IDF (classic): This similarity model is based on the TF-IDF model and was the default similarity model before Elasticsearch 5.0. In order to use this similarity in Elasticsearch, you need to use the
- Divergence from randomness (DFR): This similarity model is based on the probabilistic model of the same name. In order to use this similarity in Elasticsearch, ...
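
To make the list above more concrete, here is a minimal sketch of how a similarity might be selected per field when creating an index. It only builds and prints the JSON request body and does not contact a cluster. Several things in it are assumptions rather than details from the text: the index type name doc, the field names title and body, the custom similarity name my_dfr, and the DFR parameter values, which are illustrative only. It also assumes that the name shown in parentheses above (classic) is the identifier Elasticsearch 5.x expects for the TF-IDF similarity.

```python
import json


def build_index_body(type_name="doc"):
    """Build a create-index request body that assigns similarities per field.

    All names here (type_name, "title", "body", "my_dfr") are hypothetical
    and chosen for illustration only.
    """
    return {
        "settings": {
            "index": {
                "similarity": {
                    # Custom similarity definition; parameter values are
                    # illustrative assumptions, not tuned recommendations.
                    "my_dfr": {
                        "type": "DFR",
                        "basic_model": "g",
                        "after_effect": "l",
                        "normalization": "h2",
                        "normalization.h2.c": "3.0",
                    }
                }
            }
        },
        "mappings": {
            type_name: {
                "properties": {
                    # Assumed built-in name for the TF-IDF similarity.
                    "title": {"type": "text", "similarity": "classic"},
                    # Refers to the custom similarity defined in settings.
                    "body": {"type": "text", "similarity": "my_dfr"},
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
```

The printed JSON could be sent as the body of a create-index request. Fields that do not declare a similarity keep the index default, which is BM25 as described above.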