Challenges of large-scale indexing

Let us understand how indexing happens and what can be done to speed it up. We will also look at the challenges faced during the indexing of a large number of documents or bulky documents. An e-commerce site is a perfect example of a site containing a large number of products, while a job site is an example of a search where documents are bulky because of the content in candidate resumes.

During indexing, Solr first analyzes the documents and converts them into tokens that are stored in the RAM buffer. When the RAM buffer is full, data is flushed into a segment on the disk. When the numbers of segments are more than that defined in the MergeFactor class of the Solr configuration, the segments are merged. Data ...

Get Apache Solr Search Patterns now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.