Chapter 5. Deploy the Recommender

Before discussing in more detail why search technology such as Solr or Elasticsearch is a practical choice for deploying a recommendation engine in production, let’s take a quick look at what Apache Solr and Apache Lucene actually are.

What Is Apache Solr/Lucene?

The Apache Lucene project produces two primary software artifacts. One is called Lucene-Core (usually abbreviated to simply Lucene) and the other is called Solr. Lucene-Core is a software library that provides the functions for a document-oriented database that is particularly good at text retrieval. Solr is a web application that provides a full, working web service to simplify access to the capabilities of Lucene-Core. For convenience in this discussion, we will mostly just say “Solr,” since it is not necessary to access the Lucene-Core library directly for recommendations.

Data loaded into Solr is organized into collections, and each collection is made up of documents. Each document holds specific information about an item in fields. Fields that are indexed become searchable through Solr’s retrieval capabilities, and it is this search capability that we exploit to deploy the recommender. Fields that are stored can be displayed to users in a web interface.
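To make the distinction between indexed and stored fields concrete, here is a minimal sketch in plain Python. It is a toy model, not Solr's API or its internal data structures: the class ToyCollection, the FIELDS table, the field names (id, title, keywords), and the sample items are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical field definitions that mimic the indexed/stored distinction.
FIELDS = {
    "id": {"indexed": True, "stored": True},
    "title": {"indexed": False, "stored": True},     # displayable, not searchable
    "keywords": {"indexed": True, "stored": False},  # searchable, not displayable
}


class ToyCollection:
    """A toy collection of documents; an illustration only, not how Solr works."""

    def __init__(self, fields):
        self.fields = fields
        self.inverted = defaultdict(set)  # (field, term) -> ids of matching documents
        self.stored = {}                  # document id -> stored fields only

    def add(self, doc):
        doc_id = doc["id"]
        # Keep only the stored fields for later display.
        self.stored[doc_id] = {
            name: value for name, value in doc.items() if self.fields[name]["stored"]
        }
        # Index the terms of every indexed field so they can be searched.
        for name, value in doc.items():
            if self.fields[name]["indexed"]:
                for term in str(value).lower().split():
                    self.inverted[(name, term)].add(doc_id)

    def search(self, field, term):
        # Look up matching ids, then return what can be displayed for each.
        ids = sorted(self.inverted.get((field, term.lower()), set()))
        return [self.stored[doc_id] for doc_id in ids]


items = ToyCollection(FIELDS)
items.add({"id": "m1", "title": "Toy Story", "keywords": "animation family"})
items.add({"id": "m2", "title": "Heat", "keywords": "crime drama"})

by_keyword = items.search("keywords", "animation")  # matches m1 via an indexed field
by_title = items.search("title", "heat")            # title is not indexed, so no match
```

In this sketch a search on keywords finds the first item but returns only its id and title, because keywords is indexed without being stored. The search on title finds nothing, because title is stored without being indexed.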

Why Use Apache Solr/Lucene to Deploy?

Lucene, which is at the heart of Solr, works by taking words (usually called “terms”) in the query and attaching a weight to each one. Then Solr examines every document that contains ...
