February 2017
Intermediate to advanced
696 pages
12h 24m
English
In Spark you can read data from many sources, but NoSQL datastores such as HBase, Accumulo, and Cassandra generally offer only a limited query subset, so you often need to scan all the data to read just the part you require. With Elasticsearch, you can retrieve only the subset of documents that match your Elasticsearch query.
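As a rough illustration of this idea, the following PySpark sketch passes a query to Elasticsearch so that only matching documents reach Spark. It assumes the elasticsearch-hadoop connector JAR is available to Spark; the helper names `es_read_options` and `read_matching`, the default host and port, and the index name and query in the usage note are hypothetical and not taken from the recipe.

```python
def es_read_options(query, nodes="localhost", port=9200):
    """Build connector settings for a filtered read.

    The host and port defaults are assumptions for a local test node.
    The value of es.query is evaluated by Elasticsearch, not by Spark.
    """
    return {
        "es.nodes": nodes,
        "es.port": str(port),
        "es.query": query,
    }


def read_matching(spark, resource, query, **conn):
    """Return a DataFrame with only the documents that match `query`.

    `spark` is an existing SparkSession; `resource` names the index to read.
    """
    return (
        spark.read.format("org.elasticsearch.spark.sql")
        .options(**es_read_options(query, **conn))
        .load(resource)
    )
```

With a running `SparkSession` named `spark`, a call such as `read_matching(spark, "my-index/my-type", "?q=name:spark")` would ask Elasticsearch for the matching documents only, instead of scanning the whole index in Spark.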
You need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
You also need a working installation of Apache Spark and the data indexed in the previous example.
To read data from Elasticsearch via Apache Spark, we will perform the following steps: