February 2017
Intermediate to advanced
696 pages
12h 24m
English
Using a simple map to ingest data is good only for simple jobs. The best practice in Spark is to use a case class, which gives you fast serialization and lets you manage complex type checking. During indexing, providing custom IDs can also be very handy. In this recipe, we will see how to address these issues.
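As a rough illustration of the idea, the following sketch indexes an RDD of case class instances and takes the document ID from one of the fields. It assumes a spark-shell session (so `sc` already exists) started with the elasticsearch-hadoop Spark connector on the classpath and an Elasticsearch node reachable with the connector's default settings. The `Person` class, its fields, and the `spark/persons` index/type are hypothetical names chosen for the example.

```scala
// Assumes spark-shell with the elasticsearch-hadoop connector jar on the classpath.
import org.elasticsearch.spark.rdd.EsSpark

// Hypothetical document type; a case class gives typed fields instead of a loose Map.
case class Person(username: String, name: String, age: Int)

// `sc` is the SparkContext that spark-shell provides.
val persons = sc.makeRDD(Seq(
  Person("bob", "Bob", 19),
  Person("susan", "Susan", 21)
))

// es.mapping.id names the field whose value becomes the document _id.
// "spark/persons" is an illustrative index/type target.
EsSpark.saveToEs(persons, "spark/persons", Map("es.mapping.id" -> "username"))
```

With this configuration, each document should be stored under the value of its `username` field rather than an auto-generated ID, so re-running the job overwrites the same documents instead of creating duplicates.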
You need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
You also need a working installation of Apache Spark.
To store data in Elasticsearch via Apache Spark, we will perform the following steps:
./bin/spark-shell