Chapter 17: Big Data Integration

Elasticsearch has become a common component in big data architectures because it provides the following features:

  • It lets you search massive amounts of data quickly.
  • For common aggregation operations, it provides real-time analytics on big data.
  • An Elasticsearch aggregation is often easier to write than the equivalent Spark job.
  • If you need to move on to a fast data solution, starting from the subset of documents returned by a query is faster than rescanning all of your data.
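
As a minimal sketch of the last two points, the following Python snippet builds a search request body that first narrows the data with a query and then runs a terms aggregation over only the matching documents. The field names (status, host) and the helper build_search_body are hypothetical and chosen only for illustration; the resulting body is the kind of JSON you would send to the _search endpoint of an index.

```python
import json


def build_search_body(filter_field, filter_value, agg_field, buckets=10):
    """Build a search body that filters documents with a term query and
    aggregates only over the matching subset (hypothetical helper)."""
    return {
        # size 0: return only the aggregation results, not the matching hits
        "size": 0,
        # the query restricts the documents the aggregation will see
        "query": {"term": {filter_field: filter_value}},
        "aggs": {
            "by_" + agg_field: {
                # terms aggregation: one bucket per distinct field value
                "terms": {"field": agg_field, "size": buckets}
            }
        },
    }


# Hypothetical field names, used only for illustration
body = build_search_body("status", "error", "host")
print(json.dumps(body, indent=2))
```

Because the aggregation runs only on the documents selected by the query, a downstream job can start from that reduced set instead of scanning the whole dataset.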

The most widely used software for big data processing is now Apache Spark (http://spark.apache.org/), which is considered the evolution of the obsolete Hadoop MapReduce because it moves processing from disk to memory.

In this ...
