Chapter 17: Big Data Integration
Elasticsearch has become a common component in big data architectures because it provides the following features:
- It allows you to search massive amounts of data quickly.
- For common aggregation operations, it provides real-time analytics on big data.
- For common cases, an Elasticsearch aggregation is usually easier to write than the equivalent Spark one.
- If you need to move on to a fast data solution, starting from a subset of documents after a query is faster than doing a full rescan of all your data.
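To make the aggregation and "subset after a query" points concrete, here is a minimal sketch of a search body that first narrows the documents with a query and then aggregates only the matches. The field names (`status`, `host.keyword`), the aggregation name `by_group`, and the helper function are hypothetical and chosen for illustration. Only the `size`, `query`, `term`, `aggs`, and `terms` keys come from the Elasticsearch query DSL.

```python
import json


def build_filtered_aggregation(field, value, group_by, size=10):
    """Build a search body that filters documents, then aggregates the matches.

    Hypothetical helper: it only assembles the JSON body and sends nothing.
    """
    return {
        "size": 0,  # skip the hits, return only the aggregation results
        "query": {"term": {field: value}},  # restrict to a subset of documents
        "aggs": {
            # bucket the matching documents by the values of `group_by`
            "by_group": {"terms": {"field": group_by, "size": size}}
        },
    }


# Assumed field names, for illustration only
body = build_filtered_aggregation("status", "error", "host.keyword")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the `_search` endpoint of an index. Because the query runs first, the aggregation works on the filtered subset rather than on the full dataset.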
The most common software for processing big data is now Apache Spark (http://spark.apache.org/), which is considered the evolution of the now-obsolete Hadoop MapReduce because it moves processing from disk to memory.
In this ...