Overview of ES-Hadoop
As we mentioned, the ES-Hadoop feature contains two major areas: distributed computing and distributed storage. The main goal of ES-Hadoop is to connect Elasticsearch and Hadoop so that each can benefit from the other's strengths in distributed computing, distributed storage, search, analytics, and visualization. We can import Hadoop Distributed File System (HDFS) data into Elasticsearch for search and analysis, and export Elasticsearch data to HDFS for snapshot and restore. ES-Hadoop supports the main frameworks of the Hadoop ecosystem, including Spark, Hive, Pig, Storm, Cascading, and, of course, standard MapReduce. Let's take a look at the data flow between Elasticsearch, ES-Hadoop, and components in the Hadoop ecosystem, as shown ...
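To make the HDFS-to-Elasticsearch import direction described above more concrete, here is a minimal PySpark-oriented sketch. It is an illustration under assumptions, not the book's own example: the host `localhost`, port `9200`, index name `hdfs-import`, and both helper function names are hypothetical, and the copy step assumes a SparkSession started with the elasticsearch-spark connector jar on its classpath. The `es.nodes`, `es.port`, and `es.resource` keys are ES-Hadoop connector settings.

```python
from typing import Dict


def es_hadoop_options(nodes: str, port: int, resource: str) -> Dict[str, str]:
    """Build ES-Hadoop connector settings as string key/value pairs.

    Hypothetical helper: the setting keys are real, the function name is not
    part of any library.
    """
    if not resource:
        raise ValueError("resource (target index) must not be empty")
    return {
        "es.nodes": nodes,        # Elasticsearch host(s) to connect to
        "es.port": str(port),     # REST port, passed as a string
        "es.resource": resource,  # target index for reads and writes
    }


def copy_hdfs_to_elasticsearch(spark, hdfs_path: str, options: Dict[str, str]) -> None:
    """Read JSON lines from HDFS and index them into Elasticsearch.

    Assumes `spark` is a SparkSession with the elasticsearch-spark jar
    available; `hdfs_path` is a placeholder such as an hdfs:// URI.
    """
    df = spark.read.json(hdfs_path)
    (
        df.write.format("org.elasticsearch.spark.sql")
        .options(**options)
        .mode("append")
        .save()
    )


# Assumed values for illustration only.
options = es_hadoop_options("localhost", 9200, "hdfs-import")
print(options)
```

Building the settings separately from the Spark call keeps the connection details in one place, so the same dictionary could be reused when reading from Elasticsearch back into a DataFrame for the export direction.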