Apache Hive Essentials by Dayong Du

Overview of the Hadoop ecosystem

Apache released Hadoop version 1.0.0 in 2011. It contained only HDFS and MapReduce. From the very beginning, Hadoop was designed as both a computing platform (MapReduce) and a storage platform (HDFS). As the need for big data analysis grew, Hadoop attracted many other software projects that address big data problems together, and these merged into a Hadoop-centric big data ecosystem. The following diagram gives a brief overview of the Hadoop ecosystem and its core software components:

The Hadoop ecosystem

In the current Hadoop ecosystem, HDFS is still the major storage option. On top of it, snappy, ...
