Hadoop 3.0 releases and new features

Apache Hadoop development proceeds on multiple tracks: the 2.x, 3.0.x, and 3.1.x release lines were released in parallel. Hadoop 3.x branched off from Hadoop 2.x six years ago. We will look at the major improvements in the latest 3.x and 2.x releases. In Hadoop 3.0, each area has seen a major overhaul, as the following quick overview shows:

  • HDFS benefited from the following:
    • Erasure coding
    • Support for multiple standby NameNodes
    • Intra-DataNode balancer
  • Improvements to YARN include the following:
    • Improved support for long-running services
    • Docker support and isolation
    • Enhancements to the scheduler
    • Application Timeline Service v.2
    • A new user interface for YARN
    • YARN Federation
  • MapReduce received ...
