Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem

by Douglas Eadline
October 2015
Content level: Beginner to intermediate
304 pages
8h 42m
English
Addison-Wesley Professional

Overview

Get Started Fast with Apache Hadoop® 2, YARN, and Today’s Hadoop Ecosystem

With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and storage. Hadoop 2.x installations offer unmatched scalability and breakthrough extensibility that supports new and existing Big Data analytics processing methods and models.

Hadoop® 2 Quick-Start Guide is the first easy, accessible guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Building on his unsurpassed experience teaching Hadoop and Big Data, author Douglas Eadline covers all the basics you need to know to install and use Hadoop 2 on personal computers or servers, and to navigate the powerful technologies that complement it.

Eadline concisely introduces and explains every key Hadoop 2 concept, tool, and service, illustrating each with a simple “beginning-to-end” example and identifying trustworthy, up-to-date resources for learning more.

This guide is ideal if you want to learn about Hadoop 2 without getting mired in technical details. Douglas Eadline will bring you up to speed quickly, whether you’re a user, admin, devops specialist, programmer, architect, analyst, or data scientist.

Coverage Includes

  • Understanding what Hadoop 2 and YARN do, and how they improve on Hadoop 1 with MapReduce

  • Understanding Hadoop-based Data Lakes versus RDBMS Data Warehouses

  • Installing Hadoop 2 and core services on Linux machines, virtualized sandboxes, or clusters

  • Exploring the Hadoop Distributed File System (HDFS)

  • Understanding the essentials of MapReduce and YARN application programming

  • Simplifying programming and data movement with Apache Pig, Hive, Sqoop, Flume, Oozie, and HBase

  • Observing application progress, controlling jobs, and managing workflows

  • Managing Hadoop efficiently with Apache Ambari, including recipes for the HDFS-to-NFSv3 gateway, HDFS snapshots, and YARN configuration

  • Learning basic Hadoop 2 troubleshooting, and installing Apache Hue and Apache Spark

Publisher Resources

ISBN: 9780134050119