An introduction to storing, structuring, and analyzing data at scale with Hadoop
About This Video
- Explore Hadoop and its ecosystem of core components, and set up an instance
- Import, organize, and query data with HDFS, Flume, Sqoop, and Hive
- Learn Pig, a simplified scripting language for Hadoop, to manipulate your data
Hadoop emerged in response to the vast volumes of data that organizations now collect, offering a robust way to store, process, and analyze what has become commonly known as Big Data. It comprises a comprehensive stack of components designed to perform these tasks at distributed scale, across clusters of up to thousands of machines.
Learning Hadoop 2 introduces you to the powerful system synonymous with Big Data, demonstrating how to create an instance and leverage the Hadoop ecosystem's many components to store, process, manage, and query massive data sets with confidence.
We open this course with an overview of the Hadoop component ecosystem, including HDFS, Sqoop, Flume, YARN, MapReduce, Pig, and Hive, before installing and configuring our Hadoop environment. We then take a look at Hue, a web-based graphical user interface for Hadoop.
We will then explore HDFS, Hadoop’s file system for storing data, and learn how to import and export data, both manually and automatically. Afterward, we turn our attention to running computations with MapReduce and get to grips with Hadoop’s scripting language, Pig. Lastly, we will load data from HDFS into Hive and demonstrate how Hive can be used to structure and query data sets.
Table of Contents
- Chapter 1 : The Hadoop Ecosystem
- Chapter 2 : Installing and Configuring Hadoop
- Chapter 3 : Data Import and Export
- Chapter 4 : Using MapReduce and Pig
- Chapter 5 : Using Hive
- Title: Learning Hadoop 2
- Release date: November 2015
- Publisher(s): Packt Publishing
- ISBN: 9781785888113