An introduction to storing, structuring, and analyzing data at scale with Hadoop
About This Video
Explore Hadoop and its ecosystem of core components, and set up an instance
Import, organize, and query data with HDFS, Flume, Sqoop, and Hive
Learn Pig, a simplified scripting language for Hadoop, to manipulate your data
Hadoop emerged in response to the ever-growing volumes of data collected by organizations, offering a robust way to store, process, and analyze what has become commonly known as Big Data. It comprises a comprehensive stack of components designed to perform these tasks at distributed scale, across clusters of up to thousands of machines.
Learning Hadoop 2 introduces you to the powerful system synonymous with Big Data, demonstrating how to create an instance and leverage the Hadoop ecosystem's many components to store, process, manage, and query massive data sets with confidence.
We open this course with an overview of the Hadoop component ecosystem, including HDFS, Sqoop, Flume, YARN, MapReduce, Pig, and Hive, before installing and configuring our Hadoop environment. We also take a look at Hue, a web-based graphical user interface for Hadoop.
We will then explore HDFS, Hadoop’s file system used to store data, and learn how to import and export data, both manually and automatically. Afterward, we turn our attention to running computations using MapReduce and get to grips with Hadoop’s scripting language, Pig. Lastly, we will load data from HDFS into Hive and demonstrate how it can be used to structure and query data sets.
Table of contents
- Chapter 1 : The Hadoop Ecosystem
- Chapter 2 : Installing and Configuring Hadoop
- Chapter 3 : Data Import and Export
- Chapter 4 : Using MapReduce and Pig
- Chapter 5 : Using Hive
- Title: Learning Hadoop 2
- Release date: November 2015
- Publisher(s): Packt Publishing
- ISBN: 9781785888113