Perform real-time data analytics, stream and batch processing on your application using Hadoop
About This Video
- Get a clear understanding of the storage paradigm of Hadoop.
- Understand how to process structured, unstructured, and semi-structured data.
- Learn to move data from sources such as RDBMSs, web log servers, syslog servers, and social media.
Hadoop is one of the best-known open-source software frameworks for distributed computing, and learning it gives you a way to advance your career and skills. You will start out by learning the basics of Hadoop, including its file system (HDFS), its cluster resource manager (YARN), and its many libraries and programming tools. This course will get you started with the major Hadoop components that the industry demands. You will see how structured, unstructured, and semi-structured data can be processed with Hadoop.
This course focuses mainly on the problems faced in Big Data and the solutions offered by the respective Hadoop components. You will learn to use components and tools such as MapReduce to process raw data, and see how tools such as Hive and Pig aid in this process. You will then move on to data analysis techniques with Hadoop using tools such as Hive, and learn to apply them in a real-world Big Data application. This course will teach you to perform real-time data analytics and stream and batch processing on your application. Finally, it will also teach you how to extend your analytics solutions to the cloud.
The code for this course is available on GitHub: https://github.com/PacktPublishing/Hands-on-Big-Data-Processing-with-Hadoop-3
Downloading the example code for this course: You can download the example code files for all Packt video courses you have purchased from your account at http://www.PacktPub.com. If you purchased this course elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.
Table of Contents
- Chapter 1 : What Is Hadoop?
- Chapter 2 : Making Hadoop Efficient – YARN Architecture
  - The Rise of Resource Manager 00:13:45
  - YARN Architecture 00:04:51
  - How YARN Has Effectively Increased the Potential of Hadoop 00:04:55
  - Classic versus YARN 00:02:52
  - YARN Daemons 00:02:42
  - Containers 00:03:35
  - Speculative Execution 00:02:38
  - HDFS Federation 00:02:46
  - Authentication and High Availability 00:03:27
  - Understanding the Major Changes in Different Versions of Hadoop – 1.X, 2.X, and 3.X 00:06:03
- Chapter 3 : Analyze Data with MapReduce Basics
- Chapter 4 : Analyzing Structured Data with Hadoop
- Chapter 5 : Efficient Data Transfer with Sqoop
- Chapter 6 : Managing Data Collection and Transfer with Flume
- Chapter 7 : Perform Data Execution with Pig
- Title: Hands-On Big Data Processing with Hadoop 3
- Release date: October 2018
- Publisher(s): Packt Publishing
- ISBN: 9781788997553