Video description
In this Learning Path, you’ll learn how to integrate Hadoop components to implement big data solutions for a variety of use cases, including clickstream analytics, time series problems, transferring data between Hadoop and relational databases, and applications in the finance sector.
Table of contents
- Introduction to Clickstream Case Study 00:11:19
- Requirements 00:08:04
- Data Modeling 00:14:55
- Data Ingest 00:16:16
- Data Processing Engines - Part 1 00:16:23
- Data Processing Engines - Part 2 00:10:59
- Data Processing Patterns 00:09:32
- Orchestration 00:14:34
- Putting It All Together 00:03:08
- Demo 00:21:47
- Q&A 00:24:35
Introduction
- Introduction to Time Series Problems 00:09:58
Kafka
- Kafka Architecture and Deployment 00:11:33
- Kafka Usage 00:03:42
Spark
- Introduction to Spark 00:15:43
- Spark Architecture 00:12:02
- Spark Streaming
Cassandra
- Introduction to Cassandra 00:08:56
- Cassandra Basic Architecture 00:11:59
- Replication, High Availability and Multi Datacenter 00:14:06
- Cassandra Weather Website Example 00:11:46
- Cassandra Query Language (CQL) 00:18:00
- Cassandra Partitions & Clustering 00:08:22
- Cassandra Read and Write Path 00:12:17
- Working with Cassandra 00:06:32
- Cassandra Drivers and Access Patterns 00:10:37
Spark and Cassandra
- Spark and Cassandra Architecture 00:12:00
- Analyzing Cassandra Data & Spark SQL 00:12:12
- Spark and Cassandra DataStax Enterprise 00:04:31
- Real World Use Cases
Introduction
- Course Introduction 00:04:21
- About The Author 00:04:14
- What Is Big Data 00:11:07
- Historical Approaches 00:07:04
- Modern-Day Approach 00:12:42
- What Is Hadoop 00:11:05
- Hadoop Core Vs Ecosystem 00:05:03
- Hadoopable Problems 00:06:37
Hadoop Basics
- HDFS And Yarn 00:08:14
- Hive And Pig Interface Introduction 00:05:59
- Introduction To Spark 00:04:37
- Hadoop In The Cloud (Amazon Web Services Intro) 00:08:49
- Installing Hadoop Into EMR Part - 1 00:15:31
- Installing Hadoop Into EMR Part - 2 00:15:34
- Installing Cloudera Quickstart VM 00:11:01
- Web GUIs 00:11:06
Hadoop Distributed Filesystem (HDFS)
- HDFS Architecture 00:10:05
- HDFS File Write Walkthrough 00:17:57
- Secondary Name Node 00:06:38
- Basic HDFS Commands 00:09:23
- Using HDFS Commands Part - 1 00:07:34
- Using HDFS Commands Part - 2 00:09:27
- HA And Federation Basics 00:12:48
- HDFS Access Controls (Or Lack Thereof) 00:09:34
Yarn
- Yarn Purpose 00:06:16
- Yarn Architecture 00:07:25
- Yarn With Spark 00:06:44
MapReduce
- MapReduce Explained 00:11:52
- MapReduce Architecture 00:07:36
- MapReduce Code Walkthrough 00:11:59
- MapReduce Details Walkthrough 00:04:45
- Running MapReduce Job 00:08:59
HDFS Data Import And Export
- Import/Export Options 00:11:12
- Flume Introduction 00:10:53
- Using Flume 00:13:43
- Sqoop Introduction 00:09:25
- Using Sqoop 00:17:01
- HDFS Interaction Tools 00:06:01
- Oozie Introduction 00:10:17
Spark Basics
- Spark Value Propositions 00:08:30
- Spark Run Modes (Yarn, Standalone, Mesos) 00:07:33
- RDDs And Dataframes 00:17:24
- Hands On Spark Part - 1 00:08:12
- Hands On Spark Part - 2 00:10:38
- Running Spark Part - 1 00:09:58
- Running Spark Part - 2 00:13:55
- Optimizing And Debugging Spark 00:18:17
- Spark Libraries Overview 00:09:05
Spark Built-In Libraries
- Spark SQL 00:09:01
- Spark SQL Usage 00:12:02
- MLlib Basics 00:15:30
- Common MLlib Usage Part - 1 00:15:02
- Common MLlib Usage Part - 2 00:08:23
- Spark Streaming 00:12:43
- GraphX 00:09:58
Hive And Pig
- Hive Vs Pig 00:09:53
- Hive Basics 00:11:53
- Analysis With Hive 00:10:54
- Pig Basics 00:14:38
- ETL And Analytics With Pig 00:20:16
Hadoop In The Cloud
- Hadoop/Cloud Use Cases 00:05:16
- Elastic MapReduce (EMR) 00:12:47
Ecosystem
- HBase Basics 00:11:16
- Enterprise Integration 00:10:39
Wrap Up
- Wrap Up 00:03:41
Introduction to Sqoop
- Introduction 00:03:45
- About The Author 00:00:48
- Use Case #1: ELT 00:05:32
- Use Case #2: ETL From DWH 00:03:04
- Use Case #3: Data Analysis 00:03:38
- Use Case #4: Data Archival 00:02:02
- Use Case #5: Move Reports To Hadoop 00:05:26
- Use Case #6: Data Consolidation 00:02:54
Importing Data To Hadoop From A Relational Database
- Command Line Basics: Importing Data Using Sqoop 00:09:13
- Importing Data With Column Filters, Row Filters, And Free Text Queries 00:06:12
- Parallel Imports 00:04:33
- Import Data Directory To HIVE Tables 00:07:25
- Incremental Data Import Overview 00:06:00
- Incremental Data Import And Using Sqoop Stored Jobs 00:11:05
- Sqoop Hands-On: Exporting Data From Hadoop To A Relational Database
Advanced topics
- Introduction to Sqoop2 Server 00:04:18
Course summary
- Wrap Up 00:04:07
- Continuous curation of event data for a customer event hub - Arvind Prabhakar (StreamSets) 00:40:27
- Big data governance - Steven Totman (Cloudera), Mark Donsky (Cloudera), Kristi Cunningham (Capital One), Ben Harden (CapTech Consulting) 00:42:12
- Preventing a big data security breach - Sam Heywood (Cloudera), Nick Curcuru (MasterCard Advisors), Ritu Kama (Intel) 00:39:23
- Hive + Amazon EMR + S3 = Elastic big data SQL analytics processing in the cloud, a real-world case study - Jaipaul Agonus (FINRA) 00:42:52
Product information
- Title: Learning Path: Understanding Tool Integration for Big Data Architecture
- Author(s):
- Release date: December 2016
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491978634