Architectural Considerations for Hadoop Applications

Video Description

Implementing solutions with Apache Hadoop requires understanding not just Hadoop, but a broad range of related projects in the Hadoop ecosystem such as Hive, Pig, Oozie, Sqoop, and Flume. The good news is that there’s an abundance of materials – books, web sites, conferences, etc. – for gaining a deep understanding of Hadoop and these related projects. The bad news is that there’s still a scarcity of information on how to integrate these components to implement complete solutions. In this video we’ll walk through an end-to-end case study of a clickstream analytics engine to provide a concrete example of how to architect and implement a complete solution with Hadoop.

Table of Contents

  1. Introduction to Clickstream Case Study 00:11:19
  2. Requirements 00:08:04
  3. Data Modeling 00:14:55
  4. Data Ingest 00:16:16
  5. Data Processing Engines - Part 1 00:16:23
  6. Data Processing Engines - Part 2 00:10:59
  7. Data Processing Patterns 00:09:32
  8. Orchestration 00:14:34
  9. Putting It All Together 00:03:08
  10. Demo 00:21:47
  11. Q&A 00:24:35

Product Information

  • Title: Architectural Considerations for Hadoop Applications
  • Author(s): Mark Grover, Gwen Shapira, Jonathan Seidman, Ted Malaska
  • Release date: March 2015
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781491923313