Book description
Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through the architectural considerations necessary to tie those components together into a complete, tailored application based on your particular use case.
Table of contents
- Foreword
- Preface
- I. Architectural Considerations for Hadoop Applications
- 1. Data Modeling in Hadoop
- 2. Data Movement
- 3. Processing Data in Hadoop
- 4. Common Hadoop Processing Patterns
- 5. Graph Processing on Hadoop
- 6. Orchestration
- Why We Need Workflow Orchestration
- The Limits of Scripting
- The Enterprise Job Scheduler and Hadoop
- Orchestration Frameworks in the Hadoop Ecosystem
- Oozie Terminology
- Oozie Overview
- Oozie Workflow
- Workflow Patterns
- Parameterizing Workflows
- Classpath Definition
- Scheduling Patterns
- Executing Workflows
- Conclusion
- 7. Near-Real-Time Processing with Hadoop
- II. Case Studies
- 8. Clickstream Analysis
- 9. Fraud Detection
- Continuous Improvement
- Taking Action
- Architectural Requirements of Fraud Detection Systems
- Introducing Our Use Case
- High-Level Design
- Client Architecture
- Profile Storage and Retrieval
- Ingest
- Near-Real-Time and Exploratory Analytics
- Near-Real-Time Processing
- Exploratory Analytics
- What About Other Architectures?
- Conclusion
- 10. Data Warehouse
- A. Joins in Impala
- Index
Product information
- Title: Hadoop Application Architectures
- Author(s):
- Release date: July 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491900086