Understanding ETL (Updated Edition)

Name: Understanding ETL (Updated Edition)
Author: Matt Palmer
ISBN: 9798341665118

by Matt Palmer

September 2025

Intermediate to advanced

106 pages

2h 32m

English

O'Reilly Media, Inc.

Introduction
  The Brave New World of AI
  A Changing Data Landscape
  What About ELT (and Other Flavors)?
  O’Reilly Online Learning
  How to Contact Us
  Acknowledgments
1. Data Ingestion
  Data Ingestion—Now Versus Then
  Sources and Targets
  The Source
  The Destination
  Ingestion Considerations
  Frequency
  Payload
  Choosing a Solution
  Declarative Solutions
  Imperative Solutions
  Hybrid Solutions
2. Data Transformation
  What Is Data Transformation?
  Where Are We Now?
  How Do We Transform Data?
  Building a Transformation Solution
  Data Transformation Patterns
  Data Update Patterns
  Best Practices
  Real-Time Data Transformation
  The Future of Data Transformation
3. Data Orchestration
  What Is Data Orchestration?
  Why Orchestrate?
  The DAG
  Data Orchestration Tools
  Choosing an Orchestrator
  Orchestrating SQL
  Design Patterns and Best Practices
  The Future of Data Orchestration
4. Pipeline Issues and Troubleshooting
  Maintainability
  Monitoring and Benchmarking
  Metrics
  Methods
  Errors
  Error Handling
  Recovery
  Improving Workflows
  Start with Relationships
  Align Incentives
  Improve Outcomes
5. Efficiency and Scalability
  Efficiency and Scalability Defined
  Understand Your Environment
  Frameworks
  Resource Allocation
  Data Processing Techniques
  Process Efficiency
  Data (Engineering) Democratization
  Developer Experience
  Collaboration
  Conclusion
Conclusion
About the Author

Content preview from Understanding ETL (Updated Edition)

Chapter 3. Data Orchestration

Though we’ve already discussed ingestion (E, L) and transformation (T), we’ve only scratched the surface of ETL. Data pipelines are more than a series of discrete steps: overarching mechanisms operate on a meta level, aptly dubbed “undercurrents” by Matt Housley and Joe Reis in Fundamentals of Data Engineering:

Security
Data management
Data operations (DataOps)
Data architecture
Data orchestration
Software engineering

In this chapter, we’ll explore dependency management and pipeline orchestration, touching on the history of orchestrators, which is important for understanding why certain methods of orchestration are popular today. We’ll present a menu of options for you to orchestrate your own data workflows and discuss some common design patterns in orchestration.

Throughout, we’ll discuss how an “orchestrator” has historically been separate from a “transformation” tool. We’ll touch on why this has been true and why it might not be true in the future, though we still believe a separate orchestrator is the preferred approach.

What Is Data Orchestration?

Every workflow, data or not, requires sequential steps: attempting to use a French press without heating water will only brew disappointment, whereas poorly sequenced data transformations might brew a storm far more bitter than a caffeine-deprived morning (though the woes of the decaffeinated are not to be trivialized). In data, these “steps” are often ...
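As a rough illustration of why sequencing matters, the French press analogy can be sketched as a small dependency graph. This is not taken from the book: the step names and the `run_order` helper are hypothetical, and the sketch only assumes Python 3.9+ with the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical steps for illustration only.
# Each key maps to the set of steps that must finish before it can start.
dependencies = {
    "heat_water": set(),
    "grind_beans": set(),
    "brew": {"heat_water", "grind_beans"},
    "pour": {"brew"},
}


def run_order(deps):
    """Return one valid execution order that respects every dependency."""
    # static_order() raises graphlib.CycleError if the steps depend on
    # each other in a loop, so an impossible sequence fails early.
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

The exact order of independent steps (here, heating water and grinding beans) may vary, but "brew" always comes after both and "pour" always comes last. An orchestrator applies the same idea to pipeline tasks, usually with scheduling, retries, and monitoring layered on top.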


Publisher Resources

ISBN: 0642572226961
