Understanding ETL (Updated Edition)

by Matt Palmer
September 2025
Intermediate to advanced
106 pages
2h 32m
English
O'Reilly Media, Inc.

Chapter 5. Efficiency and Scalability

In this final chapter, we focus on the crucial aspects of optimizing and scaling the data pipelines we’ve developed. We start by defining what we mean by “efficiency” and “scalability” to set the boundaries for our discussion.

We begin with resource allocation, which depends on a thorough understanding of our operational environment. That understanding is what lets us optimize our processes effectively.

The chapter closes with a two-part discussion. First, we explore collaboration, particularly how to scale effectively in terms of team size and skill set. Second, we turn to creating an optimal developer experience, a key factor in efficient data pipeline management.

Throughout the chapter, we weave in ongoing themes such as tooling and platform considerations, the pros and cons of managed versus custom-built solutions, and architectural strategies for crafting superior ETL systems. These discussions aim to provide a comprehensive view of building and maintaining efficient, scalable data systems.

Efficiency and Scalability Defined

Efficiency is about optimizing workflows to deliver business value through data. It measures our ability to generate impactful outputs with the resources at our disposal, encompassing aspects of code, services, and teamwork. The ultimate measure of efficiency is the impact produced relative to the finite resources used.

Scalability refers to the capability of a system, network, or ...

ISBN: 0642572226961