Understanding ETL (Updated Edition)

by Matt Palmer
September 2025
Intermediate to advanced
106 pages
2h 32m
English
O'Reilly Media, Inc.

Chapter 2. Data Transformation

While data ingestion simply moves data from point A to point B, data transformation turns raw data into valuable insights at various stages of the data lifecycle. This chapter covers the languages, platforms, and technologies available to data practitioners for executing data transformations.

We’ll see how to ensure that data transformations are conducted efficiently and in a well-coordinated manner, laying the groundwork for more detailed discussions on efficiency, scalability, and observability later in the guide.

What Is Data Transformation?

Data transformation is the art of manipulating and enhancing data to better serve users and processes. Transformation involves taking some data, whether raw or nearly pristine, and performing one or more operations to move it closer to its intended use. In an ETL pipeline, transformation occurs in many places, not just one. Data might be transformed upon ingestion and again at any number of points downstream. The goal of data transformation is to turn data into an asset, using analysis and science to create something of value for the business.

Transformation might be as simple as removing unwanted records (filtering) or as complex as restructuring the source data entirely. Transformation exists on a spectrum; there's an almost infinite number of ways to transform data, and that's what keeps things interesting!
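To make the two ends of that spectrum concrete, here is a minimal sketch in plain Python. The records, field names, and function names (`raw_orders`, `filter_orders`, `restructure_by_customer`) are hypothetical and chosen only for illustration; they do not come from any particular tool or dataset.

```python
from collections import defaultdict

# Hypothetical raw records; the field names are illustrative assumptions.
raw_orders = [
    {"order_id": 1, "customer": "acme", "amount": 120.0, "status": "complete"},
    {"order_id": 2, "customer": "acme", "amount": 35.5, "status": "cancelled"},
    {"order_id": 3, "customer": "globex", "amount": 80.0, "status": "complete"},
    {"order_id": 4, "customer": "globex", "amount": None, "status": "complete"},
]


def filter_orders(orders):
    """Simple end of the spectrum: drop records we do not want."""
    return [
        o for o in orders
        if o["status"] == "complete" and o["amount"] is not None
    ]


def restructure_by_customer(orders):
    """More involved: reshape row-level records into one summary per customer."""
    summary = defaultdict(lambda: {"order_count": 0, "total_amount": 0.0})
    for o in orders:
        entry = summary[o["customer"]]
        entry["order_count"] += 1
        entry["total_amount"] += o["amount"]
    return dict(summary)


filtered = filter_orders(raw_orders)
by_customer = restructure_by_customer(filtered)
```

Here the filter only removes rows, while the restructuring step changes the grain of the data from one row per order to one entry per customer. In practice the same two steps could just as easily be written in SQL or a dataframe library.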

Similarly, transformation can be orchestrated in any language with any ...

Publisher Resources

ISBN: 0642572226961