Chapter 2. A Renaissance in Data Transformation

In this chapter, we’ll expand on the transformation layer. We’ll briefly explain why transformation matters, discuss the role of ETL/ELT, and survey existing solutions. We’ll then present the benefits and challenges these solutions pose, framing each as code- or GUI-first. Taking the best of both worlds, we’ll present a solution that finds the “golden middle” for a flexible yet user-friendly experience. This golden middle of data transformation represents the second revolution in data processing and the first true automation of the transformation layer. Finally, we’ll provide direct examples of how you can use this framework to further analytics and engineering efforts on your team.

Why Data Transformations Matter

As the volume and variety of data grow, a robust transformation framework must concisely filter, aggregate, and present findings in a way that’s easy to understand. Data transformation is essential to extract (pun intended) value from all this information.

For this reason, it’s essential to implement a framework that provides consistent outputs with as little overhead as possible. Every member of the data team should be able to contribute, not just those with technical backgrounds. Furthermore, this solution must efficiently scale to handle both tremendous quantities of data and an ever-expanding domain (schemas, tables, views) within any number of data ...

