Fundamentals of Data Engineering

by Joe Reis, Matt Housley
June 2022
450 pages
English
O'Reilly Media, Inc.

Chapter 5. Data Generation in Source Systems

Welcome to the first stage of the data engineering lifecycle: data generation in source systems. As we described earlier, the job of a data engineer is to take data from source systems, do something with it, and make it helpful in serving downstream use cases. But before you get raw data, you must understand where the data exists, how it is generated, and its characteristics and quirks.

This chapter covers some popular operational source system patterns and the significant types of source systems. Many source systems exist for data generation, and we do not attempt to cover them all. We'll look at the data these systems generate and what you should keep in mind when working with them. We also discuss how the undercurrents of data engineering apply to this first phase of the data engineering lifecycle (Figure 5-1).

Figure 5-1. Source systems generate the data for the rest of the data engineering lifecycle

As data proliferates, especially with the rise of data sharing (discussed next), we expect that a data engineer’s role will shift heavily toward understanding the interplay between data sources and destinations. The basic plumbing tasks of data engineering—moving data from A to B—will simplify dramatically. On the other hand, it will remain critical to understand the nature of data as it’s created in source systems.



ISBN: 9781098108298