Chapter 7. Integrating Data Observability in Your Data Stack

In the previous chapter, you learned about the three components of data observability and how to apply data observability in your day-to-day data work. In this chapter, we will get our hands dirty and put these concepts to work.

This chapter has two goals: to provide recipes that will help you integrate data observability into your pipelines, and to give you the technical material you need to make your frameworks and applications data observable. Then, as a good chef would do with any recipe, you will personalize it, extend it, and improve it. I will explain the steps to follow, the purpose of each step, and how it works. Spoiler alert: it will get pretty technical from time to time (almost nerdy), but trust me, it’s worth it.

To give this chapter a logical flow, I will follow the data engineering lifecycle introduced by Joe Reis and Matt Housley in Fundamentals of Data Engineering (Figure 7-1).

Figure 7-1. The Data Engineering Lifecycle (courtesy of Joe Reis and Matthew Housley)

Data observability resides in the undercurrents of data engineering, as part of DataOps. To generate the values discussed in Chapter 2, the data observability platform must be present in both the data architecture and the undercurrents layer. While I won’t discuss the generation stage much, I will cover ingestion, transformation ...
