Data Quality Fundamentals

by Barr Moses, Lior Gavish, Molly Vorwerck
September 2022
Beginner to intermediate
308 pages
8h 43m
English
O'Reilly Media, Inc.

Chapter 2. Assembling the Building Blocks of a Reliable Data System

While solving data quality issues in production is a critical skill set for any data practitioner, data downtime can often be prevented almost entirely with the right systems and processes in place.

Like software, data is subject to any number of operational, programmatic, or even data-related influences at various stages in the pipeline, and all it takes is one schema change or code push to send a downstream report into disarray.

As we’ll discuss in Chapter 8, solving for data quality and building more reliable pipelines breaks down into three key components: process, technologies, and people. In this chapter, we’ll tackle the technology component of this equation, mapping out the disparate pieces of the data pipeline and what it takes to measure, fix, and prevent data downtime at each step.

Data systems are ridiculously complex, with various stages in the data pipeline contributing to this chaos. And as companies increasingly invest in data and analytics, the need to build at scale puts serious pressure on data engineers to account for quality before data even enters the pipeline.

In this chapter, we’ll highlight the various metadata-powered building blocks, from data catalogs to data warehouses and lakes, that set your data infrastructure up to deliver high-quality data at each stage of the pipeline.

Understanding the Difference Between Operational and ...

ISBN: 9781098112035