Chapter 6. Fixing Data Quality Issues at Scale
Picture this: it's Friday at 5 p.m., and you're about to log off for the day. You start closing your tabs, packing up your bag, and settling into your weekend state of mind. Just as you're about to turn off your laptop, you get an urgent Slack message from your CFO about a broken dashboard.
"The numbers are wrong in our quarterly results report," she Slacks you. "I didn't sign off on this!"
Assuming the issue is about the data itself and not rooted in your company's shoddy financials, you have a serious case of data downtime on your hands. You frantically open Looker to find she's right: the report looks way off and you have no idea why. You validated the numbers yesterday with her. Your charts and graphs were absolutely glowing with accuracy.
You pull up the source data (an Excel spreadsheet living on your desktop, "Financial Report V. 212 GOOD_I_PROMISE_YES_GOOD"), but that confuses you even more. Dozens of emails, two phone calls, a few Zoom meetings, and seven hours later, you determine the culprit behind the errant dashboard: an upstream schema change in a source table.
Great, you figured out what happened. Now what?
For most data teams, pausing the pipeline and identifying the root cause of the issue at hand are just the tip of the iceberg when it comes to restoring data reliability and trust in your data.
Fixing Quality Issues in Software Development
Fortunately, analysts and engineers don't need ...