Chapter 8. Operating Your Solution at Scale
In this final chapter, we’ll discuss the challenges you’ll likely face as you put an automated data quality monitoring solution into operation and maintain it for the long term. We’ll focus on clearly defining the problems, allowing your use case and needs to point you toward the approach that’s right for your team.
Operating a technology solution like automated data quality monitoring follows a general pattern. First, there’s the process of acquiring the platform, either by building it yourself or purchasing it from a third party. Then, there’s the initial configuration and enablement so that your team can use all the features of the platform successfully. Once everything is up and running, the “final” stage is the ongoing use of the platform to facilitate your organization’s goals—in this case, improving and maintaining data quality in the long term. We’ve structured this chapter to follow these phases in order. Now, let’s dive in and see how you can reach a steady state of data quality excellence.
Build Versus Buy
Once an organization thoroughly understands the problem it faces and has researched the available solutions, it has a decision to make: build or buy?
Building rarely means building from scratch. A common strategy for small teams is to build their own platform around open source packages for rule-based data quality evaluation, such as Great Expectations or Deequ. ...
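To make "rule-based data quality evaluation" concrete, here is a minimal sketch in plain Python of the kind of check such packages automate. It deliberately does not use the Great Expectations or Deequ APIs; the `Rule`, `RuleResult`, and `evaluate` names, along with the sample rows and column names, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable

# A row is modeled as a plain dict; real platforms would read from a table or DataFrame.
Row = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A named, row-level condition that every row is expected to satisfy (hypothetical)."""
    name: str
    predicate: Callable[[Row], bool]


@dataclass(frozen=True)
class RuleResult:
    """Outcome of evaluating one rule against a batch of rows (hypothetical)."""
    name: str
    failed_rows: int
    total_rows: int

    @property
    def passed(self) -> bool:
        return self.failed_rows == 0


def evaluate(rows: list[Row], rules: list[Rule]) -> list[RuleResult]:
    """Count, for each rule, how many rows violate it."""
    results = []
    for rule in rules:
        failed = sum(1 for row in rows if not rule.predicate(row))
        results.append(RuleResult(rule.name, failed, len(rows)))
    return results


# Illustrative data: one negative amount and one missing order_id.
rows = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -3.0},
    {"order_id": None, "amount": 10.0},
]

rules = [
    Rule("order_id is not null", lambda r: r["order_id"] is not None),
    Rule("amount is non-negative",
         lambda r: r["amount"] is not None and r["amount"] >= 0),
]

results = evaluate(rows, rules)
for result in results:
    status = "PASS" if result.passed else "FAIL"
    print(f"{status}: {result.name} "
          f"({result.failed_rows}/{result.total_rows} rows failed)")
```

A platform built around an open source package would typically add what this sketch leaves out, such as scheduling, storing results over time, and notifying the right people when a rule fails.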