Chapter 3. DataOps as a Discipline

DataOps, like DevOps, emerges from the recognition that separating the product (production-ready data) from the process that delivers it (operations) impedes quality, timeliness, transparency, and agility. The need for DataOps arises because data consumption has changed dramatically over the past decade. Just as internet applications raised user expectations for the usability, availability, and responsiveness of applications, services such as Google Knowledge Panel and Wikipedia have dramatically raised user expectations for the usability, availability, and freshness of data.

What’s more, with increased access to highly usable self-service data preparation and visualization tools, many users within the enterprise are now ready and able to prepare data for their own use if official channels are unable to meet their expectations. In combination, these changes have created an environment in which continuing with the cost-laden, delay-plagued, opaque operations used to deliver data in the past is no longer acceptable. Taking a cue from DevOps, DataOps looks to combine the production and delivery of data into a single, Agile practice that directly supports specific business functions. The ultimate goal is to cost-effectively deliver timely, high-quality data that meets the ever-changing needs of the organization.

In this chapter, we review the history of DataOps, the problems it is designed to address, the tools and processes ...
