Understanding Data Governance
Introduction
From the moment organizations begin collecting information, governance becomes crucial: what information should we gather, and how do we ensure that information is accurate and current? What are we allowed to do with the information we gather? How do we store it, and for how long? Who can see it? And finally, how is it updated?
With the digitization of virtually everything, governance grows in importance because we need more context around the data that drives decision-making. Context includes lineage and provenance: who created the data? Where did it come from? Has it been versioned? Is it accurate? Data quality is one example: using out-of-date information to make critical organizational decisions is very risky. And if several data sources hold similar data, how do you decide which one is the “golden source”?
Lack of governance increases risk to the organization. In fact, the financial crisis of 2008 is often attributed in part to a lack of regulation around data quality. As legislation tightens, stricter data reviews aim to reduce that risk. These reviews now extend across all industries, particularly around security and privacy.
Data governance is meant to cover all of the processes that ensure data assets are formally managed throughout an organization; in other words, the policies that define the accuracy, accessibility, consistency, relevance, completeness, and management of the organization’s data ...