CHAPTER 7
Reference Architecture for Data Quality

INTRODUCTION

The previous three chapters covered the analyze phase of the DARS model: the causes of poor data quality, the impact of data quality across the data lifecycle, and data profiling. The next three chapters cover the realize phase of the DARS model. This chapter looks at the architectural and design patterns that enterprises can adopt to improve data quality. Specifically, it discusses the following topics:

  1. Four key frameworks to manage data quality in the DLC
  2. Mechanisms to deliver quality data for business results
  3. Architectural frameworks to access and manage data in today's distributed and heterogeneous IT landscape

Today, enterprises manage several types of data: structured data, such as relational tables in databases; semi-structured data, such as XML documents; and unstructured data, such as images, video, audio, and documents. Architecting for data quality is largely a matter of identifying reusable design patterns. This chapter looks at the important architectural frameworks, design patterns, and best practices for managing data quality.

OPTIONS TO REMEDIATE DATA QUALITY

Achieving and maintaining data quality must be addressed holistically, that is, both technically and functionally. While the technical aspects, which are closely associated with the metadata, are relatively easy to define, the business, functional, or semantic aspects ...
