Data Quality

Book description

Data Quality: The Accuracy Dimension explains how to assess the quality of corporate data and improve its accuracy using the data profiling method. Corporate data is increasingly important as companies continue to find new ways to use it. Likewise, improving the accuracy of data in information systems is fast becoming a major goal as companies realize how much it affects their bottom line. Data profiling is a new technology that supports and enhances the accuracy of databases in major IT shops. Jack Olson explains data profiling and shows how it fits into the larger picture of data quality.

  • Provides an accessible, enjoyable introduction to the subject of data accuracy, peppered with real-world anecdotes.
  • Provides a framework for data profiling with a discussion of analytical tools appropriate for assessing data accuracy.
  • Is written by one of the original developers of data profiling technology.
  • Is a must-read for any data management staff, IT management staff, and CIOs of companies with data assets.

Table of contents

  1. Front Cover
  2. Data Quality: The Accuracy Dimension
  3. Copyright Page
  4. Foreword
  5. Contents (1/2)
  6. Contents (2/2)
  7. Preface
  8. Part I: Understanding Data Accuracy
    1. Chapter 1. The Data Quality Problem
      1. 1.1 Data Is a Precious Resource
      2. 1.2 Impact of Continuous Evolution of Information Systems
      3. 1.3 Acceptance of Inaccurate Data
      4. 1.4 The Blame for Poor-Quality Data
      5. 1.5 Awareness Levels
      6. 1.6 Impact of Poor-Quality Data
      7. 1.7 Requirements for Making Improvements
      8. 1.8 Expected Value Returned for Quality Program
      9. 1.9 Data Quality Assurance Technology (1/2)
      10. 1.9 Data Quality Assurance Technology (2/2)
      11. 1.10 Closing Remarks
    2. Chapter 2. Definition of Accurate Data
      1. 2.1 Data Quality Definitions
      2. 2.2 Principle of Unintended Uses
      3. 2.3 Data Accuracy Defined
      4. 2.4 Distribution of Inaccurate Data
      5. 2.5 Can Total Accuracy Be Achieved?
      6. 2.6 Finding Inaccurate Values
      7. 2.7 How Important Is It to Get Close?
      8. 2.8 Closing Remarks
    3. Chapter 3. Sources of Inaccurate Data
      1. 3.1 Initial Data Entry (1/2)
      2. 3.1 Initial Data Entry (2/2)
      3. 3.2 Data Accuracy Decay
      4. 3.3 Moving and Restructuring Data (1/2)
      5. 3.3 Moving and Restructuring Data (2/2)
      6. 3.4 Using Data
      7. 3.5 Scope of Problems
      8. 3.6 Closing Remarks
  9. Part II: Implementing a Data Quality Assurance Program
    1. Chapter 4. Data Quality Assurance
      1. 4.1 Goals of a Data Quality Assurance Program
      2. 4.2 Structure of a Data Quality Assurance Program (1/2)
      3. 4.2 Structure of a Data Quality Assurance Program (2/2)
      4. 4.3 Closing Remarks
    2. Chapter 5. Data Quality Issues Management
      1. 5.1 Turning Facts into Issues
      2. 5.2 Assessing Impact
      3. 5.3 Investigating Causes (1/2)
      4. 5.3 Investigating Causes (2/2)
      5. 5.4 Developing Remedies
      6. 5.5 Implementing Remedies
      6. 5.6 Post-Implementation Monitoring
      8. 5.7 Closing Remarks
    3. Chapter 6. The Business Case for Accurate Data
      1. 6.1 The Value of Accurate Data
      2. 6.2 Costs Associated with Achieving Accurate Data
      3. 6.3 Building the Business Case
      4. 6.4 Closing Remarks
  10. Part III: Data Profiling Technology
    1. Chapter 7. Data Profiling Overview
      1. 7.1 Goals of Data Profiling
      2. 7.2 General Model (1/2)
      3. 7.2 General Model (2/2)
      4. 7.3 Data Profiling Methodology (1/2)
      5. 7.3 Data Profiling Methodology (2/2)
      6. 7.4 Analytical Methods Used in Data Profiling
      7. 7.5 When Should Data Profiling Be Done?
      8. 7.6 Closing Remarks
    2. Chapter 8. Column Property Analysis
      1. 8.1 Definitions
      2. 8.2 The Process for Profiling Columns
      3. 8.3 Profiling Properties for Columns (1/3)
      4. 8.3 Profiling Properties for Columns (2/3)
      5. 8.3 Profiling Properties for Columns (3/3)
      6. 8.4 Mapping with Other Columns
      7. 8.5 Value-Level Remedies
      8. 8.6 Closing Remarks
    3. Chapter 9. Structure Analysis
      1. 9.1 Definitions
      2. 9.2 Understanding the Structures Being Profiled
      3. 9.3 The Process for Structure Analysis
      4. 9.4 The Rules for Structure (1/4)
      5. 9.4 The Rules for Structure (2/4)
      6. 9.4 The Rules for Structure (3/4)
      7. 9.4 The Rules for Structure (4/4)
      8. 9.5 Mapping with Other Structures
      9. 9.6 Structure-Level Remedies
      10. 9.7 Closing Remarks
    4. Chapter 10. Simple Data Rule Analysis
      1. 10.1 Definitions
      2. 10.2 The Process for Analyzing Simple Data Rules
      3. 10.3 Profiling Rules for Single Business Objects
      4. 10.4 Mapping with Other Applications
      5. 10.5 Simple Data Rule Remedies
      6. 10.6 Closing Remarks
    5. Chapter 11. Complex Data Rule Analysis
      1. 11.1 Definitions
      2. 11.2 The Process for Profiling Complex Data Rules
      3. 11.3 Profiling Complex Data Rules
      4. 11.4 Mapping with Other Applications
      5. 11.5 Multiple-Object Data Rule Remedies
      6. 11.6 Closing Remarks
    6. Chapter 12. Value Rule Analysis
      1. 12.1 Definitions
      2. 12.2 Process for Value Rule Analysis
      3. 12.3 Types of Value Rules
      4. 12.4 Remedies for Value Rule Violations
      5. 12.5 Closing Remarks
    7. Chapter 13. Summary
      1. 13.1 Data Quality Is a Major Issue for Corporations
      2. 13.2 Moving to a Position of High Data Quality Requires an Explicit Effort
      3. 13.3 Data Accuracy Is the Cornerstone for Data Quality Assurance
  11. Appendix A. Examples of Column Properties, Data Structure, Data Rules, and Value Rules
    1. A.1 Business Objects
    2. A.2 Tables
    3. A.3 Column Properties
    4. A.4 Structure Rules
    5. A.5 Simple Data Rules
    6. A.6 Complex Data Rules
    7. A.7 Value Rules
  12. Appendix B. Content of a Data Profiling Repository
    1. B.1 Schema Definition
    2. B.2 Business Objects
    3. B.3 Domains
    4. B.4 Data Source
    5. B.5 Table Definitions
    6. B.6 Synonyms
    7. B.7 Data Rules
    8. B.8 Value Rules
    9. B.9 Issues
  13. References
    1. Books on Data Quality Issues
    2. Books on Data Quality Technologies
    3. Articles
  14. Index (1/3)
  15. Index (2/3)
  16. Index (3/3)
  17. About the Author

Product information

  • Title: Data Quality
  • Author(s): Jack E. Olson
  • Release date: January 2003
  • Publisher(s): Morgan Kaufmann
  • ISBN: 9780080503691