13.8. EXERCISES

  1. Match the columns:

    1. domain integrity

    2. data aging

    3. entity integrity

    4. data consumer

    5. poor quality data

    6. data consistency expert

    7. error discovery

    8. data pollution source

    9. dummy values

    10. data quality benefit

    A. detect inconsistencies

    B. better customer service

    C. synchronize all data

    D. allowable values

    E. used to pass edits

    F. uses warehouse data

    G. heterogeneous systems integration

    H. lost business opportunities

    I. prevents duplicate key values

    J. decay of field values

  2. Assume that you are the data quality expert on the data warehouse project team for a large financial institution with many legacy systems dating back to the 1970s. Review the types of data quality problems you are likely to encounter and suggest how to deal with them.

  3. Discuss the common sources of data pollution and provide examples.

  4. You are responsible for the selection of data cleansing tools for your data warehouse environment. How will you define the criteria for selection? Prepare a checklist for evaluation and selection of these tools.

  5. A large bank with statewide branches has hired you, a data warehouse consultant, to help the company set up a data quality initiative. List your major considerations. Produce an outline for a document describing the initiative, the policies, and the procedures.
