The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data

by Ralph Kimball, Joe Caserta
October 2004
Beginner to intermediate
528 pages
13h 39m
English
Wiley

Chapter 9. Metadata

Metadata is an interesting topic because every tool space in the data warehouse arena, including business intelligence (BI) tools, ETL tools, databases, and dedicated repositories, claims to have a metadata solution, and many books are available to advise you on the best metadata strategies. Yet after years of implementing and reviewing data warehouses, we have not encountered a true end-to-end metadata solution. Most data warehouses instead have manually maintained pieces of metadata that exist separately across their components. Rather than adding to the metadata hoopla, this chapter covers only the portions of metadata that the ETL team needs to be aware of, either as a consumer or as a producer. We propose a set of metadata structures that you need to support the ETL team.

Note

PROCESS CHECK

Planning & Design: Requirements/Realities → Architecture → Implementation → Release to Ops

Data Flow: Extract → Clean → Conform → Deliver

Because the ETL system is the center of your data warehouse universe, it often assumes the responsibility of managing and storing much of the metadata for the data warehouse. One might think that there is no better place than the ETL system for storing and managing metadata because the environment must already know the specifics of all data to function properly. And the ETL process is the creator of the most important metadata in the data warehouse—the data lineage. The data lineage traces data from its exact location in the source system and ...
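To make the idea of data lineage concrete, here is a minimal Python sketch of how an ETL system might record lineage and trace a warehouse column back toward its source. It is an illustration only, not the structure the authors propose: the `LineageStep` class, its field names, the `trace_to_source` helper, and the sample table and column names are all hypothetical. It also assumes each target column has exactly one upstream column.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class LineageStep:
    """One hop in a column's journey. All field names are hypothetical."""
    source_system: str
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    transformation: str  # free-text description of the rule applied


def trace_to_source(steps: List[LineageStep], table: str, column: str) -> List[LineageStep]:
    """Walk backward from a warehouse column to its earliest recorded origin.

    Assumes each (target_table, target_column) has a single upstream step.
    """
    by_target: Dict[Tuple[str, str], LineageStep] = {
        (s.target_table, s.target_column): s for s in steps
    }
    path: List[LineageStep] = []
    seen: Set[Tuple[str, str]] = set()
    key = (table, column)
    # Follow target -> source links until no earlier step is recorded.
    while key in by_target and key not in seen:
        seen.add(key)  # guards against accidental cycles in the metadata
        step = by_target[key]
        path.append(step)
        key = (step.source_table, step.source_column)
    return path


# Hypothetical lineage: source system -> staging table -> dimension table.
steps = [
    LineageStep("orders_db", "orders", "cust_nm",
                "stg_orders", "customer_name", "trim whitespace"),
    LineageStep("staging", "stg_orders", "customer_name",
                "dim_customer", "customer_name", "conform to standard casing"),
]

path = trace_to_source(steps, "dim_customer", "customer_name")
for step in path:
    print(f"{step.target_table}.{step.target_column} <- "
          f"{step.source_system}:{step.source_table}.{step.source_column} "
          f"({step.transformation})")
```

Storing one record per hop, rather than one per end-to-end path, keeps each ETL job responsible only for the metadata it produces. The full trace is then assembled on demand.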



ISBN: 9780764567575