Data Warehouse Designs

Book description

This book presents two data warehouse solutions that deliver significant ROI; market basket analysis, approached as a database design issue rather than a data model issue; and time variance or temporal data, which present past events in their historical context. The former is available to any enterprise with a data warehouse, while the latter provides a simple design that accommodates large data volumes. The text combines these two database designs into one design, which performs market basket analysis of transactions in their historical context.

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication Page
  6. Table of Contents
  7. Preface
  8. Acknowledgments
  9. The Author
  10. Chapter 1 Data Warehouse ROI
    1. A Data Warehouse Needs a Purpose
    2. A Data Warehouse Needs an ROI
    3. Gravitational Pull of a Data Warehouse
    4. Purpose and ROI
    5. Not Quite a Victim of Success
      1. Performance
      2. Relational Integrity
      3. Data Quality
    6. Purpose
  11. Chapter 2 What Is Market Basket Analysis?
    1. Analysis versus Reporting
    2. Elements of Market Basket Analysis
      1. Itemset
      2. Object
      3. Affinity
      4. Statistics in Market Basket Analysis
    3. Logic of Market Basket Analysis
      1. Enterprise Relationships
      2. Outside the Basket
    4. What Market Basket Analysis Is Not
      1. Affinity Is Not Causality
      2. Affinity Is Not Intent
    5. Market Basket Analysis as an Activity
    6. Back to the Basket
  12. Chapter 3 How Does Market Basket Analysis Produce ROI?
    1. Analytic Structure
    2. Analytic Skills
    3. Actionable Knowledge
      1. Complements
      2. Substitutes
      3. Independents
      4. KPIs
    4. ROI
  13. Chapter 4 Why Is Market Basket Analysis Difficult?
    1. Noise
    2. Large Data Volumes
      1. Time
      2. Groups
      3. Completeness and Data Sampling
      4. Data Sample Integrity
    3. Data Warehouse Data Structures
      1. Flexibility of the Itemset
      2. Lack of Control
      3. Recursive Nature of Market Basket Analysis
    4. On Your Mark...Get Set...Go!
  14. Chapter 5 Market Basket Analysis Solution Definition
    1. Market Basket Scope Statement
      1. Definition of the Itemset
      2. Definition of the Driver Object
      3. Definition of the Correlation Object
    2. Key Definitions
      1. Itemset Key Definition
      2. Object Key Definition
      3. Attribute Definitions
      4. Itemset Metric Definitions
    3. Market Basket Table
    4. Market Basket Query
      1. Itemset
      2. Driver Object
      3. Correlation Object
      4. Recursive Join
      5. Single Object Itemset
    5. Market Basket BI Table
    6. Market Basket Analysis BI View
      1. DRIVER_OBJECT_KEY
      2. CORR_OBJECT_KEY
      3. DRIVER_QUANTITY_SUM
      4. DRIVER_DOLLARS_SUM
      5. DRIVER_COUNT_SUM
      6. CORR_QUANTITY_SUM
      7. CORR_DOLLARS_SUM
      8. CORR_COUNT_SUM
      9. QUANTITY_RATIO
      10. DOLLARS_RATIO
      11. COUNT_RATIO
      12. Additional Group by Data Elements
    7. Beyond the Market Basket Analysis BI View
      1. Object Groups
      2. Market Basket Reporting in Batch
  15. Chapter 6 Market Basket Architecture and Database Design
    1. Market Basket Analysis Architecture
      1. Data Warehouse
      2. Market Basket Table
      3. Market Basket BI Table
      4. Analysts
      5. Market Basket Reporting Architecture
        1. Data Warehouse
        2. Market Basket Table
        3. Market Basket BI Table
        4. Market Basket KPI Exception Reporting
    2. Market Basket Data Definition Language (DDL)
      1. Construction of the Market Basket Table
        1. Teradata
        2. Oracle
        3. DB2
        4. SQL Server
        5. Product Join
      2. Construction of the Market Basket BI Table
    3. Datamart Architecture
  16. Chapter 7 ETL into a Market Basket Datamart
    1. Requirement: Populate the Market Basket BI Table
      1. Singularity
      2. Completeness
      3. Identity
      4. Metadata
      5. Data Quality
    2. Market Basket ETL Design
      1. Step 1: Extract from a Fact Table and Load to a Market Basket Table
      2. Step 2: Recursively Join the Market Basket Table and Load a Market Basket BI Table
      3. Step 3: Arithmetic Juxtaposition of Driver Objects and Correlation Objects
      4. Step 4: Load a Market Basket BI Table Using a Correlation Hierarchy
      5. Step 5: Load a Market Basket BI Table Using a Driver Hierarchy
      6. Step 6: Load a Market Basket BI Table Using the Same Hierarchy as Driver and Correlation
  17. Chapter 8 What Is Time Variance?
    1. Time
    2. Periodicity
    3. The More Things Change
    4. The More They Stay the Same
    5. Year, Quarter, Month, Week, Cycle, Day of Week, Time Zone, Date, and Time of Day
    6. Ralph Kimball’s Variations of Time Variance
      1. Type 1—All History Looks Like Now
      2. Type 2—All History in Its Own Context
      3. Type 3—Alternate History
    7. Time Variant Data
  18. Chapter 9 How Does Time Variance Produce ROI?
    1. Cause and Effect
    2. Cause and Effect Is Not Causal Analysis
    3. Exceptions to the Rule
    4. Rules to the Exception
  19. Chapter 10 Why Is Time Variance Difficult?
    1. Relational Set Logic
    2. Sets—The Bane of Time Variance
    3. Options
      1. Stored Procedures
      2. Temporal Databases
    4. Time Variant Solution Design
  20. Chapter 11 Time Variant Solution Definition
    1. Time Variant Problem Reprise
    2. The Goal of This Time Variant Solution Design
    3. One Row at a Time
    4. Not a Surrogate Key
    5. Instance Key
      1. Join to a Simple Instance Key
      2. Join to a Compound Instance Key
      3. Cascading Instance Keys
    6. Which Tables Use Instance Keys?
    7. Type 1 Time Variance
    8. Type 1 and Type 2 Combined
    9. Summary Tables
      1. ETL Cycles
      2. Instance Keys
    10. Real Time and Time Variance
    11. First Time Variant Subject Area
  21. Chapter 12 Time Variant Database Definition
    1. Tables of Types and Types of Tables
    2. Type 2 Time Variant Dimension Tables
    3. Type 1 and Type 2 Time Variant Dimension Tables in a Shared Environment
    4. Type 1 Time Variant Dimension Tables
      1. All the Same Columns
      2. Only the Row in Effect Right Now
      3. Entity Primary Keys
      4. Entity Foreign Keys
    5. Fact Tables
    6. Time Summary Tables
    7. Conclusion
  22. Chapter 13 ETL into a Time Variant Data Warehouse
    1. Changed Data Capture
      1. Entities
      2. Attributes
      3. ETL Cycle and Periodicity
      4. Time Variant Metadata
      5. Back to the Original Problem
    2. Instance Keys in Dimension ETL
      1. New Row
      2. Updated Row
      3. Discontinued Row
      4. Cascading Instance Keys
      5. Dimension Load
        1. Delete
        2. Update
        3. Insert
      6. Data Quality
      7. Metadata
    3. Fact ETL
      1. Instance Keys in Fact ETL
        1. Data Quality in Fact ETL
        2. Metadata in Fact ETL
      2. Instance Keys—The Manufacturing Example
        1. Type 1 Time Variance
        2. Type 2 Time Variance
    4. Summary ETL
      1. Type 1 Summary ETL
      2. Type 2 Summary ETL
      3. Metadata, Data Quality, and the Like
  23. Chapter 14 Market Basket Analysis in a Time Variant Data Warehouse
    1. High-Level Review
      1. Market Basket Analysis at a High Level
      2. Time Variance at a High Level
      3. Market Basket Analysis in a Time Variant Context at a High Level
    2. Elements of Time Variant Market Basket Analysis
      1. Itemset
      2. Object
    3. Forms of Time Variant Market Basket Analysis
      1. Sample Data
      2. Instance Driver Object and Instance Correlation Object
      3. Instance Driver Object and Entity Correlation Object
      4. Entity Driver Object and Instance Correlation Object
      5. Entity Driver Object and Entity Correlation Object
    4. Conclusion
  24. References
  25. Index

Product information

  • Title: Data Warehouse Designs
  • Author(s): Fon Silvers
  • Release date: December 2011
  • Publisher(s): Auerbach Publications
  • ISBN: 9781466516663