Chapter 16. Analyzing Data Sources

In This Chapter

  • Digging into source data

  • Putting together an action plan for analyzing source data

  • Ensuring that you assign the right people to the job

  • Employing different techniques to analyze source data

  • Analyzing what's not there

  • Introducing mapping and transformation logic

Although the process of extracting, transforming, and moving data from its sources to the data warehouse is complicated, some people would have you believe that it's still a relatively straightforward mapping exercise that you do at the structural (database definition) level.

I would (and do) argue that the structural transformation is the least complicated part of the process of determining what you want to include in the data warehouse and then populating that warehouse. The most complicated part of the process involves digging through the source data (the files, databases, and various archive formats) and finding whatever quirks, oddities, omissions, and outright errors are waiting to bite you directly in the, well, you get the idea.

Source data analysis plays a key role in a data warehousing project because all the subsequent extraction and transformation processes depend on what data the data sources really contain.
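To make the idea of digging into what the data sources really contain a bit more concrete, here is a minimal sketch of a column profile in Python. Everything in it is an assumption made for illustration: the `profile_column` function, the list of missing-value markers, and the sample rows are hypothetical and don't come from any particular tool or source system.

```python
from collections import Counter


def profile_column(rows, column, missing_markers=("", "N/A", "NULL")):
    """Summarize one column: row count, missing values, and distinct values."""
    values = [row.get(column) for row in rows]

    # Keep only values that are actually present; None and the assumed
    # placeholder strings are treated as missing.
    present = [
        str(v).strip()
        for v in values
        if v is not None and str(v).strip() not in missing_markers
    ]
    counts = Counter(present)

    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


# Hypothetical rows standing in for an extract from a source system.
sample = [
    {"customer_id": "1001", "state": "PA"},
    {"customer_id": "1002", "state": "pa"},
    {"customer_id": "1003", "state": ""},
    {"customer_id": "1004", "state": "Penn."},
    {"customer_id": "1005", "state": "PA"},
]

report = profile_column(sample, "state")
print(report)
```

Even on five rows, the profile turns up one missing value and three different spellings of what is probably the same state. A structural mapping of the `state` column would never reveal either problem.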

A couple of years ago, I was working on a data warehouse lite project (see Chapter 3) that was being done in conjunction with an application migration project. One team (another consulting company) was working on the application migration, and my team was developing a ...
