Chapter 16. Analyzing Data Sources

In This Chapter

  • Digging into source data

  • Putting together an action plan for analyzing source data

  • Ensuring that you assign the right people to the job

  • Employing different techniques to analyze source data

  • Analyzing what's not there

  • Introducing mapping and transformation logic

Although the process of extracting, transforming, and moving data from its sources to the data warehouse is complicated, some people would have you believe that it's still a relatively straightforward mapping exercise that you do at the structural (database definition) level.

I would (and do) argue that the structural transformation is the least complicated part of the process of determining what you want to include in the data warehouse and then populating that warehouse. The most complicated part of the process involves digging through the source data (the files, databases, and various archive formats) and finding whatever quirks, oddities, omissions, and outright errors are waiting to bite you directly in the, well, you get the idea.

Source data analysis plays a key role in a data warehousing project because all the subsequent extraction and transformation processes depend on what data the data sources really contain.
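To give a feel for what "digging through the source data" can look like in practice, here is a minimal Python sketch that profiles a single column and reports how many values are missing, how many distinct values appear, and which values are most common. Everything in it is an assumption made for illustration: the profile_column function, the sample_rows records, and the customer_id and state field names are hypothetical and do not come from any particular tool or source system.

```python
from collections import Counter


def is_missing(value):
    """Treat None and blank strings as missing values."""
    return value is None or str(value).strip() == ""


def profile_column(rows, column):
    """Summarize one column of source data (hypothetical helper)."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if not is_missing(v)]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


# Hypothetical source records; note the inconsistent state codes.
sample_rows = [
    {"customer_id": "1001", "state": "PA"},
    {"customer_id": "1002", "state": "pa"},
    {"customer_id": "1003", "state": ""},
    {"customer_id": "1004", "state": "Penn."},
    {"customer_id": "1005", "state": "PA"},
]

report = profile_column(sample_rows, "state")
print(report)
```

Even on five made-up rows, a summary like this surfaces the kinds of problems a structural mapping never shows: one blank value and three different spellings of what is presumably the same state.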

A couple of years ago, I was working on a data warehouse lite project (see Chapter 3) that was being done in conjunction with an application migration project. One team (another consulting company) was working on the application migration, and my team was developing a ...
