Chapter 16. Analyzing Data Sources
In This Chapter
Digging into source data
Putting together an action plan for analyzing source data
Ensuring that you assign the right people to the job
Employing different techniques to analyze source data
Analyzing what's not there
Introducing mapping and transformation logic
Although the process of extracting, transforming, and moving data from its sources to the data warehouse is complicated, some people would have you believe that it's still a relatively straightforward mapping exercise that you do at the structural (database definition) level.
I would (and do) argue that the structural transformation is the least complicated part of the process of determining what you want to include in the data warehouse and then populating that warehouse. The most complicated part of the process involves digging through the source data (the files, databases, and various archive formats) and finding whatever quirks, oddities, omissions, and outright errors are waiting to bite you directly in the ... you get the idea.
Source data analysis plays a key role in a data warehousing project because every subsequent extraction and transformation process depends on what the data sources actually contain, not on what their definitions say they contain.
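To make the idea of "digging through the source data" concrete, here's a minimal sketch of a column-by-column profile in Python. Everything in it is an assumption made for illustration: the sample extract, the column names (customer_id, state, signup_date), and the profile helper are all hypothetical, and a real source analysis would look at far more than blanks and repeated values.

```python
import csv
import io
from collections import Counter

# Hypothetical extract from a source system. The column names and values
# are invented for illustration only.
SAMPLE = """customer_id,state,signup_date
1001,PA,1996-03-14
1002,,1996-04-02
1003,pa,0000-00-00
1003,NJ,1996-05-20
"""


def profile(rows, column):
    """Summarize one column: row count, blanks, distinct values, repeats."""
    values = [row[column] for row in rows]
    blanks = sum(1 for v in values if v.strip() == "")
    counts = Counter(v for v in values if v.strip() != "")
    return {
        "rows": len(values),
        "blanks": blanks,
        "distinct": len(counts),
        # Values that appear more than once (a problem if the column
        # is supposed to be a unique key).
        "duplicates": sorted(v for v, n in counts.items() if n > 1),
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for column in rows[0].keys():
    print(column, profile(rows, column))
```

Even on four made-up rows, the profile surfaces the kinds of surprises a structural mapping never shows: a repeated customer_id, a blank state, and "PA" and "pa" counted as different values. A placeholder date such as 0000-00-00 still slips past this sketch, which is why a simple profile is a starting point rather than a complete analysis.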
A couple of years ago, I was working on a data warehouse lite project (see Chapter 3) that was being done in conjunction with an application migration project. One team (another consulting company) was working on the application migration, and my team was developing a ...