Data Extraction Best Practices
A natural starting point for discussing ETL is extraction, the E in ETL. This chapter applies data extraction concepts using SSIS. Because ETL applies to a broad spectrum of applications beyond data warehousing and data integration, the discussion covers both generalized extraction concepts and data warehouse–specific concepts.
Data extraction is the process of moving data off a source system, either to a staging environment or directly into the transformation phase of the ETL process. Figure 5-1 shows the extraction process separated out on the left. As the figure highlights, an extraction process may pull data from a variety of sources, including files and database systems.
Following are a few common objectives of data extraction:
This chapter is structured into three sections related to the Problem-Design-Solution of ETL extraction processes using SSIS: