Book description
SQL Server 2012 Data Integration Recipes provides focused and practical solutions to real world problems of data integration. Need to import data into SQL Server from an outside source? Need to export data and send it to another system? SQL Server 2012 Data Integration Recipes has your back. You'll find solutions for importing from Microsoft Office data stores such as Excel and Access, from text files such as CSV files, from XML, from other database brands such as Oracle and MySQL, and even from other SQL Server databases. You'll learn techniques for managing metadata, transforming data to meet the needs of the target system, handling exceptions and errors, and much more.
What DBA or developer isn't faced with the need to move data back and forth? Author Adam Aspin brings 10 years of extensive ETL experience involving SQL Server, and especially satellite products such as Data Transformation Services and SQL Server Integration Services. Extensive coverage is given to Integration Services, Microsoft's flagship tool for data integration in SQL Server environments. Coverage is also given to the broader range of tools such as OPENDATASOURCE, linked servers, OPENROWSET, Migration Assistant for Access, BCP Import, and BULK INSERT just to name a few. If you're looking for a resource to cover data integration and ETL across the gamut of Microsoft's SQL Server toolset, SQL Server 2012 Data Integration Recipes is the one book that will meet your needs.
Provides practical and proven solutions towards creating resilient ETL environments
Clearly answers the tough questions which professionals ask
Goes beyond the tools to a thorough discussion of the underlying techniques
Covers the gamut of data integration, beyond just SSIS
Includes example databases and files to allow readers to test the recipes
What you'll learn
Import and export to and from CSV files, XML files, and other text-based sources.
Move data between SQL databases, including SQL Server and others such as Oracle Database and MySQL.
Discover and manage metadata held in various database systems.
Remove duplicates and consolidate from multiple sources.
Transform data to meet the needs of target systems.
Profile source data as part of the discovery process.
Log and manage errors and exceptions during an ETL process.
Improve efficiency by detecting and processing only changed data.
Who this book is for
SQL Server 2012 Data Integration Recipes is written for developers wishing to find fast and reliable solutions for importing and exporting to and from SQL Server. The book appeals to DBAs as well, who are often tasked with implementing ETL processes. Developers and DBAs moving to SQL Server from other platforms will find the succinct, example-based approach ideal for quickly applying their general ETL knowledge to the specific tools provided as part of a SQL Server environment.
Table of contents
- Title Page
- Dedication
- Contents at a Glance
- Contents
- About the Author
- About the Technical Reviewers
- Acknowledgments
- Introduction
-
Chapter 1: Sourcing Data from MS Office Applications
- 1-1. Ensuring Connectivity to Access and Excel
- 1-2. Importing Data from Excel
- 1-3. Modifying Excel Data During a Load
- 1-4. Specifying the Excel Data to Load During an Ad-Hoc Import
- 1-5. Planning for Future Use of a Linked Server
- 1-6. Reading Data Automatically from an Excel Worksheet
- 1-7. Loading Excel Data as Part of a Structured ETL Process
- 1-8. Importing Excel 2007/2010 Data Using SSIS 2005
- 1-9. Handling Source Data Issues When Importing Excel Worksheets Using SSIS
- 1-10. Pushing Access Data into SQL Server
- 1-11. Importing Multiple Access Tables
- 1-12. Ad Hoc Imports of Access Data
- 1-13. Obtaining Access Data Without Regular Imports
- 1-14. Importing Access Data as Part of a Regular ETL Process
- 1-15. Convert a Complex Access Database to SQL Server
- 1-16. Resolving Complex Data Migration Problems During an Access to SQL Server Upgrade
- Summary
-
Chapter 2: Flat File Data Sources
- 2-1. Importing Data From a Text File
- 2-2. Importing a Delimited Text File
- 2-3. Automatically Determining Data Types
- 2-4. Importing Fixed-Width Text Files
- 2-5. Importing Text Files Using T-SQL
- 2-6. Mapping a Source File
- 2-7. Importing Data Using T-SQL in Anticipation of Using a Linked Server
- 2-8. Accessing a Text File As a Linked Server Table
- 2-9. Importing Flat Files from the Command Line
- 2-10. Importing Large Text Files Using T-SQL and Putting the Emphasis on Speed
- 2-11. Creating a Format File for Complex Flat File Loads Using Bulk Insert or BCP
- 2-12. Performing a BULK INSERT with a Format File
- 2-13. Loading a Text File Fast Using T-SQL
- 2-14. Executing BULK INSERT from SSIS
- 2-15. Handling Complex Flat File Formats with a Row Prefix in SSIS
- 2-16. Pre-Parsing and Staging File Subsets in SSIS
- 2-17. Handling Irregular Numbers of Columns in the Source File Using SQL Server 2012
- 2-18. Handling Embedded Qualifiers in SQL Server 2012
- 2-19. Handling Irregular Numbers of Columns in the Source File in SQL Server 2005 and 2008
- 2-20. Determining the Number of Columns in a Source File
- 2-21. Preparing CSV Files for Import
- Summary
-
Chapter 3: XML Data Sources
- 3-1. Loading XML Files for Storage in SQL Server
- 3-2. Loading XML Data into Rows and Columns
- 3-3. Shredding an XML File into an SQL Server Table
- 3-4. Importing XML Data as Part of an ETL Process
- 3-6. Importing Multiple XML Files into SQL Server
- 3-7. Flattening an XML File to Prepare It for Import
- 3-8. Importing XML Data from Very Large Files, Putting a Priority on Speed
- 3-9. Loading Multiple Tables at Once from a Single XML Source File
- 3-10. Loading and Shredding Relational Tables from an XML Source File
- 3-11. Overcoming Bulk Loading Challenges for XML Files
- 3-12. Running SQLXML Bulk Loader As Part of a Regular ETL Process
- 3-13. Validating an XML Document Against a Schema File As Part of a Scripted Solution
- 3-14. Validating an XML Document Against a Schema File in SSIS
- Summary
-
Chapter 4: SQL Databases
- Preamble: Installing and Configuring OLEDB and ODBC Providers
- 4-1. Configuring Your Server to Connect to Oracle
- 4-2. Importing Data from Oracle As a Regular Process
- 4-3. Accelerating Oracle Data Import
- 4-4. Importing Oracle Data on an “Ad Hoc” Basis
- 4-5. Migrating Multiple Oracle Tables and Views
- 4-6. Loading DB2 Data on a Regular Basis
- 4-7. Importing DB2 Data Without SSIS
- 4-8. Sourcing Data from MySQL on a Regular Basis
- 4-9. Importing MySQL Data on an “Ad Hoc” Basis
- 4-10. Importing MySQL Data Using SSIS 2005 and 2008
- 4-11. Migrating Complete Tables from MySQL
- 4-12. Loading Data from Sybase Adaptive Server Enterprise (ASE)
- 4-14. Importing Sybase ASE Data on an “Ad Hoc” Basis
- 4-15. Importing Sybase ASE Data on a Regular Basis
- 4-16. Loading Teradata Data
- 4-17. Sourcing Data from PostgreSQL
- Summary
-
Chapter 5: SQL Server Sources
- 5-1. Loading Ad Hoc Data from Other SQL Server Instances
- 5-2. Reading Data from Another SQL Server Instance over a Permanent Connection
- 5-3. Loading Large Data Sets Using T-SQL
- 5-4. Load Data Exported from SQL Server from the Command Line
- 5-5. Loading SQL Server Data from a Native SQL Server File
- 5-6. Transferring Data Between SQL Server Databases Regularly
- 5-7. Porting a Tiny Amount of Data Between SQL Server Databases
- 5-8. Copying and Pasting Between Tables
- 5-9. Loading Data as Fast as Possible in an ETL Process
- 5-10. Importing and Exporting All the Tables in a Database
- 5-11. Loading Data from Different Versions of SQL Server
- 5-12. Copying Entire Databases
- 5-13. Transferring a Complex Subset of Data Between Databases
- 5-14. Loading Data into SQL Server Azure Interactively
- 5-15. Loading Data into SQL Server Azure as Part of a Regular ETL Process
- 5-16. Loading Data into SQL Server Azure from the Command Line
- 5-17. Loading Ad Hoc Data into SQL Server Azure
- 5-18. Transferring Tables Between Databases
- Summary
-
Chapter 6: Miscellaneous Data Sources
- 6-1. Importing Data from SQL Server Analysis Services
- 6-2. Importing Data from an Analysis Services Cube on a Regular Basis
- 6-3. Querying an OLAP Source on an Ad Hoc Basis
- 6-4. Loading Images and Documents into an SQL Server Table
- 6-5. Importing Multiple Files into an SQL Server Table
- 6-6. Importing Files into SQL Server on a Regular Basis
- 6-7. Importing Files with Their Attributes into SQL Server
- 6-8. Loading Visual FoxPro Files
- 6-9. Importing Data from dBase
- 6-10. Loading Data from Web Services
- 6-11. Importing Windows Management Instrumentation Data
- 6-12. Importing Data over ODBC
- 6-13. Linking to 32-bit data sources from a 64-bit SQL Server
- Summary
-
Chapter 7: Exporting Data from SQL Server
- 7-1. Exporting Data Occasionally in Various Formats
- 7-2. Exporting Data As a Delimited Text File
- 7-3. Exporting Data to a Fixed-Width Text File
- 7-4. Exporting Text Files from the Command Line
- 7-5. Exporting Data Using BCP Without Recourse to the Command Line
- 7-6. Exporting Data As Text Files from T-SQL
- 7-7. Exporting Data to Another SQL Server Database
- 7-8. Exporting Text Files Regularly from T-SQL
- 7-9. Exporting and Compressing Multiple Tables
- 7-10. Exporting a Tiny Dataset As XML
- 7-11. Exporting a Larger Dataset As XML
- 7-12. Exporting Large Datasets As XML
- 7-13. Creating an XML Schema
- 7-14. Shaping XML Export Data
- 7-15. Exporting XML Data on a Regular Basis
- 7-16. Routinely Exporting Small XML Datasets
- 7-17. Exporting Data to Excel Using T-SQL
- 7-18. Exporting Data to Access Using T-SQL
- 7-19. Exporting Data Securely to Excel from T-SQL
- 7-20. Exporting Data Securely to Access from T-SQL
- 7-21. Exporting Data to Excel Using SSIS
- 7-22. Exporting Data to Access Using SSIS
- 7-23. Pulling Data from Office Applications
- 7-24. Exporting Files Stored in SQL Server Using T-SQL
- 7-25. Exporting Files Stored in SQL Server on a Regular Basis
- 7-26. Exporting Data from SSAS Using T-SQL on an Occasional Basis
- 7-27. Exporting Data from SSAS Using T-SQL on a More Regular Basis
- 7-28. Exporting an SSAS Dimension Using SSIS
- 7-29. Exporting the Result of an MDX Query in SSIS
- 7-30. Exporting Data to Other Relational Databases Using T-SQL
- 7-31. Exporting Data to Other Relational Databases Using SSIS
- 7-32. Exporting Data from SQL Server Azure
- Summary
-
Chapter 8: Metadata
- 8-1. Listing Available Tables from a Linked Server
- 8-2. Listing the Columns Available When Using a Linked Server
- 8-3. Discovering Flat File Metadata
- 8-4. Returning Simple SQL Server Metadata
- 8-5. Gathering Tailored SQL Server Metadata
- 8-6. Analyzing SQL Server Table Metadata
- 8-7. Analyzing SQL Server Column Metadata
- 8-8. Displaying Microsoft Access Metadata
- 8-9. Reading MySQL Metadata
- 8-10. Displaying Oracle Metadata from SSIS
- 8-11. Querying Oracle Metadata Using T-SQL
- 8-12. Understanding the Oracle Data Dictionary
- 8-13. Displaying DB2 Metadata
- 8-14. Obtaining SQL Server Metadata Using .NET
- Summary
-
Chapter 9: Data Transformation
- 9-1. Converting Data Types
- 9-2. Removing Duplicates from Data
- 9-3. Deduplicating Large Recordsets
- 9-4. Deduplicating Data in an ETL Data Flow
- 9-5. Subsetting Column Data Using T-SQL
- 9-6. Generating Fixed-Length Column Data Subsets Using SSIS
- 9-7. Generating Variable-Length Column Data Subsets
- 9-8. Concatenating Source Data Using T-SQL
- 9-9. Concatenating Data Using SSIS
- 9-10. Duplicating Columns
- 9-11. Converting Strings to Uppercase or Lowercase
- 9-12. Converting Strings to Title Case
- 9-13. PIVOTing Data in T-SQL
- 9-14. Pivoting Data in SSIS with SQL Server 2012
- 9-15. Pivoting Data in SSIS with SQL Server 2005 and 2008
- 9-16. Consolidating Multiple Identical Data Sources in T-SQL
- 9-17. Consolidating Multiple Identical Data Sources in SSIS
- 9-18. Normalizing Data Inside a Single Table Using T-SQL
- 9-19. Normalizing Data Inside a Single Table Using SSIS
- 9-20. Normalizing Data into Multiple Relational Tables Using T-SQL
- 9-21. Normalizing Data into Multiple Relational Tables Using SSIS
- 9-22. Denormalizing Data by Referencing Lookup Tables in T-SQL
- 9-23. Denormalizing Data by Referencing Lookup Tables in SSIS
- 9-24. Processing Type 1 Slowly Changing Dimensions (SCDs) Using T-SQL
- 9-25. Handling Type 2 Slowly Changing Dimensions in T-SQL
- 9-26. Handling Type 2 Slowly Changing Dimensions with SSIS
- 9-27. Handling Type 3 Slowly Changing Dimensions Using T-SQL
- 9-28. Handling Type 4 Slowly Changing Dimensions Using T-SQL
- 9-29. Handling Type 4 Slowly Changing Dimensions with SSIS
- 9-30. Cleansing Data As Part of an ETL Process
- Summary
-
Chapter 10: Data Profiling
- 10-1. Profiling Data Attributes
- 10-2. Profiling Domain and Value Distribution
- 10-3. Profiling External Data
- 10-4. Profiling External Data Faster
- 10-5. Running and Logging a Complete DataProfile
- 10-6. Profiling Data Quickly in SSIS
- 10-7. Creating Custom Data Profiles with SSIS
- 10-8. Using the SSIS Data Profiling Task on non-SQL Server Data Sources
- 10-9. Reading Profile Data
- 10.10. Storing SSIS Profile Data in a Database
- 10-11. Tailoring Specific Source Data Profiles in SSIS
- 10-12. Domain Analysis in SSIS
- 10-13. Performing Multiple Domain Analyses
- 10-14. Pattern Profiling in a Data Flow
- 10-15. Pattern Profiling Using T-SQL
- 10-16. Profiling Data Types
- 10-17. Controlling Data Flow via Profile Metadata
- Summary
-
Chapter 11: Delta Data Management
- Preamble: Why Bother with Delta Data?
- Delta Data Approaches
- 11-1. Loading Delta Data as Part of a Structured ETL Process
- 11-2. Loading Data Changes Using a Linked Server
- 11-3. Loading Data Changes From a Small Source Table as Part of a Structured ETL Process
- 11-4. Detecting and Loading Delta Data Only
- 11-5. Performing Delta Data Upserts with Other SQL Databases
- 11-6. Handling Data Changes Without Writing to the Source Server
- 11-7. Detecting Data Changes with Limited Source Database Access
- 11-8. Detecting and Loading Delta Data Using T-SQL and a Linked Server When MERGE Is Not Practical
- 11-9. Detecting, Logging, and Loading Delta Data
- 11-10. Detecting Differences in Rowcounts, Metadata, and Column Data
- Summary
-
Chapter 12: Change Tracking and Change Data Capture
- 12-1. Detecting Source Table Changes with Little Overhead and No Custom Framework
- 12-2. Pulling Changes into a Destination Table with Change Tracking
- 12-3. Using Change Tracking as Part of a Structured ETL Process
- 12-4. Detecting Changes to Source Data Using the SQL Server Transaction Log
- 12-5. Applying Change Data Capture with SSIS
- 12-6. Using Change Data Capture with Oracle Source Data
- Summary
-
Chapter 13: Organising And Optimizing Data Loads
- 13-1. Loading Multiple Files
- 13-2. Selecting Multiple Text Files to Import
- 13-3. Loading Multiple Files Using Complex Selection Criteria
- 13-4. Ordering and Filtering File Loads
- 13-5. Loading Multiple Flat Files in Parallel
- 13-6. Loading Source Files with Load Balancing
- 13-7. Loading Data to Parallel Destinations
- 13-8. Using a Single Data File As a Multiple Data Source for Parallel Destination Loads
- 13-9. Reading and Writing Data from a Database Source in Parallel
- 13-10. Inserting Records in Parallel and in Bulk
- 13-11. Creating Self-Optimizing Parallel Bulk Inserts
- 13-12. Loading Files in Controlled Batches
- 13-13. Executing SQL Statements and Procedures in Parallel Using SSIS
- 13-14. Executing SQL Statements and Procedures in Parallel Without SSIS
- 13-15. Executing SQL Statements and Procedures in Parallel Using SQL Server Agent
- Summary
-
Chapter 14: ETL Process Acceleration
- 14-1. Accelerating SSIS Lookups
- 14-2. Disabling and Rebuilding Nonclustered Indexes in a Destination Table
- 14-3. Persisting Destination Database Index Metadata
- 14-4. Scripting and Executing DROP Statements for Destination Database Indexes
- 14-5. Scripting and Executing CREATE Statements for Destination Database Indexes
- 14-6. Storing Metadata, and Then Scripting and Executing DROP and CREATE Statements for Destination Database XML Indexes
- 14-7. Finding Missing Indexes
- 14-8. Managing Check Constraints
- 14-9. Managing Foreign Key Constraints
- 14-10. Optimizing Bulk Loads
- Summary
-
Chapter 15: Logging and Auditing
- 15-1. Logging Events from T-SQL
- 15-2. Logging Data from SSIS
- 15-3. Customizing SSIS Logging
- 15-4. Saving and Applying Complex SSIS Logging Details
- 15-5. Extending SSIS Logging to an SQL Server Destination
- 15-6. Logging Information from an SSIS Script Task
- 15-7. Logging from T-SQL to the SSIS Log Table
- 15-8. Handling Errors in T-SQL
- 15-9. Handling Errors in SSIS
- 15-10. Creating a Centralized Logging Framework
- 15-11. Logging to a Centralized Framework When Using SSIS Containers
- 15-12. Logging to a Centralized Framework When Using SSIS Script Tasks and Components
- 15-13. Logging to a Text or XML File from T-SQL
- 15-14. Logging Counters in T-SQL
- 15-15. Logging Counters from SSIS
- 15-16. Creating an SSIS Catalog
- 15-17. Reading Logged Events and Counters from the SSIS Catalog
- 15-18. Analyzing Events and Counters In-Depth via the SSIS Catalog
- 15-19. Creating a Process Control Framework
- 15-20. Linking the SSIS Catalog to T-SQL Logging
- 15-21. Baselining ETL Processes
- 15-22. Auditing an ETL Process
- 15-23. Logging Audit Data
- Summary
-
Appendix A: Data Types
- SQL Server Data Types
- SSIS Data Types
- Default Data Mapping in the Import/Export Wizard
- MSSQL9 to MSSQL8
- MSSQL to DB2
- MSSQL to IBMDB2
- MSSQL to Jet4
- MSSQL to SSIS11
- OracleClient to MSSQL
- OracleClient to MSSQL11
- OracleClient to SSIS11
- Oracle to MSSQL
- Oracle to MSSQL11
- Oracle to SSIS11
- SQLClient9 to MSSQL8
- SQLClient to DB2
- SQLClient to IBMDB2
- SQLClient to MSSQL11
- SQLClient to Oracle
- SQLClient to SSIS
- SSIS11 to DB2
- SSIS11 to IBMDB2
- SSIS11 to MSSQL
- DB2 to MSSQL
- SSIS to Jet
- DB2 to MSSQL11
- SSIS to Oracle
- DB2 to SSIS11
- IBMDB2 to MSSQL
- IBMDB2 to MSSQL11
- IBMDB2 to SSIS11
- Jet to MSSQL8
- Jet to SSIS
- ACE to SSIS
- Excel to SQL Server and SSIS Data Mapping
- Access to SQL Server and SSIS Data Mapping
- Oracle to SQL Server and SSIS Data Mapping
- Oracle to SQL Server Replication Data Type Mapping
- MySQL Data Types
- Sybase to SQL Server Data Type Conversion
- Appendix B: Sample Databases and Scripts
- Index
Product information
- Title: SQL Server 2012 Data Integration Recipes: Solutions for Integration Services and Other ETL Tools
- Author(s):
- Release date: November 2012
- Publisher(s): Apress
- ISBN: 9781430247913
You might also like
book
Professional Microsoft SQL Server 2012 Integration Services
An in-depth look at the radical changes to the newest release of SISS Microsoft SQL Server …
book
SQL Server 2017 Integration Services Cookbook
Harness the power of SQL Server 2017 Integration Services to build your data integration solutions with …
book
Professional Microsoft SQL Server 2014 Integration Services
Fill the gap between planning and doing with SSIS 2014 The 2014 release of Microsoft's SQL …
book
Pro SQL Server 2012 Integration Services
Pro SQL Server 2012 Integration Services teaches how to take advantage of the powerful extract, transform, …