Display A3.13 Sample PDF Output
Validating Product Data
Overview
Use a Data Validation transformation to improve the quality of operational data before
you load that data into a data warehouse or data mart. You can detect error conditions
and specify actions that alleviate those errors. Error conditions include blank or missing
values, duplicate values, and invalid values. The actions that you can take in response to
erroneous values include stopping the job, changing the value, or writing the row to an
error table instead of to the target.
Custom validation enables you to apply source values to user-written expressions. You
then define the actions that are taken in response to true and false results. Custom actions
include the replacement of source values in the target. Replacement values can be
generated by a second expression, or they can be obtained from a translation table.
Each of the validation actions sends information to an exception report, which you can
create on the Error and Exception Tables tab. You can specify the name and path of
the exception report on the Status Handling tab.
Validating Product Data 689
Problem
You want to create a job that validates operational data before that data is loaded into a
data warehouse or data mart.
Solution
You can use a Data Validation transformation to improve data quality by identifying and
acting on duplicate values, invalid values, and missing values. Perform the following
tasks to create the job:
“Create and Populate the Job” on page 690
“Configure Data Validation Settings” on page 691
“Run the Job and View the Output” on page 692
You can also develop your own validation process that translates source values by using
expressions or translation tables. The expressions can include the data quality functions
that are available in the Expression Builder. In this example, source data on product
revenues is validated before it is loaded into an enterprise data warehouse. Source rows
with duplicate product numbers or with invalid product names are written to an error
table, and valid rows are written to a table in the warehouse.
Tasks
Create and Populate the Job
Perform the following steps to create and populate the job:
1. Create an empty SAS Data Integration Studio job.
2. Select and drag a Data Validation transformation from the Data folder in the
Transformations tree. Then, drop it in the empty job on the Diagram tab in the Job
Editor window.
3. Select and drag the source table from the Inventory tree. Then, drop it before the
Data Validation transformation on the Diagram tab. The source table for this sample
job is Product_transact.
4. Drag the cursor from the source table to the input port of the Data Validation
transformation. This action connects the source to the transformation.
5. Because you want to have a permanent target table to contain the output for the
transformation, right-click the temporary work table attached to the transformation.
Then click Replace in the pop-up menu. Finally, use the Table Selector window to
select the target table for the job. The target table must be registered in SAS Data
Integration Studio.
The following display shows a sample process flow diagram for a job that contains
the Data Validation transformation.
Display A3.14 Sample Process Flow
690 Appendix 3 Miscellaneous Transformations

Get SAS Data Integration Studio 4.9 now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.