Testing Your Program
As a final step in preparing your data sets, you should test your program. Create small
temporary SAS data sets that contain a sample of observations that test all of your
program's logic. If your logic is faulty and you get unexpected output, you can use the
DATA step debugger to debug your program. For complete information about the
DATA step debugger, see SAS Data Set Options: Reference.
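As a minimal sketch of this kind of test, the following steps build a small temporary data set in the WORK library and run some branching logic against it. The data set names (work.test_scores, work.graded), the variables, and the grading rule are hypothetical. They are chosen only so that every branch, including the missing-value case, is exercised at least once.

```sas
/* Hypothetical test data: one observation for each branch of the logic below */
data work.test_scores;
   input id score;
   datalines;
1 95
2 70
3 40
4 .
;
run;

/* Hypothetical program logic under test */
data work.graded;
   set work.test_scores;
   length grade $ 7;
   if missing(score) then grade = 'UNKNOWN';
   else if score >= 90 then grade = 'HIGH';
   else if score >= 60 then grade = 'MEDIUM';
   else grade = 'LOW';
run;

/* Inspect the result by eye before running against the full data */
proc print data=work.graded;
run;
```

If the printed grades are not what you expect, one option in a SAS windowing environment is to add the DEBUG option to the DATA statement (for example, data work.graded / debug;) and step through the logic interactively.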
Combining SAS Data Sets: Methods
Concatenating data sets is the combining of two or more data sets, one after the other,
into a single data set. The number of observations in the new data set is the sum of the
number of observations in the original data sets. The order of observations is sequential.
All observations from the first data set are followed by all observations from the second
data set, and so on.
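The definition above can be illustrated with a short sketch. The data set names (work.first, work.second, work.combined) and their contents are hypothetical. The point is only that the output contains the two observations from the first data set followed by the three observations from the second, five in all.

```sas
/* Hypothetical input data sets with identical variables */
data work.first;
   input id name $;
   datalines;
1 Ann
2 Bob
;
run;

data work.second;
   input id name $;
   datalines;
3 Cal
4 Dee
5 Eve
;
run;

/* Concatenate: all observations from work.first, then all from work.second */
data work.combined;
   set work.first work.second;
run;

proc print data=work.combined;
run;
```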
In the simplest case, all input data sets contain the same variables. If the input data sets
contain different variables, observations from one data set have missing values for
variables defined only in other data sets. In either case, the new data set contains
every variable that appears in any of the input data sets.
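A small sketch of the case in which the input data sets contain different variables follows. The data set and variable names are hypothetical. Here work.q1 has no REGION variable and work.q2 has no SALES variable, so each contributes missing values for the variable that it lacks.

```sas
/* Hypothetical inputs: SALES exists only in work.q1, REGION only in work.q2 */
data work.q1;
   input id sales;
   datalines;
1 100
2 150
;
run;

data work.q2;
   input id region $;
   datalines;
3 East
4 West
;
run;

/* The output contains ID, SALES, and REGION.                 */
/* Observations from work.q1 have a blank (missing) REGION;   */
/* observations from work.q2 have a missing SALES.            */
data work.all_quarters;
   set work.q1 work.q2;
run;

proc print data=work.all_quarters;
run;
```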
Use this form of the SET statement to concatenate data sets:
SET data-set(s);
where
data-set
specifies any valid SAS data set name.
For a complete description of valid SAS data set names, see the SET statement in SAS
DATA Step Processing during Concatenation
Compilation phase
SAS reads the descriptor information of each data set that is named in the SET
statement and then creates a program data vector that contains all the variables from
all data sets as well as variables created by the DATA step.
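One way to see the contents of the program data vector is to write it to the SAS log with PUT _ALL_. The sketch below uses hypothetical data sets work.a (variable X only) and work.b (variable Y only), plus a variable Z that is created by the DATA step. Every line written to the log should list X, Y, and Z, together with the automatic variables _ERROR_ and _N_, regardless of which data set supplied the current observation.

```sas
/* Hypothetical inputs with one variable each */
data work.a;
   x = 1;
run;

data work.b;
   y = 2;
run;

data work.both;
   set work.a work.b;
   z = sum(x, y) * 10;   /* Z is created by the DATA step, not read from a data set */
   put _all_;            /* write the current program data vector to the log */
run;
```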
Execution — Step 1
SAS reads the first observation from the first data set into the program data vector. It
processes the first observation and executes other statements in the DATA step. It
then writes the contents of the program data vector to the new data set.
The SET statement does not reset the values in the program data vector to missing,
except for variables whose value is calculated or assigned during the DATA step.
Variables that are created by the DATA step are set to missing at the beginning of
each iteration of the DATA step. Variables that are read from a data set are not.
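The difference between the two kinds of variables can be sketched as follows. The data sets and variables are hypothetical. RATE is read from a data set only on the first iteration, yet its value remains in the program data vector for every later iteration. BONUS is created by the DATA step, so it is reset to missing at the start of each iteration and stays missing whenever the assignment does not execute.

```sas
/* Hypothetical one-observation data set that holds a constant */
data work.lookup;
   rate = 0.25;
run;

/* Hypothetical detail data */
data work.sales;
   input amount;
   datalines;
200
100
300
;
run;

data work.result;
   /* Read once: RATE comes from a data set, so it is not reset to missing */
   if _n_ = 1 then set work.lookup;
   set work.sales;
   /* BONUS is created by the DATA step and is reset on every iteration, */
   /* so the second observation (amount=100) has a missing BONUS         */
   if amount > 150 then bonus = amount * rate;
run;

proc print data=work.result;
run;
```

The conditional SET statement is used here only to make the retained value visible. It is not part of the concatenation process itself.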
Execution — Step 2
SAS continues to read one observation at a time from the first data set until it finds
an end-of-file indicator. The values of the variables in the program data vector are