
Figure 19.7 Program Data Vector After Reading from Each Data Set
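After reading the first data set: Abbott, Jennifer; remaining values missing (.)
After reading the second data set: Abbott, Jennifer; Hitchcock-Tyler, Erin; remaining values missing (.)
After reading the third data set: Abbott, Jennifer; Hitchcock-Tyler, Erin; 14SEP2000; 10:00; 103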
4. After processing the first observation from the last data set and executing any other
statements in the DATA step, SAS writes the contents of the program data vector to
the new data set. If the DATA step attempts to read past the end of a data set, then the
values of all variables from that data set in the program data vector are set to
missing.
This behavior has two important consequences:
• If a variable exists in more than one data set, then the value from the last data set
SAS reads is the value that goes into the new data set, even if that value is
missing. If you want to keep all the values for like-named variables from
different data sets, then you must rename one or more of the variables with the
RENAME= data set option so that each variable has a unique name.
• After SAS processes all observations in a data set, the program data vector and
all subsequent observations in the new data set have missing values for the
variables unique to that data set. So, as the next figure shows, the program data
vector for the last observation in the new data set contains missing values for all
variables except Name2.
Figure 19.8 Program Data Vector for the Last Observation
5. SAS continues to merge observations until it has copied all observations from all
data sets.
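The steps above can be sketched with a small one-to-one merge. This is a minimal illustration, not the example that the figures depict: the data set names (WORK.FIRST, WORK.SECOND, WORK.COMBINED) and all names other than the two that appear in Figure 19.7 are hypothetical. Both input data sets contain a variable called Name, so the RENAME= data set option gives the second one a unique name, and the two data sets deliberately have different numbers of observations.

```sas
/* Hypothetical input data: two observations in the first data set. */
data work.first;
   input Name $ 1-25;
   datalines;
Abbott, Jennifer
Doe, Jane
;
run;

/* Hypothetical input data: three observations in the second data set. */
data work.second;
   input Name $ 1-25;
   datalines;
Hitchcock-Tyler, Erin
Roe, Richard
Poe, Alex
;
run;

/* One-to-one merge: MERGE without a BY statement.              */
/* RENAME= keeps both like-named variables by renaming the one  */
/* from the second data set to Name2.                           */
data work.combined;
   merge work.first
         work.second(rename=(Name=Name2));
run;

proc print data=work.combined;
run;
```

Under these assumptions, WORK.COMBINED should contain three observations, and in the third one Name should be missing because WORK.FIRST has already been exhausted. If you remove the RENAME= option, both data sets contribute a single variable Name, and each observation keeps the value from WORK.SECOND, the last data set that SAS reads.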
Match-Merging
Merging with a BY Statement
Merging with a BY statement enables you to match observations according to the values
of the BY variables that you specify. Before you can perform a match-merge, you must
sort all data sets by the variables that you want to use for the merge.
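As a minimal sketch of the sort-then-merge sequence just described, the following program sorts two data sets by a common variable and then match-merges them. The data set names (WORK.ROSTER, WORK.ROOMS, WORK.MATCHED), the variables Major and Room, and all data values are hypothetical and are used only for illustration.

```sas
/* Hypothetical input data, deliberately not in sorted order. */
data work.roster;
   input Name :$12. Major :$12.;
   datalines;
Carter Math
Abbott Drama
;
run;

/* Hypothetical input data with one Name that has no match. */
data work.rooms;
   input Name :$12. Room;
   datalines;
Evans 105
Abbott 103
Carter 207
;
run;

/* Sort each data set by the variable used for the merge. */
proc sort data=work.roster;
   by Name;
run;

proc sort data=work.rooms;
   by Name;
run;

/* Match-merge: the BY statement follows the MERGE statement. */
data work.matched;
   merge work.roster work.rooms;
   by Name;
run;

proc print data=work.matched;
run;
```

In this sketch, observations are paired by the value of Name rather than by their position in each data set. A value of Name that appears in only one data set (Evans here) should still produce an observation, with missing values for the variables that come only from the other data set.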
In order to understand match-merging, you must understand three key concepts: