Combining Selected Observations from Multiple Data Sets
To create a data set that contains only the observations that are selected according to a
particular criterion, you can use the subsetting IF statement and a SET statement that
specifies multiple data sets. The following DATA step reads two input data sets to create
a combined data set that lists only the winning teams:
data champions(drop=result);              /* 1 */
   set southamerican (in=S) european;     /* 2 */
   by Year;
   if result='won';                       /* 3 */
   if S then Continent='South America';   /* 4 */
   else Continent='Europe';
run;

proc print data=champions;
   title 'World Cup Champions from 1954 to 1998';
   title2 'including Countries'' Continent';
run;
The following list corresponds to the numbered items in the preceding program:
1
The DROP= data set option drops the variable Result from the new data set
CHAMPIONS because all values for this variable are the same.
2
The SET statement reads observations from two data sets: SOUTHAMERICAN and
EUROPEAN. The IN= data set option creates the temporary variable S, which is set
to 1 each time an observation is contributed by the SOUTHAMERICAN data set.
3
A subsetting IF statement writes the observation to the output data set CHAMPIONS
only if the value of the Result variable is won.
4
When the current observation comes from the data set SOUTHAMERICAN, the
value of S is 1. Otherwise, the value is 0. The IF-THEN/ELSE statements execute
one of two assignment statements, depending on the value of S. If the observation
comes from the data set SOUTHAMERICAN, then the value assigned to Continent
is South America. If the observation comes from the data set EUROPEAN, then the
value assigned to Continent is Europe.
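To try the technique without the full input data sets, the following sketch builds two miniature stand-ins and runs the same logic against them. The data set names SA_DEMO, EU_DEMO, and CHAMPIONS_DEMO and the variable Country are assumptions made for this illustration; they are not the layout of SOUTHAMERICAN and EUROPEAN, and only three finals are included.

```sas
/* Hypothetical miniature inputs for illustration only.            */
/* Each data set must already be sorted by Year for the BY         */
/* statement in the combining step to work.                        */
data sa_demo;
   input Year Country :$12. Result $;
   datalines;
1958 Brazil won
1970 Brazil won
1998 Brazil lost
;
run;

data eu_demo;
   input Year Country :$12. Result $;
   datalines;
1958 Sweden lost
1970 Italy lost
1998 France won
;
run;

data champions_demo(drop=Result);
   /* IN=S creates a temporary flag: 1 when the current            */
   /* observation comes from SA_DEMO, 0 otherwise.                 */
   set sa_demo(in=S) eu_demo;
   by Year;

   /* Subsetting IF: only winners continue to the statements below */
   /* and are written to the output data set.                      */
   if Result='won';

   /* The first assignment fixes the length of Continent, so the   */
   /* longer value is assigned first to avoid truncation.          */
   if S then Continent='South America';
   else Continent='Europe';
run;

proc print data=champions_demo;
   title 'Winners from the Miniature Demo Data Sets';
run;
```

With these inputs, CHAMPIONS_DEMO should contain three observations, one per year, and S itself is not written to the output because variables created by IN= are temporary. Note that reversing the two assignment statements would give Continent a length of 6 and truncate South America unless a LENGTH statement is added.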
358 Chapter 22 Conditionally Processing Observations from Multiple SAS Data Sets
