Grouped by X Y Z
_N_=1 FIRST.x=1 LAST.x=0 FIRST.y=1 LAST.y=0 FIRST.z=1 LAST.z=0
_N_=2 FIRST.x=0 LAST.x=0 FIRST.y=0 LAST.y=1 FIRST.z=0 LAST.z=1
_N_=3 FIRST.x=0 LAST.x=1 FIRST.y=1 LAST.y=1 FIRST.z=1 LAST.z=1
_N_=4 FIRST.x=1 LAST.x=1 FIRST.y=1 LAST.y=1 FIRST.z=1 LAST.z=1
Grouped by Y X Z
_N_=1 FIRST.y=1 LAST.y=0 FIRST.x=1 LAST.x=0 FIRST.z=1 LAST.z=0
_N_=2 FIRST.y=0 LAST.y=1 FIRST.x=0 LAST.x=1 FIRST.z=0 LAST.z=1
_N_=3 FIRST.y=1 LAST.y=0 FIRST.x=1 LAST.x=1 FIRST.z=1 LAST.z=1
_N_=4 FIRST.y=0 LAST.y=1 FIRST.x=1 LAST.x=1 FIRST.z=1 LAST.z=1
Processing BY-Groups in the DATA Step
Overview
The most common use of BY-group processing is to combine data sets by using the BY
statement with the SET, MERGE, MODIFY, or UPDATE statements. (If you use a SET,
MERGE, or UPDATE statement with the BY statement, your observations must be
grouped or ordered.) When processing these statements, SAS reads one observation at a
time into the program data vector. With BY-group processing, SAS selects the
observations from the data sets according to the values of the BY variable or variables.
After processing all the observations from one BY group, SAS expects the next
observation to be from the next BY group.
The BY statement modifies the action of the SET, MERGE, MODIFY, or UPDATE
statement by controlling when the values in the program data vector are set to missing.
During BY-group processing, SAS retains the values of variables until it has copied the
last observation that it finds for that BY group in any of the data sets. Without the BY
statement, the SET statement sets variables to missing when it reads the last observation
from any data set, whereas the MERGE statement does not set variables to missing after
the DATA step starts reading observations into the program data vector.
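As a minimal sketch of this retention behavior, the following program merges two small data sets by Department. The data set names DEPTS, STAFF, and COMBINED, the variable Manager, and the data values are hypothetical and are used only for illustration. DEPTS contributes one observation per BY group and STAFF contributes several, so the value of Manager that is read from DEPTS should be retained for every STAFF observation in the same BY group.

```sas
/* Hypothetical data sets, for illustration only. */
data depts;
   input Department $ Manager $;
   datalines;
BAD Carol
DDG Paul
;

data staff;
   input Department $ Name $;
   datalines;
BAD Linda
BAD Thomas
DDG Jason
;

/* Both data sets are already in Department order, */
/* so no PROC SORT step is needed before the merge. */
data combined;
   merge depts staff;
   by Department;
run;

proc print data=combined;
run;
```

In this sketch, COMBINED should contain three observations, and both BAD observations should carry Manager=Carol, because SAS does not reset Manager to missing until it moves to the next BY group.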
Processing BY-Groups Conditionally
You can process observations conditionally by using the subsetting IF or IF-THEN
statements, or the SELECT statement, with the temporary variables FIRST.variable and
LAST.variable (set up during BY-group processing). For example, you can use the IF or
IF-THEN statements to perform calculations for each BY group and to write an
observation when the first or the last observation of a BY group has been read into the
program data vector.
The following example computes annual payroll by department. It uses IF-THEN
statements, a subsetting IF statement, and the values of the FIRST.variable and
LAST.variable automatic variables to reset the value of PAYROLL to 0 at the beginning
of each BY group and to write an observation after the last observation in a BY group
is processed.
data salaries;
   input Department $ Name $ WageCategory $ WageRate;
   datalines;
BAD Carol Salaried 20000
BAD Elizabeth Salaried 5000
BAD Linda Salaried 7000
BAD Thomas Salaried 9000
BAD Lynne Hourly 230
DDG Jason Hourly 200
DDG Paul Salaried 4000
PPD Kevin Salaried 5500
PPD Amber Hourly 150
PPD Tina Salaried 13000
STD Helen Hourly 200
STD Jim Salaried 8000
;
proc print data=salaries;
run;
proc sort data=salaries out=temp;
   by Department;
run;
data budget (keep=Department Payroll);
   set temp;
   by Department;
   if WageCategory='Salaried' then YearlyWage=WageRate*12;
   else if WageCategory='Hourly' then YearlyWage=WageRate*2000;
   /* SAS sets FIRST.Department to 1 when the current  */
   /* observation is the first one in a new BY group.  */
   if first.Department then Payroll=0;
   Payroll+YearlyWage;
   /* SAS sets LAST.Department to 1 when the current   */
   /* observation is the last one in its BY group.     */
   /* This subsetting IF writes only that observation. */
   if last.Department;
run;
proc print data=budget;
   format Payroll dollar10.;
   title 'Annual Payroll by Department';
run;
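Output 22.1 Output from Conditional BY-Group Processing
Annual Payroll by Department
Obs    Department      Payroll
  1    BAD            $952,000
  2    DDG            $448,000
  3    PPD            $522,000
  4    STD            $496,000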
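The same pattern can be combined with a SELECT statement. The following variation is a sketch, not part of the original example: it assumes the TEMP data set created by the PROC SORT step above, and the names HEADCOUNT, NumSalaried, and NumHourly are hypothetical. It counts employees in each wage category per department, using SELECT for the per-observation logic and IF-THEN statements with FIRST.Department and LAST.Department to reset the counters and write the result.

```sas
/* Assumes the TEMP data set created by the PROC SORT step above. */
/* HEADCOUNT, NumSalaried, and NumHourly are hypothetical names.  */
data headcount (keep=Department NumSalaried NumHourly);
   set temp;
   by Department;

   /* Reset both counters at the start of each BY group. */
   if first.Department then do;
      NumSalaried=0;
      NumHourly=0;
   end;

   /* The sum statements retain the counters across observations. */
   select (WageCategory);
      when ('Salaried') NumSalaried+1;
      when ('Hourly') NumHourly+1;
      otherwise;
   end;

   /* Write one observation per department. */
   if last.Department then output;
run;

proc print data=headcount;
run;
```

The FIRST. and LAST. tests are kept in separate IF-THEN statements rather than in a single SELECT group because an observation can be both the first and the last in its BY group, and a SELECT group executes only the first WHEN clause that matches.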
Data Not in Alphabetic or Numeric Order
In BY-group processing, you can use data that is arranged in an order other than
alphabetic or numeric, such as by calendar month or by category. To do this, use the
NOTSORTED option in a BY statement when you use a SET statement. The
NOTSORTED option in the BY statement tells SAS that the data is not in alphabetic or
numeric order, but that it is arranged in groups by the values of the BY variable. You