
Step-by-Step Programming with Base SAS 9.4, Second Edition, by SAS Institute


proc print data=mylib.arch_or_scen;
   title 'Data Set MYLIB.ARCH_OR_SCEN';
run;
The PROC PRINT statement that follows the DATA step produces this display of the
MYLIB.ARCH_OR_SCEN data set:
Figure 12.1 Data Set MYLIB.ARCH_OR_SCEN
Working with Grouped Data
Understanding the Basics of Grouping Data
The basic method for grouping data is to use a BY statement:
BY list-of-variables;
The BY statement can be used in a DATA step with a SET, MERGE, MODIFY, or
UPDATE statement, or it can be used in SAS procedures.
To work with grouped data using the SET, MERGE, MODIFY, or UPDATE statements,
the data must meet these conditions:
- The observations must be in a SAS data set, not an external file.
- The variables that define the groups must appear in the BY statement.
- All observations in the input data set must be in ascending or descending numeric or
  character order, or grouped in some way, such as by calendar month or by a
  formatted value, according to the variables that are specified in the BY statement.
Note: If you use the MODIFY statement, the input data does not need to be in any
order. However, ordering the data can improve performance.
If the third condition is not met, the data is stored in a SAS data set but is not arranged in
the groups that you want. You can order the data using the SORT procedure (discussed
in the next section).
After the SAS data set is arranged in some order, you can use the BY statement to group
values of one or more common variables.
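As a minimal sketch of the BY statement in a DATA step, the following step counts the observations in each TourType group. It assumes the sorted data set TOURORDER that the SORT procedure creates in the next section. The output data set name TYPECOUNT and the variable NumTours are hypothetical names chosen only for illustration. FIRST.TourType and LAST.TourType are temporary variables that SAS creates for a BY variable to mark the first and last observation of each BY group.

```sas
/* Assumes WORK.TOURORDER is already sorted by TourType.            */
/* TYPECOUNT and NumTours are hypothetical names for illustration.  */
data typecount;
   set tourorder;
   by TourType;
   if first.TourType then NumTours = 0;  /* reset the counter at the start of each BY group */
   NumTours + 1;                         /* sum statement: value is retained across observations */
   if last.TourType then output;         /* write one observation per BY group */
   keep TourType NumTours;
run;
```

If the input data set were not in TourType order, the BY statement would cause the DATA step to stop with an error, which is why the ordering condition matters.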
Grouping Observations with the SORT Procedure
All observations in the input data set must be in a particular order. To meet this
condition, the observations in MYLIB.ARCH_OR_SCEN can be ordered by the values
of TourType, which are architecture or scenery. Use the SORT procedure to sort
the observations by TourType:
proc sort data=mylib.arch_or_scen out=tourorder;
   by TourType;
run;
The SORT procedure sorts the data set MYLIB.ARCH_OR_SCEN alphabetically
according to the values of TourType. The sorted observations go into a new data set
specified by the OUT= option. In this example, TOURORDER is the sorted data set. If
the OUT= option is omitted, the sorted version of the data set replaces the data set
MYLIB.ARCH_OR_SCEN.
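To illustrate the effect of omitting the OUT= option without changing the permanent data set, this sketch first copies the data to a temporary data set and then sorts the copy in place. The name TOURCOPY is a hypothetical choice, not part of the original example.

```sas
/* TOURCOPY is a hypothetical name; the copy protects MYLIB.ARCH_OR_SCEN. */
data tourcopy;
   set mylib.arch_or_scen;
run;

/* No OUT= option: the sorted observations replace WORK.TOURCOPY itself. */
proc sort data=tourcopy;
   by TourType;
run;
```

Sorting a copy, or using OUT=, is the safer choice when the original order of the observations might be needed later.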
The SORT procedure does not produce output other than the sorted data set. A message
in the SAS log says that the SORT procedure was executed:
Log 12.1 Message That the SORT Procedure Has Executed Successfully
880 proc sort data=mylib.arch_or_scen out=tourorder;
881 by TourType;
882 run;
NOTE: There were 8 observations read from the data set MYLIB.ARCH_OR_SCEN.
NOTE: The data set WORK.TOURORDER has 8 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.20 seconds
      cpu time            0.04 seconds
To see the sorted data set, add a PROC PRINT step to the program:
proc sort data=mylib.arch_or_scen out=tourorder;
   by TourType;
run;

proc print data=tourorder;
   var TourType Country Nights LandCost Vendor;
   title 'Tours Sorted by Architecture or Scenery';
run;
186 Chapter 12 Working with Grouped or Sorted Observations
The following output displays the results.
Figure 12.2 Displaying the Sorted Output
By default, SAS arranges groups in ascending order of the BY values, smallest to
largest. Sorting a data set does not change the order of the variables within it. However,
most examples in this section use a VAR statement in the PRINT procedure to display
the BY variable in the first column. (The PRINT procedure and other procedures used in
this documentation can also produce a separate report for each BY group.)
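The two points above, the default ascending order and the separate report for each BY group, can be sketched as follows. The DESCENDING keyword in the BY statement reverses the order for the variable that follows it, and a BY statement in the PRINT procedure produces one report section per BY group. The data set name TOURDESC and the title text are hypothetical.

```sas
/* TOURDESC is a hypothetical data set name used only for this sketch. */
proc sort data=mylib.arch_or_scen out=tourdesc;
   by descending TourType;   /* DESCENDING applies to the variable that follows it */
run;

proc print data=tourdesc;
   by descending TourType;   /* must match the order in which the data set was sorted */
   var Country Nights LandCost Vendor;
   title 'Tours by Type in Descending Order, One Report per BY Group';
run;
```

Because TourType appears in the BY statement of the PRINT step, it is shown in a heading above each group rather than as a column, so it is left out of the VAR statement here.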
Grouping by More Than One Variable
You can group observations by as many variables as you want. This example groups
observations by TourType, Vendor, and LandCost:
proc sort data=mylib.arch_or_scen out=tourorder2;
   by TourType Vendor LandCost;
run;

proc print data=tourorder2;
   var TourType Vendor LandCost Country Nights;
   title 'Tours Grouped by Type of Tour, Vendor, and Price';
run;