About Creating a SAS Data Set with a DATA Step
Creating a SAS Data File or a SAS View
You can create either a SAS data file, which is a data set that holds actual data, or a
SAS view, which is a data set that references data that is stored elsewhere. By default,
you create a SAS data file. To create a SAS view instead, use the VIEW= option in the
DATA statement. With a SAS view, you can process current input data values without
having to edit your DATA step. For example, you can process monthly sales figures as
they change: whenever you create output, the output from the SAS view reflects the
current input data values.
The following DATA statement creates a SAS view called Monthly_Sales.
data monthly_sales / view=monthly_sales;
The following DATA statement creates a data file called Test_Results.
data test_results;
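As a minimal sketch of how such a view might be defined and then used, the following program wraps the DATA statement above in a complete DATA step. The variable names Region and Sales are hypothetical, and 'your-input-file' is a placeholder for an external file.

```sas
/* Define the view. Region and Sales are hypothetical variables, */
/* and 'your-input-file' is a placeholder.                       */
data monthly_sales / view=monthly_sales;
   infile 'your-input-file';
   input Region $ Sales;
run;

/* The input file is read each time the view is referenced, */
/* so the report reflects the file's current contents.      */
proc print data=monthly_sales;
run;
```

Running the DATA step stores the view definition. The raw data is read later, when a step such as PROC PRINT references the view.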
Sources of Input Data
You select data-reading statements based on the source of your input data. There are at
least six sources of input data:
• raw data in an external file
• raw data in the jobstream (instream data)
• data in SAS data sets
• data that is created by programming statements
• data that you can remotely access through a SAS catalog entry, the clipboard, a data
  URL, an email, an FTP protocol, a Hadoop Distributed File System, TCP/IP socket,
  a URL, a WebDAV protocol, or through zlib services
• data that is stored in a Database Management System (DBMS) or other vendor's data
  files.
Usually, DATA steps read input data records from only one of the first three sources of
input. However, DATA steps can use a combination of some or all of the sources.
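The two sources that the examples in this section do not cover can be sketched as follows. This is an illustrative sketch only, and the data set names Squares and Big_Squares and the variables n and square are hypothetical.

```sas
/* Source: programming statements only. No input file is read. */
data squares;
   do n = 1 to 5;
      square = n**2;
      output;          /* write one observation per iteration */
   end;
run;

/* Source: an existing SAS data set, read with the SET statement. */
data big_squares;
   set squares;
   if square > 10;     /* subsetting IF keeps only matching observations */
run;
```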
Reading Raw Data: Examples
Example 1: Reading External File Data
The components of a DATA step that produce a SAS data set from raw data stored in an
external file are outlined here.
data Weight;                          /* 1 */
   infile 'your-input-file';          /* 2 */
   input IDnumber $ week1 week16;     /* 3 */
   WeightLoss=week1-week16;           /* 4 */
run;                                  /* 5 */

proc print data=Weight;               /* 6 */
run;                                  /* 7 */
1  Begin the DATA step and create a SAS data set called Weight.
2  Specify the external file that contains your data.
3  Read a record and assign values to three variables.
4  Calculate a value for variable WeightLoss.
5  Execute the DATA step.
6  Print data set Weight using the PRINT procedure.
7  Execute the PRINT procedure.
Example 2: Reading Instream Data Lines
This example reads raw data from instream data lines.
data Weight2;                         /* 1 */
   input IDnumber $ week1 week16;     /* 2 */
   WeightLoss2=week1-week16;          /* 3 */
   datalines;                         /* 4 */
2477 195 163
2431 220 198
2456 173 155
2412 135 116
;                                     /* 5 */

proc print data=Weight2;              /* 6 */
run;                                  /* 7 */
1  Begin the DATA step and create SAS data set Weight2.
2  Read a data line and assign values to three variables.
3  Calculate a value for variable WeightLoss2.
4  Begin the data lines.
5  Signal end of data lines with a semicolon and execute the DATA step.
6  Print data set Weight2 using the PRINT procedure.
7  Execute the PRINT procedure.
Example 3: Reading Instream Data Lines with Missing Values
You can also take advantage of options in the INFILE statement when you read instream
data lines. This example shows the use of the MISSOVER option, which assigns missing
values to variables for records that contain no data for those variables.
data weight2;
   infile datalines missover;         /* 1 */
   input IDnumber $ Week1 Week16;
   WeightLoss2=Week1-Week16;
   datalines;                         /* 2 */
2477 195 163
2431
2456 173 155
2412 135 116
;                                     /* 3 */

proc print data=weight2;              /* 4 */
run;                                  /* 5 */
1  Use the MISSOVER option to assign missing values to any variables for which the
   current record contains no values, instead of reading the next record to find them.
2  Begin data lines.
3  Signal end of data lines and execute the DATA step.
4  Print data set Weight2 using the PRINT procedure.
5  Execute the PRINT procedure.
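For contrast, the following sketch reads the same data lines without MISSOVER. It relies on the default INFILE behavior (FLOWOVER), in which an INPUT statement that runs out of values on the current record continues reading on the next record. The data set name Weight2_Flowover is a hypothetical name chosen for illustration.

```sas
/* Same data lines as Example 3, but without MISSOVER (default FLOWOVER). */
/* Weight2_Flowover is a hypothetical data set name.                      */
data Weight2_Flowover;
   input IDnumber $ Week1 Week16;
   WeightLoss2=Week1-Week16;
   datalines;
2477 195 163
2431
2456 173 155
2412 135 116
;

proc print data=Weight2_Flowover;
run;
```

Under the default behavior, the record for 2431 should take its Week1 and Week16 values from the following data line, so the data set ends up with three observations instead of four. MISSOVER avoids this by setting Week1 and Week16 to missing for 2431 and keeping one observation per data line.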
Example 4: Using Multiple Input Files in Instream Data
This example shows how to read multiple input files whose names are supplied as instream
data to your program. The DATA step reads the records in each file and creates the
All_Errors SAS data set. The program then sorts the observations by Station and creates
a sorted data set called Sorted_Errors. The PRINT procedure prints the results.
data all_errors;
   length filelocation $ 60;
   input filelocation; /* reads instream data */
   infile daily filevar=filelocation
          filename=daily end=done;
   do while (not done);
      input Station $ Shift $ Employee $ NumberOfFlaws;
      output;
   end;
   put 'Finished reading ' daily=;
   datalines;
pathmyfile_A
pathmyfile_B
pathmyfile_C
;
proc sort data=all_errors out=sorted_errors;
by Station;
run;
proc print data = sorted_errors;
title 'Flaws Report sorted by Station';
run;
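The variable named in the FILEVAR= option is not written to the output data set, so All_Errors does not record which file each observation came from. One possible variation, sketched here under that assumption, copies the file name into an ordinary variable before each OUTPUT statement. The data set name All_Errors_Src and the variable SourceFile are hypothetical, and the data lines are the same placeholders as in Example 4.

```sas
/* Variation on Example 4 that keeps the source file name.         */
/* All_Errors_Src and SourceFile are hypothetical names.           */
data all_errors_src;
   length filelocation SourceFile $ 60;
   input filelocation;                  /* reads instream data */
   infile daily filevar=filelocation end=done;
   do while (not done);
      input Station $ Shift $ Employee $ NumberOfFlaws;
      SourceFile = filelocation;        /* ordinary variable, so it is kept */
      output;
   end;
   datalines;
pathmyfile_A
pathmyfile_B
pathmyfile_C
;
```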