Compressing Data Files
Definition of Compression
Compressing a file is a process that reduces the number of bytes required to represent
each observation. In a compressed file, each observation is a variable-length record,
while in an uncompressed file, each observation is a fixed-length record.
Advantages of compressing a file include the following:
reduced storage requirements for the file
fewer I/O operations necessary to read from or write to the data during processing
There are disadvantages to compressing a file. For example:
More CPU resources are required to read a compressed file because of the overhead
of uncompressing each observation.
In some situations, the resulting file size can increase rather than decrease.
Requesting Compression
By default, a SAS data file is not compressed. To request compression, use one of these options:
COMPRESS= system option to compress all data files that are created during a SAS
session
COMPRESS= option in the LIBNAME statement to compress all data files for a
particular SAS library
COMPRESS= data set option to compress an individual data file
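The three scopes listed above can be sketched as follows. This is a minimal illustration, not a recommended setup: the libref MYLIB and the output data set names are hypothetical, the library is pointed at the WORK directory only so that the example runs without an external path, and SASHELP.CLASS is assumed to be available as sample input.

```sas
/* Session scope: every data file created after this statement is compressed. */
options compress=char;

/* Library scope: every data file created in MYLIB is compressed.      */
/* MYLIB is a hypothetical libref; it reuses the WORK directory here   */
/* only so that the sketch is self-contained.                          */
libname mylib "%sysfunc(pathname(work))" compress=char;

/* Data set scope: only this output data file is compressed. */
data mylib.class_copy(compress=char);
   set sashelp.class;
run;

/* Restore the default so that later steps in the session are unaffected. */
options compress=no;
```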
To compress a data file, you can specify one of the following:
COMPRESS=CHAR to use the RLE (Run Length Encoding) compression algorithm
COMPRESS=BINARY to use the RDC (Ross Data Compression) algorithm
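As a sketch of how the two settings might be compared, the following steps write the same input twice, once with each algorithm. The output data set names are hypothetical, and SASHELP.CLASS is assumed to be available; it is a very small table, so the reported reduction is unlikely to be representative of real data.

```sas
/* Same input, RLE compression (COMPRESS=CHAR). */
data work.class_char(compress=char);
   set sashelp.class;
run;

/* Same input, RDC compression (COMPRESS=BINARY). */
data work.class_binary(compress=binary);
   set sashelp.class;
run;
```

After each step, check the SAS log for the note about the compression result, and compare the two notes to see which setting suits the data.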
When you create a compressed data file, SAS writes a note to the log indicating the
percentage of reduction that is obtained by compressing the file. SAS obtains the
compression percentage by comparing the size of the compressed file with the size of an
uncompressed file of the same page size and record count.
After a file is compressed, the setting is a permanent attribute of the file. This means that
you must re-create the file to change the setting. For example, to uncompress a file,
specify COMPRESS=NO in a DATA step that copies the compressed data file.
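A minimal sketch of re-creating a file without compression follows. The data set names are hypothetical, and the compressed file is created first from SASHELP.CLASS (assumed to be available) only so that the example is self-contained.

```sas
/* Create a compressed data file to work with. */
data work.class_char(compress=char);
   set sashelp.class;
run;

/* Copy it into a new, uncompressed data file. */
data work.class_plain(compress=no);
   set work.class_char;
run;

/* Inspect the file attributes to confirm the compression setting. */
proc contents data=work.class_plain;
run;
```

Writing to a new name, as here, keeps the compressed original until the copy has been checked.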
For more information about the COMPRESS= data set option, see SAS Data Set
Options: Reference. For more information about the COMPRESS= option in the
LIBNAME statement, see SAS Statements: Reference. For more information about the
COMPRESS= system option, see SAS System Options: Reference.
Disabling a Compression Request
Compressing a file adds a fixed-length block of data to each observation. Because of the
additional block of data (12 bytes for a 32-bit host and 24 bytes for a 64-bit host per
