Chapter 3: Data Quality with PROC SQL

Data quality comes before the quality of the analysis. This chapter introduces SQL techniques for ensuring data quality in three problem areas: outliers, uniformity, and duplicates. Chapter 2 was dedicated to a fourth problem area: missing values. Filters built into the SQL that accesses the data can keep incorrect values from entering the system or the analysis in the first place.

Section 3.1 introduces working with integrity constraints and audit trails. In simple terms, integrity constraints are check rules for ensuring the data quality of a SAS table.
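
As a rough illustration of what such check rules can look like, the following minimal sketch defines a table with a primary key and two check constraints in PROC SQL. The table name work.patients, its columns, the constraint names, and the permitted value ranges are hypothetical and chosen only for this example; they are not taken from the book.

```sas
proc sql;
   /* Hypothetical table: all names and limits are illustrative assumptions */
   create table work.patients
      (patient_id num,
       age        num,
       sex        char(1),
       /* each patient_id may occur only once */
       constraint pk_patient primary key (patient_id),
       /* reject implausible ages */
       constraint chk_age    check (age between 0 and 120),
       /* allow only the two codes assumed for this sketch */
       constraint chk_sex    check (sex in ('F', 'M')));
quit;
```

Once constraints like these are in place, SAS rejects rows that violate them when they are inserted or updated, so the rule is enforced by the table itself instead of by each program that writes to it.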

Section 3.2 deals with the handling, finding, and filtering of multiple values (duplicates). Multiple occurrences ...
