8 Some PROC SQL Solutions to Data Cleaning

Introduction

A Quick Review of PROC SQL

Checking for Invalid Character Values

Checking for Outliers

Checking a Range Using an Algorithm Based on the Standard Deviation

Checking for Missing Values

Range Checking for Dates

Checking for Duplicates

Identifying Subjects with "n" Observations Each

Checking for an ID in Each of Two Files

More Complicated Multi-File Rules

Introduction

It was a hard decision whether to group all the PROC SQL approaches together in one chapter or to include an SQL solution in each of the other chapters. I opted for the former. PROC SQL (Structured Query Language) is an alternative to the traditional DATA step and PROC approaches used in this book up to this point. Although PROC ...

Get Cody's Data Cleaning Techniques Using SAS, Second Edition now with the O’Reilly learning platform.
