Identifying duplicates using Proc SQL

The simplest way to remove duplicates in Proc SQL is to use the Distinct keyword in the Select clause. We will use it on the Dealership_Looped dataset, from which the i column, used as a looping counter, has been dropped:

    Proc Sql;
      Create Table Distinct_Dealership_Looped As
        Select Distinct *
        From Dealership_Looped
      ;
    Quit;

Using the Distinct keyword, we have removed the duplicate rows we created with the DO loops, leaving the original 36 records. This can be confirmed by looking at the following log:

    NOTE: Table WORK.DISTINCT_DEALERSHIP_LOOPED created, with 36 rows and 6 columns.
    NOTE: PROCEDURE SQL used (Total process time):
          real time           1:56.01
          cpu time            1:05.78
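Besides reading the log, the row counts can be compared directly with a query. The following is a minimal sketch that assumes both tables are still in the WORK library; the column aliases Rows_Before and Rows_After are illustrative names, not part of the original example:

```sas
Proc Sql;
  /* Row count of the looped table, duplicates included */
  Select Count(*) As Rows_Before
  From Dealership_Looped;

  /* Row count after Select Distinct; the log above suggests 36 */
  Select Count(*) As Rows_After
  From Distinct_Dealership_Looped;
Quit;
```

The difference between the two counts is the number of duplicate rows that Distinct removed.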

Let's find ...
