Identifying duplicates using Proc SQL

The simplest way to remove duplicates in Proc SQL is to use the Distinct keyword in the Select clause. We will use it on the Dealership_Looped dataset, from which the i column, used as a looping counter, has been dropped:

Proc Sql;
  Create Table Distinct_Dealership_Looped As
    Select Distinct *
    From Dealership_Looped
  ;
Quit;

Using the Distinct keyword, we have correctly identified and removed the duplicates we created with the DO loops. We are now left with the original 36 records. This can be confirmed by looking at the following LOG:

NOTE: Table WORK.DISTINCT_DEALERSHIP_LOOPED created, with 36 rows and 6 columns.
NOTE: PROCEDURE SQL used (Total process time):
      real time           1:56.01
      cpu time            1:05.78
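
To see the same behavior on a small scale, the following is a minimal sketch that does not use the book's data. The Demo_Looped and Demo_Distinct tables and their Make, Model, and Units columns are hypothetical names invented for illustration. It shows that Select Distinct * compares every column, so only rows that match in all columns are collapsed:

```sas
/* Hypothetical demo data, not the Dealership_Looped dataset */
Data Demo_Looped;
  Input Make $ Model $ Units;
  Datalines;
Ford Focus 10
Ford Focus 10
Audi A4 5
Audi A4 7
;
Run;

Proc Sql;
  /* Keep one copy of each fully identical row */
  Create Table Demo_Distinct As
    Select Distinct *
    From Demo_Looped
  ;

  /* Compare row counts before and after de-duplication */
  Select Count(*) As Rows_Before From Demo_Looped;
  Select Count(*) As Rows_After From Demo_Distinct;
Quit;
```

Here Rows_Before is 4 and Rows_After is 3: the two identical Ford rows collapse into one, while the two Audi rows are both kept because their Units values differ.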

Let's find ...
