Get full access to Hands-On SAS for Data Analysis and 60K+ other titles, with a free 10-day trial of O'Reilly.


Identifying duplicates using Proc SQL

The simplest way to remove duplicates in Proc SQL is to add the Distinct keyword to the Select clause. We will use it on the Dealership_Looped dataset, from which the i column, used as a looping counter, has been dropped:

Proc Sql;
  Create Table Distinct_Dealership_Looped As
    Select Distinct *
    From Dealership_Looped
  ;
Quit;
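Select Distinct * compares every column, so it collapses only rows that match on all of them, and it does not show which rows were repeated. To list the repeated rows instead of dropping the copies, one option is Group By with Having. The following is a minimal sketch on a hypothetical toy table (Toy_Looped, with the invented columns Make, Model, and Price); it is not the book's Dealership_Looped data:

```sas
/* Hypothetical toy data: three distinct rows, each written twice by a DO loop */
Data Toy_Looped;
  Input Make $ Model $ Price;
  Do i = 1 To 2;
    Output;        /* write the current row once per iteration */
  End;
  Drop i;          /* drop the looping counter, as in the text */
  Datalines;
Ford Focus 18000
Audi A4 32000
Kia Rio 15000
;
Run;

Proc Sql;
  /* List each combination of values that occurs more than once */
  Select Make, Model, Price, Count(*) As Copies
    From Toy_Looped
    Group By Make, Model, Price
    Having Count(*) > 1
  ;
Quit;
```

Under these assumptions, each of the three toy rows should be reported with Copies equal to 2. On a real table, the Group By list would need to name every column that defines a duplicate.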

With the Distinct keyword, the Proc SQL step has removed the duplicates that the DO loops created in Dealership_Looped, leaving us with the original 36 records. This can be confirmed by looking at the following LOG:

NOTE: Table WORK.DISTINCT_DEALERSHIP_LOOPED created, with 36 rows and 6 columns.
NOTE: PROCEDURE SQL used (Total process time):
      real time           1:56.01
      cpu time            1:05.78
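As an alternative to reading the LOG, the row counts can be compared directly. This is a small sketch that assumes both tables from the step above exist in the WORK library; the column aliases Rows_Before and Rows_After are illustrative names only:

```sas
Proc Sql;
  /* Row count of the looped table, duplicates included */
  Select Count(*) As Rows_Before
    From Dealership_Looped
  ;
  /* Row count after Select Distinct; the LOG above reports 36 */
  Select Count(*) As Rows_After
    From Distinct_Dealership_Looped
  ;
Quit;
```

The difference between the two counts is the number of duplicate rows that Distinct removed.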

Let's find ...
