Hands-On SAS for Data Analysis

by Harish Gulati
September 2019
Content level: Beginner to intermediate
346 pages
7h 35m
English
Packt Publishing

Identifying duplicates using Proc SQL

The simplest way to remove duplicates in Proc SQL is to use the Distinct keyword in the Select clause. We will use it on the Dealership_Looped dataset, from which the i column, used as a looping counter, has been dropped:

Proc Sql;
  Create Table Distinct_Dealership_Looped As
    Select Distinct *
    From Dealership_Looped
  ;
Quit;

Using the Distinct keyword, we have removed the duplicates we created as part of the DO loop. We are now left with the original 36 records. This can be confirmed by looking at the following LOG:

NOTE: Table WORK.DISTINCT_DEALERSHIP_LOOPED created, with 36 rows and 6 columns.
NOTE: PROCEDURE SQL used (Total process time):
      real time           1:56.01
      cpu time            1:05.78
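Select Distinct drops the repeated rows but does not show which rows were repeated. One way to list them is to group on every column and keep only the groups that occur more than once. The following is a minimal sketch under stated assumptions: the Demo_Dups dataset, its columns (Dealer, Region, Units), and its values are hypothetical and are not taken from the Dealership_Looped dataset.

```sas
/* Hypothetical sample data: names and values are illustrative only */
Data Demo_Dups;
  Input Dealer $ Region $ Units;
  Datalines;
A North 10
A North 10
B South 20
C East 30
C East 30
C East 30
;
Run;

Proc Sql;
  /* One output row per duplicated combination, with its number of copies */
  Create Table Demo_Dup_Rows As
    Select Dealer, Region, Units, Count(*) As Copies
    From Demo_Dups
    Group By Dealer, Region, Units
    Having Count(*) > 1
  ;
Quit;
```

With this sample data, Demo_Dup_Rows should contain two rows: the A/North/10 combination with 2 copies and the C/East/30 combination with 3 copies. Applying the same pattern to a real table means listing all of its columns in both the Select and Group By clauses.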

Let's find ...

Publisher Resources

ISBN: 9781788839822