Microsoft® SQL Server 2008 R2 Unleashed

by Ray Rankins, Paul Bertucci, Chris Gallelli, Alex T. Silverstein
September 2010
Content level: Intermediate to advanced
1704 pages
111h 8m
English
Sams

De-Duping Data with Ranking Functions

One common problem with imported data is unexpected duplicate rows, especially when the data is consolidated from multiple sources. In versions of SQL Server before 2005, de-duping data often required cursors and temp tables. Since SQL Server 2005 introduced the ROW_NUMBER ranking function and common table expressions, you can de-dupe data with a single statement.
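As a rough illustration of the single-statement pattern, the following sketch numbers the rows within each group of identical values and deletes every row after the first. The temp table `#authors_demo`, its three columns, and the sample values are assumptions made up for this example; they are not the book's authors_import listing. The sketch also assumes that rows count as duplicates only when all three columns match.

```sql
-- Hypothetical demo table; the name, columns, and data are illustrative assumptions.
CREATE TABLE #authors_demo
(
    au_id    varchar(11) NOT NULL,
    au_lname varchar(40) NOT NULL,
    au_fname varchar(20) NOT NULL
);

-- Three identical rows for one au_id and a single row for another.
INSERT INTO #authors_demo (au_id, au_lname, au_fname)
VALUES ('100-00-0001', 'Doe', 'Jane'),
       ('100-00-0001', 'Doe', 'Jane'),
       ('100-00-0001', 'Doe', 'Jane'),
       ('100-00-0002', 'Roe', 'Sam');

-- Number the rows within each group of identical values,
-- then delete every row numbered higher than 1 through the CTE.
WITH numbered AS
(
    SELECT au_id,
           ROW_NUMBER() OVER (PARTITION BY au_id, au_lname, au_fname
                              ORDER BY au_id) AS rn
    FROM #authors_demo
)
DELETE FROM numbered
WHERE rn > 1;

-- One row per distinct (au_id, au_lname, au_fname) combination remains.
SELECT au_id, au_lname, au_fname
FROM #authors_demo
ORDER BY au_id;

DROP TABLE #authors_demo;
```

The PARTITION BY list defines what counts as a duplicate, so it would change if only some columns had to match. Because the ORDER BY inside OVER decides which row in each group receives row number 1, it is the place to express a preference for which copy to keep.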

To demonstrate this approach, Listing 43.27 shows how to create an authors_import table and populate it with some duplicate rows.

Listing 43.27 Script to Create and Populate the authors_import Table

You can see in the data for Listing 43.27 that there are two duplicates for au_id 499-84-5672 ...

ISBN: 9780768696585