Eliminating Duplicates from a Table

Problem

You want to remove duplicate records from a table so that it contains only unique rows.

Solution

Select the unique rows from the table into a second table that you use to replace the original one. Or add a unique index to the table using ALTER TABLE, which will remove duplicates as it builds the index. Or use DELETE ... LIMIT n to remove all but one instance of a specific set of duplicate rows.

Discussion

If you forget to create a table with a unique index to prevent the occurrence of duplicates within the table, you may discover later that it’s necessary to apply some sort of duplicate-removal technique. The cat_mailing table used in earlier sections is an example of this, because it contains several instances where the same person is listed multiple times.

mysql> SELECT * FROM cat_mailing ORDER BY last_name, first_name;
+-----------+-------------+--------------------------+
| last_name | first_name  | street                   |
+-----------+-------------+--------------------------+
| Baxter    | Wallace     | 57 3rd Ave.              |
| BAXTER    | WALLACE     | 57 3rd Ave.              |
| Baxter    | Wallace     | 57 3rd Ave., Apt 102     |
| Brown     | Bartholomew | 432 River Run            |
| Isaacson  | Jim         | 515 Fordam St., Apt. 917 |
| McTavish  | Taylor      | 432 River Run            |
| Pinter    | Marlene     | 9 Sunset Trail           |
| Pinter    | Marlene     | 9 Sunset Trail           |
+-----------+-------------+--------------------------+

The table contains redundant entries and it would be a good idea to remove them, to eliminate duplicate mailings and ...

Get MySQL Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.