May 2018
Intermediate to advanced
576 pages
30h 25m
English
The first query works by grouping together the rows on the unique column and counting rows. Anything with more than one row must be caused by duplicate values. If we're looking for duplicates of more than one column (or even all columns), then we have to use an SQL of the following form:
SELECT *FROM mytableWHERE (col1, col2, ... ,colN) IN(SELECT col1, col2, ... ,colN FROM mytable GROUP BY col1, col2, ... ,colN HAVING count(*) > 1);
Here, col1, col2, and so on until colN are the columns of the key.
Note that this type of query may need to sort the complete table on all the key columns. That will require sort space equal to the size of the table, so you'd better think first before running that SQL on very large tables. You'll ...