Tuning “Anti-Joins” Using Subqueries

With an anti-join, we retrieve all rows from one table for which there is no matching row in another table. There are a number of ways of expressing anti-joins in MySQL.

Perhaps the most natural way of writing an anti-join is to express it as a NOT IN subquery. For instance, Example 21-7 returns all of the customers who are not employees.

Example 21-7. Example of an anti-join using NOT IN
SELECT count(*)
  FROM customers
 WHERE (contact_surname,contact_firstname, date_of_birth)
    NOT IN (SELECT surname,firstname, date_of_birth
              FROM employees)

Short Explain
-------------
1    PRIMARY select(ALL) on customers using no key
          Using where
2    DEPENDENT SUBQUERY select(ALL) on employees using no key
          Using where

Another way to express this query is to use a NOT EXISTS subquery. Just as MySQL will rewrite IN subqueries to use the EXISTS clause, so too will MySQL rewrite a NOT IN subquery as a NOT EXISTS. So, from MySQL’s perspective, Example 21-7 and Example 21-8 are equivalent.

Example 21-8. Example of an anti-join using NOT EXISTS
SELECT count(*)
  FROM customers
 WHERE NOT EXISTS (SELECT *
                     FROM employees
                    WHERE surname=customers.contact_surname
                      AND firstname=customers.contact_firstname
                      AND date_of_birth=customers.date_of_birth)

Short Explain
-------------
1    PRIMARY select(ALL) on customers using no key
          Using where
2    DEPENDENT SUBQUERY select(ALL) on employees using no key
          Using where

A third but somewhat less natural way to express this query is to use a LEFT JOIN

Get MySQL Stored Procedure Programming now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.