Tuning Joins

So far we have looked at tuning SQL queries against a single table only. Let’s move on to tuning SQL queries that join rows from two or more tables.

How MySQL Joins Tables

MySQL currently joins tables using a fairly simple technique with a complicated-sounding name. The MySQL manual refers to the join algorithm as single-sweep multi-join. In essence, when MySQL joins two tables, it will read the rows from the first table and—for each row—search the second table for matching rows. Further details can be found in the MySQL Internals Manual; see http://dev.mysql.com/doc/internals/en/index-merge-overview.html.
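The read-then-search behavior described above can be sketched as a nested loop. This is only an illustration in Python, not MySQL's actual implementation. The function `nested_loop_join` and the sample `customers` and `orders` rows are hypothetical names invented for the example.

```python
def nested_loop_join(outer_rows, inner_rows, outer_key, inner_key):
    """Join two lists of dict rows by scanning inner_rows once per outer row."""
    joined = []
    for outer in outer_rows:            # read each row from the first table
        for inner in inner_rows:        # search the second table for matches
            if outer[outer_key] == inner[inner_key]:
                joined.append({**outer, **inner})
    return joined


# Hypothetical sample data, made up for this sketch.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 2},
    {"order_id": 12, "cust_id": 1},
]

result = nested_loop_join(customers, orders, "customer_id", "cust_id")
for row in result:
    print(row)
```

Note that the inner loop visits every row of the second table for each row of the first. That full search is what an index on the join column is meant to avoid.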

Joins Without Indexes

The basic join algorithm is poorly suited to joining tables unless there are indexes to support the join. Performance might be adequate for very small tables, but as table sizes increase, the join overhead grows far faster than the tables themselves: with no index to consult, MySQL must search the entire second table for every row in the first, so the work is roughly proportional to the product of the two table sizes.
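To make the growth concrete, the following rough sketch counts the work done by a join with and without a lookup structure on the second table. It is an assumption-laden toy, not a measurement of MySQL: a Python dictionary stands in for an index (MySQL indexes are typically B-trees), and the helper names `join_without_index` and `join_with_index` are hypothetical.

```python
def join_without_index(outer_keys, inner_keys):
    """Compare every outer key with every inner key; return (matches, comparisons)."""
    comparisons = 0
    matches = 0
    for outer in outer_keys:
        for inner in inner_keys:        # full scan of the second table
            comparisons += 1
            if outer == inner:
                matches += 1
    return matches, comparisons


def join_with_index(outer_keys, inner_keys):
    """Use a dict as a stand-in index; return (matches, lookups)."""
    index = {}
    for position, key in enumerate(inner_keys):
        index.setdefault(key, []).append(position)
    lookups = 0
    matches = 0
    for outer in outer_keys:
        lookups += 1                    # one index probe per outer row
        matches += len(index.get(outer, []))
    return matches, lookups


results = []
for n in (10, 100, 1000):
    keys = list(range(n))               # two tables of n rows with unique keys
    _, comparisons = join_without_index(keys, keys)
    _, lookups = join_with_index(keys, keys)
    results.append((n, comparisons, lookups))
    print(n, comparisons, lookups)
```

In this toy model, each tenfold increase in table size multiplies the unindexed comparison count by one hundred, while the number of index probes grows only tenfold.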

Figure 20-8 shows how response time increases for nonindexed joins as the size of each table increases. This steep, faster-than-linear degradation is extremely undesirable: if we extrapolate the response time curve for larger tables, we predict that it would take 20 minutes to join two tables of 100,000 rows each, 20 hours to join two tables of 1 million rows each, and 81 days to join two tables of 10 million rows each! This is definitely not the way you want your applications to perform as your database grows in size.

Figure 20-8. Table size versus ...
