This is another frequent problem reported by users of an application who don’t see the updates they made, see them in the wrong order, or see something they don’t expect.
There are two main reasons for getting wrong results: something is
wrong with your SELECT
query, or the
data in database differs from what you expect. I’ll start with the first
case.
When I went over examples for this section, I had to either show some real-life examples or write my own toy cases. The real-life examples can be overwhelmingly large, but the toy cases wouldn’t be much help to you, because nobody writes such code. So I’ve chosen to use some typical real-life examples, but simplified them dramatically.
The first example involves a common user mistake when using
huge joins. We will use Example 1-1,
described in the previous section. This table contains my collection of
MySQL features that cause common usage mistakes I deal with in MySQL
Support. Each mistake has a row in the items
table. I have another table of links
to resources for information. Because
there’s a many-to-many relationship between items and links, I also
maintain an items_links
table to tie
them together. Here are the definitions of the items
and items_links
table (we don’t need links
in this example):
mysql>DESC items;
+-------------------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------------+--------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | short_description | varchar(255) | YES | | NULL | | | description | text | YES | | NULL | | | example | text | YES | | NULL | | | explanation | text | YES | | NULL | | | additional | text | YES | | NULL | | +-------------------+--------------+------+-----+---------+----------------+ 6 rows in set (0.30 sec) mysql>DESC items_links;
+--------+---------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+---------+------+-----+---------+-------+ | iid | int(11) | YES | MUL | NULL | | | linkid | int(11) | YES | MUL | NULL | | +--------+---------+------+-----+---------+-------+ 2 rows in set (0.11 sec)
The first query I wrote worked fine and returned a reasonable result:
mysql> SELECT count(*) FROM items WHERE id IN (SELECT id FROM items_links);
+----------+
| count(*) |
+----------+
| 10 |
+----------+
1 row in set (0.12 sec)
...until I compared the number returned with the total number of links:
mysql> SELECT count(*) FROM items_links;
+----------+
| count(*) |
+----------+
| 6 |
+----------+
1 row in set (0.09 sec)
How could it be possible to have more links than associations?
Let’s examine the query, which I made specially for this book, once more. It is simple and contains only two parts, a subquery:
SELECT id FROM items_links
and an outer query:
SELECT count(*) FROM items WHERE id IN ...
The subquery can be a good place to start troubleshooting because one should be able to execute it independently. Therefore, we can expect a compete result set:
mysql> SELECT id FROM items_links;
ERROR 1054 (42S22): Unknown column 'id' in 'field list'
Surprise! We have a typo, and actually there is no field named
id
in the items_links
table; it says iid
(for “items ID”) instead. If we rewrite our
query so that it uses the correct identifiers, it will work
properly:
mysql> SELECT count(*) FROM items WHERE id IN (SELECT iid FROM items_links);
+----------+
| count(*) |
+----------+
| 4 |
+----------+
1 row in set (0.08 sec)
We just learned a new debugging technique. If a
SELECT
query does not work as expected, split it into smaller chunks, and then analyze each part until you find the cause of the incorrect behavior.
Note
If you specify the full column name by using the format
table_name.column_name
, you can
prevent the problems described here in the first place because you will
get an error immediately:
mysql> SELECT count(*) FROM items WHERE items.id IN (SELECT items_links.id FROM items_links); ERROR 1054 (42S22): Unknown column 'items_links.id' in 'field list'
A good tool for testing is the simple MySQL command-line client that comes with a MySQL installation. We will discuss the importance of this tool in Chapter 6.
But why didn’t MySQL return the same error for the first query?
We have a field named id
in the
items
table, so MySQL thought we wanted
to run a dependent subquery that actually selects items.id
from items_links
. A “dependent subquery” is one that
refers to fields from the outer query.
We can also use EXPLAIN
EXTENDED
followed by SHOW
WARNINGS
to find the mistake. If we run these commands on the
original query, we get:
mysql>EXPLAIN EXTENDED SELECT count(*) FROM items WHERE id IN (SELECT id FROM items_links)\G
2 rows in set, 2 warnings (0.12 sec) *************************** 1. row *************************** id: 1 select_type: PRIMARY table: items type: index possible_keys: NULL key: PRIMARY key_len: 4 ref: NULL rows: 10 filtered: 100.00 Extra: Using where; Using index *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: items_links type: index possible_keys: NULL key: iid_2 key_len: 5 ref: NULL rows: 6 filtered: 100.00 Extra: Using where; Using index 2 rows in set, 2 warnings (0.54 sec) mysql>show warnings\G
*************************** 1. row *************************** Level: Note Code: 1276 Message: Field or reference 'collaborate2011.items.id' of SELECT #2 was resolved in SELECT #1 *************************** 2. row *************************** Level: Note Code: 1003 Message: select count(0) AS `count(*)` from `collaborate2011`.`items` where <in_optimizer7gt;(`collaborate2011`.`items`.`id`,<exists>(select 1 from `collaborate2011`.`items_links` where (<cache>(`collaborate2011`.`items`.`id`) = `collaborate2011`.`items`.`id`))) 2 rows in set (0.00 sec)
Row 2 of the EXPLAIN EXTENDED
output shows that the subquery is actually dependent: select_type
is DEPENDENT SUBQUERY
.
Before moving on from this example, I want to show one more technique that will help you avoid getting lost when your query contains lots of table references. It is easy to get lost if you join 10 or more tables in a single query, even when you know how they should be joined.
The interesting part of the previous example was the output of
SHOW WARNINGS
. The MySQL server does not always execute a query as it was
typed, but invokes the optimizer to create a better execution plan so that
the user usually gets the results faster. Following EXPLAIN EXTENDED
, the SHOW WARNINGS
command shows the optimized
query.
In our example, the SHOW WARNINGS
output contains two notes. The first is:
Field or reference 'collaborate2011.items.id' of SELECT #2 was resolved in SELECT #1
This note clearly shows that the server resolved the value of
id
in the subquery from the items
table rather than from items_links
.
The second note contains the optimized query:
select count(0) AS `count(*)` from `collaborate2011`.`items` where <in_optimizer> (`collaborate2011`.`items`.`id`,<exists> (select 1 from `collaborate2011`.`items_links` where (<cache>(`collaborate2011`.`items`.`id`) = `collaborate2011`.`items`.`id`)))
This output also shows that the server takes the value of id
from the items
table.
Now let’s compare the previous listing with the result of EXPLAIN EXTENDED
on the correct query:
mysql>EXPLAIN EXTENDED SELECt count(*) FROM items WHERE id IN (SELECT iid FROM items_links)\G
*************************** 1. row *************************** id: 1 select_type: PRIMARY table: items type: index possible_keys: NULL key: PRIMARY key_len: 4 ref: NULL rows: 10 filtered: 100.00 Extra: Using where; Using index *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: items_links type: index_subquery possible_keys: iid,iid_2 key: iid key_len: 5 ref: func rows: 1 filtered: 100.00 Extra: Using index; Using where 2 rows in set, 1 warning (0.03 sec) mysql>show warnings\G
*************************** 1. row *************************** Level: Note Code: 1003 Message: select count(0) AS `count(*)` from `collaborate2011`.`items` where <in_optimizer>(`collaborate2011`.`items`.`id`,<exists> (<index_lookup>(<cache>(`collaborate2011`.`items`.`id`) in items_links on iid where (<cache>(`collaborate2011`.`items`.`id`) = `collaborate2011`.`items_links`.`iid`)))) 1 row in set (0.00 sec)
The optimized query this time looks completely different, and really
compares items.id
with items_links.iid
as we intended.
The value of select_type
in the
correct query is still DEPENDENT
SUBQUERY
. How can that be if we resolve the field name from the
items_links
table? The explanation
starts with the part of the SHOW
WARNINGS
output that reads as follows:
where (<cache>(`collaborate2011`.`items`.`id`) = `collaborate2011`.`items_links`.`iid`)
The subquery is still dependent because the id in
clause of the outer query requires the
subquery to check its rows against the value of iid
in the inner query. This issue came up in
the discussion of report #12106 in
the MySQL Community Bugs
Database.
I added a link to the bug report because it provides another important lesson: if you doubt the behavior of your query, use good sources to find information. The community bug database is one such source.
There can be many different reasons why a SELECT
query behaves incorrectly, but the
general method of investigation is always the same:
Split the query into small chunks, and then execute them one by one until you see the cause of the problem.
Use
EXPLAIN EXTENDED
followed bySHOW WARNINGS
to get the query execution plan and information on how it was actually executed.If you don’t understand the MySQL server behavior, use the Internet and good sources for information. The Appendix A includes a list of useful resources.
Get MySQL Troubleshooting now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.