By Jeremy Zawodny, Derek J. Balling
Cover | Table of Contents | Colophon
EXPLAIN) statements: this book tells you what the
factors in good performance are, where bottlenecks occur, how to
benchmark MySQL, and other advanced performance topics.configure flag, and recompile.SHOW commands provide a window into
what's going on inside MySQL. They provide access to
MySQL's configuration variables, ongoing statistics,
and counters, as well as a description of what each client is doing.SHOW
VARIABLES
command does just that. Executing it produces quite a bit of output,
which looks something like this:mysql> SHOW VARIABLES;
+---------------------------------+------------------------------------------+
| Variable_name | Value |
+---------------------------------+------------------------------------------+
| back_log | 20 |
| basedir | mysql |
| binlog_cache_size | 32768 |
| character_set | latin1 |
| concurrent_insert | ON |
| connect_timeout | 5 |
| datadir | /home/mysql/data/ |
BEGIN
statement and applied with
COMMIT or rolled back
(undone) with ROLLBACK
. So the SQL for the transaction might
look like this: BEGIN;
[step 1] SELECT balance FROM checking WHERE customer_id = 10233276;
[step 2] UPDATE checking SET balance = balance - 200.00 WHERE customer_id = 10233276;
[step 3] UPDATE savings SET balance = balance + 200.00 WHERE customer_id = 10233276;
COMMIT;
SELECT or primarily
INSERT/UPDATE queries, MyISAM is a good choice.
Many web applications fall into this category.|
Attribute
|
MyISAM
|
Heap
|
BDB
|
InnoDB
|
|---|---|---|---|---|
|
Transactions
|
No
|
No
|
Yes
|
Yes
|
|
Lock granularity
|
Table
|
Table
|
Page (8 KB)
|
Row
|
|
Storage
|
Split files
|
In-memory
|
Single file per table
|
Tablespace(s)
|
|
Isolation levels
|
None
|
None
|
UNION or run two separate queries?
Stress testing, on the other hand, is about finding limits:
what's the maximum number of requests I can handle
with this configuration?UPDATE speed, run one of the
UPDATE-intensive tests a few times on each server.phone_book is a table containing an aggregate
phone book for the state of California, with roughly 35 million
entries. And keep in mind that records within tables
aren't inherently sorted. Consider a query like this
one:SELECT * FROM phone_book WHERE last_name = 'Zawodny'
phone_book table and compare the
last_name field with the string
"Zawodny" to see whether they
match. Clearly that's not efficient. As the number
of records increases, so does the effort necessary to find a given
record. In computer science, we call that an O(n) problem.phone_book is a table containing an aggregate
phone book for the state of California, with roughly 35 million
entries. And keep in mind that records within tables
aren't inherently sorted. Consider a query like this
one:SELECT * FROM phone_book WHERE last_name = 'Zawodny'
phone_book table and compare the
last_name field with the string
"Zawodny" to see whether they
match. Clearly that's not efficient. As the number
of records increases, so does the effort necessary to find a given
record. In computer science, we call that an O(n) problem.SELECT * FROM phone_book WHERE last_name BETWEEN 'Marten' and 'Mason'
PACK_KEYS = 1
INSERT,
UPDATE, and DELETE activity.delay_key_writeSHOW
CREATE
TABLE, you can retrieve the complete SQL necessary
to (re-)create the table. However, if you care only about indexes,
SHOW
INDEXES
FROM provides a lot more information.mysql> SHOW INDEXES FROM access_jeremy_zawodny_com \G
*************************** 1. row ***************************
Table: access_jeremy_zawodny_com
Non_unique: 1
Key_name: time_stamp
Seq_in_index: 1
Column_name: time_stamp
Collation: A
Cardinality: 9434851
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
1 rows in set (0.00 sec)
KEYS for
INDEXES in the query.time_stamp. It is a B-tree index with only one
component, the time_stamp column (as opposed to a
multicolumn index). The index isn't packed and is
allowed to contain NULL values. It's a non-unique
index, so duplicates are allowed.query_cache_type
to an appropriate
value in my.cnf:query_cache_type = 1
SELECT
query in the query cache before bothering to analyze or execute it.
It does this by hashing the query and using the hashed value to check
for the results in the cache. MySQL uses the exact query text it
receives, so the cache is sensitive to the most trivial variations.query_cache_type
to an appropriate
value in my.cnf:query_cache_type = 1
SELECT
query in the query cache before bothering to analyze or execute it.
It does this by hashing the query and using the hashed value to check
for the results in the cache. MySQL uses the exact query text it
receives, so the cache is sensitive to the most trivial variations.SELECT * FROM table1
select * FROM table1
SELECT queries, since
they're the only ones it makes any sense to cache.
Unfortunately, older 4.0 versions of
MySQL
don't consider every SELECT
query. The logic it uses simply checks the first three characters of
your query, looking for SEL in a case-insensitive
way./* <b>GetLatestStuff</b> */ SELECT * FROM sometable WHERE ...
CREATE TABLE weather ( city VARCHAR(100) NOT NULL, high_temp TINYINT NOT NULL, low_temp TINYINT NOT NULL, the_date DATE NOT NULL, INDEX (city), INDEX (the_date), )
the_date index. For example,
to find the average high temperature in 1980, you might write
something like this:SELECT AVG(high_temp) FROM weather WHERE the_date BETWEEN '1980-01-01' AND '1980-12-31';
weather table. In
such a case, MySQL decides that it is faster to simply scan the
entire table.# Time: 030303 0:51:27 # User@Host: user[user] @ client.example.com [192.168.50.12] # Query_time: 25 Lock_time: 0 Rows_sent: 3949 Rows_examined: 378036 select ArticleHtmlFiles.SourceTag, ArticleHtmlFiles.AuxId from ArticleHtmlFiles left join Headlines on ArticleHtmlFiles.SourceTag = Headlines.SourceTag and ArticleHtmlFiles.AuxId = Headlines.AuxId where Headlines.AuxId is NULL;
Lock_time indicates how long the query waited for
locks to be released.SELECT keyword:SELECT SQL_CACHE * FROM mytable ...
SELECT /*! SQL_CACHE */ * FROM mytable ...
EXPLAIN output for the query, you
realize that it's accessing the tables in a less
than optimal order.STRAIGHT_JOIN
hint in place of a comma or JOIN keyword in your
query:SELECT * FROM table1 STRAIGHT_JOIN table2 WHERE ...
USE INDEX after
the table name in the query:SELECT * FROM mytable USE INDEX (mod_time, name) ...
SymbolHistory and Symbols.Symbols table contains two important fields:
Id and Symbol. The
Id is an auto_increment primary
key. Here's the PriceHistory
table:mysql> DESCRIBE PriceHistory;
+----------+---------+------+-----+------------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------+------+-----+------------+-------+
| SymbolID | int(11) | | PRI | 0 | |
| Date | date | | PRI | 0000-00-00 | |
| Open | float | | | 0 | |
| High | float | | | 0 | |
| Low | float | | | 0 | |
| Close | float | | | 0 | |
| Volume | float | | | 0 | |
+----------+---------+------+-----+------------+-------+
8 rows in set (0.01 sec)
(SymbolID,
Date).Symbols table maps stock tickers to numeric
identifiers. It also contains various other bits of metadata about
each security. The PriceHistory table contains the
historical price data. One of the most common queries run against the
data is, "Show me all closing prices for a given
stock sorted from newest to oldest."mysql> EXPLAIN SELECT date_format(Date,'%Y%m%d') as Day, Close -> FROM Symbols, PriceHistory -> WHERE Symbols.ID=PriceHistory.SymbolID AND Symbols.Symbol = 'ibm' -> ORDER BY Date DESC \G