Chapter 1. Back To Basics
Many MySQL users and administrators slide into using MySQL. They hear its benefits, find that it’s easy to install on their systems (or better yet, comes pre-installed), and read a quick book on how to attach simple SQL operations to web sites or other applications.
It may take several months for the dragons to raise their heads. Perhaps one particular web page seems to take forever, or a system failure corrupts a database and makes recovery difficult.
Real-life use of MySQL requires forethought and care—and a little benchmarking and testing. This book is for the MySQL administrator who has the basics down but realizes the need to go further. It’s a good book to read after you’ve installed and learned how to use MySQL but before your site starts to get a lot of traffic, and the dragons are breathing down your neck. (When problems occur during a critical service, your fellow workers and friendly manager start to take on decidedly dragon-like appearances.)
The techniques we teach are valuable in many different situations, and sometimes to solve different problems. Replication, for instance, may be a matter of reliability for you—an essential guarantee that your site will still be up if one or two systems fail. But replication can also improve performance; we show you architectures and techniques that solve multiple problems.
We also take
optimization
far beyond the simple use of indexes and diagnostic
(EXPLAIN
) statements: this book tells you what the
factors in good performance are, where bottlenecks occur, how to
benchmark MySQL, and other advanced performance topics.
We ask for a little more patience and time commitment than the average introductory computer book. Our approach involves a learning cycle, and experience convinces us that it’s ultimately the fastest and most efficient way to get where you want.
After describing the problems we’re trying to solve in a given chapter, we start with some background explanation. In other words, we give you a mental model for understanding what MySQL is doing. Then we describe the options you have to solve the problem, and only after all that do we describe particular tools and techniques.
This book is clearly not the end of the line in terms of information. Knowing that, we’ve started a web site, http://www.highperformancemysql.com, where we put useful scripts and new topics. See the Preface for more information.
Before we dig into how to tune your MySQL system to optimum performance, it’s best if we go over a couple of ground rules and make sure everyone is on the same page.
Binary Versus Compiled-From-Source Installations
There are two ways you can install MySQL. As a novice administrator, you may have simply installed a binary package that had precompiled executables, libraries, and configuration files, and placed those files wherever the maker of the binary package decided they should go.
Tip
It’s exceedingly rare for a Windows user to compile his own copy of MySQL. If you’re running MySQL on Windows, feel free to download your copy from the MySQL web site and skip this discussion.
Alternatively, for any number of reasons, you might have decided to compile the MySQL binaries on your own, by downloading a source tarball and configuring the installation to best meet your needs. However, don’t do so lightly. Compiling from source has led to countless hours of pain for some users, mostly due to subtle bugs in their compilers or thread libraries. For this very reason, the standard binaries provided by MySQL AB are statically linked. That means they are immune to any bugs in your locally installed libraries.
There aren’t too many places where the issue of “binary versus compiled-from-source” will come into play in the average MySQL tuning regimen, but they do happen. For example, in Chapter 10, our advice on chrooting your installation can be used only if every file MySQL needs is brought into a single directory tree, which might not be the case in a binary installation.
For a novice administrator on a simple installation, we recommend
using a binary package (such as an RPM) to set up your system.
However, once you progress to the point of really needing to tinker
with the “guts” of MySQL, you will
probably want to quickly go back, change a
configure
flag, and recompile.
MySQL.com Binary Versus Distribution Binary
One thing to keep in mind is that there are a number of sources for binary packages, and nearly all of them set up the system differently.
For example, you can download the binary installation from the
MySQL.com web site. You can also nstall the binary distribution
included by your
Linux distribution vendor, or the
one you grabbed from the
FreeBSD ports
collection. Finally, you can downloaded a binary for a platform that
isn’t officially supported, but on which someone is
keeping a MySQL version current, such as the Amiga
architecture.[1] In any of these
cases, you will end up with different directory layouts, compilation
options, etc.
If you use the binary distributions from anyone other than MySQL AB, your support options may be significantly decreased, simply by virtue of having limited yourself to seeking help from those who use that particular distribution. Even a question as simple as, “Where is the my.cnf file located on the FreeBSD port of MySQL?” is going to limit those who can respond to two groups: those who have run MySQL using the FreeBSD port, and those on the mailing list or newsgroup, etc. who have encountered that question before. On the plus side, if your distribution has automated security announcements and updates, you probably never need to worry about patching MySQL if a security flaw is discovered.
Many binary distributors of MySQL mold it to fit “their” layout. For example, the Debian distribution places the config files in /etc/mysql/, some language-specific files in /usr/share/mysql/, the executables directly into /usr/bin/, etc. It’s not “the Debian way” to segregate an application’s binaries; it incorporates them into the system as a whole. Likewise, in those places it does incorporate them, it does so in what may seem like an odd manner. For instance, you might expect config files to go directly into /etc/, but instead they get put in /etc/mysql/. It can be confusing if you’re trying to find everything you need to modify, or if you’re trying to later convert from one type of installation to the other.
The MySQL.com-supplied tarball binary packages, however, behave more like the source-compilation process. All the files—configuration files, libraries, executables, and the database files themselves—end up in a single directory tree, created specifically for the MySQL install. This is typically /usr/local/mysql, but it can be altered as needed at installation time. Because this behavior is much the same as a source-compiled installation, the available support from the MySQL community is much greater. It also makes things easier if you decide later to instead use a MySQL installation you compile from source.
On the other hand, the MySQL-supplied binary packages that are distributed using package-management formats such as RPM are laid out similarly to the format of the system they are designed for. For example, the RPM installation you get from MySQL.com will have its files laid out similarly to the Red Hat-supplied RPM. This is so because it’s not uncommon for a Linux distribution to ship an RPM that hasn’t been thoroughly tested and is broken in fairly serious ways. The RPM files MySQL.com distributes are intended as upgrade paths for users with such a problem so they can have “just what they have now, but it works.”
Because of that, if you’re going to install a binary you download from MySQL.com, we highly recommend using the tarball formatted files. They will yield the familiar directory structure the average MySQL administrator is used to seeing.
Configuration Files
Configuring a MySQL server is often just a matter of editing the configuration file to make any changes you need and then restarting the server. While that sounds rather simple, adjusting the server’s configuration is something you’re not likely to do on a daily basis. More likely, you’ve installed MySQL, configured it minimally or with the defaults, and then let it run. Most users never go back and adjust the server configuration until a problem arises. As a result, it’s easy to forget how to configure MySQL.
Another possibility is that you didn’t even know there was a configuration file for MySQL. For the majority of projects, MySQL’s default configuration is more than sufficient on modern hardware. It may not be as fast as it can be (because you haven’t optimized it), but it will certainly meet your basic needs.
File Locations
When MySQL starts, it reads its configuration files in a particular order, unless told otherwise. On Unix, the order is:
/etc/my.cnf
datadir/my.cnf
~/.my.cnf
%SystemRoot%/my.ini
C:\my.cnf
Three command-line arguments affect how MySQL reads its configuration files:
- --no-defaults
Tells MySQL not to read any configuration files.
- --defaults-file=/path/to/file
Tells MySQL to read this file only, and any other files explicitly declared with
--defaults-extra-file
.- --defaults-extra-file=/path/to/file
Tells MySQL to read this file after reading the /etc/my.cnf global configuration file .
Files read later in the process override those set in previously read
files. If both /etc/my.cnf and
datadir
/my.cnf
specify a value for the TCP port that MySQL should listen to, the
latter takes precedence.
This behavior can be quite helpful when you need to run
multiple servers either on the same host or
on several different hosts. You can give all servers an identical
copy of /etc/my.cnf that specifies all the
values that aren’t specific to a single host. With
that out of the way, the few host-specific settings can be maintained
in a small supplemental file such as
datadir
/my.cnf.
A similar strategy works if you’d like to run
multiple servers on a single host. By putting all the common settings
in /etc/my.cnf and the server-specific settings
in each
datadir
/my.cnf,
it’s easy to keep several servers running with a
minimum of effort.
For example, perhaps you want to run a couple different instances of the MySQL server, one for each character set you plan to use (to make your life easier). You might put all your “common” settings in /etc/my.cnf and the following in /etc/my.english.cnf:
default-character-set=latin1 port=3306 socket=/var/lib/mysql/english.sock
Your /etc/my.german.cnf file has:
default-character-set=latin1_de port=3307 socket=/var/lib/mysql/german.sock
You might even have /etc/my.korean.cnf with:
default-character-set=euc_kr port=3308 socket=/var/lib/mysql/korean.sock
Now, when you start up the three servers, you want each to load all the settings from the shared /etc/my.cnf file, and then get settings from one of each of the previous language-based configuration files. You can use a command like the following:
$ mysqld_safe --defaults-extra-file=/etc/my.german.cnf $ mysqld_safe --defaults-extra-file=/etc/my.english.cnf $ mysqld_safe --defaults-extra-file=/etc/my.korean.cnf
This command yields three different mysqld
instances, running on ports 3306 through 3308, each using the
language-specific configuration options mentioned in the file
indicated by the defaults-extra-file
switch.
MySQL is usually installed as a service on Windows. As a result, Windows users must call c:\mysql\bin\mysqld directly to pass command-line arguments.
File Format
The configuration file format consists
of one or more sections, each of which may contain one or more lines.
Sections begin with a name in square brackets, such as
[
mysqld
]
;
this identifies the program to which the options should be applied.
Each line contains a comment, a key/value pair, a
set-variable
directive, or a Boolean directive.
Blank lines are ignored.
Two special section names can occur in each
configuration file:
[server]
and [client]
. Items
listed in the [server]
block apply to the MySQL
server process. Those in the [client]
section
apply to all client programs that use the MySQL C client library,
including mysql,
mysqlhotcopy, and
mysqldump.
Comments
begin with #
or ;
and continue
to the end of the line:
# this is a comment ; so is this
There is no multiline comment format. You can’t place a comment at the end of an otherwise non-empty line:
key_buffer=128M # a comment can't go here
The key/value pairs are settings such as:
user = mysql port = 3306
The set-variable
statements look like key/value
pairs in which the value is a key/value pair itself:
set-variable = key_buffer=384M set-variable = tmp_table_size=32M
Spaces aren’t important in
set-variable
lines. You can also write the two
previous lines as follows:
set-variable = key_buffer = 384M set-variable=tmp_table_size=32M
Either way, MySQL will understand you. However, consider using some space to enhance readability.
As of Version 4.1, the set-variable=
portion of
the variable definition is no longer needed and is deprecated. In
current versions:
set-variable = key_buffer=384M
and:
key_buffer=384M
are both interpreted in an identical manner by the server at startup
time. If you are running a version that supports leaving out the
set-variable
clause, it probably is best to do so
because it won’t be supported forever.
We’ve chosen to use the older format here because
it’s what you’re likely to have
already, and the sample configuration files in the standard MySQL
distribution continue to use it.
The few boolean directives are just stated plainly:
skip-bdb
Individual lines in the configuration file are limited to 2 KB in length. While it’s rare that you’ll ever need to use a line that long, it can occasionally be a problem.
Sample Files
The support-files directory of the MySQL distribution[2] contains four sample configuration files:
my-small.cnf
my-medium.cnf
my-large.cnf
my-huge.cnf
The names of the files are meant to signify the size of the machine on which the MySQL server will run. Each contains comments describing where the size comes from. For example, my-medium.cnf says:
# This is for a system with little memory (32M - 64M) where MySQL plays # a important part and systems up to 128M very MySQL is used together with # other programs (like a web server)
To use a sample file, simply copy it to
/etc/my.cnf (or
systemdir
\win.ini on
Windows) and making changes as necessary. While none is likely to be
ideal for any particular setup, each file is a good starting point
for setting up a new system. Failure to make adjustments to the
sample configuration can lead to worse performance in some cases.
Let’s look at the sample my-medium.cnf file from a newly installed system. Some of the information may not make sense right away (depending on how much experience you have), but the more examples you see, the more you’ll begin to understand them.
The file starts with some helpful comments about the type of system this configuration is appropriate for and information needed to install it:
# Example mysql config file for medium systems. # # This is for a system with little memory (32M - 64M) where MySQL plays # a important part and systems up to 128M very MySQL is used together with # other programs (like a web server) # # You can copy this file to # /etc/mf.cnf to set global options, # mysql-data-dir/my.cnf to set server-specific options (in this # installation this directory is /usr/local/mysq/var) or # ~/.my.cnf to set user-specific options. # # One can in this file use all long options that the program supports. # If you want to know which options a program support, run the program # with --help option.
Next are the options that apply to all the client tools you might run on this host:
# The following options will be passed to all MySQL clients [client] #password = your_password port = 3306 socket = /tmp/mysql.sock
What follows next are the parameters specific to the server. The port and socket options, of course, should agree with what the clients were just told. The remaining settings allow MySQL to allocate more RAM for various caches and buffers as well as enable some basic replication options:
# Here follows entries for some specific programs # The MySQL server [mysqld] port = 3306 socket = /tmp/mysql.sock skip-locking set-variable = key_buffer=16M set-variable = max_allowed_packet=1M set-variable = table_cache=64 set-variable = sort_buffer=512K set-variable = net_buffer_length=8K set-variable = myisam_sort_buffer_size=8M log-bin server-id = 1
Next are a few options you probably don’t need to change if you have sufficient disk space:
# Point the following paths to different dedicated disks #tmpdir = /tmp/ #log-update = /path-to-dedicated-directory/hostname
The BDB options refer to the BDB storage engine, which provide MySQL’s first transaction-safe storage. You’ll learn more about storage engines in Chapter 2.
# Uncomment the following if you are using BDB tables #set-variable = bdb_cache_size=4M #set-variable = bdb_max_lock=10000
InnoDB, another of MySQL’s storage engines, has numerous options that must be configured before you can use them. Because it provides transaction-safe tables with its own memory management and storage system, you need to specify where the data files will live, as well as how much RAM should be used. (InnoDB was briefly known as Innobase, so you may see that name in configuration files.)
# Uncomment the following if you are using Innobase tables #innodb_data_file_path = ibdata1:400M #innodb_data_home_dir = /usr/local/mysql/var/ #innodb_log_group_home_dir = /usr/local/mysql/var/ #innodb_log_arch_dir = /usr/local/mysql/var/ #set-variable = innodb_mirrored_log_groups=1 #set-variable = innodb_log_files_in_group=3 #set-variable = innodb_log_file_size=5M #set-variable = innodb_log_buffer_size=8M #innodb_flush_log_at_trx_commit=1 #innodb_log_archive=0 #set-variable = innodb_buffer_pool_size=16M #set-variable = innodb_additional_mem_pool_size=2M #set-variable = innodb_file_io_threads=4 #set-variable = innodb_lock_wait_timeout=50
The final option groups are for specific MySQL command-line utilities, including the mysql shell:
[mysqldump] quick set-variable = max_allowed_packet=16M [mysql] no-auto-rehash # Remove the next comment character if you are not familiar with SQL #safe-updates [isamchk] set-variable = key_buffer=20M set-variable = sort_buffer=20M set-variable = read_buffer=2M set-variable = write_buffer=2M [myisamchk] set-variable = key_buffer=20M set-variable = sort_buffer=20M set-variable = read_buffer=2M set-variable = write_buffer=2M [mysqlhotcopy] interactive-timeout
That file would be considerably larger and certainly more confusing if all the possible settings were listed. For 90% (or more) of MySQL users, there is simply never a need to adjust more than a few of the settings listed in the sample files.
Reconfiguration
When an administrator adjusts the server parameters, it’s common to go through an iterative process that involves making changes, restarting the server, performing some tests, and repeating the process. In fact, we’ll look at doing just that in Chapter 3. In the meantime, it’s worth mentioning that you should strongly consider putting your MySQL configuration files into some sort of revision control system (RCS, CVS, Subversion, etc.). Doing so gives you an easy way to track changes and back out of a bad configuration change.
As of MySQL 4.0, it’s possible to change server variables on the fly at runtime. For example, if you wanted to increase the size of the key buffer from what it was set to at startup, you might do the following:
mysql> SET GLOBAL key_buffer=50M;
This sets the global value for key_buffer
to 50 MB.
Some variables, such as sort_buffer_size
, can be
set globally so that they affect all new threads on the server, or
they can be defined so that they apply only to the current MySQL
client session. For example, if you wish to make a series of queries
that might better use a large sort buffer, you can type:
mysql> SET SESSION sort_buffer_size=50M;
Variables set using the
SESSION
syntax are thread-specific and
don’t alter the values other threads use.
It’s important to note that any change you make
here, using either
GLOBAL
or SESSION
syntax,
will not survive a restart of the MySQL server; it’s
completely transient in that regard. Runtime changes like this are
excellent for testing scenarios such as, “If I
increase my key_buffer value, will it improve my query
performance?” Once you’ve found a
value that works for you, though, remember to go back to your
/etc/my.cnf file and put that value into your
configuration file, or you may find yourself wondering weeks or
months later why performance was so horrible after that reboot,
completely forgetting the variable change you made on the fly months
prior.
It’s also possible to use arguments on the mysqld_safe command line to override values defined in the configuration files. For example, you might do something like the following:
$ mysqld_safe -O key_buffer=50M
Like the earlier set-variable
syntax, the
-O
syntax is deprecated as of Version 4.0. Here is a
better way to issue that command:
$ mysqld_safe --key_buffer=50M
Command-line argument changes made in the mysql.server startup script will, obviously, survive from server restart to server restart, as long as that startup script is used to disable and reenable the server. It’s important to point out, though, that it’s usually better to have all your configuration declarations in a single place, so that maintenance doesn’t become a game of hide-and-seek with the configuration options, trying to remember where you set which values.
The SHOW Commands
MySQL users often wonder how to find out what their server is actually doing at any point in time—usually when things start to slow down or behave strangely. You can look at operating system statistics to figure out how busy the server is, but that really doesn’t reveal much. Knowing that the CPU is at 100% utilization or that there’s a lot of disk I/O occurring provides a high-level picture of what is going on, but MySQL can tell far more.
Several SHOW
commands provide a window into
what’s going on inside MySQL. They provide access to
MySQL’s configuration variables, ongoing statistics,
and counters, as well as a description of what each client is doing.
SHOW VARIABLES
The easiest way to verify that
configuration changes have taken effect is to ask MySQL for its
current variable
settings. The SHOW
VARIABLES
command does just that. Executing it produces quite a bit of output,
which looks something like this:
mysql> SHOW VARIABLES;
+---------------------------------+------------------------------------------+
| Variable_name | Value |
+---------------------------------+------------------------------------------+
| back_log | 20 |
| basedir | mysql |
| binlog_cache_size | 32768 |
| character_set | latin1 |
| concurrent_insert | ON |
| connect_timeout | 5 |
| datadir | /home/mysql/data/ |
The output continues from there, covering over 120 variables in total. The variables are listed in alphabetical order, which is convenient for reading, but sometimes related variables aren’t anywhere near each other in the output. The reason for this is because as MySQL evolves, new variables are added with more descriptive names, but the older variable names aren’t changed; it would break compatibility for any program that expects them.[3]
Many of the variables in the list may be adjusted by a
set-variable
entry in any of
MySQL’s configuration files. Some of them are
compiled-in values that can not be changed. They’re
really constants (not variables), but they still show up in the
output of SHOW
VARIABLES
. Still
others are boolean flags.
Notice that the output of SHOW
VARIABLES
(and all of the SHOW
commands, for that matter) looks just like the output of any SQL
query. It’s tabular data. MySQL returns the output
in a structured format, making it easy to write tools that can
summarize and act on the output of these commands.
We’ll put that to good use in later chapters.
SHOW PROCESSLIST
The other SHOW
command
we’ll look at is SHOW
PROCESSLIST
. It outputs a list of what each thread
is doing at the time you execute the command.[4] It’s roughly equivalent to the
ps
or top
commands in Unix or
the Task Manager in Windows.
Executing it produces a process list in tabular form:
mysql> SHOW PROCESSLIST;
+----+---------+-----------+------+-------------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+---------+-----------+------+-------------+------+-------+------------------+
| 17 | jzawodn | localhost | NULL | Query | 0 | NULL | show processlist |
+----+---------+-----------+------+-------------+------+-------+------------------+
It’s common for the State
and
Info
columns to contain more information that
produces lines long enough to wrap onscreen. So it’s
a good idea to use the \G
escape in the
mysql command interpreter to produce vertical
output rather than horizontal output:
mysql> SHOW PROCESSLIST \G
*************************** 1. row ***************************
Id: 17
User: jzawodn
Host: localhost
db: NULL
Command: Query
Time: 0
State: NULL
Info: show processlist
No matter which way you look at it, the same fields are included:
- Id
The number that uniquely identifies this process. Since MySQL is a multi-threaded server, it really identifies the thread (or connection) and is unrelated to process IDs the operating system may use. As the operating system does with processes, MySQL starts numbering the threads at 1 and gives each new thread an ID one higher than the previous thread.
- User
The name of the MySQL user connected to this thread.
- Host
The name of the host or IP address from which the user is connected.
- db
The database currently selected. This may be NULL if the user didn’t specify a database.
- Command
This shows the command state (from MySQL’s internal point of view) that the thread is currently in. Table 1-1 lists each command with a description of when you are likely to see it. The commands roughly correspond to various function calls in MySQL’s C API. Many commands represent very short-lived actions. Two of those that don’t,
Sleep
andQuery
, appear frequently in day-to- day usage.
- Time
The number of seconds that the process has been running the current command. A process with a
Time
of 90 andCommand
ofSleep
has been idle for a minute and a half.- State
Additional human-readable information about the state of this thread. Here’s an example:
Slave connection: waiting for binlog update
This appears on the master server when a slave is actively replicating from it.
- Info
This is the actual SQL currently being executed, if any. Only the first 100 characters are displayed in the output of
SHOW
PROCESSLIST
. To get the full SQL, useSHOW
FULL
PROCESSLIST
.
SHOW STATUS
In addition to all the variable
information we can query, MySQL also keeps track of many useful
counters and statistics. These numbers track how often various events
occur. The SHOW
STATUS
command
produces a tabular listing of all the statistics and their names.
To confuse matters a bit, MySQL refers to these counters as variables
too. In a sense, they are variables, but they’re not
variables you can set. They change as the server runs and handles
traffic; you simply read them and reset them using the FLUSH
STATUS
command.
The SHOW
STATUS
command,
though, offers a lot of insight into your server’s
performance. It’s covered in much greater depth in
Appendix A.
SHOW INNODB STATUS
The SHOW
INNODB
STATUS
status command
provides a number of InnoDB-specific statistics. As we said earlier,
InnoDB is one of
MySQL’s storage engines; look for more on storage
engines in Chapter 2.
The output of SHOW
INNODB
STATUS
is different from that of
SHOW
STATUS
in that it reads
more as a textual report, with section headings and such. There are
different sections of the report that provide information on
semaphores, transaction statistics, buffer information, transaction
logs, and so forth.
SHOW
INNODB
STATUS
is covered in greater detail along with
SHOW
STATUS
in Appendix A. Also, note that in a future version of
MySQL, this command will be replaced with a more generic
SHOW
ENGINE
STATUS
command.
[1] At the time that sentence was written, it was entirely theoretical: the thinking was “I’m not aware of anything, but surely someone will do that!” In researching it, we found that MySQL for Amiga was, indeed, happening. For those who read German, there’s an article from Amiga Magazine at http://www.amiga-magazin.de/magazin/a08-01/mysql/ that describes how to do it, and a mailing list at http://groups.yahoo.com/group/Amiga_MySql/ for people working on it as well.
[2] These files aren’t included in the Windows distribution of older MySQL releases.
[3] In the rare event they do change, MySQL retains the old names as aliases for the new ones.
[4] Not all
threads appear in the SHOW
PROCESSLIST
output. The thread that handles
incoming network connections, for example, is never listed.
Get High Performance MySQL now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.