The MySQL server is not alone in the environment where it runs. Even if it works in a dedicated environment, you still have to consider the hardware resources and operating system (OS) limits. In shared environments, the MySQL server is also affected by other processes. Tuning operating systems for MySQL is a huge topic about which a separate book could be written. Here I won’t go deep, but show some starting points from a troubleshooting perspective. MySQL runs in various environments, which is a great advantage for MySQL, but also something that makes it hard to get specific in this chapter. So I decided to show you what you need to care about, and leave it up to you to consult the manual for your OS to determine how to handle tuning.
A common usage mistake is to have unrealistic expectations for performance. One can expect that the MySQL server needs to be tuned while forgetting the latency of hardware components. Therefore, it is important to understand what can cause the latency.
The following hardware resources affect the MySQL server:
RAM
CPU
Number of cores
Disk I/O
Network bandwidth
Let’s discuss each of them in a bit of detail.
Memory is a very important resource for MySQL. The server works fast when it does not swap. Ideally, it should fit in RAM. Therefore, it is important to configure buffers in such a way that they stay within the limits of physical memory. I provided guidelines for this in Effects of Options and in Chapter 3, particularly in Calculating Safe Values for Options.
Note
You can check whether mysqld is not swapping by checking vmstat on Linux/Unix or the Windows Task Manager on Windows.
Here is an example of swapping on Linux. Important parts are in bold. For a server that is not swapping, all these values should be equal to zero:
procs-----------memory----------
---swap--
-----io---- -system-- ----cpu---- r bswpd
free buff cachesi so
bi bo in cs us sy id wa 1 01936
296828 7524 50453404 11
860 0 470 440 4 2 75 18 0 11936
295928 7532 504643236 40
860 768 471 455 3 3 75 19 0 11936
294840 7532 50475644 12
868 0 466 441 3 3 75 19 0 11936
293752 7532 50486640 0
848 0 461 434 5 2 75 18
In those sections, we discussed how configuration variables can affect memory usage. The basic rule is to calculate the maximum realistic amount of RAM that will be used by the MySQL server and to make sure you keep it less than the physical RAM you have. Having buffers larger than the actual memory size increases the risk of the MySQL server crashing with an “Out of memory” error.
The previous point can be stated the other way around: if you need larger buffers, buy more RAM. This is always a good practice for growing applications.
Use RAM modules that support extended error correction (EEC), so if a bit of memory is corrupted, the whole MySQL server does not crash.
A few other aspects of memory use that you should consider are listed in a chapter named “How MySQL Uses Memory” in the MySQL Reference Manual. I won’t repeat its contents here, because it doesn’t involve any new troubleshooting techniques.
One important point is that when you select a row containing
a BLOB column, an internal buffer grows to the point where it can store
this value, and the storage engine does not return the memory to RAM
after the query finishes. You need to run FLUSH
TABLE
to free the memory.
Another point concerns differences between 32-bit and 64-bit architectures. Although the 32-bit ones use a smaller pointer size and thus can save memory, these systems also contain inherent restrictions on the size of buffers due to addressing limits in the operating system. Theoretically, the maximum memory available in a 32-bit system is 4GB per process, and it’s actually less on many systems. Therefore, if the buffers you want to use exceed the size of your 32-bit system, consider switching to a 64-bit architecture.
MySQL’s performance does not scale linearly with increasing CPU speed. This does not mean you can’t make use of a fast CPU, but don’t expect performance will scale by increasing CPU speed in the same way it can increase by adding more RAM.
However, the number of cores is important when you set options that affect internal thread concurrency. There is no sense in increasing the values of such options if you don’t have enough cores. This can be easily demonstrated using the benchmark utility named sysbench.[14] Table 4-1 shows the results of a small test on a machine with four cores. I used the OLTP sysbench test with 16 threads.
Table 4-1. Time spent executing an event with different innodb_thread_concurrency values
innodb_thread_concurrency | Execution time |
---|---|
1 | 7.8164 |
2 | 4.3959 |
4 | 2.5889 |
8 | 2.6708 |
16 | 3.4669 |
32 | 3.4235 |
As you can see, the test runs faster up until I start eight threads, and stops improving at higher values.
Fast disks are very important for MySQL performance. The faster the disk, the faster the I/O operations.
Regarding disks, you should pay attention to disk read latency—how much time each read access takes—and fsync latency—how long each fsync takes.
Recent solid state disks (SSDs) work well, but don’t expect miracles from them yet, because most storage engines are optimized to do read and writes for hard disks.
The same issue applies to network storage. It is possible to store data and logfiles in network filesystems and storage, but these installations may be slower than local disks. You need to check is how fast and reliable your storage is. Otherwise, don’t be surprised if you experience data corruption because of network failure.
You can determine whether your disk I/O is overloaded using iostat on Linux/Unix. The average queue length of the requests issued to the device should not be high in normal use. On Windows you can use perfmon for the same purpose. Here is an example output of iostat:
$iostat -x 5
Linux 2.6.18-8.1.1.el5 (blade12) 11/11/2011 _x86_64_
avg-cpu: %user %nice %system %iowait %steal %idle
1.27 0.00 0.32 0.65 0.00 97.79
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await
cciss/c0d0 0.02 7.25 0.50 6.34 14.35 108.73 17.98 0.48 69.58
dm-0 0.00 0.00 0.52 13.59 14.27 108.70 8.72 0.09 6.59
svctm %util
2.22 1.52
1.08 1.52
avg-cpu: %user %nice %system %iowait %steal %idle
38.69 0.00 6.43 47.84 0.00 8.64
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await
cciss/c0d0 0.00 5362.40 0.20 713.80 1.60 51547.20 72.20 138.40 193.08
dm-0 0.00 0.00 0.00 6086.00 0.00 48688.00 8.00 1294.74 227.04
svctm %util
1.40 99.88
0.16 99.88
<skipped>
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await
cciss/c0d0 0.00 10781.80 0.00 570.20 0.00 84648.00 148.45 143.58 248.72
dm-0 0.00 0.00 0.00 11358.00 0.00 90864.00 8.00 3153.82 267.57
svctm %util
1.75 100.02
0.09 100.02
avg-cpu: %user %nice %system %iowait %steal %idle
8.90 0.00 11.30 75.10 0.00 5.60
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await
cciss/c0d0 0.00 11722.40 0.00 461.60 0.00 98736.00 213.90 127.78 277.04
dm-0 0.00 0.00 0.00 12179.20 0.00 97433.60 8.00 3616.90 297.54
svctm %util
2.14 98.80
0.08 98.80
avg-cpu: %user %nice %system %iowait %steal %idle
23.55 0.00 23.95 46.19 0.00 7.11
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await
cciss/c0d0 0.00 4836.80 1.00 713.60 8.00 49070.40 68.68 144.28 204.08
dm-0 0.00 0.00 1.00 5554.60 8.00 44436.80 8.00 1321.82 257.28
svctm %util
1.40 100.02
0.18 100.02
This output was taken when mysqld was idle and then started an active I/O
job. You can see how avgqu-sz
is
growing. Although this is far from problematic, I decided to put this
example here to show how disk I/O activity changes while mysqld is doing its job.
And aside from speed, remember that the storage can lose data;
partial page writes are still possible. If you use InnoDB, use the
doublewrite buffer by setting innodb_doublewrite
to secure your data. Also,
is very important to plan battery backups for your disks because these
can prevent data loss in case of power failure.
Clients almost always connect to the MySQL server over a network, so it is important to run your MySQL server in a fast network.
In addition to network bandwidth, round-trip time (RTT) and the number of round-trips are important. RTT is a time needed for a client to send a network packet, then receive an answer from the server. The longer distance between the machines, the higher the RTT is.
Network bandwidth and RTT are the reasons why it is recommended to place the MySQL client and the server in the same local network when possible.
Local networks are recommended for replication as well. You can
connect to slaves over the Internet instead of a local intranet, but
expect delays and even errors due to corruption in the relay log data.
Such errors should be fixed automatically after bug #26489 is fixed
and if you use the relay-log-recovery
option starting
with version 5.5 and binary log checksums starting with version 5.6. But
the master and slave will still spend time resending packets due to
network failures.
To finish this part of the chapter, I will show you small example
of how hardware latencies affect a trivial UPDATE
query. We will use an InnoDB table with
autocommit
turned on. We will also
turn on the binary logfile.
UPDATE test_rrbs SET f1 = md5(id*2) WHERE id BETWEEN 200000 AND 300000;
This simple query could experience latencies when:
Get MySQL Troubleshooting now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.