Chapter 4. MySQL’s Environment

The MySQL server is not alone in the environment where it runs. Even if it works in a dedicated environment, you still have to consider the hardware resources and operating system (OS) limits. In shared environments, the MySQL server is also affected by other processes. Tuning operating systems for MySQL is a huge topic about which a separate book could be written. Here I won’t go deep, but show some starting points from a troubleshooting perspective. MySQL runs in various environments, which is a great advantage for MySQL, but also something that makes it hard to get specific in this chapter. So I decided to show you what you need to care about, and leave it up to you to consult the manual for your OS to determine how to handle tuning.

Physical Hardware Limits

A common usage mistake is to have unrealistic expectations for performance. One can expect that the MySQL server needs to be tuned while forgetting the latency of hardware components. Therefore, it is important to understand what can cause the latency.

The following hardware resources affect the MySQL server:

RAM
CPU
Number of cores
Disk I/O
Network bandwidth

Let’s discuss each of them in a bit of detail.

RAM

Memory is a very important resource for MySQL. The server works fast when it does not swap. Ideally, it should fit in RAM. Therefore, it is important to configure buffers in such a way that they stay within the limits of physical memory. I provided guidelines for this in Effects of Options and in Chapter 3, particularly in Calculating Safe Values for Options.

Note

You can check whether mysqld is not swapping by checking vmstat on Linux/Unix or the Windows Task Manager on Windows.

Here is an example of swapping on Linux. Important parts are in bold. For a server that is not swapping, all these values should be equal to zero:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0   1936 296828   7524 5045340   4   11   860     0  470  440  4  2 75 18
 0  1   1936 295928   7532 5046432  36   40   860   768  471  455  3  3 75 19
 0  1   1936 294840   7532 5047564   4   12   868     0  466  441  3  3 75 19
 0  1   1936 293752   7532 5048664   0    0   848     0  461  434  5  2 75 18

In those sections, we discussed how configuration variables can affect memory usage. The basic rule is to calculate the maximum realistic amount of RAM that will be used by the MySQL server and to make sure you keep it less than the physical RAM you have. Having buffers larger than the actual memory size increases the risk of the MySQL server crashing with an “Out of memory” error.

The previous point can be stated the other way around: if you need larger buffers, buy more RAM. This is always a good practice for growing applications.
Use RAM modules that support extended error correction (EEC), so if a bit of memory is corrupted, the whole MySQL server does not crash.

A few other aspects of memory use that you should consider are listed in a chapter named “How MySQL Uses Memory” in the MySQL Reference Manual. I won’t repeat its contents here, because it doesn’t involve any new troubleshooting techniques.

One important point is that when you select a row containing a BLOB column, an internal buffer grows to the point where it can store this value, and the storage engine does not return the memory to RAM after the query finishes. You need to run FLUSH TABLE to free the memory.

Another point concerns differences between 32-bit and 64-bit architectures. Although the 32-bit ones use a smaller pointer size and thus can save memory, these systems also contain inherent restrictions on the size of buffers due to addressing limits in the operating system. Theoretically, the maximum memory available in a 32-bit system is 4GB per process, and it’s actually less on many systems. Therefore, if the buffers you want to use exceed the size of your 32-bit system, consider switching to a 64-bit architecture.

Processors and Their Cores

MySQL’s performance does not scale linearly with increasing CPU speed. This does not mean you can’t make use of a fast CPU, but don’t expect performance will scale by increasing CPU speed in the same way it can increase by adding more RAM.

However, the number of cores is important when you set options that affect internal thread concurrency. There is no sense in increasing the values of such options if you don’t have enough cores. This can be easily demonstrated using the benchmark utility named sysbench.^[14] Table 4-1 shows the results of a small test on a machine with four cores. I used the OLTP sysbench test with 16 threads.

Table 4-1. Time spent executing an event with different innodb_thread_concurrency values

`innodb_thread_concurrency`	Execution time
1	7.8164
2	4.3959
4	2.5889
8	2.6708
16	3.4669
32	3.4235

As you can see, the test runs faster up until I start eight threads, and stops improving at higher values.

Disk I/O

Fast disks are very important for MySQL performance. The faster the disk, the faster the I/O operations.

Regarding disks, you should pay attention to disk read latency—how much time each read access takes—and fsync latency—how long each fsync takes.

Recent solid state disks (SSDs) work well, but don’t expect miracles from them yet, because most storage engines are optimized to do read and writes for hard disks.

The same issue applies to network storage. It is possible to store data and logfiles in network filesystems and storage, but these installations may be slower than local disks. You need to check is how fast and reliable your storage is. Otherwise, don’t be surprised if you experience data corruption because of network failure.

You can determine whether your disk I/O is overloaded using iostat on Linux/Unix. The average queue length of the requests issued to the device should not be high in normal use. On Windows you can use perfmon for the same purpose. Here is an example output of iostat:

$iostat -x 5
Linux 2.6.18-8.1.1.el5 (blade12)        11/11/2011      _x86_64_

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.27    0.00    0.32    0.65    0.00   97.79

Device:    rrqm/s wrqm/s  r/s   w/s   rsec/s  wsec/s avgrq-sz avgqu-sz await  
cciss/c0d0   0.02   7.25  0.50  6.34  14.35   108.73   17.98    0.48   69.58   
dm-0         0.00   0.00  0.52 13.59  14.27   108.70    8.72    0.09    6.59   

svctm  %util
2.22   1.52
1.08   1.52

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          38.69    0.00    6.43   47.84    0.00    8.64

Device:    rrqm/s  wrqm/s  r/s     w/s rsec/s   wsec/s avgrq-sz avgqu-sz  await 
cciss/c0d0   0.00 5362.40 0.20  713.80   1.60 51547.20    72.20   138.40 193.08  
dm-0         0.00    0.00 0.00 6086.00   0.00 48688.00     8.00  1294.74 227.04  

svctm %util
1.40 99.88
0.16 99.88

<skipped>

Device:    rrqm/s   wrqm/s  r/s     w/s  rsec/s   wsec/s avgrq-sz avgqu-sz  await 
cciss/c0d0   0.00 10781.80 0.00   570.20  0.00  84648.00   148.45   143.58 248.72  
dm-0         0.00     0.00 0.00 11358.00  0.00  90864.00     8.00  3153.82 267.57  

svctm  %util
1.75 100.02
0.09 100.02

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.90    0.00   11.30   75.10    0.00    5.60

Device:    rrqm/s   wrqm/s  r/s     w/s  rsec/s   wsec/s avgrq-sz avgqu-sz  await 
cciss/c0d0   0.00 11722.40 0.00  461.60    0.00 98736.00   213.90   127.78 277.04  
dm-0         0.00     0.00 0.00 12179.20   0.00 97433.60     8.00  3616.90 297.54  

svctm %util
2.14 98.80
0.08 98.80

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          23.55    0.00   23.95   46.19    0.00    7.11

Device:    rrqm/s  wrqm/s  r/s     w/s  rsec/s   wsec/s avgrq-sz avgqu-sz  await 
cciss/c0d0   0.00 4836.80 1.00  713.60    8.00 49070.40    68.68   144.28 204.08  
dm-0         0.00    0.00 1.00 5554.60    8.00 44436.80     8.00  1321.82 257.28  

svctm  %util
1.40 100.02
0.18 100.02

This output was taken when mysqld was idle and then started an active I/O job. You can see how avgqu-sz is growing. Although this is far from problematic, I decided to put this example here to show how disk I/O activity changes while mysqld is doing its job.

And aside from speed, remember that the storage can lose data; partial page writes are still possible. If you use InnoDB, use the doublewrite buffer by setting innodb_doublewrite to secure your data. Also, is very important to plan battery backups for your disks because these can prevent data loss in case of power failure.

Network Bandwidth

Clients almost always connect to the MySQL server over a network, so it is important to run your MySQL server in a fast network.

In addition to network bandwidth, round-trip time (RTT) and the number of round-trips are important. RTT is a time needed for a client to send a network packet, then receive an answer from the server. The longer distance between the machines, the higher the RTT is.

Network bandwidth and RTT are the reasons why it is recommended to place the MySQL client and the server in the same local network when possible.

Local networks are recommended for replication as well. You can connect to slaves over the Internet instead of a local intranet, but expect delays and even errors due to corruption in the relay log data. Such errors should be fixed automatically after bug #26489 is fixed and if you use the relay-log-recovery option starting with version 5.5 and binary log checksums starting with version 5.6. But the master and slave will still spend time resending packets due to network failures.

Example of the Effect of Latencies

To finish this part of the chapter, I will show you small example of how hardware latencies affect a trivial UPDATE query. We will use an InnoDB table with autocommit turned on. We will also turn on the binary logfile.

UPDATE test_rrbs SET f1 = md5(id*2) WHERE id BETWEEN 200000 AND 300000;

This simple query could experience latencies when:

The client sends a command to the server that takes a half of RTT.

The WHERE clause of the UPDATE is executed, and mysqld reads the disk.

mysqld does an fsync call to prepare for the transaction because autocommit is on.

mysqld does an fsync call to write into a binary logfile.

mysqld does an fsync call to commit changes.

The client receives the result from the server, which is another aspect of RTT.

^[14]I discuss sysbench in SysBench.

Get MySQL Troubleshooting now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Start your free trial

MySQL Troubleshooting by Sveta Smirnova