BUY THIS BOOK
Add to Cart

Print Book $49.99


Add to Cart

PDF $39.99

Safari Books Online

What is this?

Add to UK Cart

Print Book £35.50

What is this?

Looking to Reprint or License this content?


Understanding MySQL Internals
Understanding MySQL Internals

By Sasha Pachev
Book Price: $49.99 USD
£35.50 GBP
PDF Price: $39.99

Cover | Table of Contents | Colophon


Table of Contents

Chapter 1: MySQL History and Architecturel
MySQL architecture is best understood in the context of its history. Thus, the two are discussed in the same chapter.
MySQL history goes back to 1979 when Monty Widenius, working for a small company called TcX, created a reporting tool written in BASIC that ran on a 4 Mhz computer with 16 KB RAM. Over time, the tool was rewritten in C and ported to run on Unix. It was still just a low-level storage engine with a reporting front end. The tool was known by the name of Unireg.
Working under the adverse conditions of little computational resources, and perhaps building on his God-given talent, Monty developed a habit and ability to write very efficient code naturally. He also developed, or perhaps was gifted from the start, with an unusually acute vision of what needed to be done to the code to make it useful in future development—without knowing in advance much detail about what that future development would be.
In addition to the above, with TcX being a very small company and Monty being one of the owners, he had a lot of say in what happened to his code. While there are perhaps a good number of programmers out there with Monty's talent and ability, for a number of reasons, few get to carry their code around for more than 20 years. Monty did.
Monty's work, talents, and ownership of the code provided a foundation upon which the Miracle of MySQL could be built.
Some time in the 1990s, TcX customers began to push for an SQL interface to their data. Several possibilities were considered. One was to load it into a commercial database. Monty was not satisfied with the speed. He tried borrowing mSQL code for the SQL part and integrating it with his low-level storage engine. That did not work well, either. Then came the classic move of a talented, driven programmer: "I've had enough of those tools that somebody else wrote that don't work! I'm writing my own!"
Thus in May of 1996 MySQL version 1.0 was released to a limited group, followed by a public release in October 1996 of version 3.11.1. The initial public release provided only a binary distribution for Solaris. A month later, the source and the Linux binary were released.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
MySQL History
MySQL history goes back to 1979 when Monty Widenius, working for a small company called TcX, created a reporting tool written in BASIC that ran on a 4 Mhz computer with 16 KB RAM. Over time, the tool was rewritten in C and ported to run on Unix. It was still just a low-level storage engine with a reporting front end. The tool was known by the name of Unireg.
Working under the adverse conditions of little computational resources, and perhaps building on his God-given talent, Monty developed a habit and ability to write very efficient code naturally. He also developed, or perhaps was gifted from the start, with an unusually acute vision of what needed to be done to the code to make it useful in future development—without knowing in advance much detail about what that future development would be.
In addition to the above, with TcX being a very small company and Monty being one of the owners, he had a lot of say in what happened to his code. While there are perhaps a good number of programmers out there with Monty's talent and ability, for a number of reasons, few get to carry their code around for more than 20 years. Monty did.
Monty's work, talents, and ownership of the code provided a foundation upon which the Miracle of MySQL could be built.
Some time in the 1990s, TcX customers began to push for an SQL interface to their data. Several possibilities were considered. One was to load it into a commercial database. Monty was not satisfied with the speed. He tried borrowing mSQL code for the SQL part and integrating it with his low-level storage engine. That did not work well, either. Then came the classic move of a talented, driven programmer: "I've had enough of those tools that somebody else wrote that don't work! I'm writing my own!"
Thus in May of 1996 MySQL version 1.0 was released to a limited group, followed by a public release in October 1996 of version 3.11.1. The initial public release provided only a binary distribution for Solaris. A month later, the source and the Linux binary were released.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
MySQL Architecture
For the large part, MySQL architecture defies a formal definition or specification. When most of the code was originally written, it was not done to be a part of some great system in the future, but rather to solve some very specific problems. However, it was written so well and with enough insight that it reached the point where there were enough quality pieces to assemble a database server.
I make an attempt in this section to identify the core modules in the system. However, let me add a disclaimer that this is only an attempt to formalize what exists. MySQL developers rarely think in those terms. Rather, they tend to think of files, directories, classes, structures, and functions. It is much more common to hear "This happens in mi_open( )" than to hear "This happens on the MyISAM storage engine level." MySQL developers know the code so well that they are able to think conceptually on the level of functions, structures, and classes. They will probably find the abstractions in this section rather useless. However, it would be helpful to a person used to thinking in terms of modules and managers.
With regard to MySQL, I use the term "module" rather loosely. Unlike what one would typically call a module, in many cases it is not something you can easily pull out and replace with another implementation. The code from one module might be spread across several files, and you often find the code from several different modules in the same file. This is particularly true of the older code. The newer code tends to fit into the pattern of modules better. So in our definition, a module is a piece of code that logically belongs together in some way, and performs a certain critical function in the server.
One can identify the following modules in the server:
  • Server Initialization Module
  • Connection Manager
  • Thread Manager
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: Nuts and Bolts of Working with the MySQL Source Code
Much can be learned about MySQL by studying its source. Monty Widenius, the creator of MySQL, once half-jokingly remarked that the source is the ultimate documentation. Indeed, assuming that the hardware and the compiler are functioning properly, the software will do exactly what the source tells it to. However, understanding the source of a complex program such as MySQL can be a challenge. The purpose of this chapter is to give you a head start in your study of the source.
Although MySQL runs on a number of different platforms, you will find it easier to study the source if you get an account on some Unix-like system, such as Linux, FreeBSD, Mac OS X, or Solaris. If you do not have a preference to start with, I recommend Linux. It could be either a remote server, or running on your desktop. The examples in this chapter assume you are logged in to a Unix command shell, and that your shell is Bourne-compatible to some degree. One way to get such a shell is to execute:
	/bin/sh
right after you log in.
MySQL developers use BitKeeper (http://www.bitmover.com) for source revision control. A BitKeeper repository containing MySQL source code is publicly available with read-only access. Although MySQL source code can also be obtained by downloading a compressed archive, using BitKeeper offers a number of advantages:
  • You get the most recent source version and can stay up to date with all the developments on a daily basis.
  • BitKeeper tools allow you to easily keep track of changes.
  • You can easily keep track of your own changes and submit patches to MySQL developers.
Unfortunately, there are also some disadvantages:
  • The initial setup requires a download of over 30 MB of data if you are downloading the revision history.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Unix Shell
Although MySQL runs on a number of different platforms, you will find it easier to study the source if you get an account on some Unix-like system, such as Linux, FreeBSD, Mac OS X, or Solaris. If you do not have a preference to start with, I recommend Linux. It could be either a remote server, or running on your desktop. The examples in this chapter assume you are logged in to a Unix command shell, and that your shell is Bourne-compatible to some degree. One way to get such a shell is to execute:
	/bin/sh
right after you log in.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
BitKeeper
MySQL developers use BitKeeper (http://www.bitmover.com) for source revision control. A BitKeeper repository containing MySQL source code is publicly available with read-only access. Although MySQL source code can also be obtained by downloading a compressed archive, using BitKeeper offers a number of advantages:
  • You get the most recent source version and can stay up to date with all the developments on a daily basis.
  • BitKeeper tools allow you to easily keep track of changes.
  • You can easily keep track of your own changes and submit patches to MySQL developers.
Unfortunately, there are also some disadvantages:
  • The initial setup requires a download of over 30 MB of data if you are downloading the revision history.
  • Special tools such as autoconf, automake, and bison have to be installed in order to build MySQL.
  • Since BitMover decided to discontinue the Open Logging License, it is not possible to automatically integrate your changes, submit patches, and do other tasks without buying a commercial license.
If the disadvantages of using BitKeeper in your situation outweigh the advantages, please refer to the "Building from Source Distribution" section in this chapter. Otherwise, the first step is to make sure that BitKeeper is installed on your system.
Without a commercial license, the only advantage of using BitKeeper is being able to get the most recent development source. If you are not planning to use the commercial version of BitKeeper, follow these instructions to download the free BitKeeper client:
  1. Download http://www.bitmover.com/bk-client.shar.
  2. Unpack it by running /bin/sh bk-client.shar.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Preparing the System to Build MySQL from BitKeeper Tree
Once you have cloned the BitKeeper repository, the following tools must be installed for the build scripts to work:
  • autoconf
  • automake
  • m4
  • libtool
  • GNU make
  • bison
  • gcc or some other C++ compiler
The required version's utilities will be present on most Linux distributions that were put together in the second half of 2003 or later. If you have an older or a very customized Linux installation, or if you are using a different Unix variant, refer to Table 2-2 to verify that you have the required version of each tool.
Table 2-2: Versions of required build tools
Tool
Minimum required version
URL
autoconf
2.53
http://www.gnu.org/software/autoconf/
automake
1.8
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Building MySQL from BitKeeper Tree
MySQL developers have created a number of scripts to build different types of MySQL binaries on different platforms. They are found in the BUILD directory of the BitKeeper tree. Unfortunately, at the time of this writing, I could not find the one that would build a debugging-enabled binary on any architecture. To solve the problem, I will provide instructions on how to create one:
  1. Copy compile-pentium-debug to compile-generic-debug.
  2. Open compile-generic-debug in a text editor.
  3. Change the line extra_flags="$pentium_cflags $debug_cflags" to extra_flags="-g $debug_cflags".
  4. Remove all lines starting with: extra_configs=.
  5. Add a line extra_configs="" before the line. "$path/FINISH.sh".
  6. Save the edited file.
After the edit, compile-generic-debug will look like this:
	#! /bin/sh

	path='dirname $0'
	. "$path/SETUP.sh" $@ --with-debug=full

	extra_flags="-g $debug_cflags"
	c_warnings="$c_warnings $debug_extra_warnings"
	cxx_warnings="$cxx_warnings $debug_extra_warnings"
	extra_configs=""

	
	. "$path/FINISH.sh"
Now you are ready to use compile-generic-debug for the build. At the shell prompt from the root of the cloned repository, execute the following:
    $ BUILD/compile-generic-debug
The script will generate the make files, create the necessary headers with definitions, and then compile the MySQL server, client, and miscellaneous utilities. This is a fairly long process. It took 12 minutes on my desktop (Athlon 2200+, 1.5 GB RAM, Linux).
During the build, you may see a number of warnings from different tools engaged in the process. Those are usually safe to ignore if the build does not abort. To verify that the build was successful, type at the shell prompt:
	$ make test
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Building from Source Distribution
Although it is preferred that you use the BitKeeper repository, in some cases it might be desirable for you to use another method to build MySQL. You can use the source distribution in such cases. Although in most of the situations you will need only gcc, gdb, and GNU make, there are times when other tools mentioned in the section "Preparing the System to Build MySQL from BitKeeper Tree" are necessary. For example, you will need Bison to change the parser, and adding another file to the source will require the use of autoconf, automake, and m4. Therefore, it is still recommended that you follow the same procedures outlined in that section to prepare your system to the fullest extent possible.
Additionally, you will need the tar (http://www.gnu.org/software/tar) and gzip (http://www.gnu.org/software/gzip) utilities to unpack the archive. If you have a non-GNU tar already installed, it is recommended that you still install GNU tar. MySQL is archived using GNU tar, and some variants of tar are not compatible with it.
The following instructions explain how to download and compile MySQL using the source distribution:
  1. Refer to the table listing MySQL versions in the "BitKeeper" section of this chapter, and decide which version of MySQL you would like to work with. Steps 2–5 assume you have chosen version 4.1. If you have chosen a different version, you will need to make the appropriate modifications to the following procedures, replacing 4.1 with the version you have chosen.
  2. Visit http://dev.mysql.com/downloads/mysql/4.1.html (note the version number in the URL), scroll down to the bottom of the page where it says "Source downloads," and click on the link that says "Pick a mirror" on the "Tarball" line.
  3. Optionally fill out the form at the top of the next page and submit it, or just scroll down to the bottom of the page, pick the mirror closest to you, click on the link, and proceed with the download. You will be downloading about 19 MB of data.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Installing MySQL into a System Directory
If desired, you may install MySQL into a system directory by executing:
	$ make install
as the root user. By default, the install prefix is /usr/local. This can be changed by adding –prefix= /path/to/other-prefix to the extra_configs variable in the build script. If you do not have root privileges on the system, another build configuration option will be helpful: add --with-mysqld-user= your_user_name to extra_configs. A full listing of build configuration options can be obtained by executing:
	# ./configure -help
in the root directory of the source tree.
If you do not plan to deploy the MySQL server binary you have built on this system, installing it into a system directory is not necessary. The mysql-test-run script permits you to start up the binary you have built and test it while it is located in the directory where it was created.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Source Code Directory Layout
Table 2-3 lists the top-level subdirectories in the MySQL source tree, with a brief explanation of each. Note that some reorganization is possible in the future versions, but most of the structure should be fairly stable.
Table 2-3: Top-level directories in MySQL source tree
Subdirectory
Description
BUILD
Developer build scripts.
Build-tools
Mostly scripts for building binary distributions.
Docs
Documentation.
NEW-RPMS
Used by the distribution build scripts to hold new RPMs.
SSL
Some configuration files from early SSL development.
VC++Files
Used for building MySQL binaries on Windows.
bdb
Berkeley DB storage engine code. Berkeley DB supports transactions and page locks. However, the interface between Berkeley DB and core MySQL is not very well developed, and InnoDB storage engine is a better choice when transactions are needed. Removed in version 5.1.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Preparing the System to Run MySQL in a Debugger
To fully enjoy the study of MySQL internals, and to be able to execute the examples in the subsequent sections of this chapter, you must have gdb (http://www.gnu.org/software/gdb/) installed on your system, and be present in your PATH. You also need to have the X Window System, including a terminal program such as xterm. There are a number of X standard implementations, perhaps the most popular of them being X.org (http://www.x.org).
The tools just mentioned will be preinstalled by default on most Linux distributions. However, to confirm that you can debug threaded programs under gdb, it is important to make sure that /lib/libpthread.so and /lib/libthread_db.so are not stripped. The following example illustrates how to check this:
	$ file -L /lib/libthread_db.so
	/lib/libthread_db.so: ELF 32-bit LSB shared object, Intel 80386, version 1, not 
	stripped 
	$ file -L /lib/libpthread.so
	/lib/libpthread.so: ELF 32-bit LSB shared object, Intel 80386, version 1, not 
	stripped
As you can see in the output, both libraries are not stripped. If you happen to have the misfortune of having them stripped by default, and you are not able to find a package with unstripped versions for your distribution, you can fix the problem by recompiling glibc.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Debugger-Guided Source Tour
Now with the tedious but necessary preparation behind your back, you can actually start exploring the source code. I find it particularly helpful, when faced with large quantities of unfamiliar code, to start by running a very simple test case in a debugger. MySQL, being a threaded server, presents a number of difficulties in this respect. Fortunately, MySQL developers have created a set of tools to facilitate the process for their own use, which they make available to the public. In this section, you will learn how to use them.
Instructions for running a simple query in a debugger:
  1. Change to the mysql-test subdirectory in the source tree.
  2. Create a new file named t/example.test. It is important that the file be under the t subdirectory, have the extension .test, and be different from the names of the already existing test files in the t subdirectory. Outside of those restrictions, the name of the file can be anything you want. If you choose a different name, however, you must also change references to it accordingly in the rest of this example.
  3. Put the following line in the edited file:
    	select 1;
    
  4. Save the file.
  5. Execute the following command to create the master result file:
    	$ ./mysql-test-run --local --record example
    
  6. Execute the following command to load MySQL server into gdb in a separate xterm window (if you're running it on another computer via SSH, be sure to have SSH X-forwarding enabled. If it's not possible—e.g., because you're using Windows —use --manual-gdb instead of --gdb):
        $ ./mysql-test-run --gdb example
    
  7. An xterm window will open with a gdb prompt inside. The MySQL server will be started with a preset breakpoint in the mysql_parse( ) function in the file sql/sql_ parse.cc
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Basics of Working with gdb
gdb has a command-line interface similar to a Unix shell. You type a command and then press the Enter key to execute it. If you have never worked with gdb before, begin by executing the help command, which produces the following output:
	List of classes of commands:

	aliases -- Aliases of other commands
	breakpoints -- Making program stop at certain points
	data -- Examining data
	files -- Specifying and examining files
	internals -- Maintenance commands
	obscure -- Obscure features
	running -- Running the program
	stack -- Examining the stack
	status -- Status inquiries
	support -- Support facilities
	tracepoints -- Tracing of program execution without stopping the program
	user-defined -- User-defined commands

	Type "help" followed by a class name for a list of commands in that class.
	Type "help" followed by command name for full documentation.
	Command name abbreviations are allowed if unambiguous.
The instructions in the preceding output give you a starting point from which you can continue a more in-depth study of gdb. For example, if you want to learn about how to run a program being debugged, execute help running, which in turn gives you the following:
	Running the program.

	List of commands:

	advance --Continue the program up to the given location (same form as args for break
	command)
	attach -- Attach to a process or file outside of GDB
	continue -- Continue program being debugged
	detach -- Detach a process or file previously attached
	disconnect -- Disconnect from a target
	finish -- Execute until selected stack frame returns
	handle -- Specify how to handle a signal 
	info handle -- What debugger does when program gets various signals
	interrupt -- Interrupt the execution of the debugged program
	jump -- Continue program being debugged at specified line or address
	kill -- Kill execution of program being debugged
	next -- Step program
	nexti -- Step one instruction
	run -- Start debugged program
	set args -- Set argument list to give program being debugged when it is started
	set environment -- Set environment variable value to give the program
	set follow-fork-mode -- Set debugger response to a program call of fork or vfork
	set scheduler-locking -- Set mode for locking scheduler during execution
	set step-mode -- Set mode of the step operation
	show args -- Show argument list to give program being debugged when it is started
	show follow-fork-mode -- Show debugger response to a program call of fork or vfork
	show scheduler-locking -- Show mode for locking scheduler during execution
	show step-mode -- Show mode of the step operation
	signal -- Continue program giving it signal specified by the argument
	step -- Step program until it reaches a different source line
	stepi -- Step one instruction exactly
	target -- Connect to a target machine or process
	thread -- Use this command to switch between threads
	thread apply -- Apply a command to a list of threads
	apply all -- Apply a command to all threads
	tty -- Set terminal for future runs of program being debugged
	unset environment -- Cancel environment variable VAR for the program
	until -- Execute until the program reaches a source line greater than the current

	Type "help" followed by command name for full documentation.
	Command name abbreviations are allowed if unambiguous.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Finding Things in the Source
A typical question programmers ask when working with a large unfamiliar code base is "Where in the world is the function get_and_lock_X( ) defined?" There are many techniques to find the answer, and many programmers have their own favorites. For those who do not, or who are having a hard time adapting them to the MySQL source, I will share mine.
Suppose you need to find the definition of mysql_lock_tables( ). Follow these steps:
  1. Start an instance of MySQL server in a debugger as outlined in the section "Debugger-Guided Source Tour."
  2. Once the debugger window opens, type in the debugger window:
    	i li mysql_lock_tables
    
    The debugger responds with:
    	Line 86 of "lock.cc"
    	starts at address 0x8125540 <mysql_lock_tables__FP3THDPP8st_tableUi>
    	  and ends at 0x812554f <mysql_lock_tables__FP3THDPP8st_tableUi+15>.
    
    This tells you that mysql_lock_tables() is defined on line 86 of lock.cc. Unfortunately, it doesn't say which directory the file is in, so one more step is necessary.
  3. Execute the following command from the root of the source tree:
    	$ find . -name lock.cc | xargs ls -l
    
    This command returns two filenames in the output:
    	lrwxrwxrwx    1 sasha   sasha    16 Jul 29 15:08 ./libmysqld/lock.cc -> ./../
    	sql/lock.cc
    	-rw-rw-r--    1 sasha   sasha  20863 Jul 14 21:40 ./sql/lock.cc
    
    From the output, you can see that libmysqld/lock.cc is a just a symbolic link, while the actual file is sql/lock.cc.
Sometimes things do not go as smoothly as in the previous example. What appears to be a function could in reality be a preprocessor macro. If that happens, the debugger will tell you it knows nothing about the symbol. Fortunately, grep comes to the rescue. Suppose you need to find the definition of ha_commit( ), and the debugger has already told you that there is no such symbol. Execute the following command from the root of the source tree:
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Interesting Breakpoints and Variables
If you have ever worked with an unfamiliar code base of significant size, you have most certainly been confronted with the challenge of mentally penetrating the execution flow. Yes, I understand this init_X( ) function, but where in the world does the meat really begin when I do operation Y?
Table 2-5, along with debugger-guided source tours inspired by its contents, will hopefully help you answer a number of such questions. It is based on the 4.1 source, but for the most part should be applicable to other versions. Although MySQL developers could possibly change the names of the functions and the code organization at any time in future versions, in practice 95 percent of the functions continue to carry their original name and role once that section of the code has stabilized. The asterisks in the final column do not indicate C++ pointers; they mean "all variables whose names match."
Table 2-5: Key functions and variables involved in common MySQL operations
Operation
Good place for an entry breakpoint
Interesting variables to examine
Select query
mysql_select( )
*thd
thd->query
*tables
*join
Insert query
mysql_insert( )
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Making a Source Modification
If you have not added any extra files to the source, after making the change, simply execute:
	$ make
from the root of the source tree, and wait for the recompilation and relinking to finish. If the only files you have changed are in the sql directory, it is sufficient to run make only in that directory, which will reduce the time of the process because make would doesn't check other directories to see whether anything needs to be done there.
If you have added new files, follow these steps:
  1. For each directory where you added files, edit Makefile.am in that directory.
  2. In each Makefile.am, find the appropriate variable that ends in SOURCES. For example, in the sql directory, the variable is called mysqld_SOURCES, and in the myisam directory, libmyisam_a_SOURCES.
  3. Add the names of your C/C++ files that you have added to the SOURCES variable.
  4. In each Makefile.am file, find the variables INCLUDES and noinst_HEADERS. Add the names of the new headers that you want to be installed by a make install command to INCLUDES, and the ones that do not need to be installed to noinst_HEADERS .
  5. Execute BUILD/compile-generic-debug (or its equivalent) as described in the section "Building MySQL from BitKeeper Tree."
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Coding Guidelines
If you are making changes to MySQL, it is recommended that you follow the same coding guidelines as MySQL developers. This will make it easier for your code to work with what's already written, help you avoid bugs, and increase the chances of your patches being accepted by MySQL developers without serious modifications.
MySQL developers make their coding guidelines publicly available at http://dev.mysql.com/doc/internals/en/index.html.
In addition, I will provide a reorganized summary with some extra tips and comments from my own experience.
Here are some guidelines for preserving the code's stability while making changes:
  • Always remember that, more often than not, you are in a thread and must follow the rules of thread-safe programming.
  • Most global variables have an associated mutex that other threads will lock before accessing it. Make sure to learn which mutex is associated with each global variable, and lock it when accessing that variable.
  • Be aware that you have very little stack space available. Chunks of memory larger than 100 bytes or so should be allocated with sql_alloc( ) or my_malloc( ).
  • When possible, choose sql_alloc( ) over my_malloc( ) for small allocations. sql_ alloc( ) allocates memory from a pre-allocated connection memory pool, while my_malloc( ) is just a wrapper around the regular malloc( ) call. sql_alloc( ) can be called anywhere in the stack of execution below do_command( ). To verify the stack position in question, set a breakpoint there in the debugger, and when it is reached, run the bt command. Note that the memory allocated with sql_alloc( ) lasts until the end of the query execution. If you want your allocation to persist past that, use my_malloc( ).
  • Perform large allocations with my_malloc( ), and free the allocated blocks with
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Keeping Your BitKeeper Repository Up to Date
MySQL source keeps changing with time—a lot during the alpha stage, less during beta, and only very little once it reaches the stable status. It is recommended that you stay current with the recent developments whether you are just studying or are making modifications to the MySQL source.
Instructions follow on how to update your local BitKeeper repository. This section applies to the commercial version of BitKeeper.
  1. If you have not previously committed a change, edit BitKeeper/triggers/post-commit and replace its contents with the following:
    	#! /bin/sh
    	exit 0
    
    This script executes every time you commit a change, and in its original version it will notify MySQL developers and the public about the details of your change. This notification is desirable when a MySQL developer makes a commit, but may not be so desirable for you. If you do want the world to be notified every time you commit, you may keep this script the way it is.
  2. At the Unix shell prompt, from anywhere inside the repository tree, execute:
    	$ bk citool &
    
    BitKeeper will take a couple of minutes to examine the changed files, and then it will present a GUI dialog asking you to comment on each change you have made.
  3. After you have commented your changes to each individual file, as well as to the entire change set, press the Commit button twice and wait for the BitKeeper window to disappear.
  4. Execute the following command at the Unix shell prompt from anywhere inside the repository tree:
    	$ bk pull
    
  5. In some rather rare cases, the pull may result in conflicts. This usually happens when you change the same lines at the same time that some MySQL developer does. In this event, BitKeeper will print a message on the standard output about failed conflicts and instruct you to run bk resolve. Do so, and follow the prompts that BitKeeper gives you. If needed, refer to the BitKeeper documentation by running
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Submitting a Patch
If you have added a feature or fixed a bug, and would like MySQL developers to consider it for submission, follow the steps in this section. The instructions assume you have used the commercial version of BitKeeper. If you have not, you will have to diff your source against an unmodified copy in step 1.
  1. Execute the following command from the directory inside the source tree:
    	$ bk -r diffs -c > /tmp/mysql-patch.diff
    
  2. Examine the contents of /tmp/mysql-patch.diff to make sure the patch makes sense to you.
  3. Send a message to internals@lists.mysql.com. If the patch is reasonably small (under a few kilobytes), include it in the body of the message. Otherwise, post it at a URL and include the URL in the body of the message. Make sure to include a brief description of the patch.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 3: Core Classes, Structures, Variables, and APIs
The MySQL source code contains several hundred thousand lines and continues to grow. Finding your way in it is quite a challenge. However, the task is not as difficult and daunting as it appears initially, if you are familiar with the core elements of the code and their respective roles. The purpose of this chapter is to give you a foundation that will enable you to read most sections of the code with some degree of ease.
This chapter is meant to be a literacy crash course. We will focus on the core elements of the code that are critical to understanding what is generally going on. The more specific details of various modules will be discussed in later chapters dedicated to them.
In the discussion of the various code elements, I will inevitably have to leave out a number of less significant class members, API calls, and global variables due to space constraints. Additionally, by the time this book is printed, a number of new code elements might appear, and a few might change names or functions to some extent. However, it is reasonable to expect that such cases will be minimal. The major part of the code we will discuss in this chapter has already stabilized and will not change significantly.
I must also note that again for reasons of space constraints we will have to leave out a large number of vital classes, structures, and macros. However, I hope that once you become familiar with what we have covered, you will have sufficient background to elicit the additional information through your own study of the code.
The THD class defines a thread descriptor. It contains the information pertinent to the thread that is handling the given request. Each client connection is handled by a thread. Each thread has a descriptor object. Handling client requests is not the only time a thread is created. MySQL has a number of system threads, such as the replication slave threads and delayed insert threads. Additionally, there exists a special case when a thread descriptor object is created without a thread—the server running in bootstrap mode to create the necessary system tables during installation.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
THD
The THD class defines a thread descriptor. It contains the information pertinent to the thread that is handling the given request. Each client connection is handled by a thread. Each thread has a descriptor object. Handling client requests is not the only time a thread is created. MySQL has a number of system threads, such as the replication slave threads and delayed insert threads. Additionally, there exists a special case when a thread descriptor object is created without a thread—the server running in bootstrap mode to create the necessary system tables during installation.
Due to the close relationship between client connections and threads, the terms thread and connection are often used synonymously by MySQL developers. I will do so in the discussion of this class.
THD is perhaps the most frequently referenced type in the server code. It is passed as the first argument in most of the higher-level calls. With the exception of low-level storage and retrieval operations, few things of significance happen inside the server without some involvement of this class. Familiarity with its members will give you a good idea of the overall architecture and capabilities of the server.
THD is defined in sql/sql_class.h and implemented in sql/sql_class.cc.
Most commonly, an instance of THD is pointed to by a variable thd of type THD*. Therefore, if you are trying to find places in the code where a particular member of this class is used, the following command will work almost without fail:
	grep "thd->var_name" sql/*.cc
The class consists mostly of data members, which will be the primary focus of this discussion. There are a few fairly simple and infrequently used methods, which we will not discuss due to the space constraints. However, once you understand the role of the data members we will discuss, the function of the methods will become apparent to you from their source.
Table 3-1 lists the most prominent members of the
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
NET
The NET structure defines a network connection descriptor. MySQL uses a fairly complex protocol on top of the one already provided by the operating system for client/ server communication. This structure lies at the core of the protocol's implementation.
The protocol defines its own packet format. A packet can send a command, a message, or a block of data. Packets can be compressed, or transmitted over the SSL layer.
All network communication functions use NET one way or the other, usually by accepting it as an argument. Becoming familiar with the members of NET is a major step toward understanding the client/server communication protocol.
NET is a rather small structure, which permits us to cover it in its entirety. It is defined in include/mysql_com.h. The same definition is also used by the client library, which is written in C. This would exclude any possibility for NET to have any methods. However, there are a number of functions that accept NET* as an argument, which will be covered in Chapter 5.
Table 3-2 lists the NET members.
Table 3-2: Members of the NET class
Member definition
Description
Vio* vio
Low-level network I/O socket descriptor. V stands for virtual. The VIO abstraction was originally created to support SSL. Now it is also used to support Windows shared memory and named pipe connections. It also facilitates cross-platform porting in many other ways.
unsigned char *buff
Start of the data buffer.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
TABLE
The TABLE structure defines a database table descriptor. A table can exist in an open or closed state. In order to be used in the server, it has to be opened. Whenever this happens, a table descriptor is created, and placed in the table cache to be reused later when another request is made that references the same table.
Instances of TABLE are frequently referenced in the parser, optimizer, access control, and query cache code. It glues things together in a number of ways. Studying its members is a good way to become acquainted to a degree with the low-level details of the server implementation.
This structure is defined in sql/table.h as struct st_table, but then is aliased to TABLE with a typedef in sql/handler.h.
Note that in version 5.1 TABLE was refactored, and portions of it were moved to TABLE_SHARE class, which is shared between the instances of the same physical table. TABLE_SHAREs are cached in the table definition cache.
TABLE is a large structure. We will cover only the most essential parts. Table 3-3 lists some of its members.
Table 3-3: Members of the TABLE class
Member definition
Description
handler *file
Pointer to the storage engine (handler) object for this table. The object is used for all low-level data storage and retrieval operations.