How MySQL Compares to Other Databases

Our users have successfully run their own benchmarks against a number of open source and traditional database servers. We are aware of tests against Oracle server, DB/2 server, Microsoft SQL server, and other commercial products. Due to legal reasons we are restricted from publishing some of those benchmarks in our reference manual.

This section includes a comparison with mSQL for historical reasons and with PostgreSQL as it is also an open source database. If you have benchmark results that we can publish, please contact us at .

For comparative lists of all supported functions and types as well as measured operational limits of many different database systems, see the crash-me web page at http://www.mysql.com/information/crash-me.php.

How MySQL Compares to mSQL

Performance

For a true comparison of speed, consult the growing MySQL benchmark suite. See Section 5.1.4.

Because there is no thread creation overhead, a small parser, few features, and simple security, mSQL should be quicker at:

  • Tests that perform repeated connects and disconnects, running a very simple query during each connection.

  • INSERT operations into very simple tables with few columns and keys.

  • CREATE TABLE and DROP TABLE.

  • SELECT on something that isn’t an index. (A table scan is very easy.)

Because these operations are so simple, it is hard to be better at them when you have a higher startup overhead. After the connection is established, MySQL server should perform much better.

On the other hand, MySQL server is much faster than mSQL (and most other SQL implementations) on the following:

  • Complex SELECT operations.

  • Retrieving large results (MySQL server has a better, faster, and safer protocol).

  • Tables with variable-length strings because MySQL server has more efficient handling and can have indexes on VARCHAR columns.

  • Handling tables with many columns.

  • Handling tables with large record lengths.

  • SELECT with many expressions.

  • SELECT on large tables.

  • Handling many connections at the same time. MySQL server is fully multi-threaded. Each connection has its own thread, which means that no thread has to wait for another (unless a thread is modifying a table another thread wants to access). In mSQL, once one connection is established, all others must wait until the first has finished, regardless of whether the connection is running a query that is short or long. When the first connection terminates, the next can be served, while all the others wait again, etc.

  • Joins. mSQL can become pathologically slow if you change the order of tables in a SELECT. In the benchmark suite, a time more than 15,000 times slower than MySQL server was seen. This is due to mSQL’s lack of a join optimiser to order tables in the optimal order. However, if you put the tables in exactly the right order in mSQL2 and the WHERE is simple and uses index columns, the join will be relatively fast! See Section 5.1.4.

  • ORDER BY and GROUP BY.

  • DISTINCT.

  • Using TEXT or BLOB columns.

SQL Features
  • GROUP BY and HAVING. mSQL does not support GROUP BY at all. MySQL server supports a full GROUP BY with both HAVING and the following functions: COUNT( ), AVG( ), MIN( ), MAX( ), SUM( ), and STD( ). COUNT(*) is optimised to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause. MIN( ) and MAX( ) may take string arguments.

  • INSERT and UPDATE with calculations. MySQL server can do calculations in an INSERT or UPDATE. For example:

    mysql> UPDATE SET x=x*10+y WHERE x<20;
  • Aliasing. MySQL server has column aliasing.

  • Qualifying column names. In MySQL server, if a column name is unique among the tables used in a query, you do not have to use the full qualifier.

  • SELECT with functions. MySQL server has many functions (too many to list here; see Section 6.3).

Disk Space Efficiency

That is, how small can you make your tables?

MySQL server has very precise types, so you can create tables that take very little space. An example of a useful MySQL datatype is the MEDIUMINT that is 3 bytes long. If you have 100 million records, saving even 1 byte per record is very important.

mSQL2 has a more limited set of column types, so it is more difficult to get small tables.

Stability

This is harder to judge objectively. For a discussion of MySQL server stability, see Section 1.2.3.

We have no experience with mSQL stability, so we cannot say anything about that.

Price

Another important issue is the license. MySQL server has a more flexible license than mSQL, and is also less expensive than mSQL. Whichever product you choose to use, remember to at least consider paying for a license or email support.

Perl Interfaces

MySQL server has basically the same interfaces to Perl as mSQL with some added features.

JDBC (Java)

MySQL server currently has a lot of different JDBC drivers:

  • The mm driver: a type 4 JDBC driver by Mark Matthews . This is released under the LGPL.

  • The Resin driver: this is a commercial JDBC driver released under open source. http://www.caucho.com/projects/jdbc-mysql/index.xtp

  • The gwe driver: a Java interface by GWE technologies (not supported anymore).

  • The jms driver: an improved gwe driver by Xiaokun Kelvin ZHU (not supported anymore).

  • The twz driver: a type 4 JDBC driver by Terrence W. Zellers . This is commercial but is free for private and educational use (not supported anymore).

The recommended driver is the mm driver. The Resin driver may also be good (at least the benchmarks look good), but we haven’t received that much information about this yet.

We know that mSQL has a JDBC driver, but we have too little experience with it to compare.

Rate of Development

MySQL server has a small core team of developers, but we are quite used to coding C and C++ very rapidly. Because threads, functions, GROUP BY, and so on are still not implemented in mSQL, it has a lot of catching up to do. To get some perspective on this, you can view the mSQL HISTORY file for the last year and compare it with the News section of the MySQL Reference Manual (see http://www.mysql.com/doc/N/e/News.html). It should be pretty obvious which one has developed most rapidly.

Utility Programs

Both mSQL and MySQL server have many interesting third-party tools. Because it is very easy to port upward (from mSQL to MySQL server), almost all the interesting applications that are available for mSQL are also available for MySQL server.

MySQL server comes with a simple msql2mysql program that fixes differences in spelling between mSQL and MySQL server for the most-used C API functions. For example, it changes instances of msqlConnect( ) to mysql_connect( ). Converting a client program from mSQL to MySQL server usually requires only minor effort.

How to convert mSQL tools for MySQL

According to our experience, it doesn’t take long to convert tools such as msql-tcl and msqljava that use the mSQL C API so that they work with the MySQL C API.

The conversion procedure is:

  1. Run the shell script msql2mysql on the source. This requires the replace program, which is distributed with MySQL server.

  2. Compile.

  3. Fix all compiler errors.

Differences between the mSQL C API and the MySQL C API are:

  • MySQL server uses a MYSQL structure as a connection type (mSQL uses an int).

  • mysql_connect( ) takes a pointer to a MYSQL structure as a parameter. It is easy to define one globally or to use malloc( ) to get one. mysql_connect( ) also takes two parameters for specifying the user and password. You may set these to NULL, NULL for default use.

  • mysql_error( ) takes the MYSQL structure as a parameter. Just add the parameter to your old msql_error( ) code if you are porting old code.

  • MySQL server returns an error number and a text error message for all errors. mSQL returns only a text error message.

  • Some incompatibilities exist as a result of MySQL server supporting multiple connections to the server from the same process.

How mSQL and MySQL client/server communications protocols differ

There are enough differences that it is impossible (or at least not easy) to support both.

The most significant ways in which the MySQL protocol differs from the mSQL protocol are listed here:

  • A message buffer may contain many result rows.

  • The message buffers are dynamically enlarged if the query or the result is bigger than the current buffer, up to a configurable server and client limit.

  • All packets are numbered to catch duplicated or missing packets.

  • All column values are sent in ASCII. The lengths of columns and rows are sent in packed binary coding (1, 2, or 3 bytes).

  • MySQL can read in the result unbuffered (without having to store the full set in the client).

  • If a single read/write takes more than 30 seconds, the server closes the connection.

  • If a connection is idle for 8 hours, the server closes the connection.

How mSQL 2.0 SQL syntax differs from MySQL

Column types

MySQL server

Has the following additional types (among others; see Section 6.5.3):

  • ENUM type for one of a set of strings.

  • SET type for many of a set of strings.

  • BIGINT type for 64-bit integers.

MySQL server also supports the following additional type attributes:

  • UNSIGNED option for integer and floating-point columns.

  • ZEROFILL option for integer columns.

  • AUTO_INCREMENT option for integer columns that are a PRIMARY KEY. See Section 8.4.3.126.

  • DEFAULT value for all columns.

mSQL2

mSQL column types correspond to the MySQL types shown in the following table:

mSQL type

Corresponding MySQL type

CHAR(len)

CHAR(len)

TEXT(len)

TEXT(len). len is the maximal length. And LIKE works.

INT

INT. With many more options!

REAL

REAL. Or FLOAT. Both 4- and 8-byte versions are available.

UINT

INT UNSIGNED

DATE

DATE. Uses ANSI SQL format rather than mSQL’s own format.

TIME

TIME

MONEY

DECIMAL(12,2). A fixed-point value with two decimals.

Index creation

MySQL server

Indexes may be specified at table creation time with the CREATE TABLE statement.

mSQL

Indexes must be created after the table has been created, with separate CREATE INDEX statements.

To insert a unique identifier into a table

MySQL server

Use AUTO_INCREMENT as a column type specifier. See Section 8.4.3.126.

mSQL

Create a SEQUENCE on a table and select the _seq column.

To obtain a unique identifier for a row

MySQL server

Add a PRIMARY KEY or UNIQUE key to the table and use this. New in Version 3.23.11: If the PRIMARY or UNIQUE key consists of only one column and this is of type integer, one can also refer to it as _rowid.

mSQL

Use the _rowid column. Observe that _rowid may change over time depending on many factors.

To get the time a column was last modified

MySQL server

Add a TIMESTAMP column to the table. This column is automatically set to the current date and time for INSERT or UPDATE statements if you don’t give the column a value or if you give it a NULL value.

mSQL

Use the _timestamp column.

NULL value comparisons

MySQL server

MySQL server follows ANSI SQL, and a comparison with NULL is always NULL.

mSQL

In mSQL, NULL = NULL is TRUE. You must change =NULL to IS NULL and <>NULL to IS NOT NULL when porting old code from mSQL to MySQL server.

String comparisons

MySQL server

Normally, string comparisons are performed in case-independent fashion with the sort order determined by the current character set (ISO-8859-1 Latin1 by default). If you don’t like this, declare your columns with the BINARY attribute, which causes comparisons to be done according to the ASCII order used on the MySQL server host.

mSQL

All string comparisons are performed in case-sensitive fashion with sorting in ASCII order.

Case-insensitive searching

MySQL server

LIKE is a case-insensitive or case-sensitive operator, depending on the columns involved. If possible, MySQL uses indexes if the LIKE argument doesn’t start with a wildcard character.

mSQL

Use CLIKE.

Handling of trailing spaces

MySQL server

Strips all spaces at the end of CHAR and VARCHAR columns. Use a TEXT column if this behavior is not desired.

mSQL

Retains trailing space.

WHERE clauses

MySQL server

MySQL correctly prioritises everything (AND is evaluated before OR). To get mSQL behavior in MySQL server, use parentheses (as shown in an example later in this section).

mSQL

Evaluates everything from left to right. This means that some logical calculations with more than three arguments cannot be expressed in any way. It also means you must change some queries when you upgrade to MySQL server. You do this easily by adding parentheses. Suppose you have the following mSQL query:

mysql> SELECT * FROM table WHERE a=1 AND b=2 OR a=3 AND b=4;

To make MySQL server evaluate this the way that mSQL would, you must add parentheses:

mysql> SELECT * FROM table WHERE (a=1 AND (b=2 OR (a=3 AND (b=4))));

Access control

MySQL server

Has tables to store grant (permission) options per user, host, and database. See Section 4.2.6.

mSQL

Has a file mSQL.acl in which you can grant read/write privileges for users.

How MySQL Compares to PostgreSQL

When reading the following, please note that both products are continually evolving. We at MySQL AB and the PostgreSQL developers are both working on making our respective databases as good as possible, so we are both a serious alternative to any commercial database.

The following comparison is made by us at MySQL AB. We have tried to be as accurate and fair as possible, but although we know MySQL server thoroughly, we don’t have a full knowledge of all PostgreSQL features, so we may have got some things wrong. We will, however, correct these when they come to our attention.

We would first like to note that PostgreSQL and MySQL server are both widely used products, but with different design goals, even if we are both striving toward ANSI SQL compliancy. This means that for some applications MySQL server is more suited, while for others PostgreSQL is more suited. When choosing which database to use, you should first check if the database’s feature set satisfies your application. If you need raw speed, MySQL server is probably your best choice. If you need some of the extra features that only PostgreSQL can offer, you should use PostgreSQL.

MySQL and PostgreSQL development strategies

When adding things to MySQL server we take pride to do an optimal, definite solution. The code should be so good that we shouldn’t have any need to change it in the foreseeable future. We also do not like to sacrifice speed for features but instead will do our utmost to find a solution that will give maximal throughput. This means that development will take a little longer, but the end result will be well worth this. This kind of development is only possible because all server code are checked by one of a few (currently two) persons before it’s included in the MySQL server.

We at MySQL AB believe in frequent releases to be able to push out new features quickly to our users. Because of this we do a new small release about every three weeks, and a major branch every year. All releases are thoroughly tested with our testing tools on a lot of different platforms.

PostgreSQL is based on a kernel with lots of contributors. In this setup it makes sense to prioritise adding a lot of new features, instead of implementing them optimally, because one can always optimise things later if there arises a need for this.

Another big difference between MySQL server and PostgreSQL is that nearly all of the code in the MySQL server is coded by developers that are employed by MySQL AB and are still working on the server code. The exceptions are the transaction engines and the regexp library.

This is in sharp contrast to the PostgreSQL code, the majority of which is coded by a big group of people with different backgrounds. It was only recently that the PostgreSQL developers announced that their current developer group had finally had time to take a look at all the code in the current PostgreSQL release.

Both of the aforementioned development methods have their own merits and drawbacks. We here at MySQL AB think, of course, that our model is better because our model gives better code consistency, more optimal and reusable code, and in our opinion, fewer bugs. Because we are the authors of the MySQL server code, we are better able to coordinate new features and releases.

Featurewise comparison of MySQL and PostgreSQL

On the crash-me page (http://www.mysql.com/information/crash-me.php) you can find a list of those database constructs and limits that one can detect automatically with a program. Note, however, that a lot of the numerical limits may be changed with startup options for their respective databases. This web page is, however, extremely useful when you want to ensure that your applications work with many different databases or when you want to convert your application from one database to another.

MySQL server offers the following advantages over PostgreSQL:

  • MySQL Server is generally much faster than PostgreSQL. MySQL 4.0.1 also has a query cache that can boost up the query speed for mostly-read-only sites many times.

  • MySQL has a much larger user base than PostgreSQL. Therefore, the code is tested more and has historically proven more stable than PostgreSQL. MySQL server is used more in production environments than PostgreSQL, mostly thanks to the fact that MySQL AB, formerly TCX DataKonsult AB, has provided top-quality commercial support for MySQL server from the day it was released, whereas until recently PostgreSQL was unsupported.

  • MySQL server works better on Windows than PostgreSQL does. MySQL server runs as a native Windows application (a service on NT/2000/XP), while PostgreSQL is run under the Cygwin emulation. We have heard that PostgreSQL is not yet that stable on Windows but we haven’t been able to verify this ourselves.

  • MySQL has more APIs to other languages and is supported by more existing programs than PostgreSQL. See Appendix B.

  • MySQL server works on 24/7 heavy-duty systems. In most circumstances you never have to run any cleanups on MySQL server. PostgreSQL doesn’t yet support 24/7 systems because you have to run VACUUM once in a while to reclaim space from UPDATE and DELETE commands and to perform statistics analyses that are critical to get good performance with PostgreSQL. VACUUM is also needed after adding a lot of new rows to a table. On a busy system with lots of changes, VACUUM must be run very frequently, in the worst cases even many times a day. During the VACUUM run, which may take hours if the database is big, the database is, from a production standpoint, practically dead. Please note: in PostgreSQL version 7.2, basic vacuuming no longer locks tables, thus allowing normal user access during the vacuum. A new VACUUM FULL command does old-style vacuum by locking the table and shrinking the on-disk copy of the table.

  • MySQL replication has been thoroughly tested, and is used by sites like:

    • Yahoo Finance (http://finance.yahoo.com/)

    • Mobile.de (http://www.mobile.de/)

    • Slashdot (http://www.slashdot.org/)

  • Included in the MySQL distribution are two different testing suites, mysql-test-run and crash-me (http://www.mysql.com/information/crash-me.php), as well as a benchmark suite. The test system is actively updated with code to test each new feature and almost all reproduceable bugs that have come to our attention. We test MySQL server with these on a lot of platforms before every release. These tests are more sophisticated than anything we have seen from PostgreSQL, and they ensure that the MySQL server is kept to a high standard.

  • There are far more books in print about MySQL server than about PostgreSQL. O’Reilly, SAMS, Que, and New Riders are all major publishers with books about MySQL. All MySQL features are also documented in the MySQL online manual because when a new feature is implemented, the MySQL developers are required to document it before it’s included in the source.

  • MySQL server supports more of the standard ODBC functions than PostgreSQL.

  • MySQL server has a much more sophisticated ALTER TABLE.

  • MySQL server has support for tables without transactions for applications that need all the speed they can get. The tables may be memory-based, HEAP tables or disk based MyISAM. See Chapter 7.

  • MySQL server has support for two different table handlers that support transactions, InnoDB, and BerkeleyDB. Because every transaction engine performs differently under different conditions, this gives the application writer more options to find an optimal solution for his or her setup, if need be per individual table. See Chapter 7.

  • MERGE tables gives you a unique way to instantly make a view over a set of identical tables and use these as one. This is perfect for systems where you have log files that you order, for example, by month. See Section 7.2.

  • The option to compress read-only tables, but still have direct access to the rows in the table, gives you better performance by minimising disk reads. This is very useful when you are archiving things. See Section 4.7.4.

  • MySQL server has internal support for full-text search. See Section 6.8.

  • You can access many databases from the same connection (depending, of course, on your privileges).

  • MySQL server is coded from the start to be multi-threaded, while PostgreSQL uses processes. Context switching and access to common storage areas is much faster between threads than between separate processes. This gives MySQL server a big speed advantage in multi-user applications and also makes it easier for MySQL server to take full advantage of symmetric multiprocessor (SMP) systems.

  • MySQL server has a much more sophisticated privilege system than PostgreSQL. While PostgreSQL only supports INSERT, SELECT, and UPDATE/DELETE grants per user on a database or a table, MySQL server allows you to define a full set of different privileges on the database, table, and column level. MySQL server also allows you to specify the privilege on host and user combinations. See Section 4.3.1.

  • MySQL server supports a compressed client/server protocol which improves performance over slow links.

  • MySQL server employs a “table handler” concept, and is the only relational database we know of built around this concept. This allows different low-level table types to be called from the SQL engine, and each table type can be optimised for different performance characteristics.

  • All MySQL table types (except InnoDB) are implemented as files (one table per file), which makes it really easy to back up, move, delete, and even symlink databases and tables, even when the server is down.

  • Tools to repair and optimise MyISAM tables (the most common MySQL table type). A repair tool is only needed when a physical corruption of a data file happens, usually from a hardware failure. It allows a majority of the data to be recovered.

  • Upgrading MySQL server is painless. When you are upgrading MySQL Server, you don’t need to dump/restore your data, as you have to do with most PostgreSQL upgrades.

Drawbacks with MySQL server compared to PostgreSQL:

  • The transaction support in MySQL server is not yet as well tested as PostgreSQL’s system.

  • Because MySQL server uses threads, which are not yet flawless on many OSes, one must either use binaries from http://www.mysql.com/downloads/, or carefully follow our instructions on http://www.mysql.com/doc/I/n/Installing_source.html to get an optimal binary that works in all cases.

  • Table locking, as used by the non-transactional MyISAM tables, is in many cases faster than page locks, row locks, or versioning. The drawback, however, is that if one doesn’t take into account how table locks work, a single long-running query can block a table for updates for a long time. This can usually be avoided when designing the application. If not, one can always switch the trouble table to use one of the transactional table types. See Section 5.3.2.

  • With UDF (user-defined functions) one can extend MySQL server with both normal SQL functions and aggregates, but this is not yet as easy or as flexible as in PostgreSQL. See Section 9.2.

  • Updates that run over multiple tables are harder to do in MySQL server. This will, however, be fixed in MySQL server 4.0.2 with multi-table UPDATE and in MySQL server 4.1 with subselects. In MySQL server 4.0 one can use multi-table deletes to delete from many tables at the same time. See Section 6.4.6.

PostgreSQL currently offers the following advantages over MySQL server:

Note that because we know the MySQL road map, we have included in the following table the version when MySQL server should support this feature. Unfortunately we couldn’t do this for previous comparisons, because we don’t know the PostgreSQL roadmap.

Feature

MySQL version

Subselects

4.1

Foreign keys

4.1

Views

5.0

Stored procedures

5.0

Triggers

5.0

Unions

4.0

Full join

4.1

Constraints

4.1 or 5.0

Cursors

4.1 or 5.0

R-trees

4.1 (for MyISAM tables)

Inherited tables

Not planned

Extensible type system

Not planned

Other reasons someone may consider using PostgreSQL:

  • Standard usage in PostgreSQL is closer to ANSI SQL in some cases.

  • One can speed up PostgreSQL by coding things as stored procedures.

  • For geographical data, R-trees make PostgreSQL better than MySQL server. (note: MySQL version 4.1 will have R-trees for MyISAM tables).

  • The PostgreSQL optimiser can do some optimisation that the current MySQL optimiser can’t do. Most notable is doing joins when you don’t have the proper keys in place and doing a join where you are using different keys combined with OR. The MySQL benchmark suite at http://www.mysql.com/information/benchmarks.html shows you what kind of constructs you should watch out for when using different databases.

  • PostgreSQL has a bigger team of developers that contribute to the server.

Drawbacks with PostgreSQL compared to MySQL server:

  • VACUUM makes PostgreSQL hard to use in a 24/7 environment.

  • Only transactional tables.

  • Much slower INSERT, DELETE, and UPDATE.

For a complete list of drawbacks, you should also examine the first table in this section.

Benchmarking MySQL and PostgreSQL

The only open source benchmark that we know of that can be used to benchmark MySQL server and PostgreSQL (and other databases) is our own. It can be found at http://www.mysql.com/information/benchmarks.html.

We have many times asked the PostgreSQL developers and some PostgreSQL users to help us extend this benchmark to make it the definitive benchmark for databases, but unfortunately we haven’t gotten any feedback for this.

We, the MySQL developers, have, because of this, spent a lot of hours to get maximum performance from PostgreSQL for the benchmarks, but because we don’t know PostgreSQL intimately, we are sure that there are things that we have missed. We have on the benchmark page documented exactly how we did run the benchmark so that it should be easy for anyone to repeat and verify our results.

The benchmarks are usually run with and without the --fast option. When run with --fast we are trying to use every trick the server can do to get the code to execute as fast as possible. The idea is that the normal run should show how the server would work in a default setup and the --fast run shows how the server would do if the application developer would use extensions in the server to make his application run faster.

When running with PostgreSQL and --fast we do a VACUUM after every major table UPDATE and DROP TABLE to make the database in perfect shape for the following SELECTs. The time for VACUUM is measured separately.

When running with PostgreSQL 7.1.1 we could, however, not run with --fast because during the INSERT test, the postmaster (the PostgreSQL deamon) died and the database was so corrupted that it was impossible to restart postmaster. After this happened twice, we decided to postpone the --fast test until the next PostgreSQL release. The details about the machine we run the benchmark on can be found on the benchmark page.

Before going to the other benchmarks we know of, we would like to give some background on benchmarks.

It’s very easy to write a test that shows any database to be the best database in the world, by just restricting the test to something the database is very good at and not testing anything that the database is not good at. If one, after doing this, summarises the result as a single figure, things are even easier.

This would be like us measuring the speed of MySQL server compared to PostgreSQL by looking at the summary time of the MySQL benchmarks on our web page. Based on this MySQL server would be more than 40 times faster than PostgreSQL, something that is, of course, not true. We could make things even worse by just taking the test where PostgreSQL performs worst and claim that MySQL server is more than 2000 times faster than PostgreSQL.

The case is that MySQL does a lot of optimisations that PostgreSQL doesn’t do. This is, of course, also true the other way around. An SQL optimiser is a very complex thing, and a company could spend years just making the optimiser faster and faster.

When looking at the benchmark results you should look for things that you do in your application and just use these results to decide which database would be best suited for your application. The benchmark results also show things a particular database is not good at and should give you a notion about things to avoid and what you may have to do in other ways.

We know of two benchmark tests that claim that PostgreSQL performs better than MySQL server. These both where multi-user tests, a test that we here at MySQL AB haven’t had time to write and include in the benchmark suite, mainly because it’s a big task to do this in a manner that is fair to all databases.

One is the benchmark paid for by Great Bridge, the company that for 16 months attempted to build a business based on PostgreSQL but now has ceased operations. This is probably the worst benchmark we have ever seen anyone conduct. This was not only tuned to only test what PostgreSQL is absolutely best at, but it was also totally unfair to every other database involved in the test.

Note: We know that even some of the main PostgreSQL developers did not like the way Great Bridge conducted the benchmark, so we don’t blame the PostgreSQL team for the way the benchmark was done.

This benchmark has been condemned in a lot of postings and newsgroups, so here we will just briefly repeat some things that were wrong with it.

  • The tests were run with an expensive commercial tool that makes it impossible for an open source company like us to verify the benchmarks, or even check how the benchmarks were really done. The tool is not even a true benchmark tool, but an application/setup testing tool. To refer to this as a “standard” benchmark tool is to stretch the truth a long way.

  • Great Bridge admitted that they had optimised the PostgreSQL database (with VACUUM before the test) and tuned the startup for the tests, something they hadn’t done for any of the other databases involved. They say “This process optimises indexes and frees up disk space a bit. The optimised indexes boost performance by some margin.” Our benchmarks clearly indicate that the difference in running a lot of selects on a database with and without VACUUM can easily differ by a factor of 10.

  • The test results were also strange. The AS3AP test documentation mentions that the test does “selections, simple joins, projections, aggregates, one-tuple updates, and bulk updates.”

    PostgreSQL is good at doing SELECTs and JOINs (especially after a VACUUM), but doesn’t perform as well on INSERTs or UPDATEs. The benchmarks seem to indicate that only SELECTs were done (or very few updates). This could easily explain the good results for PostgreSQL in this test. The bad results for MySQL will be obvious a bit down in this document.

  • They did run the so-called benchmark from a Windows machine against a Linux machine over ODBC, a setup that no normal database user would ever do when running a heavy multi-user application. This tested more the ODBC driver and the Windows protocol used between the clients than the database itself.

  • When running the database against Oracle and MS-SQL (Great Bridge has indirectly indicated the databases they used in the test), they didn’t use the native protocol but instead ODBC. Anyone that has ever used Oracle knows that all real applications use the native interface instead of ODBC. Doing a test through ODBC and claiming that the results had anything to do with using the database in a real-world situation can’t be regarded as fair. They should have done two tests with and without ODBC to provide the right facts (after having gotten experts to tune all involved databases, of course).

  • They refer to the TPC-C tests, but they don’t mention anywhere that the test they did was not a true TPC-C test and they were not even allowed to call it a TPC-C test. A TPC-C test can only be conducted by the rules approved by the TPC Council (http://www.tpc.org/). Great Bridge didn’t do that. By doing this they have both violated the TPC trademark and miscredited their own benchmarks. The rules set by the TPC Council are very strict to ensure that no one can produce false results or make unprovable statements. Apparently Great Bridge wasn’t interested in doing this.

  • After the first test, we contacted Great Bridge and mentioned to them some of the obvious mistakes they had done with MySQL server:

    • Running with a debug version of our ODBC driver

    • Running on a Linux system that wasn’t optimised for threads

    • Using an old MySQL version when there was a recommended newer one available

    • Not starting MySQL server with the right options for heavy multi-user use (the default installation of MySQL server is tuned for minimal resource use)

    Great Bridge did run a new test, with our optimised ODBC driver and with better startup options for MySQL server, but refused to either use our updated glibc library or our standard binary (used by 80% of our users), which was statically linked with a fixed glibc library.

    According to what we know, Great Bridge did nothing to ensure that the other databases were set up correctly to run well in their test environment. We are sure, however, that they didn’t contact Oracle or Microsoft to ask for their advice in this matter. ;)

  • The benchmark was paid for by Great Bridge, and they decided to publish only partial, chosen results (instead of publishing it all).

Tim Perdue, a long-time PostgreSQL fan and a reluctant MySQL user, published a comparison on PHPbuilder (http://www.phpbuilder.com/columns/tim20001112.php3).

When we became aware of the comparison, we phoned Tim Perdue about this because there were a lot of strange things in his results. For example, he claimed that MySQL server had a problem with five users in his tests, when we know that there are users with similar machines as his that are using MySQL server with 2000 simultaneous connections doing 400 queries per second. (In this case the limit was the web bandwidth, not the database.)

It sounded like he was using a Linux kernel that either had some problems with many threads, such as kernels before 2.4, which had a problem with many threads on multi-CPU machines. We have documented in this manual how to fix this and Tim should be aware of this problem.

The other possible problem could have been an old glibc library and that Tim didn’t use a MySQL binary from our site, which is linked with a corrected glibc library, but had compiled a version of his own. In any of these cases, the symptom would have been exactly what Tim had measured.

We asked Tim if we could get access to his data so that we could repeat the benchmark and if he could check the MySQL version on the machine to find out what was wrong and he promised to come back to us about this. He has not done that yet.

Because of this we can’t put any trust in this benchmark either. :(

Over time things also change and the preceding benchmarks are not that relevant anymore. MySQL server now has a couple of different table handlers with different speed/concurrency tradeoffs. See Chapter 7. It would be interesting to see how the above tests would run with the different transactional table types in MySQL server. PostgreSQL has, of course, also got new features since the test was made. As these tests are not publicly available there is no way for us to know how the database would perform in the same tests today.

Conclusion:

The only benchmarks that exist today that anyone can download and run against MySQL server and PostgreSQL are the MySQL benchmarks. We here at MySQL AB believe that open source databases should be tested with open source tools! This is the only way to ensure that no one does tests that nobody can reproduce and use this to claim that one database is better than another. Without knowing all the facts it’s impossible to answer the claims of the tester.

The thing we find strange is that every test we have seen about PostgreSQL, that is impossible to reproduce, claims that PostgreSQL is better in most cases while our tests, which anyone can reproduce, clearly show otherwise. With this we don’t want to say that PostgreSQL isn’t good at many things (it is!) or that it isn’t faster than MySQL server under certain conditions. We would just like to see a fair test where PostgreSQL performs very well, so that we could get some friendly competition going!

For more information about our benchmark suite, see Section 5.1.4.

We are working on an even better benchmark suite, including multi-user tests, and a better documentation of what the individual tests really do and how to add more tests to the suite.

Get MySQL Reference Manual now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.