Improve MySQL Insert Performance

written by Kevin van Zonneveld on 31 Mar 2009
http://kvz.io/blog/2009/03/31/improve-mysql-insert-performance

Sometimes MySQL needs to work hard. I've been working on an import script that fires a lot of INSERTs. Normally our database server handles 1,000 inserts / sec. That wasn't enough. So I went looking for methods to improve the speed of MySQL inserts and was finally able to increase this number to 28,000 inserts per second. Checkout my late night benchmarking adventures.

I'm going to show you the result of 3 approaches that I tried to boost the speed of 'bulk' queries:

This article focusses on the InnoDB storage engine.

Delayed Insert

MySQL has an INSERT DELAYED feature. Despite the name this is actually meant to speedup your queries ; ) And from what I understand it does a very good job.

Unfortunately it only works with MyISAM, MEMORY, ARCHIVE, and BLACKHOLE tables.

That rules out my favorite storage engine of the moment: InnoDB.

So where to turn?

Transaction

A Transaction basically combines multiple queries in 1 'package'. If 1 query in this package fails: you can 'cancel' all the queries within that package also.

So that provides additional** integrity** to your relational data because if record A could not be deleted but depends on record B which could be deleted, you have a broken dependency in your database and that corruption could have easily been avoided using a Transaction.

Let me show you how easy a transaction really is in basic PHP/SQL** terms:

<?php
mysql_query("START TRANSACTION");
mysql_query("INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')");
?>

<?php
mysql_query("INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')");
?>

<?php
mysql_query("INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')");
?>

<?php
mysql_query("COMMIT"); // Or "ROLLBACK" if you changed your mind
?>

OK moving on : )

Transaction performance - The Theory

I showed you the integrity gain. That's reason enough to 'go Transactional' right now. But as an added bonus, Transactions could also be used for performance gain. How?

But when your queries are wrapped inside a Transaction, the table does not get re-indexed until after this entire bulk is processed. Saving a lot of work.

Bulk processing will be the key to performance gain.

Bench results

So far the theory. Now let's benchmark this. What does it gain us in** queries per second (qps)** terms:



As you can see

I tried some other forms of transaction (showed in a graph below) but none of them really hit the jackpot.

OK so Transactions are good to protect your data, and in theory can have performance gain, but I was unable to produce that.

Clearly this wasn't the performance boost I was hoping for.

Moving on.

Load Data - The Mother Load

MySQL has a very powerful way of processing bulks of data called LOAD DATA INFILE. The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.

Bench results

In the following graph I tried to inserts different sized bulks of inserts using different methods. I recorded & calculated in how much time each query could be executed. I use the total time necessary for the entire operation, and divide that by the number of queries. So what you see is really what you get.

OK enough with these so-called facts ; ) Back the the excitement :D

At 10,000 records I was able to get a performance gain of 2,124.09%



As you can see

The next step will make your buffer 1000% bigger and it will only give you an additional performance gain of 4%.

So if you have a heavy-duty MySQL job that currently takes 1 hour to run, this approach could make it run within 3 minutes! Enjoy the remaining 57 minutes of your hour! :D

Load Data Quirks

Of course there is a price to pay for this performance win. Before the data is loaded, The data-file must be:

Finally

This is my first second benchmark so if you have some pointers that could improve my next: I'm listening.


In the original author's page, he did include a some PHP code for handling importing CSV in MySQL by using LOAD DATA method. You may visit his page if interested.