# The other side of the moon

/bb|[^b]{2}/
Never stop Grokking

## Wednesday, September 30, 2009

A couple of days ago I posted about scaling writes in mysql. I didn't say much about read performance in that post because a) it was irrelevant at the time, and b) there are thousands of articles all over the web that already cover read performance.

In this post I'm going to cover some of the things that I did to improve read performance for my own application. It may be relevant to others, however, you'd still need to read a whole bunch of other articles to understand MySQL read performance.

Looking at our access patterns, it turned out that there were two classes of read queries.
1. Reads to build the daily summaries
2. Reads from the summary tables in response to user actions
The former dealt with far more data at one go, but was only run once. Queries for this pattern were slow depending on how many rows were touched. Small resultsets came back in milliseconds while larger resultsets (some over a million rows) took several minutes to return. The latter pattern dealt with small amounts of data, but happened far more frequently. These queries needed to return in around 10ms.

Also note from my last post that inserts handled 40,000 rows per query, which meant that each query took about 4.5seconds to run.

Now why is all of this relevant? It's relevant because it renders your slow query log mostly useless. My slow query log jumps to about 300GB fairly quickly, so you need log rotation implemented. We can, however turn slow query logging on and off at run time using the slow_query_log global system variables, however since these variables are global, we need to worry about a few things.
1. Make sure you set it back on when your script finishes, even if the script crashes
2. Any other queries run while your slow script is running will not be logged even if they are slow
There's nothing we can do about the latter (nothing I can think of anyway). For the former, I prefer to use a wrapper script that turns off slow query logging, then calls my slow script, and then turns it back on when the the script has terminated (successfully or unsuccessfully). This ensures that my slow query log has mostly queries that I should and can optimise. See the MySQL Slow Query Log documentation for more information on what to do and how to do it.

Now, I mentioned in my previous post that we pushed our boxes up from 4GB RAM to 16GB RAM. This left a bit free after allocating enough to the innodb_buffer_pool. I figured that we could use this to improve read performance. My first thought was to use this for the query cache. All past tests had shown that the query cache improves read performance quite a lot. Unfortunately, these tests assume that you have a single database server or use some kind of affinity to make sure that multiple queries go to the same host.

This is not such a good idea with a multi-box set up for BCP and load balancing. There's also the way in which the query cache keys queries which not every developer understands, and this can lead to unexpected results. I wasn't too concerned about this since I was in control over every query that went into the system, but I may not always be maintaining this system. I decided that the best option was to turn off the query cache and turn it on on a per query basis using the SQL_CACHE directive in my queries. Instead, I use a frontend cache similar to memcached. The guys at the MySQL Performance Blog also have similar recommendations wrt query cache, so go read their blog for a more detailed analysis.

The second thing I did was to create tables with redundant information. I call them cache tables. I store information in there while I'm building the main tables that will eventually speed up creating the summary tables. The data in there is quite simple. For example, I have a table that contains an approximate count of rows of each type that I need to summarise. That way I can schedule summarisation using the Shortest Job First algorithm. The result is that in 50% of the time, 97% of all summaries are done and most users can start using that data. Something else I haven't done yet, but may implement soon is to let the script that summarises data run two instances in parallel, one of each slave, and one running the shortest jobs first while the other runs the longest jobs first, or some similar scheduling algorithm. The ideal result would be if it took 50% of the time to run.

The final optimisation for handling summaries was bulk queries. INSERTs, UPDATEs and SELECTs can all be batched and sometimes this can get you much better performance than running single queries. For INSERTs, I developed a method using ON DUPLICATE KEY UPDATE to INSERT and UPDATE multiple rows at once. The query looks something like this:
 INSERT INTO table (        key_field, f1, f2, f3    ) VALUES (        key1, f11, f21, f31    ), (        key2, f12, f22, f32    ), ...    ON DUPLICATE KEY UPDATE        f1 = IF(key_field=key1, f11, IF(key_field=key2, f12, IF(key_field=key3, f13, ...))),        f2 = IF(key_field=key1, f21, IF(key_field=key2, f22, IF(key_field=key3, f23, ...))),        f3 = IF(key_field=key1, f31, IF(key_field=key2, f32, IF(key_field=key3, f33, ...)))
As you can see the query gets quite complicated as the number of rows grows, but you never write this query by hand. It's generated through code in your language of choice. The only thing you have to worry about is making sure the total query size stays below your max tcp packet size. Also longer queries take longer to parse. I restrict it to about 100 rows per insert/update.

Now, it's quite likely that I need to insert/update far more than 100 rows, which means the query parser needs to run for each batch. To get around this, I use a prepared statement with a heck of a lot of question marks in it. I'll leave it as an excercise for you to figure out what to pass to it. The real trick comes on the last batch. It's unlikely that I'll have an exact multiple of 100 records to be inserted, so the last batch may have fewer than 100 records. I have two choices at this point.
1. Create a new prepared statement with the number of records I need
2. Pad the current statement with extra copies of the last row
Neither method has had any advantage over the other, so I prefer the former since it sends less data over the wire at the cost of one extra query parse.

Bulk selects are far similar. It basically means that if I'm going to have to operate on a bunch of records one at a time, then it's faster to select them all at once, store them in an array and operate on the array rather than selecting them one at a time. This, of course, costs memory, and it is possible to use up all the RAM on the system doing something like this. It's happened several times. With experience you learn where to draw the line for your application.

Now for the user queries, I did not optimise too much. I again went with data partitioning, also by time, but this time by month. Our access patterns showed that most queries were for data in the last one month, so by partitioning the summary tables by month, it meant that we only had to query one or two partitions at any time. The primary key was designed to either return the exact results the user wanted, or narrow the search down to a small set that could be filtered either through a DB scan, or in the application itself. Partitions ensured that in the worst case we'd have to do a full partition scan and not a full table scan.

This is by no means the fastest design. It is optimised to speed up the slowest part of the system, ie, writes, but reads don't quite go out of the window as a result.