Evil MySQL Subquery and High Server Load

20 Apr 2011

For a long time I’d been seeing high server load on the server MyWapBlog.com is hosted. I thought the increasing traffic was the cause. MySQL, I knew was the causing the load, but I thought it was all natural for a dynamic site with NO caching.

One day, just when I was seriously thinking about implementing some kind of cache, I found about mtop – small tool like top to display running MySQL queries in real-time, running it I found that every second there were many queries (same query but made by different requests) that got stuck in the “PREPARING” state and sometimes took as long as 2 secs. to complete. The evil query was:

SELECT c.post_id FROM category_post_relationship c 
  WHERE  (c.cat_id IN 
  (SELECT c2.id AS c2__id FROM categories c2 
  WHERE (c2.user_id =  ?)))

Thinking what it does? Let me make it easier – it fetches “post ids” of all the posts of a user that are categorized.

1. Running the query

184 rows, Query took 0.0812 sec

2. Profiling:

   starting                0.000132
   checking permissions    0.000011
   checking permissions    0.000011
   Opening tables          0.000037
   System lock             0.000015
   Table lock              0.000022
   init                    0.000065
   optimizing              0.000025
   statistics              0.000027
   preparing               0.000027
   executing               0.000011
   Sending data            0.000047
   optimizing              0.000023
   statistics              0.000086
   preparing               0.075275
   end                     0.000015
   query end               0.000010
   freeing items           0.000033
   logging slow query      0.000009
   cleaning up             0.000011

3. EXPLAIN:

id 	    select_type         table       type 	       possible_keys 	    key         key_len    ref 	    row     Extra
1 	    PRIMARY 	        c 	    index 	       NULL 	            PRIMARY 	8 	   NULL     53898   Using where; Using index
2 	    DEPENDENT SUBQUERY 	c2 	    unique_subquery    PRIMARY,user_id 	    PRIMARY 	4 	   func     1 	    Using where

If you’d ask some novice to do what the query does many of them would use two separate queries and believe me (I’ve done the testing as well) that’ll be much faster!

Googling “subquery optimization” got me this:

MySQL evaluates queries “from outside to inside.” That is, it first obtains the value of the outer expression outer_expr, and then runs the subquery and captures the rows that it produces.

From http://dev.mysql.com/doc/refman/5.1/en/in-subquery-optimization.html

The page also tells us some tips on how to optimize subqueries, going by the suggestions we get the following query:

SELECT c.post_id FROM category_post_relationship c 
  WHERE  EXISTS 
  (SELECT 1 FROM categories c2 
  WHERE c2.user_id = 7639 AND c.cat_id =  c2.id)

Still, running it takes similar query times and as such doesn’t help.

Again from the same page:

After the conversion, MySQL can use the pushed-down equality to limit the number of rows that it must examine when evaluating the subquery.

Though I didn’t read the whole page thoroughly (I’m lazy and since I got the job done by some other technique) still I’m sure that this “optimized” query only “optimizes” the subquery while our biggest problem is that the outer table (with about 50000 rows) is getting evaluated.

I felt, this was one query you’d rather not “optimize” and be happy with two separate queries.

But, this is not to say it can’t be optimized, it can be very easily – by NOT using subqueries at all. I used JOINS:

SELECT cc.post_id FROM categories c
  RIGHT JOIN category_post_relationship cc
  ON  c.id = cc.cat_id<br>
  WHERE  c.user_id = ?

Now the query takes 0.0008 sec compared to 0.0812, that’s a hundred fold improvement!

Some tips:

If you’re using subqueries and are having some problems with it being slow read the article I mentioned.
If possible, use WHERE condition on the outer query.
Don’t use subqueries when the outer query table contains lots of rows. (I may be wrong but this is what I’ve found out)

29 people like this post.

Unlike

Tags: mysql, optimization, subquery

Arvind Gupta

Evil MySQL Subquery and High Server Load

About Me

Categories

Archives

Photos