RTFM

[Read This Fine Material] from Joshua Hoblitt

Rewriting MySQL queries to join on derived tables

| 0 comments

When you IN(), ANY(), etc. in a MySQL query the optimizer is unable to do anything intelligent and you effectively end up with a cross product query.  The results of rewriting this query that looks for duplicate entries (so they can be removed and a UNIQUE() index installed) to join against a derived table is a pretty impressive improvement.

 

mysql> explain select so_id, ext_id, dir_id from storage_object where dir_id in (select dir_id from (select dir_id, dirname, parent_id, count(*) from directory group by dirname, parent_id  having count(*) > 1 ) as foo);
+----+--------------------+----------------+-------+---------------+-------------+---------+------+----------+----------------------------------------------+
| id | select_type        | table          | type  | possible_keys | key         | key_len | ref  | rows     | Extra                                        |
+----+--------------------+----------------+-------+---------------+-------------+---------+------+----------+----------------------------------------------+
|  1 | PRIMARY            | storage_object | ALL   | NULL          | NULL        | NULL    | NULL | 29105384 | Using where                                  |
|  2 | DEPENDENT SUBQUERY |      | ALL   | NULL          | NULL        | NULL    | NULL |     1612 | Using where                                  |
|  3 | DERIVED            | directory      | index | NULL          | parent_id_2 | 263     | NULL |   174729 | Using index; Using temporary; Using filesort |
+----+--------------------+----------------+-------+---------------+-------------+---------+------+----------+----------------------------------------------+
3 rows in set (3.62 sec)
mysql> explain select so.so_id, so.ext_id, so.dir_id from storage_object as so join (select dir_id, dirname, parent_id, count(*) from directory group by dirname, parent_id  having count(*) > 1) as foo on so.dir_id = foo.dir_id;
+----+-------------+------------+-------+---------------+-------------+---------+------------+--------+----------------------------------------------+
| id | select_type | table      | type  | possible_keys | key         | key_len | ref        | rows   | Extra                                        |
+----+-------------+------------+-------+---------------+-------------+---------+------------+--------+----------------------------------------------+
|  1 | PRIMARY     |  | ALL   | NULL          | NULL        | NULL    | NULL       |   1612 |                                              |
|  1 | PRIMARY     | so         | ref   | dir_id        | dir_id      | 8       | foo.dir_id |    918 |                                              |
|  2 | DERIVED     | directory  | index | NULL          | parent_id_2 | 263     | NULL       | 174729 | Using index; Using temporary; Using filesort |
+----+-------------+------------+-------+---------------+-------------+---------+------------+--------+----------------------------------------------+
3 rows in set (3.95 sec)

 

Leave a Reply