9
votes

I'm trying to optimize this query:

SELECT `posts`.* FROM `posts` INNER JOIN `posts_tags` 
     ON `posts`.id = `posts_tags`.post_id 
     WHERE  (((`posts_tags`.tag_id = 1))) 
     ORDER BY posts.created_at DESC;

The tables have 38k and 31k rows, and MySQL uses "filesort", so the query gets pretty slow. I tried several different indexes, with no luck.

CREATE TABLE `posts` (
  `id` int(11) NOT NULL auto_increment,
  `created_at` datetime default NULL,
  -- remaining columns (including trashed, published, clan_private) omitted here
  PRIMARY KEY  (`id`),
  KEY `index_posts_on_created_at` (`created_at`),
  KEY `for_tags` (`trashed`,`published`,`clan_private`,`created_at`)
) ENGINE=InnoDB AUTO_INCREMENT=44390 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `posts_tags` (
  `id` int(11) NOT NULL auto_increment,
  `post_id` int(11) default NULL,
  `tag_id` int(11) default NULL,
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  PRIMARY KEY  (`id`),
  KEY `index_posts_tags_on_post_id_and_tag_id` (`post_id`,`tag_id`)
) ENGINE=InnoDB AUTO_INCREMENT=63175 DEFAULT CHARSET=utf8

EXPLAIN output for the query:

+----+-------------+------------+--------+--------------------------+--------------------------+---------+---------------------+-------+-----------------------------------------------------------+
| id | select_type | table      | type   | possible_keys            | key                      | key_len | ref                 | rows  | Extra                                                     |
+----+-------------+------------+--------+--------------------------+--------------------------+---------+---------------------+-------+-----------------------------------------------------------+
|  1 | SIMPLE      | posts_tags | index  | index_post_id_and_tag_id | index_post_id_and_tag_id | 10      | NULL                | 24159 | Using where; Using index; Using temporary; Using filesort | 
|  1 | SIMPLE      | posts      | eq_ref | PRIMARY                  | PRIMARY                  | 4       | .posts_tags.post_id |     1 |                                                           | 
+----+-------------+------------+--------+--------------------------+--------------------------+---------+---------------------+-------+-----------------------------------------------------------+
2 rows in set (0.00 sec)

What kind of index do I need to define so that MySQL avoids the filesort? Is that possible when the ORDER BY column is not in the WHERE clause?

Update: profiling results:

mysql> show profile for query 1;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000027 | 
| checking query cache for query | 0.037953 | 
| Opening tables                 | 0.000028 | 
| System lock                    | 0.010382 | 
| Table lock                     | 0.023894 | 
| init                           | 0.000057 | 
| optimizing                     | 0.010030 | 
| statistics                     | 0.000026 | 
| preparing                      | 0.000018 | 
| Creating tmp table             | 0.128619 | 
| executing                      | 0.000008 | 
| Copying to tmp table           | 1.819463 | 
| Sorting result                 | 0.001092 | 
| Sending data                   | 0.004239 | 
| end                            | 0.000012 | 
| removing tmp table             | 0.000885 | 
| end                            | 0.000006 | 
| end                            | 0.000005 | 
| query end                      | 0.000006 | 
| storing result in query cache  | 0.000005 | 
| freeing items                  | 0.000021 | 
| closing tables                 | 0.000013 | 
| logging slow query             | 0.000004 | 
| cleaning up                    | 0.000006 | 
+--------------------------------+----------+

Update 2:

The real query (a few more boolean fields, more useless indexes):

SELECT `posts`.* FROM `posts` INNER JOIN `posts_tags` 
   ON `posts`.id = `posts_tags`.post_id 
   WHERE ((`posts_tags`.tag_id = 7971)) 
       AND (((posts.trashed = 0) 
       AND (`posts`.`published` = 1 
       AND `posts`.`clan_private` = 0)) 
       AND ((`posts_tags`.tag_id = 7971)))  
   ORDER BY created_at DESC LIMIT 0, 10; 

Empty set (1.25 sec)

Without ORDER BY: 0.01 sec.

+----+-------------+------------+--------+-----------------------------------------+-----------------------+---------+---------------------+-------+--------------------------+
| id | select_type | table      | type   | possible_keys                           | key                   | key_len | ref                 | rows  | Extra                    |
+----+-------------+------------+--------+-----------------------------------------+-----------------------+---------+---------------------+-------+--------------------------+
|  1 | SIMPLE      | posts_tags | index  | index_posts_tags_on_post_id_and_tag_id  | index_posts_tags_...  | 10      | NULL                | 23988 | Using where; Using index | 
|  1 | SIMPLE      | posts      | eq_ref | PRIMARY,index_posts_on_trashed_and_crea | PRIMARY               | 4       | .posts_tags.post_id |     1 | Using where              | 
+----+-------------+------------+--------+-----------------------------------------+-----------------------+---------+---------------------+-------+--------------------------+

SOLUTION

  1. Query updated to "ORDER BY posts_tags.created_at DESC" (two small changes in app code)
  2. Index added: index_posts_tags_on_created_at.

That's all!
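
For readers who want to reproduce the two steps above, here is a minimal sketch. It assumes the new index covers only the single column its name suggests (the post does not list the index columns), and it spells out the real query with the duplicated tag_id condition collapsed.

```sql
-- Assumption: the index covers just created_at, as its name suggests.
CREATE INDEX `index_posts_tags_on_created_at`
  ON `posts_tags` (`created_at`);

-- The real query with the ORDER BY switched to posts_tags.created_at.
SELECT `posts`.* FROM `posts` INNER JOIN `posts_tags`
   ON `posts`.id = `posts_tags`.post_id
   WHERE `posts_tags`.tag_id = 7971
     AND `posts`.trashed = 0
     AND `posts`.`published` = 1
     AND `posts`.`clan_private` = 0
   ORDER BY `posts_tags`.created_at DESC LIMIT 0, 10;
```

Note that this changes what the result is sorted by: rows now come back in the order the tag was attached to the post, not the order the posts were created. The two only match if posts are tagged at creation time.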

3 Answers

3
votes

You would need to denormalize a bit and copy the posts.created_at field into the posts_tags table (I called it post_created_at, but you can name it however you want):

CREATE TABLE `posts_tags` (
  `id` int(11) NOT NULL auto_increment,
  `post_id` int(11) default NULL,
  `tag_id` int(11) default NULL,
  `post_created_at` datetime default NULL,
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  PRIMARY KEY  (`id`),
  KEY `index_posts_tags_on_post_id_and_tag_id` (`post_id`,`tag_id`)
) ENGINE=InnoDB;

and then add an index to posts_tags on

(tag_id, post_created_at)

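That will allow the query to get all the posts for a tag, in the correct order, without filesort: with tag_id fixed by the WHERE clause, the index entries for that tag are already ordered by post_created_at.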
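
As a rough sketch of how this could be applied to an existing table rather than a fresh CREATE TABLE: the statements below add the column, backfill it from posts, create the composite index, and sort by the new column. The index name is hypothetical, and the application would still have to fill post_created_at whenever it inserts a posts_tags row.

```sql
-- Add the denormalized column (column name taken from the answer).
ALTER TABLE `posts_tags`
  ADD COLUMN `post_created_at` datetime DEFAULT NULL;

-- Backfill existing rows from posts.
UPDATE `posts_tags`
  INNER JOIN `posts` ON `posts`.id = `posts_tags`.post_id
SET `posts_tags`.post_created_at = `posts`.created_at;

-- Equality column first, sort column second.
-- The index name is an assumption, not from the original post.
CREATE INDEX `index_posts_tags_on_tag_id_and_post_created_at`
  ON `posts_tags` (`tag_id`, `post_created_at`);

-- The ORDER BY must use the posts_tags column for the index to apply.
SELECT `posts`.* FROM `posts` INNER JOIN `posts_tags`
     ON `posts`.id = `posts_tags`.post_id
     WHERE `posts_tags`.tag_id = 1
     ORDER BY `posts_tags`.post_created_at DESC;
```

Running EXPLAIN on the final query before and after the change is the way to confirm whether "Using temporary; Using filesort" has actually gone away.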

1
votes

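Try changing KEY index_posts_tags_on_post_id_and_tag_id (post_id,tag_id) to KEY index_posts_tags_tag_id (tag_id) and repost the EXPLAIN output. Since tag_id is the second column of the current index, MySQL cannot use it to look up rows by tag_id alone and has to scan the whole index.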

What is the distribution of tag_id values within posts_tags?
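
A possible way to try both suggestions, using the index names from this answer. The LIMIT of 20 in the distribution query is an arbitrary choice for illustration.

```sql
-- Swap the (post_id, tag_id) index for one that leads with tag_id.
ALTER TABLE `posts_tags`
  DROP INDEX `index_posts_tags_on_post_id_and_tag_id`,
  ADD INDEX `index_posts_tags_tag_id` (`tag_id`);

-- Re-check the plan for the original query.
EXPLAIN SELECT `posts`.* FROM `posts` INNER JOIN `posts_tags`
     ON `posts`.id = `posts_tags`.post_id
     WHERE `posts_tags`.tag_id = 1
     ORDER BY `posts`.created_at DESC;

-- Distribution of tags: how many rows each tag_id has, largest first.
SELECT `tag_id`, COUNT(*) AS posts_count
FROM `posts_tags`
GROUP BY `tag_id`
ORDER BY posts_count DESC
LIMIT 20;
```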

0
votes

Your key index_posts_on_created_at is sorted ascending, but you want the results sorted descending.