1
votes

Consider the following table and its indexes:

CREATE TABLE 'Interaction' (
  'oid' bigint(20) NOT NULL,
  'archieved' datetime DEFAULT NULL,
  'content' longtext COLLATE utf8_bin,
  'contentSentiment' int(11) DEFAULT NULL,
  'createdAt' datetime DEFAULT NULL,
  'id' varchar(255) COLLATE utf8_bin DEFAULT NULL,
  'interactionSource' longtext COLLATE utf8_bin,
  'link' varchar(255) COLLATE utf8_bin DEFAULT NULL,
  'source' varchar(255) COLLATE utf8_bin DEFAULT NULL,
  'title' varchar(255) COLLATE utf8_bin DEFAULT NULL,
  'type' int(11) DEFAULT NULL,
  'authorKloutScore' int(11) DEFAULT NULL,
  PRIMARY KEY ('oid'),
  KEY 'createdAt' ('createdAt'),
  KEY 'fullMonitorFeedSearch' ('criteria_oid','createdAt','authorKloutScore','archieved','type')
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin

Why is the following query slow to run if there is an index (fullMonitorFeedSearch) which covers it? BTW, if the 'interactio0_.TYPE = 2' is removed, the sql runs in 0.01 secs.

SELECT interactio0_.oid 
FROM   Interaction interactio0_ 
WHERE  interactio0_.criteria_oid = 21751021
    AND interactio0_.createdat = 10
    AND interactio0_.archieved IS NULL
    AND interactio0_.TYPE = 2
ORDER  BY interactio0_.createdat DESC 

This is the explain for the sql:

+----+-------------+--------------+-------+--------------------------------------------------------------------------+-----------------------+---------+------+---------+-------------+
| id | select_type | table        | type  | possible_keys                                                            | key                   | key_len | ref  | rows    | Extra       |
+----+-------------+--------------+-------+--------------------------------------------------------------------------+-----------------------+---------+------+---------+-------------+
|  1 | SIMPLE      | interactio0_ | range | FKD15475F24AA96F7,createdAt,fullMonitorFeedSearch                        | fullMonitorFeedSearch | 18      | NULL | 2323027 | Using where |
+----+-------------+--------------+-------+--------------------------------------------------------------------------+-----------------------+---------+------+---------+-------------+
2
Consider re-arranging the order of the index, by putting the least unique value first (perhaps TYPE, in your case).D.N.
What do you mean by 'least unique'?rreyes1979
In other words, the columns most likely to have repeated values. A column such as "TYPE" probably has a very limited number of values, whereas a datetime field will be quite varied. This will cut the number of records down quickly without a large index lookup. The optimizer should do the trick, but also consider rearranging the query itself to match.D.N.

2 Answers

1
votes

It seems MySQL might not be able to fully use the index because the columns in the index are different from the columns used in the WHERE and also not in the same order. Try removing the column authorKloutScore from the index:

fullMonitorFeedSearch (criteria_oid, type, createdAt, archieved)

Then modify the query as:

SELECT interactio0_.oid
FROM   Interaction interactio0_
WHERE  interactio0_.criteria_oid = 21751021
    AND interactio0_.type = 2
    AND interactio0_.createdat = 10
    AND interactio0_.archieved IS NULL
ORDER BY interactio0_.createdat DESC;

A few additional suggestions/concerns:

  1. If the type column is supposed to contain values like 1, 2, 3..., I think it makes sense to re-declare it as TINYINT UNSIGNED. That should allow you to store values in the range of 0 to 255.

  2. Similar suggestion for authorKloutScore and contentSentiment columns to re-declare as TINYINT UNSIGNED or SMALLINT UNSIGNED depending on what values the column might contain.

  3. The index and the query mentions a column criteria_oid but the column is missing from the table definition. I'm not sure if it is a typo or the column doesn't exist.

  4. The createdAt column is DATETIME but the corresponding WHERE predicate in the query doesn't make much sense - interactio0_.createdat = 10. Shouldn't it be a date or date-time value in the right side. Or is it that the column is intended to store only specific data (day, month or hour)?

  5. One particular condition in the query reads interactio0_.criteria_oid = 21751021. As I had mentioned above, the column is missing in the table definition. However, the idea here is that if that column is UNIQUE, it makes more sense to remove all other WHERE conditions - select the record where criteria_oid = 21751021 and do the other checks in the resultset in PHP. However, if the column is not UNIQUE and there could be many rows with the same creative_oid, then the query is just fine.

Hope the above helps!

0
votes

It could be a problem with cardinality; What happens if you create a specific index for Type separately ?

Also, consider your data scenarios, what's more likely - that a createdat will be specific or that archieved is null ?

One of the primary considerings when optimising indexes is consider how common the data is; That is to say, if you're indexing on 4 fields and you can select a count, grouping by those 4 fields, if you're getting lots and lots of records then your index is wrong.