3
votes

A covering index is a special case of an index in InnoDB where all required fields for a query are included in the index, as mentioned in this blog https://blog.toadworld.com/2017/04/06/speed-up-your-queries-using-the-covering-index-in-mysql.

But, I encountered a situation that the covering index is not used when SELECT and WHERE only includes indexed columns or primary key.

MySQL version: 5.7.27

Example table:

mysql> SHOW CREATE TABLE employees.employees\G;
*************************** 1. row ***************************
       Table: employees
Create Table: CREATE TABLE `employees` (
  `emp_no` int(11) NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`),
  KEY `first_name_last_name` (`first_name`,`last_name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Rows: 300024

Indexes:

mysql> SHOW INDEX FROM employees.employees;
+-----------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name             | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| employees |          0 | PRIMARY              |            1 | emp_no      | A         |      299379 |     NULL | NULL   |      | BTREE      |         |               |
| employees |          1 | first_name_last_name |            1 | first_name  | A         |        1242 |     NULL | NULL   |      | BTREE      |         |               |
| employees |          1 | first_name_last_name |            2 | last_name   | A         |      276690 |     NULL | NULL   |      | BTREE      |         |               |
+-----------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

mysql> EXPLAIN SELECT first_name, last_name FROM employees.employees WHERE emp_no < '10010';
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table     | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | employees | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |    9 |   100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

As can be seen, the first_name and last_name in the SELECT clause are indexed columns, and the emp_no in the WHERE clause is primary key. But, the execution plan shows that the result rows is retrieved from primary index tree.

In my opinion, it should scan the secondary index tree, and filter results by emp_no < '10010', in which the covering index is used.

Edit

Besides, I have seen the covering index is used in the same situation under MySQL 5.7.21.

Indexes: enter image description here

Rows:8204

SQL:

explain select poi_id , ctime from another_table where id < 1000;

Result: enter image description here

1

1 Answers

3
votes

You have 2 indices, a primary key (clustered index) on emp_no and a secondary (non-clustered) index on first_name_last_name.

This is how these indices look like:

enter image description here

Now when you run the following query:

SELECT first_name, last_name FROM employees.employees WHERE emp_no < '10010';

The SQL optimizer needs to find all the records with emp_ne < 10010. Your first_name_last_name index does not help finding records with emp_no smaller than 10010, because it does not hold this information.

So SQL optimizer would search your clustered index to find all the employees with the required employee number, there is not reason to get the first name and last name from the secondary index because SQL optimizer has already found this information.

Now if you change the query to:

SELECT * FROM employees.employees WHERE first_name = 'john';

Then the SQL optimizer would use your secondary (non-clustered) index to find the records, because it is the easiest way to narrow down the search result.

Note:

If you run the following query:

SELECT * FROM employees.employees WHERE last_name = 'smith';

Your secondary index would not be used, because your secondary index is a composite index containing first_name and last_name... since the index is sorted by first_name then by last_name it won't be useful for a search query on last_name. In this case, SQL optimizer would scan your entire table to find the records with last_name = 'smith'


Update

Think of it as an index at the end of a book. Imagine you have a tourist guide book for Brazil... it has an index of all restaurants and another index of all hotels in Brazil.

Restaurant Index

  • Restaurant 1: is mentioned on page 12 and 77 of Brazil guide book
  • Restaurant 2: is mentioned on page 33 of Brazil guide book
  • ...

Hotel Index

  • Hotel 1: is mentioned on page 5 of Brazil guide book
  • Hotel 2: is mentioned on page 33 and 39 of Brazil guide book
  • ...

Now if you want to search the book and find all pages that mention the city of Rio De Janeiro, neither of these indices are useful. Unless the book has a third index on city names, you would have to scan the whole book to find those pages.