疑问:
MySQL 建立 (a,b) 的索引,理论上,只有 a, (a,b) 两种情况能用到索引,但是其他情况也能用到索引么?下面就为你一一测试
数据库版本
/usr/local/bin/mysql Ver 14.14 Distrib 5.6.16, for osx10.9 (x86_64) using EditLine wrapper
CREATE TABLE `cd_happy_for_ni_deals` (
`id` int(11) NOT NULL DEFAULT '0',
`update_time` datetime DEFAULT NULL COMMENT '更新时间',
`publish_status` int(11) NOT NULL DEFAULT '4' COMMENT '发布状态',
KEY `idx_of_publish_status_update_time` (`publish_status`,`update_time`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
mysql> select count(distinct(update_time)) from cd_happy_for_ni_deals;
+------------------------------+
| count(distinct(update_time)) |
+------------------------------+
| 1845933 |
+------------------------------+
1 row in set (4.68 sec)
mysql> select count(distinct(publish_status)) from cd_happy_for_ni_deals;
+---------------------------------+
| count(distinct(publish_status)) |
+---------------------------------+
| 2 |
+---------------------------------+
1 row in set (1.76 sec)
mysql> select count(id) from cd_happy_for_ni_deals;
+-----------+
| count(id) |
+-----------+
| 1907609 |
+-----------+
1 row in set (0.00 sec)
update_time 的选择性:1845933 / 1907609.to_f = 0.9676684268107353 接近 1
publish_status 的选择性:2 / 1907609.to_f = 1.0484328811617055e-06 接近 0
不走寻常路,我就偏选择 选择性低的做索引的第一位。
创建索引
mysql> alter table cd_happy_for_ni_deals add index `idx_of_publish_status_update_time` (`publish_status`, `update_time`, `id`);
Query OK, 0 rows affected (14.69 sec)
Records: 0 Duplicates: 0 Warnings: 0
根据 a 查询
mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where publish_status = 4 \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: cd_happy_for_ni_deals
type: ref
possible_keys: idx_of_publish_status_update_time
key: idx_of_publish_status_update_time
key_len: 4
ref: const
rows: 964056 <- 只查询publish_status 的情况
Extra: Using index
1 row in set (0.00 sec)
查询时间:
mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where publish_status = 4 \G;
*************************** 1. row ***************************
count(id): 1858081
1 row in set (0.69 sec)
理论上可以用到索引 (a,b) 中的 a 部分。
根据 b 查询
mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: cd_happy_for_ni_deals
type: index
possible_keys: NULL
key: idx_of_publish_status_update_time
key_len: 17
ref: NULL
rows: 1928113 <- 只查询update_time 的情况
Extra: Using where; Using index
1 row in set (0.01 sec)
查询时间:
mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
count(id): 1
1 row in set (1.06 sec)
查询 b 的时候,理论上用不到索引的。为啥这里???
根据 (a,b) 查询
mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where publish_status = 4 and update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: cd_happy_for_ni_deals
type: ref
possible_keys: idx_of_publish_status_update_time
key: idx_of_publish_status_update_time
key_len: 13
ref: const,const
rows: 1
Extra: Using where; Using index
1 row in set (0.01 sec)
查询时间:
mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where publish_status = 4 and update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
count(id): 1
1 row in set (0.00 sec)
符合理论上的预期。
根据 (b,a) 查询
mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where update_time = '2014-05-17 23:00:48' and publish_status = 4 \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: cd_happy_for_ni_deals
type: ref
possible_keys: idx_of_publish_status_update_time
key: idx_of_publish_status_update_time
key_len: 13
ref: const,const
rows: 1
Extra: Using where; Using index
1 row in set (0.00 sec)
查询时间:
mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where update_time = '2014-05-17 23:00:48' and publish_status = 4 \G;
*************************** 1. row ***************************
count(id): 1
1 row in set (0.00 sec)
理论上,这里只能用到(a,b)中的 a 部分,为啥也这么快??
欢迎群里的小伙伴热烈讨论
暂时的结论:
1、理论上索引对顺序是敏感的,但是由于 MySQL 的查询优化器会自动调整 where 子句的条件顺序以使用适合的索引。
2、将选择性高的列放在索引的最前列。根据场景的不同,这条经验法则并不是完全准确的。在某些场景下,可能需要根据运行频率最高的查询来调整索引列的顺序。
参考