分享 关于 MySQL 前置索引

ibugs · July 21, 2015 · Last by msl12 replied at March 29, 2017 · 2797 hits

一、前置索引的好处

使用前置索引的好处

Normally, the entire value of a column is used to build the index - this is fine for short data types (integers and the like) but can result in a lot of data in the index for longer data types (CHAR and VARCHAR, for example). Using an index prefix allows you to make a trade off between the space required for the index and the cardinality of the index.

二、前置索引的坑

第一种情况

  • 索引的长度和对应字段中的长度一致(name 与 idx_of_name)
| areas | CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(50) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_of_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select id, name from areas order by name;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | areas | index | NULL          | idx_of_name | 152     | NULL | 3144 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select * from areas where name = '龙潭区';
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
| id | select_type | table | type | possible_keys | key         | key_len | ref   | rows | Extra                 |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
|  1 | SIMPLE      | areas | ref  | idx_of_name   | idx_of_name | 152     | const |    1 | Using index condition |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
1 row in set (0.01 sec)

mysql>  explain select id, name from areas group by name;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | areas | index | idx_of_name   | idx_of_name | 152     | NULL | 3144 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.01 sec)

第二种情况:

  • 索引的长度和对应字段中的长度不一致(name 与 idx_of_name)
| areas | CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(50) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_of_name` (`name`(6))
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select id, name from areas order by name;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | areas | ALL  | NULL          | NULL | NULL    | NULL | 3144 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.01 sec)

mysql> explain select * from areas where name = '龙潭区';
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key         | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
|  1 | SIMPLE      | areas | ref  | idx_of_name   | idx_of_name | 20      | const |    1 | Using where |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
1 row in set (0.01 sec)

mysql> explain select id, name from areas group by name;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | SIMPLE      | areas | ALL  | NULL          | NULL | NULL    | NULL | 3144 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
1 row in set (0.01 sec)

虽然在查询上用到了索引,但是在排序时,索引不生效。

使用索引就一定能加快查询效率么?不一定。上面的例子就告诉我们,前置索引 并不是一个万能药,它的确可以帮助我们在一个过长的字段中建立索引。但同时也会导致排序(order by, group by)查询都无法使用前置索引。

三、如何计算出前置索引的最佳长度

关于如何计算出,最恰当的设置索引的长度,总体思想:

# 全列选择性
SELECT COUNT(DISTINCT column_name) / COUNT(*) FROM table_name;

# 测试某一长度前缀的选择性
SELECT COUNT(DISTINCT LEFT(column_name, prefix_length)) / COUNT(*) FROM table_name;

当前置的选择性越接近全列的选择性的时候,索引效果越好。

通常可以索引开始的几个字符,而不是全部值,以节约空间并取得好的性能。这使索引需要的空间变小,但是也会降低选择性。索引选择性不重复的索引值和表中所有行的比值。高选择性的索引有好处,因为它使MySQL在查找匹配的时候可以过滤更多的行。唯一索引的选择率为1,为最佳值

具体操作如下:

CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(50) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select count(distinct(name))/count(id) from areas;
+---------------------------------+
| count(distinct(name))/count(id) |
+---------------------------------+
|                          0.8954 |
+---------------------------------+
1 row in set (0.01 sec)

mysql> select count(distinct left(name, 2))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 2))/count(id) |
+-----------------------------------------+
|                                  0.8648 |
+-----------------------------------------+
1 row in set (0.00 sec)

mysql> select count(distinct left(name, 3))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 3))/count(id) |
+-----------------------------------------+
|                                  0.8909 |
+-----------------------------------------+
1 row in set (0.01 sec)

mysql> select count(distinct left(name, 5))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 5))/count(id) |
+-----------------------------------------+
|                                  0.8941 |
+-----------------------------------------+
1 row in set (0.01 sec)

mysql> select count(distinct left(name, 6))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 6))/count(id) |
+-----------------------------------------+
|                                  0.8954 |
+-----------------------------------------+
1 row in set (0.00 sec)

前置索引为6时,选择性和列值选择性相同。那么就设定该索引的长度为6位。

mysql> select max(length(name)) from areas;
+-------------------+
| max(length(name)) |
+-------------------+
|                45 |
+-------------------+
1 row in set (0.11 sec)

mysql> select * from areas where length(name) = 45;
+------+--------+-----------------------------------------------+--------+
| id   | areaid | name                                          | cityid |
+------+--------+-----------------------------------------------+--------+
| 2624 | 530925 | 双江拉祜族佤族布朗族傣族自治县                | 530900 |
| 2965 | 622927 | 积石山保安族东乡族撒拉族自治县                | 622900 |
+------+--------+-----------------------------------------------+--------+
2 rows in set (0.01 sec)


四、索引的最大长度

mysql> select 255 * 3 from dual;
+---------+
| 255 * 3 |
+---------+
|     765 |
+---------+
1 row in set (0.00 sec)

mysql> show create table areas;

CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(255) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_of_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |

mysql> alter table areas change `name` `name` varchar(256) not null;
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
mysql> alter table areas drop index `idx_of_name`;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table areas change `name` `name` varchar(256) not null;
Query OK, 3144 rows affected (0.06 sec)
Records: 3144  Duplicates: 0  Warnings: 0

mysql> show create table areas;

CREATE TABLE `areas` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `areaid` varchar(20) NOT NULL,
  `name` varchar(256) NOT NULL,
  `cityid` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8

mysql> alter table areas add index `idx_of_name` (name);
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes


添加索引时,如果不指定索引的长度,MySQL 默认会选择该字段的长度作为索引长度。其实这里

alter table areas add index idx_of_name (name) 等同于 alter table areas add index idx_of_name (name(256))

MySQL 索引最大 bytes 为 767, 255 * 3 < 767 但是 256 * 3 > 767 所以,varchar(255) 时候,创建默认索引可以成功,但是 varchar(256) 时,创建默认索引就不成功。

并且由上述列子可知,索引和字段相互影响,当索引设置为 idx_of_name (name(255))时,要改变字段的长度,也会提示 Specified key was too long 错误。

总结

根据具体的业务需求来选择索引,并不是索引都会加快查询速度。

参考
我的简书
错误使用 MySQL 前缀索引导致的慢查询
MySQL 前缀索引
mysql 前缀索引
mysql 省市区邮政编码和区号级联 sql 文件

rails里面如何声明前缀索引呀?

You need to Sign in before reply, if you don't have an account, please Sign up first.