数据库 [已解决] select count (*) from table_name 为什么没有使用主键索引?

xiaoronglv · 2015年07月08日 · 最后由 ibugs 回复于 2015年08月21日 · 10841 次阅读

今天在测试索引的时候碰到一个十分诡异的问题，百思不得其解。

表结构

主键索引
token 的索引 devices_token
token 的前置索引 devices_token_prefix_index

我创建了一个只有两个字段、三个索引的测试表。

CREATE TABLE `devices` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `token` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `devices_token` (`token`),
  KEY `devices_token_prefix_index` (`token`(15))
) ENGINE=InnoDB AUTO_INCREMENT=67828774 DEFAULT CHARSET=utf8;

插入了 6000 万条测试数据后，表的大小为 8G。

诡异事件

我想看一下数据库已经插入了多少条记录，就做了一个简单的查询

select count(*) from devices

结果整整花了 9s 的时间（这种简单的查询如果使用 primary key 的话，应该只有 0.5ms 之内）

我百思不得其解，explain 一下，结果发现这个查询语句压根没有使用 primary key，而是使用了前置索引 devices_token_prefix_index。MySQL 的这种行为怎么这么古怪？

4 个赞

无引用文章

MySQL 慢查询的特征表现及优化方式

serco #0 2015年07月08日

count(*) 会使用 mysql 认为最快的 index，你可以用 force index (primary) 比较一下，我不认为你会得到更快的结果，这是 innodb 不是 myisam

zgm #1 2015年07月08日

使用 primary key 能降到 0.5ms???

hooopo #2 2015年07月08日

Innodb count(*) 都需要扫全索引，只不过是使用了覆盖索引，就是说你查询的列从索引里就可以返回结果了，不需要根据索引再次 IO 读取。

其实 MySQL 默认选择的应该是最快的，你 force index (primary) 可能更慢。

xiaoronglv #3 2015年07月08日

感谢 @zgm @serco 以及炮哥的解答。 :plus1:

我以前彻底的混淆了这两种查询的处理流程，他们两个查询时的策略完全不一样的。

query 1

--  ms 级别
-- token 已经加了索引
select count(*) from devices where token="xxx"

query 2
```
select count(*) from devices
```

Query1

在 Query1 中，mysql 会考虑 devices_token, devices_token_prefix_index 两个索引，最终选择了 selectivity 最高的索引 devices_token。

这个查询是 ms 级别的。

Query2

在 query 2 中，MySQL optimizer 会考虑使用以下三个索引：

primary key 是 clustered index，是所有索引中最大的一个。在这个查询中，无论如何也不会用它。（largest）
device_token 是次大（larger）
device_token_prefix_index 是（large）

device_token_prefix_index 体积最小，所以会优先用它。

结论

在 InnoDB 存储引擎中，primary key 是 clustered index（最大的索引），使用它来处理 select count(*) from table_name 最慢。

新的疑问

但这样牵扯出一个新的疑问：

除了 constant table，所有的 select count(*) from table_name 都是一种低效的查询，是这个样子吗？

@zgm @serco @hooopo

2 个赞

hooopo #4 2015年07月08日

是啊，如果你的表足够大了，count(*) 单纯从 sql 语句上是没发再优化的。

zgm #5 2015年07月08日

你选择了 innodb 的话，是这样子的，除非自己想办法优化。

nine #6 2015年07月08日

把主键拉出来和一个 int 字段做一个联合索引，速度会很快。（只把主键索引的没试过/忘了，转 PG 了）

zlx_star #7 2015年07月09日

#4 楼 @xiaoronglv 我觉得有一天，数据库会直接返回 many 或者 1000+ 如果你真的想知道具体的数目，再加参数。

ibugs #8 2015年08月21日

@xiaoronglv

mysql> select SQL_NO_CACHE count(id) from g_d_outs\G;
*************************** 1. row ***************************
count(id): 69008543
1 row in set (10.76 sec)

ERROR: 
No query specified

如果你想得到近似的统计值，使用下面的这个方式查找 Rows 值，这种方式是非常快的。

mysql> show table status where name = 'g_d_outs'\G;
*************************** 1. row ***************************
           Name: g_d_outs
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 69008796
 Avg_row_length: 71
    Data_length: 4965007360
Max_data_length: 0
   Index_length: 3560964096
      Data_free: 5242880
 Auto_increment: 138022949
    Create_time: 2015-08-14 18:46:13
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: 
        Comment: ??????????
1 row in set (0.01 sec)

或者使用 EXPLAIN 其中 rows 的返回值，也是一个近似值。

mysql> explain select id from g_d_outs;
+----+-------------+-----------------+-------+---------------+------------------------+---------+------+----------+-------------+
| id | select_type | table           | type  | possible_keys | key                    | key_len | ref  | rows     | Extra       |
+----+-------------+-----------------+-------+---------------+------------------------+---------+------+----------+-------------+
|  1 | SIMPLE      | g_d_outs | index | NULL          | idx_g_d_stdate | 3       | NULL | 69008796 | Using index |
+----+-------------+-----------------+-------+---------------+------------------------+---------+------+----------+-------------+
1 row in set (0.00 sec)

1 个赞

ibugs 在 MySQL 慢查询的特征表现及优化方式提及了此话题。 10月12日 17:40

需要登录后方可回复, 如果你还没有账号请注册新账号

4 个赞

共收到 9 条回复

收到新回复，点击立即加载