本篇内容介绍了“MySQL为什么有时候会选错索引”的有关知识,在实际案例的操作过程中,不少人都会遇到这样的困境,接下来就让小编带领大家学习一下如何处理这些情况吧!希望大家仔细阅读,能够学有所成!
今天在生产环境中看到一个慢SQL,是个核心业务表,数据1300万+
看一下表索引:
mysql>show index from `order`
+-----------------+----------------------+--------------------+------------------------+-----------------------+---------------------+-----------------------+--------------------+------------------+----------------+----------------------+-------------------+-------------------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+----------------------+--------------------+------------------------+-----------------------+---------------------+-----------------------+--------------------+------------------+----------------+----------------------+-------------------+-------------------------+
| order | 0 | PRIMARY | 1 | id | A | 10493505 | | | | BTREE | | |
| order | 0 | uidx_order | 1 | order_seq | A | 10512924 | | | | BTREE | | |
| order | 1 | idx_user | 1 | user_id | A | 1995181 | | | YES | BTREE | | |
| order | 1 | idx_shop | 1 | shop_id | A | 53933 | | | YES | BTREE | | |
| order | 1 | idx_out_channel | 1 | out_channel | A | 524 | | | YES | BTREE | | |
| order | 1 | idx_out_channel | 2 | out_order_no | A | 10512924 | | | YES | BTREE | | |
| order | 1 | idx_order_time | 1 | order_time | A | 9867734 | | | | BTREE | | |
| order | 1 | idx_update_time | 1 | update_time | A | 8305698 | | | | BTREE | | |
| order | 1 | idx_create_time | 1 | create_time | A | 9951390 | | | | BTREE | | |
+-----------------+----------------------+--------------------+------------------------+-----------------------+---------------------+-----------------------+--------------------+------------------+----------------+----------------------+-------------------+-------------------------+
返回行数:[9],耗时:4 ms.
mysql>SELECT id,order_seq,user_id
FROM
`ORDER`
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+---------------------+----------------------------------+
| id | order_seq | user_id |
+--------------+---------------------+----------------------------------+
| 6068129 | 20161128183300861 | d4b0c318b28a46968718dddbaf4775c0 |
| 6118611 | 20161206171509550 | d4b0c318b28a46968718dddbaf4775c0 |
| 6885081 | 20170427104933189 | d4b0c318b28a46968718dddbaf4775c0 |
| 7720299 | 2017101718252243 | d4b0c318b28a46968718dddbaf4775c0 |
| 10319613 | 201905281103186182 | d4b0c318b28a46968718dddbaf4775c0 |
| 505498 | 2019082116584284235 | d4b0c318b28a46968718dddbaf4775c0 |
| 10840144 | 1119082315041792571 | d4b0c318b28a46968718dddbaf4775c0 |
+--------------+---------------------+----------------------------------+
返回行数:[7],耗时:18534 ms.
耗时18s,这个查询速度肯定是不能接受的。
我们看一下执行计划:
mysql>EXPLAIN SELECT id,order_seq,user_id
FROM
`ORDER`
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+----------------+-------------------+---------------+----------------+--------------------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+----------------+-------------------+---------------+----------------+--------------------+-----------------+
| 1 | SIMPLE | ORDER | | index | idx_user | idx_order_time | 5 | | 2705 | 0.01 | Using where |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+----------------+-------------------+---------------+----------------+--------------------+-----------------+
执行计划中看到,这个SQL走索引idx_order_time,根据经验判断,此索引效率很差。而扫描行数为2705,慢日志显示扫描行数为13,347,074,二者相差甚远,那么为什么会出现如此大的差异呢?
选择索引是优化器的工作。而优化器选择索引的目的,是找一个最优的执行方案,并用最小的代价去执行语句。在数据库里面,扫描行数是影响执行代价的因素之一。扫描的行数越少,意味着访问磁盘数据的次数越少,消耗的CPU资源越少。
当然,扫描行数并不是唯一的判断标准,优化器还会结合是否使用临时表,是否排序等因素进行综合判断。扫描行数是怎么判断的?
MySQL在真正执行SQL之前,并不能准确的判断满足这个条件的数据有多少行,只能按统计信息来估算行数。
索引的统计信息就是索引的“区分度”,一个索引不同的值越多,这个索引的区分度就越好,而一个索引上不同的值的个数,我们称之为“基数”,基数越大,索引的区分度越好。
若强制使用idx_user索引,看下执行情况:
mysql>SELECT id,order_seq,user_id
FROM
`ORDER` force index(idx_user)
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+---------------------+----------------------------------+
| id | order_seq | user_id |
+--------------+---------------------+----------------------------------+
| 6068129 | 20161128183300861 | d4b0c318b28a46968718dddbaf4775c0 |
| 6118611 | 20161206171509550 | d4b0c318b28a46968718dddbaf4775c0 |
| 6885081 | 20170427104933189 | d4b0c318b28a46968718dddbaf4775c0 |
| 7720299 | 2017101718252243 | d4b0c318b28a46968718dddbaf4775c0 |
| 10319613 | 201905281103186182 | d4b0c318b28a46968718dddbaf4775c0 |
| 505498 | 2019082116584284235 | d4b0c318b28a46968718dddbaf4775c0 |
| 10840144 | 1119082315041792571 | d4b0c318b28a46968718dddbaf4775c0 |
+--------------+---------------------+----------------------------------+
查询速度还是很快的,看一下执行计划:
mysql>explain SELECT id,order_seq,user_id
FROM
`ORDER` force index(idx_user)
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
| 1 | SIMPLE | ORDER | | ref | idx_user | idx_user | 163 | const | 77706 | 1 | Using index condition; Using where; Using filesort |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
返回行数:[1],耗时:4 ms.
如果换成数据行数少一些的user_id
mysql>EXPLAIN SELECT id,order_seq,user_id
FROM
`ORDER`
WHERE
delete_flag = 0
AND user_id = '1e41c833fc6f4f57b490a4627a4170dc'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
| 1 | SIMPLE | ORDER | | ref | idx_user | idx_user | 163 | const | 13 | 1 | Using index condition; Using where; Using filesort |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
返回行数:[1],耗时:4 ms.
mysql>SELECT id,order_seq,user_id
FROM
`ORDER`
WHERE
delete_flag = 0
AND user_id = '1e41c833fc6f4f57b490a4627a4170dc'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+---------------------+----------------------------------+
| id | order_seq | user_id |
+--------------+---------------------+----------------------------------+
| 10397123 | 2019092523044218361 | 1e41c833fc6f4f57b490a4627a4170dc |
+--------------+---------------------+----------------------------------+
返回行数:[1],耗时:4 ms.
对比一下两个user_id对应的数据量:
返回行数:[1],耗时:4 ms.
mysql>select count(*) from order01 where user_id='1e41c833fc6f4f57b490a4627a4170dc'
+--------------------+
| count(*) |
+--------------------+
| 15 |
+--------------------+
返回行数:[1],耗时:4 ms.
mysql>select count(*) from order01 where user_id='d4b0c318b28a46968718dddbaf4775c0'
+--------------------+
| count(*) |
+--------------------+
| 38611 |
+--------------------+
返回行数:[1],耗时:14 ms.
总结:在此业务场景中,MySQL优化器认为检索38000行数据然后进行排序要比检索15行数据排序代价大得多,所以选择了有序的索引idx_order_time,但未必是最快的执行计划。
但是,此处还有一个疑问,如果对于user_id:d4b0c318b28a46968718dddbaf4775c0,不使用limit分页,执行计划是什么样呢?
mysql>SELECT *
FROM
ORDER01
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time desc
+--------------+---------------------+----------------------+----------------------+----------------------------------+-----------------------+---------------------+----------------------------------+---------------------+----------------------+---------------------+-------------------------------+-------------------------+---------------------------+-----------------------------------+-----------------------+-------------------------+-------------------+-----------------------+------------------------+--------------------------+----------------------+---------------------+--------------------------+------------------------+------------------------+---------------------+------------------+-----------------------+-----------------------+-----------------------+
| id | order_seq | order_type | order_flag | user_id | user_mobile | user_nick | shop_id | shop_name | pay_status | pay_time | receiver_address_id | receiver_name | receiver_mobile | receiver_address | cancel_time | cancel_reason | channel | out_channel | out_order_no | out_store_name | order_time | over_time | display_status | order_status | sale_channel | sale_mode | remark | delete_flag | create_time | update_time |
+--------------+---------------------+----------------------+----------------------+----------------------------------+-----------------------+---------------------+----------------------------------+---------------------+----------------------+---------------------+-------------------------------+-------------------------+---------------------------+-----------------------------------+-----------------------+-------------------------+-------------------+-----------------------+------------------------+--------------------------+----------------------+---------------------+--------------------------+------------------------+------------------------+---------------------+------------------+-----------------------+-----------------------+-----------------------+
| 11153421 | 201911091339555506 | 1 | 1 | d4b0c318b28a46968718dddbaf4775c0 | 13718903545 | 总部-客服-补单 | 29e541d6da9b4aae8957409ca03c6670 | 清悠 | 1 | 2019-11-09 13:40:10 | 2666265 | 总部-客服-补单 | 13718903545 | 东城区 王府井 王府井 总部补单 | | 0 | | 0 | 201911091339555506 | | 2019-11-09 13:39:55 | | 2 | 200 | 1 | 1 | | 0 | 2019-11-09 13:39:55 | 2019-11-09 13:40:10 |
| 7720299 | 2017101718252243 | 1 | 1 | d4b0c318b28a46968718dddbaf4775c0 | 13718903545 | 总部-客服-补单 | ad41dba7bf5c4b69b03e0222878cb2b0 | 蝶舞 | 1 | 2017-10-17 18:25:26 | 2282099 | 总部-客服-补单 | 13718903545 | 2号线; 地铁7号线 华强北 总部补单 | | 0 | | 0 | 2017101718252243 | | 2017-10-17 18:25:22 | | 2 | 200 | 1 | 1 | | 0 | 2017-10-17 18:25:22 | 2017-10-17 18:25:22 |
| 6885081 | 20170427104933189 | 1 | 1 | d4b0c318b28a46968718dddbaf4775c0 | 13718903545 | 总部-客服-补单 | c6092260f92643098f7f56e68560d8c0 | 木兰花 | 1 | 2017-04-27 10:49:39 | 2264946 | 总部-客服-补单 | 13718903545 | 天河北商圈 | | 0 | | 0 | 20170427104933189 | | 2017-04-27 10:49:33 | | 2 | 200 | 1 | 1 | | 0 | 2017-04-27 10:49:33 | 2017-04-27 10:49:33 |
| 6118611 | 20161206171509550 | 1 | 1 | d4b0c318b28a46968718dddbaf4775c0 | 13718903545 | 总部-客服-补单 | 7a0cd4d60f52423fb757b0be1ab55be6 | 娟子 | 1 | 2016-12-06 17:15:12 | 1904075 | 总部-客服-补单 | 13718903545 | 广东省深圳市南山区深南大道 科技园 | | 0 | helijia | 0 | 20161206171509550 | | 2016-12-06 17:15:09 | | 2 | 200 | 1 | 1 | | 0 | 2016-12-06 17:15:09 | 2016-12-06 17:15:09 |
| 6068129 | 20161128183300861 | 1 | 1 | d4b0c318b28a46968718dddbaf4775c0 | 13718903545 | 总部-客服-补单 | f6f4612493654695ac4c6bac6df67672 | 美天 | 1 | 2016-11-28 18:33:03 | 1544109 | 总部-客服-补单 | 13718903545 | 青羊区金河路口宽窄巷子 宽窄巷子 | | 0 | helijia | 0 | 20161128183300861 | | 2016-11-28 18:33:00 | | 2 | 200 | 1 | 1 | | 0 | 2016-11-28 18:33:00 | 2016-11-28 18:33:00 |
+--------------+---------------------+----------------------+----------------------+----------------------------------+-----------------------+---------------------+----------------------------------+---------------------+----------------------+---------------------+-------------------------------+-------------------------+---------------------------+-----------------------------------+-----------------------+-------------------------+-------------------+-----------------------+------------------------+--------------------------+----------------------+---------------------+--------------------------+------------------------+------------------------+---------------------+------------------+-----------------------+-----------------------+-----------------------+
返回行数:[5],耗时:152 ms.
mysql>explain SELECT *
FROM
ORDER01
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time desc
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
| 1 | SIMPLE | ORDER01 | | ref | idx_user | idx_user | 163 | const | 75800 | 1 | Using index condition; Using where; Using filesort |
+--------------+-----------------------+-----------------+----------------------+----------------+-------------------------+---------------+-------------------+---------------+----------------+--------------------+----------------------------------------------------+
返回行数:[1],耗时:4 ms.
查询速度很快,执行计划走了user_id字段的索引。为什么会出现这样的情况呢?
查阅了相关资料,对于order by limit这样的排序,当检索到的数据较多的时候,排序消耗是很大的,这个时候由于优化器选择了有序的idx_order_time而导致执行索引选择错误。
优化办法:
1、强制使用索引idx_user;
2、创建组合索引idx_uid_ordertime(user_id,order_time)
mysql>alter table `ORDER` add index idx_uid_ordertime(user_id,order_time)
执行成功,耗时:60334 ms.
mysql>SELECT id,order_seq,user_id
FROM
`ORDER`
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+---------------------+----------------------------------+
| id | order_seq | user_id |
+--------------+---------------------+----------------------------------+
| 6068129 | 20161128183300861 | d4b0c318b28a46968718dddbaf4775c0 |
| 6118611 | 20161206171509550 | d4b0c318b28a46968718dddbaf4775c0 |
| 6885081 | 20170427104933189 | d4b0c318b28a46968718dddbaf4775c0 |
| 7720299 | 2017101718252243 | d4b0c318b28a46968718dddbaf4775c0 |
| 10319613 | 201905281103186182 | d4b0c318b28a46968718dddbaf4775c0 |
| 505498 | 2019082116584284235 | d4b0c318b28a46968718dddbaf4775c0 |
| 10840144 | 1119082315041792571 | d4b0c318b28a46968718dddbaf4775c0 |
+--------------+---------------------+----------------------------------+
返回行数:[7],耗时:86 ms.
mysql>explain SELECT id,order_seq,user_id
FROM
`ORDER`
WHERE
delete_flag = 0
AND user_id = 'd4b0c318b28a46968718dddbaf4775c0'
AND display_status = 2
ORDER BY
order_time asc
LIMIT 0,20
+--------------+-----------------------+-----------------+----------------------+----------------+----------------------------+-------------------+-------------------+---------------+----------------+--------------------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+--------------+-----------------------+-----------------+----------------------+----------------+----------------------------+-------------------+-------------------+---------------+----------------+--------------------+------------------------------------+
| 1 | SIMPLE | ORDER | | ref | idx_user,idx_uid_ordertime | idx_uid_ordertime | 163 | const | 72772 | 1 | Using index condition; Using where |
+--------------+-----------------------+-----------------+----------------------+----------------+----------------------------+-------------------+-------------------+---------------+----------------+--------------------+------------------------------------+
返回行数:[1],耗时:4 ms.
“MySQL为什么有时候会选错索引”的内容就介绍到这里了,感谢大家的阅读。如果想了解更多行业相关的知识可以关注亿速云网站,小编将为大家输出更多高质量的实用文章!