类型转换对 MySQL 选择索引的影响

遇到了几例 MySQL 没用使用预期索引的问题，读了些文档之后，发现 MySQL 的类型转换对索引选择的影响还真是一个不大不小的坑。

比如有这样一张 MySQL 表：

CREATE TABLE `indextest` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT NULL,
`age` tinyint(3) unsigned NOT NULL DEFAULT ’0′,
`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_name` (`name`),
KEY `idx_age` (`age`),
KEY `idx_create` (`create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=latin1

name 是一个有索引的 varchar 字段，表内数据是这样的：

+—-+——–+—–+———————+
| id | name | age | create_time |
+—-+——–+—–+———————+
| 1 | hello | 10 | 2012-02-01 20:00:00 |
| 2 | world | 20 | 2012-02-02 20:00:00 |
| 3 | 111222 | 30 | 2012-02-03 20:00:00 |
| 4 | wow | 40 | 2012-02-04 20:00:00 |
+—-+——–+—–+———————+

使用字符串 ’111222′ 作为参数对 name 字段查询，Execution Plan 如预期的一样，会使用 name 字段上的索引 idx_name：

mysql [localhost] {msandbox} (test) > explain select age from
-> indextest where name=’111222′\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: indextest
type: ref
possible_keys: idx_name
key: idx_name
key_len: 13
ref: const
rows: 1
Extra: Using where
1 row in set (0.00 sec)

而使用数字作为参数对 name 字段做查询时，explain 表明这将是全表扫描：

mysql [localhost] {msandbox} (test) > explain select age from
-> indextest where name=111222\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: indextest
type: ALL
possible_keys: idx_name
key: NULL
key_len: NULL
ref: NULL
rows: 4
Extra: Using where
1 row in set (0.00 sec)

究其原因，是当文本字段与数字进行比较时，由于类型不同，MySQL 需要做隐式类型转换才能进行比较，结果就如上面的例子所提到的一样。

MySQL 的文档 (Type Conversion in Expression Evaluation) 中提到，在做比较时，会按这样的规则进行必要的类型转换：

两个参数至少有一个是 NULL 时，比较的结果也是 NULL，例外是使用 <=> 对两个 NULL 做比较时会返回 1，这两种情况都不需要做类型转换
两个参数都是字符串，会按照字符串来比较，不做类型转换
两个参数都是整数，按照整数来比较，不做类型转换
十六进制的值和非数字做比较时，会被当做二进制串，和数字做比较时会按下面的规则处理
有一个参数是 TIMESTAMP 或 DATETIME，并且另外一个参数是常量，常量会被转换为 timestamp
有一个参数是 decimal 类型，如果另外一个参数是 decimal 或者整数，会将整数转换为 decimal 后进行比较，如果另外一个参数是浮点数，则会把 decimal 转换为浮点数进行比较
所有其他情况下，两个参数都会被转换为浮点数再进行比较
比如：

mysql [localhost] {msandbox} (test) > SELECT ’18015376320243459′ =
-> 18015376320243459;
+—————————————–+
| ’18015376320243459′ = 18015376320243459 |
+—————————————–+
| 0 |
+—————————————–+
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (test) > SELECT ’18015376320243459′ + 0;
+————————-+
| ’18015376320243459′ + 0 |
+————————-+
| 1.80153763202435e+16 |
+————————-+
1 row in set (0.00 sec)
mysql [localhost] {msandbox} (test) > SELECT
-> cast(’18015376320243459′ as unsigned) = 18015376320243459;
+———————————————————–+
| cast(’18015376320243459′ as unsigned) = 18015376320243459 |
+———————————————————–+
| 1 |
+———————————————————–+
1 row in set (0.00 sec)

因为浮点数精度(53 bits)问题，并且 MySQL 将字符串转换为浮点数和将整数转换为浮点数使用不同的方法，字符串 ’18015376320243459′ 和整数 18015376320243459 相比较就不相等，如果要避免隐式浮点数转换带来的精度问题，可以显式地使用 cast 做类型转换，将字符串转换为整数。

按照这些规则，对于上面的例子来说，name 字段的值和查询参数 ’111222′ 都会被转换为浮点数才会做比较，而很多文本都能转换为和 111222 相等的数值，比如 ’111222′, ’111222aabb’, ‘ 111222′ 和 ’11122.2e1′，所以 MySQL 不能有效使用索引，就退化为索引扫描甚至是全表扫描。

而反过来，如果使用一个字符串作为查询参数，对一个数字字段做比较查询，MySQL 则是可以有效利用索引的：

mysql [localhost] {msandbox} (test) > explain select name from
-> indextest where age=’30′\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: indextest
type: ref
possible_keys: idx_age
key: idx_age
key_len: 1
ref: const
rows: 1
Extra:
1 row in set (0.00 sec)

原因则是，MySQL 可以将查询参数 ’30′ 转换为确定的数值 30，之后可以快速地在索引中找到与之相等的数值。

除此之外，使用函数对索引字段做显式类型转换或者计算也会使 MySQL 无法使用索引：

mysql [localhost] {msandbox} (test) > explain select name from
-> indextest where cast(age as unsigned)=30\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: indextest
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 4
Extra: Using where
1 row in set (0.00 sec)

如上，使用 cast 函数对 age 做显式的类型转换，会使索引失效，当然了，在实际的代码中很少会有这样的写法，但类似下面这样对时间字段做运算的用法就比较多了：

mysql [localhost] {msandbox} (test) > explain select * from
-> indextest where date(create_time)=’2012-02-02′\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: indextest
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 4
Extra: Using where
1 row in set (0.00 sec)

对于本例的需求，是想查找 create_time 是 2012-02-02 这一天的记录，用变通的方法，避免在索引字段上做运算就可以有效使用索引了：

mysql [localhost] {msandbox} (test) > explain select * from
-> indextest where create_time between ’2012-02-02′ and ’2012-02-03′\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: indextest
type: range
possible_keys: idx_create
key: idx_create
key_len: 4
ref: NULL
rows: 1
Extra: Using where
1 row in set (0.00 sec)

MySQL 的 How … 系列文档值得读一读，比如：

伟大开源软件的文档总是需要经过反复阅读，才能逐步被理解和正确运用，RTFM 和 RTFS 的光辉无限！

觉得文章有用？立即：和朋友一起 共学习共进步！

OurMySQL

类型转换对 MySQL 选择索引的影响

猜您喜欢

文章评论： “类型转换对 MySQL 选择索引的影响”

发表评论取消回复

猜您喜欢

文章评论： “类型转换对 MySQL 选择索引的影响”

发表评论 取消回复

发表评论取消回复