Hematemesis: 10 scenarios of index failure!-headlines-php.cn

Today we will continue the topic of databases and further talk about issues related to indexing, because indexing is a public topic that everyone is more concerned about, and there are indeed many pitfalls.

I don’t know if you have encountered the following two situations in actual work:

It is obvious that an index is added to a certain field, but in fact did not take effect.
The index sometimes takes effect, sometimes it does not.

Today I will talk to you about 10 scenarios of mysql database index failure, which will provide a reference for friends who have been in the pit before or are about to step into the pit. Picture

Hematemesis: 10 scenarios of index failure!

#1. Preparation work

The so-called empty words are unfounded. If I directly throw out these scenes of index failure, it may There is no convincing.

So, I decided to build tables and data, and demonstrate the effects step by step, trying to be as reasonable as possible.

I believe that if you read this article patiently, you will gain a lot.

1.1 Create user table

Create a user table, which contains: id, code, age, name and height fields.

CREATE TABLE `user` (
  `id` int NOT NULL AUTO_INCREMENT,
  `code` varchar(20) COLLATE utf8mb4_bin DEFAULT NULL,
  `age` int DEFAULT &#39;0&#39;,
  `name` varchar(30) COLLATE utf8mb4_bin DEFAULT NULL,
  `height` int DEFAULT &#39;0&#39;,
  `address` varchar(30) COLLATE utf8mb4_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_code_age_name` (`code`,`age`,`name`),
  KEY `idx_height` (`height`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin

Copy after login

In addition, three indexes were created:

id：数据库的主键
idx_code_age_name：由code、age和name三个字段组成的联合索引。
idx_height：普通索引

Copy after login

1.2 Insert data

In order to facilitate the demonstration for everyone, I specially added the user table to Insert 3 pieces of data:

INSERT INTO sue.user (id, code, age, name, height) VALUES (1, &#39;101&#39;, 21, &#39;周星驰&#39;, 175,&#39;香港&#39;);
INSERT INTO sue.user (id, code, age, name, height) VALUES (2, &#39;102&#39;, 18, &#39;周杰伦&#39;, 173,&#39;台湾&#39;);
INSERT INTO sue.user (id, code, age, name, height) VALUES (3, &#39;103&#39;, 23, &#39;苏三&#39;, 174,&#39;成都&#39;);

Copy after login

Stephen Chow and Jay Chou are my idols. I am narcissistic here and put them together with me. Hahaha.

1.3 Check the database version

In order to prevent unnecessary misunderstandings in the future, it is necessary to check the current database version here. To give a conclusion directly without mentioning the version is to be a hooligan, hahaha.

select version();

Copy after login

Find out the current mysql version number is: 8.0.21

1.4 View the execution plan

In mysql, if you want to view a certain Whether the SQL statement uses an index, or whether the established index is invalid, you can use the explain keyword to view the execution plan of the SQL statement to determine the index usage.

For example:

explain select * from user where id=1;

Copy after login

Execution result:

Hematemesis: 10 scenarios of index failure!

As can be seen from the figure, since the id field is the primary key, the sql statement uses Arrived at the primary key index.

Of course, if you want to learn more about the use of the explain keyword, you can read my other article "explain | This peerless sword of index optimization, do you really know how to use it?" 》, which gives a more detailed introduction.

2. The leftmost matching principle is not satisfied

Before, I have built a joint index for the three fields of code, age and name: idx_code_age_name.

The order of the index fields is:

code
age
name

Copy after login

If you do not pay attention to the leftmost prefix principle when using a joint index, it is likely to cause the index to fail. If you don’t believe it, let’s take a look below.

2.1 Under what circumstances is the index valid?

Let’s first look at the circumstances under which indexing can be performed.

explain select * from user
where code=&#39;101&#39;;
explain select * from user
where code=&#39;101&#39; and age=21 
explain select * from user
where code=&#39;101&#39; and age=21 and name=&#39;周星驰&#39;;

Copy after login

Execution results:

Hematemesis: 10 scenarios of index failure!

In the above three cases, SQL can be indexed normally.

In fact, there is a rather special scenario:

explain select * from user
where code = &#39;101&#39;  and name=&#39;周星驰&#39;;

Copy after login

Execution result:

Hematemesis: 10 scenarios of index failure!

The original order of query conditions is: code, age , name, but here only the code and name are broken, and the age field is missing. In this case, the index on the code field can also be used.

Seeing this, I wonder if you are smart enough to find such a pattern: these 4 SQLs all have a code field, which is the first field in the index field, which is the leftmost field. . As long as this field exists, the SQL can already be indexed.

This is what we call the leftmost matching principle.

2.2 Under what circumstances does the index fail?

I have already introduced before that after establishing a joint index, there are situations in which the index is valid in the query conditions.

Next, let’s focus on the circumstances under which the index will fail.

explain select * from user
where age=21;
explain select * from user
where name=&#39;周星驰&#39;;
explain select * from user
where age=21 and name=&#39;周星驰&#39;;

Copy after login

Execution results:

Hematemesis: 10 scenarios of index failure!

It can be seen from the figure that the index does fail in these three cases.

It means that the above three situations do not satisfy the leftmost matching principle. To put it bluntly, it is because the query conditions do not include the leftmost index field of the given field, that is, the field code.

3. Use of select *

It is clearly stated in the "Alibaba Development Manual" that the use of select * is prohibited in query SQL.

So, do you know why?

Without further ado, in accordance with international practice, let’s start with a sql:

explain 
select * from user where name=&#39;苏三&#39;;

Copy after login

Execution result:

Hematemesis: 10 scenarios of index failure!

在该sql中用了select *，从执行结果看，走了全表扫描，没有用到任何索引，查询效率是非常低的。

如果查询的时候，只查我们真正需要的列，而不查所有列，结果会怎么样？

非常快速的将上面的sql改成只查了code和name列，太easy了：

explain 
select code,name from user 
where name=&#39;苏三&#39;;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中执行结果不难看出，该sql语句这次走了全索引扫描，比全表扫描效率更高。

其实这里用到了：覆盖索引。

如果select语句中的查询列，都是索引列，那么这些列被称为覆盖索引。这种情况下，查询的相关字段都能走索引，索引查询效率相对来说更高一些。

而使用select *查询所有列的数据，大概率会查询非索引列的数据，非索引列不会走索引，查询效率非常低。

4. 索引列上有计算

介绍本章节内容前，先跟大家一起回顾一下，根据id查询数据的sql语句：

explain select * from user where id=1;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中可以看出，由于id字段是主键，该sql语句用到了主键索引。

但如果id列上面有计算，比如：

explain select * from user where id+1=2;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从上图中的执行结果，能够非常清楚的看出，该id字段的主键索引，在有计算的情况下失效了。

5. 索引列用了函数

有时候我们在某条sql语句的查询条件中，需要使用函数，比如：截取某个字段的长度。

假如现在有个需求：想查出所有身高是17开头的人，如果sql语句写成这样：

explain select * from user  where height=17;

Copy after login

该sql语句确实用到了普通索引：

Hematemesis: 10 scenarios of index failure!

但该sql语句肯定是有问题的，因为它只能查出身高正好等于17的，但对于174这种情况，它没办法查出来。

为了满足上面的要求，我们需要把sql语句稍稍改造了一下：

explain select * from user  where SUBSTR(height,1,2)=17;

Copy after login

这时需要用到SUBSTR函数，用它截取了height字段的前面两位字符，从第一个字符开始。

执行结果：

Hematemesis: 10 scenarios of index failure!

你有没有发现，在使用该函数之后，该sql语句竟然走了全表扫描，索引失效了。

6. 字段类型不同

在sql语句中因为字段类型不同，而导致索引失效的问题，很容易遇到，可能是我们日常工作中最容易忽略的问题。

到底怎么回事呢？

请大家注意观察一下t_user表中的code字段，它是varchar字符类型的。

在sql语句中查询数据时，查询条件我们可以写成这样：

explain 
select * from user where code="101";

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从上图中看到，该code字段走了索引。

温馨提醒一下，查询字符字段时，用双引号“和单引号'都可以。

但如果你在写sql时，不小心把引号弄掉了，把sql语句变成了：

explain 
select * from user where code=101;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

你会惊奇的发现，该sql语句竟然变成了全表扫描。因为少写了引号，这种小小的失误，竟然让code字段上的索引失效了。

这时你心里可能有一万个为什么，其中有一个肯定是：为什么索引会失效呢？

答：因为code字段的类型是varchar，而传参的类型是int，两种类型不同。

此外，还有一个有趣的现象，如果int类型的height字段，在查询时加了引号条件，却还可以走索引：

explain select * from user 
where height=&#39;175&#39;;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中看出该sql语句确实走了索引。int类型的参数，不管在查询时加没加引号，都能走索引。

这是变魔术吗？这不科学呀。

答：mysql发现如果是int类型字段作为查询条件时，它会自动将该字段的传参进行隐式转换，把字符串转换成int类型。

mysql会把上面列子中的字符串175，转换成数字175，所以仍然能走索引。

接下来，看一个更有趣的sql语句：

select 1 + &#39;1&#39;;

Copy after login

它的执行结果是2，还是11呢？

好吧，不卖关子了，直接公布答案执行结果是2。

mysql自动把字符串1，转换成了int类型的1，然后变成了：1+1=2。

但如果你确实想拼接字符串该怎么办？

答：可以使用concat关键字。

具体拼接sql如下：

select concat(1,&#39;1&#39;);

Copy after login

接下来，关键问题来了：为什么字符串类型的字段，传入了int类型的参数时索引会失效呢？

答：根据mysql官网上解释，字符串'1'、' 1 '、'1a'都能转换成int类型的1，也就是说可能会出现多个字符串，对应一个int类型参数的情况。那么，mysql怎么知道该把int类型的1转换成哪种字符串，用哪个索引快速查值?

感兴趣的小伙伴可以再看看官方文档：https://dev.mysql.com/doc/refman/8.0/en/type-conversion.html

7. like左边包含%

模糊查询，在我们日常的工作中，使用频率还是比较高的。

比如现在有个需求：想查询姓李的同学有哪些?

使用like语句可以很快的实现：

select * from user where name like &#39;李%&#39;;

Copy after login

但如果like用的不好，就可能会出现性能问题，因为有时候它的索引会失效。

不信，我们一起往下看。

目前like查询主要有三种情况：

like &#39;%a&#39;
like &#39;a%&#39;
like &#39;%a%&#39;

Copy after login

假如现在有个需求：想查出所有code是10开头的用户。

这个需求太简单了吧，sql语句如下：

explain select * from user
where code like &#39;10%&#39;;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

图中看出这种%在10右边时走了索引。

而如果把需求改了：想出现出所有code是1结尾的用户。

查询sql语句改为：

explain select * from user
where code like &#39;%1&#39;;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中看出这种%在1左边时，code字段上索引失效了，该sql变成了全表扫描。

此外，如果出现以下sql：

explain select * from user
where code like &#39;%1%&#39;;

Copy after login

该sql语句的索引也会失效。

下面用一句话总结一下规律：当like语句中的%，出现在查询条件的右边时，索引会失效。

那么，为什么会出现这种现象呢？

答：其实很好理解，索引就像字典中的目录。一般目录是按字母或者拼音从小到大，从左到右排序，是有顺序的。

我们在查目录时，通常会先从左边第一个字母进行匹对，如果相同，再匹对左边第二个字母，如果再相同匹对其他的字母，以此类推。

通过这种方式我们能快速锁定一个具体的目录，或者缩小目录的范围。

但如果你硬要跟目录的设计反着来，先从字典目录右边匹配第一个字母，这画面你可以自行脑补一下，你眼中可能只剩下绝望了，哈哈。

8. 列对比

上面的内容都是常规需求，接下来，来点不一样的。

假如我们现在有这样一个需求：过滤出表中某两列值相同的记录。比如user表中id字段和height字段，查询出这两个字段中值相同的记录。

这个需求很简单，sql可以这样写：

explain select * from user 
where id=height

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

意不意外，惊不惊喜？索引失效了。

为什么会出现这种结果？

id字段本身是有主键索引的，同时height字段也建了普通索引的，并且两个字段都是int类型，类型是一样的。

但如果把两个单独建了索引的列，用来做列对比时索引会失效。

感兴趣的朋友可以找我私聊。

9. 使用or关键字

我们平时在写查询sql时，使用or关键字的场景非常多，但如果你稍不注意，就可能让已有的索引失效。

不信一起往下面看。

某天你遇到这样一个需求：想查一下id=1或者height=175的用户。

你三下五除二就把sql写好了：

explain select * from user 
where id=1 or height=&#39;175&#39;;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

没错，这次确实走了索引，恭喜被你蒙对了，因为刚好id和height字段都建了索引。

但接下来的一个夜黑风高的晚上，需求改了：除了前面的查询条件之后，还想加一个address='成都'。

这还不简单，sql走起：

explain select * from user 
where id=1 or height=&#39;175&#39; or address=&#39;成都&#39;;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

结果悲剧了，之前的索引都失效了。

你可能一脸懵逼，为什么？我做了什么？

答：因为你最后加的address字段没有加索引，从而导致其他字段的索引都失效了。

注意：如果使用了or关键字，那么它前面和后面的字段都要加索引，不然所有的索引都会失效，这是一个大坑。

10. not in和not exists

在我们日常工作中用得也比较多的，还有范围查询，常见的有：

in
exists
not in
not exists
between and

Copy after login

今天重点聊聊前面四种。

10.1 in关键字

假如我们想查出height在某些范围之内的用户，这时sql语句可以这样写：

explain select * from user
where height in (173,174,175,176);

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中可以看出，sql语句中用in关键字是走了索引的。

10.2 exists关键字

有时候使用in关键字时性能不好，这时就能用exists关键字优化sql了，该关键字能达到in关键字相同的效果：

explain select * from user  t1
where  exists (select 1 from user t2 where t2.height=173 and t1.id=t2.id)

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中可以看出，用exists关键字同样走了索引。

10.3 not in关键字

上面演示的两个例子是正向的范围，即在某些范围之内。

那么反向的范围，即不在某些范围之内，能走索引不？

话不多说，先看看使用not in的情况：

explain select * from user
where height not in (173,174,175,176);

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

你没看错，索引失效了。

看如果现在需求改了：想查一下id不等于1、2、3的用户有哪些，这时sql语句可以改成这样：

explain select * from user
where id  not in (173,174,175,176);

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

你可能会惊奇的发现，主键字段中使用not in关键字查询数据范围，任然可以走索引。而普通索引字段使用了not in关键字查询数据范围，索引会失效。

10.4 not exists关键字

除此之外，如果sql语句中使用not exists时，索引也会失效。具体sql语句如下：

explain select * from user  t1
where  not exists (select 1 from user t2 where t2.height=173 and t1.id=t2.id)

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中看出sql语句中使用not exists关键后，t1表走了全表扫描，并没有走索引。

11. order by的坑

在sql语句中，对查询结果进行排序是非常常见的需求，一般情况下我们用关键字：order by就能搞定。

但我始终觉得order by挺难用的，它跟where或者limit关键字有很多千丝万缕的联系，一不小心就会出问题。

Let go

11.1 哪些情况走索引？

首先当然要温柔一点，一起看看order by的哪些情况可以走索引。

我之前说过，在code、age和name这3个字段上，已经建了联合索引：idx_code_age_name。

11.1.1 满足最左匹配原则

order by后面的条件，也要遵循联合索引的最左匹配原则。具体有以下sql：

explain select * from user
order by code limit 100;
explain select * from user
order by code,age limit 100;
explain select * from user
order by code,age,name limit 100;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中看出这3条sql都能够正常走索引。

除了遵循最左匹配原则之外，有个非常关键的地方是，后面还是加了limit关键字，如果不加它索引会失效。

11.1.2 配合where一起使用

order by还能配合where一起遵循最左匹配原则。

explain select * from user
where code=&#39;101&#39;
order by age;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

code是联合索引的第一个字段，在where中使用了，而age是联合索引的第二个字段，在order by中接着使用。

假如中间断层了，sql语句变成这样，执行结果会是什么呢？

explain select * from user
where code=&#39;101&#39;
order by name;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

虽说name是联合索引的第三个字段，但根据最左匹配原则，该sql语句依然能走索引，因为最左边的第一个字段code，在where中使用了。只不过order by的时候，排序效率比较低，需要走一次filesort排序罢了。

11.1.3 相同的排序

order by后面如果包含了联合索引的多个排序字段，只要它们的排序规律是相同的（要么同时升序，要么同时降序），也可以走索引。

具体sql如下：

explain select * from user
order by code desc,age desc limit 100;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

该示例中order by后面的code和age字段都用了降序，所以依然走了索引。

11.1.4 两者都有

如果某个联合索引字段，在where和order by中都有，结果会怎么样？

explain select * from user
where code=&#39;101&#39;
order by code, name;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

code字段在where和order by中都有，对于这种情况，从图中的结果看出，还是能走了索引的。

11.2 哪些情况不走索引？

前面介绍的都是正面的用法，是为了让大家更容易接受下面反面的用法。

好了，接下来，重点聊聊order by的哪些情况下不走索引？

11.2.1 没加where或limit

如果order by语句中没有加where或limit关键字，该sql语句将不会走索引。

explain select * from user
order by code, name;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中看出索引真的失效了。

11.2.2 对不同的索引做order by

前面介绍的基本都是联合索引，这一个索引的情况。但如果对多个索引进行order by，结果会怎么样呢？

explain select * from user
order by code, height limit 100;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中看出索引也失效了。

11.2.3 不满足最左匹配原则

前面已经介绍过，order by如果满足最左匹配原则，还是会走索引。下面看看，不满足最左匹配原则的情况：

explain select * from user
order by name limit 100;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

name字段是联合索引的第三个字段，从图中看出如果order by不满足最左匹配原则，确实不会走索引。

11.2.4 不同的排序

前面已经介绍过，如果order by后面有一个联合索引的多个字段，它们具有相同排序规则，那么会走索引。

但如果它们有不同的排序规则呢？

explain select * from user
order by code asc,age desc limit 100;

Copy after login

执行结果：

Hematemesis: 10 scenarios of index failure!

从图中看出，尽管order by后面的code和age字段遵循了最左匹配原则，但由于一个字段是用的升序，另一个字段用的降序，最终会导致索引失效。

好了今天分享的内容就先到这里，我们下期再见。