I have been proficiently developing CRUD code with Ctrl+C and Ctrl+V for many years.
Why is a MySQL query slow? This problem comes up often in real development, and it is also a frequently asked interview question.
When we run into it, we usually assume the index is to blame.
But besides indexes, what other factors can make database queries slow?
And what can we do to improve MySQL's query performance?
In today's article, we will walk through the scenarios that make database queries slow, along with the reasons and the solutions.
First, let's look at the stages a query statement goes through.
For example, suppose we have this table:

CREATE TABLE `user` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `name` varchar(100) NOT NULL DEFAULT '' COMMENT 'name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'age',
  `gender` int(8) NOT NULL DEFAULT '0' COMMENT 'gender',
  PRIMARY KEY (`id`),
  KEY `idx_age` (`age`),
  KEY `idx_gender` (`gender`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The application code we usually write (in Go, C, or the like) acts as the client here.
Under the hood, the client takes the username and password and establishes a long-lived TCP connection to MySQL.
MySQL's connection management module manages this connection.
Once connected, the client executes a query SQL statement, for example:
select * from user where gender = 1 and age = 100;
The client sends this SQL statement to MySQL over the network.
After MySQL receives the SQL statement, the parser first checks it for syntax errors. For example, if select loses an l and is written as slect, MySQL reports the error "You have an error in your SQL syntax". A fat-fingered typist like me knows that error message all too well.
Next comes the optimizer, which decides, according to certain rules, which index the query should use.
After that, the executor calls the storage engine's interface functions.
The storage engine is a component-like, pluggable layer: it is where MySQL actually fetches rows of data and returns them. Because it is replaceable, you can use MyISAM, which does not support transactions, or swap in InnoDB, which does. The engine is specified when the table is created, for example:
CREATE TABLE `user` ( ... ) ENGINE=InnoDB;
The most commonly used engine today is InnoDB, so let's focus on it.
In InnoDB, because operating on the disk directly would be slow, a layer of memory called the buffer pool is added to speed things up. It holds many 16KB memory pages: some hold the row-by-row data you see in tables, others hold index information.
When the query reaches InnoDB, the index chosen earlier by the optimizer determines which index pages to read; if they are not in the buffer pool, they are loaded from disk. The index pages then speed up the lookup of the specific data pages that hold the rows; if those data pages are not in the buffer pool either, they too are loaded from disk.
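As a mental model (an illustration only, not InnoDB's actual implementation), the buffer pool behaves like a page cache sitting in front of the disk: serve the page from memory when possible, and fall back to a simulated disk read on a miss.

```go
package main

import "fmt"

const pageSize = 16 * 1024 // InnoDB pages are 16KB

// bufferPool caches pages in memory, keyed by page number.
type bufferPool struct {
	pages        map[int][]byte
	disk         map[int][]byte // stands in for the data files on disk
	hits, misses int
}

func (bp *bufferPool) getPage(no int) []byte {
	if p, ok := bp.pages[no]; ok { // fast path: page already in memory
		bp.hits++
		return p
	}
	bp.misses++ // slow path: simulated disk I/O
	p := bp.disk[no]
	bp.pages[no] = p // cache it for subsequent queries
	return p
}

func main() {
	bp := &bufferPool{
		pages: map[int][]byte{},
		disk:  map[int][]byte{1: make([]byte, pageSize)},
	}
	bp.getPage(1) // miss: loaded from "disk"
	bp.getPage(1) // hit: served from memory
	fmt.Println(bp.hits, bp.misses) // 1 1
}
```

The second read of the same page never touches the disk, which is exactly why a larger buffer pool makes repeated queries faster.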
In this way we get the rows of data we want, and the result is finally returned to the client.
So how do we find out where a query's time actually goes? MySQL's profiling feature can tell us. Turn it on like this:
mysql> set profiling=ON;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show variables like 'profiling';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| profiling     | ON    |
+---------------+-------+
1 row in set (0.00 sec)
Then, after running a few queries, execute show profiles; to see how long each took:
mysql> show profiles;
+----------+------------+---------------------------------------------------+
| Query_ID | Duration   | Query                                             |
+----------+------------+---------------------------------------------------+
|        1 | 0.06811025 | select * from user where age>=60                  |
|        2 | 0.00151375 | select * from user where gender = 2 and age = 80  |
|        3 | 0.00230425 | select * from user where gender = 2 and age = 60  |
|        4 | 0.00070400 | select * from user where gender = 2 and age = 100 |
|        5 | 0.07797650 | select * from user where age!=60                  |
+----------+------------+---------------------------------------------------+
5 rows in set, 1 warning (0.00 sec)
Each statement is assigned a query_id; for example, select * from user where age>=60 has query_id 1. To see exactly where that statement's time went, execute the following command:
mysql> show profile for query 1;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000074 |
| checking permissions | 0.000010 |
| Opening tables       | 0.000034 |
| init                 | 0.000032 |
| System lock          | 0.000027 |
| optimizing           | 0.000020 |
| statistics           | 0.000058 |
| preparing            | 0.000018 |
| executing            | 0.000013 |
| Sending data         | 0.067701 |
| end                  | 0.000021 |
| query end            | 0.000015 |
| closing tables       | 0.000014 |
| freeing items        | 0.000047 |
| cleaning up          | 0.000027 |
+----------------------+----------+
15 rows in set, 1 warning (0.00 sec)
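As a quick sanity check on output like this, you can total the stage durations and see which stage dominates; here is a small Go sketch using a few of the numbers above:

```go
package main

import "fmt"

// dominantStage returns the profiling stage with the largest duration
// and its share of the total query time, as a percentage.
func dominantStage(stages map[string]float64) (string, float64) {
	var total, max float64
	var name string
	for s, d := range stages {
		total += d
		if d > max {
			max, name = d, s
		}
	}
	return name, max / total * 100
}

func main() {
	// A few stages from the `show profile for query 1` output above.
	stages := map[string]float64{
		"starting":     0.000074,
		"optimizing":   0.000020,
		"Sending data": 0.067701,
		"end":          0.000021,
	}
	name, pct := dominantStage(stages)
	fmt.Printf("%s: %.1f%%\n", name, pct) // Sending data: 99.8%
}
```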
Sending data is the stage where the executor starts fetching data and sending it to the client. Because my table has tens of thousands of eligible rows, this stage takes by far the most time, which is expected. Under normal circumstances, most of a query's time during development is spent in the Sending data stage, and when this stage is slow, index-related reasons are the first thing to suspect.

Index-related reasons

Index-related issues can generally be analyzed with the explain command. Through it you can see which indexes are used, roughly how many rows will be scanned, and other information.

MySQL decides which index to use in the optimizer phase, picking whichever it expects to make the query fastest. It generally weighs factors such as:
- how many rows would need to be scanned if this index were chosen (rows);
- how many 16KB pages must be read to fetch those rows;
- whether a table lookup is needed and how much it costs: ordinary (secondary) indexes require going back to the table via the primary key, while the primary key index does not.

Going back to the SQL statement mentioned in show profile, let's use explain select * from user where age>=60 to analyze it.
In the explain output of the statement above, type is ALL, which means a full table scan. possible_keys lists the indexes that might be used; here that is the ordinary index built on age. But the index actually used, shown in the key column, is NULL. In other words, this SQL did not use an index at all — it scanned the whole table.
This is because too many rows in the table satisfy the condition. To use the age index, MySQL would have to read the matching entries from it, and since idx_age is an ordinary (secondary) index, it would then have to go back to the table via the primary key to locate each row's data page. With this many rows, that is simply not as cost-effective as scanning the table directly through the primary key, so MySQL finally chose a full table scan.
Of course, the above is just an example. In practice, when MySQL executes SQL, this kind of thing happens often, and there are many scenarios where an index fails to be used — an inequality in the predicate, an implicit type conversion, and so on. Everyone has these memorized from interview cramming, so I won't belabor them. Instead, let's talk about two problems that are easy to run into in production.
The index used is not the one you expected
This kind of problem is easy to solve: specify the index explicitly with force index. The explain output then shows that, once force index is added, the SQL does use the idx_age index.

The query is still very slow even with an index
The first cause is an index whose selectivity is too low. For example, suppose the full URL of each web page is used as the index: at a glance the values all share the same domain name, so if the prefix index length is also set too short, the indexed prefixes are nearly identical and the query can degenerate into something close to a full table scan. The right approach is to make the index more selective — for example, strip the domain name and index only the URI portion that follows it.
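To choose a sensible prefix length, you can measure how many distinct values each candidate length yields; here is a rough Go sketch (the sample URLs are invented for illustration):

```go
package main

import "fmt"

// distinctPrefixes counts how many distinct values remain when only the
// first n characters of each key are indexed - a proxy for selectivity.
func distinctPrefixes(keys []string, n int) int {
	seen := map[string]bool{}
	for _, k := range keys {
		if len(k) > n {
			k = k[:n]
		}
		seen[k] = true
	}
	return len(seen)
}

func main() {
	urls := []string{
		"https://example.com/a/1",
		"https://example.com/a/2",
		"https://example.com/b/1",
	}
	// A short prefix only covers the shared domain: every key collapses
	// into one indexed value, so the index cannot tell rows apart.
	fmt.Println(distinctPrefixes(urls, 19)) // 1
	// A longer prefix (or indexing the URI after the domain) distinguishes them.
	fmt.Println(distinctPrefixes(urls, 23)) // 3
}
```

The closer the distinct count gets to the number of rows, the more selective (and useful) the index is.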
The second cause is that the index matches too much data. Here, the field to watch in explain is rows: it estimates how many rows the query will need to examine. It may not be perfectly accurate, but it reflects the rough order of magnitude. When it is very large, the following situations are common.
One is that the query genuinely asks for too much data; rein it in with a limit. If you really do need all of it, don't take it in one go. Today your data set is small, and pulling ten or twenty thousand rows at once is no strain; but if it one day grows to the hundreds-of-thousands level, a single fetch won't hold up. You will need to fetch in batches: first sort with order by id, and after taking a batch, use its maximum id as the starting position for the next fetch.
That wraps up the index-related causes. Next, let's talk about what else, besides indexes, can limit our query speed.
We saw earlier that MySQL's server layer contains a connection management module, whose job is to manage the long-lived connections between clients and MySQL.
Normally, if the client holds only one connection to the server layer, then after sending a SQL query it can do nothing but block and wait for the result. If a large number of queries arrive concurrently, each later request has to wait for the earlier ones to finish before it can even begin.
That is why our applications — written in Go or Java, say — often log a SQL statement as taking several minutes, yet when you pull that same statement out and run it alone, it completes in milliseconds. Those statements were simply waiting for the SQL ahead of them to finish.
How do we solve this? If we establish several connections, requests can execute concurrently, and the later ones no longer have to wait so long.
Note that a too-small connection count is constrained on both sides at once: by the database and by the client.
MySQL's maximum number of connections defaults to 100 and can go as high as 16384. You can change the database's connection limit by setting MySQL's max_connections parameter:
mysql> set global max_connections= 500;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 500   |
+-----------------+-------+
1 row in set (0.00 sec)
The commands above raise the maximum number of connections to 500.
So the database's connection limit has been adjusted — but the problem seems unchanged? Lots of SQL still takes minutes, or even times out?
Then it may be that the connection count on your application side (the Go or Java application, i.e., MySQL's client) is also too small.
The connection between the application and MySQL is a long-lived TCP connection, and TCP needs a three-way handshake to establish a connection and a four-step teardown to release one. If we created a brand-new connection for every SQL statement, we would be handshaking and tearing down constantly, which is time-consuming. So applications generally maintain a connection pool: when a connection is done, it goes back into the pool, and the next SQL statement borrows one from the pool instead — very eco-friendly.
When writing code, we usually operate the database through a third-party ORM library, and any mature ORM library is all but guaranteed to have a connection pool.
That pool has a size, and that size caps your maximum connection count. If your pool is smaller than the database's limit, raising the database's max_connections achieves nothing.
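Whichever side's limit is smaller wins; the effective concurrency is simply the minimum of the application pool size and the server's max_connections. Trivially:

```go
package main

import "fmt"

// effectiveConns shows why raising max_connections on the server does
// nothing when the client-side pool is the smaller limit.
func effectiveConns(serverMax, poolMax int) int {
	if poolMax < serverMax {
		return poolMax
	}
	return serverMax
}

func main() {
	// Server raised to 500, but the ORM pool is still capped at 10:
	// only 10 queries can ever run concurrently.
	fmt.Println(effectiveConns(500, 10)) // 10
}
```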
Usually, you can flip through the documentation of the ORM library you use to see how to set the pool size — it is a matter of a few lines of code. For example, in Go's gorm it is set like this:
func Init() {
	db, err := gorm.Open(mysql.Open(conn), config)
	sqlDB, err := db.DB()
	// SetMaxIdleConns sets the maximum number of connections in the idle pool
	sqlDB.SetMaxIdleConns(200)
	// SetMaxOpenConns sets the maximum number of open database connections
	sqlDB.SetMaxOpenConns(1000)
}
With the connection count up, speed improves too.
But I have met interviewers who press on: is there any other way to make it even faster?
At that point you must furrow your brow, pretend to think hard, and then say: yes, there is.
In the query flow earlier, we saw that once a query enters InnoDB, there is a memory layer — the buffer pool — that loads disk data pages into memory pages. If the query finds what it needs in the buffer pool, it can return immediately; otherwise it must do disk I/O, which is slow.
In other words, the bigger the buffer pool, the more data pages it can hold, the more likely a query is to hit the buffer pool, and the faster queries naturally become.
You can check the size of the buffer pool, in bytes, with the following command:
mysql> show global variables like 'innodb_buffer_pool_size';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 134217728 |
+-------------------------+-----------+
1 row in set (0.01 sec)
That is 128MB. If you want to make it larger, you can execute:
mysql> set global innodb_buffer_pool_size = 536870912;
Query OK, 0 rows affected (0.01 sec)

mysql> show global variables like 'innodb_buffer_pool_size';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 536870912 |
+-------------------------+-----------+
1 row in set (0.01 sec)
That grows the buffer pool to 512MB.
That said, if the buffer pool size is already adequate and something else is making queries slow, changing it is pointless.
So a new question arises: how do we tell whether it is adequate?
We can look at the buffer pool's cache hit rate.
Running show status like 'Innodb_buffer_pool_%'; shows the buffer-pool-related status variables.
Among them, Innodb_buffer_pool_read_requests is the number of read requests, and Innodb_buffer_pool_reads is the number of requests that had to read data from the physical disk.
So the buffer pool hit rate can be computed as:
buffer pool hit rate = (1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100%
For example, with the numbers from my instance, that is 1 - (405/2278354) ≈ 99.98% — a very high hit rate.
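That arithmetic as a small Go helper (405 and 2278354 are the counter values quoted above):

```go
package main

import "fmt"

// hitRate returns the buffer pool hit rate as a percentage:
// (1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100
func hitRate(physReads, readRequests float64) float64 {
	return (1 - physReads/readRequests) * 100
}

func main() {
	fmt.Printf("%.2f%%\n", hitRate(405, 2278354)) // 99.98%
}
```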
Normally the buffer pool hit rate is above 99%; only when it falls below that do you need to consider increasing the InnoDB buffer pool size.
Of course, you can also wire this hit rate into your monitoring. Then when SQL slows down in the middle of the night, you can still pin down the cause when you get in the next morning — very comfortable.
So far we have been talking about the buffer pool added in the storage engine layer to cache memory pages and speed up queries.
By the same logic, the server layer could also add a cache, one that stores the result of a query the first time it runs so that the next identical query returns instantly. Sounds great.
In principle, a cache hit really can speed up a query. But this feature has severe limitations, the biggest being that whenever a table is updated, every cached result for that table is invalidated; frequent table updates mean constant cache invalidation. So it only suits tables that are rarely updated.
Moreover, this feature — the query cache — was removed in MySQL 8.0. So it is fine material for small talk, but there is no need to actually use it in production.
Lately the readership of my original articles has been sliding steadily; I have been mulling it over and tossing and turning at night.
I have an immodest request. I left Guangdong a long time ago, and it has been ages since anyone called me 靓仔 (handsome lad).
Could you call me one in the comments? Can such an innocent, humble wish be granted?
If you really can't bring yourself to say it, could you tap the like button in the bottom-right corner instead?