So, if we use the UUID string as the primary key, then every time data is inserted, it will need to find its own position in the B Tree. After finding it, it ispossible tomove the subsequent node (just like inserting a record into an array). Moving the subsequent node may involve page splitting, and the insertion efficiency will be reduced.
On the other hand, in a non-clustered index, the leaf node stores the primary key value. If the primary key is a long UUID string, it will occupy a larger storage space (relative to int and In other words), then the number of primary key values that can be saved by the same leaf node will be reduced, which may cause the tree to become taller, which means that the number of IOs during query increases and query efficiency decreases.
Based on the above analysis, we try not to use UUID as the primary key in MySQL. Without UUID, some friends may think, can I use the primary key to auto-increment?
Auto-increment of the primary key can obviously solve the two problems encountered when using UUID as the primary key. The primary key is auto-incremented. You only need to add it to the end of the tree each time. Basically, it will not involve the problem of page splitting. The primary key auto-increment means that the primary key is a number and the storage space occupied is relatively small. For non-clustered The impact of indexing will also be smaller.
So is auto-increment of the primary key the best solution? Are there any issues that need to be paid attention to when the primary key is automatically incremented?
The following content has a common premise, which is that our table has a primary key auto-increment.
Generally speaking, there is no problem with auto-increment of the primary key. However, if you are in a high-concurrency environment, there will be problems.
First of all, the easiest thing to think of is the tail hotspot problem that occurs during high concurrent insertion. During concurrent insertion, everyone needs to query this value and then calculate their own primary key value. Then the upper bound of the primary key is It will become hot data, and lock competition will occur here during concurrent insertion.
In order to solve this problem, we need to choose theinnodb_autoinc_lock_mode
that suits us.
First of all, when we insert data into the data table, there are generally three different forms, as follows:
insert into user(name) values('javaboy')
orreplace into user(name) values('javaboy')
, there is no nesting Query andcan determine how many rows to insertThe insertion is calledsimple insert
, but it should be noted thatINSERT ... ON DUPLICATE KEY UPDATE
does not count assimple insert
.
load data
orinsert into user select ... from ....
, these are batch inserts, calledbulk insert
, one feature of this bulk insert is that the number of pieces of data to be inserted is unknown at the beginning.
insert into user(id,name) values(null,'javaboy'),(null,'Jiangnan Yiyidian')
, this is also a batch insert , but it is different from the second one. This type contains some automatically generated values (the primary key in this case is auto-incremented), and can determine how many rows are inserted in total. This type is calledmixed insert
, for theINSERT ... ON DUPLICATE KEY UPDATE
mentioned in the first point above, it can also be regarded as amixed insert
.
Data insertion is divided into these three categories, mainly because when the primary key is auto-incremented, the lock processing scheme is different. Let’s continue to look down.
We can control the MySQL lock processing idea when the primary key is auto-incremented by controlling the value of the innodb_autoinc_lock_mode variable.
innodb_autoinc_lock_mode variable has three different values:
0: This represents traditional. In this mode, the three different values we mentioned above When inserting SQL, the solution for auto-increment locks is the same. At the beginning of the inserted SQL statement, a table-level AUTO-INC lock is obtained, and then the lock is released after the execution of the inserted SQL is completed. The advantage of this is that it can ensure that the auto-incrementing primary key is continuous during batch insertion.
1: This means consecutive. In this mode,simple insert
(which can determine the specific number of inserted rows, corresponds to the two situations 1 and 3 above) ) has made some optimizations. Sincesimple insert
is easy to calculate how many rows to insert, several consecutive values can be generated at one time and used in the corresponding insert SQL statements, so that AUTO- can be released in advance. INC lock can reduce lock waiting and improve concurrent insertion efficiency.
2: This means interleaved. In this case, there is no AUTO-INC lock. Each one is processed one by one. When inserting in batches, it is possible that although the primary key is incremented, it does not exist. Continuous questions.
As you can see from the above introduction, in fact, the third type, that is, when the value of innodb_autoinc_lock_mode is 2, the concurrency efficiency is the strongest, so should we What about setting innodb_autoinc_lock_mode=2?
It depends on the situation.
Songge has written an article before and introduced the three formats of MySQL binlog log files to friends:
row: What is recorded in the binlog is the specific value. Instead of the original SQL, to give a simple example, assume that a field in the table is UUID, and the SQL executed by the user isinsert into user(username,uuid) values('javaboy',uuid())
, Then the SQL finally recorded in the binlog isinsert into user(username,uuid) values('javaboy',‘0212cfa0-de06-11ed-a026-0242ac110004’)
.
statement: What is recorded in the binlog is the original SQL. Taking the one in row as an example, what is finally recorded in the binlog isinsert into user(username,uuid) values( 'javaboy',uuid())
.
mixed: In this mode, MySQL will determine the log format based on the specific SQL statement, that is, choose one between statement and row.
For these three different modes, it is obvious that the statement mode may cause inconsistency in the master-slave data during master-slave replication, so now MySQL’s default binlog format is row. .
Back to our question:
If the binlog format is row, then we can set the value of innodb_autoinc_lock_mode to 2, so as to ensure data concurrency to the greatest extent The ability to insert does not cause the problem of master-slave data inconsistency.
If the binlog format is statement, then we'd better set the value of innodb_autoinc_lock_mode to 1, so that the concurrent insertion capability ofsimple insert
is improved, and batch insertion is still Acquire the AUTO-INC lock first, and then release it after the insertion is successful. This can also avoid master-slave data inconsistency and ensure the security of data replication.
The above two points are mainly for the InnoDB storage engine. If it is the MyISAM storage engine, the AUTO-INC lock is obtained first, and then released after the insertion is completed, which is equivalent to the value pair of the innodb_autoinc_lock_mode variable. MyISAM does not work.
Next, let’s use a simple SQL to demonstrate to our friends how different values of innodb_autoinc_lock_mode correspond to different results.
We can use the following SQL query to view the current innodb_autoinc_lock_mode setting:
As you can see, the current default value of the version 8.0.32 I am using is 2.
I first change it to 0. The modification method is to add a lineinnodb_autoinc_lock_mode=0
:in the
file.
After making the changes, restart and check, as follows:
You can see that it has been changed now.
Now suppose I have the following table:
CREATE TABLE `user` ( `id` int unsigned NOT NULL AUTO_INCREMENT, `username` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=100 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
This increment starts from 100. Now suppose I have the following SQL insertion:
insert into user(id,username) values(1,'javaboy'),(null,'江南一点雨'),(3,'www.javaboy.org'),(null,'lisi');
After the insertion is completed, we Let’s look at the query results:
According to our previous introduction, this situation should be explainable, so I won’t go into details here.
Next, I changed the value of innodb_autoinc_lock_mode to 1, as follows:
Still the same SQL above, let’s execute it again. After the execution is completed, the result is the same as above.
but! ! ! **After the above SQL is executed, if we want to insert data again, and the newly inserted ID does not specify a value, we find that the automatically generated ID value is 104. **This is because we set innodb_autoinc_lock_mode=1. At this time, when executingsimple insert
to insert, the system saw that I wanted to insert 4 records and directly took out 4 IDs for me in advance. They are 100, 101, 102 and 103 respectively. As a result, the SQL actually only uses two IDs, and the remaining two are useless, but the next insertion will still start from 104.
The above is the detailed content of How to solve the pitfalls encountered by MySQL primary key auto-increment. For more information, please follow other related articles on the PHP Chinese website!