This article explains why a UUID is a poor choice of primary key in MySQL. MySQL officially recommends against UUIDs and against non-consecutive, non-repeating snowflake IDs, and recommends a consecutive auto-increment primary key (auto_increment) instead. So why is a UUID not recommended? I hope this is helpful to everyone.
Preface
When designing tables in MySQL, the official recommendation is to avoid UUIDs and non-consecutive, non-repeating snowflake IDs (unique long values that increase on a single machine) as primary keys, and to use a consecutive auto-increment primary key (auto_increment) instead. So why is a UUID not recommended, and what are its disadvantages?
1. MySQL and program examples
1.1. To explain this problem, let's first create three tables
The tables are user_auto_key, user_uuid, and user_random_key, which use an auto-increment primary key, a UUID primary key, and a random primary key respectively. Everything else is kept identical: following the controlled-variable method, only the primary key generation strategy differs between the tables and all other fields are exactly the same. We then test the insert speed and query speed of each table. Note: the random key here refers to a non-consecutive, non-repeating, irregular ID calculated with the snowflake algorithm: an 18-digit long value.
1.2. Theory alone is not enough, so let's go straight to a program and use Spring's jdbcTemplate to implement the test
Technology stack: Spring Boot, jdbcTemplate, JUnit, hutool. The program connects to your own test database and writes the same amount of data to each table in the same environment, then compares the insert times to assess overall efficiency. To make the test as realistic as possible, all data is randomly generated, for example names, emails, and addresses.
    package com.wyq.mysqldemo;

    import cn.hutool.core.collection.CollectionUtil;
    import com.wyq.mysqldemo.databaseobject.UserKeyAuto;
    import com.wyq.mysqldemo.databaseobject.UserKeyRandom;
    import com.wyq.mysqldemo.databaseobject.UserKeyUUID;
    import com.wyq.mysqldemo.diffkeytest.AutoKeyTableService;
    import com.wyq.mysqldemo.diffkeytest.RandomKeyTableService;
    import com.wyq.mysqldemo.diffkeytest.UUIDKeyTableService;
    import com.wyq.mysqldemo.util.JdbcTemplateService;
    import org.junit.jupiter.api.Test;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.test.context.SpringBootTest;
    import org.springframework.util.StopWatch;

    import java.util.List;

    @SpringBootTest
    class MysqlDemoApplicationTests {

        @Autowired
        private JdbcTemplateService jdbcTemplateService;

        @Autowired
        private AutoKeyTableService autoKeyTableService;

        @Autowired
        private UUIDKeyTableService uuidKeyTableService;

        @Autowired
        private RandomKeyTableService randomKeyTableService;

        @Test
        void testDBTime() {
            StopWatch stopwatch = new StopWatch("SQL execution time");

            // Task 1: auto_increment key (the id column is filled in by MySQL)
            final String insertSql = "INSERT INTO user_key_auto(user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?)";
            List<UserKeyAuto> insertData = autoKeyTableService.getInsertData();
            stopwatch.start("auto-increment key table task");
            long start1 = System.currentTimeMillis();
            if (CollectionUtil.isNotEmpty(insertData)) {
                boolean insertResult = jdbcTemplateService.insert(insertSql, insertData, false);
                System.out.println(insertResult);
            }
            long end1 = System.currentTimeMillis();
            System.out.println("auto key time (ms): " + (end1 - start1));
            stopwatch.stop();

            // Task 2: UUID key
            final String insertSql2 = "INSERT INTO user_uuid(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";
            List<UserKeyUUID> insertData2 = uuidKeyTableService.getInsertData();
            stopwatch.start("UUID key table task");
            long begin = System.currentTimeMillis();
            // Check this task's own data set (the original checked insertData here)
            if (CollectionUtil.isNotEmpty(insertData2)) {
                boolean insertResult = jdbcTemplateService.insert(insertSql2, insertData2, true);
                System.out.println(insertResult);
            }
            long over = System.currentTimeMillis();
            System.out.println("UUID key time (ms): " + (over - begin));
            stopwatch.stop();

            // Task 3: random long key (snowflake ID)
            final String insertSql3 = "INSERT INTO user_random_key(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";
            List<UserKeyRandom> insertData3 = randomKeyTableService.getInsertData();
            stopwatch.start("random long key table task");
            long start = System.currentTimeMillis();
            // Check this task's own data set (the original checked insertData here)
            if (CollectionUtil.isNotEmpty(insertData3)) {
                boolean insertResult = jdbcTemplateService.insert(insertSql3, insertData3, true);
                System.out.println(insertResult);
            }
            long end = System.currentTimeMillis();
            System.out.println("random key time (ms): " + (end - start));
            stopwatch.stop();

            // Print the per-task timing summary
            String result = stopwatch.prettyPrint();
            System.out.println(result);
        }
    }
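For orientation, the three kinds of primary key compared in this test could be generated roughly as follows. This is a minimal sketch, not the code behind the test: the class KeyStrategies, its method names, the epoch constant, and the worker ID are assumptions made for illustration, and the snowflake generator is a simplified version that does not handle the system clock moving backwards.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative generators for the three key types (hypothetical helper class). */
final class KeyStrategies {

    // Stand-in for AUTO_INCREMENT: in MySQL the server assigns this value, not the client.
    private static final AtomicLong AUTO = new AtomicLong(0);

    static long nextAutoIncrement() {
        return AUTO.incrementAndGet();
    }

    // Random (version 4) UUID rendered as a 36-character string.
    static String nextUuid() {
        return UUID.randomUUID().toString();
    }

    // Simplified snowflake layout: timestamp | 10-bit worker ID | 12-bit sequence.
    private static final long EPOCH = 1288834974657L; // assumed custom epoch
    private static final long WORKER_ID = 1L;         // assumed worker ID
    private static long lastMillis = -1L;
    private static long sequence = 0L;

    static synchronized long nextSnowflake() {
        long now = System.currentTimeMillis();
        if (now == lastMillis) {
            sequence = (sequence + 1) & 0xFFF;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMillis) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - EPOCH) << 22) | (WORKER_ID << 12) | sequence;
    }

    public static void main(String[] args) {
        System.out.println(nextAutoIncrement());
        System.out.println(nextUuid());
        System.out.println(nextSnowflake());
    }
}
```

IDs from a single generator like this one increase over time but are not consecutive, whereas consecutive UUID values have no ordering relationship at all.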
1.3. Test results
As the results show, with about 1 million rows (100W) of data, UUID has the lowest insert efficiency, and in the subsequent run with 1.3 million rows (130W), UUID's insert efficiency drops sharply again.
The overall ranking by insert efficiency is auto_key > random_key > uuid: UUID is the slowest, and its efficiency drops sharply as the data volume grows. So why does this happen? Let's look into it:
2. Comparison of index structures using uuid and auto-increment id
2.1. Index internal structure with an auto-increment ID
The values of an auto-increment primary key are sequential, so InnoDB stores each record immediately after the previous one. When a page reaches its maximum fill factor (InnoDB's default maximum fill factor is 15/16 of the page size, leaving 1/16 of the space free for future modifications):
① The next record is written to a new page. Once data is loaded in this order, the primary key pages are filled with records almost sequentially, which raises the page fill rate and avoids wasted pages.
② A newly inserted row always goes right after the current largest row, so MySQL locates the insert position very quickly, with no extra cost for calculating where the new row belongs.
③ Page splits and fragmentation are reduced.
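As a rough worked example of the 15/16 fill factor, the sketch below estimates how many rows fit in one leaf page and how many pages sequential inserts would need. It assumes the default 16 KB InnoDB page size and a hypothetical fixed row size of 200 bytes, and it ignores page headers and per-record overhead, so the numbers are only indicative. The class and method names are made up for illustration.

```java
/** Back-of-the-envelope page fill estimate (hypothetical helper class). */
final class PageFillEstimate {

    static final int PAGE_SIZE_BYTES = 16 * 1024; // default InnoDB page size

    /** Rows per leaf page when pages are filled to 15/16 of their size. */
    static int rowsPerPage(int rowSizeBytes) {
        int usableBytes = PAGE_SIZE_BYTES * 15 / 16; // 15360 bytes
        return usableBytes / rowSizeBytes;
    }

    /** Leaf pages needed when rows arrive in key order and every page is filled. */
    static long pagesNeeded(long rowCount, int rowSizeBytes) {
        int perPage = rowsPerPage(rowSizeBytes);
        return (rowCount + perPage - 1) / perPage; // round up
    }

    public static void main(String[] args) {
        int rowSize = 200; // assumed average row size in bytes
        System.out.println(rowsPerPage(rowSize));            // 76
        System.out.println(pagesNeeded(1_000_000, rowSize)); // 13158
    }
}
```

With sequential keys every page except the last ends up filled to this level, which is what keeps the page count close to the minimum.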
2.2. Index internal structure with a UUID
Because a UUID, unlike a sequential auto-increment ID, follows no order, the value of a new row is not necessarily greater than the previous primary key value. InnoDB therefore cannot always append the new row to the end of the index; it has to find a suitable position for the new row and allocate space there.
This process takes many extra operations, and the unordered keys leave the data scattered, which leads to the following problems:
① The target page may already have been flushed to disk and evicted from the cache, or may never have been loaded into the cache, so InnoDB has to find the page and read it from disk into memory before inserting, which causes a lot of random I/O.
② Because writes arrive out of order, InnoDB has to split pages frequently to make room for new rows. A page split moves a large amount of data, and a single insert has to modify at least three pages.
③ Because of the frequent page splits, pages become sparse and irregularly filled, which eventually leads to data fragmentation.
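The effect of insert order on page splits and page fill can be illustrated with a toy model. The sketch below is not InnoDB's actual algorithm: it models leaf pages as sorted lists with a hypothetical capacity of 64 rows, opens a fresh page when a key is appended past the end of the last full page, and splits a full page in half otherwise. All class and method names are assumptions made for illustration, and shuffled keys stand in for UUID-like keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;

/** Toy model of clustered-index leaf pages (hypothetical, heavily simplified). */
final class LeafPageSimulation {

    static final int PAGE_CAPACITY = 64; // assumed number of rows per page

    static final class Result {
        final int splits;
        final int pages;
        final double avgFill;

        Result(int splits, int pages, double avgFill) {
            this.splits = splits;
            this.pages = pages;
            this.avgFill = avgFill;
        }
    }

    /** Inserts unique keys in the given order and counts page splits. */
    static Result insertAll(List<Long> keys) {
        // Pages are indexed by their lower bound; the first page covers all small keys.
        TreeMap<Long, List<Long>> pages = new TreeMap<>();
        pages.put(Long.MIN_VALUE, new ArrayList<>());
        int splits = 0;

        for (long key : keys) {
            List<Long> page = pages.floorEntry(key).getValue();
            if (page.size() == PAGE_CAPACITY) {
                boolean appendAtEnd = page == pages.lastEntry().getValue()
                        && key > page.get(page.size() - 1);
                if (appendAtEnd) {
                    // Sequential case: leave the full page alone and open a new one.
                    page = new ArrayList<>();
                    pages.put(key, page);
                } else {
                    // Out-of-order case: move the upper half of the rows to a new page.
                    splits++;
                    int mid = PAGE_CAPACITY / 2;
                    List<Long> right = new ArrayList<>(page.subList(mid, page.size()));
                    page.subList(mid, page.size()).clear();
                    pages.put(right.get(0), right);
                    if (key >= right.get(0)) {
                        page = right;
                    }
                }
            }
            int pos = Collections.binarySearch(page, Long.valueOf(key));
            page.add(pos < 0 ? -pos - 1 : pos, key);
        }
        double avgFill = (double) keys.size() / ((double) pages.size() * PAGE_CAPACITY);
        return new Result(splits, pages.size(), avgFill);
    }

    static List<Long> sequentialKeys(int n) {
        List<Long> keys = new ArrayList<>(n);
        for (long i = 1; i <= n; i++) {
            keys.add(i);
        }
        return keys;
    }

    /** The same keys in shuffled order, standing in for UUID-like keys. */
    static List<Long> shuffledKeys(int n, long seed) {
        List<Long> keys = sequentialKeys(n);
        Collections.shuffle(keys, new Random(seed));
        return keys;
    }

    public static void main(String[] args) {
        Result seq = insertAll(sequentialKeys(20_000));
        Result rnd = insertAll(shuffledKeys(20_000, 42L));
        System.out.printf("sequential: splits=%d pages=%d fill=%.2f%n", seq.splits, seq.pages, seq.avgFill);
        System.out.printf("shuffled:   splits=%d pages=%d fill=%.2f%n", rnd.splits, rnd.pages, rnd.avgFill);
    }
}
```

In this model sequential keys never trigger a split, while shuffled keys split pages repeatedly and leave them partly empty, which matches problems ② and ③ above. The model does not cover the cache misses described in ①.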
After loading random values (UUIDs and snowflake IDs) into the clustered index (InnoDB's default index type), you sometimes need to run OPTIMIZE TABLE to rebuild the table and optimize page fill, which takes some time.
Conclusion: with InnoDB, insert rows in increasing primary key order as far as possible, and try to use monotonically increasing clustering key values for new rows.
2.3. Disadvantages of using an auto-increment ID
So does an auto-increment ID have no disadvantages at all? No, auto-increment IDs have the following problems:
① If someone crawls your database, they can infer your business growth from the auto-increment IDs and easily analyze the state of your business.
② Under highly concurrent load, inserting in primary key order can cause noticeable lock contention in InnoDB. The upper end of the primary key becomes a hot spot, because all inserts happen there, and concurrent inserts lead to gap lock contention.
③ The AUTO_INCREMENT lock mechanism causes contention for the auto-increment lock, with some performance loss.
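To make point ① concrete, the sketch below shows how two observed auto-increment IDs translate into a growth estimate. The class name, method names, and the sample IDs are hypothetical. Because auto-increment sequences can contain gaps (for example after rolled-back inserts), the result is an upper bound rather than an exact row count.

```java
/** Estimating growth from two observed auto-increment IDs (hypothetical helper class). */
final class IdGrowthEstimate {

    /** Upper bound on rows created between two observations; gaps make the real count smaller. */
    static long rowsBetween(long earlierId, long laterId) {
        return laterId - earlierId;
    }

    /** Average number of new rows per day over the observation window. */
    static double rowsPerDay(long earlierId, long laterId, int daysBetween) {
        return (double) rowsBetween(earlierId, laterId) / daysBetween;
    }

    public static void main(String[] args) {
        // Assumed observations: ID 120000 seen first, ID 134000 seen 7 days later.
        System.out.println(rowsBetween(120_000, 134_000));   // 14000
        System.out.println(rowsPerDay(120_000, 134_000, 7)); // 2000.0
    }
}
```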
Note: to reduce AUTO_INCREMENT lock contention, tune the innodb_autoinc_lock_mode setting.
3. Summary
This post started by posing the question, then created the tables and used jdbcTemplate to test how the different ID generation strategies perform when inserting large amounts of data. It then analyzed how each kind of ID behaves in MySQL's index structure, along with the advantages and disadvantages, and explained in detail why UUIDs and random non-repeating IDs lose performance on insert.
In actual development, it is best to follow MySQL's official recommendation and use an auto-increment ID. MySQL is broad and deep, and there are many more internals worth optimizing that we still need to learn.