Issues related to RR and phantom reading in mysql-Mysql Tutorial-php.cn

Home

Database

Mysql Tutorial

Issues related to RR and phantom reading in mysql

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Oct 11, 2022 pm 04:59 PM

mysql

This article brings you relevant knowledge about mysql, which mainly introduces the relevant content about RR and phantom reading, including the MVCC principle, RR generating phantom reading, and RR solving phantom reading Let’s take a look at the content below. I hope it will be helpful to everyone.

Issues related to RR and phantom reading in mysql

Recommended learning: mysql video tutorial

1. Foreword

This article focuses on these three topics. RR How to solve phantom reading?

Issues related to RR and phantom reading in mysql

##MVCC principle
Experiment: RR and phantom reading
Case: Deadlock

Let’s first review the four types of transaction isolation and some problems caused by concurrent transactions supported by InnoDB in MySQL:

Issues related to RR and phantom reading in mysql

Read uncommitted: It can read the intermediate process of a transaction, which violates the ACID characteristics and has the problem of dirty reading. It is basically not used.
Read commit: Indicates that if other transactions have been committed, then you can see it. It is not used much in production environments.
Repeatable read: the default level, the most used one. It features Gap lock.
Serializable: All implementations are implemented through locks.

Concurrent transaction processing will also bring some problems: dirty reads, non-repeatable reads, phantom reads

Dirty reads: A transaction is dealing with When a record is modified, the data of this record will be in an inconsistent state until the transaction is completed and submitted.
Non-repeatable read: A transaction is read twice according to the same query conditions, and the read data is inconsistent (modification, deletion).
Phantom reading: Re-query data according to the same query conditions within a transaction, but find that other transactions have inserted new data that satisfies its query conditions.

Summary of this article: RR introduces MVCC for faster concurrency, but there is a possibility of phantom reading. To solve phantom reading, Gap lock is introduced. Gap may cause deadlock.

2. MVCC Principle

MVCC (Multiple Version Control): Refers to the database in order to achieve high concurrent data access, multi-version processing of data, and ensure transaction visibility through transaction visibility You can see the version of the data you should see.

The biggest advantage of MVCC is that there is no locking for reading and there is no conflict between reading and writing.

In OLTP (On-Line Transaction Processing) applications, it is important that there is no conflict between reading and writing. Almost all RDBMS support MVCC.

Note: MVCC only works under two isolation levels: read-commit RC and repeatable read RR.

(1) MVCC multi-version implementation

When MySQL implements the MVCC mechanism, it is based on the undo log multi-version chain ReadView mechanism.

undo log multi-version chain: Every time the database is modified, the transaction number of the current modification record and the storage address of the data state before modification (i.e. ROLL_PTR) will be recorded in the undo log. , so that you can roll back to the old data version when necessary.
ReadView mechanism: Based on the multi-version chain, control the visibility of transaction reads. (The main difference is: RC and RR)

We don’t focus on exploring the principles here, but we need to have a general concept: undo log multi-version chain and ReadView mechanism.

For the undo log multi-version chain, here is an example:

A read transaction queries the current record, but the latest transaction has not yet been submitted.
According to atomicity, read transactions cannot see the latest data, but they can find old versions of data in the rollback segment, thus generating multiple versions.

For ReadView mechanism: Based on undo log multi-version chain implementation, different transaction isolation has different processing:

RC level transactions: Visibility At a higher level, it can see all modifications of committed transactions.
RR-level transactions: In a read transaction, no matter what modifications other transactions make to the data and whether they are submitted, the query data results will not change as long as you do not submit them. .

How is this done?

RC read submission: Each read operation statement will obtain a ReadView. After each update, the latest transaction submission status in the database will be obtained, and you can see the latest submitted transaction, that is, each statement Each execution updates its visibility view.

RR repeatable read: ReadView will not be obtained when starting a transaction, and ReadView will be obtained only when the first snapshot read is initiated.

If you use the current read, you will get a new ReadView and you can also see the updated data.

(2) Snapshot read and current read

In MVCC concurrency control, read operations can be divided into two categories:

Snapshot reading: What is read is the visible version of the record (possibly the historical version), without locking.

Operation: Simple SELECT operation.

Current read: The latest version of the record is read, and the record returned by the current read will be locked to ensure that other transactions will not modify this record concurrently.

Operations: special read operations, add/update/delete operations.

-- 对应 SQL 如下：
-- 1. 特殊读操作
SELECT ... FOR UPDATE
SELECT ... LOCK IN SHARE MODE  -- 共享锁
-- 2. 新增：INSERT 
-- 3. 更新：UPDATE
-- 4. 删除：DELETE

Combined with the ReadView mechanism to distinguish: snapshot read and current read:

Snapshot read: In a transaction, ReadView will be obtained only when the first snapshot read is initiated, and subsequent read operations Will not be obtained again.

Current reading: ReadView will be obtained for each read operation.

3. Experiment: RR and phantom reading

Interview question: Under the RR transaction isolation level, transaction A queries a piece of data, transaction B adds a piece of data, and transaction A can see transaction B The data?

Issues related to RR and phantom reading in mysql

This question is relatively vague, but we know that the general inspection point is RR and phantom reading. The question can be divided into two categories:

Under what circumstances, RR produces phantom reading? (Can see the data)

Answer: Current read (SELECT..FOR UDPDATE, SELECT ... LOCK IN SHARE MODE)

Under what circumstances does RR solve phantom reading? (Cannot see data)

Answer: Locking, snapshot reading

Note: Non-repeatable reading focuses on UPDATA and DELETE, while phantom reading focuses on INSERT.

The biggest difference between them is how to solve the problems they cause through the lock mechanism.

The lock mentioned here only uses the pessimistic locking mechanism.

Let’s review again: Phantom reading

-- 举个栗子：有这样一个查询 SQL
SELECT * FROM user WHERE id < 10;

Under the same transaction, 4 pieces of data are queried at T1 time, and 8 pieces of data are queried at T2 time. This creates phantom reading.

Under the same transaction, 8 pieces of data are queried at time T1, and 4 pieces of data are queried at time T2. This creates phantom reading.

The experiment preparation is as follows: Hands-on practice

show variables like &#39;transaction_isolation&#39;; -- 事务隔离级别 RR
select version();                            -- 版本 8.0.16
show variables like &#39;%storage_engine%&#39;;      -- 引擎 InnoDB
-- 1. 手动开启事务提交
begin;  -- 开始事务
commit; -- 提交事务
-- 2. 创建表
CREATE TABLE IF NOT EXISTS `student` (
`id` INT NOT NULL COMMENT &#39;主键 id&#39;,
`name` VARCHAR(50) NOT NULL COMMENT &#39;名字&#39;,
`age` TINYINT NOT NULL COMMENT &#39;年龄&#39;,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT &#39;学生表&#39;;
-- 3. 新增数据用于实验
INSERT INTO student (id, name, age) VALUES (5, &#39;kunkun&#39;, 14);
INSERT INTO student (id, name, age) VALUES (30, &#39;ikun&#39;, 18);

(1) RR generates phantom reading

The experiment is as follows: Test current Read

Experiment 1: SELECT first, then SELECT... FOR UPDATE

Experiment 2: SELECT first, then UPDATE (no phantom reading will occur)

Experiment 1: SELECT first, then SELECT... FOR UPDATE

-- 事务A：
BEGIN;
SELECT * FROM student WHERE id < 30;
SELECT * FROM student WHERE id < 30 FOR UPDATE;  -- 等待事务B commit 后再执行
-- SELECT * FROM student WHERE id < 30 LOCK IN SHARE MODE;
COMMIT;
-- 事务B：
BEGIN;
INSERT INTO student (id, name, age) VALUES (20, &#39;wulikun&#39;, 16);
COMMIT;

What happened is as shown in the figure below:

Issues related to RR and phantom reading in mysql

The experimental record is as shown in the figure below:

Issues related to RR and phantom reading in mysql

Phenomenon conclusion: When using the current read (SELECT ... FOR UPDATE), phantom reads will occur.

Similarly using SELECT ... LOCK IN SHARE MODE; will produce phantom reads.

Issues related to RR and phantom reading in mysql

Experiment 2: SELECT first, then UPDATE

-- 事务A：
BEGIN;
SELECT * FROM student WHERE id < 30;
UPDATE student SET name = &#39;zhiyin&#39; WHERE id = 5;  -- 等待事务B commit 后再执行
SELECT * FROM student WHERE id < 30;
COMMIT;
-- 事务B：
BEGIN;
INSERT INTO student (id, name, age) VALUES (20, &#39;wulikun&#39;, 16);
COMMIT;

What happens is as shown below:

Issues related to RR and phantom reading in mysql

The experimental record is shown in the figure below:

Issues related to RR and phantom reading in mysql

Phenomenon conclusion: The current read (UPDATE) will not produce phantom reads. Neither INSERT / DELETE will do the same.

Issues related to RR and phantom reading in mysql

(2) RR solves phantom reading

The experiment is as follows:

Experiment 1: Snapshot reading
Experiment 2: Lock (update non-existent records)
Experiment 3: Add Lock (SELECT ... FOR UPDATE)

Experiment 1: Snapshot read, ordinary SELECT

-- 事务A：
BEGIN;
SELECT * FROM student;
SELECT * FROM student;  -- 等待事务B commit 后再执行
COMMIT;
-- 事务B：
BEGIN;
INSERT INTO student (id, name, age) VALUES (20, &#39;wulikun&#39;, 16);
COMMIT;

What happens is as shown below:

Issues related to RR and phantom reading in mysql

The experimental record is shown in the figure below:

Issues related to RR and phantom reading in mysql

Conclusion of the phenomenon: Under the RR transaction isolation level, only snapshot reads ( SELECT) will not cause phantom reading. There is no current read.

Experiment 2: Locking, (updating non-existent records)

Under the RR isolation level, transaction A uses UPDATE to lock, and transaction B cannot Insert new data between transactions, so that the data read by transaction A before and after UPDATE remains consistent, avoiding phantom reads.

-- 事务A：
BEGIN;
SELECT * FROM student;
UPDATE student SET name = &#39;wulikunkun&#39; WHERE id = 18; -- 记录不存在，产生间隙锁 (5, 30)。
COMMIT;
-- 事务B：
BEGIN;
INSERT INTO student (id, name, age) VALUES (10, &#39;zhiyin&#39;, 16); -- 需要等待事务A结束。
COMMIT;
-- 事务C：
BEGIN;
INSERT INTO student (id, name, age) VALUES (40, &#39;zhiyin你太美&#39;, 32);
COMMIT;
-- 查询数据库中当前有哪些锁
SELECT INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA FROM performance_schema.data_locks;

What happened is as shown below:

Issues related to RR and phantom reading in mysql

实验记录如下图所示：

Issues related to RR and phantom reading in mysql

现象结论：

一开始先加临键锁Next-key lock，锁范围为 (5,30]。

因为是唯一索引，且更新的记录不存在，临键锁退化成间隙锁Gap，最终锁范围为 (5,30)。其余的记录不受影响。

实验三：加锁（SELECT ... FOR UPDATE）

-- 事务A：
BEGIN;
SELECT * FROM student;
SELECT * FROM student WHERE id < 5 FOR UPDATE;
COMMIT;
-- 事务B：
BEGIN;
INSERT INTO student (id, name, age) VALUES (4, &#39;zhiyin&#39;, 4); -- 需要等待事务A结束。
COMMIT;
-- 事务C：
BEGIN;
INSERT INTO student (id, name, age) VALUES (5, &#39;zhiyin你太美&#39;, 32); -- 插入成功
COMMIT;
-- 查询数据库中当前有哪些锁
SELECT INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA FROM performance_schema.data_locks;

发生情况如下图所示：

Issues related to RR and phantom reading in mysql

实验记录如下图所示：

Issues related to RR and phantom reading in mysql

现象结论：

先加临键锁Next-key lock，锁范围为 (-∞,5]。

所以，id

拓展：Gap 锁（间隙锁）

根据官方文档可知：

锁是加在索引上的。
记录锁：行锁，只会锁定一条记录。
间隙锁：是在索引记录之间的间隙上的锁，区间为前开后开 (,)。
临键锁（Next-Key Lock）：由记录锁和间隙锁Gap 组合起来。
加锁的基本单位是临键锁，其加锁区间为前开后闭 (,]。
索引上的等值查询，给唯一索引加锁的时候，如果满足条件，临键锁退化为行锁。
索引上的等值查询，给唯一索引加锁的时候，如果不满足条件，临键锁退化为间隙锁。注意，非等值查询是不会优化的。

推荐学习：mysql视频教程

The above is the detailed content of Issues related to RR and phantom reading in mysql. For more information, please follow other related articles on the PHP Chinese website!

Statement

This article is reproduced at:掘金. If there is any infringement, please contact admin@php.cn delete

Explain the ACID properties (Atomicity, Consistency, Isolation, Durability).Apr 16, 2025 am 12:20 AM

ACID attributes include atomicity, consistency, isolation and durability, and are the cornerstone of database design. 1. Atomicity ensures that the transaction is either completely successful or completely failed. 2. Consistency ensures that the database remains consistent before and after a transaction. 3. Isolation ensures that transactions do not interfere with each other. 4. Persistence ensures that data is permanently saved after transaction submission.

MySQL: Database Management System vs. Programming LanguageApr 16, 2025 am 12:19 AM

MySQL is not only a database management system (DBMS) but also closely related to programming languages. 1) As a DBMS, MySQL is used to store, organize and retrieve data, and optimizing indexes can improve query performance. 2) Combining SQL with programming languages, embedded in Python, using ORM tools such as SQLAlchemy can simplify operations. 3) Performance optimization includes indexing, querying, caching, library and table division and transaction management.

MySQL: Managing Data with SQL CommandsApr 16, 2025 am 12:19 AM

MySQL uses SQL commands to manage data. 1. Basic commands include SELECT, INSERT, UPDATE and DELETE. 2. Advanced usage involves JOIN, subquery and aggregate functions. 3. Common errors include syntax, logic and performance issues. 4. Optimization tips include using indexes, avoiding SELECT* and using LIMIT.

MySQL's Purpose: Storing and Managing Data EffectivelyApr 16, 2025 am 12:16 AM

MySQL is an efficient relational database management system suitable for storing and managing data. Its advantages include high-performance queries, flexible transaction processing and rich data types. In practical applications, MySQL is often used in e-commerce platforms, social networks and content management systems, but attention should be paid to performance optimization, data security and scalability.

SQL and MySQL: Understanding the RelationshipApr 16, 2025 am 12:14 AM

The relationship between SQL and MySQL is the relationship between standard languages and specific implementations. 1.SQL is a standard language used to manage and operate relational databases, allowing data addition, deletion, modification and query. 2.MySQL is a specific database management system that uses SQL as its operating language and provides efficient data storage and management.

Explain the role of InnoDB redo logs and undo logs.Apr 15, 2025 am 12:16 AM

InnoDB uses redologs and undologs to ensure data consistency and reliability. 1.redologs record data page modification to ensure crash recovery and transaction persistence. 2.undologs records the original data value and supports transaction rollback and MVCC.

What are the key metrics to look for in an EXPLAIN output (type, key, rows, Extra)?Apr 15, 2025 am 12:15 AM

Key metrics for EXPLAIN commands include type, key, rows, and Extra. 1) The type reflects the access type of the query. The higher the value, the higher the efficiency, such as const is better than ALL. 2) The key displays the index used, and NULL indicates no index. 3) rows estimates the number of scanned rows, affecting query performance. 4) Extra provides additional information, such as Usingfilesort prompts that it needs to be optimized.

What is the Using temporary status in EXPLAIN and how to avoid it?Apr 15, 2025 am 12:14 AM

Usingtemporary indicates that the need to create temporary tables in MySQL queries, which are commonly found in ORDERBY using DISTINCT, GROUPBY, or non-indexed columns. You can avoid the occurrence of indexes and rewrite queries and improve query performance. Specifically, when Usingtemporary appears in EXPLAIN output, it means that MySQL needs to create temporary tables to handle queries. This usually occurs when: 1) deduplication or grouping when using DISTINCT or GROUPBY; 2) sort when ORDERBY contains non-index columns; 3) use complex subquery or join operations. Optimization methods include: 1) ORDERBY and GROUPB

See all articles