This article will talk about the transaction features in MySQL and introduce the implementation principle of multi-version concurrency control MVCC. I hope it will be helpful to everyone!
Transaction generally refers to a logical set of operations , or as a series of operations executed by a single logical unit. All operations in a transaction will be encapsulated into an indivisible execution unit. All operations in this unit will either all execute successfully or all fail, as long as any one of the operations is executed. If it fails, the entire transaction will be rolled back.
Atomicity(atomicity)
The atomicity of a transaction means that all operations that make up the transaction either all execute successfully or all fail.
Consistency (consistency)
The consistency of a transaction means that the data is always in a consistent state before and after the transaction is executed.
Isolation
The isolation of a transaction refers to the non-interference between two concurrently executed transactions, that is to say, the execution of a transaction During the process, you cannot see the intermediate status of other transaction running processes.
?Note:
MySQL
ensures the isolation of transactions by locking aMVCC
mechanism.
Persistence (duration)
The durability of a transaction refers to the change operation of this transaction to the data once the transaction is committed. is persisted to the database and will not be rolled back.
## Local transactions
Transactions usually controlled based on relational databases can be called traditional transactions or local transactions.
Local transaction execution process
#✔Local transactions are managed locally by the resource manager.
The disadvantage of local transactions is:
For example:
Zhang San’s account is 100 yuan. There are currently two transactions: transaction 1 and transaction 2. Transaction 1 is to transfer the balance of Zhang San’s account Add 100 yuan. Transaction 2 increases Zhang San's balance by 200. Initially, transaction 1 and transaction 2 read Zhang San's account balance of 100 yuan at the same time. Then transaction 1 and transaction 2 update Zhang Sanyue respectively. Assume that transaction 1 first Submitted in transaction 2, but after the two most recent transactions were submitted, Zhang San’s balance is 300 yuan (normally it should be 400 yuan),That is to say: the later submitted transaction 2 overwrites the update operation of transaction 1, this It is the so-called update loss. Update loss (dirty write) is essentially a conflict of write operations. However, the way to solve dirty writes is to execute each transaction serially to ensure that transactions execute write operations in a certain order.
One transaction reads uncommitted data from another transaction. For example: Transaction 1 is adding 100 yuan to Zhang San's balance. Before this transaction is submitted, another transaction 2 reads the data being modified. If there is no transaction under the control, the second transaction will read it. The dirty data that has not been committed will be detected, and the next step of processing the dirty data cluster will generate a dependency on the uncommitted data. This phenomenon is usually called Dirty Reading
, that is to say: Dirty reading is when one transaction reads data that has not been submitted by another transaction.
?Dirty reading is essentially a conflict between read and write operations. The solution is to write first and then read, that is, read after writing.
A transaction reads certain data. After a period of time, the transaction reads the previously read data again. When it is found that the read data has changed, or some of the data records have been deleted, this phenomenon is called: non-repeatable read, means the same transaction, using the same query statement, is read at different times The results obtained are inconsistent.
Non-repeatable read is essentially a conflict between read and write operations. The solution is to read first and then write, that is, write after reading.
A transaction re-reads the previously read data according to the same query conditions. At this time, it is found that other transactions have inserted data that satisfies the current query conditions. With new data, the data result set becomes larger. This phenomenon is called phantom reading, that is, a transaction reads a range of data records twice, and the results read twice are different.
Phantom reading is essentially a conflict between read and write operations. The solution is to read first and then write, that is, write after reading.
At this point, many friends have questions. The same essence is a conflict between read and write operations. The solution is to read first and then write. What is the difference between non-repeatable reading and phantom reading?, I See below for a brief explanation:
1️⃣ The focus of non-repeatable reading is update and delete operations, while the focus of phantom reading is insertion operations.
2️⃣ When MySQL uses the lock mechanism to implement transaction isolation levels, in the repeatable read isolation level, after the first SQL statement reads data, the corresponding data will be locked. , so that other transactions cannot modify and delete these data, and repeatable reading is achieved through locking. However, this method cannot lock the insertion of new data. If transaction 1 reads the data, or modifies or deletes the data, transaction 2 can also perform an insertion operation, causing transaction 1 to inexplicably add an additional piece of data that was not available before. This It's phantom reading.
3️⃣ Therefore, phantom reads cannot be avoided through the lock mechanism. A serialized transaction isolation level needs to be used, but this transaction isolation level will greatly reduce the concurrency capability of the database.
✔MySQL uses the MVCC (Multiple Version Concurrency Control) mechanism to avoid non-repeatable reads and phantom reads.
Use the following command to query global-level and session-level transactions Isolation level:
# 默认数据库的事务隔离级别为:可重复读(REPEATABLE-READ) SELECT @@global.tx_isolation; SELECT @@session.tx_isolation; SELECT @@tx_isolation;
##Multi-Version Concurrency ControlAdvantages of MVCC over locksMulti-version concurrency control,
MVCC
is a concurrency control method, generally used in database management systems to achieve concurrent access to the database.
as a compromise of row-level locks, which avoids Instead of using locks, it can provide smaller overhead. Depending on the implementation, it can allow non-blocking reads, locking only necessary records while writes occur. <h3 data-id="heading-18"><strong>MVCC的实现原理</strong></h3><p><code>MVCC
的是通过保存数据澡某个时间点的快照来实现的,也就是说,不管事务执行多长的时间,每个事务看到的数据都是一致的。根据事务的开始时间不同,每个事务对同一张表,同一时刻看到的数据可能是不一样的。InnonDB
主要通过为每一行记录添加两个额外的隐藏的值来实现MVCC
,这两个值一个记录这行数据何时被创建,另外一个记录这行数据何时过期(或者被删除)。但是InnoDB
并不存储这些事件发生时的 实际时间 ,相反它只存储这些事件发生时的系统 版本号(version) 。这是一个随着事务的创建而不断增长的数字。每个事务在事务开始时会记录它自己的系统版本号。每个查询必须去检查每行数据的版本号与事务的版本号是否相同。
《高性能MySQL》书籍中介绍到:
MVCC
只在REPEATABLE READ
和READ COMMITTIED
两个隔离级别下工作,其他两个隔离级别都和MVCC
不兼容,因为READ UNCOMMITTED
总是读取最新的数据行,而不符合当前事务版本数据行,而SERIALIZABLE
串行化隔离级别则会对所有读取的行都加锁。
接下来举个例子说明在可重复读事务隔离级别下,MVCC
机制是如何完成增删改查操作的。
在查询操作中,InnoDB存储引擎跟根据以下两个条件查询对应的行记录,只有满足对应条件才会被返回:
1️⃣ 只查找不晚于当前事务版本的数据行,也就是说,InnoDB存储引擎只会查找版本号小于或者等于当前事务版本的数据行,这些数据行要么在该事务开始前就存在,要么就是事务本身插入或者更新的行。
2️⃣ 对于数据删除,数据行删除的版本要么还没有被定义,要么大于当前事务的版本号,只有这样才能确保事务读到的行,在事务开始前并没有被删除。
举个例子,存在 事务A 和 事务B 两个事务,事务A中存在两套相同的SELECT
语句,事务B存在一条UPDATE
语句,事务A 的第一条查询语句在事务B提交之前执行,第二条查询语句在 事务B 提交之后执行,事务A 如下所示:
-- 事务A操作 START TRANSACTION; SELECT * FROM account WHERE id = 1; //在事务B提交之前执行 SELECT * FROM account WHERE id = 1; //在事务B提交之后执行 COMMIT;
事务B:
-- 事务B操作 START TRANSACTION; UPDATE account SET balance = balance+100 WHERE id = 1; COMMIT;
✔结论:如果没有使用MVCC机制,则事务A中的第一条SELECT语句读取的数据是修改前的数据,而第二条SELECT语句读取的是修改后的数据,两次读取的数据不一致,想想,那不就乱了吗?? 如果使用了MVCC机制,无论事务B如何修改数据,事务A的两条查询语句的到的结果始终是一致的。
在插入操作中,InnoDB
会将新插入的每一条行记录的当前系统版本包保存为行版本号。
比如:向account
表插入一条数据,同时MVCC
的两个版本号分别为create_version
和delete_version
,create_version
代表创建行的版本号,delete_version
代表删除行的版本号,另外还有一个事务ID字段,如下面所示:
INSERT INTO account(id, name, balance) values(1001, 'austin', 100);
对应的版本号信息如下表:
id | name | balance | transaction_id | create_version | delete_version |
---|---|---|---|---|---|
1001 | austin | 100 | 1 | 1 | 未定义 |
可以看出,当向数据表新增记录时,需要设置保存行的版本号,而删除行的版本号未定义。
在更新操作中,InnoDB
存储引擎会插入一行新记录,并保存当前系统的版本号为新记录行的版本号,同时保存当前系统的版本号到原来数据行作为删除标识。比如:将account数据表中id为1001的用户账户月增加100元,对应SQL如下:
UPDATE account SET balance = balance+100 WHERE id = 1001;
执行SQL, 在MVCC机制下的更新操作如下表所示:
id | name | balance | transaction_id | create_version | delete_version |
---|---|---|---|---|---|
1001 | austin | 100 | 1 | 1 | 2 |
1001 | austin | 200 | 2 | 2 | 未定义 |
可以明显看出,执行更新操作时,MVCC
机制是先将原来的数据复制一份,将balance
字段增加100后,再讲create_version
字段的值设置为当前系统的版本号,而delete_version
字段的值未定义。
注意的是:原来的行会被复制到Undo Log中。
在删除操作中,InnoDB存储引擎会保存删除的每一个行记录当前的系统版本号,作为删除标识。比如:删除account数据表中id为1001的数据,SQL如下所示:
DELETE FROM account WHERE id = 1001;
对应MVCC机制下的删除操作如下表所示:
id | name | balance | transaction_id | create_version | delete_version |
---|---|---|---|---|---|
1001 | austin | 200 | 3 | 2 | 3 |
可以看出,当删除数据表数据行时,MVCC机制会将当前系统的版本号写入删除数据行版本字段delete_version中,以此来表示当前数据行已经被删除。
本文主要讲述了事务的特性、类型和本地事务和分布式事务的区别,通过不同的示例解释并发事务带来的更新丢失、脏读、不可重复读、幻读的问题,同时介绍了多版本并发控制MVCC的实现原理,如果文章对你有帮助,感谢点赞?+评论?+收藏❤,我是:??austin流川枫,我们下期见!
【相关推荐:mysql视频教程】
The above is the detailed content of Let's talk in depth about transaction features and implementation principles in MySQL. For more information, please follow other related articles on the PHP Chinese website!