In recent interviews I have repeatedly been asked about database transaction mechanisms, isolation levels, and optimistic versus pessimistic locking. I could only claim a vague understanding: some of the concepts were things I had memorized rather than understood, so my answers were not good. Afterwards I went back to the book, studied the topic, and worked out a few things, so I am recording them here.
What is a transaction?
As I understand it, a transaction is a complete business operation. One business operation may consist of multiple actions, and that complete set of actions forms a transaction. The classic example is a bank transfer: transferring money from account A to account B requires two actions, deducting from account A and adding to account B, and it must be guaranteed that either both actions happen or neither does.
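As a minimal sketch of that transfer in MySQL (the accounts table and its columns are made up here for illustration):

  -- Move 100 from account A to account B as one indivisible unit.
  START TRANSACTION;
  UPDATE accounts SET balance = balance - 100 WHERE id = 'A';
  UPDATE accounts SET balance = balance + 100 WHERE id = 'B';
  COMMIT;
  -- If anything goes wrong partway, issue ROLLBACK instead and neither change takes effect.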
Transactions have ACID characteristics, including:
● Atomicity: a transaction is indivisible; it either succeeds completely or fails completely, never partially succeeds and partially fails. If it fails partway through, the mess has to be cleaned up, that is, the data is rolled back.
● Consistency: consistency concerns the final result of the transaction, ensuring the data ends up in a valid state. It emphasizes outcomes and builds on atomicity: if atomicity is guaranteed, the result stays consistent.
● Isolation: a transaction's changes are not visible to other transactions before it is committed, and data is isolated between transactions (the degree of isolation differs between isolation levels).
● Durability: once a transaction is committed, its changes are persisted and survive long term.
Transaction isolation level
Before understanding the isolation level of a transaction, you need to understand several concepts of data reading:
● Dirty read: reading data that another transaction has not yet committed.
● Non-repeatable read: two queries within the same transaction; if another transaction modifies a record in between and commits, the second query sees a different value for the same record, so the two queries are inconsistent.
● Phantom read: two queries within the same transaction; if another transaction inserts a record in between and commits, the second query returns rows that the first did not, so the two result sets differ (a two-session sketch of a non-repeatable read follows this list).
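To make the second anomaly concrete, here is a hedged two-session sketch (the accounts table is hypothetical, and both sessions are assumed to run at the read committed level described below):

  -- Session 1
  START TRANSACTION;
  SELECT balance FROM accounts WHERE id = 'A';   -- suppose this returns 100

  -- Session 2, running in between
  UPDATE accounts SET balance = 200 WHERE id = 'A';
  COMMIT;

  -- Session 1 again
  SELECT balance FROM accounts WHERE id = 'A';   -- now returns 200: the same query gives a different result
  COMMIT;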
Transaction isolation is divided into several levels, and the level determines how strictly transactions are isolated. MySQL supports four levels (an example of switching between them follows the list):
● Read uncommitted: the lowest level. Uncommitted modifications made by transaction A are already visible to transaction B, so dirty reads occur. This level is rarely used.
● Read committed: transaction A's modifications become visible to B only after A commits. Non-repeatable reads can still occur at this level: two identical queries in one transaction may return different results.
● Repeatable read: MySQL's default level. If a record is modified by another transaction between two queries, that modification is not visible, so repeated queries of the same record return consistent results. Newly inserted rows from other transactions can still show up, however, so phantom reads may still occur.
● Serializable: transactions execute serially and every record read is locked. This causes blocking, and under concurrency the performance cost is severe, so it is rarely used.
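For reference, the level can be inspected and changed per session; a small sketch (note the variable is named tx_isolation in older MySQL versions and transaction_isolation from 5.7.20/8.0 onwards):

  -- Show the current session's isolation level.
  SELECT @@transaction_isolation;
  -- Switch the current session to a different level.
  SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
  SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;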
Isolation level overview chart
Isolation implementation of transactions
Isolation is implemented in two ways: one is locking, which achieves isolation by separating transactions in time; the other is version control, which keeps multiple versions of a record to achieve isolation.
1. Locks
MySQL's locks are divided into read locks and write locks. A read lock is only for reading, so multiple transactions can read the same data at the same time; it is shared in nature. A write lock involves changing data, so it conflicts with other write locks and with read locks; it is exclusive in nature.
By granularity, locks are divided into table-level locks and row-level locks. Table locks are generally taken when the table structure is modified or the whole table is updated, and they block all reads and writes on the table; row-level locks are generally taken when a specific record is updated, and only that record is locked. The smaller the lock granularity, the higher the concurrency, so row-level locks should be preferred and table locks avoided where possible, the same principle as lock granularity in application code (a short sketch follows).
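A rough sketch of taking these locks explicitly in InnoDB (the accounts table is again hypothetical; MySQL 8.0 also accepts FOR SHARE in place of LOCK IN SHARE MODE):

  -- Table-level lock: READ blocks writers on the whole table, WRITE blocks everyone else.
  LOCK TABLES accounts READ;
  UNLOCK TABLES;

  -- Row-level locks, taken inside a transaction.
  START TRANSACTION;
  SELECT * FROM accounts WHERE id = 'A' LOCK IN SHARE MODE;  -- shared (read) lock on the row
  SELECT * FROM accounts WHERE id = 'A' FOR UPDATE;          -- exclusive (write) lock on the row
  COMMIT;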
2. Multi-version concurrency control
For performance reasons, MySQL offers another mechanism besides row-level locks: multi-version concurrency control (MVCC), which is implemented by the storage engine.
The book explains a simplified view of InnoDB's implementation. A record is kept in multiple versions: two hidden columns are added to each record, one holding the creation version number and the other the deletion version number. Each time a transaction starts it is assigned a transaction version number, which increases monotonically, and operations inside the transaction are compared against this number. The rules are as follows (a sketch in SQL terms follows the list):
● When querying: only return records that already existed when the current transaction started or were created by this transaction, and that have not been deleted; that is, records whose creation version number is less than or equal to the current transaction version number, and whose deletion version number is either undefined or greater than the current transaction version number.
● When inserting: the new record's creation version number is set to the current transaction version number.
● When deleting: the record's deletion version number is set to the current transaction version number.
● When updating: a new record is inserted with its creation version number set to the current transaction version number, and the original record's deletion version number is set to the current transaction version number, marking it as deleted. An update is therefore effectively a delete plus an insert.
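To make those rules concrete, here is a hedged sketch that treats the two hidden columns as ordinary columns named create_version and delete_version (illustrative names, not InnoDB's real internal fields), with @tx_version standing in for the current transaction's version number:

  -- Rows visible to the current transaction.
  SELECT *
  FROM t
  WHERE create_version <= @tx_version
    AND (delete_version IS NULL OR delete_version > @tx_version);

  -- INSERT: new row gets create_version = @tx_version.
  -- DELETE: existing row gets delete_version = @tx_version.
  -- UPDATE: insert a new row with create_version = @tx_version
  --         and mark the old row with delete_version = @tx_version.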
3. Optimistic lock and pessimistic lock
From the perspective of how they are used, locks are divided into pessimistic locks and optimistic locks. A pessimistic lock takes the pessimistic view that the data I read may well be modified by someone else, so the data is locked at query time to stop anyone else from touching it. An optimistic lock takes the optimistic view that the data I read is very unlikely to be modified by someone else, so nothing is locked at query time; only when the change is submitted do we check whether someone else modified the data in the meantime, and deal with it then.
The implementation of optimistic locking and pessimistic locking:
● Pessimistic locking is easy to achieve at the database level with select ... for update, which locks the matching rows at query time.
● Optimistic locking is more involved than pessimistic locking. You can add a version number column to the table; on every update the version number is incremented by 1, and the update checks the version number to confirm whether the data read earlier has been modified by someone else. If it has, the update does nothing, or the program throws an exception (a sketch of both approaches follows).
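A minimal sketch of both approaches, assuming an accounts table that carries a version column for the optimistic case:

  -- Pessimistic: hold a lock on the row from read to write.
  START TRANSACTION;
  SELECT balance FROM accounts WHERE id = 'A' FOR UPDATE;
  UPDATE accounts SET balance = balance - 100 WHERE id = 'A';
  COMMIT;

  -- Optimistic: no lock on read; verify the version number at update time.
  SELECT balance, version FROM accounts WHERE id = 'A';  -- suppose this returns version = 3
  UPDATE accounts
  SET balance = balance - 100, version = version + 1
  WHERE id = 'A' AND version = 3;
  -- If the affected row count is 0, someone else changed the record first,
  -- so the application retries or reports a conflict.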
Should we use optimistic locking or pessimistic locking:
From a performance standpoint, optimistic locking is faster because nothing is locked between the query and the update, but it is harder to implement than pessimistic locking and easier to get wrong. So the factors to weigh are how high the system's concurrency is and how likely conflicts are. When concurrency is high, optimistic locking is the better choice; otherwise the simpler pessimistic approach is fine.