How does MySQL solve the phantom read problem? This article works through the answer. Keep the question in mind as you read.
Among the high-frequency interview questions of the spring hiring season (the so-called golden March and silver April), MySQL's transaction properties and isolation levels are classic stock questions. Most readers can probably recite the answers:

Transaction properties (ACID): Atomicity, Consistency, Isolation, and Durability.
Isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE.
- Under the READ UNCOMMITTED isolation level, dirty reads, non-repeatable reads, and phantom reads may all occur.
- Under the READ COMMITTED isolation level, non-repeatable reads and phantom reads may occur, but dirty reads cannot.
- Under the REPEATABLE READ isolation level, phantom reads may occur, but dirty reads and non-repeatable reads cannot.
- Under the SERIALIZABLE isolation level, none of these problems can occur.
The default isolation level of MySQL's InnoDB storage engine is REPEATABLE READ. By definition, this level cannot prevent phantom reads, yet we all know that the InnoDB storage engine solves the phantom read problem. So how does it do it?
1. Row format
InnoDB has four row formats: `compact`, `redundant`, `dynamic`, and `compressed`. Although there are several row formats, they work on basically the same principle. Take the `compact` row format as an example.
A complete record can be divided into two parts: the record's additional information and the record's real data.

The additional information consists of the variable-length field length list, the NULL value list, and the record header information. In the real data, besides the columns we define ourselves, MySQL adds some default columns to each record. These default columns are also called hidden columns: a row ID (`row_id`), a transaction ID (`transaction_id`), and a rollback pointer (`roll_pointer`).
We don't need to worry about the values of the hidden columns; the InnoDB storage engine generates them for us. A more detailed picture of the `compact` row format is as follows:
Here, `transaction_id` is the ID of the transaction that produced the record, and `roll_pointer` points to the record's undo log. This column is effectively a pointer through which the information before the modification can be found.

2. MVCC in detail

Assume there is a record in the `person` table with `id` = 1 and `grade` = 10. The ID of the transaction that inserted it is 80, and its `roll_pointer` is NULL. (Treating the pointer as NULL is a simplification for ease of understanding. In fact, the first bit of `roll_pointer` marks the type of undo log it points to; if this bit is 1, the undo log is of the insert undo type.)
Assume that two transactions, with transaction IDs 100 and 200, then perform `UPDATE` operations on this record:
    -- transaction id = 100
    update person set grade = 20 where id = 1;
    update person set grade = 40 where id = 1;

    -- transaction id = 200
    update person set grade = 70 where id = 1;
Each time the record is modified, an undo log is written. Each undo log also has a `roll_pointer` attribute (the undo log for an `INSERT` operation does not have one, because the record has no earlier version), so these undo logs can be connected into a linked list.
Every time the record is updated, the old value is placed in an undo log, which serves as an old version of the record. As the number of updates grows, all versions are connected into a linked list by the `roll_pointer` attribute. We call this linked list the version chain. The head node of the version chain is the latest value of the current record. In addition, each version also contains the ID of the transaction that generated it.
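The version chain can be modelled in a few lines of Python. This is only a toy sketch: the `Version` class and the `apply_update` and `walk` helpers are hypothetical names made up for illustration, not InnoDB structures, and the starting value `grade = 10` is taken from the worked example later in the article.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class Version:
    """One version of a record (illustrative model, not InnoDB's layout)."""
    grade: int
    trx_id: int                                # transaction that created this version
    roll_pointer: Optional["Version"] = None   # link to the previous (older) version


def apply_update(head: Version, new_grade: int, trx_id: int) -> Version:
    """Create the new head; the old head stays reachable through roll_pointer."""
    return Version(grade=new_grade, trx_id=trx_id, roll_pointer=head)


def walk(head: Optional[Version]) -> Iterator[Tuple[int, int]]:
    """Yield (grade, trx_id) pairs from the newest version to the oldest."""
    node = head
    while node is not None:
        yield node.grade, node.trx_id
        node = node.roll_pointer


# The record inserted by transaction 80, followed by the three UPDATEs above.
head = Version(grade=10, trx_id=80)
head = apply_update(head, 20, 100)
head = apply_update(head, 40, 100)
head = apply_update(head, 70, 200)

chain = list(walk(head))
print(chain)  # [(70, 200), (40, 100), (20, 100), (10, 80)]
```

The head of the list is always the latest value, and following `roll_pointer` moves back in time until the inserted version, whose pointer is empty.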
A database has four isolation levels: 1) READ UNCOMMITTED; 2) READ COMMITTED; 3) REPEATABLE READ; 4) SERIALIZABLE. For READ UNCOMMITTED, each read simply takes the latest data in the version chain. SERIALIZABLE is mainly enforced by locking. READ COMMITTED and REPEATABLE READ both read only data that has been committed, so for these two isolation levels the core question is which versions in the version chain are visible to the current transaction. To solve this problem, MySQL introduced the `ReadView` concept, which contains four core fields:
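- `m_ids`: the set of IDs of the transactions that are active when the `ReadView` is generated.
- `min_trx_id`: the minimum value in `m_ids`, that is, the smallest active transaction ID when the `ReadView` is generated.
- `max_trx_id`: the ID that the system will assign to the next transaction at the time the `ReadView` is generated.
- `creator_trx_id`: the ID of the transaction that creates the `ReadView`, that is, the current transaction ID.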
With this `ReadView`, when accessing a record you only need to follow the steps below to decide whether a given version of the record is visible:

- If the `trx_id` of the accessed version equals `creator_trx_id`, the current transaction is accessing a record that it modified itself, so this version is visible to the current transaction.
- If the `trx_id` of the accessed version is less than `min_trx_id`, the transaction that generated this version had already committed before the `ReadView` was created, so this version is visible to the current transaction.
- If the `trx_id` of the accessed version is greater than or equal to `max_trx_id`, the transaction that generated this version was started only after the `ReadView` was created, so this version is not visible to the current transaction.
- If the `trx_id` of the accessed version lies between `min_trx_id` and `max_trx_id`, check whether it is in the `m_ids` set. If it is, the transaction was still active and uncommitted when the `ReadView` was created, so the version is not visible. If it is not, the transaction that generated the version had already committed when the `ReadView` was created, so the version is visible.
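The visibility rules can be written as a small predicate. The following is a minimal sketch under stated assumptions, not InnoDB's actual code: the `ReadView` class and the `is_visible` function are hypothetical names that mirror the four fields described above, and the check against `creator_trx_id` encodes the usual rule that a transaction always sees its own changes. The sample values (`m_ids` of 100 and 200, next transaction ID 201) are assumed for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ReadView:
    """Snapshot of transaction state (illustrative model)."""
    m_ids: FrozenSet[int]   # transactions active when the view was created
    min_trx_id: int         # smallest ID in m_ids
    max_trx_id: int         # ID the system would assign to the next transaction
    creator_trx_id: int     # ID of the transaction that created the view


def is_visible(version_trx_id: int, view: ReadView) -> bool:
    """Decide whether a version created by version_trx_id is visible to the view."""
    if version_trx_id == view.creator_trx_id:
        return True                      # the transaction's own change
    if version_trx_id < view.min_trx_id:
        return True                      # committed before the view was created
    if version_trx_id >= view.max_trx_id:
        return False                     # started after the view was created
    # In between: visible only if the creating transaction was no longer active.
    return version_trx_id not in view.m_ids


# Assumed sample state: transactions 100 and 200 are active, next ID is 201.
view = ReadView(m_ids=frozenset({100, 200}), min_trx_id=100,
                max_trx_id=201, creator_trx_id=0)

print(is_visible(80, view))    # True
print(is_visible(100, view))   # False
print(is_visible(150, view))   # True
print(is_visible(201, view))   # False
```

Transaction 150 is visible in this sketch because it falls inside the ID range but is not in `m_ids`, meaning it had already committed when the view was taken.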
In MySQL, one major difference between the READ COMMITTED and REPEATABLE READ isolation levels is the point at which they generate a `ReadView`:

- READ COMMITTED: a `ReadView` is generated before every read of data.
- REPEATABLE READ: a `ReadView` is generated only the first time data is read.

The following detailed example illustrates the difference between the two:
For reference, the hidden columns that InnoDB adds to each record:

Column name | Description |
---|---|
row_id | Row ID, uniquely identifies one record |
transaction_id | Transaction ID |
roll_pointer | Rollback pointer |
Time number | trx 100 | trx 200 | Reading transaction |
---|---|---|---|
① | BEGIN; | | |
② | | BEGIN; | BEGIN; |
③ | update person set grade = 20 where id = 1; | | |
④ | update person set grade = 40 where id = 1; | | |
⑤ | | | SELECT * FROM person WHERE id = 1; |
⑥ | COMMIT; | | |
⑦ | | update person set grade = 70 where id = 1; | |
⑧ | | | SELECT * FROM person WHERE id = 1; |
⑨ | | COMMIT; | |
⑩ | | | COMMIT; |

The version chain of the id=1 record changes as these statements run. After the updates by trx 100 it is, from newest to oldest: grade=40 (trx 100), grade=20 (trx 100), grade=10 (trx 80). After trx 100 commits and trx 200 runs its update at time ⑦, grade=70 (trx 200) becomes the new head.
At time ⑤, the `SELECT` statement first generates a `ReadView`. Its `m_ids` list is `[100, 200]`, `min_trx_id` is `100`, `max_trx_id` is `201`, and `creator_trx_id` is `0`. The visible record is then chosen by traversing the version chain from top to bottom. The version with grade=40 has a `trx_id` of `100`, which is in `m_ids`, so it is not visible. For the same reason, the version with grade=20 is not visible either. Continuing down, the version with grade=10 has a `trx_id` of `80`, which is less than the `min_trx_id` value `100` in the `ReadView`, so this version meets the requirement and the record with grade 10 is returned to the user.
At time ⑧, if the isolation level of the transaction is READ COMMITTED, a separate `ReadView` is generated. Its `m_ids` list is `[200]`, `min_trx_id` is `200`, `max_trx_id` is `201`, and `creator_trx_id` is `0`. The version chain is again traversed from top to bottom. The version with grade=70 has a `trx_id` of `200`, which is in `m_ids`, so it is not visible. Continuing down, the version with grade=40 has a `trx_id` of `100`, which is less than the `min_trx_id` value `200` in the `ReadView`, so this version meets the requirement and the record with grade 40 is returned to the user.

At time ⑧, if the isolation level of the transaction is REPEATABLE READ, no separate `ReadView` is generated. The `ReadView` from time ⑤ is reused, so the grade returned to the user is 10. The two `SELECT` statements return the same result, which is exactly what repeatable read means.
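The whole example can be replayed as a short simulation. This is a hedged sketch rather than a model of InnoDB internals: `ReadView`, `make_view`, `visible`, `read`, and `run` are hypothetical helpers, the version chain is a plain list of `(grade, trx_id)` pairs ordered newest first, and the reading transaction is assumed to have `creator_trx_id` 0, as in the walk-through above.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class ReadView(NamedTuple):
    m_ids: Set[int]
    min_trx_id: int
    max_trx_id: int
    creator_trx_id: int


def make_view(active: Set[int], next_trx_id: int, creator: int = 0) -> ReadView:
    """Build a view from the set of active transactions and the next ID to assign."""
    lowest = min(active) if active else next_trx_id
    return ReadView(set(active), lowest, next_trx_id, creator)


def visible(trx_id: int, view: ReadView) -> bool:
    """Apply the ReadView visibility rules to one version."""
    if trx_id == view.creator_trx_id or trx_id < view.min_trx_id:
        return True
    if trx_id >= view.max_trx_id:
        return False
    return trx_id not in view.m_ids


def read(chain: List[Tuple[int, int]], view: ReadView) -> Optional[int]:
    """Return the grade of the first visible version, newest first."""
    for grade, trx_id in chain:
        if visible(trx_id, view):
            return grade
    return None


def run(level: str) -> Tuple[Optional[int], Optional[int]]:
    """Replay times 5 to 8 and return the results of both SELECTs."""
    # Time 5: trx 100 has written grade 20 and 40; trx 100 and 200 are active.
    chain = [(40, 100), (20, 100), (10, 80)]
    view = make_view({100, 200}, next_trx_id=201)
    first = read(chain, view)

    # Time 6: trx 100 commits. Time 7: trx 200 writes grade 70.
    chain = [(70, 200)] + chain

    # Time 8: READ COMMITTED takes a fresh view, REPEATABLE READ reuses the old one.
    if level == "READ COMMITTED":
        view = make_view({200}, next_trx_id=201)
    second = read(chain, view)
    return first, second


print(run("READ COMMITTED"))   # (10, 40)
print(run("REPEATABLE READ"))  # (10, 10)
```

The only difference between the two runs is whether a new `ReadView` is built before the second read, which is enough to change the second result from 10 to 40.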
3. Summary
From the detailed analysis of MVCC above, we can conclude that under the RR isolation level MVCC easily solves the phantom read problem for snapshot reads. But we know that `select ... for update` produces a current read rather than a snapshot read. How does MySQL solve phantom reads in that case?

- Current read: next-key locks (a record lock plus a gap lock) are used to ensure that phantom reads do not occur.

How gap locks solve the phantom read problem for current reads is a topic for a separate article.
The above is the detailed content of An article briefly analyzing how MySQL solves the phantom reading problem. For more information, please follow other related articles on the PHP Chinese website!