Home>Article>Database> An article briefly analyzing how MySQL solves the phantom reading problem

An article briefly analyzing how MySQL solves the phantom reading problem

青灯夜游
青灯夜游 forward
2023-02-06 20:14:16 1993browse

How does MySQL solve the phantom read problem? The following article will let you talk about this issue. Let’s read the article with questions!

An article briefly analyzing how MySQL solves the phantom reading problem

##  Among the high-frequency interview questions of Jin Busan and Yin Busi, MySQL’s transaction characteristics, isolation level and other issues are also one of the very classic eight-part essays. Faced with this kind of Question, it is estimated that most friends are familiar with it:

Transaction characteristics (ACID):Atomicity(Atomicity),Isolation(Isolation),Consistency(Consistency) andPersistence

Isolation Level:READ UNCOMMITTEDREAD UNCOMMITTED),READ COMMITTEDREAD COMMITTED),Repeatable Read(REPEATABLE READ),Serializable(SERIALIZABLE)

And each kind of isolation The problems caused by the level are:

  • READ UNCOMMITTEDUnder the isolation level,dirty reads,non-repeatable readsand ## may occur. #phantom readproblem
  • READ COMMITTED
  • Under the isolation level,non-repeatable readandphantom readproblems may occur, but they are not possibleDirty readproblem
  • REPEATABLE READ
  • Under the isolation level,phantom readproblem may occur, butdirty read# cannot occur ## andNon-repeatable readproblemsSERIALIZABLE
  • Under the isolation level, various problems cannot occur
  • For MySQL InnoDB The default isolation level supported by the storage engine is
  • REPEATABLE-READ (repeatable)
. From the definition of the four isolation levels of the SQL standard above,

REPEATABLE-READ (repeatable)It is impossible to preventphantom reads, but we all know that the MySQL InnoDB storage engine solves the problem of phantom reads, so how does it solve it?1. Row format

 Before entering the topic, we first have a general understanding of what the row format is. This will help us understand the following MVCC. The row format is The way row records in the table are stored on disk,Innodb

storage engine has a total of 4 different types of row formats:

compact,redundant,dynamic,compress; Although there are many line formats, they are basically the same in principle, as follows, thecompactline format:As can be seen from the figure, A complete record can actually be divided into two parts:recorded additional informationandAn article briefly analyzing how MySQL solves the phantom reading problemrecorded real data.recorded additional informationare respectivelychanges. Long field length list,NULL value listandrecord header information, andrecorded real dataIn addition to our own defined columns, MySQL will Add some default columns to each record. These default columns are also calledhidden columns. The specific columns are as follows:

We don’t need to worry about the value of the hidden column.InnoDBThe storage engine will generate it for us. Let’s draw it in more detail.compactThe row format is as follows:

An article briefly analyzing how MySQL solves the phantom reading problem

  • transaction_id: Transaction id. When a transaction modifies a row record, the transaction id of this transaction will be assigned to this column.
  • roll_pointer: Every time a row is modified When the record is changed, the old version of the data will be written to the undolog log,and then roll_pointerwill point to theundolog, so this column is equivalent to a pointer, through which you can find Modify the previous information

2. MVCC detailed explanation

2.1 Version chain

Assume there is a record as follows:An article briefly analyzing how MySQL solves the phantom reading problemThetransaction idinserted into the record is80, and theroll_pointerpointer is NULL (for ease of understanding, readers can understand that the pointer is NULL, in fact roll_pointer The first bit marks the type of undo log it points to. If the value of this bit is 1, it means that the undo log type it points to is insert undo)

Assume the next twoTransaction IDThe transactions with100and200respectively performUPDATEoperations on this record:

-- 事务id=100 update person set grade =20 where id =1; update person set grade =40 where id =1; -- 事务id=200 update person set grade =70 where id =1;

Each time When the record is modified, anundo logwill be recorded. Eachundo logalso has aroll_pointerattribute (INSERTcorresponding to the operation) ##undo logdoes not have this attribute, because the record does not have an earlier version), you can connect theseundo loginto a linked list, so the current situation is like the picture below Same:

 Every time the record is updated, the old value will be placed in anAn article briefly analyzing how MySQL solves the phantom reading problemundo log, even if it is an old version of the record, as the number of updates increases increase, all versions will be connected into a linked list by theroll_pointerattribute. We call this linked listversion chain. The head node of the version chain is the latest value of the current record. In addition, each version also contains the correspondingtransaction id

#2.2 ReadView

read uncommitted; 2)read committed; 3)REPEATABLE READ; 4)SERIALIZABLE; for example,READ UNCOMMITTED, just read the latest data of the version chain each time;SERIALIZABLE, mainly controlled by locking; andread committedandREPEATABLE READThey all read things that have been submitted, so for these two isolation levels, the core issue is which things in the version chain are visible to the current thing; in order to solve this problem, MySQL proposed the read view concept, which contains four core Concept:

  • m_ids: When generatingread view, the active thing id collection
  • min_trx_id:# The minimum value of ##m_ids, that is, the minimum value of active things when generating read view
  • max_trx_id
  • : Indicates that when generatingread view, the system The next thing id value
  • creator_trx_id
  • should be assigned: the thing id that createsread view, which is the current thing id.
  • With this

, when accessing a record, you only need to follow the steps below to determine whether a certain version of the record is visible:An article briefly analyzing how MySQL solves the phantom reading problem

    creator_trx_id
  • , it means that the current thing is accessing the record it modified, so this version is visibleIf the accessed version thing id is less than
  • min_trx_id
  • , it means that whenread viewis created, the thing has been submitted, and this version is readable to the current thingIf the accessed version of the thing If the id is greater than or equal to
  • max_trx_id
  • , it means that when theread viewis created, the thing id that generates the version record is not opened until theRead viewis generated. Therefore, this version cannot be read by the current thingIf the accessed version thing
  • transaction_id
  • is in them_idscollection, it means thatRead viewis generated time, the transaction is still active and has not been submitted, then the version cannot be accessed; if not, it means that the transaction that generated the version whenReadViewwas created has been submitted and can be accessed
  • Note: The thing id for reading things is 0

InMySQL, a very big difference betweenREAD COMMITTEDandREPEATABLE READisolation levels is that they generate ReadView at different times:

  • READ COMMITTED—— Generate aReadView
  • REPEATABLE READ—— before reading data every time When reading data once, aReadView

is generated. Let’s use detailed examples to illustrate the difference between the two:

Column name Length Description row_id 6 bytes Row ID, uniquely identifies one Record transaction_id 6 bytes Transaction ID roll_pointer 7 bytes Rollback pointer # Four isolation levels for the database: 1)ReadViewWhen the recorded thing id is equal to
#② BEGIN; BEGIN; ③ update person set grade =20 where id = 1; ④ ##⑤ SELECT * FROM person WHERE id = 1; ⑨ ##In time ④, due to transaction In time ⑥, due to transaction
Time number
trx 100 trx 200
BEGIN;





##update person set grade =40 where id =1;



##⑥
##COMMIT;

##update person set grade =70 where id =1;

SELECT * FROM person WHERE id = 1;


COMMIT;

?
COMMIT;

trx 100
the transaction was executed Submit, the version chain recorded in the id=1 line is as follows:
trx 200

, the transaction commit was executed, the version chain recorded in the id=1 line As follows:

An article briefly analyzing how MySQL solves the phantom reading problemAt time ⑤, transactiontrx 100will first generate a

ReadView## when executing the

selectAn article briefly analyzing how MySQL solves the phantom reading problemstatement. #, the content of the

m_ids

list ofReadViewis[100, 200],min_trx_idis100,max_trx_idis201,creator_trx_idis0, at this time, select the visible record from the version chain, and traverse the version chain from top to bottom: Because grade=40, the value oftrx_idis100, which is inm_ids, so the record is not visible. Similarly, the record with grade=20 is also invisible. Continue traversing down, grade=20,trx_idvalue is80, which is less thanmin_trx_idvalue100# inReadView##, so this version meets the requirements, and records with level 10 are returned to the user.In time ⑧, if the isolation level of the transaction isREAD COMMITTED, a separateReadViewwill be generated, the ## of theReadViewThe content of the #m_idslist is

[200]

,min_trx_idis200,max_trx_idis201,creator_trx_idis0. At this time, select the visible record from the version chain, and the version chain is traversed from top to bottom: because grade=70, the value oftrx_idis200, inm_ids, so the record is not visible, continue to traverse, grade=40,trx_idvalue is100, It is less than themin_trx_idvalue200inReadView, so this version meets the requirements, and a record with level 40 is returned to the user.In time ⑧, if the isolation level of the transaction isREPEATABLE READ, in time ⑧, aReadViewwill not be generated separately, but the one of time 5 will be used.ReadView, so the level returned to the user is 10. The result of the two selects is the same. This is the meaning ofrepeatable reading.

3. SummaryBy analyzing the detailed explanation of MVCC, it can be concluded that based on MVCC, under the RR isolation level, it is very easy to solvephantom readingProblem, but we know thatselect for updategenerates current reads and is no longer snapshot reads. In this case, how does MySQL solve the

phantom read

problem? Based on time issues (it does take a lot of time to organize and draw pictures), I will give the conclusion first, and then analyze how MySQL solves thephantom readingproblem under the current reading situation:

Current reading: Use Next-Key Lock (gap lock) to lock to ensure that phantom reading does not occurHow gap lock is used in the current reading situation If you want to solve the problem of phantom reading, interested friends can add a follow and like[Related recommendations:

mysql video tutorial
    ]

The above is the detailed content of An article briefly analyzing how MySQL solves the phantom reading problem. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:juejin.cn. If there is any infringement, please contact admin@php.cn delete