
Completely master MySQL's three major logs: binlog, redo log and undo log

WBOY (reposted)
2022-02-04 06:00:30, 3751 views

This article covers MySQL's logs, focusing on the binary log (binlog) and the transaction logs (the redo log and the undo log).

1. binlog

binlog records the write operations (not queries) performed on the database and is stored on disk in binary form. It is MySQL's logical log and is written by the server layer, so a MySQL database using any storage engine can record binlog.

  • Logical log: can be simply understood as the SQL statements themselves;
  • Physical log: MySQL stores data in data pages, and a physical log records the changes made to those pages.

Binlog is written by appending. The size of each binlog file can be set with the max_binlog_size parameter. When a file reaches that size, a new file is created to hold subsequent log records.

binlog usage scenarios
In practice, binlog has two main usage scenarios: master-slave replication and data recovery.

  • Master-slave replication: Enable binlog on the Master side, and then send the binlog to each Slave side. The Slave side replays the binlog to achieve master-slave data consistency.
  • Data recovery: Recover data by using the mysqlbinlog tool.

MySQL master-slave synchronization principle

  • Master node binlog dump thread
    When a slave node connects to the master node, the master creates a log dump thread to send the contents of the binlog. While reading events from the binlog, this thread locks the binlog on the master node. The lock is released as soon as the read completes, even before the events are sent to the slave node;
  • Slave node I/O thread
    After the start slave command is executed on the slave node, the slave creates an I/O thread that connects to the master node and requests the updated binlog from the master. When the I/O thread receives updates from the master's binlog dump thread, it saves them in the local relay log;
  • Slave node SQL thread
    The SQL thread reads the contents of the relay log, parses them into specific operations and executes them, which ultimately keeps the master and slave data consistent;

binlog content
As mentioned above, binlog is a logical log that can be simply understood as SQL statements, but in fact it also contains the reverse logic of each executed statement. A delete is recorded together with the information for the reverse insert; an update contains the data rows both before and after the update; an insert is recorded together with the information for the corresponding delete.

binlog format
There are three binlog formats: statement, row and mixed. Before MySQL 5.7.7 the default was statement; from MySQL 5.7.7 onward the default is row. The format can be changed through binlog-format in the my.ini configuration file.
(1) statement: statement-based replication (SBR). Every SQL statement that modifies data is recorded in the binlog.

  • Advantages: No need to specifically record changes in a certain row, saving space, reducing IO, and improving performance;
  • Disadvantages: statements that call functions such as sysdate() or sleep() may produce different results on the slave, leading to inconsistency between master and slave data;

(2) row: row-based replication (RBR). It does not record the context of the SQL statement; it records only which rows were modified and how.

  • Advantages: the modification of every row is recorded in full detail, so there is no situation in which the data cannot be replicated correctly;
  • Disadvantages: because every row change is recorded in detail, a large amount of log content can be generated. A single update statement that modifies many rows writes every modified row to the binlog. In particular, for an alter table operation the change in table structure means every row changes, so the log volume grows sharply;

(3) mixed: since statement and row each have advantages and disadvantages, the mixed format combines the two. Normally the statement format is used; when a statement cannot be replicated safely in that format, MySQL switches to the row format.
As noted above, newer versions (from MySQL 5.7.7) use the row format by default, and the row format has been optimized accordingly: an alter table operation is recorded in statement format, while all other operations are still recorded in row format.
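
The difference between the statement and row formats can be illustrated with a toy model. This is only a sketch, not MySQL code: the fake clock stands in for sysdate(), and the names make_clock and replicate are invented for this example.

```python
import itertools


def make_clock():
    """Fake wall clock: every call returns a later 'time' than the previous one."""
    counter = itertools.count(100)
    return lambda: next(counter)


def replicate(binlog_format):
    """Run 'UPDATE t SET ts = sysdate() WHERE id = 1' on a master and replicate it."""
    clock = make_clock()
    # The master executes the statement and stores the current time.
    master_row = {"id": 1, "ts": clock()}
    if binlog_format == "statement":
        # The slave re-executes the SQL text later, so sysdate() returns a new value.
        slave_row = {"id": 1, "ts": clock()}
    else:
        # Row format: the slave applies the logged row contents unchanged.
        slave_row = dict(master_row)
    return master_row, slave_row


for fmt in ("statement", "row"):
    master, slave = replicate(fmt)
    print(fmt, "consistent" if master == slave else "inconsistent")
```

In this model the statement format diverges because the non-deterministic function is evaluated twice, while the row format copies the resulting value.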

binlog flushing timing

For the InnoDB storage engine, the binlog is written only when a transaction commits. At that point the record is still in memory, and MySQL uses the sync_binlog parameter to control when the binlog is flushed to disk. The value range is 0-N:

  • 0: no forced flushing; the operating system decides when to write to disk;
  • 1: the binlog is flushed to disk on every commit;
  • N: the binlog is flushed to disk once every N transactions;

As the list shows, the safest setting for sync_binlog is 1, which is also the default from MySQL 5.7.7 onward. A larger value can improve database performance, so in practice the value can be increased moderately, trading some consistency for better performance.
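
The trade-off can be made concrete with a small simulation. This is a hedged sketch of the policy described above, not MySQL's implementation; the class and function names are hypothetical, and for sync_binlog=0 it assumes the worst case in which the operating system has not flushed anything yet.

```python
class BinlogSyncSimulator:
    """Toy model of the sync_binlog policy."""

    def __init__(self, sync_binlog):
        self.sync_binlog = sync_binlog
        self.unsynced = 0      # committed transactions not yet flushed to disk
        self.fsync_calls = 0   # number of forced flushes issued

    def commit(self):
        self.unsynced += 1
        # sync_binlog = 0 never forces a flush; N > 0 flushes every N commits.
        if self.sync_binlog > 0 and self.unsynced >= self.sync_binlog:
            self.fsync_calls += 1
            self.unsynced = 0


def at_risk_after(commits, sync_binlog):
    """Return (transactions at risk on a crash, number of flushes)."""
    sim = BinlogSyncSimulator(sync_binlog)
    for _ in range(commits):
        sim.commit()
    return sim.unsynced, sim.fsync_calls


for setting in (0, 1, 4):
    print(setting, at_risk_after(10, setting))
```

A larger N means fewer flushes for the same number of commits, but more committed transactions that exist only in memory at any moment.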

Physical file size of binlog

In the my.ini configuration file, the size of each binlog file is configured with max_binlog_size. When the log volume exceeds that size, the system creates a new file and continues writing there. What happens when a transaction is very large, or when logs keep accumulating and take up too much physical space? MySQL provides an automatic deletion mechanism, enabled by setting the expire_logs_days parameter in my.ini. The unit is days: 0 means files are never deleted automatically; N means files are deleted automatically once they are more than N days old.
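
Rotation and expiry can be sketched as two small functions. This is an illustrative model only: it assumes an event is never split across files (so a file may end up larger than the limit), and the file names and function names are made up for the example.

```python
def rotate_events(event_sizes, max_binlog_size):
    """Group event sizes into files, opening a new file once the limit is reached."""
    files = []
    current = None
    current_size = 0
    for size in event_sizes:
        # Start a new file at the beginning, or when the current one is full.
        if current is None or current_size >= max_binlog_size:
            current = []
            files.append(current)
            current_size = 0
        current.append(size)
        current_size += size
    return files


def expired_files(file_ages_days, expire_logs_days):
    """Return the files older than expire_logs_days; 0 disables automatic deletion."""
    if expire_logs_days == 0:
        return []
    return [name for name, age in file_ages_days.items() if age > expire_logs_days]


print(rotate_events([40, 40, 40, 40, 40], max_binlog_size=100))
print(expired_files({"binlog.000001": 10, "binlog.000002": 3}, expire_logs_days=7))
```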

2. redo log

The redo log is a log specific to the InnoDB engine. It is mainly used to provide transaction durability and crash-safe behavior. The redo log is a physical log: it records the specific modifications made to data pages after a SQL statement is executed.
When MySQL is running, data is loaded from disk into memory. When a SQL statement modifies data, the change is at first held only in memory. If the power fails at that moment, the modification is lost. MySQL therefore looks for opportunities to flush modified data from memory back to disk. Doing this directly has two performance problems:

  • InnoDB exchanges data with the disk in units of pages, while a transaction may modify only a few bytes of a page. Flushing a complete data page to disk for such a change wastes resources;
  • A transaction may touch multiple data pages that are logically contiguous but not physically contiguous, and the resulting random IO performs poorly;

MySQL therefore uses the redo log to record the specific modifications a transaction makes to data pages, and flushes the redo log to disk. A natural question is whether this adds yet another IO when the goal was to reduce IO. The designers of InnoDB took this into account: redo log records are generally small, and flushing them to disk is sequential IO, which performs better than random IO.

Basic concept of redo log
The redo log consists of two parts: the redo log buffer in memory and the redo log file on disk. Every modification to a data record is first written to the redo log buffer, and the buffered changes are flushed to the redo log file at an appropriate time. This technique of writing the log first and the data later is called WAL (Write-Ahead Logging). Note that the redo log is flushed to disk before the data page. Modifications to the clustered index, secondary indexes and undo pages all have to be recorded in the redo log.

In an operating system, data in a user-space buffer generally cannot be written directly to disk; it must pass through the kernel-space buffer (OS buffer). Writing the redo log buffer to the redo log file therefore means first writing it to the OS buffer and then flushing it to the redo log file with the fsync() system call.
MySQL supports three timings for writing the redo log buffer to the redo log file, configured through the innodb_flush_log_at_trx_commit parameter. The meaning of each parameter value is as follows:

  • 0 (delayed write): on commit, the log in the redo log buffer is not written to the OS buffer. Instead, once per second it is written to the OS buffer and fsync() is called to flush it to the redo log file. In other words, data reaches the disk (approximately) every second, and a crash can lose up to about 1 second of data.
  • 1 (write and flush on every commit): on every commit, the log in the redo log buffer is written to the OS buffer and fsync() is called to flush it to the redo log file. No data is lost even if the system crashes, but because every commit writes to disk, IO performance is poor.
  • 2 (write on every commit, delayed flush): on every commit the log is written only to the OS buffer, and fsync() is called once per second to flush the OS buffer to the redo log file. A crash of the operating system can therefore still lose up to about 1 second of data.
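
The three settings can be compared with a toy model of the redo log buffer, the OS buffer and the disk. This is a simplified sketch under the assumptions in the list above, not InnoDB's code; the class and method names are hypothetical.

```python
class RedoFlushModel:
    """Toy model of innodb_flush_log_at_trx_commit = 0, 1 or 2."""

    def __init__(self, mode):
        self.mode = mode
        self.log_buffer = []   # redo log buffer inside the MySQL process
        self.os_buffer = []    # operating system cache
        self.disk = []         # redo log file on disk

    def _write_to_os(self):
        self.os_buffer.extend(self.log_buffer)
        self.log_buffer.clear()

    def _fsync(self):
        self.disk.extend(self.os_buffer)
        self.os_buffer.clear()

    def commit(self, txn):
        self.log_buffer.append(txn)
        if self.mode in (1, 2):
            self._write_to_os()   # modes 1 and 2 write to the OS buffer on commit
        if self.mode == 1:
            self._fsync()         # only mode 1 also flushes on commit

    def every_second(self):
        # Background task: write and flush whatever is still pending.
        self._write_to_os()
        self._fsync()

    def survives_mysql_crash(self):
        # The OS keeps running, so its buffer still reaches the disk.
        return self.disk + self.os_buffer

    def survives_os_crash(self):
        return list(self.disk)


def run(mode):
    model = RedoFlushModel(mode)
    model.commit("t1")
    model.every_second()
    model.commit("t2")   # the crash happens before the next one-second tick
    return model.survives_mysql_crash(), model.survives_os_crash()


for mode in (0, 1, 2):
    print(mode, run(mode))
```

In this model, mode 0 loses t2 in either kind of crash, mode 2 loses it only if the operating system itself goes down, and mode 1 keeps both transactions.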

redo log recording format
The redo log has a fixed size and is written in a circular fashion: when the end of the log is reached, writing continues from the beginning. Why is it designed like this?
The main purpose of the redo log is to relax the requirement to flush data pages immediately. The redo log records the modifications made to data pages, but once a data page itself has been flushed to disk, the corresponding records are no longer needed. New records can therefore overwrite records that MySQL has determined to be obsolete. So how does it decide what may be overwritten?
In the redo log file, write pos is the log sequence number (LSN) up to which the redo log has currently been written. When data pages are flushed to disk, an LSN in the redo log file is updated to indicate that everything before it has been written to disk; this LSN is the check point. Moving forward from write pos around to check point is the free part of the redo log, available for new records; the part from check point to write pos holds modifications that have been recorded in the redo log but whose data pages have not yet been flushed to disk. When write pos catches up with check point, the check point must first be pushed forward (by flushing data pages) to free up space before new log records can be written.
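
The relationship between write pos and check point can be sketched as a ring buffer. This is a minimal illustration, assuming one slot per log record and LSNs that simply count records; the class name and its methods are invented for the example and do not mirror InnoDB internals.

```python
class CircularRedoLog:
    """Toy fixed-size redo log with a write position and a checkpoint."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.write_pos = 0           # LSN of the next record to be written
        self.checkpoint = 0          # records before this LSN are already in data pages on disk
        self.forced_checkpoints = 0  # times a write had to wait for a checkpoint advance

    def free_space(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def advance_checkpoint(self, records):
        # Models flushing data pages, which makes old log records reusable.
        self.checkpoint = min(self.write_pos, self.checkpoint + records)

    def append(self, records=1):
        for _ in range(records):
            if self.free_space() == 0:
                # write pos has caught up with the check point: free a slot first.
                self.advance_checkpoint(1)
                self.forced_checkpoints += 1
            self.write_pos += 1

    def slot(self, lsn):
        # Physical position inside the fixed-size file.
        return lsn % self.capacity

    def records_to_replay(self):
        # What crash recovery would need to re-apply.
        return list(range(self.checkpoint, self.write_pos))


log = CircularRedoLog(capacity=4)
log.append(4)    # the log is now full
log.append(2)    # each of these writes first forces the checkpoint forward
print(log.checkpoint, log.write_pos, log.records_to_replay())
```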

When InnoDB starts, it always performs recovery, regardless of whether the last shutdown was normal or abnormal. During recovery, the LSN in each data page is checked first. If it is smaller than the LSN in the redo log (the write pos position), the redo log contains operations that have not yet been applied to that data page, and the data is brought up to date starting from the most recent check point.

Can the LSN in a data page be greater than the LSN in the redo log? Yes. When this happens, the part beyond the redo log is not redone, because it has already been applied and does not need to be applied again.
The difference between redo log and binlog

  • File size: the redo log has a fixed size; the size of each binlog file can be set with the max_binlog_size parameter.
  • Implementation: the redo log is implemented by the InnoDB engine layer, and not all engines have it; binlog is implemented by the Server layer, and all engines can use it.
  • Recording method: the redo log is written circularly, wrapping back to the beginning when it reaches the end; binlog is appended, and when a file grows larger than the configured size, subsequent log records go to a new file.
  • Applicable scenarios: the redo log is suited to crash recovery (crash-safe); binlog is suited to master-slave replication and data recovery.

The differences show that binlog is used only for archiving, and binlog alone does not provide crash-safe capability. The redo log alone is not enough either, because it is specific to InnoDB and its records are overwritten once the corresponding data has been written to disk. Both binlog and redo log therefore have to be recorded to ensure that no data is lost when the database crashes and restarts.
Two-phase commit
The sections above introduced the redo log and binlog. When data is modified, both of them record the modification, but one is a physical log and the other is a logical log. So how do the two take part in a modification?

Suppose the following update statement is to be executed: update table_name set c=c+1 where id=2. The execution process is as follows:

  • First, the record with id=2 is located;
  • The executor takes the row data returned by the engine, adds 1 to the value to produce a new row, and calls the engine interface to write the new row;
  • The engine updates the new row in memory and records the update in the redo log. At this point the redo log entry is in the prepare state. The engine then tells the executor that execution is complete and the transaction can be committed at any time;
  • The executor generates the binlog for this operation and writes the binlog to disk;
  • The executor calls the engine's commit interface, the engine changes the redo log entry just written to the commit state, and the update is complete;

Splitting the redo log write into a prepare step and a commit step in this way is called two-phase commit.

Both redolog and binlog can be used to represent the commit status of a transaction, and two-phase commit is to keep the two states logically consistent. If you don't use two-phase commit, but write one first and then the other, it may cause some problems.

Take the same update statement as an example. Assume that in the row with id=2 the field c currently has the value 0, and consider the following two cases:
Write redolog first and then binlog
Assume the redo log is written first, and MySQL crashes and restarts after the redo log has been written but before the binlog has been written. Because the redo log was written, the modification still exists after the restart, so the value of c in this row after recovery is 1. The binlog, however, does not contain this statement, so a binlog backup taken later does not contain it either. If this binlog is later used to restore a temporary database, the update is missing: the value of c in the restored row is 0, which differs from the value in the original database.
Write binlog first and then redolog
Assume the binlog is written first and the system restarts before the redo log is written. After the restart, the redo log has no record of the modification to c, so the value of c is still 0. The binlog, however, already contains the record "change c from 0 to 1". When the binlog is later used for a restore, it contains one extra transaction: the value of c in the restored row is 1, which differs from the value in the original database.

In summary, if one log is written completely and then the other, without two-phase commit, the state of the database can end up inconsistent with the state of a database restored from the binlog.
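
The two failure cases, and how two-phase commit avoids them, can be checked with a small model. This sketch assumes the commonly described recovery rule, which the article does not spell out: a redo log entry in the prepare state is committed on recovery only if the matching binlog record was written, and is rolled back otherwise. All names are hypothetical.

```python
def value_after_crash_recovery(redo_state, binlog_written):
    """Value of c (initially 0) in the original database after a crash and restart."""
    if redo_state == "commit":
        return 1
    if redo_state == "prepare" and binlog_written:
        return 1   # prepared and binlog complete: recovery commits the transaction
    return 0       # nothing written, or prepared without binlog: rolled back


def value_after_binlog_restore(binlog_written):
    """Value of c in a database rebuilt by replaying the binlog."""
    return 1 if binlog_written else 0


# Crash points under two-phase commit: (redo log state, binlog written?)
TWO_PHASE_CRASH_POINTS = {
    "before prepare": (None, False),
    "after prepare, before binlog": ("prepare", False),
    "after binlog, before commit": ("prepare", True),
    "after commit": ("commit", True),
}

for name, (redo_state, binlog_written) in TWO_PHASE_CRASH_POINTS.items():
    original = value_after_crash_recovery(redo_state, binlog_written)
    restored = value_after_binlog_restore(binlog_written)
    print(name, original, restored)

# Without two-phase commit the two logs can disagree:
redo_first = (value_after_crash_recovery("commit", False), value_after_binlog_restore(False))
binlog_first = (value_after_crash_recovery(None, True), value_after_binlog_restore(True))
print(redo_first, binlog_first)
```

Under this model every crash point of the two-phase sequence leaves the original database and the binlog-restored database with the same value of c, while the two single-phase orderings reproduce the mismatches described above.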

3. undo log

The undo log records the state of a row before it is modified, that is, the data as it was before the change. When a transaction is rolled back, the undo log is used to restore the rows to the way they were before the transaction started. Transaction atomicity is achieved through the undo log (durability, as described above, relies on the redo log). The undo log mainly records logical changes to the data: an INSERT statement corresponds to a DELETE undo record, and each UPDATE statement corresponds to an opposite UPDATE undo record, so that when an error occurs the data can be rolled back to its state before the transaction. During data recovery, the undo log is used together with binlog and the redo log to ensure that the recovered data is correct.
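
The idea of recording the inverse of each operation can be sketched with an in-memory table. This is a simplified illustration, not InnoDB's undo format; TinyTable and its methods are invented for the example.

```python
class TinyTable:
    """Toy key-value table that keeps an undo log for the current transaction."""

    def __init__(self):
        self.rows = {}
        self.undo_log = []   # inverse operations, oldest first

    def insert(self, key, value):
        # The inverse of an INSERT is a DELETE of the same key.
        self.undo_log.append(("delete", key, None))
        self.rows[key] = value

    def update(self, key, value):
        # The inverse of an UPDATE is an UPDATE back to the old value.
        self.undo_log.append(("update", key, self.rows[key]))
        self.rows[key] = value

    def delete(self, key):
        # The inverse of a DELETE is an INSERT of the old row.
        self.undo_log.append(("insert", key, self.rows[key]))
        del self.rows[key]

    def commit(self):
        self.undo_log.clear()

    def rollback(self):
        # Apply the inverse operations in reverse order.
        while self.undo_log:
            operation, key, old_value = self.undo_log.pop()
            if operation == "delete":
                del self.rows[key]
            else:
                self.rows[key] = old_value


table = TinyTable()
table.insert(1, "a")
table.commit()

table.update(1, "b")
table.insert(2, "c")
table.delete(1)
table.rollback()
print(table.rows)
```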

The function process of undolog is as follows:

  • Before the data is modified, write the pre-modification version to the undo log;
  • Start the modification and save the modified data in memory;
  • Persist the undo log to disk;
  • Flush the data page back to disk;
  • Commit the transaction;

It should be noted that, like redolog, undolog must be flushed back to the disk before the data page. When recovering data, if the undolog is complete, the transaction can be rolled back based on the undolog.

Within a transaction, the same piece of data may be modified several times. Should the record before every one of those modifications be written to the undo log? That would make the undo log too large, and this is where the redo log comes in. If the same record is modified repeatedly within a transaction, the undo log records only the original record as it was before the transaction started; when the record is modified again, the redo log records the subsequent changes. During data recovery, the redo log performs the roll forward and the undo log performs the rollback, and the two work together to complete the recovery.
Another function of the undo log is to provide the version chain used by MVCC (multi-version concurrency control). For details, see the article "MVCC implementation principle of MySQL".

binlog, the redo log and the undo log are the three most important logs in MySQL. During data recovery, the three work together to ensure that the recovered data is correct.




Statement:
This article is reproduced from csdn.net. If there is any infringement, please contact admin@php.cn to have it removed.