MySQL 5.7 in-depth analysis: semi-synchronous replication technology

一个新手 (original), 2017-09-19 09:46:28

The history of replication architecture


Before discussing this feature, let's first look at the history of MySQL's replication architecture. MySQL replication falls into four types:

  1. Ordinary replication, which is asynchronous. It is simple to set up and widely used. This architecture has existed since MySQL was born; it performs very well and is very mature. However, because replication is asynchronous, there is a risk of losing data.

  2. Semi-sync replication, which is semi-synchronous. Its performance and guarantees sit between asynchronous and fully synchronous replication. It was introduced in MySQL 5.5 as a compromise between the strengths and weaknesses of those two architectures.

  3. Sync replication, which is fully synchronous. At present, the official full synchronization technology for 5.7, based on Group Replication, is in the labs release and is not far from formal integration. Full synchronization brings stronger data consistency guarantees. I believe it is an important direction for future replication technology and is worth looking forward to.

  4. MySQL Cluster. Based on the NDB engine, it is simple to set up and relatively stable. It is the most reliable MySQL architecture for data protection and is currently the only one with complete data synchronization and zero data loss. However, it is demanding of the workload and comes with many restrictions.

Semi-synchronous replication

Today we are talking about the second architecture. Ordinary replication, that is, MySQL's asynchronous replication, relies on the MySQL binary log (binlog) to replicate data. As an example, take two machines, one the master and the other the slave.

  1. Normal replication: transaction one (t1) is written to the binlog buffer; the dump thread notifies the slave that there is a new transaction t1; the binlog buffer performs a checkpoint; the slave's IO thread receives t1 and writes it to its own relay log; the slave's SQL thread applies it to the local database. At this point both the master and the slave can see the new transaction. Even if the master dies, the slave can be promoted to the new master.

  2. Abnormal replication: transaction one (t1) is written to the binlog buffer; the dump thread notifies the slave that there is a new transaction t1; the binlog buffer performs a checkpoint; because of an unstable network, the slave never receives t1; the master dies, the slave is promoted to the new master, and t1 is lost.

  3. The bigger problem: transaction updates on the master and the slave are not synchronized. Even without network or other system failures, under concurrent load the slave has to apply the master's batches of transactions sequentially, which causes large delays.

To make up for these shortcomings, MySQL introduced semi-synchronous replication in 5.5. After the master's dump thread notifies the slave, an ACK step is added, that is, a flag indicating whether t1 was received successfully. In other words, besides sending t1 to the slave, the dump thread is also responsible for receiving the slave's ACK. If an exception occurs and no ACK is received, replication is automatically downgraded to ordinary replication until the exception is repaired.
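
The acknowledge-or-downgrade behavior can be sketched as a toy model in Python. This is not MySQL code: the `Master` class, its `commit` method and the `slave_acks` flag are hypothetical names, and a later ACK is simply assumed to mean that the exception has been repaired.

```python
class Master:
    """Toy model of a master that falls back to ordinary (async) replication."""

    def __init__(self):
        self.semi_sync = True   # start in semi-synchronous mode
        self.binlog = []        # transactions written on the master
        self.acked = []         # transactions the slave confirmed receiving

    def commit(self, trx, slave_acks):
        """Write trx; slave_acks says whether the slave's ACK arrived in time."""
        self.binlog.append(trx)
        if slave_acks:
            self.acked.append(trx)
            self.semi_sync = True    # ACK received: run in semi-sync mode
        else:
            self.semi_sync = False   # no ACK: downgrade to ordinary replication
        return "semi-sync" if self.semi_sync else "async"


master = Master()
modes = [master.commit("t1", True), master.commit("t2", False)]
# t2 is in the master's binlog but was never acknowledged by the slave.
unacknowledged = [t for t in master.binlog if t not in master.acked]
```

In this sketch `unacknowledged` holds exactly the transactions that would be at risk if the master died while running in downgraded mode.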

We can see the new problems brought by semi-synchronization:

  1. If an exception occurs, replication is downgraded to ordinary replication. The probability of data inconsistency on the slave is therefore reduced, but not eliminated.

  2. The master's dump thread takes on more work, which reduces the performance of the whole database.

  3. In the semi-synchronous mode used in MySQL 5.5 and 5.6 (AFTER_COMMIT), if the network fails or is unstable before the slave has received the transaction, that is, before it is written to the relay log, and the master then goes down and the system fails over to the slave, the data on the two sides will be inconsistent. In this case, the slave is missing one transaction.

With the release of MySQL 5.7, semi-synchronous replication has been upgraded to the new Loss-less Semi-Synchronous Replication architecture, and its maturity, data consistency and execution efficiency have improved significantly.

Improvements in data replication efficiency in MySQL 5.7

Enhanced master-slave consistency: waiting for the ACK before the transaction commits

The new version of semi-sync adds the rpl_semi_sync_master_wait_point parameter, which controls at what point in semi-sync mode the master commits a transaction before returning success to the session.

This parameter has two values:

  1. AFTER_COMMIT (default in 5.6)

    The master writes each transaction to the binlog and sends it to the slave, which flushes it to disk (relay log), while the master commits the transaction at the same time. The master then waits for the slave to acknowledge receipt into the relay log. Only after receiving the ACK does the master return the commit OK result to the client.

  2. AFTER_SYNC (default in 5.7; not present in 5.6)

    The master writes each transaction to the binlog and sends it to the slave, which flushes it to disk (relay log). The master waits for the slave's ACK confirming receipt into the relay log, then commits the transaction and returns the commit OK result to the client. Even if the master crashes, every transaction that has been committed on the master is guaranteed to have reached the slave's relay log.

    Therefore, 5.7 introduced the AFTER_SYNC mode. Its main benefit is that it solves the data inconsistency after a master crash that AFTER_COMMIT can cause. With AFTER_SYNC, all committed data has already been replicated, so data consistency during failover is improved.
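
The difference between the two wait points can be illustrated with a small Python simulation. It is a simplified sketch, not the server's logic: the function `run` and its arguments are hypothetical, a missing ACK is assumed to mean a network failure immediately followed by a master crash, and crash recovery on the old master is ignored.

```python
def run(wait_point, ack_arrives):
    """Return (master_committed, slave_relay_log) for a single transaction t1.

    wait_point  -- "AFTER_COMMIT" or "AFTER_SYNC"
    ack_arrives -- False models a network failure followed by a master crash
    """
    master_committed = []   # transactions committed (visible) on the master
    slave_relay_log = []    # transactions written to the slave's relay log

    trx = "t1"
    if wait_point == "AFTER_COMMIT":
        master_committed.append(trx)       # commit happens before the wait
        if ack_arrives:
            slave_relay_log.append(trx)
    elif wait_point == "AFTER_SYNC":
        if ack_arrives:
            slave_relay_log.append(trx)    # the slave has the event ...
            master_committed.append(trx)   # ... so the master may now commit
        # Without an ACK the master crashes while still waiting: no commit.
    else:
        raise ValueError(wait_point)
    return master_committed, slave_relay_log


print(run("AFTER_COMMIT", False))  # (['t1'], [])
print(run("AFTER_SYNC", False))    # ([], [])
```

With AFTER_COMMIT the failed case leaves t1 committed on the master but absent from the slave; with AFTER_SYNC nothing committed on the master is missing from the relay log.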

Performance improvement: sending the binlog and receiving ACKs asynchronously

The old version of semi-sync is limited by the dump thread, because the dump thread carries out two different and very frequent tasks: sending the binlog to the slave and waiting for the slave's feedback. Moreover, the two tasks are serial: the dump thread must wait for the slave to respond before sending the next transaction's events. The dump thread thus became the bottleneck for semi-sync performance as a whole. In high-concurrency workloads, this mechanism reduces the overall TPS of the database.

To solve this problem, the 5.7 semi-sync framework uses a separate ack collector thread dedicated to receiving feedback from the slave. The master thus has two threads working independently, so it can send binlog to the slave and receive the slave's feedback at the same time.
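
A rough back-of-the-envelope model shows why separating the two tasks helps. The Python sketch below is only an idealized estimate under stated assumptions: every transaction takes the same `send_ms` to send and every ACK takes the same `rtt_ms` to come back, and both function names and all numbers are hypothetical.

```python
def serial_time(n, send_ms, rtt_ms):
    """One thread: send an event, then block for its ACK before the next send."""
    return n * (send_ms + rtt_ms)


def pipelined_time(n, send_ms, rtt_ms):
    """Two threads: sending never waits; only the last ACK adds a round trip."""
    return n * send_ms + rtt_ms


# 1000 transactions, 1 ms to send each, 10 ms for an ACK to return.
old = serial_time(1000, 1, 10)     # 11000 ms
new = pipelined_time(1000, 1, 10)  # 1010 ms
```

In this model the serial design pays the round trip once per transaction, while the two-thread design pays it only once overall.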

Performance improvement: controlling how many slave acknowledgements the master waits for

MySQL 5.7 adds the rpl_semi_sync_master_wait_for_slave_count parameter, which controls how many slaves must acknowledge a successfully written transaction before the master proceeds, providing flexibility for failover in high-availability architectures.
For example, when the count is 2, the master needs to wait for ACKs from two slaves.
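
The counting rule can be expressed as a one-line check. The Python sketch below is illustrative only: `may_commit` and its argument names are hypothetical, with `wait_for_slave_count` standing in for the configured count value, and it assumes that repeated ACKs from the same slave count once.

```python
def may_commit(acked_slaves, wait_for_slave_count):
    """True once enough distinct slaves have acknowledged the transaction."""
    return len(set(acked_slaves)) >= wait_for_slave_count


print(may_commit(["slave1"], 2))            # False
print(may_commit(["slave1", "slave2"], 2))  # True
```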

Performance improvement: binlog mutex improvements

In the old version of semi-synchronous replication, both the session writing the binlog at commit on the master and the dump thread reading the binlog take a mutex on the binlog. As a result, reads and writes of the binlog file are serialized, which limits concurrency.

MySQL 5.7 optimizes binlog locking in two ways:
1. It removes the dump thread's mutex on the binlog.
2. It adds a safe margin to keep binlog reads safe.

Performance improvement: group commit

MySQL 5.7 introduces a new variable, slave-parallel-type, with the following configurable values:
1. DATABASE (the default value before 5.7): parallel replication per database;
2. LOGICAL_CLOCK (new value in 5.7): parallel replication based on group commit.

MySQL 5.6 also supports so-called parallel replication, but its parallelism is only per DATABASE, that is, per schema. If the user's MySQL instance contains multiple databases, this can indeed help slave replication speed a great deal. If the instance has only one database, parallel replay is not possible, and performance may even be worse than with the original single-threaded replication.
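
The limitation of database-based parallelism is easy to see in a small sketch. The Python below is an illustration under assumptions, not the server's scheduler: `assign_workers` is a hypothetical function that pins each database to one worker in round-robin order, and the transaction and database names are invented.

```python
def assign_workers(transactions, n_workers):
    """Map (trx, database) pairs to workers; one database always uses one worker."""
    db_to_worker = {}   # database name -> worker index
    plan = {}           # worker index -> transactions it replays, in order
    for trx, db in transactions:
        if db not in db_to_worker:
            db_to_worker[db] = len(db_to_worker) % n_workers
        plan.setdefault(db_to_worker[db], []).append(trx)
    return plan


# Two databases spread across two workers ...
print(assign_workers([("t1", "shop"), ("t2", "blog"), ("t3", "shop")], 2))
# {0: ['t1', 't3'], 1: ['t2']}

# ... but a single database keeps every transaction on one worker.
print(assign_workers([("t1", "shop"), ("t2", "shop"), ("t3", "shop")], 2))
# {0: ['t1', 't2', 't3']}
```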

MySQL 5.7 adds a new parallel mode: transactions that enter the COMMIT phase at the same time are assigned the same sequence number, and transactions with the same sequence number can be executed concurrently on the slave.

MySQL 5.7 truly implements parallel replication, mainly because replay on the slave is consistent with execution on the master: transactions are replayed in parallel on the slave just as they were executed in parallel on the master. The limitations of database-based parallel replication no longer apply, and there is no special requirement on the binary log format (database-based parallel replication has no such requirement either).

The following listing shows which transactions can run concurrently (on each line, the first number is last_committed and the second is sequence_number):

trx1 1…..2
trx2 1………….3
trx3 1…………………….4
trx4        2……………………….5
trx5               3…………………………..6
trx6                     3………………………………7
trx7                            6………………………………..8

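Parallel rule on the slave: when a transaction is dispatched, it is allowed to execute if its last_committed number is smaller than the minimum sequence_number of the currently executing transactions. Therefore:
1. While trx1 is executing, transactions with last_committed < 2 can run concurrently, so trx2 and trx3 can be dispatched.
2. After trx1 finishes, transactions with last_committed < 3 can run, so trx4 can be dispatched.
3. After trx2 finishes, transactions with last_committed < 4 can run, so trx5 and trx6 can be dispatched.
4. After trx3, trx4 and trx5 finish, transactions with last_committed < 7 can run, so trx7 can be dispatched.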
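
The dispatch rule can be checked against the seven transactions listed above with a short Python sketch. The helper names `dispatchable` and `pick` are hypothetical, and the code only applies the stated comparison; it does not model the real applier's worker scheduling.

```python
# (name, last_committed, sequence_number), taken from the listing above
TRXS = [
    ("trx1", 1, 2), ("trx2", 1, 3), ("trx3", 1, 4), ("trx4", 2, 5),
    ("trx5", 3, 6), ("trx6", 3, 7), ("trx7", 6, 8),
]
BY_NAME = {t[0]: t for t in TRXS}


def pick(*names):
    """Look up transactions by name."""
    return [BY_NAME[n] for n in names]


def dispatchable(executing, pending):
    """Pending transactions whose last_committed is below the smallest
    sequence_number among the currently executing transactions."""
    low = min(seq for _, _, seq in executing)
    return [name for name, last_committed, _ in pending if last_committed < low]


# trx1 is executing: everything with last_committed < 2 may start.
print(dispatchable(pick("trx1"), pick("trx2", "trx3", "trx4", "trx5", "trx6", "trx7")))
# ['trx2', 'trx3']

# trx1 has finished; trx2 and trx3 are executing: last_committed < 3.
print(dispatchable(pick("trx2", "trx3"), pick("trx4", "trx5", "trx6", "trx7")))
# ['trx4']
```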

Summary

We believe that MySQL 5.7 has optimized semi-synchronous replication and qualitatively improved its maturity and execution efficiency. When deploying MySQL 5.7 in production, we recommend semi-synchronous replication as the data replication solution for high availability and read/write splitting.