Organize and summarize five common MySQL high availability solutions

WBOY (forwarded), 2022-05-31 11:56:18, 2813 views

This article covers common MySQL high availability solutions. It discusses only the advantages and disadvantages of the commonly used solutions and how to choose among them.

1. Overview

When designing a high availability architecture for a MySQL database, we should mainly consider the following aspects:

If the database goes down or is unexpectedly interrupted, availability should be restored as quickly as possible, keeping downtime to a minimum so that the business is not interrupted by the database failure.
The data on non-primary nodes used for backups, read-only replicas, and similar functions should be consistent with the data on the primary node, either in real time or eventually.
When the business switches databases, the database contents before and after the switch should be consistent, so that the business is not affected by missing or inconsistent data.

We will not discuss the classification of high availability in detail here. We will only discuss the advantages and disadvantages of commonly used high availability solutions and the selection of high availability solutions.

2. High availability solution

2.1. Master-slave or master-master semi-synchronous replication

Use two database nodes to build one-way or two-way semi-synchronous replication. In MySQL 5.7 and later, a series of new features such as lossless replication and logical multi-threaded replication make MySQL's native semi-synchronous replication more reliable.

The common architecture is as follows:
[Figure: two-node semi-synchronous replication architecture]
It is usually used together with third-party software such as a proxy and keepalived, which monitor the health of the database and execute a series of management commands. If the primary database fails, the database remains usable after switching to the standby database.
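The monitor-and-switch behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `active_node_history` and the `fail_threshold` parameter are hypothetical and do not correspond to any keepalived or proxy setting.

```python
from typing import List


def active_node_history(primary_checks: List[bool], fail_threshold: int = 3) -> List[str]:
    """Return which node serves traffic after each health check of the primary.

    primary_checks: True if a health check succeeded, False if it failed.
    fail_threshold: hypothetical number of consecutive failures that triggers a switch.
    """
    active = "primary"
    consecutive_failures = 0
    history = []
    for ok in primary_checks:
        if active == "primary":
            # Reset the counter on success, otherwise count another failure.
            consecutive_failures = 0 if ok else consecutive_failures + 1
            if consecutive_failures >= fail_threshold:
                active = "standby"  # switch once; this sketch never fails back
        history.append(active)
    return history


history = active_node_history([True, False, False, False, True])
print(history)
```

Requiring several consecutive failures before switching is one common way to avoid reacting to a single lost probe; a real deployment would also have to decide whether and how to fail back.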

Advantages:

The architecture is relatively simple, using native semi-synchronous replication as the basis for data synchronization;
With only two nodes, there is no master election problem after the primary goes down; you can switch directly;
Dual nodes require less resources and are simple to deploy;

Disadvantages:

Relies entirely on semi-synchronous replication. If semi-synchronous replication degrades to asynchronous replication, data consistency cannot be guaranteed;
The high availability of haproxy and keepalived themselves must also be considered.

2.2. Semi-synchronous replication optimization

The semi-synchronous replication mechanism is reliable: as long as semi-synchronous replication stays in effect, the data can be considered consistent. However, for reasons such as network fluctuations, semi-synchronous replication can time out and fall back to asynchronous replication, at which point data consistency can no longer be guaranteed. Keeping replication in semi-synchronous mode as much as possible therefore improves data consistency.
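The timeout-and-fallback behavior can be sketched with a toy model. The class `SemiSyncPrimary` and its fields are hypothetical names for illustration, not MySQL internals. As a simplification, the model stays in asynchronous mode once it has degraded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SemiSyncPrimary:
    """Toy primary that waits for a replica acknowledgement before confirming a commit."""
    ack_timeout_ms: float                       # hypothetical acknowledgement timeout
    semi_sync_active: bool = True
    unacked: List[str] = field(default_factory=list)  # transactions the replica may lack

    def commit(self, txn: str, replica_ack_ms: Optional[float]) -> str:
        """replica_ack_ms is the simulated time to acknowledge (None means unreachable)."""
        if self.semi_sync_active:
            if replica_ack_ms is not None and replica_ack_ms <= self.ack_timeout_ms:
                return "semi-sync"              # replica has the event before the client sees success
            self.semi_sync_active = False       # timeout: fall back to asynchronous replication
        self.unacked.append(txn)                # at risk if the primary fails right now
        return "async"


primary = SemiSyncPrimary(ack_timeout_ms=1000)
modes = [primary.commit("t1", 5), primary.commit("t2", None), primary.commit("t3", 5)]
print(modes, primary.unacked)
```

The `unacked` list makes the risk visible: every transaction committed after the fallback may be missing from the replica if a switch happens at that moment.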

This solution also uses a two-node architecture, but has functional optimizations based on the original semi-synchronous replication, making the semi-synchronous replication mechanism more reliable.

The optimization solutions that can be referred to are as follows:

2.2.1. Dual-channel replication

After semi-synchronous replication times out, replication is disconnected. When replication is re-established, two channels are created at the same time. The semi-synchronous channel starts replicating from the current position, so that the slave knows how far the master has progressed. The asynchronous channel starts catching up on the data the slave is missing. Once the asynchronous channel reaches the starting position of the semi-synchronous channel, semi-synchronous replication resumes.
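A minimal sketch of the two channels follows, assuming the binary log can be modeled as a Python list and that events arriving on the semi-synchronous channel are buffered until the asynchronous channel has caught up. The class and method names are hypothetical and illustrate the idea only, not the actual patch.

```python
from typing import List


class DualChannelReplica:
    def __init__(self, applied_events: List[str]) -> None:
        self.applied = list(applied_events)   # events applied before the disconnect
        self.semi_sync_buffer: List[str] = [] # new events received while still catching up
        self.semi_sync_start = 0              # primary position at reconnect time
        self.semi_sync_restored = False

    def reconnect(self, primary_binlog: List[str]) -> None:
        # The semi-sync channel starts at the primary's current position.
        self.semi_sync_start = len(primary_binlog)

    def receive_semi_sync(self, event: str) -> None:
        if self.semi_sync_restored:
            self.applied.append(event)
        else:
            self.semi_sync_buffer.append(event)

    def catch_up(self, primary_binlog: List[str], max_events: int) -> None:
        # The async channel replays the backlog up to the semi-sync start position.
        end = min(len(self.applied) + max_events, self.semi_sync_start)
        self.applied.extend(primary_binlog[len(self.applied):end])
        if not self.semi_sync_restored and len(self.applied) == self.semi_sync_start:
            self.applied.extend(self.semi_sync_buffer)  # channels meet: apply buffered events
            self.semi_sync_buffer.clear()
            self.semi_sync_restored = True


binlog = ["e1", "e2", "e3", "e4", "e5"]
replica = DualChannelReplica(["e1", "e2"])
replica.reconnect(binlog)
binlog.append("e6")                 # new write on the primary after reconnect
replica.receive_semi_sync("e6")
replica.catch_up(binlog, max_events=2)
restored_after_first_batch = replica.semi_sync_restored
replica.catch_up(binlog, max_events=2)
print(replica.applied, replica.semi_sync_restored)
```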

2.2.2. binlog file server

Build two semi-synchronous replication channels. The semi-synchronous channel to the file server is not enabled under normal circumstances. When master-slave semi-synchronous replication is degraded because of network problems, the semi-synchronous channel to the file server is started. After master-slave semi-synchronous replication recovers, the channel to the file server is closed.
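The switching rule can be sketched as a small function that reports, for each transaction, which channel provides the semi-synchronous acknowledgement. The function name and the string labels are hypothetical.

```python
from typing import List


def ack_sources(replica_semi_sync_ok: List[bool]) -> List[str]:
    """For each transaction, report which semi-sync channel acknowledges it.

    replica_semi_sync_ok: True while master-slave semi-sync replication is healthy.
    """
    sources = []
    for ok in replica_semi_sync_ok:
        # The file server channel is enabled only while the slave channel is degraded.
        file_server_channel_enabled = not ok
        sources.append("file_server" if file_server_channel_enabled else "replica")
    return sources


sources = ack_sources([True, False, False, True])
print(sources)
```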

Advantages:

Dual nodes require less resources and are simple to deploy;
The architecture is simple and there is no master election problem; you can switch directly;
Compared with native replication, optimized semi-synchronous replication better ensures data consistency.

Disadvantages:

Requires modifying the MySQL kernel source code or working with the MySQL communication protocol, which in turn requires some understanding of the source code and the ability to do a certain amount of secondary development.
Still relies on semi-synchronous replication, which does not fundamentally solve the data consistency problem.

2.3. High-availability architecture optimization

Expand the dual-node database to a multi-node database or a multi-node database cluster. You can choose a cluster with one master and two slaves, one master and multiple slaves, or multiple masters and multiple slaves according to your needs.

With semi-synchronous replication, a commit is considered successful as soon as one slave returns a successful response, so semi-synchronous replication with multiple slaves is more reliable than with a single slave. In addition, the probability of multiple nodes going down at the same time is lower than the probability of a single node going down. To a certain extent, a multi-node architecture can therefore be considered more highly available than a dual-node architecture.
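The argument can be made concrete with a little arithmetic. The sketch below assumes that each slave is unavailable independently with the same probability, which is an idealization, and the 1% figure is purely illustrative.

```python
def prob_all_down(p_node_down: float, n_nodes: int) -> float:
    """Probability that all n nodes are down at once, assuming independent failures."""
    return p_node_down ** n_nodes


def prob_semi_sync_ack(p_slave_down: float, n_slaves: int) -> float:
    """Probability that at least one slave is available to acknowledge a commit."""
    return 1 - prob_all_down(p_slave_down, n_slaves)


p = 0.01  # illustrative assumption: each slave is unavailable 1% of the time
for n in (1, 2, 3):
    print(n, prob_all_down(p, n), prob_semi_sync_ack(p, n))
```

Under these assumptions, going from one slave to two reduces the chance that no slave can acknowledge from 1 in 100 to 1 in 10,000. Correlated failures, such as a shared network segment, would make the real improvement smaller.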

However, due to the large number of databases, database management software is needed to ensure the maintainability of the database. You can choose MMM, MHA or various versions of proxy, etc. Common solutions are as follows:

2.3.1. MHA multi-node cluster

MHA Manager periodically checks the master node in the cluster. When the master fails, it can automatically promote the slave with the latest data to be the new master and then repoint all other slaves to the new master. The entire failover process is completely transparent to the application.

MHA Node runs on each MySQL server. Its main function is to process the binary log during switchover to ensure that the switchover minimizes data loss.
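The promotion step can be sketched as follows. This is not MHA's code or interface: the function `fail_over` and the idea of representing each slave's progress as a single integer position are assumptions made for illustration.

```python
from typing import Dict, Tuple


def fail_over(slave_positions: Dict[str, int]) -> Tuple[str, Dict[str, str]]:
    """Pick the slave with the most recent data and repoint the others to it.

    slave_positions maps a slave name to the (hypothetical) last binary log
    position it has received from the failed master.
    """
    # On a tie, max() keeps the first slave encountered.
    new_master = max(slave_positions, key=lambda name: slave_positions[name])
    repoint = {name: new_master for name in slave_positions if name != new_master}
    return new_master, repoint


new_master, repoint = fail_over({"s1": 120, "s2": 150, "s3": 140})
print(new_master, repoint)
```

In the sketch the lagging slaves are simply repointed. As the text notes, the binary log also has to be processed during the switchover so that as little data as possible is lost.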

MHA can also be extended to the following multi-node cluster:
[Figure: MHA extended to a multi-node cluster]
Advantages:

Can perform automatic detection and transfer of faults;
Scalability is relatively good; the number and structure of MySQL nodes can be expanded as needed;
Compared with two-node MySQL replication, three-node or multi-node MySQL has a lower probability of being unavailable;

Disadvantages:

At least three nodes are required, which requires more resources than two nodes;
The logic is more complex, and it is more difficult to troubleshoot and locate problems after a failure occurs;
Data consistency still depends on native semi-synchronous replication, so there is still a risk of data inconsistency;
Split-brain may occur due to network partitions;

2.3.2. zookeeper proxy

ZooKeeper uses a distributed algorithm to keep cluster data consistent. Using ZooKeeper can effectively ensure the high availability of the proxy and better avoid problems caused by network partitions.
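One reason a coordination service of this kind copes with network partitions is the majority rule: only a side that can see more than half of the ensemble may keep making decisions. The sketch below shows that rule alone, with hypothetical function names. It is not ZooKeeper client code.

```python
from typing import List


def has_quorum(visible_members: int, ensemble_size: int) -> bool:
    """True if this side of a partition sees a strict majority of the ensemble."""
    return visible_members > ensemble_size // 2


def partitions_with_quorum(partition_sizes: List[int]) -> List[bool]:
    """For each side of a network partition, report whether it may keep operating."""
    ensemble_size = sum(partition_sizes)
    return [has_quorum(size, ensemble_size) for size in partition_sizes]


print(partitions_with_quorum([3, 2]))  # five members split 3/2
print(partitions_with_quorum([2, 2]))  # four members split 2/2
```

Because two disjoint sides can never both hold a strict majority, at most one side continues to act, which is what prevents two masters from being chosen at once. An even split leaves neither side with a majority, which is why odd ensemble sizes are usually preferred.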

Advantages:

Better guarantees the high availability of the entire system, including the proxy and MySQL;
Good scalability; can be expanded to a large-scale cluster;

Disadvantages:

Data consistency still relies on native MySQL semi-synchronous replication;
With the introduction of ZooKeeper, the logic of the entire system becomes more complex;

2.4. Shared storage

Shared storage decouples database servers from storage devices. Data synchronization between different databases no longer relies on MySQL's native replication but on disk-level data synchronization to ensure data consistency.

2.4.1. SAN shared storage

A SAN (storage area network) establishes a direct high-speed network connection (compared with a LAN) between storage devices and processors (servers), through which data is stored centrally. Commonly used architectures are as follows:

[Figure: SAN shared storage architecture]
When using shared storage, the MySQL server mounts the file system and operates on it normally. If the primary database goes down, the standby database can mount the same file system, ensuring that the primary and standby databases use the same data.
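The mount-and-takeover idea can be sketched with a toy volume that only one server may mount at a time. The class `SharedVolume` and the file name are hypothetical, and releasing the failed primary's mount (fencing) is assumed to be handled by the cluster software.

```python
from typing import Dict, Optional


class SharedVolume:
    """Toy shared disk: data lives on the volume, and one server mounts it at a time."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}
        self.mounted_by: Optional[str] = None

    def mount(self, server: str) -> None:
        if self.mounted_by is not None and self.mounted_by != server:
            raise RuntimeError(f"volume is still mounted by {self.mounted_by}")
        self.mounted_by = server

    def unmount(self, server: str) -> None:
        if self.mounted_by == server:
            self.mounted_by = None

    def write(self, server: str, name: str, data: bytes) -> None:
        if self.mounted_by != server:
            raise RuntimeError(f"{server} has not mounted the volume")
        self.files[name] = data

    def read(self, server: str, name: str) -> bytes:
        if self.mounted_by != server:
            raise RuntimeError(f"{server} has not mounted the volume")
        return self.files[name]


volume = SharedVolume()
volume.mount("primary")
volume.write("primary", "orders.ibd", b"row-1")
volume.unmount("primary")          # assumption: the failed primary is fenced off first
volume.mount("standby")
print(volume.read("standby", "orders.ibd"))
```

Since both servers read the same files, no data is copied during the switch. The exclusive mount is what keeps the two servers from writing to the same file system at once.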

Advantages:

Only two nodes are needed, with simple deployment and simple switching logic;
Guarantees strong data consistency well;
No data inconsistency caused by MySQL logical errors;

Disadvantages:

Need to consider the high availability of shared storage;
Expensive;

2.4.2. DRBD disk replication

DRBD is a software-based, network-based block replication storage solution, mainly used to mirror disks, partitions, logical volumes, and so on between servers. When a user writes data to the local disk, the data is also sent to the disk of another host on the network, so that the local host (primary node) and the remote host (standby node) stay synchronized in real time. The commonly used architecture is as follows:
[Figure: DRBD disk replication architecture]

When a problem occurs on the local host, a copy of the same data is still retained on the remote host and can continue to be used, ensuring data security.

DRBD is a block-level synchronous replication technology implemented as a Linux kernel module, and it can achieve the same shared storage effect as a SAN.
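Block-level synchronous mirroring can be illustrated with a toy device whose write is acknowledged only after both copies hold the block. The class `MirroredBlockDevice` is hypothetical, and the network transfer is reduced to a list assignment.

```python
from typing import List


class MirroredBlockDevice:
    """Toy model of a block device mirrored synchronously to a peer host."""

    def __init__(self, n_blocks: int) -> None:
        self.local: List[bytes] = [b""] * n_blocks   # disk on the primary node
        self.remote: List[bytes] = [b""] * n_blocks  # disk on the standby node

    def write(self, block_no: int, data: bytes) -> bool:
        self.local[block_no] = data
        self.remote[block_no] = data  # stands in for the network transfer to the peer
        return True                   # acknowledged only after both copies are written


device = MirroredBlockDevice(n_blocks=4)
device.write(0, b"page-a")
device.write(3, b"page-b")
print(device.remote == device.local)
```

Waiting for the peer on every write is what keeps the two disks identical. It is also the source of the I/O overhead that comes with this approach.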

Advantages:

Only two nodes are needed, simple deployment and simple switching logic;
Compared with SAN storage network, the price is low;
Ensure strong consistency of data;

Disadvantages:

Has a relatively large impact on I/O performance;
The standby database does not serve read operations;

2.5. Distributed protocol

Distributed protocols can solve the data consistency problem very well. The more common solutions are as follows:

2.5.1. MySQL cluster

MySQL Cluster is the official cluster deployment solution. It uses the NDB storage engine to back up redundant data in real time, achieving high availability and data consistency for the database.

Advantages:

All use official components and do not rely on third-party software;
Can achieve strong consistency of data;

Disadvantages:

It is rarely used in China;
The configuration is more complex and requires the use of NDB storage engine, which is somewhat different from the regular MySQL engine;
At least three nodes;

2.5.2. Galera

MySQL high-availability cluster based on Galera is a MySQL cluster solution for multi-master data synchronization. It is simple to use, has no single point of failure, and has high availability. Common architectures are as follows:

[Figure: Galera-based MySQL cluster architecture]

Advantages:

Multi-master writing, no-delay replication, ensuring strong data consistency;
Has a mature community and is used on a large scale by Internet companies;
Automatic failover, automatic addition and removal of nodes;

Disadvantages:

Requires applying the wsrep patch to native MySQL nodes;
Only supports the InnoDB storage engine;
Requires at least three nodes;

2.5.3. Paxos

The Paxos algorithm solves the problem of how a distributed system reaches consensus on a certain value (resolution). It is widely considered the most effective algorithm of its kind. Combining Paxos with MySQL can achieve strong consistency for distributed MySQL data. Common architectures are as follows:

[Figure: Paxos-based MySQL architecture]
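To show how a group of nodes agrees on a single value, here is a minimal in-memory sketch of textbook single-decree Paxos. It is not taken from any MySQL product: there is no networking, no failure handling, and no retry logic, and the names `Acceptor` and `propose` as well as the example values are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                          # highest proposal number promised so far
    accepted: Optional[Tuple[int, str]] = None  # (number, value) of the last accepted proposal

    def prepare(self, n: int) -> Tuple[bool, Optional[Tuple[int, str]]]:
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n: int, value: str) -> bool:
        """Phase 2: accept unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one proposal; return the chosen value, or None if no majority was reached."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [prior for ok, prior in replies if ok]
    if len(promises) < majority:
        return None
    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prior_values = [prior for prior in promises if prior is not None]
    if prior_values:
        value = max(prior_values, key=lambda item: item[0])[1]
    votes = sum(a.accept(n, value) for a in acceptors)
    return value if votes >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "master=db1")
second = propose(acceptors, 2, "master=db2")   # must adopt the already chosen value
stale = propose(acceptors, 1, "master=db3")    # outdated proposal number is rejected
print(first, second, stale)
```

The second proposal shows the key safety property: once a value has been chosen by a majority, later proposals can only confirm it, so nodes cannot end up disagreeing about the result.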
Advantages:

Multi-master writing, no-delay replication, ensuring strong data consistency;
Has a mature theoretical basis;
Automatic failover, automatic addition and removal of nodes;

Disadvantages:

Only supports the InnoDB storage engine;
Requires at least three nodes;

3. Summary

As requirements for data consistency keep rising, more and more approaches are being tried to solve the distributed data consistency problem, such as optimizing MySQL itself, optimizing the MySQL cluster architecture, and introducing algorithms such as Paxos, Raft, and 2PC.

Using distributed algorithms to solve the MySQL data consistency problem is becoming more and more widely accepted. A series of mature products such as PhxSQL, MariaDB Galera Cluster, and Percona XtraDB Cluster are becoming increasingly popular and are being used on a larger and larger scale.

With the official GA of MySQL Group Replication, using distributed protocols to solve data consistency problems has become a mainstream direction. It is expected that more and more excellent solutions will be proposed, and the MySQL high availability problem can be better solved.

Statement:

This article is reproduced from csdn.net. If there is any infringement, please contact admin@php.cn to have it removed.