The cost of downtime adds up rapidly in enterprise environments. In one survey, 40% of respondents said that just one hour of downtime cost their organization more than $1 million. It is clearly worthwhile to ensure that database services are continuously available.
It saves your organization a great deal of money and protects its relationships with stakeholders of all kinds.
So how do you ensure continuous availability? The concept behind continuous availability is called High Availability. In this article, we'll provide an overview of what high availability is and how to implement it for your MySQL cluster.
We also point out the dark side of high availability, where system administrators mistakenly rely on high availability to perform maintenance tasks—and explain why doing so undermines the goals of high availability and puts your business operations at risk.
Let’s talk about availability first. There is little point in running a service (such as a database) if it is unavailable to users most of the time. So when we talk about availability, we mean how accessible a service is.
With any service that is running properly, one would reasonably expect it to be available when needed - but there will still be some downtime: a day or two a year, or a few hours a month.
A generally available service may be fine for many use cases, but if the service is critical in nature, or a large number of users rely on it, ordinary "availability" alone is not enough.
This is what high availability is all about. At its most basic, high availability ensures a higher level of availability than would normally be expected and, more specifically, an agreed level, allowing for maintenance, patching and general errors and failures.
There is no agreed-upon definition of high availability. It simply means meeting a specific (higher) availability requirement, one that usually exceeds the ordinary "availability" a provider would otherwise accept. In practice, your organization can define the availability it requires based on the needs of your operations - weighing the cost of high availability against the cost of downtime-related losses.
The level of availability you require can be expressed as a percentage. For example, 99.99% or "four nines" availability means a maximum of about 52.6 minutes of downtime in a year, while "six nines" or 99.9999% availability limits it to about 31.56 seconds of downtime in a year.
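The arithmetic behind these figures is easy to check. The following is a minimal sketch, assuming an average year of 365.25 days; the function name `downtime_budget_seconds` is made up for this illustration and is not part of any MySQL tooling.

```python
# Average year length in seconds, including leap days (an assumption of this sketch).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def downtime_budget_seconds(availability_percent: float) -> float:
    """Return the maximum downtime per year, in seconds, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


# Print the yearly downtime budget for a few common availability targets.
for label, percent in [
    ("two nines", 99.0),
    ("three nines", 99.9),
    ("four nines", 99.99),
    ("five nines", 99.999),
    ("six nines", 99.9999),
]:
    seconds = downtime_budget_seconds(percent)
    print(f"{label:12} {percent:>8}%  {seconds / 60:10.2f} min  ({seconds:.2f} s)")
```

Under this assumption, four nines works out to roughly 52.6 minutes a year and six nines to roughly 31.6 seconds a year. Using a 365-day year shifts the results only slightly.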
Essentially, the choice is yours - but, again, it's a trade-off. Maintaining high availability will be expensive - requiring additional physical resources and software licenses, and draining your human resources. However, you may find it's a price worth paying to avoid the knock-on costs of disruption, or the risk of lost revenue due to unhappy customers.
The exact nature of your high availability infrastructure depends on your workload. However, broadly speaking, high availability is achieved when there is fault tolerance so that even if one service or device fails, the workload is not interrupted. Typically, this means there are no single points of failure - all services and devices are fully redundant at the network and application levels.
Depending on the service, this usually involves a number of nodes - for example, your MySQL cluster will contain several nodes on which the data is stored. Multiple nodes are then combined with load balancing tools so that if one node fails, requests are directed to another node. Users can still access the service, even if performance is slightly degraded.
Of course, your path to a highly available MySQL database will depend on your implementation of MySQL. In a nutshell, you need to create some type of MySQL cluster with multiple nodes - in other words, your data must be stored on multiple MySQL servers.
Next, you need a service that can replicate the data on these nodes, ensuring that each node has an accurate copy of the data contained in your database. Finally, you need a load balancer to ensure that any database requests are directed evenly to the database nodes - yes, a balanced load - but make sure that even if one node is offline, the requests are satisfied.
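To make the load-balancing idea concrete, here is a minimal sketch of round-robin routing that skips nodes marked as offline. It is a toy model and not how HAProxy, MaxScale or any other real load balancer is implemented; the class name, the method names and the node names `db1` to `db3` are all hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer that routes only to nodes marked healthy."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        if not self._nodes:
            raise ValueError("at least one node is required")
        self._healthy = set(self._nodes)
        self._ring = cycle(self._nodes)  # endless iterator over the node list

    def mark_down(self, node):
        """Record that a node is offline (failed or taken down)."""
        self._healthy.discard(node)

    def mark_up(self, node):
        """Record that a known node is back online."""
        if node in self._nodes:
            self._healthy.add(node)

    def next_node(self):
        """Return the next healthy node, checking each node at most once."""
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy database nodes available")


balancer = RoundRobinBalancer(["db1", "db2", "db3"])
print([balancer.next_node() for _ in range(3)])  # all three nodes take turns
balancer.mark_down("db2")                        # simulate a node failure
print([balancer.next_node() for _ in range(4)])  # db2 is skipped from now on
```

Requests keep being served after `db2` goes down, but the same traffic is now spread over two nodes instead of three, which is where the slight performance degradation comes from.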
For example, MySQL provides a commercial product for high availability - the MySQL InnoDB Cluster. It is based on MySQL Group Replication, a popular way of ensuring high availability in MySQL database environments.
Another alternative is Galera, which has been providing MySQL high availability for many years. If you are using the MariaDB fork of MySQL, you can configure your MariaDB environment for high availability by running multiple nodes with your Galera cluster - while relying on HAProxy for load balancing. Alternatively, you can also look into MariaDB's own MaxScale product.
Enterprise-scale workloads increasingly use high availability principles because, in the long run, doing so provides the best results. There are good reasons to consider setting up high availability in your operation - and, in today's technology-first world, there are many workloads that simply cannot run without a high-availability platform.
Unfortunately, the growing popularity of high availability has led to its abuse. Because high availability makes systems extremely robust, technical teams may be tempted to take shortcuts when performing system management tasks (such as patching) because those teams believe that the high availability infrastructure will simply bear the burden of taking a machine offline.
In practice, it gets complicated quickly. Take a MySQL cluster as an example. Yes, if you reboot a machine to patch it, your MySQL cluster will continue to run thanks to high availability. However, keep in mind that while a node is shut down for patching, the rest of the cluster keeps accepting writes, so when the node restarts it faces a backlog of data that it must resynchronize. This process may take a long time to complete.
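A rough back-of-the-envelope model shows why catching up can take a long time. The sketch below assumes that the cluster keeps committing writes at a constant rate while the node is away and after it returns, and that the rejoining node applies the backlog at a constant, higher rate. The function name and all the numbers are hypothetical, and real clusters may instead fall back to a full state transfer, which this model does not cover.

```python
def catch_up_seconds(offline_seconds: float, write_rate: float, apply_rate: float) -> float:
    """Estimate how long a rejoining node needs to clear its backlog.

    offline_seconds: how long the node was down for patching
    write_rate:      transactions per second the cluster keeps committing
    apply_rate:      transactions per second the rejoining node can apply
    """
    if offline_seconds < 0 or write_rate < 0 or apply_rate < 0:
        raise ValueError("inputs must be non-negative")
    backlog = offline_seconds * write_rate  # transactions missed while offline
    if backlog == 0:
        return 0.0
    if apply_rate <= write_rate:
        # New writes arrive at least as fast as the node can apply them,
        # so in this model the node never catches up.
        return float("inf")
    # The backlog shrinks only by the difference between the two rates.
    return backlog / (apply_rate - write_rate)


# Hypothetical numbers: a 10-minute reboot, 500 tx/s of writes, 800 tx/s apply rate.
print(catch_up_seconds(600, 500, 800))
```

In this simple model the catch-up time depends on the gap between the apply rate and the write rate, so a busy cluster can leave a node resynchronizing for far longer than the reboot itself took.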
Needless to say, every database host needs to see the same data. The danger comes from the resynchronization process: if another node goes down while the node you shut down and patched is still catching up, the cluster could lose its last valid quorum. In other words, the number of servers holding the "truth" about the data may drop below an acceptable level. Recovery from this state can be difficult and complex, and may even result in data loss.
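The quorum risk can be illustrated with a few lines of arithmetic. This is a minimal sketch that assumes a simple majority rule, where more than half of the configured nodes must be online and every node counts equally; actual quorum behavior depends on the product and its configuration, and the function names are hypothetical.

```python
def has_quorum(total_nodes: int, online_nodes: int) -> bool:
    """True if the online nodes form a strict majority of the cluster."""
    return 2 * online_nodes > total_nodes


def tolerable_failures(total_nodes: int, online_nodes: int) -> int:
    """How many more nodes can fail before the majority is lost."""
    majority = total_nodes // 2 + 1
    return max(online_nodes - majority, 0)


total = 3
# Walk through: healthy cluster, one node down for patching, then one more failure.
for online in (3, 2, 1):
    print(
        f"{online}/{total} online: quorum={has_quorum(total, online)}, "
        f"further failures tolerated={tolerable_failures(total, online)}"
    )
```

With three nodes, taking one offline for patching leaves a cluster that still has quorum but can no longer tolerate any further failure. The maintenance has quietly used up the very safety margin that high availability was supposed to provide.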
High availability exists to keep the system running normally in the event of a failure. This inherent protection against failure is not a free pass for careless system maintenance that leans on the robustness of high availability and hopes no one notices.
Instead, technical teams should rely on other solutions - for example, setting up full redundancy for systems being patched, rather than simply hoping that a high-availability infrastructure can withstand the stress. Alternatively, rely on live patching when possible, thereby eliminating the need to restart services to install the patch.
Nonetheless, there are worrying signs that reliance on high availability for maintenance work is widespread. Look carefully and you'll even find official guidance from vendors instructing users to rely on high availability to perform patching tasks. Users are left simply hoping that when one node goes offline for patching, the other nodes don't have any problems.
High availability is critical to many applications - and beneficial to many others. Configured correctly, a MySQL database can provide near-perfect availability, but that doesn't mean technical teams can take availability for granted.
Abusing high-availability architectures to take maintenance shortcuts is not advisable - the risks are greater than they appear at first glance.
Instead, system administrators should look for proven alternatives—including redundancy and live patching—to perform maintenance operations without compromising the ability of the high-availability solution.
Original address: https://tuxcare.com/understanding-mysql-high-availability-good-and-bad-reasons-to-use-it/
Translated address : https://learnku.com/mysql/t/71681