MHA automatic failover on a consul architecture: a detailed example

PHP中文网
Release: 2017-06-21 16:35:41

Introduction

Until now we have not enabled the masterha_manager automatic switching script in production, mainly because a single detection point cannot tell a real database outage from network jitter (an unstable network cable or cabinet switch, for example). Restarting the network card on the machine that runs the detection script, say, does not mean the database has a problem. A check from just one point is therefore not enough to conclude that the database is inaccessible.

Fortunately, we can use consul (the author prefers consul to etcd because consul provides a DNS interface) to add a multi-point detection mechanism. In a cluster of n nodes, if more than half of the detection points detect a problem with the database, we consider the database inaccessible and call the masterha_manager script to switch, as shown in the figure below:

         |             |             |
    +---------+   +---------+   +---------+
    | consul1 |   | consul2 |   | consul3 |
    +---------+   +---------+   +---------+
          \            |            /
           \           |           /
            \          |          /
             \         |         /
           +----------------------+
           |   http api && acl    |
           +----------------------+
                       |
                       |
           +----------------------+
           |   consul-template    | ----> < mysqlxxx.tpl > --->
           +----------------------+
                       |
         +--------------------------+
         | masterha_manager_consul  |
         +--------------------------+

checkmysql needs to be deployed on each consul server so that MySQL health is checked from multiple points. If MySQL is healthy, checkmysql sets the key mysql/mysqlxxxx/node-consul to 1; otherwise it sets the key to 0. The default value of node-consul is the hostname of the current host.
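The key write described above can be sketched roughly as follows, assuming the value is written through Consul's KV HTTP endpoint (PUT /v1/kv/<key>) on the local agent. The helper name build_kv_put, the localhost address and passing the token as a query parameter are assumptions made for illustration; the real checkmysql is a Perl script and may work differently. The sketch only builds the request and does not send it.

```python
import socket
from urllib.parse import quote, urlencode


def build_kv_put(tag, healthy, token, node=None, base="http://localhost:8500"):
    """Return (url, body) for the KV write one detection point would make.

    tag     -- instance name, e.g. "mysql3308"
    healthy -- True if the MySQL check succeeded on this node
    token   -- ACL token allowed to write the mysql/ keys
    node    -- detection point name; defaults to the local hostname
    base    -- address of the Consul agent (assumed to be local)
    """
    node = node or socket.gethostname()
    key = f"mysql/{tag}/{node}"
    # quote() keeps the "/" separators of the key path intact.
    url = f"{base}/v1/kv/{quote(key)}?{urlencode({'token': token})}"
    body = b"1" if healthy else b"0"
    return url, body
```

Calling build_kv_put("mysql3308", True, token, node="cz-test2") yields a URL ending in /v1/kv/mysql/mysql3308/cz-test2 and the body b"1", which matches the key layout used in the rest of this article.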

After checkmysql has run, the consul-template tool watches all of these keys using the template file mysqlxxx.tpl. When any key changes, it regenerates the configuration file mysqlxxxx.conf and then calls the masterha_manager_consul script, which starts the switch.

In the masterha_manager_consul script we override the method MHA::HealthCheck::wait_until_unreachable to avoid its endless detection loop. If fewer than half of the detection points consider the database abnormal, the script exits this round; otherwise it starts a child process to perform the switch.
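The majority rule can be sketched as below. This is an illustration in Python, not the Perl code of masterha_manager_consul; the function name should_switch is hypothetical, and treating an exact 50/50 split as "do not switch" is an assumption, since the article only speaks of more than half and fewer than half.

```python
def should_switch(statuses):
    """Decide whether this round should trigger a failover.

    statuses maps detection point name -> 1 (MySQL reachable) or 0 (not).
    Returns True only when strictly more than half of the points report 0.
    """
    if not statuses:
        return False  # no detection data: never switch blindly
    failed = sum(1 for value in statuses.values() if value == 0)
    return failed * 2 > len(statuses)
```

With the three detection points of the test environment, two failing points are enough to switch, while a single failing point (for example, one consul server cut off by a faulty switch) is ignored.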

Remarks:

masterha_manager_consul is modified from MHA v0.5.6. By default, automatic switching only happens between 21:00 and 09:00 the next day; the night option controls this behaviour. In addition, it is recommended to deploy the consul servers on different switches or in different cabinets.
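A time window that wraps around midnight is easy to get wrong, so here is a small sketch of the check. The function name in_switch_window, the meaning of the night flag (True restricts switching to the window) and the half-open boundaries (21:00 included, 09:00 excluded) are assumptions for illustration; the article does not spell them out.

```python
from datetime import datetime


def in_switch_window(now, night=True, start_hour=21, end_hour=9):
    """Return True if automatic switching is allowed at time `now`.

    With night=True the window runs from start_hour on one day to
    end_hour on the next (21:00 to 09:00 by default, wrapping midnight).
    With night=False switching is allowed at any time.
    """
    if not night:
        return True
    # The window wraps midnight, so the two halves are joined with "or".
    return now.hour >= start_hour or now.hour < end_hour
```

For example, in_switch_window(datetime(2017, 6, 8, 22, 0)) is inside the window, while the same call at 12:00 is outside it unless night=False is passed.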

Usage

See mha_manager_consul for the code. The overall structure is as follows:

    mha_manager_consul
    ├── bin
    │   ├── checkmysql
    │   └── masterha_manager_consul
    ├── conf
    │   ├── db.cnf
    │   └── template-config
    ├── consul
    │   ├── acl
    │   │   ├── policy.ano
    │   │   └── policy.key
    │   ├── conf
    │   │   └── consul.conf
    │   └── conf.d
    │       └── server.json
    ├── README.md
    └── template
        └── mysql3308.tpl

Test environment

We continue to use the earlier test environment:

ip           os           hostname   version
10.0.21.5    centos 6.5   cz-test1   consul 0.8v
10.0.21.7    centos 6.5   cz-test2   consul 0.8v
10.0.21.17   centos 6.5   cz-test3   consul 0.8v

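All of the steps below assume that the consul cluster is already installed.

Remarks

Before running checkmysql, we need to set up ACL policies so that sensitive data in consul cannot be read by others. The token parameter in the commands below is the acl_master_token option from the main consul configuration file. The file policy.ano holds the policy that restricts anonymous access to the mysql/* keys, and policy.key grants access to the mysql.* keys. The token generated here is dcb5b583-cd36-d39d-2b31-558bebf86502. See the consul acl documentation for more on access control.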

    # curl -X PUT --data @policy.ano http://localhost:8500/v1/acl/update?token=e95597e0-4045-11e7-a9ef-b6ba84687927
    {"ID":"anonymous"}
    # curl -X PUT --data @policy.key http://localhost:8500/v1/acl/update?token=e95597e0-4045-11e7-a9ef-b6ba84687927
    {"ID":"dcb5b583-cd36-d39d-2b31-558bebf86502"}
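The article does not show the contents of policy.ano and policy.key. As a rough idea of what such files could contain, the sketch below builds a JSON body in the legacy ACL format of that Consul generation (Name, Type and an HCL Rules string). The function name acl_payload, the token names and the exact rules are assumptions for illustration, not the actual files from the repository.

```python
import json


def acl_payload(name, key_prefix, policy, token_id=None):
    """Build a JSON body in the legacy (Consul 0.8-era) ACL format.

    policy is one of "read", "write" or "deny" and applies to every key
    under key_prefix. token_id is set only when updating an existing
    token such as "anonymous".
    """
    if policy not in ("read", "write", "deny"):
        raise ValueError(f"unknown policy: {policy}")
    body = {
        "Name": name,
        "Type": "client",
        # Rules are an HCL string embedded in the JSON document.
        "Rules": f'key "{key_prefix}" {{ policy = "{policy}" }}',
    }
    if token_id is not None:
        body["ID"] = token_id
    return json.dumps(body)
```

Under these assumptions, acl_payload("Anonymous Token", "mysql/", "deny", token_id="anonymous") would play the role of policy.ano, and acl_payload("mysql keys", "mysql/", "write") the role of policy.key.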

checkmysql

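Run this script on every consul server node. The token parameter is the token produced by the ACL step above, and tag is the instance section in the db.cnf configuration. Start it with the following command: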

    perl checkmysql --conf db.cnf --verbose --tag mysql3308 --token dcb5b583-cd36-d39d-2b31-558bebf86502
    [2017-06-08T10:09:14] mysql/mysql3308/cz-test2 with value 1 no change
    [2017-06-08T10:09:15] mysql/mysql3308/cz-test2 with value 1 no change

cz-test2 is the hostname of the current host and corresponds to node-consul described above.

Remarks

If your MySQL master is served through a VIP, it is best to set the host option in db.cnf to the VIP address.

consul-template

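After checkmysql updates the relevant keys in consul, if any checkmysql instance has changed a key value, consul-template regenerates the mysqlxxx.conf file from the template file and then calls the masterha_manager_consul script. See template-config for the consul-template configuration. Start it with the following command: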

    # consul-template -config config
    2017/05/25 10:11:13 [DEBUG] (logging) enabling syslog on LOCAL5

The mysqlxxxx.tpl template file produces content like the following, with one host:value pair per detection point:

    # node3308
    cz-test1:1
    cz-test2:1
    cz-test3:1
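Reading this content back into a per-host status map could look like the sketch below. It is an illustration only: the function name parse_status is hypothetical, and it assumes the "# node3308" header sits on its own line and that every other whitespace-separated token has the form host:value.

```python
def parse_status(text):
    """Parse generated status text into {host: value}.

    Lines starting with '#' are comments; every other whitespace-separated
    token is expected to look like "host:value" with value 0 or 1.
    """
    statuses = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        for token in line.split():
            # rpartition splits on the last ':' so the host part stays whole.
            host, sep, value = token.rpartition(":")
            if not sep or not host or value not in ("0", "1"):
                raise ValueError(f"unexpected entry: {token!r}")
            statuses[host] = int(value)
    return statuses
```

Rejecting malformed entries instead of silently skipping them is a deliberate choice here: a half-written file should not be mistaken for a set of healthy detection points.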

If fewer than half of the detection points find MySQL abnormal, consul-template prints the following message:

    [2017-06-08T10:24:15] status ok, skip switch..

Otherwise it prints an error message and starts calling the masterha_manager_consul script:

    [2017-05-25T10:24:48] status error, need switch..
    Wed May 24 10:24:48 2017 - [info] Reading default configuration from /etc/masterha/app_default.cnf..
    ...
    ...

conf.d/server.json

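See the address = "consul.service.consul:8500" option in the template-config configuration. If the address option lists only the IP of a single consul server, consul-template may be unable to reach a consul server to watch the relevant keys when the network is unstable. consul-template does retry, but with a single IP it is hard to guarantee that the key values can be fetched. The conf.d/server.json configuration registers the IPs of all consul servers under one DNS entry, as shown below: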

    # dig @10.0.21.5 consul.service.consul
    ......
    ......
    ;; QUESTION SECTION:
    ;consul.service.consul.        IN    A

    ;; ANSWER SECTION:
    consul.service.consul.    0    IN    A    10.0.21.7
    consul.service.consul.    0    IN    A    10.0.21.5
    consul.service.consul.    0    IN    A    10.0.21.17

If a single consul server fails, requests automatically move to a healthy consul server.
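Why several addresses behind one name help can be shown with a small sketch: a client that holds all of the A records can try them in turn. This is a simplified illustration, not how consul-template or the system resolver is implemented; the function name first_reachable and the probe callback are hypothetical.

```python
def first_reachable(addresses, probe):
    """Return the first address for which probe(address) succeeds.

    addresses -- iterable of server IPs (e.g. the A records of a DNS name)
    probe     -- callable returning True if the server answered
    Returns None when no server is reachable.
    """
    for address in addresses:
        try:
            if probe(address):
                return address
        except OSError:
            continue  # treat connection errors as "try the next server"
    return None
```

With a single configured IP the list has one element, so one unreachable server means no result at all; with the three records returned by dig above, two servers can fail before the lookup comes back empty.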

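Master-slave failover test

We simply shut down the master instance and look at the output of each tool.

Shutting down the master

After the master is shut down, the checkmysql script starts updating the status. Once more than half of the detection points report a failure, the masterha_manager_consul script is called to perform the master-slave switch. The checkmysql output shows the key value being changed to 0: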

    [2017-06-08T18:16:43] mysql/mysql3308/cz-test2 with value 1 no change
    DBI connect('mysql_read_default_file=./db.cnf;mysql_read_default_group=mysql3308','',...) failed: Can't connect to MySQL server on '10.0.21.7' (111) at checkmysql line 56
    [2017-06-08T18:16:44] set 0 with key mysql/mysql3308/cz-test2 ok
    DBI connect('mysql_read_default_file=./db.cnf;mysql_read_default_group=mysql3308','',...) failed: Can't connect to MySQL server on '10.0.21.7' (111) at checkmysql line 56
    [2017-06-08T18:16:45] mysql/mysql3308/cz-test2 with value 0 no change

The mysql3308.conf configuration file changes to the following:

    # node3308
    cz-test1:0
    cz-test2:0
    cz-test3:0

consul-template shows the following:

    # consul-template -config config
    2017/06/08 12:11:13 [DEBUG] (logging) enabling syslog on LOCAL5
    [2017-05-24T12:16:48] status error, need switch..   # the script finds that more than half consider the database inaccessible
    Wed Jun 08 12:16:48 2017 - [info] Reading default configuration from /etc/masterha/app_default.cnf..
    Wed Jun 08 12:16:48 2017 - [info] Reading application default configuration from /etc/masterha/app_56.conf..
    Wed Jun 08 12:16:48 2017 - [info] Updating application default configuration from /usr/bin/init_conf_loads..
    ....

If no more than half of the detection points report a failure, consul-template shows the following:

    [2017-06-08T12:24:15] status ok, skip switch..

MHA failover log

The MHA failover log contains the following information; the location of the log file depends on your MHA configuration:

    Wed Jun 08 12:45:37 2017 - [info] Starting master failover..
    Wed Jun 08 12:45:37 2017 - [info]
    From:
    10.0.21.7(10.0.21.7:3308) (current master)
     +--10.0.21.17(10.0.21.17:3308)

    To:
    10.0.21.17(10.0.21.17:3308) (new master)
    ...
    ...
    Master failover to 10.0.21.17(10.0.21.17:3308) completed successfully.
    Wed Jun 08 12:45:41 2017 - [info] Sending mail..

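Summary

Overall, the consul-based architecture is more cumbersome and not as simple as a single-node setup. For core databases, however, consistency should come first, and multi-point detection makes the switching mechanism considerably more robust. The masterha_manager script shipped with the original tool only checks in a loop and starts switching after more than three errors (with the interval growing each time), so network jitter, a switch failure or a busy database host can trigger unexpected operations. Multi-point detection avoids this kind of instability. Once the consul cluster is deployed it can also serve other services that need consistency decisions, so the extra complexity need not be a major concern.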
