Mysql 高可用(MHA)-读写分离(Atlas)

Mysql 高可用(MHA)-读写分离(Atlas)

1. 搭建主从复制(一主两从)

1.1 准备环境

1 主库:10.0.0.51/db01
2 从库:10.0.0.52/db02,10.0.0.53/db03

1.2 清理环境

 1 systemctl stop  mysqld
 2 rm -rf /data/mysql_3306/*
 3 rm -rf /binlog/
 4 mkdir /binlog/
 5 
 6 创建相关目录与授权
 7 主库操作:开启binlog 和GTID
 8 mkdir -p /binlog/
 9 mkdir -p /var/log/mysql/
10 touch /var/log/mysql/mysql.err
11 chown -R mysql.mysql /var/log/mysql/
12 chown -R mysql.mysql /binlog/
13 
14 从库操作:从库不需要开启binlog
15 mkdir -p /var/log/mysql/
16 touch /var/log/mysql/mysql.err
17 chown -R mysql.mysql /var/log/mysql/

1.3 准备配置⽂件

 1 a. 主库  (开启binlog和GTID)
 2 cat > /etc/my.cnf <<EOF
 3 [mysqld]
 4 user=mysql
 5 datadir=/data/mysql_3306
 6 basedir=/opt/mysql/
 7 socket=/tmp/mysql.sock
 8 port=3306
 9 log_error=/var/log/mysql/mysql.err
10 server_id=51
11 log_bin=/binlog/mysql-bin
12 autocommit=0
13 binlog_format=row
14 gtid-mode=on
15 enforce-gtid-consistency=true
16 log-slave-updates=1
17 [mysql]
18 socket=/tmp/mysql.sock
19 [client]
20 socket=/tmp/mysql.sock
21 EOF
22 
23 b.从库 (开启GTID,不用开启binlog)
24 cat > /etc/my.cnf <<EOF
25 [mysqld]
26 user=mysql
27 datadir=/data/mysql_3306
28 basedir=/opt/mysql/
29 socket=/tmp/mysql.sock
30 port=3306
31 log_error=/var/log/mysql/mysql.err
32 server_id=52
33 autocommit=0
34 gtid-mode=on
35 enforce-gtid-consistency=true
36 log-slave-updates=1
37 [mysql]
38 socket=/tmp/mysql.sock
39 [client]
40 socket=/tmp/mysql.sock
41 EOF
42 
43 c.从库 (开启GTID,不用开启binlog)
44 cat > /etc/my.cnf <<EOF
45 [mysqld]
46 user=mysql
47 datadir=/data/mysql_3306
48 basedir=/opt/mysql/
49 socket=/tmp/mysql.sock
50 port=3306
51 log_error=/var/log/mysql/mysql.err
52 server_id=53
53 autocommit=0
54 gtid-mode=on
55 enforce-gtid-consistency=true
56 log-slave-updates=1
57 [mysql]
58 socket=/tmp/mysql.sock
59 [client]
60 socket=/tmp/mysql.sock
61 EOF

1.4 初始化数据

1 mysqld --initialize-insecure --user=mysql --basedir=/opt/mysql --datadir=/data/mysql_3306/

1.5 启动数据库

 1 systemctl start mysql
 2 查看是否启动
 3 netstat -lntup |grep 3306
 4 
 5 如果是刚装的mysql,启动mysql需要以下方法才可以启动
 6 方法一:cp /opt/mysql/support-files/mysql.server  /etc/init.d/mysqld    
 7       chkconfig --add mysqld
 8       systemctl start mysqld
 9       
10 方法二: /etc/init.d/mysqld start

1.6 设置数据库密码

1 mysqladmin password

1.7 创建远程复制⽤户 (db01操作)

 1 mysql> grant replication slave on *.* to repl@'10.0.0.%' identified by '123456abcd';
 2 Query OK, 0 rows affected, 1 warning (0.01 sec)
 3 
 4 mysql> select user,host from mysql.user;
 5 +---------------+-----------+
 6 | user          | host      |
 7 +---------------+-----------+
 8 | repl          | 10.0.0.%  |
 9 | mysql.session | localhost |
10 | mysql.sys     | localhost |
11 | root          | localhost |
12 +---------------+-----------+
13 4 rows in set (0.00 sec)

1.8 构建主从

 1 查看主库GTID 位置
 2 mysql> show master status;
 3 +------------------+----------+--------------+------------------+------------------------------------------+
 4 | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
 5 +------------------+----------+--------------+------------------+------------------------------------------+
 6 | mysql-bin.000004 |      704 |              |                  | 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-2 |
 7 +------------------+----------+--------------+------------------+------------------------------------------+
 8 1 row in set (0.00 sec)
 9 
10 从库执行以下命令:
11 change master to
12 master_host='10.0.0.52',
13 master_user='repl',
14 master_password='123456abcd',
15 MASTER_AUTO_POSITION=1;
16 
17 开启主从复制
18 start slave;
19 
20 确认启动是否OK
21 mysql> start slave;
22 Query OK, 0 rows affected (0.00 sec)
23 
24 mysql> show slave status\G
25 *************************** 1. row ***************************
26                Slave_IO_State: Waiting for master to send event
27                   Master_Host: 10.0.0.51
28                   Master_User: repl
29                   Master_Port: 3306
30                 Connect_Retry: 60
31               Master_Log_File: mysql-bin.000004
32           Read_Master_Log_Pos: 704
33                Relay_Log_File: db02-relay-bin.000002
34                 Relay_Log_Pos: 917
35         Relay_Master_Log_File: mysql-bin.000004
36              Slave_IO_Running: Yes
37             Slave_SQL_Running: Yes
38               Replicate_Do_DB: 
39           Replicate_Ignore_DB: 
40            Replicate_Do_Table: 
41        Replicate_Ignore_Table: 
42       Replicate_Wild_Do_Table: 
43   Replicate_Wild_Ignore_Table: 
44                    Last_Errno: 0
45                    Last_Error: 
46                  Skip_Counter: 0
47           Exec_Master_Log_Pos: 704
48               Relay_Log_Space: 1123
49               Until_Condition: None
50                Until_Log_File: 
51                 Until_Log_Pos: 0
52            Master_SSL_Allowed: No
53            Master_SSL_CA_File: 
54            Master_SSL_CA_Path: 
55               Master_SSL_Cert: 
56             Master_SSL_Cipher: 
57                Master_SSL_Key: 
58         Seconds_Behind_Master: 0
59 Master_SSL_Verify_Server_Cert: No
60                 Last_IO_Errno: 0
61                 Last_IO_Error: 
62                Last_SQL_Errno: 0
63                Last_SQL_Error: 
64   Replicate_Ignore_Server_Ids: 
65              Master_Server_Id: 51
66                   Master_UUID: 86ef6764-6e12-11eb-ac19-000c29ee4d81
67              Master_Info_File: /data/mysql_3306/master.info
68                     SQL_Delay: 0
69           SQL_Remaining_Delay: NULL
70       Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
71            Master_Retry_Count: 86400
72                   Master_Bind: 
73       Last_IO_Error_Timestamp: 
74      Last_SQL_Error_Timestamp: 
75                Master_SSL_Crl: 
76            Master_SSL_Crlpath: 
77            Retrieved_Gtid_Set: 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-2
78             Executed_Gtid_Set: 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-2
79                 Auto_Position: 1
80          Replicate_Rewrite_DB: 
81                  Channel_Name: 
82            Master_TLS_Version: 
83 1 row in set (0.00 sec)

1.9 解除从库

1 stop slave;
2 reset slave all;

2.MHA环境准备

主库:10.0.0.51
从库:10.0.0.52,10.0.0.53

2.0  MHA 介绍

 1 1.Master HA,对主节点进行监控,可实现自动故障转 移至其它从节点;通过提升某一从节点为新的主节点,基于主从复制实现,还需要客户端配合实现,目前MHA主要支持一主多从的架构,要搭建MHA,要求一个复制集群中必须最少有 三台数据库服务器,一主二从,即一台充当master,一台充当备用master,另外一台充当从库,如果财大气粗,也可以用一台专门的服务器来当MHA监控管理服务器
 2 
 3 2.MHA工作原理
 4     1 从宕机崩溃的master保存二进制日志事件(binlog events) 
 5     2 识别含有最新更新的slave 
 6     3 应用差异的中继日志(relay log)到其他的slave 
 7     4 应用从master保存的二进制日志事件(binlog events) 
 8     5 提升一个slave为新的master 
 9     6 使其他的slave连接新的master进行复制 
10     
11     注意:MHA需要基于ssh,key验证登入方法    
12     
13     
14 3.MHA软件由两部分组成,Manager工具包和Node工具包,具体的说明如下。
15  1).Manager工具包主要包括以下几个工具:
16     masterha_check_ssh              检查MHA的SSH配置状况
17     masterha_check_repl             检查MySQL复制状况
18     masterha_manger                 启动MHA
19     masterha_check_status           检测当前MHA运行状态
20     masterha_master_monitor         检测master是否宕机
21     masterha_master_switch          控制故障转移(自动或者手动)
22     masterha_conf_host              添加或删除配置的server信息
23     
24  2).Node工具包(这些工具通常由MHA Manager的脚本触发,无需人为操作)主要包括以下几个工具:
25     save_binary_logs                保存和复制master的二进制日志
26     apply_diff_relay_logs           识别差异的中继日志事件并将其差异的事件应用于其他的slave
27     filter_mysqlbinlog              去除不必要的ROLLBACK事件(MHA已不再使用这个工具)
28     purge_relay_logs                清除中继日志(不会阻塞SQL线程)
29     
30    自定义扩展:
31     secondary_check_script:通过多条网络路由检测master的可用性;
32     master_ip_failover_script:更新application使用的masterip;
33    report_script:发送报告;
34    init_conf_load_script:加载初始配置参数;
35    master_ip_online_change_script;更新master节点ip地址。

2.1 搭建关键的软链接(51,52,53都需要设置)

1 ln -s /opt/mysql-5.7.28/bin/mysqlbinlog /usr/bin/mysqlbinlog
2 ln -s /opt/mysql-5.7.28/bin/mysql /usr/bin/mysql

2.2 配置互信(不用秘钥可以登录ssh)

 1 db01:
 2   rm -rf /root/.ssh
 3   ssh-keygen
 4   cd /root/.ssh
 5   mv id_rsa.pub authorized_keys
 6   scp -r /root/.ssh 10.0.0.52:/root
 7   scp -r /root/.ssh 10.0.0.53:/root
 8   
 9 各节点验证:(ssh无秘钥登录)
10 db01:
11 ssh 10.0.0.51 date           登录并查看时间
12 ssh 10.0.0.52 date
13 ssh 10.0.0.53 date
14 db02:
15 ssh 10.0.0.51 date
16 ssh 10.0.0.52 date
17 ssh 10.0.0.53 date
18 db03:
19 ssh 10.0.0.51 date
20 ssh 10.0.0.52 date
21 ssh 10.0.0.53 date

2.3 各节点安装 node软件和依赖包

1 yum install perl-DBD-MySQL -y
2 rpm -ivh  mha4mysql-node-0.56-0.el6.noarch.rpm
3 
4 [root@db01 ~00:50:55]# rpm -ivh  mha4mysql-node-0.56-0.el6.noarch.rpm
5 Preparing...                          ################################# [100%]
6 Updating / installing...
7    1:mha4mysql-node-0.56-0.el6        ################################# [100%]

2.4 在主库db01中创建mha需要的用户

1 grant all privileges on *.* to mha@'10.0.0.%' identified by '123456abcd';
2 
3 select user,host from mysql.user;

2.5 Manager软件安装(db03)

1 yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
2 rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
3 
4 [root@db03 ~00:53:36]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
5 Preparing...                          ################################# [100%]
6 Updating / installing...
7    1:mha4mysql-manager-0.56-0.el6     ################################# [100%]

2.6 配置文件准备(db03)

 1 # 创建配置文件目录
 2 mkdir -p /etc/mha
 3 # 创建日志目录
 4 mkdir -p /var/log/mha/app1
 5 # 编辑mha配置文件
 6 cat > /etc/mha/app1.cnf <<EOF
 7 [server default]
 8 manager_log=/var/log/mha/app1/manager 
 9 manager_workdir=/var/log/mha/app1 
10 master_binlog_dir=/binlog                                  #主库binlog位置
11 user=mha                                                   #mha用户
12 password=123456abcd 
13 ping_interval=2                                            #每隔2秒扫描一次
14 repl_password=123456abcd
15 repl_user=repl                                             #主从复制用户
16 ssh_user=root                                              #无秘钥使用用户
17 [server1] 
18 hostname=10.0.0.51
19 port=3306 
20 [server2] 
21 hostname=10.0.0.52
22 port=3306
23 [server3]
24 hostname=10.0.0.53
25 port=3306
26 EOF

2.7 状态检查(db03)

  1 1. masterha_check_ssh --conf=/etc/mha/app1.cnf
  2 
  3 [root@db03 ~00:59:38]# masterha_check_ssh --conf=/etc/mha/app1.cnf
  4 Sun Feb 14 00:59:41 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
  5 Sun Feb 14 00:59:41 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
  6 Sun Feb 14 00:59:41 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
  7 Sun Feb 14 00:59:41 2021 - [info] Starting SSH connection tests..
  8 Sun Feb 14 00:59:42 2021 - [debug] 
  9 Sun Feb 14 00:59:41 2021 - [debug]  Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.52(10.0.0.52:22)..
 10 Sun Feb 14 00:59:41 2021 - [debug]   ok.
 11 Sun Feb 14 00:59:41 2021 - [debug]  Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.53(10.0.0.53:22)..
 12 Sun Feb 14 00:59:41 2021 - [debug]   ok.
 13 Sun Feb 14 00:59:42 2021 - [debug] 
 14 Sun Feb 14 00:59:41 2021 - [debug]  Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.51(10.0.0.51:22)..
 15 Sun Feb 14 00:59:41 2021 - [debug]   ok.
 16 Sun Feb 14 00:59:41 2021 - [debug]  Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.53(10.0.0.53:22)..
 17 Sun Feb 14 00:59:42 2021 - [debug]   ok.
 18 Sun Feb 14 00:59:43 2021 - [debug] 
 19 Sun Feb 14 00:59:42 2021 - [debug]  Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.51(10.0.0.51:22)..
 20 Sun Feb 14 00:59:42 2021 - [debug]   ok.
 21 Sun Feb 14 00:59:42 2021 - [debug]  Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.52(10.0.0.52:22)..
 22 Sun Feb 14 00:59:42 2021 - [debug]   ok.
 23 Sun Feb 14 00:59:43 2021 - [info] All SSH connection tests passed successfully.
 24 
 25 
 26 2. masterha_check_repl --conf=/etc/mha/app1.cnf
 27 Sun Feb 14 14:16:52 2021 - [info] Checking replication health on 10.0.0.52..
 28 Sun Feb 14 14:16:52 2021 - [info]  ok.
 29 Sun Feb 14 14:16:52 2021 - [info] Checking replication health on 10.0.0.53..
 30 Sun Feb 14 14:16:52 2021 - [info]  ok.
 31 Sun Feb 14 14:16:52 2021 - [warning] master_ip_failover_script is not defined.
 32 Sun Feb 14 14:16:52 2021 - [warning] shutdown_script is not defined.
 33 Sun Feb 14 14:16:52 2021 - [info] Got exit code 0 (Not master dead).
 34 MySQL Replication Health is OK.
 35 
 36 
 37 报错:
 38 [root@db03 ~00:59:43]# masterha_check_repl --conf=/etc/mha/app1.cnf
 39 Sun Feb 14 01:00:24 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
 40 Sun Feb 14 01:00:24 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
 41 Sun Feb 14 01:00:24 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
 42 Sun Feb 14 01:00:24 2021 - [info] MHA::MasterMonitor version 0.56.
 43 Sun Feb 14 01:00:25 2021 - [info] GTID failover mode = 1
 44 Sun Feb 14 01:00:25 2021 - [info] Dead Servers:
 45 Sun Feb 14 01:00:25 2021 - [info] Alive Servers:
 46 Sun Feb 14 01:00:25 2021 - [info]   10.0.0.51(10.0.0.51:3306)
 47 Sun Feb 14 01:00:25 2021 - [info]   10.0.0.52(10.0.0.52:3306)
 48 Sun Feb 14 01:00:25 2021 - [info]   10.0.0.53(10.0.0.53:3306)
 49 Sun Feb 14 01:00:25 2021 - [info] Alive Slaves:
 50 Sun Feb 14 01:00:25 2021 - [info]   10.0.0.52(10.0.0.52:3306)  Version=5.7.28 (oldest major version between slaves) log-bin:disabled
 51 Sun Feb 14 01:00:25 2021 - [info]     GTID ON
 52 Sun Feb 14 01:00:25 2021 - [info]     Replicating from 10.0.0.51(10.0.0.51:3306)
 53 Sun Feb 14 01:00:25 2021 - [info]   10.0.0.53(10.0.0.53:3306)  Version=5.7.28 (oldest major version between slaves) log-bin:disabled
 54 Sun Feb 14 01:00:25 2021 - [info]     GTID ON
 55 Sun Feb 14 01:00:25 2021 - [info]     Replicating from 10.0.0.51(10.0.0.51:3306)
 56 Sun Feb 14 01:00:25 2021 - [info] Current Alive Master: 10.0.0.51(10.0.0.51:3306)
 57 Sun Feb 14 01:00:25 2021 - [info] Checking slave configurations..
 58 Sun Feb 14 01:00:25 2021 - [info]  read_only=1 is not set on slave 10.0.0.52(10.0.0.52:3306).
 59 Sun Feb 14 01:00:25 2021 - [warning]  log-bin is not set on slave 10.0.0.52(10.0.0.52:3306). This host cannot be a master.
 60 Sun Feb 14 01:00:25 2021 - [info]  read_only=1 is not set on slave 10.0.0.53(10.0.0.53:3306).
 61 Sun Feb 14 01:00:25 2021 - [warning]  log-bin is not set on slave 10.0.0.53(10.0.0.53:3306). This host cannot be a master.
 62 Sun Feb 14 01:00:25 2021 - [info] Checking replication filtering settings..
 63 Sun Feb 14 01:00:25 2021 - [info]  binlog_do_db= , binlog_ignore_db= 
 64 Sun Feb 14 01:00:25 2021 - [info]  Replication filtering check ok.
 65 Sun Feb 14 01:00:25 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
 66 Sun Feb 14 01:00:25 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48.
 67 Sun Feb 14 01:00:25 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
 68 Sun Feb 14 01:00:25 2021 - [info] Got exit code 1 (Not master dead).
 69 
 70   SQL Replication Health is NOT OK!
 71   
 72   
 73 解决问题:
 74         [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
 75         Sun Feb 14 01:00:25 2021 - [info]   10.0.0.52(10.0.0.52:3306)  Version=5.7.28 (oldest major version between slaves) log-bin:disabled
 76         Sun Feb 14 01:00:25 2021 - [info]   10.0.0.53(10.0.0.53:3306)  Version=5.7.28 (oldest major version between slaves) log-bin:disabled
 77         
 78    以上说明log-bin处于关闭的状态,同时show variables like 'log_bin';可以看到log_bin处于OFF
 79    mysql> show variables like 'log_bin';
 80 +---------------+-------+
 81 | Variable_name | Value |
 82 +---------------+-------+
 83 | log_bin       | OFF   |
 84 +---------------+-------+
 85 1 row in set (0.00 sec)
 86      
 87 解决方法:
 88        在db02,db03的mysql 配置档中写入log_bin=master-bin,必须写在server-id=[] 后面.
 89        [root@db03 ~14:16:52]# cat /etc/my.cnf
 90         [mysqld]
 91         user=mysql
 92         datadir=/data/mysql_3306
 93         basedir=/opt/mysql/
 94         socket=/tmp/mysql.sock
 95         port=3306
 96         log_error=/var/log/mysql/mysql.err
 97         server_id=53
 98         log_bin=master-bin
 99         autocommit=0
100         gti--mode=on
101         enforce-gtid-consistency=true
102         log-slave-updates=1
103         [mysql]
104         socket=/tmp/mysql.sock
105         [client]
106         socket=/tmp/mysql.sock
107  重新再测试下,问题已解决
108          [root@db03 ~14:16:52]# cat /etc/my.cnf
109         [mysqld]
110         user=mysql
111         datadir=/data/mysql_3306
112         basedir=/opt/mysql/
113         socket=/tmp/mysql.sock
114         port=3306
115         log_error=/var/log/mysql/mysql.err
116         server_id=53
117         log_bin=master-bin
118         autocommit=0
119         gti--mode=on
120         enforce-gtid-consistency=true
121         log-slave-updates=1
122         [mysql]
123         socket=/tmp/mysql.sock
124         [client]
125         socket=/tmp/mysql.sock

2.8 开启MHA(db03)

1 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &

2.9 查看MHA状态(db03)

1 masterha_check_status --conf=/etc/mha/app1.cnf
2 
3 [root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf
4 app1 (pid:7808) is running(0:PING_OK), master:10.0.0.51

3.MHA 故障测试(主库宕机会自动选出新主库)

 1 MHA Failover 过程原理:
 2    高可用最擅长的是为我们解决物理损坏,就是说主库坏了可以很快从备份从库中筛选出新的主库,使数据正常运行.
 3 1.启动Manager 
 4   调用masterha_manager脚本启动Manager程序
 5   
 6 2.监控:
 7   通过:masterha_master_monitor心跳检测脚本,数据库节点,主要监控主库.默认探测4次,每隔(ping_interval=2)秒,如果主库还没有心跳,认为主库宕机,进入failover过程.
 8   
 9 3. 选主:
10   [server2]
11   hostname=172.25.83.2
12   port=3306
13   candidate_master=1      #设置为候选master,如果设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库不是集群中事件最新的slave
14   check_repl_delay=0      #默认情况下如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间,通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的master
15   a. 优先级(主观),如果在节点配置时,加入了candidate_master=1 参数.如果备选主,日志量落后mater太多(后master 100M 的ralay_1ogs的话,也不会被选择成为新主master),也可以通过check_repl_delay=0,不检查日志落后的情景.
16   b. 日志量最接近主库
17   c. 日志量一样,按配置文件顺序来选择主库master.
18   
19 4. 日志补偿:
20   a. ssh 能连接上,通过save_binary_logs立即保存缺失部分日志到从库(/var/tmp目录下)并恢复
21   b. ssh 连接不了,两个从库进行relay_log日志diff(apply_diff_relay_logs)差异补偿.
22   
23 5. 主从身份切换,所有从库取消和原有主库的复制关系(stop slave;reset slave all),新主库与剩下从库构建主从复制.
24 
25 6. 故障库自动被剔除集群(通过master_conf_host 从配置信息中去掉)
26 
27 7. MHA是一次性的高可用,Failover 后,Manager 自动退出(需要手动再次加入集群)
28 以上是MHA的基础环境所有具备的功能.
29 不足的地方:
30 1.应该透明(IP地址自动切换)
31 2.数据补偿(数据不能丢失)
32 3.自动提醒
33 4.自愈功能(待开发)
34   思路:MHA + K8S +Operator 官方  ,   8.0 MGR + Mysqlsh

4.MHA应用透明VIP功能

1 说明:只能同机房使用,无法跨机房跨网络
2 如果想用跨机房跨网络 可以用keepalived,需要用candidate_master=1和check_repl_delay=0来配合,防止VIP和主库不在一个节点上.

4.1配置参数(db03)

 1 将master_ip_failover_script= /usr/local/bin/master_ip_failover写入到mha配置档中
 2 
 3 [root@db03 ~00:27:23]# cat /etc/mha/app1.cnf 
 4 [server default]
 5 master_ip_failover_script= /usr/local/bin/master_ip_failover
 6 manager_log=/var/log/mha/app1/manager
 7 manager_workdir=/var/log/mha/app1
 8 master_binlog_dir=/binlog
 9 password=123456abcd
10 ping_interval=2
11 repl_password=123456abcd
12 repl_user=repl
13 ssh_user=root
14 user=mha
15 
16 [server1]
17 hostname=10.0.0.51
18 port=3306
19 
20 [server2]
21 hostname=10.0.0.52
22 port=3306
23 
24 [server3]
25 hostname=10.0.0.53
26 port=3306

4.2 修改脚本

 1 vim /usr/local/bin/master_ip_failover
 2 my $vip = '10.0.0.55/24';  
 3 my $key = "1";
 4 my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
 5 my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
 6 
 7 a. 需要转换下中文字符
 8 dos2unix /usr/local/bin/master_ip_failover
 9 
10 [root@db03 ~00:51:52]# dos2unix /usr/local/bin/master_ip_failover
11 dos2unix: converting file /usr/local/bin/master_ip_failover to Unix format ...
12 b. 赋予x权限
13  chmod +x /usr/local/bin/master_ip_failover

4.3 手动在主库master 中新建一个VIP.网卡名称一定要与配置文档中的一样

 1 db01:10.0.0.51 操作
 2 ifconfig eth0:1 10.0.0.55/24
 3 
 4 [root@db01 ~01:02:14]# ifconfig eth0:1 10.0.0.55/24
 5 [root@db01 ~01:02:38]# ip a
 6 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
 7     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 8     inet 127.0.0.1/8 scope host lo
 9        valid_lft forever preferred_lft forever
10     inet6 ::1/128 scope host 
11        valid_lft forever preferred_lft forever
12 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
13     link/ether 00:0c:29:ee:4d:81 brd ff:ff:ff:ff:ff:ff
14     inet 10.0.0.51/24 brd 10.0.0.255 scope global noprefixroute eth0
15        valid_lft forever preferred_lft forever
16     inet 10.0.0.55/24 brd 10.0.0.255 scope global secondary eth0:1
17        valid_lft forever preferred_lft forever
18     inet6 fe80::f2eb:f691:47ea:8c4e/64 scope link tentative noprefixroute dadfailed 
19        valid_lft forever preferred_lft forever
20     inet6 fe80::967b:20ed:e702:f7c2/64 scope link tentative noprefixroute dadfailed 
21        valid_lft forever preferred_lft forever
22     inet6 fe80::fda8:a693:1736:211/64 scope link noprefixroute 
23        valid_lft forever preferred_lft forever

4.4 重启MHA(db03操作)

a. 先停止
masterha_stop --conf=/etc/mha/app1.cnf   

b. 再启动
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &

c.查看下VIP是否还是在主库51上或者查看master是否为51
[root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:7808) is running(0:PING_OK), master:10.0.0.51

4.5 测试VIP漂移

 1 1.将主库master mysql 停掉
 2   systemctl stop mysqld
 3   
 4 2.查看mha 日志
 5   tail -f /var/log/mha/app1/manager 
 6 
 7 3.查看VIP IP已经漂移到52,说明已经将52做为新主库master.
 8 
 9 4.主库51上查看主从复制,VIP 都已要没有了,同时53库上的/etc/mha/app1.cnf [server 1] 配置已经清除.
10 
11 5.将51库的Mysql启动来,再做主从复制,主库为52.同时/etc/mha/app1.cnf  [server 1] 增加进去.
12     change master to
13     master_host='10.0.0.52',
14     master_user='repl',
15     master_password='123456abcd',
16     MASTER_AUTO_POSITION=1; 
17     再开启start slave;
18     查看show slave status\G
19     
20 6.在主库52中执行show slave hosts;可以查看到有几台从库
21   mysql> show slave hosts;
22     +-----------+------+------+-----------+--------------------------------------+
23     | Server_id | Host | Port | Master_id | Slave_UUID                           |
24     +-----------+------+------+-----------+--------------------------------------+
25     |        53 |      | 3306 |        51 | b03f4593-6e18-11eb-a543-000c291a88d9 |
26     |        52 |      | 3306 |        51 | 6a0cfd41-6e18-11eb-a1b0-000c29d2d3c1 |
27     +-----------+------+------+-----------+--------------------------------------+
28     2 rows in set (0.00 sec)

master_ip_failover(perl)脚本

 1 cat /usr/local/bin/master_ip_failover
 2 #!/usr/bin/env perl
 3 use strict;
 4 use warnings FATAL =>'all';
 5 
 6 use Getopt::Long;
 7 
 8 my (
 9 $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
10 $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
11 );
12 
13 my $vip = '10.0.0.55/24';  # Virtual IP
14 my $key = "1";
15 my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
16 my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
17 my $exit_code = 0;
18 
19 GetOptions(
20 'command=s'          => \$command,
21 'ssh_user=s'         => \$ssh_user,
22 'orig_master_host=s' => \$orig_master_host,
23 'orig_master_ip=s'   => \$orig_master_ip,
24 'orig_master_port=i' => \$orig_master_port,
25 'new_master_host=s'  => \$new_master_host,
26 'new_master_ip=s'    => \$new_master_ip,
27 'new_master_port=i'  => \$new_master_port,
28 );
29 
30 exit &main();
31 
32 sub main {
33 
34 #print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
35 
36 if ( $command eq "stop" || $command eq "stopssh" ) {
37 
38         # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
39         # If you manage master ip address at global catalog database,
40         # invalidate orig_master_ip here.
41         my $exit_code = 1;
42         eval {
43             print "\n\n\n***************************************************************\n";
44             print "Disabling the VIP - $vip on old master: $orig_master_host\n";
45             print "***************************************************************\n\n\n\n";
46 &stop_vip();
47             $exit_code = 0;
48         };
49         if ($@) {
50             warn "Got Error: $@\n";
51             exit $exit_code;
52         }
53         exit $exit_code;
54 }
55 elsif ( $command eq "start" ) {
56 
57         # all arguments are passed.
58         # If you manage master ip address at global catalog database,
59         # activate new_master_ip here.
60         # You can also grant write access (create user, set read_only=0, etc) here.
61 my $exit_code = 10;
62         eval {
63             print "\n\n\n***************************************************************\n";
64             print "Enabling the VIP - $vip on new master: $new_master_host \n";
65             print "***************************************************************\n\n\n\n";
66 &start_vip();
67             $exit_code = 0;
68         };
69         if ($@) {
70             warn $@;
71             exit $exit_code;
72         }
73         exit $exit_code;
74 }
75 elsif ( $command eq "status" ) {
76         print "Checking the Status of the script.. OK \n";
77         `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
78         exit 0;
79 }
80 else {
81 &usage();
82         exit 1;
83 }
84 }
85 
86 # A simple system call that enable the VIP on the new master
87 sub start_vip() {
88 `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
89 }
90 # A simple system call that disable the VIP on the old_master
91 sub stop_vip() {
92 `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
93 }
94 
95 sub usage {
96 print
97 "Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=po
98 rt –new_master_host=host –new_master_ip=ip –new_master_port=port\n";
99 }

5.binlog server (数据补偿)

1 db03 主机操作
2 数据补偿,尽可能不丢失数据
3 有专门一台binlog server时时拉取Master 的Mysql-binlog,如果主master 宕机了,从库直接从binlog server读取数据.
4 缺点:可能会拉低master 性能,所以尽能将binlog server装在ssd上.

5.1 参数设置

1 vim /etc/mha/app1.cnf
2 [binlog1]
3 no_master=1            #不参与选主
4 hostname=10.0.0.53     #binlog server 主机Ip
5 master_binlog_dir=/data/mysql/binlog   #这个目录需要与master_binlog_dir=/binlog 目录不一样.

5.2 创建对应目录

1 mkdir -p /data/mysql/binlog
2 chown -R mysql.mysql /data/*

5.3 拉取主库日志

 1 必须先进入到binlog目录中
 2 cd /data/mysql/binlog
 3 mysqlbinlog -R --host=10.0.0.51 --user=mha --password=123456abcd --raw --stop-never mysql-bin.000001 &
 4 mysql-bin.000001:
 5     这个日志是为了测试用,如果生产环境中,因为有很多日志也有可能没有这个日志,所以需要用show master status;来查看下当前日志,或者flush logs;重新刷新一个新的日志起点.
 6 mysql> show master status;
 7 +------------------+----------+--------------+------------------+------------------------------------------+
 8 | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
 9 +------------------+----------+--------------+------------------+------------------------------------------+
10 | mysql-bin.000006 |      194 |              |                  | 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-9 |
11 +------------------+----------+--------------+------------------+------------------------------------------+
12 1 row in set (0.00 sec)
13 
14 [root@db03 /data/mysql/binlog00:26:46]# ps -ef |grep mysqlbinlog
15 root      13131   7587  0 00:24 pts/0    00:00:00 mysqlbinlog -R --host=10.0.0.51 --user=mha --password=x xxxxxxxx --raw --stop-never mysql-bin.000001
16 root      13318   7587  0 00:29 pts/0    00:00:00 grep --color=auto mysqlbinlog
17 
18 [root@db03 /data/mysql/binlog00:20:15]# ll
19 total 24
20 -rw-r----- 1 root root  177 Feb 16 00:19 mysql-bin.000001
21 -rw-r----- 1 root root  177 Feb 16 00:19 mysql-bin.000002
22 -rw-r----- 1 root root  177 Feb 16 00:19 mysql-bin.000003
23 -rw-r----- 1 root root 1940 Feb 16 00:19 mysql-bin.000004
24 -rw-r----- 1 root root  217 Feb 16 00:19 mysql-bin.000005
25 -rw-r----- 1 root root  194 Feb 16 00:19 mysql-bin.000006
26 
27 [root@db01 /binlog00:20:39]# ll
28 total 28
29 -rw-r----- 1 mysql mysql  177 Feb 13 23:41 mysql-bin.000001
30 -rw-r----- 1 mysql mysql  177 Feb 13 23:45 mysql-bin.000002
31 -rw-r----- 1 mysql mysql  177 Feb 13 23:46 mysql-bin.000003
32 -rw-r----- 1 mysql mysql 1940 Feb 14 01:41 mysql-bin.000004
33 -rw-r----- 1 mysql mysql  217 Feb 14 14:13 mysql-bin.000005
34 -rw-r----- 1 mysql mysql  194 Feb 15 21:47 mysql-bin.000006
35 -rw-r----- 1 mysql mysql  150 Feb 15 21:47 mysql-bin.index
36 
37 在主库中先查看日志
38 show master status;
39 再刷新下log
40 flush logs;
41 查看从库53就会有一个新的binlog日志

5.4 重启MHA

1 a. 先停止
2 masterha_stop --conf=/etc/mha/app1.cnf   
3 
4 b. 再启动
5 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
6 
7 c.查看下VIP是否还是在主库51上或者查看master是否为51
8 [root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf
9 app1 (pid:7808) is running(0:PING_OK), master:10.0.0.51

6.发送邮件提醒

6.1 参数设置(db03)

 1 report_script=/usr/local/bin/sendEmail-v1.56/sendemail.sh
 2 写入配置文档中db03
 3 [root@db03 ~20:54:53]# cat /etc/mha/app1.cnf 
 4 [server default]
 5 manager_log=/var/log/mha/app1/manager
 6 manager_workdir=/var/log/mha/app1
 7 master_binlog_dir=/binlog
 8 master_ip_failover_script=/usr/local/bin/master_ip_failover
 9 report_script=/usr/local/bin/sendEmail-v1.56/sendemail.sh
10 password=123456abcd
11 ping_interval=2
12 repl_password=123456abcd
13 repl_user=repl
14 ssh_user=root
15 user=mha
16 
17 [server1]
18 hostname=10.0.0.51
19 port=3306
20 
21 [server2]
22 hostname=10.0.0.52
23 port=3306
24 
25 [server3]
26 hostname=10.0.0.53
27 port=3306
28 
29 [binlog1]
30 no_master=1
31 hostname=10.0.0.53     
32 master_binlog_dir=/data/mysql/binlog

6.2 解压sendEmail 软件

 1 解压软件到指定位置
 2 tar xf sendEmail-v1.56.tar.gz -C /usr/local/bin/
 3 
 4 [root@db03 /usr/local/bin/sendEmail-v1.5621:30:31]# ll
 5 -rw-r--r-- 1 root root 13329 Sep 30  2009 CHANGELOG
 6 -rw-r--r-- 1 root root  7949 Sep 30  2009 README
 7 -rw-r--r-- 1 root root  4701 Sep 30  2009 README-BR.txt
 8 -rwxr-xr-x 1 root root 80213 Sep 30  2009 sendEmail
 9 lrwxrwxrwx 1 root root     9 Sep 30  2009 sendEmail.pl -> sendEmail
10 -rwxr-xr-x 1 root root   280 Feb 26 19:38 sendemail.sh
11 -rw-r--r-- 1 root root  1229 Sep 30  2009 TODO
12 
13 report_script=/usr/local/bin/sendEmail-v1.56/sendemail.sh
14 解释:MHA 最后会读取report_script里的脚本,其实就是调用sendemail.sh脚本,而sendemail.sh 脚本又是调用sendEmail 命令来发送邮件.

6.3 发邮件脚本测试

 1 #1.下载一个sendEmail-v1.56.tar.gz
 2 
 3 #2. 编写一个测试脚本./sendemail.sh 看看测试邮件是否可以发出.确认没有问题.
 4 vim sendemail.sh
 5 #!/bin/bash
 6 /usr/local/bin/sendEmail-v1.56/sendEmail -f "[email protected]" -t "[email protected]" -s "smtp.163.com" -u "myTest" -o tls=no -o message-content-type=html -o message-charset=utf-8 -xu "[email protected]" -xp "OTUWLQOUFMHVKFRL" -m "MHA 出现问题,请及时处理"
 7 
 8 #3. 直接用命令方式
 9 sendEmail -f "[email protected]" -t "[email protected]" -s "smtp.163.com" -u "myTest" -o tls=no -o message-content-type=html -o message-charset=utf-8 -xu "[email protected]" -xp "OTUWLQOUFMHVKFRL" -m "MHA 出现问题,请及时处理"  
10 
11 #4. sendEmail 参数解释
12 -f linuxyw@163.com是发件邮箱,必须是真实的,要不发不出去的,与sendmail不一样
13 -t 63780668@qq.com linuxyw@163.com 这二个邮件是接收邮件的邮箱地址,可以是一个,也可以是多个,用空格隔开就行,实现邮件群发
14 -s smtp.163.com 这是163邮箱的smtp地址,如果用其它企业邮箱,必须指定smtp地址,要不发不出邮件
15 -u hello  hello是邮件主题
16 -xu linuxyw 发件邮件的登陆用户名,必须是真实的,否则系统登陆不上去,发送不了邮件
17 -xp 123456pass 发件邮箱的登陆用户密码,必须是真实的,否则系统登陆不上去,发送不了邮件
18 -m www.linuxyw.com  这里的[www.linuxyw.com](http://www.linuxyw.com/)是邮件的内容,你可以输入你所要写的邮件内容
19 
20 #5. 运行脚本报错处理
21 ==================================================================================
22 运行脚本报错:
23 [root@db03 /usr/local/bin/sendEmail-v1.5623:20:19]# ./sendemail.sh 
24 
25 *******************************************************************
26 
27  Using the default of SSL_verify_mode of SSL_VERIFY_NONE for client
28  is deprecated! Please set SSL_verify_mode to SSL_VERIFY_PEER
29  possibly with SSL_ca_file|SSL_ca_path for verification.
30  If you really don't want to verify the certificate and keep the
31  connection open to Man-In-The-Middle attacks please set
32  SSL_verify_mode explicitly to SSL_VERIFY_NONE in your application.
33 
34 *******************************************************************
35 
36   at /usr/local/bin/sendEmail-v1.56/sendEmail line 1906.
37 invalid SSL_version specified at /usr/share/perl5/vendor_perl/IO/Socket/SSL.pm line 444.
38 ================================================================================================
39 #处理方法:
40 在centos7中,perl版本是5.16,要降级到5.10.但是降级这个perl,依赖关系很多。经过反复搜索,发现有如下解决办法:
41 a、卸载现有的perl版本,安装老版本(不推荐)
42 b、增加参数-o tls=no 选项
43 [root@zabbix02 sendEmail-v1.56]# sendEmail -f  username@163.com -t [email protected] -s smtp.163.com -u "I am zabbix" -o tls=no -o message-content-type=html -o message-charset=utf8 -xu  username@163.com -xp password -m "hello zabbix"
44 Feb 16 15:43:04 zabbix02 sendEmail[18761]: Email was sent successfully!

6.4 测试邮件是否OK

 1 a. 主库Master 执行systemctl stop mysql, 使主库宕机
 2 
 3 b. 查看是否有邮件提醒
 4    或者看邮件日志
 5    
 6 c. 查看VIP 是否转移
 7 
 8 ==============================================================
 9 ----- Failover Report -----
10 
11 app1: MySQL Master failover 10.0.0.51(10.0.0.51:3306) to 10.0.0.52(10.0.0.52:3306) succeeded
12 
13 Master 10.0.0.51(10.0.0.51:3306) is down!
14 
15 Check MHA Manager logs at db03:/var/log/mha/app1/manager for details.
16 
17 Started automated(non-interactive) failover.
18 Invalidated master IP address on 10.0.0.51(10.0.0.51:3306)
19 Selected 10.0.0.52(10.0.0.52:3306) as a new master.
20 10.0.0.52(10.0.0.52:3306): OK: Applying all logs succeeded.
21 10.0.0.52(10.0.0.52:3306): OK: Activated master IP address.
22 10.0.0.53(10.0.0.53:3306): OK: Slave started, replicating from 10.0.0.52(10.0.0.52:3306)
23 10.0.0.52(10.0.0.52:3306): Resetting slave info succeeded.
24 Master failover to 10.0.0.52(10.0.0.52:3306) completed successfully.
25 Fri Feb 26 19:49:45 2021 - [info] Sending mail..
26 Feb 26 19:49:47 db03 sendEmail[7962]: Email was sent successfully!

7. MHA故障排除思路

 1 如果主库宕机了,思路如下
 2 1. 查看进程
 3    ps -ef |grep manager
 4    masterha_check_status --conf=/etc/mha/app1.cnf
 5    
 6 2. 检查配置文档节点
 7    vim /etc/mha/app1.conf
 8    如果节点已经被移除了,说明切换过程已经大部分成功
 9    如果世点还在,说明切换过程卡在中间
10 
11 3. 看日志
12    tail -f /var/log/mha/app1/manager
13    
14 4. 修复故障库,把节点修复好
15    systemctl start mysqld
16    /etc/init.d/mysqld start 
17 5. 修复主从
18    将故障库修好后手工加入已有的主从中,做为从库
19    change master to
20    master_host='10.0.0.52',
21    master_user='repl',
22    master_password='123456abcd',
23    MASTER_AUTO_POSITION=1;
24    start slave;
25    
26 6. 修复配置文档(配置文档如果还在,则省略)
27    将被移除节点配置重新写入配置中
28     [server1]
29     hostname=10.0.0.51
30     port=3306
31     
32 7. 检查SSH 互信和Repl 主从关系
33    masterha_check_ssh --conf=/etc/mha/app1.cnf
34    masterha_check_repl --conf=/etc/mha/app1.cnf
35    如有报错,解决报错后再检查
36    
37 8. 修复binlogserver (主库宕机了,binlogserver 也会停掉)
38     [root@db03 ~14:24:23]# cd /data/mysql/binlog/
39     [root@db03 /data/mysql/binlog14:24:31]# rm -rf ./*    #删除已有的binlog日志
40     拉取新主库binlog信息 (新主库IP地址写正确)
41     cd /data/mysql/binlog
42     mysqlbinlog -R --host=10.0.0.52 --user=mha --password=123456abcd --raw --stop-never mysql-bin.000001 &
43    
44 9. 检查节点VIP 的状态
45    如果不在,再的手工生成
46    
47 10. 启动MHA
48     nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
49     
50 11. 查看MHA 状态
51     [root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf
52     app1 (pid:7808) is running(0:PING_OK), master:10.0.0.52

8. 读写分离 Atlas

8.1 Atlas 安装(db03)

1 rpm -ivh Atlas-2.2.1.el6.x86_64.rpm

8.2 配置Atlas

 1 cd  /usr/local/mysql-proxy/conf 
 2  同有一个test.cnf 配置文档,先做个备份一份
 3  [root@db03 /usr/local/mysql-proxy/conf20:59:31]# cat test.cnf
 4 [mysql-proxy]
 5 #带#号的为非必需的配置项目
 6 #管理接口的用户名
 7 admin-username = user
 8 #管理接口的密码
 9 admin-password = pwd
10 #Atlas后端连接的MySQL主库的IP和端口,可设置多项,用逗号分隔
11 proxy-backend-addresses =10.0.0.55:3306
12 #Atlas后端连接的MySQL从库的IP和端口,@后面的数字代表权重,用来作负载均衡,若省略则默认为1,可设置多项,用逗号分隔
13 proxy-read-only-backend-addresses = 10.0.0.52:3306,10.0.0.53:3306
14 #用户名与其对应的加密过的MySQL密码,密码使用PREFIX/bin目录下的加密程序encrypt加密,下行的user1和user2为示例,将其替换为你的MySQL的用户名和加密密码!
15 #[root@db03 /usr/local/mysql-proxy/bin22:25:52]# ./encrypt repl:123456abcd
16 #rdCkrCmLhbFMY6YuvYplDg==
17 #[root@db03 /usr/local/mysql-proxy/bin22:26:06]# ./encrypt 123456abcd
18 #tyElBOCykN3FCr2uMcG5AQ==
19 #[root@db03 /usr/local/mysql-proxy/bin22:26:28]# ./encrypt mha
20 #O2jBXONX098=
21 pwds = repl:tyElBOCykN3FCr2uMcG5AQ==, mha:tyElBOCykN3FCr2uMcG5AQ==
22 #设置Atlas的运行方式,设为true时为守护进程方式,设为false时为前台方式,一般开发调试时设为false,线上运行时设为true,true后面不能有空格。
23 daemon = true
24 #设置Atlas的运行方式,设为true时Atlas会启动两个进程,一个为monitor,一个为worker,monitor在worker意外退出后会自动将其重启,设为false时只有worker,没有monitor,一般开发调试时设为false,线上运行时设为true,true后面不能有空格。
25 keepalive = true
26 #工作线程数,对Atlas的性能有很大影响,可根据情况适当设置
27 event-threads = 8
28 #日志级别,分为message、warning、critical、error、debug五个级别
29 log-level = message
30 #日志存放的路径
31 log-path = /usr/local/mysql-proxy/log
32 #SQL日志的开关,可设置为OFF、ON、REALTIME,OFF代表不记录SQL日志,ON代表记录SQL日志,REALTIME代表记录SQL日志且实时写入磁盘,默认为OFF
33 sql-log = ON
34 #Atlas监听的工作接口IP和端口
35 proxy-address = 0.0.0.0:33060
36 #Atlas监听的管理接口IP和端口
37 admin-address = 0.0.0.0:2345
38 #默认字符集,设置该项后客户端不再需要执行SET NAMES语句
39 charset = utf8

8.3. 启动Atlas

 1 启动:
 2 /usr/local/mysql-proxy/bin/mysql-proxyd test start
 3 重起:
 4 /usr/local/mysql-proxy/bin/mysql-proxyd test restart
 5 [root@db03 /usr/local/mysql-proxy/conf21:44:42]# /usr/local/mysql-proxy/bin/mysql-proxyd test restart
 6 OK: MySQL-Proxy of test is stopped
 7 OK: MySQL-Proxy of test is started
 8 
 9 [root@db03 /usr/local/mysql-proxy/conf21:03:27]# /usr/local/mysql-proxy/bin/mysql-proxyd test start
10 OK: MySQL-Proxy of test is started
11 
12 确认是否启动
13 ps -ef | grep proxy
14 
15 [root@db03 /usr/local/mysql-proxy/conf21:05:02]# ps -ef | grep proxy
16 root       6281      1  0 12:54 ?        00:00:00 /usr/sbin/gssproxy -D
17 root      20514      1  0 21:05 ?        00:00:00 /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/test.cnf
18 root      20515  20514  0 21:05 ?        00:00:00 /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/test.cnf
19 root      20560   7774  0 21:06 pts/1    00:00:00 grep --color=auto proxy
20 
21 查看端口
22 [root@db03 /usr/local/mysql-proxy/conf21:45:03]# netstat -lntup |grep proxy
23 tcp        0      0 0.0.0.0:2345            0.0.0.0:*               LISTEN      21790/mysql-proxy   
24 tcp        0      0 0.0.0.0:33060           0.0.0.0:*               LISTEN      21790/mysql-proxy 

8.4 Atlas 配置说明

  1 Atlas运行需要依赖一个配置文件(test.cnf)。在运行Atlas之前,需要对该文件进行配置。Atlas的安装目录是/usr/local/mysql-proxy,进入安装目录下的conf目录,可以看到已经有一个名为test.cnf的默认配置文件,我们只需要修改里面的某些配置项,不需要从头写一个配置文件。
  2 
  3 配置范例及说明如下:
  4 [mysql-proxy]
  5 
  6 (必备,默认值即可)管理接口的用户名
  7 
  8 admin-username = user
  9 
 10 (必备,默认值即可)管理接口的密码
 11 
 12 admin-password = pwd
 13 
 14 (必备,根据实际情况配置)主库的IP和端口
 15 
 16 proxy-backend-addresses = 192.168.0.12:3306
 17 
 18 (非必备,根据实际情况配置)从库的IP和端口,@后面的数字代表权重,用来作负载均衡,若省略则默认为1,可设置多项,用逗号分隔。如果想让主库也能分担读请求的话,只需要将主库信息加入到下面的配置项中。
 19 
 20 proxy-read-only-backend-addresses = 192.168.0.13:3306,192.168.0.14:3306
 21 
 22 (必备,根据实际情况配置)用户名与其对应的加密过的MySQL密码,密码使用PREFIX/bin目录下的加密程序encrypt加密,用户名与密码之间用冒号分隔。主从数据库上需要先创建该用户并设置密码(用户名和密码在主从数据库上要一致)。比如用户名为myuser,密码为mypwd,执行./encrypt mypwd结果为HJBoxfRsjeI=。如果有多个用户用逗号分隔即可。则设置如下行所示:
 23 
 24 pwds = repl: HJBoxfRsjeI=,mha:HJBoxfRsjeI=
 25 
 26 (必备,默认值即可)Atlas的运行方式,设为true时为守护进程方式,设为false时为前台方式,一般开发调试时设为false,线上运行时设为true
 27 
 28 daemon = true
 29 
 30 (必备,默认值即可)设置Atlas的运行方式,设为true时Atlas会启动两个进程,一个为monitor,一个为worker,monitor在worker意外退出后会自动将其重启,设为false时只有worker,没有monitor,一般开发调试时设为false,线上运行时设为true
 31 
 32 keepalive = true
 33 
 34 (必备,根据实际情况配置)工作线程数,推荐设置成系统的CPU核数的2至4倍
 35 
 36 event-threads = 4
 37 
 38 (必备,默认值即可)日志级别,分为message、warning、critical、error、debug五个级别
 39 
 40 log-level = message
 41 
 42 (必备,默认值即可)日志存放的路径
 43 
 44 log-path = /usr/local/mysql-proxy/log
 45 
 46 (必备,根据实际情况配置)SQL日志的开关,可设置为OFF、ON、REALTIME,OFF代表不记录SQL日志,ON代表记录SQL日志,该模式下日志刷新是基于缓冲区的,当日志填满缓冲区后,才将日志信息刷到磁盘。REALTIME用于调试,代表记录SQL日志且实时写入磁盘,默认为OFF
 47 
 48 sql-log = OFF
 49 
 50 (可选项,可不设置)慢日志输出设置。当设置了该参数时,则日志只输出执行时间超过sql-log-slow(单位:ms)的日志记录。不设置该参数则输出全部日志。
 51 
 52 sql-log-slow = 10
 53 
 54 (可选项,可不设置)关闭不活跃的客户端连接设置。当设置了该参数时,Atlas会主动关闭经过'wait-timeout'时间后一直未活跃的连接。单位:秒
 55 
 56 wait-timeout = 10
 57 
 58 (必备,默认值即可)Atlas监听的工作接口IP和端口
 59 
 60 proxy-address = 0.0.0.0:1234
 61 
 62 (必备,默认值即可)Atlas监听的管理接口IP和端口 admin-address = 0.0.0.0:2345
 63 
 64 (可选项,可不设置)分表设置,此例中person为库名,mt为表名,id为分表字段,3为子表数量,可设置多项,以逗号分隔,若不分表则不需要设置该项,子表需要事先建好,子表名称为表名_数字,数字范围为[0,子表数-1],如本例里,子表名称为mt_0、mt_1、mt_2
 65 
 66 tables = person.mt.id.3
 67 
 68 (可选项,可不设置)默认字符集,若不设置该项,则默认字符集为latin1
 69 
 70 charset = utf8
 71 
 72 (可选项,可不设置)允许连接Atlas的客户端的IP,可以是精确IP,也可以是IP段,以逗号分隔,若不设置该项则允许所有IP连接,否则只允许列表中的IP连接
 73 
 74 client-ips = 127.0.0.1, 192.168.1
 75 
 76 (可选项,极少需要)Atlas前面挂接的LVS的物理网卡的IP(注意不是虚IP),若有LVS且设置了client-ips则此项必须设置,否则可以不设置
 77 
 78 lvs-ips = 192.168.1.1
 79 
 80 2.    重要配置说明
 81 以下几项配置参数对性能和正常运行起到重要作用,需要正确设置。
 82 
 83 (1)线程数
 84 
 85 event-threads项设置,过小无法充分发挥多核CPU的性能,过大造成不必要的线程切换开销,推荐设置为CPU的核数。
 86 
 87 (2)最小空闲连接数(2.x以上版本不需要该项,1.x版本需要)
 88 
 89 min-idle-connections项设置,过小则在高并发下会有报错,过大虽然不报错但在测试时不容易看出读写分离效果,推荐设置为比客户端的并发峰值稍大,详见《配置参数详解》。上面的配置范例是针对Atlas 2.X版本,没有该选项。对于Atlas 1.X版本的配置文件,需要加入该配置选项。
 90 
 91 3. 可选配置说明
 92 以下几项可以设置,也可以使用默认值,区别不大。
 93 
 94 (1)Atlas的工作端口
 95 
 96 proxy-address项配置,例如proxy-address = 0.0.0.0:1234代表客户端应该使用1234这个端口连接Atlas来发送SQL请求。
 97 
 98 (2)Atlas的管理端口
 99 
100 admin-address项配置,例如admin-address = 0.0.0.0:2345代表DBA应该使用2345这个端口连接Atlas来执行运维管理操作。
101 
102 (3)管理接口的用户名和密码
103 
104 admin-username项和admin-password项设置,这两项是用来进入Atlas的管理界面的,与后端连接的MySQL没有关系,所以可以任意设置,不需要MySQL在配置上做任何改动。
105 
106 (4)日志级别
107 
108 以log-level项配置,分为message、warning、critical、error、debug五个级别。
109 
110 (5)日志路径
111 
112 以log-path项配置,如log-path = /usr/local/mysql-proxy/log。
113 
114 [mysql-proxy]
115 
116 管理接口的用户名
117 admin-username = user
118 管理接口的密码
119 admin-password = pwd
120 Atlas后端连接的MySQL主库的IP和端口,可设置多项,用逗号分隔
121 proxy-backend-addresses = 192.168.1.1:3306
122 从库
123 proxy-read-only-backend-addresses = 192.168.1.2:3306@1
124 用户名和密码配置项,需要和主从复制配置的用户名和密码配置一样
125 r1:+jKsgB3YAG8=, user2:GS+tr4TPgqc=
126 后台运行
127 daemon = true keepalive = false
128 工作线程数,对Atlas的性能有很大影响,可根据情况适当设置
129 event-threads = 4
130 日志级别,分为message、warning、critical、error、debug五个级别
131 log-level = error
132 日志存放的路径
133 log-path = ./log
134 SQL日志的开关,可设置为OFF、ON、REALTIME,OFF代表不记录SQL日志,ON代表记录SQL日志,REALTIME代表记录SQL日>志且实时写入磁盘,默认为OFF
135 sql-log = OFF
136 慢日志输出设置。当设置了该参数时,则日志只输出执行时间超过sql-log-slow(单位:ms)的日志>记录。不设置该参数则输出全部日志。
137 sql-log-slow = 1000
138 实例名称,用于同一台机器上多个Atlas实例间的区分
139 instance = web
140 Atlas监听的工作接口IP和端口
141 proxy-address = 0.0.0.0:13470
142 Atlas监听的管理接口IP和端口
143 admin-address = 0.0.0.0:23470
144 分表设置,此例中person为库名,mt为表名,id为分表字段,3为子表数量,可设置多项,以逗号分>隔,若不分表则不需要设置该项
145 tables = person.mt.id.3
146 默认字符集,设置该项后客户端不再需要执行SET NAMES语句
147 charset = utf8
148 允许连接Atlas的客户端的IP,可以是精确IP,也可以是IP段,以逗号分隔,若不设置该项则允许所>有IP连接,否则只允许列表中的IP连接
149 client-ips = 127.0.0.1, 192.168.1
150 Atlas前面挂接的LVS的物理网卡的IP(注意不是虚IP),若有LVS且设置了client-ips则此项必须设置>,否则可以不设置
151 lvs-ips = 192.168.1.1

8.5 测试 读写分离功能(那台机器装了Atlas就用哪台)

 1 db03 操作:用mha 用户登录
 2 
 3 1. mysql -umha -p123456abcd -h 10.0.0.53 -P33060
 4 
 5 [root@db03 /usr/local/mysql-proxy/conf22:42:49]# mysql -umha -p123456abcd -h 10.0.0.53 -P33060
 6 mysql: [Warning] Using a password on the command line interface can be insecure.
 7 Welcome to the MySQL monitor.  Commands end with ; or \g.
 8 Your MySQL connection id is 1
 9 Server version: 5.0.81-log MySQL Community Server (GPL)
10 
11 Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
12 
13 Oracle is a registered trademark of Oracle Corporation and/or its
14 affiliates. Other names may be trademarks of their respective
15 owners.
16 
17 Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
18 
19 报错:
20    登录密码有问题
21 
22 解决问题:
23     Atlas 配置文档中的  pwds = repl:tyElBOCykN3FCr2uMcG5AQ==, mha:tyElBOCykN3FCr2uMcG5AQ==,这个密码必须是加密过的,要不然是无法识别.加密方式为/usr/local/mysql-proxy/bin/下的encrypt 文件加密
24  #用户名与其对应的加密过的MySQL密码,密码使用PREFIX/bin目录下的加密程序encrypt加密,下行的user1和user2为示例,将其替换为你的MySQL的用户名和加密密码!
25 #[root@db03 /usr/local/mysql-proxy/bin22:26:06]# ./encrypt 123456abcd
26 #tyElBOCykN3FCr2uMcG5AQ==
27 #[root@db03 /usr/local/mysql-proxy/bin22:26:28]# ./encrypt mha
28 #O2jBXONX098=
29 pwds = repl:tyElBOCykN3FCr2uMcG5AQ==, mha:tyElBOCykN3FCr2uMcG5AQ==
30 
31 2. 登录后测试读写分离:设置51,53为从库,52为主库
32   读操作:
33      select @@server_id; 可以看到51,53一直轮循.
34    mysql> select @@server_id;
35 +-------------+
36 | @@server_id |
37 +-------------+
38 |          53 |
39 +-------------+
40 1 row in set (0.00 sec)
41 
42 mysql> select @@server_id;
43 +-------------+
44 | @@server_id |
45 +-------------+
46 |          51 |
47 +-------------+
48 1 row in set (0.00 sec)
49 
50 mysql> select @@server_id;
51 +-------------+
52 | @@server_id |
53 +-------------+
54 |          53 |
55 +-------------+
56 1 row in set (0.00 sec)
57 
58 mysql> select @@server_id;
59 +-------------+
60 | @@server_id |
61 +-------------+
62 |          51 |
63 +-------------+
64 1 row in set (0.00 sec)
65 
66 写操作: 只会写在db02 
67   begin;select @@server_id;commit;
68   
69 mysql>  begin;select @@server_id;commit;
70 Query OK, 0 rows affected (0.01 sec)
71 
72 +-------------+
73 | @@server_id |
74 +-------------+
75 |          52 |
76 +-------------+
77 1 row in set (0.00 sec)
78 
79 Query OK, 0 rows affected (0.00 sec)
80 
81 mysql>  begin;select @@server_id;commit;
82 Query OK, 0 rows affected (0.00 sec)
83 
84 +-------------+
85 | @@server_id |
86 +-------------+
87 |          52 |
88 +-------------+
89 1 row in set (0.00 sec)
90 
91 Query OK, 0 rows affected (0.01 sec)   

8.6 Atlas 管理(db03)

a. 登录Atlas
登录Atlas 界面,用户名: user 密码:pwd   端口号:2345
  mysql -uuser -ppwd -h 10.0.0.53 -P2345

[root@db03 /usr/local/mysql-proxy/conf23:01:54]# mysql -uuser -ppwd -h 10.0.0.53 -P2345
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.0.99-agent-admin

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

b. 查看Atlas 所有管理命令
mysql> select * from help;
+----------------------------+---------------------------------------------------------+
| command                    | description                                             |
+----------------------------+---------------------------------------------------------+
| SELECT * FROM help         | shows this help                                         |
| SELECT * FROM backends     | lists the backends and their state                      |
| SET OFFLINE $backend_id    | offline backend server, $backend_id is backend_ndx's id |
| SET ONLINE $backend_id     | online backend server, ...                              |
| ADD MASTER $backend        | example: "add master 127.0.0.1:3306", ...               |
| ADD SLAVE $backend         | example: "add slave 127.0.0.1:3306", ...                |
| REMOVE BACKEND $backend_id | example: "remove backend 1", ...                        |
| SELECT * FROM clients      | lists the clients                                       |
| ADD CLIENT $client         | example: "add client 192.168.1.2", ...                  |
| REMOVE CLIENT $client      | example: "remove client 192.168.1.2", ...               |
| SELECT * FROM pwds         | lists the pwds                                          |
| ADD PWD $pwd               | example: "add pwd user:raw_password", ...               |
| ADD ENPWD $pwd             | example: "add enpwd user:encrypted_password", ...       |
| REMOVE PWD $pwd            | example: "remove pwd user", ...                         |
| SAVE CONFIG                | save the backends to config file                        |
| SELECT VERSION             | display the version of Atlas                            |
+----------------------------+---------------------------------------------------------+

c. 管理命令
    查看所有帮助
    SELECT * FROM help 
  
    查看后端节点状态
    SELECT * FROM backends

    查看数据库节点状态
SELECT * FROM backends
  
  mysql> SELECT * FROM backends;
+-------------+----------------+-------+------+
| backend_ndx | address        | state | type |
+-------------+----------------+-------+------+
|           1 | 10.0.0.55:3306 | up    | rw   |
|           2 | 10.0.0.51:3306 | up    | ro   |
|           3 | 10.0.0.53:3306 | up    | ro   |
+-------------+----------------+-------+------+
3 rows in set (0.00 sec)

    上线或下线节点(对某节点需要做升级或者数据分析等操作时)
SET OFFLINE $backend_id    
SET ONLINE $backend_id

mysql> SET OFFLINE 2;
+-------------+----------------+---------+------+
| backend_ndx | address        | state   | type |
+-------------+----------------+---------+------+
|           2 | 10.0.0.51:3306 | offline | ro   |
+-------------+----------------+---------+------+
1 row in set (0.00 sec)

mysql> SET ONLINE 2;
+-------------+----------------+---------+------+
| backend_ndx | address        | state   | type |
+-------------+----------------+---------+------+
|           2 | 10.0.0.51:3306 | unknown | ro   |
+-------------+----------------+---------+------+
1 row in set (0.00 sec)

mysql> SELECT * FROM backends;
+-------------+----------------+-------+------+
| backend_ndx | address        | state | type |
+-------------+----------------+-------+------+
|           1 | 10.0.0.55:3306 | up    | rw   |
|           2 | 10.0.0.51:3306 | up    | ro   |
|           3 | 10.0.0.53:3306 | up    | ro   |
+-------------+----------------+-------+------+
3 rows in set (0.00 sec)


  添加或移除节点
  ADD MASTER $backend                 example: "add master 127.0.0.1:3306"
  ADD SLAVE $backend                  example: "add slave 127.0.0.1:3306"
  REMOVE BACKEND $backend_id 
  
mysql> REMOVE BACKEND 2;
Empty set (0.00 sec)

mysql> SELECT * FROM backends;
+-------------+----------------+-------+------+
| backend_ndx | address        | state | type |
+-------------+----------------+-------+------+
|           1 | 10.0.0.55:3306 | up    | rw   |
|           2 | 10.0.0.53:3306 | up    | ro   |
+-------------+----------------+-------+------+
2 rows in set (0.00 sec)

mysql> ADD SLAVE 10.0.0.51:3306;
Empty set (0.00 sec)

mysql> SELECT * FROM backends;
+-------------+----------------+-------+------+
| backend_ndx | address        | state | type |
+-------------+----------------+-------+------+
|           1 | 10.0.0.55:3306 | up    | rw   |
|           2 | 10.0.0.53:3306 | up    | ro   |
|           3 | 10.0.0.51:3306 | up    | ro   |
+-------------+----------------+-------+------+
3 rows in set (0.00 sec)
 1 | SELECT * FROM pwds         | lists the pwds                                          |
 2 | ADD PWD $pwd               | example: "add pwd user:raw_password", ...               |
 3 | ADD ENPWD $pwd             | example: "add enpwd user:encrypted_password", ...       |
 4 | REMOVE PWD $pwd 
 5 
 6 mysql>  SELECT * FROM pwds;
 7 +----------+--------------------------+
 8 | username | password                 |
 9 +----------+--------------------------+
10 | repl     | tyElBOCykN3FCr2uMcG5AQ== |
11 | mha      | tyElBOCykN3FCr2uMcG5AQ== |
12 +----------+--------------------------+
13 2 rows in set (0.00 sec)
14 
15 
16 企业应用案列:
17 开发人员申请一个应用用户 app(  select  update  insert)  密码123456abcd,要通过10网段登录
18 1. 在主库中,创建用户
19 grant select ,update,insert on *.* to app@'10.0.0.%' identified by '123456abcd';
20 2. 在atlas中添加生产用户
21  方法一:  明文
22     ADD PWD $pwd     ---需要添加的用户名称,例如:ADD PWD app:123456abcd; 会自动加,并加密
23  
24  方法二:  密文
25     /usr/local/mysql-proxy/bin/encrypt  123456abcd 
26     ADD ENPWD $pwd      例: ADD ENPWD app:tyElBOCykN3FCr2uMcG5AQ== 
27  方法三:
28   /usr/local/mysql-proxy/bin/encrypt  123456abcd      ---->制作加密密码
29   vim test.cnf
30   pwds = repl:3yb5jEku5h4=,mha:O2jBXONX098=,app:/iZxz+0GRoA=
31   /usr/local/mysql-proxy/bin/mysql-proxyd test restart
32   [root@db03 conf]# mysql -uapp -p123456abcd  -h 10.0.0.53 -P 33060

8.7  配置永久生效(以上改的任何配置机器重启都会丢掉,相当于暂存在内存里)

SAVE CONFIG;