Setting Up a Pacemaker High-Availability Cluster on CentOS 7 (1)
Pacemaker is part of the Red Hat High Availability Add-On. The easiest way to try it out on RHEL is to install it from the Scientific Linux or CentOS repositories.
Environment Preparation
Two nodes
Note: to change the hostname on CentOS:
Temporary (takes effect immediately): hostname <new-hostname>
Permanent (survives a reboot): hostnamectl set-hostname <new-hostname>
node1 - 192.168.29.246
node2 - 192.168.29.247
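The two nodes must be able to resolve each other by name for the pcs commands below to work. If DNS is not available, one option (an assumption here, not covered in the original steps) is to add entries to /etc/hosts on both nodes, using the IPs above:

```
# /etc/hosts on both nodes (assumes no DNS for node1/node2)
192.168.29.246 node1
192.168.29.247 node2
```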
System information
CentOS Linux release 7.8.2003 (Core)
Installation
On all nodes, install Pacemaker and the other packages we will need via yum:
yum install pacemaker pcs resource-agents
Creating the Cluster
On all nodes, start the pcs daemon and enable it at boot:
systemctl start pcsd.service
systemctl enable pcsd.service
Set up the authentication that pcs requires:
# On all nodes
echo 123456 | passwd --stdin hacluster
# On the primary node
pcs cluster auth node1 node2 -u hacluster -p 123456 --force
Create the cluster:
pcs cluster setup --force --name pacemaker1 node1 node2
The output looks like this:
[root@node1 ~]# pcs cluster setup --force --name pacemaker1 node1 node2
Destroying cluster on nodes: node1, node2...
node1: Stopping Cluster (pacemaker)...
node2: Stopping Cluster (pacemaker)...
node1: Successfully destroyed cluster
node2: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'node1', 'node2'
node1: successful distribution of the file 'pacemaker_remote authkey'
node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
node1: Succeeded
node2: Succeeded
Synchronizing pcsd certificates on nodes node1, node2...
node1: Success
node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
node1: Success
node2: Success
Starting the Cluster
Run on any one node:
pcs cluster start --all
Startup output:
[root@node1 ~]# pcs cluster start --all
node1: Starting Cluster (corosync)...
node2: Starting Cluster (corosync)...
node1: Starting Cluster (pacemaker)...
node2: Starting Cluster (pacemaker)...
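Note that pcs cluster start does not enable the cluster services at boot (the daemon status later in this guide shows corosync and pacemaker as active/disabled). If you want the cluster to come back up automatically after a reboot, one option, not part of the original walkthrough, is:

```shell
# Enable corosync and pacemaker at boot on all nodes
pcs cluster enable --all
```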
Cluster Settings
Disable fencing:
pcs property set stonith-enabled=false
Because there are only two nodes, quorum cannot be meaningfully established, so we tell the cluster to ignore loss of quorum:
pcs property set no-quorum-policy=ignore
Force the cluster to move a service away after a single failure:
pcs resource defaults migration-threshold=1
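To confirm that the three settings above took effect, you can list the cluster properties and resource defaults (standard pcs subcommands on the pcs 0.9 shipped with CentOS 7):

```shell
# Show cluster-wide properties (stonith-enabled, no-quorum-policy, ...)
pcs property list
# Show resource defaults (migration-threshold)
pcs resource defaults
```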
Adding a Resource
pcs resource create my_first_svc ocf:heartbeat:Dummy op monitor interval=60s
my_first_svc: the name of the service
ocf:heartbeat:Dummy: the resource agent to use (Dummy is an agent that serves as a template and is useful for guides like this one)
op monitor interval=60s: tells Pacemaker to check the health of this service every minute by calling the agent's monitor action
Check the cluster status:
[root@node1 ~]# pcs status
Cluster name: pacemaker1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun  6 14:57:51 2020
Last change: Sat Jun  6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Full list of resources:

 my_first_svc   (ocf::heartbeat:Dummy): Started node1

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root@node1 ~]# crm_mon -1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun  6 14:58:46 2020
Last change: Sat Jun  6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Active resources:

 my_first_svc   (ocf::heartbeat:Dummy): Started node1
Failure Testing
Manually stop the service to simulate a failure:
crm_resource --resource my_first_svc --force-stop
Check the status again after about a minute; the service has failed over to node2:
[root@node1 ~]# crm_mon -1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun  6 15:29:55 2020
Last change: Sat Jun  6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Active resources:

 my_first_svc   (ocf::heartbeat:Dummy): Started node2

Failed Resource Actions:
* my_first_svc_monitor_60000 on node1 'not running' (7): call=7, status=complete, exitreason='No process state file found',
    last-rc-change='Sat Jun  6 15:29:26 2020', queued=0ms, exec=0ms
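After the failover, the failure record stays in the status output and, with migration-threshold=1, node1 remains ineligible to run the resource. Once the underlying cause is resolved, the failure history can be cleared with a standard pcs command (a follow-up step, not part of the original walkthrough):

```shell
# Clear the failure history for the resource so node1 can host it again
pcs resource cleanup my_first_svc
```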