Setting Up a Pacemaker High-Availability Cluster on CentOS 7 (1)
Pacemaker is part of the Red Hat High Availability Add-On. The easiest way to try it out on RHEL is to install it from the Scientific Linux or CentOS repositories.
Environment Preparation
Two nodes
Note: to change the hostname on CentOS:
Temporary (takes effect immediately): hostname <new-hostname>
Permanent (survives a reboot): hostnamectl set-hostname <new-hostname>
node1 - 192.168.29.246
node2 - 192.168.29.247
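The two nodes must be able to resolve each other by name for the pcs commands below to work. If DNS is not available, one option (an assumption here, not covered in the original steps) is to add entries to /etc/hosts on both nodes, using the IPs above:

```
# /etc/hosts on both nodes (assumes no DNS for node1/node2)
192.168.29.246 node1
192.168.29.247 node2
```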
System information
CentOS Linux release 7.8.2003 (Core)
Installation
On all nodes, install Pacemaker and the other packages we will need via yum:
yum install pacemaker pcs resource-agents
Creating the Cluster
On all nodes, start the pcs daemon and enable it at boot:
systemctl start pcsd.service
systemctl enable pcsd.service
Set up the authentication that pcs requires:
# On all nodes
echo 123456 | passwd --stdin hacluster
# On the primary node
pcs cluster auth node1 node2 -u hacluster -p 123456 --force
Create the cluster:
pcs cluster setup --force --name pacemaker1 node1 node2
The output looks like this:
[root@node1 ~]# pcs cluster setup --force --name pacemaker1 node1 node2
Destroying cluster on nodes: node1, node2...
node1: Stopping Cluster (pacemaker)...
node2: Stopping Cluster (pacemaker)...
node1: Successfully destroyed cluster
node2: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'node1', 'node2'
node1: successful distribution of the file 'pacemaker_remote authkey'
node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
node1: Succeeded
node2: Succeeded
Synchronizing pcsd certificates on nodes node1, node2...
node1: Success
node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
node1: Success
node2: Success
Starting the Cluster
Run on any one node:
pcs cluster start --all
Startup output:
[root@node1 ~]# pcs cluster start --all
node1: Starting Cluster (corosync)...
node2: Starting Cluster (corosync)...
node1: Starting Cluster (pacemaker)...
node2: Starting Cluster (pacemaker)...
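Note that pcs cluster start does not enable the cluster services at boot (the daemon status later in this guide shows corosync and pacemaker as active/disabled). If you want the cluster to come back up automatically after a reboot, one option, not part of the original walkthrough, is:

```shell
# Enable corosync and pacemaker at boot on all nodes
pcs cluster enable --all
```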
Cluster Settings
Disable fencing:
pcs property set stonith-enabled=false
Because there are only two nodes, quorum cannot be meaningfully established, so we tell the cluster to ignore loss of quorum:
pcs property set no-quorum-policy=ignore
Force the cluster to move a service away after a single failure:
pcs resource defaults migration-threshold=1
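To confirm that the three settings above took effect, you can list the cluster properties and resource defaults (standard pcs subcommands on the pcs 0.9 shipped with CentOS 7):

```shell
# Show cluster-wide properties (stonith-enabled, no-quorum-policy, ...)
pcs property list
# Show resource defaults (migration-threshold)
pcs resource defaults
```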
Adding a Resource
pcs resource create my_first_svc ocf:heartbeat:Dummy op monitor interval=60s
my_first_svc: the name of the service
ocf:heartbeat:Dummy: the resource agent to use (Dummy is an agent that serves as a template and is useful for guides like this one)
op monitor interval=60s: tells Pacemaker to check the health of this service every minute by calling the agent's monitor action
Check the cluster status:
[root@node1 ~]# pcs status
Cluster name: pacemaker1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun  6 14:57:51 2020
Last change: Sat Jun  6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Full list of resources:

 my_first_svc   (ocf::heartbeat:Dummy): Started node1

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root@node1 ~]# crm_mon -1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun  6 14:58:46 2020
Last change: Sat Jun  6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Active resources:

 my_first_svc   (ocf::heartbeat:Dummy): Started node1
Failure Testing
Manually stop the service to simulate a failure:
crm_resource --resource my_first_svc --force-stop
Check the status again after about a minute; the service has failed over to node2:
[root@node1 ~]# crm_mon -1
Stack: corosync
Current DC: node1 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Sat Jun  6 15:29:55 2020
Last change: Sat Jun  6 14:57:25 2020 by root via cibadmin on node1

2 nodes configured
1 resource configured

Online: [ node1 node2 ]

Active resources:

 my_first_svc   (ocf::heartbeat:Dummy): Started node2

Failed Resource Actions:
* my_first_svc_monitor_60000 on node1 'not running' (7): call=7, status=complete, exitreason='No process state file found',
    last-rc-change='Sat Jun  6 15:29:26 2020', queued=0ms, exec=0ms
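After the failover, the failure record stays in the status output and, with migration-threshold=1, node1 remains ineligible to run the resource. Once the underlying cause is resolved, the failure history can be cleared with a standard pcs command (a follow-up step, not part of the original walkthrough):

```shell
# Clear the failure history for the resource so node1 can host it again
pcs resource cleanup my_first_svc
```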