利用高可用虛擬IP構建PostgreSQL集群實踐
- 2019 年 12 月 6 日
- 筆記
PostgreSQL是一個非常流行的使用面非常廣的關係資料庫,有很多種構建Postgres HA集群的方式,例如PAF,pgool等,以下將以CentOS7系統和PostgreSQL9.6版本為例,結合高可用虛擬IP(以下簡稱vip),patroni,haproxy,etcd等組件介紹一種pg HA方案。
整個集群架構如下:

其中三個pg節點一主兩從,使用patroni管理pg節點狀態,使用etcd集群存儲patroni元數據,每個節點通過pgbouncer管理本機的pg連接池,每個節點的haproxy配置一致,都會代理三個pg實例,客戶端請求通過keepalived管理的vip來訪問pg,通過haproxy與patroni rest api配合,可以提供以下兩個埠的服務:
- 5000埠,僅代理master實例,可提供讀和寫
- 5001埠,僅代理replica實例, 提供只讀
業務可以根據實際需求來選擇讀寫埠,例如只讀的業務場景可以走5001埠獲取更高的並發能力。
一 實驗環境
- 3個cvm虛擬機節點
- 一個高可用虛擬IP,172.27.16.47,雲上申請地址:https://console.cloud.tencent.com/vpc/havip
二 安裝postgresql等組件
本文使用postgres9.6版本,三個節點均安裝,可通過騰訊雲yum鏡像倉庫安裝:
rpm -ivh http://mirrors.cloud.tencent.com/postgresql/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm yum -y install postgresql96-server
也可手工下載rpm安裝,rpm下載地址:http://mirrors.cloud.tencent.com/postgresql/repos/yum/9.6/redhat/rhel-7-x86_64/
安裝patroni,可從第三方獲取現成的rpm:https://github.com/cybertec-postgresql/patroni-packaging/releases
安裝pgbouncer,下載地址:http://mirrors.cloud.tencent.com/postgresql/repos/yum/9.6/redhat/rhel-7-x86_64/pgbouncer-1.12.0-1.rhel7.x86_64.rpm
安裝haproxy,可通過以下倉庫自行build rpm,https://github.com/itxx00/haproxy
安裝keepalived,系統自帶os repo裡面有,直接yum install keepalived即可。
安裝etcd,etcd rpm在系統自帶extras repo裡面有,直接yum install etcd即可。
三 服務和組件配置
etcd配置:/etc/etcd/etcd.conf
ETCD_NAME=node1 #每個節點不一樣 ETCD_DATA_DIR="/data/etcd" ETCD_LISTEN_PEER_URLS="http://172.27.16.22:2380" ETCD_LISTEN_CLIENT_URLS="http://172.27.16.22:2379,http://127.0.0.1:2379" ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.27.16.22:2380" ETCD_ADVERTISE_CLIENT_URLS="http://172.27.16.22:2379" ETCD_INITIAL_CLUSTER_STATE="new" ETCD_INITIAL_CLUSTER_TOKEN="change-your-token" ETCD_INITIAL_CLUSTER="node1=http://172.27.16.22:2380,node2=http://172.27.16.23:2380,node3=http://172.27.16.28:2380" ETCD_ENABLE_V2="true"
etcd啟動
systemctl start etcd
patroni配置:/etc/patroni/patroni.yml
scope: pgcluster name: node1 namespace: /service/ log: level: INFO max_queue_size: 1000 dir: /var/log/postgresql file_num: 7 file_size: 10485760 restapi: listen: 172.27.16.22:8008 connect_address: 172.27.16.22:8008 authentication: username: root password: yourpasswordhere etcd: hosts: 172.27.16.22:2379,172.27.16.23:2379,172.27.16.28:2379 username: root password: yourpasswordhere bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 master_start_timeout: 300 synchronous_mode: false synchronous_mode_strict: false postgresql: use_pg_rewind: true use_slots: true parameters: max_connections: 100 superuser_reserved_connections: 5 max_locks_per_transaction: 64 max_prepared_transactions: 0 huge_pages: try shared_buffers: 512MB work_mem: 128MB maintenance_work_mem: 256MB effective_cache_size: 4GB checkpoint_timeout: 15min checkpoint_completion_target: 0.9 min_wal_size: 2GB max_wal_size: 4GB wal_buffers: 32MB default_statistics_target: 1000 seq_page_cost: 1 random_page_cost: 4 effective_io_concurrency: 2 synchronous_commit: on autovacuum: on autovacuum_max_workers: 5 autovacuum_vacuum_scale_factor: 0.01 autovacuum_analyze_scale_factor: 0.02 autovacuum_vacuum_cost_limit: 200 autovacuum_vacuum_cost_delay: 20 autovacuum_naptime: 1s max_files_per_process: 4096 log_rotation_age: 1d log_rotation_size: 0 log_line_prefix: '%t [%p-%l] %r %q%u@%d ' log_filename: 'postgresql-%a.log' log_directory: /var/log/postgresql initdb: - encoding: UTF8 - locale: en_US.UTF-8 - data-checksums pg_hba: - host replication replicator 127.0.0.1/32 md5 - host all all 0.0.0.0/0 md5 postgresql: listen: 172.27.16.22,127.0.0.1:5432 connect_address: 172.27.16.22:5432 use_unix_socket: true data_dir: /var/lib/pgsql/9.6/data bin_dir: /usr/pgsql-9.6/bin config_dir: /var/lib/pgsql/9.6/data pgpass: /var/lib/pgsql/.pgpass authentication: replication: username: replicator password: yourpasswordhere superuser: username: postgres password: yourpasswordhere parameters: unix_socket_directories: /var/run/postgresql stats_temp_directory: /var/lib/pgsql_stats_tmp create_replica_methods: - basebackup basebackup: max-rate: '100M' tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
patroni啟動
systemctl start patroni
pgbouncer配置:/etc/pgbouncer/pgbouncer.ini
[databases] postgres = host=127.0.0.1 port=5432 dbname=postgres testdb = host=127.0.0.1 port=5432 dbname=testdb pool_size=20 pool_mode=transaction [pgbouncer] logfile = /var/log/pgbouncer/pgbouncer.log pidfile = /var/run/pgbouncer/pgbouncer.pid listen_addr = 172.27.16.22 listen_port = 6432 unix_socket_dir = /var/run/postgresql auth_type = md5 auth_file = /etc/pgbouncer/userlist.txt admin_users = postgres ignore_startup_parameters = extra_float_digits,geqo pool_mode = session server_reset_query = DISCARD ALL max_client_conn = 10000 default_pool_size = 10 reserve_pool_size = 1 reserve_pool_timeout = 1 max_db_connections = 1000 pkt_buf = 8192 client_tls_sslmode = disable server_tls_sslmode = disable # Documentation https://pgbouncer.github.io/config.html
pgbouncer啟動
systemctl start pgbouncer
haproxy配置:/etc/haproxy/haproxy.cfg
global maxconn 100000 log /dev/log local0 log /dev/log local1 notice chroot /var/lib/haproxy stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners stats timeout 30s user haproxy group haproxy daemon defaults mode tcp log global retries 2 timeout queue 5s timeout connect 5s timeout client 60m timeout server 60m timeout check 15s listen stats mode http bind 172.27.16.22:7000 stats enable stats uri /stats listen master bind 172.27.16.47:5000 mode tcp maxconn 2000 option tcplog option httpchk OPTIONS /master http-check expect status 200 default-server inter 3s fastinter 1s fall 3 rise 4 on-marked-down shutdown-sessions server db01 172.27.16.22:6432 check port 8008 server db02 172.27.16.23:6432 check port 8008 server db03 172.27.16.28:6432 check port 8008 listen replicas bind 172.27.16.47:5001 mode tcp maxconn 6000 option tcplog option httpchk OPTIONS /replica balance roundrobin http-check expect status 200 default-server inter 3s fastinter 1s fall 3 rise 2 on-marked-down shutdown-sessions server db01 172.27.16.22:6432 check port 8008 server db02 172.27.16.23:6432 check port 8008 server db03 172.27.16.28:6432 check port 8008
haproxy啟動
systemctl start haproxy
keepalived配置:/etc/keepalived/keepalived.conf
vrrp_script pg_haproxy_check { script "/usr/libexec/keepalived/haproxy_check.sh" interval 2 weight 2 } vrrp_instance pgcluster_1 { interface eth0 virtual_router_id 61 nopreempt priority 100 advert_int 2 state BACKUP # 如果是雲上vip,如未開通組播的話可使用單播,需設置unicast_peer unicast_src_ip 172.27.16.22 unicast_peer { 172.27.16.23 172.27.16.28 } virtual_ipaddress { 172.27.16.47 } garp_master_delay 1 garp_master_refresh 5 track_interface { eth0 } track_script { pg_haproxy_check } authentication { auth_type PASS auth_pass yourpasswordhere } }
haproxy_check.sh內容參考:
#!/bin/bash /bin/kill -0 `cat /var/run/haproxy.pid`
keepalived啟動
systemctl start keepalived
以上是一個節點的配置示例,另外兩個節點可根據以上配置略作調整即可。當所有節點組件均完成配置並成功啟動後,即可通過vip:5000埠訪問postgres服務了。我們可以通過patroni命令行管理工具對pg實例進行狀態查看/切換等操作,示例如下:
執行sql查詢:

查看節點狀態:

可訪問haproxy.cfg中定義的stats埠查看haproxy實時狀態:

參考文檔鏈接:
- https://github.com/vitabaks/postgresql_cluster/blob/master/README.md
- https://cloud.tencent.com/document/product/215/20186


