MariaDB Galera Cluster集群搭建

MariaDB Galera Cluster是什麼?  


 Galera Cluster是由第三方公司Codership所研發的一套免費開源的集群高可用方案,實現了數據零丟失,官網地址為//galeracluster.com/

其在MySQLInnoDB存儲引擎基礎上打了wrep(虛擬全同步複製),Percona/MariaDB已捆綁在各自的發行版本中

MariaDB Galera Cluster是MariaDB同步多主機集群。它僅支援XtraDB/InnoDB存儲引擎(雖然有對MyISAM實驗支援,具體看wsrep_replicate_myisam系統變數)

MariaDB Galera Cluster主要功能

1.同步複製

2.真正的multi-master,即所有節點可以同時讀寫資料庫

3. 自動的節點成員控制,失效節點自動被清除

4. 新節點加入數據自動複製

5.真正的並行複製,行級

6.用戶可以直接連接集群,使用感受上與MySQL完全一致

 

優勢

  • 因為是多主,所以不存在Slavelag(延遲)
  • 不存在丟失事務的情況
  • 同時具有讀和寫的擴展能力
  • 更小的客戶端延遲
  • 節點間數據是同步的,而Master/Slave模式是非同步的,不同slave上的binlog可能是不同的

 

缺點

  • 加入新節點時開銷大,需要複製完整的數據
  • 不能有效地解決寫擴展的問題,所有的寫操作都發生在所有的節點
  • 有多少個節點,就有多少份重複的數據
  • 由於事務提交需要跨節點通訊,即涉及分散式事務操作,因此寫入會比主從複製慢很多,節點越多,寫入越慢,死鎖和回滾也會更加頻繁
  • 對網路要求比較高,如果網路出現波動不穩定,則可能會造成兩個節點失聯,Galera Cluster集群會發生腦裂,服務將不可用

 

存在局限

  1. 僅支援InnoDB/XtraDB存儲引擎,任何寫入其他引擎的表,包括mysql.*表都不會被複制。但是DDL語句可以複製,但是insert into mysql.user(MyISAM存儲引擎)之類的插入數據不會被複制
  2.  Delete操作不支援沒有主鍵的表,因為沒有主鍵的表在不同的節點上的順序不同,如果執行select … limit …將出現不同的結果集
  3. LOCK/UNLOCK TABLES/FLUSH TABLES WITH READ LOCKS不支援單表所鎖,以及鎖函數GET_LOCK()、RELEASE_LOCK(),但FLUSH TABLES WITH READ LOCK支援全局表鎖
  4.  General Query Log日誌不能保存在表中,如果開始查詢日誌,則只能保存到文件中
  5. 不能有大事務寫入,不能操作wsrep_max_ws_rows=131072(行),且寫入集不能超過wsrep_max_ws_size=1073741824(1GB),否則客戶端直接報錯
  6. 由於集群是樂觀鎖並發控制,因此,在commit階段會有事務衝突發生。如果兩個事務在集群中的不同節點上對同一行寫入並提交,則失敗的節點將回滾,客戶端返回死鎖報錯
  7. XA分散式事務不支援Codership Galera Cluster,在提交時可能會回滾
  8. 整個集群的寫入吞吐量取決於最弱的節點限制,集群要使用同一的配置

 

伺服器環境說明

系統版本 MariaDB版本 主機IP 節點名稱
CentOS 7.9 mariadb-10.3.27 192.168.40.50 MariaDB-Node1
CentOS 7.9 mariadb-10.3.27 192.168.40.60 MariaDB-Node2
CentOS 7.9 mariadb-10.3.27 192.168.40.70 MariaDB-Node3

 

 

 

 

 

 

環境準備

[root@MariaDB-Node1 ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
[root@MariaDB-Node1 ~]# setenforce 0
 setenforce: SELinux is disabled
[root@MariaDB-Node1 ~]# systemctl stop firewalld.service
[root@MariaDB-Node1 ~]# systemctl disable firewalld.service
[root@MariaDB-Node1 ~]# vi /etc/security/limits.conf

* soft nofile 65536  
* hard nofile 65536  

[root@MariaDB-Node1 ~]#sysctl -p

 

安裝MariaDB

參考我之前的部落格安裝,採用二進位安裝

//www.cnblogs.com/mike666/p/13666469.html

三台資料庫主機同時安裝  MariaDB 10.3.27資料庫

備註:從MariaDB 10.1.20 版本開始,Galera Cluster就已經包含在MariaDB包裡面了,不需要單獨部署MariaDB-Galera-server 和galera 包

 

安裝 Galera 依賴包

安裝依賴包
[root@MariaDB-Node1 ~]# yum -y install  lsof cmake  gcc gcc-c++  autoconf automake zlib*  libxml* \
ncurses ncurses-devel libtool  libtool-ltdl-devel* make  bison bison-devel  \
openssl-devel libevent-devel \
libaio libaio-devel pam-devel boost boost-devel valgrind-devel \
libnl-devel popt-devel popt-static bzr nmap-* check-devel 


安裝Scons
[root@MariaDB-Node1 ~]# wget http://prdownloads.sourceforge.net/scons/scons-2.5.0.tar.gz
[root@MariaDB-Node1 ~]# tar zxvf scons-2.5.0.tar.gz -C /usr/local/ && rm -rf scons-2.5.0.tar.gz
[root@MariaDB-Node1 ~]# cd /usr/local/scons-2.5.0/
[root@MariaDB-Node1 /usr/local/scons-2.5.0]# python setup.py install
[root@MariaDB-Node1 /usr/local/scons-2.5.0]# cd 
[root@MariaDB-Node1 ~]# scons -v
SCons by Steven Knight et al.:
    script: v2.5.0.rel_2.5.0:3544:95d356f188a3[MODIFIED], 2016/04/09 14:38:50, by bdbaddog on ubuntu1404-32bit
    engine: v2.5.0.rel_2.5.0:3544:95d356f188a3[MODIFIED], 2016/04/09 14:38:50, by bdbaddog on ubuntu1404-32bit
    engine path: ['/usr/lib/scons-2.5.0/SCons']
Copyright (c) 2001 - 2016 The SCons Foundation
[root@MariaDB-Node1 ~]# 

三台都要安裝這個依賴包,不然啟動集群可能會報錯

 

修改配置文件my.cnf

下面我們開始配置MariaDB Galera Cluster,分別修改MariaDB Galera集群的每個節點上的/etc/my.cnf文件,具體每個節點的內容如下:

192.168.40.50 配置文件修改:

[root@MariaDB-Node1 ~]# vim /etc/my.cnf


[client]
port = 3306
socket = /tmp/mysql.sock

[mysqld]
user = mysql
datadir = /data/mysql/data
pid-file = /data/mysql/mysql.pid
server-id = 50
relay_log =/data/logs/mysql/mysql_relay.log
bind-address = 0.0.0.0

tmpdir=/data/tmp

init_connect ='SET NAMES utf8'
character-set-server = utf8
skip-name-resolve
back_log = 300


max_connections = 8019
max_connect_errors = 1024000
open_files_limit = 65535
table_open_cache = 2048
max_allowed_packet = 50M
max_heap_table_size = 512M
tmp_table_size = 256M


read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
key_buffer_size = 512M
thread_cache_size = 128


query_cache_type = 2
query_cache_size = 128M
query_cache_limit = 5M
thread_stack = 192k
ft_min_word_len = 4


log_bin = /data/mysql/mysql_bin.log
binlog_format = ROW
expire_logs_days = 7

log_error = /data/logs/mysql/mysql_error.log
slow_query_log = 1
long_query_time = 1
log_slow_verbosity=query_plan
slow_query_log_file = /data/logs/mysql/mysql_slow.log
performance_schema = 0


skip-external-locking #跳過外部鎖定,避免external locking

bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 1


default_storage_engine = InnoDB
innodb_file_per_table = 1
innodb_open_files = 500
innodb_buffer_pool_size = 2048M
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 16
innodb_purge_threads = 1
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 8M
innodb_log_file_size = 256M
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 10
innodb_autoinc_lock_mode = 2
innodb_doublewrite = 1
###New ADD
innodb_rollback_on_timeout = 1
innodb_force_recovery=0
interactive_timeout = 28800
wait_timeout = 120


[galera]
wsrep_provider=/usr/local/mysql/lib/galera/libgalera_smm.so                        # Galera library 安裝包自帶有
wsrep_cluster_address="gcomm://192.168.40.50,192.168.40.60,192.168.40.70"          # Galera cluster 集群地址
wsrep_cluster_name='Galera_Cluster'                                                # Galera集群名稱
wsrep_node_address='192.168.40.50'                                                 # 該節點的地址
wsrep_node_name='MariaDB-Node1'                                                    # 該節點的主機名
wsrep_slave_threads = 8                                   #開啟並行複製執行緒,根據CPU核數設置
wsrep_log_conflicts=1
wsrep_on=ON                                               #開啟全同步複製模式
wsrep_provider_options="gcache.size=2G"                   #同步複製緩衝池
wsrep_forced_binlog_format=ROW
wsrep_drupal_282555_workaround=1
wsrep_max_ws_size=2147483647
wsrep_sst_method=rsync                                    #拷貝模式
wsrep_sst_auth=galera:galera                              #galera集群認證用戶:密碼
query_cache_size=0                                        #關閉查詢快取
default_storage_engine=InnoDB                             #Mariadb存儲引擎
#wsrep_dirty_reads=0    #當值是1的時候這個節點是只讀節點
#wsrep_sst_method=[mysqldump|xtrabackup|xtrabackup-v2|rsync|rsync_wan]
#wsrep_sst_method=xtrabackup-v2



[mysqldump]
quick
max_allowed_packet = 16M

[myisamchk]
key_buffer_size = 512M
sort_buffer_size = 8M
read_buffer = 4M
write_buffer = 4M


[root@MariaDB-Node1 ~]#

 

192.168.40.60配置文件修改:

[root@MariaDB-Node2 ~]# vim /etc/my.cnf

[client]
port = 3306
socket = /tmp/mysql.sock

[mysqld]
user = mysql
datadir = /data/mysql/data
pid-file = /data/mysql/mysql.pid
server-id = 60
relay_log =/data/logs/mysql/mysql_relay.log
bind-address = 0.0.0.0

tmpdir=/data/tmp



init_connect ='SET NAMES utf8'
character-set-server = utf8
skip-name-resolve
back_log = 300


max_connections = 8019
max_connect_errors = 1024000
open_files_limit = 65535
table_open_cache = 2048
max_allowed_packet = 50M
max_heap_table_size = 512M
tmp_table_size = 256M


read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
key_buffer_size = 512M
thread_cache_size = 128


query_cache_type = 2
query_cache_size = 128M
query_cache_limit = 5M
thread_stack = 192k
ft_min_word_len = 4


log_bin = /data/mysql/mysql_bin.log
binlog_format = ROW
expire_logs_days = 7

log_error = /data/logs/mysql/mysql_error.log
slow_query_log = 1
long_query_time = 1
log_slow_verbosity=query_plan
slow_query_log_file = /data/logs/mysql/mysql_slow.log
performance_schema = 0


skip-external-locking #跳過外部鎖定,避免external locking

bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 1


default_storage_engine = InnoDB
innodb_file_per_table = 1
innodb_open_files = 500
innodb_buffer_pool_size = 2048M
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 16
innodb_purge_threads = 1
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 8M
innodb_log_file_size = 256M
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 10
innodb_autoinc_lock_mode = 2
innodb_doublewrite = 1

innodb_rollback_on_timeout = 1
innodb_force_recovery=0
interactive_timeout = 28800
wait_timeout = 120



[galera]
wsrep_provider=/usr/local/mysql/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.40.50,192.168.40.60,192.168.40.70"
wsrep_cluster_name='Galera_Cluster'
wsrep_node_address='192.168.40.60'
wsrep_node_name='MariaDB-Node2'
wsrep_slave_threads = 8                                   #開啟並行複製執行緒,根據CPU核數設置
wsrep_log_conflicts=1
wsrep_on=ON                                               #開啟全同步複製模式
wsrep_provider_options="gcache.size=2G"                   #同步複製緩衝池
wsrep_forced_binlog_format=ROW
wsrep_drupal_282555_workaround=1
wsrep_max_ws_size=2147483647
wsrep_sst_method=rsync                                    #拷貝模式
wsrep_sst_auth=galera:galera                              #galera集群認證用戶:密碼
query_cache_size=0                                        #關閉查詢快取
default_storage_engine=InnoDB                             #Mariadb存儲引擎
#wsrep_dirty_reads=0    #當值是1的時候這個節點是只讀節點
#wsrep_sst_method=[mysqldump|xtrabackup|xtrabackup-v2|rsync|rsync_wan]
#wsrep_sst_method=xtrabackup-v2

[mysqldump]
quick
max_allowed_packet = 16M

[myisamchk]
key_buffer_size = 512M
sort_buffer_size = 8M
read_buffer = 4M
write_buffer = 4M



[root@MariaDB-Node2 ~]# 

 

192.168.40.70配置文件修改:

[root@MariaDB-Node3 ~]# vim /etc/my.cnf

[client]
port = 3306
socket = /tmp/mysql.sock

[mysqld]
user = mysql
datadir = /data/mysql/data
pid-file = /data/mysql/mysql.pid
server-id = 70
relay_log =/data/logs/mysql/mysql_relay.log
bind-address = 0.0.0.0

tmpdir=/data/tmp



init_connect ='SET NAMES utf8'
character-set-server = utf8
skip-name-resolve
back_log = 300


max_connections = 8019
max_connect_errors = 1024000
open_files_limit = 65535
table_open_cache = 2048
max_allowed_packet = 50M
max_heap_table_size = 512M
tmp_table_size = 256M


read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
key_buffer_size = 512M
thread_cache_size = 128


query_cache_type = 2
query_cache_size = 128M
query_cache_limit = 5M
thread_stack = 192k
ft_min_word_len = 4


log_bin = /data/mysql/mysql_bin.log
binlog_format = ROW
expire_logs_days = 7

log_error = /data/logs/mysql/mysql_error.log
slow_query_log = 1
long_query_time = 1
log_slow_verbosity=query_plan
slow_query_log_file = /data/logs/mysql/mysql_slow.log
performance_schema = 0


skip-external-locking #跳過外部鎖定,避免external locking

bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 1


default_storage_engine = InnoDB
innodb_file_per_table = 1
innodb_open_files = 500
innodb_buffer_pool_size = 2048M
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 16
innodb_purge_threads = 1
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 8M
innodb_log_file_size = 256M
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 10
innodb_autoinc_lock_mode = 2
innodb_doublewrite = 1

innodb_rollback_on_timeout = 1
innodb_force_recovery=0
interactive_timeout = 28800
wait_timeout = 120




[galera]
wsrep_provider=/usr/local/mysql/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.40.50,192.168.40.60,192.168.40.70"
wsrep_cluster_name='Galera_Cluster'
wsrep_node_address='192.168.40.70'
wsrep_node_name='MariaDB-Node3'
wsrep_slave_threads = 8                                   #開啟並行複製執行緒,根據CPU核數設置
wsrep_log_conflicts=1
wsrep_on=ON                                               #開啟全同步複製模式
wsrep_provider_options="gcache.size=2G"                   #同步複製緩衝池
wsrep_forced_binlog_format=ROW
wsrep_drupal_282555_workaround=1
wsrep_max_ws_size=2147483647
wsrep_sst_method=rsync                                    #拷貝模式
wsrep_sst_auth=galera:galera                              #galera集群認證用戶:密碼
query_cache_size=0                                        #關閉查詢快取
default_storage_engine=InnoDB                             #Mariadb存儲引擎
#wsrep_dirty_reads=0    #當值是1的時候這個節點是只讀節點
#wsrep_sst_method=[mysqldump|xtrabackup|xtrabackup-v2|rsync|rsync_wan]
#wsrep_sst_method=xtrabackup-v2

[mysqldump]
quick
max_allowed_packet = 16M

[myisamchk]
key_buffer_size = 512M
sort_buffer_size = 8M
read_buffer = 4M
write_buffer = 4M


[root@MariaDB-Node3 ~]# 

有幾個參數需要修改的:

server-id = 50
wsrep_node_address='192.168.40.50'
wsrep_node_name='MariaDB-Node1'

修改對應的 server ID 、IP地址 和 主機名,其他的複製第一份到別的機器修改就可以

 

初始化集群

在MariaDB一個節點初始化安裝  192.168.40.50 (其他節點不用操作)

[root@MariaDB-Node1 ~]# /usr/local/mysql/scripts/mysql_install_db --defaults-file=/etc/my.cnf --user=mysql

192.168.40.50 節點上通過bootstrap啟動(第一次啟動一定要使用 –wsrep-new-cluster,再次啟動就不需要)

[root@MariaDB-Node1 ~]#  mysqld_safe --defaults-file=/etc/my.cnf --user=mysql  --wsrep-new-cluster &

 

其他節點操作

初始化節點(192.168.40.50)操作啟動之後,別的主機只要正常啟動就可以了

192.168.40.60操作:

[root@MariaDB-Node2 ~]# /etc/init.d/mysql start

192.168.40.70操作:

[root@MariaDB-Node3 ~]# /etc/init.d/mysql start

192.168.40.50操作:

初始化節點等待其他節點正常啟動之後,再關閉帶參數啟動資料庫,再已正常方式啟動

[root@MariaDB-Node1 ~]# ps -ef|grep mysql
[root@MariaDB-Node1 ~]# kill 8316
[root@MariaDB-Node1 ~]# ps -ef|grep mysql
[root@MariaDB-Node1 ~]# /etc/init.d/mysql start

 

查看集群狀態

[root@MariaDB-Node1 ~]# mysql -uroot -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 17
Server version: 10.3.27-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> 
MariaDB [(none)]> 
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
1 row in set (0.001 sec)

MariaDB [(none)]> show global status like 'ws%';
+-------------------------------+----------------------------------------------------------+
| Variable_name                 | Value                                                    |
+-------------------------------+----------------------------------------------------------+
| wsrep_applier_thread_count    | 8                                                        |
| wsrep_apply_oooe              | 0.000000                                                 |
| wsrep_apply_oool              | 0.000000                                                 |
| wsrep_apply_window            | 0.000000                                                 |
| wsrep_causal_reads            | 0                                                        |
| wsrep_cert_deps_distance      | 0.000000                                                 |
| wsrep_cert_index_size         | 0                                                        |
| wsrep_cert_interval           | 0.000000                                                 |
| wsrep_cluster_conf_id         | 5                                                        |
| wsrep_cluster_size            | 3                                                        |
| wsrep_cluster_state_uuid      | a4d25ee0-515c-11eb-b451-66b7a1b050e2                     |
| wsrep_cluster_status          | Primary                                                  |
| wsrep_cluster_weight          | 3                                                        |
| wsrep_commit_oooe             | 0.000000                                                 |
| wsrep_commit_oool             | 0.000000                                                 |
| wsrep_commit_window           | 0.000000                                                 |
| wsrep_connected               | ON                                                       |
| wsrep_desync_count            | 0                                                        |
| wsrep_evs_delayed             |                                                          |
| wsrep_evs_evict_list          |                                                          |
| wsrep_evs_repl_latency        | 0.000308077/0.00394484/0.0110307/0.00501106/3            |
| wsrep_evs_state               | OPERATIONAL                                              |
| wsrep_flow_control_active     | false                                                    |
| wsrep_flow_control_paused     | 0.000000                                                 |
| wsrep_flow_control_paused_ns  | 0                                                        |
| wsrep_flow_control_recv       | 0                                                        |
| wsrep_flow_control_requested  | false                                                    |
| wsrep_flow_control_sent       | 0                                                        |
| wsrep_gcomm_uuid              | 4fd60274-523f-11eb-b7dd-476f9c9f0398                     |
| wsrep_gmcast_segment          | 0                                                        |
| wsrep_incoming_addresses      | 192.168.40.60:3306,192.168.40.70:3306,192.168.40.50:3306 |
| wsrep_last_committed          | 0                                                        |
| wsrep_local_bf_aborts         | 0                                                        |
| wsrep_local_cached_downto     | 18446744073709551615                                     |
| wsrep_local_cert_failures     | 0                                                        |
| wsrep_local_commits           | 0                                                        |
| wsrep_local_index             | 2                                                        |
| wsrep_local_recv_queue        | 0                                                        |
| wsrep_local_recv_queue_avg    | 0.000000                                                 |
| wsrep_local_recv_queue_max    | 1                                                        |
| wsrep_local_recv_queue_min    | 0                                                        |
| wsrep_local_replays           | 0                                                        |
| wsrep_local_send_queue        | 0                                                        |
| wsrep_local_send_queue_avg    | 0.000000                                                 |
| wsrep_local_send_queue_max    | 1                                                        |
| wsrep_local_send_queue_min    | 0                                                        |
| wsrep_local_state             | 4                                                        |
| wsrep_local_state_comment     | Synced                                                   |
| wsrep_local_state_uuid        | a4d25ee0-515c-11eb-b451-66b7a1b050e2                     |
| wsrep_open_connections        | 0                                                        |
| wsrep_open_transactions       | 0                                                        |
| wsrep_protocol_version        | 9                                                        |
| wsrep_provider_name           | Galera                                                   |
| wsrep_provider_vendor         | Codership Oy <info@codership.com>                        |
| wsrep_provider_version        | 25.3.31(r0ede97d)                                        |
| wsrep_ready                   | ON                                                       |
| wsrep_received                | 2                                                        |
| wsrep_received_bytes          | 306                                                      |
| wsrep_repl_data_bytes         | 0                                                        |
| wsrep_repl_keys               | 0                                                        |
| wsrep_repl_keys_bytes         | 0                                                        |
| wsrep_repl_other_bytes        | 0                                                        |
| wsrep_replicated              | 0                                                        |
| wsrep_replicated_bytes        | 0                                                        |
| wsrep_rollbacker_thread_count | 1                                                        |
| wsrep_thread_count            | 9                                                        |
+-------------------------------+----------------------------------------------------------+
66 rows in set (0.001 sec)

MariaDB [(none)]> 

可以看到集群正常使用有三個節點

注釋

wsrep_cluster_status為Primary,表示節點為主節點,正常讀寫

wsrep_ready為ON,表示集群正常運行

wsrep_cluster_size為 3,表示集群有三個節點

 

遇到問題

如果資料庫 全部關機 了,再次啟動可以會遇到如下報錯

[root@MariaDB-Node1 ~]# cd /data/logs/mysql/
[root@MariaDB-Node1 /data/logs/mysql]# tail -100f mysql_error.log

2021-01-09 13:49:33 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
2021-01-09 13:49:33 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/local/mysql/lib/galera/libgalera_smm.so'
2021-01-09 13:49:33 0 [Note] WSREP: wsrep_load(): Galera 25.3.31(r0ede97d) by Codership Oy <info@codership.com> loaded successfully.
2021-01-09 13:49:33 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2021-01-09 13:49:33 0 [Note] WSREP: Found saved state: a4d25ee0-515c-11eb-b451-66b7a1b050e2:0, safe_to_bootstrap: 0
2021-01-09 13:49:33 0 [Note] WSREP: Passing config to GCS: base_dir = /data/mysql/data/; base_host = 192.168.40.50; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /data/mysql/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /data/mysql/data//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 2G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = 
2021-01-09 13:49:33 0 [Note] WSREP: Assign initial position for certification: 0, protocol version: -1
2021-01-09 13:49:33 0 [Note] WSREP: wsrep_sst_grab()
2021-01-09 13:49:33 0 [Note] WSREP: Start replication
2021-01-09 13:49:33 0 [Note] WSREP: 'wsrep-new-cluster' option used, bootstrapping the cluster
2021-01-09 13:49:33 0 [Note] WSREP: Setting initial position to a4d25ee0-515c-11eb-b451-66b7a1b050e2:0
2021-01-09 13:49:33 0 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
2021-01-09 13:49:33 0 [ERROR] WSREP: wsrep::connect(gcomm://192.168.40.50,192.168.40.60,192.168.40.70) failed: 7
2021-01-09 13:49:33 0 [ERROR] Aborting

解決辦法

[root@MariaDB-Node1 /data/logs/mysql]# cd /data/mysql/data/
[root@MariaDB-Node1 /data/mysql/data]# cat grastate.dat 
# GALERA saved state
version: 2.1
uuid:    a4d25ee0-515c-11eb-b451-66b7a1b050e2
seqno:   -1
safe_to_bootstrap: 0
[root@MariaDB-Node1 /data/mysql/data]# 

把 safe_to_bootstrap: 0 修改為 1

[root@MariaDB-Node1 /data/mysql/data]# vim grastate.dat 
# GALERA saved state
version: 2.1
uuid:    a4d25ee0-515c-11eb-b451-66b7a1b050e2
seqno:   -1
safe_to_bootstrap: 1
[root@MariaDB-Node1 /data/mysql/data]# mysqld_safe --defaults-file=/etc/my.cnf --user=mysql  --wsrep-new-cluster &

把 safe_to_bootstrap: 0 修改為 1,然後再用帶參數 啟動就可以,其他節點使用正常方法啟動

 

mysql galera 集群常見問題處理

一、mysql HA集群在斷網過久或者所有節點都down了之後的恢復有以下的方法:
解決方案1:
1、等三台機器恢復網路通訊後,因為此時的mysql已經異常無法加入集群,因此需要先保證所有的mysql都是down的,再上台執行 /usr/local/mysql/bin/mysqld –wsrep-new-cluster –wsrep-cluster-address=’gcomm://’ & 這條命令,並進入mysql(只有一台機器能夠成功執行,其他機器執行了過幾秒鐘都會異常退出這個進程,我們這裡把能夠成功執行的機器稱為master)
2、此時三台只有一台能夠成功進入mysql(即執行mysql這條命令),在非master上的兩台上一台一台的執行/etc/init.d/mysql start,必須等一台成功了,另一台才能執行

3、在mysql中執行show status like “wsrep%”;

 我們需要保證的第一項為synced,以及第二項必須為三個mysql的ip

4、保證3的結果是想要的說明集群已經恢復了,此時需要將master機器上面的 /usr/local/mysql/bin/mysqld –wsrep-new-cluster –wsrep-cluster-address=’gcomm://’這個進程kill掉,然後再執行 /etc/init.d/mysql start 即可

 

二、mysql HA集群某個節點無故down了並且有一段時間處於down的情況通過以下方式恢復:

1、 若日誌裡面出現以下日誌

 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (eb9f50c6-bc95-11e5-a735-9f48e437dc03): 1 (Operation not permitted)

解決方法:刪除/var/lib/mysql/grastate.dat 文件(若還存在無法同步的情況則刪除galera.cache文件)

2、 若那個down了的節點出現以下日誌

(異常情況集群掛了)[ERROR] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with –tc-heuristic-recover switch to commit or rollback pending transactions

解決方法:

   1、/usr/local/mysql/bin/mysqld start –innodb_force_recovery=6

  •   (SRV_FORCE_IGNORE_CORRUPT):忽略檢查到的corrupt頁
  •   (SRV_FORCE_NO_BACKGROUND):阻止主執行緒的運行,如主執行緒需要執行full purge操作,會導致crash
  •   (SRV_FORCE_NO_TRX_UNDO):不執行事務回滾操作
  •   (SRV_FORCE_NO_IBUF_MERGE):不執行插入緩衝的合併操作
  •   (SRV_FORCE_NO_UNDO_LOG_SCAN):不查看重做日誌,InnoDB存儲引擎會將未提交的事務視為已提交
  •    (SRV_FORCE_NO_LOG_REDO):不執行前滾的操作

如果配置後出現以下情況:
130507 14:14:01  InnoDB: Waiting for the background threads to start
130507 14:14:02  InnoDB: Waiting for the background threads to start
130507 14:14:03  InnoDB: Waiting for the background threads to start
130507 14:14:04  InnoDB: Waiting for the background threads to start
130507 14:14:05  InnoDB: Waiting for the background threads to start
130507 14:14:06  InnoDB: Waiting for the background threads to start
130507 14:14:07  InnoDB: Waiting for the background threads to start
130507 14:14:08  InnoDB: Waiting for the background threads to start
130507 14:14:09  InnoDB: Waiting for the background threads to start

需要在galera.cfg中添加這一下:
如果在設置 innodb_force_recovery >2 的同時innodb_purge_thread = 0
2、mysqld –tc-heuristic-recover=ROLLBACK
3、刪除/var/lib/mysql/ib_logfile*
4、當某個mysql節點掛了,並且存在三個mysql所在host有不同的網段,當mysql想重新加入需要一個sst的過程,sst時會需要知道集群中某個節點的ip因此需要制定參數–wsrep-sst-receive-address否則可能出現同步的ip不在三台機器所共有的網段

三、一個mysql節點若down了一段時間。重新啟動的時候需要一些時間去同步數據,服務的啟動超時時間不夠,導致服務無法啟動,解決方法如下:
The correct way to adjust systemd settings so they don’t get overwritten is to create a directory and file as such:
/etc/systemd/system/mariadb.service.d/timeout.conf
[Service]

TimeoutStartSec=12min

或者直接修改/usr/lib/systemd/system/mariadb.service
[Service]

TimeoutStartSec=12min
這裡的時間最少要大於90s,默認是90s之後執行 systemctl daemon-reload再重啟服務即可

 

四、日誌中出現類似如下錯誤:
160428 13:54:49 [ERROR] Slave SQL: Error ‘Table ‘manage_operations’ already exists’ on query. Default database: ‘horizon’. Query: ‘CREATE TABLE `manage_operations` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `name` varchar(50) NOT NULL,
    `type` varchar(20) NOT NULL,
    `operation` varchar(20) NOT NULL,
    `status` varchar(20) NOT NULL,
    `time` date NOT NULL,
    `operator` varchar(50) NOT NULL
) default charset=utf8′, Error_code: 1050
160428 13:54:49 [Warning] WSREP: RBR event 1 Query apply warning: 1, 28585
160428 13:54:49 [Warning] WSREP: Ignoring error for TO isolated action: source: 752eecd1-0ce0-11e6-83fc-3e0502d0bdd2 version: 3 local: 0 state: APPLYING flags: 65 conn_id: 24053 trx_id: -1 seqnos (l: 28668, g: 28585, s: 28584, d: 28584, ts: 80224119986850)
導致進程異常關閉,
此時可以通過執行mysqladmin flush-tables來刷新表項,這個問題的原因是三個節點之間的表同步存在問題,刷新一下表即可

五、日誌出現以下錯誤:

160820  3:13:41 [ERROR] Error in accept: Too many open files
160820  3:19:42 [ERROR] Error in accept: Too many open files
160827  3:16:24 [ERROR] Error in accept: Too many open files
160831 17:20:52 [ERROR] Error in accept: Too many open files
160831 19:54:29 [ERROR] Error in accept: Too many open files
160831 20:21:53 [ERROR] Error in accept: Too many open files
160901 11:25:57 [ERROR] Error in accept: Too many open files

解決方法

vim /usr/lib/systemd/system/mariadb.service

 [Service]
 LimitNOFILE=10000

默認的mysql的open_file_limits是1024將該項增大,並且修改vim /etc/my.cnf.d/server.cnf該文件的open_files_limit值

systemctl daemon-reload

systemctl restart mysqld

查看mysql的open_file_limits值是否調整成功

cat /proc/$pid/limit

其中$pid為mysql進程的pid看看值是否調整成功,並看看日誌是否還會出現上述錯誤

 

總結一下:

  1. 當所有節點都宕機的情況下,再次啟動節點,Mariadb-Galera集群會出現無法啟動的現象
  2. Mariadb-Galera集群啟動是有順序的,遵循一個原則:最後宕機的最先啟動,因為集群認為這個節點的數據是最新的
  3. 在生產環境下應該避免使用大事務,不建議在高並發寫入場景下使用Galera Cluster架構,會導致集群限流,從而引起整個集群hang住,出現生產故障。針對這種情況可以考慮主從,實現讀寫分離等手段
  4. 對數據一致性要求較高,並且數據寫入不頻繁,資料庫容量也不大(50GB左右),網路狀況良好的情況下,可以考慮使用Galera方案

 

本文分享完畢,感謝支援點贊~~

Tags: