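Moving infrastructure services to Kubernetes: how to get past three kinds of pitfalls
- October 4, 2019
- Notes

At work I needed to move some infrastructure services that had been deployed on physical or virtual machines into Kubernetes. I hit quite a few pitfalls along the way, so here are the problems I ran into and how I solved each one.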


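1. Problems caused by a faulty network

We had used redis-operator to deploy a Redis cluster in Kubernetes, but as soon as a colleague on the test team put it under load with redis-benchmark, the cluster broke. After a painful investigation, the cause turned out to be a network fault between two of the virtual machines.
Lesson learned: before testing, use iperf3 to check the network between nodes and between pods, as follows: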
    # Start the iperf3 server on one node
    $ iperf3 --server
    # Start the iperf3 client on another node
    $ iperf3 --client ${node_ip} --length 150 --parallel 100 -t 60
    # Deploy the iperf3 server and clients in Kubernetes
    $ kubectl apply -f https://raw.githubusercontent.com/Pharb/kubernetes-iperf3/master/iperf3.yaml
    # Look up the pod IPs of the iperf3 pods
    $ kubectl get pod -o wide
    # Run iperf3 inside one of the iperf3 client pods to test its network path to the iperf3 server pod
    $ kubectl exec -ti iperf3-clients-5b5ll -- iperf3 --client ${iperf3_server_pod_ip} --length 150 --parallel 100 -t 60
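To compare many node or pod pairs, it can help to have iperf3 emit a machine-readable report (its `--json` flag) and summarize it with a script. The following is a minimal sketch, not part of the original setup: the function names and the thresholds in `looks_unhealthy` are hypothetical, and the field names assume the layout iperf3 uses for a TCP test (`end.sum_sent`, `end.sum_received`).

```python
import json


def summarize_iperf3(report: dict) -> dict:
    """Pull total throughput and TCP retransmits out of an `iperf3 --json` report."""
    end = report["end"]
    sent = end["sum_sent"]
    received = end["sum_received"]
    return {
        "sent_mbps": sent["bits_per_second"] / 1e6,
        "received_mbps": received["bits_per_second"] / 1e6,
        # `retransmits` is only present for TCP tests; default to 0 otherwise.
        "retransmits": sent.get("retransmits", 0),
    }


def looks_unhealthy(summary: dict, min_mbps: float = 100.0,
                    max_retransmits: int = 1000) -> bool:
    """Flag a link as suspicious. Both thresholds are assumptions to tune per environment."""
    return (summary["received_mbps"] < min_mbps
            or summary["retransmits"] > max_retransmits)


if __name__ == "__main__":
    # Hand-written sample with made-up numbers, only to show the expected shape.
    sample = json.loads(
        '{"end": {"sum_sent": {"bits_per_second": 950000000, "retransmits": 12},'
        ' "sum_received": {"bits_per_second": 940000000}}}'
    )
    summary = summarize_iperf3(sample)
    print(summary, "unhealthy:", looks_unhealthy(summary))
```

In practice the report would come from appending `--json` to the client commands above and loading the saved output with `json.load`.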


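2. Cluster split-brain caused by an old MySQL version

We had used mysql-operator to deploy a 3-node MySQL InnoDB cluster in Kubernetes. The test team reported that after some time under load, the cluster became unreachable. The logs of the mysql container at the time of the failure showed the following: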
    $ kubectl logs mysql-0 -c mysql
    2018-04-22T15:24:36.984054Z 0 [ERROR] [MY-000000] [InnoDB] InnoDB: Assertion failure: log0write.cc:1799:time_elapsed >= 0
    InnoDB: thread 139746458191616
    InnoDB: We intentionally generate a memory trap.
    InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
    InnoDB: If you get repeated assertion failures or crashes, even
    InnoDB: immediately after the mysqld startup, there may be
    InnoDB: corruption in the InnoDB tablespace. Please refer to
    InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
    InnoDB: about forcing recovery.
    15:24:36 UTC - mysqld got signal 6 ;
    This could be because you hit a bug. It is also possible that this binary
    or one of the libraries it was linked against is corrupt, improperly built,
    or misconfigured. This error can also be caused by malfunctioning hardware.
    Attempting to collect some information that could help diagnose the problem.
    As this is a crash and something is definitely wrong, the information
    collection process might fail.
    key_buffer_size=8388608
    read_buffer_size=131072
    max_used_connections=1
    max_threads=151
    thread_count=2
    connection_count=1
    It is possible that mysqld could use up to
    key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 67841 K bytes of memory
    Hope that's ok; if not, decrease some variables in the equation.
    Thread pointer: 0x0
    Attempting backtrace. You can use the following information to find out
    where mysqld died. If you see no messages after this, something went
    terribly wrong...
    stack_bottom = 0 thread_stack 0x46000
    /home/mdcallag/b/orig811/bin/mysqld(my_print_stacktrace(unsigned char*, unsigned long)+0x3d) [0x1b1461d]
    /home/mdcallag/b/orig811/bin/mysqld(handle_fatal_signal+0x4c1) [0xd58441]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390) [0x7f1cae617390]
    /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7f1cacb0a428]
    /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7f1cacb0c02a]
    /home/mdcallag/b/orig811/bin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0xea) [0xb25e13]
    /home/mdcallag/b/orig811/bin/mysqld() [0x1ce5408]
    /home/mdcallag/b/orig811/bin/mysqld(log_flusher(log_t*)+0x2fb) [0x1ce5fab]
    /home/mdcallag/b/orig811/bin/mysqld(std::thread::_Impl<std::_Bind_simple<Runnable (void (*)(log_t*), log_t*)> >::_M_run()+0x68) [0x1ccbe18]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7f1cad476c80]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f1cae60d6ba]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f1cacbdc41d]
    The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
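A search of the MySQL bug tracker turned up exactly this bug (https://bugs.mysql.com/bug.php?id=90670). According to the official note, the bug is present in versions before 8.0.12, and upgrading to 8.0.13 or later is recommended.
Fortunately, mysql-operator supports installing a specific MySQL version, so pinning the version to 8.0.16, the latest stable release at the time, solved the problem.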
    apiVersion: mysql.oracle.com/v1alpha1
    kind: Cluster
    metadata:
      name: mysql
    spec:
      members: 3
      version: "8.0.16"
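When several clusters are managed this way, the version floor can be checked before a manifest is applied. Below is a small sketch under stated assumptions: the helper names are hypothetical, the 8.0.13 threshold is taken from the recommendation quoted above, and version strings are assumed to be plain dotted integers such as "8.0.16".

```python
# Minimum version recommended above for avoiding the log0write.cc assertion.
MIN_RECOMMENDED = (8, 0, 13)


def parse_version(version: str) -> tuple:
    """Turn '8.0.16' into (8, 0, 16) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def is_recommended(version: str) -> bool:
    """Return True if the version is at or above the recommended minimum."""
    return parse_version(version) >= MIN_RECOMMENDED


if __name__ == "__main__":
    for candidate in ("8.0.11", "8.0.13", "8.0.16"):
        print(candidate, is_recommended(candidate))
```

Comparing integer tuples rather than raw strings matters here: as text, "8.0.9" would sort after "8.0.13".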


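3. Cluster failure caused by overusing ephemeral-storage

The MySQL InnoDB cluster solution relies on MySQL Group Replication to synchronize data between the primary and secondary nodes. This synchronization is fundamentally based on MySQL's binlog, so a stress test generates a large volume of binlogs in a short time, and those binlogs take up a lot of storage.
If you create the MySQL cluster with mysql-operator and do not declare a volumeClaimTemplate in the YAML file, the pods use ephemeral-storage. Kubernetes does provide an official way to set ephemeral-storage quotas:
(https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#requests-and-limits-setting-for-local-ephemeral-storage)
but mysql-operator itself exposes no parameter that lets users specify such a quota.
As a result, after the MySQL cluster has been under load for a long time, the accumulated binlogs overuse ephemeral-storage, and Kubernetes eventually kills the pod to keep the container platform stable. Once 2 of the 3 pods in the MySQL cluster have been killed, the whole cluster can no longer recover automatically.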
    Events:
      Type     Reason   Age  From                 Message
      ----     ------   ---- ----                 -------
      Warning  Evicted  39m  kubelet, 9.77.34.64  The node was low on resource: ephemeral-storage. Container mysql was using 256Ki, which exceeds its request of 0. Container mysql-agent was using 11572Ki, which exceeds its request of 0.
      Normal   Killing  39m  kubelet, 9.77.34.64  Killing container with id docker://mysql-agent:Need to kill Pod
      Normal   Killing  39m  kubelet, 9.77.34.64  Killing container with id docker://mysql:Need to kill Pod
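When many pods are evicted, the per-container usage figures in these messages are easier to compare once extracted. The sketch below parses the message format shown in the event above; the regular expression and function name are hypothetical and only assume that exact wording ("Container NAME was using N, which exceeds its request of M").

```python
import re

# Matches fragments such as:
#   "Container mysql was using 256Ki, which exceeds its request of 0."
_USAGE = re.compile(
    r"Container (?P<name>\S+) was using (?P<used>\d+)(?P<unit>Ki|Mi|Gi)?, "
    r"which exceeds its request of (?P<request>[0-9A-Za-z]+)"
)

# Binary unit multipliers; a missing unit means plain bytes.
_UNIT = {None: 1, "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_eviction_message(message: str) -> dict:
    """Map each container named in an eviction message to its reported usage in bytes."""
    return {
        match["name"]: int(match["used"]) * _UNIT[match["unit"]]
        for match in _USAGE.finditer(message)
    }


if __name__ == "__main__":
    msg = (
        "The node was low on resource: ephemeral-storage. "
        "Container mysql was using 256Ki, which exceeds its request of 0. "
        "Container mysql-agent was using 11572Ki, which exceeds its request of 0."
    )
    print(parse_eviction_message(msg))
```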
The fix is simple. First, declare a volumeClaimTemplate in the YAML file, following this example: (https://github.com/oracle/mysql-operator/blob/master/examples/cluster/cluster-with-data-volume-and-backup-volume.yaml)
In addition, you can set the binlog_expire_logs_seconds parameter in the MySQL configuration file (https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_binlog_expire_logs_seconds)
so that binlogs are deleted quickly during stress tests, as follows:
    apiVersion: v1
    data:
      my.cnf: |
        [mysqld]
        default_authentication_plugin=mysql_native_password
        skip-name-resolve
        binlog_expire_logs_seconds=300
    kind: ConfigMap
    metadata:
      name: mycnf
    ---
    apiVersion: mysql.oracle.com/v1alpha1
    kind: Cluster
    metadata:
      name: mysql
    spec:
      members: 3
      version: "8.0.16"
      config:
        name: mycnf
      volumeClaimTemplate:
        metadata:
          name: data
        spec:
          storageClassName: default
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 1Gi
      backupVolumeClaimTemplate:
        metadata:
          name: backup-data
        spec:
          storageClassName: default
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 1Gi
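A rough sizing check can show whether a given retention window and volume size fit together. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes a constant binlog write rate, and it adds the size of the active binlog file because MySQL removes expired binlog files at startup and when the binary log is flushed, so the file currently being written is not purged. The function names, the headroom factor, and all the numbers in the usage example are hypothetical.

```python
MIB = 1024 ** 2


def estimate_binlog_bytes(write_rate_bytes_per_s: int, expire_seconds: int,
                          active_file_bytes: int) -> int:
    """Approximate steady-state binlog footprint.

    Retained data is roughly rate * retention window; the active file
    (bounded by max_binlog_size) is added on top.
    """
    return write_rate_bytes_per_s * expire_seconds + active_file_bytes


def fits(volume_bytes: int, estimate_bytes: int, headroom: float = 0.5) -> bool:
    """True if the estimate uses no more than `headroom` of the volume.

    The remainder is left for table data; 0.5 is an arbitrary assumption.
    """
    return estimate_bytes <= volume_bytes * headroom


if __name__ == "__main__":
    # Hypothetical load: 2 MiB/s of binlog, 300 s retention, 100 MiB active file.
    estimate = estimate_binlog_bytes(2 * MIB, 300, 100 * MIB)
    print("estimated binlog footprint (MiB):", estimate // MIB)
    print("fits in a 1Gi volume with 50% headroom:", fits(1024 * MIB, estimate))
```

If the estimate does not fit, the options are a shorter retention window, a smaller max_binlog_size, or a larger storage request in the volumeClaimTemplate.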
With these fixes in place, the Redis cluster and the MySQL cluster finally run stably in Kubernetes.
Reference links
redis-operator:
https://github.com/spotahome/redis-operator
redis-benchmark:
https://redis.io/topics/benchmarks
iperf3:
https://iperf.fr/
mysql-operator:
https://github.com/oracle/mysql-operator
MySQL Group Replication:
https://dev.mysql.com/doc/refman/8.0/en/group-replication.html

