Exploring Kafka High Availability

It is well known that a Kafka topic's --replication-factor and partition count can be used to keep the service highly available.

 

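Discovering the Problem

During recent operations work, however, on a 3-broker Kafka cluster where both the replication factor and the partition count were 3, a single broker going down left the whole cluster unusable: consumers could not consume any messages. Why?

After a lot of digging, the culprit turned out to be Kafka's own internal topic, __consumer_offsets.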

 

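Problem Analysis

In newer versions of Kafka, consumer offsets are stored in an internal Kafka topic named __consumer_offsets. Because Kafka creates this topic automatically, its replication factor is taken from the value set in the broker configuration file, which is commonly 1.

Describing the topic typically shows something like:

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets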
Topic:__consumer_offsets  PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
  Topic: __consumer_offsets Partition: 0  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 1  Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 2  Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 3  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 4  Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 5  Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 6  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 7  Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 8  Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 9  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 10 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 11 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 12 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 14 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 15 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 16 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 17 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 18 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 19 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 20 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 21 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 22 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 23 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 24 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 25 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 26 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 27 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 28 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 29 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 30 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 31 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 32 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 33 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 34 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 35 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 36 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 37 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 38 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 39 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 40 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 41 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 42 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 43 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 44 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 45 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 46 Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 47 Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 48 Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 49 Leader: 1 Replicas: 1 Isr: 1

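50 partitions, each with a single replica.

The 50 partitions are spread across the 3 brokers. If one broker goes down, every partition hosted on it becomes unavailable, so consumer groups whose offsets are stored in those partitions can no longer look up their offsets and therefore cannot consume normally.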
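
To see which offsets partitions a single broker failure would take offline, the describe output can be checked mechanically. The following is a minimal Python sketch rather than part of the original setup: the names `parse_describe` and `partitions_lost` are hypothetical, and the sample rows are the first four rows of the listing above.

```python
import re

# Sample rows in the format printed by kafka-topics.sh --describe
# (first four rows of the listing above).
DESCRIBE_OUTPUT = """\
  Topic: __consumer_offsets Partition: 0  Leader: 3 Replicas: 3 Isr: 3
  Topic: __consumer_offsets Partition: 1  Leader: 1 Replicas: 1 Isr: 1
  Topic: __consumer_offsets Partition: 2  Leader: 2 Replicas: 2 Isr: 2
  Topic: __consumer_offsets Partition: 3  Leader: 3 Replicas: 3 Isr: 3
"""

# Captures the partition number and the comma-separated replica list.
ROW = re.compile(r"Partition:\s*(\d+)\s+Leader:\s*\S+\s+Replicas:\s*([\d,]+)")


def parse_describe(text):
    """Return {partition: [replica broker ids]} parsed from describe output."""
    assignment = {}
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            replicas = [int(b) for b in match.group(2).split(",")]
            assignment[int(match.group(1))] = replicas
    return assignment


def partitions_lost(assignment, failed_broker):
    """Partitions whose every replica lives on the failed broker."""
    return sorted(
        partition
        for partition, replicas in assignment.items()
        if set(replicas) <= {failed_broker}
    )


assignment = parse_describe(DESCRIBE_OUTPUT)
print(partitions_lost(assignment, failed_broker=3))
```

With a replication factor of 1, every partition led by the failed broker ends up in this list; with three replicas on three distinct brokers the list is empty for any single failure.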

 

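Solution

Method 1

Because Kafka is already up and serving traffic, the replication factor can only be changed dynamically:

First, prepare a JSON file describing the planned replica assignment

vim /data/vfan/consumer.json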

{
    "version": 1, 
    "partitions": [
        {
            "topic": "__consumer_offsets", 
            "partition": 0, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 1, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 2, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 3, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 4, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 5, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 6, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 7, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 8, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 9, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 10, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 11, 
            "replicas": [
                3, 
                1, 
                2
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 12, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 13, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 14, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 15, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 16, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 17, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 18, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 19, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 20, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 21, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 22, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 23, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 24, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 25, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 26, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 27, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 28, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 29, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 30, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 31, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 32, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 33, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 34, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 35, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 36, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 37, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 38, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 39, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 40, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 41, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 42, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 43, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 44, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 45, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 46, 
            "replicas": [
                3, 
                2, 
                1
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 47, 
            "replicas": [
                1, 
                2, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 48, 
            "replicas": [
                2, 
                1, 
                3
            ]
        },
        {
            "topic": "__consumer_offsets", 
            "partition": 49, 
            "replicas": [
                3, 
                2, 
                1
            ]
        }
    ]
}
The broker IDs in each replicas list can be customized, but the same broker must not appear twice within one list.
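
Writing 50 entries by hand is error-prone, so a plan in the same format can be generated instead. This is a minimal Python sketch under stated assumptions: `build_reassignment` is a hypothetical helper, and it rotates the broker list per partition, which yields a slightly different (but equally valid) layout than the hand-written file above.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids):
    """Build a reassignment plan that puts one replica of every partition on each broker.

    The broker list is rotated per partition so that the first entry of the
    replica list (the preferred leader) is spread evenly across brokers.
    """
    if len(set(broker_ids)) != len(broker_ids):
        raise ValueError("broker ids must be unique within a replica list")
    n = len(broker_ids)
    partitions = []
    for p in range(num_partitions):
        replicas = [broker_ids[(p + i) % n] for i in range(n)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Assumed values matching this article: 50 partitions, broker ids 1, 2, 3.
plan = build_reassignment("__consumer_offsets", 50, [1, 2, 3])
print(json.dumps(plan, indent=4))
```

Redirecting the printed output to a file gives a plan that can be passed to --reassignment-json-file.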

 

Execute the change

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --execute

 

Verify that the change has completed

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --verify

 

Check the result after the change

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets  PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
  Topic: __consumer_offsets Partition: 0  Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 1  Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 2  Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 3  Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 4  Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 5  Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 6  Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 7  Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 8  Leader: 3 Replicas: 3,2,1 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 9  Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 10 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 11 Leader: 3 Replicas: 3,1,2 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 12 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 14 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 15 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 16 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 17 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 18 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 19 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
  Topic: __consumer_offsets Partition: 20 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 21 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 22 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 23 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 24 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 25 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 26 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 27 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 28 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 29 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 30 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 31 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 32 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 33 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 34 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 35 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 36 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 37 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
  Topic: __consumer_offsets Partition: 38 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 39 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 40 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 41 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 42 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
  Topic: __consumer_offsets Partition: 43 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 44 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
  Topic: __consumer_offsets Partition: 45 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 46 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 47 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
  Topic: __consumer_offsets Partition: 48 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 49 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3

The replication factor is now 3 and the replicas are spread across all three brokers, so the topic is highly available.
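
Scanning 50 rows by eye is tedious, so the same check can be scripted. The Python sketch below is only an illustration: `under_replicated` is a hypothetical name, and the third sample row has been deliberately altered (broker 1 dropped from its Isr) to show what a flagged partition looks like.

```python
import re

# Rows in the format printed by kafka-topics.sh --describe.
# The last row is altered on purpose: broker 1 is missing from its Isr.
SAMPLE = """\
  Topic: __consumer_offsets Partition: 0  Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
  Topic: __consumer_offsets Partition: 1  Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
  Topic: __consumer_offsets Partition: 2  Leader: 3 Replicas: 3,2,1 Isr: 2,3
"""

ROW = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*\S+\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def under_replicated(text, expected_rf):
    """Return partitions with too few replicas or with replicas missing from the ISR."""
    problems = []
    for line in text.splitlines():
        match = ROW.search(line)
        if not match:
            continue
        partition = int(match.group(1))
        replicas = {int(b) for b in match.group(2).split(",")}
        isr = {int(b) for b in match.group(3).split(",") if b}
        if len(replicas) < expected_rf or isr != replicas:
            problems.append(partition)
    return problems


print(under_replicated(SAMPLE, expected_rf=3))
```

An empty result means every partition has the expected number of replicas and all of them are in sync; the order of broker IDs inside Isr does not matter.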

 

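Method 2

Before starting the Kafka service, set the broker defaults used when topics are created automatically

# Number of partitions for a topic that is auto-created because it does not exist
num.partitions=3
# Replication factor for a topic that is auto-created because it does not exist
default.replication.factor=3
# Replication factor of Kafka's internal __consumer_offsets topic; the sample server.properties commonly sets this to 1
offsets.topic.replication.factor=3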
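
Before restarting brokers, the replication settings can be sanity-checked against the cluster size. The following Python sketch is an assumption-laden illustration: `parse_properties` and `check_replication` are hypothetical helpers, the parser handles only plain key=value lines and # comment lines, and the sample text intentionally leaves offsets.topic.replication.factor at 1 to trigger a warning.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_replication(props, broker_count):
    """Return warnings for replication settings that cannot survive one broker failure."""
    warnings = []
    for key in ("default.replication.factor", "offsets.topic.replication.factor"):
        value = props.get(key)
        if value is None:
            warnings.append(f"{key} is not set explicitly")
            continue
        rf = int(value)
        if rf < 2:
            warnings.append(f"{key}={rf}: one broker failure makes partitions unavailable")
        elif rf > broker_count:
            warnings.append(f"{key}={rf} exceeds the {broker_count} brokers in the cluster")
    return warnings


# Hypothetical broker settings; the last value is deliberately left at 1.
SAMPLE = """\
num.partitions=3
default.replication.factor=3
offsets.topic.replication.factor=1
"""

for warning in check_replication(parse_properties(SAMPLE), broker_count=3):
    print(warning)
```

With all three values set as in this method and three brokers in the cluster, the check returns no warnings.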

 

Once configured, start ZooKeeper and Kafka, then test producing and consuming

## Produce
./kafka-console-producer.sh --broker-list localhost:9092 --topic test

## Consume; --from-beginning reads the topic from the start
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

 

Check the partition and replica counts of the automatically created topics

## test
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic test
Topic:test  PartitionCount:3  ReplicationFactor:3 Configs:

## __consumer_offsets
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets  PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer

The automatically created topics are now highly available as well.
