[Azure Event Hub] Interpreting Error Messages Found in Event Hub Logs

Problem description

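This post explains various client-side errors that can appear when consuming events from Event Hub. (New error messages will be added here as they are encountered.)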

 

1: "Too many open files" when running an Event Hub consumer program on Linux

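Explanation: the Java program has opened more file handles than the operating system allows. Check whether the operating system's file handle limit is still the default of 1024; if it is, raise it well above that value (the configuration below uses 65535).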

Use ulimit -a or ulimit -n to view the current limit, for example: open files (-n) 1024

Configuration file: /etc/security/limits.conf

Add the following lines to that file:

* soft nofile 65535
* hard nofile 65535
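To confirm from inside the JVM that the new limit is in effect, and to see how close the consumer is to it, the following minimal sketch reads the open and maximum file descriptor counts. It assumes a Unix-like OpenJDK/HotSpot JVM, where the JDK exposes `com.sun.management.UnixOperatingSystemMXBean`; the class name `FdUsage` is only illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

class FdUsage {
    /** Returns {open, max} file descriptor counts, or null if the JVM does not expose them. */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // assumption: non-Unix JVMs simply report nothing here
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.printf("open=%d max=%d%n", counts[0], counts[1]);
        }
    }
}
```

If the reported maximum is still 1024 after editing limits.conf, the process was most likely started from a session that predates the change.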

 

2: New receiver ‘nil’ with higher epoch of ‘197’ is created hence current receiver ‘nil’ with epoch ‘196’ is getting disconnected

Error message:

java.util.concurrent.CompletionException: com.microsoft.azure.eventhubs.ReceiverDisconnectedException: New receiver ‘nil’ with higher epoch of ‘197’ is created hence current receiver ‘nil’ with epoch ‘196’ is getting disconnected. If you are recreating the receiver, make sure a higher epoch is used. TrackingId:xxxxxxxxxxxxxxx, SystemTracker:xxxxxxx:eventhub:xxxxxxx|$default, Timestamp:2020-10-20T15:50:16, errorContext[NS: xxxxxxxxx.servicebus.chinacloudapi.cn, PATH: xxxxxxxxx/ConsumerGroups/$Default/Partitions/3, REFERENCE_ID: xxxxxxxxxx, PREFETCH_COUNT: 300, LINK_CREDIT: 300, PREFETCH_Q_LEN: 0]

java.util.concurrent.ExecutionException: com.microsoft.azure.eventprocessorhost.ExceptionWithAction: java.lang.RuntimeException: Lease lost while updating checkpoint

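Explanation: the consumer program creates a separate consumer thread for each partition, so consumer threads and partitions map one to one. When an additional consumer program reads the same Event Hub and stores its checkpoints in the same location, the partitions are redistributed among the consumers. Likewise, when one consumer thread fails, the client program tries to recover and take over all of the partitions held by the failed thread. In both cases a new receiver connects with a higher epoch and the previous receiver is disconnected, which is also why the old owner reports "Lease lost while updating checkpoint". This error can usually be ignored.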
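The epoch rule behind this message can be illustrated with a small toy model. This is not SDK code: `EpochOwnership` and its methods are hypothetical names, and the model only captures the simplified rule that a receiver with an equal or higher epoch displaces the current owner of a partition, while a receiver with a lower epoch is rejected.

```java
import java.util.Optional;

/** Toy model of epoch-based ownership for a single partition (illustrative, not SDK code). */
final class EpochOwnership {
    private String owner;     // receiver that currently holds the partition, null if none
    private long epoch = -1;  // epoch of the current owner

    /**
     * Attaches a receiver with the given epoch.
     * Returns the receiver that was displaced, or empty if the partition was free.
     * Throws IllegalStateException if the current owner has a higher epoch.
     */
    Optional<String> connect(String receiver, long newEpoch) {
        if (owner != null && newEpoch < epoch) {
            throw new IllegalStateException(
                "A receiver with epoch " + epoch + " already exists");
        }
        Optional<String> displaced = Optional.ofNullable(owner);
        owner = receiver;
        epoch = newEpoch;
        return displaced;
    }

    String owner() {
        return owner;
    }

    public static void main(String[] args) {
        EpochOwnership partition3 = new EpochOwnership();
        partition3.connect("hostA", 196);
        // hostB steals the lease and reconnects with a higher epoch, so hostA is displaced.
        System.out.println("displaced: " + partition3.connect("hostB", 197));
        System.out.println("owner now: " + partition3.owner());
    }
}
```

In this model, hostA corresponds to the receiver that logs the ReceiverDisconnectedException: it did nothing wrong, it simply lost the partition to a newer owner.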

 

3: com.microsoft.azure.eventprocessorhost.ExceptionWithAction: The client could not finish the operation within specified maximum execution timeout.

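Explanation: after consuming events, the client program hit a network timeout while saving the consumed offset to Storage. Check the client's network connectivity.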
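As a first network check, a plain TCP connection attempt from the client machine to the Storage endpoint can show whether the host is reachable at all. The sketch below uses only the Java standard library; the class name `EndpointCheck` and the default host name are hypothetical, so pass the real Storage host used for checkpoints as the first argument.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class EndpointCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical account name; replace with the Storage endpoint that holds the checkpoints.
        String host = args.length > 0 ? args[0] : "mystorageaccount.blob.core.chinacloudapi.cn";
        System.out.println(host + ":443 reachable = " + canConnect(host, 443, 5000));
    }
}
```

A successful connection only rules out basic reachability problems; intermittent latency or packet loss can still cause the timeout.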

 

4: com.microsoft.azure.eventhubs.EventHubException: The specified partition is invalid for an EventHub partition sender or receiver. It should be between 0 and 1.

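Explanation: the client program specified an invalid partition when consuming Event Hub data. The message states the valid range, which is 0 to 1 in this example.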
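One way to catch this earlier is to validate the partition ID against the hub's known partition IDs before creating a sender or receiver. The following is a minimal sketch with a hypothetical helper, `PartitionCheck.requireValidPartition`; the list of valid IDs is hard-coded here as an assumption, whereas a real client would obtain it from the Event Hub itself (for example through the client's runtime information call).

```java
import java.util.Arrays;
import java.util.List;

final class PartitionCheck {
    /** Returns partitionId if it is one of validIds, otherwise throws IllegalArgumentException. */
    static String requireValidPartition(String partitionId, List<String> validIds) {
        if (!validIds.contains(partitionId)) {
            throw new IllegalArgumentException(
                "Partition '" + partitionId + "' is not one of " + validIds);
        }
        return partitionId;
    }

    public static void main(String[] args) {
        // Assumption: a hub with two partitions, matching "between 0 and 1" in the error message.
        List<String> validIds = Arrays.asList("0", "1");
        System.out.println("using partition " + requireValidPartition("1", validIds));
        try {
            requireValidPartition("3", validIds);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```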

 

5: com.microsoft.azure.eventhubs.EventHubException: The supplied offset ‘0’ is invalid. The last offset in the system is ‘-1’

Explanation: the client submitted an invalid offset when consuming Event Hub data. The initial offset must not be set to 0.