Misusing the .NET Redis client CSRedisCore: digging my own pit and filling it back in

  • October 6, 2019
  • Notes

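Background

After the last round of Redis MQ distributed refactoring, the orchestrated containers ran stably for more than a month. Yesterday a colleague on the ETL side suddenly told me that no parsed logs were being collected anymore.

I hurried onto the server and found that the receiver container, which handles data ingestion, had died. I tried docker container start [containerid], but the container crashed again a few minutes later.

Redis connection limit exceeded

I checked the container log with docker logs [containerid]. The key line: CSRedis.RedisException: ERR max number of clients reached

The log says the number of clients connected to the Redis server exceeded the limit. Thinking quickly: one of the orchestrated containers uses CSRedisCore and instantiates 16 clients, one for each of the 16 Redis DBs, but surely the Redis server is not that fragile.

I went straight to the official redis.io site to gather information.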

After the client is initialized, Redis checks if we are already at the limit of the number of clients that it is possible to handle simultaneously (this is configured using the maxclients configuration directive, see the next section of this document for further information). In case it can't accept the current client because the maximum number of clients was already accepted, Redis tries to send an error to the client in order to make it aware of this condition, and closes the connection immediately. The error message will be able to reach the client even if the connection is closed immediately by Redis because the new socket output buffer is usually big enough to contain the error, so the kernel will handle the transmission of the error.

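Roughly: the maxclients directive on the Redis server sets the maximum number of client connections. If that limit has already been reached, Redis sends an error message back to the new client and closes its connection immediately.

I logged in to the Redis server right away and confirmed that maxclients was at its default value of 10000.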


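As the screenshot on the left shows, a session opened through redis-cli was kicked off as soon as it logged in.

So it was fairly certain that the problem lay in how the Redis client was being used.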

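How CSRedisCore was being used

Reading further, I found that two redis-cli commands, info clients and client list, can be run on the Redis server to analyze client connections in detail.

info clients showed that there were indeed 10000 connections at the time.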
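To make that check repeatable, here is a minimal Python sketch that reads the text returned by info clients and reports how many connections are still available. The sample text is a hypothetical stand-in that mirrors the numbers in this incident, and the function names are my own, not part of any Redis tooling.

```python
def parse_info(text):
    """Parse the 'key:value' lines of a Redis INFO section into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and section headers
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def connection_headroom(info_text, maxclients):
    """Return how many more clients the server could still accept."""
    connected = int(parse_info(info_text)["connected_clients"])
    return maxclients - connected


# Hypothetical sample: 10000 connected clients against maxclients=10000.
SAMPLE_INFO = """# Clients
connected_clients:10000
blocked_clients:0
"""

if __name__ == "__main__":
    print(connection_headroom(SAMPLE_INFO, maxclients=10000))
```

A headroom of zero matches what I saw: every new connection, including my own redis-cli session, was rejected.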

The client list command lists the individual connections.

The official explanation of the fields in the client list output:

  • addr: The client address, that is, the client IP and the remote port number it used to connect with the Redis server.
  • fd: The client socket file descriptor number.
  • name: The client name as set by CLIENT SETNAME.
  • age: The number of seconds the connection existed for.
  • idle: The number of seconds the connection is idle.
  • flags: The kind of client (N means normal client, check the full list of flags).
  • omem: The amount of memory used by the client for the output buffer.
  • cmd: The last executed command.

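Read against these field descriptions, the output showed that the Redis server had received a large number of client connections from 172.16.1.3 (the failing container's IP address on the bridge network), and that the last command issued on each of these connections was ping (a connectivity test command).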
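With thousands of lines, reading client list by eye gets tedious. The following Python sketch groups connections by client IP and last command. The sample lines are made up for illustration and use only the fields described above; tally_clients and parse_client_line are hypothetical helper names.

```python
from collections import Counter


def parse_client_line(line):
    """Turn one 'key=value key=value ...' line of CLIENT LIST into a dict."""
    return dict(field.split("=", 1) for field in line.split())


def tally_clients(client_list_text):
    """Count connections per (client IP, last command)."""
    counts = Counter()
    for line in client_list_text.splitlines():
        if not line.strip():
            continue
        client = parse_client_line(line)
        ip = client["addr"].rsplit(":", 1)[0]  # drop the remote port
        counts[(ip, client["cmd"])] += 1
    return counts


# Made-up sample lines, not the real output from the incident.
SAMPLE_CLIENT_LIST = """\
id=11 addr=172.16.1.3:40001 fd=8 name= age=300 idle=20 flags=N omem=0 cmd=ping
id=12 addr=172.16.1.3:40002 fd=9 name= age=298 idle=18 flags=N omem=0 cmd=ping
id=13 addr=172.16.1.3:40003 fd=10 name= age=295 idle=15 flags=N omem=0 cmd=ping
id=14 addr=127.0.0.1:52555 fd=11 name= age=5 idle=0 flags=N omem=0 cmd=client
"""

if __name__ == "__main__":
    for (ip, cmd), n in tally_clients(SAMPLE_CLIENT_LIST).most_common():
        print(ip, cmd, n)
```

A single IP dominating the tally with the same idle command is a strong hint that one application keeps opening connections it never reuses.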

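The Redis client in the failing container is CSRedisCore, and all it does is write Msg into a Redis list. A related GitHub issue on the CSRedisCore repository gave me a hint.

I realized I had put the CSRedisClient instantiation in the constructor of a .NET Core API controller. A controller is constructed for every request, so every request instantiated a new Redis client. It would have been strange if the client limit had not been exceeded.

Dependency injection has three lifetimes: singleton (one instance for the whole application, injected once), transient (a new instance is created every time the service is requested), and scoped (one instance per scope, which in ASP.NET Core means per HTTP request). For details on .NET API controllers being injected with a transient lifetime, see the link.

I quickly moved the CSRedisCore instantiation into startup.cs and registered it as a singleton.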

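Verifying the fix

info clients now shows a steady 53 Redis connections.

client list shows that 172.16.1.3 (the previously failing container) holds 50 client connections, webapp (another orchestrated container) holds 2, and the redis-cli session logged in to the server holds 1.

That raises a question: after the change, why does the receiver container still hold a steady 50 Redis connections?

After talking it over with the author of CSRedisCore, I confirmed that CSRedisCore has a warm-up mechanism: by default it preheats 50 connections in its connection pool.

Bingo. Both the failure and the puzzle were fully explained.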

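Summary

After this episode, two things need to be well understood when using the CSRedisCore client:

① StackExchange.Redis uses a multiplexed connection, so registering it as a singleton comes naturally. The open-source CSRedisCore library uses a connection pool instead. Under high concurrency it is strongly recommended to register it as a singleton; otherwise it is easy to end up instantiating it per request with a transient lifetime in production, and the Redis server's client slots fill up within a few days.

② CSRedisCore creates a connection pool by default and preheats 50 connections. Developers should keep that number in mind.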
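The two points above can be put together in a toy Python model. It is a language-neutral sketch, not CSRedisCore code: FakeRedisServer and PooledClient are hypothetical classes, and the model assumes every client instance opens all 50 preheated connections and never closes them. It does not model how long the real incident took to build up.

```python
class FakeRedisServer:
    """Toy stand-in for a Redis server that only counts open connections."""

    def __init__(self, maxclients=10000):
        self.maxclients = maxclients
        self.connected = 0

    def accept(self):
        # Reject new connections once the limit has been reached.
        if self.connected >= self.maxclients:
            raise ConnectionError("ERR max number of clients reached")
        self.connected += 1


class PooledClient:
    """Toy client that opens `preheat` connections as soon as it is built."""

    def __init__(self, server, preheat=50):
        for _ in range(preheat):
            server.accept()


def first_failing_request(server, per_request, total_requests):
    """Return the 1-based index of the first failing request, or None."""
    if not per_request:
        PooledClient(server)  # singleton: one client built once at startup
    for n in range(1, total_requests + 1):
        if per_request:
            try:
                PooledClient(server)  # a new pool per request, never released
            except ConnectionError:
                return n
    return None


if __name__ == "__main__":
    leaky = FakeRedisServer()
    print(first_failing_request(leaky, per_request=True, total_requests=1000))
    shared = FakeRedisServer()
    print(first_failing_request(shared, per_request=False, total_requests=1000))
    print(shared.connected)
```

Under these assumptions the per-request variant hits the limit on request 201 (200 pools of 50 connections fill all 10000 slots), while the singleton variant stays at 50 connections no matter how many requests arrive.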

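One more piece of methodology: try not to look for answers on a certain search engine. Learn to ask good questions, and look for answers in the official documentation, on Stack Overflow, and in the GitHub community. The pit you just fell into has probably been dug, and filled in, by someone else long ago.

Original post: https://www.cnblogs.com/JulianHuang/p/11541658.html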