Implementation of Raft Log Replication in Etcd
Raft state of log

commitIndex: the index of the highest log entry known to be committed. A log entry is committed once the leader that created the entry has replicated it on a majority of the servers.
lastApplied: the index of the last entry applied to the state machine.
Both of these properties are marked as volatile.
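To make the majority rule concrete, here is a minimal sketch of how a leader could derive commitIndex from the highest index each server is known to have replicated. The function name `quorumCommitIndex` and the `matchIndex` slice are illustrative assumptions, not etcd identifiers, and the sketch leaves out the Raft rule that a leader only commits entries from its own term this way.

```go
package main

import (
	"fmt"
	"sort"
)

// quorumCommitIndex returns the highest log index that is replicated on a
// majority of servers. matchIndex holds, for every server (leader included),
// the highest index known to be replicated there. Hypothetical helper.
func quorumCommitIndex(matchIndex []uint64) uint64 {
	if len(matchIndex) == 0 {
		return 0
	}
	// Copy so the caller's slice is not reordered.
	sorted := append([]uint64(nil), matchIndex...)
	// Sort descending: sorted[k] is replicated on at least k+1 servers.
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })
	quorum := len(sorted)/2 + 1
	return sorted[quorum-1]
}

func main() {
	// Five servers: index 7 is present on three of them, which is a majority.
	fmt.Println(quorumCommitIndex([]uint64{9, 7, 7, 4, 3})) // prints 7
}
```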
Impl in Etcd
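Log replication is split into two phases, commit and apply. Commit is the process in which the raft state machines confirm with each other that the log has been synchronized. Apply is the process in which the application handles the relevant entries and then notifies the raft state machine that they have been applied.
The apply phase is fairly abstract: the application decides what applying means for its own business logic. In practice it is the application logic that consumes the committed entries, and once that logic has finished, all it does is tell the raft state machine that the entries have been handled by the application.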
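The relationship between the two indexes can be sketched as a small loop: everything between lastApplied and commitIndex is handed to application logic, and only afterwards is it marked as applied. The `node` and `entry` types and the key/value map below are assumptions made for illustration; they are not how etcdserver structures its apply path.

```go
package main

import "fmt"

// entry is a hypothetical log entry carrying a single key/value write.
type entry struct {
	Index uint64
	Key   string
	Value string
}

// node holds the two volatile indexes plus a toy key/value state machine.
type node struct {
	log         []entry // log[i] has Index i+1
	commitIndex uint64
	lastApplied uint64
	kv          map[string]string
}

// applyCommitted applies every committed but not yet applied entry in order
// and returns how many entries were applied.
func (n *node) applyCommitted() int {
	applied := 0
	for n.lastApplied < n.commitIndex && n.lastApplied < uint64(len(n.log)) {
		e := n.log[n.lastApplied] // the entry with Index lastApplied+1
		n.kv[e.Key] = e.Value     // application-defined logic
		n.lastApplied = e.Index   // only now is the entry marked as applied
		applied++
	}
	return applied
}

func main() {
	n := &node{
		log:         []entry{{1, "a", "1"}, {2, "b", "2"}, {3, "c", "3"}},
		commitIndex: 2, // entry 3 is replicated locally but not yet committed
		kv:          map[string]string{},
	}
	fmt.Println(n.applyCommitted(), n.lastApplied, len(n.kv)) // prints 2 2 2
}
```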
Structure

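It is mainly made up of two packages:
pkg raft is the concrete implementation of the raft algorithm
pkg etcdserver is the application that uses the raft algorithm, and contains the actual application logic plus the glue for the interaction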
pkg raft

pkg etcdserver

remote request sequential flow

raft msg handle sequential flow
Just trying out mermaid here, and it works pretty well
sequenceDiagram
participant EtcdServer
participant raftNode
participant Node(pkg raft)
loop raftNode start() , EtcdServer run()
raftNode->raftNode: waiting Ready channel from Node
raftNode->raftNode: store uncommitted&committed entries
raftNode->raftNode: send entries to apply channel to apply
EtcdServer->EtcdServer: run() waiting apply channel from raftNode
raftNode->raftNode: transport remote msgs from Node to other nodes, the msgs are built by raft
raftNode->raftNode: waiting notifyc channel from EtcdServer to Advance()
Note right of raftNode: Advance represents calling advance() in rawNode, marking the current index as applied by the application
end
loop Node(pkg raft) run()
Node(pkg raft)->Node(pkg raft): waiting msgs from propc(Proposal flow)
Node(pkg raft)->Node(pkg raft): call ready(), collect entries from raft log & msgs needs handle
Node(pkg raft)->Node(pkg raft): waiting advance channel , mark applied to raft log
end
Flow
Commit flow
proposal

follower accept proposal

leader commit proposal


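Leaving aside the special logic for snapshots and for restarting a node, when a Node is started normally the Storage actually holds both uncommitted and committed entries, and at startup the committed index is set to the full length of the log. In some extreme cases the logs can differ, so the Follower accept proposal step includes conflict detection, and the Leader forces the Follower to follow the Leader's own log.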
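The conflict check can be sketched as follows: walk the entries sent by the leader, find the first index where the follower either has no entry or has an entry from a different term, then truncate the follower's log there and append the leader's entries. This is a simplified sketch under stated assumptions (1-based contiguous indexes, no committed-index safety check, hypothetical `appendFromLeader` helper), not a copy of the etcd code.

```go
package main

import "fmt"

// entry keeps only what conflict detection needs. Indexes start at 1.
type entry struct {
	Index uint64
	Term  uint64
}

// findConflict returns the index of the first incoming entry that is missing
// from the local log or has a different term at the same index. It returns 0
// when the local log already contains every incoming entry.
func findConflict(local, incoming []entry) uint64 {
	for _, in := range incoming {
		if in.Index > uint64(len(local)) || local[in.Index-1].Term != in.Term {
			return in.Index
		}
	}
	return 0
}

// appendFromLeader makes the follower's log agree with the leader's entries by
// dropping the conflicting suffix and appending what the leader sent.
func appendFromLeader(local, incoming []entry) []entry {
	ci := findConflict(local, incoming)
	if ci == 0 || ci > uint64(len(local))+1 {
		// Already up to date, or there is a gap (a real follower would
		// reject the message in that case).
		return local
	}
	first := incoming[0].Index
	// The three-index slice caps capacity so the old suffix is not reused.
	return append(local[:ci-1:ci-1], incoming[ci-first:]...)
}

func main() {
	follower := []entry{{1, 1}, {2, 1}, {3, 1}}
	fromLeader := []entry{{2, 1}, {3, 2}, {4, 2}}
	fmt.Println(findConflict(follower, fromLeader))     // prints 3
	fmt.Println(appendFromLeader(follower, fromLeader)) // prints [{1 1} {2 1} {3 2} {4 2}]
}
```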
apply flow
Every node has its own applied index, and it does not need to be synchronized between nodes.
For the flow, see raft msg handle sequential flow
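The step Node(pkg raft): call ready(), collect entries from raft log & msgs needs handle produces the Ready data, which contains the unstable entries and the committed entries. The Entries field holds the unstable part of raftLog, which can include both uncommitted and committed entries. They are unstable because they have not yet been persisted to Storage, not because they have not been applied.
Once the Ready data has travelled over the channel to the etcdserver side, the application-layer logic runs: it stores the entries, applies them, and then marks them as applied, at which point the entries that have been persisted are dropped from unstable.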
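The Ready and Advance handshake can be modelled with a toy log that tracks three positions: how far entries are persisted, committed, and applied. The sketch below is a simplification with assumed names (`raftLog`, `ready`, `advance`, a `stable` counter) and no channels; the real pkg raft types carry more state than this.

```go
package main

import "fmt"

type entry struct {
	Index uint64
	Data  string
}

// ready mirrors the two entry slices discussed above (illustrative names).
type ready struct {
	Entries          []entry // not yet persisted by the application
	CommittedEntries []entry // committed but not yet applied
}

// raftLog is a toy log. Entries with Index <= stable have been persisted.
// It assumes applied <= committed <= len(entries) and stable <= len(entries).
type raftLog struct {
	entries   []entry // entries[i] has Index i+1
	stable    uint64
	committed uint64
	applied   uint64
}

// ready collects what the application still has to persist and to apply.
func (l *raftLog) ready() ready {
	return ready{
		Entries:          l.entries[l.stable:],
		CommittedEntries: l.entries[l.applied:l.committed],
	}
}

// advance is called once the application has persisted rd.Entries and
// applied rd.CommittedEntries; it moves both markers forward.
func (l *raftLog) advance(rd ready) {
	if n := len(rd.Entries); n > 0 {
		l.stable = rd.Entries[n-1].Index
	}
	if n := len(rd.CommittedEntries); n > 0 {
		l.applied = rd.CommittedEntries[n-1].Index
	}
}

func main() {
	l := &raftLog{
		entries:   []entry{{1, "x"}, {2, "y"}, {3, "z"}},
		stable:    1,
		committed: 2,
	}
	rd := l.ready()
	fmt.Println(len(rd.Entries), len(rd.CommittedEntries)) // prints 2 2
	l.advance(rd)
	fmt.Println(l.stable, l.applied) // prints 3 2
}
```

After `advance`, a second call to `ready()` in this toy model returns two empty slices until more entries are appended or committed.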
Summary
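This write-up leaves out the snapshot, log compaction, leader change, config change, and linearizable read flows.
One point about the interaction that I have not confirmed yet is whether etcd only acknowledges an entry after it has been applied. Judging by the current flow, an entry that has only been committed could also be acknowledged if you take an optimistic view, even though it might still be lost.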


