Skip to content

1 - Add Raft-based consensus implementationFeat/raft consensus#212

Open
eyesofish wants to merge 16 commits intoapache:masterfrom
eyesofish:feat/raft-consensus
Open

1 - Add Raft-based consensus implementationFeat/raft consensus#212
eyesofish wants to merge 16 commits intoapache:masterfrom
eyesofish:feat/raft-consensus

Conversation

@eyesofish
Copy link

This PR adds a Raft-based consensus implementation to ResilientDB.

Key changes:

  • Introduce Raft kernel components and storage isolation
  • Add infrastructure for switching consensus protocols via configuration
  • Keep existing PBFT execution logic intact

This work is based on the current upstream master branch.
No new third-party code is introduced.

此更改使系统能够通过配置在不同共识协议之间切换,
初始支持PBFT和Raft协议。

主要变更:
- 在ResConfigData中添加consensus_protocol字段
- 扩展ResDBConfig以支持获取/设置共识协议类型
- 更新所有配置模板和示例以显式包含协议设置
- 默认使用pbft确保向后兼容性

未来将根据配置动态选择对应的共识管理器实现。

Fixes apache#123
…onsensusManagerRaft,并新增 Create/Generate...ForProtocol 接口,配好 RESDB_HAS_RAFT 宏,后续有 RAFT 代码时直接启用。

service/utils/server_factory.cpp (lines 1-92) 实现协议字符串归一化和选择逻辑,raft 未编译时直接 LOG(FATAL),算是把“失败路径”写死;同一逻辑也提供带 CustomQuery 的版本。
service/kv/kv_service.cpp (lines 69-84) 统一读取 config->GetConsensusProtocol(),打印后交给 GenerateResDBServerForProtocol,入口现在真正受配置控制。
TODO.md (lines 15-21) 勾掉 KV 服务的 checklist。
Add support for selecting consensus protocol (PBFT/RAFT) via config:
- Add consensus_protocol field to ResConfigData
- Extend ResDBConfig with protocol getter/setter
- Update server_factory to create consensus managers dynamically
- Migrate KV/Contract/UTXO/GraphQL services to config-driven protocol selection
- Define clear failure path for unimplemented protocols

Prepares foundation for RAFT integration. Maintains backward compatibility
with existing PBFT-only deployments.

Fixes #issue-number (if applicable)
This change introduces the foundational ConsensusManagerRaft class that will
serve as the integration point between the ResilientDB network layer and
the upcoming RAFT consensus implementation. Key features include:

- Created ConsensusManagerRaft inheriting from ConsensusManager
- Disabled legacy PBFT heartbeat in favor of RAFT's AppendEntries-based approach
- Implemented request routing for client requests, custom queries, and RAFT RPCs
- Added leadership/term tracking with atomic variables for thread safety
- Provided callback registration interface for RAFT core components
- Set up dedicated heartbeat thread with pluggable heartbeat task

The class is designed as a glue layer that allows the actual RAFT implementation
to register handlers for various operations while keeping the wiring and lifecycle
management separate. It properly handles all request types and provides access
to injected dependencies like TransactionManager and CustomQuery.

Build target //platform/consensus/ordering/raft:consensus_manager_raft
compiles successfully (timed out due to external deps, no errors observed).

Prepares foundation for full RAFT consensus implementation. Next step is
implementing the RAFT core logic that will register handlers with this skeleton.

Fixes #issue-number (if applicable)
Implement core RAFT kernel components needed for consensus operation:

- PrefixAllocator: Centralized storage namespace management for PBFT/RAFT isolation
- RaftLog: Persistent log with append/truncate/commit operations
- RaftPersistentState: Term/vote/apply state persistence
- RaftSnapshotManager: Chunked snapshot read/write with isolated storage
- RaftRpc: Network messaging for RAFT protocol RPCs

All components integrated into ConsensusManagerRaft and covered by unit tests.

Prepares groundwork for full RAFT node implementation with role transitions.

Fixes #issue-number (if applicable)
…GetStorage() 返回 nullptr,PBFT 还是能跑;只是每个 replica 在共识通过后本地写一份状态,并没有要求共识层去读取 executor 的存储指针。

RAFT 这套是“共识层直接管日志”。ConsensusManagerRaft 在构造时会调用 transaction_manager_->GetStorage() 来拿一个 Storage*,然后用它构造 RaftLog、RaftPersistentState、RaftSnapshotManager。这些对象的职责是把 WAL、term、commit index、snapshot 全部写进同一个 storage 里,保证 leader/follower 的日志状态一致。如果 GetStorage() 返回 nullptr,那这些对象全都变成空指针,接下来任何 raft_log_->Append/CommitTo 之类的调用都会直接 deref null → Segmentation fault。
只有entered没有handling的现象明显减少
Raft消息处理流程能够正常执行
状态转换恢复正常:

节点能够根据更高任期的消息正确降级
集群开始展现出更符合Raft协议的行为
心跳机制有所改善:

虽然可能还不够完美,但比之前只有单轮心跳的情况要好很多
@cjcchen
Copy link
Contributor

cjcchen commented Dec 18, 2025

hi yang,

Thanks for the PR.
Could you make the changes green?
I saw some changed files but it is not modified. like .asf.yaml

You can run "git checkout .asf.yaml".

@harish876 harish876 changed the title Add Raft-based consensus implementationFeat/raft consensus 1 - Add Raft-based consensus implementationFeat/raft consensus Dec 22, 2025
@cjcchen
Copy link
Contributor

cjcchen commented Dec 29, 2025

"This branch has conflicts that must be resolved"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants