Deferred Block Proving
1. Background and Constraints
Block proving is expected to take approximately 30 seconds on average per block, while the network target block time is 5 seconds. Therefore, block production MUST NOT synchronously depend on block proving.
The system MUST satisfy the following constraints:
- Block production proceeds at a fixed cadence independent of proving latency.
- Committed blocks are immediately visible to the network and RPC clients.
- Proving is performed asynchronously and may lag behind the committed chain tip.
- Finality is monotonic and derives solely from proven block prefixes.
These constraints necessitate a separation between block commitment and block proving.
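As a rough illustration of the figures above, the following sketch estimates how much proving parallelism is needed to keep pace with block production. It assumes steady-state averages (30 seconds per proof, 5 seconds per block) and ignores variance and dispatch overhead; the function names are illustrative only and are not part of this specification.

```python
import math

def min_parallel_provers(proving_seconds: float, block_time_seconds: float) -> int:
    """Smallest number of concurrent proving jobs whose combined throughput
    matches the block production rate."""
    return math.ceil(proving_seconds / block_time_seconds)

def backlog_growth_per_minute(provers: int, proving_seconds: float,
                              block_time_seconds: float) -> float:
    """Net unproven blocks added per minute (zero or negative means the
    backlog is not growing)."""
    produced = 60 / block_time_seconds
    proven = provers * 60 / proving_seconds
    return produced - proven

print(min_parallel_provers(30, 5))          # 6
print(backlog_growth_per_minute(1, 30, 5))  # 10.0
```

Under these assumptions a single prover falls behind by about ten blocks per minute, which is why proving cannot sit on the block production path.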
2. High-Level Design Overview
The system introduces asynchronous block proving as a first-class component of the block lifecycle.
Blocks progress through a series of well-defined states. State transitions prior to commitment are synchronous and atomic with respect to chain state. All proving-related transitions are asynchronous and retryable.
Committed blocks represent the canonical chain state, while finalized blocks represent the canonical and proven chain prefix.
```mermaid
sequenceDiagram
    autonumber
    participant BP as Block Producer
    participant V as Validator
    participant ST as Store
    participant PS as Proof Scheduler
    participant RP as Remote Prover

    Note over BP,ST: Block creation phase (synchronous)
    BP->>V: Proposed block
    V->>V: Validate block
    V-->>BP: Signed block
    BP->>ST: Submit signed block
    ST->>ST: Commit block to chain state
    Note over ST: Block committed <br> Visible via RPC

    Note over ST,RP: Block proving phase (asynchronous)
    PS->>ST: Query committed, unproven blocks
    ST-->>PS: Return next committed block
    PS->>RP: Dispatch proving job (signed block)
    RP->>RP: Generate proof
    alt Proof succeeds
        RP-->>PS: Proof
        PS->>ST: Submit proof
        ST->>ST: Mark block as Proven
    else Proof fails
        RP-->>PS: Error
        PS->>PS: Retry with backoff
    end

    Note over ST: Finalization check
    ST->>ST: Check all ancestors proven
    ST->>ST: Mark block as Finalized
    Note over ST: Block finalized <br> Visible via RPC
```
3. Block Lifecycle and States
3.1 Block States
The block lifecycle consists of the following ordered states:
Proposed → Signed → Committed → Proven → Finalized
3.2 State Definitions
| State | Description | Guarantees | Available via RPC |
|---|---|---|---|
| Proposed | Candidate block produced locally | None | No |
| Signed | Block signed by a Validator | Authenticity only | No |
| Committed | Block applied to canonical chain state | Deterministic state transition | Yes |
| Proven | Block state transition proven | Cryptographic validity | No |
| Finalized | Proven block with all ancestors proven | Irreversible | Yes |
3.3 Invariants
The system MUST uphold the following invariants:
- The committed chain is always contiguous and strictly ordered by block number.
- The finalized chain is always a prefix of the committed chain.
- A block MUST NOT be finalized unless all preceding blocks are proven.
- Block states MUST NOT regress except via explicit manual intervention (disaster recovery).
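The ordered lifecycle and the no-regression invariant can be sketched as an ordered enum with a transition check. This is a minimal sketch, assuming that a block advances exactly one state at a time; `BlockState` and `is_valid_transition` are hypothetical names, and the manual disaster-recovery path is deliberately left out.

```python
from enum import IntEnum

class BlockState(IntEnum):
    """Lifecycle states in the order given in section 3.1."""
    PROPOSED = 0
    SIGNED = 1
    COMMITTED = 2
    PROVEN = 3
    FINALIZED = 4

def is_valid_transition(current: BlockState, new: BlockState) -> bool:
    """Accept only a single forward step (an assumption of this sketch).
    Regressions and skipped states are rejected."""
    return int(new) == int(current) + 1

print(is_valid_transition(BlockState.COMMITTED, BlockState.PROVEN))  # True
print(is_valid_transition(BlockState.PROVEN, BlockState.COMMITTED))  # False
```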
4. Block Commitment Flow (Synchronous)
4.1 Commitment Pipeline
Blocks are proposed by the Block Producer, signed by a Validator, and committed by the Store. This flow is synchronous and linear.
4.2 Semantics
The commitment flow is synchronous and atomic.
Upon commitment:
- The block is appended to the canonical chain.
- Chain state is updated immediately.
- All state-dependent RPC endpoints reflect the new block.
Commitment MUST NOT depend on proving.
Once committed, a block is considered part of the canonical chain regardless of its proving status.
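A minimal in-memory sketch of these semantics follows. It assumes block numbers start at 0 and uses a hypothetical `Store` class; the point it illustrates is that `commit` enforces contiguity and never consults proving status.

```python
class Store:
    """Toy stand-in for the Store, tracking block numbers only."""

    def __init__(self) -> None:
        self.committed: list[int] = []  # committed block numbers, in order
        self.proven: set[int] = set()   # filled in later by the proving path

    def commit(self, block_number: int) -> None:
        # The next block must directly extend the committed chain.
        expected = len(self.committed)
        if block_number != expected:
            raise ValueError(f"expected block {expected}, got {block_number}")
        # Note: self.proven is not read here. Commitment is independent of proving.
        self.committed.append(block_number)

store = Store()
for n in range(3):
    store.commit(n)
print(store.committed)  # [0, 1, 2]
```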
5. Asynchronous Block Proving
5.1 Overview
Block proving is performed asynchronously by one or more Remote Provers. Proving is pull-based, retryable, and independent of block production.
5.2 Proving Scheduler
5.2.1 Scheduler Responsibilities
A Proving Scheduler is responsible for:
- Selecting committed but unproven blocks.
- Dispatching proving jobs to Remote Provers.
- Tracking proving progress and failures.
- Persisting successful proofs.
5.2.2 Scheduler Placement
The scheduler MAY be implemented as an internal task within the Store or as a standalone service consuming Store RPCs.
This specification treats the scheduler as a logical component independent of deployment topology.
5.2.3 Block Selection Rules
The scheduler MUST:
- Only schedule blocks in the Committed state.
- Ensure at most one proving job is in flight for any given block.
- Prefer lower block numbers to minimize finalized lag.
Blocks MAY be scheduled out of order, but finalization semantics remain strictly sequential.
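The selection rules above can be sketched as a single function. This is an illustrative sketch, not a prescribed interface: `select_next_blocks` and its `capacity` parameter are assumptions, and block numbering is assumed to start at 0.

```python
def select_next_blocks(committed_tip: int, proven: set[int],
                       in_flight: set[int], capacity: int) -> list[int]:
    """Return up to `capacity` block numbers to dispatch, lowest first.

    Only committed blocks (0..=committed_tip) that are neither proven nor
    already being proven are eligible. A real implementation would likely
    start scanning at the finalized tip instead of at block 0.
    """
    selected: list[int] = []
    for n in range(committed_tip + 1):
        if len(selected) == capacity:
            break
        if n not in proven and n not in in_flight:
            selected.append(n)
    return selected

# Blocks 0, 1 and 3 are proven and block 2 is in flight, so 4 and 5 are next.
print(select_next_blocks(6, proven={0, 1, 3}, in_flight={2}, capacity=2))  # [4, 5]
```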
5.2.4 Parallelism
Multiple blocks MAY be proven in parallel.
Proving MUST NOT assume sequential dependency between blocks, and proof verification MUST be independent per block.
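Because proofs are independent per block, dispatch can be modelled as a pool of concurrent jobs whose results arrive in any order. The sketch below uses a thread pool and a placeholder `prove_block` function standing in for a Remote Prover call; both the function and the fake proof bytes are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def prove_block(block_number: int) -> bytes:
    """Placeholder for a Remote Prover request; returns a fake proof."""
    return f"proof-{block_number}".encode()

def prove_in_parallel(block_numbers: list[int], max_workers: int = 4) -> dict[int, bytes]:
    """Prove several blocks concurrently and collect proofs keyed by block number."""
    proofs: dict[int, bytes] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(prove_block, n): n for n in block_numbers}
        # Completion order is arbitrary; keying by block number keeps results unambiguous.
        for future in as_completed(futures):
            proofs[futures[future]] = future.result()
    return proofs

print(sorted(prove_in_parallel([4, 5, 6])))  # [4, 5, 6]
```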
5.2.5 Retry Semantics
Failed proving attempts are retried with backoff.
Transient errors MUST NOT mark a block as unprovable. Permanent failures are escalated according to the failure-handling procedures.
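One way to realise these retry semantics is capped exponential backoff that distinguishes transient from permanent errors. The exception names, the delay parameters, and the choice of exponential backoff are all assumptions of this sketch; the specification only requires retry with backoff.

```python
import time

class TransientProvingError(Exception):
    """Temporary failure, for example a prover timeout. Safe to retry."""

class PermanentProvingError(Exception):
    """Failure that retrying cannot fix. Must be escalated by the caller."""

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base * 2^attempt, capped at `cap` seconds."""
    return min(cap, base * 2 ** attempt)

def prove_with_retry(prove, block_number: int, max_attempts: int = 5, sleep=time.sleep):
    """Return a proof, or None if every attempt hit a transient error.

    Returning None leaves the block in the Committed state so it can be
    rescheduled later; it is never marked unprovable here.
    PermanentProvingError is not caught and propagates for escalation.
    """
    for attempt in range(max_attempts):
        try:
            return prove(block_number)
        except TransientProvingError:
            sleep(backoff_delay(attempt))
    return None
```

Passing `sleep` as a parameter keeps the function testable without real delays.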
5.3 Proof Persistence
Upon successful proof verification, the Store persists the proof and transitions the block to the Proven state.
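A sketch of verify-then-persist with idempotent handling of duplicate submissions is shown below. `ProofStore`, its `verify` callback, and the boolean return value are hypothetical; a real Store would persist durably and run an actual proof verifier.

```python
class ProofStore:
    """Toy proof persistence keyed by block number."""

    def __init__(self, verify) -> None:
        self._verify = verify            # callable(block_number, proof) -> bool
        self._proofs: dict[int, bytes] = {}

    def submit_proof(self, block_number: int, proof: bytes) -> bool:
        """Return True if the proof was newly persisted, False for a duplicate."""
        if block_number in self._proofs:
            return False                 # already Proven: duplicate is a no-op
        if not self._verify(block_number, proof):
            raise ValueError(f"invalid proof for block {block_number}")
        self._proofs[block_number] = proof
        return True

    def is_proven(self, block_number: int) -> bool:
        return block_number in self._proofs

proofs = ProofStore(verify=lambda n, p: p == b"valid")
print(proofs.submit_proof(3, b"valid"))  # True
print(proofs.submit_proof(3, b"valid"))  # False
```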
6. Block Finalization
6.1 Finalization Rule
A block N is considered finalized if and only if:
- Block N is in the Proven state; and
- All blocks with numbers less than N are also in the Proven state.
Finalization is a derived property and does not require an explicit action.
6.2 Finalized Chain Tip
The finalized chain tip is defined as the highest block number such that all blocks up to and including that number are proven.
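The definition above amounts to finding the longest contiguous run of proven blocks starting at the first block. The sketch below assumes block numbering starts at 0 and returns `None` when no block is finalized; the function name and the optional `previous_tip` shortcut are illustrative.

```python
from typing import Optional

def finalized_tip(proven: set[int], previous_tip: Optional[int] = None) -> Optional[int]:
    """Highest n such that every block in 0..=n is proven, or None if block 0
    is not yet proven. Passing the previous tip avoids rescanning the prefix
    and makes the result monotonic by construction."""
    n = -1 if previous_tip is None else previous_tip
    while n + 1 in proven:
        n += 1
    return None if n < 0 else n

# Block 3 is missing, so blocks 4 and 5 are proven but not finalized.
print(finalized_tip({0, 1, 2, 4, 5}))                    # 2
# Once block 3 is proven, the tip jumps past the already-proven blocks.
print(finalized_tip({0, 1, 2, 3, 4, 5}, previous_tip=2))  # 5
```

The result depends only on the set of proven blocks, not on the order in which proofs arrived.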
7. RPC Semantics and Finality
7.1 Finality Parameter
RPC requests that reference block numbers MUST specify a finality requirement:
- committed
- finalized
The finality parameter MUST be supplied explicitly; servers MUST NOT apply an implicit default.
7.2 RPC Behavior
If the requested block does not satisfy the requested finality, the RPC MUST return a deterministic error (for example, PENDING_FINALITY).
RPC responses MUST NOT mix data across different finality levels.
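A request-side check for these rules might look like the following sketch. The `Finality` enum, the `PendingFinality` exception, and the tip arguments are assumptions; only the two finality levels and the example error name `PENDING_FINALITY` come from this specification. For brevity the sketch raises the same error for a block above the committed tip, which a real RPC might report differently.

```python
from enum import Enum

class Finality(Enum):
    COMMITTED = "committed"
    FINALIZED = "finalized"

class PendingFinality(Exception):
    """Deterministic error for a block that has not reached the requested finality."""
    code = "PENDING_FINALITY"

def check_finality(block_number: int, finality: Finality,
                   committed_tip: int, finalized_tip: int) -> None:
    # `finality` has no default value, so callers must always state it.
    if not isinstance(finality, Finality):
        raise TypeError("finality must be Finality.COMMITTED or Finality.FINALIZED")
    tip = committed_tip if finality is Finality.COMMITTED else finalized_tip
    if block_number > tip:
        raise PendingFinality(f"block {block_number} is not yet {finality.value}")

check_finality(5, Finality.COMMITTED, committed_tip=10, finalized_tip=3)  # passes
```

Requesting the same block with `Finality.FINALIZED` would raise `PendingFinality`, since block 5 is above the finalized tip of 3.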
7.3 Affected Store RPC endpoints
The following RPC endpoints are affected by finality semantics:
- GetAccount
- GetBlockByNumber
- GetBlockHeaderByNumber
- SyncNullifiers
- SyncNotes
- SyncState
- SyncAccountVault
- SyncAccountStorageMaps
- SyncTransactions
All RPC endpoints upstream of these (the service Api endpoints in rpc.proto) are also affected.
7.4 Store RPC for Block Producer and NTX Builder
The Block Producer and NTX Builder require only committed data and MUST NOT depend on proven or finalized blocks.
8. Observability and Operational Expectations
The system SHOULD expose metrics for:
- Committed versus finalized chain tip
- Proving backlog size
- Proof latency distributions
- Persistent proving failures
Alerting SHOULD trigger when the finalized tip lags the committed tip beyond a configured threshold.
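The lag alert reduces to a comparison between the two chain tips. The sketch below is illustrative; the function names and the threshold of 20 blocks are assumptions, and a deployment would choose its own threshold.

```python
def finalized_lag(committed_tip: int, finalized_tip: int) -> int:
    """Number of committed blocks that are not yet finalized."""
    return committed_tip - finalized_tip

def should_alert(committed_tip: int, finalized_tip: int, max_lag_blocks: int) -> bool:
    """True when the finalized tip trails the committed tip by more than the threshold."""
    return finalized_lag(committed_tip, finalized_tip) > max_lag_blocks

print(should_alert(committed_tip=120, finalized_tip=112, max_lag_blocks=20))  # False
print(should_alert(committed_tip=120, finalized_tip=80, max_lag_blocks=20))   # True
```

With roughly 30 seconds per proof and a 5 second block time, a lag of about six blocks is expected even when proving is healthy, so a threshold should sit comfortably above that baseline.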
9. Failure Handling
9.1 Irrecoverable Errors
In extremely rare cases, a block may be unprovable due to a genuine fault in its state transition.
In such cases:
- Block production MUST be halted.
- Manual intervention is REQUIRED.
Remediation MAY involve reverting the chain to a prior committed height or canonicalizing the faulty transition via governance or override mechanisms.
9.1.1 Automatic Detection and Alerting
Such irrecoverable errors MUST be automatically detected by the system and raised as an operational alert.
9.1.2 Manual Rollback
The system MUST allow operators to roll back chain state when an error is deemed irrecoverable.
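A rollback to a prior committed height can be sketched as truncating both the committed chain and the set of proven blocks. This is a hypothetical sketch: the `rollback` function is not a defined tool, and the guard that refuses to go below the finalized tip is an assumption based on finalized blocks being described as irreversible.

```python
def rollback(committed: list[int], proven: set[int],
             finalized_tip: int, target_height: int) -> tuple[list[int], set[int]]:
    """Return the committed chain and proven set truncated to `target_height`."""
    if target_height < finalized_tip:
        # Assumption of this sketch: finalized blocks are never rolled back.
        raise ValueError("cannot roll back below the finalized tip")
    kept = [n for n in committed if n <= target_height]
    kept_proven = {n for n in proven if n <= target_height}
    return kept, kept_proven

# Blocks 0..=5 are committed and block 3 is unprovable; roll back to height 2.
chain, proofs = rollback([0, 1, 2, 3, 4, 5], {0, 1, 2, 4}, finalized_tip=2, target_height=2)
print(chain, proofs)  # [0, 1, 2] {0, 1, 2}
```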
9.2 Recoverable Errors
All other proving failures are considered recoverable. Proving MUST resume automatically once the underlying cause is addressed, for example by a patch or configuration change.
Recoverable errors MUST NOT permanently stall block production.
10. Open Questions
- Final scheduler placement: embedded Store task versus standalone service.
11. Plan
11.1 Proving Control Plane
- Implement block proof scheduler.
- a. Define scheduler interface (inputs, outputs, persistence model).
- b. Implement block selection heuristics:
- committed but unproven blocks only
- lowest block number first
- at most one in-flight proof per block
- c. Implement parallel proof job dispatch:
- configurable concurrency limits
- backpressure when prover capacity is saturated
- d. Implement retry and backoff semantics for transient failures.
- e. Implement permanent failure escalation path (unprovable block signal).
Deliverable: Scheduler can continuously drain the committed → proven backlog without affecting block production.
11.2 Proof Persistence
- Integrate proofs into the Store.
- a. Persist proof metadata as required.
- b. Enforce idempotency for duplicate proof submissions.
- c. Prevent state regression or double-proving.
Deliverable: Proven blocks are durably recorded and safe against replay or scheduler restarts.
11.3 Finalization Logic
- Implement finalized chain tip logic.
- a. Compute finalized height as the maximal contiguous prefix of proven blocks.
- b. Ensure finalization is monotonic and deterministic.
- c. Ensure finalization derivation is independent of scheduler ordering.
Deliverable: Finalized chain tip is correct under reordering, retries, and partial proving.
11.4 RPC Finality Semantics
- Extend Store and external RPCs with explicit finality semantics.
- a. Add a `finality` parameter (committed | finalized) to all block-indexed RPCs.
- b. Enforce deterministic errors when requested finality is not satisfied.
- c. Ensure no cross-finality data leakage in RPC responses.
- d. Ensure Block Producer and NTX Builder call paths remain committed-only.
Deliverable: RPC consumers can safely choose between low-latency and fully finalized views.
11.5 Operational Visibility and Safety
- Implement observability and alerting.
- a. Metrics:
- committed vs finalized chain tip
- proving backlog size
- in-flight proofs
- proof latency and failure rates
- b. Alerts when finalized lag exceeds configured thresholds.
Deliverable: Operators can reason about proving health based on metrics and alerts.
11.6 Failure and Recovery Procedures
- Implement state rollback and recovery mechanisms.
- a. Mechanism to halt block production and finalization without bringing the node down.
- b. Tool to execute rollback.
- c. Operational runbook(s).
Deliverable: Operational readiness for irrecoverable errors.