-
Notifications
You must be signed in to change notification settings - Fork 1
Open
Labels
Description
Summary
Two areas lack retry logic for transient network failures:
1. EVM `fetch_logs()` (`adapters/src/evm.rs:91-117`)
Single RPC timeout fails the entire backfill. No exponential backoff or retry budget.
2. Callback delivery (`api/src/main.rs` `fire_callback()`)
Fire-and-forget with a 10-second timeout. If the webhook endpoint is temporarily unavailable, the callback is permanently lost with only a warning log.
Impact
- EVM: A single transient RPC failure aborts a multi-hour backfill that must be restarted from scratch
- Callbacks: Users have no visibility into whether their job completion callback was delivered; they may miss important notifications
Recommended Fix
- Add exponential backoff retry (3 attempts, 100ms/200ms/400ms) to `fetch_logs()`
- Add retry logic with backoff (3 attempts, 500ms/1s/2s) to `fire_callback()`
- Don't retry on 4xx responses (client error), only on network/5xx errors
🤖 Generated with Claude Code
Reactions are currently unavailable