
feat(wasm,cli): implement decentralized challenge evaluation pipeline#6

Open
echobt wants to merge 3 commits into main from feat/decentralized-challenge-impl

Conversation


@echobt echobt commented Feb 18, 2026

Summary

Implements the full decentralized evaluation pipeline for term-challenge-v2, including functional route handlers, LLM review on consensus-selected validators, AST/whitelist validation, submission versioning with hotkey-bound names, timeout handling with validator replacement, decay integration, dataset consensus, and comprehensive documentation.

Changes

WASM Module (wasm/src/)

New Modules:

  • submission.rs — Submission name registration (first-register-owns), versioned submissions with auto-incrementing version numbers, hotkey-bound name resolution
  • llm_review.rs — LLM security review on 3 deterministically-selected validators using host_llm_chat_completion(), with configurable system prompts, majority-vote aggregation, and result storage
  • ast_validation.rs — Python AST/whitelist validation on 3 separate validators, checking imports against configurable allowed modules, detecting forbidden builtins (exec, eval, compile, __import__), dangerous patterns (os.system, subprocess), and enforcing code size limits
  • timeout_handler.rs — Timeout detection for stalled LLM/AST reviews with automatic validator replacement, configurable thresholds (default: 6h evaluation, 3min LLM, 1min AST), max 5 reassignments per submission
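The deterministic, coordination-free validator selection described for llm_review.rs and ast_validation.rs can be sketched roughly as follows. This is a hypothetical illustration, not the module's actual API: select_reviewers, the seed argument, and the use of std's DefaultHasher (a real implementation would likely use a cryptographic hash over consensus data) are all stand-ins.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministically pick `count` reviewers from the validator set.
/// Every node that runs this with the same seed and validator list
/// arrives at the same selection, so no extra coordination round is needed.
fn select_reviewers(validators: &[String], seed: &[u8], count: usize) -> Vec<String> {
    let mut ranked: Vec<(u64, &String)> = validators
        .iter()
        .map(|v| {
            // Rank each validator by hash(seed, hotkey); DefaultHasher::new()
            // uses fixed keys, so the ranking is stable across nodes.
            let mut h = DefaultHasher::new();
            seed.hash(&mut h);
            v.hash(&mut h);
            (h.finish(), v)
        })
        .collect();
    // Sort by hash so the ordering is a pure function of (seed, hotkey).
    ranked.sort();
    ranked.into_iter().take(count).map(|(_, v)| v.clone()).collect()
}
```

Because the seed changes per epoch/submission, the reviewer set rotates while staying verifiable by every peer.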

Enhanced Modules:

  • routes.rs — Complete rewrite with 12 functional route handlers (leaderboard, submissions, submit, dataset, decay, stats, agent code/logs/journey/llm_review) reading from WASM storage via host_storage_get()
  • scoring.rs — Integrated decay into scoring pipeline with top-agent state tracking, grace period, configurable decay curves, and burn to UID 0
  • dataset.rs — Implemented dataset consensus using validator proposals with >50% agreement, deterministic task selection via host_random_seed()
  • agent_storage.rs — Added evaluation status tracking (Pending → LlmReview → AstReview → Evaluating → Completed → Failed), partial results storage, agent journey retrieval
  • lib.rs — Integrated all new modules into evaluate() and validate(), implemented routes() and handle_route() trait methods, added score and submission record storage
  • types.rs — Added types for submission versioning, LLM review results, AST review results, evaluation status, timeout configuration, decay state, route request/response
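The decay shape described for scoring.rs (grace period, configurable curve, floor) can be sketched as a pure function. The name decay_multiplier and the constants below are illustrative only; the real module reads its parameters from config and tracks state per epoch.

```rust
/// Exponential decay with a grace period, clamped to a floor.
/// `epochs_stale`: epochs since the top agent was last displaced.
fn decay_multiplier(
    epochs_stale: f64,
    grace_epochs: f64,
    half_life_epochs: f64,
    min_multiplier: f64,
) -> f64 {
    if epochs_stale <= grace_epochs {
        return 1.0; // no decay inside the grace period
    }
    // Halve the weight every `half_life_epochs` past the grace period.
    let m = 0.5_f64.powf((epochs_stale - grace_epochs) / half_life_epochs);
    m.max(min_multiplier)
}
```

The weight removed by the multiplier is what gets burned to UID 0.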

CLI (cli/src/)

  • rpc.rs — Added fetch_agent_journey(), fetch_submission_history(), fetch_stats(), fetch_decay_status() RPC methods for new route endpoints

Documentation

  • README.md — Comprehensive rewrite with 8 Mermaid diagrams covering system architecture, evaluation pipeline, validator assignment, submission flow, decay mechanism, CLI data flow, agent log consensus, and route architecture
  • docs/architecture.md — Full architecture reference with component diagrams, WASM host function surface, P2P message types, and storage key schema
  • docs/miner/how-to-mine.md — Complete miner guide with prerequisites, agent structure, local testing, submission workflow, environment variables, and troubleshooting
  • docs/miner/submission.md — Submission guide covering name registration, versioning, security review process, and rejection troubleshooting
  • docs/validator/setup.md — Validator setup guide with hardware requirements, configuration, monitoring, and operational procedures

Dependencies

Requires platform-v2 changes from PlatformNetwork/platform-v2#55 for WasmRouteRequest/WasmRouteResponse types and handle_route()/routes() Challenge trait methods.

Summary by CodeRabbit

  • New Features

    • Added CLI commands to fetch agent journey, submission history, statistics, and decay status.
    • Introduced LLM-based code review and AST structural validation for agent submissions.
    • Implemented submission versioning and revision history tracking.
    • Added score decay mechanics and top-agent state tracking.
    • Enabled dataset consensus and randomized task selection.
  • Documentation

    • Added comprehensive architecture and system design guide.
    • Created miner submission and setup walkthroughs.
    • Created validator operational setup and management guide.
    • Enhanced README with detailed evaluation pipeline overview.

…es and enhance existing modules

- Add types: SubmissionName, SubmissionVersion, LlmReviewResult, AstReviewResult,
  EvaluationStatus, TopAgentState, LeaderboardEntry, StatsResponse, TimeoutConfig,
  WhitelistConfig, LlmMessage, LlmRequest, LlmResponse, WasmRouteRequest
- Create submission.rs: name registration, versioned submissions, history
- Create llm_review.rs: LLM code review via host_http_post, reviewer selection,
  result storage/aggregation
- Create ast_validation.rs: Python code validation with whitelist config,
  forbidden builtins, dangerous patterns, import checking
- Create timeout_handler.rs: timeout config, assignment tracking, replacement
  selection
- Enhance scoring.rs: top agent state tracking, epoch decay integration,
  remove dead_code allow
- Enhance dataset.rs: consensus logic, random index generation, proposals
- Enhance agent_storage.rs: evaluation status storage, remove dead_code allows
- Rewrite routes.rs: 24 route definitions, functional handlers for all endpoints
- Update lib.rs: integrate new modules into evaluate(), add routes/handle_route
  methods, store scores and submission records

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

This pull request introduces a comprehensive evaluation pipeline for the Term Challenge with multi-stage validation (AST and LLM review), submission versioning, timeout management, and route-based API handling. Adds four new WASM modules, extends RPC capabilities, and provides detailed operator/miner documentation.

Changes

  • Documentation — README.md, docs/architecture.md, docs/miner/..., docs/validator/setup.md
    Expands README with evaluation pipeline details; introduces a comprehensive architecture document; adds miner how-to-mine and submission guides; provides validator setup and operational procedures.
  • WASM Type System — wasm/src/types.rs
    Adds 14 new public structs and enums for the evaluation pipeline: SubmissionName/Version, LlmReviewResult, AstReviewResult, EvaluationStatus, TopAgentState, TimeoutConfig, WhitelistConfig, LlmMessage/Request/Response, WasmRouteRequest, and related data structures with serialization support.
  • WASM Evaluation Modules — wasm/src/ast_validation.rs, wasm/src/llm_review.rs, wasm/src/submission.rs, wasm/src/timeout_handler.rs
    New modules implementing: Python code AST validation with whitelist config; LLM-based review with deterministic reviewer selection and aggregation; submission registry with versioning and name ownership; timeout tracking and validator replacement logic.
  • WASM State & Storage — wasm/src/agent_storage.rs, wasm/src/scoring.rs, wasm/src/dataset.rs
    Adds evaluation status storage; top-agent state management with epoch decay tracking; dataset proposal consensus and random index generation from the host seed.
  • WASM Routing System — wasm/src/routes.rs, wasm/src/lib.rs
    Implements the route dispatcher (handle_route_request); integrates new modules into the evaluation flow; adds routes() and handle_route() methods to TermChallengeWasm; expands evaluation with AST/LLM validation stages and state transitions.
  • CLI RPC Extension — cli/src/rpc.rs
    Adds four async methods to RpcClient for fetching agent journey, submission history, stats, and decay status via challenge-specific GET requests.

Sequence Diagram(s)

sequenceDiagram
    participant Miner as Miner
    participant WASM as WASM Module
    participant Storage as Host Storage
    participant LLM as LLM Service
    participant Validator as Validator

    Miner->>WASM: submit_versioned(name, hotkey, agent_hash)
    WASM->>Storage: register_submission_name()
    Storage-->>WASM: submission_name
    WASM->>Storage: store submission version
    Storage-->>WASM: ✓
    
    Validator->>WASM: evaluate(submission_data)
    WASM->>WASM: store_evaluation_status(Pending)
    
    WASM->>WASM: validate_python_code() [AST]
    WASM->>Storage: store_ast_result()
    Storage-->>WASM: ✓
    WASM->>WASM: update_evaluation_status(AstReview)
    
    WASM->>WASM: select_reviewers(validators)
    WASM->>LLM: run_llm_review(agent_code)
    LLM-->>WASM: LlmReviewResult
    WASM->>Storage: store_review_result()
    Storage-->>WASM: ✓
    WASM->>WASM: update_evaluation_status(LlmReview)
    
    WASM->>WASM: evaluate_tasks()
    WASM->>WASM: aggregate_reviews()
    WASM->>WASM: apply_epoch_decay(weight)
    WASM->>Storage: store_score(hotkey, final_weight)
    Storage-->>WASM: ✓
    WASM->>WASM: update_evaluation_status(Completed)
    WASM-->>Validator: evaluation_result

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 A bunny hops through validation stages,
With AST checks and LLM sages,
Submissions versioned, timeouts tracked,
Routes now routed, features fully stacked!
🏃‍♂️💨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 2.38%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed. The title 'feat(wasm,cli): implement decentralized challenge evaluation pipeline' clearly and concisely describes the main change, which aligns with the comprehensive changes documented in the raw summary.





@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
wasm/src/agent_storage.rs (1)

62-68: ⚠️ Potential issue | 🔴 Critical

truncate_output panics on multi-byte UTF-8 when max_len splits a codepoint.

&output[..max_len] performs byte-level slicing on a &str. If max_len falls inside a multi-byte UTF-8 character (common with non-ASCII agent output), this will panic at runtime. Since this function gates all task output preview storage in the evaluation pipeline, this is a latent crash risk.

Proposed fix — use a char-boundary-safe truncation
 pub fn truncate_output(output: &str, max_len: usize) -> String {
     if output.len() <= max_len {
         return String::from(output);
     }
-    let truncated = &output[..max_len];
-    String::from(truncated)
+    // Find the largest valid char boundary at or before max_len
+    let end = output.floor_char_boundary(max_len);
+    String::from(&output[..end])
 }

Note: floor_char_boundary is currently unstable (nightly-only). On stable toolchains, use a manual loop:

let mut end = max_len;
while end > 0 && !output.is_char_boundary(end) {
    end -= 1;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/agent_storage.rs` around lines 62 - 68, truncate_output currently
slices the string by bytes (using &output[..max_len]) which can panic on
multi-byte UTF-8 boundaries; update truncate_output to compute a
char-boundary-safe end index (use str::is_char_boundary in a small loop or
str::floor_char_boundary if on Rust ≥1.80) and slice with that boundary so you
return only valid UTF-8 (preserve the existing return type and function name
truncate_output).
🧹 Nitpick comments (5)
wasm/src/scoring.rs (1)

188-196: apply_epoch_decay accepts DecayParams but only uses min_multiplier.

The function signature suggests it applies decay based on the params, but the actual decay calculation was already performed and stored by update_top_agent_state (with hardcoded constants). This function just reads the stored current_burn_percent. The params argument is misleading — it only serves as a floor via min_multiplier.

This isn't broken, but the signature creates a false impression that grace_period_hours and half_life_hours influence the result.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/scoring.rs` around lines 188 - 196, The function apply_epoch_decay
currently takes DecayParams but only reads
get_top_agent_state().current_burn_percent and min_multiplier, which is
misleading; either remove the unused DecayParams parameter from
apply_epoch_decay (and update callers) so the function clearly relies solely on
the stored top-agent state (symbols: apply_epoch_decay, DecayParams,
get_top_agent_state, current_burn_percent, decay_active, min_multiplier), or
change apply_epoch_decay to compute decay from the passed params (compute
multiplier from params.grace_period_hours and params.half_life_hours when
decay_active is true instead of using stored current_burn_percent, and ensure
update_top_agent_state is adjusted accordingly); pick one approach and make
callers and tests consistent.
wasm/src/dataset.rs (1)

73-102: Limited entropy for more than 8 random indices.

The 32-byte seed provides 4-byte (full u32) entropy for only the first 8 indices (i * 4 + 4 <= 32). Beyond that, Line 87 falls back to a single seed byte (0..=255), heavily biasing the index when total_tasks > 256. Given the route description mentions 50-task selection, this fallback will be hit for select_count > 8.

Consider re-seeding or deriving additional entropy (e.g., a simple hash-based expansion of the original seed) when count > 8.
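One way to remove the single-byte fallback is to expand the seed into as many 4-byte words as needed, sketched here with std's DefaultHasher standing in for a proper PRF (the real module would likely use SHA-256 or similar; all names below are illustrative, not the module's actual code):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Expand a 32-byte host seed into `count` full u32 words by hashing
/// (seed, counter) pairs, so index i is never limited to one seed byte.
fn expand_seed(seed: &[u8; 32], count: usize) -> Vec<u32> {
    (0..count)
        .map(|i| {
            let mut h = DefaultHasher::new();
            seed.hash(&mut h);
            (i as u64).hash(&mut h); // counter makes each word independent
            h.finish() as u32 // keep the low 32 bits as one full-entropy word
        })
        .collect()
}

fn generate_random_indices(seed: &[u8; 32], total_tasks: u32, select_count: usize) -> Vec<u32> {
    expand_seed(seed, select_count)
        .into_iter()
        .map(|w| w % total_tasks) // modulo bias is negligible for small task pools
        .collect()
}
```

With this shape, a 50-task selection draws 50 independent u32 words instead of reusing eight words and then single bytes.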

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/dataset.rs` around lines 73 - 102, The generate_random_indices
function currently only uses full u32 entropy for the first eight indices and
falls back to a single seed byte for subsequent indices, causing high bias for
select_count > 8; fix it by expanding the 32-byte seed into as many 4-byte words
as needed (or reseeding) before building indices—for example, derive additional
4-byte blocks via a hash/PRF (SHA256/HMAC or similar) of the original seed
concatenated with a counter and use those 4-byte blocks when computing
idx_bytes, ensuring you update any references to seed slicing in
generate_random_indices so every index uses a full u32 of entropy rather than a
single byte.
wasm/src/ast_validation.rs (2)

27-52: Text-based heuristic, not AST validation — bypassable with indirection.

Despite the module name, validate_python_code uses substring matching, not actual AST parsing. Techniques like getattr(os, 'system')('cmd'), exec("import os"), or aliasing (fn = eval; fn(...)) bypass all checks. Similarly, check_dangerous_patterns won't catch subprocess.run ( written with a space before the paren.

This is fine as a first-pass defense-in-depth layer if the LLM review is the primary security gate. Consider noting the limitation or planning a follow-up to use a real Python AST parser (e.g., via host_sandbox_exec to run a sandboxed AST analysis).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/ast_validation.rs` around lines 27 - 52, The current
validate_python_code function (and helpers check_dangerous_patterns,
check_imports) performs only substring heuristics which are bypassable; either
replace this with real AST-based validation by invoking a sandboxed Python AST
analyzer (e.g., implement a host_sandbox_exec call that runs a small Python
script to parse and inspect ast.Module and return structured violations to be
converted into AstReviewResult), or explicitly document/rename the function to
indicate it's a heuristic (e.g., heuristic_validate_python_code) and add a clear
comment and tests in wasm/src/ast_validation.rs and WhitelistConfig usage
explaining the limitation and that the LLM review or a follow-up AST parser is
required for robust enforcement.

77-109: Import checker misses multi-line and conditional imports.

check_imports processes one line at a time, so it won't catch:

  • Multi-line imports: from os \n import system (backslash continuation)
  • Parenthesized multi-line: from os import (\n system,\n path\n)
  • Imports inside exec() or eval() strings (though __import__ is caught by check_dangerous_patterns)

Low severity given it's defense-in-depth, but worth documenting the known gaps.
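A minimal preprocessing pass along those lines might look like this (a hypothetical sketch; it deliberately ignores parentheses inside strings and comments, which a real pass would need to handle):

```rust
/// Collapse backslash continuations and parenthesized blocks into single
/// logical lines before running per-line import checks.
fn logical_lines(code: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut buf = String::new();
    let mut depth: i32 = 0;
    for line in code.lines() {
        let trimmed = line.trim_end();
        if let Some(stripped) = trimmed.strip_suffix('\\') {
            buf.push_str(stripped);
            buf.push(' ');
            continue; // explicit continuation: keep accumulating
        }
        buf.push_str(trimmed);
        depth += trimmed.chars().filter(|&c| c == '(').count() as i32;
        depth -= trimmed.chars().filter(|&c| c == ')').count() as i32;
        if depth > 0 {
            buf.push(' '); // inside an open paren block: keep accumulating
        } else {
            out.push(std::mem::take(&mut buf));
            depth = 0;
        }
    }
    if !buf.is_empty() {
        out.push(buf);
    }
    out
}
```

The existing per-line import detection can then run unchanged over the logical lines.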

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/ast_validation.rs` around lines 77 - 109, check_imports currently
scans one physical line at a time and misses continued/parenthesized multi-line
imports and imports embedded in exec()/eval() string literals; update
check_imports to first preprocess the input code into logical statements by (a)
joining lines with trailing backslashes to their continuations and (b)
collapsing parenthesized import blocks into single logical lines (e.g., track
open/close parens and accumulate until balanced), then run the existing import
detection logic against those logical lines (using the same root extraction and
is_module_allowed checks to push to violations). Additionally, after
preprocessing scan string literals passed to exec()/eval() (detect patterns like
exec("...") or eval('...')) and apply the same import detection to the string
contents so imports inside dynamic execution are caught; reuse the violations
vector and existing message format (e.g., "Disallowed module: " + root) and keep
is_module_allowed for permission checks.
docs/architecture.md (1)

238-268: Storage key schema is incomplete.

The schema only lists agent and dataset keys. The new modules introduce many more storage keys that callers need to be aware of: name_registry:, submission_versions:, llm_review:, ast_review:, ast_whitelist_config, timeout_config, review_assignment:, review_timeout:, llm_enabled, score:, submission:, eval_status:, dataset_proposals, leaderboard, active_miner_count, validator_count. Consider documenting these for operational reference.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture.md` around lines 238 - 268, Update the "Storage Key Schema"
section to include the missing storage keys introduced by new modules: document
entries for key prefixes such as name_registry:, submission_versions:,
llm_review:, ast_review:, ast_whitelist_config, timeout_config,
review_assignment:, review_timeout:, llm_enabled, score:, submission:,
eval_status:, dataset_proposals, leaderboard, active_miner_count, and
validator_count; for each key include the Key Format (exact prefix), Content
(serialized type or meaning, e.g., submission metadata, review blobs, config
structs, counters), Size or max size (or "Variable" if unbounded), and Module
name (e.g., name_registry, submissions, review, config, leaderboard, metrics),
and update the "Key Encoding" note if any of these use different encodings
(hotkey/epoch/separator) so callers have a complete operational reference for
lookups/sets.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture.md`:
- Around line 46-87: The architecture doc is stale: update the Mermaid diagram
and the Module Responsibilities table to include the four new modules
`ast_validation`, `llm_review`, `submission`, and `timeout_handler`, and revise
`dataset.rs` description to reflect its implemented features (consensus,
proposals, random index generation) instead of "reserved for future
implementation"; edit the flowchart to add nodes for these modules and connect
them to `lib.rs` or relevant modules (e.g., `tasks.rs`/`dataset.rs`), and update
the table rows for each new module with concise purposes and change the
`dataset.rs` row to list “P2P consensus, proposals, random index generation” so
the diagram and table match the code.
- Around line 94-146: Update the "Used by Term Challenge" flags and missing
entry in the host functions tables: mark host_random_seed (signature
host_random_seed(buf: &mut [u8]) → Result<(), i32>) as "Yes" (used by dataset.rs
and llm_review.rs), mark host_consensus_get_submission_count (signature
host_consensus_get_submission_count() → i32) as "Yes" (used by routes.rs
handle_stats), and ensure host_get_timestamp (signature host_get_timestamp() →
i64) is present in the appropriate sandbox/terminal table and marked "Yes" (used
by timeout_handler.rs); edit the entries for platform_sandbox/platform_terminal
and platform_consensus to reflect these changes.

In `@docs/validator/setup.md`:
- Around line 136-142: The direct execution example exposes the secret on the
command line; update the docs so the binary is not given --secret-key
"${VALIDATOR_SECRET_KEY}" directly. Instead show one of two secure options: (1)
set VALIDATOR_SECRET_KEY in the environment and run
./target/release/validator-node without the --secret-key CLI flag, or (2) write
the secret to a protected file and invoke the binary with a --secret-key-file
/path/to/secret (or the program's equivalent file option); adjust the example to
use the env-or-file approach and reference the binary name
./target/release/validator-node and the variable VALIDATOR_SECRET_KEY so readers
know which symbols to use.

In `@README.md`:
- Around line 233-258: Update the README's architecture tree under
term-challenge/wasm/src to include the four missing WASM modules by adding
entries for submission.rs, llm_review.rs, ast_validation.rs, and
timeout_handler.rs alongside the existing files (lib.rs, types.rs, scoring.rs,
tasks.rs, dataset.rs, routes.rs, agent_storage.rs) so the list accurately
reflects all evaluation pipeline source files; ensure the new filenames are
placed in the same indented style as the other wasm/src entries.

In `@wasm/src/lib.rs`:
- Around line 217-228: The code currently treats package_zip as UTF-8 text
(code_str = core::str::from_utf8(&package_zip).unwrap_or("")) so AST/LLM
validators (ast_validation::validate_python_code and subsequent run_llm_review)
receive an empty string; replace this with proper ZIP extraction: open
package_zip as a zip archive, locate the agent Python file(s) (e.g. *.py or a
known path), read their UTF-8 contents into code_str before calling
ast_validation::validate_python_code and ast_validation::store_ast_result; keep
existing failure handling that calls agent_storage::store_evaluation_status and
returns EvaluationOutput::failure when ast_result.passed is false. Ensure you
handle extraction errors (log/store as failure) rather than silently using an
empty string.
- Around line 274-283: The code passes state.epochs_stale (an epoch count) into
apply_decay which expects hours, causing unit mismatch; fix by either converting
epochs to hours before calling apply_decay (use the project's epoch-to-hour
conversion factor) or, better, call the epoch-aware path instead (use
scoring::apply_epoch_decay or the existing apply_epoch_decay function that
accepts epoch counts) so decay uses epoch-based parameters; update the call in
the block that now does apply_decay(epoch_decayed, state.epochs_stale as f64,
decay_params) to use the correct conversion or the epoch-based function and keep
references to params.decay_params, scoring::apply_epoch_decay, apply_decay, and
state.epochs_stale consistent.

In `@wasm/src/llm_review.rs`:
- Around line 138-145: The function redact_api_keys currently only truncates and
does not remove secrets; update redact_api_keys to actually scan and replace
common API key patterns (e.g., strings starting with "sk-", "AKIA", "AIza",
"SG.", "xoxb-", long hex/base64-like tokens, and typical header patterns) using
regex replacements so matched tokens are replaced with a fixed mask like
"[REDACTED_API_KEY]" while preserving surrounding text, and keep the existing
50_000-character truncate logic; alternatively, if you prefer not to implement
redaction, rename the function to truncate_code and update all references to
reflect the new name (redact_api_keys → truncate_code) to avoid implying secret
removal. Ensure changes touch the redact_api_keys function and any callers so
behavior and naming are consistent.
- Around line 106-136: parse_llm_verdict currently never extracts the
"violations" array and uses brittle string matching for fields; update
parse_llm_verdict and extract_json_string to (1) find the "violations" array by
locating the substring "\"violations\"" then the following '[' and parse the
array using a small state machine that recognizes string elements and handles
escaped quotes (e.g., iterate chars, track inside_string and escape states,
collect each full string item into Vec<String>), (2) make approved detection
tolerant of whitespace by locating "\"approved\"" then skipping whitespace and
the ':' and checking for the literal "true"/"false" token (case-sensitive per
JSON) rather than hard-coded substrings, and (3) make extract_json_string accept
optional spaces around ':' and return an unescaped String by scanning from the
opening quote and consuming escaped characters until the matching closing quote;
return None on malformed input. Update the construction of LlmReviewResult in
parse_llm_verdict to populate violations with the parsed Vec<String> so
aggregate_reviews receives real violations.

In `@wasm/src/routes.rs`:
- Around line 222-257: The handle_stats function casts
host_consensus_get_submission_count() (an i32) directly to u64 which will
produce huge values for negative sentinels; change the conversion to clamp
negatives to 0 (or map error sentinel to None/0) before converting to u64 and
use that sanitized value for total_submissions, and remove the unused
host_consensus_get_epoch() call or instead add epoch to the StatsResponse struct
and populate it (so either drop the host call or wire epoch into StatsResponse)
to avoid an unnecessary host round-trip; update references to StatsResponse,
handle_stats, host_consensus_get_submission_count, and host_consensus_get_epoch
accordingly.
- Around line 158-206: The POST routes are dispatched without any auth info; add
an auth_hotkey: Option<String> field to the WasmRouteRequest struct and update
handle_route_request to check authorization for mutating endpoints (e.g.,
handle_set_timeout_config, handle_set_whitelist_config, handle_dataset_propose,
handle_timeout_record, handle_timeout_mark) before calling them: implement or
call a central is_authorized_hotkey(&Option<String>) (or similar) helper that
validates the hotkey against the authorized set and return an empty Vec<u8> (or
an auth error payload) when unauthorized; ensure you pass the request.body
and/or auth_hotkey into handlers that need caller identity and update those
handler signatures (e.g., handle_set_timeout_config, handle_dataset_propose,
handle_timeout_record, handle_timeout_mark) to accept the hotkey so they can
perform any audit/mutation with the caller identity.
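The clamping fix suggested for handle_stats is small enough to show in full (the function name is illustrative):

```rust
/// The host count call returns i32 and may use negative values as error
/// sentinels; clamp before widening to u64 so an error surfaces as 0
/// rather than a number near u64::MAX in the stats payload.
fn sanitize_count(raw: i32) -> u64 {
    raw.max(0) as u64
}
```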

In `@wasm/src/scoring.rs`:
- Around line 149-186: update_top_agent_state uses hardcoded epoch-based decay
constants (grace_epochs = 60, half_life_epochs = 20.0) and computes
state.epochs_stale in epochs, but elsewhere decay logic (apply_decay /
DecayParams) expects hours, causing a semantic mismatch; fix by converting
epochs to hours (use the chain's epoch-duration-in-seconds or a single
EPOCH_HOURS constant) or derive epoch-based grace/half-life from DecayParams
(grace_hours and half_life_hours) before computing multiplier: replace the
hardcoded grace_epochs and half_life_epochs with values computed from
DecayParams (or multiply epochs_stale by epoch_duration_hours) and ensure
apply_decay/get_top_agent_state/state.epochs_stale use the same time unit
throughout (reference functions/structs: update_top_agent_state,
get_top_agent_state, apply_decay, DecayParams, host_consensus_get_epoch,
state.epochs_stale).

In `@wasm/src/timeout_handler.rs`:
- Around line 54-61: The subtraction (current_time - assigned_time) in the
timeout check can be negative and casting that i64 to u64 will wrap to a huge
value, causing false timeouts; update the logic in the block that reads from
host_storage_get and uses host_get_timestamp so you compute elapsed safely
(e.g., use checked_sub/saturating_sub or explicitly return false when
current_time < assigned_time) before casting to u64, then compare that
non-negative elapsed against timeout_ms; adjust uses of assigned_time,
current_time, and timeout_ms accordingly to avoid signed-to-unsigned wraparound.

---

Outside diff comments:
In `@wasm/src/agent_storage.rs`:
- Around line 62-68: truncate_output currently slices the string by bytes (using
&output[..max_len]) which can panic on multi-byte UTF-8 boundaries; update
truncate_output to compute a char-boundary-safe end index (use
str::is_char_boundary in a small loop or str::floor_char_boundary if on Rust
≥1.80) and slice with that boundary so you return only valid UTF-8 (preserve the
existing return type and function name truncate_output).

---

Nitpick comments:
In `@docs/architecture.md`:
- Around line 238-268: Update the "Storage Key Schema" section to include the
missing storage keys introduced by new modules: document entries for key
prefixes such as name_registry:, submission_versions:, llm_review:, ast_review:,
ast_whitelist_config, timeout_config, review_assignment:, review_timeout:,
llm_enabled, score:, submission:, eval_status:, dataset_proposals, leaderboard,
active_miner_count, and validator_count; for each key include the Key Format
(exact prefix), Content (serialized type or meaning, e.g., submission metadata,
review blobs, config structs, counters), Size or max size (or "Variable" if
unbounded), and Module name (e.g., name_registry, submissions, review, config,
leaderboard, metrics), and update the "Key Encoding" note if any of these use
different encodings (hotkey/epoch/separator) so callers have a complete
operational reference for lookups/sets.

In `@wasm/src/ast_validation.rs`:
- Around line 27-52: The current validate_python_code function (and helpers
check_dangerous_patterns, check_imports) performs only substring heuristics
which are bypassable; either replace this with real AST-based validation by
invoking a sandboxed Python AST analyzer (e.g., implement a host_sandbox_exec
call that runs a small Python script to parse and inspect ast.Module and return
structured violations to be converted into AstReviewResult), or explicitly
document/rename the function to indicate it's a heuristic (e.g.,
heuristic_validate_python_code) and add a clear comment and tests in
wasm/src/ast_validation.rs and WhitelistConfig usage explaining the limitation
and that the LLM review or a follow-up AST parser is required for robust
enforcement.
- Around line 77-109: check_imports currently scans one physical line at a time
and misses continued/parenthesized multi-line imports and imports embedded in
exec()/eval() string literals; update check_imports to first preprocess the
input code into logical statements by (a) joining lines with trailing
backslashes to their continuations and (b) collapsing parenthesized import
blocks into single logical lines (e.g., track open/close parens and accumulate
until balanced), then run the existing import detection logic against those
logical lines (using the same root extraction and is_module_allowed checks to
push to violations). Additionally, after preprocessing scan string literals
passed to exec()/eval() (detect patterns like exec("...") or eval('...')) and
apply the same import detection to the string contents so imports inside dynamic
execution are caught; reuse the violations vector and existing message format
(e.g., "Disallowed module: " + root) and keep is_module_allowed for permission
checks.
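If the sandboxed-AST route from the first point is chosen, one possible shape is a Rust helper that ships a real `ast`-based Python script to the sandbox. Everything here is illustrative: the request layout, `build_ast_check_request`, and the env-var whitelist channel are invented, since `host_sandbox_exec`'s actual payload schema isn't shown.

```rust
// Hypothetical sketch: replace substring heuristics with a real AST pass by
// sending a Python script through host_sandbox_exec. The JSON request layout
// below is invented for illustration; platform-v2's schema may differ.
const AST_CHECK_SCRIPT: &str = r#"
import ast, json, os, sys

code = sys.stdin.read()
allowed = set(os.environ.get("ALLOWED_MODULES", "").split(","))
violations = []
try:
    tree = ast.parse(code)
except SyntaxError as exc:
    violations.append(f"syntax error: {exc}")
    tree = None
if tree is not None:
    for node in ast.walk(tree):
        roots = []
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            roots = [(node.module or "").split(".")[0]]
        for root in roots:
            if root and root not in allowed:
                violations.append(f"Disallowed module: {root}")
print(json.dumps(violations))
"#;

/// Build the (illustrative) sandbox request: interpreter + script + stdin + env.
fn build_ast_check_request(code: &str, allowed_modules: &[&str]) -> String {
    format!(
        r#"{{"interpreter":"python3","script":{script:?},"stdin":{code:?},"env":{{"ALLOWED_MODULES":{allowed:?}}}}}"#,
        script = AST_CHECK_SCRIPT,
        code = code,
        allowed = allowed_modules.join(","),
    )
}
```

The returned JSON array of violation strings would then be mapped into `AstReviewResult` on the WASM side.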
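The preprocessing step described in the second point (join backslash continuations, collapse parenthesized blocks into one logical line) can be sketched as follows. Note the paren counter deliberately ignores parentheses inside string literals, which a fuller implementation would need to track:

```rust
/// Collapse physical lines into logical statements: join trailing-backslash
/// continuations and accumulate parenthesized blocks until parens balance.
/// (Simplification: parens inside string literals are not excluded.)
fn logical_lines(code: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut buf = String::new();
    let mut depth: i64 = 0;
    for raw in code.lines() {
        let line = raw.trim_end();
        if let Some(stripped) = line.strip_suffix('\\') {
            // Backslash continuation: keep accumulating.
            buf.push_str(stripped);
            buf.push(' ');
            continue;
        }
        buf.push_str(line);
        depth += line.matches('(').count() as i64;
        depth -= line.matches(')').count() as i64;
        if depth > 0 {
            // Still inside an open parenthesized block (e.g. multi-line import).
            buf.push(' ');
        } else {
            out.push(std::mem::take(&mut buf));
            depth = 0;
        }
    }
    if !buf.is_empty() {
        out.push(buf);
    }
    out
}
```

The existing import detection (root extraction plus `is_module_allowed`) would then run over these logical lines instead of raw physical lines.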

In `@wasm/src/dataset.rs`:
- Around line 73-102: The generate_random_indices function currently only uses
full u32 entropy for the first eight indices and falls back to a single seed
byte for subsequent indices, causing high bias for select_count > 8; fix it by
expanding the 32-byte seed into as many 4-byte words as needed (or reseeding)
before building indices—for example, derive additional 4-byte blocks via a
hash/PRF (SHA256/HMAC or similar) of the original seed concatenated with a
counter and use those 4-byte blocks when computing idx_bytes, ensuring you
update any references to seed slicing in generate_random_indices so every index
uses a full u32 of entropy rather than a single byte.
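A sketch of the counter-based seed expansion: FNV-1a over `(seed || counter)` stands in for the real PRF here (SHA-256/HMAC, as suggested above), and the function names are illustrative rather than the crate's actual signatures.

```rust
/// Expand a 32-byte seed into `count` u32 words so every selected index gets
/// a full word of entropy. FNV-1a is a stand-in PRF for illustration only;
/// production code should derive blocks with SHA-256/HMAC.
fn expand_seed_words(seed: &[u8; 32], count: usize) -> Vec<u32> {
    let mut words = Vec::with_capacity(count);
    for counter in 0..count as u64 {
        let mut h: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
        for &b in seed.iter().chain(counter.to_le_bytes().iter()) {
            h ^= u64::from(b);
            h = h.wrapping_mul(0x100000001b3); // FNV-1a prime
        }
        // Fold the 64-bit hash down to a single u32 word.
        words.push((h >> 32) as u32 ^ (h as u32));
    }
    words
}

/// Derive `select_count` indices in [0, pool_size) from the expanded words.
/// (Plain modulo retains slight bias for pool sizes not dividing 2^32.)
fn generate_random_indices(seed: &[u8; 32], pool_size: u32, select_count: usize) -> Vec<u32> {
    expand_seed_words(seed, select_count)
        .into_iter()
        .map(|w| w % pool_size)
        .collect()
}
```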

In `@wasm/src/scoring.rs`:
- Around line 188-196: The function apply_epoch_decay currently takes
DecayParams but only reads get_top_agent_state().current_burn_percent and
min_multiplier, which is misleading; either remove the unused DecayParams
parameter from apply_epoch_decay (and update callers) so the function clearly
relies solely on the stored top-agent state (symbols: apply_epoch_decay,
DecayParams, get_top_agent_state, current_burn_percent, decay_active,
min_multiplier), or change apply_epoch_decay to compute decay from the passed
params (compute multiplier from params.grace_period_hours and
params.half_life_hours when decay_active is true instead of using stored
current_burn_percent, and ensure update_top_agent_state is adjusted
accordingly); pick one approach and make callers and tests consistent.
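The second option (compute decay from the passed params) could look like the sketch below. The struct fields mirror the names in this comment (`grace_period_hours`, `half_life_hours`, `min_multiplier`) and are not the crate's actual `DecayParams` definition:

```rust
/// Illustrative DecayParams; field names follow the review comment, not the
/// crate's real type.
struct DecayParams {
    grace_period_hours: f64,
    half_life_hours: f64,
    min_multiplier: f64,
}

/// Multiplier derived purely from params: 1.0 during the grace period, then
/// exponential half-life decay floored at min_multiplier.
fn decay_multiplier(params: &DecayParams, hours_as_top_agent: f64) -> f64 {
    if hours_as_top_agent <= params.grace_period_hours {
        return 1.0; // still within the grace period, no decay
    }
    let elapsed = hours_as_top_agent - params.grace_period_hours;
    let m = 0.5_f64.powf(elapsed / params.half_life_hours);
    m.max(params.min_multiplier)
}
```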

Comment on lines +46 to +87
```mermaid
flowchart TB
subgraph "term-challenge-wasm (no_std)"
Lib[lib.rs<br/>Challenge trait impl]
Types[types.rs<br/>Submission, TaskResult,<br/>ChallengeParams, DecayParams]
Scoring[scoring.rs<br/>Aggregate scoring,<br/>decay, weight calc]
Tasks[tasks.rs<br/>Active dataset<br/>management]
Dataset[dataset.rs<br/>Dataset selection<br/>consensus logic]
Routes[routes.rs<br/>Route definitions<br/>for RPC]
Storage[agent_storage.rs<br/>Code, hash, log<br/>storage functions]
end

Lib --> Types
Lib --> Scoring
Lib --> Storage
Lib --> Tasks
Tasks --> Dataset
Lib --> Routes

subgraph "Host Functions (platform-v2)"
HStorage[host_storage_get/set]
HHttp[host_http_post]
HEpoch[host_consensus_get_epoch]
end

Lib --> HStorage
Lib --> HHttp
Lib --> HEpoch
```

### Module Responsibilities

| Module | Purpose |
| --- | --- |
| `lib.rs` | Implements the `Challenge` trait: `validate()`, `evaluate()`, `tasks()`, `configure()` |
| `types.rs` | All data structures: `Submission`, `TaskResult`, `ChallengeParams`, `DecayParams`, `AgentLogs`, `RouteDefinition` |
| `scoring.rs` | Score aggregation by difficulty, pass rate calculation, decay application, weight conversion |
| `tasks.rs` | Active dataset storage/retrieval, dataset history management |
| `dataset.rs` | P2P dataset consensus logic (reserved for future implementation) |
| `routes.rs` | Route definitions for challenge RPC endpoints |
| `agent_storage.rs` | Agent code, hash, and log storage with size limits |


⚠️ Potential issue | 🟡 Minor

Architecture diagram and module table are out of date with the code.

The WASM module diagram (lines 46–74) and the module responsibilities table (lines 78–87) omit the four new modules added in this PR: ast_validation, llm_review, submission, and timeout_handler. Additionally, line 84 describes dataset.rs as "reserved for future implementation" while the code now fully implements consensus, proposals, and random index generation.

Please update both the diagram and the table to reflect the actual module structure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture.md` around lines 46 - 87, The architecture doc is stale:
update the Mermaid diagram and the Module Responsibilities table to include the
four new modules `ast_validation`, `llm_review`, `submission`, and
`timeout_handler`, and revise `dataset.rs` description to reflect its
implemented features (consensus, proposals, random index generation) instead of
"reserved for future implementation"; edit the flowchart to add nodes for these
modules and connect them to `lib.rs` or relevant modules (e.g.,
`tasks.rs`/`dataset.rs`), and update the table rows for each new module with
concise purposes and change the `dataset.rs` row to list “P2P consensus,
proposals, random index generation” so the diagram and table match the code.

Comment on lines +94 to +146
### Network Functions (`platform_network`)

| Function | Signature | Description | Used by Term Challenge |
| --- | --- | --- | --- |
| `host_http_get` | `(request: &[u8]) → Result<Vec<u8>, i32>` | HTTP GET request | No |
| `host_http_post` | `(request: &[u8], body: &[u8]) → Result<Vec<u8>, i32>` | HTTP POST request | Yes (LLM judge) |
| `host_dns_resolve` | `(request: &[u8]) → Result<Vec<u8>, i32>` | DNS resolution | No |

### Storage Functions (`platform_storage`)

| Function | Signature | Description | Used by Term Challenge |
| --- | --- | --- | --- |
| `host_storage_get` | `(key: &[u8]) → Result<Vec<u8>, i32>` | Read from blockchain storage | Yes |
| `host_storage_set` | `(key: &[u8], value: &[u8]) → Result<(), i32>` | Write to blockchain storage | Yes |

### Terminal Functions (`platform_terminal`)

| Function | Signature | Description | Used by Term Challenge |
| --- | --- | --- | --- |
| `host_terminal_exec` | `(request: &[u8]) → Result<Vec<u8>, i32>` | Execute terminal command | No |
| `host_read_file` | `(path: &[u8]) → Result<Vec<u8>, i32>` | Read file contents | No |
| `host_write_file` | `(path: &[u8], data: &[u8]) → Result<(), i32>` | Write file contents | No |
| `host_list_dir` | `(path: &[u8]) → Result<Vec<u8>, i32>` | List directory contents | No |
| `host_get_time` | `() → i64` | Get current timestamp | No |
| `host_random_seed` | `(buf: &mut [u8]) → Result<(), i32>` | Fill buffer with random bytes | No |

### Sandbox Functions (`platform_sandbox`)

| Function | Signature | Description | Used by Term Challenge |
| --- | --- | --- | --- |
| `host_sandbox_exec` | `(request: &[u8]) → Result<Vec<u8>, i32>` | Execute in sandbox | No |
| `host_get_timestamp` | `() → i64` | Get sandbox timestamp | No |
| `host_log` | `(level: u8, msg: &str) → ()` | Log a message | No |

### LLM Functions (`platform_llm`)

| Function | Signature | Description | Used by Term Challenge |
| --- | --- | --- | --- |
| `host_llm_chat_completion` | `(request: &[u8]) → Result<Vec<u8>, i32>` | LLM chat completion | No (uses HTTP post instead) |
| `host_llm_is_available` | `() → bool` | Check LLM availability | No |

### Consensus Functions (`platform_consensus`)

| Function | Signature | Description | Used by Term Challenge |
| --- | --- | --- | --- |
| `host_consensus_get_epoch` | `() → i64` | Get current epoch number | Yes |
| `host_consensus_get_validators` | `() → Result<Vec<u8>, i32>` | Get validator list | No |
| `host_consensus_propose_weight` | `(uid: i32, weight: i32) → Result<(), i32>` | Propose a weight | No |
| `host_consensus_get_votes` | `() → Result<Vec<u8>, i32>` | Get consensus votes | No |
| `host_consensus_get_state_hash` | `() → Result<[u8; 32], i32>` | Get state hash | No |
| `host_consensus_get_submission_count` | `() → i32` | Get submission count | No |
| `host_consensus_get_block_height` | `() → i64` | Get block height | No |


⚠️ Potential issue | 🟡 Minor

Host function "Used by Term Challenge" flags are stale.

  • Line 118: host_random_seed is marked "No" but is used in dataset.rs and llm_review.rs.
  • Line 144: host_consensus_get_submission_count is marked "No" but is used in routes.rs (handle_stats).
  • host_get_timestamp (from platform_terminal or platform_sandbox) isn't listed but is used by timeout_handler.rs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture.md` around lines 94 - 146, Update the "Used by Term
Challenge" flags and missing entry in the host functions tables: mark
host_random_seed (signature host_random_seed(buf: &mut [u8]) → Result<(), i32>)
as "Yes" (used by dataset.rs and llm_review.rs), mark
host_consensus_get_submission_count (signature
host_consensus_get_submission_count() → i32) as "Yes" (used by routes.rs
handle_stats), and ensure host_get_timestamp (signature host_get_timestamp() →
i64) is present in the appropriate sandbox/terminal table and marked "Yes" (used
by timeout_handler.rs); edit the entries for platform_sandbox/platform_terminal
and platform_consensus to reflect these changes.

Comment on lines +136 to +142
### Direct Execution

```bash
./target/release/validator-node \
  --data-dir ./data \
  --secret-key "${VALIDATOR_SECRET_KEY}"
```

⚠️ Potential issue | 🟠 Major

Direct execution example also exposes the secret key on the command line.

Same issue as the systemd unit — "${VALIDATOR_SECRET_KEY}" on the CLI is visible to other users. Prefer reading from an env var or file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/validator/setup.md` around lines 136 - 142, The direct execution example
exposes the secret on the command line; update the docs so the binary is not
given --secret-key "${VALIDATOR_SECRET_KEY}" directly. Instead show one of two
secure options: (1) set VALIDATOR_SECRET_KEY in the environment and run
./target/release/validator-node without the --secret-key CLI flag, or (2) write
the secret to a protected file and invoke the binary with a --secret-key-file
/path/to/secret (or the program's equivalent file option); adjust the example to
use the env-or-file approach and reference the binary name
./target/release/validator-node and the variable VALIDATOR_SECRET_KEY so readers
know which symbols to use.
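A sketch of the file-based approach (note that `--secret-key-file` is hypothetical; substitute whatever env or file option the binary actually offers):

```shell
# Keep the secret out of argv (argv is visible to other users via `ps`):
# store it once in an owner-only 0600 file.
SECRET_FILE="${SECRET_FILE:-$HOME/.validator/secret.key}"
mkdir -p "$(dirname "$SECRET_FILE")"
install -m 600 /dev/null "$SECRET_FILE"
printf '%s' "${VALIDATOR_SECRET_KEY:-}" > "$SECRET_FILE"

# Then start the node without the secret on the command line, either via the
# environment (if the binary reads VALIDATOR_SECRET_KEY itself) or via a
# file flag (--secret-key-file is hypothetical):
#   ./target/release/validator-node --data-dir ./data
#   ./target/release/validator-node --data-dir ./data --secret-key-file "$SECRET_FILE"
```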

Comment on lines 233 to 258
```
term-challenge/
├── wasm/                        # WASM evaluation module
│   └── src/
│       ├── lib.rs               # Challenge trait implementation (validate + evaluate)
│       ├── types.rs             # Submission, task, config, route, and log types
│       ├── scoring.rs           # Score aggregation, decay, and weight calculation
│       ├── tasks.rs             # Active dataset management and history
│       ├── dataset.rs           # Dataset selection and P2P consensus logic
│       ├── routes.rs            # WASM route definitions for RPC (handle_route)
│       └── agent_storage.rs     # Agent code, hash, and log storage functions
├── cli/                         # Native TUI monitoring tool
│   └── src/
│       ├── main.rs              # Entry point, event loop
│       ├── app.rs               # Application state
│       ├── ui.rs                # Ratatui UI rendering
│       └── rpc.rs               # JSON-RPC 2.0 client
├── docs/
│   ├── architecture.md          # System architecture and internals
│   ├── miner/
│   │   ├── how-to-mine.md       # Complete miner guide
│   │   └── submission.md        # Submission format and review process
│   └── validator/
│       └── setup.md             # Validator setup and operations
├── AGENTS.md                    # Development guide
└── README.md
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check what .rs files actually exist in wasm/src/
echo "=== Actual .rs files in wasm/src/ ==="
fd -e rs . wasm/src/ --max-depth 1 | sort

echo -e "\n=== README.md lines 233-258 ==="
sed -n '233,258p' README.md
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1659


Add the four new WASM modules to the architecture tree in README.md.

The README lists only 7 WASM source files but 11 actually exist. The missing modules—submission.rs, llm_review.rs, ast_validation.rs, and timeout_handler.rs—are core components of the evaluation pipeline and must be documented.

Proposed fix
```diff
 │       ├── routes.rs            # WASM route definitions for RPC (handle_route)
-│       └── agent_storage.rs     # Agent code, hash, and log storage functions
+│       ├── agent_storage.rs     # Agent code, hash, and log storage functions
+│       ├── submission.rs        # Submission name registration and versioning
+│       ├── llm_review.rs        # LLM-based code review with majority-vote
+│       ├── ast_validation.rs    # AST/whitelist structural validation
+│       └── timeout_handler.rs   # Reviewer timeout and replacement handling
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```
term-challenge/
├── wasm/                        # WASM evaluation module
│   └── src/
│       ├── lib.rs               # Challenge trait implementation (validate + evaluate)
│       ├── types.rs             # Submission, task, config, route, and log types
│       ├── scoring.rs           # Score aggregation, decay, and weight calculation
│       ├── tasks.rs             # Active dataset management and history
│       ├── dataset.rs           # Dataset selection and P2P consensus logic
│       ├── routes.rs            # WASM route definitions for RPC (handle_route)
│       ├── agent_storage.rs     # Agent code, hash, and log storage functions
│       ├── submission.rs        # Submission name registration and versioning
│       ├── llm_review.rs        # LLM-based code review with majority-vote
│       ├── ast_validation.rs    # AST/whitelist structural validation
│       └── timeout_handler.rs   # Reviewer timeout and replacement handling
├── cli/                         # Native TUI monitoring tool
│   └── src/
│       ├── main.rs              # Entry point, event loop
│       ├── app.rs               # Application state
│       ├── ui.rs                # Ratatui UI rendering
│       └── rpc.rs               # JSON-RPC 2.0 client
├── docs/
│   ├── architecture.md          # System architecture and internals
│   ├── miner/
│   │   ├── how-to-mine.md       # Complete miner guide
│   │   └── submission.md        # Submission format and review process
│   └── validator/
│       └── setup.md             # Validator setup and operations
├── AGENTS.md                    # Development guide
└── README.md
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 233 - 258, Update the README's architecture tree
under term-challenge/wasm/src to include the four missing WASM modules by adding
entries for submission.rs, llm_review.rs, ast_validation.rs, and
timeout_handler.rs alongside the existing files (lib.rs, types.rs, scoring.rs,
tasks.rs, dataset.rs, routes.rs, agent_storage.rs) so the list accurately
reflects all evaluation pipeline source files; ensure the new filenames are
placed in the same indented style as the other wasm/src entries.

Comment on lines +217 to +228
```rust
let whitelist_config = ast_validation::get_whitelist_config();
let code_str = core::str::from_utf8(&package_zip).unwrap_or("");
let ast_result = ast_validation::validate_python_code(code_str, &whitelist_config);
let _ = ast_validation::store_ast_result(&agent_hash, &ast_result);
if !ast_result.passed {
    let _ = agent_storage::store_evaluation_status(
        &miner_hotkey,
        epoch,
        EvaluationStatus::Failed,
    );
    return EvaluationOutput::failure("AST validation failed");
}
```

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

```bash
rg -n 'package_zip' --type=rust -C 3
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 3078


🏁 Script executed:

```bash
# Also find where Submission struct is defined
rg -n 'struct Submission' --type=rust -C 5
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1408


🏁 Script executed:

```bash
# Check the AST validation implementation
rg -n 'fn validate_python_code' --type=rust -C 10
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1374


🏁 Script executed:

```bash
# Look for ZIP extraction/unpacking logic
rg -n 'zip|unzip|extract' --type=rust -i
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1325


🏁 Script executed:

```bash
# Check routes.rs for submission endpoint description and documentation
cat -n wasm/src/routes.rs | head -60
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 2807


🏁 Script executed:

```bash
# Look for any documentation about the submission format or package contents
rg -n 'description|comment|doc|package' wasm/src/routes.rs -B 2 -A 2
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 6264


🏁 Script executed:

```bash
# Check if there's CLI code that creates submissions and what format it uses
fd . --type=f -name "*.rs" -path "*/cli/*" | head -20
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 307


🏁 Script executed:

```bash
# Search for any ZIP-related imports or usage anywhere in the codebase
rg -n 'use.*zip|extern.*zip|::zip' --type=rust
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
# Find CLI code
find . -type f -name "*.rs" | xargs grep -l "cli" | head -10
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 111


🏁 Script executed:

```bash
# Search for ZIP file handling or extraction in the entire codebase
find . -type f \( -name "*.rs" -o -name "*.toml" \) | xargs grep -i "zip\|archive" 2>/dev/null | head -20
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1109


🏁 Script executed:

```bash
# Check the llm_review module since it also receives code_str
cat -n wasm/src/llm_review.rs | head -80
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 3426


🏁 Script executed:

```bash
# Look for where package_zip gets populated - check if there's any serialization/deserialization
rg -n 'run_llm_review|validate_python_code' --type=rust -B 5 -A 10
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 3986


🏁 Script executed:

```bash
# Check CLI code to see how package_zip is created/populated
cat -n cli/src/main.rs | head -100
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 3504


🏁 Script executed:

```bash
# Look for where Submission struct is created in CLI
rg -n 'Submission {' cli/src/ -A 10
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 164


🏁 Script executed:

```bash
# Check if there's any reference to file reading, ZIP creation, or encoding in CLI
rg -n 'read|zip|encode|package' cli/src/ -i -B 2 -A 2
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 351


🏁 Script executed:

```bash
# Search more broadly for Submission creation
rg -n 'Submission\s*{' cli/src/ --type=rust -A 15
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 168


🏁 Script executed:

```bash
# Check if there are other executables or scripts that might create submissions
find . -type f \( -name "*.rs" -o -name "*.py" -o -name "*.sh" \) | xargs grep -l "package_zip\|Submission" 2>/dev/null
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 221


🏁 Script executed:

```bash
# Look at the submission module in WASM to understand what format is expected
cat -n wasm/src/submission.rs
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 3852


🏁 Script executed:

```bash
# Check if there's any documentation or README explaining the submission format
find . -name "README*" -o -name "*.md" | head -5
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 166


🏁 Script executed:

```bash
# Check architecture documentation to understand submission format
cat docs/architecture.md
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 12767


🏁 Script executed:

```bash
# Look for where Submission struct is instantiated in CLI
rg -n 'package_zip' cli/src/ -B 5 -A 5
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
# Check the app.rs which likely has submission creation logic
grep -n 'Submission\|package' cli/src/app.rs | head -30
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 256


Extract Python source from the ZIP package before AST/LLM validation — binary ZIP data fails UTF-8 conversion and results in an empty string.

Line 218 converts package_zip (documented as "Raw ZIP package bytes" in architecture) via core::str::from_utf8(&package_zip).unwrap_or(""). Since ZIP files are binary, this conversion fails and returns an empty string. Both validate_python_code and run_llm_review then operate on an empty string, trivially passing both security checks without inspecting the actual agent code.

Implement ZIP extraction to obtain the Python source code before validation, or—if the field is intended to always contain raw UTF-8 source—update the field name, documentation, and routes description accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/lib.rs` around lines 217 - 228, The code currently treats
package_zip as UTF-8 text (code_str =
core::str::from_utf8(&package_zip).unwrap_or("")) so AST/LLM validators
(ast_validation::validate_python_code and subsequent run_llm_review) receive an
empty string; replace this with proper ZIP extraction: open package_zip as a zip
archive, locate the agent Python file(s) (e.g. *.py or a known path), read their
UTF-8 contents into code_str before calling ast_validation::validate_python_code
and ast_validation::store_ast_result; keep existing failure handling that calls
agent_storage::store_evaluation_status and returns EvaluationOutput::failure
when ast_result.passed is false. Ensure you handle extraction errors (log/store
as failure) rather than silently using an empty string.
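One way to close the gap, assuming no ZIP host function is available, is a minimal parser for stored (uncompressed) local-file entries that collects the `.py` sources before `validate_python_code` runs; `extract_python_sources` is an invented name, and a production version should use a real ZIP library:

```rust
/// Minimal sketch: pull UTF-8 Python sources out of a ZIP byte buffer before
/// AST/LLM validation. Only *stored* (method 0) local-file entries are
/// handled, and entries using data descriptors (flag bit 3) are not; a real
/// implementation needs a full ZIP parser with DEFLATE support.
fn extract_python_sources(package_zip: &[u8]) -> Vec<(String, String)> {
    let mut sources = Vec::new();
    let mut pos = 0usize;
    while pos + 30 <= package_zip.len() {
        // Local file header: "PK\x03\x04" signature, then fixed-size fields.
        if &package_zip[pos..pos + 4] != b"PK\x03\x04" {
            break; // central directory (or garbage) reached
        }
        let method = u16::from_le_bytes([package_zip[pos + 8], package_zip[pos + 9]]);
        let comp_size =
            u32::from_le_bytes(package_zip[pos + 18..pos + 22].try_into().unwrap()) as usize;
        let name_len = u16::from_le_bytes([package_zip[pos + 26], package_zip[pos + 27]]) as usize;
        let extra_len = u16::from_le_bytes([package_zip[pos + 28], package_zip[pos + 29]]) as usize;
        let name_start = pos + 30;
        let data_start = name_start + name_len + extra_len;
        if data_start + comp_size > package_zip.len() {
            break; // truncated archive
        }
        let name =
            String::from_utf8_lossy(&package_zip[name_start..name_start + name_len]).into_owned();
        if method == 0 && name.ends_with(".py") {
            if let Ok(body) = std::str::from_utf8(&package_zip[data_start..data_start + comp_size]) {
                sources.push((name, body.to_owned()));
            }
        }
        pos = data_start + comp_size;
    }
    sources
}
```

An empty result (no extractable `.py` entries) should then be treated as a validation failure rather than a pass.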

Comment on lines +138 to +145
```rust
fn redact_api_keys(code: &str) -> String {
    let mut result = String::from(code);
    if result.len() > 50_000 {
        result.truncate(50_000);
        result.push_str("\n... [truncated]");
    }
    result
}
```

⚠️ Potential issue | 🟠 Major

redact_api_keys doesn't redact anything — it only truncates.

The function name implies sensitive key redaction, but it only truncates the code at 50,000 chars. There is no pattern matching or replacement of API keys/secrets. Either implement actual redaction (e.g., replace strings matching sk-*, AKIA*, etc.) or rename the function to truncate_code to avoid a false sense of security.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/llm_review.rs` around lines 138 - 145, The function redact_api_keys
currently only truncates and does not remove secrets; update redact_api_keys to
actually scan and replace common API key patterns (e.g., strings starting with
"sk-", "AKIA", "AIza", "SG.", "xoxb-", long hex/base64-like tokens, and typical
header patterns) using regex replacements so matched tokens are replaced with a
fixed mask like "[REDACTED_API_KEY]" while preserving surrounding text, and keep
the existing 50_000-character truncate logic; alternatively, if you prefer not
to implement redaction, rename the function to truncate_code and update all
references to reflect the new name (redact_api_keys → truncate_code) to avoid
implying secret removal. Ensure changes touch the redact_api_keys function and
any callers so behavior and naming are consistent.
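A regex-free sketch of the first option (the prefix list and length threshold below are illustrative, not exhaustive):

```rust
/// Redact likely API keys before the code leaves the module, then truncate.
/// The prefix list is illustrative; extend it with whatever providers matter.
fn redact_api_keys(code: &str) -> String {
    const PREFIXES: [&str; 3] = ["sk-", "AKIA", "xoxb-"];
    let bytes = code.as_bytes();
    let mut out = String::with_capacity(code.len());
    let mut i = 0;
    'outer: while i < bytes.len() {
        for prefix in PREFIXES {
            if code[i..].starts_with(prefix) {
                // Consume the prefix plus the token body (alnum, '-', '_').
                let mut j = i + prefix.len();
                while j < bytes.len()
                    && (bytes[j].is_ascii_alphanumeric() || bytes[j] == b'-' || bytes[j] == b'_')
                {
                    j += 1;
                }
                if j - i > prefix.len() + 8 {
                    // Long enough to plausibly be a secret: mask it.
                    out.push_str("[REDACTED_API_KEY]");
                } else {
                    out.push_str(&code[i..j]);
                }
                i = j;
                continue 'outer;
            }
        }
        let ch = code[i..].chars().next().unwrap();
        out.push(ch);
        i += ch.len_utf8();
    }
    // Keep the existing size cap, backing up to a char boundary so that
    // truncate() cannot panic on multi-byte input.
    if out.len() > 50_000 {
        let mut cut = 50_000;
        while !out.is_char_boundary(cut) {
            cut -= 1;
        }
        out.truncate(cut);
        out.push_str("\n... [truncated]");
    }
    out
}
```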

Comment on lines +158 to +206
```rust
pub fn handle_route_request(request: &WasmRouteRequest) -> Vec<u8> {
    let path = request.path.as_str();
    let method = request.method.as_str();

    match (method, path) {
        ("GET", "/leaderboard") => handle_leaderboard(),
        ("GET", "/stats") => handle_stats(),
        ("GET", "/decay") => handle_decay(),
        ("GET", "/dataset/history") => handle_dataset_history(),
        ("GET", "/dataset/consensus") => handle_dataset_consensus(),
        ("GET", "/config/timeout") => handle_get_timeout_config(),
        ("GET", "/config/whitelist") => handle_get_whitelist_config(),
        ("POST", "/config/timeout") => handle_set_timeout_config(&request.body),
        ("POST", "/config/whitelist") => handle_set_whitelist_config(&request.body),
        ("POST", "/dataset/propose") => handle_dataset_propose(&request.body),
        ("POST", "/dataset/random") => handle_dataset_random(&request.body),
        ("POST", "/review/select") => handle_review_select(&request.body),
        ("POST", "/review/aggregate") => handle_review_aggregate(&request.body),
        ("POST", "/timeout/record") => handle_timeout_record(&request.body),
        ("POST", "/timeout/check") => handle_timeout_check(&request.body),
        ("POST", "/timeout/replace") => handle_timeout_replace(&request.body),
        ("POST", "/timeout/mark") => handle_timeout_mark(&request.body),
        _ => {
            if method == "GET" {
                if let Some(id) = path.strip_prefix("/review/") {
                    return handle_review(id);
                }
                if let Some(id) = path.strip_prefix("/ast/") {
                    return handle_ast(id);
                }
                if let Some(name) = path.strip_prefix("/submission/") {
                    return handle_submission_by_name(name);
                }
                if let Some(rest) = path.strip_prefix("/agent/") {
                    if let Some(hotkey) = rest.strip_suffix("/journey") {
                        return handle_journey(hotkey);
                    }
                    if let Some(hotkey) = rest.strip_suffix("/logs") {
                        return handle_logs(hotkey);
                    }
                    if let Some(hotkey) = rest.strip_suffix("/code") {
                        return handle_code(hotkey);
                    }
                }
            }
            Vec::new()
        }
    }
}
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Search for auth_hotkey usage in the wasm crate
rg -n 'auth_hotkey' wasm/src/ --type=rust -A 2 -B 2
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
#!/bin/bash
# Check the structure of WasmRouteRequest
rg -n 'struct WasmRouteRequest\|pub struct WasmRouteRequest' wasm/src/ --type=rust -A 10
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
#!/bin/bash
# Check if there are any authorization or validation functions being called in the handlers
rg -n 'fn handle_set_timeout_config|fn handle_set_whitelist_config|fn handle_dataset_propose|fn handle_timeout_record|fn handle_timeout_mark' wasm/src/ --type=rust -A 15
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 4316


🏁 Script executed:

```bash
#!/bin/bash
# Look for any platform-v2 or RPC related files that might enforce auth upstream
find . -name "*.rs" -type f | xargs grep -l "platform-v2\|RPC\|authorization" 2>/dev/null | head -20
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 127


🏁 Script executed:

```bash
#!/bin/bash
# Search for WasmRouteRequest anywhere in the codebase
rg -n 'WasmRouteRequest' --type=rust
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 467


🏁 Script executed:

```bash
#!/bin/bash
# Check wasm/src/lib.rs to understand the module structure
head -100 wasm/src/lib.rs
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 2993


🏁 Script executed:

```bash
#!/bin/bash
# Look for the function signature of handle_route_request and its caller
rg -n 'pub fn handle_route_request\|fn handle_route_request' wasm/src/ --type=rust -B 2 -A 1
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
#!/bin/bash
# Get the full definition of WasmRouteRequest
sed -n '325,350p' wasm/src/types.rs
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 177


🏁 Script executed:

```bash
#!/bin/bash
# Check if auth_hotkey is referenced anywhere in routes.rs
grep -n 'request\.' wasm/src/routes.rs
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 986


🏁 Script executed:

```bash
#!/bin/bash
# Find where handle_route_request is called from
rg -n 'handle_route_request' --type=rust
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 235


🏁 Script executed:

```bash
#!/bin/bash
# Check the context around line 154 in lib.rs where WasmRouteRequest is deserialized
sed -n '140,175p' wasm/src/lib.rs
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1127


🏁 Script executed:

```bash
#!/bin/bash
# Look for where this function is called from - check if there's auth enforcement higher up
rg -n 'pub fn.*request_data\|pub fn.*route\|pub fn.*challenge' wasm/src/lib.rs --type=rust -B 2 -A 1
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
#!/bin/bash
# Check if there's any auth/validation logic in the platform SDK or higher-level code
grep -r 'auth' cli/src/rpc.rs | head -20
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
#!/bin/bash
# Check if there's any documentation or higher-level auth enforcement
grep -r 'handle_route\|auth\|permission' cli/src/rpc.rs | head -30
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
#!/bin/bash
# Check the RouteDefinition to understand if auth is metadata there
sed -n '1,100p' wasm/src/types.rs | grep -A 10 'struct RouteDefinition\|pub struct RouteDefinition'
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

```bash
#!/bin/bash
# Check RouteDefinition more broadly
rg -n 'RouteDefinition' wasm/src/types.rs -A 10
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 386


🏁 Script executed:

```bash
#!/bin/bash
# Check if any POST handlers expect auth data in their body deserialization
rg -n 'bincode::deserialize' wasm/src/routes.rs | head -20
```

Repository: PlatformNetwork/term-challenge-v2

Length of output: 909


🏁 Script executed:

#!/bin/bash
# Check cli/src/rpc.rs to see if there's auth enforcement on the RPC side
cat cli/src/rpc.rs | head -150

Repository: PlatformNetwork/term-challenge-v2

Length of output: 3369


🏁 Script executed:

#!/bin/bash
# Search for any validation or permission checks before calling handle_route
rg -n 'handle_route\|POST.*timeout\|POST.*whitelist' --type=rust -B 5 -A 2

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


State-mutating POST endpoints lack authorization validation.

handle_route_request dispatches to handlers like handle_set_timeout_config, handle_set_whitelist_config, handle_dataset_propose, handle_timeout_record, and handle_timeout_mark without any authorization checks. The WasmRouteRequest structure contains only method, path, and body — no auth information. As a result, handlers cannot validate which caller is making the request, allowing any caller to modify timeout/whitelist configs or record assignments as any validator.

Add an auth_hotkey field to WasmRouteRequest and validate it in POST handlers against an authorized set before mutating state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/routes.rs` around lines 158 - 206, The POST routes are dispatched
without any auth info; add an auth_hotkey: Option<String> field to the
WasmRouteRequest struct and update handle_route_request to check authorization
for mutating endpoints (e.g., handle_set_timeout_config,
handle_set_whitelist_config, handle_dataset_propose, handle_timeout_record,
handle_timeout_mark) before calling them: implement or call a central
is_authorized_hotkey(&Option<String>) (or similar) helper that validates the
hotkey against the authorized set and return an empty Vec<u8> (or an auth error
payload) when unauthorized; ensure you pass the request.body and/or auth_hotkey
into handlers that need caller identity and update those handler signatures
(e.g., handle_set_timeout_config, handle_dataset_propose, handle_timeout_record,
handle_timeout_mark) to accept the hotkey so they can perform any audit/mutation
with the caller identity.
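As a sketch of the proposed fix: the field and helper names (`auth_hotkey`, `is_authorized_hotkey`) follow this review's suggestion, and the `HashSet` stands in for however the authorized set would actually be stored; this is an illustration, not the repository's API.

```rust
use std::collections::HashSet;

// Hypothetical shape of the request struct with the proposed auth field added.
struct WasmRouteRequest {
    method: String,
    path: String,
    body: Vec<u8>,
    auth_hotkey: Option<String>, // proposed new field
}

fn is_authorized_hotkey(hotkey: &Option<String>, authorized: &HashSet<String>) -> bool {
    hotkey.as_deref().map_or(false, |hk| authorized.contains(hk))
}

fn handle_route_request(req: &WasmRouteRequest, authorized: &HashSet<String>) -> Vec<u8> {
    // Gate every state-mutating (POST) route before dispatching to handlers.
    if req.method == "POST" && !is_authorized_hotkey(&req.auth_hotkey, authorized) {
        return Vec::new(); // empty payload signals an auth failure
    }
    let _ = (&req.path, &req.body); // handlers would consume these
    b"ok".to_vec()
}
```

The central helper keeps the authorization decision in one place, so individual handlers only need the caller identity for auditing.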

Comment on lines +222 to +257
fn handle_stats() -> Vec<u8> {
    let total_submissions = host_consensus_get_submission_count() as u64;
    let epoch = host_consensus_get_epoch();
    let active_miners = host_storage_get(b"active_miner_count")
        .ok()
        .and_then(|d| {
            if d.len() >= 8 {
                let mut buf = [0u8; 8];
                buf.copy_from_slice(&d[..8]);
                Some(u64::from_le_bytes(buf))
            } else {
                None
            }
        })
        .unwrap_or(0);
    let validator_count = host_storage_get(b"validator_count")
        .ok()
        .and_then(|d| {
            if d.len() >= 8 {
                let mut buf = [0u8; 8];
                buf.copy_from_slice(&d[..8]);
                Some(u64::from_le_bytes(buf))
            } else {
                None
            }
        })
        .unwrap_or(0);

    let stats = StatsResponse {
        total_submissions,
        active_miners,
        validator_count,
    };
    let _ = epoch;
    bincode::serialize(&stats).unwrap_or_default()
}

⚠️ Potential issue | 🟡 Minor

Negative i32 cast to u64 and unused epoch fetch.

Line 223: host_consensus_get_submission_count() returns i32. Casting a negative sentinel (e.g., -1 for error) directly to u64 produces a huge value. Use a clamped conversion:

-    let total_submissions = host_consensus_get_submission_count() as u64;
+    let raw_count = host_consensus_get_submission_count();
+    let total_submissions = if raw_count >= 0 { raw_count as u64 } else { 0 };

Line 224–255: epoch is fetched via a host call but only consumed by let _ = epoch;. Either incorporate it into StatsResponse or remove the call to avoid a pointless host round-trip.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/routes.rs` around lines 222 - 257, The handle_stats function casts
host_consensus_get_submission_count() (an i32) directly to u64 which will
produce huge values for negative sentinels; change the conversion to clamp
negatives to 0 (or map error sentinel to None/0) before converting to u64 and
use that sanitized value for total_submissions, and remove the unused
host_consensus_get_epoch() call or instead add epoch to the StatsResponse struct
and populate it (so either drop the host call or wire epoch into StatsResponse)
to avoid an unnecessary host round-trip; update references to StatsResponse,
handle_stats, host_consensus_get_submission_count, and host_consensus_get_epoch
accordingly.
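The clamped conversion suggested above can be isolated into a small helper; here the host call's raw `i32` return value is simulated by passing it in directly:

```rust
// Sanitize a host-call count: negative sentinels (e.g. -1 for an error)
// clamp to 0 instead of wrapping to a huge u64 via `as u64`.
fn sanitize_count(raw: i32) -> u64 {
    u64::try_from(raw).unwrap_or(0)
}
```

`u64::try_from` fails for any negative input, so `unwrap_or(0)` covers every error sentinel without a manual sign check.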

Comment on lines +149 to +186
pub fn update_top_agent_state(agent_hash: &str, score: f64, epoch: u64) -> bool {
    let current = get_top_agent_state();
    let should_update = match &current {
        Some(state) => score > state.score,
        None => true,
    };

    if should_update {
        let state = TopAgentState {
            agent_hash: String::from(agent_hash),
            score,
            achieved_epoch: epoch,
            epochs_stale: 0,
            decay_active: false,
            current_burn_percent: 0.0,
        };
        if let Ok(data) = bincode::serialize(&state) {
            return host_storage_set(TOP_AGENT_KEY, &data).is_ok();
        }
    } else if let Some(mut state) = current {
        let current_epoch = host_consensus_get_epoch();
        if current_epoch >= 0 {
            state.epochs_stale = (current_epoch as u64).saturating_sub(state.achieved_epoch);
            let grace_epochs = 60u64;
            state.decay_active = state.epochs_stale > grace_epochs;
            if state.decay_active {
                let decay_epochs = state.epochs_stale.saturating_sub(grace_epochs);
                let half_life_epochs = 20.0f64;
                let multiplier = 0.5f64.powf(decay_epochs as f64 / half_life_epochs);
                state.current_burn_percent = (1.0 - multiplier) * 100.0;
            }
            if let Ok(data) = bincode::serialize(&state) {
                let _ = host_storage_set(TOP_AGENT_KEY, &data);
            }
        }
    }
    false
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find all call sites of update_top_agent_state
rg -n "update_top_agent_state" --type rust -B2 -A2

Repository: PlatformNetwork/term-challenge-v2

Length of output: 685


🏁 Script executed:

# Find DecayParams definition to understand its structure
rg -n "struct DecayParams|pub struct DecayParams" --type rust -A5

Repository: PlatformNetwork/term-challenge-v2

Length of output: 322


🏁 Script executed:

# Check the full context of scoring.rs around the function in question
wc -l wasm/src/scoring.rs

Repository: PlatformNetwork/term-challenge-v2

Length of output: 100


🏁 Script executed:

# Check the context of the call site in lib.rs
sed -n '280,300p' wasm/src/lib.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 993


🏁 Script executed:

# Check if DecayParams is used elsewhere in scoring.rs
rg -n "DecayParams" wasm/src/scoring.rs -B2 -A2

Repository: PlatformNetwork/term-challenge-v2

Length of output: 623


🏁 Script executed:

# Check if there's any epoch-to-hours conversion logic in the codebase
rg -n "epoch|hour" wasm/src/lib.rs --type rust | head -20

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1340


🏁 Script executed:

# Look for the definition and use of apply_decay function to see how DecayParams is used
rg -n "apply_decay|DecayParams" wasm/src/scoring.rs -B1 -A3

Repository: PlatformNetwork/term-challenge-v2

Length of output: 765


🏁 Script executed:

# Check if there's an epoch-to-hours conversion constant defined
rg -n "EPOCH|epoch.*hours|hours.*epoch" wasm/src/ --type rust

Repository: PlatformNetwork/term-challenge-v2

Length of output: 223


🏁 Script executed:

# Look at how apply_epoch_decay is actually used and what decay_params is in lib.rs
sed -n '270,295p' wasm/src/lib.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1278


🏁 Script executed:

# Check the TopAgentState struct definition to understand what's stored
rg -n "struct TopAgentState" wasm/src/types.rs -A10

Repository: PlatformNetwork/term-challenge-v2

Length of output: 395


🏁 Script executed:

# Check if there's any epoch duration or conversion defined anywhere
rg -n "epoch.*duration|duration.*epoch|blocks|block" wasm/src/ --type rust | head -30

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

# Look at line 277 in lib.rs more carefully - the apply_decay call
sed -n '275,280p' wasm/src/lib.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 415


🏁 Script executed:

# Check what apply_decay does with the hours_since_top parameter
sed -n '88,110p' wasm/src/scoring.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 920


🏁 Script executed:

# Check how decay_params flows through the codebase - where is it created?
rg -n "decay_params\s*=" wasm/src/lib.rs -B3 -A1

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

# Check if there's documentation about epoch duration or the relationship between epochs and hours
rg -n "epoch|block|time|duration" wasm/src/types.rs --type rust | head -40

Repository: PlatformNetwork/term-challenge-v2

Length of output: 754


🏁 Script executed:

# Check the entire update_top_agent_state function carefully
sed -n '149,186p' wasm/src/scoring.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1827


🏁 Script executed:

# Check if there's a constant or function that converts epochs to hours or blocks per hour
rg -n "BLOCK|HOUR|EPOCH|1000|3600|60" wasm/src/ --type rust | grep -i "const\|fn" | head -30

Repository: PlatformNetwork/term-challenge-v2

Length of output: 128


🏁 Script executed:

# Look at how decay_params is constructed and passed
rg -n "DecayParams\s*{" wasm/src/ -A5

Repository: PlatformNetwork/term-challenge-v2

Length of output: 170


🏁 Script executed:

# Search in the entire wasm directory for any epoch/time relationship
fd . wasm/src --type f --exec wc -l {} + | sort -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 874


🏁 Script executed:

# Search for DecayParams instantiation more carefully
rg -n "DecayParams" wasm/src/ --type rust -A2 -B1

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1458


🏁 Script executed:

# Check if there's any documentation or comments about the relationship between epochs and time
rg -n "epoch|grace|half.life" wasm/src/scoring.rs --type rust -B1 -A1

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1830


🏁 Script executed:

# Check the Default implementation of DecayParams
sed -n '81,90p' wasm/src/types.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 340


🏁 Script executed:

# Let's see the exact context around line 277 in lib.rs where epochs_stale is used as hours
sed -n '273,282p' wasm/src/lib.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 561


🏁 Script executed:

# Check if there's any relationship between epoch duration and hours in the codebase
rg -n "host_consensus_get_epoch|epoch" wasm/src/lib.rs --type rust -B2 -A2 | head -50

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1768


🏁 Script executed:

# Check if there's any documentation about the epoch to hour conversion
rg -n "epoch" wasm/src/types.rs --type rust -B1 -A1 | head -40

Repository: PlatformNetwork/term-challenge-v2

Length of output: 726


🏁 Script executed:

# Look at how state.epochs_stale is actually computed and used
sed -n '168,186p' wasm/src/scoring.rs | cat -n

Repository: PlatformNetwork/term-challenge-v2

Length of output: 1049


🏁 Script executed:

# Verify the exact data types and logic flow
rg -n "epochs_stale|current_epoch" wasm/src/scoring.rs --type rust -B2 -A2

Repository: PlatformNetwork/term-challenge-v2

Length of output: 896


Hardcoded epoch-based decay constants (grace=60, half_life=20) create a semantic unit mismatch.

update_top_agent_state hardcodes grace_epochs = 60 and half_life_epochs = 20.0 (lines 172, 176), but these values have no documented relationship to DecayParams (which uses hours: default 72, 24). More critically, at line 277 in lib.rs, the computed state.epochs_stale (in epochs) is cast directly to f64 and passed to apply_decay() which expects hours_since_top. This conflates epochs with hours—the decay calculation is semantically incorrect.

Either:

  • Add an epoch-to-hour conversion constant and derive the epoch-based grace/half-life from DecayParams, or
  • Refactor to use hours consistently throughout instead of epochs, or
  • Document the implicit epoch-to-hour equivalence if intentional

Without clarification, operators cannot meaningfully configure decay behavior, and the two decay paths may apply inconsistent multipliers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/scoring.rs` around lines 149 - 186, update_top_agent_state uses
hardcoded epoch-based decay constants (grace_epochs = 60, half_life_epochs =
20.0) and computes state.epochs_stale in epochs, but elsewhere decay logic
(apply_decay / DecayParams) expects hours, causing a semantic mismatch; fix by
converting epochs to hours (use the chain's epoch-duration-in-seconds or a
single EPOCH_HOURS constant) or derive epoch-based grace/half-life from
DecayParams (grace_hours and half_life_hours) before computing multiplier:
replace the hardcoded grace_epochs and half_life_epochs with values computed
from DecayParams (or multiply epochs_stale by epoch_duration_hours) and ensure
apply_decay/get_top_agent_state/state.epochs_stale use the same time unit
throughout (reference functions/structs: update_top_agent_state,
get_top_agent_state, apply_decay, DecayParams, host_consensus_get_epoch,
state.epochs_stale).
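One way the first option could look: drive the calculation from `DecayParams` through a single documented epoch length. Everything here is an assumption for illustration: `EPOCH_HOURS = 1.2` is chosen only because it makes the hardcoded 60/20 epochs line up with the 72h/24h defaults (the real chain value must be confirmed), and the `DecayParams` field names are hypothetical.

```rust
// Assumed epoch length; 60 epochs * 1.2h = 72h grace, 20 epochs * 1.2h = 24h half-life.
const EPOCH_HOURS: f64 = 1.2;

// Hypothetical field names mirroring the hour-based defaults (72, 24).
struct DecayParams {
    grace_hours: f64,
    half_life_hours: f64,
}

fn burn_percent(epochs_stale: u64, params: &DecayParams) -> f64 {
    // Convert epochs to hours once, so both decay paths share one unit.
    let hours_stale = epochs_stale as f64 * EPOCH_HOURS;
    if hours_stale <= params.grace_hours {
        return 0.0; // still within the grace period
    }
    let decay_hours = hours_stale - params.grace_hours;
    let multiplier = 0.5f64.powf(decay_hours / params.half_life_hours);
    (1.0 - multiplier) * 100.0
}
```

With this shape, operators tune only the hour-based `DecayParams`, and the epoch path can never drift out of sync with `apply_decay`.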

Comment on lines +54 to +61
if let Ok(data) = host_storage_get(&key) {
    if data.len() >= 8 {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&data[..8]);
        let assigned_time = i64::from_le_bytes(buf);
        let current_time = host_get_timestamp();
        let elapsed = (current_time - assigned_time) as u64;
        return elapsed > timeout_ms;

⚠️ Potential issue | 🔴 Critical

Signed-to-unsigned cast can cause false timeout on clock skew.

Line 60: (current_time - assigned_time) as u64 — if current_time < assigned_time (e.g., clock adjustment, host returning an error sentinel), the negative i64 difference wraps to a very large u64, making elapsed > timeout_ms always true. This silently marks assignments as timed out.

Proposed fix
             let assigned_time = i64::from_le_bytes(buf);
             let current_time = host_get_timestamp();
-            let elapsed = (current_time - assigned_time) as u64;
-            return elapsed > timeout_ms;
+            if current_time > assigned_time {
+                let elapsed = (current_time - assigned_time) as u64;
+                return elapsed > timeout_ms;
+            }
+            return false;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if let Ok(data) = host_storage_get(&key) {
    if data.len() >= 8 {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&data[..8]);
        let assigned_time = i64::from_le_bytes(buf);
        let current_time = host_get_timestamp();
        let elapsed = (current_time - assigned_time) as u64;
        return elapsed > timeout_ms;
if let Ok(data) = host_storage_get(&key) {
    if data.len() >= 8 {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&data[..8]);
        let assigned_time = i64::from_le_bytes(buf);
        let current_time = host_get_timestamp();
        if current_time > assigned_time {
            let elapsed = (current_time - assigned_time) as u64;
            return elapsed > timeout_ms;
        }
        return false;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wasm/src/timeout_handler.rs` around lines 54 - 61, The subtraction
(current_time - assigned_time) in the timeout check can be negative and casting
that i64 to u64 will wrap to a huge value, causing false timeouts; update the
logic in the block that reads from host_storage_get and uses host_get_timestamp
so you compute elapsed safely (e.g., use checked_sub/saturating_sub or
explicitly return false when current_time < assigned_time) before casting to
u64, then compare that non-negative elapsed against timeout_ms; adjust uses of
assigned_time, current_time, and timeout_ms accordingly to avoid
signed-to-unsigned wraparound.
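The `checked_sub` variant mentioned in the prompt can be factored into a standalone helper; `host_get_timestamp` and the stored assignment time are simulated here as plain `i64` millisecond values:

```rust
// Overflow-safe elapsed check: a backwards clock (current < assigned) yields
// None instead of wrapping to a huge u64, so it never reads as a timeout.
fn is_timed_out(assigned_time: i64, current_time: i64, timeout_ms: u64) -> bool {
    match current_time
        .checked_sub(assigned_time)
        .and_then(|d| u64::try_from(d).ok())
    {
        Some(elapsed) => elapsed > timeout_ms,
        None => false,
    }
}
```

`checked_sub` also guards the (unlikely) `i64` overflow case that a bare subtraction would panic on in debug builds.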
