
feat(storage): add PostgreSQL storage module using platform-v2 SDK types #21

Merged: echobt merged 9 commits into main from feat/pg-storage-sdk-types on Feb 20, 2026

Conversation


@echobt commented Feb 20, 2026

Summary

Add a new storage/ workspace crate providing a PostgreSQL persistence layer for term-challenge, with all types sourced from the platform-v2 SDK (platform-core and platform-challenge-sdk).

Changes

New crate: term-challenge-storage

  • storage/Cargo.toml: Dependencies include deadpool-postgres for connection pooling, tokio-postgres with with-uuid-1/with-chrono-0_4/with-serde_json-1 feature flags, and platform-core/platform-challenge-sdk from the platform-v2 repo
  • storage/src/lib.rs: StorageError enum (Database, Pool, Serialization, InvalidHotkey, NotFound, InvalidChallengeId) and Result<T> type alias
  • storage/src/pg.rs: PgPool type alias, PgConfig struct with serde Deserialize + Default, create_pool() using deadpool-postgres with Tokio1 runtime

PostgreSQL modules (storage/src/postgres/)

  • evaluations.rs: CRUD for EvaluationResult: f64 score, HashMap<String, f64> metrics stored as JSONB, ChallengeId as UUID, Hotkey as SS58 TEXT
  • leaderboard.rs: LeaderboardEntry struct, upsert/query/delete operations, WeightAssignment get/set with f64 weight (DOUBLE PRECISION)
  • submissions.rs: StoredSubmission struct, insert/get/update/list/count/delete with UUID ChallengeId and SS58 Hotkey
  • task_logs.rs: StoredTaskLog struct, TaskLogRecord type alias, single and batch insert, per-submission and per-miner queries, pass rate calculation
  • validators.rs: StoredValidator struct, upsert/deactivate/list/count/delete, hotkey_from_ss58() helper

Key type mappings

SDK type                              PostgreSQL column
ChallengeId(uuid::Uuid)               UUID (via .0)
Hotkey([u8;32])                       TEXT (via to_ss58()/from_ss58())
WeightAssignment.weight: f64          DOUBLE PRECISION
EvaluationResult.score: f64           DOUBLE PRECISION
EvaluationResult.metrics: HashMap     JSONB
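
The byte-array-to-TEXT round-trip can be sketched with a std-only stand-in. The real Hotkey and its SS58 codec live in the SDK; hex encoding is used here purely to illustrate the [u8; 32] <-> TEXT column shape.

```rust
// Hypothetical stand-in for the SDK's Hotkey([u8; 32]). The real type
// round-trips through SS58 via to_ss58()/from_ss58(); hex is used here
// only to show the fixed-size byte array <-> TEXT mapping.
#[derive(Debug, PartialEq)]
pub struct Hotkey(pub [u8; 32]);

impl Hotkey {
    /// Encode the key bytes as a TEXT-safe string (hex stand-in for SS58).
    pub fn to_text(&self) -> String {
        self.0.iter().map(|b| format!("{b:02x}")).collect()
    }

    /// Decode the TEXT column back into the fixed-size byte array,
    /// rejecting strings of the wrong length or with non-hex characters.
    pub fn from_text(s: &str) -> Option<Hotkey> {
        if s.len() != 64 || !s.is_ascii() {
            return None;
        }
        let mut bytes = [0u8; 32];
        for (i, chunk) in s.as_bytes().chunks(2).enumerate() {
            let pair = std::str::from_utf8(chunk).ok()?;
            bytes[i] = u8::from_str_radix(pair, 16).ok()?;
        }
        Some(Hotkey(bytes))
    }
}
```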

Modified files

  • Cargo.toml: Added "storage" to workspace members
  • AGENTS.md: Updated architecture tree, build commands, and no_std exemption notes to include storage crate

Summary by CodeRabbit

Release Notes

  • New Features

    • Added PostgreSQL persistence layer for storing and managing evaluations, leaderboard rankings, submissions, task logs, and validator information.
  • Chores

    • Updated dependencies to support PostgreSQL integration.
    • Build process now includes storage crate compilation as a separate artifact.

Create a new storage/ workspace crate providing a PostgreSQL persistence
layer for term-challenge, with all types sourced from the platform-v2 SDK.

New crate: term-challenge-storage
- storage/Cargo.toml: deadpool-postgres connection pooling, tokio-postgres
  with-uuid-1/with-chrono-0_4/with-serde_json-1 feature flags, platform-core
  and platform-challenge-sdk git dependencies

Modules created:
- src/lib.rs: StorageError enum (Database, Pool, Serialization, InvalidHotkey,
  NotFound, InvalidChallengeId) and Result<T> type alias
- src/pg.rs: PgPool type alias, PgConfig struct with Deserialize + Default,
  create_pool() using deadpool-postgres with Tokio1 runtime
- src/postgres/evaluations.rs: CRUD for EvaluationResult with f64 score,
  HashMap<String,f64> metrics stored as JSONB, ChallengeId as UUID, Hotkey
  as SS58 TEXT
- src/postgres/leaderboard.rs: LeaderboardEntry struct, upsert/query/delete
  operations, WeightAssignment get/set with f64 weight (DOUBLE PRECISION)
- src/postgres/submissions.rs: StoredSubmission struct,
  insert/get/update/list/count/delete with UUID ChallengeId and SS58 Hotkey
- src/postgres/task_logs.rs: StoredTaskLog struct, TaskLogRecord type alias,
  single and batch insert, per-submission and per-miner queries, pass rate
  calculation
- src/postgres/validators.rs: StoredValidator struct,
  upsert/deactivate/list/count/delete, hotkey_from_ss58() helper for SS58
  decoding

Key type mappings:
- ChallengeId(uuid::Uuid) -> PostgreSQL UUID via .0
- Hotkey([u8;32]) -> PostgreSQL TEXT via to_ss58()/from_ss58()
- WeightAssignment.weight: f64 -> DOUBLE PRECISION
- EvaluationResult.score: f64 -> DOUBLE PRECISION
- EvaluationResult.metrics: HashMap -> JSONB

Bug fix: set_weight_assignments INSERT uses 0.0 default for score column
instead of reusing the weight parameter ($3) for both score and weight.
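
The fixed upsert could look like the sketch below. The table and column names are assumptions; only the 0.0 default for the score column on insert reflects the fix described above.

```rust
// Hypothetical SQL for set_weight_assignments; the table and column names
// are assumed. On first insert the score column gets a literal 0.0 default
// instead of reusing the weight parameter ($3); on conflict only the
// weight is updated.
const SET_WEIGHT_SQL: &str = "\
    INSERT INTO weight_assignments (challenge_id, hotkey, score, weight) \
    VALUES ($1, $2, 0.0, $3) \
    ON CONFLICT (challenge_id, hotkey) \
    DO UPDATE SET weight = EXCLUDED.weight";
```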

Modified files:
- Cargo.toml: added "storage" to workspace members
- AGENTS.md: updated architecture tree, build commands, and no_std exemption
  notes to include storage crate

coderabbitai bot commented Feb 20, 2026

Warning

Rate limit exceeded

@echobt has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 4 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

This pull request introduces a comprehensive PostgreSQL-backed storage layer for the platform. It adds configuration management for database connections via a new pg module, establishes five new PostgreSQL persistence modules (evaluations, leaderboard, submissions, task_logs, validators) with async CRUD operations, and updates workspace dependencies to reference the platform-v2 repository instead of platform.git.

Changes

Dependency Updates (Cargo.toml, server/Cargo.toml, storage/Cargo.toml)
  Updated workspace dependencies to reference platform-v2 on the main branch. Added PostgreSQL support dependencies (tokio, tokio-postgres, deadpool-postgres, postgres-types, serde_json) and upgraded thiserror to 2.0 with uuid enhancements in the storage crate.

Storage Configuration & Pool (storage/src/lib.rs, storage/src/pg.rs)
  Introduced PostgreSQL pool management via the PgConfig struct and create_pool function, enabling serializable configuration with password redaction and connection pooling through Deadpool.

PostgreSQL Evaluations Module (storage/src/postgres/evaluations.rs)
  Added seven async functions for evaluation persistence: insert (with upsert), get, list by challenge/agent, retrieve latest, delete, and count operations with JSON metrics serialization.

PostgreSQL Leaderboard Module (storage/src/postgres/leaderboard.rs)
  Introduced the LeaderboardEntry struct and seven functions handling leaderboard upserts, ranked retrievals, weight assignment management, and entry deletion with computed rank support.

PostgreSQL Submissions Module (storage/src/postgres/submissions.rs)
  Added the StoredSubmission struct with nine async functions for the submission lifecycle: insert, get, update score/status, list by challenge/miner, retrieve last epoch, count, and delete operations.

PostgreSQL Task Logs Module (storage/src/postgres/task_logs.rs)
  Introduced the StoredTaskLog struct with six async functions supporting single and batch log insertion, retrieval by submission/miner, pass rate computation, and deletion with execution time tracking.

PostgreSQL Validators Module (storage/src/postgres/validators.rs)
  Added the StoredValidator struct with nine functions for the validator lifecycle: upsert, get, list (active and all), deactivate, update last seen, count active, and delete operations with a hotkey parsing utility.

Module Organization & Error Handling (storage/src/postgres/mod.rs, storage/src/traits.rs)
  Added five public submodule declarations and three From trait implementations enabling error conversion from tokio_postgres, deadpool_postgres, and serde_json for ergonomic error propagation.

Documentation Updates (AGENTS.md)
  Expanded AGENTS.md to document the new storage architecture and build steps for the storage crate, and broadened the host-exemption notes to include the cli/, server/, and storage/ crates as native code outside the WASM sandbox.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 A warren of queries, so neat and so keen,
PostgreSQL tables now power the scene,
From validators to scores, each record takes root,
This storage layer's growing—the hare's prized loot! 🌱

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title clearly summarizes the main change: adding a PostgreSQL storage module using types from the platform-v2 SDK, which aligns with the extensive storage-related additions throughout the PR.


All workspace crates now consistently reference platform-v2 for
platform-challenge-sdk and platform-core, fixing CI failures where
/workspace/platform-v2 path was unavailable and platform.git had
a different (incompatible) ServerChallenge trait definition.

@coderabbitai bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (9)
storage/src/pg.rs (1)

57-59: RecyclingMethod::Fast can produce unusable connections on unreliable networks.

Fast relies solely on tokio_postgres::Client::is_closed instead of issuing a test query; under some circumstances tokio_postgres does not notice a disconnect and reports the connection as usable. The safer alternative is RecyclingMethod::Verified.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/pg.rs` around lines 57 - 59, The PR sets config.manager =
Some(ManagerConfig { recycling_method: RecyclingMethod::Fast }), which can leave
stale/unusable DB connections; change the recycling strategy to use a
verification query by replacing RecyclingMethod::Fast with
RecyclingMethod::Verified in the ManagerConfig so the pool performs a test query
(ensuring tokio_postgres detects disconnects). Update the instance where
config.manager is constructed (ManagerConfig and recycling_method) to use
RecyclingMethod::Verified.
storage/Cargo.toml (1)

13-13: Avoid features = ["full"] in a library crate.

"full" enables every Tokio subsystem (fs, net, signal, process, rt-multi-thread, etc.). Library crates should declare only the features they actually need, leaving the choice of runtime to the application. At minimum, replace with ["rt", "rt-multi-thread"]; if only pool operations are needed, ["rt"] alone may suffice.

♻️ Proposed fix
-tokio = { version = "1.40", features = ["full"] }
+tokio = { version = "1.40", features = ["rt", "rt-multi-thread"] }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/Cargo.toml` at line 13, The tokio dependency in Cargo.toml currently
enables features = ["full"] which a library crate should avoid; change the tokio
entry to list only the runtime features your library actually requires (e.g.,
replace features = ["full"] with features = ["rt"] or ["rt","rt-multi-thread"]
as appropriate) so the library does not force all Tokio subsystems on downstream
applications; update the tokio line in Cargo.toml accordingly and run cargo
build to verify.
Cargo.toml (1)

14-15: Consider pinning git dependencies to a commit SHA instead of a floating branch.

branch = "main" means any breaking commit pushed to upstream will silently affect all downstream builds and the Cargo.lock is the only guard. A commit SHA (rev = "...") provides a reproducible, auditable pin.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Cargo.toml` around lines 14 - 15, The Cargo.toml currently uses floating
branch pins for platform-challenge-sdk and platform-core which risks
unreproducible builds; update the two dependency entries (platform-challenge-sdk
and platform-core) to pin to specific commit SHAs by replacing branch = "main"
with rev = "<commit-sha>" (use the exact commit SHA you want to track), and run
cargo update -p platform-challenge-sdk -p platform-core or regenerate Cargo.lock
to capture the locked versions. Ensure you pick the intended commit SHAs and
verify builds/tests after updating.
storage/src/postgres/validators.rs (2)

81-94: Duplicate row-mapping logic across list_active_validators and list_all_validators.

Both functions build StoredValidator identically. Extract a small row_to_validator(row: &Row) -> StoredValidator helper to keep this in one place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/validators.rs` around lines 81 - 94, Both
list_active_validators and list_all_validators duplicate the same Row ->
StoredValidator mapping; extract a helper fn row_to_validator(row: &Row) ->
StoredValidator and use it from both places. Create the small function that
constructs StoredValidator using row.get("hotkey"), row.get("stake"),
row.get("is_active"), row.get("last_seen"), row.get("peer_id"),
row.get("registered_at"), replace the inline mapping loops in
list_active_validators and list_all_validators with a call to row_to_validator
for each row, and ensure imports/types (Row, StoredValidator) are in scope.

175-178: hotkey_from_ss58 does not need to be async.

The function performs no I/O and contains no .await points; making it async adds pointless overhead and forces every call site to use .await. A plain fn is correct here.

♻️ Proposed fix
-pub async fn hotkey_from_ss58(ss58: &str) -> Result<Hotkey> {
+pub fn hotkey_from_ss58(ss58: &str) -> Result<Hotkey> {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/validators.rs` around lines 175 - 178, hotkey_from_ss58
is declared async but contains no awaits; change its signature from pub async fn
hotkey_from_ss58(ss58: &str) -> Result<Hotkey> to a synchronous pub fn
hotkey_from_ss58(ss58: &str) -> Result<Hotkey> and remove the unnecessary async
keyword so callers no longer must .await the function while keeping the existing
body that calls Hotkey::from_ss58(...) and returns StorageError::InvalidData on
failure.
server/Cargo.toml (1)

10-10: Use workspace dependency inheritance to avoid duplication.

This declaration is identical to the one in [workspace.dependencies]. Replace it with workspace = true to keep the two in sync automatically.

♻️ Proposed fix
-platform-challenge-sdk = { git = "https://github.com/PlatformNetwork/platform-v2", branch = "main", features = ["http-server"] }
+platform-challenge-sdk = { workspace = true }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/Cargo.toml` at line 10, The platform-challenge-sdk dependency is
duplicated in this crate's Cargo.toml; remove the explicit git/branch/features
table and replace the dependency entry for "platform-challenge-sdk" with a
workspace dependency by setting it to "workspace = true" so it inherits the
declaration from [workspace.dependencies]; edit the dependency line referencing
platform-challenge-sdk in this file to use workspace = true (keeping any
existing features handled at the workspace level).
storage/src/postgres/evaluations.rs (1)

61-73: Duplicated row-to-EvaluationResult mapping across four functions.

The metrics deserialization + struct construction block is copy-pasted into get_evaluation, list_evaluations_by_challenge, list_evaluations_by_agent, and get_latest_evaluation_for_agent. Extract a private helper.

♻️ Proposed refactor
+fn row_to_evaluation_result(row: &tokio_postgres::Row) -> Result<EvaluationResult> {
+    let metrics_json: serde_json::Value = row.get("metrics");
+    let metrics: HashMap<String, f64> = serde_json::from_value(metrics_json)?;
+    let execution_time_ms: i64 = row.get("execution_time_ms");
+    Ok(EvaluationResult {
+        job_id: row.get("job_id"),
+        agent_hash: row.get("agent_hash"),
+        score: row.get("score"),
+        metrics,
+        logs: row.get("logs"),
+        execution_time_ms: execution_time_ms as u64,
+        timestamp: row.get("evaluated_at"),
+    })
+}

Replace all four inline blocks with row_to_evaluation_result(&row)?.

Also applies to: 98-110, 133-146, 172-185

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/evaluations.rs` around lines 61 - 73, Extract the
duplicated row-to-EvaluationResult mapping into a private helper (e.g., fn
row_to_evaluation_result(row: &Row) -> Result<EvaluationResult, _>) that
deserializes "metrics" into HashMap<String,f64>, reads execution_time_ms and
other fields, and returns the constructed EvaluationResult; then replace the
inline blocks in get_evaluation, list_evaluations_by_challenge,
list_evaluations_by_agent, and get_latest_evaluation_for_agent with calls to
row_to_evaluation_result(&row)? to eliminate repetition and centralize error
handling.
storage/src/postgres/submissions.rs (1)

58-68: Duplicated row-to-StoredSubmission mapping in three functions.

The identical struct-construction block appears in get_submission, list_submissions_by_challenge, and list_submissions_by_miner. Extract a private helper to remove the repetition.

♻️ Proposed refactor
+fn row_to_stored_submission(row: &tokio_postgres::Row) -> StoredSubmission {
+    StoredSubmission {
+        id: row.get("id"),
+        challenge_id: row.get("challenge_id"),
+        miner_hotkey_ss58: row.get("miner_hotkey"),
+        agent_hash: row.get("agent_hash"),
+        epoch: row.get("epoch"),
+        score: row.get("score"),
+        status: row.get("status"),
+        submitted_at: row.get("submitted_at"),
+        evaluated_at: row.get("evaluated_at"),
+    }
+}

Then replace each inline block with row_to_stored_submission(&row).

Also applies to: 126-136, 165-175

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/submissions.rs` around lines 58 - 68, Extract the
duplicated row-to-StoredSubmission construction into a private helper function
(e.g. fn row_to_stored_submission(row: &Row) -> StoredSubmission) that reads the
same fields (id, challenge_id, miner_hotkey, agent_hash, epoch, score, status,
submitted_at, evaluated_at) and returns a StoredSubmission; then replace the
inline struct construction in get_submission, list_submissions_by_challenge,
list_submissions_by_miner (and the other duplicate sites around the same
pattern) with calls to row_to_stored_submission(&row) to remove repetition while
preserving the exact field mappings and types.
storage/src/postgres/task_logs.rs (1)

8-8: TaskLogRecord tuple alias is opaque — consider a named struct.

The six-element positional tuple (String, bool, f64, u64, Option<String>, Option<String>) maps to (task_id, passed, score, execution_time_ms, output_preview, error) only by convention; there is no compiler-enforced naming. A small struct would improve call-site readability and catch accidental argument transpositions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/task_logs.rs` at line 8, The tuple alias TaskLogRecord
is opaque; replace it with a named struct (e.g., struct TaskLogRecord { task_id:
String, passed: bool, score: f64, execution_time_ms: u64, output_preview:
Option<String>, error: Option<String> }) and swap all uses of the tuple type to
the struct; update any constructors, function signatures, return types, and DB
row mappings (e.g., query result mapping or FromRow implementations) that
produce or consume TaskLogRecord so they build and return the struct fields by
name rather than positional tuple elements to improve readability and prevent
accidental argument transposition.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@storage/src/pg.rs`:
- Around line 37-48: The Default impl for PgConfig currently hardcodes
password="postgres"; change it to be environment-driven: in the Default
implementation for PgConfig (the Default::default function) read the password
from std::env::var("PG_PASSWORD") and use that value (e.g.
env::var("PG_PASSWORD").unwrap_or_default()) instead of the literal "postgres",
and add a brief doc comment on PgConfig::default stating this is a dev-only
fallback and that production must set PG_PASSWORD; leave other fields (e.g.
pool_size via default_pool_size()) unchanged.
- Around line 50-61: The create_pool function is not applying PgConfig.pool_size
because Config.pool is never set; update create_pool to set config.pool =
Some(PoolConfig { max_size: cfg.pool_size, ..Default::default() }) (or
equivalent builder) before calling config.create_pool so the deadpool_postgres
pool uses the intended size from PgConfig.pool_size; locate this in the
create_pool function where Config is built (symbols: create_pool, PgConfig,
Config, PoolConfig, config.create_pool) and add the pool assignment.

In `@storage/src/postgres/evaluations.rs`:
- Around line 116-150: The function list_evaluations_by_agent currently issues
an unbounded query and may load all rows for an agent; change the signature of
list_evaluations_by_agent to accept pagination parameters (e.g., limit: i64 and
offset: i64 or usize), modify the SQL to add "LIMIT $2 OFFSET $3", pass the new
parameters into the client.query call (e.g., &[&agent_hash, &limit, &offset]),
and update callers accordingly; ensure the result mapping (metrics
deserialization, execution_time_ms conversion, and construction of
EvaluationResult) remains unchanged but now only iterates the bounded rows
returned.
- Line 41: The direct cast "(result.execution_time_ms as i64)" can silently wrap
negative on overflow; replace it with a checked conversion using TryFrom/TryInto
(e.g. i64::try_from(result.execution_time_ms)) and propagate/return a clear
error if the conversion fails so an overflow is not stored. Update the code that
builds the DB parameter (the place referencing result.execution_time_ms in
evaluations.rs) to perform this checked conversion and handle the Err case
consistently with the function's existing error handling.

In `@storage/src/postgres/leaderboard.rs`:
- Around line 55-65: get_leaderboard computes ranks using ROW_NUMBER() while
get_entry_by_hotkey computes ranks by counting scores greater than
current_score, causing inconsistent ranks for tied scores; update
get_leaderboard to use the same tie-handling strategy as get_entry_by_hotkey
(choose either RANK() or DENSE_RANK() to give equal ranks to ties) by replacing
ROW_NUMBER() OVER (ORDER BY score DESC) with RANK() OVER (ORDER BY score DESC)
(or DENSE_RANK() if you prefer no gaps) in the SQL query inside get_leaderboard
so both get_leaderboard and get_entry_by_hotkey return consistent ranks for tied
scores.
- Around line 109-131: The loop in set_weight_assignments issues separate
INSERT..ON CONFLICT statements without a transaction, causing partial commits on
failure; wrap the whole loop in a single DB transaction using the Pg connection
returned by pool.get() (begin a transaction via
client.begin()/client.transaction(), run each execute against the transaction
object instead of the raw client, and commit the transaction after the loop) so
that any failure rolls back all changes for the challenge_id; ensure you use the
transaction's execute method and call commit() at the end (rollback will occur
on drop if commit is not reached).

In `@storage/src/postgres/submissions.rs`:
- Around line 191-202: The SELECT MAX(epoch) query always returns one row, so
using client.query_opt leads to an unreachable None branch; replace the
query_opt call with client.query_one(...) (awaiting the result) and then extract
the nullable epoch as Option<i64> from the returned row (row.get("last_epoch")),
removing the unused None => Ok(None) arm in submissions.rs so the function
returns the Option directly from the single row.
- Around line 79-101: Both update_submission_score and update_submission_status
currently ignore the number of rows affected and always return Ok(()), causing
silent no-ops when the id doesn't exist; change each function
(update_submission_score and update_submission_status) to bind the result of
client.execute(...) to a rows_affected variable, check if rows_affected == 0 and
return Err(StorageError::NotFound) (or appropriate StorageError variant) in that
case, otherwise continue returning Ok(()).
- Line 32: Silent truncation occurs by doing let epoch_i64 = epoch as i64; —
replace this with a checked conversion using i64::try_from(epoch) or
epoch.try_into() and propagate/return a clear error when the conversion fails
instead of writing a negative/truncated value; update the code that uses
epoch_i64 (the epoch_i64 binding and callers in this submission insertion path)
to handle the Result and return an appropriate storage/validation error so
oversized u64 epochs cannot be stored silently.

In `@storage/src/postgres/task_logs.rs`:
- Around line 67-111: insert_task_logs_batch currently issues individual
client.execute calls causing partial commits on failure; wrap the whole insert
loop in a single transaction by calling client.transaction().await? to get a
Transaction, replace client.execute(...) with txn.execute(...) inside the for
loop, collect the generated ids as before, and then call txn.commit().await?
before returning Ok(ids) so all inserts atomic (and let errors propagate to roll
back automatically if commit is not reached). Include the same parameter values
(id, submission_id, challenge_uuid, hotkey_ss58, task_id, passed, score,
exec_time_i64, output_ref, error_ref) when switching to txn.execute and keep
insert logic in insert_task_logs_batch.

In `@storage/src/postgres/validators.rs`:
- Around line 29-36: The ON CONFLICT clause in the INSERT for the validators
table uses COALESCE(EXCLUDED.peer_id, validators.peer_id), which prevents
explicitly clearing peer_id to NULL; change the upsert to assign peer_id =
EXCLUDED.peer_id (remove COALESCE) so that passing None/NULL will overwrite the
previous value with NULL. Update the SQL string used in validators.rs (the
INSERT ... ON CONFLICT (...) DO UPDATE SET ... clause) accordingly so peer_id is
unconditionally assigned from EXCLUDED.peer_id.
- Line 25: The silent cast `let stake_i64 = stake as i64;` can wrap large u64
values; replace it with a checked conversion like `i64::try_from(stake)` (or
`stake.try_into()`) and propagate a clear error on failure (e.g., return
Err(ValidationError::StakeOverflow) or map_err to the function's existing error
type). Update the surrounding code/signature if necessary so the function in
validators.rs that uses `stake_i64` returns a Result and surfaces the conversion
error instead of allowing silent wrapping.
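
Several of the comments above ask for the same checked-narrowing pattern for u64 values bound to BIGINT columns. A self-contained sketch (the error variant name here is an assumption; the crate would map the failure onto one of its existing StorageError variants):

```rust
// Hypothetical error variant standing in for the crate's StorageError;
// the real code would reuse an existing variant instead.
#[derive(Debug, PartialEq)]
pub enum StorageError {
    IntegerOverflow(u64),
}

/// Checked u64 -> i64 narrowing for BIGINT columns: values above i64::MAX
/// become an explicit error instead of silently wrapping negative.
pub fn to_db_i64(value: u64) -> Result<i64, StorageError> {
    i64::try_from(value).map_err(|_| StorageError::IntegerOverflow(value))
}
```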

---

Nitpick comments:
In `@Cargo.toml`:
- Around line 14-15: The Cargo.toml currently uses floating branch pins for
platform-challenge-sdk and platform-core which risks unreproducible builds;
update the two dependency entries (platform-challenge-sdk and platform-core) to
pin to specific commit SHAs by replacing branch = "main" with rev =
"<commit-sha>" (use the exact commit SHA you want to track), and run cargo
update -p platform-challenge-sdk -p platform-core or regenerate Cargo.lock to
capture the locked versions. Ensure you pick the intended commit SHAs and verify
builds/tests after updating.

In `@server/Cargo.toml`:
- Line 10: The platform-challenge-sdk dependency is duplicated in this crate's
Cargo.toml; remove the explicit git/branch/features table and replace the
dependency entry for "platform-challenge-sdk" with a workspace dependency by
setting it to "workspace = true" so it inherits the declaration from
[workspace.dependencies]; edit the dependency line referencing
platform-challenge-sdk in this file to use workspace = true (keeping any
existing features handled at the workspace level).

In `@storage/Cargo.toml`:
- Line 13: The tokio dependency in Cargo.toml currently enables features =
["full"] which a library crate should avoid; change the tokio entry to list only
the runtime features your library actually requires (e.g., replace features =
["full"] with features = ["rt"] or ["rt","rt-multi-thread"] as appropriate) so
the library does not force all Tokio subsystems on downstream applications;
update the tokio line in Cargo.toml accordingly and run cargo build to verify.

In `@storage/src/pg.rs`:
- Around line 57-59: The PR sets config.manager = Some(ManagerConfig {
recycling_method: RecyclingMethod::Fast }), which can leave stale/unusable DB
connections; change the recycling strategy to use a verification query by
replacing RecyclingMethod::Fast with RecyclingMethod::Verified in the
ManagerConfig so the pool performs a test query (ensuring tokio_postgres detects
disconnects). Update the instance where config.manager is constructed
(ManagerConfig and recycling_method) to use RecyclingMethod::Verified.

In `@storage/src/postgres/evaluations.rs`:
- Around line 61-73: Extract the duplicated row-to-EvaluationResult mapping into
a private helper (e.g., fn row_to_evaluation_result(row: &Row) ->
Result<EvaluationResult, _>) that deserializes "metrics" into
HashMap<String,f64>, reads execution_time_ms and other fields, and returns the
constructed EvaluationResult; then replace the inline blocks in get_evaluation,
list_evaluations_by_challenge, list_evaluations_by_agent, and
get_latest_evaluation_for_agent with calls to row_to_evaluation_result(&row)? to
eliminate repetition and centralize error handling.

In `@storage/src/postgres/submissions.rs`:
- Around line 58-68: Extract the duplicated row-to-StoredSubmission construction
into a private helper function (e.g. fn row_to_stored_submission(row: &Row) ->
StoredSubmission) that reads the same fields (id, challenge_id, miner_hotkey,
agent_hash, epoch, score, status, submitted_at, evaluated_at) and returns a
StoredSubmission; then replace the inline struct construction in get_submission,
list_submissions_by_challenge, list_submissions_by_miner (and the other
duplicate sites around the same pattern) with calls to
row_to_stored_submission(&row) to remove repetition while preserving the exact
field mappings and types.

In `@storage/src/postgres/task_logs.rs`:
- Line 8: The tuple alias TaskLogRecord is opaque; replace it with a named
struct (e.g., struct TaskLogRecord { task_id: String, passed: bool, score: f64,
execution_time_ms: u64, output_preview: Option<String>, error: Option<String> })
and swap all uses of the tuple type to the struct; update any constructors,
function signatures, return types, and DB row mappings (e.g., query result
mapping or FromRow implementations) that produce or consume TaskLogRecord so
they build and return the struct fields by name rather than positional tuple
elements to improve readability and prevent accidental argument transposition.
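
Under the assumptions in this prompt, the named struct might look like the sketch below; the field names come from the prompt itself, while the derives are an added convenience.

```rust
// Named fields replace the opaque six-element tuple, so call sites read
// `log.score` instead of `log.2`, and two same-typed fields (e.g. the two
// Option<String>s) can no longer be transposed silently.
#[derive(Debug, Clone, PartialEq)]
pub struct TaskLogRecord {
    pub task_id: String,
    pub passed: bool,
    pub score: f64,
    pub execution_time_ms: u64,
    pub output_preview: Option<String>,
    pub error: Option<String>,
}
```

Construction sites then name every field explicitly, which is exactly the property the tuple lacked.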

In `@storage/src/postgres/validators.rs`:
- Around line 81-94: Both list_active_validators and list_all_validators
duplicate the same Row -> StoredValidator mapping; extract a helper fn
row_to_validator(row: &Row) -> StoredValidator and use it from both places.
Create the small function that constructs StoredValidator using
row.get("hotkey"), row.get("stake"), row.get("is_active"), row.get("last_seen"),
row.get("peer_id"), row.get("registered_at"), replace the inline mapping loops
in list_active_validators and list_all_validators with a call to
row_to_validator for each row, and ensure imports/types (Row, StoredValidator)
are in scope.
- Around line 175-178: hotkey_from_ss58 is declared async but contains no
awaits; change its signature from pub async fn hotkey_from_ss58(ss58: &str) ->
Result<Hotkey> to a synchronous pub fn hotkey_from_ss58(ss58: &str) ->
Result<Hotkey> and remove the unnecessary async keyword so callers no longer
must .await the function while keeping the existing body that calls
Hotkey::from_ss58(...) and returns StorageError::InvalidData on failure.
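
An `async fn` with no `.await` points can become a plain `fn` with no behavioral change, so this reshape is mechanical. As a stdlib analogue (the SDK's `Hotkey`/SS58 types are not available here, so this hypothetical sketch parses a 32-byte key from hex instead):

```rust
#[derive(Debug)]
pub enum StorageError {
    InvalidData(String),
}

// Synchronous: nothing here suspends, so `async` would only force callers
// to `.await` a value that is already computed.
pub fn hotkey_from_hex(hex: &str) -> Result<[u8; 32], StorageError> {
    if hex.len() != 64 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(StorageError::InvalidData(format!("bad hotkey: {hex}")));
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16)
            .map_err(|e| StorageError::InvalidData(e.to_string()))?;
    }
    Ok(out)
}
```

The real fix is just deleting the `async` keyword and the `.await` at each call site; the body stays as-is.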

Comment on lines +37 to +48
impl Default for PgConfig {
fn default() -> Self {
Self {
host: "localhost".to_string(),
port: 5432,
user: "postgres".to_string(),
password: "postgres".to_string(),
dbname: "term_challenge".to_string(),
pool_size: default_pool_size(),
}
}
}

⚠️ Potential issue | 🟡 Minor

Default password "postgres" should be environment-driven, not hardcoded.

The Default impl ships with password: "postgres", which is likely to be used in production by accident. At minimum add a doc comment making it explicit this is a dev-only default, or remove the password default and require callers to supply it explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/pg.rs` around lines 37 - 48, The Default impl for PgConfig
currently hardcodes password="postgres"; change it to be environment-driven: in
the Default implementation for PgConfig (the Default::default function) read the
password from std::env::var("PG_PASSWORD") and use that value (e.g.
env::var("PG_PASSWORD").unwrap_or_default()) instead of the literal "postgres",
and add a brief doc comment on PgConfig::default stating this is a dev-only
fallback and that production must set PG_PASSWORD; leave other fields (e.g.
pool_size via default_pool_size()) unchanged.
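
One way the dev-only fallback could look, assuming (as the reviewer does) that `PG_PASSWORD` is the variable of record; the helper name here is invented for illustration:

```rust
use std::env;

/// Dev-only fallback. Production deployments must set the environment
/// variable; "postgres" is returned only when it is absent.
fn password_from_env(var: &str) -> String {
    env::var(var).unwrap_or_else(|_| "postgres".to_string())
}
```

`PgConfig::default()` would then use `password_from_env("PG_PASSWORD")` in place of the hardcoded literal, keeping local development frictionless while making the production requirement explicit in the doc comment.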

Comment on lines +50 to +61
pub fn create_pool(cfg: &PgConfig) -> Result<PgPool, CreatePoolError> {
let mut config = Config::new();
config.host = Some(cfg.host.clone());
config.port = Some(cfg.port);
config.user = Some(cfg.user.clone());
config.password = Some(cfg.password.clone());
config.dbname = Some(cfg.dbname.clone());
config.manager = Some(ManagerConfig {
recycling_method: RecyclingMethod::Fast,
});

config.create_pool(Some(Runtime::Tokio1), NoTls)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -name "pg.rs" -path "*/storage/*" 2>/dev/null | head -5

Repository: PlatformNetwork/term-challenge-v2

Length of output: 59


🏁 Script executed:

git ls-files | grep -E "storage.*pg\.rs|pg\.rs.*storage"

Repository: PlatformNetwork/term-challenge-v2

Length of output: 94


🏁 Script executed:

git ls-files | grep "pg.rs"

Repository: PlatformNetwork/term-challenge-v2

Length of output: 94


🏁 Script executed:

cat -n storage/src/pg.rs

Repository: PlatformNetwork/term-challenge-v2

Length of output: 2211


🌐 Web query:

deadpool_postgres Config pool field PoolConfig API

💡 Result:

In deadpool_postgres, the connection config struct is:

  • deadpool_postgres::Config { …, pub pool: Option<PoolConfig>, … } — i.e., the pool field is optional, and holds the pool-tuning settings. [1]

deadpool_postgres::PoolConfig is:

pub struct PoolConfig {
    pub max_size: usize,
    pub timeouts: Timeouts,
    pub queue_mode: QueueMode,
}

with defaults documented for each field (e.g., max_size defaults to cpu_count * 4, queue_mode defaults to Fifo). [2]

If you enable the serde feature, you can set these via env/config using nested keys like:

  • PG__POOL__MAX_SIZE=16
  • PG__POOL__TIMEOUTS__WAIT__SECS=5 (etc.) [1]

Sources: [1] https://docs.rs/deadpool-postgres/latest/deadpool_postgres/struct.Config.html, [2] https://docs.rs/deadpool-postgres/latest/deadpool_postgres/struct.PoolConfig.html


🌐 Web query:

deadpool_postgres PoolConfig::new constructor method signature

💡 Result:

In deadpool_postgres (PoolConfig), the constructor signature is:

pub fn new(max_size: usize) -> PoolConfig

[1]


pool_size configuration is silently ignored — pool always uses the default size.

The config.pool field is never set, so the pool_size from PgConfig is not applied. deadpool_postgres::Config has a pub pool: Option<PoolConfig> field that controls pool configuration. Without setting it, the pool defaults to PoolConfig::default() with max_size = cpu_count * 4, ignoring the intended pool size of 16 or any custom value.

Fix
 use deadpool_postgres::{Config, CreatePoolError, ManagerConfig, Pool, RecyclingMethod, Runtime};
+use deadpool_postgres::PoolConfig;
 use serde::Deserialize;
 use tokio_postgres::NoTls;

 pub fn create_pool(cfg: &PgConfig) -> Result<PgPool, CreatePoolError> {
     let mut config = Config::new();
     config.host = Some(cfg.host.clone());
     config.port = Some(cfg.port);
     config.user = Some(cfg.user.clone());
     config.password = Some(cfg.password.clone());
     config.dbname = Some(cfg.dbname.clone());
     config.manager = Some(ManagerConfig {
         recycling_method: RecyclingMethod::Fast,
     });
+    config.pool = Some(PoolConfig::new(cfg.pool_size));

     config.create_pool(Some(Runtime::Tokio1), NoTls)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/pg.rs` around lines 50 - 61, The create_pool function is not
applying PgConfig.pool_size because Config.pool is never set; update create_pool
to set config.pool = Some(PoolConfig { max_size: cfg.pool_size,
..Default::default() }) (or equivalent builder) before calling
config.create_pool so the deadpool_postgres pool uses the intended size from
PgConfig.pool_size; locate this in the create_pool function where Config is
built (symbols: create_pool, PgConfig, Config, PoolConfig, config.create_pool)
and add the pool assignment.

&result.score,
&metrics_json,
&result.logs,
&(result.execution_time_ms as i64),

⚠️ Potential issue | 🟡 Minor

execution_time_ms as i64 truncates silently.

Same concern as in submissions.rs: a u64 value larger than i64::MAX wraps to a negative number and is stored without error. Use a checked conversion.

🛡️ Proposed fix
-            &(result.execution_time_ms as i64),
+            &i64::try_from(result.execution_time_ms)
+                .map_err(|_| StorageError::Serialization(
+                    format!("execution_time_ms {} overflows i64", result.execution_time_ms)
+                ))?,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/evaluations.rs` at line 41, The direct cast
"(result.execution_time_ms as i64)" can silently wrap negative on overflow;
replace it with a checked conversion using TryFrom/TryInto (e.g.
i64::try_from(result.execution_time_ms)) and propagate/return a clear error if
the conversion fails so an overflow is not stored. Update the code that builds
the DB parameter (the place referencing result.execution_time_ms in
evaluations.rs) to perform this checked conversion and handle the Err case
consistently with the function's existing error handling.
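
The behavioral difference between `as` and `try_from` is easy to check in isolation; this small sketch mirrors the proposed fix (the error type is simplified to `String` here):

```rust
// `as` wraps: u64::MAX as i64 becomes -1 with no error, so an oversized
// duration would be stored as a nonsense negative value.
// `try_from` surfaces the overflow instead.
fn exec_time_param(ms: u64) -> Result<i64, String> {
    i64::try_from(ms).map_err(|_| format!("execution_time_ms {ms} overflows i64"))
}
```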

Comment on lines +116 to +150
pub async fn list_evaluations_by_agent(
pool: &PgPool,
agent_hash: &str,
) -> Result<Vec<EvaluationResult>> {
let client = pool.get().await?;

let rows = client
.query(
"SELECT job_id, agent_hash, score, metrics, logs, execution_time_ms, evaluated_at
FROM evaluations
WHERE agent_hash = $1
ORDER BY evaluated_at DESC",
&[&agent_hash],
)
.await?;

let mut results = Vec::with_capacity(rows.len());
for row in &rows {
let metrics_json: serde_json::Value = row.get("metrics");
let metrics: HashMap<String, f64> = serde_json::from_value(metrics_json)?;
let execution_time_ms: i64 = row.get("execution_time_ms");

results.push(EvaluationResult {
job_id: row.get("job_id"),
agent_hash: row.get("agent_hash"),
score: row.get("score"),
metrics,
logs: row.get("logs"),
execution_time_ms: execution_time_ms as u64,
timestamp: row.get("evaluated_at"),
});
}

Ok(results)
}

⚠️ Potential issue | 🟠 Major

list_evaluations_by_agent is an unbounded query.

Unlike list_evaluations_by_challenge, this function has no limit/offset parameters. For any agent with substantial evaluation history this will load the entire set into memory, causing excessive memory use and long query times in production.

🛡️ Proposed fix
 pub async fn list_evaluations_by_agent(
     pool: &PgPool,
     agent_hash: &str,
+    limit: i64,
+    offset: i64,
 ) -> Result<Vec<EvaluationResult>> {
     let client = pool.get().await?;

     let rows = client
         .query(
             "SELECT job_id, agent_hash, score, metrics, logs, execution_time_ms, evaluated_at
              FROM evaluations
              WHERE agent_hash = $1
-             ORDER BY evaluated_at DESC",
-            &[&agent_hash],
+             ORDER BY evaluated_at DESC
+             LIMIT $2 OFFSET $3",
+            &[&agent_hash, &limit, &offset],
         )
         .await?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/evaluations.rs` around lines 116 - 150, The function
list_evaluations_by_agent currently issues an unbounded query and may load all
rows for an agent; change the signature of list_evaluations_by_agent to accept
pagination parameters (e.g., limit: i64 and offset: i64 or usize), modify the
SQL to add "LIMIT $2 OFFSET $3", pass the new parameters into the client.query
call (e.g., &[&agent_hash, &limit, &offset]), and update callers accordingly;
ensure the result mapping (metrics deserialization, execution_time_ms
conversion, and construction of EvaluationResult) remains unchanged but now only
iterates the bounded rows returned.

Comment on lines +55 to +65
"SELECT hotkey, score, weight,
COALESCE(submissions_count, 0) as submissions_count,
updated_at,
ROW_NUMBER() OVER (ORDER BY score DESC) as rank
FROM leaderboard
WHERE challenge_id = $1
ORDER BY score DESC
LIMIT $2 OFFSET $3",
&[&challenge_uuid, &limit, &offset],
)
.await?;

⚠️ Potential issue | 🟡 Minor

Rank calculation differs between get_leaderboard and get_entry_by_hotkey for tied scores.

get_leaderboard uses ROW_NUMBER() OVER (ORDER BY score DESC), which breaks ties arbitrarily (non-deterministic for equal scores). get_entry_by_hotkey uses COUNT(*) + 1 WHERE score > current_score, which assigns the same rank to tied entries. For identical underlying data, the two functions can return different rank values for the same hotkey. Align on one strategy (e.g., RANK() or DENSE_RANK()) across both queries.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/leaderboard.rs` around lines 55 - 65, get_leaderboard
computes ranks using ROW_NUMBER() while get_entry_by_hotkey computes ranks by
counting scores greater than current_score, causing inconsistent ranks for tied
scores; update get_leaderboard to use the same tie-handling strategy as
get_entry_by_hotkey (choose either RANK() or DENSE_RANK() to give equal ranks to
ties) by replacing ROW_NUMBER() OVER (ORDER BY score DESC) with RANK() OVER
(ORDER BY score DESC) (or DENSE_RANK() if you prefer no gaps) in the SQL query
inside get_leaderboard so both get_leaderboard and get_entry_by_hotkey return
consistent ranks for tied scores.
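
To see why the two strategies disagree on ties, here is a small stdlib model of ROW_NUMBER, RANK, and DENSE_RANK over a descending score list; it sketches the SQL window-function semantics, not the actual leaderboard code:

```rust
// scores must already be sorted descending, mirroring ORDER BY score DESC.
// Returns (row_number, rank, dense_rank) for each entry.
fn ranks(scores: &[f64]) -> Vec<(u64, u64, u64)> {
    let mut out = Vec::with_capacity(scores.len());
    let mut rank = 0u64;
    let mut dense = 0u64;
    for (i, &s) in scores.iter().enumerate() {
        let row_number = i as u64 + 1;
        if i == 0 || s != scores[i - 1] {
            rank = row_number; // RANK skips ahead after a run of ties
            dense += 1;        // DENSE_RANK increments once per distinct score
        }
        out.push((row_number, rank, dense));
    }
    out
}
```

For scores [0.9, 0.8, 0.8, 0.7], ROW_NUMBER gives the tied pair 2 and 3 arbitrarily, while RANK gives both 2. Note that `COUNT(*) + 1` over strictly greater scores reproduces exactly the RANK column, which is why `RANK()` is the natural target for aligning `get_leaderboard` with `get_entry_by_hotkey`.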

Comment on lines +79 to +101
client
.execute(
"UPDATE submissions SET score = $2, status = $3, evaluated_at = NOW()
WHERE id = $1",
&[id, &score, &status],
)
.await?;

Ok(())
}

pub async fn update_submission_status(pool: &PgPool, id: &Uuid, status: &str) -> Result<()> {
let client = pool.get().await?;

client
.execute(
"UPDATE submissions SET status = $2 WHERE id = $1",
&[id, &status],
)
.await?;

Ok(())
}

⚠️ Potential issue | 🟡 Minor

update_submission_score / update_submission_status silently no-op on missing ID.

Both functions return Ok(()) even when rows_affected == 0, making it impossible for callers to distinguish a successful update from an update that silently targeted a non-existent submission. Consider returning StorageError::NotFound when rows_affected == 0.

🛡️ Proposed fix (same pattern for both functions)
     client
         .execute(...)
         .await?;
+
+    if rows_affected == 0 {
+        return Err(StorageError::NotFound(format!("submission {}", id)));
+    }

     Ok(())

Apply this to both update_submission_score and update_submission_status (remembering to bind the .execute() result to rows_affected).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/submissions.rs` around lines 79 - 101, Both
update_submission_score and update_submission_status currently ignore the number
of rows affected and always return Ok(()), causing silent no-ops when the id
doesn't exist; change each function (update_submission_score and
update_submission_status) to bind the result of client.execute(...) to a
rows_affected variable, check if rows_affected == 0 and return
Err(StorageError::NotFound) (or appropriate StorageError variant) in that case,
otherwise continue returning Ok(()).
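
Isolated from the database, the guard this prompt describes is a two-line pattern; the `StorageError` variant shape is assumed from the crate summary above:

```rust
#[derive(Debug, PartialEq)]
pub enum StorageError {
    NotFound(String),
}

// tokio_postgres's execute() resolves to the number of rows affected;
// zero means the WHERE clause matched nothing, i.e. the id does not exist.
fn ensure_row_updated(rows_affected: u64, id: &str) -> Result<(), StorageError> {
    if rows_affected == 0 {
        return Err(StorageError::NotFound(format!("submission {id}")));
    }
    Ok(())
}
```

Each update function would pass its `client.execute(...).await?` result through this check before returning.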

Comment on lines +191 to +202
let row = client
.query_opt(
"SELECT MAX(epoch) as last_epoch FROM submissions
WHERE challenge_id = $1 AND miner_hotkey = $2",
&[&challenge_uuid, &hotkey_ss58],
)
.await?;

match row {
Some(row) => Ok(row.get("last_epoch")),
None => Ok(None),
}

⚠️ Potential issue | 🟡 Minor

query_opt misused for a MAX aggregate — None arm is unreachable dead code.

SELECT MAX(epoch) … always returns exactly one row (with a NULL value when the table is empty). query_opt returns an error if the query returns more than one row, but for a MAX aggregate it always returns Some(row), so the None => Ok(None) branch can never be reached. Use query_one and retrieve Option<i64> directly from the row.

♻️ Proposed fix
-    let row = client
-        .query_opt(
-            "SELECT MAX(epoch) as last_epoch FROM submissions
-             WHERE challenge_id = $1 AND miner_hotkey = $2",
-            &[&challenge_uuid, &hotkey_ss58],
-        )
-        .await?;
-
-    match row {
-        Some(row) => Ok(row.get("last_epoch")),
-        None => Ok(None),
-    }
+    let row = client
+        .query_one(
+            "SELECT MAX(epoch) as last_epoch FROM submissions
+             WHERE challenge_id = $1 AND miner_hotkey = $2",
+            &[&challenge_uuid, &hotkey_ss58],
+        )
+        .await?;
+
+    Ok(row.get("last_epoch"))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/submissions.rs` around lines 191 - 202, The SELECT
MAX(epoch) query always returns one row, so using client.query_opt leads to an
unreachable None branch; replace the query_opt call with client.query_one(...)
(awaiting the result) and then extract the nullable epoch as Option<i64> from
the returned row (row.get("last_epoch")), removing the unused None => Ok(None)
arm in submissions.rs so the function returns the Option directly from the
single row.

Comment on lines +67 to +111
pub async fn insert_task_logs_batch(
pool: &PgPool,
submission_id: &Uuid,
challenge_id: &ChallengeId,
miner_hotkey: &Hotkey,
logs: &[TaskLogRecord],
) -> Result<Vec<Uuid>> {
let client = pool.get().await?;
let challenge_uuid: Uuid = challenge_id.0;
let hotkey_ss58 = miner_hotkey.to_ss58();

let mut ids = Vec::with_capacity(logs.len());

for (task_id, passed, score, execution_time_ms, output_preview, error) in logs {
let id = Uuid::new_v4();
let exec_time_i64 = *execution_time_ms as i64;
let output_ref = output_preview.as_deref();
let error_ref = error.as_deref();

client
.execute(
"INSERT INTO task_logs (
id, submission_id, challenge_id, miner_hotkey, task_id,
passed, score, execution_time_ms, output_preview, error, created_at
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, NOW())",
&[
&id,
submission_id,
&challenge_uuid,
&hotkey_ss58,
task_id,
passed,
score,
&exec_time_i64,
&output_ref,
&error_ref,
],
)
.await?;

ids.push(id);
}

Ok(ids)
}

⚠️ Potential issue | 🟠 Major

insert_task_logs_batch is not wrapped in a transaction — partial writes on failure.

Each client.execute() in the loop auto-commits independently. If any insertion fails mid-batch, the earlier rows are permanently committed while the remaining ones are never written, leaving the submissions's task-log set in a corrupt partial state.

deadpool_postgres::Client.transaction returns a wrapped transaction object which utilises the statement cache of the wrapped client, so the fix requires only a client.transaction() call.

🛡️ Proposed fix
 pub async fn insert_task_logs_batch(
     pool: &PgPool,
     submission_id: &Uuid,
     challenge_id: &ChallengeId,
     miner_hotkey: &Hotkey,
     logs: &[TaskLogRecord],
 ) -> Result<Vec<Uuid>> {
-    let client = pool.get().await?;
+    let mut client = pool.get().await?;
     let challenge_uuid: Uuid = challenge_id.0;
     let hotkey_ss58 = miner_hotkey.to_ss58();

+    let txn = client.transaction().await?;
     let mut ids = Vec::with_capacity(logs.len());

     for (task_id, passed, score, execution_time_ms, output_preview, error) in logs {
         let id = Uuid::new_v4();
         let exec_time_i64 = *execution_time_ms as i64;
         let output_ref = output_preview.as_deref();
         let error_ref = error.as_deref();

-        client
+        txn
             .execute(
                 "INSERT INTO task_logs (
                     id, submission_id, challenge_id, miner_hotkey, task_id,
                     passed, score, execution_time_ms, output_preview, error, created_at
                 ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, NOW())",
                 &[
                     &id, submission_id, &challenge_uuid, &hotkey_ss58,
                     task_id, passed, score, &exec_time_i64, &output_ref, &error_ref,
                 ],
             )
             .await?;

         ids.push(id);
     }

+    txn.commit().await?;
     Ok(ids)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/task_logs.rs` around lines 67 - 111,
insert_task_logs_batch currently issues individual client.execute calls causing
partial commits on failure; wrap the whole insert loop in a single transaction
by calling client.transaction().await? to get a Transaction, replace
client.execute(...) with txn.execute(...) inside the for loop, collect the
generated ids as before, and then call txn.commit().await? before returning
Ok(ids) so all inserts are atomic (and let errors propagate to roll back
automatically if commit is not reached). Include the same parameter values (id,
submission_id, challenge_uuid, hotkey_ss58, task_id, passed, score,
exec_time_i64, output_ref, error_ref) when switching to txn.execute and keep
insert logic in insert_task_logs_batch.

) -> Result<()> {
let client = pool.get().await?;
let hotkey_ss58 = hotkey.to_ss58();
let stake_i64 = stake as i64;

⚠️ Potential issue | 🟠 Major

Silent u64 → i64 cast can silently corrupt large stake values.

stake as i64 wraps on values above i64::MAX (~9.2 × 10¹⁸). On Bittensor, stake amounts could approach this range. Use a checked cast and propagate an error rather than silently corrupting the stored value.

🐛 Proposed fix
-    let stake_i64 = stake as i64;
+    let stake_i64 = i64::try_from(stake)
+        .map_err(|_| StorageError::InvalidData(format!("stake value {} overflows i64", stake)))?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/validators.rs` at line 25, The silent cast `let
stake_i64 = stake as i64;` can wrap large u64 values; replace it with a checked
conversion like `i64::try_from(stake)` (or `stake.try_into()`) and propagate a
clear error on failure (e.g., return Err(ValidationError::StakeOverflow) or
map_err to the function's existing error type). Update the surrounding
code/signature if necessary so the function in validators.rs that uses
`stake_i64` returns a Result and surfaces the conversion error instead of
allowing silent wrapping.

Comment on lines +29 to +36
"INSERT INTO validators (hotkey, stake, is_active, last_seen, peer_id, registered_at)
VALUES ($1, $2, true, NOW(), $3, NOW())
ON CONFLICT (hotkey) DO UPDATE SET
stake = EXCLUDED.stake,
is_active = true,
last_seen = NOW(),
peer_id = COALESCE(EXCLUDED.peer_id, validators.peer_id)",
&[&hotkey_ss58, &stake_i64, &peer_id],

⚠️ Potential issue | 🟡 Minor

COALESCE(EXCLUDED.peer_id, validators.peer_id) makes it impossible to explicitly clear peer_id to NULL.

Passing None as peer_id will silently retain the previously stored value. If clearing the field is a valid operation, the ON CONFLICT clause should unconditionally assign peer_id = EXCLUDED.peer_id.

🐛 Proposed fix (if clearing should be supported)
-                peer_id = COALESCE(EXCLUDED.peer_id, validators.peer_id)",
+                peer_id = EXCLUDED.peer_id",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@storage/src/postgres/validators.rs` around lines 29 - 36, The ON CONFLICT
clause in the INSERT for the validators table uses COALESCE(EXCLUDED.peer_id,
validators.peer_id), which prevents explicitly clearing peer_id to NULL; change
the upsert to assign peer_id = EXCLUDED.peer_id (remove COALESCE) so that
passing None/NULL will overwrite the previous value with NULL. Update the SQL
string used in validators.rs (the INSERT ... ON CONFLICT (...) DO UPDATE SET ...
clause) accordingly so peer_id is unconditionally assigned from
EXCLUDED.peer_id.

@echobt echobt merged commit eed6485 into main Feb 20, 2026
6 checks passed
@echobt echobt deleted the feat/pg-storage-sdk-types branch February 20, 2026 22:21