
P4-W4: Metadata and observability for export jobs#125

Merged
user1303836 merged 1 commit into main from p4-w4-metadata-observability
Mar 9, 2026

Conversation

@user1303836
Owner

Summary

Attach provenance metadata and status introspection to export and materialization jobs so downstream consumers can assess data lineage and completeness.

  • ExportJobStatus enrichment — added dataset_version_id, completeness_status, started_at / completed_at wall-clock timestamps, and last_ingestion_run_id to ExportJobStatus
  • run_export_job provenance lookup — enriched the export job runner to look up the active DatasetVersion and DatasetCompleteness records and propagate them into the job status
  • DeliveryMetadata provenance — propagated provenance fields into DeliveryMetadata so sinks receive complete lineage information
  • GET /v1/datasets/{name}/status — new materialization introspection endpoint returning dataset status, active version, completeness, and recent export job history
  • Comprehensive tests — 8 new tests covering serde round-trips, backward compatibility, defaults, provenance fields, and response shapes (780 total, 0 failures)

Changed files

  • core/src/materializer.rs: New provenance fields on ExportJobStatus and DeliveryMetadata
  • adapters/src/v2_repo.rs: Provenance lookup helpers for dataset version and completeness
  • api/src/main.rs: GET /v1/datasets/{name}/status endpoint; provenance wiring in export runner
  • README.md: Document new endpoint

Validation

All CI-equivalent checks pass locally:

  • cargo fmt --all --check
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo test --workspace (780 tests, 0 failures)
  • 8 new P4-W4 tests cover metadata serialization, backward compatibility, defaults, delivery provenance, and dataset status response shape

Test plan

  • Verify cargo test --workspace passes in CI
  • Verify cargo clippy and cargo fmt pass in CI
  • Review provenance fields on ExportJobStatus for completeness
  • Review /v1/datasets/{name}/status response shape
  • Confirm backward compatibility — old clients without new fields still deserialize correctly

🤖 Generated with Claude Code

Enrich ExportJobStatus with provenance metadata: dataset_version_id,
dataset_version, completeness_status, completeness_coverage, started_at,
completed_at, and last_ingestion_run_id. These fields use
skip_serializing_if for backward compatibility.

Enhance run_export_job to look up the active DatasetVersion and
DatasetCompleteness records, aggregating coverage bounds and status
across matching targets/networks.

Enrich DeliveryMetadata with dataset_version_id and completeness_status
so sink consumers receive provenance context.

Add GET /v1/datasets/{name}/status endpoint for materialization
introspection: returns active version, all versions, and completeness
records for downstream consumers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 6a7e5fd842e4
@user1303836
Owner Author

P4-W4 Remote Review: Metadata and Observability

Verdict: No blocking issues. Review pass is clear.

Reviewed the full diff against the 6 review criteria:

✅ Backward compatibility

All 7 new fields on ExportJobStatus use skip_serializing_if = "Option::is_none". Tests (test_export_job_status_backward_compat_no_metadata, test_export_job_status_metadata_fields_omitted_when_none) explicitly verify that None metadata fields are omitted from JSON, preserving the pre-P4-W4 response shape. DeliveryMetadata provenance fields also correctly use skip_serializing_if.

✅ Query correctness

get_active_dataset_version uses fetch_optional and the caller applies .ok().flatten(); list_completeness_filtered uses fetch_all and the caller applies .unwrap_or_default(). Both handle missing records gracefully with no panic paths. Dynamic SQL parameter binding in list_completeness_filtered correctly tracks param_idx to match bind order across all filter combinations.
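The degradation behavior of those two call sites can be shown with a stdlib-only sketch. `fake_fetch_optional` is a hypothetical stand-in for sqlx's `fetch_optional`, which yields `Result<Option<Row>, Error>`; the names and types here are illustrative, not the project's actual API.

```rust
// Stand-in for a DB lookup: Ok(Some(_)) = row found, Ok(None) = no row,
// Err(_) = query failure.
fn fake_fetch_optional(found: bool, fail: bool) -> Result<Option<&'static str>, &'static str> {
    if fail {
        Err("db error")
    } else if found {
        Ok(Some("v42"))
    } else {
        Ok(None)
    }
}

fn main() {
    // .ok().flatten(): both a query error and a missing row collapse to None,
    // so the export job proceeds without provenance instead of panicking.
    assert_eq!(fake_fetch_optional(true, false).ok().flatten(), Some("v42"));
    assert_eq!(fake_fetch_optional(false, false).ok().flatten(), None);
    assert_eq!(fake_fetch_optional(true, true).ok().flatten(), None);

    // .unwrap_or_default(): a failed list query degrades to an empty Vec.
    let rows: Result<Vec<&str>, &'static str> = Err("db error");
    let completeness: Vec<&str> = rows.unwrap_or_default();
    assert!(completeness.is_empty());
    println!("ok");
}
```

The trade-off of both patterns is that a transient DB error is indistinguishable from genuinely missing provenance, which is acceptable here because the fields are optional metadata.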

✅ No wallet-only assumptions

Export code uses generic target_id: Option<Uuid> and network: Option<&str> throughout. Completeness filtering, provenance lookups, and the new status endpoint are all target-type agnostic.

✅ Performance

Metadata lookups (one version query + one completeness list) execute once per export job at start time, not on status polls. The /v1/datasets/{name}/status endpoint makes two straightforward queries. No N+1 patterns or heavy joins.

✅ No migration required

All enrichment is in-memory. ExportMetadata, DatasetStatus, and DatasetCompletenessInfo are API-layer types only. Queries use existing dataset_versions and dataset_completeness tables.

⚠️ Non-blocking: gap_ranges omission

gap_ranges from DatasetCompleteness is fetched from the DB but not propagated to either completeness_coverage in the export response or DatasetCompletenessInfo in the status endpoint. The presence of gaps is captured via the aggregated status string (e.g., "gap"), and the full gap_ranges remain available through the existing /v1/datasets/{name}/completeness endpoint. Worth adding in a follow-up for completeness, but not a blocker since the semantics are not reduced to a boolean.

Non-blocking notes

  1. last_ingestion_run_id aggregation uses .rev().find_map() based on query ordering (ORDER BY target_id, network), not temporal ordering. Consider selecting the run ID from the record with the most recent updated_at in a follow-up.
  2. DatasetCompletenessInfo omits block_start/block_end from the per-record status view (they are included in aggregated export coverage). Minor omission.
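The follow-up suggested in note 1 can be sketched as follows. The record type and field names here are hypothetical placeholders, not the project's actual types; the point is only the aggregation: select by greatest `updated_at` rather than by query order.

```rust
// Hypothetical per-(target, network) completeness record.
struct CompletenessRecord {
    updated_at: u64, // e.g. unix seconds
    last_ingestion_run_id: Option<String>,
}

// Pick the run ID from the most recently updated record, ignoring records
// with no run ID, instead of relying on (target_id, network) sort order.
fn latest_run_id(records: &[CompletenessRecord]) -> Option<String> {
    records
        .iter()
        .filter(|r| r.last_ingestion_run_id.is_some())
        .max_by_key(|r| r.updated_at)
        .and_then(|r| r.last_ingestion_run_id.clone())
}

fn main() {
    let records = vec![
        CompletenessRecord { updated_at: 100, last_ingestion_run_id: Some("run-a".into()) },
        CompletenessRecord { updated_at: 300, last_ingestion_run_id: Some("run-c".into()) },
        CompletenessRecord { updated_at: 200, last_ingestion_run_id: Some("run-b".into()) },
    ];
    assert_eq!(latest_run_id(&records), Some("run-c".to_string()));
    println!("{:?}", latest_run_id(&records));
}
```

An equivalent fix could instead add `ORDER BY updated_at` to the query, at the cost of changing the order the status endpoint reports records in.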

This packet is ready to merge from the PR-comment review perspective.

@user1303836 user1303836 merged commit f76539b into main Mar 9, 2026
4 checks passed
@user1303836 user1303836 deleted the p4-w4-metadata-observability branch March 9, 2026 00:21