P5-W1: Wallet / Tax / Forensics Pack — Gold datasets, tax export, forensics view#127

Merged: user1303836 merged 1 commit into main from p5-w1-wallet-tax-forensics-pack on Mar 9, 2026

@user1303836
Owner

Summary

Phase 5, Work Packet 1: proves the platform powers wallet, tax, and forensics use cases by adding Gold-tier datasets materialized entirely from existing Silver data — no raw chain parsing required.

  • Gold wallet_ledger dataset with counterparty tracking, derived from Silver token_transfers
  • Gold balance_history dataset for point-in-time balance snapshots
  • Tax-export CSV via GET /v1/export/tax — 13-column format with gain/loss computation
  • Forensics activity view via GET /v1/forensics/activity — counterparty aggregation, cross-chain activity, and transaction type breakdown
  • Full dataset API integration: registered in DatasetName::all(), QUERYABLE_DATASETS, and EXPORTABLE_DATASETS
  • Migrations for wallet_ledger and balance_history tables with appropriate indexes
  • README updated with new endpoints and Gold dataset documentation
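The balance_history dataset above is described as point-in-time snapshots. As a rough illustration only, a running balance per (asset, network) group could be computed along these lines. The `LedgerEntry` and `Snapshot` structs and the `running_balances` function are hypothetical simplifications, not the PR's actual `WalletLedgerRecord` or `BalanceSnapshot` types.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified ledger entry (assumption: the real record has more fields).
#[derive(Debug, Clone)]
pub struct LedgerEntry {
    pub asset_symbol: String,
    pub network: String,
    pub timestamp: i64, // seconds since epoch
    pub amount: i128,   // signed, in the asset's smallest unit
}

/// Hypothetical point-in-time balance snapshot.
#[derive(Debug, Clone, PartialEq)]
pub struct Snapshot {
    pub asset_symbol: String,
    pub network: String,
    pub timestamp: i64,
    pub balance: i128,
}

pub fn running_balances(entries: &[LedgerEntry]) -> Vec<Snapshot> {
    // Group by (asset, network) so balances on different chains never mix.
    let mut groups: BTreeMap<(String, String), Vec<&LedgerEntry>> = BTreeMap::new();
    for e in entries {
        groups
            .entry((e.asset_symbol.clone(), e.network.clone()))
            .or_default()
            .push(e);
    }

    let mut out = Vec::new();
    for ((asset, network), mut group) in groups {
        // Stable sort by timestamp, then accumulate a running sum within the group.
        group.sort_by_key(|e| e.timestamp);
        let mut balance: i128 = 0;
        for e in group {
            balance += e.amount;
            out.push(Snapshot {
                asset_symbol: asset.clone(),
                network: network.clone(),
                timestamp: e.timestamp,
                balance,
            });
        }
    }
    out
}
```

Using a `BTreeMap` keeps the output order deterministic across runs, which tends to make snapshot tests easier to write; a `HashMap` would work equally well if ordering does not matter.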

Validation

  • 848 tests pass (all green)
  • cargo fmt --all --check clean
  • cargo clippy --workspace --all-targets -- -D warnings clean
  • All P4-W5 compatibility surface tests pass
  • Tax export produces well-formed 13-column CSV with Gain_Loss computation
  • Counterparty extraction correctly derives sender/receiver from token_transfers
  • One minor test copy-paste warning noted (non-blocking)
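For reviewers who want a mental model of the counterparty check listed above, here is a minimal sketch of deriving the other side of a transfer relative to the tracked wallet. The `TokenTransfer` struct, its field names, and `counterparty_for` are assumptions made for illustration; the real logic lives in adapters/src/ledger_derivation.rs and may differ.

```rust
/// Hypothetical, simplified transfer row (assumption: field names are illustrative).
pub struct TokenTransfer {
    pub from_address: String,
    pub to_address: String,
}

/// Returns the other party of the transfer from the wallet's point of view:
/// the receiver for outflows, the sender for inflows.
pub fn counterparty_for<'a>(wallet: &str, t: &'a TokenTransfer) -> Option<&'a str> {
    // ASCII case-insensitive comparison, as used for hex-encoded addresses.
    let is_sender = t.from_address.eq_ignore_ascii_case(wallet);
    let is_receiver = t.to_address.eq_ignore_ascii_case(wallet);
    match (is_sender, is_receiver) {
        (true, false) => Some(t.to_address.as_str()),   // outflow
        (false, true) => Some(t.from_address.as_str()), // inflow
        // Assumption: self-transfers and unrelated transfers have no counterparty.
        _ => None,
    }
}
```

Case-insensitive comparison suits hex-encoded EVM addresses; base58 addresses are case-sensitive, so a production version might branch on chain family before comparing.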

Files changed

| File | Change |
| --- | --- |
| core/src/materializer.rs | Gold dataset definitions, materializer registry |
| adapters/src/ledger_derivation.rs | Counterparty extraction, wallet_ledger/balance_history derivation |
| adapters/src/v2_repo.rs | Persistence for Gold datasets |
| api/src/main.rs | Tax export + forensics activity endpoints, dataset integration |
| migrations/20260310200000_add_wallet_ledger.sql | wallet_ledger table + indexes |
| migrations/20260310200001_add_balance_history.sql | balance_history table + indexes |
| README.md | New endpoints + Gold dataset docs |

Test plan

  • Verify cargo test --workspace passes (848 tests)
  • Verify cargo clippy and cargo fmt are clean
  • Review counterparty extraction logic in ledger_derivation.rs
  • Review tax CSV column layout and gain/loss computation
  • Review forensics activity aggregation endpoint
  • Confirm Gold datasets appear in dataset list/query/export APIs
  • Confirm migrations are forward-compatible with existing schema

🤖 Generated with Claude Code

…iew (P5-W1)

Prove the platform powers wallet, tax, and forensics use cases by adding
Gold-tier datasets derived from existing Silver data without raw chain parsing.

- Add WalletLedger and BalanceHistory variants to DatasetName enum
- Add WalletLedgerRecord and BalanceSnapshot record types with counterparty
  tracking, fee breakdown, and nullable cost_basis/proceeds fields
- Add ForensicsActivity, CounterpartySummary, NetworkActivity, TypeBreakdown types
- Add SQL migrations for wallet_ledger and balance_history tables
- Implement wallet_ledger materializer deriving from token_transfers,
  native_balance_deltas, hl_fills, and hl_funding with counterparty extraction
- Implement balance_history materializer computing running balances per asset
- Implement forensics activity builder with top counterparties and cross-chain summary
- Add repository query/export methods for both Gold datasets
- Register wallet_ledger and balance_history in dataset API (queryable + exportable)
- Add GET /v1/export/tax endpoint with tax-software-friendly CSV format
- Add GET /v1/forensics/activity endpoint for wallet interaction analysis
- Update README.md with new datasets, endpoints, and tax export shape

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 62017c942ed3
@user1303836
Owner Author

P5-W1 Remote Review: Wallet / Tax / Forensics Pack

Verdict: Review pass is clear. No blocking issues.

All 4 CI checks pass (fmt, clippy, test, build). Reviewed the full 2052-line diff across 7 files.

Review focus findings

  1. Counterparty extraction — Correct. derive_wallet_ledger_from_token_transfers properly derives counterparty as receiver for outflows and sender for inflows using case-insensitive matching. Native balance deltas and HL fills/funding correctly leave counterparty as None since they lack direct from/to semantics. No decoded_events counterparty extraction was attempted, which is the right call — decoded events are too heterogeneous for generic counterparty inference.

  2. Balance history computation — Running SUM logic is correct. Grouping by (asset_symbol, network) properly separates multi-chain balances. Sort-by-timestamp within each group produces correct running totals. Tests cover both single-asset and multi-asset scenarios.

  3. Tax CSV format — The 13-column layout (Date, Type, Sent_Asset, Sent_Amount, Received_Asset, Received_Amount, Fee_Asset, Fee_Amount, Cost_Basis, Proceeds, Gain_Loss, Tx_Hash, Network) aligns well with CoinTracker/Koinly/TaxBit import formats. Sent/Received split based on amount sign is correct. Gain_Loss correctly computed only when both cost_basis and proceeds are present.

  4. DatasetName variants — Purely additive. WalletLedger and BalanceHistory appended to enum, all() updated to 9, serde roundtrip tests updated. No breaking changes to existing 7 datasets.

  5. No Silver infrastructure contamination — Gold structs (WalletLedgerRecord, BalanceSnapshot, ForensicsActivity) live in their own section. Derivation is read-only consumption of Silver types. No modifications to existing Silver models or materializers.

  6. Forensics scoping — Endpoint is behind require_auth. Data scoped via target_id → target_matches join, consistent with all existing dataset query endpoints.

  7. Migrations — Forward-only. Both use IF NOT EXISTS guards. FK references to raw_transactions(id) and dataset_versions(id) are correct. Index choices are appropriate (wallet+timestamp, network).

  8. Backward compatibility — No existing routes removed or modified. p5w1_compat_existing_wallet_endpoints_still_routed and p5w1_compat_existing_dataset_endpoints_still_routed tests confirm this.
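To make the tax CSV finding (item 3) concrete, the following sketch maps one ledger entry onto the 13 columns named in the review. `TaxEntry`, `tax_row`, and `gain_loss` are hypothetical names; computing Gain_Loss as proceeds minus cost basis is an assumption, since the review only states that it is computed when both values are present. Amounts use `f64` for brevity, where real code would more likely use a decimal type.

```rust
/// Column order as listed in the review.
pub const TAX_HEADER: [&str; 13] = [
    "Date", "Type", "Sent_Asset", "Sent_Amount", "Received_Asset", "Received_Amount",
    "Fee_Asset", "Fee_Amount", "Cost_Basis", "Proceeds", "Gain_Loss", "Tx_Hash", "Network",
];

/// Hypothetical, simplified ledger entry used only for this sketch.
pub struct TaxEntry {
    pub date: String,
    pub tx_type: String,
    pub asset: String,
    pub amount: f64, // signed: negative means sent, otherwise received
    pub fee_asset: Option<String>,
    pub fee_amount: Option<f64>,
    pub cost_basis: Option<f64>,
    pub proceeds: Option<f64>,
    pub tx_hash: String,
    pub network: String,
}

/// Gain/loss exists only when both inputs are present (assumed: proceeds - cost basis).
pub fn gain_loss(cost_basis: Option<f64>, proceeds: Option<f64>) -> Option<f64> {
    match (cost_basis, proceeds) {
        (Some(c), Some(p)) => Some(p - c),
        _ => None,
    }
}

/// Builds the 13 cell values for one row; missing values become empty cells.
pub fn tax_row(e: &TaxEntry) -> [String; 13] {
    let opt = |v: Option<f64>| v.map(|x| x.to_string()).unwrap_or_default();
    // Split into Sent/Received columns based on the sign of the amount.
    let (sent_asset, sent_amt, recv_asset, recv_amt) = if e.amount < 0.0 {
        (e.asset.clone(), (-e.amount).to_string(), String::new(), String::new())
    } else {
        (String::new(), String::new(), e.asset.clone(), e.amount.to_string())
    };
    [
        e.date.clone(),
        e.tx_type.clone(),
        sent_asset,
        sent_amt,
        recv_asset,
        recv_amt,
        e.fee_asset.clone().unwrap_or_default(),
        opt(e.fee_amount),
        opt(e.cost_basis),
        opt(e.proceeds),
        opt(gain_loss(e.cost_basis, e.proceeds)),
        e.tx_hash.clone(),
        e.network.clone(),
    ]
}
```

The function returns cell values rather than a joined line so that quoting and escaping can be left to whatever CSV writer the endpoint already uses.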

Non-blocking observations

  • Test copy-paste: test_forensics_activity_endpoint_routed tests /v1/export/tax instead of /v1/forensics/activity. The endpoint IS tested by the auth test (test_forensics_activity_requires_auth), so no functional gap, but the routing test name is misleading.

  • Balance history deterministic IDs: The i as u32 + 10000 offset within each (asset, network) group could produce collisions if two groups both have an entry at the same index with the same raw_transaction_id (e.g., a single transaction that touches multiple assets). Narrow window in practice, but worth hardening in a follow-up.

  • WalletLedgerMaterializer::chain_family() returns ChainFamily::Solana with a comment "primary, but works cross-chain". The descriptor() correctly lists all three families. The single-family return from chain_family() could confuse dispatch callers. Cosmetic — not a functional issue in this PR since the materializer isn't wired into automated dispatch yet.

  • Forensics record cap: forensics_activity_handler uses export_wallet_ledger_records with the 100K record limit. For high-volume wallets, the forensics summary could be silently incomplete. Consider adding a truncated flag or count in a follow-up.
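One possible shape for the truncated flag suggested in the last observation: fetch one row more than the cap, then report whether anything was cut off. `Capped` and `cap_records` are hypothetical helpers, not part of this PR, and the sketch assumes the caller can request `limit + 1` rows from the repository.

```rust
/// Hypothetical wrapper pairing capped records with a truncation marker.
#[derive(Debug)]
pub struct Capped<T> {
    pub records: Vec<T>,
    pub truncated: bool,
}

/// Assumes `fetched` was queried with `limit + 1` rows, so a length above
/// `limit` proves that more data exists than will be returned.
pub fn cap_records<T>(mut fetched: Vec<T>, limit: usize) -> Capped<T> {
    let truncated = fetched.len() > limit;
    fetched.truncate(limit);
    Capped { records: fetched, truncated }
}
```

Surfacing `truncated` in the forensics response would let clients tell a complete summary from a partial one without needing a separate count query.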

Ready to merge from the PR-comment review perspective.

@user1303836 merged commit 0279019 into main on Mar 9, 2026
4 checks passed
@user1303836 deleted the p5-w1-wallet-tax-forensics-pack branch on March 9, 2026 01:08