fix: HNSW index bugs, agent/SPARQL crashes, lru security by ruvnet · Pull Request #172 · ruvnet/ruvector

ruvnet · 2026-02-15T06:15:24Z

Summary

Fixes 5 open issues with critical PostgreSQL extension bugs:

#171, HNSW index returns only 1 result regardless of LIMIT on small tables (<100 rows): hnsw_build hardcoded dimensions = 128 instead of reading the dimension count from the column's atttypmod. Also fixed non-deterministic result ordering caused by BinaryHeap::into_iter().
#164, HNSW index scan segfault (signal 11) on tables >100K rows: same dimension bug as #171. Also added page boundary checks in read_vector() and read_neighbors() to prevent reading past page limits.
#167, ruvector_list_agents() and ruvector_sparql_json() crash the PostgreSQL backend: the SQL declared RETURNS SETOF jsonb but the Rust functions return TableIterator<composite>. Fixed the SQL to use RETURNS TABLE(...), added SPARQL input validation, and changed panic = "abort" to "unwind".
#152, HNSW index causes errors on non-vector queries (COUNT, WHERE embedding IS NOT NULL): the index scan now returns false for non-kNN queries (no ORDER BY operator), letting PostgreSQL fall back to a sequential scan.
#148, Security: lru 0.12.5 dependency has a Stacked Borrows violation (GHSA-xpfx-fvgv-hgqp): bumped to lru 0.16 in 3 crates.
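The ordering part of the #171 fix can be illustrated in isolation. The sketch below is not the code from hnsw_am.rs; the `Candidate` type and `nearest_first` helper are hypothetical names chosen for illustration. It relies only on documented standard-library behaviour: `BinaryHeap::into_iter()` yields elements in arbitrary order, while `into_sorted_vec()` returns them in ascending order.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical search candidate: a distance to the query plus a row id.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    distance: f32,
    id: u64,
}

// Order by distance first, then by id so that ties are also deterministic.
impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        self.distance
            .total_cmp(&other.distance)
            .then_with(|| self.id.cmp(&other.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Drain the heap into ids ordered from nearest to farthest.
/// `into_sorted_vec()` sorts ascending, so the smallest distance comes first.
fn nearest_first(heap: BinaryHeap<Candidate>) -> Vec<u64> {
    heap.into_sorted_vec().into_iter().map(|c| c.id).collect()
}
```

Calling `nearest_first` on a heap holding distances 0.9, 0.1 and 0.5 returns the ids in distance order regardless of insertion order, which is the property a kNN scan with ORDER BY needs.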

Files Changed

| File | Changes |
| --- | --- |
| `hnsw_am.rs` | Dimension extraction from atttypmod, bounds checks, result ordering, non-kNN handling |
| `ruvector--2.0.0.sql` | `RETURNS TABLE(...)` for agent functions |
| `graph/operators.rs` | SPARQL empty query validation |
| `Cargo.toml` | `panic = "unwind"`, workspace version bump |
| `*/Cargo.toml` (3) | lru 0.12 -> 0.16 |
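As a rough illustration of the dimension extraction listed for hnsw_am.rs, the sketch below derives the vector dimension from a column's type modifier instead of a hardcoded 128. It assumes the modifier stores the dimension count directly, which this page does not confirm for ruvector, and `dimensions_from_typmod` is a hypothetical helper name. The one fact it relies on is that PostgreSQL sets atttypmod to -1 when no modifier was declared.

```rust
/// Derive the vector dimension from a column's atttypmod.
///
/// Assumption (not confirmed by the PR): for a column declared as
/// ruvector(384), the type modifier is the dimension count itself.
/// PostgreSQL uses -1 to mean "no type modifier specified".
fn dimensions_from_typmod(atttypmod: i32) -> Result<usize, String> {
    if atttypmod > 0 {
        Ok(atttypmod as usize)
    } else {
        // Refuse to guess: a wrong dimension means every stored vector
        // is read with the wrong length.
        Err(format!(
            "column has no usable dimension (atttypmod = {atttypmod})"
        ))
    }
}
```

Returning an error for a missing modifier, rather than falling back to a default such as 128, avoids silently building an index whose stored vector length differs from the column's.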

Test plan

cargo check -p ruvector-graph passes (verified)
cargo check -p ruvector-cli passes (verified)
Docker image build with pgrx succeeds
HNSW index on ruvector(384) column creates correctly
SELECT COUNT(*) WHERE embedding IS NOT NULL works with HNSW index present
HNSW search returns correct number of results on small tables
ruvector_list_agents() returns proper table rows
ruvector_sparql_json('store', '') returns error, not crash
cargo audit clean for lru
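The empty-query case in the test plan can be sketched as a small guard that runs before parsing. This is an illustrative stand-in, not the code added to graph/operators.rs; the function name and error text are hypothetical.

```rust
/// Reject empty or whitespace-only SPARQL input before it reaches the parser.
/// Returns the trimmed query on success so the caller parses a clean string.
fn validate_sparql_query(query: &str) -> Result<&str, String> {
    let trimmed = query.trim();
    if trimmed.is_empty() {
        Err("SPARQL query must not be empty".to_string())
    } else {
        Ok(trimmed)
    }
}
```

In an extension function, the `Err` branch would be reported as a PostgreSQL error rather than allowed to panic deeper in the parser.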

🤖 Generated with claude-flow

… rvlite Documents phased integration plan: Phase 1 adds RVF as optional dep + CLI command group to npx ruvector, Phase 2 adds RVF as storage backend for rvlite, Phase 3 unifies shared WASM backend and MCP bridge. Co-Authored-By: claude-flow <ruv@ruv.net>

…and decision matrix Adds: single writer rule, crash ordering with epoch reconciliation, explicit backend selection (no silent fallback), cross-platform compat rule, phase contracts with success metrics, failure mode test matrix, hybrid persistence decision matrix, implementation checklist. Closes #169 Co-Authored-By: claude-flow <ruv@ruv.net>

Phase 1 implementation:
- Add @ruvector/rvf as optional dependency to ruvector package
- Create rvf-wrapper.ts with 10 exported functions matching core pattern
- Add 3-tier platform detection (core -> rvf -> stub) with explicit --backend rvf override that fails loud if package is missing
- Add 8 rvf CLI subcommands (create, ingest, query, status, segments, derive, compact, export) routed through the wrapper
- 5 Rust smoke tests validating persistence across restart, deletion persistence, compaction stability, and adapter compatibility

Phase 2 foundations:
- Add rvf-backend feature flag to rvlite Cargo.toml (default off)
- Create epoch reconciliation module for hybrid RVF + IndexedDB sync
- Add @ruvector/rvf-wasm as optional dep to rvlite npm package
- Add rvf-adapter-rvlite to workspace members

All tests green: 237 RVF core, 23 adapter, 4 epoch, 5 smoke.

Refs: #169

Co-Authored-By: claude-flow <ruv@ruv.net>

…ols, compat tests

Phase 2 Rust: full epoch reconciliation (EpochTracker with AtomicU64, 23 tests), writer lease with file lock and PID-based stale detection (12 tests), direct ID mapping trait with DirectIdMap and OffsetIdMap (20 tests).

Phase 2 JS: createWithRvf/saveToRvf/loadFromRvf factories, BrowserWriterLease with IndexedDB heartbeat, rvf-migrate and rvf-rebuild CLI commands, epoch sync helpers. +541 lines to index.ts, new cli-rvf.ts (363 lines).

Phase 3: 3 MCP rvlite tools (rvlite_sql, rvlite_cypher, rvlite_sparql), CI wasm-dedup-check workflow, 6 cross-platform compat tests, shared peer dep.

Phase 1: 4 RVF smoke integration tests (full lifecycle, cosine, multi-restart, metadata). Node.js CLI smoke test script.

81 new Rust tests passing. ADR-032 checklist fully complete.

Co-Authored-By: claude-flow <ruv@ruv.net>

- ruvector 0.1.88 → 0.1.97 (match npm registry)
- rvlite 0.2.1 → 0.2.2
- @ruvector/rvf 0.1.0 → 0.1.1
- Fix MCP command in ruvector README (mcp-server → mcp start)
- Fix WASM type conflicts in rvlite index.ts (cast dynamic imports to any)

Co-Authored-By: claude-flow <ruv@ruv.net>

…allbacks, and README examples

Five "What's NOT Automatic" gaps fixed:
1. Witness auto-append: WitnessConfig in RvfOptions auto-records ingest/delete/compact operations as WITNESS_SEG entries with SHAKE-256 hash chains
2. verify-witness CLI: Real hash chain verification that extracts WITNESS_SEG payloads and runs verify_witness_chain() with full SHAKE-256 validation
3. verify-attestation CLI: Real kernel image hash verification and attestation witness chain validation
4. Prebuilt kernel fallback: KernelBuilder::from_builtin_minimal() produces valid bzImage without Docker
5. Prebuilt eBPF fallback: EbpfCompiler::from_precompiled() produces valid BPF ELF without clang; Launcher::check_requirements()/dry_run() for QEMU detection

README examples added to all 3 packages:
- crates/rvf/README.md: Proof of Operations section
- npm/packages/rvf/README.md: 7 real-world examples
- npm/packages/ruvector/README.md: Working cognitive container examples

830 tests passing, workspace compiles cleanly.

Co-Authored-By: claude-flow <ruv@ruv.net>

…itramfs

- Add live_boot_proof.rs: end-to-end Docker boot + SSH + RVF verification
- Add ULTRAFAST_BOOT_CONFIG: sub-100ms kernel config (no NUMA/cgroups/ext4/netfilter)
- Add build_fast_initramfs(): minimal init path (3 mounts + direct service start)
- Add KernelBuilder::ultrafast() with optimized cmdline for fast boot
- Update README with live boot proof instructions and ultra-fast boot docs
- 5 new tests (44 total in rvf-kernel), all passing

Co-Authored-By: claude-flow <ruv@ruv.net>

… benchmarks

- Examples (self_booting, linux_microkernel, claude_code_appliance, live_boot_proof) now use KernelBuilder::build() which tries Docker first and falls back to builtin stub; real 5.2 MB bzImage embedded
- Fix Docker kernel extraction: clean up stale containers, pass dummy entrypoint for scratch-based images
- README: add real measured boot benchmarks (257ms boot→service, 381ms boot→verify), kernel size comparison (5.1 MB general vs 3.8 MB ultrafast = 26% smaller)
- Fix claude_code_appliance idempotency (remove old file before create)

Co-Authored-By: claude-flow <ruv@ruv.net>

Published to npm:
- @ruvector/ruvf 0.1.2
- @ruvector/rvf-wasm 0.1.1
- @ruvector/rvf-node 0.1.1
- @ruvector/rvf-mcp-server 0.1.1
- ruvector 0.1.98
- rvlite 0.2.3

Co-Authored-By: claude-flow <ruv@ruv.net>

…167, #171, #148)

HNSW fixes:
- Extract vector dimensions from column atttypmod instead of hardcoding 128, which caused corrupted indexes for non-128-dim embeddings (#171, #164)
- Add page boundary checks in read_vector/read_neighbors to prevent segfaults on large tables with >100K rows (#164)
- Use BinaryHeap::into_sorted_vec() for deterministic result ordering instead of into_iter() which yields arbitrary order (#171)
- Handle non-kNN scans (COUNT, WHERE IS NOT NULL) gracefully by returning false from hnsw_gettuple when no ORDER BY operator is present (#152)

Agent/SPARQL fixes:
- Fix SQL type mismatch: ruvector_list_agents() and ruvector_find_agents_by_capability() now use RETURNS TABLE(...) matching the Rust TableIterator signatures instead of RETURNS SETOF jsonb (#167)
- Add empty query validation to ruvector_sparql() and ruvector_sparql_json() to prevent panics on invalid input (#167)
- Change workspace panic profile from "abort" to "unwind" so pgrx can convert Rust panics to PostgreSQL errors instead of killing the backend (#167)

Security:
- Bump lru dependency from 0.12 to 0.16 in ruvector-graph, ruvector-cli, and ruvLLM to resolve GHSA-xpfx-fvgv-hgqp Stacked Borrows violation (#148)

Version bumps: workspace 2.0.3, ruvector-postgres 2.0.2

Co-Authored-By: claude-flow <ruv@ruv.net>
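The page boundary check described for read_vector can be sketched on a plain byte slice. The signature below is hypothetical and deliberately simplified: the real function in hnsw_am.rs works on PostgreSQL pages, and the little-endian f32 layout here is an assumption made only so the example is self-contained.

```rust
/// Read `dims` f32 values starting at byte `offset` inside `page`.
/// Returns None when the requested range would extend past the page,
/// instead of reading out of bounds.
///
/// Assumption for illustration: values are stored as little-endian f32.
fn read_vector(page: &[u8], offset: usize, dims: usize) -> Option<Vec<f32>> {
    // checked_* guards against overflow when dims or offset are corrupted.
    let byte_len = dims.checked_mul(std::mem::size_of::<f32>())?;
    let end = offset.checked_add(byte_len)?;

    // `get` returns None if the range is not fully inside the page.
    let bytes = page.get(offset..end)?;

    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

With a check of this kind, a dimension mismatch such as the hardcoded 128 surfaces as a recoverable `None` that the caller can turn into an error, rather than a read past the end of the page.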

…-lru-issues

# Conflicts:
# crates/rvf/README.md
# crates/rvf/rvf-kernel/src/lib.rs
# npm/packages/ruvector/package.json
# npm/packages/rvf/package.json
# npm/packages/rvlite/package.json

github-actions · 2026-02-15T06:22:12Z

Benchmark Results Summary

Distance Function Benchmarks

HNSW Index Benchmarks

Quantization Benchmarks

See full results in the artifacts.

github-actions · 2026-02-15T06:27:45Z

Benchmark Comparison

Distance Benchmarks

Baseline (main)

Current (PR)

Three additional hardening fixes for the SPARQL subsystem, building on PR ruvnet#172:

1. Parser: replace hardcoded saturating_sub(6) with saved_pos variable. The old backtrack assumed all update keywords are 6 chars, but LOAD, DROP, and CLEAR are 4-5 chars, causing incorrect parse positions.
2. Executor: change default_graph from Option<&'a str> to Option<String> and remove Box::leak calls in the GraphPattern::Graph handler. Each GRAPH clause previously leaked a String allocation that was never freed.
3. Operators: wrap ruvector_sparql parse/execute/format in catch_unwind so that panics from non-empty but malformed queries are converted to PostgreSQL ERROR messages instead of crashing the backend.

Closes ruvnet#167

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
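The parser fix in item 1 amounts to saving the position before a speculative read and restoring that exact value, rather than subtracting a fixed keyword length. The sketch below uses a hypothetical `Cursor` type and keyword list for illustration; it is not the parser from the repository.

```rust
/// Update keywords of different lengths (4, 5 and 6 characters).
const UPDATE_KEYWORDS: [&str; 5] = ["INSERT", "DELETE", "LOAD", "DROP", "CLEAR"];

/// Hypothetical minimal parser cursor over an input string.
struct Cursor<'a> {
    input: &'a str,
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn new(input: &'a str) -> Self {
        Cursor { input, pos: 0 }
    }

    /// Consume one run of ASCII letters and return it.
    fn read_word(&mut self) -> &'a str {
        let rest = &self.input[self.pos..];
        let len = rest.bytes().take_while(|b| b.is_ascii_alphabetic()).count();
        self.pos += len;
        &rest[..len]
    }

    /// Look at the next word without consuming it and report whether it is
    /// an update keyword.
    fn peek_update_keyword(&mut self) -> Option<&'a str> {
        let saved_pos = self.pos;
        let word = self.read_word();
        // Restore the exact starting position. Subtracting a constant such
        // as 6 would only be right for six-letter keywords.
        self.pos = saved_pos;
        UPDATE_KEYWORDS
            .iter()
            .any(|k| k.eq_ignore_ascii_case(word))
            .then_some(word)
    }
}
```

For an input where DROP starts at byte 7, reading the word moves the cursor to 11; a fixed `saturating_sub(6)` would rewind to 5, while restoring `saved_pos` returns to 7.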

ruvnet added 11 commits February 14, 2026 19:38

chore: bump and publish npm packages

be282af

Merge remote-tracking branch 'origin/main' into fix/hnsw-agent-sparql…

91e7aac

ruvnet merged commit 18103b4 into main Feb 15, 2026

grparry mentioned this pull request Feb 17, 2026

Fix SPARQL parser backtrack, executor memory leak, and add catch_unwind #180

Merged

4 tasks
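The catch_unwind change named in #180 depends on the panic = "unwind" profile set in this PR: with "abort", a panic ends the process before it can be caught. A minimal sketch of the pattern follows, using only the standard library; `run_guarded` is a hypothetical helper, and the real code would report the error through pgrx rather than return a String.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Run a query closure and convert any panic into an error value.
/// Only effective when the crate is built with panic = "unwind".
fn run_guarded<F>(query_fn: F) -> Result<String, String>
where
    F: FnOnce() -> String,
{
    catch_unwind(AssertUnwindSafe(query_fn)).map_err(|payload| {
        // Panic payloads are usually &str or String; anything else is opaque.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            format!("query failed: {msg}")
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            format!("query failed: {msg}")
        } else {
            "query failed: unknown panic".to_string()
        }
    })
}
```

A closure that returns normally yields `Ok` with its result, and one that panics yields `Err` with the panic message. `AssertUnwindSafe` is a promise by the caller that no state shared with the closure is left inconsistent after a panic, so the closure should own or rebuild whatever it mutates.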

ruvnet added a commit that referenced this pull request Feb 20, 2026

Merge pull request #172 from ruvnet/fix/hnsw-agent-sparql-lru-issues

447faf4

fix: HNSW index bugs, agent/SPARQL crashes, lru security

ruvnet merged 11 commits into main from fix/hnsw-agent-sparql-lru-issues
