Closed
Description
HNSW index scan returns only 1 result regardless of the LIMIT clause on tables with fewer than ~100 rows. Sequential scan on the same data returns the correct number of results.
Environment
- Extension: ruvector v0.1.0 (engine v2.0.1)
- Docker image: ruvnet/ruvector-postgres:latest (built 2026-02-13)
- Platform: macOS Darwin 25.3.0, Docker Desktop
- SIMD: x86_64 AVX2 + FMA + SSE4.2
Reproduction
-- Table: 56 rows, all with ruvector(384) embeddings
-- Column type: ruvector(384) (atttypmod correctly set)
-- Index: USING hnsw (embedding ruvector_cosine_ops) WITH (m=16, ef_construction=128)
-- WITH HNSW index: Returns 1 row (wrong)
SET ruvector.ef_search = 64;
SELECT title, category
FROM openclaw_memory.operational_knowledge
ORDER BY embedding <=> embed_text('What projects does Stuart work on?')
LIMIT 8;
-- Returns: 1 row
-- WITHOUT HNSW index (sequential scan): Returns 8 rows (correct)
DROP INDEX idx_opknowledge_embedding_hnsw;
SELECT title, category
FROM openclaw_memory.operational_knowledge
ORDER BY embedding <=> embed_text('What projects does Stuart work on?')
LIMIT 8;
-- Returns: 8 correct rows, properly ranked by relevance
Additional Context
- Tested with ruvector.ef_search values from 40 to 500: no change
- Tested with a precomputed vector in PL/pgSQL: same result
- All 56 embeddings confirmed as 384 dimensions via ruvector_dims()
- Column atttypmod is correctly set to 384 (not -1)
- Related: #152 (HNSW index causes errors on non-vector queries such as COUNT and WHERE embedding IS NOT NULL), #164 (HNSW index scan segfault, signal 11, on tables >100K rows)
- Workaround: Skip HNSW on tables with <1000 rows (sequential scan is sub-millisecond anyway)
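
The workaround can probably be applied without dropping the index. The following is a minimal sketch, assuming the standard PostgreSQL planner setting enable_indexscan also steers the planner away from this extension's HNSW index (not verified against ruvector). It disables index scans for one transaction only and uses EXPLAIN to check which plan is chosen:

```sql
BEGIN;

-- Standard PostgreSQL planner setting; SET LOCAL reverts at COMMIT or ROLLBACK.
-- Assumption: the planner then prefers a sequential scan plus sort over the HNSW index.
SET LOCAL enable_indexscan = off;

-- Inspect the plan before trusting the result.
EXPLAIN
SELECT title, category
FROM openclaw_memory.operational_knowledge
ORDER BY embedding <=> embed_text('What projects does Stuart work on?')
LIMIT 8;

-- Same query as in the reproduction, now expected to bypass the index.
SELECT title, category
FROM openclaw_memory.operational_knowledge
ORDER BY embedding <=> embed_text('What projects does Stuart work on?')
LIMIT 8;

COMMIT;
```

If the EXPLAIN output still shows an index scan on idx_opknowledge_embedding_hnsw, this sketch does not apply and dropping the index remains the only confirmed workaround.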