HNSW index causes errors on non-vector queries (COUNT, WHERE embedding IS NOT NULL) #152

@stuinfla

Bug Description

HNSW indexes on ruvector columns cause PostgreSQL errors on simple non-vector queries like COUNT(*) or WHERE embedding IS NOT NULL.

Steps to Reproduce

-- Create table with ruvector column
CREATE TABLE test (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding ruvector(384)
);

-- Create HNSW index
CREATE INDEX idx_hnsw ON test 
USING hnsw (embedding ruvector_cosine_ops) WITH (m = 16, ef_construction = 100);

-- This query FAILS:
SELECT COUNT(*) FROM test WHERE embedding IS NOT NULL;

Error Message

ERROR: HNSW: Could not extract query vector from parameter. Ensure the query vector is properly cast to ruvector type, e.g.: ORDER BY embedding <=> '[1,2,3]'::ruvector(dim)
file: hnsw_am.rs, line: 1489

Expected Behavior

COUNT(*) and WHERE clauses on embedding columns should work regardless of HNSW index presence. The index should only be used for vector similarity searches (ORDER BY embedding <=> query).
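One way to check whether the planner is routing the non-vector query through the HNSW index is sketched below. It uses only stock PostgreSQL features (EXPLAIN and the enable_indexscan / enable_bitmapscan planner settings) against the test table and idx_hnsw index from the reproduction steps. Whether disabling those settings actually avoids the error with ruvector is an assumption and has not been verified here.

```sql
-- Show the chosen plan without executing the query;
-- look for idx_hnsw in the output.
EXPLAIN SELECT COUNT(*) FROM test WHERE embedding IS NOT NULL;

-- Discourage index and bitmap scans for this transaction only,
-- then re-run the failing query to see whether a sequential scan succeeds.
-- (Assumption: the error comes from the planner picking idx_hnsw.)
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;
SELECT COUNT(*) FROM test WHERE embedding IS NOT NULL;
ROLLBACK;
```

If the plan shows idx_hnsw for the IS NOT NULL filter, that would be consistent with the index being offered for queries that carry no query vector.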

Environment

  • PostgreSQL 17.7
  • ruvector-postgres Docker image (ruvnet/ruvector-postgres:latest)
  • macOS via Docker Desktop

Workaround

Drop the HNSW index and fall back to a brute-force (sequential) scan. For tables with fewer than 10K rows, brute-force search takes under 100 ms anyway.
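A minimal sketch of this workaround, using the test table and idx_hnsw index from the reproduction steps. The similarity query assumes that a row with id = 1 exists and that the <=> operator accepts two ruvector values; both are illustrative assumptions, not confirmed behavior.

```sql
-- Remove the HNSW index so the planner can no longer choose it.
DROP INDEX IF EXISTS idx_hnsw;

-- With no index on the column, this runs as a plain sequential scan.
SELECT COUNT(*) FROM test WHERE embedding IS NOT NULL;

-- Brute-force similarity search: rank every row by cosine distance
-- to the embedding of row 1 (hypothetical reference row).
SELECT id, content
FROM test
WHERE id <> 1
ORDER BY embedding <=> (SELECT embedding FROM test WHERE id = 1)
LIMIT 10;
```

The index can be recreated later with the CREATE INDEX statement from the reproduction steps once the underlying bug is fixed.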

Impact

Cannot use HNSW indexes in production because basic queries fail. This blocks using HNSW for performance optimization on larger datasets.
