Fix vector search: optimize zvec collections after upserts #225

Merged
user1303836 merged 3 commits into main from fix/vector-indexer-optimize on Mar 7, 2026

@user1303836
Owner

Summary

  • Adds optimize() calls after every zvec upsert (single and batch) for both articles and message_chunks collections
  • Also optimizes before flush on close() to ensure remaining WAL data is indexed

The root cause of the "vector indexer not found for field embedding" errors: zvec writes data to a WAL (write-ahead log) on upsert, but that data is not searchable until optimize() compacts it into indexed segments. Without optimization, queries against unindexed WAL segments fail at the C++ layer (segment.cc:711).

Evidence from disk confirmed this -- message_chunks had a 48MB unindexed WAL from today alongside an old index from Mar 2, and articles had a completely empty segment directory.
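
For reviewers who want the pattern at a glance, here is a minimal sketch of the optimize-after-upsert flow described above. It does not use the real zvec API: CollectionLike, FakeCollection, OptimizingStore, and query_ids are hypothetical names invented for illustration, and the fake collection only imitates the WAL behavior reported in this PR (data is unsearchable until optimize() runs). The actual change in this PR may be structured differently.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Protocol


class CollectionLike(Protocol):
    """Hypothetical minimal interface; the real zvec collection API may differ."""

    def upsert(self, docs: list[dict]) -> None: ...
    def optimize(self) -> None: ...
    def flush(self) -> None: ...


class FakeCollection:
    """Toy stand-in that imitates the WAL behavior described in this PR."""

    def __init__(self) -> None:
        self.wal: list[dict] = []            # written, but not yet searchable
        self.indexed: dict[str, dict] = {}   # searchable after optimize()
        self.calls: list[str] = []           # call log, for inspection only

    def upsert(self, docs: list[dict]) -> None:
        self.calls.append("upsert")
        self.wal.extend(docs)

    def optimize(self) -> None:
        # Compact WAL entries into the "indexed" store.
        self.calls.append("optimize")
        for doc in self.wal:
            self.indexed[doc["id"]] = doc
        self.wal.clear()

    def flush(self) -> None:
        self.calls.append("flush")

    def query_ids(self) -> list[str]:
        # Simplification: any unindexed WAL data makes the query fail.
        if self.wal:
            raise RuntimeError("vector indexer not found for field embedding")
        return sorted(self.indexed)


class OptimizingStore:
    """Hypothetical wrapper: optimize after every upsert and before flush."""

    def __init__(self, collections: Mapping[str, CollectionLike]) -> None:
        self._collections = collections

    def upsert(self, name: str, doc: dict) -> None:
        self.upsert_batch(name, [doc])

    def upsert_batch(self, name: str, docs: list[dict]) -> None:
        collection = self._collections[name]
        collection.upsert(docs)
        collection.optimize()  # make the new rows searchable right away

    def close(self) -> None:
        for collection in self._collections.values():
            collection.optimize()  # index any remaining WAL data
            collection.flush()
```

Calling optimize() after every upsert favors correctness over write throughput; whether that cost is acceptable depends on collection size and write volume, which this sketch does not measure.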

Test plan

  • All 11 test_vector_store.py tests pass (upsert, search, batch, delete, reopen)
  • Full test suite passes (560 tests)
  • ruff check, ruff format, mypy all clean

zvec requires optimize() to compact WAL data into indexed segments.
Without it, queries fail with "vector indexer not found for field
embedding" because unindexed WAL segments have no vector indexer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@greptile-apps greptile-apps bot left a comment

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@user1303836 user1303836 merged commit ac035bd into main Mar 7, 2026
4 checks passed