Skip to content

Partition lore indexes by guild#234

Merged
user1303836 merged 2 commits intomainfrom
codex/lore-guild-index
Mar 8, 2026
Merged

Partition lore indexes by guild#234
user1303836 merged 2 commits intomainfrom
codex/lore-guild-index

Conversation

@user1303836
Copy link
Owner

Summary

  • store lore vectors in a separate zvec collection per guild instead of one global collection
  • rebuild and health-check each guild’s lore index independently from SQLite
  • remove the single-guild auto_start_ingestion() break so ingestion lifecycle matches the new storage model

Testing

  • PYTHONPATH=/Users/user1303836/Development/intelstream-codex-lore-guild-index/src /Users/user1303836/Development/intelstream/.venv/bin/pytest tests/test_vector_store.py tests/test_services/test_message_ingestion.py tests/test_discord/test_lore.py tests/test_database.py -q
  • /Users/user1303836/Development/intelstream/.venv/bin/ruff check src/intelstream/database/vector_store.py src/intelstream/database/repository.py src/intelstream/services/message_ingestion.py src/intelstream/discord/cogs/lore.py tests/test_vector_store.py tests/test_services/test_message_ingestion.py tests/test_discord/test_lore.py
  • /Users/user1303836/Development/intelstream/.venv/bin/ruff format --check src/intelstream/database/vector_store.py src/intelstream/database/repository.py src/intelstream/services/message_ingestion.py src/intelstream/discord/cogs/lore.py tests/test_vector_store.py tests/test_services/test_message_ingestion.py tests/test_discord/test_lore.py

Closes #227

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Copy link
Owner Author

@user1303836 user1303836 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Overall

Correct architectural change for multi-guild support. Clean separation between article collections (single global) and message chunk collections (per-guild). Well-tested.

Issues

  1. No migration path for existing data. The old global message_chunks collection at {data_dir}/message_chunks/ will be orphaned. New guild-scoped collections live at {data_dir}/message_chunks/{guild_id}/. The _ensure_message_chunk_index will correctly rebuild from SQLite, but the old directory is never cleaned up or warned about. Consider adding a log warning or cleanup for the orphaned directory on startup.

  2. auto_start_ingestion break removal -- sequential constraint. The break removal is the correct fix (all guilds should get ingestion, not just the first). However, start_ingestion_for_guild checks if self._ingestion_service.is_running: return, and is_running is a single boolean. So in practice, only the first guild with pending work will actually get backfilled per startup -- subsequent guilds return early because the first backfill is still running. The test test_auto_start_uses_all_guilds passes because the mock doesn't simulate is_running returning True. This may be fine for now, but worth noting that true parallel multi-guild ingestion would require per-guild backfill tracking.

  3. Lazy collection initialization. _message_chunks is no longer initialized in initialize(). Collections are lazy-created on first access via _message_chunk_collection(guild_id, create=True). This means initialization failures that used to surface at startup now surface at first use. Acceptable tradeoff but a behavior change.

  4. search_message_chunks returns [] instead of raising. When the guild collection doesn't exist (create=False returns None), it returns an empty list. The old code raised RuntimeError("VectorStore not initialized"). The new behavior is arguably better -- searching an unindexed guild returns nothing rather than crashing.

  5. Merge conflict with PR #233. Both PRs modify vector_store.py and repository.py from the same base. Recommend merging #233 first, then rebasing this PR.

CI

Lint failure is pre-existing (7 unrelated files). Tests and type checks pass.

Verdict

Near-ready. Address the orphaned collection cleanup (even just a warning log), and rebase onto #233 after it merges.

@user1303836 user1303836 merged commit 3bb51c3 into main Mar 8, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P1] Partition the lore vector index by guild before re-enabling /lore

1 participant