@jsbattig
Summary

Add a check_integrity() method to the Python Index class that validates the HNSW graph structure.

Changes

  • Added a new check_integrity() method to the Python bindings in python_bindings/bindings.cpp
  • Validates graph structure including:
    • Connection validity (no invalid neighbor IDs)
    • Self-loop detection
    • Duplicate connection detection
    • Orphan node detection (elements with no inbound connections)
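
To illustrate the four checks listed above, here is a minimal pure-Python sketch that runs them over a single-layer adjacency list. It is not the C++ code from this PR: the function name `check_graph`, the `adjacency` dict input, the error message wording, and the choice to report every orphan as an error are all assumptions made for illustration.

```python
def check_graph(adjacency):
    """Run the four structural checks on one graph layer.

    `adjacency` is a hypothetical input format: a dict mapping each
    element id to the list of its outbound neighbor ids.
    """
    errors = []
    inbound = {node: 0 for node in adjacency}
    connections_checked = 0

    for node, neighbors in adjacency.items():
        seen = set()
        for nb in neighbors:
            connections_checked += 1
            if nb not in adjacency:
                # Connection validity: the neighbor id must exist.
                errors.append(f"node {node}: invalid neighbor id {nb}")
            elif nb == node:
                # Self-loop: a node must not link to itself.
                errors.append(f"node {node}: self-loop")
            elif nb in seen:
                # Duplicate: the same neighbor appears twice.
                errors.append(f"node {node}: duplicate connection to {nb}")
            else:
                seen.add(nb)
                inbound[nb] += 1

    # Orphans: elements that no other element points to.
    # Assumption: every orphan counts as an error.
    for node, count in inbound.items():
        if count == 0:
            errors.append(f"node {node}: orphan (no inbound connections)")

    return {
        "valid": not errors,
        "connections_checked": connections_checked,
        "element_count": len(adjacency),
        "min_inbound": min(inbound.values(), default=0),
        "max_inbound": max(inbound.values(), default=0),
        "errors": errors,
    }


healthy = check_graph({0: [1, 2], 1: [0, 2], 2: [0, 1]})
broken = check_graph({0: [0, 1, 1, 7], 1: [0], 2: [0]})
```

In this sketch `healthy["valid"]` is true, while `broken` reports one error of each kind: a self-loop and a duplicate on node 0, an invalid neighbor id 7, and node 2 as an orphan.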

Return Value

Returns a dictionary with:

  • valid: Boolean indicating whether the graph passed all checks
  • connections_checked: Total number of connections validated
  • element_count: Number of elements in the index
  • min_inbound: Minimum inbound connections for any node
  • max_inbound: Maximum inbound connections for any node
  • errors: List of error messages (empty if no issues were found)
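
A caller might consume this dictionary roughly as follows. This is a sketch only: `summarize_report` is a hypothetical helper, and `sample_report` is hand-written to match the keys listed above; it is not real output from an index.

```python
def summarize_report(report):
    """Format an integrity report dict as a short, log-friendly string."""
    status = "OK" if report["valid"] else "CORRUPT"
    lines = [
        f"{status}: {report['element_count']} elements, "
        f"{report['connections_checked']} connections checked, "
        f"inbound range {report['min_inbound']}..{report['max_inbound']}"
    ]
    # List each reported problem on its own indented line.
    lines.extend(f"  - {message}" for message in report["errors"])
    return "\n".join(lines)


# Hand-written example with the keys described above (illustrative values).
# With this PR applied, the dict would instead come from
# index.check_integrity().
sample_report = {
    "valid": False,
    "connections_checked": 6,
    "element_count": 3,
    "min_inbound": 0,
    "max_inbound": 2,
    "errors": ["example error message"],
}

summary = summarize_report(sample_report)
print(summary)
```

Keeping the formatting separate from the check lets an application decide for itself whether an invalid report should only be logged or should trigger an index rebuild.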

Use Case

This enables applications using hnswlib to perform health checks on HNSW indexes, detecting corruption or structural issues that could affect search quality.

Testing

The method has been tested with real HNSW indexes in the CIDX (Code Indexer) project.

🤖 Generated with Claude Code

Add check_integrity() method to Python Index class that validates HNSW
graph structure including:
- Connection validity (no invalid neighbor IDs)
- No self-loops
- No duplicate connections
- No orphan nodes (elements with no inbound connections)

Returns dict with: valid, connections_checked, element_count,
min_inbound, max_inbound, errors[]

This enables CIDX to perform health checks on HNSW indexes.