[STORY] REST Health Endpoint #58

@jsbattig

Part of: #53

Story: REST Health Endpoint

Feature: Server API Integration

Priority: P0

Overview

As an API consumer or monitoring system
I want a REST endpoint to check HNSW index health
So that I can programmatically verify index integrity and integrate with monitoring tools

Context

The CIDX server provides REST APIs for various operations. A health check endpoint allows external systems, monitoring dashboards, and the Web UI to query index health status.

Acceptance Criteria

Feature: REST Health Endpoint

  Scenario: Health check on healthy index via REST
    Given the CIDX server is running
    And a repository with a healthy HNSW index is registered
    When GET /api/repositories/{repo_alias}/health is called
    Then HTTP 200 is returned
    And response body contains HealthCheckResult with valid=true
    And response includes all integrity metrics

  Scenario: Health check on unhealthy index via REST
    Given the CIDX server is running
    And a repository with a corrupted HNSW index is registered
    When GET /api/repositories/{repo_alias}/health is called
    Then HTTP 200 is returned (endpoint works, index unhealthy)
    And response body contains HealthCheckResult with valid=false
    And errors array contains specific issues

  Scenario: Health check on non-existent repository
    Given the CIDX server is running
    When GET /api/repositories/{unknown_alias}/health is called
    Then HTTP 404 is returned
    And error message indicates repository not found

  Scenario: Health check with authentication
    Given the CIDX server is running with authentication enabled
    When GET /api/repositories/{repo_alias}/health is called without token
    Then HTTP 401 is returned
    When the same endpoint is called with a valid token
    Then HTTP 200 is returned with health data

  Scenario: Health check with force refresh
    Given a cached health check result exists
    When GET /api/repositories/{repo_alias}/health?force_refresh=true is called
    Then cache is bypassed
    And fresh integrity check is performed
    And from_cache=false in response

  Scenario: Health check response format
    Given a valid repository exists
    When GET /api/repositories/{repo_alias}/health is called
    Then response Content-Type is application/json
    And response body matches OpenAPI schema
    And all HealthCheckResult fields are present

Technical Requirements

Endpoint Definition

GET /api/repositories/{repo_alias}/health

Path Parameters:
  - repo_alias: Repository alias (string, required)

Query Parameters:
  - force_refresh: Bypass cache (boolean, optional, default=false)

Response:
  - 200: HealthCheckResult (healthy or unhealthy)
  - 404: Repository not found
  - 401: Unauthorized
  - 500: Server error
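
For illustration, a client could assemble the request URL as sketched below. This is a minimal sketch: the helper name `build_health_url` and the base URL `http://localhost:8000` are assumptions made for the example and are not part of the story.

```python
from urllib.parse import quote, urlencode


def build_health_url(base_url: str, repo_alias: str, force_refresh: bool = False) -> str:
    """Build the health-check URL for a repository alias (hypothetical helper)."""
    # Percent-encode the alias so characters such as '/' or spaces stay in one path segment.
    path = f"/api/repositories/{quote(repo_alias, safe='')}/health"
    # Only send force_refresh when it differs from the documented default (false).
    query = "?" + urlencode({"force_refresh": "true"}) if force_refresh else ""
    return base_url.rstrip("/") + path + query


# Assumed local server address, used only for demonstration.
print(build_health_url("http://localhost:8000", "my-repo"))
print(build_health_url("http://localhost:8000", "my-repo", force_refresh=True))
```

Omitting the query parameter in the default case keeps URLs stable for caches and logs; sending `force_refresh=false` explicitly should behave the same given the documented default.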

Router Implementation

Pseudocode:

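from concurrent.futures import ThreadPoolExecutor

from fastapi import APIRouter, Depends, HTTPException, Query

# HNSWHealthService, RepositoryService, HealthCheckResult, check_health_async and the
# get_* dependency providers come from the CIDX server code base (not shown in this story).

# Assumption: the router is mounted under the /api prefix, so the full path
# matches GET /api/repositories/{repo_alias}/health.
router = APIRouter()


@router.get("/repositories/{repo_alias}/health", response_model=HealthCheckResult)
async def get_repository_health(
    repo_alias: str,
    force_refresh: bool = Query(default=False),
    health_service: HNSWHealthService = Depends(get_health_service),
    repo_service: RepositoryService = Depends(get_repo_service),
    executor: ThreadPoolExecutor = Depends(get_executor),
) -> HealthCheckResult:
    # Validate that the repository exists; the 404 body matches the Error Responses table.
    repo = await repo_service.get_repository(repo_alias)
    if not repo:
        raise HTTPException(
            status_code=404, detail=f"Repository '{repo_alias}' not found"
        )

    # Resolve the alias to the on-disk HNSW index path.
    index_path = repo.get_hnsw_index_path()

    # Run the synchronous integrity check in the executor so the event loop is not blocked.
    # An unhealthy index still yields HTTP 200; callers inspect result.valid.
    return await check_health_async(
        health_service,
        index_path,
        executor,
        force_refresh=force_refresh,
    )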

Response Schema (OpenAPI)

HealthCheckResult:
  type: object
  required:
    - valid
    - file_exists
    - readable
    - loadable
    - errors
    - check_duration_ms
    - from_cache
  properties:
    valid:
      type: boolean
      description: Overall health status
    file_exists:
      type: boolean
    readable:
      type: boolean
    loadable:
      type: boolean
    element_count:
      type: integer
      nullable: true
    connections_checked:
      type: integer
      nullable: true
    min_inbound:
      type: integer
      nullable: true
    max_inbound:
      type: integer
      nullable: true
    index_path:
      type: string
    file_size_bytes:
      type: integer
      nullable: true
    last_modified:
      type: string
      format: date-time
      nullable: true
    errors:
      type: array
      items:
        type: string
    check_duration_ms:
      type: number
    from_cache:
      type: boolean
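
A contract test could check a decoded response body against the required fields of this schema. The sketch below uses only the standard library; the function name `schema_problems` is hypothetical, and the optional nullable fields are deliberately not checked.

```python
# Required fields and their expected Python types, mirroring the `required` list above.
REQUIRED_FIELDS = {
    "valid": bool,
    "file_exists": bool,
    "readable": bool,
    "loadable": bool,
    "errors": list,
    "check_duration_ms": (int, float),
    "from_cache": bool,
}


def schema_problems(body: dict) -> list:
    """Return human-readable problems; an empty list means the required part is satisfied."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in body:
            problems.append(f"missing required field: {name}")
            continue
        value = body[name]
        # bool is a subclass of int in Python, so reject booleans where a non-bool is expected.
        if expected is not bool and isinstance(value, bool):
            problems.append(f"wrong type for {name}: bool")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    errors = body.get("errors")
    if isinstance(errors, list) and not all(isinstance(e, str) for e in errors):
        problems.append("errors must contain only strings")
    return problems


# Example body with made-up values, for demonstration only.
example = {
    "valid": True, "file_exists": True, "readable": True, "loadable": True,
    "errors": [], "check_duration_ms": 12.5, "from_cache": False,
}
print(schema_problems(example))
```

In the server itself the story expects the schema to be generated from a Pydantic model, so a check like this would mainly be useful in client-side or contract tests.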

Error Responses

Status | Scenario        | Response Body
404    | Unknown repo    | {"detail": "Repository 'xyz' not found"}
401    | No/invalid auth | {"detail": "Not authenticated"}
500    | Server error    | {"detail": "Internal server error", "trace_id": "..."}
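
Because an unhealthy index still returns HTTP 200, a monitoring client has to look at the `valid` field rather than the status code alone. The sketch below shows one way to fold both into a single outcome; the function name and the outcome labels are assumptions made for illustration.

```python
def interpret_health_response(status_code: int, body: dict) -> str:
    """Map an HTTP status and decoded JSON body to a coarse outcome (hypothetical labels)."""
    if status_code == 200:
        # The endpoint worked; the index itself may still be unhealthy.
        return "healthy" if body.get("valid") is True else "unhealthy"
    if status_code == 401:
        return "unauthorized"
    if status_code == 404:
        return "not_found"
    # 500 and anything unexpected are treated as an endpoint failure.
    return "server_error"


print(interpret_health_response(200, {"valid": False, "errors": ["example issue"]}))
print(interpret_health_response(404, {"detail": "Repository 'xyz' not found"}))
```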

Service Integration

  • HNSWHealthService injected via FastAPI dependency
  • ThreadPoolExecutor for async wrapper (sync check_integrity offloaded)
  • Repository service resolves alias to index path
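
The executor offloading described above could look roughly like the following. This is a sketch under assumptions: `FakeHealthService` stands in for `HNSWHealthService`, and the `check_integrity(index_path, force_refresh=...)` signature is guessed from the names used in this story.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial


class FakeHealthService:
    """Stand-in for HNSWHealthService; the real signature is an assumption."""

    def check_integrity(self, index_path: str, force_refresh: bool = False) -> dict:
        time.sleep(0.05)  # simulate blocking file I/O and graph traversal
        return {"valid": True, "index_path": index_path, "errors": [], "from_cache": False}


async def check_health_async(health_service, index_path, executor, force_refresh=False):
    """Run the synchronous integrity check in a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional arguments only, so bind the keyword with partial.
    call = partial(health_service.check_integrity, index_path, force_refresh=force_refresh)
    return await loop.run_in_executor(executor, call)


async def demo() -> dict:
    with ThreadPoolExecutor(max_workers=2) as executor:
        # "hnsw_index.bin" is a placeholder path; the fake service never opens it.
        return await check_health_async(FakeHealthService(), "hnsw_index.bin", executor)


result = asyncio.run(demo())
print(result["valid"], result["index_path"])
```

Keeping `check_integrity` synchronous and wrapping it at the call site means the same service can be reused from non-async callers without change.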

Implementation Status

Task               | Status | Notes
Router endpoint    |        | GET /api/repositories/{repo_alias}/health
Request validation |        | Path params, query params
Response model     |        | HealthCheckResult
Authentication     |        | Integrate with existing auth
Error handling     |        | 404, 401, 500 responses
OpenAPI schema     |        | Auto-generated from Pydantic
Async wrapper      |        | Executor offloading

Testing Requirements

Unit Tests

  • Valid repo returns health result
  • Unknown repo returns 404
  • Unauthenticated request returns 401 (when auth enabled)
  • force_refresh bypasses cache
  • Response matches schema
  • Error responses formatted correctly
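
As an example of the cache-related case, the test below exercises a small stand-in cache. `CachedHealthChecker` and its 300-second TTL are hypothetical; the real caching is expected to live in the health service and may behave differently.

```python
import time


class CachedHealthChecker:
    """Hypothetical cache wrapper used only to illustrate the force_refresh test."""

    def __init__(self, check_fn, ttl_seconds: float = 300.0):
        self._check_fn = check_fn
        self._ttl = ttl_seconds
        self._cache = {}  # index_path -> (timestamp, result)

    def check(self, index_path: str, force_refresh: bool = False) -> dict:
        entry = self._cache.get(index_path)
        fresh = entry is not None and time.monotonic() - entry[0] < self._ttl
        if fresh and not force_refresh:
            # Serve the stored result and flag it as cached.
            return {**entry[1], "from_cache": True}
        # Cache miss, expired entry, or forced refresh: run the real check and store it.
        result = {**self._check_fn(index_path), "from_cache": False}
        self._cache[index_path] = (time.monotonic(), result)
        return result


def test_force_refresh_bypasses_cache():
    calls = []

    def fake_check(path):
        calls.append(path)
        return {"valid": True, "errors": []}

    checker = CachedHealthChecker(fake_check)
    assert checker.check("idx")["from_cache"] is False  # first call populates the cache
    assert checker.check("idx")["from_cache"] is True   # second call is served from cache
    assert checker.check("idx", force_refresh=True)["from_cache"] is False
    assert len(calls) == 2  # the cached call did not re-run the integrity check


test_force_refresh_bypasses_cache()
```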

Integration Tests

  • Full endpoint test with real server
  • Authentication flow
  • Cache behavior verification
  • Multiple concurrent requests

API Contract Tests

  • Response schema validation
  • OpenAPI spec accuracy
  • Backward compatibility

Definition of Done

  • GET endpoint implemented and routed
  • Authentication integrated
  • Response matches HealthCheckResult schema
  • Error responses follow API conventions
  • OpenAPI documentation generated
  • Async wrapper handles executor offloading
  • Unit tests for all scenarios
  • Integration tests pass
  • API documentation updated

Conversation References

  • Interface requirement: "REST API: Health check endpoint"
  • Architecture decision: "Sync operations with executor offloading for async contexts"
  • MCP reuse: "MCP: Health check tool (reusing REST implementation)" - implies REST is the foundation
