
Performance: Implement Caching for LLM Judge Calls #6

@c21051997

Description


Is your feature request related to a problem? Please describe.
When running a large evaluation, the same question/answer pair can be evaluated multiple times. Each time, an expensive and relatively slow API call is made to the LLM judge.

Describe the solution you'd like
Implement a caching mechanism for the _acall_llm_judge method in the RAGEvaluator.

  • It could be a simple in-memory cache for a single run.
  • It could be a more advanced on-disk cache (e.g., using shelve or sqlite) that persists between runs.
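As a minimal sketch of the in-memory variant: the idea is to key a dict on a stable hash of the question/answer pair and consult it before making the API call. The class shape below is a stand-in — `_judge_api` and the exact signatures are hypothetical, since the real `RAGEvaluator` internals aren't shown in this issue.

```python
import asyncio
import hashlib
import json


class RAGEvaluator:
    """Stand-in illustrating only the caching logic around _acall_llm_judge;
    the real evaluator's other methods are omitted."""

    def __init__(self) -> None:
        # In-memory, per-run cache: key -> judge verdict.
        self._judge_cache: dict[str, str] = {}

    def _cache_key(self, question: str, answer: str) -> str:
        # Stable key: hash the canonical JSON of the inputs so that
        # identical pairs always map to the same entry.
        payload = json.dumps({"q": question, "a": answer}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    async def _acall_llm_judge(self, question: str, answer: str) -> str:
        key = self._cache_key(question, answer)
        if key in self._judge_cache:
            return self._judge_cache[key]  # cache hit: no API call made
        verdict = await self._judge_api(question, answer)  # expensive call
        self._judge_cache[key] = verdict
        return verdict

    async def _judge_api(self, question: str, answer: str) -> str:
        # Placeholder for the real LLM judge API call (hypothetical name).
        await asyncio.sleep(0)
        return "PASS"
```

The on-disk variant would be structurally identical: swap the dict for a `shelve.open(...)` mapping (using the same string key) so entries persist between runs. Using a content hash as the key also sidesteps `shelve`'s requirement that keys be strings.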

Describe alternatives you've considered
The alternative is to continue making redundant API calls, which is inefficient.

Additional context
This would be a significant performance and cost-saving improvement for users evaluating large test sets.
