@simonrosenberg (Collaborator)

Summary

This PR adds support for overriding the api_key in LLM configuration files via the LLM_API_KEY environment variable. This allows cloud environments to inject the API key via secrets (e.g., secrets.LLM_API_KEY_EVAL) without modifying the config files.

Fixes #301 (Configure llm key for running benchmarks)

Changes

  • New utility function: Added benchmarks/utils/llm_config.py with a load_llm_config() function that:

    • Loads LLM configuration from a JSON file
    • Checks for LLM_API_KEY environment variable
    • If set, overrides the api_key in the config with the environment variable value
    • Returns the validated LLM instance
  • Updated all benchmark run_infer.py files to use the new utility:

    • benchmarks/swebench/run_infer.py
    • benchmarks/swtbench/run_infer.py
    • benchmarks/gaia/run_infer.py
    • benchmarks/commit0/run_infer.py
    • benchmarks/multiswebench/run_infer.py
    • benchmarks/swebenchmultimodal/run_infer.py
    • benchmarks/openagentsafety/run_infer.py
  • Updated validate_cfg.py to use the new utility for consistency

  • Added comprehensive tests in tests/test_llm_config.py:

    • Test loading config from file without env var override
    • Test that LLM_API_KEY env var overrides the config file api_key
    • Test that empty string env var does not override config
    • Test error handling for missing config file
    • Test loading config without api_key in file, with env var set

Usage

When running benchmarks in the cloud, set the LLM_API_KEY environment variable:

export LLM_API_KEY="${secrets.LLM_API_KEY_EVAL}"

The benchmark will automatically use this value instead of the api_key in the JSON config file.

Testing

  • All new tests pass
  • Pre-commit checks pass (ruff format, ruff lint, pycodestyle, pyright)

Co-authored-by: openhands <openhands@all-hands.dev>
