feat: Support LLM_API_KEY environment variable override for benchmark configs #302

simonrosenberg · 2026-01-12T21:34:13Z

Summary

This PR adds support for overriding the api_key in LLM configuration files via the LLM_API_KEY environment variable. This allows cloud environments to inject the API key via secrets (e.g., secrets.LLM_API_KEY_EVAL) without modifying the config files.

Fixes #301

Changes

- New utility function: Added benchmarks/utils/llm_config.py with a load_llm_config() function that:
  - Loads the LLM configuration from a JSON file
  - Checks for the LLM_API_KEY environment variable
  - If it is set to a non-empty value, overrides the api_key in the config with the environment variable value
  - Returns the validated LLM instance
- Updated all benchmark run_infer.py files to use the new utility:
  - benchmarks/swebench/run_infer.py
  - benchmarks/swtbench/run_infer.py
  - benchmarks/gaia/run_infer.py
  - benchmarks/commit0/run_infer.py
  - benchmarks/multiswebench/run_infer.py
  - benchmarks/swebenchmultimodal/run_infer.py
  - benchmarks/openagentsafety/run_infer.py
- Updated validate_cfg.py to use the new utility for consistency
- Added comprehensive tests in tests/test_llm_config.py:
  - Test loading config from file without env var override
  - Test that the LLM_API_KEY env var overrides the config file api_key
  - Test that an empty-string env var does not override the config
  - Test error handling for a missing config file
  - Test loading a config without api_key in the file, with the env var set
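
For readers who want a picture of the override behavior described above, here is a minimal sketch. It is not the code in benchmarks/utils/llm_config.py. The LLMConfig dataclass is a hypothetical stand-in for the real LLM class, the model field is assumed, and raising FileNotFoundError for a missing file is also an assumption, since the PR only says that case is tested.

```python
import json
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

ENV_VAR = "LLM_API_KEY"


@dataclass
class LLMConfig:
    """Hypothetical stand-in for the real LLM class (accepts only these fields)."""

    model: str
    api_key: Optional[str] = None


def load_llm_config(config_path: str) -> LLMConfig:
    """Load an LLM config from JSON, letting LLM_API_KEY override api_key."""
    path = Path(config_path)
    if not path.is_file():
        # Assumed error type; the PR does not say which exception is raised.
        raise FileNotFoundError(f"LLM config file not found: {path}")

    data = json.loads(path.read_text())

    # An unset or empty variable is falsy, so the file value is kept.
    env_key = os.environ.get(ENV_VAR)
    if env_key:
        data["api_key"] = env_key

    # Stand-in for validating the data into the real LLM instance.
    return LLMConfig(**data)
```

Because the override is applied to the parsed data before the instance is built, a config file with no api_key at all still works when the variable is set.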

Usage

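When running benchmarks in the cloud, set the LLM_API_KEY environment variable from the stored secret. For example, in a GitHub Actions step (the ${{ }} expression syntax is specific to that platform):

export LLM_API_KEY="${{ secrets.LLM_API_KEY_EVAL }}"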

The benchmark will automatically use this value instead of the api_key in the JSON config file.

Testing

All new tests pass
Pre-commit checks pass (ruff format, ruff lint, pycodestyle, pyright)


… configs

Add support for overriding the api_key in LLM configuration files via the LLM_API_KEY environment variable. This allows cloud environments to inject the API key via secrets (e.g., secrets.LLM_API_KEY_EVAL) without modifying the config files.

Changes:
- Add benchmarks/utils/llm_config.py with load_llm_config() utility function
- Update all run_infer.py files to use the new utility
- Update validate_cfg.py to use the new utility
- Add comprehensive tests for the new functionality

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai bot mentioned this pull request Jan 12, 2026

Configure llm key for running benchmarks #301

Open
