fix: Fallback for OpenAI model #670
Conversation
CLA Verification: Passed. All contributors have signed the CLA.
Summary of Changes

Hello @Sahilbhatane, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request restores and significantly improves the OpenAI integration within the LLM routing system. It introduces the ability to configure and utilize multiple OpenAI API keys, enhancing the system's resilience by allowing it to automatically fall back to alternative keys or providers if an initial request fails. This change ensures more robust and flexible operation when leveraging OpenAI models.

Highlights
📝 Walkthrough

Adds OpenAI as a first-class LLM provider: multi-key API-key parsing/rotation, provider detection via environment, sync/async OpenAI client initialization, completion paths with multi-key retry, token/cost accounting, provider-failure logging, routing/fallback integration, and tests for OpenAI flows.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant LLMRouter
    participant KeyLoader as Key Loader
    participant OpenAI_API as OpenAI API

    Client->>LLMRouter: complete(messages)
    LLMRouter->>KeyLoader: _load_openai_keys()
    KeyLoader-->>LLMRouter: [api_key_1, api_key_2, ...]

    loop For each loaded API key
        LLMRouter->>OpenAI_API: POST /chat/completions (using api_key_n)
        alt Request succeeds
            OpenAI_API-->>LLMRouter: response + token usage
            LLMRouter->>LLMRouter: track tokens & cost
            LLMRouter-->>Client: LLMResponse (provider=OPENAI)
        else Request fails
            OpenAI_API-->>LLMRouter: error
            LLMRouter->>LLMRouter: _log_provider_failure, try next key
        end
    end

    alt All keys exhausted
        LLMRouter-->>Client: raise informative error / try fallback provider
    end
```
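To make the loop in the diagram concrete, here is a minimal Python sketch of the rotation flow. It is simplified and is not the PR's actual `_complete_openai`; the `gpt-4o-mini` model is just the default mentioned later in the review.

```python
from openai import OpenAI

def complete_with_rotation(api_keys: list[str], messages: list[dict]) -> str:
    """Try each API key in order; return content from the first success."""
    last_error: Exception | None = None
    for index, key in enumerate(api_keys):
        try:
            client = OpenAI(api_key=key)
            response = client.chat.completions.create(
                model="gpt-4o-mini", messages=messages
            )
            return response.choices[0].message.content or ""
        except Exception as exc:  # the review below suggests narrowing this
            last_error = exc
            if index < len(api_keys) - 1:
                continue  # rotate to the next key
    # All keys exhausted: surface the last error so callers can fall back.
    raise RuntimeError("All OpenAI API keys failed") from last_error
```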
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

Poem
🚥 Pre-merge checks: ❌ 3 failed checks (2 warnings, 1 inconclusive)
Code Review
This pull request successfully integrates OpenAI as a new LLM provider across the cortex/cli.py, cortex/llm_router.py, and cortex/predictive_prevention.py files. The changes include adding OpenAI to the LLMProvider enum, updating routing and fallback logic, implementing synchronous and asynchronous API completion methods, and enhancing API key management to support multiple OpenAI keys. The logging for provider failures has also been standardized to use logger.warning and logger.debug for consistency. Overall, the implementation is robust and aligns well with the existing architecture.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@cortex/llm_router.py`:
- Around line 138-145: Update the docstring in llm_router (constructor or
function where Args are listed) to mention that OpenAI keys can be provided via
both OPENAI_API_KEY and OPENAI_API_KEYS environment variables and that
_load_openai_keys() accepts multiple keys separated by commas, semicolons, or
spaces; explicitly note multi-key support and the accepted delimiters for
openai_api_key/openai_api_keys and that the loader falls back to single-key env
var if needed.
- Around line 538-592: Add a new test class TestOpenAIIntegration that mirrors
existing provider tests and covers the sync method _complete_openai and async
_acomplete_openai, including successful completion, multi-key fallback when
openai_api_keys contains invalid then valid keys, and final error propagation
when all keys fail; specifically mock the OpenAI client/
OpenAI.chat.completions.create to return a fake response (with
usage.prompt_tokens and usage.completion_tokens and choices[0].message.content)
and to raise exceptions for the first key(s) to exercise the key-rotation loop,
assert returned LLMResponse fields (content, provider == LLMProvider.OPENAI,
tokens_used, cost_usd, raw_response) and latency handling, and add a test that
verifies RuntimeError is raised when all keys fail (ensuring last_error is
chained).
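A hedged sketch of what such a test could look like follows. The patch target `cortex.llm_router.OpenAI`, the `LLMRouter` constructor argument, and the `_complete_openai` signature are assumptions based on the review text, not verified against the PR.

```python
from unittest.mock import MagicMock, patch

from cortex.llm_router import LLMRouter  # assumed import path

def make_fake_response(content="ok", prompt_tokens=10, completion_tokens=5):
    """Build a minimal object shaped like an OpenAI chat completion."""
    response = MagicMock()
    response.choices = [MagicMock()]
    response.choices[0].message.content = content
    response.usage.prompt_tokens = prompt_tokens
    response.usage.completion_tokens = completion_tokens
    return response

def test_openai_multi_key_fallback():
    fake = make_fake_response("hello")
    with patch("cortex.llm_router.OpenAI") as mock_cls:
        # First key fails, second succeeds -> exercises the rotation loop.
        bad, good = MagicMock(), MagicMock()
        bad.chat.completions.create.side_effect = Exception("invalid key")
        good.chat.completions.create.return_value = fake
        mock_cls.side_effect = [bad, good]

        router = LLMRouter(openai_api_keys=["k1", "k2"])  # assumed constructor
        result = router._complete_openai(
            messages=[{"role": "user", "content": "hi"}]
        )

    assert result.content == "hello"
```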
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cortex/llm_router.py`:
- Around line 591-605: In _openai_response_to_llmresponse guard against an empty
response.choices by checking that response.choices exists and has at least one
item before accessing choices[0]; if it's empty, return a safe LLMResponse
(e.g., content="", provider=LLMProvider.OPENAI, model=self.openai_model,
tokens_used=0, cost_usd=0.0, latency_seconds=0.0) and include the raw_response
if available, or alternatively raise a descriptive exception — ensure you also
safely access message.content (e.g., check for .message) when building the
content.
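A hedged sketch of the safe-return variant described above; the `LLMResponse` field names follow this comment, but the exact constructor is an assumption.

```python
from typing import Any

# Inside the router class; a sketch, not the PR's actual implementation.
def _openai_response_to_llmresponse(self, response: Any) -> "LLMResponse":
    if not getattr(response, "choices", None):
        # Empty or missing choices: return a safe, zeroed response instead
        # of letting choices[0] raise an IndexError.
        return LLMResponse(
            content="",
            provider=LLMProvider.OPENAI,
            model=self.openai_model,
            tokens_used=0,
            cost_usd=0.0,
            latency_seconds=0.0,
            raw_response=response,
        )
    message = response.choices[0].message
    content = getattr(message, "content", None) or ""
    # ... build the normal LLMResponse from `content` and `response.usage` ...
```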
♻️ Duplicate comments (2)
cortex/llm_router.py (2)
138-148: Update docstring to reflect multi-key and delimiter support. The docstring only mentions `OPENAI_API_KEY`, but the implementation loads from both `OPENAI_API_KEY` and `OPENAI_API_KEYS` environment variables and supports comma/space/semicolon-delimited lists.
538-592: Test coverage for OpenAI integration is still needed. As noted in a previous review, the new `_complete_openai` and `_acomplete_openai` methods with multi-key retry logic need dedicated test coverage to meet the >80% test coverage requirement.
🧹 Nitpick comments (4)
cortex/llm_router.py (4)
228-248: LGTM with optional docstring suggestion. The key loading logic correctly handles multiple sources, deduplication, and various delimiters. Consider adding brief docstrings for maintainability.
📝 Optional: Add docstrings
```diff
 @staticmethod
 def _split_api_keys(value: str | None) -> list[str]:
+    """Split a string of API keys by whitespace, comma, or semicolon."""
     if not value:
         return []
     return [key.strip() for key in re.split(r"[\s,;]+", value) if key.strip()]

 def _load_openai_keys(self, explicit_key: str | None) -> list[str]:
+    """Load OpenAI API keys from explicit input and environment variables, deduplicating."""
     keys: list[str] = []
```
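As a quick sanity check, the splitting behavior from the diff can be exercised standalone; this snippet just mirrors the code above.

```python
import re

def _split_api_keys(value: str | None) -> list[str]:
    # Mirrors the implementation above: split on whitespace, commas, semicolons.
    if not value:
        return []
    return [key.strip() for key in re.split(r"[\s,;]+", value) if key.strip()]

# All delimiter styles yield the same key list:
assert _split_api_keys("sk-a, sk-b; sk-c") == ["sk-a", "sk-b", "sk-c"]
assert _split_api_keys("sk-a sk-b sk-c") == ["sk-a", "sk-b", "sk-c"]
assert _split_api_keys(None) == []
```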
6-9: Update module docstring to include OpenAI. The module docstring lists supported providers but doesn't mention OpenAI.
📝 Suggested update
```diff
 Supports:
 - Claude API (Anthropic) - Best for natural language, chat, requirement parsing
 - Kimi K2 API (Moonshot) - Best for system operations, debugging, tool use
+- OpenAI API - Flexible fallback with multi-key support
```
101-104: Update OpenAI cost values for accurate cost tracking. Currently OpenAI costs are set to 0.0, which means all OpenAI usage will report $0 in cost tracking. Since `gpt-4o-mini` is the default model with known pricing, and the codebase comment at line 91 states "update with actual pricing", these values should be updated. Other providers already have pricing populated.

💡 Suggested improvement
```diff
 LLMProvider.OPENAI: {
-    "input": 0.0,  # Unknown/varies by model
-    "output": 0.0,  # Unknown/varies by model
+    "input": 0.15,  # gpt-4o-mini: $0.15 per 1M input tokens
+    "output": 0.60,  # gpt-4o-mini: $0.60 per 1M output tokens
 },
```
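Assuming the router multiplies token counts by these per-1M rates (as the other providers' entries suggest), a worked example:

```python
# gpt-4o-mini example: 2,000 input tokens and 500 output tokens.
input_rate = 0.15 / 1_000_000   # USD per input token
output_rate = 0.60 / 1_000_000  # USD per output token

cost_usd = 2_000 * input_rate + 500 * output_rate
print(f"${cost_usd:.6f}")  # $0.000600
```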
538-569: Narrow the exception handling to only retry on authentication failures. The current implementation catches all exceptions and retries with the next key. However, only `AuthenticationError` (invalid/expired API key) is key-specific and warrants a retry with another key. `RateLimitError` (429) is enforced at the organization level and applies to all keys in the same account, so switching keys won't resolve it. Other errors, such as an invalid request format, should fail immediately rather than retry.

💡 Suggested improvement
```diff
+from openai import AuthenticationError
+
 def _complete_openai( ... ) -> LLMResponse:
     ...
     for index, key in enumerate(self.openai_api_keys):
         try:
             ...
-        except Exception as e:
+        except AuthenticationError as e:
             last_error = e
             if index < len(self.openai_api_keys) - 1:
                 logger.warning("OpenAI API key failed, trying next key")
                 logger.debug("OpenAI error: %s", e)
                 continue
             break
```

Note: the same pattern applies to the async `_acomplete_openai` method.
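For illustration, a hedged sketch of the async variant with the narrowed exception; `self.openai_api_keys`, `self.openai_model`, `logger`, and `_openai_response_to_llmresponse` are assumed from the review context.

```python
from openai import AsyncOpenAI, AuthenticationError

async def _acomplete_openai(self, messages: list[dict]) -> "LLMResponse":
    """Sketch: the same rotation loop as the sync path, using the async client."""
    last_error: Exception | None = None
    for index, key in enumerate(self.openai_api_keys):
        try:
            client = AsyncOpenAI(api_key=key)
            response = await client.chat.completions.create(
                model=self.openai_model, messages=messages
            )
            return self._openai_response_to_llmresponse(response)
        except AuthenticationError as e:  # only key-specific failures rotate keys
            last_error = e
            if index < len(self.openai_api_keys) - 1:
                logger.warning("OpenAI API key failed, trying next key")
                logger.debug("OpenAI error: %s", e)
                continue
            break
    raise RuntimeError("All OpenAI API keys failed") from last_error
```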
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cortex/llm_router.py`:
- Around line 237-244: The _load_openai_keys function currently extends keys
with OPENAI_API_KEY before OPENAI_API_KEYS which contradicts the docstring that
says OPENAI_API_KEYS should be preferred; update _load_openai_keys so it calls
self._split_api_keys(os.getenv("OPENAI_API_KEYS")) before
self._split_api_keys(os.getenv("OPENAI_API_KEY")) (while still preserving
explicit_key handling and de-duplication), or alternatively update the docstring
to reflect the existing precedence—refer to _load_openai_keys, _split_api_keys,
explicit_key, OPENAI_API_KEY and OPENAI_API_KEYS to locate the change.
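The first option could look roughly like this; a sketch assuming the `_split_api_keys` helper shown earlier in the review:

```python
import os

def _load_openai_keys(self, explicit_key: str | None) -> list[str]:
    """Sketch: explicit key first, then OPENAI_API_KEYS, then OPENAI_API_KEY."""
    keys: list[str] = []
    if explicit_key:
        keys.extend(self._split_api_keys(explicit_key))
    # Preferred multi-key variable now checked before the single-key fallback.
    keys.extend(self._split_api_keys(os.getenv("OPENAI_API_KEYS")))
    keys.extend(self._split_api_keys(os.getenv("OPENAI_API_KEY")))
    # De-duplicate while preserving first-seen order.
    return list(dict.fromkeys(keys))
```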
♻️ Duplicate comments (1)
cortex/llm_router.py (1)
594-607: Guard against empty OpenAI choices to avoid IndexError. This still assumes `response.choices[0]` exists; a defensive check prevents unexpected crashes.

🛡️ Suggested guard
```diff
 def _openai_response_to_llmresponse(self, response: Any) -> LLMResponse:
-    content = response.choices[0].message.content or ""
+    if not getattr(response, "choices", None):
+        raise RuntimeError("OpenAI returned empty choices")
+    content = response.choices[0].message.content or ""
```



Related Issue
Closes #
Summary
llm_router.py previously had an OpenAI fallback, but a later commit removed it, and the OpenAI model was no longer referenced anywhere in cortex.
AI Disclosure
copilot 5.2-codex
Checklist
- Commit messages follow the `type(scope): description` or `[scope] description` format
- All tests pass (`pytest tests/`)

Screenshots
Before:

After:
Summary by CodeRabbit
New Features
Tests