-
Notifications
You must be signed in to change notification settings - Fork 388
Feat(router): add oai interface support for router #1203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@codex review |
|
To use Codex here, create a Codex account and connect to github. |
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| self.app.post("/retrieve_from_text")(self.retrieve_from_text) | ||
| self.app.post("/retrieve_from_messages_template")(self.retrieve_from_messages_template) | ||
|
|
||
| # OpenAI Chat Completion API (non-streaming only for now) | ||
| self.app.post("/v1/chat/completions")(self.chat_completions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define handlers registered in _setup_routes
The router registers /retrieve_from_messages_template and /v1/chat/completions using self.retrieve_from_messages_template and self.chat_completions, but those methods are not defined on SlimeRouter anywhere in this file. That means _setup_routes() will raise AttributeError during router initialization, preventing the server from starting in any environment. Either implement these handlers or remove the registrations until they exist.
Useful? React with 👍 / 👎.
| # Compute full loss_mask: input (from cache) + output (generated) | ||
| # Note: loss_mask is retrieved at Line 361: token_ids, logprobs, loss_mask = radix_tree.retrieve_from_text(...) | ||
| input_loss_mask = loss_mask or [] | ||
| full_loss_mask = input_loss_mask + [1] * len(output_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass loss_mask into cache update to avoid silent drop
In the cache update block, loss_mask is referenced but never defined or passed into _non_stream_generate_with_cache. On the cache path with any non-empty output_ids (the normal case), this triggers a NameError, which is caught and logged, so the radix cache is never updated. That silently disables cache maintenance and defeats the purpose of the cache-enabled path.
Useful? React with 👍 / 👎.
|
@zhuzilin hi! I think this pr is ready for review. Can you take a look? |
Description
This PR enhances the
sgl-routerby adding OpenAI-compatible interface support and refining the routing logic:/v1/chat/completions), allowing the use of standard OpenAI SDKs and LangChain to interact with SGLang rollout engine.