Skip to content

Conversation

@ChangyiYang
Copy link
Contributor

@ChangyiYang ChangyiYang commented Dec 24, 2025

Description

This PR enhances the sgl-router by adding OpenAI-compatible interface support and refining the routing logic:

  • OAI Interface Support: Implements standard endpoints (e.g., /v1/chat/completions), allowing the use of standard OpenAI SDKs and LangChain to interact with SGLang rollout engine.
  • TITO Support: Support the TITO logic for OAI API
  • Logic Refinement & Fixes: Refactored core routing logic for better stability, fixed backend state management, and updated Retool examples for improved monitoring.

@ChangyiYang ChangyiYang changed the title Feat/router pr pt1 Feat(router): add oai interface support for router Dec 29, 2025
@ChangyiYang ChangyiYang marked this pull request as ready for review December 29, 2025 03:05
@ChangyiYang
Copy link
Contributor Author

@codex review

@chatgpt-codex-connector
Copy link

To use Codex here, create a Codex account and connect to github.

@ChangyiYang
Copy link
Contributor Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 77 to +81
self.app.post("/retrieve_from_text")(self.retrieve_from_text)
self.app.post("/retrieve_from_messages_template")(self.retrieve_from_messages_template)

# OpenAI Chat Completion API (non-streaming only for now)
self.app.post("/v1/chat/completions")(self.chat_completions)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Define handlers registered in _setup_routes

The router registers /retrieve_from_messages_template and /v1/chat/completions using self.retrieve_from_messages_template and self.chat_completions, but those methods are not defined on SlimeRouter anywhere in this file. That means _setup_routes() will raise AttributeError during router initialization, preventing the server from starting in any environment. Either implement these handlers or remove the registrations until they exist.

Useful? React with 👍 / 👎.

Comment on lines +513 to +516
# Compute full loss_mask: input (from cache) + output (generated)
# Note: loss_mask is retrieved at Line 361: token_ids, logprobs, loss_mask = radix_tree.retrieve_from_text(...)
input_loss_mask = loss_mask or []
full_loss_mask = input_loss_mask + [1] * len(output_ids)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Pass loss_mask into cache update to avoid silent drop

In the cache update block, loss_mask is referenced but never defined or passed into _non_stream_generate_with_cache. On the cache path with any non-empty output_ids (the normal case), this triggers a NameError, which is caught and logged, so the radix cache is never updated. That silently disables cache maintenance and defeats the purpose of the cache-enabled path.

Useful? React with 👍 / 👎.

@ChangyiYang
Copy link
Contributor Author

@zhuzilin hi! I think this pr is ready for review. Can you take a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants