Skip to content

feat: LLM smart router with embedding-based prompt classification#11

Merged
sblanchard merged 15 commits intomainfrom
worktree-llm-router
Feb 21, 2026
Merged

feat: LLM smart router with embedding-based prompt classification#11
sblanchard merged 15 commits intomainfrom
worktree-llm-router

Conversation

@sblanchard
Copy link
Owner

Summary

  • Embedding classifier: Classifies prompts into model tiers (Simple/Complex/Reasoning/Free) using cosine similarity against reference prompt centroids via Ollama embeddings (~10ms)
  • Smart router: 5 routing profiles (Auto, Eco, Premium, Free, Reasoning) with tier fallback chains; resolution order: explicit model → router → agent models → role defaults → any provider
  • Observability: Decision ring buffer, 4 new API endpoints (/v1/router/status, /v1/router/classify, /v1/router/decisions, /v1/router/config)
  • Dashboard: LLM Router card on Settings page (classifier status, tier assignments, collapsible decisions log), routing profile dropdown on schedule create/edit forms
  • Graceful degradation: Router disabled by default; classifier failure falls back to fixed profiles; zero behavioral change when [llm.router] is absent

New files

  • crates/providers/src/classifier.rs — embedding classifier with cache (792 lines)
  • crates/providers/src/smart_router.rs — pure routing resolution logic (233 lines)
  • crates/providers/src/decisions.rs — thread-safe decisions ring buffer (111 lines)
  • crates/gateway/src/api/router.rs — router API endpoints (229 lines)
  • crates/providers/tests/router_integration.rs — integration tests (252 lines)

Modified files

  • crates/domain/src/config/llm.rs — RoutingProfile, ModelTier, RouterConfig types
  • crates/gateway/src/runtime/mod.rs — resolve_provider() with smart router step
  • crates/gateway/src/runtime/turn.rs — routing_profile on TurnInput
  • crates/gateway/src/bootstrap.rs — router initialization at startup
  • crates/gateway/src/api/schedules.rs — routing_profile on schedule CRUD + validation
  • Dashboard: client.ts, Settings.vue, Schedules.vue, ScheduleDetail.vue

Test plan

  • 62 sa-domain tests pass (router config serde roundtrips)
  • 91 sa-providers unit tests pass (classifier, smart_router, decisions)
  • 12 integration tests pass (full routing flow without Ollama)
  • 274 sa-gateway tests pass (zero regressions)
  • Invalid routing_profile values rejected with 400
  • Manual: enable [llm.router] in config.toml, verify Settings page shows Router card
  • Manual: create schedule with routing profile, verify it persists and clears correctly

sblanchard and others added 15 commits February 21, 2026 13:55
Add smart router configuration types: RouterConfig (with Default),
RoutingProfile enum (Auto/Eco/Premium/Free/Reasoning), ModelTier enum
(Simple/Complex/Reasoning/Free), ClassifierConfig, TierConfig, and
RouterThresholds. Add optional router field to LlmConfig.
Introduces EmbeddingClassifier that uses cosine similarity between
prompt embeddings and pre-computed tier centroids (Simple, Complex,
Reasoning) to route prompts to the appropriate model tier. Includes
an in-memory LRU cache with TTL eviction, threshold-based escalation
rules, and agentic prompt detection via length heuristics.
Pure synchronous routing functions that resolve model strings from
routing profiles, classified tiers, and tier configuration. Supports
explicit model bypass, profile-to-tier mapping, classifier-driven
Auto mode, and cross-tier fallback chains.
Thread-safe DecisionLog backed by parking_lot::Mutex and VecDeque that
evicts the oldest entry at capacity. Provides `record()` and `recent()`
for observability of smart-router routing choices.
Add SmartRouterState to AppState and update resolve_provider() to check
the smart router when no explicit model override is provided. The return
type now includes an optional model name so the router can specify which
model within a provider to use. The resolution order becomes: explicit
override -> smart router -> agent models -> role defaults -> any provider.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire routing_profile through the Schedule model, create/update API,
and schedule runner so scheduled runs can use a specific router profile
instead of the global default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add routing_profile field to Schedule, CreateScheduleRequest, and
UpdateScheduleRequest TypeScript types. Wire a "Routing Profile"
dropdown into both the create form (Schedules.vue) and edit form
(ScheduleDetail.vue) with options: Default (inherit), Auto, Eco,
Premium, Free, and Reasoning.
Full round-trip validation of the routing pipeline without requiring
Ollama or any external services. Covers all five routing profiles,
explicit model bypass, tier fallback chains, and decision log
recording with capacity eviction.
@sblanchard sblanchard merged commit 28387fa into main Feb 21, 2026
3 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant