Multi-model registry and adapter layer for realtime-0.5b, tts-1.5b, tts-7b #7

Open
Copilot wants to merge 4 commits into main from copilot/implement-multi-model-compatibility

Copilot AI commented Mar 13, 2026

Refactors the hardcoded single-model runner into a multi-backend TTS system. The model field in OpenAI-compatible requests is now resolved through a registry instead of being ignored. The longform models (1.5B/7B) are registered and have their API plumbing in place, but requests to them return 501 until a real backend is wired in. Existing realtime-0.5b behavior is fully preserved.
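
For reviewers, a minimal sketch of how registry-based resolution of the model field could look. ModelProfile, resolve_model_key(), get_model_profile(), UnknownModelError and the tts-1 alias are named in this PR; the field names, the field values, the dict names and the fallback to realtime-0.5b when no model is given are assumptions made for illustration, not the code in model_registry.py.

```python
from dataclasses import dataclass


class UnknownModelError(KeyError):
    """Raised when a requested name is neither a registry key nor an alias."""


@dataclass(frozen=True)
class ModelProfile:
    key: str
    family: str       # assumed values: "realtime" or "longform"
    loader_mode: str  # assumed values: "subprocess_demo" or "native"


# Three profiles, as in the PR. The field values are illustrative guesses.
MODEL_PROFILES = {
    "realtime-0.5b": ModelProfile("realtime-0.5b", "realtime", "subprocess_demo"),
    "tts-1.5b": ModelProfile("tts-1.5b", "longform", "native"),
    "tts-7b": ModelProfile("tts-7b", "longform", "native"),
}

# Only the tts-1 alias is confirmed by the PR description.
MODEL_ALIASES = {"tts-1": "realtime-0.5b"}

# Assumption: a missing model field falls back to the previous default.
DEFAULT_MODEL_KEY = "realtime-0.5b"


def resolve_model_key(name=None):
    """Map a requested model name (or alias) to a canonical registry key."""
    if not name:
        return DEFAULT_MODEL_KEY
    key = MODEL_ALIASES.get(name, name)
    if key not in MODEL_PROFILES:
        raise UnknownModelError(name)
    return key


def get_model_profile(name=None):
    """Return the profile for a requested model name or alias."""
    return MODEL_PROFILES[resolve_model_key(name)]
```

With this shape, a request carrying "model": "tts-1" resolves to the realtime-0.5b profile, and an unrecognized name raises UnknownModelError for the HTTP layer to translate.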

New runner/ package

  • model_registry.py: ModelProfile dataclass, 3 model profiles, alias map (tts-1 → realtime-0.5b, etc.), resolve_model_key(), get_model_profile()
  • adapters/base.py: EngineAdapter ABC with is_available(), capabilities(), synthesize(), stream(), health()
  • adapters/realtime_demo.py: RealtimeDemoAdapter wrapping the existing subprocess demo. Extracted detect_device(), apply_overrides(), build_realtime_demo_cmd() from run_realtime_demo.py
  • adapters/longform_native.py: LongformNativeAdapter scaffold. is_available() returns False; synthesize() raises BackendUnavailableError. No fake streaming
  • adapter_factory.py: make_adapter(model_key) dispatches by family/loader_mode
  • types.py: SpeechRequest/SpeakerTurn pydantic models with per-family validation
  • errors.py: UnknownModelError, CapabilityError, BackendUnavailableError, InvalidRequestForModelError
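
The adapter layer listed above can be pictured roughly as follows. This is a sketch under stated assumptions: the class and method names come from this PR, but the method signatures, the dict returned by capabilities() and health(), the concrete health() on the base class, and the two lookup tables are hypothetical. RealtimeDemoAdapter here is a stand-in that omits the subprocess plumbing entirely.

```python
from abc import ABC, abstractmethod


class BackendUnavailableError(RuntimeError):
    """Raised when a model is registered but no inference backend is wired in."""


class EngineAdapter(ABC):
    """Interface each backend implements (method names from the PR)."""

    def __init__(self, model_key: str) -> None:
        self.model_key = model_key

    @abstractmethod
    def is_available(self) -> bool: ...

    @abstractmethod
    def capabilities(self) -> dict: ...

    @abstractmethod
    def synthesize(self, text: str) -> bytes: ...

    @abstractmethod
    def stream(self, text: str): ...

    def health(self) -> dict:
        # Assumption: health is derived from availability.
        return {"model": self.model_key, "available": self.is_available()}


class RealtimeDemoAdapter(EngineAdapter):
    """Stand-in only: the real adapter wraps the existing subprocess demo."""

    def is_available(self) -> bool:
        return True

    def capabilities(self) -> dict:
        return {"streaming": True}

    def synthesize(self, text: str) -> bytes:
        raise NotImplementedError("subprocess plumbing omitted from this sketch")

    def stream(self, text: str):
        raise NotImplementedError("subprocess plumbing omitted from this sketch")


class LongformNativeAdapter(EngineAdapter):
    """Scaffold: reports itself unavailable and never fakes output."""

    def is_available(self) -> bool:
        return False

    def capabilities(self) -> dict:
        return {"streaming": False}

    def synthesize(self, text: str) -> bytes:
        raise BackendUnavailableError(f"no backend wired in for {self.model_key}")

    def stream(self, text: str):
        raise BackendUnavailableError(f"no backend wired in for {self.model_key}")


# Assumed family assignment; the real factory reads this from the registry.
_FAMILY_BY_KEY = {
    "realtime-0.5b": "realtime",
    "tts-1.5b": "longform",
    "tts-7b": "longform",
}
_ADAPTER_BY_FAMILY = {
    "realtime": RealtimeDemoAdapter,
    "longform": LongformNativeAdapter,
}


def make_adapter(model_key: str) -> EngineAdapter:
    """Pick an adapter class by model family and instantiate it."""
    return _ADAPTER_BY_FAMILY[_FAMILY_BY_KEY[model_key]](model_key)
```

Keeping the longform scaffold behind the same interface lets the server report it through health() and reject requests cleanly, without any code path that pretends to produce audio.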

Modified files

  • overrides/app.py: /v1/audio/speech resolves the model and returns 400 for an invalid model/request combination, 404 for an unknown model, and 501 for a missing backend. /stream rejects non-realtime models. /config and /health expose model registry info
  • scripts/download_model.py: Accepts --model with registry resolution. Default unchanged
  • scripts/run_realtime_demo.py: Now a thin shim delegating to runner.adapters.realtime_demo
  • scripts/run_server.py: New generic launcher: --model realtime-0.5b|tts-1.5b|tts-7b
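
The status codes returned by /v1/audio/speech amount to a mapping from the runner error classes to HTTP statuses. A framework-free sketch of that idea follows. The error class names and the 400/404/501 codes are from this PR; the status_for() helper, the table name, the 500 fallback and the choice of 400 for CapabilityError are assumptions, since the description does not give the code used when /stream rejects a model.

```python
class UnknownModelError(Exception):
    """Requested model is not in the registry."""


class CapabilityError(Exception):
    """Model exists but does not support the requested operation."""


class BackendUnavailableError(Exception):
    """Model is registered but has no inference backend."""


class InvalidRequestForModelError(Exception):
    """Request fields are not valid for the resolved model family."""


# 400/404/501 follow the PR description.
# CapabilityError -> 400 is an assumption.
STATUS_BY_ERROR = {
    InvalidRequestForModelError: 400,
    CapabilityError: 400,
    UnknownModelError: 404,
    BackendUnavailableError: 501,
}


def status_for(exc: Exception) -> int:
    """Translate a runner exception into an HTTP status code."""
    for error_type, status in STATUS_BY_ERROR.items():
        if isinstance(exc, error_type):
            return status
    return 500  # assumed fallback for anything unexpected
```

In the route handler, a single try/except around resolution and synthesis could then call status_for() to build the error response.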

Backward compatibility

All existing entry points work unchanged:

uv run python scripts/download_model.py           # still downloads realtime-0.5b
uv run python scripts/run_realtime_demo.py --port 8000  # still launches realtime demo
curl -d '{"model":"tts-1","input":"Hello"}' /v1/audio/speech  # still maps to realtime-0.5b

Remaining TODOs

  • longform_native.py:_load_backend() — replace placeholder ImportError with real backend import
  • longform_native.py:synthesize() — wire actual inference
  • run_server.py — add longform serving path

Tests

46 new tests covering registry, alias resolution, per-family validation, adapter factory, graceful degradation, and error classes. All run without GPU/model weights.


@groxaxo groxaxo marked this pull request as ready for review March 13, 2026 02:03
Copilot AI and others added 3 commits March 13, 2026 02:12
…l scripts

Co-authored-by: groxaxo <76023196+groxaxo@users.noreply.github.com>
…entation

Co-authored-by: groxaxo <76023196+groxaxo@users.noreply.github.com>
…or message clarity

Co-authored-by: groxaxo <76023196+groxaxo@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add multi-model compatibility for existing repo" to "Multi-model registry and adapter layer for realtime-0.5b, tts-1.5b, tts-7b" Mar 13, 2026
Copilot AI requested a review from groxaxo March 13, 2026 02:19
2 participants