fix: explicitly model streaming tokens vs sentences#374

Merged
JarbasAl merged 2 commits into dev from better_streaming_tokens on Jan 30, 2026
Conversation

@JarbasAl (Member) commented Jan 30, 2026

Summary by CodeRabbit

  • New Features

    • Token-level streaming for chat interactions.
    • Sentence-level streaming option for TTS-friendly output.
  • Breaking Changes

    • Streaming API now yields token strings instead of message objects.
    • Default streaming behavior remains whole-response; real-time sentence streaming should be overridden where needed.
  • Bug Fixes

    • Sentence-splitting fallback updated to split on newlines for better fallback behavior.


@github-actions github-actions bot added the fix label Jan 30, 2026
@coderabbitai bot (Contributor) commented Jan 30, 2026

Caution: Review failed. The pull request is closed.

📝 Walkthrough

Refactors agent streaming: stream_chat renamed to stream_tokens returning token strings; new stream_sentences added to yield newline-split sentences for TTS. sentence_split fallback now returns text.split("\n") instead of [text].
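Under the assumption that the base class looks roughly like the walkthrough describes, the new interface can be sketched as below. The class name, message format, and method bodies are illustrative placeholders; only the method names and the `Iterable[str]` return type come from this PR:

```python
from typing import Dict, Iterable, List


class SketchAgent:
    """Illustrative stand-in for the agent template in agents.py."""

    def continue_chat(self, messages: List[Dict[str, str]]) -> str:
        # Placeholder: a real agent would query an LLM here.
        return "Hello there.\nHow can I help?"

    def stream_tokens(self, messages: List[Dict[str, str]]) -> Iterable[str]:
        # Default fallback: yield whitespace-separated chunks of the
        # whole response; real streaming backends would override this.
        yield from self.continue_chat(messages).split()

    def stream_sentences(self, messages: List[Dict[str, str]]) -> Iterable[str]:
        # TTS-friendly variant: yield newline-split sentences.
        for sent in self.continue_chat(messages).split("\n"):
            if sent.strip():
                yield sent
```

A TTS pipeline would consume `stream_sentences`, while a chat UI painting partial output would consume `stream_tokens`.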

Changes

Streaming Interface Refactor (ovos_plugin_manager/templates/agents.py):
Renamed stream_chat to stream_tokens and changed the return type from Iterable[AgentMessage] to Iterable[str]. Added stream_sentences to yield full sentences (newline-split) for TTS use. Updated docstrings to clarify token vs. sentence streaming.

Sentence-splitting Fallback (ovos_plugin_manager/thirdparty/solvers.py):
Changed the sentence_split exception fallback from returning [text] to text.split("\n"), so the error case now splits on newline boundaries.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble tokens, one by one, so fleet,
Sentences hop out—neat lines to greet.
From messages once whole, now bits take flight,
A rabbit's cheer for streaming, soft and light. 🥕✨

🚥 Pre-merge checks | ✅ 3 passed
  • Description Check: ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled)
  • Title Check: ✅ Passed (the title accurately describes the main change: refactoring the streaming API to explicitly distinguish between token-level and sentence-level streaming)
  • Docstring Coverage: ✅ Passed (docstring coverage is 100.00%, above the required 80.00% threshold)




@github-actions github-actions bot added fix and removed fix labels Jan 30, 2026
@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@ovos_plugin_manager/templates/agents.py`:
- Around lines 254-278: The stream_sentences implementation and docstring are incorrect. Update the docstring to say "sentence streaming" (not "token streaming") and change the Returns description to "sentences". Replace the naive split("\n") with a proper sentence splitter by calling AbstractSolver.sentence_split (from thirdparty/solvers.py, which wraps quebra_frases.sentence_tokenize) on the AgentMessage content returned by continue_chat, i.e. use continue_chat(messages, session_id, lang, units).content and yield each sentence from AbstractSolver.sentence_split, so that stream_sentences yields complete sentences suitable for TTS.
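The suggested fix can be sketched as follows. The regex splitter below is a self-contained stand-in for AbstractSolver.sentence_split (which actually wraps quebra_frases.sentence_tokenize), and the signature is simplified for illustration:

```python
import re
from typing import Iterable, List


def sentence_split(text: str, max_sentences: int = 25) -> List[str]:
    # Stand-in for AbstractSolver.sentence_split; a regex on end-of-sentence
    # punctuation keeps the sketch dependency-free.
    sents = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return sents[:max_sentences]


def stream_sentences(answer: str) -> Iterable[str]:
    # Reviewer-suggested shape: feed the full response through a real
    # sentence splitter instead of a naive "\n" split.
    yield from sentence_split(answer)
```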

In `@ovos_plugin_manager/thirdparty/solvers.py`:
- Around lines 108-110: In sentence_split, the exception fallback returns text.split("\n") without applying the max_sentences limit. Update the except block (where LOG.exception(f"Error in sentence_split: {e}") is logged) to truncate the split result with [:max_sentences], the same slicing used on the normal path, so the fallback respects the max_sentences parameter.
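A minimal sketch of the corrected fallback, with a `tokenizer` parameter standing in for the quebra_frases.sentence_tokenize call that the real sentence_split wraps:

```python
import logging
from typing import Callable, List, Optional

LOG = logging.getLogger(__name__)


def sentence_split(text: str, max_sentences: int = 25,
                   tokenizer: Optional[Callable[[str], List[str]]] = None) -> List[str]:
    # `tokenizer` stands in for quebra_frases.sentence_tokenize.
    try:
        return tokenizer(text)[:max_sentences]
    except Exception as e:
        LOG.exception(f"Error in sentence_split: {e}")
        # The fix: truncate the newline fallback too, so the error path
        # also honors max_sentences.
        return text.split("\n")[:max_sentences]
```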
🧹 Nitpick comments (1)
ovos_plugin_manager/templates/agents.py (1)

228-252: Clarify token semantics in documentation and consider naming.

The method uses .split() which yields whitespace-separated words, not LLM tokens (which are typically subword units). This is fine for a default fallback implementation, but the docstring should clarify that subclasses implementing real streaming would yield actual model tokens.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@JarbasAl JarbasAl merged commit c3fc766 into dev Jan 30, 2026
3 of 6 checks passed
@JarbasAl JarbasAl deleted the better_streaming_tokens branch January 30, 2026 14:13