Fix issue #99: Enable silence flush regardless of hasSpeech state #137

Open
ysdede wants to merge 1 commit into master from vad-transcription-gating

ysdede (Owner) commented Feb 14, 2026

Summary

Resolves an issue where the silence-based buffer flushing mechanism failed to trigger.

Problem

The logic for flushing the buffer was nested within a negated hasSpeech conditional block. However, hasSpeech consistently evaluated to true even during periods of silence because it validated the entire buffer range, including historical transcribed data. Consequently, the silence threshold was never reached.
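
To illustrate why the guard never opened, here is a minimal sketch using per-frame energy values. The helper names (`hasSpeechInRange`, `silenceTailSec`, `repeatValue`), the 0.1 s frame length, and the 0.3 energy threshold are assumptions made for illustration; they are not the project's actual buffer API.

```typescript
// Illustrative assumptions: 0.1 s frames and an energy threshold of 0.3.
const FRAME_SEC = 0.1;
const ENERGY_THRESHOLD = 0.3;

// Small helper to build a run of identical frame energies.
function repeatValue(value: number, count: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < count; i++) out.push(value);
  return out;
}

// 1 s of already-transcribed speech followed by 2 s of silence.
const energies: number[] = repeatValue(0.8, 10).concat(repeatValue(0.05, 20));

// Whole-range check: true if ANY frame in the buffer reaches the threshold.
function hasSpeechInRange(frames: number[], threshold: number): boolean {
  return frames.some((e) => e >= threshold);
}

// Tail check: count consecutive quiet frames, walking back from the end.
function silenceTailSec(frames: number[], threshold: number, frameSec: number): number {
  let quiet = 0;
  for (let i = frames.length - 1; i >= 0; i--) {
    const e = frames[i];
    if (e === undefined || e >= threshold) break;
    quiet++;
  }
  return quiet * frameSec;
}

// The historical speech keeps the whole-range check true...
const speechSeen = hasSpeechInRange(energies, ENERGY_THRESHOLD);
// ...while the tail check still reports roughly 2 s of trailing silence.
const tailSec = silenceTailSec(energies, ENERGY_THRESHOLD, FRAME_SEC);
```

In this sketch a flush nested under `!speechSeen` would never run, even though the tail has been silent for about two seconds.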

Solution

Refactored the logic within the v4Tick function in App.tsx to prioritize silence detection:

  1. Prioritize Silence Detection: Decoupled the silence duration check from the hasSpeech state. The system now monitors silence duration from the trailing buffer independently.
  2. Conditional Buffer Flush: If the duration meets the v4SilenceFlushSec threshold and pending text exists, the buffer is flushed to finalize the transcription.
  3. State Validation: Post-flush, the system evaluates hasSpeech to determine if the transcription process should remain active.
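
The three steps above can be sketched roughly as follows. This is a simplified, synchronous stand-in and not the code in src/App.tsx: `TickDeps`, `tickSketch`, and every field name are hypothetical, chosen only to mirror the identifiers mentioned in this description (`v4SilenceFlushSec`, `hasSpeech`).

```typescript
// Hypothetical dependencies; none of these names come from src/App.tsx.
interface TickDeps {
  silenceTailSec: () => number;  // trailing silence in seconds
  silenceFlushSec: number;       // flush threshold (cf. v4SilenceFlushSec)
  hasPendingText: () => boolean; // is there text waiting to be finalized?
  hasSpeech: () => boolean;      // speech anywhere in the buffered range
  finalizeTimeout: () => void;   // flush/finalize pending text
  transcribe: () => void;        // regular transcription step
}

interface TickOutcome {
  flushed: boolean;
  transcribed: boolean;
}

function tickSketch(deps: TickDeps): TickOutcome {
  let flushed = false;

  // Steps 1 and 2: the silence check runs first, independent of hasSpeech,
  // and flushes only when the threshold is met and text is pending.
  if (deps.silenceTailSec() >= deps.silenceFlushSec && deps.hasPendingText()) {
    deps.finalizeTimeout();
    flushed = true;
  }

  // Step 3: only after the flush attempt does hasSpeech gate transcription.
  if (!deps.hasSpeech()) {
    return { flushed, transcribed: false };
  }

  deps.transcribe();
  return { flushed, transcribed: true };
}

// Example: the buffer still "has speech" (stale history), yet 2 s of silence
// with pending text now triggers the flush.
let flushCalls = 0;
let transcribeCalls = 0;
const staleSpeechTick = tickSketch({
  silenceTailSec: () => 2.0,
  silenceFlushSec: 1.0,
  hasPendingText: () => true,
  hasSpeech: () => true,
  finalizeTimeout: () => { flushCalls++; },
  transcribe: () => { transcribeCalls++; },
});
```

The point of the ordering is that the flush no longer depends on `hasSpeech` being false; the early return only decides whether the regular transcription step runs afterwards.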

Impact:

  • Ensures pending segments are finalized immediately when the user stops speaking (End-of-Utterance detection).
  • Prevents redundant transcription processing during idle periods.
  • Maintains compatibility with existing Voice Activity Detection (VAD) gating.

Related Issues

Summary by CodeRabbit

  • Bug Fixes
      • Optimized transcription finalization by refactoring the detection logic. The system now prioritizes trailing silence thresholds over historical buffer state, ensuring pending text is reliably committed. This improvement enhances End-of-Utterance reliability and prevents unnecessary processing during speech gaps.

coderabbitai bot commented Feb 14, 2026

Walkthrough

The silence-based flush logic in App.tsx has been restructured to execute unconditionally before the no-speech guard. It now always queries silence duration and finalizes pending transcriptions via v4FinalizeTimeout when silence meets the threshold, followed by an early-return check if no speech is detected.

Changes

Cohort / File(s) | Summary
Silence Flush Logic Reorganization (src/App.tsx) | Restructured flush logic to run unconditionally before speech detection check. Now always queries silence tail duration and finalizes pending transcriptions when threshold is met. Logs flush activity and introduces early-return path when no speech is detected after flush attempt.

Sequence Diagram

sequenceDiagram
    participant App as App Component
    participant BC as BufferClient
    participant WC as TranscriptionWorkerClient
    participant W as Transcription Worker

    App->>BC: getSilenceTailDuration('energyVad', 0.3)
    BC-->>App: silence duration
    
    alt Silence ≥ threshold AND hasPendingText
        App->>WC: v4FinalizeTimeout()
        WC->>W: V4_FINALIZE_TIMEOUT
        W->>W: Promote pending text to mature
        W-->>WC: Updated transcript
        WC-->>App: Finalization complete
        App->>App: Update cursor, transcript state
    end
    
    alt hasSpeech == false
        App->>App: Return early<br/>(skip further transcription)
    else hasSpeech == true
        App->>App: Continue transcription flow
    end
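
The "Promote pending text to mature" step in the diagram could look roughly like the sketch below. `MergerSketch` and its methods are hypothetical stand-ins, since the worker-side class is not shown on this page; returning `null` when nothing is pending is likewise an assumption about how repeated finalize calls are kept harmless.

```typescript
// Hypothetical stand-in for the worker-side merger; not the project's class.
class MergerSketch {
  private pendingSentence: string | null = null;
  private matureText = "";

  // Record text that has been recognized but not yet finalized.
  setPending(text: string): void {
    this.pendingSentence = text;
  }

  // Promote pending text to mature text.
  // Returns the promoted text, or null when there was nothing to promote.
  finalizeByTimeout(): string | null {
    if (this.pendingSentence === null || this.pendingSentence.length === 0) {
      return null;
    }
    const promoted = this.pendingSentence;
    this.matureText = this.matureText.length > 0
      ? this.matureText + " " + promoted
      : promoted;
    this.pendingSentence = null; // cleared, so a second call is a no-op
    return promoted;
  }

  getMatureText(): string {
    return this.matureText;
  }
}

// Usage: the first call promotes the text, the second finds nothing pending.
const mergerSketch = new MergerSketch();
mergerSketch.setPending("hello world");
const firstFinalize = mergerSketch.finalizeByTimeout();
const secondFinalize = mergerSketch.finalizeByTimeout();
```

Under these assumptions the finalize call is idempotent per pending sentence, so a tick loop that calls it on every silent tick would not duplicate text.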

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 When silence falls with gentle hush,
My whisker-wiggles start to rush!
No more pending words delayed,
Flush logic comes to save the day—
This rabbit hops with joy always! ✨

🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title accurately reflects the main change: enabling silence flush to work regardless of hasSpeech state, fixing the specific issue #99.
Linked Issues check | ✅ Passed | The PR addresses the core coding objective of issue #99 by moving silence-flush logic before hasSpeech checks and enabling unconditional flush when silence threshold is met with pending text.
Out of Scope Changes check | ✅ Passed | All changes are focused on reordering and fixing the silence-flush logic in App.tsx v4Tick as required by issue #99, with no unrelated modifications detected.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Merge Conflict Detection | ✅ Passed | No merge conflicts detected when merging into master.
Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.

kiloconnect bot commented Feb 14, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (1 file)
  • src/App.tsx - Logic refactoring to fix silence flush mechanism

Review Notes

The changes in this PR correctly implement the fix described in the PR description:

  1. Silence flush now runs unconditionally - The getSilenceTailDuration() call is no longer gated by !hasSpeech, which was the root cause of issue #99 (feat: Enable 'Silence Flush' to finalize transcriptions after timeout)

  2. Added hasPendingText check - Prevents unnecessary flush attempts when there is no pending text to promote to mature

  3. Proper flow maintained - The flush attempt happens first, then the early return for no speech detection follows

The code is clean, well-commented, and the debug logging is appropriate for troubleshooting. No issues found that would block merging.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/App.tsx`:
- Around line 356-358: The current hasPendingText uses getMatureCursorFrame() >
0 which checks cursor advancement rather than actual unflushed text; change the
logic to test the real pending sentence instead (e.g., set hasPendingText =
Boolean(windowBuilder.pendingSentence) or add/use a
windowBuilder.hasPendingText() accessor) and use that in the silenceDuration
check so v4FinalizeTimeout() is only triggered when there is actual pending text
to flush.
📜 Review details

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c98b2ad and ed2728b.

📒 Files selected for processing (1)
  • src/App.tsx
🔇 Additional comments (1)
src/App.tsx (1)

353-385: Core reordering logic correctly fixes the silence-flush gating bug.

Moving the silence check before the !hasSpeech guard, and then using !hasSpeech as an early return after the flush attempt, correctly addresses the reported issue where hasSpeech was preventing silence flush from ever triggering. The structure is clean: flush first, then gate transcription.

Comment on lines +356 to +358
const hasPendingText = windowBuilder.getMatureCursorFrame() > 0;

if (silenceDuration >= appStore.v4SilenceFlushSec() && hasPendingText) {

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find getMatureCursorFrame implementation and usage
echo "=== Searching for getMatureCursorFrame ==="
rg -n 'getMatureCursorFrame' --type=ts -B2 -A2

echo ""
echo "=== Searching for v4FinalizeTimeout ==="
rg -n 'v4FinalizeTimeout' --type=ts -B2 -A2

echo ""
echo "=== Searching for immatureText in appStore ==="
rg -n 'immatureText' --type=ts -B2 -A2

Repository: ysdede/keet

Length of output: 9609


🏁 Script executed:

#!/bin/bash
# Look at the V4_FINALIZE_TIMEOUT handler in the worker
echo "=== Searching for V4_FINALIZE_TIMEOUT handler in worker ==="
rg -n 'V4_FINALIZE_TIMEOUT' --type=ts -B3 -A15

echo ""
echo "=== Check the utteranceMerger.timeoutFlush() method ==="
rg -n 'timeoutFlush' --type=ts -B2 -A10

echo ""
echo "=== Check how immatureText is cleared ==="
rg -n 'setImmatureText|immatureText = ' --type=ts -C1

Repository: ysdede/keet

Length of output: 4674


🏁 Script executed:

#!/bin/bash
# Find the finalizePendingSentenceByTimeout method
echo "=== UtteranceBasedMerger.finalizePendingSentenceByTimeout ==="
rg -n 'finalizePendingSentenceByTimeout' --type=ts -B2 -A20

Repository: ysdede/keet

Length of output: 6691


🏁 Script executed:

#!/bin/bash
# Check what happens to pendingSentence after finalization
echo "=== finalizePendingSentenceByTimeout full implementation ==="
rg -n 'finalizePendingSentenceByTimeout' --type=ts -A40

echo ""
echo "=== Check if pendingSentence is cleared ==="
rg -n 'this.pendingSentence = ' --type=ts -C2

Repository: ysdede/keet

Length of output: 12160


hasPendingText uses incorrect semantics — consider checking actual pending text for clarity.

getMatureCursorFrame() > 0 checks whether the cursor has ever advanced, not whether pending text currently exists. While repeated calls to v4FinalizeTimeout() are guarded by the worker returning null after the first successful finalization (which clears this.pendingSentence), the condition itself is semantically misleading.

For clarity and maintainability, check actual pending text:

-    const hasPendingText = windowBuilder.getMatureCursorFrame() > 0;
+    const hasPendingText = (appStore.immatureText() || '').length > 0;

This makes the intent explicit: only attempt flush if there's actually unflushed text waiting to be finalized.
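
A small sketch of how the two conditions can disagree. The `FlushState` shape and both helper functions are hypothetical, and the sketch assumes the mature cursor starts at 0 and only advances when text is finalized; the real accessors may behave differently.

```typescript
// Hypothetical snapshot of the state the two checks look at.
interface FlushState {
  matureCursorFrame: number; // assumed to start at 0 and advance on finalization
  immatureText: string;      // text still waiting to be finalized
}

// Cursor-based check, as in the current diff.
const pendingByCursor = (s: FlushState): boolean => s.matureCursorFrame > 0;

// Text-based check, as in the suggested change.
const pendingByText = (s: FlushState): boolean => s.immatureText.length > 0;

// Right after a flush: the cursor has advanced, nothing is left pending.
const afterFlush: FlushState = { matureCursorFrame: 16000, immatureText: "" };

// First utterance of a session: nothing finalized yet, but text is pending.
const firstUtterance: FlushState = { matureCursorFrame: 0, immatureText: "hello wor" };

const cursorAfterFlush = pendingByCursor(afterFlush);   // true: would attempt a flush with nothing pending
const textAfterFlush = pendingByText(afterFlush);       // false
const cursorFirst = pendingByCursor(firstUtterance);    // false: would skip a flush that is needed
const textFirst = pendingByText(firstUtterance);        // true
```

If the cursor assumption holds, the cursor-based check both over-triggers after a flush and under-triggers before the first finalization, whereas the text-based check matches the stated intent in both cases.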

Development

Successfully merging this pull request may close these issues.

feat: Enable 'Silence Flush' to finalize transcriptions after timeout
