refactor(audio): harden speech stats cache safety by ysdede · Pull Request #173 · ysdede/keet

ysdede · 2026-02-21T00:15:18Z

Summary

centralize speechStats mutation through recordSpeechStat() so cache invalidation is guaranteed
use explicit null check and copy-on-read for cachedSpeechSummary
replace map-based average computation with a single-pass summary helper to avoid temporary arrays
add regression test for cache isolation across update cycles

Context

This revisits the optimization path discussed in #164 and applies a safer variant on top of current master.

Validation

npm run test -- src/lib/audio/AudioSegmentProcessor.test.ts

Summary by Sourcery

Refine audio segment statistics handling to make speech summary caching safer and more robust while simplifying statistics aggregation.

Bug Fixes:

Ensure cached speech statistics are invalidated whenever new speech stats are recorded and are protected from external mutation across update cycles.

Enhancements:

Introduce a dedicated helper to record speech stats with bounded history and automatic cache invalidation.
Replace map/reduce-based average calculations with a single-pass summarization helper for segment statistics to reduce allocations and centralize logic.
Return defensive copies of cached speech summaries when computing current stats to avoid shared mutable state.

Tests:

Add a regression test verifying that cached speech summaries remain isolated from caller mutations across subsequent updates.

Summary by CodeRabbit

Bug Fixes
- Enhanced audio segment processing to protect cached statistics from external mutation, ensuring more reliable data handling.
Tests
- Added tests verifying cached speech summaries remain protected from caller mutations across updates.

Centralize speechStats writes to guarantee cache invalidation, use explicit null checks for cache reads, and avoid temporary map-array allocations in summary calculation. Also adds a regression test to ensure caller mutation of returned stats cannot corrupt cached speech summary across update cycles.

sourcery-ai · 2026-02-21T00:15:28Z

Reviewer's Guide

Refactors AudioSegmentProcessor to centralize speech stats mutation through a helper, makes the speech stats cache safer by using explicit null checks and defensive copying, introduces a shared summarization helper for segment statistics, and adds a regression test to ensure cached speech summaries are not mutated by callers across updates.

Sequence diagram for speech stats cache invalidation and safe access

sequenceDiagram
    actor Client
    participant Processor as AudioSegmentProcessor
    participant State

    Client->>Processor: processAudioData(speechChunk, 0.0, speechEnergy)
    Processor->>State: update speechEnergies, speechStartTime

    Client->>Processor: processAudioData(silenceChunk, chunkDuration, silenceEnergy)
    Processor->>State: detect speech segment end
    Processor->>Processor: recordSpeechStat(stat)
    Processor->>State: push stat into speechStats
    Processor->>State: trim speechStats to maxHistoryLength
    Processor->>State: set cachedSpeechSummary = null

    Client->>Processor: getStats()
    Processor->>State: read cachedSpeechSummary (null)
    alt cachedSpeechSummary is null and speechStats not empty
        Processor->>Processor: summarizeSegmentStats(speechStats)
        Processor->>State: cachedSpeechSummary = speechSummary
        Processor-->>Client: stats with copy of speechSummary
    end

    Client->>Client: mutate returned stats.speech

    Client->>Processor: processAudioData(silenceChunk, nextTime, silenceEnergy)
    Processor->>State: update silenceStats
    Processor->>State: cachedSpeechSummary remains unchanged

    Client->>Processor: getStats()
    Processor->>State: read cachedSpeechSummary (not null)
    Processor-->>Client: stats with copy of cachedSpeechSummary
    Client->>Client: verify stats.speech equals baseline, unaffected by mutation

Class diagram for refactored AudioSegmentProcessor speech stats handling

classDiagram
    class AudioSegmentProcessor {
        - state : ProcessorState
        - options : ProcessorOptions
        + processAudioData(audioData : Float32Array, startTime : number, energy : number) void
        + getStats() ProcessorStats
        - updateStats(currentTime : number) void
        - recordSpeechStat(stat : SegmentStats) void
        - summarizeSegmentStats(segments : SegmentStats[]) StatsSummary
    }

    class ProcessorState {
        + speechStats : SegmentStats[]
        + silenceStats : SegmentStats[]
        + cachedSpeechSummary : StatsSummary
        + speechStartTime : number
        + currentStats : ProcessorStats
    }

    class ProcessorOptions {
        + maxHistoryLength : number
        + energyThreshold : number
        + silenceThreshold : number
        + maxSilenceWithinSpeech : number
    }

    class SegmentStats {
        + startTime : number
        + endTime : number
        + duration : number
        + avgEnergy : number
        + energyIntegral : number
    }

    class StatsSummary {
        + avgDuration : number
        + avgEnergy : number
        + avgEnergyIntegral : number
    }

    class ProcessorStats {
        + speech : StatsSummary
        + silence : StatsSummary
    }

    AudioSegmentProcessor --> ProcessorState : uses
    AudioSegmentProcessor --> ProcessorOptions : uses
    ProcessorState --> SegmentStats : has
    ProcessorState --> StatsSummary : caches
    ProcessorStats --> StatsSummary : contains
    AudioSegmentProcessor ..> SegmentStats : records
    AudioSegmentProcessor ..> StatsSummary : summarizes

File-Level Changes

Change	Details	Files
Centralize speech stats updates through a dedicated helper that also enforces history trimming and cache invalidation.	Replace direct pushes to speechStats with a call to recordSpeechStat when closing a speech segment. Move maxHistoryLength trimming logic for speechStats into recordSpeechStat. Ensure cachedSpeechSummary is always invalidated inside recordSpeechStat on any speechStats update.	`src/lib/audio/AudioSegmentProcessor.ts`
Refactor stats aggregation to use a single-pass summarization helper and harden the speech summary cache semantics.	Introduce summarizeSegmentStats helper that computes average duration, energy, and energyIntegral in a single pass over SegmentStats. Use summarizeSegmentStats for both silence and speech summaries instead of mapping to temporary arrays and using average(). Replace truthy check on cachedSpeechSummary with an explicit null check and return a shallow copy when serving cached summaries to avoid external mutation. Remove the generic average helper that operated on number arrays.	`src/lib/audio/AudioSegmentProcessor.ts`
Add regression test to verify cached speech summary is isolated from caller mutations across subsequent updates.	Create a scenario with one speech chunk followed by silence to populate speech stats and cache. Mutate the returned speech stats object from getStats() to simulate caller-side mutation. Process additional audio and assert that internally computed speech stats remain equal to the original baseline values.	`src/lib/audio/AudioSegmentProcessor.test.ts`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

coderabbitai · 2026-02-21T00:15:41Z

📝 Walkthrough

Walkthrough

The changes refactor AudioSegmentProcessor to improve speech stats handling by introducing a helper method for recording stats with cache invalidation, consolidating averaging logic into a reusable summarizeSegmentStats method, and adding tests to verify that cached speech summaries remain isolated from caller mutations.

Changes

Cohort / File(s)	Summary
Speech Stats Processing Refactor `src/lib/audio/AudioSegmentProcessor.ts`	Introduced `recordSpeechStat` helper to manage stats recording with cache invalidation; consolidated averaging logic via `summarizeSegmentStats` method; removed deprecated `average` method; adjusted cache handling to explicitly clone values and reset on new recordings.
Cache Isolation Tests `src/lib/audio/AudioSegmentProcessor.test.ts`	Added two tests verifying that cached speech summary values remain isolated from caller mutations after object modification and subsequent processor updates.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

chore: port additional safe stale-PR updates #171: Directly related modification of AudioSegmentProcessor's speech-stats caching mechanism, cache field invalidation, and summary computation logic.

Poem

🐰 Speech stats bundled tight with care,
No mutations spoil what's cached there,
A helper clones to keep them pure,
And summarizes long and sure! 🎙️

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title 'refactor(audio): harden speech stats cache safety' accurately reflects the main change: improving cache safety for speech statistics in the AudioSegmentProcessor by centralizing mutations, adding explicit null checks, and cloning cached values.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/pr-164-review-rework

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey - I've left some high level feedback:

You’ve made speech summaries copy-on-read and cache-safe, but silence stats are still assigned directly to this.state.currentStats; consider returning a defensive copy for silence (and possibly the top-level stats object) as well to avoid external mutation leaking back into internal state.
Now that summarizeSegmentStats exists, you might want a corresponding recordSilenceStat helper for symmetry with recordSpeechStat and to keep any future cache invalidation or history-length logic for silence centralized.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- You’ve made speech summaries copy-on-read and cache-safe, but `silence` stats are still assigned directly to `this.state.currentStats`; consider returning a defensive copy for `silence` (and possibly the top-level stats object) as well to avoid external mutation leaking back into internal state.
- Now that `summarizeSegmentStats` exists, you might want a corresponding `recordSilenceStat` helper for symmetry with `recordSpeechStat` and to keep any future cache invalidation or history-length logic for silence centralized.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

kiloconnect · 2026-02-21T00:16:43Z

src/lib/audio/AudioSegmentProcessor.ts

-                avgEnergy: this.average(this.state.silenceStats.map(s => s.avgEnergy)),
-                avgEnergyIntegral: this.average(this.state.silenceStats.map(s => s.energyIntegral))
-            };
+            stats.silence = this.summarizeSegmentStats(this.state.silenceStats);


SUGGESTION: Silence stats are recomputed on every updateStats() call via summarizeSegmentStats, while speech stats are cached in cachedSpeechSummary. This is a pre-existing asymmetry (not a regression introduced here), but since updateStats() is called once per processAudioData() invocation and the result is stored in currentStats, the practical impact is low. Consider adding a cachedSilenceSummary field in a follow-up if silence stats computation becomes a hot path concern.

Deferred to follow-up. This asymmetry is pre-existing and currently low-impact; we can add cachedSilenceSummary only if profiling shows silence summary computation as a measurable hot path.

kiloconnect · 2026-02-21T00:16:43Z

src/lib/audio/AudioSegmentProcessor.test.ts

        expect(current.silence.avgEnergy).toBe(0);
    });
+
+    it('should keep cached speech summary isolated from caller mutation across updates', () => {


SUGGESTION: The new test covers the primary mutation-isolation scenario well. Consider adding a complementary case where cachedSpeechSummary holds an all-zero summary (i.e., a speech segment with zero energy and zero duration) to verify the !== null fix over the old truthy check. The old if (this.state.cachedSpeechSummary) would have treated { avgDuration: 0, avgEnergy: 0, avgEnergyIntegral: 0 } as falsy and recomputed unnecessarily; the new explicit null check is correct, but a test for that edge case would lock in the fix.

By design: an object like {avgDuration:0, avgEnergy:0, avgEnergyIntegral:0} is truthy in JavaScript, so the previous truthy check and the explicit null check behave the same for that case. Keeping current tests as-is.

kiloconnect · 2026-02-21T00:17:02Z

Code Review Summary

Status: 2 Suggestions Found | Recommendation: Merge (no blocking issues)

Overview

Severity	Count
CRITICAL	0
WARNING	0
SUGGESTION	2

What Changed

This PR refactors AudioSegmentProcessor to:

Centralize speech stat mutation through a new recordSpeechStat() helper (push + trim + cache invalidation in one place).
Fix a correctness bug: the old if (this.state.cachedSpeechSummary) truthy check would skip the cache for an all-zero summary; replaced with an explicit !== null check.
Return defensive copies of the cached speech summary via { ...this.state.cachedSpeechSummary } to prevent external mutation from corrupting the cache.
Replace three separate .map() + average() calls with a single-pass summarizeSegmentStats() helper.
Add a regression test verifying cache isolation from caller mutation.

Risk assessment: Low. The changes are a clean refactor of stats bookkeeping only. No streaming window logic, cursor logic, or transcription merge logic is touched. The correctness fix (truthy → null check) is safe and the new test validates the primary scenario.

Issue Details (click to expand)

SUGGESTION

File	Line	Issue
`src/lib/audio/AudioSegmentProcessor.ts`	460	Silence stats are recomputed on every `updateStats()` call (no cache), while speech stats are cached. Pre-existing asymmetry; low impact since recomputation happens once per audio chunk, not per `getStats()` call. Consider a `cachedSilenceSummary` in a follow-up if it becomes a hot-path concern.
`src/lib/audio/AudioSegmentProcessor.test.ts`	213	New test covers mutation isolation well. Consider adding a case where the cached summary is all-zeros to lock in the `!== null` fix over the old truthy check.

Correctness notes

recordSpeechStat() ordering is correct: push → trim → invalidate cache. Cache is always cleared after any mutation to speechStats, so stale summaries cannot be served.
getStats() double-copy is safe: updateStats() stores { ...speechSummary } into currentStats.speech; getStats() spreads it again. External mutation of the returned object cannot reach cachedSpeechSummary.
summarizeSegmentStats for silence: called with silenceStats directly; result is stored in currentStats.silence each call. No regression vs. prior behavior.
No fragile zones touched: transcription window calculations, cursor logic, sliding-window merge/finalization logic are all untouched.

Files Reviewed (2 files)

src/lib/audio/AudioSegmentProcessor.ts — 0 blocking issues, 1 suggestion
src/lib/audio/AudioSegmentProcessor.test.ts — 0 blocking issues, 1 suggestion

Fix these issues in Kilo Cloud

coderabbitai

🧹 Nitpick comments (1)

src/lib/audio/AudioSegmentProcessor.ts (1)
459-469: Cache-on-compute + copy-on-read for speech summary is well structured.

The explicit !== null check, clone via spread on Line 464, and fresh compute + cache on Lines 466-468 together ensure callers never hold a reference to the internal cache. Good defensive pattern.

One subtle note: silenceStats is always recomputed (Line 460) and never cached, while speechStats gets the caching treatment. This asymmetry is fine if silence summaries are expected to change frequently (silence stats are pushed on every non-speech chunk). However, silenceStats also lacks the maxHistoryLength cap that recordSpeechStat enforces for speechStats — it grows unboundedly via direct .push() calls at Lines 278 and 285. Consider whether silenceStats should be similarly bounded.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 459 - 469, silenceStats
is currently appended without bounding while speechStats is capped via
recordSpeechStat using maxHistoryLength; update the places that push into
silenceStats (the methods that currently call this.state.silenceStats.push(...))
to enforce the same maxHistoryLength trimming logic used by recordSpeechStat so
silence history cannot grow unbounded, and optionally mirror the speech-summary
cache pattern (cachedSpeechSummary / summarizeSegmentStats) if you want a
copy-on-read for silence summaries as well; reference the symbols silenceStats,
speechStats, recordSpeechStat, maxHistoryLength, summarizeSegmentStats and
cachedSpeechSummary when making the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/lib/audio/AudioSegmentProcessor.ts`:
- Around line 459-469: silenceStats is currently appended without bounding while
speechStats is capped via recordSpeechStat using maxHistoryLength; update the
places that push into silenceStats (the methods that currently call
this.state.silenceStats.push(...)) to enforce the same maxHistoryLength trimming
logic used by recordSpeechStat so silence history cannot grow unbounded, and
optionally mirror the speech-summary cache pattern (cachedSpeechSummary /
summarizeSegmentStats) if you want a copy-on-read for silence summaries as well;
reference the symbols silenceStats, speechStats, recordSpeechStat,
maxHistoryLength, summarizeSegmentStats and cachedSpeechSummary when making the
change.

ysdede · 2026-02-21T00:21:20Z

Reviewed the AI comments. No code changes needed for this round:

getStats() already returns defensive copies for both silence and speech plus top-level fields.
The suggested all-zero-summary edge case does not change behavior between truthy and !== null checks in JS because objects are truthy.
silenceStats asymmetry is acknowledged and deferred; we can optimize/cache it in a follow-up only if profiling shows it on a real hot path.

ysdede · 2026-02-21T00:22:53Z

AI reviewer scorecard (factual, based on current PR #173 diff)

Rubric: correctness, signal-to-noise, actionability.

Sourcery: 6/10
- Wrong/weak point: implied risk that silence could leak mutable internal state. In current code, getStats() already returns defensive copies for top-level + silence + speech.
- Valid but optional point: suggested recordSilenceStat for symmetry; this is style/maintainability, not a correctness issue.
CodeRabbit: 4/10
- Incorrect claim: said silenceStats is growing unbounded via direct .push() at referenced lines. Those push sites are not present in the current file revision on this PR.
- Net: generated a non-actionable nit based on stale/incorrect line assumptions.
KiloConnect: 8/10
- Mostly accurate summary of the actual changes and risk level.
- Weak point: suggested an "all-zero summary" case for the old truthy check; object summaries are truthy in JS, so that specific concern is not a real behavior difference.

Conclusion: no blocking issues from AI reviews on #173; merge readiness should be decided by human review + green CI.

refactor(audio): harden speech stats cache safety b2d1c1c

sourcery-ai bot reviewed Feb 21, 2026

View reviewed changes

kiloconnect bot reviewed Feb 21, 2026

View reviewed changes

coderabbitai bot reviewed Feb 21, 2026

View reviewed changes

AAlp22 merged commit b2d1c1c into master Feb 21, 2026
3 checks passed

github-actions bot pushed a commit that referenced this pull request Feb 21, 2026

Deploy: Merge pull request #173 from ysdede/fix/pr-164-review-rework

011899b

refactor(audio): harden speech stats cache safety b2d1c1c

Comments

Conversation

ysdede commented Feb 21, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Context

Validation

Summary by Sourcery

Summary by CodeRabbit

Uh oh!

sourcery-ai bot commented Feb 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for speech stats cache invalidation and safe access

Class diagram for refactored AudioSegmentProcessor speech stats handling

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

coderabbitai bot commented Feb 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

kiloconnect bot Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

ysdede Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

kiloconnect bot Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

ysdede Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

kiloconnect bot commented Feb 21, 2026

Code Review Summary

Overview

What Changed

SUGGESTION

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

ysdede commented Feb 21, 2026

Uh oh!

ysdede commented Feb 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ysdede commented Feb 21, 2026 •

edited by coderabbitai bot

Loading

sourcery-ai bot commented Feb 21, 2026 •

edited

Loading

coderabbitai bot commented Feb 21, 2026 •

edited

Loading