Skip to content

Comments

refactor(audio): harden speech stats cache safety#173

Merged
AAlp22 merged 1 commit intomasterfrom
fix/pr-164-review-rework
Feb 21, 2026
Merged

refactor(audio): harden speech stats cache safety#173
AAlp22 merged 1 commit intomasterfrom
fix/pr-164-review-rework

Conversation

@ysdede
Copy link
Owner

@ysdede ysdede commented Feb 21, 2026

Summary

  • centralize speechStats mutation through recordSpeechStat() so cache invalidation is guaranteed
  • use explicit null check and copy-on-read for cachedSpeechSummary
  • replace map-based average computation with a single-pass summary helper to avoid temporary arrays
  • add regression test for cache isolation across update cycles

Context

This revisits the optimization path discussed in #164 and applies a safer variant on top of current master.

Validation

  • npm run test -- src/lib/audio/AudioSegmentProcessor.test.ts

Summary by Sourcery

Refine audio segment statistics handling to make speech summary caching safer and more robust while simplifying statistics aggregation.

Bug Fixes:

  • Ensure cached speech statistics are invalidated whenever new speech stats are recorded and are protected from external mutation across update cycles.

Enhancements:

  • Introduce a dedicated helper to record speech stats with bounded history and automatic cache invalidation.
  • Replace map/reduce-based average calculations with a single-pass summarization helper for segment statistics to reduce allocations and centralize logic.
  • Return defensive copies of cached speech summaries when computing current stats to avoid shared mutable state.

Tests:

  • Add a regression test verifying that cached speech summaries remain isolated from caller mutations across subsequent updates.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced audio segment processing to protect cached statistics from external mutation, ensuring more reliable data handling.
  • Tests

    • Added tests verifying cached speech summaries remain protected from caller mutations across updates.

Centralize speechStats writes to guarantee cache invalidation,
use explicit null checks for cache reads, and avoid temporary
map-array allocations in summary calculation.

Also adds a regression test to ensure caller mutation of returned
stats cannot corrupt cached speech summary across update cycles.
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Feb 21, 2026

Reviewer's Guide

Refactors AudioSegmentProcessor to centralize speech stats mutation through a helper, makes the speech stats cache safer by using explicit null checks and defensive copying, introduces a shared summarization helper for segment statistics, and adds a regression test to ensure cached speech summaries are not mutated by callers across updates.

Sequence diagram for speech stats cache invalidation and safe access

sequenceDiagram
    actor Client
    participant Processor as AudioSegmentProcessor
    participant State

    Client->>Processor: processAudioData(speechChunk, 0.0, speechEnergy)
    Processor->>State: update speechEnergies, speechStartTime

    Client->>Processor: processAudioData(silenceChunk, chunkDuration, silenceEnergy)
    Processor->>State: detect speech segment end
    Processor->>Processor: recordSpeechStat(stat)
    Processor->>State: push stat into speechStats
    Processor->>State: trim speechStats to maxHistoryLength
    Processor->>State: set cachedSpeechSummary = null

    Client->>Processor: getStats()
    Processor->>State: read cachedSpeechSummary (null)
    alt cachedSpeechSummary is null and speechStats not empty
        Processor->>Processor: summarizeSegmentStats(speechStats)
        Processor->>State: cachedSpeechSummary = speechSummary
        Processor-->>Client: stats with copy of speechSummary
    end

    Client->>Client: mutate returned stats.speech

    Client->>Processor: processAudioData(silenceChunk, nextTime, silenceEnergy)
    Processor->>State: update silenceStats
    Processor->>State: cachedSpeechSummary remains unchanged

    Client->>Processor: getStats()
    Processor->>State: read cachedSpeechSummary (not null)
    Processor-->>Client: stats with copy of cachedSpeechSummary
    Client->>Client: verify stats.speech equals baseline, unaffected by mutation
Loading

Class diagram for refactored AudioSegmentProcessor speech stats handling

classDiagram
    class AudioSegmentProcessor {
        - state : ProcessorState
        - options : ProcessorOptions
        + processAudioData(audioData : Float32Array, startTime : number, energy : number) void
        + getStats() ProcessorStats
        - updateStats(currentTime : number) void
        - recordSpeechStat(stat : SegmentStats) void
        - summarizeSegmentStats(segments : SegmentStats[]) StatsSummary
    }

    class ProcessorState {
        + speechStats : SegmentStats[]
        + silenceStats : SegmentStats[]
        + cachedSpeechSummary : StatsSummary
        + speechStartTime : number
        + currentStats : ProcessorStats
    }

    class ProcessorOptions {
        + maxHistoryLength : number
        + energyThreshold : number
        + silenceThreshold : number
        + maxSilenceWithinSpeech : number
    }

    class SegmentStats {
        + startTime : number
        + endTime : number
        + duration : number
        + avgEnergy : number
        + energyIntegral : number
    }

    class StatsSummary {
        + avgDuration : number
        + avgEnergy : number
        + avgEnergyIntegral : number
    }

    class ProcessorStats {
        + speech : StatsSummary
        + silence : StatsSummary
    }

    AudioSegmentProcessor --> ProcessorState : uses
    AudioSegmentProcessor --> ProcessorOptions : uses
    ProcessorState --> SegmentStats : has
    ProcessorState --> StatsSummary : caches
    ProcessorStats --> StatsSummary : contains
    AudioSegmentProcessor ..> SegmentStats : records
    AudioSegmentProcessor ..> StatsSummary : summarizes
Loading

File-Level Changes

Change Details Files
Centralize speech stats updates through a dedicated helper that also enforces history trimming and cache invalidation.
  • Replace direct pushes to speechStats with a call to recordSpeechStat when closing a speech segment.
  • Move maxHistoryLength trimming logic for speechStats into recordSpeechStat.
  • Ensure cachedSpeechSummary is always invalidated inside recordSpeechStat on any speechStats update.
src/lib/audio/AudioSegmentProcessor.ts
Refactor stats aggregation to use a single-pass summarization helper and harden the speech summary cache semantics.
  • Introduce summarizeSegmentStats helper that computes average duration, energy, and energyIntegral in a single pass over SegmentStats.
  • Use summarizeSegmentStats for both silence and speech summaries instead of mapping to temporary arrays and using average().
  • Replace truthy check on cachedSpeechSummary with an explicit null check and return a shallow copy when serving cached summaries to avoid external mutation.
  • Remove the generic average helper that operated on number arrays.
src/lib/audio/AudioSegmentProcessor.ts
Add regression test to verify cached speech summary is isolated from caller mutations across subsequent updates.
  • Create a scenario with one speech chunk followed by silence to populate speech stats and cache.
  • Mutate the returned speech stats object from getStats() to simulate caller-side mutation.
  • Process additional audio and assert that internally computed speech stats remain equal to the original baseline values.
src/lib/audio/AudioSegmentProcessor.test.ts

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link

coderabbitai bot commented Feb 21, 2026

📝 Walkthrough

Walkthrough

The changes refactor AudioSegmentProcessor to improve speech stats handling by introducing a helper method for recording stats with cache invalidation, consolidating averaging logic into a reusable summarizeSegmentStats method, and adding tests to verify that cached speech summaries remain isolated from caller mutations.

Changes

Cohort / File(s) Summary
Speech Stats Processing Refactor
src/lib/audio/AudioSegmentProcessor.ts
Introduced recordSpeechStat helper to manage stats recording with cache invalidation; consolidated averaging logic via summarizeSegmentStats method; removed deprecated average method; adjusted cache handling to explicitly clone values and reset on new recordings.
Cache Isolation Tests
src/lib/audio/AudioSegmentProcessor.test.ts
Added two tests verifying that cached speech summary values remain isolated from caller mutations after object modification and subsequent processor updates.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 Speech stats bundled tight with care,
No mutations spoil what's cached there,
A helper clones to keep them pure,
And summarizes long and sure! 🎙️

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'refactor(audio): harden speech stats cache safety' accurately reflects the main change: improving cache safety for speech statistics in the AudioSegmentProcessor by centralizing mutations, adding explicit null checks, and cloning cached values.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/pr-164-review-rework

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • You’ve made speech summaries copy-on-read and cache-safe, but silence stats are still assigned directly to this.state.currentStats; consider returning a defensive copy for silence (and possibly the top-level stats object) as well to avoid external mutation leaking back into internal state.
  • Now that summarizeSegmentStats exists, you might want a corresponding recordSilenceStat helper for symmetry with recordSpeechStat and to keep any future cache invalidation or history-length logic for silence centralized.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- You’ve made speech summaries copy-on-read and cache-safe, but `silence` stats are still assigned directly to `this.state.currentStats`; consider returning a defensive copy for `silence` (and possibly the top-level stats object) as well to avoid external mutation leaking back into internal state.
- Now that `summarizeSegmentStats` exists, you might want a corresponding `recordSilenceStat` helper for symmetry with `recordSpeechStat` and to keep any future cache invalidation or history-length logic for silence centralized.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

avgEnergy: this.average(this.state.silenceStats.map(s => s.avgEnergy)),
avgEnergyIntegral: this.average(this.state.silenceStats.map(s => s.energyIntegral))
};
stats.silence = this.summarizeSegmentStats(this.state.silenceStats);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUGGESTION: Silence stats are recomputed on every updateStats() call via summarizeSegmentStats, while speech stats are cached in cachedSpeechSummary. This is a pre-existing asymmetry (not a regression introduced here), but since updateStats() is called once per processAudioData() invocation and the result is stored in currentStats, the practical impact is low. Consider adding a cachedSilenceSummary field in a follow-up if silence stats computation becomes a hot path concern.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deferred to follow-up. This asymmetry is pre-existing and currently low-impact; we can add cachedSilenceSummary only if profiling shows silence summary computation as a measurable hot path.

expect(current.silence.avgEnergy).toBe(0);
});

it('should keep cached speech summary isolated from caller mutation across updates', () => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUGGESTION: The new test covers the primary mutation-isolation scenario well. Consider adding a complementary case where cachedSpeechSummary holds an all-zero summary (i.e., a speech segment with zero energy and zero duration) to verify the !== null fix over the old truthy check. The old if (this.state.cachedSpeechSummary) would have treated { avgDuration: 0, avgEnergy: 0, avgEnergyIntegral: 0 } as falsy and recomputed unnecessarily; the new explicit null check is correct, but a test for that edge case would lock in the fix.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By design: an object like {avgDuration:0, avgEnergy:0, avgEnergyIntegral:0} is truthy in JavaScript, so the previous truthy check and the explicit null check behave the same for that case. Keeping current tests as-is.

@kiloconnect
Copy link

kiloconnect bot commented Feb 21, 2026

Code Review Summary

Status: 2 Suggestions Found | Recommendation: Merge (no blocking issues)

Overview

Severity Count
CRITICAL 0
WARNING 0
SUGGESTION 2

What Changed

This PR refactors AudioSegmentProcessor to:

  1. Centralize speech stat mutation through a new recordSpeechStat() helper (push + trim + cache invalidation in one place).
  2. Fix a correctness bug: the old if (this.state.cachedSpeechSummary) truthy check would skip the cache for an all-zero summary; replaced with an explicit !== null check.
  3. Return defensive copies of the cached speech summary via { ...this.state.cachedSpeechSummary } to prevent external mutation from corrupting the cache.
  4. Replace three separate .map() + average() calls with a single-pass summarizeSegmentStats() helper.
  5. Add a regression test verifying cache isolation from caller mutation.

Risk assessment: Low. The changes are a clean refactor of stats bookkeeping only. No streaming window logic, cursor logic, or transcription merge logic is touched. The correctness fix (truthy → null check) is safe and the new test validates the primary scenario.

Issue Details (click to expand)

SUGGESTION

File Line Issue
src/lib/audio/AudioSegmentProcessor.ts 460 Silence stats are recomputed on every updateStats() call (no cache), while speech stats are cached. Pre-existing asymmetry; low impact since recomputation happens once per audio chunk, not per getStats() call. Consider a cachedSilenceSummary in a follow-up if it becomes a hot-path concern.
src/lib/audio/AudioSegmentProcessor.test.ts 213 New test covers mutation isolation well. Consider adding a case where the cached summary is all-zeros to lock in the !== null fix over the old truthy check.
Correctness notes
  • recordSpeechStat() ordering is correct: push → trim → invalidate cache. Cache is always cleared after any mutation to speechStats, so stale summaries cannot be served.
  • getStats() double-copy is safe: updateStats() stores { ...speechSummary } into currentStats.speech; getStats() spreads it again. External mutation of the returned object cannot reach cachedSpeechSummary.
  • summarizeSegmentStats for silence: called with silenceStats directly; result is stored in currentStats.silence each call. No regression vs. prior behavior.
  • No fragile zones touched: transcription window calculations, cursor logic, sliding-window merge/finalization logic are all untouched.
Files Reviewed (2 files)
  • src/lib/audio/AudioSegmentProcessor.ts — 0 blocking issues, 1 suggestion
  • src/lib/audio/AudioSegmentProcessor.test.ts — 0 blocking issues, 1 suggestion

Fix these issues in Kilo Cloud

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/lib/audio/AudioSegmentProcessor.ts (1)

459-469: Cache-on-compute + copy-on-read for speech summary is well structured.

The explicit !== null check, clone via spread on Line 464, and fresh compute + cache on Lines 466-468 together ensure callers never hold a reference to the internal cache. Good defensive pattern.

One subtle note: silenceStats is always recomputed (Line 460) and never cached, while speechStats gets the caching treatment. This asymmetry is fine if silence summaries are expected to change frequently (silence stats are pushed on every non-speech chunk). However, silenceStats also lacks the maxHistoryLength cap that recordSpeechStat enforces for speechStats — it grows unboundedly via direct .push() calls at Lines 278 and 285. Consider whether silenceStats should be similarly bounded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 459 - 469, silenceStats
is currently appended without bounding while speechStats is capped via
recordSpeechStat using maxHistoryLength; update the places that push into
silenceStats (the methods that currently call this.state.silenceStats.push(...))
to enforce the same maxHistoryLength trimming logic used by recordSpeechStat so
silence history cannot grow unbounded, and optionally mirror the speech-summary
cache pattern (cachedSpeechSummary / summarizeSegmentStats) if you want a
copy-on-read for silence summaries as well; reference the symbols silenceStats,
speechStats, recordSpeechStat, maxHistoryLength, summarizeSegmentStats and
cachedSpeechSummary when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/lib/audio/AudioSegmentProcessor.ts`:
- Around line 459-469: silenceStats is currently appended without bounding while
speechStats is capped via recordSpeechStat using maxHistoryLength; update the
places that push into silenceStats (the methods that currently call
this.state.silenceStats.push(...)) to enforce the same maxHistoryLength trimming
logic used by recordSpeechStat so silence history cannot grow unbounded, and
optionally mirror the speech-summary cache pattern (cachedSpeechSummary /
summarizeSegmentStats) if you want a copy-on-read for silence summaries as well;
reference the symbols silenceStats, speechStats, recordSpeechStat,
maxHistoryLength, summarizeSegmentStats and cachedSpeechSummary when making the
change.

@ysdede
Copy link
Owner Author

ysdede commented Feb 21, 2026

Reviewed the AI comments. No code changes needed for this round:

  • getStats() already returns defensive copies for both silence and speech plus top-level fields.
  • The suggested all-zero-summary edge case does not change behavior between truthy and !== null checks in JS because objects are truthy.
  • silenceStats asymmetry is acknowledged and deferred; we can optimize/cache it in a follow-up only if profiling shows it on a real hot path.

@ysdede
Copy link
Owner Author

ysdede commented Feb 21, 2026

AI reviewer scorecard (factual, based on current PR #173 diff)

Rubric: correctness, signal-to-noise, actionability.

  • Sourcery: 6/10

    • Wrong/weak point: implied risk that silence could leak mutable internal state. In current code, getStats() already returns defensive copies for top-level + silence + speech.
    • Valid but optional point: suggested recordSilenceStat for symmetry; this is style/maintainability, not a correctness issue.
  • CodeRabbit: 4/10

    • Incorrect claim: said silenceStats is growing unbounded via direct .push() at referenced lines. Those push sites are not present in the current file revision on this PR.
    • Net: generated a non-actionable nit based on stale/incorrect line assumptions.
  • KiloConnect: 8/10

    • Mostly accurate summary of the actual changes and risk level.
    • Weak point: suggested an "all-zero summary" case for the old truthy check; object summaries are truthy in JS, so that specific concern is not a real behavior difference.

Conclusion: no blocking issues from AI reviews on #173; merge readiness should be decided by human review + green CI.

@AAlp22 AAlp22 merged commit b2d1c1c into master Feb 21, 2026
3 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 21, 2026
refactor(audio): harden speech stats cache safety b2d1c1c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants