refactor(audio): harden speech stats cache safety#173
Conversation
Centralize speechStats writes to guarantee cache invalidation, use explicit null checks for cache reads, and avoid temporary map-array allocations in summary calculation. Also adds a regression test to ensure caller mutation of returned stats cannot corrupt cached speech summary across update cycles.
Reviewer's GuideRefactors AudioSegmentProcessor to centralize speech stats mutation through a helper, makes the speech stats cache safer by using explicit null checks and defensive copying, introduces a shared summarization helper for segment statistics, and adds a regression test to ensure cached speech summaries are not mutated by callers across updates. Sequence diagram for speech stats cache invalidation and safe accesssequenceDiagram
actor Client
participant Processor as AudioSegmentProcessor
participant State
Client->>Processor: processAudioData(speechChunk, 0.0, speechEnergy)
Processor->>State: update speechEnergies, speechStartTime
Client->>Processor: processAudioData(silenceChunk, chunkDuration, silenceEnergy)
Processor->>State: detect speech segment end
Processor->>Processor: recordSpeechStat(stat)
Processor->>State: push stat into speechStats
Processor->>State: trim speechStats to maxHistoryLength
Processor->>State: set cachedSpeechSummary = null
Client->>Processor: getStats()
Processor->>State: read cachedSpeechSummary (null)
alt cachedSpeechSummary is null and speechStats not empty
Processor->>Processor: summarizeSegmentStats(speechStats)
Processor->>State: cachedSpeechSummary = speechSummary
Processor-->>Client: stats with copy of speechSummary
end
Client->>Client: mutate returned stats.speech
Client->>Processor: processAudioData(silenceChunk, nextTime, silenceEnergy)
Processor->>State: update silenceStats
Processor->>State: cachedSpeechSummary remains unchanged
Client->>Processor: getStats()
Processor->>State: read cachedSpeechSummary (not null)
Processor-->>Client: stats with copy of cachedSpeechSummary
Client->>Client: verify stats.speech equals baseline, unaffected by mutation
Class diagram for refactored AudioSegmentProcessor speech stats handlingclassDiagram
class AudioSegmentProcessor {
- state : ProcessorState
- options : ProcessorOptions
+ processAudioData(audioData : Float32Array, startTime : number, energy : number) void
+ getStats() ProcessorStats
- updateStats(currentTime : number) void
- recordSpeechStat(stat : SegmentStats) void
- summarizeSegmentStats(segments : SegmentStats[]) StatsSummary
}
class ProcessorState {
+ speechStats : SegmentStats[]
+ silenceStats : SegmentStats[]
+ cachedSpeechSummary : StatsSummary
+ speechStartTime : number
+ currentStats : ProcessorStats
}
class ProcessorOptions {
+ maxHistoryLength : number
+ energyThreshold : number
+ silenceThreshold : number
+ maxSilenceWithinSpeech : number
}
class SegmentStats {
+ startTime : number
+ endTime : number
+ duration : number
+ avgEnergy : number
+ energyIntegral : number
}
class StatsSummary {
+ avgDuration : number
+ avgEnergy : number
+ avgEnergyIntegral : number
}
class ProcessorStats {
+ speech : StatsSummary
+ silence : StatsSummary
}
AudioSegmentProcessor --> ProcessorState : uses
AudioSegmentProcessor --> ProcessorOptions : uses
ProcessorState --> SegmentStats : has
ProcessorState --> StatsSummary : caches
ProcessorStats --> StatsSummary : contains
AudioSegmentProcessor ..> SegmentStats : records
AudioSegmentProcessor ..> StatsSummary : summarizes
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
📝 WalkthroughWalkthroughThe changes refactor AudioSegmentProcessor to improve speech stats handling by introducing a helper method for recording stats with cache invalidation, consolidating averaging logic into a reusable summarizeSegmentStats method, and adding tests to verify that cached speech summaries remain isolated from caller mutations. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- You’ve made speech summaries copy-on-read and cache-safe, but
silencestats are still assigned directly tothis.state.currentStats; consider returning a defensive copy forsilence(and possibly the top-level stats object) as well to avoid external mutation leaking back into internal state. - Now that
summarizeSegmentStatsexists, you might want a correspondingrecordSilenceStathelper for symmetry withrecordSpeechStatand to keep any future cache invalidation or history-length logic for silence centralized.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- You’ve made speech summaries copy-on-read and cache-safe, but `silence` stats are still assigned directly to `this.state.currentStats`; consider returning a defensive copy for `silence` (and possibly the top-level stats object) as well to avoid external mutation leaking back into internal state.
- Now that `summarizeSegmentStats` exists, you might want a corresponding `recordSilenceStat` helper for symmetry with `recordSpeechStat` and to keep any future cache invalidation or history-length logic for silence centralized.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| avgEnergy: this.average(this.state.silenceStats.map(s => s.avgEnergy)), | ||
| avgEnergyIntegral: this.average(this.state.silenceStats.map(s => s.energyIntegral)) | ||
| }; | ||
| stats.silence = this.summarizeSegmentStats(this.state.silenceStats); |
There was a problem hiding this comment.
SUGGESTION: Silence stats are recomputed on every updateStats() call via summarizeSegmentStats, while speech stats are cached in cachedSpeechSummary. This is a pre-existing asymmetry (not a regression introduced here), but since updateStats() is called once per processAudioData() invocation and the result is stored in currentStats, the practical impact is low. Consider adding a cachedSilenceSummary field in a follow-up if silence stats computation becomes a hot path concern.
There was a problem hiding this comment.
Deferred to follow-up. This asymmetry is pre-existing and currently low-impact; we can add cachedSilenceSummary only if profiling shows silence summary computation as a measurable hot path.
| expect(current.silence.avgEnergy).toBe(0); | ||
| }); | ||
|
|
||
| it('should keep cached speech summary isolated from caller mutation across updates', () => { |
There was a problem hiding this comment.
SUGGESTION: The new test covers the primary mutation-isolation scenario well. Consider adding a complementary case where cachedSpeechSummary holds an all-zero summary (i.e., a speech segment with zero energy and zero duration) to verify the !== null fix over the old truthy check. The old if (this.state.cachedSpeechSummary) would have treated { avgDuration: 0, avgEnergy: 0, avgEnergyIntegral: 0 } as falsy and recomputed unnecessarily; the new explicit null check is correct, but a test for that edge case would lock in the fix.
There was a problem hiding this comment.
By design: an object like {avgDuration:0, avgEnergy:0, avgEnergyIntegral:0} is truthy in JavaScript, so the previous truthy check and the explicit null check behave the same for that case. Keeping current tests as-is.
Code Review SummaryStatus: 2 Suggestions Found | Recommendation: Merge (no blocking issues) Overview
What ChangedThis PR refactors
Risk assessment: Low. The changes are a clean refactor of stats bookkeeping only. No streaming window logic, cursor logic, or transcription merge logic is touched. The correctness fix (truthy → null check) is safe and the new test validates the primary scenario. Issue Details (click to expand)SUGGESTION
Correctness notes
Files Reviewed (2 files)
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/lib/audio/AudioSegmentProcessor.ts (1)
459-469: Cache-on-compute + copy-on-read for speech summary is well structured.The explicit
!== nullcheck, clone via spread on Line 464, and fresh compute + cache on Lines 466-468 together ensure callers never hold a reference to the internal cache. Good defensive pattern.One subtle note:
silenceStatsis always recomputed (Line 460) and never cached, whilespeechStatsgets the caching treatment. This asymmetry is fine if silence summaries are expected to change frequently (silence stats are pushed on every non-speech chunk). However,silenceStatsalso lacks themaxHistoryLengthcap thatrecordSpeechStatenforces forspeechStats— it grows unboundedly via direct.push()calls at Lines 278 and 285. Consider whethersilenceStatsshould be similarly bounded.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 459 - 469, silenceStats is currently appended without bounding while speechStats is capped via recordSpeechStat using maxHistoryLength; update the places that push into silenceStats (the methods that currently call this.state.silenceStats.push(...)) to enforce the same maxHistoryLength trimming logic used by recordSpeechStat so silence history cannot grow unbounded, and optionally mirror the speech-summary cache pattern (cachedSpeechSummary / summarizeSegmentStats) if you want a copy-on-read for silence summaries as well; reference the symbols silenceStats, speechStats, recordSpeechStat, maxHistoryLength, summarizeSegmentStats and cachedSpeechSummary when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/lib/audio/AudioSegmentProcessor.ts`:
- Around line 459-469: silenceStats is currently appended without bounding while
speechStats is capped via recordSpeechStat using maxHistoryLength; update the
places that push into silenceStats (the methods that currently call
this.state.silenceStats.push(...)) to enforce the same maxHistoryLength trimming
logic used by recordSpeechStat so silence history cannot grow unbounded, and
optionally mirror the speech-summary cache pattern (cachedSpeechSummary /
summarizeSegmentStats) if you want a copy-on-read for silence summaries as well;
reference the symbols silenceStats, speechStats, recordSpeechStat,
maxHistoryLength, summarizeSegmentStats and cachedSpeechSummary when making the
change.
|
Reviewed the AI comments. No code changes needed for this round:
|
|
AI reviewer scorecard (factual, based on current PR #173 diff) Rubric: correctness, signal-to-noise, actionability.
Conclusion: no blocking issues from AI reviews on #173; merge readiness should be decided by human review + green CI. |
refactor(audio): harden speech stats cache safety b2d1c1c
Summary
Context
This revisits the optimization path discussed in #164 and applies a safer variant on top of current master.
Validation
Summary by Sourcery
Refine audio segment statistics handling to make speech summary caching safer and more robust while simplifying statistics aggregation.
Bug Fixes:
Enhancements:
Tests:
Summary by CodeRabbit
Bug Fixes
Tests