Skip to content

Comments

Performance: Optimize AudioSegmentProcessor stats calculation#164

Open
ysdede wants to merge 1 commit intomasterfrom
perf/audio-processor-stats-16904760239692404253
Open

Performance: Optimize AudioSegmentProcessor stats calculation#164
ysdede wants to merge 1 commit intomasterfrom
perf/audio-processor-stats-16904760239692404253

Conversation

@ysdede
Copy link
Owner

@ysdede ysdede commented Feb 20, 2026

Performance Optimization in AudioSegmentProcessor

  • Bottleneck Identified: The updateStats method was recalculating averages (duration, energy, integral) for the entire speech history buffer every time a new audio chunk was processed (~12.5Hz). This involved creating multiple temporary arrays via .map() and reducing them, despite the underlying data changing only when a speech segment ended.
  • Solution: Implemented a caching mechanism (cachedSpeechSummary) in ProcessorState.
    • The cache is invalidated (set to null) only when speechStats is modified (push/shift).
    • updateStats now uses the cached value if available, or calculates and caches it if not.
  • Impact:
    • Reduced execution time of processAudioData loop by ~48% in a synthetic benchmark (100k chunks: 585ms -> 303ms).
    • Reduced GC pressure by eliminating 3 array allocations per audio chunk.
  • Verification:
    • Ran bun test src/lib/audio/AudioSegmentProcessor.test.ts (all passed).
    • Ran a custom reproduction script repro_perf.ts to measure the speedup.

PR created automatically by Jules for task 16904760239692404253 started by @ysdede

Summary by Sourcery

Cache aggregated speech statistics in AudioSegmentProcessor to avoid recomputing them on every audio chunk and improve processing performance.

Enhancements:

  • Add a cachedSpeechSummary field to ProcessorState and use it in updateStats to reuse previously computed speech statistics.
  • Invalidate the cached speech summary only when speechStats is modified or state is reinitialized.

Summary by CodeRabbit

  • Refactor
    • Optimized audio processing performance for improved responsiveness during segment handling and state transitions.

Cached `speechStats` aggregation in `AudioSegmentProcessor.updateStats` to avoid redundant O(N) calculations every audio chunk.
Performance benchmark shows ~48% improvement in `processAudioData` execution time.
Verified with existing unit tests.
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Feb 20, 2026

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

Adds a cached speech stats summary to AudioSegmentProcessor to avoid recomputing averages on every audio chunk, invalidating the cache only when the underlying speech stats history changes.

Class diagram for cachedSpeechSummary in AudioSegmentProcessor

classDiagram
    class AudioSegmentProcessor {
        - ProcessorState state
        - ProcessorOptions options
        + processAudioData(audioChunk)
        + updateStats()
        - average(values)
        - resetState()
    }

    class ProcessorState {
        + number[] silenceDurations
        + number[] silenceEnergies
        + SegmentStats[] speechStats
        + SegmentStats[] silenceStats
        + StatsSummary cachedSpeechSummary
        + CurrentStats currentStats
        + number segmentCounter
        + number noiseFloor
    }

    class StatsSummary {
        + number avgDuration
        + number avgEnergy
        + number avgEnergyIntegral
    }

    class SegmentStats {
        + number duration
        + number avgEnergy
        + number energyIntegral
    }

    class CurrentStats {
        + StatsSummary speech
        + StatsSummary silence
    }

    AudioSegmentProcessor --> ProcessorState : uses
    ProcessorState "*" --> "*" SegmentStats : aggregates
    ProcessorState --> StatsSummary : caches
    ProcessorState --> CurrentStats : holds
    CurrentStats --> StatsSummary : speech
    CurrentStats --> StatsSummary : silence
Loading

Flow diagram for speech stats cache invalidation and usage

flowchart TD
    A[start processAudioData for new audioChunk] --> B[update speech detection state]
    B --> C{didSpeechSegmentEnd?}
    C -- no --> D[call updateStats]
    C -- yes --> E[compute segment duration and avgEnergy]
    E --> F[push new SegmentStats into speechStats]
    F --> G[set cachedSpeechSummary to null]
    G --> H{speechStats.length > maxHistoryLength?}
    H -- yes --> I[shift oldest SegmentStats from speechStats]
    I --> J[set cachedSpeechSummary to null]
    H -- no --> J
    J --> D[call updateStats]

    subgraph updateStats
        D --> K[compute silence stats]
        K --> L{cachedSpeechSummary is not null?}
        L -- yes --> M[set stats.speech from cachedSpeechSummary]
        L -- no --> N[map and average speechStats to compute StatsSummary]
        N --> O[set cachedSpeechSummary to computed StatsSummary]
        M --> P[set currentStats to stats]
        O --> P[set currentStats to stats]
        P --> Q[end updateStats]
    end

    Q --> R[end processAudioData for this audioChunk]
Loading

File-Level Changes

Change Details Files
Introduce cached speech summary in processor state and ensure it is initialized correctly.
  • Extend ProcessorState with a nullable cachedSpeechSummary field to hold precomputed speech stats.
  • Initialize cachedSpeechSummary to null in the AudioSegmentProcessor initial state setup.
src/lib/audio/AudioSegmentProcessor.ts
Invalidate cached speech summary whenever the speech stats history is mutated.
  • Set cachedSpeechSummary to null when a new speech segment summary is pushed onto speechStats.
  • Set cachedSpeechSummary to null when the oldest speech segment summary is removed from speechStats due to exceeding max history length.
src/lib/audio/AudioSegmentProcessor.ts
Use a cache in updateStats to avoid recomputing speech averages when speechStats has not changed.
  • Check cachedSpeechSummary first in updateStats and reuse it when available instead of recomputing averages.
  • If no cached value exists but speechStats has entries, compute the averages, assign them to stats.speech, and store the result in cachedSpeechSummary.
  • Retain existing behavior for silence stats and currentStats while integrating the cache usage.
src/lib/audio/AudioSegmentProcessor.ts

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link

coderabbitai bot commented Feb 20, 2026

📝 Walkthrough

Walkthrough

Internal caching optimization added to the AudioSegmentProcessor to avoid redundant speech summary computations. A cachedSpeechSummary field stores previously computed statistics, with automatic invalidation on speech segment completion or history length changes.

Changes

Cohort / File(s) Summary
Speech Summary Caching
src/lib/audio/AudioSegmentProcessor.ts
Added internal cache field for speech summary statistics with invalidation logic triggered on speech segment end events and history truncation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A cache so clever, tucked away so neat,
Speech summaries stored, no recompute repeat!
When segments end or history grows too long,
The cache invalidates—keeping stats strong! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: optimizing stats calculation in AudioSegmentProcessor through caching, which is the core improvement described in the PR.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch perf/audio-processor-stats-16904760239692404253

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • Consider centralizing speechStats mutations (e.g., via helper methods) so that cachedSpeechSummary invalidation is guaranteed on every write path and can’t be missed in future changes.
  • In updateStats, prefer an explicit null check (this.state.cachedSpeechSummary !== null) instead of a truthiness check to make the cache-use condition clearer and robust against future type changes.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider centralizing `speechStats` mutations (e.g., via helper methods) so that `cachedSpeechSummary` invalidation is guaranteed on every write path and can’t be missed in future changes.
- In `updateStats`, prefer an explicit null check (`this.state.cachedSpeechSummary !== null`) instead of a truthiness check to make the cache-use condition clearer and robust against future type changes.

## Individual Comments

### Comment 1
<location> `src/lib/audio/AudioSegmentProcessor.ts:474-482` </location>
<code_context>
         }

-        if (this.state.speechStats.length > 0) {
+        if (this.state.cachedSpeechSummary) {
+            stats.speech = this.state.cachedSpeechSummary;
+        } else if (this.state.speechStats.length > 0) {
             stats.speech = {
                 avgDuration: this.average(this.state.speechStats.map(s => s.duration)),
                 avgEnergy: this.average(this.state.speechStats.map(s => s.avgEnergy)),
                 avgEnergyIntegral: this.average(this.state.speechStats.map(s => s.energyIntegral))
             };
+            this.state.cachedSpeechSummary = stats.speech;
         }

</code_context>

<issue_to_address>
**issue (bug_risk):** Cache usage relies on all mutation sites invalidating `cachedSpeechSummary`, which could lead to stale stats if any path is missed.

This depends on every `speechStats` mutation also clearing `cachedSpeechSummary`, which we currently do when appending, truncating, and in `resetState`. Any future code that replaces or clears `speechStats` must also remember to invalidate the cache, or consumers will see stale data. To make this safer, consider centralizing all `speechStats` mutations in helper(s) that always invalidate the cache, or switch to a version-based approach where the cache is tied to a `speechStats` version counter rather than manual invalidation at each call site.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +474 to +482
if (this.state.cachedSpeechSummary) {
stats.speech = this.state.cachedSpeechSummary;
} else if (this.state.speechStats.length > 0) {
stats.speech = {
avgDuration: this.average(this.state.speechStats.map(s => s.duration)),
avgEnergy: this.average(this.state.speechStats.map(s => s.avgEnergy)),
avgEnergyIntegral: this.average(this.state.speechStats.map(s => s.energyIntegral))
};
this.state.cachedSpeechSummary = stats.speech;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Cache usage relies on all mutation sites invalidating cachedSpeechSummary, which could lead to stale stats if any path is missed.

This depends on every speechStats mutation also clearing cachedSpeechSummary, which we currently do when appending, truncating, and in resetState. Any future code that replaces or clears speechStats must also remember to invalidate the cache, or consumers will see stale data. To make this safer, consider centralizing all speechStats mutations in helper(s) that always invalidate the cache, or switch to a version-based approach where the cache is tied to a speechStats version counter rather than manual invalidation at each call site.

@kiloconnect
Copy link

kiloconnect bot commented Feb 20, 2026

Code Review Summary

Status: No New Issues Found | Recommendation: Merge

The PR implements a performance optimization by caching speech statistics summary to avoid recalculating averages on every audio chunk. The implementation is correct:

  • Cache is properly invalidated when speech stats are modified (push or shift)
  • Cache is properly initialized to null in reset
  • Cache is correctly used in updateStats before computing

An existing review (Sourcery) has already flagged a maintenance concern about centralizing speechStats mutations to prevent future cache invalidation bugs. This is a valid suggestion but not a blocker.

Files Reviewed (1 file)
  • src/lib/audio/AudioSegmentProcessor.ts - Performance optimization with proper cache invalidation

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/lib/audio/AudioSegmentProcessor.ts (2)

466-472: silenceStats path still allocates 3 temporary arrays per chunk.

The optimisation was applied only to speechStats. The silenceStats block (lines 466–472) retains the same .map() × 3 pattern on every updateStats() call. Applying cachedSilenceSummary symmetrically would complete the optimisation and keep both paths consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 466 - 472, The
silenceStats branch still creates three temporary arrays via .map on every
update; mirror the speechStats optimization by adding a cachedSilenceSummary
accumulator (similar to cachedSpeechSummary) updated inside updateStats(), and
replace the current stats.silence calculation to compute avgDuration, avgEnergy
and avgEnergyIntegral from that cachedSilenceSummary instead of mapping
silenceStats each time; update references to this.state.silenceStats,
stats.silence, cachedSilenceSummary and the average(...) helper so the silence
path no longer allocates the three intermediate arrays.

269-273: Line 273 is a redundant null assignment.

cachedSpeechSummary is already null from the unconditional assignment on line 269 — the shift() branch is nested entirely within that same block. The second assignment is a no-op.

🔧 Suggested cleanup
                    this.state.speechStats.push({
                        startTime: this.state.speechStartTime,
                        endTime: currentTime,
                        duration: speechDuration,
                        avgEnergy,
                        energyIntegral: avgEnergy * speechDuration
                    });
                    this.state.cachedSpeechSummary = null;

                    if (this.state.speechStats.length > this.options.maxHistoryLength) {
                        this.state.speechStats.shift();
-                       this.state.cachedSpeechSummary = null;
                    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 269 - 273, Remove the
redundant null assignment to this.state.cachedSpeechSummary inside the branch
that shifts this.state.speechStats: you already set
this.state.cachedSpeechSummary = null unconditionally before the if, so delete
the second assignment within the if block (in AudioSegmentProcessor where
this.state.speechStats is compared to this.options.maxHistoryLength) to
eliminate the no-op.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/audio/AudioSegmentProcessor.ts`:
- Around line 474-483: The cache hit path assigns the cached object by reference
(this.state.cachedSpeechSummary) to stats.speech, allowing external callers of
getStats() to mutate the cached object; instead, make a shallow copy when
reading the cache so callers get an independent object. In updateStats() (or the
block where stats.speech is set) replace the direct assignment of
this.state.cachedSpeechSummary with a shallow copy (e.g. create a new object
literal copying avgDuration, avgEnergy, avgEnergyIntegral), or alternatively
ensure getStats() returns a shallow copy of this.state.currentStats so consumers
cannot mutate the internal cachedSpeechSummary; update references to
this.state.currentStats and this.state.cachedSpeechSummary accordingly.

---

Nitpick comments:
In `@src/lib/audio/AudioSegmentProcessor.ts`:
- Around line 466-472: The silenceStats branch still creates three temporary
arrays via .map on every update; mirror the speechStats optimization by adding a
cachedSilenceSummary accumulator (similar to cachedSpeechSummary) updated inside
updateStats(), and replace the current stats.silence calculation to compute
avgDuration, avgEnergy and avgEnergyIntegral from that cachedSilenceSummary
instead of mapping silenceStats each time; update references to
this.state.silenceStats, stats.silence, cachedSilenceSummary and the
average(...) helper so the silence path no longer allocates the three
intermediate arrays.
- Around line 269-273: Remove the redundant null assignment to
this.state.cachedSpeechSummary inside the branch that shifts
this.state.speechStats: you already set this.state.cachedSpeechSummary = null
unconditionally before the if, so delete the second assignment within the if
block (in AudioSegmentProcessor where this.state.speechStats is compared to
this.options.maxHistoryLength) to eliminate the no-op.

Comment on lines +474 to 483
if (this.state.cachedSpeechSummary) {
stats.speech = this.state.cachedSpeechSummary;
} else if (this.state.speechStats.length > 0) {
stats.speech = {
avgDuration: this.average(this.state.speechStats.map(s => s.duration)),
avgEnergy: this.average(this.state.speechStats.map(s => s.avgEnergy)),
avgEnergyIntegral: this.average(this.state.speechStats.map(s => s.energyIntegral))
};
this.state.cachedSpeechSummary = stats.speech;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Cache hit path stores a shared object reference — consumer mutation now silently corrupts the cache.

Line 475 assigns the cached reference directly to stats.speech. After line 485 (this.state.currentStats = stats), the invariant this.state.currentStats.speech === this.state.cachedSpeechSummary holds. getStats() returns this.state.currentStats by reference (line 497), so any consumer that writes to the returned speech object (e.g. getStats().speech.avgDuration = x) mutates the cache in place.

Before this PR, stats.speech was always a fresh object literal on every updateStats() call, so external mutations were overwritten on the very next chunk (~80 ms). After this PR, the same mutation persists until the next speech-end event — potentially several seconds — and is then served as the cache hit value.

The minimal fix is to copy on the read side, which is a cheap struct copy on the hot path:

🛡️ Proposed fix
-        if (this.state.cachedSpeechSummary) {
-            stats.speech = this.state.cachedSpeechSummary;
-        } else if (this.state.speechStats.length > 0) {
+        if (this.state.cachedSpeechSummary !== null) {
+            stats.speech = { ...this.state.cachedSpeechSummary };
+        } else if (this.state.speechStats.length > 0) {
             stats.speech = {
                 avgDuration: this.average(this.state.speechStats.map(s => s.duration)),
                 avgEnergy: this.average(this.state.speechStats.map(s => s.avgEnergy)),
                 avgEnergyIntegral: this.average(this.state.speechStats.map(s => s.energyIntegral))
             };
             this.state.cachedSpeechSummary = stats.speech;
         }

Alternatively, getStats() can return a shallow copy of currentStats to establish a consistent immutability contract at the API boundary.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (this.state.cachedSpeechSummary) {
stats.speech = this.state.cachedSpeechSummary;
} else if (this.state.speechStats.length > 0) {
stats.speech = {
avgDuration: this.average(this.state.speechStats.map(s => s.duration)),
avgEnergy: this.average(this.state.speechStats.map(s => s.avgEnergy)),
avgEnergyIntegral: this.average(this.state.speechStats.map(s => s.energyIntegral))
};
this.state.cachedSpeechSummary = stats.speech;
}
if (this.state.cachedSpeechSummary !== null) {
stats.speech = { ...this.state.cachedSpeechSummary };
} else if (this.state.speechStats.length > 0) {
stats.speech = {
avgDuration: this.average(this.state.speechStats.map(s => s.duration)),
avgEnergy: this.average(this.state.speechStats.map(s => s.avgEnergy)),
avgEnergyIntegral: this.average(this.state.speechStats.map(s => s.energyIntegral))
};
this.state.cachedSpeechSummary = stats.speech;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 474 - 483, The cache hit
path assigns the cached object by reference (this.state.cachedSpeechSummary) to
stats.speech, allowing external callers of getStats() to mutate the cached
object; instead, make a shallow copy when reading the cache so callers get an
independent object. In updateStats() (or the block where stats.speech is set)
replace the direct assignment of this.state.cachedSpeechSummary with a shallow
copy (e.g. create a new object literal copying avgDuration, avgEnergy,
avgEnergyIntegral), or alternatively ensure getStats() returns a shallow copy of
this.state.currentStats so consumers cannot mutate the internal
cachedSpeechSummary; update references to this.state.currentStats and
this.state.cachedSpeechSummary accordingly.

Repository owner deleted a comment from chatgpt-codex-connector bot Feb 20, 2026
Repository owner deleted a comment from google-labs-jules bot Feb 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant