Bolt: Optimize incremental mel spectrogram computation #1

Open
ysdede wants to merge 1 commit into main from bolt-incremental-perf-14087958449551992334
@ysdede ysdede commented Feb 9, 2026

Bolt: Optimize incremental mel spectrogram computation

What: Modified MelSpectrogram.computeRawMel to accept a startFrame argument and updated IncrementalMelSpectrogram to use it.
Why: The previous implementation recomputed the STFT for the entire audio window (O(Total History)) even when 70%+ of frames were already cached, effectively negating the benefit of incremental processing for the heavy math part.
Impact: Reduces processing time for a 5s window (70% overlap) from ~190ms to ~29ms (~6.5x speedup).
Measurement: npm test runs the benchmark "should process incrementally faster than full". Compare the "Full" and "Incremental" times it reports.
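A Full vs. Incremental comparison of this kind can be sketched with a small timing helper. This is only an illustration, not the repository's benchmark: `medianMs` and `busyWork` are hypothetical names, and the two workloads are stand-ins sized to mimic a 70% overlap rather than real mel computation.

```javascript
// Run fn several times and return the median wall-clock time in ms.
// The median is less sensitive to one-off JIT or GC pauses than the mean.
function medianMs(fn, runs = 5) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    fn();
    times.push(performance.now() - t0);
  }
  times.sort((a, b) => a - b);
  return times[Math.floor(times.length / 2)];
}

// Hypothetical workload standing in for the per-frame math.
function busyWork(n) {
  let acc = 0;
  for (let i = 0; i < n; i++) acc += Math.sqrt(i);
  return acc;
}

// "Full" recomputes everything; "Incremental" only the new 30%.
const fullMs = medianMs(() => busyWork(1_000_000));
const incrementalMs = medianMs(() => busyWork(300_000));
console.log(`Full: ${fullMs.toFixed(2)} ms, Incremental: ${incrementalMs.toFixed(2)} ms`);
```

Absolute timings vary by machine, so a benchmark like this is usually more useful as a ratio than as a fixed threshold.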


PR created automatically by Jules for task 14087958449551992334 started by @ysdede

Summary by CodeRabbit

  • New Features
      • Audio analysis now supports starting computation from a specific frame offset, enabling more efficient incremental processing and better frame reuse.

  • Documentation
      • Added performance optimization guidelines for incremental audio processing features, including benchmarking recommendations.

Modify `MelSpectrogram.computeRawMel` to accept an optional `startFrame` argument,
allowing it to skip STFT and filterbank computations for frames that are already
cached. Update `IncrementalMelSpectrogram.process` to utilize this optimization,
preventing redundant re-computation of the entire audio history.

Benchmarks show a ~6.5x speedup for 5s audio with 70% overlap (29ms vs 190ms).
This makes the incremental processing truly incremental, scaling with new data size.
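The idea of skipping cached frames can be sketched as follows. This is a simplified illustration, not the code in `src/mel.js`: `frameEnergy` and `computeFrames` are hypothetical names, and a per-frame energy sum stands in for the real STFT and filterbank work.

```javascript
// Hypothetical stand-in for the per-frame STFT + mel filterbank work:
// here each frame's "feature" is simply its energy.
function frameEnergy(audio, start, winLength) {
  let sum = 0;
  const end = Math.min(start + winLength, audio.length);
  for (let i = start; i < end; i++) sum += audio[i] * audio[i];
  return sum;
}

// Compute features for frames [startFrame, nFrames) only. Frames before
// startFrame stay zero; the caller is expected to fill them from a cache.
function computeFrames(audio, hop, winLength, startFrame = 0) {
  const nFrames = Math.floor(audio.length / hop);
  const out = new Float32Array(nFrames);
  let computed = 0;
  for (let t = startFrame; t < nFrames; t++) {
    out[t] = frameEnergy(audio, t * hop, winLength);
    computed++;
  }
  return { out, nFrames, computed };
}

const audio = new Float32Array(1600).fill(0.5);
const full = computeFrames(audio, 160, 400);    // all 10 frames
const tail = computeFrames(audio, 160, 400, 7); // only frames 7..9
console.log(full.computed, tail.computed); // 10 3
```

With `startFrame = 7`, the loop does work proportional to the three new frames rather than to the whole window, which is the scaling behavior the commit message describes.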

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
coderabbitai bot commented Feb 9, 2026

Walkthrough

Documentation was added explaining IncrementalMelSpectrogram caching behavior. The MelSpectrogram.computeRawMel signature gains an optional startFrame parameter, which enables processing from a specific frame offset instead of always starting from frame zero.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation: .jules/bolt.md | Added observation that IncrementalMelSpectrogram reprocesses the entire audio window (O(N)) despite frame caching, with benchmarking recommendations. |
| Core Implementation: src/mel.js | Added optional startFrame parameter to the computeRawMel method to enable partial frame processing from an offset position, allowing IncrementalMelSpectrogram to skip previously computed frames. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 Hop hop, the frames now skip with glee,
startFrame parameter sets them free!
No more reprocessing the whole long line,
Incremental magic, so divine!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the main change (optimizing incremental mel spectrogram computation by avoiding redundant STFT/filterbank work), which is clearly reflected in the modified startFrame parameter and IncrementalMelSpectrogram updates. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/mel.js (1)

441-448: ⚠️ Potential issue | 🟠 Major

Bug: computeRawMel is called twice in the non-reuse path.

this.processor.process(audio) already calls computeRawMel internally (line 279), then line 443 calls it again solely to populate the cache. This doubles the expensive STFT work on the first invocation (and every cache miss).

Call computeRawMel once, cache the result, then normalize.

Proposed fix
     if (!canReuse) {
-      const result = this.processor.process(audio);
       const { rawMel, nFrames, featuresLen } = this.processor.computeRawMel(audio);
+      if (featuresLen === 0) return { features: new Float32Array(0), length: 0, cached: false, cachedFrames: 0, newFrames: 0 };
+      const features = this.processor.normalize(rawMel, nFrames, featuresLen);
       this._cachedRawMel = rawMel;
       this._cachedNFrames = nFrames;
       this._cachedAudioLen = N;
       this._cachedFeaturesLen = featuresLen;
-      return { ...result, cached: false, cachedFrames: 0, newFrames: featuresLen };
+      return { features, length: featuresLen, cached: false, cachedFrames: 0, newFrames: featuresLen };
     }
🧹 Nitpick comments (2)
src/mel.js (2)

293-331: startFrame skips only the STFT/mel loop; pre-emphasis and padding are still O(N).

The optimization correctly avoids FFT + filterbank matmul for cached frames (the dominant cost), which explains the reported speedup. However, pre-emphasis (lines 298-302) and zero-padding (lines 304-308) still run over the entire audio buffer every call. For very large buffers this linear work could become noticeable.

If this matters in practice, pre-emphasis could be computed only from the sample offset corresponding to startFrame (minus one sample for the audio[i-1] dependency). Not urgent given the current speedup numbers, but worth noting for future optimization.
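A partial pre-emphasis pass along these lines might look like the sketch below. It rests on several assumptions that are not confirmed by the PR: the helper name `preEmphasisFrom` is hypothetical, the coefficient 0.97 is a common default rather than a value taken from `src/mel.js`, and the first sample is treated as having a zero predecessor. Mapping `startFrame` to a sample offset also depends on the hop length and on any centering or padding, so the caller is left to supply `startSample`.

```javascript
// Apply pre-emphasis y[i] = x[i] - coeff * x[i-1] only from startSample on.
// Output samples before startSample are left at zero. Reading audio[i - 1]
// from the input is the "minus one sample" dependency noted in the review.
function preEmphasisFrom(audio, startSample, coeff = 0.97) {
  const out = new Float32Array(audio.length);
  for (let i = Math.max(startSample, 0); i < audio.length; i++) {
    const prev = i > 0 ? audio[i - 1] : 0; // assumption: x[-1] = 0
    out[i] = audio[i] - coeff * prev;
  }
  return out;
}

const samples = Float32Array.from([1, 2, 3, 4]);
const fullPass = preEmphasisFrom(samples, 0);
const partialPass = preEmphasisFrom(samples, 2);
// From index 2 onward the partial pass matches the full pass.
console.log(partialPass[2] === fullPass[2], partialPass[3] === fullPass[3]);
```

Because each output sample depends only on the input, not on earlier outputs, the tail can be recomputed independently of the skipped prefix.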


316-316: Full-size rawMel allocation leaves frames [0, startFrame) as zeros until the caller overwrites them.

This works correctly in the incremental path (lines 457-462 copy cached values), but a direct caller of computeRawMel(audio, startFrame) who doesn't know about the zeroed prefix would get silently wrong results. Consider documenting this contract in the JSDoc, or marking the parameter/method as internal.
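The contract could be made explicit with a documented helper such as the sketch below. The name `fillCachedPrefix` is hypothetical, and the buffer layout is an assumption: mel-major, with each mel bin stored as a contiguous row of frames. The actual layout in `src/mel.js` may differ, in which case the indexing would need to change.

```javascript
/**
 * Fill frames [0, startFrame) of rawMel from a cached buffer.
 * Assumed layout (not confirmed by the PR): mel-major, so bin m of a buffer
 * with n frames occupies indices [m * n, (m + 1) * n).
 *
 * @param {Float32Array} rawMel        fresh buffer whose prefix is still zero
 * @param {number}       nFrames       frames per row in rawMel
 * @param {Float32Array} cached        previously computed buffer
 * @param {number}       cachedNFrames frames per row in cached
 * @param {number}       startFrame    number of leading frames to copy
 * @param {number}       nMels         number of mel bins (rows)
 */
function fillCachedPrefix(rawMel, nFrames, cached, cachedNFrames, startFrame, nMels) {
  if (startFrame > cachedNFrames || startFrame > nFrames) {
    throw new RangeError("startFrame exceeds available frames");
  }
  for (let m = 0; m < nMels; m++) {
    // Row strides differ between the two buffers, so copy row by row.
    const src = cached.subarray(m * cachedNFrames, m * cachedNFrames + startFrame);
    rawMel.set(src, m * nFrames);
  }
  return rawMel;
}

const cachedMel = Float32Array.from([1, 2, 3, 4, 5, 6]);      // 2 bins x 3 frames
const freshMel = Float32Array.from([0, 0, 9, 9, 0, 0, 8, 8]); // 2 bins x 4 frames
fillCachedPrefix(freshMel, 4, cachedMel, 3, 2, 2);
console.log(Array.from(freshMel)); // [1, 2, 9, 9, 4, 5, 8, 8]
```

Keeping the copy in one documented place means a caller cannot easily forget the zeroed prefix.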

@ysdede ysdede changed the title ⚡ Bolt: Optimize incremental mel spectrogram computation Bolt: Optimize incremental mel spectrogram computation Feb 17, 2026
Repository owner deleted a comment from google-labs-jules bot Feb 20, 2026