Skip to content

Comments

Performance: optimize sparse mel filterbank matmul#12

Open
ysdede wants to merge 1 commit intomainfrom
bolt/optimize-mel-filterbank-16386929457820672
Open

Performance: optimize sparse mel filterbank matmul#12
ysdede wants to merge 1 commit intomainfrom
bolt/optimize-mel-filterbank-16386929457820672

Conversation

@ysdede
Copy link
Owner

@ysdede ysdede commented Feb 21, 2026

Implemented sparse matrix multiplication for the Mel filterbank step in MelSpectrogram.
Precomputed non-zero start/end indices in the constructor to skip ~98.5% of iterations in the inner loop of computeRawMel.
This results in a ~3.3x speedup for the overall processing pipeline.


PR created automatically by Jules for task 16386929457820672 started by @ysdede

The mel filterbank matrix is ~98.5% sparse. Iterating over all frequency bins for every mel band results in unnecessary zero multiplications.

This change:
1. Precomputes the start and end indices of non-zero elements for each mel band in the `MelSpectrogram` constructor.
2. Updates `computeRawMel` to iterate only over the non-zero range.

Performance impact:
- 5s audio processing time reduced from ~70ms to ~21ms (~3.3x speedup).
- Verified with `node bench.mjs` and existing tests.
@google-labs-jules
Copy link

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

this._fbEnd = new Int32Array(this.nMels);
for (let m = 0; m < this.nMels; m++) {
let start = 0;
let end = this.nFreqBins;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUGGESTION: The default value for end is this.nFreqBins, which means if a mel row were entirely zero (no non-zero element found in the second loop), the full range [0, nFreqBins) would be iterated — a safe conservative fallback. However, the comment above only documents the first loop's default. Consider adding a brief comment on the end default for symmetry and clarity:

Suggested change
let end = this.nFreqBins;
let end = this.nFreqBins; // default: full range (safe fallback if row is all-zero)

let melVal = 0;
const fbOff = m * nFreqBins;
for (let k = 0; k < nFreqBins; k++) melVal += powerBuf[k] * fb[fbOff + k];
// Optimization: only iterate over non-zero filterbank elements
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUGGESTION: The optimization comment is accurate, but consider also noting that correctness relies on the filterbank having contiguous non-zero support per row (triangular filters). If a non-standard filterbank with non-contiguous non-zero elements were ever used, this sparse-range approach would silently skip interior zeros that happen to be zero but are surrounded by non-zeros — though for the Slaney triangular filterbank this is guaranteed safe. A one-line note would help future maintainers:

Suggested change
// Optimization: only iterate over non-zero filterbank elements
// Optimization: only iterate over non-zero filterbank elements (valid for contiguous triangular filters)

@kiloconnect
Copy link

kiloconnect bot commented Feb 21, 2026

Code Review Summary

Status: 2 Suggestions Found | Recommendation: Merge (no blocking issues)

Overview

Severity Count
CRITICAL 0
WARNING 0
SUGGESTION 2

What Changed

This PR precomputes per-mel-bin non-zero range indices (_fbStart[m], _fbEnd[m]) in the MelSpectrogram constructor and uses them to skip zero-valued filterbank elements in the inner mel accumulation loop of computeRawMel. A .jules/bolt.md learning note is also added.

Risk assessment: Low. The optimization is mathematically equivalent to the dense loop for the Slaney triangular filterbank (which has contiguous non-zero support per row). The existing ONNX cross-validation test in tests/mel.test.mjs provides strong end-to-end correctness coverage. No fragile streaming/window/merge logic is touched.

Issue Details (click to expand)

SUGGESTION

File Line Issue
src/mel.js 269 end default value (this.nFreqBins) is a safe conservative fallback but lacks a comment explaining the intent — add inline note for symmetry with the start default
src/mel.js 354 Optimization comment doesn't note the contiguous-support assumption; future maintainers using non-triangular filterbanks could be misled — add a brief qualifier
Other Observations (not in diff)
  • .jules/bolt.md: This bot-generated learning note (## 2024-05-23 - Sparse Mel Filterbank) is committed to the repository. Consider whether this file belongs in version control or should be .gitignored.
  • Sparsity claim accuracy: The ~98.5% sparsity comment is accurate for the default nMels=128, nFft=512 configuration. For other configurations (e.g. nMels=80, nFft=1024) the sparsity ratio will differ. The claim is not wrong but is config-specific.
  • No new unit test for sparse bounds: The PR adds new constructor logic (_fbStart/_fbEnd computation) without a dedicated unit test. The existing ONNX cross-validation test covers end-to-end correctness, but a targeted test asserting that _fbStart[m] <= _fbEnd[m] and that the sparse result matches the dense result for a known filterbank would strengthen confidence. Not blocking given the existing coverage.
  • IncrementalMelSpectrogram inherits the optimization automatically via this.processor delegation — no changes needed there. ✅
Files Reviewed (2 files)
  • src/mel.js — 2 suggestions
  • .jules/bolt.md — meta/docs, no code issues

Fix these issues in Kilo Cloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant