Performance: optimize sparse mel filterbank matmul #12
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| ## 2024-05-23 - Sparse Mel Filterbank | ||
| Learning: The Mel filterbank matrix is ~98.5% sparse (only ~500 non-zero elements out of ~32k), making dense matrix multiplication extremely inefficient. | ||
| Action: Always check for sparsity in fixed transform matrices (like Mel or DCT) and implement sparse iteration if sparsity > 90%. |
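
The sparsity check recommended in the Action note can be sketched as a small helper. This is a minimal illustration, not code from this PR: the function name `sparsity`, the toy matrix, and the variable `useSparseLoop` are assumptions; only the 90% threshold comes from the note above.

```javascript
// Hypothetical helper: fraction of exactly-zero entries in a flat, row-major matrix.
function sparsity(matrix) {
  if (matrix.length === 0) return 0;
  let zeros = 0;
  for (let i = 0; i < matrix.length; i++) {
    if (matrix[i] === 0) zeros++;
  }
  return zeros / matrix.length;
}

// Toy 2 x 8 matrix with 3 non-zero weights (illustrative values only).
const toy = new Float32Array([
  0, 0.5, 1, 0.5, 0, 0, 0, 0,
  0, 0,   0, 0,   0, 0, 0, 0,
]);

const ratio = sparsity(toy);        // 13 zeros out of 16 entries
const useSparseLoop = ratio > 0.9;  // threshold taken from the Action note
```

For the figures quoted in the Learning note (about 500 non-zero weights out of about 32k), the same helper would report a ratio of roughly 0.985, well above the threshold.
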
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -261,6 +261,32 @@ export class MelSpectrogram { | |||||
| this.hannWindow = createPaddedHannWindow(this.winLength, this.nFft); | ||||||
| this.twiddles = precomputeTwiddles(this.nFft); | ||||||
| | ||||||
| // Precompute sparse matrix indices (optimize for ~98.5% sparsity) | ||||||
| this._fbStart = new Int32Array(this.nMels); | ||||||
| this._fbEnd = new Int32Array(this.nMels); | ||||||
| for (let m = 0; m < this.nMels; m++) { | ||||||
| let start = 0; | ||||||
| let end = this.nFreqBins; | ||||||
| const offset = m * this.nFreqBins; | ||||||
| | ||||||
| // Find first non-zero | ||||||
| for (let k = 0; k < this.nFreqBins; k++) { | ||||||
| if (this.melFilterbank[offset + k] !== 0) { | ||||||
| start = k; | ||||||
| break; | ||||||
| } | ||||||
| } | ||||||
| // Find last non-zero | ||||||
| for (let k = this.nFreqBins - 1; k >= 0; k--) { | ||||||
| if (this.melFilterbank[offset + k] !== 0) { | ||||||
| end = k + 1; | ||||||
| break; | ||||||
| } | ||||||
| } | ||||||
| this._fbStart[m] = start; | ||||||
| this._fbEnd[m] = end; | ||||||
| } | ||||||
| | ||||||
| // Pre-allocate reusable buffers | ||||||
| this._fftRe = new Float64Array(this.nFft); | ||||||
| this._fftIm = new Float64Array(this.nFft); | ||||||
| @@ -315,7 +341,7 @@ export class MelSpectrogram { | |||||
| // 4. STFT + Power + Mel + Log | ||||||
| const rawMel = new Float32Array(this.nMels * nFrames); | ||||||
| const { _fftRe: fftRe, _fftIm: fftIm, _powerBuf: powerBuf } = this; | ||||||
| const { hannWindow: window, melFilterbank: fb, nMels, twiddles: tw, nFft, nFreqBins, hopLength, logZeroGuard } = this; | ||||||
| const { hannWindow: window, melFilterbank: fb, nMels, twiddles: tw, nFft, nFreqBins, hopLength, logZeroGuard, _fbStart, _fbEnd } = this; | ||||||
| | ||||||
| for (let t = 0; t < nFrames; t++) { | ||||||
| const offset = t * hopLength; | ||||||
| @@ -325,7 +351,8 @@ export class MelSpectrogram { | |||||
| for (let m = 0; m < nMels; m++) { | ||||||
| let melVal = 0; | ||||||
| const fbOff = m * nFreqBins; | ||||||
| for (let k = 0; k < nFreqBins; k++) melVal += powerBuf[k] * fb[fbOff + k]; | ||||||
| // Optimization: only iterate over non-zero filterbank elements | ||||||
SUGGESTION: The optimization comment is accurate, but consider also noting that the loop covers the whole range from the first to the last non-zero element of each row, so it assumes contiguous non-zero support per row (triangular filters) to be fully effective. If a non-standard filterbank with non-contiguous non-zero elements were ever used, the result would still be correct, because interior zeros inside the range are multiplied like any other weight and contribute nothing, but the range would be wider than necessary. For the Slaney triangular filterbank the support is contiguous, so the range is tight. A one-line note would help future maintainers:
Suggested change
| for (let k = _fbStart[m]; k < _fbEnd[m]; k++) melVal += powerBuf[k] * fb[fbOff + k]; | ||||||
| rawMel[m * nFrames + t] = Math.log(melVal + logZeroGuard); | ||||||
| } | ||||||
| } | ||||||
| | ||||||
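
The change in this diff can be checked in isolation by comparing the dense inner loop with the range-limited one. The sketch below is a standalone approximation under stated assumptions: `computeRowRanges`, `melDense`, and `melRanged` are hypothetical free functions that mirror the constructor scan and the two loop variants, and the toy filterbank and power values are made up for illustration.

```javascript
// Find, for each row, the half-open range [start, end) spanning its non-zero weights.
function computeRowRanges(fb, nRows, nCols) {
  const start = new Int32Array(nRows);
  const end = new Int32Array(nRows);
  for (let m = 0; m < nRows; m++) {
    const off = m * nCols;
    let s = 0;
    let e = nCols;
    for (let k = 0; k < nCols; k++) {
      if (fb[off + k] !== 0) { s = k; break; }
    }
    for (let k = nCols - 1; k >= 0; k--) {
      if (fb[off + k] !== 0) { e = k + 1; break; }
    }
    start[m] = s;
    end[m] = e;
  }
  return { start, end };
}

// Dense reference: every weight in every row is visited.
function melDense(power, fb, nRows, nCols) {
  const out = new Float64Array(nRows);
  for (let m = 0; m < nRows; m++) {
    const off = m * nCols;
    let acc = 0;
    for (let k = 0; k < nCols; k++) acc += power[k] * fb[off + k];
    out[m] = acc;
  }
  return out;
}

// Range-limited variant: only weights inside [start, end) are visited.
function melRanged(power, fb, nRows, nCols, ranges) {
  const out = new Float64Array(nRows);
  for (let m = 0; m < nRows; m++) {
    const off = m * nCols;
    let acc = 0;
    for (let k = ranges.start[m]; k < ranges.end[m]; k++) acc += power[k] * fb[off + k];
    out[m] = acc;
  }
  return out;
}

// Toy 2 x 6 filterbank: one triangular row, one row with an interior zero.
const nRows = 2;
const nCols = 6;
const fb = new Float32Array([
  0, 0.5, 1, 0.5, 0, 0,
  0, 0,   1, 0,   1, 0,
]);
const power = new Float32Array([1, 2, 3, 4, 5, 6]);

const ranges = computeRowRanges(fb, nRows, nCols);
const dense = melDense(power, fb, nRows, nCols);
const ranged = melRanged(power, fb, nRows, nCols, ranges);
```

Both variants produce the same sums for finite power values, including for the second row, whose interior zero falls inside the range and simply adds a zero-valued term. The two would differ only if `power` contained `NaN` or `Infinity` outside a row's range, since the dense loop would then multiply those values by zero.
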
SUGGESTION: The default value for `end` is `this.nFreqBins`, which means that if a mel row were entirely zero (no non-zero element found in the second loop), the full range `[0, nFreqBins)` would be iterated, a safe conservative fallback. However, the comment above only documents the first loop's default. Consider adding a brief comment on the `end` default for symmetry and clarity:
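
To illustrate the fallback this comment describes, here is a minimal sketch for a single row. It is not the reviewer's suggested change: `findRowRange` is a hypothetical helper that mirrors the two scans in the constructor, and the `collapseEmpty` flag is an assumed variant, not part of the diff, that gives an all-zero row an empty range instead.

```javascript
// Hypothetical helper: compute [start, end) for one filterbank row.
function findRowRange(fb, offset, nFreqBins, collapseEmpty = false) {
  // Defaults: if the row has no non-zero element, neither scan assigns,
  // so the range stays [0, nFreqBins), i.e. the original dense loop.
  let start = 0;
  let end = nFreqBins;
  let found = false;

  // Find first non-zero
  for (let k = 0; k < nFreqBins; k++) {
    if (fb[offset + k] !== 0) { start = k; found = true; break; }
  }
  // Find last non-zero
  for (let k = nFreqBins - 1; k >= 0; k--) {
    if (fb[offset + k] !== 0) { end = k + 1; break; }
  }

  // Assumed variant: skip all-zero rows entirely by returning an empty range.
  if (!found && collapseEmpty) end = 0;
  return { start, end };
}

const nFreqBins = 5;
const fb = new Float32Array([
  0, 0,    0,    0, 0, // all-zero row
  0, 0.25, 0.75, 0, 0, // ordinary row
]);

const emptyRow = findRowRange(fb, 0, nFreqBins);           // full-range fallback
const emptyRowCollapsed = findRowRange(fb, 0, nFreqBins, true); // empty range
const normalRow = findRowRange(fb, nFreqBins, nFreqBins);  // tight range
```

Either choice yields a sum of zero for an all-zero row; the full-range default costs one dense pass over that row per frame, while the collapsed variant does no work for it.
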