Refactor AudioSegmentProcessor: reduce complexity in findSpeechStart by ysdede · Pull Request #87 · ysdede/keet

ysdede · 2026-02-09T09:44:25Z

Refactored findSpeechStart in src/lib/audio/AudioSegmentProcessor.ts to reduce complexity.
Verified with reproduction test src/lib/audio/AudioSegmentProcessor_repro.test.ts (deleted after verification) and full test suite.

PR created automatically by Jules for task 6181472373587751073 started by @ysdede

Summary by CodeRabbit

Bug Fixes
- Enhanced speech start detection in audio processing with improved multi-step analysis for more accurate and reliable identification of speech onset in audio recordings.
Refactor
- Reorganized audio detection logic with improved helper functions and enhanced logging for better code structure and maintainability during segment processing.

… methods This change extracts the complex heuristic logic in `findSpeechStart` into three private helper methods: `findLastSpeechChunkIndex`, `findRisingEnergyStart`, and `findSnrCrossingStart`. This improves readability and maintainability of the speech onset detection logic without altering its behavior. - Extracted `findLastSpeechChunkIndex` to locate the most recent speech chunk. - Extracted `findRisingEnergyStart` to handle rising energy trend analysis. - Extracted `findSnrCrossingStart` to handle SNR threshold crossing checks. - Updated `findSpeechStart` to compose these helpers. Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

ysdede · 2026-02-09T17:17:09Z

@coderabbitai review

coderabbitai · 2026-02-09T17:17:20Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

… methods This change extracts the complex heuristic logic in `findSpeechStart` into three private helper methods: `findLastSpeechChunkIndex`, `findRisingEnergyStart`, and `findSnrCrossingStart`. This improves readability and maintainability of the speech onset detection logic without altering its behavior. - Extracted `findLastSpeechChunkIndex` to locate the most recent speech chunk. - Extracted `findRisingEnergyStart` to handle rising energy trend analysis. - Extracted `findSnrCrossingStart` to handle SNR threshold crossing checks. - Updated `findSpeechStart` to compose these helpers. Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@src/lib/audio/AudioSegmentProcessor.ts`:
- Around line 447-458: In findSnrCrossingStart, handle the edge-case where the
chunk at startIndex is already below minSnrThreshold: instead of returning
Math.min(chunks.length - 1, i + 1) blindly, add an explicit branch so that if i
=== startIndex you return startIndex (or otherwise clamp so you never return an
index past the last valid speech chunk); keep the existing behavior for earlier
i (return i+1) and add a short comment explaining this special-case to prevent
regressions—this involves editing the findSnrCrossingStart function and touching
the use of chunks, i, startIndex and minSnrThreshold.
- Around line 392-400: findLastSpeechChunkIndex currently returns 0 to indicate
"not found", which conflicts with a real match at index 0; change
findLastSpeechChunkIndex to return -1 when no speech chunk is found and update
findSpeechStart to handle -1 explicitly (e.g., treat as "no speech" and avoid
using recentChunks[0]). Also search for any other callers (e.g.,
processAudioData / findSpeechStart invocation sites) and ensure they properly
handle the -1 sentinel (or switch to returning null if you prefer), adding a
brief comment in both functions documenting the sentinel semantics.
- Around line 373-375: The variable firstSpeechIndex is misnamed because
findLastSpeechChunkIndex() returns the most recent (last) speech chunk; rename
the variable to lastSpeechIndex and update any usages accordingly (e.g., where
passed into findRisingEnergyStart) and also rename the helper parameter
startIndex to lastIndex (or similar) in methods like findRisingEnergyStart and
any other helper functions that accept that value so names reflect "last/recent"
semantics; update all call sites and JSDoc/comments referencing
firstSpeechIndex/startIndex to use the new names (findLastSpeechChunkIndex,
lastSpeechIndex, lastIndex).

📜 Review details

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c7f96b9 and 9ea5f07.

📒 Files selected for processing (1)

src/lib/audio/AudioSegmentProcessor.ts

🧰 Additional context used

🧬 Code graph analysis (1)

src/lib/audio/AudioSegmentProcessor.ts (1)

src/lib/audio/audioParams.ts (1)

minSnrThreshold (133-133)

🔇 Additional comments (2)

src/lib/audio/AudioSegmentProcessor.ts (2)

372-387: Clean decomposition of findSpeechStart — nice refactor.

The three-step cascade (rising energy → SNR crossing → default lookback) is clear and easy to follow. Good separation of concerns.

One note: since findSpeechStart can no longer return -1 (minimum is 0 via Math.max), the guard at line 217 (realStartIndex !== -1) is now dead code. Consider updating the caller to remove that check for consistency.

405-442: findRisingEnergyStart logic looks correct.

The backward scan, energy-rise detection, SNR floor guard, and ~500 ms lookback cap are all well-structured. Logging only fires when a trend is actually found, which is good.

One minor observation: the guard i < chunks.length - 1 at line 414 is always true in the current call context (since i <= startIndex - 1 and startIndex <= chunks.length - 1). It's harmless as a safety check but could be removed or commented to reduce reader confusion.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-02-09T17:20:26Z

src/lib/audio/AudioSegmentProcessor.ts

+        const firstSpeechIndex = this.findLastSpeechChunkIndex();
+
+        const risingEnergyIndex = this.findRisingEnergyStart(firstSpeechIndex);


🧹 Nitpick | 🔵 Trivial

Misleading variable name: firstSpeechIndex holds the last (most recent) speech chunk index.

findLastSpeechChunkIndex() scans backwards and returns the most recent speech chunk, yet its result is stored in firstSpeechIndex. This is confusing and inverts the semantic meaning. Rename to lastSpeechIndex (and correspondingly the startIndex parameter in the helpers) to match what findLastSpeechChunkIndex actually returns.

✏️ Suggested rename

- const firstSpeechIndex = this.findLastSpeechChunkIndex(); + const lastSpeechIndex = this.findLastSpeechChunkIndex(); - const risingEnergyIndex = this.findRisingEnergyStart(firstSpeechIndex); + const risingEnergyIndex = this.findRisingEnergyStart(lastSpeechIndex); if (risingEnergyIndex !== null) { return risingEnergyIndex; } - const snrCrossingIndex = this.findSnrCrossingStart(firstSpeechIndex); + const snrCrossingIndex = this.findSnrCrossingStart(lastSpeechIndex); if (snrCrossingIndex !== null) { return snrCrossingIndex; } // Default lookback - return Math.max(0, firstSpeechIndex - 4); + return Math.max(0, lastSpeechIndex - 4);

🤖 Prompt for AI Agents

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 373 - 375, The variable firstSpeechIndex is misnamed because findLastSpeechChunkIndex() returns the most recent (last) speech chunk; rename the variable to lastSpeechIndex and update any usages accordingly (e.g., where passed into findRisingEnergyStart) and also rename the helper parameter startIndex to lastIndex (or similar) in methods like findRisingEnergyStart and any other helper functions that accept that value so names reflect "last/recent" semantics; update all call sites and JSDoc/comments referencing firstSpeechIndex/startIndex to use the new names (findLastSpeechChunkIndex, lastSpeechIndex, lastIndex).

coderabbitai · 2026-02-09T17:20:26Z

src/lib/audio/AudioSegmentProcessor.ts

+    private findLastSpeechChunkIndex(): number {
+        const chunks = this.state.recentChunks;
        for (let i = chunks.length - 1; i >= 0; i--) {
            if (chunks[i].isSpeech) {
-                firstSpeechIndex = i;
-                break;
+                return i;
            }
        }
+        return 0;
+    }


🧹 Nitpick | 🔵 Trivial

findLastSpeechChunkIndex returns 0 as a "not found" sentinel — indistinguishable from a real match at index 0.

Returning 0 when no speech chunk exists conflates "the oldest chunk in the buffer is speech" with "no speech was found." Currently this is safe because processAudioData pushes a speech chunk into recentChunks (line 182-187) before calling findSpeechStart, so the search always succeeds. However, this is fragile — any future caller or reordering would silently degrade to using a stale chunk at index 0.

Consider returning -1 (or null) for "not found" and handling it explicitly in findSpeechStart, or at minimum adding a comment explaining why 0 is acceptable here.

♻️ Safer sentinel approach

private findLastSpeechChunkIndex(): number { const chunks = this.state.recentChunks; for (let i = chunks.length - 1; i >= 0; i--) { if (chunks[i].isSpeech) { return i; } } - return 0; + return -1; }

Then in findSpeechStart:

private findSpeechStart(): number { const lastSpeechIndex = this.findLastSpeechChunkIndex(); + if (lastSpeechIndex < 0) { + return Math.max(0, this.state.recentChunks.length - 1); + } const risingEnergyIndex = this.findRisingEnergyStart(lastSpeechIndex);

🤖 Prompt for AI Agents

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 392 - 400, findLastSpeechChunkIndex currently returns 0 to indicate "not found", which conflicts with a real match at index 0; change findLastSpeechChunkIndex to return -1 when no speech chunk is found and update findSpeechStart to handle -1 explicitly (e.g., treat as "no speech" and avoid using recentChunks[0]). Also search for any other callers (e.g., processAudioData / findSpeechStart invocation sites) and ensure they properly handle the -1 sentinel (or switch to returning null if you prefer), adding a brief comment in both functions documenting the sentinel semantics.

coderabbitai · 2026-02-09T17:20:26Z

src/lib/audio/AudioSegmentProcessor.ts

+    private findSnrCrossingStart(startIndex: number): number | null {
+        const chunks = this.state.recentChunks;
+        const minSnrThreshold = this.options.minSnrThreshold;
+
+        for (let i = startIndex; i >= 0; i--) {
            if (chunks[i].snr < minSnrThreshold) {
                return Math.min(chunks.length - 1, i + 1);
            }
        }

-        // Default lookback
-        return Math.max(0, firstSpeechIndex - 4);
+        return null;
    }


⚠️ Potential issue | 🟡 Minor

findSnrCrossingStart — verify intent when the chunk at startIndex is below threshold.

If chunks[startIndex].snr < minSnrThreshold, the method returns Math.min(chunks.length - 1, startIndex + 1). In the current call flow startIndex equals chunks.length - 1, so Math.min clamps the result back to startIndex — effectively a no-op. This is fine today.

However, if startIndex were ever less than chunks.length - 1, the function would return an index after the last speech chunk, which is semantically backwards for a "find start" helper. Consider adding an explicit early return or comment clarifying this edge-case behaviour to prevent subtle regressions if the callers evolve.

🤖 Prompt for AI Agents

In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 447 - 458, In findSnrCrossingStart, handle the edge-case where the chunk at startIndex is already below minSnrThreshold: instead of returning Math.min(chunks.length - 1, i + 1) blindly, add an explicit branch so that if i === startIndex you return startIndex (or otherwise clamp so you never return an index past the last valid speech chunk); keep the existing behavior for earlier i (return i+1) and add a short comment explaining this special-case to prevent regressions—this involves editing the findSnrCrossingStart function and touching the use of chunks, i, startIndex and minSnrThreshold.

ysdede · 2026-02-10T21:49:15Z

Review Summary (Post r1→master merge)

Status: ✅ MERGEABLE

Changes: Decomposes the complex \indSpeechStart()\ method in \AudioSegmentProcessor\ into three focused helpers:

\indLastSpeechChunkIndex()\ — finds most recent speech chunk in history
\indRisingEnergyStart(startIndex)\ — looks back for rising energy trend onset
\indSnrCrossingStart(startIndex)\ — finds where SNR drops below threshold

Each returns
umber | null\ with the main \indSpeechStart()\ orchestrating the fallback chain.

Assessment: Clean extract-method refactor. The original was a single 60-line method with mixed concerns. Now each strategy (energy trend, SNR crossing, default lookback) is isolated and testable.

Note: Complements PR #85 (which refactors \processAudioData). They touch different methods in the same file, so merge #85 first, then #87.

Recommendation: ✅ Merge (after #85) — Completes the AudioSegmentProcessor decomposition

google-labs-jules · 2026-02-10T21:50:14Z

Review Summary (Post r1→master merge)

Status: ✅ MERGEABLE

Changes: Decomposes the complex \indSpeechStart()\ method in \AudioSegmentProcessor\ into three focused helpers:

\indLastSpeechChunkIndex()\ — finds most recent speech chunk in history

\indRisingEnergyStart(startIndex)\ — looks back for rising energy trend onset

\indSnrCrossingStart(startIndex)\ — finds where SNR drops below threshold

Each returns
umber | null\ with the main \indSpeechStart()\ orchestrating the fallback chain.

Assessment: Clean extract-method refactor. The original was a single 60-line method with mixed concerns. Now each strategy (energy trend, SNR crossing, default lookback) is isolated and testable.

Note: Complements PR #85 (which refactors \processAudioData). They touch different methods in the same file, so merge #85 first, then #87.

Recommendation: ✅ Merge (after #85) — Completes the AudioSegmentProcessor decomposition

Acknowledged.

… methods This change extracts the complex heuristic logic in `findSpeechStart` into three private helper methods: `findLastSpeechChunkIndex`, `findRisingEnergyStart`, and `findSnrCrossingStart`. This improves readability and maintainability of the speech onset detection logic without altering its behavior. - Extracted `findLastSpeechChunkIndex` to locate the most recent speech chunk. - Extracted `findRisingEnergyStart` to handle rising energy trend analysis. - Extracted `findSnrCrossingStart` to handle SNR threshold crossing checks. - Updated `findSpeechStart` to compose these helpers. Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

coderabbitai bot reviewed Feb 9, 2026

View reviewed changes

ysdede changed the base branch from release/r1 to staging February 10, 2026 21:00

ysdede mentioned this pull request Feb 10, 2026

Refactor AudioSegmentProcessor processAudioData #85

Open

Repository owner deleted a comment from google-labs-jules bot Feb 17, 2026

Repository owner deleted a comment from coderabbitai bot Feb 17, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Refactor AudioSegmentProcessor: reduce complexity in findSpeechStart#87

Refactor AudioSegmentProcessor: reduce complexity in findSpeechStart#87
ysdede wants to merge 3 commits intostagingfrom
refactor-audiosegmentprocessor-6181472373587751073

ysdede commented Feb 9, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

ysdede commented Feb 9, 2026

Uh oh!

coderabbitai bot commented Feb 9, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 9, 2026

Uh oh!

coderabbitai bot Feb 9, 2026

Uh oh!

coderabbitai bot Feb 9, 2026

Uh oh!

ysdede commented Feb 10, 2026

Uh oh!

google-labs-jules bot commented Feb 10, 2026

Review Summary (Post r1→master merge)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		const firstSpeechIndex = this.findLastSpeechChunkIndex();

		const risingEnergyIndex = this.findRisingEnergyStart(firstSpeechIndex);

Comments

Conversation

ysdede commented Feb 9, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

ysdede commented Feb 9, 2026

Uh oh!

coderabbitai bot commented Feb 9, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

ysdede commented Feb 10, 2026

Review Summary (Post r1→master merge)

Uh oh!

google-labs-jules bot commented Feb 10, 2026

Review Summary (Post r1→master merge)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ysdede commented Feb 9, 2026 •

edited by coderabbitai bot

Loading