perf: conservative round 2 optimizations (buffer/ui/debug) #143
Conversation
Reviewer's Guide

Implements low-risk performance optimizations for v4 streaming by consolidating VAD worker calls into a single summary request, batching and scoping transcript UI updates, hardening debug visualizer cleanup, and adding guarded telemetry around incremental cache usage, without changing merger semantics.

Sequence diagram for consolidated VAD summary and silence-based flush in v4Tick

sequenceDiagram
participant App
participant BufferWorkerClient
participant BufferWorker
participant WorkerClient
participant WindowBuilder
participant AppStore
App->>BufferWorkerClient: getVadSummary(startSample,endSample,energyThreshold,inferenceThreshold,requireInference)
BufferWorkerClient->>BufferWorker: BufferWorkerRequest GET_VAD_SUMMARY
BufferWorker-->>BufferWorkerClient: BufferWorkerResponse VadSummaryResult
BufferWorkerClient-->>App: VadSummaryResult vadSummary
App->>App: hasSpeech = vadSummary.hasSpeech
alt hasSpeech == false
App->>App: silenceDuration = vadSummary.silenceTailDurationSec
alt silenceDuration >= v4SilenceFlushSec
App->>WorkerClient: v4FinalizeTimeout()
WorkerClient-->>App: flushResult
alt flushResult exists
App->>AppStore: batch setMatureText,matureCursorTime,transcript
App->>AppStore: appendV4SentenceEntries
App->>AppStore: setV4MergerStats
App->>WindowBuilder: advanceMatureCursorByTime(matureCursorTime)
App->>WindowBuilder: markSentenceEnd(sampleAtMatureCursor)
end
end
else hasSpeech == true
App->>App: continue streaming without timeout flush
end
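The silence-gated branch in the sequence above reduces to a small pure decision. The sketch below is illustrative only — the interface and function names are stand-ins mirroring the diagram, not the actual `App.tsx` code:

```typescript
// Hypothetical sketch of the timeout-flush gate from the sequence above.
// VadSummarySketch fields follow the diagram; the names are illustrative.
interface VadSummarySketch {
  hasSpeech: boolean;
  silenceTailDurationSec: number;
}

// Returns true only when the tick should call v4FinalizeTimeout():
// no speech detected AND trailing silence has reached the flush window.
function shouldTimeoutFlush(
  summary: VadSummarySketch,
  v4SilenceFlushSec: number,
): boolean {
  if (summary.hasSpeech) return false; // keep streaming, no timeout flush
  return summary.silenceTailDurationSec >= v4SilenceFlushSec;
}
```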
Class diagram for buffer worker VAD summary types and client

classDiagram
class BufferWorkerClient {
-worker: Worker
-nextId: number
-pending: Map~number,PendingRequest~
+constructor(worker: Worker)
+sendRequest(type: BufferWorkerRequestType, payload: any) Promise~any~
+hasSpeech(layer: string, startSample: number, endSample: number, threshold: number) Promise~HasSpeechResult~
+getVadSummary(startSample: number, endSample: number, energyThreshold: number, inferenceThreshold: number, requireInference: boolean) Promise~VadSummaryResult~
+getSilenceTailDuration(layer: string, threshold: number) Promise~number~
+queryRange(startSample: number, endSample: number, layer: string) Promise~RangeResult~
}
class BufferWorker {
-layers: BufferLayers
-config: BufferWorkerConfig
+onmessage(event: MessageEvent)
+handleHasSpeech(id: number, query: HasSpeechQuery) void
+handleGetVadSummary(id: number, query: VadSummaryQuery) void
+handleGetSilenceTail(id: number, query: SilenceTailQuery) void
+handleQueryRange(id: number, query: RangeQuery) void
+respond(response: BufferWorkerResponse) void
+emptyHasSpeech() HasSpeechResult
}
class HasSpeechQuery {
+layer: string
+startSample: number
+endSample: number
+threshold: number
}
class VadSummaryQuery {
+startSample: number
+endSample: number
+energyThreshold: number
+inferenceThreshold: number
+requireInference: boolean
}
class SilenceTailQuery {
+layer: string
+threshold: number
}
class HasSpeechResult {
+hasSpeech: boolean
+maxProb: number
+entriesChecked: number
}
class VadSummaryResult {
+energy: HasSpeechResult
+inference: HasSpeechResult~nullable~
+hasSpeech: boolean
+silenceTailDurationSec: number
}
class BufferWorkerRequest {
<<union>>
+type: BufferWorkerRequestType
+id: number
+payload: any
}
class BufferWorkerResponse {
<<union>>
+type: BufferWorkerResponseType
+id: number
+payload: any
}
BufferWorkerClient --> BufferWorkerRequest : sends
BufferWorker --> BufferWorkerResponse : produces
BufferWorkerClient ..> BufferWorkerResponse : awaits
BufferWorker ..> HasSpeechQuery : uses
BufferWorker ..> VadSummaryQuery : uses
BufferWorker ..> SilenceTailQuery : uses
BufferWorker ..> HasSpeechResult : uses
BufferWorker ..> VadSummaryResult : uses
VadSummaryResult o-- HasSpeechResult : energy
VadSummaryResult o-- HasSpeechResult : inference
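The `hasSpeech` field on `VadSummaryResult` combines the two layers shown above. A minimal sketch of that gating, assuming the energy-only vs. energy+inference AND-gate semantics described in the review comments (the worker's real handler also normalizes sample bounds):

```typescript
// Sketch of the energy/inference gate behind VadSummaryResult.hasSpeech.
// Field names match the class diagram; the combination logic is an
// assumption based on the documented semantics, not the worker source.
interface HasSpeechResultSketch {
  hasSpeech: boolean;
  maxProb: number;
  entriesChecked: number;
}

function combinedHasSpeech(
  energy: HasSpeechResultSketch,
  inference: HasSpeechResultSketch | null,
  requireInference: boolean,
): boolean {
  // energy-only mode: the inference layer is skipped and reported as null
  if (!requireInference || inference === null) return energy.hasSpeech;
  // AND gate: both layers must agree before the summary reports speech
  return energy.hasSpeech && inference.hasSpeech;
}
```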
Class diagram for LayeredBufferVisualizer lifecycle and cleanup guards

classDiagram
class LayeredBufferVisualizer {
+props: LayeredBufferVisualizerProps
-canvasRef: HTMLCanvasElement
-ctx: CanvasRenderingContext2D
-animationFrameId: number
-disposed: boolean
-cachedDpr: number
-resizeObserver: ResizeObserver
-dprMediaQuery: MediaQueryList
-dprChangeHandler: MediaQueryListListener
-cachedSpecData: CachedSpecData
+onMountHook() void
+onCleanupHook() void
-setupResizeObserver() void
-setupDprWatch() void
-updateCanvasDimensions(logicalW: number, logicalH: number) void
-loop(now: number) void
-fetchMelFeatures(fetchStartSample: number, fetchEndSample: number) void
}
class LayeredBufferVisualizerProps {
+audioEngine: AudioEngine
+melClient: MelClient
+windowDuration: number
+layers: string[]
}
class CachedSpecData {
+data: Float32Array
+startSample: number
+endSample: number
}
class ResizeObserver {
+observe(target: Element) void
+disconnect() void
}
class MediaQueryList {
+addEventListener(type: string, listener: MediaQueryListListener, options: any) void
+removeEventListener(type: string, listener: MediaQueryListListener) void
+matches: boolean
}
class MediaQueryListListener {
}
LayeredBufferVisualizer --> LayeredBufferVisualizerProps : uses
LayeredBufferVisualizer o-- CachedSpecData : caches
LayeredBufferVisualizer ..> ResizeObserver : uses
LayeredBufferVisualizer ..> MediaQueryList : uses
LayeredBufferVisualizer ..> MediaQueryListListener : uses
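The disposed-flag pattern behind `onCleanupHook()` can be isolated from Solid and the DOM. This is a hedged, framework-agnostic sketch, with `schedule` standing in for `requestAnimationFrame` (the real component additionally disconnects a ResizeObserver and removes the DPR listener):

```typescript
// Framework-agnostic sketch of the guarded render loop and cleanup.
// `schedule` abstracts requestAnimationFrame so the pattern is testable.
type Cancel = () => void;

function startGuardedLoop(
  schedule: (cb: () => void) => Cancel,
  onFrame: () => void,
): Cancel {
  let disposed = false;
  let cancelFrame: Cancel | null = null;

  const loop = () => {
    if (disposed) return; // guard: no drawing after cleanup
    onFrame();
    cancelFrame = schedule(loop); // re-arm for the next frame
  };
  cancelFrame = schedule(loop);

  // cleanup: mirrors onCleanupHook() — set the flag, cancel pending work
  return () => {
    disposed = true;
    cancelFrame?.();
  };
}
```

The same two-step shutdown (flag first, cancel second) is what makes async callbacks such as the mel-feature `.then()` safe: even a callback that already escaped cancellation bails out on the flag.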
File-Level Changes
Assessment against linked issues
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/lib/buffer/BufferWorkerClient.ts:146-152` </location>
<code_context>
+ * - requireInference=false => energy only
+ * - silence tail always computed from energy layer
+ */
+ async getVadSummary(params: {
+ startSample: number;
+ endSample: number;
+ energyThreshold: number;
+ inferenceThreshold: number;
+ requireInference: boolean;
+ }): Promise<VadSummaryResult> {
+ return this.sendRequest('GET_VAD_SUMMARY', params);
+ }
</code_context>
<issue_to_address>
**suggestion:** Use the shared VadSummaryQuery type for the client method signature to prevent drift.
`getVadSummary` redefines the params inline instead of using `VadSummaryQuery` from `types.ts`, duplicating the contract and risking drift if the query shape changes. Please type it as `getVadSummary(params: VadSummaryQuery)` so the client and worker stay aligned and future changes only need to be made in one place.
Suggested implementation:
```typescript
async getVadSummary(params: VadSummaryQuery): Promise<VadSummaryResult> {
```
You’ll also need to import `VadSummaryQuery` from your shared `types.ts` file in `src/lib/buffer/BufferWorkerClient.ts`. For example, if you currently have something like:
- `import type { VadSummaryResult } from '../types';`
update it to:
- `import type { VadSummaryResult, VadSummaryQuery } from '../types';`
Adjust the relative path (`'../types'` vs `'./types'`, etc.) to match the existing import pattern in this file.
</issue_to_address>
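Assembled, the suggested change might look like the sketch below. The interfaces and the `sendRequest` stub are local stand-ins (in the real client they would come from `types.ts` and the worker round-trip) so the shape can be checked in isolation:

```typescript
// Local stand-ins for the shared types in src/lib/buffer/types.ts;
// in the real client these would be imported, not redeclared.
interface VadSummaryQuery {
  startSample: number;
  endSample: number;
  energyThreshold: number;
  inferenceThreshold: number;
  requireInference: boolean;
}
interface VadSummaryResult {
  hasSpeech: boolean;
  silenceTailDurationSec: number;
}

class BufferWorkerClientSketch {
  // stub for the real worker round-trip; echoes the request for inspection
  private async sendRequest(type: string, payload: unknown): Promise<any> {
    return { type, payload, hasSpeech: false, silenceTailDurationSec: 0 };
  }

  // The client method now shares VadSummaryQuery with the worker,
  // so the contract lives in one place.
  async getVadSummary(params: VadSummaryQuery): Promise<VadSummaryResult> {
    return this.sendRequest('GET_VAD_SUMMARY', params);
  }
}
```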
Actionable comments posted: 3
📜 Review details
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
src/App.tsx
src/components/LayeredBufferVisualizer.tsx
src/components/TranscriptionDisplay.tsx
src/lib/buffer/BufferWorkerClient.ts
src/lib/buffer/buffer.worker.test.ts
src/lib/buffer/buffer.worker.ts
src/lib/buffer/types.ts
traces/notes/TOMORROW_OPTIMIZATIONS.md
🧰 Additional context used
🧬 Code graph analysis (3)
src/lib/buffer/buffer.worker.test.ts (1)
src/lib/buffer/BufferWorkerClient.ts (1)
sendRequest(208-214)
src/App.tsx (2)
src/lib/buffer/BufferWorkerClient.ts (1)
hasSpeech(130-137)
src/stores/appStore.ts (1)
appStore(350-350)
src/lib/buffer/buffer.worker.ts (2)
src/lib/buffer/types.ts (3)
HasSpeechResult(120-124), VadSummaryQuery(85-96), VadSummaryResult(126-131)
src/lib/buffer/BufferWorkerClient.ts (1)
hasSpeech(130-137)
🪛 LanguageTool
traces/notes/TOMORROW_OPTIMIZATIONS.md
[uncategorized] ~21-~21: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... transcript segments re-render. - Avoid full text reconciliation when only immature tail ...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
🪛 markdownlint-cli2 (0.21.0)
traces/notes/TOMORROW_OPTIMIZATIONS.md
[warning] 14-14: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
[warning] 19-19: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
[warning] 23-23: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
[warning] 27-27: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
[warning] 30-30: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
[warning] 34-34: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 39-39: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 47-47: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 60-60: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 66-66: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 70-70: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🔇 Additional comments (14)
src/components/TranscriptionDisplay.tsx (1)
153-169: LGTM — clean reactive guard for V4 live tab.

The early return of `''` when on the Live tab avoids the expensive `.map().filter().join()` rebuild, and since `renderFullTextContent()` is only rendered inside the merged-tab `<Show>` guard (line 288), there's no risk of displaying stale data. Switching tabs correctly triggers recomputation via the `activeTab()` dependency.

src/lib/buffer/buffer.worker.test.ts (2)
216-256: LGTM — good coverage of energy-only vs. energy+inference semantics.

The test correctly validates: (1) energy-only mode returns `inference: null` and derives `hasSpeech` from energy alone, and (2) inference-required mode applies the AND gate (`energy && inference`), confirming that high energy + low inference yields `hasSpeech: false`.
258-282: LGTM — silence tail computation test is correct.

The write head is at entry 3 (after three writes), two trailing entries are below threshold (0.1), giving `(2 * 1280) / 16000 ≈ 0.16s`. This validates that `silenceTailDurationSec` is computed globally from the write head, independent of the query's sample range.

src/components/LayeredBufferVisualizer.tsx (3)
63-63: LGTM — proper lifecycle safeguards with `disposed` flag and cleanup.

The `disposed` flag is correctly reset on mount and set on cleanup, and `onCleanup` cancels pending RAF, disconnects the ResizeObserver, and removes the DPR listener. This prevents any post-unmount work.

Also applies to: 126-126, 172-184
85-85: LGTM — DPR listener cleanup addresses Issue #131.

Tracking `dprChangeHandler` and explicitly removing it on cleanup (and before re-registration in `setupDprWatch`) ensures no stale `{ once: true }` listeners survive unmount. The `if (disposed) return` guard in `onDprChange` adds belt-and-suspenders safety.

Also applies to: 142-156
186-191: LGTM — disposed guards on animation loop and async callbacks.

Early return in `loop()` and the `.then()` callback for mel features ensures no drawing or state mutation happens after unmount.

Also applies to: 240-241
src/lib/buffer/types.ts (1)
84-131: LGTM — well-structured types for the consolidated VAD summary.
`VadSummaryQuery` and `VadSummaryResult` cleanly model the energy-only vs. energy+inference semantics. Pre-computing `hasSpeech` on the worker side keeps the main-thread consumer simple.

src/lib/buffer/BufferWorkerClient.ts (1)
139-154: LGTM — follows existing async query pattern.

The new `getVadSummary` method mirrors the established `hasSpeech()` / `getSilenceTailDuration()` pattern. Consider importing and reusing `VadSummaryQuery` for the parameter type instead of the inline object to keep the contract in one place, but this is a minor style nit.

src/lib/buffer/buffer.worker.ts (2)
337-339: LGTM — clean helper for default `HasSpeechResult`.
341-385: LGTM — `handleGetVadSummary` is well-structured.

Defensive `min`/`max` normalization of sample bounds (lines 354-355), correct AND-gate semantics for `requireInference`, and energy-based silence tail computation all match the documented contract. The uninitialized-state early return (lines 342-352) mirrors the pattern used by existing handlers.

src/App.tsx (4)
343-351: LGTM — clean consolidation of VAD polling into a single worker round-trip.
`requireInference` correctly keys off `tenVADClient?.isReady()`, preserving the energy-only fallback when TEN-VAD hasn't initialized. The pre-computed `vadSummary.hasSpeech` keeps the main-thread logic simple.
386-397: LGTM — batched flush updates reduce reactive churn.

Wrapping the finalization state mutations in `batch()` prevents intermediate renders for each individual signal update. The `windowBuilder.advanceMatureCursorByTime` call correctly stays outside the batch since it's not a reactive operation.
453-473: LGTM — explicit cache decision paths with telemetry.

The three-way branch (non-positive prefix / outside window / enabled) clearly maps each cache decision to a telemetry bucket, making it straightforward to diagnose cache bypass rates.
487-526: Correct batching of inference-result UI updates.

Computing `shouldAdvanceCursor` and capturing `ring` before the `batch()` block ensures stable reads. Window builder mutations (lines 523-525) are correctly performed outside the reactive batch. The `batch()` consolidation should meaningfully reduce per-tick re-renders.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/App.tsx`:
- Around line 52-74: The v4CacheTelemetry counters (v4CacheTelemetry and tracked
via trackV4CacheTelemetry) are module-level and persist across stop→start; to
make telemetry per-session, reset the counters in cleanupV4Pipeline(): inside
cleanupV4Pipeline() set v4CacheTelemetry.enabled = 0,
v4CacheTelemetry.bypassNonPositivePrefix = 0, and
v4CacheTelemetry.bypassOutsideWindow = 0 (or reassign a fresh object) so counts
start at zero for each new pipeline session.
- Around line 514-519: The code currently ignores result.metrics and always
writes zeros; replace the hardcoded values in the appStore.setSystemMetrics call
with real values from result.metrics (e.g., use result.metrics.throughput and
result.metrics.modelConfidence or the actual property names in the transcriber
output) and provide safe fallbacks (e.g., || 0) so you don’t crash if a field is
missing, or alternatively remove the if (result.metrics) branch and always call
appStore.setSystemMetrics with explicit zeros if you intend initialization
instead of using real metrics; update the call site that invokes
appStore.setSystemMetrics to use those extracted values.
In `@traces/notes/TOMORROW_OPTIMIZATIONS.md`:
- Line 64: Update the sentence in TOMORROW_OPTIMIZATIONS.md that currently reads
"Deferred due higher risk around transferable ownership and cross-worker
lifecycle coordination." to "Deferred due to higher risk around transferable
ownership and cross-worker lifecycle coordination." — locate the exact line
matching that phrase and insert the missing "to" after "due" so the grammar is
correct.
if (result.metrics) {
  appStore.setSystemMetrics({
    throughput: 0,
    modelConfidence: 0,
  });
}
🧩 Analysis chain
🏁 Script executed:
# Find V4ProcessResult definition and understand metrics structure
find . -type f -name "*.ts" -o -name "*.tsx" | head -20
rg -n "interface V4ProcessResult" -A 15 --type=ts

Repository: ysdede/keet
Length of output: 1998
🏁 Script executed:
# Search for metrics field usage with throughput/modelConfidence
rg -n "metrics.*throughput\|throughput.*metrics" --type=ts --type=tsx
rg -n "\.metrics" --type=ts --type=tsx | grep -E "(throughput|modelConfidence|setSystemMetrics)" -C 2

Repository: ysdede/keet
Length of output: 110
🏁 Script executed:
# Look at the specific code section in App.tsx
sed -n '510,525p' src/App.tsx

Repository: ysdede/keet
Length of output: 584
🏁 Script executed:
# Search for where result.metrics is populated/assigned
rg -n "result\.metrics\s*=" --type=ts --type=tsx -C 3

Repository: ysdede/keet
Length of output: 82
🏁 Script executed:
# Search for where metrics is assigned/populated in worker or results
rg -n "metrics\s*=" src/lib/transcription --type=ts -C 3

Repository: ysdede/keet
Length of output: 37
🏁 Script executed:
# Search for throughput and modelConfidence usage in entire codebase
rg -n "throughput|modelConfidence" --type=ts -C 2

Repository: ysdede/keet
Length of output: 945
🏁 Script executed:
# Look for the transcription worker implementation to see where result object is created
rg -n "return.*{" src/lib/transcription/TranscriptionWorkerClient.ts -A 5 | head -40

Repository: ysdede/keet
Length of output: 136
🏁 Script executed:
# Check if metrics are ever actually extracted from result.metrics
rg -n "result\.metrics\." --type=ts -C 3

Repository: ysdede/keet
Length of output: 37
🏁 Script executed:
# Search for where the result object is constructed/returned from transcription
rg -n "matureText\|immatureText" src/lib/transcription --type=ts -B 5 | head -60

Repository: ysdede/keet
Length of output: 37
🏁 Script executed:
# Look for metrics calculation or population
rg -n "metrics.*:" src/lib/transcription --type=ts -C 5

Repository: ysdede/keet
Length of output: 2217
🏁 Script executed:
# Check entire App.tsx context around metrics usage
sed -n '500,530p' src/App.tsx

Repository: ysdede/keet
Length of output: 1074
🏁 Script executed:
# Search for any other setSystemMetrics calls to see the pattern
rg -n "setSystemMetrics" src --type=ts -C 2

Repository: ysdede/keet
Length of output: 836
Extract actual throughput and modelConfidence from result.metrics instead of hardcoding zeros, or remove the conditional check if metrics are not yet computed.
The code checks for result.metrics presence (which contains raw ASR metrics from the transcriber) but then ignores its contents and always sets both values to 0. Either wire the actual values from result.metrics if they're available, or remove the condition and always initialize with zeros to clarify the intent.
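One way to wire the real values with safe fallbacks is sketched below. The property names on the metrics object are assumptions and should be checked against the transcriber output before adopting them:

```typescript
// Hypothetical shape of result.metrics; the actual field names in the
// transcriber output may differ and must be verified first.
interface AsrMetricsSketch {
  throughput?: number;
  modelConfidence?: number;
}

function toSystemMetrics(metrics: AsrMetricsSketch | null | undefined) {
  // ?? 0 keeps the call safe when a field (or the whole object) is missing
  return {
    throughput: metrics?.throughput ?? 0,
    modelConfidence: metrics?.modelConfidence ?? 0,
  };
}
```

With a helper like this, the `if (result.metrics)` branch becomes unnecessary: `appStore.setSystemMetrics(toSystemMetrics(result.metrics))` covers both the populated and the missing case.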
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/App.tsx` around lines 514 - 519, The code currently ignores
result.metrics and always writes zeros; replace the hardcoded values in the
appStore.setSystemMetrics call with real values from result.metrics (e.g., use
result.metrics.throughput and result.metrics.modelConfidence or the actual
property names in the transcriber output) and provide safe fallbacks (e.g., ||
0) so you don’t crash if a field is missing, or alternatively remove the if
(result.metrics) branch and always call appStore.setSystemMetrics with explicit
zeros if you intend initialization instead of using real metrics; update the
call site that invokes appStore.setSystemMetrics to use those extracted values.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b0502ac6a8
const silenceDuration = vadSummary.silenceTailDurationSec;
if (silenceDuration >= appStore.v4SilenceFlushSec()) {
Re-query silence tail immediately before timeout flush
silenceDuration is taken from the earlier GET_VAD_SUMMARY snapshot, but VAD writes keep arriving asynchronously while v4Tick awaits worker responses. If speech resumes after that summary is computed and before v4FinalizeTimeout() runs, this branch can still flush based on stale silence data and prematurely finalize an active utterance. The prior two-call flow re-checked silence tail right before flushing, so this change widens that race window.
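A hedged sketch of the narrower-race flow Codex suggests: take a fresh silence-tail reading immediately before flushing instead of trusting the earlier summary. `getSilenceTail` and `finalize` are stand-ins for the real async worker calls, and this narrows the race window rather than eliminating it:

```typescript
// Re-check silence right before the flush; if speech resumed between the
// summary snapshot and now, skip finalization. Names are illustrative.
async function flushIfStillSilent(
  getSilenceTail: () => Promise<number>,
  flushThresholdSec: number,
  finalize: () => Promise<void>,
): Promise<boolean> {
  const tail = await getSilenceTail(); // fresh snapshot, not the earlier summary
  if (tail < flushThresholdSec) return false; // speech resumed; skip the flush
  await finalize();
  return true;
}
```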
Review-bot follow-up complete. Applied (safe + low-risk):
Skipped (intentionally):
Verified:
All checks above passed on this branch after the updates. |
…0260218 perf: conservative round 2 optimizations (buffer/ui/debug) 66b6af1
Summary
Implements conservative Performance Round 2 optimizations from
`traces/notes/TOMORROW_OPTIMIZATIONS.md` with low-risk, incremental changes and no merger-logic modifications.

What Changed
- Added a consolidated worker request (`GET_VAD_SUMMARY`) and switched `v4Tick` from multiple worker calls to a single summary call.
- When inference is required, speech detection is gated on `energy && inference`.
- Batched reactive transcript updates in `App.tsx`.
- Skipped the merged-text rebuild while the `Live` tab is active.
- Hardened cleanup in `LayeredBufferVisualizer`.
- Updated `traces/notes/TOMORROW_OPTIMIZATIONS.md` with DONE/SKIPPED/TODO for this round.

Scope Decisions
- Kept to the conservative items in `TOMORROW_OPTIMIZATIONS.md`.
- No merger-logic changes (`UtteranceBasedMerger` semantics unchanged).

Validation
- npm test -- src/lib/transcription/WindowBuilder.test.ts src/lib/transcription/UtteranceBasedMerger.test.ts src/lib/transcription/UtteranceBasedMerger.regression.test.ts
- npm test -- src/lib/buffer/buffer.worker.test.ts
- npm test -- src/stores/appStore.test.ts
- npm test -- src/lib/vad/tenvad.worker.test.ts
- npm test
- npm run build
Closes #131
Related: #116, #73, #27
Related PRs: #141, #138, #92
Summary by Sourcery
Optimize v4 streaming transcription performance and debug UI behavior with conservative, non-semantic changes.
New Features:
Enhancements:
Documentation:
Tests:
Summary by CodeRabbit
New Features
Performance
Bug Fixes
Tests