
Optimize streaming pipeline to cut overhead and improve profiling#142

Merged

ysdede merged 6 commits into master from optimizations1802 on Feb 18, 2026

Conversation

@ysdede (Owner) commented Feb 18, 2026

Summary:
Improves the streaming transcription runtime by reducing worker/message overhead, cutting UI/render churn, and adding profiling tooling for Chrome traces and heap captures. Also restores streaming merge correctness and adds regression tests to prevent semantic regressions while optimizing.

Changes:

  • Reduce hot-path overhead by disabling heavy logging/profiling in v4/mel paths
  • Reduce UI/visualization churn with throttled waveform drawing and visibility-aware update cadence
  • Optimize BufferWorker copy paths and add coverage for query/range behavior
  • Use Transferables in TranscriptionWorkerClient to reduce structured-clone postMessage overhead
  • Restore streaming merge semantics (cursor-anchored windowing) and add cache-prefix safety guard
  • Add regression protection: UtteranceBasedMerger.regression.test.ts and WindowBuilder tests
  • Add tooling: metrics/trace_ultimate.py, metrics/analyze_chrome_trace.py, metrics/memory/analyze_heap_memory.py

Performance impact:

  • Before: Not provided (measured locally only)
  • After: Not provided (measured locally only)
  • Notes: Intended effects are lower worker-message overhead, less UI churn, and better memory/GC visibility via trace/heap analyzers.

Risk & compatibility:

  • Worker messaging changes (Transferables) can surface ownership/lifetime assumptions for buffers
  • Window/caching guardrails are correctness-sensitive; intended to preserve transcript semantics and avoid invalid cache reuse
  • Intentionally unchanged: transcript merge semantics (explicitly restored + regression tests)

How to test:

  1. npm install
  2. npm test
  3. npm run build
  4. Optional (trace): python metrics/trace_ultimate.py <trace.json> or python metrics/analyze_chrome_trace.py <trace.json>
  5. Optional (heap): python metrics/memory/analyze_heap_memory.py --inputs "traces/heaps/*" --report-md traces/heaps/memory_report_pass1.md --report-json traces/heaps/memory_report_pass1.json --top-n 25 --pairwise --strict-validate

Checklist:

  • Tests added/updated
  • Existing tests pass
  • Benchmarks captured (if applicable)
  • Docs/comments updated (if applicable)

Notes:
Review focus: worker transferables, BufferWorker copy/retention behavior, and the UtteranceBasedMerger regression suite locking in transcript semantics.

Closes #110
Closes #113
Related: #117 (partially addressed)
Related: #73 (partially addressed)
Related: #105, #92, #141

ysdede and others added 5 commits February 18, 2026 01:14
- Updated `TranscriptionWorkerClient.ts` to support transferring `ArrayBuffer` ownership in `sendRequest`.
- Modified `processChunk`, `processV3Chunk`, `processV3ChunkWithFeatures`, `transcribeSegment`, and `processV4ChunkWithFeatures` to transfer audio and feature buffers instead of copying them.
- This reduces main thread blocking during high-frequency worker communication, addressing performance issues noted in `issue-4-worker-message-overhead.md`.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Add metrics/trace_ultimate.py with ranked bottlenecks, KPI extraction, worker role mapping, render-loop and AudioWorklet cadence stats, and baseline compare mode.

Keep metrics/analyze_chrome_trace.py as a backward-compatible wrapper and document the new CLI in metrics/README.md.
Disable heavy v4/mel hot-path logs, turn off v4 profiling output, and default WindowBuilder debug off.

Reduce UI/visualization churn with throttled waveform drawing and visibility-aware visualization update cadence.

Optimize BufferWorker copy paths with typed-array set/subarray, shorten audio/mel retention, add scalar/vector QUERY_RANGE coverage, and add start-hint clamping in WindowBuilder with tests.

Also lower v4 max window duration to 12s and pass start hints from App to prevent stale-window growth.
… prefix

Root cause:
- Commit 621d749 introduced start-hint-driven window starts (App -> WindowBuilder)
  that could move the v4 inference window ahead of the mature cursor.
- This reduced/removed the intended retranscription overlap used by sentence
  formation and overlap conflict resolution.
- The same change could also send an incremental decoder prefix derived from a
  start point no longer aligned to the cursor, creating invalid cache reuse in
  some cycles.

Fix:
- Revert semantic window-start behavior to mature-cursor anchoring:
  - App now calls WindowBuilder.buildWindow() without a start hint.
  - WindowBuilder no longer accepts/applies startHintFrame.
- Restore v4 maxDurationSec from 12.0 back to 30.0 to match known-good
  sentence-building behavior under long overlap/retranscription conditions.
- Add an incremental cache safety guard in App:
  - prefixSeconds is only used when it is > 0 and strictly smaller than the
    current window duration.
  - otherwise incrementalCache is disabled for that cycle.
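
A minimal sketch of that guard in TypeScript (the field names and the 16 kHz frame rate follow the sequence diagram later in this thread; the function shape itself is an assumption, not the actual App.tsx code):

```typescript
interface InferenceWindow {
    startFrame: number;
    endFrame: number;
    durationSeconds: number;
}

// Only reuse the decoder prefix when it lies strictly inside the current
// window; otherwise drop the incremental cache for this cycle.
function guardPrefix(win: InferenceWindow, cursorFrame: number, cache?: unknown) {
    const rawPrefixSeconds = (win.startFrame - cursorFrame) / 16000; // 16 kHz frames
    const valid = rawPrefixSeconds > 0 && rawPrefixSeconds < win.durationSeconds;
    return {
        prefixSeconds: valid ? rawPrefixSeconds : 0,
        incrementalCache: valid ? cache : undefined, // disabled for this cycle
    };
}
```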

Regression protection tests:
- WindowBuilder:
  - verify window start remains anchored to mature cursor while head grows.
- New fixture suite for UtteranceBasedMerger:
  - partial-to-full sentence promotion
  - overlap/conflict resolution stability across retranscription
  - punctuation/case refinement across overlapping windows
  - repeated words + delayed token correction behavior
  - short-pause (no flush) and long-pause (flush) behavior

Files changed:
- src/App.tsx
- src/lib/transcription/WindowBuilder.ts
- src/lib/transcription/WindowBuilder.test.ts
- src/lib/transcription/UtteranceBasedMerger.regression.test.ts

Validation:
- npm test -- src/lib/transcription/WindowBuilder.test.ts \
            src/lib/transcription/UtteranceBasedMerger.test.ts \
            src/lib/transcription/UtteranceBasedMerger.regression.test.ts
  => 3 files, 28 tests passed
- npm test
  => 18 files, 180 tests passed
- npm run build
  => production build passed

Notes:
- Kept non-semantic performance optimizations from earlier commits.
- This commit is intentionally surgical: restore correctness first, preserve
  safe perf wins that do not alter merge output semantics.
sourcery-ai bot (Contributor) commented Feb 18, 2026

Reviewer's Guide

Optimizes the streaming transcription pipeline by reducing logging and UI churn, improving buffer/worker data paths with Transferables, restoring cursor-anchored window/merge semantics, and adding robust trace/heap profiling tooling plus regression tests for streaming merge behavior and buffer correctness.

Sequence diagram for v4 streaming tick with windowing and Transferables

```mermaid
sequenceDiagram
    participant App
    participant AudioEngine
    participant BufferWorker
    participant WindowBuilder
    participant TranscriptionWorkerClient
    participant TranscriptionWorker

    App->>AudioEngine: v4Tick()
    AudioEngine-->>App: ringBuffer

    App->>WindowBuilder: buildWindow()
    WindowBuilder-->>App: Window(startFrame,endFrame,durationSeconds,isInitial)

    alt no_window
        App-->>App: shouldLogV4Tick()
        App-->>App: log window_wait_state (throttled)
        App-->>App: return
    else has_window
        App-->>App: shouldLogV4Tick()
        App-->>App: log window_range (throttled)

        App->>WindowBuilder: getMatureCursorFrame()
        WindowBuilder-->>App: cursorFrame

        App-->>App: rawPrefixSeconds=(window.startFrame-cursorFrame)/16000
        App-->>App: guard prefixSeconds
        Note over App: prefixSeconds>0 && prefixSeconds<window.durationSeconds ? rawPrefixSeconds : 0

        App->>TranscriptionWorkerClient: processV4ChunkWithFeatures(features,T,melBins,startTime,overlapSeconds,prefixSeconds,incrementalCache)
        Note right of TranscriptionWorkerClient: features.buffer transferred via postMessage
        TranscriptionWorkerClient->>TranscriptionWorker: postMessage(type=PROCESS_V4_CHUNK_WITH_FEATURES,payload,transfer=[features.buffer])

        TranscriptionWorker-->>TranscriptionWorkerClient: V4ProcessResult(matureText,immatureText,incrementalCache)
        TranscriptionWorkerClient-->>App: V4ProcessResult

        App-->>App: update matureText,immatureText,segments
    end
```

Class diagram for streaming buffer, audio engine, and transcription worker client

```mermaid
classDiagram
    class CircularLayer {
        -Float32Array buffer
        -int entryDimension
        -int maxEntries
        -int globalWriteIndex
        +write(data)
        +writeBatch(data)
        +queryRange(startIndex,count) Float32Array
    }

    class BufferWorker {
        +layers
        +handleWrite(layer,data,globalSampleOffset)
        +handleWriteBatch(layer,data,globalSampleOffset)
        +handleQueryRange(startSample,endSample,layers)
    }

    class AudioEngine {
        -Array~function~ visualizationCallbacks
        -number lastVisualizationNotifyTime
        -number VISUALIZATION_NOTIFY_INTERVAL_MS
        -number VISUALIZATION_NOTIFY_HIDDEN_INTERVAL_MS
        -boolean HOT_PATH_DEBUG_LOGS
        +notifyVisualizationUpdate()
        +updateEnergyVad(energy)
        +updateTenVad(stateInfo)
        +notifySegment(audioSegment)
        +getRingBuffer()
    }

    class WaveformComponent {
        -HTMLCanvasElement canvasRef
        -CanvasRenderingContext2D ctx
        -number animationId
        -ResizeObserver resizeObserver
        -number lastDrawTs
        -number lastStyleRefreshTs
        -string bgColor
        -string strokeColor
        +animate(ts)
        +refreshThemeColors()
        +updateCanvasSize()
    }

    class WindowBuilder {
        -IRingBuffer ring
        -number matureCursorFrame
        +buildWindow() Window
        +getMatureCursorFrame() number
        +markSentenceEnd(frame)
        +advanceMatureCursor(frame)
    }

    class UtteranceBasedMerger {
        -boolean useNLP
        -boolean debug
        +processASRResult(asrResult) MergerResult
        +finalizePendingSentenceByTimeout() MergerResult
    }

    class TranscriptionWorkerClient {
        -Worker worker
        -number messageId
        -Map pendingPromises
        +sendRequest(type,payload,transfer) Promise
        +processChunk(audio) Promise~TranscriptionResult~
        +processV3Chunk(audio,startTime) Promise~TokenStreamResult~
        +processV3ChunkWithFeatures(features,T,melBins,startTime,overlapSeconds) Promise~TokenStreamResult~
        +transcribeSegment(audio) Promise~TranscriptionResult~
        +processV4ChunkWithFeatures(features,T,melBins,startTime,overlapSeconds,segmentId,incrementalCache) Promise~V4ProcessResult~
    }

    class TranscriptionWorker {
        +onmessage(event)
        +handleProcessV4ChunkWithFeatures(payload)
    }

    BufferWorker --> CircularLayer : aggregates
    AudioEngine ..> BufferWorker : writes_layers
    AudioEngine ..> WaveformComponent : visualization_data
    WindowBuilder ..> AudioEngine : uses_IRingBuffer
    UtteranceBasedMerger ..> WindowBuilder : uses_windows
    TranscriptionWorkerClient ..> TranscriptionWorker : postMessage
    TranscriptionWorkerClient ..> AudioEngine : consumes_audio
    BufferWorker ..> TranscriptionWorkerClient : provides_features
```

Flow diagram for waveform rendering throttling and theme caching

```mermaid
flowchart TD
    A_start[Waveform_effect_start] --> B_setup
    B_setup[init_canvas_and_ctx] --> C_refreshThemeColors
    C_refreshThemeColors --> D_startAnimation

    D_startAnimation --> E_requestAnimationFrame
    E_requestAnimationFrame --> F_animate

    F_animate --> G_checkCtx
    G_checkCtx -->|missing| E_requestAnimationFrame
    G_checkCtx -->|ok| H_computeVisibility

    H_computeVisibility --> I_setMinFrameInterval
    I_setMinFrameInterval --> J_checkThrottle

    J_checkThrottle -->|ts - lastDrawTs < minInterval| E_requestAnimationFrame
    J_checkThrottle -->|ts - lastDrawTs >= minInterval| K_maybeRefreshTheme

    K_maybeRefreshTheme -->|ts - lastStyleRefreshTs > 1000ms| C_refreshThemeColors
    K_maybeRefreshTheme --> L_drawBackground

    L_drawBackground --> M_drawWaveform
    M_drawWaveform --> N_updateTimestamps
    N_updateTimestamps --> E_requestAnimationFrame

    subgraph State
        lastDrawTs
        lastStyleRefreshTs
        bgColor
        strokeColor
    end
```

File-Level Changes

Change Details Files
Introduce trace_ultimate as the primary Chrome trace analyzer and make analyze_chrome_trace a thin, backward-compatible wrapper.
  • Replace legacy analyze_chrome_trace implementation with a wrapper that delegates to trace_ultimate while preserving default input/output behavior and adding optional CLI flags.
  • Implement trace_ultimate.py to load Chrome traces, attribute work to renderer/threads/workers, compute GC and render-loop stats, derive KPIs, rank bottlenecks, and optionally compare against a baseline run.
  • Add Markdown report generation helpers and a README documenting usage and CLI for the new trace tooling.
metrics/analyze_chrome_trace.py
metrics/trace_ultimate.py
metrics/README.md
Add a heap memory analyzer to reason about growth, detached DOM, and typed-array usage across multiple heap snapshots/timelines.
  • Discover and load .heapsnapshot/.heaptimeline files, validate them, and build a graph model with node/edge metadata, dominators, and retained sizes.
  • Aggregate counts/bytes by constructor/type and thematic buckets (e.g., buffers, DOM, worker/message) and compute pairwise deltas between captures.
  • Generate ranked findings with confidence levels and write both JSON and Markdown reports summarizing risks, checks, and next validation steps.
metrics/memory/analyze_heap_memory.py
Optimize BufferWorker circular layer read/write paths and add tests to lock in scalar/vector range behavior.
  • Special-case scalar layers and use typed-array subarray/set operations instead of per-element loops in write, writeBatch, and readRange.
  • Ensure entryDimension==1 paths avoid unnecessary work while preserving semantics for vector layers.
  • Add tests that write scalar and vector data into layers and assert QUERY_RANGE returns the correct entryDimension, entryCount, and data values.
src/lib/buffer/buffer.worker.ts
src/lib/buffer/buffer.worker.test.ts
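
As a rough illustration of the copy-path change above (a sketch using the field names from the class diagram; not the actual buffer.worker.ts code):

```typescript
class CircularLayer {
    constructor(
        private buffer: Float32Array,
        private entryDimension: number,
        private maxEntries: number,
        private globalWriteIndex = 0,
    ) {}

    // Vectorized circular write: copy a whole entry with set/subarray instead
    // of a per-element loop; scalar layers take a direct-assignment fast path.
    write(data: Float32Array): void {
        const writePos = (this.globalWriteIndex % this.maxEntries) * this.entryDimension;
        if (this.entryDimension === 1) {
            this.buffer[writePos] = data[0] ?? 0; // scalar fast path
        } else {
            this.buffer.set(data.subarray(0, this.entryDimension), writePos);
        }
        this.globalWriteIndex++;
    }
}
```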
Stabilize WindowBuilder cursor-anchored windowing semantics and regression-test mature cursor behavior as the ring buffer advances.
  • Introduce a mutable mock IRingBuffer that advances currentFrame to simulate streaming audio while keeping baseFrame constant.
  • Add a regression test ensuring that once the mature cursor is advanced, successive windows keep startFrame anchored to that cursor while the head grows.
  • Keep existing tests for initial window and sentence bookkeeping intact while extending coverage for moving-head scenarios.
src/lib/transcription/WindowBuilder.test.ts
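
A condensed, vitest-style sketch of the anchoring test (the ring-buffer mock's method names are assumptions; the frame arithmetic mirrors the test described in the review further down):

```typescript
import { expect, it } from 'vitest';
import { WindowBuilder } from './WindowBuilder';
import { sampleRate } from '../audio/audioParams';

it('keeps window start anchored to the mature cursor as the head grows', () => {
    // Mutable mock ring buffer: fixed base frame, advancing head frame.
    const frameRef = { value: sampleRate * 10 };
    const ring: any = {
        getBaseFrame: () => 0,
        getCurrentFrame: () => frameRef.value,
    };

    const builder = new WindowBuilder(ring);
    builder.advanceMatureCursor(sampleRate * 2);

    const first = builder.buildWindow()!;
    frameRef.value = sampleRate * 12; // head grows by 2 s of audio
    const second = builder.buildWindow()!;

    expect(second.startFrame).toBe(first.startFrame); // still cursor-anchored
    expect(second.endFrame).toBeGreaterThan(first.endFrame);
});
```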
Throttle waveform drawing and visualization updates based on tab visibility and recording state to reduce UI churn.
  • Cache waveform canvas theme colors and refresh them at a low cadence instead of querying getComputedStyle on every frame.
  • Use a timestamp-aware animation loop to enforce different minimum frame intervals for foreground-recording, idle, and hidden-tab states, and guard cancelAnimationFrame on cleanup.
  • Throttle AudioEngine visualization callbacks with separate intervals for visible vs hidden documents to reduce unnecessary updates.
src/components/Waveform.tsx
src/lib/audio/AudioEngine.ts
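
The throttling pattern, sketched in TypeScript (the 33 ms and 250 ms budgets follow the AudioEngine constants in this PR; the 100 ms idle budget and the stub state are illustrative assumptions):

```typescript
let isRecording = true; // stand-in for component state
let lastDrawTs = 0;

function draw(): void {
    // render waveform bars to the canvas (omitted)
}

// Timestamp-aware animation loop: each frame is skipped unless enough time
// has passed for the current visibility/recording state.
function animate(ts: number): void {
    const hidden = document.visibilityState !== 'visible';
    const minFrameInterval = hidden ? 250 : isRecording ? 33 : 100; // ms
    if (ts - lastDrawTs >= minFrameInterval) {
        draw();
        lastDrawTs = ts;
    }
    requestAnimationFrame(animate);
}

requestAnimationFrame(animate);
```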
Reduce hot-path logging/profiling overhead in mel and v4 paths while preserving optional debug capability.
  • Gate v4 tick/worker debug logs behind a single V4_TRACE_LOGS flag and a helper that controls frequency; disable verbose logging by default.
  • Introduce a MEL_WORKER_VERBOSE_LOGS flag and wrap mel worker init/pushAudio/getFeatures/reset logs so the hot path is quiet unless explicitly enabled.
  • Disable Utterance/decoder debug in App by turning off WindowBuilder debug mode in the v4 service configuration.
src/App.tsx
src/lib/audio/mel.worker.ts
src/lib/transcription/transcription.worker.ts
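
The gating pattern is simple; a sketch with the flag and helper names used in this PR (the sampling rule itself is an assumption; the PR only states that tick logs are throttled):

```typescript
const V4_TRACE_LOGS = false; // compile-time gate; hot path stays quiet
let tickCount = 0;

function shouldLogV4Tick(): boolean {
    return V4_TRACE_LOGS && tickCount % 50 === 0; // e.g. every 50th tick
}

function onV4Tick(win: { startFrame: number; endFrame: number }): void {
    tickCount++;
    if (shouldLogV4Tick()) {
        console.log('[v4] window_range', win.startFrame, win.endFrame);
    }
}
```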
Use Transferables in TranscriptionWorkerClient to cut structured-clone overhead for audio and feature buffers.
  • Extend sendRequest to accept an optional transfer list and pass it to worker.postMessage.
  • Update all audio/features RPCs (PROCESS_CHUNK, PROCESS_V3_CHUNK, PROCESS_V3_CHUNK_WITH_FEATURES, TRANSCRIBE_SEGMENT, PROCESS_V4_CHUNK_WITH_FEATURES) to transfer their underlying ArrayBuffers instead of cloning.
  • Preserve existing request/response correlation via IDs and pendingPromises while optimizing the data path only.
src/lib/transcription/TranscriptionWorkerClient.ts
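
For reference, what transferring looks like on the caller side (a sketch; the message type matches the sequence diagram above, everything else is assumed):

```typescript
// Passing a transfer list hands ownership of the underlying ArrayBuffer to
// the worker; afterwards the caller's view is detached and must not be reused.
function sendFeatures(worker: Worker, features: Float32Array): void {
    worker.postMessage(
        { type: 'PROCESS_V4_CHUNK_WITH_FEATURES', payload: { features } },
        [features.buffer], // Transferable list
    );
    console.assert(features.buffer.byteLength === 0, 'buffer is now detached');
}
```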
Lock in streaming merge semantics of UtteranceBasedMerger via regression fixtures for overlapping windows and punctuation/case refinements.
  • Add a regression test suite that feeds synthetic ASR word streams through UtteranceBasedMerger and asserts mature/immature/full text and cursor times for partial-to-full updates.
  • Cover punctuation and casing refinement, repeated words/delayed tokens, overlap deduplication, and timeout-based sentence finalization behavior.
  • Use small helpers to construct ASRWord lists and ASR results from compact tuple fixtures for readability.
src/lib/transcription/UtteranceBasedMerger.regression.test.ts
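
The fixture helpers reduce to something like this sketch (the ASRWord field names here are assumptions for illustration):

```typescript
type WordTuple = [text: string, start: number, end: number];

interface ASRWord {
    word: string;
    start: number;
    end: number;
}

// Build an ASRWord list from compact tuple fixtures.
const makeWords = (tuples: WordTuple[]): ASRWord[] =>
    tuples.map(([word, start, end]) => ({ word, start, end }));

const words = makeWords([
    ['hello', 0.0, 0.4],
    ['world', 0.5, 0.9],
]);
```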

Assessment against linked issues

Issue Objective Addressed Explanation
#73 Reduce GPU inference worker GC pressure (e.g., frequent Float32Array/ArrayBuffer allocations for WebGPU/tensor I/O) via changes such as buffer reuse/pooling or other allocation-reducing strategies. The PR reduces some overhead (e.g., using Transferables for audio/features buffers, turning off verbose logging, shortening some buffer lifetimes and optimizing buffer writes), but it does not introduce explicit reusable buffer pools, OrtValue reuse, or other clearly targeted mechanisms to eliminate the constant allocation/deallocation pattern identified in the GPU worker. The specific GC-focused action items from the issue remain unimplemented.
#73 Reduce main-thread layout and rendering pressure (forced synchronous layouts and UI churn) by throttling/optimizing UI updates and worker-to-main messaging that drive rendering.
#73 Provide or improve tooling and documentation to analyze Chrome performance traces and memory (including GC, worker activity, WebGPU, and jitter) so that future traces can be compared and regressed.
#110 Gate and substantially reduce hot-path logging (v4 tick logs, buffer dumps, mel.worker pushAudio/getFeatures, AudioEngine energy/metrics logs) so that these only run under an explicit debug flag and are not emitted per tick in steady-state recording.
#110 Provide or update tooling to analyze Chrome traces so DevTools/protocol long tasks (e.g., from logging) can be re-measured and compared between runs.
#113 Move large audio/mel payloads from structured cloning to Transferables for worker message passing to reduce serialization/deserialization overhead.
#113 Reduce UI and visualization update frequency (e.g., waveform rendering and audio visualization callbacks) to lower per-frame/render overhead during streaming.
#113 Provide or improve tooling/analysis to measure message-passing and GC overhead in Chrome traces in support of validating reductions in structured-clone costs.
#117 Cache waveform CSS variables on mount / infrequent theme updates instead of recomputing them every animation frame.
#117 Reduce unnecessary waveform animation work when idle by avoiding redraws when not recording or when input has not changed (e.g., pausing or heavily throttling the rAF loop and only redrawing when barLevels or canvas size change). The PR introduces frame-rate throttling based on visibility and isRecording (different intervals for foreground, idle, and hidden states), but the rAF loop still runs continuously and redraws the canvas on each eligible frame regardless of whether barLevels or canvas size have changed. It does not stop the rAF loop when isRecording is false, nor does it gate redraws on barLevels/canvas-size changes, so avoidable idle work remains.


coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

PR introduces comprehensive performance profiling infrastructure (trace and heap analysis tools), refactors the trace analyzer into a modular architecture, and applies targeted hot-path optimizations: logging gates, buffer operation vectorization, transfer-based message passing, frame-timing throttling, and CSS variable caching.

Changes

Cohort / File(s) Summary
Metrics Analysis Infrastructure
metrics/README.md, metrics/trace_ultimate.py, metrics/memory/analyze_heap_memory.py, metrics/analyze_chrome_trace.py
New trace_ultimate.py (702 lines) and analyze_heap_memory.py (968 lines) provide modular analysis pipelines for Chrome traces and heap captures, including bottleneck detection, dominator trees, and multi-factor findings. analyze_chrome_trace.py refactored to delegate to trace_ultimate with backward-compatible CLI. README documents the toolkit.
Hot-Path Logging Gates
src/App.tsx, src/lib/audio/mel.worker.ts, src/lib/audio/AudioEngine.ts
Introduced V4_TRACE_LOGS, MEL_WORKER_VERBOSE_LOGS, and HOT_PATH_DEBUG_LOGS flags to gate verbose per-tick and per-frame logging; replaced unconditional console calls with conditional checks. Reduced preload windows from 120s to 5s in App.tsx v4 setup.
Waveform Rendering Optimization
src/components/Waveform.tsx
Added frame-timing logic with minFrameInterval based on visibility and recording state. Cached CSS variables (bgColor, strokeColor) on mount/refresh instead of per-frame reads; debounced style refresh to ~1s. Frame skipping now respects foreground/hidden/idle mode.
Visualization Throttling
src/lib/audio/AudioEngine.ts
Increased VISUALIZATION_NOTIFY_INTERVAL_MS from 16ms to 33ms; added VISUALIZATION_NOTIFY_HIDDEN_INTERVAL_MS (250ms) for reduced cadence when document not visible. Visualization update now dynamically selects interval based on visibility state.
Buffer Operation Vectorization
src/lib/buffer/buffer.worker.ts, src/lib/buffer/buffer.worker.test.ts
Simplified write, writeBatch, and readRange to use buffer.set for bulk operations instead of per-element loops. Added two regression tests validating scalar and vector layer preservation via QUERY_RANGE.
Transfer-Based Message Passing
src/lib/transcription/TranscriptionWorkerClient.ts
Enhanced sendRequest to accept optional transfer array; all audio/feature methods (processChunk, processV3Chunk, transcribeSegment, processV4ChunkWithFeatures) now transfer corresponding ArrayBuffer views via postMessage transfer list.
Transcription & Testing
src/lib/transcription/transcription.worker.ts, src/lib/transcription/UtteranceBasedMerger.regression.test.ts, src/lib/transcription/WindowBuilder.test.ts
Removed returnTokenIds and returnFrameIndices, added enableProfiling: false to v4 transcription options. Added comprehensive UtteranceBasedMerger regression suite (186 lines) covering partial-to-full merging, punctuation restoration, deduplication, and pause handling. Added window-anchoring test for WindowBuilder.
Configuration
.gitignore
Added traces/ directory to ignore rules.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The PR combines two substantial analysis tools (trace_ultimate.py and analyze_heap_memory.py, totaling 1670 lines of logic-dense code with graph algorithms, dominator trees, and multi-factor heuristics), scattered performance optimizations across multiple files requiring separate reasoning (logging gates, buffer vectorization, frame timing, visibility-based throttling, transfer arrays), and multiple test additions. The heterogeneity of changes demands careful review of each cohort's correctness and integration points.

Possibly related PRs

  • chore(rebrand): Keet #97 — Both PRs modify metrics/analyze_chrome_trace.py; this PR fully refactors it to delegate to trace_ultimate while maintaining backward compatibility.
  • Feature/lightweight audio preview #106 — Both PRs touch Waveform.tsx, AudioEngine.ts, and mel.worker.ts; this PR implements the frame-timing, visibility-based throttling, and logging-gate patterns referenced there.

Poem

🐰 A trace of joy, a heap of care,
Buffers transfer through the air!
Logs now gate with wisdom's guard,
Frames skip smooth—no task too hard.
CSS cached, styles now swift,
Performance blooms, the gift! 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 5.97%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately summarizes the main objective: optimizing the streaming pipeline to reduce overhead and improve profiling capabilities. |
| Linked Issues check | ✅ Passed | Changes comprehensively address all linked issues: #110 (gated hot-path logging in App.tsx, mel.worker.ts, AudioEngine.ts), #113 (Transferables for audio/feature buffers in TranscriptionWorkerClient), #117 (cached colors, visibility-aware frame throttling in Waveform), and #73 (trace/heap analysis tools added). |
| Out of Scope Changes check | ✅ Passed | All changes are directly aligned with the linked objectives: performance optimizations (logging gates, Transferables, rendering throttles), profiling tools (trace_ultimate.py, analyze_heap_memory.py), and regression tests for correctness validation. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


sourcery-ai bot left a comment

Hey - I've found 3 issues and left some high-level feedback:

  • Now that buffers are being transferred in TranscriptionWorkerClient (e.g., processV3ChunkWithFeatures and processV4ChunkWithFeatures), consider documenting or enforcing at the call sites that these Float32Array instances are not reused after the call (or clone when reuse is needed) to avoid subtle use-after-transfer bugs.
  • The audio/mel buffer worker configuration was reduced from 120s to 5s of history; if any downstream logic (e.g., longer-context features or debugging introspection) implicitly relied on the larger window, it may be worth centralizing these limits in a shared config or adding explicit guards so behavior is obvious when history is truncated.

## Individual Comments

### Comment 1
<location> `src/lib/transcription/TranscriptionWorkerClient.ts:104-108` </location>
<code_context>
     }

-    private sendRequest(type: string, payload?: any): Promise<any> {
+    private sendRequest(type: string, payload?: any, transfer?: Transferable[]): Promise<any> {
         const id = this.messageId++;
         return new Promise((resolve, reject) => {
             this.pendingPromises.set(id, { resolve, reject });
-            this.worker.postMessage({ type, payload, id });
+            this.worker.postMessage({ type, payload, id }, transfer || []);
         });
     }
</code_context>

<issue_to_address>
**issue (bug_risk):** Transferring ArrayBuffers changes caller semantics; ensure callers never reuse the passed Float32Array/features

Using transfer lists here will detach the underlying ArrayBuffers on the main thread. Please verify that all callers of the updated `sendRequest` overload (e.g. `processChunk`, `processV3Chunk`, `processV3ChunkWithFeatures`, `processV4ChunkWithFeatures`) do not read from the `Float32Array`/`features` after the call. If any do, either clone at those call sites before transferring, or make transferring opt‑in (e.g. via a flag) so those paths keep the previous non‑transferring behavior.
</issue_to_address>

### Comment 2
<location> `metrics/memory/analyze_heap_memory.py:343` </location>
<code_context>
+    timeline: Dict[str, Any]
+
+
+def analyze_cap(cap: CaptureFile, payload: Dict[str, Any], strict: bool, top_n: int) -> A:
+    issues = validate(payload, strict)
+    g = G(payload)
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring large, multi-responsibility functions (graph setup, heuristic findings, and markdown rendering) into small focused helpers and a HeapGraph wrapper to reduce state threading and cognitive load without changing behavior.

One concrete way to lower complexity without changing behavior is to tease apart some of the “god functions” into small helpers, while keeping this in a single file for now.

### 1. Encapsulate dominator / BFS plumbing into a `HeapGraph` wrapper

Right now `G`, `dominators`, `bfs_paths`, `path_to` share low‑level arrays and are used all over. You can hide the algorithm details behind a small interface, and keep the rest of the code talking to higher‑level concepts.

Example:

```python
@dataclass
class HeapGraph:
    raw: G
    idom: List[int]
    retained: List[int]
    dfsn: List[int]
    children: List[List[int]]
    reach: int
    parent: List[int]
    parent_edge: List[str]

    @classmethod
    def from_payload(cls, payload: Dict[str, Any], root: int = 0) -> "HeapGraph":
        g = G(payload)
        dom = dominators(g, root)
        parent, parent_edge = bfs_paths(g, root)
        return cls(
            raw=g,
            idom=dom["idom"],
            retained=dom["ret"],
            dfsn=dom["dfsn"],
            children=dom["child"],
            reach=dom["reach"],
            parent=parent,
            parent_edge=parent_edge,
        )

    def path_to(self, node: Optional[int]) -> Dict[str, Any]:
        return path_to(self.raw, self.parent, self.parent_edge, node)
```

Then `analyze_cap` and other callers stop juggling `g`, `dom`, `par`, `pedge`:

```python
def analyze_cap(cap: CaptureFile, payload: Dict[str, Any], strict: bool, top_n: int) -> A:
    issues = validate(payload, strict)
    hg = HeapGraph.from_payload(payload, root=0)
    g = hg.raw

    # ... use hg.retained, hg.idom, hg.dfsn, hg.reach, hg.path_to(...)
    # instead of dom["ret"], dom["idom"], dom["dfsn"], dom["reach"], path_to(g, par, pedge, ...)
```

This keeps all behavior but significantly reduces the number of parallel arrays/dicts flowing through the pipeline.

### 2. Split `findings` into focused detectors

`findings` currently mixes several independent heuristics. A small dispatcher plus per‑signal helpers would reduce mental load and make it easier to add/remove heuristics.

Example:

```python
def _finding_jsarraybufferdata(analyses: List[A], comparisons: List[Dict[str, Any]], top_n: int) -> Optional[Dict[str, Any]]:
    if len(analyses) < 2:
        return None
    old, new = analyses[0], analyses[-1]
    anchor = compare(old, new, top_n)
    old_max = max(old.g.oid) if old.g.oid else None

    d = ctor_delta(anchor, "system / JSArrayBufferData")
    if not d or d["delta_self_size"] <= 0:
        return None

    node = best_node(new, "system / JSArrayBufferData", old_max) or best_node(new, "system / JSArrayBufferData")
    p = path_to(new.g, new.par, new.pedge, node)
    clear = any(t in p["path_text"].lower() for t in ["ringbuffer", "pendingpromises", "visualizationbuffer", "features"])
    c, cr = conf(clear, monotonic(comparisons, "system / JSArrayBufferData"), len(analyses))

    return {
        "symptom": "Typed-array backing store grew between captures.",
        # ... rest unchanged ...
    }
```

Dispatcher:

```python
def findings(analyses: List[A], comparisons: List[Dict[str, Any]], top_n: int) -> List[Dict[str, Any]]:
    detectors = [
        _finding_jsarraybufferdata,
        _finding_pending_promises,
        _finding_detached_dom,
        _finding_performance_timing,
        _finding_timeline_bursts,
    ]
    out: List[Dict[str, Any]] = []
    for fn in detectors:
        f = fn(analyses, comparisons, top_n)
        if f:
            out.append(f)
    # scoring / ranking stays here, unchanged
```

The logic doesn’t change, but the cognitive chunk per heuristic drops a lot.

### 3. Extract markdown rendering helpers for repeated sections

`report_md` is long and couples markdown layout + data selection. You can extract small helpers for repeated table sections and list rendering, keeping the main function as “layout glue”.

Example:

```python
def _md_table(headers: List[str], rows: List[List[str]]) -> List[str]:
    lines = ["| " + " | ".join(headers) + " |", "|" + "|".join("---" for _ in headers) + "|"]
    for r in rows:
        lines.append("| " + " | ".join(r) + " |")
    return lines

def _md_top_growing_ctors(rep: Dict[str, Any]) -> List[str]:
    tc = rep["appendix"]["top_growing_constructors"]
    if not tc:
        return ["- No constructor growth rows."]
    rows = [[f"`{r['constructor']}`", str(r["delta_self_size"]), str(r["delta_count"])] for r in tc]
    return _md_table(["Constructor", "Delta Self", "Delta Count"], rows)
```

Then `report_md` uses these helpers:

```python
ls += ["### Top growing constructors/types", ""]
ls += _md_top_growing_ctors(rep)
```

This doesn’t change the output, but the main markdown function becomes mostly a sequence of “section” calls, making it easier to see structure and to modify fields.

---

All of these are incremental refactors that preserve behavior and don’t require changing the CLI or data formats; they just reduce the amount of state threaded around and isolate concerns inside this module.
</issue_to_address>

### Comment 3
<location> `metrics/trace_ultimate.py:399` </location>
<code_context>
+        }
+    return out
+
+def rank_bottlenecks(duration_s, main, thread_by_tid, roles, gc, render, aw_thread, aw_ints):
+    rows = []
+
</code_context>

<issue_to_address>
**issue (complexity):** Consider extracting the scoring logic, summary structure, and evidence formatting into small helpers/data classes to separate heuristics from plumbing and make the analyzer easier to read and evolve.

You can keep all current behavior but reduce cognitive load by introducing a couple of small, focused abstractions instead of pushing everything through loosely‑typed dicts and ad‑hoc scoring.

### 1. Encapsulate bottleneck scoring heuristics

Right now the scoring logic is inline in `rank_bottlenecks`:

```python
score = tx_total + tx_gc * 5 + tx_long_count * 6 + tx_long_max * 2
...
render_score = (
    main_total * 0.15
    + to_float(page.get('total_ms'))
    + to_float(commit.get('total_ms'))
    + to_float(layout.get('total_ms'))
    + to_float(raf.get('rate_per_s')) * 10
)
...
score = total + to_float(hp.get('total_ms')) * 2
...
score = aw_total + aw_p95 * 20 + aw_max * 5
```

Pulling this into clearly named helpers makes intent explicit and keeps `rank_bottlenecks` as “data wiring” rather than “math + wiring”:

```python
def score_transcription_bottleneck(total_ms, gc_ms, long_count, long_max_ms):
    return total_ms + gc_ms * 5 + long_count * 6 + long_max_ms * 2


def score_main_render_bottleneck(main_total_ms, render):
    return (
        main_total_ms * 0.15
        + to_float(render.get('page_animator_scripted', {}).get('total_ms'))
        + to_float(render.get('commit', {}).get('total_ms'))
        + to_float(render.get('update_layout_tree', {}).get('total_ms'))
        + to_float(render.get('fire_animation_frame', {}).get('rate_per_s')) * 10
    )


def score_worker_bottleneck(total_ms, hpm_total_ms):
    return total_ms + hpm_total_ms * 2


def score_audio_worklet_bottleneck(total_ms, p95_ms, max_ms):
    return total_ms + p95_ms * 20 + max_ms * 5
```

Then `rank_bottlenecks` becomes simpler and self‑describing:

```python
tx_score = score_transcription_bottleneck(tx_total, tx_gc, tx_long_count, tx_long_max)
...
render_score = score_main_render_bottleneck(main_total, render)
...
score = score_worker_bottleneck(total, to_float(hp.get('total_ms')))
...
score = score_audio_worklet_bottleneck(aw_total, aw_p95, aw_max)
```

This keeps all behavior but localizes the heuristics so you can tweak weights without touching the core ranking logic.

### 2. Introduce a thin `TraceSummary` container

`analyze` currently returns a “bag of dicts” that `build_markdown` and `print_report` both have to know the structure of. You can reduce cross‑cutting coupling by introducing a small typed container and using attribute access instead of raw dict lookups, while still keeping everything in a single file:

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class TraceSummary:
    version: str
    trace_file: str
    trace_path: str
    total_events: int
    trace_duration_s: float
    thread_ids: Dict[str, Any]
    process_names: Dict[str, str]
    thread_names: Dict[str, str]
    worker_urls_by_tid: Dict[str, str]
    worker_roles: Dict[str, int]
    thread_activity: List[Dict[str, Any]]
    threads: Dict[str, Any]
    gc: Dict[str, Any]
    render_loop: Dict[str, Any]
    audio_worklet_intervals: Dict[str, Any]
    bottlenecks: List[Dict[str, Any]]
    kpis: Dict[str, float]
    compare: Dict[str, Any] | None = None
```

Wrap the current dict before returning:

```python
summary_dict = {
    'version': 'ultimate-1.0',
    'trace_file': trace_path.name,
    ...
}

return TraceSummary(**summary_dict)
```

Then `build_markdown` / `print_report` become more legible and less fragile:

```python
def build_markdown(summary: TraceSummary) -> str:
    lines = []
    lines.append('# Trace Ultimate Report')
    lines.append('')
    lines.append(f"- Trace: `{summary.trace_file}`")
    lines.append(f"- Duration: `{summary.trace_duration_s}s`")
    lines.append(f"- Total events: `{summary.total_events}`")
    ...
    for row in summary.bottlenecks:
        ...
```

You can still `json.dump` by doing `json.dump(summary.__dict__, f, indent=2)` in `run_cli`, so existing JSON consumers keep working.

### 3. Factor out repeated evidence formatting

`build_markdown` and `print_report` both construct the same “first few evidence key/value pairs” string:

```python
ev_bits = ', '.join([f"{k}={ev[k]}" for k in list(ev.keys())[:4]])
```

Pull that into a tiny helper to avoid duplication and make the formatting rule explicit:

```python
def format_evidence(ev: dict, limit: int = 4) -> str:
    keys = list(ev.keys())[:limit]
    return ', '.join(f"{k}={ev[k]}" for k in keys)
```

Usage:

```python
for row in summary.get('bottlenecks', []):
    ev_bits = format_evidence(row.get('evidence', {}))
    ...
```

and

```python
ev_bits = format_evidence(ev)
print(f"  {row['rank']:>2}. {row['title']} | impact={row['impact']} | score={row['score']} | {ev_bits}")
```

These changes keep all functionality intact but carve out the “policy” pieces (scoring, evidence formatting, result structure) from the “plumbing”, which should address most of the flagged structural complexity without requiring a full multi‑module refactor.
</issue_to_address>


coderabbitai bot left a comment

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
src/lib/transcription/transcription.worker.ts (2)

221-237: ⚠️ Potential issue | 🔴 Critical

enableProfiling: false cannot be passed to model.transcribe() — it is not a recognized option for this method.

enableProfiling is a ParakeetModel.fromUrls() initialization option only, controlling the ONNX Runtime profiler at model load time. When passed to transcribe() it is silently ignored, leaving profiling enabled for the hot path and defeating the intended optimization.

To disable profiling on this specific code path, you must either:

  1. Disable profiling globally when loading the model (if applicable), or
  2. Use a different mechanism to skip profiling computation in this specific call.

If profiling must remain enabled elsewhere but disabled for this path, the current approach cannot work with the parakeet.js API.


264-264: ⚠️ Potential issue | 🟡 Minor

Remove or conditionally include undefined metrics field to reduce wire payload size.

With enableProfiling: false, v4TranscribeResult.metrics is undefined, yet it's unconditionally included in the postMessage payload. The metrics field is properly typed as optional in V4ProcessResult and all current accesses guard against undefined values (e.g., if (result.metrics) in App.tsx), so there's no runtime risk. However, sending undefined fields unnecessarily increases payload weight on every message.

Consider conditionally including the field:

🛠️ Proposed fix

```diff
-                        metrics: v4TranscribeResult.metrics,
+                        ...(v4TranscribeResult.metrics ? { metrics: v4TranscribeResult.metrics } : {}),
```
src/lib/audio/AudioEngine.ts (1)

365-398: ⚠️ Potential issue | 🟡 Minor

Throttle can silently suppress the forced UI-clear in reset().

The comment at line 396–397 documents the explicit intent: "Push a blank update so UI clears stale waveform/segments." After the new throttle is applied, this call will be silently dropped whenever reset() is invoked within the throttle window of the previous update (33 ms in foreground, 250 ms in background). In the common reset() → stop() sequence, no subsequent audio chunk will fire, so the UI may retain stale waveform data indefinitely.

🛡️ Proposed fix – reset throttle timestamp before the forced flush

```diff
     // Push a blank update so UI clears stale waveform/segments
+    this.lastVisualizationNotifyTime = 0; // bypass throttle for the explicit reset flush
     this.notifyVisualizationUpdate();
```
src/lib/transcription/TranscriptionWorkerClient.ts (1)

210-213: ⚠️ Potential issue | 🟠 Major

dispose() silently drops in-flight transfer-based promises — awaiting callers will hang.

pendingPromises.clear() removes entries without rejecting them. Since all five transfer-enabled methods are now fire-and-forget from the worker's perspective (the buffer is immediately neutered), any caller awaiting processChunk, processV3Chunk, etc. when dispose() is called gets a promise that never settles — a silent memory and resource leak. With more calls now going through the transfer path, this window is wider.

🛡️ Proposed fix — reject all pending promises on dispose

```diff
 dispose() {
     this.worker.terminate();
+    const err = new Error('TranscriptionWorkerClient disposed');
+    for (const { reject } of this.pendingPromises.values()) {
+        reject(err);
+    }
     this.pendingPromises.clear();
 }
```
src/lib/buffer/buffer.worker.ts (1)

73-81: 🧹 Nitpick | 🔵 Trivial

writeBatch is missing the 1D fast path added to write().

For scalar layers (energyVad, inferenceVad), every iteration allocates a 1-element subarray and calls buffer.set, whereas write() fast-paths to direct index assignment for entryDimension === 1. On high-frequency batch writes this is a measurable inconsistency.

♻️ Proposed fix

```diff
     for (let e = 0; e < n; e++) {
         const writePos = (this.globalWriteIndex % this.maxEntries) * this.entryDimension;
         const srcOffset = e * this.entryDimension;
+        if (this.entryDimension === 1) {
+            this.buffer[writePos] = data[srcOffset] ?? 0;
+            this.globalWriteIndex++;
+            continue;
+        }
         this.buffer.set(data.subarray(srcOffset, srcOffset + this.entryDimension), writePos);
         this.globalWriteIndex++;
     }
```
📜 Review details

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d69277b and e7b3a54.

📒 Files selected for processing (15)
  • .gitignore
  • metrics/README.md
  • metrics/analyze_chrome_trace.py
  • metrics/memory/analyze_heap_memory.py
  • metrics/trace_ultimate.py
  • src/App.tsx
  • src/components/Waveform.tsx
  • src/lib/audio/AudioEngine.ts
  • src/lib/audio/mel.worker.ts
  • src/lib/buffer/buffer.worker.test.ts
  • src/lib/buffer/buffer.worker.ts
  • src/lib/transcription/TranscriptionWorkerClient.ts
  • src/lib/transcription/UtteranceBasedMerger.regression.test.ts
  • src/lib/transcription/WindowBuilder.test.ts
  • src/lib/transcription/transcription.worker.ts
🧰 Additional context used
🧬 Code graph analysis (5)
src/App.tsx (1)
src/stores/appStore.ts (1)
  • appStore (350-350)
src/lib/transcription/WindowBuilder.test.ts (2)
src/lib/audio/audioParams.ts (1)
  • sampleRate (144-144)
src/lib/transcription/WindowBuilder.ts (1)
  • WindowBuilder (51-356)
src/lib/transcription/TranscriptionWorkerClient.ts (2)
src/lib/transcription/TokenStreamTranscriber.ts (1)
  • TokenStreamResult (44-59)
src/lib/transcription/types.ts (1)
  • TranscriptionResult (28-39)
src/lib/audio/AudioEngine.ts (1)
metrics/analyze-trace-v2.js (1)
  • minInterval (64-64)
metrics/analyze_chrome_trace.py (1)
metrics/trace_ultimate.py (5)
  • analyze (590-647)
  • print_report (550-588)
  • parse_args (650-659)
  • main (697-698)
  • build_markdown (502-547)
🪛 Ruff (0.15.1)
metrics/trace_ultimate.py

[warning] 50-50: Prefer TypeError exception for invalid type

(TRY004)


[warning] 50-50: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 271-271: Missing return type annotation for private function row

(ANN202)


[warning] 567-567: Use explicit conversion flag

Replace with conversion flag

(RUF010)

metrics/memory/analyze_heap_memory.py

[warning] 15-15: Import from collections.abc instead: Iterable, Sequence

Import from collections.abc

(UP035)


[warning] 15-15: typing.Dict is deprecated, use dict instead

(UP035)


[warning] 15-15: typing.List is deprecated, use list instead

(UP035)


[warning] 15-15: typing.Tuple is deprecated, use tuple instead

(UP035)


[warning] 56-56: Naive datetime constructed using datetime.datetime.strptime() without %z

(DTZ007)


[warning] 74-74: datetime.datetime.fromtimestamp() called without a tz argument

(DTZ006)


[warning] 79-79: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 97-97: Avoid specifying long messages outside the exception class

(TRY003)


[error] 101-101: Boolean-typed positional argument in function definition

(FBT001)


[error] 180-180: Boolean-typed positional argument in function definition

(FBT001)


[error] 180-180: Boolean default positional argument in function definition

(FBT002)


[error] 343-343: Boolean-typed positional argument in function definition

(FBT001)


[warning] 416-416: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


[error] 615-615: Boolean-typed positional argument in function definition

(FBT001)


[error] 615-615: Boolean-typed positional argument in function definition

(FBT001)


[error] 615-615: Boolean-typed positional argument in function definition

(FBT001)


[error] 615-615: Boolean default positional argument in function definition

(FBT002)


[error] 672-672: Boolean positional value in function call

(FBT003)


[error] 739-739: Boolean positional value in function call

(FBT003)


[error] 739-739: Boolean positional value in function call

(FBT003)

🔇 Additional comments (22)
src/lib/transcription/WindowBuilder.test.ts (1)

134-155: LGTM — correctly validates cursor-anchored windowing semantics.

The test trace is sound:

  • First build: startFrame = matureCursorFrame = sampleRate * 2, endFrame = sampleRate * 10 (8 s window, within [3, 30] bounds) ✓
  • After advancing only frameRef.value (no cursor/sentence state change): second build correctly keeps startFrame pinned at sampleRate * 2 while endFrame grows to sampleRate * 12

This directly exercises the PR's restored "window start anchored to mature cursor" invariant.

src/lib/transcription/transcription.worker.ts (1)

240-250: No issue here. The code does not have returnTokenIds or returnFrameIndices options in the model.transcribe() call (line 221), and a codebase search confirms these fields are not referenced anywhere in the transcription module. The ASRResult interface has never included these fields—only utterance_text, words, timestamp, segment_id, and end_time.

.gitignore (1)

64-64: LGTM!

Adding traces/ to .gitignore is consistent with the new trace analysis tooling that outputs artifacts to that directory.

src/lib/audio/mel.worker.ts (1)

101-103: LGTM — logging gates are consistently applied.

All hot-path log sites are uniformly gated behind MEL_WORKER_VERBOSE_LOGS. Warning-level logs (console.warn) and error logs (console.error) are correctly left ungated, preserving diagnostic visibility for actual problems.

Also applies to: 194-196, 283-287, 323-325, 413-415

src/App.tsx (4)

51-54: LGTM — centralized logging gate with tick-based throttling.

V4_TRACE_LOGS and shouldLogV4Tick cleanly centralize the verbose logging decision. Same note as mel.worker.ts: consider making this runtime-configurable (e.g., via the debug panel toggle) instead of compile-time.


432-437: Good safety guard on decoder cache prefix.

This correctly prevents reusing a stale or out-of-bounds decoder cache when the prefix doesn't fall within the current window. The edge cases (negative rawPrefixSeconds, prefix exceeding window duration) are properly handled by clamping to 0.


291-302: LGTM — v4Tick logging consistently gated.

All verbose tick logging uses the centralized shouldLogV4Tick helper. The pattern is clean and consistent.

Also applies to: 337-348, 351-358, 389-408


607-608: The audio and mel BufferWorker layers are write-only and never read.

Verification confirms these layers are only written to (via melClient?.pushAudioCopy) but never queried from BufferWorker. All VAD-based speech detection uses only the energyVad and inferenceVad layers (App.tsx lines 323, 328), and visualization queries the mel spectrogram directly via melClient.getFeatures(), not from BufferWorker. Since no code path reads from the audio or mel layers, the retention period is irrelevant to correctness.

Likely an incorrect or invalid review comment.

metrics/analyze_chrome_trace.py (1)

33-69: LGTM — clean backward-compatible wrapper.

The wrapper preserves legacy defaults (input filename, output filename) while delegating to trace_ultimate.py for all analysis. Error handling with return codes is clean.

metrics/README.md (1)

1-53: LGTM — clear and complete documentation for the trace toolkit.

Usage examples, CLI arguments, and backward compatibility notes are well-documented and consistent with the implementation.

metrics/trace_ultimate.py (1)

40-52: LGTM — well-structured trace analysis pipeline.

The load → build_meta → per-thread summarize → KPI/bottleneck → report pipeline is clean and modular. The heuristic for renderer PID selection (Lines 112-123) gracefully handles multi-renderer traces.

Also applies to: 76-155, 173-233, 590-647

metrics/memory/analyze_heap_memory.py (1)

1-968: LGTM overall — comprehensive heap analysis tooling.

The dominator-tree computation, BFS path reconstruction, bucket-based categorization, and findings-ranking pipeline are well-designed. The structured JSON/Markdown outputs with evidence, confidence levels, and validation suggestions are thorough.

src/lib/audio/AudioEngine.ts (2)

570-572: LGTM – debug log gating is correctly applied at all three sites.

Short-circuit evaluation on && ensures Math.random() (line 599) is not invoked when the flag is false, so there is no hot-path cost even when the flag is a non-eliminated instance field.

Also applies to: 599-601, 661-667


1003-1018: LGTM – visibility-aware throttling logic is correct.

The ternary correctly maps 'visible' → 33 ms and all non-visible states ('hidden', 'prerender') → 250 ms. The typeof document guard is overly cautious for a browser-only class but harmless. minInterval must be computed before the threshold comparison, so the current ordering is unavoidable.
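A condensed sketch of that mapping (intervals taken from the review text; the guard polarity for a missing document is an assumption):

    // Sketch: pick the minimum update interval from the document's visibility state.
    function minIntervalMs(): number {
        const visible = typeof document === 'undefined' || document.visibilityState === 'visible';
        return visible ? 33 : 250; // ~30 fps in the foreground, 4 Hz otherwise
    }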

src/lib/transcription/TranscriptionWorkerClient.ts (2)

104-110: sendRequest Transferable wiring is correct.

The transfer || [] fallback is valid; passing an empty array to postMessage is a no-op and matches the old behavior for non-transfer callers.
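In isolation, the pattern looks like this (a sketch, not the client's actual signature):

    // Sketch: an empty transfer list makes postMessage behave exactly like a plain
    // structured-clone post, so non-transfer callers are unaffected.
    function post(worker: Worker, message: unknown, transfer?: Transferable[]): void {
        worker.postMessage(message, transfer || []);
    }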


129-159: No buffer transfer issues detected — all transferred buffers are fresh allocations, not subviews.

After examining the codebase, both identified risks do not apply here:

  1. Use-after-transfer: While the Transferable contract requires callers to abandon the original buffer, the current code doesn't violate this. No caller attempts to access audio, features, or samples after the await, so there is no use-after-transfer bug.

  2. Subview over-transfer: This risk does not exist. All transferred buffers are already fresh allocations, never subviews:

    • RingBuffer.read() (line 789 in App.tsx): Creates new Float32Array(length) and copies with .set().
    • mel.worker.getFeatures() (lines 745, 748 in App.tsx): Creates new Float32Array(nMels * T), optionally normalized to another fresh buffer.
    • buffer.worker.readRange(): Creates new Float32Array(count * entryDimension).
    • capture-processor chunks: Fresh Web Audio API frames per process cycle.

Since all transferred buffers are fresh allocations (not subviews of ring buffers or shared data), the defensive copy guard (audio.slice(0)) is unnecessary and would only add overhead without benefit.

The comment's educational points about Transferable semantics are correct in general, but the specific fix recommendation does not apply to this codebase's actual buffer lifecycle patterns.

Likely an incorrect or invalid review comment.
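To illustrate the subview distinction the verification hinges on:

    // A subarray shares its parent's ArrayBuffer, so transferring view.buffer would
    // detach the entire ring buffer; slice() allocates a fresh, safely transferable buffer.
    const ring = new Float32Array(16000);
    const view = ring.subarray(0, 1024);
    const copy = ring.slice(0, 1024);
    console.log(view.buffer === ring.buffer); // true: transferring it would detach `ring`
    console.log(copy.buffer === ring.buffer); // false: independent allocation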

src/components/Waveform.tsx (1)

103-107: LGTM — cleanup guard is correct

Guarding cancelAnimationFrame with animationId !== undefined correctly prevents the undefined-argument call when the component is unmounted before the first rAF fires. Since animate overwrites animationId at the top of each invocation (line 47), onCleanup always holds the ID of the next pending frame and cancels it correctly.
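A plain-function sketch of the pattern (the component uses Solid's onMount/onCleanup, omitted here):

    // Sketch: each tick overwrites animationId, so cleanup always cancels the one
    // pending frame; the undefined check covers unmount before the first rAF fires.
    let animationId: number | undefined;

    function animate(): void {
        animationId = requestAnimationFrame(animate);
        // ... draw the waveform ...
    }

    function cleanup(): void {
        if (animationId !== undefined) cancelAnimationFrame(animationId);
    }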

src/lib/transcription/UtteranceBasedMerger.regression.test.ts (3)

113-158: LGTM — stale-overlap and immutability invariants are well-covered.

The staleOverlap fixture's use of ~20 ms timestamp drift to test cursor-anchored window filtering (words below matureCursorTime are dropped even when re-submitted with shifted timestamps) directly validates the core streaming-merge invariant described in the PR objectives. The additional allMatureSentences.toHaveLength(1) guard on the internal accumulator state is a good complement to the text-level assertions.


38-42: The cursor: 0.0 assertion is correct.

matureCursorTime is initialized to 0 in the class field declaration (line 143 of UtteranceBasedMerger.ts: private matureCursorTime = 0;), and the processASRResult return type includes it as a required number field. The test fixture expectation aligns with the implementation.


173-184: The test correctly validates the two-phase semantics without requiring an intermediate assertion.

The implementation does not eagerly mature complete sentences during processASRResult. Even when a sentence ends with punctuation (like "we are testing."), it remains in lastImmatureWords as immature text until finalizePendingSentenceByTimeout() is called. The test as written correctly verifies this flow:

  • After processASRResult with a complete sentence, the result would have immatureText: 'we are testing.' and matureText: ''
  • finalizePendingSentenceByTimeout() then transitions it to mature

The existing assertions already confirm the intended behavior.

Likely an incorrect or invalid review comment.
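A toy model of the two-phase flow (illustrative only, not the real UtteranceBasedMerger):

    // Phase 1: even a punctuated sentence stays immature after processing.
    // Phase 2: the timeout finalizer promotes pending text to mature.
    class TwoPhaseMerger {
        private pending = '';
        process(text: string): { matureText: string; immatureText: string } {
            this.pending = text;
            return { matureText: '', immatureText: this.pending };
        }
        finalizeByTimeout(): { matureText: string; immatureText: string } {
            const mature = this.pending;
            this.pending = '';
            return { matureText: mature, immatureText: '' };
        }
    }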

src/lib/buffer/buffer.worker.ts (1)

137-141: LGTM — the vectorized read path is correct.

Both the entryDimension === 1 direct-assignment branch and the buffer.subarray + buffer.set multi-dim branch are bounds-safe (max readPos + entryDimension = maxEntries × entryDimension = buffer.length) and correctly mirror the optimized write paths.
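The bounds argument can be seen in a standalone sketch of the multi-dim branch (hypothetical names, not the worker source):

    // Sketch: copy entryDimension-wide rows out of a ring-indexed backing store.
    // Worst case readPos = (maxEntries - 1) * entryDimension, so
    // readPos + entryDimension = maxEntries * entryDimension = buffer.length: in bounds.
    function readRange(buffer: Float32Array, entryDimension: number, maxEntries: number,
                       startEntry: number, count: number): Float32Array {
        const out = new Float32Array(count * entryDimension);
        for (let i = 0; i < count; i++) {
            const readPos = ((startEntry + i) % maxEntries) * entryDimension;
            out.set(buffer.subarray(readPos, readPos + entryDimension), i * entryDimension);
        }
        return out;
    }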

src/lib/buffer/buffer.worker.test.ts (1)

181-214: LGTM — vector path assertions are correct.

The manually verified trace confirms: WRITE_BATCH at offset 0 deposits entryA at writePos 0 and entryB at writePos 128; QUERY_RANGE(0, 320) produces count = 2; readRange copies via buffer.subarray + result.set for the multi-dim path; all four boundary assertions (data[0], data[127], data[128], data[255]) are consistent with the written values.

Note: @vitest/web-worker "simulates Web Worker, but in the same thread." Whether it honours the transfers argument to the worker's postMessage or falls back to a plain clone does not affect the test's data-integrity assertions — the received slice.data is valid in both cases.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@metrics/analyze_chrome_trace.py`:
- Line 19: The bare import "from trace_ultimate import analyze, print_report"
will break when this module is imported from outside metrics; change it to a
package-safe import by using a relative import (from .trace_ultimate import
analyze, print_report) or implement an import fallback: attempt the relative
import first and fall back to the top-level import to preserve direct-script
execution. Update the import line in analyze_chrome_trace.py accordingly so
functions analyze and print_report are resolved robustly.

In `@metrics/memory/analyze_heap_memory.py`:
- Line 131: Rename the two single-letter classes G and A to descriptive names
(e.g., G -> HeapGraph and A -> AnalysisResult) and update all references across
the file: class definitions, instantiations, type annotations, imports/exports,
docstrings, and any forward-references so the new names replace every occurrence
of G and A used as types or constructors; keep the classes' internals and
behavior unchanged and run existing tests/lint to catch any missed references.
- Line 74: The fallback in sort_time returns a naive datetime via
datetime.fromtimestamp(self.mtime), which is inconsistent with the UTC-aware
now_iso(); change the fallback to create a UTC-aware datetime (e.g., use
datetime.fromtimestamp(self.mtime, timezone.utc)) so sort_time returns
timezone-aware timestamps, and if you add tz here also update parse_ts to
produce timezone-aware datetimes for full consistency; refer to the sort_time
method and parse_ts function when making the changes.
- Line 15: The import line currently pulls typing generics (Dict, List, Tuple,
Iterable, Sequence) which are deprecated; update the import to remove those
typing names and replace their usages in annotations across the module (e.g.,
annotations referencing Dict, List, Tuple, Iterable, Sequence) with built-in
generics dict, list, tuple and/or types from collections.abc (e.g., Iterable,
Sequence) as appropriate; since the file already has "from __future__ import
annotations", it's safe to perform these replacements (e.g., change Dict[...] →
dict[...], List[...] → list[...], Tuple[...] → tuple[...] and import
Iterable/Sequence from collections.abc if needed).
- Around line 226-231: The recursive comp(v) can overflow on deep ancestor
chains; replace it with an iterative routine that walks up anc[] until you reach
a node u with anc[u] == -1 or anc[anc[u]] == -1, collecting visited nodes on a
small list/stack, then unwind that list to perform the same updates: for each x
in visited (in reverse order) if semi[lab[anc[x]]] < semi[lab[x]] set lab[x] =
lab[anc[x]] and then set anc[x] = anc[anc[x]]. Modify the existing comp function
(referenced as comp, anc, lab, semi) to use this loop-and-unwind approach so it
no longer uses recursion but preserves the original path-compression logic.

In `@metrics/trace_ultimate.py`:
- Around line 246-249: In summarize_gc, guard against missing or None thread ids
before converting: check e.get('tid') is not None (and optionally is numeric)
and skip the event (continue) when tid is absent to avoid TypeError from
int(e.get('tid')); locate the conversion around int(e.get('tid')) and replace it
with a pre-check that only calls int(...) when tid is present, leaving the
existing to_float(e.get('dur')) handling unchanged.
- Line 20: Remove the unused top-level import statement "import math" (the
unused symbol math) from trace_ultimate.py; simply delete the line "import math"
to eliminate the unused import and any linter warnings, or if math functionality
is actually needed, replace the deletion by using the math functions where
appropriate (but otherwise remove the import).

In `@src/components/Waveform.tsx`:
- Around line 50-55: The per-frame check using document.visibilityState and the
hidden ? HIDDEN_FRAME_MS branch is effectively unused; replace it by adding a
visibilitychange listener to pause/resume the animation loop used by animate and
compute minFrameInterval only from props.isRecording (FOREGROUND_FRAME_MS) and
IDLE_FRAME_MS; specifically, remove the per-frame document.visibilityState read
and HIDDEN_FRAME_MS branch, introduce a boolean (e.g., isVisible) updated on the
document.visibilitychange event, and have the animate loop early-return or stop
scheduling frames when not visible so the loop pauses/resumes cleanly without
per-frame visibility checks.
- Line 20: The initial value lastStyleRefreshTs = 0 causes an unnecessary
getComputedStyle call on the first rAF; after onMount calls refreshThemeColors()
you should update lastStyleRefreshTs to the current timestamp so the first
animate tick doesn't think >1000ms have passed. Locate lastStyleRefreshTs,
onMount, refreshThemeColors(), and animate() in Waveform.tsx and set
lastStyleRefreshTs (e.g. to performance.now() or the rAF ts) immediately after
refreshThemeColors() is invoked in onMount so the subsequent animate check skips
the redundant refresh.
- Around line 23-25: FOREGROUND_FRAME_MS, IDLE_FRAME_MS and HIDDEN_FRAME_MS are
declared inside the Waveform component but are pure numeric constants; move
their const declarations to module scope (above the Waveform function/component)
so they are allocated once and reused; update any references inside Waveform to
use the relocated names and keep them as plain const (no change to values or
exports) to preserve behavior.

In `@src/lib/audio/AudioEngine.ts`:
- Line 85: The HOT_PATH_DEBUG_LOGS flag is declared as an instance field which
prevents bundlers from statically eliminating dead debug branches; hoist it to a
module-level constant (replace the instance property `private readonly
HOT_PATH_DEBUG_LOGS = false` in the AudioEngine class with `const
HOT_PATH_DEBUG_LOGS = false` at top-level of the module) and then update uses of
`this.HOT_PATH_DEBUG_LOGS` in the AudioEngine methods (the guarded branches that
reference HOT_PATH_DEBUG_LOGS) to the bare `HOT_PATH_DEBUG_LOGS` identifier so
the bundler can inline and dead-code eliminate those branches.

In `@src/lib/audio/mel.worker.ts`:
- Line 64: Replace the compile-time constant MEL_WORKER_VERBOSE_LOGS with a
runtime-configurable variable and read a verbose flag from the worker INIT
message payload: change MEL_WORKER_VERBOSE_LOGS from const to a mutable (e.g.,
let) module-level variable, update the message handler that processes the "INIT"
event (the existing INIT message handling logic) to check event.data.verbose
(defaulting to false if missing) and set the variable accordingly, and use that
variable everywhere verbose logging is currently gated; also validate the
payload type to avoid runtime errors.

In `@src/lib/buffer/buffer.worker.test.ts`:
- Around line 155-179: Remove the redundant artificial delay: delete the line
awaiting new Promise((r) => setTimeout(r, 50)) in the "should preserve scalar
layer values via QUERY_RANGE" test; the worker.postMessage calls for WRITE are
FIFO so the subsequent sendRequest call with type 'QUERY_RANGE' will be
processed after the three WRITE messages—no setTimeout needed—so keep the
sendRequest invocation and all assertions intact and only remove the timeout
await.

In `@src/lib/buffer/buffer.worker.ts`:
- Around line 62-66: The current single-dimension write silently writes 0 for
empty input and uses an unnecessary cast; in the block guarded by
this.entryDimension === 1 (where this.buffer[writePos] is assigned and
this.globalWriteIndex++ returned), replace the blind (data as any)[0] ?? 0 with
a proper length check on the union type (Float32Array | number[]) — e.g. if
data.length === 0 then throw or return an error/log instead of writing 0;
otherwise assign data[0] directly (no as any cast). Update the code around
writePos, this.buffer and this.globalWriteIndex so the counter only increments
after a successful valid write.

In `@src/lib/transcription/UtteranceBasedMerger.regression.test.ts`:
- Around line 15-22: The helper function asrFromTuples currently relies on type
inference; add an explicit return type annotation of ASRResult to its signature
to ensure structural mismatches are caught at definition time (e.g., change the
declaration of asrFromTuples to return ASRResult), import ASRResult if needed,
and keep usages of wordsFromTuples and WordTuple unchanged so the function still
maps WordTuple[] -> ASRResult.

In `@src/lib/transcription/WindowBuilder.test.ts`:
- Around line 34-51: createMutableMockRingBuffer duplicates
createMockRingBuffer; refactor by extracting the shared IRingBuffer fields into
a single factory (e.g., createMockRingBuffer) that accepts a current-frame
provider (or optional currentFrameRef) and sampleRate, then implement
createMutableMockRingBuffer to call that factory with a getCurrentFrame closure
and getFillCount computed from baseFrame and the provider; update uses of
createMutableMockRingBuffer and createMockRingBuffer to use the consolidated
factory so new IRingBuffer members only need to be added in one place.

---

Outside diff comments:
In `@src/lib/audio/AudioEngine.ts`:
- Around line 365-398: The forced UI-clear in reset() can be suppressed by the
visualization throttle; update reset() to bypass throttling by resetting the
throttle timestamp/state immediately before calling notifyVisualizationUpdate().
Concretely, locate the throttle state used by the visualization/update path (the
variable that tracks last visualization timestamp or throttle token used by
notifyVisualizationUpdate()) and set it to a value that forces an immediate send
(e.g., 0 or null) inside reset() right before invoking
notifyVisualizationUpdate(), so the blank update cannot be silently dropped;
keep the rest of reset() unchanged (ringBuffer.reset(), audioProcessor.reset(),
metrics reset, visualizationBuffer fill, windowCallbacks loop, etc.).

In `@src/lib/buffer/buffer.worker.ts`:
- Around line 73-81: The writeBatch method currently always creates 1-element
subarrays for entryDimension === 1 which is inefficient; update writeBatch (the
function named writeBatch using this.buffer, this.entryDimension,
this.globalWriteIndex and this.maxEntries) to mirror the fast-path from write():
if this.entryDimension === 1, avoid subarray + this.buffer.set and instead
assign directly into this.buffer at the computed index ((this.globalWriteIndex %
this.maxEntries) * this.entryDimension) using the scalar value from
data[srcOffset]; otherwise keep the existing subarray + set logic and still
increment this.globalWriteIndex and loop for n entries.

In `@src/lib/transcription/transcription.worker.ts`:
- Around line 221-237: The code passes an unsupported option enableProfiling
into model.transcribe (see the v4TranscribeResult call in
transcription.worker.ts); remove enableProfiling from the transcribe call and
instead control profiling at model initialization (ParakeetModel.fromUrls) or
via whatever runtime toggle the parakeet.js API exposes; update any logic that
attempted to toggle profiling per-call to instead set the profiler flag when
creating the model instance (or implement an alternative non-API profiling skip
mechanism) so model.transcribe is only called with supported options like
precomputedFeatures, returnTimestamps, timeOffset, and incremental.
- Line 264: The postMessage payload is always including metrics even when
v4TranscribeResult.metrics is undefined, inflating wire payloads; update the
object construction in transcription.worker.ts where you build the
V4ProcessResult (the object containing v4TranscribeResult and passed to
postMessage) to only include the metrics property when
v4TranscribeResult.metrics is defined (e.g., use a conditional spread or an if
branch to add metrics to the payload), leaving all other fields unchanged so
consumers (App.tsx) that check result.metrics continue to work.

In `@src/lib/transcription/TranscriptionWorkerClient.ts`:
- Around line 210-213: The dispose() currently clears pendingPromises without
settling them, causing awaiting callers of transfer-based methods (e.g.,
processChunk, processV3Chunk, and other transfer-enabled calls) to hang; update
dispose() in TranscriptionWorkerClient to iterate pendingPromises, call the
reject handler for each entry with a clear Error (for example "Worker disposed")
so all in-flight promises are rejected, then clear the map and terminate the
worker (this ensures pendingPromises entries are settled via their stored
resolve/reject callbacks before calling this.worker.terminate()).

    print(f"\n--- WASM: {len(s['wasm'])} events ---")
    for w in s["wasm"][:5]:
        print(f" {w['dur_ms']}ms {w['name']} {w['src_func']}")
from trace_ultimate import analyze, print_report

🧹 Nitpick | 🔵 Trivial

Bare import relies on script-directory resolution.

from trace_ultimate import analyze, print_report works when the script is executed directly (Python adds the script's directory to sys.path), but would fail if this module is ever imported from outside metrics/. Since the README documents running it as a standalone script, this is acceptable — just be aware it's not importable as a package.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/analyze_chrome_trace.py` at line 19, The bare import "from
trace_ultimate import analyze, print_report" will break when this module is
imported from outside metrics; change it to a package-safe import by using a
relative import (from .trace_ultimate import analyze, print_report) or implement
an import fallback: attempt the relative import first and fall back to the
top-level import to preserve direct-script execution. Update the import line in
analyze_chrome_trace.py accordingly so functions analyze and print_report are
resolved robustly.

from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple

🧹 Nitpick | 🔵 Trivial

Deprecated typing imports — use modern equivalents.

Dict, List, Tuple, Iterable, Sequence from typing are deprecated since Python 3.9. Since the file already uses from __future__ import annotations, you can use the built-in generics and collections.abc directly.

♻️ Proposed fix
-from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple
+from collections.abc import Iterable, Sequence
+from typing import Any, Optional

Then replace Dict → dict, List → list, Tuple → tuple throughout annotations. With from __future__ import annotations, this is safe on Python 3.7+.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/memory/analyze_heap_memory.py` at line 15, The import line currently
pulls typing generics (Dict, List, Tuple, Iterable, Sequence) which are
deprecated; update the import to remove those typing names and replace their
usages in annotations across the module (e.g., annotations referencing Dict,
List, Tuple, Iterable, Sequence) with built-in generics dict, list, tuple and/or
types from collections.abc (e.g., Iterable, Sequence) as appropriate; since the
file already has "from __future__ import annotations", it's safe to perform
these replacements (e.g., change Dict[...] → dict[...], List[...] → list[...],
Tuple[...] → tuple[...] and import Iterable/Sequence from collections.abc if
needed).


    @property
    def sort_time(self) -> datetime:
        return self.ts if self.ts else datetime.fromtimestamp(self.mtime)

⚠️ Potential issue | 🟡 Minor

Naive datetime from fromtimestamp is inconsistent with UTC-aware now_iso().

Line 38 (now_iso) uses timezone.utc, but sort_time falls back to datetime.fromtimestamp(self.mtime) which creates a naive datetime in the local timezone. This can produce incorrect ordering when comparing captures across timezone changes.

🛡️ Proposed fix
     `@property`
     def sort_time(self) -> datetime:
-        return self.ts if self.ts else datetime.fromtimestamp(self.mtime)
+        return self.ts if self.ts else datetime.fromtimestamp(self.mtime, tz=timezone.utc)

Note: parse_ts (Line 56) also produces naive datetimes, so both sides would be consistent as-is. But if you add tz= here, consider also making parse_ts timezone-aware for full consistency.


🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/memory/analyze_heap_memory.py` at line 74, The fallback in sort_time
returns a naive datetime via datetime.fromtimestamp(self.mtime), which is
inconsistent with the UTC-aware now_iso(); change the fallback to create a
UTC-aware datetime (e.g., use datetime.fromtimestamp(self.mtime, timezone.utc))
so sort_time returns timezone-aware timestamps, and if you add tz here also
update parse_ts to produce timezone-aware datetimes for full consistency; refer
to the sort_time method and parse_ts function when making the changes.

    return issues


class G:

🧹 Nitpick | 🔵 Trivial

Single-letter class names G and A hurt readability in a 968-line file.

G (heap graph) and A (analysis result) are used throughout the file. More descriptive names like HeapGraph and AnalysisResult would make the code self-documenting without impacting functionality.

Also applies to: 325-326

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/memory/analyze_heap_memory.py` at line 131, Rename the two
single-letter classes G and A to descriptive names (e.g., G -> HeapGraph and A
-> AnalysisResult) and update all references across the file: class definitions,
instantiations, type annotations, imports/exports, docstrings, and any
forward-references so the new names replace every occurrence of G and A used as
types or constructors; keep the classes' internals and behavior unchanged and
run existing tests/lint to catch any missed references.

Comment on lines +226 to +231
    def comp(v: int) -> None:
        if anc[v] != -1 and anc[anc[v]] != -1:
            comp(anc[v])
            if semi[lab[anc[v]]] < semi[lab[v]]:
                lab[v] = lab[anc[v]]
            anc[v] = anc[anc[v]]

⚠️ Potential issue | 🟠 Major

Recursive comp may overflow the stack on large heap snapshots.

The comp function in the Lengauer-Tarjan dominator computation uses recursion for path compression. Python's default recursion limit is 1000. While path compression keeps amortized depth low, the first traversal of a deep ancestor chain (before compression) can be O(depth-of-DFS-tree). Production heap snapshots with millions of nodes can have deep chains.

Convert to an iterative implementation to avoid RecursionError.

🔧 Proposed iterative replacement
-    def comp(v: int) -> None:
-        if anc[v] != -1 and anc[anc[v]] != -1:
-            comp(anc[v])
-            if semi[lab[anc[v]]] < semi[lab[v]]:
-                lab[v] = lab[anc[v]]
-            anc[v] = anc[anc[v]]
+    def comp(v: int) -> None:
+        # Iterative path compression to avoid RecursionError on large heaps
+        stack = []
+        u = v
+        while anc[u] != -1 and anc[anc[u]] != -1:
+            stack.append(u)
+            u = anc[u]
+        for u in stack:
+            if semi[lab[anc[u]]] < semi[lab[u]]:
+                lab[u] = lab[anc[u]]
+            anc[u] = anc[anc[u]]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/memory/analyze_heap_memory.py` around lines 226 - 231, The recursive
comp(v) can overflow on deep ancestor chains; replace it with an iterative
routine that walks up anc[] until you reach a node u with anc[u] == -1 or
anc[anc[u]] == -1, collecting visited nodes on a small list/stack, then unwind
that list to perform the same updates: for each x in visited (in reverse order)
if semi[lab[anc[x]]] < semi[lab[x]] set lab[x] = lab[anc[x]] and then set anc[x]
= anc[anc[x]]. Modify the existing comp function (referenced as comp, anc, lab,
semi) to use this loop-and-unwind approach so it no longer uses recursion but
preserves the original path-compression logic.

// Logging throttle for getFeatures (avoid console spam)
let lastGetFeaturesLogTime = 0;
const GET_FEATURES_LOG_INTERVAL = 5000; // Log every 5 seconds max
const MEL_WORKER_VERBOSE_LOGS = false;

🧹 Nitpick | 🔵 Trivial

Hardcoded verbose flag requires rebuild to enable.

MEL_WORKER_VERBOSE_LOGS is a compile-time constant set to false. Consider accepting a verbose flag in the INIT message payload so developers can enable worker logging at runtime without modifying source and rebuilding.
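One possible shape for that, as a sketch assuming an INIT payload with a verbose field (which the current message does not carry):

    // Sketch: runtime-configurable verbose flag instead of a compile-time const.
    let verboseLogs = false;

    self.onmessage = (event: MessageEvent) => {
        if (event.data?.type === 'INIT') {
            verboseLogs = event.data.verbose === true; // missing field defaults to false
        }
        if (verboseLogs) console.log('[mel.worker] message:', event.data?.type);
    };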

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/audio/mel.worker.ts` at line 64, Replace the compile-time constant
MEL_WORKER_VERBOSE_LOGS with a runtime-configurable variable and read a verbose
flag from the worker INIT message payload: change MEL_WORKER_VERBOSE_LOGS from
const to a mutable (e.g., let) module-level variable, update the message handler
that processes the "INIT" event (the existing INIT message handling logic) to
check event.data.verbose (defaulting to false if missing) and set the variable
accordingly, and use that variable everywhere verbose logging is currently
gated; also validate the payload type to avoid runtime errors.

Comment on lines +155 to +179
it('should preserve scalar layer values via QUERY_RANGE', async () => {
    await sendRequest(worker, 'INIT', defaultConfig(), nextId++);

    worker.postMessage({ type: 'WRITE', payload: { layer: 'energyVad', data: [0.1], globalSampleOffset: 0 } });
    worker.postMessage({ type: 'WRITE', payload: { layer: 'energyVad', data: [0.5], globalSampleOffset: 1280 } });
    worker.postMessage({ type: 'WRITE', payload: { layer: 'energyVad', data: [0.9], globalSampleOffset: 2560 } });
    await new Promise((r) => setTimeout(r, 50));

    const response = await sendRequest(
        worker,
        'QUERY_RANGE',
        { startSample: 0, endSample: 1280 * 3, layers: ['energyVad'] },
        nextId++
    );

    expect(response.type).toBe('QUERY_RANGE');
    const slice = response.payload?.layers?.energyVad;
    expect(slice).toBeTruthy();
    expect(slice.entryDimension).toBe(1);
    expect(slice.entryCount).toBe(3);
    expect(slice.data.length).toBe(3);
    expect(slice.data[0]).toBeCloseTo(0.1, 5);
    expect(slice.data[1]).toBeCloseTo(0.5, 5);
    expect(slice.data[2]).toBeCloseTo(0.9, 5);
});

🧹 Nitpick | 🔵 Trivial

Test logic is correct; setTimeout(50) is redundant for ordering.

The traced math is sound: three writes at globalSampleOffset 0/1280/2560 → entries 0/1/2 → readRange(0, 3840) produces count = 3, and the value assertions match.

However, since the worker message queue is FIFO, the QUERY_RANGE postMessage will always be enqueued after the three WRITE messages and processed in order — the setTimeout(50) delay is not needed for ordering correctness. The timeout only protects against a hypothetical non-FIFO scheduling anomaly in the @vitest/web-worker in-thread simulation, and it makes the test suite slower under normal conditions. The existing tests use the same pattern, so this is consistent, but worth revisiting as a suite-wide cleanup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/buffer/buffer.worker.test.ts` around lines 155 - 179, Remove the
redundant artificial delay: delete the line awaiting new Promise((r) =>
setTimeout(r, 50)) in the "should preserve scalar layer values via QUERY_RANGE"
test; the worker.postMessage calls for WRITE are FIFO so the subsequent
sendRequest call with type 'QUERY_RANGE' will be processed after the three WRITE
messages—no setTimeout needed—so keep the sendRequest invocation and all
assertions intact and only remove the timeout await.

Comment on lines +62 to 66
    if (this.entryDimension === 1) {
        this.buffer[writePos] = (data as any)[0] ?? 0;
        this.globalWriteIndex++;
        return;
    }

⚠️ Potential issue | 🟡 Minor

(data as any)[0] ?? 0 silently writes 0 for empty input.

When data is an empty Float32Array or [], both return undefined at index 0. The ?? 0 fallback writes 0 without any diagnostic, silently corrupting the buffer entry. The as any cast is also avoidable — both members of the Float32Array | number[] union support numeric index access.

🛡️ Proposed fix
-        this.buffer[writePos] = (data as any)[0] ?? 0;
+        if (data.length === 0) return; // or throw — empty write is a caller bug
+        this.buffer[writePos] = (data as Float32Array | number[])[0] as number;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/buffer/buffer.worker.ts` around lines 62 - 66, The current
single-dimension write silently writes 0 for empty input and uses an unnecessary
cast; in the block guarded by this.entryDimension === 1 (where
this.buffer[writePos] is assigned and this.globalWriteIndex++ returned), replace
the blind (data as any)[0] ?? 0 with a proper length check on the union type
(Float32Array | number[]) — e.g. if data.length === 0 then throw or return an
error/log instead of writing 0; otherwise assign data[0] directly (no as any
cast). Update the code around writePos, this.buffer and this.globalWriteIndex so
the counter only increments after a successful valid write.

Comment on lines +15 to +22
function asrFromTuples(tuples: WordTuple[]) {
    const words = wordsFromTuples(tuples);
    return {
        utterance_text: words.map((w) => w.text).join(' '),
        words,
        end_time: words.length > 0 ? words[words.length - 1].end_time : 0,
    };
}

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

# Search for processASRResult method and its parameter type
rg -n "processASRResult" --type=ts -A 5 -B 2

Repository: ysdede/keet

Length of output: 21260


🏁 Script executed:

# Search for ASRResult type definition
rg -n "ASRResult" --type=ts -A 3 -B 1

Repository: ysdede/keet

Length of output: 15648


🏁 Script executed:

# Get the UtteranceBasedMerger class definition and method signatures
rg -n "class UtteranceBasedMerger|interface.*ASR|type.*ASR" --type=ts -A 10

Repository: ysdede/keet

Length of output: 5301


Consider adding an explicit return type annotation to asrFromTuples for clarity.

The function returns a structure compatible with ASRResult, but the inferred type shifts any structural-mismatch errors to call sites rather than the definition. If ASRResult evolves in the future, an explicit return type catches breaking changes at the helper rather than scattered across the test file.

♻️ Suggested annotation
-function asrFromTuples(tuples: WordTuple[]) {
+function asrFromTuples(tuples: WordTuple[]): Parameters<UtteranceBasedMerger['processASRResult']>[0] {

Alternatively, import and use ASRResult directly if preferred.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/transcription/UtteranceBasedMerger.regression.test.ts` around lines
15 - 22, The helper function asrFromTuples currently relies on type inference;
add an explicit return type annotation of ASRResult to its signature to ensure
structural mismatches are caught at definition time (e.g., change the
declaration of asrFromTuples to return ASRResult), import ASRResult if needed,
and keep usages of wordsFromTuples and WordTuple unchanged so the function still
maps WordTuple[] -> ASRResult.

Comment on lines +34 to +51
function createMutableMockRingBuffer(
    baseFrame: number,
    currentFrameRef: { value: number },
    sampleRate: number = 16000
): IRingBuffer {
    return {
        sampleRate,
        maxFrames: 16000 * 120,
        write: () => {},
        read: () => new Float32Array(0),
        getCurrentFrame: () => currentFrameRef.value,
        getFillCount: () => Math.max(0, currentFrameRef.value - baseFrame),
        getSize: () => 16000 * 120,
        getCurrentTime: () => currentFrameRef.value / sampleRate,
        getBaseFrameOffset: () => baseFrame,
        reset: () => {},
    };
}

🧹 Nitpick | 🔵 Trivial

createMutableMockRingBuffer duplicates createMockRingBuffer almost entirely.

The two helpers share identical fields (sampleRate, maxFrames, write, read, getSize, getBaseFrameOffset, reset) and differ only in how currentFrame is sourced. A simple consolidation avoids the drift risk if IRingBuffer grows new required members:

♻️ Optional consolidation
-function createMockRingBuffer(
+function createMockRingBuffer(
     baseFrame: number,
     currentFrame: number,
     sampleRate: number = 16000
 ): IRingBuffer {
-    return {
-        sampleRate,
-        maxFrames: 16000 * 120,
-        write: () => {},
-        read: () => new Float32Array(0),
-        getCurrentFrame: () => currentFrame,
-        getFillCount: () => Math.max(0, currentFrame - baseFrame),
-        getSize: () => 16000 * 120,
-        getCurrentTime: () => currentFrame / sampleRate,
-        getBaseFrameOffset: () => baseFrame,
-        reset: () => {},
-    };
+    return createMutableMockRingBuffer(baseFrame, { value: currentFrame }, sampleRate);
 }

 function createMutableMockRingBuffer(
     baseFrame: number,
     currentFrameRef: { value: number },
     sampleRate: number = 16000
 ): IRingBuffer {
     return {
         sampleRate,
         maxFrames: 16000 * 120,
         write: () => {},
         read: () => new Float32Array(0),
         getCurrentFrame: () => currentFrameRef.value,
         getFillCount: () => Math.max(0, currentFrameRef.value - baseFrame),
         getSize: () => 16000 * 120,
         getCurrentTime: () => currentFrameRef.value / sampleRate,
         getBaseFrameOffset: () => baseFrame,
         reset: () => {},
     };
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/transcription/WindowBuilder.test.ts` around lines 34 - 51,
createMutableMockRingBuffer duplicates createMockRingBuffer; refactor by
extracting the shared IRingBuffer fields into a single factory (e.g.,
createMockRingBuffer) that accepts a current-frame provider (or optional
currentFrameRef) and sampleRate, then implement createMutableMockRingBuffer to
call that factory with a getCurrentFrame closure and getFillCount computed from
baseFrame and the provider; update uses of createMutableMockRingBuffer and
createMockRingBuffer to use the consolidated factory so new IRingBuffer members
only need to be added in one place.

- Add opt-out transferOwnership flags and ownership docs in TranscriptionWorkerClient.
- Introduce HeapGraph wrapper + detector/markdown helpers in heap analyzer.
- Extract bottleneck scoring/evidence helpers in trace analyzer.

Addresses review comments #2823766418, #2823766426, #2823766432.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/transcription/TranscriptionWorkerClient.ts (1)

133-176: 🧹 Nitpick | 🔵 Trivial

processV3ChunkWithFeatures is missing transfer-ownership JSDoc present on every other updated method.

Lines 133–136, 141–144, and 170–173 all carry identical JSDoc blocks documenting the default transfer semantics and the transferOwnership: false opt-out. processV3ChunkWithFeatures (lines 153–155) only documents the mel-features bypassing behavior and silently acquired the same buffer-transfer semantics, making the transfer behavior a discovery surprise for callers.

📝 Proposed doc addition
     /**
      * Process a chunk using pre-computed mel features (from mel worker).
      * Bypasses the preprocessor in the inference worker entirely.
+     * Default behavior transfers `features.buffer` to worker ownership.
+     * Pass `transferOwnership: false` when the caller must keep using `features`.
      */
     async processV3ChunkWithFeatures(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/transcription/TranscriptionWorkerClient.ts` around lines 133 - 176,
Add the same transfer-ownership JSDoc present on the other methods to
processV3ChunkWithFeatures: document that by default the method transfers
features.buffer to worker ownership and that callers can opt out with
options.transferOwnership = false; ensure the comment sits above the async
processV3ChunkWithFeatures declaration and mentions the transfer semantics
consistent with the existing methods that call
this.transferList(features.buffer, options.transferOwnership).
📜 Review details

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e7b3a54 and 1269719.

📒 Files selected for processing (3)
  • metrics/memory/analyze_heap_memory.py
  • metrics/trace_ultimate.py
  • src/lib/transcription/TranscriptionWorkerClient.ts
🧰 Additional context used
🧬 Code graph analysis (2)
src/lib/transcription/TranscriptionWorkerClient.ts (3)
src/lib/transcription/types.ts (1)
  • TranscriptionResult (28-39)
src/types/index.ts (1)
  • TranscriptionResult (44-49)
src/lib/transcription/TokenStreamTranscriber.ts (1)
  • TokenStreamResult (44-59)
metrics/trace_ultimate.py (1)
metrics/trace_analyze.py (2)
  • pname (34-35)
  • tname (31-32)
🪛 Ruff (0.15.1)
metrics/trace_ultimate.py

[warning] 50-50: Prefer TypeError exception for invalid type

(TRY004)


[warning] 50-50: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 271-271: Missing return type annotation for private function row

(ANN202)


[warning] 593-593: Use explicit conversion flag

Replace with conversion flag

(RUF010)

metrics/memory/analyze_heap_memory.py

[warning] 15-15: Import from collections.abc instead: Iterable, Sequence

Import from collections.abc

(UP035)


[warning] 15-15: typing.Dict is deprecated, use dict instead

(UP035)


[warning] 15-15: typing.List is deprecated, use list instead

(UP035)


[warning] 15-15: typing.Tuple is deprecated, use tuple instead

(UP035)


[warning] 56-56: Naive datetime constructed using datetime.datetime.strptime() without %z

(DTZ007)


[warning] 74-74: datetime.datetime.fromtimestamp() called without a tz argument

(DTZ006)


[warning] 79-79: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 97-97: Avoid specifying long messages outside the exception class

(TRY003)


[error] 101-101: Boolean-typed positional argument in function definition

(FBT001)


[error] 180-180: Boolean-typed positional argument in function definition

(FBT001)


[error] 180-180: Boolean default positional argument in function definition

(FBT002)


[warning] 337-337: Remove quotes from type annotation

Remove quotes

(UP037)


[error] 372-372: Boolean-typed positional argument in function definition

(FBT001)


[warning] 444-444: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


[error] 643-643: Boolean-typed positional argument in function definition

(FBT001)


[error] 643-643: Boolean-typed positional argument in function definition

(FBT001)


[error] 643-643: Boolean-typed positional argument in function definition

(FBT001)


[error] 643-643: Boolean default positional argument in function definition

(FBT002)


[warning] 695-695: Unused function argument: comparisons

(ARG001)


[warning] 695-695: Unused function argument: top_n

(ARG001)


[error] 703-703: Boolean positional value in function call

(FBT003)


[warning] 731-731: Unused function argument: comparisons

(ARG001)


[warning] 731-731: Unused function argument: top_n

(ARG001)


[warning] 769-769: Unused function argument: comparisons

(ARG001)


[error] 779-779: Boolean positional value in function call

(FBT003)


[error] 779-779: Boolean positional value in function call

(FBT003)


[warning] 801-801: Unused function argument: comparisons

(ARG001)


[warning] 801-801: Unused function argument: top_n

(ARG001)

🔇 Additional comments (19)
metrics/trace_ultimate.py (9)

26-37: LGTM!

to_float and percentile are clean utility helpers with safe fallback handling.


40-52: LGTM!

Trace loading handles both common Chrome trace formats (object with traceEvents and bare array) with appropriate validation.


55-73: LGTM!

Duration calculation correctly scopes to renderer events when available and gracefully handles missing timestamps.


76-155: LGTM!

Metadata extraction is thorough with sensible fallbacks for renderer PID selection (CrRendererMain presence → most-threads heuristic → first renderer PID).


173-233: LGTM!

Thread summarization is well-structured with proper null guards and correct unit conversions (µs → ms).


286-320: LGTM!

AudioWorklet interval analysis correctly selects the predominant task event name and computes scheduling metrics. The early-return shape is compatible with callers that use .get() defaults.


431-525: LGTM!

Bottleneck ranking covers all worker roles and AudioWorklet with proper zero-duration guards. Scoring functions appropriately weight different signals.


528-573: LGTM!

Markdown report generation is clean and well-structured.


616-673: LGTM!

The analyze orchestration correctly chains metadata extraction, thread summarization, GC analysis, role inference, KPI computation, and optional baseline comparison.

metrics/memory/analyze_heap_memory.py (8)

477-556: LGTM!

The compare function correctly computes constructor/type deltas, bucket deltas, and cross-validates total heap delta against constructor/type sums.


590-631: LGTM!

The pending_info function correctly walks the pendingPromises → table → entries → features → buffer → backing_store chain to find the largest feature buffer.


656-658: max(old.g.oid) on an empty heap snapshot: flagged, then verified safe.

Initial concern: if the older capture has zero nodes (old.g.n == 0), old.g.oid would be an empty list and max() would raise ValueError, suggesting a guard such as old_max = max(old.g.oid) if old.g.n > 0 else None.

On closer inspection, old.g.oid is initialized as [0] * self.n, so it is truthy whenever n > 0 and empty only when n == 0, and the existing if old.g.oid else None guard already covers the empty case because an empty list is falsy. The guard is correct as written; no change needed.


1007-1037: LGTM!

The main function correctly orchestrates discovery, loading, analysis, pairwise/anchor comparison, findings generation, and report output.


77-98: LGTM!

File discovery with glob expansion, directory recursion, and deduplication via resolved paths is well-structured.


1-48: LGTM!

Utility functions (now_iso, bfmt, parse_ts) and module constants are well-defined.


190-279: Dominator tree computation is well-structured.

The iterative DFS and Lengauer-Tarjan dominator algorithm implementation is correct. The retained-size post-order accumulation is also sound.

Note: The recursive comp function (lines 226-231) was flagged in a previous review for stack overflow risk on deep ancestor chains.


838-862: LGTM!

The findings orchestrator correctly runs all detectors with a uniform interface, scores by a combined impact/confidence metric, and assigns stable ranks.

src/lib/transcription/TranscriptionWorkerClient.ts (2)

104-110: sendRequest transfer plumbing looks correct.

transfer || [] correctly falls back to an empty transfer list, preserving existing non-transfer call semantics for all callers that don't supply a transfer array.


207-215: processV4ChunkWithFeatures destructuring is clean and correct.

Separating transferOwnership from the rest of params before sending ensures the worker doesn't receive the client-side flag in its payload, and payload.features.buffer is always defined because features is a required field.
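The shape being praised, as a standalone sketch (hypothetical parameter type; the SharedArrayBuffer guard echoes the separate transferList comment):

    // Sketch: strip the client-only flag before it crosses the wire, and derive
    // the transfer list from it; features is required, so features.buffer always exists.
    interface V4Params { features: Float32Array; transferOwnership?: boolean; timeOffset?: number; }

    function buildMessage(params: V4Params) {
        const { transferOwnership, ...payload } = params;
        const buf = payload.features.buffer;
        // Only ArrayBuffer is transferable; SharedArrayBuffer must not be transferred.
        const transfer: Transferable[] =
            transferOwnership === false || !(buf instanceof ArrayBuffer) ? [] : [buf];
        return { payload, transfer };
    }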

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@metrics/memory/analyze_heap_memory.py`:
- Around line 167-175: The loop that computes w = self.edges[p + self.eto] //
self.nfc and then does self.pred[w].append(v) may index out of range if the
snapshot's to_node is corrupted; add a bounds check before appending: after
computing w, verify 0 <= w < len(self.pred) (or self.n) and if not, increment a
corruption counter or log a contextual error including v, p, the raw edge value
and computed w, then skip that edge instead of appending; update any tests or
callers expecting strong_edges to only count successful appends and ensure
self.strong_edges is incremented only when the append happens.
- Around line 975-978: The Markdown table is wrongly hardcoded to show 20 items
(`[:20]`) instead of honoring the configurable top_n; update the call that
builds ls via _md_table to slice using the configured top_n (e.g.,
`analyses[-1].top_ret[:top_n]`) and if report_md or its caller doesn’t currently
receive top_n, add a top_n parameter (propagate it into report_md and any
callers that generate ls) so the same top_n is used consistently; change
references to `top_ret[:20]` to use the top_n variable and ensure variable name
matches the existing parameter (top_n) or the analysis object property.
- Around line 372-375: The analyze_cap function currently calls
HeapGraph.from_payload even when validate(payload, strict) returns issues in
non-strict mode, allowing HeapGraph.__init__ to access payload["snapshot"] and
raise an uninformative KeyError; update analyze_cap to inspect the returned
issues from validate(payload, strict) and if any fatal/missing-snapshot issues
are present (or any issue that would prevent building the graph), stop and
propagate a structured error or raise an informative exception before calling
HeapGraph.from_payload (reference analyze_cap, validate, HeapGraph.from_payload
and the payload["snapshot"] access in G.__init__).

In `@metrics/trace_ultimate.py`:
- Line 682: The CLI parser currently defines a dead argument via
p.add_argument('--window-sec', dest='window_sec', type=int, default=30) that is
never used; either remove this argument from the argparse setup or add handling
that emits a warning when the user supplies a non-default value (check
args.window_sec after parsing) and document that the feature is unimplemented;
locate the parser definition where p.add_argument('--window-sec'...) is declared
and either delete that line or add a conditional like if args.window_sec != 30:
logger.warning(...) to inform users the option has no effect.

In `@src/lib/transcription/TranscriptionWorkerClient.ts`:
- Around line 112-114: The transferList method currently accepts ArrayBufferLike
and unsafely casts to ArrayBuffer, which can pass a SharedArrayBuffer and cause
postMessage to throw DataCloneError; change transferList (and any callers if
needed) to either accept only ArrayBuffer or add a runtime guard inside
transferList that checks the buffer is an instance of ArrayBuffer (and that
transferOwnership !== false) before including it in the return array, otherwise
return an empty array; update the signature or guard to avoid using the "as
ArrayBuffer" cast and ensure non-transferable SharedArrayBuffer values are
skipped.

---

Outside diff comments:
In `@src/lib/transcription/TranscriptionWorkerClient.ts`:
- Around line 133-176: Add the same transfer-ownership JSDoc present on the
other methods to processV3ChunkWithFeatures: document that by default the method
transfers features.buffer to worker ownership and that callers can opt out with
options.transferOwnership = false; ensure the comment sits above the async
processV3ChunkWithFeatures declaration and mentions the transfer semantics
consistent with the existing methods that call
this.transferList(features.buffer, options.transferOwnership).

---

Duplicate comments:
In `@metrics/memory/analyze_heap_memory.py`:
- Line 131: Rename the single-letter classes G and A to descriptive names (e.g.,
HeapAnalyzer, AllocationRecord or names that reflect their responsibilities) and
update all references/usages, imports, type hints, and any tests accordingly
(search for "class G" and "class A" and references like G(...) or A. to find
usages); ensure constructors (__init__) and repr/str methods preserve behavior
and run tests/lint to confirm no breakages.
- Around line 226-231: The recursive comp(v) can overflow for large inputs;
replace its recursion with an iterative loop using an explicit stack: inside
comp, walk up ancestors starting from v while anc[u] != -1 and anc[anc[u]] !=
-1, pushing visited nodes onto a list, then pop/process that list in reverse
applying the same logic (compare semi[lab[anc[node]]] to semi[lab[node]] to
update lab[node], then set anc[node] = anc[anc[node]]). Update the function comp
to use this push-then-pop approach (referencing comp, anc, lab, semi) to
preserve semantics while avoiding recursion.
- Line 15: The current import line pulls deprecated typing names (Any, Dict,
Iterable, List, Optional, Sequence, Tuple) — replace these with modern
equivalents: drop typing.List/Dict/Tuple in favor of built-in list/dict/tuple
annotations (PEP 585), import only abstract types from collections.abc when
needed (e.g., Iterable, Sequence), and replace Optional[...] with the PEP 604
union syntax (T | None) or use explicit None unions; update any function/class
annotations in this module (search for uses of Any, Dict, List, Tuple, Optional,
Iterable, Sequence) to the new forms and remove the original typing import line
so only required sources remain.
- Around line 72-74: The sort_time property returns a naive datetime by calling
datetime.fromtimestamp(self.mtime); change it to always return a timezone-aware
datetime: when using self.mtime call datetime.fromtimestamp(self.mtime,
tz=timezone.utc), and if self.ts exists ensure it is timezone-aware (e.g., if
self.ts.tzinfo is None, attach/convert to timezone.utc) so sort_time always
yields an aware datetime (refer to the sort_time property and the ts and mtime
attributes); a sketch follows this list.
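A minimal iterative sketch of comp as described above, assuming the anc/lab/semi arrays and the -1 sentinel referenced in the comment (illustrative, not the module's actual code):

def comp(v: int, anc: list[int], lab: list[int], semi: list[int]) -> None:
    # Collect the ancestor chain that still needs compression.
    stack = []
    u = v
    while anc[u] != -1 and anc[anc[u]] != -1:
        stack.append(u)
        u = anc[u]
    # Process in reverse (topmost first) so each node's ancestor is already
    # compressed, mirroring the unwind order of the recursive version.
    for node in reversed(stack):
        if semi[lab[anc[node]]] < semi[lab[node]]:
            lab[node] = lab[anc[node]]
        anc[node] = anc[anc[node]]

For the typing cleanup, an illustrative before/after (the signatures are hypothetical, not the module's):

# Before: deprecated typing aliases
from typing import Dict, List, Optional

def summarize_before(rows: List[Dict[str, int]]) -> Optional[str]:
    ...

# After: PEP 585 built-ins and a PEP 604 union; abstract types, when needed,
# come from collections.abc (e.g. Iterable, Sequence).
def summarize_after(rows: list[dict[str, int]]) -> str | None:
    ...

And a sketch of the timezone-aware sort_time property, assuming the ts and mtime attributes named in the comment (only the relevant parts of the class are shown):

from datetime import datetime, timezone

class CaptureFile:  # name from the module; shape of ts/mtime is assumed
    ts: datetime | None
    mtime: float

    @property
    def sort_time(self) -> datetime:
        # Prefer the embedded timestamp; attach UTC if it is naive.
        if self.ts is not None:
            return self.ts if self.ts.tzinfo else self.ts.replace(tzinfo=timezone.utc)
        # An explicit tz makes fromtimestamp return an aware datetime.
        return datetime.fromtimestamp(self.mtime, tz=timezone.utc)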

In `@metrics/trace_ultimate.py`:
- Around line 246-249: The code directly casts e.get('tid') to int which raises
TypeError when tid is missing; update the block in the loop that handles events
(variables e, tid, dur_us, name, cat) to safely handle missing or invalid tid by
first retrieving tid_val = e.get('tid') and either continue the loop if tid_val
is None/empty or wrap the int(...) conversion in a try/except (catching
TypeError/ValueError) and skip/log the event on failure; ensure subsequent use
of tid uses the safely parsed integer so downstream logic (e.g.,
to_float(e.get('dur'))) remains correct (a sketch follows this list).
- Line 20: Remove the unused import of the math module by deleting the top-level
"import math" statement in metrics/trace_ultimate.py (the unused symbol is
"math"); if any math functions are actually needed later, replace the import
with targeted function imports (e.g., "from math import sqrt") where used,
otherwise simply remove the import line.
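A guarded version of the tid parse per the comment above (the loop shape and to_float come from the surrounding code; exact names are assumed):

for e in events:
    tid_val = e.get('tid')
    if tid_val is None:
        continue  # no thread id on this event; skip rather than crash
    try:
        tid = int(tid_val)
    except (TypeError, ValueError):
        continue  # malformed tid; skip the event
    dur_us = to_float(e.get('dur'))
    # ... downstream logic keyed on the safely parsed tid ...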

Comment on lines +167 to +175
for v in range(self.n):
st, en = self.estart[v], self.estart[v] + self.ec[v] * self.efc
for p in range(st, en, self.efc):
et = self.etypes[self.edges[p + self.eti]]
if et == "weak":
continue
w = self.edges[p + self.eto] // self.nfc
self.pred[w].append(v)
self.strong_edges += 1

⚠️ Potential issue | 🟡 Minor

Missing bounds check on edge target node index.

w = self.edges[p + self.eto] // self.nfc could produce an out-of-range index if the snapshot's to_node field is corrupted, causing an IndexError on self.pred[w].append(v) with no contextual error message.

🛡️ Proposed fix
                 w = self.edges[p + self.eto] // self.nfc
+                if w < 0 or w >= self.n:
+                    continue
                 self.pred[w].append(v)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/memory/analyze_heap_memory.py` around lines 167-175: the loop that
computes w = self.edges[p + self.eto] // self.nfc and then does
self.pred[w].append(v) may index out of range if the snapshot's to_node is
corrupted; add a bounds check before appending: after computing w, verify 0 <= w
< len(self.pred) (or self.n) and if not, increment a corruption counter or log a
contextual error including v, p, the raw edge value and computed w, then skip
that edge instead of appending; update any tests or callers expecting
strong_edges to only count successful appends and ensure self.strong_edges is
incremented only when the append happens.
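Expanding the proposed fix with the contextual handling the prompt asks for (self.corrupt_edges and log are stand-ins, not existing names in the module):

w = self.edges[p + self.eto] // self.nfc
if not (0 <= w < self.n):
    # Corrupted to_node field: record context and skip the edge.
    self.corrupt_edges += 1
    log.warning("edge at %d from node %d has out-of-range target %d", p, v, w)
    continue
self.pred[w].append(v)
self.strong_edges += 1  # incremented only when the append actually happens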

Comment on lines +372 to +375
def analyze_cap(cap: CaptureFile, payload: Dict[str, Any], strict: bool, top_n: int) -> A:
issues = validate(payload, strict)
hg = HeapGraph.from_payload(payload, root=0)
g = hg.raw

⚠️ Potential issue | 🟡 Minor

Non-strict validation allows G.__init__ to crash with an uninformative KeyError.

When strict=False, validate() returns issues without raising, but HeapGraph.from_payload → G.__init__ (line 134) immediately does payload["snapshot"], which raises a raw KeyError. The user loses the structured validation context.

Guard against this by checking for fatal issues before proceeding, even in non-strict mode:

🛡️ Proposed fix
 def analyze_cap(cap: CaptureFile, payload: Dict[str, Any], strict: bool, top_n: int) -> A:
     issues = validate(payload, strict)
+    if any("missing key" in i for i in issues):
+        raise ValueError(f"Cannot analyze {cap.id}: {'; '.join(issues)}")
     hg = HeapGraph.from_payload(payload, root=0)
🧰 Tools
🪛 Ruff (0.15.1)

[error] 372-372: Boolean-typed positional argument in function definition

(FBT001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/memory/analyze_heap_memory.py` around lines 372-375: the
analyze_cap function currently calls HeapGraph.from_payload even when
validate(payload, strict) returns issues in non-strict mode, allowing
HeapGraph.__init__ to access payload["snapshot"] and raise an uninformative
KeyError; update analyze_cap to inspect the returned issues from
validate(payload, strict) and if any fatal/missing-snapshot issues are present
(or any issue that would prevent building the graph), stop and propagate a
structured error or raise an informative exception before calling
HeapGraph.from_payload (reference analyze_cap, validate, HeapGraph.from_payload
and the payload["snapshot"] access in G.__init__).
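One way to fail fast, per the prompt (the "missing key" substring check mirrors the proposed fix; the real issue strings may differ):

def analyze_cap(cap: CaptureFile, payload: Dict[str, Any], strict: bool, top_n: int) -> A:
    issues = validate(payload, strict)
    fatal = [i for i in issues if "missing key" in i]
    if fatal:
        # Surface validation context instead of a bare KeyError from G.__init__.
        raise ValueError(f"Cannot analyze {cap.id}: {'; '.join(fatal)}")
    hg = HeapGraph.from_payload(payload, root=0)
    ...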

Comment on lines +975 to +978
ls += _md_table(
["Node", "Constructor", "Type", "Self", "Retained"],
[[str(r["node_index"]), f"`{r['constructor']}`", f"`{r['node_type']}`", str(r["self_size"]), str(r["retained_size"])] for r in analyses[-1].top_ret[:20]],
)

🧹 Nitpick | 🔵 Trivial

Hardcoded limit of 20 for retained objects in Markdown report.

Line 977 uses [:20] while the rest of the analysis pipeline uses the configurable top_n parameter. This means the Markdown appendix always shows at most 20 items regardless of the user's --top-n setting.

♻️ Proposed fix
-        [[str(r["node_index"]), f"`{r['constructor']}`", f"`{r['node_type']}`", str(r["self_size"]), str(r["retained_size"])] for r in analyses[-1].top_ret[:20]],
+        [[str(r["node_index"]), f"`{r['constructor']}`", f"`{r['node_type']}`", str(r["self_size"]), str(r["retained_size"])] for r in analyses[-1].top_ret[:len(rep["appendix"].get("largest_retained_groups", []))]],

Or pass top_n to report_md and use it consistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/memory/analyze_heap_memory.py` around lines 975-978: the Markdown
table is wrongly hardcoded to show 20 items (`[:20]`) instead of honoring the
configurable top_n; update the call that builds ls via _md_table to slice using
the configured top_n (e.g., `analyses[-1].top_ret[:top_n]`) and if report_md or
its caller doesn’t currently receive top_n, add a top_n parameter (propagate it
into report_md and any callers that generate ls) so the same top_n is used
consistently; change references to `top_ret[:20]` to use the top_n variable and
ensure variable name matches the existing parameter (top_n) or the analysis
object property.

p.add_argument('--output-json', dest='output_json', help='Output JSON path')
p.add_argument('--output-md', dest='output_md', help='Output Markdown path')
p.add_argument('--baseline-json', dest='baseline_json', help='Baseline analyzer JSON path')
p.add_argument('--window-sec', dest='window_sec', type=int, default=30, help='Reserved for trend window size')

🧹 Nitpick | 🔵 Trivial

Dead CLI argument --window-sec is parsed but never used.

This argument is documented as "Reserved for trend window size" but is never referenced after parsing. Dead arguments in a public CLI can confuse users who try to use them expecting an effect.

Consider removing it until the feature is implemented, or at minimum emitting a warning if the user supplies a non-default value.
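If the flag is kept, a minimal post-parse guard along the lines suggested (logging here is a stand-in; the script may prefer print or warnings):

import logging

args = p.parse_args()
if args.window_sec != 30:
    logging.warning("--window-sec is reserved for a future trend window and has no effect yet")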

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics/trace_ultimate.py` at line 682: the CLI parser currently defines a
dead argument via p.add_argument('--window-sec', dest='window_sec', type=int,
default=30) that is never used; either remove this argument from the argparse
setup or add handling that emits a warning when the user supplies a non-default
value (check args.window_sec after parsing) and document that the feature is
unimplemented; locate the parser definition where
p.add_argument('--window-sec'...) is declared and either delete that line or add
a conditional like if args.window_sec != 30: logger.warning(...) to inform users
the option has no effect.

Comment on lines +112 to +114
private transferList(buffer: ArrayBufferLike, transferOwnership?: boolean): Transferable[] {
return transferOwnership === false ? [] : [buffer as ArrayBuffer];
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check if there's a tsconfig or package.json to understand TypeScript version
if [ -f "tsconfig.json" ]; then
  echo "=== tsconfig.json ===" && cat tsconfig.json
fi

if [ -f "package.json" ]; then
  echo "=== package.json (TypeScript related) ===" && grep -A2 '"typescript"' package.json
fi

Repository: ysdede/keet

Length of output: 456


🌐 Web query:

TypeScript Float32Array.buffer type definition ArrayBufferLike ArrayBuffer

💡 Result:

In modern TypeScript libs (ES2024+/TS 5.7+), Float32Array is modeled as potentially being backed by either an ArrayBuffer or a SharedArrayBuffer, so its .buffer is not always typed as ArrayBuffer.

What the types are

  • Float32Array is generic over the backing buffer type, and its buffer property is that type parameter:
    interface Float32Array<TArrayBuffer extends ArrayBufferLike = ArrayBufferLike> { buffer: TArrayBuffer; ... } [2]
    This change is called out in the TS 5.7 release notes (“TypedArrays are now generic over ArrayBufferLike”). [1]

  • ArrayBufferLike is defined as:
    type ArrayBufferLike = ArrayBufferTypes[keyof ArrayBufferTypes] [3]

  • In ES2024 libs, ArrayBufferTypes includes both:
    ArrayBuffer and SharedArrayBuffer [4]
    So effectively ArrayBufferLike is (ArrayBuffer | SharedArrayBuffer) in those libs.

MDN also describes typed arrays as views over buffers of type ArrayBuffer or SharedArrayBuffer. [5]

If you specifically need an ArrayBuffer

1) Constrain the typed array type:

let f32: Float32Array<ArrayBuffer>;
const ab: ArrayBuffer = f32.buffer;

(Then .buffer is ArrayBuffer by construction.) [2]

2) Narrow at runtime:

const buf = f32.buffer;
if (buf instanceof ArrayBuffer) {
  // buf: ArrayBuffer
}

3) Copy into a new ArrayBuffer (works regardless of backing buffer):

const ab = f32.slice().buffer; // slice() returns Float32Array<ArrayBuffer> in the lib defs

(Per the lib typing, slice() returns a Float32Array<ArrayBuffer>.) [2]


🌐 Web query:

SharedArrayBuffer postMessage transfer list DataCloneError spec

💡 Result:

In the HTML Living Standard’s structured clone/transfer algorithms:

  • SharedArrayBuffer is cloneable but not transferable. If you include a SharedArrayBuffer in the transfer list passed to postMessage() / MessagePort.postMessage() / structuredClone(), the StructuredSerializeWithTransfer(value, transferList) algorithm explicitly says to throw a DataCloneError when a transfer-list entry is a SharedArrayBuffer (IsSharedArrayBuffer(transferable) is true). (html.spec.whatwg.org)
  • You can still send a SharedArrayBuffer in the message value (not in the transfer list) via structured cloning, but serialization has extra rules (notably it requires cross-origin isolation, and it can’t be used with “for storage” serialization). (html.spec.whatwg.org)
  • postMessage() surfaces these failures as DataCloneError if the message could not be cloned (which includes the transfer-list validation failure above). (html.spec.whatwg.org)
  • MDN summarizes the same behavior: SharedArrayBuffer is not a Transferable object (unlike ArrayBuffer). (developer.mozilla.org)


as ArrayBuffer cast is unsafe — SharedArrayBuffer in the transfer list throws DataCloneError.

The parameter type ArrayBufferLike includes SharedArrayBuffer. A SharedArrayBuffer is not transferable in the HTML spec's structured clone algorithm, so if it appears in the transfer list, postMessage() will throw DataCloneError. The as ArrayBuffer type assertion suppresses TypeScript's type checker, leaving this failure invisible until runtime. Since TypeScript 5.7+ types Float32Array.buffer as ArrayBufferLike by default, any caller using a SharedArrayBuffer-backed typed array (a realistic optimization for shared audio ring-buffers) would silently pass TypeScript but crash at runtime.

Narrow the parameter type or add a runtime guard to skip non-ArrayBuffer values:

🛡️ Proposed fix
-    private transferList(buffer: ArrayBufferLike, transferOwnership?: boolean): Transferable[] {
-        return transferOwnership === false ? [] : [buffer as ArrayBuffer];
-    }
+    private transferList(buffer: ArrayBufferLike, transferOwnership?: boolean): Transferable[] {
+        if (transferOwnership === false || !(buffer instanceof ArrayBuffer)) return [];
+        return [buffer];
+    }

This removes the unsafe cast, guards against SharedArrayBuffer (which is non-transferable by spec), and preserves the default-transfer semantics for all regular ArrayBuffer cases.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    private transferList(buffer: ArrayBufferLike, transferOwnership?: boolean): Transferable[] {
-        return transferOwnership === false ? [] : [buffer as ArrayBuffer];
-    }
+    private transferList(buffer: ArrayBufferLike, transferOwnership?: boolean): Transferable[] {
+        if (transferOwnership === false || !(buffer instanceof ArrayBuffer)) return [];
+        return [buffer];
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/transcription/TranscriptionWorkerClient.ts` around lines 112-114:
the transferList method currently accepts ArrayBufferLike and unsafely casts to
ArrayBuffer, which can pass a SharedArrayBuffer and cause postMessage to throw
DataCloneError; change transferList (and any callers if needed) to either accept
only ArrayBuffer or add a runtime guard inside transferList that checks the
buffer is an instance of ArrayBuffer (and that transferOwnership !== false)
before including it in the return array, otherwise return an empty array; update
the signature or guard to avoid using the "as ArrayBuffer" cast and ensure
non-transferable SharedArrayBuffer values are skipped.

@ysdede ysdede merged commit e13098a into master Feb 18, 2026
2 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 18, 2026
Optimize streaming pipeline to cut overhead and improve profiling e13098a

Successfully merging this pull request may close these issues.

- Cut structured-clone overhead for audio/mel message passing
- Reduce hot-path logging that triggers DevTools/protocol long tasks
