perf: conservative round 2 optimizations (buffer/ui/debug) #143
Conversation
Reviewer's Guide

Implements low-risk performance optimizations for v4 streaming by consolidating VAD worker calls into a single summary request, batching and scoping transcript UI updates, hardening debug visualizer cleanup, and adding guarded telemetry around incremental cache usage, without changing merger semantics.

Sequence diagram for consolidated VAD summary and silence-based flush in v4Tick

sequenceDiagram
participant App
participant BufferWorkerClient
participant BufferWorker
participant WorkerClient
participant WindowBuilder
participant AppStore
App->>BufferWorkerClient: getVadSummary(startSample,endSample,energyThreshold,inferenceThreshold,requireInference)
BufferWorkerClient->>BufferWorker: BufferWorkerRequest GET_VAD_SUMMARY
BufferWorker-->>BufferWorkerClient: BufferWorkerResponse VadSummaryResult
BufferWorkerClient-->>App: VadSummaryResult vadSummary
App->>App: hasSpeech = vadSummary.hasSpeech
alt hasSpeech == false
App->>App: silenceDuration = vadSummary.silenceTailDurationSec
alt silenceDuration >= v4SilenceFlushSec
App->>WorkerClient: v4FinalizeTimeout()
WorkerClient-->>App: flushResult
alt flushResult exists
App->>AppStore: batch setMatureText,matureCursorTime,transcript
App->>AppStore: appendV4SentenceEntries
App->>AppStore: setV4MergerStats
App->>WindowBuilder: advanceMatureCursorByTime(matureCursorTime)
App->>WindowBuilder: markSentenceEnd(sampleAtMatureCursor)
end
end
else hasSpeech == true
App->>App: continue streaming without timeout flush
end
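The silence-gated branch in the sequence above reduces to a small pure decision. The sketch below is illustrative only — the interface and function names are stand-ins mirroring the diagram, not the actual `App.tsx` code:

```typescript
// Hypothetical sketch of the timeout-flush gate from the sequence above.
// VadSummarySketch fields follow the diagram; the names are illustrative.
interface VadSummarySketch {
  hasSpeech: boolean;
  silenceTailDurationSec: number;
}

// Returns true only when the tick should call v4FinalizeTimeout():
// no speech detected AND trailing silence has reached the flush window.
function shouldTimeoutFlush(
  summary: VadSummarySketch,
  v4SilenceFlushSec: number,
): boolean {
  if (summary.hasSpeech) return false; // keep streaming, no timeout flush
  return summary.silenceTailDurationSec >= v4SilenceFlushSec;
}
```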
Class diagram for buffer worker VAD summary types and client

classDiagram
class BufferWorkerClient {
-worker: Worker
-nextId: number
-pending: Map~number,PendingRequest~
+constructor(worker: Worker)
+sendRequest(type: BufferWorkerRequestType, payload: any) Promise~any~
+hasSpeech(layer: string, startSample: number, endSample: number, threshold: number) Promise~HasSpeechResult~
+getVadSummary(startSample: number, endSample: number, energyThreshold: number, inferenceThreshold: number, requireInference: boolean) Promise~VadSummaryResult~
+getSilenceTailDuration(layer: string, threshold: number) Promise~number~
+queryRange(startSample: number, endSample: number, layer: string) Promise~RangeResult~
}
class BufferWorker {
-layers: BufferLayers
-config: BufferWorkerConfig
+onmessage(event: MessageEvent)
+handleHasSpeech(id: number, query: HasSpeechQuery) void
+handleGetVadSummary(id: number, query: VadSummaryQuery) void
+handleGetSilenceTail(id: number, query: SilenceTailQuery) void
+handleQueryRange(id: number, query: RangeQuery) void
+respond(response: BufferWorkerResponse) void
+emptyHasSpeech() HasSpeechResult
}
class HasSpeechQuery {
+layer: string
+startSample: number
+endSample: number
+threshold: number
}
class VadSummaryQuery {
+startSample: number
+endSample: number
+energyThreshold: number
+inferenceThreshold: number
+requireInference: boolean
}
class SilenceTailQuery {
+layer: string
+threshold: number
}
class HasSpeechResult {
+hasSpeech: boolean
+maxProb: number
+entriesChecked: number
}
class VadSummaryResult {
+energy: HasSpeechResult
+inference: HasSpeechResult~nullable~
+hasSpeech: boolean
+silenceTailDurationSec: number
}
class BufferWorkerRequest {
<<union>>
+type: BufferWorkerRequestType
+id: number
+payload: any
}
class BufferWorkerResponse {
<<union>>
+type: BufferWorkerResponseType
+id: number
+payload: any
}
BufferWorkerClient --> BufferWorkerRequest : sends
BufferWorker --> BufferWorkerResponse : produces
BufferWorkerClient ..> BufferWorkerResponse : awaits
BufferWorker ..> HasSpeechQuery : uses
BufferWorker ..> VadSummaryQuery : uses
BufferWorker ..> SilenceTailQuery : uses
BufferWorker ..> HasSpeechResult : uses
BufferWorker ..> VadSummaryResult : uses
VadSummaryResult o-- HasSpeechResult : energy
VadSummaryResult o-- HasSpeechResult : inference
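The `hasSpeech` field on `VadSummaryResult` combines the two layers shown above. A minimal sketch of that gating, assuming the energy-only vs. energy+inference AND-gate semantics described in the review comments (the worker's real handler also normalizes sample bounds):

```typescript
// Sketch of the energy/inference gate behind VadSummaryResult.hasSpeech.
// Field names match the class diagram; the combination logic is an
// assumption based on the documented semantics, not the worker source.
interface HasSpeechResultSketch {
  hasSpeech: boolean;
  maxProb: number;
  entriesChecked: number;
}

function combinedHasSpeech(
  energy: HasSpeechResultSketch,
  inference: HasSpeechResultSketch | null,
  requireInference: boolean,
): boolean {
  // energy-only mode: the inference layer is skipped and reported as null
  if (!requireInference || inference === null) return energy.hasSpeech;
  // AND gate: both layers must agree before the summary reports speech
  return energy.hasSpeech && inference.hasSpeech;
}
```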
Class diagram for LayeredBufferVisualizer lifecycle and cleanup guards

classDiagram
class LayeredBufferVisualizer {
+props: LayeredBufferVisualizerProps
-canvasRef: HTMLCanvasElement
-ctx: CanvasRenderingContext2D
-animationFrameId: number
-disposed: boolean
-cachedDpr: number
-resizeObserver: ResizeObserver
-dprMediaQuery: MediaQueryList
-dprChangeHandler: MediaQueryListListener
-cachedSpecData: CachedSpecData
+onMountHook() void
+onCleanupHook() void
-setupResizeObserver() void
-setupDprWatch() void
-updateCanvasDimensions(logicalW: number, logicalH: number) void
-loop(now: number) void
-fetchMelFeatures(fetchStartSample: number, fetchEndSample: number) void
}
class LayeredBufferVisualizerProps {
+audioEngine: AudioEngine
+melClient: MelClient
+windowDuration: number
+layers: string[]
}
class CachedSpecData {
+data: Float32Array
+startSample: number
+endSample: number
}
class ResizeObserver {
+observe(target: Element) void
+disconnect() void
}
class MediaQueryList {
+addEventListener(type: string, listener: MediaQueryListListener, options: any) void
+removeEventListener(type: string, listener: MediaQueryListListener) void
+matches: boolean
}
class MediaQueryListListener {
}
LayeredBufferVisualizer --> LayeredBufferVisualizerProps : uses
LayeredBufferVisualizer o-- CachedSpecData : caches
LayeredBufferVisualizer ..> ResizeObserver : uses
LayeredBufferVisualizer ..> MediaQueryList : uses
LayeredBufferVisualizer ..> MediaQueryListListener : uses
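The disposed-flag pattern behind `onCleanupHook()` can be isolated from Solid and the DOM. This is a hedged, framework-agnostic sketch, with `schedule` standing in for `requestAnimationFrame` (the real component additionally disconnects a ResizeObserver and removes the DPR listener):

```typescript
// Framework-agnostic sketch of the guarded render loop and cleanup.
// `schedule` abstracts requestAnimationFrame so the pattern is testable.
type Cancel = () => void;

function startGuardedLoop(
  schedule: (cb: () => void) => Cancel,
  onFrame: () => void,
): Cancel {
  let disposed = false;
  let cancelFrame: Cancel | null = null;

  const loop = () => {
    if (disposed) return; // guard: no drawing after cleanup
    onFrame();
    cancelFrame = schedule(loop); // re-arm for the next frame
  };
  cancelFrame = schedule(loop);

  // cleanup: mirrors onCleanupHook() — set the flag, cancel pending work
  return () => {
    disposed = true;
    cancelFrame?.();
  };
}
```

The same two-step shutdown (flag first, cancel second) is what makes async callbacks such as the mel-feature `.then()` safe: even a callback that already escaped cancellation bails out on the flag.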
File-Level Changes
Assessment against linked issues
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/lib/buffer/BufferWorkerClient.ts:146-152` </location>
<code_context>
+ * - requireInference=false => energy only
+ * - silence tail always computed from energy layer
+ */
+ async getVadSummary(params: {
+ startSample: number;
+ endSample: number;
+ energyThreshold: number;
+ inferenceThreshold: number;
+ requireInference: boolean;
+ }): Promise<VadSummaryResult> {
+ return this.sendRequest('GET_VAD_SUMMARY', params);
+ }
</code_context>
<issue_to_address>
**suggestion:** Use the shared VadSummaryQuery type for the client method signature to prevent drift.
`getVadSummary` redefines the params inline instead of using `VadSummaryQuery` from `types.ts`, duplicating the contract and risking drift if the query shape changes. Please type it as `getVadSummary(params: VadSummaryQuery)` so the client and worker stay aligned and future changes only need to be made in one place.
Suggested implementation:
```typescript
async getVadSummary(params: VadSummaryQuery): Promise<VadSummaryResult> {
```
You’ll also need to import `VadSummaryQuery` from your shared `types.ts` file in `src/lib/buffer/BufferWorkerClient.ts`. For example, if you currently have something like:
- `import type { VadSummaryResult } from '../types';`
update it to:
- `import type { VadSummaryResult, VadSummaryQuery } from '../types';`
Adjust the relative path (`'../types'` vs `'./types'`, etc.) to match the existing import pattern in this file.
</issue_to_address>
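Assembled, the suggested change might look like the sketch below. The interfaces and the `sendRequest` stub are local stand-ins (in the real client they would come from `types.ts` and the worker round-trip) so the shape can be checked in isolation:

```typescript
// Local stand-ins for the shared types in src/lib/buffer/types.ts;
// in the real client these would be imported, not redeclared.
interface VadSummaryQuery {
  startSample: number;
  endSample: number;
  energyThreshold: number;
  inferenceThreshold: number;
  requireInference: boolean;
}
interface VadSummaryResult {
  hasSpeech: boolean;
  silenceTailDurationSec: number;
}

class BufferWorkerClientSketch {
  // stub for the real worker round-trip; echoes the request for inspection
  private async sendRequest(type: string, payload: unknown): Promise<any> {
    return { type, payload, hasSpeech: false, silenceTailDurationSec: 0 };
  }

  // The client method now shares VadSummaryQuery with the worker,
  // so the contract lives in one place.
  async getVadSummary(params: VadSummaryQuery): Promise<VadSummaryResult> {
    return this.sendRequest('GET_VAD_SUMMARY', params);
  }
}
```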
Actionable comments posted: 3
📜 Review details
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
src/App.tsx
src/components/LayeredBufferVisualizer.tsx
src/components/TranscriptionDisplay.tsx
src/lib/buffer/BufferWorkerClient.ts
src/lib/buffer/buffer.worker.test.ts
src/lib/buffer/buffer.worker.ts
src/lib/buffer/types.ts
traces/notes/TOMORROW_OPTIMIZATIONS.md
🧰 Additional context used
🧬 Code graph analysis (3)
src/lib/buffer/buffer.worker.test.ts (1)
src/lib/buffer/BufferWorkerClient.ts (1)
sendRequest(208-214)
src/App.tsx (2)
src/lib/buffer/BufferWorkerClient.ts (1)
hasSpeech(130-137)
src/stores/appStore.ts (1)
appStore(350-350)
src/lib/buffer/buffer.worker.ts (2)
src/lib/buffer/types.ts (3)
HasSpeechResult(120-124), VadSummaryQuery(85-96), VadSummaryResult(126-131)
src/lib/buffer/BufferWorkerClient.ts (1)
hasSpeech(130-137)
🪛 LanguageTool
traces/notes/TOMORROW_OPTIMIZATIONS.md
[uncategorized] ~21-~21: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... transcript segments re-render. - Avoid full text reconciliation when only immature tail ...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
🪛 markdownlint-cli2 (0.21.0)
traces/notes/TOMORROW_OPTIMIZATIONS.md
[warning] 14-14: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
[warning] 19-19: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
[warning] 23-23: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
[warning] 27-27: Ordered list item prefix
Expected: 1; Actual: 5; Style: 1/1/1
(MD029, ol-prefix)
[warning] 30-30: Ordered list item prefix
Expected: 1; Actual: 6; Style: 1/1/1
(MD029, ol-prefix)
[warning] 34-34: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 39-39: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 47-47: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 60-60: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 66-66: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 70-70: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🔇 Additional comments (14)
src/components/TranscriptionDisplay.tsx (1)
153-169: LGTM — clean reactive guard for V4 live tab.

The early return of `''` when on the Live tab avoids the expensive `.map().filter().join()` rebuild, and since `renderFullTextContent()` is only rendered inside the merged-tab `<Show>` guard (line 288), there's no risk of displaying stale data. Switching tabs correctly triggers recomputation via the `activeTab()` dependency.

src/lib/buffer/buffer.worker.test.ts (2)
216-256: LGTM — good coverage of energy-only vs. energy+inference semantics.

The test correctly validates: (1) energy-only mode returns `inference: null` and derives `hasSpeech` from energy alone, and (2) inference-required mode applies the AND gate (`energy && inference`), confirming that high energy + low inference yields `hasSpeech: false`.
258-282: LGTM — silence tail computation test is correct.

The write head is at entry 3 (after three writes), two trailing entries are below threshold (0.1), giving `(2 * 1280) / 16000 ≈ 0.16s`. This validates that `silenceTailDurationSec` is computed globally from the write head, independent of the query's sample range.

src/components/LayeredBufferVisualizer.tsx (3)
63-63: LGTM — proper lifecycle safeguards with `disposed` flag and cleanup.

The `disposed` flag is correctly reset on mount and set on cleanup, and `onCleanup` cancels pending RAF, disconnects the ResizeObserver, and removes the DPR listener. This prevents any post-unmount work.

Also applies to: 126-126, 172-184
85-85: LGTM — DPR listener cleanup addresses Issue #131.

Tracking `dprChangeHandler` and explicitly removing it on cleanup (and before re-registration in `setupDprWatch`) ensures no stale `{ once: true }` listeners survive unmount. The `if (disposed) return` guard in `onDprChange` adds belt-and-suspenders safety.

Also applies to: 142-156
186-191: LGTM — disposed guards on animation loop and async callbacks.

Early return in `loop()` and the `.then()` callback for mel features ensures no drawing or state mutation happens after unmount.

Also applies to: 240-241
src/lib/buffer/types.ts (1)
84-131: LGTM — well-structured types for the consolidated VAD summary.
`VadSummaryQuery` and `VadSummaryResult` cleanly model the energy-only vs. energy+inference semantics. Pre-computing `hasSpeech` on the worker side keeps the main-thread consumer simple.

src/lib/buffer/BufferWorkerClient.ts (1)
139-154: LGTM — follows existing async query pattern.

The new `getVadSummary` method mirrors the established `hasSpeech()` / `getSilenceTailDuration()` pattern. Consider importing and reusing `VadSummaryQuery` for the parameter type instead of the inline object to keep the contract in one place, but this is a minor style nit.

src/lib/buffer/buffer.worker.ts (2)
337-339: LGTM — clean helper for default `HasSpeechResult`.
341-385: LGTM — `handleGetVadSummary` is well-structured.

Defensive `min`/`max` normalization of sample bounds (lines 354-355), correct AND-gate semantics for `requireInference`, and energy-based silence tail computation all match the documented contract. The uninitialized-state early return (lines 342-352) mirrors the pattern used by existing handlers.

src/App.tsx (4)
343-351: LGTM — clean consolidation of VAD polling into a single worker round-trip.
`requireInference` correctly keys off `tenVADClient?.isReady()`, preserving the energy-only fallback when TEN-VAD hasn't initialized. The pre-computed `vadSummary.hasSpeech` keeps the main-thread logic simple.
386-397: LGTM — batched flush updates reduce reactive churn.

Wrapping the finalization state mutations in `batch()` prevents intermediate renders for each individual signal update. The `windowBuilder.advanceMatureCursorByTime` call correctly stays outside the batch since it's not a reactive operation.
453-473: LGTM — explicit cache decision paths with telemetry.

The three-way branch (non-positive prefix / outside window / enabled) clearly maps each cache decision to a telemetry bucket, making it straightforward to diagnose cache bypass rates.
487-526: Correct batching of inference-result UI updates.

Computing `shouldAdvanceCursor` and capturing `ring` before the `batch()` block ensures stable reads. Window builder mutations (lines 523-525) are correctly performed outside the reactive batch. The `batch()` consolidation should meaningfully reduce per-tick re-renders.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/App.tsx`:
- Around line 52-74: The v4CacheTelemetry counters (v4CacheTelemetry and tracked
via trackV4CacheTelemetry) are module-level and persist across stop→start; to
make telemetry per-session, reset the counters in cleanupV4Pipeline(): inside
cleanupV4Pipeline() set v4CacheTelemetry.enabled = 0,
v4CacheTelemetry.bypassNonPositivePrefix = 0, and
v4CacheTelemetry.bypassOutsideWindow = 0 (or reassign a fresh object) so counts
start at zero for each new pipeline session.
- Around line 514-519: The code currently ignores result.metrics and always
writes zeros; replace the hardcoded values in the appStore.setSystemMetrics call
with real values from result.metrics (e.g., use result.metrics.throughput and
result.metrics.modelConfidence or the actual property names in the transcriber
output) and provide safe fallbacks (e.g., || 0) so you don’t crash if a field is
missing, or alternatively remove the if (result.metrics) branch and always call
appStore.setSystemMetrics with explicit zeros if you intend initialization
instead of using real metrics; update the call site that invokes
appStore.setSystemMetrics to use those extracted values.
In `@traces/notes/TOMORROW_OPTIMIZATIONS.md`:
- Line 64: Update the sentence in TOMORROW_OPTIMIZATIONS.md that currently reads
"Deferred due higher risk around transferable ownership and cross-worker
lifecycle coordination." to "Deferred due to higher risk around transferable
ownership and cross-worker lifecycle coordination." — locate the exact line
matching that phrase and insert the missing "to" after "due" so the grammar is
correct.
if (result.metrics) {
  appStore.setSystemMetrics({
    throughput: 0,
    modelConfidence: 0,
  });
}
🧩 Analysis chain
🏁 Script executed:
# Find V4ProcessResult definition and understand metrics structure
find . -type f -name "*.ts" -o -name "*.tsx" | head -20
rg -n "interface V4ProcessResult" -A 15 --type=ts

Repository: ysdede/keet
Length of output: 1998
🏁 Script executed:
# Search for metrics field usage with throughput/modelConfidence
rg -n "metrics.*throughput\|throughput.*metrics" --type=ts --type=tsx
rg -n "\.metrics" --type=ts --type=tsx | grep -E "(throughput|modelConfidence|setSystemMetrics)" -C 2

Repository: ysdede/keet
Length of output: 110
🏁 Script executed:
# Look at the specific code section in App.tsx
sed -n '510,525p' src/App.tsx

Repository: ysdede/keet
Length of output: 584
🏁 Script executed:
# Search for where result.metrics is populated/assigned
rg -n "result\.metrics\s*=" --type=ts --type=tsx -C 3

Repository: ysdede/keet
Length of output: 82
🏁 Script executed:
# Search for where metrics is assigned/populated in worker or results
rg -n "metrics\s*=" src/lib/transcription --type=ts -C 3

Repository: ysdede/keet
Length of output: 37
🏁 Script executed:
# Search for throughput and modelConfidence usage in entire codebase
rg -n "throughput|modelConfidence" --type=ts -C 2

Repository: ysdede/keet
Length of output: 945
🏁 Script executed:
# Look for the transcription worker implementation to see where result object is created
rg -n "return.*{" src/lib/transcription/TranscriptionWorkerClient.ts -A 5 | head -40

Repository: ysdede/keet
Length of output: 136
🏁 Script executed:
# Check if metrics are ever actually extracted from result.metrics
rg -n "result\.metrics\." --type=ts -C 3

Repository: ysdede/keet
Length of output: 37
🏁 Script executed:
# Search for where the result object is constructed/returned from transcription
rg -n "matureText\|immatureText" src/lib/transcription --type=ts -B 5 | head -60

Repository: ysdede/keet
Length of output: 37
🏁 Script executed:
# Look for metrics calculation or population
rg -n "metrics.*:" src/lib/transcription --type=ts -C 5

Repository: ysdede/keet
Length of output: 2217
🏁 Script executed:
# Check entire App.tsx context around metrics usage
sed -n '500,530p' src/App.tsx

Repository: ysdede/keet
Length of output: 1074
🏁 Script executed:
# Search for any other setSystemMetrics calls to see the pattern
rg -n "setSystemMetrics" src --type=ts -C 2

Repository: ysdede/keet
Length of output: 836
Extract actual throughput and modelConfidence from result.metrics instead of hardcoding zeros, or remove the conditional check if metrics are not yet computed.
The code checks for result.metrics presence (which contains raw ASR metrics from the transcriber) but then ignores its contents and always sets both values to 0. Either wire the actual values from result.metrics if they're available, or remove the condition and always initialize with zeros to clarify the intent.
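One way to wire the real values with safe fallbacks is sketched below. The property names on the metrics object are assumptions and should be checked against the transcriber output before adopting them:

```typescript
// Hypothetical shape of result.metrics; the actual field names in the
// transcriber output may differ and must be verified first.
interface AsrMetricsSketch {
  throughput?: number;
  modelConfidence?: number;
}

function toSystemMetrics(metrics: AsrMetricsSketch | null | undefined) {
  // ?? 0 keeps the call safe when a field (or the whole object) is missing
  return {
    throughput: metrics?.throughput ?? 0,
    modelConfidence: metrics?.modelConfidence ?? 0,
  };
}
```

With a helper like this, the `if (result.metrics)` branch becomes unnecessary: `appStore.setSystemMetrics(toSystemMetrics(result.metrics))` covers both the populated and the missing case.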
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/App.tsx` around lines 514 - 519, The code currently ignores
result.metrics and always writes zeros; replace the hardcoded values in the
appStore.setSystemMetrics call with real values from result.metrics (e.g., use
result.metrics.throughput and result.metrics.modelConfidence or the actual
property names in the transcriber output) and provide safe fallbacks (e.g., ||
0) so you don’t crash if a field is missing, or alternatively remove the if
(result.metrics) branch and always call appStore.setSystemMetrics with explicit
zeros if you intend initialization instead of using real metrics; update the
call site that invokes appStore.setSystemMetrics to use those extracted values.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b0502ac6a8
const silenceDuration = vadSummary.silenceTailDurationSec;
if (silenceDuration >= appStore.v4SilenceFlushSec()) {
Re-query silence tail immediately before timeout flush
silenceDuration is taken from the earlier GET_VAD_SUMMARY snapshot, but VAD writes keep arriving asynchronously while v4Tick awaits worker responses. If speech resumes after that summary is computed and before v4FinalizeTimeout() runs, this branch can still flush based on stale silence data and prematurely finalize an active utterance. The prior two-call flow re-checked silence tail right before flushing, so this change widens that race window.
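A hedged sketch of the narrower-race flow Codex suggests: take a fresh silence-tail reading immediately before flushing instead of trusting the earlier summary. `getSilenceTail` and `finalize` are stand-ins for the real async worker calls, and this narrows the race window rather than eliminating it:

```typescript
// Re-check silence right before the flush; if speech resumed between the
// summary snapshot and now, skip finalization. Names are illustrative.
async function flushIfStillSilent(
  getSilenceTail: () => Promise<number>,
  flushThresholdSec: number,
  finalize: () => Promise<void>,
): Promise<boolean> {
  const tail = await getSilenceTail(); // fresh snapshot, not the earlier summary
  if (tail < flushThresholdSec) return false; // speech resumed; skip the flush
  await finalize();
  return true;
}
```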
Review-bot follow-up complete. Applied (safe + low-risk):
Skipped (intentionally):
Verified:
All checks above passed on this branch after the updates. |
…0260218 perf: conservative round 2 optimizations (buffer/ui/debug) 66b6af1
Summary
Implements conservative Performance Round 2 optimizations from
`traces/notes/TOMORROW_OPTIMIZATIONS.md` with low-risk, incremental changes and no merger-logic modifications.

What Changed
- Added a consolidated worker request (`GET_VAD_SUMMARY`) and switched `v4Tick` from multiple worker calls to a single summary call.
- When inference is required, speech detection is gated on `energy && inference`.
- Batched reactive transcript updates in `App.tsx`.
- Skipped the merged-text rebuild while the `Live` tab is active.
- Hardened cleanup in `LayeredBufferVisualizer`.
- Updated `traces/notes/TOMORROW_OPTIMIZATIONS.md` with DONE/SKIPPED/TODO for this round.

Scope Decisions
- Kept to the conservative items in `TOMORROW_OPTIMIZATIONS.md`.
- No merger-logic changes (`UtteranceBasedMerger` semantics unchanged).

Validation
- npm test -- src/lib/transcription/WindowBuilder.test.ts src/lib/transcription/UtteranceBasedMerger.test.ts src/lib/transcription/UtteranceBasedMerger.regression.test.ts
- npm test -- src/lib/buffer/buffer.worker.test.ts
- npm test -- src/stores/appStore.test.ts
- npm test -- src/lib/vad/tenvad.worker.test.ts
- npm test
- npm run build
Closes #131
Related: #116, #73, #27
Related PRs: #141, #138, #92
Summary by Sourcery
Optimize v4 streaming transcription performance and debug UI behavior with conservative, non-semantic changes.
New Features:
Enhancements:
Documentation:
Tests:
Summary by CodeRabbit
New Features
Performance
Bug Fixes
Tests