test: port safe AudioSegmentProcessor coverage from stale PR#172
test: port safe AudioSegmentProcessor coverage from stale PR#172
Conversation
Reviewer's GuideAdds deterministic, focused test coverage for AudioSegmentProcessor behavior (noise floor adaptation, speech/silence handling, threshold updates, and stats) while injecting a noop logger to keep tests/CI output quiet. Sequence diagram for speech end detection after configured silence thresholdsequenceDiagram
participant Test
participant AudioSegmentProcessor
Test->>AudioSegmentProcessor: constructor(sampleRate=16000, energyThreshold=0.01, silenceThreshold=0.16, logger)
Test->>AudioSegmentProcessor: processAudioData(speechChunk, 0.0, 0.5)
AudioSegmentProcessor-->>Test: []
Test->>AudioSegmentProcessor: processAudioData(speechChunk, 0.08, 0.5)
AudioSegmentProcessor-->>Test: []
Test->>AudioSegmentProcessor: processAudioData(silenceChunk, 0.16, 0.0001)
AudioSegmentProcessor-->>Test: []
Test->>AudioSegmentProcessor: getStateInfo()
AudioSegmentProcessor-->>Test: inSpeech=true
Test->>AudioSegmentProcessor: processAudioData(silenceChunk, 0.24, 0.0001)
AudioSegmentProcessor-->>Test: [segment]
Test->>AudioSegmentProcessor: getStateInfo()
AudioSegmentProcessor-->>Test: inSpeech=false
Sequence diagram for runtime speech threshold updates via setThresholdsequenceDiagram
participant Test
participant AudioSegmentProcessor
Test->>AudioSegmentProcessor: constructor(sampleRate=16000, logger)
Test->>AudioSegmentProcessor: setThreshold(0.9)
Test->>AudioSegmentProcessor: reset()
Test->>AudioSegmentProcessor: processAudioData(speechChunk, 0.08, 0.5)
AudioSegmentProcessor-->>Test: []
Test->>AudioSegmentProcessor: getStateInfo()
AudioSegmentProcessor-->>Test: inSpeech=false
Test->>AudioSegmentProcessor: setThreshold(0.1)
Test->>AudioSegmentProcessor: processAudioData(speechChunk, 0.16, 0.5)
AudioSegmentProcessor-->>Test: []
Test->>AudioSegmentProcessor: getStateInfo()
AudioSegmentProcessor-->>Test: inSpeech=true
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- You’re inlining
logger: () => { }in many test cases; consider extracting a sharedcreateProcessor()helper or anoopLoggerconstant to keep the tests DRY and make future config tweaks easier. - Several tests rely on magic numbers for durations and thresholds (e.g.,
0.08,0.16,0.24); naming these as constants (likeCHUNK_DURATION,SILENCE_THRESHOLD) would better document the timing relationships and make the tests easier to maintain.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- You’re inlining `logger: () => { }` in many test cases; consider extracting a shared `createProcessor()` helper or a `noopLogger` constant to keep the tests DRY and make future config tweaks easier.
- Several tests rely on magic numbers for durations and thresholds (e.g., `0.08`, `0.16`, `0.24`); naming these as constants (like `CHUNK_DURATION`, `SILENCE_THRESHOLD`) would better document the timing relationships and make the tests easier to maintain.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Addressed the valid review points in the latest push. Applied:
Validation run:
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/lib/transcription/TokenStreamTranscriber.test.ts (1)
5-18: Usevi.hoisted()to follow official Vitest patterns for sharing mock instances.The current code works because
mockMergerInstanceis only accessed inside the constructor body — which executes lazily during tests, after module initialization. However, Vitest's official API documentation explicitly recommendsvi.hoisted()for sharing variables between the test scope andvi.mock()factory functions. This makes the intent clear and avoids relying on deferred closure execution.Refactor using
vi.hoisted()+const { mockMergerInstance } = vi.hoisted(() => ({ + mockMergerInstance: { + processChunk: vi.fn(), + getText: vi.fn(), + reset: vi.fn(), + getState: vi.fn(), + }, +})); -const mockMergerInstance = { - processChunk: vi.fn(), - getText: vi.fn(), - reset: vi.fn(), - getState: vi.fn(), -}; vi.mock('parakeet.js', () => ({ LCSPTFAMerger: class { constructor() { return mockMergerInstance; } }, }));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/transcription/TokenStreamTranscriber.test.ts` around lines 5 - 18, Replace the plain module-scoped mock object with a hoisted mock using vi.hoisted() and use that hoisted variable inside the vi.mock factory so the shared mock is explicitly available to tests; specifically, change the current mockMergerInstance to be created via const mockMergerInstance = vi.hoisted(() => ({ processChunk: vi.fn(), getText: vi.fn(), reset: vi.fn(), getState: vi.fn() })), and update the vi.mock('parakeet.js') factory that returns LCSPTFAMerger to reference this hoisted mockMergerInstance instead of the original plain object so tests and the mock factory share the same instance per Vitest guidance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/transcription/TokenStreamTranscriber.test.ts`:
- Around line 36-37: Tests currently only suppress console.log via the
consoleLogSpy in TokenStreamTranscriber.test.ts, but console.warn and
console.error can still leak into CI; update the setup to also spyOn/silence
console.warn and console.error (e.g., create consoleWarnSpy and consoleErrorSpy
using vi.spyOn(console, 'warn') and vi.spyOn(console, 'error') with
mockImplementation no-op) and ensure they are restored in the test teardown
(afterEach) alongside consoleLogSpy to prevent leaks during
TokenStreamTranscriber tests.
- Around line 36-37: The test currently only mocks console.log via consoleLogSpy
in TokenStreamTranscriber.test.ts, but console.warn and console.error remain
active and can pollute CI output; add mocks for console.warn and console.error
(e.g., create consoleWarnSpy and consoleErrorSpy with
vi.spyOn(...).mockImplementation(() => {})) alongside consoleLogSpy and ensure
they are restored/cleared in the same teardown where consoleLogSpy is restored
to keep tests isolated.
- Around line 5-18: The shared mock object mockMergerInstance used by the
vi.mock factory must be hoisted; replace the top-level variable with a hoisted
definition via vi.hoisted(() => ({ processChunk: vi.fn(), getText: vi.fn(),
reset: vi.fn(), getState: vi.fn() })) and then have the vi.mock('parakeet.js')
factory return a LCSPTFAMerger class that constructs/returns that hoisted mock;
this ensures the mock is initialized before vi.mock is executed and avoids
referencing an uninitialized module-scope variable in the mock factory.
---
Nitpick comments:
In `@src/lib/transcription/TokenStreamTranscriber.test.ts`:
- Around line 5-18: Replace the plain module-scoped mock object with a hoisted
mock using vi.hoisted() and use that hoisted variable inside the vi.mock factory
so the shared mock is explicitly available to tests; specifically, change the
current mockMergerInstance to be created via const mockMergerInstance =
vi.hoisted(() => ({ processChunk: vi.fn(), getText: vi.fn(), reset: vi.fn(),
getState: vi.fn() })), and update the vi.mock('parakeet.js') factory that
returns LCSPTFAMerger to reference this hoisted mockMergerInstance instead of
the original plain object so tests and the mock factory share the same instance
per Vitest guidance.
| const mockMergerInstance = { | ||
| processChunk: vi.fn(), | ||
| getText: vi.fn(), | ||
| reset: vi.fn(), | ||
| getState: vi.fn(), | ||
| }; | ||
|
|
||
| vi.mock('parakeet.js', () => ({ | ||
| LCSPTFAMerger: class { | ||
| constructor() { | ||
| return mockMergerInstance; | ||
| } | ||
| }, | ||
| })); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/lib/transcription/TokenStreamTranscriber.test.ts | head -30Repository: ysdede/keet
Length of output: 973
🌐 Web query:
Vitest vi.mock hoisting vi.hoisted constraint variables
💡 Result:
In Vitest, vi.mock() is hoisted to the top of the test file and executed before all imports, regardless of where you write it. Because of that, the mock factory can’t safely capture variables declared later in the file (or in normal module scope)—those variables may not be initialized yet when the hoisted mock runs. [1][2]
The constraint (why “outer variables” break)
vi.mock('./mod', () => ({ ... }))is moved and evaluated before the rest of the file.- So anything the factory closes over must already exist at hoist time. Otherwise you can get initialization/TDZ-style errors or
undefined. [1]
Fix: use vi.hoisted() to define shared variables for the mock
Define the values you want to reuse (often vi.fn() spies) inside vi.hoisted(), before vi.mock, then reference them from the mock factory:
import { vi } from 'vitest'
const mocks = vi.hoisted(() => ({
namedExport: vi.fn(),
}))
vi.mock('./path/to/module.js', () => ({
namedExport: mocks.namedExport,
}))This works because Vitest runs the vi.hoisted factory early and lets you reuse its returned value later (including inside vi.mock). [1]
Important limitations of vi.hoisted
- Code in
vi.hoistedruns before imports, so you cannot access imported bindings inside it (they aren’t initialized yet). If you must, use a dynamicimport()inside the hoisted callback. [1] vi.mock/vi.hoistedhoisting relies on Vitest transforming the test file (Vitest scans for these calls and rewrites static imports). [3][4]
Alternative: vi.doMock() when you need normal-scope variables
vi.doMock() is not hoisted, so it can read variables from normal file scope—but it only affects subsequent imports (typically dynamic import()), not ones already statically imported. [1]
Sources: Vitest vi API docs (hoisting, vi.hoisted, vi.doMock) [1], mocking guide (hoisted vi.mock) [2], “How it Works” (Vitest transforms/hoists mocks and rewrites imports) [4], experimental/limitations notes on transforms/hoisting detection in test files [3].
Use vi.hoisted() to define mock objects shared with vi.mock factories.
Vitest hoists all vi.mock calls to the top of the file, executing them before any imports or module-level code. The mock factory can only safely reference variables that are also hoisted via vi.hoisted()—outer module-scope variables are not yet initialized.
The current code avoids a crash only because mockMergerInstance is accessed inside the class constructor, a closure that runs during test execution (after module initialization). Refactoring the factory to reference mockMergerInstance directly would fail. Use vi.hoisted() to properly define shared variables:
♻️ Refactor using vi.hoisted()
-const mockMergerInstance = {
- processChunk: vi.fn(),
- getText: vi.fn(),
- reset: vi.fn(),
- getState: vi.fn(),
-};
-
+const mockMergerInstance = vi.hoisted(() => ({
+ processChunk: vi.fn(),
+ getText: vi.fn(),
+ reset: vi.fn(),
+ getState: vi.fn(),
+}));
+
vi.mock('parakeet.js', () => ({
LCSPTFAMerger: class {
constructor() {
return mockMergerInstance;
}
},
}));📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const mockMergerInstance = { | |
| processChunk: vi.fn(), | |
| getText: vi.fn(), | |
| reset: vi.fn(), | |
| getState: vi.fn(), | |
| }; | |
| vi.mock('parakeet.js', () => ({ | |
| LCSPTFAMerger: class { | |
| constructor() { | |
| return mockMergerInstance; | |
| } | |
| }, | |
| })); | |
| const mockMergerInstance = vi.hoisted(() => ({ | |
| processChunk: vi.fn(), | |
| getText: vi.fn(), | |
| reset: vi.fn(), | |
| getState: vi.fn(), | |
| })); | |
| vi.mock('parakeet.js', () => ({ | |
| LCSPTFAMerger: class { | |
| constructor() { | |
| return mockMergerInstance; | |
| } | |
| }, | |
| })); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/transcription/TokenStreamTranscriber.test.ts` around lines 5 - 18,
The shared mock object mockMergerInstance used by the vi.mock factory must be
hoisted; replace the top-level variable with a hoisted definition via
vi.hoisted(() => ({ processChunk: vi.fn(), getText: vi.fn(), reset: vi.fn(),
getState: vi.fn() })) and then have the vi.mock('parakeet.js') factory return a
LCSPTFAMerger class that constructs/returns that hoisted mock; this ensures the
mock is initialized before vi.mock is executed and avoids referencing an
uninitialized module-scope variable in the mock factory.
| consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => { }); | ||
|
|
There was a problem hiding this comment.
console.warn and console.error are not suppressed.
If TokenStreamTranscriber emits warnings or errors (e.g., during transcription failures or model initialization), those will leak into CI output. Since the explicit goal here is readable CI logs, both should be silenced alongside console.log.
🔇 Proposed fix
consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => { });
+ vi.spyOn(console, 'warn').mockImplementation(() => { });
+ vi.spyOn(console, 'error').mockImplementation(() => { });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => { }); | |
| consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => { }); | |
| vi.spyOn(console, 'warn').mockImplementation(() => { }); | |
| vi.spyOn(console, 'error').mockImplementation(() => { }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/transcription/TokenStreamTranscriber.test.ts` around lines 36 - 37,
Tests currently only suppress console.log via the consoleLogSpy in
TokenStreamTranscriber.test.ts, but console.warn and console.error can still
leak into CI; update the setup to also spyOn/silence console.warn and
console.error (e.g., create consoleWarnSpy and consoleErrorSpy using
vi.spyOn(console, 'warn') and vi.spyOn(console, 'error') with mockImplementation
no-op) and ensure they are restored in the test teardown (afterEach) alongside
consoleLogSpy to prevent leaks during TokenStreamTranscriber tests.
console.warn and console.error are not suppressed.
Since the intent is to keep CI logs clean, console.warn and console.error should be mocked alongside console.log if the implementation uses them.
🔇 Proposed fix
consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => { });
+ vi.spyOn(console, 'warn').mockImplementation(() => { });
+ vi.spyOn(console, 'error').mockImplementation(() => { });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => { }); | |
| consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => { }); | |
| vi.spyOn(console, 'warn').mockImplementation(() => { }); | |
| vi.spyOn(console, 'error').mockImplementation(() => { }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/transcription/TokenStreamTranscriber.test.ts` around lines 36 - 37,
The test currently only mocks console.log via consoleLogSpy in
TokenStreamTranscriber.test.ts, but console.warn and console.error remain active
and can pollute CI output; add mocks for console.warn and console.error (e.g.,
create consoleWarnSpy and consoleErrorSpy with
vi.spyOn(...).mockImplementation(() => {})) alongside consoleLogSpy and ensure
they are restored/cleared in the same teardown where consoleLogSpy is restored
to keep tests isolated.
| const processor = createProcessor({ | ||
| energyThreshold: 0.01, | ||
| silenceThreshold: silenceThresholdSec, | ||
| }); |
There was a problem hiding this comment.
WARNING: Test implicitly depends on default maxSilenceWithinSpeech value
The test sets silenceThreshold: 0.16 but does not set maxSilenceWithinSpeech. The default is 0.160 (from audioParams.ts), which is exactly equal to silenceThreshold. The state machine condition silenceDuration < maxSilenceWithinSpeech uses strict less-than, so the boundary behavior (first silence chunk is absorbed, second triggers end) only holds because maxSilenceWithinSpeech === silenceThreshold. If the default changes, this test will silently break.
Make the test self-contained by explicitly setting maxSilenceWithinSpeech:
| }); | |
| const processor = createProcessor({ | |
| energyThreshold: 0.01, | |
| silenceThreshold: silenceThresholdSec, | |
| maxSilenceWithinSpeech: silenceThresholdSec, | |
| }); |
| const audio = new Float32Array(16000 * 2); | ||
| const result = await transcriber.processChunk(audio, 0); | ||
|
|
||
| expect(mockModel.transcribe).toHaveBeenCalled(); |
There was a problem hiding this comment.
SUGGESTION: processChunk test does not assert call arguments for model.transcribe
The test only checks that mockModel.transcribe was called, but does not verify the arguments (e.g., that audio is passed as the first argument, sampleRate as the second, and that returnTimestamps: true and returnTokenIds: true are set). A future refactor that accidentally passes wrong arguments would not be caught.
Consider adding:
expect(mockModel.transcribe).toHaveBeenCalledWith(
audio,
expect.any(Number),
expect.objectContaining({
returnTimestamps: true,
returnTokenIds: true,
}),
);
Code Review SummaryStatus: 4 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
SUGGESTION
Other Observations (not in diff)Issues found in unchanged code that cannot receive inline comments:
Files Reviewed (2 files)
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/lib/transcription/TokenStreamTranscriber.test.ts`:
- Around line 5-18: The mock uses a module-scoped variable mockMergerInstance
inside vi.mock which causes a TDZ hazard because vi.mock is hoisted; change the
setup to define mockMergerInstance via vi.hoisted() and then reference that
hoisted variable inside the vi.mock factory so LCSPTFAMerger's constructor
returns the hoisted mock; update any tests that reference mockMergerInstance to
use the hoisted instance name.
- Around line 36-37: The test currently only mocks console.log (consoleLogSpy)
which leaves console.warn and console.error active; update the test setup to
also spyOn and mockImplementation for console.warn and console.error (create
consoleWarnSpy and consoleErrorSpy or similar) so TokenStreamTranscriber
warnings/errors are silenced during tests, and ensure all spies are
restored/cleared in the teardown alongside consoleLogSpy to avoid leaking mocks.
| const features = new Float32Array(80 * 100); | ||
| const result = await transcriber.processChunkWithFeatures(features, 100, 80, 0, 0); | ||
|
|
||
| expect(mockModel.transcribe).toHaveBeenCalledWith( |
There was a problem hiding this comment.
WARNING: processChunkWithFeatures test does not assert that prefixSamples is absent from the transcribe call.
The features path is documented as "bypasses the preprocessor entirely". A key invariant is that prefixSamples must NOT be passed when using pre-computed features (it would be meaningless and could confuse the model). The current assertion only checks for the presence of precomputedFeatures via expect.objectContaining, but does not verify the absence of prefixSamples.
Consider adding:
expect(mockModel.transcribe).toHaveBeenCalledWith(
null,
expect.any(Number),
expect.not.objectContaining({ prefixSamples: expect.anything() }),
);or combining both into a single toHaveBeenCalledWith that fully specifies the options shape.
| expect(processor.getStateInfo().inSpeech).toBe(false); | ||
| }); | ||
|
|
||
| it('should update speech detection behavior when threshold changes', () => { |
There was a problem hiding this comment.
SUGGESTION: The setThreshold test does not call reset() between the two threshold assertions, leaving stale state (one entry in recentChunks, non-zero silenceDuration) from the first processAudioData call.
While the test still passes because the state machine transitions correctly, the stale recentChunks entry causes findSpeechStart() to look back one extra chunk when speech is detected on the second call. This means the test is not purely testing threshold behavior — it also implicitly exercises the lookback logic with a mixed-threshold history.
Consider adding processor.reset() before processor.setThreshold(0.1) to isolate the threshold change from lookback side-effects:
processor.setThreshold(0.1);
processor.reset(); // clear stale recentChunks
processor.processAudioData(speechChunk, CHUNK_DURATION_SEC, SPEECH_ENERGY);
expect(processor.getStateInfo().inSpeech).toBe(true);test: port safe AudioSegmentProcessor coverage from stale PR a4805b8
Summary
Ports only low-risk test coverage from stale PRs onto current
master.What changed
src/lib/audio/AudioSegmentProcessor.test.tssetThresholdsrc/lib/transcription/TokenStreamTranscriber.test.ts(new)parakeet.jsmerger and silenced console logs in test scope.Why this is safe
Validation
npm test -- src/lib/transcription/TokenStreamTranscriber.test.ts src/lib/audio/AudioSegmentProcessor.test.ts✅npm run build✅Summary by CodeRabbit