feat(subtitles): refactor ai segmentation core protocol#963
feat(subtitles): refactor ai segmentation core protocol#963
Conversation
π¦ Changeset detectedLatest commit: a82afc5 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
PR Review: feat(subtitles): refactor ai segmentation core protocolOverall: Well-structured refactor. The VTT-to-line-protocol migration is a solid architectural improvement β index-based referencing eliminates the fragile timestamp-extraction problem that the old prompt had to painstakingly explain to the LLM. The decomposition into Code Quality & ArchitectureGood:
Issues:
Potential Bugs
Performance
Security
Test CoverageGood:
Missing coverage:
Summary
Recommendations (ordered by priority):
|
Code Review: feat(subtitles): refactor ai segmentation core protocolOverall AssessmentGood refactor. Replacing loose VTT parsing with a structured Code QualityStrengths:
Issues:
Potential Bugs
Performance
No performance issues worth acting on. SecurityNo security concerns. This code processes subtitle text locally, has no injection vectors, and doesn't interact with external systems beyond the existing AI provider integration. Test CoverageGood coverage of:
Missing coverage:
Prompt DesignThe new system prompt is cleaner and more structured than the old VTT-based one. The index-based protocol ( One suggestion: the prompt says "No markdown code blocks" but Summary
Actionable items (ordered by priority):
|
Code Review: feat(subtitles): refactor ai segmentation core protocolOverall AssessmentGood refactor. Replacing the loose VTT parsing with a strict Code QualityStrengths:
Suggestions:
Potential Bugs
Performance
SecurityNo concerns. The code processes subtitle text from AI model output and doesn't inject it into DOM unsafely or pass it to shell/eval. The Test CoverageGood coverage of the core protocol pipeline:
Gaps to consider:
Minor Nits
|
PR Review: feat(subtitles): refactor ai segmentation core protocolOverall AssessmentThis is a well-structured refactor that replaces a fragile VTT-based AI output protocol with a stricter, index-based line protocol ( Code Quality & ArchitectureStrengths:
Suggestions:
Potential Bugs / Issues
Performance
Security
Test CoverageWhat's covered well:
Gaps to consider:
Minor Nits
VerdictSolid refactor. The line protocol is a clear improvement over VTT parsing for AI output, the validation is rigorous, and the merge/split heuristics are well-bounded. The main actionable items are: (1) verify the comma removal from |
PR Review: feat(subtitles): refactor ai segmentation core protocolOverall this is a well-structured refactor that replaces a fragile VTT-based AI response format with a stricter index-based line protocol. The decomposition into protocol.ts, refine.ts, guards.ts, and index.ts follows SRP well, and the pipeline is easier to reason about. Good test coverage for the new modules. Code Quality and ArchitectureStrengths:
Suggestions:
Potential Bugs / Edge Cases
Performance
Test CoverageGood coverage for: protocol parsing (valid/invalid/numbered/spaced), response cleaning (markdown fences, think tags), validation (gaps, overlaps, out-of-bounds, empty), timeline rebuild, merge behavior (CJK short cues), split behavior (pause boundaries, no-boundary cases), guard enforcement (tiny cue merge, strong boundary preservation, comma/colon not-strong). Missing coverage:
SecurityNo concerns -- this code processes locally-sourced subtitle data and AI model output. cleanLineProtocolResponse properly strips markdown and think tags. No user input flows into executable paths. SummaryThis is a solid refactor. The line protocol is a meaningful improvement over VTT parsing -- it removes an entire class of timestamp-alignment bugs by making the AI only responsible for grouping, not timing. The post-processing pipeline (merge -> split -> guards) is well-layered and each step has clear invariants enforced by validation. Main actionable items:
|
Code Review: feat(subtitles): refactor ai segmentation core protocolOverall AssessmentSolid refactor. Replacing the loose VTT parsing with an explicit index-based line protocol ( Code QualityGood:
Suggestions:
Potential Bugs
PerformanceNo concerns. All operations are O(n) linear passes over fragment arrays. The split scoring loop ( Security
Test CoverageGood coverage of the new protocol pipeline:
Missing coverage:
Minor Nits
VerdictThe refactor achieves its goal: a stricter, more testable segmentation pipeline with proper validation at each stage. The index-based protocol eliminates the timestamp-alignment class of bugs from the old VTT approach. Ship it with the naming nit on |
|
This PR has been inactive for 14 days and is now marked as stale. |
PR Review: feat(subtitles): refactor ai segmentation core protocolOverall AssessmentWell-structured refactor that replaces loose VTT parsing with an explicit Code Quality & ArchitectureStrengths:
Issues:
Potential Bugs
Performance
SecurityNo injection vectors found. The AI output is parsed via strict regex patterns ( Test CoverageGood coverage of:
Missing coverage:
Minor Suggestions
Summary
Recommendation: Address #1 (comma boundary semantics) and #3 (cache key prompt versioning) before merging. The rest are minor improvements that can be follow-ups. |
Type of Changes
Description
This PR refactors the AI subtitle segmentation core into a stricter, testable line-protocol pipeline.
What changed
from-to | textline protocol.Scope note
Related Issue
N/A
How Has This Been Tested?
Test details:
src/utils/subtitles/__tests__/ai-segmentation.test.tscovering protocol parsing, coverage validation, timeline rebuild, and merge/split behavior.Screenshots
N/A
Checklist
Additional Information
Added changeset:
.changeset/feat-subtitles-ai-segmentation-core-protocol.md(@read-frog/extension: patch).