feat: warn on misspelled language suffix in prompt filenames #491
Serhan-Asad wants to merge 2 commits into promptdriven:main
6b73120 to
a6a0db3
…riven#451) Adds fuzzy matching (Levenshtein distance ≤ 2, token length ≥ 4) to detect misspelled language suffixes in prompt filenames and warn the user before falling back to default_language. Refactors _is_known_language to share the language set via _get_known_languages. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
a6a0db3 to
11d012a
Pull request overview
This PR enhances language detection in prompt filenames by adding fuzzy matching to detect and warn about misspelled language suffixes (e.g., "typscript" instead of "typescript"). Additionally, it fixes a bug where the default_language from .pddrc was being ignored during language detection.
Changes:
- Refactored language validation to use a shared _get_known_languages() helper to eliminate code duplication
- Added Levenshtein distance calculation and fuzzy matching to detect misspelled language suffixes with a threshold of ≤2 edits
- Implemented fallback to default_language from .pddrc when no language can be determined from other sources
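The fuzzy matching described above can be sketched roughly as follows. This is a minimal illustration, not the code in pdd/construct_paths.py: the names `levenshtein_distance` and `closest_known_language`, their parameters, and the alphabetical tie-breaking are assumptions made for the example. Only the thresholds (distance ≤ 2, token length ≥ 4) come from the PR description.

```python
from typing import Iterable, Optional


def levenshtein_distance(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def closest_known_language(
    token: str,
    known: Iterable[str],
    max_distance: int = 2,  # threshold stated in the PR
    min_length: int = 4,    # minimum token length stated in the PR
) -> Optional[str]:
    """Return the nearest known language within max_distance, else None."""
    token = token.lower()
    if len(token) < min_length:
        return None  # short tokens produce too many accidental matches
    best, best_distance = None, max_distance + 1
    # Hypothetical tie-break: iterate alphabetically and keep the first best hit.
    for language in sorted(known):
        distance = levenshtein_distance(token, language)
        if distance < best_distance:
            best, best_distance = language, distance
    return best
```

With this sketch, "typscript" is one edit away from "typescript" and would be suggested, while a three-letter token is rejected before any distance is computed.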
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pdd/construct_paths.py | Refactored _is_known_language to use _get_known_languages(), added _levenshtein_distance and _closest_known_language functions for fuzzy matching, and integrated misspelling warnings with default language fallback |
| tests/test_construct_paths.py | Added 23 new tests covering language detection helpers, fuzzy matching behavior, and end-to-end warning scenarios; patched _find_pddrc_file in existing tests to prevent configuration interference |
pdd/construct_paths.py
Outdated
| 'nim', 'ocaml', 'groovy', 'coffeescript', 'fish', 'zsh', | ||
| 'prisma', 'lean', 'agda', | ||
| # Frontend / templating | ||
| 'prisma', 'lean', 'agda', 'lisp', 'scheme', 'ada', |
The new languages 'lisp', 'scheme', and 'ada' were added on the same line as 'prisma', 'lean', and 'agda', breaking the logical grouping pattern. Based on the removed comment, these appear to be different categories. Consider moving them to a separate line or adding a comment to clarify their grouping.
| 'prisma', 'lean', 'agda', 'lisp', 'scheme', 'ada', | |
| 'prisma', 'lean', 'agda', | |
| 'lisp', 'scheme', 'ada', |
tests/test_construct_paths.py
Outdated
| # Exact matches are handled by _is_known_language, not this function | ||
| # But if called, distance is 0 which is <= 2, so it returns the match | ||
| assert _closest_known_language("typescript") == "typescript" |
This test assumes that exact matches return the language itself, but the comment suggests this case should be handled by _is_known_language instead. Consider testing the actual expected behavior where _closest_known_language is only called for non-exact matches, or explicitly document that returning exact matches is intentional fallback behavior.
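One way to make the intended behavior explicit is to guard the fuzzy lookup so it only runs for tokens that are not exact matches. The sketch below is an assumption about how such a guard could look. `misspelling_suggestion` and its `closest` parameter are hypothetical names, and the matcher is passed in as a callable so the example stays independent of any particular distance implementation.

```python
from typing import Callable, Iterable, Optional


def misspelling_suggestion(
    token: str,
    known: Iterable[str],
    closest: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a suggested language only when token is not already a known language."""
    token = token.lower()
    if token in set(known):
        return None  # exact match: nothing to warn about
    return closest(token)  # delegate to the fuzzy matcher for everything else
```

A test written against this guard would assert that an exact match yields no suggestion, which avoids relying on the distance-0 behavior of the matcher itself.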
Move Levenshtein distance and closest language matching to a future feature PR. Keep _get_known_languages refactor and default_language fallback. Also fix language grouping per Copilot review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Duplicate of PR promptdriven#491 for testing purposes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Fixes default_language from .pddrc being ignored during language detection
- Detects misspelled language suffixes (e.g. new_typscript.prompt) and warns the user before silently falling back to default_language
- Refactors _is_known_language to share the language set via _get_known_languages helper (no duplication)

Fixes #451
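The default_language fallback can be illustrated with a small sketch. It assumes the language token is the part of the filename stem after the last underscore, and that the value read from .pddrc is passed in as a plain argument. The function name `language_from_prompt_filename` is hypothetical and not taken from the PR.

```python
from pathlib import Path
from typing import Iterable, Optional


def language_from_prompt_filename(
    filename: str,
    known: Iterable[str],
    default_language: Optional[str] = None,  # assumed to come from .pddrc
) -> Optional[str]:
    """Return the language suffix if it is known, otherwise default_language."""
    stem = Path(filename).stem  # "new_typscript.prompt" -> "new_typscript"
    _, separator, suffix = stem.rpartition("_")
    known_lower = {language.lower() for language in known}
    if separator and suffix.lower() in known_lower:
        return suffix.lower()
    # Unknown or missing suffix: fall back instead of ignoring the default.
    return default_language
```

In this sketch new_typscript.prompt resolves to the default rather than to "typescript", which is the situation where a misspelling warning is useful.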
Test plan
- Tests for _get_known_languages, _levenshtein_distance, _closest_known_language, and end-to-end warning behavior