
feat: Implement user story prompt validation [Fixes #346] #484

Open
beknobloch wants to merge 3 commits into promptdriven:main from beknobloch:user_stories

Conversation

@beknobloch (Contributor) commented:

Implements user story validation and fix workflows for prompt development.

Summary

  • Added pdd story-test to validate prompts against user stories (story__*.md).
  • Added core user story utilities:
    • run_user_story_tests to validate story compliance via detect.
    • run_user_story_fix to apply story-driven prompt updates and re-validate.
  • Integrated automatic user story validation into pdd change after prompt modifications.
  • Extended pdd fix with user story mode (pdd fix user_stories/story__*.md).
  • Added user story template and README documentation for setup and usage.
  • Added test coverage for command wiring and user story validation/fix behavior.
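The discovery step described above can be sketched roughly as follows. This is a minimal illustration assuming the layout named in this PR (a user_stories/ directory of story__*.md files plus a story__template.md starter), not the actual contents of pdd/user_story_tests.py:

```python
from pathlib import Path

def discover_story_files(stories_dir: str = "user_stories") -> list[Path]:
    """Collect story__*.md files, skipping the starter template."""
    return sorted(
        p for p in Path(stories_dir).glob("story__*.md")
        if p.name != "story__template.md"
    )
```

Each discovered story would then be validated against the prompt files via detect, as the bullets above describe.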

New files

  • pdd/user_story_tests.py: Contains logic for user story validation and fixing.
  • error_log.txt: Logs for pytest output and validation attempts.
  • user_stories/story__template.md: Template for creating user stories.
  • tests/test_user_story_tests.py: Unit tests for user story functionality.

Test Results

  • Unit tests: PASS
  • Regression tests: PASS
  • Sync regression: PASS
  • Test coverage: 79% for tests/test_user_story_tests.py and 86% for tests/test_change_main.py

Fixes #346

- Added user story validation feature to ensure prompt changes align with user stories.
- Introduced `story-test` command for validating prompt changes against user stories.
- Implemented `run_user_story_tests` and `run_user_story_fix` functions for handling user story tests and fixes.
- Updated `README.md` to include documentation for new user story features and commands.
- Added tests for user story validation and fix functionality to ensure reliability.

- Introduced multiple tests for the `change_main` function to validate input handling, including requirements for change prompts and input codes.
- Added tests to ensure proper error messages for CSV-related issues, such as missing files, empty headers, and incorrect input formats.
- Implemented checks for handling exceptions during CSV processing and ensured appropriate responses for invalid inputs.
- Enhanced test coverage for both CSV and non-CSV scenarios to improve reliability and robustness of the functionality.
@gltanaka requested a review from Copilot on February 10, 2026 at 23:34.
Copilot AI (Contributor) left a comment:

Pull request overview

Adds a user-story driven prompt validation and fix workflow, including a new CLI command and automatic validation after prompt changes.

Changes:

  • Introduces pdd story-test to validate prompts against user_stories/story__*.md.
  • Adds pdd/user_story_tests.py utilities to discover story/prompt files, validate via detect_change, and apply story-driven fixes.
  • Integrates optional user story validation into pdd change and adds a user-story mode to pdd fix.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
user_stories/story__template.md Adds a starter template for writing user stories.
pdd/user_story_tests.py Implements discovery, validation, and fix flows for user story testing.
pdd/change_main.py Runs user story validation after prompt modifications and can fail the command on story failures.
pdd/commands/analysis.py Adds story-test Click command wiring to run story validation from CLI.
pdd/commands/init.py Registers the new story-test command.
pdd/commands/fix.py Adds a “user story fix mode” that runs story-driven fixes for a single story__*.md.
tests/test_user_story_tests.py Adds unit coverage for story discovery, validation, and fix behavior.
tests/test_change_main.py Adds tests for change flow interactions with user story validation (including skip / CSV behaviors).
tests/commands/test_analysis.py Adds CLI wiring tests for story-test.
tests/commands/test_fix.py Adds CLI wiring test for user story fix mode.
README.md Documents story-test, validation defaults/overrides, and fix mode usage.


Comment on lines +329 to +331
patch.object(Path, 'mkdir') as mock_mkdir, \
patch("pdd.change_main.run_user_story_tests") as mock_story_tests: # Mock user story validation
mock_story_tests.return_value = (True, [], 0.0, "")

Copilot AI Feb 10, 2026


This with ... as mock_story_tests: block is followed by a mis-indented statement (mock_story_tests.return_value = ...). As written, Python will raise an IndentationError because there is no new block opened between line 330 and 331. Align line 331 with the rest of the with block body (same indentation level as result = change_main(...)).
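The corrected shape can be sketched with stand-in patch targets (os functions here, since pdd.change_main is not importable in isolation): every statement in the body sits one level below the `with`, at a single shared indentation level.

```python
import os
from unittest.mock import patch

def run_with_mocks():
    # Both context managers belong to one `with` statement; the body
    # lines below share one indentation level, so no IndentationError.
    with patch("os.getcwd") as mock_cwd, \
         patch("os.path.exists") as mock_exists:
        mock_cwd.return_value = "/tmp"
        mock_exists.return_value = True
        return os.getcwd(), os.path.exists("anything")
```

In the test under review, the `mock_story_tests.return_value = (True, [], 0.0, "")` line would be aligned the same way, alongside the `result = change_main(...)` call.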

Comment on lines +247 to +249
changed_files.append(str(prompt_path))
if result_message.startswith("[bold red]Error"):
    errors.append(result_message)

Copilot AI Feb 10, 2026


Error detection here is unreliable: change_main() often returns plain-text error messages (e.g., \"Error during prompt modification: ...\") that do not start with \"[bold red]Error\", so failures can be silently treated as success. Also, changed_files.append(...) happens even when change_main() fails. A concrete fix is to make change_main() return an explicit success flag (or raise on failure) and use that to (1) decide whether to append to changed_files and (2) collect errors consistently. If changing change_main()'s return type is too invasive, standardize change_main() error messages so they all share a consistent, machine-detectable prefix and check for that prefix here before appending changed_files.

Suggested change
-    changed_files.append(str(prompt_path))
-    if result_message.startswith("[bold red]Error"):
-        errors.append(result_message)
+    # Treat both Rich-styled and plain-text "Error" prefixes as failures.
+    is_error = isinstance(result_message, str) and (
+        result_message.startswith("[bold red]Error")
+        or result_message.startswith("Error")
+    )
+    if is_error:
+        errors.append(result_message)
+    else:
+        changed_files.append(str(prompt_path))

strength=strength,
temperature=temperature,
time=time_budget,
verbose=not quiet,

Copilot AI Feb 10, 2026


The validation call sets verbose=not quiet, which conflates verbosity with quiet mode and ignores the existing ctx.obj.get('verbose', False) option used elsewhere. This changes behavior (validation becomes verbose whenever quiet is false) and makes verbosity inconsistent across commands. Consider passing the actual configured verbose flag (e.g., ctx.obj.get('verbose', False)) and keeping quiet solely responsible for suppressing output.

Suggested change
-    verbose=not quiet,
+    verbose=ctx.obj.get("verbose", False),
Comment on lines +525 to +533
merged: List[Path] = []
seen = set()
for pf in override_prompts + base_prompts:
    key = pf.name.lower()
    if key in seen:
        continue
    merged.append(pf)
    seen.add(key)
validation_prompt_files = merged

Copilot AI Feb 10, 2026


De-duplicating by pf.name.lower() can drop distinct prompt files that share the same basename in different subdirectories. That can cause user story validation to miss prompts (false passes) depending on repository layout. Use a uniqueness key based on full normalized path (e.g., str(pf.resolve()).lower()) or avoid de-duplication entirely and let detect_change see all prompt files.
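A minimal sketch of the fix the comment proposes, keying uniqueness on the resolved path rather than the basename (the function wrapper is hypothetical; the PR's real code assigns to validation_prompt_files inline):

```python
from pathlib import Path
from typing import List

def merge_prompts(override_prompts: List[Path], base_prompts: List[Path]) -> List[Path]:
    """Merge override and base prompt lists, overrides first.

    Keys on the full resolved path so same-named files in different
    directories are both kept; true duplicates still collapse, with
    the override copy winning because it is seen first.
    """
    merged: List[Path] = []
    seen = set()
    for pf in override_prompts + base_prompts:
        key = str(pf.resolve()).lower()
        if key in seen:
            continue
        merged.append(pf)
        seen.add(key)
    return merged
```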

@gltanaka self-requested a review on February 11, 2026 at 01:08.
@gltanaka (Contributor) left a comment:

Hey @beknobloch, thanks for putting this together! Really like the approach of reusing detect as the validation mechanism — it's a clean design that avoids reinventing anything. The recursion prevention with skip_user_stories is well thought out too.

A few suggestions:

Design question — command naming:
The existing CLI commands are mostly single words (detect, fix, change, trace). Since story-test is essentially running detect in batch mode against story files, would it make sense to add this as a flag on detect instead (e.g., pdd detect --stories)? That would keep the command surface smaller and make the relationship to detect more obvious. Open to discussing this though.

Code suggestions:

  1. Fragile error detection in run_user_story_fix (user_story_tests.py ~line 248): result_message.startswith("[bold red]Error") is coupled to Rich markup formatting. If the formatting in change_main ever changes, this silently breaks. Could we use a more reliable signal from the return value?
  2. Hardcoded src/ in _prompt_to_code_path (~line 96): code_dir = prompts_dir.parent / "src" assumes the code directory is always ../src relative to prompts. Since the stories and prompts dirs both have env var overrides, it would be nice to have the same flexibility here (e.g., PDD_SRC_DIR).
  3. Docstrings and logging: The helper functions in user_story_tests.py are missing docstrings (the project style guide requires them), and the module doesn't set up logger = logging.getLogger(__name__) like other modules do. Adding both would bring it in line with the rest of the codebase.
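Points 2 and 3 above could be sketched at the top of user_story_tests.py roughly like this. The PDD_SRC_DIR variable name is the reviewer's suggestion, and the helper name here is hypothetical:

```python
import logging
import os
from pathlib import Path

# Module-level logger, matching the convention used elsewhere in the codebase.
logger = logging.getLogger(__name__)

def code_dir_for(prompts_dir: Path) -> Path:
    """Return the code directory for a prompts directory.

    Honors a PDD_SRC_DIR environment override, falling back to the
    current ../src layout relative to the prompts directory.
    """
    override = os.environ.get("PDD_SRC_DIR")
    return Path(override) if override else prompts_dir.parent / "src"
```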

Minor nits:

  • The setattr(ctx, "obj", ctx_obj) calls in run_user_story_fix are likely redundant since ctx.obj is already a dict reference that's mutated in place.

Overall this is solid work — the test coverage on the core validation path is good, and the integration into change and fix is minimally invasive. Nice job! 🎉

Development

Successfully merging this pull request may close these issues.

Using user stories as unit tests for prompts
