promptdriven · beknobloch · Feb 10, 2026 · Copilot
diff --git a/README.md b/README.md
@@ -591,11 +591,31 @@ flowchart TB
 - **[`detect`](#10-detect)**: Analyzes prompts to determine which ones need changes based on a description
 - **[`conflicts`](#11-conflicts)**: Finds and suggests resolutions for conflicts between two prompt files
 - **[`trace`](#13-trace)**: Finds the corresponding line number in a prompt file for a given code line
+- **[`story-test`](#21-story-test)**: Validates prompt changes against user stories
 
 ### Utility Commands
 - **[`auth`](#18-auth)**: Manages authentication with PDD Cloud
 - **[`sessions`](#19-pdd-sessions---manage-remote-sessions)**: Manage remote sessions for `connect`
 
+### User Story Prompt Tests
+PDD can validate prompt changes against user stories stored as Markdown files. This uses `detect` under the hood: a story **passes** when `detect` returns no required prompt changes.
+
+Defaults:
+- Stories live in `user_stories/` and match `story__*.md`.
+- Prompts are loaded from `prompts/` (excluding `*_llm.prompt` by default).
+
+Overrides:
+- `PDD_USER_STORIES_DIR` sets the stories directory.
+- `PDD_PROMPTS_DIR` sets the prompts directory.
+
+Commands:
+- `pdd story-test` runs the validation suite.
+- `pdd change` runs story validation after prompt modifications and fails if any story fails.
+- `pdd fix user_stories/story__<name>.md` applies a single story file to prompts and re-validates it.
+
+Template:
+- See `user_stories/story__template.md` for a starter format.
+
 ## Global Options
 
 These options can be used with any command:
@@ -1738,6 +1758,13 @@ pdd [GLOBAL OPTIONS] fix [OPTIONS] <GITHUB_ISSUE_URL>
 pdd [GLOBAL OPTIONS] fix --manual [OPTIONS] PROMPT_FILE CODE_FILE UNIT_TEST_FILE ERROR_FILE
 ```
 
+**User Story Fix Mode:**
+```
+pdd [GLOBAL OPTIONS] fix user_stories/story__my_story.md
+```
+
+This mode treats the story file as a change request for prompts. It runs `detect` to identify impacted prompts, applies prompt updates, and re-validates the story.
+
 #### Manual Mode Arguments
 - `PROMPT_FILE`: The filename of the prompt file that generated the code under test.
 - `CODE_FILE`: The filename of the code file to be fixed.
@@ -1964,6 +1991,8 @@ Options:
 - `--output LOCATION`: Specify where to save the modified prompt file. The default file name is `modified_<basename>.prompt`. If an environment variable `PDD_CHANGE_OUTPUT_PATH` is set, the file will be saved in that path unless overridden by this option.
 - `--csv`: Use a CSV file for the change prompts instead of a single change prompt file. The CSV file should have columns: `prompt_name` and `change_instructions`. When this option is used, `INPUT_PROMPT_FILE` is not needed, and `INPUT_CODE` should be the directory where the code files are located. The command expects prompt names in the CSV to follow the `<basename>_<language>.prompt` convention. For each `prompt_name` in the CSV, it will look for the corresponding code file (e.g., `<basename>.<language_extension>`) within the specified `INPUT_CODE` directory. Output files will overwrite existing files unless `--output LOCATION` is specified. If `LOCATION` is a directory, the modified prompt files will be saved inside this directory using the default naming convention otherwise, if a csv filename is specified the modified prompts will be saved in that CSV file with columns 'prompt_name' and 'modified_prompt'.
 
+**User Story Validation:** After prompt modifications (agentic or manual), `pdd change` runs user story tests when story files exist. It fails the command if any story indicates required prompt changes. Stories default to `user_stories/story__*.md` and can be overridden with `PDD_USER_STORIES_DIR`.
+
 Example (manual single prompt change):
 ```
 pdd [GLOBAL OPTIONS] change --manual --output modified_factorial_calculator_python.prompt changes_factorial.prompt src/factorial_calculator.py factorial_calculator_python.prompt
@@ -2499,6 +2528,27 @@ pdd firecrawl-cache check <url>        # Check if a URL is cached
 
 **When to use**: Caching is automatic. Use `stats` to check cache status, `info` to view configuration, `check` to verify if a URL is cached, or `clear` to force re-scraping all URLs.
 
+### 21. story-test
+
+Validate prompt changes against user stories stored as Markdown files in `user_stories/`. A story **passes** when `detect` finds no required prompt changes.
+
+**Usage:**
+```bash
+pdd [GLOBAL OPTIONS] story-test [OPTIONS]
+```
+
+**Options:**
+- `--stories-dir DIR`: Directory containing `story__*.md` files (default: `user_stories/`).
+- `--prompts-dir DIR`: Directory containing `.prompt` files (default: `prompts/`).
+- `--include-llm`: Include `*_llm.prompt` files in validation.
+- `--fail-fast/--no-fail-fast`: Stop on the first failing story (default: `--fail-fast`).
+
+**Examples:**
+```bash
+pdd story-test
+PDD_USER_STORIES_DIR=stories pdd story-test --prompts-dir prompts
+```
+
 ## Example Review Process
 
 When the global `--review-examples` option is used with any command, PDD will present potential few-shot examples that might be used for the current operation. The review process follows these steps:
@@ -2658,6 +2708,7 @@ PDD uses several environment variables to customize its behavior:
 **Note**: When using `.pddrc` configuration, context-specific settings take precedence over these global environment variables.
 
 - **`PDD_PROMPTS_DIR`**: Default directory where prompt files are located (default: "prompts").
+- **`PDD_USER_STORIES_DIR`**: Default directory where user story files are located (default: "user_stories").
 - **`PDD_GENERATE_OUTPUT_PATH`**: Default path for the `generate` command.
 - **`PDD_EXAMPLE_OUTPUT_PATH`**: Default path for the `example` command.
 - **`PDD_TEST_OUTPUT_PATH`**: Default path for the unit test file.
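
The discovery defaults documented in the README hunks above can be sketched in Python. This is a minimal illustration, not the code in `pdd/user_story_tests.py` (that module is not shown in this diff). The helper names `find_story_files` and `find_prompt_files` are hypothetical, and both the non-recursive search and the precedence of an explicit argument over the environment variable are assumptions.

```python
import os
from pathlib import Path
from typing import List, Optional


def find_story_files(stories_dir: Optional[str] = None) -> List[Path]:
    """Hypothetical helper: list story__*.md files in the stories directory."""
    # Assumed precedence: explicit argument, then PDD_USER_STORIES_DIR, then the default.
    root = Path(stories_dir or os.environ.get("PDD_USER_STORIES_DIR") or "user_stories")
    if not root.is_dir():
        return []
    return sorted(root.glob("story__*.md"))


def find_prompt_files(prompts_dir: Optional[str] = None, include_llm: bool = False) -> List[Path]:
    """Hypothetical helper: list .prompt files, skipping *_llm.prompt unless requested."""
    # Assumed precedence: explicit argument, then PDD_PROMPTS_DIR, then the default.
    root = Path(prompts_dir or os.environ.get("PDD_PROMPTS_DIR") or "prompts")
    if not root.is_dir():
        return []
    files = sorted(root.glob("*.prompt"))
    if not include_llm:
        files = [f for f in files if not f.name.endswith("_llm.prompt")]
    return files
```

Returning an empty list for a missing directory keeps the "runs user story tests when story files exist" behaviour cheap to check: no stories means nothing to validate.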

diff --git a/pdd/change_main.py b/pdd/change_main.py
@@ -22,6 +22,7 @@
 from .change import change as change_func
 from .process_csv_change import process_csv_change
 from .get_extension import get_extension
+from .user_story_tests import run_user_story_tests, discover_prompt_files
 
 # Set up logging
 logger = logging.getLogger(__name__)
@@ -487,7 +488,71 @@ def change_main(
                     logger.error(msg, exc_info=True)
                     return msg, total_cost, model_name or ""
 
-        # --- 5. Final User Feedback ---
+        # --- 5. User Story Validation (Optional) ---
+        if (use_csv or success) and not ctx.obj.get("skip_user_stories", False):
+            prompts_dir = resolved_config.get("prompts_dir") or os.environ.get("PDD_PROMPTS_DIR") or "prompts"
+            stories_dir = os.environ.get("PDD_USER_STORIES_DIR") or "user_stories"
+            validation_prompt_files = None
+            validation_prompts_dir = Path(prompts_dir)
+            output_is_csv = False
+
+            if use_csv and output_path_obj:
+                output_is_csv = output_path_obj.suffix.lower() == ".csv"
+
+            if output_is_csv:
+                if not quiet:
+                    rprint("[yellow]Skipping user story validation: output is CSV, no prompt files written.[/yellow]")
+                passed = True
+                story_cost = 0.0
+                story_model = ""
+            else:
+                override_dir = None
+                if use_csv:
+                    if "output_dir" in locals():
+                        override_dir = output_dir
+                    elif output_path_obj:
+                        if output_path_obj.is_dir() or (not output_path_obj.exists() and not output_path_obj.suffix):
+                            override_dir = output_path_obj
+                        else:
+                            override_dir = output_path_obj.parent
+                else:
+                    if output_path_obj:
+                        override_dir = output_path_obj.parent
+
+                if override_dir:
+                    override_prompts = discover_prompt_files(str(override_dir))
+                    base_prompts = discover_prompt_files(str(validation_prompts_dir))
+                    merged: List[Path] = []
+                    seen = set()
+                    for pf in override_prompts + base_prompts:
+                        key = pf.name.lower()
+                        if key in seen:
+                            continue
+                        merged.append(pf)
+                        seen.add(key)
+                    validation_prompt_files = merged
+
+                passed, _, story_cost, story_model = run_user_story_tests(
+                    prompts_dir=str(validation_prompts_dir) if validation_prompt_files is None else None,
+                    prompt_files=validation_prompt_files,
+                    stories_dir=stories_dir,
+                    strength=strength,
+                    temperature=temperature,
+                    time=time_budget,
+                    verbose=ctx.obj.get("verbose", False),
+                    quiet=quiet,
+                    fail_fast=True,
+                )
+            total_cost += story_cost
+            if story_model:
+                model_name = model_name or story_model
+            if not passed:
+                msg = "User story validation failed. Review detect results for details."
+                if not quiet:
+                    rprint(f"[bold red]Error:[/bold red] {msg}")
+                return msg, total_cost, model_name or ""
+
+        # --- 6. Final User Feedback ---
         # Show summary if not quiet AND (it was CSV mode OR non-CSV mode succeeded)
         if not quiet and (use_csv or success):
             rprint("[bold green]Prompt modification completed successfully.[/bold green]")
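
The merge step in the `change_main.py` hunk above gives freshly written prompts precedence over same-named prompts in the base directory. A standalone sketch of that logic follows; the function name `merge_prompt_files` and the sample paths are hypothetical, while the loop body mirrors the diff.

```python
from pathlib import Path
from typing import Iterable, List


def merge_prompt_files(override: Iterable[Path], base: Iterable[Path]) -> List[Path]:
    """Hypothetical helper: keep override files first, then base files with unseen names."""
    merged: List[Path] = []
    seen = set()
    for pf in list(override) + list(base):
        key = pf.name.lower()  # names are compared case-insensitively, as in the diff
        if key in seen:
            continue
        merged.append(pf)
        seen.add(key)
    return merged


merged = merge_prompt_files(
    [Path("out/Calc_python.prompt")],
    [Path("prompts/calc_python.prompt"), Path("prompts/cli_python.prompt")],
)
print([p.as_posix() for p in merged])
# prints ['out/Calc_python.prompt', 'prompts/cli_python.prompt']
```

Because the key is only the lowercased file name, two prompts with the same name in different subdirectories would collapse to one entry; whether that matters depends on how `discover_prompt_files` walks the tree, which this diff does not show.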

diff --git a/pdd/commands/__init__.py b/pdd/commands/__init__.py
@@ -7,7 +7,7 @@
 from .fix import fix
 from .modify import split, change, update
 from .maintenance import sync, auto_deps, setup
-from .analysis import detect_change, conflicts, bug, crash, trace
+from .analysis import detect_change, conflicts, bug, crash, trace, story_test
 from .connect import connect
 from .auth import auth_group
 from .misc import preprocess
@@ -35,6 +35,7 @@ def register_commands(cli: click.Group) -> None:
     cli.add_command(bug)
     cli.add_command(crash)
     cli.add_command(trace)
+    cli.add_command(story_test)
     cli.add_command(preprocess)
     cli.add_command(report_core)
     cli.add_command(install_completion_cmd, name="install_completion")

diff --git a/pdd/commands/analysis.py b/pdd/commands/analysis.py
@@ -13,6 +13,7 @@
 from ..agentic_bug import run_agentic_bug
 from ..crash_main import crash_main
 from ..trace_main import trace_main
+from ..user_story_tests import run_user_story_tests
 from ..track_cost import track_cost
 from ..core.errors import handle_error
 from ..operation_log import log_operation
@@ -62,6 +63,65 @@ def detect_change(
         return None
 
 
+@click.command("story-test")
+@click.option(
+    "--stories-dir",
+    type=click.Path(file_okay=False, dir_okay=True),
+    default=None,
+    help="Directory containing story__*.md files (default: user_stories).",
+)
+@click.option(
+    "--prompts-dir",
+    type=click.Path(file_okay=False, dir_okay=True),
+    default=None,
+    help="Directory containing .prompt files (default: prompts).",
+)
+@click.option(
+    "--include-llm",
+    is_flag=True,
+    default=False,
+    help="Include *_llm.prompt files in validation.",
+)
+@click.option(
+    "--fail-fast/--no-fail-fast",
+    default=True,
+    help="Stop on the first failing story.",
+)
+@click.pass_context
+@track_cost
+def story_test(
+    ctx: click.Context,
+    stories_dir: Optional[str],
+    prompts_dir: Optional[str],
+    include_llm: bool,
+    fail_fast: bool,
+) -> Optional[Tuple[Dict[str, Any], float, str]]:
+    """Validate prompt changes against user stories."""
+    try:
+        obj = get_context_obj(ctx)
+        passed, results, total_cost, model_name = run_user_story_tests(
+            prompts_dir=prompts_dir,
+            stories_dir=stories_dir,
+            strength=obj.get("strength", 0.2),
+            temperature=obj.get("temperature", 0.0),
+            time=obj.get("time", 0.25),
+            verbose=obj.get("verbose", False),
+            quiet=obj.get("quiet", False),
+            fail_fast=fail_fast,
+            include_llm_prompts=include_llm,
+        )
+        result = {
+            "passed": passed,
+            "results": results,
+        }
+        return result, total_cost, model_name
+    except (click.Abort, click.ClickException):
+        raise
+    except Exception as exception:
+        handle_error(exception, "story-test", get_context_obj(ctx).get("quiet", False))
+        return None
+
+
 @click.command("conflicts")
 @click.argument("prompt1", type=click.Path(exists=True, dir_okay=False))
 @click.argument("prompt2", type=click.Path(exists=True, dir_okay=False))
@@ -310,4 +370,4 @@ def trace(
         raise
     except Exception as exception:
         handle_error(exception, "trace", get_context_obj(ctx).get("quiet", False))
-        return None
+        return None
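
The pass/fail rule behind `story-test` (a story passes when `detect` reports no required prompt changes, and `--fail-fast` stops at the first failure) can be sketched as below. This is an illustration under assumptions: `run_stories` and the `Detector` callable are hypothetical stand-ins, and the real `run_user_story_tests` also returns cost and model name, which are omitted here.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical detector type: takes story text, returns the prompts that need changes.
Detector = Callable[[str], List[str]]


def run_stories(
    stories: Dict[str, str], detect: Detector, fail_fast: bool = True
) -> Tuple[bool, List[dict]]:
    """Hypothetical helper: evaluate each story and collect per-story results."""
    results: List[dict] = []
    all_passed = True
    for name, text in stories.items():
        changes = detect(text)
        passed = not changes  # a story passes only when no prompt changes are required
        results.append({"story": name, "passed": passed, "changes": changes})
        if not passed:
            all_passed = False
            if fail_fast:
                break  # stop at the first failing story
    return all_passed, results
```

With `fail_fast=True` the result list may be shorter than the story list, so callers that report totals should count evaluated stories rather than discovered ones.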
diff --git a/pdd/commands/fix.py b/pdd/commands/fix.py
@@ -73,6 +73,13 @@ def fix(
 
         # Determine mode based on first argument
         is_url = args[0].startswith("http") or "github.com" in args[0]
+
+        def is_user_story_file(path: str) -> bool:
+            return (
+                path.endswith(".md")
+                and os.path.basename(path).startswith("story__")
+                and os.path.exists(path)
+            )
 
         if is_url and not manual:
             if len(args) > 1:
@@ -107,6 +114,32 @@ def fix(
             return result_dict, cost, model
 
         else:
+            if not manual and len(args) == 1 and is_user_story_file(args[0]):
+                from ..user_story_tests import run_user_story_fix
+
+                ctx_obj = ctx.obj or {}
+                success, message, cost, model, changed_files = run_user_story_fix(
+                    ctx=ctx,
+                    story_file=args[0],
+                    prompts_dir=ctx_obj.get("prompts_dir"),
+                    strength=ctx_obj.get("strength", 0.2),
+                    temperature=ctx_obj.get("temperature", 0.0),
+                    time=ctx_obj.get("time", 0.25),
+                    budget=budget,
+                    verbose=ctx_obj.get("verbose", False),
+                    quiet=ctx_obj.get("quiet", False),
+                )
+                if success:
+                    console.print(f"[bold green]User story fix completed:[/bold green] {message}")
+                else:
+                    console.print(f"[bold red]User story fix failed:[/bold red] {message}")
+                result_dict = {
+                    "success": success,
+                    "message": message,
+                    "changed_files": changed_files,
+                }
+                return result_dict, cost, model
+
             min_args = 3 if loop else 4
             if len(args) < min_args:
                  mode_str = "Loop" if loop else "Non-loop"
@@ -188,4 +221,4 @@ def fix(
     except Exception as e:
         quiet = ctx.obj.get("quiet", False) if ctx.obj else False
         handle_error(e, "fix", quiet)
-        ctx.exit(1)
+        ctx.exit(1)
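
The mode selection added to `fix` can be exercised in isolation. In the sketch below, `is_user_story_file` is copied from the diff, while `select_fix_mode` and its return labels are hypothetical and only mirror the order of the branches.

```python
import os
from typing import List


def is_user_story_file(path: str) -> bool:
    # Copied from the diff: a story__*.md file that exists on disk.
    return (
        path.endswith(".md")
        and os.path.basename(path).startswith("story__")
        and os.path.exists(path)
    )


def select_fix_mode(args: List[str], manual: bool = False) -> str:
    """Hypothetical helper: report which branch of `fix` the arguments would reach."""
    is_url = args[0].startswith("http") or "github.com" in args[0]
    if is_url and not manual:
        return "agentic"
    if not manual and len(args) == 1 and is_user_story_file(args[0]):
        return "user-story"
    return "file-arguments"
```

Since the predicate requires the file to exist, a mistyped story path does not reach the user story branch; it falls through to the file-argument branch, where a single argument fails the minimum argument count check.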