51 changes: 51 additions & 0 deletions README.md
@@ -591,11 +591,31 @@ flowchart TB
- **[`detect`](#10-detect)**: Analyzes prompts to determine which ones need changes based on a description
- **[`conflicts`](#11-conflicts)**: Finds and suggests resolutions for conflicts between two prompt files
- **[`trace`](#13-trace)**: Finds the corresponding line number in a prompt file for a given code line
- **[`story-test`](#21-story-test)**: Validates prompt changes against user stories

### Utility Commands
- **[`auth`](#18-auth)**: Manages authentication with PDD Cloud
- **[`sessions`](#19-pdd-sessions---manage-remote-sessions)**: Manages remote sessions for `connect`

### User Story Prompt Tests
PDD can validate prompt changes against user stories stored as Markdown files. This uses `detect` under the hood: a story **passes** when `detect` returns no required prompt changes.

Defaults:
- Stories live in `user_stories/` and match `story__*.md`.
- Prompts are loaded from `prompts/` (excluding `*_llm.prompt` by default).

Overrides:
- `PDD_USER_STORIES_DIR` sets the stories directory.
- `PDD_PROMPTS_DIR` sets the prompts directory.

Commands:
- `pdd story-test` runs the validation suite.
- `pdd change` runs story validation after prompt modifications and fails if any story fails.
- `pdd fix user_stories/story__my_story.md` applies a single story to prompts and re-validates it.

Template:
- See `user_stories/story__template.md` for a starter format.
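
The defaults and overrides above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `find_story_files` and `find_prompt_files` are hypothetical names, the lookup is assumed to be non-recursive, and `.pddrc` context settings are ignored.

```python
import os
from pathlib import Path
from typing import List, Optional


def find_story_files(stories_dir: Optional[str] = None) -> List[Path]:
    """Hypothetical helper: list story__*.md files, sorted by name."""
    # Assumed precedence: explicit argument, then PDD_USER_STORIES_DIR, then the default.
    root = Path(stories_dir or os.environ.get("PDD_USER_STORIES_DIR") or "user_stories")
    if not root.is_dir():
        return []
    return sorted(root.glob("story__*.md"))


def find_prompt_files(prompts_dir: Optional[str] = None, include_llm: bool = False) -> List[Path]:
    """Hypothetical helper: list .prompt files, skipping *_llm.prompt by default."""
    root = Path(prompts_dir or os.environ.get("PDD_PROMPTS_DIR") or "prompts")
    if not root.is_dir():
        return []
    files = sorted(root.glob("*.prompt"))
    if not include_llm:
        # Mirror the documented default of excluding *_llm.prompt files.
        files = [f for f in files if not f.name.endswith("_llm.prompt")]
    return files
```

A missing directory yields an empty list in this sketch, which would correspond to "no stories to validate" rather than an error.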

## Global Options

These options can be used with any command:
@@ -1738,6 +1758,13 @@ pdd [GLOBAL OPTIONS] fix [OPTIONS] <GITHUB_ISSUE_URL>
pdd [GLOBAL OPTIONS] fix --manual [OPTIONS] PROMPT_FILE CODE_FILE UNIT_TEST_FILE ERROR_FILE
```

**User Story Fix Mode:**
```
pdd [GLOBAL OPTIONS] fix user_stories/story__my_story.md
```

This mode treats the story file as a change request for prompts. It runs `detect` to identify impacted prompts, applies prompt updates, and re-validates the story.
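
The detect, apply, re-validate sequence might look like the sketch below. It is an assumption-laden illustration: `fix_story`, `detect`, and `apply_change` are hypothetical stand-ins passed in as callables, not the signatures used by `run_user_story_fix`.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signatures: detect returns the names of prompts that need changes;
# apply_change returns the updated prompt text.
DetectFn = Callable[[str, Dict[str, str]], List[str]]
ApplyFn = Callable[[str, str], str]


def fix_story(
    story: str,
    prompts: Dict[str, str],
    detect: DetectFn,
    apply_change: ApplyFn,
) -> Tuple[bool, List[str]]:
    """Apply a story to the impacted prompts, then re-validate it."""
    impacted = detect(story, prompts)
    for name in impacted:
        prompts[name] = apply_change(prompts[name], story)
    # Re-run detect only if something changed; the story passes when nothing remains.
    remaining = detect(story, prompts) if impacted else []
    return not remaining, impacted
```

The returned list of impacted prompt names plays the role of the changed-files report in this sketch.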

#### Manual Mode Arguments
- `PROMPT_FILE`: The filename of the prompt file that generated the code under test.
- `CODE_FILE`: The filename of the code file to be fixed.
@@ -1964,6 +1991,8 @@ Options:
- `--output LOCATION`: Specify where to save the modified prompt file. The default file name is `modified_<basename>.prompt`. If an environment variable `PDD_CHANGE_OUTPUT_PATH` is set, the file will be saved in that path unless overridden by this option.
- `--csv`: Use a CSV file for the change prompts instead of a single change prompt file. The CSV file should have columns: `prompt_name` and `change_instructions`. When this option is used, `INPUT_PROMPT_FILE` is not needed, and `INPUT_CODE` should be the directory where the code files are located. The command expects prompt names in the CSV to follow the `<basename>_<language>.prompt` convention. For each `prompt_name` in the CSV, it will look for the corresponding code file (e.g., `<basename>.<language_extension>`) within the specified `INPUT_CODE` directory. Output files will overwrite existing files unless `--output LOCATION` is specified. If `LOCATION` is a directory, the modified prompt files will be saved inside this directory using the default naming convention; otherwise, if a CSV filename is specified, the modified prompts will be saved in that CSV file with columns `prompt_name` and `modified_prompt`.

**User Story Validation:** After prompt modifications (agentic or manual), `pdd change` runs user story tests when story files exist. It fails the command if any story indicates required prompt changes. Stories default to `user_stories/story__*.md` and can be overridden with `PDD_USER_STORIES_DIR`.

Example (manual single prompt change):
```
pdd [GLOBAL OPTIONS] change --manual --output modified_factorial_calculator_python.prompt changes_factorial.prompt src/factorial_calculator.py factorial_calculator_python.prompt
@@ -2499,6 +2528,27 @@ pdd firecrawl-cache check <url> # Check if a URL is cached

**When to use**: Caching is automatic. Use `stats` to check cache status, `info` to view configuration, `check` to verify if a URL is cached, or `clear` to force re-scraping all URLs.

### 21. story-test

Validate prompt changes against user stories stored as Markdown files in `user_stories/`. A story **passes** when `detect` finds no required prompt changes.

**Usage:**
```bash
pdd [GLOBAL OPTIONS] story-test [OPTIONS]
```

**Options:**
- `--stories-dir DIR`: Directory containing `story__*.md` files (default: `user_stories/`).
- `--prompts-dir DIR`: Directory containing `.prompt` files (default: `prompts/`).
- `--include-llm`: Include `*_llm.prompt` files in validation.
- `--fail-fast/--no-fail-fast`: Stop on the first failing story (default: `--fail-fast`).

**Examples:**
```bash
pdd story-test
PDD_USER_STORIES_DIR=stories pdd story-test --prompts-dir prompts
```
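
The pass/fail rule and the `--fail-fast` behavior can be illustrated with a short sketch. The names here are hypothetical: `run_stories` is not the PR's `run_user_story_tests`, and `detect` is assumed to be a callable that returns the list of required prompt changes for a story.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical: detect maps story text to the prompt changes it requires.
DetectFn = Callable[[str], List[str]]


def run_stories(
    stories: Dict[str, str],
    detect: DetectFn,
    fail_fast: bool = True,
) -> Tuple[bool, Dict[str, bool]]:
    """Return (all_passed, per-story results); a story passes when detect finds nothing."""
    results: Dict[str, bool] = {}
    for name, text in stories.items():
        required_changes = detect(text)
        results[name] = not required_changes
        if required_changes and fail_fast:
            break  # Stop at the first failing story.
    return all(results.values()), results
```

With `fail_fast=True`, stories after the first failure are never evaluated, so they do not appear in the results.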

## Example Review Process

When the global `--review-examples` option is used with any command, PDD will present potential few-shot examples that might be used for the current operation. The review process follows these steps:
@@ -2658,6 +2708,7 @@ PDD uses several environment variables to customize its behavior:
**Note**: When using `.pddrc` configuration, context-specific settings take precedence over these global environment variables.

- **`PDD_PROMPTS_DIR`**: Default directory where prompt files are located (default: "prompts").
- **`PDD_USER_STORIES_DIR`**: Default directory where user story files are located (default: "user_stories").
- **`PDD_GENERATE_OUTPUT_PATH`**: Default path for the `generate` command.
- **`PDD_EXAMPLE_OUTPUT_PATH`**: Default path for the `example` command.
- **`PDD_TEST_OUTPUT_PATH`**: Default path for the unit test file.
67 changes: 66 additions & 1 deletion pdd/change_main.py
@@ -22,6 +22,7 @@
from .change import change as change_func
from .process_csv_change import process_csv_change
from .get_extension import get_extension
from .user_story_tests import run_user_story_tests, discover_prompt_files

# Set up logging
logger = logging.getLogger(__name__)
@@ -487,7 +488,71 @@ def change_main(
logger.error(msg, exc_info=True)
return msg, total_cost, model_name or ""

# --- 5. Final User Feedback ---
    # --- 5. User Story Validation (Optional) ---
    if (use_csv or success) and not ctx.obj.get("skip_user_stories", False):
        prompts_dir = resolved_config.get("prompts_dir") or os.environ.get("PDD_PROMPTS_DIR") or "prompts"
        stories_dir = os.environ.get("PDD_USER_STORIES_DIR") or "user_stories"
        validation_prompt_files = None
        validation_prompts_dir = Path(prompts_dir)
        output_is_csv = False

        if use_csv and output_path_obj:
            output_is_csv = output_path_obj.suffix.lower() == ".csv"

        if output_is_csv:
            if not quiet:
                rprint("[yellow]Skipping user story validation: output is CSV, no prompt files written.[/yellow]")
            passed = True
            story_cost = 0.0
            story_model = ""
        else:
            override_dir = None
            if use_csv:
                if "output_dir" in locals():
                    override_dir = output_dir
                elif output_path_obj:
                    if output_path_obj.is_dir() or (not output_path_obj.exists() and not output_path_obj.suffix):
                        override_dir = output_path_obj
                    else:
                        override_dir = output_path_obj.parent
            else:
                if output_path_obj:
                    override_dir = output_path_obj.parent

            if override_dir:
                override_prompts = discover_prompt_files(str(override_dir))
                base_prompts = discover_prompt_files(str(validation_prompts_dir))
                merged: List[Path] = []
                seen = set()
                for pf in override_prompts + base_prompts:
                    key = pf.name.lower()
                    if key in seen:
                        continue
                    merged.append(pf)
                    seen.add(key)
                validation_prompt_files = merged
Comment on lines +525 to +533
Copilot AI Feb 10, 2026

De-duplicating by pf.name.lower() can drop distinct prompt files that share the same basename in different subdirectories. That can cause user story validation to miss prompts (false passes) depending on repository layout. Use a uniqueness key based on full normalized path (e.g., str(pf.resolve()).lower()) or avoid de-duplication entirely and let detect_change see all prompt files.
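
The concern in this comment can be reproduced with a small sketch. The helper names and example paths are hypothetical, and keying on the resolved path is only one possible fix; note that it also drops the "override shadows base prompt with the same name" behavior that the basename key provides.

```python
from pathlib import Path
from typing import Iterable, List


def dedupe_by_name(paths: Iterable[Path]) -> List[Path]:
    """Keep the first file per lowercase basename (the behavior under review)."""
    seen = set()
    kept: List[Path] = []
    for p in paths:
        key = p.name.lower()
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept


def dedupe_by_path(paths: Iterable[Path]) -> List[Path]:
    """Keep the first file per resolved path, so equal basenames in different folders survive."""
    seen = set()
    kept: List[Path] = []
    for p in paths:
        key = p.resolve()
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept


# Hypothetical layout: two distinct prompts share a basename.
example = [
    Path("prompts/api/utils_python.prompt"),
    Path("prompts/cli/utils_python.prompt"),
    Path("prompts/api/utils_python.prompt"),  # a true duplicate
]
```

On the example list, the basename key keeps one file while the resolved-path key keeps two, which is the false-pass risk the comment describes.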

            passed, _, story_cost, story_model = run_user_story_tests(
                prompts_dir=str(validation_prompts_dir) if validation_prompt_files is None else None,
                prompt_files=validation_prompt_files,
                stories_dir=stories_dir,
                strength=strength,
                temperature=temperature,
                time=time_budget,
                verbose=not quiet,
Copilot AI Feb 10, 2026

The validation call sets verbose=not quiet, which conflates verbosity with quiet mode and ignores the existing ctx.obj.get('verbose', False) option used elsewhere. This changes behavior (validation becomes verbose whenever quiet is false) and makes verbosity inconsistent across commands. Consider passing the actual configured verbose flag (e.g., ctx.obj.get('verbose', False)) and keeping quiet solely responsible for suppressing output.

Suggested change
- verbose=not quiet,
+ verbose=ctx.obj.get("verbose", False),
                quiet=quiet,
                fail_fast=True,
            )
        total_cost += story_cost
        if story_model:
            model_name = model_name or story_model
        if not passed:
            msg = "User story validation failed. Review detect results for details."
            if not quiet:
                rprint(f"[bold red]Error:[/bold red] {msg}")
            return msg, total_cost, model_name or ""

# --- 6. Final User Feedback ---
# Show summary if not quiet AND (it was CSV mode OR non-CSV mode succeeded)
if not quiet and (use_csv or success):
rprint("[bold green]Prompt modification completed successfully.[/bold green]")
3 changes: 2 additions & 1 deletion pdd/commands/__init__.py
@@ -7,7 +7,7 @@
from .fix import fix
from .modify import split, change, update
from .maintenance import sync, auto_deps, setup
from .analysis import detect_change, conflicts, bug, crash, trace
from .analysis import detect_change, conflicts, bug, crash, trace, story_test
from .connect import connect
from .auth import auth_group
from .misc import preprocess
@@ -35,6 +35,7 @@ def register_commands(cli: click.Group) -> None:
cli.add_command(bug)
cli.add_command(crash)
cli.add_command(trace)
cli.add_command(story_test)
cli.add_command(preprocess)
cli.add_command(report_core)
cli.add_command(install_completion_cmd, name="install_completion")
62 changes: 61 additions & 1 deletion pdd/commands/analysis.py
@@ -13,6 +13,7 @@
from ..agentic_bug import run_agentic_bug
from ..crash_main import crash_main
from ..trace_main import trace_main
from ..user_story_tests import run_user_story_tests
from ..track_cost import track_cost
from ..core.errors import handle_error
from ..operation_log import log_operation
@@ -62,6 +63,65 @@ def detect_change(
return None


@click.command("story-test")
@click.option(
    "--stories-dir",
    type=click.Path(file_okay=False, dir_okay=True),
    default=None,
    help="Directory containing story__*.md files (default: user_stories).",
)
@click.option(
    "--prompts-dir",
    type=click.Path(file_okay=False, dir_okay=True),
    default=None,
    help="Directory containing .prompt files (default: prompts).",
)
@click.option(
    "--include-llm",
    is_flag=True,
    default=False,
    help="Include *_llm.prompt files in validation.",
)
@click.option(
    "--fail-fast/--no-fail-fast",
    default=True,
    help="Stop on the first failing story.",
)
@click.pass_context
@track_cost
def story_test(
    ctx: click.Context,
    stories_dir: Optional[str],
    prompts_dir: Optional[str],
    include_llm: bool,
    fail_fast: bool,
) -> Optional[Tuple[Dict[str, Any], float, str]]:
    """Validate prompt changes against user stories."""
    try:
        obj = get_context_obj(ctx)
        passed, results, total_cost, model_name = run_user_story_tests(
            prompts_dir=prompts_dir,
            stories_dir=stories_dir,
            strength=obj.get("strength", 0.2),
            temperature=obj.get("temperature", 0.0),
            time=obj.get("time", 0.25),
            verbose=obj.get("verbose", False),
            quiet=obj.get("quiet", False),
            fail_fast=fail_fast,
            include_llm_prompts=include_llm,
        )
        result = {
            "passed": passed,
            "results": results,
        }
        return result, total_cost, model_name
    except (click.Abort, click.ClickException):
        raise
    except Exception as exception:
        handle_error(exception, "story-test", get_context_obj(ctx).get("quiet", False))
        return None


@click.command("conflicts")
@click.argument("prompt1", type=click.Path(exists=True, dir_okay=False))
@click.argument("prompt2", type=click.Path(exists=True, dir_okay=False))
@@ -310,4 +370,4 @@ def trace(
raise
except Exception as exception:
handle_error(exception, "trace", get_context_obj(ctx).get("quiet", False))
return None
return None
35 changes: 34 additions & 1 deletion pdd/commands/fix.py
@@ -73,6 +73,13 @@ def fix(

        # Determine mode based on first argument
        is_url = args[0].startswith("http") or "github.com" in args[0]

        def is_user_story_file(path: str) -> bool:
            return (
                path.endswith(".md")
                and os.path.basename(path).startswith("story__")
                and os.path.exists(path)
            )

if is_url and not manual:
if len(args) > 1:
@@ -107,6 +114,32 @@ def fix(
return result_dict, cost, model

        else:
            if not manual and len(args) == 1 and is_user_story_file(args[0]):
                from ..user_story_tests import run_user_story_fix

                ctx_obj = ctx.obj or {}
                success, message, cost, model, changed_files = run_user_story_fix(
                    ctx=ctx,
                    story_file=args[0],
                    prompts_dir=ctx_obj.get("prompts_dir"),
                    strength=ctx_obj.get("strength", 0.2),
                    temperature=ctx_obj.get("temperature", 0.0),
                    time=ctx_obj.get("time", 0.25),
                    budget=budget,
                    verbose=ctx_obj.get("verbose", False),
                    quiet=ctx_obj.get("quiet", False),
                )
                if success:
                    console.print(f"[bold green]User story fix completed:[/bold green] {message}")
                else:
                    console.print(f"[bold red]User story fix failed:[/bold red] {message}")
                result_dict = {
                    "success": success,
                    "message": message,
                    "changed_files": changed_files,
                }
                return result_dict, cost, model

min_args = 3 if loop else 4
if len(args) < min_args:
mode_str = "Loop" if loop else "Non-loop"
@@ -188,4 +221,4 @@ def fix(
except Exception as e:
quiet = ctx.obj.get("quiet", False) if ctx.obj else False
handle_error(e, "fix", quiet)
ctx.exit(1)
ctx.exit(1)