Skip to content

Fix #232: Only print modified file paths in fix command output#326

Open
Serhan-Asad wants to merge 1 commit intopromptdriven:mainfrom
Serhan-Asad:fix-issue-232-final
Open

Fix #232: Only print modified file paths in fix command output#326
Serhan-Asad wants to merge 1 commit intopromptdriven:mainfrom
Serhan-Asad:fix-issue-232-final

Conversation

@Serhan-Asad
Copy link
Contributor

Summary

Fix #232: Only print file paths that were actually modified in pdd fix command output.

Problem: The pdd fix --loop command was printing paths for both test and code files even when only one file was modified, leading to confusion about which files actually changed.

Solution:

  • Added modification detection to fix_error_loop.py by comparing original file content (read at start) with final content (read at end)
  • Extended return tuple from 6 to 8 values, adding test_file_was_modified and code_file_was_modified boolean flags
  • Updated fix_main.py to conditionally print file paths based on modification status in loop mode
  • Non-loop mode behavior unchanged (uses LLM update flags as before)

Approach: Why Content Comparison?

After analyzing 10 alternative approaches (set-based tracking, mtime, hashing, wrapper classes, context managers, git diff, etc.), content comparison emerged as the optimal solution:

Why this approach:

  1. Semantically correct - Answers "Did the content change?" (not "Did we write to the file?")
  2. Handles edge cases - File written but content identical → correctly reports "not modified"
  3. Future-proof - Automatically works with all code paths (including agentic fallback) without modification
  4. Centralized logic - Only 2 locations: read at start, compare at end
  5. Simple & maintainable - Anyone reading the code immediately understands the logic
  6. Negligible overhead - 2 extra file reads (~1-2ms) is <0.1% of total pdd fix time

Alternatives rejected:

  • Set tracking: Tracks writes, not changes (wrong semantics)
  • mtime: Unreliable (clock skew, containers, network filesystems)
  • Hash comparison: Adds CPU overhead without benefit for small code files
  • Wrapper classes: Too invasive, requires refactoring all write points
  • Context managers: Overkill for this simple use case
  • Git diff: Not portable, heavyweight subprocess

See ISSUE_232_ANALYSIS_ALL_APPROACHES.md for full analysis.

Changes using pdd fix

  • pdd/fix_error_loop.py: Added content comparison for modification detection (48 lines modified)
  • pdd/fix_main.py: Added conditional output logic for loop vs non-loop modes (20 lines modified)
  • tests/test_fix_error_loop.py: Added 2 new modification detection tests (188 lines added)
  • tests/test_fix_main.py: Added 2 parametrized tests covering 7 scenarios (209 lines added)

Test Results

  • Unit tests: PASS (54/54 passed, excluding 17 pre-existing cloud test failures)
  • Regression tests: Not run (requires make regression with cloud credentials)
  • Sync regression: Not run (requires make sync-regression)
  • Test coverage: 53% for modified files (fix_error_loop.py: 58%, fix_main.py: 45%)

Issue #232 Specific Tests (9 scenarios):

  • test_fix_error_loop_returns_modification_flags_only_test_modified
  • test_fix_error_loop_returns_modification_flags_only_code_modified
  • test_fix_main_loop_mode_prints_only_modified_files (4 parametrized cases: only test, only code, both, neither)
  • test_fix_main_non_loop_mode_uses_update_flags_and_checks_content (3 parametrized cases)

Test improvements:

  • Refactored 7 repetitive tests into 2 parametrized tests (65% code reduction, ~400→140 lines)
  • All upstream tests preserved (0 tests removed, 4 tests added)

Manual Testing

Loop mode:

  • Only modified test file printed when code unchanged
  • Only modified code file printed when test unchanged
  • Both paths printed when both files modified
  • No file paths printed when tests pass immediately (neither modified)
  • Results file always printed

Non-loop mode:

  • Behavior unchanged (uses LLM update flags)
  • Backward compatibility maintained

Checklist

  • All tests pass (54 unit tests)
  • No temp files committed
  • No merge conflicts
  • No upstream tests removed (verified with diff)
  • (If feature) A/B comparison included - See manual testing section
  • (If prompts changed) Submitted to pdd_cap
  • Copilot comments addressed - No Copilot review yet
  • Regression tests - Requires manual execution

Fixes #232

…utput

- Add modification detection to fix_error_loop.py (content comparison)
- Add conditional output to fix_main.py based on modification flags
- Loop mode: Only prints files that were actually modified
- Non-loop mode: Unchanged behavior (uses LLM update flags)
- Add comprehensive tests for modification detection
- Extend return tuple from 6 to 8 values (added test_modified, code_modified)
@gltanaka gltanaka requested a review from Copilot January 17, 2026 19:16
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes issue #232 by modifying the pdd fix --loop command to only print file paths for files that were actually modified during the fix process, rather than printing paths for both test and code files regardless of which ones changed.

Changes:

  • Modified fix_error_loop to detect file modifications by comparing original content (read at start) with final content (read at end)
  • Extended fix_error_loop return tuple from 6 to 8 values to include test_file_was_modified and code_file_was_modified boolean flags
  • Updated fix_main to conditionally print file paths based on modification status in loop mode while preserving existing behavior for non-loop mode

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
pdd/fix_error_loop.py Adds content comparison logic to detect which files were modified, reads original content at function start and compares with final content at all return points
pdd/fix_main.py Implements conditional output printing based on modification flags (loop mode) vs update flags (non-loop mode)
tests/test_fix_error_loop.py Adds 2 new tests to verify modification detection works correctly for test-only and code-only changes
tests/test_fix_main.py Adds parametrized tests covering 7 scenarios for both loop and non-loop modes to ensure correct output behavior

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

success, final_unit_test, final_code, attempts, total_cost, model_name, test_modified, code_modified = fix_error_loop(
unit_test_file,
code_file,
"dummy_prompt.txt",
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line adds a new positional argument prompt_file between existing parameters code_file and prompt, which changes the function signature. However, the function definition shows prompt_file is already the 3rd parameter. This creates a duplicate argument, which will cause a TypeError when the function is called.

Suggested change
"dummy_prompt.txt",

Copilot uses AI. Check for mistakes.
auto_submit=False
)
try:
success, fixed_test, fixed_code, attempts, cost, model = fix_main(
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unpacking expects 6 values but fix_main now returns a tuple with 6 values. However, this is inconsistent with the changes elsewhere in the file where all other calls to fix_error_loop are updated to unpack 8 values. The test should handle the return value consistently with other tests.

Suggested change
success, fixed_test, fixed_code, attempts, cost, model = fix_main(
success, fixed_test, fixed_code, attempts, cost, model, _, _ = fix_main(

Copilot uses AI. Check for mistakes.
Comment on lines +2401 to +2407
mock_fix_error_loop.return_value = (
True,
"test content",
"code content",
2 if (test_modified or code_modified) else 0,
1.25 if (test_modified or code_modified) else 0.0,
"gpt-4" if (test_modified or code_modified) else "",
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repeated conditional expression (test_modified or code_modified) appears three times in consecutive lines. Consider extracting this into a variable like files_were_modified = test_modified or code_modified to improve readability and maintainability.

Suggested change
mock_fix_error_loop.return_value = (
True,
"test content",
"code content",
2 if (test_modified or code_modified) else 0,
1.25 if (test_modified or code_modified) else 0.0,
"gpt-4" if (test_modified or code_modified) else "",
files_were_modified = test_modified or code_modified
mock_fix_error_loop.return_value = (
True,
"test content",
"code content",
2 if files_were_modified else 0,
1.25 if files_were_modified else 0.0,
"gpt-4" if files_were_modified else "",

Copilot uses AI. Check for mistakes.
@gltanaka
Copy link
Contributor

Thanks for working on this fix, @Serhan-Asad!

We've reviewed this PR against an alternative approach and decided to go with a simpler implementation. Here's a comparison:

This PR's Approach

  • Content comparison (original vs final file content)
  • Modifies fix_error_loop.py and fix_main.py
  • Changes fix_error_loop return signature (6 → 8 values)
  • Distinguishes loop vs non-loop modes
  • ~70 lines of changes across multiple files

Alternative Approach (implemented)

  • Simple truthy check on returned content (if fixed_unit_test: / if fixed_code:)
  • Modifies only fix_main.py (4 lines)
  • No API changes
  • Same logic for both modes
if fixed_unit_test:
    rprint(f"  Test file: {output_file_paths['output_test']}")
if fixed_code:
    rprint(f"  Code file: {output_file_paths['output_code']}")

Rationale

The simpler fix solves the reported problem without:

  • Breaking the fix_error_loop return signature
  • Adding complexity for edge cases that are rare in practice

The edge case your approach handles (LLM returns identical content) is uncommon and arguably not incorrect to report.

We appreciate the thorough analysis of alternatives mentioned in your PR description. Closing in favor of the simpler fix.

@gltanaka
Copy link
Contributor

Closing in favor of the simpler fix merged in commit 9d3d5d0.

@gltanaka
Copy link
Contributor

Reopening: this PR was automatically closed by a force push to main on Feb 9, not intentionally.

@gltanaka gltanaka reopened this Feb 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pdd fix incorrectly prints the name of an output code file when no new code file was generated

2 participants