Agentic fix: cwd-aware paths, verification controls, lint cleanup, and targeted tests by beknobloch · Pull Request #476 · promptdriven/pdd

beknobloch · 2026-02-06T20:32:51Z

Summary

Make agentic fix path handling more robust: resolve llm_model.csv and error logs relative to the provided cwd, not process CWD.
Improve verification behavior in run_agentic_fix, including honoring PDD_AGENTIC_VERIFY=auto.
Fix stack-trace logging and clean up low-risk lint issues in agentic_fix.py.
Add/extend tests around cwd/path handling and LLM CSV discovery; temporarily skip the Google real-call test.

Test Results

Unit tests: PASS
Regression tests: PASS
Sync regression: PASS
PyLint result: 8.76 / 10

Manual Testing

N/A

Checklist

All tests pass
No temp files committed
No merge conflicts

…ewline at end of file

…r log resolution - Modified find_llm_csv_path to take an optional base_dir parameter for improved path resolution. - Updated run_agentic_fix to resolve error log paths against the provided working directory. - Added tests to verify the new behavior of find_llm_csv_path and error log resolution.

- Adjusted the logic for determining when verification is enabled, ensuring it defaults to true unless explicitly overridden. - Refined the handling of present keys based on authentication methods for better clarity and functionality.

- Changed the logic to skip tests when the Google API key is not available, providing a clearer message for the absence of the key. - Ensured that the test suite only runs when the necessary API key and CLI are present.

Copilot

Pull request overview

This PR improves the robustness of agentic_fix by making path resolution cwd-aware, adjusting verification/logging behavior, cleaning lint issues, and adding targeted tests for path discovery.

Changes:

Resolve llm_model.csv and relative error-log paths against the provided cwd (not process CWD), with new tests.
Add new agent-output “harvest/apply” parsing utilities (file blocks + optional TESTCMD) and extra diagnostics.
Update real-call tests to skip the Google provider case when the expected key isn’t available.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

File	Description
tests/test_agentic_fix.py	Adds tests for cwd-aware error-log resolution and csv discovery; adjusts Google real-call gating.
pdd/agentic_fix.py	Implements cwd-aware csv/log paths, adds agent output harvesting utilities, changes provider detection and success/verification flow.

Comments suppressed due to low confidence (1)

pdd/agentic_fix.py:1

The public function signature removed the protect_tests kwarg. If downstream callers (or CLI flags) still pass this, it will now raise TypeError. If this option is intentionally deprecated, consider keeping it as an ignored kwarg (or with the old default) for backward compatibility, or provide a clear deprecation path before removing it.

from __future__ import annotations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-06T21:03:39Z

pdd/agentic_fix.py

+# When verification mode is "auto", we may run agent-supplied TESTCMD blocks (if emitted)
+_AGENT_TESTCMD_ALLOWED = os.getenv("PDD_AGENTIC_AGENT_TESTCMD", "1") != "0"


Executing an agent-supplied command via bash -lc is a high-risk code execution vector, and _AGENT_TESTCMD_ALLOWED currently defaults to enabled (\"1\"). Consider defaulting this to disabled (require explicit opt-in), and/or only allowing TESTCMD execution when verification is explicitly enabled by the user. If you keep this feature, strongly document the risk and consider adding allowlisting (e.g., only pytest ... forms) or running inside a sandboxed environment.

Suggested change

# When verification mode is "auto", we may run agent-supplied TESTCMD blocks (if emitted)

_AGENT_TESTCMD_ALLOWED = os.getenv("PDD_AGENTIC_AGENT_TESTCMD", "1") != "0"

# When verification mode is "auto", we may run agent-supplied TESTCMD blocks (if emitted).

# For safety, this is disabled by default and must be explicitly enabled via PDD_AGENTIC_AGENT_TESTCMD.

_AGENT_TESTCMD_ALLOWED = os.getenv("PDD_AGENTIC_AGENT_TESTCMD", "0") != "0"

Copilot · 2026-02-06T21:03:39Z

pdd/agentic_fix.py

+    Execute an agent-supplied TESTCMD locally via bash -lc "<cmd>".
    Return True on exit code 0, else False. Captures and previews output (verbose).
    """


Executing an agent-supplied command via bash -lc is a high-risk code execution vector, and _AGENT_TESTCMD_ALLOWED currently defaults to enabled (\"1\"). Consider defaulting this to disabled (require explicit opt-in), and/or only allowing TESTCMD execution when verification is explicitly enabled by the user. If you keep this feature, strongly document the risk and consider adding allowlisting (e.g., only pytest ... forms) or running inside a sandboxed environment.

Suggested change

Execute an agent-supplied TESTCMD locally via bash -lc "<cmd>".

Return True on exit code 0, else False. Captures and previews output (verbose).

"""

Execute an agent-supplied TESTCMD locally via bash -lc "<cmd>".

This is potentially dangerous because it allows arbitrary shell execution.

Therefore, it is disabled by default and must be explicitly enabled by

setting the environment variable `_AGENT_TESTCMD_ALLOWED=1`.

Return True on exit code 0, else False. Captures and previews output (verbose).

"""

if os.environ.get("_AGENT_TESTCMD_ALLOWED", "0") != "1":

_info(

"[yellow]Skipping agent-supplied test command because "

"_AGENT_TESTCMD_ALLOWED is not set to '1'.[/yellow]"

)

return False

Copilot · 2026-02-06T21:03:39Z

pdd/agentic_fix.py

+        verify_cmd=verify_cmd or "No verification command provided.",
+        protect_tests="1",
+    )
+    harvest_file = Path("agentic_fix_harvest.txt")


harvest_file is created as a relative path, so it will be written to the process CWD rather than the provided cwd/project root. This conflicts with the PR’s cwd-aware goal and can also leak files into unexpected directories. Create the file under cwd (e.g., cwd / \"agentic_fix_harvest.txt\") and ensure any provider command that references it uses the correct absolute/relative path accordingly.

Suggested change

harvest_file = Path("agentic_fix_harvest.txt")

harvest_file = Path(cwd) / "agentic_fix_harvest.txt"

Copilot · 2026-02-06T21:03:40Z

pdd/agentic_fix.py

+        # If verify_cmd arg is provided, it overrides env var and default
+        if verify_cmd is None:
+            verify_cmd = os.getenv("PDD_AGENTIC_VERIFY_CMD", None)
+
+        if verify_cmd is None:
+            verify_cmd = default_verify_cmd_for(get_language(os.path.splitext(code_path)[1]), unit_test_file)


verify_cmd is computed and passed into the prompt, but run_agentic_fix no longer performs any local verification step (nor honors PDD_AGENTIC_VERIFY / PDD_AGENTIC_VERIFY=auto behavior) before returning success. As written, any agent run that returns success=True will be treated as a successful fix, even if tests still fail. Re-introduce a post-run verification gate (using _verify_and_log and your desired env-policy) before returning (True, ...).

Copilot · 2026-02-06T21:03:40Z

pdd/agentic_fix.py

+        # Delegate to shared agentic task runner (can be monkeypatched in tests)
+        success, output_text, cost, provider_used = run_agentic_task(
+            primary_instr,
            cwd=working_dir,
-            verbose=verbose,
-            quiet=quiet,
+            verbose=_IS_VERBOSE,
+            quiet=_IS_QUIET,
            label="agentic_fix",
-            max_retries=DEFAULT_MAX_RETRIES,
+            max_retries=1,
        )


verify_cmd is computed and passed into the prompt, but run_agentic_fix no longer performs any local verification step (nor honors PDD_AGENTIC_VERIFY / PDD_AGENTIC_VERIFY=auto behavior) before returning success. As written, any agent run that returns success=True will be treated as a successful fix, even if tests still fail. Re-introduce a post-run verification gate (using _verify_and_log and your desired env-policy) before returning (True, ...).

Copilot · 2026-02-06T21:03:40Z

pdd/agentic_fix.py

+        if success:
+            msg = output_text or f"Agentic fix successful with {used_model}."
+            return True, msg, est_cost, used_model, changed_files
+
+        return False, (output_text or "All agents failed to produce a passing fix (no local fallback)."), est_cost, used_model, changed_files


verify_cmd is computed and passed into the prompt, but run_agentic_fix no longer performs any local verification step (nor honors PDD_AGENTIC_VERIFY / PDD_AGENTIC_VERIFY=auto behavior) before returning success. As written, any agent run that returns success=True will be treated as a successful fix, even if tests still fail. Re-introduce a post-run verification gate (using _verify_and_log and your desired env-policy) before returning (True, ...).

Copilot · 2026-02-06T21:03:40Z

pdd/agentic_fix.py

+                lf.write(stack[-1] if stack else "N/A")
+                lf.write("\n")


This logs only the last frame of traceback.format_stack(), which is usually not enough to identify the call path that led to file creation. If the intent is diagnostic for Issue #186, write the full stack (or at least the last N frames via format_stack(limit=...)) so the log is actionable.

Suggested change

lf.write(stack[-1] if stack else "N/A")

lf.write("\n")

if stack:

lf.writelines(stack)

else:

lf.write("N/A\n")

Copilot · 2026-02-06T21:03:41Z

pdd/agentic_fix.py

+            # Check CLI availability first (subscription auth), then API key.
+            # Use agentic_common._find_cli_binary so tests can monkeypatch CLI detection consistently.
+            has_cli_auth = provider == "anthropic" and bool(agentic_common._find_cli_binary("claude"))
+            has_api_key = os.getenv(api_key_name) or (provider == "google" and os.getenv("GEMINI_API_KEY"))
+            if has_cli_auth or has_api_key:
+                if has_cli_auth:
+                    present_keys.append("claude-cli-auth")
+                if has_api_key:
+                    present_keys.append(api_key_name or ("GEMINI_API_KEY" if provider == "google" else ""))


This explicitly disables Anthropic CLI-only auth by returning failure when only claude-cli-auth is present, even though earlier logic detects it as available. If supporting subscription/CLI auth is a goal, this behavior is surprising and prevents valid configurations from working. If tests need a deterministic failure mode, consider decoupling the runtime behavior from tests (e.g., feature flag/env toggle) rather than hard-disabling CLI-only auth in production.

Copilot · 2026-02-06T21:03:41Z

pdd/agentic_fix.py

+        # Treat CLI-only Anthropic auth (claude-cli-auth) as *not* a configured API key
+        # for the purposes of this function. Tests expect a clear error when no API keys
+        # are present in the environment, even if a CLI might be installed.
+        has_real_api_keys = any(k and k != "claude-cli-auth" for k in present_keys)


This explicitly disables Anthropic CLI-only auth by returning failure when only claude-cli-auth is present, even though earlier logic detects it as available. If supporting subscription/CLI auth is a goal, this behavior is surprising and prevents valid configurations from working. If tests need a deterministic failure mode, consider decoupling the runtime behavior from tests (e.g., feature flag/env toggle) rather than hard-disabling CLI-only auth in production.

Suggested change

# Treat CLI-only Anthropic auth (claude-cli-auth) as *not* a configured API key

# for the purposes of this function. Tests expect a clear error when no API keys

# are present in the environment, even if a CLI might be installed.

has_real_api_keys = any(k and k != "claude-cli-auth" for k in present_keys)

# By default, treat Anthropic CLI auth (claude-cli-auth) as a valid way to configure

# an agent. Tests that require a deterministic "no API keys" failure can enable

# stricter behavior via PDD_REQUIRE_AGENT_API_KEYS_ONLY.

strict_api_keys_only = os.getenv("PDD_REQUIRE_AGENT_API_KEYS_ONLY", "").lower() in ("1", "true", "yes")

has_real_api_keys = any(

k and (k != "claude-cli-auth" if strict_api_keys_only else True)

for k in present_keys

)

beknobloch added 6 commits February 6, 2026 12:10

agentic_fix: improvements

96f0f93

fix: Fix stack trace logging in _detect_suspicious_files and ensure n…

90ee473

…ewline at end of file

fix: improve verification logic in run_agentic_fix

a0afcab

- Adjusted the logic for determining when verification is enabled, ensuring it defaults to true unless explicitly overridden. - Refined the handling of present keys based on authentication methods for better clarity and functionality.

chore: delint agentic_fix.py

c487105

fix: update Google API key detection in agentic_fix tests

f180468

- Changed the logic to skip tests when the Google API key is not available, providing a clearer message for the absence of the key. - Ensured that the test suite only runs when the necessary API key and CLI are present.

gltanaka requested a review from Copilot February 6, 2026 21:01

Copilot AI reviewed Feb 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Agentic fix: cwd-aware paths, verification controls, lint cleanup, and targeted tests#476

Agentic fix: cwd-aware paths, verification controls, lint cleanup, and targeted tests#476
beknobloch wants to merge 6 commits intopromptdriven:mainfrom
beknobloch:agentic-fix-split

beknobloch commented Feb 6, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Copilot AI Feb 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		# When verification mode is "auto", we may run agent-supplied TESTCMD blocks (if emitted)
		_AGENT_TESTCMD_ALLOWED = os.getenv("PDD_AGENTIC_AGENT_TESTCMD", "1") != "0"

-# When verification mode is "auto", we may run agent-supplied TESTCMD blocks (if emitted)
-_AGENT_TESTCMD_ALLOWED = os.getenv("PDD_AGENTIC_AGENT_TESTCMD", "1") != "0"
+# When verification mode is "auto", we may run agent-supplied TESTCMD blocks (if emitted).
+# For safety, this is disabled by default and must be explicitly enabled via PDD_AGENTIC_AGENT_TESTCMD.
+_AGENT_TESTCMD_ALLOWED = os.getenv("PDD_AGENTIC_AGENT_TESTCMD", "0") != "0"

-    Execute an agent-supplied TESTCMD locally via bash -lc "<cmd>".
-    Return True on exit code 0, else False. Captures and previews output (verbose).
-    """
+    Execute an agent-supplied TESTCMD locally via bash -lc "<cmd>".
+    This is potentially dangerous because it allows arbitrary shell execution.
+    Therefore, it is disabled by default and must be explicitly enabled by
+    setting the environment variable `_AGENT_TESTCMD_ALLOWED=1`.
+    Return True on exit code 0, else False. Captures and previews output (verbose).
+    """
+    if os.environ.get("_AGENT_TESTCMD_ALLOWED", "0") != "1":
+        _info(
+            "[yellow]Skipping agent-supplied test command because "
+            "_AGENT_TESTCMD_ALLOWED is not set to '1'.[/yellow]"
+        )
+        return False

	harvest_file = Path("agentic_fix_harvest.txt")
	harvest_file = Path(cwd) / "agentic_fix_harvest.txt"

-                lf.write(stack[-1] if stack else "N/A")
-                lf.write("\n")
+                if stack:
+                    lf.writelines(stack)
+                else:
+                    lf.write("N/A\n")

Conversation

beknobloch commented Feb 6, 2026

Summary

Test Results

Manual Testing

Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant