
ci(strict21): add canonical policy + preflight gate #15

Open

Prekzursil wants to merge 1 commit into main from codex/strict-21-100-webcoder

Conversation

@Prekzursil
Owner

Problem / Outcome

Start strict-21 rollout (Vercel-free) with canonical context policy and preflight gate.

Scope and Success Criteria

  • Canonical required-check policy file committed.
  • PR preflight checks canonical contexts against branch protection and emitted checks.
  • No Vercel contexts included.
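For reference, a minimal sketch of what a canonical required-check policy file could look like, based on the fields described later in this PR (strict mode, linear history, single approval). The exact field names and contexts here are illustrative assumptions, not the committed file's contents:

```json
{
  "branch": "main",
  "required_status_checks": {
    "strict": true,
    "contexts": ["backend", "frontend", "CodeQL"]
  },
  "required_linear_history": true,
  "required_approving_review_count": 1
}
```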

Risk Classification

risk:medium

Rollback Plan

Revert this branch if strict preflight blocks valid PRs unexpectedly.

Evidence Paths

  • .github/branch-protection-policy.json
  • .github/workflows/strict21-preflight.yml
  • scripts/strict21_preflight.py

Co-authored-by: Codex <noreply@openai.com>
@devloai

devloai bot commented Mar 2, 2026

Unable to trigger custom agent "Code Reviewer". You have run out of credits 😔
Please upgrade your plan or buy additional credits from the subscription page.

@coderabbitai

coderabbitai bot commented Mar 2, 2026

Warning

Rate limit exceeded

@Prekzursil has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 8 minutes and 39 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 0b35b66 and ddb7202.

📒 Files selected for processing (3)
  • .github/branch-protection-policy.json
  • .github/workflows/strict21-preflight.yml
  • scripts/strict21_preflight.py

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@qodo-code-review

Review Summary by Qodo

Add strict-21 preflight validation with canonical policy

✨ Enhancement


Walkthroughs

Description
• Add strict-21 preflight validation script comparing canonical contexts
• Define canonical branch protection policy with 24 required status checks
• Implement GitHub Actions workflow for automated preflight checks on PRs
• Validate branch protection and emitted check-runs against canonical contexts
Diagram
flowchart LR
  A["Canonical Contexts List"] -->|"Compare against"| B["Branch Protection Policy"]
  A -->|"Compare against"| C["Emitted Check-Runs"]
  B -->|"Evaluate"| D["Preflight Script"]
  C -->|"Evaluate"| D
  D -->|"Generate"| E["JSON Report"]
  D -->|"Generate"| F["Markdown Report"]
  G["GitHub Actions Workflow"] -->|"Trigger"| D


File Changes

1. scripts/strict21_preflight.py ✨ Enhancement +301/-0

Strict-21 preflight validation script implementation

• Implements preflight validation script with 301 lines of Python code
• Compares canonical contexts against branch protection and emitted check-runs
• Queries GitHub API to fetch branch protection rules and commit check statuses
• Generates JSON and Markdown reports with findings and compliance status
• Handles HTTP errors, missing tokens, and network failures gracefully



2. .github/branch-protection-policy.json ⚙️ Configuration changes +30/-0

Canonical branch protection policy definition

• Defines canonical branch protection policy for main branch
• Specifies 24 required status checks including backend, frontend, and analysis tools
• Enforces strict mode, linear history, and single approval requirement
• Serves as source of truth for canonical context validation



3. .github/workflows/strict21-preflight.yml ✨ Enhancement +39/-0

GitHub Actions workflow for preflight checks

• Creates GitHub Actions workflow triggered on PRs and manual dispatch
• Runs preflight script with repository and branch context parameters
• Uploads preflight reports as workflow artifacts for inspection
• Outputs markdown report to workflow logs for visibility





@qodo-code-review

qodo-code-review bot commented Mar 2, 2026

Code Review by Qodo

🐞 Bugs (4) 📘 Rule violations (1) 📎 Requirement gaps (0)



Action required

1. Contexts don't match repo 🐞 Bug ✓ Correctness
Description
The preflight enforces a large hard-coded canonical context list and fails when any entry is missing
from branch protection or emitted checks. In this repo, the documented required check is only
`verify`, and the workflows shown do not define most canonical contexts, so strict21-preflight is
very likely to mark PRs non_compliant and block merges once required.
Code

scripts/strict21_preflight.py[R15-38]

+PERMISSION_HTTP_CODES = {401, 403, 404}
+DEFAULT_CANONICAL_CONTEXTS = [
+    "applitools-core",
+    "pr-agent",
+    "deep-agent",
+    "audit-pr-evidence",
+    "backend",
+    "backend-postgres",
+    "coverage",
+    "CodeQL",
+    "codecov-analytics",
+    "Analyze (actions)",
+    "Analyze (javascript-typescript)",
+    "Analyze (python)",
+    "CodeRabbit",
+    "dependency-review",
+    "compose-smoke",
+    "frontend",
+    "label",
+    "codacy-equivalent-zero",
+    "sonar-branch-zero",
+    "Seer Code Review",
+    "SonarCloud Code Analysis",
+]
Evidence
The script defaults to a long list of canonical contexts and declares non-compliance if any
canonical context is missing from emitted contexts and/or branch protection contexts. The repo
documentation and existing workflows indicate verify is the primary required check context, making
the new canonical list inconsistent with current repo-defined checks.

scripts/strict21_preflight.py[15-38]
scripts/strict21_preflight.py[99-128]
scripts/strict21_preflight.py[295-297]
.github/BRANCH_PROTECTION.md[5-12]
.github/workflows/verify.yml[13-17]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The strict-21 preflight enforces a hard-coded canonical contexts list that appears inconsistent with the repo’s documented/defined CI checks (e.g., `verify`). This likely causes `non_compliant` results and blocks PRs.

## Issue Context
- The script defaults to `DEFAULT_CANONICAL_CONTEXTS` and fails when any context is missing.
- Repo documentation and workflows indicate `verify` is the required check today.

## Fix Focus Areas
- scripts/strict21_preflight.py[15-38]
- scripts/strict21_preflight.py[93-97]
- scripts/strict21_preflight.py[99-128]
- .github/workflows/strict21-preflight.yml[20-32]
- .github/BRANCH_PROTECTION.md[5-12]
- .github/workflows/verify.yml[13-17]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Preflight fails open on 403/404 🐞 Bug ⛨ Security
Description
API permission/availability failures (including HTTP 401/403/404) are classified as
inconclusive_permissions and the script exits 0, so the workflow can succeed without having
validated compliance. This undermines the intended “preflight gate”.
Code

scripts/strict21_preflight.py[R70-72]

+def _classify_http_status(code: int) -> str:
+    return "inconclusive_permissions" if code in PERMISSION_HTTP_CODES else "api_error"
+
Evidence
The script explicitly treats 401/403/404 as a permission/inconclusive condition, and main() only
returns non-zero for non_compliant or api_error—so inconclusive runs pass CI even though compliance
was not checked.

scripts/strict21_preflight.py[15-16]
scripts/strict21_preflight.py[70-72]
scripts/strict21_preflight.py[189-197]
scripts/strict21_preflight.py[295-297]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The preflight gate succeeds (exit 0) when it cannot validate compliance due to missing token or certain HTTP errors (401/403/404). This is a fail-open pattern for a protection gate.

## Issue Context
- HTTP 401/403/404 map to `inconclusive_permissions`.
- `main()` returns failure only for `non_compliant` or `api_error`.

## Fix Focus Areas
- scripts/strict21_preflight.py[15-16]
- scripts/strict21_preflight.py[70-72]
- scripts/strict21_preflight.py[189-197]
- scripts/strict21_preflight.py[295-297]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
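One way to close this fail-open gap can be sketched as follows. This is not the repo's actual remediation: the status strings come from the finding above, while the `allow_inconclusive` opt-out flag is a hypothetical addition.

```python
# Hedged sketch: fail-closed exit handling for a preflight gate.
# Status names mirror those quoted in the finding; allow_inconclusive
# is a hypothetical, explicit opt-out rather than a silent pass.

def exit_code(status: str, allow_inconclusive: bool = False) -> int:
    """Return 0 only when compliance was positively verified."""
    if status == "compliant":
        return 0
    if status == "inconclusive_permissions" and allow_inconclusive:
        # Caller explicitly accepted an unverified result.
        return 0
    # non_compliant, api_error, and unverified permission failures all fail.
    return 1
```

With this shape, a CI run that could not reach the API fails by default, and any exception has to be visible in the workflow invocation.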



Remediation recommended

3. Preflight gate lacks docs/tests 📘 Rule violation ⛯ Reliability
Description
This PR introduces a new CI preflight gate and canonical required-check policy but does not include
corresponding documentation and/or tests describing expected behavior and operation. This increases
the risk of incorrect usage or future regressions without a clear contract.
Code

.github/workflows/strict21-preflight.yml[R20-33]

+      - name: Run strict-21 preflight
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          STAMP: ${{ github.event.pull_request.number || github.run_id }}
+        run: |
+          set -euo pipefail
+          mkdir -p .tmp/strict21-preflight
+          python3 scripts/strict21_preflight.py \
+            --repo "${GITHUB_REPOSITORY}" \
+            --branch main \
+            --ref "${GITHUB_SHA}" \
+            --out-json ".tmp/strict21-preflight/preflight.json" \
+            --out-md ".tmp/strict21-preflight/preflight.md"
+          cat .tmp/strict21-preflight/preflight.md
Evidence
PR Compliance ID 3 requires tests and/or documentation updates when behavior changes are introduced.
The added workflow and script implement a new preflight gate and enforcement behavior, but the PR
diff contains only the policy/workflow/script changes and no accompanying test or documentation
changes.

AGENTS.md
.github/workflows/strict21-preflight.yml[20-33]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A new strict-21 CI preflight gate and canonical required-check policy were added, but the PR does not include corresponding documentation and/or tests describing how the gate works, what it validates, and how to troubleshoot failures.

## Issue Context
The workflow runs `scripts/strict21_preflight.py` and can fail PRs (`exit 1`) when non-compliant or on API errors. Without docs/tests, maintainers may not know how to interpret outputs (JSON/MD artifact) or configure canonical contexts.

## Fix Focus Areas
- .github/workflows/strict21-preflight.yml[20-39]
- scripts/strict21_preflight.py[52-67]
- scripts/strict21_preflight.py[273-297]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Policy file not source-of-truth 🐞 Bug ⛯ Reliability
Description
The PR adds .github/branch-protection-policy.json, but the workflow/script never read it; instead
enforcement relies on a duplicated hard-coded list in strict21_preflight.py. This invites
configuration drift and makes the committed policy file informational rather than enforced.
Code

.github/workflows/strict21-preflight.yml[R20-32]

+      - name: Run strict-21 preflight
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          STAMP: ${{ github.event.pull_request.number || github.run_id }}
+        run: |
+          set -euo pipefail
+          mkdir -p .tmp/strict21-preflight
+          python3 scripts/strict21_preflight.py \
+            --repo "${GITHUB_REPOSITORY}" \
+            --branch main \
+            --ref "${GITHUB_SHA}" \
+            --out-json ".tmp/strict21-preflight/preflight.json" \
+            --out-md ".tmp/strict21-preflight/preflight.md"
Evidence
The committed policy JSON enumerates required status checks, but the preflight workflow invokes the
script without passing canonical contexts and the script defaults to its internal constant list when
no override is provided.

.github/branch-protection-policy.json[1-26]
.github/workflows/strict21-preflight.yml[20-32]
scripts/strict21_preflight.py[93-96]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A canonical policy file is added, but enforcement does not consume it; the canonical list is duplicated/hard-coded. This creates drift risk.

## Issue Context
- Workflow runs strict21_preflight.py without specifying canonical contexts.
- Script uses DEFAULT_CANONICAL_CONTEXTS when the argument is empty.

## Fix Focus Areas
- .github/workflows/strict21-preflight.yml[20-32]
- scripts/strict21_preflight.py[52-67]
- scripts/strict21_preflight.py[93-97]
- .github/branch-protection-policy.json[1-26]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
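A minimal sketch of treating the committed policy file as the source of truth: parse the required contexts out of the JSON instead of duplicating them in a constant. The `required_status_checks` / `contexts` field layout is an assumption modeled on GitHub's branch-protection payload, not the actual contents of the committed file:

```python
import json
from pathlib import Path

# Hedged sketch: read canonical contexts from the committed policy file
# so the script and the policy cannot drift apart. Field names assumed.

def load_canonical_contexts(policy_path: str) -> list[str]:
    policy = json.loads(Path(policy_path).read_text(encoding="utf-8"))
    contexts = (policy.get("required_status_checks") or {}).get("contexts") or []
    # Deduplicate and sort for stable comparison against branch protection
    # and emitted check contexts.
    return sorted(set(contexts))
```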


5. Check-runs limited to 100 🐞 Bug ⛯ Reliability
Description
Emitted context inventory only fetches the first page of check-runs (per_page=100) and does not
paginate, which can miss contexts and incorrectly flag non_compliance on commits with many check
runs.
Code

scripts/strict21_preflight.py[R238-239]

+        check_runs = _api_get(args.api_base, args.repo, f"commits/{ref_sha}/check-runs?per_page=100", token)
+        status_payload = _api_get(args.api_base, args.repo, f"commits/{ref_sha}/status", token)
Evidence
The script requests check-runs with per_page=100 but never follows additional pages, so
emitted_contexts can be incomplete relative to what GitHub has recorded.

scripts/strict21_preflight.py[238-239]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The emitted check contexts list can be incomplete because the script only fetches the first 100 check runs for a commit.

## Issue Context
- `_run_preflight()` calls the check-runs endpoint with `per_page=100` but doesn’t iterate `page`.

## Fix Focus Areas
- scripts/strict21_preflight.py[226-249]
- scripts/strict21_preflight.py[131-135]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
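A common fix can be sketched as below, under assumptions: `fetch_page` stands in for the script's `_api_get` (not reproduced from the repo), and the loop keeps requesting pages until a short page signals the end of the results:

```python
# Hedged sketch: paginate the check-runs endpoint instead of reading
# only the first 100 results. fetch_page(page=..., per_page=...) is a
# stand-in for the script's GitHub API helper.

def collect_check_run_names(fetch_page, per_page: int = 100) -> list[str]:
    names: list[str] = []
    page = 1
    while True:
        payload = fetch_page(page=page, per_page=per_page)
        runs = payload.get("check_runs") or []
        names.extend(str(run.get("name", "")) for run in runs)
        if len(runs) < per_page:
            # A short page means there are no further pages.
            break
        page += 1
    return sorted(set(names))
```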



ⓘ The new review experience is currently in Beta. Learn more



@sonarqubecloud

sonarqubecloud bot commented Mar 2, 2026

@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

Comment on lines +79 to +88
req = Request(
    url,
    headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "X-GitHub-Api-Version": "2022-11-28",
        "User-Agent": "reframe-strict21-preflight",
    },
    method="GET",
)

Check failure

Code scanning / CodeQL

Partial server-side request forgery Critical

Part of the URL of this request depends on a user-provided value.

Copilot Autofix

AI about 12 hours ago

In general, to fix partial SSRF when part of a URL is user-controlled, you either (a) restrict the user to selecting from a set of known-good base URLs, or (b) strictly validate the supplied URL to ensure it targets only allowed hosts and protocols and does not point to internal or loopback addresses.

For this script, the best fix with minimal functional impact is to validate args.api_base before it is used in _api_get. We can introduce a small helper that parses the URL and enforces that the host belongs to a short whitelist of allowed GitHub API hosts (for example, api.github.com, plus explicit GitHub Enterprise hosts if you want to support them). Because we must not assume more about the environment than we see, a conservative approach is to:

  1. Parse api_base with urlparse.
  2. Ensure the scheme is https (and optionally allow http only if you truly need it; current code allows both, but tightening to https is safer).
  3. Ensure the hostname is one of a small allowed set, e.g., api.github.com.

This helper will be called in main() to normalize/validate args.api_base before passing it to _run_preflight, so that _api_get only ever receives a vetted base URL. This preserves existing behavior for normal usage (default https://api.github.com) while preventing redirection to attacker-chosen hosts.

Concretely:

  • Add a new function, for example _normalize_api_base(raw: str) -> str, near _api_get, that applies these checks and returns a clean base URL string.
  • In main(), after parsing args and before calling _run_preflight, replace uses of args.api_base with a validated api_base variable produced by _normalize_api_base.
  • Adjust _run_preflight to accept and use this validated api_base instead of the raw, unvalidated args.api_base.

No new imports are required; urlparse is already imported and used.


Suggested changeset 1
scripts/strict21_preflight.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scripts/strict21_preflight.py b/scripts/strict21_preflight.py
--- a/scripts/strict21_preflight.py
+++ b/scripts/strict21_preflight.py
@@ -71,6 +71,26 @@
     return "inconclusive_permissions" if code in PERMISSION_HTTP_CODES else "api_error"
 
 
+def _normalize_api_base(raw: str) -> str:
+    """
+    Validate and normalize the GitHub API base URL.
+
+    This ensures that the base URL uses HTTPS and targets an expected GitHub API host,
+    preventing arbitrary redirection to attacker-controlled endpoints.
+    """
+    parsed = urlparse(raw)
+    if parsed.scheme not in {"https"}:
+        raise ValueError(f"Unsupported API scheme in {raw!r}; only 'https' is allowed")
+    hostname = parsed.hostname or ""
+    # Restrict to the public GitHub API host. Extend this list explicitly if needed.
+    allowed_hosts = {"api.github.com"}
+    if hostname not in allowed_hosts:
+        raise ValueError(f"Unsupported API host in {raw!r}")
+    # Reconstruct a normalized base without path/query/fragment
+    netloc = parsed.netloc
+    return f"{parsed.scheme}://{netloc}"
+
+
 def _api_get(api_base: str, repo: str, path: str, token: str) -> dict[str, Any]:
     url = f"{api_base.rstrip('/')}/repos/{repo}/{path.lstrip('/')}"
     parsed = urlparse(url)
@@ -228,15 +248,16 @@
     args: argparse.Namespace,
     canonical: list[str],
     token: str,
+    api_base: str,
 ) -> tuple[PreflightResult, list[str], list[str]]:
     try:
-        protection = _api_get(args.api_base, args.repo, f"branches/{args.branch}/protection", token)
-        ref_payload = _api_get(args.api_base, args.repo, f"commits/{args.ref}", token)
+        protection = _api_get(api_base, args.repo, f"branches/{args.branch}/protection", token)
+        ref_payload = _api_get(api_base, args.repo, f"commits/{args.ref}", token)
         ref_sha = str(ref_payload.get("sha") or "").strip() or None
         if ref_sha is None:
             raise RuntimeError(f"Unable to resolve SHA for ref {args.ref!r}")
-        check_runs = _api_get(args.api_base, args.repo, f"commits/{ref_sha}/check-runs?per_page=100", token)
-        status_payload = _api_get(args.api_base, args.repo, f"commits/{ref_sha}/status", token)
+        check_runs = _api_get(api_base, args.repo, f"commits/{ref_sha}/check-runs?per_page=100", token)
+        status_payload = _api_get(api_base, args.repo, f"commits/{ref_sha}/status", token)
 
         branch_required_checks = sorted((protection.get("required_status_checks") or {}).get("contexts") or [])
         emitted_contexts = _collect_emitted_contexts(check_runs, status_payload)
@@ -264,10 +280,12 @@
     if not token:
         result, branch_required_checks, emitted_contexts = _missing_token_result()
     else:
+        api_base = _normalize_api_base(args.api_base)
         result, branch_required_checks, emitted_contexts = _run_preflight(
             args=args,
             canonical=canonical,
             token=token,
+            api_base=api_base,
         )
 
     payload = {
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
"http_error": result.http_error,
}

out_json.parent.mkdir(parents=True, exist_ok=True)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a user-provided value.
    }

    out_json.parent.mkdir(parents=True, exist_ok=True)
    out_md.parent.mkdir(parents=True, exist_ok=True)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a user-provided value.

Copilot Autofix

AI about 12 hours ago

In general, to fix uncontrolled path usage you normalize the user-provided path and ensure it stays within a trusted base directory, or otherwise constrain it via an allow‑list or safe filename transformation. Here, we only need to write result files; we can safely constrain both out_json and out_md to reside under a base output directory (e.g. the current working directory) by resolving them and verifying they don’t escape that base.

The best fix with minimal behavior change is:

  • Define a small helper, e.g. _safe_subpath(base: Path, target: Path) -> Path, that:
    • Resolves base to an absolute path (e.g. base.resolve()).
    • Resolves target as if under that base (e.g. (base / target).resolve()).
    • Checks that the resolved target is within the base using Path.is_relative_to (Py3.9+) or a relative_to try/except fallback.
    • Raises a ValueError if the target escapes the base.
  • In main(), instead of using Path(args.out_json) and Path(args.out_md) directly, compute them via this helper with a base of Path.cwd() (or another appropriate base, but we must not assume more than what we see); e.g.:
    • base_output = Path.cwd().resolve()
    • out_json = _safe_subpath(base_output, Path(args.out_json))
    • out_md = _safe_subpath(base_output, Path(args.out_md))
  • Leave the later calls (out_json.parent.mkdir(...), write_text(...)) unchanged; they now operate on validated paths.

This keeps functionality (the user can still specify nested paths under the working directory) while preventing directory traversal or writing outside the intended workspace.

Suggested changeset 1
scripts/strict21_preflight.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scripts/strict21_preflight.py b/scripts/strict21_preflight.py
--- a/scripts/strict21_preflight.py
+++ b/scripts/strict21_preflight.py
@@ -67,6 +67,26 @@
     return parser.parse_args()
 
 
+def _safe_subpath(base: Path, target: Path) -> Path:
+    """
+    Return a resolved path for ``target`` under ``base``, ensuring it does not escape ``base``.
+    """
+    base_resolved = base.resolve()
+    candidate = (base_resolved / target).resolve()
+    try:
+        # Python 3.9+ provides is_relative_to; fall back to relative_to for older versions.
+        is_relative = candidate.is_relative_to(base_resolved)  # type: ignore[attr-defined]
+    except AttributeError:
+        try:
+            candidate.relative_to(base_resolved)
+            is_relative = True
+        except ValueError:
+            is_relative = False
+    if not is_relative:
+        raise ValueError(f"Output path {candidate} escapes base directory {base_resolved}")
+    return candidate
+
+
 def _classify_http_status(code: int) -> str:
     return "inconclusive_permissions" if code in PERMISSION_HTTP_CODES else "api_error"
 
@@ -255,8 +275,9 @@
 
 def main() -> int:
     args = _parse_args()
-    out_json = Path(args.out_json)
-    out_md = Path(args.out_md)
+    base_output = Path.cwd()
+    out_json = _safe_subpath(base_output, Path(args.out_json))
+    out_md = _safe_subpath(base_output, Path(args.out_md))
     canonical = _canonical_contexts(args.canonical_contexts)
 
     token = (os.environ.get("GITHUB_TOKEN") or "").strip() or (os.environ.get("GH_TOKEN") or "").strip()
EOF

    out_json.parent.mkdir(parents=True, exist_ok=True)
    out_md.parent.mkdir(parents=True, exist_ok=True)
    out_json.write_text(json.dumps(payload, indent=2, sort_keys=True) + "\n", encoding="utf-8")

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a user-provided value.

Copilot Autofix

AI about 12 hours ago

In general, to fix uncontrolled path usage for CLI-provided output paths, we should normalize the path and then either (a) restrict it to lie within a safe root directory, or (b) otherwise sanitize it so that dangerous constructs like .. or absolute paths are rejected. For a reporting/preflight script, a common pattern is to write all outputs under a known working directory (for example, the current directory or a configurable base directory) and ensure user-specified paths cannot escape that root.

For this specific script, the minimal, functionality-preserving fix is to introduce a safe output root (e.g., the current working directory) and validate that --out-json and --out-md resolve to paths within that root after normalization. We can use Path.resolve() to normalize and obtain absolute paths, and then check that each resolved path has the root as one of its parents (or is equal to the root). If a user attempts to supply a path outside the root (/etc/passwd, ../../secret), we raise a clear error instead of writing. This keeps the intended ability to choose output filenames and subdirectories but defends against directory traversal and writes to unexpected locations.

Concretely, in main() in scripts/strict21_preflight.py, around lines 257–260 and before we use out_json and out_md, we will:

  1. Define a root directory, such as output_root = Path.cwd().resolve().
  2. Convert the arguments to paths and resolve them: out_json = Path(args.out_json).resolve(), out_md = Path(args.out_md).resolve().
  3. Verify output_root is a parent of both resolved paths (or equal). If not, raise ValueError with an explanatory message.
  4. Use the validated out_json and out_md as before (mkdir, write).

This change only affects the handling of the two output arguments and does not modify imports or other functionality.

Suggested changeset 1
scripts/strict21_preflight.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scripts/strict21_preflight.py b/scripts/strict21_preflight.py
--- a/scripts/strict21_preflight.py
+++ b/scripts/strict21_preflight.py
@@ -255,8 +255,13 @@
 
 def main() -> int:
     args = _parse_args()
-    out_json = Path(args.out_json)
-    out_md = Path(args.out_md)
+    output_root = Path.cwd().resolve()
+    out_json = Path(args.out_json).resolve()
+    out_md = Path(args.out_md).resolve()
+    if output_root not in (out_json, *out_json.parents):
+        raise ValueError(f"Output JSON path must be under {output_root}, got {out_json}")
+    if output_root not in (out_md, *out_md.parents):
+        raise ValueError(f"Output markdown path must be under {output_root}, got {out_md}")
     canonical = _canonical_contexts(args.canonical_contexts)
 
     token = (os.environ.get("GITHUB_TOKEN") or "").strip() or (os.environ.get("GH_TOKEN") or "").strip()
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
out_json.parent.mkdir(parents=True, exist_ok=True)
out_md.parent.mkdir(parents=True, exist_ok=True)
out_json.write_text(json.dumps(payload, indent=2, sort_keys=True) + "\n", encoding="utf-8")
out_md.write_text(_render_markdown(payload), encoding="utf-8")

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a user-provided value.

Copilot Autofix

AI about 12 hours ago

In general, to fix uncontrolled path usage for files derived from user input, you should constrain where files can be written and/or normalize and validate the path before use. A common pattern is to define a fixed “output root” directory, resolve user-provided paths relative to that root, normalize via Path.resolve() or os.path.realpath(), and then ensure the final path remains within the root. If it does not, fail with an error instead of writing the file. This prevents ../ traversal and disallows redirecting output to unintended locations.

For this script, the minimal fix that largely preserves observable behavior is:

  • Introduce an output root directory (e.g., current working directory by default).
  • Resolve out_json and out_md relative to that root.
  • Normalize them with Path.resolve(strict=False) (or resolve()), and then check that their resolved paths are within the root directory using Path.is_relative_to() if available, or an equivalent check with relative_to() in a try/except.
  • If the check fails, raise a clear exception instead of writing.

Since we must only modify the shown code in scripts/strict21_preflight.py, we can:

  1. Compute a base_dir = Path.cwd().resolve() in main().
  2. Replace the simple Path(args.out_json) / Path(args.out_md) with validated helper calls like _safe_output_path(base_dir, args.out_json) and _safe_output_path(base_dir, args.out_md).
  3. Add a small helper function _safe_output_path(base_dir: Path, user_path: str) -> Path above main() that:
    • Creates a Path(user_path).
    • If it is absolute, uses it as-is but still validates it against base_dir (or, alternatively, disallows absolute paths).
    • Joins with base_dir if it is relative.
    • Calls .resolve() and checks that the resolved path is under base_dir. If not, raises ValueError.

This retains the ability to specify file names relative to the current directory and prevents directory traversal outside the output root. No new imports are needed; Path is already imported.
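The two containment idioms mentioned above (relative_to in a try/except, or is_relative_to on Python 3.9+) are interchangeable. A minimal side-by-side, with hypothetical paths:

```python
from pathlib import Path

base = Path("/home/runner/work")

def contained_portable(p: Path) -> bool:
    # Works on any Python 3: relative_to raises ValueError when p escapes base.
    try:
        p.relative_to(base)
        return True
    except ValueError:
        return False

def contained_modern(p: Path) -> bool:
    # Python 3.9+: boolean containment test, no exception handling needed.
    return p.is_relative_to(base)

inside = base / "artifacts" / "preflight.json"
outside = Path("/etc/passwd")
for check in (contained_portable, contained_modern):
    assert check(inside) and not check(outside)
```

Both checks compare path components textually, which is why the autofix resolves the candidate before testing containment.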


Suggested changeset 1
scripts/strict21_preflight.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scripts/strict21_preflight.py b/scripts/strict21_preflight.py
--- a/scripts/strict21_preflight.py
+++ b/scripts/strict21_preflight.py
@@ -90,6 +90,26 @@
         return json.loads(resp.read().decode("utf-8"))
 
 
+def _safe_output_path(base_dir: Path, user_path: str) -> Path:
+    """
+    Construct a normalized output path under ``base_dir`` from a user-provided
+    path. Raises ValueError if the resulting path would escape ``base_dir``.
+    """
+    raw_path = Path(user_path)
+
+    if not raw_path.is_absolute():
+        candidate = (base_dir / raw_path).resolve()
+    else:
+        candidate = raw_path.resolve()
+
+    try:
+        candidate.relative_to(base_dir)
+    except ValueError:
+        raise ValueError(f"Output path {candidate!r} is outside allowed directory {base_dir!r}")
+
+    return candidate
+
+
 def _canonical_contexts(raw: str) -> list[str]:
     if not raw.strip():
         return list(DEFAULT_CANONICAL_CONTEXTS)
@@ -255,8 +275,9 @@
 
 def main() -> int:
     args = _parse_args()
-    out_json = Path(args.out_json)
-    out_md = Path(args.out_md)
+    base_dir = Path.cwd().resolve()
+    out_json = _safe_output_path(base_dir, args.out_json)
+    out_md = _safe_output_path(base_dir, args.out_md)
     canonical = _canonical_contexts(args.canonical_contexts)
 
     token = (os.environ.get("GITHUB_TOKEN") or "").strip() or (os.environ.get("GH_TOKEN") or "").strip()
EOF
Comment on lines +15 to +38
PERMISSION_HTTP_CODES = {401, 403, 404}
DEFAULT_CANONICAL_CONTEXTS = [
"applitools-core",
"pr-agent",
"deep-agent",
"audit-pr-evidence",
"backend",
"backend-postgres",
"coverage",
"CodeQL",
"codecov-analytics",
"Analyze (actions)",
"Analyze (javascript-typescript)",
"Analyze (python)",
"CodeRabbit",
"dependency-review",
"compose-smoke",
"frontend",
"label",
"codacy-equivalent-zero",
"sonar-branch-zero",
"Seer Code Review",
"SonarCloud Code Analysis",
]

Action required

1. Contexts don't match repo 🐞 Bug ✓ Correctness

The preflight enforces a large hard-coded canonical context list and fails when any are missing from branch protection or emitted checks. In this repo, the documented required check is only verify, and the workflows shown do not define most canonical contexts, so strict21-preflight is very likely to mark PRs non_compliant and block merges once required.
Agent Prompt
## Issue description
The strict-21 preflight enforces a hard-coded canonical contexts list that appears inconsistent with the repo’s documented/defined CI checks (e.g., `verify`). This likely causes `non_compliant` results and blocks PRs.

## Issue Context
- The script defaults to `DEFAULT_CANONICAL_CONTEXTS` and fails when any context is missing.
- Repo documentation and workflows indicate `verify` is the required check today.

## Fix Focus Areas
- scripts/strict21_preflight.py[15-38]
- scripts/strict21_preflight.py[93-97]
- scripts/strict21_preflight.py[99-128]
- .github/workflows/strict21-preflight.yml[20-32]
- .github/BRANCH_PROTECTION.md[5-12]
- .github/workflows/verify.yml[13-17]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
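One remediation direction the prompt points at is deriving the list from the committed policy file rather than hard-coding it. A sketch, assuming .github/branch-protection-policy.json exposes a contexts array (its actual schema is not shown in this diff) and that verify is the documented fallback:

```python
import json
from pathlib import Path

FALLBACK_CONTEXTS = ["verify"]  # the required check the repo docs describe today

def load_canonical_contexts(policy_path: Path) -> list[str]:
    # Prefer the committed policy file so the preflight and branch protection
    # share one source of truth; fall back to the documented check otherwise.
    try:
        policy = json.loads(policy_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return list(FALLBACK_CONTEXTS)
    contexts = policy.get("contexts") if isinstance(policy, dict) else None
    if isinstance(contexts, list) and all(isinstance(c, str) for c in contexts):
        return contexts
    return list(FALLBACK_CONTEXTS)
```

With this shape, DEFAULT_CANONICAL_CONTEXTS would only ever be a last-resort fallback instead of the enforced source of truth.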

Comment on lines +70 to +72
def _classify_http_status(code: int) -> str:
return "inconclusive_permissions" if code in PERMISSION_HTTP_CODES else "api_error"

Action required

2. Preflight fails open on 403/404 🐞 Bug ⛨ Security

API permission/availability failures (including HTTP 401/403/404) are classified as inconclusive_permissions and the script exits 0, so the workflow can succeed without having validated compliance. This undermines the intended "preflight gate".
Agent Prompt
## Issue description
The preflight gate succeeds (exit 0) when it cannot validate compliance due to missing token or certain HTTP errors (401/403/404). This is a fail-open pattern for a protection gate.

## Issue Context
- HTTP 401/403/404 map to `inconclusive_permissions`.
- `main()` returns failure only for `non_compliant` or `api_error`.

## Fix Focus Areas
- scripts/strict21_preflight.py[15-16]
- scripts/strict21_preflight.py[70-72]
- scripts/strict21_preflight.py[189-197]
- scripts/strict21_preflight.py[295-297]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
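A fail-closed variant of the classifier can be sketched under the assumption that main() keeps treating api_error as a blocking verdict; the fail_closed flag here is hypothetical, not part of the script:

```python
PERMISSION_HTTP_CODES = {401, 403, 404}

def classify_http_status(code: int, fail_closed: bool = True) -> str:
    # Called for error responses from the GitHub API. With fail_closed=True,
    # permission/availability problems become blocking api_error verdicts
    # instead of the soft inconclusive_permissions outcome, so the gate
    # cannot pass without having actually validated compliance.
    if code in PERMISSION_HTTP_CODES and not fail_closed:
        return "inconclusive_permissions"
    return "api_error"

assert classify_http_status(403) == "api_error"  # gate blocks by default
assert classify_http_status(403, fail_closed=False) == "inconclusive_permissions"
assert classify_http_status(500) == "api_error"
```

An explicit opt-out flag keeps the soft behavior available for local dry runs while the CI gate stays strict.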
