From 2f562ff45f8bff9f203d890b4fd944203fd6e6e8 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 4 Mar 2026 05:39:18 +0000 Subject: [PATCH 1/3] docs: Add continuous security testing guide with gap analysis MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Maps Argus's current capabilities against the emerging continuous autonomous pentesting model (diff-aware scanning, AutoFix→PR→Retest loops, persistent knowledge base, agent-driven attack chaining, deployment-triggered scanning, code-to-runtime context). Includes concrete implementation paths and priority ordering for each gap. https://claude.ai/code/session_017NQsm2eBxfioLrad1C7keZ --- docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md | 912 ++++++++++++++++++++++ 1 file changed, 912 insertions(+) create mode 100644 docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md diff --git a/docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md b/docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md new file mode 100644 index 0000000..77f4228 --- /dev/null +++ b/docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md @@ -0,0 +1,912 @@ +# Continuous Security Testing: Closing the Ship-to-Secure Gap + +> From periodic pentests to autonomous, every-deploy security validation. + +## The Problem + +The industry gap is structural: engineering teams push thousands of lines daily, but security validation happens periodically — maybe twice a year for external pentests, or continuously-but-incompletely for in-house teams making hard triage choices about what to cover. Every change that ships untested is a version of the application that was never fully validated. + +Argus already addresses much of this with its 6-phase pipeline, AI enrichment, and proof-by-exploitation. But there are specific capabilities that would close the remaining gaps between "scan on schedule" and "secure every release." + +This guide maps what Argus has today, what's missing, and concrete implementation paths for each gap. + +--- + +## Capability Matrix: Argus Today vs. 
Continuous Autonomous Testing + +| Capability | Industry Goal | Argus Today | Gap | +|---|---|---|---| +| **Diff-aware scoping** | Only test what changed; skip README edits, focus on auth logic | `FileSelector.get_changed_files()` in fast mode; Phase 1 scanners scan full project | Scanners not diff-scoped | +| **Per-deploy trigger** | Every release triggers security validation | CI workflows on push/PR + weekly cron | No deployment-event integration | +| **Multi-step attack reasoning** | Agents chain cross-component vulnerabilities | `VulnerabilityChainer` with 14 rule-based chain patterns | Rule-based, not agent-driven | +| **Live exploitation** | Validate against running deployment | `SandboxValidator` + `ProofByExploitation` in Docker | Sandbox only, not against live targets | +| **AutoFix → PR** | Generate merge-ready PRs with code fixes | `RemediationEngine` generates diffs/text, no PR creation | No automated PR creation for fixes | +| **Retest after fix** | Automatically verify fix holds | `RegressionTester` as separate CI step | Not a closed loop within a single run | +| **Persistent knowledge base** | Each scan enriches cross-run intelligence | Flat-file JSONL feedback + per-scan JSON outputs | No cross-scan dedup, trending, or historical context | +| **Code-to-runtime context** | Source + API specs + cloud config + architecture | SAST + DAST exist separately; `sast_dast_correlator.py` bridges | No unified context model | + +--- + +## 1. Diff-Intelligent Scanner Scoping + +### What exists + +`scripts/orchestrator/file_selector.py` has `get_changed_files()` using `git diff --name-only HEAD^ HEAD`. When `only_changed=True`, the AI file-selection layer filters to changed files and boosts their priority by +200 points. + +### The gap + +Phase 1 scanners (Semgrep, Trivy, Checkov, TruffleHog) always scan the full project path. For a 500-file repo where 3 files changed, all 4 scanners still analyze everything. 
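To make the gap concrete, the per-scanner scoping decision can be sketched as one pure function: emit `--include` arguments for tools that accept per-file patterns (Semgrep does), and fall back to directory-level scope otherwise. `scope_scanner_targets` is a hypothetical helper, not an existing Argus function, and the directory fallback for tools without include support is an assumption:

```python
import os

def scope_scanner_targets(changed_files: list[str], supports_include: bool) -> dict:
    """Build scoped invocation args for one scanner from a changed-file list.

    changed_files: paths from `git diff --name-only`
    supports_include: whether the tool accepts per-file --include patterns
    """
    if not changed_files:
        # No diff information available: scan everything, as today
        return {"full_scan": True, "args": []}
    if supports_include:
        args = []
        for f in changed_files:
            args.extend(["--include", f])
        return {"full_scan": False, "args": args}
    # Fallback: scope to the directories that contain changes
    dirs = sorted({os.path.dirname(f) or "." for f in changed_files})
    return {"full_scan": False, "args": dirs}
```

For the 500-file repo above, this turns four full-project scans into four invocations that each touch only the changed paths.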
+ +### What to build + +**a) Semgrep diff scoping** + +Semgrep natively supports `--include` patterns. Pass changed file paths: + +```python +# In scanner_runners.py, SemgrepRunner.run() +if self.only_changed and self.changed_files: + for f in self.changed_files: + cmd.extend(["--include", f]) +``` + +**b) Impact radius expansion** + +A single-line change to an auth middleware affects every protected route. Diff-scoping shouldn't be file-literal — it should expand to the blast radius: + +```python +class DiffImpactAnalyzer: + """Expand changed files to their security-relevant impact radius.""" + + def expand_impact(self, changed_files: list[str], project_path: str) -> list[str]: + """Given changed files, return the full set of files in the blast radius. + + - If a middleware/decorator changed, include all files that import it + - If a model changed, include all routes that use that model + - If an auth module changed, include all protected endpoints + """ + expanded = set(changed_files) + for f in changed_files: + if self._is_security_critical(f): + importers = self._find_importers(f, project_path) + expanded.update(importers) + return list(expanded) + + def _is_security_critical(self, filepath: str) -> bool: + """Check if file is auth, middleware, permissions, crypto, etc.""" + security_indicators = [ + 'auth', 'permission', 'middleware', 'security', + 'crypto', 'session', 'token', 'oauth', 'rbac', + 'acl', 'policy', 'guard', 'interceptor' + ] + name = os.path.basename(filepath).lower() + return any(ind in name for ind in security_indicators) +``` + +**c) Smart skip logic** + +The Aikido article highlights: "Updated a README and button color? Skipped." 
Argus should classify diffs by security relevance before deciding whether to scan at all: + +```python +class DiffClassifier: + """Classify a diff as security-relevant or skip-safe.""" + + SKIP_PATTERNS = [ + r'\.md$', r'\.txt$', r'\.css$', r'\.scss$', + r'\.svg$', r'\.png$', r'\.jpg$', + r'CHANGELOG', r'LICENSE', r'\.gitignore', + ] + + ALWAYS_SCAN_PATTERNS = [ + r'auth', r'login', r'session', r'token', r'password', + r'secret', r'key', r'crypt', r'permission', r'rbac', + r'middleware', r'guard', r'policy', r'\.env', + r'docker', r'Dockerfile', r'\.tf$', r'\.yml$', r'\.yaml$', + ] + + def classify(self, changed_files: list[str]) -> dict: + security_relevant = [] + skippable = [] + for f in changed_files: + if any(re.search(p, f, re.I) for p in self.ALWAYS_SCAN_PATTERNS): + security_relevant.append(f) + elif any(re.search(p, f, re.I) for p in self.SKIP_PATTERNS): + skippable.append(f) + else: + security_relevant.append(f) # Default: scan + return { + 'security_relevant': security_relevant, + 'skippable': skippable, + 'should_scan': len(security_relevant) > 0 + } +``` + +### Where it plugs in + +- `scripts/hybrid_analyzer.py` Phase 1 entry, before scanner invocation +- `scripts/scanner_runners.py` in each scanner's `run()` method +- New module: `scripts/diff_impact_analyzer.py` +- Config toggle: `enable_diff_scoping=True`, `diff_expand_impact_radius=True` + +--- + +## 2. Deployment-Triggered Scanning + +### What exists + +CI workflows trigger on push, PR, and weekly cron. The `action.yml` GitHub Action is the primary integration point. + +### The gap + +No integration with deployment events. Security validation happens pre-merge, not post-deploy. The running application is never validated against what actually shipped. 
+ +### What to build + +**a) GitHub Deployment event webhook** + +```yaml +# .github/workflows/post-deploy-scan.yml +name: Post-Deploy Security Validation +on: + deployment_status: + types: [success] + +jobs: + scan: + if: github.event.deployment_status.state == 'success' + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Get deployment diff + run: | + PREV_SHA=$(git log --format='%H' -2 | tail -1) + echo "DIFF_BASE=$PREV_SHA" >> $GITHUB_ENV + + - uses: devatsecure/Argus-Security@v1 + with: + anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }} + review-type: security + only-changed: true + fail-on-blockers: true + + # If DAST target is available, also run live validation + - name: DAST against deployment + if: vars.DEPLOYMENT_URL != '' + run: | + python scripts/dast_orchestrator.py \ + --target-url "${{ vars.DEPLOYMENT_URL }}" \ + --auth-config .argus/dast-auth.yml \ + --diff-only +``` + +**b) Container registry webhook** + +For Docker-based deployments, trigger on image push: + +```yaml +on: + registry_package: + types: [published] +``` + +**c) ArgoCD / Flux / Spinnaker integration** + +Expose a webhook endpoint (or use the MCP server) that deployment tools call post-deploy: + +```python +# In scripts/mcp_server.py, add tool: +@mcp_server.tool("trigger_post_deploy_scan") +async def trigger_post_deploy_scan( + deployment_url: str, + commit_sha: str, + previous_sha: str, + environment: str = "staging", +) -> dict: + """Trigger a diff-scoped security scan after deployment.""" + changed_files = get_diff_files(previous_sha, commit_sha) + results = await run_pipeline( + target_path=repo_path, + changed_files=changed_files, + dast_target=deployment_url, + scan_mode="post-deploy", + ) + return results +``` + +### Where it plugs in + +- New workflow: `.github/workflows/post-deploy-scan.yml` +- `scripts/mcp_server.py` new tool registration +- Config: `scan_trigger=["push", "pr", "deploy", "schedule"]` + +--- + +## 3. 
Agent-Driven Attack Path Reasoning + +### What exists + +`scripts/vulnerability_chaining_engine.py` implements `VulnerabilityChainer` with 14 pre-defined rule-based chaining patterns (IDOR → Privilege Escalation → Data Breach, XSS → Session Hijacking → Account Takeover, etc.). Uses NetworkX for graph traversal with probability-weighted edges. + +Phase 3's `agent_personas.py` runs 5 specialized AI personas for multi-agent review, but these are independent reviewers — they don't collaborate on attack path discovery. + +### The gap + +The chaining is rule-based: if finding A is category X and finding B is category Y, draw an edge. It doesn't reason about application-specific context — whether a particular XSS is actually in a position to steal a session token, or whether a specific SSRF can reach an internal metadata endpoint. + +The Aikido article describes agents that "reason about application behavior, chain multi-step attack paths, and validate exploitability through real exploitation." That's agent-driven reasoning, not pattern matching. + +### What to build + +**a) LLM-powered chain discovery** + +Use the existing LLM infrastructure to ask the model to reason about cross-finding relationships: + +```python +class AgentChainDiscovery: + """Use LLM agents to discover attack chains that rule-based logic misses.""" + + CHAIN_DISCOVERY_PROMPT = """You are a senior penetration tester analyzing a set of +security findings from the same application. Your job is to identify multi-step +attack paths that chain these findings together. + +For each chain you discover, explain: +1. The entry point (which finding starts the chain) +2. Each step and what it enables +3. The final impact (what an attacker achieves) +4. Why these specific findings combine dangerously +5. 
Whether the chain requires authentication or can be triggered anonymously + +Findings: +{findings_json} + +Application context: +- Framework: {framework} +- Auth mechanism: {auth_type} +- Architecture: {architecture} + +Return your analysis as JSON array of chain objects.""" + + async def discover_chains( + self, + findings: list[dict], + app_context: dict, + llm_client, + ) -> list[dict]: + prompt = self.CHAIN_DISCOVERY_PROMPT.format( + findings_json=json.dumps(findings, indent=2), + framework=app_context.get('framework', 'unknown'), + auth_type=app_context.get('auth_type', 'unknown'), + architecture=app_context.get('architecture', 'unknown'), + ) + response = await llm_client.analyze(prompt) + return self._parse_chains(response) +``` + +**b) Cross-component reasoning** + +The key insight from the article: "Two changes that are individually safe can be dangerous in combination: a new API field here, a relaxed permission check there, and suddenly there's a cross-tenant data leak." + +This requires understanding component boundaries. 
Extend the chaining engine to consider data flow between components: + +```python +class CrossComponentAnalyzer: + """Analyze how findings in different components interact.""" + + def analyze_cross_component_risk( + self, + findings: list[dict], + dependency_graph: dict, # file → [files it imports] + ) -> list[dict]: + """Find findings that are individually low-risk but + dangerous in combination across component boundaries.""" + + # Group findings by component + by_component = defaultdict(list) + for f in findings: + component = self._classify_component(f['file_path']) + by_component[component].append(f) + + # For each pair of connected components, + # check if their findings combine dangerously + dangerous_combos = [] + for comp_a, comp_b in self._connected_pairs(dependency_graph): + findings_a = by_component.get(comp_a, []) + findings_b = by_component.get(comp_b, []) + if findings_a and findings_b: + combos = self._evaluate_combinations(findings_a, findings_b) + dangerous_combos.extend(combos) + + return dangerous_combos +``` + +**c) Collaborative agent reasoning** + +Currently Phase 3 agents review independently. Add a "red team council" step where agents share findings and collaboratively build attack narratives: + +```python +# In agent_personas.py, add collaborative reasoning phase +CHAIN_COUNCIL_PROMPT = """The following security agents have independently +reviewed the codebase and found these issues: + +{agent_findings} + +As the Red Team Council, your job is to: +1. Identify attack paths that span multiple agents' findings +2. Determine if any combination of "medium" findings creates a "critical" chain +3. Propose exploitation sequences that chain findings end-to-end +4. 
Highlight any finding that is a prerequisite for exploiting another + +Focus on practical, real-world attack scenarios.""" +``` + +### Where it plugs in + +- Extend `scripts/vulnerability_chaining_engine.py` with LLM-powered discovery +- New module: `scripts/cross_component_analyzer.py` +- Phase 3 in `scripts/agent_personas.py` — add collaborative council step +- Config: `enable_agent_chain_discovery=True`, `enable_cross_component_analysis=True` + +--- + +## 4. AutoFix → PR → Retest Closed Loop + +### What exists + +- `RemediationEngine` generates `RemediationSuggestion` objects with `fixed_code`, `diff`, `explanation`, `testing_recommendations` +- `RegressionTester` generates language-specific test code and runs it in CI +- `automated-audit.yml` uses `peter-evans/create-pull-request` to open PRs with audit findings (reports, not code fixes) + +### The gap + +These three capabilities are disconnected. The engine generates a fix, but nobody applies it. The regression tester exists, but isn't triggered by the fix. There's no PR-with-code-fix flow. + +The Aikido article describes: "AutoFix generates a merge-ready PR with the specific code-level fix. Developers review, merge, and agents automatically retest to confirm the fix holds." 
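The missing first step, actually applying a generated fix to the source file, can be sketched conservatively. The snippet-replacement strategy and the `original_code`/`fixed_code` field names are assumptions of this sketch, not the existing `RemediationEngine` API:

```python
from pathlib import Path

def apply_fix(file_path: str, original_code: str, fixed_code: str) -> bool:
    """Replace the vulnerable snippet with the fixed one, in place.

    Conservative by design: only applies when the original snippet
    appears exactly once, so an ambiguous match never silently patches
    the wrong location. Returns True if the file was modified.
    """
    path = Path(file_path)
    source = path.read_text()
    if source.count(original_code) != 1:
        # Ambiguous or stale suggestion: leave it for human review
        return False
    path.write_text(source.replace(original_code, fixed_code))
    return True
```

Anything `apply_fix` refuses to touch falls out of the automated loop and surfaces in the report as before, which keeps the opt-in AutoFix path strictly safer than the status quo.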
+ +### What to build + +**a) AutoFix PR generator** + +Wire `RemediationEngine` output into actual PR creation: + +```python +class AutoFixPRGenerator: + """Generate merge-ready PRs from remediation suggestions.""" + + def create_fix_pr( + self, + suggestion: RemediationSuggestion, + repo_path: str, + base_branch: str = "main", + ) -> dict: + branch_name = f"argus/fix-{suggestion.vulnerability_type}-{suggestion.finding_id[:8]}" + + # Apply the diff to the actual file + self._apply_fix(suggestion, repo_path) + + # Generate regression test for this fix + regression_test = self.regression_tester.generate_test( + finding=suggestion.to_finding_dict(), + language=self._detect_language(suggestion.file_path), + ) + + # Create PR with fix + test + pr_body = self._format_pr_body(suggestion, regression_test) + + return { + 'branch': branch_name, + 'files_changed': [suggestion.file_path], + 'test_file': regression_test.path if regression_test else None, + 'title': f"fix: {suggestion.vulnerability_type} in {os.path.basename(suggestion.file_path)}", + 'body': pr_body, + } + + def _format_pr_body(self, suggestion, regression_test) -> str: + return f"""## Security Fix — {suggestion.vulnerability_type} + +**Finding:** {suggestion.finding_id} +**File:** `{suggestion.file_path}:{suggestion.line_number}` +**CWE:** {', '.join(suggestion.cwe_references)} +**Confidence:** {suggestion.confidence} + +### What changed +{suggestion.explanation} + +### Diff +```diff +{suggestion.diff} +``` + +### Testing +{chr(10).join(f'- {t}' for t in suggestion.testing_recommendations)} + +### Regression test included +{'Yes — see ' + regression_test.path if regression_test else 'No (template not available for this vuln type)'} + +--- +*Generated by Argus Security — [verify before merging]* +""" +``` + +**b) Retest-on-merge workflow** + +```yaml +# .github/workflows/argus-retest.yml +name: Argus Retest After Fix +on: + pull_request: + types: [closed] + +jobs: + retest: + if: | + 
github.event.pull_request.merged == true && + startsWith(github.event.pull_request.head.ref, 'argus/fix-') + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - name: Run targeted rescan + run: | + # Extract the finding ID from the branch name + FINDING_ID=$(echo "${{ github.event.pull_request.head.ref }}" | sed 's/argus\/fix-.*-//') + python scripts/regression_tester.py run \ + --test-dir tests/security_regression \ + --finding-id "$FINDING_ID" + + - name: Run SAST on fixed file + run: | + CHANGED_FILES=$(gh pr view ${{ github.event.pull_request.number }} --json files -q '.files[].path') + semgrep --config=auto $CHANGED_FILES + + - name: Update finding status + if: success() + run: | + python -c " + from feedback_loop import FeedbackLoop + fl = FeedbackLoop() + fl.record_feedback( + finding_id='$FINDING_ID', + automated_verdict='confirmed', + human_verdict='confirmed', + confidence=1.0, + category='fix_verified', + reasoning='Automated retest passed after merge' + )" +``` + +**c) Full closed-loop orchestration** + +The holy grail: scan → find → fix → PR → merge → retest → verify, all automated: + +```python +class ClosedLoopOrchestrator: + """Orchestrate the full find → fix → verify loop.""" + + async def run_closed_loop(self, scan_results: list[dict]) -> dict: + loop_results = { + 'fixed': [], + 'fix_failed': [], + 'retest_failed': [], + 'verified': [], + } + + fixable = [f for f in scan_results if f.get('auto_fixable')] + + for finding in fixable: + # Step 1: Generate fix + suggestion = self.remediation_engine.generate_fix(finding) + if not suggestion or suggestion.confidence == 'low': + loop_results['fix_failed'].append(finding) + continue + + # Step 2: Create PR + pr = self.pr_generator.create_fix_pr(suggestion) + loop_results['fixed'].append({**finding, 'pr': pr}) + + # Step 3: Run regression test against the fix (pre-merge validation) + test_result = self.regression_tester.run_single( + finding_id=finding['id'], + 
patched_code=suggestion.fixed_code, + ) + + if test_result.passed: + loop_results['verified'].append(finding) + else: + loop_results['retest_failed'].append(finding) + + return loop_results +``` + +### Where it plugs in + +- New module: `scripts/autofix_pr_generator.py` +- New workflow: `.github/workflows/argus-retest.yml` +- Extend `scripts/hybrid_analyzer.py` Phase 6 to optionally trigger AutoFix +- Config: `enable_autofix_pr=False` (opt-in), `autofix_confidence_threshold="high"`, `autofix_retest=True` + +--- + +## 5. Persistent Security Knowledge Base + +### What exists + +Argus stores results as flat files: +- `.argus/feedback/feedback_records.jsonl` — human TP/FP verdicts +- `.argus/feedback/confidence_adjustments.json` — pattern multipliers +- `.argus/sandbox-results/` — per-exploit outcomes +- `tests/security_regression/` — regression test cases +- Per-scan JSON/SARIF/Markdown reports + +### The gap + +Each scan is independent. There's no way to ask "has this vulnerability been seen before?", "is this a regression?", "what's our false positive rate trending?", or "what attack patterns are most common in this codebase?" 
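The core query the flat files cannot answer, "is this a regression?", reduces to a small upsert against a persistent table keyed by fingerprint. A minimal in-memory sketch with a deliberately simplified schema (`upsert_finding` is a hypothetical helper):

```python
import sqlite3

def upsert_finding(db: sqlite3.Connection, fingerprint: str) -> dict:
    """Record a sighting of a finding and flag regressions.

    A finding previously marked 'fixed' that reappears in a new scan is
    a regression, the one signal independent per-scan reports can never
    produce.
    """
    db.execute("""CREATE TABLE IF NOT EXISTS findings (
        fingerprint TEXT PRIMARY KEY, status TEXT, times_seen INTEGER)""")
    row = db.execute(
        "SELECT status, times_seen FROM findings WHERE fingerprint = ?",
        (fingerprint,)).fetchone()
    if row is None:
        db.execute("INSERT INTO findings VALUES (?, 'open', 1)", (fingerprint,))
        return {"regression": False, "times_seen": 1}
    prev_status, seen = row
    db.execute(
        "UPDATE findings SET status = 'open', times_seen = ? WHERE fingerprint = ?",
        (seen + 1, fingerprint))
    return {"regression": prev_status == "fixed", "times_seen": seen + 1}
```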
+ +### What to build + +**a) SQLite-backed findings store** + +Lightweight, zero-infrastructure, embedded in the repo (or `.argus/`): + +```python +class FindingsStore: + """Persistent cross-scan findings database.""" + + SCHEMA = """ + CREATE TABLE IF NOT EXISTS findings ( + id TEXT PRIMARY KEY, + scan_id TEXT NOT NULL, + scan_timestamp TEXT NOT NULL, + vuln_type TEXT NOT NULL, + severity TEXT NOT NULL, + file_path TEXT, + line_number INTEGER, + cwe TEXT, + cvss_score REAL, + source_tool TEXT, + status TEXT DEFAULT 'open', -- open, fixed, false_positive, accepted_risk + first_seen TEXT NOT NULL, + last_seen TEXT NOT NULL, + times_seen INTEGER DEFAULT 1, + fix_verified BOOLEAN DEFAULT FALSE, + fingerprint TEXT NOT NULL -- content-based dedup key + ); + + CREATE TABLE IF NOT EXISTS scan_history ( + scan_id TEXT PRIMARY KEY, + timestamp TEXT NOT NULL, + commit_sha TEXT, + branch TEXT, + total_findings INTEGER, + critical INTEGER, + high INTEGER, + medium INTEGER, + low INTEGER, + duration_seconds REAL, + cost_usd REAL + ); + + CREATE TABLE IF NOT EXISTS fix_history ( + finding_id TEXT, + fix_commit TEXT, + fix_timestamp TEXT, + fix_method TEXT, -- autofix, manual, dependency_update + retest_passed BOOLEAN, + regression_detected BOOLEAN DEFAULT FALSE + ); + """ + + def record_scan(self, scan_results: dict) -> None: + """Record a scan and upsert all findings.""" + ... + + def is_regression(self, finding: dict) -> bool: + """Check if a finding was previously fixed but has reappeared.""" + ... + + def trending(self, days: int = 90) -> dict: + """Return severity trends over time.""" + ... + + def mean_time_to_fix(self, severity: str = None) -> float: + """Calculate MTTF across all findings or by severity.""" + ... +``` + +**b) Cross-scan deduplication** + +Use content-based fingerprinting to track findings across scans: + +```python +def fingerprint_finding(finding: dict) -> str: + """Generate a stable fingerprint for cross-scan dedup. 
+ + Uses vulnerability type + file path + code context (not line number, + which shifts with unrelated edits). + """ + key_parts = [ + finding.get('vuln_type', ''), + finding.get('file_path', ''), + finding.get('code_snippet', '')[:200], # Normalized code context + finding.get('cwe', ''), + ] + return hashlib.sha256('|'.join(key_parts).encode()).hexdigest()[:16] +``` + +**c) Historical context injection** + +Feed knowledge base context into LLM enrichment prompts: + +```python +# In Phase 2 AI enrichment, add historical context +historical_context = f""" +Historical context for this finding: +- First seen: {store.first_seen(fingerprint)} +- Times detected: {store.times_seen(fingerprint)} +- Previous status: {store.previous_status(fingerprint)} +- Related findings in same file: {store.related_count(file_path)} +- False positive rate for this pattern: {store.fp_rate(vuln_type)}% + +Use this context to calibrate your confidence score. +""" +``` + +### Where it plugs in + +- New module: `scripts/findings_store.py` +- Integrate into `scripts/hybrid_analyzer.py` Phase 6 (record) and Phase 2 (query) +- Extend `scripts/feedback_loop.py` to write to the store +- Config: `enable_findings_store=True`, `findings_db_path=".argus/findings.db"` + +--- + +## 6. Unified Code-to-Runtime Context Model + +### What exists + +- SAST: Semgrep, Checkov, TruffleHog, heuristic scanner +- DAST: ZAP + Nuclei via `dast_orchestrator.py` +- Correlation: `sast_dast_correlator.py` bridges SAST and DAST findings +- Auth config: `dast_auth_config.py` handles authenticated scanning + +### The gap + +Each scanner operates with its own isolated view. There's no unified application model that says "here's the full attack surface: these API endpoints exist, they're backed by these handlers, protected by this middleware, talking to this database, deployed behind this cloud config." The correlator connects findings after the fact, but doesn't inform the scanning itself. 
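Populating such a model starts with cheap heuristics at pipeline start. A minimal sketch of framework detection from marker files; the specific markers, and the JVM-build-implies-Spring shortcut, are assumptions of this sketch:

```python
import json
from pathlib import Path

def detect_framework(project_path: str) -> str:
    """Heuristically infer the web framework from marker files.

    Deliberately small: real detection would also parse imports,
    lockfiles, and build configs.
    """
    root = Path(project_path)
    if (root / "manage.py").exists():
        return "django"
    pkg = root / "package.json"
    if pkg.exists():
        deps = json.loads(pkg.read_text()).get("dependencies", {})
        if "express" in deps:
            return "express"
        if "next" in deps:
            return "nextjs"
    if (root / "pom.xml").exists() or (root / "build.gradle").exists():
        # Assumption: a JVM build file in this codebase means Spring
        return "spring"
    return "unknown"
```

The same pattern extends to auth mechanism (look for JWT libraries, session middleware) and cloud provider (Terraform provider blocks, SDK dependencies), each feeding one field of `ApplicationContext`.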
+ +### What to build + +**a) Application context model** + +Build a lightweight representation of the application that all phases can reference: + +```python +@dataclass +class ApplicationContext: + """Unified application context fed to all pipeline phases.""" + + # Code structure + framework: str # django, express, spring, etc. + language: str + entry_points: list[str] # Main files, route definitions + auth_mechanism: str # jwt, session, oauth2, api_key + + # API surface + api_endpoints: list[dict] # From OpenAPI spec or route discovery + middleware_chain: list[str] # Auth, CORS, rate limiting, etc. + data_models: list[dict] # Database models / schemas + + # Infrastructure + cloud_provider: str # aws, gcp, azure, none + iac_files: list[str] # Terraform, K8s manifests + container_config: dict # Dockerfile analysis + secrets_management: str # vault, env, ssm, none + + # Dependencies + direct_deps: list[dict] # From package.json, requirements.txt, etc. + transitive_deps: list[dict] # Full dependency tree + + # DAST context (if available) + deployment_url: str | None + authenticated_endpoints: list[str] + discovered_endpoints: list[str] # From crawling +``` + +**b) Context-aware scanning** + +Pass the context model into scanner configuration: + +```python +# Phase 1: Context-aware Semgrep rules +if context.framework == 'django': + semgrep_rules.append('p/django') + semgrep_rules.append('p/python-django-security') +if context.auth_mechanism == 'jwt': + semgrep_rules.append('p/jwt') + +# Phase 2: Context-enriched LLM prompts +enrichment_prompt += f""" +Application context: +- Framework: {context.framework} +- Auth: {context.auth_mechanism} +- Cloud: {context.cloud_provider} +- Known API endpoints: {len(context.api_endpoints)} +- Middleware chain: {' → '.join(context.middleware_chain)} + +Use this context to determine if the finding is actually exploitable +in this specific application architecture. 
+""" +``` + +### Where it plugs in + +- New module: `scripts/app_context_builder.py` +- Called once at pipeline start, passed to all phases +- Feed into `scripts/config_loader.py` as enrichment context +- Config: `enable_app_context=True` + +--- + +## 7. Live Target Validation (Beyond Sandbox) + +### What exists + +- `SandboxValidator` runs exploit PoCs in isolated Docker containers +- `dast_orchestrator.py` runs ZAP + Nuclei against live targets +- These are separate capabilities — sandbox validates code-level findings, DAST scans network-level surface + +### The gap + +The sandbox proves "this code is theoretically exploitable in isolation." DAST finds "this endpoint responds to this payload." Neither proves "this specific finding from SAST is exploitable in the deployed application." + +The article describes: "Every finding is confirmed through direct exploitation against the live target." + +### What to build + +**a) SAST-to-DAST validation pipeline** + +Take SAST findings and generate targeted DAST tests: + +```python +class SastToDastValidator: + """Validate SAST findings against the live deployment.""" + + async def validate_finding( + self, + finding: dict, + target_url: str, + auth_config: dict, + ) -> dict: + """Generate and execute a targeted DAST test for a SAST finding.""" + + # Map SAST finding to HTTP test + test_case = self._generate_test_case(finding) + if not test_case: + return {'validated': False, 'reason': 'no_test_mapping'} + + # Execute against live target + result = await self._execute_test(test_case, target_url, auth_config) + + return { + 'validated': result.exploitable, + 'evidence': result.response_excerpt, + 'http_status': result.status_code, + 'validation_method': 'live_dast', + } + + def _generate_test_case(self, finding: dict) -> dict | None: + """Map a SAST finding to a concrete HTTP test case. 
+ + Example: SQL injection in /api/users?id= → + GET /api/users?id=1' OR '1'='1 and check for data leak indicators + """ + vuln_type = finding.get('vuln_type', '') + endpoint = finding.get('endpoint') or self._infer_endpoint(finding) + + if not endpoint: + return None + + # Generate test payloads based on vulnerability type + payloads = self.payload_generator.for_vuln_type(vuln_type) + return { + 'endpoint': endpoint, + 'method': finding.get('http_method', 'GET'), + 'payloads': payloads, + 'success_indicators': self._success_indicators(vuln_type), + } +``` + +**b) Exploit replay against deployment** + +Wire `ProofByExploitation` to optionally target a live URL instead of only the Docker sandbox: + +```python +# In sandbox_validator.py, add a mode for live target validation +class LiveTargetValidator: + """Validate findings against a live deployment (staging/preview).""" + + ALLOWED_ENVIRONMENTS = ['staging', 'preview', 'development'] # Never production + + def validate(self, finding: dict, target_url: str, environment: str) -> dict: + if environment not in self.ALLOWED_ENVIRONMENTS: + raise ValueError(f"Live validation not allowed against {environment}") + ... 
+``` + +### Where it plugs in + +- New module: `scripts/sast_dast_validator.py` +- Extend Phase 4 to include live validation when `dast_target_url` is set +- Config: `enable_live_validation=False`, `live_validation_environment="staging"` + +--- + +## Implementation Priority + +Ordered by impact-to-effort ratio: + +| Priority | Feature | Effort | Impact | +|---|---|---|---| +| **P0** | Diff-intelligent scanner scoping | Medium | High — every scan runs faster and more focused | +| **P0** | AutoFix → PR generation | Medium | High — closes the most visible gap | +| **P1** | Persistent findings store (SQLite) | Medium | High — enables trending, regression detection, MTTF | +| **P1** | Retest-on-merge workflow | Low | Medium — completes the closed loop | +| **P1** | Agent-driven chain discovery | Medium | High — biggest quality uplift for finding depth | +| **P2** | Deployment-triggered scanning | Low | Medium — extends coverage to post-deploy | +| **P2** | Application context model | High | High — improves everything but requires broad integration | +| **P2** | SAST-to-DAST validation | High | Medium — requires live target, auth, environment setup | + +--- + +## Config Toggles Summary + +All new capabilities follow Argus's existing pattern of config-driven feature flags: + +```yaml +# In profiles/ or config_loader.py defaults +continuous_testing: + enable_diff_scoping: true + diff_expand_impact_radius: true + scan_trigger: ["push", "pr", "deploy"] + +autonomous_loop: + enable_autofix_pr: false # Opt-in: generates PRs with fixes + autofix_confidence_threshold: high # Only auto-fix high-confidence suggestions + autofix_retest: true # Retest after fix merge + autofix_max_prs_per_scan: 5 # Rate limit + +knowledge_base: + enable_findings_store: true + findings_db_path: ".argus/findings.db" + enable_cross_scan_dedup: true + enable_trending: true + inject_historical_context: true # Feed history into LLM prompts + +agent_reasoning: + enable_agent_chain_discovery: true + 
enable_cross_component_analysis: true + enable_collaborative_council: false # Expensive: multi-agent discussion + +live_validation: + enable_live_validation: false + live_validation_environment: staging + enable_sast_dast_validation: false +``` + +--- + +## What This Changes + +Today, Argus is a powerful **scan-on-demand** pipeline: trigger it, get results, act on them. + +With these additions, Argus becomes a **continuous security loop**: + +``` +Code pushed → Diff classified → Scanners scoped to blast radius + → AI enrichment with historical context → Agent-driven chain discovery + → Sandbox + live validation → AutoFix PRs generated + → Developer merges → Automated retest → Finding marked verified + → Knowledge base updated → Next scan is smarter +``` + +The attackers have autonomous tools. This gives defenders the same. From 32dfcc9010d0a6a9b86428b2813d1c8d8d9312c5 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 4 Mar 2026 05:55:24 +0000 Subject: [PATCH 2/3] feat: Add 7 continuous security testing modules (v3.0) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Implements continuous autonomous security testing capabilities: - diff_impact_analyzer: Diff-intelligent scanner scoping with blast radius expansion via reverse dependency lookup - agent_chain_discovery: LLM-powered multi-step attack chain discovery with cross-component vulnerability analysis - autofix_pr_generator: AutoFix PR generation with closed-loop find→fix→verify orchestration - findings_store: SQLite-backed persistent findings with regression detection, MTTF, and historical context for LLM prompts - app_context_builder: Auto-detects framework, language, auth, cloud provider, IaC, middleware, and entry points for context-aware scanning - sast_dast_validator: SAST-to-DAST live validation with safety guards against production targets - GitHub Actions workflows for post-deploy scanning and automated retest Adds 13 config keys, integrates all modules into 
hybrid_analyzer.py pipeline, and includes 36 passing tests. https://claude.ai/code/session_017NQsm2eBxfioLrad1C7keZ --- .claude/rules/development.md | 6 + .claude/rules/features.md | 22 +- .github/workflows/argus-retest.yml | 124 ++++ .github/workflows/post-deploy-scan.yml | 102 +++ scripts/agent_chain_discovery.py | 624 +++++++++++++++++ scripts/app_context_builder.py | 808 ++++++++++++++++++++++ scripts/autofix_pr_generator.py | 921 +++++++++++++++++++++++++ scripts/config_loader.py | 48 ++ scripts/diff_impact_analyzer.py | 524 ++++++++++++++ scripts/findings_store.py | 808 ++++++++++++++++++++++ scripts/hybrid_analyzer.py | 117 ++++ scripts/sast_dast_validator.py | 844 ++++++++++++++++++++++ tests/test_continuous_security.py | 664 ++++++++++++++++++ 13 files changed, 5611 insertions(+), 1 deletion(-) create mode 100644 .github/workflows/argus-retest.yml create mode 100644 .github/workflows/post-deploy-scan.yml create mode 100644 scripts/agent_chain_discovery.py create mode 100644 scripts/app_context_builder.py create mode 100644 scripts/autofix_pr_generator.py create mode 100644 scripts/diff_impact_analyzer.py create mode 100644 scripts/findings_store.py create mode 100644 scripts/sast_dast_validator.py create mode 100644 tests/test_continuous_security.py diff --git a/.claude/rules/development.md b/.claude/rules/development.md index 46e3e3e..fa4e956 100644 --- a/.claude/rules/development.md +++ b/.claude/rules/development.md @@ -50,6 +50,12 @@ Argus-Security/ │ ├── agent_personas.py # Phase 3: Multi-agent review │ ├── sandbox_validator.py # Phase 4: Docker validation │ ├── remediation_engine.py # Auto-fix generation +│ ├── diff_impact_analyzer.py # Diff-intelligent scanner scoping +│ ├── agent_chain_discovery.py # LLM-powered attack chain discovery +│ ├── autofix_pr_generator.py # AutoFix PR generation + closed loop +│ ├── findings_store.py # SQLite cross-scan findings store +│ ├── app_context_builder.py # Unified application context model +│ ├── 
sast_dast_validator.py # SAST-to-DAST live validation │ └── argus # CLI entry point ├── policy/rego/ # Phase 5: OPA policies ├── profiles/ # Config profiles diff --git a/.claude/rules/features.md b/.claude/rules/features.md index fc999b4..7271115 100644 --- a/.claude/rules/features.md +++ b/.claude/rules/features.md @@ -1,6 +1,6 @@ --- description: Advanced feature modules and their configuration toggles -globs: ["scripts/error_classifier.py", "scripts/audit_trail.py", "scripts/phase_gate.py", "scripts/mcp_server.py", "scripts/dast_auth_config.py", "scripts/temporal_orchestrator.py", "scripts/license_risk_scorer.py", "scripts/epss_scorer.py", "scripts/fix_version_tracker.py", "scripts/vex_processor.py", "scripts/vuln_deduplicator.py", "scripts/advanced_suppression.py", "scripts/compliance_mapper.py"] +globs: ["scripts/error_classifier.py", "scripts/audit_trail.py", "scripts/phase_gate.py", "scripts/mcp_server.py", "scripts/dast_auth_config.py", "scripts/temporal_orchestrator.py", "scripts/license_risk_scorer.py", "scripts/epss_scorer.py", "scripts/fix_version_tracker.py", "scripts/vex_processor.py", "scripts/vuln_deduplicator.py", "scripts/advanced_suppression.py", "scripts/compliance_mapper.py", "scripts/diff_impact_analyzer.py", "scripts/agent_chain_discovery.py", "scripts/autofix_pr_generator.py", "scripts/findings_store.py", "scripts/app_context_builder.py", "scripts/sast_dast_validator.py"] --- # Advanced Features @@ -43,3 +43,23 @@ Multi-key: {VulnID, PkgName, Version, Path}. Cross-scanner merge. Strategies: au ## Compliance Mapping (`scripts/compliance_mapper.py`) NIST 800-53, PCI DSS 4.0, OWASP Top 10, SOC 2, CIS K8s, ISO 27001. CWE-based mapping + category fallback. Toggle: `enable_compliance_mapping=True` + +# Continuous Security Testing (v3.0) + +## Diff-Intelligent Scanner Scoping (`scripts/diff_impact_analyzer.py`) +Classifies changed files by security relevance (skip docs/assets, always scan auth/crypto/config). 
Expands blast radius via reverse dependency lookup — if auth middleware changed, finds all files importing it. Generates Semgrep `--include` args for scoped scanning. Toggle: `enable_diff_scoping=True`, `diff_expand_impact_radius=True` + +## Agent-Driven Chain Discovery (`scripts/agent_chain_discovery.py`) +LLM-powered multi-step attack chain discovery beyond rule-based patterns. Sends findings to LLM to reason about cross-component exploitation paths. Cross-component analyzer detects dangerous finding combinations across architectural boundaries (auth+api, models+api, middleware+routes). Toggle: `enable_agent_chain_discovery=False` (opt-in), `enable_cross_component_analysis=True` + +## AutoFix PR Generator (`scripts/autofix_pr_generator.py`) +Generates git branches with applied fixes from RemediationEngine suggestions. Creates conventional-commit-style messages, formatted PR bodies with diff/CWE/testing sections. ClosedLoopOrchestrator wires find→fix→verify into a single flow. Toggle: `enable_autofix_pr=False` (opt-in), `autofix_confidence_threshold="high"`, `autofix_max_prs_per_scan=5` + +## Persistent Findings Store (`scripts/findings_store.py`) +SQLite-backed cross-scan intelligence. Tracks findings across scans via content-based fingerprinting. Detects regressions (previously-fixed findings reappearing), computes MTTF, FP rates, severity trending. Injects historical context into LLM enrichment prompts. Toggle: `enable_findings_store=True`, `findings_db_path=".argus/findings.db"`, `inject_historical_context=True` + +## Application Context Builder (`scripts/app_context_builder.py`) +Detects framework (Django/Flask/Express/Spring/etc.), language, auth mechanism (JWT/OAuth2/session), cloud provider, IaC files, middleware chain, entry points, and OpenAPI specs. Generates `to_prompt_context()` string for LLM prompt injection. 
Toggle: `enable_app_context=True` + +## SAST-to-DAST Live Validation (`scripts/sast_dast_validator.py`) +Validates SAST findings against live deployment targets. Maps vuln types to HTTP test payloads (SQLi, XSS, SSRF, path traversal, command injection, IDOR). Safety: rejects production targets by default, only allows staging/preview/development. Toggle: `enable_live_validation=False` (opt-in), `live_validation_environment="staging"` diff --git a/.github/workflows/argus-retest.yml b/.github/workflows/argus-retest.yml new file mode 100644 index 0000000..d10c305 --- /dev/null +++ b/.github/workflows/argus-retest.yml @@ -0,0 +1,124 @@ +name: Argus Retest After Fix +on: + pull_request: + types: [closed] + +jobs: + retest: + # Only run when an argus/fix- PR is merged + if: > + github.event.pull_request.merged == true && + startsWith(github.event.pull_request.head.ref, 'argus/fix-') + runs-on: ubuntu-latest + permissions: + contents: read + pull-requests: write + + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: '3.11' + + - name: Install dependencies + run: pip install -r requirements.txt + + - name: Extract fix metadata + id: meta + run: | + BRANCH="${{ github.event.pull_request.head.ref }}" + # Extract vuln type and finding ID from branch name: argus/fix-{type}-{id} + VULN_TYPE=$(echo "$BRANCH" | sed 's|argus/fix-||' | sed 's|-[a-f0-9]*$||') + FINDING_ID=$(echo "$BRANCH" | grep -oP '[a-f0-9]{8}$' || echo "unknown") + echo "vuln_type=$VULN_TYPE" >> $GITHUB_OUTPUT + echo "finding_id=$FINDING_ID" >> $GITHUB_OUTPUT + # Get changed files from the PR + CHANGED_FILES=$(gh pr view ${{ github.event.pull_request.number }} --json files -q '.files[].path' || echo "") + echo "changed_files=$CHANGED_FILES" >> $GITHUB_OUTPUT + env: + GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} + + - name: Run regression tests + id: regression + continue-on-error: true + run: | + python -c " + import sys + 
sys.path.insert(0, 'scripts') + try: + from regression_tester import RegressionTester + tester = RegressionTester() + results = tester.run('tests/security_regression') + passed = results.get('passed', 0) + failed = results.get('failed', 0) + print(f'Regression tests: {passed} passed, {failed} failed') + sys.exit(1 if failed > 0 else 0) + except Exception as e: + print(f'Regression test error: {e}') + sys.exit(1) + " + + - name: Run targeted SAST rescan + id: rescan + continue-on-error: true + env: + ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} + run: | + python scripts/run_ai_audit.py \ + --project-type auto \ + --only-changed \ + --review-type security + + - name: Update finding status + if: steps.regression.outcome == 'success' && steps.rescan.outcome == 'success' + run: | + python -c " + import sys + sys.path.insert(0, 'scripts') + try: + from findings_store import FindingsStore + store = FindingsStore() + store.record_fix( + finding_id='${{ steps.meta.outputs.finding_id }}', + fix_commit='${{ github.sha }}', + fix_method='autofix', + retest_passed=True, + ) + print('Finding marked as fix-verified') + except Exception as e: + print(f'Could not update findings store: {e}') + " + + - name: Post retest results + if: always() + uses: actions/github-script@v7 + with: + script: | + const regression = '${{ steps.regression.outcome }}'; + const rescan = '${{ steps.rescan.outcome }}'; + const allPassed = regression === 'success' && rescan === 'success'; + + const body = `## Argus Retest Results + + | Check | Status | + |-------|--------| + | Regression Tests | ${regression === 'success' ? 'Passed' : 'Failed'} | + | SAST Rescan | ${rescan === 'success' ? 'Clean' : 'Issues found'} | + | **Overall** | **${allPassed ? 'Fix Verified' : 'Needs Review'}** | + + ${allPassed ? 'The fix has been verified. The vulnerability is confirmed resolved.' : 'The retest found issues. 
Please review the scan results.'} + + --- + *Argus Security Retest — triggered by merge of \`${{ github.event.pull_request.head.ref }}\`*`; + + // Comment on the merged PR + await github.rest.issues.createComment({ + owner: context.repo.owner, + repo: context.repo.repo, + issue_number: ${{ github.event.pull_request.number }}, + body: body + }); diff --git a/.github/workflows/post-deploy-scan.yml b/.github/workflows/post-deploy-scan.yml new file mode 100644 index 0000000..13242b7 --- /dev/null +++ b/.github/workflows/post-deploy-scan.yml @@ -0,0 +1,102 @@ +name: Post-Deploy Security Validation +on: + deployment_status: + # Trigger when deployment succeeds + workflow_dispatch: + inputs: + target_url: + description: 'Deployment URL to scan' + required: false + type: string + environment: + description: 'Deployment environment' + required: false + default: 'staging' + type: string + +jobs: + post-deploy-scan: + if: > + github.event_name == 'workflow_dispatch' || + github.event.deployment_status.state == 'success' + runs-on: ubuntu-latest + permissions: + contents: read + security-events: write + + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: '3.11' + + - name: Install dependencies + run: pip install -r requirements.txt + + - name: Determine deployment context + id: context + run: | + if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then + echo "target_url=${{ inputs.target_url }}" >> $GITHUB_OUTPUT + echo "environment=${{ inputs.environment }}" >> $GITHUB_OUTPUT + else + echo "target_url=${{ github.event.deployment.payload.web_url || '' }}" >> $GITHUB_OUTPUT + echo "environment=${{ github.event.deployment.environment }}" >> $GITHUB_OUTPUT + fi + # Get diff since last successful scan + PREV_SHA=$(git log --format='%H' -2 | tail -1) + echo "prev_sha=$PREV_SHA" >> $GITHUB_OUTPUT + CHANGED=$(git diff --name-only $PREV_SHA HEAD | head -100) + echo "has_changes=$( [ 
-n "$CHANGED" ] && echo true || echo false )" >> $GITHUB_OUTPUT + + - name: Run diff-scoped SAST scan + if: steps.context.outputs.has_changes == 'true' + env: + ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} + ONLY_CHANGED: "true" + run: | + python scripts/run_ai_audit.py \ + --project-type auto \ + --only-changed \ + --review-type security + + - name: Run DAST against deployment + if: steps.context.outputs.target_url != '' + env: + ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} + DAST_TARGET_URL: ${{ steps.context.outputs.target_url }} + run: | + echo "Running DAST scan against $DAST_TARGET_URL" + python -c " + import sys + sys.path.insert(0, 'scripts') + try: + from dast_orchestrator import DASTOrchestrator, OrchestratorConfig + config = OrchestratorConfig( + project_path='.', + enable_nuclei=True, + enable_zap=False, + max_duration=600, + ) + orch = DASTOrchestrator(config=config) + results = orch.run('${{ steps.context.outputs.target_url }}') + print(f'DAST scan complete: {len(results.get(\"findings\", []))} findings') + except ImportError as e: + print(f'DAST not available: {e}') + except Exception as e: + print(f'DAST scan error: {e}') + " + + - name: Upload results + if: always() + uses: actions/upload-artifact@v4 + with: + name: post-deploy-scan-results + path: | + .argus/ + *.sarif + retention-days: 30 diff --git a/scripts/agent_chain_discovery.py b/scripts/agent_chain_discovery.py new file mode 100644 index 0000000..94a3836 --- /dev/null +++ b/scripts/agent_chain_discovery.py @@ -0,0 +1,624 @@ +#!/usr/bin/env python3 +""" +LLM-Powered Vulnerability Chain Discovery for Argus Security + +Complements the rule-based vulnerability_chaining_engine.py (14 static rules) +with LLM-powered reasoning to discover novel multi-step attack paths that +static rules cannot anticipate. 
+
+Key Features:
+- LLM-driven attack chain discovery beyond predefined rule sets
+- Cross-component analysis for inter-module vulnerability combinations
+- Structured prompt engineering for reliable JSON chain output
+- Batch processing with configurable limits to control LLM costs
+
+Integration:
+- Accepts a callable ``llm_call`` matching Argus's LLMManager.call_llm_api()
+  pattern (takes prompt string, returns response string)
+- Returns dataclass-based results with ``to_dict()`` for JSON serialization
+- Works alongside VulnerabilityChainer for hybrid static + AI chain detection
+
+Toggle: enable_agent_chain_discovery (opt-in; defaults to False in config_loader)
+"""
+
+import json
+import logging
+import re
+from collections import defaultdict
+from dataclasses import asdict, dataclass, field
+from pathlib import Path
+from typing import Any, Callable, Optional
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Dataclasses
+# ---------------------------------------------------------------------------
+
+
+@dataclass
+class AttackStep:
+    """A single step within a multi-step attack chain."""
+
+    finding_id: str
+    action: str
+    enables: str
+
+
+@dataclass
+class AttackChain:
+    """
+    A multi-step attack path discovered by LLM analysis.
+
+    Each chain links several security findings into an ordered exploitation
+    sequence with an assessed severity, complexity, and final impact.
+ """ + + chain_id: str + finding_ids: list[str] + steps: list[AttackStep] + severity: str # critical / high / medium / low + complexity: str # trivial / low / medium / high + impact: str + description: str + + def to_dict(self) -> dict[str, Any]: + """Convert to a JSON-serializable dictionary.""" + return { + "chain_id": self.chain_id, + "finding_ids": self.finding_ids, + "steps": [asdict(s) for s in self.steps], + "severity": self.severity, + "complexity": self.complexity, + "impact": self.impact, + "description": self.description, + } + + +@dataclass +class CrossComponentRisk: + """Risk arising from vulnerabilities spanning two application components.""" + + component_a: str + component_b: str + findings_a: list[str] # finding IDs + findings_b: list[str] + risk_type: str + severity: str + description: str + + +# --------------------------------------------------------------------------- +# AgentChainDiscovery +# --------------------------------------------------------------------------- + + +class AgentChainDiscovery: + """LLM-powered vulnerability chain discovery. + + Uses a language model to reason about multi-step attack paths that + combine discrete security findings from the same application. This + discovers chains that rule-based engines cannot anticipate because + they depend on application-specific context and creative attacker + thinking. + + Args: + llm_call: Function that takes a prompt string and returns a + response string, matching Argus's ``LLMManager.call_llm_api()`` + pattern. + max_findings_per_batch: Upper limit on findings sent to the LLM + in a single prompt. Keeps token usage bounded. 
+ """ + + def __init__( + self, + llm_call: Callable[[str], str], + max_findings_per_batch: int = 30, + ): + self.llm_call = llm_call + self.max_findings_per_batch = max_findings_per_batch + + # -- public API --------------------------------------------------------- + + def discover_chains( + self, + findings: list[dict], + app_context: dict[str, Any] | None = None, + ) -> list[AttackChain]: + """Discover multi-step attack chains via LLM reasoning. + + Args: + findings: Security findings, each a dict with at least + ``id``, ``type``/``category``, ``severity``, ``file``/``path``, + and ``description``/``message`` keys. + app_context: Optional application metadata (e.g. framework, + auth model, deployment info) to improve LLM reasoning. + + Returns: + List of :class:`AttackChain` objects sorted by severity + (critical first). + """ + if not findings: + logger.info("No findings provided for chain discovery") + return [] + + # Batch findings to stay within token limits + batch = findings[: self.max_findings_per_batch] + if len(findings) > self.max_findings_per_batch: + logger.warning( + "Truncating %d findings to batch limit of %d", + len(findings), + self.max_findings_per_batch, + ) + + prompt = self._build_discovery_prompt(batch, app_context) + + logger.info( + "Sending %d findings to LLM for chain discovery", len(batch) + ) + + try: + response = self.llm_call(prompt) + except Exception: + logger.exception("LLM call failed during chain discovery") + return [] + + chains = self._parse_chains(response) + + # Sort: critical > high > medium > low + severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3} + chains.sort(key=lambda c: severity_order.get(c.severity.lower(), 4)) + + logger.info("LLM discovered %d attack chains", len(chains)) + return chains + + # -- prompt building ---------------------------------------------------- + + def _build_discovery_prompt( + self, + findings: list[dict], + app_context: dict[str, Any] | None, + ) -> str: + """Build a 
structured prompt for LLM chain discovery. + + Args: + findings: Batch of finding dicts. + app_context: Optional application metadata. + + Returns: + Complete prompt string. + """ + # Format findings into a readable list + formatted_findings: list[str] = [] + for f in findings: + fid = f.get("id", f.get("rule_id", "unknown")) + ftype = f.get("type", f.get("category", f.get("check_id", "unknown"))) + severity = f.get("severity", "medium") + filepath = f.get("file", f.get("path", "unknown")) + description = f.get("description", f.get("message", "")) + line = f.get("line", f.get("start_line", "")) + line_info = f" (line {line})" if line else "" + formatted_findings.append( + f"- ID: {fid} | Type: {ftype} | Severity: {severity} " + f"| File: {filepath}{line_info}\n Description: {description}" + ) + + findings_block = "\n".join(formatted_findings) + + # Optional app context section + context_section = "" + if app_context: + ctx_lines = [f"- {k}: {v}" for k, v in app_context.items()] + context_section = ( + "\nApplication Context:\n" + "\n".join(ctx_lines) + "\n" + ) + + prompt = ( + "You are an expert penetration tester analyzing security findings " + "from the same application.\n" + "Identify multi-step attack chains that combine these findings.\n" + "\n" + "For each chain:\n" + "1. List the finding IDs in exploitation order\n" + "2. Describe each step and what it enables\n" + "3. Rate the overall chain severity (critical/high/medium/low)\n" + "4. Estimate attack complexity (trivial/low/medium/high)\n" + "5. 
Describe the final impact\n" + "\n" + f"Findings:\n{findings_block}\n" + f"{context_section}\n" + "Return ONLY a JSON array of chain objects with keys: " + "chain_id, finding_ids, steps (array of {finding_id, action, enables}), " + "severity, complexity, impact, description" + ) + return prompt + + # -- response parsing --------------------------------------------------- + + def _parse_chains(self, response: str) -> list[AttackChain]: + """Parse LLM response into AttackChain objects. + + Handles: + - Raw JSON arrays + - JSON wrapped in markdown code blocks (```json ... ```) + - Graceful fallback to empty list on parse failure + + Args: + response: Raw LLM response string. + + Returns: + List of validated :class:`AttackChain` objects. + """ + if not response or not response.strip(): + logger.warning("Empty LLM response for chain discovery") + return [] + + # Strip markdown code fences if present + cleaned = response.strip() + code_block_match = re.search( + r"```(?:json)?\s*\n?(.*?)```", cleaned, re.DOTALL + ) + if code_block_match: + cleaned = code_block_match.group(1).strip() + + try: + data = json.loads(cleaned) + except json.JSONDecodeError as exc: + logger.warning("Failed to parse LLM chain response as JSON: %s", exc) + return [] + + if not isinstance(data, list): + logger.warning( + "Expected JSON array from LLM, got %s", type(data).__name__ + ) + return [] + + chains: list[AttackChain] = [] + for idx, item in enumerate(data): + try: + chain = self._validate_chain_item(item, idx) + if chain is not None: + chains.append(chain) + except Exception: + logger.warning( + "Skipping invalid chain at index %d", idx, exc_info=True + ) + + return chains + + def _validate_chain_item( + self, item: dict, idx: int + ) -> Optional[AttackChain]: + """Validate and convert a single chain dict to an AttackChain. + + Args: + item: Raw chain dict from LLM JSON. + idx: Index in the array (for fallback chain_id). 
+ + Returns: + :class:`AttackChain` or ``None`` if the item is invalid. + """ + if not isinstance(item, dict): + logger.warning("Chain item at index %d is not a dict", idx) + return None + + # Required keys + finding_ids = item.get("finding_ids") + if not finding_ids or not isinstance(finding_ids, list): + logger.warning("Chain at index %d missing valid finding_ids", idx) + return None + + chain_id = item.get("chain_id", f"llm-chain-{idx}") + + # Parse steps + raw_steps = item.get("steps", []) + steps: list[AttackStep] = [] + for raw_step in raw_steps: + if isinstance(raw_step, dict): + steps.append( + AttackStep( + finding_id=str(raw_step.get("finding_id", "")), + action=str(raw_step.get("action", "")), + enables=str(raw_step.get("enables", "")), + ) + ) + + severity = str(item.get("severity", "medium")).lower() + if severity not in ("critical", "high", "medium", "low"): + severity = "medium" + + complexity = str(item.get("complexity", "medium")).lower() + if complexity not in ("trivial", "low", "medium", "high"): + complexity = "medium" + + return AttackChain( + chain_id=str(chain_id), + finding_ids=[str(fid) for fid in finding_ids], + steps=steps, + severity=severity, + complexity=complexity, + impact=str(item.get("impact", "")), + description=str(item.get("description", "")), + ) + + +# --------------------------------------------------------------------------- +# CrossComponentAnalyzer +# --------------------------------------------------------------------------- + + +# Component directory patterns used for classification +_COMPONENT_DIRS: dict[str, list[str]] = { + "api": ["api", "apis", "endpoints", "resources"], + "auth": ["auth", "authentication", "authorization", "identity", "login"], + "models": ["models", "entities", "schemas", "orm"], + "views": ["views", "templates", "pages", "components", "ui"], + "middleware": ["middleware", "middlewares", "interceptors", "filters"], + "routes": ["routes", "routing", "urls", "router"], + "services": ["services", 
"service", "providers", "adapters"], + "utils": ["utils", "utilities", "helpers", "lib", "common"], + "config": ["config", "configuration", "settings", "conf", "env"], +} + +# Dangerous cross-component combinations +_DANGEROUS_PAIRS: list[dict[str, str]] = [ + { + "a": "auth", + "b": "api", + "risk_type": "broken_access_control", + "severity": "critical", + "description": ( + "Authentication/authorization weaknesses combined with API " + "vulnerabilities can lead to broken access control, allowing " + "unauthenticated or low-privilege users to access protected " + "endpoints and data." + ), + }, + { + "a": "models", + "b": "api", + "risk_type": "mass_assignment", + "severity": "high", + "description": ( + "Model-layer vulnerabilities combined with API endpoint issues " + "can enable mass assignment attacks where attackers modify " + "protected fields (roles, permissions, balances) by including " + "unexpected parameters in API requests." + ), + }, + { + "a": "auth", + "b": "config", + "risk_type": "credential_exposure", + "severity": "critical", + "description": ( + "Authentication weaknesses combined with configuration issues " + "can expose credentials, secrets, or tokens through insecure " + "defaults, debug modes, or unprotected configuration files." + ), + }, + { + "a": "middleware", + "b": "routes", + "risk_type": "security_bypass", + "severity": "high", + "description": ( + "Middleware vulnerabilities combined with routing issues can " + "allow attackers to bypass security controls such as " + "authentication checks, rate limiting, CSRF protection, or " + "input validation by crafting requests that skip middleware " + "processing." 
+ ), + }, + { + "a": "services", + "b": "api", + "risk_type": "ssrf_injection", + "severity": "high", + "description": ( + "Service-layer vulnerabilities combined with API issues can " + "enable SSRF or injection attacks where user-controlled input " + "reaches backend services, internal APIs, or external " + "integrations without proper validation or sanitization." + ), + }, +] + + +class CrossComponentAnalyzer: + """Analyzes findings across application component boundaries. + + Groups findings by their originating component (classified by + directory path) and checks for dangerous cross-component + combinations that amplify risk. + + Args: + project_path: Root path of the project under analysis. + """ + + def __init__(self, project_path: str): + self.project_path = project_path + + # -- public API --------------------------------------------------------- + + def analyze(self, findings: list[dict]) -> list[dict]: + """Analyze findings for dangerous cross-component combinations. + + Args: + findings: Security findings, each with at least ``id`` and + ``file``/``path`` keys. + + Returns: + List of risk dicts, one per dangerous combination detected. + Each dict contains ``component_a``, ``component_b``, + ``risk_type``, ``severity``, ``description``, + ``findings_a``, and ``findings_b``. 
+ """ + if not findings: + return [] + + # Group findings by component + component_findings: dict[str, list[dict]] = defaultdict(list) + for f in findings: + filepath = f.get("file", f.get("path", "")) + component = self._classify_component(filepath) + component_findings[component].append(f) + + components = list(component_findings.keys()) + logger.info( + "Cross-component analysis: %d findings across %d components (%s)", + len(findings), + len(components), + ", ".join(sorted(components)), + ) + + # Check every pair for dangerous combinations + risks: list[dict] = [] + for i, comp_a in enumerate(components): + for comp_b in components[i + 1 :]: + pair_risks = self._check_dangerous_combinations( + component_findings[comp_a], + component_findings[comp_b], + comp_a, + comp_b, + ) + risks.extend(pair_risks) + + if risks: + logger.info( + "Found %d cross-component risks", len(risks) + ) + else: + logger.info("No dangerous cross-component combinations detected") + + return risks + + # -- component classification ------------------------------------------- + + def _classify_component(self, file_path: str) -> str: + """Map a file path to a component name based on directory names. + + Scans all path parts for known component directory names. Falls + back to ``"other"`` when no match is found. + + Args: + file_path: Relative or absolute file path. + + Returns: + Component name string (e.g. ``"api"``, ``"auth"``). 
+        """
+        if not file_path:
+            return "other"
+
+        # Normalise separators, then split into lowercase parts for matching
+        parts = Path(file_path.replace("\\", "/")).parts
+        lower_parts = [p.lower() for p in parts]
+
+        for component, dir_names in _COMPONENT_DIRS.items():
+            for dir_name in dir_names:
+                if dir_name in lower_parts:
+                    return component
+
+        return "other"
+
+    # -- dangerous combination checks ---------------------------------------
+
+    def _check_dangerous_combinations(
+        self,
+        comp_a_findings: list[dict],
+        comp_b_findings: list[dict],
+        comp_a: str,
+        comp_b: str,
+    ) -> list[dict]:
+        """Check a component pair for predefined dangerous combinations.
+
+        The check is order-independent: ``(auth, api)`` matches the same
+        rule as ``(api, auth)``.
+
+        Args:
+            comp_a_findings: Findings from the first component.
+            comp_b_findings: Findings from the second component.
+            comp_a: Name of the first component.
+            comp_b: Name of the second component.
+
+        Returns:
+            List of risk dicts with ``component_a``, ``component_b``,
+            ``risk_type``, ``severity``, ``description``,
+            ``findings_a``, and ``findings_b`` keys.
+ """ + risks: list[dict] = [] + + def _extract_ids(flist: list[dict]) -> list[str]: + return [ + str(f.get("id", f.get("rule_id", "unknown"))) + for f in flist + ] + + for pair in _DANGEROUS_PAIRS: + pair_set = {pair["a"], pair["b"]} + query_set = {comp_a, comp_b} + + if pair_set == query_set: + # Determine which actual component maps to which pair role + if comp_a == pair["a"]: + fa, fb = comp_a_findings, comp_b_findings + ca, cb = comp_a, comp_b + else: + fa, fb = comp_b_findings, comp_a_findings + ca, cb = comp_b, comp_a + + risk = CrossComponentRisk( + component_a=ca, + component_b=cb, + findings_a=_extract_ids(fa), + findings_b=_extract_ids(fb), + risk_type=pair["risk_type"], + severity=pair["severity"], + description=pair["description"], + ) + + risks.append(asdict(risk)) + + return risks + + +# --------------------------------------------------------------------------- +# __main__ +# --------------------------------------------------------------------------- + +if __name__ == "__main__": + print( + "Argus Security - LLM-Powered Vulnerability Chain Discovery\n" + "============================================================\n" + "\n" + "Usage (AgentChainDiscovery):\n" + "\n" + " from agent_chain_discovery import AgentChainDiscovery\n" + "\n" + " def my_llm_call(prompt: str) -> str:\n" + ' """Wrapper around your LLM provider."""\n' + " return llm_manager.call_llm_api(prompt)\n" + "\n" + " discoverer = AgentChainDiscovery(llm_call=my_llm_call)\n" + " chains = discoverer.discover_chains(findings, app_context={\n" + ' "framework": "Django",\n' + ' "auth_model": "session-based",\n' + " })\n" + "\n" + " for chain in chains:\n" + " print(chain.to_dict())\n" + "\n" + "Usage (CrossComponentAnalyzer):\n" + "\n" + " from agent_chain_discovery import CrossComponentAnalyzer\n" + "\n" + ' analyzer = CrossComponentAnalyzer(project_path="/path/to/project")\n' + " risks = analyzer.analyze(findings)\n" + "\n" + " for risk in risks:\n" + ' print(f"{risk[\'risk_type\']}: 
{risk[\'component_a\']} + {risk[\'component_b\']}")\n' + "\n" + "This module complements vulnerability_chaining_engine.py (14 static\n" + "rules) with LLM-powered reasoning to discover novel attack paths\n" + "that static rule sets cannot anticipate.\n" + ) diff --git a/scripts/app_context_builder.py b/scripts/app_context_builder.py new file mode 100644 index 0000000..6117ed8 --- /dev/null +++ b/scripts/app_context_builder.py @@ -0,0 +1,808 @@ +#!/usr/bin/env python3 +""" +Application Context Builder for Argus Security Pipeline. + +Builds a unified application context model by inspecting the target project's +file structure, dependency manifests, import patterns, and infrastructure +configuration. The resulting ``ApplicationContext`` is consumed by all six +pipeline phases so that scanners, AI enrichment, and policy gates can tailor +their behaviour to the specific technology stack. + +Detection methods are designed for speed: file-count caps, early exits, and +glob-based discovery keep wall-clock time under a few hundred milliseconds +even on large mono-repos. 
+ +Usage: + builder = AppContextBuilder("/path/to/project") + ctx = builder.build() + print(ctx.to_prompt_context()) +""" + +from __future__ import annotations + +import glob +import json +import logging +import os +import re +from dataclasses import dataclass, field +from pathlib import Path + +__all__ = ["ApplicationContext", "AppContextBuilder"] + +logger = logging.getLogger(__name__) + +# --------------------------------------------------------------------------- +# Limits – keep detection fast on large repositories +# --------------------------------------------------------------------------- + +_MAX_FILES_FOR_IMPORTS = 100 +_MAX_FILES_FOR_AUTH = 200 + +# --------------------------------------------------------------------------- +# File-extension-to-language mapping +# --------------------------------------------------------------------------- + +_EXTENSION_LANGUAGE: dict[str, str] = { + ".py": "python", + ".js": "javascript", + ".ts": "typescript", + ".jsx": "javascript", + ".tsx": "typescript", + ".java": "java", + ".go": "go", + ".rb": "ruby", + ".php": "php", + ".cs": "csharp", + ".rs": "rust", +} + +# --------------------------------------------------------------------------- +# Data class +# --------------------------------------------------------------------------- + + +@dataclass +class ApplicationContext: + """Unified application context fed to all pipeline phases.""" + + # Code structure + framework: str = "unknown" + language: str = "unknown" + entry_points: list[str] = field(default_factory=list) + auth_mechanism: str = "unknown" + + # API surface + api_endpoints: list[dict] = field(default_factory=list) + middleware_chain: list[str] = field(default_factory=list) + + # Infrastructure + cloud_provider: str = "none" + iac_files: list[str] = field(default_factory=list) + has_dockerfile: bool = False + has_k8s: bool = False + + # Dependencies + dependency_files: list[str] = field(default_factory=list) + + # DAST context + deployment_url: str | None = 
None + openapi_spec_path: str | None = None + + def to_dict(self) -> dict: + """Serialise the context to a plain dictionary.""" + return { + "framework": self.framework, + "language": self.language, + "entry_points": self.entry_points, + "auth_mechanism": self.auth_mechanism, + "api_endpoints": self.api_endpoints, + "middleware_chain": self.middleware_chain, + "cloud_provider": self.cloud_provider, + "iac_files": self.iac_files, + "has_dockerfile": self.has_dockerfile, + "has_k8s": self.has_k8s, + "dependency_files": self.dependency_files, + "deployment_url": self.deployment_url, + "openapi_spec_path": self.openapi_spec_path, + } + + def to_prompt_context(self) -> str: + """Format as a string suitable for LLM prompt injection.""" + lines = [ + "Application Context:", + f"- Language: {self.language}", + f"- Framework: {self.framework}", + f"- Auth mechanism: {self.auth_mechanism}", + f"- Cloud: {self.cloud_provider}", + f"- Entry points: {len(self.entry_points)} files", + ] + + # Summarise IaC files by type rather than listing every path. 
+ if self.iac_files: + tf_count = sum(1 for f in self.iac_files if f.endswith((".tf", ".tfvars"))) + docker_count = sum( + 1 + for f in self.iac_files + if "dockerfile" in os.path.basename(f).lower() + or f.endswith((".dockerfile",)) + ) + compose_count = sum( + 1 for f in self.iac_files if "docker-compose" in os.path.basename(f).lower() + ) + k8s_count = sum( + 1 + for f in self.iac_files + if f.endswith((".yml", ".yaml")) + and "docker-compose" not in os.path.basename(f).lower() + ) + parts: list[str] = [] + if tf_count: + parts.append(f"{tf_count} terraform") + if docker_count: + parts.append(f"{docker_count} dockerfile") + if compose_count: + parts.append(f"{compose_count} compose") + if k8s_count: + parts.append(f"{k8s_count} k8s/helm") + label = ", ".join(parts) if parts else f"{len(self.iac_files)} files" + lines.append(f"- IaC: {label}") + else: + lines.append("- IaC: none") + + if self.middleware_chain: + lines.append(f"- Middleware: {', '.join(self.middleware_chain)}") + + lines.append(f"- Has Dockerfile: {'yes' if self.has_dockerfile else 'no'}") + lines.append(f"- Has Kubernetes: {'yes' if self.has_k8s else 'no'}") + + if self.openapi_spec_path: + lines.append(f"- OpenAPI spec: {self.openapi_spec_path}") + + if self.dependency_files: + lines.append(f"- Dependency files: {', '.join(self.dependency_files)}") + + if self.api_endpoints: + lines.append(f"- API endpoints detected: {len(self.api_endpoints)}") + + return "\n".join(lines) + + +# --------------------------------------------------------------------------- +# Builder +# --------------------------------------------------------------------------- + + +class AppContextBuilder: + """Inspect a project directory and assemble an ``ApplicationContext``. + + All detection methods are intentionally capped (file count limits, early + exits) so that even very large repositories are processed quickly. 
+ """ + + def __init__(self, project_path: str) -> None: + self._root = Path(project_path).resolve() + logger.debug("AppContextBuilder initialised for %s", self._root) + + # ------------------------------------------------------------------ + # Public + # ------------------------------------------------------------------ + + def build(self) -> ApplicationContext: + """Orchestrate all detection methods and return the assembled context.""" + logger.info("Building application context for %s", self._root) + + language = self._detect_language() + framework = self._detect_framework(language) + iac_files = self._find_iac_files() + has_dockerfile = any( + "dockerfile" in os.path.basename(f).lower() for f in iac_files + ) + + ctx = ApplicationContext( + language=language, + framework=framework, + entry_points=self._find_entry_points(), + auth_mechanism=self._detect_auth(), + middleware_chain=self._detect_middleware(), + cloud_provider=self._detect_cloud(), + iac_files=iac_files, + has_dockerfile=has_dockerfile, + has_k8s=self._has_k8s(), + dependency_files=self._find_dependency_files(), + openapi_spec_path=self._find_openapi_spec(), + ) + + logger.info( + "Context built: language=%s framework=%s auth=%s cloud=%s", + ctx.language, + ctx.framework, + ctx.auth_mechanism, + ctx.cloud_provider, + ) + return ctx + + # ------------------------------------------------------------------ + # Language detection + # ------------------------------------------------------------------ + + def _detect_language(self) -> str: + """Count source files by extension and return the dominant language.""" + counts: dict[str, int] = {} + for ext, lang in _EXTENSION_LANGUAGE.items(): + pattern = os.path.join(str(self._root), "**", f"*{ext}") + # Use glob.iglob to avoid materialising huge lists; cap at a + # reasonable number so we don't spend minutes on mono-repos. 
+ count = 0 + for _ in glob.iglob(pattern, recursive=True): + count += 1 + if count >= 5000: + break + if count: + counts[lang] = counts.get(lang, 0) + count + + if not counts: + logger.debug("No recognised source files found") + return "unknown" + + dominant = max(counts, key=lambda k: counts[k]) + logger.debug("Language counts: %s -> dominant=%s", counts, dominant) + return dominant + + # ------------------------------------------------------------------ + # Framework detection + # ------------------------------------------------------------------ + + def _detect_framework(self, language: str) -> str: + """Check for framework indicators based on the detected language.""" + detectors: dict[str, callable] = { + "python": self._detect_python_framework, + "javascript": self._detect_js_framework, + "typescript": self._detect_js_framework, + "java": self._detect_java_framework, + "go": self._detect_go_framework, + "ruby": self._detect_ruby_framework, + } + detector = detectors.get(language) + if detector: + result = detector() + if result != "unknown": + return result + + # Fallback: run all detectors for polyglot repos. + for lang, det in detectors.items(): + if lang == language: + continue + result = det() + if result != "unknown": + return result + + return "unknown" + + def _detect_python_framework(self) -> str: + """Detect Python web frameworks.""" + # Django – presence of manage.py is a strong signal. + if (self._root / "manage.py").is_file(): + logger.debug("Detected Django (manage.py)") + return "django" + + # Scan a limited set of .py files for import patterns. 
+ py_files = self._collect_source_files("*.py", limit=_MAX_FILES_FOR_IMPORTS) + for fpath in py_files: + try: + content = fpath.read_text(errors="replace") + except OSError: + continue + if re.search(r"\bfrom\s+fastapi\b|\bimport\s+fastapi\b", content): + logger.debug("Detected FastAPI in %s", fpath) + return "fastapi" + if re.search(r"\bfrom\s+flask\b|\bimport\s+flask\b", content, re.IGNORECASE): + logger.debug("Detected Flask in %s", fpath) + return "flask" + + return "unknown" + + def _detect_js_framework(self) -> str: + """Detect JavaScript / TypeScript frameworks.""" + # File-based indicators (fast). + if any( + (self._root / name).is_file() + for name in ("next.config.js", "next.config.mjs", "next.config.ts") + ): + return "nextjs" + if any( + (self._root / name).is_file() + for name in ("nuxt.config.js", "nuxt.config.ts") + ): + return "nuxt" + if (self._root / "angular.json").is_file(): + return "angular" + + # package.json dependency check. + pkg_json = self._root / "package.json" + if pkg_json.is_file(): + try: + data = json.loads(pkg_json.read_text(errors="replace")) + except (json.JSONDecodeError, OSError): + data = {} + all_deps = { + **data.get("dependencies", {}), + **data.get("devDependencies", {}), + } + if "express" in all_deps: + return "express" + if "koa" in all_deps: + return "koa" + if "hapi" in all_deps or "@hapi/hapi" in all_deps: + return "hapi" + if "next" in all_deps: + return "nextjs" + if "nuxt" in all_deps: + return "nuxt" + + return "unknown" + + def _detect_java_framework(self) -> str: + """Detect Java frameworks via build files.""" + for build_file in ("pom.xml", "build.gradle", "build.gradle.kts"): + path = self._root / build_file + if path.is_file(): + try: + content = path.read_text(errors="replace") + except OSError: + continue + if "spring-boot" in content or "spring" in content.lower(): + return "spring" + + return "unknown" + + def _detect_go_framework(self) -> str: + """Detect Go web frameworks via go.mod.""" + go_mod = 
self._root / "go.mod" + if not go_mod.is_file(): + return "unknown" + try: + content = go_mod.read_text(errors="replace") + except OSError: + return "unknown" + + framework_patterns = { + "gin": r"github\.com/gin-gonic/gin", + "echo": r"github\.com/labstack/echo", + "fiber": r"github\.com/gofiber/fiber", + } + for name, pattern in framework_patterns.items(): + if re.search(pattern, content): + return name + + return "unknown" + + def _detect_ruby_framework(self) -> str: + """Detect Ruby frameworks via Gemfile.""" + gemfile = self._root / "Gemfile" + if not gemfile.is_file(): + return "unknown" + try: + content = gemfile.read_text(errors="replace") + except OSError: + return "unknown" + + if re.search(r"""gem\s+['"]rails['"]""", content): + return "rails" + if re.search(r"""gem\s+['"]sinatra['"]""", content): + return "sinatra" + + return "unknown" + + # ------------------------------------------------------------------ + # Auth detection + # ------------------------------------------------------------------ + + def _detect_auth(self) -> str: + """Detect the authentication mechanism used in the project.""" + # Check dependency manifests first (fast path). + auth = self._detect_auth_from_deps() + if auth != "unknown": + return auth + + # Fall back to scanning source files for import / usage patterns. 
+ extensions = ("*.py", "*.js", "*.ts", "*.java", "*.go", "*.rb", "*.php") + files: list[Path] = [] + for ext in extensions: + files.extend(self._collect_source_files(ext, limit=_MAX_FILES_FOR_AUTH // len(extensions))) + if len(files) >= _MAX_FILES_FOR_AUTH: + break + + jwt_re = re.compile(r"\bjwt\b|\bjsonwebtoken\b|\bPyJWT\b", re.IGNORECASE) + oauth_re = re.compile(r"\bpassport\b|\boauth\b|\bauth0\b|\boauth2\b", re.IGNORECASE) + session_re = re.compile( + r"\bexpress-session\b|\bflask[_-]session\b|\bsession\s*middleware\b", + re.IGNORECASE, + ) + apikey_re = re.compile(r"\bapikey\b|\bapi_key\b|\bx-api-key\b", re.IGNORECASE) + basic_re = re.compile(r"\bBasicAuth\b|\bbasic_auth\b|\bBasicAuthentication\b", re.IGNORECASE) + + for fpath in files: + try: + content = fpath.read_text(errors="replace") + except OSError: + continue + + if jwt_re.search(content): + return "jwt" + if oauth_re.search(content): + return "oauth2" + if session_re.search(content): + return "session" + if apikey_re.search(content): + return "api_key" + if basic_re.search(content): + return "basic" + + return "unknown" + + def _detect_auth_from_deps(self) -> str: + """Quick check for auth libraries in dependency manifests.""" + # package.json + pkg_json = self._root / "package.json" + if pkg_json.is_file(): + try: + data = json.loads(pkg_json.read_text(errors="replace")) + except (json.JSONDecodeError, OSError): + data = {} + all_deps = " ".join( + list(data.get("dependencies", {}).keys()) + + list(data.get("devDependencies", {}).keys()) + ) + if "jsonwebtoken" in all_deps or "jose" in all_deps: + return "jwt" + if "passport" in all_deps or "auth0" in all_deps: + return "oauth2" + if "express-session" in all_deps: + return "session" + + # requirements.txt / Pipfile + for req_file in ("requirements.txt", "Pipfile"): + path = self._root / req_file + if path.is_file(): + try: + content = path.read_text(errors="replace").lower() + except OSError: + continue + if "pyjwt" in content or "python-jose" in 
content: + return "jwt" + if "authlib" in content or "auth0" in content: + return "oauth2" + if "flask-session" in content or "django-session" in content: + return "session" + + # go.mod + go_mod = self._root / "go.mod" + if go_mod.is_file(): + try: + content = go_mod.read_text(errors="replace").lower() + except OSError: + content = "" + if "golang-jwt" in content or "jwt-go" in content: + return "jwt" + if "oauth2" in content: + return "oauth2" + + return "unknown" + + # ------------------------------------------------------------------ + # Infrastructure detection + # ------------------------------------------------------------------ + + def _detect_cloud(self) -> str: + """Detect the primary cloud provider from dependency files.""" + indicators: list[tuple[str, str]] = [] + + # package.json + pkg_json = self._root / "package.json" + if pkg_json.is_file(): + try: + data = json.loads(pkg_json.read_text(errors="replace")) + except (json.JSONDecodeError, OSError): + data = {} + deps_str = " ".join( + list(data.get("dependencies", {}).keys()) + + list(data.get("devDependencies", {}).keys()) + ) + if "aws-sdk" in deps_str or "@aws-sdk" in deps_str: + indicators.append(("aws", deps_str)) + if "@google-cloud" in deps_str: + indicators.append(("gcp", deps_str)) + if "@azure" in deps_str: + indicators.append(("azure", deps_str)) + + # requirements.txt / Pipfile + for req_file in ("requirements.txt", "Pipfile"): + path = self._root / req_file + if path.is_file(): + try: + content = path.read_text(errors="replace").lower() + except OSError: + continue + if "boto3" in content or "botocore" in content: + indicators.append(("aws", content)) + if "google-cloud" in content: + indicators.append(("gcp", content)) + if "azure" in content: + indicators.append(("azure", content)) + + # go.mod + go_mod = self._root / "go.mod" + if go_mod.is_file(): + try: + content = go_mod.read_text(errors="replace").lower() + except OSError: + content = "" + if "aws-sdk-go" in content: + 
indicators.append(("aws", content)) + if "cloud.google.com" in content: + indicators.append(("gcp", content)) + if "azure-sdk" in content: + indicators.append(("azure", content)) + + # pom.xml / build.gradle + for build_file in ("pom.xml", "build.gradle", "build.gradle.kts"): + path = self._root / build_file + if path.is_file(): + try: + content = path.read_text(errors="replace").lower() + except OSError: + continue + if "aws" in content or "amazonaws" in content: + indicators.append(("aws", content)) + if "google-cloud" in content or "gcloud" in content: + indicators.append(("gcp", content)) + if "azure" in content: + indicators.append(("azure", content)) + + if not indicators: + return "none" + + # Return the most frequently signalled provider. + provider_counts: dict[str, int] = {} + for provider, _ in indicators: + provider_counts[provider] = provider_counts.get(provider, 0) + 1 + return max(provider_counts, key=lambda k: provider_counts[k]) + + def _find_iac_files(self) -> list[str]: + """Glob for infrastructure-as-code and container configuration files.""" + patterns = [ + "**/*.tf", + "**/*.tfvars", + "k8s/**/*.yml", + "k8s/**/*.yaml", + "kubernetes/**/*.yml", + "kubernetes/**/*.yaml", + "helm/**/*.yaml", + "helm/**/*.yml", + "**/*.dockerfile", + "**/Dockerfile", + "**/Dockerfile.*", + "**/docker-compose*.yml", + "**/docker-compose*.yaml", + ] + results: list[str] = [] + seen: set[str] = set() + for pattern in patterns: + full_pattern = os.path.join(str(self._root), pattern) + for match in glob.iglob(full_pattern, recursive=True): + real = os.path.realpath(match) + if real not in seen: + seen.add(real) + results.append(os.path.relpath(match, self._root)) + return sorted(results) + + def _has_k8s(self) -> bool: + """Check whether the project contains Kubernetes manifests.""" + # Dedicated k8s / kubernetes directories. 
+ for dir_name in ("k8s", "kubernetes", "helm"): + if (self._root / dir_name).is_dir(): + return True + + # Look for k8s-indicative YAML content in .yml/.yaml at the root or + # in common sub-directories. Cap the search to stay fast. + yaml_patterns = ["*.yml", "*.yaml", "deploy/**/*.yml", "deploy/**/*.yaml"] + checked = 0 + for pattern in yaml_patterns: + full = os.path.join(str(self._root), pattern) + for match in glob.iglob(full, recursive=True): + try: + head = Path(match).read_text(errors="replace")[:2048] + except OSError: + continue + if re.search(r"apiVersion:\s|kind:\s+(Deployment|Service|Pod|StatefulSet|Ingress)", head): + return True + checked += 1 + if checked >= 50: + return False + return False + + def _find_openapi_spec(self) -> str | None: + """Look for OpenAPI / Swagger specification files.""" + candidates = [ + "openapi.json", + "openapi.yaml", + "openapi.yml", + "swagger.json", + "swagger.yaml", + "swagger.yml", + "api-spec.json", + "api-spec.yaml", + "api-spec.yml", + ] + search_dirs = [".", "docs", "api"] + for search_dir in search_dirs: + for candidate in candidates: + path = self._root / search_dir / candidate + if path.is_file(): + return os.path.relpath(str(path), self._root) + return None + + # ------------------------------------------------------------------ + # Entry points + # ------------------------------------------------------------------ + + def _find_entry_points(self) -> list[str]: + """Find main entry files and route definition modules.""" + entry_names = { + "main.py", + "app.py", + "server.py", + "wsgi.py", + "asgi.py", + "manage.py", + "index.js", + "index.ts", + "server.js", + "server.ts", + "app.js", + "app.ts", + "main.go", + "cmd/main.go", + "Application.java", + } + found: list[str] = [] + + # Direct name matches in project root and one level down. + for name in entry_names: + full = self._root / name + if full.is_file(): + found.append(os.path.relpath(str(full), self._root)) + + # Route / controller directories. 
+ route_patterns = [ + "routes/**/*.py", + "routes/**/*.js", + "routes/**/*.ts", + "controllers/**/*.py", + "controllers/**/*.js", + "controllers/**/*.ts", + "controllers/**/*.java", + "**/urls.py", + "**/router.py", + "**/router.js", + "**/router.ts", + ] + for pattern in route_patterns: + full_pattern = os.path.join(str(self._root), pattern) + for match in glob.iglob(full_pattern, recursive=True): + rel = os.path.relpath(match, self._root) + if rel not in found: + found.append(rel) + + return sorted(found) + + # ------------------------------------------------------------------ + # Middleware detection + # ------------------------------------------------------------------ + + def _detect_middleware(self) -> list[str]: + """Detect common middleware usage across all recognised languages.""" + middleware_patterns: dict[str, re.Pattern] = { + "cors": re.compile(r"\bcors\b|\bCORS\b|\baccess-control-allow-origin\b", re.IGNORECASE), + "rate_limiting": re.compile( + r"\brate.?limit\b|\bthrottle\b|\bRateLimit\b", re.IGNORECASE + ), + "auth": re.compile( + r"\bauth.?middleware\b|\bauthenticate\b|\bisAuthenticated\b", re.IGNORECASE + ), + "helmet": re.compile(r"\bhelmet\b", re.IGNORECASE), + "csrf": re.compile(r"\bcsrf\b|\bcsurf\b|\bCSRFMiddleware\b", re.IGNORECASE), + "logging": re.compile( + r"\blogging.?middleware\b|\bmorgan\b|\brequest.?log\b", re.IGNORECASE + ), + "compression": re.compile(r"\bcompression\b|\bgzip\b|\bGZipMiddleware\b", re.IGNORECASE), + "body_parser": re.compile(r"\bbody-parser\b|\bbodyParser\b", re.IGNORECASE), + } + + detected: set[str] = set() + extensions = ("*.py", "*.js", "*.ts", "*.java", "*.go", "*.rb") + files: list[Path] = [] + for ext in extensions: + files.extend(self._collect_source_files(ext, limit=_MAX_FILES_FOR_IMPORTS // len(extensions))) + + for fpath in files: + try: + content = fpath.read_text(errors="replace") + except OSError: + continue + for name, pattern in middleware_patterns.items(): + if name not in detected and 
pattern.search(content):
+                    detected.add(name)
+            # Stop reading further files once every middleware pattern matched.
+            if len(detected) == len(middleware_patterns):
+                break
+
+        return sorted(detected)
+
+    # ------------------------------------------------------------------
+    # Dependency files
+    # ------------------------------------------------------------------
+
+    def _find_dependency_files(self) -> list[str]:
+        """Find dependency manifest files at the project root."""
+        candidates = [
+            "requirements.txt",
+            "Pipfile",
+            "pyproject.toml",
+            "setup.py",
+            "package.json",
+            "yarn.lock",
+            "pnpm-lock.yaml",
+            "pom.xml",
+            "build.gradle",
+            "build.gradle.kts",
+            "go.mod",
+            "Gemfile",
+            "composer.json",
+            "Cargo.toml",
+        ]
+        found: list[str] = []
+        for name in candidates:
+            if (self._root / name).is_file():
+                found.append(name)
+        return found
+
+    # ------------------------------------------------------------------
+    # Helpers
+    # ------------------------------------------------------------------
+
+    def _collect_source_files(self, pattern: str, limit: int) -> list[Path]:
+        """Collect source files matching *pattern*, skipping common vendor dirs.
+
+        Returns at most *limit* paths.  Common vendor and build directories
+        (``node_modules``, ``vendor``, ``venv``, ``.venv``, ``__pycache__``,
+        ``.git``, ``dist``, ``build``) are excluded to avoid noise.
+ """ + full_pattern = os.path.join(str(self._root), "**", pattern) + skip_dirs = {"node_modules", "vendor", "venv", ".venv", "__pycache__", ".git", "dist", "build"} + results: list[Path] = [] + for match in glob.iglob(full_pattern, recursive=True): + parts = Path(match).relative_to(self._root).parts + if any(p in skip_dirs for p in parts): + continue + results.append(Path(match)) + if len(results) >= limit: + break + return results + + +# --------------------------------------------------------------------------- +# CLI entry point +# --------------------------------------------------------------------------- + +if __name__ == "__main__": + import sys + + logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s") + + path = sys.argv[1] if len(sys.argv) > 1 else "." + builder = AppContextBuilder(path) + ctx = builder.build() + print(ctx.to_prompt_context()) + print() + print(json.dumps(ctx.to_dict(), indent=2)) diff --git a/scripts/autofix_pr_generator.py b/scripts/autofix_pr_generator.py new file mode 100644 index 0000000..7763f00 --- /dev/null +++ b/scripts/autofix_pr_generator.py @@ -0,0 +1,921 @@ +#!/usr/bin/env python3 +""" +AutoFix PR Generator for Argus Security + +Generates merge-ready pull requests from remediation suggestions and orchestrates +a closed-loop find-fix-verify cycle. Integrates with the remediation engine to +automatically create git branches, apply code fixes, and produce PR metadata. 
+ +Features: +- Git branch creation and fix application (diff or full-file replacement) +- Descriptive commit messages with vulnerability context +- Formatted PR body generation with vulnerability details +- Closed-loop orchestration: find -> fix -> verify -> PR +- Confidence-based filtering for safe auto-fix deployment +- Batch processing with aggregated results + +Usage: + from autofix_pr_generator import AutoFixPRGenerator, ClosedLoopOrchestrator + + # Generate a single fix PR + generator = AutoFixPRGenerator(project_path="/path/to/repo") + fix_pr = generator.create_fix_pr(suggestion) + + # Run the full closed-loop + orchestrator = ClosedLoopOrchestrator(project_path="/path/to/repo") + result = orchestrator.run_loop(findings) +""" + +from __future__ import annotations + +import logging +import os +import subprocess +import tempfile +from dataclasses import asdict, dataclass, field +from pathlib import Path + +logger = logging.getLogger(__name__) + + +# --------------------------------------------------------------------------- +# Dataclasses +# --------------------------------------------------------------------------- + + +@dataclass +class FixBranch: + """Result of creating a git branch and applying a fix. + + Attributes: + branch_name: Name of the created git branch. + finding_id: Unique identifier of the finding being fixed. + vulnerability_type: Category of vulnerability (e.g. "sql_injection"). + file_path: Path to the file that was modified. + commit_sha: SHA of the commit containing the fix, or None on failure. + success: Whether the branch was created and fix committed. + error: Error message if the operation failed. + """ + + branch_name: str + finding_id: str + vulnerability_type: str + file_path: str + commit_sha: str | None + success: bool + error: str | None = None + + +@dataclass +class FixPR: + """Metadata for a generated pull request. + + Attributes: + branch_name: Name of the git branch containing the fix. 
+ finding_id: Unique identifier of the finding being fixed. + vulnerability_type: Category of vulnerability. + file_path: Path to the file that was modified. + title: Suggested PR title. + body: Formatted PR body in Markdown. + commit_sha: SHA of the fix commit, or None on failure. + pushed: Whether the branch was pushed to a remote. + success: Whether the PR was successfully prepared. + error: Error message if the operation failed. + """ + + branch_name: str + finding_id: str + vulnerability_type: str + file_path: str + title: str + body: str + commit_sha: str | None + pushed: bool + success: bool + error: str | None = None + + def to_dict(self) -> dict: + """Convert to dictionary for JSON serialization.""" + return asdict(self) + + +@dataclass +class LoopResult: + """Aggregated results from a closed-loop orchestration run. + + Attributes: + total_findings: Total number of findings evaluated. + fixable: Number of findings deemed fixable. + fixed: List of successfully generated FixPR objects. + skipped_low_confidence: Finding IDs skipped due to low confidence. + failed: List of dicts with finding_id and error for failures. 
+ """ + + total_findings: int + fixable: int + fixed: list[FixPR] = field(default_factory=list) + skipped_low_confidence: list[str] = field(default_factory=list) + failed: list[dict] = field(default_factory=list) + + @property + def success_rate(self) -> float: + """Fraction of fixable findings that were successfully fixed.""" + return len(self.fixed) / self.fixable if self.fixable > 0 else 0.0 + + def to_dict(self) -> dict: + """Convert to dictionary for JSON serialization.""" + data = { + "total_findings": self.total_findings, + "fixable": self.fixable, + "fixed": [pr.to_dict() for pr in self.fixed], + "skipped_low_confidence": self.skipped_low_confidence, + "failed": self.failed, + "success_rate": self.success_rate, + } + return data + + +# --------------------------------------------------------------------------- +# AutoFixPRGenerator +# --------------------------------------------------------------------------- + + +class AutoFixPRGenerator: + """Generates merge-ready PRs from remediation suggestions. + + Creates git branches, applies code fixes (via diff or file replacement), + commits changes with descriptive messages, and produces formatted PR bodies. + All git operations are performed via subprocess calls. + """ + + def __init__(self, project_path: str, base_branch: str = "main"): + """Initialize the PR generator. + + Args: + project_path: Absolute path to the git repository root. + base_branch: Name of the base branch to create fix branches from. + """ + self.project_path = os.path.abspath(project_path) + self.base_branch = base_branch + + def _run_git(self, *args: str, check: bool = True) -> subprocess.CompletedProcess: + """Run a git command inside the project directory. + + Args: + *args: Arguments to pass after ``git``. + check: Whether to raise on non-zero exit code. + + Returns: + CompletedProcess instance with captured output. 
+ """ + cmd = ["git", *args] + logger.debug("Running: %s", " ".join(cmd)) + result = subprocess.run( + cmd, + cwd=self.project_path, + capture_output=True, + text=True, + check=check, + ) + return result + + # -- Branch + fix application ------------------------------------------------ + + def create_fix_branch(self, suggestion: dict) -> FixBranch: + """Create a git branch with the applied fix and commit it. + + Checks out a new branch from ``base_branch``, applies the fix using + ``_apply_fix``, stages the changed file, and creates a commit with a + descriptive message. + + Args: + suggestion: Dict with keys ``finding_id``, ``vulnerability_type``, + ``file_path``, and either ``diff`` or ``fixed_code``. + + Returns: + FixBranch describing the outcome. + """ + finding_id = suggestion.get("finding_id", "unknown") + vuln_type = suggestion.get("vulnerability_type", "unknown") + file_path = suggestion.get("file_path", "") + short_id = finding_id[:8] if finding_id else "unknown" + + # Sanitise vuln_type for branch name (lowercase, replace non-alnum) + safe_vuln = vuln_type.lower().replace(" ", "-") + safe_vuln = "".join(c if c.isalnum() or c == "-" else "-" for c in safe_vuln) + branch_name = f"argus/fix-{safe_vuln}-{short_id}" + + logger.info( + "Creating fix branch %s for %s in %s", + branch_name, + finding_id, + file_path, + ) + + try: + # Ensure we start from the base branch + self._run_git("checkout", self.base_branch) + self._run_git("checkout", "-b", branch_name) + + # Apply the fix + if not self._apply_fix(suggestion): + error_msg = "Failed to apply fix" + logger.error(error_msg) + # Return to base branch on failure + self._run_git("checkout", self.base_branch, check=False) + self._run_git("branch", "-D", branch_name, check=False) + return FixBranch( + branch_name=branch_name, + finding_id=finding_id, + vulnerability_type=vuln_type, + file_path=file_path, + commit_sha=None, + success=False, + error=error_msg, + ) + + # Stage and commit + self._run_git("add", 
file_path) + commit_msg = self._generate_commit_message(suggestion) + self._run_git("commit", "-m", commit_msg) + + # Retrieve commit SHA + sha_result = self._run_git("rev-parse", "HEAD") + commit_sha = sha_result.stdout.strip() + + logger.info("Fix committed as %s on branch %s", commit_sha, branch_name) + + return FixBranch( + branch_name=branch_name, + finding_id=finding_id, + vulnerability_type=vuln_type, + file_path=file_path, + commit_sha=commit_sha, + success=True, + ) + + except subprocess.CalledProcessError as exc: + error_msg = f"Git operation failed: {exc.stderr.strip() or exc.stdout.strip()}" + logger.error(error_msg) + # Best-effort cleanup + self._run_git("checkout", self.base_branch, check=False) + self._run_git("branch", "-D", branch_name, check=False) + return FixBranch( + branch_name=branch_name, + finding_id=finding_id, + vulnerability_type=vuln_type, + file_path=file_path, + commit_sha=None, + success=False, + error=error_msg, + ) + except Exception as exc: + error_msg = f"Unexpected error: {exc}" + logger.exception(error_msg) + self._run_git("checkout", self.base_branch, check=False) + self._run_git("branch", "-D", branch_name, check=False) + return FixBranch( + branch_name=branch_name, + finding_id=finding_id, + vulnerability_type=vuln_type, + file_path=file_path, + commit_sha=None, + success=False, + error=error_msg, + ) + + def _apply_fix(self, suggestion: dict) -> bool: + """Apply a fix to the working tree. + + If ``suggestion['diff']`` is present, attempt to apply it via + ``git apply``. If that fails (or no diff is provided) and + ``suggestion['fixed_code']`` exists, overwrite the target file section. + + Args: + suggestion: Dict with ``file_path`` and either ``diff`` or + ``fixed_code`` (and optionally ``original_code``). + + Returns: + True if the fix was successfully applied, False otherwise. 
+ """ + file_path = suggestion.get("file_path", "") + diff_text = suggestion.get("diff", "") + fixed_code = suggestion.get("fixed_code", "") + original_code = suggestion.get("original_code", "") + + # Strategy 1: apply unified diff + if diff_text: + try: + with tempfile.NamedTemporaryFile( + mode="w", + suffix=".patch", + delete=False, + ) as patch_file: + patch_file.write(diff_text) + patch_path = patch_file.name + + result = self._run_git("apply", patch_path, check=False) + os.unlink(patch_path) + + if result.returncode == 0: + logger.info("Applied diff patch to %s", file_path) + return True + + logger.warning( + "git apply failed (rc=%d): %s", + result.returncode, + result.stderr.strip(), + ) + except Exception as exc: + logger.warning("Diff application error: %s", exc) + + # Strategy 2: overwrite file section with fixed_code + if fixed_code: + abs_path = os.path.join(self.project_path, file_path) + try: + if original_code and os.path.isfile(abs_path): + with open(abs_path, "r") as fh: + content = fh.read() + + if original_code in content: + content = content.replace(original_code, fixed_code, 1) + with open(abs_path, "w") as fh: + fh.write(content) + logger.info( + "Replaced vulnerable code section in %s", + file_path, + ) + return True + else: + logger.warning( + "Original code not found in %s, overwriting file", + file_path, + ) + + # Fallback: overwrite entire file content + Path(abs_path).parent.mkdir(parents=True, exist_ok=True) + with open(abs_path, "w") as fh: + fh.write(fixed_code) + logger.info("Wrote fixed code to %s", file_path) + return True + + except Exception as exc: + logger.error("Failed to write fixed code to %s: %s", file_path, exc) + return False + + logger.error("No diff or fixed_code provided in suggestion") + return False + + # -- Commit message ---------------------------------------------------------- + + def _generate_commit_message(self, suggestion: dict) -> str: + """Generate a descriptive commit message for the fix. 
+
+        Format::
+
+            fix(<vuln_type>): <short description>
+
+            Finding: <finding_id>
+            CWE: <cwe>
+            File: <file_path>:<line_number>
+
+            Generated by Argus Security AutoFix
+
+        Args:
+            suggestion: Dict with vulnerability metadata.
+
+        Returns:
+            Formatted commit message string.
+        """
+        vuln_type = suggestion.get("vulnerability_type", "unknown")
+        finding_id = suggestion.get("finding_id", "unknown")
+        cwe = suggestion.get("cwe", suggestion.get("cwe_references", "N/A"))
+        file_path = suggestion.get("file_path", "unknown")
+        line_number = suggestion.get("line_number", 0)
+        explanation = suggestion.get("explanation", "")
+
+        # Build a short description from the explanation
+        if explanation:
+            # First sentence, capped at 60 chars so the full subject stays short
+            short_desc = explanation.split(".")[0].strip()
+            if len(short_desc) > 60:
+                short_desc = short_desc[:57] + "..."
+        else:
+            short_desc = f"resolve {vuln_type} vulnerability"
+
+        # Format CWE if it is a list
+        if isinstance(cwe, list):
+            cwe = ", ".join(str(c) for c in cwe)
+
+        message = (
+            f"fix({vuln_type}): {short_desc}\n"
+            f"\n"
+            f"Finding: {finding_id}\n"
+            f"CWE: {cwe}\n"
+            f"File: {file_path}:{line_number}\n"
+            f"\n"
+            f"Generated by Argus Security AutoFix"
+        )
+        return message
+
+    # -- PR body -----------------------------------------------------------------
+
+    def generate_pr_body(
+        self,
+        suggestion: dict,
+        regression_test_path: str | None = None,
+    ) -> str:
+        """Generate a formatted pull request body in Markdown.
+
+        Includes a summary, vulnerability details, explanation of changes,
+        diff in a code block, testing recommendations, and a footer.
+
+        Args:
+            suggestion: Dict with vulnerability and fix metadata.
+            regression_test_path: Optional path to a generated regression test.
+
+        Returns:
+            Markdown-formatted PR body string.
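        The subject line assembled by the commit-message builder keeps the
        description short; a standalone sketch of that truncation rule (the
        sample explanation text is invented)::

```python
def short_description(explanation: str, vuln_type: str) -> str:
    # First sentence of the explanation, truncated at 60 chars (57 + "...")
    # so the full "fix(type): desc" subject line stays readable.
    if explanation:
        desc = explanation.split(".")[0].strip()
        return desc[:57] + "..." if len(desc) > 60 else desc
    return f"resolve {vuln_type} vulnerability"

print(short_description(
    "Use parameterised queries instead of string concatenation. See OWASP.",
    "sql-injection",
))
print(short_description("", "xss"))
```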
+ """ + vuln_type = suggestion.get("vulnerability_type", "unknown") + finding_id = suggestion.get("finding_id", "unknown") + cwe = suggestion.get("cwe", suggestion.get("cwe_references", "N/A")) + severity = suggestion.get("severity", suggestion.get("confidence", "unknown")) + file_path = suggestion.get("file_path", "unknown") + line_number = suggestion.get("line_number", 0) + explanation = suggestion.get("explanation", "No explanation provided.") + diff_text = suggestion.get("diff", "") + testing_recs = suggestion.get("testing_recommendations", []) + + # Format CWE if it is a list + if isinstance(cwe, list): + cwe = ", ".join(str(c) for c in cwe) + + lines = [ + "## Summary", + "", + f"Automated security fix for **{vuln_type}** vulnerability " + f"detected by Argus Security.", + "", + "## Vulnerability Details", + "", + f"| Field | Value |", + f"|-------|-------|", + f"| **Type** | {vuln_type} |", + f"| **CWE** | {cwe} |", + f"| **Severity** | {severity} |", + f"| **File** | `{file_path}` |", + f"| **Line** | {line_number} |", + f"| **Finding ID** | `{finding_id}` |", + "", + "## What Changed", + "", + explanation, + "", + ] + + if diff_text: + lines.extend([ + "## Diff", + "", + "```diff", + diff_text, + "```", + "", + ]) + + if testing_recs: + lines.extend([ + "## Testing Recommendations", + "", + ]) + for rec in testing_recs: + lines.append(f"- {rec}") + lines.append("") + + if regression_test_path: + lines.extend([ + "## Regression Test", + "", + f"A regression test has been added at `{regression_test_path}`.", + "", + ]) + + lines.extend([ + "---", + "", + "*Generated by [Argus Security](https://github.com/devatsecure/Argus-Security) AutoFix*", + ]) + + return "\n".join(lines) + + # -- Orchestration ----------------------------------------------------------- + + def create_fix_pr( + self, + suggestion: dict, + push: bool = False, + ) -> FixPR: + """Orchestrate branch creation, fix application, and PR metadata. 
+ + Creates a fix branch, applies the fix, commits it, optionally pushes + to the remote, and returns a FixPR with all metadata needed to open + a pull request. + + Args: + suggestion: Dict with vulnerability and fix metadata. + push: Whether to push the branch to the remote. + + Returns: + FixPR with branch name, title, body, and commit information. + """ + finding_id = suggestion.get("finding_id", "unknown") + vuln_type = suggestion.get("vulnerability_type", "unknown") + file_path = suggestion.get("file_path", "") + + logger.info("Creating fix PR for finding %s", finding_id) + + # Create branch and apply fix + fix_branch = self.create_fix_branch(suggestion) + + if not fix_branch.success: + return FixPR( + branch_name=fix_branch.branch_name, + finding_id=finding_id, + vulnerability_type=vuln_type, + file_path=file_path, + title="", + body="", + commit_sha=None, + pushed=False, + success=False, + error=fix_branch.error, + ) + + # Generate PR metadata + title = f"fix({vuln_type}): {finding_id[:8]} in {os.path.basename(file_path)}" + body = self.generate_pr_body(suggestion) + + # Optionally push + pushed = False + if push: + try: + self._run_git("push", "-u", "origin", fix_branch.branch_name) + pushed = True + logger.info("Pushed branch %s to origin", fix_branch.branch_name) + except subprocess.CalledProcessError as exc: + logger.warning( + "Failed to push branch %s: %s", + fix_branch.branch_name, + exc.stderr.strip(), + ) + + return FixPR( + branch_name=fix_branch.branch_name, + finding_id=finding_id, + vulnerability_type=vuln_type, + file_path=file_path, + title=title, + body=body, + commit_sha=fix_branch.commit_sha, + pushed=pushed, + success=True, + ) + + +# --------------------------------------------------------------------------- +# ClosedLoopOrchestrator +# --------------------------------------------------------------------------- + + +CONFIDENCE_LEVELS = {"high": 3, "medium": 2, "low": 1} + + +class ClosedLoopOrchestrator: + """Orchestrates the full find -> 
fix -> verify loop. + + Filters findings to those that are auto-fixable, generates fixes via + the remediation engine, validates confidence thresholds, and creates + PR branches for each fix. + """ + + def __init__( + self, + project_path: str, + remediation_engine=None, + regression_tester=None, + confidence_threshold: str = "high", + ): + """Initialize the orchestrator. + + Args: + project_path: Absolute path to the git repository root. + remediation_engine: Optional RemediationEngine instance for + generating fix suggestions. If None, suggestions must be + pre-populated in findings. + regression_tester: Optional callable that takes a suggestion dict + and returns a test file path (str) or None. + confidence_threshold: Minimum confidence level required for + auto-fix ("high", "medium", or "low"). + """ + self.project_path = os.path.abspath(project_path) + self.remediation_engine = remediation_engine + self.regression_tester = regression_tester + self.confidence_threshold = confidence_threshold + self._pr_generator = AutoFixPRGenerator(self.project_path) + + def run_loop(self, findings: list[dict]) -> LoopResult: + """Run the closed-loop find-fix-verify cycle. + + Filters findings to those that are fixable, generates a fix for each + via the remediation engine, validates confidence, creates a branch + and PR metadata, and optionally generates a regression test. + + Args: + findings: List of finding dicts from the Argus pipeline. + + Returns: + LoopResult with aggregated statistics and per-finding outcomes. 
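        A toy sketch of the fixed / skipped / failed bucketing the loop
        performs, with stand-in finding dicts and a stub fixer (the names
        here are illustrative, not this class's API)::

```python
LEVELS = {"high": 3, "medium": 2, "low": 1}

def bucket(findings, fix, min_level):
    """Classify each finding by what the loop would do with it."""
    fixed, skipped, failed = [], [], []
    for f in findings:
        s = fix(f)  # stand-in for the remediation engine
        if s is None:
            failed.append(f["id"])
        elif LEVELS.get(s["confidence"], 0) < LEVELS[min_level]:
            skipped.append(f["id"])
        else:
            fixed.append(f["id"])
    return fixed, skipped, failed

fix = lambda f: None if f["id"] == "c" else {"confidence": f["conf"]}
findings = [
    {"id": "a", "conf": "high"},
    {"id": "b", "conf": "low"},
    {"id": "c", "conf": "high"},
]
print(bucket(findings, fix, "high"))  # (['a'], ['b'], ['c'])
```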
+ """ + total = len(findings) + fixable_findings = [f for f in findings if self._is_fixable(f)] + fixable_count = len(fixable_findings) + + logger.info( + "Closed-loop: %d total findings, %d fixable", + total, + fixable_count, + ) + + result = LoopResult( + total_findings=total, + fixable=fixable_count, + ) + + for finding in fixable_findings: + finding_id = finding.get("finding_id", finding.get("id", "unknown")) + + try: + # Generate fix suggestion via remediation engine + suggestion = self._generate_suggestion(finding) + + if suggestion is None: + result.failed.append({ + "finding_id": finding_id, + "error": "Remediation engine returned no suggestion", + }) + continue + + # Check confidence threshold + if not self._meets_confidence(suggestion): + logger.info( + "Skipping %s: confidence %s below threshold %s", + finding_id, + suggestion.get("confidence", "unknown"), + self.confidence_threshold, + ) + result.skipped_low_confidence.append(finding_id) + continue + + # Generate regression test if tester is available + regression_test_path = None + if self.regression_tester is not None: + try: + regression_test_path = self.regression_tester(suggestion) + except Exception as exc: + logger.warning( + "Regression test generation failed for %s: %s", + finding_id, + exc, + ) + + # Create the fix PR + fix_pr = self._pr_generator.create_fix_pr(suggestion) + + if fix_pr.success: + # Update body with regression test info if available + if regression_test_path: + fix_pr = FixPR( + branch_name=fix_pr.branch_name, + finding_id=fix_pr.finding_id, + vulnerability_type=fix_pr.vulnerability_type, + file_path=fix_pr.file_path, + title=fix_pr.title, + body=self._pr_generator.generate_pr_body( + suggestion, + regression_test_path=regression_test_path, + ), + commit_sha=fix_pr.commit_sha, + pushed=fix_pr.pushed, + success=True, + ) + result.fixed.append(fix_pr) + logger.info("Successfully created fix PR for %s", finding_id) + else: + result.failed.append({ + "finding_id": finding_id, + 
"error": fix_pr.error or "Unknown error during PR creation", + }) + + # Return to base branch for next iteration + self._pr_generator._run_git( + "checkout", + self._pr_generator.base_branch, + check=False, + ) + + except Exception as exc: + logger.exception("Closed-loop failed for finding %s", finding_id) + result.failed.append({ + "finding_id": finding_id, + "error": str(exc), + }) + # Best-effort return to base branch + self._pr_generator._run_git( + "checkout", + self._pr_generator.base_branch, + check=False, + ) + + logger.info( + "Closed-loop complete: %d fixed, %d skipped, %d failed (%.0f%% success rate)", + len(result.fixed), + len(result.skipped_low_confidence), + len(result.failed), + result.success_rate * 100, + ) + + return result + + def _generate_suggestion(self, finding: dict) -> dict | None: + """Generate a fix suggestion for a finding. + + Uses the remediation engine if available, converting its output to a + plain dict. Falls back to returning pre-populated suggestion data + from the finding itself. + + Args: + finding: Finding dict with vulnerability metadata. + + Returns: + Suggestion dict with fix details, or None if generation fails. 
+ """ + if self.remediation_engine is not None: + try: + suggestion_obj = self.remediation_engine.suggest_fix(finding) + # Convert dataclass/object to dict if needed + if hasattr(suggestion_obj, "to_dict"): + return suggestion_obj.to_dict() + if hasattr(suggestion_obj, "__dataclass_fields__"): + return asdict(suggestion_obj) + if isinstance(suggestion_obj, dict): + return suggestion_obj + return None + except Exception as exc: + logger.warning( + "Remediation engine failed for %s: %s", + finding.get("finding_id", finding.get("id", "unknown")), + exc, + ) + return None + + # No remediation engine: check if finding already has fix data + if finding.get("fixed_code") or finding.get("diff"): + return { + "finding_id": finding.get("finding_id", finding.get("id", "unknown")), + "vulnerability_type": finding.get( + "vulnerability_type", finding.get("type", "unknown") + ), + "file_path": finding.get("file_path", finding.get("path", "")), + "line_number": finding.get("line_number", finding.get("line", 0)), + "fixed_code": finding.get("fixed_code", ""), + "original_code": finding.get("original_code", finding.get("code_snippet", "")), + "diff": finding.get("diff", ""), + "explanation": finding.get("explanation", ""), + "confidence": finding.get("confidence", "medium"), + "cwe": finding.get("cwe", finding.get("cwe_references", "N/A")), + "severity": finding.get("severity", "unknown"), + "testing_recommendations": finding.get("testing_recommendations", []), + } + + return None + + def _is_fixable(self, finding: dict) -> bool: + """Check if a finding has enough context for auto-fix. + + A finding is fixable if it is explicitly marked ``auto_fixable=True`` + or has a critical/high severity with sufficient code context. At a + minimum the finding must have a ``file_path``, a vulnerability type + or category, and either code context or a line number. + + Args: + finding: Finding dict. + + Returns: + True if the finding can be auto-fixed. 
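        The fixability gate described above, sketched as a standalone
        function with hypothetical sample findings::

```python
def is_fixable(f: dict) -> bool:
    # Path + type + code context are mandatory; then either an explicit
    # auto_fixable flag or high/critical severity qualifies.
    if not (f.get("file_path") or f.get("path")):
        return False
    if not (f.get("vulnerability_type") or f.get("type") or f.get("category")):
        return False
    if not (f.get("code_snippet") or f.get("original_code")
            or f.get("line_number") or f.get("line")):
        return False
    if f.get("auto_fixable") is True:
        return True
    return f.get("severity", "").lower() in ("critical", "high")

print(is_fixable({"file_path": "app.py", "type": "xss", "line": 10, "severity": "high"}))
print(is_fixable({"file_path": "app.py", "type": "xss", "line": 10, "severity": "low"}))
```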
+ """ + # Must have a file path + file_path = finding.get("file_path") or finding.get("path") + if not file_path: + return False + + # Must have a vulnerability type or category + has_type = bool( + finding.get("vulnerability_type") + or finding.get("type") + or finding.get("category") + ) + if not has_type: + return False + + # Must have code context or a line number + has_context = bool( + finding.get("code_snippet") + or finding.get("original_code") + or finding.get("line_number") + or finding.get("line") + ) + if not has_context: + return False + + # Explicitly marked as auto-fixable + if finding.get("auto_fixable") is True: + return True + + # High/critical severity with context qualifies + severity = finding.get("severity", "").lower() + if severity in ("critical", "high"): + return True + + return False + + def _meets_confidence(self, suggestion: dict) -> bool: + """Check if a suggestion's confidence meets the threshold. + + Compares the suggestion's confidence level against the configured + threshold using an ordinal ranking: high > medium > low. + + Args: + suggestion: Fix suggestion dict with a ``confidence`` key. + + Returns: + True if the confidence meets or exceeds the threshold. + """ + suggestion_confidence = suggestion.get("confidence", "low").lower() + suggestion_level = CONFIDENCE_LEVELS.get(suggestion_confidence, 0) + threshold_level = CONFIDENCE_LEVELS.get(self.confidence_threshold.lower(), 3) + return suggestion_level >= threshold_level + + def get_summary(self, result: LoopResult) -> str: + """Generate a human-readable summary of the loop run. + + Args: + result: LoopResult from a completed run_loop invocation. + + Returns: + Multi-line summary string. 
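        The ordinal confidence comparison in ``_meets_confidence`` behaves
        like this standalone sketch (unknown confidence never passes;
        unknown threshold defaults to the strictest level)::

```python
CONFIDENCE_LEVELS = {"high": 3, "medium": 2, "low": 1}

def meets(confidence: str, threshold: str) -> bool:
    # high > medium > low; anything unrecognised maps to 0 or to "high".
    return (CONFIDENCE_LEVELS.get(confidence.lower(), 0)
            >= CONFIDENCE_LEVELS.get(threshold.lower(), 3))

print(meets("high", "medium"))   # True
print(meets("medium", "high"))   # False
print(meets("bogus", "low"))     # False
```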
+ """ + lines = [ + "Argus Security AutoFix - Closed-Loop Summary", + "=" * 46, + f"Total findings evaluated: {result.total_findings}", + f"Fixable findings: {result.fixable}", + f"Successfully fixed: {len(result.fixed)}", + f"Skipped (low confidence): {len(result.skipped_low_confidence)}", + f"Failed: {len(result.failed)}", + f"Success rate: {result.success_rate:.1%}", + ] + + if result.fixed: + lines.append("") + lines.append("Fixed PRs:") + for pr in result.fixed: + lines.append( + f" - [{pr.branch_name}] {pr.vulnerability_type} in {pr.file_path}" + ) + + if result.skipped_low_confidence: + lines.append("") + lines.append("Skipped (low confidence):") + for fid in result.skipped_low_confidence: + lines.append(f" - {fid}") + + if result.failed: + lines.append("") + lines.append("Failed:") + for fail in result.failed: + lines.append(f" - {fail['finding_id']}: {fail.get('error', 'unknown')}") + + return "\n".join(lines) + + +# --------------------------------------------------------------------------- +# Main +# --------------------------------------------------------------------------- + + +if __name__ == "__main__": + print("AutoFix PR Generator for Argus Security") + print("Usage: Integrated into pipeline via enable_autofix_pr=True") + print(" AutoFixPRGenerator(project_path).create_fix_pr(suggestion)") + print(" ClosedLoopOrchestrator(project_path).run_loop(findings)") diff --git a/scripts/config_loader.py b/scripts/config_loader.py index 37fe900..fb87d24 100644 --- a/scripts/config_loader.py +++ b/scripts/config_loader.py @@ -151,6 +151,26 @@ def get_default_config() -> dict[str, Any]: # -- Post-Phase-3 quality filter -- "enable_quality_filter": True, "quality_filter_min_confidence": 0.30, + # -- Continuous security testing (v3.0) -- + # Diff-intelligent scanner scoping + "enable_diff_scoping": True, + "diff_expand_impact_radius": True, + # AutoFix PR generation + "enable_autofix_pr": False, # opt-in: generates branches/PRs with fixes + 
"autofix_confidence_threshold": "high", # only auto-fix high-confidence suggestions + "autofix_max_prs_per_scan": 5, + # Persistent findings store + "enable_findings_store": True, + "findings_db_path": ".argus/findings.db", + "inject_historical_context": True, # feed history into LLM prompts + # Agent-driven chain discovery + "enable_agent_chain_discovery": False, # opt-in: uses LLM credits + "enable_cross_component_analysis": True, + # Application context model + "enable_app_context": True, + # Live target validation + "enable_live_validation": False, # opt-in: requires dast_target_url + "live_validation_environment": "staging", } @@ -461,6 +481,20 @@ def load_profile(profile_name: str) -> dict[str, Any]: # Post-Phase-3 quality filter — removes low-quality findings before reporting (("ENABLE_QUALITY_FILTER",), "enable_quality_filter", "bool"), (("QUALITY_FILTER_MIN_CONFIDENCE",), "quality_filter_min_confidence", "float"), + # Continuous security testing (v3.0) + (("ENABLE_DIFF_SCOPING",), "enable_diff_scoping", "bool"), + (("DIFF_EXPAND_IMPACT_RADIUS",), "diff_expand_impact_radius", "bool"), + (("ENABLE_AUTOFIX_PR",), "enable_autofix_pr", "bool"), + (("AUTOFIX_CONFIDENCE_THRESHOLD",), "autofix_confidence_threshold", "str"), + (("AUTOFIX_MAX_PRS_PER_SCAN",), "autofix_max_prs_per_scan", "int"), + (("ENABLE_FINDINGS_STORE",), "enable_findings_store", "bool"), + (("FINDINGS_DB_PATH",), "findings_db_path", "str"), + (("INJECT_HISTORICAL_CONTEXT",), "inject_historical_context", "bool"), + (("ENABLE_AGENT_CHAIN_DISCOVERY",), "enable_agent_chain_discovery", "bool"), + (("ENABLE_CROSS_COMPONENT_ANALYSIS",), "enable_cross_component_analysis", "bool"), + (("ENABLE_APP_CONTEXT",), "enable_app_context", "bool"), + (("ENABLE_LIVE_VALIDATION",), "enable_live_validation", "bool"), + (("LIVE_VALIDATION_ENVIRONMENT",), "live_validation_environment", "str"), ] @@ -590,6 +624,20 @@ def load_env_overrides() -> dict[str, Any]: "suppression_auto_expire_days": 
"suppression_auto_expire_days", "enable_compliance_mapping": "enable_compliance_mapping", "compliance_frameworks": "compliance_frameworks", + # Continuous security testing (v3.0) + "enable_diff_scoping": "enable_diff_scoping", + "diff_expand_impact_radius": "diff_expand_impact_radius", + "enable_autofix_pr": "enable_autofix_pr", + "autofix_confidence_threshold": "autofix_confidence_threshold", + "autofix_max_prs_per_scan": "autofix_max_prs_per_scan", + "enable_findings_store": "enable_findings_store", + "findings_db_path": "findings_db_path", + "inject_historical_context": "inject_historical_context", + "enable_agent_chain_discovery": "enable_agent_chain_discovery", + "enable_cross_component_analysis": "enable_cross_component_analysis", + "enable_app_context": "enable_app_context", + "enable_live_validation": "enable_live_validation", + "live_validation_environment": "live_validation_environment", } diff --git a/scripts/diff_impact_analyzer.py b/scripts/diff_impact_analyzer.py new file mode 100644 index 0000000..036671e --- /dev/null +++ b/scripts/diff_impact_analyzer.py @@ -0,0 +1,524 @@ +#!/usr/bin/env python3 +""" +Diff Impact Analyzer for Argus Security + +Diff-intelligent scanner scoping that classifies changed files by security +relevance, expands the blast radius to include dependent files, and builds +a focused scan scope for downstream scanners (Semgrep, Trivy, etc.). + +Three main components: + + - **DiffClassifier** : Classifies changed files into security-relevant + vs. skippable buckets using pattern matching. + - **DiffImpactAnalyzer** : Expands changed files to their security-relevant + blast radius via reverse dependency lookup. + - **DiffScopeBuilder** : Combines classifier + impact analyzer into a + scanner-ready scope with Semgrep CLI helpers. + +Toggle: ``only_changed=True`` in ``DiffScopeBuilder.build_scope`` to enable +diff-scoped scanning. When disabled, the full project is scanned. 
+""" + +from __future__ import annotations + +import logging +import os +import re +import subprocess +from dataclasses import dataclass, field + +logger = logging.getLogger(__name__) + +# --------------------------------------------------------------------------- +# Constants +# --------------------------------------------------------------------------- + +#: File patterns that are safe to skip during security scanning. +SKIP_PATTERNS: list[str] = [ + r"\.md$", + r"\.txt$", + r"\.css$", + r"\.scss$", + r"\.svg$", + r"\.png$", + r"\.jpg$", + r"\.gif$", + r"CHANGELOG", + r"LICENSE", + r"\.gitignore", + r"README", +] + +#: File patterns that must always be scanned regardless of other heuristics. +ALWAYS_SCAN_PATTERNS: list[str] = [ + r"auth", + r"login", + r"session", + r"token", + r"password", + r"secret", + r"key", + r"crypt", + r"permission", + r"rbac", + r"middleware", + r"guard", + r"policy", + r"\.env", + r"docker", + r"Dockerfile", + r"\.tf$", + r"\.yml$", + r"\.yaml$", + r"\.toml$", + r"requirements", + r"package\.json", + r"Gemfile", + r"go\.mod", +] + +#: Keywords in file paths that indicate security-critical modules. +SECURITY_CRITICAL_KEYWORDS: list[str] = [ + "auth", + "middleware", + "permissions", + "crypto", + "session", + "token", + "oauth", + "rbac", + "acl", + "policy", + "guard", + "interceptor", +] + +#: File extensions to search when performing reverse dependency lookups. 
+IMPORTABLE_EXTENSIONS: set[str] = { + ".py", + ".js", + ".ts", + ".jsx", + ".tsx", + ".java", + ".go", + ".rb", + ".php", +} + +# Pre-compiled regexes for performance +_SKIP_REGEXES: list[re.Pattern[str]] = [re.compile(p, re.IGNORECASE) for p in SKIP_PATTERNS] +_ALWAYS_SCAN_REGEXES: list[re.Pattern[str]] = [re.compile(p, re.IGNORECASE) for p in ALWAYS_SCAN_PATTERNS] + +# --------------------------------------------------------------------------- +# Data classes +# --------------------------------------------------------------------------- + + +@dataclass +class DiffClassification: + """Result of classifying changed files by security relevance.""" + + security_relevant: list[str] + skippable: list[str] + should_scan: bool + total_changed: int + + +@dataclass +class ScanScope: + """Scanner-ready scope produced by DiffScopeBuilder.""" + + files: list[str] = field(default_factory=list) + is_scoped: bool = False + original_changed: list[str] = field(default_factory=list) + expanded_from: list[str] = field(default_factory=list) + skipped: list[str] = field(default_factory=list) + + +# --------------------------------------------------------------------------- +# DiffClassifier +# --------------------------------------------------------------------------- + + +class DiffClassifier: + """Classifies changed files by security relevance. + + Files matching ``SKIP_PATTERNS`` are considered safe to skip. Files + matching ``ALWAYS_SCAN_PATTERNS`` are always security-relevant. + Unmatched files default to security-relevant (scan by default). + """ + + def classify(self, changed_files: list[str]) -> DiffClassification: + """Classify *changed_files* into security-relevant and skippable. + + Args: + changed_files: List of file paths (relative to project root) + that were changed in the diff. + + Returns: + A ``DiffClassification`` with categorised file lists. 
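        The precedence (always-scan beats skip, unmatched files scan by
        default) can be sketched standalone with a trimmed pattern set and
        invented file names::

```python
import re

SKIP = [r"\.md$", r"README"]
ALWAYS = [r"auth", r"\.yml$"]

def classify(files):
    relevant, skippable = [], []
    for f in files:
        if any(re.search(p, f, re.IGNORECASE) for p in ALWAYS):
            relevant.append(f)      # always-scan wins over skip
        elif any(re.search(p, f, re.IGNORECASE) for p in SKIP):
            skippable.append(f)
        else:
            relevant.append(f)      # scan by default
    return relevant, skippable

print(classify(["README.md", "src/auth/login.py", "src/utils.py"]))
```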
+ """ + security_relevant: list[str] = [] + skippable: list[str] = [] + + for filepath in changed_files: + # Always-scan wins over skip + if self._matches_always_scan(filepath): + security_relevant.append(filepath) + logger.debug("Always-scan match: %s", filepath) + continue + + if self._matches_skip(filepath): + skippable.append(filepath) + logger.debug("Skip match: %s", filepath) + continue + + # Default: treat as security-relevant + security_relevant.append(filepath) + logger.debug("Default security-relevant: %s", filepath) + + should_scan = len(security_relevant) > 0 + + logger.info( + "Classified %d files: %d security-relevant, %d skippable, should_scan=%s", + len(changed_files), + len(security_relevant), + len(skippable), + should_scan, + ) + + return DiffClassification( + security_relevant=security_relevant, + skippable=skippable, + should_scan=should_scan, + total_changed=len(changed_files), + ) + + # -- helpers ------------------------------------------------------------- + + @staticmethod + def _matches_skip(filepath: str) -> bool: + """Return True if *filepath* matches any skip pattern.""" + return any(rx.search(filepath) for rx in _SKIP_REGEXES) + + @staticmethod + def _matches_always_scan(filepath: str) -> bool: + """Return True if *filepath* matches any always-scan pattern.""" + return any(rx.search(filepath) for rx in _ALWAYS_SCAN_REGEXES) + + +# --------------------------------------------------------------------------- +# DiffImpactAnalyzer +# --------------------------------------------------------------------------- + + +class DiffImpactAnalyzer: + """Expands changed files to their security-relevant blast radius. + + When a security-critical file is changed, this class performs a reverse + dependency lookup to find all project files that import from it. + """ + + def expand_impact(self, changed_files: list[str], project_path: str) -> list[str]: + """Expand *changed_files* to include their importers. 
+ + For each changed file that is security-critical, find all project + files that import from it and add them to the result set. + + Args: + changed_files: Diff-changed file paths (relative to project root). + project_path: Absolute or relative path to the project root. + + Returns: + De-duplicated list of additional files discovered via impact + analysis (does *not* include the original changed files). + """ + expanded: set[str] = set() + + for filepath in changed_files: + if not self._is_security_critical(filepath): + continue + + logger.info("Security-critical file changed: %s — expanding impact", filepath) + importers = self._find_importers(filepath, project_path, IMPORTABLE_EXTENSIONS) + for imp in importers: + if imp not in changed_files: + expanded.add(imp) + logger.debug("Impact expansion: %s imports %s", imp, filepath) + + logger.info("Impact analysis expanded scope by %d file(s)", len(expanded)) + return sorted(expanded) + + # -- helpers ------------------------------------------------------------- + + @staticmethod + def _is_security_critical(filepath: str) -> bool: + """Return True if *filepath* contains a security-critical keyword.""" + lower = filepath.lower() + return any(kw in lower for kw in SECURITY_CRITICAL_KEYWORDS) + + def _find_importers( + self, + target_file: str, + project_path: str, + extensions: set[str], + ) -> list[str]: + """Find project files that import from *target_file*. + + Searches for ``import``, ``from … import``, and ``require(…)`` + statements referencing the module name derived from *target_file*. + + Args: + target_file: The file whose importers we want to find. + project_path: Project root directory. + extensions: Set of file extensions to search. + + Returns: + List of file paths (relative to *project_path*) that import + the target module. 
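        A standalone sketch of the import-matching idea for one module name
        (a simplified version of the regex below; the sample source lines
        are invented)::

```python
import re

mod = re.escape("middleware")
pattern = re.compile(
    rf"\bimport\s+.*\b{mod}\b"                    # import middleware
    rf"|\bfrom\s+\S*\b{mod}\b\s+import"           # from app.middleware import X
    rf"|\brequire\(\s*['\"].*{mod}.*['\"]\s*\)"   # require('./middleware')
)

lines = [
    "from app.middleware import RateLimiter",
    "const mw = require('./middleware');",
    "import middlewarepack",  # no word boundary, so no match
]
print([bool(pattern.search(l)) for l in lines])
```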
+ """ + module_name = self._extract_module_name(target_file) + if not module_name: + return [] + + # Build regex that matches common import styles across languages: + # import module_name + # from module_name import … + # require('module_name') / require("module_name") + # import "module_name" + import_pattern = re.compile( + r""" + (?: # non-capturing group + \bimport\s+.*\b{mod}\b # import X / import {{ X }} + | \bfrom\s+\S*\b{mod}\b\s+import # from X import … + | \brequire\(\s*['\"].*{mod}.*['\"]\s*\) # require('X') + | \bimport\s+['\"].*{mod}.*['\"] # import "X" + ) + """.format(mod=re.escape(module_name)), + re.VERBOSE, + ) + + importers: list[str] = [] + + for dirpath, _dirnames, filenames in os.walk(project_path): + # Skip hidden directories and common non-source dirs + rel_dir = os.path.relpath(dirpath, project_path) + if any( + part.startswith(".") + for part in rel_dir.split(os.sep) + if part != "." + ): + continue + if any( + skip in rel_dir + for skip in ("node_modules", "__pycache__", "vendor", ".git") + ): + continue + + for filename in filenames: + _, ext = os.path.splitext(filename) + if ext not in extensions: + continue + + abs_path = os.path.join(dirpath, filename) + rel_path = os.path.relpath(abs_path, project_path) + + # Don't match the target file itself + if os.path.normpath(rel_path) == os.path.normpath(target_file): + continue + + try: + with open(abs_path, encoding="utf-8", errors="ignore") as fh: + content = fh.read() + except OSError: + continue + + if import_pattern.search(content): + importers.append(rel_path) + + return importers + + @staticmethod + def _extract_module_name(filepath: str) -> str: + """Extract the bare module name from a file path. + + Converts e.g. ``src/auth/middleware.py`` to ``middleware``. + + Args: + filepath: File path (relative or absolute). + + Returns: + The basename without extension, or empty string if unusable. 
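        Note that only the last extension is stripped, as this sketch of
        the same basename logic shows (sample paths are invented)::

```python
import os

def module_name(filepath: str) -> str:
    # Basename without its (last) extension.
    return os.path.splitext(os.path.basename(filepath))[0]

print(module_name("src/auth/middleware.py"))  # middleware
print(module_name("lib/session.token.js"))    # session.token
```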
+ """ + basename = os.path.basename(filepath) + name, _ = os.path.splitext(basename) + return name if name else "" + + +# --------------------------------------------------------------------------- +# DiffScopeBuilder +# --------------------------------------------------------------------------- + + +class DiffScopeBuilder: + """Combines classifier + impact analyzer into a scanner-ready scope. + + Typical usage:: + + builder = DiffScopeBuilder() + changed = builder.get_changed_files(project_path) + scope = builder.build_scope(project_path, changed, only_changed=True) + args = builder.get_semgrep_include_args(scope) + """ + + def __init__(self) -> None: + self._classifier = DiffClassifier() + self._analyzer = DiffImpactAnalyzer() + + def build_scope( + self, + project_path: str, + changed_files: list[str] | None = None, + only_changed: bool = False, + ) -> ScanScope: + """Build a scan scope from the project and optional diff info. + + Args: + project_path: Absolute or relative path to the project root. + changed_files: List of changed file paths. If ``None`` or + *only_changed* is ``False``, the full project is scanned. + only_changed: When ``True``, restrict scanning to diff-scoped + files and their blast radius. + + Returns: + A ``ScanScope`` describing what to scan. 
+ """ + if not only_changed or changed_files is None: + logger.info("Full project scan (only_changed=%s)", only_changed) + return ScanScope( + files=[], + is_scoped=False, + original_changed=changed_files or [], + expanded_from=[], + skipped=[], + ) + + # Step 1: Classify + classification = self._classifier.classify(changed_files) + + if not classification.should_scan: + logger.info("No security-relevant changes detected — empty scope") + return ScanScope( + files=[], + is_scoped=True, + original_changed=changed_files, + expanded_from=[], + skipped=classification.skippable, + ) + + # Step 2: Expand impact + expanded = self._analyzer.expand_impact( + classification.security_relevant, project_path + ) + + # Step 3: Merge into final file list (de-duplicated, sorted) + all_files = sorted(set(classification.security_relevant) | set(expanded)) + + logger.info( + "Diff-scoped scan: %d files (%d changed + %d expanded, %d skipped)", + len(all_files), + len(classification.security_relevant), + len(expanded), + len(classification.skippable), + ) + + return ScanScope( + files=all_files, + is_scoped=True, + original_changed=changed_files, + expanded_from=expanded, + skipped=classification.skippable, + ) + + @staticmethod + def get_changed_files(project_path: str) -> list[str]: + """Detect changed files via ``git diff``. + + Runs ``git diff --name-only HEAD^ HEAD`` inside *project_path*. + Falls back to an empty list on any error (e.g. shallow clone, + no previous commit, git not available). + + Args: + project_path: Path to the git repository root. + + Returns: + List of changed file paths relative to the repo root. 
+ """ + try: + result = subprocess.run( + ["git", "diff", "--name-only", "HEAD^", "HEAD"], + cwd=project_path, + capture_output=True, + text=True, + timeout=30, + ) + if result.returncode != 0: + logger.warning( + "git diff returned non-zero (%d): %s", + result.returncode, + result.stderr.strip(), + ) + return [] + + files = [ + line.strip() + for line in result.stdout.strip().splitlines() + if line.strip() + ] + logger.info("git diff detected %d changed file(s)", len(files)) + return files + + except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as exc: + logger.warning("Failed to get changed files via git: %s", exc) + return [] + + @staticmethod + def get_semgrep_include_args(scope: ScanScope) -> list[str]: + """Convert a scope into Semgrep ``--include`` CLI arguments. + + Args: + scope: A ``ScanScope`` produced by ``build_scope``. + + Returns: + A flat list like ``["--include", "file1", "--include", "file2"]``. + Returns an empty list if the scope is not diff-scoped. + """ + if not scope.is_scoped or not scope.files: + return [] + + args: list[str] = [] + for filepath in scope.files: + args.append("--include") + args.append(filepath) + return args + + +# --------------------------------------------------------------------------- +# Main +# --------------------------------------------------------------------------- + +if __name__ == "__main__": + import sys + + logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s") + + project = sys.argv[1] if len(sys.argv) > 1 else "." 
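+    # Illustrative wiring (sketch only, not part of this module): the scope
+    # built below could be forwarded to Semgrep as ``--include`` arguments.
+    # The surrounding Semgrep CLI invocation here is an assumption:
+    #
+    #     include_args = DiffScopeBuilder.get_semgrep_include_args(scope)
+    #     subprocess.run(["semgrep", "scan", *include_args, project])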
+    builder = DiffScopeBuilder()
+    changed = builder.get_changed_files(project)
+    scope = builder.build_scope(project, changed, only_changed=True)
+    print(
+        f"Changed: {len(changed)}, "
+        f"Scan scope: {len(scope.files)}, "
+        f"Skipped: {len(scope.skipped)}"
+    )
diff --git a/scripts/findings_store.py b/scripts/findings_store.py
new file mode 100644
index 0000000..a685a34
--- /dev/null
+++ b/scripts/findings_store.py
@@ -0,0 +1,808 @@
+#!/usr/bin/env python3
+"""
+Persistent SQLite-Backed Findings Store for Argus Security
+
+Provides cross-scan intelligence by persisting findings, scan history, and
+fix records in a local SQLite database. Enables regression detection,
+trend analytics, mean-time-to-fix calculations, false-positive-rate
+tracking, and historical context injection for LLM enrichment.
+
+Features:
+- Content-based fingerprinting for cross-scan deduplication
+- Automatic regression detection (previously fixed findings that reappear)
+- Severity trend analytics over configurable time windows
+- Mean-time-to-fix and false-positive-rate metrics
+- Historical context injection for AI enrichment (Phase 2)
+- Thread-safe write operations guarded by a lock
+
+Usage:
+    store = FindingsStore(db_path=".argus/findings.db")
+    summary = store.record_scan(scan_id, findings, commit_sha="abc123")
+    context = store.get_historical_context(finding)
+"""
+
+from __future__ import annotations
+
+import hashlib
+import logging
+import os
+import sqlite3
+import threading
+import uuid
+from dataclasses import asdict, dataclass, field
+from datetime import datetime, timezone
+from typing import Any
+
+__all__ = ["FindingsStore", "ScanSummary"]
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Allowed finding statuses
+# ---------------------------------------------------------------------------
+
+VALID_STATUSES = frozenset(
+    {"open", "fixed", "false_positive", "accepted_risk", "wont_fix"}
+)
+
+# 
--------------------------------------------------------------------------- +# Data classes +# --------------------------------------------------------------------------- + + +@dataclass +class ScanSummary: + """Summary of a single scan recording operation. + + Attributes: + scan_id: Unique identifier for the scan. + total_findings: Total number of findings processed. + new_findings: Findings seen for the first time. + existing_findings: Findings already known from prior scans. + fixed_since_last: Findings that were open before but absent now. + regressions: Previously fixed findings that have reappeared. + by_severity: Breakdown of total findings by severity level. + """ + + scan_id: str + total_findings: int + new_findings: int + existing_findings: int + fixed_since_last: int + regressions: int + by_severity: dict[str, int] = field(default_factory=dict) + + def to_dict(self) -> dict[str, Any]: + """Serialize the summary to a plain dictionary.""" + return asdict(self) + + +# --------------------------------------------------------------------------- +# SQL statements +# --------------------------------------------------------------------------- + +_CREATE_FINDINGS_TABLE = """ +CREATE TABLE IF NOT EXISTS findings ( + id TEXT PRIMARY KEY, + fingerprint TEXT NOT NULL, + scan_id TEXT NOT NULL, + scan_timestamp TEXT NOT NULL, + vuln_type TEXT NOT NULL, + severity TEXT NOT NULL, + file_path TEXT, + line_number INTEGER, + cwe TEXT, + cve TEXT, + cvss_score REAL, + source_tool TEXT, + status TEXT DEFAULT 'open', + first_seen TEXT NOT NULL, + last_seen TEXT NOT NULL, + times_seen INTEGER DEFAULT 1, + fix_verified INTEGER DEFAULT 0, + title TEXT, + description TEXT +) +""" + +_CREATE_SCAN_HISTORY_TABLE = """ +CREATE TABLE IF NOT EXISTS scan_history ( + scan_id TEXT PRIMARY KEY, + timestamp TEXT NOT NULL, + commit_sha TEXT, + branch TEXT, + total_findings INTEGER, + new_findings INTEGER DEFAULT 0, + fixed_findings INTEGER DEFAULT 0, + regression_findings INTEGER DEFAULT 0, + 
critical INTEGER DEFAULT 0, + high INTEGER DEFAULT 0, + medium INTEGER DEFAULT 0, + low INTEGER DEFAULT 0, + duration_seconds REAL, + cost_usd REAL +) +""" + +_CREATE_FIX_HISTORY_TABLE = """ +CREATE TABLE IF NOT EXISTS fix_history ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + finding_id TEXT NOT NULL, + fix_commit TEXT, + fix_timestamp TEXT NOT NULL, + fix_method TEXT, + retest_passed INTEGER DEFAULT 0, + regression_detected INTEGER DEFAULT 0, + FOREIGN KEY (finding_id) REFERENCES findings(id) +) +""" + +_CREATE_INDEXES = [ + "CREATE INDEX IF NOT EXISTS idx_findings_fingerprint ON findings(fingerprint)", + "CREATE INDEX IF NOT EXISTS idx_findings_status ON findings(status)", + "CREATE INDEX IF NOT EXISTS idx_findings_severity ON findings(severity)", + "CREATE INDEX IF NOT EXISTS idx_findings_vuln_type ON findings(vuln_type)", + "CREATE INDEX IF NOT EXISTS idx_findings_scan_id ON findings(scan_id)", + "CREATE INDEX IF NOT EXISTS idx_scan_history_timestamp ON scan_history(timestamp)", +] + + +# --------------------------------------------------------------------------- +# FindingsStore +# --------------------------------------------------------------------------- + + +class FindingsStore: + """Persistent SQLite-backed findings store for cross-scan intelligence. + + Maintains a local database of security findings across scans, enabling + regression detection, trend analysis, and historical context injection + for LLM-based enrichment. + + Args: + db_path: Path to the SQLite database file. Parent directories are + created automatically if they do not exist. 
+ """ + + def __init__(self, db_path: str = ".argus/findings.db") -> None: + self.db_path = db_path + self._lock = threading.Lock() + + # Ensure parent directory exists + parent = os.path.dirname(db_path) + if parent: + os.makedirs(parent, exist_ok=True) + + self._conn = sqlite3.connect(db_path, check_same_thread=False) + self._conn.row_factory = sqlite3.Row + self._conn.execute("PRAGMA journal_mode=WAL") + self._conn.execute("PRAGMA foreign_keys=ON") + + self._init_schema() + logger.info("FindingsStore initialized at %s", db_path) + + # ------------------------------------------------------------------ + # Schema initialization + # ------------------------------------------------------------------ + + def _init_schema(self) -> None: + """Create tables and indexes if they do not already exist.""" + with self._lock: + cur = self._conn.cursor() + cur.execute(_CREATE_FINDINGS_TABLE) + cur.execute(_CREATE_SCAN_HISTORY_TABLE) + cur.execute(_CREATE_FIX_HISTORY_TABLE) + for idx_sql in _CREATE_INDEXES: + cur.execute(idx_sql) + self._conn.commit() + logger.debug("Database schema initialized") + + # ------------------------------------------------------------------ + # Fingerprinting + # ------------------------------------------------------------------ + + @staticmethod + def fingerprint_finding(finding: dict[str, Any]) -> str: + """Compute a content-based fingerprint for a finding. + + The fingerprint is a deterministic SHA-256 digest of the + concatenation of ``vuln_type``, ``file_path``, the first 200 + characters of ``code_snippet``, and ``cwe``. + + Args: + finding: Dictionary containing finding fields. + + Returns: + The first 16 hex characters of the SHA-256 digest. 
+ """ + vuln_type = str(finding.get("vuln_type", "")) + file_path = str(finding.get("file_path", "")) + snippet = str(finding.get("code_snippet", ""))[:200] + cwe = str(finding.get("cwe", "")) + combined = f"{vuln_type}|{file_path}|{snippet}|{cwe}" + return hashlib.sha256(combined.encode("utf-8")).hexdigest()[:16] + + # ------------------------------------------------------------------ + # Core CRUD + # ------------------------------------------------------------------ + + def record_scan( + self, + scan_id: str, + findings: list[dict[str, Any]], + commit_sha: str = "", + branch: str = "", + duration: float = 0.0, + cost: float = 0.0, + ) -> ScanSummary: + """Record all findings from a scan, upserting as appropriate. + + For each finding the method computes a fingerprint and checks + whether a finding with the same fingerprint already exists: + + - **New:** inserted with ``first_seen = last_seen = now``. + - **Existing:** ``last_seen`` and ``times_seen`` are updated; + severity and status are refreshed if they changed. + - **Regression:** a previously *fixed* finding has reappeared; + its status is reset to ``open``. + + After processing findings, open findings from prior scans that + are *not* present in this scan are counted as ``fixed_since_last`` + (but their status is not changed automatically -- that requires + an explicit ``record_fix`` call). + + Args: + scan_id: Unique identifier for this scan. + findings: List of finding dictionaries. + commit_sha: Git commit SHA associated with this scan. + branch: Git branch name. + duration: Total scan duration in seconds. + cost: Estimated LLM cost in USD. + + Returns: + A :class:`ScanSummary` describing the scan results. 
+ """ + now = datetime.now(timezone.utc).isoformat() + new_count = 0 + existing_count = 0 + regression_count = 0 + severity_counts: dict[str, int] = { + "critical": 0, + "high": 0, + "medium": 0, + "low": 0, + } + + current_fingerprints: set[str] = set() + + with self._lock: + cur = self._conn.cursor() + + for finding in findings: + fp = self.fingerprint_finding(finding) + current_fingerprints.add(fp) + + severity = str(finding.get("severity", "low")).lower() + severity_counts[severity] = severity_counts.get(severity, 0) + 1 + + # Check for existing finding by fingerprint + cur.execute( + "SELECT id, status, times_seen FROM findings WHERE fingerprint = ?", + (fp,), + ) + row = cur.fetchone() + + if row is not None: + existing_id = row["id"] + prev_status = row["status"] + prev_times = row["times_seen"] + + # Regression: was fixed, now reappeared + if prev_status == "fixed": + regression_count += 1 + new_status = "open" + logger.warning( + "Regression detected for finding %s (fingerprint=%s)", + existing_id, + fp, + ) + else: + new_status = prev_status + existing_count += 1 + + cur.execute( + """ + UPDATE findings + SET last_seen = ?, + times_seen = ?, + severity = ?, + status = ?, + scan_id = ?, + scan_timestamp = ? + WHERE id = ? + """, + ( + now, + prev_times + 1, + severity, + new_status, + scan_id, + now, + existing_id, + ), + ) + else: + # New finding + new_count += 1 + finding_id = finding.get("id") or str(uuid.uuid4()) + cur.execute( + """ + INSERT INTO findings ( + id, fingerprint, scan_id, scan_timestamp, + vuln_type, severity, file_path, line_number, + cwe, cve, cvss_score, source_tool, status, + first_seen, last_seen, times_seen, + fix_verified, title, description + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
+ """, + ( + finding_id, + fp, + scan_id, + now, + str(finding.get("vuln_type", "")), + severity, + finding.get("file_path"), + finding.get("line_number"), + finding.get("cwe"), + finding.get("cve"), + finding.get("cvss_score"), + finding.get("source_tool"), + "open", + now, + now, + 1, + 0, + finding.get("title"), + finding.get("description"), + ), + ) + + # Count how many previously-open findings are absent from this scan + if current_fingerprints: + placeholders = ",".join("?" for _ in current_fingerprints) + cur.execute( + f""" + SELECT COUNT(*) AS cnt FROM findings + WHERE status = 'open' + AND fingerprint NOT IN ({placeholders}) + """, + list(current_fingerprints), + ) + else: + cur.execute( + "SELECT COUNT(*) AS cnt FROM findings WHERE status = 'open'" + ) + fixed_since_last = cur.fetchone()["cnt"] + + # Record scan history + cur.execute( + """ + INSERT OR REPLACE INTO scan_history ( + scan_id, timestamp, commit_sha, branch, + total_findings, new_findings, fixed_findings, + regression_findings, + critical, high, medium, low, + duration_seconds, cost_usd + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
+ """, + ( + scan_id, + now, + commit_sha, + branch, + len(findings), + new_count, + fixed_since_last, + regression_count, + severity_counts.get("critical", 0), + severity_counts.get("high", 0), + severity_counts.get("medium", 0), + severity_counts.get("low", 0), + duration, + cost, + ), + ) + + self._conn.commit() + + summary = ScanSummary( + scan_id=scan_id, + total_findings=len(findings), + new_findings=new_count, + existing_findings=existing_count, + fixed_since_last=fixed_since_last, + regressions=regression_count, + by_severity=severity_counts, + ) + logger.info( + "Scan %s recorded: %d total, %d new, %d existing, %d regressions, %d fixed", + scan_id, + summary.total_findings, + summary.new_findings, + summary.existing_findings, + summary.regressions, + summary.fixed_since_last, + ) + return summary + + def record_fix( + self, + finding_id: str, + fix_commit: str = "", + fix_method: str = "manual", + retest_passed: bool = False, + ) -> None: + """Record a fix for a finding. + + Inserts a row into ``fix_history`` and, if ``retest_passed`` is + ``True``, updates the finding status to ``'fixed'``. + + Args: + finding_id: ID of the finding that was fixed. + fix_commit: Git commit SHA of the fix. + fix_method: How the fix was applied (``autofix``, + ``manual``, or ``dependency_update``). + retest_passed: Whether the fix was verified by a retest. + """ + now = datetime.now(timezone.utc).isoformat() + with self._lock: + cur = self._conn.cursor() + cur.execute( + """ + INSERT INTO fix_history ( + finding_id, fix_commit, fix_timestamp, fix_method, retest_passed + ) VALUES (?, ?, ?, ?, ?) 
+ """, + (finding_id, fix_commit, now, fix_method, int(retest_passed)), + ) + if retest_passed: + cur.execute( + "UPDATE findings SET status = 'fixed', fix_verified = 1 WHERE id = ?", + (finding_id,), + ) + logger.info( + "Finding %s marked as fixed (verified by retest)", finding_id + ) + else: + logger.info( + "Fix recorded for finding %s (pending retest verification)", + finding_id, + ) + self._conn.commit() + + def update_status(self, finding_id: str, status: str) -> None: + """Update the status of a finding. + + Args: + finding_id: ID of the finding to update. + status: New status. Must be one of ``open``, ``fixed``, + ``false_positive``, ``accepted_risk``, or ``wont_fix``. + + Raises: + ValueError: If *status* is not a recognized value. + """ + if status not in VALID_STATUSES: + raise ValueError( + f"Invalid status '{status}'. Must be one of: {sorted(VALID_STATUSES)}" + ) + with self._lock: + cur = self._conn.cursor() + cur.execute( + "UPDATE findings SET status = ? WHERE id = ?", + (status, finding_id), + ) + self._conn.commit() + logger.info("Finding %s status updated to '%s'", finding_id, status) + + def get_finding(self, finding_id: str) -> dict[str, Any] | None: + """Retrieve a single finding by its ID. + + Args: + finding_id: The unique finding identifier. + + Returns: + A dictionary of finding fields, or ``None`` if not found. + """ + cur = self._conn.cursor() + cur.execute("SELECT * FROM findings WHERE id = ?", (finding_id,)) + row = cur.fetchone() + return dict(row) if row else None + + def get_finding_by_fingerprint(self, fingerprint: str) -> dict[str, Any] | None: + """Retrieve a single finding by its content-based fingerprint. + + Args: + fingerprint: The 16-char hex fingerprint. + + Returns: + A dictionary of finding fields, or ``None`` if not found. 
+ """ + cur = self._conn.cursor() + cur.execute("SELECT * FROM findings WHERE fingerprint = ?", (fingerprint,)) + row = cur.fetchone() + return dict(row) if row else None + + # ------------------------------------------------------------------ + # Analytics + # ------------------------------------------------------------------ + + def is_regression(self, fingerprint: str) -> bool: + """Check whether a finding with this fingerprint was previously fixed. + + A regression means the finding existed before with ``status='fixed'`` + and has now reappeared. + + Args: + fingerprint: Content-based fingerprint of the finding. + + Returns: + ``True`` if a previously fixed finding matches this fingerprint. + """ + cur = self._conn.cursor() + cur.execute( + "SELECT status FROM findings WHERE fingerprint = ?", (fingerprint,) + ) + row = cur.fetchone() + if row is None: + return False + return row["status"] == "fixed" + + def trending(self, days: int = 90) -> dict[str, Any]: + """Return severity counts per week for the last *days* days. + + Uses ``scan_history`` rows to aggregate weekly counts of critical, + high, medium, and low findings. + + Args: + days: Number of days to look back. Defaults to 90. + + Returns: + A dictionary keyed by ISO week string (``YYYY-WNN``) with + severity counts for each week. + """ + cur = self._conn.cursor() + cur.execute( + """ + SELECT timestamp, critical, high, medium, low + FROM scan_history + WHERE timestamp >= datetime('now', ?) 
+ ORDER BY timestamp ASC + """, + (f"-{days} days",), + ) + weeks: dict[str, dict[str, int]] = {} + for row in cur.fetchall(): + try: + dt = datetime.fromisoformat(row["timestamp"]) + week_key = f"{dt.isocalendar()[0]}-W{dt.isocalendar()[1]:02d}" + except (ValueError, TypeError): + continue + + if week_key not in weeks: + weeks[week_key] = {"critical": 0, "high": 0, "medium": 0, "low": 0} + + weeks[week_key]["critical"] += row["critical"] or 0 + weeks[week_key]["high"] += row["high"] or 0 + weeks[week_key]["medium"] += row["medium"] or 0 + weeks[week_key]["low"] += row["low"] or 0 + + return weeks + + def mean_time_to_fix(self, severity: str | None = None) -> float | None: + """Calculate the average time between first_seen and fix for fixed findings. + + Args: + severity: Optional severity filter (e.g., ``'critical'``). + + Returns: + Mean time to fix in seconds, or ``None`` if there are no + qualifying records. + """ + query = """ + SELECT f.first_seen, fh.fix_timestamp + FROM findings f + JOIN fix_history fh ON f.id = fh.finding_id + WHERE f.status = 'fixed' + AND fh.retest_passed = 1 + """ + params: list[Any] = [] + if severity: + query += " AND f.severity = ?" + params.append(severity.lower()) + + cur = self._conn.cursor() + cur.execute(query, params) + + durations: list[float] = [] + for row in cur.fetchall(): + try: + first = datetime.fromisoformat(row["first_seen"]) + fixed = datetime.fromisoformat(row["fix_timestamp"]) + delta = (fixed - first).total_seconds() + if delta >= 0: + durations.append(delta) + except (ValueError, TypeError): + continue + + if not durations: + return None + return sum(durations) / len(durations) + + def false_positive_rate(self, vuln_type: str | None = None) -> float: + """Calculate the false positive rate for a given vulnerability type. + + Args: + vuln_type: Optional vulnerability type filter. If ``None``, + computes the rate across all findings. 
+ + Returns: + The ratio of ``false_positive`` findings to total findings, + or ``0.0`` if there are no findings. + """ + if vuln_type: + total_query = "SELECT COUNT(*) AS cnt FROM findings WHERE vuln_type = ?" + fp_query = ( + "SELECT COUNT(*) AS cnt FROM findings " + "WHERE vuln_type = ? AND status = 'false_positive'" + ) + params: list[Any] = [vuln_type] + else: + total_query = "SELECT COUNT(*) AS cnt FROM findings" + fp_query = ( + "SELECT COUNT(*) AS cnt FROM findings WHERE status = 'false_positive'" + ) + params = [] + + cur = self._conn.cursor() + cur.execute(total_query, params) + total = cur.fetchone()["cnt"] + if total == 0: + return 0.0 + + cur.execute(fp_query, params) + fp_count = cur.fetchone()["cnt"] + return fp_count / total + + def top_recurring(self, limit: int = 10) -> list[dict[str, Any]]: + """Return the most frequently recurring open findings. + + Args: + limit: Maximum number of results to return. + + Returns: + A list of finding dictionaries ordered by ``times_seen`` + descending. + """ + cur = self._conn.cursor() + cur.execute( + """ + SELECT * FROM findings + WHERE status = 'open' + ORDER BY times_seen DESC + LIMIT ? + """, + (limit,), + ) + return [dict(row) for row in cur.fetchall()] + + def scan_history_summary(self, limit: int = 10) -> list[dict[str, Any]]: + """Return recent scan history rows. + + Args: + limit: Maximum number of rows to return. + + Returns: + A list of scan history dictionaries ordered by timestamp + descending. + """ + cur = self._conn.cursor() + cur.execute( + """ + SELECT * FROM scan_history + ORDER BY timestamp DESC + LIMIT ? 
+ """, + (limit,), + ) + return [dict(row) for row in cur.fetchall()] + + # ------------------------------------------------------------------ + # Context injection (for LLM enrichment) + # ------------------------------------------------------------------ + + def get_historical_context(self, finding: dict[str, Any]) -> dict[str, Any]: + """Build historical context for a finding to inject into LLM prompts. + + Computes the finding's fingerprint, looks up prior history, and + returns a context dictionary suitable for Phase 2 AI enrichment. + + Args: + finding: A finding dictionary. + + Returns: + A dictionary containing: + + - ``first_seen``: ISO timestamp of the first occurrence. + - ``times_seen``: How many scans have reported this finding. + - ``previous_status``: Status from the last scan. + - ``related_in_file``: Count of other findings in the same file. + - ``fp_rate_for_type``: False positive rate for this vuln_type. + - ``is_regression``: Whether this is a regression. + """ + fp = self.fingerprint_finding(finding) + existing = self.get_finding_by_fingerprint(fp) + + if existing is None: + return { + "first_seen": None, + "times_seen": 0, + "previous_status": None, + "related_in_file": 0, + "fp_rate_for_type": 0.0, + "is_regression": False, + } + + # Count related findings in the same file + related_count = 0 + file_path = existing.get("file_path") + if file_path: + cur = self._conn.cursor() + cur.execute( + "SELECT COUNT(*) AS cnt FROM findings WHERE file_path = ? 
AND fingerprint != ?", + (file_path, fp), + ) + related_count = cur.fetchone()["cnt"] + + vuln_type = existing.get("vuln_type", "") + fp_rate = self.false_positive_rate(vuln_type) if vuln_type else 0.0 + + return { + "first_seen": existing.get("first_seen"), + "times_seen": existing.get("times_seen", 0), + "previous_status": existing.get("status"), + "related_in_file": related_count, + "fp_rate_for_type": fp_rate, + "is_regression": self.is_regression(fp), + } + + # ------------------------------------------------------------------ + # Helpers + # ------------------------------------------------------------------ + + def _count_findings(self) -> int: + """Return the total number of findings in the database.""" + cur = self._conn.cursor() + cur.execute("SELECT COUNT(*) AS cnt FROM findings") + return cur.fetchone()["cnt"] + + def close(self) -> None: + """Close the underlying database connection.""" + self._conn.close() + logger.debug("FindingsStore connection closed") + + +# --------------------------------------------------------------------------- +# CLI demonstration +# --------------------------------------------------------------------------- + +if __name__ == "__main__": + logging.basicConfig( + level=logging.INFO, + format="%(asctime)s [%(levelname)s] %(name)s: %(message)s", + ) + + store = FindingsStore() + print(f"Findings store initialized at {store.db_path}") + print(f"Total findings: {store._count_findings()}") + print(f"Scan history: {len(store.scan_history_summary())}") diff --git a/scripts/hybrid_analyzer.py b/scripts/hybrid_analyzer.py index 3cf3d42..d1425e4 100644 --- a/scripts/hybrid_analyzer.py +++ b/scripts/hybrid_analyzer.py @@ -123,6 +123,49 @@ except ImportError: _LICENSE_OK = False +# Continuous security testing modules (v3.0) +try: + from diff_impact_analyzer import DiffScopeBuilder + + _DIFF_SCOPE_OK = True +except ImportError: + _DIFF_SCOPE_OK = False + +try: + from findings_store import FindingsStore + + _FINDINGS_STORE_OK = True +except 
ImportError: + _FINDINGS_STORE_OK = False + +try: + from app_context_builder import AppContextBuilder + + _APP_CONTEXT_OK = True +except ImportError: + _APP_CONTEXT_OK = False + +try: + from agent_chain_discovery import AgentChainDiscovery, CrossComponentAnalyzer + + _AGENT_CHAIN_OK = True +except ImportError: + _AGENT_CHAIN_OK = False + +try: + from autofix_pr_generator import AutoFixPRGenerator, ClosedLoopOrchestrator + + _AUTOFIX_OK = True +except ImportError: + _AUTOFIX_OK = False + +try: + from sast_dast_validator import SastDastValidator + + _LIVE_VALIDATION_OK = True +except ImportError: + _LIVE_VALIDATION_OK = False + try: from heuristic_scanner import ( _SAFE_PATTERN_FLAGS as _HEURISTIC_SAFE_FLAGS, @@ -565,6 +608,80 @@ def __init__( except Exception as e: logger.warning("Scanner registry init failed (non-fatal): %s", e) + # -- Continuous security testing modules (v3.0) -- + self.diff_scope_builder = None + self.findings_store = None + self.app_context = None + self.agent_chain_discovery = None + self.cross_component_analyzer = None + self.autofix_generator = None + self.live_validator = None + + if self.config.get("enable_diff_scoping", True) and _DIFF_SCOPE_OK: + try: + self.diff_scope_builder = DiffScopeBuilder() + logger.info("Diff-intelligent scanner scoping initialized") + except Exception as e: + logger.warning("Diff scope builder not available: %s", e) + + if self.config.get("enable_findings_store", True) and _FINDINGS_STORE_OK: + try: + db_path = self.config.get("findings_db_path", ".argus/findings.db") + self.findings_store = FindingsStore(db_path=db_path) + logger.info("Persistent findings store initialized (%s)", db_path) + except Exception as e: + logger.warning("Findings store not available: %s", e) + + if self.config.get("enable_app_context", True) and _APP_CONTEXT_OK: + try: + project_path = self.config.get("project_path", ".") + builder = AppContextBuilder(project_path) + self.app_context = builder.build() + logger.info("Application 
context: %s/%s", self.app_context.language, self.app_context.framework) + except Exception as e: + logger.warning("App context builder not available: %s", e) + + if self.config.get("enable_agent_chain_discovery", False) and _AGENT_CHAIN_OK and self.ai_client: + try: + self.agent_chain_discovery = AgentChainDiscovery( + llm_call=self.ai_client.call_llm_api + if hasattr(self.ai_client, "call_llm_api") + else None, + ) + logger.info("Agent-driven chain discovery initialized") + except Exception as e: + logger.warning("Agent chain discovery not available: %s", e) + + if self.config.get("enable_cross_component_analysis", True) and _AGENT_CHAIN_OK: + try: + project_path = self.config.get("project_path", ".") + self.cross_component_analyzer = CrossComponentAnalyzer(project_path=project_path) + logger.info("Cross-component analyzer initialized") + except Exception as e: + logger.warning("Cross-component analyzer not available: %s", e) + + if self.config.get("enable_autofix_pr", False) and _AUTOFIX_OK: + try: + project_path = self.config.get("project_path", ".") + self.autofix_generator = AutoFixPRGenerator(project_path=project_path) + logger.info("AutoFix PR generator initialized") + except Exception as e: + logger.warning("AutoFix PR generator not available: %s", e) + + if ( + self.config.get("enable_live_validation", False) + and _LIVE_VALIDATION_OK + and self.dast_target_url + ): + try: + self.live_validator = SastDastValidator( + target_url=self.dast_target_url, + timeout=10, + ) + logger.info("Live target validator initialized (%s)", self.dast_target_url) + except Exception as e: + logger.warning("Live target validator not available: %s", e) + # -- Phase 0: MCP Server (optional background service) -- self._mcp_server = None self._mcp_thread = None diff --git a/scripts/sast_dast_validator.py b/scripts/sast_dast_validator.py new file mode 100644 index 0000000..b7c06d3 --- /dev/null +++ b/scripts/sast_dast_validator.py @@ -0,0 +1,844 @@ +#!/usr/bin/env python3 +""" 
+SAST-to-DAST Validator for Argus Security + +Bridges static analysis (SAST) findings with dynamic application security +testing (DAST) by generating targeted HTTP tests against live targets. +Validates whether SAST-discovered vulnerabilities are actually exploitable +in a running application. + +Complements the sandbox-based validation in sandbox_validator.py by testing +against live deployments (staging, preview, development environments). + +Safety: + - NEVER targets production by default + - Rejects internal/private IP ranges unless explicitly allowed + - Truncates response bodies to prevent memory issues + - Defensive error handling for all network operations + +Usage: + from sast_dast_validator import SastDastValidator + + validator = SastDastValidator('https://staging.example.com') + result = validator.validate_finding(finding) +""" + +from __future__ import annotations + +import json +import logging +import re +import ssl +import time +import urllib.error +import urllib.parse +import urllib.request +from dataclasses import asdict, dataclass, field +from typing import Any, Optional + +logger = logging.getLogger(__name__) + +# --------------------------------------------------------------------------- +# Constants +# --------------------------------------------------------------------------- + +MAX_RESPONSE_BODY = 2000 # Truncate response bodies to this length + +DEFAULT_ALLOWED_ENVIRONMENTS = ["staging", "preview", "development", "testing"] + +# Private/internal IP ranges that should be blocked by default +_PRIVATE_IP_PATTERNS = [ + re.compile(r"^127\."), # loopback + re.compile(r"^10\."), # 10.0.0.0/8 + re.compile(r"^172\.(1[6-9]|2[0-9]|3[01])\."), # 172.16.0.0/12 + re.compile(r"^192\.168\."), # 192.168.0.0/16 + re.compile(r"^169\.254\."), # link-local + re.compile(r"^0\."), # current network +] + +_LOCALHOST_HOSTNAMES = {"localhost", "localhost.localdomain", "ip6-localhost"} + +# --------------------------------------------------------------------------- 
+# Vulnerability payload definitions
+# ---------------------------------------------------------------------------
+
+_SQL_INJECTION_PAYLOADS = [
+    "' OR '1'='1",
+    "1; DROP TABLE--",
+    "' UNION SELECT NULL--",
+]
+
+_XSS_PAYLOADS = [
+    "<script>alert(1)</script>",
+    '"><img src=x onerror=alert(1)>',
+]
+
+_SSRF_PAYLOADS = [
+    "http://169.254.169.254/latest/meta-data/",
+    "http://localhost:6379/",
+]
+
+_PATH_TRAVERSAL_PAYLOADS = [
+    "../../etc/passwd",
+    "..\\..\\windows\\win.ini",
+]
+
+_COMMAND_INJECTION_PAYLOADS = [
+    "; id",
+    "| whoami",
+    "`id`",
+]
+
+# Success indicator patterns per vulnerability type.
+# NOTE: indicators must never be empty strings -- an empty indicator
+# matches every response body and yields false positives.
+_SUCCESS_INDICATORS: dict[str, list[str]] = {
+    "sql-injection": [
+        "sql syntax",
+        "mysql",
+        "sqlite",
+        "postgresql",
+        "ora-",
+        "unclosed quotation",
+        "unterminated string",
+        "you have an error in your sql",
+        "warning: mysql",
+        "microsoft ole db",
+        "odbc sql server",
+        "UNION SELECT",
+    ],
+    "sqli": [],  # shares with sql-injection, handled in code
+    "xss": [
+        "<script>alert(1)</script>",
+        "<img src=x onerror=alert(1)>",
+    ],
+    "ssrf": [
+        "ami-",
+        "instance-id",
+        "iam/security-credentials",
+        "meta-data",
+        "REDIS",
+        "+PONG",
+        "+OK",
+    ],
+    "path-traversal": [
+        "root:",
+        "/bin/bash",
+        "/bin/sh",
+        "[boot loader]",
+        "[operating systems]",
+        "root:x:0:0",
+    ],
+    "command-injection": [
+        "uid=",
+        "gid=",
+        "root",
+        "whoami",
+    ],
+}
+
+# Alias sqli to share sql-injection indicators
+_SUCCESS_INDICATORS["sqli"] = _SUCCESS_INDICATORS["sql-injection"]
+
+
+# ---------------------------------------------------------------------------
+# Data classes
+# ---------------------------------------------------------------------------
+
+
+@dataclass
+class TestCase:
+    """An HTTP test case generated from a SAST finding."""
+
+    endpoint: str
+    method: str
+    payloads: list[str]
+    vuln_type: str
+    finding_id: str
+    success_indicators: list[str]
+    inject_in: str = "query"
+    content_type: str = "application/x-www-form-urlencoded"
+
+
+@dataclass
+class TestExecution:
+    """Result of executing a single HTTP request."""
+
+    status_code: int | 
None + response_body: str + response_headers: dict[str, str] + error: str | None + duration_ms: float + + +@dataclass +class ValidationResult: + """Outcome of validating a SAST finding against a live target.""" + + finding_id: str + validated: bool + validation_method: str + evidence: str + payload_used: str + endpoint_tested: str + http_status: int | None + error: str | None = None + + def to_dict(self) -> dict[str, Any]: + """Serialize to a plain dictionary.""" + return asdict(self) + + +# --------------------------------------------------------------------------- +# Route inference helpers +# --------------------------------------------------------------------------- + +# Mapping from common framework directory names to API prefixes +_ROUTE_PREFIX_MAP: list[tuple[re.Pattern[str], str]] = [ + (re.compile(r"routes/(.+?)\.(?:py|js|ts|rb)$"), "/api/{0}"), + (re.compile(r"controllers/(.+?)_controller\.(?:py|js|ts|rb)$"), "/{0}"), + (re.compile(r"controllers/(.+?)Controller\.(?:java|go|cs)$"), "/{0}"), + (re.compile(r"api/(.+?)\.(?:py|js|ts|rb)$"), "/api/{0}"), + (re.compile(r"views/(.+?)\.(?:py|rb)$"), "/{0}"), + (re.compile(r"handlers/(.+?)\.(?:go|py|js|ts)$"), "/{0}"), + (re.compile(r"endpoints/(.+?)\.(?:py|js|ts)$"), "/api/{0}"), +] + + +# --------------------------------------------------------------------------- +# Main validator class +# --------------------------------------------------------------------------- + + +class SastDastValidator: + """Validates SAST findings by generating targeted HTTP tests against a live target. + + Generates vulnerability-specific HTTP requests, executes them against + the target, and analyzes responses for evidence of exploitability. + + Attributes: + target_url: Base URL of the live target. + auth_headers: Optional auth headers included in requests. + timeout: HTTP request timeout in seconds. + allowed_environments: Only allow validation against these environments. 
+ """ + + def __init__( + self, + target_url: str, + auth_headers: dict[str, str] | None = None, + timeout: int = 10, + allowed_environments: list[str] | None = None, + ): + """Initialize the SAST-to-DAST validator. + + Args: + target_url: Base URL of the live target (e.g., https://staging.example.com). + auth_headers: Optional auth headers to include in requests. + timeout: HTTP request timeout in seconds. + allowed_environments: Only allow validation against these environments. + Defaults to ['staging', 'preview', 'development', 'testing']. + NEVER includes 'production' by default. + + Raises: + ValueError: If the target URL is invalid or blocked by safety checks. + """ + self.target_url = target_url.rstrip("/") + self.auth_headers = auth_headers or {} + self.timeout = timeout + self.allowed_environments = ( + allowed_environments if allowed_environments is not None + else list(DEFAULT_ALLOWED_ENVIRONMENTS) + ) + + if not self._validate_target_url(self.target_url): + raise ValueError( + f"Target URL rejected by safety checks: {self.target_url}. " + f"Internal IPs and localhost are blocked unless explicitly allowed." + ) + + logger.info("SastDastValidator initialized for target: %s", self.target_url) + + # ------------------------------------------------------------------ + # Public methods + # ------------------------------------------------------------------ + + def validate_finding(self, finding: dict[str, Any]) -> ValidationResult: + """Validate a single SAST finding against the live target. + + Generates a targeted HTTP test case for the finding, executes it, + and returns the validation result. + + Args: + finding: Dict describing the SAST finding. Expected keys include + ``id`` or ``finding_id``, ``vuln_type``, ``endpoint``/``url``/ + ``route``/``file_path``, and optionally ``method``, ``parameter``. + + Returns: + ValidationResult with validated=True if vulnerability confirmed, + validation_method='not_applicable' if no test could be generated. 
+ """ + finding_id = str( + finding.get("id", finding.get("finding_id", f"unknown-{id(finding)}")) + ) + + test_case = self._generate_test_case(finding) + if test_case is None: + logger.debug( + "No test case generated for finding %s (vuln_type=%s)", + finding_id, + finding.get("vuln_type", "unknown"), + ) + return ValidationResult( + finding_id=finding_id, + validated=False, + validation_method="not_applicable", + evidence="No test case could be generated for this finding type", + payload_used="", + endpoint_tested="", + http_status=None, + ) + + # Try each payload until one succeeds or all fail + last_execution: TestExecution | None = None + for payload in test_case.payloads: + current_case = TestCase( + endpoint=test_case.endpoint, + method=test_case.method, + payloads=[payload], + vuln_type=test_case.vuln_type, + finding_id=test_case.finding_id, + success_indicators=test_case.success_indicators, + inject_in=test_case.inject_in, + content_type=test_case.content_type, + ) + + execution = self._execute_test(current_case) + last_execution = execution + + if execution.error is not None: + logger.debug( + "Test execution error for finding %s with payload %r: %s", + finding_id, + payload, + execution.error, + ) + continue + + if self._check_success_indicators( + execution.response_body, + execution.status_code or 0, + current_case, + ): + logger.info( + "Vulnerability confirmed for finding %s at %s with payload %r", + finding_id, + test_case.endpoint, + payload, + ) + return ValidationResult( + finding_id=finding_id, + validated=True, + validation_method="live_dast", + evidence=execution.response_body[:MAX_RESPONSE_BODY], + payload_used=payload, + endpoint_tested=test_case.endpoint, + http_status=execution.status_code, + ) + + # No payload triggered the vulnerability + evidence = "" + error_msg = None + http_status = None + if last_execution is not None: + evidence = last_execution.response_body[:MAX_RESPONSE_BODY] + error_msg = last_execution.error + http_status = 
last_execution.status_code + + return ValidationResult( + finding_id=finding_id, + validated=False, + validation_method="live_dast", + evidence=evidence, + payload_used="", + endpoint_tested=test_case.endpoint, + http_status=http_status, + error=error_msg, + ) + + def validate_batch( + self, + findings: list[dict[str, Any]], + max_concurrent: int = 5, + ) -> list[ValidationResult]: + """Validate multiple SAST findings against the live target. + + Filters to only findings that have endpoint/route information, + then validates them sequentially. + + Args: + findings: List of finding dicts. + max_concurrent: Reserved for future concurrent execution. + Currently unused (sequential execution). + + Returns: + List of ValidationResult for each testable finding. + """ + testable: list[dict[str, Any]] = [] + for finding in findings: + endpoint = self._infer_endpoint(finding) + if endpoint is not None: + testable.append(finding) + else: + logger.debug( + "Skipping finding %s: no endpoint could be inferred", + finding.get("id", finding.get("finding_id", "unknown")), + ) + + logger.info( + "Validating %d of %d findings (filtered to those with endpoints)", + len(testable), + len(findings), + ) + + results: list[ValidationResult] = [] + for finding in testable: + result = self.validate_finding(finding) + results.append(result) + + return results + + # ------------------------------------------------------------------ + # Test case generation + # ------------------------------------------------------------------ + + def _generate_test_case(self, finding: dict[str, Any]) -> TestCase | None: + """Map a SAST finding to an HTTP test case. + + Args: + finding: Finding dict with at least ``vuln_type`` and enough + information to infer an endpoint. + + Returns: + TestCase if the vulnerability type is mappable, None otherwise. 
+ """ + vuln_type = finding.get("vuln_type", "").lower().strip() + finding_id = str( + finding.get("id", finding.get("finding_id", f"unknown-{id(finding)}")) + ) + + endpoint = self._infer_endpoint(finding) + if endpoint is None: + return None + + full_url = f"{self.target_url}{endpoint}" + method = finding.get("method", "GET").upper() + parameter = finding.get("parameter", "input") + + if vuln_type in ("sql-injection", "sqli"): + return TestCase( + endpoint=full_url, + method=method, + payloads=list(_SQL_INJECTION_PAYLOADS), + vuln_type=vuln_type, + finding_id=finding_id, + success_indicators=list(_SUCCESS_INDICATORS["sql-injection"]), + inject_in="query", + ) + + if vuln_type == "xss": + return TestCase( + endpoint=full_url, + method=method, + payloads=list(_XSS_PAYLOADS), + vuln_type=vuln_type, + finding_id=finding_id, + success_indicators=list(_SUCCESS_INDICATORS["xss"]), + inject_in="query", + ) + + if vuln_type == "ssrf": + return TestCase( + endpoint=full_url, + method=method if method != "GET" else "POST", + payloads=list(_SSRF_PAYLOADS), + vuln_type=vuln_type, + finding_id=finding_id, + success_indicators=list(_SUCCESS_INDICATORS["ssrf"]), + inject_in="body", + content_type="application/x-www-form-urlencoded", + ) + + if vuln_type == "path-traversal": + return TestCase( + endpoint=full_url, + method=method, + payloads=list(_PATH_TRAVERSAL_PAYLOADS), + vuln_type=vuln_type, + finding_id=finding_id, + success_indicators=list(_SUCCESS_INDICATORS["path-traversal"]), + inject_in="path", + ) + + if vuln_type == "command-injection": + return TestCase( + endpoint=full_url, + method=method if method != "GET" else "POST", + payloads=list(_COMMAND_INJECTION_PAYLOADS), + vuln_type=vuln_type, + finding_id=finding_id, + success_indicators=list(_SUCCESS_INDICATORS["command-injection"]), + inject_in="body", + content_type="application/x-www-form-urlencoded", + ) + + if vuln_type in ("idor", "broken-access-control"): + # Test accessing the endpoint without auth and with 
modified IDs
+            return TestCase(
+                endpoint=full_url,
+                method=method,
+                payloads=["__no_auth__", "__different_id__"],
+                vuln_type=vuln_type,
+                finding_id=finding_id,
+                success_indicators=[],  # success is determined by status code
+                inject_in="header",
+            )
+
+        logger.debug("Unmappable vuln_type: %s", vuln_type)
+        return None
+
+    # ------------------------------------------------------------------
+    # Endpoint inference
+    # ------------------------------------------------------------------
+
+    def _infer_endpoint(self, finding: dict[str, Any]) -> str | None:
+        """Try to extract or infer an HTTP endpoint from a finding.
+
+        Checks explicit keys first (``endpoint``, ``url``, ``route``), then
+        falls back to inferring from ``file_path``.
+
+        Args:
+            finding: Finding dict.
+
+        Returns:
+            Endpoint path string (e.g., ``/api/users``) or None.
+        """
+        # Direct keys
+        for key in ("endpoint", "url", "route"):
+            value = finding.get(key)
+            if value and isinstance(value, str):
+                # Ensure it starts with /
+                if value.startswith("/"):
+                    return value
+                # If it's a full URL, extract the path. A bare segment like
+                # "users" has no scheme and a slash-less urlparse path, so it
+                # falls through to the prefixed form below instead.
+                parsed = urllib.parse.urlparse(value)
+                if parsed.scheme and parsed.path.startswith("/"):
+                    return parsed.path
+                return f"/{value}"
+
+        # Infer from file_path
+        file_path = finding.get("file_path", finding.get("file", ""))
+        if not file_path or not isinstance(file_path, str):
+            return None
+
+        # Normalize separators
+        file_path = file_path.replace("\\", "/")
+
+        for pattern, template in _ROUTE_PREFIX_MAP:
+            match = pattern.search(file_path)
+            if match:
+                # Extract the captured group, clean it up
+                name = match.group(1)
+                # Convert snake_case filename parts to kebab-case path segments
+                name = name.replace("_", "-")
+                return template.format(name)
+
+        return None
+
+    # ------------------------------------------------------------------
+    # Test execution
+    # ------------------------------------------------------------------
+
+    def _execute_test(self, test_case: TestCase) -> TestExecution:
+        """Execute an HTTP test case against 
the live target. + + Args: + test_case: The test case to execute. Uses the first payload + in the payloads list. + + Returns: + TestExecution with response details or error information. + """ + payload = test_case.payloads[0] if test_case.payloads else "" + url = test_case.endpoint + method = test_case.method + headers = dict(self.auth_headers) + body_data: bytes | None = None + + # Handle IDOR/broken-access-control special payloads + if test_case.vuln_type in ("idor", "broken-access-control"): + if payload == "__no_auth__": + # Strip auth headers to test unauthenticated access + headers = {} + elif payload == "__different_id__": + # Modify numeric IDs in the URL + url = re.sub(r"/(\d+)(?=/|$)", "/99999", url) + + elif test_case.inject_in == "query": + # Inject payload as query parameter + separator = "&" if "?" in url else "?" + param = urllib.parse.quote(payload, safe="") + url = f"{url}{separator}input={param}" + + elif test_case.inject_in == "body": + headers["Content-Type"] = test_case.content_type + body_data = urllib.parse.urlencode({"input": payload}).encode("utf-8") + + elif test_case.inject_in == "path": + # Append payload to the URL path + encoded_payload = urllib.parse.quote(payload, safe="") + url = f"{url}/{encoded_payload}" + + elif test_case.inject_in == "header": + headers["X-Custom-Input"] = payload + + headers.setdefault("User-Agent", "Argus-Security-DAST-Validator/1.0") + + start_time = time.monotonic() + + try: + request = urllib.request.Request( + url, + data=body_data, + headers=headers, + method=method, + ) + + # Create an SSL context that still validates certificates + ctx = ssl.create_default_context() + + response = urllib.request.urlopen( + request, + timeout=self.timeout, + context=ctx, + ) + + duration_ms = (time.monotonic() - start_time) * 1000 + status_code = response.getcode() + response_headers = dict(response.headers) + raw_body = response.read() + + # Decode defensively + try: + response_body = raw_body.decode("utf-8", 
errors="replace") + except Exception: + response_body = raw_body.decode("latin-1", errors="replace") + + # Truncate + response_body = response_body[:MAX_RESPONSE_BODY] + + return TestExecution( + status_code=status_code, + response_body=response_body, + response_headers=response_headers, + error=None, + duration_ms=round(duration_ms, 2), + ) + + except urllib.error.HTTPError as exc: + duration_ms = (time.monotonic() - start_time) * 1000 + try: + err_body = exc.read().decode("utf-8", errors="replace")[:MAX_RESPONSE_BODY] + except Exception: + err_body = "" + + return TestExecution( + status_code=exc.code, + response_body=err_body, + response_headers=dict(exc.headers) if exc.headers else {}, + error=None, # HTTP errors are valid responses, not execution errors + duration_ms=round(duration_ms, 2), + ) + + except urllib.error.URLError as exc: + duration_ms = (time.monotonic() - start_time) * 1000 + logger.debug("URLError for %s: %s", url, exc.reason) + return TestExecution( + status_code=None, + response_body="", + response_headers={}, + error=f"URLError: {exc.reason}", + duration_ms=round(duration_ms, 2), + ) + + except Exception as exc: + duration_ms = (time.monotonic() - start_time) * 1000 + logger.debug("Unexpected error for %s: %s", url, exc) + return TestExecution( + status_code=None, + response_body="", + response_headers={}, + error=f"{type(exc).__name__}: {exc}", + duration_ms=round(duration_ms, 2), + ) + + # ------------------------------------------------------------------ + # Response analysis + # ------------------------------------------------------------------ + + def _check_success_indicators( + self, + response_body: str, + status_code: int, + test_case: TestCase, + ) -> bool: + """Check if the response indicates the vulnerability was exploited. + + Args: + response_body: The HTTP response body text. + status_code: The HTTP status code. + test_case: The test case that was executed. + + Returns: + True if evidence of exploitation is found. 
+ """ + body_lower = response_body.lower() + vuln_type = test_case.vuln_type.lower() + + # IDOR / broken-access-control: success if resource accessible + if vuln_type in ("idor", "broken-access-control"): + payload = test_case.payloads[0] if test_case.payloads else "" + if payload == "__no_auth__": + # Resource accessible without auth is a problem + return status_code in (200, 201, 202) + if payload == "__different_id__": + # Resource accessible with a different ID is a problem + return status_code in (200, 201, 202) + return False + + # Check explicit success indicators + for indicator in test_case.success_indicators: + if indicator.lower() in body_lower: + return True + + # Vuln-type-specific heuristics + if vuln_type in ("sql-injection", "sqli"): + # Database error messages in response alongside 200/500 status + if status_code in (200, 500): + db_errors = [ + "sql syntax", + "mysql_", + "pg_query", + "sqlite3", + "ora-0", + "microsoft sql", + "unclosed quotation", + "unterminated string", + ] + for err in db_errors: + if err in body_lower: + return True + + elif vuln_type == "xss": + # Check if payload is reflected verbatim + for payload in test_case.payloads: + if payload.lower() in body_lower: + return True + + elif vuln_type == "ssrf": + # Internal service responses + ssrf_markers = [ + "ami-", + "instance-id", + "security-credentials", + "+pong", + "+ok", + "redis_version", + ] + for marker in ssrf_markers: + if marker in body_lower: + return True + + elif vuln_type == "path-traversal": + # File content markers + file_markers = [ + "root:", + "/bin/bash", + "/bin/sh", + "[boot loader]", + "[operating systems]", + ] + for marker in file_markers: + if marker in body_lower: + return True + + elif vuln_type == "command-injection": + # Command output markers + cmd_markers = ["uid=", "gid=", "groups="] + for marker in cmd_markers: + if marker in body_lower: + return True + + return False + + # ------------------------------------------------------------------ + # 
Safety checks
+    # ------------------------------------------------------------------
+
+    def _validate_target_url(self, url: str) -> bool:
+        """Validate that the target URL is safe to test against.
+
+        Rejects localhost hostnames, loopback and private/internal IP
+        ranges, and hostnames that appear to be production unless
+        'production' is explicitly listed in allowed_environments.
+
+        Note: this is a hostname-based check only. It does not resolve
+        DNS, so a public domain that points at an internal IP will pass.
+
+        Args:
+            url: The URL to validate.
+
+        Returns:
+            True if the URL is safe to target.
+        """
+        try:
+            parsed = urllib.parse.urlparse(url)
+        except Exception:
+            logger.warning("Failed to parse target URL: %s", url)
+            return False
+
+        if not parsed.scheme or not parsed.hostname:
+            logger.warning("Target URL missing scheme or hostname: %s", url)
+            return False
+
+        hostname = parsed.hostname.lower()
+
+        # Block localhost
+        if hostname in _LOCALHOST_HOSTNAMES:
+            logger.warning("Target URL points to localhost: %s", url)
+            return False
+
+        # Block private IP ranges
+        for pattern in _PRIVATE_IP_PATTERNS:
+            if pattern.match(hostname):
+                logger.warning(
+                    "Target URL points to private/internal IP: %s", url
+                )
+                return False
+
+        # Check if the environment is allowed
+        is_production = "production" in hostname or "prod" in hostname.split(".")
+        if is_production and "production" not in self.allowed_environments:
+            logger.warning(
+                "Target URL appears to be production and 'production' is not "
+                "in allowed_environments: %s",
+                url,
+            )
+            return False
+
+        return True
+
+
+# ---------------------------------------------------------------------------
+# Entry point
+# ---------------------------------------------------------------------------
+
+if __name__ == "__main__":
+    print("SAST-to-DAST Validator for Argus Security")
+    print("Usage: Integrated into pipeline when dast_target_url is configured")
+    print("    validator = SastDastValidator('https://staging.example.com')")
+    print("    result = validator.validate_finding(finding)")
diff --git a/tests/test_continuous_security.py b/tests/test_continuous_security.py
new file mode 100644
index 0000000..a933fcc
--- /dev/null
+++ 
b/tests/test_continuous_security.py @@ -0,0 +1,664 @@ +"""Tests for continuous security testing modules (v3.0).""" +import os +import sys +import json +import pytest +from unittest.mock import patch, MagicMock +from pathlib import Path + +# Add scripts directory to path +sys.path.insert(0, str(Path(__file__).parent.parent / "scripts")) + +from diff_impact_analyzer import ( + DiffClassifier, + DiffClassification, + DiffImpactAnalyzer, + DiffScopeBuilder, + ScanScope, +) +from agent_chain_discovery import ( + AgentChainDiscovery, + AttackChain, + AttackStep, + CrossComponentAnalyzer, + CrossComponentRisk, +) +from findings_store import FindingsStore, ScanSummary +from app_context_builder import AppContextBuilder, ApplicationContext +from autofix_pr_generator import ( + AutoFixPRGenerator, + ClosedLoopOrchestrator, + FixBranch, + FixPR, + LoopResult, +) +import config_loader + + +# ============================================================================ +# 1. DiffClassifier tests +# ============================================================================ + + +class TestDiffClassifier: + """Tests for DiffClassifier file classification.""" + + def test_diff_classifier_skip_docs(self): + """Markdown and image files should be classified as skippable.""" + classifier = DiffClassifier() + result = classifier.classify(["README.md", "docs/guide.md", "logo.png"]) + assert result.skippable == ["README.md", "docs/guide.md", "logo.png"] + assert result.security_relevant == [] + + def test_diff_classifier_scan_auth(self): + """Auth-related files should always be security_relevant.""" + classifier = DiffClassifier() + result = classifier.classify(["src/auth/handler.py", "lib/session.js"]) + assert "src/auth/handler.py" in result.security_relevant + assert "lib/session.js" in result.security_relevant + assert result.skippable == [] + + def test_diff_classifier_default_to_scan(self): + """Unknown file types should default to security_relevant.""" + classifier = 
DiffClassifier() + result = classifier.classify(["src/utils/calc.py", "lib/server.go"]) + assert "src/utils/calc.py" in result.security_relevant + assert "lib/server.go" in result.security_relevant + + def test_diff_classifier_should_scan_false(self): + """All-docs changes should set should_scan=False.""" + classifier = DiffClassifier() + result = classifier.classify(["README.md", "CHANGELOG", "docs/notes.txt"]) + assert result.should_scan is False + assert result.security_relevant == [] + + def test_diff_classifier_mixed(self): + """Mix of security and docs files classifies correctly.""" + classifier = DiffClassifier() + files = ["README.md", "src/auth/login.py", "logo.png", "app/views.py"] + result = classifier.classify(files) + assert "README.md" in result.skippable + assert "logo.png" in result.skippable + assert "src/auth/login.py" in result.security_relevant + assert "app/views.py" in result.security_relevant + assert result.should_scan is True + assert result.total_changed == 4 + + +# ============================================================================ +# 2. 
DiffScopeBuilder tests +# ============================================================================ + + +class TestDiffScopeBuilder: + """Tests for DiffScopeBuilder scope construction.""" + + def test_scope_builder_full_project_when_not_scoped(self): + """With only_changed=False, returns full project scope.""" + builder = DiffScopeBuilder() + scope = builder.build_scope( + project_path="/tmp/project", + changed_files=["foo.py"], + only_changed=False, + ) + assert scope.is_scoped is False + assert scope.files == [] + + def test_scope_builder_scoped_mode(self): + """With only_changed=True and changed_files, returns scoped result.""" + builder = DiffScopeBuilder() + scope = builder.build_scope( + project_path="/tmp/project", + changed_files=["src/auth/handler.py", "README.md"], + only_changed=True, + ) + assert scope.is_scoped is True + assert "src/auth/handler.py" in scope.files + assert "README.md" in scope.skipped + + def test_semgrep_include_args(self): + """get_semgrep_include_args returns proper CLI args.""" + scope = ScanScope( + files=["src/auth.py", "src/api.py"], + is_scoped=True, + ) + args = DiffScopeBuilder.get_semgrep_include_args(scope) + assert args == ["--include", "src/auth.py", "--include", "src/api.py"] + + def test_semgrep_include_args_empty_for_unscoped(self): + """get_semgrep_include_args returns empty list for unscoped scope.""" + scope = ScanScope(files=[], is_scoped=False) + args = DiffScopeBuilder.get_semgrep_include_args(scope) + assert args == [] + + +# ============================================================================ +# 3. 
AgentChainDiscovery tests +# ============================================================================ + + +class TestAgentChainDiscovery: + """Tests for AgentChainDiscovery LLM response parsing.""" + + def test_parse_chains_valid_json(self): + """Valid JSON response is parsed into AttackChain objects.""" + mock_response = json.dumps([ + { + "chain_id": "chain-1", + "finding_ids": ["f1", "f2"], + "steps": [ + {"finding_id": "f1", "action": "exploit SQLi", "enables": "data leak"}, + {"finding_id": "f2", "action": "escalate", "enables": "admin access"}, + ], + "severity": "critical", + "complexity": "low", + "impact": "Full database compromise", + "description": "SQLi to privilege escalation", + } + ]) + + def fake_llm(prompt: str) -> str: + return mock_response + + discoverer = AgentChainDiscovery(llm_call=fake_llm) + chains = discoverer.discover_chains([ + {"id": "f1", "type": "sqli", "severity": "high", "file": "db.py", "description": "SQL injection"}, + {"id": "f2", "type": "auth", "severity": "medium", "file": "auth.py", "description": "Weak auth"}, + ]) + + assert len(chains) == 1 + assert chains[0].chain_id == "chain-1" + assert chains[0].finding_ids == ["f1", "f2"] + assert len(chains[0].steps) == 2 + assert chains[0].severity == "critical" + assert chains[0].complexity == "low" + + def test_parse_chains_markdown_fenced(self): + """JSON wrapped in ```json ... ``` blocks is parsed.""" + inner = json.dumps([ + { + "chain_id": "c1", + "finding_ids": ["a", "b"], + "steps": [], + "severity": "high", + "complexity": "medium", + "impact": "data leak", + "description": "chained attack", + } + ]) + fenced = f"Here is the analysis:\n```json\n{inner}\n```\nDone." 
+ + def fake_llm(prompt: str) -> str: + return fenced + + discoverer = AgentChainDiscovery(llm_call=fake_llm) + chains = discoverer.discover_chains([ + {"id": "a", "type": "xss", "severity": "high", "file": "x.js", "description": "XSS"}, + {"id": "b", "type": "csrf", "severity": "medium", "file": "y.js", "description": "CSRF"}, + ]) + + assert len(chains) == 1 + assert chains[0].chain_id == "c1" + + def test_parse_chains_invalid_json(self): + """Invalid JSON returns empty list.""" + + def fake_llm(prompt: str) -> str: + return "This is not valid JSON at all {{{broken" + + discoverer = AgentChainDiscovery(llm_call=fake_llm) + chains = discoverer.discover_chains([ + {"id": "x", "type": "test", "severity": "low", "file": "t.py", "description": "test"}, + ]) + + assert chains == [] + + +# ============================================================================ +# 4. CrossComponentAnalyzer tests +# ============================================================================ + + +class TestCrossComponentAnalyzer: + """Tests for CrossComponentAnalyzer component classification and risk detection.""" + + def test_classify_component_auth(self): + """Files in auth/ directory classified as auth component.""" + analyzer = CrossComponentAnalyzer(project_path="/tmp/project") + component = analyzer._classify_component("src/auth/login.py") + assert component == "auth" + + def test_classify_component_api(self): + """Files in api/ or routes/ directory classified correctly.""" + analyzer = CrossComponentAnalyzer(project_path="/tmp/project") + + assert analyzer._classify_component("src/api/users.py") == "api" + assert analyzer._classify_component("app/routes/index.js") == "routes" + + def test_classify_component_other(self): + """Files not in any known directory classified as other.""" + analyzer = CrossComponentAnalyzer(project_path="/tmp/project") + component = analyzer._classify_component("scripts/deploy.sh") + assert component == "other" + + def 
test_dangerous_combination_auth_api(self): + """Auth + API findings trigger broken access control risk.""" + analyzer = CrossComponentAnalyzer(project_path="/tmp/project") + findings = [ + {"id": "f1", "file": "src/auth/handler.py", "severity": "high"}, + {"id": "f2", "file": "src/api/users.py", "severity": "medium"}, + ] + risks = analyzer.analyze(findings) + assert len(risks) >= 1 + risk_types = [r["risk_type"] for r in risks] + assert "broken_access_control" in risk_types + + # Verify the risk has the right structure + bac_risk = [r for r in risks if r["risk_type"] == "broken_access_control"][0] + assert bac_risk["severity"] == "critical" + assert "f1" in bac_risk["findings_a"] or "f1" in bac_risk["findings_b"] + + def test_no_dangerous_combinations(self): + """Findings in the same component should not trigger cross-component risks.""" + analyzer = CrossComponentAnalyzer(project_path="/tmp/project") + findings = [ + {"id": "f1", "file": "src/utils/helper.py"}, + {"id": "f2", "file": "src/utils/parser.py"}, + ] + risks = analyzer.analyze(findings) + assert risks == [] + + +# ============================================================================ +# 5. 
FindingsStore tests +# ============================================================================ + + +class TestFindingsStore: + """Tests for FindingsStore persistence and analytics.""" + + def test_findings_store_init(self, tmp_path): + """Store creates db file and tables on init.""" + db_path = str(tmp_path / "test.db") + store = FindingsStore(db_path=db_path) + assert os.path.isfile(db_path) + + # Verify tables exist by querying them + cur = store._conn.cursor() + cur.execute("SELECT name FROM sqlite_master WHERE type='table'") + tables = {row["name"] for row in cur.fetchall()} + assert "findings" in tables + assert "scan_history" in tables + assert "fix_history" in tables + store.close() + + def test_record_and_retrieve(self, tmp_path): + """Record a scan and retrieve findings.""" + db_path = str(tmp_path / "test.db") + store = FindingsStore(db_path=db_path) + + findings = [ + { + "id": "finding-001", + "vuln_type": "sql_injection", + "severity": "critical", + "file_path": "src/db.py", + "line_number": 42, + "cwe": "CWE-89", + "description": "SQL injection in query builder", + }, + { + "id": "finding-002", + "vuln_type": "xss", + "severity": "high", + "file_path": "src/template.py", + "line_number": 15, + "cwe": "CWE-79", + "description": "Reflected XSS", + }, + ] + + summary = store.record_scan("scan-1", findings, commit_sha="abc123") + assert isinstance(summary, ScanSummary) + assert summary.total_findings == 2 + assert summary.new_findings == 2 + assert summary.by_severity["critical"] == 1 + assert summary.by_severity["high"] == 1 + + # Retrieve individual finding + f = store.get_finding("finding-001") + assert f is not None + assert f["vuln_type"] == "sql_injection" + assert f["severity"] == "critical" + assert f["status"] == "open" + store.close() + + def test_fingerprinting_consistency(self): + """Same finding always produces same fingerprint.""" + finding = { + "vuln_type": "sql_injection", + "file_path": "src/db.py", + "code_snippet": 
"cursor.execute(query)", + "cwe": "CWE-89", + } + fp1 = FindingsStore.fingerprint_finding(finding) + fp2 = FindingsStore.fingerprint_finding(finding) + assert fp1 == fp2 + assert len(fp1) == 16 # 16 hex chars + + # Different finding produces different fingerprint + different = { + "vuln_type": "xss", + "file_path": "src/template.py", + "code_snippet": "render(html)", + "cwe": "CWE-79", + } + fp3 = FindingsStore.fingerprint_finding(different) + assert fp3 != fp1 + + def test_regression_detection(self, tmp_path): + """Previously-fixed findings are flagged as regressions.""" + db_path = str(tmp_path / "test.db") + store = FindingsStore(db_path=db_path) + + finding = { + "id": "reg-001", + "vuln_type": "sql_injection", + "severity": "high", + "file_path": "src/db.py", + "cwe": "CWE-89", + } + + # First scan: record the finding + store.record_scan("scan-1", [finding]) + + # Mark the finding as fixed + store.record_fix("reg-001", fix_commit="fix123", retest_passed=True) + + # Verify it is now marked fixed + f = store.get_finding("reg-001") + assert f["status"] == "fixed" + + # Second scan: same finding reappears -> regression + summary = store.record_scan("scan-2", [finding]) + assert summary.regressions == 1 + + # The finding should be reset to open + f2 = store.get_finding("reg-001") + assert f2["status"] == "open" + store.close() + + def test_trending(self, tmp_path): + """Trending returns severity data from scan history.""" + db_path = str(tmp_path / "test.db") + store = FindingsStore(db_path=db_path) + + findings = [ + {"vuln_type": "sqli", "severity": "critical", "file_path": "a.py"}, + {"vuln_type": "xss", "severity": "high", "file_path": "b.py"}, + ] + store.record_scan("scan-1", findings) + + weeks = store.trending(days=90) + # Should have at least one week entry with our data + assert isinstance(weeks, dict) + if weeks: + # Verify that at least one week has our severity counts + some_week = next(iter(weeks.values())) + assert "critical" in some_week + assert 
"high" in some_week + store.close() + + def test_historical_context(self, tmp_path): + """get_historical_context returns correct data for known finding.""" + db_path = str(tmp_path / "test.db") + store = FindingsStore(db_path=db_path) + + finding = { + "vuln_type": "sqli", + "severity": "high", + "file_path": "src/db.py", + "cwe": "CWE-89", + } + + # Record finding once + store.record_scan("scan-1", [finding]) + + # Get historical context + ctx = store.get_historical_context(finding) + assert ctx["times_seen"] == 1 + assert ctx["first_seen"] is not None + assert ctx["previous_status"] == "open" + assert ctx["is_regression"] is False + assert isinstance(ctx["fp_rate_for_type"], float) + assert isinstance(ctx["related_in_file"], int) + + # Unknown finding returns zeroed context + unknown = { + "vuln_type": "unknown_vuln", + "file_path": "nowhere.py", + "cwe": "CWE-000", + } + ctx2 = store.get_historical_context(unknown) + assert ctx2["times_seen"] == 0 + assert ctx2["first_seen"] is None + assert ctx2["is_regression"] is False + store.close() + + +# ============================================================================ +# 6. 
AppContextBuilder tests +# ============================================================================ + + +class TestAppContextBuilder: + """Tests for AppContextBuilder language/framework detection and context formatting.""" + + def test_detect_language_python(self, tmp_path): + """Directory with .py files detected as python.""" + # Create several .py files + for name in ("app.py", "utils.py", "models.py", "views.py", "tests.py"): + (tmp_path / name).write_text("# python file\n") + builder = AppContextBuilder(str(tmp_path)) + ctx = builder.build() + assert ctx.language == "python" + + def test_detect_framework_django(self, tmp_path): + """manage.py presence detected as django.""" + (tmp_path / "manage.py").write_text("#!/usr/bin/env python\nimport django\n") + (tmp_path / "app.py").write_text("# app\n") + builder = AppContextBuilder(str(tmp_path)) + ctx = builder.build() + assert ctx.framework == "django" + + def test_detect_language_unknown_empty(self, tmp_path): + """Empty directory returns unknown language.""" + builder = AppContextBuilder(str(tmp_path)) + ctx = builder.build() + assert ctx.language == "unknown" + + def test_to_prompt_context(self): + """to_prompt_context returns formatted string.""" + ctx = ApplicationContext( + language="python", + framework="django", + auth_mechanism="jwt", + cloud_provider="aws", + has_dockerfile=True, + has_k8s=False, + ) + prompt = ctx.to_prompt_context() + assert "Application Context:" in prompt + assert "python" in prompt + assert "django" in prompt + assert "jwt" in prompt + assert "aws" in prompt + assert "Has Dockerfile: yes" in prompt + assert "Has Kubernetes: no" in prompt + + +# ============================================================================ +# 7. 
AutoFixPRGenerator tests +# ============================================================================ + + +class TestAutoFixPRGenerator: + """Tests for AutoFixPRGenerator commit messages, PR bodies, and fixability checks.""" + + def test_generate_commit_message(self): + """Commit message follows conventional format.""" + generator = AutoFixPRGenerator(project_path="/tmp/project") + suggestion = { + "vulnerability_type": "sql_injection", + "finding_id": "finding-12345678", + "cwe": "CWE-89", + "file_path": "src/db.py", + "line_number": 42, + "explanation": "Use parameterized queries instead of string concatenation.", + } + msg = generator._generate_commit_message(suggestion) + assert msg.startswith("fix(sql_injection):") + assert "Finding: finding-12345678" in msg + assert "CWE: CWE-89" in msg + assert "File: src/db.py:42" in msg + assert "Generated by Argus Security AutoFix" in msg + + def test_generate_pr_body(self): + """PR body includes all required sections.""" + generator = AutoFixPRGenerator(project_path="/tmp/project") + suggestion = { + "vulnerability_type": "sql_injection", + "finding_id": "finding-001", + "cwe": "CWE-89", + "severity": "critical", + "file_path": "src/db.py", + "line_number": 42, + "explanation": "Use parameterized queries.", + "diff": "--- a/src/db.py\n+++ b/src/db.py\n@@ -42 +42 @@\n-bad\n+good", + "testing_recommendations": ["Run unit tests", "Check query output"], + } + body = generator.generate_pr_body(suggestion) + assert "## Summary" in body + assert "## Vulnerability Details" in body + assert "sql_injection" in body + assert "CWE-89" in body + assert "## What Changed" in body + assert "## Diff" in body + assert "## Testing Recommendations" in body + assert "Run unit tests" in body + assert "Argus Security" in body + + def test_is_fixable(self): + """Findings with proper context are marked fixable.""" + orchestrator = ClosedLoopOrchestrator(project_path="/tmp/project") + finding = { + "file_path": "src/db.py", + 
"vulnerability_type": "sql_injection", + "severity": "critical", + "line_number": 42, + } + assert orchestrator._is_fixable(finding) is True + + def test_not_fixable_missing_file(self): + """Findings without file_path are not fixable.""" + orchestrator = ClosedLoopOrchestrator(project_path="/tmp/project") + finding = { + "vulnerability_type": "sql_injection", + "severity": "critical", + "line_number": 42, + } + assert orchestrator._is_fixable(finding) is False + + def test_not_fixable_missing_type(self): + """Findings without vulnerability type are not fixable.""" + orchestrator = ClosedLoopOrchestrator(project_path="/tmp/project") + finding = { + "file_path": "src/db.py", + "severity": "critical", + "line_number": 42, + } + assert orchestrator._is_fixable(finding) is False + + def test_loop_result_success_rate(self): + """LoopResult.success_rate computes correctly.""" + result = LoopResult(total_findings=10, fixable=4) + # No fixed yet + assert result.success_rate == 0.0 + + # Add some fixed PRs + result.fixed.append( + FixPR( + branch_name="argus/fix-sqli-abc", + finding_id="f1", + vulnerability_type="sqli", + file_path="db.py", + title="fix", + body="body", + commit_sha="sha1", + pushed=False, + success=True, + ) + ) + result.fixed.append( + FixPR( + branch_name="argus/fix-xss-def", + finding_id="f2", + vulnerability_type="xss", + file_path="tmpl.py", + title="fix", + body="body", + commit_sha="sha2", + pushed=False, + success=True, + ) + ) + # 2 fixed out of 4 fixable = 0.5 + assert result.success_rate == pytest.approx(0.5) + + def test_loop_result_success_rate_zero_fixable(self): + """LoopResult.success_rate returns 0.0 when no fixable findings.""" + result = LoopResult(total_findings=5, fixable=0) + assert result.success_rate == 0.0 + + +# ============================================================================ +# 8. 
Config integration tests +# ============================================================================ + + +class TestConfigIntegration: + """Tests that v3.0 continuous security config keys are properly wired.""" + + V3_CONFIG_KEYS = [ + "enable_diff_scoping", + "diff_expand_impact_radius", + "enable_autofix_pr", + "autofix_confidence_threshold", + "autofix_max_prs_per_scan", + "enable_findings_store", + "findings_db_path", + "inject_historical_context", + "enable_agent_chain_discovery", + "enable_cross_component_analysis", + "enable_app_context", + "enable_live_validation", + "live_validation_environment", + "only_changed", + ] + + def test_new_config_keys_in_defaults(self): + """All v3.0 config keys exist in get_default_config.""" + defaults = config_loader.get_default_config() + for key in self.V3_CONFIG_KEYS: + assert key in defaults, f"Missing v3.0 config key in defaults: {key}" + + def test_env_var_mappings_exist(self): + """All v3.0 config keys have env var mappings.""" + # Collect all config keys that have env var mappings + mapped_config_keys = { + config_key for _, config_key, _ in config_loader._ENV_MAPPINGS + } + for key in self.V3_CONFIG_KEYS: + assert key in mapped_config_keys, ( + f"Missing env var mapping for v3.0 config key: {key}" + ) From c235c1e99bb43c292a56c9a9a3e56be0d8251e15 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 4 Mar 2026 06:01:01 +0000 Subject: [PATCH 3/3] docs: Update README, CLAUDE.md, and CHANGELOG for v6.0.0 release - README: Add v3.0 continuous security testing feature table, env vars, deployment-triggered scanning section, and guide doc link - CLAUDE.md: Add v3.0 summary, 6 new key files, and guide reference - CHANGELOG: Add v6.0.0 release notes with all 7 new modules, workflows, config keys, and 36 tests https://claude.ai/code/session_017NQsm2eBxfioLrad1C7keZ --- CHANGELOG.md | 23 +++++++++++++++++++++++ CLAUDE.md | 20 ++++++++++++++++++-- README.md | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 73 insertions(+), 
2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index bae909c..18addc3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,29 @@ Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). --- +## [6.0.0] - 2026-03-04 + +### Added — Continuous Security Testing (v3.0) +- **Diff-Intelligent Scanner Scoping** (`scripts/diff_impact_analyzer.py`): Classifies changed files by security relevance, expands blast radius via reverse dependency lookup, generates Semgrep `--include` args for scoped scanning. Toggle: `enable_diff_scoping=True`, `diff_expand_impact_radius=True` +- **Agent-Driven Chain Discovery** (`scripts/agent_chain_discovery.py`): LLM-powered multi-step attack chain discovery beyond rule-based patterns. Cross-component analyzer detects dangerous finding combinations across architectural boundaries (auth+api, models+api, middleware+routes). Toggle: `enable_agent_chain_discovery=False` (opt-in), `enable_cross_component_analysis=True` +- **AutoFix PR Generator** (`scripts/autofix_pr_generator.py`): Generates git branches with applied fixes from RemediationEngine suggestions. Creates conventional-commit-style messages, formatted PR bodies with diff/CWE/testing sections. ClosedLoopOrchestrator wires find-fix-verify into a single flow. Toggle: `enable_autofix_pr=False` (opt-in), `autofix_confidence_threshold="high"`, `autofix_max_prs_per_scan=5` +- **Persistent Findings Store** (`scripts/findings_store.py`): SQLite-backed cross-scan intelligence. Tracks findings across scans via content-based fingerprinting. Detects regressions (previously-fixed findings reappearing), computes MTTF, FP rates, severity trending. Injects historical context into LLM enrichment prompts. 
Toggle: `enable_findings_store=True`, `findings_db_path=".argus/findings.db"`, `inject_historical_context=True` +- **Application Context Builder** (`scripts/app_context_builder.py`): Detects framework (Django/Flask/Express/Spring/etc.), language, auth mechanism (JWT/OAuth2/session), cloud provider, IaC files, middleware chain, entry points, and OpenAPI specs. Generates `to_prompt_context()` string for LLM prompt injection. Toggle: `enable_app_context=True` +- **SAST-to-DAST Live Validation** (`scripts/sast_dast_validator.py`): Validates SAST findings against live deployment targets. Maps vuln types to HTTP test payloads (SQLi, XSS, SSRF, path traversal, command injection, IDOR). Safety: rejects production targets by default, only allows staging/preview/development. Toggle: `enable_live_validation=False` (opt-in), `live_validation_environment="staging"` +- **Post-Deploy Scan workflow** (`.github/workflows/post-deploy-scan.yml`): Triggers on successful deployments, runs diff-scoped SAST + DAST against deployment URL +- **Retest After Fix workflow** (`.github/workflows/argus-retest.yml`): Triggers when `argus/fix-*` PRs merge, runs regression tests + targeted SAST rescan, updates FindingsStore +- **Continuous Security Testing Guide** (`docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md`): Architecture guide mapping capabilities vs industry-standard autonomous testing +- 13 new config keys added to `config_loader.py` with env var and CLI mappings +- All 7 modules integrated into `hybrid_analyzer.py` with graceful degradation +- 36 new tests (`tests/test_continuous_security.py`) covering all v3.0 modules + +### Changed +- Updated README.md with v3.0 feature tables, env vars, and deployment scanning docs +- Updated CLAUDE.md with v3.0 key files and extended documentation references +- Updated `.claude/rules/features.md` and `.claude/rules/development.md` with v3.0 modules + +--- + ## [5.0.0] - 2026-02-16 ### Added diff --git a/CLAUDE.md b/CLAUDE.md index 9594356..0a1f662 100644 
--- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,6 +1,6 @@ # CLAUDE.md - Argus Security -> Enterprise-grade AI Security Platform with 6-phase analysis pipeline. +> Enterprise-grade AI Security Platform with 6-phase analysis pipeline and continuous autonomous security testing. ## What This Does @@ -17,6 +17,15 @@ Phase 6: Reporting → SARIF, JSON, Markdown outputs **Results:** 60-70% false positive reduction, +15-20% more findings via heuristic-based spontaneous discovery (regex pattern matching, not AI-powered). +**v3.0 Continuous Security:** +- Diff-intelligent scanner scoping with blast radius expansion +- Persistent cross-scan findings store with regression detection +- Application context auto-detection for context-aware scanning +- LLM-powered attack chain discovery + cross-component analysis +- AutoFix PR generation with closed-loop find-fix-verify +- SAST-to-DAST live validation against staging targets +- Deployment-triggered scanning via GitHub Actions workflows + ## Quick Start ```bash @@ -47,10 +56,17 @@ python scripts/run_ai_audit.py --project-type backend-api | `scripts/agent_personas.py` | Phase 3: multi-agent review | | `scripts/sandbox_validator.py` | Phase 4: Docker validation | | `policy/rego/` | Phase 5: OPA policies | +| `scripts/diff_impact_analyzer.py` | v3.0: Diff-intelligent scanner scoping | +| `scripts/findings_store.py` | v3.0: SQLite persistent findings store | +| `scripts/app_context_builder.py` | v3.0: Application context auto-detection | +| `scripts/agent_chain_discovery.py` | v3.0: LLM attack chain discovery | +| `scripts/autofix_pr_generator.py` | v3.0: AutoFix PR generation + closed loop | +| `scripts/sast_dast_validator.py` | v3.0: SAST-to-DAST live validation | ## Extended Documentation Details moved to scoped rule files (auto-loaded when editing relevant files): - `.claude/rules/pipeline.md` — 6-phase pipeline architecture -- `.claude/rules/features.md` — Advanced feature modules + config toggles +- `.claude/rules/features.md` — Advanced 
feature modules + config toggles (incl. v3.0) - `.claude/rules/development.md` — Docker, GitHub Action, project structure +- `docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md` — v3.0 architecture and gap analysis diff --git a/README.md b/README.md index 682d299..6bd6ebe 100644 --- a/README.md +++ b/README.md @@ -131,6 +131,27 @@ All features are wired into both orchestrators and toggled via config/env vars. | MCP Server | `enable_mcp_server` | `False` | Expose Argus as MCP tools for Claude Code | | Temporal Orchestration | `enable_temporal` | `False` | Durable workflow wrapping for crash recovery | +### Continuous Security Testing (v3.0) + +| Feature | Config Key | Default | Description | +|---------|-----------|---------|-------------| +| Diff-Intelligent Scoping | `enable_diff_scoping` | `True` | Scope scanners to changed files + blast radius expansion | +| Application Context | `enable_app_context` | `True` | Auto-detect framework, auth, cloud, IaC for context-aware scanning | +| Persistent Findings Store | `enable_findings_store` | `True` | SQLite cross-scan intelligence with regression detection and trending | +| Cross-Component Analysis | `enable_cross_component_analysis` | `True` | Detect dangerous vulnerability combinations across architectural boundaries | +| Agent Chain Discovery | `enable_agent_chain_discovery` | `False` | LLM-powered multi-step attack chain reasoning (opt-in, uses AI credits) | +| AutoFix PR Generation | `enable_autofix_pr` | `False` | Generate merge-ready fix PRs with closed-loop verification (opt-in) | +| SAST-to-DAST Validation | `enable_live_validation` | `False` | Validate SAST findings against live staging targets (opt-in) | + +--- + +### Deployment-Triggered Scanning + +Argus includes two GitHub Actions workflows for continuous security: + +- **Post-Deploy Scan** (`.github/workflows/post-deploy-scan.yml`) -- Triggers on successful deployments. Runs diff-scoped SAST + DAST against the deployment URL. 
+- **Retest After Fix** (`.github/workflows/argus-retest.yml`) -- Triggers when `argus/fix-*` branches merge. Re-scans to verify fixes hold, updates FindingsStore, posts results as PR comments. + --- ## Configuration @@ -163,6 +184,16 @@ export ENABLE_ADVANCED_SUPPRESSION=true export ENABLE_COMPLIANCE_MAPPING=true export ENABLE_LICENSE_RISK_SCORING=true +# Continuous security testing (v3.0) +export ENABLE_DIFF_SCOPING=true +export ENABLE_APP_CONTEXT=true +export ENABLE_FINDINGS_STORE=true +export ENABLE_CROSS_COMPONENT_ANALYSIS=true +export ENABLE_AGENT_CHAIN_DISCOVERY=false # opt-in, uses AI credits +export ENABLE_AUTOFIX_PR=false # opt-in +export ENABLE_LIVE_VALIDATION=false # opt-in, requires staging target +export LIVE_VALIDATION_ENVIRONMENT=staging + # Limits export MAX_FILES=50 export COST_LIMIT=1.0 @@ -282,6 +313,7 @@ mypy scripts/*.py # Type check | [docs/MULTI_AGENT_GUIDE.md](docs/MULTI_AGENT_GUIDE.md) | Multi-agent analysis details | | [docs/PHASE_27_DEEP_ANALYSIS.md](docs/PHASE_27_DEEP_ANALYSIS.md) | Deep Analysis rollout guide | | [docs/FAQ.md](docs/FAQ.md) | Common questions | +| [docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md](docs/CONTINUOUS_SECURITY_TESTING_GUIDE.md) | Continuous security testing architecture | | [CHANGELOG.md](CHANGELOG.md) | Release history | ---
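
---

Note on the fingerprinting contract: the `FindingsStore` tests in patch 2/3 require that `fingerprint_finding` be deterministic over a finding's content (`vuln_type`, `file_path`, `code_snippet`, `cwe`) and emit a 16-hex-char ID, so findings can be matched across scans without relying on scanner-assigned IDs. A minimal sketch satisfying those assertions — assuming SHA-256 truncated to 16 hex chars; the actual `FindingsStore` implementation may hash differently:

```python
import hashlib


def fingerprint_finding(finding: dict) -> str:
    """Derive a stable 16-hex-char fingerprint from a finding's content.

    Identical vuln_type/file_path/code_snippet/cwe always produce the
    same ID, so the store can deduplicate and detect regressions across
    scans. Missing keys fall back to "" so partial findings still hash.
    """
    material = "|".join(
        str(finding.get(key, ""))
        for key in ("vuln_type", "file_path", "code_snippet", "cwe")
    )
    # Truncating SHA-256 to 16 hex chars (64 bits) keeps collisions
    # negligible at per-repo finding counts while staying compact.
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


finding = {
    "vuln_type": "sql_injection",
    "file_path": "src/db.py",
    "code_snippet": "cursor.execute(query)",
    "cwe": "CWE-89",
}
# Same content -> same fingerprint; fixed 16-char length.
assert fingerprint_finding(finding) == fingerprint_finding(dict(finding))
assert len(fingerprint_finding(finding)) == 16
```

Any scheme with these properties passes `test_fingerprinting_consistency`; what matters for regression detection is only that the function is pure and content-addressed.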