
Added a docs researcher subagent using Context7. #841

Merged
srtab merged 9 commits into main from feat/docs-researcher-subagent
Feb 25, 2026

Conversation

@srtab
Owner

@srtab srtab commented Feb 21, 2026

No description provided.

@srtab srtab self-assigned this Feb 21, 2026
srtab and others added 2 commits February 21, 2026 00:40
@srtab srtab marked this pull request as ready for review February 21, 2026 22:40
@srtab
Owner Author

srtab commented Feb 22, 2026

@daiv-agent /code-review

@daiv-agent
Contributor

daiv-agent bot commented Feb 22, 2026

Code Review

Overview

  • Replaces the changelog-curator subagent with a new docs-research subagent that fetches library documentation via Context7 API
  • Updates configuration, prompts, middleware setup, and all related tests
  • Adds new model GPT_5_1_CODEX_MINI for the docs-research subagent

Findings

High Severity

  • None identified

Medium Severity

  1. Integration test flakiness risk (tests/integration_tests/test_subagents.py:98-99)

    • Lines 98-99 assert web_fetch is called when expected_tool_calls is None, but this assertion runs even when expected_tool_calls == 0
    • Fix: wrap lines 98-99 in if expected_tool_calls is not None: to avoid duplicate/conflicting assertions
  2. Missing changelog entry metadata (CHANGELOG.md:49)

    • The "Fixed" section entry doesn't mention what feature/component was affected (unit tests for which module?)
    • Suggested: "Fixed unit tests in /agents slash command by migrating from create_changelog_subagent to create_docs_research_subagent."

Low Severity

  1. Unused **kwargs parameter (daiv/automation/agent/subagents.py:116, daiv/automation/agent/subagents.py:276)

    • create_explore_subagent and create_docs_research_subagent accept **kwargs but never use them
    • This was likely added to keep the call signature compatible after the unused runtime: RuntimeCtx parameter was removed, matching the signature of create_general_purpose_subagent. Consider removing **kwargs entirely or documenting why it's reserved.
  2. Hardcoded Makefile test filter (Makefile:46)

    • The integration-tests target now runs only test_docs_research_subagent_ask_clarifying_questions
    • This makes make integration-tests useless for normal development. Either revert to a broader filter or document that this is a temporary debug target.
  3. Langsmith fetch limit changed (Makefile:68)

    • Limit changed from 1 → 4 without explanation in commit or changelog
    • Not blocking, but consider reverting if this was unintentional

Suggestions

  • Prompt clarity: The docs-research prompt is very detailed (good!), but the "retry at most 2 times" rule (daiv/automation/agent/subagents.py:238-239) could be misinterpreted. Consider rephrasing to "you may retry with a different query up to 2 additional times (3 total fetches)" for clarity.

  • Error message consistency: The "Missing context" / "Blocked" structured messages in the prompt are novel. Consider adding examples of what these look like in practice, or testing that models actually emit them in the expected format.

Tests

What should be run:

make lint-fix
make test

Coverage notes:

  • New integration tests cover typical execution and clarification flows
  • Unit tests verify middleware composition and prompt content
  • Edge case: no tests verify retry behavior (the "2 retries max" rule)
  • Edge case: no tests verify version mismatch handling or rate limit responses

Identified gaps:

  • Add a test case where the model retries with different queries and eventually synthesizes an answer
  • Add a test case where the Context7 API returns a version mismatch

@srtab srtab linked an issue Feb 22, 2026 that may be closed by this pull request
srtab and others added 3 commits February 22, 2026 23:43
- Add WEB_FETCH_AUTH_HEADERS setting for domain-specific auth (e.g. Context7 API)
- Inject auth headers in web_fetch middleware for authenticated endpoints
- Enhance docs-researcher system prompt with structured guidelines
- Reorganize create_explore_subagent and update tool names in prompts
- Update integration and unit tests

Co-authored-by: Cursor <cursoragent@cursor.com>
@srtab
Owner Author

srtab commented Feb 25, 2026

@daiv-agent /security-audit

@daiv-agent
Contributor

daiv-agent bot commented Feb 25, 2026

Security Audit

Summary

  • Overall Posture: Medium risk. The new docs-research subagent introduces SSRF and credential exposure risks through the Context7 API integration via web_fetch.
  • Hotspots: URL validation in web_fetch_tool, authentication header handling in _get_auth_headers_for_url, and LLM prompt injection via user-controlled queries.
  • Good: HTTP-only scheme enforcement, cross-host redirect detection, content size limits, and exact domain matching for auth headers.
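
The exact-match behavior noted under "Good" can be sketched as a small lookup. This is a hypothetical standalone helper, not the PR's actual _get_auth_headers_for_url, whose internals may differ:

```python
from urllib.parse import urlparse


def get_auth_headers_for_url(url: str, configured: dict[str, dict[str, str]]) -> dict[str, str]:
    """Return auth headers only when the URL's hostname exactly matches a configured domain."""
    hostname = (urlparse(url).hostname or "").lower()
    # Exact match only: "context7.com" does NOT cover "api.context7.com"
    return configured.get(hostname, {})
```

With this scheme, a key configured for context7.com is never sent to api.context7.com, which is the safer default the audit later endorses.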

Findings

Critical Severity

  1. SSRF: No private IP/localhost blocking (daiv/automation/agent/middlewares/web_fetch.py:86-122)

    • _is_valid_http_url only checks scheme and netloc presence, but does not block private IPs (127.0.0.1, 10.0.0.0/8, 192.168.0.0/16, 169.254.0.0/16, ::1, etc.)
    • An attacker could trick the LLM into fetching https://127.0.0.1:6379/, http://169.254.169.254/latest/meta-data/ (cloud metadata), or internal services
    • Remediation: Add IP/hostname validation before AsyncClient.get():
      import ipaddress
      
      def _is_private_or_local(hostname: str) -> bool:
          host = hostname.lower()
          try:
              ip = ipaddress.ip_address(host)
          except ValueError:
              # Not an IP literal; block well-known local hostnames
              return host == "localhost" or host.endswith((".local", ".localhost"))
          return ip.is_private or ip.is_loopback or ip.is_link_local
      
      # In _fetch_url_text, before the httpx call:
      parsed = urlparse(url)
      if _is_private_or_local(parsed.hostname or ""):
          raise ValueError(f"Requests to private/local addresses are blocked: {url}")
  2. Authentication credential leakage via cache (daiv/automation/agent/middlewares/web_fetch.py:125-135)

    • Cache key is sha256(url + prompt), which does NOT include auth headers or their presence
    • If two users (or same user with different API keys) fetch the same (url, prompt) pair, they will receive the same cached response, even though one user's credentials may differ
    • Worse: If a non-authenticated user requests the same (url, prompt) after an authenticated user, they receive the authenticated response
    • Remediation: Include auth header presence/hash in cache key:
      def _cache_key_for_response(*, url: str, prompt: str, auth_headers: dict[str, str]) -> str:
          auth_hash = hashlib.sha256(str(sorted(auth_headers.items())).encode()).hexdigest() if auth_headers else "none"
          digest = hashlib.sha256(f"{url}\n{prompt}\n{auth_hash}".encode()).hexdigest()
          return f"web_fetch:response:{digest}"
      Update all call sites to pass auth_headers.

High Severity

  1. Prompt injection via unsanitized library names and queries (daiv/automation/agent/subagents.py:182-224)

    • The docs-research prompt instructs the LLM to construct Context7 URLs with user-provided libraryName and query parameters
    • An attacker could provide a malicious query like "react&query=anything&adminToken=steal" to inject URL parameters, or newlines/control chars to manipulate HTTP requests (if httpx doesn't escape properly)
    • Impact: Potential for exploiting Context7 API if it has injection vulnerabilities, or crafting misleading documentation responses
    • Remediation: URL-encode library name and query in the prompt examples, and add validation guidance:
      # In prompt:
      from urllib.parse import quote_plus
      encoded_query = quote_plus(user_query)
      fetch(url=f"https://context7.com/api/v2/libs/search?libraryName={quote_plus(library_name)}&query={encoded_query}", prompt="")
      Also add to Quality Standards: "Always URL-encode user-provided library names and queries using proper escaping."
  2. Subdomain matching inconsistency (daiv/automation/agent/middlewares/web_fetch.py:59-76)

    • Documentation in automation/conf.py:67 claims context7.com matches api.context7.com (suffix matching), but code at line 69 uses exact hostname match (hostname == domain)
    • Test at tests/unit_tests/automation/agent/middlewares/test_web_fetch.py:225-229 confirms subdomains are NOT matched
    • Impact: Misleading documentation could cause users to misconfigure auth headers, believing context7.com will cover api.context7.com when it won't
    • Remediation: Either fix the documentation to state "exact match only" or implement suffix matching (choose one):
      • Option A (exact match - safer): Update docs to remove "suffix" claim
      • Option B (suffix match - more flexible but riskier): Change line 69 to if hostname.endswith(f".{domain}") or hostname == domain:

Medium Severity

  1. Lack of rate limiting on web_fetch tool (daiv/automation/agent/middlewares/web_fetch.py:155-208)

    • No per-user or per-session rate limit on web_fetch_tool calls
    • An attacker could abuse the tool to:
      • DDoS external services (Context7, other APIs)
      • Exhaust LLM API quota by repeatedly requesting different (url, prompt) pairs
      • Fill cache storage with junk entries
    • Remediation: Add Django rate limiting (e.g., using django-ratelimit) or a per-user fixed-window counter in Redis:
      from django.core.cache import cache
      
      def check_rate_limit(user_id: str, limit: int = 60, window: int = 60) -> bool:
          key = f"web_fetch:ratelimit:{user_id}"
          # add() only creates the key if absent, so the window stays fixed
          # instead of being extended (and the user locked out) by every request
          if cache.add(key, 1, timeout=window):
              return True
          try:
              return cache.incr(key) <= limit
          except ValueError:
              # Key expired between add() and incr(); start a new window
              cache.set(key, 1, timeout=window)
              return True
  2. Secrets in logs via exception messages (daiv/automation/agent/middlewares/web_fetch.py:207)

    • Line 207 logs exc_info=True when model processing fails, which could leak auth headers or API keys if they appear in tracebacks
    • Remediation: Sanitize exception messages before logging:
      logger.warning("web_fetch model processing failed; returning content instead.", extra={"url": url})
      Remove exc_info=True or add a custom exception formatter that redacts headers.
  3. Missing input validation for Context7 libraryId format (daiv/automation/agent/subagents.py:204)

    • The prompt allows libraryId in formats /owner/repo, /owner/repo/version, or /owner/repo@version, but doesn't validate these formats
    • Malformed IDs could cause unexpected API behavior or errors
    • Remediation: Add validation in the prompt or as a guard before fetch:
      import re
      library_id_pattern = re.compile(r'^/[\w\-\.]+/[\w\-\.]+(/[\w\-\.]+|@[\w\-\.]+)?$')
      if not library_id_pattern.match(library_id):
          return "Invalid libraryId format. Expected /owner/repo, /owner/repo/version, or /owner/repo@version."

Low Severity

  1. Cache pollution via empty prompts (daiv/automation/agent/middlewares/web_fetch.py:170)

    • Empty prompts skip caching (line 170: if prompt.strip() and (cached := ...)), but the docs-research subagent always uses prompt=""
    • This means every docs-research fetch bypasses cache, even for identical (url, "") pairs
    • Impact: Performance degradation and redundant API calls to Context7
    • Remediation: Cache empty-prompt fetches separately or allow caching when prompt == ""
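
One way to address this, sketched under the assumption that the cache key helper can be reworked freely (hypothetical names, not the PR's actual implementation):

```python
import hashlib


def cache_key(url: str, prompt: str) -> str:
    # Cache empty-prompt fetches too, but namespace raw-content entries
    # apart from model-processed responses so the two never collide.
    kind = "raw" if not prompt.strip() else "response"
    digest = hashlib.sha256(f"{url}\n{prompt}".encode()).hexdigest()
    return f"web_fetch:{kind}:{digest}"
```

This keeps identical (url, "") fetches from hitting Context7 repeatedly while preserving the existing response-cache behavior for non-empty prompts.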
  2. HTTP error status codes not logged (daiv/automation/agent/middlewares/web_fetch.py:119-120)

    • 4xx/5xx errors return a generic message without logging details for debugging
    • Remediation: Add logging before raising:
      if response.status_code >= 400:
          logger.warning("web_fetch failed: %s returned %s", url, response.status_code)
          raise ValueError(f"Failed to fetch {url} - status code {response.status_code}")

Recommendations

  1. Add integration test for SSRF protection: Once implemented, test that web_fetch rejects http://127.0.0.1, http://169.254.169.254, and http://192.168.1.1.
  2. Security headers: When fetching external URLs, consider adding X-Content-Type-Options: nosniff and X-Frame-Options: DENY to responses (if serving fetched content to users).
  3. Content-Security-Policy for cached responses: If cached responses are ever rendered as HTML, add CSP headers to prevent XSS.
  4. Audit Context7 API key storage: Ensure WEB_FETCH_AUTH_HEADERS is loaded from secrets (Docker secrets, env vars, or vault), not hardcoded. Current code uses SecretStr, which is good.
  5. Consider allowlist instead of blocklist: For production, restrict web_fetch to an allowlist of domains (e.g., context7.com, github.com) instead of blocking private IPs.
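
Recommendation 5 could look roughly like this. The domain set and helper name are hypothetical; a real deployment would load the allowlist from settings:

```python
from urllib.parse import urlparse

# Hypothetical allowlist; load from configuration in practice.
ALLOWED_DOMAINS = {"context7.com", "github.com"}


def is_allowed(url: str) -> bool:
    """Allow a URL only if its hostname is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_DOMAINS or any(host.endswith(f".{d}") for d in ALLOWED_DOMAINS)
```

Note the leading dot in the suffix check: it prevents lookalike domains such as evilcontext7.com from matching context7.com.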

Tests/Validation

Required Security Tests (add to tests/unit_tests/automation/agent/middlewares/test_web_fetch.py):

@pytest.mark.parametrize("url", [
    "http://127.0.0.1/admin",
    "http://localhost:8000/",
    "http://169.254.169.254/latest/meta-data/",
    "http://10.0.0.1/internal",
    "http://192.168.1.1/config",
    "http://[::1]/admin",
])
async def test_rejects_ssrf_urls(url):
    result = await web_fetch_module.web_fetch_tool.ainvoke({"url": url, "prompt": ""})
    assert "private" in result.lower() or "blocked" in result.lower()

async def test_cache_key_includes_auth_headers(httpx_mock):
    # Test that different auth headers produce different cache entries
    cache.clear()
    httpx_mock.add_response(url="https://api.example.com", status_code=200, text="DATA")
    
    with patch.object(web_fetch_module, "settings") as mock_settings:
        mock_settings.WEB_FETCH_AUTH_HEADERS = {"example.com": {"X-Key": "key1"}}
        result1 = await web_fetch_module.web_fetch_tool.ainvoke({"url": "https://api.example.com", "prompt": "Q"})
        
        mock_settings.WEB_FETCH_AUTH_HEADERS = {"example.com": {"X-Key": "key2"}}
        result2 = await web_fetch_module.web_fetch_tool.ainvoke({"url": "https://api.example.com", "prompt": "Q"})
    
    assert result1 != result2  # Should fetch twice with different auth

Manual Tests:

  1. Verify Context7 API key is not logged in production logs
  2. Test rate limiting by making 100+ web_fetch calls in quick succession
  3. Confirm cache doesn't leak responses between users with different API keys

@srtab
Owner Author

srtab commented Feb 25, 2026

@daiv-agent fix only these identified errors:

  • SSRF: No private IP/localhost blocking
  • Subdomain matching inconsistency

@daiv-agent
Contributor

daiv-agent bot commented Feb 25, 2026

Summary

Fixed both identified security issues:

1. SSRF Protection

  • Added ipaddress module import
  • Created _is_private_or_local() function to detect private/local IP addresses
  • Blocks:
    • Localhost variants (localhost, 127.0.0.1, ::1, etc.)
    • Private IPv4 ranges (10.x.x.x, 172.16-31.x.x, 192.168.x.x)
    • Link-local addresses (169.254.x.x, fe80::x)
    • Reserved/loopback addresses
    • .local and .localhost domain suffixes
  • Added SSRF check in _fetch_url_text() before making HTTP requests
  • Created comprehensive test suite in test_web_fetch_ssrf.py with 37 test cases covering all blocked address types
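
A minimal sketch of the blocking logic described above, as a standalone function. Names and exact coverage are assumptions; the PR's actual _is_private_or_local() may differ in details:

```python
import ipaddress


def is_private_or_local(hostname: str) -> bool:
    """Return True for hostnames/IPs that should never be fetched (SSRF guard)."""
    host = hostname.lower().rstrip(".")
    # Localhost variants and mDNS-style suffixes
    if host == "localhost" or host.endswith((".local", ".localhost")):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Public-looking hostname; DNS resolution is not checked here
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
```

Checking the hostname string alone does not defend against DNS rebinding, which the follow-up review below flags separately.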

2. Subdomain Matching Documentation Fix

  • Updated automation/conf.py documentation to clarify that domain matching is exact match only
  • Changed from: "e.g. 'context7.com' also matches 'api.context7.com'" (incorrect)
  • Changed to: "e.g. 'context7.com' matches only 'context7.com' and not 'api.context7.com'" (correct)
  • Updated existing tests to use SecretStr for proper type handling

All tests pass successfully, including the new SSRF protection tests and existing web_fetch tests.

@srtab
Owner Author

srtab commented Feb 25, 2026

@daiv-agent /code-review

@daiv-agent
Contributor

daiv-agent bot commented Feb 25, 2026

Code Review

Overview

  • Added SSRF protection to web_fetch middleware by blocking requests to private/local IP addresses
  • Fixed documentation inconsistency for WEB_FETCH_AUTH_HEADERS to clarify exact domain matching
  • Updated existing tests to use SecretStr and added comprehensive SSRF protection tests

Findings

High Severity

  • None identified

Medium Severity

  1. Redundant boolean conversion (daiv/automation/agent/middlewares/web_fetch.py:101)
    • return bool(hostname.lower().endswith(...)) is redundant since .endswith() already returns a boolean
    • Fix: return hostname.lower().endswith(".local") or hostname.lower().endswith(".localhost")

Low Severity

  1. Repeated import in test functions (tests/unit_tests/automation/agent/middlewares/test_web_fetch.py:219-267)

    • from pydantic import SecretStr is imported inside 6 different test functions
    • Better to import once at module level for consistency
    • Fix: Add from pydantic import SecretStr at the top of the file, remove from individual functions
  2. Test URL changed from context7.com to example.com (tests/unit_tests/automation/agent/middlewares/test_web_fetch.py:263-277)

    • While functionally correct, this change is unrelated to SSRF fixes and dilutes the commit scope
    • Not blocking, but ideally security fixes should be isolated from unrelated refactoring

Suggestions

  1. Add edge case test for DNS rebinding: Consider adding a test for hostnames that resolve to private IPs but appear public (e.g., a hostname like malicious.com that DNS-resolves to 127.0.0.1). Current implementation only validates hostname strings, not DNS resolution results. This is a known SSRF bypass technique.

  2. Consider IPv4-mapped IPv6 addresses: Test cases like ::ffff:127.0.0.1 (IPv4-mapped IPv6) to ensure ipaddress library handles these correctly.

  3. Performance consideration: The _is_private_or_local() check happens on every fetch, including same-host redirects. For redirect chains, this could be called multiple times. Not a major issue, but worth noting.

  4. Error message clarity: The error message "Requests to private/local addresses are blocked" could specify which address was blocked for better debugging (especially for IPv6 addresses where the hostname might not be obvious from the URL).
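
Suggestion 1 (DNS rebinding) could be approached by validating the addresses a name resolves to, rather than the hostname string. A sketch, with a hypothetical helper name:

```python
import ipaddress
import socket


def resolves_to_private(hostname: str) -> bool:
    """Return True if any address the hostname resolves to is private/local."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return True  # fail closed when the name cannot be resolved

    def _bad(addr: str) -> bool:
        ip = ipaddress.ip_address(addr.split("%")[0])  # strip IPv6 zone id
        return ip.is_private or ip.is_loopback or ip.is_link_local

    return any(_bad(info[4][0]) for info in infos)
```

Full rebinding protection also requires connecting to the exact IP that was validated (pinning it), since a second DNS lookup at request time could return a different address.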

Tests

What was run:

make lint-fix
pytest tests/unit_tests/automation/agent/middlewares/test_web_fetch_ssrf.py (37 tests)
pytest tests/unit_tests/automation/agent/middlewares/test_web_fetch.py (21 tests)

Coverage:

  • ✅ Localhost variants (localhost, LOCALHOST, localhost.localdomain)
  • ✅ IPv4 loopback range (127.x.x.x)
  • ✅ IPv4 private ranges (10.x, 172.16-31.x, 192.168.x)
  • ✅ Link-local (169.254.x.x, fe80::x)
  • ✅ IPv6 loopback (::1)
  • ✅ Domain suffix patterns (.local, .localhost)
  • ✅ Public addresses correctly allowed (example.com, 8.8.8.8, 1.1.1.1)
  • ✅ Tool-level rejection (error message format)
  • ✅ Function-level exception handling

Gaps:

  • ❌ No test for IPv4-mapped IPv6 addresses (::ffff:192.168.1.1)
  • ❌ No test for uppercase IPv6 addresses (FE80::1)
  • ❌ No test for compressed IPv6 variations (::ffff:c0a8:0101)
  • ❌ No test for DNS rebinding scenarios (hostname resolution to private IPs)
  • ❌ No test for multicast addresses (224.0.0.0/4, ff00::/8)

Recommended additional tests:

# Add to test_is_private_or_local
("::ffff:127.0.0.1", True),  # IPv4-mapped IPv6 loopback
("::ffff:192.168.1.1", True),  # IPv4-mapped IPv6 private
("FE80::1", True),  # Uppercase IPv6
("224.0.0.1", True),  # Multicast IPv4
("ff02::1", True),  # Multicast IPv6

What should be run before merge:

make test  # Full test suite
make lint-fix  # Already passed

@srtab
Owner Author

srtab commented Feb 25, 2026

@daiv-agent Apply these recommendations:

  • Repeated import in test functions
  • Recommended additional tests

@daiv-agent
Contributor

daiv-agent bot commented Feb 25, 2026

Perfect! All tests are passing. Let me create a summary of the changes:

Summary

Successfully applied the recommended improvements:

1. Moved repeated SecretStr imports to module level

  • Removed from pydantic import SecretStr from 6 individual test functions in test_web_fetch.py
  • Added single import at the top of the file for better code organization

2. Added comprehensive SSRF protection tests

Extended test coverage with additional edge cases:

IPv6 variations:

  • Uppercase IPv6 addresses (FE80::1)
  • IPv4-mapped IPv6 addresses (::ffff:127.0.0.1, ::ffff:192.168.1.1, ::ffff:10.0.0.1, ::ffff:c0a8:0101)

Multicast addresses:

  • IPv4 multicast (224.0.0.1)
  • IPv6 multicast (ff02::1)

3. Enhanced SSRF protection implementation

  • Added is_multicast check to _is_private_or_local() function
  • Removed redundant bool() conversion (code cleanup from earlier review)

Test Results

  • 65 tests total: All passing ✅
  • New tests added: 7 additional test cases
  • Coverage: Improved from 37 to 44 parameterized test cases for test_is_private_or_local

The SSRF protection now comprehensively blocks:

  • Localhost and loopback addresses
  • Private IP ranges (IPv4 and IPv6)
  • Link-local addresses
  • Reserved addresses
  • Multicast addresses
  • IPv4-mapped IPv6 addresses
  • Domain suffixes (.local, .localhost)

@srtab srtab merged commit f03bc4e into main Feb 25, 2026
6 checks passed
@srtab srtab deleted the feat/docs-researcher-subagent branch February 25, 2026 15:35

Development

Successfully merging this pull request may close these issues.

Add a docs researcher subagent
