Skip to content

Feat/GitHub pr reviewer#15

Merged
jeancsil merged 5 commits intojeancsil:mainfrom
rafaelzimmermann:feat/github-pr-reviewer
Feb 28, 2026
Merged

Feat/GitHub pr reviewer#15
jeancsil merged 5 commits intojeancsil:mainfrom
rafaelzimmermann:feat/github-pr-reviewer

Conversation

@rafaelzimmermann
Copy link
Contributor

@rafaelzimmermann rafaelzimmermann commented Feb 27, 2026

Overview

🚀All PR comments in this Pull Request were posted by the agent. 🚀

This PR adds a new github-pr-reviewer agent that performs automated, AI-powered code reviews via the GitHub API, and refactors LLM initialization to support both Anthropic and OpenAI providers.


What's New

🤖 GitHub PR Reviewer Agent

A new GitHubPRReviewerAgent that can fully review a pull request end-to-end:

agentic-run github-pr-reviewer -i "review https://github.com/org/repo/pull/42"

The agent reads the full PR diff, posts inline comments on specific lines, writes a summary review, and can reply to existing comment threads. It exposes six GitHub API tools:

Tool Description
get_pr_metadata Fetches title, description, author, head SHA
get_pr_diff Reads changed files and patches
get_pr_comments Reads inline and general comments
post_review_comment Posts inline comment on a specific file+line
post_general_comment Posts a top-level summary comment
reply_to_review_comment Replies to an existing comment thread

Changes

  • core/github_pr_reviewer.py — New agent with all 6 GitHub API tools and a detailed code review system prompt
  • core/langgraph_agent.py / core/simple_agent.py — Refactored to use _create_model() for provider-agnostic initialization
  • constants.py — New detect_provider() and get_default_model() helpers
  • pyproject.toml — Added requests runtime dep and types-requests dev dep
  • .env.example — Documented Anthropic env vars and model selection
  • README.md — Documented the new agent and its tools
  • tests/test_agent.py — Updated mocks; 381 lines of coverage across happy paths, error cases, and edge cases

Demo

The agent reviewing this very PR:

✅ Approve with minor suggestions

This is a well-executed PR that adds significant new functionality (GitHub PR
reviewer agent) and improves the framework's flexibility with multi-provider
LLM support. The code is well-tested, properly documented, and follows good
practices.

Note: Requires GITHUB_TOKEN env var with repo scope to post comments.

@rafaelzimmermann
Copy link
Contributor Author

Code Review Summary

Overall assessment:Approve with minor suggestions

This is a well-executed PR that adds significant new functionality (GitHub PR reviewer agent) and improves the framework's flexibility with multi-provider LLM support. The code is clean, well-documented, and thoroughly tested.

Issues Found

  • 🟡 Minor – Code Duplication: _create_model() is duplicated in langgraph_agent.py and simple_agent.py. Move it to constants.py to follow DRY principle.

  • 🟡 Minor – Retry Logic: The exponential backoff comment is misleading (says "1s, 2s" but actually does 1s, 2s, 4s). Also, consider reading the Retry-After header from GitHub's rate limit response instead of using fixed delays.

  • 🟡 Minor – Data Loss: PR descriptions are silently truncated at 2000 chars. Add a note in the output when truncation occurs so the agent knows it's missing context.

  • 🟡 Minor – Security: Ensure GITHUB_TOKEN is never logged in error messages. Consider adding a _sanitize_error() helper to strip tokens from exception strings.

Positive Highlights

  • Comprehensive test coverage: 451 lines of tests covering happy paths, error cases, edge cases, and retry logic
  • Clean architecture: Well-separated concerns (API functions, agent class, tools)
  • Excellent documentation: System prompt is detailed and actionable; docstrings are clear
  • Provider flexibility: Multi-provider LLM support (Anthropic + OpenAI) is well-implemented
  • Defensive programming: Token validation, line number checks, null-safe field access
  • Good error handling: Graceful fallbacks and informative error messages

Next Steps

  1. Refactor _create_model() into constants.py to eliminate duplication
  2. Improve retry logic to respect GitHub's Retry-After header
  3. Add truncation indicator to PR description output
  4. Add token sanitization to error messages
  5. Consider adding an optional integration test for real GitHub API calls

The PR is ready to merge after addressing the code duplication issue. The other suggestions are nice-to-haves that can be addressed in follow-up PRs.

"rich>=13.0.0",
"tree-sitter==0.21.3",
"tree-sitter-languages>=1.10.2",
"requests>=2.32.5",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dependency note: requests>=2.32.5 is a good choice, but consider whether this should be pinned more tightly or if there's a reason to allow newer versions. The current constraint allows any version ≥2.32.5, which is reasonable for a stable library like requests. However, ensure that the GitHub API integration is tested against the minimum version (2.32.5) to catch any compatibility issues early.

from agentic_framework.interfaces.base import Agent
from agentic_framework.mcp import MCPProvider


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code duplication: The _create_model() function is duplicated in both langgraph_agent.py and simple_agent.py. This violates DRY and makes maintenance harder. Consider:

  1. Moving _create_model() to constants.py as a shared utility
  2. Importing it in both modules
# In constants.py
def _create_model(model_name: str, temperature: float):
    """Create the appropriate LLM model instance..."""
    provider = detect_provider()
    if provider == "anthropic":
        return ChatAnthropic(model=model_name, temperature=temperature)
    return ChatOpenAI(model=model_name, temperature=temperature)

# In both agent files
from agentic_framework.constants import _create_model

**kwargs: Additional arguments forwarded to the underlying requests call.

Returns:
requests.Response from the last attempt.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential issue with retry logic: The exponential backoff calculation uses 2**attempt, which gives 1s, 2s, 4s delays. However, the comment says "1s, 2s between attempts" which is misleading. Also, consider whether 4s is appropriate for the final retry—GitHub's rate limit reset times can be much longer.

Consider reading the Retry-After header from the response to respect GitHub's rate limit guidance:

if response.status_code == 429:
    retry_after = int(response.headers.get('Retry-After', 2**attempt))
    time.sleep(retry_after)

if err:
return err
if line < 1:
return (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type annotation issue: The return type annotation str | None is used in _check_token(), but the function always returns either a string or None. However, the docstring says "Return an error string if token is absent, else None" which is correct. The issue is that callers check if err: which works, but consider being more explicit:

def _check_token() -> str | None:
    """Return an error string if GITHUB_TOKEN is missing, else None."""
    if not os.environ.get("GITHUB_TOKEN"):
        return "Error: GITHUB_TOKEN environment variable is not set..."
    return None

This is fine as-is, but the pattern is used consistently throughout. Good defensive programming.

@@ -0,0 +1,451 @@
"""Tests for the GitHub PR Reviewer agent."""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage is excellent: 451 lines of comprehensive tests covering:

  • Agent initialization and system prompt
  • All 6 tools with happy paths
  • Missing token error handling
  • API error scenarios
  • Retry logic with exponential backoff
  • Edge cases (null body, binary files, empty comments)
  • Line number validation

This is well-structured and thorough. One suggestion: consider adding an integration test (marked with @pytest.mark.integration) that makes a real API call to a test PR to catch any GitHub API changes early.

url = f"{_GITHUB_API_BASE}/repos/{repo}/pulls/{pr_number}/files"
response = _github_request("get", url, headers=_get_headers(), timeout=30)
response.raise_for_status()
files: list[dict[str, Any]] = response.json()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential data loss: The description is truncated to 2000 characters without any indication to the user. If a PR has a longer description, the agent won't see the full context. Consider:

  1. Adding a note in the output when truncation occurs
  2. Increasing the limit or making it configurable
  3. Documenting this behavior in the docstring
truncated_description = description[:2000]
if len(description) > 2000:
    truncated_description += f"\n\n[Description truncated from {len(description)} chars]"

def _get_headers() -> dict[str, str]:
"""Build GitHub API request headers from the GITHUB_TOKEN environment variable."""
token = os.environ.get("GITHUB_TOKEN", "")
return {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security concern: The GITHUB_TOKEN is being included in the Authorization header without any masking or validation. While this is necessary for API calls, ensure that:

  1. Tokens are never logged or printed in error messages
  2. The token is not exposed in debug output

Consider adding a helper to sanitize error messages:

def _sanitize_error(msg: str) -> str:
    """Remove sensitive tokens from error messages."""
    return msg.replace(os.environ.get("GITHUB_TOKEN", ""), "***")

Then use it in exception handlers to prevent accidental token leakage.


Returns:
"anthropic" if ANTHROPIC_API_KEY is set, "openai" otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation note: The docstring says "When both keys are available, Anthropic takes precedence" but this is a design choice that should be documented more prominently. Consider adding a comment explaining the rationale:

def detect_provider() -> Provider:
    """Detect which LLM provider to use based on available API keys.
    
    Returns:
        "anthropic" if ANTHROPIC_API_KEY is set, "openai" otherwise.
    
    Note:
        When both ANTHROPIC_API_KEY and OPENAI_API_KEY are set, Anthropic
        takes precedence. This allows users to have both configured without
        ambiguity. OpenAI is the default fallback when only OPENAI_API_KEY
        is available.
    """

This helps future maintainers understand the intent.

Adds a new 'github-pr-reviewer' agent with 6 GitHub API tools:
- get_pr_metadata: fetch PR title, author, base branch, and head SHA
- get_pr_diff: fetch per-file patches for a pull request
- get_pr_comments: fetch both inline review and general comments
- post_review_comment: post inline comment on a specific file+line
- post_general_comment: post top-level PR summary comment
- reply_to_review_comment: reply to an existing comment thread

The agent reads GITHUB_TOKEN from the environment (never hardcoded),
returns error strings on failure (no raised exceptions), and ships
with 25 tests at 100% coverage.

Also adds 'requests' runtime dep and 'types-requests' mypy stub.
- constants.py: clarify detect_provider() docstring — OpenAI is the
  default fallback; Anthropic takes precedence when both keys are set

- github_pr_reviewer.py: add _github_request() helper with exponential
  backoff retry (1s, 2s) for transient 429/503/504 GitHub API errors;
  all six tools now route through this helper

- github_pr_reviewer.py: validate line > 0 in post_review_comment with
  a clear error message before making the API call

- github_pr_reviewer.py: document the 2000-char description truncation
  in get_pr_metadata with an inline comment
@jeancsil
Copy link
Owner

LGTM! @rafaelzimmermann 👏🏻 👏🏻 👏🏻

- .env.example: replace deprecated claude-3-5-sonnet-20241022 with
  current model options (claude-sonnet-4-6, claude-opus-4-6, claude-haiku-4-5-20251001)
- constants.py: sync get_default_model() docstring example with actual
  default (claude-haiku-4-5-20251001)
@jeancsil jeancsil merged commit 5ebf83f into jeancsil:main Feb 28, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants