add e2e workflow #18

Draft
satvik007 wants to merge 3 commits into dev/satvik/add-integration-tests from dev/satvik/add-e2e-tests

satvik007 commented Dec 24, 2025

Summary

  • Add GitHub Actions E2E test workflow using Claude Code Action
  • Run automated tests against all QA Sphere MCP tools with real API calls
  • Trigger on pushes to main, PRs to all branches, and manual dispatch

Test Coverage

  • Projects: list_projects, get_project
  • Test Cases: list_test_cases, get_test_case, create_test_case, update_test_case
  • Folders: list_folders, upsert_folders
  • Tags: list_test_cases_tags
  • Requirements: list_requirements
  • Custom Fields: list_custom_fields
  • Shared Steps: list_shared_steps
  • Shared Preconditions: list_shared_preconditions

Configuration

  • Supports scoped test runs via workflow_dispatch input
  • Uses QASPHERE_TENANT_URL, QASPHERE_API_KEY, and CLAUDE_CODE_OAUTH_TOKEN secrets

@gemini-code-assist

Summary of Changes

Hello @satvik007, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the project's testing infrastructure by integrating a new End-to-End (E2E) testing workflow. This workflow, powered by Claude Code, ensures the robust functionality of the QA Sphere Model Context Protocol (MCP) server by validating its tools against a real QA Sphere environment. Alongside this, comprehensive documentation has been added to guide developers through the server's architecture, tool development, and testing practices, streamlining future contributions and maintenance.

Highlights

  • New E2E Testing Workflow: Introduced an End-to-End (E2E) testing workflow that utilizes Claude Code to thoroughly exercise all Model Context Protocol (MCP) tools against a live QA Sphere instance. This workflow runs on PRs and main merges, with manual trigger options.
  • Comprehensive Development Guide: Added a detailed development guide (CLAUDE.md) for the QA Sphere MCP server, covering its architecture, how to add new tools, various testing methodologies (unit, integration, E2E), code style, API authentication, and debugging procedures.
  • MCP Server Configuration: A new .mcp.json configuration file has been added to define the qasphere MCP server, specifying its execution command, arguments, and necessary environment variables for integration with QA Sphere.
  • Updated README Documentation: The main README.md has been updated to include a new 'Development' section, providing clear instructions on running unit and integration tests, and detailing the setup and execution of the new E2E testing with Claude, including required GitHub secrets.

Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/e2e-tests.yml

gemini-code-assist bot left a comment

Code Review

This pull request introduces an end-to-end testing workflow using Claude, along with associated documentation. The changes include a new .mcp.json configuration, a detailed development guide in CLAUDE.md for the LLM agent, and updates to README.md for developers. My review focuses on improving the clarity, consistency, and discoverability of the new documentation. I've suggested making content in AGENTS.md a proper link, pointed out a potential inconsistency in environment variables in CLAUDE.md, and recommended linking the new development guide from the main README.md.

@@ -0,0 +1 @@
CLAUDE.md
\ No newline at end of file

medium

The content of this file is just the string CLAUDE.md. If this file is intended to be a human-readable index of agent-specific documentation, it would be more helpful to format it as a proper Markdown link. This would allow users to navigate to the file directly from a Markdown preview.

Suggested change
CLAUDE.md
[Claude Development Guide](./CLAUDE.md)


### Integration Tests
- Real API calls against test tenant
- Requires `.env` with `QASPHERE_TENANT_URL`, `QASPHERE_API_KEY`, `QASPHERE_AUTH_EMAIL`, `QASPHERE_AUTH_PASSWORD`

medium

This line mentions QASPHERE_AUTH_EMAIL and QASPHERE_AUTH_PASSWORD as required for integration tests. However, these environment variables are not validated or used in the provided src/config.ts. If they are indeed required for integration tests, it would be good to clarify where they are used to avoid confusion for developers setting up the project. If they are no longer needed, they should be removed from this documentation.


Replace the placeholder values with your actual QA Sphere URL and API key.

## Development

medium

A new CLAUDE.md file has been added which serves as a comprehensive development guide. However, it's not linked from the main README.md, making it hard for developers to discover. It would be beneficial to add a link to it under the new 'Development' section to improve discoverability and avoid duplicating information between the two files.

Suggested change
## Development
## Development

For a comprehensive guide on development, including architecture, adding new tools, and debugging, please see the [Development Guide](./CLAUDE.md).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude bot commented Dec 24, 2025

Pull Request Review: E2E Workflow

Thank you for adding E2E testing infrastructure! This is a valuable addition to the project. Here is my comprehensive review:

Strengths

  1. Well-Structured Workflow - Clear separation of concerns with named steps, GitHub Actions best practices (checkout@v4, setup-node@v4), and appropriately scoped permissions

  2. Flexible Test Scoping - The test_scope input is a great feature for targeted testing and covers all tool domains comprehensively

  3. Documentation - Excellent updates to CLAUDE.md and README.md with clear instructions for setting up secrets

  4. Configuration Files - .mcp.json provides a convenient local development config and AGENTS.md symlink is a smart way to maintain consistency

Issues and Recommendations

CRITICAL: TEST_SCOPE Variable Interpolation

Issue: Line 84 in .github/workflows/e2e-tests.yml references the test scope in the Claude prompt, but the variable interpolation may not work as expected inside the multiline YAML string.

Solution: Verify that the test scope value is properly passed to Claude. Consider using github.event.inputs.test_scope directly in the prompt for explicit interpolation.

MODERATE: Conditional Test Execution Logic

Issue: The prompt includes conditional test execution (lines 98-120) but relies on Claude to interpret if scope includes X or all correctly.

Recommendation: Either make the conditional logic more explicit or accept that this is guidance. The current approach is acceptable if you trust Claude to follow instructions properly.

MINOR: Template Variable Syntax in .mcp.json

Issue: The .mcp.json file uses shell-style variable syntax which requires environment variables to be set.

Recommendation: Consider renaming to .mcp.json.template to make the intent clearer, or add documentation explaining users need environment variables set.
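To illustrate why the placeholders only work when the environment is populated, here is a minimal sketch of placeholder expansion. The helper name `expandEnvPlaceholders` is hypothetical and is not part of the PR or of any MCP client. It assumes placeholders use the `${NAME}` form and that values are substituted verbatim, without JSON escaping.

```typescript
// Hypothetical helper: substitutes ${NAME} placeholders in a config template.
// Throws when a referenced variable is unset, so a misconfigured environment
// fails loudly instead of passing the literal "${...}" text to the server.
// Values are inserted verbatim (no JSON escaping is applied).
function expandEnvPlaceholders(
  template: string,
  env: Record<string, string | undefined>,
): string {
  return template.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string) => {
      const value = env[name];
      if (value === undefined || value === "") {
        throw new Error(`Environment variable ${name} is not set`);
      }
      return value;
    },
  );
}
```

Failing on an unset variable is a design choice for this sketch; a real client may instead leave the placeholder untouched or substitute an empty string.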

Testing Strategy Considerations

  1. Test Isolation - Multiple concurrent workflow runs might interfere if they create/modify the same test artifacts. Consider adding unique identifiers.

  2. Cleanup Reliability - The prompt asks Claude to clean up test cases. What happens if cleanup fails? Consider making tests fully idempotent.

  3. Success Criteria - How does the workflow determine if tests passed or failed? Consider adding a step that parses Claude output and sets the appropriate exit code.
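The success-criteria point could be handled by a small parsing step after the Claude run. The sketch below assumes a report format that is purely hypothetical (one line per tool, such as `list_projects: PASS`); the actual format of Claude's report is not specified in this PR, and the function names are illustrative.

```typescript
// Assumed report format (hypothetical): one line per tool, for example
//   "- list_projects: PASS" or "- get_project: FAIL - 404 from API".
interface ReportSummary {
  passed: number;
  failed: number;
}

function summarizeReport(report: string): ReportSummary {
  const summary: ReportSummary = { passed: 0, failed: 0 };
  for (const line of report.split("\n")) {
    // Match "<tool_name>: PASS" or "<tool_name>: FAIL", allowing a list bullet.
    const match = /^\s*(?:[-*]\s*)?[a-z_]+:\s*(PASS|FAIL)\b/.exec(line);
    if (match === null) continue;
    if (match[1] === "PASS") summary.passed += 1;
    else summary.failed += 1;
  }
  return summary;
}

// Exit code for the workflow step: non-zero when any tool failed, or when no
// results were found at all (an empty report should not count as success).
function exitCodeFor(summary: ReportSummary): number {
  return summary.failed > 0 || summary.passed === 0 ? 1 : 0;
}
```

A workflow step could pass the captured report to `summarizeReport` and exit with the value returned by `exitCodeFor`, which would make the job fail when any tool check fails.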

Security Assessment

No security concerns identified. Good practices: Secrets properly stored, no credentials hardcoded, environment variables used appropriately, workflow permissions properly scoped.

Performance and Reliability Suggestions

  1. Add timeout-minutes to the job to prevent indefinite hangs
  2. Consider uploading test reports as artifacts for historical tracking
  3. npm cache already enabled via setup-node - good job!

Overall Assessment

Code Quality: 4/5 - Well-structured, minor interpolation issue
Test Coverage: 5/5 - Comprehensive coverage of all tools
Documentation: 5/5 - Excellent additions
Security: 5/5 - No concerns
Performance: 4/5 - Efficient design

Action Items

Must address:

  • Verify TEST_SCOPE interpolation works correctly in the Claude prompt

Should consider:

  • Add timeout to prevent hanging workflows
  • Clarify .mcp.json usage (template vs config)
  • Add workflow failure criteria based on test results

Nice to have:

  • Upload test reports as artifacts
  • Add example test output to documentation
  • Implement test uniqueness for concurrent runs

Recommendation

Approve with minor changes. This PR adds significant value to the project with comprehensive E2E testing infrastructure. Once the TEST_SCOPE interpolation is verified/corrected, this is ready to merge.

Excellent work on building a robust testing framework!

claude bot commented Dec 24, 2025

Code Review: E2E Testing Infrastructure

I've reviewed PR #18 which adds E2E testing infrastructure using Claude Code. Overall, this is a well-structured addition that will provide valuable automated testing. Below is my detailed feedback:


✅ Strengths

  1. Comprehensive E2E Coverage: The workflow covers all MCP tool domains (projects, test cases, folders, tags, requirements, custom fields, shared steps, and preconditions).

  2. Flexible Test Execution: The workflow_dispatch with test_scope input allows granular control over which test categories to run - very useful for debugging specific areas.

  3. Good Documentation: The CLAUDE.md and README.md updates provide clear instructions for developers and users.

  4. Appropriate Permissions: GitHub Actions permissions are properly scoped (contents:read, pull-requests:write, etc.).

  5. Clean Architecture: The separation between workflow configuration and MCP server config is well-structured.


🔍 Security Concerns

CRITICAL: Secrets Exposure Risk (.github/workflows/e2e-tests.yml:57-70)

The workflow creates an MCP config file using shell variable expansion which could expose secrets in logs if command echoing is enabled or the command fails.

Recommendation: Use a more secure approach with jq or Python to construct the JSON, leveraging GitHub's secret masking. This ensures secrets are not exposed in heredoc syntax.
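The recommendation names jq or Python; the same idea can be sketched in TypeScript to match the project. This is an illustration, not the PR's implementation: the `mcpServers` layout, the `node` command, and the `dist/index.js` path are assumptions about what `.mcp.json` contains, and `buildMcpConfig` is a hypothetical name.

```typescript
interface McpServerConfig {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds the MCP config as a JSON string from environment values instead of
// interpolating secrets into a shell heredoc.
function buildMcpConfig(env: Record<string, string | undefined>): string {
  const required = ["QASPHERE_TENANT_URL", "QASPHERE_API_KEY"];
  const serverEnv: Record<string, string> = {};
  for (const name of required) {
    const value = env[name];
    if (value === undefined || value === "") {
      // Report only the variable name, never its value.
      throw new Error(`Missing required environment variable: ${name}`);
    }
    serverEnv[name] = value;
  }
  const qasphere: McpServerConfig = {
    command: "node", // assumed launch command
    args: ["dist/index.js"], // assumed build output path
    env: serverEnv,
  };
  // JSON.stringify escapes quotes and backslashes in the values.
  return JSON.stringify({ mcpServers: { qasphere } }, null, 2);
}
```

Because the values never pass through shell expansion, a secret containing quotes or `$` characters cannot break the generated file, and error messages mention only variable names.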


🐛 Potential Issues

1. Test Scope Variable Not Properly Utilized (.github/workflows/e2e-tests.yml:83)

The workflow sets TEST_SCOPE but only passes it as plain text in the prompt. The prompt says "(if scope includes tcases or all)" but doesn't provide Claude with clear logic to determine this.

Recommendation: Make the prompt more explicit about scope filtering logic or pre-filter the test list in the workflow.
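Pre-filtering the test list could look roughly like the sketch below. Only the scope names `tcases` and `all` appear in the workflow prompt quoted above; the other scope names are hypothetical placeholders, and the tool lists are taken from the PR description's Test Coverage section.

```typescript
interface ScopeGroup {
  scope: string;
  tools: string[];
}

// Scope names other than "tcases" are hypothetical.
const SCOPE_GROUPS: ScopeGroup[] = [
  { scope: "projects", tools: ["list_projects", "get_project"] },
  {
    scope: "tcases",
    tools: ["list_test_cases", "get_test_case", "create_test_case", "update_test_case"],
  },
  { scope: "folders", tools: ["list_folders", "upsert_folders"] },
  { scope: "tags", tools: ["list_test_cases_tags"] },
  { scope: "requirements", tools: ["list_requirements"] },
  { scope: "custom_fields", tools: ["list_custom_fields"] },
  { scope: "shared_steps", tools: ["list_shared_steps"] },
  { scope: "shared_preconditions", tools: ["list_shared_preconditions"] },
];

// Returns the tools to exercise for a given scope; "all" selects every group.
function selectTools(scope: string): string[] {
  const groups =
    scope === "all" ? SCOPE_GROUPS : SCOPE_GROUPS.filter((g) => g.scope === scope);
  if (groups.length === 0) {
    throw new Error(`Unknown test scope: ${scope}`);
  }
  const tools: string[] = [];
  for (const group of groups) {
    tools.push(...group.tools);
  }
  return tools;
}
```

With the list resolved before the prompt is built, the prompt can name only the selected tools, and Claude no longer has to interpret "if scope includes X or all" itself.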

2. AGENTS.md Symlink May Not Work on Windows (AGENTS.md:1)

The symlink AGENTS.md -> CLAUDE.md works on Unix systems but may cause issues for Windows developers.

Recommendation: Document in README that Windows users need to enable Developer Mode for symlinks, or consider using a different approach.

3. Missing .mcp.json Guidance

The .mcp.json file contains environment variable placeholders which is good, but developers might accidentally commit local versions with real secrets.

Recommendation: Add a comment in the file or document in README to copy to .mcp.local.json for local development.


🎯 Code Quality Suggestions

1. Incomplete Test Cleanup Instructions (.github/workflows/e2e-tests.yml:135)

The prompt says "Clean up any test cases you create (but keep the folder for future runs)" but doesn't enforce this. Over time, test artifacts could accumulate.

Recommendation: Add a cleanup phase in the workflow or document the manual cleanup process in CLAUDE.md.

2. Missing Validation Step

The workflow doesn't validate that the built MCP server file exists before running Claude tests.

Recommendation: Add a smoke test to verify dist/index.js exists after build.


📊 Performance Considerations

  1. Node Modules Cache: Good use of cache: 'npm' in setup-node action ✓
  2. Build Time: Full build on each run is appropriate for E2E tests
  3. Rate Limiting: If the QA Sphere API has rate limits, sequential tool calls could cause failures - consider documenting any known limits

✨ Best Practices Followed

  1. Documentation: Excellent job documenting required secrets in README.md ✓
  2. Version Pinning: Using @v4 for actions (not @latest) ✓
  3. Structured Prompts: The prompt format with clear sections and checklists is well-organized ✓
  4. Node Version: Pinning to Node 20 is appropriate ✓

📝 Minor Suggestions

  1. README.md Line 96: "Claude's test report in the workflow output" - clarify this appears in the Actions tab workflow run logs
  2. CLAUDE.md Lines 92-94: E2E tests section could mention the test scope feature
  3. Workflow Naming: Minor inconsistency between filename e2e-tests.yml and workflow name E2E Tests with Claude

🧪 Test Coverage Assessment

Current Coverage: Good coverage of happy paths for all tool categories.

Future Enhancements (not blockers):

  • Error handling tests (401, 404, 500 responses)
  • Empty result sets (project with no test cases, etc.)
  • Invalid input validation
  • Concurrent operations

Summary

This PR adds valuable E2E testing infrastructure with good documentation. The main concern is the potential secrets exposure risk in the config file creation step, which should be addressed before merging.

Recommendation: Request changes to fix the secrets handling, then approve.

Priority Actions for Merge

  • HIGH: Fix secrets exposure in MCP config creation step (use jq or similar)
  • MEDIUM: Document Windows symlink limitation or provide alternative
  • LOW: Add build validation check
  • LOW: Clarify test cleanup strategy

Nice-to-Haves (Future PRs)

  • Add negative test cases
  • Document API rate limits if any
  • Add test result artifacts/reports

Great work on this testing infrastructure! This will significantly improve the reliability of the MCP server. 🚀

Documents all 16 available tools with parameter tables including types,
required/optional status, and descriptions. Tools are organized by domain:
Projects, Test Cases, Folders, Tags, Requirements, Custom Fields,
Shared Steps, and Shared Preconditions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude bot commented Dec 24, 2025

PR Review: E2E Workflow Implementation

Summary

This PR adds comprehensive E2E testing infrastructure using Claude Code Action, a development guide (CLAUDE.md), and extensive README documentation. Overall, this is a high-quality contribution that significantly improves the project's testing and documentation.

✅ Strengths

1. Well-Structured E2E Workflow

  • Clean separation of concerns with proper build → configure → test flow
  • Good use of GitHub Actions permissions (principle of least privilege)
  • Flexible test scoping via workflow_dispatch is excellent for targeted testing
  • Proper secret handling for credentials

2. Comprehensive Documentation

  • CLAUDE.md: Clear, actionable guide for development with concrete examples
  • README.md: Thorough API documentation with parameter tables - this is excellent reference material
  • AGENTS.md symlink ensures compatibility with different naming conventions

3. Test Coverage

  • Nearly all of the 16 documented MCP tools are covered in the E2E prompt (the two exceptions are listed under Issues)
  • Test instructions include cleanup guidance ("keep folder for future runs")
  • Includes error handling guidance ("continue with remaining tests")

🔍 Issues & Recommendations

Critical Issues

1. Missing get_shared_step and get_shared_precondition in E2E Tests

Severity: Medium

The README documents these tools (lines 183-190, 211-218):

  • get_shared_step
  • get_shared_precondition

But the E2E workflow only tests the list_* variants (lines 116-120). The workflow should exercise ALL documented tools.

Recommendation: Add test cases for the individual get methods:

- [ ] `get_shared_step` - Get a specific shared step by ID
- [ ] `get_shared_precondition` - Get a specific shared precondition by ID

2. E2E Prompt References Non-Existent Test Scope Logic

Severity: Medium

Lines 97-120 of the workflow say "(if scope includes tcases or all)" but there's no actual conditional logic - Claude will execute all tests regardless of TEST_SCOPE. This could be misleading.

Recommendations:

  • Option A: Remove the conditional language and always run all tests
  • Option B: Make the prompt dynamic based on TEST_SCOPE using GitHub Actions syntax
  • Option C: Clarify that Claude should interpret and filter based on TEST_SCOPE

Security Concerns

3. Secrets Exposed in Workflow Logs

Severity: Low-Medium

Lines 57-70: The MCP config is created with environment variable substitution. If GitHub Actions debug logging is enabled, these could appear in logs. While the secrets themselves use GitHub's secret masking, the pattern is worth noting.

Recommendation: This is likely acceptable given GitHub's secret masking, but consider adding a comment noting that secrets are handled securely.

4. API Key Permissions Not Documented

Severity: Low

The README mentions needing QASPHERE_API_KEY but doesn't specify what permissions/scopes it requires.

Recommendation: Document minimum required API key permissions in the "Required GitHub Secrets" section.

Code Quality Issues

5. Inconsistent Tool Naming in E2E Prompt

Severity: Low

Lines 94-95 use backticks around tool names (list_projects) but the actual MCP tool names are prefixed (e.g., mcp__qasphere__list_projects per line 81).

Recommendation: Either:

  • Use full MCP tool names consistently, OR
  • Add a note explaining the mcp__qasphere__ prefix is automatic
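The naming relationship can be made explicit with a pair of small helpers. This sketch assumes the prefix always has the form `mcp__<server>__`, as in the `mcp__qasphere__list_projects` example above; the function names are hypothetical.

```typescript
// Builds the prefixed tool name for a server, assuming the
// mcp__<server>__<tool> form quoted in this review.
function toMcpToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// Strips the prefix again; returns the input unchanged if it is not prefixed.
function toShortToolName(server: string, fullName: string): string {
  const prefix = `mcp__${server}__`;
  return fullName.startsWith(prefix) ? fullName.slice(prefix.length) : fullName;
}
```

For example, `toMcpToolName("qasphere", "list_projects")` yields `mcp__qasphere__list_projects`, which is the form the allowed-tools list uses.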

6. Hard-Coded Test Folder Name

Severity: Low

Lines 105, 134 hard-code "MCP-E2E-Tests" as the folder name. If multiple test runs happen concurrently (e.g., different PRs), this could cause conflicts.

Recommendation: Consider making the folder name unique per run (e.g., using github.run_id) or add explicit guidance about serial vs parallel test execution.
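A per-run folder name could be derived as in this minimal sketch. The helper name is hypothetical; `GITHUB_RUN_ID` is a default environment variable that GitHub Actions sets for each workflow run.

```typescript
// Derives a folder name that is unique per workflow run.
// Pass the value of GITHUB_RUN_ID; local runs without it keep the shared name.
function e2eFolderName(runId: string | undefined): string {
  const base = "MCP-E2E-Tests";
  if (runId === undefined || runId === "") {
    return base;
  }
  return `${base}-${runId}`;
}
```

Note the trade-off: the prompt currently keeps the folder for future runs, so unique folder names would accumulate unless the cleanup step also removes the folder.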

Documentation Issues

7. Missing Error Handling Guidance

Severity: Low

The E2E prompt says "continue with remaining tests" (line 136) but doesn't specify what Claude should do if a critical dependency fails (e.g., if list_projects fails, all subsequent tests can't run).

Recommendation: Add guidance like:

- If list_projects fails, ABORT remaining tests
- Otherwise, continue testing even if individual tools fail
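The abort-versus-continue rule can be stated precisely as a small runner. This is a sketch of the control flow only: the check functions are synchronous stand-ins (real tool calls would be asynchronous), and all names are hypothetical.

```typescript
interface ToolCheck {
  name: string;
  run: () => boolean; // returns true when the tool call succeeded
  critical?: boolean; // a failed critical check aborts the remaining checks
}

interface RunResult {
  passed: string[];
  failed: string[];
  skipped: string[];
}

function runChecks(checks: ToolCheck[]): RunResult {
  const result: RunResult = { passed: [], failed: [], skipped: [] };
  let aborted = false;
  for (const check of checks) {
    if (aborted) {
      result.skipped.push(check.name);
      continue;
    }
    let ok = false;
    try {
      ok = check.run();
    } catch {
      ok = false; // a thrown error counts as a failure
    }
    if (ok) {
      result.passed.push(check.name);
    } else {
      result.failed.push(check.name);
      if (check.critical === true) aborted = true;
    }
  }
  return result;
}
```

Marking `list_projects` as critical reproduces the guidance: if it fails, every later check is reported as skipped, while a failure in any non-critical check leaves the rest running.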

8. No Validation of Test Artifacts

Severity: Low

The prompt instructs Claude to create test cases (line 100) and clean them up (line 135), but doesn't verify the creation/deletion succeeded.

Recommendation: Add explicit verification steps:

- [ ] Verify created test case exists via get_test_case
- [ ] Verify deletion succeeded via list_test_cases

Performance Considerations

9. No Rate Limiting Guidance

Severity: Low

The E2E tests will make multiple API calls in quick succession. If QA Sphere has rate limits, this could cause false failures.

Recommendation: Add a note in CLAUDE.md or the workflow about:

  • Expected API rate limits
  • How Claude should handle 429 responses
  • Whether tests should include delays between calls
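One possible policy for 429 responses is sketched below: honor a numeric `Retry-After` header when present, otherwise back off exponentially. The base delay and cap are arbitrary illustrative values, and nothing here reflects QA Sphere's actual rate limits, which are not documented in this PR.

```typescript
// Computes how long to wait before retrying after a 429 response.
// attempt is zero-based; retryAfter is the raw Retry-After header, if any.
function retryDelayMs(attempt: number, retryAfter?: string): number {
  if (retryAfter !== undefined && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    // Only the delay-in-seconds form is handled; an HTTP-date value
    // parses to NaN and falls through to the exponential backoff.
    if (isFinite(seconds) && seconds >= 0) {
      return seconds * 1000;
    }
  }
  const baseMs = 1000; // illustrative base delay
  const capMs = 30000; // illustrative upper bound
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

Under these assumed values the fallback delays are 1 s, 2 s, 4 s and so on, capped at 30 s, which keeps a rate-limited run from turning into a false failure without stalling the job indefinitely.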

📝 Minor Suggestions

  1. Line 83: Consider making TEST_SCOPE available to Claude's context more explicitly (e.g., as a parameter rather than just in the header)

  2. .mcp.json: This file uses environment variable syntax (${QASPHERE_TENANT_URL}) which won't work in all contexts. Add a README note that this is a template requiring environment variable expansion.

  3. README.md line 8: The emoji was removed (⚙️ → Settings). While this aligns with Claude Code style guidelines, it was a helpful visual marker. Consider if this was intentional.

  4. CLAUDE.md lines 92-94: The E2E section could link to the actual workflow file for reference.

✅ Best Practices Observed

  • ✅ Proper use of npm ci instead of npm install
  • ✅ Node.js 20 (an LTS release line)
  • ✅ Actions pinned to major versions (v4, v1)
  • ✅ Comprehensive tool documentation
  • ✅ Clear commit messages with Claude Code attribution
  • ✅ Follows existing project conventions (Biome, ES modules)

🎯 Test Coverage Assessment

Unit Tests: ✅ Existing unit tests cover the tools
Integration Tests: ✅ Existing integration tests cover API interactions
E2E Tests: ⚠️ Good coverage but missing 2 tools (get_shared_step, get_shared_precondition)

Final Verdict

Recommendation: Approve with minor changes

This PR significantly improves the project's testing infrastructure and documentation. The issues identified are mostly minor and don't block merging. The most important fixes are:

  1. Add missing tool tests (get_shared_step, get_shared_precondition)
  2. Clarify TEST_SCOPE behavior in the prompt
  3. Document API key permission requirements

Excellent work overall! The comprehensive README documentation will be particularly valuable for users integrating this MCP server.


Review Stats:

  • Files changed: 5
  • Additions: 525
  • Deletions: 1
  • Critical issues: 0
  • Medium issues: 2
  • Low issues: 7

satvik007 marked this pull request as draft December 24, 2025 10:52