Add take_screenshot tool with camera control#3
Conversation
- Implemented a new MCP tool `take_screenshot` that captures the game screen. - Added support for optional coordinates and rotation parameters to position the camera before capture. - Handled rendering synchronization by deferring capture to the next frame after movement. - Implemented MCP image content support in responses. - Added `ScreenshotUtils` for robust capture and player manipulation. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2378134cd0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- Replaced `client.execute()` with a deferred tick-based capture system in `ScreenshotUtils`. - Registered `ClientTickEvents.END_CLIENT_TICK` in `MCPServerModClient` to process deferred tasks. - Ensure at least 2 ticks pass after teleportation before capturing the framebuffer, ensuring the new view is rendered. - Addressed PR feedback regarding immediate execution on the render thread. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Added `saveScreenshotsForDebug` option to `MCPConfig`. - Implemented `saveDebugScreenshot` in `ScreenshotUtils` to save captures to `mcp_debug_screenshots` directory when enabled. - Ensured screenshots are timestamped for easy identification during debugging. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Added documentation for the `take_screenshot` tool. - Added instructions for enabling `save_screenshots_for_debug` and the location of saved screenshots. - Updated example configuration in README. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
|
@codex 추가 수정 사항이 있었는데, 최종 코드를 다시 리뷰해줘. |
- Added `take_screenshot` tool to MCP protocol. - Implemented camera-aware screenshot capture with deferred timing (2 ticks) to ensure visual accuracy. - Added optional x, y, z, yaw, pitch parameters for perfect framing. - Added `saveScreenshotsForDebug` configuration to save local PNGs. - Updated README with comprehensive documentation. - Registered necessary tick listeners for deferred task processing. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee6af7426b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- Normalize null `arguments` in `HTTPMCPServer.handleTakeScreenshot` to an empty object. - Update `ScreenshotUtils.takeScreenshotInternal` to reject partial coordinate sets (requires all of x, y, and z if any are provided). - Address PR feedback for safer argument handling and clearer error reporting. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Refactored `ScreenshotUtils` to make coordinate validation testable. - Added `MCPProtocolTest` for image response verification. - Added `ScreenshotUtilsTest` for coordinate validation verification. - Verified all tests pass. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Refactored `ScreenshotUtils` to isolate timing and data processing logic. - Added comprehensive unit tests for deferred task timing and Base64 encoding. - Added `TESTING.md` to explain the distinction between unit and integration tests. - Clarified technical limitations regarding testing Minecraft-dependent code. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
|
@codex 최종 코드를 다시 리뷰해줘 |
- Refactored `ScreenshotUtils` for testability. - Added extensive unit tests for deferred capture timing, coordinate validation, and Base64 encoding. - Added `TESTING.md` and updated `README.md` with detailed instructions and testing strategy. - Fixed null argument handling and coordinate validation in tool handlers. - Ensured all changes are verified and pass all tests. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ab2b84f80f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex 최종 코드를 다시 리뷰해줘 |
- Refactored ScreenshotUtils for testability. - Added extensive unit tests for deferred capture timing, coordinate validation, and Base64 encoding. - Added TESTING.md and updated README.md with detailed instructions and testing strategy. - Fixed null argument handling and coordinate validation in tool handlers. - Ensured all changes are verified and pass all tests. Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 83028f606f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (moved) { | ||
| // Defer capture to ensure the world is rendered with the new view. | ||
| // We wait for 2 end-of-tick events to be certain a render has completed. | ||
| pendingDeferredTasks.add(new DeferredTask(() -> captureNow(client, future), 2)); |
There was a problem hiding this comment.
Serialize camera mutations before deferred screenshot capture
This code mutates global player position/view immediately, then defers capture by two ticks; concurrent take_screenshot calls can therefore overwrite each other's camera state before the first capture runs. In that scenario, earlier requests may return an image from the later request's viewpoint, which breaks result correctness for multi-client or overlapping requests.
Useful? React with 👍 / 👎.
This change adds a multimodal capability to the Minecraft MCP mod, allowing AI agents to visually inspect the world. The
take_screenshottool enables the agent to specify a viewpoint (coordinates and rotation), teleports the player to that location, waits for a frame to render, and returns the resulting screenshot as a Base64-encoded PNG image according to the MCP specification.PR created automatically by Jules for task 10459417936888807114 started by @cuspymd