Skip to content

Comments

Add take_screenshot tool with camera control#3

Merged
cuspymd merged 13 commits intomainfrom
feature/screenshot-tool-10459417936888807114
Feb 19, 2026
Merged

Add take_screenshot tool with camera control#3
cuspymd merged 13 commits intomainfrom
feature/screenshot-tool-10459417936888807114

Conversation

@cuspymd
Copy link
Owner

@cuspymd cuspymd commented Feb 16, 2026

This change adds a multimodal capability to the Minecraft MCP mod, allowing AI agents to visually inspect the world. The take_screenshot tool enables the agent to specify a viewpoint (coordinates and rotation), teleports the player to that location, waits for a frame to render, and returns the resulting screenshot as a Base64-encoded PNG image according to the MCP specification.


PR created automatically by Jules for task 10459417936888807114 started by @cuspymd

- Implemented a new MCP tool `take_screenshot` that captures the game screen.
- Added support for optional coordinates and rotation parameters to position the camera before capture.
- Handled rendering synchronization by deferring capture to the next frame after movement.
- Implemented MCP image content support in responses.
- Added `ScreenshotUtils` for robust capture and player manipulation.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2378134cd0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

google-labs-jules bot and others added 3 commits February 16, 2026 12:15
- Replaced `client.execute()` with a deferred tick-based capture system in `ScreenshotUtils`.
- Registered `ClientTickEvents.END_CLIENT_TICK` in `MCPServerModClient` to process deferred tasks.
- Ensure at least 2 ticks pass after teleportation before capturing the framebuffer, ensuring the new view is rendered.
- Addressed PR feedback regarding immediate execution on the render thread.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Added `saveScreenshotsForDebug` option to `MCPConfig`.
- Implemented `saveDebugScreenshot` in `ScreenshotUtils` to save captures to `mcp_debug_screenshots` directory when enabled.
- Ensured screenshots are timestamped for easy identification during debugging.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Added documentation for the `take_screenshot` tool.
- Added instructions for enabling `save_screenshots_for_debug` and the location of saved screenshots.
- Updated example configuration in README.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
@cuspymd
Copy link
Owner Author

cuspymd commented Feb 16, 2026

@codex 추가 수정 사항이 있었는데, 최종 코드를 다시 리뷰해줘.

- Added `take_screenshot` tool to MCP protocol.
- Implemented camera-aware screenshot capture with deferred timing (2 ticks) to ensure visual accuracy.
- Added optional x, y, z, yaw, pitch parameters for perfect framing.
- Added `saveScreenshotsForDebug` configuration to save local PNGs.
- Updated README with comprehensive documentation.
- Registered necessary tick listeners for deferred task processing.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee6af7426b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

google-labs-jules bot and others added 4 commits February 16, 2026 13:22
- Normalize null `arguments` in `HTTPMCPServer.handleTakeScreenshot` to an empty object.
- Update `ScreenshotUtils.takeScreenshotInternal` to reject partial coordinate sets (requires all of x, y, and z if any are provided).
- Address PR feedback for safer argument handling and clearer error reporting.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Refactored `ScreenshotUtils` to make coordinate validation testable.
- Added `MCPProtocolTest` for image response verification.
- Added `ScreenshotUtilsTest` for coordinate validation verification.
- Verified all tests pass.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
- Refactored `ScreenshotUtils` to isolate timing and data processing logic.
- Added comprehensive unit tests for deferred task timing and Base64 encoding.
- Added `TESTING.md` to explain the distinction between unit and integration tests.
- Clarified technical limitations regarding testing Minecraft-dependent code.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
@cuspymd
Copy link
Owner Author

cuspymd commented Feb 19, 2026

@codex 최종 코드를 다시 리뷰해줘

- Refactored `ScreenshotUtils` for testability.
- Added extensive unit tests for deferred capture timing, coordinate validation, and Base64 encoding.
- Added `TESTING.md` and updated `README.md` with detailed instructions and testing strategy.
- Fixed null argument handling and coordinate validation in tool handlers.
- Ensured all changes are verified and pass all tests.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab2b84f80f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@cuspymd
Copy link
Owner Author

cuspymd commented Feb 19, 2026

@codex 최종 코드를 다시 리뷰해줘

- Refactored ScreenshotUtils for testability.
- Added extensive unit tests for deferred capture timing, coordinate validation, and Base64 encoding.
- Added TESTING.md and updated README.md with detailed instructions and testing strategy.
- Fixed null argument handling and coordinate validation in tool handlers.
- Ensured all changes are verified and pass all tests.

Co-authored-by: cuspymd <8870299+cuspymd@users.noreply.github.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83028f606f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

if (moved) {
// Defer capture to ensure the world is rendered with the new view.
// We wait for 2 end-of-tick events to be certain a render has completed.
pendingDeferredTasks.add(new DeferredTask(() -> captureNow(client, future), 2));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Serialize camera mutations before deferred screenshot capture

This code mutates global player position/view immediately, then defers capture by two ticks; concurrent take_screenshot calls can therefore overwrite each other's camera state before the first capture runs. In that scenario, earlier requests may return an image from the later request's viewpoint, which breaks result correctness for multi-client or overlapping requests.

Useful? React with 👍 / 👎.

@cuspymd cuspymd merged commit 2f26c63 into main Feb 19, 2026
2 checks passed
@cuspymd cuspymd deleted the feature/screenshot-tool-10459417936888807114 branch February 19, 2026 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant