LLM-powered desktop automation. Control any app with an HTTP API — click buttons, type text, navigate menus, read accessibility trees, and capture screenshots.
Geisterhand (German for "ghostly hand") lets LLMs like Claude autonomously interact with native desktop applications. It works in the background without stealing focus, so you can automate apps while continuing your own work.

    LLM / Script                    Geisterhand                     Desktop App
    |                               |                               |
    | geisterhand run Calculator    |                               |
    | ----------------------------> | Launch app in background      |
    | {"port":49152, "pid":1234}    | ----------------------------> |
    |                               |                               |
    | POST /click/element           |                               |
    | {"title":"7","role":"Button"} |                               |
    | ----------------------------> | Click button via AX APIs      |
    |                               | ----------------------------> |
    |                               |                               |
    | GET /screenshot               |                               |
    | ----------------------------> | Capture window                |
    | <image data>                  | <---------------------------- |
    | <---------------------------- |                               |

`geisterhand run` launches an app and starts an HTTP server scoped to it. Every request is automatically targeted at that app, so there is no need to specify PIDs or window titles in each call.
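
As a minimal sketch of the first step in that flow, a script can turn the JSON line printed by `geisterhand run` into a base URL for later requests. The function name `base_url_from_launch_line` is hypothetical, and the sketch assumes the output is a single JSON object with a `port` field and an optional `host` field, as in the examples in this README.

```python
import json


def base_url_from_launch_line(line: str) -> str:
    """Build the server base URL from the JSON printed by `geisterhand run`.

    Assumption: the line is a JSON object with "port" and, optionally, "host".
    When "host" is missing we fall back to 127.0.0.1, the documented default bind address.
    """
    info = json.loads(line)
    host = info.get("host", "127.0.0.1")
    return f"http://{host}:{info['port']}"


# Example line copied from the README's sample output.
line = '{"port":49152,"pid":12345,"app":"Calculator","host":"127.0.0.1"}'
print(base_url_from_launch_line(line))  # http://127.0.0.1:49152
```

Keeping the parsing in one place means the rest of a script only ever deals with a base URL and never with PIDs or window titles.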
| Repo | Description | Language |
|---|---|---|
| macos | macOS automation via ScreenCaptureKit & Accessibility APIs | Swift |
| windows | Windows automation via UI Automation & Win32 APIs | C# (.NET) |
| linux | Linux automation via AT-SPI2, XTest & xdg-desktop-portal | Rust |
| mcp | MCP server for Claude Code & Claude Desktop | TypeScript |
| landing | Website — geisterhand.dev | Astro |
| homebrew-tap | Homebrew tap for macOS install | Ruby |
macOS:

    brew install --cask geisterhand-io/tap/geisterhand
    geisterhand run Calculator
    # {"port":49152,"pid":12345,"app":"Calculator","host":"127.0.0.1"}
    curl "http://127.0.0.1:49152/accessibility/tree?format=compact"
    curl -X POST http://127.0.0.1:49152/click/element \
      -H "Content-Type: application/json" \
      -d '{"title": "7", "role": "AXButton"}'

Windows: download from GitHub Releases, then:

    geisterhand run Calculator

Linux:

    cargo install geisterhand
    geisterhand run gnome-calculator

Add Geisterhand as an MCP server so Claude can control desktop apps directly:

    claude mcp add-json geisterhand \
      '{"type":"stdio","command":"npx","args":["geisterhand-mcp"]}' \
      --scope user

All platforms expose the same HTTP API:
| Method | Path | Description |
|---|---|---|
| GET | /status | System info and permissions |
| GET | /screenshot | Capture screen or app window |
| POST | /click | Click at coordinates |
| POST | /click/element | Click element by title, role, or label |
| POST | /type | Type text |
| POST | /key | Press key with modifiers |
| POST | /scroll | Scroll at position |
| POST | /wait | Wait for element state |
| GET | /accessibility/tree | UI element hierarchy |
| GET | /accessibility/elements | Find elements by query |
| POST | /accessibility/action | Perform element action |
| GET | /menu | Get app menu structure |
| POST | /menu | Trigger menu item |
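
The endpoints above can be called from any HTTP client. The sketch below uses only the Python standard library and only the request shapes shown in this README (`format=compact` on `/accessibility/tree`, and a `title`/`role` body on `/click/element`). The helper names `build_request` and `send` are hypothetical, the port is the example value from the sample output, and the response format is not specified here, so `send` returns raw bytes.

```python
import json
import urllib.parse
import urllib.request


def build_request(base_url, method, path, body=None, query=None):
    """Prepare (but do not send) a request to a Geisterhand server."""
    url = base_url.rstrip("/") + path
    if query:
        url += "?" + urllib.parse.urlencode(query)
    data = None
    headers = {}
    if body is not None:
        # JSON body with the same Content-Type header as the curl example.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request, timeout=10.0):
    """Send a prepared request and return the raw response bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


BASE = "http://127.0.0.1:49152"  # assumed: port reported by `geisterhand run`

tree_req = build_request(BASE, "GET", "/accessibility/tree", query={"format": "compact"})
click_req = build_request(BASE, "POST", "/click/element",
                          body={"title": "7", "role": "AXButton"})

print(tree_req.get_method(), tree_req.full_url)
print(click_req.get_method(), click_req.full_url)
```

With a server running, `send(tree_req)` and `send(click_req)` would perform the same two calls as the curl commands in the quick start. Separating request construction from sending keeps the payloads easy to inspect before anything touches the desktop.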
- Background automation: Apps don't steal focus. Screenshots work even when windows are behind other apps.
- Cross-platform: Same API on macOS, Windows, and Linux.
- Scoped servers: `geisterhand run` creates a per-app server. No PID juggling.
- Accessibility-first: Read and interact with UI elements by role, title, and label.
- Local only: Binds to 127.0.0.1 by default. Your desktop stays private.
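
Because the server is local by default, a client script may want to refuse to send automation requests anywhere else. The following is a small client-side guard, not part of Geisterhand itself; the function name `is_loopback_url` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True only if the URL's host is a loopback address or "localhost"."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-local.
        return False


print(is_loopback_url("http://127.0.0.1:49152"))     # True
print(is_loopback_url("http://192.168.1.20:49152"))  # False
```

A script could call this check on its base URL before issuing any click or type request.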
MIT — Skelpo GmbH