
Japanese version | Chinese README

🚀 Overview

CraftBot is your Personal AI Assistant that lives inside your machine and works 24/7 for you.

It autonomously interprets tasks, plans actions, and executes them to achieve your goals. It learns your preferences and objectives, then proactively helps you plan and start tasks that move you toward your life goals. MCPs, Skills, and external app integrations are supported.

CraftBot awaits your orders. Set up your own CraftBot now.

✨ Features

Bring Your Own Key (BYOK) - Flexible LLM provider system supporting OpenAI, Google Gemini, Anthropic Claude, BytePlus, and local Ollama models. Switch between providers easily.
Memory System - Distills and consolidates the day's events every night at midnight.
Proactive Agent - Learns your preferences, habits, and life goals, then plans and initiates tasks (with your approval, of course) to help you work toward them.
External Tools Integration - Connects to Google Workspace, Slack, Notion, Zoom, LinkedIn, Discord, and Telegram (more to come!) with embedded credentials and OAuth support.
MCP - Model Context Protocol integration for extending agent capabilities with external tools and services.
Skills - Extensible skill framework with built-in skills for task planning, research, code review, git operations, and more.
Cross-Platform - Full support for Windows and Linux with platform-specific code variants and Docker containerization.
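
The Memory System runs its consolidation at midnight. As a rough illustration of how a nightly job like that can be timed, the sketch below computes how long to wait until the next local midnight. The function name is hypothetical and this is not CraftBot's scheduler; it uses naive datetimes and ignores daylight-saving shifts.

```python
from datetime import datetime, timedelta


def seconds_until_midnight(now):
    """Seconds from `now` until the next local midnight (naive datetime)."""
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


if __name__ == "__main__":
    # A scheduler could sleep for this long before consolidating memory.
    print(seconds_until_midnight(datetime.now()))
```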

Important

Note for GUI mode: GUI mode is still experimental, so you may encounter issues when the agent switches to it. We are actively improving this feature.

🧰 Getting Started

Prerequisites

Python 3.10+
git (required to clone the repository)
An API key for your chosen LLM provider (OpenAI, Gemini, or Anthropic)
Node.js 18+ (optional - only required for browser interface)
conda (optional - if not found, installer offers to auto-install Miniconda)
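
If you want to confirm these prerequisites before running the installer, a short script along these lines can help. It is a sketch, not part of CraftBot: the function name and the labels in the result are made up for illustration, and it only checks that `node` is on PATH rather than verifying that the version is 18 or later.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version from the Prerequisites list


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a dict mapping each prerequisite to True (found) or False."""
    if version_info is None:
        version_info = sys.version_info
    return {
        "python>=3.10": tuple(version_info[:2]) >= MIN_PYTHON,
        "git": which("git") is not None,
        "node (optional)": which("node") is not None,
        "conda (optional)": which("conda") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```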

Quick Install

# Clone the repository
git clone https://github.com/zfoong/CraftBot.git
cd CraftBot

# Install dependencies
python install.py

# Run the agent
python run.py

That's it! The first run will guide you through setting up your API keys.

Note: If you don't have Node.js installed, the installer will guide you with step-by-step instructions. You can also skip browser mode and use TUI instead (see modes below).

What can you do right after?

Talk to the agent naturally
Ask it to perform complex multi-step tasks
Type /help to see available commands
Connect to Google, Slack, Notion, and more

🖥️ Interface Modes

CraftBot supports multiple UI modes. Choose based on your preference:

| Mode | Command | Requirements | Best For |
|------|---------|--------------|----------|
| Browser | `python run.py` | Node.js 18+ | Modern web interface, easiest to use |
| TUI | `python run.py --tui` | None | Terminal UI, no dependencies needed |
| CLI | `python run.py --cli` | None | Command-line, lightweight |
| GUI | `python run.py --gui` | `install.py --gui` | Desktop automation with visual feedback |

Browser mode is the recommended default. If you don't have Node.js, the installer will provide installation instructions, or you can use TUI mode instead.
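
If you start CraftBot from your own script, one way to handle a missing Node.js install is to check for `npm` first and add `--tui` when it is absent. The helper below is a hypothetical wrapper, not something shipped with CraftBot; it only builds the command and leaves running it to you.

```python
import shutil
import sys


def build_run_command(which=shutil.which):
    """Return the argv for run.py, adding --tui when npm is not on PATH."""
    cmd = [sys.executable, "run.py"]
    if which("npm") is None:
        # Browser mode needs Node.js, so fall back to the terminal UI.
        cmd.append("--tui")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_run_command()))
```

Pass the returned list to `subprocess.run` from the repository root to launch the agent.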

🧩 Architecture Overview

| Component | Description |
|-----------|-------------|
| Agent Base | Core orchestration layer that manages the task lifecycle, coordinates between components, and handles the main agentic loop. |
| LLM Interface | Unified interface supporting multiple LLM providers (OpenAI, Gemini, Anthropic, BytePlus, Ollama). |
| Context Engine | Generates optimized prompts with KV-cache support. |
| Action Manager | Retrieves and executes actions from the library. Custom actions are easy to add. |
| Action Router | Selects the best-matching action based on task requirements and resolves input parameters via LLM when needed. |
| Event Stream | Real-time event publishing system for task progress tracking, UI updates, and execution monitoring. |
| Memory Manager | RAG-based semantic memory using ChromaDB. Handles memory chunking, embedding, retrieval, and incremental updates. |
| State Manager | Global state management for tracking agent execution context, conversation history, and runtime configuration. |
| Task Manager | Manages task definitions, supports simple and complex task modes, creates to-dos, and tracks multi-step workflows. |
| Skill Manager | Loads and injects pluggable skills into the agent context. |
| MCP Adapter | Model Context Protocol integration that converts MCP tools into native actions. |
| TUI Interface | Terminal user interface built with the Textual framework for interactive command-line operation. |
| GUI Module | Experimental GUI automation using Docker containers, OmniParser for UI element detection, and a Gradio client. |
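
To make the split between the Action Manager, Action Router, and Event Stream more concrete, here is a toy sketch of how such pieces can fit together. Every class and function name is hypothetical and none of this is CraftBot's code. In particular, the router here scores actions by word overlap with the task, which is a simple stand-in for the LLM-assisted selection described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    name: str
    description: str
    run: Callable[[str], str]


@dataclass
class ActionManager:
    """Holds the action library and executes actions by name."""
    actions: Dict[str, Action] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)  # stand-in for an event stream

    def register(self, action: Action) -> None:
        self.actions[action.name] = action

    def execute(self, name: str, task: str) -> str:
        self.events.append(f"start:{name}")
        result = self.actions[name].run(task)
        self.events.append(f"done:{name}")
        return result


def route(task: str, manager: ActionManager) -> str:
    """Pick the action whose description shares the most words with the task."""
    words = set(task.lower().split())

    def score(action: Action) -> int:
        return len(words & set(action.description.lower().split()))

    return max(manager.actions.values(), key=score).name
```

Registering a new action is a single `register` call, which is the kind of extensibility the Action Manager row describes.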

🔜 Roadmap

Memory Module — Done.
External Tool integration — Still adding more!
MCP Layer — Done.
Skill Layer — Done.
Proactive Behaviour — Pending

🖥️ GUI Mode (Optional)

GUI mode enables screen automation - the agent can see and interact with a desktop environment. This is optional and requires additional setup.

# Install with GUI support (using pip, no conda required)
python install.py --gui

# Install with GUI support and conda
python install.py --gui --conda

# Run with GUI mode
python run.py --gui

Note

GUI mode is experimental and requires additional dependencies (~4GB for model weights). If you don't need desktop automation, skip this and use Browser or TUI mode instead; neither needs these extra dependencies.

📋 Command Reference

install.py

| Flag | Description |
|------|-------------|
| `--gui` | Install GUI components (OmniParser) |
| `--conda` | Use conda environment (optional) |
| `--cpu-only` | Install CPU-only PyTorch (with `--gui`) |

run.py

| Flag | Description |
|------|-------------|
| (none) | Run in Browser mode (recommended, requires Node.js) |
| `--tui` | Run in Terminal UI mode (no dependencies needed) |
| `--cli` | Run in CLI mode (lightweight) |
| `--gui` | Enable GUI automation mode (requires `install.py --gui` first) |

Installation Examples:

# Simple pip installation (no conda)
python install.py

# With GUI support (using pip, no conda)
python install.py --gui

# With GUI on CPU-only systems (using pip, no conda)
python install.py --gui --cpu-only

# With conda environment (recommended for conda users)
python install.py --conda

# With GUI support and conda
python install.py --gui --conda

# With GUI on CPU-only systems with conda
python install.py --gui --conda --cpu-only

Running CraftBot:

Windows (PowerShell):

# Browser mode (default, requires Node.js)
python run.py

# TUI mode (no Node.js required)
python run.py --tui

# CLI mode (lightweight)
python run.py --cli

# With GPU/GUI mode
python run.py --gui

# With conda environment
conda run -n craftbot python run.py

# Or using full path if conda not in PATH
&"$env:USERPROFILE\miniconda3\Scripts\conda.exe" run -n craftbot python run.py

Linux/macOS (Bash):

# Browser mode (default, requires Node.js)
python run.py

# TUI mode (no Node.js required)
python run.py --tui

# CLI mode (lightweight)
python run.py --cli

# With GPU/GUI mode
python run.py --gui

# With conda environment
conda run -n craftbot python run.py

Note

Installation: The installer now provides clear guidance if dependencies are missing. If Node.js is not found, you'll be prompted to install it or can switch to TUI mode. Installation automatically detects GPU availability and falls back to CPU-only mode if needed.

Tip

First-time setup: CraftBot will guide you through an onboarding sequence to configure API keys, the agent's name, MCPs, and Skills.

Note

Playwright Chromium: Optional for WhatsApp Web integration. If installation fails, the agent will still work fine for other tasks. Install manually later with: playwright install chromium

Troubleshooting & Common Issues

Missing Node.js (for Browser Mode)

If you see "npm not found in PATH" when running python run.py:

Download from nodejs.org (choose LTS version)
Install and restart your terminal
Run python run.py again

Alternative: Use TUI mode instead (no Node.js needed):

python run.py --tui

Installation Fails with Dependencies

The installer now provides detailed error messages with solutions. If installation fails:

Check Python version: Make sure you have Python 3.10+ (python --version)
Check internet: Dependencies are downloaded during installation
Upgrade pip: Run pip install --upgrade pip, then try the installation again

Playwright Installation Issues

Playwright chromium installation is optional. If it fails:

The agent will still work fine for other tasks
You can skip it or install later: playwright install chromium
Only needed for WhatsApp Web integration

GPU/CUDA Issues

The installer automatically detects GPU availability:

If CUDA installation fails, it falls back to CPU mode automatically
For manual CPU setup: python install.py --gui --cpu-only
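
The installer handles GPU detection for you. If you script the GUI install yourself, a rough manual check could look like the sketch below. It assumes that a missing `nvidia-smi` on PATH means no usable NVIDIA GPU, which is only a heuristic and not necessarily how the installer decides; the function name is hypothetical.

```python
import shutil
import sys


def gui_install_command(which=shutil.which):
    """Return the argv for a GUI install, adding --cpu-only without nvidia-smi."""
    cmd = [sys.executable, "install.py", "--gui"]
    if which("nvidia-smi") is None:
        # Heuristic only: no NVIDIA driver tools found on PATH.
        cmd.append("--cpu-only")
    return cmd


if __name__ == "__main__":
    print(" ".join(gui_install_command()))
```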

For detailed troubleshooting, see INSTALLATION_FIX.md.

The agent can connect to various services using OAuth. Release builds come with embedded credentials, but you can also use your own.

Quick Start

For release builds with embedded credentials:

/google login    # Connect Google Workspace
/zoom login      # Connect Zoom
/slack invite    # Connect Slack
/notion invite   # Connect Notion
/linkedin login  # Connect LinkedIn

Service Details

| Service | Auth Type | Command | Requires Secret? |
|---------|-----------|---------|------------------|
| Google | PKCE | `/google login` | No (PKCE) |
| Zoom | PKCE | `/zoom login` | No (PKCE) |
| Slack | OAuth 2.0 | `/slack invite` | Yes |
| Notion | OAuth 2.0 | `/notion invite` | Yes |
| LinkedIn | OAuth 2.0 | `/linkedin login` | Yes |
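
The PKCE rows need no client secret because each login proves itself with a one-time verifier instead. The sketch below shows the generic S256 derivation from RFC 7636: the client sends the challenge when it starts the flow and the verifier when it exchanges the code. It illustrates the standard only; the function names are made up and this is not CraftBot's login code.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Random URL-safe verifier (43 characters from 32 random bytes)."""
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    verifier = make_code_verifier()
    print(verifier, make_code_challenge(verifier))
```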

Using Your Own Credentials

If you prefer to use your own OAuth credentials, add them to your .env file:

Google (PKCE - only Client ID needed)

GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com

Go to Google Cloud Console
Enable Gmail, Calendar, Drive, and People APIs
Create OAuth credentials as Desktop app type
Copy the Client ID (secret not required for PKCE)

Zoom (PKCE - only Client ID needed)

ZOOM_CLIENT_ID=your-zoom-client-id

Go to Zoom Marketplace
Create an OAuth app
Copy the Client ID

Slack (Requires both)

SLACK_SHARED_CLIENT_ID=your-slack-client-id
SLACK_SHARED_CLIENT_SECRET=your-slack-client-secret

Go to Slack API
Create a new app
Add OAuth scopes: chat:write, channels:read, users:read, etc.
Copy Client ID and Client Secret

Notion (Requires both)

NOTION_SHARED_CLIENT_ID=your-notion-client-id
NOTION_SHARED_CLIENT_SECRET=your-notion-client-secret

Go to Notion Developers
Create a new integration (Public integration)
Copy OAuth Client ID and Secret

LinkedIn (Requires both)

LINKEDIN_CLIENT_ID=your-linkedin-client-id
LINKEDIN_CLIENT_SECRET=your-linkedin-client-secret

Go to LinkedIn Developers
Create an app
Add OAuth 2.0 scopes
Copy Client ID and Client Secret
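
After editing .env, it can be handy to see which services still lack credentials. The sketch below checks the variable names listed above against the process environment. The helper name is hypothetical, and it assumes the .env values have already been loaded into the environment; it does not parse the file itself.

```python
import os

# Variable names taken from the sections above.
REQUIRED_VARS = {
    "google": ["GOOGLE_CLIENT_ID"],
    "zoom": ["ZOOM_CLIENT_ID"],
    "slack": ["SLACK_SHARED_CLIENT_ID", "SLACK_SHARED_CLIENT_SECRET"],
    "notion": ["NOTION_SHARED_CLIENT_ID", "NOTION_SHARED_CLIENT_SECRET"],
    "linkedin": ["LINKEDIN_CLIENT_ID", "LINKEDIN_CLIENT_SECRET"],
}


def missing_credentials(env=None):
    """Map each service to the variables that are unset or empty."""
    env = os.environ if env is None else env
    report = {}
    for service, names in REQUIRED_VARS.items():
        missing = [name for name in names if not env.get(name)]
        if missing:
            report[service] = missing
    return report


if __name__ == "__main__":
    for service, names in missing_credentials().items():
        print(f"{service}: missing {', '.join(names)}")
```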

Run with container

The repository root includes a Docker configuration with Python 3.10, key system packages (including Tesseract for OCR), and all Python dependencies defined in environment.yml/requirements.txt, so the agent can run consistently in isolated environments.

The steps below set up and run the agent in a container.

Build the image

From the repository root:

docker build -t craftbot .

Run the container

The image is configured to launch the agent with python -m app.main by default. To run it interactively:

docker run --rm -it craftbot

If you need to supply environment variables, pass an env file (for example, based on .env.example):

docker run --rm -it --env-file .env craftbot

Mount any directories that should persist outside the container (such as data or cache folders) using -v, and adjust ports or additional flags as needed for your deployment. The container ships with system dependencies for OCR (tesseract), screen automation (pyautogui, mss, X11 utilities, and a virtual framebuffer), and common HTTP clients so the agent can work with files, network APIs, and GUI automation inside the container.
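
When the set of mounts and flags grows, assembling the command in a small script keeps it readable. The sketch below builds a `docker run` argument list from an optional env file and a list of volume pairs. The function and parameter names are hypothetical; the image name and the `/app/app/data` path come from the examples in this README.

```python
import shlex


def docker_run_command(image="craftbot", env_file=None, volumes=()):
    """Build a `docker run` argv with an optional env file and -v mounts."""
    cmd = ["docker", "run", "--rm", "-it"]
    if env_file:
        cmd += ["--env-file", env_file]
    for host_path, container_path in volumes:
        cmd += ["-v", f"{host_path}:{container_path}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    argv = docker_run_command(env_file=".env", volumes=[("./data", "/app/app/data")])
    print(shlex.join(argv))
```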

Enabling GUI/screen automation

GUI actions (mouse/keyboard events, screenshots) require an X11 server. You can either attach to your host display or run headless with xvfb:

Use the host display (requires Linux with X11):

docker run --rm -it \
  -e DISPLAY=$DISPLAY \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -v $(pwd)/data:/app/app/data \
  craftbot

Add extra -v mounts for any folders the agent should read/write.

Run headlessly with a virtual display:

  docker run --rm -it --env-file .env craftbot bash -lc "Xvfb :99 -screen 0 1920x1080x24 & export DISPLAY=:99 && exec python -m app.main"

By default the image uses Python 3.10 and bundles the Python dependencies from environment.yml/requirements.txt, so python -m app.main works out of the box.

🤝 How to Contribute

Contributions and suggestions are welcome! You can contact @zfoong at thamyikfoong(at)craftos.net. We don't have checks set up yet, so we can't accept direct contributions, but we appreciate any suggestions and feedback.

🧾 License

This project is licensed under the MIT License. You are free to use, host, and monetize this project (you must credit this project in case of distribution and monetization).

⭐ Acknowledgements

Developed and maintained by CraftOS and contributors @zfoong and @ahmad-ajmal.
If you find CraftBot useful, please ⭐ the repository and share it with others!
