Part of Forge Documentation
The runtime engine powers forge run — executing agent tasks via LLM providers with tool calling, conversation memory, and lifecycle hooks.
The core agent loop follows a simple pattern:
1. Initialize memory with the system prompt and task history
2. Append the user message
3. Call the LLM with the conversation and available tool definitions
4. If the LLM returns tool calls: execute each tool, append the results, and go to step 3
5. If the LLM returns a text response: return it as the final answer
6. If max iterations are exceeded: return an error
```
User message → Memory → LLM → tool_calls? → Execute tools → LLM → ... → text → Done
```

The loop terminates when `FinishReason == "stop"` or `len(ToolCalls) == 0`.
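The loop above can be sketched in Go roughly as follows. The types and function names here are illustrative, not the actual Forge API; `callLLM` and `execTool` stand in for the provider client and tool registry:

```go
package main

import "fmt"

// Illustrative types -- not the actual Forge API.
type Message struct{ Role, Content string }
type ToolCall struct{ Name, Args string }
type Response struct {
	FinishReason string
	Content      string
	ToolCalls    []ToolCall
}

func runLoop(callLLM func([]Message) Response, execTool func(ToolCall) string,
	system, user string, maxIters int) (string, error) {
	// Steps 1-2: initialize memory, append the user message.
	memory := []Message{{Role: "system", Content: system}, {Role: "user", Content: user}}
	for i := 0; i < maxIters; i++ {
		resp := callLLM(memory) // step 3
		// Terminate when the model stops without requesting tools.
		if resp.FinishReason == "stop" || len(resp.ToolCalls) == 0 {
			return resp.Content, nil
		}
		// Step 4: execute each requested tool, append results, loop back.
		for _, tc := range resp.ToolCalls {
			memory = append(memory, Message{Role: "tool", Content: execTool(tc)})
		}
	}
	return "", fmt.Errorf("max iterations (%d) exceeded", maxIters)
}

func main() {
	calls := 0
	llm := func(m []Message) Response {
		calls++
		if calls == 1 { // first turn: request a tool
			return Response{FinishReason: "tool_calls", ToolCalls: []ToolCall{{Name: "echo"}}}
		}
		return Response{FinishReason: "stop", Content: "done"}
	}
	out, _ := runLoop(llm, func(tc ToolCall) string { return "ok" }, "sys", "hi", 5)
	fmt.Println(out) // prints "done"
}
```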
Forge supports multiple LLM providers with automatic fallback:
| Provider | Default Model | Auth |
|---|---|---|
| `openai` | `gpt-5.2-2025-12-11` | API key or OAuth; optional Organization ID |
| `anthropic` | `claude-sonnet-4-20250514` | API key |
| `gemini` | `gemini-2.5-flash` | API key |
| `ollama` | `llama3` | None (local) |
| Custom | Configurable | API key |
Configure the primary model in `forge.yaml`:

```yaml
model:
  provider: openai
  name: gpt-4o
```

Or override with environment variables:

```bash
export FORGE_MODEL_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
forge run
```

Provider is auto-detected from available API keys if not explicitly set. Provider configuration is resolved via `ResolveModelConfig()` in priority order:

1. CLI flag `--provider` (highest priority)
2. Environment variables: `FORGE_MODEL_PROVIDER`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`
3. `forge.yaml` `model` section (lowest priority)
For OpenAI, Forge supports browser-based OAuth login (matching the Codex CLI flow) as an alternative to API keys:
```bash
forge init my-agent
# Select "OpenAI" -> "Login with browser (OAuth)"
# Browser opens for authentication
```

OAuth tokens are stored in `~/.forge/credentials/openai.json` and refreshed automatically.
Enterprise OpenAI accounts can set an Organization ID to route API requests to the correct org:
```yaml
model:
  provider: openai
  name: gpt-4o
  organization_id: "org-xxxxxxxxxxxxxxxxxxxxxxxx"
```

Or via environment variable (overrides YAML):

```bash
export OPENAI_ORG_ID=org-xxxxxxxxxxxxxxxxxxxxxxxx
```

The `OpenAI-Organization` header is sent on all OpenAI API requests (chat, embeddings, responses). Fallback providers inherit the primary org ID unless overridden per-fallback. The org ID is also injected into skill subprocess environments as `OPENAI_ORG_ID`.
Configure fallback providers for automatic failover when the primary provider is unavailable:
```yaml
model:
  provider: openai
  name: gpt-4o
  fallbacks:
    - provider: anthropic
      name: claude-sonnet-4-20250514
    - provider: gemini
```

Or via environment variable:

```bash
export FORGE_MODEL_FALLBACKS="anthropic:claude-sonnet-4-20250514,gemini:gemini-2.5-flash"
```

Fallback behavior:
- Retriable errors (rate limits, overload, timeouts) fall through to the next provider
- Non-retriable errors (auth, billing, malformed requests) abort immediately
- Per-provider exponential backoff cooldowns prevent thundering herd
- Fallbacks are also auto-detected from available API keys when not explicitly configured
The runtime supports multiple executor implementations:
| Executor | Use Case |
|---|---|
| `LLMExecutor` | Custom agents with LLM-powered tool calling |
| `SubprocessExecutor` | Framework agents (CrewAI, LangChain) running as subprocesses |
| `StubExecutor` | Returns canned responses for testing |
Executor selection happens in `runner.go` based on framework type and configuration.
Run the agent as a foreground HTTP server. Used for development and container deployments.
```bash
# Development (all interfaces, immediate shutdown)
forge run --with slack --port 8080

# Container deployment
forge run --host 0.0.0.0 --shutdown-timeout 30s
```

| Flag | Default | Description |
|---|---|---|
| `--port` | `8080` | HTTP server port |
| `--host` | `""` (all interfaces) | Bind address |
| `--shutdown-timeout` | `0` (immediate) | Graceful shutdown timeout |
| `--with` | — | Channel adapters (e.g. `slack,telegram`) |
| `--mock-tools` | `false` | Use mock executor for testing |
| `--model` | — | Override model name |
| `--provider` | — | Override LLM provider |
| `--env` | `.env` | Path to env file |
| `--enforce-guardrails` | `false` | Treat guardrail violations as errors |
Manage the agent as a background daemon process with PID/log management.
```bash
# Start daemon (secure defaults: 127.0.0.1, 30s shutdown timeout)
forge serve

# Start on custom port
forge serve start --port 9090 --host 0.0.0.0

# Stop the daemon
forge serve stop

# Check status (PID, uptime, health)
forge serve status

# View recent logs (last 100 lines)
forge serve logs
```

| Subcommand | Description |
|---|---|
| `start` (default) | Start the daemon in the background |
| `stop` | Send SIGTERM (10s timeout, then SIGKILL) |
| `status` | Show PID, listen address, and health check |
| `logs` | Tail `.forge/serve.log` |
The daemon forks `forge run` in the background with `setsid`, writes state to `.forge/serve.json`, and redirects output to `.forge/serve.log`. Passphrase prompting for encrypted secrets happens in the parent process (which has TTY access) before forking.
The runtime configures a `FilesDir` for tool-generated files (e.g., from `file_create`). This directory defaults to `<WorkDir>/.forge/files/` and is injected into the execution context so tools can write files that other tools can reference by path.
```
<WorkDir>/
  .forge/
    files/      ← file_create output (patches.yaml, reports, etc.)
    sessions/   ← conversation persistence
    memory/     ← long-term memory
```
The `FilesDir` is set via `LLMExecutorConfig.FilesDir` and made available to tools through `runtime.FilesDirFromContext(ctx)`. See Tools — File Create for details.
For details on session persistence, context window management, compaction, and long-term memory, see Memory.
The engine fires hooks at key points in the loop. See Hooks for details.
The current implementation (v1) runs the full tool-calling loop non-streaming. `ExecuteStream` calls `Execute` internally and emits the final response as a single message on a channel. True word-by-word streaming during tool loops is planned for v2.
← Tools | Back to README | Memory →