Alpha / Experimental. This project is under active development. APIs and behavior may change.
It aims to provide multi-provider, multi-agent orchestration while keeping privacy in mind.
agentd is a server-side agent orchestration service built in Go. It uses Google ADK (Agent Development Kit) under the hood to orchestrate agents and sub-agents, while exposing a ConnectRPC bidirectional streaming API to clients.
The core architectural principle: agents run on the server, but tool calls execute on the client. This keeps private data on the client side — the server never sees raw tool outputs beyond what the client explicitly sends back. The server mediates the agent loop and talks to the LLM provider; the client only executes tools and receives streamed output.
1. Start the server

```sh
docker run -p 8080:8080 -e GEMINI_API_KEY=${GEMINI_API_KEY} ghcr.io/apzuk3/agentd:latest
```

2. Run an agent with a client-side tool
```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/apzuk3/agentd/client"
	agentdv1 "github.com/apzuk3/agentd/gen/proto/go/agentd/v1"
)

type GreetInput struct {
	Name string `json:"name"`
}

type GreetOutput struct {
	Greeting string `json:"greeting"`
}

func main() {
	clnt := client.New("http://localhost:8080")

	// Register a tool — the handler runs on the client, keeping data private.
	// Use concrete struct types so AddTool[T] can generate the JSON schema.
	client.AddTool(clnt, "greet", "Returns a greeting", func(ctx context.Context, input GreetInput) (any, error) {
		return GreetOutput{Greeting: "Hello, " + input.Name + "!"}, nil
	})

	// Define an agent that can use the tool.
	agent := &agentdv1.Agent{
		Name: "greeter",
		AgentType: &agentdv1.Agent_Llm{
			Llm: &agentdv1.LlmAgent{
				Model:     "gemini-2.5-flash",
				ToolNames: []string{"greet"},
			},
		},
	}

	// Run the agent and stream the response.
	for event, err := range clnt.Run(context.Background(), agent, "Say hi to Alice") {
		if err != nil {
			log.Fatal(err)
		}
		if event.OutputChunk != nil {
			fmt.Print(event.OutputChunk.Content)
		}
		if event.End != nil {
			break
		}
	}
}
```

See `examples/` for more complete examples.
```
┌──────────┐       ConnectRPC bidi stream       ┌──────────┐
│  Client  │ ◄────────────────────────────────► │  agentd  │
│          │  RunRequest / RunResponse (oneof)  │ (server) │
│ - holds  │                                    │          │
│   tools  │                                    │  Google  │
│ - holds  │                                    │   ADK    │
│   data   │                                    │  agents  │
└──────────┘                                    └──────────┘
```
The Run RPC is a single bidirectional stream. Client and server exchange messages in a ping-pong pattern using oneof request/response envelopes:
Client → Server (RunRequest):
- `ExecuteRequest` — start a new agent session (or resume an existing one via optional `session_id`), sending the full agent tree definition and the `user_prompt` for this invocation
- `HeartbeatRequest` — keep the session alive
- `ToolCallResponse` — return tool execution results back to the server (oneof `output` or `error`)
- `CancelRequest` — cancel the current generation or a specific tool call
- `EndRequest` — gracefully terminate the session
Server → Client (RunResponse):
- `ExecuteResponse` — acknowledge session creation, return `session_id`
- `HeartbeatResponse` — heartbeat ack
- `ToolCallRequest` — ask the client to execute a tool with given input; includes `session_id` and `agent_path` for attribution
- `OutputChunk` — stream a chunk of LLM-generated text, tagged with `agent_path`; `last = true` signals the specific agent is done producing output; `is_thought = true` indicates model thinking content rather than final response
- `StateUpdate` — stream a state snapshot or incremental delta; the client automatically merges these updates
- `ErrorResponse` — a structured error with `ErrorCode`, human-readable `message`, and `retryable` flag
- `EndResponse` — session ended; `completed = true` when the agent tree finished naturally, `false` for client-initiated ends; includes `UsageSummary`
```
Client                                 Server
  │                                      │
  │─── ExecuteRequest ─────────────────► │  (agent tree, optional session_id)
  │◄── ExecuteResponse ───────────────── │  (session_id assigned)
  │◄── StateUpdate (snapshot) ────────── │  (initial state sync)
  │                                      │
  │◄── OutputChunk [root, planner] ───── │  (planner streams, last=false)
  │◄── StateUpdate (delta) ───────────── │  (planner updates state)
  │◄── ToolCallRequest ───────────────── │  (session_id, tool_call_id)
  │─── ToolCallResponse ───────────────► │  (oneof output/error)
  │◄── OutputChunk [root, planner] ───── │  (planner continues, last=true)
  │                                      │
  │◄── OutputChunk [root, writer] ────── │  (writer streams, last=true)
  │                                      │
  │◄── EndResponse (completed) ───────── │  (agent tree done, usage_summary)
```
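On the client side, the ping-pong above reduces to a receive loop that switches on the response oneof, runs any requested tool locally, and sends only the result back. A minimal sketch of that dispatch, using simplified stand-in structs (`RunResponse`, `ToolCallRequest`, and their fields here are illustrative, not the generated proto types):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the proto oneof. The real types are generated
// from proto/agentd/v1/; field names here are illustrative only.
type ToolCallRequest struct {
	ToolCallID string
	ToolName   string
	Input      json.RawMessage
}

type OutputChunk struct {
	AgentPath []string
	Content   string
	Last      bool
}

type RunResponse struct {
	ToolCall *ToolCallRequest
	Chunk    *OutputChunk
	End      bool
}

// dispatch routes one server message. Tool handlers run locally, so private
// data never leaves the client except as the explicit tool result.
func dispatch(resp RunResponse, tools map[string]func(json.RawMessage) (any, error)) (done bool) {
	switch {
	case resp.ToolCall != nil:
		fn, ok := tools[resp.ToolCall.ToolName]
		if !ok {
			fmt.Printf("unknown tool %q\n", resp.ToolCall.ToolName)
			return false
		}
		out, err := fn(resp.ToolCall.Input)
		// In the real client this becomes a ToolCallResponse (oneof output/error).
		if err != nil {
			fmt.Println("tool error:", err)
		} else {
			fmt.Println("tool output:", out)
		}
	case resp.Chunk != nil:
		fmt.Print(resp.Chunk.Content)
	case resp.End:
		return true
	}
	return false
}

func main() {
	tools := map[string]func(json.RawMessage) (any, error){
		"greet": func(in json.RawMessage) (any, error) {
			var v struct{ Name string }
			if err := json.Unmarshal(in, &v); err != nil {
				return nil, err
			}
			return "Hello, " + v.Name + "!", nil
		},
	}
	dispatch(RunResponse{ToolCall: &ToolCallRequest{ToolName: "greet", Input: json.RawMessage(`{"Name":"Alice"}`)}}, tools)
	done := dispatch(RunResponse{End: true}, tools)
	fmt.Println("done:", done)
}
```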
The client can send a CancelRequest at any time during execution. If tool_call_id is set, only that specific tool call is cancelled; otherwise all generation is cancelled. Cancellation is best-effort — the server responds with either an ErrorResponse or continues with the next step.
```
Client                                 Server
  │                                      │
  │─── CancelRequest ──────────────────► │  (session_id, optional tool_call_id)
  │◄── EndResponse (completed=false) ─── │  (or ErrorResponse, or next step)
```
If a client sends an ExecuteRequest with a previously returned session_id, the server attempts to reconnect to the existing session instead of creating a new one. When session_id is absent, a new session is created (default behavior).
The ErrorResponse includes a structured ErrorCode enum:
| Code | Name | Retryable? | Description |
|---|---|---|---|
| 0 | `ERROR_CODE_UNSPECIFIED` | — | Default / unknown |
| 1 | `ERROR_CODE_INTERNAL` | No | Internal server error |
| 2 | `ERROR_CODE_INVALID_AGENT_TREE` | No | Malformed agent tree in `ExecuteRequest` |
| 3 | `ERROR_CODE_RATE_LIMITED` | Yes | Provider rate limit hit |
| 4 | `ERROR_CODE_AUTH_FAILED` | No | Authentication/authorization failure |
| 5 | `ERROR_CODE_SESSION_NOT_FOUND` | No | Session ID not found (expired or invalid) |
| 6 | `ERROR_CODE_MODEL_UNAVAILABLE` | Yes | Requested model string not supported or temporarily down |
| 7 | `ERROR_CODE_TIMEOUT` | Yes | Operation timed out |
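A client can combine the server's `retryable` flag with the codes above to decide whether to retry automatically. A minimal sketch of such a policy (`shouldRetry` and `maxAttempts` are illustrative names, not part of the agentd API; the enum values mirror the table):

```go
package main

import "fmt"

// ErrorCode values mirror the proto enum in the table above.
type ErrorCode int

const (
	ErrUnspecified ErrorCode = iota
	ErrInternal
	ErrInvalidAgentTree
	ErrRateLimited
	ErrAuthFailed
	ErrSessionNotFound
	ErrModelUnavailable
	ErrTimeout
)

// shouldRetry trusts the server's retryable flag, caps total attempts,
// and only retries codes the table marks as retryable.
func shouldRetry(code ErrorCode, retryable bool, attempt, maxAttempts int) bool {
	if !retryable || attempt >= maxAttempts {
		return false
	}
	switch code {
	case ErrRateLimited, ErrModelUnavailable, ErrTimeout:
		return true
	}
	return false
}

func main() {
	fmt.Println(shouldRetry(ErrRateLimited, true, 1, 3)) // true: transient
	fmt.Println(shouldRetry(ErrAuthFailed, false, 1, 3)) // false: not retryable
	fmt.Println(shouldRetry(ErrTimeout, true, 3, 3))     // false: attempts exhausted
}
```

In practice a retry loop would also back off between attempts, especially for `ERROR_CODE_RATE_LIMITED`.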
Agents are defined as a recursive tree in proto. The server uses Google ADK to execute them:
| Agent type | Proto message | Purpose |
|---|---|---|
| LlmAgent | `LlmAgent` | Core agent — has a model, tools, instruction (system prompt), sub-agents |
| SequentialAgent | `SequentialAgent` | Runs child agents in sequence |
| ParallelAgent | `ParallelAgent` | Runs child agents in parallel |
| LoopAgent | `LoopAgent` | Repeats child agents up to `max_iterations`; client controls continuation via a pre-defined tool through the standard `ToolCallRequest`/`ToolCallResponse` mechanism |
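Because the proto tree is recursive, every composite agent can nest any other type. A sketch of walking such a tree and deriving the root-to-agent paths that later show up in `OutputChunk.agent_path` (the `Agent` struct here is a simplified stand-in for the generated proto type, with illustrative field names):

```go
package main

import (
	"fmt"
	"strings"
)

// Agent is a simplified mirror of the recursive proto tree; the real types
// live in gen/proto/go/agentd/v1. Field names are illustrative only.
type Agent struct {
	Name     string
	Kind     string // "llm", "sequential", "parallel", "loop"
	Children []*Agent
}

// paths walks the tree depth-first and returns each agent's root-to-node
// path, joined with "/" for readability.
func paths(a *Agent, prefix []string) []string {
	p := append(append([]string{}, prefix...), a.Name)
	out := []string{strings.Join(p, "/")}
	for _, c := range a.Children {
		out = append(out, paths(c, p)...)
	}
	return out
}

func main() {
	// A sequential root that runs a planner, then a writer.
	tree := &Agent{
		Name: "root", Kind: "sequential",
		Children: []*Agent{
			{Name: "planner", Kind: "llm"},
			{Name: "writer", Kind: "llm"},
		},
	}
	for _, p := range paths(tree, nil) {
		fmt.Println(p)
	}
	// root
	// root/planner
	// root/writer
}
```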
- Proto is the source of truth. All types flow from `proto/agentd/v1/`. Run `buf generate` to regenerate Go code after proto changes.
- Never edit files under `gen/`. They are overwritten on every `buf generate`.
- Tool execution is always client-side. The server must never execute tools directly — it sends `ToolCallRequest` and waits for `ToolCallResponse`.
- Session lifecycle. Every agent run is scoped to a `session_id` returned in `ExecuteResponse`. All subsequent messages reference this ID. Clients may resume sessions by passing the same `session_id` in a new `ExecuteRequest`.
- Google ADK orchestration. The server-side implementation translates the proto `Agent` tree into ADK agent/sub-agent structures and manages the agentic loop, forwarding tool calls to the client.
- Models as strings. Model identifiers are plain strings validated at runtime. See the Models section for currently supported values.
- Instruction vs. user prompt. `LlmAgent.instruction` is the static system prompt baked into the agent definition. `ExecuteRequest.user_prompt` is the per-invocation user input, passed through to ADK's `runner.Run()`. This keeps the agent tree a reusable template while the user's query varies per session.
- Streaming output via `OutputChunk`. LLM-generated text is streamed to the client in real time. Each chunk carries `repeated string agent_path` — the ordered list from root to the producing agent (e.g. `["root", "planner", "researcher"]`) — so the client always knows which agent at which depth produced the text. The `last` field signals per-agent completion. The `is_thought` field is `true` when the chunk contains model thinking (chain-of-thought) rather than final response content, allowing clients to render or hide thinking separately.
- `EndResponse` signals completion. When the entire agent tree finishes, the server sends `EndResponse` with `completed = true` and the `UsageSummary`. No separate `FinalResponse` exists — `EndResponse` serves both natural completion and client-initiated termination.
- Structured errors. `ErrorResponse` carries an `ErrorCode` enum, a human-readable `message`, and a `retryable` boolean so clients can decide whether to retry automatically.
- Tool results are unambiguous. `ToolCallResponse` uses a `oneof result` with `output` and `error` branches — exactly one is always set.
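The `agent_path` and `is_thought` fields together let a client reassemble interleaved multi-agent output. A hedged sketch of one way to do that, accumulating text per producing agent and hiding thinking content (the `Chunk` struct and `demux` helper are illustrative, not the generated proto types):

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk mirrors the OutputChunk fields discussed above; names are
// illustrative stand-ins for the generated proto type.
type Chunk struct {
	AgentPath []string
	Content   string
	IsThought bool
	Last      bool
}

// demux accumulates streamed text per producing agent, keyed by its
// root-to-agent path, and skips chain-of-thought content.
func demux(chunks []Chunk) map[string]string {
	out := map[string]string{}
	for _, c := range chunks {
		if c.IsThought {
			continue // hide model thinking from the final transcript
		}
		key := strings.Join(c.AgentPath, "/")
		out[key] += c.Content
	}
	return out
}

func main() {
	chunks := []Chunk{
		{AgentPath: []string{"root", "planner"}, Content: "Plan: "},
		{AgentPath: []string{"root", "planner"}, Content: "say hi.", Last: true},
		{AgentPath: []string{"root", "writer"}, Content: "Hmm...", IsThought: true},
		{AgentPath: []string{"root", "writer"}, Content: "Hello!", Last: true},
	}
	merged := demux(chunks)
	fmt.Println(merged["root/planner"]) // Plan: say hi.
	fmt.Println(merged["root/writer"])  // Hello!
}
```

A richer client could instead render thought chunks in a collapsible pane rather than dropping them.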
- The `AgentdHandler` interface (generated in `agentdv1connect`) must be implemented to handle the `Run` bidi stream.
- The server should maintain per-session state (agent tree, ADK runner, accumulated token usage) keyed by `session_id`.
- `UsageSummary` is returned in `EndResponse` to give the client a billing summary of the session.
- Cancellation via `CancelRequest` is best-effort. The server should attempt to stop in-flight LLM calls or tool dispatches, then either send an `ErrorResponse` or proceed to the next step.
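The per-session state requirement, combined with the resume-via-`session_id` behavior described earlier, suggests a concurrent map with resume-or-create semantics. A minimal sketch under those assumptions (`sessionStore`, `sessionState`, and their fields are hypothetical names, not the actual server implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// sessionState stands in for what the server keeps per session (agent tree,
// ADK runner, token usage); fields here are illustrative only.
type sessionState struct {
	AgentTreeName string
	TotalTokens   int
}

// sessionStore guards per-session state behind a mutex, keyed by session_id.
type sessionStore struct {
	mu       sync.Mutex
	sessions map[string]*sessionState
}

func newSessionStore() *sessionStore {
	return &sessionStore{sessions: map[string]*sessionState{}}
}

// getOrCreate mirrors the ExecuteRequest rule: a known session_id reconnects
// to existing state, otherwise fresh state is created under that id.
func (s *sessionStore) getOrCreate(id string) (*sessionState, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if st, ok := s.sessions[id]; ok {
		return st, true // resumed existing session
	}
	st := &sessionState{}
	s.sessions[id] = st
	return st, false // created new session
}

func main() {
	store := newSessionStore()

	st, resumed := store.getOrCreate("sess-1")
	fmt.Println(resumed) // false: first ExecuteRequest creates the session
	st.TotalTokens += 42 // accumulate usage for the eventual UsageSummary

	st2, resumed := store.getOrCreate("sess-1")
	fmt.Println(resumed, st2.TotalTokens) // true 42: resumed with state intact
}
```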