This document describes how to test the Call Stack Context Management plugin.
- OpenCode CLI installed (opencode command available)
- Plugin located at .opencode/plugin/stack.ts
Important: When testing with OpenCode agents, encourage them to build new projects inside the test-projects directory. This keeps test projects organized and separate from the main codebase.
# Navigate to the stack directory
cd /path/to/stack
# Clear any existing state for a fresh test
rm -rf .opencode/stack/frames .opencode/stack/state.json
# Run a test prompt
opencode run "Create a plan for building a REST API with three endpoints"

The OpenCode CLI supports non-interactive execution via opencode run [message]. This is the primary way to test the stack plugin from the command line.
# Single command execution
opencode run "Your test prompt here"
# JSON output for programmatic parsing
opencode run "Your test prompt" --format json
# With verbose logs
opencode run "Your test prompt" --print-logs 2>&1
# With specific model
opencode run "Your test prompt" --model anthropic/claude-sonnet-4-5-20250929

Resume existing sessions for multi-step workflows:
# Continue the last session
opencode run --continue "Continue working on this task"
# Resume a specific session by ID
opencode run --session ses_abc123xyz "Continue from where we left off"

The stack plugin at .opencode/plugin/stack.ts loads automatically when OpenCode starts. You can verify this by:
- Looking for initialization logs when using --print-logs
- Testing that stack tools are available:
  opencode run "Use stack_tree to show me the current frame tree"
The --format json flag streams JSON events, useful for:
- Programmatic parsing of tool calls
- Verifying which tools the agent uses
- Debugging frame operations
opencode run "Create a simple task plan" --format json 2>&1 | head -100

By default, OpenCode prompts for approval on certain operations (external_directory, doom_loop). For automated testing, configure permissions to allow operations without prompts.
Option 1: Environment variable (recommended for quick tests)
export OPENCODE_PERMISSION='{"edit":"allow","bash":"allow","external_directory":"allow","doom_loop":"allow"}'
opencode run "Your test prompt"

Option 2: Config file (opencode.json in project root)
{
  "$schema": "https://opencode.ai/config.json",
  "permission": {
    "edit": "allow",
    "bash": "allow",
    "external_directory": "allow",
    "doom_loop": "allow"
  }
}

Option 3: Granular bash permissions
{
  "permission": {
    "bash": {
      "git push": "ask",
      "rm -rf *": "deny",
      "*": "allow"
    }
  }
}

Permission levels: "allow" (no prompt), "ask" (prompt), "deny" (disabled)
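
For scripted test runs, the value of OPENCODE_PERMISSION can be generated instead of typed by hand. The following is a minimal sketch that relies only on the three permission levels listed above; the helper name build_permission_env is hypothetical and not part of OpenCode.

```python
import json

# The three permission levels described above.
LEVELS = {"allow", "ask", "deny"}


def build_permission_env(**permissions: str) -> str:
    """Return a JSON string usable as the OPENCODE_PERMISSION value.

    Hypothetical helper: it only checks that every level is one of
    allow/ask/deny; it does not validate the permission names.
    """
    for name, level in permissions.items():
        if level not in LEVELS:
            raise ValueError(f"{name}: unknown permission level {level!r}")
    # Compact separators give the same single-line form as the export example.
    return json.dumps(permissions, separators=(",", ":"))


print(build_permission_env(edit="allow", bash="allow",
                           external_directory="allow", doom_loop="allow"))
```

The printed string can be assigned to OPENCODE_PERMISSION before calling opencode run.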
Test that the agent uses stack tools for task decomposition:
opencode run "Build a user authentication system with login, logout, and password reset. Break this into subtasks and work through each one."

Expected behavior:
- Agent uses stack_frame_plan_children to create subtasks
- Each subtask has a title and successCriteria
- Agent uses stack_frame_activate to start each subtask
- Agent uses stack_frame_pop with results when completing
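
The second expectation (every subtask has a title and successCriteria) can be checked mechanically. The sketch below assumes the state layout documented later in this guide, a top-level "frames" object keyed by frame ID; the function name frames_missing_fields is hypothetical.

```python
from typing import Any


def frames_missing_fields(
    state: dict[str, Any],
    required: tuple[str, ...] = ("title", "successCriteria"),
) -> dict[str, list[str]]:
    """Map each frame ID to the required fields that are absent or empty."""
    problems: dict[str, list[str]] = {}
    for frame_id, frame in state.get("frames", {}).items():
        missing = [field for field in required if not frame.get(field)]
        if missing:
            problems[frame_id] = missing
    return problems


# Illustrative data only; real input would come from .opencode/stack/state.json.
sample = {
    "frames": {
        "ses_a": {"title": "Login", "successCriteria": "Users can log in"},
        "ses_b": {"title": "Logout"},
    }
}
print(frames_missing_fields(sample))
```

An empty result means every frame carries both fields.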
Check the state file after running:
cat .opencode/stack/state.json | python3 -c "
import json,sys
d=json.load(sys.stdin)
frames = d.get('frames', {})
print(f'Total frames: {len(frames)}')
for fid, f in frames.items():
    # Fall back to ? so a frame with no status or title does not raise
    print(f'  [{(f.get(\"status\") or \"?\")[:4]}] {(f.get(\"title\") or \"?\")[:40]}')
"

Test dynamic frame creation within frames:
opencode run "Build a complex feature that requires multiple sub-components. When you encounter complexity, create child frames. Target depth > 2."

Expected behavior:
- Agent creates frames within frames
- State shows plannedChildren relationships
- Max depth > 2 in frame tree
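
The depth expectation can be verified from the state file rather than by eye. This is a sketch under two assumptions: each frame's parentSessionID names its parent (as in the state example in this guide), and a frame whose parent is not present in "frames" counts as a root at depth 1. The function name max_frame_depth is hypothetical.

```python
from typing import Any


def max_frame_depth(state: dict[str, Any]) -> int:
    """Return the deepest parent chain length; roots count as depth 1."""
    frames = state.get("frames", {})

    def depth(frame_id: str) -> int:
        steps = 0
        seen: set[str] = set()
        current = frame_id
        # Walk up the parent links; stop at a root or if a cycle is detected.
        while current in frames and current not in seen:
            seen.add(current)
            steps += 1
            current = frames[current].get("parentSessionID")
        return steps

    return max((depth(frame_id) for frame_id in frames), default=0)


# Illustrative data only: root -> child -> grandchild.
sample = {
    "frames": {
        "ses_root": {},
        "ses_child": {"parentSessionID": "ses_root"},
        "ses_grandchild": {"parentSessionID": "ses_child"},
    }
}
print(max_frame_depth(sample))
```

For the nested-frame test, a result greater than 2 would satisfy the expectation above.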
Verify context is being injected:
opencode run "Use stack_context_preview to show me what context is being injected" --print-logs 2>&1

Look for logs showing:
- Context generated
- Frame context injected
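
When the log check is part of an automated run, the two markers can be searched for in the captured output. A minimal sketch, assuming the log lines contain the marker text exactly as written above; the function name missing_markers is hypothetical.

```python
# Marker strings taken from the list above.
EXPECTED_MARKERS = ("Context generated", "Frame context injected")


def missing_markers(log_text: str,
                    markers: tuple[str, ...] = EXPECTED_MARKERS) -> list[str]:
    """Return the markers that do not appear anywhere in the captured log."""
    return [marker for marker in markers if marker not in log_text]


# Illustrative log excerpt; real input would be the captured --print-logs output.
captured = "INFO Context generated\nINFO something else\n"
print(missing_markers(captured))
```

An empty list means both markers were found.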
Test the pop workflow:
# Get the active frame ID
FRAME_ID=$(cat .opencode/stack/state.json | python3 -c "import json,sys; print(json.load(sys.stdin).get('activeFrameID', ''))")
# Pop with explicit frame ID
opencode run "Complete frame $FRAME_ID with stack_frame_pop. Use status=completed and provide results summarizing what was done."

View the current stack tree:
opencode run "Use stack_tree to show the current frame hierarchy"

Get complete state as JSON:
opencode run "Use stack_get_state to return the complete stack state"

| File | Purpose |
|---|---|
| .opencode/stack/state.json | Root state with frame tree |
| .opencode/stack/frames/*.json | Individual frame metadata |

{
  "version": 1,
  "frames": {
    "ses_xxx": {
      "sessionID": "ses_xxx",
      "parentSessionID": "ses_parent",
      "status": "in_progress",
      "title": "Frame name",
      "successCriteria": "What defines done",
      "successCriteriaCompacted": "Dense version",
      "results": "What was accomplished",
      "resultsCompacted": "Dense version",
      "artifacts": ["file1.ts"],
      "decisions": ["Decision text"],
      "plannedChildren": ["plan-xxx"]
    }
  },
  "activeFrameID": "ses_xxx",
  "rootFrameIDs": ["ses_root"]
}

| Tool | Description |
|---|---|
| stack_frame_push | Create child frame with title/criteria |
| stack_frame_pop | Complete frame with status/results |
| stack_status | Show frame tree with status icons |
| stack_tree | ASCII visualization of frame tree |
| stack_frame_details | View full frame metadata |
| stack_add_artifact | Add artifact to current frame |
| stack_add_decision | Add decision to current frame |

| Tool | Description |
|---|---|
| stack_frame_plan | Create a single planned frame |
| stack_frame_plan_children | Create multiple planned children |
| stack_frame_activate | Start work on planned frame |
| stack_frame_invalidate | Invalidate frame with cascade |
| stack_frame_summarize | Summarize frame content |

| Tool | Description |
|---|---|
| stack_context_info | Show token usage metadata |
| stack_context_preview | Preview XML context |
| stack_cache_clear | Clear context cache |
| stack_get_state | Get complete state JSON |
| stack_compaction_info | Show compaction status |
| stack_get_summary | Get summary for a frame |
| stack_stats | Show overall statistics |
| stack_config | View/update configuration |

| Tool | Description |
|---|---|
| stack_subagent_complete | Complete a subagent frame |
| stack_subagent_list | List all subagent frames |

| Tool | Description |
|---|---|
| stack_autonomy | Configure autonomy settings |
| stack_should_push | Check if push is recommended |
| stack_should_pop | Check if pop is recommended |
| stack_auto_suggest | Toggle auto-suggestions |
| stack_autonomy_stats | View autonomy statistics |

- Check plugin initialization: Look for === STACK PLUGIN INITIALIZED === in logs
- Verify hooks firing: Look for:
  - CHAT.MESSAGE - Message hook
  - Frame context injected - Context injection working
- State not persisting: Check file permissions on .opencode/stack/
- Reset for clean test:
  rm -rf .opencode/stack/
- Observe tool usage in JSON mode:
  opencode run "Create a plan with 3 steps" --format json 2>&1 | grep -i stack_
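
The grep above shows raw matching lines; a per-tool tally is often easier to read. The sketch below makes no assumption about the JSON event schema: it simply counts every stack_ name that appears in the captured output, so a tool may be counted more than once per call if several events mention it. The function name count_stack_tools is hypothetical.

```python
import re
from collections import Counter

# Matches names such as stack_tree or stack_frame_pop as whole words.
STACK_TOOL = re.compile(r"\bstack_[a-z_]+\b")


def count_stack_tools(output: str) -> Counter:
    """Count mentions of stack_ tool names in captured CLI output."""
    return Counter(STACK_TOOL.findall(output))


# Illustrative lines only; the real event format is not assumed here.
captured = (
    '{"tool": "stack_frame_plan_children"}\n'
    '{"tool": "stack_frame_activate"}\n'
    '{"tool": "stack_frame_activate"}\n'
)
for name, mentions in sorted(count_stack_tools(captured).items()):
    print(f"{name}: {mentions}")
```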
# In one terminal, watch the state file
watch -n 1 'cat .opencode/stack/state.json | python3 -m json.tool 2>/dev/null | head -50'
# In another terminal, run commands
opencode run "Plan out a simple feature"

# List all frame files by modification time
ls -lt .opencode/stack/frames/
# View the most recent frame
cat .opencode/stack/frames/$(ls -t .opencode/stack/frames/ | head -1) | python3 -m json.tool

Using JSON output mode, you can verify which tools the agent calls:
opencode run "Plan a simple task" --format json 2>&1 | \
python3 -c "
import sys, json
for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        # Skip log lines that are not JSON events
        continue
    # Print any event that mentions a tool or a stack_ tool name
    if 'tool' in str(event).lower() or 'stack_' in str(event):
        print(json.dumps(event, indent=2))
"

The plugin has been validated with:
- Simple tasks: 5 frames, proper tool usage, no TodoWrite
- Complex tasks: 63 frames, depth 4, nested hierarchies
- Real applications: TypeScript projects with 17+ source files
Key validation:
- Agents use stack tools as PRIMARY task management (not TodoWrite)
- Dynamic frame creation when complexity discovered
- Proper completion with results summaries
- Session resumption with the --session flag works correctly
# Token budgets
STACK_TOKEN_BUDGET_TOTAL=4000
STACK_TOKEN_BUDGET_ANCESTORS=1500
STACK_TOKEN_BUDGET_SIBLINGS=1500
STACK_TOKEN_BUDGET_CURRENT=800
# Autonomy
STACK_AUTONOMY_LEVEL=suggest # manual, suggest, or auto
STACK_PUSH_THRESHOLD=70
STACK_POP_THRESHOLD=80
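
A test harness may want to read these variables and sanity-check them before a run. The sketch below is illustrative only: it uses the values listed above as fallbacks (whether they are the plugin's actual defaults is an assumption), accepts the three autonomy levels named in the comment, and exposes the sum of the three partial budgets so it can be compared with the total. Whether the plugin itself enforces that relationship is not stated in this guide. The names StackEnvConfig and load_stack_env are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping

AUTONOMY_LEVELS = ("manual", "suggest", "auto")


@dataclass(frozen=True)
class StackEnvConfig:
    budget_total: int
    budget_ancestors: int
    budget_siblings: int
    budget_current: int
    autonomy_level: str
    push_threshold: int
    pop_threshold: int

    def budget_parts(self) -> int:
        """Sum of the ancestor, sibling, and current-frame budgets."""
        return self.budget_ancestors + self.budget_siblings + self.budget_current


def load_stack_env(env: Mapping[str, str] = os.environ) -> StackEnvConfig:
    """Read the STACK_* variables, falling back to the values shown above."""

    def number(name: str, fallback: int) -> int:
        return int(env.get(name, fallback))

    level = env.get("STACK_AUTONOMY_LEVEL", "suggest")
    if level not in AUTONOMY_LEVELS:
        raise ValueError(f"STACK_AUTONOMY_LEVEL must be one of {AUTONOMY_LEVELS}")
    return StackEnvConfig(
        budget_total=number("STACK_TOKEN_BUDGET_TOTAL", 4000),
        budget_ancestors=number("STACK_TOKEN_BUDGET_ANCESTORS", 1500),
        budget_siblings=number("STACK_TOKEN_BUDGET_SIBLINGS", 1500),
        budget_current=number("STACK_TOKEN_BUDGET_CURRENT", 800),
        autonomy_level=level,
        push_threshold=number("STACK_PUSH_THRESHOLD", 70),
        pop_threshold=number("STACK_POP_THRESHOLD", 80),
    )


config = load_stack_env({})  # empty mapping: only the fallbacks are used
print(config.budget_parts(), "of", config.budget_total)
```

With the values listed above, the three partial budgets add up to 3800 of the 4000-token total.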