AIDD-C - AI Development Driver: Claude

A minimal harness demonstrating long-running autonomous coding with the Claude Agent SDK. This demo implements a three-agent pattern (an initializer agent for new projects, an onboarding agent for existing codebases, and a coding agent) that can build complete applications over multiple sessions.

New in v2.0: Support for existing codebases! The agent can now analyze and continue development on existing projects, not just build from scratch.

Prerequisites

Required: Install the latest versions of both Claude Code and the Claude Agent SDK:

# Install Claude Code CLI (latest version required)
npm install -g @anthropic-ai/claude-code

# Install Python dependencies
pip install -r requirements.txt

Verify your installations:

claude --version  # Should be latest version
pip show claude-code-sdk  # Check SDK is installed

API Key: Set your Anthropic API key:

export ANTHROPIC_API_KEY='your-api-key-here'

Quick Start

New Project (build from scratch):

python aidd-c.py --project-dir ./my_project --spec ./specs/app_spec.txt

Existing Codebase (analyze and continue):

python aidd-c.py --project-dir ./path/to/existing/app

Testing with limited iterations:

python aidd-c.py --project-dir ./my_project --spec ./specs/app_spec.txt --max-iterations 3

Important Timing Expectations

Warning: This demo takes a long time to run!

- First session (initialization): The agent generates a metadata directory (.auto, .autok, or .automaker) with a feature_list.json containing 200 test cases. This takes several minutes and may appear to hang - this is normal. The agent is writing out all the features.
- Subsequent sessions: Each coding iteration can take 5-15 minutes depending on complexity.
- Full app: Building all 200 features typically requires many hours of total runtime across multiple sessions.

Tip: The 200-feature target in the prompts is designed for comprehensive coverage. For a faster demo, edit prompts/initializer.md to reduce the feature count (e.g., 20-50 features).

How It Works

Three-Agent Pattern

The system automatically detects which agent to use based on the project directory state:

Initializer Agent (Session 1 - New Projects):
- Triggered when: Directory is empty or doesn't exist
- Reads spec.txt from metadata directory, creates feature_list.json with 200 test cases
- Sets up project structure and initializes git
- Begins implementation if time permits

Onboarding Agent (Session 1 - Existing Codebases):
- Triggered when: Directory has existing code but no metadata directory with feature_list.json
- Analyzes the existing codebase to understand what's implemented
- Creates spec.txt in the metadata directory, inferring the specification from the existing code
- Creates feature_list.json with existing features marked as passing
- Identifies missing features and technical debt
- Prepares for continued development

Coding Agent (Sessions 2+):
- Triggered when: feature_list.json exists in metadata directory
- Picks up where previous session left off
- Implements features one by one
- Marks them as passing in feature_list.json
- Works on both new and existing codebases
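
The detection rules above can be sketched as a small function. This is an illustrative sketch under stated assumptions, not the code in agent.py: the names `select_agent` and `find_feature_list`, the returned labels, and the rule that metadata directories do not count as "existing code" are all hypothetical, and the metadata directory names are the ones shown under Generated Project Structure.

```python
from pathlib import Path
from typing import Optional, Union

# Candidate metadata directory names, as listed under "Generated Project
# Structure". The real lookup order is an assumption.
METADATA_DIRS = (".aidd", ".autok", ".automaker")


def find_feature_list(project_dir: Path) -> Optional[Path]:
    """Return the first feature_list.json found in a metadata directory."""
    for name in METADATA_DIRS:
        candidate = project_dir / name / "feature_list.json"
        if candidate.is_file():
            return candidate
    return None


def select_agent(project_dir: Union[str, Path]) -> str:
    """Return 'initializer', 'onboarding', or 'coding' from directory state."""
    root = Path(project_dir)
    if find_feature_list(root) is not None:
        return "coding"          # feature_list.json exists: continue work
    if not root.exists():
        return "initializer"     # nothing there yet: new project
    # Assumption: a metadata directory alone (e.g. holding only spec.txt)
    # does not count as existing code.
    has_code = any(entry.name not in METADATA_DIRS for entry in root.iterdir())
    return "onboarding" if has_code else "initializer"
```

Because the decision depends only on what is on disk, rerunning the same command after an interruption naturally lands on the coding agent once feature_list.json has been written.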

Session Management

- Each session runs with a fresh context window
- Progress is persisted via the metadata directory (feature_list.json) and git commits
- The agent auto-continues between sessions (3-second delay)
- Press Ctrl+C to pause; run the same command to resume
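
The session loop described above might look roughly like the following. This is a minimal sketch, not the harness's actual loop: `run_sessions` and its `run_session` callback are hypothetical names, and the callback stands in for whatever starts one agent session with a fresh context.

```python
import time
from typing import Callable, Optional


def run_sessions(
    run_session: Callable[[], None],
    max_iterations: Optional[int] = None,
    delay_seconds: float = 3.0,
) -> int:
    """Run sessions back to back; return how many completed.

    State lives on disk (feature_list.json and git), so stopping here loses
    nothing: the next invocation simply starts another session.
    """
    completed = 0
    try:
        while max_iterations is None or completed < max_iterations:
            run_session()  # one session with a fresh context window
            completed += 1
            # Pause before auto-continuing, but not after the final session.
            if max_iterations is None or completed < max_iterations:
                time.sleep(delay_seconds)
    except KeyboardInterrupt:
        print("Paused. Run the same command to resume.")
    return completed
```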

Security Model

This demo uses a defense-in-depth security approach (see security.py and client.py):

OS-level Sandbox: Bash commands run in an isolated environment
Filesystem Restrictions: File operations restricted to the project directory only
Bash Allowlist: Only specific commands are permitted:
- File inspection: ls, cat, head, tail, wc, grep
- Node.js: npm, node
- Version control: git
- Process management: ps, lsof, sleep, pkill (dev processes only)

Commands not in the allowlist are blocked by the security hook.
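
To illustrate the allowlist idea, here is a minimal sketch of how a command line could be checked. It is not the validation logic in security.py: the function `blocked_commands` is hypothetical, the `ALLOWED_COMMANDS` set below is simply copied from the list above, and the sketch deliberately ignores subshells, command substitution, and the extra "dev processes only" check for pkill.

```python
import shlex
from typing import List

# Hypothetical allowlist mirroring the commands listed in this README.
ALLOWED_COMMANDS = {
    "ls", "cat", "head", "tail", "wc", "grep",
    "npm", "node", "git", "ps", "lsof", "sleep", "pkill",
}

# Shell operators after which a new command name is expected.
SEPARATORS = {"&&", "||", ";", "|", "&"}


def blocked_commands(command_line: str) -> List[str]:
    """Return the command names in command_line that are not allowlisted."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    blocked = []
    expect_command = True  # the first token is always a command name
    for token in lexer:
        if token in SEPARATORS:
            expect_command = True
        elif expect_command:
            if token not in ALLOWED_COMMANDS:
                blocked.append(token)
            expect_command = False
    return blocked
```

Checking every command in a chain, rather than only the first word, matters because `ls && rm -rf .` would otherwise pass on the strength of `ls`.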

Project Structure

autonomous-coding/
├── aidd-c.py                 # Main entry point
├── agent.py                  # Agent session logic
├── client.py                 # Claude SDK client configuration
├── security.py               # Bash command allowlist and validation
├── progress.py               # Progress tracking utilities
├── prompts.py                # Prompt loading utilities
├── prompts/
│   ├── initializer.md        # First session prompt (new projects)
│   ├── onboarding.md         # First session prompt (existing codebases)
│   └── coding.md             # Continuation session prompt
├── specs/
│   └── app_spec.txt          # Application specification
└── requirements.txt          # Python dependencies

Generated Project Structure

After running, your project directory will contain:

my_project/
├── .aidd/                      # or .autok/ or .automaker/ (whichever is found/created)
│   ├── feature_list.json         # Test cases (source of truth)
│   ├── spec.txt                  # Copied specification
│   ├── init.sh                   # Environment setup script
│   └── claude-progress.txt       # Session progress notes
├── .claude_settings.json     # Security settings
└── [application files]       # Generated application code

Feature List Schema

The metadata directory's feature_list.json file uses an enhanced schema with rich metadata for better tracking:

{
  "area": "backend",
  "category": "functional",
  "description": "User can log in with email and password",
  "priority": "critical",
  "status": "open",
  "created_at": "2025-01-15",
  "closed_at": null,
  "steps": [
    "Step 1: Navigate to login page",
    "Step 2: Enter credentials",
    "Step 3: Verify login success"
  ],
  "passes": false
}

Field Definitions:

| Field | Values | Description |
| --- | --- | --- |
| `area` | `database`, `backend`, `frontend`, `testing`, `security`, `devex`, `docs` | System area |
| `category` | `functional`, `style`, `security`, `performance`, `accessibility` | Test type |
| `priority` | `critical`, `high`, `medium`, `low` | Implementation priority |
| `status` | `open`, `in_progress`, `resolved`, `deferred` | Current state |
| `created_at` | `YYYY-MM-DD` | Date feature was added |
| `closed_at` | `YYYY-MM-DD` or `null` | Date feature was completed |
| `steps` | Array of strings | Testing steps |
| `passes` | `true` or `false` | Whether feature passes testing |
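
The field definitions above translate directly into a validity check. The sketch below is illustrative only: `validate_feature` and `ALLOWED_VALUES` are hypothetical names that do not come from the repository, and date formats for `created_at` and `closed_at` are left unchecked to keep it short.

```python
from typing import Any, Dict, List

# Allowed values per enumerated field, copied from the table above.
ALLOWED_VALUES = {
    "area": {"database", "backend", "frontend", "testing",
             "security", "devex", "docs"},
    "category": {"functional", "style", "security",
                 "performance", "accessibility"},
    "priority": {"critical", "high", "medium", "low"},
    "status": {"open", "in_progress", "resolved", "deferred"},
}


def validate_feature(feature: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one feature entry (empty if none)."""
    errors = []
    for field, allowed in ALLOWED_VALUES.items():
        value = feature.get(field)
        if value not in allowed:
            errors.append(f"{field}: unexpected value {value!r}")
    if not isinstance(feature.get("passes"), bool):
        errors.append("passes: must be true or false")
    steps = feature.get("steps")
    if not (isinstance(steps, list) and all(isinstance(s, str) for s in steps)):
        errors.append("steps: must be an array of strings")
    return errors
```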

Progress Display:

The agent displays progress summaries including:

- Overall passing/total counts
- Status breakdown (open, in_progress, resolved, deferred)
- Priority breakdown (critical, high, medium, low)
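
A summary like the one described above could be computed as follows. This is a sketch, not the code in progress.py: it assumes feature_list.json holds a JSON array of feature objects in the schema shown earlier, and the names `load_features` and `summarize` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Dict, List


def load_features(path: str) -> List[Dict[str, Any]]:
    """Load the feature list (assumed to be a JSON array of objects)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize(features: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Compute passing/total counts plus status and priority breakdowns."""
    return {
        "passing": sum(1 for f in features if f.get("passes") is True),
        "total": len(features),
        "by_status": dict(Counter(f.get("status", "unknown") for f in features)),
        "by_priority": dict(Counter(f.get("priority", "unknown") for f in features)),
    }
```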

Running the Generated Application

After the agent completes (or pauses), you can run the generated application:

cd my_project  # the directory you passed to --project-dir

# Run the setup script created by the agent
./[metadata-dir]/init.sh

# Or manually (typical for Node.js apps):
npm install
npm run dev

The application will typically be available at http://localhost:3000 or similar (check the agent's output or [metadata-dir]/init.sh for the exact URL).

Command Line Options

| Option | Description | Default |
| --- | --- | --- |
| `--project-dir` | Directory for the project (required) | None |
| `--spec` | Specification file (required for new projects) | None |
| `--max-iterations` | Max agent iterations | Unlimited |
| `--model` | Claude model for all phases | `claude-sonnet-4-5-20250929` |
| `--init-model` | Model for init/onboarding (overrides `--model`) | Same as `--model` |
| `--code-model` | Model for coding phases (overrides `--model`) | Same as `--model` |
| `--idle-timeout` | Abort session if no output for N seconds | `180` |
| `--quit-on-abort` | Quit after N consecutive failures | `0` (never) |
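
For readers who want to see how these options and defaults fit together, here is a minimal argparse sketch built from the table above. It is not the parser in aidd-c.py: `build_parser` and `resolve_models` are hypothetical names, and the fallback of `--init-model` and `--code-model` to `--model` is implemented the simplest way that matches the table.

```python
import argparse
from typing import Tuple

DEFAULT_MODEL = "claude-sonnet-4-5-20250929"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options and defaults listed in the table."""
    parser = argparse.ArgumentParser(prog="aidd-c.py")
    parser.add_argument("--project-dir", required=True)
    parser.add_argument("--spec", default=None)
    parser.add_argument("--max-iterations", type=int, default=None)  # None = unlimited
    parser.add_argument("--model", default=DEFAULT_MODEL)
    parser.add_argument("--init-model", default=None)
    parser.add_argument("--code-model", default=None)
    parser.add_argument("--idle-timeout", type=int, default=180)  # 0 disables
    parser.add_argument("--quit-on-abort", type=int, default=0)   # 0 = never quit
    return parser


def resolve_models(args: argparse.Namespace) -> Tuple[str, str]:
    """Return (init_model, code_model), falling back to --model when unset."""
    init_model = args.init_model or args.model
    code_model = args.code_model or args.model
    return init_model, code_model
```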

Multi-Model Configuration

You can use different models for different phases to optimize cost and performance:

# Use Haiku 4.5 for setup (cheaper), Sonnet 4.5 for coding (more capable)
python aidd-c.py --project-dir ./my_project --spec ./specs/app_spec.txt \
  --init-model claude-haiku-4-5-20251001 \
  --code-model claude-sonnet-4-5-20250929

# Use Opus 4.5 for complex coding tasks
python aidd-c.py --project-dir ./my_project --spec ./specs/app_spec.txt \
  --code-model claude-opus-4-5-20251101

Recommended configurations:

| Use Case | Init Model | Code Model |
| --- | --- | --- |
| Cost-optimized | `claude-haiku-4-5-20251001` | `claude-sonnet-4-5-20250929` |
| Balanced | `claude-sonnet-4-5-20250929` | `claude-sonnet-4-5-20250929` |
| Maximum quality | `claude-sonnet-4-5-20250929` | `claude-opus-4-5-20251101` |

Idle Timeout

The idle timeout feature automatically detects and handles stuck agent sessions. If the agent produces no output for the specified number of seconds, the session is aborted and a fresh session is started.

# Use default 180-second idle timeout
python aidd-c.py --project-dir ./my_project

# Increase timeout for complex operations (5 minutes)
python aidd-c.py --project-dir ./my_project --idle-timeout 300

# Disable idle timeout entirely
python aidd-c.py --project-dir ./my_project --idle-timeout 0

When to adjust idle timeout:

- Increase if you're seeing false timeouts during long-running operations
- Decrease if you want faster detection of stuck sessions
- Disable (0) if you want the agent to run without idle detection
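
One way to implement this kind of idle detection is to wrap the agent's output stream and time each wait for the next item. The following is a sketch under assumptions, not the harness's implementation: `watch_for_idle` and `IdleTimeout` are hypothetical names, and the stream is assumed to be an async iterator of output chunks.

```python
import asyncio
from typing import AsyncIterator, Optional


class IdleTimeout(Exception):
    """Raised when the stream produces no output within the idle window."""


async def watch_for_idle(
    stream: AsyncIterator[str], idle_timeout: Optional[float]
) -> AsyncIterator[str]:
    """Yield items from stream; raise IdleTimeout after a silent gap.

    An idle_timeout of 0 or None disables the check.
    """
    iterator = stream.__aiter__()
    while True:
        try:
            if idle_timeout:
                # The clock restarts for every item, so only silence counts,
                # not total session length.
                item = await asyncio.wait_for(iterator.__anext__(), idle_timeout)
            else:
                item = await iterator.__anext__()
        except StopAsyncIteration:
            return
        except asyncio.TimeoutError:
            raise IdleTimeout(f"no output for {idle_timeout} seconds") from None
        yield item
```

A caller would catch `IdleTimeout`, count it as a failed session, and start a fresh one.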

Failure Threshold

The failure threshold feature tracks consecutive failures (errors and idle timeouts) and can automatically quit after reaching a threshold. This prevents infinite retry loops when something is fundamentally broken.

# Default: never quit, keep retrying forever
python aidd-c.py --project-dir ./my_project

# Quit after 3 consecutive failures
python aidd-c.py --project-dir ./my_project --quit-on-abort 3

# Quit after 5 consecutive failures (more resilient)
python aidd-c.py --project-dir ./my_project --quit-on-abort 5

How it works:

- Counter increments on errors or idle timeouts
- Counter resets to 0 on successful session completion
- When counter reaches threshold, the agent stops
- Use 0 (default) to disable and keep retrying indefinitely
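
The counter rules above fit in a few lines. This is a minimal sketch of the described behavior; the class name `FailureTracker` and its `record` method are hypothetical, not taken from the repository.

```python
class FailureTracker:
    """Track consecutive failed sessions against an optional threshold."""

    def __init__(self, threshold: int = 0) -> None:
        self.threshold = threshold  # 0 means never quit
        self.consecutive = 0

    def record(self, success: bool) -> bool:
        """Record one session result; return True if the run should stop."""
        if success:
            self.consecutive = 0  # any success resets the streak
            return False
        self.consecutive += 1     # errors and idle timeouts both count
        return self.threshold > 0 and self.consecutive >= self.threshold
```

With `--quit-on-abort 3`, two failures followed by a success leave the counter at 0, so it takes three further failures in a row to stop the run.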

When to use:

- Production runs: Set to 3-5 to avoid wasting compute on broken sessions
- Development/debugging: Set to 0 to allow manual investigation
- Unattended runs: Set to a reasonable threshold to prevent runaway costs

Customization

Changing the Application

Edit specs/app_spec.txt to specify a different application to build.

Adjusting Feature Count

Edit prompts/initializer.md and change the "200 features" requirement to a smaller number for faster demos.

Modifying Allowed Commands

Edit security.py to add or remove commands from ALLOWED_COMMANDS.

Troubleshooting

"Appears to hang on first run": This is normal. The initializer agent is generating 200 detailed test cases, which takes significant time. Watch for [Tool: ...] output to confirm the agent is working.

"Command blocked by security hook": The agent tried to run a command not in the allowlist. This is the security system working as intended. If needed, add the command to ALLOWED_COMMANDS in security.py.

"API key not set": Ensure ANTHROPIC_API_KEY is exported in your shell environment.

License

Internal Anthropic use.
