Agentic Evolve

Evolutionary algorithm discovery powered by Claude. Evolves novel solutions through LLM-driven mutation, crossover, and selection, optimizing for speed, size, or ML accuracy.

Features

Three optimization modes: Performance (ops/sec), Size (bytes), ML (F1/accuracy)
Hierarchical agents: Dedicated subagents for mutation, crossover, evaluation, and adversary review
Evolution Memory: Persistent storage of mutation patterns, failures, and checkpoints for cross-problem learning
Trust System: Adversary agent reviews suspicious improvements to prevent evaluator exploitation
Clean context: Each agent starts fresh, avoiding context bloat
Parallel mutations: Run multiple mutation attempts concurrently
Crash recovery: Checkpoint system enables resuming from any generation
Validation hooks: Block unsafe code patterns before execution
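
To illustrate what a validation hook of this kind could look like, here is a minimal sketch that scans candidate source with Python's ast module before anything is executed. The deny lists and the function name find_unsafe_patterns are assumptions made for this example; they are not taken from the SDK's hooks/ package.

```python
import ast

# Hypothetical deny lists, chosen only for illustration.
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__"}
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil"}


def find_unsafe_patterns(source):
    """Return a list of human-readable problems found in the source text."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        # Collect module names from both import forms.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            if name.split(".")[0] in BLOCKED_MODULES:
                problems.append(f"line {node.lineno}: import of {name}")
        # Flag direct calls to blocked builtins such as eval(...).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            problems.append(f"line {node.lineno}: call to {node.func.id}()")
    return problems
```

A static scan like this only catches direct, literal uses; it is a first filter, not a sandbox.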

Quick Start

1. Install the SDK

# Create virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate

# Install the SDK and dependencies
pip install -e sdk/
pip install claude-agent-sdk

2. Install the Skills (optional)

# Copy skills to your Claude commands directory
cp .claude/commands/evolve*.md ~/.claude/commands/

3. Use It

Via CLI:

# Activate venv first
source .venv/bin/activate

# Performance optimization
python -m evolve_sdk "faster sorting algorithm" --mode=perf

# Size optimization (code golf)
python -m evolve_sdk "shortest Python prime checker" --mode=size

# ML optimization
python -m evolve_sdk "improve F1 for classification" --mode=ml

# With a config file (memory is enabled by default)
python -m evolve_sdk "faster N-Queens solver" --mode=perf --config=evolve_config.json

# Resume previous evolution
python -m evolve_sdk --resume

Via Claude Code skill:

/evolve faster sorting algorithm
/evolve shortest Python solution for ARC task
/evolve improve accuracy on this classifier
/evolve --resume

Architecture

Evolution Memory System

The memory system provides persistent storage for evolution runs, enabling pattern learning, failure avoidance, crash recovery, and cross-problem learning.

What Memory Captures

Frame Type	Purpose
mutation	Tracks all mutation attempts with fitness deltas and tags
failed_mutation	Records rejected mutations and reasons for future avoidance
checkpoint	Enables crash recovery from any generation
generation	Summarizes each generation's progress
champion	Records winning solutions with full lineage
trust_decision	Logs adversary reviews and trust scores
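
As a rough picture of how such frames might be represented, the sketch below models a frame as a small dataclass that serializes to one JSON line. Only the six frame type names come from the table above; the field names (generation, payload, tags, timestamp) and the helper to_json_line are hypothetical and may not match memory/schemas.py.

```python
import json
import time
from dataclasses import asdict, dataclass, field

# Frame type names from the table above.
FRAME_TYPES = {
    "mutation", "failed_mutation", "checkpoint",
    "generation", "champion", "trust_decision",
}


@dataclass
class Frame:
    # All field names here are assumptions for illustration.
    frame_type: str
    generation: int
    payload: dict
    tags: list = field(default_factory=list)
    timestamp: float = field(default_factory=time.time)

    def __post_init__(self):
        # Reject frame types that the table does not list.
        if self.frame_type not in FRAME_TYPES:
            raise ValueError(f"unknown frame type: {self.frame_type!r}")


def to_json_line(frame):
    """Serialize a frame to a single JSON line for append-only storage."""
    return json.dumps(asdict(frame), sort_keys=True)


frame = Frame("mutation", generation=3,
              payload={"fitness_delta": 0.12}, tags=["bitmask"])
line = to_json_line(frame)
```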

Memory Configuration

{
  "memory": {
    "enabled": true,
    "inject_mutation_context": true,
    "store_successful_mutations": true,
    "store_failed_mutations": true,
    "max_similar_mutations": 5,
    "max_failed_mutations": 5
  }
}
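
One plausible reading of these settings is that max_similar_mutations and max_failed_mutations cap how much history is injected into each mutator prompt. The sketch below shows that idea under stated assumptions: the helper build_mutation_context, the record keys (summary, fitness_delta, reason), and the ranking by fitness delta are all hypothetical, not the SDK's actual behavior.

```python
def build_mutation_context(memory_cfg, successes, failures):
    """Build prompt text from past mutations (hypothetical helper)."""
    if not (memory_cfg.get("enabled") and memory_cfg.get("inject_mutation_context")):
        return ""

    # Keep only the strongest past successes, capped by the config.
    max_similar = memory_cfg.get("max_similar_mutations", 5)
    top = sorted(successes, key=lambda m: m["fitness_delta"], reverse=True)
    top = top[:max_similar]

    # Keep only the most recent failures, capped by the config.
    max_failed = memory_cfg.get("max_failed_mutations", 5)
    recent_failures = failures[-max_failed:] if max_failed > 0 else []

    lines = ["Mutations that helped before:"]
    lines += [f"- {m['summary']} (delta {m['fitness_delta']:+.2f})" for m in top]
    lines.append("Mutations that failed before (avoid repeating):")
    lines += [f"- {m['summary']}: {m['reason']}" for m in recent_failures]
    return "\n".join(lines)
```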

Benefits

Pattern Learning: Mutators receive context about what worked before
Failure Avoidance: Mutators avoid repeating mutations that already failed
Crash Recovery: Resume from any checkpoint after system failure
Cross-Problem Learning: Transfer patterns between similar problems

Optimization Modes

Mode	Metric	Use Case
perf	ops/sec, latency	Algorithm optimization, benchmarks
size	bytes, characters	Code golf, minimal implementations
ml	F1, accuracy, AUC	Feature engineering, model tuning
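
To make the three metrics concrete, here is a minimal sketch of how each could be measured. The function names are hypothetical and the SDK's evaluator may compute these differently; the sketch assumes size is counted in UTF-8 bytes, performance in calls per second, and ML quality as binary F1.

```python
import time


def size_fitness(source):
    """Size mode: UTF-8 byte count of the solution text (lower is better)."""
    return len(source.encode("utf-8"))


def perf_fitness(fn, *args, min_seconds=0.05):
    """Perf mode: calls per second, measured over at least min_seconds."""
    calls = 0
    start = time.perf_counter()
    while True:
        fn(*args)
        calls += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return calls / elapsed


def f1_fitness(y_true, y_pred):
    """ML mode: binary F1 score for labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Counting bytes rather than characters matters for code golf, since a single non-ASCII character can occupy several bytes.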

Example Results

Problem	Mode	Result	Improvement
N-Queens	perf	20,407 sol/sec	14,000x vs baseline
hERG Toxicity	ml	0.890 ROC-AUC	+4.5% from baseline
ARC task 0520fde7	size	57 bytes	-29% from baseline
Airfoil Design	perf	44% L/D improvement	3D-printable output
Chess Challenge	ml	77.4 ACPL	AIcrowd competition

Showcases

Showcase	Description	Key Result
regex_golf	Debugger + Plateau Breaker demo	36% shorter regex
linkage-evolution	Mechanical linkage optimization	25% improvement, 3D-printable
cuopt_lp_autotuner	NVIDIA cuOpt LP autotuner	1.07x speedup, 73% improved
nqueens-evolution	N-Queens solver with memory demo	14,000x speedup
molecular-admet-prediction	hERG cardiac toxicity	0.890 ROC-AUC
code-golf	ARC-AGI minimal solutions	72 tasks, 163K points
santa-2025-packing	Kaggle bin packing	120 generations tracked
global-chess-challenge-2025	AIcrowd chess competition	77.4 ACPL
airfoil-evolution	Airfoil shape optimization	44% L/D improvement
openml-automl-benchmark	OpenML-CC18 AutoML benchmark	2.38% avg improvement

Experiments

Exploratory and work-in-progress projects live in experiments/. These include early-stage explorations, projects still being tuned, and documented negative results.

Project Structure

agentic-evolve/
├── .claude/commands/           # Skill files (thin SDK wrappers)
│   ├── evolve.md              # Master dispatcher
│   ├── evolve-perf.md         # Performance mode
│   ├── evolve-size.md         # Size mode
│   └── evolve-ml.md           # ML mode
├── sdk/                        # Python SDK
│   └── evolve_sdk/
│       ├── runner.py          # EvolutionRunner orchestrator
│       ├── config.py          # Configuration handling
│       ├── agents/            # Subagent prompts
│       │   ├── mutator.py     # Mutation specialist
│       │   ├── evaluator.py   # Fitness measurement
│       │   ├── crossover.py   # Parent combination
│       │   ├── adversary.py   # Trust validation
│       │   ├── debugger.py    # Failed mutation diagnosis
│       │   ├── plateau_breaker.py  # Stall detection/intervention
│       │   ├── meta_strategist.py  # Strategy optimization
│       │   └── diversity_guardian.py  # Convergence prevention
│       ├── memory/            # Evolution memory system
│       │   ├── store.py       # Persistent storage engine
│       │   ├── schemas.py     # Frame type definitions
│       │   ├── queries.py     # Pre-built query patterns
│       │   └── embeddings.py  # Code similarity matching
│       └── hooks/             # Validation hooks
├── showcase/                   # Verified showcase projects (10)
│   ├── nqueens-evolution/     # Memory system demo (14,000x speedup)
│   ├── molecular-admet-prediction/ # hERG toxicity (0.890 ROC-AUC)
│   ├── code-golf/             # ARC-AGI solutions (72 tasks)
│   └── ...
├── experiments/                # WIP/exploratory projects (16)
│   ├── kv-cache-eviction/     # KV-cache scoring
│   ├── kernelbench-triton-evolution/ # GPU kernel optimization
│   └── ...
└── .evolve-sdk/                # Evolution state (created per run)
    └── <problem>/
        ├── evolution.json      # Full state + memory frames
        ├── champion.json       # Best solution
        ├── trust_dossier.md    # Trust decision report
        └── mutations/          # All tested variants

Trust System

The SDK includes adversarial validation to prevent evaluator gaming:

Component	Purpose
Adversary Agent	Reviews suspicious improvements (>15% jumps)
Variance Gates	Re-evaluates N times, rejects inconsistent results
Exploit Detection	Checks timing anomalies, output integrity
Trust Dossier	Generates markdown reports of all decisions
Escalation Levels	Extended validation for high-stakes promotions

{
  "trust": {
    "enabled": true,
    "suspicious_jump_pct": 15.0,
    "require_adversary_for_champion": true,
    "n_evaluations": 3,
    "variance_threshold": 0.05
  }
}
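
The way these settings could interact is sketched below: re-evaluate the candidate n_evaluations times, reject it if the scores are inconsistent, and route it to the adversary if the improvement exceeds suspicious_jump_pct. The function review_candidate, its return labels, and the choice of coefficient of variation as the consistency measure are assumptions for illustration. The sketch also assumes a higher-is-better fitness with a nonzero champion score.

```python
from statistics import mean, pstdev


def review_candidate(evaluate, champion_fitness, trust_cfg):
    """Apply a variance gate, then a suspicious-jump check (hypothetical)."""
    n = trust_cfg.get("n_evaluations", 3)
    scores = [evaluate() for _ in range(n)]
    avg = mean(scores)

    # Variance gate: relative spread of repeated evaluations (an assumption).
    spread = pstdev(scores) / abs(avg) if avg else float("inf")
    if spread > trust_cfg.get("variance_threshold", 0.05):
        return "reject_inconsistent", avg

    # Jump check: large improvements over the champion need adversary review.
    jump_pct = (avg - champion_fitness) / abs(champion_fitness) * 100
    if jump_pct > trust_cfg.get("suspicious_jump_pct", 15.0):
        return "needs_adversary_review", avg

    return "trusted", avg
```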

Configuration

Use evolve_config.json for custom evaluation:

{
  "description": "Evolve fast N-Queens solvers",
  "mode": "perf",
  "evaluation": {
    "test_command": "python evaluate.py {solution} --json"
  },
  "memory": {
    "enabled": true,
    "inject_mutation_context": true
  },
  "trust": {
    "enabled": true,
    "require_adversary_for_champion": true
  },
  "starter_solutions": ["baseline.py"],
  "max_generations": 20,
  "population_size": 10
}

Then run:

python -m evolve_sdk --config=evolve_config.json
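
The test_command above expects an evaluate.py that takes a solution path and a --json flag. The following is a minimal sketch of such a script for the N-Queens example. The output keys (correct, seconds) and the solve(n) entry point are assumptions; check the SDK's evaluator for the JSON schema it actually expects.

```python
import argparse
import importlib.util
import json
import time


def load_solution(path):
    """Import a candidate solution from a .py file path."""
    spec = importlib.util.spec_from_file_location("solution", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def evaluate(path, n=6, repeats=3):
    """Check correctness and time the candidate's solve(n) function."""
    solve = load_solution(path).solve  # hypothetical entry point name
    expected = 4  # the 6-queens puzzle has 4 solutions
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        result = solve(n)
        best = min(best, time.perf_counter() - start)
    return {"correct": result == expected, "seconds": best}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Score one candidate solution")
    parser.add_argument("solution", nargs="?", help="path to the candidate .py file")
    parser.add_argument("--json", action="store_true", help="emit JSON output")
    args = parser.parse_args(argv)
    if args.solution is None:
        parser.print_usage()
        return
    report = evaluate(args.solution)
    print(json.dumps(report) if args.json else report)


if __name__ == "__main__":
    main()
```

Taking the best of several timed runs reduces noise from the first, cold call.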

Requirements

Python 3.10+
Claude Code CLI (brew install claude-code)
Claude Agent SDK (pip install claude-agent-sdk)
Authenticated with Claude (claude auth login)

License

MIT
