# Echo
⚡ Automated Bug Reproduction Agent powered by LLMs & Knowledge Graphs ⚡
Echo analyzes GitHub issues, builds code knowledge graphs, and reproduces bugs in isolated Docker environments — all automatically.
- Knowledge Graph Analysis — Builds AST-based code knowledge graphs for deep codebase understanding
- LLM-Powered Reasoning — Leverages large language models to interpret issues and generate reproductions
- Containerized Execution — Runs all reproductions in isolated Docker environments
- Batch Processing — Supports SWE-bench datasets with parallel workers
- Automatic Patch Generation — Produces reproduction files, test commands, and diff patches
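The core idea behind the AST-based knowledge graph can be sketched in plain Python. The snippet below is illustrative only (it is not Echo's actual implementation): it walks a module's AST and records function definitions as graph nodes and call relationships as edges — the kind of facts a code knowledge graph stores.

```python
import ast

# Illustrative sketch: extract definition nodes and call edges from source,
# the sort of structure an AST-based code knowledge graph captures.
source = """
def helper():
    return 42

def main():
    return helper()
"""

tree = ast.parse(source)
nodes, edges = [], []

for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        nodes.append(node.name)
        # Record which functions this one calls.
        for child in ast.walk(node):
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                edges.append((node.name, child.func.id))

print(nodes)  # ['helper', 'main']
print(edges)  # [('main', 'helper')]
```

In Echo these facts are persisted to Neo4j rather than Python lists, so the LLM can query them when interpreting an issue.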
| Dependency | Version |
|---|---|
| Python | 3.11+ |
| Docker | Latest |
| Neo4j | Latest |
| Git | Latest |
```shell
pip install hatchling
pip install .
pip install git+https://github.com/SWE-bench/SWE-bench@v4.1.0
```

### PostgreSQL
```shell
docker run -d \
  -p 5432:5432 \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=postgres \
  postgres
```

### Neo4j (start before configuring)
```shell
docker run -d \
  -p 7474:7474 \
  -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password \
  -e NEO4J_PLUGINS='["apoc"]' \
  -e NEO4J_dbms_memory_heap_initial__size=4G \
  -e NEO4J_dbms_memory_heap_max__size=8G \
  -e NEO4J_dbms_memory_pagecache_size=4G \
  neo4j
```

Copy `example.env` to `.env` and fill in:
- Neo4j — URI, username, password
- Database — PostgreSQL connection string
- LLM API Keys — Anthropic / Gemini / OpenAI-compatible
- Working Directory — Path for logs and cloned repos
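A minimal `.env` might look like the following. The variable names here are illustrative guesses — check `example.env` for the exact keys Echo expects:

```
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
DATABASE_URL=postgresql://postgres:password@localhost:5432/postgres
ANTHROPIC_API_KEY=your_api_key
WORKING_DIRECTORY=./working_dir
```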
```shell
mkdir working_dir
```

```shell
# Single instance
python -m app.main -d "princeton-nlp/SWE-bench_Lite" -i "instance_id"

# Multiple instances
python -m app.main -d "princeton-nlp/SWE-bench_Lite" -i "id_1" -i "id_2"

# Full dataset with parallel workers
python -m app.main -d "princeton-nlp/SWE-bench_Lite" -w 3

# With GitHub token
python -m app.main -d "dataset_name" -g "your_github_token"

# Resume from predictions file
python -m app.main -d "dataset_name" -f "predictions_20231215_143022.json"
```

```
GitHub Issue ─→ Clone Repo ─→ Build Knowledge Graph ─→ Init Container
                                                             │
      ┌──────────────────────────────────────────────────────┘
      ▼
LLM Analysis ─→ Generate Patch ─→ Execute in Container
      │                                    │
      │    ┌───────────────────────────────┘
      │    ▼
      │  Bug Reproduced?
      │    ├─ Yes → Save Results (patch, tests, commands)
      └────┤
           └─ No → Retry with refined context
```
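The retry loop at the heart of the diagram can be sketched as a simple Python function. This is illustrative pseudologic, not Echo's actual LangGraph nodes; `generate_patch` and `run_in_container` are hypothetical stubs standing in for the LLM call and the Docker execution step.

```python
# Illustrative sketch of the analyze → patch → execute → retry loop above.
def generate_patch(context):
    # Stub: a real system would prompt the LLM with the issue plus
    # accumulated feedback from failed attempts.
    return f"patch-v{len(context['feedback']) + 1}"

def run_in_container(patch):
    # Stub: pretend the second attempt reproduces the bug.
    reproduced = patch == "patch-v2"
    return {"reproduced": reproduced, "log": f"ran {patch}"}

def reproduce(issue, max_attempts=3):
    context = {"issue": issue, "feedback": []}
    for attempt in range(1, max_attempts + 1):
        patch = generate_patch(context)            # LLM Analysis → Generate Patch
        result = run_in_container(patch)           # Execute in Container
        if result["reproduced"]:                   # Bug Reproduced? → save results
            return {"patch": patch, "attempts": attempt}
        context["feedback"].append(result["log"])  # otherwise refine context, retry
    return None                                    # gave up after max_attempts

print(reproduce("issue-123"))  # {'patch': 'patch-v2', 'attempts': 2}
```

Failed runs feed their logs back into the context, which is what "retry with refined context" means in the diagram.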
```
app/
├── main.py              # Entry point & async orchestration
├── configuration/       # Settings via pydantic-settings
├── docker/              # Container management (pexpect-based)
├── lang_graph/          # LangGraph state machines & nodes
│   └── nodes/           # Individual workflow nodes
└── services/            # Core services
    ├── knowledge_graph  # Neo4j knowledge graph builder
    ├── repository       # Git operations
    ├── llm              # Model initialization
    └── database         # PostgreSQL interactions
```
- Docker must be running before starting Echo
- Provide a valid GitHub token for private repositories
- Large repositories may require more time and memory for analysis
- Knowledge graphs are cleared after each issue to manage Neo4j memory