# ICM (Infinite Context Memory)

Permanent memory for AI agents. Single binary, zero dependencies, MCP native.

ICM gives your AI agent a real memory — not a note-taking tool, not a context manager, a memory.
```
┌──────────────────────┬─────────────────────────┐
│ MEMORIES (Topics)    │ MEMOIRS (Knowledge)     │
│                      │                         │
│ Episodic, temporal   │ Permanent, structured   │
│                      │                         │
│ ┌───┐ ┌───┐ ┌───┐    │   ┌───┐                 │
│ │ m │ │ m │ │ m │    │   │ C │──depends_on──┐  │
│ └─┬─┘ └─┬─┘ └─┬─┘    │   └───┘              │  │
│   │decay│     │      │     │ refines      ┌─▼─┐│
│   ▼     ▼     ▼      │   ┌─▼─┐            │ C ││
│ weight decreases     │   │ C │──part_of──>└───┘│
│ over time unless     │   └───┘                 │
│ accessed/critical    │ Concepts + Relations    │
├──────────────────────┴─────────────────────────┤
│           SQLite + FTS5 + sqlite-vec           │
│   Hybrid search: BM25 (30%) + cosine (70%)     │
└────────────────────────────────────────────────┘
```
Two memory models:
- Memories — store/recall with temporal decay by importance. Critical memories never fade; low-importance ones decay naturally. Filter by topic or keyword.
- Memoirs — permanent knowledge graphs. Concepts linked by typed relations (`depends_on`, `contradicts`, `superseded_by`, ...). Filter by label.
```sh
# Homebrew (macOS / Linux)
brew tap rtk-ai/tap && brew install icm

# Quick install
curl -fsSL https://raw.githubusercontent.com/rtk-ai/icm/main/install.sh | sh

# From source
cargo install --path crates/icm-cli
```

```sh
# Auto-detect and configure Claude Code, Claude Desktop, Cursor, Windsurf
icm init
```

Or manually:

```sh
# Claude Code
claude mcp add icm -- icm serve

# Any MCP client: command = "icm", args = ["serve"]
```

```sh
# Store
icm store -t "my-project" -c "Use PostgreSQL for the main DB" -i high -k "db,postgres"
# Recall
icm recall "database choice"
icm recall "auth setup" --topic "my-project" --limit 10
icm recall "architecture" --keyword "postgres"
# Manage
icm forget <memory-id>
icm consolidate --topic "my-project"
icm topics
icm stats
# Extract facts from text (rule-based, zero LLM cost)
echo "The parser uses Pratt algorithm" | icm extract -p my-project# Create a memoir
icm memoir create -n "system-architecture" -d "System design decisions"
# Add concepts with labels
icm memoir add-concept -m "system-architecture" -n "auth-service" \
-d "Handles JWT tokens and OAuth2 flows" -l "domain:auth,type:service"
# Link concepts
icm memoir link -m "system-architecture" --from "api-gateway" --to "auth-service" -r depends-on
# Search with label filter
icm memoir search -m "system-architecture" "authentication"
icm memoir search -m "system-architecture" "service" --label "domain:auth"
# Inspect neighborhood
icm memoir inspect -m "system-architecture" "auth-service" -D 2
```

| Tool | Description |
|---|---|
| `icm_memory_store` | Store a memory with topic, importance, keywords |
| `icm_memory_recall` | Search by query, filter by topic and/or keyword |
| `icm_memory_forget` | Delete a memory by ID |
| `icm_memory_consolidate` | Merge all memories of a topic into one summary |
| `icm_memory_list_topics` | List all topics with counts |
| `icm_memory_stats` | Global memory statistics |
| `icm_memory_embed_all` | Backfill embeddings for vector search |

| Tool | Description |
|---|---|
| `icm_memoir_create` | Create a new memoir (knowledge container) |
| `icm_memoir_list` | List all memoirs |
| `icm_memoir_show` | Show memoir details and all concepts |
| `icm_memoir_add_concept` | Add a concept with labels |
| `icm_memoir_refine` | Update a concept's definition |
| `icm_memoir_search` | Full-text search, optionally filtered by label |
| `icm_memoir_search_all` | Search across all memoirs |
| `icm_memoir_link` | Create typed relation between concepts |
| `icm_memoir_inspect` | Inspect concept and graph neighborhood (BFS) |
`part_of` · `depends_on` · `related_to` · `contradicts` · `refines` · `alternative_to` · `caused_by` · `instance_of` · `superseded_by`
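The memoir schema is internal to ICM, but as a rough mental model a memoir is a set of concepts plus typed edges between them. The Rust types below are an illustrative sketch, not ICM's actual data structures; only the relation names come from the list above, and the example labels come from the CLI examples earlier:

```rust
/// Illustrative sketch only — not ICM's actual schema.
enum RelationType {
    PartOf,
    DependsOn,
    RelatedTo,
    Contradicts,
    Refines,
    AlternativeTo,
    CausedBy,
    InstanceOf,
    SupersededBy,
}

struct Concept {
    name: String,
    definition: String,
    labels: Vec<String>, // e.g. "domain:auth", "type:service"
}

/// A typed edge, e.g. api-gateway ──depends_on──> auth-service.
struct Relation {
    from: String, // source concept name
    to: String,   // target concept name
    kind: RelationType,
}

struct Memoir {
    name: String,
    concepts: Vec<Concept>,
    relations: Vec<Relation>,
}
```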
Episodic memory (Topics) captures decisions, errors, preferences. Each memory has a weight that decays over time based on importance:
| Importance | Decay | Behavior |
|---|---|---|
| `critical` | none | Never forgotten |
| `high` | slow (0.5x rate) | Fades slowly |
| `medium` | normal | Standard decay |
| `low` | fast (2x rate) | Quickly forgotten |
Decay is applied automatically on recall (if >24h since last decay).
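The exact decay curve is an implementation detail. As a mental model, think of exponential decay scaled by the per-importance multipliers from the table above. This Rust sketch is illustrative only; the base rate constant and function shape are assumptions, not ICM's API:

```rust
/// Illustrative sketch: ICM's real decay formula is internal.
/// Assumes exponential decay scaled by an importance multiplier.
#[derive(Clone, Copy)]
enum Importance {
    Critical, // never decays
    High,     // 0.5x rate
    Medium,   // 1x rate
    Low,      // 2x rate
}

fn decayed_weight(weight: f64, importance: Importance, hours_elapsed: f64) -> f64 {
    // Hypothetical base rate; the real constant is an implementation detail.
    const BASE_RATE_PER_HOUR: f64 = 0.001;
    let multiplier = match importance {
        Importance::Critical => 0.0, // "none": never forgotten
        Importance::High => 0.5,     // "slow": fades slowly
        Importance::Medium => 1.0,   // "normal": standard decay
        Importance::Low => 2.0,      // "fast": quickly forgotten
    };
    weight * (-BASE_RATE_PER_HOUR * multiplier * hours_elapsed).exp()
}
```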
Semantic memory (Memoirs) captures structured knowledge as a graph. Concepts are permanent — they get refined, never decayed. Use `superseded_by` to mark obsolete facts instead of deleting them.
With embeddings enabled, ICM uses hybrid search:
- FTS5 BM25 (30%) — full-text keyword matching
- Cosine similarity (70%) — semantic vector search via sqlite-vec (384d, BAAI/bge-small-en-v1.5)
Without embeddings, ICM falls back to FTS5, then to keyword `LIKE` search.
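As a sketch of how such a 30/70 blend is commonly computed (the normalization step here is an assumption; ICM's actual ranking code may differ):

```rust
/// Illustrative hybrid-ranking sketch: blends a normalized BM25 score
/// with cosine similarity using the 30%/70% weights stated above.
fn hybrid_score(bm25: f64, bm25_max: f64, cosine: f64) -> f64 {
    // Normalize BM25 into [0, 1] so it is comparable to cosine similarity.
    let bm25_norm = if bm25_max > 0.0 { bm25 / bm25_max } else { 0.0 };
    0.3 * bm25_norm + 0.7 * cosine
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { (dot / (na * nb)) as f64 }
}
```

Normalizing BM25 first matters because raw BM25 scores are unbounded, while cosine similarity is already bounded; without it the keyword score would dominate the blend.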
Single SQLite file. No external services, no network dependency.
```
~/Library/Application Support/dev.icm.icm/memories.db   # macOS
~/.local/share/dev.icm.icm/memories.db                  # Linux
```

```sh
icm config   # Show active config
# Edit: ~/.config/icm/config.toml (or $ICM_CONFIG)
```

See `config/default.toml` for all options.
ICM extracts memories automatically via three layers:
```
Layer 0: Pattern hooks  Layer 1: PreCompact     Layer 2: SessionStart
(zero LLM cost)         (cheap LLM, ~500 tok)   (zero LLM cost)

┌──────────────────┐    ┌──────────────────┐    ┌───────────────────┐
│ PostToolUse hook │    │ PreCompact hook  │    │ SessionStart hook │
│                  │    │                  │    │                   │
│ • Bash exit != 0 │    │ Context about to │    │ Read cwd project  │
│   → store error  │    │ be compressed →  │    │ → icm recall      │
│ • git commit     │    │ extract memories │    │ → inject as       │
│   → store commit │    │ before they're   │    │ additionalContext │
│ • Edit CLAUDE.md │    │ lost forever     │    │                   │
│   → store context│    │                  │    │ Agent starts with │
│                  │    │ This is the #1   │    │ relevant memories │
│ Pure pattern     │    │ extraction point │    │ already loaded    │
│ matching, no LLM │    │ nobody else does │    │                   │
└──────────────────┘    └──────────────────┘    └───────────────────┘
```
| Layer | Status | LLM cost | Description |
|---|---|---|---|
| Layer 0 | Implemented | 0 | Rule-based keyword extraction via `icm extract` |
| Layer 1 | Planned | ~500 tok | PreCompact hook captures context before compaction |
| Layer 2 | Implemented | 0 | `icm recall-context` injects memories at session start |
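For a feel of what Layer 0 costs (nothing but string matching), here is a hypothetical PostToolUse handler in Rust. The hook payload fields (`tool`, `command`, `exit_code`) are invented for illustration; only the `icm store` flags match the CLI shown earlier:

```rust
use std::process::Command;

/// Hypothetical Layer 0 hook logic: pure pattern matching, no LLM.
/// The parameters are an illustrative payload, not a documented hook API.
fn on_post_tool_use(tool: &str, command: &str, exit_code: i32) {
    let memory = if tool == "Bash" && exit_code != 0 {
        Some((format!("Command failed: {command}"), "high"))
    } else if command.starts_with("git commit") {
        Some((format!("Committed: {command}"), "medium"))
    } else {
        None
    };

    if let Some((content, importance)) = memory {
        // Store via the icm CLI, matching the usage shown earlier.
        let _ = Command::new("icm")
            .args(["store", "-t", "my-project", "-c", content.as_str(), "-i", importance])
            .status();
    }
}
```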
| System | Method | LLM cost | Latency | Captures compaction? |
|---|---|---|---|---|
| ICM | 3-layer extraction | 0 to ~500 tok/session | 0ms | Yes (PreCompact) |
| Mem0 | 2 LLM calls/message | ~2k tok/message | 200-2000ms | No |
| claude-mem | PostToolUse + async | ~1-5k tok/session | 8ms hook | No |
| MemGPT/Letta | Agent self-manages | 0 marginal | 0ms | No |
| DiffMem | Git-based diffs | 0 | 0ms | No |
```
ICM Benchmark (1000 memories, 384d embeddings)
──────────────────────────────────────────────────────────
Store (no embeddings)      1000 ops     34.2 ms     34.2 µs/op
Store (with embeddings)    1000 ops     51.6 ms     51.6 µs/op
FTS5 search                 100 ops      4.7 ms     46.6 µs/op
Vector search (KNN)         100 ops     59.0 ms    590.0 µs/op
Hybrid search               100 ops     95.1 ms    951.1 µs/op
Decay (batch)                 1 op       5.8 ms      5.8 ms/op
──────────────────────────────────────────────────────────
```

Apple M1 Pro, in-memory SQLite, single-threaded. `icm bench --count 1000`
Multi-session workflow with a real Rust project (12 files, ~550 lines). Sessions 2+ show the biggest gains as ICM recalls instead of re-reading files.
```
ICM Agent Benchmark (10 sessions, model: haiku, 3 runs averaged)
══════════════════════════════════════════════════════════════════
                     Without ICM     With ICM      Delta
Session 2 (recall)
  Turns                      5.7          4.0       -29%
  Context (input)          99.9k        67.5k       -32%
  Cost                   $0.0298      $0.0249       -17%
Session 3 (recall)
  Turns                      3.3          2.0       -40%
  Context (input)          74.7k        41.6k       -44%
  Cost                   $0.0249      $0.0194       -22%
══════════════════════════════════════════════════════════════════
```

`icm bench-agent --sessions 10 --model haiku`
Agent recalls specific facts from a dense technical document across sessions. Session 1 reads and memorizes; sessions 2+ answer 10 factual questions without the source text.
```
ICM Recall Benchmark (10 questions, model: haiku, 5 runs averaged)
══════════════════════════════════════════════════════════════════════
                        No ICM       With ICM
──────────────────────────────────────────────────────────────────────
Average score               5%            68%
Questions passed          0/10          5/10
══════════════════════════════════════════════════════════════════════
```

`icm bench-recall --model haiku`
Same test with local models — pure context injection, no tool use needed.
```
Model         Params    No ICM    With ICM    Delta
─────────────────────────────────────────────────────────
qwen2.5:14b      14B        4%         88%     +84%
mistral:7b        7B        2%         81%     +79%
qwen2.5:7b        7B        2%         79%     +77%
llama3.1:8b       8B        4%         67%     +63%
─────────────────────────────────────────────────────────
```

`scripts/bench-ollama.sh qwen2.5:14b`
All benchmarks use real API calls — no mocks, no simulated responses, no cached answers.
- Agent benchmark: Creates a real Rust project in a tempdir. Runs N sessions with `claude -p --output-format json`. Without ICM: empty MCP config. With ICM: real MCP server + auto-extraction + context injection.
- Knowledge retention: Uses a fictional technical document (the "Meridian Protocol"). Scores answers by keyword matching against expected facts. 120s timeout per invocation.
- Isolation: Each run uses its own tempdir and fresh SQLite DB. No session persistence.
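A minimal sketch of a keyword-matching scorer of the kind described above; the function shape and the 0.6 pass threshold are assumptions, not the benchmark's actual code:

```rust
/// Illustrative scorer: fraction of expected fact keywords that appear
/// in an answer, case-insensitively.
fn score_answer(answer: &str, expected_keywords: &[&str]) -> f64 {
    let answer = answer.to_lowercase();
    let hits = expected_keywords
        .iter()
        .filter(|kw| answer.contains(&kw.to_lowercase()))
        .count();
    hits as f64 / expected_keywords.len().max(1) as f64
}

/// Hypothetical pass criterion: at least 60% of expected keywords present.
fn passed(answer: &str, expected_keywords: &[&str]) -> bool {
    score_answer(answer, expected_keywords) >= 0.6
}
```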
