Unlimited local AI memory for Claude Code and Claude Desktop
MCP Memoria is a Model Context Protocol (MCP) server that provides persistent, unlimited memory capabilities using Qdrant for vector storage and Ollama for local embeddings. Zero cloud dependencies, zero storage limits, 100% privacy.
- Unlimited Storage: No 50MB limits like cloud services
- 100% Local: All data stays on your machine
- Three Memory Types:
- Episodic: Events, conversations, time-bound memories
- Semantic: Facts, knowledge, concepts
- Procedural: Procedures, workflows, learned skills
- Semantic Search: Find relevant memories by meaning, not just keywords
- Full-Text Match: Filter results by exact keyword presence in content
- Content Chunking: Long memories are automatically split into chunks for higher-quality embeddings; results are transparently deduplicated
- Knowledge Graph: Create typed relationships between memories
- 9 relation types: causes, fixes, supports, opposes, follows, supersedes, derives, part_of, related
- Graph traversal with BFS/DFS
- AI-powered relation suggestions
- Path finding between memories
- Web UI: Browser-based Knowledge Graph explorer and memory browser
- Time Tracking: Track work sessions with clients, projects, and categories
- Memory Consolidation: Automatic merging of similar memories
- Forgetting Curve: Natural decay of unused, low-importance memories
- Export/Import: Backup and share your memories
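Semantic search works by comparing embedding vectors rather than matching keywords. As a toy illustration (the three-dimensional vectors here are made up; real embeddings come from nomic-embed-text and have hundreds of dimensions), cosine similarity scores how closely two memories point in the same direction:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 = same direction, 0.0 = orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Made-up 3-dimensional "embeddings" for illustration only
db_note  = [0.9, 0.1, 0.0]   # "PostgreSQL connection settings"
db_query = [0.8, 0.2, 0.1]   # "database configuration"
ui_note  = [0.0, 0.1, 0.9]   # "CSS layout tweaks"

# The database query ranks the database note above the unrelated UI note
assert cosine(db_note, db_query) > cosine(db_note, ui_note)
```

This is why "How do we handle authentication?" can surface a memory that never contains the word "authentication": nearby meanings produce nearby vectors.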
MCP Memoria requires two components running before you can use it:
- Backend Services (Qdrant + PostgreSQL) — must be started FIRST
- Claude Configuration — connects Claude to Memoria
⚠️ Important: Start the backend services BEFORE configuring Claude. Claude will fail to connect if the services aren't running.
- Docker and Docker Compose
- Ollama installed and running with the nomic-embed-text model
- Python 3.11+ (for the local Memoria process)
# Install Ollama (if not already installed)
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Start Ollama and pull the embedding model
ollama serve # Run in background or separate terminal
ollama pull nomic-embed-text
# Verify Ollama is running
curl http://localhost:11434/api/tags

Best for: Most users. Provides all features including Knowledge Graph, Time Tracking, and Web UI.
This setup runs Qdrant, PostgreSQL, and Web UI in Docker containers. Each Claude session spawns its own local Memoria process, connecting to these shared services.
git clone https://github.com/yourusername/mcp-memoria.git
cd mcp-memoria/docker
# Start all services (Qdrant + PostgreSQL + Web UI)
./start.sh central
# Or manually:
docker-compose -f docker-compose.central.yml up -d

This starts:
| Service | Port | Description |
|---|---|---|
| Qdrant | 6333 | Vector database for semantic search |
| PostgreSQL | 5432 | Knowledge Graph + Time Tracking data |
| Web UI | 3000 | Browser-based memory explorer |
| REST API | 8765 | API for custom integrations |
# Check Qdrant
curl http://localhost:6333/health
# Check PostgreSQL
docker exec memoria-postgres pg_isready -U memoria
# Open Web UI (optional)
open http://localhost:3000

cd mcp-memoria
pip install -e .

Choose your Claude client:
Claude Code — Using CLI (recommended):
claude mcp add --scope user memoria \
-e MEMORIA_QDRANT_HOST=localhost \
-e MEMORIA_QDRANT_PORT=6333 \
-e MEMORIA_DATABASE_URL=postgresql://memoria:memoria_dev@localhost:5432/memoria \
-e MEMORIA_OLLAMA_HOST=http://localhost:11434 \
-- python -m mcp_memoria

Claude Code — Manual config (~/.claude.json):
{
"mcpServers": {
"memoria": {
"command": "python",
"args": ["-m", "mcp_memoria"],
"env": {
"MEMORIA_QDRANT_HOST": "localhost",
"MEMORIA_QDRANT_PORT": "6333",
"MEMORIA_DATABASE_URL": "postgresql://memoria:memoria_dev@localhost:5432/memoria",
"MEMORIA_OLLAMA_HOST": "http://localhost:11434"
}
}
}
}

Claude Desktop — Config file location:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"memoria": {
"command": "python",
"args": ["-m", "mcp_memoria"],
"env": {
"MEMORIA_QDRANT_HOST": "localhost",
"MEMORIA_QDRANT_PORT": "6333",
"MEMORIA_DATABASE_URL": "postgresql://memoria:memoria_dev@localhost:5432/memoria",
"MEMORIA_OLLAMA_HOST": "http://localhost:11434"
}
}
}
}

Start Claude and try:
Show me the memoria stats
If you see statistics, Memoria is working correctly.
Best for: Quick testing or if you don't need Knowledge Graph/Time Tracking features.
This setup only runs Qdrant. You won't have access to:
- Knowledge Graph tools (memoria_link, memoria_related, etc.)
- Time Tracking tools (memoria_work_start, memoria_work_report, etc.)
- Web UI
cd mcp-memoria/docker
docker-compose -f docker-compose.qdrant-only.yml up -d

# Verify Qdrant
curl http://localhost:6333/health

cd mcp-memoria
pip install -e .

Configure Claude (same as Option A, but without MEMORIA_DATABASE_URL):
{
"mcpServers": {
"memoria": {
"command": "python",
"args": ["-m", "mcp_memoria"],
"env": {
"MEMORIA_QDRANT_HOST": "localhost",
"MEMORIA_QDRANT_PORT": "6333",
"MEMORIA_OLLAMA_HOST": "http://localhost:11434"
}
}
}
}

Best for: Users who prefer running Memoria entirely in Docker, or environments without Python.
In this setup, Claude spawns an ephemeral Memoria container for each session. The container connects to the backend services via Docker networking.
cd mcp-memoria/docker
docker-compose -f docker-compose.central.yml up -d

cd mcp-memoria
docker build -t mcp-memoria:latest -f docker/Dockerfile .

{
"mcpServers": {
"memoria": {
"command": "docker",
"args": [
"run", "-i", "--rm",
"--network", "memoria-central",
"-e", "MEMORIA_QDRANT_HOST=qdrant",
"-e", "MEMORIA_QDRANT_PORT=6333",
"-e", "MEMORIA_DATABASE_URL=postgresql://memoria:memoria_dev@postgres:5432/memoria",
"-e", "MEMORIA_OLLAMA_HOST=http://host.docker.internal:11434",
"-e", "MEMORIA_LOG_LEVEL=WARNING",
"mcp-memoria:latest"
]
}
}
}

Note: Inside Docker, use container names (qdrant, postgres) instead of localhost. Use host.docker.internal to reach Ollama running on your host machine.
# Check status
docker-compose -f docker-compose.central.yml ps
# View logs
docker-compose -f docker-compose.central.yml logs -f
# Stop services (data is preserved)
docker-compose -f docker-compose.central.yml down
# Stop and DELETE all data (⚠️ irreversible!)
docker-compose -f docker-compose.central.yml down -v

┌─────────────────────────────────────────────────────────────┐
│ Your Machine │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Ollama (native) │ │
│ │ http://localhost:11434 │ │
│ │ Provides: nomic-embed-text embeddings │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Docker Services (docker-compose.central.yml) │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌────────────────┐ │ │
│ │ │ Qdrant │ │ PostgreSQL │ │ Web UI │ │ │
│ │ │ :6333 │ │ :5432 │ │ :3000 │ │ │
│ │ │ (vectors) │ │ (relations) │ │ (browser) │ │ │
│ │ └─────────────┘ └─────────────┘ └────────────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
│ ▲ │
│ │ │
│ ┌────────────────────────┼─────────────────────────────┐ │
│ │ Claude Code/Desktop │ │ │
│ │ ┌──────────────┴───────────────┐ │ │
│ │ │ Memoria Process (stdio) │ │ │
│ │ │ python -m mcp_memoria │ │ │
│ │ │ (spawned per session) │ │ │
│ │ └──────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
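Every Claude session spawns its own Memoria process, but they all share the backends shown above. A quick stdlib-only sketch to confirm the ports from the diagram are reachable before launching Claude (ports assumed from the compose defaults):

```python
import socket

def check_port(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports as shown in the architecture diagram / services table
services = {"Qdrant": 6333, "PostgreSQL": 5432, "Web UI": 3000, "Ollama": 11434}
for name, port in services.items():
    status = "up" if check_port("localhost", port) else "DOWN"
    print(f"{name:12} :{port}  {status}")
```

If any line reads DOWN, start that service before configuring Claude; the MCP connection will fail otherwise.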
The Web UI provides a browser-based interface to explore and manage your memories. Access it at http://localhost:3000 after starting the central Docker services.
- Dashboard: Overview of memory statistics, recent activity, and quick actions
- Knowledge Graph Explorer: Interactive force-directed graph visualization of memory relationships
- Click nodes to view memory details
- Drag to rearrange the layout
- Filter by relation type
- Zoom and pan navigation
- Memory Browser: Search and browse all stored memories
- Semantic search with filters
- Filter by memory type (episodic, semantic, procedural)
- Sort by date, importance, or relevance
- Relation Management: Create, view, and delete relationships between memories
- AI Suggestions: Get recommended relations based on content similarity
Note: The Web UI is included in the Full Docker Setup (Option A). It's not available with the Minimal Setup.
Once configured, Claude will have access to memory tools. You can interact naturally — Claude will automatically use the appropriate memory tools based on your requests.
Try these commands in Claude to verify Memoria is working:
# Check system status
Show me the memoria stats
# Store a test memory
Remember this: The project uses Python 3.11 and FastAPI
# Recall memories
What do you remember about this project?
Memory Tools:
| Tool | Description |
|---|---|
| `memoria_store` | Store new memories |
| `memoria_recall` | Recall memories by semantic similarity (supports `text_match` keyword filter) |
| `memoria_search` | Advanced search with filters (supports `text_match` keyword filter) |
| `memoria_update` | Update existing memories |
| `memoria_delete` | Delete memories |
| `memoria_consolidate` | Merge similar memories |
| `memoria_export` | Export memories to file |
| `memoria_import` | Import memories from file |
| `memoria_stats` | View system statistics |
| `memoria_set_context` | Set current project/file context |
Knowledge Graph Tools (require PostgreSQL):
| Tool | Description |
|---|---|
| `memoria_link` | Create a relationship between two memories |
| `memoria_unlink` | Remove a relationship between memories |
| `memoria_related` | Find memories related through the knowledge graph |
| `memoria_path` | Find shortest path between two memories |
| `memoria_suggest_links` | Get AI-powered relation suggestions |
Time Tracking Tools (require PostgreSQL):
| Tool | Description |
|---|---|
| `memoria_work_start` | Start tracking a work session |
| `memoria_work_stop` | Stop active session and get duration |
| `memoria_work_status` | Check if a session is active |
| `memoria_work_pause` | Pause session (e.g., for breaks) |
| `memoria_work_resume` | Resume a paused session |
| `memoria_work_note` | Add notes to active session |
| `memoria_work_report` | Generate time tracking reports |
If you're using Claude Code, you can type /memoria-guide at any time to get a quick reference for all Memoria tools. This skill provides:
- Complete list of all memory, knowledge graph, and time tracking tools
- Memory types reference (episodic, semantic, procedural)
- Importance level guidelines (0-1 scale)
- Common usage patterns and examples
- Tag naming conventions
- Session workflow recommendations
This is especially useful when you're not sure which tool to use or need a quick reminder of the available options without leaving your conversation.
The skill is included in the repository at .claude/skills/memoria-guide/SKILL.md.
Option 1: Project-level (automatic)
If you're working inside the mcp-memoria directory, the skill is automatically available — no installation needed.
Option 2: User-level (available in all projects)
To make the skill available globally in any project:
# Create the user skills directory if it doesn't exist
mkdir -p ~/.claude/skills
# Copy the skill directory (skills must be directories, not single files)
cp -r .claude/skills/memoria-guide ~/.claude/skills/

Note: After installing or updating skills, restart Claude Code for changes to take effect.
After installation, type /memoria-guide in any Claude Code session to load the quick reference.
Semantic memories (facts and knowledge):
Remember that the API endpoint for users is /api/v1/users
Store this: The database password is rotated every 30 days
Episodic memories (events and experiences):
Log this event: Deployed version 2.1.0 to production today
Remember that we had a meeting about the new auth system
Procedural memories (how-to and workflows):
Save this procedure: To deploy, run ./scripts/deploy.sh --env prod
Remember the steps to set up the dev environment
# Semantic search - finds relevant memories by meaning
What do you know about the database?
How do we handle authentication?
# With filters
Search memories about deployment from last week
Find all procedural memories about testing
Set context to associate memories with a specific project:
Set the project context to "ecommerce-api"
Now remember that this project uses Stripe for payments
Later, when working on the same project:
What do you remember about the ecommerce-api project?
Create and explore relationships between memories:
# Create a relationship
Link memory [problem-id] to [solution-id] with type "fixes"
# Find related memories
What memories are related to [memory-id]?
Show me all memories that this one causes
# Find connections
Is there a path between [memory-a] and [memory-b]?
# Get suggestions
Suggest relationships for memory [id]
Relation Types:
| Type | Description | Example |
|---|---|---|
| `causes` | A leads to B | Decision → Consequence |
| `fixes` | A resolves B | Solution → Problem |
| `supports` | A confirms B | Evidence → Claim |
| `opposes` | A contradicts B | Counterargument → Argument |
| `follows` | A comes after B | Event → Previous event |
| `supersedes` | A replaces B | New fact → Old fact |
| `derives` | A is derived from B | Summary → Original |
| `part_of` | A is component of B | Chapter → Book |
| `related` | Generic connection | Any correlation |
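Queries like "what memories are related to X?" are graph traversals over these typed edges. A minimal breadth-first sketch, not Memoria's actual implementation, with an adjacency map of hypothetical memory IDs:

```python
from collections import deque

def related(graph: dict[str, list[tuple[str, str]]],
            start: str, max_depth: int = 2) -> set[str]:
    """BFS over edges stored as {source: [(relation_type, target), ...]};
    returns all memory ids reachable within max_depth hops."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # don't expand past the depth limit
        for _relation, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                frontier.append((target, depth + 1))
    return seen - {start}

# Hypothetical memory ids linked with the relation types above
graph = {
    "bug-42": [("fixes", "fix-7")],
    "fix-7": [("part_of", "release-2.1")],
    "release-2.1": [("follows", "release-2.0")],
}
```

With `max_depth=2`, `related(graph, "bug-42")` reaches the fix and its release but not the older release three hops away; increasing the depth widens the neighborhood, which is the trade-off the `memoria_related` traversal exposes.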
# Update a memory
Update memory [id] to include the new API version
# Delete memories
Delete all memories about the old authentication system
Forget the deprecated deployment process
# Consolidate similar memories
Consolidate memories to merge duplicates
# Export/Import
Export all memories to backup.json
Import memories from shared-knowledge.json
Track time spent on tasks, issues, and projects (requires PostgreSQL):
# Start tracking work
Start working on fixing the login timeout issue for AuthService
# Check status
What am I working on?
# Add a note
Note: Found the bug - timeout was set to 10s instead of 30s
# Take a break
Pause work for lunch
# Resume
Resume working
# Stop and see duration
Stop working - fixed by increasing timeout to 30s
# Get reports
Show me my work report for this week
How much time did I spend on AuthService this month?
Time tracking supports:
- Categories: coding, review, meeting, support, research, documentation, devops
- Clients and Projects: Track billable hours per client/project
- GitHub integration: Link sessions to issues and PRs
- Pause/Resume: Exclude breaks from work time
- Reports: Aggregate by period, client, project, or category
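The pause/resume accounting can be pictured as summing only the running intervals of a session's event log. A simplified sketch (the event names and log shape are illustrative, not Memoria's storage format):

```python
from datetime import datetime, timedelta

def net_duration(events: list[tuple[str, datetime]]) -> timedelta:
    """Sum worked time from an ordered (action, timestamp) log,
    excluding pause..resume gaps. Actions: start, pause, resume, stop."""
    total = timedelta()
    running_since = None
    for action, ts in events:
        if action in ("start", "resume"):
            running_since = ts
        elif action in ("pause", "stop") and running_since is not None:
            total += ts - running_since
            running_since = None
    return total

t = datetime(2024, 1, 15, 9, 0)
log = [
    ("start",  t),
    ("pause",  t + timedelta(hours=3)),   # lunch break
    ("resume", t + timedelta(hours=4)),
    ("stop",   t + timedelta(hours=7)),
]
# 3h before lunch + 3h after = 6h net, even though 7h elapsed
```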
- Be specific: "Remember the PostgreSQL connection string is postgres://..." is better than "Remember the database info"
- Use context: Set project context when working on specific projects to keep memories organized
- Regular consolidation: Run consolidation periodically to merge similar memories and reduce redundancy
- Importance levels: Mention importance for critical information: "This is important: never delete the production database"
- Natural language: You don't need special syntax — just talk naturally about what you want to remember or recall
If you run Qdrant on multiple machines (e.g., a Mac and a Linux server), you can keep them synchronized using the included sync script.
The sync script (scripts/sync_qdrant.py) performs incremental bidirectional synchronization:
- New memories: Copied to the other node
- Updated memories: Newer timestamp wins
- Deleted memories: Propagated to the other node
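The "newer timestamp wins" rule is a last-write-wins merge. A simplified sketch of the core idea (the real script also propagates deletions, which requires tombstone records not shown here; the data shape is illustrative):

```python
def merge_nodes(local: dict[str, tuple[int, str]],
                remote: dict[str, tuple[int, str]]) -> dict[str, tuple[int, str]]:
    """Last-write-wins union of {memory_id: (timestamp, payload)} maps.
    New ids are copied over; on conflict the newer timestamp wins."""
    merged = dict(local)
    for mid, (ts, payload) in remote.items():
        if mid not in merged or ts > merged[mid][0]:
            merged[mid] = (ts, payload)
    return merged

local  = {"a": (100, "old fact"), "b": (200, "local only")}
remote = {"a": (150, "updated fact"), "c": (120, "remote only")}
```

Running the merge in both directions converges the two nodes: `a` takes the newer remote value, while `b` and `c` are copied across unchanged.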
# Run sync
python scripts/sync_qdrant.py
# Dry run (show what would happen)
python scripts/sync_qdrant.py --dry-run
# Verbose output
python scripts/sync_qdrant.py -v

Edit the script to set your node addresses:
LOCAL_URL = "http://localhost:6333"
REMOTE_URL = "http://your-server.local:6333"
⚠️ Warning: HTTP mode is for testing/development only. It shares WorkingMemory across all connected clients, which can cause context confusion.
For scenarios where you can't spawn processes (web apps, remote access):
# Start HTTP server
cd docker && docker-compose -f docker-compose.http.yml up -d
# Configure Claude to connect via URL
{
"mcpServers": {
"memoria": {
"url": "http://localhost:8765/sse"
}
}
}

Endpoints:
- `GET /sse` — SSE connection endpoint
- `POST /messages/` — Message endpoint
- `GET /health` — Health check
When a memory exceeds MEMORIA_CHUNK_SIZE characters (default 500), it is automatically split into overlapping chunks. Each chunk is stored as a separate Qdrant point with its own embedding, linked to the original memory via a parent_id.
How it works:
- Store: long content → TextChunker → N chunks → N embeddings → N Qdrant points (same `parent_id`)
- Recall/Search: query matches individual chunks; results are deduplicated by `parent_id`, returning the full original content
- Update: content changes delete all existing chunks and re-create them; metadata-only changes propagate to every chunk
- Delete: removes all points belonging to the logical memory
Chunking is transparent — callers always see complete memories, never individual chunks.
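A minimal sketch of the two mechanics above, assuming simple character-offset windows (the real TextChunker may split on sentence boundaries instead) and a simplified hit shape for deduplication:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of `size` chars, each new
    window starting `overlap` chars before the previous one ended."""
    if len(text) <= size:
        return [text]
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += size - overlap
    return chunks

def dedupe_hits(hits: list[dict]) -> list[dict]:
    """Collapse chunk-level search hits to one result per parent_id,
    keeping the best-scoring chunk for each logical memory."""
    best: dict[str, dict] = {}
    for h in hits:
        pid = h["parent_id"]
        if pid not in best or h["score"] > best[pid]["score"]:
            best[pid] = h
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)
```

With the defaults, a 1000-character memory becomes three chunks (500, 500, and 100 characters), each pair sharing a 50-character overlap so no sentence is cut off at a hard boundary.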
All settings via environment variables with MEMORIA_ prefix:
| Variable | Default | Description |
|---|---|---|
| `MEMORIA_QDRANT_HOST` | - | Qdrant server host |
| `MEMORIA_QDRANT_PORT` | `6333` | Qdrant port |
| `MEMORIA_QDRANT_PATH` | `~/.mcp-memoria/qdrant` | Local Qdrant storage path (if no host) |
| `MEMORIA_OLLAMA_HOST` | `http://localhost:11434` | Ollama server URL |
| `MEMORIA_EMBEDDING_MODEL` | `nomic-embed-text` | Embedding model |
| `MEMORIA_CACHE_ENABLED` | `true` | Enable embedding cache |
| `MEMORIA_CHUNK_SIZE` | `500` | Max characters per chunk |
| `MEMORIA_CHUNK_OVERLAP` | `50` | Overlap between consecutive chunks |
| `MEMORIA_LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR) |
| `MEMORIA_LOG_FILE` | - | Path to log file (in addition to stderr) |
| `MEMORIA_HTTP_PORT` | - | HTTP port (enables HTTP mode) |
| `MEMORIA_HTTP_HOST` | `0.0.0.0` | HTTP host to bind to |
| `MEMORIA_DATABASE_URL` | - | PostgreSQL URL for Knowledge Graph |
| `MEMORIA_PG_HOST` | - | PostgreSQL host (alternative to DATABASE_URL) |
| `MEMORIA_PG_PORT` | `5432` | PostgreSQL port |
| `MEMORIA_PG_USER` | `memoria` | PostgreSQL username |
| `MEMORIA_PG_PASSWORD` | - | PostgreSQL password |
| `MEMORIA_PG_DATABASE` | `memoria` | PostgreSQL database name |
For events and experiences:
- Conversations
- Decisions made
- Problems encountered
- Meeting notes
For facts and knowledge:
- Project configurations
- API endpoints
- Best practices
- Technical documentation
For skills and procedures:
- Deployment workflows
- Build commands
- Testing procedures
- Common code patterns
Most common cause: Backend services not running. Make sure to start Docker services BEFORE launching Claude.
- Check Qdrant is running:
  curl http://localhost:6333/health
  # Should return: {"title":"qdrant","version":"..."}
- Check Ollama is running:
  curl http://localhost:11434/api/tags
  # Should return list of models
- Verify the embedding model is installed:
  ollama list | grep nomic-embed-text
- Check PostgreSQL (if using full setup):
  docker exec memoria-postgres pg_isready -U memoria
- Ensure services are running: docker-compose -f docker-compose.central.yml ps
- For Docker setups, verify the network: docker network ls | grep memoria
- Check firewall settings if running on remote servers
- Run memoria_stats to verify memories are being stored
- Check that the embedding model is working
- Try consolidating memories if you have many similar entries
- The first query may be slow as models are loaded into memory
- Ensure Ollama is using GPU acceleration if available
- Consider using a smaller embedding model for faster results
Enable debug logging for more information:
export MEMORIA_LOG_LEVEL=DEBUG

Or in Claude config:
{
"env": {
"MEMORIA_LOG_LEVEL": "DEBUG",
"MEMORIA_LOG_FILE": "/tmp/memoria.log"
}
}

To completely reset Memoria and start fresh:
# Stop services and delete all data (⚠️ irreversible!)
cd docker
docker-compose -f docker-compose.central.yml down -v
docker-compose -f docker-compose.central.yml up -d

# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Type checking
mypy src/mcp_memoria
# Linting
ruff check src/mcp_memoria

| Feature | MCP Memoria | Memvid | Mem0 |
|---|---|---|---|
| Storage Limit | Unlimited | 50MB free | Varies |
| Local-only | Yes | Partial | No |
| MCP Native | Yes | No | No |
| Cost | Free | Freemium | Freemium |
| Vector DB | Qdrant | Custom | Cloud |
Apache 2.0
Contributions welcome! Please read CONTRIBUTING.md first.