Episodic is a conversational memory system that helps avoid the usual “I already told you that” problem with chatbots.
Instead of stuffing the prompt with lots of old messages, Episodic tracks topics, ignores irrelevant content, and injects only the most relevant prior facts for what you’re talking about right now. It can also link facts through relationships (multi-hop traversal) to include context you did not explicitly repeat. In short, Episodic keeps long chats coherent.
- Simple mode lets users chat and search the web without touching any configuration. It handles topic detection, subject-change boundaries, conversation records, and summaries automatically. Conversations are stored as plain markdown files that Episodic can both read and write (see the sketch below).
- Advanced mode exposes the full system: multi-model orchestration, RAG, semantic detection models, prompt engineering, performance benchmarking, cost analysis, and fine-grained control over every pipeline stage.
Simple mode uses free services by default and is self-contained. Advanced mode is available whenever you want it.
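As an illustration of the plain-markdown storage mentioned above, a saved conversation might look something like the snippet below. This is purely illustrative; Episodic's actual on-disk layout may differ.

```markdown
# Topic: coastal-erosion-paper

> Should we use the NOAA tidal dataset or the EU Copernicus data?

NOAA has better temporal resolution for the Atlantic coast sites
you're modeling. For site-specific erosion rates, NOAA is stronger.
```

Because the files are ordinary markdown, you can read, edit, or version-control them with any tool you already use.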
I originally wrote this to fill a gap I couldn't find addressed elsewhere. It has since become my preferred daily interface for routine LLM use and a framework for building new LLM-based capabilities.
- 🤖 Universal LLM Interface - Works with 20+ providers, including OpenAI, Anthropic, Google, and Ollama, plus custom local models
- 🧠 Intelligent Topic Detection - Neural segmentation validated on academic benchmarks, with configurable granularity (see the sketch after this list)
- 🔗 Knowledge Graph (hardened) - Extracts structured knowledge from conversations in real-time and injects relevant facts into context at zero read-side LLM cost
- 🔄 Topic Reactivation - Seamlessly resume previous topics with full context restoration
- 🎭 Muse Mode (hardened) - Perplexity-like web search with many providers (e.g., DuckDuckGo, Google, Brave, Searx)
- 📚 Knowledge Base (RAG) (hardened) - Index documents and search them during chats
- 🎙️ Voice Mode - Hands-free speech input and text-to-speech output
- ⏱️ Assistant Mode (hardened) - Timers, alarms, reminders, weather, news, radio, calculator, notes, media playback, Google Calendar, and Gmail — all available via CLI or voice mode
- 🔄 Local & Cloud Flexibility - Easily switch between local (free, private) and cloud-based operation
- 📓 Markdown Import/Export - Save and resume conversations anytime
- 📎 File References (@file) (hardened) - Attach local files directly in chat messages
- 💰 Cost Tracking - Real-time token usage and costs across all providers
- 🔌 MCP Server & Client (hardened) - Model Context Protocol with 12-layer security pipeline: token auth with rotation, input sanitization, DLP, provenance tracking, canary detection, rate limiting, replay-attack prevention, and structured audit logging
- 🎨 Rich CLI - Streaming responses, theme-based colors, tab completion
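To make the topic-detection idea concrete, here is a toy sketch of boundary detection by embedding drift. Episodic's detector uses neural segmentation models; the bag-of-words vectors, threshold, and function names below are stand-ins invented for this example so it runs with no dependencies.

```python
# Toy sketch of drift-based topic boundary detection. Episodic uses neural
# segmentation models; bag-of-words count vectors stand in here so the
# example runs with no dependencies.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def topic_boundaries(messages: list[str], threshold: float = 0.2) -> list[int]:
    """Mark a boundary wherever consecutive messages drift too far apart."""
    vecs = [embed(m) for m in messages]
    return [i for i in range(1, len(vecs)) if cosine(vecs[i - 1], vecs[i]) < threshold]

msgs = [
    "NOAA tidal data has hourly resolution",
    "the tidal data covers the Atlantic coast",
    "what's a good pizza dough recipe",
]
print(topic_boundaries(msgs))  # -> [2]: the topic shifts at the third message
```

The threshold plays the role of the configurable granularity mentioned above: raising it splits conversations more eagerly, lowering it merges near-neighbors into one topic.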
Every component that touches the outside world is hardened against prompt injection, data exfiltration, and content poisoning. Web search results, Google Calendar, Gmail, RAG document indexing, knowledge graph extraction, and MCP tool calls all pass through a defense-in-depth pipeline that applies input sanitization, content isolation, provenance tracking, canary detection, action gating, and rate limiting. Features marked (hardened) above are covered by this pipeline. See Security Architecture for details. Note that perfect security is not achievable in any system, particularly one built on LLMs; these defenses significantly raise the bar but cannot guarantee absolute protection.
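For a flavor of what one of those layers looks like, here is a deliberately simplified gate for untrusted content. It is a conceptual sketch, not Episodic's implementation; the canary string, regex patterns, and function name are all invented for the example.

```python
# Conceptual sketch of a safety gate for untrusted content (web results,
# email bodies, indexed documents). Not Episodic's implementation: the
# canary string, patterns, and names here are invented for illustration.
import re
import time

CANARY = "canary-7f3a"  # secret marker planted in internal prompts; seeing it
                        # in external content means something leaked state
INJECTION_PATTERNS = [r"ignore (all|previous) instructions", r"reveal.*system prompt"]

_last_call = 0.0

def admit(untrusted_text: str, min_interval: float = 1.0) -> str:
    """Sanitize and isolate untrusted text, or raise if a gate trips."""
    global _last_call
    now = time.monotonic()
    if now - _last_call < min_interval:  # crude rate limiting
        raise RuntimeError("rate limit exceeded")
    _last_call = now
    if CANARY in untrusted_text:  # canary detection
        raise RuntimeError("canary leaked: possible exfiltration")
    for pattern in INJECTION_PATTERNS:  # input sanitization
        if re.search(pattern, untrusted_text, re.IGNORECASE):
            raise RuntimeError("prompt-injection pattern detected")
    # Content isolation: wrap the text as quoted data so downstream prompts
    # treat it as material to read, never as instructions to follow.
    return f"<untrusted>\n{untrusted_text}\n</untrusted>"
```

The real pipeline layers twelve such defenses, so a single bypassed check does not compromise the whole system.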
If you are here regarding the paper When F1 Fails: Granularity-Aware Evaluation for Dialogue Topic Segmentation (arXiv:2512.17083), see the paper/ directory.
Here is one example of something Episodic can do that a vanilla LLM cannot. The user has a working conversation, moves on to unrelated topics for hours, then comes back. The system restores the prior discussion thread and fills in entity relationships from the knowledge graph; the user never repeats anything.
# A working conversation about a paper:
> I'm working with Dr. Reyes on a paper about coastal erosion modeling.
> We're submitting to AAAI. Deadline is March 15.
> Should we use the NOAA tidal dataset or the EU Copernicus data?
🤖 NOAA has better temporal resolution for the Atlantic coast sites
you're modeling. Copernicus has broader coverage but coarser
granularity. For site-specific erosion rates, NOAA is stronger.
> Let's go with NOAA.
# ... hours of conversation on other topics ...
# The original messages have left the context window.
# The user returns to the topic:
> Where did we land on the dataset for the paper?
📎 Pulled 6 earlier messages about: coastal-erosion-paper
🤖 You decided on the NOAA tidal dataset over EU Copernicus. The
reasoning was that NOAA has better temporal resolution for the
Atlantic coast sites. I had suggested downloading the hourly
tide gauge records for your three stations first.
# That answer came from the restored conversation thread — the original
# messages were retrieved from the database and placed back into context.
# Now the user asks for a task:
> Send Dr. Reyes a reminder about the submission deadline.
🤖 Drafting a message to Dr. Reyes:
"Hi Dr. Reyes — just a heads-up that the AAAI submission deadline
for our coastal erosion modeling paper is March 15. Let me know
if you need anything from my end before then."
[Send via MCP → email tool]
# The user said eight words. Episodic filled in who, what, and when:
> /kg explain
🔗 Injected 4 edges (71 tokens):
user:self → works_on → coastal erosion modeling paper (seed: user:self)
user:self → related_to → Dr. Reyes (seed: user:self)
Dr. Reyes → works_on → coastal erosion modeling paper (seed: Dr. Reyes)
coastal erosion modeling paper → deadline → AAAI March 15 (seed: paper)
# A vanilla LLM that lost the earlier conversation would not know who
# Dr. Reyes is, what paper you're working on, or when it's due.
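The /kg explain output above comes from multi-hop traversal over stored facts. Here is a self-contained sketch of the idea; the edge list mirrors the example, but the traversal code is illustrative rather than Episodic's actual implementation.

```python
# Illustrative multi-hop fact expansion over (subject, relation, object)
# triples. The edges mirror the /kg explain output above; the traversal
# itself is a sketch, not Episodic's actual code.
from collections import deque

EDGES = [
    ("user:self", "works_on", "coastal erosion modeling paper"),
    ("user:self", "related_to", "Dr. Reyes"),
    ("Dr. Reyes", "works_on", "coastal erosion modeling paper"),
    ("coastal erosion modeling paper", "deadline", "AAAI March 15"),
]

def expand(seeds, edges, max_hops=2):
    """Breadth-first walk: collect every edge reachable within max_hops."""
    frontier = deque((s, 0) for s in seeds)
    visited = set(seeds)
    found = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for s, r, o in edges:
            if s == node and (s, r, o) not in found:
                found.append((s, r, o))
                if o not in visited:
                    visited.add(o)
                    frontier.append((o, depth + 1))
    return found

# Seeded with the entity behind "Send Dr. Reyes a reminder ...":
for s, r, o in expand({"user:self"}, EDGES):
    print(f"{s} -> {r} -> {o}")
```

Seeding from the entities in the current message and walking a bounded number of hops is what lets eight words recover the who, the what, and the when.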
📖 New users: See QUICK_START.md for a complete 5-minute setup guide using free services!
git clone https://github.com/mhcoen/episodic.git
cd episodic
pip install -e .
# Set up at least one provider:
export OPENAI_API_KEY="sk-..." # or
export HUGGINGFACE_API_KEY="hf_..." # or
ollama pull phi4 # fully local
python -m episodic

- Quick Start - 5-minute setup guide
- User Guide - Comprehensive guide
- Installation - Detailed setup instructions
- Features - Feature documentation
- CLI Reference - All commands
- Configuration - Settings and options
- Assistant - Timers, alarms, weather, calendar, email
- MCP Guide - MCP server and client setup
- Security - Defense-in-depth security architecture
The test suite covers the CLI, topic segmentation, knowledge graph (extraction, read-side, closure, ablation evaluation), RAG, persistence, determinism, and integration.
pytest # full suite (~4000 tests)
pytest tests/kg/ # knowledge graph tests only
pytest tests/unit/ # unit tests only
pytest -x --tb=short          # stop on first failure

Contributions are welcome! Fork, branch, commit, and submit a pull request.
Areas of interest: non-linear conversation trees, additional LLM/embedding/search providers, UI/UX improvements.
Apache License 2.0 — see LICENSE.
Michael H. Coen
Email: mhcoen@gmail.com | mhcoen@alum.mit.edu
GitHub: @mhcoen
Episodic: AI that remembers the conversation.