Multi-agent RAG system with intelligent query routing, semantic search, and web fallback
QueryMind is a lightweight, production-ready Retrieval-Augmented Generation (RAG) system that combines ChromaDB vector search, Ollama LLM intelligence, and web search capabilities to provide accurate, context-aware responses from your knowledge base.
- **Intelligent Query Routing** - Automatically routes queries to the optimal search strategy
- **Semantic Search** - ChromaDB-powered vector search with mxbai-embed-large embeddings
- **LLM Intent Analysis** - Ollama integration for query understanding and keyword extraction
- **Web Search Fallback** - Seamless fallback to Serper.dev when the vault has no results
- **Structured Logging** - Environment-based logging with debug, info, warning, and error levels
- **Security Hardened** - Input sanitization, injection protection, and validation
- **Fully Tested** - 27 tests covering imports, routing logic, and security
- **Pip Installable** - Standard Python package with pyproject.toml
QueryMind implements a multi-agent architecture with intelligent routing:
```
User Query → Router → [ Fast Search Agent     ] → Results
                      [ Deep Research Agent   ]
                      [ Web Search (fallback) ]
```
- FastSearchAgent - Direct keyword matching for simple queries (<1s)
- DeepResearchAgent - Ollama-powered semantic analysis for complex questions (~10s)
- WebSearchClient - Serper.dev API integration for external knowledge
Queries are automatically routed based on:
- Length: >10 words → Deep Research
- Question words: "how", "why", "what", "explain" → Deep Research
- Logical operators: "and", "or", "not" → Deep Research
- Default: simple keywords → Fast Search
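In code, the routing heuristics above amount to a small classifier. The sketch below is a simplification for illustration only (the actual `AgentRouter` applies seven heuristics, so the names and thresholds here are assumptions):

```python
# Hypothetical sketch of the routing heuristics described above;
# the real AgentRouter applies more checks, but the core idea is:
QUESTION_WORDS = {"how", "why", "what", "explain"}
LOGICAL_OPERATORS = {"and", "or", "not"}

def route_query(query: str) -> str:
    """Return 'deep' for complex queries, 'fast' for simple keywords."""
    words = query.lower().split()
    if len(words) > 10:
        return "deep"   # long queries -> Deep Research
    if QUESTION_WORDS.intersection(words):
        return "deep"   # question words -> Deep Research
    if LOGICAL_OPERATORS.intersection(words):
        return "deep"   # logical operators -> Deep Research
    return "fast"       # default: simple keywords -> Fast Search

print(route_query("Redis caching"))                    # fast
print(route_query("How to implement Redis caching?"))  # deep
```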
System Requirements:
- Python 3.9 or higher
- 8GB+ RAM (16GB recommended for better performance)
- (Optional) NVIDIA GPU for faster embeddings
Required Services:
- Ollama - Local LLM inference (mistral:7b or similar)
- ChromaDB - Vector database for semantic search
- Redis - Query caching (optional but recommended)
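Because Redis is optional, QueryMind degrades gracefully: if Redis is unreachable it falls back to an in-memory cache. A minimal sketch of that pattern (hypothetical class and function names, not QueryMind's actual cache module):

```python
# Hypothetical sketch of Redis-with-in-memory-fallback caching;
# QueryMind's actual cache module may differ.
import time

class InMemoryCache:
    """Minimal dict-based fallback cache with TTL."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() > expires_at:
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl=300):
        self._store[key] = (value, time.time() + ttl)

def make_cache(redis_url=None):
    """Try Redis first; fall back to the in-memory cache."""
    if redis_url:
        try:
            import redis
            client = redis.Redis.from_url(redis_url)
            client.ping()  # verify the server is reachable
            return client
        except Exception:
            pass  # fall through to in-memory cache
    return InMemoryCache()

cache = make_cache()  # no REDIS_URL -> in-memory fallback
cache.set("query:test", "cached result")
print(cache.get("query:test"))  # cached result
```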
Ollama provides local LLM inference for query analysis.
macOS / Linux:
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull the mistral model (7B parameters, ~4GB)
ollama pull mistral:7b

# Verify installation
ollama list
```

Windows:
- Download installer from https://ollama.com/download
- Run installer and follow prompts
- Open PowerShell and run:
```powershell
ollama pull mistral:7b
```
Verify Ollama is running:
```bash
curl http://localhost:11434/api/tags
# Should return a list of installed models
```

ChromaDB provides vector search capabilities.
Option A: Install as Python package (Recommended for development)
```bash
# ChromaDB will be installed automatically with QueryMind
# It runs in-process (no separate server needed)
```

Option B: Run ChromaDB server (Recommended for production)
```bash
# Install ChromaDB server
pip install chromadb

# Run ChromaDB server
chroma run --host localhost --port 8000

# Verify server is running
curl http://localhost:8000/api/v1/heartbeat
```

Redis provides query caching for better performance (73% cache hit rate).
macOS:
```bash
brew install redis
brew services start redis
```

Ubuntu/Debian:
```bash
sudo apt update
sudo apt install redis-server
sudo systemctl start redis
```

Windows:
```bash
# Download from https://github.com/microsoftarchive/redis/releases
# Or use WSL2 with the Ubuntu instructions above
```

Verify Redis:
```bash
redis-cli ping
# Should return: PONG
```

```bash
# Clone the repository
git clone https://github.com/rduffyuk/querymind.git
cd querymind

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install QueryMind with all dependencies
pip install .

# Or install in development mode
pip install -e ".[dev]"
```

Create a `.env` file from the example:

```bash
cp .env.example .env
```

Edit `.env` with your settings:
```bash
# Required - Path to your markdown documents
VAULT_PATH=/path/to/your/obsidian-vault

# ChromaDB settings
CHROMADB_URL=http://localhost:8000  # Or leave blank for in-process mode

# Redis settings (optional - will fall back to in-memory cache)
REDIS_URL=redis://localhost:6379

# Ollama settings
OLLAMA_API_URL=http://localhost:11434

# Optional - Web search API key (100 free queries/month)
SERPER_API_KEY=your-api-key-here

# Logging
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR
```

Run the test suite to verify everything is working:
```bash
# Run all tests
pytest tests/ -v
# Should see: 25 passed, 2 skipped
```

Test a simple query:

```python
from querymind import auto_search

# Simple test query
result = auto_search("test query", n_results=1)
print(f"Status: {result.status}")
print(f"Agent: {result.agent_type}")
```

For web search fallback functionality:
- Sign up at https://serper.dev
- Get your API key from the dashboard
- Add to `.env`: `SERPER_API_KEY=your-key-here`
- Free tier: 100 queries/month
- After the free tier: $0.30 per 1,000 queries
Obsidian is a powerful markdown editor that works well for managing the document vault that QueryMind searches. While not required, it provides a great interface for creating and organizing your knowledge base.
macOS:
```bash
# Download from the website
open https://obsidian.md/download

# Or install via Homebrew
brew install --cask obsidian
```

Linux:
```bash
# Download the AppImage from the website
wget https://github.com/obsidianmd/obsidian-releases/releases/download/v1.4.16/Obsidian-1.4.16.AppImage

# Make it executable and run
chmod +x Obsidian-1.4.16.AppImage
./Obsidian-1.4.16.AppImage

# Or install via Snap
sudo snap install obsidian --classic
```

Windows:
```bash
# Download the installer from the website
start https://obsidian.md/download

# Or install via Chocolatey
choco install obsidian
```

Set up your vault:
- Open Obsidian
- Create a new vault, or open an existing vault at the `VAULT_PATH` from your `.env`
- Start creating markdown documents
- QueryMind will automatically index and search these documents
Ollama connection failed:
```bash
# Check if Ollama is running
ollama list

# Restart Ollama
# macOS/Linux: sudo systemctl restart ollama
# Windows: restart the Ollama Desktop app
```

ChromaDB errors:
```bash
# If using server mode, check that it is running
curl http://localhost:8000/api/v1/heartbeat

# If in-process mode, ensure adequate RAM
# ChromaDB needs ~2-4GB for the mxbai-embed-large model
```

Redis not available:
```bash
# QueryMind will fall back to an in-memory cache
# To use Redis, ensure it's running:
redis-cli ping
```

```python
from querymind import auto_search

# Simple query (uses FastSearchAgent)
result = auto_search("Redis caching")
print(f"Found {result.result_count} results")
for r in result.results:
    print(f" - {r['file']}: {r['score']:.2f}")

# Complex query (uses DeepResearchAgent)
result = auto_search("How to implement Redis caching for APIs?")
print(f"Agent: {result.agent_type}")
print(f"Time: {result.elapsed_time:.2f}s")
```

```python
from querymind.agents.router import AgentRouter

# Initialize router with custom configuration
router = AgentRouter(
    model="mistral:7b",
    enable_web_fallback=True
)

# Execute search with verbose logging
result = router.search(
    query="Explain StatefulSet vs Deployment",
    n_results=10,
    verbose=True
)

# Get routing statistics
stats = router.get_stats()
print(f"Fast searches: {stats['fast_searches']}")
print(f"Deep searches: {stats['deep_searches']}")
```

```python
from querymind.agents.vault_search_agent_local import VaultSearchAgentLocal
from querymind.agents.web_search_client import WebSearchClient

# Use the vault search agent directly
vault_agent = VaultSearchAgentLocal(model="mistral:7b")
result = vault_agent.search("kubernetes deployment patterns")

# Use web search directly
web_client = WebSearchClient(api_key="your-key")
results = web_client.search_sync("latest FastAPI best practices", n_results=5)
```

Run the test suite:
```bash
# Run all tests
pytest tests/ -v

# Run a specific test file
pytest tests/test_router_basic.py -v

# Run with coverage
pytest tests/ --cov=querymind --cov-report=html
```

Test coverage:
- 27 tests total
- 25 passing (92.6%)
- 2 skipped (optional dependencies)
- test_imports_work.py - Verify all modules can be imported
- test_router_basic.py - Validate query routing logic and heuristics
- test_security_validation.py - Test input sanitization and injection protection
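As an illustration of what `test_security_validation.py` exercises, input sanitization along these lines might look like the following sketch (hypothetical function name and length limit, not the actual implementation):

```python
import re

MAX_QUERY_LENGTH = 1000  # hypothetical limit for illustration

def sanitize_query(query: str) -> str:
    """Reject empty or oversized queries and strip control characters."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("Query must be a non-empty string")
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError("Query exceeds maximum length")
    # Strip control characters that could corrupt logs or prompts
    return re.sub(r"[\x00-\x1f\x7f]", "", query).strip()

print(sanitize_query("hello\x00 world"))  # hello world
```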
```bash
# Clone repository
git clone https://github.com/rduffyuk/querymind.git
cd querymind

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/
```

```bash
# Format code
black querymind/ tests/

# Lint code
ruff querymind/ tests/
```

```
querymind/
├── querymind/
│   ├── __init__.py                      # Package initialization
│   ├── core/                            # Core functionality
│   │   ├── __init__.py
│   │   ├── config.py                    # Configuration management
│   │   ├── logging_config.py            # Structured logging (NEW)
│   │   ├── embeddings.py                # ChromaDB embeddings
│   │   ├── cache.py                     # Query caching (Redis)
│   │   └── conversation_memory.py       # Conversation stub (NEW)
│   ├── agents/                          # Multi-agent system
│   │   ├── __init__.py
│   │   ├── base_agent.py                # Abstract base agent
│   │   ├── fast_search_agent.py         # Quick keyword search
│   │   ├── deep_research_agent.py       # LLM-powered search
│   │   ├── vault_search_agent_local.py  # Ollama integration (NEW)
│   │   ├── web_search_client.py         # Web search fallback (NEW)
│   │   └── router.py                    # Intelligent routing
│   └── mcp/                             # Model Context Protocol
│       └── server.py                    # FastMCP server
├── tests/                               # Test suite (NEW)
│   ├── __init__.py
│   ├── test_imports_work.py             # Import verification
│   ├── test_router_basic.py             # Routing logic tests
│   └── test_security_validation.py      # Security tests
├── pyproject.toml                       # Package configuration (NEW)
├── requirements.txt                     # Dependencies
├── .env.example                         # Environment template
├── .gitignore                           # Git ignore rules
├── LICENSE.txt                          # MIT License
└── README.md                            # This file
```
QueryMind uses environment variables for configuration. See `.env.example` for all available options:
| Variable | Description | Default |
|---|---|---|
| `VAULT_PATH` | Path to your markdown documents | `/vault` |
| `CHROMADB_URL` | ChromaDB HTTP endpoint | `http://localhost:8000` |
| `REDIS_URL` | Redis cache endpoint | `redis://localhost:6379` |
| `OLLAMA_API_URL` | Ollama LLM endpoint | `http://localhost:11434` |
| `LOG_LEVEL` | Logging level (DEBUG/INFO/WARNING/ERROR) | `INFO` |

| Variable | Description | Default |
|---|---|---|
| `SERPER_API_KEY` | Serper.dev API key for web search | None |
| `DISABLE_WEB_SEARCH` | Disable web fallback | `false` |
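The variables above are typically read once at startup with their defaults applied. A sketch of what that might look like (the real `querymind/core/config.py` may differ):

```python
import os

# Hypothetical sketch of reading the variables above with their defaults;
# the actual config module may structure this differently.
VAULT_PATH = os.getenv("VAULT_PATH", "/vault")
CHROMADB_URL = os.getenv("CHROMADB_URL", "http://localhost:8000")
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
OLLAMA_API_URL = os.getenv("OLLAMA_API_URL", "http://localhost:11434")
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
SERPER_API_KEY = os.getenv("SERPER_API_KEY")  # None disables web search
DISABLE_WEB_SEARCH = os.getenv("DISABLE_WEB_SEARCH", "false").lower() == "true"
```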
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Run tests (`pytest tests/`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow PEP 8 style guide
- Use Black for code formatting
- Add tests for new features
- Update documentation as needed
- Use structured logging (not print statements)
- Intelligent query routing with 7 heuristics
- FastSearch, DeepResearch, WebSearch agents
- Ollama integration for intent analysis
- ChromaDB vector search
- Structured logging system
- Comprehensive test suite (27 tests)
- Security hardening and input validation
- Enhanced caching with gather cache
- Async support for concurrent searches
- Connection pooling for ChromaDB
- Advanced metrics and monitoring
- REST API endpoints
- Web UI for query testing
- Complete conversation memory implementation
- Hot-reload for configuration changes
- Docker Compose deployment
- Kubernetes deployment guides
- Multi-language support
This project is licensed under the MIT License - see the LICENSE.txt file for details.
QueryMind builds on excellent open-source projects:
- ChromaDB - Vector database for semantic search
- Ollama - Local LLM inference
- Serper.dev - Web search API
- FastMCP - Model Context Protocol server
QueryMind - Intelligent search for your knowledge base
Made with ❤️ by Ryan Duffy