QueryMind

Multi-agent RAG system with intelligent query routing, semantic search, and web fallback

Python 3.9+ | License: MIT | Tests | Code style: black

QueryMind is a lightweight, production-ready Retrieval-Augmented Generation (RAG) system that combines ChromaDB vector search, Ollama LLM intelligence, and web search capabilities to provide accurate, context-aware responses from your knowledge base.

✨ Features

  • 🤖 Intelligent Query Routing - Automatically routes queries to the optimal search strategy
  • 🔍 Semantic Search - ChromaDB-powered vector search with mxbai-embed-large embeddings
  • 💡 LLM Intent Analysis - Ollama integration for query understanding and keyword extraction
  • 🌐 Web Search Fallback - Seamless fallback to Serper.dev when the vault has no results
  • 📊 Structured Logging - Environment-based logging with debug, info, warning, and error levels
  • 🛡️ Security Hardened - Input sanitization, injection protection, and validation
  • 🧪 Fully Tested - 27 tests covering imports, routing logic, and security
  • 📦 Pip Installable - Standard Python package with pyproject.toml

πŸ—οΈ Architecture

QueryMind implements a multi-agent architecture with intelligent routing:

User Query → Router → [ Fast Search Agent     ] → Results
                      [ Deep Research Agent   ]
                      [ Web Search (fallback) ]

Agent Types

  1. FastSearchAgent - Direct keyword matching for simple queries (<1s)
  2. DeepResearchAgent - Ollama-powered semantic analysis for complex questions (~10s)
  3. WebSearchClient - Serper.dev API integration for external knowledge

Query Routing Heuristics

Queries are automatically routed based on the following heuristics (sketched in code below):

  • Length: >10 words → Deep Research
  • Question words: "how", "why", "what", "explain" → Deep Research
  • Logical operators: "and", "or", "not" → Deep Research
  • Default: Simple keywords → Fast Search
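
A minimal sketch of how these heuristics could be expressed in code (illustrative only; the function name and return values are assumptions, not QueryMind's internal API):

QUESTION_WORDS = {"how", "why", "what", "explain"}
LOGICAL_OPERATORS = {"and", "or", "not"}

def route_query(query: str) -> str:
    """Illustrative routing heuristic: returns 'deep' or 'fast'."""
    words = query.lower().split()
    if len(words) > 10:
        return "deep"   # long queries benefit from semantic analysis
    if QUESTION_WORDS & set(words):
        return "deep"   # question words suggest an explanatory answer is needed
    if LOGICAL_OPERATORS & set(words):
        return "deep"   # logical operators suggest a multi-concept query
    return "fast"       # default: simple keyword lookup

print(route_query("Redis caching"))                             # fast
print(route_query("How to implement Redis caching for APIs?"))  # deep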

🚀 Quick Start

Prerequisites

System Requirements:

  • Python 3.9 or higher
  • 8GB+ RAM (16GB recommended for better performance)
  • (Optional) NVIDIA GPU for faster embeddings

Required Services:

  • Ollama - Local LLM inference (mistral:7b or similar)
  • ChromaDB - Vector database for semantic search
  • Redis - Query caching (optional but recommended)

Step 1: Install Ollama

Ollama provides local LLM inference for query analysis.

macOS / Linux:

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull the mistral model (7B parameters, ~4GB)
ollama pull mistral:7b

# Verify installation
ollama list

Windows:

  1. Download installer from https://ollama.com/download
  2. Run installer and follow prompts
  3. Open PowerShell and run: ollama pull mistral:7b

Verify Ollama is running:

curl http://localhost:11434/api/tags
# Should return list of installed models
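
The same check can be run from Python against the /api/tags endpoint shown above (requires the requests package; this is a quick sanity check, not part of QueryMind itself):

import requests

# Ask Ollama's local API which models are installed
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Installed models:", models)  # should include mistral:7b after the pull above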

Step 2: Install ChromaDB

ChromaDB provides vector search capabilities.

Option A: Install as Python package (Recommended for development)

# ChromaDB will be installed automatically with QueryMind
# It runs in-process (no separate server needed)

Option B: Run ChromaDB server (Recommended for production)

# Install ChromaDB server
pip install chromadb

# Run ChromaDB server
chroma run --host localhost --port 8000

# Verify server is running
curl http://localhost:8000/api/v1/heartbeat
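
To confirm ChromaDB is usable from Python in either mode, here is a minimal smoke test (the collection name is just an example, and it uses ChromaDB's default embedding function rather than the mxbai-embed-large embeddings QueryMind configures):

import chromadb

# Option A: in-process client with on-disk persistence
client = chromadb.PersistentClient(path="./chroma_data")

# Option B: connect to the server started above instead
# client = chromadb.HttpClient(host="localhost", port=8000)

print(client.heartbeat())  # returns a timestamp if the client is healthy

# Create a throwaway collection, add one document, and query it
collection = client.get_or_create_collection("smoke_test")
collection.add(ids=["doc1"], documents=["Redis caching for APIs"])
print(collection.query(query_texts=["caching"], n_results=1))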

Step 3: Install Redis (Optional)

Redis provides query caching for better performance (73% cache hit rate).

macOS:

brew install redis
brew services start redis

Ubuntu/Debian:

sudo apt update
sudo apt install redis-server
sudo systemctl start redis

Windows:

# Download from https://github.com/microsoftarchive/redis/releases
# Or use WSL2 with Ubuntu instructions above

Verify Redis:

redis-cli ping
# Should return: PONG
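
Or from Python, using the redis package and the same URL configured later in .env (a connectivity check only; QueryMind wires up its own cache from REDIS_URL):

import redis

# Connect using the same URL as REDIS_URL in .env
r = redis.Redis.from_url("redis://localhost:6379")
print(r.ping())  # True if Redis is reachable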

Step 4: Install QueryMind

# Clone the repository
git clone https://github.com/rduffyuk/querymind.git
cd querymind

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install QueryMind with all dependencies
pip install .

# Or install in development mode
pip install -e ".[dev]"

Step 5: Configure Environment

Create a .env file from the example:

cp .env.example .env

Edit .env with your settings:

# Required - Path to your markdown documents
VAULT_PATH=/path/to/your/obsidian-vault

# ChromaDB settings
CHROMADB_URL=http://localhost:8000  # Or leave blank for in-process mode

# Redis settings (optional - will fall back to in-memory cache)
REDIS_URL=redis://localhost:6379

# Ollama settings
OLLAMA_API_URL=http://localhost:11434

# Optional - Web search API key (100 free queries/month)
SERPER_API_KEY=your-api-key-here

# Logging
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR
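
These are ordinary environment variables, so they can also be inspected or set programmatically. A minimal sketch using python-dotenv (an illustration only; QueryMind's own loader lives in querymind/core/config.py and may read these differently):

import os
from dotenv import load_dotenv

# Load variables from .env in the current directory into the process environment
load_dotenv()

vault_path = os.getenv("VAULT_PATH", "/vault")
ollama_url = os.getenv("OLLAMA_API_URL", "http://localhost:11434")
log_level = os.getenv("LOG_LEVEL", "INFO")
print(f"Vault: {vault_path} | Ollama: {ollama_url} | Log level: {log_level}")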

Step 6: Verify Installation

Run the test suite to verify everything is working:

# Run all tests
pytest tests/ -v

# Should see: 25 passed, 2 skipped

Test a simple query:

from querymind import auto_search

# Simple test query
result = auto_search("test query", n_results=1)
print(f"Status: {result.status}")
print(f"Agent: {result.agent_type}")

Optional: Get Serper.dev API Key

For web search fallback functionality:

  1. Sign up at https://serper.dev
  2. Get your API key from the dashboard
  3. Add to .env: SERPER_API_KEY=your-key-here
  4. Free tier: 100 queries/month
  5. After free tier: $0.30 per 1,000 queries
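
Once the key is in .env (and exported into the environment), a quick way to confirm the fallback works is to call the web client directly, as in the Direct Agent Access example further down; reading the key from the environment avoids hard-coding it:

import os
from querymind.agents.web_search_client import WebSearchClient

# Read the key configured in .env rather than hard-coding it
web_client = WebSearchClient(api_key=os.environ["SERPER_API_KEY"])
results = web_client.search_sync("latest FastAPI best practices", n_results=3)
print(results)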

Optional: Install Obsidian for Document Management

Obsidian is a powerful markdown editor that works well for managing the document vault that QueryMind searches. While not required, it provides a great interface for creating and organizing your knowledge base.

macOS:

# Download from website
open https://obsidian.md/download

# Or install via Homebrew
brew install --cask obsidian

Linux:

# Download AppImage from website
wget https://github.com/obsidianmd/obsidian-releases/releases/download/v1.4.16/Obsidian-1.4.16.AppImage

# Make executable and run
chmod +x Obsidian-1.4.16.AppImage
./Obsidian-1.4.16.AppImage

# Or install via Snap
sudo snap install obsidian --classic

Windows:

# Download installer from website
start https://obsidian.md/download

# Or install via Chocolatey
choco install obsidian

Setup your vault:

  1. Open Obsidian
  2. Create a new vault, or open an existing vault at the VAULT_PATH from your .env
  3. Start creating markdown documents
  4. QueryMind will automatically index and search these documents
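
For example, after creating a note in the vault you can query it from Python (a sketch; how quickly new documents are picked up depends on QueryMind's indexing configuration):

from pathlib import Path
from querymind import auto_search

# Write a small markdown note into the vault configured as VAULT_PATH in .env
vault = Path("/path/to/your/obsidian-vault")
note = vault / "redis-caching.md"
note.write_text("# Redis caching\n\nNotes on caching API responses with Redis.\n")

# Search the vault for the new note once it has been indexed
result = auto_search("Redis caching", n_results=3)
print(result.result_count, result.agent_type)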

Troubleshooting

Ollama connection failed:

# Check if Ollama is running
ollama list

# Restart Ollama
# Linux: sudo systemctl restart ollama
# macOS/Windows: restart the Ollama desktop app

ChromaDB errors:

# If using server mode, check if running
curl http://localhost:8000/api/v1/heartbeat

# If using in-process mode, ensure adequate RAM
# ChromaDB needs ~2-4GB for the mxbai-embed-large model

Redis not available:

# QueryMind will fall back to in-memory cache
# To use Redis, ensure it's running:
redis-cli ping

📖 Usage

Basic Search

from querymind import auto_search

# Simple query (uses FastSearchAgent)
result = auto_search("Redis caching")
print(f"Found {result.result_count} results")
for r in result.results:
    print(f"  - {r['file']}: {r['score']:.2f}")

# Complex query (uses DeepResearchAgent)
result = auto_search("How to implement Redis caching for APIs?")
print(f"Agent: {result.agent_type}")
print(f"Time: {result.elapsed_time:.2f}s")

Advanced Usage

from querymind.agents.router import AgentRouter

# Initialize router with custom configuration
router = AgentRouter(
    model="mistral:7b",
    enable_web_fallback=True
)

# Execute search with verbose logging
result = router.search(
    query="Explain StatefulSet vs Deployment",
    n_results=10,
    verbose=True
)

# Get routing statistics
stats = router.get_stats()
print(f"Fast searches: {stats['fast_searches']}")
print(f"Deep searches: {stats['deep_searches']}")

Direct Agent Access

from querymind.agents.vault_search_agent_local import VaultSearchAgentLocal
from querymind.agents.web_search_client import WebSearchClient

# Use vault search agent directly
vault_agent = VaultSearchAgentLocal(model="mistral:7b")
result = vault_agent.search("kubernetes deployment patterns")

# Use web search directly
web_client = WebSearchClient(api_key="your-key")
results = web_client.search_sync("latest FastAPI best practices", n_results=5)

🧪 Testing

Run the test suite:

# Run all tests
pytest tests/ -v

# Run specific test file
pytest tests/test_router_basic.py -v

# Run with coverage
pytest tests/ --cov=querymind --cov-report=html

Test coverage:

  • ✅ 27 tests total
  • ✅ 25 passing (92.6%)
  • ⏭️ 2 skipped (optional dependencies)

Test Suite

  • test_imports_work.py - Verify all modules can be imported
  • test_router_basic.py - Validate query routing logic and heuristics
  • test_security_validation.py - Test input sanitization and injection protection

🛠️ Development

Setup Development Environment

# Clone repository
git clone https://github.com/rduffyuk/querymind.git
cd querymind

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

Code Quality

# Format code
black querymind/ tests/

# Lint code
ruff check querymind/ tests/

📋 Project Structure

querymind/
├── querymind/
│   ├── __init__.py           # Package initialization
│   ├── core/                 # Core functionality
│   │   ├── __init__.py
│   │   ├── config.py         # Configuration management
│   │   ├── logging_config.py # Structured logging (NEW)
│   │   ├── embeddings.py     # ChromaDB embeddings
│   │   ├── cache.py          # Query caching (Redis)
│   │   └── conversation_memory.py  # Conversation stub (NEW)
│   ├── agents/               # Multi-agent system
│   │   ├── __init__.py
│   │   ├── base_agent.py     # Abstract base agent
│   │   ├── fast_search_agent.py    # Quick keyword search
│   │   ├── deep_research_agent.py  # LLM-powered search
│   │   ├── vault_search_agent_local.py  # Ollama integration (NEW)
│   │   ├── web_search_client.py    # Web search fallback (NEW)
│   │   └── router.py         # Intelligent routing
│   └── mcp/                  # Model Context Protocol
│       └── server.py         # FastMCP server
├── tests/                    # Test suite (NEW)
│   ├── __init__.py
│   ├── test_imports_work.py  # Import verification
│   ├── test_router_basic.py  # Routing logic tests
│   └── test_security_validation.py  # Security tests
├── pyproject.toml            # Package configuration (NEW)
├── requirements.txt          # Dependencies
├── .env.example              # Environment template
├── .gitignore                # Git ignore rules
├── LICENSE.txt               # MIT License
└── README.md                 # This file

βš™οΈ Configuration

QueryMind uses environment variables for configuration. See .env.example for all available options:

Core Settings

Variable         Description                                 Default
VAULT_PATH       Path to your markdown documents             /vault
CHROMADB_URL     ChromaDB HTTP endpoint                      http://localhost:8000
REDIS_URL        Redis cache endpoint                        redis://localhost:6379
OLLAMA_API_URL   Ollama LLM endpoint                         http://localhost:11434
LOG_LEVEL        Logging level (DEBUG/INFO/WARNING/ERROR)    INFO

Optional Features

Variable             Description                          Default
SERPER_API_KEY       Serper.dev API key for web search    None
DISABLE_WEB_SEARCH   Disable web fallback                 false
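
DISABLE_WEB_SEARCH turns the fallback off globally; the same behaviour can be chosen per router via the enable_web_fallback flag shown in Advanced Usage above:

from querymind.agents.router import AgentRouter

# Per-router equivalent of DISABLE_WEB_SEARCH=true in .env:
# skip the Serper.dev fallback even when the vault returns no results
router = AgentRouter(model="mistral:7b", enable_web_fallback=False)
result = router.search(query="StatefulSet vs Deployment", n_results=5)
print(result.agent_type)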

🤝 Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests (pytest tests/)
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Coding Standards

  • Follow PEP 8 style guide
  • Use Black for code formatting
  • Add tests for new features
  • Update documentation as needed
  • Use structured logging (not print statements)

πŸ—ΊοΈ Roadmap

Current (v0.1.0)

  • Intelligent query routing with 7 heuristics
  • FastSearch, DeepResearch, WebSearch agents
  • Ollama integration for intent analysis
  • ChromaDB vector search
  • Structured logging system
  • Comprehensive test suite (27 tests)
  • Security hardening and input validation

Planned (v0.2.0)

  • Enhanced caching with gather cache
  • Async support for concurrent searches
  • Connection pooling for ChromaDB
  • Advanced metrics and monitoring
  • REST API endpoints
  • Web UI for query testing

Future (v1.0.0)

  • Complete conversation memory implementation
  • Hot-reload for configuration changes
  • Docker Compose deployment
  • Kubernetes deployment guides
  • Multi-language support

πŸ“ License

This project is licensed under the MIT License - see the LICENSE.txt file for details.

πŸ™ Acknowledgments

QueryMind builds on excellent open-source projects and services:

  • Ollama - local LLM inference
  • ChromaDB - vector database for semantic search
  • Redis - query caching
  • Serper.dev - web search API
  • Obsidian - markdown knowledge base

QueryMind - Intelligent search for your knowledge base

Made with ❀️ by Ryan Duffy
