This repository contains my AI-powered web app. It’s still under active development, but there is already a working alpha version.
### Auth Service
Responsibility: Access control and identity management.
- Authentication: JWT implementation (access & refresh tokens) signed with the asymmetric RS256 algorithm.
- Security: Tokens are stored exclusively in HttpOnly cookies to mitigate XSS attacks.
- User Management: Handles user registration, login, and profile management in PostgreSQL.
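The access/refresh token flow above can be sketched as follows. This is a minimal illustration assuming PyJWT and the `cryptography` package; `issue_token_pair`, the TTL values, and the claim names are illustrative, not the service's actual API.

```python
# Sketch of RS256-signed access/refresh token issuance (assumed helper names).
import datetime

import jwt  # PyJWT
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

# Generate an RSA key pair; in production the keys would be loaded from disk.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
private_pem = private_key.private_bytes(
    serialization.Encoding.PEM,
    serialization.PrivateFormat.PKCS8,
    serialization.NoEncryption(),
)
public_pem = private_key.public_key().public_bytes(
    serialization.Encoding.PEM,
    serialization.PublicFormat.SubjectPublicKeyInfo,
)

def issue_token_pair(user_id: str) -> dict:
    """Sign a short-lived access token and a longer-lived refresh token."""
    now = datetime.datetime.now(datetime.timezone.utc)

    def sign(ttl: datetime.timedelta, kind: str) -> str:
        payload = {"sub": user_id, "type": kind, "iat": now, "exp": now + ttl}
        return jwt.encode(payload, private_pem, algorithm="RS256")

    return {
        "access": sign(datetime.timedelta(minutes=15), "access"),
        "refresh": sign(datetime.timedelta(days=7), "refresh"),
    }

tokens = issue_token_pair("user-42")
# Verification needs only the public key, so other services never hold the private key.
claims = jwt.decode(tokens["access"], public_pem, algorithms=["RS256"])
```

Because RS256 is asymmetric, downstream services can verify tokens with the published public key alone; the tokens themselves would then be set as HttpOnly cookies rather than returned in the response body.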
### Chat Service
Responsibility: Chat business logic and data persistence.
- Persistence: Stores chat structure, metadata, and full message history.
- Data Integrity: Uses Alembic for database schema versioning and migrations.
- Orchestration Logic: Receives client requests, validates them, and coordinates interactions with the LLM Service to generate responses.
### LLM Service
Responsibility: Intelligent request orchestration, context management, and response generation.
- LLM Runtime: Direct integration with Ollama for local text generation.
- Orchestrator Agent: Acts as the intelligent core, analyzing user intent and selecting the optimal processing strategy:
  - Deep Research: Executes a comprehensive pipeline including data collection via the SearXNG API, dynamic RAG over retrieved web pages (parsing, chunking, reranking), and synthesis of the final response based on verified findings.
  - Knowledge Base: Works with local documents through a persistent RAG pipeline.
  - Direct Response: Generates answers directly from the model's internal weights (Llama 3.1) when additional context is not required.
- RAG Pipeline: Built on Unstructured for intelligent document parsing and chunking, combined with vectorization and semantic search in Qdrant. To minimize hallucinations and maximize accuracy, a reranking stage filters and prioritizes only the most relevant text segments before they are fed into the LLM.
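The orchestrator's strategy selection can be sketched as a routing function. This is a deliberately simplified illustration: the keyword heuristics below stand in for the LLM-based intent analysis, and `Strategy`/`select_strategy` are assumed names, not the service's real API.

```python
# Illustrative routing between the three processing strategies described above.
from enum import Enum

class Strategy(Enum):
    DEEP_RESEARCH = "deep_research"    # SearXNG search + dynamic RAG over web pages
    KNOWLEDGE_BASE = "knowledge_base"  # persistent RAG over local documents
    DIRECT_RESPONSE = "direct"         # answer from model weights alone

def select_strategy(query: str, has_attached_docs: bool) -> Strategy:
    """Pick a strategy; a real orchestrator would ask the LLM to classify intent."""
    if has_attached_docs:
        return Strategy.KNOWLEDGE_BASE
    if any(k in query.lower() for k in ("latest", "news", "today", "research")):
        return Strategy.DEEP_RESEARCH
    return Strategy.DIRECT_RESPONSE
```

Keeping the strategy decision in one place means the downstream RAG and generation stages stay identical regardless of how intent detection is implemented.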
### Infrastructure
- Gateway (NGINX): Acts as an API Gateway, routing external requests to the appropriate microservices.
- Data Storage: Hybrid persistence model — relational PostgreSQL for structured data and vector-based Qdrant for AI context storage.
- Containerization: The entire stack is deployed with Docker, ensuring service isolation and consistent environments.
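The deployment described above might be laid out roughly like this. A hedged Compose sketch only: the service names, directories, and images are assumptions, not the repository's actual configuration.

```yaml
# Illustrative docker-compose layout (assumed names and images).
services:
  gateway:
    image: nginx:alpine          # routes external traffic to the services below
    ports: ["80:80"]
    depends_on: [auth, chat, llm]
  auth:
    build: ./auth_service        # JWT auth, user management
  chat:
    build: ./chat_service        # chat logic, Alembic migrations
    depends_on: [postgres]
  llm:
    build: ./llm_service         # orchestrator agent, RAG pipeline
    depends_on: [ollama, qdrant]
  postgres:
    image: postgres:16           # relational storage for structured data
  qdrant:
    image: qdrant/qdrant         # vector storage for AI context
  ollama:
    image: ollama/ollama         # local LLM runtime (Llama 3.1)
```

Each service runs in its own container, so the hybrid persistence layer (PostgreSQL + Qdrant) and the LLM runtime can be scaled or restarted independently.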
Created by Denys Bondarchuk. Feel free to reach out or contribute to the project!