SpotDocAI

AI-Powered Document Knowledge Management and Q&A System

Description

SpotDocAI is a full-stack proof-of-concept platform for scalable document ingestion, AI-based semantic search, and natural language question answering. Users upload collections of documents (ZIPs), which background workers process and index asynchronously. Users can then query a collection and receive context-aware answers, complete with source citations.

Key Features

Multi-Document Upload: Upload multiple documents or entire ZIP collections of various file formats.
Asynchronous Processing: Background workers unzip, process, and index documents for AI retrieval.
AI-Powered Q&A: Ask natural language questions and receive AI-generated answers grounded in the retrieved source passages, which is intended to minimize hallucinations.
Source Citations: Answers include references to relevant documents.
Multi-User & Multi-Collection Support: Each user can maintain separate collections.
Scalable Architecture: Containerized services using Docker, Redis queues, and MinIO/S3-compatible storage.
Streaming Responses: Real-time, low-latency answer streaming.
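
To make the streaming and citation features concrete, here is a minimal Python sketch of how a client could consume a streamed answer token by token and pick up the cited documents at the end. It is not the project's code (which is C# and Next.js). The event shape, the `fake_answer_stream` stand-in, and the sample file names are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical event shape (an assumption, not the project's wire format):
# ("token", str) carries a piece of answer text,
# ("sources", list of str) carries the cited document names.
Event = Tuple[str, object]


def fake_answer_stream() -> Iterator[Event]:
    """Stand-in for the backend stream: answer tokens first, then citations."""
    for token in ["Invoices ", "are due ", "in 30 days."]:
        yield ("token", token)
    yield ("sources", ["contract.pdf", "terms.docx"])


def collect_answer(
    stream: Iterable[Event],
    on_partial: Callable[[str], None],
) -> Tuple[str, List[str]]:
    """Accumulate tokens, calling on_partial after each one so a UI could re-render."""
    text = ""
    sources: List[str] = []
    for kind, payload in stream:
        if kind == "token":
            text += str(payload)
            on_partial(text)
        elif kind == "sources":
            sources = list(payload)  # type: ignore[arg-type]
    return text, sources


if __name__ == "__main__":
    answer, cited = collect_answer(fake_answer_stream(), lambda partial: print(partial))
    print("Sources:", ", ".join(cited))
```

Each partial string is handed to the callback as soon as it grows, so the reader sees the answer forming rather than waiting for the full response.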

Tech Stack

Backend: C# (.NET 10.0), ASP.NET Core, Kernel Memory (retrieval), Amazon S3 SDK (file uploads to MinIO), Redis queues (enqueueing ingestion jobs)
Frontend: Next.js
Worker Service: C# (.NET 10.0), ASP.NET Core, Kernel Memory (document upload and embeddings), Amazon S3 SDK (file downloads from MinIO), Redis queues (consuming ingestion jobs)
Storage: MinIO (S3-compatible object storage), Postgres (vector database)
Containerization: Docker, docker-compose
AI/RAG: LLMs for embedding generation and Q&A (Ollama), Kernel Memory for RAG
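
As a rough illustration of what the embedding and vector-store components contribute, the Python sketch below ranks stored chunks by cosine similarity to a query embedding and returns the best matches along with their source documents. In the actual stack this work is delegated to Kernel Memory and Postgres. The tiny three-dimensional vectors, the chunk dictionaries, and the `top_k` helper are hypothetical and exist only to show the idea.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], chunks: List[Dict], k: int = 2) -> List[Dict]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query, chunk["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: real embeddings have hundreds of dimensions and come from a model.
CHUNKS = [
    {"document": "contract.pdf", "embedding": [0.9, 0.1, 0.0]},
    {"document": "menu.txt", "embedding": [0.0, 0.2, 0.9]},
    {"document": "terms.docx", "embedding": [0.8, 0.3, 0.1]},
]

if __name__ == "__main__":
    hits = top_k([1.0, 0.2, 0.0], CHUNKS)
    # The document names of the hits are what an answer would cite as sources.
    print([hit["document"] for hit in hits])
```

The retrieved chunks would then be passed to the LLM as context, and their document names become the citations attached to the answer.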

Architecture Diagram

Frontend (Next.js)
        |
        v
Backend API (.NET 10.0, ASP.NET Core)
        |  \
        |   --> MinIO (S3-compatible storage) [upload ZIPs & files]
        |  
        --> Redis Queues [enqueue ingestion jobs]
        |
Worker Service (.NET 10.0, ASP.NET Core)
        |  \
        |   --> MinIO [download ZIPs, extract files]
        |   --> Kernel Memory [upload files, generate embeddings]
        |
        --> Postgres (Vector Store) [embeddings & retrieval]
        |
Kernel Memory (RAG & Retrieval)
        |
        v
LLMs (Ollama) [embedding generation & Q&A]
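
The upload-then-ingest flow in the diagram can be sketched in a few lines of Python. This is a simplified, in-memory model, not the project's C# implementation: a dictionary stands in for MinIO, a `deque` for the Redis queue, and another dictionary for the vector store. All function names and the key layout are assumptions chosen for illustration.

```python
import io
import zipfile
from collections import deque
from typing import Deque, Dict, Tuple

object_store: Dict[str, bytes] = {}             # stand-in for MinIO: key -> ZIP bytes
job_queue: Deque[Dict[str, str]] = deque()      # stand-in for the Redis queue
index: Dict[Tuple[str, str, str], str] = {}     # stand-in for the vector store


def upload_collection(user: str, collection: str, zip_bytes: bytes) -> str:
    """API side: store the archive, then enqueue an ingestion job and return at once."""
    key = f"{user}/{collection}.zip"
    object_store[key] = zip_bytes
    job_queue.append({"user": user, "collection": collection, "key": key})
    return key


def run_worker_once() -> int:
    """Worker side: take one job, unzip the archive, and index each file.

    Returns the number of files indexed (0 if the queue was empty).
    """
    if not job_queue:
        return 0
    job = job_queue.popleft()
    indexed = 0
    with zipfile.ZipFile(io.BytesIO(object_store[job["key"]])) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            # A real worker would chunk the text and store embeddings here.
            index[(job["user"], job["collection"], name)] = text
            indexed += 1
    return indexed


def make_zip(files: Dict[str, str]) -> bytes:
    """Helper for the example: build a ZIP archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

Because the upload call only stores the archive and enqueues a job, the API can respond immediately while the worker processes the archive in the background. Keying the index by user and collection mirrors the multi-user, multi-collection separation described in the features.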

Key Learnings

Distributed system design with background workers and queues
Integration of AI embeddings for semantic search and retrieval
Scalable cloud storage patterns with MinIO / S3
Streaming data to the frontend efficiently
Full-stack development (Next.js + .NET backend)

Screenshots

Homepage

File Upload Page

Chat Page

Real-Time Streamed Responses with Sources
