fdds-rag

Project Structure

.
├── docker-compose.yml
├── Dockerfile
├── .env
├── pyproject.toml
├── uv.lock
├── .pre-commit-config.yaml
├── README.md

├── data/
│   └── qdrant/

├── src/
│   ├── fdds/
│   │   ├── __init__.py
│   │   ├── config.py
│   │   ├── handlers.py
│   │   ├── inference.py
│   │   ├── evaluation.py
│   │   ├── evaluation_pipeline.py
│   │   ├── manage_pdfs.py
│   │   └── reranker.py
│   ├── ui-build/
│   └── chat.py

Key Components

docker-compose.yml: Defines and manages the services: Qdrant (vector DB), the API backend, and Jaeger (for monitoring and tracing).
Dockerfile: Builds the backend API service that serves the RAG-based chatbot.
pyproject.toml and uv.lock: Configuration files for uv and project dependencies.
.pre-commit-config.yaml: Configuration for pre-commit hooks to ensure code quality.
data/qdrant: Persistent volume for Qdrant's vector data storage.
src/fdds/inference.py: Contains methods that process a query and generate responses from retrieved context using RAG.
src/fdds/manage_pdfs.py: Script that ingests PDF files into Qdrant, or deletes them from it, based on a list of URLs.
src/fdds/evaluation.py: Evaluates the RAG system using the pipeline defined in src/fdds/evaluation_pipeline.py (requires NEPTUNE_API_KEY).
src/fdds/config.py: Holds configuration settings for the project.
src/ui-build: Precompiled frontend UI assets for the chatbot interface.
src/chat.py: Contains the core MyChat class responsible for managing conversation flow.

Prerequisites

Python 3.11+
Docker and Docker Compose
Pre-commit
uv (https://docs.astral.sh/uv/)

Setup and Installation

1. Clone the repository:

git clone git@github.com:deepsense-ai/fdds-rag.git
cd fdds-rag

2. Install and set up using uv:

uv python install 3.11
uv sync

3. Install pre-commit:

pre-commit install

4. Start the system:

Use the automated startup script to launch the system. The script will handle environment configuration and service startup:

./start.sh

Startup Script Options:

--help - Show all available options
--with-ingest - Include ingestion service to process PDFs
--with-ingest-file FILE - Use custom PDF file list for ingestion
--detached - Run services in background mode
--jaeger - Enable Jaeger tracing
--port PORT - Set API port (default: 8000)
--host HOST - Set API host (default: 0.0.0.0)
--data-path PATH - Set data mount path (default: ./app-data)
--env-file FILE - Load environment variables from file
--env KEY=VALUE - Set individual environment variables

Examples:

# Basic startup with ingestion
./start.sh --with-ingest

# Custom port and detached mode
./start.sh --with-ingest --port 9000 --detached

# Use custom PDF list
./start.sh --with-ingest-file my-pdfs.txt

# Load environment from file
./start.sh --env-file .env.prod --with-ingest
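
When the system is launched from automation rather than by hand, the same flags can be assembled programmatically. The following is a minimal sketch, not part of the repository: build_start_command is a hypothetical helper, and it only uses the flags documented above.

```python
def build_start_command(with_ingest=False, ingest_file=None, port=None,
                        host=None, data_path=None, env_file=None,
                        env=None, jaeger=False, detached=False):
    """Assemble an argument list for ./start.sh (hypothetical helper)."""
    cmd = ["./start.sh"]
    if with_ingest:
        cmd.append("--with-ingest")
    if ingest_file is not None:
        cmd += ["--with-ingest-file", str(ingest_file)]
    if port is not None:
        cmd += ["--port", str(port)]
    if host is not None:
        cmd += ["--host", host]
    if data_path is not None:
        cmd += ["--data-path", str(data_path)]
    if env_file is not None:
        cmd += ["--env-file", str(env_file)]
    # Each KEY=VALUE pair becomes its own --env argument.
    for key, value in (env or {}).items():
        cmd += ["--env", f"{key}={value}"]
    if jaeger:
        cmd.append("--jaeger")
    if detached:
        cmd.append("--detached")
    return cmd


# Mirrors the "custom port and detached mode" example above.
print(build_start_command(with_ingest=True, port=9000, detached=True))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository root.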

The script will:

Prompt for OpenAI API key if not found in environment
Generate secure API keys for internal services
Create the data directory and environment configuration
Start Docker services (Qdrant, API, and optionally Jaeger/Ingestion)

Note: Ensure Docker is running before executing the script.
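
How start.sh generates its internal API keys is not described here. As an illustration only, a cryptographically secure key can be produced with Python's standard secrets module; generate_api_key is a hypothetical name, and the key length is an assumption.

```python
import secrets


def generate_api_key(nbytes=32):
    """Return a URL-safe random token built from nbytes of secure randomness.

    Hypothetical helper: the actual script may use a different method
    or key length.
    """
    return secrets.token_urlsafe(nbytes)


key = generate_api_key()
print(len(key))
```

The secrets module draws from the operating system's secure random source, which makes it suitable for keys, unlike the random module.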

5. Retrieve PDF File Links (Optional):

If you don't already have a list of PDF URLs to ingest, you can generate one by running the web scraping script:

cd scripts/fdds_scrapper
uv run scrapy crawl get_pdf_links

This crawls the sections listed in start_urls, defined in scripts/fdds_scrapper/fdds_scrapper/spiders/pdf_spider.py, and extracts every PDF file link it finds. The collected links are saved to scripts/fdds_scrapper/pdfs.txt.

Note: To customize which sections are scraped, modify the start_urls list in pdf_spider.py.
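
The repository's spider is built on Scrapy and its code is not reproduced here. To show the underlying idea of collecting PDF links from a page, here is a standard-library-only sketch; PdfLinkParser and extract_pdf_links are hypothetical names, and the example URLs are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href> targets whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        is_pdf = urlparse(absolute).path.lower().endswith(".pdf")
        if is_pdf and absolute not in self.links:
            self.links.append(absolute)


def extract_pdf_links(html, base_url):
    """Return the de-duplicated PDF links found in html, in page order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


page = '<a href="/docs/report.pdf">Report</a> <a href="about.html">About</a>'
print(extract_pdf_links(page, "https://example.com/section/"))
```

Writing one returned link per line produces a file in the format expected by the ingestion step.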

6. Manage PDF Files in Qdrant (Optional):

To load PDF documents into the Qdrant database, prepare a .txt file containing one PDF URL per line (no delimiters or special characters). If you followed the previous step, this file is already generated. To ingest the documents, run:

uv run src/fdds/manage_pdfs.py --ingest <path_to_txt_file>

To delete the corresponding documents from Qdrant, use:

uv run src/fdds/manage_pdfs.py --delete <path_to_txt_file>
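
Before ingesting or deleting, it can help to check that the list file really contains one plain URL per line. The sketch below is an assumption about what a reasonable pre-check looks like, not a description of how manage_pdfs.py parses its input; read_pdf_urls is a hypothetical helper.

```python
from pathlib import Path
from urllib.parse import urlparse


def read_pdf_urls(path):
    """Read a URL list file: one http(s) URL per line, blank lines ignored.

    Raises ValueError with the line number of the first malformed entry.
    """
    urls = []
    text = Path(path).read_text(encoding="utf-8")
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        parsed = urlparse(line)
        has_space = any(ch.isspace() for ch in line)
        if parsed.scheme not in ("http", "https") or not parsed.netloc or has_space:
            raise ValueError(f"line {number}: not a plain URL: {line!r}")
        urls.append(line)
    return urls
```

For example, read_pdf_urls("scripts/fdds_scrapper/pdfs.txt") returns the list of URLs, or fails early if a line holds several URLs or stray text.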
