
AILab-FOI/bytesophos



Official repository of the master's thesis "Implementing a Retrieval Augmented Generation System" (2025) by David Slavik, a student at the University of Zagreb Faculty of Informatics.

Description

The goal of the project is to build a RAG system that answers questions about the codebase of a small to medium-scale project.

Users can upload their codebase as a zip file or provide a link to a GitHub repository. The files are then downloaded to the backend, and the relevant ones (code, text, images) are indexed in a Postgres database using the PgVector extension.
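The file selection step for an uploaded zip can be pictured with the minimal sketch below. It is not the project's actual code: the function name `select_indexable_files`, the suffix allow-list, and the skipped directories are all assumptions chosen for illustration.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical allow-list; the project's real rules for "relevant" files may differ.
INDEXABLE_SUFFIXES = {".py", ".js", ".ts", ".md", ".txt", ".json", ".png", ".jpg"}
# Directories that usually hold dependencies or VCS metadata (also an assumption).
SKIPPED_DIRS = {".git", "node_modules", "__pycache__"}


def select_indexable_files(zip_bytes: bytes) -> list[str]:
    """Return the archive members that look worth indexing, sorted by name."""
    selected = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = PurePosixPath(info.filename)
            # Skip anything nested under a dependency or metadata directory.
            if SKIPPED_DIRS.intersection(path.parts):
                continue
            if path.suffix.lower() in INDEXABLE_SUFFIXES:
                selected.append(info.filename)
    return sorted(selected)
```

Filtering before indexing keeps dependency folders and binary build output out of the vector store, which matters for a small to medium codebase where most of the bytes on disk are often not the project's own source.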

Once indexing finishes, users can ask questions in a chat interface. The LLM is qwen/qwen3-32b by Alibaba Cloud, chosen for its logical reasoning and coding capabilities. Because the documents are indexed in the PgVector document store, the agent can retrieve the relevant parts of the provided codebase and take them into account when answering queries.
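The retrieve-then-answer idea can be illustrated with a toy, in-memory sketch. In the project itself retrieval goes through the PgVector document store and Haystack; here a plain Python list stands in for the store, the embeddings are hand-written two-dimensional vectors, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], chunks: list[tuple[str, list[float]]], top_k: int = 2) -> list[str]:
    """Return the texts of the top_k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Place the retrieved chunks in front of the question for the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The prompt returned by `build_prompt` is what would be sent to the model, so the answer is grounded in the retrieved code rather than in the model's general knowledge alone.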

Conversation history is stored and users can bookmark or delete their conversations.

Technologies

React, Vite, Tailwind CSS, shadcn/ui, Postgres, FastAPI, Haystack, Docker, Groq

Frontend

React - JS library for creating user interfaces

Vite - JS bundler

Tailwind CSS - framework that provides utility classes for styling directly within HTML

shadcn/ui - free and open-source set of accessible, customizable UI components

Backend

Postgres - relational (SQL) database that can store vector embeddings through the PgVector extension

FastAPI - framework for building APIs in Python

Haystack - Python framework for building AI pipelines and applications

Docker - for automated deployment of applications in containers

Groq - low-cost, high-performance inference platform that provides an API to various LLMs

Prerequisites (on the host)

  • Docker
  • Docker Compose
  • Git (if cloning the repository)

    You do not need to install Python, Node, or other language runtimes manually, because the services are containerized.

Building and running

Before building, create a .env file in the root directory of the project containing the environment variables and their values. Replace each placeholder with your own value and omit the <> signs.

GROQ_API_KEY=<your_groq_api_key>
VOYAGE_API_KEY=<your_voyage_api_key>
VOYAGE_MODEL=<voyage-code-2 OR voyage-code-3>

POSTGRES_USER=<your_db_user>
POSTGRES_PASSWORD=<your_db_password>
POSTGRES_DB=<your_db_name>
DATABASE_DSN=postgresql://<user>:<password>@<host>:<port>/<dbname>

JWT_SECRET=<replace_with_a_random_secret>
JWT_ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=1440

UPLOAD_DIR=uploads
DATA_DIR=data/repos
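A quick pre-flight check can catch a forgotten placeholder before the containers start. The sketch below is not part of the repository: `parse_env`, `find_problems`, and the choice of which keys count as required are assumptions, and the `<`/`>` test is only a heuristic (a real password could legitimately contain those characters).

```python
# Keys treated as mandatory in this sketch (an assumption, not taken from the project).
REQUIRED_KEYS = (
    "GROQ_API_KEY", "VOYAGE_API_KEY", "VOYAGE_MODEL",
    "POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB",
    "DATABASE_DSN", "JWT_SECRET",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_problems(values: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the file looks usable."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing")
        elif "<" in value or ">" in value:
            problems.append(f"{key} still contains a <placeholder>")
    dsn = values.get("DATABASE_DSN", "")
    # Only check the scheme once the placeholders have been filled in.
    if dsn and "<" not in dsn and not dsn.startswith(("postgresql://", "postgres://")):
        problems.append("DATABASE_DSN should start with postgresql://")
    return problems
```

To use it, read the file with `open(".env").read()`, pass the text to `parse_env`, and print whatever `find_problems` returns.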

To build and start the application, run the following command:

docker-compose up --build

This command installs the necessary tools and builds the frontend and backend images from the instruction files Dockerfile.frontend and Dockerfile.backend.

The backend will be available on port 3001 and the frontend on port 5173.

To start using the app, visit http://localhost:5173.

License

This project is licensed under the MIT License. See the LICENSE file for details.
