CareBot is a Retrieval-Augmented Generation (RAG) chatbot that answers medical questions using the Gale Encyclopedia of Medicine as its knowledge base. It combines LangChain, Pinecone, Sentence Transformers, and a quantized LLaMA-2 model to provide informative, context-grounded responses while running efficiently on limited compute.
- LangChain: RAG pipeline
- Pinecone: Vector database
- HuggingFace: Embedding model and quantized LLM
- Flask: API backend
- Streamlit: Frontend UI
This chatbot uses content from the Gale Encyclopedia of Medicine as its knowledge base.
- Scalable vector search using Pinecone
- Lightweight system built on a quantized LLaMA-2 model
- End-to-end pipeline, from document embedding to answer generation
- Evaluation pipeline for measuring response quality and chunk relevance
- Modular design
- Medical documents are embedded using a sentence transformer model and stored in Pinecone.
- A retriever pulls the top-k relevant chunks based on the query.
- A quantized LLaMA-2 model generates grounded answers using the retrieved context.
- Responses are evaluated using BERTScore and natural language inference (NLI).
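
The retrieval and prompting steps above can be illustrated with a minimal, dependency-free sketch. This is not CareBot's actual code: the function names (`top_k_chunks`, `build_prompt`), the prompt wording, the 3-dimensional toy vectors, and the sample sentences are all assumptions made for illustration. In the real pipeline the vectors would come from the sentence transformer model and the similarity search would be handled by Pinecone.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, chunks: List[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical stand-in for the vector store: (chunk text, toy embedding).
toy_index = [
    ("Aspirin is used to reduce fever and relieve mild pain.", [0.9, 0.1, 0.0]),
    ("Insulin regulates blood glucose levels.", [0.0, 0.9, 0.1]),
    ("Ibuprofen is a nonsteroidal anti-inflammatory drug.", [0.8, 0.2, 0.1]),
]

query_vec = [1.0, 0.0, 0.0]  # toy embedding of the user question
hits = top_k_chunks(query_vec, toy_index, k=2)
prompt = build_prompt("What relieves mild pain?", [text for _, text in hits])
```

The resulting `prompt` string is what would be passed to the quantized LLaMA-2 model. Only the top-k chunks are included, which keeps the prompt short enough for a small context window.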
- Clone the repository:
    git clone <repo-url>
- Install dependencies:
    pip install -r requirements.txt
- Set up API keys (Pinecone and HuggingFace) in a .env file.
- Download the quantized LLaMA-2 model from HuggingFace: TheBloke/Llama-2-13B-GGUF
- Run the backend:
    cd src/backend
    python backend.py
- Run the Streamlit app:
    cd src/frontend
    streamlit run app.py
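
Before starting the backend, it can help to confirm that the .env file actually defines both API keys. The sketch below is one way to do that with the standard library only. The variable names `PINECONE_API_KEY` and `HUGGINGFACEHUB_API_TOKEN` are assumptions, since the names this project expects are not stated here, and `parse_env_text` and `missing_keys` are hypothetical helpers, not part of the repository.

```python
from typing import Dict, List, Mapping, Sequence

# Assumed key names; adjust them to whatever the backend actually reads.
REQUIRED_KEYS = ("PINECONE_API_KEY", "HUGGINGFACEHUB_API_TOKEN")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(
    env: Mapping[str, str], required: Sequence[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required keys that are absent or empty in env."""
    return [key for key in required if not env.get(key)]


# Example .env content with placeholder values (not real credentials).
sample_env = """
# example only
PINECONE_API_KEY="abc123"
HUGGINGFACEHUB_API_TOKEN=
"""

parsed = parse_env_text(sample_env)
problems = missing_keys(parsed)  # keys that still need a value
```

In practice the text would be read from the .env file on disk, and a non-empty `problems` list would be reported before the backend tries to connect to Pinecone or download the model.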

