This is a Streamlit app that uses a Retrieval-Augmented Generation (RAG) pipeline to answer questions from the IPCC AR6 WGIII report using a FAISS vector index and Databricks-hosted LLMs and embeddings.
- Clone this repository
- Add your Databricks credentials in Streamlit Cloud under Secrets:
DATABRICKS_HOST = "https://<your-workspace>.databricks.com"
DATABRICKS_TOKEN = "dapi-xxxxxxxxxxxxxxxx"
- Deploy to Streamlit Cloud
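
As a rough illustration of how the two secrets above might be consumed, here is a minimal sketch. The helper name `load_databricks_credentials` is hypothetical and not taken from this repository. It assumes the secrets are available as a mapping (for example `st.secrets` in Streamlit) and falls back to environment variables for local runs.

```python
import os
from typing import Mapping, Optional

REQUIRED_KEYS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def load_databricks_credentials(
    secrets: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return the Databricks host and token, failing early if one is missing.

    Hypothetical helper: `secrets` may be any mapping, such as `st.secrets`.
    When it is None, environment variables are used instead.
    """
    source = secrets if secrets is not None else os.environ
    creds = {}
    for key in REQUIRED_KEYS:
        value = source.get(key)
        if not value:
            raise KeyError(f"Missing secret: {key}")
        creds[key] = value
    # Drop a trailing slash so the host can be joined with API paths.
    creds["DATABRICKS_HOST"] = creds["DATABRICKS_HOST"].rstrip("/")
    return creds
```

Inside the app this could be called as `load_databricks_credentials(st.secrets)`. Failing at startup with a clear message is usually easier to debug than an authentication error on the first request.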
- app.py: Main Streamlit interface
- rag_core.py: Core logic for vector search and answer generation
- requirements.txt: Python dependencies
- Notebooks folder: Databricks notebooks used to create the embedding vectors and for development (must be run on a Databricks cluster with the required libraries installed)
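
The retrieve-then-generate flow behind the vector search and answer generation logic can be sketched as follows. This is an illustrative sketch, not the code in rag_core.py: a pure-Python cosine similarity stands in for the FAISS index, small hand-written vectors stand in for the Databricks embeddings, and the function names `retrieve_top_k` and `build_prompt` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    chunks: Sequence[str],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k chunks most similar to the query, best first.

    Stand-in for a vector index lookup: every chunk is scored by brute force.
    """
    scored = [
        (chunk, cosine_similarity(query_vec, vec))
        for chunk, vec in zip(chunks, chunk_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question: str, retrieved_chunks: Sequence[str]) -> str:
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy data: three report chunks with made-up 2-dimensional embeddings.
chunks = ["chunk about mitigation", "chunk about finance", "chunk about energy"]
chunk_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
top = retrieve_top_k([1.0, 0.0], chunk_vecs, chunks, k=2)
prompt = build_prompt("What does the report say?", [c for c, _ in top])
```

In a real pipeline the query vector would come from the same embedding model used to build the index, and the assembled prompt would be sent to the hosted LLM.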