- Clone the repo or download the ZIP
git clone [github https url]
- Install packages
First run npm install yarn -g to install yarn globally (if you haven't already).
Then run:
yarn install
After installation, you should now see a node_modules folder.
- Set up your .env file
Your .env file should look like this:
OPENAI_API_KEY="sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
PINECONE_API_KEY="eXXXXXXXXXXXXXXXXXXXXXXXXXX"
PINECONE_ENVIRONMENT="us-XXXX-gcp-free"
PINECONE_INDEX_NAME="your-index-name"
# MongoDB connection
DB_CONN_STRING="mongodb+srv://name:PASSWORD@DBNAME.vcmoiuc.mongodb.net/"
DB_NAME="XXXXXXX"
QUESTIONS_COLLECTION_NAME="XXXXXXX"
- Visit OpenAI to retrieve your API keys and insert them into your .env file.
- Visit Pinecone to create and retrieve your API keys, and also retrieve your environment and index name from the dashboard.
(Make sure the Pinecone index is created with 1536 dimensions and the cosine similarity metric. The MongoDB database should be cloud-hosted.)
- In the config folder, replace the PINECONE_NAME_SPACE with a namespace where you'd like to store your embeddings on Pinecone when you run npm run ingest. This namespace will later be used for queries and retrieval.
- In utils/makechain.ts, change the QA_PROMPT for your own use case. Change modelName in new OpenAI to gpt-4 if you have access to the gpt-4 API. Please verify outside this repo that you have access to the gpt-4 API; otherwise the application will not work.
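Before running the ingest script or the app, it can help to confirm that every variable from the .env example above is actually set. The following is a minimal sketch of such a check, not code from this repo: findMissingEnvVars is a hypothetical helper, and the list of names simply mirrors the .env example.

```typescript
// Variable names copied from the .env example in this README.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX_NAME",
  "DB_CONN_STRING",
  "DB_NAME",
  "QUESTIONS_COLLECTION_NAME",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper (an assumption, not part of this repo):
// returns the names of required variables that are unset or blank.
function findMissingEnvVars(env: Env): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

One possible use is to call findMissingEnvVars(process.env) at the top of a script and throw an error listing the result if it is non-empty, so that a missing key surfaces before any API call is made.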
This repo can load multiple PDF files
- Inside the docs folder, add your PDF files or folders that contain PDF files.
- Run the script npm run ingest to 'ingest' and embed your docs. If you run into errors, see the troubleshooting notes below.
- Check the Pinecone dashboard to verify that your namespace and vectors have been added.
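Because the docs folder may hold PDF files directly or inside nested folders, ingestion has to walk the directory tree to find them. The sketch below illustrates that idea using only Node's built-in fs and path modules. findPdfFiles is a hypothetical helper for illustration and is not necessarily how this repo's ingest script loads files.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper (an assumption, not part of this repo):
// recursively collects the paths of all .pdf files under `dir`.
function findPdfFiles(dir: string): string[] {
  const results: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Descend into sub-folders that may contain more PDFs.
      results.push(...findPdfFiles(fullPath));
    } else if (
      entry.isFile() &&
      path.extname(entry.name).toLowerCase() === ".pdf"
    ) {
      results.push(fullPath);
    }
  }
  // Sort so the order is stable across runs.
  return results.sort();
}
```

Running findPdfFiles("docs") before npm run ingest is a quick way to see which files would be picked up, assuming the folder is named docs as described above.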
Once you've verified that the embeddings and content have been successfully added to your Pinecone index, run npm run dev to launch the local dev environment, and then type a question in the chat interface.
Sometimes the Pinecone index doesn't work initially. If you are having problems with the ingest, try deleting and recreating the index. Pinecone indexes in the free tier are also removed after 7 days of inactivity, so keep this in mind.
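When recreating the index, the settings from the setup step still apply: 1536 dimensions and the cosine metric. The index dimension has to match the length of the embedding vectors, and the metric determines how a query vector is scored against stored vectors. As an illustration of what the cosine metric measures (a plain sketch, not Pinecone's implementation), cosineSimilarity below is a hypothetical helper:

```typescript
// Hypothetical helper (an assumption, for illustration only):
// cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

The score depends only on the angle between the vectors, not their lengths, and the dimension check mirrors why an index created with the wrong dimension rejects vectors at ingest time.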
Also check out gpt4-pdf-chatbot-langchain