# Retrieval Augmented Generation RAG with ollama and qdrant. ## Usage ### Setup #### Environment Variables Create a .env file or set the following parameters: ```.env CHUNK_SIZE = CHUNK_OVERLAP = ENCODER_MODEL = EMBEDDING_DIM = GENERATOR_MODEL = RAG_DB_NAME = RAG_DB_USER = QDRANT_URL = QDRANT_COLLECTION_NAME = ``` ### Ollama Download the encoder and generator models with ollama: ```sh ollama pull $GENERATOR_MODEL ollama pull $ENCODER_MODEL ``` ### Qdrant Qdrant will is used to store the embeddings of the chunks from the documents. Download and run qdrant. ### Postgres Postgres is used to save hashes of the document chunks to prevent document chunks from being added to the vector db more than ones. Download and run qdrant. #### Running Build script/or FE for adding pdfs or retrieve information ### Frontend (Low priority) [streamlit](https://github.com/streamlit/streamlit) ### Notes Yes, it is inefficient/dumb to use ollama when you can just load the models with python in the same process. ### Inspiration I took some inspiration from these tutorials. [rag-openai-qdrant](https://colab.research.google.com/github/qdrant/examples/blob/master/rag-openai-qdrant/rag-openai-qdrant.ipynb) [building-rag-application-using-langchain-openai-faiss](https://medium.com/@solidokishore/building-rag-application-using-langchain-openai-faiss-3b2af23d98ba)