summaryrefslogtreecommitdiff
path: root/README.md
diff options
context:
space:
mode:
Diffstat (limited to 'README.md')
-rw-r--r--README.md101
1 files changed, 101 insertions, 0 deletions
diff --git a/README.md b/README.md
index e69de29..7fcb167 100644
--- a/README.md
+++ b/README.md
@@ -0,0 +1,101 @@
+# Retrieval Augmented Generation
+
+## Plan
+
+- [ ] Architecture
+ - [ ] Vector store
+ - [ ] which one? FAISS?
+ - [ ] Build index of the document
+ - [ ] Embedding model (mxbai-embed-large)
+ - [ ] LLM (Dolphin)
+- [ ] Gather some documents
+- [ ] Create a prompt for the query
+
+
+### Pre-Processing of Document
+1. Use langchain document loader and splitter
+ ```python
+ from langchain_community.document_loaders import PyPDFLoader
+ from langchain.text_splitter import RecursiveCharacterTextSplitter
+ ```
+
+2. Generate embeddings with mxbai, example:
+```python
+from sentence_transformers import SentenceTransformer
+from sentence_transformers.util import cos_sim
+
+# 1. load model
+model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
+
+# For retrieval you need to pass this prompt.
+query = 'Represent this sentence for searching relevant passages: A man is eating a piece of bread'
+
+docs = [
+ query,
+ "A man is eating food.",
+ "A man is eating pasta.",
+ "The girl is carrying a baby.",
+ "A man is riding a horse.",
+]
+
+# 2. Encode
+embeddings = model.encode(docs)
+
+# 3. Calculate cosine similarity
+similarities = cos_sim(embeddings[0], embeddings[1:])
+```
+But we will use ollama...
+
+(otherwise install `sentence-transformers`)
+
+3. Create vector store
+```python
+import numpy as np
+d = 64 # dimension
+nb = 100000 # database size
+nq = 10000 # nb of queries
+np.random.seed(1234) # make reproducible
+xb = np.random.random((nb, d)).astype('float32')
+xb[:, 0] += np.arange(nb) / 1000.
+xq = np.random.random((nq, d)).astype('float32')
+xq[:, 0] += np.arange(nq) / 1000.
+
+import faiss # make faiss available
+index = faiss.IndexFlatL2(d) # build the index
+print(index.is_trained)
+index.add(xb) # add vectors to the index
+print(index.ntotal)
+
+k = 4 # we want to see 4 nearest neighbors
+D, I = index.search(xb[:5], k) # sanity check
+print(I)
+print(D)
+D, I = index.search(xq, k) # actual search
+print(I[:5]) # neighbors of the 5 first queries
+print(I[-5:]) # neighbors of the 5 last queries
+```
+
+I need to figure out the vector dim of the mxbai model.
+
+4. Use Postgres as a persisted kv-store
+
+Save index of chunk as key and value as paragraph.
+
+5. Create user input pipeline
+5.1 Create search prompt for document retrieval
+5.2 Fetch nearest neighbors as context
+5.3 Retrieve the values from the document db
+5.4 Add paragraphs as context to the query
+5.5 Send query to LLM
+5.6 Return output
+5.7 ....
+5.8 Profit
+
+### Frontend (Low priority)
+
+[streamlit](https://github.com/streamlit/streamlit)
+
+
+### Tutorial
+
+[link](https://medium.com/@solidokishore/building-rag-application-using-langchain-openai-faiss-3b2af23d98ba)