content/projects/retrieval-augmented-generation.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26

---
title: "Retrieval Augmented Generation"
date: 2024-04-26 00:36
tags:
  [
    "deep learning",
    "retrieval augmented generation",
    "vector database",
    "ollama",
    "llm",
  ]
draft: false
---

I implemented a retrieval augmented generation (RAG) 
[program](https://github.com/aktersnurra/rag) for fun with the goal of being able to 
search my personal library. My focus was to make this run locally with only open
source models. This was achieved with `ollama` and `sentence-transformers` for 
downloading and running these models locally. However, the project was expanded to
integrate with cohere and its rerank and command-r+ models, since I was curious about
the command-r+ performance. These models can be downloaded and run locally, but it took
ages for my computer to generate any output, since the command-r+ is huge.


Here is a [presentation](/rag.html) that gives a brief overview of what a RAG system
is, and how it can be improved with reranking.