Retrieval Augmented Generation (RAG) on Postgres
So I recently learnt that Postgres works quite well for RAG applications. Here is the flow I followed.
Pre-prep
Install the pgvector extension for my Postgres db.
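Assuming the same node-postgres pool the queries below use, and a database user with the privileges to create extensions, that is one statement:

await pool.query("CREATE EXTENSION IF NOT EXISTS vector");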
Prep
Upload a file
Extract text
Break it into smaller chunks
Generate embeddings for each chunk using OpenAI's embeddings API (it is really cheap)
Store the embeddings, chunks, and the file_id in a table (sketched below)
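Something like this. The table and column names match the search query further down, but the fixed-size chunking and the text-embedding-3-small model are just my choices, not the only way to do it:

import OpenAI from "openai";
import pg from "pg";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const pool = new pg.Pool();  // reads the PG* connection vars from the environment

// One row per chunk; vector(1536) matches OpenAI's default embedding length.
await pool.query(`
  CREATE TABLE IF NOT EXISTS document_chunks (
    id SERIAL PRIMARY KEY,
    file_id INTEGER REFERENCES files(id),
    chunk_index INTEGER,
    content TEXT,
    embedding vector(1536)
  )
`);

// Naive fixed-size chunking; real splitters respect sentence and paragraph boundaries.
function chunkText(text, size = 1000) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

async function ingest(fileId, text) {
  const chunks = chunkText(text);
  for (let i = 0; i < chunks.length; i++) {
    const res = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: chunks[i],
    });
    // pgvector accepts a '[0.1,0.2,...]' text literal, which JSON.stringify produces.
    await pool.query(
      `INSERT INTO document_chunks (file_id, chunk_index, content, embedding)
       VALUES ($1, $2, $3, $4)`,
      [fileId, i, chunks[i], JSON.stringify(res.data[0].embedding)]
    );
  }
}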
Retrieval / Search
Get OpenAI to call a function that returns the search terms extracted from the user's query. For example, “Tell me about Tesla Roadster” returns “Tesla Roadster”.
Generate embeddings for the search term. (Both steps are sketched below.)
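A sketch of those two steps with OpenAI's Node SDK. The extract_search_terms function name and the model choices are mine; any chat model that supports tool calls works:

const chat = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: userQuery }],
  tools: [{
    type: "function",
    function: {
      name: "extract_search_terms",
      description: "Extract the key search terms from the user's query",
      parameters: {
        type: "object",
        properties: { terms: { type: "string" } },
        required: ["terms"],
      },
    },
  }],
  // Force the model to call the function instead of answering directly.
  tool_choice: { type: "function", function: { name: "extract_search_terms" } },
});
const { terms } = JSON.parse(chat.choices[0].message.tool_calls[0].function.arguments);

// Embed the extracted terms with the same model used for the chunks.
const emb = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: terms,
});
const queryEmbedding = emb.data[0].embedding;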
Query the DB for the embeddings that are closest to the search term's embedding.
const similarChunks = await pool.query(
  `SELECT chunks.content, chunks.chunk_index, f.original_name,
          chunks.embedding <=> $1 AS distance
   FROM document_chunks chunks
   JOIN files f ON chunks.file_id = f.id
   WHERE chunks.embedding <=> $1 < $2
   ORDER BY distance ASC
   LIMIT 10`,
  // pgvector accepts the embedding as a '[0.1,0.2,...]' text literal, which JSON.stringify produces.
  [JSON.stringify(queryEmbedding), 0.3] // a cosine distance of 0.3 is a cosine similarity of 0.7
);
Generation
The results are sent back to OpenAI along with the user’s query.
The prompt tells OpenAI to use the chunks as context to answer the user’s query.
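Something like this, reusing similarChunks from the query above (the model and the prompt wording are just examples):

// Stitch the retrieved chunks into a single context block.
const context = similarChunks.rows
  .map((r) => `From ${r.original_name} (chunk ${r.chunk_index}):\n${r.content}`)
  .join("\n\n");

const answer = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    {
      role: "system",
      content: "Answer the user's question using only the context below.\n\n" + context,
    },
    { role: "user", content: userQuery },
  ],
});
console.log(answer.choices[0].message.content);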
Tunables
Limit - the number of results you want
vector length - by default OpenAI returns vectors of length 1536. For very large datasets, shorter vectors might perform better for memory and compute reasons (see the sketch after this list).
chunk length - the smaller the chunks, the more granular the resulting embeddings. There is a memory, compute, and tokens tradeoff.
cosine similarity threshold - filtering on cosine distance returns results that fall within a cone around the query vector instead of at one exact angle. A larger distance threshold means results from a wider cone are returned.
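For the vector length tunable, OpenAI's text-embedding-3 models accept a dimensions parameter. A sketch; 512 is an arbitrary choice, and the vector(N) column in the table has to match:

const emb = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: chunk, // a chunk of text
  dimensions: 512, // request 512-dimensional vectors instead of the default 1536
});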
Post Script
All databases use indexing to speed up lookup and retrieval of information. However, for really tiny amounts of data, approximate indexing actually doesn't work very well: I got irrelevant results, or even no results. My understanding is that pgvector's IVFFlat index, for example, clusters the existing rows into lists and only probes some of them at query time, so with a handful of rows it has very little to work with.
pgvector offers multiple distance metrics for finding the nearest neighbours, including L2 distance (also known as Euclidean distance) and cosine distance (the <=> operator used above).
Cosine similarity is better suited for semantic search. Semantic search means searching for similar meaning (direction). Cosine similarity finds neighbours that point in similar directions, instead of neighbours that are closest in space but could point in very different directions.
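Once there is enough data, an approximate index speeds the search up. A sketch using pgvector's IVFFlat index, where vector_cosine_ops matches the <=> operator used in the query above (100 lists is just a starting point; tune it to your row count):

await pool.query(`
  CREATE INDEX ON document_chunks
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100)
`);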
PPS
This is my attempt to document my recent learnings. Feel free to comment or point out mistakes. I will learn something new.