Satyajeet Jadhav

Retrieval Augmented Generation (RAG) on Postgres

So I recently learnt that Postgres works quite well for RAG applications. Here is the flow I followed.

Pre-prep

Install the pgvector extension on the Postgres DB.
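For reference, here's roughly what that one-time setup looks like. The table layout is my guess at a minimal schema that matches the search query further down, and vector(1536) matches the default length of OpenAI embeddings:

-- run once against the database, e.g. via psql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE files (
  id SERIAL PRIMARY KEY,
  original_name TEXT NOT NULL
);

CREATE TABLE document_chunks (
  id SERIAL PRIMARY KEY,
  file_id INTEGER REFERENCES files(id),
  chunk_index INTEGER NOT NULL,
  content TEXT NOT NULL,
  embedding vector(1536) -- one embedding per chunk
);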

Prep

  1. Upload a file

  2. Extract text

  3. Break it into smaller chunks

  4. Generate embeddings for each chunk using the really cheap OpenAI API

  5. Store the embeddings, chunks, and the file_id in a table (sketched below)
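Here's a rough sketch of steps 3 to 5. The fixed-size chunking, the model name, and the helper names are my assumptions, not necessarily what you'd want in production:

import OpenAI from "openai";
import pg from "pg";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

// Naive fixed-size chunking. Real splitters usually respect
// sentence and paragraph boundaries and overlap the chunks a bit.
function chunkText(text, size = 1000) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

async function storeChunks(fileId, text) {
  const chunks = chunkText(text);
  // A single API call can embed a whole batch of chunks.
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: chunks,
  });
  for (let i = 0; i < chunks.length; i++) {
    await pool.query(
      `INSERT INTO document_chunks (file_id, chunk_index, content, embedding)
       VALUES ($1, $2, $3, $4)`,
      // pgvector accepts the '[0.1,0.2,...]' text format, which is
      // exactly what JSON.stringify produces for a flat array of numbers.
      [fileId, i, chunks[i], JSON.stringify(res.data[i].embedding)]
    );
  }
}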

Retrieval / Search

  1. Get OpenAI to call a function that returns the search terms extracted from the user's query (sketched below). For example, “Tell me about Tesla Roadster” returns “Tesla Roadster”

  2. Generate embeddings for the search term

  3. Query the DB for embeddings that are closest to the search term's embedding.
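Steps 1 and 2, sketched with OpenAI's tool-calling API. The tool name and its schema are my own illustrative choices:

import OpenAI from "openai";
const openai = new OpenAI(); // same client setup as in the prep sketch

const userQuery = "Tell me about Tesla Roadster";

async function extractSearchTerm(query) {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: query }],
    tools: [{
      type: "function",
      function: {
        name: "extract_search_terms", // hypothetical tool name
        description: "Extract the search terms from the user's query.",
        parameters: {
          type: "object",
          properties: { search_term: { type: "string" } },
          required: ["search_term"],
        },
      },
    }],
    // Force the model to call the tool instead of answering directly.
    tool_choice: { type: "function", function: { name: "extract_search_terms" } },
  });
  const call = completion.choices[0].message.tool_calls[0];
  return JSON.parse(call.function.arguments).search_term;
}

const searchTerm = await extractSearchTerm(userQuery); // "Tesla Roadster"

const embeddingRes = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: searchTerm,
});
const queryEmbedding = embeddingRes.data[0].embedding;

Step 3 is the query itself: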

const similarChunks = await pool.query(
  `SELECT dc.content, dc.chunk_index, f.original_name, dc.embedding <=> $1 AS distance
   FROM document_chunks dc
   JOIN files f ON dc.file_id = f.id
   WHERE dc.embedding <=> $1 < $2
   ORDER BY distance ASC
   LIMIT 10`,
  // pgvector expects the vector in its '[...]' text format.
  [JSON.stringify(queryEmbedding), 0.3]  // 0.3 (distance) represents a cosine similarity of 0.7
);

Generation

  1. The results are sent back to OpenAI along with the user’s query. 

  2. The prompt tells OpenAI to use the chunks as context to answer the user's query (sketched below).
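A minimal sketch of the generation step, reusing the openai client and the similarChunks result from above. The prompt wording is my own:

// Stitch the retrieved chunks into a single context string.
const context = similarChunks.rows
  .map((row) => `[${row.original_name}, chunk ${row.chunk_index}] ${row.content}`)
  .join("\n\n");

const completion = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    {
      role: "system",
      content:
        "Answer the user's question using only the context below. " +
        "If the answer is not in the context, say so.\n\n" + context,
    },
    { role: "user", content: userQuery },
  ],
});

console.log(completion.choices[0].message.content);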

Tunables

  1. Limit - the number of results you want

  2. vector length - by default OpenAI returns vectors of length 1536. For very large datasets, shorter vectors might perform better for memory and compute reasons (see the sketch after this list).

  3. chunk length - the smaller the chunks, the more granular the resulting embeddings. It's a memory, compute, and tokens tradeoff.

  4. cosine similarity threshold - the query above only returns chunks whose cosine distance to the query embedding is below a threshold. Geometrically, this keeps results inside a cone around the query vector: a larger threshold means results from a wider cone are returned.
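For the vector length tunable, the text-embedding-3 models accept a dimensions parameter, so you can ask OpenAI for shorter vectors up front (the column would then be declared vector(512) instead of vector(1536)):

// Request 512-dimensional embeddings instead of the default 1536.
const res = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "Tell me about Tesla Roadster",
  dimensions: 512,
});
console.log(res.data[0].embedding.length); // 512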

Post Script

  • All databases use indexing to speed up lookup and retrieval of information. However, for really tiny amounts of data, indexing actually doesn't work very well: I got irrelevant results, or even no results. (Approximate indexes like pgvector's IVFFlat build their clusters from your data, so they need a reasonable number of rows to work well.)

  • pgvector offers multiple distance metrics for finding the nearest neighbors, including L2 distance (also known as Euclidean distance, the <-> operator) and cosine distance (the <=> operator used in the query above).

  • Cosine similarity is better suited for semantic search. Semantic search means searching for similar meaning (direction). Cosine similarity finds neighbours that point in similar directions, instead of neighbours that are closest in absolute distance but could point in very different directions.
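Here's a tiny, made-up example of that difference:

// Two vectors pointing the same way but with different magnitudes,
// and a third vector that is nearby but points elsewhere.
const a = [1, 0];
const b = [10, 0]; // same direction as a, but far away in L2 terms
const c = [0, 1];  // close to a in L2 terms, but a different direction

const dot = (x, y) => x.reduce((sum, xi, i) => sum + xi * y[i], 0);
const norm = (x) => Math.sqrt(dot(x, x));
const cosineSimilarity = (x, y) => dot(x, y) / (norm(x) * norm(y));
const l2Distance = (x, y) => Math.hypot(...x.map((xi, i) => xi - y[i]));

console.log(cosineSimilarity(a, b)); // 1     -> same meaning
console.log(cosineSimilarity(a, c)); // 0     -> unrelated meaning
console.log(l2Distance(a, b));       // 9     -> L2 calls b a distant neighbour
console.log(l2Distance(a, c));       // ~1.41 -> L2 calls c a close neighbour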

Older Relevant Posts

Semantic Search, aka Magic

Embeddings eli5 version

PPS

This is my attempt to document my recent learnings. Feel free to comment or point out mistakes. I will learn something new.
