Embeddings, ELI5 version
Computers don’t understand text like humans do. Computers understand numbers. To get computers to understand text, you need to convert the text to numbers.
But simply transforming text to numbers won't capture much meaning on its own. Words make sense in relation to the words around them, so the numerical representation should capture this relationship.
One way to achieve this is by plotting these numbers on a graph.
As a simplistic example, consider a line where higher numbers represent more sweetness. The word apple can be represented by the number 1, and the word juice by the number 2. The distance between apple and juice is 2 - 1 = 1. Since the distance is small, we could say that apple and juice are similar in sweetness.
But words have more complex attributes than just sweetness: shape, size, color, or any other arbitrary attribute. Each of these could become a line, or dimension, of its own. What if we used two dimensions instead of one? Say the x dimension is sweetness and the y dimension is sourness. We could then plot apple at (1,1), juice at (2,2), orange at (1,2), and orange juice at (2,1). We could keep adding more such dimensions, and a language model could come up with its own characteristics for each dimension. Though one can't really visualize hundreds of dimensions, they are perfectly workable for a computer. Thus, words and sentences become points on a graph of n dimensions.
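The toy coordinates above can be turned into a tiny worked example. This is just a sketch with the made-up 2D points from the text (real embeddings have hundreds of dimensions), showing that distance between points measures similarity:

```python
import math

# Toy 2D "embeddings": (sweetness, sourness), using the made-up values above
points = {
    "apple": (1, 1),
    "juice": (2, 2),
    "orange": (1, 2),
    "orange juice": (2, 1),
}

def distance(a, b):
    """Euclidean distance between two points: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(distance(points["apple"], points["orange"]))  # 1.0 (differ only in sourness)
print(distance(points["apple"], points["juice"]))   # ~1.41 (differ in both)
```

The same idea scales to any number of dimensions: the loop over coordinate pairs doesn't care whether there are 2 of them or 1,536.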
There is an excellent explainer by Dharmesh Shah on what embeddings are. I don’t think anyone can explain it better. Please go and read that if you want to understand embeddings better - https://simple.ai/p/guide-vector-embeddings
A lot of models are available today that let you convert words to embeddings. There are a few categories of models here.
Companies like OpenAI, Cohere, etc. have powerful, large models that run on their servers. They expose these models through embeddings APIs for a small fee: you send your text to the API and get embeddings back as the response. Recent advancements have made these APIs quite cheap.
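The send-text-get-embeddings flow can be sketched with plain HTTP. This uses OpenAI's embeddings endpoint as the example; the URL and payload shape follow OpenAI's API docs, but the model name and response handling are assumptions you should check against the provider's current documentation:

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/embeddings"

def build_request(text, model="text-embedding-3-small"):
    """Build the HTTP POST request for one piece of text.

    The model name here is an assumption -- check the provider's model list.
    """
    payload = {"model": model, "input": text}
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(payload).encode(), headers=headers
    )

def get_embedding(text):
    """Send the text to the API and return its embedding (a list of floats)."""
    with urllib.request.urlopen(build_request(text)) as resp:
        return json.load(resp)["data"][0]["embedding"]

# Only hits the network if a key is configured
if os.environ.get("OPENAI_API_KEY"):
    vec = get_embedding("apple juice")
    print(len(vec))  # dimensionality of the returned embedding
```

Every hosted provider follows roughly this shape: an authenticated POST with your text, a JSON response containing a list of numbers.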
The second category is open models like Llama, Mistral, etc. These too are large and powerful, but they are open: anyone can download them and run them on their own servers. Since they are large, though, they typically can't run inside every user's browser.
The third category is small models, like those from Xenova and Nomic. They are open, and they can run right in your users' browsers! The obvious tradeoff is lower accuracy.