Satyajeet Jadhav

4 months ago

Embeddings eli5 version

Computers don’t understand text like humans do. Computers understand numbers. To get computers to understand text, you need to convert the text to numbers.

But simply transforming text to numbers won’t make a lot of sense. Words make sense when they are next to each other. So the numerical representation should capture this relationship.

One way to achieve this is by plotting these numbers on a graph.  

As a simplistic example, consider a line. Higher numbers on the line represent more sweetness. The word apple can be represented by the number 1. The word juice can be represented by the number 2. The distance between apple and juice is 2 - 1 = 1. Since that distance is small, we could say that apple and juice are similar in sweetness.
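Here is a minimal sketch of the sweetness line in Python. The scores are the made-up numbers from the paragraph above, and the helper name `distance_1d` is just an illustration, not something a real embedding library provides.

```python
# Toy one-dimensional "embedding": each word gets a single made-up sweetness score.
sweetness = {"apple": 1, "juice": 2}


def distance_1d(word_a: str, word_b: str) -> int:
    """Return how far apart two words sit on the sweetness line."""
    return abs(sweetness[word_a] - sweetness[word_b])


print(distance_1d("apple", "juice"))  # prints 1
```

The smaller the number, the more alike the two words are on this one attribute.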

But words have more complex attributes than just sweetness. It could be shape, size, color, or any other arbitrary attribute. Each of these could become a line, or a dimension, by itself. What if we could use two dimensions instead of one? For example, the x dimension is for sweetness and the y dimension is for sourness. We could then plot apple at (1,1) and juice at (2,2). Orange could be (1,2). Orange juice could be (2,1). We could keep adding more such dimensions. A language model could come up with its own characteristics for each dimension. Though one can’t really visualize hundreds of dimensions, a computer can work with them just fine. Thus, words and sentences become points in an n-dimensional space.
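The two-dimensional version can be sketched the same way. The coordinates below are the made-up (sweetness, sourness) points from the paragraph above, and `distance` and `nearest` are hypothetical helper names. This sketch uses straight-line (Euclidean) distance; real systems often compare embeddings with cosine similarity instead.

```python
import math

# Toy two-dimensional "embeddings": (sweetness, sourness), values made up for illustration.
points = {
    "apple": (1, 1),
    "juice": (2, 2),
    "orange": (1, 2),
    "orange juice": (2, 1),
}


def distance(word_a: str, word_b: str) -> float:
    """Straight-line distance between two words on the graph."""
    return math.dist(points[word_a], points[word_b])


def nearest(word: str) -> list[str]:
    """Every other word, sorted from closest to farthest."""
    others = [w for w in points if w != word]
    return sorted(others, key=lambda w: distance(word, w))


print(distance("apple", "orange"))  # 1.0
print(nearest("apple"))             # "juice" comes last: it is the farthest from apple
```

Adding a third attribute only means adding a third number to each tuple; `math.dist` works for any number of dimensions, as long as both points have the same length.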

There is an excellent explainer by Dharmesh Shah on what embeddings are. I don’t think anyone can explain it better. Please go and read that if you want to understand embeddings better - https://simple.ai/p/guide-vector-embeddings

A lot of models are available today that let you convert words to embeddings. There are a few categories of models here.

Companies like OpenAI, Cohere, etc. have powerful and large models that run on their servers. These companies expose their embeddings API for a small fee. You can send your text to the API and get back embeddings as the response. Recent advancements in AI technology have made these models really cheap.

The other category of models is Llama, Mistral, etc. These too are large and powerful. But these are open. This means anyone can download these models and run them on their own servers. Since these are large models, they typically can’t run inside every user’s browser.

The third category is models like Xenova and Nomic. These are small models. They are open. They can run in your users’ browsers! But the obvious tradeoff is less accuracy.
