To recall, integrating our private documents with an LLM is called RAG (Retrieval-Augmented Generation).
Let's assume we have some PDFs containing our data. The data in each PDF is broken into chunks based on some criteria, and each chunk is fed as input to a model, more specifically an embedding model. This model generates a point. How is the point generated?
Let's take a simple example:
- Today is Wednesday
- Tomorrow is Thursday
- I am travelling today
- Wednesday is a nice series
Let's construct a vocabulary containing only the unique words from the above set of sentences:
Today, is, Wednesday, Tomorrow, Thursday, I, am, travelling, a, nice, series
We are now going to convert each sentence into a numeric format. Scanning each sentence against this vocabulary: if the sentence contains a vocabulary word, we assign 1 at that word's position, otherwise 0.
1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0
0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0
1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0
0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1
This method of conversion is called one-hot encoding.
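The encoding above can be sketched in a few lines of Python. This is a minimal illustration of the steps described, not a production tokenizer (it lowercases and splits on whitespace only):

```python
# One-hot (bag-of-words) encoding sketch: build a vocabulary of unique
# words, then mark 1 if a sentence contains the word, else 0.
sentences = [
    "Today is Wednesday",
    "Tomorrow is Thursday",
    "I am travelling today",
    "Wednesday is a nice series",
]

# Vocabulary of unique words in first-seen order (case-insensitive).
vocab = []
for s in sentences:
    for w in s.lower().split():
        if w not in vocab:
            vocab.append(w)

def encode(sentence):
    """Return a 0/1 vector marking which vocabulary words appear."""
    words = sentence.lower().split()
    return [1 if w in words else 0 for w in vocab]

for s in sentences:
    print(encode(s))
```

Running this prints the same four vectors listed above, one per sentence.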
Now coming back to RAG: based on the content of a chunk, the embedding model generates a point. The generated point is multidimensional (x, y, z, a, ...). These points enable semantic search. What is semantic search? It tells us how closely two points are related to each other: search based on meaning is called semantic search. A point is generated for each chunk, and the model places related points close together in the embedding space.
A vector DB provides a place to store these points together, and when we query the data, it returns the related points.
*How do we say that two points are closer to each other?*
When the distance is small, we say that the two points are close to each other. With only two points, we can't always say whether they are near each other; we need to bring in another point for comparison. To find the distance between points, there are several measures: Euclidean distance, cosine similarity, and Manhattan distance.
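As a quick sketch, here are two of the measures just mentioned, Euclidean and Manhattan distance, computed between two small 2-D points (cosine similarity is worked through in the next section):

```python
import math

# Euclidean distance: straight-line distance between two points.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Manhattan distance: sum of absolute differences along each axis.
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

p = [1.0, 2.0]
q = [4.0, 6.0]
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```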
Let's take cosine similarity and see how it works:
Suppose three points (p1, p2, p3) are plotted on a graph. From the origin, a straight line is drawn to each point. For any pair of points, take the angle between their two lines and compute its cosine. The smaller the angle, the larger the cosine, and the closer (more similar) the two points are.
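A small sketch of this, reusing the one-hot vectors from earlier as the points (a real embedding model would produce dense vectors, but the math is the same):

```python
import math

# Cosine similarity: cosine of the angle between the lines drawn from
# the origin to each point. 1.0 means same direction; near 0 means
# the vectors share almost nothing.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

p1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # "Today is Wednesday"
p2 = [0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0]  # "Tomorrow is Thursday"
p3 = [0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]  # "Wednesday is a nice series"

print(cosine_similarity(p1, p2))  # shares only "is"
print(cosine_similarity(p1, p3))  # shares "is" and "Wednesday": higher
```

Here p1 is more similar to p3 than to p2, because the overlap in words ("is" and "Wednesday") gives a smaller angle and hence a larger cosine.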
Suppose there are 100 points. If I want to find the nearest points to a point named x, I need to calculate the distance from x to all the remaining points; only then can I arrive at the nearest ones. But this approach is time-consuming, which is why vector DBs use specialised indexes to speed up the search.
So the pipeline for RAG is: the data is given to an embedding model (e.g., nomic-embed-text), which generates a point (a mathematical representation of the data). This point is stored in a vector DB. Some examples of vector DBs are ChromaDB, Pinecone, FAISS, Qdrant, etc.
When I ask a query, it is sent to the embedding model, which generates a query point. The vector DB then compares this query point against the stored points and returns the ones nearest to it (say, the top 5).
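The retrieval step can be sketched end to end as a brute-force top-k search. This is a minimal illustration, assuming the toy one-hot `embed` function from earlier as a stand-in for a real embedding model; real vector DBs like the ones named above use approximate indexes instead of scanning every vector:

```python
import math

def embed(text, vocab):
    # Hypothetical stand-in for a real embedding model: one-hot over vocab.
    words = text.lower().split()
    return [1 if w in words else 0 for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, store, k=2):
    # store: list of (chunk_text, vector) pairs; scan all, keep k best.
    scored = [(cosine(query_vec, v), text) for text, v in store]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]

vocab = ["today", "is", "wednesday", "tomorrow", "thursday",
         "i", "am", "travelling", "a", "nice", "series"]
chunks = ["Today is Wednesday", "Tomorrow is Thursday",
          "I am travelling today", "Wednesday is a nice series"]
store = [(c, embed(c, vocab)) for c in chunks]

print(top_k(embed("travelling on Wednesday", vocab), store, k=2))
```

The returned chunks are the ones that would be passed to the LLM as context alongside the original question.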