How Does the System Find the Right Embedding?
Now we’ve learned how text gets converted into embeddings.
That’s a huge step.
But the next big question is:
How does the system know which embedding is the most relevant?
Let’s say a user asks a question.
The query gets converted into an embedding.
Your documents are already stored as embeddings inside a vector database.
Now the system needs to answer one very important question:
Which document chunk is the closest match to this query?
How does it figure that out?
This is where one of the most important ideas in modern AI comes in.
Vector Similarity
The answer is vector similarity.
This is the mathematical process used to compare embeddings and determine how similar they are.
In simple words:
The system compares vectors and finds the ones that are closest in meaning.
That’s it.
And this tiny idea powers some of the biggest systems in AI today.
Including:
- RAG systems
- semantic search
- recommendation engines
- anomaly detection
- search ranking systems
- vector databases like Pinecone
This similarity calculation is exactly what makes retrieval possible.
A Simple Example
Imagine two sentences:
“How do I increase my salary?”
and
“Ways to get a pay raise”
The words are different.
But the meaning is very similar.
Because of that, their embeddings will be close in vector space.
The system does not look for exact words.
It looks for closeness in vector space.
That is semantic search.
And that is why embeddings are so powerful.
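We can make this concrete with a tiny sketch. The three-dimensional vectors below are invented for illustration (real embedding models output hundreds or thousands of dimensions), but they show the core idea: related sentences land close together, unrelated ones land far apart.

```python
import numpy as np

# Toy "embeddings" -- hand-made values for illustration only.
salary_q = np.array([0.9, 0.8, 0.1])   # "How do I increase my salary?"
raise_q  = np.array([0.8, 0.9, 0.2])   # "Ways to get a pay raise"
weather  = np.array([0.1, 0.2, 0.9])   # an unrelated sentence

def cosine(a, b):
    # Closeness in direction, ignoring vector length.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(salary_q, raise_q))   # high: similar meaning
print(cosine(salary_q, weather))   # low: different meaning
```

Different words, similar meaning, nearby vectors. That is the whole trick.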
How Vector Databases Use This
Vector databases like:
- Pinecone
- FAISS
- Weaviate
store these embeddings and perform fast similarity search.
When a user asks a question:
- the query is converted into an embedding
- the database compares it with stored vectors
- the most similar chunks are retrieved
This process happens in milliseconds.
And that is what makes RAG feel so fast and intelligent.
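The retrieval loop above can be sketched in a few lines of NumPy. This is a toy version of what a vector database does internally (real systems use approximate nearest-neighbor indexes to stay fast at scale; the chunk vectors here are invented values):

```python
import numpy as np

# Pretend these are the stored embeddings of three document chunks.
chunks = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.2],
    [0.8, 0.2, 0.1],
])
query = np.array([0.85, 0.15, 0.05])  # the user's question, embedded

# Cosine similarity between the query and every chunk at once.
norms = np.linalg.norm(chunks, axis=1) * np.linalg.norm(query)
sims = chunks @ query / norms

# Indices of the top-2 most similar chunks, best match first.
top_k = np.argsort(sims)[::-1][:2]
print(top_k, sims[top_k])
```

Those top-k chunks are what gets handed to the LLM as context.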
But How Do We Measure Similarity?
This is the real technical question.
How do we mathematically decide whether two vectors are “close”?
There are three common similarity metrics used in practice.
Each one measures similarity in a slightly different way.
And understanding them is extremely important for building production-grade RAG systems.
The Three Most Common Similarity Metrics
We usually use one of these:
- Cosine Similarity
- Euclidean Distance
- Dot Product
Each has its own strengths depending on the use case.
And each one answers the same question:
How similar are these two vectors?
Euclidean Distance
Let’s begin with the first and probably the most intuitive similarity metric.
If you’ve ever measured the straight-line distance between two points on a graph…
then you already understand the basic idea.
That is exactly what Euclidean Distance does.
It measures:
the straight-line distance between two vectors in space
Simple.
Direct.
And very visual.
Imagine Two Points on a Map
Think of two locations on a map.
If the two points are very close to each other, the distance between them is small.
If they are far apart, the distance becomes large.
The same idea applies to embeddings.
Each embedding is a point in vector space.
And Euclidean Distance tells us how far apart those points are.
What Does This Mean for Similarity?
In simple words:
smaller distance = higher similarity
and
larger distance = lower similarity
So if two sentence embeddings are close together, the system assumes they are semantically similar.
For example:
- “salary increase”
- “pay raise”
would likely have a small Euclidean distance.
Because their meanings are closely related.
What Does Euclidean Distance Consider?
This metric looks at two things:
- how long the vectors are (magnitude)
- which direction they point (direction)
That means it does not just care about where the vectors point…
it also cares about how large they are.
This is important because sometimes vector size itself carries information.
The Formula
Mathematically, Euclidean Distance is written as:

d(p, q) = √((p₁ − q₁)² + (p₂ − q₂)² + … + (pₙ − qₙ)²)
At first, the formula may look scary…
but it’s actually doing something simple:
- Compare each coordinate
- Find the difference
- Square it
- Add everything together
- Take the square root
That final value is the distance between the two vectors.
```python
import numpy as np

# Single pair calculation
p1 = np.array([1, 2, 3])
p2 = np.array([4, 5, 6])
dist = np.linalg.norm(p1 - p2)  # sqrt((1-4)^2 + (2-5)^2 + (3-6)^2)
print(dist)  # 5.196...

# One-to-many vectorized calculation
origin = np.array([0, 0])
points = np.array([[1, 1], [2, 2], [3, 3]])
# axis=1 ensures we get a distance for each row
all_distances = np.linalg.norm(points - origin, axis=1)
print(all_distances)  # [1.414... 2.828... 4.242...]
```
Cosine Similarity
Now we come to the most popular similarity metric used in modern NLP and RAG systems.
If Euclidean Distance measures how far apart two vectors are…
Cosine Similarity measures something different.
It asks:
Are these two vectors pointing in the same direction?
And this small idea makes a huge difference.
It Measures Angle, Not Distance
Unlike Euclidean Distance, cosine similarity does not focus on actual distance.
Instead, it measures the angle between two vectors.
That means:
only the direction matters, not the magnitude
This is extremely important.
Because sometimes one vector may be much longer than another…
but if both point in the same direction, they still represent very similar meaning.
And in language tasks, that matters a lot.
What Does the Angle Tell Us?
Here’s the simple intuition:
Small angle → High similarity
If two vectors point in almost the same direction, the angle between them is very small.
That means they are highly similar.
Large angle → Low similarity
If the vectors point in very different directions, similarity decreases.
And if they point in completely opposite directions, similarity becomes very low.
So in simple words:
smaller angle = stronger semantic similarity
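A quick sketch makes the "direction, not length" idea concrete. Doubling a vector keeps it pointing the same way, so its cosine similarity with the original stays at 1; flipping it gives -1:

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
print(cosine(a, 2 * a))   # same direction, twice the length -> 1.0
print(cosine(a, -a))      # opposite direction -> -1.0
```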
The Formula
Mathematically, cosine similarity is written as:

cos(θ) = (A · B) / (‖A‖ × ‖B‖)
This formula compares:
- the dot product of the vectors
- divided by their magnitudes
The final result always falls between -1 and 1:
- 1 → very similar
- 0 → unrelated
- -1 → opposite direction
So the closer the value is to 1, the better the match.
```python
import numpy as np
from numpy.linalg import norm

A = np.array([2, 1, 2, 3, 2, 9])
B = np.array([3, 4, 2, 4, 5, 5])

# dot product divided by the product of the magnitudes
cosine = np.dot(A, B) / (norm(A) * norm(B))
print("Cosine Similarity:", cosine)  # ≈ 0.819
```
Dot Product Similarity
Now let’s look at the third similarity metric.
We’ve already seen:
- Euclidean Distance → measures straight-line distance
- Cosine Similarity → measures angle between vectors
Now comes Dot Product.
And this one is a little different.
It looks at both:
- direction
- magnitude
That means it cares about not only where the vectors point…
but also how large they are.
The Core Idea
Dot Product works by:
multiplying corresponding vector values and then adding them together
Simple idea.
Powerful result.
If two vectors point in a similar direction and also have strong magnitudes, the dot product becomes large.
That means higher similarity.
So unlike cosine similarity…
vector length matters here.
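We can see the difference from cosine similarity directly. Doubling a vector leaves the cosine similarity untouched, but the dot product doubles with it:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2 * a  # same direction, twice the length

# Cosine similarity ignores the length difference...
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos)            # -> 1.0 (up to floating-point rounding)

# ...but the dot product scales with it.
print(np.dot(a, a))   # 14.0
print(np.dot(a, b))   # 28.0
```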
The Formula
Mathematically, dot product is written as:

x · y = x₁y₁ + x₂y₂ + … + xₙyₙ
What this means:
- Take each coordinate from vector x
- Multiply it with the matching coordinate from vector y
- Add all those values together
The final number tells us how aligned the vectors are.
The larger the value, the stronger the similarity.
Where Dot Product Is Commonly Used
Dot product is often useful in:
- recommendation systems
- ranking systems
- retrieval models
- large-scale search pipelines
Especially when preference strength or confidence matters.
In some high-performance vector search systems, dot product is also preferred because it can be computationally efficient.
```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1*4 + 2*5 + 3*6
result = np.dot(a, b)
print(result)  # 32
```
Choosing the Right Metric 🚀
So which one should you use?
The golden rule is simple:
use the same similarity metric that your embedding model was trained with
This lets vector databases like Pinecone use their most optimized search algorithms and return the most accurate results. Mixing metrics can quietly degrade retrieval quality.
And this is exactly what makes retrieval in RAG so powerful.
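One closing observation worth verifying for yourself: if your embedding model outputs unit-normalized vectors (many do), the three metrics agree on which neighbors are closest. For unit vectors, the dot product equals the cosine similarity, and the squared Euclidean distance is just 2 − 2 × cosine. A quick check, using random vectors as stand-ins for embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=8)
b = rng.normal(size=8)

# Normalize both vectors to unit length.
a_hat = a / np.linalg.norm(a)
b_hat = b / np.linalg.norm(b)

cos = np.dot(a_hat, b_hat) / (np.linalg.norm(a_hat) * np.linalg.norm(b_hat))
dot = np.dot(a_hat, b_hat)                 # identical to cos for unit vectors
dist_sq = np.sum((a_hat - b_hat) ** 2)     # equals 2 - 2 * cos

print(cos, dot, dist_sq, 2 - 2 * cos)
```

So with normalized embeddings, the choice mostly comes down to what your model and database expect, not to which metric is "better".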