Vector Similarity

Course: Everything about Retrieval Augmented Generation (RAG)

How Does the System Find the Right Embedding?

Now we’ve learned how text gets converted into embeddings.

That’s a huge step.

But the next big question is:

How does the system know which embedding is the most relevant?

Let’s say a user asks a question.

The query gets converted into an embedding.

Your documents are already stored as embeddings inside a vector database.

Now the system needs to answer one very important question:

Which document chunk is the closest match to this query?

How does it figure that out?

This is where one of the most important ideas in modern AI comes in.

Vector Similarity

The answer is vector similarity.

This is the mathematical process used to compare embeddings and determine how similar they are.

In simple words:

The system compares vectors and finds the ones that are closest in meaning.

That’s it.

And this tiny idea powers some of the biggest systems in AI today.

Including:

  • RAG systems
  • semantic search
  • recommendation engines
  • anomaly detection
  • search ranking systems
  • vector databases like Pinecone

This similarity calculation is exactly what makes retrieval possible.

A Simple Example

Imagine two sentences:

“How do I increase my salary?”

and

“Ways to get a pay raise”

The words are different.

But the meaning is very similar.

Because of that, their embeddings will be close in vector space.

Something like this:

e_{\text{salary increase}} \approx e_{\text{pay raise}}

The system does not look for exact words.

It looks for closeness in vector space.

That is semantic search.

And that is why embeddings are so powerful.

How Vector Databases Use This

Vector databases like:

  • Pinecone
  • FAISS
  • Weaviate

store these embeddings and perform fast similarity search.

When a user asks a question:

  1. the query is converted into an embedding
  2. the database compares it with stored vectors
  3. the most similar chunks are retrieved

This process happens in milliseconds.

And that is what makes RAG feel so fast and intelligent.
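These three steps can be sketched with plain NumPy, using made-up two-dimensional embeddings (real embeddings have hundreds of dimensions) and cosine similarity, one of the metrics covered later in this lesson:

```python
import numpy as np

# Hypothetical stored chunk embeddings (in practice produced by an
# embedding model and held in a vector database)
chunks = ["ways to get a pay raise", "best hiking trails"]
stored = np.array([
    [0.9, 0.1],
    [0.1, 0.9],
])

# Step 1: the query is converted into an embedding (made-up values)
query = np.array([0.8, 0.2])

# Step 2: compare the query against every stored vector (cosine similarity)
scores = stored @ query / (np.linalg.norm(stored, axis=1) * np.linalg.norm(query))

# Step 3: retrieve the most similar chunk
best = int(np.argmax(scores))
print(chunks[best])  # ways to get a pay raise
```

A real vector database does the same comparison, but with approximate nearest-neighbor indexes so it scales to millions of vectors.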

But How Do We Measure Similarity?

This is the real technical question.

How do we mathematically decide whether two vectors are “close”?

There are three common similarity metrics used in practice.

Each one measures similarity in a slightly different way.

And understanding them is extremely important for building production-grade RAG systems.

The Three Most Common Similarity Metrics

We usually use one of these:

  1. Cosine Similarity
  2. Euclidean Distance
  3. Dot Product

Each has its own strengths depending on the use case.

And each one answers the same question:

How similar are these two vectors?

Euclidean Distance

Let’s begin with the first and probably the most intuitive similarity metric.

If you’ve ever measured the straight-line distance between two points on a graph…

then you already understand the basic idea.

That is exactly what Euclidean Distance does.

It measures:

the straight-line distance between two vectors in space

Simple.

Direct.

And very visual.

Imagine Two Points on a Map

Think of two locations on a map.

If the two points are very close to each other, the distance between them is small.

If they are far apart, the distance becomes large.

The same idea applies to embeddings.

Each embedding is a point in vector space.

And Euclidean Distance tells us how far apart those points are.

What Does This Mean for Similarity?

In simple words:

smaller distance = higher similarity

and

larger distance = lower similarity

So if two sentence embeddings are close together, the system assumes they are semantically similar.

For example:

  • “salary increase”
  • “pay raise”

would likely have a small Euclidean distance.

Because their meanings are closely related.

What Does Euclidean Distance Consider?

This metric looks at two things:

  • how long the vectors are (magnitude)
  • which direction they point (direction)

That means it does not just care about where the vectors point…

it also cares about how large they are.

This is important because sometimes vector size itself carries information.

The Formula

Mathematically, Euclidean Distance is written as:

d(x,y)=\sqrt{\sum_{i=1}^{n}(x_i-y_i)^2}

At first, the formula may look scary…

but it’s actually doing something simple:

  1. Compare each coordinate
  2. Find the difference
  3. Square it
  4. Add everything together
  5. Take the square root

That final value is the distance between the two vectors.

import numpy as np

# Single pair calculation
p1 = np.array([1, 2, 3])
p2 = np.array([4, 5, 6])
dist = np.linalg.norm(p1 - p2)
print(dist)

# One-to-many vectorized calculation
origin = np.array([0, 0])
points = np.array([[1, 1], [2, 2], [3, 3]])
# axis=1 ensures we get a distance for each row
all_distances = np.linalg.norm(points - origin, axis=1)
print(all_distances)

Cosine Similarity

Now we come to the most popular similarity metric used in modern NLP and RAG systems.

If Euclidean Distance measures how far apart two vectors are…

Cosine Similarity measures something different.

It asks:

Are these two vectors pointing in the same direction?

And this small idea makes a huge difference.

It Measures Angle, Not Distance

Unlike Euclidean Distance, cosine similarity does not focus on actual distance.

Instead, it measures the angle between two vectors.

That means:

only the direction matters, not the magnitude

This is extremely important.

Because sometimes one vector may be much longer than another…

but if both point in the same direction, they still represent very similar meaning.

And in language tasks, that matters a lot.
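A quick sanity check of this claim, sketched with made-up vectors: scaling a vector changes its length but not its direction, so its cosine similarity to the original stays at 1.

```python
import numpy as np
from numpy.linalg import norm

def cosine(x, y):
    return np.dot(x, y) / (norm(x) * norm(y))

a = np.array([1.0, 2.0, 3.0])
b = 10 * a                       # same direction, 10x the length
c = np.array([3.0, -1.0, 0.5])   # a different direction

print(cosine(a, b))  # ≈ 1.0 — the length changed, the direction did not
print(cosine(a, c))  # much lower, because the directions differ
```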

What Does the Angle Tell Us?

Here’s the simple intuition:

Small angle → High similarity

If two vectors point in almost the same direction, the angle between them is very small.

That means they are highly similar.

Large angle → Low similarity

If the vectors point in very different directions, similarity decreases.

And if they point in completely opposite directions, similarity becomes very low.

So in simple words:

smaller angle = stronger semantic similarity

The Formula

Mathematically, cosine similarity is written as:

\cos(\theta)=\frac{x \cdot y}{\|x\|\,\|y\|}

This formula compares:

  • the dot product of the vectors
  • divided by their magnitudes

The final result always falls between -1 and 1:

  • 1 → very similar
  • 0 → unrelated
  • -1 → opposite direction

So the closer the value is to 1, the better the match.

import numpy as np
from numpy.linalg import norm

A = np.array([2, 1, 2, 3, 2, 9])
B = np.array([3, 4, 2, 4, 5, 5])

# compute cosine similarity
cosine = np.dot(A, B) / (norm(A) * norm(B))
print("Cosine Similarity:", cosine)

Dot Product Similarity

Now let’s look at the third similarity metric.

We’ve already seen:

  • Euclidean Distance → measures straight-line distance
  • Cosine Similarity → measures angle between vectors

Now comes Dot Product.

And this one is a little different.

It looks at both:

  • direction
  • magnitude

That means it cares about not only where the vectors point…

but also how large they are.

The Core Idea

Dot Product works by:

multiplying corresponding vector values and then adding them together

Simple idea.

Powerful result.

If two vectors point in a similar direction and also have strong magnitudes, the dot product becomes large.

That means higher similarity.

So unlike cosine similarity…

vector length matters here.
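To see the contrast, here is a small sketch with made-up vectors: doubling a vector doubles its dot product with another vector, while the cosine between them is unchanged.

```python
import numpy as np
from numpy.linalg import norm

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Dot product grows with magnitude...
print(np.dot(x, y))      # 32.0
print(np.dot(2 * x, y))  # 64.0 — doubled along with the vector

# ...but cosine similarity does not change
cos1 = np.dot(x, y) / (norm(x) * norm(y))
cos2 = np.dot(2 * x, y) / (norm(2 * x) * norm(y))
print(cos1, cos2)  # same value
```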

The Formula

Mathematically, dot product is written as:

x \cdot y = \sum_{i=1}^{n} x_i y_i

What this means:

  1. Take each coordinate from vector x
  2. Multiply it with the matching coordinate from vector y
  3. Add all those values together

The final number tells us how aligned the vectors are.

The larger the value, the stronger the similarity.

Where Dot Product Is Commonly Used

Dot product is often useful in:

  • recommendation systems
  • ranking systems
  • retrieval models
  • large-scale search pipelines

Especially when preference strength or confidence matters.

In some high-performance vector search systems, dot product is also preferred because it skips the normalization step that cosine similarity requires, making it computationally cheaper.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# multiply matching coordinates and add them up: 1*4 + 2*5 + 3*6 = 32
result = np.dot(a, b)

print(result)

Choosing the Right Metric 🚀

So which one should you use?

The golden rule is simple:

use the same similarity metric that your embedding model was trained with

Matching the metric keeps retrieval scores consistent with how the embedding model organizes its vector space, and it lets vector databases like Pinecone use their most optimized search algorithms and return accurate results.

And this is exactly what makes retrieval in RAG so powerful.
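One practical consequence, sketched below with random made-up vectors: when embeddings are normalized to unit length (which some embedding models do), cosine similarity and dot product produce identical scores, so a database can use whichever is cheaper without changing the ranking.

```python
import numpy as np
from numpy.linalg import norm

rng = np.random.default_rng(0)
vectors = rng.normal(size=(5, 8))

# Normalize every vector to unit length
unit = vectors / norm(vectors, axis=1, keepdims=True)

query = unit[0]
others = unit[1:]

cosine_scores = others @ query / (norm(others, axis=1) * norm(query))
dot_scores = others @ query

# With all norms equal to 1, the denominator vanishes and the metrics agree
print(np.allclose(cosine_scores, dot_scores))  # True
```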
