This article was originally written in Arabic and translated into English with the help of AI. If you happen to speak Arabic, I'd love for you to check out the original version.
Imagine you gave ChatGPT a 1,000-page PDF and asked about one tiny detail. In a fraction of a second, it pulls the answer from a sea of words. How did it do it? Did it read the whole book in that instant? Impossible.
The secret is that the book was partitioned and transformed into Vectors (arrays of numbers) using an embedding model. When you ask a question, your query is also converted into a Vector, and the system looks for the 3 or 4 text chunks that are most semantically similar.
The real challenge: How does this system find these chunks among millions of vectors in the blink of an eye without checking every single one?
The clever solution is to build a Graph—a network of connections between these vectors—and navigate it using a greedy approach until you reach the closest match. This solution is called Navigable Small World (NSW).
KNN (K-Nearest Neighbors) in the context of vector search is the "manual labor" method: you have a query vector q, you calculate the distance between it and every vector in your dataset, and you pick the top k.
```python
def brute_force_search(query, all_vectors, k):
    distances = []
    for v in all_vectors:
        # One full distance computation per stored vector: O(d) each
        d = euclidean_distance(query, v)
        distances.append((v, d))
    distances.sort(key=lambda x: x[1])
    return distances[:k]
```

The Cost: O(n×d) per query (where n is the number of vectors and d is the number of dimensions).
Each dimension is a single number in the vector, representing a specific aspect of the text's meaning.
If you have a million vectors with 768 dimensions? You'd need 768 million calculations just to answer one question. The result would be 100% accurate, but the system would be too slow for any practical application.
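To make that cost concrete, here is a minimal NumPy sketch of the same brute-force scan (the array sizes and names are illustrative, scaled down to 10,000 vectors so it runs quickly):

```python
import numpy as np

def brute_force_topk(query, vectors, k):
    # One subtraction and square per dimension per vector: O(n * d) work.
    dists = np.linalg.norm(vectors - query, axis=1)
    # Sorting adds O(n log n) on top of the distance pass.
    idx = np.argsort(dists)[:k]
    return idx, dists[idx]

rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 768))  # stand-in for 10k embeddings
query = rng.normal(size=768)
idx, dists = brute_force_topk(query, vectors, k=3)
```

Even vectorized, every query still touches every row of the matrix; the arithmetic gets faster, but the O(n×d) shape of the problem does not change.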
The question NSW answers: Can we organize these vectors into a data structure that allows us to find the top k vectors in O(log n) without scanning the entire dataset?
Before graphs, people tried various solutions like search trees (KD-trees) or hashing (LSH). However, these methods often fail when vector dimensionality increases—a problem known as the Curse of Dimensionality.
Graphs take a completely different route.
If you connect every vector to its closest neighbors, "proximity" becomes a Path you can walk. Instead of searching everything, you enter the graph at any random point and ask the neighbors: "Which of you is most similar to this query?"
You move to the closest point and repeat the process (Greedy Search) until you reach the best possible nearest neighbors.
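The greedy walk described above can be sketched in a few lines. This is an illustrative implementation, not the paper's code: the graph is a plain adjacency dict and the vectors live in a parallel dict.

```python
import numpy as np

def greedy_search(graph, vectors, query, entry):
    """Walk the graph, always moving to the neighbor closest to the query.
    graph: {node_id: [neighbor_ids]}, vectors: {node_id: np.ndarray}."""
    current = entry
    current_dist = np.linalg.norm(vectors[current] - query)
    while True:
        best, best_dist = current, current_dist
        for nb in graph[current]:
            d = np.linalg.norm(vectors[nb] - query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:  # no neighbor is closer: local minimum reached
            return current, current_dist
        current, current_dist = best, best_dist

# Tiny 1-D example: a chain 0 - 1 - 2 - 3, query near the far end
vectors = {i: np.array([float(i)]) for i in range(4)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
node, dist = greedy_search(graph, vectors, np.array([2.9]), entry=0)
```

Starting at node 0, the walk hops 0 → 1 → 2 → 3 and stops at 3, because no neighbor of 3 is closer to the query.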
This concept means our graph must have two types of connections: short-range links to immediate neighbors (for precision) and long-range links that jump across distant regions of the graph (for speed).
Analogy: Think of a road network. The Highway is the long-range link that gets you from one side of the country to the other in record time. City streets are the short-range links that take you to a specific address. Without the highway, you'd have to drive through every single small town along the way.
The original paper (Malkov et al., 2014) proposes a construction method that is both simple and brilliant: Sequential Insertion. Vectors enter the graph one by one.
This process transforms the dataset from scattered points into a Connected Network.
```python
def insert(self, new_node: Node):
    # Phase 1: empty graph, the new node becomes the entry point
    if len(self.nodes) == 0:
        self.entry_point = new_node
        self.nodes.append(new_node)
        return
    # Phase 2: fewer than M nodes, connect to everyone
    if len(self.nodes) < self.m:
        for node in self.nodes:
            new_node.neighbors.append(node)
            node.neighbors.append(new_node)
        self.nodes.append(new_node)
        return
    # Phase 3: standard neighbor search
    neighbors = self.search(new_node.vector, [self.entry_point], self.m)
    for neighbor in neighbors:
        new_node.neighbors.append(neighbor)
        neighbor.neighbors.append(new_node)
    self.nodes.append(new_node)
```

When we call insert, the code handles three phases:

- Empty graph: the new node becomes the `entry_point`, the foundation for all future searches.
- Fewer than M nodes: the new node is linked to every existing node.
- Otherwise: we run `self.search` to find the M closest neighbors for the new arrival and link them bidirectionally.

There isn't a single line of code that says "make a long link." These links form naturally because the first vectors added to the graph are forced to connect to each other even if they are far apart (because there are no other options). They become the "highways" of the system later on.
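You can watch those accidental highways appear with a toy run. This sketch simplifies the insertion above in one place: instead of graph search, it finds the M nearest neighbors by brute force (the `TinyNSW` class and all names are illustrative, not the article's full implementation):

```python
import numpy as np

class Node:
    def __init__(self, vector):
        self.vector = np.asarray(vector, dtype=float)
        self.neighbors = []

class TinyNSW:
    """Sequential insertion, simplified: exact m-NN instead of graph search."""
    def __init__(self, m=2):
        self.m = m
        self.nodes = []

    def insert(self, new_node):
        if len(self.nodes) < self.m:
            # Too few nodes: connect to everyone, however far apart.
            for node in self.nodes:
                new_node.neighbors.append(node)
                node.neighbors.append(new_node)
        else:
            # Simplification: brute-force m nearest (the paper walks the graph).
            nearest = sorted(
                self.nodes,
                key=lambda n: np.linalg.norm(n.vector - new_node.vector),
            )[:self.m]
            for node in nearest:
                new_node.neighbors.append(node)
                node.neighbors.append(new_node)
        self.nodes.append(new_node)

g = TinyNSW(m=2)
a, b = Node([0.0]), Node([100.0])   # far apart, inserted first
g.insert(a); g.insert(b)            # forced link: a future "highway"
for x in [1.0, 2.0, 99.0]:
    g.insert(Node([x]))
```

After all insertions, `a` and `b` remain linked even though they are the two most distant points in the dataset: that edge is exactly the long-range shortcut the text describes.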
Imagine the graph is ready. A user submits a query. Our goal is to find the closest points without looking at all million points. We start at the entry_point and move greedily: always moving to the neighbor that looks closest to the target.
To manage this movement and ensure accuracy, we use two Heaps:

- candidates: a min-heap of nodes we still want to explore, with the closest one on top.
- results: a max-heap of the best results found so far, with the worst of them on top.
The search doesn't run forever. One specific stopping condition gives NSW its speed:
```python
closest_node, closest_dist = candidates.pop_min()
furthest_node, furthest_dist = results.peek_max()
if closest_dist > furthest_dist:
    break
```

The logic is simple: If the closest candidate I have (the best possible one I can explore next) is further than the worst result currently on my leaderboard... then I've hit a dead end. Any new neighbor I find will only be further away. We stop and return the results.
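Here is one way to put that stopping condition into a complete search loop, sketched with Python's `heapq` (which only provides a min-heap, so the results max-heap stores negated distances; the graph layout and names are illustrative):

```python
import heapq
import numpy as np

def nsw_search(graph, vectors, query, entry, k):
    """Best-first search: a min-heap of candidates to expand,
    a max-heap (via negated distances) of the best k results so far."""
    d0 = np.linalg.norm(vectors[entry] - query)
    candidates = [(d0, entry)]   # min-heap: closest candidate on top
    results = [(-d0, entry)]     # max-heap: worst current result on top
    visited = {entry}
    while candidates:
        closest_dist, node = heapq.heappop(candidates)
        furthest_dist = -results[0][0]
        if closest_dist > furthest_dist:
            break                # the dead-end condition: stop searching
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = np.linalg.norm(vectors[nb] - query)
            if len(results) < k or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > k:
                    heapq.heappop(results)  # evict the current worst
    return sorted((-d, n) for d, n in results)

# Chain graph 0-1-2-3-4-5; the two nearest to the query 3.2 are nodes 3 and 4
vectors = {i: np.array([float(i)]) for i in range(6)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
top = nsw_search(graph, vectors, np.array([3.2]), entry=0, k=2)
```

The `visited` set prevents re-expanding nodes, and the eviction keeps `results` at size k, so `furthest_dist` always reflects the current worst answer on the leaderboard.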
I tested this on a dataset of chess articles. Here are the findings:
| Metric | Brute Force | NSW |
|---|---|---|
| Accuracy (Recall) | 100% | ~99-100% (High) |
| Complexity | Linear O(n) | Logarithmic O(log n) |
| Search Ops (1M vectors) | 1,000,000 | ~200 |
NSW is powerful, but it has one flaw: as data grows, the "entry point" might be very far from the answer. The algorithm wastes time "wandering" through local neighborhoods just to get to the right region.
This is where HNSW (Hierarchical Navigable Small World) comes in, using an idea inspired by Google Maps.
Imagine you are zoomed in to street level in Mexico City, but you want to find a street in Taghazout, Morocco. In Standard NSW, you'd have to slide across the map at the same zoom level, crossing the Atlantic millimeter by millimeter. In HNSW, navigation is multi-layered:
- Level 1 (The Big Jump): The system zooms out to see continents. One "hop" takes you from America to Africa.
- Level 2 (Regional Search): The system zooms in to see countries. You move toward the Agadir region.
- Level 3 (Final Destination): The system zooms in to 100%. You are back at street level, landing exactly in Taghazout.
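In HNSW, which layers a point lives on is decided randomly at insertion time, following an exponentially decaying distribution, so higher layers hold exponentially fewer points. A small sketch (the constant `m_l` is the level-normalization factor from the paper, typically set to 1/ln(M)):

```python
import math
import random

def random_level(m_l=1.0, rng=random.random):
    # Each node appears on layers 0..level; higher layers get
    # exponentially fewer nodes, so the top layer is sparse and coarse.
    return int(-math.log(rng()) * m_l)

random.seed(42)
levels = [random_level() for _ in range(100_000)]
# Roughly a 1/e fraction of nodes reaches layer 1, 1/e^2 reaches layer 2, ...
```

That thinning is what creates the "zoom levels": the sparse top layer only has highways, and each layer down fills in more local streets.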
Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs (Malkov & Yashunin): this is the legendary paper that set the standard for modern Vector Databases like Milvus, Weaviate, and Pinecone.
The full code for this experiment is available on GitHub — Project name: jiwar.