IVFPQ + HNSW for Billion-scale Similarity Search
The best indexing approach for billion-sized vector datasets
We learned about IVFPQ in the previous article, where the inverted file index (IVF) is combined with product quantization (PQ) to create an effective method for large-scale similarity search.
In this article, we will learn about hierarchical navigable small world (HNSW) graphs and how HNSW can be used together with IVFPQ to form the best indexing approach for billion-scale similarity search.
We will first introduce navigable small world (NSW) graphs and skip lists, the two foundations that HNSW is built upon. We will then go through the implementation of HNSW using Faiss, the effect of different parameter settings, and how the different variations of HNSW indexes compare in terms of search quality, speed, and memory usage.