Wednesday, May 6, 2026

Similarity Metrics & Search Algorithms

Type | Name | Description
Similarity Metric | Cosine Similarity | Measures the angle between vectors and ignores magnitude; the most common choice in RAG; 1 = same direction (very similar), 0 = orthogonal (unrelated), -1 = opposite
Similarity Metric | Dot Product | Measures alignment and magnitude together; equals cosine similarity when vectors are unit-normalized; commonly used with embedding models
Similarity Metric | Euclidean Distance (L2) | Straight-line distance between vectors; smaller distance = more similar
Similarity Metric | Manhattan Distance (L1) | Grid-based distance (sum of absolute differences); less common for embeddings
Similarity Metric | Jaccard Similarity | Set-based similarity (intersection size over union size); used for sparse or keyword-style data
Search Algorithm | Brute Force (Exact Search) | Compares the query with every vector; exact results but slow at scale
Search Algorithm | k-d Tree | Space-partitioning tree; efficient for low-dimensional data but performs poorly for high-dimensional embeddings
Search Algorithm | Ball Tree | Partitions with hyperspheres instead of axis-aligned splits; slightly better than a k-d tree in some cases but still limited in high dimensions
Search Algorithm | HNSW (Hierarchical Navigable Small World) | Graph-based ANN algorithm; fast with high recall; widely used in FAISS, Weaviate, etc.
Search Algorithm | IVF (Inverted File Index) | Clusters vectors first, then searches only the clusters nearest the query; reduces the search space significantly
Search Algorithm | PQ (Product Quantization) | Compresses vectors to reduce memory and speed up search; often combined with IVF
Search Algorithm | Annoy (Approximate Nearest Neighbors Oh Yeah) | Tree-based method using random projections; used by Spotify; good balance of speed and simplicity
Search Algorithm | ScaNN | Google’s optimized ANN algorithm; combines partitioning and scoring for efficient search
Search Algorithm | LSH (Locality Sensitive Hashing) | Hashes similar vectors into the same buckets; very fast but typically less accurate
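
To make the metrics and the brute-force row concrete, here is a minimal sketch in plain Python (standard library only). The function names, the toy vectors, and the zero-vector convention are my own assumptions for illustration, not taken from any particular library.

```python
import math


def dot(a, b):
    """Dot product: alignment scaled by both magnitudes."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b, in [-1, 1]."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat a zero vector as unrelated to everything
    return dot(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """L2: straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """L1: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def jaccard_similarity(s, t):
    """Set overlap: |intersection| / |union|."""
    s, t = set(s), set(t)
    if not s and not t:
        return 1.0  # assumption: two empty sets count as identical
    return len(s & t) / len(s | t)


def brute_force_search(query, vectors, k=2):
    """Exact search: score every vector, return the top-k (index, score) pairs."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional "embeddings" (hypothetical data).
docs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [2.0, 2.0]]
top = brute_force_search([1.0, 0.1], docs, k=2)
print(top)
```

Every query touches every stored vector, so the cost grows linearly with the collection size. That is the cost the ANN algorithms in the table trade a little accuracy to avoid.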


Category | Search Type | Algorithm / Technique | Role in the RAG Pipeline
Strategy | Sparse (Keyword) | BM25, TF-IDF | Matches exact terms, codes, and specific jargon.
Indexing Algorithm | Dense (Vector) | HNSW, IVF, Annoy, LSH, ScaNN | Organizes vector space for fast "nearest neighbor" lookups.
Compression | Vector Compression | PQ (Product Quantization) | Compresses vectors to reduce memory footprint at scale.
Strategy | Hybrid | RRF (Reciprocal Rank Fusion) | Merges ranked lists from different sources (keyword + vector) using ranks only, not raw scores.
Strategy | Hybrid | Convex Combination (Alpha-blending) | Weighs keyword vs. vector scores using a specific ratio (e.g., 0.7 vector, 0.3 keyword).
Strategy | Post-Retrieval | Cross-Encoders, ColBERT | A second-pass "judge" that re-ranks results for high precision.
Strategy | Query Expansion | HyDE (Hypothetical Document Embeddings) | Generates a "fake" answer and uses its embedding as the vector search query.
Strategy | Knowledge-Based | Leiden, Cypher, Louvain | Leiden and Louvain group a knowledge graph into communities; Cypher queries traverse graph relationships to find "multi-hop" information.
Strategy | Contextual | Parent Document Retrieval | Uses small chunks for searching but passes the larger parent chunks to the LLM.
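
The two hybrid rows can be sketched in a few lines of Python. This is an illustration under assumptions: the document ids and scores are invented, `k=60` is only the constant conventionally used with RRF, and min-max normalization is one of several ways to put keyword and vector scores on a common scale before blending.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """RRF: each list contributes 1 / (k + rank) to a document's fused score."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


def min_max(scores):
    """Rescale raw scores to [0, 1] so different scorers become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def convex_combination(vector_scores, keyword_scores, alpha=0.7):
    """Alpha-blending: alpha * vector score + (1 - alpha) * keyword score."""
    v, kw = min_max(vector_scores), min_max(keyword_scores)
    fused = {
        d: alpha * v.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0)
        for d in set(v) | set(kw)  # a doc missing from one list scores 0 there
    }
    return sorted(fused, key=lambda d: fused[d], reverse=True)


# Hypothetical results from a keyword (BM25-style) and a vector retriever.
keyword_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d3", "d1", "d4"]
rrf_order = reciprocal_rank_fusion([keyword_ranking, vector_ranking])

keyword_scores = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
vector_scores = {"d3": 0.9, "d1": 0.8, "d4": 0.4}
blend_order = convex_combination(vector_scores, keyword_scores, alpha=0.7)

print(rrf_order)
print(blend_order)
```

RRF needs only the rank positions, which makes it a convenient default when the two retrievers produce scores on unrelated scales. The convex combination uses the score values themselves, so its result depends on how the scores are normalized and on the chosen alpha.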




FAISS, Pinecone, and Weaviate
Feature | FAISS | Pinecone | Weaviate
Search Algorithms | HNSW, IVF, PQ, Flat (Exact), LSH | Proprietary (built on HNSW, IVF, PQ) | HNSW (custom CRUD-optimized), Flat
Similarity Metrics | L2, Inner Product (IP), Cosine | Cosine, L2, Dot Product | Cosine, Dot Product, L2, Manhattan, Hamming
Primary Focus | Low-level library for researchers | Managed SaaS for production RAG | Open-source DB with Hybrid search

1. FAISS (Facebook AI Similarity Search)
FAISS is a highly flexible library that provides "building blocks" rather than a single fixed algorithm. 
  • Search Algorithms: It offers a wide variety of indexes. Common ones include the IndexHNSW family (graph-based), the IndexIVF family (clustering), and IndexFlat (brute-force exact search). It also provides Product Quantization (PQ) to compress vectors, often combined with IVF.
  • Similarity Metrics: Primarily optimized for L2 (Euclidean) and Inner Product. Cosine Similarity is obtained by L2-normalizing the vectors and then searching with Inner Product.
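
The normalize-then-inner-product trick is easy to check by hand. The sketch below uses plain Python rather than FAISS itself, with helper names and vectors of my own choosing. In FAISS the same pattern is usually written with `faiss.normalize_L2` followed by an inner-product index such as `IndexFlatIP`.

```python
import math


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale v to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(inner_product(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Reference cosine similarity computed directly from the definition."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


a, b = [3.0, 4.0], [4.0, 3.0]

raw_ip = inner_product(a, b)  # depends on the vectors' lengths
ip_of_normalized = inner_product(l2_normalize(a), l2_normalize(b))

print(raw_ip, ip_of_normalized, cosine(a, b))
```

After normalization both vectors have length 1, so the inner product no longer depends on magnitude and matches the cosine similarity (up to floating-point rounding). Both the stored vectors and the query need to be normalized for this to hold.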
2. Pinecone
Pinecone is a fully managed service, so its internal "recipe" is proprietary and not something you configure directly.
  • Search Algorithms: The implementation is not public. It is usually described as drawing on the same families of techniques covered above (graph-based search such as HNSW, clustering such as IVF, and compression such as PQ) to balance speed and accuracy at scale.
  • Similarity Metrics: You choose the metric when creating an index. It supports Cosine Similarity (the usual choice for text embeddings), Euclidean (L2), and Dot Product.
3. Weaviate
Weaviate is designed as a full database and focuses on high-speed retrieval and hybrid search. 
  • Search Algorithms: The default and most common index is a custom, high-performance implementation of HNSW. It is specifically optimized to allow real-time CRUD (Create, Read, Update, Delete) operations, which is often difficult with standard HNSW.
  • Similarity Metrics: It provides a broad range: Cosine, Dot Product, L2-Squared, Manhattan, and Hamming. It defaults to Cosine Distance.
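
Two details in that metric list are worth a quick illustration: cosine is expressed as a distance (1 minus the similarity, so smaller is better), and L2-Squared skips the square root without changing the ranking. The sketch below is plain Python with names and data I made up; it is not Weaviate's implementation.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0 = same direction, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_squared(a, b):
    """Squared Euclidean distance: no square root needed."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2(a, b):
    return math.sqrt(l2_squared(a, b))


def hamming_distance(a, b):
    """Number of positions where the two vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def rank_by(distance, query, vectors):
    """Indices of vectors ordered from nearest to farthest."""
    return sorted(range(len(vectors)), key=lambda i: distance(query, vectors[i]))


query = [0.0, 0.0]
vectors = [[3.0, 4.0], [1.0, 1.0], [0.0, 2.0]]

order_squared = rank_by(l2_squared, query, vectors)
order_plain = rank_by(l2, query, vectors)
print(order_squared, order_plain)
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
```

Because the square root is monotonic, both orderings are identical, which is why a database can return the cheaper squared value and still find the same neighbors.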

