Wednesday, May 6, 2026

Retrieval Algorithms in RAG Systems - Summary

Each entry below gives the category, the algorithm or technique, and its key details.
Hierarchical Retrieval K-Means Hierarchical Retrieval
  • Uses K-Means clustering on embedding vectors.
  • Creates semantic groups (clusters) of document chunks.
  • At query time:
    • Embed query
    • Find nearest cluster centroid
    • Search only inside selected clusters
  • Reduces retrieval latency and search space.
  • Useful in large corpora with distinct topic separation.
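The routing step above can be sketched in a few lines. This is a minimal, illustrative version using plain numpy (a toy Lloyd's k-means rather than a production library), where the chunk embeddings and the query vector are assumed to come from your embedding model:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Toy Lloyd's k-means over chunk embedding vectors."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each chunk to its nearest centroid, then recompute centroids
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

def retrieve(query, X, centroids, labels, top_k=2):
    """Route the query to its nearest cluster, then search only inside it."""
    c = np.linalg.norm(centroids - query, axis=1).argmin()
    idx = np.where(labels == c)[0]
    order = np.linalg.norm(X[idx] - query, axis=1).argsort()[:top_k]
    return idx[order]
```

Note that only the selected cluster's members are scored against the query, which is exactly where the latency saving comes from.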
Hierarchical Retrieval Hierarchical Agglomerative Clustering (HAC)
  • Bottom-up clustering algorithm.
  • Starts with individual chunks and recursively merges similar clusters.
  • Produces dendrogram/tree-like hierarchy.
  • Useful for recursive retrieval and semantic navigation.
  • Computationally expensive at large scale.
Hierarchical Retrieval HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise)
  • Density-based clustering algorithm.
  • Automatically determines number of clusters.
  • Handles uneven and irregular semantic distributions.
  • Useful for noisy corpora and topic discovery.
  • Common in research-oriented retrieval systems.
Hierarchical Retrieval Recursive Retrieval
  • Retrieval occurs in multiple stages.
  • Initial retrieved chunks are used to trigger further retrieval.
  • Supports multi-hop reasoning.
  • Common in research assistants and agentic systems.
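The multi-stage loop can be sketched as below. `search` and `expand` are assumed interfaces (a retriever and a follow-up-query generator, e.g. an LLM), not a specific library's API:

```python
def recursive_retrieve(query, search, expand, max_hops=2):
    """Multi-stage retrieval: chunks from one round seed the next round.

    search(q) -> list of chunk ids; expand(chunk_id) -> follow-up query or None.
    """
    seen, frontier = [], [query]
    for _ in range(max_hops):
        next_frontier = []
        for q in frontier:
            for cid in search(q):
                if cid not in seen:
                    seen.append(cid)
                    follow_up = expand(cid)
                    if follow_up:
                        next_frontier.append(follow_up)
        if not next_frontier:
            break  # no new follow-up queries: stop early
        frontier = next_frontier
    return seen
```

Each hop can surface documents the original query could never reach directly, which is what enables multi-hop reasoning.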
Tree-Based Retrieval RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval)
  • Builds recursive semantic summary trees over chunks.
  • Workflow:
    • Chunk documents
    • Cluster chunks
    • Generate summaries
    • Cluster summaries recursively
  • Enables top-down semantic traversal.
  • Excellent for long-document understanding and global context reasoning.
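The build loop above (cluster, summarize, repeat) can be sketched as follows. This is a heavily simplified illustration: `embed` and `summarize` are stand-ins for a real embedding model and an LLM summarizer, and the clustering here is a crude projection-and-group step rather than the soft clustering the RAPTOR paper uses:

```python
import numpy as np

def build_raptor_tree(chunks, embed, summarize, branch=2):
    """Recursively cluster nodes and summarize each cluster until one root remains.

    embed(text) -> vector; summarize(list_of_texts) -> str.
    Returns every tree level, leaves first.
    """
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        texts = levels[-1]
        vecs = np.array([embed(t) for t in texts])
        # crude clustering: order by projection onto the mean direction,
        # then group into runs of size `branch`
        order = np.argsort(vecs @ vecs.mean(axis=0))
        groups = [order[i:i + branch] for i in range(0, len(order), branch)]
        levels.append([summarize([texts[i] for i in g]) for g in groups])
    return levels
```

At query time the tree can be traversed top-down (root summary first), or all levels can be flattened into one index as in the original paper's "collapsed tree" mode.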
Tree-Based Retrieval Tree Traversal Retrieval
  • Traverses parent-child document structures progressively from root to leaf nodes.
Tree-Based Retrieval KD-Tree (K-Dimensional Tree)
  • Space-partitioning nearest-neighbor structure.
  • Splits vector space recursively along dimensions.
  • Efficient in low-dimensional spaces.
  • Less useful for modern high-dimensional embeddings.
Tree-Based Retrieval Ball Tree
  • Groups vectors into hyperspherical regions ("balls").
  • Performs nearest-neighbor search hierarchically.
  • Works better than KD-Trees in moderately higher dimensions.
Vector Index Retrieval HNSW (Hierarchical Navigable Small World)
  • Approximate Nearest Neighbor (ANN) graph-based retrieval algorithm.
  • Uses multi-layer graph hierarchy.
  • Search starts in sparse upper layers and descends into denser lower layers.
  • Very fast and highly scalable.
  • Widely used in:
    • FAISS
    • Qdrant
    • Milvus
    • Weaviate
Vector Index Retrieval IVF (Inverted File Index)
  • Partitions embedding space into Voronoi cells/clusters.
  • Query searches only selected partitions.
  • Reduces search cost significantly.
  • Frequently combined with PQ.
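A toy inverted-file index, to make the partition-then-probe idea concrete. This sketch assumes the cell centroids were already trained (e.g. via k-means) and uses exact distances inside each probed cell:

```python
import numpy as np

class SimpleIVF:
    """Toy IVF: assign each vector to its nearest centroid's cell,
    then probe only `nprobe` cells at query time."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.cells = {i: [] for i in range(len(centroids))}
        self.vectors = []

    def add(self, vec):
        # inverted list: cell id -> indices of vectors assigned to it
        cell = int(np.linalg.norm(self.centroids - vec, axis=1).argmin())
        self.cells[cell].append(len(self.vectors))
        self.vectors.append(vec)

    def search(self, query, top_k=1, nprobe=1):
        # probe only the nprobe cells whose centroids are closest to the query
        probe = np.linalg.norm(self.centroids - query, axis=1).argsort()[:nprobe]
        cand = [i for c in probe for i in self.cells[int(c)]]
        cand.sort(key=lambda i: np.linalg.norm(self.vectors[i] - query))
        return cand[:top_k]
```

Raising `nprobe` trades latency back for recall, since vectors near a cell boundary may live in a neighboring cell.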
Vector Index Retrieval PQ (Product Quantization)
  • Compresses embeddings into compact codes.
  • Improves memory efficiency.
  • Enables large-scale ANN retrieval.
  • Trades some precision for speed and storage savings.
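The compression step can be shown in miniature. This sketch assumes the per-subspace codebooks were already trained (in practice, by running k-means on each subspace); each vector is then stored as a short list of codeword indices:

```python
import numpy as np

def pq_encode(vec, codebooks):
    """Split the vector into m subvectors and keep only the index of the
    nearest codeword in each subspace codebook (codebooks: m arrays of k x d/m)."""
    m = len(codebooks)
    subs = np.split(np.asarray(vec, float), m)
    return [int(np.linalg.norm(cb - s, axis=1).argmin())
            for cb, s in zip(codebooks, subs)]

def pq_decode(codes, codebooks):
    """Reconstruct an approximate vector from the stored codeword indices."""
    return np.concatenate([cb[c] for cb, c in zip(codebooks, codes)])
```

With m subspaces of k codewords each, a float vector shrinks to m small integers; the reconstruction error is the precision traded away for that storage saving.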
Vector Index Retrieval IVF-PQ (Inverted File Index + Product Quantization)
  • Combines partitioning (IVF) with vector compression (PQ).
  • Common in billion-scale vector retrieval systems.
  • Balances latency, memory, and retrieval quality.
Vector Index Retrieval ScaNN (Scalable Nearest Neighbors)
  • Google's ANN retrieval library.
  • Uses partitioning, quantization, and anisotropic vector search.
  • Optimized for large-scale semantic search systems.
Vector Index Retrieval Annoy (Approximate Nearest Neighbors Oh Yeah)
  • Spotify's ANN retrieval library.
  • Uses random projection trees.
  • Memory-efficient and lightweight.
Graph-Based Retrieval GraphRAG
  • Represents knowledge as entities and relationships.
  • Supports multi-hop reasoning.
  • Combines graph traversal and semantic retrieval.
  • Useful for enterprise knowledge systems.
Graph-Based Retrieval Graph Traversal Algorithms
  • BFS — Breadth-First Search
  • DFS — Depth-First Search
  • Random Walk Retrieval
  • Personalized PageRank (PPR)
  • Used for entity relationship exploration and graph navigation.
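Personalized PageRank, for example, is a short power iteration: a random walk over the entity graph that teleports back to the seed node with probability (1 - alpha), so relevance mass concentrates around that node. A minimal numpy sketch over a dense adjacency matrix:

```python
import numpy as np

def personalized_pagerank(adj, seed, alpha=0.85, iters=50):
    """PPR by power iteration: walk the graph with damping alpha,
    teleporting back to the seed node with probability (1 - alpha)."""
    adj = np.asarray(adj, float)
    out = adj.sum(axis=1, keepdims=True)
    # column-stochastic transition matrix (out-edges normalized per node)
    P = np.divide(adj, out, out=np.zeros_like(adj), where=out > 0).T
    p = np.zeros(len(adj))
    p[seed] = 1.0  # all teleport mass goes to the seed entity
    r = p.copy()
    for _ in range(iters):
        r = alpha * (P @ r) + (1 - alpha) * p
    return r
```

Nodes with high PPR scores relative to the seed entity are good candidates to pull into the retrieved context.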
Hybrid Retrieval BM25 + Dense Vector Hybrid Retrieval
  • Combines keyword search and semantic retrieval.
  • BM25 handles exact keyword matches.
  • Dense embeddings handle semantic similarity.
  • Very common in production RAG systems.
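Because BM25 scores and cosine similarities live on different scales, one common fusion approach is to min-max normalize each score set and then take a weighted blend. A sketch, assuming both score dictionaries were already computed by their respective retrievers:

```python
def hybrid_scores(bm25, dense, alpha=0.5):
    """Blend keyword and semantic scores after min-max normalizing each.

    bm25 / dense: dict of doc_id -> raw score (assumed precomputed).
    alpha weights the keyword side; (1 - alpha) weights the semantic side.
    """
    def norm(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero on constant scores
        return {d: (s - lo) / span for d, s in scores.items()}

    b, v = norm(bm25), norm(dense)
    docs = set(b) | set(v)
    return {d: alpha * b.get(d, 0.0) + (1 - alpha) * v.get(d, 0.0) for d in docs}
```

Score normalization is the fragile part of this approach, which is one reason rank-based fusion such as RRF (below in this summary) is often preferred.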
Hybrid Retrieval RRF (Reciprocal Rank Fusion)
  • Combines rankings from multiple retrievers.
  • Frequently combines:
    • BM25
    • Vector search
    • Graph retrieval
  • Improves robustness and recall.
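RRF needs only the rank positions, not the raw scores, which is why it fuses heterogeneous retrievers so cleanly. The whole algorithm fits in a few lines:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each retriever contributes 1/(k + rank) per document.

    rankings: list of ranked doc-id lists (best first); k=60 is the constant
    used in the original RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked highly by several retrievers accumulates score from each list, so it outranks a document that only one retriever liked.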
Topic Routing Classifier-Based Routing
  • Uses ML classifiers to predict correct retrieval domain.
  • Common algorithms:
    • LR — Logistic Regression
    • SVM — Support Vector Machine
    • RF — Random Forest
    • BERT — Bidirectional Encoder Representations from Transformers
    • DistilBERT
  • Often used in enterprise search systems.
Topic Routing Embedding-Based Routing
  • Compares query embeddings against topic/domain centroids.
  • No supervised training required.
  • Lightweight and scalable.
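The centroid comparison is just a cosine-similarity argmax. A sketch, assuming the query embedding and per-domain centroids (e.g. mean embeddings of each domain's documents) come from the same embedding model:

```python
import numpy as np

def route(query_vec, domain_centroids):
    """Pick the retrieval domain whose centroid is most cosine-similar
    to the query embedding. domain_centroids: dict of name -> vector."""
    q = np.asarray(query_vec, float)
    q = q / np.linalg.norm(q)
    best, best_sim = None, -2.0  # cosine similarity is always >= -1
    for name, c in domain_centroids.items():
        c = np.asarray(c, float)
        sim = float(q @ (c / np.linalg.norm(c)))
        if sim > best_sim:
            best, best_sim = name, sim
    return best
```

No labeled training data is needed; updating a domain only means recomputing its centroid.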
Topic Routing LLM-Based Routing
  • Uses LLM reasoning for dynamic retriever/tool selection.
  • Common in agentic RAG systems.
  • Supports SQL tools, APIs, vector DBs, graph DBs, etc.
Re-ranking Cross-Encoder Re-ranking
  • Jointly evaluates query-document pairs.
  • Produces much more accurate relevance scores than cosine similarity.
  • Common algorithms/models:
    • BGE Reranker (BAAI General Embedding)
    • Cohere Reranker
    • MonoT5
    • ColBERT (Contextualized Late Interaction over BERT)
    • SentenceTransformers CrossEncoder
  • One of the highest-impact improvements in RAG quality.
Query Enhancement HyDE (Hypothetical Document Embeddings)
  • LLM generates hypothetical answer/document first.
  • Embedding of generated text is used for retrieval.
  • Often improves semantic recall significantly.
Query Enhancement Multi-Query Retrieval
  • Generates multiple reformulated queries.
  • Merges retrieval results from all variants.
  • Improves recall and robustness.
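The merge step can be sketched as below. `rewrite` and `search` are assumed interfaces (an LLM query rewriter and a retriever), not a specific library's API:

```python
def multi_query_retrieve(query, rewrite, search, n=3, top_k=5):
    """Run retrieval for the original query plus n rewrites, then merge
    results, deduplicating while keeping first-seen order.

    rewrite(query, n) -> list of reformulated queries;
    search(q) -> ranked list of doc ids.
    """
    merged = []
    for q in [query] + rewrite(query, n):
        for doc in search(q):
            if doc not in merged:
                merged.append(doc)
    return merged[:top_k]
```

In practice the merge step is often replaced by RRF (described above in this summary) so that documents retrieved by several query variants rank higher.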
