Wednesday, May 6, 2026

Retrieval Algorithms in RAG Systems - Summary

Each entry below gives the category, the algorithm or technique, and its key details.
Hierarchical Retrieval K-Means Hierarchical Retrieval
  • Uses K-Means clustering on embedding vectors.
  • Creates semantic groups (clusters) of document chunks.
  • At query time:
    • Embed query
    • Find nearest cluster centroid
    • Search only inside selected clusters
  • Reduces retrieval latency and search space.
  • Useful in large corpora with distinct topic separation.
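The routing step above can be sketched in a few lines. This is a minimal, illustrative version using plain numpy (a toy Lloyd's k-means rather than a production library), where the chunk embeddings and the query vector are assumed to come from your embedding model:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Toy Lloyd's k-means over chunk embedding vectors."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each chunk to its nearest centroid, then recompute centroids
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

def retrieve(query, X, centroids, labels, top_k=2):
    """Route the query to its nearest cluster, then search only inside it."""
    c = np.linalg.norm(centroids - query, axis=1).argmin()
    idx = np.where(labels == c)[0]
    order = np.linalg.norm(X[idx] - query, axis=1).argsort()[:top_k]
    return idx[order]
```

Note that only the selected cluster's members are scored against the query, which is exactly where the latency saving comes from.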
Hierarchical Retrieval Hierarchical Agglomerative Clustering (HAC)
  • Bottom-up clustering algorithm.
  • Starts with individual chunks and recursively merges similar clusters.
  • Produces dendrogram/tree-like hierarchy.
  • Useful for recursive retrieval and semantic navigation.
  • Computationally expensive at large scale.
Hierarchical Retrieval HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise)
  • Density-based clustering algorithm.
  • Automatically determines number of clusters.
  • Handles uneven and irregular semantic distributions.
  • Useful for noisy corpora and topic discovery.
  • Common in research-oriented retrieval systems.
Hierarchical Retrieval Recursive Retrieval
  • Retrieval occurs in multiple stages.
  • Initial retrieved chunks are used to trigger further retrieval.
  • Supports multi-hop reasoning.
  • Common in research assistants and agentic systems.
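The multi-stage loop can be sketched as below. `search` and `expand` are assumed interfaces (a retriever and a follow-up-query generator, e.g. an LLM), not a specific library's API:

```python
def recursive_retrieve(query, search, expand, max_hops=2):
    """Multi-stage retrieval: chunks from one round seed the next round.

    search(q) -> list of chunk ids; expand(chunk_id) -> follow-up query or None.
    """
    seen, frontier = [], [query]
    for _ in range(max_hops):
        next_frontier = []
        for q in frontier:
            for cid in search(q):
                if cid not in seen:
                    seen.append(cid)
                    follow_up = expand(cid)
                    if follow_up:
                        next_frontier.append(follow_up)
        if not next_frontier:
            break  # no new follow-up queries: stop early
        frontier = next_frontier
    return seen
```

Each hop can surface documents the original query could never reach directly, which is what enables multi-hop reasoning.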
Tree-Based Retrieval RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval)
  • Builds recursive semantic summary trees over chunks.
  • Workflow:
    • Chunk documents
    • Cluster chunks
    • Generate summaries
    • Cluster summaries recursively
  • Enables top-down semantic traversal.
  • Excellent for long-document understanding and global context reasoning.
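The build loop above (cluster, summarize, repeat) can be sketched as follows. This is a heavily simplified illustration: `embed` and `summarize` are stand-ins for a real embedding model and an LLM summarizer, and the clustering here is a crude projection-and-group step rather than the soft clustering the RAPTOR paper uses:

```python
import numpy as np

def build_raptor_tree(chunks, embed, summarize, branch=2):
    """Recursively cluster nodes and summarize each cluster until one root remains.

    embed(text) -> vector; summarize(list_of_texts) -> str.
    Returns every tree level, leaves first.
    """
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        texts = levels[-1]
        vecs = np.array([embed(t) for t in texts])
        # crude clustering: order by projection onto the mean direction,
        # then group into runs of size `branch`
        order = np.argsort(vecs @ vecs.mean(axis=0))
        groups = [order[i:i + branch] for i in range(0, len(order), branch)]
        levels.append([summarize([texts[i] for i in g]) for g in groups])
    return levels
```

At query time the tree can be traversed top-down (root summary first), or all levels can be flattened into one index as in the original paper's "collapsed tree" mode.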
Tree-Based Retrieval Tree Traversal Retrieval
  • Traverses parent-child document structures progressively from root to leaf nodes.
Tree-Based Retrieval KD-Tree (K-Dimensional Tree)
  • Space-partitioning nearest-neighbor structure.
  • Splits vector space recursively along dimensions.
  • Efficient in low-dimensional spaces.
  • Less useful for modern high-dimensional embeddings.
Tree-Based Retrieval Ball Tree
  • Groups vectors into hyperspherical regions ("balls").
  • Performs nearest-neighbor search hierarchically.
  • Works better than KD-Trees in moderately higher dimensions.
Vector Index Retrieval HNSW (Hierarchical Navigable Small World)
  • Approximate Nearest Neighbor (ANN) graph-based retrieval algorithm.
  • Uses multi-layer graph hierarchy.
  • Search starts in sparse upper layers and descends into denser lower layers.
  • Very fast and highly scalable.
  • Widely used in:
    • FAISS
    • Qdrant
    • Milvus
    • Weaviate
Vector Index Retrieval IVF (Inverted File Index)
  • Partitions embedding space into Voronoi cells/clusters.
  • Query searches only selected partitions.
  • Reduces search cost significantly.
  • Frequently combined with PQ.
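A toy inverted-file index, to make the partition-then-probe idea concrete. This sketch assumes the cell centroids were already trained (e.g. via k-means) and uses exact distances inside each probed cell:

```python
import numpy as np

class SimpleIVF:
    """Toy IVF: assign each vector to its nearest centroid's cell,
    then probe only `nprobe` cells at query time."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.cells = {i: [] for i in range(len(centroids))}
        self.vectors = []

    def add(self, vec):
        # inverted list: cell id -> indices of vectors assigned to it
        cell = int(np.linalg.norm(self.centroids - vec, axis=1).argmin())
        self.cells[cell].append(len(self.vectors))
        self.vectors.append(vec)

    def search(self, query, top_k=1, nprobe=1):
        # probe only the nprobe cells whose centroids are closest to the query
        probe = np.linalg.norm(self.centroids - query, axis=1).argsort()[:nprobe]
        cand = [i for c in probe for i in self.cells[int(c)]]
        cand.sort(key=lambda i: np.linalg.norm(self.vectors[i] - query))
        return cand[:top_k]
```

Raising `nprobe` trades latency back for recall, since vectors near a cell boundary may live in a neighboring cell.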
Vector Index Retrieval PQ (Product Quantization)
  • Compresses embeddings into compact codes.
  • Improves memory efficiency.
  • Enables large-scale ANN retrieval.
  • Trades some precision for speed and storage savings.
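The compression step can be shown in miniature. This sketch assumes the per-subspace codebooks were already trained (in practice, by running k-means on each subspace); each vector is then stored as a short list of codeword indices:

```python
import numpy as np

def pq_encode(vec, codebooks):
    """Split the vector into m subvectors and keep only the index of the
    nearest codeword in each subspace codebook (codebooks: m arrays of k x d/m)."""
    m = len(codebooks)
    subs = np.split(np.asarray(vec, float), m)
    return [int(np.linalg.norm(cb - s, axis=1).argmin())
            for cb, s in zip(codebooks, subs)]

def pq_decode(codes, codebooks):
    """Reconstruct an approximate vector from the stored codeword indices."""
    return np.concatenate([cb[c] for cb, c in zip(codebooks, codes)])
```

With m subspaces of k codewords each, a float vector shrinks to m small integers; the reconstruction error is the precision traded away for that storage saving.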
Vector Index Retrieval IVF-PQ (Inverted File Index + Product Quantization)
  • Combines partitioning (IVF) with vector compression (PQ).
  • Common in billion-scale vector retrieval systems.
  • Balances latency, memory, and retrieval quality.
Vector Index Retrieval ScaNN (Scalable Nearest Neighbors)
  • Google's ANN retrieval library.
  • Uses partitioning, quantization, and anisotropic vector search.
  • Optimized for large-scale semantic search systems.
Vector Index Retrieval Annoy (Approximate Nearest Neighbors Oh Yeah)
  • Spotify's ANN retrieval library.
  • Uses random projection trees.
  • Memory-efficient and lightweight.
Graph-Based Retrieval GraphRAG
  • Represents knowledge as entities and relationships.
  • Supports multi-hop reasoning.
  • Combines graph traversal and semantic retrieval.
  • Useful for enterprise knowledge systems.
Graph-Based Retrieval Graph Traversal Algorithms
  • BFS — Breadth-First Search
  • DFS — Depth-First Search
  • Random Walk Retrieval
  • Personalized PageRank (PPR)
  • Used for entity relationship exploration and graph navigation.
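Personalized PageRank, for example, is a short power iteration: a random walk over the entity graph that teleports back to the seed node with probability (1 - alpha), so relevance mass concentrates around that node. A minimal numpy sketch over a dense adjacency matrix:

```python
import numpy as np

def personalized_pagerank(adj, seed, alpha=0.85, iters=50):
    """PPR by power iteration: walk the graph with damping alpha,
    teleporting back to the seed node with probability (1 - alpha)."""
    adj = np.asarray(adj, float)
    out = adj.sum(axis=1, keepdims=True)
    # column-stochastic transition matrix (out-edges normalized per node)
    P = np.divide(adj, out, out=np.zeros_like(adj), where=out > 0).T
    p = np.zeros(len(adj))
    p[seed] = 1.0  # all teleport mass goes to the seed entity
    r = p.copy()
    for _ in range(iters):
        r = alpha * (P @ r) + (1 - alpha) * p
    return r
```

Nodes with high PPR scores relative to the seed entity are good candidates to pull into the retrieved context.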
Hybrid Retrieval BM25 + Dense Vector Hybrid Retrieval
  • Combines keyword search and semantic retrieval.
  • BM25 handles exact keyword matches.
  • Dense embeddings handle semantic similarity.
  • Very common in production RAG systems.
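Because BM25 scores and cosine similarities live on different scales, one common fusion approach is to min-max normalize each score set and then take a weighted blend. A sketch, assuming both score dictionaries were already computed by their respective retrievers:

```python
def hybrid_scores(bm25, dense, alpha=0.5):
    """Blend keyword and semantic scores after min-max normalizing each.

    bm25 / dense: dict of doc_id -> raw score (assumed precomputed).
    alpha weights the keyword side; (1 - alpha) weights the semantic side.
    """
    def norm(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero on constant scores
        return {d: (s - lo) / span for d, s in scores.items()}

    b, v = norm(bm25), norm(dense)
    docs = set(b) | set(v)
    return {d: alpha * b.get(d, 0.0) + (1 - alpha) * v.get(d, 0.0) for d in docs}
```

Score normalization is the fragile part of this approach, which is one reason rank-based fusion such as RRF (below in this summary) is often preferred.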
Hybrid Retrieval RRF (Reciprocal Rank Fusion)
  • Combines rankings from multiple retrievers.
  • Frequently combines:
    • BM25
    • Vector search
    • Graph retrieval
  • Improves robustness and recall.
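RRF needs only the rank positions, not the raw scores, which is why it fuses heterogeneous retrievers so cleanly. The whole algorithm fits in a few lines:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each retriever contributes 1/(k + rank) per document.

    rankings: list of ranked doc-id lists (best first); k=60 is the constant
    used in the original RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked highly by several retrievers accumulates score from each list, so it outranks a document that only one retriever liked.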
Topic Routing Classifier-Based Routing
  • Uses ML classifiers to predict correct retrieval domain.
  • Common algorithms:
    • LR — Logistic Regression
    • SVM — Support Vector Machine
    • RF — Random Forest
    • BERT — Bidirectional Encoder Representations from Transformers
    • DistilBERT
  • Often used in enterprise search systems.
Topic Routing Embedding-Based Routing
  • Compares query embeddings against topic/domain centroids.
  • No supervised training required.
  • Lightweight and scalable.
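The centroid comparison is just a cosine-similarity argmax. A sketch, assuming the query embedding and per-domain centroids (e.g. mean embeddings of each domain's documents) come from the same embedding model:

```python
import numpy as np

def route(query_vec, domain_centroids):
    """Pick the retrieval domain whose centroid is most cosine-similar
    to the query embedding. domain_centroids: dict of name -> vector."""
    q = np.asarray(query_vec, float)
    q = q / np.linalg.norm(q)
    best, best_sim = None, -2.0  # cosine similarity is always >= -1
    for name, c in domain_centroids.items():
        c = np.asarray(c, float)
        sim = float(q @ (c / np.linalg.norm(c)))
        if sim > best_sim:
            best, best_sim = name, sim
    return best
```

No labeled training data is needed; updating a domain only means recomputing its centroid.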
Topic Routing LLM-Based Routing
  • Uses LLM reasoning for dynamic retriever/tool selection.
  • Common in agentic RAG systems.
  • Supports SQL tools, APIs, vector DBs, graph DBs, etc.
Re-ranking Cross-Encoder Re-ranking
  • Jointly evaluates query-document pairs.
  • Produces much more accurate relevance scores than cosine similarity.
  • Common algorithms/models:
    • BGE Reranker (BAAI General Embedding)
    • Cohere Reranker
    • MonoT5
    • ColBERT (Contextualized Late Interaction over BERT)
    • SentenceTransformers CrossEncoder
  • One of the highest-impact improvements in RAG quality.
Query Enhancement HyDE (Hypothetical Document Embeddings)
  • LLM generates hypothetical answer/document first.
  • Embedding of generated text is used for retrieval.
  • Often improves semantic recall significantly.
Query Enhancement Multi-Query Retrieval
  • Generates multiple reformulated queries.
  • Merges retrieval results from all variants.
  • Improves recall and robustness.
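The merge step can be sketched as below. `rewrite` and `search` are assumed interfaces (an LLM query rewriter and a retriever), not a specific library's API:

```python
def multi_query_retrieve(query, rewrite, search, n=3, top_k=5):
    """Run retrieval for the original query plus n rewrites, then merge
    results, deduplicating while keeping first-seen order.

    rewrite(query, n) -> list of reformulated queries;
    search(q) -> ranked list of doc ids.
    """
    merged = []
    for q in [query] + rewrite(query, n):
        for doc in search(q):
            if doc not in merged:
                merged.append(doc)
    return merged[:top_k]
```

In practice the merge step is often replaced by RRF (described above in this summary) so that documents retrieved by several query variants rank higher.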
