There are multiple types of algorithms available for retrieval:
Algorithms used in Hierarchical Retrieval
A. Cluster-based Hierarchical Retrieval
This is probably what you encountered earlier.
Pipeline
Chunks
↓
Embeddings
↓
Clustering
↓
Cluster summaries/centroids
↓
Search clusters first
↓
Search chunks inside selected cluster
Common Algorithms
K-Means
Most common.
embedding vectors
↓
k-means
↓
cluster centroids
At query time:
- embed query
- find nearest centroid
- search only within that cluster
This reduces:
- search space
- noise
- latency
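A minimal sketch of this routing step, using scikit-learn and assuming a precomputed `chunk_embeddings` matrix (L2-normalized) and a query vector from your own embedding pipeline:

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumed input: chunk_embeddings is an (n_chunks, dim) array,
# L2-normalized so a dot product equals cosine similarity.
def build_clusters(chunk_embeddings: np.ndarray, n_clusters: int = 16):
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(chunk_embeddings)
    return km, labels

def cluster_search(query_vec, km, labels, chunk_embeddings, top_k=5):
    # 1. Route the query to its nearest centroid.
    cluster_id = km.predict(query_vec.reshape(1, -1))[0]
    # 2. Score only the chunks inside that cluster.
    members = np.where(labels == cluster_id)[0]
    scores = chunk_embeddings[members] @ query_vec
    return members[np.argsort(-scores)[:top_k]]
```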
Hierarchical Agglomerative Clustering (HAC)
Builds a tree of clusters.
small clusters
↓
merged recursively
↓
dendrogram/tree
Useful for:
- semantic document trees
- recursive retrieval
But it is expensive at scale: building the tree requires pairwise distances between all chunks, which is roughly quadratic.
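A sketch with SciPy, again assuming a `chunk_embeddings` array; the cut threshold is an illustrative value:

```python
from scipy.cluster.hierarchy import linkage, fcluster

# Build the merge tree bottom-up; this computes pairwise distances
# between all chunks, which is why HAC gets expensive at scale.
Z = linkage(chunk_embeddings, method="average", metric="cosine")

# Cut the dendrogram at a distance threshold to get flat clusters;
# a smaller threshold yields more, tighter clusters.
labels = fcluster(Z, t=0.4, criterion="distance")
```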
HDBSCAN
Density-based clustering.
Better when:
- topics are irregular,
- clusters are uneven in size,
- the number of topics is unknown.
Often used in:
- research systems,
- topic discovery pipelines.
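A sketch using scikit-learn's built-in implementation (available since scikit-learn 1.3; the standalone `hdbscan` package has a near-identical interface):

```python
from sklearn.cluster import HDBSCAN

# No need to choose the number of clusters up front.
clusterer = HDBSCAN(min_cluster_size=5)
labels = clusterer.fit_predict(chunk_embeddings)
# Chunks labeled -1 are noise: they fit no dense topic region.
```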
B. Tree-Based Retrieval
This is more advanced.
Instead of flat chunks,
knowledge is organized as a tree.
Example:
Root
├── HR
│   ├── Leave Policy
│   │   ├── maternity leave
│   │   └── paternity leave
│   └── Payroll
└── Engineering
Query traversal:
query
↓
root
↓
best child node
↓
best grandchild
↓
leaf chunks
Very similar to:
- filesystem traversal,
- decision trees,
- hierarchical indexing.
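A greedy top-down traversal over such a tree might look like the sketch below; the `Node` structure and the precomputed embeddings are assumptions, not a fixed API:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Node:
    text: str                    # node title or summary
    embedding: np.ndarray        # embedding of that text (assumed precomputed)
    children: list = field(default_factory=list)
    chunks: list = field(default_factory=list)   # leaf payload

def traverse(query_vec: np.ndarray, node: Node) -> list:
    # Descend greedily: at each level, follow the child whose
    # embedding is most similar to the query.
    while node.children:
        node = max(node.children, key=lambda c: float(c.embedding @ query_vec))
    return node.chunks
```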
Algorithms used in Tree-Based Retrieval
A. RAPTOR
Based on an influential 2024 paper.
RAPTOR =
Recursive Abstractive Processing for Tree-Organized Retrieval
Pipeline:
chunks
↓
cluster
↓
summarize cluster
↓
cluster summaries again
↓
build tree recursively
Final result:
leaf chunks
middle summaries
high-level summaries
At query time:
- traverse tree top-down,
- retrieve relevant branches,
- optionally retrieve summaries + leaves.
This improves:
- long-document understanding,
- multi-hop reasoning,
- global context.
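A condensed sketch of the build loop; `cluster()` and `summarize()` are hypothetical stand-ins for your clustering step and LLM summarization call:

```python
def build_raptor_tree(chunks: list[str], max_levels: int = 3) -> list[list[str]]:
    # levels[0] holds the leaf chunks; each higher level holds
    # summaries of clusters from the level below.
    levels = [chunks]
    for _ in range(max_levels):
        nodes = levels[-1]
        if len(nodes) <= 1:
            break
        groups = cluster(nodes)                        # hypothetical: group by embedding
        levels.append([summarize(g) for g in groups])  # hypothetical: LLM call
    return levels
```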
B. Hierarchical Navigable Small Worlds (HNSW)
Very important in practice.
This is actually a vector index algorithm, not a retrieval strategy.
Used internally in:
- FAISS
- Qdrant
- Weaviate
- Milvus
HNSW creates:
multi-layer graph hierarchy
Search:
top sparse graph
↓
lower dense graph
↓
nearest vectors
This is hierarchical retrieval at the vector index level.
Extremely common in production.
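In FAISS, using HNSW is a one-line index choice (a minimal sketch; parameter values are illustrative):

```python
import faiss
import numpy as np

dim = 768                              # embedding dimensionality (assumed)
index = faiss.IndexHNSWFlat(dim, 32)   # 32 = graph neighbors per node (M)
index.hnsw.efSearch = 64               # search breadth at the lower layers
index.add(chunk_embeddings.astype(np.float32))

distances, ids = index.search(query_vec.reshape(1, -1).astype(np.float32), 5)
```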
C. Tree Indexes
Examples:
- KD-Trees
- Ball Trees
- VP Trees
Older nearest-neighbor structures.
Good for:
- low-dimensional data (roughly under 20 dimensions).
Less useful for modern high-dimensional embeddings, where they degrade toward brute-force search.
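For completeness, a KD-tree lookup via scikit-learn; `points` and `query_point` are assumed low-dimensional arrays:

```python
from sklearn.neighbors import KDTree

tree = KDTree(points)                  # points: (n, dim) with dim small
dist, idx = tree.query(query_point.reshape(1, -1), k=5)
```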
Graph-Based Retrieval
A close cousin of tree retrieval.
Knowledge is stored as a graph:
Kubernetes
↔ scheduling
↔ taints
↔ tolerations
Algorithms:
- graph traversal
- PageRank
- Personalized PageRank
- BFS/DFS
- Graph Neural Networks (GNNs)
Used in:
- GraphRAG
- knowledge graph retrieval
- enterprise search.
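A sketch of Personalized PageRank over a tiny concept graph with NetworkX, mirroring the example above; high-scoring nodes become retrieval candidates:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Kubernetes", "scheduling"),
    ("scheduling", "taints"),
    ("taints", "tolerations"),
])

# Bias the random walk toward the query's entry node; nodes with
# high scores become retrieval candidates.
scores = nx.pagerank(G, personalization={"scheduling": 1.0})
candidates = sorted(scores, key=scores.get, reverse=True)
```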
Recursive Retrieval
Another important concept.
Instead of retrieving chunks once:
retrieve
↓
retrieve again using results
↓
expand context
Used for:
- multi-hop QA,
- research agents,
- deep document exploration.
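In code this is just a loop that feeds retrieval results back in as new queries; `retrieve()` and `extract_followups()` are hypothetical:

```python
def recursive_retrieve(query: str, depth: int = 2) -> list[str]:
    context, frontier = [], [query]
    for _ in range(depth):
        results = []
        for q in frontier:
            results.extend(retrieve(q))      # hypothetical retriever
        context.extend(results)
        # Derive follow-up queries from what was just found,
        # e.g. unresolved entities or references (hypothetical).
        frontier = extract_followups(results)
        if not frontier:
            break
    return context
```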
Hybrid Retrieval
Very common in production.
Combines:
- BM25 (keyword)
- vector search
- metadata filters
- graph retrieval
- rerankers
Example:
Query
↓
BM25 candidates
+
Vector candidates
↓
Cross-encoder reranker
↓
Final chunks
This often outperforms pure embedding search.
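A sketch of the candidate-fusion step, using the rank_bm25 package for the keyword side and reciprocal rank fusion to merge the two lists; `embed()` and the embedding matrix are assumed:

```python
from rank_bm25 import BM25Okapi
import numpy as np

def hybrid_candidates(query, docs, doc_embeddings, embed, k=50):
    # Keyword side: BM25 over whitespace-tokenized docs.
    bm25 = BM25Okapi([d.split() for d in docs])
    bm25_rank = np.argsort(-bm25.get_scores(query.split()))[:k]

    # Dense side: cosine similarity over normalized embeddings.
    dense_rank = np.argsort(-(doc_embeddings @ embed(query)))[:k]

    # Reciprocal rank fusion: reward docs ranked high in either list.
    fused = {}
    for ranking in (bm25_rank, dense_rank):
        for r, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (60 + r)
    return sorted(fused, key=fused.get, reverse=True)
```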
Re-ranking
Modern systems rarely trust raw vector similarity.
They use rerankers.
Example:
retrieve top 50
↓
cross-encoder reranker
↓
best 5
Common rerankers:
- BGE reranker
- Cohere reranker
- CrossEncoder models
- MonoT5
This is one of the highest-impact improvements in RAG quality.
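With the sentence-transformers library, the rerank step is only a few lines; the model name below is a real, commonly used MS MARCO cross-encoder:

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, candidates: list[str], top_n: int = 5) -> list[str]:
    # Score each (query, passage) pair jointly: slower than a
    # bi-encoder, but considerably more accurate.
    scores = reranker.predict([(query, c) for c in candidates])
    order = sorted(range(len(candidates)), key=lambda i: -scores[i])
    return [candidates[i] for i in order[:top_n]]
```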
What large production RAG systems actually use
Most serious systems use combinations like:
Topic Router
↓
Hybrid Retrieval
↓
Hierarchical Search
↓
Reranker
↓
LLM
Or:
Query
↓
Metadata filtering
↓
HNSW vector retrieval
↓
Cross-encoder reranking
↓
Context compression
↓
LLM