Wednesday, May 6, 2026

Retrieval Algorithms for RAG Systems

There are several families of algorithms available for retrieval:

  • hierarchical retrieval
  • tree-based retrieval
  • graph retrieval
  • hybrid retrieval
  • recursive retrieval

  Algorithms used in Hierarchical Retrieval


    A. Cluster-based Hierarchical Retrieval

    This is probably what you encountered earlier.

    Pipeline:

    chunks → embeddings → clustering → cluster summaries/centroids
    → search clusters first → search chunks inside the selected cluster

    Common Algorithms

    K-Means

    The most common choice.

    embedding vectors → k-means → cluster centroids

    At query time:

    1. embed the query
    2. find the nearest centroid
    3. search only within that cluster

    This reduces:

    • search space
    • noise
    • latency
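    The cluster-then-search routine above can be sketched in plain NumPy. `kmeans` and `cluster_search` are illustrative helper names, not a library API (production systems would typically use scikit-learn or a vector database's built-in clustering):

    ```python
    import numpy as np

    def kmeans(X, k, iters=20, seed=0):
        """Plain NumPy k-means: returns (centroids, labels)."""
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(iters):
            # assign each vector to its nearest centroid
            labels = np.argmin(
                np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centroids[j] = X[labels == j].mean(axis=0)
        return centroids, labels

    def cluster_search(query, X, centroids, labels, top_k=3):
        """Route to the nearest centroid, then search only inside that cluster."""
        c = np.argmin(np.linalg.norm(centroids - query, axis=1))
        idx = np.where(labels == c)[0]
        dists = np.linalg.norm(X[idx] - query, axis=1)
        return idx[np.argsort(dists)[:top_k]]
    ```

    The key property is that `cluster_search` only computes distances inside one cluster, which is where the search-space and latency reduction comes from.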

    Hierarchical Agglomerative Clustering (HAC)

    Builds a tree of clusters:

    small clusters → merged recursively → dendrogram/tree

    Useful for:

    • semantic document trees
    • recursive retrieval

    But expensive at scale.
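    A naive single-linkage version of this merging process, for illustration only; this is the O(n³) behavior that makes HAC expensive at scale (production code would use a library routine such as `scipy.cluster.hierarchy.linkage` instead):

    ```python
    import numpy as np

    def agglomerative_merges(X):
        """Naive single-linkage HAC sketch: start with singleton clusters,
        repeatedly merge the two closest, and record each merge (the merge
        sequence is the dendrogram)."""
        clusters = {i: [i] for i in range(len(X))}
        merges = []
        while len(clusters) > 1:
            keys = list(clusters)
            best = None
            for a in keys:
                for b in keys:
                    if a < b:
                        # single linkage: distance between closest members
                        d = min(np.linalg.norm(X[i] - X[j])
                                for i in clusters[a] for j in clusters[b])
                        if best is None or d < best[0]:
                            best = (d, a, b)
            _, a, b = best
            clusters[a].extend(clusters.pop(b))
            merges.append((a, b))
        return merges
    ```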


    HDBSCAN

    Density-based clustering.

    Better when:

    • topics are irregular,
    • clusters are uneven,
    • unknown number of topics.

    Often used in:

    • research systems,
    • topic discovery pipelines.



    Tree-Based Retrieval

    This is more advanced.

    Instead of flat chunks,
    knowledge is organized as a tree.

    Example:

    Root
    ├── HR
    │ ├── Leave Policy
    │ │ ├── maternity leave
    │ │ └── paternity leave
    │ └── Payroll
    └── Engineering

    Query traversal:

    query → root → best child node → best grandchild → leaf chunks

    Very similar to:

    • filesystem traversal,
    • decision trees,
    • hierarchical indexing.
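    A minimal greedy traversal over such a tree might look like this; the `Node` layout and `traverse` helper are assumptions for illustration, and `emb` would in practice be an embedding of each node's title plus summary:

    ```python
    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class Node:
        emb: np.ndarray                              # embedding of the node's title/summary
        children: list = field(default_factory=list)
        chunks: list = field(default_factory=list)   # leaf payload

    def traverse(query, node):
        """Greedy top-down descent: at each level follow the child whose
        embedding is most similar to the query; return the leaf's chunks."""
        while node.children:
            sims = [float(query @ c.emb) for c in node.children]
            node = node.children[int(np.argmax(sims))]
        return node.chunks
    ```

    With the HR/Engineering example above, a query embedded near "leave" would descend Root → HR → Leave Policy and return only those leaf chunks.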

    Algorithms used in Tree-Based Retrieval


    A. RAPTOR

    An important modern approach, introduced in the RAPTOR paper.

    RAPTOR =
    Recursive Abstractive Processing for Tree-Organized Retrieval

    Pipeline:

    chunks → cluster → summarize each cluster → cluster the summaries again
    → build the tree recursively

    Final result:

    • leaf chunks
    • middle summaries
    • high-level summaries

    At query time:

    • traverse tree top-down,
    • retrieve relevant branches,
    • optionally retrieve summaries + leaves.

    This improves:

    • long-document understanding,
    • multi-hop reasoning,
    • global context.
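    The build phase can be sketched as follows. This is a simplification of what the paper describes: `summarize` stands in for an LLM call, and fixed-size grouping stands in for embedding-based clustering:

    ```python
    def build_raptor_layers(chunks, summarize, group_size=2):
        """RAPTOR-style tree building sketch: repeatedly group nodes and
        summarize each group until a single root summary remains.
        Returns layers[0] = leaf chunks ... layers[-1] = [root summary]."""
        layers = [list(chunks)]
        while len(layers[-1]) > 1:
            prev = layers[-1]
            # group adjacent nodes (real RAPTOR clusters by embedding)
            groups = [prev[i:i + group_size] for i in range(0, len(prev), group_size)]
            layers.append([summarize(g) for g in groups])
        return layers
    ```

    At query time, retrieval can then draw from any layer, mixing high-level summaries with the leaf chunks beneath them.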

    B. Hierarchical Navigable Small Worlds (HNSW)

    Very important.

    This is actually a vector index algorithm.

    Used internally in:

    • FAISS
    • Qdrant
    • Weaviate
    • Milvus

    HNSW creates a multi-layer graph hierarchy.

    Search:

    top sparse graph → lower, denser graphs → nearest vectors

    This is hierarchical retrieval at the vector index level.

    Extremely common in production.
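    As a toy illustration of the layered idea only (real HNSW performs greedy walks over navigable small-world graphs at each layer): a sparse upper layer gives a cheap entry point into the data, and the search is then refined among full-layer vectors near that entry point:

    ```python
    import numpy as np

    def two_layer_search(query, X, upper_idx, refine=32):
        """Coarse-to-fine nearest-neighbor sketch in the spirit of HNSW.
        `upper_idx` indexes a sparse subset of X acting as the top layer."""
        # 1. coarse search over the sparse upper layer
        entry = upper_idx[np.argmin(np.linalg.norm(X[upper_idx] - query, axis=1))]
        # 2. refine among the `refine` vectors closest to the entry point
        cand = np.argsort(np.linalg.norm(X - X[entry], axis=1))[:refine]
        return cand[np.argmin(np.linalg.norm(X[cand] - query, axis=1))]
    ```

    In practice you would not write this yourself: libraries like FAISS, Qdrant, Weaviate, and Milvus expose HNSW indexes directly.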


    C. Tree Indexes

    Examples:

    • KD-Trees
    • Ball Trees
    • VP Trees

    Older nearest-neighbor structures.

    Good for:

    • low dimensions.

    Less useful for modern high-dimensional embeddings.


    Graph-Based Retrieval

    A close cousin of tree retrieval.

    Knowledge is stored as a graph:

    Kubernetes ↔ scheduling ↔ taints ↔ tolerations

    Algorithms:

    • graph traversal
    • PageRank
    • Personalized PageRank
    • BFS/DFS
    • Graph Neural Networks (GNNs)

    Used in:

    • GraphRAG
    • knowledge graph retrieval
    • enterprise search.
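    A plain BFS expansion over an adjacency list, in the spirit of the traversal algorithms above (`graph_expand` is an illustrative name; seeds would come from entity linking against the query):

    ```python
    from collections import deque

    def graph_expand(graph, seeds, max_hops=2):
        """BFS expansion over a knowledge graph: start from the entity
        nodes matched to the query and pull in neighbors up to
        `max_hops` edges away."""
        seen = set(seeds)
        frontier = deque((s, 0) for s in seeds)
        while frontier:
            node, depth = frontier.popleft()
            if depth == max_hops:
                continue
            for nb in graph.get(node, []):
                if nb not in seen:
                    seen.add(nb)
                    frontier.append((nb, depth + 1))
        return seen
    ```

    Systems like GraphRAG layer ranking (e.g. Personalized PageRank) on top of this kind of expansion to keep the neighborhood relevant.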

    Recursive Retrieval

    Another important concept.

    Instead of retrieving chunks once:

    retrieve → retrieve again using the results → expand context

    Used for:

    • multi-hop QA,
    • research agents,
    • deep document exploration.
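    A sketch of that loop; `retrieve` and `expand_query` are placeholders for a real retriever and a query-rewriting step (the latter is often an LLM call in agentic systems):

    ```python
    def recursive_retrieve(query, retrieve, expand_query, rounds=2):
        """Retrieve, fold the results into a new query, retrieve again.
        Each round can reach documents the original query could not."""
        context, q = [], query
        for _ in range(rounds):
            hits = retrieve(q)
            context.extend(h for h in hits if h not in context)
            q = expand_query(q, hits)
        return context
    ```

    This is what makes multi-hop QA work: the second round's query is informed by what the first round found.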

    Hybrid Retrieval

    Very common in production.

    Combines:

    • BM25 (keyword)
    • vector search
    • metadata filters
    • graph retrieval
    • rerankers

    Example:

    Query → BM25 candidates + vector candidates → cross-encoder reranker → final chunks

    This is often better than pure embeddings.
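    One common way to merge the BM25 and vector candidate lists before reranking is Reciprocal Rank Fusion (RRF), sketched here in plain Python:

    ```python
    def reciprocal_rank_fusion(rankings, k=60):
        """Merge ranked candidate lists (e.g. one from BM25, one from
        vector search): score(d) = sum over lists of 1 / (k + rank).
        Documents ranked highly by multiple retrievers float to the top."""
        scores = {}
        for ranking in rankings:
            for rank, doc in enumerate(ranking, start=1):
                scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)
    ```

    RRF needs no score calibration between retrievers, which is why it is a popular default for hybrid setups.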


    Re-ranking

    Modern systems rarely trust raw vector similarity.

    They use rerankers.

    Example:

    retrieve top 50 → cross-encoder reranker → keep the best 5

    Algorithms:

    • BGE reranker
    • Cohere reranker
    • CrossEncoder models
    • MonoT5

    This is one of the highest-impact improvements in RAG quality.
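    The retrieve-then-rerank step reduces to a sort under a stronger scoring model. Here `score_fn` stands in for a cross-encoder such as a BGE or MonoT5 reranker, which scores query and document jointly rather than comparing precomputed embeddings:

    ```python
    def rerank(query, candidates, score_fn, keep=5):
        """Rescore retrieved candidates with a (stand-in) cross-encoder
        and keep only the top `keep` for the LLM's context."""
        scored = sorted(candidates, key=lambda d: score_fn(query, d), reverse=True)
        return scored[:keep]
    ```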


    What large production RAG systems actually use

    Most serious systems use combinations like:

    topic router → hybrid retrieval → hierarchical search → reranker → LLM

    Or:

    query → metadata filtering → HNSW vector retrieval → cross-encoder reranking
    → context compression → LLM

