Saturday, May 9, 2026

Comparing Semantic Search and Keyword Search

Semantic search excels at certain use cases, while keyword search excels at others. This is why modern production systems use hybrid search strategies.

| Aspect | Keyword Search | Semantic Search | Winner |
|---|---|---|---|
| Core Method | Literal word/phrase matching (inverted index, BM25) | Vector embeddings + meaning similarity | - |
| Strength | Precision & exactness | Understanding & relevance | Context dependent |
| Synonyms & Paraphrasing | Poor (unless manually expanded) | Excellent | Semantic |
| Handling Typos | Weak (needs fuzzy matching) | Good | Semantic |
| Context & Intent | Limited | Strong | Semantic |
| Speed & Efficiency | Very fast & lightweight | Slower & resource-heavy | Keyword |
| Explainability | High (easy to see why a result matched) | Lower (black box) | Keyword |
| Boolean Operators | Excellent (AND, OR, NOT, "exact phrase", etc.) | Weak | Keyword |
| Exact Matching | Excellent (IDs, codes, SKUs, error codes, citations) | Often poor | Keyword |
| Best For | Known-item search, technical logs, legal, compliance | Natural language, exploratory, conceptual search | Context dependent |
| Precision | High | Medium (can return loosely related results) | Keyword |
| Recall (finding related) | Lower | High | Semantic |
| Cost / Scalability | Cheaper & easier to scale | More expensive (embeddings + vector DB) | Keyword |
| Transparency & Debugging | Easy | Difficult | Keyword |
| New/Rare Terms | Reliable | Can struggle | Keyword |
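The hybrid strategies mentioned above usually fuse both signals into a single ranking score. Below is a minimal sketch of that idea: a plain term-overlap scorer stands in for a real BM25 implementation, the toy vectors and the 50/50 blend weight `alpha` are illustrative assumptions, and `hybrid_score` is a hypothetical helper name.

```python
import math

def keyword_score(query, doc):
    # Fraction of query terms that appear literally in the document
    # (a crude stand-in for BM25).
    q_terms = query.lower().split()
    d_terms = set(doc.lower().split())
    return sum(t in d_terms for t in q_terms) / len(q_terms)

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def hybrid_score(query, doc, q_vec, d_vec, alpha=0.5):
    # alpha blends the two signals: 0 = pure keyword, 1 = pure semantic.
    return (1 - alpha) * keyword_score(query, doc) + alpha * cosine(q_vec, d_vec)
```

In production the blend weight is typically tuned on relevance judgments, or the two result lists are merged with a rank-fusion method instead of a score sum.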

JOURNEY-009

Naive Conversational Memory Added


Check the conversation_queries and the perfect answers given by the model.



NOTE: Also check the performance after removing the following block:

# --------------------------------------------------------
# STORE CONVERSATION IN MEMORY
# --------------------------------------------------------

chat_history.append({
    "user": query,
    "assistant": answer
})





# ============================================================
# STEP 1 — INSTALL REQUIRED LIBRARIES
# ============================================================

# Run these in Google Colab

!pip install -q google-genai
!pip install -q faiss-cpu




%env RETRIEVAL_MODE=faiss


# ============================================================
# STEP 2 — IMPORT LIBRARIES
# ============================================================

import os
import numpy as np
import faiss
import google.genai as genai

from google.colab import userdata


# ============================================================
# STEP 3 — LOAD ENVIRONMENT SETTINGS
# ============================================================

# RETRIEVAL MODES:
#
# "cosine"  -> brute-force cosine similarity
# "faiss"   -> FAISS vector search
#
# Change this anytime later.
#
# For Render deployment:
# use environment variables.

RETRIEVAL_MODE = os.getenv(
    "RETRIEVAL_MODE",
    "cosine"
)

print(f"Retrieval mode: {RETRIEVAL_MODE}")


# ============================================================
# STEP 4 — CONFIGURE GEMINI API
# ============================================================

#GEMINI_API_KEY = userdata.get("GEMINI_API_KEY")
#GEMINI_API_KEY = userdata.get("geminiapikey")
#GEMINI_API_KEY = userdata.get("GEMINI_API_KEY-003")
#GEMINI_API_KEY = userdata.get("GEMINI_API_KEY_004")
#GEMINI_API_KEY = userdata.get("GEMINI_API_KEY_005")
GEMINI_API_KEY = userdata.get("GEMINI_API_KEY_006")

client = genai.Client(api_key=GEMINI_API_KEY)


####MEMORY HANDLER
# ============================================================
# CONVERSATIONAL MEMORY (PHASE 1)
# ============================================================

# GOAL
# ----
# Add short-term conversational memory so the system:
#   - understands follow-up questions
#   - remembers recent discussion
#   - supports multi-turn conversations
#
# EXAMPLE
# -------
# User: Why is my pod crashing?
# User: How do I debug it?
#
# "it" should refer to the crashing pod.
#
#
# IMPORTANT
# ---------
# This is NOT semantic/vector memory yet.
#
# This is:
# SHORT-TERM PROMPT MEMORY
#
#
# WHAT WE WILL ADD
# ----------------
# ✅ chat_history
# ✅ memory window
# ✅ history injection into prompt
# ✅ multi-turn continuity
#
#
# ============================================================
# MEMORY CONFIGURATION
# ============================================================

# Number of previous conversation turns to remember.
#
# Example:
# MEMORY_WINDOW = 3
#
# Means:
# last 3 user-assistant exchanges are included.

MEMORY_WINDOW = 3


# ============================================================
# CHAT HISTORY STORAGE
# ============================================================

# Conversation history format:
#
# [
#   {
#       "user": "...",
#       "assistant": "..."
#   }
# ]

chat_history = []


# ============================================================
# BUILD CONVERSATION HISTORY TEXT
# ============================================================

def build_history_context():

    # --------------------------------------------------------
    # TAKE ONLY RECENT MEMORY WINDOW
    # --------------------------------------------------------

    recent_history = chat_history[-MEMORY_WINDOW:]

    # --------------------------------------------------------
    # BUILD HISTORY TEXT
    # --------------------------------------------------------

    history_text = ""

    for turn in recent_history:

        history_text += f"""
User:
{turn['user']}

Assistant:
{turn['assistant']}

"""

    return history_text.strip()
####MEMORY HANDLER
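To see the memory window in isolation, here is a standalone sketch of the slicing used in build_history_context (the turn contents are made-up placeholders): with MEMORY_WINDOW = 3, only the last three exchanges survive.

```python
MEMORY_WINDOW = 3

chat_history = [
    {"user": "turn 1", "assistant": "answer 1"},
    {"user": "turn 2", "assistant": "answer 2"},
    {"user": "turn 3", "assistant": "answer 3"},
    {"user": "turn 4", "assistant": "answer 4"},
]

# Negative slice keeps only the most recent MEMORY_WINDOW turns.
recent = chat_history[-MEMORY_WINDOW:]

print(len(recent))        # 3
print(recent[0]["user"])  # turn 2  (the oldest turn was dropped)
```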





# ============================================================
# STEP 5 — CREATE DATASET
# ============================================================

documents = [

    # --------------------------------------------------------
    # POD FAILURES / DEBUGGING
    # --------------------------------------------------------

    "CrashLoopBackOff occurs when a container repeatedly crashes after starting.",
    "OOMKilled happens when a container exceeds its memory limit.",
    "A container may crash due to missing environment variables.",
    "Incorrect command or entrypoint can cause container startup failure.",
    "Application errors inside the container often lead to restarts.",
    "kubectl logs retrieves logs from a running container.",
    "kubectl describe pod shows events and state transitions.",
    "Liveness probes determine if a container should be restarted.",
    "Readiness probes determine if a pod can receive traffic.",

    # --------------------------------------------------------
    # SCHEDULING
    # --------------------------------------------------------

    "Pods remain pending if no node satisfies resource requests.",
    "Node affinity restricts pods to specific nodes.",
    "Taints prevent pods from being scheduled on certain nodes.",
    "Tolerations allow pods to be scheduled on tainted nodes.",

    # --------------------------------------------------------
    # SERVICES
    # --------------------------------------------------------

    "ClusterIP services expose applications within the cluster.",
    "NodePort services expose applications on node IPs.",
    "LoadBalancer services expose applications externally.",
    "Ingress routes HTTP and HTTPS traffic to services.",

    # --------------------------------------------------------
    # STORAGE
    # --------------------------------------------------------

    "PersistentVolumes provide storage independent of pods.",
    "PersistentVolumeClaims request storage resources.",
    "StorageClasses define dynamic provisioning behavior.",

    # --------------------------------------------------------
    # DEPLOYMENTS
    # --------------------------------------------------------

    "Deployments manage replica sets and pod updates.",
    "Rolling updates gradually replace old pods with new ones.",
    "ReplicaSets maintain a stable number of pod replicas.",

    # --------------------------------------------------------
    # CONFIGURATION
    # --------------------------------------------------------

    "ConfigMaps store non-sensitive configuration data.",
    "Secrets store sensitive data like passwords and tokens.",
    "Environment variables can be injected from ConfigMaps and Secrets.",

    # --------------------------------------------------------
    # IMAGES / REGISTRY
    # --------------------------------------------------------

    "ImagePullBackOff occurs when Kubernetes cannot pull the container image.",
    "Incorrect image name or tag can cause image pull failures.",
    "Private registries require imagePullSecrets for authentication.",

    # --------------------------------------------------------
    # AUTOSCALING
    # --------------------------------------------------------

    "Horizontal Pod Autoscaler scales based on CPU or metrics.",

    # --------------------------------------------------------
    # SECURITY
    # --------------------------------------------------------

    "RBAC controls access permissions inside Kubernetes.",
    "RBAC misconfiguration can block access to resources.",

    # --------------------------------------------------------
    # NETWORKING
    # --------------------------------------------------------

    "NetworkPolicies control communication between pods.",

    # --------------------------------------------------------
    # CLEANUP
    # --------------------------------------------------------

    "Pods stuck in Terminating state may have finalizers blocking deletion."
]

print(f"Total documents: {len(documents)}")


# ============================================================
# STEP 6 — CREATE SLIDING WINDOW CHUNKS
# ============================================================

# WHY?
# ----
# Preserves neighboring semantic context.
#
# Example:
# sentence1 + sentence2 + sentence3
#
# Then:
# sentence2 + sentence3 + sentence4

WINDOW_SIZE = 3
STRIDE = 1

smart_chunks = []

for i in range(0, len(documents) - WINDOW_SIZE + 1, STRIDE):

    chunk = documents[i:i + WINDOW_SIZE]

    chunk_text = "\n".join(chunk)

    smart_chunks.append(chunk_text)

print(f"Total chunks created: {len(smart_chunks)}")
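The chunk count printed above follows the usual sliding-window formula, (N - WINDOW_SIZE) // STRIDE + 1. A small sketch (the helper name num_chunks is hypothetical):

```python
def num_chunks(n_docs, window, stride):
    # Number of windows of size `window` that fit over `n_docs`
    # items when sliding by `stride`; 0 if the corpus is too small.
    return max(0, (n_docs - window) // stride + 1)

print(num_chunks(34, 3, 1))  # 32, matching the 34 documents above
```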


# ============================================================
# STEP 7 — PREPARE STRUCTURED CHUNK DATA
# ============================================================

# prepared_data = []

# for i, chunk in enumerate(smart_chunks):

#     prepared_data.append({
#         "id": f"chunk_{i}",
#         "text": chunk
#     })

# print(f"Prepared chunks: {len(prepared_data)}")
prepared_data = []

for i, chunk in enumerate(smart_chunks):
    prepared_data.append({
        # ----------------------------------------------------
        # UNIQUE SOURCE ID
        # ----------------------------------------------------
        "source_id": f"SOURCE_{i+1}",
        # ----------------------------------------------------
        # CHUNK ID
        # ----------------------------------------------------
        "id": f"chunk_{i}",
        # ----------------------------------------------------
        # ACTUAL CHUNK TEXT
        # ----------------------------------------------------
        "text": chunk,
        # ----------------------------------------------------
        # OPTIONAL METADATA
        # ----------------------------------------------------
        "metadata": {
            "topic": "kubernetes",
            "chunk_number": i
        }
    })

print("Prepared data with source attribution.")


# ============================================================
# STEP 8 — CREATE EMBEDDING FUNCTION
# ============================================================

def get_embedding(text):

    # response = embed_content(
    #     model="models/gemini-embedding-001",
    #     contents=text
    # )

    # return response["embedding"]

    response = client.models.embed_content(
        model="models/gemini-embedding-001",
        contents=text
    )
    # The new SDK returns a list of embeddings in 'embeddings'
    return response.embeddings[0].values


# ============================================================
# STEP 9 — GENERATE CHUNK EMBEDDINGS
# ============================================================

print("Generating embeddings...")

for item in prepared_data:

    embedding = get_embedding(item["text"])

    item["embedding"] = embedding

print("Embeddings generated successfully.")


# ============================================================
# STEP 10 — NORMALIZATION FUNCTION
# ============================================================

def normalize(vec):

    vec = np.array(vec)

    return vec / np.linalg.norm(vec)


# ============================================================
# STEP 11 — COSINE SIMILARITY FUNCTION
# ============================================================

def cosine_similarity(a, b):

    a = normalize(a)
    b = normalize(b)

    return np.dot(a, b)


# ============================================================
# STEP 12 — COSINE RETRIEVAL FUNCTION
# ============================================================

def retrieve_cosine(query, top_k=3, min_score=0.55):

    # --------------------------------------------------------
    # EMBED QUERY
    # --------------------------------------------------------

    query_embedding = get_embedding(query)

    scores = []

    # --------------------------------------------------------
    # CALCULATE COSINE SIMILARITY
    # --------------------------------------------------------

    for item in prepared_data:

        similarity = cosine_similarity(
            query_embedding,
            item["embedding"]
        )

        scores.append((similarity, item))

    # --------------------------------------------------------
    # SORT BY SCORE
    # --------------------------------------------------------

    scores = sorted(
        scores,
        key=lambda x: x[0],
        reverse=True
    )

    # --------------------------------------------------------
    # SIMPLE RE-RANKING
    # --------------------------------------------------------

    reranked = []

    query_words = query.lower().split()

    for sim, item in scores:

        text = item["text"].lower()

        keyword_bonus = sum(
            word in text for word in query_words
        )

        final_score = sim + (0.03 * keyword_bonus)

        reranked.append((final_score, item))

    # --------------------------------------------------------
    # SORT AGAIN AFTER RE-RANKING
    # --------------------------------------------------------

    reranked = sorted(
        reranked,
        key=lambda x: x[0],
        reverse=True
    )

    # --------------------------------------------------------
    # FILTER LOW SCORES
    # --------------------------------------------------------

    filtered = [
        x for x in reranked
        if x[0] >= min_score
    ]

    return filtered[:top_k]


# ============================================================
# STEP 13 — CREATE FAISS EMBEDDING MATRIX
# ============================================================

embedding_matrix = []

for item in prepared_data:

    embedding_matrix.append(item["embedding"])

embedding_matrix = np.array(
    embedding_matrix,
    dtype=np.float32
)

print("Embedding matrix shape:")
print(embedding_matrix.shape)


# ============================================================
# STEP 14 — NORMALIZE EMBEDDINGS FOR FAISS
# ============================================================

# IMPORTANT:
#
# IndexFlatIP uses INNER PRODUCT.
#
# If vectors are normalized:
#
# inner product == cosine similarity

faiss.normalize_L2(embedding_matrix)
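A quick NumPy check of the claim above: after L2-normalization, the inner product of two vectors equals their cosine similarity. The vectors here are arbitrary examples.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

# Cosine similarity computed directly.
cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Inner product after normalizing each vector to unit length.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
inner = np.dot(a_n, b_n)

print(abs(cos_sim - inner) < 1e-12)  # True
```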


# ============================================================
# STEP 15 — CREATE FAISS INDEX
# ============================================================

dimension = embedding_matrix.shape[1]

index = faiss.IndexFlatIP(dimension)

print("FAISS index created.")


# ============================================================
# STEP 16 — ADD EMBEDDINGS TO FAISS INDEX
# ============================================================

index.add(embedding_matrix)

print(f"Total vectors indexed: {index.ntotal}")


# ============================================================
# STEP 17 — FAISS RETRIEVAL FUNCTION
# ============================================================

def retrieve_faiss(query, top_k=3):

    # --------------------------------------------------------
    # EMBED QUERY
    # --------------------------------------------------------

    query_embedding = get_embedding(query)

    # --------------------------------------------------------
    # CONVERT TO NUMPY
    # --------------------------------------------------------

    query_vector = np.array(
        [query_embedding],
        dtype=np.float32
    )

    # --------------------------------------------------------
    # NORMALIZE QUERY VECTOR
    # --------------------------------------------------------

    faiss.normalize_L2(query_vector)

    # --------------------------------------------------------
    # SEARCH FAISS INDEX
    # --------------------------------------------------------

    scores, indices = index.search(
        query_vector,
        top_k
    )

    # --------------------------------------------------------
    # FORMAT RESULTS
    # --------------------------------------------------------

    results = []

    for score, idx in zip(scores[0], indices[0]):

        item = prepared_data[idx]

        results.append((score, item))

    return results


# ============================================================
# STEP 18 — RETRIEVAL ROUTER
# ============================================================

# This decides:
#
# cosine retrieval
# OR
# FAISS retrieval

def retrieve_router(query, top_k=3):

    if RETRIEVAL_MODE == "cosine":

        return retrieve_cosine(
            query=query,
            top_k=top_k
        )

    elif RETRIEVAL_MODE == "faiss":

        return retrieve_faiss(
            query=query,
            top_k=top_k
        )

    else:

        raise ValueError(
            f"Invalid retrieval mode: {RETRIEVAL_MODE}"
        )


# ============================================================
# STEP 19 — BUILD PROMPT - SOME IMPROVEMENTS
# ============================================================

# WHAT THIS IMPROVES
# -------------------
# ✅ Better grounding
# ✅ Reduced hallucinations
# ✅ Better formatting
# ✅ Better instruction following
# ✅ Cleaner troubleshooting answers
#
# IMPORTANT:
# -----------
# This does NOT improve retrieval itself.
#
# It improves:
# HOW the LLM uses retrieved chunks.


def build_prompt(query, retrieved_chunks):

    # --------------------------------------------------------
    # BUILD RETRIEVED CONTEXT
    # --------------------------------------------------------

    context_parts = []

    for i, (score, item) in enumerate(retrieved_chunks, start=1):

        context_parts.append(
            f"""
SOURCE ID: {item["source_id"]}

RELEVANCE SCORE: {score:.4f}

CONTENT:
{item["text"]}
"""
        )

    context_text = "\n".join(context_parts)

    # --------------------------------------------------------
    # BUILD CONVERSATION HISTORY
    # --------------------------------------------------------

    history_text = build_history_context()

    # --------------------------------------------------------
    # FINAL PROMPT
    # --------------------------------------------------------

    prompt = f"""
You are an expert Kubernetes troubleshooting assistant.

Your job is to answer the user's question ONLY using
the retrieved context and conversation history.

IMPORTANT RULES:
----------------

1. Use ONLY the retrieved context and conversation history.

2. Do NOT use outside knowledge.

3. Do NOT invent information.

4. Answer using available context.
   If context is incomplete,
   explicitly mention limitations.

5. If answer is not present at all,
   say:
   "I don't know based on the provided context."

6. Keep answers:
   - concise
   - technically accurate
   - well-structured

7. Use bullet points when appropriate.

8. Prefer information from higher relevance scores.

9. At the end of the answer,
   cite the source IDs used.

10. Use conversation history to understand
    follow-up questions and references.

==================================================
CONVERSATION HISTORY
==================================================

{history_text}

==================================================
RETRIEVED CONTEXT
==================================================

{context_text}

==================================================
USER QUESTION
==================================================

{query}

==================================================
ANSWER FORMAT
==================================================

Answer:
<your answer>

Sources Used:
- SOURCE_X
- SOURCE_Y
"""

    return prompt

# ============================================================
# STEP 20 — GENERATE ANSWER USING GEMINI
# ============================================================

def generate_answer(prompt):

    # model = genai.GenerativeModel(
    #     "gemini-3-flash-preview"
    # )

    # response = model.generate_content(prompt)
    response = client.models.generate_content(
        model="models/gemini-3.1-flash-lite",
        #model="models/gemini-3-flash-preview",
        #model="models/gemini-2.5-flash",
        #model="models/gemini-2.5-flash-lite",
        #model="models/gemini-3.1-pro-preview",
        #model="models/gemini-2.0-flash-lite",
        contents=prompt
    )

    return response.text


# ============================================================
# STEP 21 — MAIN RAG PIPELINE
# ============================================================

def rag_pipeline(query, top_k=3):

    # --------------------------------------------------------
    # RETRIEVE CHUNKS
    # --------------------------------------------------------

    retrieved_chunks = retrieve_router(
        query=query,
        top_k=top_k
    )

    # --------------------------------------------------------
    # BUILD PROMPT
    # --------------------------------------------------------

    prompt = build_prompt(
        query,
        retrieved_chunks
    )

    # --------------------------------------------------------
    # GENERATE ANSWER
    # --------------------------------------------------------

    answer = generate_answer(prompt)

    # --------------------------------------------------------
    # STORE CONVERSATION IN MEMORY
    # --------------------------------------------------------

    chat_history.append({
        "user": query,
        "assistant": answer
    })

    return answer, retrieved_chunks


# # ============================================================
# # STEP 22 — TEST RETRIEVAL ONLY
# # ============================================================

# test_queries = [

#     "Why is my pod crashing?",
#     "How to debug Kubernetes logs?",
#     "What causes OOMKilled?",
#     "How do services work in Kubernetes?",
#     "Why is my container restarting repeatedly?"
# ]

# for query in test_queries:

#     print("\n" + "=" * 80)
#     print(f"QUERY: {query}\n")

#     results = retrieve_router(query)

#     for score, item in results:

#         print(f"Score: {score:.4f}")

#         print(item["text"])

#         print("-" * 40)


# ============================================================
# STEP 23 — FINAL RAG TEST
# ============================================================

conversation_queries = [

    # --------------------------------------------------------
    # FIRST QUESTION
    # --------------------------------------------------------

    "Why is my pod crashing?",

    # --------------------------------------------------------
    # FOLLOW-UP QUESTION
    # --------------------------------------------------------

    "How do I debug it?",

    # --------------------------------------------------------
    # ANOTHER FOLLOW-UP
    # --------------------------------------------------------

    "What command should I use?",

    # --------------------------------------------------------
    # MEMORY CONTEXT TEST
    # --------------------------------------------------------

    "What could be the root cause?"
]


# ============================================================
# STEP 24 — RUN CONVERSATION TEST
# ============================================================

for query in conversation_queries:

    print("\n" + "=" * 80)

    print(f"USER: {query}")

    # --------------------------------------------------------
    # RUN PIPELINE
    # --------------------------------------------------------

    answer, sources = rag_pipeline(query)

    # --------------------------------------------------------
    # PRINT ANSWER
    # --------------------------------------------------------

    print("\nASSISTANT:\n")

    print(answer)

    # # --------------------------------------------------------
    # # PRINT RETRIEVED SOURCES
    # # --------------------------------------------------------

    print("\nRETRIEVED SOURCES:\n")

    for score, item in sources:

        print(f"Source ID: {item['source_id']}")

        print(f"Score: {score:.4f}")

        print("-" * 40)


# ============================================================
# OPTIONAL MEMORY RESET
# ============================================================

# Use this whenever you want to clear conversation history.

# chat_history = []






======================================================================================
OUTPUT
======================================================================================
Retrieval mode: faiss
Total documents: 34
Total chunks created: 32
Prepared data with source attribution.
Generating embeddings...
Embeddings generated successfully.
Embedding matrix shape:
(32, 3072)
FAISS index created.
Total vectors indexed: 32

================================================================================
USER: Why is my pod crashing?

ASSISTANT:

Answer:
A pod may crash for several reasons:

* **CrashLoopBackOff:** Occurs when a container repeatedly crashes after starting.
* **OOMKilled:** Occurs when a container exceeds its defined memory limit.
* **Missing Environment Variables:** Required variables may be absent.
* **Startup Failures:** An incorrect command or entrypoint can prevent the container from starting correctly.
* **Application Errors:** Errors occurring inside the container frequently lead to restarts.

To investigate the specific cause, you can:

* Use `kubectl describe pod` to view events and state transitions.
* Use `kubectl logs` to retrieve logs from the container.

Sources Used:
- SOURCE_1
- SOURCE_3
- SOURCE_5

RETRIEVED SOURCES:

Source ID: SOURCE_1
Score: 0.7078
----------------------------------------
Source ID: SOURCE_5
Score: 0.6856
----------------------------------------
Source ID: SOURCE_3
Score: 0.6782
----------------------------------------

================================================================================
USER: How do I debug it?

ASSISTANT:

Answer:
To debug a container that is crashing or failing, you can take the following steps:

* **Check Pod Status and Events:** Use `kubectl describe pod <pod-name>` to view the pod's state transitions and events, which can help identify issues like OOMKilled (memory limit exceeded), missing environment variables, or startup failures.
* **Examine Logs:** Use `kubectl logs <pod-name>` to retrieve logs from the container, which can help diagnose application-level errors causing restarts.
* **Identify Common Causes:**
  * **CrashLoopBackOff:** Indicates the container is repeatedly crashing after starting.
  * **OOMKilled:** Occurs when the container memory limit is exceeded.
  * **Startup Failures:** Often caused by an incorrect command, entrypoint, or missing environment variables.

Sources Used:
- SOURCE_1
- SOURCE_2
- SOURCE_5

RETRIEVED SOURCES:

Source ID: SOURCE_5
Score: 0.5827
----------------------------------------
Source ID: SOURCE_2
Score: 0.5814
----------------------------------------
Source ID: SOURCE_1
Score: 0.5794
----------------------------------------

================================================================================
USER: What command should I use?

ASSISTANT:

Answer:
To troubleshoot container issues, you can use the following commands:

* `kubectl logs <pod-name>`: Retrieves logs from a running container to help identify application errors.
* `kubectl describe pod <pod-name>`: Shows events and state transitions for a pod, which can help diagnose startup failures or crashes.

Sources Used:
- SOURCE_5

RETRIEVED SOURCES:

Source ID: SOURCE_2
Score: 0.5440
----------------------------------------
Source ID: SOURCE_5
Score: 0.5411
----------------------------------------
Source ID: SOURCE_19
Score: 0.5401
----------------------------------------

================================================================================
USER: What could be the root cause?

ASSISTANT:

Answer:
A container may crash or experience `CrashLoopBackOff` due to several potential root causes:

* **Memory issues:** The container exceeded its memory limit, resulting in `OOMKilled`.
* **Configuration errors:** Missing environment variables.
* **Startup failures:** An incorrect command or entrypoint.
* **Application issues:** Errors occurring inside the container leading to restarts.

Sources Used:
- SOURCE_1
- SOURCE_2
- SOURCE_3

RETRIEVED SOURCES:

Source ID: SOURCE_2
Score: 0.6080
----------------------------------------
Source ID: SOURCE_1
Score: 0.6033
----------------------------------------
Source ID: SOURCE_3
Score: 0.5994
----------------------------------------

JOURNEY-008

Source Attribution Added


# ============================================================
# STEP 1 — INSTALL REQUIRED LIBRARIES
# ============================================================

# Run these in Google Colab

!pip install -q google-genai
!pip install -q faiss-cpu



%env RETRIEVAL_MODE=faiss



# ============================================================
# STEP 2 — IMPORT LIBRARIES
# ============================================================

import os
import numpy as np
import faiss
import google.genai as genai

from google.colab import userdata


# ============================================================
# STEP 3 — LOAD ENVIRONMENT SETTINGS
# ============================================================

# RETRIEVAL MODES:
#
# "cosine"  -> brute-force cosine similarity
# "faiss"   -> FAISS vector search
#
# Change this anytime later.
#
# For Render deployment:
# use environment variables.

RETRIEVAL_MODE = os.getenv(
    "RETRIEVAL_MODE",
    "cosine"
)

print(f"Retrieval mode: {RETRIEVAL_MODE}")


# ============================================================
# STEP 4 — CONFIGURE GEMINI API
# ============================================================

#GEMINI_API_KEY = userdata.get("GEMINI_API_KEY")
#GEMINI_API_KEY = userdata.get("geminiapikey")
#GEMINI_API_KEY = userdata.get("GEMINI_API_KEY-003")
GEMINI_API_KEY = userdata.get("GEMINI_API_KEY_004")

client = genai.Client(api_key=GEMINI_API_KEY)


# ============================================================
# STEP 5 — CREATE DATASET
# ============================================================

documents = [

    # --------------------------------------------------------
    # POD FAILURES / DEBUGGING
    # --------------------------------------------------------

    "CrashLoopBackOff occurs when a container repeatedly crashes after starting.",
    "OOMKilled happens when a container exceeds its memory limit.",
    "A container may crash due to missing environment variables.",
    "Incorrect command or entrypoint can cause container startup failure.",
    "Application errors inside the container often lead to restarts.",
    "kubectl logs retrieves logs from a running container.",
    "kubectl describe pod shows events and state transitions.",
    "Liveness probes determine if a container should be restarted.",
    "Readiness probes determine if a pod can receive traffic.",

    # --------------------------------------------------------
    # SCHEDULING
    # --------------------------------------------------------

    "Pods remain pending if no node satisfies resource requests.",
    "Node affinity restricts pods to specific nodes.",
    "Taints prevent pods from being scheduled on certain nodes.",
    "Tolerations allow pods to be scheduled on tainted nodes.",

    # --------------------------------------------------------
    # SERVICES
    # --------------------------------------------------------

    "ClusterIP services expose applications within the cluster.",
    "NodePort services expose applications on node IPs.",
    "LoadBalancer services expose applications externally.",
    "Ingress routes HTTP and HTTPS traffic to services.",

    # --------------------------------------------------------
    # STORAGE
    # --------------------------------------------------------

    "PersistentVolumes provide storage independent of pods.",
    "PersistentVolumeClaims request storage resources.",
    "StorageClasses define dynamic provisioning behavior.",

    # --------------------------------------------------------
    # DEPLOYMENTS
    # --------------------------------------------------------

    "Deployments manage replica sets and pod updates.",
    "Rolling updates gradually replace old pods with new ones.",
    "ReplicaSets maintain a stable number of pod replicas.",

    # --------------------------------------------------------
    # CONFIGURATION
    # --------------------------------------------------------

    "ConfigMaps store non-sensitive configuration data.",
    "Secrets store sensitive data like passwords and tokens.",
    "Environment variables can be injected from ConfigMaps and Secrets.",

    # --------------------------------------------------------
    # IMAGES / REGISTRY
    # --------------------------------------------------------

    "ImagePullBackOff occurs when Kubernetes cannot pull the container image.",
    "Incorrect image name or tag can cause image pull failures.",
    "Private registries require imagePullSecrets for authentication.",

    # --------------------------------------------------------
    # AUTOSCALING
    # --------------------------------------------------------

    "Horizontal Pod Autoscaler scales based on CPU or metrics.",

    # --------------------------------------------------------
    # SECURITY
    # --------------------------------------------------------

    "RBAC controls access permissions inside Kubernetes.",
    "RBAC misconfiguration can block access to resources.",

    # --------------------------------------------------------
    # NETWORKING
    # --------------------------------------------------------

    "NetworkPolicies control communication between pods.",

    # --------------------------------------------------------
    # CLEANUP
    # --------------------------------------------------------

    "Pods stuck in Terminating state may have finalizers blocking deletion."
]

print(f"Total documents: {len(documents)}")


# ============================================================
# STEP 6 — CREATE SLIDING WINDOW CHUNKS
# ============================================================

# WHY?
# ----
# Preserves neighboring semantic context.
#
# Example:
# sentence1 + sentence2 + sentence3
#
# Then:
# sentence2 + sentence3 + sentence4

WINDOW_SIZE = 3
STRIDE = 1

smart_chunks = []

for i in range(0, len(documents) - WINDOW_SIZE + 1, STRIDE):

    chunk = documents[i:i + WINDOW_SIZE]

    chunk_text = "\n".join(chunk)

    smart_chunks.append(chunk_text)

print(f"Total chunks created: {len(smart_chunks)}")
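The window/stride arithmetic above can be sanity-checked on a toy list (the `sN` strings below are placeholders, not real documents):

```python
# Toy check of the sliding-window chunking used above.
docs = ["s1", "s2", "s3", "s4", "s5"]

WINDOW_SIZE = 3
STRIDE = 1

chunks = [
    "\n".join(docs[i:i + WINDOW_SIZE])
    for i in range(0, len(docs) - WINDOW_SIZE + 1, STRIDE)
]

# 5 documents, window 3, stride 1 -> 5 - 3 + 1 = 3 chunks,
# each overlapping its neighbor by two sentences.
print(len(chunks))   # 3
print(chunks[1])     # s2, s3, s4 joined by newlines
```

With the 34-document dataset above, the same formula gives 34 - 3 + 1 = 32 chunks, which matches the printed count.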


# ============================================================
# STEP 7 — PREPARE STRUCTURED CHUNK DATA
# ============================================================

# prepared_data = []

# for i, chunk in enumerate(smart_chunks):

#     prepared_data.append({
#         "id": f"chunk_{i}",
#         "text": chunk
#     })

# print(f"Prepared chunks: {len(prepared_data)}")
prepared_data = []

for i, chunk in enumerate(smart_chunks):
    prepared_data.append({
        # ----------------------------------------------------
        # UNIQUE SOURCE ID
        # ----------------------------------------------------
        "source_id": f"SOURCE_{i+1}",
        # ----------------------------------------------------
        # CHUNK ID
        # ----------------------------------------------------
        "id": f"chunk_{i}",
        # ----------------------------------------------------
        # ACTUAL CHUNK TEXT
        # ----------------------------------------------------
        "text": chunk,
        # ----------------------------------------------------
        # OPTIONAL METADATA
        # ----------------------------------------------------
        "metadata": {
            "topic": "kubernetes",
            "chunk_number": i
        }
    })

print("Prepared data with source attribution.")


# ============================================================
# STEP 8 — CREATE EMBEDDING FUNCTION
# ============================================================

def get_embedding(text):

    # response = embed_content(
    #     model="models/gemini-embedding-001",
    #     contents=text
    # )

    # return response["embedding"]

    response = client.models.embed_content(
        model="models/gemini-embedding-001",
        contents=text
    )
    # The new SDK returns a list of embeddings in 'embeddings'
    return response.embeddings[0].values


# ============================================================
# STEP 9 — GENERATE CHUNK EMBEDDINGS
# ============================================================

print("Generating embeddings...")

for item in prepared_data:

    embedding = get_embedding(item["text"])

    item["embedding"] = embedding

print("Embeddings generated successfully.")


# ============================================================
# STEP 10 — NORMALIZATION FUNCTION
# ============================================================

def normalize(vec):

    vec = np.array(vec)

    return vec / np.linalg.norm(vec)


# ============================================================
# STEP 11 — COSINE SIMILARITY FUNCTION
# ============================================================

def cosine_similarity(a, b):

    a = normalize(a)
    b = normalize(b)

    return np.dot(a, b)
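As a quick self-contained check (the helpers are redefined here so the snippet runs on its own), cosine similarity is scale-invariant and behaves as expected at the extremes:

```python
import numpy as np

def normalize(vec):
    vec = np.array(vec)
    return vec / np.linalg.norm(vec)

def cosine_similarity(a, b):
    return np.dot(normalize(a), normalize(b))

# Identical directions score 1.0, orthogonal directions score 0.0,
# and scaling a vector does not change the score.
print(round(cosine_similarity([1, 0], [2, 0]), 4))  # 1.0
print(round(cosine_similarity([1, 0], [0, 3]), 4))  # 0.0
```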


# ============================================================
# STEP 12 — COSINE RETRIEVAL FUNCTION
# ============================================================

def retrieve_cosine(query, top_k=3, min_score=0.55):

    # --------------------------------------------------------
    # EMBED QUERY
    # --------------------------------------------------------

    query_embedding = get_embedding(query)

    scores = []

    # --------------------------------------------------------
    # CALCULATE COSINE SIMILARITY
    # --------------------------------------------------------

    for item in prepared_data:

        similarity = cosine_similarity(
            query_embedding,
            item["embedding"]
        )

        scores.append((similarity, item))

    # --------------------------------------------------------
    # SORT BY SCORE
    # --------------------------------------------------------

    scores = sorted(
        scores,
        key=lambda x: x[0],
        reverse=True
    )

    # --------------------------------------------------------
    # SIMPLE RE-RANKING
    # --------------------------------------------------------

    reranked = []

    query_words = query.lower().split()

    for sim, item in scores:

        text = item["text"].lower()

        keyword_bonus = sum(
            word in text for word in query_words
        )

        final_score = sim + (0.03 * keyword_bonus)

        reranked.append((final_score, item))

    # --------------------------------------------------------
    # SORT AGAIN AFTER RE-RANKING
    # --------------------------------------------------------

    reranked = sorted(
        reranked,
        key=lambda x: x[0],
        reverse=True
    )

    # --------------------------------------------------------
    # FILTER LOW SCORES
    # (note: the threshold applies to the re-ranked score,
    #  i.e. similarity + keyword bonus)
    # --------------------------------------------------------

    filtered = [
        x for x in reranked
        if x[0] >= min_score
    ]

    return filtered[:top_k]
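The keyword-bonus re-ranking can be exercised in isolation, with no embedding calls. The similarity scores below are made up for illustration; the `0.03` weight matches the function above. Note that `word in text` is a substring test, so the bonus is approximate:

```python
query = "why is my pod crashing"
query_words = query.lower().split()

# Hypothetical (similarity, text) pairs standing in for real results.
scores = [
    (0.70, "OOMKilled happens when a container exceeds its memory limit."),
    (0.68, "CrashLoopBackOff occurs when a pod container is crashing repeatedly."),
]

reranked = sorted(
    (
        (sim + 0.03 * sum(w in text.lower() for w in query_words), text)
        for sim, text in scores
    ),
    reverse=True,
)

for score, text in reranked:
    print(f"{score:.2f}  {text}")
```

Here the second result gains a 0.09 bonus (three query words match: "is", "pod", "crashing") and overtakes the first, which matches no query words at all.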


# ============================================================
# STEP 13 — CREATE FAISS EMBEDDING MATRIX
# ============================================================

embedding_matrix = []

for item in prepared_data:

    embedding_matrix.append(item["embedding"])

embedding_matrix = np.array(
    embedding_matrix,
    dtype=np.float32
)

print("Embedding matrix shape:")
print(embedding_matrix.shape)


# ============================================================
# STEP 14 — NORMALIZE EMBEDDINGS FOR FAISS
# ============================================================

# IMPORTANT:
#
# IndexFlatIP uses INNER PRODUCT.
#
# If vectors are normalized:
#
# inner product == cosine similarity

faiss.normalize_L2(embedding_matrix)
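The equivalence stated above is easy to verify with NumPy alone, no FAISS required: after L2 normalization, the inner product of two vectors equals their cosine similarity.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

# Cosine similarity computed directly.
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Inner product after L2-normalizing both vectors
# (what faiss.normalize_L2 + IndexFlatIP effectively compute).
an = a / np.linalg.norm(a)
bn = b / np.linalg.norm(b)
ip = np.dot(an, bn)

print(np.isclose(cos, ip))  # True
```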


# ============================================================
# STEP 15 — CREATE FAISS INDEX
# ============================================================

dimension = embedding_matrix.shape[1]

index = faiss.IndexFlatIP(dimension)

print("FAISS index created.")


# ============================================================
# STEP 16 — ADD EMBEDDINGS TO FAISS INDEX
# ============================================================

index.add(embedding_matrix)

print(f"Total vectors indexed: {index.ntotal}")


# ============================================================
# STEP 17 — FAISS RETRIEVAL FUNCTION
# ============================================================

def retrieve_faiss(query, top_k=3):

    # --------------------------------------------------------
    # EMBED QUERY
    # --------------------------------------------------------

    query_embedding = get_embedding(query)

    # --------------------------------------------------------
    # CONVERT TO NUMPY
    # --------------------------------------------------------

    query_vector = np.array(
        [query_embedding],
        dtype=np.float32
    )

    # --------------------------------------------------------
    # NORMALIZE QUERY VECTOR
    # --------------------------------------------------------

    faiss.normalize_L2(query_vector)

    # --------------------------------------------------------
    # SEARCH FAISS INDEX
    # --------------------------------------------------------

    scores, indices = index.search(
        query_vector,
        top_k
    )

    # --------------------------------------------------------
    # FORMAT RESULTS
    # --------------------------------------------------------

    results = []

    for score, idx in zip(scores[0], indices[0]):

        item = prepared_data[idx]

        results.append((score, item))

    return results


# ============================================================
# STEP 18 — RETRIEVAL ROUTER
# ============================================================

# This decides:
#
# cosine retrieval
# OR
# FAISS retrieval

def retrieve_router(query, top_k=3):

    if RETRIEVAL_MODE == "cosine":

        return retrieve_cosine(
            query=query,
            top_k=top_k
        )

    elif RETRIEVAL_MODE == "faiss":

        return retrieve_faiss(
            query=query,
            top_k=top_k
        )

    else:

        raise ValueError(
            f"Invalid retrieval mode: {RETRIEVAL_MODE}"
        )
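An alternative sketch of the same router using a dispatch table. The retrievers here are stubs standing in for `retrieve_cosine` and `retrieve_faiss` so the snippet runs on its own; don't paste it over the real functions:

```python
# Dispatch-table variant of the if/elif router above.
# The stubs stand in for retrieve_cosine / retrieve_faiss.
def _stub_cosine(query, top_k=3):
    return [("cosine", query, top_k)]

def _stub_faiss(query, top_k=3):
    return [("faiss", query, top_k)]

RETRIEVERS = {
    "cosine": _stub_cosine,
    "faiss": _stub_faiss,
}

def route(query, top_k=3, mode="cosine"):
    try:
        return RETRIEVERS[mode](query=query, top_k=top_k)
    except KeyError:
        raise ValueError(f"Invalid retrieval mode: {mode}") from None

print(route("test", mode="faiss"))  # [('faiss', 'test', 3)]
```

Adding a new retrieval mode then only requires registering one more entry in the table.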


# ============================================================
# STEP 19 — BUILD PROMPT - SOME IMPROVEMENTS
# ============================================================

# WHAT THIS IMPROVES
# -------------------
# ✅ Better grounding
# ✅ Reduced hallucinations
# ✅ Better formatting
# ✅ Better instruction following
# ✅ Cleaner troubleshooting answers
#
# IMPORTANT:
# -----------
# This does NOT improve retrieval itself.
#
# It improves:
# HOW the LLM uses retrieved chunks.


def build_prompt(query, retrieved_chunks):

    # --------------------------------------------------------
    # BUILD CONTEXT
    # --------------------------------------------------------

    context_parts = []

    for i, (score, item) in enumerate(retrieved_chunks, start=1):

        context_parts.append(
            f"""
SOURCE ID: {item["source_id"]}

RELEVANCE SCORE: {score:.4f}

CONTENT:
{item["text"]}
"""
        )

    context_text = "\n".join(context_parts)

    # --------------------------------------------------------
    # FINAL PROMPT
    # --------------------------------------------------------

    prompt = f"""
You are an expert Kubernetes troubleshooting assistant.

Your job is to answer the user's question ONLY using
the retrieved context provided below.

IMPORTANT RULES:
----------------

1. Use ONLY the retrieved context.

2. Do NOT use outside knowledge.

3. Do NOT invent information.

4. Answer using available context.
   If context is incomplete,
   explicitly mention limitations.

5. If answer is not present at all,
   say:
   "I don't know based on the provided context."

6. Keep answers:
   - concise
   - technically accurate
   - well-structured

7. Use bullet points when appropriate.

8. Prefer information from higher relevance scores.

9. At the end of the answer,
   cite the source IDs used.

==================================================
RETRIEVED CONTEXT START
==================================================

{context_text}

==================================================
RETRIEVED CONTEXT END
==================================================


==================================================
USER QUESTION
==================================================

{query}


==================================================
ANSWER FORMAT
==================================================

Answer:
<your answer>

Sources Used:
- SOURCE_X
- SOURCE_Y
"""

    return prompt

# def build_prompt(query, retrieved_chunks):

#     # --------------------------------------------------------
#     # BUILD CONTEXT SECTION
#     # --------------------------------------------------------

#     context_parts = []

#     for i, (score, item) in enumerate(retrieved_chunks, start=1):

#         context_parts.append(
#             f"""
# SOURCE {i}
# Relevance Score: {score:.4f}

# {item["text"]}
# """
#         )

#     context_text = "\n".join(context_parts)

#     # --------------------------------------------------------
#     # BUILD FINAL PROMPT
#     # --------------------------------------------------------

#     prompt = f"""
# You are an expert Kubernetes troubleshooting assistant.

# Your job is to answer the user's question ONLY using
# the retrieved context provided below.

# IMPORTANT RULES:
# ----------------
# 1. Use ONLY the retrieved context.
# 2. Do NOT use outside knowledge.
# 3. Do NOT invent information.
# 4. Answer using available context.
#    If the context is incomplete, explicitly mention that the available
#    information is limited.
# 5. If the answer is not present in the context at all,
#    say:
#    "I don't know based on the provided context."

# 6. Keep answers:
#    - concise
#    - technically accurate
#    - well-structured

# 7. When appropriate:
#    - use bullet points
#    - explain causes clearly
#    - provide troubleshooting guidance

# 8. If multiple possible causes exist,
#    list them separately.

# 9. Prefer information from higher relevance scores.

# ==================================================
# RETRIEVED CONTEXT START
# ==================================================

# {context_text}

# ==================================================
# RETRIEVED CONTEXT END
# ==================================================


# ==================================================
# USER QUESTION
# ==================================================

# {query}


# ==================================================
# ANSWER
# ==================================================
# """

#     return prompt



# ============================================================
# STEP 20 — GENERATE ANSWER USING GEMINI
# ============================================================

def generate_answer(prompt):

    # model = genai.GenerativeModel(
    #     "gemini-3-flash-preview"
    # )

    # response = model.generate_content(prompt)
    response = client.models.generate_content(
        model="models/gemini-3-flash-preview",
        #model="models/gemini-2.5-flash",
        #model="models/gemini-2.5-flash-lite",
        #model="models/gemini-3.1-pro-preview",
        #model="models/gemini-2.0-flash-lite",
        contents=prompt)

    return response.text


# ============================================================
# STEP 21 — MAIN RAG PIPELINE
# ============================================================

def rag_pipeline(query, top_k=3):

    # --------------------------------------------------------
    # RETRIEVE CHUNKS
    # --------------------------------------------------------

    retrieved_chunks = retrieve_router(
        query=query,
        top_k=top_k
    )

    # --------------------------------------------------------
    # BUILD PROMPT
    # --------------------------------------------------------

    prompt = build_prompt(
        query,
        retrieved_chunks
    )

    # --------------------------------------------------------
    # GENERATE ANSWER
    # --------------------------------------------------------

    answer = generate_answer(prompt)

    return answer, retrieved_chunks


# # ============================================================
# # STEP 22 — TEST RETRIEVAL ONLY
# # ============================================================

# test_queries = [

#     "Why is my pod crashing?",
#     "How to debug Kubernetes logs?",
#     "What causes OOMKilled?",
#     "How do services work in Kubernetes?",
#     "Why is my container restarting repeatedly?"
# ]

# for query in test_queries:

#     print("\n" + "=" * 80)
#     print(f"QUERY: {query}\n")

#     results = retrieve_router(query)

#     for score, item in results:

#         print(f"Score: {score:.4f}")

#         print(item["text"])

#         print("-" * 40)


# ============================================================
# STEP 23 — FINAL RAG TEST
# ============================================================

test_queries = [

    # --------------------------------------------------------
    # DIRECTLY ANSWERABLE
    # --------------------------------------------------------

    "Why is my pod crashing?",

    "How do I debug Kubernetes logs?",

    "What causes OOMKilled?",

    # --------------------------------------------------------
    # MULTI-CAUSE QUESTION
    # --------------------------------------------------------

    "Why is my container restarting repeatedly?",

    # --------------------------------------------------------
    # PARTIALLY SUPPORTED
    # --------------------------------------------------------

    "How does Kubernetes networking work?",

    # --------------------------------------------------------
    # SHOULD TRIGGER 'I DON'T KNOW'
    # --------------------------------------------------------

    "How do StatefulSets work?",

    "How does etcd replication happen?"
]

for query in test_queries:

    print("\n" + "=" * 80)
    print(f"QUERY: {query}")

    # --------------------------------------------------------
    # RUN RAG PIPELINE
    # --------------------------------------------------------

    answer, sources = rag_pipeline(query)

    # --------------------------------------------------------
    # PRINT ANSWER
    # --------------------------------------------------------

    print(answer)

    # --------------------------------------------------------
    # PRINT SOURCES
    # --------------------------------------------------------

    print("\nRETRIEVED SOURCES:\n")

    for score, item in sources:

        print(f"Score: {score:.4f}")

        print(item["text"])

        print("-" * 40)




======================================================================================
OUTPUT
======================================================================================
Retrieval mode: faiss
Total documents: 34
Total chunks created: 32
Prepared data with source attribution.
Generating embeddings...
Embeddings generated successfully.
Embedding matrix shape:
(32, 3072)
FAISS index created.
Total vectors indexed: 32

================================================================================
QUERY: Why is my pod crashing?

Answer:
Based on the provided context, your pod may be crashing due to the following reasons:
* **Memory Issues:** The container is `OOMKilled` because it exceeded its memory limit.
* **Configuration Errors:** The container may have missing environment variables or an incorrect command/entrypoint.
* **Internal Application Errors:** Errors occurring inside the application often lead to restarts.
* **Repeated Failure:** If the container crashes repeatedly after starting, it enters a `CrashLoopBackOff` state.

To investigate further, the context suggests:
* Use `kubectl logs` to retrieve logs from the container.
* Use `kubectl describe pod` to view events and state transitions.

Sources Used:
- SOURCE_1
- SOURCE_3
- SOURCE_5

RETRIEVED SOURCES:

Score: 0.7078
CrashLoopBackOff occurs when a container repeatedly crashes after starting.
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
----------------------------------------
Score: 0.6856
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
kubectl describe pod shows events and state transitions.
----------------------------------------
Score: 0.6782
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
----------------------------------------

================================================================================
QUERY: How do I debug Kubernetes logs?

Answer:
To debug Kubernetes logs, you should use the following approach based on the provided context:
* Use the command `kubectl logs` to retrieve logs from a running container.
* Use `kubectl describe pod` to view events and state transitions, which can help identify why container restarts or startup failures occur.

Sources Used:
- SOURCE_4
- SOURCE_5
- SOURCE_6

RETRIEVED SOURCES:

Score: 0.7614
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
kubectl describe pod shows events and state transitions.
----------------------------------------
Score: 0.7554
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
----------------------------------------
Score: 0.7317
kubectl logs retrieves logs from a running container.
kubectl describe pod shows events and state transitions.
Liveness probes determine if a container should be restarted.
----------------------------------------

================================================================================
QUERY: What causes OOMKilled?

Answer:
OOMKilled is caused by the following:
* A container exceeding its memory limit.

Sources Used:
- SOURCE_1
- SOURCE_2

RETRIEVED SOURCES:

Score: 0.7658
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
----------------------------------------
Score: 0.7314
CrashLoopBackOff occurs when a container repeatedly crashes after starting.
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
----------------------------------------
Score: 0.6270
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
----------------------------------------

================================================================================
QUERY: Why is my container restarting repeatedly?

Answer:
Based on the provided context, a container may restart repeatedly for the following reasons:
* **CrashLoopBackOff:** This occurs when a container crashes repeatedly after starting.
* **OOMKilled:** The container has exceeded its assigned memory limit.
* **Missing Environment Variables:** The absence of required environment variables can cause a container to crash.
* **Incorrect Configuration:** An incorrect command or entrypoint can lead to container startup failure.
* **Application Errors:** Internal errors within the application often lead to restarts.

Sources Used:
- SOURCE_1
- SOURCE_3
- SOURCE_4

RETRIEVED SOURCES:

Score: 0.7690
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
----------------------------------------
Score: 0.7391
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
----------------------------------------
Score: 0.7250
CrashLoopBackOff occurs when a container repeatedly crashes after starting.
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
----------------------------------------

================================================================================
QUERY: How does Kubernetes networking work?

Answer:
Based on the provided context, Kubernetes networking involves the following components:
* **NetworkPolicies**: These control the communication between pods.
* **ClusterIP Services**: These are used to expose applications internally within the cluster.
* **NodePort Services**: These expose applications on the IP addresses of the nodes.

**Limitations:** The provided context is limited to high-level service types and policy controls; it does not contain information regarding the underlying network model, Container Network Interface (CNI), or specific pod-to-pod routing mechanisms.

Sources Used:
- SOURCE_31
- SOURCE_12
- SOURCE_13

RETRIEVED SOURCES:

Score: 0.6912
RBAC controls access permissions inside Kubernetes.
RBAC misconfiguration can block access to resources.
NetworkPolicies control communication between pods.
----------------------------------------
Score: 0.6770
Taints prevent pods from being scheduled on certain nodes.
Tolerations allow pods to be scheduled on tainted nodes.
ClusterIP services expose applications within the cluster.
----------------------------------------
Score: 0.6732
Tolerations allow pods to be scheduled on tainted nodes.
ClusterIP services expose applications within the cluster.
NodePort services expose applications on node IPs.
----------------------------------------

================================================================================
QUERY: How do StatefulSets work?

Answer:
I don't know based on the provided context.

Sources Used:
- N/A

RETRIEVED SOURCES:

Score: 0.7297
Deployments manage replica sets and pod updates.
Rolling updates gradually replace old pods with new ones.
ReplicaSets maintain a stable number of pod replicas.
----------------------------------------
Score: 0.7097
StorageClasses define dynamic provisioning behavior.
Deployments manage replica sets and pod updates.
Rolling updates gradually replace old pods with new ones.
----------------------------------------
Score: 0.7091
Rolling updates gradually replace old pods with new ones.
ReplicaSets maintain a stable number of pod replicas.
ConfigMaps store non-sensitive configuration data.
----------------------------------------

================================================================================
QUERY: How does etcd replication happen?

Answer:
I don't know based on the provided context.

Sources Used:
- SOURCE_21
- SOURCE_22
- SOURCE_23

RETRIEVED SOURCES:

Score: 0.6344
Deployments manage replica sets and pod updates.
Rolling updates gradually replace old pods with new ones.
ReplicaSets maintain a stable number of pod replicas.
----------------------------------------
Score: 0.6197
ReplicaSets maintain a stable number of pod replicas.
ConfigMaps store non-sensitive configuration data.
Secrets store sensitive data like passwords and tokens.
----------------------------------------
Score: 0.6182
Rolling updates gradually replace old pods with new ones.
ReplicaSets maintain a stable number of pod replicas.
ConfigMaps store non-sensitive configuration data.
----------------------------------------
