Note: this notebook uses the new google-genai SDK in place of the older google-generativeai library. The old calls are left in as comments where they were replaced.
# ============================================================
# STEP 1 — INSTALL REQUIRED LIBRARIES
# ============================================================
# Run these in Google Colab
!pip install -q google-genai
!pip install -q faiss-cpu
%env RETRIEVAL_MODE=faiss
# ============================================================
# STEP 2 — IMPORT LIBRARIES
# ============================================================
import os
import numpy as np
import faiss
from google import genai
from google.colab import userdata
# ============================================================
# STEP 3 — LOAD ENVIRONMENT SETTINGS
# ============================================================
# RETRIEVAL MODES:
#
# "cosine" -> brute-force cosine similarity
# "faiss" -> FAISS vector search
#
# Change this anytime later.
#
# For Render deployment:
# use environment variables.
RETRIEVAL_MODE = os.getenv(
    "RETRIEVAL_MODE",
    "cosine"
)
print(f"Retrieval mode: {RETRIEVAL_MODE}")
# ============================================================
# STEP 4 — CONFIGURE GEMINI API
# ============================================================
GEMINI_API_KEY = userdata.get("GEMINI_API_KEY")
client = genai.Client(api_key=GEMINI_API_KEY)
# ============================================================
# STEP 5 — CREATE DATASET
# ============================================================
documents = [
# --------------------------------------------------------
# POD FAILURES / DEBUGGING
# --------------------------------------------------------
"CrashLoopBackOff occurs when a container repeatedly crashes after starting.",
"OOMKilled happens when a container exceeds its memory limit.",
"A container may crash due to missing environment variables.",
"Incorrect command or entrypoint can cause container startup failure.",
"Application errors inside the container often lead to restarts.",
"kubectl logs retrieves logs from a running container.",
"kubectl describe pod shows events and state transitions.",
"Liveness probes determine if a container should be restarted.",
"Readiness probes determine if a pod can receive traffic.",
# --------------------------------------------------------
# SCHEDULING
# --------------------------------------------------------
"Pods remain pending if no node satisfies resource requests.",
"Node affinity restricts pods to specific nodes.",
"Taints prevent pods from being scheduled on certain nodes.",
"Tolerations allow pods to be scheduled on tainted nodes.",
# --------------------------------------------------------
# SERVICES
# --------------------------------------------------------
"ClusterIP services expose applications within the cluster.",
"NodePort services expose applications on node IPs.",
"LoadBalancer services expose applications externally.",
"Ingress routes HTTP and HTTPS traffic to services.",
# --------------------------------------------------------
# STORAGE
# --------------------------------------------------------
"PersistentVolumes provide storage independent of pods.",
"PersistentVolumeClaims request storage resources.",
"StorageClasses define dynamic provisioning behavior.",
# --------------------------------------------------------
# DEPLOYMENTS
# --------------------------------------------------------
"Deployments manage replica sets and pod updates.",
"Rolling updates gradually replace old pods with new ones.",
"ReplicaSets maintain a stable number of pod replicas.",
# --------------------------------------------------------
# CONFIGURATION
# --------------------------------------------------------
"ConfigMaps store non-sensitive configuration data.",
"Secrets store sensitive data like passwords and tokens.",
"Environment variables can be injected from ConfigMaps and Secrets.",
# --------------------------------------------------------
# IMAGES / REGISTRY
# --------------------------------------------------------
"ImagePullBackOff occurs when Kubernetes cannot pull the container image.",
"Incorrect image name or tag can cause image pull failures.",
"Private registries require imagePullSecrets for authentication.",
# --------------------------------------------------------
# AUTOSCALING
# --------------------------------------------------------
"Horizontal Pod Autoscaler scales based on CPU or metrics.",
# --------------------------------------------------------
# SECURITY
# --------------------------------------------------------
"RBAC controls access permissions inside Kubernetes.",
"RBAC misconfiguration can block access to resources.",
# --------------------------------------------------------
# NETWORKING
# --------------------------------------------------------
"NetworkPolicies control communication between pods.",
# --------------------------------------------------------
# CLEANUP
# --------------------------------------------------------
"Pods stuck in Terminating state may have finalizers blocking deletion."
]
print(f"Total documents: {len(documents)}")
# ============================================================
# STEP 6 — CREATE SLIDING WINDOW CHUNKS
# ============================================================
# WHY?
# ----
# Preserves neighboring semantic context.
#
# Example:
# sentence1 + sentence2 + sentence3
#
# Then:
# sentence2 + sentence3 + sentence4
WINDOW_SIZE = 3
STRIDE = 1
smart_chunks = []
for i in range(0, len(documents) - WINDOW_SIZE + 1, STRIDE):
    chunk = documents[i:i + WINDOW_SIZE]
    chunk_text = "\n".join(chunk)
    smart_chunks.append(chunk_text)
print(f"Total chunks created: {len(smart_chunks)}")
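As a quick sanity check on the window arithmetic (a standalone toy example, not part of the pipeline): with N items, a window of 3, and a stride of 1, you get N - 3 + 1 chunks.

```python
# Sliding window over 5 toy sentences: expect 5 - 3 + 1 = 3 chunks
sentences = ["s1", "s2", "s3", "s4", "s5"]
WINDOW, STRIDE = 3, 1

chunks = [
    sentences[i:i + WINDOW]
    for i in range(0, len(sentences) - WINDOW + 1, STRIDE)
]
print(chunks)
# [['s1', 's2', 's3'], ['s2', 's3', 's4'], ['s3', 's4', 's5']]
```

This matches the run log below: 34 documents with a window of 3 give 34 - 3 + 1 = 32 chunks.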
# ============================================================
# STEP 7 — PREPARE STRUCTURED CHUNK DATA
# ============================================================
prepared_data = []
for i, chunk in enumerate(smart_chunks):
    prepared_data.append({
        "id": f"chunk_{i}",
        "text": chunk
    })
print(f"Prepared chunks: {len(prepared_data)}")
# ============================================================
# STEP 8 — CREATE EMBEDDING FUNCTION
# ============================================================
def get_embedding(text):
    # Old google-generativeai call (replaced by the new SDK below):
    # response = embed_content(
    #     model="models/gemini-embedding-001",
    #     contents=text
    # )
    # return response["embedding"]
    response = client.models.embed_content(
        model="models/gemini-embedding-001",
        contents=text
    )
    # The new SDK returns a list of embeddings in 'embeddings'
    return response.embeddings[0].values
# ============================================================
# STEP 9 — GENERATE CHUNK EMBEDDINGS
# ============================================================
print("Generating embeddings...")
for item in prepared_data:
    embedding = get_embedding(item["text"])
    item["embedding"] = embedding
print("Embeddings generated successfully.")
# ============================================================
# STEP 10 — NORMALIZATION FUNCTION
# ============================================================
def normalize(vec):
    vec = np.array(vec)
    return vec / np.linalg.norm(vec)
# ============================================================
# STEP 11 — COSINE SIMILARITY FUNCTION
# ============================================================
def cosine_similarity(a, b):
    a = normalize(a)
    b = normalize(b)
    return np.dot(a, b)
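A quick sanity check of this function on vectors whose similarity is known in advance: same direction gives 1, orthogonal gives 0, opposite gives -1, and vector length is ignored.

```python
import numpy as np

def normalize(vec):
    vec = np.array(vec, dtype=float)
    return vec / np.linalg.norm(vec)

def cosine_similarity(a, b):
    return float(np.dot(normalize(a), normalize(b)))

print(cosine_similarity([1, 0], [2, 0]))   # 1.0  (same direction, length ignored)
print(cosine_similarity([1, 0], [0, 5]))   # 0.0  (orthogonal)
print(cosine_similarity([1, 0], [-3, 0]))  # -1.0 (opposite)
```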
# ============================================================
# STEP 12 — COSINE RETRIEVAL FUNCTION
# ============================================================
def retrieve_cosine(query, top_k=3, min_score=0.55):
    # --------------------------------------------------------
    # EMBED QUERY
    # --------------------------------------------------------
    query_embedding = get_embedding(query)
    scores = []
    # --------------------------------------------------------
    # CALCULATE COSINE SIMILARITY
    # --------------------------------------------------------
    for item in prepared_data:
        similarity = cosine_similarity(
            query_embedding,
            item["embedding"]
        )
        scores.append((similarity, item))
    # --------------------------------------------------------
    # SORT BY SCORE
    # --------------------------------------------------------
    scores = sorted(
        scores,
        key=lambda x: x[0],
        reverse=True
    )
    # --------------------------------------------------------
    # SIMPLE RE-RANKING
    # --------------------------------------------------------
    reranked = []
    query_words = query.lower().split()
    for sim, item in scores:
        text = item["text"].lower()
        keyword_bonus = sum(
            word in text for word in query_words
        )
        final_score = sim + (0.03 * keyword_bonus)
        reranked.append((final_score, item))
    # --------------------------------------------------------
    # SORT AGAIN AFTER RE-RANKING
    # --------------------------------------------------------
    reranked = sorted(
        reranked,
        key=lambda x: x[0],
        reverse=True
    )
    # --------------------------------------------------------
    # FILTER LOW SCORES
    # --------------------------------------------------------
    filtered = [
        x for x in reranked
        if x[0] >= min_score
    ]
    return filtered[:top_k]
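One detail worth noting about the re-ranker: the bonus uses Python's substring membership (`word in text`), not whole-word matching, so short query words can match inside longer words. A standalone toy illustration:

```python
# "word in text" is a substring test, not a whole-word test
query_words = "is pod failing".lower().split()
text = "this pod keeps failing".lower()

keyword_bonus = sum(word in text for word in query_words)
print(keyword_bonus)  # 3 -- "is" also matches inside "this"
```

With a small bonus weight (0.03 here) this is usually harmless, but a tokenized whole-word check would be stricter.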
# ============================================================
# STEP 13 — CREATE FAISS EMBEDDING MATRIX
# ============================================================
embedding_matrix = []
for item in prepared_data:
    embedding_matrix.append(item["embedding"])
embedding_matrix = np.array(
    embedding_matrix,
    dtype=np.float32
)
print("Embedding matrix shape:")
print(embedding_matrix.shape)
# ============================================================
# STEP 14 — NORMALIZE EMBEDDINGS FOR FAISS
# ============================================================
# IMPORTANT:
#
# IndexFlatIP uses INNER PRODUCT.
#
# If vectors are normalized:
#
# inner product == cosine similarity
faiss.normalize_L2(embedding_matrix)
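The claim above is easy to verify with plain numpy (faiss.normalize_L2 applies the same row-wise L2 normalization, in place):

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.normal(size=8)
b = rng.normal(size=8)

# Cosine similarity computed directly
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Inner product of the L2-normalized vectors
inner = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))

print(np.isclose(cosine, inner))  # True
```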
# ============================================================
# STEP 15 — CREATE FAISS INDEX
# ============================================================
dimension = embedding_matrix.shape[1]
index = faiss.IndexFlatIP(dimension)
print("FAISS index created.")
# ============================================================
# STEP 16 — ADD EMBEDDINGS TO FAISS INDEX
# ============================================================
index.add(embedding_matrix)
print(f"Total vectors indexed: {index.ntotal}")
# ============================================================
# STEP 17 — FAISS RETRIEVAL FUNCTION
# ============================================================
def retrieve_faiss(query, top_k=3):
    # --------------------------------------------------------
    # EMBED QUERY
    # --------------------------------------------------------
    query_embedding = get_embedding(query)
    # --------------------------------------------------------
    # CONVERT TO NUMPY
    # --------------------------------------------------------
    query_vector = np.array(
        [query_embedding],
        dtype=np.float32
    )
    # --------------------------------------------------------
    # NORMALIZE QUERY VECTOR
    # --------------------------------------------------------
    faiss.normalize_L2(query_vector)
    # --------------------------------------------------------
    # SEARCH FAISS INDEX
    # --------------------------------------------------------
    scores, indices = index.search(
        query_vector,
        top_k
    )
    # --------------------------------------------------------
    # FORMAT RESULTS
    # --------------------------------------------------------
    results = []
    for score, idx in zip(scores[0], indices[0]):
        if idx == -1:
            # FAISS pads with -1 when fewer than top_k results exist
            continue
        item = prepared_data[idx]
        results.append((score, item))
    return results
# ============================================================
# STEP 18 — RETRIEVAL ROUTER
# ============================================================
# This decides:
#
# cosine retrieval
# OR
# FAISS retrieval
def retrieve_router(query, top_k=3):
    if RETRIEVAL_MODE == "cosine":
        return retrieve_cosine(
            query=query,
            top_k=top_k
        )
    elif RETRIEVAL_MODE == "faiss":
        return retrieve_faiss(
            query=query,
            top_k=top_k
        )
    else:
        raise ValueError(
            f"Invalid retrieval mode: {RETRIEVAL_MODE}"
        )
# ============================================================
# STEP 19 — BUILD PROMPT
# ============================================================
def build_prompt(query, retrieved_chunks):
    context = "\n\n".join(
        [item["text"] for score, item in retrieved_chunks]
    )
    prompt = f"""
You are a Kubernetes expert.
Answer ONLY using the provided context.
If the answer is not present in the context,
say "I don't know".
Context:
{context}
Question:
{query}
Answer:
"""
    return prompt
# ============================================================
# STEP 20 — GENERATE ANSWER USING GEMINI
# ============================================================
def generate_answer(prompt):
    # Old google-generativeai call (replaced by the new SDK below):
    # model = genai.GenerativeModel(
    #     "gemini-3-flash-preview"
    # )
    # response = model.generate_content(prompt)
    response = client.models.generate_content(
        model="models/gemini-3-flash-preview",
        contents=prompt
    )
    return response.text
# ============================================================
# STEP 21 — MAIN RAG PIPELINE
# ============================================================
def rag_pipeline(query, top_k=3):
    # --------------------------------------------------------
    # RETRIEVE CHUNKS
    # --------------------------------------------------------
    retrieved_chunks = retrieve_router(
        query=query,
        top_k=top_k
    )
    # --------------------------------------------------------
    # BUILD PROMPT
    # --------------------------------------------------------
    prompt = build_prompt(
        query,
        retrieved_chunks
    )
    # --------------------------------------------------------
    # GENERATE ANSWER
    # --------------------------------------------------------
    answer = generate_answer(prompt)
    return answer, retrieved_chunks
# ============================================================
# STEP 22 — TEST RETRIEVAL ONLY
# ============================================================
test_queries = [
"Why is my pod crashing?",
"How to debug Kubernetes logs?",
"What causes OOMKilled?",
"How do services work in Kubernetes?",
"Why is my container restarting repeatedly?"
]
for query in test_queries:
    print("\n" + "=" * 80)
    print(f"QUERY: {query}\n")
    results = retrieve_router(query)
    for score, item in results:
        print(f"Score: {score:.4f}")
        print(item["text"])
        print("-" * 40)
# ============================================================
# STEP 23 — FINAL RAG TEST
# ============================================================
query = "Why is my container restarting repeatedly?"
answer, sources = rag_pipeline(query)
print("\n" + "=" * 80)
print("FINAL ANSWER:\n")
print(answer)
print("\n" + "=" * 80)
print("RETRIEVED SOURCES:\n")
for score, item in sources:
    print(f"Score: {score:.4f}")
    print(item["text"])
    print("-" * 40)
======================================================================================
OUTPUT
======================================================================================
Retrieval mode: faiss
Total documents: 34
Total chunks created: 32
Prepared chunks: 32
Generating embeddings...
Embeddings generated successfully.
Embedding matrix shape:
(32, 3072)
FAISS index created.
Total vectors indexed: 32
================================================================================
QUERY: Why is my pod crashing?
Score: 0.7078
CrashLoopBackOff occurs when a container repeatedly crashes after starting.
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
----------------------------------------
Score: 0.6856
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
kubectl describe pod shows events and state transitions.
----------------------------------------
Score: 0.6782
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
----------------------------------------
================================================================================
QUERY: How to debug Kubernetes logs?
Score: 0.7563
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
kubectl describe pod shows events and state transitions.
----------------------------------------
Score: 0.7439
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
----------------------------------------
Score: 0.7289
kubectl logs retrieves logs from a running container.
kubectl describe pod shows events and state transitions.
Liveness probes determine if a container should be restarted.
----------------------------------------
================================================================================
QUERY: What causes OOMKilled?
Score: 0.7658
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
----------------------------------------
Score: 0.7314
CrashLoopBackOff occurs when a container repeatedly crashes after starting.
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
----------------------------------------
Score: 0.6270
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
----------------------------------------
================================================================================
QUERY: How do services work in Kubernetes?
Score: 0.7389
ClusterIP services expose applications within the cluster.
NodePort services expose applications on node IPs.
LoadBalancer services expose applications externally.
----------------------------------------
Score: 0.7214
Tolerations allow pods to be scheduled on tainted nodes.
ClusterIP services expose applications within the cluster.
NodePort services expose applications on node IPs.
----------------------------------------
Score: 0.7159
NodePort services expose applications on node IPs.
LoadBalancer services expose applications externally.
Ingress routes HTTP and HTTPS traffic to services.
----------------------------------------
================================================================================
QUERY: Why is my container restarting repeatedly?
Score: 0.7690
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
----------------------------------------
Score: 0.7391
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
----------------------------------------
Score: 0.7250
CrashLoopBackOff occurs when a container repeatedly crashes after starting.
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
----------------------------------------
================================================================================
FINAL ANSWER:
A container may restart repeatedly due to the following reasons:
* Application errors inside the container.
* Missing environment variables.
* An incorrect command or entrypoint.
* Exceeding its memory limit (OOMKilled).
* CrashLoopBackOff, which occurs when a container repeatedly crashes after starting.
================================================================================
RETRIEVED SOURCES:
Score: 0.7690
A container may crash due to missing environment variables.
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
----------------------------------------
Score: 0.7391
Incorrect command or entrypoint can cause container startup failure.
Application errors inside the container often lead to restarts.
kubectl logs retrieves logs from a running container.
----------------------------------------
Score: 0.7250
CrashLoopBackOff occurs when a container repeatedly crashes after starting.
OOMKilled happens when a container exceeds its memory limit.
A container may crash due to missing environment variables.
----------------------------------------