Tuesday, April 28, 2026

AI Agent to extract info from a static web page

 # STEP 1.1 INSTALL THE REQUIRED PACKAGES

!pip install langchain_community langchain_google_genai
!pip install -U duckduckgo-search

# You may see an error about the version of the 'requests' package;
# Google Colab pins a lower version. You can ignore this message because
# we are not using the Colab libraries that depend on it in this notebook.






# STEP 1.2 IMPORT THE NECESSARY MODULES

# For loading the API key safely from the Colab environment
import os
from google.colab import userdata

# For Extracting the text from the URL
import requests
from bs4 import BeautifulSoup

# For creating the agent
#from langchain.agents import initialize_agent, Tool, AgentType
#from langchain.chat_models import ChatOpenAI

# STEP 1.3 LOAD THE API KEY
# The original tutorial used OpenAI, but the OpenAI API key is not free,
# so we will use Gemini instead.
# os.environ["OPENAI_API_KEY"] = userdata.get('openai')

# Retrieve the secret and set it as an environment variable
os.environ["GOOGLE_API_KEY"] = userdata.get('geminiapikey')

# Now LangChain will automatically find it
from langchain_google_genai import ChatGoogleGenerativeAI
#llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash-lite", temperature=0)
#llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0)
llm = ChatGoogleGenerativeAI(model="gemini-3-flash-preview", temperature=0)




try:
    # A simple call to verify connectivity
    response = llm.invoke("Are you online? Answer with 'Yes' and your model version.")
    print(f"Response: {response.content}")
except Exception as e:
    print(f"Error: {e}")







# NEW 2026 IMPORTS
from langchain.agents import create_agent
from langchain_core.tools import tool

# 2. DEFINE THE WEB TOOL
@tool
def website_qa_tool(url_and_question: str) -> str:
    """
    Input format: 'url | question'.
    Scrapes a website and returns the content for answering questions.
    """
    try:
        url, question = url_and_question.split("|", 1)
        headers = {'User-Agent': 'Mozilla/5.0'}
        response = requests.get(url.strip(), headers=headers, timeout=10)
        soup = BeautifulSoup(response.text, 'html.parser')

        for script in soup(["script", "style"]):
            script.extract()

        text = " ".join(soup.get_text().split())
        return text
        #return text[:25000] # Gemini 2.5 context handles this easily
    except Exception as e:
        return f"Scraping Error: {str(e)}"

# 3. INITIALIZE GEMINI (already done in STEP 1.3 above, where llm was created)


# 4. CREATE THE AGENT (Modern 2026 Method)
# This replaces create_react_agent AND AgentExecutor
agent = create_agent(
    model=llm,
    tools=[website_qa_tool],
    system_prompt="You are a research assistant. Use the website_qa_tool to find information from URLs."
)






# 5. RUN
print("\n--- Running Agent ---\n")
#query = "https://www.wikipedia.org | What is the main mission of Wikipedia?"
#query = "https://www.w3schools.com | Do they have any Angular course (NOT AngularJS) ?"
#query = "https://www.tutorialspoint.com | does it have any tutorial on DSA Learning  ? If yes please give link"
#query = "https://en.wikipedia.org/wiki/Arizona | How many private colleges and universities are there in Arizona State ?"
query = "https://en.wikipedia.org/wiki/Arizona | How many films were shot in Arizona State ?"

try:
    # In 2026, agents are run via .invoke() with a 'messages' list
    response = agent.invoke({"messages": [("user", query)]})

    print(response)
    # Get the last message from the agent's response
    final_answer = response["messages"][-1].content
    print("\n--- FINAL ANSWER ---")
    print(final_answer)
except Exception as e:
    print(f"\n❌ Agent Error: {e}")

Temperature, Thinking Budget and Thinking Level

Temperature, thinking budget, and thinking level are key parameters for configuring LLM behavior. Temperature controls output randomness (0-2.0, default 1.0). Thinking budget determines the number of tokens dedicated to reasoning, while thinking level (e.g., MINIMAL, MEDIUM, HIGH) adjusts the depth of reasoning for complex tasks.

Setting thinking budget and thinking level simultaneously returns error 400; use one or the other (a configuration sketch follows the list below).

1. Temperature (Randomness Control)
  • Definition: A sampling parameter that controls the randomness of the model's output. It affects the probability distribution over possible next words.
  • Usage Examples:
    • Low Temperature (< 0.5): Ideal for coding, data extraction, or factual tasks where accuracy is key (e.g., set to 0 for near-deterministic results).
    • High Temperature (> 0.7): Suitable for creative writing, brainstorming, or marketing copy to generate varied responses.
  • Synonyms/Related Terms: Randomness, Sampling Temperature, Creativity Setting.
  • Important Note: For Gemini 3, a default of 1.0 is recommended for best performance, and lower settings may cause instability.
2. Thinking Budget (Reasoning Amount)
  • Definition: Specific to older or specific API implementations (e.g., Gemini 2.5), this directly sets a numerical budget of tokens the model can use for intermediate reasoning before generating the final answer.
  • Usage Examples:
    • High Budget: Used for complex logic, multi-step math problems, or deep research.
    • Zero/Low Budget: Used to disable thinking to save costs and reduce latency.
  • Synonyms/Related Terms: Reasoning Tokens, Thought Limit. 
3. Thinking Level (Reasoning Depth) 
  • Definition: A higher-level configuration (introduced in Gemini 3) that controls the depth of reasoning.
  • Types/Usage Examples:
    • MINIMAL (e.g., Flash-Lite): Best for simple tasks requiring low latency.
    • MEDIUM (e.g., Flash/Pro): Balanced approach for moderate complexity.
    • HIGH (e.g., Pro): Used for complex, multi-step reasoning.
  • Synonyms/Related Terms: Reasoning Depth, Logic Level. 
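
To make the distinction concrete, here is a minimal sketch using the google-genai SDK (pip install google-genai), separate from the LangChain code above. It assumes GOOGLE_API_KEY is set as before, and the field names (thinking_budget for Gemini 2.5, thinking_level for Gemini 3) follow the descriptions in this section; treat it as an illustration, not a definitive API reference.

# Sketch: temperature plus ONE thinking parameter per request.
# Field names are assumptions based on the descriptions above.
from google import genai
from google.genai import types

client = genai.Client()  # reads GOOGLE_API_KEY from the environment

# Gemini 2.5: numeric thinking_budget (token count; -1 = dynamic)
resp_25 = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarise the rules of chess in one sentence.",
    config=types.GenerateContentConfig(
        temperature=0.2,
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)

# Gemini 3: categorical thinking_level; do NOT also set thinking_budget,
# or the API returns error 400 (see the note above).
resp_3 = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Summarise the rules of chess in one sentence.",
    config=types.GenerateContentConfig(
        temperature=1.0,  # Gemini 3 recommends the default temperature
        thinking_config=types.ThinkingConfig(thinking_level="medium"),
    ),
)
print(resp_25.text)
print(resp_3.text)
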
Summary of Differences

Feature         | Focus                     | Use Case
----------------|---------------------------|--------------------------------------
Temperature     | Creativity & Randomness   | Creative writing vs. factual answers
Thinking Budget | Resource Control (Tokens) | Managing cost and speed in reasoning
Thinking Level  | Reasoning Depth (Logic)   | Simple vs. complex problem-solving

Credits for the adaptation function below:
https://help.apiyi.com/en/gemini-api-thinking-budget-level-error-fix-en.html

Smart Parameter Adaptation Function

def get_thinking_config(model_name: str, complexity: str = "medium") -> dict:
    """
    Automatically selects the correct thinking mode parameter based on model version.

    Args:
        model_name: Gemini model name
        complexity: Thinking complexity ("minimal", "low", "medium", "high", "dynamic")

    Returns:
        Parameter dictionary suitable for extra_body
    """
    # Gemini 3.0 model list
    gemini_3_models = [
        "gemini-3.0-flash-preview",
        "gemini-3.0-pro-preview",
        "gemini-3-flash",
        "gemini-3-pro"
    ]

    # Gemini 2.5 model list
    gemini_2_5_models = [
        "gemini-2.5-flash-preview-04-17",
        "gemini-2.5-flash-lite",
        "gemini-2-flash",
        "gemini-2-flash-lite"
    ]

    # Determine model version
    if any(m in model_name for m in gemini_3_models):
        # Gemini 3.0 uses thinking_level
        level_map = {
            "minimal": "minimal",
            "low": "low",
            "medium": "medium",
            "high": "high",
            "dynamic": "high"  # Default to high
        }
        return {"thinking_level": level_map.get(complexity, "medium")}

    elif any(m in model_name for m in gemini_2_5_models):
        # Gemini 2.5 uses thinking_budget
        budget_map = {
            "minimal": 0,
            "low": 512,
            "medium": 2048,
            "high": 8192,
            "dynamic": -1
        }
        return {"thinking_budget": budget_map.get(complexity, -1)}

    else:
        # Unknown model, default to Gemini 3.0 parameters
        return {"thinking_level": "medium"}

# Usage example
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"
)

model = "gemini-3.0-flash-preview"  # Can be switched dynamically
thinking_config = get_thinking_config(model, complexity="high")

response = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Your question here"}],
    extra_body=thinking_config
)



create_react_agent/AgentExecutor vs create_agent in langchain and langgraph

The evolution of LangChain has led to significant shifts in how agents are constructed, moving from the legacy AgentExecutor to the graph-based runtime of LangGraph. 

1. Legacy: AgentExecutor & create_react_agent (LangChain Classic) 
The original LangChain framework used a pre-built loop called AgentExecutor to manage the reasoning and execution cycle of an agent. 
  • create_react_agent (Classic): A legacy factory function that creates a "ReAct" style agent. It requires a specific prompt template with "Thought/Action/Observation" markers.
  • AgentExecutor: A black-box class that wraps the agent and the tools. It handles the loop internally: calling the LLM, parsing the tool request, running the tool, and repeating.
  • Status: While still supported, it is considered legacy and less flexible for complex, production-grade applications. 
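
For comparison, the classic pattern looked roughly like the sketch below. It uses the deprecated initialize_agent helper that the commented-out imports earlier in this post refer to, reusing the llm and website_qa_tool defined above; shown for reference only, not recommended for new code.

# Legacy (LangChain classic) sketch -- deprecated, for comparison only.
from langchain.agents import initialize_agent, AgentType

legacy_agent = initialize_agent(
    tools=[website_qa_tool],  # the same tool defined earlier
    llm=llm,                  # the same Gemini chat model
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # classic ReAct prompt style
    verbose=True,             # prints Thought/Action/Observation steps
)
# Legacy agents were run on a plain string:
# legacy_agent.run("https://example.com | What is this page about?")
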
2. Current Standard: create_agent (LangChain 1.0) 
With the release of LangChain 1.0, the framework introduced a more unified and flexible approach. 
  • Functional Identity: create_agent is the new standard factory for building agents. It is designed to work seamlessly with the LangGraph runtime.
  • Key Feature (Middleware): It introduces a powerful middleware system for customization, allowing developers to inject logic before the model is called or after tool execution.
  • Simplification: It often defaults to "tool-calling" architectures, which can sometimes skip explicit "Thought" reasoning steps in logs compared to older ReAct implementations. 
3. Prebuilt Graph: create_react_agent (LangGraph) 
This is a high-level helper provided within the langgraph.prebuilt module. 
  • What it does: It automatically constructs a complete StateGraph that implements the ReAct loop (LLM -> Tools -> LLM).
  • Advantages: It provides immediate access to LangGraph's advanced features like persistence (long-term memory), human-in-the-loop (approval steps), and granular streaming of intermediate steps.
  • Migration Note: Recent LangGraph documentation suggests that langgraph.prebuilt.create_react_agent is being superseded by the create_agent function from the core langchain package for better architectural alignment. 
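
A minimal sketch of the LangGraph prebuilt helper, assuming langgraph is installed and reusing the llm and website_qa_tool from above; the MemorySaver line illustrates the persistence feature mentioned in this list.

# LangGraph prebuilt sketch: the same ReAct loop, built as a StateGraph.
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

graph_agent = create_react_agent(
    llm,                         # the Gemini model from earlier
    tools=[website_qa_tool],
    checkpointer=MemorySaver(),  # in-memory persistence across turns
)

# thread_id ties follow-up questions to the same persisted conversation
config = {"configurable": {"thread_id": "demo-1"}}
result = graph_agent.invoke(
    {"messages": [("user", "https://example.com | What is this page about?")]},
    config=config,
)
print(result["messages"][-1].content)
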
Summary Comparison Table
Feature       | AgentExecutor / Legacy       | create_agent (LangChain 1.0)    | create_react_agent (LangGraph)
--------------|------------------------------|---------------------------------|-------------------------------
Architecture  | Python loop (black box)      | Graph-based (LangGraph runtime) | Prebuilt StateGraph
Customization | Hard to modify internal loop | High (via middleware)           | High (can modify the graph)
Persistence   | Session-bound (temporary)    | First-class (durable)           | First-class (durable)
Use Case      | Simple prototypes            | Recommended for production      | Rapid prototyping on LangGraph
Status        | Legacy / Classic             | Modern standard                 | Prebuilt utility


