Wednesday, May 13, 2026

CrewAI "Hello World" in a single Colab notebook, without YAML files

[Cell 001]

!pip install -U crewai


[Cell 002]
from google.colab import userdata
import os
os.environ["GOOGLE_API_KEY"] = userdata.get('GEMINI_API_KEY_006')
os.environ["MODEL"] = "gemini/gemini-3.1-flash-lite"

# The alternative below also works:
# from google.colab import userdata

# # %env sets the variable for both Python AND any !shell commands you run later
# %env GEMINI_API_KEY={userdata.get('GEMINI_API_KEY_006')}
# # %env MODEL=gemini/gemini-3.1-flash-lite  # if the model is required in env
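Whichever variant you use, it helps to confirm the secret actually landed in the environment before kicking off the crew: `userdata.get` returns None for a misspelled secret name, and the failure otherwise only surfaces later as an opaque authentication error. A minimal sketch (the `assert_env` helper is my own, not part of CrewAI or Colab):

```python
import os

# Hypothetical helper: fail fast when a secret never reached the
# environment, instead of letting the crew run fail later with an
# opaque authentication error from the Gemini API.
def assert_env(name: str) -> None:
    if not os.environ.get(name):
        raise RuntimeError(f"{name} is not set; check your Colab secrets.")

# In Colab, after Cell 002, you would call:
# assert_env("GOOGLE_API_KEY")
```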

[Cell 003]
#!/usr/bin/env python
from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task
from crewai.agents.agent_builder.base_agent import BaseAgent
# If you want to run a snippet of code before or after the crew starts,
# you can use the @before_kickoff and @after_kickoff decorators
# https://docs.crewai.com/concepts/crews#example-crew-class-with-decorators

@CrewBase
class HelloWorld():
    """HelloWorld crew"""

    agents: list[BaseAgent]
    tasks: list[Task]

    # Learn more about YAML configuration files here:
    # Agents: https://docs.crewai.com/concepts/agents#yaml-configuration-recommended
    # Tasks: https://docs.crewai.com/concepts/tasks#yaml-configuration-recommended
   
    # If you would like to add tools to your agents, you can learn more about it here:
    # https://docs.crewai.com/concepts/agents#agent-tools
    @agent
    def researcher(self) -> Agent:
        return Agent(
            #config=self.agents_config['researcher'], # type: ignore[index]
            role="{topic} Senior Data Researcher",
            goal="Uncover cutting-edge developments in {topic}",
            backstory="""You're a seasoned researcher with a knack for uncovering the latest
            developments in {topic}. Known for your ability to find the most relevant
            information and present it in a clear and concise manner.""",
            verbose=True
        )

    @agent
    def reporting_analyst(self) -> Agent:
        return Agent(
            #config=self.agents_config['reporting_analyst'], # type: ignore[index]
            role="{topic} Reporting Analyst",
            goal="Create detailed reports based on {topic} data analysis and research findings",
            backstory="""You're a meticulous analyst with a keen eye for detail. You're known for
            your ability to turn complex data into clear and concise reports, making
            it easy for others to understand and act on the information you provide.""",
            verbose=True
        )

    # To learn more about structured task outputs,
    # task dependencies, and task callbacks, check out the documentation:
    # https://docs.crewai.com/concepts/tasks#overview-of-a-task
    @task
    def research_task(self) -> Task:
        return Task(
            #config=self.tasks_config['research_task'], # type: ignore[index]
            description="""Conduct a thorough research about {topic}
            Make sure you find any interesting and relevant information given
            the current year is {current_year}.
            """,
            expected_output="""A list with 10 bullet points of the most relevant information about {topic}""",
            #agent="researcher"
            agent=self.researcher()            
        )

    @task
    def reporting_task(self) -> Task:
        return Task(
            #config=self.tasks_config['reporting_task'], # type: ignore[index]
            description="""Review the context you got and expand each topic into a full section for a report.
            Make sure the report is detailed and contains any and all relevant information.""",
            expected_output="""A fully fledged report with the main topics, each with a full section of information.
            Formatted as markdown without '```'""",
            #agent="reporting_analyst",
            agent=self.reporting_analyst(),
            output_file='report.md'
        )

    @crew
    def crew(self) -> Crew:
        """Creates the HelloWorld crew"""
        # To learn how to add knowledge sources to your crew, check out the documentation:
        # https://docs.crewai.com/concepts/knowledge#what-is-knowledge

        return Crew(
            #agents=self.agents, # Automatically created by the @agent decorator
            #tasks=self.tasks, # Automatically created by the @task decorator
            agents=[self.researcher(), self.reporting_analyst()],
            tasks=[self.research_task(), self.reporting_task()],
            process=Process.sequential,
            verbose=True
            # process=Process.hierarchical, # In case you wanna use that instead https://docs.crewai.com/how-to/Hierarchical/
        )


import sys
import warnings

from datetime import datetime

#from hello_world.crew import HelloWorld

warnings.filterwarnings("ignore", category=SyntaxWarning, module="pysbd")

# This main file is intended to be a way for you to run your
# crew locally, so refrain from adding unnecessary logic into this file.
# Replace with inputs you want to test with, it will automatically
# interpolate any tasks and agents information

def run():
    """
    Run the crew.
    """
    inputs = {
        'topic': 'AI LLMs',
        'current_year': str(datetime.now().year)
    }
    #print(inputs)
    #print(type(inputs))  
    try:
        #HelloWorld().crew().kickoff(inputs={"topic" : "AI LLMs", "current_year" : 2026})
        HelloWorld().crew().kickoff(inputs=inputs)
    except Exception as e:
        raise Exception(f"An error occurred while running the crew: {e}")


def train():
    """
    Train the crew for a given number of iterations.
    """
    inputs = {
        "topic": "AI LLMs",
        'current_year': str(datetime.now().year)
    }
    try:
        HelloWorld().crew().train(n_iterations=int(sys.argv[1]), filename=sys.argv[2], inputs=inputs)

    except Exception as e:
        raise Exception(f"An error occurred while training the crew: {e}")

def replay():
    """
    Replay the crew execution from a specific task.
    """
    try:
        HelloWorld().crew().replay(task_id=sys.argv[1])

    except Exception as e:
        raise Exception(f"An error occurred while replaying the crew: {e}")

def test():
    """
    Test the crew execution and returns the results.
    """
    inputs = {
        "topic": "AI LLMs",
        "current_year": str(datetime.now().year)
    }

    try:
        HelloWorld().crew().test(n_iterations=int(sys.argv[1]), eval_llm=sys.argv[2], inputs=inputs)

    except Exception as e:
        raise Exception(f"An error occurred while testing the crew: {e}")

def run_with_trigger():
    """
    Run the crew with trigger payload.
    """
    import json

    if len(sys.argv) < 2:
        raise Exception("No trigger payload provided. Please provide JSON payload as argument.")

    try:
        trigger_payload = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        raise Exception("Invalid JSON payload provided as argument")

    inputs = {
        "crewai_trigger_payload": trigger_payload,
        "topic": "",
        "current_year": ""
    }

    try:
        result = HelloWorld().crew().kickoff(inputs=inputs)
        return result
    except Exception as e:
        raise Exception(f"An error occurred while running the crew with trigger: {e}")

run()



====================================================================================
OUTPUT
====================================================================================
WARNING:root:File not found: /content/config/agents.yaml
WARNING:root:Agent config file not found at /content/config/agents.yaml. Proceeding with empty agent configurations.
WARNING:root:File not found: /content/config/tasks.yaml
WARNING:root:Task config file not found at /content/config/tasks.yaml. Proceeding with empty task configurations.
{'topic': 'AI LLMs', 'current_year': 2026}
<class 'dict'>
╭─────────────────────────────────────────── 🚀 Crew Execution Started ───────────────────────────────────────────╮
                                                                                                                 
  Crew Execution Started                                                                                         
  Name: HelloWorld                                                                                               
  ID: 679ff0a1-5264-4ef8-97b6-a3fd8b9935e5                                                                       
                                                                                                                 
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭──────────────────────────────────────────────── 📋 Task Started ────────────────────────────────────────────────╮
                                                                                                                 
  Task Started                                                                                                   
  Name: research_task                                                                                            
  ID: 975f337f-37e0-4b94-9db2-5e016ac68cf0                                                                       
                                                                                                                 
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭─────────────────────────────────────────────── 🤖 Agent Started ────────────────────────────────────────────────╮
                                                                                                                 
  Agent: AI LLMs Senior Data Researcher                                                                          
                                                                                                                 
  Task: Conduct a thorough research about AI LLMs                                                                
              Make sure you find any interesting and relevant information given                                  
              the current year is 2026.                                                                          
                                                                                                                 
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭───────────────────────────────────────────── ✅ Agent Final Answer ─────────────────────────────────────────────╮
                                                                                                                 
  Agent: AI LLMs Senior Data Researcher                                                                          
                                                                                                                 
  Final Answer:                                                                                                  
  As of mid-2026, the landscape of Large Language Models has shifted from "scale-at-all-costs" to "efficiency,   
  reasoning, and multimodal integration." The focus is no longer just on parameter count, but on the capability  
  of models to act as autonomous agents capable of long-horizon planning.                                        
                                                                                                                 
  Here are the 10 most relevant developments in AI LLMs for 2026:                                                
                                                                                                                 
  *   **The Rise of Inference-Time Compute:** Models are no longer limited to the "system 1" (fast, intuitive)   
  thinking of 2024. We have seen a massive industry shift toward "System 2" reasoning, where models leverage     
  significant compute during the *inference* phase to verify their own steps, perform tree-of-thought searches,  
  and self-correct before producing a final token.                                                               
  *   **Massive Context Windows are Now Standard:** We have moved beyond the "lost in the middle" phenomenon.    
  By 2026, models routinely handle 10-million+ token context windows with near-perfect retrieval accuracy,       
  allowing for the ingestion of entire corporate codebases or multi-year legal archives into a single prompt.    
  *   **Neural-Symbolic Hybridization:** To solve the "hallucination" problem that plagued LLMs in the           
  mid-2020s, current state-of-the-art models now integrate symbolic logic engines. These models act as           
  orchestrators, offloading mathematical and logical proofs to deterministic solvers while using the LLM for     
  natural language interpretation.                                                                               
  *   **Native Multimodality:** We have abandoned the "model-stitching" approach (where a vision encoder was     
  patched onto a text decoder). Today’s SOTA models are natively trained on interleaved video, audio, and text,  
  allowing for "real-time" reasoning where the AI can watch a live video feed and debug code or analyze          
  physical world processes in milliseconds.                                                                      
  *   **Agentic Orchestration Frameworks:** The focus has shifted from "chatbots" to "agents." Most enterprise   
  LLMs in 2026 exist within sophisticated frameworks that allow them to autonomously navigate the web, execute   
  shell commands, and interact with private APIs to complete complex, multi-step business goals without human    
  intervention.                                                                                                  
  *   **The Energy-Efficiency Revolution:** Following the massive power consumption spikes of 2024-2025, there   
  has been a breakthrough in "Small Language Models" (SLMs). Distillation techniques have reached a point where  
  models with performance parity to 2024’s largest models can run locally on mobile devices and edge hardware    
  with minimal thermal output.                                                                                   
  *   **Synthetic Data Pipelines:** Given that we have essentially exhausted the "high-quality" human-written    
  text on the public internet, models are now being trained predominantly on "model-generated data." The         
  industry has developed robust "automated curriculum learning" where a teacher model creates and filters        
  educational datasets for the student model to learn from.                                                      
  *   **Privacy-Preserving Federated Training:** Enterprises now utilize federated learning techniques that      
  allow LLMs to be fine-tuned on sensitive internal data without the data ever leaving the client's local        
  servers. This has unlocked the adoption of LLMs in highly regulated industries like healthcare and defense.    
  *   **Long-Term Memory Persistence:** We have transitioned from ephemeral chat sessions to "persistent         
  persona memory." LLMs now maintain a vector-based episodic memory store that evolves over time, allowing the   
  model to recall specific interactions from months prior, creating a more personalized and consistent user      
  experience.                                                                                                    
  *   **Regulatory "Watermarking" as a Standard:** Due to the 2025 Global AI Safety Accords, all major           
  commercial models now feature mandatory, cryptographically secure watermarking at the token level. This is     
  now natively supported at the hardware level in modern server GPUs to ensure provenance and combat the         
  proliferation of deepfakes and AI-generated disinformation.                                                    
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭────────────────────────────────────────────── 📋 Task Completion ───────────────────────────────────────────────╮
                                                                                                                 
  Task Completed                                                                                                 
  Name: research_task                                                                                            
  Agent: AI LLMs Senior Data Researcher                                                                          
                                                                                                                 
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭──────────────────────────────────────────────── 📋 Task Started ────────────────────────────────────────────────╮
                                                                                                                 
  Task Started                                                                                                   
  Name: reporting_task                                                                                           
  ID: 67f7a202-157f-4429-ab7c-9589866b7b82                                                                       
                                                                                                                 
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭─────────────────────────────────────────────── 🤖 Agent Started ────────────────────────────────────────────────╮
                                                                                                                 
  Agent: AI LLMs Reporting Analyst                                                                               
                                                                                                                 
  Task: Review the context you got and expand each topic into a full section for a report.                       
              Make sure the report is detailed and contains any and all relevant information.                    
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭───────────────────────────────────────────── ✅ Agent Final Answer ─────────────────────────────────────────────╮
                                                                                                                 
  Agent: AI LLMs Reporting Analyst                                                                               
                                                                                                                 
  Final Answer:                                                                                                  
  # AI LLM Landscape Report: Mid-2026 Industry Analysis                                                          
                                                                                                                 
  ## Introduction                                                                                                
  As of mid-2026, the artificial intelligence landscape has undergone a paradigm shift. The industry has moved   
  away from the "bigger is better" methodology that defined the early 2020s, favoring a strategy built on        
  operational efficiency, cognitive reasoning, and deep multimodal integration. The focus has transitioned from  
  passive conversational assistants to autonomous agents capable of complex, long-horizon planning and           
  real-world task execution. This report outlines the ten critical technological and structural advancements     
  currently shaping the AI sector.                                                                               
                                                                                                                 
  ## 1. The Rise of Inference-Time Compute                                                                       
  The limitations of "System 1" (fast, intuitive) thinking—where a model simply predicts the next token—have     
  been surpassed by "System 2" reasoning capabilities. In 2026, state-of-the-art models are designed to          
  allocate substantial compute resources during the inference phase rather than just the training phase. By      
  utilizing techniques such as tree-of-thought searches, self-verification cycles, and iterative                 
  self-correction, models now validate their logic before finalizing an output. This shift has drastically       
  reduced logic errors and increased the reliability of complex decision-making processes.                       
                                                                                                                 
  ## 2. Massive Context Windows as Standard                                                                      
  The "lost in the middle" retrieval phenomenon has been effectively mitigated. Modern LLMs now treat            
  10-million+ token context windows as the industry baseline. This capability allows users to upload entire      
  corporate codebases, years of legal archives, or comprehensive technical documentation into a single prompt.   
  Because retrieval accuracy is now near-perfect, these models serve as a "unified memory" for enterprise        
  operations, enabling deep analysis across massive datasets without the need for cumbersome external RAG        
  (Retrieval-Augmented Generation) architectures.                                                                
                                                                                                                 
  ## 3. Neural-Symbolic Hybridization                                                                            
  To address the pervasive issue of hallucinations, the industry has adopted neural-symbolic hybridization. By   
  integrating symbolic logic engines with neural networks, AI models now serve as orchestrators. When a task     
  requires rigid mathematical precision or logical verification, the LLM offloads the work to deterministic,     
  rule-based solvers. The LLM remains responsible for natural language processing and task orchestration, while  
  the symbolic engine ensures the integrity of the data, resulting in a hybrid system that is both flexible and  
  mathematically infallible.                                                                                     
                                                                                                                 
  ## 4. Native Multimodality                                                                                     
  The era of "model-stitching"—patching vision encoders onto text decoders—is over. SOTA models in 2026 are      
  natively trained on interleaved streams of video, audio, and text from the outset. This holistic architecture  
  allows for real-time, low-latency reasoning. For example, a model can process a live video feed of an          
  industrial robotic process, identify a mechanical failure, and propose a code-level fix in milliseconds. This  
  native integration has transformed AI from a text-based tool into a sensory-aware agent capable of navigating  
  the physical world.                                                                                            
                                                                                                                 
  ## 5. Agentic Orchestration Frameworks                                                                         
  Chatbots have been relegated to simple query-response tools; the current standard is the "Agentic Framework."  
  Modern LLMs are integrated into environments that allow them to autonomously navigate the web, execute shell   
  commands, manage file systems, and interface with private APIs. These frameworks permit agents to complete     
  complex, multi-step business objectives—such as drafting a contract, deploying the associated code, and        
  filing a regulatory report—with only high-level oversight from human operators.                                
                                                                                                                 
  ## 6. The Energy-Efficiency Revolution                                                                         
  Following the power infrastructure strain of 2024-2025, the industry successfully pivoted toward extreme       
  optimization. Breakthroughs in model distillation have resulted in "Small Language Models" (SLMs) that match   
  the performance of legacy giant models while operating with a fraction of the thermal and energy footprint.    
  These efficient models now run locally on mobile devices and edge hardware, enabling high-performance,         
  private, and offline AI capabilities that were previously impossible due to energy constraints.                
                                                                                                                 
  ## 7. Synthetic Data Pipelines                                                                                 
  Public internet data has been exhausted as a training resource, leading to the rise of robust synthetic data   
  pipelines. Models now learn primarily from "teacher-student" hierarchies where a highly sophisticated model    
  generates, validates, and filters educational datasets. This "automated curriculum learning" ensures that      
  training data is optimized for high-reasoning tasks, effectively creating a self-sustaining loop of            
  improvement that allows models to reach higher capabilities without relying on human-generated web content.    
                                                                                                                 
  ## 8. Privacy-Preserving Federated Training                                                                    
  In 2026, enterprise adoption in high-stakes fields like healthcare, defense, and finance has reached an        
  all-time high, driven by federated learning. This technique allows models to be fine-tuned on sensitive        
  internal data across distributed servers without the data ever being centralized or exposed. By keeping        
  training local, organizations maintain full compliance with data sovereignty laws, ensuring that the model     
  learns the organization's unique requirements without risking intellectual property or personal identifiable   
  information (PII).                                                                                             
                                                                                                                 
  ## 9. Long-Term Memory Persistence                                                                             
  The industry has abandoned ephemeral, session-based interactions. Modern LLMs incorporate vector-based         
  episodic memory stores that maintain a persistent "persona" or institutional history for the user. This        
  long-term memory allows the model to recall specific context, preferences, or tasks from interactions          
  occurring months or even years prior. This persistence facilitates a more consistent, personalized             
  experience, as the AI understands the user's long-term goals rather than treating each prompt as a blank       
  slate.                                                                                                         
                                                                                                                 
  ## 10. Regulatory "Watermarking" as a Standard                                                                 
  Following the 2025 Global AI Safety Accords, hardware-level support for provenance is now mandatory. All       
  major commercial LLMs now integrate cryptographically secure, token-level watermarking. This functionality is  
  baked into modern server-grade GPUs, ensuring that any content generated by an AI can be traced back to its    
  model and origin. This standardized system provides a critical line of defense against the proliferation of    
  deepfakes and disinformation, ensuring that human-generated and machine-generated content can be objectively   
  distinguished.                                                                                                 
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭────────────────────────────────────────────── 📋 Task Completion ───────────────────────────────────────────────╮
                                                                                                                 
  Task Completed                                                                                                 
  Name: reporting_task                                                                                           
  Agent: AI LLMs Reporting Analyst                                                                               
                                                                                                                 
                                                                                                                 
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Difference between the default project scaffolded by CrewAI and the hand-coded version

When we run the command

!uv tool run crewai run

in Google Colab, a default, working Hello World CrewAI project is created.

This project uses a proper folder structure and creates 4-5 code files.

I tried to imitate that project in a single Colab notebook, but I had to make
certain changes to get it working.

I specifically faced two errors:

AttributeError: 'str' object has no attribute 'get'

and

AttributeError: 'function' object has no attribute 'get'

 I had to change
  #agent="researcher"
to
  agent=self.researcher()            

and
            #agents=self.agents, # Automatically created by the @agent decorator
            #tasks=self.tasks, # Automatically created by the @task decorator
To
            agents=[self.researcher(), self.reporting_analyst()],
            tasks=[self.research_task(), self.reporting_task()],

I had Gemini analyse the issue and got the following answer:


The reason the demo project works without those parentheses is due to how the @CrewBase decorator and the class attributes agents_config and tasks_config work together.

In the official crewai create crew template, the code usually looks like this:

Python
@agent
def researcher(self) -> Agent:
    return Agent(
        config=self.agents_config['researcher'], # This is the magic line
        verbose=True
    )

Here is the breakdown of why the official version works differently from your manual version:

1. The @CrewBase "Auto-Loading"

When you use the @CrewBase decorator, it automatically looks for a config/ folder.

It loads agents.yaml into an attribute called self.agents_config and tasks.yaml into self.tasks_config.

In your specific error, the logs showed:

WARNING:root:Agent config file not found at /content/config/agents.yaml. Proceeding with empty agent configurations.


Because those files were missing, self.agents_config was empty.
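One possible workaround (a sketch I have not run end to end) is to recreate the config/ layout that @CrewBase looks for, so the unmodified template code finds its YAML files. The warnings above show it searches <cwd>/config/agents.yaml and <cwd>/config/tasks.yaml; the file contents below are abbreviated examples, not the full template configs:

```python
import os
import textwrap

# Recreate the config/ folder that @CrewBase searches relative to the
# working directory (in Colab the cwd is /content, matching the warning
# paths in the output log above).
os.makedirs("config", exist_ok=True)

with open("config/agents.yaml", "w") as f:
    f.write(textwrap.dedent("""\
        researcher:
          role: "{topic} Senior Data Researcher"
          goal: "Uncover cutting-edge developments in {topic}"
          backstory: "A seasoned researcher with a knack for {topic}."
    """))

with open("config/tasks.yaml", "w") as f:
    f.write(textwrap.dedent("""\
        research_task:
          description: "Conduct thorough research about {topic}."
          expected_output: "10 bullet points about {topic}."
          agent: researcher
    """))
```

With these files in place, the `config=self.agents_config['researcher']` lines from the template should find their entries instead of an empty dict.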

2. Lazy Loading vs. Manual Instantiation

In the official template, the Task definition in tasks.yaml usually looks like this:

YAML
research_task:
  description: "Conduct research on {topic}"
  agent: researcher  # This is a STRING, not a function

When the CrewAI framework reads this YAML, it sees the string "researcher".

It then looks for a method decorated with @agent that has the name researcher.

It handles the "calling" of that function for you behind the scenes.

3. Why your code threw the AttributeError

In your manual code, you were doing this:

Python
@task
def research_task(self) -> Task:
    return Task(
        description="...",
        agent=self.researcher  # Passing the METHOD reference
    )

Because you weren't using the YAML configuration to link the two, you were passing the Python function object self.researcher directly into the agent parameter of the Task class.

The Task class expected an Agent object, but it received a function. When it tried to process that "agent," it tried to look for configuration data inside it, leading to the error:

AttributeError: 'function' object has no attribute 'get'
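The error can be reproduced outside CrewAI with a toy stand-in (this ToyTask class is a simplification for illustration, not CrewAI's real Task):

```python
# Toy stand-in for the real Task class: like the framework, it tries to
# read configuration off whatever object it is handed.
class ToyTask:
    def __init__(self, agent):
        self.role = agent.get("role")  # dicts have .get; functions do not

def researcher():
    return {"role": "Senior Data Researcher"}

# Calling the method passes its return value, which has .get -- this works:
ok = ToyTask(agent=researcher())

# Passing the bare method reference fails the same way the notebook did:
try:
    ToyTask(agent=researcher)  # note: no parentheses
except AttributeError as e:
    print(e)  # 'function' object has no attribute 'get'
```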

Summary Table

Feature         | Official Demo (YAML)                    | Your Manual Code
----------------|-----------------------------------------|------------------------------------------
Agent Settings  | Defined in agents.yaml                  | Defined inside Agent(...)
Task Connection | Linked via string name in tasks.yaml    | Linked via self.agent_name() in Python
Decorator Role  | Orchestrates the YAML-to-method mapping | Mostly unused since configs were missing
Logic           | Framework "calls" the methods for you   | You must call the methods yourself

In short, the official demo uses declarative configuration (YAML strings), whereas your approach was programmatic (Python objects). When doing it programmatically, you have to execute the function to produce the object the Task needs.

