CrewAI Built-in Retry Mechanism

Wednesday, May 13, 2026

CrewAI's built-in retry mechanism distinguishes between logical errors (like a bad response format) and technical API failures (like a network timeout).

1. Automatic Retries for Logical Errors

If an agent receives an output from the LLM that it cannot parse (e.g., the AI tried to use a tool incorrectly or didn't follow the formatting rules), the agent will automatically retry.

Default Behavior: The agent tries to self-correct, looping until it reaches its max_iter limit.
Task Guardrails: If you validate task output with a guardrail, you can control the retry count with guardrail_max_retries (default is 3). This tells the agent, "If the validation fails, try again this many times before giving up."
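To make the guardrail idea concrete, here is a minimal sketch of a validation function in the shape CrewAI expects: it receives the task output and returns a (success, result_or_error) tuple. The JSON rule, the "summary" key, and the stand-in objects are assumptions made up for illustration; only guardrail and guardrail_max_retries come from the text above.

```python
import json
from types import SimpleNamespace
from typing import Any, Tuple


def validate_json_output(output: Any) -> Tuple[bool, Any]:
    """Accept the output only if it is a JSON object with a 'summary' key."""
    # CrewAI passes a TaskOutput object; its text is read from `.raw` here.
    text = getattr(output, "raw", str(output))
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # (False, message) marks the output as failed, which triggers a retry.
        return False, f"Output is not valid JSON: {exc.msg}"
    if not isinstance(data, dict) or "summary" not in data:
        return False, "JSON must be an object with a 'summary' key"
    return True, data


# Stand-ins for TaskOutput so the guardrail can be tried without an LLM call.
good = SimpleNamespace(raw='{"summary": "Retries are configurable."}')
bad = SimpleNamespace(raw="Retries are configurable.")

print(validate_json_output(good))
print(validate_json_output(bad))

# Wiring it into a task (needs crewai installed and an agent defined):
# task = Task(
#     description="Summarize the article as JSON.",
#     expected_output='A JSON object with a "summary" key',
#     agent=my_agent,  # hypothetical agent
#     guardrail=validate_json_output,
#     guardrail_max_retries=3,
# )
```

Keeping the guardrail a plain function means you can unit-test it on its own before spending any API calls on retries.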

2. Technical API Failures (Rate Limits & Timeouts)

For actual API connection issues (such as a 500 server error or a rate-limit-exceeded response), CrewAI relies on the underlying LLM configuration and on LiteLLM, the library it uses to call model providers.

You can customize this retry behavior when you define your LLM:

```python
from crewai import LLM

gemini_llm = LLM(
    model="gemini/gemini-1.5-flash",
    max_retries=5,          # Retries the API call if it fails technically
    timeout=120             # Seconds to wait before timing out
)
```
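If you are wondering what a setting like max_retries means in practice, the sketch below shows the general pattern: retry a failing call a bounded number of times, waiting longer between attempts. This is a conceptual illustration only, not LiteLLM's or CrewAI's actual code; TransientAPIError, call_with_retries, and the backoff numbers are all hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientAPIError(Exception):
    """Hypothetical stand-in for a timeout, 5xx, or rate-limit error."""


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `request`, retrying transient failures up to `max_retries` times."""
    for attempt in range(max_retries + 1):  # first try plus max_retries retries
        try:
            return request()
        except TransientAPIError:
            if attempt == max_retries:
                raise  # retry budget exhausted: surface the error
            # Exponential backoff with a little jitter: ~1s, ~2s, ~4s, ...
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise AssertionError("unreachable when max_retries >= 0")


# Demo: a fake request that fails twice, then succeeds.
calls = {"count": 0}


def flaky_request() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientAPIError("simulated rate limit")
    return "ok"


# The sleep is stubbed out so the demo finishes instantly.
result = call_with_retries(flaky_request, max_retries=5, sleep=lambda s: None)
print(result, calls["count"])
```

Note that only transient errors are retried here; an error such as an invalid API key would fail the same way every time, so retrying it just wastes time.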

3. Key Parameters to Control Retries

| Parameter | Where to use it | What it does |
| --- | --- | --- |
| `max_iter` | Agent | The maximum number of "thoughts" or attempts an agent can take to finish a single task. |
| `max_retries` | LLM | The number of times to retry a failed API request (network/rate limits). |
| `guardrail_max_retries` | Task | How many times the agent should try to fix an output that failed your specific validation logic. |
| `max_rpm` | Agent | Rate Limiting: Prevents failures in the first place by limiting requests per minute. |

Summary

If the API call fails: It retries based on the max_retries setting in your LLM config.
If the Agent gets confused: It retries (loops) until it hits the max_iter limit defined in the Agent config.
Pro-Tip: If you are using a free Gemini key, you will hit rate limits often. Setting max_rpm on your Agent (for example, max_rpm=15) is often the simplest way to "handle" these failures: it prevents them from happening in the first place.
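To show why a requests-per-minute cap prevents rate-limit errors rather than reacting to them, here is a rough sketch of a rolling-window limiter. It is not how CrewAI implements max_rpm internally; RpmLimiter, FakeClock, and the 60-second rolling window are assumptions chosen for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque


class RpmLimiter:
    """Allow at most `max_rpm` calls in any rolling 60-second window."""

    def __init__(
        self,
        max_rpm: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_rpm = max_rpm
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()  # times of recent calls

    def wait_for_slot(self) -> float:
        """Block until a call is allowed; return the seconds waited."""
        now = self._clock()
        # Forget calls that have left the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        waited = 0.0
        if len(self._stamps) >= self.max_rpm:
            # Window is full: wait until the oldest call expires.
            waited = 60.0 - (now - self._stamps[0])
            self._sleep(waited)
            self._stamps.popleft()
        self._stamps.append(self._clock())
        return waited


class FakeClock:
    """Simulated time so the demo does not really sleep for a minute."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


fake = FakeClock()
limiter = RpmLimiter(max_rpm=15, clock=fake.time, sleep=fake.sleep)
waits = [limiter.wait_for_slot() for _ in range(16)]
print(waits[:15], waits[15])  # the 16th call is the first one that has to wait
```

The trade-off is latency: the 16th request in a burst waits instead of failing, which is usually what you want on a free-tier key.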

RS Chandras Tech Blog | AI, ML, Agentic AI
