Compare commits

51 Commits

Author SHA1 Message Date
Lucas Gomide
1305de3a0c Support multi org in CLI (#2969)
* feat: support listing, switching, and viewing your current organization

* feat: store the current org after login

* feat: filter agents, tools, and their actions by organization_uuid if present

* fix linter offenses

* refactor: propagate the current org through a header instead of params

* refactor: rename the org column from Handle to ID

---------

Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>
2025-06-07 23:11:56 -07:00
Mike Plachta
5537f3d0da Increasing the default X-axis spacing for flow plotting (#2967)
* Increasing the default X-axis spacing for flow plotting

* removing unused imports
2025-06-07 23:11:56 -07:00
Greyson LaLonde
b9d7d3ce7a docs: update hallucination guardrail examples
- Add basic usage example showing guardrail uses task's expected_output as default context
- Add explicit context example for custom reference content
2025-06-07 23:11:56 -07:00
Lucas Gomide
30413d19be docs: add minimum UV version required to use the Tool repository (#2965)
* docs: add minimum UV version required to use the Tool repository

* docs: remove memory from Agent docs

The Agent does not support the `memory` attribute
2025-06-07 23:11:56 -07:00
Lucas Gomide
f0f0e3d7f8 fix: remove duplicated message about Tool result (#2964)
We are currently inserting tool results into LLM messages twice, which may unnecessarily increase processing costs, especially for longer outputs.
2025-06-07 23:11:56 -07:00
Lorenze Jay
4c21711dbf Update version to 0.126.0 and dependencies in pyproject.toml and lock files (#2961) 2025-06-07 23:11:56 -07:00
Tony Kipkemboi
d3bffbfd0c Add enterprise testing image (#2960) 2025-06-07 23:11:56 -07:00
Tony Kipkemboi
e242a69ea9 docs: Fix missing await keywords in async crew kickoff methods and add llm selection guide (#2959) 2025-06-07 23:11:56 -07:00
Mike Plachta
21046f29b6 Azure embeddings documentation for knowledge (#2957)
Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>
2025-06-07 23:11:56 -07:00
Lucas Gomide
86f9226be7 Minor adjustments on Tool publish and docs (#2958)
* fix: fix tool publisher logger when available_exports is found

* docs: update docs and templates since we support Python 3.13
2025-06-07 23:11:56 -07:00
Lucas Gomide
6d42f67dfc feat: load Tool from Agent repository by their own module (#2940)
Previously, we only supported tools from the crewai-tools open-source repository. Now, we're introducing improved support for private tool repositories.
2025-06-07 23:11:56 -07:00
Lorenze Jay
a81bd90e7c agent add knowledge sources fix and test (#2948) 2025-06-07 23:11:55 -07:00
Lucas Gomide
df701d4ab3 Persist available tools from a Tool repository (#2851)
* feat: add capability to see and expose public Tool classes

* feat: persist available Tools from repository on publish

* ci: explicitly ignore templates in ruff check

Ruff only applies --exclude to files it discovers itself, so we have to manually skip the same files excluded in `ruff.toml`

* style: fix linter issues

* refactor: rename available_tools_classes to available_exports

* feat: provide more context about exportable tools

* feat: allow installing a Tool from PyPI

* test: fix tests

* feat: add env_vars attribute to BaseTool

* remove TODO security check since we handle that on the enterprise side
2025-06-07 23:11:37 -07:00
siddharth Sambharia
d79b2dc68a feat/portkey-ai-docs-udpated (#2936) 2025-06-07 23:11:37 -07:00
Lucas Gomide
026b337574 Support Python 3.13 (#2844)
* ci: support python 3.13 on CI

* docs: update docs about supported Python versions

* build: add requires-python <3.14

* build: explicit tokenizers dependency

Added tokenizers>=0.20.3 to ensure a version compatible with Python 3.13 is used.

* build: drop fastembed since it is no longer used

* build: attempt to build PyTorch on Python 3.13

* feat: upgrade fastavro, pyarrow and lancedb

* build: ensure tiktoken greater than 0.8.0 due to Python 3.13 compatibility
2025-06-07 23:11:37 -07:00
VirenG
2bc9a54136 Update README.md (#2923)
Added the 'Multi-AI Agent' phrase to clarify the key features section (item 3) in README.md
2025-06-07 23:11:37 -07:00
Tony Kipkemboi
9267302ce6 Expand MCP Integration documentation structure (#2922) 2025-06-07 23:11:37 -07:00
Tony Kipkemboi
613b196ffb docs(mcp): Comprehensive update to MCPServerAdapter documentation (#2921)
This commit includes several enhancements to the MCP integration guide:
- Adds a section on connecting to multiple MCP servers with a runnable example.
- Ensures consistent mention and examples for Streamable HTTP transport.
- Adds a manual lifecycle example for Streamable HTTP.
- Clarifies Stdio command examples.
- Refines definitions of Stdio, SSE, and Streamable HTTP transports.
- Simplifies comments in code examples for clarity.
2025-06-07 23:11:37 -07:00
Lucas Gomide
b78671ed40 feat: log tool usage when called by LLM (#2916)
* feat: log tool usage when called by the LLM

* feat: print LLM tool usage in the console
2025-06-07 23:11:36 -07:00
Mark McDonald
4a13f2f7b1 docs: Adds Gemini example to OpenAI-compat section (#2915) 2025-06-07 23:10:37 -07:00
Tony Kipkemboi
5f1be9e1db docs: docs restructuring and community analytics implementation (#2913)
* docs: Fix major memory system documentation issues - Remove misleading deprecation warnings, fix confusing comments, clearly separate three memory approaches, provide accurate examples that match implementation

* fix: Correct broken image paths in README - Update crewai_logo.png and asset.png paths to point to docs/images/ directory instead of docs/ directly

* docs: Add system prompt transparency and customization guide - Add 'Understanding Default System Instructions' section to address black-box concerns - Document what CrewAI automatically injects into prompts - Provide code examples to inspect complete system prompts - Show 3 methods to override default instructions - Include observability integration examples with Langfuse - Add best practices for production prompt management

* docs: Fix implementation accuracy issues in memory documentation - Fix Ollama embedding URL parameter and remove unsupported Cohere input_type parameter

* docs: Reference observability docs instead of showing specific tool examples

* docs: Reorganize knowledge documentation for better developer experience - Move quickstart examples right after overview for immediate hands-on experience - Create logical learning progression: basics → configuration → advanced → troubleshooting - Add comprehensive agent vs crew knowledge guide with working examples - Consolidate debugging and troubleshooting in dedicated section - Organize best practices by topic in accordion format - Improve content flow from simple concepts to advanced features - Ensure all examples are grounded in actual codebase implementation

* docs: enhance custom LLM documentation with comprehensive examples and accurate imports

* docs: reorganize observability tools into dedicated section with comprehensive overview and improved navigation

* docs: rename how-to section to learn and add comprehensive overview page

* docs: finalize documentation reorganization and update navigation labels

* docs: enhance README with comprehensive badges, navigation links, and getting started video

* Add Common Room tracking to documentation - Script will track all documentation page views - Follows Mintlify custom JS implementation pattern - Enables comprehensive docs usage insights

* docs: move human-in-the-loop guide to enterprise section and update navigation - Move human-in-the-loop.mdx from learn to enterprise/guides - Update docs.json navigation to reflect new organization
2025-06-07 23:10:37 -07:00
Lorenze Jay
12fe6a9a44 chore: Bump version to 0.121.1 in project files and update dependencies (#2912) 2025-06-07 23:10:37 -07:00
Tony Kipkemboi
c90272d601 docs: Add transparency features for prompts and memory systems (#2902)
* docs: Fix major memory system documentation issues - Remove misleading deprecation warnings, fix confusing comments, clearly separate three memory approaches, provide accurate examples that match implementation

* fix: Correct broken image paths in README - Update crewai_logo.png and asset.png paths to point to docs/images/ directory instead of docs/ directly

* docs: Add system prompt transparency and customization guide - Add 'Understanding Default System Instructions' section to address black-box concerns - Document what CrewAI automatically injects into prompts - Provide code examples to inspect complete system prompts - Show 3 methods to override default instructions - Include observability integration examples with Langfuse - Add best practices for production prompt management

* docs: Fix implementation accuracy issues in memory documentation - Fix Ollama embedding URL parameter and remove unsupported Cohere input_type parameter

* docs: Reference observability docs instead of showing specific tool examples

* docs: Reorganize knowledge documentation for better developer experience - Move quickstart examples right after overview for immediate hands-on experience - Create logical learning progression: basics → configuration → advanced → troubleshooting - Add comprehensive agent vs crew knowledge guide with working examples - Consolidate debugging and troubleshooting in dedicated section - Organize best practices by topic in accordion format - Improve content flow from simple concepts to advanced features - Ensure all examples are grounded in actual codebase implementation

* docs: enhance custom LLM documentation with comprehensive examples and accurate imports

* docs: reorganize observability tools into dedicated section with comprehensive overview and improved navigation

* docs: rename how-to section to learn and add comprehensive overview page

* docs: finalize documentation reorganization and update navigation labels

* docs: enhance README with comprehensive badges, navigation links, and getting started video
2025-06-07 23:10:37 -07:00
João Moura
e4e9bf343a revamp 2025-06-02 10:18:50 -07:00
João Moura
efebcd9734 improving tool usage for crewai enterprise tools 2025-06-01 09:48:06 -07:00
João Moura
6ecb30ee87 improved 2025-06-01 03:08:25 -07:00
João Moura
7c12aeaa0c cleaning state agent 2025-06-01 01:37:31 -07:00
João Moura
4fcabd391f populating state 2025-06-01 01:27:26 -07:00
João Moura
7009a6b7a0 Agent State Step 1 2025-05-31 23:15:39 -07:00
João Moura
e3cd7209ad fixing console formatter 2025-05-31 21:37:36 -07:00
João Moura
635e5a21f3 updated reasoning 2025-05-29 20:50:38 -07:00
João Moura
e0cd41e9f9 updating 2025-05-28 20:49:05 -07:00
João Moura
224ba1fb69 new test for adaptive reasoning 2025-05-28 11:47:22 -07:00
Devin AI
286958be4f Move adaptive reasoning context prompt to en.json
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-28 16:57:34 +00:00
Devin AI
36fc2365d3 Move mid-execution reasoning update prompt to en.json
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-28 16:55:30 +00:00
Devin AI
138ac95b09 Add timing information to reasoning events to show duration in logs
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-27 09:41:37 +00:00
Devin AI
572d8043eb Fix ready flag detection to handle both "execute" and "continue executing" variations
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-27 09:22:19 +00:00
Devin AI
3d5668e988 Fix adaptive reasoning implementation and tests, move all prompts to en.json
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-27 09:04:16 +00:00
Devin AI
cbc81ecc06 Move reasoning prompts to en.json and update documentation
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-27 08:57:44 +00:00
Devin AI
38ed69577f Merge remote changes into local branch
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-27 08:37:08 +00:00
Devin AI
203faa6a77 Implement LLM-based adaptive reasoning with function calling
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-27 08:35:32 +00:00
João Moura
88721788e9 improving test 2025-05-27 01:27:25 -07:00
João Moura
e636f1dc17 Merge branch 'main' into devin/1748280430-reasoning-interval 2025-05-27 01:08:03 -07:00
Devin AI
5868ac71dd Replace examples with comprehensive documentation for reasoning agents
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 21:37:10 +00:00
Devin AI
cb5116a21d Fix lint error: remove unused LLM import from reasoning interval tests
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 19:15:47 +00:00
Devin AI
b74ee4e98b Fix tests to properly mock LLM calls to avoid authentication errors
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 19:07:41 +00:00
Devin AI
0383aa1f27 Update reasoning interval example with comprehensive documentation
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 19:01:31 +00:00
Devin AI
db6b831c66 Fix reasoning interval counter management
- Remove state modification from _should_trigger_reasoning method
- Ensure counter is incremented after each step in invoke loop
- Counter is properly reset when reasoning occurs
- Fixes test failures in reasoning_interval_test.py

Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 18:58:04 +00:00
Devin AI
acb1eac2ac Complete code review improvements
- Fix type errors by converting deque to list when passing to handle_mid_execution_reasoning
- Add proper type annotation for tools_used
- Fix agent type error by casting BaseAgent to Agent
- Fix _should_trigger_reasoning logic to correctly handle reasoning_interval
- All type checking passes (mypy clean)

Addresses all code quality suggestions from PR review.

Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 17:54:05 +00:00
Devin AI
acabaee480 Fix lint errors and implement code review suggestions
- Remove unused imports (json, re)
- Add validation for reasoning_interval parameter
- Use deque for tools_used to prevent memory leaks
- Add type hints to all new methods
- Refactor adaptive reasoning logic for better readability
- Centralize event handling logic
- Expand test coverage with parametrized tests

Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 17:42:00 +00:00
Devin AI
9a2ddb39ce Add reasoning_interval and adaptive_reasoning features
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-26 17:32:47 +00:00
74 changed files with 9633 additions and 425 deletions

View File

@@ -30,4 +30,7 @@ jobs:
- name: Run Ruff on Changed Files
if: ${{ steps.changed-files.outputs.files != '' }}
run: |
echo "${{ steps.changed-files.outputs.files }}" | tr " " "\n" | xargs -I{} ruff check "{}"
echo "${{ steps.changed-files.outputs.files }}" \
| tr ' ' '\n' \
| grep -v 'src/crewai/cli/templates/' \
| xargs -I{} ruff check "{}"

View File

@@ -161,7 +161,7 @@ To get started with CrewAI, follow these simple steps:
### 1. Installation
- Ensure you have Python >=3.10 <3.13 installed on your system. CrewAI uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.
+ Ensure you have Python >=3.10 <3.14 installed on your system. CrewAI uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.
First, install CrewAI:

View File

@@ -43,7 +43,6 @@ The Visual Agent Builder enables:
| **Max Iterations** _(optional)_ | `max_iter` | `int` | Maximum iterations before the agent must provide its best answer. Default is 20. |
| **Max RPM** _(optional)_ | `max_rpm` | `Optional[int]` | Maximum requests per minute to avoid rate limits. |
| **Max Execution Time** _(optional)_ | `max_execution_time` | `Optional[int]` | Maximum time (in seconds) for task execution. |
- | **Memory** _(optional)_ | `memory` | `bool` | Whether the agent should maintain memory of interactions. Default is True. |
| **Verbose** _(optional)_ | `verbose` | `bool` | Enable detailed execution logs for debugging. Default is False. |
| **Allow Delegation** _(optional)_ | `allow_delegation` | `bool` | Allow the agent to delegate tasks to other agents. Default is False. |
| **Step Callback** _(optional)_ | `step_callback` | `Optional[Any]` | Function called after each agent step, overrides crew callback. |
@@ -156,7 +155,6 @@ agent = Agent(
"you excel at finding patterns in complex datasets.",
llm="gpt-4", # Default: OPENAI_MODEL_NAME or "gpt-4"
function_calling_llm=None, # Optional: Separate LLM for tool calling
- memory=True, # Default: True
verbose=False, # Default: False
allow_delegation=False, # Default: False
max_iter=20, # Default: 20 iterations
@@ -537,7 +535,6 @@ The context window management feature works automatically in the background. You
- Adjust `max_iter` and `max_retry_limit` based on task complexity
### Memory and Context Management
- - Use `memory: true` for tasks requiring historical context
- Leverage `knowledge_sources` for domain-specific information
- Configure `embedder` when using custom embedding models
- Use custom templates (`system_template`, `prompt_template`, `response_template`) for fine-grained control over agent behavior
@@ -585,7 +582,6 @@ The context window management feature works automatically in the background. You
- Review code sandbox settings
4. **Memory Issues**: If agent responses seem inconsistent:
- - Verify memory is enabled
- Check knowledge source configuration
- Review conversation history management

View File

@@ -325,12 +325,12 @@ for result in results:
# Example of using kickoff_async
inputs = {'topic': 'AI in healthcare'}
- async_result = my_crew.kickoff_async(inputs=inputs)
+ async_result = await my_crew.kickoff_async(inputs=inputs)
print(async_result)
# Example of using kickoff_for_each_async
inputs_array = [{'topic': 'AI in healthcare'}, {'topic': 'AI in finance'}]
- async_results = my_crew.kickoff_for_each_async(inputs=inputs_array)
+ async_results = await my_crew.kickoff_for_each_async(inputs=inputs_array)
for async_result in async_results:
print(async_result)
```
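
For readers applying this fix, a minimal sketch of running the corrected calls inside an event loop; the agent, task, and crew setup here is an assumed placeholder, and any configured crew works the same way:

```python
import asyncio

from crewai import Agent, Crew, Task

# Assumed minimal setup for illustration
agent = Agent(role="Researcher", goal="Summarize a topic", backstory="A concise analyst.")
task = Task(description="Summarize {topic}.", expected_output="One paragraph.", agent=agent)
my_crew = Crew(agents=[agent], tasks=[task])

async def main():
    # kickoff_async returns a coroutine, so it must be awaited inside an async function
    result = await my_crew.kickoff_async(inputs={"topic": "AI in healthcare"})
    print(result)

asyncio.run(main())
```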

View File

@@ -602,6 +602,30 @@ agent = Agent(
)
```
#### Configuring Azure OpenAI Embeddings
When using Azure OpenAI embeddings:
1. Make sure you deploy the embedding model on the Azure platform first
2. Then use the following configuration:
```python
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
knowledge_sources=[knowledge_source],
embedder={
"provider": "azure",
"config": {
"api_key": "your-azure-api-key",
"model": "text-embedding-ada-002", # change to the model you are using and is deployed in Azure
"api_base": "https://your-azure-endpoint.openai.azure.com/",
"api_version": "2024-02-01"
}
}
)
```
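
A hedged companion sketch: CrewAI also accepts an `embedder` at the crew level, so the same Azure settings can be shared across agents. The endpoint and deployment names below are placeholders:

```python
from crewai import Crew

crew = Crew(
    agents=[agent],  # the agent configured above
    tasks=[...],     # your tasks
    embedder={
        "provider": "azure",
        "config": {
            "api_key": "your-azure-api-key",
            "model": "text-embedding-ada-002",  # your deployed embedding model
            "api_base": "https://your-azure-endpoint.openai.azure.com/",
            "api_version": "2024-02-01",
        },
    },
)
```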
## Advanced Features
### Query Rewriting

View File

@@ -6,11 +6,11 @@ icon: brain
## Overview
- Agent reasoning is a feature that allows agents to reflect on a task and create a plan before execution. This helps agents approach tasks more methodically and ensures they're ready to perform the assigned work.
+ Agent reasoning is a feature that allows agents to reflect on a task and create a plan before and during execution. This helps agents approach tasks more methodically and adapt their strategy as they progress through complex tasks.
## Usage
- To enable reasoning for an agent, simply set `reasoning=True` when creating the agent:
+ To enable reasoning for an agent, set `reasoning=True` when creating the agent:
```python
from crewai import Agent
@@ -19,13 +19,43 @@ agent = Agent(
role="Data Analyst",
goal="Analyze complex datasets and provide insights",
backstory="You are an experienced data analyst with expertise in finding patterns in complex data.",
- reasoning=True, # Enable reasoning
+ reasoning=True, # Enable basic reasoning
max_reasoning_attempts=3 # Optional: Set a maximum number of reasoning attempts
)
```
### Interval-based Reasoning
To enable periodic reasoning during task execution, set `reasoning_interval` to specify how often the agent should re-evaluate its plan:
```python
agent = Agent(
role="Research Analyst",
goal="Find comprehensive information about a topic",
backstory="You are a skilled research analyst who methodically approaches information gathering.",
reasoning=True,
reasoning_interval=3, # Re-evaluate plan every 3 steps
)
```
### Adaptive Reasoning
For more dynamic reasoning that adapts to the execution context, enable `adaptive_reasoning`:
```python
agent = Agent(
role="Strategic Advisor",
goal="Provide strategic advice based on market research",
backstory="You are an experienced strategic advisor who adapts your approach based on the information you discover.",
reasoning=True,
adaptive_reasoning=True, # Agent decides when to reason based on context
)
```
## How It Works
### Initial Reasoning
When reasoning is enabled, before executing a task, the agent will:
1. Reflect on the task and create a detailed plan
@@ -33,7 +63,17 @@ When reasoning is enabled, before executing a task, the agent will:
3. Refine the plan as necessary until it's ready or max_reasoning_attempts is reached
4. Inject the reasoning plan into the task description before execution
This process helps the agent break down complex tasks into manageable steps and identify potential challenges before starting.
### Mid-execution Reasoning
During task execution, the agent can re-evaluate and adjust its plan based on:
1. **Interval-based reasoning**: The agent reasons after a fixed number of steps (specified by `reasoning_interval`)
2. **Adaptive reasoning**: The agent uses its LLM to intelligently decide when reasoning is needed based on:
- Current execution context (task description, expected output, steps taken)
- The agent's own judgment about whether strategic reassessment would be beneficial
- Automatic fallback when recent errors or failures are detected in the execution
This mid-execution reasoning helps agents adapt to new information, overcome obstacles, and optimize their approach as they work through complex tasks.
## Configuration Options
@@ -45,35 +85,44 @@ This process helps the agent break down complex tasks into manageable steps and
Maximum number of attempts to refine the plan before proceeding with execution. If None (default), the agent will continue refining until it's ready.
</ParamField>
- ## Example
+ <ParamField body="reasoning_interval" type="int" default="None">
+ Interval of steps after which the agent should reason again during execution. If None, reasoning only happens before execution.
+ </ParamField>
- Here's a complete example:
+ <ParamField body="adaptive_reasoning" type="bool" default="False">
+ Whether the agent should adaptively decide when to reason during execution based on context.
+ </ParamField>
- ```python
- from crewai import Agent, Task, Crew
+ ## Technical Implementation
- # Create an agent with reasoning enabled
- analyst = Agent(
- role="Data Analyst",
- goal="Analyze data and provide insights",
- backstory="You are an expert data analyst.",
- reasoning=True,
- max_reasoning_attempts=3 # Optional: Set a limit on reasoning attempts
- )
+ ### Interval-based Reasoning
- # Create a task
- analysis_task = Task(
- description="Analyze the provided sales data and identify key trends.",
- expected_output="A report highlighting the top 3 sales trends.",
- agent=analyst
- )
+ The interval-based reasoning feature works by:
- # Create a crew and run the task
- crew = Crew(agents=[analyst], tasks=[analysis_task])
- result = crew.kickoff()
+ 1. Tracking the number of steps since the last reasoning event
+ 2. Triggering reasoning when `steps_since_reasoning >= reasoning_interval`
+ 3. Resetting the counter after each reasoning event
+ 4. Generating an updated plan based on current progress
- print(result)
- ```
+ This creates a predictable pattern of reflection during task execution, which is useful for complex tasks where periodic reassessment is beneficial.
+ ### Adaptive Reasoning
+ The adaptive reasoning feature uses LLM function calling to determine when reasoning should occur:
+ 1. **LLM-based decision**: The agent's LLM evaluates the current execution context (task description, expected output, steps taken so far) to decide if reasoning is needed
+ 2. **Error detection fallback**: When recent messages contain error indicators like "error", "exception", "failed", etc., reasoning is automatically triggered
+ This creates an intelligent reasoning pattern where the agent uses its own judgment to determine when strategic reassessment would be most beneficial, while maintaining automatic error recovery.
+ ### Mid-execution Reasoning Process
+ When mid-execution reasoning is triggered, the agent:
+ 1. Summarizes current progress (steps taken, tools used, recent actions)
+ 2. Evaluates the effectiveness of the current approach
+ 3. Adjusts the plan based on new information and challenges encountered
+ 4. Continues execution with the updated plan
## Error Handling
@@ -93,7 +142,7 @@ agent = Agent(
role="Data Analyst",
goal="Analyze data and provide insights",
reasoning=True,
- max_reasoning_attempts=3
+ reasoning_interval=5 # Re-evaluate plan every 5 steps
)
# Create a task
@@ -144,4 +193,33 @@ I'll analyze the sales data to identify the top 3 trends.
READY: I am ready to execute the task.
```
This reasoning plan helps the agent organize its approach to the task, consider potential challenges, and ensure it delivers the expected output.
During execution, the agent might generate an updated plan:
```
Based on progress so far (3 steps completed):
Updated Reasoning Plan:
After examining the data structure and initial exploratory analysis, I need to adjust my approach:
1. Current findings:
- The data shows seasonal patterns that need deeper investigation
- Customer segments show varying purchasing behaviors
- There are outliers in the luxury product category
2. Adjusted approach:
- Focus more on seasonal analysis with year-over-year comparisons
- Segment analysis by both demographics and purchasing frequency
- Investigate the luxury product category anomalies
3. Next steps:
- Apply time series analysis to better quantify seasonal patterns
- Create customer cohorts for more precise segmentation
- Perform statistical tests on the luxury category data
4. Expected outcome:
Still on track to deliver the top 3 sales trends, but with more precise quantification and actionable insights.
READY: I am ready to continue executing the task.
```
This mid-execution reasoning helps the agent adapt its approach based on what it has learned during the initial steps of the task.
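
To make the interval mechanics described above concrete, here is a hypothetical sketch of the counter logic; the names are illustrative, not CrewAI's internal API:

```python
def should_trigger_reasoning(steps_since_reasoning: int, reasoning_interval: int | None) -> bool:
    # If no interval is set, reasoning only happens before execution
    if reasoning_interval is None:
        return False
    return steps_since_reasoning >= reasoning_interval

steps_since_reasoning = 0
for step in range(10):  # stand-in for the agent's execution loop
    if should_trigger_reasoning(steps_since_reasoning, reasoning_interval=3):
        print(f"step {step}: re-evaluating the plan")
        steps_since_reasoning = 0  # the counter resets after each reasoning event
    steps_since_reasoning += 1
```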

View File

@@ -213,6 +213,7 @@
"group": "Learn",
"pages": [
"learn/overview",
"learn/llm-selection-guide",
"learn/conditional-tasks",
"learn/coding-agents",
"learn/create-custom-tools",

View File

@@ -25,8 +25,13 @@ AI hallucinations occur when language models generate content that appears plaus
from crewai.tasks.hallucination_guardrail import HallucinationGuardrail
from crewai import LLM
- # Initialize the guardrail with reference context
+ # Basic usage - will use task's expected_output as context
guardrail = HallucinationGuardrail(
llm=LLM(model="gpt-4o-mini")
)
# With explicit reference context
context_guardrail = HallucinationGuardrail(
context="AI helps with various tasks including analysis and generation.",
llm=LLM(model="gpt-4o-mini")
)
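
A hedged sketch of attaching either guardrail to a task, assuming the `guardrail` parameter on `Task` as used elsewhere in the CrewAI docs:

```python
from crewai import Agent, Task

summarizer = Agent(
    role="Summarizer",
    goal="Summarize research notes accurately",
    backstory="A careful analyst who never invents facts.",
)

summary_task = Task(
    description="Summarize the provided research notes.",
    expected_output="A factual three-paragraph summary.",
    agent=summarizer,
    guardrail=guardrail,  # basic guardrail from above; falls back to expected_output as context
)
```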

View File

@@ -21,6 +21,7 @@ Before using the Tool Repository, ensure you have:
- A [CrewAI Enterprise](https://app.crewai.com) account
- [CrewAI CLI](https://docs.crewai.com/concepts/cli#cli) installed
- uv>=0.5.0 installed. Check out [how to upgrade](https://docs.astral.sh/uv/getting-started/installation/#upgrading-uv)
- [Git](https://git-scm.com) installed and configured
- Access permissions to publish or install tools in your CrewAI Enterprise organization

Binary file not shown (new image; 288 KiB).

View File

@@ -108,6 +108,7 @@ crew_2 = Crew(agents=[coding_agent], tasks=[task_2])
# Async function to kickoff multiple crews asynchronously and wait for all to finish
async def async_multiple_crews():
# Create coroutines for concurrent execution
result_1 = crew_1.kickoff_async(inputs={"ages": [25, 30, 35, 40, 45]})
result_2 = crew_2.kickoff_async(inputs={"ages": [20, 22, 24, 28, 30]})
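
The hunk above ends mid-function. A hedged sketch of how such coroutines are typically awaited together follows; the `asyncio.gather` continuation is an assumption, not part of the diff, and `crew_1`/`crew_2` come from the surrounding file:

```python
import asyncio

async def async_multiple_crews():
    # Create coroutines for concurrent execution (as in the diff above)
    result_1 = crew_1.kickoff_async(inputs={"ages": [25, 30, 35, 40, 45]})
    result_2 = crew_2.kickoff_async(inputs={"ages": [20, 22, 24, 28, 30]})
    # Wait for both crews to finish
    return await asyncio.gather(result_1, result_2)

results = asyncio.run(async_multiple_crews())
```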

View File

@@ -0,0 +1,729 @@
---
title: 'Strategic LLM Selection Guide'
description: 'Strategic framework for choosing the right LLM for your CrewAI AI agents and writing effective task and agent definitions'
icon: 'brain-circuit'
---
## The CrewAI Approach to LLM Selection
Rather than prescriptive model recommendations, we advocate for a **thinking framework** that helps you make informed decisions based on your specific use case, constraints, and requirements. The LLM landscape evolves rapidly, with new models emerging regularly and existing ones being updated frequently. What matters most is developing a systematic approach to evaluation that remains relevant regardless of which specific models are available.
<Note>
This guide focuses on strategic thinking rather than specific model recommendations, as the LLM landscape evolves rapidly.
</Note>
## Quick Decision Framework
<Steps>
<Step title="Analyze Your Tasks">
Begin by deeply understanding what your tasks actually require. Consider the cognitive complexity involved, the depth of reasoning needed, the format of expected outputs, and the amount of context the model will need to process. This foundational analysis will guide every subsequent decision.
</Step>
<Step title="Map Model Capabilities">
Once you understand your requirements, map them to model strengths. Different model families excel at different types of work; some are optimized for reasoning and analysis, others for creativity and content generation, and others for speed and efficiency.
</Step>
<Step title="Consider Constraints">
Factor in your real-world operational constraints including budget limitations, latency requirements, data privacy needs, and infrastructure capabilities. The theoretically best model may not be the practically best choice for your situation.
</Step>
<Step title="Test and Iterate">
Start with reliable, well-understood models and optimize based on actual performance in your specific use case. Real-world results often differ from theoretical benchmarks, so empirical testing is crucial.
</Step>
</Steps>
## Core Selection Framework
### a. Task-First Thinking
The most critical step in LLM selection is understanding what your task actually demands. Too often, teams select models based on general reputation or benchmark scores without carefully analyzing their specific requirements. This approach leads to either over-engineering simple tasks with expensive, complex models, or under-powering sophisticated work with models that lack the necessary capabilities.
<Tabs>
<Tab title="Reasoning Complexity">
- **Simple Tasks** represent the majority of everyday AI work and include basic instruction following, straightforward data processing, and simple formatting operations. These tasks typically have clear inputs and outputs with minimal ambiguity. The cognitive load is low, and the model primarily needs to follow explicit instructions rather than engage in complex reasoning.
- **Complex Tasks** require multi-step reasoning, strategic thinking, and the ability to handle ambiguous or incomplete information. These might involve analyzing multiple data sources, developing comprehensive strategies, or solving problems that require breaking down into smaller components. The model needs to maintain context across multiple reasoning steps and often must make inferences that aren't explicitly stated.
- **Creative Tasks** demand a different type of cognitive capability focused on generating novel, engaging, and contextually appropriate content. This includes storytelling, marketing copy creation, and creative problem-solving. The model needs to understand nuance, tone, and audience while producing content that feels authentic and engaging rather than formulaic.
</Tab>
<Tab title="Output Requirements">
- **Structured Data** tasks require precision and consistency in format adherence. When working with JSON, XML, or database formats, the model must reliably produce syntactically correct output that can be programmatically processed. These tasks often have strict validation requirements and little tolerance for format errors, making reliability more important than creativity.
- **Creative Content** outputs demand a balance of technical competence and creative flair. The model needs to understand audience, tone, and brand voice while producing content that engages readers and achieves specific communication goals. Quality here is often subjective and requires models that can adapt their writing style to different contexts and purposes.
- **Technical Content** sits between structured data and creative content, requiring both precision and clarity. Documentation, code generation, and technical analysis need to be accurate and comprehensive while remaining accessible to the intended audience. The model must understand complex technical concepts and communicate them effectively.
</Tab>
<Tab title="Context Needs">
- **Short Context** scenarios involve focused, immediate tasks where the model needs to process limited information quickly. These are often transactional interactions where speed and efficiency matter more than deep understanding. The model doesn't need to maintain extensive conversation history or process large documents.
- **Long Context** requirements emerge when working with substantial documents, extended conversations, or complex multi-part tasks. The model needs to maintain coherence across thousands of tokens while referencing earlier information accurately. This capability becomes crucial for document analysis, comprehensive research, and sophisticated dialogue systems.
- **Very Long Context** scenarios push the boundaries of what's currently possible, involving massive document processing, extensive research synthesis, or complex multi-session interactions. These use cases require models specifically designed for extended context handling and often involve trade-offs between context length and processing speed.
</Tab>
</Tabs>
### b. Model Capability Mapping
Understanding model capabilities requires looking beyond marketing claims and benchmark scores to understand the fundamental strengths and limitations of different model architectures and training approaches.
<AccordionGroup>
<Accordion title="Reasoning Models" icon="brain">
Reasoning models represent a specialized category designed specifically for complex, multi-step thinking tasks. These models excel when problems require careful analysis, strategic planning, or systematic problem decomposition. They typically employ techniques like chain-of-thought reasoning or tree-of-thought processing to work through complex problems step by step.
The strength of reasoning models lies in their ability to maintain logical consistency across extended reasoning chains and to break down complex problems into manageable components. They're particularly valuable for strategic planning, complex analysis, and situations where the quality of reasoning matters more than speed of response.
However, reasoning models often come with trade-offs in terms of speed and cost. They may also be less suitable for creative tasks or simple operations where their sophisticated reasoning capabilities aren't needed. Consider these models when your tasks involve genuine complexity that benefits from systematic, step-by-step analysis.
</Accordion>
<Accordion title="General Purpose Models" icon="microchip">
General purpose models offer the most balanced approach to LLM selection, providing solid performance across a wide range of tasks without extreme specialization in any particular area. These models are trained on diverse datasets and optimized for versatility rather than peak performance in specific domains.
The primary advantage of general purpose models is their reliability and predictability across different types of work. They handle most standard business tasks competently, from research and analysis to content creation and data processing. This makes them excellent choices for teams that need consistent performance across varied workflows.
While general purpose models may not achieve the peak performance of specialized alternatives in specific domains, they offer operational simplicity and reduced complexity in model management. They're often the best starting point for new projects, allowing teams to understand their specific needs before potentially optimizing with more specialized models.
</Accordion>
<Accordion title="Fast & Efficient Models" icon="bolt">
Fast and efficient models prioritize speed, cost-effectiveness, and resource efficiency over sophisticated reasoning capabilities. These models are optimized for high-throughput scenarios where quick responses and low operational costs are more important than nuanced understanding or complex reasoning.
These models excel in scenarios involving routine operations, simple data processing, function calling, and high-volume tasks where the cognitive requirements are relatively straightforward. They're particularly valuable for applications that need to process many requests quickly or operate within tight budget constraints.
The key consideration with efficient models is ensuring that their capabilities align with your task requirements. While they can handle many routine operations effectively, they may struggle with tasks requiring nuanced understanding, complex reasoning, or sophisticated content generation. They're best used for well-defined, routine operations where speed and cost matter more than sophistication.
</Accordion>
<Accordion title="Creative Models" icon="pen">
Creative models are specifically optimized for content generation, writing quality, and creative thinking tasks. These models typically excel at understanding nuance, tone, and style while producing engaging, contextually appropriate content that feels natural and authentic.
The strength of creative models lies in their ability to adapt writing style to different audiences, maintain consistent voice and tone, and generate content that engages readers effectively. They often perform better on tasks involving storytelling, marketing copy, brand communications, and other content where creativity and engagement are primary goals.
When selecting creative models, consider not just their ability to generate text, but their understanding of audience, context, and purpose. The best creative models can adapt their output to match specific brand voices, target different audience segments, and maintain consistency across extended content pieces.
</Accordion>
<Accordion title="Open Source Models" icon="code">
Open source models offer unique advantages in terms of cost control, customization potential, data privacy, and deployment flexibility. These models can be run locally or on private infrastructure, providing complete control over data handling and model behavior.
The primary benefits of open source models include elimination of per-token costs, ability to fine-tune for specific use cases, complete data privacy, and independence from external API providers. They're particularly valuable for organizations with strict data privacy requirements, budget constraints, or specific customization needs.
However, open source models require more technical expertise to deploy and maintain effectively. Teams need to consider infrastructure costs, model management complexity, and the ongoing effort required to keep models updated and optimized. The total cost of ownership may be higher than cloud-based alternatives when factoring in technical overhead.
</Accordion>
</AccordionGroup>
## Strategic Configuration Patterns
### a. Multi-Model Approach
<Tip>
Use different models for different purposes within the same crew to optimize both performance and cost.
</Tip>
The most sophisticated CrewAI implementations often employ multiple models strategically, assigning different models to different agents based on their specific roles and requirements. This approach allows teams to optimize for both performance and cost by using the most appropriate model for each type of work.
Planning agents benefit from reasoning models that can handle complex strategic thinking and multi-step analysis. These agents often serve as the "brain" of the operation, developing strategies and coordinating other agents' work. Content agents, on the other hand, perform best with creative models that excel at writing quality and audience engagement. Processing agents handling routine operations can use efficient models that prioritize speed and cost-effectiveness.
**Example: Research and Analysis Crew**
```python
from crewai import Agent, Task, Crew, LLM
# High-capability reasoning model for strategic planning
manager_llm = LLM(model="gemini-2.5-flash-preview-05-20", temperature=0.1)
# Creative model for content generation
content_llm = LLM(model="claude-3-5-sonnet-20241022", temperature=0.7)
# Efficient model for data processing
processing_llm = LLM(model="gpt-4o-mini", temperature=0)
research_manager = Agent(
role="Research Strategy Manager",
goal="Develop comprehensive research strategies and coordinate team efforts",
backstory="Expert research strategist with deep analytical capabilities",
llm=manager_llm, # High-capability model for complex reasoning
verbose=True
)
content_writer = Agent(
role="Research Content Writer",
goal="Transform research findings into compelling, well-structured reports",
backstory="Skilled writer who excels at making complex topics accessible",
llm=content_llm, # Creative model for engaging content
verbose=True
)
data_processor = Agent(
role="Data Analysis Specialist",
goal="Extract and organize key data points from research sources",
backstory="Detail-oriented analyst focused on accuracy and efficiency",
llm=processing_llm, # Fast, cost-effective model for routine tasks
verbose=True
)
crew = Crew(
agents=[research_manager, content_writer, data_processor],
tasks=[...], # Your specific tasks
manager_llm=manager_llm, # Manager uses the reasoning model
verbose=True
)
```
The key to successful multi-model implementation is understanding how different agents interact and ensuring that model capabilities align with agent responsibilities. This requires careful planning but can result in significant improvements in both output quality and operational efficiency.
### b. Component-Specific Selection
<Tabs>
<Tab title="Manager LLM">
The manager LLM plays a crucial role in hierarchical CrewAI processes, serving as the coordination point for multiple agents and tasks. This model needs to excel at delegation, task prioritization, and maintaining context across multiple concurrent operations.
Effective manager LLMs require strong reasoning capabilities to make good delegation decisions, consistent performance to ensure predictable coordination, and excellent context management to track the state of multiple agents simultaneously. The model needs to understand the capabilities and limitations of different agents while optimizing task allocation for efficiency and quality.
Cost considerations are particularly important for manager LLMs since they're involved in every operation. The model needs to provide sufficient capability for effective coordination while remaining cost-effective for frequent use. This often means finding models that offer good reasoning capabilities without the premium pricing of the most sophisticated options.
</Tab>
<Tab title="Function Calling LLM">
Function calling LLMs handle tool usage across all agents, making them critical for crews that rely heavily on external tools and APIs. These models need to excel at understanding tool capabilities, extracting parameters accurately, and handling tool responses effectively.
The most important characteristics for function calling LLMs are precision and reliability rather than creativity or sophisticated reasoning. The model needs to consistently extract the correct parameters from natural language requests and handle tool responses appropriately. Speed is also important since tool usage often involves multiple round trips that can impact overall performance.
Many teams find that specialized function calling models or general purpose models with strong tool support work better than creative or reasoning-focused models for this role. The key is ensuring that the model can reliably bridge the gap between natural language instructions and structured tool calls.
</Tab>
<Tab title="Agent-Specific Overrides">
Individual agents can override crew-level LLM settings when their specific needs differ significantly from the general crew requirements. This capability allows for fine-tuned optimization while maintaining operational simplicity for most agents.
Consider agent-specific overrides when an agent's role requires capabilities that differ substantially from other crew members. For example, a creative writing agent might benefit from a model optimized for content generation, while a data analysis agent might perform better with a reasoning-focused model.
The challenge with agent-specific overrides is balancing optimization with operational complexity. Each additional model adds complexity to deployment, monitoring, and cost management. Teams should focus overrides on agents where the performance improvement justifies the additional complexity.
</Tab>
</Tabs>
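Tying the three tabs together, a sketch of how these component-level choices map onto CrewAI parameters (`manager_llm` and `function_calling_llm` on `Crew`, plus a per-agent `llm` override); the model names are illustrative:
```python
from crewai import Agent, Crew, LLM, Process

coordinator_llm = LLM(model="gpt-4o", temperature=0.1)  # reasoning-focused manager
tool_llm = LLM(model="gpt-4o-mini", temperature=0)      # precise, fast tool calls

editor = Agent(
    role="Content Editor",
    goal="Polish drafts into publishable copy",
    backstory="An experienced editor with a strong sense of voice and tone.",
    llm=LLM(model="claude-3-5-sonnet-20241022", temperature=0.7),  # agent-level override
)

crew = Crew(
    agents=[editor],
    tasks=[...],  # your tasks
    process=Process.hierarchical,
    manager_llm=coordinator_llm,    # coordinates delegation across agents
    function_calling_llm=tool_llm,  # handles tool calls for all agents
)
```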
## Task Definition Framework
### a. Focus on Clarity Over Complexity
Effective task definition is often more important than model selection in determining the quality of CrewAI outputs. Well-defined tasks provide clear direction and context that enable even modest models to perform well, while poorly defined tasks can cause even sophisticated models to produce unsatisfactory results.
<AccordionGroup>
<Accordion title="Effective Task Descriptions" icon="list-check">
The best task descriptions strike a balance between providing sufficient detail and maintaining clarity. They should define the specific objective clearly enough that there's no ambiguity about what success looks like, while explaining the approach or methodology in enough detail that the agent understands how to proceed.
Effective task descriptions include relevant context and constraints that help the agent understand the broader purpose and any limitations they need to work within. They break complex work into focused steps that can be executed systematically, rather than presenting overwhelming, multi-faceted objectives that are difficult to approach systematically.
Common mistakes include being too vague about objectives, failing to provide necessary context, setting unclear success criteria, or combining multiple unrelated tasks into a single description. The goal is to provide enough information for the agent to succeed while maintaining focus on a single, clear objective.
</Accordion>
<Accordion title="Expected Output Guidelines" icon="bullseye">
Expected output guidelines serve as a contract between the task definition and the agent, clearly specifying what the deliverable should look like and how it will be evaluated. These guidelines should describe both the format and structure needed, as well as the key elements that must be included for the output to be considered complete.
The best output guidelines provide concrete examples of quality indicators and define completion criteria clearly enough that both the agent and human reviewers can assess whether the task has been completed successfully. This reduces ambiguity and helps ensure consistent results across multiple task executions.
Avoid generic output descriptions that could apply to any task, missing format specifications that leave agents guessing about structure, unclear quality standards that make evaluation difficult, or failing to provide examples or templates that help agents understand expectations.
</Accordion>
</AccordionGroup>
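As a concrete sketch of these guidelines, here is a task that pairs a focused description with explicit output criteria, reusing the `data_processor` agent from the multi-model example above:
```python
from crewai import Task

quarterly_analysis = Task(
    description=(
        "Analyze Q3 sales data for the EMEA region. Focus on month-over-month "
        "revenue changes and flag any product line that declined more than 10%."
    ),
    expected_output=(
        "A markdown report containing: (1) a month-over-month revenue table by "
        "product line, (2) a bullet list of flagged declines with magnitudes, "
        "and (3) three recommended follow-up analyses."
    ),
    agent=data_processor,
)
```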
### b. Task Sequencing Strategy
<Tabs>
<Tab title="Sequential Dependencies">
Sequential task dependencies are essential when tasks build upon previous outputs, information flows from one task to another, or quality depends on the completion of prerequisite work. This approach ensures that each task has access to the information and context it needs to succeed.
Implementing sequential dependencies effectively requires using the context parameter to chain related tasks, building complexity gradually through task progression, and ensuring that each task produces outputs that serve as meaningful inputs for subsequent tasks. The goal is to maintain logical flow between dependent tasks while avoiding unnecessary bottlenecks.
Sequential dependencies work best when there's a clear logical progression from one task to another and when the output of one task genuinely improves the quality or feasibility of subsequent tasks. However, they can create bottlenecks if not managed carefully, so it's important to identify which dependencies are truly necessary versus those that are merely convenient.
</Tab>
<Tab title="Parallel Execution">
Parallel execution becomes valuable when tasks are independent of each other, time efficiency is important, or different expertise areas are involved that don't require coordination. This approach can significantly reduce overall execution time while allowing specialized agents to work on their areas of strength simultaneously.
Successful parallel execution requires identifying tasks that can truly run independently, grouping related but separate work streams effectively, and planning for result integration when parallel tasks need to be combined into a final deliverable. The key is ensuring that parallel tasks don't create conflicts or redundancies that reduce overall quality.
Consider parallel execution when you have multiple independent research streams, different types of analysis that don't depend on each other, or content creation tasks that can be developed simultaneously. However, be mindful of resource allocation and ensure that parallel execution doesn't overwhelm your available model capacity or budget.
</Tab>
</Tabs>
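A sketch combining both patterns, `context` for sequential dependencies and `async_execution` for parallel streams, again reusing agents defined earlier in this guide:
```python
from crewai import Task

research_task = Task(
    description="Gather recent market data on enterprise AI adoption.",
    expected_output="A structured list of findings with sources.",
    agent=research_manager,
)

competitor_task = Task(
    description="Summarize the three largest competitors' public positioning.",
    expected_output="One paragraph per competitor.",
    agent=data_processor,
    async_execution=True,  # independent stream, runs concurrently
)

report_task = Task(
    description="Write a market overview combining the research and competitor summaries.",
    expected_output="A two-page market overview.",
    agent=content_writer,
    context=[research_task, competitor_task],  # receives both outputs
)
```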
## Optimizing Agent Configuration for LLM Performance
### a. Role-Driven LLM Selection
<Warning>
Generic agent roles make it impossible to select the right LLM. Specific roles enable targeted model optimization.
</Warning>
The specificity of your agent roles directly determines which LLM capabilities matter most for optimal performance. This creates a strategic opportunity to match precise model strengths with agent responsibilities.
**Generic vs. Specific Role Impact on LLM Choice:**
When defining roles, think about the specific domain knowledge, working style, and decision-making frameworks that would be most valuable for the tasks the agent will handle. The more specific and contextual the role definition, the better the model can embody that role effectively.
```python
# ✅ Specific role - clear LLM requirements
specific_agent = Agent(
role="SaaS Revenue Operations Analyst", # Clear domain expertise needed
goal="Analyze recurring revenue metrics and identify growth opportunities",
backstory="Specialist in SaaS business models with deep understanding of ARR, churn, and expansion revenue",
llm=LLM(model="gpt-4o") # Reasoning model justified for complex analysis
)
```
**Role-to-Model Mapping Strategy:**
- **"Research Analyst"** → Reasoning model (GPT-4o, Claude Sonnet) for complex analysis
- **"Content Editor"** → Creative model (Claude, GPT-4o) for writing quality
- **"Data Processor"** → Efficient model (GPT-4o-mini, Gemini Flash) for structured tasks
- **"API Coordinator"** → Function-calling optimized model (GPT-4o, Claude) for tool usage
### b. Backstory as Model Context Amplifier
<Info>
Strategic backstories multiply your chosen LLM's effectiveness by providing domain-specific context that generic prompting cannot achieve.
</Info>
A well-crafted backstory transforms your LLM choice from generic capability to specialized expertise. This is especially crucial for cost optimization - a well-contextualized efficient model can outperform a premium model without proper context.
**Context-Driven Performance Example:**
```python
# Context amplifies model effectiveness
domain_expert = Agent(
role="B2B SaaS Marketing Strategist",
goal="Develop comprehensive go-to-market strategies for enterprise software",
backstory="""
You have 10+ years of experience scaling B2B SaaS companies from Series A to IPO.
You understand the nuances of enterprise sales cycles, the importance of product-market
fit in different verticals, and how to balance growth metrics with unit economics.
You've worked with companies like Salesforce, HubSpot, and emerging unicorns, giving
you perspective on both established and disruptive go-to-market strategies.
""",
llm=LLM(model="claude-3-5-sonnet", temperature=0.3) # Balanced creativity with domain knowledge
)
# This context enables Claude to perform like a domain expert
# Without it, even Claude would produce generic marketing advice
```
**Backstory Elements That Enhance LLM Performance:**
- **Domain Experience**: "10+ years in enterprise SaaS sales"
- **Specific Expertise**: "Specializes in technical due diligence for Series B+ rounds"
- **Working Style**: "Prefers data-driven decisions with clear documentation"
- **Quality Standards**: "Insists on citing sources and showing analytical work"
### c. Holistic Agent-LLM Optimization
The most effective agent configurations create synergy between role specificity, backstory depth, and LLM selection. Each element reinforces the others to maximize model performance.
**Optimization Framework:**
```python
# Example: Technical Documentation Agent
tech_writer = Agent(
role="API Documentation Specialist", # Specific role for clear LLM requirements
goal="Create comprehensive, developer-friendly API documentation",
backstory="""
You're a technical writer with 8+ years documenting REST APIs, GraphQL endpoints,
and SDK integration guides. You've worked with developer tools companies and
understand what developers need: clear examples, comprehensive error handling,
and practical use cases. You prioritize accuracy and usability over marketing fluff.
""",
llm=LLM(
model="claude-3-5-sonnet", # Excellent for technical writing
temperature=0.1 # Low temperature for accuracy
),
tools=[code_analyzer_tool, api_scanner_tool],
verbose=True
)
```
**Alignment Checklist:**
- ✅ **Role Specificity**: Clear domain and responsibilities
- ✅ **LLM Match**: Model strengths align with role requirements
- ✅ **Backstory Depth**: Provides domain context the LLM can leverage
- ✅ **Tool Integration**: Tools support the agent's specialized function
- ✅ **Parameter Tuning**: Temperature and settings optimize for role needs
The key is creating agents where every configuration choice reinforces your LLM selection strategy, maximizing performance while optimizing costs.
## Practical Implementation Checklist
Rather than repeating the strategic framework, here's a tactical checklist for implementing your LLM selection decisions in CrewAI:
<Steps>
<Step title="Audit Your Current Setup" icon="clipboard-check">
**What to Review:**
- Are all agents using the same LLM by default?
- Which agents handle the most complex reasoning tasks?
- Which agents primarily do data processing or formatting?
- Are any agents heavily tool-dependent?
**Action**: Document current agent roles and identify optimization opportunities.
</Step>
<Step title="Implement Crew-Level Strategy" icon="users-gear">
**Set Your Baseline:**
```python
# Start with a reliable default for the crew
default_crew_llm = LLM(model="gpt-4o-mini") # Cost-effective baseline
crew = Crew(
agents=[...],  # pass llm=default_crew_llm to each Agent unless it needs a specialized override
tasks=[...],
memory=True
)
```
**Action**: Establish your crew's default LLM before optimizing individual agents.
</Step>
<Step title="Optimize High-Impact Agents" icon="star">
**Identify and Upgrade Key Agents:**
```python
# Manager or coordination agents
manager_agent = Agent(
role="Project Manager",
llm=LLM(model="gemini-2.5-flash-preview-05-20"), # Premium for coordination
# ... rest of config
)
# Creative or customer-facing agents
content_agent = Agent(
role="Content Creator",
llm=LLM(model="claude-3-5-sonnet"), # Best for writing
# ... rest of config
)
```
**Action**: Upgrade 20% of your agents that handle 80% of the complexity.
</Step>
<Step title="Validate with Enterprise Testing" icon="test-tube">
**Once you deploy your agents to production:**
- Use [CrewAI Enterprise platform](https://app.crewai.com) to A/B test your model selections
- Run multiple iterations with real inputs to measure consistency and performance
- Compare cost vs. performance across your optimized setup
- Share results with your team for collaborative decision-making
**Action**: Replace guesswork with data-driven validation using the testing platform.
</Step>
</Steps>
### When to Use Different Model Types
<Tabs>
<Tab title="Reasoning Models">
Reasoning models become essential when tasks require genuine multi-step logical thinking, strategic planning, or high-level decision making that benefits from systematic analysis. These models excel when problems need to be broken down into components and analyzed systematically rather than handled through pattern matching or simple instruction following.
Consider reasoning models for business strategy development, complex data analysis that requires drawing insights from multiple sources, multi-step problem solving where each step depends on previous analysis, and strategic planning tasks that require considering multiple variables and their interactions.
However, reasoning models often come with higher costs and slower response times, so they're best reserved for tasks where their sophisticated capabilities provide genuine value rather than being used for simple operations that don't require complex reasoning.
</Tab>
<Tab title="Creative Models">
Creative models become valuable when content generation is the primary output and the quality, style, and engagement level of that content directly impact success. These models excel when writing quality and style matter significantly, creative ideation or brainstorming is needed, or brand voice and tone are important considerations.
Use creative models for blog post writing and article creation, marketing copy that needs to engage and persuade, creative storytelling and narrative development, and brand communications where voice and tone are crucial. These models often understand nuance and context better than general purpose alternatives.
Creative models may be less suitable for technical or analytical tasks where precision and factual accuracy are more important than engagement and style. They're best used when the creative and communicative aspects of the output are primary success factors.
</Tab>
<Tab title="Efficient Models">
Efficient models are ideal for high-frequency, routine operations where speed and cost optimization are priorities. These models work best when tasks have clear, well-defined parameters and don't require sophisticated reasoning or creative capabilities.
Consider efficient models for data processing and transformation tasks, simple formatting and organization operations, function calling and tool usage where precision matters more than sophistication, and high-volume operations where cost per operation is a significant factor.
The key with efficient models is ensuring that their capabilities align with task requirements. They can handle many routine operations effectively but may struggle with tasks requiring nuanced understanding, complex reasoning, or sophisticated content generation.
</Tab>
<Tab title="Open Source Models">
Open source models become attractive when budget constraints are significant, data privacy requirements exist, customization needs are important, or local deployment is required for operational or compliance reasons.
Consider open source models for internal company tools where data privacy is paramount, privacy-sensitive applications that can't use external APIs, cost-optimized deployments where per-token pricing is prohibitive, and situations requiring custom model modifications or fine-tuning.
However, open source models require more technical expertise to deploy and maintain effectively. Consider the total cost of ownership including infrastructure, technical overhead, and ongoing maintenance when evaluating open source options.
</Tab>
</Tabs>
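To make these categories concrete, here's a minimal sketch of a crew that pairs each model type with the kind of work it suits. The specific model names are illustrative, not prescriptive:
```python
from crewai import Agent, LLM

# Reasoning model for strategy, creative model for writing, efficient model for routine work
strategist = Agent(
    role="Market Strategist",
    goal="Develop a go-to-market strategy from the research findings",
    backstory="A seasoned strategist who reasons through trade-offs systematically.",
    llm=LLM(model="o3", temperature=0.2)  # reasoning-focused
)

writer = Agent(
    role="Content Writer",
    goal="Turn the strategy into engaging launch copy",
    backstory="A senior copywriter with a strong sense of brand voice.",
    llm=LLM(model="claude-3-5-sonnet", temperature=0.7)  # creative
)

formatter = Agent(
    role="Report Formatter",
    goal="Normalize all outputs into the report template",
    backstory="A detail-oriented editor focused on consistency.",
    llm=LLM(model="gpt-4o-mini", temperature=0)  # efficient
)
```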
## Common CrewAI Model Selection Pitfalls
<AccordionGroup>
<Accordion title="The 'One Model Fits All' Trap" icon="triangle-exclamation">
**The Problem**: Using the same LLM for all agents in a crew, regardless of their specific roles and responsibilities. This is often the default approach but rarely optimal.
**Real Example**: Using GPT-4o for both a strategic planning manager and a data extraction agent. The manager needs reasoning capabilities worth the premium cost, but the data extractor could perform just as well with GPT-4o-mini at a fraction of the price.
**CrewAI Solution**: Leverage agent-specific LLM configuration to match model capabilities with agent roles:
```python
# Strategic agent gets premium model
manager = Agent(role="Strategy Manager", llm=LLM(model="gpt-4o"))
# Processing agent gets efficient model
processor = Agent(role="Data Processor", llm=LLM(model="gpt-4o-mini"))
```
</Accordion>
<Accordion title="Ignoring Crew-Level vs Agent-Level LLM Hierarchy" icon="shuffle">
**The Problem**: Not understanding how CrewAI's LLM hierarchy works - crew LLM, manager LLM, and agent LLM settings can conflict or be poorly coordinated.
**Real Example**: Setting a crew to use Claude, but having agents configured with GPT models, creating inconsistent behavior and unnecessary model switching overhead.
**CrewAI Solution**: Plan your LLM hierarchy strategically:
```python
# Agents inherit the crew's LLM unless specifically overridden
agent1 = Agent(llm=LLM(model="claude-3-5-sonnet"))  # Override for specific needs

crew = Crew(
    agents=[agent1, agent2],
    tasks=[task1, task2],
    manager_llm=LLM(model="gpt-4o"),  # For crew coordination
    process=Process.hierarchical  # Required when using manager_llm
)
```
</Accordion>
<Accordion title="Function Calling Model Mismatch" icon="screwdriver-wrench">
**The Problem**: Choosing models based on general capabilities while ignoring function calling performance for tool-heavy CrewAI workflows.
**Real Example**: Selecting a creative-focused model for an agent that primarily needs to call APIs, search tools, or process structured data. The agent struggles with tool parameter extraction and reliable function calls.
**CrewAI Solution**: Prioritize function calling capabilities for tool-heavy agents:
```python
# For agents that use many tools
tool_agent = Agent(
role="API Integration Specialist",
tools=[search_tool, api_tool, data_tool],
llm=LLM(model="gpt-4o"), # Excellent function calling
# OR
llm=LLM(model="claude-3-5-sonnet") # Also strong with tools
)
```
</Accordion>
<Accordion title="Premature Optimization Without Testing" icon="gear">
**The Problem**: Making complex model selection decisions based on theoretical performance without validating with actual CrewAI workflows and tasks.
**Real Example**: Implementing elaborate model switching logic based on task types without testing if the performance gains justify the operational complexity.
**CrewAI Solution**: Start simple, then optimize based on real performance data:
```python
# Start with one cost-effective default across your agents
default_llm = LLM(model="gpt-4o-mini")
crew = Crew(agents=[...], tasks=[...])  # agents created with llm=default_llm
# Test performance, then optimize specific agents as needed
# Use Enterprise platform testing to validate improvements
```
</Accordion>
<Accordion title="Overlooking Context and Memory Limitations" icon="brain">
**The Problem**: Not considering how model context windows interact with CrewAI's memory and context sharing between agents.
**Real Example**: Using a short-context model for agents that need to maintain conversation history across multiple task iterations, or in crews with extensive agent-to-agent communication.
**CrewAI Solution**: Match context capabilities to crew communication patterns.
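For example, an agent that accumulates context across many task iterations is a poor fit for a short-context model. A minimal sketch, with the model choice as an assumption to verify against your provider's context-window documentation:
```python
# Long-context model for an agent that builds on extensive shared context
researcher = Agent(
    role="Long-Horizon Researcher",
    goal="Synthesize findings across many iterations",
    backstory="You maintain and build on extensive prior context.",
    llm=LLM(model="gemini-2.5-pro-preview-05-06")  # assumed long-context option
)
```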
</Accordion>
</AccordionGroup>
## Testing and Iteration Strategy
<Steps>
<Step title="Start Simple" icon="play">
Begin with reliable, general-purpose models that are well-understood and widely supported. This provides a stable foundation for understanding your specific requirements and performance expectations before optimizing for specialized needs.
</Step>
<Step title="Measure What Matters" icon="chart-line">
Develop metrics that align with your specific use case and business requirements rather than relying solely on general benchmarks. Focus on measuring outcomes that directly impact your success rather than theoretical performance indicators.
</Step>
<Step title="Iterate Based on Results" icon="arrows-rotate">
Make model changes based on observed performance in your specific context rather than theoretical considerations or general recommendations. Real-world performance often differs significantly from benchmark results or general reputation.
</Step>
<Step title="Consider Total Cost" icon="calculator">
Evaluate the complete cost of ownership including model costs, development time, maintenance overhead, and operational complexity. The cheapest model per token may not be the most cost-effective choice when considering all factors.
</Step>
</Steps>
<Tip>
Focus on understanding your requirements first, then select models that best match those needs. The best LLM choice is the one that consistently delivers the results you need within your operational constraints.
</Tip>
### Enterprise-Grade Model Validation
For teams serious about optimizing their LLM selection, the **CrewAI Enterprise platform** provides sophisticated testing capabilities that go far beyond basic CLI testing. The platform enables comprehensive model evaluation that helps you make data-driven decisions about your LLM strategy.
<Frame>
![Enterprise Testing Interface](/images/enterprise/enterprise-testing.png)
</Frame>
**Advanced Testing Features:**
- **Multi-Model Comparison**: Test multiple LLMs simultaneously across the same tasks and inputs. Compare GPT-4o, Claude, Llama, and other leading models in parallel - including open models served through fast providers like Groq and Cerebras - to identify the best fit for your specific use case.
- **Statistical Rigor**: Configure multiple iterations with consistent inputs to measure reliability and performance variance. This helps identify models that not only perform well but do so consistently across runs.
- **Real-World Validation**: Use your actual crew inputs and scenarios rather than synthetic benchmarks. The platform allows you to test with your specific industry context, company information, and real use cases for more accurate evaluation.
- **Comprehensive Analytics**: Access detailed performance metrics, execution times, and cost analysis across all tested models. This enables data-driven decision making rather than relying on general model reputation or theoretical capabilities.
- **Team Collaboration**: Share testing results and model performance data across your team, enabling collaborative decision-making and consistent model selection strategies across projects.
Go to [app.crewai.com](https://app.crewai.com) to get started!
<Info>
The Enterprise platform transforms model selection from guesswork into a data-driven process, enabling you to validate the principles in this guide with your actual use cases and requirements.
</Info>
## Key Principles Summary
<CardGroup cols={2}>
<Card title="Task-Driven Selection" icon="bullseye">
Choose models based on what the task actually requires, not theoretical capabilities or general reputation.
</Card>
<Card title="Capability Matching" icon="puzzle-piece">
Align model strengths with agent roles and responsibilities for optimal performance.
</Card>
<Card title="Strategic Consistency" icon="link">
Maintain coherent model selection strategy across related components and workflows.
</Card>
<Card title="Practical Testing" icon="flask">
Validate choices through real-world usage rather than benchmarks alone.
</Card>
<Card title="Iterative Improvement" icon="arrow-up">
Start simple and optimize based on actual performance and needs.
</Card>
<Card title="Operational Balance" icon="scale-balanced">
Balance performance requirements with cost and complexity constraints.
</Card>
</CardGroup>
<Check>
Remember: The best LLM choice is the one that consistently delivers the results you need within your operational constraints. Focus on understanding your requirements first, then select models that best match those needs.
</Check>
## Current Model Landscape (June 2025)
<Warning>
**Snapshot in Time**: The following model rankings represent current leaderboard standings as of June 2025, compiled from [LMSys Arena](https://arena.lmsys.org/), [Artificial Analysis](https://artificialanalysis.ai/), and other leading benchmarks. LLM performance, availability, and pricing change rapidly. Always conduct your own evaluations with your specific use cases and data.
</Warning>
### Leading Models by Category
The tables below show a representative sample of current top-performing models across different categories, with guidance on their suitability for CrewAI agents:
<Note>
These tables/metrics showcase selected leading models in each category and are not exhaustive. Many excellent models exist beyond those listed here. The goal is to illustrate the types of capabilities to look for rather than provide a complete catalog.
</Note>
<Tabs>
<Tab title="Reasoning & Planning">
**Best for Manager LLMs and Complex Analysis**
| Model | Intelligence Score | Cost ($/M tokens) | Speed | Best Use in CrewAI |
|:------|:------------------|:------------------|:------|:------------------|
| **o3** | 70 | $17.50 | Fast | Manager LLM for complex multi-agent coordination |
| **Gemini 2.5 Pro** | 69 | $3.44 | Fast | Strategic planning agents, research coordination |
| **DeepSeek R1** | 68 | $0.96 | Moderate | Cost-effective reasoning for budget-conscious crews |
| **Claude 4 Sonnet** | 53 | $6.00 | Fast | Analysis agents requiring nuanced understanding |
| **Qwen3 235B (Reasoning)** | 62 | $2.63 | Moderate | Open-source alternative for reasoning tasks |
These models excel at multi-step reasoning and are ideal for agents that need to develop strategies, coordinate other agents, or analyze complex information.
</Tab>
<Tab title="Coding & Technical">
**Best for Development and Tool-Heavy Workflows**
| Model | Coding Performance | Tool Use Score | Cost ($/M tokens) | Best Use in CrewAI |
|:------|:------------------|:---------------|:------------------|:------------------|
| **Claude 4 Sonnet** | Excellent | 72.7% | $6.00 | Primary coding agent, technical documentation |
| **Claude 4 Opus** | Excellent | 72.5% | $30.00 | Complex software architecture, code review |
| **DeepSeek V3** | Very Good | High | $0.48 | Cost-effective coding for routine development |
| **Qwen2.5 Coder 32B** | Very Good | Medium | $0.15 | Budget-friendly coding agent |
| **Llama 3.1 405B** | Good | 81.1% | $3.50 | Function calling LLM for tool-heavy workflows |
These models are optimized for code generation, debugging, and technical problem-solving, making them ideal for development-focused crews.
</Tab>
<Tab title="Speed & Efficiency">
**Best for High-Throughput and Real-Time Applications**
| Model | Speed (tokens/s) | Latency (TTFT) | Cost ($/M tokens) | Best Use in CrewAI |
|:------|:-----------------|:---------------|:------------------|:------------------|
| **Llama 4 Scout** | 2,600 | 0.33s | $0.27 | High-volume processing agents |
| **Gemini 2.5 Flash** | 376 | 0.30s | $0.26 | Real-time response agents |
| **DeepSeek R1 Distill** | 383 | Variable | $0.04 | Cost-optimized high-speed processing |
| **Llama 3.3 70B** | 2,500 | 0.52s | $0.60 | Balanced speed and capability |
| **Nova Micro** | High | 0.30s | $0.04 | Simple, fast task execution |
These models prioritize speed and efficiency, perfect for agents handling routine operations or requiring quick responses. **Pro tip**: Pairing these models with fast inference providers like Groq can achieve even better performance, especially for open-source models like Llama.
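As a sketch of that pairing: CrewAI's `LLM` accepts LiteLLM-style provider prefixes, so routing a Llama model through Groq can look like the following (the exact model string is an assumption to verify against your provider's catalog, and `GROQ_API_KEY` must be set in your environment):
```python
# Llama served through Groq's fast inference via a provider-prefixed model string
fast_agent = Agent(
    role="High-Volume Processor",
    goal="Process routine records quickly and cheaply",
    backstory="You handle high-throughput, well-defined work.",
    llm=LLM(model="groq/llama-3.3-70b-versatile", temperature=0)
)
```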
</Tab>
<Tab title="Balanced Performance">
**Best All-Around Models for General Crews**
| Model | Overall Score | Versatility | Cost ($/M tokens) | Best Use in CrewAI |
|:------|:--------------|:------------|:------------------|:------------------|
| **GPT-4.1** | 53 | Excellent | $3.50 | General-purpose crew LLM |
| **Claude 3.7 Sonnet** | 48 | Very Good | $6.00 | Balanced reasoning and creativity |
| **Gemini 2.0 Flash** | 48 | Good | $0.17 | Cost-effective general use |
| **Llama 4 Maverick** | 51 | Good | $0.37 | Open-source general purpose |
| **Qwen3 32B** | 44 | Good | $1.23 | Budget-friendly versatility |
These models offer good performance across multiple dimensions, suitable for crews with diverse task requirements.
</Tab>
</Tabs>
### Selection Framework for Current Models
<AccordionGroup>
<Accordion title="High-Performance Crews" icon="rocket">
**When performance is the priority**: Use top-tier models like **o3**, **Gemini 2.5 Pro**, or **Claude 4 Sonnet** for manager LLMs and critical agents. These models excel at complex reasoning and coordination but come with higher costs.
**Strategy**: Implement a multi-model approach where premium models handle strategic thinking while efficient models handle routine operations.
</Accordion>
<Accordion title="Cost-Conscious Crews" icon="dollar-sign">
**When budget is a primary constraint**: Focus on models like **DeepSeek R1**, **Llama 4 Scout**, or **Gemini 2.0 Flash**. These provide strong performance at significantly lower costs.
**Strategy**: Use cost-effective models for most agents, reserving premium models only for the most critical decision-making roles.
</Accordion>
<Accordion title="Specialized Workflows" icon="screwdriver-wrench">
**For specific domain expertise**: Choose models optimized for your primary use case. **Claude 4** series for coding, **Gemini 2.5 Pro** for research, **Llama 405B** for function calling.
**Strategy**: Select models based on your crew's primary function, ensuring the core capability aligns with model strengths.
</Accordion>
<Accordion title="Enterprise & Privacy" icon="shield">
**For data-sensitive operations**: Consider open-source models like **Llama 4** series, **DeepSeek V3**, or **Qwen3** that can be deployed locally while maintaining competitive performance.
**Strategy**: Deploy open-source models on private infrastructure, accepting potential performance trade-offs for data control.
</Accordion>
</AccordionGroup>
### Key Considerations for Model Selection
- **Performance Trends**: The current landscape shows strong competition between reasoning-focused models (o3, Gemini 2.5 Pro) and balanced models (Claude 4, GPT-4.1). Specialized models like DeepSeek R1 offer excellent cost-performance ratios.
- **Speed vs. Intelligence Trade-offs**: Models like Llama 4 Scout prioritize speed (2,600 tokens/s) while maintaining reasonable intelligence, whereas models like o3 maximize reasoning capability at the cost of speed and price.
- **Open Source Viability**: The gap between open-source and proprietary models continues to narrow, with models like Llama 4 Maverick and DeepSeek V3 offering competitive performance at attractive price points. Fast inference providers particularly shine with open-source models, often delivering better speed-to-cost ratios than proprietary alternatives.
<Info>
**Testing is Essential**: Leaderboard rankings provide general guidance, but your specific use case, prompting style, and evaluation criteria may produce different results. Always test candidate models with your actual tasks and data before making final decisions.
</Info>
### Practical Implementation Strategy
<Steps>
<Step title="Start with Proven Models">
Begin with well-established models like **GPT-4.1**, **Claude 3.7 Sonnet**, or **Gemini 2.0 Flash** that offer good performance across multiple dimensions and have extensive real-world validation.
</Step>
<Step title="Identify Specialized Needs">
Determine if your crew has specific requirements (coding, reasoning, speed) that would benefit from specialized models like **Claude 4 Sonnet** for development or **o3** for complex analysis. For speed-critical applications, consider fast inference providers like **Groq** alongside model selection.
</Step>
<Step title="Implement Multi-Model Strategy">
Use different models for different agents based on their roles. High-capability models for managers and complex tasks, efficient models for routine operations.
</Step>
<Step title="Monitor and Optimize">
Track performance metrics relevant to your use case and be prepared to adjust model selections as new models are released or pricing changes.
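As a lightweight starting point, CrewAI aggregates token usage after a run. A minimal sketch, assuming `crew.usage_metrics` is populated after `kickoff()` (true in recent CrewAI versions):
```python
import time

start = time.perf_counter()
result = crew.kickoff()
elapsed = time.perf_counter() - start

# usage_metrics aggregates token counts across all agents in the run
print(f"elapsed={elapsed:.1f}s usage={crew.usage_metrics}")
```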
</Step>
</Steps>


@@ -7,196 +7,818 @@ icon: key
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-CrewAI.png" alt="Portkey CrewAI Header Image" width="70%" />
[Portkey](https://portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) is a 2-line upgrade to make your CrewAI agents reliable, cost-efficient, and fast.
Portkey adds 4 core production capabilities to any CrewAI agent:
1. Routing to **200+ LLMs**
2. Making each LLM call more robust
3. Full-stack tracing & cost, performance analytics
4. Real-time guardrails to enforce behavior
## Introduction
Portkey enhances CrewAI with production-readiness features, turning your experimental agent crews into robust systems by providing:
- **Complete observability** of every agent step, tool use, and interaction
- **Built-in reliability** with fallbacks, retries, and load balancing
- **Cost tracking and optimization** to manage your AI spend
- **Access to 200+ LLMs** through a single integration
- **Guardrails** to keep agent behavior safe and compliant
- **Version-controlled prompts** for consistent agent performance
## Getting Started
### Installation & Setup
<Steps>
<Step title="Install the required packages">
```bash
pip install -U crewai portkey-ai
```
</Step>
<Step title="Generate API Key" icon="lock">
Create a Portkey API key with optional budget/rate limits from the [Portkey dashboard](https://app.portkey.ai/). You can also attach configurations for reliability, caching, and more to this key. More on this later.
</Step>
<Step title="Configure CrewAI with Portkey">
The integration is simple - you just need to update the LLM configuration in your CrewAI setup. To build CrewAI agents with Portkey, you'll need two keys:
- **Portkey API Key**: Sign up on the [Portkey app](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) and copy your API key
- **Virtual Key**: Virtual Keys securely manage your LLM provider API keys in one place, stored in Portkey's vault
```python
from crewai import LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

# Create an LLM instance with Portkey integration
gpt_llm = LLM(
    model="gpt-4o",
    base_url=PORTKEY_GATEWAY_URL,
    api_key="dummy",  # We are using a Virtual key, so this is a placeholder
    extra_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        virtual_key="YOUR_LLM_VIRTUAL_KEY",
        trace_id="unique-trace-id",  # Optional, for request tracing
    )
)

# Use it in your Crew Agents like this:
@agent
def lead_market_analyst(self) -> Agent:
    return Agent(
        config=self.agents_config['lead_market_analyst'],
        verbose=True,
        memory=False,
        llm=gpt_llm
    )
```
<Info>
**What are Virtual Keys?** Virtual keys in Portkey securely store your LLM provider API keys (OpenAI, Anthropic, etc.) in an encrypted vault. They allow for easier key rotation and budget management. [Learn more about virtual keys here](https://portkey.ai/docs/product/ai-gateway/virtual-keys).
</Info>
</Step>
<Step title="Create and Run Your First Agent">
```python
from crewai import Agent, Task, Crew

# Define your agents with roles and goals
coder = Agent(
    role='Software developer',
    goal='Write clear, concise code on demand',
    backstory='An expert coder with a keen eye for software trends.',
    llm=gpt_llm
)

# Create tasks for your agents
task1 = Task(
    description="Define the HTML for making a simple website with heading- Hello World! Portkey is working!",
    expected_output="A clear and concise HTML code",
    agent=coder
)

# Instantiate your crew
crew = Crew(
    agents=[coder],
    tasks=[task1],
)

result = crew.kickoff()
print(result)
```
</Step>
</Steps>
## Key Features
| Feature | Description |
|:--------|:------------|
| 🌐 Multi-LLM Support | Access OpenAI, Anthropic, Gemini, Azure, and 250+ providers through a unified interface |
| 🛡️ Production Reliability | Implement retries, timeouts, load balancing, and fallbacks |
| 📊 Advanced Observability | Track 40+ metrics including costs, tokens, latency, and custom metadata |
| 🔍 Comprehensive Logging | Debug with detailed execution traces and function call logs |
| 🚧 Security Controls | Set budget limits and implement role-based access control |
| 🔄 Performance Analytics | Capture and analyze feedback for continuous improvement |
| 💾 Intelligent Caching | Reduce costs and latency with semantic or simple caching |
## Production Features
### 1. Enhanced Observability
Portkey provides comprehensive observability for your CrewAI agents, helping you understand exactly what's happening during each execution.
<Tabs>
<Tab title="Traces">
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/CrewAI%20Product%2011.1.webp"/>
</Frame>
Traces provide a hierarchical view of your crew's execution, showing the sequence of LLM calls, tool invocations, and state transitions.
```python
# Add trace_id to enable hierarchical tracing in Portkey
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_AZURE_VIRTUAL_KEY", #You don't need provider when using Virtual keys
trace_id="azure_agent"
virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
trace_id="unique-session-id" # Add unique trace ID
)
)
```
</Tab>
<Tab title="Logs">
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/CrewAI%20Portkey%20Docs%20Metadata.png"/>
</Frame>
Portkey logs every interaction with LLMs, including:
- Complete request and response payloads
- Latency and token usage metrics
- Cost calculations
- Tool calls and function executions
All logs can be filtered by metadata, trace IDs, models, and more, making it easy to debug specific crew runs.
</Tab>
<Tab title="Metrics & Dashboards">
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/CrewAI%20Dashboard.png"/>
</Frame>
Portkey provides built-in dashboards that help you:
- Track cost and token usage across all crew runs
- Analyze performance metrics like latency and success rates
- Identify bottlenecks in your agent workflows
- Compare different crew configurations and LLMs
You can filter and segment all metrics by custom metadata to analyze specific crew types, user groups, or use cases.
</Tab>
<Tab title="Metadata Filtering">
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/Metadata%20Filters%20from%20CrewAI.png" alt="Analytics with metadata filters" />
</Frame>
Add custom metadata to your CrewAI LLM configuration to enable powerful filtering and segmentation:
```python
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
metadata={
"crew_type": "research_crew",
"environment": "production",
"_user": "user_123", # Special _user field for user analytics
"request_source": "mobile_app"
}
)
)
```
This metadata can be used to filter logs, traces, and metrics on the Portkey dashboard, allowing you to analyze specific crew runs, users, or environments.
</Tab>
</Tabs>
### 2. Reliability - Keep Your Crews Running Smoothly
When running crews in production, things can go wrong - API rate limits, network issues, or provider outages. Portkey's reliability features ensure your agents keep running smoothly even when problems occur.
It's simple to enable fallback in your CrewAI setup by using a Portkey Config:
```python
from crewai import LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
# Create LLM with fallback configuration
portkey_llm = LLM(
model="gpt-4o",
max_tokens=1000,
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
config={
"strategy": {
"mode": "fallback"
},
"targets": [
{
"provider": "openai",
"api_key": "YOUR_OPENAI_API_KEY",
"override_params": {"model": "gpt-4o"}
},
{
"provider": "anthropic",
"api_key": "YOUR_ANTHROPIC_API_KEY",
"override_params": {"model": "claude-3-opus-20240229"}
}
]
}
)
)
# Use this LLM configuration with your agents
```
This configuration will automatically try Claude if the GPT-4o request fails, ensuring your crew can continue operating.
<CardGroup cols="2">
<Card title="Automatic Retries" icon="rotate" href="https://portkey.ai/docs/product/ai-gateway/automatic-retries">
Handles temporary failures automatically. If an LLM call fails, Portkey will retry the same request for the specified number of times - perfect for rate limits or network blips.
</Card>
<Card title="Request Timeouts" icon="clock" href="https://portkey.ai/docs/product/ai-gateway/request-timeouts">
Prevent your agents from hanging. Set timeouts to ensure you get responses (or can fail gracefully) within your required timeframes.
</Card>
<Card title="Conditional Routing" icon="route" href="https://portkey.ai/docs/product/ai-gateway/conditional-routing">
Send different requests to different providers. Route complex reasoning to GPT-4, creative tasks to Claude, and quick responses to Gemini based on your needs.
</Card>
<Card title="Fallbacks" icon="shield" href="https://portkey.ai/docs/product/ai-gateway/fallbacks">
Keep running even if your primary provider fails. Automatically switch to backup providers to maintain availability.
</Card>
<Card title="Load Balancing" icon="scale-balanced" href="https://portkey.ai/docs/product/ai-gateway/load-balancing">
Spread requests across multiple API keys or providers. Great for high-volume crew operations and staying within rate limits.
</Card>
</CardGroup>
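Retries and timeouts are configured the same way as the fallback example above, through a Portkey Config. A minimal sketch; the field names follow Portkey's config schema, but treat the exact values as assumptions to verify against the linked docs:
```python
# Illustrative config combining automatic retries with a request timeout
portkey_llm = LLM(
    model="gpt-4o",
    base_url=PORTKEY_GATEWAY_URL,
    api_key="dummy",
    extra_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
        config={
            "retry": {"attempts": 3},   # retry transient failures up to 3 times
            "request_timeout": 10000    # fail the request after 10 seconds (ms)
        }
    )
)
```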
### 3. Prompting in CrewAI
Portkey's Prompt Engineering Studio helps you create, manage, and optimize the prompts used in your CrewAI agents. Instead of hardcoding prompts or instructions, use Portkey's prompt rendering API to dynamically fetch and apply your versioned prompts.
<Frame caption="Manage prompts in Portkey's Prompt Library">
![Prompt Playground Interface](https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/CrewAI%20Portkey%20Docs.webp)
</Frame>
<Tabs>
<Tab title="Prompt Playground">
Prompt Playground is a place to compare, test and deploy perfect prompts for your AI application. It's where you experiment with different models, test variables, compare outputs, and refine your prompt engineering strategy before deploying to production. It allows you to:
1. Iteratively develop prompts before using them in your agents
2. Test prompts with different variables and models
3. Compare outputs between different prompt versions
4. Collaborate with team members on prompt development
This visual environment makes it easier to craft effective prompts for each step in your CrewAI agents' workflow.
</Tab>
<Tab title="Using Prompt Templates">
The Prompt Render API retrieves your prompt templates with all parameters configured:
```python
from crewai import Agent, LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL, Portkey
# Initialize Portkey admin client
portkey_admin = Portkey(api_key="YOUR_PORTKEY_API_KEY")
# Retrieve prompt using the render API
prompt_data = portkey_admin.prompts.render(
prompt_id="YOUR_PROMPT_ID",
variables={
"agent_role": "Senior Research Scientist",
}
)
backstory_agent_prompt=prompt_data.data.messages[0]["content"]
# Set up LLM with Portkey integration
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_OPENAI_VIRTUAL_KEY"
)
)
# Create agent using the rendered prompt
researcher = Agent(
role="Senior Research Scientist",
goal="Discover groundbreaking insights about the assigned topic",
    backstory=backstory_agent_prompt,  # Use the rendered prompt
verbose=True,
llm=portkey_llm
)
```
</Tab>
<Tab title="Prompt Versioning">
You can:
- Create multiple versions of the same prompt
- Compare performance between versions
- Roll back to previous versions if needed
- Specify which version to use in your code:
```python
# Use a specific prompt version
prompt_data = portkey_admin.prompts.render(
prompt_id="YOUR_PROMPT_ID@version_number",
variables={
"agent_role": "Senior Research Scientist",
"agent_goal": "Discover groundbreaking insights"
}
)
```
</Tab>
<Tab title="Mustache Templating for variables">
Portkey prompts use Mustache-style templating for easy variable substitution:
```
You are a {{agent_role}} with expertise in {{domain}}.
Your mission is to {{agent_goal}} by leveraging your knowledge
and experience in the field.
Always maintain a {{tone}} tone and focus on providing {{focus_area}}.
```
When rendering, simply pass the variables:
```python
prompt_data = portkey_admin.prompts.render(
prompt_id="YOUR_PROMPT_ID",
variables={
"agent_role": "Senior Research Scientist",
"domain": "artificial intelligence",
"agent_goal": "discover groundbreaking insights",
"tone": "professional",
"focus_area": "practical applications"
}
)
```
</Tab>
</Tabs>
<Card title="Prompt Engineering Studio" icon="wand-magic-sparkles" href="https://portkey.ai/docs/product/prompt-library">
Learn more about Portkey's prompt management features
</Card>
### 4. Guardrails for Safe Crews
Guardrails ensure your CrewAI agents operate safely and respond appropriately in all situations.
**Why Use Guardrails?**
CrewAI agents can experience various failure modes:
- Generating harmful or inappropriate content
- Leaking sensitive information like PII
- Hallucinating incorrect information
- Generating outputs in incorrect formats
Portkey's guardrails add protections for both inputs and outputs.
**Implementing Guardrails**
```python
from crewai import Agent, LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
# Create LLM with guardrails
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
config={
"input_guardrails": ["guardrails-id-xxx", "guardrails-id-yyy"],
"output_guardrails": ["guardrails-id-zzz"]
}
)
)
# Create agent with guardrailed LLM
researcher = Agent(
role="Senior Research Scientist",
goal="Discover groundbreaking insights about the assigned topic",
backstory="You are an expert researcher with deep domain knowledge.",
verbose=True,
llm=portkey_llm
)
```
Portkey's guardrails can:
- Detect and redact PII in both inputs and outputs
- Filter harmful or inappropriate content
- Validate response formats against schemas
- Check for hallucinations against ground truth
- Apply custom business logic and rules
<Card title="Learn More About Guardrails" icon="shield-check" href="https://portkey.ai/docs/product/guardrails">
Explore Portkey's guardrail features to enhance agent safety
</Card>
### 5. User Tracking with Metadata
Track individual users through your CrewAI agents using Portkey's metadata system.
**What is Metadata in Portkey?**
Metadata allows you to associate custom data with each request, enabling filtering, segmentation, and analytics. The special `_user` field is specifically designed for user tracking.
```python
from crewai import Agent, LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
# Configure LLM with user tracking
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
metadata={
"_user": "user_123", # Special _user field for user analytics
"user_tier": "premium",
"user_company": "Acme Corp",
"session_id": "abc-123"
}
)
)
# Create agent with tracked LLM
researcher = Agent(
role="Senior Research Scientist",
goal="Discover groundbreaking insights about the assigned topic",
backstory="You are an expert researcher with deep domain knowledge.",
verbose=True,
llm=portkey_llm
)
```
**Filter Analytics by User**
With metadata in place, you can filter analytics by user and analyze performance metrics on a per-user basis:
<Frame caption="Filter analytics by user">
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/Metadata%20Filters%20from%20CrewAI.png"/>
</Frame>
This enables:
- Per-user cost tracking and budgeting
- Personalized user analytics
- Team or organization-level metrics
- Environment-specific monitoring (staging vs. production)
<Card title="Learn More About Metadata" icon="tags" href="https://portkey.ai/docs/product/observability/metadata">
Explore how to use custom metadata to enhance your analytics
</Card>
### 6. Caching for Efficient Crews
Implement caching to make your CrewAI agents more efficient and cost-effective:
<Tabs>
<Tab title="Simple Caching">
```python
from crewai import Agent, LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
# Configure LLM with simple caching
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
config={
"cache": {
"mode": "simple"
}
}
)
)
# Create agent with cached LLM
researcher = Agent(
role="Senior Research Scientist",
goal="Discover groundbreaking insights about the assigned topic",
backstory="You are an expert researcher with deep domain knowledge.",
verbose=True,
llm=portkey_llm
)
```
Simple caching performs exact matches on input prompts, caching identical requests to avoid redundant model executions.
</Tab>
<Tab title="Semantic Caching">
```python
from crewai import Agent, LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
# Configure LLM with semantic caching
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
config={
"cache": {
"mode": "semantic"
}
}
)
)
# Create agent with semantically cached LLM
researcher = Agent(
role="Senior Research Scientist",
goal="Discover groundbreaking insights about the assigned topic",
backstory="You are an expert researcher with deep domain knowledge.",
verbose=True,
llm=portkey_llm
)
```
Semantic caching considers the contextual similarity between input requests, caching responses for semantically similar inputs.
</Tab>
</Tabs>
### 7. Model Interoperability
CrewAI supports multiple LLM providers, and Portkey extends this capability by providing access to over 200 LLMs through a unified interface. You can easily switch between different models without changing your core agent logic:
```python
from crewai import Agent, LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
# Set up LLMs with different providers
openai_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_OPENAI_VIRTUAL_KEY"
)
)
anthropic_llm = LLM(
model="claude-3-5-sonnet-latest",
max_tokens=1000,
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_ANTHROPIC_VIRTUAL_KEY"
)
)
# Choose which LLM to use for each agent based on your needs
researcher = Agent(
role="Senior Research Scientist",
goal="Discover groundbreaking insights about the assigned topic",
backstory="You are an expert researcher with deep domain knowledge.",
verbose=True,
llm=openai_llm # Use anthropic_llm for Anthropic
)
```
Portkey provides access to LLMs from providers including:
- OpenAI (GPT-4o, GPT-4 Turbo, etc.)
- Anthropic (Claude 3.5 Sonnet, Claude 3 Opus, etc.)
- Mistral AI (Mistral Large, Mistral Medium, etc.)
- Google Vertex AI (Gemini 1.5 Pro, etc.)
- Cohere (Command, Command-R, etc.)
- AWS Bedrock (Claude, Titan, etc.)
- Local/Private Models
<Card title="Supported Providers" icon="server" href="https://portkey.ai/docs/integrations/llms">
See the full list of LLM providers supported by Portkey
</Card>
## Set Up Enterprise Governance for CrewAI
**Why Enterprise Governance?**
If you are using CrewAI inside your organization, you need to consider several governance aspects:
- **Cost Management**: Controlling and tracking AI spending across teams
- **Access Control**: Managing which teams can use specific models
- **Usage Analytics**: Understanding how AI is being used across the organization
- **Security & Compliance**: Maintaining enterprise security standards
- **Reliability**: Ensuring consistent service across all users
Portkey adds a comprehensive governance layer to address these enterprise needs. Let's implement these controls step by step.
<Steps>
<Step title="Create Virtual Key">
Virtual Keys are Portkey's secure way to manage your LLM provider API keys. They provide essential controls like:
- Budget limits for API usage
- Rate limiting capabilities
- Secure API key storage
To create a virtual key:
Go to [Virtual Keys](https://app.portkey.ai/virtual-keys) in the Portkey App, create a new virtual key, then save and copy the virtual key ID.
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/Virtual%20Key%20from%20Portkey%20Docs.png" width="500"/>
</Frame>
<Note>
Save your virtual key ID - you'll need it for the next step.
</Note>
</Step>
<Step title="Create Default Config">
Configs in Portkey define how your requests are routed, with features like advanced routing, fallbacks, and retries.
To create your config:
1. Go to [Configs](https://app.portkey.ai/configs) in Portkey dashboard
2. Create new config with:
```json
{
"virtual_key": "YOUR_VIRTUAL_KEY_FROM_STEP1",
"override_params": {
"model": "gpt-4o" // Your preferred model name
}
}
```
3. Save and note the Config name for the next step
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/CrewAI%20Portkey%20Docs%20Config.png" width="500"/>
</Frame>
</Step>
<Step title="Configure Portkey API Key">
Now create a Portkey API key and attach the config you created in Step 2:
1. Go to [API Keys](https://app.portkey.ai/api-keys) in Portkey and Create new API key
2. Select your config from `Step 2`
3. Generate and save your API key
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/CrewAI%20API%20Key.png" width="500"/>
</Frame>
</Step>
<Step title="Connect to CrewAI">
After setting up your Portkey API key with the attached config, connect it to your CrewAI agents:
```python
from crewai import Agent, LLM
from portkey_ai import PORTKEY_GATEWAY_URL
# Configure LLM with your API key
portkey_llm = LLM(
model="gpt-4o",
base_url=PORTKEY_GATEWAY_URL,
api_key="YOUR_PORTKEY_API_KEY"
)
# Create agent with Portkey-enabled LLM
researcher = Agent(
role="Senior Research Scientist",
goal="Discover groundbreaking insights about the assigned topic",
backstory="You are an expert researcher with deep domain knowledge.",
verbose=True,
llm=portkey_llm
)
```
</Step>
</Steps>
<AccordionGroup>
<Accordion title="Step 1: Implement Budget Controls & Rate Limits">
### Step 1: Implement Budget Controls & Rate Limits
Virtual Keys enable granular control over LLM access at the team/department level. This helps you:
- Set up [budget limits](https://portkey.ai/docs/product/ai-gateway/virtual-keys/budget-limits)
- Prevent unexpected usage spikes using Rate limits
- Track departmental spending
#### Setting Up Department-Specific Controls:
1. Navigate to [Virtual Keys](https://app.portkey.ai/virtual-keys) in Portkey dashboard
2. Create new Virtual Key for each department with budget limits and rate limits
3. Configure department-specific limits
<Frame>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/refs/heads/main/Virtual%20Key%20from%20Portkey%20Docs.png" width="500"/>
</Frame>
</Accordion>
<Accordion title="Step 2: Define Model Access Rules">
### Step 2: Define Model Access Rules
As your AI usage scales, controlling which teams can access specific models becomes crucial. Portkey Configs provide this control layer with features like:
#### Access Control Features:
- **Model Restrictions**: Limit access to specific models
- **Data Protection**: Implement guardrails for sensitive data
- **Reliability Controls**: Add fallbacks and retry logic
#### Example Configuration:
Here's a basic configuration to route requests to OpenAI, specifically using GPT-4o:
```json
{
"strategy": {
"mode": "single"
},
"targets": [
{
"virtual_key": "YOUR_OPENAI_VIRTUAL_KEY",
"override_params": {
"model": "gpt-4o"
}
}
]
}
```
<Note>
Configs can be updated anytime to adjust controls without affecting running applications.
</Note>
</Accordion>
<Accordion title="Step 3: Implement Access Controls">
### Step 3: Implement Access Controls
Create User-specific API keys that automatically:
- Track usage per user/team with the help of virtual keys
- Apply appropriate configs to route requests
- Collect relevant metadata to filter logs
- Enforce access permissions
Create API keys through:
- [Portkey App](https://app.portkey.ai/)
- [API Key Management API](/api-reference/admin-api/control-plane/api-keys/create-api-key)
Example using Python SDK:
```python
from portkey_ai import Portkey
portkey = Portkey(api_key="YOUR_ADMIN_API_KEY")
api_key = portkey.api_keys.create(
name="engineering-team",
type="organisation",
workspace_id="YOUR_WORKSPACE_ID",
defaults={
"config_id": "your-config-id",
"metadata": {
"environment": "production",
"department": "engineering"
}
},
scopes=["logs.view", "configs.read"]
)
```
<img src="https://github.com/siddharthsambharia-portkey/Portkey-Product-Images/blob/main/Portkey-Dashboard.png?raw=true" width="70%" alt="Portkey Dashboard" />
For detailed key management instructions, see our [API Keys documentation](/api-reference/admin-api/control-plane/api-keys/create-api-key).
</Accordion>
<Accordion title="Step 4: Deploy & Monitor">
### Step 4: Deploy & Monitor
After distributing API keys to your team members, your enterprise-ready CrewAI setup is ready to go. Each team member can now use their designated API keys with appropriate access levels and budget controls.
Monitor usage in Portkey dashboard:
- Cost tracking by department
- Model usage patterns
- Request volumes
- Error rates
</Accordion>
</AccordionGroup>
<Note>
### Enterprise Features Now Available
**Your CrewAI integration now has:**
- Departmental budget controls
- Model access governance
- Usage tracking & attribution
- Security guardrails
- Reliability features
</Note>
## Frequently Asked Questions
<AccordionGroup>
<Accordion title="How does Portkey enhance CrewAI?">
Portkey adds production-readiness to CrewAI through comprehensive observability (traces, logs, metrics), reliability features (fallbacks, retries, caching), and access to 200+ LLMs through a unified interface. This makes it easier to debug, optimize, and scale your agent applications.
</Accordion>
<Accordion title="Can I use Portkey with existing CrewAI applications?">
Yes! Portkey integrates seamlessly with existing CrewAI applications. You just need to update your LLM configuration code with the Portkey-enabled version. The rest of your agent and crew code remains unchanged.
</Accordion>
<Accordion title="Does Portkey work with all CrewAI features?">
Portkey supports all CrewAI features, including agents, tools, human-in-the-loop workflows, and all task process types (sequential, hierarchical, etc.). It adds observability and reliability without limiting any of the framework's functionality.
</Accordion>
<Accordion title="Can I track usage across multiple agents in a crew?">
Yes, Portkey allows you to use a consistent `trace_id` across multiple agents in a crew to track the entire workflow. This is especially useful for complex crews where you want to understand the full execution path across multiple agents.
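A minimal sketch of that pattern, reusing the helpers from the examples above:
```python
# Share one trace_id across every agent's LLM so the whole crew run appears as one trace
shared_trace_id = "crew-run-001"

research_llm = LLM(
    model="gpt-4o",
    base_url=PORTKEY_GATEWAY_URL,
    api_key="dummy",
    extra_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        virtual_key="YOUR_OPENAI_VIRTUAL_KEY",
        trace_id=shared_trace_id,
    )
)

writer_llm = LLM(
    model="claude-3-5-sonnet-latest",
    base_url=PORTKEY_GATEWAY_URL,
    api_key="dummy",
    extra_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        virtual_key="YOUR_ANTHROPIC_VIRTUAL_KEY",
        trace_id=shared_trace_id,  # same trace_id links both agents' calls
    )
)
```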
</Accordion>
<Accordion title="How do I filter logs and traces for specific crew runs?">
Portkey allows you to add custom metadata to your LLM configuration, which you can then use for filtering. Add fields like `crew_name`, `crew_type`, or `session_id` to easily find and analyze specific crew executions.
</Accordion>
<Accordion title="Can I use my own API keys with Portkey?">
Yes! Portkey uses your own API keys for the various LLM providers. It securely stores them as virtual keys, allowing you to easily manage and rotate keys without changing your code.
</Accordion>
</AccordionGroup>
## Resources
- [📘 Portkey Documentation](https://docs.portkey.ai)
- [📊 Portkey Dashboard](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai)
- [🐦 Twitter](https://twitter.com/portkeyai)
- [💬 Discord Community](https://discord.gg/DD7vgKK299)
<CardGroup cols="3">
<Card title="CrewAI Docs" icon="book" href="https://docs.crewai.com/">
<p>Official CrewAI documentation</p>
</Card>
<Card title="Book a Demo" icon="calendar" href="https://calendly.com/portkey-ai">
<p>Get personalized guidance on implementing this integration</p>
</Card>
</CardGroup>


@@ -1,6 +1,6 @@
[project]
name = "crewai"
version = "0.121.1"
version = "0.126.0"
description = "Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks."
readme = "README.md"
requires-python = ">=3.10,<3.14"
@@ -47,7 +47,7 @@ Documentation = "https://docs.crewai.com"
Repository = "https://github.com/crewAIInc/crewAI"
[project.optional-dependencies]
tools = ["crewai-tools~=0.45.0"]
tools = ["crewai-tools~=0.46.0"]
embeddings = [
"tiktoken~=0.8.0"
]

score.json Normal file

@@ -0,0 +1,3 @@
{
"score": 4
}


@@ -18,7 +18,7 @@ warnings.filterwarnings(
category=UserWarning,
module="pydantic.main",
)
__version__ = "0.121.1"
__version__ = "0.126.0"
__all__ = [
"Agent",
"Crew",


@@ -2,7 +2,7 @@ import shutil
import subprocess
from typing import Any, Dict, List, Literal, Optional, Sequence, Type, Union
from pydantic import Field, InstanceOf, PrivateAttr, model_validator
from pydantic import Field, InstanceOf, PrivateAttr, field_validator, model_validator
from crewai.agents import CacheHandler
from crewai.agents.agent_builder.base_agent import BaseAgent
@@ -71,6 +71,7 @@ class Agent(BaseAgent):
"""
_times_executed: int = PrivateAttr(default=0)
_last_reasoning_output: Optional[Any] = PrivateAttr(default=None)
max_execution_time: Optional[int] = Field(
default=None,
description="Maximum execution time for an agent to execute a task",
@@ -135,6 +136,21 @@ class Agent(BaseAgent):
default=None,
description="Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
)
reasoning_interval: Optional[int] = Field(
default=None,
description="Interval of steps after which the agent should reason again during execution. If None, reasoning only happens before execution.",
)
@field_validator('reasoning_interval')
@classmethod
def validate_reasoning_interval(cls, v):
if v is not None and v < 1:
raise ValueError("reasoning_interval must be >= 1")
return v
adaptive_reasoning: bool = Field(
default=False,
description="Whether the agent should adaptively decide when to reason during execution based on context.",
)
embedder: Optional[Dict[str, Any]] = Field(
default=None,
description="Embedder configuration for the agent.",
@@ -166,6 +182,9 @@ class Agent(BaseAgent):
def post_init_setup(self):
self.agent_ops_agent_name = self.role
if getattr(self, "adaptive_reasoning", False) and not getattr(self, "reasoning", False):
self.reasoning = True
self.llm = create_llm(self.llm)
if self.function_calling_llm and not isinstance(
self.function_calling_llm, BaseLLM
@@ -200,6 +219,7 @@ class Agent(BaseAgent):
collection_name=self.role,
storage=self.knowledge_storage or None,
)
self.knowledge.add_sources()
except (TypeError, ValueError) as e:
raise ValueError(f"Invalid Knowledge Configuration: {str(e)}")
@@ -243,21 +263,28 @@ class Agent(BaseAgent):
"""
if self.reasoning:
try:
from crewai.utilities.reasoning_handler import AgentReasoning, AgentReasoningOutput
from crewai.utilities.reasoning_handler import (
AgentReasoning,
AgentReasoningOutput,
)
reasoning_handler = AgentReasoning(task=task, agent=self)
reasoning_output: AgentReasoningOutput = reasoning_handler.handle_agent_reasoning()
reasoning_output: AgentReasoningOutput = (
reasoning_handler.handle_agent_reasoning()
)
# Add the reasoning plan to the task description
task.description += f"\n\nReasoning Plan:\n{reasoning_output.plan.plan}"
except Exception as e:
if hasattr(self, '_logger'):
self._logger.log("error", f"Error during reasoning process: {str(e)}")
if hasattr(self, "_logger"):
self._logger.log(
"error", f"Error during reasoning process: {str(e)}"
)
else:
print(f"Error during reasoning process: {str(e)}")
self._inject_date_to_task(task)
if self.tools_handler:
self.tools_handler.last_used_tool = {} # type: ignore # Incompatible types in assignment (expression has type "dict[Never, Never]", variable has type "ToolCalling")
@@ -369,6 +396,44 @@ class Agent(BaseAgent):
else:
task_prompt = self._use_trained_data(task_prompt=task_prompt)
if self.reasoning:
try:
from crewai.utilities.reasoning_handler import (
AgentReasoning,
AgentReasoningOutput,
)
reasoning_handler = AgentReasoning(
task=task,
agent=self,
extra_context=context or "",
)
reasoning_output: AgentReasoningOutput = reasoning_handler.handle_agent_reasoning()
# Store the reasoning output for the executor to use
self._last_reasoning_output = reasoning_output
plan_text = reasoning_output.plan.plan
internal_plan_msg = (
"### INTERNAL PLAN (do NOT reveal or repeat)\n" + plan_text
)
task_prompt = (
task_prompt
+ "\n\n"
+ internal_plan_msg
)
except Exception as e:
if hasattr(self, "_logger"):
self._logger.log(
"error", f"Error during reasoning process: {str(e)}"
)
else:
print(f"Error during reasoning process: {str(e)}")
try:
crewai_event_bus.emit(
self,
@@ -444,6 +509,10 @@ class Agent(BaseAgent):
self,
event=AgentExecutionCompletedEvent(agent=self, task=task, output=result),
)
# Clean up reasoning output after task completion
self._last_reasoning_output = None
return result
def _execute_with_timeout(self, task_prompt: str, task: Task, timeout: int) -> str:
@@ -622,22 +691,33 @@ class Agent(BaseAgent):
"""Inject the current date into the task description if inject_date is enabled."""
if self.inject_date:
from datetime import datetime
try:
valid_format_codes = ['%Y', '%m', '%d', '%H', '%M', '%S', '%B', '%b', '%A', '%a']
valid_format_codes = [
"%Y",
"%m",
"%d",
"%H",
"%M",
"%S",
"%B",
"%b",
"%A",
"%a",
]
is_valid = any(code in self.date_format for code in valid_format_codes)
if not is_valid:
raise ValueError(f"Invalid date format: {self.date_format}")
current_date: str = datetime.now().strftime(self.date_format)
task.description += f"\n\nCurrent Date: {current_date}"
except Exception as e:
if hasattr(self, '_logger'):
if hasattr(self, "_logger"):
self._logger.log("warning", f"Failed to inject date: {str(e)}")
else:
print(f"Warning: Failed to inject date: {str(e)}")
def _validate_docker_installation(self) -> None:
"""Check if Docker is installed and running."""
if not shutil.which("docker"):
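
Taken together, the new reasoning fields give an agent a way to re-plan mid-execution. A minimal sketch of configuring them, assuming the usual Agent constructor (the role, goal, and backstory values are made up):

from crewai import Agent

# Hypothetical configuration exercising the new reasoning fields.
# adaptive_reasoning=True implicitly enables reasoning (see post_init_setup),
# and reasoning_interval must be >= 1 per the field validator.
researcher = Agent(
    role="Research Analyst",
    goal="Summarize recent findings",
    backstory="An analyst who plans before acting.",
    reasoning=True,
    reasoning_interval=5,      # re-reason every 5 execution steps
    adaptive_reasoning=True,   # let the agent decide when to pause and re-plan
    max_reasoning_attempts=3,
)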

View File

@@ -0,0 +1,386 @@
"""Agent state management for long-running tasks with focus on progress tracking."""
from typing import Any, Dict, List, Optional, Union, Set
from pydantic import BaseModel, Field
from datetime import datetime
import json
class CriterionProgress(BaseModel):
"""Progress tracking for a single acceptance criterion."""
criterion: str = Field(description="The acceptance criterion")
status: str = Field(default="not_started", description="Status: not_started, in_progress, completed")
progress_notes: str = Field(default="", description="Specific progress made towards this criterion")
completion_percentage: int = Field(default=0, description="Estimated completion percentage (0-100)")
remaining_work: str = Field(default="", description="What still needs to be done for this criterion")
# Enhanced tracking
processed_items: Set[str] = Field(default_factory=set, description="IDs or identifiers of processed items")
total_items_expected: Optional[int] = Field(default=None, description="Total number of items expected (if known)")
items_to_process: List[str] = Field(default_factory=list, description="Queue of specific items to process next")
last_updated: datetime = Field(default_factory=datetime.now)
class ProgressLog(BaseModel):
"""Single log entry for progress tracking."""
timestamp: datetime = Field(default_factory=datetime.now)
action: str = Field(description="What action was taken")
result: str = Field(description="Result or outcome of the action")
items_processed: List[str] = Field(default_factory=list, description="Items processed in this action")
criterion: Optional[str] = Field(default=None, description="Related acceptance criterion")
class AgentState(BaseModel):
"""Enhanced state management with deterministic progress tracking.
This state helps agents maintain focus during long executions by tracking
specific progress against each acceptance criterion with detailed logging.
"""
# Core planning elements
plan: List[str] = Field(
default_factory=list,
description="The current plan steps"
)
acceptance_criteria: List[str] = Field(
default_factory=list,
description="Concrete criteria that must be met for task completion"
)
# Progress tracking
criteria_progress: Dict[str, CriterionProgress] = Field(
default_factory=dict,
description="Detailed progress for each acceptance criterion"
)
# Data storage
scratchpad: Dict[str, Any] = Field(
default_factory=dict,
description="Storage for intermediate results and data"
)
# Simple tracking
current_focus: str = Field(
default="",
description="What the agent should be focusing on right now"
)
next_steps: List[str] = Field(
default_factory=list,
description="Immediate next steps to take"
)
overall_progress: int = Field(
default=0,
description="Overall task completion percentage (0-100)"
)
# Enhanced tracking
progress_logs: List[ProgressLog] = Field(
default_factory=list,
description="Detailed log of all progress made"
)
work_queue: List[Dict[str, Any]] = Field(
default_factory=list,
description="Queue of specific work items to process"
)
# Metadata tracking
metadata: Dict[str, Any] = Field(
default_factory=dict,
description="Additional metadata for tracking (e.g., total count expectations)"
)
def initialize_criteria_progress(self) -> None:
"""Initialize progress tracking for all acceptance criteria."""
for criterion in self.acceptance_criteria:
if criterion not in self.criteria_progress:
self.criteria_progress[criterion] = CriterionProgress(criterion=criterion)
def update_criterion_progress(
self,
criterion: str,
status: str,
progress_notes: str,
completion_percentage: int,
remaining_work: str,
processed_items: Optional[List[str]] = None,
items_to_process: Optional[List[str]] = None,
total_items_expected: Optional[int] = None
) -> None:
"""Update progress for a specific criterion with enhanced tracking."""
if criterion in self.criteria_progress:
progress = self.criteria_progress[criterion]
progress.status = status
progress.progress_notes = progress_notes
progress.completion_percentage = max(0, min(100, completion_percentage))
progress.remaining_work = remaining_work
progress.last_updated = datetime.now()
# Update processed items
if processed_items:
progress.processed_items.update(processed_items)
# Update items to process queue
if items_to_process is not None:
progress.items_to_process = items_to_process
# Update total expected if provided
if total_items_expected is not None:
progress.total_items_expected = total_items_expected
# Recalculate completion percentage based on actual items if possible
if progress.total_items_expected and progress.total_items_expected > 0:
actual_percentage = int((len(progress.processed_items) / progress.total_items_expected) * 100)
progress.completion_percentage = actual_percentage
# Update overall progress
self._recalculate_overall_progress()
def _recalculate_overall_progress(self) -> None:
"""Recalculate overall progress based on all criteria."""
if not self.criteria_progress:
self.overall_progress = 0
return
total_progress = sum(p.completion_percentage for p in self.criteria_progress.values())
self.overall_progress = int(total_progress / len(self.criteria_progress))
def add_to_scratchpad(self, key: str, value: Any) -> None:
"""Add or update a value in the scratchpad."""
self.scratchpad[key] = value
# Analyze the data for item tracking
self._analyze_scratchpad_for_items(key, value)
def _analyze_scratchpad_for_items(self, key: str, value: Any) -> None:
"""Analyze scratchpad data to extract trackable items."""
# If it's a list, try to extract IDs
if isinstance(value, list) and value:
item_ids = []
for item in value:
if isinstance(item, dict):
# Look for common ID fields
for id_field in ['id', 'ID', 'uid', 'uuid', 'message_id', 'email_id']:
if id_field in item:
item_ids.append(str(item[id_field]))
break
if item_ids:
# Store metadata about this list
self.metadata[f"{key}_ids"] = item_ids
self.metadata[f"{key}_count"] = len(value)
def log_progress(self, action: str, result: str, items_processed: Optional[List[str]] = None, criterion: Optional[str] = None) -> None:
"""Add a progress log entry."""
log_entry = ProgressLog(
action=action,
result=result,
items_processed=items_processed or [],
criterion=criterion
)
self.progress_logs.append(log_entry)
def add_to_work_queue(self, work_item: Dict[str, Any]) -> None:
"""Add an item to the work queue."""
self.work_queue.append(work_item)
def get_next_work_item(self) -> Optional[Dict[str, Any]]:
"""Get and remove the next item from the work queue."""
if self.work_queue:
return self.work_queue.pop(0)
return None
def set_focus_and_next_steps(self, focus: str, next_steps: List[str]) -> None:
"""Update current focus and next steps."""
self.current_focus = focus
self.next_steps = next_steps
def get_progress_context(self) -> str:
"""Generate a focused progress update for the agent."""
context = f"📊 PROGRESS UPDATE (Overall: {self.overall_progress}%)\n"
context += "="*50 + "\n\n"
# Current focus
if self.current_focus:
context += f"🎯 CURRENT FOCUS: {self.current_focus}\n\n"
# Progress on each criterion with detailed tracking
if self.criteria_progress:
context += "📋 ACCEPTANCE CRITERIA PROGRESS:\n"
for criterion, progress in self.criteria_progress.items():
status_emoji = "" if progress.status == "completed" else "🔄" if progress.status == "in_progress" else "⏸️"
context += f"\n{status_emoji} {criterion}\n"
# Show detailed progress
if progress.total_items_expected:
context += f" Progress: {len(progress.processed_items)}/{progress.total_items_expected} items ({progress.completion_percentage}%)\n"
else:
context += f" Progress: {progress.completion_percentage}%"
if progress.processed_items:
context += f" - {len(progress.processed_items)} items processed"
context += "\n"
if progress.progress_notes:
context += f" Notes: {progress.progress_notes}\n"
# Show next items to process
if progress.items_to_process and progress.status != "completed":
next_items = progress.items_to_process[:3] # Show next 3
context += f" Next items: {', '.join(next_items)}"
if len(progress.items_to_process) > 3:
context += f" (and {len(progress.items_to_process) - 3} more)"
context += "\n"
if progress.remaining_work and progress.status != "completed":
context += f" Still needed: {progress.remaining_work}\n"
# Work queue status
if self.work_queue:
context += f"\n📝 WORK QUEUE: {len(self.work_queue)} items pending\n"
next_work = self.work_queue[0]
context += f" Next: {next_work.get('description', 'Process next item')}\n"
# Next steps
if self.next_steps:
context += f"\n📍 IMMEDIATE NEXT STEPS:\n"
for i, step in enumerate(self.next_steps, 1):
context += f"{i}. {step}\n"
# Available data
if self.scratchpad:
context += f"\n💾 AVAILABLE DATA IN SCRATCHPAD:\n"
for key, value in self.scratchpad.items():
if isinstance(value, list):
context += f"'{key}' - {len(value)} items"
if f"{key}_ids" in self.metadata:
context += f" (IDs tracked)"
context += "\n"
elif isinstance(value, dict):
context += f"'{key}' - dictionary data\n"
else:
context += f"'{key}'\n"
# Recent progress logs
if self.progress_logs:
context += f"\n📜 RECENT ACTIVITY:\n"
for log in self.progress_logs[-3:]: # Show last 3 logs
context += f"{log.timestamp.strftime('%H:%M:%S')} - {log.action}"
if log.items_processed:
context += f" ({len(log.items_processed)} items)"
context += "\n"
context += "\n" + "="*50
return context
def analyze_scratchpad_for_criterion_progress(self, criterion: str) -> Dict[str, Any]:
"""Analyze scratchpad data to determine specific progress on a criterion."""
analysis = {
"relevant_data": [],
"item_count": 0,
"processed_ids": set(),
"data_completeness": 0,
"specific_gaps": []
}
criterion_lower = criterion.lower()
# Look for data that relates to this criterion
for key, value in self.scratchpad.items():
key_lower = key.lower()
# Check if this data is relevant to the criterion
is_relevant = False
for keyword in criterion_lower.split():
if len(keyword) > 3 and keyword in key_lower: # Skip short words
is_relevant = True
break
if is_relevant:
analysis["relevant_data"].append(key)
# Count items and extract IDs
if isinstance(value, list):
analysis["item_count"] += len(value)
# Try to extract IDs from metadata
if f"{key}_ids" in self.metadata:
analysis["processed_ids"].update(self.metadata[f"{key}_ids"])
elif isinstance(value, dict):
analysis["item_count"] += 1
# Calculate completeness based on what we know
if analysis["item_count"] > 0:
# Check if criterion mentions specific numbers
import re
number_match = re.search(r'\b(\d+)\b', criterion)
if number_match:
expected_count = int(number_match.group(1))
analysis["data_completeness"] = min(100, int((analysis["item_count"] / expected_count) * 100))
if analysis["item_count"] < expected_count:
analysis["specific_gaps"].append(f"Need {expected_count - analysis['item_count']} more items")
else:
# For criteria without specific numbers, use heuristics
if "all" in criterion_lower or "every" in criterion_lower:
# For "all" criteria, we need to be more careful
analysis["data_completeness"] = 50 if analysis["item_count"] > 0 else 0
analysis["specific_gaps"].append("Verify all items are included")
else:
analysis["data_completeness"] = min(100, analysis["item_count"] * 20) # Rough estimate
return analysis
def generate_specific_next_steps(self, criterion: str) -> List[str]:
"""Generate specific, actionable next steps for a criterion."""
analysis = self.analyze_scratchpad_for_criterion_progress(criterion)
progress = self.criteria_progress.get(criterion)
next_steps = []
if not progress:
return ["Initialize progress tracking for this criterion"]
# If we have a queue of items to process
if progress.items_to_process:
next_item = progress.items_to_process[0]
next_steps.append(f"Query/process item: {next_item}")
if len(progress.items_to_process) > 1:
next_steps.append(f"Then process {len(progress.items_to_process) - 1} remaining items")
# If we have processed some items but not all
elif progress.processed_items and progress.total_items_expected:
remaining = progress.total_items_expected - len(progress.processed_items)
if remaining > 0:
next_steps.append(f"Process {remaining} more items to reach target of {progress.total_items_expected}")
# If we have data but haven't accessed it
elif analysis["relevant_data"] and not progress.processed_items:
for data_key in analysis["relevant_data"][:2]: # First 2 relevant keys
next_steps.append(f"Access and process data from '{data_key}'")
# Generic steps based on criterion keywords
else:
criterion_lower = criterion.lower()
if "email" in criterion_lower:
next_steps.append("Use email search/fetch tool to gather emails")
elif "analyze" in criterion_lower or "summary" in criterion_lower:
next_steps.append("Access stored data and create analysis/summary")
else:
next_steps.append(f"Use appropriate tools to gather data for: {criterion}")
return next_steps
def reset(self) -> None:
"""Reset state for a new task."""
self.plan = []
self.acceptance_criteria = []
self.criteria_progress = {}
self.scratchpad = {}
self.current_focus = ""
self.next_steps = []
self.overall_progress = 0
self.progress_logs = []
self.work_queue = []
self.metadata = {}
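
A short sketch of the intended lifecycle, using only the methods defined in this file (the criterion and item IDs are hypothetical):

state = AgentState(
    plan=["Fetch emails", "Summarize them"],
    acceptance_criteria=["Summarize 10 emails"],
)
state.initialize_criteria_progress()

# Storing fetched data also extracts item IDs into metadata.
state.add_to_scratchpad("email_data", [{"id": "e1"}, {"id": "e2"}])
state.update_criterion_progress(
    criterion="Summarize 10 emails",
    status="in_progress",
    progress_notes="Fetched 2 of 10 emails",
    completion_percentage=20,
    remaining_work="Fetch and summarize 8 more",
    processed_items=["e1", "e2"],
    total_items_expected=10,  # completion is recomputed as 2/10 = 20%
)
state.log_progress("fetch_emails", "Fetched 2 emails", items_processed=["e1", "e2"])
print(state.get_progress_context())  # the focused progress block shown to the agent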

File diff suppressed because it is too large

View File

@@ -16,6 +16,7 @@ from .deploy.main import DeployCommand
from .evaluate_crew import evaluate_crew
from .install_crew import install_crew
from .kickoff_flow import kickoff_flow
from .organization.main import OrganizationCommand
from .plot_flow import plot_flow
from .replay_from_task import replay_task_command
from .reset_memories_command import reset_memories_command
@@ -353,5 +354,33 @@ def chat():
run_chat()
@crewai.group(invoke_without_command=True)
def org():
"""Organization management commands."""
pass
@org.command()
def list():
"""List available organizations."""
org_command = OrganizationCommand()
org_command.list()
@org.command()
@click.argument("id")
def switch(id):
"""Switch to a specific organization."""
org_command = OrganizationCommand()
org_command.switch(id)
@org.command()
def current():
"""Show current organization when 'crewai org' is called without subcommands."""
org_command = OrganizationCommand()
org_command.current()
if __name__ == "__main__":
crewai()
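
A hedged sketch of exercising the new group from a test, assuming the crewai group shown above lives at crewai.cli.cli:

from click.testing import CliRunner
from crewai.cli.cli import crewai  # assumed module path for the CLI group

runner = CliRunner()
runner.invoke(crewai, ["org", "list"])                 # table of your organizations
runner.invoke(crewai, ["org", "switch", "some-uuid"])  # persists org_name/org_uuid
runner.invoke(crewai, ["org", "current"])              # shows the active organization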

View File

@@ -14,6 +14,12 @@ class Settings(BaseModel):
tool_repository_password: Optional[str] = Field(
None, description="Password for interacting with the Tool Repository"
)
org_name: Optional[str] = Field(
None, description="Name of the currently active organization"
)
org_uuid: Optional[str] = Field(
None, description="UUID of the currently active organization"
)
config_path: Path = Field(default=DEFAULT_CONFIG_PATH, exclude=True)
def __init__(self, config_path: Path = DEFAULT_CONFIG_PATH, **data):
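
The two new fields persist the active organization between CLI invocations; dump() is the persistence hook this diff calls in OrganizationCommand.switch. A minimal sketch (the values are placeholders):

from crewai.cli.config import Settings

settings = Settings()
settings.org_name = "Acme"              # placeholder values
settings.org_uuid = "example-org-uuid"
settings.dump()                         # written to DEFAULT_CONFIG_PATH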

View File

@@ -0,0 +1 @@

View File

@@ -0,0 +1,63 @@
from rich.console import Console
from rich.table import Table
from crewai.cli.command import BaseCommand, PlusAPIMixin
from crewai.cli.config import Settings
console = Console()
class OrganizationCommand(BaseCommand, PlusAPIMixin):
def __init__(self):
BaseCommand.__init__(self)
PlusAPIMixin.__init__(self, telemetry=self._telemetry)
def list(self):
try:
response = self.plus_api_client.get_organizations()
response.raise_for_status()
orgs = response.json()
if not orgs:
console.print("You don't belong to any organizations yet.", style="yellow")
return
table = Table(title="Your Organizations")
table.add_column("Name", style="cyan")
table.add_column("ID", style="green")
for org in orgs:
table.add_row(org["name"], org["uuid"])
console.print(table)
except Exception as e:
console.print(f"Failed to retrieve organization list: {str(e)}", style="bold red")
raise SystemExit(1)
def switch(self, org_id):
try:
response = self.plus_api_client.get_organizations()
response.raise_for_status()
orgs = response.json()
org = next((o for o in orgs if o["uuid"] == org_id), None)
if not org:
console.print(f"Organization with id '{org_id}' not found.", style="bold red")
return
settings = Settings()
settings.org_name = org["name"]
settings.org_uuid = org["uuid"]
settings.dump()
console.print(f"Successfully switched to {org['name']} ({org['uuid']})", style="bold green")
except Exception as e:
console.print(f"Failed to switch organization: {str(e)}", style="bold red")
raise SystemExit(1)
def current(self):
settings = Settings()
if settings.org_uuid:
console.print(f"Currently logged in to organization {settings.org_name} ({settings.org_uuid})", style="bold green")
else:
console.print("You're not currently logged in to any organization.", style="yellow")
console.print("Use 'crewai org list' to see available organizations.", style="yellow")
console.print("Use 'crewai org switch <id>' to switch to an organization.", style="yellow")

View File

@@ -1,9 +1,10 @@
from os import getenv
from typing import Optional
from typing import List, Optional
from urllib.parse import urljoin
import requests
from crewai.cli.config import Settings
from crewai.cli.version import get_crewai_version
@@ -13,6 +14,7 @@ class PlusAPI:
"""
TOOLS_RESOURCE = "/crewai_plus/api/v1/tools"
ORGANIZATIONS_RESOURCE = "/crewai_plus/api/v1/me/organizations"
CREWS_RESOURCE = "/crewai_plus/api/v1/crews"
AGENTS_RESOURCE = "/crewai_plus/api/v1/agents"
@@ -24,6 +26,9 @@ class PlusAPI:
"User-Agent": f"CrewAI-CLI/{get_crewai_version()}",
"X-Crewai-Version": get_crewai_version(),
}
settings = Settings()
if settings.org_uuid:
self.headers["X-Crewai-Organization-Id"] = settings.org_uuid
self.base_url = getenv("CREWAI_BASE_URL", "https://app.crewai.com")
def _make_request(self, method: str, endpoint: str, **kwargs) -> requests.Response:
@@ -48,6 +53,7 @@ class PlusAPI:
version: str,
description: Optional[str],
encoded_file: str,
available_exports: Optional[List[str]] = None,
):
params = {
"handle": handle,
@@ -55,6 +61,7 @@ class PlusAPI:
"version": version,
"file": encoded_file,
"description": description,
"available_exports": available_exports,
}
return self._make_request("POST", f"{self.TOOLS_RESOURCE}", json=params)
@@ -101,3 +108,7 @@ class PlusAPI:
def create_crew(self, payload) -> requests.Response:
return self._make_request("POST", self.CREWS_RESOURCE, json=payload)
def get_organizations(self) -> requests.Response:
return self._make_request("GET", self.ORGANIZATIONS_RESOURCE)

View File

@@ -4,7 +4,7 @@ Welcome to the {{crew_name}} Crew project, powered by [crewAI](https://crewai.co
## Installation
Ensure you have Python >=3.10 <3.13 installed on your system. This project uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.
Ensure you have Python >=3.10 <3.14 installed on your system. This project uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.
First, if you haven't already, install uv:

View File

@@ -3,9 +3,9 @@ name = "{{folder_name}}"
version = "0.1.0"
description = "{{name}} using crewAI"
authors = [{ name = "Your Name", email = "you@example.com" }]
requires-python = ">=3.10,<3.13"
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]>=0.121.1,<1.0.0"
"crewai[tools]>=0.126.0,<1.0.0"
]
[project.scripts]

View File

@@ -4,7 +4,7 @@ Welcome to the {{crew_name}} Crew project, powered by [crewAI](https://crewai.co
## Installation
Ensure you have Python >=3.10 <3.13 installed on your system. This project uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.
Ensure you have Python >=3.10 <3.14 installed on your system. This project uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.
First, if you haven't already, install uv:

View File

@@ -3,9 +3,9 @@ name = "{{folder_name}}"
version = "0.1.0"
description = "{{name}} using crewAI"
authors = [{ name = "Your Name", email = "you@example.com" }]
requires-python = ">=3.10,<3.13"
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]>=0.121.1,<1.0.0",
"crewai[tools]>=0.126.0,<1.0.0",
]
[project.scripts]

View File

@@ -5,7 +5,7 @@ custom tools to power up your crews.
## Installing
Ensure you have Python >=3.10 <3.13 installed on your system. This project
Ensure you have Python >=3.10 <3.14 installed on your system. This project
uses [UV](https://docs.astral.sh/uv/) for dependency management and package
handling, offering a seamless setup and execution experience.

View File

@@ -3,9 +3,9 @@ name = "{{folder_name}}"
version = "0.1.0"
description = "Power up your crews with {{folder_name}}"
readme = "README.md"
requires-python = ">=3.10,<3.13"
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]>=0.121.1"
"crewai[tools]>=0.126.0"
]
[tool.crewai]

View File

@@ -0,0 +1,3 @@
from .tool import {{class_name}}
__all__ = ["{{class_name}}"]

View File

@@ -3,6 +3,7 @@ import os
import subprocess
import tempfile
from pathlib import Path
from typing import Any
import click
from rich.console import Console
@@ -11,6 +12,7 @@ from crewai.cli import git
from crewai.cli.command import BaseCommand, PlusAPIMixin
from crewai.cli.config import Settings
from crewai.cli.utils import (
extract_available_exports,
get_project_description,
get_project_name,
get_project_version,
@@ -82,6 +84,14 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
project_description = get_project_description(require=False)
encoded_tarball = None
console.print("[bold blue]Discovering tools from your project...[/bold blue]")
available_exports = extract_available_exports()
if available_exports:
console.print(
f"[green]Found these tools to publish: {', '.join([e['name'] for e in available_exports])}[/green]"
)
with tempfile.TemporaryDirectory() as temp_build_dir:
subprocess.run(
["uv", "build", "--sdist", "--out-dir", temp_build_dir],
@@ -105,12 +115,14 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
encoded_tarball = base64.b64encode(tarball_contents).decode("utf-8")
console.print("[bold blue]Publishing tool to repository...[/bold blue]")
publish_response = self.plus_api_client.publish_tool(
handle=project_name,
is_public=is_public,
version=project_version,
description=project_description,
encoded_file=f"data:application/x-gzip;base64,{encoded_tarball}",
available_exports=available_exports,
)
self._validate_response(publish_response)
@@ -161,13 +173,20 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
settings.tool_repository_password = login_response_json["credential"][
"password"
]
settings.org_uuid = login_response_json["current_organization"][
"uuid"
]
settings.org_name = login_response_json["current_organization"][
"name"
]
settings.dump()
console.print(
"Successfully authenticated to the tool repository.", style="bold green"
)
def _add_package(self, tool_details):
def _add_package(self, tool_details: dict[str, Any]):
is_from_pypi = tool_details.get("source", None) == "pypi"
tool_handle = tool_details["handle"]
repository_handle = tool_details["repository"]["handle"]
repository_url = tool_details["repository"]["url"]
@@ -176,10 +195,13 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
add_package_command = [
"uv",
"add",
"--index",
index,
tool_handle,
]
if is_from_pypi:
add_package_command.append(tool_handle)
else:
add_package_command.extend(["--index", index, tool_handle])
add_package_result = subprocess.run(
add_package_command,
capture_output=False,

View File

@@ -1,8 +1,10 @@
import importlib.util
import os
import shutil
import sys
from functools import reduce
from inspect import isfunction, ismethod
from inspect import getmro, isclass, isfunction, ismethod
from pathlib import Path
from typing import Any, Dict, List, get_type_hints
import click
@@ -339,3 +341,112 @@ def fetch_crews(module_attr) -> list[Crew]:
if crew_instance := get_crew_instance(attr):
crew_instances.append(crew_instance)
return crew_instances
def is_valid_tool(obj):
"""Return True if obj is a BaseTool subclass or a @tool-decorated Tool instance."""
from crewai.tools.base_tool import Tool
if isclass(obj):
try:
return any(base.__name__ == "BaseTool" for base in getmro(obj))
except (TypeError, AttributeError):
return False
return isinstance(obj, Tool)
def extract_available_exports(dir_path: str = "src"):
"""
Extract available tool classes from the project's __init__.py files.
Only includes classes that inherit from BaseTool or functions decorated with @tool.
Returns:
list: A list of dicts with the name of each valid tool export; exits with an error if none are found
"""
try:
init_files = Path(dir_path).glob("**/__init__.py")
available_exports = []
for init_file in init_files:
tools = _load_tools_from_init(init_file)
available_exports.extend(tools)
if not available_exports:
_print_no_tools_warning()
raise SystemExit(1)
return available_exports
except Exception as e:
console.print(f"[red]Error: Could not extract tool classes: {str(e)}[/red]")
console.print(
"Please ensure your project contains valid tools (classes inheriting from BaseTool or functions with @tool decorator)."
)
raise SystemExit(1)
def _load_tools_from_init(init_file: Path) -> list[dict[str, Any]]:
"""
Load and validate tools from a given __init__.py file.
"""
spec = importlib.util.spec_from_file_location("temp_module", init_file)
if not spec or not spec.loader:
return []
module = importlib.util.module_from_spec(spec)
sys.modules["temp_module"] = module
try:
spec.loader.exec_module(module)
if not hasattr(module, "__all__"):
console.print(
f"[bold yellow]Warning: No __all__ defined in {init_file}[/bold yellow]"
)
raise SystemExit(1)
return [
{
"name": name,
}
for name in module.__all__
if hasattr(module, name) and is_valid_tool(getattr(module, name))
]
except Exception as e:
console.print(f"[red]Warning: Could not load {init_file}: {str(e)}[/red]")
raise SystemExit(1)
finally:
sys.modules.pop("temp_module", None)
def _print_no_tools_warning():
"""
Display warning and usage instructions if no tools were found.
"""
console.print(
"\n[bold yellow]Warning: No valid tools were exposed in your __init__.py file![/bold yellow]"
)
console.print(
"Your __init__.py file must contain all classes that inherit from [bold]BaseTool[/bold] "
"or functions decorated with [bold]@tool[/bold]."
)
console.print(
"\nExample:\n[dim]# In your __init__.py file[/dim]\n"
"[green]__all__ = ['YourTool', 'your_tool_function'][/green]\n\n"
"[dim]# In your tool.py file[/dim]\n"
"[green]from crewai.tools import BaseTool, tool\n\n"
"# Tool class example\n"
"class YourTool(BaseTool):\n"
' name = "your_tool"\n'
' description = "Your tool description"\n'
" # ... rest of implementation\n\n"
"# Decorated function example\n"
"@tool\n"
"def your_tool_function(text: str) -> str:\n"
' """Your tool description"""\n'
" # ... implementation\n"
" return result\n"
)
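
For a project laid out as the warning text above describes, discovery reduces to one call. A hedged sketch assuming a src/ layout:

# src/my_tool/__init__.py would contain, e.g.:
#     __all__ = ["YourTool", "your_tool_function"]

from crewai.cli.utils import extract_available_exports

exports = extract_available_exports("src")
print([e["name"] for e in exports])  # e.g. ["YourTool", "your_tool_function"]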

View File

@@ -314,7 +314,7 @@ class Crew(FlowTrackable, BaseModel):
def create_crew_memory(self) -> "Crew":
"""Initialize private memory attributes."""
self._external_memory = (
# External memory doesnt support a default value since it was designed to be managed entirely externally
# External memory doesn't support a default value since it was designed to be managed entirely externally
self.external_memory.set_crew(self) if self.external_memory else None
)
@@ -1081,6 +1081,23 @@ class Crew(FlowTrackable, BaseModel):
token_usage=token_usage,
)
def _finish_execution(self, final_string_output: str) -> None:
if self.max_rpm:
self._rpm_controller.stop_rpm_counter()
def calculate_usage_metrics(self) -> UsageMetrics:
"""Calculates and returns the usage metrics."""
total_usage_metrics = UsageMetrics()
for agent in self.agents:
if hasattr(agent, "_token_process"):
token_sum = agent._token_process.get_summary()
total_usage_metrics.add_usage_metrics(token_sum)
if self.manager_agent and hasattr(self.manager_agent, "_token_process"):
token_sum = self.manager_agent._token_process.get_summary()
total_usage_metrics.add_usage_metrics(token_sum)
self.usage_metrics = total_usage_metrics
return total_usage_metrics
def _process_async_tasks(
self,
futures: List[Tuple[Task, Future[TaskOutput], int]],
@@ -1284,23 +1301,6 @@ class Crew(FlowTrackable, BaseModel):
for agent in self.agents:
agent.interpolate_inputs(inputs)
def _finish_execution(self, final_string_output: str) -> None:
if self.max_rpm:
self._rpm_controller.stop_rpm_counter()
def calculate_usage_metrics(self) -> UsageMetrics:
"""Calculates and returns the usage metrics."""
total_usage_metrics = UsageMetrics()
for agent in self.agents:
if hasattr(agent, "_token_process"):
token_sum = agent._token_process.get_summary()
total_usage_metrics.add_usage_metrics(token_sum)
if self.manager_agent and hasattr(self.manager_agent, "_token_process"):
token_sum = self.manager_agent._token_process.get_summary()
total_usage_metrics.add_usage_metrics(token_sum)
self.usage_metrics = total_usage_metrics
return total_usage_metrics
def test(
self,
n_iterations: int,

View File

@@ -17,7 +17,7 @@ Example
import ast
import inspect
from typing import Any, Dict, List, Optional, Tuple, Union
from typing import Any, Dict, List, Tuple, Union
from .utils import (
build_ancestor_dict,
@@ -140,7 +140,7 @@ def compute_positions(
flow: Any,
node_levels: Dict[str, int],
y_spacing: float = 150,
x_spacing: float = 150
x_spacing: float = 300
) -> Dict[str, Tuple[float, float]]:
"""
Compute the (x, y) positions for each node in the flow graph.
@@ -154,7 +154,7 @@ def compute_positions(
y_spacing : float, optional
Vertical spacing between levels, by default 150.
x_spacing : float, optional
Horizontal spacing between nodes, by default 150.
Horizontal spacing between nodes, by default 300.
Returns
-------

View File

@@ -79,7 +79,7 @@ class FilteredStream(io.TextIOBase):
"give feedback / get help" in lower_s
or "litellm.info:" in lower_s
or "litellm" in lower_s
or "Consider using a smaller input or implementing a text splitting strategy" in lower_s
or "consider using a smaller input or implementing a text splitting strategy" in lower_s
):
return 0

View File

@@ -527,10 +527,10 @@ class Task(BaseModel):
def prompt(self) -> str:
"""Generates the task prompt with optional markdown formatting.
When the markdown attribute is True, instructions for formatting the
response in Markdown syntax will be added to the prompt.
Returns:
str: The formatted prompt string containing the task description,
expected output, and optional markdown formatting instructions.
@@ -541,7 +541,7 @@ class Task(BaseModel):
expected_output=self.expected_output
)
tasks_slices = [self.description, output]
if self.markdown:
markdown_instruction = """Your final answer MUST be formatted in Markdown syntax.
Follow these guidelines:
@@ -550,7 +550,8 @@ Follow these guidelines:
- Use * for italic text
- Use - or * for bullet points
- Use `code` for inline code
- Use ```language for code blocks"""
- Use ```language for code blocks
- Don't start your answer with a code block"""
tasks_slices.append(markdown_instruction)
return "\n".join(tasks_slices)

View File

@@ -1 +1,7 @@
from .base_tool import BaseTool, tool
from .base_tool import BaseTool, tool, EnvVar
__all__ = [
"BaseTool",
"tool",
"EnvVar",
]

View File

@@ -1 +1,6 @@
"""Agent tools for crewAI."""
from .agent_tools import AgentTools
from .scratchpad_tool import ScratchpadTool
__all__ = ["AgentTools", "ScratchpadTool"]

View File

@@ -0,0 +1,174 @@
"""Tool for accessing data stored in the agent's scratchpad during reasoning."""
from typing import Any, Dict, Optional, Type, Union, Callable
from pydantic import BaseModel, Field
from crewai.tools import BaseTool
class ScratchpadToolSchema(BaseModel):
"""Input schema for ScratchpadTool."""
key: str = Field(
...,
description=(
"The key name to retrieve data from the scratchpad. "
"Must be one of the available keys shown in the tool description. "
"Example: if 'email_data' is listed as available, use {\"key\": \"email_data\"}"
)
)
class ScratchpadTool(BaseTool):
"""Tool that allows agents to access data stored in their scratchpad during task execution.
This tool's description is dynamically updated to show all available keys,
making it easy for agents to know what data they can retrieve.
"""
name: str = "Access Scratchpad Memory"
description: str = "Access data stored in your scratchpad memory during task execution."
args_schema: Type[BaseModel] = ScratchpadToolSchema
scratchpad_data: Dict[str, Any] = Field(default_factory=dict)
# Allow repeated usage of this tool - scratchpad access should not be limited
cache_function: Callable = lambda _args, _result: False # Don't cache scratchpad access
allow_repeated_usage: bool = True # Allow accessing the same key multiple times
def __init__(self, scratchpad_data: Optional[Dict[str, Any]] = None, **kwargs):
"""Initialize the scratchpad tool with optional initial data.
Args:
scratchpad_data: Initial scratchpad data (usually from agent state)
"""
super().__init__(**kwargs)
if scratchpad_data:
self.scratchpad_data = scratchpad_data
self._update_description()
def _run(
self,
key: str,
**kwargs: Any,
) -> Union[str, Dict[str, Any], Any]:
"""Retrieve data from the scratchpad using the specified key.
Args:
key: The key to look up in the scratchpad
Returns:
The value associated with the key, or an error message if not found
"""
print(f"[DEBUG] ScratchpadTool._run called with key: '{key}'")
print(f"[DEBUG] Current scratchpad keys: {list(self.scratchpad_data.keys())}")
print(f"[DEBUG] Scratchpad data size: {len(self.scratchpad_data)}")
if not self.scratchpad_data:
return (
"❌ SCRATCHPAD IS EMPTY\n\n"
"The scratchpad does not contain any data yet.\n"
"Data will be automatically stored here as you use other tools.\n"
"Try executing other tools first to gather information.\n\n"
"💡 TIP: Tools like search, read, or fetch operations will automatically store their results in the scratchpad."
)
if key not in self.scratchpad_data:
available_keys = list(self.scratchpad_data.keys())
keys_formatted = "\n".join(f" - '{k}'" for k in available_keys)
# Create more helpful examples based on actual keys
example_key = available_keys[0] if available_keys else 'example_key'
# Check if the user tried a similar key (case-insensitive or partial match)
similar_keys = [k for k in available_keys if key.lower() in k.lower() or k.lower() in key.lower()]
similarity_hint = ""
if similar_keys:
similarity_hint = f"\n\n🔍 Did you mean one of these?\n" + "\n".join(f" - '{k}'" for k in similar_keys)
return (
f"❌ KEY NOT FOUND: '{key}'\n"
f"{'='*50}\n\n"
f"The key '{key}' does not exist in the scratchpad.\n\n"
f"📦 AVAILABLE KEYS IN SCRATCHPAD:\n{keys_formatted}\n"
f"{similarity_hint}\n\n"
f"✅ CORRECT USAGE EXAMPLE:\n"
f"Action: Access Scratchpad Memory\n"
f"Action Input: {{\"key\": \"{example_key}\"}}\n\n"
f"⚠️ IMPORTANT:\n"
f"- Keys are case-sensitive and must match EXACTLY\n"
f"- Use the exact key name from the list above\n"
f"- Do NOT modify or guess key names\n\n"
f"{'='*50}"
)
value = self.scratchpad_data[key]
# Format the output nicely based on the type
if isinstance(value, dict):
import json
formatted_output = f"✅ Successfully retrieved data for key '{key}':\n\n"
formatted_output += json.dumps(value, indent=2)
return formatted_output
elif isinstance(value, list):
import json
formatted_output = f"✅ Successfully retrieved data for key '{key}':\n\n"
formatted_output += json.dumps(value, indent=2)
return formatted_output
else:
return f"✅ Successfully retrieved data for key '{key}':\n\n{str(value)}"
def update_scratchpad(self, new_data: Dict[str, Any]) -> None:
"""Update the scratchpad data and refresh the tool description.
Args:
new_data: The new complete scratchpad data
"""
self.scratchpad_data = new_data
self._update_description()
def _update_description(self) -> None:
"""Update the tool description to include all available keys."""
base_description = (
"Access data stored in your scratchpad memory during task execution.\n\n"
"HOW TO USE THIS TOOL:\n"
"Provide a JSON object with a 'key' field containing the exact name of the data you want to retrieve.\n"
"Example: {\"key\": \"email_data\"}"
)
if not self.scratchpad_data:
self.description = (
f"{base_description}\n\n"
"📝 STATUS: Scratchpad is currently empty.\n"
"Data will be automatically stored here as you use other tools."
)
return
# Build a description of available keys with a preview of their contents
key_descriptions = []
example_key = None
for key, value in self.scratchpad_data.items():
if not example_key:
example_key = key
# Create a brief description of what's stored
if isinstance(value, dict):
preview = f"dict with {len(value)} items"
if 'data' in value and isinstance(value['data'], list):
preview = f"list of {len(value['data'])} items"
elif isinstance(value, list):
preview = f"list of {len(value)} items"
elif isinstance(value, str):
preview = f"string ({len(value)} chars)"
else:
preview = type(value).__name__
key_descriptions.append(f" 📌 '{key}': {preview}")
available_keys_text = "\n".join(key_descriptions)
self.description = (
f"{base_description}\n\n"
f"📦 AVAILABLE DATA IN SCRATCHPAD:\n{available_keys_text}\n\n"
f"💡 EXAMPLE USAGE:\n"
f"To retrieve the '{example_key}' data, use:\n"
f"Action Input: {{\"key\": \"{example_key}\"}}"
)
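
A quick sketch of the tool in isolation, invoking _run directly for illustration (the keys and values are hypothetical; in practice the agent state supplies them):

tool = ScratchpadTool(scratchpad_data={"email_data": [{"id": "e1"}]})
print(tool.description)             # lists 'email_data' with a content preview
print(tool._run(key="email_data"))  # pretty-printed JSON for the stored list
print(tool._run(key="missing"))     # error message listing the available keys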

View File

@@ -1,7 +1,7 @@
import asyncio
from abc import ABC, abstractmethod
from inspect import signature
from typing import Any, Callable, Type, get_args, get_origin
from typing import Any, Callable, Type, get_args, get_origin, Optional, List
from pydantic import (
BaseModel,
@@ -14,6 +14,11 @@ from pydantic import BaseModel as PydanticBaseModel
from crewai.tools.structured_tool import CrewStructuredTool
class EnvVar(BaseModel):
name: str
description: str
required: bool = True
default: Optional[str] = None
class BaseTool(BaseModel, ABC):
class _ArgsSchemaPlaceholder(PydanticBaseModel):
@@ -25,6 +30,8 @@ class BaseTool(BaseModel, ABC):
"""The unique name of the tool that clearly communicates its purpose."""
description: str
"""Used to tell the model how/when/why to use the tool."""
env_vars: List[EnvVar] = []
"""List of environment variables used by the tool."""
args_schema: Type[PydanticBaseModel] = Field(
default_factory=_ArgsSchemaPlaceholder, validate_default=True
)
@@ -39,6 +46,8 @@ class BaseTool(BaseModel, ABC):
"""Maximum number of times this tool can be used. None means unlimited usage."""
current_usage_count: int = 0
"""Current number of times this tool has been used."""
allow_repeated_usage: bool = False
"""Flag to allow this tool to be used repeatedly with the same arguments."""
@field_validator("args_schema", mode="before")
@classmethod
@@ -57,7 +66,7 @@ class BaseTool(BaseModel, ABC):
},
},
)
@field_validator("max_usage_count", mode="before")
@classmethod
def validate_max_usage_count(cls, v: int | None) -> int | None:
@@ -81,11 +90,11 @@ class BaseTool(BaseModel, ABC):
# If _run is async, we safely run it
if asyncio.iscoroutine(result):
result = asyncio.run(result)
self.current_usage_count += 1
return result
def reset_usage_count(self) -> None:
"""Reset the current usage count to zero."""
self.current_usage_count = 0
@@ -109,6 +118,8 @@ class BaseTool(BaseModel, ABC):
result_as_answer=self.result_as_answer,
max_usage_count=self.max_usage_count,
current_usage_count=self.current_usage_count,
allow_repeated_usage=self.allow_repeated_usage,
cache_function=self.cache_function,
)
@classmethod
@@ -272,7 +283,7 @@ def to_langchain(
def tool(*args, result_as_answer: bool = False, max_usage_count: int | None = None) -> Callable:
"""
Decorator to create a tool from a function.
Args:
*args: Positional arguments, either the function to decorate or the tool name.
result_as_answer: Flag to indicate if the tool result should be used as the final agent answer.
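
A hedged sketch of a tool declaring its environment requirements with the new EnvVar model and opting into repeated calls (the tool itself is hypothetical):

from crewai.tools import BaseTool, EnvVar

class WeatherTool(BaseTool):
    name: str = "weather_lookup"
    description: str = "Look up the weather for a city."
    env_vars: list[EnvVar] = [
        EnvVar(name="WEATHER_API_KEY", description="API key for the weather service"),
        EnvVar(name="WEATHER_UNITS", description="Reporting units",
               required=False, default="metric"),
    ]
    allow_repeated_usage: bool = True  # same-args calls are not blocked as repeats

    def _run(self, city: str) -> str:
        return f"Weather for {city}"  # hypothetical implementation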

View File

@@ -25,6 +25,8 @@ class CrewStructuredTool:
result_as_answer: bool = False,
max_usage_count: int | None = None,
current_usage_count: int = 0,
allow_repeated_usage: bool = False,
cache_function: Optional[Callable] = None,
) -> None:
"""Initialize the structured tool.
@@ -36,6 +38,8 @@ class CrewStructuredTool:
result_as_answer: Whether to return the output directly
max_usage_count: Maximum number of times this tool can be used. None means unlimited usage.
current_usage_count: Current number of times this tool has been used.
allow_repeated_usage: Whether to allow this tool to be used repeatedly with the same arguments.
cache_function: Function that will be used to determine if the tool should be cached.
"""
self.name = name
self.description = description
@@ -45,6 +49,8 @@ class CrewStructuredTool:
self.result_as_answer = result_as_answer
self.max_usage_count = max_usage_count
self.current_usage_count = current_usage_count
self.allow_repeated_usage = allow_repeated_usage
self.cache_function = cache_function if cache_function is not None else lambda _args=None, _result=None: True
# Validate the function signature matches the schema
self._validate_function_signature()
@@ -197,6 +203,42 @@ class CrewStructuredTool:
validated_args = self.args_schema.model_validate(raw_args)
return validated_args.model_dump()
except Exception as e:
# Check if this is a "Field required" error and try to fix it
error_str = str(e)
if "Field required" in error_str:
# Try to parse missing fields from the error
import re
from pydantic import ValidationError
if isinstance(e, ValidationError):
# Extract missing fields from validation error
missing_fields = []
for error_detail in e.errors():
if error_detail.get('type') == 'missing':
field_path = error_detail.get('loc', ())
if field_path:
missing_fields.append(field_path[0])
if missing_fields:
# Create a copy of raw_args and add missing fields with None
fixed_args = dict(raw_args) if isinstance(raw_args, dict) else {}
for field in missing_fields:
if field not in fixed_args:
fixed_args[field] = None
# Try validation again with fixed args
try:
self._logger.log("debug", f"Auto-fixing missing fields: {missing_fields}")
validated_args = self.args_schema.model_validate(fixed_args)
return validated_args.model_dump()
except Exception as retry_e:
# If it still fails, raise the original error with additional context
raise ValueError(
f"Arguments validation failed: {e}\n"
f"Attempted to auto-fix missing fields {missing_fields} but still failed: {retry_e}"
)
# For other validation errors, raise as before
raise ValueError(f"Arguments validation failed: {e}")
async def ainvoke(
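
The retry only succeeds when the missing fields are nullable, since they are re-filled with None. The same logic, sketched standalone against a plain pydantic schema (names hypothetical):

from typing import Optional

from pydantic import BaseModel, ValidationError

class SearchArgs(BaseModel):
    query: str
    filter: Optional[str]  # required by the schema, but accepts None

raw_args = {"query": "crewai"}  # 'filter' omitted by the model
try:
    SearchArgs.model_validate(raw_args)
except ValidationError as e:
    missing = [err["loc"][0] for err in e.errors() if err["type"] == "missing"]
    fixed = {**raw_args, **{field: None for field in missing}}
    print(SearchArgs.model_validate(fixed).model_dump())
    # {'query': 'crewai', 'filter': None}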

View File

@@ -149,7 +149,13 @@ class ToolUsage:
tool: CrewStructuredTool,
calling: Union[ToolCalling, InstructorToolCalling],
) -> str:
if self._check_tool_repeated_usage(calling=calling): # type: ignore # _check_tool_repeated_usage of "ToolUsage" does not return a value (it only ever returns None)
# Check if tool allows repeated usage before blocking
allows_repeated = False
if hasattr(tool, 'allow_repeated_usage'):
allows_repeated = tool.allow_repeated_usage
elif hasattr(tool, '_tool') and hasattr(tool._tool, 'allow_repeated_usage'):
allows_repeated = tool._tool.allow_repeated_usage
if not allows_repeated and self._check_tool_repeated_usage(calling=calling): # type: ignore # _check_tool_repeated_usage of "ToolUsage" does not return a value (it only ever returns None)
try:
result = self._i18n.errors("task_repeated_usage").format(
tool_names=self.tools_names
@@ -180,7 +186,7 @@ class ToolUsage:
event_data.update(self.agent.fingerprint)
crewai_event_bus.emit(self,ToolUsageStartedEvent(**event_data))
started_at = time.time()
from_cache = False
result = None # type: ignore
@@ -250,9 +256,54 @@ class ToolUsage:
self._run_attempts += 1
if self._run_attempts > self._max_parsing_attempts:
self._telemetry.tool_usage_error(llm=self.function_calling_llm)
error_message = self._i18n.errors("tool_usage_exception").format(
error=e, tool=tool.name, tool_inputs=tool.description
)
# Check if this is a validation error with missing fields
error_str = str(e)
if "Arguments validation failed" in error_str and "Field required" in error_str:
# Extract the field name that's missing
import re
field_match = re.search(r'(\w+)\s*Field required', error_str)
if field_match:
missing_field = field_match.group(1)
# Create a more helpful error message
error_message = (
f"Tool validation error: The '{missing_field}' parameter is required but was not provided.\n\n"
f"SOLUTION: Include ALL parameters in your tool call, even optional ones (use null for optional parameters):\n"
f'{{"tool_name": "{tool.name}", "arguments": {{'
)
# Get all expected parameters from the tool schema
if hasattr(tool, 'args_schema'):
schema_props = tool.args_schema.model_json_schema().get('properties', {})
param_examples = []
for param_name, param_info in schema_props.items():
if param_name == missing_field:
param_examples.append(f'"{param_name}": "YOUR_VALUE_HERE"')
else:
# Check if it's optional by looking at required fields
is_required = param_name in tool.args_schema.model_json_schema().get('required', [])
if not is_required:
param_examples.append(f'"{param_name}": null')
else:
param_examples.append(f'"{param_name}": "value"')
error_message += ', '.join(param_examples)
error_message += '}}\n\n'
error_message += f"Original error: {e}\n"
error_message += f"Tool description: {tool.description}"
else:
# Use the original error message
error_message = self._i18n.errors("tool_usage_exception").format(
error=e, tool=tool.name, tool_inputs=tool.description
)
else:
# Use the original error message for non-validation errors
error_message = self._i18n.errors("tool_usage_exception").format(
error=e, tool=tool.name, tool_inputs=tool.description
)
error = ToolUsageErrorException(
f"\n{error_message}.\nMoving on then. {self._i18n.slice('format').format(tool_names=self.tools_names)}"
).message
@@ -281,49 +332,54 @@ class ToolUsage:
self.tools_handler.on_tool_use(
calling=calling, output=result, should_cache=should_cache
)
self._telemetry.tool_usage(
llm=self.function_calling_llm,
tool_name=tool.name,
attempts=self._run_attempts,
)
result = self._format_result(result=result) # type: ignore # "_format_result" of "ToolUsage" does not return a value (it only ever returns None)
data = {
"result": result,
"tool_name": tool.name,
"tool_args": calling.arguments,
}
self._telemetry.tool_usage(
llm=self.function_calling_llm,
tool_name=tool.name,
attempts=self._run_attempts,
)
result = self._format_result(result=result) # type: ignore # "_format_result" of "ToolUsage" does not return a value (it only ever returns None)
data = {
"result": result,
"tool_name": tool.name,
"tool_args": calling.arguments,
}
self.on_tool_use_finished(
tool=tool,
tool_calling=calling,
from_cache=from_cache,
started_at=started_at,
result=result,
)
self.on_tool_use_finished(
tool=tool,
tool_calling=calling,
from_cache=from_cache,
started_at=started_at,
result=result,
)
if (
hasattr(available_tool, "result_as_answer")
and available_tool.result_as_answer # type: ignore # Item "None" of "Any | None" has no attribute "cache_function"
):
result_as_answer = available_tool.result_as_answer # type: ignore # Item "None" of "Any | None" has no attribute "result_as_answer"
data["result_as_answer"] = result_as_answer # type: ignore
if (
hasattr(available_tool, "result_as_answer")
and available_tool.result_as_answer # type: ignore # Item "None" of "Any | None" has no attribute "cache_function"
):
result_as_answer = available_tool.result_as_answer # type: ignore # Item "None" of "Any | None" has no attribute "result_as_answer"
data["result_as_answer"] = result_as_answer # type: ignore
if self.agent and hasattr(self.agent, "tools_results"):
self.agent.tools_results.append(data)
if self.agent and hasattr(self.agent, "tools_results"):
self.agent.tools_results.append(data)
if available_tool and hasattr(available_tool, 'current_usage_count'):
available_tool.current_usage_count += 1
if hasattr(available_tool, 'max_usage_count') and available_tool.max_usage_count is not None:
self._printer.print(
content=f"Tool '{available_tool.name}' usage: {available_tool.current_usage_count}/{available_tool.max_usage_count}",
color="blue"
)
if available_tool and hasattr(available_tool, 'current_usage_count'):
available_tool.current_usage_count += 1
if hasattr(available_tool, 'max_usage_count') and available_tool.max_usage_count is not None:
self._printer.print(
content=f"Tool '{available_tool.name}' usage: {available_tool.current_usage_count}/{available_tool.max_usage_count}",
color="blue"
)
return result
return result
def _format_result(self, result: Any) -> str:
if self.task:
self.task.used_tools += 1
# Handle None results explicitly
if result is None:
result = "No result returned from tool"
if self._should_remember_format():
result = self._remember_format(result=result)
return str(result)
@@ -346,24 +402,34 @@ class ToolUsage:
if not self.tools_handler:
return False
if last_tool_usage := self.tools_handler.last_used_tool:
return (calling.tool_name == last_tool_usage.tool_name) and (
is_repeated = (calling.tool_name == last_tool_usage.tool_name) and (
calling.arguments == last_tool_usage.arguments
)
return is_repeated
return False
def _check_usage_limit(self, tool: Any, tool_name: str) -> str | None:
"""Check if tool has reached its usage limit.
Args:
tool: The tool to check
tool_name: The name of the tool (used for error message)
Returns:
Error message if limit reached, None otherwise
"""
if (
hasattr(tool, 'max_usage_count')
and tool.max_usage_count is not None
hasattr(tool, 'max_usage_count')
and tool.max_usage_count is not None
and tool.current_usage_count >= tool.max_usage_count
):
return f"Tool '{tool_name}' has reached its usage limit of {tool.max_usage_count} times and cannot be used anymore."

View File

@@ -25,7 +25,7 @@
"formatted_task_instructions": "Ensure your final answer contains only the content in the following format: {output_format}\n\nEnsure the final output does not include any code block markers like ```json or ```python.",
"conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals.",
"feedback_instructions": "User feedback: {feedback}\nInstructions: Use this feedback to enhance the next output iteration.\nNote: Do not respond or add commentary.",
"lite_agent_system_prompt_with_tools": "You are {role}. {backstory}\nYour personal goal is: {goal}\n\nYou ONLY have access to the following tools, and should NEVER make up tools that are not listed here:\n\n{tools}\n\nIMPORTANT: Use the following format in your response:\n\n```\nThought: you should always think about what to do\nAction: the action to take, only one name of [{tool_names}], just the name, exactly as it's written.\nAction Input: the input to the action, just a simple JSON object, enclosed in curly braces, using \" to wrap keys and values.\nObservation: the result of the action\n```\n\nOnce all necessary information is gathered, return the following format:\n\n```\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n```",
"lite_agent_system_prompt_with_tools": "You are {role}. {backstory}\nYour personal goal is: {goal}\n\nYou ONLY have access to the following tools, and should NEVER make up tools that are not listed here:\n\n{tools}\n\nIMPORTANT: Use the following format in your response:\n\n```\nThought: you should always think about what to do\nAction: the action to take, only one name of [{tool_names}], just the name, exactly as it's written.\nAction Input: the input to the action, just a simple JSON object, enclosed in curly braces, using \" to wrap keys and values.\nObservation: the result of the action\n```\n\nOnce all necessary information is gathered, return the following format:\n\n```\nThought: I now know the final answer\nFinal Answer: the complete final answer to the original input question\n```",
"lite_agent_system_prompt_without_tools": "You are {role}. {backstory}\nYour personal goal is: {goal}\n\nTo give my best complete final answer to the task respond using the exact following format:\n\nThought: I now can give a great answer\nFinal Answer: Your final answer must be the great and the most complete as possible, it must be outcome described.\n\nI MUST use these formats, my job depends on it!",
"lite_agent_response_format": "\nIMPORTANT: Your final answer MUST contain all the information requested in the following format: {response_format}\n\nIMPORTANT: Ensure the final output does not include any code block markers like ```json or ```python.",
"knowledge_search_query": "The original query is: {task_prompt}.",
@@ -41,7 +41,8 @@
"wrong_tool_name": "You tried to use the tool {tool}, but it doesn't exist. You must use one of the following tools, use one at time: {tools}.",
"tool_usage_exception": "I encountered an error while trying to use the tool. This was the error: {error}.\n Tool {tool} accepts these inputs: {tool_inputs}",
"agent_tool_execution_error": "Error executing task with agent '{agent_role}'. Error: {error}",
"validation_error": "### Previous attempt failed validation: {guardrail_result_error}\n\n\n### Previous result:\n{task_output}\n\n\nTry again, making sure to address the validation error."
"validation_error": "### Previous attempt failed validation: {guardrail_result_error}\n\n\n### Previous result:\n{task_output}\n\n\nTry again, making sure to address the validation error.",
"criteria_validation_error": "### Your answer did not meet all acceptance criteria\n\n### Unmet criteria:\n{unmet_criteria}\n\n### Previous result:\n{task_output}\n\n\nPlease revise your answer to ensure ALL acceptance criteria are met. Use the 'Access Scratchpad Memory' tool if you need to retrieve any previously collected information."
},
"tools": {
"delegate_work": "Delegate a specific task to one of the following coworkers: {coworkers}\nThe input to this tool should be the coworker, the task you want them to do, and ALL necessary context to execute the task, they know nothing about the task, so share absolutely everything you know, don't reference things but instead explain them.",
@@ -55,7 +56,12 @@
"reasoning": {
"initial_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are creating a strategic plan for a task that requires your expertise and unique perspective.",
"refine_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are refining a strategic plan for a task that requires your expertise and unique perspective.",
"create_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou have been assigned the following task:\n{description}\n\nExpected output:\n{expected_output}\n\nAvailable tools: {tools}\n\nBefore executing this task, create a detailed plan that leverages your expertise as {role} and outlines:\n1. Your understanding of the task from your professional perspective\n2. The key steps you'll take to complete it, drawing on your background and skills\n3. How you'll approach any challenges that might arise, considering your expertise\n4. How you'll strategically use the available tools based on your experience, exactly what tools to use and how to use them\n5. The expected outcome and how it aligns with your goal\n\nAfter creating your plan, assess whether you feel ready to execute the task or if you could do better.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
"refine_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou created the following plan for this task:\n{current_plan}\n\nHowever, you indicated that you're not ready to execute the task yet.\n\nPlease refine your plan further, drawing on your expertise as {role} to address any gaps or uncertainties. As you refine your plan, be specific about which available tools you will use, how you will use them, and why they are the best choices for each step. Clearly outline your tool usage strategy as part of your improved plan.\n\nAfter refining your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan further because [specific reason].\""
"create_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou have been assigned the following task:\n{description}\n\nExpected output:\n{expected_output}\n\nAvailable tools: {tools}\n\nBefore executing this task, create a detailed plan that leverages your expertise as {role} and outlines:\n1. Your understanding of the task from your professional perspective\n2. The key steps you'll take to complete it, drawing on your background and skills\n3. How you'll approach any challenges that might arise, considering your expertise\n4. How you'll strategically use the available tools based on your experience, exactly what tools to use and how to use them\n5. The expected outcome and how it aligns with your goal\n\nIMPORTANT: Structure your plan as follows:\n\nSTEPS:\n1. [First concrete action step]\n2. [Second concrete action step]\n3. [Continue with numbered steps...]\n\nACCEPTANCE CRITERIA:\n- [First criterion that must be met]\n- [Second criterion that must be met]\n- [Continue with criteria...]\n\nRemember: your ultimate objective is to produce the most COMPLETE Final Answer that fully meets the **Expected output** criteria.\n\nAfter creating your plan, assess whether you feel ready to execute the task or if you could do better.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
"refine_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou created the following plan for this task:\n{current_plan}\n\nHowever, you indicated that you're not ready to execute the task yet.\n\nPlease refine your plan further, drawing on your expertise as {role} to address any gaps or uncertainties. As you refine your plan, be specific about which available tools you will use, how you will use them, and why they are the best choices for each step. Clearly outline your tool usage strategy as part of your improved plan.\n\nIMPORTANT: Structure your refined plan as follows:\n\nSTEPS:\n1. [First concrete action step]\n2. [Second concrete action step]\n3. [Continue with numbered steps...]\n\nACCEPTANCE CRITERIA:\n- [First criterion that must be met]\n- [Second criterion that must be met]\n- [Continue with criteria...]\n\nMake sure your refined strategy directly guides you toward producing the most COMPLETE Final Answer that fully satisfies the **Expected output**.\n\nAfter refining your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan further because [specific reason].\"",
"adaptive_reasoning_decision": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are currently executing a task and need to decide whether to pause and reassess your plan based on the current context.",
"mid_execution_reasoning": "You are currently executing a task and need to reassess your plan based on progress so far.\n\nTASK DESCRIPTION:\n{description}\n\nEXPECTED OUTPUT:\n{expected_output}\n\nCURRENT PROGRESS:\nSteps completed: {current_steps}\nTools used: {tools_used}\nProgress summary: {current_progress}\n\nRECENT CONVERSATION:\n{recent_messages}\n\nYour reassessment MUST focus on steering the remaining work toward a FINAL ANSWER that is as complete as possible and perfectly matches the **Expected output**.\n\nBased on the current progress and context, please reassess your plan for completing this task.\nConsider what has been accomplished, what challenges you've encountered, and what steps remain.\nAdjust your strategy if needed or confirm your current approach is still optimal.\n\nIMPORTANT: Structure your updated plan as follows:\n\nREMAINING STEPS:\n1. [First remaining action step]\n2. [Second remaining action step]\n3. [Continue with numbered steps...]\n\nUPDATED ACCEPTANCE CRITERIA (if changed):\n- [First criterion that must be met]\n- [Second criterion that must be met]\n- [Continue with criteria...]\n\nProvide a detailed updated plan for completing the task.\nEnd with \"READY: I am ready to continue executing the task.\" if you're confident in your plan.",
"mid_execution_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are reassessing your plan during task execution based on the progress made so far.",
"mid_execution_reasoning_update": "I've reassessed my approach based on progress so far. Updated plan:\n\n{plan}",
"adaptive_reasoning_context": "\n\nTASK DESCRIPTION:\n{description}\n\nEXPECTED OUTPUT:\n{expected_output}\n\nCURRENT EXECUTION CONTEXT:\n- Steps completed: {current_steps}\n- Tools used: {tools_used}\n- Progress summary: {current_progress}\n\nConsider whether the current approach is optimal or if a strategic pause to reassess would be beneficial. You should reason when:\n- You might be approaching the task inefficiently\n- The context suggests a different strategy might be better\n- You're uncertain about the next steps\n- The progress suggests you need to reconsider your approach\n- Tool usage patterns indicate issues (e.g., repeated failures, same tool used many times, rapid switching)\n- Multiple tools have returned errors or empty results\n- You're using the same tool repeatedly without making progress\n\nPay special attention to the TOOL USAGE STATISTICS section if present, as it reveals patterns that might not be obvious from the tool list alone.\n\nDecide whether reasoning/re-planning is needed at this point."
}
}
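The STEPS / ACCEPTANCE CRITERIA structure these prompts request is what the reasoning module later in this diff parses back out of the model's reply. As a rough usage sketch of how one of these slices is resolved (the I18N import path and the task values are assumptions; retrieve("reasoning", ...) mirrors the calls in the reasoning code below):

from crewai.utilities.i18n import I18N  # import path assumed

i18n = I18N()
# Fill the mid-execution template with a snapshot of execution state.
prompt = i18n.retrieve("reasoning", "mid_execution_reasoning").format(
    description="Compute monthly revenue totals",      # hypothetical task
    expected_output="A table with one row per month",  # hypothetical
    current_steps=4,
    tools_used="SerperDevTool, FileReadTool",
    current_progress="Data located; aggregation still pending.",
    recent_messages="ASSISTANT: The first query returned empty results...",
)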

View File

@@ -215,9 +215,6 @@ def handle_agent_action_core(
if show_logs:
show_logs(formatted_answer)
if messages is not None:
messages.append({"role": "assistant", "content": tool_result.result})
return formatted_answer
@@ -296,6 +293,8 @@ def handle_context_length(
llm: Any,
callbacks: List[Any],
i18n: Any,
task_description: Optional[str] = None,
expected_output: Optional[str] = None,
) -> None:
"""Handle context length exceeded by either summarizing or raising an error.
@@ -306,13 +305,22 @@ def handle_context_length(
llm: LLM instance for summarization
callbacks: List of callbacks for LLM
i18n: I18N instance for messages
task_description: Optional original task description
expected_output: Optional expected output
"""
if respect_context_window:
printer.print(
content="Context length exceeded. Summarizing content to fit the model context window. Might take a while...",
color="yellow",
)
summarize_messages(messages, llm, callbacks, i18n)
summarize_messages(
messages,
llm,
callbacks,
i18n,
task_description=task_description,
expected_output=expected_output,
)
else:
printer.print(
content="Context length exceeded. Consider using smaller text or RAG tools from crewai_tools.",
@@ -328,6 +336,8 @@ def summarize_messages(
llm: Any,
callbacks: List[Any],
i18n: Any,
task_description: Optional[str] = None,
expected_output: Optional[str] = None,
) -> None:
"""Summarize messages to fit within context window.
@@ -336,6 +346,8 @@ def summarize_messages(
llm: LLM instance for summarization
callbacks: List of callbacks for LLM
i18n: I18N instance for messages
task_description: Optional original task description
expected_output: Optional expected output
"""
messages_string = " ".join([message["content"] for message in messages])
messages_groups = []
@@ -368,12 +380,19 @@ def summarize_messages(
merged_summary = " ".join(content["content"] for content in summarized_contents)
# Build the summary message and optionally inject the task reminder.
summary_message = i18n.slice("summary").format(merged_summary=merged_summary)
if task_description or expected_output:
summary_message += "\n\n" # blank line before the reminder
if task_description:
summary_message += f"Original task: {task_description}\n"
if expected_output:
summary_message += f"Expected output: {expected_output}"
# Replace the conversation with the new summary message.
messages.clear()
messages.append(
format_message_for_llm(
i18n.slice("summary").format(merged_summary=merged_summary)
)
)
messages.append(format_message_for_llm(summary_message))
def show_agent_logs(
@@ -464,7 +483,7 @@ def load_agent_from_repository(from_repository: str) -> Dict[str, Any]:
attributes[key] = []
for tool in value:
try:
module = importlib.import_module("crewai_tools")
module = importlib.import_module(tool["module"])
tool_class = getattr(module, tool["name"])
attributes[key].append(tool_class())
except Exception as e:
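Two changes land in this file: agent tools are now imported from the module named in the repository payload instead of a hard-coded crewai_tools, and context-window recovery now re-injects the original task after summarization. A hedged call sketch of the extended handle_context_length (keyword form, since the full argument order is not visible in this hunk; the locals are illustrative):

handle_context_length(
    respect_context_window=True,
    printer=printer,
    messages=messages,
    llm=llm,
    callbacks=[],
    i18n=I18N(),
    task_description=task.description,     # new: re-anchors the agent after the summary
    expected_output=task.expected_output,  # new: keeps the target output in view
)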

View File

@@ -61,6 +61,8 @@ from .reasoning_events import (
AgentReasoningStartedEvent,
AgentReasoningCompletedEvent,
AgentReasoningFailedEvent,
AgentMidExecutionReasoningStartedEvent,
AgentMidExecutionReasoningCompletedEvent,
)
@@ -108,6 +110,7 @@ class EventListener(BaseEventListener):
event.crew_name or "Crew",
source.id,
"completed",
final_result=final_string_output,
)
@crewai_event_bus.on(CrewKickoffFailedEvent)
@@ -437,8 +440,6 @@ class EventListener(BaseEventListener):
self.formatter.current_crew_tree,
)
# ----------- REASONING EVENTS -----------
@crewai_event_bus.on(AgentReasoningStartedEvent)
def on_agent_reasoning_started(source, event: AgentReasoningStartedEvent):
self.formatter.handle_reasoning_started(
@@ -462,5 +463,37 @@ class EventListener(BaseEventListener):
self.formatter.current_crew_tree,
)
@crewai_event_bus.on(AgentMidExecutionReasoningStartedEvent)
def on_mid_execution_reasoning_started(source, event: AgentMidExecutionReasoningStartedEvent):
self.formatter.handle_reasoning_started(
self.formatter.current_agent_branch,
event.attempt if hasattr(event, "attempt") else 1,
self.formatter.current_crew_tree,
current_step=event.current_step,
reasoning_trigger=event.reasoning_trigger,
)
@crewai_event_bus.on(AgentMidExecutionReasoningCompletedEvent)
def on_mid_execution_reasoning_completed(source, event: AgentMidExecutionReasoningCompletedEvent):
self.formatter.handle_reasoning_completed(
event.updated_plan,
True,
self.formatter.current_crew_tree,
duration_seconds=event.duration_seconds,
current_step=event.current_step,
reasoning_trigger=event.reasoning_trigger,
)
from crewai.utilities.events.reasoning_events import AgentAdaptiveReasoningDecisionEvent
@crewai_event_bus.on(AgentAdaptiveReasoningDecisionEvent)
def on_adaptive_reasoning_decision(source, event: AgentAdaptiveReasoningDecisionEvent):
self.formatter.handle_adaptive_reasoning_decision(
self.formatter.current_agent_branch,
event.should_reason,
event.reasoning,
self.formatter.current_crew_tree,
)
event_listener = EventListener()

View File

@@ -19,6 +19,7 @@ class AgentReasoningCompletedEvent(BaseEvent):
plan: str
ready: bool
attempt: int = 1
duration_seconds: float = 0.0 # Time taken for reasoning in seconds
class AgentReasoningFailedEvent(BaseEvent):
@@ -28,4 +29,37 @@ class AgentReasoningFailedEvent(BaseEvent):
agent_role: str
task_id: str
error: str
attempt: int = 1
attempt: int = 1
class AgentMidExecutionReasoningStartedEvent(BaseEvent):
"""Event emitted when an agent starts mid-execution reasoning."""
type: str = "agent_mid_execution_reasoning_started"
agent_role: str
task_id: str
current_step: int
reasoning_trigger: str # "interval" or "adaptive"
class AgentMidExecutionReasoningCompletedEvent(BaseEvent):
"""Event emitted when an agent completes mid-execution reasoning."""
type: str = "agent_mid_execution_reasoning_completed"
agent_role: str
task_id: str
current_step: int
updated_plan: str
reasoning_trigger: str
duration_seconds: float = 0.0 # Time taken for reasoning in seconds
class AgentAdaptiveReasoningDecisionEvent(BaseEvent):
"""Event emitted after the agent decides whether to trigger adaptive reasoning."""
type: str = "agent_adaptive_reasoning_decision"
agent_role: str
task_id: str
should_reason: bool # Whether the agent decided to reason
reasoning: str # Brief explanation / rationale from the LLM
reasoning_trigger: str = "adaptive" # Always adaptive for this event
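Any subscriber on the event bus can react to these new events, not just the built-in EventListener shown above. A minimal sketch (the crewai_event_bus import path is assumed; the reasoning_events path matches this diff):

from crewai.utilities.events.crewai_event_bus import crewai_event_bus  # path assumed
from crewai.utilities.events.reasoning_events import AgentAdaptiveReasoningDecisionEvent

@crewai_event_bus.on(AgentAdaptiveReasoningDecisionEvent)
def log_adaptive_decision(source, event: AgentAdaptiveReasoningDecisionEvent):
    # Surface the yes/no decision and the model's rationale, like the console handler does.
    print(f"[{event.agent_role}] adaptive reasoning? {event.should_reason}: {event.reasoning}")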

View File

@@ -1,4 +1,5 @@
from typing import Any, Dict, Optional
import threading
from rich.console import Console
from rich.panel import Panel
@@ -17,6 +18,14 @@ class ConsoleFormatter:
current_lite_agent_branch: Optional[Tree] = None
tool_usage_counts: Dict[str, int] = {}
current_reasoning_branch: Optional[Tree] = None # Track reasoning status
current_adaptive_decision_branch: Optional[Tree] = None # Track last adaptive decision branch
# Spinner support ---------------------------------------------------
_spinner_frames = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
_spinner_index: int = 0
_spinner_branches: Dict[Tree, tuple[str, str, str]] = {} # branch -> (icon, name, style)
_spinner_thread: Optional[threading.Thread] = None
_stop_spinner_event: Optional[threading.Event] = None
_spinner_running: bool = False
current_llm_tool_tree: Optional[Tree] = None
def __init__(self, verbose: bool = False):
@@ -49,6 +58,8 @@ class ConsoleFormatter:
for label, value in fields.items():
content.append(f"{label}: ", style="white")
if label == "Result":
content.append("\n")
content.append(
f"{value}\n", style=fields.get(f"{label}_style", status_style)
)
@@ -138,6 +149,7 @@ class ConsoleFormatter:
crew_name: str,
source_id: str,
status: str = "completed",
final_result: Optional[str] = None,
) -> None:
"""Handle crew tree updates with consistent formatting."""
if not self.verbose or tree is None:
@@ -163,15 +175,26 @@ class ConsoleFormatter:
style,
)
# Prepare additional fields for the completion panel
additional_fields: Dict[str, Any] = {"ID": source_id}
# Include the final result if provided and the status is completed
if status == "completed" and final_result is not None:
additional_fields["Result"] = final_result
content = self.create_status_content(
content_title,
crew_name or "Crew",
style,
ID=source_id,
**additional_fields,
)
self.print_panel(content, title, style)
# Clear all spinners when crew completes or fails
if status in {"completed", "failed"}:
self._clear_all_spinners()
def create_crew_tree(self, crew_name: str, source_id: str) -> Optional[Tree]:
"""Create and initialize a new crew tree with initial status."""
if not self.verbose:
@@ -219,6 +242,15 @@ class ConsoleFormatter:
# Set the current_task_branch attribute directly
self.current_task_branch = task_branch
# When a new task starts, clear pointers to the previous agent and tool
# branches so that any upcoming Tool logs attach to the correct task.
if self.current_tool_branch:
self._unregister_spinner_branch(self.current_tool_branch)
self.current_agent_branch = None
# Keep current_reasoning_branch; reasoning may still be in progress
self.current_tool_branch = None
return task_branch
def update_task_status(
@@ -266,6 +298,17 @@ class ConsoleFormatter:
)
self.print_panel(content, panel_title, style)
# Clear task-scoped pointers after the task is finished so subsequent
# events don't mistakenly attach to the old task branch.
if status in {"completed", "failed"}:
self.current_task_branch = None
self.current_agent_branch = None
self.current_tool_branch = None
# Ensure spinner is stopped if reasoning branch exists
if self.current_reasoning_branch is not None:
self._unregister_spinner_branch(self.current_reasoning_branch)
self.current_reasoning_branch = None
def create_agent_branch(
self, task_branch: Optional[Tree], agent_role: str, crew_tree: Optional[Tree]
) -> Optional[Tree]:
@@ -502,19 +545,20 @@ class ConsoleFormatter:
# Update tool usage count
self.tool_usage_counts[tool_name] = self.tool_usage_counts.get(tool_name, 0) + 1
# Find or create tool node
tool_branch = self.current_tool_branch
if tool_branch is None:
tool_branch = branch_to_use.add("")
self.current_tool_branch = tool_branch
# Always create a new branch for each tool invocation so that previous
# tool usages remain visible in the tree.
tool_branch = branch_to_use.add("")
self.current_tool_branch = tool_branch
# Update label with current count
spinner_char = self._next_spinner()
self.update_tree_label(
tool_branch,
"🔧",
f"🔧 {spinner_char}",
f"Using {tool_name} ({self.tool_usage_counts[tool_name]})",
"yellow",
)
self._register_spinner_branch(tool_branch, "🔧", f"Using {tool_name} ({self.tool_usage_counts[tool_name]})", "yellow")
# Print updated tree immediately
self.print(tree_to_use)
@@ -544,9 +588,7 @@ class ConsoleFormatter:
f"Used {tool_name} ({self.tool_usage_counts[tool_name]})",
"green",
)
# Clear the current tool branch as we're done with it
self.current_tool_branch = None
self._unregister_spinner_branch(tool_branch)
# Only print if we have a valid tree and the tool node is still in it
if isinstance(tree_to_use, Tree) and tool_branch in tree_to_use.children:
@@ -574,6 +616,7 @@ class ConsoleFormatter:
f"{tool_name} ({self.tool_usage_counts[tool_name]})",
"red",
)
self._unregister_spinner_branch(tool_branch)
if tree_to_use:
self.print(tree_to_use)
self.print()
@@ -613,7 +656,9 @@ class ConsoleFormatter:
# Only add thinking status if we don't have a current tool branch
if self.current_tool_branch is None:
tool_branch = branch_to_use.add("")
self.update_tree_label(tool_branch, "🧠", "Thinking...", "blue")
spinner_char = self._next_spinner()
self.update_tree_label(tool_branch, f"🧠 {spinner_char}", "Thinking...", "blue")
self._register_spinner_branch(tool_branch, "🧠", "Thinking...", "blue")
self.current_tool_branch = tool_branch
self.print(tree_to_use)
self.print()
@@ -647,6 +692,8 @@ class ConsoleFormatter:
for parent in parents:
if isinstance(parent, Tree) and tool_branch in parent.children:
parent.children.remove(tool_branch)
# Stop spinner for the thinking branch before removing
self._unregister_spinner_branch(tool_branch)
removed = True
break
@@ -671,6 +718,7 @@ class ConsoleFormatter:
# Update tool branch if it exists
if tool_branch:
tool_branch.label = Text("❌ LLM Failed", style="red bold")
self._unregister_spinner_branch(tool_branch)
if tree_to_use:
self.print(tree_to_use)
self.print()
@@ -1106,17 +1154,23 @@ class ConsoleFormatter:
agent_branch: Optional[Tree],
attempt: int,
crew_tree: Optional[Tree],
current_step: Optional[int] = None,
reasoning_trigger: Optional[str] = None,
) -> Optional[Tree]:
"""Handle agent reasoning started (or refinement) event."""
if not self.verbose:
return None
# Prefer LiteAgent > Agent > Task branch as the parent for reasoning
branch_to_use = (
self.current_lite_agent_branch
or agent_branch
or self.current_task_branch
)
# Prefer to nest under the latest adaptive decision branch when this is a
# mid-execution reasoning cycle so the tree indents nicely.
if current_step is not None and self.current_adaptive_decision_branch is not None:
branch_to_use = self.current_adaptive_decision_branch
else:
branch_to_use = (
self.current_lite_agent_branch
or agent_branch
or self.current_task_branch
)
# We always want to render the full crew tree when possible so the
# Live view updates coherently. Fallbacks: crew tree → branch itself.
@@ -1132,11 +1186,21 @@ class ConsoleFormatter:
reasoning_branch = branch_to_use.add("")
self.current_reasoning_branch = reasoning_branch
# Build label text depending on attempt
status_text = (
f"Reasoning (Attempt {attempt})" if attempt > 1 else "Reasoning..."
)
self.update_tree_label(reasoning_branch, "🧠", status_text, "blue")
# Build label text depending on attempt and whether it's mid-execution
if current_step is not None:
status_text = "Mid-Execution Reasoning"
else:
status_text = (
f"Reasoning (Attempt {attempt})" if attempt > 1 else "Reasoning..."
)
# ⠋ is the first frame of a braille spinner; it visually hints at progress
# even without true animation.
spinner_char = self._next_spinner()
self.update_tree_label(reasoning_branch, f"🧠 {spinner_char}", status_text, "yellow")
# Register branch for continuous spinner
self._register_spinner_branch(reasoning_branch, "🧠", status_text, "yellow")
self.print(tree_to_use)
self.print()
@@ -1148,6 +1212,9 @@ class ConsoleFormatter:
plan: str,
ready: bool,
crew_tree: Optional[Tree],
duration_seconds: float = 0.0,
current_step: Optional[int] = None,
reasoning_trigger: Optional[str] = None,
) -> None:
"""Handle agent reasoning completed event."""
if not self.verbose:
@@ -1161,10 +1228,31 @@ class ConsoleFormatter:
or crew_tree
)
style = "green" if ready else "yellow"
status_text = "Reasoning Completed" if ready else "Reasoning Completed (Not Ready)"
# Completed reasoning should always display in green.
style = "green"
# Build duration part separately for cleaner formatting
duration_part = f"{duration_seconds:.2f}s" if duration_seconds > 0 else ""
if reasoning_branch is not None:
if current_step is not None:
# Build label manually to style duration differently and omit trigger info.
if reasoning_branch is not None:
label = Text()
label.append("✅ ", style=f"{style} bold")
label.append("Mid-Execution Reasoning Completed", style=style)
if duration_part:
label.append(f" ({duration_part})", style="cyan")
reasoning_branch.label = label
status_text = None # Already set label manually
else:
status_text = (
f"Reasoning Completed ({duration_part})" if duration_part else "Reasoning Completed"
) if ready else (
f"Reasoning Completed (Not Ready • {duration_part})" if duration_part else "Reasoning Completed (Not Ready)"
)
# If we didn't build a custom label (non-mid-execution case), use helper
if status_text and reasoning_branch is not None:
self.update_tree_label(reasoning_branch, "✅", status_text, style)
if tree_to_use is not None:
@@ -1172,9 +1260,17 @@ class ConsoleFormatter:
# Show plan in a panel (trim very long plans)
if plan:
# Derive duration text for panel title
duration_text = f" ({duration_part})" if duration_part else ""
if current_step is not None:
title = f"🧠 Mid-Execution Reasoning Plan{duration_text}"
else:
title = f"🧠 Reasoning Plan{duration_text}"
plan_panel = Panel(
Text(plan, style="white"),
title="🧠 Reasoning Plan",
title=title,
border_style=style,
padding=(1, 2),
)
@@ -1182,9 +1278,17 @@ class ConsoleFormatter:
self.print()
# Unregister spinner before clearing
if reasoning_branch is not None:
self._unregister_spinner_branch(reasoning_branch)
# Clear stored branch after completion
self.current_reasoning_branch = None
# After reasoning finished, we also clear the adaptive decision branch to
# avoid nesting unrelated future nodes.
self.current_adaptive_decision_branch = None
def handle_reasoning_failed(
self,
error: str,
@@ -1204,6 +1308,7 @@ class ConsoleFormatter:
if reasoning_branch is not None:
self.update_tree_label(reasoning_branch, "❌", "Reasoning Failed", "red")
self._unregister_spinner_branch(reasoning_branch)
if tree_to_use is not None:
self.print(tree_to_use)
@@ -1219,3 +1324,115 @@ class ConsoleFormatter:
# Clear stored branch after failure
self.current_reasoning_branch = None
# ----------- ADAPTIVE REASONING DECISION EVENTS -----------
def handle_adaptive_reasoning_decision(
self,
agent_branch: Optional[Tree],
should_reason: bool,
reasoning: str,
crew_tree: Optional[Tree],
) -> None:
"""Render the decision on whether to trigger adaptive reasoning."""
if not self.verbose:
return
# Prefer LiteAgent > Agent > Task as parent
branch_to_use = (
self.current_lite_agent_branch
or agent_branch
or self.current_task_branch
)
tree_to_use = self.current_crew_tree or crew_tree or branch_to_use
if branch_to_use is None or tree_to_use is None:
return
decision_branch = branch_to_use.add("")
decision_text = "YES" if should_reason else "NO"
style = "green" if should_reason else "yellow"
self.update_tree_label(
decision_branch,
"🤔",
f"Adaptive Reasoning Decision: {decision_text}",
style,
)
# Print tree first (live update)
self.print(tree_to_use)
# Also show explanation if available
if reasoning:
truncated_reasoning = reasoning[:500] + "..." if len(reasoning) > 500 else reasoning
panel = Panel(
Text(truncated_reasoning, style="white"),
title="🤔 Adaptive Reasoning Rationale",
border_style=style,
padding=(1, 2),
)
self.print(panel)
self.print()
# Store the decision branch so that subsequent mid-execution reasoning nodes
# can be rendered as children of this decision (for better indentation).
self.current_adaptive_decision_branch = decision_branch
# ------------------------------------------------------------------
# Spinner helpers
# ------------------------------------------------------------------
def _next_spinner(self) -> str:
"""Return next spinner frame."""
frame = self._spinner_frames[self._spinner_index]
self._spinner_index = (self._spinner_index + 1) % len(self._spinner_frames)
return frame
def _register_spinner_branch(self, branch: Tree, icon: str, name: str, style: str):
"""Start animating spinner for given branch."""
self._spinner_branches[branch] = (icon, name, style)
if not self._spinner_running:
self._start_spinner_thread()
def _unregister_spinner_branch(self, branch: Optional[Tree]):
if branch is None:
return
self._spinner_branches.pop(branch, None)
if not self._spinner_branches:
self._stop_spinner_thread()
def _start_spinner_thread(self):
if self._spinner_running:
return
self._stop_spinner_event = threading.Event()
self._spinner_thread = threading.Thread(target=self._spinner_loop, daemon=True)
self._spinner_thread.start()
self._spinner_running = True
def _stop_spinner_thread(self):
if self._stop_spinner_event:
self._stop_spinner_event.set()
self._spinner_running = False
def _clear_all_spinners(self):
"""Clear all active spinners. Used as a safety mechanism."""
self._spinner_branches.clear()
self._stop_spinner_thread()
def _spinner_loop(self):
import time
while self._stop_spinner_event and not self._stop_spinner_event.is_set():
if self._live and self._spinner_branches:
for branch, (icon, name, style) in list(self._spinner_branches.items()):
spinner_char = self._next_spinner()
self.update_tree_label(branch, f"{icon} {spinner_char}", name, style)
# Refresh live view
try:
self._live.update(self._live.renderable, refresh=True)
except Exception:
pass
time.sleep(0.15)
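All animated branches share one daemon thread: registering the first branch starts it, and unregistering the last one stops it, so an idle formatter costs nothing. A rough lifecycle illustration (these are private helpers, called directly here only for clarity; the module path is assumed):

from rich.tree import Tree
from crewai.utilities.events.utils.console_formatter import ConsoleFormatter  # path assumed

formatter = ConsoleFormatter(verbose=True)
tree = Tree("Crew")
branch = tree.add("")  # e.g. a tool-usage node

# First registration spawns the shared spinner thread.
formatter._register_spinner_branch(branch, "🔧", "Using SerperDevTool (1)", "yellow")
# ... work happens; the thread redraws the braille frame roughly every 150 ms ...
# Removing the last registered branch stops the thread again.
formatter._unregister_spinner_branch(branch)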

View File

@@ -1,6 +1,6 @@
import logging
import json
from typing import Tuple, cast
from typing import Tuple, cast, List, Optional, Dict, Any
from pydantic import BaseModel, Field
@@ -16,10 +16,17 @@ from crewai.utilities.events.reasoning_events import (
)
class StructuredPlan(BaseModel):
"""Structured representation of a task plan."""
steps: List[str] = Field(description="List of steps to complete the task")
acceptance_criteria: List[str] = Field(description="Criteria that must be met before task is considered complete")
class ReasoningPlan(BaseModel):
"""Model representing a reasoning plan for a task."""
plan: str = Field(description="The detailed reasoning plan for the task.")
ready: bool = Field(description="Whether the agent is ready to execute the task.")
structured_plan: Optional[StructuredPlan] = Field(default=None, description="Structured version of the plan")
class AgentReasoningOutput(BaseModel):
@@ -31,6 +38,8 @@ class ReasoningFunction(BaseModel):
"""Model for function calling with reasoning."""
plan: str = Field(description="The detailed reasoning plan for the task.")
ready: bool = Field(description="Whether the agent is ready to execute the task.")
steps: Optional[List[str]] = Field(default=None, description="List of steps to complete the task")
acceptance_criteria: Optional[List[str]] = Field(default=None, description="Criteria that must be met before task is complete")
class AgentReasoning:
@@ -38,7 +47,7 @@ class AgentReasoning:
Handles the agent reasoning process, enabling an agent to reflect and create a plan
before executing a task.
"""
def __init__(self, task: Task, agent: Agent):
def __init__(self, task: Task, agent: Agent, extra_context: str | None = None):
if not task or not agent:
raise ValueError("Both task and agent must be provided.")
self.task = task
@@ -46,6 +55,7 @@ class AgentReasoning:
self.llm = cast(LLM, agent.llm)
self.logger = logging.getLogger(__name__)
self.i18n = I18N()
self.extra_context = extra_context or ""
def handle_agent_reasoning(self) -> AgentReasoningOutput:
"""
@@ -55,6 +65,9 @@ class AgentReasoning:
Returns:
AgentReasoningOutput: The output of the agent reasoning process.
"""
import time
start_time = time.time()
# Emit a reasoning started event (attempt 1)
try:
crewai_event_bus.emit(
@@ -72,6 +85,8 @@ class AgentReasoning:
try:
output = self.__handle_agent_reasoning()
duration_seconds = time.time() - start_time
# Emit reasoning completed event
try:
crewai_event_bus.emit(
@@ -82,6 +97,7 @@ class AgentReasoning:
plan=output.plan.plan,
ready=output.plan.ready,
attempt=1,
duration_seconds=duration_seconds,
),
)
except Exception:
@@ -112,25 +128,25 @@ class AgentReasoning:
Returns:
AgentReasoningOutput: The output of the agent reasoning process.
"""
plan, ready = self.__create_initial_plan()
plan, ready, structured_plan = self.__create_initial_plan()
plan, ready = self.__refine_plan_if_needed(plan, ready)
plan, ready, structured_plan = self.__refine_plan_if_needed(plan, ready, structured_plan)
reasoning_plan = ReasoningPlan(plan=plan, ready=ready)
reasoning_plan = ReasoningPlan(plan=plan, ready=ready, structured_plan=structured_plan)
return AgentReasoningOutput(plan=reasoning_plan)
def __create_initial_plan(self) -> Tuple[str, bool]:
def __create_initial_plan(self) -> Tuple[str, bool, Optional[StructuredPlan]]:
"""
Creates the initial reasoning plan for the task.
Returns:
Tuple[str, bool]: The initial plan and whether the agent is ready to execute the task.
Tuple[str, bool, Optional[StructuredPlan]]: The initial plan, whether the agent is ready, and structured plan.
"""
reasoning_prompt = self.__create_reasoning_prompt()
if self.llm.supports_function_calling():
plan, ready = self.__call_with_function(reasoning_prompt, "initial_plan")
return plan, ready
plan, ready, structured_plan = self.__call_with_function(reasoning_prompt, "initial_plan")
return plan, ready, structured_plan
else:
system_prompt = self.i18n.retrieve("reasoning", "initial_plan").format(
role=self.agent.role,
@@ -145,18 +161,21 @@ class AgentReasoning:
]
)
return self.__parse_reasoning_response(str(response))
plan, ready = self.__parse_reasoning_response(str(response))
structured_plan = self.__extract_structured_plan(plan)
return plan, ready, structured_plan
def __refine_plan_if_needed(self, plan: str, ready: bool) -> Tuple[str, bool]:
def __refine_plan_if_needed(self, plan: str, ready: bool, structured_plan: Optional[StructuredPlan]) -> Tuple[str, bool, Optional[StructuredPlan]]:
"""
Refines the reasoning plan if the agent is not ready to execute the task.
Args:
plan: The current reasoning plan.
ready: Whether the agent is ready to execute the task.
structured_plan: The current structured plan.
Returns:
Tuple[str, bool]: The refined plan and whether the agent is ready to execute the task.
Tuple[str, bool, Optional[StructuredPlan]]: The refined plan, ready status, and structured plan.
"""
attempt = 1
max_attempts = self.agent.max_reasoning_attempts
@@ -178,7 +197,7 @@ class AgentReasoning:
refine_prompt = self.__create_refine_prompt(plan)
if self.llm.supports_function_calling():
plan, ready = self.__call_with_function(refine_prompt, "refine_plan")
plan, ready, structured_plan = self.__call_with_function(refine_prompt, "refine_plan")
else:
system_prompt = self.i18n.retrieve("reasoning", "refine_plan").format(
role=self.agent.role,
@@ -193,6 +212,7 @@ class AgentReasoning:
]
)
plan, ready = self.__parse_reasoning_response(str(response))
structured_plan = self.__extract_structured_plan(plan)
attempt += 1
@@ -202,9 +222,9 @@ class AgentReasoning:
)
break
return plan, ready
return plan, ready, structured_plan
def __call_with_function(self, prompt: str, prompt_type: str) -> Tuple[str, bool]:
def __call_with_function(self, prompt: str, prompt_type: str) -> Tuple[str, bool, Optional[StructuredPlan]]:
"""
Calls the LLM with function calling to get a reasoning plan.
@@ -213,7 +233,7 @@ class AgentReasoning:
prompt_type: The type of prompt (initial_plan or refine_plan).
Returns:
Tuple[str, bool]: A tuple containing the plan and whether the agent is ready.
Tuple[str, bool, Optional[StructuredPlan]]: A tuple containing the plan, ready status, and structured plan.
"""
self.logger.debug(f"Using function calling for {prompt_type} reasoning")
@@ -232,6 +252,16 @@ class AgentReasoning:
"ready": {
"type": "boolean",
"description": "Whether the agent is ready to execute the task."
},
"steps": {
"type": "array",
"items": {"type": "string"},
"description": "List of steps to complete the task"
},
"acceptance_criteria": {
"type": "array",
"items": {"type": "string"},
"description": "Criteria that must be met before task is considered complete"
}
},
"required": ["plan", "ready"]
@@ -247,9 +277,14 @@ class AgentReasoning:
)
# Prepare a simple callable that just returns the tool arguments as JSON
def _create_reasoning_plan(plan: str, ready: bool): # noqa: N802
def _create_reasoning_plan(plan: str, ready: bool, steps: Optional[List[str]] = None, acceptance_criteria: Optional[List[str]] = None): # noqa: N802
"""Return the reasoning plan result in JSON string form."""
return json.dumps({"plan": plan, "ready": ready})
return json.dumps({
"plan": plan,
"ready": ready,
"steps": steps,
"acceptance_criteria": acceptance_criteria
})
response = self.llm.call(
[
@@ -265,12 +300,19 @@ class AgentReasoning:
try:
result = json.loads(response)
if "plan" in result and "ready" in result:
return result["plan"], result["ready"]
structured_plan = None
if result.get("steps") or result.get("acceptance_criteria"):
structured_plan = StructuredPlan(
steps=result.get("steps", []),
acceptance_criteria=result.get("acceptance_criteria", [])
)
return result["plan"], result["ready"], structured_plan
except (json.JSONDecodeError, KeyError):
pass
response_str = str(response)
return response_str, "READY: I am ready to execute the task." in response_str
structured_plan = self.__extract_structured_plan(response_str)
return response_str, "READY: I am ready to execute the task." in response_str, structured_plan
except Exception as e:
self.logger.warning(f"Error during function calling: {str(e)}. Falling back to text parsing.")
@@ -290,10 +332,11 @@ class AgentReasoning:
)
fallback_str = str(fallback_response)
return fallback_str, "READY: I am ready to execute the task." in fallback_str
structured_plan = self.__extract_structured_plan(fallback_str)
return fallback_str, "READY: I am ready to execute the task." in fallback_str, structured_plan
except Exception as inner_e:
self.logger.error(f"Error during fallback text parsing: {str(inner_e)}")
return "Failed to generate a plan due to an error.", True # Default to ready to avoid getting stuck
return "Failed to generate a plan due to an error.", True, None # Default to ready to avoid getting stuck
def __get_agent_backstory(self) -> str:
"""
@@ -317,7 +360,7 @@ class AgentReasoning:
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory(),
description=self.task.description,
description=self.task.description + (f"\n\nContext:\n{self.extra_context}" if self.extra_context else ""),
expected_output=self.task.expected_output,
tools=available_tools
)
@@ -368,7 +411,7 @@ class AgentReasoning:
plan = response
ready = False
if "READY: I am ready to execute the task." in response:
if "READY: I am ready to execute the task." in response or "READY: I am ready to continue executing the task." in response:
ready = True
return plan, ready
@@ -385,3 +428,403 @@ class AgentReasoning:
"The _handle_agent_reasoning method is deprecated. Use handle_agent_reasoning instead."
)
return self.handle_agent_reasoning()
def _emit_reasoning_event(self, event_class, **kwargs):
"""Centralized method for emitting reasoning events."""
try:
reasoning_trigger = "interval"
if hasattr(self.agent, 'adaptive_reasoning') and self.agent.adaptive_reasoning:
reasoning_trigger = "adaptive"
crewai_event_bus.emit(
self.agent,
event_class(
agent_role=self.agent.role,
task_id=str(self.task.id),
reasoning_trigger=reasoning_trigger,
**kwargs
),
)
except Exception:
# Ignore event bus errors to avoid breaking execution
pass
def handle_mid_execution_reasoning(
self,
current_steps: int,
tools_used: list,
current_progress: str,
iteration_messages: list
) -> AgentReasoningOutput:
"""
Handle reasoning during task execution with context about current progress.
Args:
current_steps: Number of steps executed so far
tools_used: List of tools that have been used
current_progress: Summary of progress made so far
iteration_messages: Recent conversation messages
Returns:
AgentReasoningOutput: Updated reasoning plan based on current context
"""
import time
start_time = time.time()
from crewai.utilities.events.reasoning_events import AgentMidExecutionReasoningStartedEvent
self._emit_reasoning_event(
AgentMidExecutionReasoningStartedEvent,
current_step=current_steps
)
try:
output = self.__handle_mid_execution_reasoning(
current_steps, tools_used, current_progress, iteration_messages
)
duration_seconds = time.time() - start_time
# Emit completed event
from crewai.utilities.events.reasoning_events import AgentMidExecutionReasoningCompletedEvent
self._emit_reasoning_event(
AgentMidExecutionReasoningCompletedEvent,
current_step=current_steps,
updated_plan=output.plan.plan,
duration_seconds=duration_seconds
)
return output
except Exception as e:
# Emit failed event
from crewai.utilities.events.reasoning_events import AgentReasoningFailedEvent
self._emit_reasoning_event(
AgentReasoningFailedEvent,
error=str(e),
attempt=1
)
raise
def __handle_mid_execution_reasoning(
self,
current_steps: int,
tools_used: list,
current_progress: str,
iteration_messages: list
) -> AgentReasoningOutput:
"""
Private method that handles the mid-execution reasoning process.
Args:
current_steps: Number of steps executed so far
tools_used: List of tools that have been used
current_progress: Summary of progress made so far
iteration_messages: Recent conversation messages
Returns:
AgentReasoningOutput: The output of the mid-execution reasoning process.
"""
mid_execution_prompt = self.__create_mid_execution_prompt(
current_steps, tools_used, current_progress, iteration_messages
)
if self.llm.supports_function_calling():
plan, ready, structured_plan = self.__call_with_function(mid_execution_prompt, "mid_execution_plan")
else:
# Use the same prompt for system context
system_prompt = self.i18n.retrieve("reasoning", "mid_execution_plan").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory()
)
response = self.llm.call(
[
{"role": "system", "content": system_prompt},
{"role": "user", "content": mid_execution_prompt}
]
)
plan, ready = self.__parse_reasoning_response(str(response))
structured_plan = self.__extract_structured_plan(plan)
reasoning_plan = ReasoningPlan(plan=plan, ready=ready, structured_plan=structured_plan)
return AgentReasoningOutput(plan=reasoning_plan)
def __create_mid_execution_prompt(
self,
current_steps: int,
tools_used: list,
current_progress: str,
iteration_messages: list
) -> str:
"""
Creates a prompt for the agent to reason during task execution.
Args:
current_steps: Number of steps executed so far
tools_used: List of tools that have been used
current_progress: Summary of progress made so far
iteration_messages: Recent conversation messages
Returns:
str: The mid-execution reasoning prompt.
"""
tools_used_str = ", ".join(tools_used) if tools_used else "No tools used yet"
recent_messages = ""
if iteration_messages:
recent_msgs = iteration_messages[-6:] if len(iteration_messages) > 6 else iteration_messages
for msg in recent_msgs:
role = msg.get("role", "unknown")
content = msg.get("content", "")
if content:
recent_messages += f"{role.upper()}: {content[:200]}{'...' if len(content) > 200 else ''}\n\n"
return self.i18n.retrieve("reasoning", "mid_execution_reasoning").format(
description=self.task.description + (f"\n\nContext:\n{self.extra_context}" if self.extra_context else ""),
expected_output=self.task.expected_output,
current_steps=current_steps,
tools_used=tools_used_str,
current_progress=current_progress,
recent_messages=recent_messages
)
def should_adaptive_reason_llm(
self,
current_steps: int,
tools_used: list,
current_progress: str,
tool_usage_stats: Optional[Dict[str, Any]] = None
) -> bool:
"""
Use LLM function calling to determine if adaptive reasoning should be triggered.
Args:
current_steps: Number of steps executed so far
tools_used: List of tools that have been used
current_progress: Summary of progress made so far
tool_usage_stats: Optional statistics about tool usage patterns
Returns:
bool: True if reasoning should be triggered, False otherwise.
"""
try:
decision_prompt = self.__create_adaptive_reasoning_decision_prompt(
current_steps, tools_used, current_progress, tool_usage_stats
)
if self.llm.supports_function_calling():
should_reason, reasoning_expl = self.__call_adaptive_reasoning_function(decision_prompt)
else:
should_reason, reasoning_expl = self.__call_adaptive_reasoning_text(decision_prompt)
# Emit an event so the UI/console can display the decision
try:
from crewai.utilities.events.reasoning_events import AgentAdaptiveReasoningDecisionEvent
self._emit_reasoning_event(
AgentAdaptiveReasoningDecisionEvent,
should_reason=should_reason,
reasoning=reasoning_expl,
)
except Exception:
# Ignore event bus errors to avoid breaking execution
pass
return should_reason
except Exception as e:
self.logger.warning(f"Error during adaptive reasoning decision: {str(e)}. Defaulting to no reasoning.")
return False
def __call_adaptive_reasoning_function(self, prompt: str) -> tuple[bool, str]:
"""Call LLM with function calling for adaptive reasoning decision."""
function_schema = {
"type": "function",
"function": {
"name": "decide_reasoning_need",
"description": "Decide whether reasoning is needed based on current task execution context",
"parameters": {
"type": "object",
"properties": {
"should_reason": {
"type": "boolean",
"description": "Whether reasoning/re-planning is needed at this point in task execution."
},
"reasoning": {
"type": "string",
"description": "Brief explanation of why reasoning is or isn't needed."
},
"detected_issues": {
"type": "array",
"items": {"type": "string"},
"description": "List of specific issues detected (e.g., 'repeated tool failures', 'no progress', 'inefficient approach')"
}
},
"required": ["should_reason", "reasoning"]
}
}
}
def _decide_reasoning_need(should_reason: bool, reasoning: str, detected_issues: Optional[List[str]] = None):
"""Return the reasoning decision result in JSON string form."""
result = {
"should_reason": should_reason,
"reasoning": reasoning
}
if detected_issues:
result["detected_issues"] = detected_issues
# Append detected issues to reasoning explanation
issues_str = ", ".join(detected_issues)
result["reasoning"] = f"{reasoning} Detected issues: {issues_str}"
return json.dumps(result)
system_prompt = self.i18n.retrieve("reasoning", "adaptive_reasoning_decision").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory()
)
response = self.llm.call(
[
{"role": "system", "content": system_prompt},
{"role": "user", "content": prompt}
],
tools=[function_schema],
available_functions={"decide_reasoning_need": _decide_reasoning_need},
)
try:
result = json.loads(response)
reasoning_text = result.get("reasoning", "No explanation provided")
if result.get("detected_issues"):
# Include detected issues in the reasoning text for logging
self.logger.debug(f"Adaptive reasoning detected issues: {result['detected_issues']}")
return result.get("should_reason", False), reasoning_text
except (json.JSONDecodeError, KeyError):
return False, "No explanation provided"
def __call_adaptive_reasoning_text(self, prompt: str) -> tuple[bool, str]:
"""Fallback text-based adaptive reasoning decision."""
system_prompt = self.i18n.retrieve("reasoning", "adaptive_reasoning_decision").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory()
)
response = self.llm.call([
{"role": "system", "content": system_prompt},
{"role": "user", "content": prompt + "\n\nRespond with 'YES' if reasoning is needed, 'NO' if not."}
])
return "YES" in str(response).upper(), "No explanation provided"
def __create_adaptive_reasoning_decision_prompt(
self,
current_steps: int,
tools_used: list,
current_progress: str,
tool_usage_stats: Optional[Dict[str, Any]] = None
) -> str:
"""Create prompt for adaptive reasoning decision."""
tools_used_str = ", ".join(tools_used) if tools_used else "No tools used yet"
# Add tool usage statistics to the prompt
tool_stats_str = ""
if tool_usage_stats:
tool_stats_str = "\n\nTOOL USAGE STATISTICS:\n"
tool_stats_str += f"- Total tool invocations: {tool_usage_stats.get('total_tool_uses', 0)}\n"
tool_stats_str += f"- Unique tools used: {tool_usage_stats.get('unique_tools', 0)}\n"
if tool_usage_stats.get('tools_by_frequency'):
tool_stats_str += "- Tool frequency:\n"
for tool, count in tool_usage_stats['tools_by_frequency'].items():
tool_stats_str += f"{tool}: {count} times\n"
if tool_usage_stats.get('recent_patterns'):
tool_stats_str += f"- Recent patterns: {tool_usage_stats['recent_patterns']}\n"
# Use the prompt from i18n and format it with the current context
base_prompt = self.i18n.retrieve("reasoning", "adaptive_reasoning_decision").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory()
)
context_prompt = self.i18n.retrieve("reasoning", "adaptive_reasoning_context").format(
description=self.task.description + (f"\n\nContext:\n{self.extra_context}" if self.extra_context else ""),
expected_output=self.task.expected_output,
current_steps=current_steps,
tools_used=tools_used_str,
current_progress=current_progress + tool_stats_str
)
prompt = base_prompt + context_prompt
return prompt
def __extract_structured_plan(self, plan: str) -> Optional[StructuredPlan]:
"""
Extracts a structured plan from the given plan text.
Args:
plan: The plan text.
Returns:
Optional[StructuredPlan]: The extracted structured plan or None if no plan was found.
"""
if not plan:
return None
import re
steps = []
acceptance_criteria = []
# Look for numbered steps (1., 2., etc.)
step_pattern = r'^\s*(?:\d+\.|\-|\*)\s*(.+)$'
# Look for acceptance criteria section
in_acceptance_section = False
lines = plan.split('\n')
for line in lines:
line = line.strip()
# Check if we're entering acceptance criteria section
if any(marker in line.lower() for marker in ['acceptance criteria', 'success criteria', 'completion criteria']):
in_acceptance_section = True
continue
# Skip empty lines
if not line:
continue
# Extract steps or criteria
match = re.match(step_pattern, line, re.MULTILINE)
if match:
content = match.group(1).strip()
if in_acceptance_section:
acceptance_criteria.append(content)
else:
steps.append(content)
elif line and not line.endswith(':'): # Non-empty line that's not a header
if in_acceptance_section:
acceptance_criteria.append(line)
else:
# Check if it looks like a step (starts with action verb)
action_verbs = ['create', 'implement', 'design', 'build', 'test', 'verify', 'check', 'ensure', 'analyze', 'review']
if any(line.lower().startswith(verb) for verb in action_verbs):
steps.append(line)
# If we found steps or criteria, return structured plan
if steps or acceptance_criteria:
return StructuredPlan(
steps=steps,
acceptance_criteria=acceptance_criteria
)
return None
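For clarity, here is roughly what __extract_structured_plan recovers from a plan that follows the STEPS / ACCEPTANCE CRITERIA format the prompts request (hypothetical plan text; the comment shows the conceptual result):

plan_text = (
    "STEPS:\n"
    "1. Review the dataset schema\n"
    "2. Implement the aggregation query\n"
    "ACCEPTANCE CRITERIA:\n"
    "- Query returns one row per customer\n"
    "- Totals match the source table\n"
)
# Parsing plan_text as above would conceptually yield:
# StructuredPlan(
#     steps=["Review the dataset schema", "Implement the aggregation query"],
#     acceptance_criteria=["Query returns one row per customer",
#                          "Totals match the source table"],
# )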

View File

@@ -0,0 +1,89 @@
from unittest.mock import MagicMock, patch
import pytest
from crewai import Agent, Crew, Task
from crewai.agents.crew_agent_executor import CrewAgentExecutor
def _create_executor(agent): # noqa: D401,E501
"""Utility to build a minimal CrewAgentExecutor with the given agent.
A real LLM call is not required for these unit tests, so we stub it with
MagicMock to avoid any network interaction.
"""
return CrewAgentExecutor(
llm=MagicMock(),
task=MagicMock(),
crew=MagicMock(),
agent=agent,
prompt={},
max_iter=5,
tools=[],
tools_names="",
stop_words=[],
tools_description="",
tools_handler=MagicMock(),
)
def test_agent_adaptive_reasoning_default():
"""Agent.adaptive_reasoning should be False by default."""
agent = Agent(role="Test", goal="Goal", backstory="Backstory")
assert agent.adaptive_reasoning is False
@pytest.mark.parametrize("adaptive_decision,expected", [(True, True), (False, False)])
def test_should_trigger_reasoning_with_adaptive_reasoning(adaptive_decision, expected):
"""Verify _should_trigger_reasoning defers to _should_adaptive_reason when
adaptive_reasoning is enabled and reasoning_interval is None."""
# Use a lightweight mock instead of a full Agent instance to isolate the logic
agent = MagicMock()
agent.reasoning = True
agent.reasoning_interval = None
agent.adaptive_reasoning = True
executor = _create_executor(agent)
# Ensure the helper returns the desired decision
with patch.object(executor, "_should_adaptive_reason", return_value=adaptive_decision) as mock_adaptive:
assert executor._should_trigger_reasoning() is expected
mock_adaptive.assert_called_once()
@pytest.mark.vcr(filter_headers=["authorization"])
def test_adaptive_reasoning_full_execution():
"""End-to-end test that triggers adaptive reasoning in a real execution flow.
The task description intentionally contains the word "error" to activate the
simple error-based heuristic inside `_should_adaptive_reason`, guaranteeing
that the agent reasons mid-execution without relying on patched internals.
"""
agent = Agent(
role="Math Analyst",
goal="Solve arithmetic problems flawlessly",
backstory="You excel at basic calculations and always double-check your steps.",
llm="gpt-4o-mini",
reasoning=True,
adaptive_reasoning=True,
verbose=False,
)
task = Task(
description="There was an unexpected error earlier. Now, please calculate 3 + 5 and return only the number.",
expected_output="The result of the calculation (a single number).",
agent=agent,
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
# Validate the answer is correct and numeric
assert result.raw.strip() == "8"
# Confirm that an adaptive reasoning message (Updated plan) was injected
assert any(
"updated plan" in msg.get("content", "").lower()
for msg in agent.agent_executor.messages
)
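For reference, a hedged configuration sketch contrasting the two triggers exercised above. adaptive_reasoning comes straight from these tests; reasoning_interval appears only as a mocked attribute here, so its exact cadence semantics are an assumption:

from crewai import Agent

# Adaptive: an extra LLM call decides when mid-execution re-planning is worthwhile.
adaptive_agent = Agent(
    role="Researcher", goal="Find sources", backstory="A methodical analyst.",
    reasoning=True,
    adaptive_reasoning=True,
)

# Interval-based: re-plan on a fixed step cadence instead (assumed: every 3 steps).
interval_agent = Agent(
    role="Researcher", goal="Find sources", backstory="A methodical analyst.",
    reasoning=True,
    reasoning_interval=3,
)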

View File

@@ -309,7 +309,9 @@ def test_cache_hitting():
def handle_tool_end(source, event):
received_events.append(event)
with (patch.object(CacheHandler, "read") as read,):
with (
patch.object(CacheHandler, "read") as read,
):
read.return_value = "0"
task = Task(
description="What is 2 times 6? Ignore correctness and just return the result of the multiplication tool, you must use the tool.",
@@ -1628,13 +1630,13 @@ def test_agent_execute_task_with_ollama():
@pytest.mark.vcr(filter_headers=["authorization"])
def test_agent_with_knowledge_sources():
# Create a knowledge source with some content
content = "Brandon's favorite color is red and he likes Mexican food."
string_source = StringKnowledgeSource(content=content)
with patch("crewai.knowledge") as MockKnowledge:
mock_knowledge_instance = MockKnowledge.return_value
mock_knowledge_instance.sources = [string_source]
mock_knowledge_instance.search.return_value = [{"content": content}]
MockKnowledge.add_sources.return_value = [string_source]
agent = Agent(
role="Information Agent",
@@ -1644,7 +1646,6 @@ def test_agent_with_knowledge_sources():
knowledge_sources=[string_source],
)
# Create a task that requires the agent to use the knowledge
task = Task(
description="What is Brandon's favorite color?",
expected_output="Brandon's favorite color.",
@@ -1652,10 +1653,11 @@ def test_agent_with_knowledge_sources():
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
# Assert that the agent provides the correct information
assert "red" in result.raw.lower()
with patch.object(Knowledge, "add_sources") as mock_add_sources:
result = crew.kickoff()
assert mock_add_sources.called, "add_sources() should have been called"
mock_add_sources.assert_called_once()
assert "red" in result.raw.lower()
@pytest.mark.vcr(filter_headers=["authorization"])
@@ -2036,7 +2038,7 @@ def mock_get_auth_token():
@patch("crewai.cli.plus_api.PlusAPI.get_agent")
def test_agent_from_repository(mock_get_agent, mock_get_auth_token):
from crewai_tools import SerperDevTool
from crewai_tools import SerperDevTool, XMLSearchTool
mock_get_response = MagicMock()
mock_get_response.status_code = 200
@@ -2044,7 +2046,10 @@ def test_agent_from_repository(mock_get_agent, mock_get_auth_token):
"role": "test role",
"goal": "test goal",
"backstory": "test backstory",
"tools": [{"name": "SerperDevTool"}],
"tools": [
{"module": "crewai_tools", "name": "SerperDevTool"},
{"module": "crewai_tools", "name": "XMLSearchTool"},
],
}
mock_get_agent.return_value = mock_get_response
agent = Agent(from_repository="test_agent")
@@ -2052,8 +2057,9 @@ def test_agent_from_repository(mock_get_agent, mock_get_auth_token):
assert agent.role == "test role"
assert agent.goal == "test goal"
assert agent.backstory == "test backstory"
assert len(agent.tools) == 1
assert len(agent.tools) == 2
assert isinstance(agent.tools[0], SerperDevTool)
assert isinstance(agent.tools[1], XMLSearchTool)
@patch("crewai.cli.plus_api.PlusAPI.get_agent")
@@ -2066,7 +2072,7 @@ def test_agent_from_repository_override_attributes(mock_get_agent, mock_get_auth
"role": "test role",
"goal": "test goal",
"backstory": "test backstory",
"tools": [{"name": "SerperDevTool"}],
"tools": [{"name": "SerperDevTool", "module": "crewai_tools"}],
}
mock_get_agent.return_value = mock_get_response
agent = Agent(from_repository="test_agent", role="Custom Role")
@@ -2086,7 +2092,7 @@ def test_agent_from_repository_with_invalid_tools(mock_get_agent, mock_get_auth_
"role": "test role",
"goal": "test goal",
"backstory": "test backstory",
"tools": [{"name": "DoesNotExist"}],
"tools": [{"name": "DoesNotExist", "module": "crewai_tools",}],
}
mock_get_agent.return_value = mock_get_response
with pytest.raises(

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1020,4 +1020,63 @@ interactions:
- req_83a900d075a98ab391c27c5d1cd4fbcb
http_version: HTTP/1.1
status_code: 200
- request:
body: !!binary |
CtcMCiQKIgoMc2VydmljZS5uYW1lEhIKEGNyZXdBSS10ZWxlbWV0cnkSrgwKEgoQY3Jld2FpLnRl
bGVtZXRyeRKUCAoQot+4QIkYBzRK/SkLTbr2XBIIKFjBQjUmSxQqDENyZXcgQ3JlYXRlZDABOdDB
ddshVEMYQbBDfdshVEMYShsKDmNyZXdhaV92ZXJzaW9uEgkKBzAuMTIxLjBKGgoOcHl0aG9uX3Zl
cnNpb24SCAoGMy4xMS43Si4KCGNyZXdfa2V5EiIKIDY5NDY1NGEzMThmNzE5ODgzYzA2ZjhlNmQ5
YTc1NDlmSjEKB2NyZXdfaWQSJgokYTgxMTFiOTktZWJkMy00ZWYzLWFmNmQtMTk1ZDhiYjNhN2Jl
ShwKDGNyZXdfcHJvY2VzcxIMCgpzZXF1ZW50aWFsShEKC2NyZXdfbWVtb3J5EgIQAEoaChRjcmV3
X251bWJlcl9vZl90YXNrcxICGAFKGwoVY3Jld19udW1iZXJfb2ZfYWdlbnRzEgIYAUo6ChBjcmV3
X2ZpbmdlcnByaW50EiYKJGQ2YTE3OTk4LTQ0ODgtNDQ0Mi1iY2I3LWZiYzdlMDU1NjE4MUo7Chtj
cmV3X2ZpbmdlcnByaW50X2NyZWF0ZWRfYXQSHAoaMjAyNS0wNS0yN1QwMToxMzowNC43NDEyNzRK
zAIKC2NyZXdfYWdlbnRzErwCCrkCW3sia2V5IjogIjU1ODY5YmNiMTYzMjNlNzEyOWQyNTIzNjJj
ODU1ZGE2IiwgImlkIjogIjdiOTMxZWIzLTRiM2YtNGI3OC1hOWEzLTY4ODZiNTE1M2QxZiIsICJy
b2xlIjogIlNheSBIaSIsICJ2ZXJib3NlPyI6IGZhbHNlLCAibWF4X2l0ZXIiOiAyNSwgIm1heF9y
cG0iOiBudWxsLCAiZnVuY3Rpb25fY2FsbGluZ19sbG0iOiAiIiwgImxsbSI6ICJ0ZXN0LW1vZGVs
IiwgImRlbGVnYXRpb25fZW5hYmxlZD8iOiBmYWxzZSwgImFsbG93X2NvZGVfZXhlY3V0aW9uPyI6
IGZhbHNlLCAibWF4X3JldHJ5X2xpbWl0IjogMiwgInRvb2xzX25hbWVzIjogW119XUr7AQoKY3Jl
d190YXNrcxLsAQrpAVt7ImtleSI6ICJkZTI5NDBmMDZhZDhhNDE2YzI4Y2MwZjI2MTBmMTgwYiIs
ICJpZCI6ICI5MWU0NjUyYy1kMzk0LTQyMGQtYTIwOS0wNzlhYThkN2E5MDAiLCAiYXN5bmNfZXhl
Y3V0aW9uPyI6IGZhbHNlLCAiaHVtYW5faW5wdXQ/IjogZmFsc2UsICJhZ2VudF9yb2xlIjogIlNh
eSBIaSIsICJhZ2VudF9rZXkiOiAiNTU4NjliY2IxNjMyM2U3MTI5ZDI1MjM2MmM4NTVkYTYiLCAi
dG9vbHNfbmFtZXMiOiBbXX1degIYAYUBAAEAABKABAoQG7YeGAh0HoexHn1XjCcuCRIIHGHDZGTQ
CskqDFRhc2sgQ3JlYXRlZDABObDFkdshVEMYQThWktshVEMYSi4KCGNyZXdfa2V5EiIKIDY5NDY1
NGEzMThmNzE5ODgzYzA2ZjhlNmQ5YTc1NDlmSjEKB2NyZXdfaWQSJgokYTgxMTFiOTktZWJkMy00
ZWYzLWFmNmQtMTk1ZDhiYjNhN2JlSi4KCHRhc2tfa2V5EiIKIGRlMjk0MGYwNmFkOGE0MTZjMjhj
YzBmMjYxMGYxODBiSjEKB3Rhc2tfaWQSJgokOTFlNDY1MmMtZDM5NC00MjBkLWEyMDktMDc5YWE4
ZDdhOTAwSjoKEGNyZXdfZmluZ2VycHJpbnQSJgokZDZhMTc5OTgtNDQ4OC00NDQyLWJjYjctZmJj
N2UwNTU2MTgxSjoKEHRhc2tfZmluZ2VycHJpbnQSJgokZDk1ZDk1ZmYtYTBhNC00NmFjLWI2NWUt
MWE4Njg5ODYwOWQySjsKG3Rhc2tfZmluZ2VycHJpbnRfY3JlYXRlZF9hdBIcChoyMDI1LTA1LTI3
VDAxOjEzOjA0Ljc0MTIyNko7ChFhZ2VudF9maW5nZXJwcmludBImCiRkNTc5ZDA2Ny03YzU1LTQw
NmQtYTg5Zi1mZTM1MzU0ZGFlYTJ6AhgBhQEAAQAA
headers:
Accept:
- '*/*'
Accept-Encoding:
- gzip, deflate
Connection:
- keep-alive
Content-Length:
- '1626'
Content-Type:
- application/x-protobuf
User-Agent:
- OTel-OTLP-Exporter-Python/1.31.1
method: POST
uri: https://telemetry.crewai.com:4319/v1/traces
response:
body:
string: "\n\0"
headers:
Content-Length:
- '2'
Content-Type:
- application/x-protobuf
Date:
- Tue, 27 May 2025 08:13:05 GMT
status:
code: 200
message: OK
version: 1

File diff suppressed because one or more lines are too long (2 files)

View File

@@ -199,4 +199,829 @@ interactions:
- req_2ac1e3cef69e9b09b7ade0e1d010fc08
http_version: HTTP/1.1
status_code: 200
- request:
body: '{"input": ["I now can give a great answer. Final Answer: **Topic**: Basic
Addition **Explanation**: Addition is a fundamental concept in math that means
combining two or more numbers to get a new total. It''s like putting together
pieces of a puzzle to see the whole picture. When we add, we take two or more
groups of things and count them all together. **Angle**: Use relatable and
engaging real-life scenarios to illustrate addition, making it fun and easier
for a 6-year-old to understand and apply. **Examples**: 1. **Counting Apples**: Let''s
say you have 2 apples and your friend gives you 3 more apples. How many apples
do you have in total? - You start with 2 apples. - Your friend gives you
3 more apples. - Now, you count all the apples together: 2 + 3 = 5. -
So, you have 5 apples in total. 2. **Toy Cars**: Imagine you have 4 toy
cars and you find 2 more toy cars in your room. How many toy cars do you have
now? - You start with 4 toy cars. - You find 2 more toy cars. - You
count them all together: 4 + 2 = 6. - So, you have 6 toy cars in total. 3.
**Drawing Pictures**: If you draw 3 pictures today and 2 pictures tomorrow,
how many pictures will you have drawn in total? - You draw 3 pictures today. -
You draw 2 pictures tomorrow. - You add them together: 3 + 2 = 5. - So,
you will have drawn 5 pictures in total. 4. **Using Fingers**: Let''s use
your fingers to practice addition. Show 3 fingers on one hand and 1 finger on
the other hand. How many fingers are you holding up? - 3 fingers on one hand. -
1 finger on the other hand. - Put them together and count: 3 + 1 = 4. -
So, you are holding up 4 fingers. By using objects that kids are familiar with,
such as apples, toy cars, drawings, and even their own fingers, we can make
the concept of addition relatable and enjoyable. Practicing with real items
helps children visualize the math and understand that addition is simply combining
groups to find out how many there are altogether."], "model": "text-embedding-3-small",
"encoding_format": "base64"}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '2092'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-read-timeout:
- '600'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.7
method: POST
uri: https://api.openai.com/v1/embeddings
response:
body:
string: !!binary |
H4sIAAAAAAAAA1SaSw+yTrfl5++nePKf0idyEarqnXEXASkEr51OBxAREJFLFVAn57t38OmcTk8c
IBGE2muv9dv1n//68+efNq3ybPzn33/+eZfD+M//WI89kjH5599//ue//vz58+c/f5//35l5k+aP
R/kpfqf/viw/j3z+599/+P8+8v9O+veff+ZECgOidjuPT/hggOYz8/B5WXJjKhdrgcXu+QpqzjpU
fHRkJbTCW4nP3zI23k+iNugeb0qq9aECplduqMgw4oEQ79P11KnpRe7JhmCtVCkjaRmdQPymFnlr
alLNxrOD0FHcF7aCU+zx4if3AdioPj4Y2K+W8UIT2IeKGkz3BnhfmzMXhMhxhw/EruKFp/kWHL5S
Rd399RvP3VZSAdhPHc2G/mvQ7ZJPECZiSrWnMXjLt25zkGDgYS1J7X4e5CVBrDtrNIqXOObbdEhg
ROwPxh/vyKb6ClTY3d0d9UThysbDPCwwwwnEKvAoYO0YdUhr0iGY+TruhV2VnaAT4xrbtfoFQywZ
KozwpAdXsNlVAhNFE87oqWHTwbAfzlrpoFZo1GB5DAmTpjsNAZeNO5yk8SedYWISNNrcg+KxbEC2
C4sJ9dXtQk+NdKgk+XtbIH+J+gCiAqXU6h86VKTwRC/vsUynp+tE6P60Q6py75e3zCGywe65SMH2
ethV/FLVIthevw69JLYe88BmOVTVvAvGgtcqnhNLHbXz/h7wdW2nou/vM9jm1QcHlhIZ00s+OrBi
QkSfwsiBiVotQSdU37A6uDmYLevmQzxFEd7nxdJTgNISnqOhoImdvPvFfNEcpqpQYtuzjh4f3r4+
CM5KS+Y+VBizw6H5+z6vocWDIVgYROWyhThwhI2x8PSigNcMG+yj4pHOvfsqlfen4OnFferp0m0P
BXQsNaHnk6FXYsjXLcKhltFry12qqZNuPqxvAoeNnXz25nl4qjDUlxbvz8HEFul5KEGjcg71d9Wr
J49BTqARKjMNqH6PheNO7dD1MX+pIfQ3JpzdRYeNNXLY/AZ6NRePYwl3pmlS3XOu3nD2DqLiwFbC
l1ta9PzEpwoUIiAEKLo/PSkLM1P+trNIxm1z7ykpThw6yJDDYWQ4vRhvXglyVasmUOOXeIzFYwFP
cuzTdT1V88OJE2TfnATne+gwiRZKB+F4mQOFM2WwlGflBKO+PmJf45d0fkh2A0vd1PDxMzqxSK/d
BD/TwaCPQBwq3qNFCEmS7rGWaueUdVlfw3PouzivjachVAfVhZcr1KkT2Q828dnGAbODn2QWvnYv
GMElQFonxngv3U+pVNC4gVKZNuSSkA+QhjPawn5oGFXtrK2Wx7GK0A4ML3y+cDQeQHh14TeeXRzG
08ug+CsvUDFfDtk+rFssHIWLA9vZu2M3+pzSuXkVNTpPoMO7d1p7U+/FOrzdRYL3U/7piXfGGbix
UKdHGs39OJzVGrG8PeDw0c5sycqtDvnmZuM7oVtv7rYbFV4jU6RxWpdVS54sQ6Nwe/zepyFwqWMj
EJsV3d32eirQl8TBjmg3et9/dxX/yg0dvnMf4IPLC2zOhk8Ji86K6HOvW5449e8SNaKN6eFEHjHv
yR4EBbt7+EbSXb+g8QGh38oH7Jbinc0mew9IAmNOw+3W6dnTNW20i44BPUWqXkmc5BVw9CRC7etc
G0s5OzlsP1KBD+ZTYmycvwPwkvNCd35ZpkxhkYMeYU6wZWZlz+rvK4L32kb0AHWtEk/biw7N3t0E
pfjOKnYwzQbdx2NCVW+IK0l9mDXyNE3BVqraHo0d+QLrk+1gk1ZqJV5bs4NvxXCpWn/8mE/54Qat
e73BGl/HFf/Y+1u4JVmHb9rxWy1m4N7AkKoMa2QoGTvHYQLbiMNUhxedSUyYJhQr40xKeTOy6SS1
A8jApcGH/OH30/c9LoCiTYgDaTj3bJOlNrL2z5p6THGB8MaiA/2Hf8fnu5yl/EnzM/DuRRVj7zyk
0+tDCLgkF4T30JgYm17hBblhsCEKs56p4D1KHQmRLOBdJBUxFWIeovFbBzgOQtcQV32F7qnwsHeO
ZMaie3GDR7cssJoC2Vj20T2HQZxENIvdV7V4Xj+Bjfwc8L5NEu9Xj6AZpgtObtw+FdlxypB+D2N6
MV0zFfYjSxDpyhrr78lJJYDiAn08E2A36TUgnGW89uekDHripdWS2KQErplMBDrMZ9MzriBkVqBg
9XrM1nqVXEhx1+PoReZ+kA8vB8Yb9xZ0PB8AYeZV++//u+jpibGDUvDQsfQkgM9rmM6DrCSw8gMe
62v/EpRCuUGznUxq4bttvFXKnyDXaCp9uPwZTC9fDNHenyKcp+JkLFtFvoB0+9xg0yQFmzoXncDv
98yNq1ZS7+U8iIfy89OHfnCEKdpqjx3D1uLTeBFOJIdO4EvUKV5qKmaozRRyUwOalGHvsaj6Tkje
DjLdyySqFk3+KvADZJFi9day0boJIbraiY6NV8biqYY1j7Ran+m+LPexCHijRXXpSUSpkNEv0LEz
WFhmEwg23TO+c9EFXIxXSq/385NNxRbkqDzVV4qjU8OoJ48Z5EQ3pansXoxxYl0ClyII8PMFbMCP
58VE589nxpohDN6kA38Lq3Mxkhm2midJp+mEWl3kqb8zHzHlyTuBrcQB7Avnd7qkt5CDeRs0WKU1
BbN8mXU4vMRLMNDmzCQuOg7IO2cvmoqCBFjvXXiAU8fFjzhfDMZS9YS6z/mIdeHaprPQDAOEdsrT
nXzIPEm/5ArEp0tLlI1XVYM+GwU6+nJKcfOuPJ4EDYGin1dk+529lP3WI0z4FB+PsxtLrjopqHod
XtjIfCeeA4vWkAfJEkhH9wNoKS8tSgieSN+oRj9VhxSi0+NaBnz/2gOpdQoePbxcowfbn9IF3bdb
SJ+TjePeK3u2hN8Q1oe9izH/7PoJf8Pwr77GvCp5yyY2fHD4ZC1+bBu5mrYHL4OyUS80Izetl3b9
MkBOdFLqmP6lWuovssH7reYBr2+O8XTIZR96yXUJGHftAcMoF8HyvfUYX6CfsnZMOnTvlCcO1n4r
NP0XwlVvSHWXOm/5hKqKxEDK6WF3rNIZL4WLFi69YjMOLoAah5sCizzJqRnGncdIkXGA25EUO2/K
V/S4Ezhg7ewvtn2+6PmZO0cw7wikJslEMPB168OhdwJq3eUs5pHon6BbnBKK97plCEWDfPh4wCt2
IDpVdPWXCgmGM72qQI8ZheoEf3qmX3ZzNdVXpiMmagINci9I+QmXNqqI+KY+U3fevFR3HZ6F0xHv
dmER0w+nDagsYg87ihoC6RzfEqhz4YZsIxux71N7D4j2mUJmwJkVT8C+QY9PGAZbPG/6cWzfHYwT
/Yj11rIrNn6KC+AyuiMzF+tgEh9KATTZeFBtrYepa4wMICGwMNb7ySPbZ+9DJc0saq9+l46f9gLn
p5tRcyuUVce7OAHg3MhEOjhHb75/Y15ePNvGFrRFwJbwFYG1XnFoXc/VdNL8HNRjopJvdvBikaWZ
q1yMKqWed4tSWl85Hr7zAFD11Tk9w+jCK93negy4NHylS4gVHsovcA06QX6x9X5OcPXbVN0wIaaw
rExYqaWGD9HWiMnbKiBK81qjUSoWYNa81gcgS1KyHQg2pHisXDR6AqHuxjP6v/04r+QE4+ZteMLP
v7X37kIPVu0C/llPBCLve6GuFgOwTP1YKuN5/yHzsHHTyQBljXKrFqjxLQog3g/yFrYRxARMhcyW
8WAMIN9eNMK0+mpQWiwdaG5nir30I7HyV3/5mYTBIpkFYGz/2MLWwhRj4pyrJXJoBkQ/q9Z+6sdk
Q44XiOvcoMFz27DpM8QZkl/ylernlw4mKQ99eNudbvi+zASw2JkvaHvtHRrDjqV/+2UPh4BGTHc9
UUGfi8K3SkZ1H7+8KbsziHbO5vzLAx6Vv+EC62TBhH68GQz8PjHhYOs7MgXakU15nargg2+74Bhd
ZW+Y730OO2LcqMciKxV5etnCl3tyCODLsJqqpc+hbrclzVT6jJflGE2A7WiI93o3evOQ1yHkzvIb
H6L7xlvuX0eHN5PXcTi4HGDF6ZBBA/IJtcpq2887rswR2VISREzvjGWCS/Q3T9+tT8j4y/Q4AVWX
3mQq3uee321iCJf37hkoG8+o2JuVLbztLjdsfQ48G5qQtrD8vo2g0JMjWMKujxS+NJVgU+2BMbrb
KIN5N8AAOa+gJ30s3+DRGDgy7hutWrLjoQHqdrcjkbqHjDmp1/1dn0YUFYwmFlKBVW632PXMF5u/
qpsBVRfeRFIzzls0WpRo0jeIWr88/wnlAhhJ8MW2nvJsWjgFwp9/uHWbTV8f0UTAej80H/m3txAg
Kmjth1QlTwjI8Q4DGG+cGz1N89lYIqHVEfwab2xMD7Pib1/awbc5t9QASetNt8KDAC/fw/r8JsbY
ZnER+iJA/e9RAqPvnyNYeNOIrWfrVTS9bxbw6o0WmzvN8VhlHhdUGeoVW9Flz+bNlpTgvfcx2W6o
6k3SByjwfb3fycwUyRs4sVQhXK55AF9D0C+S1+YwcO076ZO7UUlL+ArR6m8p5q7Am992ksGLGH6w
phwHg73kjFP0vbPFp3N1NsZvTAu41hc9VVCtiHEYCJQa8UAOw71jc3rXMzTPU4B3d742WAKPEXI+
nYetfScC+mZdC3/Pe99YANBm987AqenTYDM9GBtXvyufP++ZTM9zFy96e3bBwt2veP9KN/20rjdo
MT3+63/YbVZ8+DjwkPQRAj05P8YG6FMPCDLwUC0vb4KoUgsN30X2rqbw9goQSqc5UK7SJiX3pSLw
GR02dNVXNjf25MDFXhTCe99DRV6KnYF8t5nIzz8J0w4XUAyEnGpdNxijGxcdbF7flnB72LJZsYYQ
8g/7QJjKtfHq37ZwrY/gUT6NeOUROUB6JtNjnOtAUoL2AnHquthhxsGjuaoTONjqjhrGEVfDY28q
8Bjzd4wfcugxf9IWxI7aHueKxsDS4O4El70n0MA/7jwxrtMQ6oJdEvPA9WDZS14It7OdUfd6nHp2
L945ZMMIsV/OWsXQTulgm7kD1rO8qSaPFtHPP2KvaaKUCTGE8Drd02B7FQ4GrZTwgqoraMm0069x
1z3ONjQ2h3Ow7BUd/OUJBxK5QfdQC29ESXeDMjhgbHdoTNf3FciOIJzx+XM4rfpxPkFjg8+klmzm
fSfW3ZTVb9PHYX/yFkHb36BPB0rjrT4ytjX3IaqbJg+mVa+Wbj+piseYjTXEcxVBlRJBO+dfeEcs
NRaeRK3RpH50aj+MxljurxxCrIlRgBPOMNgjAS5cogehh8dNrkhgcja8sOVIFHjfGmy6f0L443s7
MlpA5KKxVDRr8KkF8qOxXPMWwvKyiagbnkZvfn2iLZIBxj9+ELNV30FW37/0serZfFVcRanBkwbF
6WYD/g4iG9HnYmMruRX9rOWfBpZCLGE/px2Y1voFF/dQrv23ipn77gjIFmdP17zR0wQeQ/ispYl6
t6E3WiGeXUTr7w0b7JJUkw5MBU6c3NCTftd78SGebURBc6YWlY7GcJ2ADt/FGOFIC+1etG7eAqMo
vGN75WuDG8McgnMtU9vrn2yauXMID157x8+Mf3gLKEUF2m6q4AB/tXRqwk8HDVHf0l+9jzDxCfwm
zUB3e1tJZ99/hL98ToRT73h/n//1tr2Ry3vU46Xd3LYoTbyE2ppgG9M3cgi4lv6RJqPUG9O5STlo
DGRDpOvTYlI7Ji1Y+VfAUV1OZxpEChpt+KCPOI+87zEcLlDdRf7K9zgwK/jdoeoqt9iF95ux8qcb
MB8vheIdSL1lkNUtMNvFDDZZEMRLAKITyBB9UY0p19Wfez6smzoP0LVowShoyhY8HtyV6iQxeyE4
hB36LjtK0C8/RunTB4pZOQSsPHbZp4KDNvJjoJalz/HyUoIcjhK7YXwYtHRAx6SAP/8DoUJ6qrfa
BTp30GO7rzJjGeYrD1G6zNSaG7Wf+ublwsUzbfzXnzySY4uOD13Eq5/2Fi56F7CmE6B7kVm9WGj3
Lbh+dmXAbxbbEL+RqEKn02ysOwc3FstnrgOK255eTptbNWTce4ErrybS3k7iZXrVDshzZQx6rdCA
EIvHEh2k2ljzyataBtlRUPZQNfxs3pVBAD7y0DooHnWmyWKsjcwC7mliBDeuvcTLq3klCOyXbvVj
Qs/gLuNgfTIdfBO4xBDse8RBS0gDvL+AfcyP36sDf/lit+ax6Yi2BKz+hcb5E/dz+JpttK2uEGuz
9jK67GjVMBbEgnqcbMXCYOQuNJ1XgB+NLrApvechPB4vBj7ortzTSrmd4O7SJNhldVURK2htKAiv
Bseffc9W/1FD2UIj9Vce8hY9GP7WL46G0TcmlIgBsKL5gTGah3QmX+8C1w4X3NgXsNnivQTOfFQH
U/EWqjnlwwa07/yNtdMgxYvkFdmPl6z++xyzrtmFci3bfCAkN7UfRH1IQC8oA3kvPk4Hy94P8MqT
D0FkiAxem0YRnI6LG0hKhL3VH7UQOicL78vyG7PRj3z55aQS9cv51U/fSCXIvadZwPvyFoxhaHdQ
S8iWrnmPTSlWaqBm+wfVY9+spOPdJWBdj0ScldlYyOF1g3HsU3xLJQNI5tImsJSHnD4W0rJ55rQc
xepwp/rhIcTT01UjWEq+g73b4HmSJ78zCDewxJmdB9XCxDL81QPhsIs8ejD9BgI/Vsm7CI/p+NK/
NvzgZIft9fp/9W/NQ6Rud1vAuOhI0JqPg6pnjiH+eORvfoLT+vjrhxFUHoKKHfVbe0ueDq0iHIBD
xIqL2DIarQ795/tEz6naGNOTLCZQpmzBfndxGRNPUQtXfcIBP7fs626TDFx24gfbq3599fO3hvh0
avEZ2hcw7TYp95e3OIP9rYaNtnfhwevu1H17N6/7JNoNrX4C7wyRGsPKt4DNuC/dn0UxnfbvvQ/n
4+uGcc+u6XKqJw7Zo9bjw7a5VzQarRw2b8gH7cpfV17RgCCrKV7nJ3/1Dd32Y4w13YmrVY8HyI7G
nsiNlbL50qgOuATdHrvXY1gt7y6aYMWkiDq5X3okrtMIao5fUsMN4nRBGzVBD66XAhgZVzAmt2GB
a97EwXyVqnkerjqCTdthzzL4dLkZUwE7zmFE/j5Hb1C1pYPotMHYrGs7/hqKbMtTYPfB8TRc4yVF
7gSuU5pS21U5Y0SOBEHvnxE206iOx9IvM3jl9Cf17GlgU9d4GQTWMw+mWQ48FiybBfZVcvnx3n7p
yEsE+Ss1sbtr74x91c2kqLvQxykNST9fjaYD+EKrgK28Z1r91s8/Yo2LdTbja6TCrFQJ3YlbBMi1
9TswnPAxEO1HX9HMnhKUU9Whz/ATpOKxem4VqBUVdUep9xZur/FwYpcg2Kz8hd+29wb+eK/Z7rZs
7XeKsj4PaqqD6/F+Hhdwo0mYgM3zZCys3vDwEFZ9IHhXmq68eAInraN0X+DSmPZ7YMPH1RFouOaR
sbKu9m++FWxW/eejyokgaHcYuzxPGPNPoAWf8sMR8fHlGZ1e4QndnYhQZ9Un1j+0CAbWfKDqOr/8
PEJ5UnZLYdLUeVg9Xx/kDloctdf5KAZ8e+xVeHRUlzSaYHstFEgLf/161fNqHpXQhl1a6UGv22G/
jO2dhz6SZWqs/HkGvOCikBuOGFtqEIt6+3D/9utg92Zsviq6AhTovfFhne+JF3pT4ai6AGMmmUDK
NrtQOQuXI1UDbQZ/5ynqvt/h4PhM00HVlBYGFjsE2218ixn+zhPkzuCNnZXnDHtSb9GT89qVp4k9
XecDcJ0P0B9fXCpzrH/9m+orb2Z+jnQ4deqZPu9Ht19y97mFewtO2B2xVf30X9nhPqPmZrG95XaR
FKin34Bq0fXurfmh/PFAen2a2CPfus3gpjSt1c/HgD83MYTTfavgYMncfuWbBIKN7hM4A95bx2uc
ojlBSd7BiRl05rhJKS8oWvP802BZeDLR1hNlrGllzeb7N+XhjDWN7prvNl4W0Wwg1Y1P8FjPn/Ps
4kLEYn71C29v+PHkw1eoqCskqJ86V7iAdZ5DFreYq3k5khr85kP7/XfXM3B7ERTDTsf2uE3S2Y3b
9ufPME64ylhOcVCCUx0W9LKVeDCt80QIl3NOg/7ZsDHFmo329V4h0nD7AH6j7R249KxZ8/AVzPEG
+cA+XrbY1oTGmMenBtE/v10B//WvP3/+12+HQdM+8ve6MWDM5/E//nurwH9I/zE0yfv9dxsCGZIi
/+ff/3cHwj/fvm2+4/8e2zr/DP/8+48sin83G/wztmPy/v+++Nd6sf/61/8BAAD//wMAMvIyqeIg
AAA=
headers:
CF-RAY:
- 94640d271c367db6-LAX
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 27 May 2025 08:13:10 GMT
Server:
- cloudflare
Set-Cookie:
- __cf_bm=zxXbTMyK.67_c.SQXNivPXTcfsIBL5Vl1Q7WXFcTgxU-1748333590-1.0.1.1-ArIOxtxz6HMOCmEGGFc.Hxs19gY1LkxaxTZYc9hAE7zSdmh2fCrczDquGUovgGjYHvCJ94TxWQTCVlo1v7kDhnnrF0jwHy_U_LaR6AbA.94;
path=/; expires=Tue, 27-May-25 08:43:10 GMT; domain=.api.openai.com; HttpOnly;
Secure; SameSite=None
- _cfuvid=4YFW5U1WwjrbdrZ7OWIvzgymvtGAnmnPEu1zeEdQsGg-1748333590310-0.0.1.1-604800000;
path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-allow-origin:
- '*'
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-model:
- text-embedding-3-small
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '180'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
via:
- envoy-router-5f689c5f9d-mdwg9
x-envoy-upstream-service-time:
- '185'
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '10000000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '9999496'
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 3ms
x-request-id:
- req_421cd0437639cc5312b23b1d7727f0a4
status:
code: 200
message: OK
- request:
body: '{"messages": [{"role": "user", "content": "Assess the quality of the task
completed based on the description, expected output, and actual results.\n\nTask
Description:\nResearch a topic to teach a kid aged 6 about math.\n\nExpected
Output:\nA topic, explanation, angle, and examples.\n\nActual Output:\nI now
can give a great answer.\nFinal Answer: \n**Topic**: Basic Addition\n\n**Explanation**:\nAddition
is a fundamental concept in math that means combining two or more numbers to
get a new total. It''s like putting together pieces of a puzzle to see the whole
picture. When we add, we take two or more groups of things and count them all
together.\n\n**Angle**:\nUse relatable and engaging real-life scenarios to illustrate
addition, making it fun and easier for a 6-year-old to understand and apply.\n\n**Examples**:\n\n1.
**Counting Apples**:\n Let''s say you have 2 apples and your friend gives
you 3 more apples. How many apples do you have in total?\n - You start with
2 apples.\n - Your friend gives you 3 more apples.\n - Now, you count all
the apples together: 2 + 3 = 5.\n - So, you have 5 apples in total.\n\n2.
**Toy Cars**:\n Imagine you have 4 toy cars and you find 2 more toy cars in
your room. How many toy cars do you have now?\n - You start with 4 toy cars.\n -
You find 2 more toy cars.\n - You count them all together: 4 + 2 = 6.\n -
So, you have 6 toy cars in total.\n\n3. **Drawing Pictures**:\n If you draw
3 pictures today and 2 pictures tomorrow, how many pictures will you have drawn
in total?\n - You draw 3 pictures today.\n - You draw 2 pictures tomorrow.\n -
You add them together: 3 + 2 = 5.\n - So, you will have drawn 5 pictures in
total.\n\n4. **Using Fingers**:\n Let''s use your fingers to practice addition.
Show 3 fingers on one hand and 1 finger on the other hand. How many fingers
are you holding up?\n - 3 fingers on one hand.\n - 1 finger on the other
hand.\n - Put them together and count: 3 + 1 = 4.\n - So, you are holding
up 4 fingers.\n\nBy using objects that kids are familiar with, such as apples,
toy cars, drawings, and even their own fingers, we can make the concept of addition
relatable and enjoyable. Practicing with real items helps children visualize
the math and understand that addition is simply combining groups to find out
how many there are altogether.\n\nPlease provide:\n- Bullet points suggestions
to improve future similar tasks\n- A score from 0 to 10 evaluating on completion,
quality, and overall performance- Entities extracted from the task output, if
any, their type, description, and relationships"}], "model": "gpt-4o-mini",
"tool_choice": {"type": "function", "function": {"name": "TaskEvaluation"}},
"tools": [{"type": "function", "function": {"name": "TaskEvaluation", "description":
"Correctly extracted `TaskEvaluation` with all the required parameters with
correct types", "parameters": {"$defs": {"Entity": {"properties": {"name": {"description":
"The name of the entity.", "title": "Name", "type": "string"}, "type": {"description":
"The type of the entity.", "title": "Type", "type": "string"}, "description":
{"description": "Description of the entity.", "title": "Description", "type":
"string"}, "relationships": {"description": "Relationships of the entity.",
"items": {"type": "string"}, "title": "Relationships", "type": "array"}}, "required":
["name", "type", "description", "relationships"], "title": "Entity", "type":
"object"}}, "properties": {"suggestions": {"description": "Suggestions to improve
future similar tasks.", "items": {"type": "string"}, "title": "Suggestions",
"type": "array"}, "quality": {"description": "A score from 0 to 10 evaluating
on completion, quality, and overall performance, all taking into account the
task description, expected output, and the result of the task.", "title": "Quality",
"type": "number"}, "entities": {"description": "Entities extracted from the
task output.", "items": {"$ref": "#/$defs/Entity"}, "title": "Entities", "type":
"array"}}, "required": ["entities", "quality", "suggestions"], "type": "object"}}}]}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '4092'
content-type:
- application/json
cookie:
- _cfuvid=SlnUP7AT9jJlQiN.Fm1c7MDyo78_hBRAz8PoabvHVSU-1736018539826-0.0.1.1-604800000
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.7
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAAwAAAP//zFZLbxs3EL77Vwx4lgzb8lM3PxrAOaRu4xSIs4YwImd3x+GSLB+KFob/
e0GuLcm2CqSnRgdB4ry+b+bbHT7uAAhWYgpCthhl5/T4Yv7A9qG7O4qX8k9X790df17SZ3NFV7ef
JmKUI+z8gWR8idqVtnOaIlszmKUnjJSz7p8cnk4mk6OzvWLorCKdwxoXx4d23LHh8cHeweF472S8
f/oc3VqWFMQUvu0AADyW74zTKFqKKZRc5aSjELAhMV05AQhvdT4RGAKHiCaK0doorYlkMnSTtN4w
RGv1TKLW68LD53Hj97pZqPXs6mJ5+fWu/th9bRbu03J54/GPj3ft6Ua9IXXvCqA6Gblq0oZ9dT59
UwxAGOxK7C2G778tUCfckgFAoG9SRyZm9OKxEiE1DYXsGyox/VaJL4EgcJ6TB42mSdgQhMQR55qg
th4Qjsc9oR9brUaAAYLtCGjpNJpSNUCHPYTIWsOcYJj6crcSo0pcG6mTIuisJ2ATyaOMvMjxmN0C
5Ar5iCNTgNhiBNmyVp4MSDTg2j5w7msPjnxtfQfRApkmA/3BsQVUijOOoeKlNYEVeWAjrXfWY2TT
wIJDQl3KKY8/2DQhc8mQGo8654wtQSSUbfZ33koKYQRkWjQyHyWjyGflqPwvt6a3yTSgCb0hH4b6
N94uWBFEdqE4OfR5AGANtPZHLuSJTW29pFJSWiPJRcAIbe5s4VQnAw12Q39oQb5X2G80arcS96NK
/J1Qc+wrMT0bVYJMLMY82ceqaKQS00pcYGAJ589tKiiz9ort1jqW5UhRkJ7d4JItLWUYCrN+UK9w
soEOY0sdRpYBbJ0nPmeTm2JSNycfMsksgxQJEKKNqIfmeNKDZFp2zwK8DoAbCVGDdeRxhfQ6AIWQ
qaEu/ST0uh800lqrSuwwAzZNJe6fRpvcz11W2WvOv5fX1FbSH3ziGCAFUpkomhehZkqsdQrRYySY
l5a+Vt4Wcl9e8oC0yRQhbuQrYlsleQf91vZwif7nwd/a/heBfsMyJv8f+v5XeTwBfQTHJOkX4fGB
TUNvJ3BhVQ836LczOTcYbVeE7DKd/4vH/ZN4tQyedrb9vt9YdZ7qFFC/34FojI0DnrwE758tT6t9
q23jvJ2HN6GiZsOhnXnCUNaYCNG6AVaGUIqL9GpVC+dt5+Is2u9Uyp2cnQ35xPo6sbYeHB8+W8tb
Zm3Y3zuejLZknCmKyGWbry4QEmVLah27vkhgUmw3DDsbvN/j2ZZ74M6m+Zn0a4PMr1pSM+dJsXzN
ee3mKT9Q/+a26nMBLAL5BUuaRSafZ6GoxqSHW5AIfYjUzeqid+e5XIVE7WaTQzw6RDqbSLHztPMP
AAAA//8DAGHAjEAYCgAA
headers:
CF-RAY:
- 94640d2c4fbb14f8-LAX
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 27 May 2025 08:13:14 GMT
Server:
- cloudflare
Set-Cookie:
- __cf_bm=2gcR.TdsIISbWE0mGGmusReUBNsIatAwc18zV72zD20-1748333594-1.0.1.1-qeHSEYFoSHrtU4ZrkbG05aW.fPl53pBh5THKK8DsmUMObmaOM2VjUu.LX4CG.kTiSHKPDctGkrALLResb.5.jJ8KafoAb00ULAntajitgj0;
path=/; expires=Tue, 27-May-25 08:43:14 GMT; domain=.api.openai.com; HttpOnly;
Secure; SameSite=None
- _cfuvid=JuFQuJzf0SDdTXeiRyPt8YpIoGL3iM9ZMap_LG1xa5k-1748333594521-0.0.1.1-604800000;
path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '3966'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-envoy-upstream-service-time:
- '3976'
x-ratelimit-limit-requests:
- '30000'
x-ratelimit-limit-tokens:
- '150000000'
x-ratelimit-remaining-requests:
- '29999'
x-ratelimit-remaining-tokens:
- '149999368'
x-ratelimit-reset-requests:
- 2ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_da07dbd4e017b1837841676fb84c2e7f
status:
code: 200
message: OK
- request:
body: '{"input": ["Apples(Object): Fruits used in an example to illustrate basic
addition."], "model": "text-embedding-3-small", "encoding_format": "base64"}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '150'
content-type:
- application/json
cookie:
- __cf_bm=zxXbTMyK.67_c.SQXNivPXTcfsIBL5Vl1Q7WXFcTgxU-1748333590-1.0.1.1-ArIOxtxz6HMOCmEGGFc.Hxs19gY1LkxaxTZYc9hAE7zSdmh2fCrczDquGUovgGjYHvCJ94TxWQTCVlo1v7kDhnnrF0jwHy_U_LaR6AbA.94;
_cfuvid=4YFW5U1WwjrbdrZ7OWIvzgymvtGAnmnPEu1zeEdQsGg-1748333590310-0.0.1.1-604800000
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-read-timeout:
- '600'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.7
method: POST
uri: https://api.openai.com/v1/embeddings
response:
body:
string: !!binary |
H4sIAAAAAAAAA1SaSxO6Orfm5++n2LWn9im5SZI9Q0BEbkFAxa6uLkBEUOSaADl1vnsX/t863T1x
AEEIZK31PL+V//zXX3/93aRVno1///PX359yGP/+H+uxRzImf//z1//8119//fXXf/5+/7+ReZ3m
j0f5LX7DfyfL7yOf//7nL+6/j/zfQf/89Xckbi/UPZZxOtWfDQHNrcjxw2hOjJN9zoAvyTuRGUlV
T9NahrB6Dxgn0OyqvsmqDrm5AKlaYacfKd86cCu5jreZq6Ia4JiZ8P7EFdkJkcE4P0kncCxiF1vv
61TNsaboCLKxwApwr/Z0eu1r5LYfCe+b7WkdHy5ovrgC1rbFbC/qronk21avyEabHbYMe6mBTVwJ
2LmMijZ71sWAb5vzqaWWhU1sx4jkmxO/vK2xe/UTSt8ejCXpi0P/iwAzToWBJqUR6dWTylD4it8O
hnJxw17/9tjEudoCu3OqUs23hGo+aEaOAk854zg6l9qUyhsTPJetRIjs7W3B0WEBD+0lx/eLy8C0
bZsGbp/+SB1yNyuhLx1P3r1zCasWS+x5R90Y4OtxQ5W7MGpz2KUZHBy9w4H0cdnQgyuEu2SMqDYD
1abK9qajvIU7bK3vf0b9XUaSsjvj25an6fKdHgL0vjzwIGnznmvUq4CGyhmw+3xpvdA6CMpu6qn4
4G75dGyU7A2E6G7TaLAP9jLJ04KyqLxg3ftaFTflL4K8j2LQm7n/9DOvVDFKJkiIPL1bNnDqp4O3
FzFoSOnTpkKQXYFxfZ9waH2+4WLqXQn7TLHwtXv0GvOqpwCdg7Kj1qvbV4KqiwQdVPwgY7qxAPO5
ooaAfzyovpEAm9bnh/EObAkUN2JFuHKvoNvVOlCNQj/k6IcVsLgFuid5z1c1bwLfRBcBx/T0kLiw
bEUlQ4tVJfR06YBNMrNtEJpsg2oH7aHxHphj9LGMEJtFYNlifQYFGJ2ThQ+3+dLPrH8YcnbaDh48
BXzPfCAncPpsUmoZF48tzwO9Qj0vVfLVAafNT5h4MLi/A2zYy8SoNU0B0i705rEW5WB2pNSAF3/H
YafXXjZ/r+cNPEb7I32U3R0wWJQLGi/HliqbyWPsUqkd0vXpSf0mJPa8t4wANpal0GczBjav+5ME
mRYZ+OBlWcVXRHnD19CY2JDGhAlJw3tofZ/4MUeOzfGDZMEM5cybFMTC5SvSBrw3RYCvh+oUCiZW
GqTSOKH5qXJDfic8PWhtOw7vrZfCxHzudcjVeoMfZVIxsazuGYyL7YsqiqYD8jx8r1D3owKflRKH
yxI5NVzOfYb17Kun4kz8BD3nTeEtwy3quezFxWBrWgMB0x2w6bs5B8gq/RtOsFv0U8BoAYOPFeI7
bvRKqPZMQJbyHbH9/hxsZteJBbkvOGB9Z6S9YJsmB7NrtHi83mUpJz+fBH5ki1Gj2AchcyxHh2Ve
Hch8l/xUtOvARDGFJsa0XuysHIQCfjV78fhHddWmLCnf6JpFe/o8Mt+ea36IAW3GCSsG1JmoS08f
xqZwpMeBCPb0MJcOTUmcUN/XHSa8X4UHF3E6UMVtGBt4EC5oY+oONZnQ9azJ+g7WOqDUzUVbW/hb
KUM5GADh3e0lnVEBJtjC44NILyeuOADECT3EQKTGMCyMJfPdQEYFvzT0b0cmvBtZgWja2FQF+QgY
6WQD3u7LFmt6zQNqbVMJWp/dmTrhC2kTz502ckg2V+xdZqOazxY1oH6YRW/hnzCcr+Nu2T2Ozg2H
n7Fhi/4IArgISKFKHWUhr/uSBPo2bnAw3KKKr0lSQ5S1G2wR0wdtLLUyigv0oml5GrS5qPwc3IXL
h6r+R64Y3191eBfNDvt8yoPp9ME63O6UMz3cNjWbk4ulgjSlLbXy+RVyCs44eN2qGtUqYIacuzdV
8OiJQPfHlKZdKJ9M8K27HfbARQH8+7yUCJz2IvUdckj5DbYF8PAtATuBdUz55NlYkNu/T16RlZm9
DH4UgKYynvhoc5E2ATXOUUHiA1Zoy6ds+94bkJ0p9ODb2mtiWtUNuueOTUOOU1MqES5B2UtmWN8e
XK27NZWDvlP8wPbof3qypNcE0i+JaXxuRNBcc11HVWLXVKlsQ5sk8JlgWH7P+Piy+HQedtGAnndl
T9d6ACipGh/8vsel3Sg9dxnoAKvFGEkvwijlzW0iwa2EHXwUims6Crt7Ae1dJtETrEi/nJNoAVxu
v/6cZ/SCOICJcMJx95r7iU7nTo7lUceBm71TsnVPNbQcuODrgO1qPtV3E8L6plIz18qQLr2ugwZ/
bvj2+GaVYJuKALfk0VDv2l+0obuNBiS9v9Cn8jAZL9lhgUgfLF7Xq9twts6RhJptKxLUl3Y/yfFB
ArwdO9S9lIEmXtKJg+m89/C9M9xeUE1dBev8CJrENxj0ve3BELwo1uv7u5+en4CAVU/Q583a9my4
VBwaMH1TdWFjOKEoSWSt5xSaXvbvkJ7OmoS42mioDuih4uuvMiGM3C29n5OPxmRFiNETlQVWk+Mb
LNgMOFTVxUAN/v2uxLhuNrvP2bvQi9SL1Z94yxcQk00sQ0BT4iyA3xgeve1OSsU9btcAeQqVsSlQ
CywPYdeBl/F6YvX+GMKhP/AmaC5kT0/T5RJO+l5z0HFcOuqgKbaXbGNnwA+jFl+9/MkWg0YOfN/i
M74fnJ51Dio74B+XBuub5gAW+z5NYHczErJttF3PLMnS4RovnnA+WtqSDKcCzldBoUfBV8IREw/C
zzGfqULbS8on7uQg2dze8K/ezZgYEAn+dKY3ZnlAGE5WDpspQ1TjiQ9Y8HxHKGm5AT+cpAunHuQb
eHkernj/FF9g1U8ZusHHnvoyKrSh51wTntRbh7Uspow8hLlBYCg3WPcsLSVSmSqwQXyCXXF6g8kP
9xDlenvxYF+YqcjxnAMPqQ0IvY5xOJWDUMJg3zvYHkcHsDU/g8McfIj0beZ0CK+Fj0gQRfR+cGw2
vFVHgPnjy6jeHli/zm8DhvvTxIfxKIDh9EABjKQlwUdhPPaDu1dUeAy8Dus3WFVT4cSbf3/Piz1V
i/56LjDJco/iRe3tOaz2DVyEreJJAu3AZDz6AML6ouLsLW5SVsYggshuanzKD59+Ibk/oH254bGl
7XjWOiOEoHxeLfrLH0s3UQGa+uHrbWI5Y4vIWRv4GZYLAeJz1BbwmSN04T3FE6ezx5Y+bN5Qu4w3
aurjPmTwcc2Bcfu8vLm+PvvpdXI6cFjOCFuvxxRSM/I9xLnWCe+pN6fTMzdlZOYNptllVOy5JkGN
XNzsqaMhGC7nF3lDKQ8JmXrB6MXn5aHAfv8svXo8ydpCnWJCbyBN9HhmH8YKGHpILg53fJA8P6WZ
u3OgaukCtvv8YC/BPlXl0NNnrN1Gaq/5Mt4dvLqkh0dY2FOkS1c4R07ugZfqpD3OPAWy5y73WiO0
e+I2JIHh+06ow6VPbVmvh5GILtSIFCulrxkZwPuKwGM8mcB8AFwNgOGoP/8TipdU4mB0lHh8V5Gu
cbC2rkB5vKdf/PaC1UQxmq+cQpPn/pYuezu+gtL4bKlmfb7pJN6Y/ItnbxvIbT+cZdP86SdvtuYr
YH1X6DBnwwMfo2fEWCKVEhxvo00vbWTa8y8/9s23wqeuigDRX7cFKhxZPPg8PACbjokHN5vUJDJ2
lV7oPnG5kws5pzbGGuCCJnZgHMUmzZ6HB2NgCQZ4Ui8dtad61JaLo5no6WbKT89Xc/X9QiCcOoy1
wt70w+pn4Gdkoycws++n07XP4OYktR7juDLtPunxDVd9Si6Rh+w/ekrPCxUnE7dl/cnd1jv5BQTs
rHp9aXzLh/7pcyDzSTRDDtZqhPxz6RA4wn3Yd1orAb0/pPjpmm06trMkI+My6zSwqrf9uz+4Pm53
6syBCQRjKUrYGY1BXbSk6eTfjhmqcNxRx4mP9kS5m/nHb1gvpbWpzp0n5LmSTh3i3NLp/upisOze
pSflRds/MvJYwO0+bal+J7q2pBjkYHu8fckuzX7/j3MZdmzvgR2JGJO+ZQ3PXEDIu5rkdLi84hgq
Gu9iTE8dW66iIYDzc/GwZvEoZZwSxOgT8Zi6aOv0yzlUN0gU7ZEwn6NaczsdObnUFsubFapU/Jp/
4GlbR/Qgpa9e6Npcgf1Udli1FTtk2wIuADvmEQfN+8NGLH10cLvBkjoTv68YWWwZXB7BDh8FDdsL
3u8VeLPeaz3f2f3EarcA23AYsf0FTjgnF1VFGKQlGdZ4Zq+uCpAivDARD+YnHN5Ot5EPiVVhe7qn
YL7zjxoGWGvp4TF9U7KuP9hsmie1pqPaLyoaBdhcxzu2mHy224z/1uDbPK+esGWOvZyTbIHJ41HT
y9DRqjOQKMAL7yj0vOaj4ck20Z/5XKT0Vc2Pp2yhOu8DAjCuGPNaKQNldzCIfME3Wzg/Xjq8ed7G
mzdLoE0WejXwMVYh3u9bbC+jmOTwvK8HT2CLbi82Thp4Fg42Nr92p03d7WOg8hlZ9LHGPzk2VgLZ
eYTeHG9ejBHe92C5KC7OZHebrryhg8Ot64mgui/Gfjzkl58Pg2uGXCQHJko+8Ug2HKb9xHIwwBTF
IVZkpGgCVa0Otsd688cviT+/LOwGSh9aUIY//Qdur8GgycQ9Qac0tzck8nMkGydzK/5zjBLkvd09
jiPS/jtfy2Gh4OTkUW1Jk/MbWnHJqAc+V/bTKz9/TkB869mwxou06hGMe6aloguaBAhHZcTHMzsw
PiOXCZz9+YsP27lhw+yLPlyC/LX6g7pirioHUhT0M3XX+j67masCV+kTuv+0ZTj5GHWw1486xuFb
1RblnRgQnDTRgyZPUmampQpUOEn4+OBGwKK59lFWGFeq758lGz7yAoH4VMNV/xugfVsFQfBrHn9+
T2N2ChW45geyLc93jduliSeLWh5iQ9jbgN8b5gSu2Baw1+SPalIfZx+CMdKwJbA5ZcHF8yBcBpNq
HFeGLVwqgrAcWzg1Wa+Rb+so0MmXjBrnY2dPbXtRISqnhqCr0/ULVO85WNc/kdRzk87r+9oVNH1Q
S91+0y5q7g4sdVZitc2jcBHbt4Bw0XzxxQ6NkOMOOwifiZB7vD251RxWpw62g1RiN91YbH6f5RKW
nWzj/T5DYPnF1xMVBfacXQRmcCwUxKPzibCmbm22b9sG7mcrxunTt5lYntvy56epaeiiRrqhu/7i
/6dvKoaOrQJXf02Tl9JqP32IHHEA9CYg1E/Xq+zLZeO31Fn1SZMOSQl3TjV7Wy5/a5MntBxc6yc9
bnka/u4PfterdzmsxKZCORzgBlNXo23F4rrYoPOSn6neBjqb8k2UQTBeNezJvZlydHQjeK4lnwYZ
Y4A+41eEBm9w6D11Dunc7tQBxSXbENZqU7gwNsE/PO7wMAhgOnefdu5y1UcAOSPkPqy1YFryPXbi
Wkm5ZbxFsGsjE2uUPrWvV90EGaZvY62XFSP0FGXQiVyOAOMW2KT7+AX8XPuBsG1xtr8ntQlQF+In
/sM3DwDWu10si9SN0yX9o8fod4hpmNQ07Ky+GOBpjhcc6Ec7nWJ5sMB2aG18aESFEXHYWXAjLx51
Y+leTXKkxEj3rwV2LgUBc7uzBhC8Lx6+fzuuX3budUGnb+95UoImQM7vRUZHcDKx8zorQCDl1YF9
ploY+/tTRWe+82RpXgTqvXsCmJqKDjpzPqFPcNM1evWMGHpxq9CsbIRqppAp8L2tOqxswm/Fhvcx
Q280uvjKPgug1VFbeQ0syFS+3H6+E3sB3WEPvWluvlq7KOcE7VONYG/NT1xOJQ4a1eaLD1HqpayD
voJWfY2tg3Ts5zzABFY46Va9JaWEMQnC5PGs8W+9Tas/Ar6g9nhve7nNZxKQ4XJuM6xln5c25haS
5VVfEREGVzZT3ZHgs06e1BFffjj33l0FkCs++Hmsc8aeWc1BG4QXrBBqgPHHW0FHXtgjD6yt/KUB
fFl6ZIPeTcVWvvqrnx6fiF97OguMQ8oiXbAbhl26fIc4Au2V7eghVx6p6BvxG9i7XKLuu7UrPoB2
DFZ98oeXUNArMvpEIqbqbR+GrDraiWzYikmVT+KyZa0HYIY3kR4H8E1ZsoUEclvOJiORmDY9c0VG
byBPWGuLW89OrlhDJmrFn/HTUX+VcL0eWwhbYSNx4QYWcZhhfPpMPVuqJYB7L79gw96WmvBK+BqK
gnXBP73Az8E+Qsb9vvFYUuOUFtqog5VneRt0E9Nl2AwBlPbkiv/wb61sBPmguo9VjysaN4P9Ihcn
/fXHD9Cfn1n9o8cv7d5mojLE8OWfNZwGpq5Nn1ZTUXjjLtjfJ3I1GNd7Biq3dukJS0k1dH4Y/PQu
trqLqk39AVk7neO29OeHRludA4QS/oU1LxbsqS1tD+7Vo+9Nz2b88cxud1WmPTXP8YtNaz1BLz/U
fvmCjcTrVSiYvEX1fTBXi6mXJdIPTCSC36lgVmcn++kVbL+eYSWSqvChUZoW1W9Q61vjSQq4+l9s
wUWzR64QNnBdP1SnG9Yvx2Kq0UKFzW89h1QzTgtc+ba3W/0dgeo5h9g9PVf/Bv/oc1QL7/uv3jP+
Ai8xOn93DQ6oPNrsLNUG/HZnRpXGyMCiJ+4GbjuBULN7nasJ1uoVXb86WudzrujKE8HKt7GFH9eq
y18ohvhUfqi+h53WrOMhaGSdmtIm6KfIRNxPf3sd/JzByN5aAIn5ngg8PQ2wlAfTgtUcNh54DW82
csUGwogqgkcbvwOj8H29oWVbF7rGdzWlGzWDTLsadD93Vrjk7dGDP78U3vwjYPWZFXA2u4cnd5fS
niSx8SF9sjPWNrXajysfhP6oO/jsKpt+zMhjgqW6PWBV+oxsFgTXg70WbgmaI2f1M7YF1/kTSBwx
pCtfBvFJfHqCnp2qRel7CUhJw1Hd1s9sfsTNVfZcWfeKLDBsQRMiFR0q6YizU8BXbW87BJyfk0dd
R5z74Z7nhtzcyhwf2o1SiZNixpAYS+imGtjZLH22phybexVrTX2ylyXS39BC3A1bh14IhxmcJujV
Wk7NL69WLOKWGu2qnnmiOjjhopDJQ/e+bgm/+uH5t77DYuJ+er0Sk6NRI/SJLepxGPfCztA34H39
OPi4jRuNFHQnQF/OV3/wWPpxb5gLPD+sgSqcrmhzHhwHtPI/rEfOjk1Hk6rgI0VXbxdErGJkdxl+
/T3qGHVdsaWSfbj6CxrVgg2mH+/45Sf3E761WWvlDhYvDLyN3DfhsvJWuOpP+uOjw+GNjN98vZXn
VcvjcxrA9XG54/vYGiFT1D6Xv3PKkyHUebby/RJ+cSF6xdFJGCfUzxKu/SciTzepZ64UGX/8gj40
Q0h+vKMNrS0+wSsKx+XYe/DHe12uC6v5c4zin7+mhp6dej6GNAZttM28r4AePUFTm0ED6A221FKx
hW4or0AT4z0Nsxgz9nvfaaxTet5cLDAwPzbgeKO2J7+rPZg6Pw3AK5A++HIuPXvKPuYAY2pr5C5E
NaBldc4RlhPLA6u/4rdK4chKa86kuwybcHEbEkPeSd90n0WnlJv9rQ/2XnbB4eOJe8FBXQP3iz3R
Y/ck4avkkxgsJkH0CGmuzVIwRmDN/1653RvVMrgYQrdeNOpx6ialpFsMOF4OLfap42kCA9YCFQZG
b9mes6qX7aiE2vHF4dM7GsOJjocI8u9rT9XkJjOy8kV4BLbpiQkY+zGFivPr32Jz7a+ROe8GKIqn
EXvodgv/+Juo4CN8qcpjz9J6gSh0w4Eaqry12cGfJbTySqpYwZDOxL4SSM4vTF3iyRW5ve4xvCHf
IIut2ClrPVMBv/5yuMSncIzivoRlv6vor3/Ef+DGg+rmc6G2c1p6Um+UAP301Ek+Xnr24vcx8A/Q
xqb62oSEm4YGVs3tRGQzUnr+51+l/EyorvNdOgvopshXz8qpAjqUjlVV5oittdA673DFzx5SYL/c
jmSWt1W6/J539U/klUVtuPYbLTglSUIxGbRKMIRa2QH++cBrfu6XK9tHqAucI31o22c/XYbvAPv9
o8TuM1rC2YTeG2xMw/HEJDdWXn8uf/kWa8fh+G9/a2o6xlp924fTMIArrIf9tPJ7ISStKeXQ0Zzs
3/z8Iy8bdNbup5VX64A7mlQBn1BZaFAun7RVQflGa/+F8O1Mw9mUTjr8+7cr4L/+9ddf/+u3w6Bu
Hvln3Rgw5vP4H/+9VeA/xP8Y6uTz+bMNgQxJkf/9z793IPzd9k3djv97bN75d/j7n7/43Z+9Bn+P
zZh8/t/j/1pv9V//+j8AAAD//wMAYp9Xq+AgAAA=
headers:
CF-RAY:
- 94640d46896fb6c9-LAX
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 27 May 2025 08:13:14 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-allow-origin:
- '*'
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-model:
- text-embedding-3-small
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '196'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
via:
- envoy-router-5f689c5f9d-svjx7
x-envoy-upstream-service-time:
- '200'
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '10000000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '9999982'
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_51978202e4a91acdb0b2cf314d536032
status:
code: 200
message: OK
- request:
body: '{"input": ["Pictures(Object): Visual art pieces used in an example to illustrate
basic addition."], "model": "text-embedding-3-small", "encoding_format": "base64"}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '163'
content-type:
- application/json
cookie:
- __cf_bm=zxXbTMyK.67_c.SQXNivPXTcfsIBL5Vl1Q7WXFcTgxU-1748333590-1.0.1.1-ArIOxtxz6HMOCmEGGFc.Hxs19gY1LkxaxTZYc9hAE7zSdmh2fCrczDquGUovgGjYHvCJ94TxWQTCVlo1v7kDhnnrF0jwHy_U_LaR6AbA.94;
_cfuvid=4YFW5U1WwjrbdrZ7OWIvzgymvtGAnmnPEu1zeEdQsGg-1748333590310-0.0.1.1-604800000
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-read-timeout:
- '600'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.7
method: POST
uri: https://api.openai.com/v1/embeddings
response:
body:
string: !!binary |
H4sIAAAAAAAAA1R6RxOyTLvm/vsVT71b5hRJ6O5vRxJJ0khSp6amQBFJooQG+tT571P6TE3YuEAK
6HDfV+r//NefP//0eV3cpn/+/eefthqnf/7b99o9m7J//v3nv//rz58/f/7z9/v/3Vl0eXG/V6/y
d/vvz+p1L9Z//v2H+z9X/u9N//7zT8ilLlHq2s55S89TOKWb6C88dWvqEyrDt+NzfnEL13o7XJAA
I/3oYMWmTDS7HfWRwR74+XZYjkCo27JElbiaOPPTyt3G6d3AQORG7KTLTl9fDcsBpGgywYXe1mtk
vi8Qvy4GPj/aiztA5JpwNGOMD+8eR2LPmzOc3+yOeHvJdEWK3yE8nLTLzC10G1bdAA1Mr/Xdf6qq
kU9908uQDPWTaEcC61k6cRlUH3nid6uq6vRzKH0AfDHGeqbV7pqtOIMn6ZKQ61O81qLSWanMVryI
D2fRigQhJRkcuitD9AJwEXkZTYaexsBh7z6zdGvNKQC/8T7GrXaX+3oykOBTgDVZHnK610IDpYF7
IkcJ7CPOEzNDfl5Dl1hmplNhH15iGN3alLiHd1pz1zof4a6cC3xzzNVdT28/htfu3vgUNQtYPZUa
yChONjH4x46u9DjPkI+Ws/958Zdh3bTWQZdbT+Ydcj75snFrhvgn9yHHrlQHjt4bCCPKHLCxhhPd
GKuNYfm5ARIngww2x1YDKYDlhh8e0+rLUW44WOH27aN6ZMAC76wH3YxT8eXk3N3puN0z4NemRu4H
ce9Si198dBtY1+fNRoqmJxpLiUcsxvrqpgM9ISmA+ettzxKZ9vW6+DsNydtIiEMSdRBvHWtAbmhk
UkzOQeczd/7AIXq3JCay6gp0b8Uojt4bNvcqpcs0XAxotNN7Xm5PxaVkBjcIRG0lOD5K9bbTSwNR
uH7I5ZqO+TtoQYOAY/vELxmRru2qjSjAb+xD+5XQ8Y5dEzpV3+KgwZhuhC4OssZN+rufhuz49JCn
A4n47YdxqZpWMyL2zvZLCvV6Uce7D5M1DLGSDpRuAUhNZFoKxpbz8sEmrH4jueIaYdP0klx0n0IF
Gi5oScw4ek7nyy6F94aNfeb9rvJReeYfGD0S4jOusgccS4cM7rFikuvSie5W7y47ENjqjXjaRY14
9xYGckwbHaf5iafbw3t2iDwygPfTTsm5eyrJQD0lur8GQI+E1OhLwFo94zNbFdWc7fMlanz+SQrL
fdWrFl89aDbsEx9xy9EpLxQDgc/OxRdrOehiF2sKcCZYELWvN3d5HeYKpDhtiLtwbb3ph2mGB6hC
v44mMKwyO3bweQ1ckje5m3Mw5D8wBNsHO0Wg6t/6MOHb38k4KkCc8+8X7RF3yjRcZHJDP3n/1NBZ
kX3su4+nTpGSyFC9rNWcC8MxFx6PKwOdiSmIgzUrFwtPgagcL73/xtJh4GeX88Di+y45NIEGBP0w
jbBrbw7+zWfHd0cFElu28UFgFVdoR1NBEpONxOSqlk5Npzq/fkf20W2m2/vMX+T7updJnhzDYaIU
F9K3XmZ6Ou/qCVelAJqR25M8FUk+no+vEvaM4ZOI9adhxXgpUCgaDbnq0yXiTKuD8Fa+Q5w0TzIs
/r67wXUkJraX7qxvPrdr0HnfxCTsP+98ZYaIQeK110i2H841X9yYBdaMwmMtNhp3kT+O+ZtvfDyv
hs5fJ+4DeOcu+cp6aKNPvVQGlKrgQPJrhoahy3ccoA/uQgo6JfpysC6K9GwvKzGTbNZHeV1ktHSP
88wb8q0Wbna7wMdHYLETZ7y76OSqQa27r9g8HWt9/RxvBfzWBwmGk+TS7Xwxocg6HPGDFtWbo4YF
CsrL0S/PAp8vzL0M0N1SNHLo7xNdM7fr4a7qXZJ2jE0XutxTuQ8OA9mbzTXi5HUnQ/gpD9/+LdUk
fosdbFeGwzqje+6KkouARldJSOioI1ij62kBaOelOHsyAdji174AXOXPxBrXQKfzZUnRKSUDtmxg
DeKYcBXipciemWvQu9uzZQupZG2VHDTzWdMkXQL0wxN7ao7DItaggsPa29gTUadvFpM2iFyohbUu
7vRx46QMZsVbIQ8k1kMP76KHdsIxIEWIs5ze1CIFNYEVVvPilPOtz40Iu+fjr56GxbmKBvDgIfHp
+Ahy8ZNdGmSPF56cWu2Yz6bzrODVELyZ/87PwpXiCEIudkkuOtbAdcXGoPI1VvhUf8qaDDunhC9p
8/A1Kdh8CVDZo6HLGew8Q2dYqjLyYW4tDi62ig7kgfMZmEMbf/t3W//GB7RbsMOh8i4j7pRnHyDr
0gO7wz0e1vWg+jB0jiq57aVOn2YXesCIZTQPkp/XvNVzBowhg7EbpZ/hYya1CV16OJLjM81/9ZVB
TXEif2kwBkvbPTd4OcA3NvDrABaHfXCw2T+Sma/TiVKlU2IorbZJslvX5BR4/YICanUkGhm3Xo+V
uMDmWXokf0FCqZ2fFPTtp/iLj7XwIiqHxhd6kTxKP/UaKCGHvI3YPtCWfb49Ni2FmFafuT2/r5Qz
HjSD92zLyXn8XMAUM1sFTzobEvt0inPiJ7cZpseLSDJv3EfCbaAbOlpHgJOdzURf/DLh/ZRhnO5V
Cii+vG7wVnXsjGLY0i1Ldj165FAhCdGfVDCTwYCP541iz2y9nNuy0IdxAK7YrUkD/u6/S6pB7GWj
EFGls2Io2tPqS8xNdzlFf44wzR/OvANsXy/WTo7h9/n4OjXHemL21IPTGL3J3jeqnPs4VYzOev/C
JrmwYHTYM4f2H30gxyR/upzDFSHs/MPVf/Z16FLRVxX093n1M4226X2sYAXvni9rb1+ny2Z6KHvm
GVE5OOZbxFx3QGqVEmePXBvETj19oAPFOzaL6BGt9xPy4GEbFqzfjbGeiVk2yLyNyfxp4thdYn8n
iD++59y6Jlq/fAHcHlyLVcAk9VRo5waAUztjU7G0YRmZ8wUd94+KHNloBWvkX3wUbpeCJOe3BNZY
1Gf45UvYaWJOfwe7AYIMPVniDndu2FKjr6DDUAer6TboTS2kKXpLioFzXg3qTWJd5sePfa5bQE7Z
5R3AqpcPPsc/dr/+syHldZTwxTUdna+s2YGhLx0IrsgA1iNKISxuqYEtwygp9ZN4hBV8eLP8ssOc
U9LQgPfbaGE7KR7RJp1gBsVaw/jXv4Teoxwa13BPTNtnKD0vpgW//YKYzg7U26p5Jchy+YUd7T27
W8ScZHgXyYQPDrsBUpvGBymAC3xweAvDkjSUgWqq2sTScnYYn3ergNCra6yUmgG6fWuNsNfRHmsJ
bocxTIUGurHnY73cYL4KdbaD4pjl2OgWEG3ceZnBSUehz/SFAoRdGtxAaU+Mv+59nG965ckwSY45
3h/bg07TTNfgiTFz4l3SKBqPN1mG3/3jo8vdqdepyy9wcvoTvjI3Xd+ap1Eg3TWtWXqfq5o4682C
P/xWucB0V+2maMjMth4rx02rqQrdAnz5LPE/XE15OXM5cF8PMva+/erLx2K4g7xOcpu9glVmmwal
x0wknmOu+uIPBx8azmzMgFeXYcnmewAJE8lzVXBIn/h9CZHUwaNPv/yHUl5TIDZIhn20eNGmFbIF
hWL2SGF9NMDF3LUBR0BOWBnQIxr3aEph0VWIaAneDyLvPxj45ddYN1A18N/6ABJzGckpOpnRp48/
H/BA7YXEIg/oYk27BbLbZhHnWy/bpDcbkKW0IiY9Ny6lB/kGW++5+EIwvyhFGW8AjYgIe4dlAtuO
czIQvBQFG5B9Unra9xZcz16MtR//0DfdQnrQAGwB1hom7zEp4ld/4ILBr3olMK2gYqUy+eJZtHRv
rkTRAQTzuHPWnFTJJ5Q/p1OAo9me3G0Z1RSJrMWRE2nznBbSpYTE77C/ozHj0krchbCl8tN/fiTe
pYPuWZCkCiEqCHH+l39XHcywS986EPFx2v3wjajmNaDUViQBRAmq8b67nPO16WwLWufwQKytPwN6
8x4ylPDrRnSsJjml1fUD17Mfkz24n8Ema8Um33Ln4bP2vYwIuekmzFty8yGq7mAZgPeRt2OhY0yT
k7s8WtNDi13t/cWmTL49pNVB5m1OCO6gQLcx0rK/63kQBzPi2yJM0cevBKJ2o+OKSxwH0BHwx8f3
xokWr3mOcLmFxQy1i5pz4s0YYZSwNdHh7jPQlzFe5L11iYj22KJhRb3qoZ4bNWKzr2bYnkoxwp9+
dHVggjHtzRle9KAm9s4s9VWx5w6Syk389dBs0ao0+wVmtuNjLdizdKlz1gLh6f38y182836V4avM
FGJURqKv2s3SQJ+SAqtMvKN0OwcmfFung/8RGsFduJIdwa/fhbk55aQwQgfxfXH370KmDIK7cstP
/3/5rZ0vij+EaE3CF1bhB+p/9fbHFpZZsH0GrBLWArhe4eQvXRtSWt2vJox9hyVKUob5+tlPH6j3
ajbD+lyCGT7WDaWhyfgySZ715giBgJLjBvz2FjfDmi+7Ator6OfGPhmAujCZkfq4Jthi1c1d0vvR
kkvWVWf0eERgTeTiBr/93Aenh0b5eNQsqD30k//c3Vd3lZ9yD5YAtt9+zUZ0n2gFnMbTm7j7iA4L
OEADVnSvkLC1jVqsTa8Hv3qxqRzU/Ea4EerFcMR7/sDRjZlN88cXccrpn4HmAWV+fggxiPzU1+XZ
Z7BqtYno9Hmtmzrgux+f8OvZPuqcfD7NcP8QtpmxU9vl11BbfuP1pZ2p6FzC5pYMdkyB9T7cR2Ld
h1885u9fP4ZEa7YeMsgUTw7/9N3MSocP/PJbsvekEmz9K9J+9Yi1NHvpi3odBai8sDTvcuXtLsLw
ScEbzQ52Vs8eWowUBe3aS0OKkjnTPrluGRSSWCF7Snp3SXt/hmXM2MS/NCrlLTDe4Pmo0fmcSSul
Ly/35EU6LsTa4yqnL6nagGZXD2wonZwvMXftYHZlZGKsDOcupzz8ID05qVjVhqe7fPo4RCn0Drio
44wuBytQwEWpnPmHR1wEH6nsPsoD9uZD5m58t9dgSUWLHK7zki+fezjDVuhl7BkA19MYBAF8luOA
v/pm2N5nlMEEjz2+6XVFV8XuOugxqoado1vWW5CgFPqf8jlLs27R8RoMqby1BGPr4jqA57ddBdfP
eiVuKuJoaygUwHc+ZvC8Pun2iIMLlLeZ4OTrtyzLx9ygU847f6MSqse8YD9QPZ114pE1paOlPD3U
toJPfAYcKfdMrj0cbsc9SbqLmH/58QbnfXcgztdP2uYrkSEnvY/YcIBab9MCU4BkSydnpFO9t31U
yq1uNzN4U5Gu19wIERVcceaTbdPJIiENfvf/LDpsCFbreQwAy5iqL2+CmfNCxAdwq6BA8rbz6DoE
L09esnglSpudcuHH9758mRy9E4iW0gkMRJiTPLMdTMEWe10AldN0wPuHXdHxI0oy/KDUJC6vBoOw
Sy8FPO6YHXbM8x1sX/8AnqAzk798BcCtgIV0SokvzxX97p8FXaf9ed6ZWU2n7iqE6Lf+VwHK7pQ+
2FCuVjvCh+zoDmI3v31kcfj69YMKd/t+L5wVWcUPHFvu9sXjH1/0F3mEdDq4jiXPTngkzqL2Ovn6
g1CoL5EvslfBXTzW2oHGOs44OjrH4YcHkL0+PsSYVtvl34kewtu02/kgdG1X+PI5+F0PX2JfRk3T
zFVg7Twf2H1BTLf3dmvgmgQvkgmNoK/Pep9Co4jsmVU2ns5+ZflwP4E9/uHRegObgn7+S14Hdr08
tWeILNS25CBVfkSX+BZCI68Wn/vuh0mYbwX0CCTYuLs6FYKET8FapzJWZcrp22V378D+aFyxfb3X
Na3EJfyL79HdGAequG8LRmyoYGULTSA2TLJAO+9sf9XPr3zj0LWA9zLysRGrjNtqai/AmHY6sSPr
pnetiBt4udCEaBJ5g00yZQHeKx9gm7Xs/ItnzU8f4kMTVHRs1DMDS+W84eNEUp0WRmjBe2z15HBA
dvTXj+mh85w3BDnw8Sd1+eE3xkWlAJF1lRI59H4idyMtB75dnREoPjMRraNqtI1KpkBkTAq25nyk
24uoArrYckocUV8jgpRkB90y5fzlcM916ju5A5Pi9SKHixHVqzLADkLvWc/Mo3N1EWx2hV6i8MDf
eohG77TGkF69duaFda0HsVwzSAVbxEaRtnnnrtwGv37+X/7YxM+n9+sfWO/XsuaOt02WoL06RFPe
SsTJD6VE4PSase4NYUR2w8aA5vQCvkDPhiuqupxCn/VexGiySzSeZl0B/q7jfFnq2Ho6uJqD5ONp
mCV0etZrNZvyb/1n2E6Nu4gzb8DKQoJPqiCo/+oJG0QcthL7mU+i8eh//jUxM5OL3v1pqeBkaAV2
Xvyu3sZCZ2B29p5Ey81jLhRqqSHNLbxZiLE3rOftMkNcbidybBNFF8W+9/++T/j6n18+6MGv/0k8
7qbWwu4ZapBNpxNRqsuBcsZR15CldSpWs1IG09dvAdxUvWZuOXiRsMrJR37Kpoitey3mK1cqFir6
4oOVwNrTTX6JJRCfZkX0nWfkfOM+GaQRHvnCR6ER1djUk7nKm7HKyxqg1aIWsMguZ59juVIn1f1q
wB0U9bkf10XfUqMsIR5NB2f3dQ+EeneR5bprUuxs58FdzkdSQVm88NgVhmO0SazOQEFAl6nfFY27
MIlUwFOp2cTcvKoeT1YVwv1pKYlyig75IlxWBvJducyfl73lE5szKfjxt68eGsghf0HwKi8Kvjiv
mZKUrTe5Pp522JRyUZ++fhr4fY9xKjClRfHdj23i+rzqjWCV0tWAbzQ6836R33RNH2IA9k6skKg0
BX29s8YGv/rb38ncwd0eL2X54T+O1/BIJ92rMvTFI7wX1nUgB+FgwXuD4h+e1mPB1QJUbx/oj/aL
B30tFCnMjMLy2XvziebDEQZwfeZHf33PtJ4uKAugchdHst/2bkQOAnbAnBvST9+7f/OHr7828w+7
Akv3hhWstxGS6L6EYJ0U+QPe8XUiZiK9o02Nlgaxwb7wly3sAK37rP+bh6hSvw3k1okG1LrHivUC
zzr96jUQWlFMXHCmw5e/hvDOnEufZvM7+upT55eHEFuAmU5PgyXDRxIf8bnrWroJbiugrtwMoqVS
lW9rzMmwnOMEK8Ppqm8//tWIGuNvMLzodMcMKbxN8u4vv/zxY5TnZf3XH1msaVnA1x/Ahn1qKL/1
xx18UetKUrHB+toWWQzZvdwSS+Ze7pYlSw855hwR9T3TYVCLa/n7Pl9Ol4u+cefdCPdOqmC1ffYD
yY5vH0pTo+AjME1Ad/bIgMG+jfgcKmW97pVzBddczjDeTYhuatcz8NFZ0szXvP3Ne7oPrJ36QX7+
lsDxnx7gtUHkm/eA33z88hGi//IE75ULsqLI+Tc/eLo9B9IdGNzsOoP4Oear8YljmAXdPLPf943d
WqfgqjjmzNi8ka+H3WmEIogyojF5oG8P791B+/q5EWXpHZ2G+RBC7O87fOqOM6Bm6WnwxxevfHjP
iS1pEOXtdCN7N2CH583/FFDRe25+G1eGLtUeKn+fr8Mr/vWTHlZVEJMom+2cj/yLB8/654UPt1Mx
iGPf7ODPr01Nj4/GV7WfwTffnCX3+IzoVy+DhYg1Oe6fmruZ99MO1VJKiNfoh3x1T8GMUNSy+OtX
UP5b31AQ2AvWXXag6yFcGPjrR3pppvo3HzDgvnw/iJEMGV2DyPWB22oycU0U5OskuRukH/30zUt1
/ZOcCQe//rDPH3Su3h6S9MXvQMF5rK0uOeQEwi+ekK9eBIvXvEeIFEXGipGv4Kefdm3L+SR9HiqX
fusFFATYM6vdtoFKZD+DcMsK7HviEfz8U/CCZYuPsKDD9ticFFro1RKFQn0QGfdRgJ2vUp+5kjoi
CVYLOFXgjhWtsQHHmhGHDJw55Jt/uHzHPjT446uLucbg5y/L89X2iFs1dbTot7KH0e2VYpsP75GY
7bcL2thDPkP+VbvjL5+IJ52ZX9i7Uxq/2QZ88QmrHynRxZQdNjhPfIjP2LuDbx5Wwnqh2B9+ei+b
kxCc40dElLDoAUnSXQCXhyZi9+svU+3ZhGhbed5HYxMP4i8P/PldjP/81PTDSin0vSf65gH7YWSW
ywf+8qDzaz3k27UXGWgOrxg7wfHjbubiBCBJcD5T+21ROg1sDxLb4IluIK0WBHcSwD+/UwH/9a8/
f/7H74RB19+L9nswYCrW6T/+z1GB/xD/Y+yytv17DGEes7L459//+wTCP++h797T/5z6pniN//z7
Dy/9PWvwz9RPWfv/Xv/X91X/9a//BQAA//8DAK+bEofgIAAA
headers:
CF-RAY:
- 94640d492fdbb6c9-LAX
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 27 May 2025 08:13:15 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-allow-origin:
- '*'
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-model:
- text-embedding-3-small
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '393'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
via:
- envoy-router-78456c78d9-689qp
x-envoy-upstream-service-time:
- '395'
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '10000000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '9999979'
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_c9c729639c1a9714296bd221d8edd696
status:
code: 200
message: OK
version: 1

View File

@@ -231,7 +231,7 @@ class TestDeployCommand(unittest.TestCase):
[project]
name = "test_project"
version = "0.1.0"
requires-python = ">=3.10,<3.13"
requires-python = ">=3.10,<3.14"
dependencies = ["crewai"]
""",
)
@@ -250,7 +250,7 @@ class TestDeployCommand(unittest.TestCase):
[project]
name = "test_project"
version = "0.1.0"
requires-python = ">=3.10,<3.13"
requires-python = ">=3.10,<3.14"
dependencies = ["crewai"]
""",
)

View File

@@ -0,0 +1 @@

View File

@@ -0,0 +1,206 @@
import unittest
from unittest.mock import MagicMock, patch, call
import pytest
from click.testing import CliRunner
import requests
from crewai.cli.organization.main import OrganizationCommand
from crewai.cli.cli import list, switch, current
@pytest.fixture
def runner():
return CliRunner()
@pytest.fixture
def org_command():
with patch.object(OrganizationCommand, '__init__', return_value=None):
command = OrganizationCommand()
yield command
@pytest.fixture
def mock_settings():
with patch('crewai.cli.organization.main.Settings') as mock_settings_class:
mock_settings_instance = MagicMock()
mock_settings_class.return_value = mock_settings_instance
yield mock_settings_instance
@patch('crewai.cli.cli.OrganizationCommand')
def test_org_list_command(mock_org_command_class, runner):
mock_org_instance = MagicMock()
mock_org_command_class.return_value = mock_org_instance
result = runner.invoke(list)
assert result.exit_code == 0
mock_org_command_class.assert_called_once()
mock_org_instance.list.assert_called_once()
@patch('crewai.cli.cli.OrganizationCommand')
def test_org_switch_command(mock_org_command_class, runner):
mock_org_instance = MagicMock()
mock_org_command_class.return_value = mock_org_instance
result = runner.invoke(switch, ['test-id'])
assert result.exit_code == 0
mock_org_command_class.assert_called_once()
mock_org_instance.switch.assert_called_once_with('test-id')
@patch('crewai.cli.cli.OrganizationCommand')
def test_org_current_command(mock_org_command_class, runner):
mock_org_instance = MagicMock()
mock_org_command_class.return_value = mock_org_instance
result = runner.invoke(current)
assert result.exit_code == 0
mock_org_command_class.assert_called_once()
mock_org_instance.current.assert_called_once()
class TestOrganizationCommand(unittest.TestCase):
def setUp(self):
with patch.object(OrganizationCommand, '__init__', return_value=None):
self.org_command = OrganizationCommand()
self.org_command.plus_api_client = MagicMock()
@patch('crewai.cli.organization.main.console')
@patch('crewai.cli.organization.main.Table')
def test_list_organizations_success(self, mock_table, mock_console):
mock_response = MagicMock()
mock_response.raise_for_status = MagicMock()
mock_response.json.return_value = [
{"name": "Org 1", "uuid": "org-123"},
{"name": "Org 2", "uuid": "org-456"}
]
self.org_command.plus_api_client = MagicMock()
self.org_command.plus_api_client.get_organizations.return_value = mock_response
mock_console.print = MagicMock()
self.org_command.list()
self.org_command.plus_api_client.get_organizations.assert_called_once()
mock_table.assert_called_once_with(title="Your Organizations")
mock_table.return_value.add_column.assert_has_calls([
call("Name", style="cyan"),
call("ID", style="green")
])
mock_table.return_value.add_row.assert_has_calls([
call("Org 1", "org-123"),
call("Org 2", "org-456")
])
@patch('crewai.cli.organization.main.console')
def test_list_organizations_empty(self, mock_console):
mock_response = MagicMock()
mock_response.raise_for_status = MagicMock()
mock_response.json.return_value = []
self.org_command.plus_api_client = MagicMock()
self.org_command.plus_api_client.get_organizations.return_value = mock_response
self.org_command.list()
self.org_command.plus_api_client.get_organizations.assert_called_once()
mock_console.print.assert_called_once_with(
"You don't belong to any organizations yet.",
style="yellow"
)
@patch('crewai.cli.organization.main.console')
def test_list_organizations_api_error(self, mock_console):
self.org_command.plus_api_client = MagicMock()
self.org_command.plus_api_client.get_organizations.side_effect = requests.exceptions.RequestException("API Error")
with pytest.raises(SystemExit):
self.org_command.list()
self.org_command.plus_api_client.get_organizations.assert_called_once()
mock_console.print.assert_called_once_with(
"Failed to retrieve organization list: API Error",
style="bold red"
)
@patch('crewai.cli.organization.main.console')
@patch('crewai.cli.organization.main.Settings')
def test_switch_organization_success(self, mock_settings_class, mock_console):
mock_response = MagicMock()
mock_response.raise_for_status = MagicMock()
mock_response.json.return_value = [
{"name": "Org 1", "uuid": "org-123"},
{"name": "Test Org", "uuid": "test-id"}
]
self.org_command.plus_api_client = MagicMock()
self.org_command.plus_api_client.get_organizations.return_value = mock_response
mock_settings_instance = MagicMock()
mock_settings_class.return_value = mock_settings_instance
self.org_command.switch("test-id")
self.org_command.plus_api_client.get_organizations.assert_called_once()
mock_settings_instance.dump.assert_called_once()
assert mock_settings_instance.org_name == "Test Org"
assert mock_settings_instance.org_uuid == "test-id"
mock_console.print.assert_called_once_with(
"Successfully switched to Test Org (test-id)",
style="bold green"
)
@patch('crewai.cli.organization.main.console')
def test_switch_organization_not_found(self, mock_console):
mock_response = MagicMock()
mock_response.raise_for_status = MagicMock()
mock_response.json.return_value = [
{"name": "Org 1", "uuid": "org-123"},
{"name": "Org 2", "uuid": "org-456"}
]
self.org_command.plus_api_client = MagicMock()
self.org_command.plus_api_client.get_organizations.return_value = mock_response
self.org_command.switch("non-existent-id")
self.org_command.plus_api_client.get_organizations.assert_called_once()
mock_console.print.assert_called_once_with(
"Organization with id 'non-existent-id' not found.",
style="bold red"
)
@patch('crewai.cli.organization.main.console')
@patch('crewai.cli.organization.main.Settings')
def test_current_organization_with_org(self, mock_settings_class, mock_console):
mock_settings_instance = MagicMock()
mock_settings_instance.org_name = "Test Org"
mock_settings_instance.org_uuid = "test-id"
mock_settings_class.return_value = mock_settings_instance
self.org_command.current()
self.org_command.plus_api_client.get_organizations.assert_not_called()
mock_console.print.assert_called_once_with(
"Currently logged in to organization Test Org (test-id)",
style="bold green"
)
@patch('crewai.cli.organization.main.console')
@patch('crewai.cli.organization.main.Settings')
def test_current_organization_without_org(self, mock_settings_class, mock_console):
mock_settings_instance = MagicMock()
mock_settings_instance.org_uuid = None
mock_settings_class.return_value = mock_settings_instance
self.org_command.current()
assert mock_console.print.call_count == 3
mock_console.print.assert_any_call(
"You're not currently logged in to any organization.",
style="yellow"
)
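For readers skimming the suite above, here is a minimal sketch of the command behavior these tests pin down: list renders a rich Table, switch persists the selection through Settings, and current reads it back. The Settings import path and the elided hint text in current are assumptions; only the strings asserted by the tests are taken verbatim.

import requests
from rich.console import Console
from rich.table import Table

from crewai.cli.settings.main import Settings  # assumption: actual import path may differ

console = Console()

class OrganizationCommand:
    # the real class initializes self.plus_api_client in __init__ (bypassed in the tests above)

    def list(self) -> None:
        try:
            response = self.plus_api_client.get_organizations()
            response.raise_for_status()
            orgs = response.json()
        except requests.exceptions.RequestException as e:
            console.print(f"Failed to retrieve organization list: {e}", style="bold red")
            raise SystemExit(1)
        if not orgs:
            console.print("You don't belong to any organizations yet.", style="yellow")
            return
        table = Table(title="Your Organizations")
        table.add_column("Name", style="cyan")
        table.add_column("ID", style="green")
        for org in orgs:
            table.add_row(org["name"], org["uuid"])
        console.print(table)

    def switch(self, org_id: str) -> None:
        response = self.plus_api_client.get_organizations()
        response.raise_for_status()
        org = next((o for o in response.json() if o["uuid"] == org_id), None)
        if org is None:
            console.print(f"Organization with id '{org_id}' not found.", style="bold red")
            return
        settings = Settings()
        settings.org_name = org["name"]
        settings.org_uuid = org["uuid"]
        settings.dump()  # persist the selection for later CLI calls
        console.print(f"Successfully switched to {org['name']} ({org_id})", style="bold green")

    def current(self) -> None:
        settings = Settings()
        if settings.org_uuid:
            console.print(
                f"Currently logged in to organization {settings.org_name} ({settings.org_uuid})",
                style="bold green",
            )
        else:
            # the tests expect three prints here; the extra hint lines are elided
            console.print("You're not currently logged in to any organization.", style="yellow")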

View File

@@ -1,6 +1,6 @@
import os
import unittest
from unittest.mock import MagicMock, patch
from unittest.mock import MagicMock, patch, ANY
from crewai.cli.plus_api import PlusAPI
@@ -9,6 +9,7 @@ class TestPlusAPI(unittest.TestCase):
def setUp(self):
self.api_key = "test_api_key"
self.api = PlusAPI(self.api_key)
self.org_uuid = "test-org-uuid"
def test_init(self):
self.assertEqual(self.api.api_key, self.api_key)
@@ -29,17 +30,96 @@ class TestPlusAPI(unittest.TestCase):
)
self.assertEqual(response, mock_response)
def assert_request_with_org_id(self, mock_make_request, method: str, endpoint: str, **kwargs):
mock_make_request.assert_called_once_with(
method, f"https://app.crewai.com{endpoint}", headers={'Authorization': ANY, 'Content-Type': ANY, 'User-Agent': ANY, 'X-Crewai-Version': ANY, 'X-Crewai-Organization-Id': self.org_uuid}, **kwargs
)
@patch("crewai.cli.plus_api.Settings")
@patch("requests.Session.request")
def test_login_to_tool_repository_with_org_uuid(self, mock_make_request, mock_settings_class):
mock_settings = MagicMock()
mock_settings.org_uuid = self.org_uuid
mock_settings_class.return_value = mock_settings
# re-initialize the client so it picks up the patched Settings (org_uuid set)
self.api = PlusAPI(self.api_key)
mock_response = MagicMock()
mock_make_request.return_value = mock_response
response = self.api.login_to_tool_repository()
self.assert_request_with_org_id(
mock_make_request,
'POST',
'/crewai_plus/api/v1/tools/login'
)
self.assertEqual(response, mock_response)
@patch("crewai.cli.plus_api.PlusAPI._make_request")
def test_get_agent(self, mock_make_request):
mock_response = MagicMock()
mock_make_request.return_value = mock_response
response = self.api.get_agent("test_agent_handle")
mock_make_request.assert_called_once_with(
"GET", "/crewai_plus/api/v1/agents/test_agent_handle"
)
self.assertEqual(response, mock_response)
@patch("crewai.cli.plus_api.Settings")
@patch("requests.Session.request")
def test_get_agent_with_org_uuid(self, mock_make_request, mock_settings_class):
mock_settings = MagicMock()
mock_settings.org_uuid = self.org_uuid
mock_settings_class.return_value = mock_settings
# re-initialize the client so it picks up the patched Settings (org_uuid set)
self.api = PlusAPI(self.api_key)
mock_response = MagicMock()
mock_make_request.return_value = mock_response
response = self.api.get_agent("test_agent_handle")
self.assert_request_with_org_id(
mock_make_request,
"GET",
"/crewai_plus/api/v1/agents/test_agent_handle"
)
self.assertEqual(response, mock_response)
@patch("crewai.cli.plus_api.PlusAPI._make_request")
def test_get_tool(self, mock_make_request):
mock_response = MagicMock()
mock_make_request.return_value = mock_response
response = self.api.get_tool("test_tool_handle")
mock_make_request.assert_called_once_with(
"GET", "/crewai_plus/api/v1/tools/test_tool_handle"
)
self.assertEqual(response, mock_response)
@patch("crewai.cli.plus_api.Settings")
@patch("requests.Session.request")
def test_get_tool_with_org_uuid(self, mock_make_request, mock_settings_class):
mock_settings = MagicMock()
mock_settings.org_uuid = self.org_uuid
mock_settings_class.return_value = mock_settings
# re-initialize the client so it picks up the patched Settings (org_uuid set)
self.api = PlusAPI(self.api_key)
# Set up mock response
mock_response = MagicMock()
mock_make_request.return_value = mock_response
response = self.api.get_tool("test_tool_handle")
self.assert_request_with_org_id(
mock_make_request,
"GET",
"/crewai_plus/api/v1/tools/test_tool_handle"
)
self.assertEqual(response, mock_response)
@patch("crewai.cli.plus_api.PlusAPI._make_request")
def test_publish_tool(self, mock_make_request):
@@ -61,11 +141,53 @@ class TestPlusAPI(unittest.TestCase):
"version": version,
"file": encoded_file,
"description": description,
"available_exports": None,
}
mock_make_request.assert_called_once_with(
"POST", "/crewai_plus/api/v1/tools", json=params
)
self.assertEqual(response, mock_response)
@patch("crewai.cli.plus_api.Settings")
@patch("requests.Session.request")
def test_publish_tool_with_org_uuid(self, mock_make_request, mock_settings_class):
mock_settings = MagicMock()
mock_settings.org_uuid = self.org_uuid
mock_settings_class.return_value = mock_settings
# re-initialize the client so it picks up the patched Settings (org_uuid set)
self.api = PlusAPI(self.api_key)
# Set up mock response
mock_response = MagicMock()
mock_make_request.return_value = mock_response
handle = "test_tool_handle"
public = True
version = "1.0.0"
description = "Test tool description"
encoded_file = "encoded_test_file"
response = self.api.publish_tool(
handle, public, version, description, encoded_file
)
# Expected params including organization_uuid
expected_params = {
"handle": handle,
"public": public,
"version": version,
"file": encoded_file,
"description": description,
"available_exports": None,
}
self.assert_request_with_org_id(
mock_make_request,
"POST",
"/crewai_plus/api/v1/tools",
json=expected_params
)
self.assertEqual(response, mock_response)
@patch("crewai.cli.plus_api.PlusAPI._make_request")
def test_publish_tool_without_description(self, mock_make_request):
@@ -87,6 +209,7 @@ class TestPlusAPI(unittest.TestCase):
"version": version,
"file": encoded_file,
"description": description,
"available_exports": None,
}
mock_make_request.assert_called_once_with(
"POST", "/crewai_plus/api/v1/tools", json=params

View File

@@ -1,6 +1,7 @@
import os
import shutil
import tempfile
from pathlib import Path
import pytest
@@ -100,3 +101,163 @@ def test_tree_copy_to_existing_directory(temp_tree):
assert os.path.isfile(os.path.join(dest_dir, "file1.txt"))
finally:
shutil.rmtree(dest_dir)
@pytest.fixture
def temp_project_dir():
"""Create a temporary directory for testing tool extraction."""
with tempfile.TemporaryDirectory() as temp_dir:
yield Path(temp_dir)
def create_init_file(directory, content):
return create_file(directory / "__init__.py", content)
def test_extract_available_exports_empty_project(temp_project_dir, capsys):
with pytest.raises(SystemExit):
utils.extract_available_exports(dir_path=temp_project_dir)
captured = capsys.readouterr()
assert "No valid tools were exposed in your __init__.py file" in captured.out
def test_extract_available_exports_no_init_file(temp_project_dir, capsys):
(temp_project_dir / "some_file.py").write_text("print('hello')")
with pytest.raises(SystemExit):
utils.extract_available_exports(dir_path=temp_project_dir)
captured = capsys.readouterr()
assert "No valid tools were exposed in your __init__.py file" in captured.out
def test_extract_available_exports_empty_init_file(temp_project_dir, capsys):
create_init_file(temp_project_dir, "")
with pytest.raises(SystemExit):
utils.extract_available_exports(dir_path=temp_project_dir)
captured = capsys.readouterr()
assert "Warning: No __all__ defined in" in captured.out
def test_extract_available_exports_no_all_variable(temp_project_dir, capsys):
create_init_file(
temp_project_dir,
"from crewai.tools import BaseTool\n\nclass MyTool(BaseTool):\n pass",
)
with pytest.raises(SystemExit):
utils.extract_available_exports(dir_path=temp_project_dir)
captured = capsys.readouterr()
assert "Warning: No __all__ defined in" in captured.out
def test_extract_available_exports_valid_base_tool_class(temp_project_dir):
create_init_file(
temp_project_dir,
"""from crewai.tools import BaseTool
class MyTool(BaseTool):
name: str = "my_tool"
description: str = "A test tool"
__all__ = ['MyTool']
""",
)
tools = utils.extract_available_exports(dir_path=temp_project_dir)
assert [{"name": "MyTool"}] == tools
def test_extract_available_exports_valid_tool_decorator(temp_project_dir):
create_init_file(
temp_project_dir,
"""from crewai.tools import tool
@tool
def my_tool_function(text: str) -> str:
\"\"\"A test tool function\"\"\"
return text
__all__ = ['my_tool_function']
""",
)
tools = utils.extract_available_exports(dir_path=temp_project_dir)
assert [{"name": "my_tool_function"}] == tools
def test_extract_available_exports_multiple_valid_tools(temp_project_dir):
create_init_file(
temp_project_dir,
"""from crewai.tools import BaseTool, tool
class MyTool(BaseTool):
name: str = "my_tool"
description: str = "A test tool"
@tool
def my_tool_function(text: str) -> str:
\"\"\"A test tool function\"\"\"
return text
__all__ = ['MyTool', 'my_tool_function']
""",
)
tools = utils.extract_available_exports(dir_path=temp_project_dir)
assert [{"name": "MyTool"}, {"name": "my_tool_function"}] == tools
def test_extract_available_exports_with_invalid_tool_decorator(temp_project_dir):
create_init_file(
temp_project_dir,
"""from crewai.tools import BaseTool
class MyTool(BaseTool):
name: str = "my_tool"
description: str = "A test tool"
def not_a_tool():
pass
__all__ = ['MyTool', 'not_a_tool']
""",
)
tools = utils.extract_available_exports(dir_path=temp_project_dir)
assert [{"name": "MyTool"}] == tools
def test_extract_available_exports_import_error(temp_project_dir, capsys):
create_init_file(
temp_project_dir,
"""from nonexistent_module import something
class MyTool(BaseTool):
pass
__all__ = ['MyTool']
""",
)
with pytest.raises(SystemExit):
utils.extract_available_exports(dir_path=temp_project_dir)
captured = capsys.readouterr()
assert "nonexistent_module" in captured.out
def test_extract_available_exports_syntax_error(temp_project_dir, capsys):
create_init_file(
temp_project_dir,
"""from crewai.tools import BaseTool
class MyTool(BaseTool):
# Missing closing parenthesis
def __init__(self, name:
pass
__all__ = ['MyTool']
""",
)
with pytest.raises(SystemExit):
utils.extract_available_exports(dir_path=temp_project_dir)
captured = capsys.readouterr()
assert "was never closed" in captured.out

View File

@@ -85,6 +85,36 @@ def test_install_success(mock_get, mock_subprocess_run, capsys, tool_command):
env=unittest.mock.ANY,
)
@patch("crewai.cli.tools.main.subprocess.run")
@patch("crewai.cli.plus_api.PlusAPI.get_tool")
def test_install_success_from_pypi(mock_get, mock_subprocess_run, capsys, tool_command):
mock_get_response = MagicMock()
mock_get_response.status_code = 200
mock_get_response.json.return_value = {
"handle": "sample-tool",
"repository": {"handle": "sample-repo", "url": "https://example.com/repo"},
"source": "pypi",
}
mock_get.return_value = mock_get_response
mock_subprocess_run.return_value = MagicMock(stderr=None)
tool_command.install("sample-tool")
output = capsys.readouterr().out
assert "Successfully installed sample-tool" in output
mock_get.assert_has_calls([mock.call("sample-tool"), mock.call().json()])
mock_subprocess_run.assert_any_call(
[
"uv",
"add",
"sample-tool",
],
capture_output=False,
text=True,
check=True,
env=unittest.mock.ANY,
)
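
The new PyPI path asserted here implies a branch in the install command roughly like the following. The get_tool response shape and the uv invocation mirror the mocks; everything else, including the git fallback, is an assumption:

import os
import subprocess

def install_sketch(plus_api, handle):
    tool = plus_api.get_tool(handle).json()
    if tool.get("source") == "pypi":
        # Plain PyPI package: delegate the dependency add to uv
        subprocess.run(
            ["uv", "add", handle],
            capture_output=False, text=True, check=True,
            env=os.environ.copy(),
        )
    else:
        # Repository-hosted tool: hypothetical fallback installing from the
        # private repository URL as a git dependency
        subprocess.run(
            ["uv", "add", f"git+{tool['repository']['url']}"],
            capture_output=False, text=True, check=True,
            env=os.environ.copy(),
        )
    print(f"Successfully installed {handle}")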
@patch("crewai.cli.plus_api.PlusAPI.get_tool")
def test_install_tool_not_found(mock_get, capsys, tool_command):
@@ -135,7 +165,9 @@ def test_publish_when_not_in_sync(mock_is_synced, capsys, tool_command):
)
@patch("crewai.cli.plus_api.PlusAPI.publish_tool")
@patch("crewai.cli.tools.main.git.Repository.is_synced", return_value=False)
@patch("crewai.cli.tools.main.extract_available_exports", return_value=[{"name": "SampleTool"}])
def test_publish_when_not_in_sync_and_force(
mock_available_exports,
mock_is_synced,
mock_publish,
mock_open,
@@ -168,6 +200,7 @@ def test_publish_when_not_in_sync_and_force(
version="1.0.0",
description="A sample tool",
encoded_file=unittest.mock.ANY,
available_exports=[{"name": "SampleTool"}],
)
@@ -183,7 +216,9 @@ def test_publish_when_not_in_sync_and_force(
)
@patch("crewai.cli.plus_api.PlusAPI.publish_tool")
@patch("crewai.cli.tools.main.git.Repository.is_synced", return_value=True)
@patch("crewai.cli.tools.main.extract_available_exports", return_value=[{"name": "SampleTool"}])
def test_publish_success(
mock_available_exports,
mock_is_synced,
mock_publish,
mock_open,
@@ -216,6 +251,7 @@ def test_publish_success(
version="1.0.0",
description="A sample tool",
encoded_file=unittest.mock.ANY,
available_exports=[{"name": "SampleTool"}],
)
@@ -230,7 +266,9 @@ def test_publish_success(
read_data=b"sample tarball content",
)
@patch("crewai.cli.plus_api.PlusAPI.publish_tool")
@patch("crewai.cli.tools.main.extract_available_exports", return_value=[{"name": "SampleTool"}])
def test_publish_failure(
mock_available_exports,
mock_publish,
mock_open,
mock_listdir,
@@ -266,7 +304,9 @@ def test_publish_failure(
read_data=b"sample tarball content",
)
@patch("crewai.cli.plus_api.PlusAPI.publish_tool")
@patch("crewai.cli.tools.main.extract_available_exports", return_value=[{"name": "SampleTool"}])
def test_publish_api_error(
mock_available_exports,
mock_publish,
mock_open,
mock_listdir,

View File

@@ -4418,7 +4418,7 @@ def test_reset_knowledge_with_no_crew_knowledge(researcher,writer):
with pytest.raises(RuntimeError) as excinfo:
crew.reset_memories(command_type='knowledge')
# Check the error message
assert "Crew Knowledge and Agent Knowledge memory system is not initialized" in str(excinfo.value)
@@ -4497,7 +4497,7 @@ def test_reset_agent_knowledge_with_no_agent_knowledge(researcher,writer):
with pytest.raises(RuntimeError) as excinfo:
crew.reset_memories(command_type='agent_knowledge')
# Check the error message
assert "Agent Knowledge memory system is not initialized" in str(excinfo.value)
@@ -4517,7 +4517,7 @@ def test_reset_agent_knowledge_with_only_crew_knowledge(researcher,writer):
with pytest.raises(RuntimeError) as excinfo:
crew.reset_memories(command_type='agent_knowledge')
# Check the error message
assert "Agent Knowledge memory system is not initialized" in str(excinfo.value)

View File

@@ -0,0 +1,258 @@
"""Tests for reasoning interval and adaptive reasoning in agents."""
import pytest
from unittest.mock import patch, MagicMock
from crewai import Agent, Task
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.utilities.reasoning_handler import AgentReasoning
def test_agent_with_reasoning_interval():
"""Ensure that the agent triggers mid-execution reasoning based on the fixed interval."""
# Use a mock LLM to avoid real network calls
llm = MagicMock()
agent = Agent(
role="Test Agent",
goal="To test the reasoning interval feature",
backstory="I am a test agent created to verify the reasoning interval feature works correctly.",
llm=llm,
reasoning=True,
reasoning_interval=2, # Reason every 2 steps
verbose=True,
)
task = Task(
description="Multi-step task that requires periodic reasoning.",
expected_output="The task should be completed with periodic reasoning.",
agent=agent,
)
# Create a mock executor that will be injected into the agent
mock_executor = MagicMock()
mock_executor.steps_since_reasoning = 0
def mock_invoke(*args, **kwargs):
return mock_executor._invoke_loop()
def mock_invoke_loop():
    # Below the interval of 2, reasoning must not trigger yet
    assert not mock_executor._should_trigger_reasoning()
    # Simulate having taken enough steps to reach the interval
    mock_executor.steps_since_reasoning = 2
    assert mock_executor._should_trigger_reasoning()
    mock_executor._handle_mid_execution_reasoning()
    return {"output": "Task completed successfully."}
mock_executor.invoke = MagicMock(side_effect=mock_invoke)
mock_executor._invoke_loop = MagicMock(side_effect=mock_invoke_loop)
mock_executor._should_trigger_reasoning = MagicMock(side_effect=lambda: mock_executor.steps_since_reasoning >= 2)
mock_executor._handle_mid_execution_reasoning = MagicMock()
# Monkey-patch create_agent_executor so that it sets our mock_executor
def _fake_create_agent_executor(self, tools=None, task=None): # noqa: D401,E501
"""Replace the real executor with the mock while preserving behaviour."""
self.agent_executor = mock_executor
return mock_executor
with patch.object(Agent, "create_agent_executor", _fake_create_agent_executor):
result = agent.execute_task(task)
# Validate results and that reasoning happened when expected
assert result == "Task completed successfully."
mock_executor._invoke_loop.assert_called_once()
mock_executor._handle_mid_execution_reasoning.assert_called_once()
def test_agent_with_adaptive_reasoning():
"""Test agent with adaptive reasoning."""
# Create a mock agent with adaptive reasoning
agent = MagicMock()
agent.reasoning = True
agent.reasoning_interval = None
agent.adaptive_reasoning = True
agent.role = "Test Agent"
# Create a mock task
task = MagicMock()
executor = CrewAgentExecutor(
llm=MagicMock(),
task=task,
crew=MagicMock(),
agent=agent,
prompt={},
max_iter=10,
tools=[],
tools_names="",
stop_words=[],
tools_description="",
tools_handler=MagicMock()
)
def mock_invoke_loop():
assert executor._should_adaptive_reason()
executor._handle_mid_execution_reasoning()
return {"output": "Task completed with adaptive reasoning."}
executor._invoke_loop = MagicMock(side_effect=mock_invoke_loop)
executor._should_adaptive_reason = MagicMock(return_value=True)
executor._handle_mid_execution_reasoning = MagicMock()
result = executor._invoke_loop()
assert result["output"] == "Task completed with adaptive reasoning."
executor._should_adaptive_reason.assert_called_once()
executor._handle_mid_execution_reasoning.assert_called_once()
def test_mid_execution_reasoning_handler():
"""Test the mid-execution reasoning handler."""
llm = MagicMock()
llm.call.return_value = "Based on progress, I'll adjust my approach.\n\nREADY: I am ready to continue executing the task."
agent = Agent(
role="Test Agent",
goal="To test the mid-execution reasoning handler",
backstory="I am a test agent created to verify the mid-execution reasoning handler works correctly.",
llm=llm,
reasoning=True,
verbose=True
)
task = Task(
description="Task to test mid-execution reasoning handler.",
expected_output="The mid-execution reasoning handler should work correctly.",
agent=agent
)
reasoning_handler = AgentReasoning(task=task, agent=agent)
result = reasoning_handler.handle_mid_execution_reasoning(
current_steps=3,
tools_used=["search_tool", "calculator_tool"],
current_progress="Made progress on steps 1-3",
iteration_messages=[
{"role": "assistant", "content": "I'll search for information."},
{"role": "system", "content": "Search results: ..."},
{"role": "assistant", "content": "I'll calculate the answer."},
{"role": "system", "content": "Calculation result: 42"}
]
)
assert result is not None
assert hasattr(result, 'plan')
assert hasattr(result.plan, 'plan')
assert hasattr(result.plan, 'ready')
assert result.plan.ready is True
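
The assertions above imply the handler returns a nested object whose inner plan carries the LLM text plus a ready flag derived from the "READY:" marker in the reply. A hedged sketch of that shape (the dataclass names are illustrative; only handle_mid_execution_reasoning comes from the test):

from dataclasses import dataclass

@dataclass
class _Plan:
    plan: str
    ready: bool

@dataclass
class _ReasoningResult:
    plan: _Plan

def mid_execution_reasoning_sketch(llm_reply: str) -> _ReasoningResult:
    # "READY:" anywhere in the reply marks the agent as fit to continue
    return _ReasoningResult(plan=_Plan(plan=llm_reply, ready="READY:" in llm_reply))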
def test_should_trigger_reasoning_interval():
"""Test the _should_trigger_reasoning method with interval-based reasoning."""
agent = MagicMock()
agent.reasoning = True
agent.reasoning_interval = 3
agent.adaptive_reasoning = False
executor = CrewAgentExecutor(
llm=MagicMock(),
task=MagicMock(),
crew=MagicMock(),
agent=agent,
prompt={},
max_iter=10,
tools=[],
tools_names="",
stop_words=[],
tools_description="",
tools_handler=MagicMock()
)
executor.steps_since_reasoning = 0
assert executor._should_trigger_reasoning() is False
executor.steps_since_reasoning = 2
assert executor._should_trigger_reasoning() is False
executor.steps_since_reasoning = 3
assert executor._should_trigger_reasoning() is True
executor.steps_since_reasoning = 4
assert executor._should_trigger_reasoning() is True
def test_should_trigger_adaptive_reasoning():
"""Test the _should_adaptive_reason method."""
agent = MagicMock()
agent.reasoning = True
agent.reasoning_interval = None
agent.adaptive_reasoning = True
executor = CrewAgentExecutor(
llm=MagicMock(),
task=MagicMock(),
crew=MagicMock(),
agent=agent,
prompt={},
max_iter=10,
tools=[],
tools_names="",
stop_words=[],
tools_description="",
tools_handler=MagicMock()
)
with patch('crewai.utilities.reasoning_handler.AgentReasoning.should_adaptive_reason_llm', return_value=True):
assert executor._should_adaptive_reason() is True
executor.messages = [
{"role": "assistant", "content": "I'll try this approach."},
{"role": "system", "content": "Error: Failed to execute the command."},
{"role": "assistant", "content": "Let me try something else."}
]
assert executor._should_adaptive_reason() is True
executor.messages = [
{"role": "assistant", "content": "I'll try this approach."},
{"role": "system", "content": "Command executed successfully."},
{"role": "assistant", "content": "Let me continue with the next step."}
]
with patch('crewai.utilities.reasoning_handler.AgentReasoning.should_adaptive_reason_llm', return_value=False):
assert executor._should_adaptive_reason() is False
@pytest.mark.parametrize("interval,steps,should_reason", [
(None, 5, False),
(3, 2, False),
(3, 3, True),
(1, 1, True),
(5, 10, True),
])
def test_reasoning_interval_scenarios(interval, steps, should_reason):
"""Test various reasoning interval scenarios."""
agent = MagicMock()
agent.reasoning = True
agent.reasoning_interval = interval
agent.adaptive_reasoning = False
executor = CrewAgentExecutor(
llm=MagicMock(),
task=MagicMock(),
crew=MagicMock(),
agent=agent,
prompt={},
max_iter=10,
tools=[],
tools_names="",
stop_words=[],
tools_description="",
tools_handler=MagicMock()
)
executor.steps_since_reasoning = steps
assert executor._should_trigger_reasoning() is should_reason
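
The parametrized cases encode a plain threshold rule, so _should_trigger_reasoning plausibly reduces to something like this (attribute names come from the tests; the body is an assumption):

def _should_trigger_reasoning(self) -> bool:
    interval = getattr(self.agent, "reasoning_interval", None)
    if not getattr(self.agent, "reasoning", False) or interval is None:
        return False  # covers (interval=None, steps=5) -> False
    # Fire once the step counter reaches the configured interval
    return self.steps_since_reasoning >= interval  # (3, 2) -> False, (3, 3) -> True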

View File

@@ -0,0 +1,215 @@
"""Unit tests for acceptance criteria validation feature at task level."""
import pytest
from unittest.mock import MagicMock, patch
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.agents.agent_state import AgentState
from crewai.tools.agent_tools.scratchpad_tool import ScratchpadTool
from crewai.agents.parser import AgentFinish
from crewai.utilities import Printer
from crewai.llm import LLM
class TestAcceptanceCriteriaValidation:
"""Test suite for task-level acceptance criteria validation functionality."""
def setup_method(self):
"""Set up test fixtures."""
self.mock_llm = MagicMock(spec=LLM)
self.mock_agent = MagicMock()
self.mock_task = MagicMock()
self.mock_crew = MagicMock()
self.mock_tools_handler = MagicMock()
# Set up agent attributes
self.mock_agent.role = "Test Agent"
self.mock_agent.reasoning = True
self.mock_agent.verbose = False
self.mock_agent.reasoning_interval = None
self.mock_agent.adaptive_reasoning = False
# Create executor
self.executor = CrewAgentExecutor(
llm=self.mock_llm,
task=self.mock_task,
crew=self.mock_crew,
agent=self.mock_agent,
prompt={},
max_iter=10,
tools=[],
tools_names="",
stop_words=[],
tools_description="",
tools_handler=self.mock_tools_handler,
callbacks=[]
)
# Set up agent state with acceptance criteria
self.executor.agent_state = AgentState(task_id="test-task-id")
self.executor.agent_state.acceptance_criteria = [
"Include all required information",
"Format output properly",
"Provide complete analysis"
]
# Mock printer
self.executor._printer = MagicMock(spec=Printer)
def test_validate_acceptance_criteria_all_met(self):
"""Test validation when all acceptance criteria are met."""
output = "Complete output with all information, properly formatted, with full analysis"
# Configure LLM to return all criteria met
self.mock_llm.call.return_value = '''{
"1": "MET",
"2": "MET",
"3": "MET"
}'''
is_valid, unmet_criteria = self.executor._validate_acceptance_criteria(output)
assert is_valid is True
assert unmet_criteria == []
assert self.mock_llm.call.call_count == 1
def test_validate_acceptance_criteria_some_unmet(self):
"""Test validation when some criteria are not met."""
output = "Partial output missing formatting"
# Configure LLM to return mixed results
self.mock_llm.call.return_value = '''{
"1": "MET",
"2": "NOT MET: Missing proper formatting",
"3": "NOT MET: Analysis incomplete"
}'''
is_valid, unmet_criteria = self.executor._validate_acceptance_criteria(output)
assert is_valid is False
assert len(unmet_criteria) == 2
assert "Format output properly" in unmet_criteria
assert "Provide complete analysis" in unmet_criteria
def test_create_criteria_retry_prompt_with_scratchpad(self):
"""Test retry prompt creation when scratchpad has data."""
# Set up scratchpad tool with data
self.executor.scratchpad_tool = ScratchpadTool()
self.executor.agent_state.scratchpad = {
"research_data": {"key": "value"},
"analysis_results": ["item1", "item2"]
}
# Set up task details
self.mock_task.description = "Analyze research data and provide insights"
self.mock_task.expected_output = "A comprehensive report with analysis and recommendations"
unmet_criteria = ["Include specific examples", "Add recommendations"]
prompt = self.executor._create_criteria_retry_prompt(unmet_criteria)
# Verify prompt content with new format
assert "VALIDATION FAILED" in prompt
assert "YOU CANNOT PROVIDE A FINAL ANSWER YET" in prompt
assert "ORIGINAL TASK:" in prompt
assert "Analyze research data" in prompt
assert "EXPECTED OUTPUT:" in prompt
assert "comprehensive report" in prompt
assert "Include specific examples" in prompt
assert "Add recommendations" in prompt
assert "Access Scratchpad Memory" in prompt
assert "'research_data'" in prompt
assert "'analysis_results'" in prompt
assert "Action:" in prompt
assert "Action Input:" in prompt
assert "CONTINUE WITH TOOL USAGE NOW" in prompt
assert "DO NOT ATTEMPT ANOTHER FINAL ANSWER" in prompt
def test_create_criteria_retry_prompt_without_scratchpad(self):
"""Test retry prompt creation when no scratchpad data exists."""
unmet_criteria = ["Add more detail"]
prompt = self.executor._create_criteria_retry_prompt(unmet_criteria)
assert "Add more detail" in prompt
assert "VALIDATION FAILED" in prompt
assert "📦 YOUR SCRATCHPAD CONTAINS DATA" not in prompt
@patch('crewai.agents.crew_agent_executor.get_llm_response')
@patch('crewai.agents.crew_agent_executor.process_llm_response')
def test_invoke_loop_blocks_incomplete_final_answer(self, mock_process, mock_get_response):
"""Test that invoke loop blocks incomplete final answers."""
# Set up conditions
self.executor.agent_state.acceptance_criteria = ["Complete all sections"]
# First attempt returns incomplete final answer
incomplete_answer = AgentFinish(
thought="Done",
output="Exploring potential follow-up tasks!",
text="Final Answer: Exploring potential follow-up tasks!"
)
# After retry, return complete answer
complete_answer = AgentFinish(
thought="Done with all sections",
output="Complete output with all sections addressed",
text="Final Answer: Complete output with all sections addressed"
)
# Configure mocks
mock_process.side_effect = [incomplete_answer, complete_answer]
mock_get_response.return_value = "response"
# Configure validation
self.mock_llm.call.side_effect = [
'{"1": "NOT MET: Missing required sections"}', # First validation fails
'{"1": "MET"}' # Second validation passes
]
# Execute
result = self.executor._invoke_loop()
# Verify
assert result == complete_answer
assert self.mock_llm.call.call_count == 2 # Two validation attempts
assert mock_process.call_count == 2 # Two processing attempts
# Verify error message was shown
self._verify_validation_messages_shown()
def test_validation_happens_on_every_final_answer_attempt(self):
"""Test that validation happens on every AgentFinish attempt."""
self.executor.agent_state.acceptance_criteria = ["Complete all sections"]
# Configure LLM to always return criteria not met
self.mock_llm.call.return_value = '{"1": "NOT MET: Missing required sections"}'
output = "Incomplete output"
# Validate multiple times - each should trigger validation
for _ in range(3):
is_valid, unmet_criteria = self.executor._validate_acceptance_criteria(output)
assert is_valid is False
assert len(unmet_criteria) == 1
# Verify validation was called every time
assert self.mock_llm.call.call_count == 3
def _verify_validation_messages_shown(self):
"""Helper to verify validation messages were displayed."""
print_calls = self.executor._printer.print.call_args_list
# Check for validation message
validation_msg_shown = any(
"Validating acceptance criteria" in str(call)
for call in print_calls
)
# Check for failure message
failure_msg_shown = any(
"Cannot finalize" in str(call)
for call in print_calls
)
assert validation_msg_shown or failure_msg_shown
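
The suite above fixes the validation protocol: one LLM call per final-answer attempt, returning a JSON object that maps 1-based criterion indices to "MET" or "NOT MET: <reason>". A hedged sketch of _validate_acceptance_criteria under that contract (the prompt wording and parsing details are assumptions):

import json

def _validate_acceptance_criteria(self, output):
    criteria = self.agent_state.acceptance_criteria
    prompt = (
        "For each numbered criterion, answer with a JSON object mapping the "
        "criterion number to 'MET' or 'NOT MET: <reason>'.\n"
        + "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
        + f"\n\nOUTPUT TO CHECK:\n{output}"
    )
    verdicts = json.loads(self.llm.call(prompt))  # exactly one call per attempt
    unmet = [
        criterion
        for i, criterion in enumerate(criteria, start=1)
        if not str(verdicts.get(str(i), "")).startswith("MET")
    ]
    return (len(unmet) == 0, unmet)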

View File

@@ -0,0 +1,176 @@
"""Unit tests for the ScratchpadTool."""
import pytest
from crewai.tools.agent_tools.scratchpad_tool import ScratchpadTool, ScratchpadToolSchema
class TestScratchpadTool:
"""Test suite for the ScratchpadTool functionality."""
def test_schema_description(self):
"""Test that the schema has helpful description."""
schema = ScratchpadToolSchema
key_field = schema.model_fields['key']
assert "Example:" in key_field.description
assert '{"key":' in key_field.description
def test_empty_scratchpad_error_message(self):
"""Test error message when scratchpad is empty."""
tool = ScratchpadTool()
result = tool._run(key="nonexistent")
assert "❌ SCRATCHPAD IS EMPTY" in result
assert "does not contain any data yet" in result
assert "Try executing other tools first" in result
assert "💡 TIP:" in result
assert "search, read, or fetch operations" in result
def test_key_not_found_error_message(self):
"""Test error message when key is not found."""
tool = ScratchpadTool(scratchpad_data={
"existing_key": "value",
"another_key": {"data": "test"}
})
result = tool._run(key="wrong_key")
assert "❌ KEY NOT FOUND: 'wrong_key'" in result
assert "📦 AVAILABLE KEYS IN SCRATCHPAD:" in result
assert "- 'existing_key'" in result
assert "- 'another_key'" in result
assert '✅ CORRECT USAGE EXAMPLE:' in result
assert 'Action: Access Scratchpad Memory' in result
assert 'Action Input: {"key": "existing_key"}' in result
assert "⚠️ IMPORTANT:" in result
assert "Keys are case-sensitive and must match EXACTLY" in result
def test_successful_retrieval_string(self):
"""Test successful retrieval of string data."""
tool = ScratchpadTool(scratchpad_data={
"message": "Hello, World!"
})
result = tool._run(key="message")
assert "✅ Successfully retrieved data for key 'message':" in result
assert "Hello, World!" in result
def test_successful_retrieval_dict(self):
"""Test successful retrieval of dictionary data."""
test_dict = {"name": "John", "age": 30}
tool = ScratchpadTool(scratchpad_data={
"user_data": test_dict
})
result = tool._run(key="user_data")
assert "✅ Successfully retrieved data for key 'user_data':" in result
assert '"name": "John"' in result
assert '"age": 30' in result
def test_successful_retrieval_list(self):
"""Test successful retrieval of list data."""
test_list = ["item1", "item2", "item3"]
tool = ScratchpadTool(scratchpad_data={
"items": test_list
})
result = tool._run(key="items")
assert "✅ Successfully retrieved data for key 'items':" in result
assert '"item1"' in result
assert '"item2"' in result
assert '"item3"' in result
def test_tool_description_empty(self):
"""Test tool description when scratchpad is empty."""
tool = ScratchpadTool()
assert "HOW TO USE THIS TOOL:" in tool.description
assert 'Example: {"key": "email_data"}' in tool.description
assert "📝 STATUS: Scratchpad is currently empty" in tool.description
def test_tool_description_with_data(self):
"""Test tool description when scratchpad has data."""
tool = ScratchpadTool(scratchpad_data={
"emails": ["email1@test.com", "email2@test.com"],
"results": {"count": 5, "status": "success"},
"api_key": "secret_key_123"
})
desc = tool.description
# Check basic structure
assert "HOW TO USE THIS TOOL:" in desc
assert "📦 AVAILABLE DATA IN SCRATCHPAD:" in desc
assert "💡 EXAMPLE USAGE:" in desc
# Check key listings
assert "📌 'emails': list of 2 items" in desc
assert "📌 'results': dict with 2 items" in desc
assert "📌 'api_key': string (14 chars)" in desc
# Check example uses first key
assert 'Action Input: {"key": "emails"}' in desc
def test_update_scratchpad(self):
"""Test updating scratchpad data."""
tool = ScratchpadTool()
# Initially empty
assert not tool.scratchpad_data
# Update with data
new_data = {"test": "value"}
tool.update_scratchpad(new_data)
assert tool.scratchpad_data == new_data
assert "📌 'test': string (5 chars)" in tool.description
def test_complex_data_preview(self):
"""Test preview generation for complex data structures."""
tool = ScratchpadTool(scratchpad_data={
"nested_dict": {
"data": ["item1", "item2", "item3"]
},
"empty_list": [],
"boolean_value": True,
"number": 42
})
desc = tool.description
# Special case for dict with 'data' key containing list
assert "📌 'nested_dict': list of 3 items" in desc
assert "📌 'empty_list': list of 0 items" in desc
assert "📌 'boolean_value': bool" in desc
assert "📌 'number': int" in desc
def test_similar_key_suggestion(self):
"""Test that similar keys are suggested when a wrong key is used."""
tool = ScratchpadTool(scratchpad_data={
"email_search_results": ["email1", "email2"],
"email_details": {"id": "123"},
"user_preferences": {"theme": "dark"}
})
# Test partial match
result = tool._run(key="email")
assert "🔍 Did you mean one of these?" in result
# Check that similar keys are in the suggestions
# Extract just the "Did you mean" section
did_you_mean_section = result.split("🔍 Did you mean one of these?")[1].split("✅ CORRECT USAGE EXAMPLE:")[0]
assert "- 'email_search_results'" in did_you_mean_section
assert "- 'email_details'" in did_you_mean_section
assert "- 'user_preferences'" not in did_you_mean_section
# But user_preferences should still be in the full list
assert "- 'user_preferences'" in result
# Test case-insensitive match
result = tool._run(key="EMAIL_DETAILS")
assert "🔍 Did you mean one of these?" in result
assert "- 'email_details'" in result
# Test no similar keys
result = tool._run(key="completely_different")
assert "🔍 Did you mean one of these?" not in result

uv.lock (generated)
View File

@@ -725,7 +725,7 @@ wheels = [
[[package]]
name = "crewai"
version = "0.121.1"
version = "0.126.0"
source = { editable = "." }
dependencies = [
{ name = "appdirs" },
@@ -814,7 +814,7 @@ requires-dist = [
{ name = "blinker", specifier = ">=1.9.0" },
{ name = "chromadb", specifier = ">=0.5.23" },
{ name = "click", specifier = ">=8.1.7" },
{ name = "crewai-tools", marker = "extra == 'tools'", specifier = "~=0.45.0" },
{ name = "crewai-tools", marker = "extra == 'tools'", specifier = "~=0.46.0" },
{ name = "docling", marker = "extra == 'docling'", specifier = ">=2.12.0" },
{ name = "instructor", specifier = ">=1.3.3" },
{ name = "json-repair", specifier = ">=0.25.2" },
@@ -866,7 +866,7 @@ dev = [
[[package]]
name = "crewai-tools"
version = "0.45.0"
version = "0.46.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "chromadb" },
@@ -881,9 +881,9 @@ dependencies = [
{ name = "pytube" },
{ name = "requests" },
]
-sdist = { url = "https://files.pythonhosted.org/packages/e9/3a/7070dcacef56702c5d83ad1a87021b1666ff1850ff80b3aa7540892406e7/crewai_tools-0.45.0.tar.gz", hash = "sha256:1b2e4eff3f928ce5fac308d6e648719a0e4718a1228ae98980aa0d74fc16bfc7", size = 909723 }
+sdist = { url = "https://files.pythonhosted.org/packages/0d/9e/69109f5d5b398b2edeccec1055e93cdceac3becd04407bcce97de6557180/crewai_tools-0.46.0.tar.gz", hash = "sha256:c8f89247199d528c77db4b450a1ca781b5d32405982467baf516ede4b2045bd1", size = 913715 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/6e/72/db45626973027c992df75cbc7ef391f18393d631be3bceb6388c1b9f01e1/crewai_tools-0.45.0-py3-none-any.whl", hash = "sha256:9dd34e4792c075ee7a72134aedaab268e78d0e350114fd7fe2426e691c5f52a3", size = 602659 },
{ url = "https://files.pythonhosted.org/packages/ab/62/0b68637ce820fbb0385495bd6d75ceb27de53f060df5417f293419826481/crewai_tools-0.46.0-py3-none-any.whl", hash = "sha256:f8e60723869ca36ede7b43dcc1491ebefc93410a972d97b7b0ce59c4bd7a826b", size = 606190 },
]
[[package]]