Mirror of https://github.com/crewAIInc/crewAI.git (synced 2025-12-16 04:18:35 +00:00)
Add reasoning attribute to Agent class (#2866)
* Add reasoning attribute to Agent class
* Address PR feedback: improve type hints, error handling, refactor reasoning handler, and enhance tests and docs
* Implement function calling for reasoning and move prompts to translations
* Simplify function calling implementation with better error handling
* Enhance system prompts to leverage agent context (role, goal, backstory)
* Fix lint and type-checker issues
* Enhance system prompts to better leverage agent context
* Fix backstory access in reasoning handler for Python 3.12 compatibility

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Joe Moura <joao@crewai.com>
Co-authored-by: João Moura <joaomdmoura@gmail.com>
Committed by GitHub
Parent: 227b521f9e
Commit: 1ef22131e6
docs/concepts/reasoning.mdx (new file, 140 lines)
@@ -0,0 +1,140 @@
---
title: "Agent Reasoning"
---

# Agent Reasoning

Agent reasoning is a feature that allows agents to reflect on a task and create a plan before execution. This helps agents approach tasks more methodically and ensures they're ready to perform the assigned work.

## How to Use Agent Reasoning

To enable reasoning for an agent, simply set `reasoning=True` when creating the agent:

```python
from crewai import Agent

agent = Agent(
    role="Data Analyst",
    goal="Analyze complex datasets and provide insights",
    backstory="You are an experienced data analyst with expertise in finding patterns in complex data.",
    reasoning=True,  # Enable reasoning
    max_reasoning_attempts=3  # Optional: Set a maximum number of reasoning attempts
)
```

## How It Works

When reasoning is enabled, before executing a task, the agent will:

1. Reflect on the task and create a detailed plan
2. Evaluate whether it's ready to execute the task
3. Refine the plan as necessary until it's ready or `max_reasoning_attempts` is reached
4. Inject the reasoning plan into the task description before execution

This process helps the agent break down complex tasks into manageable steps and identify potential challenges before starting.
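Conceptually, this is what `execute_task` does internally when reasoning is enabled. You don't need to call the handler yourself; the sketch below only illustrates the flow, using the internal `AgentReasoning` utility introduced by this commit and illustrative agent/task values:

```python
from crewai import Agent, Task
from crewai.utilities.reasoning_handler import AgentReasoning

agent = Agent(
    role="Data Analyst",
    goal="Analyze complex datasets and provide insights",
    backstory="You are an experienced data analyst.",
    reasoning=True,
)
task = Task(
    description="Analyze the provided sales data and identify key trends.",
    expected_output="A report highlighting the top 3 sales trends.",
    agent=agent,
)

# Create/refine a plan, then inject it into the task description,
# mirroring what Agent.execute_task does when reasoning=True.
output = AgentReasoning(task=task, agent=agent).handle_agent_reasoning()
task.description += f"\n\nReasoning Plan:\n{output.plan.plan}"
```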
## Configuration Options

- `reasoning` (bool): Enable or disable reasoning (default: `False`)
- `max_reasoning_attempts` (int, optional): Maximum number of attempts to refine the plan before proceeding with execution. If `None` (default), the agent will continue refining until it's ready.

## Example

Here's a complete example:

```python
from crewai import Agent, Task, Crew

# Create an agent with reasoning enabled
analyst = Agent(
    role="Data Analyst",
    goal="Analyze data and provide insights",
    backstory="You are an expert data analyst.",
    reasoning=True,
    max_reasoning_attempts=3  # Optional: Set a limit on reasoning attempts
)

# Create a task
analysis_task = Task(
    description="Analyze the provided sales data and identify key trends.",
    expected_output="A report highlighting the top 3 sales trends.",
    agent=analyst
)

# Create a crew and run the task
crew = Crew(agents=[analyst], tasks=[analysis_task])
result = crew.kickoff()

print(result)
```

## Error Handling

The reasoning process is designed to be robust, with error handling built in. If an error occurs during reasoning, the agent will proceed with executing the task without the reasoning plan. This ensures that tasks can still be executed even if the reasoning process fails.

Here's how to handle potential errors in your code:

```python
from crewai import Agent, Task
import logging

# Set up logging to capture any reasoning errors
logging.basicConfig(level=logging.INFO)

# Create an agent with reasoning enabled
agent = Agent(
    role="Data Analyst",
    goal="Analyze data and provide insights",
    backstory="You are an expert data analyst.",  # backstory is a required Agent field
    reasoning=True,
    max_reasoning_attempts=3
)

# Create a task
task = Task(
    description="Analyze the provided sales data and identify key trends.",
    expected_output="A report highlighting the top 3 sales trends.",
    agent=agent
)

# Execute the task
# If an error occurs during reasoning, it will be logged and execution will continue
result = agent.execute_task(task)
```

## Example Reasoning Output

Here's an example of what a reasoning plan might look like for a data analysis task:

```
Task: Analyze the provided sales data and identify key trends.

Reasoning Plan:
I'll analyze the sales data to identify the top 3 trends.

1. Understanding of the task:
   I need to analyze sales data to identify key trends that would be valuable for business decision-making.

2. Key steps I'll take:
   - First, I'll examine the data structure to understand what fields are available
   - Then I'll perform exploratory data analysis to identify patterns
   - Next, I'll analyze sales by time periods to identify temporal trends
   - I'll also analyze sales by product categories and customer segments
   - Finally, I'll identify the top 3 most significant trends

3. Approach to challenges:
   - If the data has missing values, I'll decide whether to fill or filter them
   - If the data has outliers, I'll investigate whether they're valid data points or errors
   - If trends aren't immediately obvious, I'll apply statistical methods to uncover patterns

4. Use of available tools:
   - I'll use data analysis tools to explore and visualize the data
   - I'll use statistical tools to identify significant patterns
   - I'll use knowledge retrieval to access relevant information about sales analysis

5. Expected outcome:
   A concise report highlighting the top 3 sales trends with supporting evidence from the data.

READY: I am ready to execute the task.
```

This reasoning plan helps the agent organize its approach to the task, consider potential challenges, and ensure it delivers the expected output.
src/crewai/agent.py
@@ -119,6 +119,14 @@ class Agent(BaseAgent):
        default="safe",
        description="Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct execution).",
    )
    reasoning: bool = Field(
        default=False,
        description="Whether the agent should reflect and create a plan before executing a task.",
    )
    max_reasoning_attempts: Optional[int] = Field(
        default=None,
        description="Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
    )
    embedder: Optional[Dict[str, Any]] = Field(
        default=None,
        description="Embedder configuration for the agent.",
@@ -225,6 +233,21 @@ class Agent(BaseAgent):
            ValueError: If the max execution time is not a positive integer.
            RuntimeError: If the agent execution fails for other reasons.
        """
        if self.reasoning:
            try:
                from crewai.utilities.reasoning_handler import AgentReasoning, AgentReasoningOutput

                reasoning_handler = AgentReasoning(task=task, agent=self)
                reasoning_output: AgentReasoningOutput = reasoning_handler.handle_agent_reasoning()

                # Add the reasoning plan to the task description
                task.description += f"\n\nReasoning Plan:\n{reasoning_output.plan.plan}"
            except Exception as e:
                if hasattr(self, '_logger'):
                    self._logger.log("error", f"Error during reasoning process: {str(e)}")
                else:
                    print(f"Error during reasoning process: {str(e)}")

        if self.tools_handler:
            self.tools_handler.last_used_tool = {}  # type: ignore # Incompatible types in assignment (expression has type "dict[Never, Never]", variable has type "ToolCalling")
src/crewai/translations/en.json
@@ -51,5 +51,11 @@
      "description": "See image to understand its content, you can optionally ask a question about the image",
      "default_action": "Please provide a detailed description of this image, including all visual elements, context, and any notable details you can observe."
    }
  },
  "reasoning": {
    "initial_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are creating a strategic plan for a task that requires your expertise and unique perspective.",
    "refine_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are refining a strategic plan for a task that requires your expertise and unique perspective.",
    "create_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou have been assigned the following task:\n{description}\n\nExpected output:\n{expected_output}\n\nAvailable tools: {tools}\n\nBefore executing this task, create a detailed plan that leverages your expertise as {role} and outlines:\n1. Your understanding of the task from your professional perspective\n2. The key steps you'll take to complete it, drawing on your background and skills\n3. How you'll approach any challenges that might arise, considering your expertise\n4. How you'll strategically use the available tools based on your experience\n5. The expected outcome and how it aligns with your goal\n\nAfter creating your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
    "refine_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou created the following plan for this task:\n{current_plan}\n\nHowever, you indicated that you're not ready to execute the task yet.\n\nPlease refine your plan further, drawing on your expertise as {role} to address any gaps or uncertainties.\n\nAfter refining your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan further because [specific reason].\""
  }
}
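These templates are resolved at runtime by the reasoning handler (added below) through `I18N`. A minimal sketch of how one of them is retrieved and formatted, using illustrative values for the placeholders:

```python
from crewai.utilities import I18N

i18n = I18N()

# Retrieve the "initial_plan" system prompt template and fill in the
# agent's role, goal, and backstory (illustrative values).
system_prompt = i18n.retrieve("reasoning", "initial_plan").format(
    role="Data Analyst",
    goal="Analyze complex datasets and provide insights",
    backstory="You are an experienced data analyst.",
)
print(system_prompt)
```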
src/crewai/utilities/reasoning_handler.py (new file, 311 lines)
@@ -0,0 +1,311 @@
import logging
import json
from typing import Tuple, cast

from pydantic import BaseModel, Field

from crewai.agent import Agent
from crewai.task import Task
from crewai.utilities import I18N
from crewai.llm import LLM


class ReasoningPlan(BaseModel):
    """Model representing a reasoning plan for a task."""
    plan: str = Field(description="The detailed reasoning plan for the task.")
    ready: bool = Field(description="Whether the agent is ready to execute the task.")


class AgentReasoningOutput(BaseModel):
    """Model representing the output of the agent reasoning process."""
    plan: ReasoningPlan = Field(description="The reasoning plan for the task.")


class ReasoningFunction(BaseModel):
    """Model for function calling with reasoning."""
    plan: str = Field(description="The detailed reasoning plan for the task.")
    ready: bool = Field(description="Whether the agent is ready to execute the task.")


class AgentReasoning:
    """
    Handles the agent reasoning process, enabling an agent to reflect and create a plan
    before executing a task.
    """
    def __init__(self, task: Task, agent: Agent):
        if not task or not agent:
            raise ValueError("Both task and agent must be provided.")
        self.task = task
        self.agent = agent
        self.llm = cast(LLM, agent.llm)
        self.logger = logging.getLogger(__name__)
        self.i18n = I18N()

    def handle_agent_reasoning(self) -> AgentReasoningOutput:
        """
        Public method for the reasoning process that creates and refines a plan
        for the task until the agent is ready to execute it.

        Returns:
            AgentReasoningOutput: The output of the agent reasoning process.
        """
        return self.__handle_agent_reasoning()

    def __handle_agent_reasoning(self) -> AgentReasoningOutput:
        """
        Private method that handles the agent reasoning process.

        Returns:
            AgentReasoningOutput: The output of the agent reasoning process.
        """
        plan, ready = self.__create_initial_plan()

        plan, ready = self.__refine_plan_if_needed(plan, ready)

        reasoning_plan = ReasoningPlan(plan=plan, ready=ready)
        return AgentReasoningOutput(plan=reasoning_plan)

    def __create_initial_plan(self) -> Tuple[str, bool]:
        """
        Creates the initial reasoning plan for the task.

        Returns:
            Tuple[str, bool]: The initial plan and whether the agent is ready to execute the task.
        """
        reasoning_prompt = self.__create_reasoning_prompt()

        if self.llm.supports_function_calling():
            plan, ready = self.__call_with_function(reasoning_prompt, "initial_plan")
            return plan, ready
        else:
            system_prompt = self.i18n.retrieve("reasoning", "initial_plan").format(
                role=self.agent.role,
                goal=self.agent.goal,
                backstory=self.__get_agent_backstory()
            )

            response = self.llm.call(
                [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": reasoning_prompt}
                ]
            )

            return self.__parse_reasoning_response(str(response))

    def __refine_plan_if_needed(self, plan: str, ready: bool) -> Tuple[str, bool]:
        """
        Refines the reasoning plan if the agent is not ready to execute the task.

        Args:
            plan: The current reasoning plan.
            ready: Whether the agent is ready to execute the task.

        Returns:
            Tuple[str, bool]: The refined plan and whether the agent is ready to execute the task.
        """
        attempt = 1
        max_attempts = self.agent.max_reasoning_attempts

        while not ready and (max_attempts is None or attempt < max_attempts):
            refine_prompt = self.__create_refine_prompt(plan)

            if self.llm.supports_function_calling():
                plan, ready = self.__call_with_function(refine_prompt, "refine_plan")
            else:
                system_prompt = self.i18n.retrieve("reasoning", "refine_plan").format(
                    role=self.agent.role,
                    goal=self.agent.goal,
                    backstory=self.__get_agent_backstory()
                )

                response = self.llm.call(
                    [
                        {"role": "system", "content": system_prompt},
                        {"role": "user", "content": refine_prompt}
                    ]
                )
                plan, ready = self.__parse_reasoning_response(str(response))

            attempt += 1

            if max_attempts is not None and attempt >= max_attempts:
                self.logger.warning(
                    f"Agent reasoning reached maximum attempts ({max_attempts}) without being ready. Proceeding with current plan."
                )
                break

        return plan, ready

    def __call_with_function(self, prompt: str, prompt_type: str) -> Tuple[str, bool]:
        """
        Calls the LLM with function calling to get a reasoning plan.

        Args:
            prompt: The prompt to send to the LLM.
            prompt_type: The type of prompt (initial_plan or refine_plan).

        Returns:
            Tuple[str, bool]: A tuple containing the plan and whether the agent is ready.
        """
        self.logger.debug(f"Using function calling for {prompt_type} reasoning")

        function_schema = {
            "name": "create_reasoning_plan",
            "description": "Create or refine a reasoning plan for a task",
            "parameters": {
                "type": "object",
                "properties": {
                    "plan": {
                        "type": "string",
                        "description": "The detailed reasoning plan for the task."
                    },
                    "ready": {
                        "type": "boolean",
                        "description": "Whether the agent is ready to execute the task."
                    }
                },
                "required": ["plan", "ready"]
            }
        }

        try:
            system_prompt = self.i18n.retrieve("reasoning", prompt_type).format(
                role=self.agent.role,
                goal=self.agent.goal,
                backstory=self.__get_agent_backstory()
            )

            response = self.llm.call(
                [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": prompt}
                ],
                tools=[function_schema]
            )

            self.logger.debug(f"Function calling response: {response[:100]}...")

            try:
                result = json.loads(response)
                if "plan" in result and "ready" in result:
                    return result["plan"], result["ready"]
            except (json.JSONDecodeError, KeyError):
                pass

            response_str = str(response)
            return response_str, "READY: I am ready to execute the task." in response_str

        except Exception as e:
            self.logger.warning(f"Error during function calling: {str(e)}. Falling back to text parsing.")

            try:
                system_prompt = self.i18n.retrieve("reasoning", prompt_type).format(
                    role=self.agent.role,
                    goal=self.agent.goal,
                    backstory=self.__get_agent_backstory()
                )

                fallback_response = self.llm.call(
                    [
                        {"role": "system", "content": system_prompt},
                        {"role": "user", "content": prompt}
                    ]
                )

                fallback_str = str(fallback_response)
                return fallback_str, "READY: I am ready to execute the task." in fallback_str
            except Exception as inner_e:
                self.logger.error(f"Error during fallback text parsing: {str(inner_e)}")
                return "Failed to generate a plan due to an error.", True  # Default to ready to avoid getting stuck

    def __get_agent_backstory(self) -> str:
        """
        Safely gets the agent's backstory, providing a default if not available.

        Returns:
            str: The agent's backstory or a default value.
        """
        return getattr(self.agent, "backstory", "No backstory provided")

    def __create_reasoning_prompt(self) -> str:
        """
        Creates a prompt for the agent to reason about the task.

        Returns:
            str: The reasoning prompt.
        """
        available_tools = self.__format_available_tools()

        return self.i18n.retrieve("reasoning", "create_plan_prompt").format(
            role=self.agent.role,
            goal=self.agent.goal,
            backstory=self.__get_agent_backstory(),
            description=self.task.description,
            expected_output=self.task.expected_output,
            tools=available_tools
        )

    def __format_available_tools(self) -> str:
        """
        Formats the available tools for inclusion in the prompt.

        Returns:
            str: Comma-separated list of tool names.
        """
        try:
            return ', '.join([tool.name for tool in (self.task.tools or [])])
        except (AttributeError, TypeError):
            return "No tools available"

    def __create_refine_prompt(self, current_plan: str) -> str:
        """
        Creates a prompt for the agent to refine its reasoning plan.

        Args:
            current_plan: The current reasoning plan.

        Returns:
            str: The refine prompt.
        """
        return self.i18n.retrieve("reasoning", "refine_plan_prompt").format(
            role=self.agent.role,
            goal=self.agent.goal,
            backstory=self.__get_agent_backstory(),
            current_plan=current_plan
        )

    def __parse_reasoning_response(self, response: str) -> Tuple[str, bool]:
        """
        Parses the reasoning response to extract the plan and whether
        the agent is ready to execute the task.

        Args:
            response: The LLM response.

        Returns:
            Tuple[str, bool]: The plan and whether the agent is ready to execute the task.
        """
        if not response:
            return "No plan was generated.", False

        plan = response
        ready = False

        if "READY: I am ready to execute the task." in response:
            ready = True

        return plan, ready

    def _handle_agent_reasoning(self) -> AgentReasoningOutput:
        """
        Deprecated method for backward compatibility.
        Use handle_agent_reasoning() instead.

        Returns:
            AgentReasoningOutput: The output of the agent reasoning process.
        """
        self.logger.warning(
            "The _handle_agent_reasoning method is deprecated. Use handle_agent_reasoning instead."
        )
        return self.handle_agent_reasoning()
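The tests that follow exercise this handler by stubbing `llm.call` directly on the agent's LLM instance. A condensed sketch of that pattern, forcing the plain-text (non-function-calling) path; the canned response string is illustrative:

```python
from crewai import Agent, Task
from crewai.llm import LLM
from crewai.utilities.reasoning_handler import AgentReasoning

llm = LLM("gpt-3.5-turbo")
agent = Agent(
    role="Test Agent",
    goal="To test the reasoning feature",
    backstory="A test agent.",
    llm=llm,
    reasoning=True,
)
task = Task(description="What's 2+2?", expected_output="A number.", agent=agent)

# Force the plain-text path and return a canned plan ending with the READY marker,
# so __parse_reasoning_response marks the plan as ready.
agent.llm.supports_function_calling = lambda: False
agent.llm.call = lambda messages, *args, **kwargs: (
    "Plan: add the numbers.\n\nREADY: I am ready to execute the task."
)

output = AgentReasoning(task=task, agent=agent).handle_agent_reasoning()
assert output.plan.ready is True
```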
tests/agent_reasoning_test.py (new file, 261 lines)
@@ -0,0 +1,261 @@
"""Tests for reasoning in agents."""

import json
import pytest

from crewai import Agent, Task
from crewai.llm import LLM
from crewai.utilities.reasoning_handler import AgentReasoning


@pytest.fixture
def mock_llm_responses():
    """Fixture for mock LLM responses."""
    return {
        "ready": "I'll solve this simple math problem.\n\nREADY: I am ready to execute the task.\n\n",
        "not_ready": "I need to think about derivatives.\n\nNOT READY: I need to refine my plan because I'm not sure about the derivative rules.",
        "ready_after_refine": "I'll use the power rule for derivatives where d/dx(x^n) = n*x^(n-1).\n\nREADY: I am ready to execute the task.",
        "execution": "4"
    }


def test_agent_with_reasoning(mock_llm_responses):
    """Test agent with reasoning."""
    llm = LLM("gpt-3.5-turbo")

    agent = Agent(
        role="Test Agent",
        goal="To test the reasoning feature",
        backstory="I am a test agent created to verify the reasoning feature works correctly.",
        llm=llm,
        reasoning=True,
        verbose=True
    )

    task = Task(
        description="Simple math task: What's 2+2?",
        expected_output="The answer should be a number.",
        agent=agent
    )

    agent.llm.call = lambda messages, *args, **kwargs: (
        mock_llm_responses["ready"]
        if any("create a detailed plan" in msg.get("content", "") for msg in messages)
        else mock_llm_responses["execution"]
    )

    result = agent.execute_task(task)

    assert result == mock_llm_responses["execution"]
    assert "Reasoning Plan:" in task.description


def test_agent_with_reasoning_not_ready_initially(mock_llm_responses):
    """Test agent with reasoning that requires refinement."""
    llm = LLM("gpt-3.5-turbo")

    agent = Agent(
        role="Test Agent",
        goal="To test the reasoning feature",
        backstory="I am a test agent created to verify the reasoning feature works correctly.",
        llm=llm,
        reasoning=True,
        max_reasoning_attempts=2,
        verbose=True
    )

    task = Task(
        description="Complex math task: What's the derivative of x²?",
        expected_output="The answer should be a mathematical expression.",
        agent=agent
    )

    call_count = [0]

    def mock_llm_call(messages, *args, **kwargs):
        if any("create a detailed plan" in msg.get("content", "") for msg in messages) or any("refine your plan" in msg.get("content", "") for msg in messages):
            call_count[0] += 1
            if call_count[0] == 1:
                return mock_llm_responses["not_ready"]
            else:
                return mock_llm_responses["ready_after_refine"]
        else:
            return "2x"

    agent.llm.call = mock_llm_call

    result = agent.execute_task(task)

    assert result == "2x"
    assert call_count[0] == 2  # Should have made 2 reasoning calls
    assert "Reasoning Plan:" in task.description


def test_agent_with_reasoning_max_attempts_reached():
    """Test agent with reasoning that reaches max attempts without being ready."""
    llm = LLM("gpt-3.5-turbo")

    agent = Agent(
        role="Test Agent",
        goal="To test the reasoning feature",
        backstory="I am a test agent created to verify the reasoning feature works correctly.",
        llm=llm,
        reasoning=True,
        max_reasoning_attempts=2,
        verbose=True
    )

    task = Task(
        description="Complex math task: Solve the Riemann hypothesis.",
        expected_output="A proof or disproof of the hypothesis.",
        agent=agent
    )

    call_count = [0]

    def mock_llm_call(messages, *args, **kwargs):
        if any("create a detailed plan" in msg.get("content", "") for msg in messages) or any("refine your plan" in msg.get("content", "") for msg in messages):
            call_count[0] += 1
            return f"Attempt {call_count[0]}: I need more time to think.\n\nNOT READY: I need to refine my plan further."
        else:
            return "This is an unsolved problem in mathematics."

    agent.llm.call = mock_llm_call

    result = agent.execute_task(task)

    assert result == "This is an unsolved problem in mathematics."
    assert call_count[0] == 2  # Should have made exactly 2 reasoning calls (max_attempts)
    assert "Reasoning Plan:" in task.description


def test_agent_reasoning_input_validation():
    """Test input validation in AgentReasoning."""
    llm = LLM("gpt-3.5-turbo")

    agent = Agent(
        role="Test Agent",
        goal="To test the reasoning feature",
        backstory="I am a test agent created to verify the reasoning feature works correctly.",
        llm=llm,
        reasoning=True
    )

    with pytest.raises(ValueError, match="Both task and agent must be provided"):
        AgentReasoning(task=None, agent=agent)

    task = Task(
        description="Simple task",
        expected_output="Simple output"
    )
    with pytest.raises(ValueError, match="Both task and agent must be provided"):
        AgentReasoning(task=task, agent=None)


def test_agent_reasoning_error_handling():
    """Test error handling during the reasoning process."""
    llm = LLM("gpt-3.5-turbo")

    agent = Agent(
        role="Test Agent",
        goal="To test the reasoning feature",
        backstory="I am a test agent created to verify the reasoning feature works correctly.",
        llm=llm,
        reasoning=True
    )

    task = Task(
        description="Task that will cause an error",
        expected_output="Output that will never be generated",
        agent=agent
    )

    call_count = [0]

    def mock_llm_call_error(*args, **kwargs):
        call_count[0] += 1
        if call_count[0] <= 2:  # First calls are for reasoning
            raise Exception("LLM error during reasoning")
        return "Fallback execution result"  # Return a value for task execution

    agent.llm.call = mock_llm_call_error

    result = agent.execute_task(task)

    assert result == "Fallback execution result"
    assert call_count[0] > 2  # Ensure we called the mock multiple times


def test_agent_with_function_calling():
    """Test agent with reasoning using function calling."""
    llm = LLM("gpt-3.5-turbo")

    agent = Agent(
        role="Test Agent",
        goal="To test the reasoning feature",
        backstory="I am a test agent created to verify the reasoning feature works correctly.",
        llm=llm,
        reasoning=True,
        verbose=True
    )

    task = Task(
        description="Simple math task: What's 2+2?",
        expected_output="The answer should be a number.",
        agent=agent
    )

    agent.llm.supports_function_calling = lambda: True

    def mock_function_call(messages, *args, **kwargs):
        if "tools" in kwargs:
            return json.dumps({
                "plan": "I'll solve this simple math problem: 2+2=4.",
                "ready": True
            })
        else:
            return "4"

    agent.llm.call = mock_function_call

    result = agent.execute_task(task)

    assert result == "4"
    assert "Reasoning Plan:" in task.description
    assert "I'll solve this simple math problem: 2+2=4." in task.description


def test_agent_with_function_calling_fallback():
    """Test agent with reasoning using function calling that falls back to text parsing."""
    llm = LLM("gpt-3.5-turbo")

    agent = Agent(
        role="Test Agent",
        goal="To test the reasoning feature",
        backstory="I am a test agent created to verify the reasoning feature works correctly.",
        llm=llm,
        reasoning=True,
        verbose=True
    )

    task = Task(
        description="Simple math task: What's 2+2?",
        expected_output="The answer should be a number.",
        agent=agent
    )

    agent.llm.supports_function_calling = lambda: True

    def mock_function_call(messages, *args, **kwargs):
        if "tools" in kwargs:
            return "Invalid JSON that will trigger fallback. READY: I am ready to execute the task."
        else:
            return "4"

    agent.llm.call = mock_function_call

    result = agent.execute_task(task)

    assert result == "4"
    assert "Reasoning Plan:" in task.description
    assert "Invalid JSON that will trigger fallback" in task.description