Files
crewAI/tests/tools/test_tool_usage.py
Lorenze Jay 00c2f5043e WIP crew events emitter (#2048)
* WIP crew events emitter

* Refactor event handling and introduce new event types

- Migrate from global `emit` function to `event_bus.emit`
- Add new event types for task failures, tool usage, and agent execution
- Update event listeners and event bus to support more granular event tracking
- Remove deprecated event emission methods
- Improve event type consistency and add more detailed event information

* Add event emission for agent execution lifecycle

- Emit AgentExecutionStarted and AgentExecutionError events
- Update CrewAgentExecutor to use event_bus for tracking agent execution
- Refactor error handling to include event emission
- Minor code formatting improvements in task.py and crew_agent_executor.py
- Fix a typo in test file

* Refactor event system and add third-party event listeners

- Move event_bus import to correct module paths
- Introduce BaseEventListener abstract base class
- Add AgentOpsListener for third-party event tracking
- Update event listener initialization and setup
- Clean up event-related imports and exports

* Enhance event system type safety and error handling

- Improve type annotations for event bus and event types
- Add null checks for agent and task in event emissions
- Update import paths for base tool and base agent
- Refactor event listener type hints
- Remove unnecessary print statements
- Update test configurations to match new event handling

* Refactor event classes to improve type safety and naming consistency

- Rename event classes to have explicit 'Event' suffix (e.g., TaskStartedEvent)
- Update import statements and references across multiple files
- Remove deprecated events.py module
- Enhance event type hints and configurations
- Clean up unnecessary event-related code

* Add default model for CrewEvaluator and fix event import order

- Set default model to "gpt-4o-mini" in CrewEvaluator when no model is specified
- Reorder event-related imports in task.py to follow standard import conventions
- Update event bus initialization method return type hint
- Export event_bus in events/__init__.py

* Fix tool usage and event import handling

- Update tool usage to use `.get()` method when checking tool name
- Remove unnecessary `__all__` export list in events/__init__.py

* Refactor Flow and Agent event handling to use event_bus

- Remove `event_emitter` from Flow class and replace with `event_bus.emit()`
- Update Flow and Agent tests to use event_bus event listeners
- Remove redundant event emissions in Flow methods
- Add debug print statements in Flow execution
- Simplify event tracking in test cases

* Enhance event handling for Crew, Task, and Event classes

- Add crew name to failed event types (CrewKickoffFailedEvent, CrewTrainFailedEvent, CrewTestFailedEvent)
- Update Task events to remove redundant task and context attributes
- Refactor EventListener to use Logger for consistent event logging
- Add new event types for Crew train and test events
- Improve event bus event tracking in test cases

* Remove telemetry and tracing dependencies from Task and Flow classes

- Remove telemetry-related imports and private attributes from Task class
- Remove `_telemetry` attribute from Flow class
- Update event handling to emit events without direct telemetry tracking
- Simplify task and flow execution by removing explicit telemetry spans
- Move telemetry-related event handling to EventListener

* Clean up unused imports and event-related code

- Remove unused imports from various event and flow-related files
- Reorder event imports to follow standard conventions
- Remove unnecessary event type references
- Simplify import statements in event and flow modules

* Update crew test to validate verbose output and kickoff_for_each method

- Enhance test_crew_verbose_output to check specific listener log messages
- Modify test_kickoff_for_each_invalid_input to use Pydantic validation error
- Improve test coverage for crew logging and input validation

* Update crew test verbose output with improved emoji icons

- Replace task and agent completion icons from 👍 to 
- Enhance readability of test output logging
- Maintain consistent test coverage for crew verbose output

* Add MethodExecutionFailedEvent to handle flow method execution failures

- Introduce new MethodExecutionFailedEvent in flow_events module
- Update Flow class to catch and emit method execution failures
- Add event listener for method execution failure events
- Update event-related imports to include new event type
- Enhance test coverage for method execution failure handling

* Propagate method execution failures in Flow class

- Modify Flow class to re-raise exceptions after emitting MethodExecutionFailedEvent
- Reorder MethodExecutionFailedEvent import to maintain consistent import style

* Enable test coverage for Flow method execution failure event

- Uncomment pytest.raises() in test_events to verify exception handling
- Ensure test validates MethodExecutionFailedEvent emission during flow kickoff

* Add event handling for tool usage events

- Introduce event listeners for ToolUsageFinishedEvent and ToolUsageErrorEvent
- Log tool usage events with descriptive emoji icons ( and )
- Update event_listener to track and log tool usage lifecycle

* Reorder and clean up event imports in event_listener

- Reorganize imports for tool usage events and other event types
- Maintain consistent import ordering and remove unused imports
- Ensure clean and organized import structure in event_listener module

* moving to dedicated eventlistener

* dont forget crew level

* Refactor AgentOps event listener for crew-level tracking

- Modify AgentOpsListener to handle crew-level events
- Initialize and end AgentOps session at crew kickoff and completion
- Create agents for each crew member during session initialization
- Improve session management and event recording
- Clean up and simplify event handling logic

* Update test_events to validate tool usage error event handling

- Modify test to assert single error event with correct attributes
- Use pytest.raises() to verify error event generation
- Simplify error event validation in test case

* Improve AgentOps listener type hints and formatting

- Add string type hints for AgentOps classes to resolve potential import issues
- Clean up unnecessary whitespace and improve code indentation
- Simplify initialization and event handling logic

* Update test_events to validate multiple tool usage events

- Modify test to assert 75 events instead of a single error event
- Remove pytest.raises() check, allowing crew kickoff to complete
- Adjust event validation to support broader event tracking

* Rename event_bus to crewai_event_bus for improved clarity and specificity

- Replace all references to `event_bus` with `crewai_event_bus`
- Update import statements across multiple files
- Remove the old `event_bus.py` file
- Maintain existing event handling functionality

* Enhance EventListener with singleton pattern and color configuration

- Implement singleton pattern for EventListener to ensure single instance
- Add default color configuration using EMITTER_COLOR from constants
- Modify log method calls to use default color and remove redundant color parameters
- Improve initialization logic to prevent multiple initializations

* Add FlowPlotEvent and update event bus to support flow plotting

- Introduce FlowPlotEvent to track flow plotting events
- Replace Telemetry method with event bus emission in Flow.plot()
- Update event bus to support new FlowPlotEvent type
- Add test case to validate flow plotting event emission

* Remove RunType enum and clean up crew events module

- Delete unused RunType enum from crew_events.py
- Simplify crew_events.py by removing unnecessary enum definition
- Improve code clarity by removing unneeded imports

* Enhance event handling for tool usage and agent execution

- Add new events for tool usage: ToolSelectionErrorEvent, ToolValidateInputErrorEvent
- Improve error tracking and event emission in ToolUsage and LLM classes
- Update AgentExecutionStartedEvent to use task_prompt instead of inputs
- Add comprehensive test coverage for new event types and error scenarios

* Refactor event system and improve crew testing

- Extract base CrewEvent class to a new base_events.py module
- Update event imports across multiple event-related files
- Modify CrewTestStartedEvent to use eval_llm instead of openai_model_name
- Add LLM creation validation in crew testing method
- Improve type handling and event consistency

* Refactor task events to use base CrewEvent

- Move CrewEvent import from crew_events to base_events
- Remove unnecessary blank lines in task_events.py
- Simplify event class structure for task-related events

* Update AgentExecutionStartedEvent to use task_prompt

- Modify test_events.py to use task_prompt instead of inputs
- Simplify event input validation in test case
- Align with recent event system refactoring

* Improve type hinting for TaskCompletedEvent handler

- Add explicit type annotation for TaskCompletedEvent in event_listener.py
- Enhance type safety for event handling in EventListener

* Improve test_validate_tool_input_invalid_input with mock objects

- Add explicit mock objects for agent and action in test case
- Ensure proper string values for mock agent and action attributes
- Simplify test setup for ToolUsage validation method

* Remove ToolUsageStartedEvent emission in tool usage process

- Remove unnecessary event emission for tool usage start
- Simplify tool usage event handling
- Eliminate redundant event data preparation step

* refactor: clean up and organize imports in llm and flow modules

* test: Improve flow persistence test cases and logging
2025-02-19 13:52:47 -08:00

627 lines
18 KiB
Python

import json
import random
from unittest.mock import MagicMock, patch
import pytest
from pydantic import BaseModel, Field
from crewai import Agent, Task
from crewai.tools import BaseTool
from crewai.tools.tool_usage import ToolUsage
from crewai.utilities.events import crewai_event_bus
from crewai.utilities.events.tool_usage_events import (
ToolSelectionErrorEvent,
ToolValidateInputErrorEvent,
)
class RandomNumberToolInput(BaseModel):
min_value: int = Field(
..., description="The minimum value of the range (inclusive)"
)
max_value: int = Field(
..., description="The maximum value of the range (inclusive)"
)
class RandomNumberTool(BaseTool):
name: str = "Random Number Generator"
description: str = "Generates a random number within a specified range"
args_schema: type[BaseModel] = RandomNumberToolInput
def _run(self, min_value: int, max_value: int) -> int:
return random.randint(min_value, max_value)
# Example agent and task
example_agent = Agent(
role="Number Generator",
goal="Generate random numbers for various purposes",
backstory="You are an AI agent specialized in generating random numbers within specified ranges.",
tools=[RandomNumberTool()],
verbose=True,
)
example_task = Task(
description="Generate a random number between 1 and 100",
expected_output="A random number between 1 and 100",
agent=example_agent,
)
def test_random_number_tool_range():
tool = RandomNumberTool()
result = tool._run(1, 10)
assert 1 <= result <= 10
def test_random_number_tool_invalid_range():
tool = RandomNumberTool()
with pytest.raises(ValueError):
tool._run(10, 1) # min_value > max_value
def test_random_number_tool_schema():
tool = RandomNumberTool()
# Get the schema using model_json_schema()
schema = tool.args_schema.model_json_schema()
# Convert the schema to a string
schema_str = json.dumps(schema)
# Check if the schema string contains the expected fields
assert "min_value" in schema_str
assert "max_value" in schema_str
# Parse the schema string back to a dictionary
schema_dict = json.loads(schema_str)
# Check if the schema contains the correct field types
assert schema_dict["properties"]["min_value"]["type"] == "integer"
assert schema_dict["properties"]["max_value"]["type"] == "integer"
# Check if the schema contains the field descriptions
assert (
"minimum value" in schema_dict["properties"]["min_value"]["description"].lower()
)
assert (
"maximum value" in schema_dict["properties"]["max_value"]["description"].lower()
)
def test_tool_usage_render():
tool = RandomNumberTool()
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[tool],
original_tools=[tool],
tools_description="Sample tool for testing",
tools_names="random_number_generator",
task=MagicMock(),
function_calling_llm=MagicMock(),
agent=MagicMock(),
action=MagicMock(),
)
rendered = tool_usage._render()
# Updated checks to match the actual output
assert "Tool Name: Random Number Generator" in rendered
assert "Tool Arguments:" in rendered
assert (
"'min_value': {'description': 'The minimum value of the range (inclusive)', 'type': 'int'}"
in rendered
)
assert (
"'max_value': {'description': 'The maximum value of the range (inclusive)', 'type': 'int'}"
in rendered
)
assert (
"Tool Description: Generates a random number within a specified range"
in rendered
)
assert (
"Tool Name: Random Number Generator\nTool Arguments: {'min_value': {'description': 'The minimum value of the range (inclusive)', 'type': 'int'}, 'max_value': {'description': 'The maximum value of the range (inclusive)', 'type': 'int'}}\nTool Description: Generates a random number within a specified range"
in rendered
)
def test_validate_tool_input_booleans_and_none():
# Create a ToolUsage instance with mocks
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=MagicMock(),
agent=MagicMock(),
action=MagicMock(),
)
# Input with booleans and None
tool_input = '{"key1": True, "key2": False, "key3": None}'
expected_arguments = {"key1": True, "key2": False, "key3": None}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_mixed_types():
# Create a ToolUsage instance with mocks
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=MagicMock(),
agent=MagicMock(),
action=MagicMock(),
)
# Input with mixed types
tool_input = '{"number": 123, "text": "Some text", "flag": True}'
expected_arguments = {"number": 123, "text": "Some text", "flag": True}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_single_quotes():
# Create a ToolUsage instance with mocks
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=MagicMock(),
agent=MagicMock(),
action=MagicMock(),
)
# Input with single quotes instead of double quotes
tool_input = "{'key': 'value', 'flag': True}"
expected_arguments = {"key": "value", "flag": True}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_invalid_json_repairable():
# Create a ToolUsage instance with mocks
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=MagicMock(),
agent=MagicMock(),
action=MagicMock(),
)
# Invalid JSON input that can be repaired
tool_input = '{"key": "value", "list": [1, 2, 3,]}'
expected_arguments = {"key": "value", "list": [1, 2, 3]}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_with_special_characters():
# Create a ToolUsage instance with mocks
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=MagicMock(),
agent=MagicMock(),
action=MagicMock(),
)
# Input with special characters
tool_input = '{"message": "Hello, world! \u263a", "valid": True}'
expected_arguments = {"message": "Hello, world! ☺", "valid": True}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_none_input():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
arguments = tool_usage._validate_tool_input(None)
assert arguments == {}
def test_validate_tool_input_valid_json():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
tool_input = '{"key": "value", "number": 42, "flag": true}'
expected_arguments = {"key": "value", "number": 42, "flag": True}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_python_dict():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
tool_input = "{'key': 'value', 'number': 42, 'flag': True}"
expected_arguments = {"key": "value", "number": 42, "flag": True}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_json5_unquoted_keys():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
tool_input = "{key: 'value', number: 42, flag: true}"
expected_arguments = {"key": "value", "number": 42, "flag": True}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_with_trailing_commas():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
tool_input = '{"key": "value", "number": 42, "flag": true,}'
expected_arguments = {"key": "value", "number": 42, "flag": True}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_invalid_input():
# Create mock agent with proper string values
mock_agent = MagicMock()
mock_agent.key = "test_agent_key" # Must be a string
mock_agent.role = "test_agent_role" # Must be a string
mock_agent._original_role = "test_agent_role" # Must be a string
mock_agent.i18n = MagicMock()
mock_agent.verbose = False
# Create mock action with proper string value
mock_action = MagicMock()
mock_action.tool = "test_tool" # Must be a string
mock_action.tool_input = "test_input" # Must be a string
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=mock_agent,
action=mock_action,
)
invalid_inputs = [
"Just a string",
"['list', 'of', 'values']",
"12345",
"",
]
for invalid_input in invalid_inputs:
with pytest.raises(Exception) as e_info:
tool_usage._validate_tool_input(invalid_input)
assert (
"Tool input must be a valid dictionary in JSON or Python literal format"
in str(e_info.value)
)
# Test for None input separately
arguments = tool_usage._validate_tool_input(None)
assert arguments == {}
def test_validate_tool_input_complex_structure():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
tool_input = """
{
"user": {
"name": "Alice",
"age": 30
},
"items": [
{"id": 1, "value": "Item1"},
{"id": 2, "value": "Item2",}
],
"active": true,
}
"""
expected_arguments = {
"user": {"name": "Alice", "age": 30},
"items": [
{"id": 1, "value": "Item1"},
{"id": 2, "value": "Item2"},
],
"active": True,
}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_code_content():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
tool_input = '{"filename": "script.py", "content": "def hello():\\n print(\'Hello, world!\')"}'
expected_arguments = {
"filename": "script.py",
"content": "def hello():\n print('Hello, world!')",
}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_with_escaped_quotes():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
tool_input = '{"text": "He said, \\"Hello, world!\\""}'
expected_arguments = {"text": 'He said, "Hello, world!"'}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_validate_tool_input_large_json_content():
tool_usage = ToolUsage(
tools_handler=MagicMock(),
tools=[],
original_tools=[],
tools_description="",
tools_names="",
task=MagicMock(),
function_calling_llm=None,
agent=MagicMock(),
action=MagicMock(),
)
# Simulate a large JSON content
tool_input = (
'{"data": ' + json.dumps([{"id": i, "value": i * 2} for i in range(1000)]) + "}"
)
expected_arguments = {"data": [{"id": i, "value": i * 2} for i in range(1000)]}
arguments = tool_usage._validate_tool_input(tool_input)
assert arguments == expected_arguments
def test_tool_selection_error_event_direct():
"""Test tool selection error event emission directly from ToolUsage class."""
mock_agent = MagicMock()
mock_agent.key = "test_key"
mock_agent.role = "test_role"
mock_agent.i18n = MagicMock()
mock_agent.verbose = False
mock_task = MagicMock()
mock_tools_handler = MagicMock()
class TestTool(BaseTool):
name: str = "Test Tool"
description: str = "A test tool"
def _run(self, input: dict) -> str:
return "test result"
test_tool = TestTool()
tool_usage = ToolUsage(
tools_handler=mock_tools_handler,
tools=[test_tool],
original_tools=[test_tool],
tools_description="Test Tool Description",
tools_names="Test Tool",
task=mock_task,
function_calling_llm=None,
agent=mock_agent,
action=MagicMock(),
)
received_events = []
@crewai_event_bus.on(ToolSelectionErrorEvent)
def event_handler(source, event):
received_events.append(event)
with pytest.raises(Exception) as exc_info:
tool_usage._select_tool("Non Existent Tool")
assert len(received_events) == 1
event = received_events[0]
assert isinstance(event, ToolSelectionErrorEvent)
assert event.agent_key == "test_key"
assert event.agent_role == "test_role"
assert event.tool_name == "Non Existent Tool"
assert event.tool_args == {}
assert event.tool_class == "Test Tool Description"
assert "don't exist" in event.error
received_events.clear()
with pytest.raises(Exception) as exc_info:
tool_usage._select_tool("")
assert len(received_events) == 1
event = received_events[0]
assert isinstance(event, ToolSelectionErrorEvent)
assert event.agent_key == "test_key"
assert event.agent_role == "test_role"
assert event.tool_name == ""
assert event.tool_args == {}
assert event.tool_class == "Test Tool Description"
assert "forgot the Action name" in event.error
def test_tool_validate_input_error_event():
"""Test tool validation input error event emission from ToolUsage class."""
# Mock agent and required components
mock_agent = MagicMock()
mock_agent.key = "test_key"
mock_agent.role = "test_role"
mock_agent.verbose = False
mock_agent._original_role = "test_role"
# Mock i18n with error message
mock_i18n = MagicMock()
mock_i18n.errors.return_value = (
"Tool input must be a valid dictionary in JSON or Python literal format"
)
mock_agent.i18n = mock_i18n
# Mock task and tools handler
mock_task = MagicMock()
mock_tools_handler = MagicMock()
# Mock printer
mock_printer = MagicMock()
# Create test tool
class TestTool(BaseTool):
name: str = "Test Tool"
description: str = "A test tool"
def _run(self, input: dict) -> str:
return "test result"
test_tool = TestTool()
# Create ToolUsage instance
tool_usage = ToolUsage(
tools_handler=mock_tools_handler,
tools=[test_tool],
original_tools=[test_tool],
tools_description="Test Tool Description",
tools_names="Test Tool",
task=mock_task,
function_calling_llm=None,
agent=mock_agent,
action=MagicMock(tool="test_tool"),
)
tool_usage._printer = mock_printer
# Mock all parsing attempts to fail
with (
patch("json.loads", side_effect=json.JSONDecodeError("Test Error", "", 0)),
patch("ast.literal_eval", side_effect=ValueError),
patch("json5.loads", side_effect=json.JSONDecodeError("Test Error", "", 0)),
patch("json_repair.repair_json", side_effect=Exception("Failed to repair")),
):
received_events = []
@crewai_event_bus.on(ToolValidateInputErrorEvent)
def event_handler(source, event):
received_events.append(event)
# Test invalid input
invalid_input = "invalid json {[}"
with pytest.raises(Exception) as exc_info:
tool_usage._validate_tool_input(invalid_input)
# Verify event was emitted
assert len(received_events) == 1, "Expected one event to be emitted"
event = received_events[0]
assert isinstance(event, ToolValidateInputErrorEvent)
assert event.agent_key == "test_key"
assert event.agent_role == "test_role"
assert event.tool_name == "test_tool"
assert "must be a valid dictionary" in event.error