Compare commits

..

2 Commits

Author SHA1 Message Date
Devin AI
f46d19e193 fix: address PR feedback with improved validation, documentation, and tests
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-04-03 11:09:30 +00:00
Devin AI
d8571dc196 feat: add ToolWithInstruction wrapper for tool-specific usage instructions (issue #2515)
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-04-03 11:04:12 +00:00
9 changed files with 392 additions and 129 deletions

View File

@@ -267,6 +267,7 @@ In addition to the sequential process, you can use the hierarchical process, whi
- **Role-Based Agent Design**: Customize agents with specific roles, goals, and tools.
- **Autonomous Inter-Agent Delegation**: Agents can autonomously delegate tasks and inquire amongst themselves, enhancing problem-solving efficiency.
- **Flexible Task Management**: Define tasks with customizable tools and assign them to agents dynamically.
- **Tool Instructions**: Attach specific usage instructions to tools for better control over when and how agents use them.
- **Processes Driven**: Currently only supports `sequential` task execution and `hierarchical` processes, but more complex processes like consensual and autonomous are being worked on.
- **Save output as file**: Save the output of individual tasks as a file, so you can use it later.
- **Parse output as Pydantic or Json**: Parse the output of individual tasks as a Pydantic model or as a Json if you want to.

View File

@@ -0,0 +1,153 @@
# Tool Instructions
CrewAI allows you to provide specific instructions for when and how to use tools. This is useful when you want to guide agents on proper tool usage without cluttering their backstory.
## Basic Usage
```python
from crewai import Agent
from crewai_tools import ScrapeWebsiteTool
from crewai.tools import ToolWithInstruction

# Create a tool with instructions
scrape_tool = ScrapeWebsiteTool()
scrape_with_instructions = ToolWithInstruction(
    tool=scrape_tool,
    instructions="""
    ALWAYS use this tool when making a joke.
    NEVER use this tool when making a joke about someone's mom.
    """
)

# Use the tool with an agent
agent = Agent(
    role="Comedian",
    goal="Create hilarious and engaging jokes",
    backstory="""
    You are a professional stand-up comedian with years of experience in crafting jokes.
    You have a great sense of humor and can create jokes about any topic
    while keeping them appropriate and entertaining.
    """,
    tools=[scrape_with_instructions],
)
```
## Real-World Examples
### Example 1: Research Assistant with Web Search Tool
```python
from crewai import Agent
from crewai_tools import SearchTool
from crewai.tools import ToolWithInstruction

search_tool = SearchTool()
search_with_instructions = ToolWithInstruction(
    tool=search_tool,
    instructions="""
    Use this tool ONLY for factual information that requires up-to-date data.
    ALWAYS verify information by searching multiple sources.
    DO NOT use this tool for speculative questions or opinions.
    """
)

research_agent = Agent(
    role="Research Analyst",
    goal="Provide accurate and well-sourced information",
    backstory="You are a meticulous research analyst with attention to detail and fact-checking.",
    tools=[search_with_instructions],
)
```
### Example 2: Data Scientist with Multiple Analysis Tools
```python
from crewai import Agent
from crewai_tools import PythonTool, DataVisualizationTool
from crewai.tools import ToolWithInstruction

# Python tool for data processing
python_tool = PythonTool()
python_with_instructions = ToolWithInstruction(
    tool=python_tool,
    instructions="""
    Use this tool for data cleaning, transformation, and statistical analysis.
    ALWAYS include comments in your code.
    DO NOT use this tool for creating visualizations.
    """
)

# Visualization tool
viz_tool = DataVisualizationTool()
viz_with_instructions = ToolWithInstruction(
    tool=viz_tool,
    instructions="""
    Use this tool ONLY for creating data visualizations.
    ALWAYS label axes and include titles in your charts.
    PREFER simple visualizations that clearly communicate the main insight.
    """
)

data_scientist = Agent(
    role="Data Scientist",
    goal="Analyze data and create insightful visualizations",
    backstory="You are an experienced data scientist who excels at finding patterns in data.",
    tools=[python_with_instructions, viz_with_instructions],
)
```
## How Instructions Are Presented to Agents
When an agent considers using a tool, the instructions are included in the tool's description. For example, a tool with instructions might appear to the agent like this:
```
Tool: search_web
Description: Search the web for information on a given topic.
Instructions: Use this tool ONLY for factual information that requires up-to-date data.
ALWAYS verify information by searching multiple sources.
DO NOT use this tool for speculative questions or opinions.
```
This clear presentation helps the agent understand when and how to use the tool appropriately.
## Dynamically Updating Instructions
You can update tool instructions dynamically during execution:
```python
# Create a tool with initial instructions
search_with_instructions = ToolWithInstruction(
    tool=search_tool,
    instructions="Initial instructions for tool usage"
)

# Later, update the instructions based on new requirements
search_with_instructions.update_instructions("Updated instructions for tool usage")
```
## Error Handling and Best Practices
### Validation
The `ToolWithInstruction` class includes validation to ensure instructions are not empty and don't exceed a maximum length. If you provide invalid instructions, a `ValueError` will be raised.
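Below is a minimal sketch of that behavior (reusing the `ScrapeWebsiteTool` instance from the basic example above; the printed messages are illustrative of the validation errors, which the tests in this PR assert behave as `ValueError`s):

```python
from crewai_tools import ScrapeWebsiteTool
from crewai.tools import ToolWithInstruction

scrape_tool = ScrapeWebsiteTool()

# Empty or whitespace-only instructions are rejected at construction time
try:
    ToolWithInstruction(tool=scrape_tool, instructions="   ")
except ValueError as error:
    print(error)  # message includes "Instructions cannot be empty"

# Instructions longer than MAX_INSTRUCTION_LENGTH (2000 characters) are rejected
try:
    ToolWithInstruction(tool=scrape_tool, instructions="x" * 2001)
except ValueError as error:
    print(error)  # message includes "Instructions exceed maximum length of 2000 characters"
```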
### Best Practices for Writing Instructions
1. **Be specific and clear** about when to use and when not to use the tool
2. **Use imperative language** like "ALWAYS", "NEVER", "USE", "DO NOT USE"
3. **Keep instructions concise** but comprehensive
4. **Include examples** of good and bad usage scenarios when possible
5. **Format instructions** with line breaks for readability (see the sketch after this list)
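For instance, a hypothetical instruction string that follows these practices (the tool it describes is purely illustrative, not part of the API) might look like:

```python
instructions = """
USE this tool ONLY to look up current currency exchange rates.
DO NOT use it for historical rates older than one week.
ALWAYS cite the source URL in your final answer.

Good: "Convert 100 USD to EUR at today's rate."
Bad: "What was the USD/EUR rate in 1999?"
"""
```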
## When to Use Tool Instructions
Tool instructions are useful when:
1. You want to specify precise conditions for tool usage
2. You have multiple similar tools that should be used in different situations
3. You want to keep the agent's backstory focused on its role and personality, not technical details about tools
4. You need to provide technical guidance on how to format inputs or interpret outputs
5. You want to enforce consistent tool usage across multiple agents
Attaching usage guidelines to the tool itself is also semantically more appropriate than embedding them in the agent's backstory.
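To make the contrast concrete, here is a minimal sketch (reusing the illustrative `SearchTool` from Example 1) of the same guidance expressed both ways; the first version mixes tool mechanics into the persona, while the second keeps the backstory about the role and lets the rules travel with the tool:

```python
from crewai import Agent
from crewai_tools import SearchTool
from crewai.tools import ToolWithInstruction

search_tool = SearchTool()

# Without tool instructions: usage rules leak into the backstory
cluttered_agent = Agent(
    role="Research Analyst",
    goal="Provide accurate and well-sourced information",
    backstory=(
        "You are a meticulous research analyst. "
        "Only use the search tool for up-to-date facts and never for opinions."
    ),
    tools=[search_tool],
)

# With tool instructions: the backstory stays focused on the role
focused_agent = Agent(
    role="Research Analyst",
    goal="Provide accurate and well-sourced information",
    backstory="You are a meticulous research analyst with attention to detail and fact-checking.",
    tools=[
        ToolWithInstruction(
            tool=search_tool,
            instructions="Use this tool ONLY for up-to-date facts. DO NOT use it for opinions.",
        )
    ],
)
```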

View File

@@ -4,7 +4,6 @@ import uuid
import warnings
from concurrent.futures import Future
from hashlib import md5
from crewai.llm import LLM
from typing import Any, Callable, Dict, List, Optional, Tuple, Union
from pydantic import (
@@ -1076,36 +1075,19 @@ class Crew(BaseModel):
    def test(
        self,
        n_iterations: int,
        llm: Union[str, LLM],
        openai_model_name: Optional[str] = None,
        inputs: Optional[Dict[str, Any]] = None,
    ) -> None:
        """Test and evaluate the Crew with the given inputs for n iterations concurrently using concurrent.futures.

        Args:
            n_iterations: Number of test iterations to run
            llm: Language model to use for evaluation. Can be either a model name string (e.g. "gpt-4")
                or an LLM instance for custom implementations
            inputs: Optional dictionary of input values to use for task execution

        Example:
            ```python
            # Using model name string
            crew.test(n_iterations=3, llm="gpt-4")

            # Using custom LLM implementation
            custom_llm = LLM(model="custom-model")
            crew.test(n_iterations=3, llm=custom_llm)
            ```
        """
        """Test and evaluate the Crew with the given inputs for n iterations concurrently using concurrent.futures."""
        test_crew = self.copy()

        self._test_execution_span = test_crew._telemetry.test_execution_span(
            test_crew,
            n_iterations,
            inputs,
            str(llm) if isinstance(llm, LLM) else llm,
        )
        evaluator = CrewEvaluator(test_crew, llm)
            openai_model_name, # type: ignore[arg-type]
        ) # type: ignore[arg-type]
        evaluator = CrewEvaluator(test_crew, openai_model_name) # type: ignore[arg-type]

        for i in range(1, n_iterations + 1):
            evaluator.set_iteration(i)

View File

@@ -1 +1,2 @@
from .base_tool import BaseTool, tool
from .tool_with_instruction import ToolWithInstruction

View File

@@ -0,0 +1,110 @@
from typing import Any, List, Optional, Dict, Callable, Union, ClassVar

from pydantic import Field, model_validator, field_validator, ConfigDict

from crewai.tools.base_tool import BaseTool
from crewai.tools.structured_tool import CrewStructuredTool


class ToolWithInstruction(BaseTool):
    """A wrapper for tools that adds specific usage instructions.

    This allows users to provide specific instructions on when and how to use a tool,
    without having to include these instructions in the agent's backstory.

    Attributes:
        tool: The tool to wrap
        instructions: Specific instructions about when and how to use this tool
        name: Name of the tool (inherited from the wrapped tool)
        description: Description of the tool (inherited from the wrapped tool with instructions)
    """

    MAX_INSTRUCTION_LENGTH: ClassVar[int] = 2000

    name: str = Field(default="", description="Name of the tool")
    description: str = Field(default="", description="Description of the tool")
    tool: BaseTool = Field(description="The tool to wrap")
    instructions: str = Field(description="Instructions about when and how to use this tool")

    model_config = ConfigDict(arbitrary_types_allowed=True)

    @field_validator("instructions")
    @classmethod
    def validate_instructions(cls, value: str) -> str:
        """Validate that instructions are not empty and not too long.

        Args:
            value: The instructions string to validate

        Returns:
            str: The validated and sanitized instructions

        Raises:
            ValueError: If instructions are empty or exceed maximum length
        """
        if not value or not value.strip():
            raise ValueError("Instructions cannot be empty")
        if len(value) > cls.MAX_INSTRUCTION_LENGTH:
            raise ValueError(
                f"Instructions exceed maximum length of {cls.MAX_INSTRUCTION_LENGTH} characters"
            )
        return value.strip()

    @model_validator(mode="after")
    def set_tool_attributes(self) -> "ToolWithInstruction":
        """Sets name, description, and args_schema from the wrapped tool.

        Returns:
            ToolWithInstruction: The validated instance with updated attributes.
        """
        self.name = self.tool.name
        self.description = f"{self.tool.description}\nInstructions: {self.instructions}"
        self.args_schema = self.tool.args_schema
        return self

    def update_instructions(self, new_instructions: str) -> None:
        """Updates the tool's usage instructions.

        Args:
            new_instructions (str): New instructions for tool usage.

        Raises:
            ValueError: If new instructions are empty or exceed maximum length
        """
        if not new_instructions or not new_instructions.strip():
            raise ValueError("Instructions cannot be empty")
        if len(new_instructions) > self.MAX_INSTRUCTION_LENGTH:
            raise ValueError(
                f"Instructions exceed maximum length of {self.MAX_INSTRUCTION_LENGTH} characters"
            )
        self.instructions = new_instructions.strip()
        self.description = f"{self.tool.description}\nInstructions: {self.instructions}"

    def _run(self, *args: Any, **kwargs: Any) -> Any:
        """Run the wrapped tool.

        Args:
            *args: Positional arguments to pass to the wrapped tool
            **kwargs: Keyword arguments to pass to the wrapped tool

        Returns:
            Any: The result from the wrapped tool's _run method
        """
        return self.tool._run(*args, **kwargs)

    def to_structured_tool(self) -> CrewStructuredTool:
        """Convert this tool to a CrewStructuredTool instance.

        Returns:
            CrewStructuredTool: A structured tool with instructions included in the description
        """
        structured_tool = self.tool.to_structured_tool()
        structured_tool.description = f"{structured_tool.description}\nInstructions: {self.instructions}"
        return structured_tool

View File

@@ -1,16 +1,10 @@
from collections import defaultdict
from typing import Any, Dict, List, Optional, TypeVar, Union
from typing import DefaultDict # Separate import to avoid circular imports
from pydantic import BaseModel, Field
from rich.box import HEAVY_EDGE
from rich.console import Console
from rich.table import Table
from crewai.llm import LLM
T = TypeVar('T', bound=LLM)
from crewai.agent import Agent
from crewai.task import Task
from crewai.tasks.task_output import TaskOutput
@@ -34,47 +28,14 @@ class CrewEvaluator:
        iteration (int): The current iteration of the evaluation.
    """

    _tasks_scores: DefaultDict[int, List[float]] = Field(
        default_factory=lambda: defaultdict(list))
    _run_execution_times: DefaultDict[int, List[float]] = Field(
        default_factory=lambda: defaultdict(list))
    tasks_scores: defaultdict = defaultdict(list)
    run_execution_times: defaultdict = defaultdict(list)
    iteration: int = 0

    @property
    def tasks_scores(self) -> DefaultDict[int, List[float]]:
        return self._tasks_scores

    @tasks_scores.setter
    def tasks_scores(self, value: Dict[int, List[float]]) -> None:
        self._tasks_scores = defaultdict(list, value)

    @property
    def run_execution_times(self) -> DefaultDict[int, List[float]]:
        return self._run_execution_times

    @run_execution_times.setter
    def run_execution_times(self, value: Dict[int, List[float]]) -> None:
        self._run_execution_times = defaultdict(list, value)

    def __init__(self, crew, llm: Union[str, T]):
        """Initialize the CrewEvaluator.

        Args:
            crew: The Crew instance to evaluate
            llm: Language model to use for evaluation. Can be either a model name string
                or an LLM instance for custom implementations

        Raises:
            ValueError: If llm is None or invalid
        """
        if not llm:
            raise ValueError("Invalid LLM configuration")
    def __init__(self, crew, openai_model_name: str):
        self.crew = crew
        self.llm = LLM(model=llm) if isinstance(llm, str) else llm
        self.openai_model_name = openai_model_name
        self._telemetry = Telemetry()
        self._tasks_scores = defaultdict(list)
        self._run_execution_times = defaultdict(list)
        self._setup_for_evaluating()

    def _setup_for_evaluating(self) -> None:
@@ -90,7 +51,7 @@ class CrewEvaluator:
            ),
            backstory="Evaluator agent for crew evaluation with precise capabilities to evaluate the performance of the agents in the crew based on the tasks they have performed",
            verbose=False,
            llm=self.llm,
            llm=self.openai_model_name,
        )

    def _evaluation_task(
@@ -220,19 +181,11 @@ class CrewEvaluator:
                self.crew,
                evaluation_result.pydantic.quality,
                current_task._execution_time,
                self._get_llm_identifier(),
                self.openai_model_name,
            )
            self._tasks_scores[self.iteration].append(evaluation_result.pydantic.quality)
            self._run_execution_times[self.iteration].append(
            self.tasks_scores[self.iteration].append(evaluation_result.pydantic.quality)
            self.run_execution_times[self.iteration].append(
                current_task._execution_time
            )
        else:
            raise ValueError("Evaluation result is not in the expected format")

    def _get_llm_identifier(self) -> str:
        """Get a string identifier for the LLM instance.

        Returns:
            String representation of the LLM for telemetry
        """
        return str(self.llm) if isinstance(self.llm, LLM) else self.llm

View File

@@ -10,7 +10,6 @@ import instructor
import pydantic_core
import pytest
from crewai.llm import LLM
from crewai.agent import Agent
from crewai.agents.cache import CacheHandler
from crewai.crew import Crew
@@ -1124,7 +1123,7 @@ def test_kickoff_for_each_empty_input():
    assert results == []
@pytest.mark.vcr(filter_headeruvs=["authorization"])
@pytest.mark.vcr(filter_headers=["authorization"])
def test_kickoff_for_each_invalid_input():
    """Tests if kickoff_for_each raises TypeError for invalid input types."""
@@ -2829,7 +2828,7 @@ def test_crew_testing_function(kickoff_mock, copy_mock, crew_evaluator):
    copy_mock.return_value = crew

    n_iterations = 2
    crew.test(n_iterations, llm="gpt-4o-mini", inputs={"topic": "AI"})
    crew.test(n_iterations, openai_model_name="gpt-4o-mini", inputs={"topic": "AI"})

    # Ensure kickoff is called on the copied crew
    kickoff_mock.assert_has_calls(
@@ -2845,32 +2844,6 @@ def test_crew_testing_function(kickoff_mock, copy_mock, crew_evaluator):
        ]
    )


@mock.patch("crewai.crew.CrewEvaluator")
@mock.patch("crewai.crew.Crew.copy")
@mock.patch("crewai.crew.Crew.kickoff")
def test_crew_testing_with_custom_llm(kickoff_mock, copy_mock, crew_evaluator):
    task = Task(
        description="Test task",
        expected_output="Test output",
        agent=researcher,
    )
    crew = Crew(agents=[researcher], tasks=[task])
    copy_mock.return_value = crew

    custom_llm = LLM(model="gpt-4")
    crew.test(2, llm=custom_llm, inputs={"topic": "AI"})

    kickoff_mock.assert_has_calls([
        mock.call(inputs={"topic": "AI"}),
        mock.call(inputs={"topic": "AI"})
    ])

    crew_evaluator.assert_has_calls([
        mock.call(crew, custom_llm),
        mock.call().set_iteration(1),
        mock.call().set_iteration(2),
        mock.call().print_crew_evaluation_result(),
    ])
@pytest.mark.vcr(filter_headers=["authorization"])
def test_hierarchical_verbose_manager_agent():
@@ -3152,4 +3125,4 @@ def test_multimodal_agent_live_image_analysis():
    # Verify we got a meaningful response
    assert isinstance(result.raw, str)
    assert len(result.raw) > 100  # Expecting a detailed analysis
    assert "error" not in result.raw.lower()  # No error messages in response
    assert "error" not in result.raw.lower()  # No error messages in response

View File

@@ -0,0 +1,110 @@
import pytest
from unittest.mock import MagicMock, patch
from typing import Any, Dict, Optional

from crewai.tools.base_tool import BaseTool, Tool
from crewai.tools.tool_with_instruction import ToolWithInstruction


class MockTool(BaseTool):
    """Mock tool for testing."""

    name: str = "mock_tool"
    description: str = "A mock tool for testing"

    def _run(self, *args: Any, **kwargs: Any) -> str:
        return "mock result"


class TestToolWithInstruction:
    """Test suite for ToolWithInstruction."""

    def test_initialization(self):
        """Test tool initialization with instructions."""
        tool = MockTool()
        instructions = "Only use this tool for XYZ"
        wrapped_tool = ToolWithInstruction(tool=tool, instructions=instructions)

        assert wrapped_tool.name == tool.name
        assert "Instructions: Only use this tool for XYZ" in wrapped_tool.description
        assert wrapped_tool.args_schema == tool.args_schema

    def test_run_method(self):
        """Test that the run method delegates to the original tool."""
        tool = MockTool()
        instructions = "Only use this tool for XYZ"
        wrapped_tool = ToolWithInstruction(tool=tool, instructions=instructions)

        result = wrapped_tool.run()
        assert result == "mock result"

    def test_to_structured_tool(self):
        """Test that to_structured_tool includes instructions."""
        tool = MockTool()
        instructions = "Only use this tool for XYZ"
        wrapped_tool = ToolWithInstruction(tool=tool, instructions=instructions)

        structured_tool = wrapped_tool.to_structured_tool()
        assert "Instructions: Only use this tool for XYZ" in structured_tool.description

    def test_with_function_tool(self):
        """Test tool wrapping with a function tool."""

        def sample_func():
            return "sample result"

        tool = Tool(
            name="sample_tool",
            description="A sample tool",
            func=sample_func
        )
        instructions = "Only use this tool for XYZ"
        wrapped_tool = ToolWithInstruction(tool=tool, instructions=instructions)

        assert wrapped_tool.name == tool.name
        assert "Instructions: Only use this tool for XYZ" in wrapped_tool.description

    def test_empty_instructions(self):
        """Test that empty instructions raise ValueError."""
        tool = MockTool()

        with pytest.raises(ValueError, match="Instructions cannot be empty"):
            ToolWithInstruction(tool=tool, instructions="")

        with pytest.raises(ValueError, match="Instructions cannot be empty"):
            ToolWithInstruction(tool=tool, instructions=" ")

    def test_too_long_instructions(self):
        """Test that instructions exceeding maximum length raise ValueError."""
        tool = MockTool()
        long_instructions = "x" * (ToolWithInstruction.MAX_INSTRUCTION_LENGTH + 1)

        with pytest.raises(ValueError, match="Instructions exceed maximum length"):
            ToolWithInstruction(tool=tool, instructions=long_instructions)

    def test_update_instructions(self):
        """Test updating instructions dynamically."""
        tool = MockTool()
        initial_instructions = "Initial instructions"
        new_instructions = "Updated instructions"

        wrapped_tool = ToolWithInstruction(tool=tool, instructions=initial_instructions)
        assert "Instructions: Initial instructions" in wrapped_tool.description

        wrapped_tool.update_instructions(new_instructions)
        assert "Instructions: Updated instructions" in wrapped_tool.description
        assert wrapped_tool.instructions == new_instructions

    def test_update_instructions_validation(self):
        """Test validation when updating instructions."""
        tool = MockTool()
        wrapped_tool = ToolWithInstruction(tool=tool, instructions="Valid instructions")

        with pytest.raises(ValueError, match="Instructions cannot be empty"):
            wrapped_tool.update_instructions("")

        long_instructions = "x" * (ToolWithInstruction.MAX_INSTRUCTION_LENGTH + 1)
        with pytest.raises(ValueError, match="Instructions exceed maximum length"):
            wrapped_tool.update_instructions(long_instructions)

View File

@@ -2,7 +2,6 @@ from unittest import mock
import pytest
from crewai.llm import LLM
from crewai.agent import Agent
from crewai.crew import Crew
from crewai.task import Task
@@ -24,7 +23,7 @@ class TestCrewEvaluator:
        )
        crew = Crew(agents=[agent], tasks=[task])

        return CrewEvaluator(crew, llm="gpt-4o-mini")
        return CrewEvaluator(crew, openai_model_name="gpt-4o-mini")

    def test_setup_for_evaluating(self, crew_planner):
        crew_planner._setup_for_evaluating()
@@ -48,25 +47,6 @@ class TestCrewEvaluator:
        assert agent.verbose is False
        assert agent.llm.model == "gpt-4o-mini"

    @pytest.mark.parametrize("llm_input,expected_model", [
        (LLM(model="gpt-4"), "gpt-4"),
        ("gpt-4", "gpt-4"),
    ])
    def test_evaluator_with_llm_types(self, crew_planner, llm_input, expected_model):
        evaluator = CrewEvaluator(crew_planner.crew, llm_input)
        agent = evaluator._evaluator_agent()
        assert agent.llm.model == expected_model

    def test_evaluator_with_invalid_llm(self, crew_planner):
        with pytest.raises(ValueError, match="Invalid LLM configuration"):
            CrewEvaluator(crew_planner.crew, None)

    def test_evaluator_with_string_llm(self, crew_planner):
        evaluator = CrewEvaluator(crew_planner.crew, "gpt-4")
        agent = evaluator._evaluator_agent()
        assert isinstance(agent.llm, LLM)
        assert agent.llm.model == "gpt-4"

    def test_evaluation_task(self, crew_planner):
        evaluator_agent = Agent(
            role="Evaluator Agent",