Compare commits

...

5 Commits

Author SHA1 Message Date
Devin AI
5e9968d2f8 Address PR feedback for #2886: Add validation, logging, documentation, utility methods, and comprehensive tests for Agent parameters
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-22 11:53:35 +00:00
Devin AI
4d67ecabfd Fix #2886: Add undocumented Agent parameters allow_feedback, allow_conflict, and allow_iteration
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-05-22 11:36:04 +00:00
siddharth Sambharia
409892d65f Portkey Integration with CrewAI (#1233)
* Create Portkey-Observability-and-Guardrails.md

* crewAI update with new changes

* small change

---------

Co-authored-by: siddharthsambharia-portkey <siddhath.s@portkey.ai>
Co-authored-by: João Moura <joaomdmoura@gmail.com>
2024-12-27 18:16:47 -03:00
devin-ai-integration[bot]
62f3df7ed5 docs: add guide for multimodal agents (#1807)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Joe Moura <joao@crewai.com>
2024-12-27 18:16:02 -03:00
João Igor
4cf8913d31 chore: removing crewai-tools from dev-dependencies (#1760)
As mentioned in issue #1759, listing crewai-tools as dev-dependencies makes pip install it a required dependency, and not an optional

Co-authored-by: João Moura <joaomdmoura@gmail.com>
2024-12-27 17:45:06 -03:00
6 changed files with 594 additions and 2 deletions

View File

@@ -0,0 +1,211 @@
# Portkey Integration with CrewAI
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-CrewAI.png" alt="Portkey CrewAI Header Image" width="70%" />
[Portkey](https://portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) is a 2-line upgrade to make your CrewAI agents reliable, cost-efficient, and fast.
Portkey adds 4 core production capabilities to any CrewAI agent:
1. Routing to **200+ LLMs**
2. Making each LLM call more robust
3. Full-stack tracing & cost, performance analytics
4. Real-time guardrails to enforce behavior
## Getting Started
1. **Install Required Packages:**
```bash
pip install -qU crewai portkey-ai
```
2. **Configure the LLM Client:**
To build CrewAI Agents with Portkey, you'll need two keys:
- **Portkey API Key**: Sign up on the [Portkey app](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) and copy your API key
- **Virtual Key**: Virtual Keys securely manage your LLM API keys in one place. Store your LLM provider API keys securely in Portkey's vault
```python
from crewai import LLM
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
gpt_llm = LLM(
model="gpt-4",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy", # We are using Virtual key
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_VIRTUAL_KEY", # Enter your Virtual key from Portkey
)
)
```
3. **Create and Run Your First Agent:**
```python
from crewai import Agent, Task, Crew
# Define your agents with roles and goals
coder = Agent(
role='Software developer',
goal='Write clear, concise code on demand',
backstory='An expert coder with a keen eye for software trends.',
llm=gpt_llm
)
# Create tasks for your agents
task1 = Task(
description="Define the HTML for making a simple website with heading- Hello World! Portkey is working!",
expected_output="A clear and concise HTML code",
agent=coder
)
# Instantiate your crew
crew = Crew(
agents=[coder],
tasks=[task1],
)
result = crew.kickoff()
print(result)
```
## Key Features
| Feature | Description |
|---------|-------------|
| 🌐 Multi-LLM Support | Access OpenAI, Anthropic, Gemini, Azure, and 250+ providers through a unified interface |
| 🛡️ Production Reliability | Implement retries, timeouts, load balancing, and fallbacks |
| 📊 Advanced Observability | Track 40+ metrics including costs, tokens, latency, and custom metadata |
| 🔍 Comprehensive Logging | Debug with detailed execution traces and function call logs |
| 🚧 Security Controls | Set budget limits and implement role-based access control |
| 🔄 Performance Analytics | Capture and analyze feedback for continuous improvement |
| 💾 Intelligent Caching | Reduce costs and latency with semantic or simple caching |
## Production Features with Portkey Configs
All features mentioned below are through Portkey's Config system. Portkey's Config system allows you to define routing strategies using simple JSON objects in your LLM API calls. You can create and manage Configs directly in your code or through the Portkey Dashboard. Each Config has a unique ID for easy reference.
<Frame>
<img src="https://raw.githubusercontent.com/Portkey-AI/docs-core/refs/heads/main/images/libraries/libraries-3.avif"/>
</Frame>
### 1. Use 250+ LLMs
Access various LLMs like Anthropic, Gemini, Mistral, Azure OpenAI, and more with minimal code changes. Switch between providers or use them together seamlessly. [Learn more about Universal API](https://portkey.ai/docs/product/ai-gateway/universal-api)
Easily switch between different LLM providers:
```python
# Anthropic Configuration
anthropic_llm = LLM(
model="claude-3-5-sonnet-latest",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_ANTHROPIC_VIRTUAL_KEY", #You don't need provider when using Virtual keys
trace_id="anthropic_agent"
)
)
# Azure OpenAI Configuration
azure_llm = LLM(
model="gpt-4",
base_url=PORTKEY_GATEWAY_URL,
api_key="dummy",
extra_headers=createHeaders(
api_key="YOUR_PORTKEY_API_KEY",
virtual_key="YOUR_AZURE_VIRTUAL_KEY", #You don't need provider when using Virtual keys
trace_id="azure_agent"
)
)
```
### 2. Caching
Improve response times and reduce costs with two powerful caching modes:
- **Simple Cache**: Perfect for exact matches
- **Semantic Cache**: Matches responses for requests that are semantically similar
[Learn more about Caching](https://portkey.ai/docs/product/ai-gateway/cache-simple-and-semantic)
```py
config = {
"cache": {
"mode": "semantic", # or "simple" for exact matching
}
}
```
### 3. Production Reliability
Portkey provides comprehensive reliability features:
- **Automatic Retries**: Handle temporary failures gracefully
- **Request Timeouts**: Prevent hanging operations
- **Conditional Routing**: Route requests based on specific conditions
- **Fallbacks**: Set up automatic provider failovers
- **Load Balancing**: Distribute requests efficiently
[Learn more about Reliability Features](https://portkey.ai/docs/product/ai-gateway/)
### 4. Metrics
Agent runs are complex. Portkey automatically logs **40+ comprehensive metrics** for your AI agents, including cost, tokens used, latency, etc. Whether you need a broad overview or granular insights into your agent runs, Portkey's customizable filters provide the metrics you need.
- Cost per agent interaction
- Response times and latency
- Token usage and efficiency
- Success/failure rates
- Cache hit rates
<img src="https://github.com/siddharthsambharia-portkey/Portkey-Product-Images/blob/main/Portkey-Dashboard.png?raw=true" width="70%" alt="Portkey Dashboard" />
### 5. Detailed Logging
Logs are essential for understanding agent behavior, diagnosing issues, and improving performance. They provide a detailed record of agent activities and tool use, which is crucial for debugging and optimizing processes.
Access a dedicated section to view records of agent executions, including parameters, outcomes, function calls, and errors. Filter logs based on multiple parameters such as trace ID, model, tokens used, and metadata.
<details>
<summary><b>Traces</b></summary>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-Traces.png" alt="Portkey Traces" width="70%" />
</details>
<details>
<summary><b>Logs</b></summary>
<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-Logs.png" alt="Portkey Logs" width="70%" />
</details>
### 6. Enterprise Security Features
- Set budget limit and rate limts per Virtual Key (disposable API keys)
- Implement role-based access control
- Track system changes with audit logs
- Configure data retention policies
For detailed information on creating and managing Configs, visit the [Portkey documentation](https://docs.portkey.ai/product/ai-gateway/configs).
## Resources
- [📘 Portkey Documentation](https://docs.portkey.ai)
- [📊 Portkey Dashboard](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai)
- [🐦 Twitter](https://twitter.com/portkeyai)
- [💬 Discord Community](https://discord.gg/DD7vgKK299)

View File

@@ -0,0 +1,138 @@
---
title: Using Multimodal Agents
description: Learn how to enable and use multimodal capabilities in your agents for processing images and other non-text content within the CrewAI framework.
icon: image
---
# Using Multimodal Agents
CrewAI supports multimodal agents that can process both text and non-text content like images. This guide will show you how to enable and use multimodal capabilities in your agents.
## Enabling Multimodal Capabilities
To create a multimodal agent, simply set the `multimodal` parameter to `True` when initializing your agent:
```python
from crewai import Agent
agent = Agent(
role="Image Analyst",
goal="Analyze and extract insights from images",
backstory="An expert in visual content interpretation with years of experience in image analysis",
multimodal=True # This enables multimodal capabilities
)
```
When you set `multimodal=True`, the agent is automatically configured with the necessary tools for handling non-text content, including the `AddImageTool`.
## Working with Images
The multimodal agent comes pre-configured with the `AddImageTool`, which allows it to process images. You don't need to manually add this tool - it's automatically included when you enable multimodal capabilities.
Here's a complete example showing how to use a multimodal agent to analyze an image:
```python
from crewai import Agent, Task, Crew
# Create a multimodal agent
image_analyst = Agent(
role="Product Analyst",
goal="Analyze product images and provide detailed descriptions",
backstory="Expert in visual product analysis with deep knowledge of design and features",
multimodal=True
)
# Create a task for image analysis
task = Task(
description="Analyze the product image at https://example.com/product.jpg and provide a detailed description",
agent=image_analyst
)
# Create and run the crew
crew = Crew(
agents=[image_analyst],
tasks=[task]
)
result = crew.kickoff()
```
### Advanced Usage with Context
You can provide additional context or specific questions about the image when creating tasks for multimodal agents. The task description can include specific aspects you want the agent to focus on:
```python
from crewai import Agent, Task, Crew
# Create a multimodal agent for detailed analysis
expert_analyst = Agent(
role="Visual Quality Inspector",
goal="Perform detailed quality analysis of product images",
backstory="Senior quality control expert with expertise in visual inspection",
multimodal=True # AddImageTool is automatically included
)
# Create a task with specific analysis requirements
inspection_task = Task(
description="""
Analyze the product image at https://example.com/product.jpg with focus on:
1. Quality of materials
2. Manufacturing defects
3. Compliance with standards
Provide a detailed report highlighting any issues found.
""",
agent=expert_analyst
)
# Create and run the crew
crew = Crew(
agents=[expert_analyst],
tasks=[inspection_task]
)
result = crew.kickoff()
```
### Tool Details
When working with multimodal agents, the `AddImageTool` is automatically configured with the following schema:
```python
class AddImageToolSchema:
image_url: str # Required: The URL or path of the image to process
action: Optional[str] = None # Optional: Additional context or specific questions about the image
```
The multimodal agent will automatically handle the image processing through its built-in tools, allowing it to:
- Access images via URLs or local file paths
- Process image content with optional context or specific questions
- Provide analysis and insights based on the visual information and task requirements
## Best Practices
When working with multimodal agents, keep these best practices in mind:
1. **Image Access**
- Ensure your images are accessible via URLs that the agent can reach
- For local images, consider hosting them temporarily or using absolute file paths
- Verify that image URLs are valid and accessible before running tasks
2. **Task Description**
- Be specific about what aspects of the image you want the agent to analyze
- Include clear questions or requirements in the task description
- Consider using the optional `action` parameter for focused analysis
3. **Resource Management**
- Image processing may require more computational resources than text-only tasks
- Some language models may require base64 encoding for image data
- Consider batch processing for multiple images to optimize performance
4. **Environment Setup**
- Verify that your environment has the necessary dependencies for image processing
- Ensure your language model supports multimodal capabilities
- Test with small images first to validate your setup
5. **Error Handling**
- Implement proper error handling for image loading failures
- Have fallback strategies for when image processing fails
- Monitor and log image processing operations for debugging

View File

@@ -67,7 +67,6 @@ dev-dependencies = [
"mkdocs-material-extensions>=1.3.1",
"pillow>=10.2.0",
"cairosvg>=2.7.1",
"crewai-tools>=0.17.0",
"pytest>=8.0.0",
"pytest-vcr>=1.0.2",
"python-dotenv>=1.0.0",

View File

@@ -9,6 +9,7 @@ from crewai.agents import CacheHandler
from crewai.agents.agent_builder.base_agent import BaseAgent
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.cli.constants import ENV_VARS, LITELLM_PARAMS
from crewai.utilities import Logger
from crewai.knowledge.knowledge import Knowledge
from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
from crewai.knowledge.utils.knowledge_utils import extract_knowledge_context
@@ -62,8 +63,12 @@ class Agent(BaseAgent):
tools: Tools at agents disposal
step_callback: Callback to be executed after each step of the agent execution.
knowledge_sources: Knowledge sources for the agent.
allow_feedback: Whether the agent can receive and process feedback during execution.
allow_conflict: Whether the agent can handle conflicts with other agents during execution.
allow_iteration: Whether the agent can iterate on its solutions based on feedback and validation.
"""
_logger = PrivateAttr(default_factory=lambda: Logger(verbose=False))
_times_executed: int = PrivateAttr(default=0)
max_execution_time: Optional[int] = Field(
default=None,
@@ -123,6 +128,18 @@ class Agent(BaseAgent):
default="safe",
description="Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct execution).",
)
allow_feedback: bool = Field(
default=False,
description="Enable agent to receive and process feedback during execution.",
)
allow_conflict: bool = Field(
default=False,
description="Enable agent to handle conflicts with other agents during execution.",
)
allow_iteration: bool = Field(
default=False,
description="Enable agent to iterate on its solutions based on feedback and validation.",
)
embedder_config: Optional[Dict[str, Any]] = Field(
default=None,
description="Embedder configuration for the agent.",
@@ -139,6 +156,19 @@ class Agent(BaseAgent):
def post_init_setup(self):
self._set_knowledge()
self.agent_ops_agent_name = self.role
if self.allow_feedback:
self._logger.log("info", "Feedback mode enabled for agent.", color="bold_green")
if self.allow_conflict:
self._logger.log("info", "Conflict handling enabled for agent.", color="bold_green")
if self.allow_iteration:
self._logger.log("info", "Iteration mode enabled for agent.", color="bold_green")
# Validate boolean parameters
for param in ['allow_feedback', 'allow_conflict', 'allow_iteration']:
if not isinstance(getattr(self, param), bool):
raise ValueError(f"Parameter '{param}' must be a boolean value.")
unaccepted_attributes = [
"AWS_ACCESS_KEY_ID",
"AWS_SECRET_ACCESS_KEY",
@@ -400,6 +430,9 @@ class Agent(BaseAgent):
step_callback=self.step_callback,
function_calling_llm=self.function_calling_llm,
respect_context_window=self.respect_context_window,
allow_feedback=self.allow_feedback,
allow_conflict=self.allow_conflict,
allow_iteration=self.allow_iteration,
request_within_rpm_limit=(
self._rpm_controller.check_or_wait if self._rpm_controller else None
),

View File

@@ -31,6 +31,34 @@ class ToolResult:
class CrewAgentExecutor(CrewAgentExecutorMixin):
"""CrewAgentExecutor class for managing agent execution.
This class is responsible for executing agent tasks, handling tools,
managing agent interactions, and processing the results.
Parameters:
llm: The language model to use for generating responses.
task: The task to be executed.
crew: The crew that the agent belongs to.
agent: The agent to execute the task.
prompt: The prompt to use for generating responses.
max_iter: Maximum number of iterations for the agent execution.
tools: The tools available to the agent.
tools_names: The names of the tools available to the agent.
stop_words: Words that signal the end of agent execution.
tools_description: Description of the tools available to the agent.
tools_handler: Handler for tool operations.
step_callback: Callback function for each step of execution.
original_tools: Original list of tools before processing.
function_calling_llm: LLM specifically for function calling.
respect_context_window: Whether to respect the context window size.
request_within_rpm_limit: Function to check if request is within RPM limit.
callbacks: List of callback functions.
allow_feedback: Controls feedback processing during execution.
allow_conflict: Enables conflict handling between agents.
allow_iteration: Allows solution iteration based on feedback.
"""
_logger: Logger = Logger()
def __init__(
@@ -52,6 +80,9 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
respect_context_window: bool = False,
request_within_rpm_limit: Any = None,
callbacks: List[Any] = [],
allow_feedback: bool = False,
allow_conflict: bool = False,
allow_iteration: bool = False,
):
self._i18n: I18N = I18N()
self.llm = llm
@@ -73,6 +104,9 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
self.function_calling_llm = function_calling_llm
self.respect_context_window = respect_context_window
self.request_within_rpm_limit = request_within_rpm_limit
self.allow_feedback = allow_feedback
self.allow_conflict = allow_conflict
self.allow_iteration = allow_iteration
self.ask_for_human_input = False
self.messages: List[Dict[str, str]] = []
self.iterations = 0
@@ -487,3 +521,56 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
self.ask_for_human_input = False
return formatted_answer
def process_feedback(self, feedback: str) -> bool:
"""
Process feedback for the agent if feedback mode is enabled.
Parameters:
feedback (str): The feedback to process.
Returns:
bool: True if the feedback was processed successfully, False otherwise.
"""
if not self.allow_feedback:
self._logger.log("warning", "Feedback processing skipped (allow_feedback=False).", color="yellow")
return False
self._logger.log("info", f"Processing feedback: {feedback}", color="green")
# Add feedback to messages
self.messages.append(self._format_msg(f"Feedback: {feedback}"))
return True
def handle_conflict(self, other_agent: 'CrewAgentExecutor') -> bool:
"""
Handle conflict with another agent if conflict handling is enabled.
Parameters:
other_agent (CrewAgentExecutor): The other agent involved in the conflict.
Returns:
bool: True if the conflict was handled successfully, False otherwise.
"""
if not self.allow_conflict:
self._logger.log("warning", "Conflict handling skipped (allow_conflict=False).", color="yellow")
return False
self._logger.log("info", f"Handling conflict with agent: {other_agent.agent.role}", color="green")
return True
def process_iteration(self, result: Any) -> bool:
"""
Process iteration based on result if iteration mode is enabled.
Parameters:
result (Any): The result to iterate on.
Returns:
bool: True if the iteration was processed successfully, False otherwise.
"""
if not self.allow_iteration:
self._logger.log("warning", "Iteration processing skipped (allow_iteration=False).", color="yellow")
return False
self._logger.log("info", "Processing iteration on result.", color="green")
return True

View File

@@ -1625,3 +1625,127 @@ def test_agent_with_knowledge_sources():
# Assert that the agent provides the correct information
assert "red" in result.raw.lower()
def test_agent_with_feedback_conflict_iteration_params():
"""Test that the agent correctly handles the allow_feedback, allow_conflict, and allow_iteration parameters."""
agent = Agent(
role="test role",
goal="test goal",
backstory="test backstory",
allow_feedback=True,
allow_conflict=True,
allow_iteration=True,
)
assert agent.allow_feedback is True
assert agent.allow_conflict is True
assert agent.allow_iteration is True
# Create another agent with default values
default_agent = Agent(
role="test role",
goal="test goal",
backstory="test backstory",
)
assert default_agent.allow_feedback is False
assert default_agent.allow_conflict is False
assert default_agent.allow_iteration is False
def test_agent_feedback_processing():
"""Test that the agent correctly processes feedback when allow_feedback is enabled."""
from unittest.mock import patch, MagicMock
# Create a mock CrewAgentExecutor
mock_executor = MagicMock()
mock_executor.allow_feedback = True
mock_executor.process_feedback.return_value = True
# Mock the create_agent_executor method at the module level
with patch('crewai.agent.Agent.create_agent_executor', return_value=mock_executor):
# Create an agent with allow_feedback=True
agent = Agent(
role="test role",
goal="test goal",
backstory="test backstory",
allow_feedback=True,
llm=MagicMock() # Mock LLM to avoid API calls
)
executor = agent.create_agent_executor()
assert executor.allow_feedback is True
result = executor.process_feedback("Test feedback")
assert result is True
executor.process_feedback.assert_called_once_with("Test feedback")
def test_agent_conflict_handling():
"""Test that the agent correctly handles conflicts when allow_conflict is enabled."""
from unittest.mock import patch, MagicMock
mock_executor1 = MagicMock()
mock_executor1.allow_conflict = True
mock_executor1.handle_conflict.return_value = True
mock_executor2 = MagicMock()
mock_executor2.allow_conflict = True
with patch('crewai.agent.Agent.create_agent_executor', return_value=mock_executor1):
# Create agents with allow_conflict=True
agent1 = Agent(
role="role1",
goal="goal1",
backstory="backstory1",
allow_conflict=True,
llm=MagicMock() # Mock LLM to avoid API calls
)
agent2 = Agent(
role="role2",
goal="goal2",
backstory="backstory2",
allow_conflict=True,
llm=MagicMock() # Mock LLM to avoid API calls
)
# Get the executors
executor1 = agent1.create_agent_executor()
executor2 = agent2.create_agent_executor()
assert executor1.allow_conflict is True
assert executor2.allow_conflict is True
result = executor1.handle_conflict(executor2)
assert result is True
executor1.handle_conflict.assert_called_once_with(executor2)
def test_agent_iteration_processing():
"""Test that the agent correctly processes iterations when allow_iteration is enabled."""
from unittest.mock import patch, MagicMock
# Create a mock CrewAgentExecutor
mock_executor = MagicMock()
mock_executor.allow_iteration = True
mock_executor.process_iteration.return_value = True
# Mock the create_agent_executor method at the module level
with patch('crewai.agent.Agent.create_agent_executor', return_value=mock_executor):
# Create an agent with allow_iteration=True
agent = Agent(
role="test role",
goal="test goal",
backstory="test backstory",
allow_iteration=True,
llm=MagicMock() # Mock LLM to avoid API calls
)
executor = agent.create_agent_executor()
assert executor.allow_iteration is True
result = executor.process_iteration("Test result")
assert result is True
executor.process_iteration.assert_called_once_with("Test result")