Compare commits

..

2 Commits

Author SHA1 Message Date
Brandon Hancock (bhancock_ai)
0af6b05e16 Merge branch 'main' into bugfix/support-llm-managers-that-use-model 2025-01-14 13:24:17 -05:00
Brandon Hancock
f5d01b9efc Incorporate y4izus fix 2025-01-14 13:15:54 -05:00
3 changed files with 6 additions and 7 deletions

View File

@@ -1,14 +1,14 @@
---
title: Using Multimodal Agents
description: Learn how to enable and use multimodal capabilities in your agents for processing images and other non-text content within the CrewAI framework.
icon: video
icon: image
---
## Using Multimodal Agents
# Using Multimodal Agents
CrewAI supports multimodal agents that can process both text and non-text content like images. This guide will show you how to enable and use multimodal capabilities in your agents.
### Enabling Multimodal Capabilities
## Enabling Multimodal Capabilities
To create a multimodal agent, simply set the `multimodal` parameter to `True` when initializing your agent:
@@ -25,7 +25,7 @@ agent = Agent(
When you set `multimodal=True`, the agent is automatically configured with the necessary tools for handling non-text content, including the `AddImageTool`.
### Working with Images
## Working with Images
The multimodal agent comes pre-configured with the `AddImageTool`, which allows it to process images. You don't need to manually add this tool - it's automatically included when you enable multimodal capabilities.
@@ -108,7 +108,7 @@ The multimodal agent will automatically handle the image processing through its
- Process image content with optional context or specific questions
- Provide analysis and insights based on the visual information and task requirements
### Best Practices
## Best Practices
When working with multimodal agents, keep these best practices in mind:

View File

@@ -91,7 +91,6 @@
"how-to/custom-manager-agent",
"how-to/llm-connections",
"how-to/customizing-agents",
"how-to/multimodal-agents",
"how-to/coding-agents",
"how-to/force-tool-output-as-result",
"how-to/human-input-on-execution",

View File

@@ -43,7 +43,7 @@
"ask_question": "Ask a specific question to one of the following coworkers: {coworkers}\nThe input to this tool should be the coworker, the question you have for them, and ALL necessary context to ask the question properly, they know nothing about the question, so share absolute everything you know, don't reference things but instead explain them.",
"add_image": {
"name": "Add image to content",
"description": "See image to understand its content, you can optionally ask a question about the image",
"description": "See image to understand it's content, you can optionally ask a question about the image",
"default_action": "Please provide a detailed description of this image, including all visual elements, context, and any notable details you can observe."
}
}