docs: update multimodal agents guide and mint.json configuration

fix: add multimodal docs path to mint.json
fix: get rid of translation typo (#1880)
2026-03-03 18:28:13 +00:00 · 2025-01-15 14:13:37 -05:00 · 2025-01-15 13:54:32 -05:00 · 2025-01-14 14:06:01 -05:00 · 2025-01-14 13:35:21 -05:00 · 2025-01-14 13:24:03 -05:00
4 changed files with 8 additions and 6 deletions
--- a/docs/how-to/multimodal-agents.mdx
+++ b/docs/how-to/multimodal-agents.mdx
@@ -1,14 +1,14 @@
 ---
 title: Using Multimodal Agents
 description: Learn how to enable and use multimodal capabilities in your agents for processing images and other non-text content within the CrewAI framework.
-icon: image
+icon: video
 ---

-# Using Multimodal Agents
+## Using Multimodal Agents

 CrewAI supports multimodal agents that can process both text and non-text content like images. This guide will show you how to enable and use multimodal capabilities in your agents.

-## Enabling Multimodal Capabilities
+### Enabling Multimodal Capabilities

 To create a multimodal agent, simply set the `multimodal` parameter to `True` when initializing your agent:

@@ -25,7 +25,7 @@ agent = Agent(

 When you set `multimodal=True`, the agent is automatically configured with the necessary tools for handling non-text content, including the `AddImageTool`.

-## Working with Images
+### Working with Images

 The multimodal agent comes pre-configured with the `AddImageTool`, which allows it to process images. You don't need to manually add this tool - it's automatically included when you enable multimodal capabilities.

@@ -108,7 +108,7 @@ The multimodal agent will automatically handle the image processing through its
 - Process image content with optional context or specific questions
 - Provide analysis and insights based on the visual information and task requirements

-## Best Practices
+### Best Practices

 When working with multimodal agents, keep these best practices in mind:

--- a/docs/mint.json
+++ b/docs/mint.json
@@ -91,6 +91,7 @@
        "how-to/custom-manager-agent",
        "how-to/llm-connections",
        "how-to/customizing-agents",
+        "how-to/multimodal-agents",
        "how-to/coding-agents",
        "how-to/force-tool-output-as-result",
        "how-to/human-input-on-execution",
--- a/src/crewai/crew.py
+++ b/src/crewai/crew.py
@@ -676,6 +676,7 @@ class Crew(BaseModel):
        else:
            self.manager_llm = (
                getattr(self.manager_llm, "model_name", None)
+                or getattr(self.manager_llm, "model", None)
                or getattr(self.manager_llm, "deployment_name", None)
                or self.manager_llm
            )
--- a/src/crewai/translations/en.json
+++ b/src/crewai/translations/en.json
@@ -43,7 +43,7 @@
    "ask_question": "Ask a specific question to one of the following coworkers: {coworkers}\nThe input to this tool should be the coworker, the question you have for them, and ALL necessary context to ask the question properly, they know nothing about the question, so share absolute everything you know, don't reference things but instead explain them.",
    "add_image": {
      "name": "Add image to content",
-      "description": "See image to understand it's content, you can optionally ask a question about the image",
+      "description": "See image to understand its content, you can optionally ask a question about the image",
      "default_action": "Please provide a detailed description of this image, including all visual elements, context, and any notable details you can observe."
    }
  }
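Note on the `crew.py` hunk: the added line puts a `model` attribute into the lookup chain, between `model_name` and `deployment_name`. The sketch below reproduces that chain in isolation to show the resolution order. The function name `resolve_manager_llm` and the `SimpleNamespace` stand-ins are hypothetical and used only for illustration; they are not part of CrewAI.

```python
from types import SimpleNamespace


def resolve_manager_llm(manager_llm):
    """Return the first truthy identifier, else the object itself.

    Hypothetical helper mirroring the expression in the crew.py hunk.
    """
    return (
        getattr(manager_llm, "model_name", None)
        or getattr(manager_llm, "model", None)  # the line this change adds
        or getattr(manager_llm, "deployment_name", None)
        or manager_llm
    )


# Hypothetical stand-ins for LLM wrapper objects (not real CrewAI classes).
only_model = SimpleNamespace(model="gpt-4o")
azure_style = SimpleNamespace(deployment_name="my-deployment")
both = SimpleNamespace(model_name="name-wins", model="ignored")
plain_string = "gpt-4o-mini"

print(resolve_manager_llm(only_model))
print(resolve_manager_llm(azure_style))
print(resolve_manager_llm(both))
print(resolve_manager_llm(plain_string))
```

Under these assumptions, an object that exposes only `model` now resolves to that string rather than falling through to the object itself, while `model_name` still takes precedence when both are set. A plain string has none of the three attributes, so it is returned unchanged.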
Author	SHA1	Message	Date
Tony Kipkemboi	c12343a8b8	docs: update multimodal agents guide and mint.json configuration	2025-01-15 14:13:37 -05:00
Tony Kipkemboi	835557e648	fix: add multimodal docs path to mint.json	2025-01-15 13:54:32 -05:00
Daniel Barreto	4185ea688f	fix: get rid of translation typo (#1880) Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>	2025-01-14 14:06:01 -05:00
Brandon Hancock (bhancock_ai)	0532089246	Incorporate y4izus fix (#1893)	2025-01-14 13:35:21 -05:00
Brandon Hancock (bhancock_ai)	24b155015c	before kickoff breaks if inputs are none. (#1883) * before kickoff breaks if inputs are none. * improve none type * Fix failing tests * add tests for new code * Fix failing test * drop extra comments * clean up based on eduardo feedback	2025-01-14 13:24:03 -05:00