mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-10 16:48:30 +00:00
This commit fixes the multimodal image handling in CrewAI agents: 1. CrewAgentExecutor._handle_agent_action: Fixed to properly append multimodal messages without double-wrapping. The add_image tool result is now appended directly if it's already a properly formatted message dict with role and content. 2. ToolUsage._use: Fixed to preserve the raw dict result for add_image tool instead of stringifying it via _format_result(). This ensures the multimodal content structure is maintained. 3. Gemini provider _format_messages_for_gemini: Added proper handling for multimodal content with image_url parts. The provider now: - Converts image_url parts to Gemini's inline_data format - Supports HTTP(S) URLs by fetching and converting to base64 - Supports data URLs by parsing and extracting base64 data - Supports local file paths by reading and converting to base64 4. Added comprehensive tests for all multimodal handling fixes. Fixes #4016 Co-Authored-By: João <joao@crewai.com>