Adding Multimodal Abilities to Crew (#1805)

* initial fix on delegation tools * fixing tests for delegations and coding * Refactor prepare tool and adding initial add images logic * supporting image tool * fixing linter * fix linter * Making sure multimodal feature support i18n * fix linter and types * mixxing translations * fix types and linter * Revert "fixing linter" This reverts commit 2eda5fdeed. * fix linters * test * fix * fix * fix linter * fix * ignore * type improvements
2026-01-11 00:58:30 +00:00 · 2024-12-27 17:03:35 -03:00
parent 6cc2f510bf
commit 82647358b2
15 changed files with 2992 additions and 546 deletions
--- a/src/crewai/agents/crew_agent_executor.py
+++ b/src/crewai/agents/crew_agent_executor.py
@@ -143,10 +143,20 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                        tool_result = self._execute_tool_and_check_finality(
                            formatted_answer
                        )
-                        if self.step_callback:
-                            self.step_callback(tool_result)

-                        formatted_answer.text += f"\nObservation: {tool_result.result}"
+                        # Directly append the result to the messages if the
+                        # tool is "Add image to content" in case of multimodal
+                        # agents
+                        if formatted_answer.tool == self._i18n.tools("add_image")["name"]:
+                            self.messages.append(tool_result.result)
+                            continue
+
+                        else:
+                            if self.step_callback:
+                                self.step_callback(tool_result)
+
+                            formatted_answer.text += f"\nObservation: {tool_result.result}"
+
                        formatted_answer.result = tool_result.result
                        if tool_result.result_as_answer:
                            return AgentFinish(