Add support for custom LLM implementations (#2277)

* Add support for custom LLM implementations

Co-Authored-By: Joe Moura <joao@crewai.com>

* Fix import sorting and type annotations

Co-Authored-By: Joe Moura <joao@crewai.com>

* Fix linting issues with import sorting

Co-Authored-By: Joe Moura <joao@crewai.com>

* Fix type errors in `crew.py` by updating tool-related methods to return `List[BaseTool]`

Co-Authored-By: Joe Moura <joao@crewai.com>

* Enhance custom LLM implementation with better error handling, documentation, and test coverage

Co-Authored-By: Joe Moura <joao@crewai.com>

* Refactor LLM module by extracting BaseLLM to a separate file

This commit moves the BaseLLM abstract base class from llm.py to a new file llms/base_llm.py to improve code organization. The changes include:

- Creating a new file src/crewai/llms/base_llm.py
- Moving the BaseLLM class to the new file
- Updating imports in __init__.py and llm.py to reflect the new location
- Updating test cases to use the new import path

The refactoring maintains the existing functionality while improving the project's module structure.
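
For downstream code only the import path changes; a minimal sketch of the equivalent imports after this refactor (both resolve to the same class, per the `__init__.py` diff below):

```python
# New canonical location of the abstract base class
from crewai.llms.base_llm import BaseLLM

# Also re-exported from the package root for convenience
from crewai import BaseLLM
```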

* Add AISuite LLM support and update dependencies

- Integrate AISuite as a new third-party LLM option
- Update pyproject.toml and uv.lock to include aisuite package
- Modify BaseLLM to support more flexible initialization
- Remove unnecessary LLM imports across multiple files
- Implement AISuiteLLM with basic chat completion functionality (usage sketch below)
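
A hedged usage sketch; the `openai:gpt-4o` model string follows aisuite's `provider:model` convention and is illustrative, not part of this PR:

```python
from crewai import Agent
from crewai.llms.third_party.ai_suite import AISuiteLLM

# aisuite routes on "provider:model" identifiers; exact value is illustrative,
# and provider credentials are assumed to be configured in the environment
llm = AISuiteLLM(model="openai:gpt-4o", temperature=0.2)

agent = Agent(
    role="Researcher",
    goal="Summarize findings",
    backstory="An analyst agent",
    llm=llm,  # accepted because AISuiteLLM subclasses BaseLLM
)
```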

* Update AISuiteLLM and LLM utility type handling

- Modify AISuiteLLM to support more flexible input types for messages
- Update type hints in AISuiteLLM to allow string or list of message dictionaries
- Enhance LLM utility function to support broader LLM type annotations
- Remove default `self.stop` attribute from BaseLLM initialization

* Update LLM imports and type hints across multiple files

- Modify imports in crew_chat.py to use LLM instead of BaseLLM
- Update type hints in llm_utils.py to use LLM type
- Add optional `stop` parameter to BaseLLM initialization
- Refactor type handling for LLM creation and usage

* Improve stop words handling in CrewAgentExecutor

- Add support for handling existing stop words in LLM configuration
- Ensure stop words are correctly merged and deduplicated (see the sketch after this list)
- Update type hints to support both LLM and BaseLLM types
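
In sketch form, the merge rule (a standalone rendering of the executor change in the diff below; the function name is illustrative):

```python
def merge_stop_words(llm_stop, executor_stop):
    """Merge and deduplicate stop words, tolerating an unset llm.stop."""
    existing = llm_stop or []
    if isinstance(existing, list):
        return list(set(existing + executor_stop))
    return list(set(executor_stop))


# Duplicates collapse; both stop sequences survive the merge
print(sorted(merge_stop_words(["Observation:"], ["Observation:", "Final Answer:"])))
```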

* Remove abstract method set_callbacks from BaseLLM class

* Enhance CustomLLM and JWTAuthLLM initialization with model parameter

- Update CustomLLM to accept a model parameter during initialization (sketch after this list)
- Modify test cases to include the new model argument
- Ensure JWTAuthLLM and TimeoutHandlingLLM also utilize the model parameter in their constructors
- Update type hints in create_llm function to support both LLM and BaseLLM types
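
A minimal sketch of a conforming custom implementation after this change; the class name and canned response are illustrative, only the constructor and `call` interface come from this PR:

```python
from typing import Any, Dict, List, Optional, Union

from crewai.llms.base_llm import BaseLLM


class EchoLLM(BaseLLM):  # illustrative name
    def __init__(self, model: str, temperature: Optional[float] = None):
        # model is now a required constructor argument
        super().__init__(model=model, temperature=temperature)

    def call(
        self,
        messages: Union[str, List[Dict[str, str]]],
        tools: Optional[List[dict]] = None,
        callbacks: Optional[List[Any]] = None,
        available_functions: Optional[Dict[str, Any]] = None,
    ) -> Union[str, Any]:
        # A real implementation would call a model API here
        return f"[{self.model}] echo: {messages}"
```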

* Enhance create_llm function to support BaseLLM type

- Update the create_llm function to accept both LLM and BaseLLM instances (pass-through sketch below)
- Ensure compatibility with existing LLM handling logic
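
In sketch form, assuming the `EchoLLM` class from the earlier sketch:

```python
from crewai.utilities.llm_utils import create_llm

llm = create_llm("gpt-4")            # str -> constructs a default LLM
custom = EchoLLM(model="my-model")   # any BaseLLM subclass
assert create_llm(custom) is custom  # instances are returned unchanged
```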

* Update type hint for initialize_chat_llm to support BaseLLM

- Modify the return type of initialize_chat_llm function to allow for both LLM and BaseLLM instances
- Ensure compatibility with recent changes in create_llm function

* Refactor AISuiteLLM to include tools parameter in completion methods

- Update the _prepare_completion_params method to accept an optional tools parameter
- Modify the chat completion method to utilize the new tools parameter for enhanced functionality
- Clean up print statements for better code clarity

* Remove unused tool_calls handling in AISuiteLLM chat completion method for cleaner code.

* Refactor Crew class and LLM hierarchy for improved type handling and code clarity

- Update Crew class methods to enhance readability with consistent formatting and type hints.
- Change LLM class to inherit from BaseLLM for better structure.
- Remove unnecessary type checks and streamline tool handling in CrewAgentExecutor.
- Adjust BaseLLM to provide default implementations for stop words and context window size methods.
- Clean up AISuiteLLM by removing unused methods related to stop words and context window size.

* Remove unused `stream` method from `BaseLLM` class to enhance code clarity and maintainability.

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Joe Moura <joao@crewai.com>
Co-authored-by: Lorenze Jay <lorenzejaytech@gmail.com>
Co-authored-by: João Moura <joaomdmoura@gmail.com>
Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
Author: devin-ai-integration[bot]
Date: 2025-03-25 12:39:08 -04:00
Committed by: GitHub
Parent: 3dea3d0183
Commit: 807c13e144
16 changed files with 1671 additions and 61 deletions

src/crewai/__init__.py

@@ -5,6 +5,7 @@ from crewai.crew import Crew
from crewai.flow.flow import Flow
from crewai.knowledge.knowledge import Knowledge
from crewai.llm import LLM
from crewai.llms.base_llm import BaseLLM
from crewai.process import Process
from crewai.task import Task
@@ -21,6 +22,7 @@ __all__ = [
"Process",
"Task",
"LLM",
"BaseLLM",
"Flow",
"Knowledge",
]

src/crewai/agent.py

@@ -11,7 +11,7 @@ from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.knowledge.knowledge import Knowledge
from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
from crewai.knowledge.utils.knowledge_utils import extract_knowledge_context
from crewai.llm import LLM
from crewai.llm import BaseLLM
from crewai.memory.contextual.contextual_memory import ContextualMemory
from crewai.security import Fingerprint
from crewai.task import Task
@@ -71,10 +71,10 @@ class Agent(BaseAgent):
default=True,
description="Use system prompt for the agent.",
)
llm: Union[str, InstanceOf[LLM], Any] = Field(
llm: Union[str, InstanceOf[BaseLLM], Any] = Field(
description="Language model that will run the agent.", default=None
)
function_calling_llm: Optional[Union[str, InstanceOf[LLM], Any]] = Field(
function_calling_llm: Optional[Union[str, InstanceOf[BaseLLM], Any]] = Field(
description="Language model that will run the agent.", default=None
)
system_template: Optional[str] = Field(
@@ -118,7 +118,9 @@ class Agent(BaseAgent):
self.agent_ops_agent_name = self.role
self.llm = create_llm(self.llm)
if self.function_calling_llm and not isinstance(self.function_calling_llm, LLM):
if self.function_calling_llm and not isinstance(
self.function_calling_llm, BaseLLM
):
self.function_calling_llm = create_llm(self.function_calling_llm)
if not self.agent_executor:

src/crewai/agents/crew_agent_executor.py

@@ -13,7 +13,7 @@ from crewai.agents.parser import (
OutputParserException,
)
from crewai.agents.tools_handler import ToolsHandler
from crewai.llm import LLM
from crewai.llm import BaseLLM
from crewai.tools.base_tool import BaseTool
from crewai.tools.tool_usage import ToolUsage, ToolUsageErrorException
from crewai.utilities import I18N, Printer
@@ -61,7 +61,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
callbacks: List[Any] = [],
):
self._i18n: I18N = I18N()
self.llm: LLM = llm
self.llm: BaseLLM = llm
self.task = task
self.agent = agent
self.crew = crew
@@ -87,8 +87,14 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
self.tool_name_to_tool_map: Dict[str, BaseTool] = {
tool.name: tool for tool in self.tools
}
self.stop = stop_words
self.llm.stop = list(set(self.llm.stop + self.stop))
existing_stop = self.llm.stop or []
self.llm.stop = list(
set(
existing_stop + self.stop
if isinstance(existing_stop, list)
else self.stop
)
)
def invoke(self, inputs: Dict[str, str]) -> Dict[str, Any]:
if "system" in self.prompt:

src/crewai/cli/crew_chat.py

@@ -14,7 +14,7 @@ from packaging import version
from crewai.cli.utils import read_toml
from crewai.cli.version import get_crewai_version
from crewai.crew import Crew
from crewai.llm import LLM
from crewai.llm import LLM, BaseLLM
from crewai.types.crew_chat import ChatInputField, ChatInputs
from crewai.utilities.llm_utils import create_llm
@@ -116,7 +116,7 @@ def show_loading(event: threading.Event):
print()
def initialize_chat_llm(crew: Crew) -> Optional[LLM]:
def initialize_chat_llm(crew: Crew) -> Optional[LLM | BaseLLM]:
"""Initializes the chat LLM and handles exceptions."""
try:
return create_llm(crew.chat_llm)

src/crewai/crew.py

@@ -6,8 +6,9 @@ import warnings
from concurrent.futures import Future
from copy import copy as shallow_copy
from hashlib import md5
from typing import Any, Callable, Dict, List, Optional, Set, Tuple, Union
from typing import Any, Callable, Dict, List, Optional, Set, Tuple, TypeVar, Union, cast
from langchain_core.tools import BaseTool as LangchainBaseTool
from pydantic import (
UUID4,
BaseModel,
@@ -26,7 +27,7 @@ from crewai.agents.cache import CacheHandler
from crewai.crews.crew_output import CrewOutput
from crewai.knowledge.knowledge import Knowledge
from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
from crewai.llm import LLM
from crewai.llm import LLM, BaseLLM
from crewai.memory.entity.entity_memory import EntityMemory
from crewai.memory.long_term.long_term_memory import LongTermMemory
from crewai.memory.short_term.short_term_memory import ShortTermMemory
@@ -37,7 +38,7 @@ from crewai.task import Task
from crewai.tasks.conditional_task import ConditionalTask
from crewai.tasks.task_output import TaskOutput
from crewai.tools.agent_tools.agent_tools import AgentTools
from crewai.tools.base_tool import Tool
from crewai.tools.base_tool import BaseTool, Tool
from crewai.types.usage_metrics import UsageMetrics
from crewai.utilities import I18N, FileHandler, Logger, RPMController
from crewai.utilities.constants import TRAINING_DATA_FILE
@@ -153,7 +154,7 @@ class Crew(BaseModel):
default=None,
description="Metrics for the LLM usage during all tasks execution.",
)
manager_llm: Optional[Any] = Field(
manager_llm: Optional[Union[str, InstanceOf[BaseLLM], Any]] = Field(
description="Language model that will run the agent.", default=None
)
manager_agent: Optional[BaseAgent] = Field(
@@ -187,7 +188,7 @@ class Crew(BaseModel):
default=None,
description="Maximum number of requests per minute for the crew execution to be respected.",
)
prompt_file: str = Field(
prompt_file: Optional[str] = Field(
default=None,
description="Path to the prompt json file to be used for the crew.",
)
@@ -199,7 +200,7 @@ class Crew(BaseModel):
default=False,
description="Plan the crew execution and add the plan to the crew.",
)
planning_llm: Optional[Any] = Field(
planning_llm: Optional[Union[str, InstanceOf[BaseLLM], Any]] = Field(
default=None,
description="Language model that will run the AgentPlanner if planning is True.",
)
@@ -215,7 +216,7 @@ class Crew(BaseModel):
default=None,
description="Knowledge sources for the crew. Add knowledge sources to the knowledge object.",
)
chat_llm: Optional[Any] = Field(
chat_llm: Optional[Union[str, InstanceOf[BaseLLM], Any]] = Field(
default=None,
description="LLM used to handle chatting with the crew.",
)
@@ -819,7 +820,12 @@ class Crew(BaseModel):
# Determine which tools to use - task tools take precedence over agent tools
tools_for_task = task.tools or agent_to_use.tools or []
tools_for_task = self._prepare_tools(agent_to_use, task, tools_for_task)
# Prepare tools and ensure they're compatible with task execution
tools_for_task = self._prepare_tools(
agent_to_use,
task,
cast(Union[List[Tool], List[BaseTool]], tools_for_task),
)
self._log_task_start(task, agent_to_use.role)
@@ -838,7 +844,7 @@ class Crew(BaseModel):
future = task.execute_async(
agent=agent_to_use,
context=context,
tools=tools_for_task,
tools=cast(List[BaseTool], tools_for_task),
)
futures.append((task, future, task_index))
else:
@@ -850,7 +856,7 @@ class Crew(BaseModel):
task_output = task.execute_sync(
agent=agent_to_use,
context=context,
tools=tools_for_task,
tools=cast(List[BaseTool], tools_for_task),
)
task_outputs.append(task_output)
self._process_task_result(task, task_output)
@@ -888,10 +894,12 @@ class Crew(BaseModel):
return None
def _prepare_tools(
self, agent: BaseAgent, task: Task, tools: List[Tool]
) -> List[Tool]:
self, agent: BaseAgent, task: Task, tools: Union[List[Tool], List[BaseTool]]
) -> List[BaseTool]:
# Add delegation tools if agent allows delegation
if agent.allow_delegation:
if hasattr(agent, "allow_delegation") and getattr(
agent, "allow_delegation", False
):
if self.process == Process.hierarchical:
if self.manager_agent:
tools = self._update_manager_tools(task, tools)
@@ -900,17 +908,24 @@ class Crew(BaseModel):
"Manager agent is required for hierarchical process."
)
elif agent and agent.allow_delegation:
elif agent:
tools = self._add_delegation_tools(task, tools)
# Add code execution tools if agent allows code execution
if agent.allow_code_execution:
if hasattr(agent, "allow_code_execution") and getattr(
agent, "allow_code_execution", False
):
tools = self._add_code_execution_tools(agent, tools)
if agent and agent.multimodal:
if (
agent
and hasattr(agent, "multimodal")
and getattr(agent, "multimodal", False)
):
tools = self._add_multimodal_tools(agent, tools)
return tools
# Return a List[BaseTool] which is compatible with both Task.execute_sync and Task.execute_async
return cast(List[BaseTool], tools)
def _get_agent_to_use(self, task: Task) -> Optional[BaseAgent]:
if self.process == Process.hierarchical:
@@ -918,11 +933,13 @@ class Crew(BaseModel):
return task.agent
def _merge_tools(
self, existing_tools: List[Tool], new_tools: List[Tool]
) -> List[Tool]:
self,
existing_tools: Union[List[Tool], List[BaseTool]],
new_tools: Union[List[Tool], List[BaseTool]],
) -> List[BaseTool]:
"""Merge new tools into existing tools list, avoiding duplicates by tool name."""
if not new_tools:
return existing_tools
return cast(List[BaseTool], existing_tools)
# Create mapping of tool names to new tools
new_tool_map = {tool.name: tool for tool in new_tools}
@@ -933,23 +950,41 @@ class Crew(BaseModel):
# Add all new tools
tools.extend(new_tools)
return tools
return cast(List[BaseTool], tools)
def _inject_delegation_tools(
self, tools: List[Tool], task_agent: BaseAgent, agents: List[BaseAgent]
):
delegation_tools = task_agent.get_delegation_tools(agents)
return self._merge_tools(tools, delegation_tools)
self,
tools: Union[List[Tool], List[BaseTool]],
task_agent: BaseAgent,
agents: List[BaseAgent],
) -> List[BaseTool]:
if hasattr(task_agent, "get_delegation_tools"):
delegation_tools = task_agent.get_delegation_tools(agents)
# Cast delegation_tools to the expected type for _merge_tools
return self._merge_tools(tools, cast(List[BaseTool], delegation_tools))
return cast(List[BaseTool], tools)
def _add_multimodal_tools(self, agent: BaseAgent, tools: List[Tool]):
multimodal_tools = agent.get_multimodal_tools()
return self._merge_tools(tools, multimodal_tools)
def _add_multimodal_tools(
self, agent: BaseAgent, tools: Union[List[Tool], List[BaseTool]]
) -> List[BaseTool]:
if hasattr(agent, "get_multimodal_tools"):
multimodal_tools = agent.get_multimodal_tools()
# Cast multimodal_tools to the expected type for _merge_tools
return self._merge_tools(tools, cast(List[BaseTool], multimodal_tools))
return cast(List[BaseTool], tools)
def _add_code_execution_tools(self, agent: BaseAgent, tools: List[Tool]):
code_tools = agent.get_code_execution_tools()
return self._merge_tools(tools, code_tools)
def _add_code_execution_tools(
self, agent: BaseAgent, tools: Union[List[Tool], List[BaseTool]]
) -> List[BaseTool]:
if hasattr(agent, "get_code_execution_tools"):
code_tools = agent.get_code_execution_tools()
# Cast code_tools to the expected type for _merge_tools
return self._merge_tools(tools, cast(List[BaseTool], code_tools))
return cast(List[BaseTool], tools)
def _add_delegation_tools(self, task: Task, tools: List[Tool]):
def _add_delegation_tools(
self, task: Task, tools: Union[List[Tool], List[BaseTool]]
) -> List[BaseTool]:
agents_for_delegation = [agent for agent in self.agents if agent != task.agent]
if len(self.agents) > 1 and len(agents_for_delegation) > 0 and task.agent:
if not tools:
@@ -957,7 +992,7 @@ class Crew(BaseModel):
tools = self._inject_delegation_tools(
tools, task.agent, agents_for_delegation
)
return tools
return cast(List[BaseTool], tools)
def _log_task_start(self, task: Task, role: str = "None"):
if self.output_log_file:
@@ -965,7 +1000,9 @@ class Crew(BaseModel):
task_name=task.name, task=task.description, agent=role, status="started"
)
def _update_manager_tools(self, task: Task, tools: List[Tool]):
def _update_manager_tools(
self, task: Task, tools: Union[List[Tool], List[BaseTool]]
) -> List[BaseTool]:
if self.manager_agent:
if task.agent:
tools = self._inject_delegation_tools(tools, task.agent, [task.agent])
@@ -973,7 +1010,7 @@ class Crew(BaseModel):
tools = self._inject_delegation_tools(
tools, self.manager_agent, self.agents
)
return tools
return cast(List[BaseTool], tools)
def _get_context(self, task: Task, task_outputs: List[TaskOutput]):
context = (
@@ -1214,13 +1251,14 @@ class Crew(BaseModel):
def test(
self,
n_iterations: int,
eval_llm: Union[str, InstanceOf[LLM]],
eval_llm: Union[str, InstanceOf[BaseLLM]],
inputs: Optional[Dict[str, Any]] = None,
) -> None:
"""Test and evaluate the Crew with the given inputs for n iterations concurrently using concurrent.futures."""
try:
eval_llm = create_llm(eval_llm)
if not eval_llm:
# Create LLM instance and ensure it's of type LLM for CrewEvaluator
llm_instance = create_llm(eval_llm)
if not llm_instance:
raise ValueError("Failed to create LLM instance.")
crewai_event_bus.emit(
@@ -1228,12 +1266,12 @@ class Crew(BaseModel):
CrewTestStartedEvent(
crew_name=self.name or "crew",
n_iterations=n_iterations,
eval_llm=eval_llm,
eval_llm=llm_instance,
inputs=inputs,
),
)
test_crew = self.copy()
evaluator = CrewEvaluator(test_crew, eval_llm) # type: ignore[arg-type]
evaluator = CrewEvaluator(test_crew, llm_instance)
for i in range(1, n_iterations + 1):
evaluator.set_iteration(i)

src/crewai/llm.py

@@ -40,6 +40,7 @@ with warnings.catch_warnings():
from litellm.utils import supports_response_schema
from crewai.llms.base_llm import BaseLLM
from crewai.utilities.events import crewai_event_bus
from crewai.utilities.exceptions.context_window_exceeding_exception import (
LLMContextLengthExceededException,
@@ -218,7 +219,7 @@ class StreamingChoices(TypedDict):
finish_reason: Optional[str]
class LLM:
class LLM(BaseLLM):
def __init__(
self,
model: str,

src/crewai/llms/base_llm.py

@@ -0,0 +1,91 @@
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Optional, Union
class BaseLLM(ABC):
"""Abstract base class for LLM implementations.
This class defines the interface that all LLM implementations must follow.
Users can extend this class to create custom LLM implementations that don't
rely on litellm's authentication mechanism.
Custom LLM implementations should handle error cases gracefully, including
timeouts, authentication failures, and malformed responses. They should also
implement proper validation for input parameters and provide clear error
messages when things go wrong.
Attributes:
stop (list): A list of stop sequences that the LLM should use to stop generation.
This is used by the CrewAgentExecutor and other components.
"""
model: str
temperature: Optional[float] = None
stop: Optional[List[str]] = None
def __init__(
self,
model: str,
temperature: Optional[float] = None,
):
"""Initialize the BaseLLM with default attributes.
This constructor sets default values for attributes that are expected
by the CrewAgentExecutor and other components.
All custom LLM implementations should call super().__init__() to ensure
that these default attributes are properly initialized.
"""
self.model = model
self.temperature = temperature
self.stop = []
@abstractmethod
def call(
self,
messages: Union[str, List[Dict[str, str]]],
tools: Optional[List[dict]] = None,
callbacks: Optional[List[Any]] = None,
available_functions: Optional[Dict[str, Any]] = None,
) -> Union[str, Any]:
"""Call the LLM with the given messages.
Args:
messages: Input messages for the LLM.
Can be a string or list of message dictionaries.
If string, it will be converted to a single user message.
If list, each dict must have 'role' and 'content' keys.
tools: Optional list of tool schemas for function calling.
Each tool should define its name, description, and parameters.
callbacks: Optional list of callback functions to be executed
during and after the LLM call.
available_functions: Optional dict mapping function names to callables
that can be invoked by the LLM.
Returns:
Either a text response from the LLM (str) or
the result of a tool function call (Any).
Raises:
ValueError: If the messages format is invalid.
TimeoutError: If the LLM request times out.
RuntimeError: If the LLM request fails for other reasons.
"""
pass
def supports_stop_words(self) -> bool:
"""Check if the LLM supports stop words.
Returns:
bool: True if the LLM supports stop words, False otherwise.
"""
return True # Default implementation assumes support for stop words
def get_context_window_size(self) -> int:
"""Get the context window size for the LLM.
Returns:
int: The number of tokens/characters the model can handle.
"""
# Default implementation - subclasses should override with model-specific values
return 4096
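
For review context: a subclass normally overrides only `call`, inheriting the stop-word and context-window defaults above. Overriding the window size might look like this (illustrative class and values):

```python
class LargeContextLLM(BaseLLM):  # illustrative subclass
    def call(self, messages, tools=None, callbacks=None, available_functions=None):
        raise NotImplementedError("model call elided in this sketch")

    def get_context_window_size(self) -> int:
        # Replace the 4096-token default with a model-specific value
        return 128_000
```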

src/crewai/llms/third_party/ai_suite.py (new file, 38 lines, vendored)

@@ -0,0 +1,38 @@
from typing import Any, Dict, List, Optional, Union
import aisuite as ai
from crewai.llms.base_llm import BaseLLM
class AISuiteLLM(BaseLLM):
def __init__(self, model: str, temperature: Optional[float] = None, **kwargs):
super().__init__(model, temperature, **kwargs)
self.client = ai.Client()
def call(
self,
messages: Union[str, List[Dict[str, str]]],
tools: Optional[List[dict]] = None,
callbacks: Optional[List[Any]] = None,
available_functions: Optional[Dict[str, Any]] = None,
) -> Union[str, Any]:
completion_params = self._prepare_completion_params(messages, tools)
response = self.client.chat.completions.create(**completion_params)
return response.choices[0].message.content
def _prepare_completion_params(
self,
messages: Union[str, List[Dict[str, str]]],
tools: Optional[List[dict]] = None,
) -> Dict[str, Any]:
return {
"model": self.model,
"messages": messages,
"temperature": self.temperature,
"tools": tools,
}
def supports_function_calling(self) -> bool:
return False

src/crewai/utilities/evaluators/crew_evaluator_handler.py

@@ -6,7 +6,7 @@ from rich.console import Console
from rich.table import Table
from crewai.agent import Agent
from crewai.llm import LLM
from crewai.llm import BaseLLM
from crewai.task import Task
from crewai.tasks.task_output import TaskOutput
from crewai.telemetry import Telemetry
@@ -24,7 +24,7 @@ class CrewEvaluator:
Attributes:
crew (Crew): The crew of agents to evaluate.
eval_llm (LLM): Language model instance to use for evaluations
eval_llm (BaseLLM): Language model instance to use for evaluations
tasks_scores (defaultdict): A dictionary to store the scores of the agents for each task.
iteration (int): The current iteration of the evaluation.
"""
@@ -33,7 +33,7 @@ class CrewEvaluator:
run_execution_times: defaultdict = defaultdict(list)
iteration: int = 0
def __init__(self, crew, eval_llm: InstanceOf[LLM]):
def __init__(self, crew, eval_llm: InstanceOf[BaseLLM]):
self.crew = crew
self.llm = eval_llm
self._telemetry = Telemetry()

src/crewai/utilities/llm_utils.py

@@ -2,28 +2,28 @@ import os
from typing import Any, Dict, List, Optional, Union
from crewai.cli.constants import DEFAULT_LLM_MODEL, ENV_VARS, LITELLM_PARAMS
from crewai.llm import LLM
from crewai.llm import LLM, BaseLLM
def create_llm(
llm_value: Union[str, LLM, Any, None] = None,
) -> Optional[LLM]:
) -> Optional[LLM | BaseLLM]:
"""
Creates or returns an LLM instance based on the given llm_value.
Args:
llm_value (str | LLM | Any | None):
llm_value (str | BaseLLM | Any | None):
- str: The model name (e.g., "gpt-4").
- LLM: Already instantiated LLM, returned as-is.
- BaseLLM: Already instantiated BaseLLM (including LLM), returned as-is.
- Any: Attempt to extract known attributes like model_name, temperature, etc.
- None: Use environment-based or fallback default model.
Returns:
An LLM instance if successful, or None if something fails.
A BaseLLM instance if successful, or None if something fails.
"""
# 1) If llm_value is already an LLM object, return it directly
if isinstance(llm_value, LLM):
# 1) If llm_value is already a BaseLLM or LLM object, return it directly
if isinstance(llm_value, LLM) or isinstance(llm_value, BaseLLM):
return llm_value
# 2) If llm_value is a string (model name)