Compare commits

..

10 Commits

Author SHA1 Message Date
Devin AI
5623e2c851 Fix type error and test issues
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-04-25 21:03:36 +00:00
Devin AI
50059c7120 Fix import sorting with ruff
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-04-25 20:58:24 +00:00
Devin AI
335f1dfdf8 Fix lint and type errors
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-04-25 20:56:22 +00:00
Devin AI
1f2def2cbe Fix Vertex AI embeddings URL typo (publishers/goole -> publishers/google)
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-04-25 20:51:36 +00:00
Lucas Gomide
b2969e9441 style: fix linter issue (#2686)
2025-04-25 09:34:00 -04:00
João Moura
5b9606e8b6 fix context window 2025-04-24 23:09:23 -07:00
Kunal Lunia
685d20f46c added gpt-4.1 models and gemini-2.0 and 2.5 pro models (#2609)
* added gpt4.1 models and gemini 2.0 and 2.5 models

* added flash model

* Updated test fun to all models

* Added Gemma3 test cases and passed all google test case

* added gemini 2.5 flash

* test: add missing cassettes

* test: ignore authorization key from gemini/gemma3 request

---------

Co-authored-by: Lucas Gomide <lucaslg200@gmail.com>
Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>
2025-04-23 11:20:32 -07:00
Lucas Gomide
9ebf3aa043 docs(CodeInterpreterTool): update docs (#2675) 2025-04-23 10:27:25 -07:00
Tony Kipkemboi
2e4c97661a Add enterprise deployment documentation to CLI docs (#2670)
2025-04-22 13:27:58 -07:00
Tony Kipkemboi
16eb4df556 docs: update docs.json with contextual options, SEO, and 404 redirect (#2654)
* docs: 0.114.0 release notes, navigation restructure, new guides, deploy video, and cleanup

- Add v0.114.0 release notes with highlights image and doc links
- Restructure docs navigation (Strategy group, Releases tab, navbar links)
- Update quickstart with deployment video and clearer instructions
- Add/rename guides (Custom Manager Agent, Custom LLM)
- Remove legacy concept/tool docs
- Add new images and tool docs
- Minor formatting and content improvements throughout

* docs: update docs.json with contextual options, SEO indexing, and 404 redirect settings
2025-04-22 09:52:27 -07:00
25 changed files with 1262 additions and 123 deletions

View File

@@ -179,7 +179,78 @@ def crew(self) -> Crew:
```
</Note>
### 10. API Keys
### 10. Deploy
Deploy the crew or flow to [CrewAI Enterprise](https://app.crewai.com).
- **Authentication**: You need to be authenticated to deploy to CrewAI Enterprise.
```shell Terminal
crewai signup
```
If you already have an account, you can log in with:
```shell Terminal
crewai login
```
- **Create a deployment**: Once you are authenticated, you can create a deployment for your crew or flow from the root of your local project.
```shell Terminal
crewai deploy create
```
- Reads your local project configuration.
- Prompts you to confirm the environment variables (like `OPENAI_API_KEY`, `SERPER_API_KEY`) found locally. These will be securely stored with the deployment on the Enterprise platform. Ensure your sensitive keys are correctly configured locally (e.g., in a `.env` file) before running this.
- Links the deployment to the corresponding remote GitHub repository (it usually detects this automatically).
- **Deploy the Crew**: Once the deployment is created, you can push your crew or flow to CrewAI Enterprise.
```shell Terminal
crewai deploy push
```
- Initiates the deployment process on the CrewAI Enterprise platform.
- Upon successful initiation, it will output the `Deployment created successfully!` message along with the Deployment Name and a unique Deployment ID (UUID).
- **Deployment Status**: You can check the status of your deployment with:
```shell Terminal
crewai deploy status
```
This fetches the latest deployment status of your most recent deployment attempt (e.g., `Building Images for Crew`, `Deploy Enqueued`, `Online`).
- **Deployment Logs**: You can check the logs of your deployment with:
```shell Terminal
crewai deploy logs
```
This streams the deployment logs to your terminal.
- **List deployments**: You can list all your deployments with:
```shell Terminal
crewai deploy list
```
This lists all your deployments.
- **Delete a deployment**: You can delete a deployment with:
```shell Terminal
crewai deploy remove
```
This deletes the deployment from the CrewAI Enterprise platform.
- **Help Command**: You can view help for the deploy commands with:
```shell Terminal
crewai deploy --help
```
This shows the help message for the CrewAI Deploy CLI.
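Taken together, a typical first deployment runs the commands above in order (a summary sketch of the steps already documented; output and timing will vary):
```shell Terminal
crewai login          # or `crewai signup` for a new account
crewai deploy create  # register the deployment and confirm env vars
crewai deploy push    # start the build on CrewAI Enterprise
crewai deploy status  # poll until the status reads "Online"
crewai deploy logs    # stream logs if anything looks wrong
```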
Watch this video tutorial for a step-by-step demonstration of deploying your crew to [CrewAI Enterprise](https://app.crewai.com) using the CLI.
<iframe
width="100%"
height="400"
src="https://www.youtube.com/embed/3EqSV-CYDZA"
title="CrewAI Deployment Guide"
frameBorder="0"
style={{ borderRadius: '10px' }}
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
></iframe>
### 11. API Keys
When running the `crewai create crew` command, the CLI will first show you the top 5 most common LLM providers and ask you to select one.

View File

@@ -8,6 +8,9 @@
"dark": "#C94C3C"
},
"favicon": "favicon.svg",
"contextual": {
"options": ["copy", "view", "chatgpt", "claude"]
},
"navigation": {
"tabs": [
{
@@ -244,7 +247,12 @@
"prompt": "Search CrewAI docs"
},
"seo": {
"indexing": "navigable"
"indexing": "all"
},
"errors": {
"404": {
"redirect": true
}
},
"footer": {
"socials": {

View File

@@ -8,11 +8,29 @@ icon: code-simple
## Description
The `CodeInterpreterTool` enables CrewAI agents to execute Python 3 code that they generate autonomously. The code is run in a secure, isolated Docker container, ensuring safety regardless of the content. This functionality is particularly valuable as it allows agents to create code, execute it, obtain the results, and utilize that information to inform subsequent decisions and actions.
The `CodeInterpreterTool` enables CrewAI agents to execute Python 3 code that they generate autonomously. This functionality is particularly valuable as it allows agents to create code, execute it, obtain the results, and utilize that information to inform subsequent decisions and actions.
## Requirements
There are several ways to use this tool:
### Docker Container (Recommended)
This is the primary option. The code runs in a secure, isolated Docker container, ensuring safety regardless of its content.
Make sure Docker is installed and running on your system. If you don't have it, you can install it from [here](https://docs.docker.com/get-docker/).
### Sandbox environment
If Docker is unavailable (either not installed or not accessible), the code is executed in a restricted Python environment called a sandbox.
This environment is very limited, with strict restrictions on many modules and built-in functions.
### Unsafe Execution
**NOT RECOMMENDED FOR PRODUCTION**
This mode allows execution of any Python code, including dangerous calls to modules such as `sys` and `os`. See [how to enable this mode](/tools/codeinterpretertool#enabling-unsafe-mode).
## Logging
The `CodeInterpreterTool` logs the selected execution strategy to STDOUT.
- Docker must be installed and running on your system. If you don't have it, you can install it from [here](https://docs.docker.com/get-docker/).
## Installation
@@ -74,18 +92,32 @@ programmer_agent = Agent(
)
```
### Enabling `unsafe_mode`
```python Code
from crewai_tools import CodeInterpreterTool
code = """
import os
os.system("ls -la")
"""
CodeInterpreterTool(unsafe_mode=True).run(code=code)
```
## Parameters
The `CodeInterpreterTool` accepts the following parameters during initialization:
- **user_dockerfile_path**: Optional. Path to a custom Dockerfile to use for the code interpreter container.
- **user_docker_base_url**: Optional. URL to the Docker daemon to use for running the container.
- **unsafe_mode**: Optional. Whether to run code directly on the host machine instead of in a Docker container. Default is `False`. Use with caution!
- **unsafe_mode**: Optional. Whether to run code directly on the host machine instead of in a Docker container or sandbox. Default is `False`. Use with caution!
- **default_image_tag**: Optional. Default Docker image tag. Default is `code-interpreter:latest`.
When using the tool with an agent, the agent will need to provide:
- **code**: Required. The Python 3 code to execute.
- **libraries_used**: Required. A list of libraries used in the code that need to be installed.
- **libraries_used**: Optional. A list of libraries used in the code that need to be installed. Default is `[]`.
## Agent Integration Example
@@ -152,7 +184,7 @@ class CodeInterpreterTool(BaseTool):
if self.unsafe_mode:
return self.run_code_unsafe(code, libraries_used)
else:
return self.run_code_in_docker(code, libraries_used)
return self.run_code_safety(code, libraries_used)
```
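The `run_code_safety` branch itself is not shown in this diff; below is a hypothetical sketch of the Docker-then-sandbox fallback it stands for. The Docker probe and the helper names other than `run_code_in_docker`/`run_code_safety` are assumptions, not the tool's actual API:
```python Code
import shutil


def docker_available() -> bool:
    # Assumption: treat a missing `docker` binary as "Docker unavailable".
    return shutil.which("docker") is not None


def run_code_in_docker(code: str, libraries_used: list[str]) -> str:
    # Real tool: builds and runs an isolated container (omitted here).
    return "<docker result>"


def run_code_in_sandbox(code: str) -> str:
    # Real tool: restricted execution with many modules/builtins blocked.
    return "<sandbox result>"


def run_code_safety(code: str, libraries_used: list[str]) -> str:
    # Prefer Docker; fall back to the restricted sandbox. The chosen
    # strategy is logged to STDOUT, as the updated docs note.
    if docker_available():
        print("Running code in Docker container")
        return run_code_in_docker(code, libraries_used)
    print("Docker unavailable; falling back to sandbox")
    return run_code_in_sandbox(code)
```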
The tool performs the following steps:
@@ -168,8 +200,9 @@ The tool performs the following steps:
By default, the `CodeInterpreterTool` runs code in an isolated Docker container, which provides a layer of security. However, there are still some security considerations to keep in mind:
1. The Docker container has access to the current working directory, so sensitive files could potentially be accessed.
2. The `unsafe_mode` parameter allows code to be executed directly on the host machine, which should only be used in trusted environments.
3. Be cautious when allowing agents to install arbitrary libraries, as they could potentially include malicious code.
2. If the Docker container is unavailable and the code needs to run safely, it will be executed in a sandbox environment. For security reasons, installing arbitrary libraries is not allowed.
3. The `unsafe_mode` parameter allows code to be executed directly on the host machine, which should only be used in trusted environments.
4. Be cautious when allowing agents to install arbitrary libraries, as they could potentially include malicious code.
## Conclusion

View File

@@ -122,7 +122,16 @@ PROVIDERS = [
]
MODELS = {
"openai": ["gpt-4", "gpt-4o", "gpt-4o-mini", "o1-mini", "o1-preview"],
"openai": [
"gpt-4",
"gpt-4.1",
"gpt-4.1-mini-2025-04-14",
"gpt-4.1-nano-2025-04-14",
"gpt-4o",
"gpt-4o-mini",
"o1-mini",
"o1-preview",
],
"anthropic": [
"claude-3-5-sonnet-20240620",
"claude-3-sonnet-20240229",
@@ -132,8 +141,17 @@ MODELS = {
"gemini": [
"gemini/gemini-1.5-flash",
"gemini/gemini-1.5-pro",
"gemini/gemini-2.0-flash-lite-001",
"gemini/gemini-2.0-flash-001",
"gemini/gemini-2.0-flash-thinking-exp-01-21",
"gemini/gemini-2.5-flash-preview-04-17",
"gemini/gemini-2.5-pro-exp-03-25",
"gemini/gemini-gemma-2-9b-it",
"gemini/gemini-gemma-2-27b-it",
"gemini/gemma-3-1b-it",
"gemini/gemma-3-4b-it",
"gemini/gemma-3-12b-it",
"gemini/gemma-3-27b-it",
],
"nvidia_nim": [
"nvidia_nim/nvidia/mistral-nemo-minitron-8b-8k-instruct",

View File

@@ -37,6 +37,7 @@ with warnings.catch_warnings():
warnings.simplefilter("ignore", UserWarning)
import litellm
from litellm import Choices
from litellm.exceptions import ContextWindowExceededError
from litellm.litellm_core_utils.get_supported_openai_params import (
get_supported_openai_params,
)
@@ -81,14 +82,26 @@ LLM_CONTEXT_WINDOW_SIZES = {
"gpt-4o": 128000,
"gpt-4o-mini": 128000,
"gpt-4-turbo": 128000,
"gpt-4.1": 1047576, # Based on official docs
"gpt-4.1-mini-2025-04-14": 1047576,
"gpt-4.1-nano-2025-04-14": 1047576,
"o1-preview": 128000,
"o1-mini": 128000,
"o3-mini": 200000, # Based on official o3-mini specifications
# gemini
"gemini-2.0-flash": 1048576,
"gemini-2.0-flash-thinking-exp-01-21": 32768,
"gemini-2.0-flash-lite-001": 1048576,
"gemini-2.0-flash-001": 1048576,
"gemini-2.5-flash-preview-04-17": 1048576,
"gemini-2.5-pro-exp-03-25": 1048576,
"gemini-1.5-pro": 2097152,
"gemini-1.5-flash": 1048576,
"gemini-1.5-flash-8b": 1048576,
"gemini/gemma-3-1b-it": 32000,
"gemini/gemma-3-4b-it": 128000,
"gemini/gemma-3-12b-it": 128000,
"gemini/gemma-3-27b-it": 128000,
# deepseek
"deepseek-chat": 128000,
# groq
@@ -585,6 +598,11 @@ class LLM(BaseLLM):
self._handle_emit_call_events(full_response, LLMCallType.LLM_CALL)
return full_response
except ContextWindowExceededError as e:
# Catch context window errors from litellm and convert them to our own exception type.
# This exception is handled by CrewAgentExecutor._invoke_loop() which can then
# decide whether to summarize the content or abort based on the respect_context_window flag.
raise LLMContextLengthExceededException(str(e))
except Exception as e:
logging.error(f"Error in streaming response: {str(e)}")
if full_response.strip():
@@ -699,7 +717,16 @@ class LLM(BaseLLM):
str: The response text
"""
# --- 1) Make the completion call
response = litellm.completion(**params)
try:
# Attempt to make the completion call, but catch context window errors
# and convert them to our own exception type for consistent handling
# across the codebase. This allows CrewAgentExecutor to handle context
# length issues appropriately.
response = litellm.completion(**params)
except ContextWindowExceededError as e:
# Convert litellm's context window error to our own exception type
# for consistent handling in the rest of the codebase
raise LLMContextLengthExceededException(str(e))
# --- 2) Extract response message and content
response_message = cast(Choices, cast(ModelResponse, response).choices)[
@@ -858,15 +885,17 @@ class LLM(BaseLLM):
params, callbacks, available_functions
)
except LLMContextLengthExceededException:
# Re-raise LLMContextLengthExceededException as it should be handled
# by the CrewAgentExecutor._invoke_loop method, which can then decide
# whether to summarize the content or abort based on the respect_context_window flag
raise
except Exception as e:
crewai_event_bus.emit(
self,
event=LLMCallFailedEvent(error=str(e)),
)
if not LLMContextLengthExceededException(
str(e)
)._is_context_limit_error(str(e)):
logging.error(f"LiteLLM call failed: {str(e)}")
logging.error(f"LiteLLM call failed: {str(e)}")
raise
def _handle_emit_call_events(self, response: Any, call_type: LLMCallType):
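Distilled, the change above is a narrow try/except that normalizes litellm's error into CrewAI's own exception type. A sketch of the pattern; `completion_with_context_guard` is an illustrative name, not a function in this diff:
```python
import litellm
from litellm.exceptions import ContextWindowExceededError

from crewai.utilities.exceptions.context_window_exceeding_exception import (
    LLMContextLengthExceededException,
)


def completion_with_context_guard(**params):
    try:
        return litellm.completion(**params)
    except ContextWindowExceededError as e:
        # Convert to CrewAI's exception so CrewAgentExecutor._invoke_loop
        # can summarize or abort based on respect_context_window.
        raise LLMContextLengthExceededException(str(e))
```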

View File

@@ -75,7 +75,6 @@ class ToolUsage:
agent: Optional[Union["BaseAgent", "LiteAgent"]] = None,
action: Any = None,
fingerprint_context: Optional[Dict[str, str]] = None,
original_tools: List[Any] = [],
) -> None:
self._i18n: I18N = agent.i18n if agent else I18N()
self._printer: Printer = Printer()
@@ -87,7 +86,6 @@ class ToolUsage:
self.tools_description = render_text_description_and_args(tools)
self.tools_names = get_tool_names(tools)
self.tools_handler = tools_handler
self.original_tools = original_tools
self.tools = tools
self.task = task
self.action = action
@@ -193,16 +191,13 @@ class ToolUsage:
) # type: ignore
from_cache = result is not None
original_tool = None
if hasattr(self, 'original_tools') and self.original_tools:
original_tool = next(
(ot for ot in self.original_tools if ot.name == tool.name),
None
)
available_tool = next(
(at for at in self.tools if at.name == tool.name),
None
(
available_tool
for available_tool in self.tools
if available_tool.name == tool.name
),
None,
)
if result is None:
@@ -264,11 +259,10 @@ class ToolUsage:
if self.tools_handler:
should_cache = True
if original_tool and hasattr(original_tool, "cache_function") and original_tool.cache_function:
should_cache = original_tool.cache_function(
calling.arguments, result
)
elif available_tool and hasattr(available_tool, "cache_function") and available_tool.cache_function:
if (
hasattr(available_tool, "cache_function")
and available_tool.cache_function # type: ignore # Item "None" of "Any | None" has no attribute "cache_function"
):
should_cache = available_tool.cache_function( # type: ignore # Item "None" of "Any | None" has no attribute "cache_function"
calling.arguments, result
)
@@ -296,10 +290,10 @@ class ToolUsage:
result=result,
)
if original_tool and hasattr(original_tool, "result_as_answer") and original_tool.result_as_answer:
result_as_answer = original_tool.result_as_answer
data["result_as_answer"] = result_as_answer
elif available_tool and hasattr(available_tool, "result_as_answer") and available_tool.result_as_answer:
if (
hasattr(available_tool, "result_as_answer")
and available_tool.result_as_answer # type: ignore # Item "None" of "Any | None" has no attribute "cache_function"
):
result_as_answer = available_tool.result_as_answer # type: ignore # Item "None" of "Any | None" has no attribute "result_as_answer"
data["result_as_answer"] = result_as_answer # type: ignore

View File

@@ -104,16 +104,25 @@ class EmbeddingConfigurator:
@staticmethod
def _configure_vertexai(config, model_name):
from chromadb.utils.embedding_functions.google_embedding_function import (
GoogleVertexEmbeddingFunction,
)
try:
from chromadb.utils.embedding_functions.google_embedding_function import (
GoogleVertexEmbeddingFunction,
)
return GoogleVertexEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
project_id=config.get("project_id"),
region=config.get("region"),
)
from crewai.utilities.embedding_functions import (
FixedGoogleVertexEmbeddingFunction,
)
return FixedGoogleVertexEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
project_id=config.get("project_id"),
region=config.get("region"),
)
except ImportError as e:
raise ImportError(
"Google Vertex dependencies are not installed. Please install them to use Vertex embedding."
) from e
@staticmethod
def _configure_google(config, model_name):

View File

@@ -0,0 +1,40 @@
from typing import Any, List, Optional
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse
import requests
from chromadb import Documents, Embeddings
from chromadb.utils.embedding_functions.google_embedding_function import (
GoogleVertexEmbeddingFunction,
)
class FixedGoogleVertexEmbeddingFunction(GoogleVertexEmbeddingFunction):
"""
A wrapper around ChromaDB's GoogleVertexEmbeddingFunction that fixes the URL typo
where 'publishers/goole' is incorrectly used instead of 'publishers/google'.
Issue reference: https://github.com/crewaiinc/crewai/issues/2690
"""
def __init__(self,
model_name: str = "textembedding-gecko",
api_key: Optional[str] = None,
**kwargs: Any):
api_key_str = "" if api_key is None else api_key
super().__init__(model_name=model_name, api_key=api_key_str, **kwargs)
self._original_post = requests.post
requests.post = self._patched_post
def __del__(self):
if hasattr(self, '_original_post'):
requests.post = self._original_post
def _patched_post(self, url, *args, **kwargs):
if 'publishers/goole' in url:
url = url.replace('publishers/goole', 'publishers/google')
return self._original_post(url, *args, **kwargs)
def __call__(self, input: Documents) -> Embeddings:
return super().__call__(input)
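A usage sketch mirroring the new configurator test at the end of this diff; the config keys follow `_configure_vertexai`, and the credential values are placeholders:
```python
from crewai.utilities.embedding_configurator import EmbeddingConfigurator

embedder = EmbeddingConfigurator().configure_embedder(
    {
        "provider": "vertexai",
        "config": {
            "api_key": "YOUR_API_KEY",  # placeholder
            "model": "textembedding-gecko",
            "project_id": "your-gcp-project",  # placeholder
            "region": "us-central1",  # placeholder
        },
    }
)
vectors = embedder(["What is the capital of France?"])
```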

View File

@@ -60,7 +60,6 @@ def execute_tool_and_check_finality(
task=task,
agent=agent,
action=agent_action,
original_tools=tools, # Pass original tools to ensure custom tools work
)
# Parse tool calling

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-001:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/62RTU+EMBCG7/0VTY9kIQUT1vXqx0njRokxUQ8jDNAILaFdoyH8dwssbNGrTdo0
807nnT7TEUpZCjITGRjU7IK+2Ail3XgOmpIGpbHCHLLBBlpzyp1W59xtisGv4RFLSqQpNMJARVVO
b1qQKVKhqeftoRXa84JXyZy3/XJ/25wcW1XhUK5WGVZzej8nsFxIocsHBK3kkPaY3O/ZosJncauK
plXvQ9M+D3gUxiE/tzs6i+JtyHdkth5N2UFDgXdowFKB5e/Mlqgbk6gPlJfqMFLZTi4Ow5W8O8pG
WQArJYw3f4rqK2spKhetQ93+HSphvkes188Jc/iYVU8zH+Jg/N3hP3nt1l7kOJVpUE/YajFNpMDa
zsiPAu7nFejS5zxkpCc/6so6tIECAAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:05 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1219
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-lite-001:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/61RTU+EMBC98yuanhdSiMjq1Y+Txo0SY6J7GGGARmhJ2zUawn+3wMIWvdpDM5n3
Zt7Mm84jhGYgcp6DQU0vyavNENKN/4BJYVAYC8wpm2xBmRN3ep0TW4rBr6GIphWSDFpuoCayILcK
RIaEa7IDxXXwJqhT1y/xfnNSU7LGoVUjc6xnej8TaMEF19UjgpZioD2lDzu6oPBZ3smyVfJ9GNhn
AQuTMIpjFl/EZyxKwuTcm6VHUXrQUOI9GrCOwLI3tS2a1qTyA8WVPIyOJJOK498K3h5hI+3yKySM
N3+a6msryWvXVsdxuzvU3HyPlt68pNTxx6xmmv3xHBt/T/hPWtu1lne8ynSoZ1SaTxcpsbE38qOA
+UUNuvJtd/QZC6nXez+t5iqFggIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:06 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1090
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,58 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-thinking-exp-01-21:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/22QQWuEMBCF7/6KkKNsFt1DKb2221vp0koplD0M66jDxkSSWbCI/71Rq+vS5pCE
eW9meF8XCSFPYHLKgdHLB/EVKkJ04z1o1jAaDsJcCsUGHF+90+lW/2BhbIcmmVUoTtAQgxa2EM8O
zAkFeRHHB3Dk43grV5398j9urvuc1TgMq22Oerb3s0EWZMhXbwjemsH2nr0e5KKSybEN5SSaF4yj
5cVDiS/IEJLDkk82ztYNZ/aM5tFexuT306wVp39ltiHkjZLebf4M9U9hJek1vhXZkBA08feIbv+Z
yRUFvlk6UxjfY/TLY0L0gc7TxKLEOtBRu22iCg2+UlyROZMpFbaNSlK1S2XURz/Fy1P+CAIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:04 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=764
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview-04-17:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2WQT0+EMBDF7/0UTY9k2ewaDerVPzfjRokxMR4mMEBjaUk76BrCd7fAslvcHppm
5s2bvl/HOBcZ6FzmQOjELf/wFc678R56RhNq8o255IsNWDppp9MFby8h3A9DIq2QZ9BIAsVNwR8t
6Ay5dDyKdmCli6K1CCb74/tzddpnjcLBrDY5qlnezwJRSC1d9YLgjB5kr+nzThy7Uue49+UNmxeM
1qJ1UOITEvjkcMwnGmvqhlLzhfrOtGPy68kr4LRoJzeHPhmfcjmZrM5c3b3fKVXIL0DrI4KS9Duy
e3hPRYCBFtYzBhbQElSZtqzo3we37IBrIviG1skJVYm1hxdfrK/iQoGr4sbit8SfeHMZbxPBevYH
O2bXiSICAAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:28 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=20971
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/12QT2uEMBDF7/kUIUepi2tZaHttu7fSpZVSKD0MOquhmkgygkX87o26cWM9SJj3
5s/7DYxzkYMqZAGEVjzwL1fhfJj/k6YVoSIn+JIrtmDo6l2+IXg7C2E/NYmsQp5DKwlqrs/8aEDl
yKXlUXQCI20U7UTQOa7v75vrPqNrnIY1usDa20dvEGeppK3eEKxWk+09ez2JVZWqwN6VE+YXzKNF
Z6HEFyRwyWHNJ1qjm5Yy/YPqUXdz8rtlVsBpI++T5GIg7WL+03xzMNc+ua2yDgkGcF1IqCX9zvSe
PzMRgKDNWR4EC3gJqnRXVrQ98T5lF2ALww80Vi6wSmwcvjjdHWJ3Yox9Gye3cXoQbGR/TedYqx4C
AAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:30 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=2418
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-12b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2WRTWvDMAyG7/kVwpdBSEvX7jB23QfsMFa2MAZbD2qipGaOFWwFWkr/+5ykaVPq
gGP0SvLrR/sIQGVoc52jkFcP8BMiAPtubzW2QlaCMIRCsEYn59x+7UfnkCK0bYtUuiHIsNaCBriA
F4c2I9Ae4niJTvs4nv7a/nuVGw9oPIOEIoOuJC+QadmBtkNlsAoIpeF1aJgFZ+SgYAfBUQIF+o1m
m0CJXhxbrnZJV5E1RhpHUzUyeTidV8n5aY4Ntb4rzskM6YchQRXaar/5IPRs27TP9H2pTqq2OW1D
eBYNF3StVeOxpDcSDJDxhFLVjqtaUv4j+8hNB/m+7zUayYW8mB914QD0QrqbJVdd/VO4U5vxqEZT
DE9EE+h2Y3r+TtUIg1yYGjB0/1V0BNIz+iLndQ+jpKrCyWJyO19PtKjoEP0DlZtdIF8CAAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:39 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=3835
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,60 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-1b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2VRy07DQAy85yusPUZtBSoIxJWHxAFRQYSQKAc3cVqL7DrKuqKlqsRv8Ht8CZuk
aVOxh314xuP1eBMBmBRdxhkqeXMFbyECsGn2GhOn5DQAXSgES6z0wG3XpncPFKVVnWSSBUGKJSsW
IDncVehSAvYQxxOs2MfxCKZu6u719/vHA0Iq1ooDyz6UTqlUDi9doELDr1M1aMbiinXcSQ9gtlTg
VqOGJc855VAztAZWvMInZ1SsoaJU5o6/KANxNDK9X2/39/fBoddKCqobsRLyO/q2I5icHfvFE6EX
V9Oek8eJ2aPsMlqF8EnUFWikzdLjnB5IMbiOe29NWYktNZEPcteybFy/bLV6MzqCxxc7XCXYcASd
nQ/+qfqbUJOL/ux6Yw0tYsG6buZ2+5qYng169KnOhuZ8j3aGtB69UOW5NWNO1uJwPDydDVlNtI3+
AD6XWQdvAgAA
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:32 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1535
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,60 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-27b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2VRXUvDMBR976+45EUo3RDnUHwTnSA4HFpEcHuI7e16aZqU5NZNxv67abtuHTbQ
hHPu5zm7AEAkUqeUSkYn7uDLIwC79t9wRjNq9kQPebCSlk+x3bcbvH0I47ZJEnGOkMiKWCowGTxZ
qRMEchCGC2nJheEYlnqpn/nCQaHNRkNmLJDvSwkoP1kpbeFAUYHAvtiMsgwVxGaDNmqRF1P/WIR5
7bAuI/ApLXxvE0gRYkumrHL0hIMNKtXcxA4y6XIyOoKkJkcau8ykVlxbHDczNUcM1tof36voJIY1
CptNS5Oi6sP3fYDISJPL31A6o5uw9/h1IY4s6RS3Hr4M+gZtaVE7ucY5svS2yKP4orJ+F45NgfrB
1K0tt12tgYln9PX0wLPxFpxR00n0r6p79D1JDc0d+O5XlIr4tzV29hmLgQx8NlQvQ3uvgoMgnUYf
aB11YqyxLOVoMrq6+R4Ri2Af/AEDrXcbkQIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:41 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=2447
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,60 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-4b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2WRzUrDQBDH73mKYY+lLUKLiBcPfoAHsWhQwXqYJtNk6WYn7E6woRQ8+wR68t18
Ah/BbWraFPeQXeb/z3z8ZhUBqARtqlMU8uoUnkMEYNV8NxpbIStBaEMhWKKTvXd7Vp13sAgtNz+p
OCdIsNSCBngOVw5tQqA99HoTdNr3ekOY2qm9lu+3Tw8ImeFZ8CahKDmYs4NQrA9z9Llm24cMvTi2
XNQQ2oakMlI5GsLP18d7k+mRK5NCzRUYvSAQhoXl12CuJdc2g4IdAc64Emg6OFOdzte790t/P69j
Q5thCk7JtPZ1a1BzbbXP7wg9243tPr6dqJ2qbUrLED6K2gJNalV5zOiGBAN53PFVpeOilJgXZM+5
asifbHN19nQgj1pdOFA+kMbH/X9Z/UWoqU13f53VhhHRaKmb3V0+xaqDQQ6aajE090v0B2TL6IGc
11sYGRUFDkaD8WygRUXr6BcxmBLccwIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:35 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=2349
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,101 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
"model": "gpt-4.1-mini-2025-04-14", "stop": []}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '125'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.12
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAAwAAAP//jJJPb9swDMXv/hSCznEQe+7S5LgAGzDs0P3paSgMRqJtZrIkSPTQoch3
H2Snsbt1wC4+8MdHv0fxKRNCkpZ7IVUHrHpv8nd3t4eIH77w4ePX+/b+8XP56bTdHE7dgb2Sq6Rw
xxMqflatleu9QSZnJ6wCAmOaWmyrmzflrqrejqB3Gk2StZ7zal3kPVnKy015k2+qvKgu8s6Rwij3
4nsmhBBP4zcZtRof5V5sVs+VHmOEFuX+2iSEDM6kioQYKTJYlqsZKmcZ7ej9W4dCgScGI1wj3gew
CgVFcQeB4nqpCtgMEZJ1OxizAGCtY0jRR78PF3K+OjSu9cEd4x9S2ZCl2NUBITqb3ER2Xo70nAnx
MG5ieBFO+uB6zzW7Hzj+rqimcXJ+gBneXhg7BjOXy3L1yrBaIwOZuFikVKA61LNy3joMmtwCZIvI
f3t5bfYUm2z7P+NnoBR6Rl37gJrUy7xzW8B0nf9qu654NCwjhp+ksGbCkJ5BYwODmU5Gxl+Rsa8b
si0GH2i6m8bX291xuztiVTQyO2e/AQAA//8DAP7WRo9GAwAA
headers:
CF-RAY:
- 93458dcf6d0ef53b-GRU
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 22 Apr 2025 13:44:07 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '1391'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '30000'
x-ratelimit-limit-tokens:
- '150000000'
x-ratelimit-remaining-requests:
- '29999'
x-ratelimit-remaining-tokens:
- '149999989'
x-ratelimit-reset-requests:
- 2ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_74408ec81f430b6e9795cf5332262b8d
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,101 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
"model": "gpt-4.1-nano-2025-04-14", "stop": []}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '125'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.12
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAAwAAAP//jJJPb9swDMXv/hSCznEQuwrq5bgNBbZTh7aHYigMRqJtbbKkSXT/oMh3
L2Snsdt1wC4+8MdHv0fxOWOMa8V3jMsOSPbe5J8vqy/D96K/+PPjSn/Fh5szMewfb/eiar/d8lVS
uP0vlPSqWkvXe4OknZ2wDAiEaWpxLrZn5SchqhH0TqFJstZTLtZFbsG6vNyU23wj8kIc5Z3TEiPf
sZ8ZY4w9j99k1Cp85Du2Wb1WeowRWuS7UxNjPDiTKhxi1JHAEl/NUDpLaEfv1x0yCV4TGOYadhHA
SmQ6sksIOq6XqoDNECFZt4MxCwDWOoIUffR7dySHk0PjWh/cPr6T8kZbHbs6IERnk5tIzvORHjLG
7sZNDG/CcR9c76km9xvH3xViGsfnB5hhdWTkCMxcLsvVB8NqhQTaxMUiuQTZoZqV89ZhUNotQLaI
/LeXj2ZPsbVt/2f8DKRET6hqH1Bp+Tbv3BYwXee/2k4rHg3ziOFeS6xJY0jPoLCBwUwnw+NTJOzr
RtsWgw96upvG14gKq2ajxJZnh+wFAAD//wMATWCJPkYDAAA=
headers:
CF-RAY:
- 93458dd9aebff53b-GRU
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 22 Apr 2025 13:44:08 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '134'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '30000'
x-ratelimit-limit-tokens:
- '150000000'
x-ratelimit-remaining-requests:
- '29999'
x-ratelimit-remaining-tokens:
- '149999990'
x-ratelimit-reset-requests:
- 2ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_0ba738662e063fb55ea01793aafdcecc
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,101 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
"model": "gpt-4.1", "stop": []}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '109'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.12
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAA4xSwYrbMBC9+yvEHEMcHMeh3dx2SwttoYTSW1nMRB7b2pUlIY27XZb8e5GdxM5u
C734MG/e83tP85IIAaqCnQDZIsvO6fRu//6Db26fnvTtx8/5t7v95rmT3y1+pc0XDcvIsIcHknxm
raTtnCZW1oyw9IRMUXX9rthu8pui2A5AZyvSkdY4TovVOs2zfJtmRbouTszWKkkBduJnIoQQL8M3
ejQV/YadyJbnSUchYEOwuywJAd7qOAEMQQVGw7CcQGkNkxls/2hJSHSKUQtbi08ejSShglgs9uhV
WCxWc6anug8YnZte6xmAxljGmHzwfH9CjheX2jbO20N4RYVaGRXa0hMGa6KjwNbBgB4TIe6HNvqr
gOC87RyXbB9p+N26GOVg6n8GnpoCtox6mudn0pVaWRGj0mHWJkiULVUTc6oe+0rZGZDMMr818zft
Mbcyzf/IT4CU5Jiq0nmqlLwOPK15itf5r7VLx4NhCOR/KUklK/LxHSqqsdfj3UB4DkxdWSvTkHde
jcdTuxJllh2yLMskJMfkDwAAAP//AwC4aq9JRgMAAA==
headers:
CF-RAY:
- 93458dcc1b9df53b-GRU
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 22 Apr 2025 13:44:06 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '296'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '30000000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '29999989'
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_7b4e1628a0608e9547e71e7857a29b58
status:
code: 200
message: OK
version: 1

View File

@@ -256,6 +256,52 @@ def test_validate_call_params_no_response_format():
llm._validate_call_params()
@pytest.mark.vcr(filter_headers=["authorization"], filter_query_parameters=["key"])
@pytest.mark.parametrize(
"model",
[
"gemini/gemini-2.0-flash-thinking-exp-01-21",
"gemini/gemini-2.0-flash-001",
"gemini/gemini-2.0-flash-lite-001",
"gemini/gemini-2.5-flash-preview-04-17",
"gemini/gemini-2.5-pro-exp-03-25",
],
)
def test_gemini_models(model):
llm = LLM(model=model)
result = llm.call("What is the capital of France?")
assert isinstance(result, str)
assert "Paris" in result
@pytest.mark.vcr(filter_headers=["authorization"], filter_query_parameters=["key"])
@pytest.mark.parametrize(
"model",
[
"gemini/gemma-3-1b-it",
"gemini/gemma-3-4b-it",
"gemini/gemma-3-12b-it",
"gemini/gemma-3-27b-it",
],
)
def test_gemma3(model):
llm = LLM(model=model)
result = llm.call("What is the capital of France?")
assert isinstance(result, str)
assert "Paris" in result
@pytest.mark.vcr(filter_headers=["authorization"])
@pytest.mark.parametrize(
"model", ["gpt-4.1", "gpt-4.1-mini-2025-04-14", "gpt-4.1-nano-2025-04-14"]
)
def test_gpt_4_1(model):
llm = LLM(model=model)
result = llm.call("What is the capital of France?")
assert isinstance(result, str)
assert "Paris" in result
@pytest.mark.vcr(filter_headers=["authorization"])
def test_o3_mini_reasoning_effort_high():
llm = LLM(
@@ -327,6 +373,45 @@ def get_weather_tool_schema():
},
}
def test_context_window_exceeded_error_handling():
"""Test that litellm.ContextWindowExceededError is converted to LLMContextLengthExceededException."""
from litellm.exceptions import ContextWindowExceededError
from crewai.utilities.exceptions.context_window_exceeding_exception import (
LLMContextLengthExceededException,
)
llm = LLM(model="gpt-4")
# Test non-streaming response
with patch("litellm.completion") as mock_completion:
mock_completion.side_effect = ContextWindowExceededError(
"This model's maximum context length is 8192 tokens. However, your messages resulted in 10000 tokens.",
model="gpt-4",
llm_provider="openai"
)
with pytest.raises(LLMContextLengthExceededException) as excinfo:
llm.call("This is a test message")
assert "context length exceeded" in str(excinfo.value).lower()
assert "8192 tokens" in str(excinfo.value)
# Test streaming response
llm = LLM(model="gpt-4", stream=True)
with patch("litellm.completion") as mock_completion:
mock_completion.side_effect = ContextWindowExceededError(
"This model's maximum context length is 8192 tokens. However, your messages resulted in 10000 tokens.",
model="gpt-4",
llm_provider="openai"
)
with pytest.raises(LLMContextLengthExceededException) as excinfo:
llm.call("This is a test message")
assert "context length exceeded" in str(excinfo.value).lower()
assert "8192 tokens" in str(excinfo.value)
@pytest.mark.vcr(filter_headers=["authorization"])
@pytest.fixture

View File

@@ -1,77 +0,0 @@
from unittest.mock import MagicMock
import pytest
from pydantic import BaseModel, Field
from crewai import Agent, Task
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.agents.parser import AgentAction
from crewai.agents.tools_handler import ToolsHandler
from crewai.tools import BaseTool
from crewai.utilities.i18n import I18N
from crewai.utilities.tool_utils import execute_tool_and_check_finality
class TestToolInput(BaseModel):
test_param: str = Field(..., description="A test parameter")
class TestCustomTool(BaseTool):
name: str = "Test Custom Tool"
description: str = "A test tool to verify custom tool invocation"
args_schema: type[BaseModel] = TestToolInput
def _run(self, test_param: str) -> str:
return f"Tool executed with param: {test_param}"
def test_custom_tool_invocation():
custom_tool = TestCustomTool()
mock_agent = MagicMock()
mock_task = MagicMock()
mock_llm = MagicMock()
mock_crew = MagicMock()
tools_handler = ToolsHandler()
executor = CrewAgentExecutor(
llm=mock_llm,
task=mock_task,
crew=mock_crew,
agent=mock_agent,
prompt={},
max_iter=5,
tools=[custom_tool],
tools_names="Test Custom Tool",
stop_words=[],
tools_description="A test tool to verify custom tool invocation",
tools_handler=tools_handler,
original_tools=[custom_tool]
)
action = AgentAction(
tool="Test Custom Tool",
tool_input={"test_param": "test_value"},
thought="I'll use the custom tool",
text="I'll use the Test Custom Tool to get a result"
)
i18n = I18N()
mock_agent.key = "test_agent"
mock_agent.role = "test_role"
result = execute_tool_and_check_finality(
agent_action=action,
tools=[custom_tool],
i18n=i18n,
agent_key=mock_agent.key,
agent_role=mock_agent.role,
tools_handler=tools_handler,
task=mock_task,
agent=mock_agent,
function_calling_llm=mock_llm
)
assert "Tool executed with param: test_value" in result.result
assert result.result_as_answer is False

View File

@@ -0,0 +1,37 @@
from unittest.mock import MagicMock, patch
import pytest
from crewai.utilities.embedding_configurator import EmbeddingConfigurator
from crewai.utilities.embedding_functions import FixedGoogleVertexEmbeddingFunction
class TestEmbeddingConfigurator:
@pytest.fixture
def embedding_configurator(self):
return EmbeddingConfigurator()
def test_configure_vertexai(self, embedding_configurator):
with patch('crewai.utilities.embedding_functions.FixedGoogleVertexEmbeddingFunction') as mock_class:
mock_instance = MagicMock()
mock_class.return_value = mock_instance
config = {
"provider": "vertexai",
"config": {
"api_key": "test-key",
"model": "test-model",
"project_id": "test-project",
"region": "test-region"
}
}
result = embedding_configurator.configure_embedder(config)
mock_class.assert_called_once_with(
model_name="test-model",
api_key="test-key",
project_id="test-project",
region="test-region"
)
assert result == mock_instance

View File

@@ -0,0 +1,57 @@
from unittest.mock import MagicMock, patch
import pytest
import requests
from crewai.utilities.embedding_functions import FixedGoogleVertexEmbeddingFunction
class TestFixedGoogleVertexEmbeddingFunction:
@pytest.fixture
def embedding_function(self):
with patch('requests.post') as mock_post:
mock_response = MagicMock()
mock_response.json.return_value = {"predictions": [[0.1, 0.2, 0.3]]}
mock_post.return_value = mock_response
function = FixedGoogleVertexEmbeddingFunction(
model_name="test-model",
api_key="test-key"
)
yield function, mock_post
if hasattr(function, '_original_post'):
requests.post = function._original_post
def test_url_correction(self, embedding_function):
function, mock_post = embedding_function
typo_url = "https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/publishers/goole/models/test-model:predict"
expected_url = "https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/publishers/google/models/test-model:predict"
with patch.object(function, '_original_post') as mock_original_post:
mock_response = MagicMock()
mock_response.json.return_value = {"predictions": [[0.1, 0.2, 0.3]]}
mock_original_post.return_value = mock_response
response = function._patched_post(typo_url, json={})
mock_original_post.assert_called_once()
call_args = mock_original_post.call_args
assert call_args[0][0] == expected_url
def test_embedding_call(self, embedding_function):
function, mock_post = embedding_function
mock_response = MagicMock()
mock_response.json.return_value = {"predictions": [[0.1, 0.2, 0.3]]}
mock_post.return_value = mock_response
embeddings = function(["test text"])
mock_post.assert_called_once()
assert isinstance(embeddings, list)
assert len(embeddings) > 0