style: fix linter issue

fix contenxt windown
added gpt-4.1 models and gemini-2.0 and 2.5 pro models (#2609 )
2026-01-04 13:48:31 +00:00 · 2025-04-25 10:30:58 -03:00 · 2025-04-24 23:09:23 -07:00 · 2025-04-23 11:20:32 -07:00 · 2025-04-23 10:27:25 -07:00
16 changed files with 1015 additions and 14 deletions
--- a/docs/tools/codeinterpretertool.mdx
+++ b/docs/tools/codeinterpretertool.mdx
@@ -8,11 +8,29 @@ icon: code-simple

 ## Description

-The `CodeInterpreterTool` enables CrewAI agents to execute Python 3 code that they generate autonomously. The code is run in a secure, isolated Docker container, ensuring safety regardless of the content. This functionality is particularly valuable as it allows agents to create code, execute it, obtain the results, and utilize that information to inform subsequent decisions and actions.
+The `CodeInterpreterTool` enables CrewAI agents to execute Python 3 code that they generate autonomously. This functionality is particularly valuable as it allows agents to create code, execute it, obtain the results, and utilize that information to inform subsequent decisions and actions.

-## Requirements
+There are several ways to use this tool:
+
+### Docker Container (Recommended)
+
+This is the primary option. The code runs in a secure, isolated Docker container, ensuring safety regardless of its content.
+Make sure Docker is installed and running on your system. If you don’t have it, you can install it from [here](https://docs.docker.com/get-docker/).
+
+### Sandbox environment
+
+If Docker is unavailable — either not installed or not accessible for any reason — the code will be executed in a restricted Python environment - called sandbox.
+This environment is very limited, with strict restrictions on many modules and built-in functions.
+
+### Unsafe Execution
+
+**NOT RECOMMENDED FOR PRODUCTION** 
+This mode allows execution of any Python code, including dangerous calls to `sys, os..` and similar modules. [Check out](/tools/codeinterpretertool#enabling-unsafe-mode) how to enable this mode
+
+## Logging
+
+The `CodeInterpreterTool` logs the selected execution strategy to STDOUT

- Docker must be installed and running on your system. If you don't have it, you can install it from [here](https://docs.docker.com/get-docker/).

 ## Installation

@@ -74,18 +92,32 @@ programmer_agent = Agent(
 )
 ```

+### Enabling `unsafe_mode`
+
+```python Code
+from crewai_tools import CodeInterpreterTool
+
+code = """
+import os
+os.system("ls -la")
+"""
+
+CodeInterpreterTool(unsafe_mode=True).run(code=code)
+```
+
 ## Parameters

 The `CodeInterpreterTool` accepts the following parameters during initialization:

 - **user_dockerfile_path**: Optional. Path to a custom Dockerfile to use for the code interpreter container.
 - **user_docker_base_url**: Optional. URL to the Docker daemon to use for running the container.
- **unsafe_mode**: Optional. Whether to run code directly on the host machine instead of in a Docker container. Default is `False`. Use with caution!
+- **unsafe_mode**: Optional. Whether to run code directly on the host machine instead of in a Docker container or sandbox. Default is `False`. Use with caution!
+- **default_image_tag**: Optional. Default Docker image tag. Default is `code-interpreter:latest`

 When using the tool with an agent, the agent will need to provide:

 - **code**: Required. The Python 3 code to execute.
- **libraries_used**: Required. A list of libraries used in the code that need to be installed.
+- **libraries_used**: Optional. A list of libraries used in the code that need to be installed. Default is `[]`

 ## Agent Integration Example

@@ -152,7 +184,7 @@ class CodeInterpreterTool(BaseTool):
        if self.unsafe_mode:
            return self.run_code_unsafe(code, libraries_used)
        else:
-            return self.run_code_in_docker(code, libraries_used)
+            return self.run_code_safety(code, libraries_used)
 ```

 The tool performs the following steps:
@@ -168,8 +200,9 @@ The tool performs the following steps:
 By default, the `CodeInterpreterTool` runs code in an isolated Docker container, which provides a layer of security. However, there are still some security considerations to keep in mind:

 1. The Docker container has access to the current working directory, so sensitive files could potentially be accessed.
-2. The `unsafe_mode` parameter allows code to be executed directly on the host machine, which should only be used in trusted environments.
-3. Be cautious when allowing agents to install arbitrary libraries, as they could potentially include malicious code.
+2. If the Docker container is unavailable and the code needs to run safely, it will be executed in a sandbox environment. For security reasons, installing arbitrary libraries is not allowed
+3. The `unsafe_mode` parameter allows code to be executed directly on the host machine, which should only be used in trusted environments.
+4. Be cautious when allowing agents to install arbitrary libraries, as they could potentially include malicious code.

 ## Conclusion

--- a/src/crewai/cli/constants.py
+++ b/src/crewai/cli/constants.py
@@ -122,7 +122,16 @@ PROVIDERS = [
 ]

 MODELS = {
-    "openai": ["gpt-4", "gpt-4o", "gpt-4o-mini", "o1-mini", "o1-preview"],
+    "openai": [
+        "gpt-4",
+        "gpt-4.1",
+        "gpt-4.1-mini-2025-04-14",
+        "gpt-4.1-nano-2025-04-14",
+        "gpt-4o",
+        "gpt-4o-mini",
+        "o1-mini",
+        "o1-preview",
+    ],
    "anthropic": [
        "claude-3-5-sonnet-20240620",
        "claude-3-sonnet-20240229",
@@ -132,8 +141,17 @@ MODELS = {
    "gemini": [
        "gemini/gemini-1.5-flash",
        "gemini/gemini-1.5-pro",
+        "gemini/gemini-2.0-flash-lite-001",
+        "gemini/gemini-2.0-flash-001",
+        "gemini/gemini-2.0-flash-thinking-exp-01-21",
+        "gemini/gemini-2.5-flash-preview-04-17",
+        "gemini/gemini-2.5-pro-exp-03-25",
        "gemini/gemini-gemma-2-9b-it",
        "gemini/gemini-gemma-2-27b-it",
+        "gemini/gemma-3-1b-it",
+        "gemini/gemma-3-4b-it",
+        "gemini/gemma-3-12b-it",
+        "gemini/gemma-3-27b-it",
    ],
    "nvidia_nim": [
        "nvidia_nim/nvidia/mistral-nemo-minitron-8b-8k-instruct",
--- a/src/crewai/llm.py
+++ b/src/crewai/llm.py
@@ -37,6 +37,7 @@ with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    import litellm
    from litellm import Choices
+    from litellm.exceptions import ContextWindowExceededError
    from litellm.litellm_core_utils.get_supported_openai_params import (
        get_supported_openai_params,
    )
@@ -81,14 +82,26 @@ LLM_CONTEXT_WINDOW_SIZES = {
    "gpt-4o": 128000,
    "gpt-4o-mini": 128000,
    "gpt-4-turbo": 128000,
+    "gpt-4.1": 1047576,  # Based on official docs
+    "gpt-4.1-mini-2025-04-14": 1047576,
+    "gpt-4.1-nano-2025-04-14": 1047576,
    "o1-preview": 128000,
    "o1-mini": 128000,
    "o3-mini": 200000,  # Based on official o3-mini specifications
    # gemini
    "gemini-2.0-flash": 1048576,
+    "gemini-2.0-flash-thinking-exp-01-21": 32768,
+    "gemini-2.0-flash-lite-001": 1048576,
+    "gemini-2.0-flash-001": 1048576,
+    "gemini-2.5-flash-preview-04-17": 1048576,
+    "gemini-2.5-pro-exp-03-25": 1048576,
    "gemini-1.5-pro": 2097152,
    "gemini-1.5-flash": 1048576,
    "gemini-1.5-flash-8b": 1048576,
+    "gemini/gemma-3-1b-it": 32000,
+    "gemini/gemma-3-4b-it": 128000,
+    "gemini/gemma-3-12b-it": 128000,
+    "gemini/gemma-3-27b-it": 128000,
    # deepseek
    "deepseek-chat": 128000,
    # groq
@@ -585,6 +598,11 @@ class LLM(BaseLLM):
            self._handle_emit_call_events(full_response, LLMCallType.LLM_CALL)
            return full_response

+        except ContextWindowExceededError as e:
+            # Catch context window errors from litellm and convert them to our own exception type.
+            # This exception is handled by CrewAgentExecutor._invoke_loop() which can then
+            # decide whether to summarize the content or abort based on the respect_context_window flag.
+            raise LLMContextLengthExceededException(str(e))
        except Exception as e:
            logging.error(f"Error in streaming response: {str(e)}")
            if full_response.strip():
@@ -699,7 +717,16 @@ class LLM(BaseLLM):
            str: The response text
        """
        # --- 1) Make the completion call
-        response = litellm.completion(**params)
+        try:
+            # Attempt to make the completion call, but catch context window errors
+            # and convert them to our own exception type for consistent handling
+            # across the codebase. This allows CrewAgentExecutor to handle context
+            # length issues appropriately.
+            response = litellm.completion(**params)
+        except ContextWindowExceededError as e:
+            # Convert litellm's context window error to our own exception type
+            # for consistent handling in the rest of the codebase
+            raise LLMContextLengthExceededException(str(e))

        # --- 2) Extract response message and content
        response_message = cast(Choices, cast(ModelResponse, response).choices)[
@@ -858,15 +885,17 @@ class LLM(BaseLLM):
                        params, callbacks, available_functions
                    )

+            except LLMContextLengthExceededException:
+                # Re-raise LLMContextLengthExceededException as it should be handled
+                # by the CrewAgentExecutor._invoke_loop method, which can then decide
+                # whether to summarize the content or abort based on the respect_context_window flag
+                raise
            except Exception as e:
                crewai_event_bus.emit(
                    self,
                    event=LLMCallFailedEvent(error=str(e)),
                )
-                if not LLMContextLengthExceededException(
-                    str(e)
-                )._is_context_limit_error(str(e)):
-                    logging.error(f"LiteLLM call failed: {str(e)}")
+                logging.error(f"LiteLLM call failed: {str(e)}")
                raise

    def _handle_emit_call_events(self, response: Any, call_type: LLMCallType):
--- a/tests/cassettes/test_gemini_models[gemini-gemini-2.0-flash-001].yaml
+++ b/tests/cassettes/test_gemini_models[gemini-gemini-2.0-flash-001].yaml
@@ -0,0 +1,59 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-001:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/62RTU+EMBCG7/0VTY9kIQUT1vXqx0njRokxUQ8jDNAILaFdoyH8dwssbNGrTdo0
+        807nnT7TEUpZCjITGRjU7IK+2Ail3XgOmpIGpbHCHLLBBlpzyp1W59xtisGv4RFLSqQpNMJARVVO
+        b1qQKVKhqeftoRXa84JXyZy3/XJ/25wcW1XhUK5WGVZzej8nsFxIocsHBK3kkPaY3O/ZosJncauK
+        plXvQ9M+D3gUxiE/tzs6i+JtyHdkth5N2UFDgXdowFKB5e/Mlqgbk6gPlJfqMFLZTi4Ow5W8O8pG
+        WQArJYw3f4rqK2spKhetQ93+HSphvkes188Jc/iYVU8zH+Jg/N3hP3nt1l7kOJVpUE/YajFNpMDa
+        zsiPAu7nFejS5zxkpCc/6so6tIECAAA=
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:05 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=1219
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemini_models[gemini-gemini-2.0-flash-lite-001].yaml
+++ b/tests/cassettes/test_gemini_models[gemini-gemini-2.0-flash-lite-001].yaml
@@ -0,0 +1,59 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-lite-001:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/61RTU+EMBC98yuanhdSiMjq1Y+Txo0SY6J7GGGARmhJ2zUawn+3wMIWvdpDM5n3
+        Zt7Mm84jhGYgcp6DQU0vyavNENKN/4BJYVAYC8wpm2xBmRN3ep0TW4rBr6GIphWSDFpuoCayILcK
+        RIaEa7IDxXXwJqhT1y/xfnNSU7LGoVUjc6xnej8TaMEF19UjgpZioD2lDzu6oPBZ3smyVfJ9GNhn
+        AQuTMIpjFl/EZyxKwuTcm6VHUXrQUOI9GrCOwLI3tS2a1qTyA8WVPIyOJJOK498K3h5hI+3yKySM
+        N3+a6msryWvXVsdxuzvU3HyPlt68pNTxx6xmmv3xHBt/T/hPWtu1lne8ynSoZ1SaTxcpsbE38qOA
+        +UUNuvJtd/QZC6nXez+t5iqFggIAAA==
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:06 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=1090
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemini_models[gemini-gemini-2.0-flash-thinking-exp-01-21].yaml
+++ b/tests/cassettes/test_gemini_models[gemini-gemini-2.0-flash-thinking-exp-01-21].yaml
@@ -0,0 +1,58 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-thinking-exp-01-21:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/22QQWuEMBCF7/6KkKNsFt1DKb2221vp0koplD0M66jDxkSSWbCI/71Rq+vS5pCE
+        eW9meF8XCSFPYHLKgdHLB/EVKkJ04z1o1jAaDsJcCsUGHF+90+lW/2BhbIcmmVUoTtAQgxa2EM8O
+        zAkFeRHHB3Dk43grV5398j9urvuc1TgMq22Oerb3s0EWZMhXbwjemsH2nr0e5KKSybEN5SSaF4yj
+        5cVDiS/IEJLDkk82ztYNZ/aM5tFexuT306wVp39ltiHkjZLebf4M9U9hJek1vhXZkBA08feIbv+Z
+        yRUFvlk6UxjfY/TLY0L0gc7TxKLEOtBRu22iCg2+UlyROZMpFbaNSlK1S2XURz/Fy1P+CAIAAA==
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:04 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=764
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemini_models[gemini-gemini-2.5-flash-preview-04-17].yaml
+++ b/tests/cassettes/test_gemini_models[gemini-gemini-2.5-flash-preview-04-17].yaml
@@ -0,0 +1,59 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview-04-17:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/2WQT0+EMBDF7/0UTY9k2ewaDerVPzfjRokxMR4mMEBjaUk76BrCd7fAslvcHppm
+        5s2bvl/HOBcZ6FzmQOjELf/wFc678R56RhNq8o255IsNWDppp9MFby8h3A9DIq2QZ9BIAsVNwR8t
+        6Ay5dDyKdmCli6K1CCb74/tzddpnjcLBrDY5qlnezwJRSC1d9YLgjB5kr+nzThy7Uue49+UNmxeM
+        1qJ1UOITEvjkcMwnGmvqhlLzhfrOtGPy68kr4LRoJzeHPhmfcjmZrM5c3b3fKVXIL0DrI4KS9Duy
+        e3hPRYCBFtYzBhbQElSZtqzo3we37IBrIviG1skJVYm1hxdfrK/iQoGr4sbit8SfeHMZbxPBevYH
+        O2bXiSICAAA=
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:28 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=20971
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemini_models[gemini-gemini-2.5-pro-exp-03-25].yaml
+++ b/tests/cassettes/test_gemini_models[gemini-gemini-2.5-pro-exp-03-25].yaml
@@ -0,0 +1,59 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/12QT2uEMBDF7/kUIUepi2tZaHttu7fSpZVSKD0MOquhmkgygkX87o26cWM9SJj3
+        5s/7DYxzkYMqZAGEVjzwL1fhfJj/k6YVoSIn+JIrtmDo6l2+IXg7C2E/NYmsQp5DKwlqrs/8aEDl
+        yKXlUXQCI20U7UTQOa7v75vrPqNrnIY1usDa20dvEGeppK3eEKxWk+09ez2JVZWqwN6VE+YXzKNF
+        Z6HEFyRwyWHNJ1qjm5Yy/YPqUXdz8rtlVsBpI++T5GIg7WL+03xzMNc+ua2yDgkGcF1IqCX9zvSe
+        PzMRgKDNWR4EC3gJqnRXVrQ98T5lF2ALww80Vi6wSmwcvjjdHWJ3Yox9Gye3cXoQbGR/TedYqx4C
+        AAA=
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:30 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=2418
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemma3[gemini-gemma-3-12b-it].yaml
+++ b/tests/cassettes/test_gemma3[gemini-gemma-3-12b-it].yaml
@@ -0,0 +1,59 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-12b-it:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/2WRTWvDMAyG7/kVwpdBSEvX7jB23QfsMFa2MAZbD2qipGaOFWwFWkr/+5ykaVPq
+        gGP0SvLrR/sIQGVoc52jkFcP8BMiAPtubzW2QlaCMIRCsEYn59x+7UfnkCK0bYtUuiHIsNaCBriA
+        F4c2I9Ae4niJTvs4nv7a/nuVGw9oPIOEIoOuJC+QadmBtkNlsAoIpeF1aJgFZ+SgYAfBUQIF+o1m
+        m0CJXhxbrnZJV5E1RhpHUzUyeTidV8n5aY4Ntb4rzskM6YchQRXaar/5IPRs27TP9H2pTqq2OW1D
+        eBYNF3StVeOxpDcSDJDxhFLVjqtaUv4j+8hNB/m+7zUayYW8mB914QD0QrqbJVdd/VO4U5vxqEZT
+        DE9EE+h2Y3r+TtUIg1yYGjB0/1V0BNIz+iLndQ+jpKrCyWJyO19PtKjoEP0DlZtdIF8CAAA=
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:39 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=3835
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemma3[gemini-gemma-3-1b-it].yaml
+++ b/tests/cassettes/test_gemma3[gemini-gemma-3-1b-it].yaml
@@ -0,0 +1,60 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-1b-it:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/2VRy07DQAy85yusPUZtBSoIxJWHxAFRQYSQKAc3cVqL7DrKuqKlqsRv8Ht8CZuk
+        aVOxh314xuP1eBMBmBRdxhkqeXMFbyECsGn2GhOn5DQAXSgES6z0wG3XpncPFKVVnWSSBUGKJSsW
+        IDncVehSAvYQxxOs2MfxCKZu6u719/vHA0Iq1ooDyz6UTqlUDi9doELDr1M1aMbiinXcSQ9gtlTg
+        VqOGJc855VAztAZWvMInZ1SsoaJU5o6/KANxNDK9X2/39/fBoddKCqobsRLyO/q2I5icHfvFE6EX
+        V9Oek8eJ2aPsMlqF8EnUFWikzdLjnB5IMbiOe29NWYktNZEPcteybFy/bLV6MzqCxxc7XCXYcASd
+        nQ/+qfqbUJOL/ux6Yw0tYsG6buZ2+5qYng169KnOhuZ8j3aGtB69UOW5NWNO1uJwPDydDVlNtI3+
+        AD6XWQdvAgAA
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:32 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=1535
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemma3[gemini-gemma-3-27b-it].yaml
+++ b/tests/cassettes/test_gemma3[gemini-gemma-3-27b-it].yaml
@@ -0,0 +1,60 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-27b-it:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/2VRXUvDMBR976+45EUo3RDnUHwTnSA4HFpEcHuI7e16aZqU5NZNxv67abtuHTbQ
+        hHPu5zm7AEAkUqeUSkYn7uDLIwC79t9wRjNq9kQPebCSlk+x3bcbvH0I47ZJEnGOkMiKWCowGTxZ
+        qRMEchCGC2nJheEYlnqpn/nCQaHNRkNmLJDvSwkoP1kpbeFAUYHAvtiMsgwVxGaDNmqRF1P/WIR5
+        7bAuI/ApLXxvE0gRYkumrHL0hIMNKtXcxA4y6XIyOoKkJkcau8ykVlxbHDczNUcM1tof36voJIY1
+        CptNS5Oi6sP3fYDISJPL31A6o5uw9/h1IY4s6RS3Hr4M+gZtaVE7ucY5svS2yKP4orJ+F45NgfrB
+        1K0tt12tgYln9PX0wLPxFpxR00n0r6p79D1JDc0d+O5XlIr4tzV29hmLgQx8NlQvQ3uvgoMgnUYf
+        aB11YqyxLOVoMrq6+R4Ri2Af/AEDrXcbkQIAAA==
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:41 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=2447
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gemma3[gemini-gemma-3-4b-it].yaml
+++ b/tests/cassettes/test_gemma3[gemini-gemma-3-4b-it].yaml
@@ -0,0 +1,60 @@
+interactions:
+- request:
+    body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
+      of France?"}]}], "generationConfig": {"stop_sequences": []}}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '131'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      user-agent:
+      - litellm/1.60.2
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-4b-it:generateContent
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAC/2WRzUrDQBDH73mKYY+lLUKLiBcPfoAHsWhQwXqYJtNk6WYn7E6woRQ8+wR68t18
+        Ah/BbWraFPeQXeb/z3z8ZhUBqARtqlMU8uoUnkMEYNV8NxpbIStBaEMhWKKTvXd7Vp13sAgtNz+p
+        OCdIsNSCBngOVw5tQqA99HoTdNr3ekOY2qm9lu+3Tw8ImeFZ8CahKDmYs4NQrA9z9Llm24cMvTi2
+        XNQQ2oakMlI5GsLP18d7k+mRK5NCzRUYvSAQhoXl12CuJdc2g4IdAc64Emg6OFOdzte790t/P69j
+        Q5thCk7JtPZ1a1BzbbXP7wg9243tPr6dqJ2qbUrLED6K2gJNalV5zOiGBAN53PFVpeOilJgXZM+5
+        asifbHN19nQgj1pdOFA+kMbH/X9Z/UWoqU13f53VhhHRaKmb3V0+xaqDQQ6aajE090v0B2TL6IGc
+        11sYGRUFDkaD8WygRUXr6BcxmBLccwIAAA==
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Tue, 22 Apr 2025 14:25:35 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=2349
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - nosniff
+      X-Frame-Options:
+      - SAMEORIGIN
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gpt_4_1[gpt-4.1-mini-2025-04-14].yaml
+++ b/tests/cassettes/test_gpt_4_1[gpt-4.1-mini-2025-04-14].yaml
@@ -0,0 +1,101 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "gpt-4.1-mini-2025-04-14", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '125'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.68.2
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.68.2
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-read-timeout:
+      - '600.0'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.11.12
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAAAwAAAP//jJJPb9swDMXv/hSCznEQe+7S5LgAGzDs0P3paSgMRqJtZrIkSPTQoch3
+        H2Snsbt1wC4+8MdHv0fxKRNCkpZ7IVUHrHpv8nd3t4eIH77w4ePX+/b+8XP56bTdHE7dgb2Sq6Rw
+        xxMqflatleu9QSZnJ6wCAmOaWmyrmzflrqrejqB3Gk2StZ7zal3kPVnKy015k2+qvKgu8s6Rwij3
+        4nsmhBBP4zcZtRof5V5sVs+VHmOEFuX+2iSEDM6kioQYKTJYlqsZKmcZ7ej9W4dCgScGI1wj3gew
+        CgVFcQeB4nqpCtgMEZJ1OxizAGCtY0jRR78PF3K+OjSu9cEd4x9S2ZCl2NUBITqb3ER2Xo70nAnx
+        MG5ieBFO+uB6zzW7Hzj+rqimcXJ+gBneXhg7BjOXy3L1yrBaIwOZuFikVKA61LNy3joMmtwCZIvI
+        f3t5bfYUm2z7P+NnoBR6Rl37gJrUy7xzW8B0nf9qu654NCwjhp+ksGbCkJ5BYwODmU5Gxl+Rsa8b
+        si0GH2i6m8bX291xuztiVTQyO2e/AQAA//8DAP7WRo9GAwAA
+    headers:
+      CF-RAY:
+      - 93458dcf6d0ef53b-GRU
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 22 Apr 2025 13:44:07 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '1391'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999989'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_74408ec81f430b6e9795cf5332262b8d
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gpt_4_1[gpt-4.1-nano-2025-04-14].yaml
+++ b/tests/cassettes/test_gpt_4_1[gpt-4.1-nano-2025-04-14].yaml
@@ -0,0 +1,101 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "gpt-4.1-nano-2025-04-14", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '125'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.68.2
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.68.2
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-read-timeout:
+      - '600.0'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.11.12
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAAAwAAAP//jJJPb9swDMXv/hSCznEQuwrq5bgNBbZTh7aHYigMRqJtbbKkSXT/oMh3
+        L2Snsdt1wC4+8MdHv0fxOWOMa8V3jMsOSPbe5J8vqy/D96K/+PPjSn/Fh5szMewfb/eiar/d8lVS
+        uP0vlPSqWkvXe4OknZ2wDAiEaWpxLrZn5SchqhH0TqFJstZTLtZFbsG6vNyU23wj8kIc5Z3TEiPf
+        sZ8ZY4w9j99k1Cp85Du2Wb1WeowRWuS7UxNjPDiTKhxi1JHAEl/NUDpLaEfv1x0yCV4TGOYadhHA
+        SmQ6sksIOq6XqoDNECFZt4MxCwDWOoIUffR7dySHk0PjWh/cPr6T8kZbHbs6IERnk5tIzvORHjLG
+        7sZNDG/CcR9c76km9xvH3xViGsfnB5hhdWTkCMxcLsvVB8NqhQTaxMUiuQTZoZqV89ZhUNotQLaI
+        /LeXj2ZPsbVt/2f8DKRET6hqH1Bp+Tbv3BYwXee/2k4rHg3ziOFeS6xJY0jPoLCBwUwnw+NTJOzr
+        RtsWgw96upvG14gKq2ajxJZnh+wFAAD//wMATWCJPkYDAAA=
+    headers:
+      CF-RAY:
+      - 93458dd9aebff53b-GRU
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 22 Apr 2025 13:44:08 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '134'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999990'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_0ba738662e063fb55ea01793aafdcecc
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_gpt_4_1[gpt-4.1].yaml
+++ b/tests/cassettes/test_gpt_4_1[gpt-4.1].yaml
@@ -0,0 +1,101 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "gpt-4.1", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '109'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.68.2
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.68.2
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-read-timeout:
+      - '600.0'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.11.12
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAAA4xSwYrbMBC9+yvEHEMcHMeh3dx2SwttoYTSW1nMRB7b2pUlIY27XZb8e5GdxM5u
+        C734MG/e83tP85IIAaqCnQDZIsvO6fRu//6Db26fnvTtx8/5t7v95rmT3y1+pc0XDcvIsIcHknxm
+        raTtnCZW1oyw9IRMUXX9rthu8pui2A5AZyvSkdY4TovVOs2zfJtmRbouTszWKkkBduJnIoQQL8M3
+        ejQV/YadyJbnSUchYEOwuywJAd7qOAEMQQVGw7CcQGkNkxls/2hJSHSKUQtbi08ejSShglgs9uhV
+        WCxWc6anug8YnZte6xmAxljGmHzwfH9CjheX2jbO20N4RYVaGRXa0hMGa6KjwNbBgB4TIe6HNvqr
+        gOC87RyXbB9p+N26GOVg6n8GnpoCtox6mudn0pVaWRGj0mHWJkiULVUTc6oe+0rZGZDMMr818zft
+        Mbcyzf/IT4CU5Jiq0nmqlLwOPK15itf5r7VLx4NhCOR/KUklK/LxHSqqsdfj3UB4DkxdWSvTkHde
+        jcdTuxJllh2yLMskJMfkDwAAAP//AwC4aq9JRgMAAA==
+    headers:
+      CF-RAY:
+      - 93458dcc1b9df53b-GRU
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 22 Apr 2025 13:44:06 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '296'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '10000'
+      x-ratelimit-limit-tokens:
+      - '30000000'
+      x-ratelimit-remaining-requests:
+      - '9999'
+      x-ratelimit-remaining-tokens:
+      - '29999989'
+      x-ratelimit-reset-requests:
+      - 6ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_7b4e1628a0608e9547e71e7857a29b58
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/llm_test.py
+++ b/tests/llm_test.py
@@ -256,6 +256,52 @@ def test_validate_call_params_no_response_format():
    llm._validate_call_params()


+@pytest.mark.vcr(filter_headers=["authorization"], filter_query_parameters=["key"])
+@pytest.mark.parametrize(
+    "model",
+    [
+        "gemini/gemini-2.0-flash-thinking-exp-01-21",
+        "gemini/gemini-2.0-flash-001",
+        "gemini/gemini-2.0-flash-lite-001",
+        "gemini/gemini-2.5-flash-preview-04-17",
+        "gemini/gemini-2.5-pro-exp-03-25",
+    ],
+)
+def test_gemini_models(model):
+    llm = LLM(model=model)
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"], filter_query_parameters=["key"])
+@pytest.mark.parametrize(
+    "model",
+    [
+        "gemini/gemma-3-1b-it",
+        "gemini/gemma-3-4b-it",
+        "gemini/gemma-3-12b-it",
+        "gemini/gemma-3-27b-it",
+    ],
+)
+def test_gemma3(model):
+    llm = LLM(model=model)
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+@pytest.mark.parametrize(
+    "model", ["gpt-4.1", "gpt-4.1-mini-2025-04-14", "gpt-4.1-nano-2025-04-14"]
+)
+def test_gpt_4_1(model):
+    llm = LLM(model=model)
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
@pytest.mark.vcr(filter_headers=["authorization"])
 def test_o3_mini_reasoning_effort_high():
    llm = LLM(
@@ -327,6 +373,45 @@ def get_weather_tool_schema():
        },
    }

+def test_context_window_exceeded_error_handling():
+    """Test that litellm.ContextWindowExceededError is converted to LLMContextLengthExceededException."""
+    from litellm.exceptions import ContextWindowExceededError
+
+    from crewai.utilities.exceptions.context_window_exceeding_exception import (
+        LLMContextLengthExceededException,
+    )
+
+    llm = LLM(model="gpt-4")
+
+    # Test non-streaming response
+    with patch("litellm.completion") as mock_completion:
+        mock_completion.side_effect = ContextWindowExceededError(
+            "This model's maximum context length is 8192 tokens. However, your messages resulted in 10000 tokens.",
+            model="gpt-4",
+            llm_provider="openai"
+        )
+
+        with pytest.raises(LLMContextLengthExceededException) as excinfo:
+            llm.call("This is a test message")
+
+        assert "context length exceeded" in str(excinfo.value).lower()
+        assert "8192 tokens" in str(excinfo.value)
+
+    # Test streaming response
+    llm = LLM(model="gpt-4", stream=True)
+    with patch("litellm.completion") as mock_completion:
+        mock_completion.side_effect = ContextWindowExceededError(
+            "This model's maximum context length is 8192 tokens. However, your messages resulted in 10000 tokens.",
+            model="gpt-4",
+            llm_provider="openai"
+        )
+
+        with pytest.raises(LLMContextLengthExceededException) as excinfo:
+            llm.call("This is a test message")
+
+        assert "context length exceeded" in str(excinfo.value).lower()
+        assert "8192 tokens" in str(excinfo.value)
+

@pytest.mark.vcr(filter_headers=["authorization"])
@pytest.fixture
Author	SHA1	Message	Date
Lucas Gomide	c68257efbc	style: fix linter issue	2025-04-25 10:30:58 -03:00
João Moura	5b9606e8b6	fix contenxt windown	2025-04-24 23:09:23 -07:00
Kunal Lunia	685d20f46c	added gpt-4.1 models and gemini-2.0 and 2.5 pro models (#2609 ) Some checks failed Notify Downstream / notify-downstream (push) Has been cancelled Details * added gpt4.1 models and gemini 2.0 and 2.5 models * added flash model * Updated test fun to all models * Added Gemma3 test cases and passed all google test case * added gemini 2.5 flash * added gpt4.1 models and gemini 2.0 and 2.5 models * added flash model * Updated test fun to all models * Added Gemma3 test cases and passed all google test case * added gemini 2.5 flash * added gpt4.1 models and gemini 2.0 and 2.5 models * added flash model * Updated test fun to all models * Added Gemma3 test cases and passed all google test case * added gemini 2.5 flash * test: add missing cassettes * test: ignore authorization key from gemini/gemma3 request --------- Co-authored-by: Lucas Gomide <lucaslg200@gmail.com> Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>	2025-04-23 11:20:32 -07:00
Lucas Gomide	9ebf3aa043	docs(CodeInterpreterTool): update docs (#2675 )	2025-04-23 10:27:25 -07:00