Compare commits

...

4 Commits

Author SHA1 Message Date
Lucas Gomide
c68257efbc style: fix linter issue 2025-04-25 10:30:58 -03:00
João Moura
5b9606e8b6 fix contenxt windown 2025-04-24 23:09:23 -07:00
Kunal Lunia
685d20f46c added gpt-4.1 models and gemini-2.0 and 2.5 pro models (#2609)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
* added gpt4.1 models and gemini 2.0 and 2.5 models

* added flash model

* Updated test fun to all models

* Added Gemma3 test cases and passed all google test case

* added gemini 2.5 flash

* added gpt4.1 models and gemini 2.0 and 2.5 models

* added flash model

* Updated test fun to all models

* Added Gemma3 test cases and passed all google test case

* added gemini 2.5 flash

* added gpt4.1 models and gemini 2.0 and 2.5 models

* added flash model

* Updated test fun to all models

* Added Gemma3 test cases and passed all google test case

* added gemini 2.5 flash

* test: add missing cassettes

* test: ignore authorization key from gemini/gemma3 request

---------

Co-authored-by: Lucas Gomide <lucaslg200@gmail.com>
Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>
2025-04-23 11:20:32 -07:00
Lucas Gomide
9ebf3aa043 docs(CodeInterpreterTool): update docs (#2675) 2025-04-23 10:27:25 -07:00
16 changed files with 1015 additions and 14 deletions

View File

@@ -8,11 +8,29 @@ icon: code-simple
## Description
The `CodeInterpreterTool` enables CrewAI agents to execute Python 3 code that they generate autonomously. The code is run in a secure, isolated Docker container, ensuring safety regardless of the content. This functionality is particularly valuable as it allows agents to create code, execute it, obtain the results, and utilize that information to inform subsequent decisions and actions.
The `CodeInterpreterTool` enables CrewAI agents to execute Python 3 code that they generate autonomously. This functionality is particularly valuable as it allows agents to create code, execute it, obtain the results, and utilize that information to inform subsequent decisions and actions.
## Requirements
There are several ways to use this tool:
### Docker Container (Recommended)
This is the primary option. The code runs in a secure, isolated Docker container, ensuring safety regardless of its content.
Make sure Docker is installed and running on your system. If you dont have it, you can install it from [here](https://docs.docker.com/get-docker/).
### Sandbox environment
If Docker is unavailable — either not installed or not accessible for any reason — the code will be executed in a restricted Python environment - called sandbox.
This environment is very limited, with strict restrictions on many modules and built-in functions.
### Unsafe Execution
**NOT RECOMMENDED FOR PRODUCTION**
This mode allows execution of any Python code, including dangerous calls to `sys, os..` and similar modules. [Check out](/tools/codeinterpretertool#enabling-unsafe-mode) how to enable this mode
## Logging
The `CodeInterpreterTool` logs the selected execution strategy to STDOUT
- Docker must be installed and running on your system. If you don't have it, you can install it from [here](https://docs.docker.com/get-docker/).
## Installation
@@ -74,18 +92,32 @@ programmer_agent = Agent(
)
```
### Enabling `unsafe_mode`
```python Code
from crewai_tools import CodeInterpreterTool
code = """
import os
os.system("ls -la")
"""
CodeInterpreterTool(unsafe_mode=True).run(code=code)
```
## Parameters
The `CodeInterpreterTool` accepts the following parameters during initialization:
- **user_dockerfile_path**: Optional. Path to a custom Dockerfile to use for the code interpreter container.
- **user_docker_base_url**: Optional. URL to the Docker daemon to use for running the container.
- **unsafe_mode**: Optional. Whether to run code directly on the host machine instead of in a Docker container. Default is `False`. Use with caution!
- **unsafe_mode**: Optional. Whether to run code directly on the host machine instead of in a Docker container or sandbox. Default is `False`. Use with caution!
- **default_image_tag**: Optional. Default Docker image tag. Default is `code-interpreter:latest`
When using the tool with an agent, the agent will need to provide:
- **code**: Required. The Python 3 code to execute.
- **libraries_used**: Required. A list of libraries used in the code that need to be installed.
- **libraries_used**: Optional. A list of libraries used in the code that need to be installed. Default is `[]`
## Agent Integration Example
@@ -152,7 +184,7 @@ class CodeInterpreterTool(BaseTool):
if self.unsafe_mode:
return self.run_code_unsafe(code, libraries_used)
else:
return self.run_code_in_docker(code, libraries_used)
return self.run_code_safety(code, libraries_used)
```
The tool performs the following steps:
@@ -168,8 +200,9 @@ The tool performs the following steps:
By default, the `CodeInterpreterTool` runs code in an isolated Docker container, which provides a layer of security. However, there are still some security considerations to keep in mind:
1. The Docker container has access to the current working directory, so sensitive files could potentially be accessed.
2. The `unsafe_mode` parameter allows code to be executed directly on the host machine, which should only be used in trusted environments.
3. Be cautious when allowing agents to install arbitrary libraries, as they could potentially include malicious code.
2. If the Docker container is unavailable and the code needs to run safely, it will be executed in a sandbox environment. For security reasons, installing arbitrary libraries is not allowed
3. The `unsafe_mode` parameter allows code to be executed directly on the host machine, which should only be used in trusted environments.
4. Be cautious when allowing agents to install arbitrary libraries, as they could potentially include malicious code.
## Conclusion

View File

@@ -122,7 +122,16 @@ PROVIDERS = [
]
MODELS = {
"openai": ["gpt-4", "gpt-4o", "gpt-4o-mini", "o1-mini", "o1-preview"],
"openai": [
"gpt-4",
"gpt-4.1",
"gpt-4.1-mini-2025-04-14",
"gpt-4.1-nano-2025-04-14",
"gpt-4o",
"gpt-4o-mini",
"o1-mini",
"o1-preview",
],
"anthropic": [
"claude-3-5-sonnet-20240620",
"claude-3-sonnet-20240229",
@@ -132,8 +141,17 @@ MODELS = {
"gemini": [
"gemini/gemini-1.5-flash",
"gemini/gemini-1.5-pro",
"gemini/gemini-2.0-flash-lite-001",
"gemini/gemini-2.0-flash-001",
"gemini/gemini-2.0-flash-thinking-exp-01-21",
"gemini/gemini-2.5-flash-preview-04-17",
"gemini/gemini-2.5-pro-exp-03-25",
"gemini/gemini-gemma-2-9b-it",
"gemini/gemini-gemma-2-27b-it",
"gemini/gemma-3-1b-it",
"gemini/gemma-3-4b-it",
"gemini/gemma-3-12b-it",
"gemini/gemma-3-27b-it",
],
"nvidia_nim": [
"nvidia_nim/nvidia/mistral-nemo-minitron-8b-8k-instruct",

View File

@@ -37,6 +37,7 @@ with warnings.catch_warnings():
warnings.simplefilter("ignore", UserWarning)
import litellm
from litellm import Choices
from litellm.exceptions import ContextWindowExceededError
from litellm.litellm_core_utils.get_supported_openai_params import (
get_supported_openai_params,
)
@@ -81,14 +82,26 @@ LLM_CONTEXT_WINDOW_SIZES = {
"gpt-4o": 128000,
"gpt-4o-mini": 128000,
"gpt-4-turbo": 128000,
"gpt-4.1": 1047576, # Based on official docs
"gpt-4.1-mini-2025-04-14": 1047576,
"gpt-4.1-nano-2025-04-14": 1047576,
"o1-preview": 128000,
"o1-mini": 128000,
"o3-mini": 200000, # Based on official o3-mini specifications
# gemini
"gemini-2.0-flash": 1048576,
"gemini-2.0-flash-thinking-exp-01-21": 32768,
"gemini-2.0-flash-lite-001": 1048576,
"gemini-2.0-flash-001": 1048576,
"gemini-2.5-flash-preview-04-17": 1048576,
"gemini-2.5-pro-exp-03-25": 1048576,
"gemini-1.5-pro": 2097152,
"gemini-1.5-flash": 1048576,
"gemini-1.5-flash-8b": 1048576,
"gemini/gemma-3-1b-it": 32000,
"gemini/gemma-3-4b-it": 128000,
"gemini/gemma-3-12b-it": 128000,
"gemini/gemma-3-27b-it": 128000,
# deepseek
"deepseek-chat": 128000,
# groq
@@ -585,6 +598,11 @@ class LLM(BaseLLM):
self._handle_emit_call_events(full_response, LLMCallType.LLM_CALL)
return full_response
except ContextWindowExceededError as e:
# Catch context window errors from litellm and convert them to our own exception type.
# This exception is handled by CrewAgentExecutor._invoke_loop() which can then
# decide whether to summarize the content or abort based on the respect_context_window flag.
raise LLMContextLengthExceededException(str(e))
except Exception as e:
logging.error(f"Error in streaming response: {str(e)}")
if full_response.strip():
@@ -699,7 +717,16 @@ class LLM(BaseLLM):
str: The response text
"""
# --- 1) Make the completion call
response = litellm.completion(**params)
try:
# Attempt to make the completion call, but catch context window errors
# and convert them to our own exception type for consistent handling
# across the codebase. This allows CrewAgentExecutor to handle context
# length issues appropriately.
response = litellm.completion(**params)
except ContextWindowExceededError as e:
# Convert litellm's context window error to our own exception type
# for consistent handling in the rest of the codebase
raise LLMContextLengthExceededException(str(e))
# --- 2) Extract response message and content
response_message = cast(Choices, cast(ModelResponse, response).choices)[
@@ -858,15 +885,17 @@ class LLM(BaseLLM):
params, callbacks, available_functions
)
except LLMContextLengthExceededException:
# Re-raise LLMContextLengthExceededException as it should be handled
# by the CrewAgentExecutor._invoke_loop method, which can then decide
# whether to summarize the content or abort based on the respect_context_window flag
raise
except Exception as e:
crewai_event_bus.emit(
self,
event=LLMCallFailedEvent(error=str(e)),
)
if not LLMContextLengthExceededException(
str(e)
)._is_context_limit_error(str(e)):
logging.error(f"LiteLLM call failed: {str(e)}")
logging.error(f"LiteLLM call failed: {str(e)}")
raise
def _handle_emit_call_events(self, response: Any, call_type: LLMCallType):

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-001:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/62RTU+EMBCG7/0VTY9kIQUT1vXqx0njRokxUQ8jDNAILaFdoyH8dwssbNGrTdo0
807nnT7TEUpZCjITGRjU7IK+2Ail3XgOmpIGpbHCHLLBBlpzyp1W59xtisGv4RFLSqQpNMJARVVO
b1qQKVKhqeftoRXa84JXyZy3/XJ/25wcW1XhUK5WGVZzej8nsFxIocsHBK3kkPaY3O/ZosJncauK
plXvQ9M+D3gUxiE/tzs6i+JtyHdkth5N2UFDgXdowFKB5e/Mlqgbk6gPlJfqMFLZTi4Ow5W8O8pG
WQArJYw3f4rqK2spKhetQ93+HSphvkes188Jc/iYVU8zH+Jg/N3hP3nt1l7kOJVpUE/YajFNpMDa
zsiPAu7nFejS5zxkpCc/6so6tIECAAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:05 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1219
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-lite-001:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/61RTU+EMBC98yuanhdSiMjq1Y+Txo0SY6J7GGGARmhJ2zUawn+3wMIWvdpDM5n3
Zt7Mm84jhGYgcp6DQU0vyavNENKN/4BJYVAYC8wpm2xBmRN3ep0TW4rBr6GIphWSDFpuoCayILcK
RIaEa7IDxXXwJqhT1y/xfnNSU7LGoVUjc6xnej8TaMEF19UjgpZioD2lDzu6oPBZ3smyVfJ9GNhn
AQuTMIpjFl/EZyxKwuTcm6VHUXrQUOI9GrCOwLI3tS2a1qTyA8WVPIyOJJOK498K3h5hI+3yKySM
N3+a6msryWvXVsdxuzvU3HyPlt68pNTxx6xmmv3xHBt/T/hPWtu1lne8ynSoZ1SaTxcpsbE38qOA
+UUNuvJtd/QZC6nXez+t5iqFggIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:06 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1090
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,58 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-thinking-exp-01-21:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/22QQWuEMBCF7/6KkKNsFt1DKb2221vp0koplD0M66jDxkSSWbCI/71Rq+vS5pCE
eW9meF8XCSFPYHLKgdHLB/EVKkJ04z1o1jAaDsJcCsUGHF+90+lW/2BhbIcmmVUoTtAQgxa2EM8O
zAkFeRHHB3Dk43grV5398j9urvuc1TgMq22Oerb3s0EWZMhXbwjemsH2nr0e5KKSybEN5SSaF4yj
5cVDiS/IEJLDkk82ztYNZ/aM5tFexuT306wVp39ltiHkjZLebf4M9U9hJek1vhXZkBA08feIbv+Z
yRUFvlk6UxjfY/TLY0L0gc7TxKLEOtBRu22iCg2+UlyROZMpFbaNSlK1S2XURz/Fy1P+CAIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:04 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=764
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview-04-17:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2WQT0+EMBDF7/0UTY9k2ewaDerVPzfjRokxMR4mMEBjaUk76BrCd7fAslvcHppm
5s2bvl/HOBcZ6FzmQOjELf/wFc678R56RhNq8o255IsNWDppp9MFby8h3A9DIq2QZ9BIAsVNwR8t
6Ay5dDyKdmCli6K1CCb74/tzddpnjcLBrDY5qlnezwJRSC1d9YLgjB5kr+nzThy7Uue49+UNmxeM
1qJ1UOITEvjkcMwnGmvqhlLzhfrOtGPy68kr4LRoJzeHPhmfcjmZrM5c3b3fKVXIL0DrI4KS9Duy
e3hPRYCBFtYzBhbQElSZtqzo3we37IBrIviG1skJVYm1hxdfrK/iQoGr4sbit8SfeHMZbxPBevYH
O2bXiSICAAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:28 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=20971
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/12QT2uEMBDF7/kUIUepi2tZaHttu7fSpZVSKD0MOquhmkgygkX87o26cWM9SJj3
5s/7DYxzkYMqZAGEVjzwL1fhfJj/k6YVoSIn+JIrtmDo6l2+IXg7C2E/NYmsQp5DKwlqrs/8aEDl
yKXlUXQCI20U7UTQOa7v75vrPqNrnIY1usDa20dvEGeppK3eEKxWk+09ez2JVZWqwN6VE+YXzKNF
Z6HEFyRwyWHNJ1qjm5Yy/YPqUXdz8rtlVsBpI++T5GIg7WL+03xzMNc+ua2yDgkGcF1IqCX9zvSe
PzMRgKDNWR4EC3gJqnRXVrQ98T5lF2ALww80Vi6wSmwcvjjdHWJ3Yox9Gye3cXoQbGR/TedYqx4C
AAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:30 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=2418
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,59 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-12b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2WRTWvDMAyG7/kVwpdBSEvX7jB23QfsMFa2MAZbD2qipGaOFWwFWkr/+5ykaVPq
gGP0SvLrR/sIQGVoc52jkFcP8BMiAPtubzW2QlaCMIRCsEYn59x+7UfnkCK0bYtUuiHIsNaCBriA
F4c2I9Ae4niJTvs4nv7a/nuVGw9oPIOEIoOuJC+QadmBtkNlsAoIpeF1aJgFZ+SgYAfBUQIF+o1m
m0CJXhxbrnZJV5E1RhpHUzUyeTidV8n5aY4Ntb4rzskM6YchQRXaar/5IPRs27TP9H2pTqq2OW1D
eBYNF3StVeOxpDcSDJDxhFLVjqtaUv4j+8hNB/m+7zUayYW8mB914QD0QrqbJVdd/VO4U5vxqEZT
DE9EE+h2Y3r+TtUIg1yYGjB0/1V0BNIz+iLndQ+jpKrCyWJyO19PtKjoEP0DlZtdIF8CAAA=
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:39 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=3835
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,60 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-1b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2VRy07DQAy85yusPUZtBSoIxJWHxAFRQYSQKAc3cVqL7DrKuqKlqsRv8Ht8CZuk
aVOxh314xuP1eBMBmBRdxhkqeXMFbyECsGn2GhOn5DQAXSgES6z0wG3XpncPFKVVnWSSBUGKJSsW
IDncVehSAvYQxxOs2MfxCKZu6u719/vHA0Iq1ooDyz6UTqlUDi9doELDr1M1aMbiinXcSQ9gtlTg
VqOGJc855VAztAZWvMInZ1SsoaJU5o6/KANxNDK9X2/39/fBoddKCqobsRLyO/q2I5icHfvFE6EX
V9Oek8eJ2aPsMlqF8EnUFWikzdLjnB5IMbiOe29NWYktNZEPcteybFy/bLV6MzqCxxc7XCXYcASd
nQ/+qfqbUJOL/ux6Yw0tYsG6buZ2+5qYng169KnOhuZ8j3aGtB69UOW5NWNO1uJwPDydDVlNtI3+
AD6XWQdvAgAA
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:32 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1535
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,60 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-27b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2VRXUvDMBR976+45EUo3RDnUHwTnSA4HFpEcHuI7e16aZqU5NZNxv67abtuHTbQ
hHPu5zm7AEAkUqeUSkYn7uDLIwC79t9wRjNq9kQPebCSlk+x3bcbvH0I47ZJEnGOkMiKWCowGTxZ
qRMEchCGC2nJheEYlnqpn/nCQaHNRkNmLJDvSwkoP1kpbeFAUYHAvtiMsgwVxGaDNmqRF1P/WIR5
7bAuI/ApLXxvE0gRYkumrHL0hIMNKtXcxA4y6XIyOoKkJkcau8ykVlxbHDczNUcM1tof36voJIY1
CptNS5Oi6sP3fYDISJPL31A6o5uw9/h1IY4s6RS3Hr4M+gZtaVE7ucY5svS2yKP4orJ+F45NgfrB
1K0tt12tgYln9PX0wLPxFpxR00n0r6p79D1JDc0d+O5XlIr4tzV29hmLgQx8NlQvQ3uvgoMgnUYf
aB11YqyxLOVoMrq6+R4Ri2Af/AEDrXcbkQIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:41 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=2447
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,60 @@
interactions:
- request:
body: '{"contents": [{"role": "user", "parts": [{"text": "What is the capital
of France?"}]}], "generationConfig": {"stop_sequences": []}}'
headers:
accept:
- '*/*'
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '131'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
user-agent:
- litellm/1.60.2
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemma-3-4b-it:generateContent
response:
body:
string: !!binary |
H4sIAAAAAAAC/2WRzUrDQBDH73mKYY+lLUKLiBcPfoAHsWhQwXqYJtNk6WYn7E6woRQ8+wR68t18
Ah/BbWraFPeQXeb/z3z8ZhUBqARtqlMU8uoUnkMEYNV8NxpbIStBaEMhWKKTvXd7Vp13sAgtNz+p
OCdIsNSCBngOVw5tQqA99HoTdNr3ekOY2qm9lu+3Tw8ImeFZ8CahKDmYs4NQrA9z9Llm24cMvTi2
XNQQ2oakMlI5GsLP18d7k+mRK5NCzRUYvSAQhoXl12CuJdc2g4IdAc64Emg6OFOdzte790t/P69j
Q5thCk7JtPZ1a1BzbbXP7wg9243tPr6dqJ2qbUrLED6K2gJNalV5zOiGBAN53PFVpeOilJgXZM+5
asifbHN19nQgj1pdOFA+kMbH/X9Z/UWoqU13f53VhhHRaKmb3V0+xaqDQQ6aajE090v0B2TL6IGc
11sYGRUFDkaD8WygRUXr6BcxmBLccwIAAA==
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Encoding:
- gzip
Content-Type:
- application/json; charset=UTF-8
Date:
- Tue, 22 Apr 2025 14:25:35 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=2349
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- nosniff
X-Frame-Options:
- SAMEORIGIN
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,101 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
"model": "gpt-4.1-mini-2025-04-14", "stop": []}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '125'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.12
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAAwAAAP//jJJPb9swDMXv/hSCznEQe+7S5LgAGzDs0P3paSgMRqJtZrIkSPTQoch3
H2Snsbt1wC4+8MdHv0fxKRNCkpZ7IVUHrHpv8nd3t4eIH77w4ePX+/b+8XP56bTdHE7dgb2Sq6Rw
xxMqflatleu9QSZnJ6wCAmOaWmyrmzflrqrejqB3Gk2StZ7zal3kPVnKy015k2+qvKgu8s6Rwij3
4nsmhBBP4zcZtRof5V5sVs+VHmOEFuX+2iSEDM6kioQYKTJYlqsZKmcZ7ej9W4dCgScGI1wj3gew
CgVFcQeB4nqpCtgMEZJ1OxizAGCtY0jRR78PF3K+OjSu9cEd4x9S2ZCl2NUBITqb3ER2Xo70nAnx
MG5ieBFO+uB6zzW7Hzj+rqimcXJ+gBneXhg7BjOXy3L1yrBaIwOZuFikVKA61LNy3joMmtwCZIvI
f3t5bfYUm2z7P+NnoBR6Rl37gJrUy7xzW8B0nf9qu654NCwjhp+ksGbCkJ5BYwODmU5Gxl+Rsa8b
si0GH2i6m8bX291xuztiVTQyO2e/AQAA//8DAP7WRo9GAwAA
headers:
CF-RAY:
- 93458dcf6d0ef53b-GRU
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 22 Apr 2025 13:44:07 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '1391'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '30000'
x-ratelimit-limit-tokens:
- '150000000'
x-ratelimit-remaining-requests:
- '29999'
x-ratelimit-remaining-tokens:
- '149999989'
x-ratelimit-reset-requests:
- 2ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_74408ec81f430b6e9795cf5332262b8d
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,101 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
"model": "gpt-4.1-nano-2025-04-14", "stop": []}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '125'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.12
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAAwAAAP//jJJPb9swDMXv/hSCznEQuwrq5bgNBbZTh7aHYigMRqJtbbKkSXT/oMh3
L2Snsdt1wC4+8MdHv0fxOWOMa8V3jMsOSPbe5J8vqy/D96K/+PPjSn/Fh5szMewfb/eiar/d8lVS
uP0vlPSqWkvXe4OknZ2wDAiEaWpxLrZn5SchqhH0TqFJstZTLtZFbsG6vNyU23wj8kIc5Z3TEiPf
sZ8ZY4w9j99k1Cp85Du2Wb1WeowRWuS7UxNjPDiTKhxi1JHAEl/NUDpLaEfv1x0yCV4TGOYadhHA
SmQ6sksIOq6XqoDNECFZt4MxCwDWOoIUffR7dySHk0PjWh/cPr6T8kZbHbs6IERnk5tIzvORHjLG
7sZNDG/CcR9c76km9xvH3xViGsfnB5hhdWTkCMxcLsvVB8NqhQTaxMUiuQTZoZqV89ZhUNotQLaI
/LeXj2ZPsbVt/2f8DKRET6hqH1Bp+Tbv3BYwXee/2k4rHg3ziOFeS6xJY0jPoLCBwUwnw+NTJOzr
RtsWgw96upvG14gKq2ajxJZnh+wFAAD//wMATWCJPkYDAAA=
headers:
CF-RAY:
- 93458dd9aebff53b-GRU
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 22 Apr 2025 13:44:08 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '134'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '30000'
x-ratelimit-limit-tokens:
- '150000000'
x-ratelimit-remaining-requests:
- '29999'
x-ratelimit-remaining-tokens:
- '149999990'
x-ratelimit-reset-requests:
- 2ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_0ba738662e063fb55ea01793aafdcecc
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,101 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
"model": "gpt-4.1", "stop": []}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '109'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.68.2
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.68.2
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.12
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAA4xSwYrbMBC9+yvEHEMcHMeh3dx2SwttoYTSW1nMRB7b2pUlIY27XZb8e5GdxM5u
C734MG/e83tP85IIAaqCnQDZIsvO6fRu//6Db26fnvTtx8/5t7v95rmT3y1+pc0XDcvIsIcHknxm
raTtnCZW1oyw9IRMUXX9rthu8pui2A5AZyvSkdY4TovVOs2zfJtmRbouTszWKkkBduJnIoQQL8M3
ejQV/YadyJbnSUchYEOwuywJAd7qOAEMQQVGw7CcQGkNkxls/2hJSHSKUQtbi08ejSShglgs9uhV
WCxWc6anug8YnZte6xmAxljGmHzwfH9CjheX2jbO20N4RYVaGRXa0hMGa6KjwNbBgB4TIe6HNvqr
gOC87RyXbB9p+N26GOVg6n8GnpoCtox6mudn0pVaWRGj0mHWJkiULVUTc6oe+0rZGZDMMr818zft
Mbcyzf/IT4CU5Jiq0nmqlLwOPK15itf5r7VLx4NhCOR/KUklK/LxHSqqsdfj3UB4DkxdWSvTkHde
jcdTuxJllh2yLMskJMfkDwAAAP//AwC4aq9JRgMAAA==
headers:
CF-RAY:
- 93458dcc1b9df53b-GRU
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Tue, 22 Apr 2025 13:44:06 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '296'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '30000000'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '29999989'
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 0s
x-request-id:
- req_7b4e1628a0608e9547e71e7857a29b58
status:
code: 200
message: OK
version: 1

View File

@@ -256,6 +256,52 @@ def test_validate_call_params_no_response_format():
llm._validate_call_params()
@pytest.mark.vcr(filter_headers=["authorization"], filter_query_parameters=["key"])
@pytest.mark.parametrize(
"model",
[
"gemini/gemini-2.0-flash-thinking-exp-01-21",
"gemini/gemini-2.0-flash-001",
"gemini/gemini-2.0-flash-lite-001",
"gemini/gemini-2.5-flash-preview-04-17",
"gemini/gemini-2.5-pro-exp-03-25",
],
)
def test_gemini_models(model):
llm = LLM(model=model)
result = llm.call("What is the capital of France?")
assert isinstance(result, str)
assert "Paris" in result
@pytest.mark.vcr(filter_headers=["authorization"], filter_query_parameters=["key"])
@pytest.mark.parametrize(
"model",
[
"gemini/gemma-3-1b-it",
"gemini/gemma-3-4b-it",
"gemini/gemma-3-12b-it",
"gemini/gemma-3-27b-it",
],
)
def test_gemma3(model):
llm = LLM(model=model)
result = llm.call("What is the capital of France?")
assert isinstance(result, str)
assert "Paris" in result
@pytest.mark.vcr(filter_headers=["authorization"])
@pytest.mark.parametrize(
"model", ["gpt-4.1", "gpt-4.1-mini-2025-04-14", "gpt-4.1-nano-2025-04-14"]
)
def test_gpt_4_1(model):
llm = LLM(model=model)
result = llm.call("What is the capital of France?")
assert isinstance(result, str)
assert "Paris" in result
@pytest.mark.vcr(filter_headers=["authorization"])
def test_o3_mini_reasoning_effort_high():
llm = LLM(
@@ -327,6 +373,45 @@ def get_weather_tool_schema():
},
}
def test_context_window_exceeded_error_handling():
"""Test that litellm.ContextWindowExceededError is converted to LLMContextLengthExceededException."""
from litellm.exceptions import ContextWindowExceededError
from crewai.utilities.exceptions.context_window_exceeding_exception import (
LLMContextLengthExceededException,
)
llm = LLM(model="gpt-4")
# Test non-streaming response
with patch("litellm.completion") as mock_completion:
mock_completion.side_effect = ContextWindowExceededError(
"This model's maximum context length is 8192 tokens. However, your messages resulted in 10000 tokens.",
model="gpt-4",
llm_provider="openai"
)
with pytest.raises(LLMContextLengthExceededException) as excinfo:
llm.call("This is a test message")
assert "context length exceeded" in str(excinfo.value).lower()
assert "8192 tokens" in str(excinfo.value)
# Test streaming response
llm = LLM(model="gpt-4", stream=True)
with patch("litellm.completion") as mock_completion:
mock_completion.side_effect = ContextWindowExceededError(
"This model's maximum context length is 8192 tokens. However, your messages resulted in 10000 tokens.",
model="gpt-4",
llm_provider="openai"
)
with pytest.raises(LLMContextLengthExceededException) as excinfo:
llm.call("This is a test message")
assert "context length exceeded" in str(excinfo.value).lower()
assert "8192 tokens" in str(excinfo.value)
@pytest.mark.vcr(filter_headers=["authorization"])
@pytest.fixture