Compare commits

..

1 Commits

Author SHA1 Message Date
Lucas Gomide
5ed863385e fix: handle GPT-5.x models not supporting the stop API parameter
GPT-5.x models reject the `stop` parameter at the API level with "Unsupported parameter: 'stop' is not supported with this model". This breaks CrewAI executions when routing through LiteLLM (e.g. via
OpenAI-compatible gateways like Asimov), because the LiteLLM fallback path always includes `stop` in the API request params.

The native OpenAI provider was unaffected because it never sends `stop` to the API — it applies stop words client-side via `_apply_stop_words()`. However, when the request goes through LiteLLM (custom endpoints, proxy gateways),
`stop` is sent as an API parameter and GPT-5.x rejects it.

Additionally, the existing retry logic that catches this error only matched the OpenAI API error format ("Unsupported parameter") but missed
LiteLLM's own pre-validation error format ("does not support parameters"), so the self-healing retry never triggered for LiteLLM-routed calls.
2026-03-27 16:32:13 -03:00
7 changed files with 962 additions and 1209 deletions

File diff suppressed because it is too large Load Diff

View File

@@ -1,550 +0,0 @@
---
title: Single Sign-On (SSO)
icon: "key"
description: Configure enterprise SSO authentication for CrewAI Platform — SaaS and Factory
---
## Overview
CrewAI Platform supports enterprise Single Sign-On (SSO) across both **SaaS (AMP)** and **Factory (self-hosted)** deployments. SSO enables your team to authenticate using your organization's existing identity provider, enforcing centralized access control, MFA policies, and user lifecycle management.
### Supported Providers
| Provider | SaaS | Factory | Protocol | CLI Support |
|---|---|---|---|---|
| **WorkOS** | ✅ (default) | ✅ | OAuth 2.0 / OIDC | ✅ |
| **Microsoft Entra ID** (Azure AD) | ✅ (enterprise) | ✅ | OAuth 2.0 / SAML 2.0 | ✅ |
| **Okta** | ✅ (enterprise) | ✅ | OAuth 2.0 / OIDC | ✅ |
| **Auth0** | ✅ (enterprise) | ✅ | OAuth 2.0 / OIDC | ✅ |
| **Keycloak** | — | ✅ | OAuth 2.0 / OIDC | ✅ |
### Key Capabilities
- **SAML 2.0 and OAuth 2.0 / OIDC** protocol support
- **Device Authorization Grant** flow for CLI authentication
- **Role-Based Access Control (RBAC)** with custom roles and per-resource permissions
- **MFA enforcement** delegated to your identity provider
- **User provisioning** through IdP assignment (users/groups)
---
## SaaS SSO
### Default Authentication
CrewAI's managed SaaS platform (AMP) uses **WorkOS** as the default authentication provider. When you sign up at [app.crewai.com](https://app.crewai.com), authentication is handled through `login.crewai.com` — no additional SSO configuration is required.
### Enterprise Custom SSO
Enterprise SaaS customers can configure SSO with their own identity provider (Entra ID, Okta, Auth0). Contact your CrewAI account team to enable custom SSO for your organization. Once configured:
1. Your team members authenticate through your organization's IdP
2. Access control and MFA policies are enforced by your IdP
3. The CrewAI CLI automatically detects your SSO configuration via `crewai enterprise configure`
### CLI Defaults (SaaS)
| Setting | Default Value |
|---|---|
| `enterprise_base_url` | `https://app.crewai.com` |
| `oauth2_provider` | `workos` |
| `oauth2_domain` | `login.crewai.com` |
---
## Factory SSO Setup
Factory (self-hosted) deployments require you to configure SSO by setting environment variables in your Helm `values.yaml` and registering an application in your identity provider.
### Microsoft Entra ID (Azure AD)
<Steps>
<Step title="Register an Application">
1. Go to [portal.azure.com](https://portal.azure.com) → **Microsoft Entra ID** → **App registrations** → **New registration**
2. Configure:
- **Name:** `CrewAI` (or your preferred name)
- **Supported account types:** Accounts in this organizational directory only
- **Redirect URI:** Select **Web**, enter `https://<your-domain>/auth/entra_id/callback`
3. Click **Register**
</Step>
<Step title="Collect Credentials">
From the app overview page, copy:
- **Application (client) ID** → `ENTRA_ID_CLIENT_ID`
- **Directory (tenant) ID** → `ENTRA_ID_TENANT_ID`
</Step>
<Step title="Create Client Secret">
1. Navigate to **Certificates & Secrets** → **New client secret**
2. Add a description and select expiration period
3. Copy the secret value immediately (it won't be shown again) → `ENTRA_ID_CLIENT_SECRET`
</Step>
<Step title="Grant Admin Consent">
1. Go to **Enterprise applications** → select your app
2. Under **Security** → **Permissions**, click **Grant admin consent**
3. Ensure **Microsoft Graph → User.Read** is granted
</Step>
<Step title="Configure App Roles (Recommended)">
Under **App registrations** → your app → **App roles**, create:
| Display Name | Value | Allowed Member Types |
|---|---|---|
| Member | `member` | Users/Groups |
| Factory Admin | `factory-admin` | Users/Groups |
<Note>
The `member` role grants login access. The `factory-admin` role grants admin panel access. Roles are included in the JWT automatically.
</Note>
</Step>
<Step title="Assign Users">
1. Under **Properties**, set **Assignment required?** to **Yes**
2. Under **Users and groups**, assign users/groups with the appropriate role
</Step>
<Step title="Set Environment Variables">
```yaml
envVars:
AUTH_PROVIDER: "entra_id"
secrets:
ENTRA_ID_CLIENT_ID: "<Application (client) ID>"
ENTRA_ID_CLIENT_SECRET: "<Client Secret>"
ENTRA_ID_TENANT_ID: "<Directory (tenant) ID>"
```
</Step>
<Step title="Enable CLI Support (Optional)">
To allow `crewai login` via Device Authorization Grant:
1. Under **Authentication** → **Advanced settings**, enable **Allow public client flows**
2. Under **Expose an API**, add an Application ID URI (e.g., `api://crewai-cli`)
3. Add a scope (e.g., `read`) with **Admins and users** consent
4. Under **Manifest**, set `accessTokenAcceptedVersion` to `2`
5. Add environment variables:
```yaml
secrets:
ENTRA_ID_DEVICE_AUTHORIZATION_CLIENT_ID: "<Application (client) ID>"
ENTRA_ID_CUSTOM_OPENID_SCOPE: "<scope URI, e.g. api://crewai-cli/read>"
```
</Step>
</Steps>
---
### Okta
<Steps>
<Step title="Create App Integration">
1. Open Okta Admin Console → **Applications** → **Create App Integration**
2. Select **OIDC - OpenID Connect** → **Web Application** → **Next**
3. Configure:
- **App integration name:** `CrewAI SSO`
- **Sign-in redirect URI:** `https://<your-domain>/auth/okta/callback`
- **Sign-out redirect URI:** `https://<your-domain>`
- **Assignments:** Choose who can access (everyone or specific groups)
4. Click **Save**
</Step>
<Step title="Collect Credentials">
From the app details page:
- **Client ID** → `OKTA_CLIENT_ID`
- **Client Secret** → `OKTA_CLIENT_SECRET`
- **Okta URL** (top-right corner, under your username) → `OKTA_SITE`
</Step>
<Step title="Configure Authorization Server">
1. Navigate to **Security** → **API**
2. Select your authorization server (default: `default`)
3. Under **Access Policies**, add a policy and rule:
- In the rule, under **Scopes requested**, select **The following scopes** → **OIDC default scopes**
4. Note the **Name** and **Audience** of the authorization server
<Warning>
The authorization server name and audience must match `OKTA_AUTHORIZATION_SERVER` and `OKTA_AUDIENCE` exactly. Mismatches cause `401 Unauthorized` or `Invalid token: Signature verification failed` errors.
</Warning>
</Step>
<Step title="Set Environment Variables">
```yaml
envVars:
AUTH_PROVIDER: "okta"
secrets:
OKTA_CLIENT_ID: "<Okta app client ID>"
OKTA_CLIENT_SECRET: "<Okta client secret>"
OKTA_SITE: "https://your-domain.okta.com"
OKTA_AUTHORIZATION_SERVER: "default"
OKTA_AUDIENCE: "api://default"
```
</Step>
<Step title="Enable CLI Support (Optional)">
1. Create a **new** app integration: **OIDC** → **Native Application**
2. Enable **Device Authorization** and **Refresh Token** grant types
3. Allow everyone in your organization to access
4. Add environment variable:
```yaml
secrets:
OKTA_DEVICE_AUTHORIZATION_CLIENT_ID: "<Native app client ID>"
```
<Note>
Device Authorization requires a **Native Application** — it cannot use the Web Application created for browser-based SSO.
</Note>
</Step>
</Steps>
---
### Keycloak
<Steps>
<Step title="Create a Client">
1. Open Keycloak Admin Console → navigate to your realm
2. **Clients** → **Create client**:
- **Client type:** OpenID Connect
- **Client ID:** `crewai-factory` (suggested)
3. Capability config:
- **Client authentication:** On
- **Standard flow:** Checked
4. Login settings:
- **Root URL:** `https://<your-domain>`
- **Valid redirect URIs:** `https://<your-domain>/auth/keycloak/callback`
- **Valid post logout redirect URIs:** `https://<your-domain>`
5. Click **Save**
</Step>
<Step title="Collect Credentials">
- **Client ID** → `KEYCLOAK_CLIENT_ID`
- Under **Credentials** tab: **Client secret** → `KEYCLOAK_CLIENT_SECRET`
- **Realm name** → `KEYCLOAK_REALM`
- **Keycloak server URL** → `KEYCLOAK_SITE`
</Step>
<Step title="Set Environment Variables">
```yaml
envVars:
AUTH_PROVIDER: "keycloak"
secrets:
KEYCLOAK_CLIENT_ID: "<client ID>"
KEYCLOAK_CLIENT_SECRET: "<client secret>"
KEYCLOAK_SITE: "https://keycloak.yourdomain.com"
KEYCLOAK_REALM: "<realm name>"
KEYCLOAK_AUDIENCE: "account"
# Only set if using a custom base path (pre-v17 migrations):
# KEYCLOAK_BASE_URL: "/auth"
```
<Note>
Keycloak includes `account` as the default audience in access tokens. For most installations, `KEYCLOAK_AUDIENCE=account` works without additional configuration. See [Keycloak audience documentation](https://www.keycloak.org/docs/latest/authorization_services/index.html) if you need a custom audience.
</Note>
</Step>
<Step title="Enable CLI Support (Optional)">
1. Create a **second** client:
- **Client type:** OpenID Connect
- **Client ID:** `crewai-factory-cli` (suggested)
- **Client authentication:** Off (Device Authorization requires a public client)
- **Authentication flow:** Check **only** OAuth 2.0 Device Authorization Grant
2. Add environment variable:
```yaml
secrets:
KEYCLOAK_DEVICE_AUTHORIZATION_CLIENT_ID: "<CLI client ID>"
```
</Step>
</Steps>
---
### WorkOS
<Steps>
<Step title="Configure in WorkOS Dashboard">
1. Create an application in the [WorkOS Dashboard](https://dashboard.workos.com)
2. Configure the redirect URI: `https://<your-domain>/auth/workos/callback`
3. Note the **Client ID** and **AuthKit domain**
4. Set up organizations in the WorkOS dashboard
</Step>
<Step title="Set Environment Variables">
```yaml
envVars:
AUTH_PROVIDER: "workos"
secrets:
WORKOS_CLIENT_ID: "<WorkOS client ID>"
WORKOS_AUTHKIT_DOMAIN: "<your-authkit-domain.authkit.com>"
```
</Step>
</Steps>
---
### Auth0
<Steps>
<Step title="Create Application">
1. In the [Auth0 Dashboard](https://manage.auth0.com), create a new **Regular Web Application**
2. Configure:
- **Allowed Callback URLs:** `https://<your-domain>/auth/auth0/callback`
- **Allowed Logout URLs:** `https://<your-domain>`
3. Note the **Domain**, **Client ID**, and **Client Secret**
</Step>
<Step title="Set Environment Variables">
```yaml
envVars:
AUTH_PROVIDER: "auth0"
secrets:
AUTH0_CLIENT_ID: "<Auth0 client ID>"
AUTH0_CLIENT_SECRET: "<Auth0 client secret>"
AUTH0_DOMAIN: "<your-tenant.auth0.com>"
```
</Step>
<Step title="Enable CLI Support (Optional)">
1. Create a **Native** application in Auth0 for Device Authorization
2. Enable the **Device Authorization** grant type under application settings
3. Configure the CLI with the appropriate audience and client ID
</Step>
</Steps>
---
## CLI Authentication
The CrewAI CLI supports SSO authentication via the **Device Authorization Grant** flow. This allows developers to authenticate from their terminal without exposing credentials.
### Quick Setup
For Factory installations, the CLI can auto-configure all OAuth2 settings:
```bash
crewai enterprise configure https://your-factory-url.app
```
This command fetches the SSO configuration from your Factory instance and sets all required CLI parameters automatically.
Then authenticate:
```bash
crewai login
```
<Note>
Requires CrewAI CLI version **1.6.0** or higher for Entra ID, **0.159.0** or higher for Okta, and **1.9.0** or higher for Keycloak.
</Note>
### Manual CLI Configuration
If you need to configure the CLI manually, use `crewai config set`:
```bash
# Set the provider
crewai config set oauth2_provider okta
# Set provider-specific values
crewai config set oauth2_domain your-domain.okta.com
crewai config set oauth2_client_id your-client-id
crewai config set oauth2_audience api://default
# Set the enterprise base URL
crewai config set enterprise_base_url https://your-factory-url.app
```
### CLI Configuration Reference
| Setting | Description | Example |
|---|---|---|
| `enterprise_base_url` | Your CrewAI instance URL | `https://crewai.yourcompany.com` |
| `oauth2_provider` | Provider name | `workos`, `okta`, `auth0`, `entra_id`, `keycloak` |
| `oauth2_domain` | Provider domain | `your-domain.okta.com` |
| `oauth2_client_id` | OAuth2 client ID | `0oaqnwji7pGW7VT6T697` |
| `oauth2_audience` | API audience identifier | `api://default` |
View current configuration:
```bash
crewai config list
```
### How Device Authorization Works
1. Run `crewai login` — the CLI requests a device code from your IdP
2. A verification URL and code are displayed in your terminal
3. Your browser opens to the verification URL
4. Enter the code and authenticate with your IdP credentials
5. The CLI receives an access token and stores it locally
---
## Role-Based Access Control (RBAC)
CrewAI Platform provides granular RBAC that integrates with your SSO provider.
### Permission Model
| Permission | Description |
|---|---|
| **Read** | View resources (dashboards, automations, logs) |
| **Write** | Create and modify resources |
| **Manage** | Full control including deletion and configuration |
### Resources
Permissions can be scoped to individual resources:
- **Usage Dashboard** — Platform usage metrics and analytics
- **Automations Dashboard** — Crew and flow management
- **Environment Variables** — Secret and configuration management
- **Individual Automations** — Per-automation access control
### Roles
- **Predefined roles** come out of the box with standard permission sets
- **Custom roles** can be created with any combination of permissions
- **Per-resource assignment** — limit specific automations to individual users or roles
### Factory Admin Access
For Factory deployments using Entra ID, admin access is controlled via App Roles:
- Assign the `factory-admin` role to users who need admin panel access
- Assign the `member` role for standard platform access
- Roles are communicated via JWT claims — no additional configuration needed after IdP setup
---
## Troubleshooting
### Invalid Redirect URI
**Symptom:** Authentication fails with a redirect URI mismatch error.
**Fix:** Ensure the redirect URI in your IdP exactly matches the expected callback URL:
| Provider | Callback URL |
|---|---|
| Entra ID | `https://<domain>/auth/entra_id/callback` |
| Okta | `https://<domain>/auth/okta/callback` |
| Keycloak | `https://<domain>/auth/keycloak/callback` |
| WorkOS | `https://<domain>/auth/workos/callback` |
| Auth0 | `https://<domain>/auth/auth0/callback` |
### CLI Login Fails (Device Authorization)
**Symptom:** `crewai login` returns an error or times out.
**Fix:**
- Verify that Device Authorization Grant is enabled in your IdP
- For Okta: ensure you have a **Native Application** (not Web) with Device Authorization grant
- For Entra ID: ensure **Allow public client flows** is enabled
- For Keycloak: ensure the CLI client has **Client authentication: Off** and only Device Authorization Grant enabled
- Check that `*_DEVICE_AUTHORIZATION_CLIENT_ID` environment variable is set on the server
### Token Validation Errors
**Symptom:** `Invalid token: Signature verification failed` or `401 Unauthorized` after login.
**Fix:**
- **Okta:** Verify `OKTA_AUTHORIZATION_SERVER` and `OKTA_AUDIENCE` match the authorization server's Name and Audience exactly
- **Entra ID:** Ensure `accessTokenAcceptedVersion` is set to `2` in the app manifest
- **Keycloak:** Verify `KEYCLOAK_AUDIENCE` matches the audience in your access tokens (default: `account`)
### Admin Consent Not Granted (Entra ID)
**Symptom:** Users can't log in, see "needs admin approval" message.
**Fix:** Go to **Enterprise applications** → your app → **Permissions** → **Grant admin consent**. Ensure `User.Read` is granted for Microsoft Graph.
### 403 Forbidden After Login
**Symptom:** User authenticates successfully but gets 403 errors.
**Fix:**
- Check that the user is assigned to the application in your IdP
- For Entra ID with **Assignment required = Yes**: ensure the user has a role assignment (Member or Factory Admin)
- For Okta: verify the user or their group is assigned under the app's **Assignments** tab
### CLI Can't Reach Factory Instance
**Symptom:** `crewai enterprise configure` fails to connect.
**Fix:**
- Verify the Factory URL is reachable from your machine
- Check that `enterprise_base_url` is set correctly: `crewai config list`
- Ensure TLS certificates are valid and trusted
---
## Environment Variables Reference
### Common
| Variable | Description |
|---|---|
| `AUTH_PROVIDER` | Authentication provider: `entra_id`, `okta`, `workos`, `auth0`, `keycloak`, `local` |
### Microsoft Entra ID
| Variable | Required | Description |
|---|---|---|
| `ENTRA_ID_CLIENT_ID` | ✅ | Application (client) ID from Azure |
| `ENTRA_ID_CLIENT_SECRET` | ✅ | Client secret from Azure |
| `ENTRA_ID_TENANT_ID` | ✅ | Directory (tenant) ID from Azure |
| `ENTRA_ID_DEVICE_AUTHORIZATION_CLIENT_ID` | CLI only | Client ID for Device Authorization Grant |
| `ENTRA_ID_CUSTOM_OPENID_SCOPE` | CLI only | Custom scope from "Expose an API" (e.g., `api://crewai-cli/read`) |
### Okta
| Variable | Required | Description |
|---|---|---|
| `OKTA_CLIENT_ID` | ✅ | Okta application client ID |
| `OKTA_CLIENT_SECRET` | ✅ | Okta client secret |
| `OKTA_SITE` | ✅ | Okta organization URL (e.g., `https://your-domain.okta.com`) |
| `OKTA_AUTHORIZATION_SERVER` | ✅ | Authorization server name (e.g., `default`) |
| `OKTA_AUDIENCE` | ✅ | Authorization server audience (e.g., `api://default`) |
| `OKTA_DEVICE_AUTHORIZATION_CLIENT_ID` | CLI only | Native app client ID for Device Authorization |
### WorkOS
| Variable | Required | Description |
|---|---|---|
| `WORKOS_CLIENT_ID` | ✅ | WorkOS application client ID |
| `WORKOS_AUTHKIT_DOMAIN` | ✅ | AuthKit domain (e.g., `your-domain.authkit.com`) |
### Auth0
| Variable | Required | Description |
|---|---|---|
| `AUTH0_CLIENT_ID` | ✅ | Auth0 application client ID |
| `AUTH0_CLIENT_SECRET` | ✅ | Auth0 client secret |
| `AUTH0_DOMAIN` | ✅ | Auth0 tenant domain (e.g., `your-tenant.auth0.com`) |
### Keycloak
| Variable | Required | Description |
|---|---|---|
| `KEYCLOAK_CLIENT_ID` | ✅ | Keycloak client ID |
| `KEYCLOAK_CLIENT_SECRET` | ✅ | Keycloak client secret |
| `KEYCLOAK_SITE` | ✅ | Keycloak server URL |
| `KEYCLOAK_REALM` | ✅ | Keycloak realm name |
| `KEYCLOAK_AUDIENCE` | ✅ | Token audience (default: `account`) |
| `KEYCLOAK_BASE_URL` | Optional | Base URL path (e.g., `/auth` for pre-v17 migrations) |
| `KEYCLOAK_DEVICE_AUTHORIZATION_CLIENT_ID` | CLI only | Public client ID for Device Authorization |
---
## Next Steps
- [Installation Guide](/installation) — Get started with CrewAI
- [Quickstart](/quickstart) — Build your first crew
- [RBAC Setup](/enterprise/features/rbac) — Detailed role and permission management

View File

@@ -753,7 +753,7 @@ class LLM(BaseLLM):
"temperature": self.temperature,
"top_p": self.top_p,
"n": self.n,
"stop": self.stop or None,
"stop": (self.stop or None) if self.supports_stop_words() else None,
"max_tokens": self.max_tokens or self.max_completion_tokens,
"presence_penalty": self.presence_penalty,
"frequency_penalty": self.frequency_penalty,
@@ -1825,9 +1825,11 @@ class LLM(BaseLLM):
# whether to summarize the content or abort based on the respect_context_window flag
raise
except Exception as e:
unsupported_stop = "Unsupported parameter" in str(
e
) and "'stop'" in str(e)
error_str = str(e)
unsupported_stop = "'stop'" in error_str and (
"Unsupported parameter" in error_str
or "does not support parameters" in error_str
)
if unsupported_stop:
if (
@@ -1961,9 +1963,11 @@ class LLM(BaseLLM):
except LLMContextLengthExceededError:
raise
except Exception as e:
unsupported_stop = "Unsupported parameter" in str(
e
) and "'stop'" in str(e)
error_str = str(e)
unsupported_stop = "'stop'" in error_str and (
"Unsupported parameter" in error_str
or "does not support parameters" in error_str
)
if unsupported_stop:
if (
@@ -2263,6 +2267,10 @@ class LLM(BaseLLM):
Note: This method is only used by the litellm fallback path.
Native providers override this method with their own implementation.
"""
model_lower = self.model.lower() if self.model else ""
if "gpt-5" in model_lower:
return False
if not LITELLM_AVAILABLE or get_supported_openai_params is None:
# When litellm is not available, assume stop words are supported
return True

View File

@@ -2245,6 +2245,9 @@ class OpenAICompletion(BaseLLM):
def supports_stop_words(self) -> bool:
"""Check if the model supports stop words."""
model_lower = self.model.lower() if self.model else ""
if "gpt-5" in model_lower:
return False
return not self.is_o1_model
def get_context_window_size(self) -> int:

View File

@@ -0,0 +1,110 @@
interactions:
- request:
body: '{"messages":[{"role":"user","content":"What is the capital of France?"}],"model":"gpt-5"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '89'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.2
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-DO4LcSpy72yIXCYSIVOQEXWNXydgn\",\n \"object\":
\"chat.completion\",\n \"created\": 1774628956,\n \"model\": \"gpt-5-2025-08-07\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"Paris.\",\n \"refusal\": null,\n
\ \"annotations\": []\n },\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 13,\n \"completion_tokens\":
11,\n \"total_tokens\": 24,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": null\n}\n"
headers:
CF-Cache-Status:
- DYNAMIC
CF-Ray:
- 9e2fc5dce85582fb-GIG
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Fri, 27 Mar 2026 16:29:17 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
content-length:
- '772'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1343'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -1523,6 +1523,69 @@ def test_openai_stop_words_not_applied_to_structured_output():
assert "Observation:" in result.observation
def test_openai_gpt5_models_do_not_support_stop_words():
"""
Test that GPT-5 family models do not support stop words via the API.
GPT-5 models reject the 'stop' parameter, so stop words must be
applied client-side only.
"""
gpt5_models = [
"gpt-5",
"gpt-5-mini",
"gpt-5-nano",
"gpt-5-pro",
"gpt-5.1",
"gpt-5.1-chat",
"gpt-5.2",
"gpt-5.2-chat",
]
for model_name in gpt5_models:
llm = OpenAICompletion(model=model_name)
assert llm.supports_stop_words() == False, (
f"Expected {model_name} to NOT support stop words"
)
def test_openai_non_gpt5_models_support_stop_words():
"""
Test that non-GPT-5 models still support stop words normally.
"""
supported_models = [
"gpt-4o",
"gpt-4o-mini",
"gpt-4.1",
"gpt-4.1-mini",
"gpt-4-turbo",
]
for model_name in supported_models:
llm = OpenAICompletion(model=model_name)
assert llm.supports_stop_words() == True, (
f"Expected {model_name} to support stop words"
)
def test_openai_gpt5_still_applies_stop_words_client_side():
"""
Test that GPT-5 models still truncate responses at stop words client-side
via _apply_stop_words(), even though they don't send 'stop' to the API.
"""
llm = OpenAICompletion(
model="gpt-5.2",
stop=["Observation:", "Final Answer:"],
)
assert llm.supports_stop_words() == False
response = "I need to search.\n\nAction: search\nObservation: Found results"
result = llm._apply_stop_words(response)
assert "Observation:" not in result
assert "Found results" not in result
assert "I need to search" in result
def test_openai_stop_words_still_applied_to_regular_responses():
"""
Test that stop words ARE still applied for regular (non-structured) responses.

View File

@@ -682,6 +682,126 @@ def test_llm_call_when_stop_is_unsupported_when_additional_drop_params_is_provid
assert "Paris" in result
@pytest.mark.vcr()
def test_litellm_gpt5_call_succeeds_without_stop_error():
"""
Integration test: GPT-5 call succeeds when stop words are configured,
because stop is omitted from API params and applied client-side.
"""
llm = LLM(model="gpt-5", stop=["Observation:"], is_litellm=True)
result = llm.call("What is the capital of France?")
assert isinstance(result, str)
assert len(result) > 0
def test_litellm_gpt5_does_not_send_stop_in_params():
"""
Test that the LiteLLM fallback path does not include 'stop' in API params
for GPT-5.x models, since they reject it at the API level.
"""
llm = LLM(model="openai/gpt-5.2", stop=["Observation:"], is_litellm=True)
params = llm._prepare_completion_params(
messages=[{"role": "user", "content": "Hello"}]
)
assert params.get("stop") is None, (
"GPT-5.x models should not have 'stop' in API params"
)
def test_litellm_non_gpt5_sends_stop_in_params():
"""
Test that the LiteLLM fallback path still includes 'stop' in API params
for models that support it.
"""
llm = LLM(model="gpt-4o", stop=["Observation:"], is_litellm=True)
params = llm._prepare_completion_params(
messages=[{"role": "user", "content": "Hello"}]
)
assert params.get("stop") == ["Observation:"], (
"Non-GPT-5 models should have 'stop' in API params"
)
def test_litellm_retry_catches_litellm_unsupported_params_error(caplog):
"""
Test that the retry logic catches LiteLLM's UnsupportedParamsError format
("does not support parameters") in addition to the OpenAI API format.
"""
llm = LLM(model="openai/gpt-5.2", stop=["Observation:"], is_litellm=True)
litellm_error = Exception(
"litellm.UnsupportedParamsError: openai does not support parameters: "
"['stop'], for model=openai/gpt-5.2."
)
call_count = 0
try:
import litellm
except ImportError:
pytest.skip("litellm is not installed; skipping LiteLLM retry test")
def mock_completion(*args, **kwargs):
nonlocal call_count
call_count += 1
if call_count == 1:
raise litellm_error
return MagicMock(
choices=[MagicMock(message=MagicMock(content="Paris", tool_calls=None))],
usage=MagicMock(
prompt_tokens=10,
completion_tokens=5,
total_tokens=15,
),
)
with patch("litellm.completion", side_effect=mock_completion):
with caplog.at_level(logging.INFO):
result = llm.call("What is the capital of France?")
assert "Retrying LLM call without the unsupported 'stop'" in caplog.text
assert "stop" in llm.additional_params.get("additional_drop_params", [])
def test_litellm_retry_catches_openai_api_stop_error(caplog):
"""
Test that the retry logic still catches the OpenAI API error format
("Unsupported parameter: 'stop'").
"""
llm = LLM(model="openai/gpt-5.2", stop=["Observation:"], is_litellm=True)
api_error = Exception(
"Unsupported parameter: 'stop' is not supported with this model."
)
call_count = 0
def mock_completion(*args, **kwargs):
nonlocal call_count
call_count += 1
if call_count == 1:
raise api_error
return MagicMock(
choices=[MagicMock(message=MagicMock(content="Paris", tool_calls=None))],
usage=MagicMock(
prompt_tokens=10,
completion_tokens=5,
total_tokens=15,
),
)
with patch("litellm.completion", side_effect=mock_completion):
with caplog.at_level(logging.INFO):
llm.call("What is the capital of France?")
assert "Retrying LLM call without the unsupported 'stop'" in caplog.text
assert "stop" in llm.additional_params.get("additional_drop_params", [])
@pytest.fixture
def ollama_llm():
return LLM(model="ollama/llama3.2:3b", is_litellm=True)