When LLM(base_url=...) is used without api_base, litellm does not
receive the custom endpoint because it reads api_base (not base_url).
This causes requests to fall back to api.openai.com, breaking
multi-provider setups (e.g. Scaleway + Nebius).
The fix syncs base_url and api_base in LLM.__init__:
- If only base_url is provided, api_base is set to match
- If only api_base is provided, base_url is set to match
- If both are provided, both keep their explicit values
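A minimal sketch of the sync (assuming the keyword names above; the real __init__ accepts many more parameters):

    class LLM:
        def __init__(self, model, base_url=None, api_base=None, **kwargs):
            # Keep both aliases populated so litellm, which reads api_base,
            # sees the custom endpoint regardless of which name was passed.
            if base_url is not None and api_base is None:
                api_base = base_url
            elif api_base is not None and base_url is None:
                base_url = api_base
            self.base_url = base_url
            self.api_base = api_base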
Closes #5139
Co-Authored-By: João <joao@crewai.com>
After PyPI publish, clones crewAIInc/crew_deployment_test, bumps the
crewai[tools] pin to the new version, regenerates uv.lock, and pushes
to main. Includes retry logic for CDN propagation delays.
- Add --skip-to-enterprise flag to resume just Phase 3 after a failure
- Add --prerelease=allow to uv sync for alpha/beta/rc versions
- Retry uv sync up to 10 times to handle PyPI CDN propagation delay
- Update pyproject.toml [project] version field (fixes apps/api version)
- Print PR URL after creating enterprise bump PR
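The retry shape, sketched in Python (attempt count and the --prerelease flag are from the bullets above; the helper name and delay are assumptions):

    import subprocess
    import time

    def uv_sync_with_retry(attempts: int = 10, delay: float = 30.0) -> None:
        # PyPI's CDN can lag the publish, so keep retrying uv sync until
        # the new crewai[tools] pin resolves.
        for attempt in range(1, attempts + 1):
            result = subprocess.run(["uv", "sync", "--prerelease=allow"])
            if result.returncode == 0:
                return
            print(f"uv sync failed ({attempt}/{attempts}); retrying in {delay}s")
            time.sleep(delay)
        raise RuntimeError("new version never became installable")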
- Add full feature permissions matrix (11 features × permission levels)
- Document Owner vs Member default permissions
- Add deployment guide: what permissions are needed to deploy from GitHub or Zip
- Document entity-level permissions (deployment permission types: run, traces, manage_settings, HITL, full_access)
- Document entity RBAC for env vars, LLM connections, and Git repositories
- Add common role patterns: Developer, Viewer/Stakeholder, Ops/Platform Admin
- Add quick-reference table for minimum deployment permissions
Addresses user feedback that RBAC was too restrictive and unclear:
members didn't know which permissions to configure for a developer profile.
* fix: preserve method return value as flow output for @human_feedback with emit
When a @human_feedback decorated method with emit= is the final method in a
flow (no downstream listeners triggered), the flow's final output was
incorrectly set to the collapsed outcome string (e.g., 'approved') instead
of the method's actual return value (e.g., a state dict).
Root cause: _process_feedback() returns the collapsed_outcome string when
emit is set, and this string was being stored as the method's result in
_method_outputs.
The fix:
1. In human_feedback.py: After _process_feedback, stash the real method_output
on the flow instance as _human_feedback_method_output when emit is set.
2. In flow.py: After appending a method result to _method_outputs, check if
_human_feedback_method_output is set. If so, replace the last entry with
the stashed real output and clear the stash.
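Sketched (the helper names here are hypothetical wrappers; the real changes live inside the decorator and Flow internals):

    def _after_feedback(flow, method_output, emit, collapsed_outcome):
        # human_feedback.py side (sketch): routing still sees the collapsed
        # outcome, but the real return value is stashed for flow.py.
        if emit is not None:
            flow._human_feedback_method_output = method_output
        return collapsed_outcome

    def _record_result(flow, result):
        # flow.py side (sketch): after appending, swap in the stashed real
        # output and clear the stash.
        flow._method_outputs.append(result)
        stashed = getattr(flow, "_human_feedback_method_output", None)
        if stashed is not None:
            flow._method_outputs[-1] = stashed
            flow._human_feedback_method_output = None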
This ensures:
- Routing still works correctly (collapsed outcome used for @listen matching)
- The flow's final result is the actual method return value
- If downstream listeners execute, their results become the final output
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* style: ruff format flow.py
* fix: use per-method dict stash for concurrency safety and None returns
Addresses review comments:
- Replace single flow-level slot with dict keyed by method name,
safe under concurrent @human_feedback+emit execution
- Dict key presence (not value) indicates stashed output,
correctly preserving None return values
- Added test for None return value preservation
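The revised stash, sketched (the dict attribute name is an assumption):

    def _record_result(flow, method_name, result):
        flow._method_outputs.append(result)
        # Key presence, not truthiness, decides whether to restore, so a
        # method that legitimately returned None is preserved too.
        if method_name in flow._human_feedback_method_outputs:
            flow._method_outputs[-1] = (
                flow._human_feedback_method_outputs.pop(method_name)
            )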
---------
Co-authored-by: Joao Moura <joao@crewai.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* feat: add request_id to HumanFeedbackRequestedEvent
Allow platforms to attach a correlation identifier to human feedback requests so downstream consumers can deterministically match spans to their corresponding feedback records.
* feat: add request_id to HumanFeedbackReceivedEvent for correlation
Without request_id on the received event, consumers cannot correlate
a feedback response back to its originating request. Both sides of the
request/response pair need the correlation identifier.
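A sketch of the consumer-side pairing this enables (the dispatch plumbing is illustrative, not library code):

    pending: dict[str, object] = {}

    def on_feedback_requested(event) -> None:
        pending[event.request_id] = event  # remember the originating request

    def on_feedback_received(event) -> None:
        request = pending.pop(event.request_id, None)  # deterministic match
        if request is not None:
            ...  # attach the response to the matching span/record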
---------
Co-authored-by: Alex <alex@crewai.com>
- Delegate supports_function_calling() to parent (handles o1 models via OpenRouter)
- Guard empty env vars in base_url resolution
- Fix misleading comment about model validation rules
- Remove unused MagicMock import
- Use 'is not None' for env var restoration in tests
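The env var guard, sketched (the variable name and default URL are assumptions):

    import os

    # `or` treats an empty string like an unset variable, so
    # OPENROUTER_BASE_URL="" falls back to the default instead of
    # producing requests against an empty base URL.
    base_url = os.environ.get("OPENROUTER_BASE_URL") or "https://openrouter.ai/api/v1"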
Co-authored-by: Joao Moura <joao@crewai.com>
## Summary
### Core fixes
<details>
<summary><b>Fix silent 404 cascade on trace event send</b></summary>
When `_initialize_backend_batch` failed, `trace_batch_id` was left populated with a client-generated UUID never registered server-side. All subsequent event sends hit a non-existent batch endpoint and returned 404. Now all three failure paths (None response, non-2xx status, exception) clear `trace_batch_id`.
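Sketched (method body simplified; the API call name is an assumption):

```python
def _initialize_backend_batch(self, payload: dict) -> None:
    try:
        response = self.plus_api.create_trace_batch(payload)
    except Exception:
        self.trace_batch_id = None  # exception path
        return
    if response is None or not 200 <= response.status_code < 300:
        self.trace_batch_id = None  # None-response and non-2xx paths
        return
    # success: trace_batch_id now refers to a server-registered batch
```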
</details>
<details>
<summary><b>Fix first-time deferred batch init silently skipped</b></summary>
First-time users have `is_tracing_enabled_in_context() = False` by design. This caused `_initialize_backend_batch` to return early without creating the batch, and `finalize_batch` to skip finalization (same guard). The first-time handler now:
- passes `skip_context_check=True` to bypass both guards and calls `_finalize_backend_batch` directly
- gates `backend_initialized` on actual success
- checks the `_send_events_to_backend` return status (marking the batch as failed on 500)
- captures event count/duration/batch ID before they're consumed by send/finalize
- cleans up all singleton state via `_reset_batch_state()` on every exit path
</details>
<details>
<summary><b>Sync <code>is_current_batch_ephemeral</code> on batch creation success</b></summary>
When the batch is successfully created on the server, `is_current_batch_ephemeral` is now synced with the actual `use_ephemeral` value used. This prevents endpoint mismatches where the batch was created on one endpoint but events and finalization were sent to a different one, resulting in 404.
</details>
<details>
<summary><b>Route <code>mark_trace_batch_as_failed</code> to correct endpoint for ephemeral batches</b></summary>
`mark_trace_batch_as_failed` always routed to the non-ephemeral endpoint (`/tracing/batches/{id}`), causing 404s when called on ephemeral batches — the same class of endpoint mismatch this PR aims to fix. Added `mark_ephemeral_trace_batch_as_failed` to `PlusAPI` and a `_mark_batch_as_failed` helper on `TraceBatchManager` that routes based on `is_current_batch_ephemeral`.
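The routing helper, roughly (signatures simplified):

```python
def _mark_batch_as_failed(self, batch_id: str, reason: str) -> None:
    # Route to the endpoint matching how the batch was created.
    if self.is_current_batch_ephemeral:
        self.plus_api.mark_ephemeral_trace_batch_as_failed(batch_id, reason)
    else:
        self.plus_api.mark_trace_batch_as_failed(batch_id, reason)
```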
</details>
<details>
<summary><b>Gate <code>backend_initialized</code> on actual init success (non-first-time path)</b></summary>
On the non-first-time path, `backend_initialized` was set to `True` unconditionally after `_initialize_backend_batch` returned. With the new failure-path cleanup that clears `trace_batch_id`, this created an inconsistent state: `backend_initialized=True` + `trace_batch_id=None`. Now set via `self.trace_batch_id is not None`.
</details>
### Resilience improvements
<details>
<summary><b>Retry transient failures on batch creation</b></summary>
`_initialize_backend_batch` now retries up to 2 times with 200ms backoff on transient failures (None response, 5xx, network errors). Non-transient 4xx errors are not retried. The short backoff minimizes lock hold time on the non-first-time path where `_batch_ready_cv` is held.
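The retry policy, sketched (the helper name and API call are assumptions; the counts and backoff are from this PR):

```python
import time

def _create_with_retry(plus_api, payload: dict):
    response = None
    for attempt in range(3):  # one try plus up to 2 retries
        try:
            response = plus_api.create_trace_batch(payload)
        except Exception:
            response = None  # network errors count as transient
        if response is not None and response.status_code < 500:
            break  # 2xx success or non-transient 4xx: stop retrying
        time.sleep(0.2)  # short backoff keeps _batch_ready_cv hold time low
    return response
```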
</details>
<details>
<summary><b>Fall back to ephemeral on server auth rejection</b></summary>
When the non-ephemeral endpoint returns 401/403 (expired token, revoked credentials, key rotation), the client automatically switches to ephemeral tracing instead of losing traces. The fallback forwards `skip_context_check` and is guarded against infinite recursion — if ephemeral also fails, `trace_batch_id` is cleared normally.
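The fallback, sketched (a fragment from inside `_initialize_backend_batch`; parameter names simplified):

```python
if response is not None and response.status_code in (401, 403):
    if not use_ephemeral:  # recursion guard: fall back at most once
        self.is_current_batch_ephemeral = True
        return self._initialize_backend_batch(
            payload, use_ephemeral=True, skip_context_check=skip_context_check
        )
    self.trace_batch_id = None  # ephemeral also failed: clear normally
```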
</details>
<details>
<summary><b>Fix action-event race initializing batch as non-ephemeral</b></summary>
`_handle_action_event` called `batch_manager.initialize_batch()` directly, defaulting `use_ephemeral=False`. When a `DefaultEnvEvent` or `LLMCallStartedEvent` fired before `CrewKickoffStartedEvent` in the thread pool, the batch was locked in as non-ephemeral. Now routes through `_initialize_batch()` which computes `use_ephemeral` from `_check_authenticated()`.
</details>
<details>
<summary><b>Guard <code>_mark_batch_as_failed</code> against cascading network errors</b></summary>
When `_finalize_backend_batch` failed with a network error (e.g. `[Errno 54] Connection reset by peer`), the exception handler called `_mark_batch_as_failed` — which also makes an HTTP request on the same dead connection. That second failure was unhandled. Now wrapped in a try/except so it logs at debug level instead of propagating.
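The guard, sketched (the standalone helper name here is hypothetical; the real code runs inside `_finalize_backend_batch`'s exception handler):

```python
import logging

logger = logging.getLogger(__name__)

def _safely_mark_failed(manager, batch_id: str, reason: str) -> None:
    try:
        manager._mark_batch_as_failed(batch_id, reason)
    except Exception as exc:
        # The marking call reuses the same dead connection, so a second
        # network failure is likely; log it instead of propagating.
        logger.debug("mark_batch_as_failed also failed: %s", exc)
```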
</details>
<details>
<summary><b>Design decision: first-time users always use ephemeral</b></summary>
First-time trace collection **always creates ephemeral batches**, regardless of authentication status. This is intentional:
1. **The first-time handler UX is built around ephemeral traces** — it displays an access code, a 24-hour expiry link, and opens the browser to the ephemeral trace viewer. Non-ephemeral batches don't produce these artifacts, so the handler would fall through to the "Local Traces Collected" fallback even when traces were successfully sent.
2. **The server handles account linking automatically** — `LinkEphemeralTracesJob` runs on user signup and migrates ephemeral traces to permanent records. Logged-in users can access their traces via their dashboard regardless.
3. **Checking auth during batch setup broke event collection** — moving `_check_authenticated()` into `_initialize_batch` caused the batch initialization to fail silently during the flow/crew start event handler, preventing all event collection. Keeping the first-time path fast and side-effect-free preserves event collection.
The auth check is deferred to the non-first-time path (second run onwards), where `is_tracing_enabled_in_context()` is `True` and the normal tracing pipeline handles everything — including the 401/403 ephemeral fallback.
</details>
### Manual tests
<details>
<summary><b>Matrix</b></summary>
| Scenario | First run | Second run |
|----------|-----------|------------|
| Logged out, fresh `.crewai_user.json` | Ephemeral trace created, URL returned | Ephemeral trace created, URL returned |
| Logged in, fresh `.crewai_user.json` | Ephemeral trace created, URL returned | Trace batch finalized, URL returned |
| Flow execution | Tested with `poem_flow` | Tested with `poem_flow` |
| Crew execution | Tested with `hitl_crew` | Tested with `hitl_crew` |
</details>
Problem: When you pass memory=memory.scope("/agent/...") to an Agent, CrewAI's internal code calls remember_many() after every task to persist results, but MemoryScope never implemented remember_many(); only the parent Memory class has it.
Symptom: [ERROR]: Failed to save kickoff result to memory: 'MemoryScope' object has no attribute 'remember_many'. Memories are silently never saved after agent tasks.
Fix: Add a remember_many() method to the MemoryScope class that delegates to self._memory.remember_many(...) with the scoped path, following the exact same pattern as the existing remember() method.
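The fix, sketched (the scope-path attribute and signature are assumptions; the real method mirrors remember()):

    class MemoryScope:
        # ...existing __init__ and remember() omitted...
        def remember_many(self, records, **kwargs):
            # Delegate to the parent Memory with this scope's path,
            # exactly as remember() already does for single records.
            return self._memory.remember_many(records, path=self._path, **kwargs)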
When a method has both @listen and @human_feedback(emit=[...]),
the FlowMeta metaclass registered it as a router but only used
get_possible_return_constants() to detect paths. This fails for
@human_feedback methods since the paths come from the decorator's
emit param, not from return statements in the source code.
Now checks __router_paths__ first (set by @human_feedback), then
falls back to source code analysis for plain @router methods.
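The detection order, sketched (attribute and function names from the description above):

    # FlowMeta (sketch): prefer decorator-declared paths, then fall back
    # to scanning return statements for plain @router methods.
    paths = getattr(method, "__router_paths__", None)
    if not paths:
        paths = get_possible_return_constants(method)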
This was causing missing edges in the flow serializer output —
e.g. the whitepaper generator's review_infographic -> handle_cancelled,
send_slack_notification, classify_feedback edges were all missing.
Adds test: @listen + @human_feedback(emit=[...]) generates correct
router edges in serialized output.
Co-authored-by: Joao Moura <joao@crewai.com>
The litellm optional dependency had a wide upper bound (<3) that allowed
any future litellm release to be installed automatically, so breaking
changes in new litellm versions could reach customers immediately.
Pins the upper bound to <=1.82.6 (the latest known-good version).
When newer litellm versions are tested and validated, bump this bound
explicitly.
* feat: add native OpenAI-compatible providers (OpenRouter, DeepSeek, Ollama, vLLM, Cerebras, Dashscope)
Add a data-driven OpenAI-compatible provider system that enables
native support for multiple third-party APIs that implement the
OpenAI API specification.
New providers:
- OpenRouter: 500+ models via openrouter.ai
- DeepSeek: deepseek-chat, deepseek-coder, deepseek-reasoner
- Ollama: local models (llama3, mistral, codellama, etc.)
- hosted_vllm: self-hosted vLLM servers
- Cerebras: ultra-fast inference
- Dashscope: Alibaba Qwen models (qwen-turbo, qwen-max, etc.)
Architecture:
- Single OpenAICompatibleCompletion class extends OpenAICompletion
- ProviderConfig dataclass stores per-provider settings
- Registry dict makes adding new providers a single config entry
- Handles provider-specific quirks (OpenRouter headers, Ollama
base URL normalization, optional API keys)
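A sketch of the registry pattern (field names and the two sample entries are illustrative, not the exact shipped config):

    from dataclasses import dataclass, field

    @dataclass
    class ProviderConfig:
        base_url: str
        api_key_env: str | None = None  # some providers need no key
        extra_headers: dict[str, str] = field(default_factory=dict)

    OPENAI_COMPATIBLE_PROVIDERS = {
        "deepseek": ProviderConfig(
            base_url="https://api.deepseek.com",
            api_key_env="DEEPSEEK_API_KEY",
        ),
        "ollama": ProviderConfig(
            base_url="http://localhost:11434/v1",  # local default
        ),
    }
    # Adding a provider is one new registry entry; the shared
    # OpenAICompatibleCompletion class reads the config at call time.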
Usage:
LLM(model="deepseek/deepseek-chat")
LLM(model="ollama/llama3")
LLM(model="openrouter/anthropic/claude-3-opus")
LLM(model="llama3", provider="ollama")
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: add is_litellm=True to tests that test litellm-specific methods
Tests for _get_custom_llm_provider and _validate_call_params used the
openrouter/ model prefix, which now routes to the native provider.
Added is_litellm=True to force the litellm path, since these tests
exercise litellm-specific internals.
---------
Co-authored-by: Joao Moura <joao@crewai.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>