Commit Graph

544 Commits

Author SHA1 Message Date
Devin AI
847aee8906 fix: apply ruff format to agent_executor.py
Co-Authored-By: João <joao@crewai.com>
2026-06-07 23:41:53 +00:00
Devin AI
34afc71f80 Fix #6065: Add ExecutorContext protocol compliance to experimental AgentExecutor
The new default AgentExecutor (Flow-based) did not expose ask_for_human_input
as a direct attribute — it only stored it in self.state.ask_for_human_input.
This caused AttributeError when human_input=True was set on a Task, because
SyncHumanInputProvider reads/writes context.ask_for_human_input directly.

Changes:
- Add ask_for_human_input property (getter+setter) delegating to state
- Add _invoke_loop() and _ainvoke_loop() for re-running agent after feedback
- Add _format_feedback_message() for formatting human feedback as LLM messages
- Add 12 regression tests covering ExecutorContext protocol compliance

Co-Authored-By: João <joao@crewai.com>
2026-06-07 23:39:48 +00:00
Lorenze Jay
17cfbdf95f feat: bump versions to 1.14.7a2 (#6054) 2026-06-05 14:15:43 -07:00
Lorenze Jay
8cd51fc67e Lorenze/imp/conversational flow traces (#6044)
* feat: add conversation message and route selection events

- Introduced `ConversationMessageAddedEvent` and `ConversationRouteSelectedEvent` to enhance conversational flow tracking.
- Updated event listeners to emit these events during message handling and routing decisions.
- Enhanced the `_ConversationalMixin` class to emit events for user and assistant messages, as well as selected routes.
- Added tests to verify the correct emission of these events during conversational turns.

* ensure flow started events only emiited once

* refactor(tracing): rename trace event handler methods to action event handlers

Updated the  class to replace  with  for  and  events, improving clarity in event handling.

Additionally, adjusted comments in the  class to clarify the application of pending user messages in relation to state restoration and flow scope initialization.

* fix(conversational_mixin): handle empty message index in route events

Updated the message index handling in the  class to return  when there are no messages. Added tests to ensure that route events do not reference index zero when the transcript is empty, and verified the correct emission of conversation message events during flow handling.
2026-06-05 14:10:19 -07:00
Lucas Gomide
cab3319af9 feat(otel): surface real finish_reason + sampling params + response.id on LLM events (#5945)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
* feat(otel): surface real finish_reason + sampling params + response.id on LLM events

Companion to the OTel GenAI emitter compliance work in crewai-enterprise
(CON-172). Today the enterprise emitter reads these fields off the OSS
LLM events via `getattr(..., None)`, so it produces valid (but partial)
spans against the existing OSS surface. This change makes those fields
first-class on the events so spans can carry the real provider data.

What this adds:

- `LLMCallStartedEvent` gains the sampling-param fields the emitter needs
  for `gen_ai.request.*`: `temperature`, `top_p`, `max_tokens`, `stream`,
  `seed`, `stop_sequences`, `frequency_penalty`, `presence_penalty`, `n`.
  All optional; existing call sites keep working.
- `BaseLLM._emit_call_started_event` introspects those values off `self`
  (the LLM instance) via `getattr(..., None)` so every provider gets the
  fields propagated for free without per-provider plumbing.
- `LLMCallCompletedEvent` gains `finish_reason: str | None` and
  `response_id: str | None`. A field validator coerces any non-string
  value (MagicMock, unexpected provider object) to None so the event
  never raises on construction.
- `LLM._emit_call_completed_event` accepts both as kwargs.
- `LLM` (LiteLLM path) gets a defensive `_extract_finish_reason_and_response_id`
  helper that handles both streaming (`StreamingChoices`) and non-streaming
  (`Choices`) shapes and is wired into every completion-event emission site.
- Provider completions extract native values from their SDK responses and
  pass them through:
  - OpenAI: `_extract_responses_finish_reason_and_id` for Responses-API,
    `_extract_finish_reason_and_id` for Chat-Completions.
  - Anthropic: `_extract_finish_reason_and_id` (Messages API + streaming).
  - Bedrock: `_extract_finish_reason_and_id` (`stopReason` from converse).
  - Gemini: `_extract_finish_reason_and_id` (`finish_reason` from candidates).
  - Azure: inherits via OpenAI sub-class; adds the helper for Azure-specific
    response shapes.
  - openai_compatible: inherits from OpenAICompletion, no edits needed.

Compatibility:

- All new fields are optional with sensible defaults. No existing call
  sites need to change.
- The validator on `LLMCallCompletedEvent` swallows non-string values for
  the new fields so legacy mocks / exotic provider types don't blow up
  event construction.
- Enterprise side already reads these fields defensively, so OSS and
  enterprise can merge independently and cut on the same synchronized
  release.

Tested against the full LLM + events + provider test suite — all green;
the 14 pre-existing multimodal failures on main are unrelated and
reproduce without this diff.

* fix(bedrock): propagate finish_reason + response_id on async paths

The original commit covered every provider's sync path and Bedrock's
sync streaming path, but two Bedrock async paths still emitted
LLMCallCompletedEvent without finish_reason/response_id:

- _ahandle_converse: the final fallback emit_call_completed_event call
  was missing both fields. Added stop_reason + response_id matching the
  other emission sites in the same function.

- _ahandle_streaming_converse: response_id was never seeded from the
  initial response object, and stream_finish_reason wasn't propagated
  to the structured-output and final-text emissions. Now extracts
  response_id up front and threads stream_finish_reason through every
  completion event.

Adds a dedicated test file covering the new event fields end-to-end:
- LLMCallCompletedEvent.finish_reason / response_id Pydantic validation
  (string accepted, None default, non-string coerced to None).
- LLMCallStartedEvent sampling params (all nine fields accepted, default
  to None).
- BaseLLM._emit_call_started_event introspecting sampling params off
  self, with explicit kwargs overriding.
- BaseLLM._emit_call_completed_event passing finish_reason/response_id
  through to the event.
- LLM._extract_finish_reason_and_response_id across the LiteLLM shapes
  (non-streaming response, streaming chunk, dict, missing fields,
  non-string values, unexpected input).

* fix(otel): correct streaming finish_reason + bedrock response_id semantics

Two correctness fixes uncovered while landing the OTel finish_reason +
response_id plumbing:

- LiteLLM streaming (sync + async): `stream_options={"include_usage": True}`
  causes LiteLLM to emit a final usage-only chunk with `choices=[]`. The
  post-loop `_extract_finish_reason_and_response_id(last_chunk)` silently
  returned `(None, None)` because the last chunk has no choices, even though
  earlier chunks carried `finish_reason="stop"`. Track both fields
  incrementally inside the loop (mirroring how OpenAI/Gemini/Azure already
  handle their native streams) and use the tracked values for the
  LLMCallCompletedEvent emission and the partial-response error path.

- Bedrock Converse: `ResponseMetadata.RequestId` is an AWS infra trace id,
  not a model-level response id (semantically different from OpenAI's
  `chatcmpl-XXX`). Return None for `response_id` rather than mislead
  downstream telemetry consumers. The audit-fix's async propagation chain
  still works — None propagates through unchanged.

Adds `test_llm_streaming_finish_reason.py` pinning both the sync and async
LiteLLM streaming paths against the include_usage chunk shape.

* refactor(otel): unify LLM event introspection + drop redundant defensive code

Three cohesion cleanups uncovered during PR review, all behavior-preserving:

- LLM.call / LLM.acall in llm.py now delegate to BaseLLM._emit_call_started_event
  instead of constructing LLMCallStartedEvent inline. The base helper already
  introspects sampling params off self via getattr; the inline duplication was
  accidental, not justified, and a duplication risk if anyone adds a tenth
  OTel sampling param later.

- Extracted lib/crewai/llms/_finish_reason_utils.py:extract_choices_finish_reason_and_id
  as the shared extractor for the choices-based response shape. OpenAI Chat,
  Azure, and LiteLLM all read the same shape (response.id + choices[0].finish_reason)
  as both object attrs and dict keys. Providers with genuinely different shapes
  - Anthropic (stop_reason), Bedrock (stopReason), Gemini (protobuf enum),
  OpenAI Responses (status) - keep their own provider-specific helpers.

- Dropped redundant try/except (AttributeError, TypeError) wrappers around
  bare getattr(obj, "field", None) calls across the new extraction helpers.
  getattr with a default already suppresses AttributeError, and the inner
  isinstance / dict.get / int-coercion ops can't raise TypeError in practice.
  Kept the catches that legitimately guard against IndexError (e.g. choices[0]
  on an empty list).

Tests: 600 passed, 23 skipped, 14 pre-existing multimodal failures unchanged.
Added 12 parametrized tests for the shared helper covering object + dict
shapes, missing fields, non-string coercion, and never-raises invariants.

* chore(otel): drop dead last_chunk variable from async streaming

The streaming-fix commit (49e5581b5) replaced the post-loop
`_extract_finish_reason_and_response_id(last_chunk)` call with the
incrementally-tracked `stream_finish_reason` / `stream_response_id`,
which removed the only reader of `last_chunk` in
`_ahandle_streaming_response`. The declaration and per-iteration
assignment were left behind — harmless but confusing for future
readers because the sync sibling still legitimately uses `last_chunk`
(for usage and content fallbacks via `_handle_streaming_callbacks`).

The async path inlines its usage extraction directly inside the loop
(`chunk.model_extra.get("usage")`), so there's no fallback consumer.
Drop both lines.

Sync path untouched — `last_chunk` there is still load-bearing.

* fix(otel): coerce non-list stop_sequences to list[str] on LLMCallStartedEvent

Observed in Datadog: gen_ai.request.stop_sequences on a Gemini/Vertex
span surfaced the textproto repr of a google.protobuf.struct_pb2.ListValue
(values { string_value: "\nObservation:" }) instead of a real Sequence[str].

Root cause is upstream - a Vertex AI / Gemini code path stores the stop
list in a protobuf container (RepeatedScalarContainer or ListValue) rather
than a plain Python list. When that container reaches LLMCallStartedEvent
and then BaseLLM._emit_call_started_event hands it to the OTel SDK as a
span attribute, the SDK falls back to str(value) because the type isn't a
recognised Sequence[str] - producing the protobuf textproto string instead
of an array attribute.

* chore: fix ruff lint findings

* refactor(otel): declare sampling params on BaseLLM + honor stop overrides + dict chunk id

* fix: widen max_tokens to int | float | None + apply ruff format

* fix(otel): coerce unknown finish_reason / response_id to None instead of stringifying

* fix(otel): extract Azure stream finish_reason/id before usage-continue

Match the LiteLLM ordering so a finish_reason or response id riding on a
usage-carrying chunk isn't dropped by the early `continue`.

* fix(otel): report effective max_tokens cap + bedrock structured finish_reason
2026-06-05 07:23:38 -04:00
Vini Brasil
906cd9769d feat(flow): type DSL triggers as route-aware decorators (#6042)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Centralize FlowTrigger and FlowMethodDecorator so start/listen/router and the boolean trigger helpers share one authoring contract. This preserves decorated method signatures for static checking while allowing route-label strings in nested FlowCondition data.

Export the shared typing helpers for static analyzers, use an explicit Protocol body, align condition validation with Sequence-backed condition data, and drop the stale call-arg ignore exposed by the signature-preserving decorators.

Update the flow guide to use or_(...) for multi-label listeners.
2026-06-04 18:07:49 -03:00
Lorenze Jay
14ce97d787 chat api for convo flows (#6034)
* Add conversational Flow chat helper

* Document conversational flow chat APIs in translations

* Stringify conversational chat REPL output
2026-06-04 13:36:48 -07:00
Matt Aitchison
f3a15a4f07 feat(lock_store): make locking backend overridable (#6015)
* feat(lock_store): make locking backend overridable

Allow the centralised lock factory to use a pluggable backend instead of
the hardcoded Redis/file selection. Backends are resolved with precedence
override > CREWAI_LOCK_FACTORY env > built-in default:

- set_lock_backend()/reset_lock_backend() and a scoped lock_backend()
  context manager for programmatic overrides
- CREWAI_LOCK_FACTORY="module:callable" env import-path, resolved lazily
  and cached, with clear errors on malformed or non-callable specs
- LockBackend Protocol documenting the contract (raw name in, context
  manager out; backend owns its namespacing)

Default Redis/file behavior is unchanged when nothing is overridden.

* refactor(lock_store): use explicit body for LockBackend protocol method

Replace the no-op `...` body with `raise NotImplementedError` to satisfy
the CodeQL ineffectual-statement check while keeping the Protocol
structural-typing only.

* refactor(lock_store): drop scoped lock_backend context manager

Keep the backend overridable via set_lock_backend/reset_lock_backend and
the CREWAI_LOCK_FACTORY env path, but remove the scoped lock_backend()
context manager. It was speculative surface and the only thread-unsafe
piece (racy save/restore of the module global); nothing depends on it.

* refactor(lock_store): drop reset_lock_backend alias

reset_lock_backend() was just set_lock_backend(None); callers use that
directly. Clearing the override is documented on set_lock_backend.

* style(lock_store): apply ruff format

* refactor(lock_store): simplify overridable backend to a single setter

Reduce the override surface to just set_lock_backend(): lock() uses the
custom backend when one is set, otherwise the unchanged Redis/file default.

Drop the CREWAI_LOCK_FACTORY env import-path, the runtime_checkable
Protocol, the precedence resolver, and the getter — a custom backend is
now any callable(name, *, timeout) -> context manager, registered in
process.

* fix(lock_store): snapshot backend to avoid check-then-call race

Read the module-global backend once into a local before the None check
and the call, so a concurrent set_lock_backend(None) cannot make lock()
invoke None.

* docs(lock_store): clarify name handling for custom backends

The default namespaces the lock name; custom backends receive it
verbatim. Correct the lock() docstring which implied namespacing always
happens.

* docs(lock_store): note set_lock_backend is for one-time startup setup
2026-06-04 13:28:31 -05:00
Vini Brasil
75dad212a2 Split flow DSL monolith into focused decorator modules (#6040)
The Flow DSL lived in one 1033-line `dsl.py` that mixed every decorator
(`@start`/`@listen`/`@router`), the `human_feedback` decorator,
condition combinators, and FlowDefinition extraction helpers in a single
file.

Split it into a `dsl/` package where each decorator gets its own module
(`start.py` 68 lines, `listen.py` 55, `router.py` 164,
`human_feedback.py` 98) and the shared extraction/condition helpers stay
in `utils.py`. The public API is re-exported from `dsl/__init__.py`, so
import paths are unchanged.

This is simpler because each decorator is now read and changed in
isolation instead of scanning a 1000-line file to find one of them, and
router-specific annotation parsing no longer sits next to unrelated
start/listen logic.
2026-06-04 15:02:06 -03:00
Vini Brasil
051fa0c1cb Build FlowDefinition from Flow DSL metadata (#6017)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* Build FlowDefinition from Flow DSL metadata

Introduce `FlowDefinition`, a serializable model built from the Flow
DSL's runtime metadata. It becomes the structural contract for Flow
methods, triggers, routers, state, and configuration.

The visualization layer is the first consumer: `flow_structure` and
`build_flow_structure` now project from the definition instead of
re-introspecting the class. The runner still executes from live
registries, but the definition gives future runners a single static
contract to read.

This replaces AST source parsing for router return values, crew
references, and state schema with runtime metadata plus explicit
`@router(paths=...)` or `Literal`/`Enum` return hints. AST parsing was
fragile and could silently fail for dynamic or non-inspectable methods.

The refactor removes obsolete introspection and serializer code:

* Delete `flow_serializer.py`, `flow/utils.py`, and
  `visualization/schema.py`
* Move flow structure modeling into `flow_definition.py`
* Simplify visualization building around the static definition contract

* Format files
2026-06-03 18:02:56 -03:00
Lucas Gomide
d09e3f4544 feat: flatten LiteLLM cache/reasoning usage sub-counts in _usage_to_dict (#6033)
LiteLLM returns provider usage as-is, nesting cache-read / cache-creation /
reasoning counts under provider-specific shapes (e.g.
prompt_tokens_details.cached_tokens, Anthropic-style cache_read_input_tokens).
Surface them as flat cached_prompt_tokens / reasoning_tokens /
cache_creation_tokens keys so the span pipeline can read them; prompt /
completion / total token counts are left untouched.
2026-06-03 15:13:30 -04:00
Lorenze Jay
1357491f0d Lorenze/feat/conversational flows (#5896)
* feat: add conversational flows documentation and chat session support

- Introduced a new guide for building multi-turn chat applications using , detailing session management and message handling.
- Added  class to facilitate chat interactions, including streaming support and event handling.
- Implemented  for class-level defaults and improved input normalization for conversational turns.
- Enhanced event listeners to manage flow events and tracing more effectively, including support for nested crew executions.
- Added tests for conversational flow helpers and kickoff parameters to ensure functionality and reliability.

* linted

* feat: enhance flow event tracing and session management

- Updated TraceCollectionListener to handle nested flows without re-claiming parent session batches.
- Ensured that method execution events are always emitted for tracing, regardless of flow event suppression.
- Improved finalization logic for flow trace batches to respect session deferral flags.
- Added tests to verify that method execution events are emitted correctly when flow events are suppressed and that deferred session finalization is respected in nested flows.

* updated docs

* feat: introduce experimental conversational flow framework

- Added a new module for conversational flow, including classes for managing conversation state, messages, and events.
- Implemented  and  for structured intent handling and routing.
- Enhanced the  class to support turn-oriented conversational applications with built-in routing and message handling.
- Updated  to include new classes in the public API.
- Added tests to validate the functionality of the new conversational flow features.

* handled docs

* feat(flow): enhance conversational flow handling and tracing

- Introduced support for deferred multi-turn tracing to maintain continuous event sequences.
- Updated  method to delegate to restored checkpoint flows, improving session management.
- Added tests to validate the new tracing behavior and ensure correct event handling in conversational flows.

* fix multimodal test

* better conversational

* adjusted prompt

* drop unused

* fix test

* refactor: rename  to  and update related documentation

This commit refactors the  class to  for clarity and consistency across the codebase. The documentation has been updated to reflect this change, ensuring that references to the new  class are accurate. Additionally, the alias for legacy imports is maintained for backward compatibility. The changes enhance the overall structure and readability of the conversational flow implementation.

* fix test

* adding experimetnal indicators

* fix test and reloaded cassettes

* cleanup ConversationalFlow class

* addressing double finalization and fixed tests

* improve on emphemeral tracing and adddressing comments
2026-06-03 11:53:16 -07:00
Lorenze Jay
be3cf62b63 feat: bump versions to 1.14.7a1 (#6031) 2026-06-03 10:30:33 -07:00
Greyson LaLonde
68cdd44520 fix(cli): restore [project.scripts] in crewai package for uv tool install 2026-06-03 09:50:39 -07:00
Lorenze Jay
770d1b284f Lorenze/fix/file input not working reliably (#6020)
* fix filesystem

* Refine commit message formatting

* fix for async kickoffs

* added suggestion
2026-06-02 17:14:51 -07:00
alex-clawd
b047c96756 Handle Snowflake Claude stringified tool calls (#6008)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
* Handle Snowflake Claude stringified tool calls

* Fix Snowflake tool id type narrowing

* Extract Snowflake tool result text in summaries

* Bump PyJWT for vulnerability scan

---------

Co-authored-by: João Moura <joaomdmoura@gmail.com>
2026-06-02 19:37:18 -03:00
Greyson LaLonde
d37af0d404 perf(knowledge): lazy-load docling imports to speed up crewai import 2026-06-02 15:16:48 -07:00
Greyson LaLonde
c81b4fe11e fix(deps): bump pyjwt to >=2.13.0 to patch CVEs 2026-06-02 10:01:53 -07:00
Lorenze Jay
a9cb7867bb Add crew trained agents file support (#6012)
* Add crew trained agents file support

* Add crew trained agents file support
2026-06-02 09:38:34 -07:00
alex-clawd
774fd871a8 Fix Snowflake Claude incomplete tool result histories (#6006)
* Fix Snowflake Claude incomplete tool result histories

* Filter Snowflake Claude preserved tool results
2026-06-02 09:11:59 -03:00
alex-clawd
4a0769d97c Add native Snowflake Cortex LLM provider (#6005) 2026-06-02 08:10:13 -03:00
Greyson LaLonde
fee5b3e395 fix(devtools): point template bumper at lib/cli templates dir 2026-06-02 02:02:12 -07:00
devin-ai-integration[bot]
3010f1286f chore: widen click dependency constraint to allow 8.2+
Addresses #6002
2026-06-02 00:06:25 -07:00
Greyson LaLonde
e53a676c04 fix(flow): re-arm multi-source or_ listeners across router-driven cycles
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
The previous discard-after-body approach cleared the gate mid-wave, so
a slow parallel @start finishing after the listener body could re-fire
the same multi-source or_ listener. Re-arm only when a router emits a
signal that matches the listener's condition; parallel @start paths
never reach that branch and the race gate keeps protecting them.

Closes #5972
2026-06-01 15:24:58 -07:00
Vini Brasil
1aba9fe415 Split flow.py into DSL, definition, and runtime (#5997)
This commit separates the monolithic `flow.py` into three modules, each
with one job:

- `dsl.py` - the Python DSL for flows (@start/@listen/@router, or_/and_)
- `flow_definition.py` - the structural model extracted from the DSL
- `runtime.py` - the execution engine and state for flows

This phase moves code only and should not have any breaking changes.
2026-06-01 18:37:10 -03:00
Greyson LaLonde
0486b85aa3 feat: bump versions to 1.14.6 2026-05-28 09:47:19 -07:00
Greyson LaLonde
ed91100a0f refactor(skills): move Skills Repository to experimental + CREWAI_EXPERIMENTAL gate
Moves the registry/cache pieces of PR #5867 under crewai.experimental.skills
and the CLI commands under `crewai experimental skill`. The stable local-file
skills feature (loader, parser, validation, models) stays in crewai.skills.

Both entry points now require CREWAI_EXPERIMENTAL=1:
- resolve_registry_ref() calls require_experimental_skills() before resolving
- The `crewai experimental` CLI group raises UsageError when the flag is unset

SkillDownloadStarted/CompletedEvent move out of crewai.events.types.skill_events
into crewai.experimental.skills.events.

* refactor(skills): move 'version' off SkillFrontmatter into metadata

The skill version is now stored as `metadata.version` rather than a
top-level field on `SkillFrontmatter`. A `before` validator lifts any
top-level YAML `version:` into `metadata['version']` so existing SKILL.md
files keep parsing.
2026-05-28 09:38:10 -07:00
Greyson LaLonde
d52106b3c7 feat: bump versions to 1.14.6a2 2026-05-27 16:42:40 -07:00
Lorenze Jay
2e36f06732 feat: enhance StdioTransport to prevent environment variable leakage (#5506)
* feat: enhance StdioTransport to prevent environment variable leakage

- Replaced os.environ.copy() with get_default_environment() to ensure only allowed environment variables are passed to the MCP server.
- Added tests to verify that ambient environment variables do not leak and that user-supplied environment variables can override defaults.

* feat: add environment variable filtering hook to StdioTransport

- Introduced an optional `_env_filter_hook` to allow extensions to modify the environment variables passed to MCP servers, enabling features like credential stripping.
- Updated tests to ensure the filtering hook is applied correctly after merging user-supplied and default environment variables.
2026-05-27 13:38:25 -07:00
Lorenze Jay
a1033e4bfe Fix structured output leaks in tool-calling loops (#5897)
* Fix structured output leaks in tool-calling loops

* addressing comments

* drop scripts

* Update Gemini agent tests to include structured output with thoughts and bump model version to 2.5-flash

* merge

* Update Anthropic test cases to use new model and tool structure

- Changed the model from "claude-3-5-haiku-20241022" to "claude-sonnet-4-6" in the test setup.
- Updated the request and response formats in the YAML test cassette to reflect the new tool structure and improved content formatting.
- Adjusted the expected response body to match the new output format from the assistant, including changes in tool usage and response details.
- Increased rate limit values in the response headers for better testing scenarios.

* adjusted bedrock cassettes

* adjusting cassettes for bedrock

* fix test

* Update VCR configuration to use 'host' instead of 'bedrock_host' for request matching
2026-05-27 13:20:53 -07:00
Greyson LaLonde
c5ea415cda chore(crewai-tools): drop self-explanatory comments
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2026-05-26 16:25:07 -07:00
Greyson LaLonde
3a52919a35 chore(devtools): drop self-explanatory comments 2026-05-26 15:50:44 -07:00
Greyson LaLonde
07569f04ee chore(crewai-files): drop self-explanatory comments 2026-05-26 15:01:22 -07:00
Greyson LaLonde
840ba89900 chore(crewai-core): drop self-explanatory comments 2026-05-26 10:33:18 -07:00
Greyson LaLonde
fd10c64148 chore(crewai): drop self-explanatory comments 2026-05-26 10:23:33 -07:00
Lorenze Jay
77a61274dc feat(planning): enhance planning configuration and observation handling (#5913)
* feat(planning): enhance planning configuration and observation handling

- Introduced  attribute in  to control LLM calls after each step.
- Updated  to set default  to 1 when planning is enabled without explicit config.
- Modified  to support heuristic observations when LLM calls are disabled.
- Adjusted  to respect  and  settings for step observations.
- Added tests to verify behavior of new configurations and ensure correct observation handling across different reasoning efforts.

* fix(agent_executor): update handling of failed steps in low effort mode

- Adjusted logic to ensure that failed steps are recorded without marking them as completed when using low reasoning effort.

- Introduced feedback for failed steps, allowing the process to continue while tracking failures.
- Added a test to verify that failed steps are correctly marked without triggering a replan.

- And linted

* linted
2026-05-26 09:10:43 -07:00
Vini Brasil
32f5e74449 Skip lock acquisition in CrewTrainingHandler.load when file is missing (#5935)
Every agent kickoff calls _use_trained_data, which calls
CrewTrainingHandler(...).load(). Since #4827 wrapped load() in store_lock,
that means every kickoff acquires the cross-process (Redis-backed when
REDIS_URL is set) lock even on deployments that never train and have no
trained-agents file on disk.

Move the missing/empty-file short-circuit above store_lock so the lock is
only acquired when there is actually a file to read. save() and the real
read remain locked.
2026-05-26 12:52:31 -03:00
Greyson LaLonde
bad64b1ee6 chore(cli): drop self-explanatory comments
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2026-05-26 01:05:25 -07:00
Greyson LaLonde
867df0f633 fix(checkpoint): drop unroundtrippable callbacks and adapter state
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
- callable_to_string returns None for lambdas/closures instead of an
  unresolvable dotted path; Crew filters Nones out of restored callback
  lists.
- EventNode.event serializer honors info.mode so mode='json' calls cascade
  properly into nested event payloads.
- RagTool.adapter serializes to None (post-validator rebuilds from
  config); concrete adapters hold runtime state that can't be round-tripped.
2026-05-25 19:24:02 -07:00
Greyson LaLonde
c3e2001d52 fix(checkpoint): serialize type[BaseModel] fields as JSON schema
Some checks failed
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Subclass redeclarations of args_schema/response_format dropped the
parent's Annotated PlainSerializer, causing PydanticSerializationError
on model_dump(mode='json'). Replace with @field_serializer decorators
backed by a shared serialize_model_class helper:

- BaseTool: covers RecallMemoryTool, RememberTool, AskQuestionTool,
  DelegateWorkTool, AddImageTool, ReadFileTool
- BaseLLM (check_fields=False): covers LLM, Anthropic, OpenAI, Gemini,
  Bedrock
- LiteAgent.response_format
- A2AConfig / A2AClientConfig response_model
2026-05-23 03:50:24 +08:00
Greyson LaLonde
306f5989b4 fix(checkpoint): avoid orphan task_started on resume scope restore
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Move scope restoration from Crew-level global push to a per-task push
inside Task via resume_task_scope() in event_context. Fixes orphan
task_started warning, hierarchical resume (manager_agent now eligible
for _resuming), and parallel async resume (each contextvars copy owns
its own scope). Tests added.
2026-05-23 01:20:15 +08:00
Greyson LaLonde
88e95befe7 fix(experimental): allow AgentExecutor restore from checkpoint
llm and prompt were declared required with exclude=True, making the
model un-restorable from its own serialized output. Mirror the
CrewAgentExecutor pattern: make them nullable with default None, keep
exclude=True, and re-attach llm on the resume path alongside the other
re-attached fields. Guard the two prompt-deref sites so the runtime
invariant survives the looser type.
2026-05-22 23:24:12 +08:00
Thiago Moretto
c3ef622ec6 feat(tools): declare env_vars on DatabricksQueryTool (#5892)
* feat(tools): declare env_vars on DatabricksQueryTool

Add EnvVar import and env_vars field to DatabricksQueryTool so the host
UI knows which environment variables the tool requires. Both auth paths
(DATABRICKS_HOST+TOKEN or DATABRICKS_CONFIG_PROFILE) are marked
required=False with descriptions explaining the alternative.

* chore: update tool specifications

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-05-21 16:20:58 -03:00
Thiago Moretto
56b6594669 fix(tools): correct mongdb typo to pymongo in package_dependencies (#5891)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
* fix(tools): correct mongdb typo to pymongo in package_dependencies

The `package_dependencies` field in `MongoDBVectorSearchTool` referenced
the non-existent package `mongdb` instead of the actual PyPI package
`pymongo`, which is the driver imported and used throughout the file.

* chore: update tool specifications

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-05-21 10:57:17 -04:00
Greyson LaLonde
81c21e3166 feat: bump versions to 1.14.6a1
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2026-05-21 15:09:48 +08:00
Greyson LaLonde
b4b285764c fix: harden RuntimeState serialization across entity fields
Adds missing serializers, discriminators, and exclude markers on entity
fields that previously crashed model_dump_json or restored ambiguously:

- Flow.persistence: add _serialize_persistence; drop | Any escape hatch
- Flow.input_provider: SerializableInstance dotted-path round-trip
- BaseAgent.agent_executor: add _serialize_executor_ref
- BaseAgent.tools_handler / cache_handler: exclude=True
- Memory / MemoryScope / MemorySlice: memory_kind Literal discriminator
- Knowledge.storage / .embedder: exclude live client, serialize spec
- BaseKnowledgeSource subclasses: source_type Literal + dict-resolver
- BaseKnowledgeSource.storage / chunk_embeddings: exclude=True
- input_provider: enforce InputProvider protocol via dedicated
  validator/serializer; reject non-class dotted paths in
  _dotted_path_to_instance
- MemoryScope/MemorySlice: allow restore without live Memory; expose
  bind() to reattach the dependency post-restore
- Knowledge.embedder: add BeforeValidator that resolves provider_class
  dotted paths back to a BaseEmbeddingsProvider subclass
2026-05-21 14:53:40 +08:00
alex-clawd
418afd29e7 feat: Skills Repository — registry, cache, CLI, and SDK integration (#5867)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
* feat: add Skills Repository — registry, cache, CLI, and SDK integration

Adds a Skills Repository feature allowing users to publish, install,
and use skills from the CrewAI registry with @org/skill-name refs.

## What's New

### SDK (lib/crewai/)
- SkillFrontmatter: added optional 'version' field (backward compatible)
- SkillCacheManager: manages ~/.crewai/skills/{org}/{name}/ with
  .crewai_meta.json tracking, path-traversal-safe tar extraction
- SkillRegistry: parse @org/skill-name refs, local-first resolution
  (./skills/ > cache > download), interactive prompt on first use,
  CI-mode guard (CREWAI_NONINTERACTIVE/CI env vars)
- Agent.skills and Crew.skills widened to accept str refs (@org/name)
- set_skills() resolves registry refs with org-prefixed dedup keys
- New events: SkillDownloadStartedEvent, SkillDownloadCompletedEvent

### CLI (lib/cli/)
- crewai skill create <name> — context-aware (project vs standalone)
- crewai skill install @org/name — downloads to ./skills/ or cache
- crewai skill publish — ZIP + upload to org registry
- crewai skill list — show installed skills

### PlusAPI (lib/crewai-core/)
- Added SKILLS_RESOURCE, get_skill(), publish_skill(), list_skills()

### Scaffolding
- crew and flow templates now include skills/ directory

### Tests
- 91 SDK skill tests + 15 CLI skill tests, all passing

* fix: address all CI failures and CodeRabbit review comments

Lint:
- Remove unused imports (click, pytest, json)
- Replace try-except-pass with logging (S110)
- Fix unprotected zipfile.extractall (S202)

Security:
- Path traversal: startswith → is_relative_to for tar extraction
- Add path traversal protection to ZIP extraction via _safe_extract_zip
- Both cache.py and CLI main.py hardened

Type checker:
- Fix import path: crewai.events.event_bus (not crewai_event_bus)
- Remove unused type: ignore comments
- Fix type mismatches in set_skills() variable types

Code quality:
- Fix f-string interpolation in SkillNotCachedError
- Use ValidationError instead of Exception in test

* style: ruff format + autofix remaining lint errors

* refactor: reuse SDK parser and SkillCacheManager in CLI

- _parse_frontmatter() now delegates to crewai.skills.parser.parse_frontmatter
  when available, with a minimal fallback for CLI-only installs
- install() global cache path now reuses SkillCacheManager.store() instead
  of duplicating metadata writing logic

* refactor: add _print_current_organization to SkillCommand (matches ToolCommand pattern)

* fix: write .crewai_meta.json in fallback install path

CodeRabbit caught that the ImportError fallback in install() didn't write
cache metadata, making skills invisible to 'crewai skill list'.

* fix: tighten @org/name ref validation to prevent path traversal

Reject refs with multiple slashes (@org/a/b), dot segments (@../skill),
or leading dots in org/name. Applied to both CLI install() and SDK
parse_registry_ref() so the contract is enforced consistently.

* fix: update test assertions to match tightened error messages

* fix: align OSS client with AMP API contract

- download_skill(): fetch download_url (presigned URL) instead of
  expecting inline base64. Falls back to 'file' field for compat.
- Read 'latest_version' field, fall back to 'version'
- Same fixes applied to CLI install() command

* fix: publish as tar.gz (matches AMP content_type validation) + add zip fallback to SDK cache

CLI publish:
- _build_skill_zip → _build_skill_tarball (tar.gz format)
- Content type: application/x-gzip (matches SkillVersion validation)

SDK cache:
- store() now tries tar.gz first, falls back to zip extraction
- Added _safe_extract_zip for path-traversal-safe zip handling
- Both formats work for download/install regardless of server format

---------

Co-authored-by: João Moura <joaomdmoura@gmail.com>
2026-05-20 14:38:25 -03:00
Greyson LaLonde
35f693cf68 chore: tighten typing across plus_api client
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Adds typed containers for wire payloads, literal aliases for HTTP method
and log type, and Ffnal markers on resource constants. Updates
upstream returns in project_utils.py and deploy/main.py to match
the new contracts.
2026-05-20 01:43:48 +08:00
Greyson LaLonde
da15554d81 feat: generate categorized release notes for enterprise 2026-05-20 00:24:26 +08:00
Greyson LaLonde
c50da7a6f2 feat: bump versions to 1.14.5 2026-05-19 03:11:26 +08:00