Add a description and examples to every FlowDefinition field and
standardize on `typing.Literal`, so the generated JSON schema documents
itself — each action discriminator, state branch, and config option
explains what it is and shows a realistic value.
Examples live on individual fields only, never at the model level, which
keeps the schema readable for tooling that renders field-level help.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Add script/code blocks to FlowDefinition
Let a Flow method run trusted inline Python with `call: script`. The code
is compiled once into a generated function and receives the runtime
values as arguments.
```yaml
methods:
  normalize:
    start: true
    do:
      call: script
      code: |
        import math
        state["rounded"] = math.ceil(state["raw_score"])
        return f"rounded:{state['rounded']}"
```
Even though this exposes the same surface as tools (custom code), I
made it opt-in for now: set `CREWAI_ALLOW_FLOW_SCRIPT_EXECUTION=1` to
enable it.
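A minimal sketch of the compile-once idea, not the actual CrewAI
implementation: the script body is indented under a generated function
definition, compiled with `exec`, and then called with the runtime
values. The helper name `compile_script`, the generated name `_script`,
and passing only `state` are assumptions for illustration; the
environment variable is the one named above.

```python
import os
import textwrap


def compile_script(code: str, arg_names: tuple[str, ...] = ("state",)):
    """Wrap a script body in a generated function and compile it once."""
    # Opt-in gate: refuse to compile unless the flag is set.
    if os.environ.get("CREWAI_ALLOW_FLOW_SCRIPT_EXECUTION") != "1":
        raise PermissionError("flow script execution is not enabled")
    # Hypothetical generated signature: def _script(state): <body>
    source = f"def _script({', '.join(arg_names)}):\n" + textwrap.indent(code, "    ")
    namespace: dict = {}
    exec(compile(source, "<flow-script>", "exec"), namespace)
    return namespace["_script"]


os.environ["CREWAI_ALLOW_FLOW_SCRIPT_EXECUTION"] = "1"  # opt in for this demo
script = (
    "import math\n"
    'state["rounded"] = math.ceil(state["raw_score"])\n'
    "return f\"rounded:{state['rounded']}\"\n"
)
normalize = compile_script(script)  # compiled once, callable many times
state = {"raw_score": 3.2}
result = normalize(state)
```

Because the body becomes a real function, a top-level `return` in the
script is legal and the compiled callable can be reused across calls.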
* Address code review comments
* Replace eval with safe expression parser in calculator tool example
Update the calculator tool example in the CLI template to use
ast.parse instead of eval for expression evaluation.
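For reference, an `ast.parse`-based evaluator can look roughly like the
sketch below. It is not the template's exact code: `safe_eval` and the
set of allowed operators are assumptions chosen for illustration.

```python
import ast
import operator

# Only these operators are evaluated; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. never reach execution.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval").body)
```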
Co-authored-by: Vinicius Brasil <vini@hey.com>
* Replace calculator example with practical file reader tool
* Use word count example - safe, no file/eval risk
---------
Co-authored-by: Vinicius Brasil <vini@hey.com>
The litellm extra was capped at <1.85, which excludes future
patch lines and reintroduces resolution failures under uv/pip.
Widen to >=1.84.0,<2 so the extra resolves cleanly against
crewai's openai/python-dotenv pins.
Closes OSS-71
* feat: adopt directory-based docs versioning with Edge channel
Switch docs.crewai.com from navigation-only versioning (every version
selector entry rendered the same docs/<lang>/* source files) to
Mintlify's directory-based versioning so each version selector entry
renders its own snapshot. Add an "Edge" channel under docs/edge/<lang>/*
that always reflects main HEAD for unreleased work, eliminating
pre-release leakage onto frozen release labels. External links to
canonical /<lang>/* URLs are preserved via wildcard redirects that
always land on the current default version.
Layout:
- docs/edge/<lang>/* rolling source (you edit here)
- docs/edge/enterprise-api.*.yaml
- docs/v<X.Y.Z>/<lang>/* frozen, immutable snapshots
- docs/v<X.Y.Z>/enterprise-api.*.yaml
- docs/images/ shared, append-only
- docs/docs.json nav + redirects
URLs follow the Mintlify-idiomatic shape: /edge/<lang>/<page> for
Edge, /v<X.Y.Z>/<lang>/<page> for every frozen snapshot. The wildcard
redirects /<lang>/:slug* -> /<default>/<lang>/:slug* keep stale links
working, and every freeze rewrites them (plus all per-section/per-page
redirects) so destinations always resolve to the current default
without depending on a second redirect hop.
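The effect of the wildcard redirect can be sketched as a pure function.
This only illustrates the mapping described above; the real redirects
live in docs/docs.json, and `resolve_canonical` plus the example
language set are hypothetical.

```python
def resolve_canonical(path: str, default_version: str, langs: frozenset[str]) -> str:
    """Map /<lang>/:slug* to /<default>/<lang>/:slug*; leave other paths alone."""
    head, _, rest = path.lstrip("/").partition("/")
    if head in langs:
        suffix = f"/{rest}" if rest else ""
        return f"/{default_version}/{head}{suffix}"
    # Already versioned (/edge/..., /v1.14.7/...) or not a docs language path.
    return path


LANGS = frozenset({"en", "ar"})  # assumed subset of the site's locales
target = resolve_canonical("/en/concepts/flows", "v1.14.7", LANGS)
```

Rewriting the destinations on every freeze, as described above, amounts
to calling this with the new default version rather than chaining a
second redirect.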
Release flow integration (devtools release):
- New module crewai_devtools.docs_versioning.freeze() materialises
docs/v<X.Y.Z>/ from docs/edge/, rewrites openapi: refs inside the
snapshot, inserts the version into every language block in
docs.json, and refreshes all redirect destinations.
- _update_docs_and_create_pr() in cli.py now calls that freeze during
Phase 2 of devtools release. Edge changelogs are updated first (so
the snapshot freeze picks them up), then the snapshot is staged
alongside docs.json, branched as docs/freeze-v<X.Y.Z>, and the PR
is titled [docs-freeze] docs: snapshot and changelog for v<X.Y.Z>
— the title prefix the new CI guard reads.
- The PR still gates tag, GitHub release, PyPI publish, and the
enterprise release as before; no new PRs are added.
- Pre-releases (1.X.YaN, 1.X.YbN, ...) skip the snapshot — they ride
Edge — and the docs PR title omits the [docs-freeze] prefix.
- docs_check (AI-generated docs scaffolding) writes to
docs/edge/<lang>/* so newly-generated unreleased docs land in Edge
and never accidentally touch a frozen snapshot.
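The release-versus-pre-release branching above can be sketched as
follows. This is an assumption-laden illustration, not the devtools
code: `plan_docs_pr`, the pre-release regex (a/b/rc suffixes), and the
pre-release PR title wording are hypothetical; the branch name and
`[docs-freeze]` title come from the description above.

```python
import re

# Assumed pre-release shape: 1.X.YaN, 1.X.YbN, 1.X.YrcN.
_PRERELEASE = re.compile(r"^\d+\.\d+\.\d+(?:a|b|rc)\d+$")


def plan_docs_pr(version: str) -> dict:
    """Decide whether a release freezes a snapshot and how its docs PR is titled."""
    if _PRERELEASE.match(version):
        # Pre-releases ride Edge: no snapshot, no [docs-freeze] prefix.
        return {"freeze": False, "title": f"docs: changelog for v{version}"}
    return {
        "freeze": True,
        "branch": f"docs/freeze-v{version}",
        "title": f"[docs-freeze] docs: snapshot and changelog for v{version}",
    }
```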
Migration scripts (one-shot):
- scripts/docs/freeze_historical_versions.py reconstructs all 16
historical snapshots (v1.10.0 .. v1.14.7) from git tags via
git archive | tar, rewriting openapi: MDX refs so each snapshot
reads its own enterprise-api YAML rather than the live one.
- scripts/docs/prefix_version_paths.py one-shot-migrates docs.json:
rewrites every page path in 16 versioned blocks to point under
docs/v<X.Y.Z>/, inserts a new Edge entry per language, tags
v1.14.7 as Latest (default), prunes pages whose target file
doesn't exist in the snapshot (e.g. docs/ar/ didn't exist before
v1.12.0), and writes the wildcard + per-section redirects.
- scripts/docs/freeze_current_edge.py is now a thin CLI wrapper
around docs_versioning.freeze for manual one-off freezes (e.g.
retroactively snapshotting a forgotten release).
CI guards (.github/workflows/docs-snapshots.yml):
- Frozen snapshots under docs/v[0-9]*/ are immutable; only PRs whose
title contains [docs-freeze] (i.e. release-cut PRs generated by
devtools release or the manual wrapper) may modify them.
- Images under docs/images/ are append-only since snapshots share a
single image directory. Deleting or renaming an image breaks every
historical snapshot that still references it.
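The two guard rules can be expressed as a small check over a PR's
changed files. This is a sketch of the logic only; the actual workflow
is in .github/workflows/docs-snapshots.yml, and `guard_violations` and
its `(status, path)` input shape (modeled on `git diff --name-status`,
using the old path for renames) are assumptions.

```python
import re

_FROZEN = re.compile(r"^docs/v[0-9][^/]*/")
_IMAGES = "docs/images/"


def guard_violations(pr_title: str, changes: list[tuple[str, str]]) -> list[str]:
    """Return human-readable violations for a list of (status, path) changes."""
    freeze_pr = "[docs-freeze]" in pr_title
    violations = []
    for status, path in changes:
        # Rule 1: frozen snapshots may only change in release-cut PRs.
        if _FROZEN.match(path) and not freeze_pr:
            violations.append(f"frozen snapshot modified: {path}")
        # Rule 2: shared images are append-only (no deletes or renames).
        if path.startswith(_IMAGES) and status[:1] in {"D", "R"}:
            violations.append(f"image removed or renamed: {path}")
    return violations
```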
Restored docs/images/crewai-otel-export.png from PR #3673; it was
deleted in PR #4908 but v1.10.0 / v1.10.1 snapshots still reference
it. Restoring instead of editing the snapshots preserves historical
rendering fidelity and validates the new append-only rule
retroactively.
Tests:
- lib/devtools/tests/test_docs_versioning.py covers the freeze: file
copy, openapi rewrite, version insertion, default demotion, redirect
upserts, per-section redirect rewriting, idempotency, and invalid
inputs.
Verified locally with mintlify broken-links: 0 broken links across
the full site (Edge + 16 frozen versions, 4 locales).
AGENTS.md (repo root) is the contributor guide for the new model;
RELEASING.md is the release-cut runbook; README's Contribution
section links to both.
Co-authored-by: Cursor <cursoragent@cursor.com>
* style: resolve linter issues
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* Enhance memory reset functionality and JSON crew handling
- Added `reset_all` method to the `Memory` class to reset the entire memory store, ignoring `root_scope`.
- Updated the `Crew` class to utilize `reset_all` when resetting memory.
- Enhanced the `_reset_flow_memory` function to check for `Memory` instances and call `reset_all` accordingly.
- Introduced helper functions to load JSON crew configurations and handle project declarations, improving the reset command's flexibility.
- Added tests to validate the new JSON crew memory reset behavior and ensure proper handling of declared flow projects.
* Fix memory reset review issues
* Bump litellm for security advisory
Replace the single FlowStateDefinition model with a `type`-discriminated
union of FlowDictStateDefinition, FlowPydanticStateDefinition,
FlowJsonSchemaStateDefinition, and FlowUnknownStateDefinition.
Each branch only carries the fields it actually uses and forbids extras,
so an invalid combination like a `dict` state with a `ref` now fails
validation instead of being silently accepted. The runtime reads `ref`
and `json_schema` defensively since they no longer exist on every branch.
```yaml
state:
  type: json_schema
  json_schema:
    type: object
    properties:
      topic:
        type: string
```
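The "each branch forbids extras" behavior can be illustrated with a
stdlib-only stand-in. This is not the Pydantic union itself: the
per-branch field sets and the `pydantic` type name are assumptions, and
the unknown-state branch is deliberately not modeled.

```python
# Assumed fields per branch; the real models may carry more.
_ALLOWED_FIELDS = {
    "dict": {"type"},
    "pydantic": {"type", "ref"},
    "json_schema": {"type", "json_schema"},
}


def validate_state(definition: dict) -> dict:
    """Pick a branch by `type`, then reject fields that branch does not define."""
    kind = definition.get("type")
    if kind not in _ALLOWED_FIELDS:
        raise ValueError(f"state type not modeled in this sketch: {kind!r}")
    extras = set(definition) - _ALLOWED_FIELDS[kind]
    if extras:
        raise ValueError(f"{kind!r} state does not accept: {sorted(extras)}")
    return definition
```

With this shape, a consumer has to read `ref` or `json_schema` with
`definition.get(...)`, which is the defensive access mentioned above.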
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Update installation and quickstart documentation for JSON-first crew projects
- Revised the installation guide to reflect the new JSON-first project structure, detailing the creation of `crew.jsonc` and `agents/*.jsonc` files.
- Updated the quickstart guide to demonstrate setting up agents and tasks using JSONC format, replacing previous YAML examples.
- Enhanced the agents and tasks documentation to clarify the transition from YAML to JSONC, including examples and explanations of the new structure.
- Added notes on the classic YAML structure for legacy projects and provided guidance on migrating to the new format.
* docs: clarify json crew quickstart guidance
* docs: address json docs review feedback
* Implement DMN mode support in crew creation and execution
- Added `is_dmn_mode_enabled` utility to check for enterprise non-interactive mode based on the `CREWAI_DMN` environment variable.
- Updated `create` function in `cli.py` to enforce required parameters when DMN mode is active, raising appropriate usage errors.
- Enhanced `create_crew` and `create_json_crew` functions to skip provider prompts and handle folder existence checks in DMN mode.
- Introduced non-interactive defaults for agent and task creation in DMN mode, ensuring seamless project setup without user input.
- Modified `run_crew` to bypass TUI and handle runtime inputs directly when in DMN mode, improving execution flow for JSON-defined crews.
- Added tests to validate DMN mode behavior, ensuring correct handling of required inputs and non-interactive defaults.
* Implement DMN mode support in crew creation and execution
- Introduced `is_dmn_mode_enabled()` utility to check for non-interactive mode based on the `CREWAI_DMN` environment variable.
- Updated `create` function to enforce required parameters when DMN mode is active, raising appropriate usage errors.
- Modified `create_crew` and `create_json_crew` functions to skip provider prompts and utilize non-interactive defaults in DMN mode.
- Enhanced `run_crew` to bypass TUI and handle runtime inputs directly in DMN mode, ensuring smooth execution without user interaction.
- Added tests to validate DMN mode behavior, including requirements for type and name, and ensuring proper handling of existing folders and missing inputs.
* Enhance crew loading and validation logic
- Updated `crew_loader.py` to pass the project root when loading task and agent definitions, improving the handling of Python references.
- Refactored `json_loader.py` to include additional validation for Python references, ensuring they are resolved within the project root and enforcing depth limits.
- Added tests in `test_crew_loader.py` and `test_json_loader.py` to validate rejection of unsafe Python references and input files outside the project root.
- Improved error handling for JSON project validation, ensuring clearer feedback for invalid configurations.
* Refactor tests for hierarchical verbose manager agent
- Removed `@pytest.mark.vcr()` decorators from `test_hierarchical_verbose_manager_agent` and `test_hierarchical_verbose_false_manager_agent`.
- Introduced mocking for task outputs in both tests to simulate execution without relying on external dependencies.
- Ensured that the `crew.kickoff()` method is called within a context that patches the `Task.execute_sync` method, improving test isolation and reliability.
* Fix JSON loader PR review comments
* Fix JSON loader project root after rebase
* Handle UNC paths in JSON input files
* Enhance JSON crew project handling and validation
- Updated `create_json_crew.py` to specify input files with a brief path.
- Refactored `crew_loader.py` to improve agent and task loading logic, including the introduction of a `build_agent` function and better handling of task classes.
- Enhanced `json_loader.py` with additional validation for agent and task definitions, including support for Python references and conditional tasks.
- Added tests in `test_crew_loader.py` and `test_json_loader.py` to ensure proper loading of agents, tasks, and validation of project structures, including custom types and conditional tasks.
- Improved error handling and validation safety across the project loading process.
* Enhance JSON crew configuration options in create_json_crew.py
- Added optional fields for custom agent subclasses and advanced task options, including condition checks and output specifications.
- Improved documentation comments for better clarity on agent and task configurations.
- Updated JSON crew handling to support additional callbacks for pre- and post-execution processes.
* Enhance JSON crew template tests in test_create_crew.py
- Added assertions for new optional fields in crew and agent templates, including conditional tasks, custom converters, and input file specifications.
- Improved validation checks for manager agents and callback references to ensure proper configuration in JSON crew definitions.
- Expanded documentation references within the tests to provide clearer guidance on the expected structure and usage of crew templates.
* Fix JSON crew PR review issues
* Update crewAI CLI with various enhancements and fixes
- Updated `create_json_crew.py` to require `crewai[tools]>=1.14.7`.
- Enhanced `git.py` with improved repository initialization, including automatic initial commit creation and exclusion patterns for initial commits.
- Modified `install_crew.py` to allow error handling during installation with an optional `raise_on_error` parameter.
- Expanded `plus_api.py` to include methods for creating and updating crews from ZIP files.
- Introduced a new `archive.py` for creating deployable ZIP archives of CrewAI projects, ensuring local artifacts are excluded.
- Updated `run_crew.py` to manage JSON crew dependencies and run crews in the project's environment.
- Enhanced deployment logic in `main.py` to handle ZIP uploads and improve user feedback during deployment processes.
- Added tests for new functionalities and ensured existing tests reflect recent changes in behavior and requirements.
* fix(cli): address deploy zip review feedback
* fix(cli): sync missing lockfile before deploy
* fix(cli): preserve remote deploy on git setup warnings
* test(cli): use single deploy main import style
* fix(cli): skip project install for json crew sync
* fix(cli): load json runner from source checkout
* fix(cli): skip json crew sync when locked
* fix(cli): address deploy zip review feedback
* fix(cli): pass env on zip redeploy
* fix(cli): harden json run and zip fallback
* fix(cli): validate before deploy lock install
* fix(cli): respect poetry lock for json runs
* fix(cli): align json zip wrapper detection
* fix(deps): bump starlette audit floor
* fix(cli): avoid auth retry for deploy exits
* fix(cli): update json zip script entrypoints
* feat(cli): introduce JSON crew project support and TUI enhancements
- Added support for creating and running JSON-defined crew projects, allowing users to scaffold projects with a new `create_json_crew.py` file.
- Implemented a full-screen Textual TUI for crew execution in `crew_run_tui.py`, enhancing user interaction with a two-column layout.
- Updated `run_crew.py` to prioritize JSON crew projects and added daemon mode for running without TUI.
- Introduced interactive pickers in `tui_picker.py` for improved CLI prompts.
- Enhanced validation for JSON crew files in `validate.py` to ensure proper structure and agent definitions.
- Updated `.gitignore` to exclude demo and crewai directories.
* feat: update LLM model references to gpt-5.4-mini
- Changed default LLM model from gpt-4o-mini to gpt-5.4-mini across various files, including CLI options, JSON crew configurations, and agent definitions.
- Enhanced benchmark and human feedback functionalities to utilize the new model.
- Improved user interface elements in the TUI for better interaction and feedback during execution.
- Added support for new skills directory in JSON crew project creation.
* feat(benchmark): add crew-level benchmarking functionality
- Introduced a new `benchmark` command in the CLI for crew-level benchmarking, allowing users to specify agents, models, and timeout settings.
- Implemented `CrewBenchmarkCase` to handle crew-level benchmark cases with inputs and criteria.
- Enhanced the benchmark runner to support progress tracking and detailed reporting of results for multiple models.
- Added tests for loading crew benchmark cases and validating their structure.
- Updated existing benchmark functions to accommodate the new crew-level execution model.
* feat(cli): enhance JSON crew project functionality and TUI improvements
- Added optional agent-level guardrails and advanced options in JSON crew configurations to improve output validation and flexibility.
- Updated the TUI to better handle plan step statuses, including visual indicators for task completion and failure.
- Introduced methods for parsing and managing step observation events, ensuring accurate updates to task statuses during execution.
- Enhanced validation for JSON crew projects, ensuring proper structure and error handling for agent and task definitions.
- Added comprehensive tests for new features and validation logic, ensuring robustness in JSON crew project handling.
* refactor(cli): streamline JSON crew project handling and improve validation
- Refactored JSON crew project loading and validation logic to enhance clarity and maintainability.
- Introduced utility functions for finding JSON crew files, improving code reuse across modules.
- Removed deprecated benchmark functionality and associated tests to simplify the codebase.
- Updated CLI commands to utilize the new JSON project structure, ensuring compatibility with recent changes.
- Enhanced test coverage for JSON crew project features, ensuring robust validation and error handling.
* feat(cli): enhance activity log navigation and focus management
- Added functionality to focus on the activity log when navigating through log entries.
- Implemented refresh logic for the log panel to ensure updates are displayed correctly during navigation.
- Improved keyboard navigation for log entries, allowing users to expand and scroll through logs seamlessly.
- Added tests to verify the correct behavior of log navigation and focus management in the TUI.
* feat(cli): enhance JSON crew project interaction and input handling
- Introduced a new function to enable prompt line editing for better user experience during input prompts.
- Updated the JSON crew project wizards to show interpolation hints for dynamic values, improving user guidance.
- Enhanced the handling of missing input placeholders by prompting users for required values during crew setup.
- Refactored the crew run logic to ensure proper loading and preparation of JSON-defined crews, including runtime input management.
- Added tests to verify the correct behavior of new input handling features and JSON crew project interactions.
* feat(cli): improve crew project input prompts and event handling
- Enhanced the `_prompt_text` function to allow for configurable spacing before prompts, improving user experience during input collection.
- Updated the wizards for agent and task creation to utilize the new prompt configuration, ensuring a more compact and streamlined interaction.
- Introduced new plan step lifecycle events (`PlanStepStartedEvent`, `PlanStepCompletedEvent`) to better track the execution status of plan steps.
- Refactored the step executor to emit these events during the execution of tasks, improving observability and debugging capabilities.
- Added tests to verify the correct behavior of new prompt handling and event emissions during crew project execution.
* fix: refine json-first crew interactions
* fix: prioritize common json crew tools
* fix: make json crew more tools expandable
* fix: show json crew tools by category
* feat(memory): update default embedder to OpenAI text-embedding-3-large and enhance memory compatibility
- Changed the default embedding model for Memory to OpenAI text-embedding-3-large, which uses 3072-dimensional vectors.
- Added warnings regarding compatibility issues with existing local memory stores created with 1536-dimensional embeddings.
- Updated documentation to reflect the new default embedder and its configuration options.
- Enhanced the CLI and codebase to support the new embedding model across various components, ensuring a seamless transition for users.
* fix: address PR review feedback for JSON-first crews
Review blockers:
- Forward trained_agents_file to JSON crews: crewai run -f now exports
CREWAI_TRAINED_AGENTS_FILE for the in-process JSON crew path
- Wizard agent picker: Esc/cancel now reprompts instead of silently
assigning the first agent
- JSON tool resolution hard-fails: unknown tool names, missing custom
tool files, and invalid custom tool modules raise JSONProjectError
with actionable messages instead of warn-and-continue
- Embedding dimension mismatch: LanceDB and Qdrant Edge storages raise
EmbeddingDimensionMismatchError with reset/pin guidance instead of
silently zero-filling vectors or returning empty search results
- Custom tool code execution documented in loader docstring and the
scaffolded project README
CI fixes:
- ruff format across lib/
- All 133 PR-introduced mypy errors fixed (llm.py lazy-litellm and
cli.py lazy command shims now use TYPE_CHECKING imports; textual
is_mounted misuse fixed; pick_many overloads; misc annotations)
Bot review comments:
- Empty except blocks now have explanatory comments or debug logging
- Removed unused _C_BG/_C_PANEL/_C_BORDER globals and redundant
import re; tests use a single import style for create_json_crew
Tests: trained-agents propagation, wizard cancel, tool resolution
failures, and dimension mismatch guidance.
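The raise-instead-of-zero-fill behavior for embedding dimensions can be
sketched like this. `DimensionMismatch` is a hypothetical stand-in for
EmbeddingDimensionMismatchError, and `check_dimensions` is not a real
storage method.

```python
class DimensionMismatch(Exception):
    """Hypothetical stand-in for EmbeddingDimensionMismatchError."""


def check_dimensions(vectors: list[list[float]], expected_dim: int) -> None:
    """Fail loudly when any vector does not match the store's dimension."""
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise DimensionMismatch(
                f"record {index} has {len(vector)} dimensions but the store "
                f"expects {expected_dim}; reset the store or pin the previous embedder"
            )
```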
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix: address second round of PR review comments
Cursor Bugbot:
- Wizard agent slugs: strip to [a-z0-9_] and fall back to agent_<n> so
symbol-only roles can't produce an empty agents/.jsonc filename
- Wizard task names: dedupe against prior task names and fall back to
task_<n> for symbol-only descriptions
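A sketch of the slug fallback for agent filenames. Replacing runs of
disallowed characters with a single underscore is an assumption; the
wizard may simply drop them. `agent_slug` is a hypothetical name.

```python
import re


def agent_slug(role: str, index: int) -> str:
    """Reduce a role to [a-z0-9_]; fall back to agent_<n> if nothing survives."""
    slug = re.sub(r"[^a-z0-9_]+", "_", role.lower()).strip("_")
    return slug or f"agent_{index}"


filename = f"agents/{agent_slug('!!!', 2)}.jsonc"
```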
CodeRabbit:
- Agent.message(): import Task explicitly at runtime instead of relying
on the namespace injection done by crewai/__init__
- Async executor: move the native-tools-unsupported fallback from
_ainvoke_loop_react (self-recursion) to _ainvoke_loop_native_tools,
mirroring the sync implementation
- StepExecutor downgrade: keep the in-step conversation and append the
text-tooling instructions instead of rebuilding messages, so completed
native tool calls are not re-executed
- crewai-files: extension-based MIME lookup now runs before byte
sniffing so csv/xml types are not degraded to text/plain
- Memory storages: validate every record in a save() batch against a
consistent embedding dimension (LanceDB previously checked only the
first record); added mixed-batch tests
- _print_post_tui_summary now typed against CrewRunApp
- Docs: Azure OpenAI default embedder change called out in the memory
migration warning and provider table
Code quality bots:
- Removed unused _C_YELLOW/_C_CYAN (crew_run_tui) and _GREEN (tui_picker)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(cli): accordion tool picker in JSON crew wizard
The flat tool list had grown to ~90 rows. The picker now shows:
- Common tools always visible at the top
- Every other category as a single expandable row with tool and
selection counts (e.g. "Search & Research (27 tools, 2 selected)")
- Expanding a category collapses the previously expanded one
- Selections persist across expand/collapse via new preselected
support in pick_many; cursor follows the toggled category row
tui_picker gains preselected + initial_cursor options on pick_many,
and Esc in multi-select now confirms the current selection instead of
discarding it (required so collapsing can't silently drop choices).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* refactor(cli): remove --daemon flag from crewai run
The flag only affected JSON crew projects — classic and flow projects
ignored it entirely, which made the behavior inconsistent. Removed the
option, the daemon code path (_run_json_crew_daemon), and its helper
(_load_json_crew_with_inputs).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* test: update run command tests after --daemon removal
lib/crewai/tests/cli/test_run_crew.py still asserted the old
run_crew(trained_agents_file=..., daemon=False) call signature.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): exit codes, mid-run quit, async statuses, hyphen placeholders
Addresses the latest Bugbot review round:
- Failed JSON crew runs now exit non-zero (SystemExit(1)) so scripts
and CI don't treat failures as success, mirroring the classic path
- Quitting the TUI mid-run now ends the process (os._exit(130));
kickoff runs in a thread worker that cannot be force-cancelled, so
letting the CLI return would leave LLM/tool work burning tokens in
the background
- Sidebar task statuses are now async-safe: completion/failure events
resolve the task's own row via identity instead of assuming the most
recently started task, and starting a task no longer blanket-marks
earlier active rows as done
- The runtime-input prompt regex now accepts hyphenated placeholder
names ({my-topic}), matching kickoff's interpolation pattern
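A sketch of a placeholder scan that accepts hyphenated names. The exact
pattern used by kickoff is not quoted above, so the regex here is an
assumption, as is the helper name `missing_inputs`.

```python
import re

# Assumed pattern: a letter or underscore, then letters, digits, _ or -.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_\-]*)\}")


def missing_inputs(text: str, provided: dict) -> list[str]:
    """List placeholder names in text that have no provided value, in order."""
    missing: list[str] = []
    for name in _PLACEHOLDER.findall(text):
        if name not in provided and name not in missing:
            missing.append(name)
    return missing
```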
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix: validation safety, custom tool sandboxing, TUI log integrity, memory error surfacing
- Deploy validation no longer executes project code: validation mode
checks tool declarations structurally (well-formed entries, custom
tool file exists) without importing or instantiating anything.
custom:<name> resolution only happens on the actual run path.
- custom:<name> is constrained to [A-Za-z_][A-Za-z0-9_]* and the
resolved path must stay inside the project's tools/ directory, so
custom:../foo or absolute-path names cannot execute code outside it.
Tool paths resolve relative to the crew project root, not cwd.
- TUI task logs are built from per-task state captured at task start
(idx, description, agent, start time); an out-of-order completion
takes its output from the event and no longer steals or resets the
current task's streamed steps/output.
- EmbeddingDimensionMismatchError now inherits ValueError instead of
RuntimeError so background saves surface it through
MemorySaveFailedEvent instead of silently dropping the save; the
shutdown catch in _background_encode_batch is narrowed to the
"cannot schedule new futures" case.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
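The `custom:<name>` constraints from the commit above can be sketched
as a name check plus a containment check. The identifier pattern is the
one stated above; the `tools/<name>.py` layout and the function name
`resolve_custom_tool` are assumptions.

```python
import re
import tempfile
from pathlib import Path

_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_custom_tool(spec: str, project_root: Path) -> Path:
    """Resolve custom:<name> to a file that must stay inside <root>/tools/."""
    prefix, _, name = spec.partition(":")
    if prefix != "custom" or not _NAME.fullmatch(name):
        raise ValueError(f"invalid custom tool reference: {spec!r}")
    tools_dir = (project_root / "tools").resolve()
    candidate = (tools_dir / f"{name}.py").resolve()
    # Defense in depth: a symlink could still point outside tools/.
    if not candidate.is_relative_to(tools_dir):
        raise ValueError(f"custom tool escapes tools/: {spec!r}")
    return candidate


root = Path(tempfile.mkdtemp())
(root / "tools").mkdir()
tool_path = resolve_custom_tool("custom:word_count", root)
```

Resolving against the project root rather than the current working
directory keeps the result stable no matter where the CLI is invoked.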
* fix(cli): declared project type wins over crew.json presence
A flow project that also contains a crew.json(c) file now runs and
validates as the flow it declares in pyproject.toml instead of being
hijacked by the JSON crew path. Both crewai run (_has_json_crew) and
deploy validation (_is_json_crew) check tool.crewai.type; a missing or
unreadable pyproject still means a bare JSON crew project.
Also documents why StepObservationFailedEvent intentionally marks the
plan step "done": the event signals an observer failure, not a step
failure, and the executor continues past it.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): type the declared_type locals so mypy stays clean
Comparing an Any-typed .get() chain returns Any, which tripped
no-any-return on the previous commit.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Let users run a Flow from a Flow Definition YAML file or inline string
without writing Python, passing kickoff inputs as `--inputs` JSON. The
flag is gated behind an experimental warning since the definition format
may still change.
A `do:` step can now say `call: tool` and name a CrewAI tool to run,
passing its inputs under `with:`. Before this, a definition could only
point at Python code to run.
```yaml
methods:
  search:
    start: true
    do:
      call: tool
      ref: crewai_tools:ExaSearchTool
      with:
        search_query: ai agents
```
* Drive human feedback from the flow definition
@human_feedback previously wrapped methods with the full HITL runtime (feedback
request, outcome collapse, learn loop), so flows built from a YAML definition —
which carry no decorated callables — could not pause for or route on human
feedback.
* Address code review comments
* Wire config and persistence from FlowDefinition into the runtime
`from_definition` was silently dropping all config fields; it now passes
`config.model_dump()` so suppress_flow_events, max_method_calls, etc.
actually apply.
Persistence is now engine-driven: `_persist_method_completion` fires
after every method using the definition's persist metadata, so
`@persist` no longer needs to wrap methods — it just stamps them.
* Address code review comments
* feat: aggregate LLM token usage at the flow level
Introduces `flow.usage_metrics`, a snapshot of every LLMCallCompletedEvent
emitted under the flow's `current_flow_id` for the duration of one kickoff
(or resume) call. Aggregation happens on the singleton event bus so it
covers crews, direct `LLM.call`s, and nested listener calls — solving the
mismatch where the SDK reported only the last crew's usage while the
Enterprise UI showed the correct full total.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor: centralize provider key normalization in UsageMetrics
Add UsageMetrics.from_provider_dict to normalize raw LLM usage dicts
across providers (LiteLLM, native Anthropic, native Gemini, OpenAI
nested cached). BaseLLM._track_token_usage_internal and the flow-level
aggregator now share this single source of truth, so `flow.usage_metrics`
agrees with per-LLM totals on every provider — including the native
Anthropic path that emits `input_tokens`/`output_tokens` instead of
`prompt_tokens`/`completion_tokens`.
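The normalization idea can be sketched with a small dataclass. This is
a simplified stand-in, not UsageMetrics itself: the `Usage` class and
its two fields are assumptions, and cached-token handling is omitted.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    @classmethod
    def from_provider_dict(cls, raw: dict) -> "Usage":
        """Accept both prompt/completion and input/output key styles."""
        prompt = raw.get("prompt_tokens", raw.get("input_tokens", 0)) or 0
        completion = raw.get("completion_tokens", raw.get("output_tokens", 0)) or 0
        return cls(int(prompt), int(completion))

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.prompt_tokens + other.prompt_tokens,
            self.completion_tokens + other.completion_tokens,
        )


openai_style = Usage.from_provider_dict({"prompt_tokens": 10, "completion_tokens": 5})
anthropic_style = Usage.from_provider_dict({"input_tokens": 7, "output_tokens": 3})
flow_total = openai_style + anthropic_style
```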
* fix: flush event bus before reading aggregated usage_metrics
`crewai_event_bus.emit` dispatches LLMCallCompletedEvent handlers on a
ThreadPoolExecutor (fire-and-forget), so a flow whose last LLM call
completes right before kickoff_async/resume_async returns can detach
the usage listener while that handler is still queued, leaving its
tokens off `flow.usage_metrics`. Match `Crew.kickoff()` and call
`crewai_event_bus.flush()` in both finally blocks so every handler
drains before the listener is detached.
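The ordering issue can be reproduced with a toy bus. This is a sketch
under stated assumptions: `TinyBus` and `UsageListener` are hypothetical
and only mimic the fire-and-forget dispatch described above; they are
not crewai_event_bus.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor, wait


class TinyBus:
    """Toy bus that runs handlers on a thread pool, fire-and-forget."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._handlers: list = []
        self._pending: list[Future] = []

    def on(self, handler) -> None:
        self._handlers.append(handler)

    def off(self, handler) -> None:
        self._handlers.remove(handler)

    def emit(self, event: dict) -> None:
        for handler in list(self._handlers):
            self._pending.append(self._pool.submit(handler, event))

    def flush(self) -> None:
        # Block until every handler queued so far has finished.
        pending, self._pending = self._pending, []
        wait(pending)


class UsageListener:
    def __init__(self) -> None:
        self.total_tokens = 0
        self._lock = threading.Lock()

    def __call__(self, event: dict) -> None:
        time.sleep(0.05)  # simulate a handler that is still queued or running
        with self._lock:
            self.total_tokens += event["total_tokens"]


def run_flow(bus: TinyBus) -> int:
    listener = UsageListener()
    bus.on(listener)
    try:
        bus.emit({"total_tokens": 10})
        bus.emit({"total_tokens": 5})
    finally:
        bus.flush()  # drain first, then detach
        bus.off(listener)
    return listener.total_tokens


total = run_flow(TinyBus())
```

Without the `flush()` call, the total could be read while the handlers
for the last events are still waiting in the pool.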
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* Read flow dispatch from FlowDefinition
Store the definition in a `_definition` PrivateAttr at post-init and
convert the dispatch helpers (`_start_method_names`, `_listener_methods`,
`_start_condition`, `_listen_condition`, `_is_router`) from classmethods
to instance methods that read it. Event names now fall back to
`self._definition.name` instead of `self.__class__.__name__`.
Behavior is identical for decorator subclasses, but the engine no longer
assumes the definition comes from the class. This is the seam for
`Flow.from_definition`, where an instance runs a definition that was
loaded rather than built from a Python subclass.
* Add Flow.from_definition to run flows without a subclass
A FlowDefinition (e.g. loaded from YAML) was only usable for dispatch on
decorator-authored subclasses. Now each method definition records an
importable `module:qualname` handler ref, and `Flow.from_definition`
resolves and binds those handlers to build a runnable flow directly.
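Resolving a `module:qualname` ref generally looks like the sketch
below. `resolve_ref` is a hypothetical helper; how CrewAI binds the
resolved handler to the flow instance is not shown.

```python
import importlib


def resolve_ref(ref: str):
    """Import `module` and walk `qualname` attribute by attribute."""
    module_name, _, qualname = ref.partition(":")
    if not module_name or not qualname:
        raise ValueError(f"expected 'module:qualname', got {ref!r}")
    obj = importlib.import_module(module_name)
    for attr in qualname.split("."):
        obj = getattr(obj, attr)
    return obj


ceil = resolve_ref("math:ceil")
```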
* Build flow state from FlowDefinition
Definition-driven flows previously always started with a bare dict
state.
* Replace handler string with structured FlowActionDefinition
`handler: str | None` was optional and opaque — missing handlers only
surfaced at kickoff time. `do: FlowActionDefinition` is required, so
Pydantic rejects invalid definitions at parse time.
The `call: "code"` discriminator prepares the schema for future
non-Python action types (e.g. MCP tool, crew) without touching
`FlowMethodDefinition`. Resolution logic is extracted to
`runtime/_action_resolvers.py` to keep the dispatch point isolated.
* Fix conversational start router missing required do field
FlowMethodDefinition.do became required when the handler string was
replaced with FlowActionDefinition, but _conversation_start_router still
built its fragment without it, breaking crewai import entirely.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* Add event scoping to flow test
* Change lib/crewai/tests/test_flow_from_definition.py
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
A custom BaseLLM subclass serializes with the inherited llm_type "base",
which the registry maps to the abstract BaseLLM. Restore then crashed on
cls(**value). Rebuild a concrete LLM from the saved config when the
resolved class is abstract.
Checkpoint serialization stamps checkpoint_completed_methods onto every live
Flow in RuntimeState.root, including the agent executor reused across a crew's
tasks. kickoff_async read that stamp as a restore signal, so the second task
replayed the first task's completed methods and never reached a final answer.
Gate is_restoring on _restored_from_checkpoint, set only by
_restore_from_checkpoint, and consume it single-shot.
flow.plot defaults to show=True, which calls webbrowser.open on every run.
The test only asserts FlowPlotEvent is emitted, so disable the browser open.
* ci: ignore GHSA-rrmf-rvhw-rf47 (torch alias of PYSEC-2025-194)
pip-audit reports CVE-2025-3000 under its GHSA id, which the existing
PYSEC-2025-194 ignore does not match. Same advisory: memory corruption
in torch.jit.script, CVSS 1.9, local-only, no fix for torch 2.11.0.
* ci: sync GHSA-rrmf-rvhw-rf47 ignore into pre-commit pip-audit
* improve one less route
* flows in flows, new agent executor causing early trace batch finalization
* addressing comments
* addressing comments pt2
* lint and typecheck fix
* docs: update docs to reflect new state of OpenTelemetry collector
* docs: add OTel collector and Datadog screenshots
These images are referenced by the capture_telemetry_logs guides but were
missing from the tree, which broke the link checker across all locales.
* docs: address PR review on OTel collector guide
- Clarify that OpenTelemetry Traces and Logs are separate integrations
sharing the same fields (resolves Traces/Logs wording inconsistency)
- List regional Datadog OTLP hosts (US1/US3/US5/EU1/AP1) so users outside
US5 can copy the right domain
* decouple convo logic from runtime and add a conversational_definition
* type check fix
* always defer traces for convo and fix tests to reflect that
Re-evaluate the whole `@listen`/`@router` condition tree against the set
of events seen so far, instead of tracking which AND sub-branches remain
pending.
Net effect:
* Fixes a regression where `or_()` short-circuited at the first
satisfied branch, leaving a sibling `and_()` half-complete so a later
trigger could spuriously re-fire the listener
* Removes the fragile per-branch pending state and `id()`-based keys
* Shrinks the evaluator to one readable predicate
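The "re-evaluate the whole tree" approach can be sketched as one
recursive predicate over the set of seen events. The tuple encoding and
the local `or_`/`and_` helpers are stand-ins for illustration, not
CrewAI's own condition objects.

```python
def or_(*children):
    return ("OR", children)


def and_(*children):
    return ("AND", children)


def is_satisfied(condition, seen: set[str]) -> bool:
    """Evaluate the full condition tree against every event seen so far."""
    if isinstance(condition, str):
        return condition in seen
    op, children = condition
    results = (is_satisfied(child, seen) for child in children)
    return any(results) if op == "OR" else all(results)


condition = or_("a", and_("b", "c"))
```

Because nothing is stored per branch, a half-complete `and_()` leaves
no pending state behind that a later trigger could pick up.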