Safari does not re-scroll to hash targets after client-side hydration
replaces the server-rendered DOM. This causes anchored links like
`/concepts/tools#typed-tool-outputs` to load at the top of the page
instead of scrolling to the target section.
This custom script retries scrolling to the hash target after page load,
waiting for React hydration to settle. Mintlify auto-includes any .js
file in the docs directory on every page.
Tested with: Safari 18 on macOS
Currently, tools have a strong input contract through `args_schema`, but no
output contract. This means that anything a tool outputs is converted to
string.
Not only the contract is weak, but the "invisible" conversion to string can
have unexpected effects when the tool returns complex objects like dicts and
arrays.
With this PR, a tool can _optionally_ define an output contract with
`output_schema`. CrewAI validates the raw result and sends the agent JSON.
```python
class ProductResult(BaseModel):
sku: str
name: str
in_stock: bool
class ProductLookupTool(BaseTool):
name: str = "Product Lookup"
description: str = "Look up product availability by SKU."
def _run(self, sku: str) -> ProductResult:
return ProductResult(sku=sku, name="USB-C dock", in_stock=True)
```
If the result does not match the schema, CrewAI warns and falls back to
`str(raw_result)` instead of failing the run:
```python
@tool("Product Lookup", output_schema=ProductResult)
def product_lookup(sku: str) -> dict[str, object]:
return {"sku": sku, "name": "USB-C dock", "in_stock": True}
#=> RuntimeWarning: Failed to validate or serialize output from tool 'Bad Product Lookup' using output_schema 'ProductResult'... Falling back to str(raw_result).
```
This is additive and non-breaking. Existing tools do not need to change. Tools
without `output_schema` keep the old string behavior. Invalid typed outputs
warn and fall back to the old formatting path.
* docs: add "One Card per Step" Studio page (AGE-107)
Document the merge of the task and agent nodes into a single step card on
the Studio canvas. Written as evergreen present-tense feature docs with a
dated rollout banner (June 24th) for the pre-launch customer announcement;
the banner is the only time-bound content and is flagged for removal after
ship. Added in edge + v1.14.7 across en, pt-BR, ko, and ar, with nav entries
in docs.json and three canvas/editor/swap screenshots.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: bump bedrock agentcore dependencies
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: alex-clawd <alex@crewai.com>
* Add single agent action to Flow definitions
Lets a flow method build and run a single CrewAI agent directly, without
wrapping it in a crew. Same idea as the existing `crew` action, but for
one agent.
methods:
answer:
do:
call: agent
with:
role: Analyst
goal: Answer questions
backstory: Knows things.
input: "${state.question}"
start: true
* `input` is required and interpolated from flow state, like
`${state.question}` or `${item}` inside an `each` loop
* optional `response_format` points at a Pydantic model (`{"python":
"models.AnswerModel"}`) to get structured output
* `input` must be a string and its CEL is validated at load time, so bad
expressions like `${state.}` fail early
* Simplify test code
Adds a consolidated `datadog.mdx` under `docs/edge/{en,pt-BR,ko,ar}/enterprise/guides/`
covering both the Datadog Agent path (stdout JSON logs via `CREWAI_LOG_FORMAT=json`)
and the Datadog OTLP intake, with a JSON log schema reference and a ready-to-import
operations dashboard (`datadog_dashboard.json`). Reframes `capture_telemetry_logs.mdx`
to lead with OpenTelemetry as the vendor-neutral path and point readers to the new
Datadog page for that ecosystem's setup.
* Validate flow CEL expressions at definition load time
Promote CEL expression handling to a public Expression API and validate expressions when a FlowDefinition is built instead of when it executes.
Invalid CEL syntax or unknown roots now raise ValidationError from FlowDefinition.from_yaml() and FlowDefinition.from_dict(). Expressions may reference state and outputs, plus item inside each.do; bare identifiers are rejected as unknown roots.
For with values, the CEL contract is intentionally simple: after trimming whitespace, a string is evaluated as CEL only if it starts with ${ and ends with }. Anything else is treated as a literal value, so partial interpolation is not supported. If the content inside the wrapper is not valid CEL, validation fails.
Examples:
```text
"${state.topic}" -> evaluated, returns state.topic
"topic is ${state.topic}" -> literal string
"${state.topic} suffix" -> literal string
"${'a'}${'b'}" -> invalid CEL
```
* Honor explicit empty-context overrides in evaluate() / render_template()
* Use explicit name/action shape for each.do steps
* Add optional `if` expression to `each.do` steps
Lets a step inside an `each` action run conditionally based on a CEL
expression evaluated against `item` and prior step `outputs`.
* feat: update pyproject.toml to specify wheel targets
Added a new section to the pyproject.toml file to include only specific files in the wheel build, enhancing the packaging process. Updated tests to verify the inclusion of these targets.
* feat: add memory save event handling to activity log
Implemented event handlers for MemorySaveStartedEvent, MemorySaveCompletedEvent, and MemorySaveFailedEvent in the crew_run_tui module. This allows the application to log memory save operations, capturing their status and details in the activity log. Added corresponding tests to verify the correct logging behavior for successful and failed memory saves.
* feat: enhance memory save event handling in activity log
Added functionality to suppress nested memory save events and updated the handling of MemorySaveStartedEvent, MemorySaveCompletedEvent, and MemorySaveFailedEvent to improve logging accuracy. Introduced new tests to verify the correct behavior of memory save events, including scenarios for nested events and completion updates for timed-out entries.
* Fix memory save activity log handling
* Normalize alpha package versions
* Update scaffolded crew dependency
* feat: add button to copy setup instructions for CrewAI coding agents
Introduced a button in the documentation that allows users to easily copy setup instructions for CrewAI coding agents. The instructions include installation steps, environment setup, and best practices for using the CrewAI CLI. This enhancement aims to streamline the onboarding process for new users.
* Improve missing CrewAI install guidance
* fix: address pr review feedback
* fix: avoid mismatched memory save rows
* fix: wait for queued memory save events
* fix: avoid matching memory saves on missing ids
* chore: normalize prerelease version to 1.14.8a1
Add a description and examples to every FlowDefinition field and
standardize on `typing.Literal`, so the generated JSON schema documents
itself — each action discriminator, state branch, and config option
explains what it is and shows a realistic value.
Examples live on individual fields only, never at the model level, which
keeps the schema readable for tooling that renders field-level help.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Add script/code blocks to FlowDefinition
Let a Flow method run trusted inline Python with `call: script`. The code
is compiled once into a generated function and receives the runtime
values as arguments.
```yaml
methods:
normalize:
start: true
do:
call: script
code: |
import math
state["rounded"] = math.ceil(state["raw_score"])
return f"rounded:{state['rounded']}"
```
Even though this shares the same surface of tools (custom code), I
decided to make it opt-in for now, using
`CREWAI_ALLOW_FLOW_SCRIPT_EXECUTION=1`.
* Address code review comments
* Replace eval with safe expression parser in calculator tool example
Update the calculator tool example in the CLI template to use
ast.parse instead of eval for expression evaluation.
Co-authored-by: Vinicius Brasil <vini@hey.com>
* Replace calculator example with practical file reader tool
* Use word count example - safe, no file/eval risk
---------
Co-authored-by: Vinicius Brasil <vini@hey.com>
The litellm extra was capped at <1.85, which excludes future
patch lines and reintroduces resolution failures under uv/pip.
Widen to >=1.84.0,<2 so the extra resolves cleanly against
crewai's openai/python-dotenv pins.
Closes OSS-71
* feat: adopt directory-based docs versioning with Edge channel
Switch docs.crewai.com from navigation-only versioning (every version
selector entry rendered the same docs/<lang>/* source files) to
Mintlify's directory-based versioning so each version selector entry
renders its own snapshot. Add an "Edge" channel under docs/edge/<lang>/*
that always reflects main HEAD for unreleased work, eliminating
pre-release leakage onto frozen release labels. External links to
canonical /<lang>/* URLs are preserved via wildcard redirects that
always land on the current default version.
Layout:
- docs/edge/<lang>/* rolling source (you edit here)
- docs/edge/enterprise-api.*.yaml
- docs/v<X.Y.Z>/<lang>/* frozen, immutable snapshots
- docs/v<X.Y.Z>/enterprise-api.*.yaml
- docs/images/ shared, append-only
- docs/docs.json nav + redirects
URLs follow the Mintlify-idiomatic shape: /edge/<lang>/<page> for
Edge, /v<X.Y.Z>/<lang>/<page> for every frozen snapshot. The wildcard
redirects /<lang>/:slug* -> /<default>/<lang>/:slug* keep stale links
working, and every freeze rewrites them (plus all per-section/per-page
redirects) so destinations always resolve to the current default
without depending on a second redirect hop.
Release flow integration (devtools release):
- New module crewai_devtools.docs_versioning.freeze() materialises
docs/v<X.Y.Z>/ from docs/edge/, rewrites openapi: refs inside the
snapshot, inserts the version into every language block in
docs.json, and refreshes all redirect destinations.
- _update_docs_and_create_pr() in cli.py now calls that freeze during
Phase 2 of devtools release. Edge changelogs are updated first (so
the snapshot freeze picks them up), then the snapshot is staged
alongside docs.json, branched as docs/freeze-v<X.Y.Z>, and the PR
is titled [docs-freeze] docs: snapshot and changelog for v<X.Y.Z>
— the title prefix the new CI guard reads.
- The PR still gates tag, GitHub release, PyPI publish, and the
enterprise release as before; no new PRs are added.
- Pre-releases (1.X.YaN, 1.X.YbN, ...) skip the snapshot — they ride
Edge — and the docs PR title omits the [docs-freeze] prefix.
- docs_check (AI-generated docs scaffolding) writes to
docs/edge/<lang>/* so newly-generated unreleased docs land in Edge
and never accidentally touch a frozen snapshot.
Migration scripts (one-shot):
- scripts/docs/freeze_historical_versions.py reconstructs all 16
historical snapshots (v1.10.0 .. v1.14.7) from git tags via
git archive | tar, rewriting openapi: MDX refs so each snapshot
reads its own enterprise-api YAML rather than the live one.
- scripts/docs/prefix_version_paths.py one-shot-migrates docs.json:
rewrites every page path in 16 versioned blocks to point under
docs/v<X.Y.Z>/, inserts a new Edge entry per language, tags
v1.14.7 as Latest (default), prunes pages whose target file
doesn't exist in the snapshot (e.g. docs/ar/ didn't exist before
v1.12.0), and writes the wildcard + per-section redirects.
- scripts/docs/freeze_current_edge.py is now a thin CLI wrapper
around docs_versioning.freeze for manual one-off freezes (e.g.
retroactively snapshotting a forgotten release).
CI guards (.github/workflows/docs-snapshots.yml):
- Frozen snapshots under docs/v[0-9]*/ are immutable; only PRs whose
title contains [docs-freeze] (i.e. release-cut PRs generated by
devtools release or the manual wrapper) may modify them.
- Images under docs/images/ are append-only since snapshots share a
single image directory. Deleting or renaming an image breaks every
historical snapshot that still references it.
Restored docs/images/crewai-otel-export.png from PR #3673; it was
deleted in PR #4908 but v1.10.0 / v1.10.1 snapshots still reference
it. Restoring instead of editing the snapshots preserves historical
rendering fidelity and validates the new append-only rule
retroactively.
Tests:
- lib/devtools/tests/test_docs_versioning.py covers the freeze: file
copy, openapi rewrite, version insertion, default demotion, redirect
upserts, per-section redirect rewriting, idempotency, and invalid
inputs.
Verified locally with mintlify broken-links: 0 broken links across
the full site (Edge + 16 frozen versions, 4 locales).
AGENTS.md (repo root) is the contributor guide for the new model;
RELEASING.md is the release-cut runbook; README's Contribution
section links to both.
Co-authored-by: Cursor <cursoragent@cursor.com>
* style: resolve linter issues
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* Enhance memory reset functionality and JSON crew handling
- Added `reset_all` method to the `Memory` class to reset the entire memory store, ignoring `root_scope`.
- Updated the `Crew` class to utilize `reset_all` when resetting memory.
- Enhanced the `_reset_flow_memory` function to check for `Memory` instances and call `reset_all` accordingly.
- Introduced helper functions to load JSON crew configurations and handle project declarations, improving the reset command's flexibility.
- Added tests to validate the new JSON crew memory reset behavior and ensure proper handling of declared flow projects.
* Fix memory reset review issues
* Bump litellm for security advisory
Replace the single FlowStateDefinition model with a `type`-discriminated
union of FlowDictStateDefinition, FlowPydanticStateDefinition,
FlowJsonSchemaStateDefinition, and FlowUnknownStateDefinition.
Each branch only carries the fields it actually uses and forbids extras,
so an invalid combination like a `dict` state with a `ref` now fails
validation instead of being silently accepted. The runtime reads `ref`
and `json_schema` defensively since they no longer exist on every branch.
```yaml
state:
type: json_schema
json_schema:
type: object
properties:
topic:
type: string
```
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Update installation and quickstart documentation for JSON-first crew projects
- Revised the installation guide to reflect the new JSON-first project structure, detailing the creation of `crew.jsonc` and `agents/*.jsonc` files.
- Updated the quickstart guide to demonstrate setting up agents and tasks using JSONC format, replacing previous YAML examples.
- Enhanced the agents and tasks documentation to clarify the transition from YAML to JSONC, including examples and explanations of the new structure.
- Added notes on the classic YAML structure for legacy projects and provided guidance on migrating to the new format.
* docs: clarify json crew quickstart guidance
* docs: address json docs review feedback
* Implement DMN mode support in crew creation and execution
- Added `is_dmn_mode_enabled` utility to check for enterprise non-interactive mode based on the `CREWAI_DMN` environment variable.
- Updated `create` function in `cli.py` to enforce required parameters when DMN mode is active, raising appropriate usage errors.
- Enhanced `create_crew` and `create_json_crew` functions to skip provider prompts and handle folder existence checks in DMN mode.
- Introduced non-interactive defaults for agent and task creation in DMN mode, ensuring seamless project setup without user input.
- Modified `run_crew` to bypass TUI and handle runtime inputs directly when in DMN mode, improving execution flow for JSON-defined crews.
- Added tests to validate DMN mode behavior, ensuring correct handling of required inputs and non-interactive defaults.
* Implement DMN mode support in crew creation and execution
- Introduced `is_dmn_mode_enabled()` utility to check for non-interactive mode based on the `CREWAI_DMN` environment variable.
- Updated `create` function to enforce required parameters when DMN mode is active, raising appropriate usage errors.
- Modified `create_crew` and `create_json_crew` functions to skip provider prompts and utilize non-interactive defaults in DMN mode.
- Enhanced `run_crew` to bypass TUI and handle runtime inputs directly in DMN mode, ensuring smooth execution without user interaction.
- Added tests to validate DMN mode behavior, including requirements for type and name, and ensuring proper handling of existing folders and missing inputs.
* Enhance crew loading and validation logic
- Updated `crew_loader.py` to pass the project root when loading task and agent definitions, improving the handling of Python references.
- Refactored `json_loader.py` to include additional validation for Python references, ensuring they are resolved within the project root and enforcing depth limits.
- Added tests in `test_crew_loader.py` and `test_json_loader.py` to validate rejection of unsafe Python references and input files outside the project root.
- Improved error handling for JSON project validation, ensuring clearer feedback for invalid configurations.
* Refactor tests for hierarchical verbose manager agent
- Removed `@pytest.mark.vcr()` decorators from `test_hierarchical_verbose_manager_agent` and `test_hierarchical_verbose_false_manager_agent`.
- Introduced mocking for task outputs in both tests to simulate execution without relying on external dependencies.
- Ensured that the `crew.kickoff()` method is called within a context that patches the `Task.execute_sync` method, improving test isolation and reliability.
* Fix JSON loader PR review comments
* Fix JSON loader project root after rebase
* Handle UNC paths in JSON input files
* Enhance JSON crew project handling and validation
- Updated `create_json_crew.py` to specify input files with a brief path.
- Refactored `crew_loader.py` to improve agent and task loading logic, including the introduction of a `build_agent` function and better handling of task classes.
- Enhanced `json_loader.py` with additional validation for agent and task definitions, including support for Python references and conditional tasks.
- Added tests in `test_crew_loader.py` and `test_json_loader.py` to ensure proper loading of agents, tasks, and validation of project structures, including custom types and conditional tasks.
- Improved error handling and validation safety across the project loading process.
* Enhance JSON crew configuration options in create_json_crew.py
- Added optional fields for custom agent subclasses and advanced task options, including condition checks and output specifications.
- Improved documentation comments for better clarity on agent and task configurations.
- Updated JSON crew handling to support additional callbacks for pre- and post-execution processes.
* Enhance JSON crew template tests in test_create_crew.py
- Added assertions for new optional fields in crew and agent templates, including conditional tasks, custom converters, and input file specifications.
- Improved validation checks for manager agents and callback references to ensure proper configuration in JSON crew definitions.
- Expanded documentation references within the tests to provide clearer guidance on the expected structure and usage of crew templates.
* Fix JSON crew PR review issues
* Update crewAI CLI with various enhancements and fixes
- Updated `create_json_crew.py` to require `crewai[tools]>=1.14.7`.
- Enhanced `git.py` with improved repository initialization, including automatic initial commit creation and exclusion patterns for initial commits.
- Modified `install_crew.py` to allow error handling during installation with an optional `raise_on_error` parameter.
- Expanded `plus_api.py` to include methods for creating and updating crews from ZIP files.
- Introduced a new `archive.py` for creating deployable ZIP archives of CrewAI projects, ensuring local artifacts are excluded.
- Updated `run_crew.py` to manage JSON crew dependencies and run crews in the project's environment.
- Enhanced deployment logic in `main.py` to handle ZIP uploads and improve user feedback during deployment processes.
- Added tests for new functionalities and ensured existing tests reflect recent changes in behavior and requirements.
* fix(cli): address deploy zip review feedback
* fix(cli): sync missing lockfile before deploy
* fix(cli): preserve remote deploy on git setup warnings
* test(cli): use single deploy main import style
* fix(cli): skip project install for json crew sync
* fix(cli): load json runner from source checkout
* fix(cli): skip json crew sync when locked
* fix(cli): address deploy zip review feedback
* fix(cli): pass env on zip redeploy
* fix(cli): harden json run and zip fallback
* fix(cli): validate before deploy lock install
* fix(cli): respect poetry lock for json runs
* fix(cli): align json zip wrapper detection
* fix(deps): bump starlette audit floor
* fix(cli): avoid auth retry for deploy exits
* fix(cli): update json zip script entrypoints
* feat(cli): introduce JSON crew project support and TUI enhancements
- Added support for creating and running JSON-defined crew projects, allowing users to scaffold projects with a new `create_json_crew.py` file.
- Implemented a full-screen Textual TUI for crew execution in `crew_run_tui.py`, enhancing user interaction with a two-column layout.
- Updated `run_crew.py` to prioritize JSON crew projects and added daemon mode for running without TUI.
- Introduced interactive pickers in `tui_picker.py` for improved CLI prompts.
- Enhanced validation for JSON crew files in `validate.py` to ensure proper structure and agent definitions.
- Updated `.gitignore` to exclude demo and crewai directories.
* feat: update LLM model references to gpt-5.4-mini
- Changed default LLM model from gpt-4o-mini to gpt-5.4-mini across various files, including CLI options, JSON crew configurations, and agent definitions.
- Enhanced benchmark and human feedback functionalities to utilize the new model.
- Improved user interface elements in the TUI for better interaction and feedback during execution.
- Added support for new skills directory in JSON crew project creation.
* feat(benchmark): add crew-level benchmarking functionality
- Introduced a new `benchmark` command in the CLI for crew-level benchmarking, allowing users to specify agents, models, and timeout settings.
- Implemented `CrewBenchmarkCase` to handle crew-level benchmark cases with inputs and criteria.
- Enhanced the benchmark runner to support progress tracking and detailed reporting of results for multiple models.
- Added tests for loading crew benchmark cases and validating their structure.
- Updated existing benchmark functions to accommodate the new crew-level execution model.
* feat(cli): enhance JSON crew project functionality and TUI improvements
- Added optional agent-level guardrails and advanced options in JSON crew configurations to improve output validation and flexibility.
- Updated the TUI to better handle plan step statuses, including visual indicators for task completion and failure.
- Introduced methods for parsing and managing step observation events, ensuring accurate updates to task statuses during execution.
- Enhanced validation for JSON crew projects, ensuring proper structure and error handling for agent and task definitions.
- Added comprehensive tests for new features and validation logic, ensuring robustness in JSON crew project handling.
* refactor(cli): streamline JSON crew project handling and improve validation
- Refactored JSON crew project loading and validation logic to enhance clarity and maintainability.
- Introduced utility functions for finding JSON crew files, improving code reuse across modules.
- Removed deprecated benchmark functionality and associated tests to simplify the codebase.
- Updated CLI commands to utilize the new JSON project structure, ensuring compatibility with recent changes.
- Enhanced test coverage for JSON crew project features, ensuring robust validation and error handling.
* feat(cli): enhance activity log navigation and focus management
- Added functionality to focus on the activity log when navigating through log entries.
- Implemented refresh logic for the log panel to ensure updates are displayed correctly during navigation.
- Improved keyboard navigation for log entries, allowing users to expand and scroll through logs seamlessly.
- Added tests to verify the correct behavior of log navigation and focus management in the TUI.
* feat(cli): enhance JSON crew project interaction and input handling
- Introduced a new function to enable prompt line editing for better user experience during input prompts.
- Updated the JSON crew project wizards to show interpolation hints for dynamic values, improving user guidance.
- Enhanced the handling of missing input placeholders by prompting users for required values during crew setup.
- Refactored the crew run logic to ensure proper loading and preparation of JSON-defined crews, including runtime input management.
- Added tests to verify the correct behavior of new input handling features and JSON crew project interactions.
* feat(cli): improve crew project input prompts and event handling
- Enhanced the `_prompt_text` function to allow for configurable spacing before prompts, improving user experience during input collection.
- Updated the wizards for agent and task creation to utilize the new prompt configuration, ensuring a more compact and streamlined interaction.
- Introduced new plan step lifecycle events (`PlanStepStartedEvent`, `PlanStepCompletedEvent`) to better track the execution status of plan steps.
- Refactored the step executor to emit these events during the execution of tasks, improving observability and debugging capabilities.
- Added tests to verify the correct behavior of new prompt handling and event emissions during crew project execution.
* fix: refine json-first crew interactions
* fix: prioritize common json crew tools
* fix: make json crew more tools expandable
* fix: show json crew tools by category
* feat(memory): update default embedder to OpenAI text-embedding-3-large and enhance memory compatibility
- Changed the default embedding model for Memory to OpenAI text-embedding-3-large, which uses 3072-dimensional vectors.
- Added warnings regarding compatibility issues with existing local memory stores created with 1536-dimensional embeddings.
- Updated documentation to reflect the new default embedder and its configuration options.
- Enhanced the CLI and codebase to support the new embedding model across various components, ensuring a seamless transition for users.
* fix: address PR review feedback for JSON-first crews
Review blockers:
- Forward trained_agents_file to JSON crews: crewai run -f now exports
CREWAI_TRAINED_AGENTS_FILE for the in-process JSON crew path
- Wizard agent picker: Esc/cancel now reprompts instead of silently
assigning the first agent
- JSON tool resolution hard-fails: unknown tool names, missing custom
tool files, and invalid custom tool modules raise JSONProjectError
with actionable messages instead of warn-and-continue
- Embedding dimension mismatch: LanceDB and Qdrant Edge storages raise
EmbeddingDimensionMismatchError with reset/pin guidance instead of
silently zero-filling vectors or returning empty search results
- Custom tool code execution documented in loader docstring and the
scaffolded project README
CI fixes:
- ruff format across lib/
- All 133 PR-introduced mypy errors fixed (llm.py lazy-litellm and
cli.py lazy command shims now use TYPE_CHECKING imports; textual
is_mounted misuse fixed; pick_many overloads; misc annotations)
Bot review comments:
- Empty except blocks now have explanatory comments or debug logging
- Removed unused _C_BG/_C_PANEL/_C_BORDER globals and redundant
import re; tests use a single import style for create_json_crew
Tests: trained-agents propagation, wizard cancel, tool resolution
failures, and dimension mismatch guidance.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix: address second round of PR review comments
Cursor Bugbot:
- Wizard agent slugs: strip to [a-z0-9_] and fall back to agent_<n> so
symbol-only roles can't produce an empty agents/.jsonc filename
- Wizard task names: dedupe against prior task names and fall back to
task_<n> for symbol-only descriptions
CodeRabbit:
- Agent.message(): import Task explicitly at runtime instead of relying
on the namespace injection done by crewai/__init__
- Async executor: move the native-tools-unsupported fallback from
_ainvoke_loop_react (self-recursion) to _ainvoke_loop_native_tools,
mirroring the sync implementation
- StepExecutor downgrade: keep the in-step conversation and append the
text-tooling instructions instead of rebuilding messages, so completed
native tool calls are not re-executed
- crewai-files: extension-based MIME lookup now runs before byte
sniffing so csv/xml types are not degraded to text/plain
- Memory storages: validate every record in a save() batch against a
consistent embedding dimension (LanceDB previously checked only the
first record); added mixed-batch tests
- _print_post_tui_summary now typed against CrewRunApp
- Docs: Azure OpenAI default embedder change called out in the memory
migration warning and provider table
Code quality bots:
- Removed unused _C_YELLOW/_C_CYAN (crew_run_tui) and _GREEN (tui_picker)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(cli): accordion tool picker in JSON crew wizard
The flat tool list had grown to ~90 rows. The picker now shows:
- Common tools always visible at the top
- Every other category as a single expandable row with tool and
selection counts (e.g. "Search & Research (27 tools, 2 selected)")
- Expanding a category collapses the previously expanded one
- Selections persist across expand/collapse via new preselected
support in pick_many; cursor follows the toggled category row
tui_picker gains preselected + initial_cursor options on pick_many,
and Esc in multi-select now confirms the current selection instead of
discarding it (required so collapsing can't silently drop choices).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* refactor(cli): remove --daemon flag from crewai run
The flag only affected JSON crew projects — classic and flow projects
ignored it entirely, which made the behavior inconsistent. Removed the
option, the daemon code path (_run_json_crew_daemon), and its helper
(_load_json_crew_with_inputs).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* test: update run command tests after --daemon removal
lib/crewai/tests/cli/test_run_crew.py still asserted the old
run_crew(trained_agents_file=..., daemon=False) call signature.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): exit codes, mid-run quit, async statuses, hyphen placeholders
Addresses the latest Bugbot review round:
- Failed JSON crew runs now exit non-zero (SystemExit(1)) so scripts
and CI don't treat failures as success, mirroring the classic path
- Quitting the TUI mid-run now ends the process (os._exit(130));
kickoff runs in a thread worker that cannot be force-cancelled, so
letting the CLI return would leave LLM/tool work burning tokens in
the background
- Sidebar task statuses are now async-safe: completion/failure events
resolve the task's own row via identity instead of assuming the most
recently started task, and starting a task no longer blanket-marks
earlier active rows as done
- The runtime-input prompt regex now accepts hyphenated placeholder
names ({my-topic}), matching kickoff's interpolation pattern
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix: validation safety, custom tool sandboxing, TUI log integrity, memory error surfacing
- Deploy validation no longer executes project code: validation mode
checks tool declarations structurally (well-formed entries, custom
tool file exists) without importing or instantiating anything.
custom:<name> resolution only happens on the actual run path.
- custom:<name> is constrained to [A-Za-z_][A-Za-z0-9_]* and the
resolved path must stay inside the project's tools/ directory, so
custom:../foo or absolute-path names cannot execute code outside it.
Tool paths resolve relative to the crew project root, not cwd.
- TUI task logs are built from per-task state captured at task start
(idx, description, agent, start time); an out-of-order completion
takes its output from the event and no longer steals or resets the
current task's streamed steps/output.
- EmbeddingDimensionMismatchError now inherits ValueError instead of
RuntimeError so background saves surface it through
MemorySaveFailedEvent instead of silently dropping the save; the
shutdown catch in _background_encode_batch is narrowed to the
"cannot schedule new futures" case.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): declared project type wins over crew.json presence
A flow project that also contains a crew.json(c) file now runs and
validates as the flow it declares in pyproject.toml instead of being
hijacked by the JSON crew path. Both crewai run (_has_json_crew) and
deploy validation (_is_json_crew) check tool.crewai.type; a missing or
unreadable pyproject still means a bare JSON crew project.
Also documents why StepObservationFailedEvent intentionally marks the
plan step "done": the event signals an observer failure, not a step
failure, and the executor continues past it.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): type the declared_type locals so mypy stays clean
Comparing an Any-typed .get() chain returns Any, which tripped
no-any-return on the previous commit.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Let users run a Flow from a Flow Definition YAML file or inline string
without writing Python, passing kickoff inputs as `--inputs` JSON. The
flag is gated behind an experimental warning since the definition format
may still change.
A `do:` step can now say `call: tool` and name a CrewAI tool to run,
passing its inputs under `with:`. Before this, a definition could only
point at Python code to run.
```yaml
methods:
search:
start: true
do:
call: tool
ref: crewai_tools:ExaSearchTool
with:
search_query: ai agents
```
* Drive human feedback from the flow definition
@human_feedback previously wrapped methods with the full HITL runtime (feedback
request, outcome collapse, learn loop), so flows built from a YAML definition —
which carry no decorated callables — could not pause for or route on human
feedback.
# Conflicts:
# lib/crewai/src/crewai/flow/persistence/decorators.py
# lib/crewai/src/crewai/flow/runtime/__init__.py
* Address code review comments
* Wire config and persistence from FlowDefinition into the runtime
`from_definition` was silently dropping all config fields; it now passes
`config.model_dump()` so suppress_flow_events, max_method_calls, etc.
actually apply.
Persistence is now engine-driven: `_persist_method_completion` fires
after every method using the definition's persist metadata, so
`@persist` no longer needs to wrap methods — it just stamps them.
* Address code review comments
* feat: aggregate LLM token usage at the flow level
Introduces `flow.usage_metrics`, a snapshot of every LLMCallCompletedEvent
emitted under the flow's `current_flow_id` for the duration of one kickoff
(or resume) call. Aggregation happens on the singleton event bus so it
covers crews, direct `LLM.call`s, and nested listener calls — solving the
mismatch where the SDK reported only the last crew's usage while the
Enterprise UI showed the correct full total.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor: centralize provider key normalization in UsageMetrics
Add UsageMetrics.from_provider_dict to normalize raw LLM usage dicts
across providers (LiteLLM, native Anthropic, native Gemini, OpenAI
nested cached). BaseLLM._track_token_usage_internal and the flow-level
aggregator now share this single source of truth, so `flow.usage_metrics`
agrees with per-LLM totals on every provider — including the native
Anthropic path that emits `input_tokens`/`output_tokens` instead of
`prompt_tokens`/`completion_tokens`.
* fix: flush event bus before reading aggregated usage_metrics
`crewai_event_bus.emit` dispatches LLMCallCompletedEvent handlers on a
ThreadPoolExecutor (fire-and-forget), so a flow whose last LLM call
completes right before kickoff_async/resume_async returns can detach
the usage listener while that handler is still queued, leaving its
tokens off `flow.usage_metrics`. Match `Crew.kickoff()` and call
`crewai_event_bus.flush()` in both finally blocks so every handler
drains before the listener is detached.
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* Read flow dispatch from FlowDefinition
Store the definition in a `_definition` PrivateAttr at post-init and
convert the dispatch helpers (`_start_method_names`, `_listener_methods`,
`_start_condition`, `_listen_condition`, `_is_router`) from classmethods
to instance methods that read it. Event names now fall back to
`self._definition.name` instead of `self.__class__.__name__`.
Behavior is identical for decorator subclasses, but the engine no longer
assumes the definition comes from the class. This is the seam for
`Flow.from_definition`, where an instance runs a definition that was
loaded rather than built from a Python subclass.
* Add Flow.from_definition to run flows without a subclass
A FlowDefinition (e.g. loaded from YAML) was only usable for dispatch on
decorator-authored subclasses. Now each method definition records an
importable `module:qualname` handler ref, and `Flow.from_definition`
resolves and binds those handlers to build a runnable flow directly.
* Build flow state from FlowDefinition
Definition-driven flows previously always started with a bare dict
state.
* Replace handler string with structured FlowActionDefinition
`handler: str | None` was optional and opaque — missing handlers only
surfaced at kickoff time. `do: FlowActionDefinition` is required, so
Pydantic rejects invalid definitions at parse time.
The `call: "code"` discriminator prepares the schema for future
non-Python action types (e.g. MCP tool, crew) without touching
`FlowMethodDefinition`. Resolution logic is extracted to
`runtime/_action_resolvers.py` to keep the dispatch point isolated.
* Fix conversational start router missing required do field
FlowMethodDefinition.do became required when the handler string was
replaced with FlowActionDefinition, but _conversation_start_router still
built its fragment without it, breaking crewai import entirely.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* Add event scoping to flow test
* Change lib/crewai/tests/test_flow_from_definition.py
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
A custom BaseLLM subclass serializes with the inherited llm_type "base",
which the registry maps to the abstract BaseLLM. Restore then crashed on
cls(**value). Rebuild a concrete LLM from the saved config when the
resolved class is abstract.
Checkpoint serialization stamps checkpoint_completed_methods onto every live
Flow in RuntimeState.root, including the agent executor reused across a crew's
tasks. kickoff_async read that stamp as a restore signal, so the second task
replayed the first task's completed methods and never reached a final answer.
Gate is_restoring on _restored_from_checkpoint, set only by
_restore_from_checkpoint, and consume it single-shot.
flow.plot defaults to show=True, which calls webbrowser.open on every run.
The test only asserts FlowPlotEvent is emitted, so disable the browser open.