* Require explicit CrewAI project definitions
JSON crews and declarative flows now resolve from `[tool.crewai]`
metadata instead of implicit filename discovery. This makes project type
selection deterministic, prevents stray `crew.json(c)` files from changing
CLI behavior, and centralizes definition path validation for run, install,
deploy validation, plotting, and memory reset paths.
`[tool.crewai].definition` must be a project-local file path. Absolute
paths, `~`, missing files, directories, and paths escaping the project root
are rejected so deploy and runtime commands use the same contract.
Breaking changes and migration paths:
* JSON crew projects are no longer discovered from `crew.json` or
`crew.jsonc` alone. Add explicit metadata:
```toml
[tool.crewai]
type = "crew"
definition = "crew.jsonc"
```
* Declarative flow projects must use a valid project-local definition path:
```toml
[tool.crewai]
type = "flow"
definition = "flows/research.yaml"
```
* `Flow.from_definition(definition)` is removed. Use:
```python
Flow.from_declaration(contents=definition)
```
* `FlowDefinition.to_json()` and `FlowDefinition.to_yaml()` are removed.
Use `FlowDefinition.to_dict()` and serialize with the caller's JSON or
YAML library.
* `FlowDefinition.from_dict()` is removed. Use:
```python
FlowDefinition.from_declaration(contents=data)
```
* `FlowDefinition.json_schema()` is removed. Use Pydantic's schema API only
where schema generation is intentionally needed:
```python
FlowDefinition.model_json_schema(by_alias=True)
```
* `crewai_cli.run_crew.find_crew_json_file()` and `_has_json_crew()` are
removed. Use `configured_project_json_crew()` or the shared
`crewai_core.project.configured_project_definition("crew")` helper.
* `crewai reset-memories` now only loads JSON crews declared through
`[tool.crewai].definition`, and invalid declared JSON crew definitions
fail instead of silently falling back to classic crew discovery.
* Address code review comments
* Track conversational flow turn usage in telemetry
* adjusted name to flow:conversation_turn
* only mark on turn completed event
* ensure tui also emits these events
`StateProxy` looked like a thread-safety boundary, but it only protected
a small slice of state operations. Some examples of operations that were
not covered:
- `self.state.counter += 1`, `self.state["counter"] += 1` (increments)
- `self.state.user.profile.score += 1` (nested object mutations)
- `self.state.config["limits"]["max"] = 10` (mutation through model fields)
- `self.state.items[0].status = "done"` (list/container mutations)
This commit decided to remove it completely for simplicity and
performance:
- Simpler runtime code
- attr read: 24x faster, attr write: 27x faster, list append: 19x faster (local benchmark)
- Clearer concurrency contract (lifecycle locks remain, but arbitrary
shared state mutation is not presented as thread-safe)
Declarative flows already used `module:qualname` refs for runtime
objects, but crew JSON tools still had their own lookup path. That meant
examples like `project_tools:LookupTool` were treated as named
`crewai_tools` lookups and failed with guidance that only mentioned
`SerperDevTool` or `custom:<name>`. Invalid refs such as
`not_tools:NotATool` also missed the same BaseTool validation used by
flow tool actions.
Move ref resolution into a shared declarative helper, use it from flow
tool actions and crew JSON loading, and require tool refs to resolve to
`BaseTool` classes before instantiation. Validation still checks tool
refs structurally, so validating a crew does not import or execute
project code.
Allow required JSON schema state fields to be supplied by kickoff inputs
instead of requiring every field to exist in state.default before
runtime.
Example: a flow with required lead_name and no state.default can now run
with kickoff inputs={"lead_name": "Ada Lovelace"}.
* Fix symlink path traversal in skill archive extraction
`_safe_extractall` (the Python < 3.12 fallback used by `crewai skills`
archive unpacking) validated each member's *name* against the destination
but never validated symlink/hardlink *targets*. A malicious skill tarball
could plant a symlink escaping the destination (e.g. `link -> /home/user/.ssh`)
followed by a regular member written through it (`link/authorized_keys`),
escaping `dest` even though every member name resolves inside it — the
classic symlink-extraction traversal.
The 3.12+ path (`extractall(..., filter="data")`) already blocks this; the
fallback now mirrors it by rejecting absolute link targets and any link
target that resolves outside the destination directory.
Adds regression tests covering absolute and relative escaping symlinks plus
benign in-tree symlinks and ordinary archives.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* Harden skill cache archive extraction
* Reject special skill archive members
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Add a single declaration loader shared by API and CLI callers.
- Add FlowDefinition.from_declaration for FlowDefinition instances, dictionaries, YAML/JSON strings, and file paths
- Add Flow.from_declaration to build runnable flows directly from the same inputs
- Route declarative flow CLI loading through Flow.from_declaration so path handling and validation stay centralized
```
# Load just the serializable definition when you do not need to run it yet.
definition = FlowDefinition.from_declaration(path="flows/research.crewai")
definition = FlowDefinition.from_declaration(contents=flow_yaml)
definition = FlowDefinition.from_declaration(contents=flow_dict)
# Build a runnable flow directly from the same declaration inputs.
flow = Flow.from_declaration(path="flows/research.crewai")
flow = Flow.from_declaration(contents=flow_yaml)
flow = Flow.from_declaration(contents=flow_dict)
flow = Flow.from_declaration(contents=definition)
# Run it like any other flow.
result = flow.kickoff(inputs={"topic": "AI agents"})
# The CLI now goes through the same path-based loader.
# crewai run --definition flows/research.crewai
```
Inline crews default to `verbose=False`. They set the shared formatter's
`verbose` value in `lib/crewai/src/crewai/crew.py`, which could hide
flow method status from `lib/crewai/src/crewai/events/utils/
console_formatter.py`.
Remove that `verbose` check for flow method status. Flow output is still
controlled by `suppress_flow_events`.
Normal quiet crews are unchanged because crew, task, and agent logs
still use their own `verbose` checks.
* Add declarative Flow CLI support
Currently, declarative flows can be loaded by the runtime, but the CLI
still treats them as an experimental definition file instead of a
first-class Flow project shape.
With this PR, `crewai create flow --declarative` scaffolds a YAML-backed
Flow project, and `crewai run`, `crewai flow kickoff`, and `crewai flow
plot` can run against the configured definition.
This also lets crew actions reference reusable crew definition files or
folders and override their inputs from the Flow definition, so
declarative flows can compose existing declarative crews without
inlining everything.
* Address code review comments
This commit fixes a bug where a router method could not be the start
method of a flow.
This is useful when you want to route against the initial state, or even
stack two routers.
Currently, tools have a strong input contract through `args_schema`, but no
output contract. This means that anything a tool outputs is converted to
string.
Not only the contract is weak, but the "invisible" conversion to string can
have unexpected effects when the tool returns complex objects like dicts and
arrays.
With this PR, a tool can _optionally_ define an output contract with
`output_schema`. CrewAI validates the raw result and sends the agent JSON.
```python
class ProductResult(BaseModel):
sku: str
name: str
in_stock: bool
class ProductLookupTool(BaseTool):
name: str = "Product Lookup"
description: str = "Look up product availability by SKU."
def _run(self, sku: str) -> ProductResult:
return ProductResult(sku=sku, name="USB-C dock", in_stock=True)
```
If the result does not match the schema, CrewAI warns and falls back to
`str(raw_result)` instead of failing the run:
```python
@tool("Product Lookup", output_schema=ProductResult)
def product_lookup(sku: str) -> dict[str, object]:
return {"sku": sku, "name": "USB-C dock", "in_stock": True}
#=> RuntimeWarning: Failed to validate or serialize output from tool 'Bad Product Lookup' using output_schema 'ProductResult'... Falling back to str(raw_result).
```
This is additive and non-breaking. Existing tools do not need to change. Tools
without `output_schema` keep the old string behavior. Invalid typed outputs
warn and fall back to the old formatting path.
* Add single agent action to Flow definitions
Lets a flow method build and run a single CrewAI agent directly, without
wrapping it in a crew. Same idea as the existing `crew` action, but for
one agent.
methods:
answer:
do:
call: agent
with:
role: Analyst
goal: Answer questions
backstory: Knows things.
input: "${state.question}"
start: true
* `input` is required and interpolated from flow state, like
`${state.question}` or `${item}` inside an `each` loop
* optional `response_format` points at a Pydantic model (`{"python":
"models.AnswerModel"}`) to get structured output
* `input` must be a string and its CEL is validated at load time, so bad
expressions like `${state.}` fail early
* Simplify test code
* Validate flow CEL expressions at definition load time
Promote CEL expression handling to a public Expression API and validate expressions when a FlowDefinition is built instead of when it executes.
Invalid CEL syntax or unknown roots now raise ValidationError from FlowDefinition.from_yaml() and FlowDefinition.from_dict(). Expressions may reference state and outputs, plus item inside each.do; bare identifiers are rejected as unknown roots.
For with values, the CEL contract is intentionally simple: after trimming whitespace, a string is evaluated as CEL only if it starts with ${ and ends with }. Anything else is treated as a literal value, so partial interpolation is not supported. If the content inside the wrapper is not valid CEL, validation fails.
Examples:
```text
"${state.topic}" -> evaluated, returns state.topic
"topic is ${state.topic}" -> literal string
"${state.topic} suffix" -> literal string
"${'a'}${'b'}" -> invalid CEL
```
* Honor explicit empty-context overrides in evaluate() / render_template()
* Use explicit name/action shape for each.do steps
* Add optional `if` expression to `each.do` steps
Lets a step inside an `each` action run conditionally based on a CEL
expression evaluated against `item` and prior step `outputs`.
* feat: update pyproject.toml to specify wheel targets
Added a new section to the pyproject.toml file to include only specific files in the wheel build, enhancing the packaging process. Updated tests to verify the inclusion of these targets.
* feat: add memory save event handling to activity log
Implemented event handlers for MemorySaveStartedEvent, MemorySaveCompletedEvent, and MemorySaveFailedEvent in the crew_run_tui module. This allows the application to log memory save operations, capturing their status and details in the activity log. Added corresponding tests to verify the correct logging behavior for successful and failed memory saves.
* feat: enhance memory save event handling in activity log
Added functionality to suppress nested memory save events and updated the handling of MemorySaveStartedEvent, MemorySaveCompletedEvent, and MemorySaveFailedEvent to improve logging accuracy. Introduced new tests to verify the correct behavior of memory save events, including scenarios for nested events and completion updates for timed-out entries.
* Fix memory save activity log handling
* Normalize alpha package versions
* Update scaffolded crew dependency
* feat: add button to copy setup instructions for CrewAI coding agents
Introduced a button in the documentation that allows users to easily copy setup instructions for CrewAI coding agents. The instructions include installation steps, environment setup, and best practices for using the CrewAI CLI. This enhancement aims to streamline the onboarding process for new users.
* Improve missing CrewAI install guidance
* fix: address pr review feedback
* fix: avoid mismatched memory save rows
* fix: wait for queued memory save events
* fix: avoid matching memory saves on missing ids
* chore: normalize prerelease version to 1.14.8a1
Add a description and examples to every FlowDefinition field and
standardize on `typing.Literal`, so the generated JSON schema documents
itself — each action discriminator, state branch, and config option
explains what it is and shows a realistic value.
Examples live on individual fields only, never at the model level, which
keeps the schema readable for tooling that renders field-level help.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Add script/code blocks to FlowDefinition
Let a Flow method run trusted inline Python with `call: script`. The code
is compiled once into a generated function and receives the runtime
values as arguments.
```yaml
methods:
normalize:
start: true
do:
call: script
code: |
import math
state["rounded"] = math.ceil(state["raw_score"])
return f"rounded:{state['rounded']}"
```
Even though this shares the same surface of tools (custom code), I
decided to make it opt-in for now, using
`CREWAI_ALLOW_FLOW_SCRIPT_EXECUTION=1`.
* Address code review comments
* Enhance memory reset functionality and JSON crew handling
- Added `reset_all` method to the `Memory` class to reset the entire memory store, ignoring `root_scope`.
- Updated the `Crew` class to utilize `reset_all` when resetting memory.
- Enhanced the `_reset_flow_memory` function to check for `Memory` instances and call `reset_all` accordingly.
- Introduced helper functions to load JSON crew configurations and handle project declarations, improving the reset command's flexibility.
- Added tests to validate the new JSON crew memory reset behavior and ensure proper handling of declared flow projects.
* Fix memory reset review issues
* Bump litellm for security advisory
Replace the single FlowStateDefinition model with a `type`-discriminated
union of FlowDictStateDefinition, FlowPydanticStateDefinition,
FlowJsonSchemaStateDefinition, and FlowUnknownStateDefinition.
Each branch only carries the fields it actually uses and forbids extras,
so an invalid combination like a `dict` state with a `ref` now fails
validation instead of being silently accepted. The runtime reads `ref`
and `json_schema` defensively since they no longer exist on every branch.
```yaml
state:
type: json_schema
json_schema:
type: object
properties:
topic:
type: string
```
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Enhance crew loading and validation logic
- Updated `crew_loader.py` to pass the project root when loading task and agent definitions, improving the handling of Python references.
- Refactored `json_loader.py` to include additional validation for Python references, ensuring they are resolved within the project root and enforcing depth limits.
- Added tests in `test_crew_loader.py` and `test_json_loader.py` to validate rejection of unsafe Python references and input files outside the project root.
- Improved error handling for JSON project validation, ensuring clearer feedback for invalid configurations.
* Refactor tests for hierarchical verbose manager agent
- Removed `@pytest.mark.vcr()` decorators from `test_hierarchical_verbose_manager_agent` and `test_hierarchical_verbose_false_manager_agent`.
- Introduced mocking for task outputs in both tests to simulate execution without relying on external dependencies.
- Ensured that the `crew.kickoff()` method is called within a context that patches the `Task.execute_sync` method, improving test isolation and reliability.
* Fix JSON loader PR review comments
* Fix JSON loader project root after rebase
* Handle UNC paths in JSON input files
* Enhance JSON crew project handling and validation
- Updated `create_json_crew.py` to specify input files with a brief path.
- Refactored `crew_loader.py` to improve agent and task loading logic, including the introduction of a `build_agent` function and better handling of task classes.
- Enhanced `json_loader.py` with additional validation for agent and task definitions, including support for Python references and conditional tasks.
- Added tests in `test_crew_loader.py` and `test_json_loader.py` to ensure proper loading of agents, tasks, and validation of project structures, including custom types and conditional tasks.
- Improved error handling and validation safety across the project loading process.
* Enhance JSON crew configuration options in create_json_crew.py
- Added optional fields for custom agent subclasses and advanced task options, including condition checks and output specifications.
- Improved documentation comments for better clarity on agent and task configurations.
- Updated JSON crew handling to support additional callbacks for pre- and post-execution processes.
* Enhance JSON crew template tests in test_create_crew.py
- Added assertions for new optional fields in crew and agent templates, including conditional tasks, custom converters, and input file specifications.
- Improved validation checks for manager agents and callback references to ensure proper configuration in JSON crew definitions.
- Expanded documentation references within the tests to provide clearer guidance on the expected structure and usage of crew templates.
* Fix JSON crew PR review issues
* Update crewAI CLI with various enhancements and fixes
- Updated `create_json_crew.py` to require `crewai[tools]>=1.14.7`.
- Enhanced `git.py` with improved repository initialization, including automatic initial commit creation and exclusion patterns for initial commits.
- Modified `install_crew.py` to allow error handling during installation with an optional `raise_on_error` parameter.
- Expanded `plus_api.py` to include methods for creating and updating crews from ZIP files.
- Introduced a new `archive.py` for creating deployable ZIP archives of CrewAI projects, ensuring local artifacts are excluded.
- Updated `run_crew.py` to manage JSON crew dependencies and run crews in the project's environment.
- Enhanced deployment logic in `main.py` to handle ZIP uploads and improve user feedback during deployment processes.
- Added tests for new functionalities and ensured existing tests reflect recent changes in behavior and requirements.
* fix(cli): address deploy zip review feedback
* fix(cli): sync missing lockfile before deploy
* fix(cli): preserve remote deploy on git setup warnings
* test(cli): use single deploy main import style
* fix(cli): skip project install for json crew sync
* fix(cli): load json runner from source checkout
* fix(cli): skip json crew sync when locked
* fix(cli): address deploy zip review feedback
* fix(cli): pass env on zip redeploy
* fix(cli): harden json run and zip fallback
* fix(cli): validate before deploy lock install
* fix(cli): respect poetry lock for json runs
* fix(cli): align json zip wrapper detection
* fix(deps): bump starlette audit floor
* fix(cli): avoid auth retry for deploy exits
* fix(cli): update json zip script entrypoints
* feat(cli): introduce JSON crew project support and TUI enhancements
- Added support for creating and running JSON-defined crew projects, allowing users to scaffold projects with a new `create_json_crew.py` file.
- Implemented a full-screen Textual TUI for crew execution in `crew_run_tui.py`, enhancing user interaction with a two-column layout.
- Updated `run_crew.py` to prioritize JSON crew projects and added daemon mode for running without TUI.
- Introduced interactive pickers in `tui_picker.py` for improved CLI prompts.
- Enhanced validation for JSON crew files in `validate.py` to ensure proper structure and agent definitions.
- Updated `.gitignore` to exclude demo and crewai directories.
* feat: update LLM model references to gpt-5.4-mini
- Changed default LLM model from gpt-4o-mini to gpt-5.4-mini across various files, including CLI options, JSON crew configurations, and agent definitions.
- Enhanced benchmark and human feedback functionalities to utilize the new model.
- Improved user interface elements in the TUI for better interaction and feedback during execution.
- Added support for new skills directory in JSON crew project creation.
* feat(benchmark): add crew-level benchmarking functionality
- Introduced a new `benchmark` command in the CLI for crew-level benchmarking, allowing users to specify agents, models, and timeout settings.
- Implemented `CrewBenchmarkCase` to handle crew-level benchmark cases with inputs and criteria.
- Enhanced the benchmark runner to support progress tracking and detailed reporting of results for multiple models.
- Added tests for loading crew benchmark cases and validating their structure.
- Updated existing benchmark functions to accommodate the new crew-level execution model.
* feat(cli): enhance JSON crew project functionality and TUI improvements
- Added optional agent-level guardrails and advanced options in JSON crew configurations to improve output validation and flexibility.
- Updated the TUI to better handle plan step statuses, including visual indicators for task completion and failure.
- Introduced methods for parsing and managing step observation events, ensuring accurate updates to task statuses during execution.
- Enhanced validation for JSON crew projects, ensuring proper structure and error handling for agent and task definitions.
- Added comprehensive tests for new features and validation logic, ensuring robustness in JSON crew project handling.
* refactor(cli): streamline JSON crew project handling and improve validation
- Refactored JSON crew project loading and validation logic to enhance clarity and maintainability.
- Introduced utility functions for finding JSON crew files, improving code reuse across modules.
- Removed deprecated benchmark functionality and associated tests to simplify the codebase.
- Updated CLI commands to utilize the new JSON project structure, ensuring compatibility with recent changes.
- Enhanced test coverage for JSON crew project features, ensuring robust validation and error handling.
* feat(cli): enhance activity log navigation and focus management
- Added functionality to focus on the activity log when navigating through log entries.
- Implemented refresh logic for the log panel to ensure updates are displayed correctly during navigation.
- Improved keyboard navigation for log entries, allowing users to expand and scroll through logs seamlessly.
- Added tests to verify the correct behavior of log navigation and focus management in the TUI.
* feat(cli): enhance JSON crew project interaction and input handling
- Introduced a new function to enable prompt line editing for better user experience during input prompts.
- Updated the JSON crew project wizards to show interpolation hints for dynamic values, improving user guidance.
- Enhanced the handling of missing input placeholders by prompting users for required values during crew setup.
- Refactored the crew run logic to ensure proper loading and preparation of JSON-defined crews, including runtime input management.
- Added tests to verify the correct behavior of new input handling features and JSON crew project interactions.
* feat(cli): improve crew project input prompts and event handling
- Enhanced the `_prompt_text` function to allow for configurable spacing before prompts, improving user experience during input collection.
- Updated the wizards for agent and task creation to utilize the new prompt configuration, ensuring a more compact and streamlined interaction.
- Introduced new plan step lifecycle events (`PlanStepStartedEvent`, `PlanStepCompletedEvent`) to better track the execution status of plan steps.
- Refactored the step executor to emit these events during the execution of tasks, improving observability and debugging capabilities.
- Added tests to verify the correct behavior of new prompt handling and event emissions during crew project execution.
* fix: refine json-first crew interactions
* fix: prioritize common json crew tools
* fix: make json crew more tools expandable
* fix: show json crew tools by category
* feat(memory): update default embedder to OpenAI text-embedding-3-large and enhance memory compatibility
- Changed the default embedding model for Memory to OpenAI text-embedding-3-large, which uses 3072-dimensional vectors.
- Added warnings regarding compatibility issues with existing local memory stores created with 1536-dimensional embeddings.
- Updated documentation to reflect the new default embedder and its configuration options.
- Enhanced the CLI and codebase to support the new embedding model across various components, ensuring a seamless transition for users.
* fix: address PR review feedback for JSON-first crews
Review blockers:
- Forward trained_agents_file to JSON crews: crewai run -f now exports
CREWAI_TRAINED_AGENTS_FILE for the in-process JSON crew path
- Wizard agent picker: Esc/cancel now reprompts instead of silently
assigning the first agent
- JSON tool resolution hard-fails: unknown tool names, missing custom
tool files, and invalid custom tool modules raise JSONProjectError
with actionable messages instead of warn-and-continue
- Embedding dimension mismatch: LanceDB and Qdrant Edge storages raise
EmbeddingDimensionMismatchError with reset/pin guidance instead of
silently zero-filling vectors or returning empty search results
- Custom tool code execution documented in loader docstring and the
scaffolded project README
CI fixes:
- ruff format across lib/
- All 133 PR-introduced mypy errors fixed (llm.py lazy-litellm and
cli.py lazy command shims now use TYPE_CHECKING imports; textual
is_mounted misuse fixed; pick_many overloads; misc annotations)
Bot review comments:
- Empty except blocks now have explanatory comments or debug logging
- Removed unused _C_BG/_C_PANEL/_C_BORDER globals and redundant
import re; tests use a single import style for create_json_crew
Tests: trained-agents propagation, wizard cancel, tool resolution
failures, and dimension mismatch guidance.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix: address second round of PR review comments
Cursor Bugbot:
- Wizard agent slugs: strip to [a-z0-9_] and fall back to agent_<n> so
symbol-only roles can't produce an empty agents/.jsonc filename
- Wizard task names: dedupe against prior task names and fall back to
task_<n> for symbol-only descriptions
CodeRabbit:
- Agent.message(): import Task explicitly at runtime instead of relying
on the namespace injection done by crewai/__init__
- Async executor: move the native-tools-unsupported fallback from
_ainvoke_loop_react (self-recursion) to _ainvoke_loop_native_tools,
mirroring the sync implementation
- StepExecutor downgrade: keep the in-step conversation and append the
text-tooling instructions instead of rebuilding messages, so completed
native tool calls are not re-executed
- crewai-files: extension-based MIME lookup now runs before byte
sniffing so csv/xml types are not degraded to text/plain
- Memory storages: validate every record in a save() batch against a
consistent embedding dimension (LanceDB previously checked only the
first record); added mixed-batch tests
- _print_post_tui_summary now typed against CrewRunApp
- Docs: Azure OpenAI default embedder change called out in the memory
migration warning and provider table
Code quality bots:
- Removed unused _C_YELLOW/_C_CYAN (crew_run_tui) and _GREEN (tui_picker)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(cli): accordion tool picker in JSON crew wizard
The flat tool list had grown to ~90 rows. The picker now shows:
- Common tools always visible at the top
- Every other category as a single expandable row with tool and
selection counts (e.g. "Search & Research (27 tools, 2 selected)")
- Expanding a category collapses the previously expanded one
- Selections persist across expand/collapse via new preselected
support in pick_many; cursor follows the toggled category row
tui_picker gains preselected + initial_cursor options on pick_many,
and Esc in multi-select now confirms the current selection instead of
discarding it (required so collapsing can't silently drop choices).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* refactor(cli): remove --daemon flag from crewai run
The flag only affected JSON crew projects — classic and flow projects
ignored it entirely, which made the behavior inconsistent. Removed the
option, the daemon code path (_run_json_crew_daemon), and its helper
(_load_json_crew_with_inputs).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* test: update run command tests after --daemon removal
lib/crewai/tests/cli/test_run_crew.py still asserted the old
run_crew(trained_agents_file=..., daemon=False) call signature.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): exit codes, mid-run quit, async statuses, hyphen placeholders
Addresses the latest Bugbot review round:
- Failed JSON crew runs now exit non-zero (SystemExit(1)) so scripts
and CI don't treat failures as success, mirroring the classic path
- Quitting the TUI mid-run now ends the process (os._exit(130));
kickoff runs in a thread worker that cannot be force-cancelled, so
letting the CLI return would leave LLM/tool work burning tokens in
the background
- Sidebar task statuses are now async-safe: completion/failure events
resolve the task's own row via identity instead of assuming the most
recently started task, and starting a task no longer blanket-marks
earlier active rows as done
- The runtime-input prompt regex now accepts hyphenated placeholder
names ({my-topic}), matching kickoff's interpolation pattern
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix: validation safety, custom tool sandboxing, TUI log integrity, memory error surfacing
- Deploy validation no longer executes project code: validation mode
checks tool declarations structurally (well-formed entries, custom
tool file exists) without importing or instantiating anything.
custom:<name> resolution only happens on the actual run path.
- custom:<name> is constrained to [A-Za-z_][A-Za-z0-9_]* and the
resolved path must stay inside the project's tools/ directory, so
custom:../foo or absolute-path names cannot execute code outside it.
Tool paths resolve relative to the crew project root, not cwd.
- TUI task logs are built from per-task state captured at task start
(idx, description, agent, start time); an out-of-order completion
takes its output from the event and no longer steals or resets the
current task's streamed steps/output.
- EmbeddingDimensionMismatchError now inherits ValueError instead of
RuntimeError so background saves surface it through
MemorySaveFailedEvent instead of silently dropping the save; the
shutdown catch in _background_encode_batch is narrowed to the
"cannot schedule new futures" case.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): declared project type wins over crew.json presence
A flow project that also contains a crew.json(c) file now runs and
validates as the flow it declares in pyproject.toml instead of being
hijacked by the JSON crew path. Both crewai run (_has_json_crew) and
deploy validation (_is_json_crew) check tool.crewai.type; a missing or
unreadable pyproject still means a bare JSON crew project.
Also documents why StepObservationFailedEvent intentionally marks the
plan step "done": the event signals an observer failure, not a step
failure, and the executor continues past it.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(cli): type the declared_type locals so mypy stays clean
Comparing an Any-typed .get() chain returns Any, which tripped
no-any-return on the previous commit.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
A `do:` step can now say `call: tool` and name a CrewAI tool to run,
passing its inputs under `with:`. Before this, a definition could only
point at Python code to run.
```yaml
methods:
search:
start: true
do:
call: tool
ref: crewai_tools:ExaSearchTool
with:
search_query: ai agents
```
* Drive human feedback from the flow definition
@human_feedback previously wrapped methods with the full HITL runtime (feedback
request, outcome collapse, learn loop), so flows built from a YAML definition —
which carry no decorated callables — could not pause for or route on human
feedback.
# Conflicts:
# lib/crewai/src/crewai/flow/persistence/decorators.py
# lib/crewai/src/crewai/flow/runtime/__init__.py
* Address code review comments
* Wire config and persistence from FlowDefinition into the runtime
`from_definition` was silently dropping all config fields; it now passes
`config.model_dump()` so suppress_flow_events, max_method_calls, etc.
actually apply.
Persistence is now engine-driven: `_persist_method_completion` fires
after every method using the definition's persist metadata, so
`@persist` no longer needs to wrap methods — it just stamps them.
* Address code review comments
* feat: aggregate LLM token usage at the flow level
Introduces `flow.usage_metrics`, a snapshot of every LLMCallCompletedEvent
emitted under the flow's `current_flow_id` for the duration of one kickoff
(or resume) call. Aggregation happens on the singleton event bus so it
covers crews, direct `LLM.call`s, and nested listener calls — solving the
mismatch where the SDK reported only the last crew's usage while the
Enterprise UI showed the correct full total.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor: centralize provider key normalization in UsageMetrics
Add UsageMetrics.from_provider_dict to normalize raw LLM usage dicts
across providers (LiteLLM, native Anthropic, native Gemini, OpenAI
nested cached). BaseLLM._track_token_usage_internal and the flow-level
aggregator now share this single source of truth, so `flow.usage_metrics`
agrees with per-LLM totals on every provider — including the native
Anthropic path that emits `input_tokens`/`output_tokens` instead of
`prompt_tokens`/`completion_tokens`.
* fix: flush event bus before reading aggregated usage_metrics
`crewai_event_bus.emit` dispatches LLMCallCompletedEvent handlers on a
ThreadPoolExecutor (fire-and-forget), so a flow whose last LLM call
completes right before kickoff_async/resume_async returns can detach
the usage listener while that handler is still queued, leaving its
tokens off `flow.usage_metrics`. Match `Crew.kickoff()` and call
`crewai_event_bus.flush()` in both finally blocks so every handler
drains before the listener is detached.
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* Read flow dispatch from FlowDefinition
Store the definition in a `_definition` PrivateAttr at post-init and
convert the dispatch helpers (`_start_method_names`, `_listener_methods`,
`_start_condition`, `_listen_condition`, `_is_router`) from classmethods
to instance methods that read it. Event names now fall back to
`self._definition.name` instead of `self.__class__.__name__`.
Behavior is identical for decorator subclasses, but the engine no longer
assumes the definition comes from the class. This is the seam for
`Flow.from_definition`, where an instance runs a definition that was
loaded rather than built from a Python subclass.
* Add Flow.from_definition to run flows without a subclass
A FlowDefinition (e.g. loaded from YAML) was only usable for dispatch on
decorator-authored subclasses. Now each method definition records an
importable `module:qualname` handler ref, and `Flow.from_definition`
resolves and binds those handlers to build a runnable flow directly.
* Build flow state from FlowDefinition
Definition-driven flows previously always started with a bare dict
state.
* Replace handler string with structured FlowActionDefinition
`handler: str | None` was optional and opaque — missing handlers only
surfaced at kickoff time. `do: FlowActionDefinition` is required, so
Pydantic rejects invalid definitions at parse time.
The `call: "code"` discriminator prepares the schema for future
non-Python action types (e.g. MCP tool, crew) without touching
`FlowMethodDefinition`. Resolution logic is extracted to
`runtime/_action_resolvers.py` to keep the dispatch point isolated.
* Fix conversational start router missing required do field
FlowMethodDefinition.do became required when the handler string was
replaced with FlowActionDefinition, but _conversation_start_router still
built its fragment without it, breaking crewai import entirely.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* Add event scoping to flow test
* Change lib/crewai/tests/test_flow_from_definition.py
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
A custom BaseLLM subclass serializes with the inherited llm_type "base",
which the registry maps to the abstract BaseLLM. Restore then crashed on
cls(**value). Rebuild a concrete LLM from the saved config when the
resolved class is abstract.
Checkpoint serialization stamps checkpoint_completed_methods onto every live
Flow in RuntimeState.root, including the agent executor reused across a crew's
tasks. kickoff_async read that stamp as a restore signal, so the second task
replayed the first task's completed methods and never reached a final answer.
Gate is_restoring on _restored_from_checkpoint, set only by
_restore_from_checkpoint, and consume it single-shot.
flow.plot defaults to show=True, which calls webbrowser.open on every run.
The test only asserts FlowPlotEvent is emitted, so disable the browser open.
* improve one less route
* flows in flows, new agent executor causing early trace batch finalization
* addressing comments
* addressing comments pt2
* lint and typecheck fix
* decouple convo logic from runtime and added a conversational_definition
* type check fix
* always defer traces for convo and so fix tests to reflect that
Re-evaluate the whole `@listen`/`@router` condition tree against the set
of events seen so far, instead of tracking which AND sub-branches remain
pending.
Net effect:
* Fixes a regression where `or_()` short-circuited at the first
satisfied branch, leaving a sibling `and_()` half-complete so a later
trigger could spuriously re-fire the listener
* Removes the fragile per-branch pending state and `id()`-based keys
* Shrinks the evaluator to one readable predicate
* Migrate @listen/@router runtime to read from FlowDefinition
The runtime now resolves listener conditions, router status, and emit
values from `FlowMethodDefinition` instead of legacy method metadata and
the `_listeners`/`_routers`/`_router_emit` registries.
* Evaluate AND/OR listener conditions over the definition shape via
`_evaluate_definition_condition`
* Drop the class registries and the `FlowMeta` extraction that built
them; stop stamping `__trigger_methods__`, `__is_router__`,
`__router_emit__`, and friends
* `@human_feedback` emit now lives only on its config
* Simplify conditionals DSL
Add opt-in extension seams so an application can route memory, knowledge,
RAG, and flow persistence through a custom backend without subclassing or
threading an explicit instance through every construction site -- mirroring
the existing crewai_core.lock_store.set_lock_backend seam.
- memory: crewai.memory.storage.factory.set_memory_storage_factory
- knowledge: crewai.knowledge.storage.factory.set_knowledge_storage_factory
- rag: crewai.rag.factory.register_rag_client_factory (provider registry)
- flow: crewai.flow.persistence.factory.set_flow_persistence_factory
Each construction site consults the registered factory and falls back to the
built-in default when none is set; an explicit instance always wins. Widen
Knowledge.storage and the knowledge source base classes to BaseKnowledgeStorage
(consistent with BaseAgent.knowledge_storage) so any base-interface backend
plugs in. Runtime-free tests cover each seam.
* Remove `_start_methods` and `__is_start_method__` stamping
* Add helpers to read start info from the definition
* Scan `__dict__` instead of `dir()` to find flow methods