crewAI

mirror of https://github.com/crewAIInc/crewAI.git synced 2026-07-02 13:48:09 +00:00

Author	SHA1	Message	Date
Joao Moura	b11132d7ee	Update crewAI CLI with various enhancements and fixes - Updated `create_json_crew.py` to require `crewai[tools]>=1.14.7`. - Enhanced `git.py` with improved repository initialization, including automatic initial commit creation and exclusion patterns for initial commits. - Modified `install_crew.py` to allow error handling during installation with an optional `raise_on_error` parameter. - Expanded `plus_api.py` to include methods for creating and updating crews from ZIP files. - Introduced a new `archive.py` for creating deployable ZIP archives of CrewAI projects, ensuring local artifacts are excluded. - Updated `run_crew.py` to manage JSON crew dependencies and run crews in the project's environment. - Enhanced deployment logic in `main.py` to handle ZIP uploads and improve user feedback during deployment processes. - Added tests for new functionalities and ensured existing tests reflect recent changes in behavior and requirements.	2026-06-15 12:05:26 -07:00
Lorenze Jay	a5cc6f6d0e	Add crewai_version to flow execution telemetry (#6167)	2026-06-15 09:34:01 -07:00
João Moura	bb477f8a91	JSON first crews (#6131) * feat(cli): introduce JSON crew project support and TUI enhancements - Added support for creating and running JSON-defined crew projects, allowing users to scaffold projects with a new `create_json_crew.py` file. - Implemented a full-screen Textual TUI for crew execution in `crew_run_tui.py`, enhancing user interaction with a two-column layout. - Updated `run_crew.py` to prioritize JSON crew projects and added daemon mode for running without TUI. - Introduced interactive pickers in `tui_picker.py` for improved CLI prompts. - Enhanced validation for JSON crew files in `validate.py` to ensure proper structure and agent definitions. - Updated `.gitignore` to exclude demo and crewai directories. * feat: update LLM model references to gpt-5.4-mini - Changed default LLM model from gpt-4o-mini to gpt-5.4-mini across various files, including CLI options, JSON crew configurations, and agent definitions. - Enhanced benchmark and human feedback functionalities to utilize the new model. - Improved user interface elements in the TUI for better interaction and feedback during execution. - Added support for new skills directory in JSON crew project creation. 
* feat(benchmark): add crew-level benchmarking functionality - Introduced a new `benchmark` command in the CLI for crew-level benchmarking, allowing users to specify agents, models, and timeout settings. - Implemented `CrewBenchmarkCase` to handle crew-level benchmark cases with inputs and criteria. - Enhanced the benchmark runner to support progress tracking and detailed reporting of results for multiple models. - Added tests for loading crew benchmark cases and validating their structure. - Updated existing benchmark functions to accommodate the new crew-level execution model. * feat(cli): enhance JSON crew project functionality and TUI improvements - Added optional agent-level guardrails and advanced options in JSON crew configurations to improve output validation and flexibility. - Updated the TUI to better handle plan step statuses, including visual indicators for task completion and failure. - Introduced methods for parsing and managing step observation events, ensuring accurate updates to task statuses during execution. - Enhanced validation for JSON crew projects, ensuring proper structure and error handling for agent and task definitions. - Added comprehensive tests for new features and validation logic, ensuring robustness in JSON crew project handling. * refactor(cli): streamline JSON crew project handling and improve validation - Refactored JSON crew project loading and validation logic to enhance clarity and maintainability. - Introduced utility functions for finding JSON crew files, improving code reuse across modules. - Removed deprecated benchmark functionality and associated tests to simplify the codebase. - Updated CLI commands to utilize the new JSON project structure, ensuring compatibility with recent changes. - Enhanced test coverage for JSON crew project features, ensuring robust validation and error handling. 
* feat(cli): enhance activity log navigation and focus management - Added functionality to focus on the activity log when navigating through log entries. - Implemented refresh logic for the log panel to ensure updates are displayed correctly during navigation. - Improved keyboard navigation for log entries, allowing users to expand and scroll through logs seamlessly. - Added tests to verify the correct behavior of log navigation and focus management in the TUI. * feat(cli): enhance JSON crew project interaction and input handling - Introduced a new function to enable prompt line editing for better user experience during input prompts. - Updated the JSON crew project wizards to show interpolation hints for dynamic values, improving user guidance. - Enhanced the handling of missing input placeholders by prompting users for required values during crew setup. - Refactored the crew run logic to ensure proper loading and preparation of JSON-defined crews, including runtime input management. - Added tests to verify the correct behavior of new input handling features and JSON crew project interactions. * feat(cli): improve crew project input prompts and event handling - Enhanced the `_prompt_text` function to allow for configurable spacing before prompts, improving user experience during input collection. - Updated the wizards for agent and task creation to utilize the new prompt configuration, ensuring a more compact and streamlined interaction. - Introduced new plan step lifecycle events (`PlanStepStartedEvent`, `PlanStepCompletedEvent`) to better track the execution status of plan steps. - Refactored the step executor to emit these events during the execution of tasks, improving observability and debugging capabilities. - Added tests to verify the correct behavior of new prompt handling and event emissions during crew project execution. 
* fix: refine json-first crew interactions * fix: prioritize common json crew tools * fix: make json crew more tools expandable * fix: show json crew tools by category * feat(memory): update default embedder to OpenAI text-embedding-3-large and enhance memory compatibility - Changed the default embedding model for Memory to OpenAI text-embedding-3-large, which uses 3072-dimensional vectors. - Added warnings regarding compatibility issues with existing local memory stores created with 1536-dimensional embeddings. - Updated documentation to reflect the new default embedder and its configuration options. - Enhanced the CLI and codebase to support the new embedding model across various components, ensuring a seamless transition for users. * fix: address PR review feedback for JSON-first crews Review blockers: - Forward trained_agents_file to JSON crews: crewai run -f now exports CREWAI_TRAINED_AGENTS_FILE for the in-process JSON crew path - Wizard agent picker: Esc/cancel now reprompts instead of silently assigning the first agent - JSON tool resolution hard-fails: unknown tool names, missing custom tool files, and invalid custom tool modules raise JSONProjectError with actionable messages instead of warn-and-continue - Embedding dimension mismatch: LanceDB and Qdrant Edge storages raise EmbeddingDimensionMismatchError with reset/pin guidance instead of silently zero-filling vectors or returning empty search results - Custom tool code execution documented in loader docstring and the scaffolded project README CI fixes: - ruff format across lib/ - All 133 PR-introduced mypy errors fixed (llm.py lazy-litellm and cli.py lazy command shims now use TYPE_CHECKING imports; textual is_mounted misuse fixed; pick_many overloads; misc annotations) Bot review comments: - Empty except blocks now have explanatory comments or debug logging - Removed unused _C_BG/_C_PANEL/_C_BORDER globals and redundant import re; tests use a single import style for create_json_crew Tests: 
trained-agents propagation, wizard cancel, tool resolution failures, and dimension mismatch guidance. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix: address second round of PR review comments Cursor Bugbot: - Wizard agent slugs: strip to [a-z0-9_] and fall back to agent_<n> so symbol-only roles can't produce an empty agents/.jsonc filename - Wizard task names: dedupe against prior task names and fall back to task_<n> for symbol-only descriptions CodeRabbit: - Agent.message(): import Task explicitly at runtime instead of relying on the namespace injection done by crewai/__init__ - Async executor: move the native-tools-unsupported fallback from _ainvoke_loop_react (self-recursion) to _ainvoke_loop_native_tools, mirroring the sync implementation - StepExecutor downgrade: keep the in-step conversation and append the text-tooling instructions instead of rebuilding messages, so completed native tool calls are not re-executed - crewai-files: extension-based MIME lookup now runs before byte sniffing so csv/xml types are not degraded to text/plain - Memory storages: validate every record in a save() batch against a consistent embedding dimension (LanceDB previously checked only the first record); added mixed-batch tests - _print_post_tui_summary now typed against CrewRunApp - Docs: Azure OpenAI default embedder change called out in the memory migration warning and provider table Code quality bots: - Removed unused _C_YELLOW/_C_CYAN (crew_run_tui) and _GREEN (tui_picker) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(cli): accordion tool picker in JSON crew wizard The flat tool list had grown to ~90 rows. The picker now shows: - Common tools always visible at the top - Every other category as a single expandable row with tool and selection counts (e.g. 
"Search & Research (27 tools, 2 selected)") - Expanding a category collapses the previously expanded one - Selections persist across expand/collapse via new preselected support in pick_many; cursor follows the toggled category row tui_picker gains preselected + initial_cursor options on pick_many, and Esc in multi-select now confirms the current selection instead of discarding it (required so collapsing can't silently drop choices). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(cli): remove --daemon flag from crewai run The flag only affected JSON crew projects — classic and flow projects ignored it entirely, which made the behavior inconsistent. Removed the option, the daemon code path (_run_json_crew_daemon), and its helper (_load_json_crew_with_inputs). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test: update run command tests after --daemon removal lib/crewai/tests/cli/test_run_crew.py still asserted the old run_crew(trained_agents_file=..., daemon=False) call signature. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(cli): exit codes, mid-run quit, async statuses, hyphen placeholders Addresses the latest Bugbot review round: - Failed JSON crew runs now exit non-zero (SystemExit(1)) so scripts and CI don't treat failures as success, mirroring the classic path - Quitting the TUI mid-run now ends the process (os._exit(130)); kickoff runs in a thread worker that cannot be force-cancelled, so letting the CLI return would leave LLM/tool work burning tokens in the background - Sidebar task statuses are now async-safe: completion/failure events resolve the task's own row via identity instead of assuming the most recently started task, and starting a task no longer blanket-marks earlier active rows as done - The runtime-input prompt regex now accepts hyphenated placeholder names ({my-topic}), matching kickoff's interpolation pattern Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix: validation safety, custom tool sandboxing, TUI log integrity, memory error surfacing - Deploy validation no longer executes project code: validation mode checks tool declarations structurally (well-formed entries, custom tool file exists) without importing or instantiating anything. custom:<name> resolution only happens on the actual run path. - custom:<name> is constrained to [A-Za-z_][A-Za-z0-9_]* and the resolved path must stay inside the project's tools/ directory, so custom:../foo or absolute-path names cannot execute code outside it. Tool paths resolve relative to the crew project root, not cwd. - TUI task logs are built from per-task state captured at task start (idx, description, agent, start time); an out-of-order completion takes its output from the event and no longer steals or resets the current task's streamed steps/output. 
- EmbeddingDimensionMismatchError now inherits ValueError instead of RuntimeError so background saves surface it through MemorySaveFailedEvent instead of silently dropping the save; the shutdown catch in _background_encode_batch is narrowed to the "cannot schedule new futures" case. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(cli): declared project type wins over crew.json presence A flow project that also contains a crew.json(c) file now runs and validates as the flow it declares in pyproject.toml instead of being hijacked by the JSON crew path. Both crewai run (_has_json_crew) and deploy validation (_is_json_crew) check tool.crewai.type; a missing or unreadable pyproject still means a bare JSON crew project. Also documents why StepObservationFailedEvent intentionally marks the plan step "done": the event signals an observer failure, not a step failure, and the executor continues past it. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(cli): type the declared_type locals so mypy stays clean Comparing an Any-typed .get() chain returns Any, which tripped no-any-return on the previous commit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-14 04:19:48 -03:00
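The `custom:<name>` confinement rule described in this commit (name restricted to `[A-Za-z_][A-Za-z0-9_]*`, resolved path kept inside the project's `tools/` directory) can be sketched roughly as follows. This is an illustration only, not the code from the commit: `resolve_custom_tool_path` is a hypothetical helper name, and the `.py` file extension is an assumption.

```python
import re
from pathlib import Path

# Name pattern quoted in the commit message.
_CUSTOM_TOOL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_custom_tool_path(project_root: Path, name: str) -> Path:
    """Hypothetical helper: map custom:<name> to a file under <project_root>/tools/."""
    # Reject anything that is not a plain identifier ("../foo", "/abs/path", "a-b").
    if not _CUSTOM_TOOL_NAME.fullmatch(name):
        raise ValueError(f"invalid custom tool name: {name!r}")
    # Resolve relative to the project root, not the current working directory.
    tools_dir = (project_root / "tools").resolve()
    candidate = (tools_dir / f"{name}.py").resolve()  # ".py" is an assumption
    # Second check: the resolved path (symlinks included) must stay inside tools/.
    if not candidate.is_relative_to(tools_dir):
        raise ValueError(f"custom tool escapes tools/ directory: {name!r}")
    return candidate
```

Checking both the name and the resolved path is deliberate redundancy: the pattern blocks traversal syntax, and the containment check covers symlinks that point outside `tools/`.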
Vini Brasil	6ad821b157	Add expressions to FlowDefinition actions (#6145) * Add expressions to FlowDefinition actions Let definitions compute values without Python. A new `call: expression` action evaluates a Common Expression Language (CEL) expression, and tool `with:` blocks now render `${...}` CEL templates. Example 1:
```yaml
decide:
  do:
    call: expression
    expr: "state.score >= 80 ? 'qualified' : 'nurture'"
  router: true
  emit: [qualified, nurture]
```
Example 2:
```yaml
search:
  do:
    call: tool
    ref: my.pkg:SearchTool
    with:
      search_query: "${outputs.build_query.query + ' news'}"
      max_results: "${state.limit}"
```
* Address code review comments * Address code review comments * Fix linting offenses * Address code review comments * Fix scrapgraph issue	2026-06-12 21:56:02 -07:00
Vini Brasil	2444895ca4	Implement Flow definition run tools without Python code (#6144) A `do:` step can now say `call: tool` and name a CrewAI tool to run, passing its inputs under `with:`. Before this, a definition could only point at Python code to run.
```yaml
methods:
  search:
    start: true
    do:
      call: tool
      ref: crewai_tools:ExaSearchTool
      with:
        search_query: ai agents
```
	2026-06-12 19:47:58 -07:00
Vini Brasil	bf291a7a55	Drive human feedback from the flow definition (#6133) * Drive human feedback from the flow definition @human_feedback previously wrapped methods with the full HITL runtime (feedback request, outcome collapse, learn loop), so flows built from a YAML definition (which carry no decorated callables) could not pause for or route on human feedback. # Conflicts: # lib/crewai/src/crewai/flow/persistence/decorators.py # lib/crewai/src/crewai/flow/runtime/__init__.py * Address code review comments	2026-06-12 14:48:43 -07:00
Vini Brasil	64438cba37	Wire config and persistence from FlowDefinition into the runtime (#6132) * Wire config and persistence from FlowDefinition into the runtime `from_definition` was silently dropping all config fields; it now passes `config.model_dump()` so suppress_flow_events, max_method_calls, etc. actually apply. Persistence is now engine-driven: `_persist_method_completion` fires after every method using the definition's persist metadata, so `@persist` no longer needs to wrap methods; it just stamps them. * Address code review comments	2026-06-12 11:51:44 -07:00
Lucas Gomide	887adafd2c	fix: aggregate token usage across all LLM calls (#6122) * feat: aggregate LLM token usage at the flow level Introduces `flow.usage_metrics`, a snapshot of every LLMCallCompletedEvent emitted under the flow's `current_flow_id` for the duration of one kickoff (or resume) call. Aggregation happens on the singleton event bus so it covers crews, direct `LLM.call`s, and nested listener calls, solving the mismatch where the SDK reported only the last crew's usage while the Enterprise UI showed the correct full total. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor: centralize provider key normalization in UsageMetrics Add UsageMetrics.from_provider_dict to normalize raw LLM usage dicts across providers (LiteLLM, native Anthropic, native Gemini, OpenAI nested cached). BaseLLM._track_token_usage_internal and the flow-level aggregator now share this single source of truth, so `flow.usage_metrics` agrees with per-LLM totals on every provider, including the native Anthropic path that emits `input_tokens`/`output_tokens` instead of `prompt_tokens`/`completion_tokens`. * fix: flush event bus before reading aggregated usage_metrics `crewai_event_bus.emit` dispatches LLMCallCompletedEvent handlers on a ThreadPoolExecutor (fire-and-forget), so a flow whose last LLM call completes right before kickoff_async/resume_async returns can detach the usage listener while that handler is still queued, leaving its tokens off `flow.usage_metrics`. Match `Crew.kickoff()` and call `crewai_event_bus.flush()` in both finally blocks so every handler drains before the listener is detached. --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-12 12:55:22 -04:00
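The key-normalization idea in this commit (accepting `input_tokens`/`output_tokens` as well as `prompt_tokens`/`completion_tokens` before summing) might look like the sketch below. It is a simplified stand-in, not `UsageMetrics.from_provider_dict` itself: the `Usage` class and `normalize_usage` function are hypothetical names, and only the two key spellings named in the commit message are handled.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical stand-in for an aggregated usage record."""
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    def add(self, other: "Usage") -> None:
        self.prompt_tokens += other.prompt_tokens
        self.completion_tokens += other.completion_tokens


def normalize_usage(raw: dict) -> Usage:
    """Accept either prompt/completion or input/output token keys."""
    prompt = raw.get("prompt_tokens", raw.get("input_tokens", 0)) or 0
    completion = raw.get("completion_tokens", raw.get("output_tokens", 0)) or 0
    return Usage(int(prompt), int(completion))


# Aggregate two calls that report usage under different key names.
total = Usage()
for raw in (
    {"prompt_tokens": 10, "completion_tokens": 5},
    {"input_tokens": 7, "output_tokens": 3},
):
    total.add(normalize_usage(raw))

print(total.prompt_tokens, total.completion_tokens, total.total_tokens)  # 17 8 25
```

Normalizing at a single point means per-call totals and the aggregate cannot disagree about which keys count.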
Vini Brasil	373dca3d04	Run flows from a definition without a Python subclass (#6104) * Read flow dispatch from FlowDefinition Store the definition in a `_definition` PrivateAttr at post-init and convert the dispatch helpers (`_start_method_names`, `_listener_methods`, `_start_condition`, `_listen_condition`, `_is_router`) from classmethods to instance methods that read it. Event names now fall back to `self._definition.name` instead of `self.__class__.__name__`. Behavior is identical for decorator subclasses, but the engine no longer assumes the definition comes from the class. This is the seam for `Flow.from_definition`, where an instance runs a definition that was loaded rather than built from a Python subclass. * Add Flow.from_definition to run flows without a subclass A FlowDefinition (e.g. loaded from YAML) was only usable for dispatch on decorator-authored subclasses. Now each method definition records an importable `module:qualname` handler ref, and `Flow.from_definition` resolves and binds those handlers to build a runnable flow directly. * Build flow state from FlowDefinition Definition-driven flows previously always started with a bare dict state. * Replace handler string with structured FlowActionDefinition `handler: str | None` was optional and opaque: missing handlers only surfaced at kickoff time. `do: FlowActionDefinition` is required, so Pydantic rejects invalid definitions at parse time. The `call: "code"` discriminator prepares the schema for future non-Python action types (e.g. 
MCP tool, crew) without touching `FlowMethodDefinition`. Resolution logic is extracted to `runtime/_action_resolvers.py` to keep the dispatch point isolated. * Fix conversational start router missing required do field FlowMethodDefinition.do became required when the handler string was replaced with FlowActionDefinition, but _conversation_start_router still built its fragment without it, breaking crewai import entirely. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Add event scoping to flow test * Change lib/crewai/tests/test_flow_from_definition.py --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 14:18:49 -07:00
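Resolving an importable `module:qualname` reference, as this commit describes for handler refs, can be done with the standard library alone. The sketch below is a generic illustration rather than the resolver in `runtime/_action_resolvers.py`; `resolve_ref` is a hypothetical name.

```python
import importlib
from functools import reduce
from typing import Any


def resolve_ref(ref: str) -> Any:
    """Hypothetical helper: turn 'package.module:Outer.inner' into the object it names."""
    module_name, sep, qualname = ref.partition(":")
    if not sep or not module_name or not qualname:
        raise ValueError(f"expected 'module:qualname', got {ref!r}")
    module = importlib.import_module(module_name)
    # Walk dotted attribute path, e.g. "JSONDecoder.decode".
    return reduce(getattr, qualname.split("."), module)


join = resolve_ref("os.path:join")
print(join("a", "b"))
```

Splitting on the first `:` keeps the module path and the attribute path unambiguous, which a purely dotted string would not.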
Greyson LaLonde	50b9c02272	fix(checkpoint): rebuild custom BaseLLM as concrete LLM on restore A custom BaseLLM subclass serializes with the inherited llm_type "base", which the registry maps to the abstract BaseLLM. Restore then crashed on cls(**value). Rebuild a concrete LLM from the saved config when the resolved class is abstract.	2026-06-10 22:21:35 -07:00
Greyson LaLonde	fbafe1f0d3	fix(flow): gate restore on a flag so live snapshots don't replay as resume Checkpoint serialization stamps checkpoint_completed_methods onto every live Flow in RuntimeState.root, including the agent executor reused across a crew's tasks. kickoff_async read that stamp as a restore signal, so the second task replayed the first task's completed methods and never reached a final answer. Gate is_restoring on _restored_from_checkpoint, set only by _restore_from_checkpoint, and consume it single-shot.	2026-06-10 20:40:08 -07:00
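The fix above separates "this object carries checkpoint data" from "this object was just restored". A toy model of that single-shot gate, with a hypothetical `Run` class standing in for the flow (the attribute names mirror the commit message, the rest is assumed):

```python
class Run:
    """Hypothetical stand-in for a flow that may be restored from a checkpoint."""

    def __init__(self) -> None:
        # Serialization may stamp this on live objects too, so it is not a restore signal.
        self.checkpoint_completed_methods: set[str] = set()
        self._restored_from_checkpoint = False

    def restore(self, completed: set[str]) -> None:
        # Only an explicit restore sets the flag.
        self.checkpoint_completed_methods = set(completed)
        self._restored_from_checkpoint = True

    def kickoff(self) -> bool:
        # Consume the flag single-shot: the next kickoff starts fresh.
        is_restoring = self._restored_from_checkpoint
        self._restored_from_checkpoint = False
        return is_restoring
```

A live snapshot that only stamps `checkpoint_completed_methods` leaves `kickoff()` returning `False`, while a kickoff right after `restore()` returns `True` exactly once.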
Greyson LaLonde	5267c059f5	test(flow): pass show=False in test_flow_plotting to not open a browser flow.plot defaults to show=True, which calls webbrowser.open on every run. The test only asserts FlowPlotEvent is emitted, so disable the browser open.	2026-06-10 20:36:14 -07:00
Greyson LaLonde	a1f44eb272	fix(events): scope runtime state per run to bound growth and isolate concurrent runs	2026-06-10 18:39:05 -07:00
Lorenze Jay	036b032ab6	handle supporting both custom prompts (#6108) * handle supporting both custom prompts * handle translations * handle deprecation warnings better	2026-06-10 17:52:53 -07:00
Lorenze Jay	f88ae54f96	fix telemetry setup on crewai-login (#6106) * fix telemetry setup on crewai-login * type check fix	2026-06-10 17:03:25 -07:00
Lorenze Jay	b6e5d632c1	improve convo routing cycle with one less route (#6102) * improve one less route * flows in flows, new agent executor causing early trace batch finalization * addressing comments * addressing comments pt2 * lint and typecheck fix	2026-06-10 16:49:16 -07:00
Greyson LaLonde	0d971e5bc5	feat(events): add reset_runtime_state to release accumulated bus state	2026-06-10 16:12:28 -07:00
Lorenze Jay	f214ff4b7b	decouple convo logic from runtime and added a conversational_definition (#6091) * decouple convo logic from runtime and added a conversational_definition * type check fix * always defer traces for convo and so fix tests to reflect that	2026-06-10 10:49:39 -07:00
Vini Brasil	a9e7c3a44f	Simplify flow condition evaluation to be stateless per event (#6097) Re-evaluate the whole `@listen`/`@router` condition tree against the set of events seen so far, instead of tracking which AND sub-branches remain pending. Net effect: * Fixes a regression where `or_()` short-circuited at the first satisfied branch, leaving a sibling `and_()` half-complete so a later trigger could spuriously re-fire the listener * Removes the fragile per-branch pending state and `id()`-based keys * Shrinks the evaluator to one readable predicate	2026-06-10 10:35:25 -07:00
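Stateless evaluation, as described above, means the whole condition tree is re-checked against the set of events seen so far, with no per-branch bookkeeping. A minimal sketch under assumed data shapes: conditions are plain strings or `("and" | "or", ...)` tuples, and `all_of`, `any_of`, and `is_satisfied` are hypothetical names, not the crewAI evaluator.

```python
from typing import Union

# A condition is an event name, or a tuple ("and" | "or", child, child, ...).
Condition = Union[str, tuple]


def all_of(*conds: Condition) -> tuple:
    return ("and", *conds)


def any_of(*conds: Condition) -> tuple:
    return ("or", *conds)


def is_satisfied(cond: Condition, seen: set) -> bool:
    """Pure predicate: depends only on the tree and the events seen so far."""
    if isinstance(cond, str):
        return cond in seen
    op, *children = cond
    if op == "and":
        return all(is_satisfied(child, seen) for child in children)
    if op == "or":
        return any(is_satisfied(child, seen) for child in children)
    raise ValueError(f"unknown operator: {op!r}")


cond = any_of("a", all_of("b", "c"))
print(is_satisfied(cond, {"b"}))       # False: the AND branch is incomplete
print(is_satisfied(cond, {"b", "c"}))  # True
```

Because nothing is mutated between calls, a half-complete branch cannot linger and trigger a spurious later match.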
Lucas Gomide	da8fe8c715	fix: respect suppress_flow_events for method-execution events (#6095) * fix: respect suppress_flow_events for method-execution events * test: align suppressed-flow test with new method-event behavior	2026-06-09 17:19:25 -04:00
Vini Brasil	703ffe67ee	Migrate @listen/@router runtime to read from FlowDefinition (#6084) * Migrate @listen/@router runtime to read from FlowDefinition The runtime now resolves listener conditions, router status, and emit values from `FlowMethodDefinition` instead of legacy method metadata and the `_listeners`/`_routers`/`_router_emit` registries. * Evaluate AND/OR listener conditions over the definition shape via `_evaluate_definition_condition` * Drop the class registries and the `FlowMeta` extraction that built them; stop stamping `__trigger_methods__`, `__is_router__`, `__router_emit__`, and friends * `@human_feedback` emit now lives only on its config * Simplify conditionals DSL	2026-06-09 09:40:30 -07:00
Matt Aitchison	8919026326	feat(storage): pluggable default backends for memory, knowledge, rag, flow (#6079) Add opt-in extension seams so an application can route memory, knowledge, RAG, and flow persistence through a custom backend without subclassing or threading an explicit instance through every construction site -- mirroring the existing crewai_core.lock_store.set_lock_backend seam. - memory: crewai.memory.storage.factory.set_memory_storage_factory - knowledge: crewai.knowledge.storage.factory.set_knowledge_storage_factory - rag: crewai.rag.factory.register_rag_client_factory (provider registry) - flow: crewai.flow.persistence.factory.set_flow_persistence_factory Each construction site consults the registered factory and falls back to the built-in default when none is set; an explicit instance always wins. Widen Knowledge.storage and the knowledge source base classes to BaseKnowledgeStorage (consistent with BaseAgent.knowledge_storage) so any base-interface backend plugs in. Runtime-free tests cover each seam.	2026-06-08 21:14:13 -05:00
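The precedence this commit describes (explicit instance first, then the registered factory, then the built-in default) follows a common seam pattern. The sketch below illustrates only that pattern: `set_storage_factory`, `build_storage`, and both storage classes are hypothetical, and the real `set_*_factory` signatures are not shown in the commit message.

```python
from typing import Callable, Optional


class DefaultStorage:
    """Hypothetical built-in backend."""


class CustomStorage:
    """Hypothetical application-provided backend."""


_factory: Optional[Callable[[], object]] = None


def set_storage_factory(factory: Optional[Callable[[], object]]) -> None:
    """Register (or clear, with None) the process-wide factory."""
    global _factory
    _factory = factory


def build_storage(explicit: Optional[object] = None) -> object:
    # 1. An explicit instance always wins.
    if explicit is not None:
        return explicit
    # 2. Otherwise consult the registered factory, if any.
    if _factory is not None:
        return _factory()
    # 3. Fall back to the built-in default.
    return DefaultStorage()
```

With this shape, an application calls `set_storage_factory(CustomStorage)` once at startup and every later `build_storage()` call picks it up without threading an instance through each construction site.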
Vini Brasil	e570534f15	Migrate `@start` to read from `FlowDefinition` (#6071) * Remove `_start_methods` and `__is_start_method__` stamping * Add helpers to read start info from the definition * Scan `__dict__` instead of `dir()` to find flow methods	2026-06-08 15:03:50 -07:00
Lorenze Jay	8cd51fc67e	Lorenze/imp/conversational flow traces (#6044) * feat: add conversation message and route selection events - Introduced `ConversationMessageAddedEvent` and `ConversationRouteSelectedEvent` to enhance conversational flow tracking. - Updated event listeners to emit these events during message handling and routing decisions. - Enhanced the `_ConversationalMixin` class to emit events for user and assistant messages, as well as selected routes. - Added tests to verify the correct emission of these events during conversational turns. * ensure flow started events only emitted once * refactor(tracing): rename trace event handler methods to action event handlers Updated the class to replace with for and events, improving clarity in event handling. Additionally, adjusted comments in the class to clarify the application of pending user messages in relation to state restoration and flow scope initialization. * fix(conversational_mixin): handle empty message index in route events Updated the message index handling in the class to return when there are no messages. Added tests to ensure that route events do not reference index zero when the transcript is empty, and verified the correct emission of conversation message events during flow handling.	2026-06-05 14:10:19 -07:00
Lucas Gomide	cab3319af9	feat(otel): surface real finish_reason + sampling params + response.id on LLM events (#5945) * feat(otel): surface real finish_reason + sampling params + response.id on LLM events Companion to the OTel GenAI emitter compliance work in crewai-enterprise (CON-172). Today the enterprise emitter reads these fields off the OSS LLM events via `getattr(..., None)`, so it produces valid (but partial) spans against the existing OSS surface. This change makes those fields first-class on the events so spans can carry the real provider data. What this adds: - `LLMCallStartedEvent` gains the sampling-param fields the emitter needs for `gen_ai.request.`: `temperature`, `top_p`, `max_tokens`, `stream`, `seed`, `stop_sequences`, `frequency_penalty`, `presence_penalty`, `n`. All optional; existing call sites keep working. - `BaseLLM._emit_call_started_event` introspects those values off `self` (the LLM instance) via `getattr(..., None)` so every provider gets the fields propagated for free without per-provider plumbing. - `LLMCallCompletedEvent` gains `finish_reason: str | None` and `response_id: str | None`. A field validator coerces any non-string value (MagicMock, unexpected provider object) to None so the event never raises on construction. - `LLM._emit_call_completed_event` accepts both as kwargs. 
- `LLM` (LiteLLM path) gets a defensive `_extract_finish_reason_and_response_id` helper that handles both streaming (`StreamingChoices`) and non-streaming (`Choices`) shapes and is wired into every completion-event emission site. - Provider completions extract native values from their SDK responses and pass them through: - OpenAI: `_extract_responses_finish_reason_and_id` for Responses-API, `_extract_finish_reason_and_id` for Chat-Completions. - Anthropic: `_extract_finish_reason_and_id` (Messages API + streaming). - Bedrock: `_extract_finish_reason_and_id` (`stopReason` from converse). - Gemini: `_extract_finish_reason_and_id` (`finish_reason` from candidates). - Azure: inherits via OpenAI sub-class; adds the helper for Azure-specific response shapes. - openai_compatible: inherits from OpenAICompletion, no edits needed. Compatibility: - All new fields are optional with sensible defaults. No existing call sites need to change. - The validator on `LLMCallCompletedEvent` swallows non-string values for the new fields so legacy mocks / exotic provider types don't blow up event construction. - Enterprise side already reads these fields defensively, so OSS and enterprise can merge independently and cut on the same synchronized release. Tested against the full LLM + events + provider test suite — all green; the 14 pre-existing multimodal failures on main are unrelated and reproduce without this diff. fix(bedrock): propagate finish_reason + response_id on async paths The original commit covered every provider's sync path and Bedrock's sync streaming path, but two Bedrock async paths still emitted LLMCallCompletedEvent without finish_reason/response_id: - _ahandle_converse: the final fallback emit_call_completed_event call was missing both fields. Added stop_reason + response_id matching the other emission sites in the same function. 
- _ahandle_streaming_converse: response_id was never seeded from the initial response object, and stream_finish_reason wasn't propagated to the structured-output and final-text emissions. Now extracts response_id up front and threads stream_finish_reason through every completion event. Adds a dedicated test file covering the new event fields end-to-end: - LLMCallCompletedEvent.finish_reason / response_id Pydantic validation (string accepted, None default, non-string coerced to None). - LLMCallStartedEvent sampling params (all nine fields accepted, default to None). - BaseLLM._emit_call_started_event introspecting sampling params off self, with explicit kwargs overriding. - BaseLLM._emit_call_completed_event passing finish_reason/response_id through to the event. - LLM._extract_finish_reason_and_response_id across the LiteLLM shapes (non-streaming response, streaming chunk, dict, missing fields, non-string values, unexpected input). * fix(otel): correct streaming finish_reason + bedrock response_id semantics Two correctness fixes uncovered while landing the OTel finish_reason + response_id plumbing: - LiteLLM streaming (sync + async): `stream_options={"include_usage": True}` causes LiteLLM to emit a final usage-only chunk with `choices=[]`. The post-loop `_extract_finish_reason_and_response_id(last_chunk)` silently returned `(None, None)` because the last chunk has no choices, even though earlier chunks carried `finish_reason="stop"`. Track both fields incrementally inside the loop (mirroring how OpenAI/Gemini/Azure already handle their native streams) and use the tracked values for the LLMCallCompletedEvent emission and the partial-response error path. - Bedrock Converse: `ResponseMetadata.RequestId` is an AWS infra trace id, not a model-level response id (semantically different from OpenAI's `chatcmpl-XXX`). Return None for `response_id` rather than mislead downstream telemetry consumers. 
The audit-fix's async propagation chain still works — None propagates through unchanged. Adds `test_llm_streaming_finish_reason.py` pinning both the sync and async LiteLLM streaming paths against the include_usage chunk shape. * refactor(otel): unify LLM event introspection + drop redundant defensive code Three cohesion cleanups uncovered during PR review, all behavior-preserving: - LLM.call / LLM.acall in llm.py now delegate to BaseLLM._emit_call_started_event instead of constructing LLMCallStartedEvent inline. The base helper already introspects sampling params off self via getattr; the inline duplication was accidental, not justified, and a duplication risk if anyone adds a tenth OTel sampling param later. - Extracted lib/crewai/llms/_finish_reason_utils.py:extract_choices_finish_reason_and_id as the shared extractor for the choices-based response shape. OpenAI Chat, Azure, and LiteLLM all read the same shape (response.id + choices[0].finish_reason) as both object attrs and dict keys. Providers with genuinely different shapes - Anthropic (stop_reason), Bedrock (stopReason), Gemini (protobuf enum), OpenAI Responses (status) - keep their own provider-specific helpers. - Dropped redundant try/except (AttributeError, TypeError) wrappers around bare getattr(obj, "field", None) calls across the new extraction helpers. getattr with a default already suppresses AttributeError, and the inner isinstance / dict.get / int-coercion ops can't raise TypeError in practice. Kept the catches that legitimately guard against IndexError (e.g. choices[0] on an empty list). Tests: 600 passed, 23 skipped, 14 pre-existing multimodal failures unchanged. Added 12 parametrized tests for the shared helper covering object + dict shapes, missing fields, non-string coercion, and never-raises invariants. 
* chore(otel): drop dead last_chunk variable from async streaming The streaming-fix commit (49e5581b5) replaced the post-loop `_extract_finish_reason_and_response_id(last_chunk)` call with the incrementally-tracked `stream_finish_reason` / `stream_response_id`, which removed the only reader of `last_chunk` in `_ahandle_streaming_response`. The declaration and per-iteration assignment were left behind — harmless but confusing for future readers because the sync sibling still legitimately uses `last_chunk` (for usage and content fallbacks via `_handle_streaming_callbacks`). The async path inlines its usage extraction directly inside the loop (`chunk.model_extra.get("usage")`), so there's no fallback consumer. Drop both lines. Sync path untouched — `last_chunk` there is still load-bearing. * fix(otel): coerce non-list stop_sequences to list[str] on LLMCallStartedEvent Observed in Datadog: gen_ai.request.stop_sequences on a Gemini/Vertex span surfaced the textproto repr of a google.protobuf.struct_pb2.ListValue (values { string_value: "\nObservation:" }) instead of a real Sequence[str]. Root cause is upstream - a Vertex AI / Gemini code path stores the stop list in a protobuf container (RepeatedScalarContainer or ListValue) rather than a plain Python list. When that container reaches LLMCallStartedEvent and then BaseLLM._emit_call_started_event hands it to the OTel SDK as a span attribute, the SDK falls back to str(value) because the type isn't a recognised Sequence[str] - producing the protobuf textproto string instead of an array attribute. 
* chore: fix ruff lint findings * refactor(otel): declare sampling params on BaseLLM + honor stop overrides + dict chunk id * fix: widen max_tokens to int \| float \| None + apply ruff format * fix(otel): coerce unknown finish_reason / response_id to None instead of stringifying * fix(otel): extract Azure stream finish_reason/id before usage-continue Match the LiteLLM ordering so a finish_reason or response id riding on a usage-carrying chunk isn't dropped by the early `continue`. * fix(otel): report effective max_tokens cap + bedrock structured finish_reason	2026-06-05 07:23:38 -04:00
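A minimal sketch of the incremental tracking described in the streaming fix above, assuming chunks arrive as plain dicts with `id` and `choices` keys. The function name `track_stream_metadata` and the dict shape are illustrative assumptions, not the project's actual helper. It shows why reading only the last chunk fails when a usage-only chunk with `choices=[]` closes the stream.

```python
from typing import Any, Iterable, Mapping, Optional, Tuple


def _str_or_none(value: Any) -> Optional[str]:
    """Return the value only when it is a real string; anything else becomes None."""
    return value if isinstance(value, str) else None


def track_stream_metadata(
    chunks: Iterable[Mapping[str, Any]],
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: remember the latest finish_reason and id seen in a stream."""
    finish_reason: Optional[str] = None
    response_id: Optional[str] = None
    for chunk in chunks:
        # Keep the previous value when this chunk does not carry a usable one.
        response_id = _str_or_none(chunk.get("id")) or response_id
        choices = chunk.get("choices") or []
        if choices:  # a usage-only chunk has choices == [] and is skipped here
            finish_reason = _str_or_none(choices[0].get("finish_reason")) or finish_reason
    return finish_reason, response_id


stream = [
    {"id": "chatcmpl-1", "choices": [{"finish_reason": None}]},
    {"id": "chatcmpl-1", "choices": [{"finish_reason": "stop"}]},
    {"id": "chatcmpl-1", "choices": [], "usage": {"total_tokens": 12}},
]
print(track_stream_metadata(stream))
```

Inspecting only `stream[-1]` would find no choices and lose the `"stop"` value carried by the earlier chunk.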
Vini Brasil	906cd9769d	feat(flow): type DSL triggers as route-aware decorators (#6042) Centralize FlowTrigger and FlowMethodDecorator so start/listen/router and the boolean trigger helpers share one authoring contract. This preserves decorated method signatures for static checking while allowing route-label strings in nested FlowCondition data. Export the shared typing helpers for static analyzers, use an explicit Protocol body, align condition validation with Sequence-backed condition data, and drop the stale call-arg ignore exposed by the signature-preserving decorators. Update the flow guide to use or_(...) for multi-label listeners.	2026-06-04 18:07:49 -03:00
Lorenze Jay	14ce97d787	chat api for convo flows (#6034 ) * Add conversational Flow chat helper * Document conversational flow chat APIs in translations * Stringify conversational chat REPL output	2026-06-04 13:36:48 -07:00
Matt Aitchison	f3a15a4f07	feat(lock_store): make locking backend overridable (#6015 ) * feat(lock_store): make locking backend overridable Allow the centralised lock factory to use a pluggable backend instead of the hardcoded Redis/file selection. Backends are resolved with precedence override > CREWAI_LOCK_FACTORY env > built-in default: - set_lock_backend()/reset_lock_backend() and a scoped lock_backend() context manager for programmatic overrides - CREWAI_LOCK_FACTORY="module:callable" env import-path, resolved lazily and cached, with clear errors on malformed or non-callable specs - LockBackend Protocol documenting the contract (raw name in, context manager out; backend owns its namespacing) Default Redis/file behavior is unchanged when nothing is overridden. * refactor(lock_store): use explicit body for LockBackend protocol method Replace the no-op `...` body with `raise NotImplementedError` to satisfy the CodeQL ineffectual-statement check while keeping the Protocol structural-typing only. * refactor(lock_store): drop scoped lock_backend context manager Keep the backend overridable via set_lock_backend/reset_lock_backend and the CREWAI_LOCK_FACTORY env path, but remove the scoped lock_backend() context manager. It was speculative surface and the only thread-unsafe piece (racy save/restore of the module global); nothing depends on it. * refactor(lock_store): drop reset_lock_backend alias reset_lock_backend() was just set_lock_backend(None); callers use that directly. Clearing the override is documented on set_lock_backend. * style(lock_store): apply ruff format * refactor(lock_store): simplify overridable backend to a single setter Reduce the override surface to just set_lock_backend(): lock() uses the custom backend when one is set, otherwise the unchanged Redis/file default. 
Drop the CREWAI_LOCK_FACTORY env import-path, the runtime_checkable Protocol, the precedence resolver, and the getter — a custom backend is now any callable(name, , timeout) -> context manager, registered in process. fix(lock_store): snapshot backend to avoid check-then-call race Read the module-global backend once into a local before the None check and the call, so a concurrent set_lock_backend(None) cannot make lock() invoke None. * docs(lock_store): clarify name handling for custom backends The default namespaces the lock name; custom backends receive it verbatim. Correct the lock() docstring which implied namespacing always happens. * docs(lock_store): note set_lock_backend is for one-time startup setup	2026-06-04 13:28:31 -05:00
Vini Brasil	75dad212a2	Split flow DSL monolith into focused decorator modules (#6040 ) The Flow DSL lived in one 1033-line `dsl.py` that mixed every decorator (`@start`/`@listen`/`@router`), the `human_feedback` decorator, condition combinators, and FlowDefinition extraction helpers in a single file. Split it into a `dsl/` package where each decorator gets its own module (`start.py` 68 lines, `listen.py` 55, `router.py` 164, `human_feedback.py` 98) and the shared extraction/condition helpers stay in `utils.py`. The public API is re-exported from `dsl/__init__.py`, so import paths are unchanged. This is simpler because each decorator is now read and changed in isolation instead of scanning a 1000-line file to find one of them, and router-specific annotation parsing no longer sits next to unrelated start/listen logic.	2026-06-04 15:02:06 -03:00
Vini Brasil	051fa0c1cb	Build FlowDefinition from Flow DSL metadata (#6017) * Build FlowDefinition from Flow DSL metadata Introduce `FlowDefinition`, a serializable model built from the Flow DSL's runtime metadata. It becomes the structural contract for Flow methods, triggers, routers, state, and configuration. The visualization layer is the first consumer: `flow_structure` and `build_flow_structure` now project from the definition instead of re-introspecting the class. The runner still executes from live registries, but the definition gives future runners a single static contract to read. This replaces AST source parsing for router return values, crew references, and state schema with runtime metadata plus explicit `@router(paths=...)` or `Literal`/`Enum` return hints. AST parsing was fragile and could silently fail for dynamic or non-inspectable methods.
The refactor removes obsolete introspection and serializer code: * Delete `flow_serializer.py`, `flow/utils.py`, and `visualization/schema.py` * Move flow structure modeling into `flow_definition.py` * Simplify visualization building around the static definition contract * Format files	2026-06-03 18:02:56 -03:00
Lucas Gomide	d09e3f4544	feat: flatten LiteLLM cache/reasoning usage sub-counts in _usage_to_dict (#6033 ) LiteLLM returns provider usage as-is, nesting cache-read / cache-creation / reasoning counts under provider-specific shapes (e.g. prompt_tokens_details.cached_tokens, Anthropic-style cache_read_input_tokens). Surface them as flat cached_prompt_tokens / reasoning_tokens / cache_creation_tokens keys so the span pipeline can read them; prompt / completion / total token counts are left untouched.	2026-06-03 15:13:30 -04:00
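A rough sketch of the flattening described in this commit, assuming usage arrives as a plain dict. The function name `flatten_usage` is hypothetical. The nested keys `completion_tokens_details.reasoning_tokens` and `cache_creation_input_tokens` are assumptions about where the remaining counts live; the commit itself only names `prompt_tokens_details.cached_tokens` and `cache_read_input_tokens`.

```python
from typing import Any, Dict, Mapping, Optional


def _as_int(value: Any) -> Optional[int]:
    """Accept real integers only (bool is excluded on purpose)."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None


def _details(usage: Mapping[str, Any], key: str) -> Mapping[str, Any]:
    """Return a nested details mapping, or an empty one when absent or malformed."""
    nested = usage.get(key)
    return nested if isinstance(nested, Mapping) else {}


def flatten_usage(usage: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: copy usage and add flat cache/reasoning keys when present."""
    flat: Dict[str, Any] = dict(usage)  # prompt/completion/total counts stay untouched

    cached = _as_int(_details(usage, "prompt_tokens_details").get("cached_tokens"))
    if cached is None:
        cached = _as_int(usage.get("cache_read_input_tokens"))  # Anthropic-style
    reasoning = _as_int(
        _details(usage, "completion_tokens_details").get("reasoning_tokens")
    )
    created = _as_int(usage.get("cache_creation_input_tokens"))

    for key, value in (
        ("cached_prompt_tokens", cached),
        ("reasoning_tokens", reasoning),
        ("cache_creation_tokens", created),
    ):
        if value is not None:
            flat[key] = value
    return flat
```

Keys are only added when a count is actually found, so consumers can tell "not reported" apart from zero.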
Lorenze Jay	1357491f0d	Lorenze/feat/conversational flows (#5896 ) * feat: add conversational flows documentation and chat session support - Introduced a new guide for building multi-turn chat applications using , detailing session management and message handling. - Added class to facilitate chat interactions, including streaming support and event handling. - Implemented for class-level defaults and improved input normalization for conversational turns. - Enhanced event listeners to manage flow events and tracing more effectively, including support for nested crew executions. - Added tests for conversational flow helpers and kickoff parameters to ensure functionality and reliability. * linted * feat: enhance flow event tracing and session management - Updated TraceCollectionListener to handle nested flows without re-claiming parent session batches. - Ensured that method execution events are always emitted for tracing, regardless of flow event suppression. - Improved finalization logic for flow trace batches to respect session deferral flags. - Added tests to verify that method execution events are emitted correctly when flow events are suppressed and that deferred session finalization is respected in nested flows. * updated docs * feat: introduce experimental conversational flow framework - Added a new module for conversational flow, including classes for managing conversation state, messages, and events. - Implemented and for structured intent handling and routing. - Enhanced the class to support turn-oriented conversational applications with built-in routing and message handling. - Updated to include new classes in the public API. - Added tests to validate the functionality of the new conversational flow features. * handled docs * feat(flow): enhance conversational flow handling and tracing - Introduced support for deferred multi-turn tracing to maintain continuous event sequences. 
- Updated method to delegate to restored checkpoint flows, improving session management. - Added tests to validate the new tracing behavior and ensure correct event handling in conversational flows. * fix multimodal test * better conversational * adjusted prompt * drop unused * fix test * refactor: rename to and update related documentation This commit refactors the class to for clarity and consistency across the codebase. The documentation has been updated to reflect this change, ensuring that references to the new class are accurate. Additionally, the alias for legacy imports is maintained for backward compatibility. The changes enhance the overall structure and readability of the conversational flow implementation. * fix test * adding experimental indicators * fix test and reloaded cassettes * cleanup ConversationalFlow class * addressing double finalization and fixed tests * improve on ephemeral tracing and addressing comments	2026-06-03 11:53:16 -07:00
Lorenze Jay	770d1b284f	Lorenze/fix/file input not working reliably (#6020 ) * fix filesystem * Refine commit message formatting * fix for async kickoffs * added suggestion	2026-06-02 17:14:51 -07:00
alex-clawd	b047c96756	Handle Snowflake Claude stringified tool calls (#6008) * Handle Snowflake Claude stringified tool calls * Fix Snowflake tool id type narrowing * Extract Snowflake tool result text in summaries * Bump PyJWT for vulnerability scan --------- Co-authored-by: João Moura <joaomdmoura@gmail.com>	2026-06-02 19:37:18 -03:00
Lorenze Jay	a9cb7867bb	Add crew trained agents file support (#6012 ) * Add crew trained agents file support * Add crew trained agents file support	2026-06-02 09:38:34 -07:00
alex-clawd	774fd871a8	Fix Snowflake Claude incomplete tool result histories (#6006 ) * Fix Snowflake Claude incomplete tool result histories * Filter Snowflake Claude preserved tool results	2026-06-02 09:11:59 -03:00
alex-clawd	4a0769d97c	Add native Snowflake Cortex LLM provider (#6005 )	2026-06-02 08:10:13 -03:00
Greyson LaLonde	e53a676c04	fix(flow): re-arm multi-source or_ listeners across router-driven cycles The previous discard-after-body approach cleared the gate mid-wave, so a slow parallel @start finishing after the listener body could re-fire the same multi-source or_ listener. Re-arm only when a router emits a signal that matches the listener's condition; parallel @start paths never reach that branch and the race gate keeps protecting them. Closes #5972	2026-06-01 15:24:58 -07:00
Vini Brasil	1aba9fe415	Split `flow.py` into DSL, definition, and runtime (#5997 ) This commit separates the monolithic `flow.py` into three modules, each with one job: - `dsl.py` - the Python DSL for flows (@start/@listen/@router, or_/and_) - `flow_definition.py` - the structural model extracted from the DSL - `runtime.py` - the execution engine and state for flows This phase moves code only and should not have any breaking changes.	2026-06-01 18:37:10 -03:00
Greyson LaLonde	ed91100a0f	refactor(skills): move Skills Repository to experimental + CREWAI_EXPERIMENTAL gate Moves the registry/cache pieces of PR #5867 under crewai.experimental.skills and the CLI commands under `crewai experimental skill`. The stable local-file skills feature (loader, parser, validation, models) stays in crewai.skills. Both entry points now require CREWAI_EXPERIMENTAL=1: - resolve_registry_ref() calls require_experimental_skills() before resolving - The `crewai experimental` CLI group raises UsageError when the flag is unset SkillDownloadStarted/CompletedEvent move out of crewai.events.types.skill_events into crewai.experimental.skills.events. * refactor(skills): move 'version' off SkillFrontmatter into metadata The skill version is now stored as `metadata.version` rather than a top-level field on `SkillFrontmatter`. A `before` validator lifts any top-level YAML `version:` into `metadata['version']` so existing SKILL.md files keep parsing.	2026-05-28 09:38:10 -07:00
Lorenze Jay	2e36f06732	feat: enhance StdioTransport to prevent environment variable leakage (#5506 ) * feat: enhance StdioTransport to prevent environment variable leakage - Replaced os.environ.copy() with get_default_environment() to ensure only allowed environment variables are passed to the MCP server. - Added tests to verify that ambient environment variables do not leak and that user-supplied environment variables can override defaults. * feat: add environment variable filtering hook to StdioTransport - Introduced an optional `_env_filter_hook` to allow extensions to modify the environment variables passed to MCP servers, enabling features like credential stripping. - Updated tests to ensure the filtering hook is applied correctly after merging user-supplied and default environment variables.	2026-05-27 13:38:25 -07:00
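The merge order described here (allowlisted defaults, then user-supplied overrides, then an optional filter hook) can be sketched with the standard library alone. Everything below is an assumption for illustration: the allowlist contents, the name `build_child_env`, and the `env_filter` parameter are not the transport's real API.

```python
import os
from typing import Callable, Dict, Mapping, Optional

# Assumed allowlist for the sketch; the real default set may differ.
ALLOWED_KEYS = ("HOME", "LOGNAME", "PATH", "SHELL", "TERM", "USER")

EnvFilter = Callable[[Dict[str, str]], Dict[str, str]]


def build_child_env(
    user_env: Optional[Mapping[str, str]] = None,
    env_filter: Optional[EnvFilter] = None,
    ambient: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: build the environment handed to a child process."""
    source = os.environ if ambient is None else ambient
    # 1. Start from allowlisted variables only, never a full copy of the parent env.
    env = {key: source[key] for key in ALLOWED_KEYS if key in source}
    # 2. Explicit user-supplied values override the defaults.
    env.update(user_env or {})
    # 3. The optional hook runs last, after merging (e.g. to strip credentials).
    if env_filter is not None:
        env = env_filter(env)
    return env
```

Running the hook after the merge means it sees exactly what the child process would receive, including user overrides.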
Lorenze Jay	a1033e4bfe	Fix structured output leaks in tool-calling loops (#5897 ) * Fix structured output leaks in tool-calling loops * addressing comments * drop scripts * Update Gemini agent tests to include structured output with thoughts and bump model version to 2.5-flash * merge * Update Anthropic test cases to use new model and tool structure - Changed the model from "claude-3-5-haiku-20241022" to "claude-sonnet-4-6" in the test setup. - Updated the request and response formats in the YAML test cassette to reflect the new tool structure and improved content formatting. - Adjusted the expected response body to match the new output format from the assistant, including changes in tool usage and response details. - Increased rate limit values in the response headers for better testing scenarios. * adjusted bedrock cassettes * adjusting cassettes for bedrock * fix test * Update VCR configuration to use 'host' instead of 'bedrock_host' for request matching	2026-05-27 13:20:53 -07:00
Greyson LaLonde	fd10c64148	chore(crewai): drop self-explanatory comments	2026-05-26 10:23:33 -07:00
Lorenze Jay	77a61274dc	feat(planning): enhance planning configuration and observation handling (#5913 ) * feat(planning): enhance planning configuration and observation handling - Introduced attribute in to control LLM calls after each step. - Updated to set default to 1 when planning is enabled without explicit config. - Modified to support heuristic observations when LLM calls are disabled. - Adjusted to respect and settings for step observations. - Added tests to verify behavior of new configurations and ensure correct observation handling across different reasoning efforts. * fix(agent_executor): update handling of failed steps in low effort mode - Adjusted logic to ensure that failed steps are recorded without marking them as completed when using low reasoning effort. - Introduced feedback for failed steps, allowing the process to continue while tracking failures. - Added a test to verify that failed steps are correctly marked without triggering a replan. - And linted * linted	2026-05-26 09:10:43 -07:00
Vini Brasil	32f5e74449	Skip lock acquisition in CrewTrainingHandler.load when file is missing (#5935 ) Every agent kickoff calls _use_trained_data, which calls CrewTrainingHandler(...).load(). Since #4827 wrapped load() in store_lock, that means every kickoff acquires the cross-process (Redis-backed when REDIS_URL is set) lock even on deployments that never train and have no trained-agents file on disk. Move the missing/empty-file short-circuit above store_lock so the lock is only acquired when there is actually a file to read. save() and the real read remain locked.	2026-05-26 12:52:31 -03:00
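The reordering in this commit can be illustrated with a small stand-in, assuming a JSON file and an in-process lock. `CountingLock` and `load_trained_data` are hypothetical names, not the handler's real code. The point is that the missing or empty file check runs before the lock is touched.

```python
import json
import os
import threading
from typing import Any, Dict


class CountingLock:
    """Stand-in for a cross-process lock that records how often it is acquired."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self) -> "CountingLock":
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc_info: Any) -> bool:
        self._lock.release()
        return False


def load_trained_data(path: str, lock: CountingLock) -> Dict[str, Any]:
    """Hypothetical loader: short-circuit before locking when there is nothing to read."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return {}  # no file, no lock round-trip
    with lock:  # the real read stays locked
        with open(path, "r", encoding="utf-8") as handle:
            return json.load(handle)
```

On deployments that never train, every call takes the first branch, so the lock (and any network round-trip behind it) is never exercised.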
Greyson LaLonde	867df0f633	fix(checkpoint): drop unroundtrippable callbacks and adapter state - callable_to_string returns None for lambdas/closures instead of an unresolvable dotted path; Crew filters Nones out of restored callback lists. - EventNode.event serializer honors info.mode so mode='json' calls cascade properly into nested event payloads. - RagTool.adapter serializes to None (post-validator rebuilds from config); concrete adapters hold runtime state that can't be round-tripped.	2026-05-25 19:24:02 -07:00
Greyson LaLonde	306f5989b4	fix(checkpoint): avoid orphan task_started on resume scope restore Move scope restoration from Crew-level global push to a per-task push inside Task via resume_task_scope() in event_context. Fixes orphan task_started warning, hierarchical resume (manager_agent now eligible for _resuming), and parallel async resume (each contextvars copy owns its own scope). Tests added.	2026-05-23 01:20:15 +08:00
Greyson LaLonde	b4b285764c	fix: harden RuntimeState serialization across entity fields Adds missing serializers, discriminators, and exclude markers on entity fields that previously crashed model_dump_json or restored ambiguously: - Flow.persistence: add _serialize_persistence; drop \| Any escape hatch - Flow.input_provider: SerializableInstance dotted-path round-trip - BaseAgent.agent_executor: add _serialize_executor_ref - BaseAgent.tools_handler / cache_handler: exclude=True - Memory / MemoryScope / MemorySlice: memory_kind Literal discriminator - Knowledge.storage / .embedder: exclude live client, serialize spec - BaseKnowledgeSource subclasses: source_type Literal + dict-resolver - BaseKnowledgeSource.storage / chunk_embeddings: exclude=True - input_provider: enforce InputProvider protocol via dedicated validator/serializer; reject non-class dotted paths in _dotted_path_to_instance - MemoryScope/MemorySlice: allow restore without live Memory; expose bind() to reattach the dependency post-restore - Knowledge.embedder: add BeforeValidator that resolves provider_class dotted paths back to a BaseEmbeddingsProvider subclass	2026-05-21 14:53:40 +08:00
alex-clawd	418afd29e7	feat: Skills Repository — registry, cache, CLI, and SDK integration (#5867) * feat: add Skills Repository — registry, cache, CLI, and SDK integration Adds a Skills Repository feature allowing users to publish, install, and use skills from the CrewAI registry with @org/skill-name refs. ## What's New ### SDK (lib/crewai/) - SkillFrontmatter: added optional 'version' field (backward compatible) - SkillCacheManager: manages ~/.crewai/skills/{org}/{name}/ with .crewai_meta.json tracking, path-traversal-safe tar extraction - SkillRegistry: parse @org/skill-name refs, local-first resolution (./skills/ > cache > download), interactive prompt on first use, CI-mode guard (CREWAI_NONINTERACTIVE/CI env vars) - Agent.skills and Crew.skills widened to accept str refs (@org/name) - set_skills() resolves registry refs with org-prefixed dedup keys - New events: SkillDownloadStartedEvent, SkillDownloadCompletedEvent ### CLI (lib/cli/) - crewai skill create <name> — context-aware (project vs standalone) - crewai skill install @org/name — downloads to ./skills/ or cache - crewai skill publish — ZIP + upload to org registry - crewai skill list — show installed skills ### PlusAPI (lib/crewai-core/) - Added SKILLS_RESOURCE, get_skill(), publish_skill(), list_skills() ### Scaffolding - crew and flow templates now include skills/ directory ### Tests - 91 SDK skill tests + 15 CLI skill tests, all passing * fix: address all CI failures and CodeRabbit review comments Lint: - Remove unused imports (click,
pytest, json) - Replace try-except-pass with logging (S110) - Fix unprotected zipfile.extractall (S202) Security: - Path traversal: startswith → is_relative_to for tar extraction - Add path traversal protection to ZIP extraction via _safe_extract_zip - Both cache.py and CLI main.py hardened Type checker: - Fix import path: crewai.events.event_bus (not crewai_event_bus) - Remove unused type: ignore comments - Fix type mismatches in set_skills() variable types Code quality: - Fix f-string interpolation in SkillNotCachedError - Use ValidationError instead of Exception in test * style: ruff format + autofix remaining lint errors * refactor: reuse SDK parser and SkillCacheManager in CLI - _parse_frontmatter() now delegates to crewai.skills.parser.parse_frontmatter when available, with a minimal fallback for CLI-only installs - install() global cache path now reuses SkillCacheManager.store() instead of duplicating metadata writing logic * refactor: add _print_current_organization to SkillCommand (matches ToolCommand pattern) * fix: write .crewai_meta.json in fallback install path CodeRabbit caught that the ImportError fallback in install() didn't write cache metadata, making skills invisible to 'crewai skill list'. * fix: tighten @org/name ref validation to prevent path traversal Reject refs with multiple slashes (@org/a/b), dot segments (@../skill), or leading dots in org/name. Applied to both CLI install() and SDK parse_registry_ref() so the contract is enforced consistently. * fix: update test assertions to match tightened error messages * fix: align OSS client with AMP API contract - download_skill(): fetch download_url (presigned URL) instead of expecting inline base64. Falls back to 'file' field for compat. 
- Read 'latest_version' field, fall back to 'version' - Same fixes applied to CLI install() command * fix: publish as tar.gz (matches AMP content_type validation) + add zip fallback to SDK cache CLI publish: - _build_skill_zip → _build_skill_tarball (tar.gz format) - Content type: application/x-gzip (matches SkillVersion validation) SDK cache: - store() now tries tar.gz first, falls back to zip extraction - Added _safe_extract_zip for path-traversal-safe zip handling - Both formats work for download/install regardless of server format --------- Co-authored-by: João Moura <joaomdmoura@gmail.com>	2026-05-20 14:38:25 -03:00
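The tightened `@org/name` validation mentioned in this commit (exactly one slash, no dot segments, no leading dots) might look roughly like the sketch below. The function name `parse_skill_ref_sketch` and the allowed character set are assumptions for illustration, not the SDK's actual rules.

```python
import re
from typing import Tuple

# Assumed segment grammar: starts with an alphanumeric, then letters,
# digits, dot, underscore, or hyphen. A leading dot can never match.
_SEGMENT = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def parse_skill_ref_sketch(ref: str) -> Tuple[str, str]:
    """Hypothetical parser: split '@org/name' and reject traversal-prone refs."""
    if not ref.startswith("@"):
        raise ValueError(f"ref must start with '@': {ref!r}")
    parts = ref[1:].split("/")
    if len(parts) != 2:  # rejects '@org' and '@org/a/b'
        raise ValueError(f"ref must be exactly '@org/name': {ref!r}")
    for segment in parts:
        # fullmatch also rejects empty segments, '..', and '.hidden'
        if not _SEGMENT.fullmatch(segment) or ".." in segment:
            raise ValueError(f"invalid segment {segment!r} in {ref!r}")
    org, name = parts
    return org, name


print(parse_skill_ref_sketch("@acme/web-search"))
```

Validating the ref before it is ever joined into a cache path keeps the filesystem layer from having to defend against `..` on its own.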
Greyson LaLonde	eefe0e42ac	fix: surface streamed tool calls when available_functions is absent	2026-05-16 02:46:35 +08:00


283 Commits