- Remove redundant local `import asyncio` in executor.py that caused
ruff F823 (local variable referenced before assignment)
- Clear progress state before creating Live display (fixes flash)
- Compute `passed` from the threshold in _save_run_results so persisted
results match CLI output
- Pass agents_dir to load_agent_from_definition in _train_new_agents
so coworker references resolve correctly
- Deduplicate verbose/non-verbose benchmark execution blocks into
single context-manager expression
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
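
For context on the F823 item: an `import` statement inside a function makes the imported name local to the whole function body, so any reference before that line fails at runtime. A minimal reproduction, using a hypothetical function name rather than the actual executor.py code:

```python
import asyncio


def run_with_redundant_import() -> str:
    # The `import asyncio` further down makes `asyncio` a local name for the
    # entire function, so this earlier reference raises UnboundLocalError.
    try:
        asyncio.get_event_loop_policy()
    except UnboundLocalError as exc:
        return type(exc).__name__
    import asyncio  # redundant local import; deleting it removes the error

    return "ok"


print(run_with_redundant_import())
```

With the local import deleted, the module-level `asyncio` is used and the function returns "ok".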
Add "new_agent" top-level section to en.json with all prompt templates
(soul, tools_header, coworkers_header, skills_self_build, temporal,
tool_result_truncated). Add new_agent() accessor to I18N class.
Executor now pulls all prompt text from I18N_DEFAULT.new_agent() with
format() placeholders, making it easy to tune prompts per model or
translate to other languages without touching executor logic.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Summary tags: Reverse the logic — for models like gpt-4.1 that wrap
their actual response in <summary> tags (with thinking/CoT before it),
extract the inner content instead of stripping it. Streaming uses a
preflight buffer that waits for <summary>; if none appears, flushes
everything normally.
2. TUI autocomplete: Change @mention accept key from Tab to right-arrow
so autocomplete doesn't steal focus from the input widget. Only
triggers when there's an active mention context with matches.
3. Tool output: Truncate tool results >4000 chars in LLM message history
to prevent the model from echoing full file contents. Add soul-layer
instruction telling the agent to summarize tool results rather than
repeating them verbatim.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
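
Items 1 and 3 above could be sketched roughly as follows. The helper names, the regex, and the truncation marker are assumptions for illustration; the actual executor takes its marker text from the `tool_result_truncated` template and handles streaming separately.

```python
import re

# Hypothetical helpers; not the names used in the executor.
_SUMMARY_RE = re.compile(r"<summary>(.*?)</summary>", re.DOTALL)
MAX_TOOL_RESULT_CHARS = 4000


def extract_summary(text: str) -> str:
    """Return the content inside <summary> tags, or the text unchanged."""
    match = _SUMMARY_RE.search(text)
    if match is None:
        return text  # no tags: pass everything through
    return match.group(1).strip()


def truncate_tool_result(result: str, limit: int = MAX_TOOL_RESULT_CHARS) -> str:
    """Cap a tool result before it is stored in the LLM message history."""
    if len(result) <= limit:
        return result
    omitted = len(result) - limit
    return result[:limit] + f"\n[truncated {omitted} chars]"


print(extract_summary("step 1... step 2...\n<summary>Final answer</summary>"))
print(len(truncate_tool_result("x" * 5000)))
```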
Root cause: test_gap_implementations.py assigned directly to
crewai_event_bus.emit (instance attribute), which shadowed the class
method even after restoration. Later tests using patch.object on the
class couldn't intercept calls.
Also converts all 19 positional crewai_event_bus.emit() calls across 8
new_agent files to use the event= keyword argument, matching the
pattern in llm.py. Adds <summary> tag stripping for both ainvoke() and
astream() to prevent summarization prompt leakage in agent responses.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
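
The shadowing described in the root cause can be reproduced in isolation. The sketch below uses a stand-in `EventBus` class (hypothetical, not the real crewai_event_bus) to show why "restoring" by assignment still leaves an instance attribute behind, and why deleting it lets `patch.object` on the class work again:

```python
from unittest.mock import patch


class EventBus:
    """Stand-in for an event bus; only the lookup behaviour matters here."""

    def emit(self, source, event):
        return "real"


bus = EventBus()

# What the offending test effectively did: assign, then "restore" by assignment.
original = bus.emit
bus.emit = lambda source, event: "fake"
bus.emit = original  # still an instance attribute, not a true restore

with patch.object(EventBus, "emit", return_value="patched"):
    shadowed_result = bus.emit(None, event=None)  # instance attribute wins

# Deleting the instance attribute restores normal class-level lookup.
del bus.emit
with patch.object(EventBus, "emit", return_value="patched"):
    fixed_result = bus.emit(None, event=None)

print(shadowed_result, fixed_result)
```

Using `patch.object(bus, "emit", ...)` as a context manager in the original test avoids the problem, since the patcher removes the instance attribute on exit.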
- Run several blocking ConversationalAgentExecutor methods via asyncio.to_thread so they no longer block the event loop.
- Handle memory persistence and message summarization asynchronously.
- Build the prompt stack in a separate thread.
- Remove dead `env_vars.get("MODEL")` check in _setup_env (always truthy
since MODEL is set two lines above)
- Fix test_sync_delegation mock: use return_value instead of side_effect
list and disable planning to prevent StopAsyncIteration on Python 3.10
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- test_streaming_properties_from_docs: add record_mode="none" so VCR never
falls through to the real OpenAI API; cassette already exists.
- gitpython >=3.1.50 (GHSA-mv93-w799-cj2w)
- langchain-core >=1.3.1 (GHSA-pjwx-r37v-7724; resolves to 1.3.3)
- urllib3 >=2.7.0 (GHSA-qccp-gfcp-xxvc, GHSA-mf9v-mfxr-j63j; 2.6.4 was never released)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace S101 assert guards with explicit if/raise RuntimeError in
benchmark.py and cli.py (3 locations)
- Fix test_create_llm_from_env_with_unaccepted_attributes to use
DEFAULT_LLM_MODEL with clear=True so the assertion isn't brittle
against the hardcoded model name
- Add n_iterations loop to _test_new_agents (was unused, now mirrors
_train_new_agents iteration pattern)
- Consolidate dotenv loading in cli.py and agent_tui.py to use the
existing load_env_vars() from utils.py instead of duplicating logic
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add cast import and use cast() to fix no-any-return errors in _find_tool_class
- Add dict[str, Any] type params to fix type-arg errors in parse_agent_definition/load_agent_from_definition
- Add # type: ignore[import-untyped] for jsonschema import
- Fix A2AClientConfig call-arg: url -> endpoint
- Cast llm to BaseLLM when passing to LLMGuardrail
- Cast tool attr to type[Any] to allow instantiation
- Add # type: ignore[import-not-found] for DirectoryKnowledgeSource import
- Use MCPServerHTTP instead of non-callable MCPServerConfig union alias
- Add explicit list[Any] type annotation for resolved variable
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- test_lite_agent_standalone_still_works: replace real LLM with Mock to
avoid ConnectionError hitting OpenAI in CI
- coworker_tools.py:352: add type: ignore[import-not-found] for crewai.a2a.client
- coworker_tools.py:415: filter BaseException instances from gather results
so return type matches list[str]
- executor.py:740: add type: ignore[import-not-found] for checkpoint_events
- executor.py:2245: guard r.content access with isinstance(r, Message) check
- flow.py:3259: cast model_dump() result to dict[str, Any]
- flow.py: fix response/future no-redef errors by hoisting declarations
and renaming coro_future to avoid duplicate type annotations
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
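
The coworker_tools.py gather fix follows a common pattern, sketched here under the assumption that `asyncio.gather` is called with `return_exceptions=True`. The coroutine and function names are hypothetical:

```python
import asyncio


async def ask(name: str) -> str:
    # Hypothetical coworker call; "bad" simulates a failing coworker.
    if name == "bad":
        raise RuntimeError("coworker unavailable")
    return f"reply from {name}"


async def gather_replies(names: list[str]) -> list[str]:
    results = await asyncio.gather(
        *(ask(n) for n in names), return_exceptions=True
    )
    # Each item is `str | BaseException`; filtering out the exceptions
    # narrows the element type so the result matches list[str].
    return [r for r in results if not isinstance(r, BaseException)]


replies = asyncio.run(gather_replies(["a", "bad", "b"]))
print(replies)
```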
- Format: auto-reformat agent_tui.py, benchmark.py, coworker_tools.py via ruff
- Lint: 0 remaining errors after format pass
- Mypy: fix _NullPrinter to subclass Printer for type compatibility in
executor.py, planning.py, and skill_builder.py; add isinstance(r, Message)
guards in spawn_tools.py; annotate return types and fix dict type params
and MCPToolResolver logger type in new_agent.py; add missing printer args
to get_llm_response calls
- cli.py: fix _read_config to use sentinel so falsy values (0, false) are
returned correctly instead of being treated as missing keys
- create_agent.py: replace regex-based JSONC comment stripper with a
token-aware parser that preserves // inside quoted strings (e.g. URLs)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
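
A token-aware JSONC stripper of the kind described for create_agent.py might look like the sketch below. It is an illustration, not the committed parser: the function name is hypothetical, and support for `/* */` block comments is an assumption.

```python
import json


def strip_jsonc_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside JSON strings."""
    out: list[str] = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character as-is
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip to end of line, leaving the newline in place.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = '{\n  // endpoint\n  "url": "https://example.com/a", /* inline */ "n": 1\n}'
config = json.loads(strip_jsonc_comments(sample))
print(config)
```

A regex such as `//.*$` would cut the URL at `https:`, which is the failure mode the token-aware approach avoids.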
- Added checks for read-only memory settings in `ConversationalAgentExecutor` to prevent modifications when memory is not writable.
- Improved memory extraction logic to include metadata when remembering memories, enhancing context retention.
- Updated logging from debug to warning level for memory initialization and save failures, ensuring better visibility of issues.
These changes aim to improve the robustness and clarity of memory management within the CrewAI framework.
- Added a check for `None` tools in the `_tool_has_arun` method to prevent errors during tool validation.
- Improved the logic to exclude tools from the `crewai.tools.base_tool` module when determining if they have a real async `_arun` method, ensuring more accurate tool handling.
These changes aim to improve the robustness of tool validation within the CrewAI framework.
- Added `_arun` methods to `DelegateToCoworkerTool`, `MultiDelegateTool`, and `SpawnSubtaskTool` classes to support asynchronous task delegation and spawning, enhancing non-blocking operations.
- Introduced event emissions for delegation and spawning processes, allowing for better tracking of task states and outcomes.
- Implemented error handling and logging for async operations, ensuring robust execution and feedback during agent interactions.
These enhancements aim to optimize the performance and responsiveness of agent task management within the CrewAI framework.
- B904: raise KeyboardInterrupt from err in cli_provider.py
- mypy: add TYPE_CHECKING import for SQLiteConversationStorage, annotate
_initialized class var in TaskScheduler, fix Match type params and
Returning Any in create_agent.py
- tests: mock aget_llm_response in 3 integration tests that fail when
network is blocked but OPENAI_API_KEY is set
- flow.py: use asyncio.run_coroutine_threadsafe() instead of asyncio.run()
when a loop is already running in ask() and say()
- cli.py: fix threshold=0.0 treated as falsy by using `is not None` check
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
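
The `threshold=0.0` fix and the earlier `_read_config` sentinel change address the same class of bug: a truthiness check that cannot tell "missing" from a legitimate falsy value. A minimal sketch, with hypothetical function names and a made-up default of 0.7:

```python
from typing import Any, Optional

_MISSING = object()  # sentinel: distinguishes "key absent" from falsy values


def read_config(config: dict[str, Any], key: str, default: Any = None) -> Any:
    value = config.get(key, _MISSING)
    return default if value is _MISSING else value


def resolve_threshold(cli_value: Optional[float], config: dict[str, Any]) -> float:
    # `if cli_value:` would discard an explicit 0.0; compare against None.
    if cli_value is not None:
        return cli_value
    return read_config(config, "threshold", 0.7)  # 0.7 is an assumed default


print(resolve_threshold(0.0, {"threshold": 0.9}))
print(read_config({"verbose": False}, "verbose", True))
```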
- Added a `CreateRoomScreen` modal for creating new rooms with agent selection and engagement options.
- Updated the main TUI layout to include a sidebar for room management, allowing users to create and switch between rooms.
- Enhanced the configuration handling to support room definitions and engagement modes.
- Refactored existing code to accommodate new room functionalities and improve overall structure.
These changes enhance the user experience by enabling better organization and interaction with multiple agents in the CrewAI framework.
- Introduced a new `LoadedCases` class to encapsulate benchmark cases and optional thresholds, improving data management.
- Updated `load_benchmark_cases` function to support loading cases from both bare arrays and object wrappers with a threshold.
- Modified CLI options to allow dynamic threshold configuration, defaulting to a value from `config.json` if not specified.
- Enhanced error handling for invalid benchmark case formats and added tests to validate new functionality.
These changes aim to improve the flexibility and usability of benchmark case management within the CrewAI framework.
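
As a rough illustration of the two accepted shapes, the sketch below parses a JSON string rather than a file, and assumes the wrapper object uses the keys `cases` and `threshold`. The real signature, key names, and fields may differ:

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LoadedCases:
    # Field names are assumptions made for this sketch.
    cases: list[dict[str, Any]]
    threshold: Optional[float] = None


def load_benchmark_cases(raw: str) -> LoadedCases:
    data = json.loads(raw)
    if isinstance(data, list):
        # Bare array: cases only, no threshold.
        return LoadedCases(cases=data)
    if isinstance(data, dict) and isinstance(data.get("cases"), list):
        # Object wrapper: cases plus an optional threshold.
        return LoadedCases(cases=data["cases"], threshold=data.get("threshold"))
    raise ValueError("expected a JSON array of cases or an object with a 'cases' list")


bare = load_benchmark_cases('[{"input": "hi"}]')
wrapped = load_benchmark_cases('{"threshold": 0.8, "cases": [{"input": "hi"}]}')
print(bare.threshold, wrapped.threshold)
```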
- Added a `_safe_render` function to escape Rich markup and convert markdown to Rich format.
- Implemented token-by-token streaming for agent responses in the TUI, improving user experience during interactions.
- Updated the CLI to allow selection of LLM providers and models, enhancing flexibility in agent creation.
- Refactored benchmark case paths to use a `tests` directory instead of `benchmarks`.
- Introduced a `last_stream_result` property in the `NewAgent` class to retrieve the latest streaming response.
These changes aim to provide a more interactive and user-friendly experience in managing agents within the CrewAI framework.
- Introduced a new `create_agent` command for interactive agent definition.
- Added `agent_tui.py` for a conversational TUI supporting multi-agent interactions.
- Updated CLI to support agent creation and training workflows.
- Enhanced `.gitignore` to exclude demo files and configuration artifacts.
- Implemented a benchmark runner for testing agent performance against defined cases.
This commit lays the groundwork for a more interactive and user-friendly experience in managing agents within the CrewAI framework.
In `_execute_task_with_a2a` and its async variant, the try body
sets `task.output_pydantic = None` before returning an A2A
response. The finally block then checks
`if task.output_pydantic is not None` before restoring the
original value — but since it was just set to None, the condition
is always False and the original value is never restored. This
permanently mutates the Task object.
Remove the guard so `output_pydantic` is unconditionally restored,
matching the unconditional restoration of `description` and
`response_model` in the same block.
Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>
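
The effect of the guard can be shown with a stand-in object. `FakeTask` and both functions are hypothetical simplifications of the A2A execution path:

```python
class FakeTask:
    """Stand-in for Task; only output_pydantic matters for this sketch."""

    def __init__(self, output_pydantic):
        self.output_pydantic = output_pydantic


def execute_guarded(task: FakeTask) -> str:
    original = task.output_pydantic
    try:
        task.output_pydantic = None
        return "a2a response"
    finally:
        # Always False on this path: the try body just set the value to None.
        if task.output_pydantic is not None:
            task.output_pydantic = original


def execute_fixed(task: FakeTask) -> str:
    original = task.output_pydantic
    try:
        task.output_pydantic = None
        return "a2a response"
    finally:
        task.output_pydantic = original  # unconditional restore


before, after = FakeTask(dict), FakeTask(dict)
execute_guarded(before)
execute_fixed(after)
print(before.output_pydantic, after.output_pydantic)
```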
When a tool with result_as_answer=True raises an exception, the agent
was receiving result_as_answer=True and returning the error string as
the final answer. Now we set result_as_answer=False when an error event
is emitted, allowing the agent to reflect and retry.
Fixes crewAIInc/crewAI#5156
---------
Co-authored-by: NIK-TIGER-BILL <nik.tiger.bill@github.com>
Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>
## Summary
- Reverts `b0e2fda` ("fix(flow): add execution_id separate from state.id", COR-48): removes `Flow.execution_id` and points `current_flow_id` / `current_flow_request_id` back at `flow_id` (i.e. `state.id`). The separate per-run tracking id was no longer the right abstraction once `restore_from_state_id` reshapes how `state.id` is assigned.
- Adds an optional `restore_from_state_id` kwarg to `Flow.kickoff` / `Flow.kickoff_async` that hydrates state from a previously-persisted flow's latest snapshot
- Reassigns `state.id` to a fresh value (or `inputs["id"]` if pinned) so the new run's `@persist` writes don't extend the source's history
- Existing `inputs["id"]` resume, `@persist`, and `from_checkpoint` paths are unchanged
## Problem
`@persist` only supports *resume* today: `kickoff(inputs={"id": <uuid>})` hydrates state and continues writing under the same `flow_uuid`. There's no way to **fork** — hydrate from a snapshot but persist under a separate key, leaving the source's history intact. This PR adds that.
| | `state.id` after kickoff | `@persist` writes land under |
|---|---|---|
| `inputs["id"]` (resume) | supplied id | supplied id (extends history) |
| `restore_from_state_id` (fork) | fresh id, or `inputs["id"]` if pinned | new id (source preserved) |
## Behavior
| `inputs.id` | `restore_from_state_id` | Effect |
|---|---|---|
| — | — | Fresh kickoff |
| set | — | Existing resume |
| — | UUID | Fork — new `state.id`, hydrated from source |
| set | UUID | Fork into a pinned `state.id`, hydrated from source |
- Source not found → silent fallback (mirrors existing resume)
- Both `from_checkpoint` and `restore_from_state_id` set → `ValueError`
- `restore_from_state_id=None` → byte-identical to current main
## Design
Fork hydration runs before the existing `inputs` block in `kickoff_async`. On a hit, it calls the same `_restore_state` primitive used by resume, then overwrites `state.id` with a fresh UUID (or `inputs["id"]`). A `fork_succeeded` flag gates the existing `inputs["id"]` path so we don't double-load. `_completed_methods` / `_is_execution_resuming` are intentionally untouched — skip-completed-methods remains the territory of `apply_checkpoint` and `from_pending`.
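
The behavior table and the gating described above can be modeled as a small pure function. This is a sketch of the decision logic only: `plan_kickoff` and its `source_found` parameter are hypothetical, and the not-found case is assumed to fall through to the existing resume/fresh paths.

```python
import uuid
from typing import Optional


def plan_kickoff(
    inputs_id: Optional[str],
    restore_from_state_id: Optional[str],
    from_checkpoint: Optional[object] = None,
    source_found: bool = True,
) -> tuple[str, str]:
    """Return (mode, state_id after kickoff) for one run."""
    if from_checkpoint is not None and restore_from_state_id is not None:
        raise ValueError("from_checkpoint and restore_from_state_id are mutually exclusive")
    fork_succeeded = restore_from_state_id is not None and source_found
    if fork_succeeded:
        # State is hydrated from the source, then re-keyed so the new run's
        # writes do not extend the source's history.
        return "fork", inputs_id or str(uuid.uuid4())
    if inputs_id is not None:
        return "resume", inputs_id
    return "fresh", str(uuid.uuid4())


print(plan_kickoff("pinned-id", "source-id"))
```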
## Test plan
- [ ] `pytest tests/test_flow_persistence.py` — 5 new tests (four-row matrix, not-found fallback, default no-op, conflict raise) + 6 existing as regression
- [ ] `pytest tests/test_flow.py` — broader flow suite
- [ ] Manual end-to-end against an HITL `@persist` flow
* feat(crewai-tools): add highlights to ExaSearchTool, rename from EXASearchTool
- Add a highlights init param so agents can get token-efficient excerpts instead of full pages
- Rename EXASearchTool to ExaSearchTool; keep EXASearchTool as a deprecated alias so existing imports keep working
- Update the docs and example to use highlights as the recommended option
- Add a small note that says Exa is the fastest and most accurate web search API
- Add tests for the new highlights param and the deprecation alias
* fix(crewai-tools): import order and module-level Exa for tests
- Reorder std-lib imports so ruff is happy with force-sort-within-sections.
- Import Exa at module level (with a fallback) so the existing test mocks resolve.
The lazy install prompt still works if exa_py is missing.
- Allow content and summary to be a dict, matching highlights.
- Trim test file to the cases this PR introduces (highlights param and the
EXASearchTool deprecation alias). Existing init-shape tests stay.
Co-Authored-By: ishan <ishan@exa.ai>
* chore(crewai-tools): drop self-explanatory comment on schema alias
Co-Authored-By: ishan <ishan@exa.ai>
* docs(crewai-tools): default highlights to True, drop summary from examples
Co-Authored-By: ishan <ishan@exa.ai>
* docs(crewai-tools): simplify highlights examples to highlights=True
Co-Authored-By: ishan <ishan@exa.ai>
* feat(crewai-tools): add x-exa-integration header for usage tracking
Co-Authored-By: ishan <ishan@exa.ai>
* docs(crewai-tools): add Exa MCP section and resources links
Co-Authored-By: ishan <ishan@exa.ai>
---------
Co-authored-by: ishan <ishan@exa.ai>
Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>
Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>
* feat(azure): forward credential_scopes to Azure AI Inference client
Adds a credential_scopes field to the native Azure AI Inference
provider and a matching AZURE_CREDENTIAL_SCOPES env var
(comma-separated). The value is forwarded to ChatCompletionsClient /
AsyncChatCompletionsClient when set, letting keyless / Entra-based
callers target a specific Azure AD audience (e.g.
https://cognitiveservices.azure.com/.default) without subclassing the
provider. Matches the upstream azure.ai.inference SDK kwarg of the
same name.
Lazy build re-reads the env var so an LLM constructed at module
import (before deployment env vars are set) still picks up scopes —
same pattern as the existing AZURE_API_KEY / AZURE_ENDPOINT lazy
reads. to_config_dict round-trips the field.
* refactor(azure): tighten credential_scopes env handling
Address review feedback:
- Move os.getenv into the helper so AZURE_CREDENTIAL_SCOPES appears once
- Match the surrounding api_key/endpoint `or` style in the validator
- Drop the list() defensive copy in to_config_dict — every other field
in that method (and the base class's `stop`) is assigned by reference
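
A helper along the lines described in the review feedback might look like this. The function name, the `env` parameter, and the whitespace trimming are assumptions; only the AZURE_CREDENTIAL_SCOPES variable and its comma-separated format come from the commit message.

```python
import os
from typing import Mapping, Optional


def credential_scopes_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[list[str]]:
    """Parse AZURE_CREDENTIAL_SCOPES into a list, or None when unset/empty."""
    source = os.environ if env is None else env
    raw = source.get("AZURE_CREDENTIAL_SCOPES")
    if not raw:
        return None
    scopes = [part.strip() for part in raw.split(",") if part.strip()]
    return scopes or None


scopes = credential_scopes_from_env(
    {"AZURE_CREDENTIAL_SCOPES": "https://cognitiveservices.azure.com/.default"}
)
# Only forward the kwarg when scopes are set, as the commit describes.
client_kwargs = {"credential_scopes": scopes} if scopes else {}
print(client_kwargs)
```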