- benchmark verbose path: pass on_progress callback the same way as
the non-verbose path (was missing entirely)
- _train_new_agents: replace per-case asyncio.run() with a single
event loop (new_event_loop / run_until_complete / close) to avoid
creating and destroying a loop on every case iteration
- format_results_table: use case_index + 1 so the '#' column is
1-based, matching the display in _test_new_agents failed output
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
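
A minimal sketch of the single-loop pattern described for _train_new_agents above. `run_case` and `train_all` are hypothetical stand-ins, not the real functions; only the new_event_loop / run_until_complete / close sequence is taken from the change.

```python
import asyncio


async def run_case(case: str) -> str:
    """Hypothetical stand-in for training an agent on one case."""
    await asyncio.sleep(0)
    return case.upper()


def train_all(cases: list[str]) -> list[str]:
    # Create one loop for the whole batch instead of calling
    # asyncio.run() (which builds and tears down a loop) per case.
    loop = asyncio.new_event_loop()
    try:
        return [loop.run_until_complete(run_case(c)) for c in cases]
    finally:
        # Always release the loop, even if a case raises.
        loop.close()


results = train_all(["a", "b"])
```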
- Replace S101 assert guards with explicit if/raise RuntimeError in
benchmark.py and cli.py (3 locations)
- Fix test_create_llm_from_env_with_unaccepted_attributes to use
DEFAULT_LLM_MODEL with clear=True so the assertion isn't brittle
against the hardcoded model name
- Add n_iterations loop to _test_new_agents (the parameter was previously
unused; the loop now mirrors the _train_new_agents iteration pattern)
- Consolidate dotenv loading in cli.py and agent_tui.py to use the
existing load_env_vars() from utils.py instead of duplicating logic
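
For illustration, the S101 change above amounts to the following shape. `require_agent` is a hypothetical helper, not one of the three real locations; the point is that an explicit raise still runs under `python -O`, while an assert is stripped.

```python
from typing import Optional


def require_agent(agent: Optional[object]) -> object:
    # Before: `assert agent is not None` (flagged by S101 and removed
    # entirely when Python runs with -O).
    if agent is None:
        raise RuntimeError("agent was not initialised")
    return agent
```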
Add missing type annotations to benchmark.py context managers, event
handlers, LoadedCases iteration methods, and fix union-attr on BaseLLM.
Fix no-any-return errors in agent_tui.py and make action_quit async to
match the Textual App supertype. Add type annotations to
_BenchmarkLiveProgress methods in cli.py and fix icon redefinition.
- Add cast import and use cast() to fix no-any-return errors in _find_tool_class
- Add dict[str, Any] type params to fix type-arg errors in parse_agent_definition/load_agent_from_definition
- Add # type: ignore[import-untyped] for jsonschema import
- Fix A2AClientConfig call-arg: url -> endpoint
- Cast llm to BaseLLM when passing to LLMGuardrail
- Cast tool attr to type[Any] to allow instantiation
- Add # type: ignore[import-not-found] for DirectoryKnowledgeSource import
- Use MCPServerHTTP instead of non-callable MCPServerConfig union alias
- Add explicit list[Any] type annotation for resolved variable
- test_lite_agent_standalone_still_works: replace real LLM with Mock to
avoid ConnectionError hitting OpenAI in CI
- coworker_tools.py:352: add # type: ignore[import-not-found] for crewai.a2a.client
- coworker_tools.py:415: filter BaseException instances from gather results
so return type matches list[str]
- executor.py:740: add # type: ignore[import-not-found] for checkpoint_events
- executor.py:2245: guard r.content access with isinstance(r, Message) check
- flow.py:3259: cast model_dump() result to dict[str, Any]
- flow.py: fix response/future no-redef errors by hoisting declarations
and renaming coro_future to avoid duplicate type annotations
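
A sketch of the gather filtering described for coworker_tools.py above, assuming the results come from `asyncio.gather(..., return_exceptions=True)`. `ask` and `ask_all` are hypothetical names used only for this example.

```python
import asyncio


async def ask(name: str) -> str:
    """Hypothetical coworker call; "bad" simulates a failure."""
    if name == "bad":
        raise ValueError("coworker failed")
    return f"reply from {name}"


async def ask_all(names: list[str]) -> list[str]:
    # With return_exceptions=True, failures come back as exception
    # objects mixed in with the successful results.
    results = await asyncio.gather(
        *(ask(n) for n in names), return_exceptions=True
    )
    # Drop the exception instances so the value really is list[str].
    return [r for r in results if not isinstance(r, BaseException)]


replies = asyncio.run(ask_all(["a", "bad", "b"]))
```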
- Format: auto-reformat agent_tui.py, benchmark.py, coworker_tools.py via ruff
- Lint: 0 remaining errors after format pass
- Mypy: fix _NullPrinter to subclass Printer for type compatibility in
executor.py, planning.py, and skill_builder.py; add isinstance(r, Message)
guards in spawn_tools.py; annotate return types and fix dict type params
and MCPToolResolver logger type in new_agent.py; add missing printer args
to get_llm_response calls
- cli.py: fix _read_config to use sentinel so falsy values (0, false) are
returned correctly instead of being treated as missing keys
- create_agent.py: replace regex-based JSONC comment stripper with a
token-aware parser that preserves // inside quoted strings (e.g. URLs)
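
The sentinel approach mentioned for _read_config can be sketched as below. This is not the actual implementation: `read_config`, the dotted-key lookup, and the `_MISSING` name are assumptions made for the example.

```python
from typing import Any

# Unique object that can never appear as a real config value.
_MISSING = object()


def read_config(config: dict[str, Any], dotted_key: str, default: Any = None) -> Any:
    node: Any = config
    for part in dotted_key.split("."):
        # Only "key absent" maps to the sentinel; 0, False and "" do not.
        node = node.get(part, _MISSING) if isinstance(node, dict) else _MISSING
        if node is _MISSING:
            return default
    return node
```

A truthiness check such as `value or default` would replace a configured `0` or `false` with the default, which is the failure mode the change describes.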
- Added checks for read-only memory settings in `ConversationalAgentExecutor` to prevent modifications when memory is not writable.
- Improved memory extraction logic to include metadata when remembering memories, enhancing context retention.
- Updated logging from debug to warning level for memory initialization and save failures, ensuring better visibility of issues.
These changes aim to improve the robustness and clarity of memory management within the CrewAI framework.
- Added a check for `None` tools in the `_tool_has_arun` method to prevent errors during tool validation.
- Improved the logic to exclude tools from the `crewai.tools.base_tool` module when determining if they have a real async `_arun` method, ensuring more accurate tool handling.
These changes aim to improve the robustness of tool validation within the CrewAI framework.
- Extract `_strip_jsonc` as the single shared helper in `create_agent.py`,
replacing the three duplicate implementations in `agent_tui.py`,
`benchmark.py`, and the inline regex in `cli.py::_read_config`.
- Apply `_strip_jsonc` (including trailing-comma removal) inside
`_read_config` so JSONC config.json files are parsed correctly.
- Add `if progress is not None:` guard inside `_make_progress_cb._cb`
to prevent a `NoneType` call when running in verbose mode.
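
As a rough illustration of what a token-aware JSONC helper with trailing-comma removal can look like, here is a standalone sketch. It is written from the description above, not copied from `_strip_jsonc`; the function name `strip_jsonc` and its exact behaviour are assumptions.

```python
import json


def strip_jsonc(text: str) -> str:
    out: list[str] = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip to the newline (the newline itself is kept).
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Block comment: skip past the closing */ (or to end of input).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            if ch in "]}":
                # Remove a trailing comma that precedes the closing bracket.
                j = len(out) - 1
                while j >= 0 and out[j].isspace():
                    j -= 1
                if j >= 0 and out[j] == ",":
                    del out[j]
            out.append(ch)
            i += 1
    return "".join(out)


sample = '{\n  // comment\n  "url": "http://x.io/a", /* c */\n  "n": [1, 2,],\n}'
parsed = json.loads(strip_jsonc(sample))
```

Because comments are only recognised outside string literals, the `//` in the URL survives, which a plain regex substitution would not guarantee.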
- Added `_arun` methods to `DelegateToCoworkerTool`, `MultiDelegateTool`, and `SpawnSubtaskTool` classes to support asynchronous task delegation and spawning, enhancing non-blocking operations.
- Introduced event emissions for delegation and spawning processes, allowing for better tracking of task states and outcomes.
- Implemented error handling and logging for async operations, ensuring robust execution and feedback during agent interactions.
These enhancements aim to optimize the performance and responsiveness of agent task management within the CrewAI framework.
- Introduced a new `ChatTextArea` class to enhance multiline chat input functionality, allowing users to submit messages with Enter and insert newlines with Shift+Enter.
- Updated the TUI layout to replace the previous input method with `ChatTextArea`, improving user experience during chat interactions.
- Removed unused sidebar actions and adjusted input row styling for better visual consistency.
These changes aim to streamline chat interactions within the CrewAI framework, providing a more intuitive input experience.
- Introduced a `verbose` flag in the CLI for the `test` and `benchmark` commands to enable detailed logging of agent execution, including tool calls and LLM responses.
- Updated the `_run_model_benchmark` and `_test_new_agents` functions to accept the `verbose` parameter, allowing for enhanced debugging during benchmark runs.
- Implemented a `verbose_benchmark_output` context manager to manage logging output when verbose mode is enabled, improving the visibility of agent interactions.
These changes enhance the debugging capabilities of the CrewAI framework, providing users with more insights during testing and benchmarking processes.
- cli.py: use s.get('done',0)+1 instead of max(s['done'], event['case_index']) for correct progress counting
- cli.py: use explicit 'is not None' check for config_threshold to avoid treating 0.0 as falsy
- cli.py: remove unused agent_count variable
- constants.py + create_agent.py: add key_name to ollama ENV_VARS entry so API_BASE is correctly saved to OPENAI_API_BASE
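
The `is not None` fix for config_threshold can be pictured with the sketch below. `resolve_threshold` and the fallback value 0.7 are hypothetical and chosen only for the example.

```python
from typing import Optional


def resolve_threshold(
    cli_value: Optional[float],
    config_value: Optional[float],
    fallback: float = 0.7,  # hypothetical default, not taken from the code
) -> float:
    # `cli_value or config_value or fallback` would skip a legitimate 0.0,
    # so each source is compared against None explicitly.
    if cli_value is not None:
        return cli_value
    if config_value is not None:
        return config_value
    return fallback
```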
Rename suppress_benchmark_output → SuppressBenchmarkOutput and
artifacts_sandbox → ArtifactsSandbox (N801 CapWords), and drop unused
loop variable to use dict.values() (PERF102).
- B904: raise KeyboardInterrupt from err in cli_provider.py
- mypy: add TYPE_CHECKING import for SQLiteConversationStorage, annotate
_initialized class var in TaskScheduler, fix Match type params and
no-any-return errors in create_agent.py
- tests: mock aget_llm_response in 3 integration tests that fail when
network is blocked but OPENAI_API_KEY is set
- flow.py: use asyncio.run_coroutine_threadsafe() instead of asyncio.run()
when a loop is already running in ask() and say()
- cli.py: fix threshold=0.0 treated as falsy by using `is not None` check
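
A sketch of the flow.py idea above, under the assumption that the synchronous caller runs on a different thread from the already-running loop. `fetch_answer` and `ask_sync` are hypothetical names; the real ask() and say() may obtain the loop differently.

```python
import asyncio
import threading
from typing import Optional


async def fetch_answer(question: str) -> str:
    """Hypothetical coroutine standing in for the real async work."""
    await asyncio.sleep(0)
    return f"answer to {question}"


def ask_sync(question: str, loop: Optional[asyncio.AbstractEventLoop] = None) -> str:
    if loop is not None and loop.is_running():
        # A loop is already running elsewhere: hand the coroutine to it
        # and block this (different) thread until the result is ready.
        future = asyncio.run_coroutine_threadsafe(fetch_answer(question), loop)
        return future.result(timeout=5)
    # No running loop: asyncio.run() is safe to use.
    return asyncio.run(fetch_answer(question))


# Demo: run a loop in a background thread and wait until it is live.
loop = asyncio.new_event_loop()
started = threading.Event()
loop.call_soon_threadsafe(started.set)
thread = threading.Thread(target=loop.run_forever, daemon=True)
thread.start()
started.wait(timeout=5)

with_loop = ask_sync("q1", loop)
without_loop = ask_sync("q2")

loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
loop.close()
```

Note that blocking on `future.result()` from the loop's own thread would deadlock, so this pattern only applies when the caller is on another thread.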
- Added a `CreateRoomScreen` modal for creating new rooms with agent selection and engagement options.
- Updated the main TUI layout to include a sidebar for room management, allowing users to create and switch between rooms.
- Enhanced the configuration handling to support room definitions and engagement modes.
- Refactored existing code to accommodate new room functionalities and improve overall structure.
These changes enhance the user experience by enabling better organization and interaction with multiple agents in the CrewAI framework.
- Introduced a new `LoadedCases` class to encapsulate benchmark cases and optional thresholds, improving data management.
- Updated `load_benchmark_cases` function to support loading cases from both bare arrays and object wrappers with a threshold.
- Modified CLI options to allow dynamic threshold configuration, defaulting to a value from `config.json` if not specified.
- Enhanced error handling for invalid benchmark case formats and added tests to validate new functionality.
These changes aim to improve the flexibility and usability of benchmark case management within the CrewAI framework.
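
To make the two accepted file shapes concrete, here is a simplified sketch. `parse_cases` is a hypothetical stand-in for `load_benchmark_cases` that takes a JSON string rather than a path, and the `cases` and `threshold` key names, along with the dataclass layout, are assumptions.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LoadedCases:
    cases: list[dict[str, Any]]
    threshold: Optional[float] = None


def parse_cases(raw: str) -> LoadedCases:
    data = json.loads(raw)
    if isinstance(data, list):
        # Bare array: cases only, no threshold.
        return LoadedCases(cases=data)
    if isinstance(data, dict) and isinstance(data.get("cases"), list):
        # Object wrapper: cases plus an optional threshold.
        return LoadedCases(cases=data["cases"], threshold=data.get("threshold"))
    raise ValueError("expected a JSON array or an object with a 'cases' array")


bare = parse_cases('[{"input": "hi"}]')
wrapped = parse_cases('{"threshold": 0.8, "cases": [{"input": "hi"}]}')
```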
- Added functionality to load environment variables from a `.env` file if it exists, improving configuration management.
- Updated the CLI to fall back to a `benchmarks` directory for test cases if the `tests` directory is not found, ensuring compatibility with previous project structures.
- Refactored benchmark case path handling to streamline testing processes.
These changes aim to improve the usability and flexibility of the CrewAI CLI in various project setups.
- Added a `_safe_render` function to escape Rich markup and convert markdown to Rich format.
- Implemented token-by-token streaming for agent responses in the TUI, improving user experience during interactions.
- Updated the CLI to allow selection of LLM providers and models, enhancing flexibility in agent creation.
- Refactored benchmark case paths to use a `tests` directory instead of `benchmarks`.
- Introduced a `last_stream_result` property in the `NewAgent` class to retrieve the latest streaming response.
These changes aim to provide a more interactive and user-friendly experience in managing agents within the CrewAI framework.
- Introduced a new `create_agent` command for interactive agent definition.
- Added `agent_tui.py` for a conversational TUI supporting multi-agent interactions.
- Updated CLI to support agent creation and training workflows.
- Enhanced `.gitignore` to exclude demo files and configuration artifacts.
- Implemented a benchmark runner for testing agent performance against defined cases.
This commit lays the groundwork for a more interactive and user-friendly experience in managing agents within the CrewAI framework.
In `_execute_task_with_a2a` and its async variant, the try body
sets `task.output_pydantic = None` before returning an A2A
response. The finally block then checks
`if task.output_pydantic is not None` before restoring the
original value — but since it was just set to None, the condition
is always False and the original value is never restored. This
permanently mutates the Task object.
Remove the guard so `output_pydantic` is unconditionally restored,
matching the unconditional restoration of `description` and
`response_model` in the same block.
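
The bug and the fix can be reproduced in isolation with the sketch below. `FakeTask`, `run_with_guard`, and `run_fixed` are hypothetical and only mirror the try/finally shape described above; they are not the real `_execute_task_with_a2a` code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeTask:
    """Hypothetical stand-in for Task with just the relevant field."""
    output_pydantic: Optional[type] = None


def run_with_guard(task: FakeTask) -> str:
    original = task.output_pydantic
    try:
        task.output_pydantic = None
        return "a2a response"
    finally:
        # Buggy: the field was just set to None, so this never fires.
        if task.output_pydantic is not None:
            task.output_pydantic = original


def run_fixed(task: FakeTask) -> str:
    original = task.output_pydantic
    try:
        task.output_pydantic = None
        return "a2a response"
    finally:
        # Fixed: restore unconditionally.
        task.output_pydantic = original


buggy, fixed = FakeTask(dict), FakeTask(dict)
run_with_guard(buggy)
run_fixed(fixed)
```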
Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>