crewAI

mirror of https://github.com/crewAIInc/crewAI.git synced 2026-07-01 21:28:10 +00:00

Author	SHA1	Message	Date
Joao Moura	db604b6f32	fix: ruff formatting and mypy type error Run ruff format on agent_tui.py, cli.py, executor.py. Fix agents_dir argument type: pass Path object instead of str to match the load_agent_from_definition signature (Path | None). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-14 17:48:30 -04:00
Joao Moura	2d82896d71	fix: address PR review comments — lint, threshold, dedup, agents_dir - Remove redundant local `import asyncio` in executor.py that caused ruff F823 (local variable referenced before assignment) - Clear progress state before creating Live display (fixes flash) - Use threshold-based passed in _save_run_results so persisted results match CLI output - Pass agents_dir to load_agent_from_definition in _train_new_agents so coworker references resolve correctly - Deduplicate verbose/non-verbose benchmark execution blocks into single context-manager expression Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-14 16:48:17 -04:00
Joao Moura	126d0010ba	fix: summary tag handling, TUI autocomplete focus, tool output flooding 1. Summary tags: Reverse the logic — for models like gpt-4.1 that wrap their actual response in <summary> tags (with thinking/CoT before it), extract the inner content instead of stripping it. Streaming uses a preflight buffer that waits for <summary>; if none appears, flushes everything normally. 2. TUI autocomplete: Change @mention accept key from Tab to right-arrow so autocomplete doesn't steal focus from the input widget. Only triggers when there's an active mention context with matches. 3. Tool output: Truncate tool results >4000 chars in LLM message history to prevent the model from echoing full file contents. Add soul-layer instruction telling the agent to summarize tool results rather than repeating them verbatim. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-14 16:17:44 -04:00
Joao Moura	2eb7e15f89	fix: address remaining PR review comments — null guard, markup escaping, empty criteria - Add null check after _load_agent() in benchmark runner (agent can return None on circular refs) - Escape user-sourced content in Rich markup via _safe_render() in memory panel and skills list - Default to passed=True when benchmark case has neither expected nor criteria Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-14 14:04:40 -04:00
Joao Moura	16488f5fe5	fix: address PR #5788 review comments - Remove dead `env_vars.get("MODEL")` check in _setup_env (always truthy since MODEL is set two lines above) - Fix test_sync_delegation mock: use return_value instead of side_effect list and disable planning to prevent StopAsyncIteration on Python 3.10 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-14 13:08:49 -04:00
alex-clawd	f723e69410	fix: address bugbot review comments and CI failures Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 21:37:40 -07:00
Joao Moura	ef39974bd8	feat: add configurable case timeout for benchmarking and testing - Introduced a case_timeout parameter in the benchmark and test functions to allow dynamic timeout settings. - Updated the project configuration template to include a default case_timeout value of 90 seconds. - Enhanced the handling of timeouts in benchmark results to reflect the configured case_timeout.	2026-05-14 00:28:44 -04:00
Joao Moura	2897535799	feat: enhance benchmarking and evaluation features - Introduced a new judge tool for submitting evaluation scores with structured parameters. - Added a function to parse judge results from various response formats. - Updated the benchmark command to handle iterations more effectively, allowing configuration from the command line or config file. - Implemented a method to save run results to a JSON file for better tracking of test outcomes. - Enhanced progress display to show current iteration during benchmark runs. - Updated project configuration template to clarify test iteration settings.	2026-05-14 00:23:32 -04:00
Joao Moura	8f3196e1cf	feat: enhance ChatTextArea with @mention autocomplete and improve UI feedback - Updated ChatTextArea to support @mention autocomplete using Tab for completion. - Added MentionChanged message to handle autocomplete state changes. - Improved user experience by displaying a hint for available mentions. - Enhanced error handling in AgentTUI for agent message timeouts. - Updated rendering logic to ensure proper display of system messages with Rich markup.	2026-05-13 17:17:55 -04:00
alex-clawd	92b24334d5	fix: move progress.start() into try block and use shared event loop in benchmark command - Move progress.start() inside the try block so the finally clause never calls progress.stop() on an un-started display - Replace asyncio.run() with new_event_loop/run_until_complete/loop.close() pattern, consistent with _test_new_agents and _train_new_agents Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:36:56 -07:00
alex-clawd	4bcb72f951	fix: use _safe_render for system messages to preserve Rich markup	2026-05-13 12:31:16 -07:00
alex-clawd	023bb7e6b8	fix: address three review comments on env/cli handling - write_env_file: remove .upper() to preserve original key case - load_env_vars: strip surrounding single/double quotes from values - constants.py: fix Ollama key_name from OPENAI_API_BASE to OLLAMA_HOST - _test_new_agents: replace asyncio.run() loop with new_event_loop + run_until_complete Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	b5396ea290	fix: reset progress state between iterations + use set for agents_trained	2026-05-13 12:28:25 -07:00
alex-clawd	006a2d5944	fix: use 1-based case_index in print_results_chart for consistency	2026-05-13 12:28:25 -07:00
alex-clawd	e9a59ab25c	fix: count unique agents instead of agent-iterations in test output	2026-05-13 12:28:25 -07:00
alex-clawd	a723d991f5	fix: address three review comments on benchmark/test CLI - benchmark verbose path: pass on_progress callback the same way as the non-verbose path (was missing entirely) - _train_new_agents: replace per-case asyncio.run() with a single event loop (new_event_loop / run_until_complete / close) to avoid creating and destroying a loop on every case iteration - format_results_table: use case_index + 1 so the '#' column is 1-based, matching the display in _test_new_agents failed output Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	74bf197ccb	fix: resolve lint, test, and review issues - Replace S101 assert guards with explicit if/raise RuntimeError in benchmark.py and cli.py (3 locations) - Fix test_create_llm_from_env_with_unaccepted_attributes to use DEFAULT_LLM_MODEL with clear=True so the assertion isn't brittle against the hardcoded model name - Add n_iterations loop to _test_new_agents (was unused, now mirrors _train_new_agents iteration pattern) - Consolidate dotenv loading in cli.py and agent_tui.py to use the existing load_env_vars() from utils.py instead of duplicating logic Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	68fb64f383	fix: resolve all mypy type errors in CLI files Add missing type annotations to benchmark.py context managers, event handlers, LoadedCases iteration methods, and fix union-attr on BaseLLM. Fix no-any-return errors in agent_tui.py and make action_quit async to match the Textual App supertype. Add type annotations to _BenchmarkLiveProgress methods in cli.py and fix icon redefinition. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	48a861aa1a	fix: resolve all CI failures — format, lint, mypy, and review comments - Format: auto-reformat agent_tui.py, benchmark.py, coworker_tools.py via ruff - Lint: 0 remaining errors after format pass - Mypy: fix _NullPrinter to subclass Printer for type compatibility in executor.py, planning.py, and skill_builder.py; add isinstance(r, Message) guards in spawn_tools.py; annotate return types and fix dict type params and MCPToolResolver logger type in new_agent.py; add missing printer args to get_llm_response calls - cli.py: fix _read_config to use sentinel so falsy values (0, false) are returned correctly instead of being treated as missing keys - create_agent.py: replace regex-based JSONC comment stripper with a token-aware parser that preserves // inside quoted strings (e.g. URLs) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	d744b37723	fix: deduplicate JSONC stripping, guard progress callback, and fix _read_config - Extract `_strip_jsonc` as the single shared helper in `create_agent.py`, replacing the three duplicate implementations in `agent_tui.py`, `benchmark.py`, and the inline regex in `cli.py::_read_config`. - Apply `_strip_jsonc` (including trailing-comma removal) inside `_read_config` so JSONC config.json files are parsed correctly. - Add `if progress is not None:` guard inside `_make_progress_cb._cb` to prevent a `NoneType` call when running in verbose mode. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	22bcced6c0	fix: add missing TabbedContent import and _rich_escape in agent_tui.py	2026-05-13 12:28:25 -07:00
Joao Moura	a0f4cb0d7a	feat: implement ChatTextArea for improved chat input handling - Introduced a new `ChatTextArea` class to enhance multiline chat input functionality, allowing users to submit messages with Enter and insert newlines with Shift+Enter. - Updated the TUI layout to replace the previous input method with `ChatTextArea`, improving user experience during chat interactions. - Removed unused sidebar actions and adjusted input row styling for better visual consistency. These changes aim to streamline chat interactions within the CrewAI framework, providing a more intuitive input experience.	2026-05-13 12:28:25 -07:00
alex-clawd	94b5e2ea7b	fix: address CI failures — ruff, mypy, mock OpenAI tests, JSONC support Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
Joao Moura	0ddedbc48a	feat: add verbose output option for benchmarking and testing - Introduced a `verbose` flag in the CLI for the `test` and `benchmark` commands to enable detailed logging of agent execution, including tool calls and LLM responses. - Updated the `_run_model_benchmark` and `_test_new_agents` functions to accept the `verbose` parameter, allowing for enhanced debugging during benchmark runs. - Implemented a `verbose_benchmark_output` context manager to manage logging output when verbose mode is enabled, improving the visibility of agent interactions. These changes enhance the debugging capabilities of the CrewAI framework, providing users with more insights during testing and benchmarking processes.	2026-05-13 12:28:25 -07:00
alex-clawd	c33fd82286	fix: address 4 new bugbot review comments - cli.py: use s.get('done',0)+1 instead of max(s['done'], event['case_index']) for correct progress counting - cli.py: use explicit 'is not None' check for config_threshold to avoid treating 0.0 as falsy - cli.py: remove unused agent_count variable - constants.py + create_agent.py: add key_name to ollama ENV_VARS entry so API_BASE is correctly saved to OPENAI_API_BASE Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	b3044a780e	fix: resolve remaining ruff lint errors Rename suppress_benchmark_output → SuppressBenchmarkOutput and artifacts_sandbox → ArtifactsSandbox (N801 CapWords), and drop unused loop variable to use dict.values() (PERF102). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	089656195d	fix: address remaining review comments — broken import, race condition, duplicate logic Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
alex-clawd	2ddc348ad2	fix: resolve lint, type-check, and test failures - B904: raise KeyboardInterrupt from err in cli_provider.py - mypy: add TYPE_CHECKING import for SQLiteConversationStorage, annotate _initialized class var in TaskScheduler, fix Match type params and Returning Any in create_agent.py - tests: mock aget_llm_response in 3 integration tests that fail when network is blocked but OPENAI_API_KEY is set - flow.py: use asyncio.run_coroutine_threadsafe() instead of asyncio.run() when a loop is already running in ask() and say() - cli.py: fix threshold=0.0 treated as falsy by using `is not None` check Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 12:28:25 -07:00
Joao Moura	75651f962d	feat: introduce room management and agent selection in TUI - Added a `CreateRoomScreen` modal for creating new rooms with agent selection and engagement options. - Updated the main TUI layout to include a sidebar for room management, allowing users to create and switch between rooms. - Enhanced the configuration handling to support room definitions and engagement modes. - Refactored existing code to accommodate new room functionalities and improve overall structure. These changes enhance the user experience by enabling better organization and interaction with multiple agents in the CrewAI framework.	2026-05-13 12:28:25 -07:00
Joao Moura	fc85637e60	feat: enhance benchmark case loading and CLI threshold handling - Introduced a new `LoadedCases` class to encapsulate benchmark cases and optional thresholds, improving data management. - Updated `load_benchmark_cases` function to support loading cases from both bare arrays and object wrappers with a threshold. - Modified CLI options to allow dynamic threshold configuration, defaulting to a value from `config.json` if not specified. - Enhanced error handling for invalid benchmark case formats and added tests to validate new functionality. These changes aim to improve the flexibility and usability of benchmark case management within the CrewAI framework.	2026-05-13 12:28:25 -07:00
Joao Moura	813173c85f	Update benchmark	2026-05-13 12:28:25 -07:00
Joao Moura	4c33de86a9	feat: enhance CLI environment variable loading and benchmark path handling - Added functionality to load environment variables from a `.env` file if it exists, improving configuration management. - Updated the CLI to fallback to a `benchmarks` directory for test cases if the `tests` directory is not found, ensuring compatibility with previous project structures. - Refactored benchmark case path handling to streamline testing processes. These changes aim to improve the usability and flexibility of the CrewAI CLI in various project setups.	2026-05-13 12:28:25 -07:00
Joao Moura	6cb29dce65	feat: enhance agent TUI and CLI with streaming responses and model selection improvements - Added a `_safe_render` function to escape Rich markup and convert markdown to Rich format. - Implemented token-by-token streaming for agent responses in the TUI, improving user experience during interactions. - Updated the CLI to allow selection of LLM providers and models, enhancing flexibility in agent creation. - Refactored benchmark case paths to use a `tests` directory instead of `benchmarks`. - Introduced a `last_stream_result` property in the `NewAgent` class to retrieve the latest streaming response. These changes aim to provide a more interactive and user-friendly experience in managing agents within the CrewAI framework.	2026-05-13 12:28:25 -07:00
Joao Moura	fe7f730546	feat: add interactive agent creation and TUI for multi-agent interaction - Introduced a new `create_agent` command for interactive agent definition. - Added `agent_tui.py` for a conversational TUI supporting multi-agent interactions. - Updated CLI to support agent creation and training workflows. - Enhanced `.gitignore` to exclude demo files and configuration artifacts. - Implemented a benchmark runner for testing agent performance against defined cases. This commit lays the groundwork for a more interactive and user-friendly experience in managing agents within the CrewAI framework.	2026-05-13 12:28:25 -07:00
Greyson LaLonde	2034f2140a	feat: bump versions to 1.14.5a5	2026-05-13 02:54:13 +08:00
Greyson LaLonde	a09c4de2fd	feat: bump versions to 1.14.5a4	2026-05-09 03:08:22 +08:00
Cole Goeppinger	74a1ff8db5	feat: update llm listings Add the latest Anthropic and OpenAI LLMs to the CLI	2026-05-08 01:19:47 +08:00
Greyson LaLonde	d165bcb65f	fix(deps): move textual to crewai-cli and add certifi	2026-05-07 04:40:08 +08:00
Greyson LaLonde	e961a005cb	feat: bump versions to 1.14.5a3	2026-05-07 01:44:05 +08:00
Greyson LaLonde	93e786d263	refactor: extract CLI into standalone crewai-cli package	2026-05-06 20:46:46 +08:00

40 Commits
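
Commits 48a861aa1a and d744b37723 describe replacing a regex-based JSONC comment stripper with a token-aware parser that keeps `//` inside quoted strings (e.g. URLs) and also removes trailing commas. The repository's `_strip_jsonc` is not shown in this log, so the following is only a minimal sketch of that general technique. The name `strip_jsonc_sketch` and all of its details are assumptions for illustration, not the project's code.

```python
import json


def strip_jsonc_sketch(text: str) -> str:
    """Hypothetical helper: drop // and /* */ comments and trailing commas,
    leaving the contents of double-quoted strings untouched."""
    out: list[str] = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so an escaped quote does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip up to (not including) the newline.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Block comment: skip past the closing marker, or to the end if unterminated.
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            if ch in "}]":
                # Trailing comma: remove a comma that precedes the closer,
                # ignoring any whitespace in between.
                j = len(out) - 1
                while j >= 0 and out[j].isspace():
                    j -= 1
                if j >= 0 and out[j] == ",":
                    del out[j]
            out.append(ch)
            i += 1
    return "".join(out)


sample = """{
  // local endpoint
  "url": "http://localhost:11434", /* block comment */
  "items": [1, 2,],
}"""
config = json.loads(strip_jsonc_sketch(sample))
```

Because the scanner tracks whether it is inside a string, the `//` in the URL survives, which is the case a plain regex over the whole text tends to get wrong.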