Commit Graph

213 Commits

Author SHA1 Message Date
lorenzejay
fad23d804a Refactor PlannerObserver and StepExecutor to Utilize I18N for Prompts
This update enhances the PlannerObserver and StepExecutor classes by integrating the I18N utility for managing prompts and messages. The system and user prompts are now retrieved from the I18N module, allowing for better localization and maintainability. Additionally, the code has been cleaned up to remove hardcoded strings, improving readability and consistency across the planning and execution processes.
2026-02-13 13:40:04 -08:00
lorenzejay
ca89b729f8 dry 2026-02-13 13:33:55 -08:00
lorenzejay
7e09e01215 fixing tests 2026-02-12 14:26:29 -08:00
lorenzejay
eec88ad2bb cassette regen 2026-02-12 10:53:46 -08:00
lorenzejay
5d4ed12072 regen cassettes for test and fix test 2026-02-11 16:04:24 -08:00
lorenzejay
a164e94f49 Enhance PlanningConfig and AgentExecutor with Reasoning Effort Levels
This update introduces a new  attribute in the  class, allowing users to customize the observation and replanning behavior during task execution. The  class has been modified to utilize this new attribute, routing step observations based on the specified reasoning effort level: low, medium, or high.

Additionally, tests have been added to validate the functionality of the reasoning effort levels, ensuring that the agent behaves as expected under different configurations. This enhancement improves the adaptability and efficiency of the planning process in agent execution.
2026-02-11 13:55:02 -08:00
lorenzejay
576345140f fix 2026-02-10 17:38:49 -08:00
lorenzejay
8fd7ef7f43 linted 2026-02-10 17:14:38 -08:00
lorenzejay
9cac1792bd regen tests 2026-02-10 17:05:47 -08:00
lorenzejay
b2de783559 Merge branch 'lorenze/feat/plan-execute-pattern' of github.com:crewAIInc/crewAI into lorenze/feat/planning-pt-3-todo-list-execution 2026-02-10 16:58:14 -08:00
lorenzejay
d77e2cb1f8 Merge branch 'lorenze/feat/plan-execute-pattern' of github.com:crewAIInc/crewAI into lorenze/feat/plan-execute-pattern 2026-02-10 16:10:20 -08:00
Lorenze Jay
a6dcb275e1 Lorenze/feat planning pt 2 todo list gen (#4449)
* feat: introduce PlanningConfig for enhanced agent planning capabilities

This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.

* dropping redundancy

* fix test

* revert handle_reasoning here

* refactor: update reasoning handling in Agent class

This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.

* improve planning prompts

* matching

* refactor: remove default enabled flag from PlanningConfig in Agent class

* more cassettes

* fix test

* feat: enhance agent planning with structured todo management

This commit introduces a new planning system within the AgentExecutor class, allowing for the creation of structured todo items from planning steps. The TodoList and TodoItem models have been added to facilitate tracking of plan execution. The reasoning plan now includes a list of steps, improving the clarity and organization of agent tasks. Additionally, tests have been added to validate the new planning functionality and ensure proper integration with existing workflows.

* refactor: update planning prompt and remove deprecated methods in reasoning handler

* improve planning prompt

* improve handler

* linted

* linted
2026-02-10 16:08:26 -08:00
Lorenze Jay
79a01fca31 feat: introduce PlanningConfig for enhanced agent planning capabilities (#4344)
* feat: introduce PlanningConfig for enhanced agent planning capabilities

This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.

* dropping redundancy

* fix test

* revert handle_reasoning here

* refactor: update reasoning handling in Agent class

This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.

* improve planning prompts

* matching

* refactor: remove default enabled flag from PlanningConfig in Agent class

* more cassettes

* fix test

* refactor: update planning prompt and remove deprecated methods in reasoning handler

* improve planning prompt
2026-02-10 13:26:49 -08:00
Lucas Gomide
a3bee66be8 Address OpenSSL CVE-2025-15467 vulnerability (#4426)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
* fix(security): bump regex from 2024.9.11 to 2026.1.15

Address security vulnerability flagged in regex==2024.9.11

* bump mcp from 1.23.1 to 1.26.0

Address security vulnerability flagged in mcp==1.16.0 (resolved to 1.23.3)
2026-02-10 09:39:35 -08:00
lorenzejay
735a2204fd refactor: implement structured output handling in final answer synthesis
This commit enhances the final answer synthesis process in the AgentExecutor class by introducing support for structured outputs when a response model is specified. The synthesis method now utilizes the response model to produce outputs that conform to the expected schema, while still falling back to concatenation in case of synthesis failures. This change ensures that intermediate steps yield free-text results, but the final output can be structured, improving the overall coherence and usability of the synthesized answers.
2026-02-08 16:19:16 -08:00
lorenzejay
ff57956d05 refactor: enhance final answer synthesis in AgentExecutor
This commit improves the synthesis of final answers in the AgentExecutor class by implementing a more coherent approach to combining results from multiple todo items. The method now utilizes a single LLM call to generate a polished response, falling back to concatenation if the synthesis fails. Additionally, the test cases have been updated to reflect the changes in planning and execution, ensuring that the results are properly validated and that the plan-and-execute architecture is functioning as intended.
2026-02-06 15:39:04 -08:00
Greyson LaLonde
f6fa04528a fix: add async HITL support and chained-router tests
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
asynchronous human-in-the-loop handling and related fixes.

- Extend human_input provider with async support: AsyncExecutorContext, handle_feedback_async, async prompt helpers (_prompt_input_async, _async_readline), and async training/regular feedback loops in SyncHumanInputProvider.
- Add async handler methods in CrewAgentExecutor and AgentExecutor (_ahandle_human_feedback, _ainvoke_loop) to integrate async provider flows.
- Change PlusAPI.get_agent to an async httpx call and adapt caller in agent_utils to run it via asyncio.run.
- Simplify listener execution in flow.Flow to correctly pass HumanFeedbackResult to listeners and unify execution path for router outcomes.
- Remove deprecated types/hitl.py definitions.
- Add tests covering chained router feedback, rejected paths, and mixed router/non-router listeners to prevent regressions.
2026-02-06 16:29:27 -05:00
Greyson LaLonde
7d498b29be fix: event ordering; flow state locks, routing
* fix: add current task id context and flow updates

introduce a context var for the current task id in `crewai.context` to track task scope. update `Flow._execute_single_listener` to return `(result, event_id)` and adjust callers to unpack it and append `FlowMethodName(str(result))` to `router_results`. set/reset the current task id at the start/end of task execution (async + sync) with minor import and call-site tweaks.

* fix: await event futures and flush event bus

call `crewai_event_bus.flush()` after crew kickoff. in `Flow`, await event handler futures instead of just collecting them: await pending `_event_futures` before finishing, await emitted futures immediately with try/except to log failures, then clear `_event_futures`. ensures handlers complete and errors surface.

* fix: continue iteration on tool completion events

expand the loop bridge listener to also trigger on tool completion events (`tool_completed` and `native_tool_completed`) so agent iteration resumes after tools finish. add a `requests.post` mock and response fixture in the liteagent test to simulate platform tool execution. refresh and sanitize vcr cassettes (updated model responses, timestamps, and header placeholders) to reflect tool-call flows and new recordings.

* fix: thread-safe state proxies & native routing

add thread-safe state proxies and refactor native tool routing.

* introduce `LockedListProxy` and `LockedDictProxy` in `flow.py` and update `StateProxy` to return them for list/dict attrs so mutations are protected by the flow lock.
* update `AgentExecutor` to use `StateProxy` on flow init, guard the messages setter with the state lock, and return a `StateProxy` from the temp state accessor.
* convert `call_llm_native_tools` into a listener (no direct routing return) and add `route_native_tool_result` to route based on state (pending tool calls, final answer, or context error).
* minor cleanup in `continue_iteration` to drop orphan listeners on init.
* update test cassettes for new native tool call responses, timestamps, and ids.

improves concurrency safety for shared state and makes native tool routing explicit.

* chore: regen cassettes

* chore: regen cassettes, remove duplicate listener call path
2026-02-06 14:02:43 -05:00
lorenzejay
9f3c53ca97 refactor: enhance final answer synthesis in AgentExecutor
This commit improves the synthesis of final answers in the AgentExecutor class by implementing a more coherent approach to combining results from multiple todo items. The method now utilizes a single LLM call to generate a polished response, falling back to concatenation if the synthesis fails. Additionally, the test cases have been updated to reflect the changes in planning and execution, ensuring that the results are properly validated and that the plan-and-execute architecture is functioning as intended.
2026-02-06 10:38:55 -08:00
Greyson LaLonde
1308bdee63 feat: add started_event_id and set in eventbus
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* feat: add started_event_id and set in eventbus

* chore: update additional test assumption

* fix: restore event bus handlers on context exit

fix rollback in crewai events bus so that exiting the context restores
the previous _sync_handlers, _async_handlers, _handler_dependencies, and _execution_plan_cache by assigning shallow copies of the saved dicts. previously these
were set to empty dicts on exit, which caused registered handlers and cached execution plans to be lost.
2026-02-05 21:28:23 -05:00
lorenzejay
8e1474d371 feat: introduce PlannerObserver and StepExecutor for enhanced plan execution
This commit adds the PlannerObserver and StepExecutor classes to the CrewAI framework, implementing the observation phase of the Plan-and-Execute architecture. The PlannerObserver analyzes step execution results, determines plan validity, and suggests refinements, while the StepExecutor executes individual todo items in isolation. These additions improve the overall planning and execution process, allowing for more dynamic and responsive agent behavior. Additionally, new observation events have been defined to facilitate monitoring and logging of the planning process.
2026-02-05 15:46:21 -08:00
lorenzejay
81d9fd4ab3 execute todos and be able to track them 2026-02-05 10:51:54 -08:00
Greyson LaLonde
6bb1b178a1 chore: extension points
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Introduce ContextVar-backed hooks and small API/behavior changes to improve extensibility and testability.

Changes include:
- agents: mark configure_structured_output as abstract and change its parameter to task to reflect use of task metadata.
- tracing: convert _first_time_trace_hook to a ContextVar and call .get() to safely retrieve the hook.
- console formatter: add _disable_version_check ContextVar and skip version checks when set (avoids noisy checks in certain contexts).
- flow: use current_triggering_event_id variable when scheduling listener tasks to keep naming consistent.
- hallucination guardrail: make context optional, add _validate_output_hook to allow custom validation hooks, update examples and return contract to allow hooks to override behavior.
- agent utilities: add _create_plus_client_hook for injecting a Plus client (used in tests/alternate flows), ensure structured tools have current_usage_count initialized and propagate to original tool, and fall back to creating PlusAPI client when no hook is provided.
2026-02-05 12:49:54 -05:00
Greyson LaLonde
fe2a4b4e40 chore: bug fixes and more refactor
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Refactor agent executor to delegate human interactions to a provider: add messages and ask_for_human_input properties, implement _invoke_loop and _format_feedback_message, and replace the internal iterative/training feedback logic with a call to get_provider().handle_feedback.

Make LLMGuardrail kickoff coroutine-aware by detecting coroutines and running them via asyncio.run so both sync and async agents are supported.

Make telemetry more robust by safely handling missing task.output (use empty string) and returning early if span is None before setting attributes.

Improve serialization to detect circular references via an _ancestors set, propagate it through recursive calls, and pass exclude/max_depth/_current_depth consistently to prevent infinite recursion and produce stable serializable output.
2026-02-04 21:21:54 -05:00
Greyson LaLonde
711e7171e1 chore: improve hook typing and registration
Allow hook registration to accept both typed hook types and plain callables by importing and using After*/Before*CallHookCallable types; add explicit LLMCallHookContext and ToolCallHookContext typing in crew_base. Introduce a post-initialize crew hook list and invoke hooks after Crew instance initialization. Refactor filtered hook factory functions to include precise typing and clearer local names (before_llm_hook/after_llm_hook/before_tool_hook/after_tool_hook) and register those with the instance. Update CrewInstance protocol to include _registered_hook_functions and _hooks_being_registered fields.
2026-02-04 21:16:20 -05:00
Vini Brasil
76b5f72e81 Fix tool error causing double event scope pop (#4373)
When a tool raises an error, both ToolUsageErrorEvent and
ToolUsageFinishedEvent were being emitted. Since both events pop the
event scope stack, this caused the agent scope to be incorrectly popped
along with the tool scope.
2026-02-04 20:34:08 -03:00
Greyson LaLonde
d86d43d3e0 chore: refactor crew to provider
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Enable dynamic extension exports and small behavior fixes across events and flow modules:

- events/__init__.py: Added _extension_exports and extended __getattr__ to lazily resolve registered extension values or import paths.
- events/event_bus.py: Implemented off() to unregister sync/async handlers, clean handler dependencies, and invalidate execution plan cache.
- events/listeners/tracing/utils.py: Added Callable import and _first_time_trace_hook to allow overriding first-time trace auto-collection behavior.
- events/types/tool_usage_events.py: Changed ToolUsageEvent.run_attempts default from None to 0 to avoid nullable handling.
- events/utils/console_formatter.py: Respect CREWAI_DISABLE_VERSION_CHECK env var to skip version checks in CI-like flows.
- flow/async_feedback/__init__.py: Added typing.Any import, _extension_exports and __getattr__ to support extensions via attribute lookup.

These changes add extension points and safer defaults, and provide a way to unregister event handlers.
2026-02-04 16:05:21 -05:00
Greyson LaLonde
6bfc98e960 refactor: extract hitl to provider pattern
* refactor: extract hitl to provider pattern

- add humaninputprovider protocol with setup_messages and handle_feedback
- move sync hitl logic from executor to synchuman inputprovider
- add _passthrough_exceptions extension point in agent/core.py
- create crewai.core.providers module for extensible components
- remove _ask_human_input from base_agent_executor_mixin
2026-02-04 15:40:22 -05:00
Greyson LaLonde
3cc33ef6ab fix: resolve complex schema $ref pointers in mcp tools
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* fix: resolve complex schema $ref pointers in mcp tools

* chore: update tool specifications

* fix: adapt mcp tools; sanitize pydantic json schemas

* fix: strip nulls from json schemas and simplify mcp args

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-02-03 20:47:58 -05:00
Lorenze Jay
3fec4669af Lorenze/fix/anthropic available functions call (#4360)
* feat: enhance AnthropicCompletion to support available functions in tool execution

- Updated the `_prepare_completion_params` method to accept `available_functions` for better tool handling.
- Modified tool execution logic to directly return results from tools when `available_functions` is provided, aligning behavior with OpenAI's model.
- Added new test cases to validate the execution of tools with available functions, ensuring correct argument passing and result formatting.

This change improves the flexibility and usability of the Anthropic LLM integration, allowing for more complex interactions with tools.

* refactor: remove redundant event emission in AnthropicCompletion

* fix test

* dry up
2026-02-03 16:30:43 -08:00
lorenzejay
7e1ae7226b improve handler 2026-02-03 15:15:22 -08:00
lorenzejay
adee852a2a Merge branch 'lorenze/feat/planning-pt-1' of github.com:crewAIInc/crewAI into lorenze/feat-planning-pt-2-todo-list-gen 2026-02-03 13:54:47 -08:00
lorenzejay
b7d5a4afef Merge branch 'main' of github.com:crewAIInc/crewAI into lorenze/feat/planning-pt-1 2026-02-03 13:54:27 -08:00
lorenzejay
abf86d5572 Merge branch 'lorenze/feat/planning-pt-1' of github.com:crewAIInc/crewAI into lorenze/feat-planning-pt-2-todo-list-gen 2026-02-03 13:48:36 -08:00
lorenzejay
02dc39faa2 improve planning prompt 2026-02-03 13:47:38 -08:00
lorenzejay
dd8230f051 Merge branch 'lorenze/feat/planning-pt-1' of github.com:crewAIInc/crewAI into lorenze/feat-planning-pt-2-todo-list-gen 2026-02-03 13:36:48 -08:00
lorenzejay
a3c2c946d3 refactor: update planning prompt and remove deprecated methods in reasoning handler 2026-02-03 13:35:45 -08:00
lorenzejay
bd95cffd41 feat: enhance agent planning with structured todo management
This commit introduces a new planning system within the AgentExecutor class, allowing for the creation of structured todo items from planning steps. The TodoList and TodoItem models have been added to facilitate tracking of plan execution. The reasoning plan now includes a list of steps, improving the clarity and organization of agent tasks. Additionally, tests have been added to validate the new planning functionality and ensure proper integration with existing workflows.
2026-02-03 13:24:55 -08:00
lorenzejay
ab6ce4b7aa fix test 2026-02-03 08:57:47 -08:00
lorenzejay
ac1d1fcfa3 more cassettes 2026-02-03 08:27:10 -08:00
lorenzejay
83f38184ff refactor: remove default enabled flag from PlanningConfig in Agent class 2026-02-03 08:01:27 -08:00
lorenzejay
f2016f8979 matching 2026-02-03 07:59:27 -08:00
Greyson LaLonde
a3c01265ee feat: add version check & integrate update notices 2026-02-03 10:17:50 -05:00
Thiago Moretto
e30645e855 limit stagehand dep version to 0.5.9 due breaking changes (#4339)
* limit to 0.5.9 due breaking changes + add env vars requirements

* fix tool spec extract that was ignoring with default

* original tool spec

* update spec
2026-02-03 09:43:24 -05:00
Greyson LaLonde
c1d2801be2 fix: reject reserved script names for crew folders 2026-02-03 09:16:55 -05:00
Greyson LaLonde
6a8483fcb6 fix: resolve race condition in guardrail event emission test 2026-02-03 09:06:48 -05:00
Greyson LaLonde
5fb602dff2 fix: replace timing-based concurrency test with state tracking 2026-02-03 08:58:51 -05:00
Greyson LaLonde
b90cff580a fix: relax openai and litellm dependency constraints 2026-02-03 08:51:55 -05:00
Vini Brasil
576b74b2ef Add call_id to LLM events for correlating requests (#4281)
When monitoring LLM events, consumers need to know which events belong
to the same API call. Before this change, there was no way to correlate
LLMCallStartedEvent, LLMStreamChunkEvent, and LLMCallCompletedEvent
belonging to the same request.
2026-02-03 10:10:33 -03:00
Greyson LaLonde
7590d4c6e3 fix: enforce additionalProperties=false in schemas
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* fix: enforce additionalProperties=false in schemas

* fix: ensure nested items have required properties
2026-02-02 22:19:04 -05:00