Compare commits

..

28 Commits

Author SHA1 Message Date
Heitor Sammuel Carvalho
a60aa3daca feat: remove --public/--private flags from tool publish command
All tools are now private to the organization by default. The --public
and --private CLI flags have been removed along with the is_public
parameter throughout the stack. Docs updated across EN, PT-BR, and KO.
2026-03-18 13:28:59 -03:00
Greyson LaLonde
116182f708 docs: update changelog and version for v1.11.0 2026-03-18 09:38:38 -04:00
Greyson LaLonde
9eed13b8a2 feat: bump versions to 1.11.0 2026-03-18 09:30:05 -04:00
Greyson LaLonde
50b2c7d072 docs: update changelog and version for v1.11.0rc2
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
2026-03-17 17:07:26 -04:00
Greyson LaLonde
e9ba4932a0 feat: bump versions to 1.11.0rc2 2026-03-17 16:58:59 -04:00
Tanishq
0b07b4c45f docs: update Exa Search Tool page with improved naming, description, and configuration options (#4800)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
* docs: update Exa Search Tool page with improved naming, description, and configuration options

Co-Authored-By: Tanishq Jaiswal <tanishq.jaiswal97@gmail.com>

* docs: fix API key link and remove neural/keyword search type references

Co-Authored-By: Tanishq Jaiswal <tanishq.jaiswal97@gmail.com>

* docs: add instant, fast, auto, deep search types

Co-Authored-By: Tanishq Jaiswal <tanishq.jaiswal97@gmail.com>

---------

Co-authored-by: João Moura <joaomdmoura@gmail.com>
2026-03-17 12:27:41 -03:00
João Moura
6235810844 fix: enhance LLM response handling and serialization (#4909)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* fix: enhance LLM response handling and serialization

* Updated the Flow class to improve error handling when both structured and simple prompting fail, ensuring the first outcome is returned as a fallback.
* Introduced a new function, _serialize_llm_for_context, to properly serialize LLM objects with provider prefixes for better context management.
* Added tests to validate the new serialization logic and ensure correct behavior when LLM calls fail.

This update enhances the robustness of LLM interactions and improves the overall flow of handling outcomes.

* fix: patch VCR response handling to prevent httpx.ResponseNotRead errors (#4917)

* fix: enhance LLM response handling and serialization

* Updated the Flow class to improve error handling when both structured and simple prompting fail, ensuring the first outcome is returned as a fallback.
* Introduced a new function, _serialize_llm_for_context, to properly serialize LLM objects with provider prefixes for better context management.
* Added tests to validate the new serialization logic and ensure correct behavior when LLM calls fail.

This update enhances the robustness of LLM interactions and improves the overall flow of handling outcomes.

* fix: patch VCR response handling to prevent httpx.ResponseNotRead errors

VCR's _from_serialized_response mocks httpx.Response.read(), which
prevents the response's internal _content attribute from being properly
initialized. When OpenAI's client (using with_raw_response) accesses
response.content, httpx raises ResponseNotRead.

This patch explicitly sets response._content after the response is
created, ensuring that tests using VCR cassettes work correctly with
the OpenAI client's raw response handling.

Fixes tests:
- test_hierarchical_crew_creation_tasks_with_sync_last
- test_conditional_task_last_task_when_conditional_is_false
- test_crew_log_file_output

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Joao Moura <joaomdmoura@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: alex-clawd <alex@crewai.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-17 05:19:31 -03:00
Matt Aitchison
b95486c187 fix: upgrade vulnerable transitive dependencies (authlib, PyJWT, snowflake-connector-python) (#4913)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
- authlib 1.6.7 → 1.6.9 (CVE-2026-27962 critical, CVE-2026-28498, CVE-2026-28490)
- PyJWT 2.11.0 → 2.12.1 (CVE-2026-32597)
- snowflake-connector-python 4.2.0 → 4.3.0
2026-03-16 19:02:39 -05:00
Lucas Gomide
ead8e8d6e6 docs: add Custom MCP Servers in How-To Guide (#4911) 2026-03-16 17:01:41 -04:00
Vini Brasil
5bbf9c8e03 Update OTEL collectors documentation (#4908)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
* Update OTEL collectors documentation

* Add translations
2026-03-16 13:27:57 -03:00
Greyson LaLonde
5053fae8a1 docs: update changelog and version for v1.11.0rc1 2026-03-16 09:55:45 -04:00
Lucas Gomide
9facd96aad docs: update MCP documentation (#4904) 2026-03-16 09:13:10 -04:00
Rip&Tear
9acb327d9f fix: replace os.system with subprocess.run in unsafe mode pip install
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* fix: replace os.system with subprocess.run in unsafe mode pip install

Eliminates shell injection risk (A05) where a malicious library name like
"pkg; rm -rf /" could execute arbitrary host commands. Using list-form
subprocess.run with shell=False ensures the library name is always treated
as a single argument with no shell metacharacter expansion.

Adds two tests: one verifying list-form invocation, one verifying that
shell metacharacters in a library name cannot trigger shell execution.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use sys.executable -m pip to satisfy S607 linting rule

S607 flags partial executable paths like ["pip", ...]. Using
[sys.executable, "-m", "pip", ...] provides an absolute path and also
ensures installation targets the correct Python environment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>
2026-03-16 02:04:24 -04:00
Greyson LaLonde
aca0817421 feat: bump versions to 1.11.0rc1 2026-03-15 23:37:20 -04:00
Greyson LaLonde
4d21c6e4ad feat(a2a): add plus api token auth
* feat(a2a): add plus api token auth

* feat(a2a): use stub for plus api

* fix: use dynamic separator in slugify for dedup and strip

* fix: remove unused _DUPLICATE_SEPARATOR_PATTERN, cache compiled regex in slugify
2026-03-15 23:30:29 -04:00
Lorenze Jay
32d7b4a8d4 Lorenze/feat/plan execute pattern (#4817)
* feat: introduce PlanningConfig for enhanced agent planning capabilities (#4344)

* feat: introduce PlanningConfig for enhanced agent planning capabilities

This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.

* dropping redundancy

* fix test

* revert handle_reasoning here

* refactor: update reasoning handling in Agent class

This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.

* improve planning prompts

* matching

* refactor: remove default enabled flag from PlanningConfig in Agent class

* more cassettes

* fix test

* refactor: update planning prompt and remove deprecated methods in reasoning handler

* improve planning prompt

* Lorenze/feat planning pt 2 todo list gen (#4449)

* feat: introduce PlanningConfig for enhanced agent planning capabilities

This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.

* dropping redundancy

* fix test

* revert handle_reasoning here

* refactor: update reasoning handling in Agent class

This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.

* improve planning prompts

* matching

* refactor: remove default enabled flag from PlanningConfig in Agent class

* more cassettes

* fix test

* feat: enhance agent planning with structured todo management

This commit introduces a new planning system within the AgentExecutor class, allowing for the creation of structured todo items from planning steps. The TodoList and TodoItem models have been added to facilitate tracking of plan execution. The reasoning plan now includes a list of steps, improving the clarity and organization of agent tasks. Additionally, tests have been added to validate the new planning functionality and ensure proper integration with existing workflows.

* refactor: update planning prompt and remove deprecated methods in reasoning handler

* improve planning prompt

* improve handler

* linted

* linted

* Lorenze/feat/planning pt 3 todo list execution (#4450)

* feat: introduce PlanningConfig for enhanced agent planning capabilities

This update adds a new PlanningConfig class to manage agent planning configurations, allowing for customizable planning behavior before task execution. The existing reasoning parameter is deprecated in favor of this new configuration, ensuring backward compatibility while enhancing the planning process. Additionally, the Agent class has been updated to utilize this new configuration, and relevant utility functions have been adjusted accordingly. Tests have been added to validate the new planning functionality and ensure proper integration with existing agent workflows.

* dropping redundancy

* fix test

* revert handle_reasoning here

* refactor: update reasoning handling in Agent class

This commit modifies the Agent class to conditionally call the handle_reasoning function based on the executor class being used. The legacy CrewAgentExecutor will continue to utilize handle_reasoning, while the new AgentExecutor will manage planning internally. Additionally, the PlanningConfig class has been referenced in the documentation to clarify its role in enabling or disabling planning. Tests have been updated to reflect these changes and ensure proper functionality.

* improve planning prompts

* matching

* refactor: remove default enabled flag from PlanningConfig in Agent class

* more cassettes

* fix test

* feat: enhance agent planning with structured todo management

This commit introduces a new planning system within the AgentExecutor class, allowing for the creation of structured todo items from planning steps. The TodoList and TodoItem models have been added to facilitate tracking of plan execution. The reasoning plan now includes a list of steps, improving the clarity and organization of agent tasks. Additionally, tests have been added to validate the new planning functionality and ensure proper integration with existing workflows.

* refactor: update planning prompt and remove deprecated methods in reasoning handler

* improve planning prompt

* improve handler

* execute todos and be able to track them

* feat: introduce PlannerObserver and StepExecutor for enhanced plan execution

This commit adds the PlannerObserver and StepExecutor classes to the CrewAI framework, implementing the observation phase of the Plan-and-Execute architecture. The PlannerObserver analyzes step execution results, determines plan validity, and suggests refinements, while the StepExecutor executes individual todo items in isolation. These additions improve the overall planning and execution process, allowing for more dynamic and responsive agent behavior. Additionally, new observation events have been defined to facilitate monitoring and logging of the planning process.

* refactor: enhance final answer synthesis in AgentExecutor

This commit improves the synthesis of final answers in the AgentExecutor class by implementing a more coherent approach to combining results from multiple todo items. The method now utilizes a single LLM call to generate a polished response, falling back to concatenation if the synthesis fails. Additionally, the test cases have been updated to reflect the changes in planning and execution, ensuring that the results are properly validated and that the plan-and-execute architecture is functioning as intended.

* refactor: enhance final answer synthesis in AgentExecutor

This commit improves the synthesis of final answers in the AgentExecutor class by implementing a more coherent approach to combining results from multiple todo items. The method now utilizes a single LLM call to generate a polished response, falling back to concatenation if the synthesis fails. Additionally, the test cases have been updated to reflect the changes in planning and execution, ensuring that the results are properly validated and that the plan-and-execute architecture is functioning as intended.

* refactor: implement structured output handling in final answer synthesis

This commit enhances the final answer synthesis process in the AgentExecutor class by introducing support for structured outputs when a response model is specified. The synthesis method now utilizes the response model to produce outputs that conform to the expected schema, while still falling back to concatenation in case of synthesis failures. This change ensures that intermediate steps yield free-text results, but the final output can be structured, improving the overall coherence and usability of the synthesized answers.

* regen tests

* linted

* fix

* Enhance PlanningConfig and AgentExecutor with Reasoning Effort Levels

This update introduces a new  attribute in the  class, allowing users to customize the observation and replanning behavior during task execution. The  class has been modified to utilize this new attribute, routing step observations based on the specified reasoning effort level: low, medium, or high.

Additionally, tests have been added to validate the functionality of the reasoning effort levels, ensuring that the agent behaves as expected under different configurations. This enhancement improves the adaptability and efficiency of the planning process in agent execution.

* regen cassettes for test and fix test

* cassette regen

* fixing tests

* dry

* Refactor PlannerObserver and StepExecutor to Utilize I18N for Prompts

This update enhances the PlannerObserver and StepExecutor classes by integrating the I18N utility for managing prompts and messages. The system and user prompts are now retrieved from the I18N module, allowing for better localization and maintainability. Additionally, the code has been cleaned up to remove hardcoded strings, improving readability and consistency across the planning and execution processes.

* Refactor PlannerObserver and StepExecutor to Utilize I18N for Prompts

This update enhances the PlannerObserver and StepExecutor classes by integrating the I18N utility for managing prompts and messages. The system and user prompts are now retrieved from the I18N module, allowing for better localization and maintainability. Additionally, the code has been cleaned up to remove hardcoded strings, improving readability and consistency across the planning and execution processes.

* consolidate agent logic

* fix datetime

* improving step executor

* refactor: streamline observation and refinement process in PlannerObserver

- Updated the PlannerObserver to apply structured refinements directly from observations without requiring a second LLM call.
- Renamed  method to  for clarity.
- Enhanced documentation to reflect changes in how refinements are handled.
- Removed unnecessary LLM message building and parsing logic, simplifying the refinement process.
- Updated event emissions to include summaries of refinements instead of raw data.

* enhance step executor with tool usage events and validation

- Added event emissions for tool usage, including started and finished events, to track tool execution.
- Implemented validation to ensure expected tools are called during step execution, raising errors when not.
- Refactored the  method to handle tool execution with event logging.
- Introduced a new method  for parsing tool input into a structured format.
- Updated tests to cover new functionality and ensure correct behavior of tool usage events.

* refactor: enhance final answer synthesis logic in AgentExecutor

- Updated the finalization process to conditionally skip synthesis when the last todo result is sufficient as a complete answer.
- Introduced a new method to determine if the last todo result can be used directly, improving efficiency.
- Added tests to verify the new behavior, ensuring synthesis is skipped when appropriate and maintained when a response model is set.

* fix: update observation handling in PlannerObserver for LLM errors

- Modified the error handling in the PlannerObserver to default to a conservative replan when an LLM call fails.
- Updated the return values to indicate that the step was not completed successfully and that a full replan is needed.
- Added a new test to verify the behavior of the observer when an LLM error occurs, ensuring the correct replan logic is triggered.

* refactor: enhance planning and execution flow in agents

- Updated the PlannerObserver to accept a kickoff input for standalone task execution, improving flexibility in task handling.
- Refined the step execution process in StepExecutor to support multi-turn action loops, allowing for iterative tool execution and observation.
- Introduced a method to extract relevant task sections from descriptions, ensuring clarity in task requirements.
- Enhanced the AgentExecutor to manage step failures more effectively, triggering replans only when necessary and preserving completed task history.
- Updated translations to reflect changes in planning principles and execution prompts, emphasizing concrete and executable steps.

* refactor: update setup_native_tools to include tool_name_mapping

- Modified the setup_native_tools function to return an additional mapping of tool names.
- Updated StepExecutor and AgentExecutor classes to accommodate the new return value from setup_native_tools.

* fix tests

* linted

* linted

* feat: enhance image block handling in Anthropic provider and update AgentExecutor logic

- Added a method to convert OpenAI-style image_url blocks to Anthropic's required format.
- Updated AgentExecutor to handle cases where no todos are ready, introducing a needs_replan return state.
- Improved fallback answer generation in AgentExecutor to prevent RuntimeErrors when no final output is produced.

* lint

* lint

* 1. Added failed to TodoStatus (planning_types.py)

  - TodoStatus now includes failed as a valid state: Literal[pending, running, completed, failed]
  - Added mark_failed(step_number, result) method to TodoList
  - Added get_failed_todos() method to TodoList
  - Updated is_complete to treat both completed and failed as terminal states
  - Updated replace_pending_todos docstring to mention failed items are preserved

  2. Mark running todos as failed before replan (agent_executor.py)

  All three effort-level handlers now call mark_failed() on the current todo before routing to replan_now:

  - Low effort (handle_step_observed_low): hard-failure branch
  - Medium effort (handle_step_observed_medium): needs_full_replan branch
  - High effort (decide_next_action): both needs_full_replan and step_completed_successfully=False branches

  3. Updated _should_replan to use get_failed_todos()

  Previously filtered on todo.status == failed which was dead code. Now uses the proper accessor method that will actually find failed items.

  What this fixes: Before these changes, a step that triggered a replan would stay in running status permanently, causing is_complete to never
  return True and next_pending to skip it — leading to stuck execution states. Now failed steps are properly tracked, replanning context correctly
  reports them, and LiteAgentOutput.failed_todos will actually return results.

* fix test

* imp on failed states

* adjusted the var name from AgentReActState to AgentExecutorState

* addressed p0 bugs

* more improvements

* linted

* regen cassette

* addressing crictical comments

* ensure configurable timeouts, max_replans and max step iterations

* adjusted tools

* dropping debug statements

* addressed comment

* fix  linter

* lints and test fixes

* fix: default observation parse fallback to failure and clean up plan-execute types

When _parse_observation_response fails all parse attempts, default to
step_completed_successfully=False instead of True to avoid silently
masking failures. Extract duplicate _extract_task_section into a shared
utility in agent_utils. Type PlanningConfig.llm as str | BaseLLM | None
instead of str | Any | None. Make StepResult a frozen dataclass for
immutability consistency with StepExecutionContext.

* fix: remove Any from function_calling_llm union type in step_executor

* fix: make BaseTool usage count thread-safe for parallel step execution

Add _usage_lock and _claim_usage() to BaseTool for atomic
check-and-increment of current_usage_count. This prevents race
conditions when parallel plan steps invoke the same tool concurrently
via execute_todos_parallel. Remove the racy pre-check from
execute_single_native_tool_call since the limit is now enforced
atomically inside tool.run().

---------

Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>
Co-authored-by: Greyson LaLonde <greyson@crewai.com>
2026-03-15 18:33:17 -07:00
Rip&Tear
fb2323b3de Code interpreter sandbox escape (#4791)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Build uv cache / build-cache (3.10) (push) Has been cancelled
Build uv cache / build-cache (3.11) (push) Has been cancelled
Build uv cache / build-cache (3.12) (push) Has been cancelled
Build uv cache / build-cache (3.13) (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
* [SECURITY] Fix sandbox escape vulnerability in CodeInterpreterTool (F-001)

This commit addresses a critical security vulnerability where the CodeInterpreterTool
could be exploited via sandbox escape attacks when Docker was unavailable.

Changes:
- Remove insecure fallback to restricted sandbox in run_code_safety()
- Now fails closed with RuntimeError when Docker is unavailable
- Mark run_code_in_restricted_sandbox() as deprecated and insecure
- Add clear security warnings to SandboxPython class documentation
- Update tests to reflect secure-by-default behavior
- Add test demonstrating the sandbox escape vulnerability
- Update README with security requirements and best practices

The previous implementation would fall back to a Python-based 'restricted sandbox'
when Docker was unavailable. However, this sandbox could be easily bypassed using
Python object introspection to recover the original __import__ function, allowing
arbitrary module access and command execution on the host.

The fix enforces Docker as a requirement for safe code execution. Users who cannot
use Docker must explicitly enable unsafe_mode=True, acknowledging the security risks.

Security Impact:
- Prevents RCE via sandbox escape when Docker is unavailable
- Enforces fail-closed security model
- Maintains backward compatibility via unsafe_mode flag

References:
- https://docs.crewai.com/tools/ai-ml/codeinterpretertool

Co-authored-by: Rip&Tear <theCyberTech@users.noreply.github.com>

* Add security fix documentation for F-001

Co-authored-by: Rip&Tear <theCyberTech@users.noreply.github.com>

* Add Slack summary for security fix

Co-authored-by: Rip&Tear <theCyberTech@users.noreply.github.com>

* Delete SECURITY_FIX_F001.md

* Delete SLACK_SUMMARY.md

* chore: regen cassettes

* chore: regen more cassettes

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Rip&Tear <theCyberTech@users.noreply.github.com>
Co-authored-by: Greyson LaLonde <greyson@crewai.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-03-15 13:18:02 +08:00
Greyson LaLonde
e1d7de0dba docs: update changelog and version for v1.10.2rc2
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
2026-03-14 00:49:48 -04:00
Greyson LaLonde
96b07bfc84 feat: bump versions to 1.10.2rc2 2026-03-14 00:34:12 -04:00
Greyson LaLonde
b8d7942675 fix: remove exclusive locks from read-only storage operations
* fix: remove exclusive locks from read-only storage operations to eliminate lock contention

read operations like search, list_scopes, get_scope_info, count across
LanceDB, ChromaDB, and RAG adapters were holding exclusive locks unnecessarily.
under multi-process prefork workers this caused RedisLock contention triggering
a portalocker bug where AlreadyLocked is raised with the exceptions module as its arg.

- remove store_lock from 7 LanceDB read methods since MVCC handles concurrent reads
- remove store_lock from ChromaDB search/asearch which are thread-safe since v0.4
- remove store_lock from RAG core query and LanceDB adapter query
- wrap lock_store BaseLockException with actionable error message
- add exception handling in encoding_flow/recall_flow ThreadPoolExecutor calls
- fix flow.py double-logging of ancestor listener errors

* fix: remove dead conditional in filter_and_chunk fallback

both branches of the if/else and the except all produced the same
candidates = [scope_prefix] result, making the get_scope_info call
and conditional pointless

* fix: separate lock acquisition from caller body in lock_store

the try/except wrapped the yield inside the contextmanager, which meant
any BaseLockException raised by the caller's code inside the with block
would be caught and re-raised with a misleading "Failed to acquire lock"
message. split into acquire-then-yield so only actual acquisition
failures get the actionable error message.
2026-03-14 00:21:14 -04:00
Greyson LaLonde
88fd859c26 docs: update changelog and version for v1.10.2rc1
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
2026-03-13 17:07:31 -04:00
Greyson LaLonde
3413f2e671 feat: bump versions to 1.10.2rc1 2026-03-13 16:53:48 -04:00
Greyson LaLonde
326ec15d54 feat(devtools): add release command and trigger PyPI publish
* feat(devtools): add release command and fix automerge on protected branches

Replace gh pr merge --auto with polling-based merge wait that prints the
PR URL for manual review. Add unified release command that chains bump
and tag into a single end-to-end workflow.

* feat(devtools): trigger PyPI publish workflow after GitHub release

* refactor(devtools): extract shared helpers to eliminate duplication

Extract _poll_pr_until_merged, _update_all_versions,
_generate_release_notes, _update_docs_and_create_pr,
_create_tag_and_release, and _trigger_pypi_publish into reusable
helpers. All three commands (bump, tag, release) now compose from
these shared functions.
2026-03-13 16:41:27 -04:00
Greyson LaLonde
c5a8fef118 fix: add cross-process and thread-safe locking to unprotected I/O (#4827)
* fix: add cross-process and thread-safe locking to unprotected I/O

* style: apply ruff formatting and import sorting

* fix: avoid event loop deadlock in snowflake pool lock

* perf: move embedding calls outside cross-process lock in RAG adapter

* fix: close TOCTOU race in browser session manager

* fix: add error handling to update_user_data

* fix: use async lock acquisition in chromadb async methods

* fix: avoid blocking event loop in async browser session wait

* fix: replace dual-lock with single cross-process lock in LanceDB storage

* fix: remove dead _save_user_data function and stale mock

* fix: re-addd file descriptor limit to prevent crashes
2026-03-13 12:28:11 -07:00
Greyson LaLonde
b7af26ff60 ci: add slack notification on successful pypi publish 2026-03-13 12:05:52 -04:00
Greyson LaLonde
48eb7c6937 fix: propagate contextvars across all thread and executor boundaries
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2026-03-13 00:32:22 -04:00
danglies007
d8e38f2f0b fix: propagate ContextVars into async task threads
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
threading.Thread() does not inherit the parent's contextvars.Context,
causing ContextVar-based state (OpenTelemetry spans, Langfuse trace IDs,
and any other request-scoped vars) to be silently dropped in async tasks.

Fix by calling contextvars.copy_context() before spawning each thread and
using ctx.run() as the thread target, which runs the function inside the
captured context.

Affected locations:
- task.py: execute_async() — the primary async task execution path
- utilities/streaming.py: create_chunk_generator() — streaming execution path

Fixes: #4822
Related: #4168, #4286

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-12 15:33:58 -04:00
Greyson LaLonde
542afe61a8 docs: update changelog and version for v1.10.2a1
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2026-03-11 11:44:00 -04:00
234 changed files with 52516 additions and 14426 deletions

View File

@@ -59,6 +59,8 @@ jobs:
contents: read
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.release_tag || github.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v6
@@ -93,3 +95,72 @@ jobs:
echo "Some packages failed to publish"
exit 1
fi
- name: Build Slack payload
if: success()
id: slack
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
RELEASE_TAG: ${{ inputs.release_tag }}
run: |
payload=$(uv run python -c "
import json, re, subprocess, sys
with open('lib/crewai/src/crewai/__init__.py') as f:
m = re.search(r\"__version__\s*=\s*[\\\"']([^\\\"']+)\", f.read())
version = m.group(1) if m else 'unknown'
import os
tag = os.environ.get('RELEASE_TAG') or version
try:
r = subprocess.run(['gh','release','view',tag,'--json','body','-q','.body'],
capture_output=True, text=True, check=True)
body = r.stdout.strip()
except Exception:
body = ''
blocks = [
{'type':'section','text':{'type':'mrkdwn',
'text':f':rocket: \`crewai v{version}\` published to PyPI'}},
{'type':'section','text':{'type':'mrkdwn',
'text':f'<https://pypi.org/project/crewai/{version}/|View on PyPI> · <https://github.com/crewAIInc/crewAI/releases/tag/{tag}|Release notes>'}},
{'type':'divider'},
]
if body:
heading, items = '', []
for line in body.split('\n'):
line = line.strip()
if not line: continue
hm = re.match(r'^#{2,3}\s+(.*)', line)
if hm:
if heading and items:
skip = heading in ('What\\'s Changed','') or 'Contributors' in heading
if not skip:
txt = f'*{heading}*\n' + '\n'.join(f'• {i}' for i in items)
blocks.append({'type':'section','text':{'type':'mrkdwn','text':txt}})
heading, items = hm.group(1), []
elif line.startswith('- ') or line.startswith('* '):
items.append(re.sub(r'\*\*([^*]*)\*\*', r'*\1*', line[2:]))
if heading and items:
skip = heading in ('What\\'s Changed','') or 'Contributors' in heading
if not skip:
txt = f'*{heading}*\n' + '\n'.join(f'• {i}' for i in items)
blocks.append({'type':'section','text':{'type':'mrkdwn','text':txt}})
blocks.append({'type':'divider'})
blocks.append({'type':'section','text':{'type':'mrkdwn',
'text':f'\`\`\`uv add \"crewai[tools]=={version}\"\`\`\`'}})
print(json.dumps({'blocks':blocks}))
")
echo "payload=$payload" >> $GITHUB_OUTPUT
- name: Notify Slack
if: success()
uses: slackapi/slack-github-action@v2.1.0
with:
webhook: ${{ secrets.SLACK_WEBHOOK_URL }}
webhook-type: incoming-webhook
payload: ${{ steps.slack.outputs.payload }}

View File

@@ -43,6 +43,35 @@ def _patched_make_vcr_request(httpx_request: Any, **kwargs: Any) -> Any:
httpx_stubs._make_vcr_request = _patched_make_vcr_request
# Patch the response-side of VCR to fix httpx.ResponseNotRead errors.
# VCR's _from_serialized_response mocks httpx.Response.read(), which prevents
# the response's internal _content attribute from being properly initialized.
# When OpenAI's client (using with_raw_response) accesses response.content,
# httpx raises ResponseNotRead because read() was never actually called.
# This patch ensures _content is explicitly set after response creation.
_original_from_serialized_response = getattr(
httpx_stubs, "_from_serialized_response", None
)
if _original_from_serialized_response is not None:
def _patched_from_serialized_response(
request: Any, serialized_response: Any, history: Any = None
) -> Any:
"""Patched version that ensures response._content is properly set."""
response = _original_from_serialized_response(request, serialized_response, history)
# Explicitly set _content to avoid ResponseNotRead errors
# The content was passed to the constructor but the mocked read() prevents
# proper initialization of the internal state
body_content = serialized_response.get("body", {}).get("string", b"")
if isinstance(body_content, str):
body_content = body_content.encode("utf-8")
response._content = body_content
return response
httpx_stubs._from_serialized_response = _patched_from_serialized_response
@pytest.fixture(autouse=True, scope="function")
def cleanup_event_handlers() -> Generator[None, Any, None]:
"""Clean up event bus handlers after each test to prevent test pollution."""

File diff suppressed because it is too large Load Diff

View File

@@ -4,6 +4,146 @@ description: "Product updates, improvements, and bug fixes for CrewAI"
icon: "clock"
mode: "wide"
---
<Update label="Mar 18, 2026">
## v1.11.0
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0)
## What's Changed
### Documentation
- Update changelog and version for v1.11.0rc2
## Contributors
@greysonlalonde
</Update>
<Update label="Mar 17, 2026">
## v1.11.0rc2
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0rc2)
## What's Changed
### Bug Fixes
- Enhance LLM response handling and serialization.
- Upgrade vulnerable transitive dependencies (authlib, PyJWT, snowflake-connector-python).
- Replace `os.system` with `subprocess.run` in unsafe mode pip install.
### Documentation
- Update Exa Search Tool page with improved naming, description, and configuration options.
- Add Custom MCP Servers in How-To Guide.
- Update OTEL collectors documentation.
- Update MCP documentation.
- Update changelog and version for v1.11.0rc1.
## Contributors
@10ishq, @greysonlalonde, @joaomdmoura, @lucasgomide, @mattatcha, @theCyberTech, @vinibrsl
</Update>
<Update label="Mar 15, 2026">
## v1.11.0rc1
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0rc1)
## What's Changed
### Features
- Add Plus API token authentication in a2a
- Implement plan execute pattern
### Bug Fixes
- Resolve code interpreter sandbox escape issue
### Documentation
- Update changelog and version for v1.10.2rc2
## Contributors
@Copilot, @greysonlalonde, @lorenzejay, @theCyberTech
</Update>
<Update label="Mar 14, 2026">
## v1.10.2rc2
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc2)
## What's Changed
### Bug Fixes
- Remove exclusive locks from read-only storage operations
### Documentation
- Update changelog and version for v1.10.2rc1
## Contributors
@greysonlalonde
</Update>
<Update label="Mar 13, 2026">
## v1.10.2rc1
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc1)
## What's Changed
### Features
- Add release command and trigger PyPI publish
### Bug Fixes
- Fix cross-process and thread-safe locking to unprotected I/O
- Propagate contextvars across all thread and executor boundaries
- Propagate ContextVars into async task threads
### Documentation
- Update changelog and version for v1.10.2a1
## Contributors
@danglies007, @greysonlalonde
</Update>
<Update label="Mar 11, 2026">
## v1.10.2a1
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2a1)
## What's Changed
### Features
- Add support for tool search, saving tokens, and dynamically injecting appropriate tools during execution for Anthropics.
- Introduce more Brave Search tools.
- Create action for nightly releases.
### Bug Fixes
- Fix LockException under concurrent multi-process execution.
- Resolve issues with grouping parallel tool results in a single user message.
- Address MCP tools resolutions and eliminate all shared mutable connections.
- Update LLM parameter handling in the human_feedback function.
- Add missing list/dict methods to LockedListProxy and LockedDictProxy.
- Propagate contextvars context to parallel tool call threads.
- Bump gitpython dependency to >=3.1.41 to resolve CVE path traversal vulnerability.
### Refactoring
- Refactor memory classes to be serializable.
### Documentation
- Update changelog and version for v1.10.1.
## Contributors
@akaKuruma, @github-actions[bot], @giulio-leone, @greysonlalonde, @joaomdmoura, @jonathansampson, @lorenzejay, @lucasgomide, @mattatcha
</Update>
<Update label="Mar 04, 2026">
## v1.10.1

View File

@@ -219,6 +219,16 @@ CrewAI provides a wide range of events that you can listen for:
- **ToolExecutionErrorEvent**: Emitted when a tool execution encounters an error
- **ToolSelectionErrorEvent**: Emitted when there's an error selecting a tool
### MCP Events
- **MCPConnectionStartedEvent**: Emitted when starting to connect to an MCP server. Contains the server name, URL, transport type, connection timeout, and whether it's a reconnection attempt.
- **MCPConnectionCompletedEvent**: Emitted when successfully connected to an MCP server. Contains the server name, connection duration in milliseconds, and whether it was a reconnection.
- **MCPConnectionFailedEvent**: Emitted when connection to an MCP server fails. Contains the server name, error message, and error type (`timeout`, `authentication`, `network`, etc.).
- **MCPToolExecutionStartedEvent**: Emitted when starting to execute an MCP tool. Contains the server name, tool name, and tool arguments.
- **MCPToolExecutionCompletedEvent**: Emitted when MCP tool execution completes successfully. Contains the server name, tool name, result, and execution duration in milliseconds.
- **MCPToolExecutionFailedEvent**: Emitted when MCP tool execution fails. Contains the server name, tool name, error message, and error type (`timeout`, `validation`, `server_error`, etc.).
- **MCPConfigFetchFailedEvent**: Emitted when fetching an MCP server configuration fails (e.g., the MCP is not connected in your account, API error, or connection failure after config was fetched). Contains the slug, error message, and error type (`not_connected`, `api_error`, `connection_failed`).
### Knowledge Events
- **KnowledgeRetrievalStartedEvent**: Emitted when a knowledge retrieval is started

View File

@@ -1,30 +1,39 @@
---
title: "Open Telemetry Logs"
description: "Understand how to capture telemetry logs from your CrewAI AMP deployments"
title: "OpenTelemetry Export"
description: "Export traces and logs from your CrewAI AMP deployments to your own OpenTelemetry collector"
icon: "magnifying-glass-chart"
mode: "wide"
---
CrewAI AMP provides a powerful way to capture telemetry logs from your deployments. This allows you to monitor the performance of your agents and workflows, and to debug issues that may arise.
CrewAI AMP can export OpenTelemetry **traces** and **logs** from your deployments directly to your own collector. This lets you monitor agent performance, track LLM calls, and debug issues using your existing observability stack.
Telemetry data follows the [OpenTelemetry GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) plus additional CrewAI-specific attributes.
## Prerequisites
<CardGroup cols={2}>
<Card title="ENTERPRISE OTEL SETUP enabled" icon="users">
Your organization should have ENTERPRISE OTEL SETUP enabled
<Card title="CrewAI AMP account" icon="users">
Your organization must have an active CrewAI AMP account.
</Card>
<Card title="OTEL collector setup" icon="server">
Your organization should have an OTEL collector setup or a provider like
Datadog log intake setup
<Card title="OpenTelemetry collector" icon="server">
You need an OpenTelemetry-compatible collector endpoint (e.g., your own OTel Collector, Datadog, Grafana, or any OTLP-compatible backend).
</Card>
</CardGroup>
## How to capture telemetry logs
## Setting up a collector
1. Go to settings/organization tab
2. Configure your OTEL collector setup
3. Save
1. In CrewAI AMP, go to **Settings** > **OpenTelemetry Collectors**.
2. Click **Add Collector**.
3. Select an integration type — **OpenTelemetry Traces** or **OpenTelemetry Logs**.
4. Configure the connection:
- **Endpoint** — Your collector's OTLP endpoint (e.g., `https://otel-collector.example.com:4317`).
- **Service Name** — A name to identify this service in your observability platform.
- **Custom Headers** *(optional)* — Add authentication or routing headers as key-value pairs.
- **Certificate** *(optional)* — Provide a TLS certificate if your collector requires one.
5. Click **Save**.
Example to setup OTEL log collection capture to Datadog.
<Frame>![OpenTelemetry Collector Configuration](/images/crewai-otel-collector-config.png)</Frame>
<Frame>![Capture Telemetry Logs](/images/crewai-otel-export.png)</Frame>
<Tip>
You can add multiple collectors — for example, one for traces and another for logs, or send to different backends for different purposes.
</Tip>

View File

@@ -0,0 +1,136 @@
---
title: "Custom MCP Servers"
description: "Connect your own MCP servers to CrewAI AMP with public access, API key authentication, or OAuth 2.0"
icon: "plug"
mode: "wide"
---
CrewAI AMP supports connecting to any MCP server that implements the [Model Context Protocol](https://modelcontextprotocol.io/). You can bring public servers that require no authentication, servers protected by an API key or bearer token, and servers that use OAuth 2.0 for secure delegated access.
## Prerequisites
<CardGroup cols={2}>
<Card title="CrewAI AMP Account" icon="user">
You need an active [CrewAI AMP](https://app.crewai.com) account.
</Card>
<Card title="MCP Server URL" icon="link">
The URL of the MCP server you want to connect. The server must be accessible from the internet and support Streamable HTTP transport.
</Card>
</CardGroup>
## Adding a Custom MCP Server
<Steps>
<Step title="Open Tools & Integrations">
Navigate to **Tools & Integrations** in the left sidebar of CrewAI AMP, then select the **Connections** tab.
</Step>
<Step title="Start adding a Custom MCP Server">
Click the **Add Custom MCP Server** button. A dialog will appear with the configuration form.
</Step>
<Step title="Fill in the basic information">
- **Name** (required): A descriptive name for your MCP server (e.g., "My Internal Tools Server").
- **Description**: An optional summary of what this MCP server provides.
- **Server URL** (required): The full URL to your MCP server endpoint (e.g., `https://my-server.example.com/mcp`).
</Step>
<Step title="Choose an authentication method">
Select one of the three available authentication methods based on how your MCP server is secured. See the sections below for details on each method.
</Step>
<Step title="Add custom headers (optional)">
If your MCP server requires additional headers on every request (e.g., tenant identifiers or routing headers), click **+ Add Header** and provide the header name and value. You can add multiple custom headers.
</Step>
<Step title="Create the connection">
Click **Create MCP Server** to save the connection. Your custom MCP server will now appear in the Connections list and its tools will be available for use in your crews.
</Step>
</Steps>
## Authentication Methods
### No Authentication
Choose this option when your MCP server is publicly accessible and does not require any credentials. This is common for open-source or internal servers running behind a VPN.
### Authentication Token
Use this method when your MCP server is protected by an API key or bearer token.
<Frame>
<img src="/images/enterprise/custom-mcp-auth-token.png" alt="Custom MCP Server with Authentication Token" />
</Frame>
| Field | Required | Description |
|-------|----------|-------------|
| **Header Name** | Yes | The name of the HTTP header that carries the token (e.g., `X-API-Key`, `Authorization`). |
| **Value** | Yes | Your API key or bearer token. |
| **Add to** | No | Where to attach the credential — **Header** (default) or **Query parameter**. |
<Tip>
If your server expects a `Bearer` token in the `Authorization` header, set the Header Name to `Authorization` and the Value to `Bearer <your-token>`.
</Tip>
### OAuth 2.0
Use this method for MCP servers that require OAuth 2.0 authorization. CrewAI will handle the full OAuth flow, including token refresh.
<Frame>
<img src="/images/enterprise/custom-mcp-oauth.png" alt="Custom MCP Server with OAuth 2.0" />
</Frame>
| Field | Required | Description |
|-------|----------|-------------|
| **Redirect URI** | — | Pre-filled and read-only. Copy this URI and register it as an authorized redirect URI in your OAuth provider. |
| **Authorization Endpoint** | Yes | The URL where users are sent to authorize access (e.g., `https://auth.example.com/oauth/authorize`). |
| **Token Endpoint** | Yes | The URL used to exchange the authorization code for an access token (e.g., `https://auth.example.com/oauth/token`). |
| **Client ID** | Yes | The OAuth client ID issued by your provider. |
| **Client Secret** | No | The OAuth client secret. Not required for public clients using PKCE. |
| **Scopes** | No | Space-separated list of scopes to request (e.g., `read write`). |
| **Token Auth Method** | No | How the client credentials are sent when exchanging tokens — **Standard (POST body)** or **Basic Auth (header)**. Defaults to Standard. |
| **PKCE Supported** | No | Enable if your OAuth provider supports Proof Key for Code Exchange. Recommended for improved security. |
<Info>
**Discover OAuth Config**: If your OAuth provider supports OpenID Connect Discovery, click the **Discover OAuth Config** link to auto-populate the authorization and token endpoints from the provider's `/.well-known/openid-configuration` URL.
</Info>
#### Setting Up OAuth 2.0 Step by Step
<Steps>
<Step title="Register the redirect URI">
Copy the **Redirect URI** shown in the form and add it as an authorized redirect URI in your OAuth provider's application settings.
</Step>
<Step title="Enter endpoints and credentials">
Fill in the **Authorization Endpoint**, **Token Endpoint**, **Client ID**, and optionally the **Client Secret** and **Scopes**.
</Step>
<Step title="Configure token exchange method">
Select the appropriate **Token Auth Method**. Most providers use the default **Standard (POST body)**. Some older providers require **Basic Auth (header)**.
</Step>
<Step title="Enable PKCE (recommended)">
Check **PKCE Supported** if your provider supports it. PKCE adds an extra layer of security to the authorization code flow and is recommended for all new integrations.
</Step>
<Step title="Create and authorize">
Click **Create MCP Server**. You will be redirected to your OAuth provider to authorize access. Once authorized, CrewAI will store the tokens and automatically refresh them as needed.
</Step>
</Steps>
## Using Your Custom MCP Server
Once connected, your custom MCP server's tools appear alongside built-in connections on the **Tools & Integrations** page. You can:
- **Assign tools to agents** in your crews just like any other CrewAI tool.
- **Manage visibility** to control which team members can use the server.
- **Edit or remove** the connection at any time from the Connections list.
<Warning>
If your MCP server becomes unreachable or the credentials expire, tool calls using that server will fail. Make sure the server URL is stable and credentials are kept up to date.
</Warning>
<Card title="Need Help?" icon="headset" href="mailto:support@crewai.com">
Contact our support team for assistance with custom MCP server configuration or troubleshooting.
</Card>

View File

@@ -9,10 +9,7 @@ mode: "wide"
The Tool Repository is a package manager for CrewAI tools. It allows users to publish, install, and manage tools that integrate with CrewAI crews and flows.
Tools can be:
- **Private**: accessible only within your organization (default)
- **Public**: accessible to all CrewAI users if published with the `--public` flag
All tools are private by default and accessible only within your organization.
The repository is not a version control system. Use Git to track code changes and enable collaboration.
@@ -106,12 +103,6 @@ To publish the tool:
crewai tool publish
```
By default, tools are published as private. To make a tool public:
```bash
crewai tool publish --public
```
For more details on how to build tools, see [Creating your own tools](/en/concepts/tools#creating-your-own-tools).
## Updating Tools

View File

@@ -62,22 +62,22 @@ Use the `#` syntax to select specific tools from a server:
"https://mcp.exa.ai/mcp?api_key=your_key#web_search_exa"
```
### CrewAI AMP Marketplace
### Connected MCP Integrations
Access tools from the CrewAI AMP marketplace:
Connect MCP servers from the CrewAI catalog or bring your own. Once connected in your account, reference them by slug:
```python
# Full service with all tools
"crewai-amp:financial-data"
# Connected MCP with all tools
"snowflake"
# Specific tool from AMP service
"crewai-amp:research-tools#pubmed_search"
# Specific tool from a connected MCP
"stripe#list_invoices"
# Multiple AMP services
# Multiple connected MCPs
mcps=[
"crewai-amp:weather-insights",
"crewai-amp:market-analysis",
"crewai-amp:social-media-monitoring"
"snowflake",
"stripe",
"github"
]
```
@@ -99,10 +99,10 @@ multi_source_agent = Agent(
"https://mcp.exa.ai/mcp?api_key=your_exa_key&profile=research",
"https://weather.api.com/mcp#get_current_conditions",
# CrewAI AMP marketplace
"crewai-amp:financial-insights",
"crewai-amp:academic-research#pubmed_search",
"crewai-amp:market-intelligence#competitor_analysis"
# Connected MCPs from catalog
"snowflake",
"stripe#list_invoices",
"github#search_repositories"
]
)
@@ -147,7 +147,7 @@ agent = Agent(
mcps=[
"https://mcp.exa.ai/mcp?api_key=key", # Tools: mcp_exa_ai_*
"https://weather.service.com/mcp", # Tools: weather_service_com_*
"crewai-amp:financial-data" # Tools: financial_data_*
"snowflake" # Tools: snowflake_*
]
)
@@ -170,7 +170,7 @@ agent = Agent(
"https://primary-server.com/mcp", # Primary data source
"https://backup-server.com/mcp", # Backup if primary fails
"https://unreachable-server.com/mcp", # Will be skipped with warning
"crewai-amp:reliable-service" # Reliable AMP service
"snowflake" # Connected MCP from catalog
]
)
@@ -254,7 +254,7 @@ agent = Agent(
apps=["gmail", "slack"], # Platform integrations
mcps=[ # MCP servers
"https://mcp.exa.ai/mcp?api_key=key",
"crewai-amp:research-tools"
"snowflake"
],
verbose=True,
@@ -298,7 +298,7 @@ agent = Agent(
mcps=[
"https://primary-api.com/mcp", # Primary choice
"https://backup-api.com/mcp", # Backup option
"crewai-amp:reliable-service" # AMP fallback
"snowflake" # Connected MCP fallback
]
```
@@ -311,7 +311,7 @@ agent = Agent(
backstory="Financial analyst with access to weather data for agricultural market insights",
mcps=[
"https://weather.service.com/mcp#get_forecast",
"crewai-amp:financial-data#stock_analysis"
"stripe#list_invoices"
]
)
```

View File

@@ -17,7 +17,7 @@ Use the `mcps` field directly on agents for seamless MCP tool integration. The D
#### String-Based References (Quick Setup)
Perfect for remote HTTPS servers and CrewAI AMP marketplace:
Perfect for remote HTTPS servers and connected MCP integrations from the CrewAI catalog:
```python
from crewai import Agent
@@ -29,8 +29,8 @@ agent = Agent(
mcps=[
"https://mcp.exa.ai/mcp?api_key=your_key", # External MCP server
"https://api.weather.com/mcp#get_forecast", # Specific tool from server
"crewai-amp:financial-data", # CrewAI AMP marketplace
"crewai-amp:research-tools#pubmed_search" # Specific AMP tool
"snowflake", # Connected MCP from catalog
"stripe#list_invoices" # Specific tool from connected MCP
]
)
# MCP tools are now automatically available to your agent!
@@ -127,7 +127,7 @@ research_agent = Agent(
backstory="Expert researcher with access to multiple data sources",
mcps=[
"https://mcp.exa.ai/mcp?api_key=your_key&profile=your_profile",
"crewai-amp:weather-service#current_conditions"
"snowflake#run_query"
]
)
@@ -204,19 +204,22 @@ mcps=[
]
```
#### CrewAI AMP Marketplace
#### Connected MCP Integrations
Connect MCP servers from the CrewAI catalog or bring your own. Once connected in your account, reference them by slug:
```python
mcps=[
# Full AMP MCP service - get all available tools
"crewai-amp:financial-data",
# Connected MCP - get all available tools
"snowflake",
# Specific tool from AMP service using # syntax
"crewai-amp:research-tools#pubmed_search",
# Specific tool from a connected MCP using # syntax
"stripe#list_invoices",
# Multiple AMP services
"crewai-amp:weather-service",
"crewai-amp:market-analysis"
# Multiple connected MCPs
"snowflake",
"stripe",
"github"
]
```
@@ -299,7 +302,7 @@ from crewai.mcp import MCPServerStdio, MCPServerHTTP
mcps=[
# String references
"https://external-api.com/mcp", # External server
"crewai-amp:financial-insights", # AMP service
"snowflake", # Connected MCP from catalog
# Structured configurations
MCPServerStdio(
@@ -409,7 +412,7 @@ agent = Agent(
# String references
"https://reliable-server.com/mcp", # Will work
"https://unreachable-server.com/mcp", # Will be skipped gracefully
"crewai-amp:working-service", # Will work
"snowflake", # Connected MCP from catalog
# Structured configs
MCPServerStdio(

View File

@@ -1,53 +1,110 @@
---
title: EXA Search Web Loader
description: The `EXASearchTool` is designed to perform a semantic search for a specified query from a text's content across the internet.
icon: globe-pointer
title: "Exa Search Tool"
description: "Search the web using the Exa Search API to find the most relevant results for any query, with options for full page content, highlights, and summaries."
icon: "magnifying-glass"
mode: "wide"
---
# `EXASearchTool`
## Description
The EXASearchTool is designed to perform a semantic search for a specified query from a text's content across the internet.
It utilizes the [exa.ai](https://exa.ai/) API to fetch and display the most relevant search results based on the query provided by the user.
The `EXASearchTool` lets CrewAI agents search the web using the [Exa](https://exa.ai/) search API. It returns the most relevant results for any query, with options for full page content and AI-generated summaries.
## Installation
To incorporate this tool into your project, follow the installation instructions below:
Install the CrewAI tools package:
```shell
pip install 'crewai[tools]'
```
## Example
## Environment Variables
The following example demonstrates how to initialize the tool and execute a search with a given query:
Set your Exa API key as an environment variable:
```python Code
from crewai_tools import EXASearchTool
# Initialize the tool for internet searching capabilities
tool = EXASearchTool()
```bash
export EXA_API_KEY='your_exa_api_key'
```
## Steps to Get Started
Get an API key from the [Exa dashboard](https://dashboard.exa.ai/api-keys).
To effectively use the EXASearchTool, follow these steps:
## Example Usage
<Steps>
<Step title="Package Installation">
Confirm that the `crewai[tools]` package is installed in your Python environment.
</Step>
<Step title="API Key Acquisition">
Acquire a [exa.ai](https://exa.ai/) API key by registering for a free account at [exa.ai](https://exa.ai/).
</Step>
<Step title="Environment Configuration">
Store your obtained API key in an environment variable named `EXA_API_KEY` to facilitate its use by the tool.
</Step>
</Steps>
Here's how to use the `EXASearchTool` within a CrewAI agent:
## Conclusion
```python
import os
from crewai import Agent, Task, Crew
from crewai_tools import EXASearchTool
By integrating the `EXASearchTool` into Python projects, users gain the ability to conduct real-time, relevant searches across the internet directly from their applications.
By adhering to the setup and usage guidelines provided, incorporating this tool into projects is streamlined and straightforward.
# Initialize the tool
exa_tool = EXASearchTool()
# Create an agent that uses the tool
researcher = Agent(
role='Research Analyst',
goal='Find the latest information on any topic',
backstory='An expert researcher who finds the most relevant and up-to-date information.',
tools=[exa_tool],
verbose=True
)
# Create a task for the agent
research_task = Task(
description='Find the top 3 recent breakthroughs in quantum computing.',
expected_output='A summary of the top 3 breakthroughs with source URLs.',
agent=researcher
)
# Form the crew and kick it off
crew = Crew(
agents=[researcher],
tasks=[research_task],
verbose=True
)
result = crew.kickoff()
print(result)
```
## Configuration Options
The `EXASearchTool` accepts the following parameters during initialization:
- `type` (str, optional): The search type to use. Defaults to `"auto"`. Options: `"auto"`, `"instant"`, `"fast"`, `"deep"`.
- `content` (bool, optional): Whether to include full page content in results. Defaults to `False`.
- `summary` (bool, optional): Whether to include AI-generated summaries of each result. Requires `content=True`. Defaults to `False`.
- `api_key` (str, optional): Your Exa API key. Falls back to the `EXA_API_KEY` environment variable if not provided.
- `base_url` (str, optional): Custom API server URL. Falls back to the `EXA_BASE_URL` environment variable if not provided.
When calling the tool (or when an agent invokes it), the following search parameters are available:
- `search_query` (str): **Required**. The search query string.
- `start_published_date` (str, optional): Filter results published after this date (ISO 8601 format, e.g. `"2024-01-01"`).
- `end_published_date` (str, optional): Filter results published before this date (ISO 8601 format).
- `include_domains` (list[str], optional): A list of domains to restrict the search to.
## Advanced Usage
You can configure the tool with custom parameters for richer results:
```python
# Get full page content with AI summaries
exa_tool = EXASearchTool(
content=True,
summary=True,
type="deep"
)
# Use it in an agent
agent = Agent(
role="Deep Researcher",
goal="Conduct thorough research with full content and summaries",
tools=[exa_tool]
)
```
## Features
- **Semantic Search**: Find results based on meaning, not just keywords
- **Full Content Retrieval**: Get the full text of web pages alongside search results
- **AI Summaries**: Get concise, AI-generated summaries of each result
- **Date Filtering**: Limit results to specific time periods with published date filters
- **Domain Filtering**: Restrict searches to specific domains

Binary file not shown.

After

Width:  |  Height:  |  Size: 356 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 317 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 67 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 69 KiB

View File

@@ -4,6 +4,146 @@ description: "CrewAI의 제품 업데이트, 개선 사항 및 버그 수정"
icon: "clock"
mode: "wide"
---
<Update label="2026년 3월 18일">
## v1.11.0
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0)
## 변경 사항
### 문서
- v1.11.0rc2에 대한 변경 로그 및 버전 업데이트
## 기여자
@greysonlalonde
</Update>
<Update label="2026년 3월 17일">
## v1.11.0rc2
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0rc2)
## 변경 사항
### 버그 수정
- LLM 응답 처리 및 직렬화 개선.
- 취약한 전이 종속성(authlib, PyJWT, snowflake-connector-python) 업그레이드.
- 안전하지 않은 모드에서 pip 설치 시 `os.system`을 `subprocess.run`으로 교체.
### 문서
- 개선된 이름, 설명 및 구성 옵션으로 Exa 검색 도구 페이지 업데이트.
- 사용 방법 가이드에 사용자 지정 MCP 서버 추가.
- OTEL 수집기 문서 업데이트.
- MCP 문서 업데이트.
- v1.11.0rc1에 대한 변경 로그 및 버전 업데이트.
## 기여자
@10ishq, @greysonlalonde, @joaomdmoura, @lucasgomide, @mattatcha, @theCyberTech, @vinibrsl
</Update>
<Update label="2026년 3월 15일">
## v1.11.0rc1
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0rc1)
## 변경 사항
### 기능
- Plus API 토큰 인증 추가
- 에서 계획 실행 패턴 구현
### 버그 수정
- 코드 인터프리터 샌드박스 탈출 문제 해결
### 문서
- v1.10.2rc2의 변경 로그 및 버전 업데이트
## 기여자
@Copilot, @greysonlalonde, @lorenzejay, @theCyberTech
</Update>
<Update label="2026년 3월 14일">
## v1.10.2rc2
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc2)
## 변경 사항
### 버그 수정
- 읽기 전용 스토리지 작업에서 독점 잠금 제거
### 문서
- v1.10.2rc1에 대한 변경 로그 및 버전 업데이트
## 기여자
@greysonlalonde
</Update>
<Update label="2026년 3월 13일">
## v1.10.2rc1
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc1)
## 변경 사항
### 기능
- 릴리스 명령 추가 및 PyPI 게시 트리거
### 버그 수정
- 보호되지 않은 I/O에 대한 프로세스 간 및 스레드 안전 잠금 수정
- 모든 스레드 및 실행기 경계를 넘는 contextvars 전파
- async 작업 스레드로 ContextVars 전파
### 문서
- v1.10.2a1에 대한 변경 로그 및 버전 업데이트
## 기여자
@danglies007, @greysonlalonde
</Update>
<Update label="2026년 3월 11일">
## v1.10.2a1
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2a1)
## 변경 사항
### 기능
- Anthropics에 대한 도구 검색 지원 추가, 토큰 저장, 실행 중 적절한 도구를 동적으로 주입하는 기능 추가.
- 더 많은 Brave Search 도구 도입.
- 야간 릴리스를 위한 액션 생성.
### 버그 수정
- 동시 다중 프로세스 실행 중 LockException 수정.
- 단일 사용자 메시지에서 병렬 도구 결과 그룹화 문제 해결.
- MCP 도구 해상도 문제 해결 및 모든 공유 가변 연결 제거.
- human_feedback 함수에서 LLM 매개변수 처리 업데이트.
- LockedListProxy 및 LockedDictProxy에 누락된 list/dict 메서드 추가.
- 병렬 도구 호출 스레드에 contextvars 컨텍스트 전파.
- CVE 경로 탐색 취약점을 해결하기 위해 gitpython 의존성을 >=3.1.41로 업데이트.
### 리팩토링
- 메모리 클래스를 직렬화 가능하도록 리팩토링.
### 문서
- v1.10.1에 대한 변경 로그 및 버전 업데이트.
## 기여자
@akaKuruma, @github-actions[bot], @giulio-leone, @greysonlalonde, @joaomdmoura, @jonathansampson, @lorenzejay, @lucasgomide, @mattatcha
</Update>
<Update label="2026년 3월 4일">
## v1.10.1

View File

@@ -0,0 +1,39 @@
---
title: "OpenTelemetry 내보내기"
description: "CrewAI AMP 배포에서 자체 OpenTelemetry 수집기로 트레이스와 로그를 내보내기"
icon: "magnifying-glass-chart"
mode: "wide"
---
CrewAI AMP는 배포에서 OpenTelemetry **트레이스**와 **로그**를 자체 수집기로 직접 내보낼 수 있습니다. 이를 통해 기존 관측 가능성 스택을 사용하여 에이전트 성능을 모니터링하고, LLM 호출을 추적하고, 문제를 디버깅할 수 있습니다.
텔레메트리 데이터는 [OpenTelemetry GenAI 시맨틱 규칙](https://opentelemetry.io/docs/specs/semconv/gen-ai/)과 추가적인 CrewAI 전용 속성을 따릅니다.
## 사전 요구 사항
<CardGroup cols={2}>
<Card title="CrewAI AMP 계정" icon="users">
조직에 활성 CrewAI AMP 계정이 있어야 합니다.
</Card>
<Card title="OpenTelemetry 수집기" icon="server">
OpenTelemetry 호환 수집기 엔드포인트가 필요합니다 (예: 자체 OTel Collector, Datadog, Grafana 또는 OTLP 호환 백엔드).
</Card>
</CardGroup>
## 수집기 설정
1. CrewAI AMP에서 **Settings** > **OpenTelemetry Collectors**로 이동합니다.
2. **Add Collector**를 클릭합니다.
3. 통합 유형을 선택합니다 — **OpenTelemetry Traces** 또는 **OpenTelemetry Logs**.
4. 연결을 구성합니다:
- **Endpoint** — 수집기의 OTLP 엔드포인트 (예: `https://otel-collector.example.com:4317`).
- **Service Name** — 관측 가능성 플랫폼에서 이 서비스를 식별하기 위한 이름.
- **Custom Headers** *(선택 사항)* — 인증 또는 라우팅 헤더를 키-값 쌍으로 추가합니다.
- **Certificate** *(선택 사항)* — 수집기에서 TLS 인증서가 필요한 경우 제공합니다.
5. **Save**를 클릭합니다.
<Frame>![OpenTelemetry 수집기 구성](/images/crewai-otel-collector-config.png)</Frame>
<Tip>
여러 수집기를 추가할 수 있습니다 — 예를 들어, 트레이스용 하나와 로그용 하나를 추가하거나, 다른 목적을 위해 다른 백엔드로 전송할 수 있습니다.
</Tip>

View File

@@ -0,0 +1,136 @@
---
title: "커스텀 MCP 서버"
description: "공개 액세스, API 키 인증 또는 OAuth 2.0을 사용하여 자체 MCP 서버를 CrewAI AMP에 연결하세요"
icon: "plug"
mode: "wide"
---
CrewAI AMP는 [Model Context Protocol](https://modelcontextprotocol.io/)을 구현하는 모든 MCP 서버에 연결할 수 있습니다. 인증이 필요 없는 공개 서버, API 키 또는 Bearer 토큰으로 보호되는 서버, OAuth 2.0을 사용하는 서버를 연결할 수 있습니다.
## 사전 요구사항
<CardGroup cols={2}>
<Card title="CrewAI AMP 계정" icon="user">
활성화된 [CrewAI AMP](https://app.crewai.com) 계정이 필요합니다.
</Card>
<Card title="MCP 서버 URL" icon="link">
연결하려는 MCP 서버의 URL입니다. 서버는 인터넷에서 접근 가능해야 하며 Streamable HTTP 전송을 지원해야 합니다.
</Card>
</CardGroup>
## 커스텀 MCP 서버 추가하기
<Steps>
<Step title="Tools & Integrations 열기">
CrewAI AMP 왼쪽 사이드바에서 **Tools & Integrations**로 이동한 후 **Connections** 탭을 선택합니다.
</Step>
<Step title="커스텀 MCP 서버 추가 시작">
**Add Custom MCP Server** 버튼을 클릭합니다. 구성 양식이 포함된 대화 상자가 나타납니다.
</Step>
<Step title="기본 정보 입력">
- **Name** (필수): MCP 서버의 설명적 이름 (예: "내부 도구 서버").
- **Description**: 이 MCP 서버가 제공하는 기능에 대한 선택적 요약.
- **Server URL** (필수): MCP 서버 엔드포인트의 전체 URL (예: `https://my-server.example.com/mcp`).
</Step>
<Step title="인증 방법 선택">
MCP 서버의 보안 방식에 따라 세 가지 인증 방법 중 하나를 선택합니다. 각 방법에 대한 자세한 내용은 아래 섹션을 참조하세요.
</Step>
<Step title="커스텀 헤더 추가 (선택사항)">
MCP 서버가 모든 요청에 추가 헤더를 요구하는 경우 (예: 테넌트 식별자 또는 라우팅 헤더), **+ Add Header**를 클릭하고 헤더 이름과 값을 입력합니다. 여러 커스텀 헤더를 추가할 수 있습니다.
</Step>
<Step title="연결 생성">
**Create MCP Server**를 클릭하여 연결을 저장합니다. 커스텀 MCP 서버가 Connections 목록에 나타나고 해당 도구를 crew에서 사용할 수 있게 됩니다.
</Step>
</Steps>
## 인증 방법
### 인증 없음
MCP 서버가 공개적으로 접근 가능하고 자격 증명이 필요 없을 때 이 옵션을 선택합니다. 오픈 소스 서버나 VPN 뒤에서 실행되는 내부 서버에 일반적입니다.
### 인증 토큰
MCP 서버가 API 키 또는 Bearer 토큰으로 보호되는 경우 이 방법을 사용합니다.
<Frame>
<img src="/images/enterprise/custom-mcp-auth-token.png" alt="인증 토큰을 사용하는 커스텀 MCP 서버" />
</Frame>
| 필드 | 필수 | 설명 |
|------|------|------|
| **Header Name** | 예 | 토큰을 전달하는 HTTP 헤더 이름 (예: `X-API-Key`, `Authorization`). |
| **Value** | 예 | API 키 또는 Bearer 토큰. |
| **Add to** | 아니오 | 자격 증명을 첨부할 위치 — **Header** (기본값) 또는 **Query parameter**. |
<Tip>
서버가 `Authorization` 헤더에 `Bearer` 토큰을 예상하는 경우, Header Name을 `Authorization`으로, Value를 `Bearer <토큰>`으로 설정하세요.
</Tip>
### OAuth 2.0
OAuth 2.0 인증이 필요한 MCP 서버에 이 방법을 사용합니다. CrewAI가 토큰 갱신을 포함한 전체 OAuth 흐름을 처리합니다.
<Frame>
<img src="/images/enterprise/custom-mcp-oauth.png" alt="OAuth 2.0을 사용하는 커스텀 MCP 서버" />
</Frame>
| 필드 | 필수 | 설명 |
|------|------|------|
| **Redirect URI** | — | 자동으로 채워지며 읽기 전용입니다. 이 URI를 복사하여 OAuth 제공자에 승인된 리디렉션 URI로 등록하세요. |
| **Authorization Endpoint** | 예 | 사용자가 접근을 승인하기 위해 이동하는 URL (예: `https://auth.example.com/oauth/authorize`). |
| **Token Endpoint** | 예 | 인증 코드를 액세스 토큰으로 교환하는 데 사용되는 URL (예: `https://auth.example.com/oauth/token`). |
| **Client ID** | 예 | OAuth 제공자가 발급한 클라이언트 ID. |
| **Client Secret** | 아니오 | OAuth 클라이언트 시크릿. PKCE를 사용하는 공개 클라이언트에는 필요하지 않습니다. |
| **Scopes** | 아니오 | 요청할 스코프의 공백으로 구분된 목록 (예: `read write`). |
| **Token Auth Method** | 아니오 | 토큰 교환 시 클라이언트 자격 증명을 보내는 방법 — **Standard (POST body)** 또는 **Basic Auth (header)**. 기본값은 Standard입니다. |
| **PKCE Supported** | 아니오 | OAuth 제공자가 Proof Key for Code Exchange를 지원하는 경우 활성화합니다. 보안 강화를 위해 권장됩니다. |
<Info>
**Discover OAuth Config**: OAuth 제공자가 OpenID Connect Discovery를 지원하는 경우, **Discover OAuth Config** 링크를 클릭하여 제공자의 `/.well-known/openid-configuration` URL에서 인증 및 토큰 엔드포인트를 자동으로 채울 수 있습니다.
</Info>
#### OAuth 2.0 단계별 설정
<Steps>
<Step title="리디렉션 URI 등록">
양식에 표시된 **Redirect URI**를 복사하여 OAuth 제공자의 애플리케이션 설정에서 승인된 리디렉션 URI로 추가합니다.
</Step>
<Step title="엔드포인트 및 자격 증명 입력">
**Authorization Endpoint**, **Token Endpoint**, **Client ID**를 입력하고, 선택적으로 **Client Secret**과 **Scopes**를 입력합니다.
</Step>
<Step title="토큰 교환 방법 구성">
적절한 **Token Auth Method**를 선택합니다. 대부분의 제공자는 기본값인 **Standard (POST body)**를 사용합니다. 일부 오래된 제공자는 **Basic Auth (header)**를 요구합니다.
</Step>
<Step title="PKCE 활성화 (권장)">
제공자가 지원하는 경우 **PKCE Supported**를 체크합니다. PKCE는 인증 코드 흐름에 추가 보안 계층을 제공하며 모든 새 통합에 권장됩니다.
</Step>
<Step title="생성 및 인증">
**Create MCP Server**를 클릭합니다. OAuth 제공자로 리디렉션되어 접근을 인증합니다. 인증 완료 후 CrewAI가 토큰을 저장하고 필요에 따라 자동으로 갱신합니다.
</Step>
</Steps>
## 커스텀 MCP 서버 사용하기
연결이 완료되면 커스텀 MCP 서버의 도구가 **Tools & Integrations** 페이지에서 기본 제공 연결과 함께 표시됩니다. 다음을 수행할 수 있습니다:
- 다른 CrewAI 도구와 마찬가지로 crew의 **에이전트에 도구를 할당**합니다.
- **가시성을 관리**하여 어떤 팀원이 서버를 사용할 수 있는지 제어합니다.
- Connections 목록에서 언제든지 연결을 **편집하거나 제거**합니다.
<Warning>
MCP 서버에 접근할 수 없거나 자격 증명이 만료되면 해당 서버를 사용하는 도구 호출이 실패합니다. 서버 URL이 안정적이고 자격 증명이 최신 상태인지 확인하세요.
</Warning>
<Card title="도움이 필요하신가요?" icon="headset" href="mailto:support@crewai.com">
커스텀 MCP 서버 구성 또는 문제 해결에 대한 도움이 필요하면 지원팀에 문의하세요.
</Card>

View File

@@ -9,10 +9,7 @@ mode: "wide"
Tool Repository는 CrewAI 도구를 위한 패키지 관리자입니다. 사용자는 CrewAI crew와 flow에 통합되는 도구를 게시, 설치 및 관리할 수 있습니다.
도구는 다음과 같이 분류됩니다:
- **비공개**: 조직 내에서만 접근할 수 있습니다(기본값)
- **공개**: `--public` 플래그로 게시하면 모든 CrewAI 사용자가 접근할 수 있습니다
모든 도구는 기본적으로 비공개이며 조직 내에서만 접근할 수 있습니다.
이 저장소는 버전 관리 시스템이 아닙니다. 코드 변경 사항을 추적하고 협업을 활성화하려면 Git을 사용하십시오.
@@ -60,12 +57,6 @@ git commit -m "Initial version"
crewai tool publish
```
기본적으로 도구는 비공개로 게시됩니다. 도구를 공개로 설정하려면:
```bash
crewai tool publish --public
```
도구 빌드에 대한 자세한 내용은 [나만의 도구 만들기](/ko/concepts/tools#creating-your-own-tools)를 참고하세요.
## 도구 업데이트

View File

@@ -62,22 +62,22 @@ agent = Agent(
"https://mcp.exa.ai/mcp?api_key=your_key#web_search_exa"
```
### CrewAI AMP 마켓플레이스
### 연결된 MCP 통합
CrewAI AMP 마켓플레이스의 도구에 액세스하세요:
CrewAI 카탈로그에서 MCP 서버를 연결하거나 직접 가져올 수 있습니다. 계정에 연결한 후 슬러그로 참조하세요:
```python
# 모든 도구가 포함된 전체 서비스
"crewai-amp:financial-data"
# 모든 도구가 포함된 연결된 MCP
"snowflake"
# AMP 서비스의 특정 도구
"crewai-amp:research-tools#pubmed_search"
# 연결된 MCP의 특정 도구
"stripe#list_invoices"
# 다중 AMP 서비스
# 여러 연결된 MCP
mcps=[
"crewai-amp:weather-insights",
"crewai-amp:market-analysis",
"crewai-amp:social-media-monitoring"
"snowflake",
"stripe",
"github"
]
```
@@ -99,10 +99,10 @@ multi_source_agent = Agent(
"https://mcp.exa.ai/mcp?api_key=your_exa_key&profile=research",
"https://weather.api.com/mcp#get_current_conditions",
# CrewAI AMP 마켓플레이스
"crewai-amp:financial-insights",
"crewai-amp:academic-research#pubmed_search",
"crewai-amp:market-intelligence#competitor_analysis"
# 카탈로그에서 연결된 MCP
"snowflake",
"stripe#list_invoices",
"github#search_repositories"
]
)
@@ -154,7 +154,7 @@ agent = Agent(
"https://reliable-server.com/mcp", # 작동할 것
"https://unreachable-server.com/mcp", # 우아하게 건너뛸 것
"https://slow-server.com/mcp", # 우아하게 타임아웃될 것
"crewai-amp:working-service" # 작동할 것
"snowflake" # 카탈로그에서 연결된 MCP
]
)
# 에이전트는 작동하는 서버의 도구를 사용하고 실패한 서버에 대한 경고를 로그에 남깁니다
@@ -229,6 +229,6 @@ agent = Agent(
mcps=[
"https://primary-api.com/mcp", # 주요 선택
"https://backup-api.com/mcp", # 백업 옵션
"crewai-amp:reliable-service" # AMP 폴백
"snowflake" # 연결된 MCP 폴백
]
```

View File

@@ -25,8 +25,8 @@ agent = Agent(
mcps=[
"https://mcp.exa.ai/mcp?api_key=your_key", # 외부 MCP 서버
"https://api.weather.com/mcp#get_forecast", # 서버의 특정 도구
"crewai-amp:financial-data", # CrewAI AMP 마켓플레이스
"crewai-amp:research-tools#pubmed_search" # 특정 AMP 도구
"snowflake", # 카탈로그에서 연결된 MCP
"stripe#list_invoices" # 연결된 MCP의 특정 도구
]
)
# MCP 도구들이 이제 자동으로 에이전트에서 사용 가능합니다!

View File

@@ -4,6 +4,146 @@ description: "Atualizações de produto, melhorias e correções do CrewAI"
icon: "clock"
mode: "wide"
---
<Update label="18 mar 2026">
## v1.11.0
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0)
## O que Mudou
### Documentação
- Atualizar changelog e versão para v1.11.0rc2
## Contribuidores
@greysonlalonde
</Update>
<Update label="17 mar 2026">
## v1.11.0rc2
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0rc2)
## O que Mudou
### Correções de Bugs
- Aprimorar o manuseio e a serialização das respostas do LLM.
- Atualizar dependências transitivas vulneráveis (authlib, PyJWT, snowflake-connector-python).
- Substituir `os.system` por `subprocess.run` na instalação do pip em modo inseguro.
### Documentação
- Atualizar a página da Ferramenta de Pesquisa Exa com nomes, descrições e opções de configuração aprimoradas.
- Adicionar Servidores MCP Personalizados no Guia de Como Fazer.
- Atualizar a documentação dos coletores OTEL.
- Atualizar a documentação do MCP.
- Atualizar o changelog e a versão para v1.11.0rc1.
## Contributors
@10ishq, @greysonlalonde, @joaomdmoura, @lucasgomide, @mattatcha, @theCyberTech, @vinibrsl
</Update>
<Update label="15 mar 2026">
## v1.11.0rc1
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.11.0rc1)
## O que Mudou
### Funcionalidades
- Adicionar autenticação de token da API Plus
- Implementar padrão de execução de plano
### Correções de Bugs
- Resolver problema de escape do sandbox do interpretador de código
### Documentação
- Atualizar changelog e versão para v1.10.2rc2
## Contribuidores
@Copilot, @greysonlalonde, @lorenzejay, @theCyberTech
</Update>
<Update label="14 mar 2026">
## v1.10.2rc2
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc2)
## O que Mudou
### Correções de Bugs
- Remover bloqueios exclusivos de operações de armazenamento somente leitura
### Documentação
- Atualizar changelog e versão para v1.10.2rc1
## Contribuidores
@greysonlalonde
</Update>
<Update label="13 mar 2026">
## v1.10.2rc1
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc1)
## O que Mudou
### Funcionalidades
- Adicionar comando de lançamento e acionar publicação no PyPI
### Correções de Bugs
- Corrigir bloqueio seguro entre processos e threads para I/O não protegido
- Propagar contextvars através de todos os limites de thread e executor
- Propagar ContextVars para threads de tarefas assíncronas
### Documentação
- Atualizar changelog e versão para v1.10.2a1
## Contribuidores
@danglies007, @greysonlalonde
</Update>
<Update label="11 mar 2026">
## v1.10.2a1
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2a1)
## O que mudou
### Recursos
- Adicionar suporte para busca de ferramentas, salvamento de tokens e injeção dinâmica de ferramentas apropriadas durante a execução para Anthropics.
- Introduzir mais ferramentas de Busca Brave.
- Criar ação para lançamentos noturnos.
### Correções de Bugs
- Corrigir LockException durante a execução concorrente de múltiplos processos.
- Resolver problemas com a agrupação de resultados de ferramentas paralelas em uma única mensagem de usuário.
- Abordar resoluções de ferramentas MCP e eliminar todas as conexões mutáveis compartilhadas.
- Atualizar o manuseio de parâmetros LLM na função human_feedback.
- Adicionar métodos de lista/dicionário ausentes a LockedListProxy e LockedDictProxy.
- Propagar o contexto de contextvars para as threads de chamada de ferramentas paralelas.
- Atualizar a dependência gitpython para >=3.1.41 para resolver a vulnerabilidade de travessia de diretórios CVE.
### Refatoração
- Refatorar classes de memória para serem serializáveis.
### Documentação
- Atualizar o changelog e a versão para v1.10.1.
## Contribuidores
@akaKuruma, @github-actions[bot], @giulio-leone, @greysonlalonde, @joaomdmoura, @jonathansampson, @lorenzejay, @lucasgomide, @mattatcha
</Update>
<Update label="04 mar 2026">
## v1.10.1

View File

@@ -0,0 +1,39 @@
---
title: "Exportação OpenTelemetry"
description: "Exporte traces e logs das suas implantações CrewAI AMP para seu próprio coletor OpenTelemetry"
icon: "magnifying-glass-chart"
mode: "wide"
---
O CrewAI AMP pode exportar **traces** e **logs** do OpenTelemetry das suas implantações diretamente para seu próprio coletor. Isso permite que você monitore o desempenho dos agentes, rastreie chamadas de LLM e depure problemas usando sua stack de observabilidade existente.
Os dados de telemetria seguem as [convenções semânticas GenAI do OpenTelemetry](https://opentelemetry.io/docs/specs/semconv/gen-ai/) além de atributos adicionais específicos do CrewAI.
## Pré-requisitos
<CardGroup cols={2}>
<Card title="Conta CrewAI AMP" icon="users">
Sua organização deve ter uma conta CrewAI AMP ativa.
</Card>
<Card title="Coletor OpenTelemetry" icon="server">
Você precisa de um endpoint de coletor compatível com OpenTelemetry (por exemplo, seu próprio OTel Collector, Datadog, Grafana ou qualquer backend compatível com OTLP).
</Card>
</CardGroup>
## Configurando um coletor
1. No CrewAI AMP, vá para **Settings** > **OpenTelemetry Collectors**.
2. Clique em **Add Collector**.
3. Selecione um tipo de integração — **OpenTelemetry Traces** ou **OpenTelemetry Logs**.
4. Configure a conexão:
- **Endpoint** — O endpoint OTLP do seu coletor (por exemplo, `https://otel-collector.example.com:4317`).
- **Service Name** — Um nome para identificar este serviço na sua plataforma de observabilidade.
- **Custom Headers** *(opcional)* — Adicione headers de autenticação ou roteamento como pares chave-valor.
- **Certificate** *(opcional)* — Forneça um certificado TLS se o seu coletor exigir um.
5. Clique em **Save**.
<Frame>![Configuração do Coletor OpenTelemetry](/images/crewai-otel-collector-config.png)</Frame>
<Tip>
Você pode adicionar múltiplos coletores — por exemplo, um para traces e outro para logs, ou enviar para diferentes backends para diferentes propósitos.
</Tip>

View File

@@ -0,0 +1,136 @@
---
title: "Servidores MCP Personalizados"
description: "Conecte seus próprios servidores MCP ao CrewAI AMP com acesso público, autenticação por token ou OAuth 2.0"
icon: "plug"
mode: "wide"
---
O CrewAI AMP suporta a conexão com qualquer servidor MCP que implemente o [Model Context Protocol](https://modelcontextprotocol.io/). Você pode conectar servidores públicos que não exigem autenticação, servidores protegidos por chave de API ou token bearer, e servidores que utilizam OAuth 2.0 para acesso delegado seguro.
## Pré-requisitos
<CardGroup cols={2}>
<Card title="Conta CrewAI AMP" icon="user">
Você precisa de uma conta ativa no [CrewAI AMP](https://app.crewai.com).
</Card>
<Card title="URL do Servidor MCP" icon="link">
A URL do servidor MCP que você deseja conectar. O servidor deve ser acessível pela internet e suportar transporte Streamable HTTP.
</Card>
</CardGroup>
## Adicionando um Servidor MCP Personalizado
<Steps>
<Step title="Acesse Tools & Integrations">
Navegue até **Tools & Integrations** no menu lateral esquerdo do CrewAI AMP e selecione a aba **Connections**.
</Step>
<Step title="Inicie a adição de um Servidor MCP Personalizado">
Clique no botão **Add Custom MCP Server**. Um diálogo aparecerá com o formulário de configuração.
</Step>
<Step title="Preencha as informações básicas">
- **Name** (obrigatório): Um nome descritivo para seu servidor MCP (ex.: "Meu Servidor de Ferramentas Internas").
- **Description**: Um resumo opcional do que este servidor MCP fornece.
- **Server URL** (obrigatório): A URL completa do endpoint do seu servidor MCP (ex.: `https://my-server.example.com/mcp`).
</Step>
<Step title="Escolha um método de autenticação">
Selecione um dos três métodos de autenticação disponíveis com base em como seu servidor MCP está protegido. Veja as seções abaixo para detalhes sobre cada método.
</Step>
<Step title="Adicione headers personalizados (opcional)">
Se seu servidor MCP requer headers adicionais em cada requisição (ex.: identificadores de tenant ou headers de roteamento), clique em **+ Add Header** e forneça o nome e valor do header. Você pode adicionar múltiplos headers personalizados.
</Step>
<Step title="Crie a conexão">
Clique em **Create MCP Server** para salvar a conexão. Seu servidor MCP personalizado aparecerá na lista de Connections e suas ferramentas estarão disponíveis para uso nas suas crews.
</Step>
</Steps>
## Métodos de Autenticação
### Sem Autenticação
Escolha esta opção quando seu servidor MCP é publicamente acessível e não requer nenhuma credencial. Isso é comum para servidores open-source ou servidores internos rodando atrás de uma VPN.
### Token de Autenticação
Use este método quando seu servidor MCP é protegido por uma chave de API ou token bearer.
<Frame>
<img src="/images/enterprise/custom-mcp-auth-token.png" alt="Servidor MCP Personalizado com Token de Autenticação" />
</Frame>
| Campo | Obrigatório | Descrição |
|-------|-------------|-----------|
| **Header Name** | Sim | O nome do header HTTP que carrega o token (ex.: `X-API-Key`, `Authorization`). |
| **Value** | Sim | Sua chave de API ou token bearer. |
| **Add to** | Não | Onde anexar a credencial — **Header** (padrão) ou **Query parameter**. |
<Tip>
Se seu servidor espera um token `Bearer` no header `Authorization`, defina o Header Name como `Authorization` e o Value como `Bearer <seu-token>`.
</Tip>
### OAuth 2.0
Use este método para servidores MCP que requerem autorização OAuth 2.0. O CrewAI gerenciará todo o fluxo OAuth, incluindo a renovação de tokens.
<Frame>
<img src="/images/enterprise/custom-mcp-oauth.png" alt="Servidor MCP Personalizado com OAuth 2.0" />
</Frame>
| Campo | Obrigatório | Descrição |
|-------|-------------|-----------|
| **Redirect URI** | — | Preenchido automaticamente e somente leitura. Copie esta URI e registre-a como URI de redirecionamento autorizada no seu provedor OAuth. |
| **Authorization Endpoint** | Sim | A URL para onde os usuários são enviados para autorizar o acesso (ex.: `https://auth.example.com/oauth/authorize`). |
| **Token Endpoint** | Sim | A URL usada para trocar o código de autorização por um token de acesso (ex.: `https://auth.example.com/oauth/token`). |
| **Client ID** | Sim | O Client ID OAuth emitido pelo seu provedor. |
| **Client Secret** | Não | O Client Secret OAuth. Não é necessário para clientes públicos usando PKCE. |
| **Scopes** | Não | Lista de escopos separados por espaço a solicitar (ex.: `read write`). |
| **Token Auth Method** | Não | Como as credenciais do cliente são enviadas ao trocar tokens — **Standard (POST body)** ou **Basic Auth (header)**. Padrão é Standard. |
| **PKCE Supported** | Não | Ative se seu provedor OAuth suporta Proof Key for Code Exchange. Recomendado para maior segurança. |
<Info>
**Discover OAuth Config**: Se seu provedor OAuth suporta OpenID Connect Discovery, clique no link **Discover OAuth Config** para preencher automaticamente os endpoints de autorização e token a partir da URL `/.well-known/openid-configuration` do provedor.
</Info>
#### Configurando OAuth 2.0 Passo a Passo
<Steps>
<Step title="Registre a URI de redirecionamento">
Copie a **Redirect URI** exibida no formulário e adicione-a como URI de redirecionamento autorizada nas configurações do seu provedor OAuth.
</Step>
<Step title="Insira os endpoints e credenciais">
Preencha o **Authorization Endpoint**, **Token Endpoint**, **Client ID** e, opcionalmente, o **Client Secret** e **Scopes**.
</Step>
<Step title="Configure o método de troca de tokens">
Selecione o **Token Auth Method** apropriado. A maioria dos provedores usa o padrão **Standard (POST body)**. Alguns provedores mais antigos requerem **Basic Auth (header)**.
</Step>
<Step title="Ative o PKCE (recomendado)">
Marque **PKCE Supported** se seu provedor suporta. O PKCE adiciona uma camada extra de segurança ao fluxo de código de autorização e é recomendado para todas as novas integrações.
</Step>
<Step title="Crie e autorize">
Clique em **Create MCP Server**. Você será redirecionado ao seu provedor OAuth para autorizar o acesso. Uma vez autorizado, o CrewAI armazenará os tokens e os renovará automaticamente conforme necessário.
</Step>
</Steps>
## Usando Seu Servidor MCP Personalizado
Uma vez conectado, as ferramentas do seu servidor MCP personalizado aparecem junto com as conexões integradas na página **Tools & Integrations**. Você pode:
- **Atribuir ferramentas a agentes** nas suas crews, assim como qualquer outra ferramenta CrewAI.
- **Gerenciar visibilidade** para controlar quais membros da equipe podem usar o servidor.
- **Editar ou remover** a conexão a qualquer momento na lista de Connections.
<Warning>
Se seu servidor MCP ficar inacessível ou as credenciais expirarem, as chamadas de ferramentas usando esse servidor falharão. Certifique-se de que a URL do servidor seja estável e as credenciais estejam atualizadas.
</Warning>
<Card title="Precisa de Ajuda?" icon="headset" href="mailto:support@crewai.com">
Entre em contato com nossa equipe de suporte para assistência com configuração ou resolução de problemas de servidores MCP personalizados.
</Card>

View File

@@ -9,10 +9,7 @@ mode: "wide"
O Repositório de Ferramentas é um gerenciador de pacotes para ferramentas da CrewAI. Ele permite que usuários publiquem, instalem e gerenciem ferramentas que se integram com crews e flows da CrewAI.
As ferramentas podem ser:
- **Privadas**: acessíveis apenas dentro da sua organização (padrão)
- **Públicas**: acessíveis a todos os usuários CrewAI se publicadas com a flag `--public`
Todas as ferramentas são privadas por padrão e acessíveis apenas dentro da sua organização.
O repositório não é um sistema de controle de versões. Use Git para rastrear mudanças no código e permitir colaboração.
@@ -60,12 +57,6 @@ Para publicar a ferramenta:
crewai tool publish
```
Por padrão, as ferramentas são publicadas como privadas. Para tornar uma ferramenta pública:
```bash
crewai tool publish --public
```
Para mais detalhes sobre como construir ferramentas, acesse [Criando suas próprias ferramentas](/pt-BR/concepts/tools#creating-your-own-tools).
## Atualizando ferramentas

View File

@@ -62,22 +62,22 @@ Use a sintaxe `#` para selecionar ferramentas específicas de um servidor:
"https://mcp.exa.ai/mcp?api_key=sua_chave#web_search_exa"
```
### Marketplace CrewAI AMP
### Integrações MCP Conectadas
Acesse ferramentas do marketplace CrewAI AMP:
Conecte servidores MCP do catálogo CrewAI ou traga os seus próprios. Uma vez conectados em sua conta, referencie-os pelo slug:
```python
# Serviço completo com todas as ferramentas
"crewai-amp:financial-data"
# MCP conectado com todas as ferramentas
"snowflake"
# Ferramenta específica do serviço AMP
"crewai-amp:research-tools#pubmed_search"
# Ferramenta específica de um MCP conectado
"stripe#list_invoices"
# Múltiplos serviços AMP
# Múltiplos MCPs conectados
mcps=[
"crewai-amp:weather-insights",
"crewai-amp:market-analysis",
"crewai-amp:social-media-monitoring"
"snowflake",
"stripe",
"github"
]
```
@@ -99,10 +99,10 @@ agente_multi_fonte = Agent(
"https://mcp.exa.ai/mcp?api_key=sua_chave_exa&profile=pesquisa",
"https://weather.api.com/mcp#get_current_conditions",
# Marketplace CrewAI AMP
"crewai-amp:financial-insights",
"crewai-amp:academic-research#pubmed_search",
"crewai-amp:market-intelligence#competitor_analysis"
# MCPs conectados do catálogo
"snowflake",
"stripe#list_invoices",
"github#search_repositories"
]
)
@@ -154,7 +154,7 @@ agente = Agent(
"https://servidor-confiavel.com/mcp", # Vai funcionar
"https://servidor-inalcancavel.com/mcp", # Será ignorado graciosamente
"https://servidor-lento.com/mcp", # Timeout gracioso
"crewai-amp:servico-funcionando" # Vai funcionar
"snowflake" # MCP conectado do catálogo
]
)
# O agente usará ferramentas de servidores funcionais e registrará avisos para os que falharem
@@ -229,6 +229,6 @@ agente = Agent(
mcps=[
"https://api-principal.com/mcp", # Escolha principal
"https://api-backup.com/mcp", # Opção de backup
"crewai-amp:servico-confiavel" # Fallback AMP
"snowflake" # Fallback MCP conectado
]
```

View File

@@ -25,8 +25,8 @@ agent = Agent(
mcps=[
"https://mcp.exa.ai/mcp?api_key=sua_chave", # Servidor MCP externo
"https://api.weather.com/mcp#get_forecast", # Ferramenta específica do servidor
"crewai-amp:financial-data", # Marketplace CrewAI AMP
"crewai-amp:research-tools#pubmed_search" # Ferramenta AMP específica
"snowflake", # MCP conectado do catálogo
"stripe#list_invoices" # Ferramenta específica de MCP conectado
]
)
# Ferramentas MCP agora estão automaticamente disponíveis para seu agente!

View File

@@ -152,4 +152,4 @@ __all__ = [
"wrap_file_source",
]
__version__ = "1.10.2a1"
__version__ = "1.11.0"

View File

@@ -11,7 +11,7 @@ dependencies = [
"pytube~=15.0.0",
"requests~=2.32.5",
"docker~=7.1.0",
"crewai==1.10.2a1",
"crewai==1.11.0",
"tiktoken~=0.8.0",
"beautifulsoup4~=4.13.4",
"python-docx~=1.2.0",

View File

@@ -309,4 +309,4 @@ __all__ = [
"ZapierActionTools",
]
__version__ = "1.10.2a1"
__version__ = "1.11.0"

View File

@@ -1,7 +1,9 @@
from collections.abc import Callable
import os
from pathlib import Path
from typing import Any
from crewai.utilities.lock_store import lock as store_lock
from lancedb import ( # type: ignore[import-untyped]
DBConnection as LanceDBConnection,
connect as lancedb_connect,
@@ -33,10 +35,12 @@ class LanceDBAdapter(Adapter):
_db: LanceDBConnection = PrivateAttr()
_table: LanceDBTable = PrivateAttr()
_lock_name: str = PrivateAttr(default="")
def model_post_init(self, __context: Any) -> None:
self._db = lancedb_connect(self.uri)
self._table = self._db.open_table(self.table_name)
self._lock_name = f"lancedb:{os.path.realpath(str(self.uri))}"
super().model_post_init(__context)
@@ -56,4 +60,5 @@ class LanceDBAdapter(Adapter):
*args: Any,
**kwargs: Any,
) -> None:
self._table.add(*args, **kwargs)
with store_lock(self._lock_name):
self._table.add(*args, **kwargs)

View File

@@ -1,6 +1,9 @@
from __future__ import annotations
import asyncio
import contextvars
import logging
import threading
from typing import TYPE_CHECKING
@@ -18,6 +21,9 @@ class BrowserSessionManager:
This class maintains separate browser sessions for different threads,
enabling concurrent usage of browsers in multi-threaded environments.
Browsers are created lazily only when needed by tools.
Uses per-key events to serialize creation for the same thread_id without
blocking unrelated callers or wasting resources on duplicate sessions.
"""
def __init__(self, region: str = "us-west-2"):
@@ -27,8 +33,10 @@ class BrowserSessionManager:
region: AWS region for browser client
"""
self.region = region
self._lock = threading.Lock()
self._async_sessions: dict[str, tuple[BrowserClient, AsyncBrowser]] = {}
self._sync_sessions: dict[str, tuple[BrowserClient, SyncBrowser]] = {}
self._creating: dict[str, threading.Event] = {}
async def get_async_browser(self, thread_id: str) -> AsyncBrowser:
"""Get or create an async browser for the specified thread.
@@ -39,10 +47,29 @@ class BrowserSessionManager:
Returns:
An async browser instance specific to the thread
"""
if thread_id in self._async_sessions:
return self._async_sessions[thread_id][1]
loop = asyncio.get_event_loop()
while True:
with self._lock:
if thread_id in self._async_sessions:
return self._async_sessions[thread_id][1]
if thread_id not in self._creating:
self._creating[thread_id] = threading.Event()
break
event = self._creating[thread_id]
ctx = contextvars.copy_context()
await loop.run_in_executor(None, ctx.run, event.wait)
return await self._create_async_browser_session(thread_id)
try:
browser_client, browser = await self._create_async_browser_session(
thread_id
)
with self._lock:
self._async_sessions[thread_id] = (browser_client, browser)
return browser
finally:
with self._lock:
evt = self._creating.pop(thread_id)
evt.set()
def get_sync_browser(self, thread_id: str) -> SyncBrowser:
"""Get or create a sync browser for the specified thread.
@@ -53,19 +80,33 @@ class BrowserSessionManager:
Returns:
A sync browser instance specific to the thread
"""
if thread_id in self._sync_sessions:
return self._sync_sessions[thread_id][1]
while True:
with self._lock:
if thread_id in self._sync_sessions:
return self._sync_sessions[thread_id][1]
if thread_id not in self._creating:
self._creating[thread_id] = threading.Event()
break
event = self._creating[thread_id]
event.wait()
return self._create_sync_browser_session(thread_id)
try:
return self._create_sync_browser_session(thread_id)
finally:
with self._lock:
evt = self._creating.pop(thread_id)
evt.set()
async def _create_async_browser_session(self, thread_id: str) -> AsyncBrowser:
async def _create_async_browser_session(
self, thread_id: str
) -> tuple[BrowserClient, AsyncBrowser]:
"""Create a new async browser session for the specified thread.
Args:
thread_id: Unique identifier for the thread
Returns:
The newly created async browser instance
Tuple of (BrowserClient, AsyncBrowser).
Raises:
Exception: If browser session creation fails
@@ -75,10 +116,8 @@ class BrowserSessionManager:
browser_client = BrowserClient(region=self.region)
try:
# Start browser session
browser_client.start()
# Get WebSocket connection info
ws_url, headers = browser_client.generate_ws_headers()
logger.info(
@@ -87,7 +126,6 @@ class BrowserSessionManager:
from playwright.async_api import async_playwright
# Connect to browser using Playwright
playwright = await async_playwright().start()
browser = await playwright.chromium.connect_over_cdp(
endpoint_url=ws_url, headers=headers, timeout=30000
@@ -96,17 +134,13 @@ class BrowserSessionManager:
f"Successfully connected to async browser for thread {thread_id}"
)
# Store session resources
self._async_sessions[thread_id] = (browser_client, browser)
return browser
return browser_client, browser
except Exception as e:
logger.error(
f"Failed to create async browser session for thread {thread_id}: {e}"
)
# Clean up resources if session creation fails
if browser_client:
try:
browser_client.stop()
@@ -132,10 +166,8 @@ class BrowserSessionManager:
browser_client = BrowserClient(region=self.region)
try:
# Start browser session
browser_client.start()
# Get WebSocket connection info
ws_url, headers = browser_client.generate_ws_headers()
logger.info(
@@ -144,7 +176,6 @@ class BrowserSessionManager:
from playwright.sync_api import sync_playwright
# Connect to browser using Playwright
playwright = sync_playwright().start()
browser = playwright.chromium.connect_over_cdp(
endpoint_url=ws_url, headers=headers, timeout=30000
@@ -153,8 +184,8 @@ class BrowserSessionManager:
f"Successfully connected to sync browser for thread {thread_id}"
)
# Store session resources
self._sync_sessions[thread_id] = (browser_client, browser)
with self._lock:
self._sync_sessions[thread_id] = (browser_client, browser)
return browser
@@ -163,7 +194,6 @@ class BrowserSessionManager:
f"Failed to create sync browser session for thread {thread_id}: {e}"
)
# Clean up resources if session creation fails
if browser_client:
try:
browser_client.stop()
@@ -178,13 +208,13 @@ class BrowserSessionManager:
Args:
thread_id: Unique identifier for the thread
"""
if thread_id not in self._async_sessions:
logger.warning(f"No async browser session found for thread {thread_id}")
return
with self._lock:
if thread_id not in self._async_sessions:
logger.warning(f"No async browser session found for thread {thread_id}")
return
browser_client, browser = self._async_sessions[thread_id]
browser_client, browser = self._async_sessions.pop(thread_id)
# Close browser
if browser:
try:
await browser.close()
@@ -193,7 +223,6 @@ class BrowserSessionManager:
f"Error closing async browser for thread {thread_id}: {e}"
)
# Stop browser client
if browser_client:
try:
browser_client.stop()
@@ -202,8 +231,6 @@ class BrowserSessionManager:
f"Error stopping browser client for thread {thread_id}: {e}"
)
# Remove session from dictionary
del self._async_sessions[thread_id]
logger.info(f"Async browser session cleaned up for thread {thread_id}")
def close_sync_browser(self, thread_id: str) -> None:
@@ -212,13 +239,13 @@ class BrowserSessionManager:
Args:
thread_id: Unique identifier for the thread
"""
if thread_id not in self._sync_sessions:
logger.warning(f"No sync browser session found for thread {thread_id}")
return
with self._lock:
if thread_id not in self._sync_sessions:
logger.warning(f"No sync browser session found for thread {thread_id}")
return
browser_client, browser = self._sync_sessions[thread_id]
browser_client, browser = self._sync_sessions.pop(thread_id)
# Close browser
if browser:
try:
browser.close()
@@ -227,7 +254,6 @@ class BrowserSessionManager:
f"Error closing sync browser for thread {thread_id}: {e}"
)
# Stop browser client
if browser_client:
try:
browser_client.stop()
@@ -236,19 +262,17 @@ class BrowserSessionManager:
f"Error stopping browser client for thread {thread_id}: {e}"
)
# Remove session from dictionary
del self._sync_sessions[thread_id]
logger.info(f"Sync browser session cleaned up for thread {thread_id}")
async def close_all_browsers(self) -> None:
"""Close all browser sessions."""
# Close all async browsers
async_thread_ids = list(self._async_sessions.keys())
with self._lock:
async_thread_ids = list(self._async_sessions.keys())
sync_thread_ids = list(self._sync_sessions.keys())
for thread_id in async_thread_ids:
await self.close_async_browser(thread_id)
# Close all sync browsers
sync_thread_ids = list(self._sync_sessions.keys())
for thread_id in sync_thread_ids:
self.close_sync_browser(thread_id)

View File

@@ -1,9 +1,11 @@
import logging
import os
from pathlib import Path
from typing import Any
from uuid import uuid4
import chromadb
from crewai.utilities.lock_store import lock as store_lock
from pydantic import BaseModel, Field, PrivateAttr
from crewai_tools.rag.base_loader import BaseLoader
@@ -38,22 +40,32 @@ class RAG(Adapter):
_client: Any = PrivateAttr()
_collection: Any = PrivateAttr()
_embedding_service: EmbeddingService = PrivateAttr()
_lock_name: str = PrivateAttr(default="")
def model_post_init(self, __context: Any) -> None:
try:
if self.persist_directory:
self._client = chromadb.PersistentClient(path=self.persist_directory)
else:
self._client = chromadb.Client()
self._collection = self._client.get_or_create_collection(
name=self.collection_name,
metadata={
"hnsw:space": "cosine",
"description": "CrewAI Knowledge Base",
},
self._lock_name = (
f"chromadb:{os.path.realpath(self.persist_directory)}"
if self.persist_directory
else "chromadb:ephemeral"
)
with store_lock(self._lock_name):
if self.persist_directory:
self._client = chromadb.PersistentClient(
path=self.persist_directory
)
else:
self._client = chromadb.Client()
self._collection = self._client.get_or_create_collection(
name=self.collection_name,
metadata={
"hnsw:space": "cosine",
"description": "CrewAI Knowledge Base",
},
)
self._embedding_service = EmbeddingService(
provider=self.embedding_provider,
model=self.embedding_model,
@@ -87,29 +99,8 @@ class RAG(Adapter):
loader_result = loader.load(source_content)
doc_id = loader_result.doc_id
existing_doc = self._collection.get(
where={"source": source_content.source_ref}, limit=1
)
existing_doc_id = (
existing_doc and existing_doc["metadatas"][0]["doc_id"]
if existing_doc["metadatas"]
else None
)
if existing_doc_id == doc_id:
logger.warning(
f"Document with source {loader_result.source} already exists"
)
return
# Document with same source ref does exists but the content has changed, deleting the oldest reference
if existing_doc_id and existing_doc_id != loader_result.doc_id:
logger.warning(f"Deleting old document with doc_id {existing_doc_id}")
self._collection.delete(where={"doc_id": existing_doc_id})
documents = []
chunks = chunker.chunk(loader_result.content)
documents = []
for i, chunk in enumerate(chunks):
doc_metadata = (metadata or {}).copy()
doc_metadata["chunk_index"] = i
@@ -136,7 +127,6 @@ class RAG(Adapter):
ids = [doc.id for doc in documents]
metadatas = []
for doc in documents:
doc_metadata = doc.metadata.copy()
doc_metadata.update(
@@ -148,16 +138,36 @@ class RAG(Adapter):
)
metadatas.append(doc_metadata)
try:
self._collection.add(
ids=ids,
embeddings=embeddings,
documents=contents,
metadatas=metadatas,
with store_lock(self._lock_name):
existing_doc = self._collection.get(
where={"source": source_content.source_ref}, limit=1
)
logger.info(f"Added {len(documents)} documents to knowledge base")
except Exception as e:
logger.error(f"Failed to add documents to ChromaDB: {e}")
existing_doc_id = (
existing_doc and existing_doc["metadatas"][0]["doc_id"]
if existing_doc["metadatas"]
else None
)
if existing_doc_id == doc_id:
logger.warning(
f"Document with source {loader_result.source} already exists"
)
return
if existing_doc_id and existing_doc_id != loader_result.doc_id:
logger.warning(f"Deleting old document with doc_id {existing_doc_id}")
self._collection.delete(where={"doc_id": existing_doc_id})
try:
self._collection.add(
ids=ids,
embeddings=embeddings,
documents=contents,
metadatas=metadatas,
)
logger.info(f"Added {len(documents)} documents to knowledge base")
except Exception as e:
logger.error(f"Failed to add documents to ChromaDB: {e}")
def query(self, question: str, where: dict[str, Any] | None = None) -> str: # type: ignore
try:
@@ -201,7 +211,8 @@ class RAG(Adapter):
def delete_collection(self) -> None:
try:
self._client.delete_collection(self.collection_name)
with store_lock(self._lock_name):
self._client.delete_collection(self.collection_name)
logger.info(f"Deleted collection: {self.collection_name}")
except Exception as e:
logger.error(f"Failed to delete collection: {e}")

View File

@@ -1,4 +1,3 @@
from datetime import datetime
import json
import os
import time
@@ -10,8 +9,8 @@ from pydantic import BaseModel, Field
from pydantic.types import StringConstraints
import requests
from crewai_tools.tools.brave_search_tool.schemas import WebSearchParams
from crewai_tools.tools.brave_search_tool.base import _save_results_to_file
from crewai_tools.tools.brave_search_tool.schemas import WebSearchParams
load_dotenv()

View File

@@ -1,13 +1,27 @@
# CodeInterpreterTool
## Description
This tool is used to give the Agent the ability to run code (Python3) from the code generated by the Agent itself. The code is executed in a sandboxed environment, so it is safe to run any code.
This tool is used to give the Agent the ability to run code (Python3) from the code generated by the Agent itself. The code is executed in a Docker container for secure isolation.
It is incredible useful since it allows the Agent to generate code, run it in the same environment, get the result and use it to make decisions.
It is incredibly useful since it allows the Agent to generate code, run it in an isolated environment, get the result and use it to make decisions.
## ⚠️ Security Requirements
**Docker is REQUIRED** for safe code execution. The tool will refuse to execute code without Docker to prevent security vulnerabilities.
### Why Docker is Required
Previous versions included a "restricted sandbox" fallback when Docker was unavailable. This has been **removed** due to critical security vulnerabilities:
- The Python-based sandbox could be escaped via object introspection
- Attackers could recover the original `__import__` function and access any module
- This allowed arbitrary command execution on the host system
**Docker provides real process isolation** and is the only secure way to execute untrusted code.
## Requirements
- Docker
- **Docker (REQUIRED)** - Install from [docker.com](https://docs.docker.com/get-docker/)
## Installation
Install the crewai_tools package
@@ -17,7 +31,9 @@ pip install 'crewai[tools]'
## Example
Remember that when using this tool, the code must be generated by the Agent itself. The code must be a Python3 code. And it will take some time for the first time to run because it needs to build the Docker image.
Remember that when using this tool, the code must be generated by the Agent itself. The code must be Python3 code. It will take some time the first time to run because it needs to build the Docker image.
### Basic Usage (Docker Container - Recommended)
```python
from crewai_tools import CodeInterpreterTool
@@ -28,7 +44,9 @@ Agent(
)
```
Or if you need to pass your own Dockerfile just do this
### Custom Dockerfile
If you need to pass your own Dockerfile:
```python
from crewai_tools import CodeInterpreterTool
@@ -39,15 +57,39 @@ Agent(
)
```
If it is difficult to connect to docker daemon automatically (especially for macOS users), you can do this to setup docker host manually
### Manual Docker Host Configuration
If it is difficult to connect to the Docker daemon automatically (especially for macOS users), you can set up the Docker host manually:
```python
from crewai_tools import CodeInterpreterTool
Agent(
...
tools=[CodeInterpreterTool(user_docker_base_url="<Docker Host Base Url>",
user_dockerfile_path="<Dockerfile_path>")],
tools=[CodeInterpreterTool(
user_docker_base_url="<Docker Host Base Url>",
user_dockerfile_path="<Dockerfile_path>"
)],
)
```
### Unsafe Mode (NOT RECOMMENDED)
If you absolutely cannot use Docker and **fully trust the code source**, you can use unsafe mode:
```python
from crewai_tools import CodeInterpreterTool
# WARNING: Only use with fully trusted code!
Agent(
...
tools=[CodeInterpreterTool(unsafe_mode=True)],
)
```
**⚠️ SECURITY WARNING:** `unsafe_mode=True` executes code directly on the host without any isolation. Only use this if:
- You completely trust the code being executed
- You understand the security risks
- You cannot install Docker in your environment
For production use, **always use Docker** (the default mode).

View File

@@ -8,6 +8,7 @@ potentially unsafe operations and importing restricted modules.
import importlib.util
import os
import subprocess
import sys
from types import ModuleType
from typing import Any, ClassVar, TypedDict
@@ -50,11 +51,16 @@ class CodeInterpreterSchema(BaseModel):
class SandboxPython:
"""A restricted Python execution environment for running code safely.
"""INSECURE: A restricted Python execution environment with known vulnerabilities.
This class provides methods to safely execute Python code by restricting access to
potentially dangerous modules and built-in functions. It creates a sandboxed
environment where harmful operations are blocked.
WARNING: This class does NOT provide real security isolation and is vulnerable to
sandbox escape attacks via Python object introspection. Attackers can recover the
original __import__ function and bypass all restrictions.
DO NOT USE for untrusted code execution. Use Docker containers instead.
This class attempts to restrict access to dangerous modules and built-in functions
but provides no real security boundary against a motivated attacker.
"""
BLOCKED_MODULES: ClassVar[set[str]] = {
@@ -299,8 +305,8 @@ class CodeInterpreterTool(BaseTool):
def run_code_safety(self, code: str, libraries_used: list[str]) -> str:
"""Runs code in the safest available environment.
Attempts to run code in Docker if available, falls back to a restricted
sandbox if Docker is not available.
Requires Docker to be available for secure code execution. Fails closed
if Docker is not available to prevent sandbox escape vulnerabilities.
Args:
code: The Python code to execute as a string.
@@ -308,10 +314,24 @@ class CodeInterpreterTool(BaseTool):
Returns:
The output of the executed code as a string.
Raises:
RuntimeError: If Docker is not available, as the restricted sandbox
is vulnerable to escape attacks and should not be used
for untrusted code execution.
"""
if self._check_docker_available():
return self.run_code_in_docker(code, libraries_used)
return self.run_code_in_restricted_sandbox(code)
error_msg = (
"Docker is required for safe code execution but is not available. "
"The restricted sandbox fallback has been removed due to security vulnerabilities "
"that allow sandbox escape via Python object introspection. "
"Please install Docker (https://docs.docker.com/get-docker/) or use unsafe_mode=True "
"if you trust the code source and understand the security risks."
)
Printer.print(error_msg, color="bold_red")
raise RuntimeError(error_msg)
def run_code_in_docker(self, code: str, libraries_used: list[str]) -> str:
"""Runs Python code in a Docker container for safe isolation.
@@ -342,10 +362,19 @@ class CodeInterpreterTool(BaseTool):
@staticmethod
def run_code_in_restricted_sandbox(code: str) -> str:
"""Runs Python code in a restricted sandbox environment.
"""DEPRECATED AND INSECURE: Runs Python code in a restricted sandbox environment.
Executes the code with restricted access to potentially dangerous modules and
built-in functions for basic safety when Docker is not available.
WARNING: This method is vulnerable to sandbox escape attacks via Python object
introspection and should NOT be used for untrusted code execution. It has been
deprecated and is only kept for backward compatibility with trusted code.
The "restricted" environment can be bypassed by attackers who can:
- Use object graph introspection to recover the original __import__ function
- Access any Python module including os, subprocess, sys, etc.
- Execute arbitrary commands on the host system
Use run_code_in_docker() for secure code execution, or run_code_unsafe()
if you explicitly acknowledge the security risks.
Args:
code: The Python code to execute as a string.
@@ -354,7 +383,10 @@ class CodeInterpreterTool(BaseTool):
The value of the 'result' variable from the executed code,
or an error message if execution failed.
"""
Printer.print("Running code in restricted sandbox", color="yellow")
Printer.print(
"WARNING: Running code in INSECURE restricted sandbox (vulnerable to escape attacks)",
color="bold_red"
)
exec_locals: dict[str, Any] = {}
try:
SandboxPython.exec(code=code, locals_=exec_locals)
@@ -380,7 +412,7 @@ class CodeInterpreterTool(BaseTool):
Printer.print("WARNING: Running code in unsafe mode", color="bold_magenta")
# Install libraries on the host machine
for library in libraries_used:
os.system(f"pip install {library}") # noqa: S605
subprocess.run([sys.executable, "-m", "pip", "install", library], check=False) # noqa: S603
# Execute the code
try:

View File

@@ -30,9 +30,8 @@ class FileWriterTool(BaseTool):
def _run(self, **kwargs: Any) -> str:
try:
# Create the directory if it doesn't exist
if kwargs.get("directory") and not os.path.exists(kwargs["directory"]):
os.makedirs(kwargs["directory"])
if kwargs.get("directory"):
os.makedirs(kwargs["directory"], exist_ok=True)
# Construct the full path
filepath = os.path.join(kwargs.get("directory") or "", kwargs["filename"])

View File

@@ -99,8 +99,8 @@ class FileCompressorTool(BaseTool):
def _prepare_output(output_path: str, overwrite: bool) -> bool:
"""Ensures output path is ready for writing."""
output_dir = os.path.dirname(output_path)
if output_dir and not os.path.exists(output_dir):
os.makedirs(output_dir)
if output_dir:
os.makedirs(output_dir, exist_ok=True)
if os.path.exists(output_path) and not overwrite:
return False
return True

View File

@@ -18,7 +18,6 @@ class MergeAgentHandlerToolError(Exception):
"""Base exception for Merge Agent Handler tool errors."""
class MergeAgentHandlerTool(BaseTool):
"""
Wrapper for Merge Agent Handler tools.
@@ -174,7 +173,7 @@ class MergeAgentHandlerTool(BaseTool):
>>> tool = MergeAgentHandlerTool.from_tool_name(
... tool_name="linear__create_issue",
... tool_pack_id="134e0111-0f67-44f6-98f0-597000290bb3",
... registered_user_id="91b2b905-e866-40c8-8be2-efe53827a0aa"
... registered_user_id="91b2b905-e866-40c8-8be2-efe53827a0aa",
... )
"""
# Create an empty args schema model (proper BaseModel subclass)
@@ -210,7 +209,10 @@ class MergeAgentHandlerTool(BaseTool):
if "parameters" in tool_schema:
try:
params = tool_schema["parameters"]
if params.get("type") == "object" and "properties" in params:
if (
params.get("type") == "object"
and "properties" in params
):
# Build field definitions for Pydantic
fields = {}
properties = params["properties"]
@@ -298,7 +300,7 @@ class MergeAgentHandlerTool(BaseTool):
>>> tools = MergeAgentHandlerTool.from_tool_pack(
... tool_pack_id="134e0111-0f67-44f6-98f0-597000290bb3",
... registered_user_id="91b2b905-e866-40c8-8be2-efe53827a0aa",
... tool_names=["linear__create_issue", "linear__get_issues"]
... tool_names=["linear__create_issue", "linear__get_issues"],
... )
"""
# Create a temporary instance to fetch the tool list

View File

@@ -110,11 +110,13 @@ class QdrantVectorSearchTool(BaseTool):
self.custom_embedding_fn(query)
if self.custom_embedding_fn
else (
lambda: __import__("openai")
.Client(api_key=os.getenv("OPENAI_API_KEY"))
.embeddings.create(input=[query], model="text-embedding-3-large")
.data[0]
.embedding
lambda: (
__import__("openai")
.Client(api_key=os.getenv("OPENAI_API_KEY"))
.embeddings.create(input=[query], model="text-embedding-3-large")
.data[0]
.embedding
)
)()
)
results = self.client.query_points(

View File

@@ -3,6 +3,7 @@ from __future__ import annotations
import asyncio
from concurrent.futures import ThreadPoolExecutor
import logging
import threading
from typing import TYPE_CHECKING, Any
from crewai.tools.base_tool import BaseTool
@@ -33,6 +34,7 @@ logger = logging.getLogger(__name__)
# Cache for query results
_query_cache: dict[str, list[dict[str, Any]]] = {}
_cache_lock = threading.Lock()
class SnowflakeConfig(BaseModel):
@@ -102,7 +104,7 @@ class SnowflakeSearchTool(BaseTool):
)
_connection_pool: list[SnowflakeConnection] | None = None
_pool_lock: asyncio.Lock | None = None
_pool_lock: threading.Lock | None = None
_thread_pool: ThreadPoolExecutor | None = None
_model_rebuilt: bool = False
package_dependencies: list[str] = Field(
@@ -122,7 +124,7 @@ class SnowflakeSearchTool(BaseTool):
try:
if SNOWFLAKE_AVAILABLE:
self._connection_pool = []
self._pool_lock = asyncio.Lock()
self._pool_lock = threading.Lock()
self._thread_pool = ThreadPoolExecutor(max_workers=self.pool_size)
else:
raise ImportError
@@ -147,7 +149,7 @@ class SnowflakeSearchTool(BaseTool):
)
self._connection_pool = []
self._pool_lock = asyncio.Lock()
self._pool_lock = threading.Lock()
self._thread_pool = ThreadPoolExecutor(max_workers=self.pool_size)
except subprocess.CalledProcessError as e:
raise ImportError("Failed to install Snowflake dependencies") from e
@@ -163,13 +165,12 @@ class SnowflakeSearchTool(BaseTool):
raise RuntimeError("Pool lock not initialized")
if self._connection_pool is None:
raise RuntimeError("Connection pool not initialized")
async with self._pool_lock:
if not self._connection_pool:
conn = await asyncio.get_event_loop().run_in_executor(
self._thread_pool, self._create_connection
)
self._connection_pool.append(conn)
return self._connection_pool.pop()
with self._pool_lock:
if self._connection_pool:
return self._connection_pool.pop()
return await asyncio.get_event_loop().run_in_executor(
self._thread_pool, self._create_connection
)
def _create_connection(self) -> SnowflakeConnection:
"""Create a new Snowflake connection."""
@@ -204,9 +205,10 @@ class SnowflakeSearchTool(BaseTool):
"""Execute a query with retries and return results."""
if self.enable_caching:
cache_key = self._get_cache_key(query, timeout)
if cache_key in _query_cache:
logger.info("Returning cached result")
return _query_cache[cache_key]
with _cache_lock:
if cache_key in _query_cache:
logger.info("Returning cached result")
return _query_cache[cache_key]
for attempt in range(self.max_retries):
try:
@@ -225,7 +227,8 @@ class SnowflakeSearchTool(BaseTool):
]
if self.enable_caching:
_query_cache[self._get_cache_key(query, timeout)] = results
with _cache_lock:
_query_cache[self._get_cache_key(query, timeout)] = results
return results
finally:
@@ -234,7 +237,7 @@ class SnowflakeSearchTool(BaseTool):
self._pool_lock is not None
and self._connection_pool is not None
):
async with self._pool_lock:
with self._pool_lock:
self._connection_pool.append(conn)
except (DatabaseError, OperationalError) as e: # noqa: PERF203
if attempt == self.max_retries - 1:

View File

@@ -1,4 +1,5 @@
import asyncio
import contextvars
import json
import os
import re
@@ -137,7 +138,9 @@ class StagehandTool(BaseTool):
- 'observe': For finding elements in a specific area
"""
args_schema: type[BaseModel] = StagehandToolSchema
package_dependencies: list[str] = Field(default_factory=lambda: ["stagehand<=0.5.9"])
package_dependencies: list[str] = Field(
default_factory=lambda: ["stagehand<=0.5.9"]
)
env_vars: list[EnvVar] = Field(
default_factory=lambda: [
EnvVar(
@@ -620,9 +623,12 @@ class StagehandTool(BaseTool):
# We're in an existing event loop, use it
import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit(
asyncio.run, self._async_run(instruction, url, command_type)
ctx.run,
asyncio.run,
self._async_run(instruction, url, command_type),
)
result = future.result()
else:
@@ -706,11 +712,12 @@ class StagehandTool(BaseTool):
if loop.is_running():
import concurrent.futures
ctx = contextvars.copy_context()
with (
concurrent.futures.ThreadPoolExecutor() as executor
):
future = executor.submit(
asyncio.run, self._async_close()
ctx.run, asyncio.run, self._async_close()
)
future.result()
else:

View File

@@ -1,3 +1,4 @@
import sys
from unittest.mock import patch
from crewai_tools.tools.code_interpreter_tool.code_interpreter_tool import (
@@ -76,24 +77,22 @@ print("This is line 2")"""
)
def test_restricted_sandbox_basic_code_execution(printer_mock, docker_unavailable_mock):
"""Test basic code execution."""
def test_docker_unavailable_raises_error(printer_mock, docker_unavailable_mock):
"""Test that execution fails when Docker is unavailable in safe mode."""
tool = CodeInterpreterTool()
code = """
result = 2 + 2
print(result)
"""
result = tool.run(code=code, libraries_used=[])
printer_mock.assert_called_with(
"Running code in restricted sandbox", color="yellow"
)
assert result == 4
with pytest.raises(RuntimeError) as exc_info:
tool.run(code=code, libraries_used=[])
assert "Docker is required for safe code execution" in str(exc_info.value)
assert "sandbox escape" in str(exc_info.value)
def test_restricted_sandbox_running_with_blocked_modules(
printer_mock, docker_unavailable_mock
):
"""Test that restricted modules cannot be imported."""
def test_restricted_sandbox_running_with_blocked_modules():
"""Test that restricted modules cannot be imported when using the deprecated sandbox directly."""
tool = CodeInterpreterTool()
restricted_modules = SandboxPython.BLOCKED_MODULES
@@ -102,18 +101,15 @@ def test_restricted_sandbox_running_with_blocked_modules(
import {module}
result = "Import succeeded"
"""
result = tool.run(code=code, libraries_used=[])
printer_mock.assert_called_with(
"Running code in restricted sandbox", color="yellow"
)
# Note: run_code_in_restricted_sandbox is deprecated and insecure
# This test verifies the old behavior but should not be used in production
result = tool.run_code_in_restricted_sandbox(code)
assert f"An error occurred: Importing '{module}' is not allowed" in result
def test_restricted_sandbox_running_with_blocked_builtins(
printer_mock, docker_unavailable_mock
):
"""Test that restricted builtins are not available."""
def test_restricted_sandbox_running_with_blocked_builtins():
"""Test that restricted builtins are not available when using the deprecated sandbox directly."""
tool = CodeInterpreterTool()
restricted_builtins = SandboxPython.UNSAFE_BUILTINS
@@ -122,25 +118,23 @@ def test_restricted_sandbox_running_with_blocked_builtins(
{builtin}("test")
result = "Builtin available"
"""
result = tool.run(code=code, libraries_used=[])
printer_mock.assert_called_with(
"Running code in restricted sandbox", color="yellow"
)
# Note: run_code_in_restricted_sandbox is deprecated and insecure
# This test verifies the old behavior but should not be used in production
result = tool.run_code_in_restricted_sandbox(code)
assert f"An error occurred: name '{builtin}' is not defined" in result
def test_restricted_sandbox_running_with_no_result_variable(
printer_mock, docker_unavailable_mock
):
"""Test behavior when no result variable is set."""
"""Test behavior when no result variable is set in deprecated sandbox."""
tool = CodeInterpreterTool()
code = """
x = 10
"""
result = tool.run(code=code, libraries_used=[])
printer_mock.assert_called_with(
"Running code in restricted sandbox", color="yellow"
)
# Note: run_code_in_restricted_sandbox is deprecated and insecure
# This test verifies the old behavior but should not be used in production
result = tool.run_code_in_restricted_sandbox(code)
assert result == "No result variable found."
@@ -159,6 +153,44 @@ x = 10
assert result == "No result variable found."
@patch("crewai_tools.tools.code_interpreter_tool.code_interpreter_tool.subprocess.run")
def test_unsafe_mode_installs_libraries_without_shell(
subprocess_run_mock, printer_mock, docker_unavailable_mock
):
"""Test that library installation uses subprocess.run with shell=False, not os.system."""
tool = CodeInterpreterTool(unsafe_mode=True)
code = "result = 1"
libraries_used = ["numpy", "pandas"]
tool.run(code=code, libraries_used=libraries_used)
assert subprocess_run_mock.call_count == 2
for call, library in zip(subprocess_run_mock.call_args_list, libraries_used):
args, kwargs = call
# Must be list form (no shell expansion possible)
assert args[0] == [sys.executable, "-m", "pip", "install", library]
# shell= must not be True (defaults to False)
assert kwargs.get("shell", False) is False
@patch("crewai_tools.tools.code_interpreter_tool.code_interpreter_tool.subprocess.run")
def test_unsafe_mode_library_name_with_shell_metacharacters_does_not_invoke_shell(
subprocess_run_mock, printer_mock, docker_unavailable_mock
):
"""Test that a malicious library name cannot inject shell commands."""
tool = CodeInterpreterTool(unsafe_mode=True)
code = "result = 1"
malicious_library = "numpy; rm -rf /"
tool.run(code=code, libraries_used=[malicious_library])
subprocess_run_mock.assert_called_once()
args, kwargs = subprocess_run_mock.call_args
# The entire malicious string is passed as a single argument — no shell parsing
assert args[0] == [sys.executable, "-m", "pip", "install", malicious_library]
assert kwargs.get("shell", False) is False
def test_unsafe_mode_running_unsafe_code(printer_mock, docker_unavailable_mock):
"""Test behavior when no result variable is set."""
tool = CodeInterpreterTool(unsafe_mode=True)
@@ -172,3 +204,50 @@ result = eval("5/1")
"WARNING: Running code in unsafe mode", color="bold_magenta"
)
assert 5.0 == result
@pytest.mark.xfail(
reason=(
"run_code_in_restricted_sandbox is known to be vulnerable to sandbox "
"escape via object introspection. This test encodes the desired secure "
"behavior (no escape possible) and will start passing once the "
"vulnerability is fixed or the function is removed."
)
)
def test_sandbox_escape_vulnerability_demonstration(printer_mock):
"""Demonstrate that the restricted sandbox is vulnerable to escape attacks.
This test shows that an attacker can use Python object introspection to bypass
the restricted sandbox and access blocked modules like 'os'. This is why the
sandbox should never be used for untrusted code execution.
NOTE: This test uses the deprecated run_code_in_restricted_sandbox directly
to demonstrate the vulnerability. In production, Docker is now required.
"""
tool = CodeInterpreterTool()
# Classic Python sandbox escape via object introspection
escape_code = """
# Recover the real __import__ function via object introspection
for cls in ().__class__.__bases__[0].__subclasses__():
if cls.__name__ == 'catch_warnings':
# Get the real builtins module
real_builtins = cls()._module.__builtins__
real_import = real_builtins['__import__']
# Now we can import os and execute commands
os = real_import('os')
# Demonstrate we have escaped the sandbox
result = "SANDBOX_ESCAPED" if hasattr(os, 'system') else "FAILED"
break
"""
# The deprecated sandbox is vulnerable to this attack
result = tool.run_code_in_restricted_sandbox(escape_code)
# Desired behavior: the restricted sandbox should prevent this escape.
# If this assertion fails, run_code_in_restricted_sandbox remains vulnerable.
assert result != "SANDBOX_ESCAPED", (
"The restricted sandbox was bypassed via object introspection. "
"This indicates run_code_in_restricted_sandbox is still vulnerable and "
"is why Docker is now required for safe code execution."
)

View File

@@ -53,7 +53,7 @@ Repository = "https://github.com/crewAIInc/crewAI"
[project.optional-dependencies]
tools = [
"crewai-tools==1.10.2a1",
"crewai-tools==1.11.0",
]
embeddings = [
"tiktoken~=0.8.0"

View File

@@ -1,9 +1,11 @@
import contextvars
import threading
from typing import Any
import urllib.request
import warnings
from crewai.agent.core import Agent
from crewai.agent.planning_config import PlanningConfig
from crewai.crew import Crew
from crewai.crews.crew_output import CrewOutput
from crewai.flow.flow import Flow
@@ -40,7 +42,7 @@ def _suppress_pydantic_deprecation_warnings() -> None:
_suppress_pydantic_deprecation_warnings()
__version__ = "1.10.2a1"
__version__ = "1.11.0"
_telemetry_submitted = False
@@ -66,7 +68,8 @@ def _track_install() -> None:
def _track_install_async() -> None:
"""Track installation in background thread to avoid blocking imports."""
if not Telemetry._is_telemetry_disabled():
thread = threading.Thread(target=_track_install, daemon=True)
ctx = contextvars.copy_context()
thread = threading.Thread(target=ctx.run, args=(_track_install,), daemon=True)
thread.start()
@@ -100,6 +103,7 @@ __all__ = [
"Knowledge",
"LLMGuardrail",
"Memory",
"PlanningConfig",
"Process",
"Task",
"TaskOutput",

View File

@@ -13,6 +13,7 @@ from crewai.a2a.auth.client_schemes import (
)
from crewai.a2a.auth.server_schemes import (
AuthenticatedUser,
EnterpriseTokenAuth,
OIDCAuth,
ServerAuthScheme,
SimpleTokenAuth,
@@ -25,6 +26,7 @@ __all__ = [
"AuthenticatedUser",
"BearerTokenAuth",
"ClientAuthScheme",
"EnterpriseTokenAuth",
"HTTPBasicAuth",
"HTTPDigestAuth",
"OAuth2AuthorizationCode",

View File

@@ -4,6 +4,7 @@ These schemes validate incoming requests to A2A server endpoints.
Supported authentication methods:
- Simple token validation with static bearer tokens
- Enterprise token validation (via PlusAPI)
- OpenID Connect with JWT validation using JWKS
- OAuth2 with JWT validation or token introspection
"""
@@ -16,6 +17,7 @@ import logging
import os
from typing import TYPE_CHECKING, Annotated, Any, ClassVar, Literal
import httpx
import jwt
from jwt import PyJWKClient
from pydantic import (
@@ -33,6 +35,7 @@ from typing_extensions import Self
if TYPE_CHECKING:
from a2a.types import OAuth2SecurityScheme
from jwt.types import Options
logger = logging.getLogger(__name__)
@@ -183,6 +186,24 @@ class SimpleTokenAuth(ServerAuthScheme):
)
class EnterpriseTokenAuth(ServerAuthScheme):
"""Enterprise token authentication.
Validates tokens via the PlusAPI enterprise verification endpoint.
"""
async def authenticate(self, token: str) -> AuthenticatedUser:
"""Authenticate using enterprise token verification.
Args:
token: The bearer token to authenticate.
Raises:
NotImplementedError
"""
raise NotImplementedError
class OIDCAuth(ServerAuthScheme):
"""OpenID Connect authentication.
@@ -475,7 +496,7 @@ class OAuth2ServerAuth(ServerAuthScheme):
try:
signing_key = self._jwk_client.get_signing_key_from_jwt(token)
decode_options: dict[str, Any] = {
decode_options: Options = {
"require": self.required_claims,
}
@@ -556,7 +577,6 @@ class OAuth2ServerAuth(ServerAuthScheme):
async def _authenticate_introspection(self, token: str) -> AuthenticatedUser:
"""Authenticate using OAuth2 token introspection (RFC 7662)."""
import httpx
if not self.introspection_url:
raise HTTPException(

View File

@@ -633,6 +633,10 @@ class A2AServerConfig(BaseModel):
default=False,
description="Whether agent provides extended card to authenticated users",
)
extended_skills: list[AgentSkill] = Field(
default_factory=list,
description="Additional skills visible only to authenticated users in the extended card",
)
url: Url | None = Field(
default=None,
description="Preferred endpoint URL for the agent. Set at runtime if not provided.",

View File

@@ -63,6 +63,9 @@ class A2AErrorCode(IntEnum):
INVALID_AGENT_RESPONSE = -32006
"""The agent produced an invalid response."""
AUTHENTICATED_EXTENDED_CARD_NOT_CONFIGURED = -32007
"""Authenticated extended card feature is not configured."""
# CrewAI Custom Extensions (-32768 to -32100)
UNSUPPORTED_VERSION = -32009
"""The requested A2A protocol version is not supported."""
@@ -108,6 +111,7 @@ ERROR_MESSAGES: dict[int, str] = {
A2AErrorCode.UNSUPPORTED_OPERATION: "This operation is not supported",
A2AErrorCode.CONTENT_TYPE_NOT_SUPPORTED: "Incompatible content types",
A2AErrorCode.INVALID_AGENT_RESPONSE: "Invalid agent response",
A2AErrorCode.AUTHENTICATED_EXTENDED_CARD_NOT_CONFIGURED: "Authenticated Extended Card is not configured",
A2AErrorCode.UNSUPPORTED_VERSION: "Unsupported A2A version",
A2AErrorCode.UNSUPPORTED_EXTENSION: "Client does not support required extensions",
A2AErrorCode.AUTHENTICATION_REQUIRED: "Authentication required",
@@ -284,6 +288,15 @@ class InvalidAgentResponseError(A2AError):
code: int = field(default=A2AErrorCode.INVALID_AGENT_RESPONSE, init=False)
@dataclass
class AuthenticatedExtendedCardNotConfiguredError(A2AError):
"""Authenticated extended card is not configured."""
code: int = field(
default=A2AErrorCode.AUTHENTICATED_EXTENDED_CARD_NOT_CONFIGURED, init=False
)
@dataclass
class UnsupportedVersionError(A2AError):
"""The requested A2A version is not supported."""

View File

@@ -5,6 +5,7 @@ from __future__ import annotations
import asyncio
from collections.abc import MutableMapping
import concurrent.futures
import contextvars
from functools import lru_cache
import ssl
import time
@@ -147,8 +148,9 @@ def fetch_agent_card(
has_running_loop = False
if has_running_loop:
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
return pool.submit(asyncio.run, coro).result()
return pool.submit(ctx.run, asyncio.run, coro).result()
return asyncio.run(coro)
@@ -215,8 +217,9 @@ def _fetch_agent_card_cached(
has_running_loop = False
if has_running_loop:
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
return pool.submit(asyncio.run, coro).result()
return pool.submit(ctx.run, asyncio.run, coro).result()
return asyncio.run(coro)

View File

@@ -7,6 +7,7 @@ import base64
from collections.abc import AsyncIterator, Callable, MutableMapping
import concurrent.futures
from contextlib import asynccontextmanager
import contextvars
import logging
from typing import TYPE_CHECKING, Any, Final, Literal
import uuid
@@ -229,8 +230,9 @@ def execute_a2a_delegation(
has_running_loop = False
if has_running_loop:
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
return pool.submit(asyncio.run, coro).result()
return pool.submit(ctx.run, asyncio.run, coro).result()
return asyncio.run(coro)

View File

@@ -8,6 +8,7 @@ from __future__ import annotations
import asyncio
from collections.abc import Callable, Coroutine, Mapping
from concurrent.futures import ThreadPoolExecutor, as_completed
import contextvars
from functools import wraps
import json
from types import MethodType
@@ -278,7 +279,9 @@ def _fetch_agent_cards_concurrently(
max_workers = min(len(a2a_agents), 10)
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = {
executor.submit(_fetch_card_from_config, config): config
executor.submit(
contextvars.copy_context().run, _fetch_card_from_config, config
): config
for config in a2a_agents
}
for future in as_completed(futures):

View File

@@ -2,6 +2,7 @@ from __future__ import annotations
import asyncio
from collections.abc import Callable, Coroutine, Sequence
import contextvars
import shutil
import subprocess
import time
@@ -22,6 +23,7 @@ from pydantic import (
)
from typing_extensions import Self
from crewai.agent.planning_config import PlanningConfig
from crewai.agent.utils import (
ahandle_knowledge_retrieval,
apply_training_data,
@@ -191,13 +193,23 @@ class Agent(BaseAgent):
default="safe",
description="Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct execution).",
)
reasoning: bool = Field(
planning_config: PlanningConfig | None = Field(
default=None,
description="Configuration for agent planning before task execution.",
)
planning: bool = Field(
default=False,
description="Whether the agent should reflect and create a plan before executing a task.",
)
reasoning: bool = Field(
default=False,
description="[DEPRECATED: Use planning_config instead] Whether the agent should reflect and create a plan before executing a task.",
deprecated=True,
)
max_reasoning_attempts: int | None = Field(
default=None,
description="Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
description="[DEPRECATED: Use planning_config.max_attempts instead] Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
deprecated=True,
)
embedder: EmbedderConfig | None = Field(
default=None,
@@ -264,8 +276,26 @@ class Agent(BaseAgent):
if self.allow_code_execution:
self._validate_docker_installation()
# Handle backward compatibility: convert reasoning=True to planning_config
if self.reasoning and self.planning_config is None:
import warnings
warnings.warn(
"The 'reasoning' parameter is deprecated. Use 'planning_config=PlanningConfig()' instead.",
DeprecationWarning,
stacklevel=2,
)
self.planning_config = PlanningConfig(
max_attempts=self.max_reasoning_attempts,
)
return self
@property
def planning_enabled(self) -> bool:
"""Check if planning is enabled for this agent."""
return self.planning_config is not None or self.planning
def _setup_agent_executor(self) -> None:
if not self.cache_handler:
self.cache_handler = CacheHandler()
@@ -334,7 +364,11 @@ class Agent(BaseAgent):
ValueError: If the max execution time is not a positive integer.
RuntimeError: If the agent execution fails for other reasons.
"""
handle_reasoning(self, task)
# Only call handle_reasoning for legacy CrewAgentExecutor
# For AgentExecutor, planning is handled in AgentExecutor.generate_plan()
if self.executor_class is not AgentExecutor:
handle_reasoning(self, task)
self._inject_date_to_task(task)
if self.tools_handler:
@@ -513,9 +547,13 @@ class Agent(BaseAgent):
"""
import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit(
self._execute_without_timeout, task_prompt=task_prompt, task=task
ctx.run,
self._execute_without_timeout,
task_prompt=task_prompt,
task=task,
)
try:
@@ -572,7 +610,10 @@ class Agent(BaseAgent):
ValueError: If the max execution time is not a positive integer.
RuntimeError: If the agent execution fails for other reasons.
"""
handle_reasoning(self, task)
if self.executor_class is not AgentExecutor:
handle_reasoning(
self, task
) # we need this till CrewAgentExecutor migrates to AgentExecutor
self._inject_date_to_task(task)
if self.tools_handler:
@@ -1418,17 +1459,19 @@ class Agent(BaseAgent):
except Exception as e:
self._logger.log("error", f"Failed to save kickoff result to memory: {e}")
def _execute_and_build_output(
def _build_output_from_result(
self,
result: dict[str, Any],
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent and build the output object.
"""Build a LiteAgentOutput from an executor result dict.
Shared logic used by both sync and async execution paths.
Args:
result: The result dictionary from executor.invoke / invoke_async.
executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format.
Returns:
@@ -1436,8 +1479,6 @@ class Agent(BaseAgent):
"""
import json
# Execute the agent (this is called from sync path, so invoke returns dict)
result = cast(dict[str, Any], executor.invoke(inputs))
output = result.get("output", "")
# Handle response format conversion
@@ -1485,91 +1526,39 @@ class Agent(BaseAgent):
else str(raw_output)
)
todo_results = LiteAgentOutput.from_todo_items(executor.state.todos.items)
return LiteAgentOutput(
raw=raw_str,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages,
messages=list(executor.state.messages),
plan=executor.state.plan,
todos=todo_results,
replan_count=executor.state.replan_count,
last_replan_reason=executor.state.last_replan_reason,
)
def _execute_and_build_output(
self,
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent synchronously and build the output object."""
result = cast(dict[str, Any], executor.invoke(inputs))
return self._build_output_from_result(result, executor, response_format)
async def _execute_and_build_output_async(
self,
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent asynchronously and build the output object.
This is the async version of _execute_and_build_output that uses
invoke_async() for native async execution within event loops.
Args:
executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format.
Returns:
LiteAgentOutput with raw output, formatted result, and metrics.
"""
import json
# Execute the agent asynchronously
"""Execute the agent asynchronously and build the output object."""
result = await executor.invoke_async(inputs)
output = result.get("output", "")
# Handle response format conversion
formatted_result: BaseModel | None = None
raw_output: str
if isinstance(output, BaseModel):
formatted_result = output
raw_output = output.model_dump_json()
elif response_format:
raw_output = str(output) if not isinstance(output, str) else output
try:
model_schema = generate_model_description(response_format)
schema = json.dumps(model_schema, indent=2)
instructions = self.i18n.slice("formatted_task_instructions").format(
output_format=schema
)
converter = Converter(
llm=self.llm,
text=raw_output,
model=response_format,
instructions=instructions,
)
conversion_result = converter.to_pydantic()
if isinstance(conversion_result, BaseModel):
formatted_result = conversion_result
except ConverterError:
pass # Keep raw output if conversion fails
else:
raw_output = str(output) if not isinstance(output, str) else output
# Get token usage metrics
if isinstance(self.llm, BaseLLM):
usage_metrics = self.llm.get_token_usage_summary()
else:
usage_metrics = self._token_process.get_summary()
raw_str = (
raw_output
if isinstance(raw_output, str)
else raw_output.model_dump_json()
if isinstance(raw_output, BaseModel)
else str(raw_output)
)
return LiteAgentOutput(
raw=raw_str,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages,
)
return self._build_output_from_result(result, executor, response_format)
def _process_kickoff_guardrail(
self,

View File

@@ -0,0 +1,138 @@
from __future__ import annotations
from typing import Literal
from pydantic import BaseModel, Field
from crewai.llms.base_llm import BaseLLM
class PlanningConfig(BaseModel):
"""Configuration for agent planning/reasoning before task execution.
This allows users to customize the planning behavior including prompts,
iteration limits, the LLM used for planning, and the reasoning effort
level that controls post-step observation and replanning behavior.
Note: To disable planning, don't pass a planning_config or set planning=False
on the Agent. The presence of a PlanningConfig enables planning.
Attributes:
reasoning_effort: Controls observation and replanning after each step.
- "low": Observe each step (validates success), but skip the
decide/replan/refine pipeline. Steps are marked complete and
execution continues linearly. Fastest option.
- "medium": Observe each step. On failure, trigger replanning.
On success, skip refinement and continue. Balanced option.
- "high": Full observation pipeline — observe every step, then
route through decide_next_action which can trigger early goal
achievement, full replanning, or lightweight refinement.
Most adaptive but adds latency per step.
max_attempts: Maximum number of planning refinement attempts.
If None, will continue until the agent indicates readiness.
max_steps: Maximum number of steps in the generated plan.
system_prompt: Custom system prompt for planning. Uses default if None.
plan_prompt: Custom prompt for creating the initial plan.
refine_prompt: Custom prompt for refining the plan.
llm: LLM to use for planning. Uses agent's LLM if None.
Example:
```python
from crewai import Agent
from crewai.agent.planning_config import PlanningConfig
# Simple usage — fast, linear execution (default)
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(),
)
# Balanced — replan only when steps fail
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(
reasoning_effort="medium",
),
)
# Full adaptive planning with refinement and replanning
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(
reasoning_effort="high",
max_attempts=3,
max_steps=10,
plan_prompt="Create a focused plan for: {description}",
llm="gpt-4o-mini", # Use cheaper model for planning
),
)
```
"""
reasoning_effort: Literal["low", "medium", "high"] = Field(
default="medium",
description=(
"Controls post-step observation and replanning behavior. "
"'low' observes steps but skips replanning/refinement (fastest). "
"'medium' observes and replans only on step failure (balanced). "
"'high' runs full observation pipeline with replanning, refinement, "
"and early goal detection (most adaptive, highest latency)."
),
)
max_attempts: int | None = Field(
default=None,
description=(
"Maximum number of planning refinement attempts. "
"If None, will continue until the agent indicates readiness."
),
)
max_steps: int = Field(
default=20,
description="Maximum number of steps in the generated plan.",
ge=1,
)
system_prompt: str | None = Field(
default=None,
description="Custom system prompt for planning. Uses default if None.",
)
plan_prompt: str | None = Field(
default=None,
description="Custom prompt for creating the initial plan.",
)
refine_prompt: str | None = Field(
default=None,
description="Custom prompt for refining the plan.",
)
max_replans: int = Field(
default=3,
description="Maximum number of full replanning attempts before finalizing.",
ge=0,
)
max_step_iterations: int = Field(
default=15,
description=(
"Maximum LLM iterations per step in the StepExecutor multi-turn loop. "
"Lower values make steps faster but less thorough."
),
ge=1,
)
step_timeout: int | None = Field(
default=None,
description=(
"Maximum wall-clock seconds for a single step execution. "
"If exceeded, the step is marked as failed and observation decides "
"whether to continue or replan. None means no per-step timeout."
),
)
llm: str | BaseLLM | None = Field(
default=None,
description="LLM to use for planning. Uses agent's LLM if None.",
)
model_config = {"arbitrary_types_allowed": True}

View File

@@ -28,13 +28,20 @@ if TYPE_CHECKING:
def handle_reasoning(agent: Agent, task: Task) -> None:
"""Handle the reasoning process for an agent before task execution.
"""Handle the reasoning/planning process for an agent before task execution.
This function checks if planning is enabled for the agent and, if so,
creates a plan that gets appended to the task description.
Note: This function is used by CrewAgentExecutor (legacy path).
For AgentExecutor, planning is handled in AgentExecutor.generate_plan().
Args:
agent: The agent performing the task.
task: The task to execute.
"""
if not agent.reasoning:
# Check if planning is enabled using the planning_enabled property
if not getattr(agent, "planning_enabled", False):
return
try:
@@ -43,13 +50,13 @@ def handle_reasoning(agent: Agent, task: Task) -> None:
AgentReasoningOutput,
)
reasoning_handler = AgentReasoning(task=task, agent=agent)
reasoning_output: AgentReasoningOutput = (
reasoning_handler.handle_agent_reasoning()
planning_handler = AgentReasoning(agent=agent, task=task)
planning_output: AgentReasoningOutput = (
planning_handler.handle_agent_reasoning()
)
task.description += f"\n\nReasoning Plan:\n{reasoning_output.plan.plan}"
task.description += f"\n\nPlanning:\n{planning_output.plan.plan}"
except Exception as e:
agent._logger.log("error", f"Error during reasoning process: {e!s}")
agent._logger.log("error", f"Error during planning: {e!s}")
def build_task_prompt_with_schema(task: Task, task_prompt: str, i18n: I18N) -> str:

View File

@@ -895,7 +895,9 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
ToolUsageStartedEvent,
)
args_dict, parse_error = parse_tool_call_args(func_args, func_name, call_id, original_tool)
args_dict, parse_error = parse_tool_call_args(
func_args, func_name, call_id, original_tool
)
if parse_error is not None:
return parse_error

View File

@@ -0,0 +1,345 @@
"""PlannerObserver: Observation phase after each step execution.
Implements the "Observe" phase. After every step execution, the Planner
analyzes what happened, what new information was learned, and whether the
remaining plan is still valid.
This is NOT an error detector — it runs on every step, including successes,
to incorporate runtime observations into the remaining plan.
Refinements are structured (StepRefinement objects) and applied directly
from the observation result — no second LLM call required.
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Any
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.observation_events import (
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.utilities.agent_utils import extract_task_section
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.llm_utils import create_llm
from crewai.utilities.planning_types import StepObservation, TodoItem
from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.task import Task
logger = logging.getLogger(__name__)
class PlannerObserver:
"""Observes step execution results and decides on plan continuation.
After EVERY step execution, this class:
1. Analyzes what the step accomplished
2. Identifies new information learned
3. Decides if the remaining plan is still valid
4. Suggests lightweight refinements or triggers full replanning
LLM resolution (magical fallback):
- If ``agent.planning_config.llm`` is explicitly set → use that
- Otherwise → fall back to ``agent.llm`` (same LLM for everything)
Args:
agent: The agent instance (for LLM resolution and config).
task: Optional task context (for description and expected output).
"""
def __init__(
self,
agent: Agent,
task: Task | None = None,
kickoff_input: str = "",
) -> None:
self.agent = agent
self.task = task
self.kickoff_input = kickoff_input
self.llm = self._resolve_llm()
self._i18n: I18N = get_i18n()
def _resolve_llm(self) -> Any:
"""Resolve which LLM to use for observation/planning.
Mirrors AgentReasoning._resolve_llm(): uses planning_config.llm
if explicitly set, otherwise falls back to agent.llm.
Returns:
The resolved LLM instance.
"""
from crewai.llm import LLM
config = getattr(self.agent, "planning_config", None)
if config is not None and config.llm is not None:
if isinstance(config.llm, LLM):
return config.llm
return create_llm(config.llm)
return self.agent.llm
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def observe(
self,
completed_step: TodoItem,
result: str,
all_completed: list[TodoItem],
remaining_todos: list[TodoItem],
) -> StepObservation:
"""Observe a step's result and decide on plan continuation.
This runs after EVERY step execution — not just failures.
Args:
completed_step: The todo item that was just executed.
result: The final result string from the step.
all_completed: All previously completed todos (for context).
remaining_todos: The pending todos still in the plan.
Returns:
StepObservation with the Planner's analysis. Any suggested
refinements are structured StepRefinement objects ready for
direct application — no second LLM call needed.
"""
agent_role = self.agent.role
crewai_event_bus.emit(
self.agent,
event=StepObservationStartedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
from_task=self.task,
from_agent=self.agent,
),
)
messages = self._build_observation_messages(
completed_step, result, all_completed, remaining_todos
)
try:
response = self.llm.call(
messages,
response_model=StepObservation,
from_task=self.task,
from_agent=self.agent,
)
observation = self._parse_observation_response(response)
refinement_summaries = (
[
f"Step {r.step_number}: {r.new_description}"
for r in observation.suggested_refinements
]
if observation.suggested_refinements
else None
)
crewai_event_bus.emit(
self.agent,
event=StepObservationCompletedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
step_completed_successfully=observation.step_completed_successfully,
key_information_learned=observation.key_information_learned,
remaining_plan_still_valid=observation.remaining_plan_still_valid,
needs_full_replan=observation.needs_full_replan,
replan_reason=observation.replan_reason,
goal_already_achieved=observation.goal_already_achieved,
suggested_refinements=refinement_summaries,
from_task=self.task,
from_agent=self.agent,
),
)
return observation
except Exception as e:
logger.warning(
f"Observation LLM call failed: {e}. Defaulting to conservative replan."
)
crewai_event_bus.emit(
self.agent,
event=StepObservationFailedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
error=str(e),
from_task=self.task,
from_agent=self.agent,
),
)
# Don't force a full replan — the step may have succeeded even if the
# observer LLM failed to parse the result. Defaulting to "continue" is
# far less disruptive than wiping the entire plan on every observer error.
return StepObservation(
step_completed_successfully=True,
key_information_learned="",
remaining_plan_still_valid=True,
needs_full_replan=False,
)
def apply_refinements(
self,
observation: StepObservation,
remaining_todos: list[TodoItem],
) -> list[TodoItem]:
"""Apply structured refinements from the observation directly to todo descriptions.
No LLM call needed — refinements are already structured StepRefinement
objects produced by the observation call. This is a pure in-memory update.
Args:
observation: The observation containing structured refinements.
remaining_todos: The pending todos to update in-place.
Returns:
The same todo list with updated descriptions where refinements applied.
"""
if not observation.suggested_refinements:
return remaining_todos
todo_by_step: dict[int, TodoItem] = {t.step_number: t for t in remaining_todos}
for refinement in observation.suggested_refinements:
if refinement.step_number in todo_by_step and refinement.new_description:
todo_by_step[
refinement.step_number
].description = refinement.new_description
return remaining_todos
# ------------------------------------------------------------------
# Internal: Message building
# ------------------------------------------------------------------
def _build_observation_messages(
self,
completed_step: TodoItem,
result: str,
all_completed: list[TodoItem],
remaining_todos: list[TodoItem],
) -> list[LLMMessage]:
"""Build messages for the observation LLM call."""
task_desc = ""
task_goal = ""
if self.task:
task_desc = self.task.description or ""
task_goal = self.task.expected_output or ""
elif self.kickoff_input:
# Standalone kickoff path — no Task object, but we have the raw input.
# Extract just the ## Task section so the observer sees the actual goal,
# not the full enriched instruction with env/tools/verification noise.
task_desc = extract_task_section(self.kickoff_input)
task_goal = "Complete the task successfully"
system_prompt = self._i18n.retrieve("planning", "observation_system_prompt")
# Build context of what's been done
completed_summary = ""
if all_completed:
completed_lines = []
for todo in all_completed:
result_preview = (todo.result or "")[:200]
completed_lines.append(
f" Step {todo.step_number}: {todo.description}\n"
f" Result: {result_preview}"
)
completed_summary = "\n## Previously completed steps:\n" + "\n".join(
completed_lines
)
# Build remaining plan
remaining_summary = ""
if remaining_todos:
remaining_lines = [
f" Step {todo.step_number}: {todo.description}"
for todo in remaining_todos
]
remaining_summary = "\n## Remaining plan steps:\n" + "\n".join(
remaining_lines
)
user_prompt = self._i18n.retrieve("planning", "observation_user_prompt").format(
task_description=task_desc,
task_goal=task_goal,
completed_summary=completed_summary,
step_number=completed_step.step_number,
step_description=completed_step.description,
step_result=result,
remaining_summary=remaining_summary,
)
return [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt},
]
@staticmethod
def _parse_observation_response(response: Any) -> StepObservation:
"""Parse the LLM response into a StepObservation.
The LLM may return:
- A StepObservation instance directly (streaming + litellm path)
- A JSON string (non-streaming path serialises model_dump_json())
- A dict (some provider paths)
- Something else (unexpected)
We handle all cases to avoid silently falling back to a
hardcoded success default.
"""
if isinstance(response, StepObservation):
return response
# JSON string path — most common miss before this fix
if isinstance(response, str):
text = response.strip()
try:
return StepObservation.model_validate_json(text)
except Exception: # noqa: S110
pass
# Some LLMs wrap the JSON in markdown fences
if text.startswith("```"):
lines = text.split("\n")
# Strip first and last lines (``` markers)
inner = "\n".join(
lines[1:-1] if lines[-1].strip() == "```" else lines[1:]
)
try:
return StepObservation.model_validate_json(inner.strip())
except Exception: # noqa: S110
pass
# Dict path
if isinstance(response, dict):
try:
return StepObservation.model_validate(response)
except Exception: # noqa: S110
pass
# Last resort — log what we got so it's diagnosable
logger.warning(
"Could not parse observation response (type=%s). "
"Falling back to default failure observation. Preview: %.200s",
type(response).__name__,
str(response),
)
return StepObservation(
step_completed_successfully=False,
key_information_learned=str(response) if response else "",
remaining_plan_still_valid=False,
)

View File

@@ -0,0 +1,629 @@
"""StepExecutor: Isolated executor for a single plan step.
Implements the direct-action execution pattern from Plan-and-Act
(arxiv 2503.09572): the Executor receives one step description,
makes a single LLM call, executes any tool call returned, and
returns the result immediately.
There is no inner loop. Recovery from failure (retry, replan) is
the responsibility of PlannerObserver and AgentExecutor — keeping
this class single-purpose and fast.
"""
from __future__ import annotations
from collections.abc import Callable
from datetime import datetime
import json
import time
from typing import TYPE_CHECKING, Any, cast
from pydantic import BaseModel
from crewai.agents.parser import AgentAction, AgentFinish
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import (
ToolUsageErrorEvent,
ToolUsageFinishedEvent,
ToolUsageStartedEvent,
)
from crewai.utilities.agent_utils import (
build_tool_calls_assistant_message,
check_native_tool_support,
enforce_rpm_limit,
execute_single_native_tool_call,
extract_task_section,
format_message_for_llm,
is_tool_call_list,
process_llm_response,
setup_native_tools,
)
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.planning_types import TodoItem
from crewai.utilities.printer import Printer
from crewai.utilities.step_execution_context import StepExecutionContext, StepResult
from crewai.utilities.string_utils import sanitize_tool_name
from crewai.utilities.tool_utils import execute_tool_and_check_finality
from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.agents.tools_handler import ToolsHandler
from crewai.crew import Crew
from crewai.llms.base_llm import BaseLLM
from crewai.task import Task
from crewai.tools.base_tool import BaseTool
from crewai.tools.structured_tool import CrewStructuredTool
class StepExecutor:
"""Executes a SINGLE todo item using direct-action execution.
The StepExecutor owns its own message list per invocation. It never reads
or writes the AgentExecutor's state. Results flow back via StepResult.
Execution pattern (per Plan-and-Act, arxiv 2503.09572):
1. Build messages from todo + context
2. Call LLM once (with or without native tools)
3. If tool call → execute it → return tool result
4. If text answer → return it directly
No inner loop — recovery is PlannerObserver's responsibility.
Args:
llm: The language model to use for execution.
tools: Structured tools available to the executor.
agent: The agent instance (for role/goal/verbose/config).
original_tools: Original BaseTool instances (needed for native tool schema).
tools_handler: Optional tools handler for caching and delegation tracking.
task: Optional task context.
crew: Optional crew context.
function_calling_llm: Optional separate LLM for function calling.
request_within_rpm_limit: Optional RPM limit function.
callbacks: Optional list of callbacks.
i18n: Optional i18n instance.
"""
def __init__(
self,
llm: BaseLLM,
tools: list[CrewStructuredTool],
agent: Agent,
original_tools: list[BaseTool] | None = None,
tools_handler: ToolsHandler | None = None,
task: Task | None = None,
crew: Crew | None = None,
function_calling_llm: BaseLLM | None = None,
request_within_rpm_limit: Callable[[], bool] | None = None,
callbacks: list[Any] | None = None,
i18n: I18N | None = None,
) -> None:
self.llm = llm
self.tools = tools
self.agent = agent
self.original_tools = original_tools or []
self.tools_handler = tools_handler
self.task = task
self.crew = crew
self.function_calling_llm = function_calling_llm
self.request_within_rpm_limit = request_within_rpm_limit
self.callbacks = callbacks or []
self._i18n: I18N = i18n or get_i18n()
self._printer: Printer = Printer()
# Native tool support — set up once
self._use_native_tools = check_native_tool_support(
self.llm, self.original_tools
)
self._openai_tools: list[dict[str, Any]] = []
self._available_functions: dict[str, Callable[..., Any]] = {}
if self._use_native_tools and self.original_tools:
(
self._openai_tools,
self._available_functions,
_,
) = setup_native_tools(self.original_tools)
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def execute(
self,
todo: TodoItem,
context: StepExecutionContext,
max_step_iterations: int = 15,
step_timeout: int | None = None,
) -> StepResult:
"""Execute a single todo item using a multi-turn action loop.
Enforces the RPM limit, builds a fresh message list, then iterates
LLM call → tool execution → observation until the LLM signals it is
done (text answer) or max_step_iterations is reached. Never touches
external AgentExecutor state.
Args:
todo: The todo item to execute.
context: Immutable context with task info and dependency results.
max_step_iterations: Maximum LLM iterations in the multi-turn loop.
step_timeout: Maximum wall-clock seconds for this step. None = no limit.
Returns:
StepResult with the outcome.
"""
start_time = time.monotonic()
tool_calls_made: list[str] = []
try:
enforce_rpm_limit(self.request_within_rpm_limit)
messages = self._build_isolated_messages(todo, context)
if self._use_native_tools:
result_text = self._execute_native(
messages,
tool_calls_made,
max_step_iterations=max_step_iterations,
step_timeout=step_timeout,
start_time=start_time,
)
else:
result_text = self._execute_text_parsed(
messages,
tool_calls_made,
max_step_iterations=max_step_iterations,
step_timeout=step_timeout,
start_time=start_time,
)
self._validate_expected_tool_usage(todo, tool_calls_made)
elapsed = time.monotonic() - start_time
return StepResult(
success=True,
result=result_text,
tool_calls_made=tool_calls_made,
execution_time=elapsed,
)
except Exception as e:
elapsed = time.monotonic() - start_time
return StepResult(
success=False,
result="",
error=str(e),
tool_calls_made=tool_calls_made,
execution_time=elapsed,
)
# ------------------------------------------------------------------
# Internal: Message building
# ------------------------------------------------------------------
def _build_isolated_messages(
self, todo: TodoItem, context: StepExecutionContext
) -> list[LLMMessage]:
"""Build a fresh message list for this step's execution.
System prompt tells the LLM it is an Executor focused on one step.
User prompt provides the step description, dependencies, and tools.
"""
system_prompt = self._build_system_prompt()
user_prompt = self._build_user_prompt(todo, context)
return [
format_message_for_llm(system_prompt, role="system"),
format_message_for_llm(user_prompt, role="user"),
]
def _build_system_prompt(self) -> str:
"""Build the Executor's system prompt."""
role = self.agent.role if self.agent else "Assistant"
goal = self.agent.goal if self.agent else "Complete tasks efficiently"
backstory = getattr(self.agent, "backstory", "") or ""
tools_section = ""
if self.tools and not self._use_native_tools:
tool_names = ", ".join(sanitize_tool_name(t.name) for t in self.tools)
tools_section = self._i18n.retrieve(
"planning", "step_executor_tools_section"
).format(tool_names=tool_names)
elif self.tools:
tool_names = ", ".join(sanitize_tool_name(t.name) for t in self.tools)
tools_section = f"\n\nAvailable tools: {tool_names}"
return self._i18n.retrieve("planning", "step_executor_system_prompt").format(
role=role,
backstory=backstory,
goal=goal,
tools_section=tools_section,
)
def _build_user_prompt(self, todo: TodoItem, context: StepExecutionContext) -> str:
"""Build the user prompt for this specific step."""
parts: list[str] = []
# Include overall task context so the executor knows the full goal and
# required output format/location — critical for knowing WHAT to produce.
# We extract only the task body (not tool instructions or verification
# sections) to avoid duplicating directives already in the system prompt.
if context.task_description:
task_section = extract_task_section(context.task_description)
if task_section:
parts.append(
self._i18n.retrieve(
"planning", "step_executor_task_context"
).format(
task_context=task_section,
)
)
parts.append(
self._i18n.retrieve("planning", "step_executor_user_prompt").format(
step_description=todo.description,
)
)
if todo.tool_to_use:
parts.append(
self._i18n.retrieve("planning", "step_executor_suggested_tool").format(
tool_to_use=todo.tool_to_use,
)
)
# Include dependency results (final results only, no traces)
if context.dependency_results:
parts.append(
self._i18n.retrieve("planning", "step_executor_context_header")
)
for step_num, result in sorted(context.dependency_results.items()):
parts.append(
self._i18n.retrieve(
"planning", "step_executor_context_entry"
).format(step_number=step_num, result=result)
)
parts.append(self._i18n.retrieve("planning", "step_executor_complete_step"))
return "\n".join(parts)
# ------------------------------------------------------------------
# Internal: Multi-turn execution loop
# ------------------------------------------------------------------
def _execute_text_parsed(
self,
messages: list[LLMMessage],
tool_calls_made: list[str],
max_step_iterations: int = 15,
step_timeout: int | None = None,
start_time: float | None = None,
) -> str:
"""Execute step using text-parsed tool calling with a multi-turn loop.
Iterates LLM call → tool execution → observation until the LLM
produces a Final Answer or max_step_iterations is reached.
This allows the agent to: run a command, see the output, adjust its
approach, and run another command — all within a single plan step.
"""
use_stop_words = self.llm.supports_stop_words() if self.llm else False
last_tool_result = ""
for _ in range(max_step_iterations):
# Check step timeout
if step_timeout and start_time:
elapsed = time.monotonic() - start_time
if elapsed >= step_timeout:
return last_tool_result or f"Step timed out after {elapsed:.0f}s"
answer = self.llm.call(
messages,
callbacks=self.callbacks,
from_task=self.task,
from_agent=self.agent,
)
if not answer:
raise ValueError("Empty response from LLM")
answer_str = str(answer)
formatted = process_llm_response(answer_str, use_stop_words)
if isinstance(formatted, AgentFinish):
return str(formatted.output)
if isinstance(formatted, AgentAction):
tool_calls_made.append(formatted.tool)
tool_result = self._execute_text_tool_with_events(formatted)
last_tool_result = tool_result
# Append the assistant's reasoning + action, then the observation.
# _build_observation_message handles vision sentinels so the LLM
# receives an image content block instead of raw base64 text.
messages.append({"role": "assistant", "content": answer_str})
messages.append(self._build_observation_message(tool_result))
continue
# Raw text response with no Final Answer marker — treat as done
return answer_str
# Max iterations reached — return the last tool result we accumulated
return last_tool_result
def _execute_text_tool_with_events(self, formatted: AgentAction) -> str:
"""Execute text-parsed tool calls with tool usage events."""
args_dict = self._parse_tool_args(formatted.tool_input)
agent_key = getattr(self.agent, "key", "unknown") if self.agent else "unknown"
started_at = datetime.now()
crewai_event_bus.emit(
self,
event=ToolUsageStartedEvent(
tool_name=formatted.tool,
tool_args=args_dict,
from_agent=self.agent,
from_task=self.task,
agent_key=agent_key,
),
)
try:
fingerprint_context = {}
if (
self.agent
and hasattr(self.agent, "security_config")
and hasattr(self.agent.security_config, "fingerprint")
):
fingerprint_context = {
"agent_fingerprint": str(self.agent.security_config.fingerprint)
}
tool_result = execute_tool_and_check_finality(
agent_action=formatted,
fingerprint_context=fingerprint_context,
tools=self.tools,
i18n=self._i18n,
agent_key=self.agent.key if self.agent else None,
agent_role=self.agent.role if self.agent else None,
tools_handler=self.tools_handler,
task=self.task,
agent=self.agent,
function_calling_llm=self.function_calling_llm,
crew=self.crew,
)
except Exception as e:
crewai_event_bus.emit(
self,
event=ToolUsageErrorEvent(
tool_name=formatted.tool,
tool_args=args_dict,
from_agent=self.agent,
from_task=self.task,
agent_key=agent_key,
error=e,
),
)
raise
crewai_event_bus.emit(
self,
event=ToolUsageFinishedEvent(
output=str(tool_result.result),
tool_name=formatted.tool,
tool_args=args_dict,
from_agent=self.agent,
from_task=self.task,
agent_key=agent_key,
started_at=started_at,
finished_at=datetime.now(),
),
)
return str(tool_result.result)
def _parse_tool_args(self, tool_input: Any) -> dict[str, Any]:
"""Parse tool args from the parser output into a dict payload for events."""
if isinstance(tool_input, dict):
return tool_input
if isinstance(tool_input, str):
stripped_input = tool_input.strip()
if not stripped_input:
return {}
try:
parsed = json.loads(stripped_input)
if isinstance(parsed, dict):
return parsed
return {"input": parsed}
except json.JSONDecodeError:
return {"input": stripped_input}
return {"input": str(tool_input)}
# ------------------------------------------------------------------
# Internal: Vision support
# ------------------------------------------------------------------
@staticmethod
def _parse_vision_sentinel(raw: str) -> tuple[str, str] | None:
"""Parse a VISION_IMAGE sentinel into (media_type, base64_data), or None."""
prefix = "VISION_IMAGE:"
if not raw.startswith(prefix):
return None
rest = raw[len(prefix) :]
sep = rest.find(":")
if sep <= 0:
return None
return rest[:sep], rest[sep + 1 :]
@staticmethod
def _build_observation_message(tool_result: str) -> LLMMessage:
"""Build an observation message, converting vision sentinels to image blocks.
When a tool returns a VISION_IMAGE sentinel (e.g. from read_image),
we build a multimodal content block so the LLM can actually *see*
the image rather than receiving a wall of base64 text.
Uses the standard image_url / data-URI format so each LLM provider's
SDK (OpenAI, LiteLLM, etc.) handles the provider-specific conversion.
Format: ``VISION_IMAGE:<media_type>:<base64_data>``
"""
parsed = StepExecutor._parse_vision_sentinel(tool_result)
if parsed:
media_type, b64_data = parsed
return {
"role": "user",
"content": [
{"type": "text", "text": "Observation: Here is the image:"},
{
"type": "image_url",
"image_url": {
"url": f"data:{media_type};base64,{b64_data}",
},
},
],
}
return {"role": "user", "content": f"Observation: {tool_result}"}
def _validate_expected_tool_usage(
self,
todo: TodoItem,
tool_calls_made: list[str],
) -> None:
"""Fail step execution when a required tool is configured but not called."""
expected_tool = getattr(todo, "tool_to_use", None)
if not expected_tool:
return
expected_tool_name = sanitize_tool_name(expected_tool)
available_tool_names = {
sanitize_tool_name(tool.name)
for tool in self.tools
if getattr(tool, "name", "")
} | set(self._available_functions.keys())
if expected_tool_name not in available_tool_names:
return
called_names = {sanitize_tool_name(name) for name in tool_calls_made}
if expected_tool_name not in called_names:
raise ValueError(
f"Expected tool '{expected_tool_name}' was not called "
f"for step {todo.step_number}."
)
def _execute_native(
self,
messages: list[LLMMessage],
tool_calls_made: list[str],
max_step_iterations: int = 15,
step_timeout: int | None = None,
start_time: float | None = None,
) -> str:
"""Execute step using native function calling with a multi-turn loop.
Iterates LLM call → tool execution → appended results until the LLM
returns a text answer (no more tool calls) or max_step_iterations is
reached. This lets the agent run a shell command, observe the output,
correct mistakes, and issue follow-up commands — all within one step.
"""
accumulated_results: list[str] = []
for _ in range(max_step_iterations):
# Check step timeout
if step_timeout and start_time:
elapsed = time.monotonic() - start_time
if elapsed >= step_timeout:
return (
"\n\n".join(accumulated_results)
if accumulated_results
else f"Step timed out after {elapsed:.0f}s"
)
answer = self.llm.call(
messages,
tools=self._openai_tools,
callbacks=self.callbacks,
from_task=self.task,
from_agent=self.agent,
)
if not answer:
raise ValueError("Empty response from LLM")
if isinstance(answer, BaseModel):
return answer.model_dump_json()
if isinstance(answer, list) and answer and is_tool_call_list(answer):
# _execute_native_tool_calls appends assistant + tool messages
# to `messages` as a side-effect, so the next LLM call will
# see the full conversation history including tool outputs.
result = self._execute_native_tool_calls(
answer, messages, tool_calls_made
)
accumulated_results.append(result)
continue
# Text answer → LLM decided the step is done
return str(answer)
# Max iterations reached — return everything we accumulated
return "\n".join(filter(None, accumulated_results))
def _execute_native_tool_calls(
self,
tool_calls: list[Any],
messages: list[LLMMessage],
tool_calls_made: list[str],
) -> str:
"""Execute a batch of native tool calls and return their results.
Returns the result of the first tool marked result_as_answer if any,
otherwise returns all tool results concatenated.
"""
assistant_message, _reports = build_tool_calls_assistant_message(tool_calls)
if assistant_message:
messages.append(assistant_message)
tool_results: list[str] = []
for tool_call in tool_calls:
call_result = execute_single_native_tool_call(
tool_call,
available_functions=self._available_functions,
original_tools=self.original_tools,
structured_tools=self.tools,
tools_handler=self.tools_handler,
agent=self.agent,
task=self.task,
crew=self.crew,
event_source=self,
printer=self._printer,
verbose=bool(self.agent and self.agent.verbose),
)
if call_result.func_name:
tool_calls_made.append(call_result.func_name)
if call_result.result_as_answer:
return str(call_result.result)
if call_result.tool_message:
raw_content = call_result.tool_message.get("content", "")
if isinstance(raw_content, str):
parsed = self._parse_vision_sentinel(raw_content)
if parsed:
media_type, b64_data = parsed
# Replace the sentinel with a standard image_url content block.
# Each provider's _format_messages handles conversion to
# its native format (e.g. Anthropic image blocks).
modified: LLMMessage = cast(
LLMMessage, dict(call_result.tool_message)
)
modified["content"] = [
{
"type": "image_url",
"image_url": {
"url": f"data:{media_type};base64,{b64_data}",
},
}
]
messages.append(modified)
tool_results.append("[image]")
else:
messages.append(call_result.tool_message)
if raw_content:
tool_results.append(raw_content)
else:
messages.append(call_result.tool_message)
if raw_content:
tool_results.append(str(raw_content))
return "\n".join(tool_results) if tool_results else ""

View File

@@ -182,15 +182,24 @@ def log_tasks_outputs() -> None:
@crewai.command()
@click.option("-m", "--memory", is_flag=True, help="Reset MEMORY")
@click.option(
"-l", "--long", is_flag=True, hidden=True,
"-l",
"--long",
is_flag=True,
hidden=True,
help="[Deprecated: use --memory] Reset memory",
)
@click.option(
"-s", "--short", is_flag=True, hidden=True,
"-s",
"--short",
is_flag=True,
hidden=True,
help="[Deprecated: use --memory] Reset memory",
)
@click.option(
"-e", "--entities", is_flag=True, hidden=True,
"-e",
"--entities",
is_flag=True,
hidden=True,
help="[Deprecated: use --memory] Reset memory",
)
@click.option("-kn", "--knowledge", is_flag=True, help="Reset KNOWLEDGE storage")
@@ -218,7 +227,13 @@ def reset_memories(
# Treat legacy flags as --memory with a deprecation warning
if long or short or entities:
legacy_used = [
f for f, v in [("--long", long), ("--short", short), ("--entities", entities)] if v
f
for f, v in [
("--long", long),
("--short", short),
("--entities", entities),
]
if v
]
click.echo(
f"Warning: {', '.join(legacy_used)} {'is' if len(legacy_used) == 1 else 'are'} "
@@ -238,9 +253,7 @@ def reset_memories(
"Please specify at least one memory type to reset using the appropriate flags."
)
return
reset_memories_command(
memory, knowledge, agent_knowledge, kickoff_outputs, all
)
reset_memories_command(memory, knowledge, agent_knowledge, kickoff_outputs, all)
except Exception as e:
click.echo(f"An error occurred while resetting memories: {e}", err=True)
@@ -439,12 +452,10 @@ def tool_install(handle: str):
default=False,
help="Bypasses Git remote validations",
)
@click.option("--public", "is_public", flag_value=True, default=False)
@click.option("--private", "is_public", flag_value=False)
def tool_publish(is_public: bool, force: bool):
def tool_publish(force: bool):
tool_cmd = ToolCommand()
tool_cmd.login()
tool_cmd.publish(is_public, force)
tool_cmd.publish(force)
@crewai.group()
@@ -669,18 +680,11 @@ def traces_enable():
from rich.console import Console
from rich.panel import Panel
from crewai.events.listeners.tracing.utils import (
_load_user_data,
_save_user_data,
)
from crewai.events.listeners.tracing.utils import update_user_data
console = Console()
# Update user data to enable traces
user_data = _load_user_data()
user_data["trace_consent"] = True
user_data["first_execution_done"] = True
_save_user_data(user_data)
update_user_data({"trace_consent": True, "first_execution_done": True})
panel = Panel(
"✅ Trace collection has been enabled!\n\n"
@@ -699,18 +703,11 @@ def traces_disable():
from rich.console import Console
from rich.panel import Panel
from crewai.events.listeners.tracing.utils import (
_load_user_data,
_save_user_data,
)
from crewai.events.listeners.tracing.utils import update_user_data
console = Console()
# Update user data to disable traces
user_data = _load_user_data()
user_data["trace_consent"] = False
user_data["first_execution_done"] = True
_save_user_data(user_data)
update_user_data({"trace_consent": False, "first_execution_done": True})
panel = Panel(
"❌ Trace collection has been disabled!\n\n"

View File

@@ -1,3 +1,4 @@
import contextvars
import json
from pathlib import Path
import platform
@@ -80,7 +81,10 @@ def run_chat() -> None:
# Start loading indicator
loading_complete = threading.Event()
loading_thread = threading.Thread(target=show_loading, args=(loading_complete,))
ctx = contextvars.copy_context()
loading_thread = threading.Thread(
target=ctx.run, args=(show_loading, loading_complete)
)
loading_thread.start()
try:

View File

@@ -125,13 +125,19 @@ class MemoryTUI(App[None]):
from crewai.memory.storage.lancedb_storage import LanceDBStorage
from crewai.memory.unified_memory import Memory
storage = LanceDBStorage(path=storage_path) if storage_path else LanceDBStorage()
storage = (
LanceDBStorage(path=storage_path) if storage_path else LanceDBStorage()
)
embedder = None
if embedder_config is not None:
from crewai.rag.embeddings.factory import build_embedder
embedder = build_embedder(embedder_config)
self._memory = Memory(storage=storage, embedder=embedder) if embedder else Memory(storage=storage)
self._memory = (
Memory(storage=storage, embedder=embedder)
if embedder
else Memory(storage=storage)
)
except Exception as e:
self._init_error = str(e)
@@ -200,11 +206,7 @@ class MemoryTUI(App[None]):
if len(record.content) > 80
else record.content
)
label = (
f"{date_str} "
f"[bold]{record.importance:.1f}[/] "
f"{preview}"
)
label = f"{date_str} [bold]{record.importance:.1f}[/] {preview}"
option_list.add_option(label)
def _populate_recall_list(self) -> None:
@@ -220,9 +222,7 @@ class MemoryTUI(App[None]):
else m.record.content
)
label = (
f"[bold]\\[{m.score:.2f}][/] "
f"{preview} "
f"[dim]scope={m.record.scope}[/]"
f"[bold]\\[{m.score:.2f}][/] {preview} [dim]scope={m.record.scope}[/]"
)
option_list.add_option(label)
@@ -251,8 +251,7 @@ class MemoryTUI(App[None]):
lines.append(f"[dim]Scope:[/] [bold]{record.scope}[/]")
lines.append(f"[dim]Importance:[/] [bold]{record.importance:.2f}[/]")
lines.append(
f"[dim]Created:[/] "
f"{record.created_at.strftime('%Y-%m-%d %H:%M:%S')}"
f"[dim]Created:[/] {record.created_at.strftime('%Y-%m-%d %H:%M:%S')}"
)
lines.append(
f"[dim]Last accessed:[/] "
@@ -362,17 +361,11 @@ class MemoryTUI(App[None]):
panel = self.query_one("#info-panel", Static)
panel.loading = True
try:
scope = (
self._selected_scope
if self._selected_scope != "/"
else None
)
scope = self._selected_scope if self._selected_scope != "/" else None
loop = asyncio.get_event_loop()
matches = await loop.run_in_executor(
None,
lambda: self._memory.recall(
query, scope=scope, limit=10, depth="deep"
),
lambda: self._memory.recall(query, scope=scope, limit=10, depth="deep"),
)
self._recall_matches = matches or []
self._view_mode = "recall"

View File

@@ -68,7 +68,6 @@ class PlusAPI:
def publish_tool(
self,
handle: str,
is_public: bool,
version: str,
description: str | None,
encoded_file: str,
@@ -76,7 +75,6 @@ class PlusAPI:
) -> httpx.Response:
params = {
"handle": handle,
"public": is_public,
"version": version,
"file": encoded_file,
"description": description,

View File

@@ -95,9 +95,7 @@ def reset_memories_command(
continue
if memory:
_reset_flow_memory(flow)
click.echo(
f"[Flow ({flow_name})] Memory has been reset."
)
click.echo(f"[Flow ({flow_name})] Memory has been reset.")
except subprocess.CalledProcessError as e:
click.echo(f"An error occurred while resetting the memories: {e}", err=True)

View File

@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
authors = [{ name = "Your Name", email = "you@example.com" }]
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]==1.10.2a1"
"crewai[tools]==1.11.0"
]
[project.scripts]

View File

@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
authors = [{ name = "Your Name", email = "you@example.com" }]
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]==1.10.2a1"
"crewai[tools]==1.11.0"
]
[project.scripts]

View File

@@ -5,7 +5,7 @@ description = "Power up your crews with {{folder_name}}"
readme = "README.md"
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]==1.10.2a1"
"crewai[tools]==1.11.0"
]
[tool.crewai]

View File

@@ -73,7 +73,7 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
finally:
os.chdir(old_directory)
def publish(self, is_public: bool, force: bool = False) -> None:
def publish(self, force: bool = False) -> None:
if not git.Repository().is_synced() and not force:
console.print(
"[bold red]Failed to publish tool.[/bold red]\n"
@@ -129,7 +129,6 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
console.print("[bold blue]Publishing tool to repository...[/bold blue]")
publish_response = self.plus_api_client.publish_tool(
handle=project_name,
is_public=is_public,
version=project_version,
description=project_description,
encoded_file=f"data:application/x-gzip;base64,{encoded_tarball}",

View File

@@ -442,9 +442,7 @@ def get_flows(flow_path: str = "main.py") -> list[Flow]:
for search_path in search_paths:
for root, dirs, files in os.walk(search_path):
dirs[:] = [
d
for d in dirs
if d not in _SKIP_DIRS and not d.startswith(".")
d for d in dirs if d not in _SKIP_DIRS and not d.startswith(".")
]
if flow_path in files and "cli/templates" not in root:
file_os_path = os.path.join(root, flow_path)
@@ -464,9 +462,7 @@ def get_flows(flow_path: str = "main.py") -> list[Flow]:
for attr_name in dir(module):
module_attr = getattr(module, attr_name)
try:
if flow_instance := get_flow_instance(
module_attr
):
if flow_instance := get_flow_instance(module_attr):
flow_instances.append(flow_instance)
except Exception: # noqa: S112
continue

View File

@@ -1410,9 +1410,7 @@ class Crew(FlowTrackable, BaseModel):
return self._merge_tools(tools, cast(list[BaseTool], code_tools))
return tools
def _add_memory_tools(
self, tools: list[BaseTool], memory: Any
) -> list[BaseTool]:
def _add_memory_tools(self, tools: list[BaseTool], memory: Any) -> list[BaseTool]:
"""Add recall and remember tools when memory is available.
Args:

View File

@@ -75,6 +75,14 @@ from crewai.events.types.mcp_events import (
MCPToolExecutionFailedEvent,
MCPToolExecutionStartedEvent,
)
from crewai.events.types.observation_events import (
GoalAchievedEarlyEvent,
PlanRefinementEvent,
PlanReplanTriggeredEvent,
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent,
AgentReasoningFailedEvent,
@@ -535,6 +543,64 @@ class EventListener(BaseEventListener):
event.error,
)
# ----------- OBSERVATION EVENTS (Plan-and-Execute) -----------
@crewai_event_bus.on(StepObservationStartedEvent)
def on_step_observation_started(
_: Any, event: StepObservationStartedEvent
) -> None:
self.formatter.handle_observation_started(
event.agent_role,
event.step_number,
event.step_description,
)
@crewai_event_bus.on(StepObservationCompletedEvent)
def on_step_observation_completed(
_: Any, event: StepObservationCompletedEvent
) -> None:
self.formatter.handle_observation_completed(
event.agent_role,
event.step_number,
event.step_completed_successfully,
event.remaining_plan_still_valid,
event.key_information_learned,
event.needs_full_replan,
event.goal_already_achieved,
)
@crewai_event_bus.on(StepObservationFailedEvent)
def on_step_observation_failed(
_: Any, event: StepObservationFailedEvent
) -> None:
self.formatter.handle_observation_failed(
event.step_number,
event.error,
)
@crewai_event_bus.on(PlanRefinementEvent)
def on_plan_refinement(_: Any, event: PlanRefinementEvent) -> None:
self.formatter.handle_plan_refinement(
event.step_number,
event.refined_step_count,
event.refinements,
)
@crewai_event_bus.on(PlanReplanTriggeredEvent)
def on_plan_replan_triggered(_: Any, event: PlanReplanTriggeredEvent) -> None:
self.formatter.handle_plan_replan(
event.replan_reason,
event.replan_count,
event.completed_steps_preserved,
)
@crewai_event_bus.on(GoalAchievedEarlyEvent)
def on_goal_achieved_early(_: Any, event: GoalAchievedEarlyEvent) -> None:
self.formatter.handle_goal_achieved_early(
event.steps_completed,
event.steps_remaining,
)
# ----------- AGENT LOGGING EVENTS -----------
@crewai_event_bus.on(AgentLogsStartedEvent)

View File

@@ -93,6 +93,14 @@ from crewai.events.types.memory_events import (
MemorySaveFailedEvent,
MemorySaveStartedEvent,
)
from crewai.events.types.observation_events import (
GoalAchievedEarlyEvent,
PlanRefinementEvent,
PlanReplanTriggeredEvent,
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent,
AgentReasoningFailedEvent,
@@ -437,6 +445,39 @@ class TraceCollectionListener(BaseEventListener):
) -> None:
self._handle_action_event("agent_reasoning_failed", source, event)
# Observation events (Plan-and-Execute)
@event_bus.on(StepObservationStartedEvent)
def on_step_observation_started(
source: Any, event: StepObservationStartedEvent
) -> None:
self._handle_action_event("step_observation_started", source, event)
@event_bus.on(StepObservationCompletedEvent)
def on_step_observation_completed(
source: Any, event: StepObservationCompletedEvent
) -> None:
self._handle_action_event("step_observation_completed", source, event)
@event_bus.on(StepObservationFailedEvent)
def on_step_observation_failed(
source: Any, event: StepObservationFailedEvent
) -> None:
self._handle_action_event("step_observation_failed", source, event)
@event_bus.on(PlanRefinementEvent)
def on_plan_refinement(source: Any, event: PlanRefinementEvent) -> None:
self._handle_action_event("plan_refinement", source, event)
@event_bus.on(PlanReplanTriggeredEvent)
def on_plan_replan_triggered(
source: Any, event: PlanReplanTriggeredEvent
) -> None:
self._handle_action_event("plan_replan_triggered", source, event)
@event_bus.on(GoalAchievedEarlyEvent)
def on_goal_achieved_early(source: Any, event: GoalAchievedEarlyEvent) -> None:
self._handle_action_event("goal_achieved_early", source, event)
@event_bus.on(KnowledgeRetrievalStartedEvent)
def on_knowledge_retrieval_started(
source: Any, event: KnowledgeRetrievalStartedEvent

View File

@@ -1,4 +1,5 @@
from collections.abc import Callable
import contextvars
from contextvars import ContextVar, Token
from datetime import datetime
import getpass
@@ -18,6 +19,7 @@ from rich.console import Console
from rich.panel import Panel
from rich.text import Text
from crewai.utilities.lock_store import lock as store_lock
from crewai.utilities.paths import db_storage_path
from crewai.utilities.serialization import to_serializable
@@ -137,12 +139,25 @@ def _load_user_data() -> dict[str, Any]:
return {}
def _save_user_data(data: dict[str, Any]) -> None:
def _user_data_lock_name() -> str:
"""Return a stable lock name for the user data file."""
return f"file:{os.path.realpath(_user_data_file())}"
def update_user_data(updates: dict[str, Any]) -> None:
"""Atomically read-modify-write the user data file.
Args:
updates: Key-value pairs to merge into the existing user data.
"""
try:
p = _user_data_file()
p.write_text(json.dumps(data, indent=2))
with store_lock(_user_data_lock_name()):
data = _load_user_data()
data.update(updates)
p = _user_data_file()
p.write_text(json.dumps(data, indent=2))
except (OSError, PermissionError) as e:
logger.warning(f"Failed to save user data: {e}")
logger.warning(f"Failed to update user data: {e}")
def has_user_declined_tracing() -> bool:
@@ -357,24 +372,30 @@ def _get_generic_system_id() -> str | None:
return None
def get_user_id() -> str:
"""Stable, anonymized user identifier with caching."""
data = _load_user_data()
if "user_id" in data:
return cast(str, data["user_id"])
def _generate_user_id() -> str:
"""Compute an anonymized user identifier from username and machine ID."""
try:
username = getpass.getuser()
except Exception:
username = "unknown"
seed = f"{username}|{_get_machine_id()}"
uid = hashlib.sha256(seed.encode()).hexdigest()
return hashlib.sha256(seed.encode()).hexdigest()
data["user_id"] = uid
_save_user_data(data)
return uid
def get_user_id() -> str:
"""Stable, anonymized user identifier with caching."""
with store_lock(_user_data_lock_name()):
data = _load_user_data()
if "user_id" in data:
return cast(str, data["user_id"])
uid = _generate_user_id()
data["user_id"] = uid
p = _user_data_file()
p.write_text(json.dumps(data, indent=2))
return uid
def is_first_execution() -> bool:
@@ -389,20 +410,23 @@ def mark_first_execution_done(user_consented: bool = False) -> None:
Args:
user_consented: Whether the user consented to trace collection.
"""
data = _load_user_data()
if data.get("first_execution_done", False):
return
with store_lock(_user_data_lock_name()):
data = _load_user_data()
if data.get("first_execution_done", False):
return
data.update(
{
"first_execution_done": True,
"first_execution_at": datetime.now().timestamp(),
"user_id": get_user_id(),
"machine_id": _get_machine_id(),
"trace_consent": user_consented,
}
)
_save_user_data(data)
uid = data.get("user_id") or _generate_user_id()
data.update(
{
"first_execution_done": True,
"first_execution_at": datetime.now().timestamp(),
"user_id": uid,
"machine_id": _get_machine_id(),
"trace_consent": user_consented,
}
)
p = _user_data_file()
p.write_text(json.dumps(data, indent=2))
def safe_serialize_to_dict(obj: Any, exclude: set[str] | None = None) -> dict[str, Any]:
@@ -509,7 +533,8 @@ def prompt_user_for_trace_viewing(timeout_seconds: int = 20) -> bool:
# Handle all input-related errors silently
result[0] = False
input_thread = threading.Thread(target=get_input, daemon=True)
ctx = contextvars.copy_context()
input_thread = threading.Thread(target=ctx.run, args=(get_input,), daemon=True)
input_thread.start()
input_thread.join(timeout=timeout_seconds)

View File

@@ -0,0 +1,99 @@
"""Observation events for the Plan-and-Execute architecture.
Emitted during the Observation phase (PLAN-AND-ACT Section 3.3) when the
PlannerObserver analyzes step execution results and decides on plan
continuation, refinement, or replanning.
"""
from typing import Any
from crewai.events.base_events import BaseEvent
class ObservationEvent(BaseEvent):
"""Base event for observation phase events."""
type: str
agent_role: str
step_number: int
step_description: str = ""
from_task: Any | None = None
from_agent: Any | None = None
def __init__(self, **data: Any) -> None:
super().__init__(**data)
self._set_task_params(data)
self._set_agent_params(data)
class StepObservationStartedEvent(ObservationEvent):
"""Emitted when the Planner begins observing a step's result.
Fires after every step execution, before the observation LLM call.
"""
type: str = "step_observation_started"
class StepObservationCompletedEvent(ObservationEvent):
"""Emitted when the Planner finishes observing a step's result.
Contains the full observation analysis: what was learned, whether
the plan is still valid, and what action to take next.
"""
type: str = "step_observation_completed"
step_completed_successfully: bool = True
key_information_learned: str = ""
remaining_plan_still_valid: bool = True
needs_full_replan: bool = False
replan_reason: str | None = None
goal_already_achieved: bool = False
suggested_refinements: list[str] | None = None
class StepObservationFailedEvent(ObservationEvent):
"""Emitted when the observation LLM call itself fails.
The system defaults to continuing the plan when this happens,
but the event allows monitoring/alerting on observation failures.
"""
type: str = "step_observation_failed"
error: str = ""
class PlanRefinementEvent(ObservationEvent):
"""Emitted when the Planner refines upcoming step descriptions.
This is the lightweight refinement path — no full replan, just
sharpening pending todo descriptions based on new information.
"""
type: str = "plan_refinement"
refined_step_count: int = 0
refinements: list[str] | None = None
class PlanReplanTriggeredEvent(ObservationEvent):
"""Emitted when the Planner triggers a full replan.
The remaining plan was deemed fundamentally wrong and will be
regenerated from scratch, preserving completed step results.
"""
type: str = "plan_replan_triggered"
replan_reason: str = ""
replan_count: int = 0
completed_steps_preserved: int = 0
class GoalAchievedEarlyEvent(ObservationEvent):
"""Emitted when the Planner detects the goal was achieved early.
Remaining steps will be skipped and execution will finalize.
"""
type: str = "goal_achieved_early"
steps_remaining: int = 0
steps_completed: int = 0

View File

@@ -9,7 +9,7 @@ class ReasoningEvent(BaseEvent):
type: str
attempt: int = 1
agent_role: str
task_id: str
task_id: str | None = None
task_name: str | None = None
from_task: Any | None = None
agent_id: str | None = None

View File

@@ -43,6 +43,7 @@ def should_suppress_console_output() -> bool:
class ConsoleFormatter:
tool_usage_counts: ClassVar[dict[str, int]] = {}
_tool_counts_lock: ClassVar[threading.Lock] = threading.Lock()
current_a2a_turn_count: int = 0
_pending_a2a_message: str | None = None
@@ -445,9 +446,11 @@ To enable tracing, do any one of these:
if not self.verbose:
return
# Update tool usage count
self.tool_usage_counts[tool_name] = self.tool_usage_counts.get(tool_name, 0) + 1
iteration = self.tool_usage_counts[tool_name]
with self._tool_counts_lock:
self.tool_usage_counts[tool_name] = (
self.tool_usage_counts.get(tool_name, 0) + 1
)
iteration = self.tool_usage_counts[tool_name]
content = Text()
content.append("Tool: ", style="white")
@@ -474,7 +477,8 @@ To enable tracing, do any one of these:
if not self.verbose:
return
iteration = self.tool_usage_counts.get(tool_name, 1)
with self._tool_counts_lock:
iteration = self.tool_usage_counts.get(tool_name, 1)
content = Text()
content.append("Tool Completed\n", style="green bold")
@@ -500,7 +504,8 @@ To enable tracing, do any one of these:
if not self.verbose:
return
iteration = self.tool_usage_counts.get(tool_name, 1)
with self._tool_counts_lock:
iteration = self.tool_usage_counts.get(tool_name, 1)
content = Text()
content.append("Tool Failed\n", style="red bold")
@@ -936,6 +941,152 @@ To enable tracing, do any one of these:
)
self.print_panel(error_content, "❌ Reasoning Error", "red")
# ----------- OBSERVATION EVENTS (Plan-and-Execute) -----------
def handle_observation_started(
self,
agent_role: str,
step_number: int,
step_description: str,
) -> None:
"""Handle step observation started event."""
if not self.verbose:
return
content = Text()
content.append("Observation Started\n", style="cyan bold")
content.append("Agent: ", style="white")
content.append(f"{agent_role}\n", style="cyan")
content.append("Step: ", style="white")
content.append(f"{step_number}\n", style="cyan")
if step_description:
desc_preview = step_description[:80] + (
"..." if len(step_description) > 80 else ""
)
content.append("Description: ", style="white")
content.append(f"{desc_preview}\n", style="cyan")
self.print_panel(content, "🔍 Observing Step Result", "cyan")
def handle_observation_completed(
self,
agent_role: str,
step_number: int,
step_completed: bool,
plan_valid: bool,
key_info: str,
needs_replan: bool,
goal_achieved: bool,
) -> None:
"""Handle step observation completed event."""
if not self.verbose:
return
if goal_achieved:
style = "green"
status = "Goal Achieved Early"
elif needs_replan:
style = "yellow"
status = "Replan Needed"
elif plan_valid:
style = "green"
status = "Plan Valid — Continue"
else:
style = "red"
status = "Step Failed"
content = Text()
content.append("Observation Complete\n", style=f"{style} bold")
content.append("Step: ", style="white")
content.append(f"{step_number}\n", style=style)
content.append("Status: ", style="white")
content.append(f"{status}\n", style=style)
if key_info:
info_preview = key_info[:120] + ("..." if len(key_info) > 120 else "")
content.append("Learned: ", style="white")
content.append(f"{info_preview}\n", style=style)
self.print_panel(content, "🔍 Observation Result", style)
def handle_observation_failed(
self,
step_number: int,
error: str,
) -> None:
"""Handle step observation failure event."""
if not self.verbose:
return
error_content = self.create_status_content(
"Observation Failed",
"Error",
"red",
Step=str(step_number),
Error=error,
)
self.print_panel(error_content, "❌ Observation Error", "red")
def handle_plan_refinement(
self,
step_number: int,
refined_count: int,
refinements: list[str] | None,
) -> None:
"""Handle plan refinement event."""
if not self.verbose:
return
content = Text()
content.append("Plan Refined\n", style="cyan bold")
content.append("After Step: ", style="white")
content.append(f"{step_number}\n", style="cyan")
content.append("Steps Updated: ", style="white")
content.append(f"{refined_count}\n", style="cyan")
if refinements:
for r in refinements[:3]:
content.append(f"{r[:80]}\n", style="white")
self.print_panel(content, "✏️ Plan Refinement", "cyan")
def handle_plan_replan(
self,
reason: str,
replan_count: int,
preserved_count: int,
) -> None:
"""Handle plan replan triggered event."""
if not self.verbose:
return
content = Text()
content.append("Full Replan Triggered\n", style="yellow bold")
content.append("Reason: ", style="white")
content.append(f"{reason}\n", style="yellow")
content.append("Replan #: ", style="white")
content.append(f"{replan_count}\n", style="yellow")
content.append("Preserved Steps: ", style="white")
content.append(f"{preserved_count}\n", style="yellow")
self.print_panel(content, "🔄 Dynamic Replan", "yellow")
def handle_goal_achieved_early(
self,
steps_completed: int,
steps_remaining: int,
) -> None:
"""Handle goal achieved early event."""
if not self.verbose:
return
content = Text()
content.append("Goal Achieved Early!\n", style="green bold")
content.append("Completed: ", style="white")
content.append(f"{steps_completed} steps\n", style="green")
content.append("Skipped: ", style="white")
content.append(f"{steps_remaining} remaining steps\n", style="green")
self.print_panel(content, "🎯 Early Goal Achievement", "green")
# ----------- AGENT LOGGING EVENTS -----------
def handle_agent_logs_started(

File diff suppressed because it is too large Load Diff

View File

@@ -34,6 +34,7 @@ class ConsoleProvider:
```python
from crewai.flow.async_feedback import ConsoleProvider
@human_feedback(
message="Review this:",
provider=ConsoleProvider(),
@@ -46,6 +47,7 @@ class ConsoleProvider:
```python
from crewai.flow import Flow, start
class MyFlow(Flow):
@start()
def gather_info(self):

View File

@@ -17,6 +17,7 @@ from collections.abc import (
ValuesView,
)
from concurrent.futures import Future, ThreadPoolExecutor
import contextvars
import copy
import enum
import inspect
@@ -497,7 +498,9 @@ class LockedListProxy(list, Generic[T]): # type: ignore[type-arg]
def __bool__(self) -> bool:
return bool(self._list)
def index(self, value: T, start: SupportsIndex = 0, stop: SupportsIndex | None = None) -> int: # type: ignore[override]
def index(
self, value: T, start: SupportsIndex = 0, stop: SupportsIndex | None = None
) -> int: # type: ignore[override]
if stop is None:
return self._list.index(value, start)
return self._list.index(value, start, stop)
@@ -1811,8 +1814,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
try:
asyncio.get_running_loop()
ctx = contextvars.copy_context()
with ThreadPoolExecutor(max_workers=1) as pool:
return pool.submit(asyncio.run, _run_flow()).result()
return pool.submit(ctx.run, asyncio.run, _run_flow()).result()
except RuntimeError:
return asyncio.run(_run_flow())
@@ -2236,8 +2240,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
else:
# Run sync methods in thread pool for isolation
# This allows Agent.kickoff() to work synchronously inside Flow methods
import contextvars
ctx = contextvars.copy_context()
result = await asyncio.to_thread(ctx.run, method, *args, **kwargs)
finally:
@@ -2714,7 +2716,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
from crewai.flow.async_feedback.types import HumanFeedbackPending
if not isinstance(e, HumanFeedbackPending):
logger.error(f"Error executing listener {listener_name}: {e}")
if not getattr(e, "_flow_listener_logged", False):
logger.error(f"Error executing listener {listener_name}: {e}")
e._flow_listener_logged = True # type: ignore[attr-defined]
raise
# ── User Input (self.ask) ────────────────────────────────────────
@@ -2856,8 +2860,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
# Manual executor management to avoid shutdown(wait=True)
# deadlock when the provider call outlives the timeout.
executor = ThreadPoolExecutor(max_workers=1)
ctx = contextvars.copy_context()
future = executor.submit(
provider.request_input, message, self, metadata
ctx.run, provider.request_input, message, self, metadata
)
try:
raw = future.result(timeout=timeout)
@@ -3081,25 +3086,35 @@ class Flow(Generic[T], metaclass=FlowMeta):
logger.warning(
f"Structured output failed, falling back to simple prompting: {e}"
)
response = llm_instance.call(messages=prompt)
response_clean = str(response).strip()
try:
response = llm_instance.call(
messages=[{"role": "user", "content": prompt}],
)
response_clean = str(response).strip()
# Exact match (case-insensitive)
for outcome in outcomes:
if outcome.lower() == response_clean.lower():
return outcome
# Exact match (case-insensitive)
for outcome in outcomes:
if outcome.lower() == response_clean.lower():
return outcome
# Partial match
for outcome in outcomes:
if outcome.lower() in response_clean.lower():
return outcome
# Partial match
for outcome in outcomes:
if outcome.lower() in response_clean.lower():
return outcome
# Fallback to first outcome
logger.warning(
f"Could not match LLM response '{response_clean}' to outcomes {list(outcomes)}. "
f"Falling back to first outcome: {outcomes[0]}"
)
return outcomes[0]
# Fallback to first outcome
logger.warning(
f"Could not match LLM response '{response_clean}' to outcomes {list(outcomes)}. "
f"Falling back to first outcome: {outcomes[0]}"
)
return outcomes[0]
except Exception as fallback_err:
logger.warning(
f"Simple prompting also failed: {fallback_err}. "
f"Falling back to first outcome: {outcomes[0]}"
)
return outcomes[0]
def _log_flow_event(
self,

View File

@@ -76,6 +76,24 @@ if TYPE_CHECKING:
F = TypeVar("F", bound=Callable[..., Any])
def _serialize_llm_for_context(llm: Any) -> str | None:
"""Serialize a BaseLLM object to a model string with provider prefix.
When persisting the LLM for HITL resume, we need to store enough info
to reconstruct a working LLM on the resume worker. Just storing the bare
model name (e.g. "gemini-3-flash-preview") causes provider inference to
fail — it defaults to OpenAI. Including the provider prefix (e.g.
"gemini/gemini-3-flash-preview") allows LLM() to correctly route.
"""
model = getattr(llm, "model", None)
if not model:
return None
provider = getattr(llm, "provider", None)
if provider and "/" not in model:
return f"{provider}/{model}"
return model
@dataclass
class HumanFeedbackResult:
"""Result from a @human_feedback decorated method.
@@ -188,7 +206,7 @@ def human_feedback(
metadata: dict[str, Any] | None = None,
provider: HumanFeedbackProvider | None = None,
learn: bool = False,
learn_source: str = "hitl"
learn_source: str = "hitl",
) -> Callable[[F], F]:
"""Decorator for Flow methods that require human feedback.
@@ -328,9 +346,7 @@ def human_feedback(
"""Recall past HITL lessons and use LLM to pre-review the output."""
try:
query = f"human feedback lessons for {func.__name__}: {method_output!s}"
matches = flow_instance.memory.recall(
query, source=learn_source
)
matches = flow_instance.memory.recall(query, source=learn_source)
if not matches:
return method_output
@@ -341,7 +357,10 @@ def human_feedback(
lessons=lessons,
)
messages = [
{"role": "system", "content": _get_hitl_prompt("hitl_pre_review_system")},
{
"role": "system",
"content": _get_hitl_prompt("hitl_pre_review_system"),
},
{"role": "user", "content": prompt},
]
if getattr(llm_inst, "supports_function_calling", lambda: False)():
@@ -366,7 +385,10 @@ def human_feedback(
feedback=raw_feedback,
)
messages = [
{"role": "system", "content": _get_hitl_prompt("hitl_distill_system")},
{
"role": "system",
"content": _get_hitl_prompt("hitl_distill_system"),
},
{"role": "user", "content": prompt},
]
@@ -408,7 +430,7 @@ def human_feedback(
emit=list(emit) if emit else None,
default_outcome=default_outcome,
metadata=metadata or {},
llm=llm if isinstance(llm, str) else getattr(llm, "model", None),
llm=llm if isinstance(llm, str) else _serialize_llm_for_context(llm),
)
# Determine effective provider:
@@ -487,7 +509,11 @@ def human_feedback(
result = _process_feedback(self, method_output, raw_feedback)
# Distill: extract lessons from output + feedback, store in memory
if learn and getattr(self, "memory", None) is not None and raw_feedback.strip():
if (
learn
and getattr(self, "memory", None) is not None
and raw_feedback.strip()
):
_distill_and_store_lessons(self, method_output, raw_feedback)
return result
@@ -507,7 +533,11 @@ def human_feedback(
result = _process_feedback(self, method_output, raw_feedback)
# Distill: extract lessons from output + feedback, store in memory
if learn and getattr(self, "memory", None) is not None and raw_feedback.strip():
if (
learn
and getattr(self, "memory", None) is not None
and raw_feedback.strip()
):
_distill_and_store_lessons(self, method_output, raw_feedback)
return result
@@ -534,7 +564,7 @@ def human_feedback(
metadata=metadata,
provider=provider,
learn=learn,
learn_source=learn_source
learn_source=learn_source,
)
wrapper.__is_flow_method__ = True

View File

@@ -1,11 +1,10 @@
"""
SQLite-based implementation of flow state persistence.
"""
"""SQLite-based implementation of flow state persistence."""
from __future__ import annotations
from datetime import datetime, timezone
import json
import os
from pathlib import Path
import sqlite3
from typing import TYPE_CHECKING, Any
@@ -13,6 +12,7 @@ from typing import TYPE_CHECKING, Any
from pydantic import BaseModel
from crewai.flow.persistence.base import FlowPersistence
from crewai.utilities.lock_store import lock as store_lock
from crewai.utilities.paths import db_storage_path
@@ -68,11 +68,15 @@ class SQLiteFlowPersistence(FlowPersistence):
raise ValueError("Database path must be provided")
self.db_path = path # Now mypy knows this is str
self._lock_name = f"sqlite:{os.path.realpath(self.db_path)}"
self.init_db()
def init_db(self) -> None:
"""Create the necessary tables if they don't exist."""
with sqlite3.connect(self.db_path, timeout=30) as conn:
with (
store_lock(self._lock_name),
sqlite3.connect(self.db_path, timeout=30) as conn,
):
conn.execute("PRAGMA journal_mode=WAL")
# Main state table
conn.execute(
@@ -114,6 +118,49 @@ class SQLiteFlowPersistence(FlowPersistence):
"""
)
def _save_state_sql(
self,
conn: sqlite3.Connection,
flow_uuid: str,
method_name: str,
state_dict: dict[str, Any],
) -> None:
"""Execute the save-state INSERT without acquiring the lock.
Args:
conn: An open SQLite connection.
flow_uuid: Unique identifier for the flow instance.
method_name: Name of the method that just completed.
state_dict: State data as a plain dict.
"""
conn.execute(
"""
INSERT INTO flow_states (
flow_uuid,
method_name,
timestamp,
state_json
) VALUES (?, ?, ?, ?)
""",
(
flow_uuid,
method_name,
datetime.now(timezone.utc).isoformat(),
json.dumps(state_dict),
),
)
@staticmethod
def _to_state_dict(state_data: dict[str, Any] | BaseModel) -> dict[str, Any]:
"""Convert state_data to a plain dict."""
if isinstance(state_data, BaseModel):
return state_data.model_dump()
if isinstance(state_data, dict):
return state_data
raise ValueError(
f"state_data must be either a Pydantic BaseModel or dict, got {type(state_data)}"
)
def save_state(
self,
flow_uuid: str,
@@ -127,33 +174,13 @@ class SQLiteFlowPersistence(FlowPersistence):
method_name: Name of the method that just completed
state_data: Current state data (either dict or Pydantic model)
"""
# Convert state_data to dict, handling both Pydantic and dict cases
if isinstance(state_data, BaseModel):
state_dict = state_data.model_dump()
elif isinstance(state_data, dict):
state_dict = state_data
else:
raise ValueError(
f"state_data must be either a Pydantic BaseModel or dict, got {type(state_data)}"
)
state_dict = self._to_state_dict(state_data)
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute(
"""
INSERT INTO flow_states (
flow_uuid,
method_name,
timestamp,
state_json
) VALUES (?, ?, ?, ?)
""",
(
flow_uuid,
method_name,
datetime.now(timezone.utc).isoformat(),
json.dumps(state_dict),
),
)
with (
store_lock(self._lock_name),
sqlite3.connect(self.db_path, timeout=30) as conn,
):
self._save_state_sql(conn, flow_uuid, method_name, state_dict)
def load_state(self, flow_uuid: str) -> dict[str, Any] | None:
"""Load the most recent state for a given flow UUID.
@@ -198,24 +225,14 @@ class SQLiteFlowPersistence(FlowPersistence):
context: The pending feedback context with all resume information
state_data: Current state data
"""
# Import here to avoid circular imports
state_dict = self._to_state_dict(state_data)
# Convert state_data to dict
if isinstance(state_data, BaseModel):
state_dict = state_data.model_dump()
elif isinstance(state_data, dict):
state_dict = state_data
else:
raise ValueError(
f"state_data must be either a Pydantic BaseModel or dict, got {type(state_data)}"
)
with (
store_lock(self._lock_name),
sqlite3.connect(self.db_path, timeout=30) as conn,
):
self._save_state_sql(conn, flow_uuid, context.method_name, state_dict)
# Also save to regular state table for consistency
self.save_state(flow_uuid, context.method_name, state_data)
# Save pending feedback context
with sqlite3.connect(self.db_path, timeout=30) as conn:
# Use INSERT OR REPLACE to handle re-triggering feedback on same flow
conn.execute(
"""
INSERT OR REPLACE INTO pending_feedback (
@@ -273,7 +290,10 @@ class SQLiteFlowPersistence(FlowPersistence):
Args:
flow_uuid: Unique identifier for the flow instance
"""
with sqlite3.connect(self.db_path, timeout=30) as conn:
with (
store_lock(self._lock_name),
sqlite3.connect(self.db_path, timeout=30) as conn,
):
conn.execute(
"""
DELETE FROM pending_feedback

View File

@@ -6,9 +6,27 @@ from typing import Any
from pydantic import BaseModel, Field
from crewai.utilities.planning_types import TodoItem
from crewai.utilities.types import LLMMessage
class TodoExecutionResult(BaseModel):
"""Summary of a single todo execution."""
step_number: int = Field(description="Step number in the plan")
description: str = Field(description="What the todo was supposed to do")
tool_used: str | None = Field(
default=None, description="Tool that was used for this step"
)
status: str = Field(description="Final status: completed, failed, pending")
result: str | None = Field(
default=None, description="Result or error message from execution"
)
depends_on: list[int] = Field(
default_factory=list, description="Step numbers this depended on"
)
class LiteAgentOutput(BaseModel):
"""Class that represents the result of a LiteAgent execution."""
@@ -24,12 +42,75 @@ class LiteAgentOutput(BaseModel):
)
messages: list[LLMMessage] = Field(description="Messages of the agent", default=[])
plan: str | None = Field(
default=None, description="The execution plan that was generated, if any"
)
todos: list[TodoExecutionResult] = Field(
default_factory=list,
description="List of todos that were executed with their results",
)
replan_count: int = Field(
default=0, description="Number of times the plan was regenerated"
)
last_replan_reason: str | None = Field(
default=None, description="Reason for the last replan, if any"
)
@classmethod
def from_todo_items(cls, todo_items: list[TodoItem]) -> list[TodoExecutionResult]:
"""Convert TodoItem objects to TodoExecutionResult summaries.
Args:
todo_items: List of TodoItem objects from execution.
Returns:
List of TodoExecutionResult summaries.
"""
return [
TodoExecutionResult(
step_number=item.step_number,
description=item.description,
tool_used=item.tool_to_use,
status=item.status,
result=item.result,
depends_on=item.depends_on,
)
for item in todo_items
]
def to_dict(self) -> dict[str, Any]:
"""Convert pydantic_output to a dictionary."""
if self.pydantic:
return self.pydantic.model_dump()
return {}
@property
def completed_todos(self) -> list[TodoExecutionResult]:
"""Get only the completed todos."""
return [t for t in self.todos if t.status == "completed"]
@property
def failed_todos(self) -> list[TodoExecutionResult]:
"""Get only the failed todos."""
return [t for t in self.todos if t.status == "failed"]
@property
def had_plan(self) -> bool:
"""Check if the agent executed with a plan."""
return self.plan is not None or len(self.todos) > 0
def __str__(self) -> str:
"""Return the raw output as a string."""
return self.raw
def __repr__(self) -> str:
"""Return a detailed representation including todo summary."""
parts = [f"LiteAgentOutput(role={self.agent_role!r}"]
if self.todos:
completed = len(self.completed_todos)
total = len(self.todos)
parts.append(f", todos={completed}/{total} completed")
if self.replan_count > 0:
parts.append(f", replans={self.replan_count}")
parts.append(")")
return "".join(parts)

View File

@@ -240,6 +240,7 @@ ANTHROPIC_MODELS: list[AnthropicModels] = [
GeminiModels: TypeAlias = Literal[
"gemini-3-pro-preview",
"gemini-3-flash-preview",
"gemini-2.5-pro",
"gemini-2.5-pro-preview-03-25",
"gemini-2.5-pro-preview-05-06",
@@ -294,6 +295,7 @@ GeminiModels: TypeAlias = Literal[
]
GEMINI_MODELS: list[GeminiModels] = [
"gemini-3-pro-preview",
"gemini-3-flash-preview",
"gemini-2.5-pro",
"gemini-2.5-pro-preview-03-25",
"gemini-2.5-pro-preview-05-06",

View File

@@ -618,6 +618,50 @@ class AnthropicCompletion(BaseLLM):
return redacted_block
return None
@staticmethod
def _convert_image_blocks(content: Any) -> Any:
"""Convert OpenAI-style image_url blocks to Anthropic image blocks.
Upstream code (e.g. StepExecutor) uses the standard ``image_url``
format with a ``data:`` URI. Anthropic rejects that — it requires
``{"type": "image", "source": {"type": "base64", ...}}``.
Non-list content and blocks that are not ``image_url`` are passed
through unchanged.
"""
if not isinstance(content, list):
return content
converted: list[dict[str, Any]] = []
for block in content:
if not isinstance(block, dict) or block.get("type") != "image_url":
converted.append(block)
continue
image_info = block.get("image_url", {})
url = image_info.get("url", "") if isinstance(image_info, dict) else ""
if url.startswith("data:") and ";base64," in url:
# Parse data:<media_type>;base64,<data>
header, b64_data = url.split(";base64,", 1)
media_type = (
header.split("data:", 1)[1] if "data:" in header else "image/png"
)
converted.append(
{
"type": "image",
"source": {
"type": "base64",
"media_type": media_type,
"data": b64_data,
},
}
)
else:
# Non-data URI — pass through as-is (Anthropic supports url source)
converted.append(block)
return converted
def _format_messages_for_anthropic(
self, messages: str | list[LLMMessage]
) -> tuple[list[LLMMessage], str | None]:
@@ -656,10 +700,11 @@ class AnthropicCompletion(BaseLLM):
tool_call_id = message.get("tool_call_id", "")
if not tool_call_id:
raise ValueError("Tool message missing required tool_call_id")
tool_content = self._convert_image_blocks(content) if content else ""
tool_result = {
"type": "tool_result",
"tool_use_id": tool_call_id,
"content": content if content else "",
"content": tool_content,
}
pending_tool_results.append(tool_result)
elif role == "assistant":
@@ -718,7 +763,12 @@ class AnthropicCompletion(BaseLLM):
role_str = role if role is not None else "user"
if isinstance(content, list):
formatted_messages.append({"role": role_str, "content": content})
formatted_messages.append(
{
"role": role_str,
"content": self._convert_image_blocks(content),
}
)
else:
content_str = content if content is not None else ""
formatted_messages.append(

View File

@@ -1847,7 +1847,10 @@ class BedrockCompletion(BaseLLM):
converse_messages.append({"role": "user", "content": pending_tool_results})
# CRITICAL: Handle model-specific conversation requirements
# Cohere and some other models require conversation to end with user message
# Cohere and some other models require conversation to end with user message.
# Anthropic models on Bedrock also reject assistant messages in the final
# position when tools are present ("pre-filling the assistant response is
# not supported").
if converse_messages:
last_message = converse_messages[-1]
if last_message["role"] == "assistant":
@@ -1874,6 +1877,20 @@ class BedrockCompletion(BaseLLM):
"content": [{"text": "Continue your response."}],
}
)
# Anthropic (Claude) models reject assistant-last messages when
# tools are in the request. Append a user message so the
# Converse API accepts the payload.
elif "anthropic" in self.model.lower() or "claude" in self.model.lower():
converse_messages.append(
{
"role": "user",
"content": [
{
"text": "Please continue and provide your final answer."
}
],
}
)
# Ensure first message is from user (required by Converse API)
if not converse_messages:

View File

@@ -11,6 +11,7 @@ into a standalone MCPToolResolver. It handles three flavours of MCP reference:
from __future__ import annotations
import asyncio
import contextvars
import time
from typing import TYPE_CHECKING, Any, Final, cast
from urllib.parse import urlparse
@@ -22,10 +23,10 @@ from crewai.mcp.config import (
MCPServerSSE,
MCPServerStdio,
)
from crewai.utilities.string_utils import sanitize_tool_name
from crewai.mcp.transports.http import HTTPTransport
from crewai.mcp.transports.sse import SSETransport
from crewai.mcp.transports.stdio import StdioTransport
from crewai.utilities.string_utils import sanitize_tool_name
if TYPE_CHECKING:
@@ -227,7 +228,9 @@ class MCPToolResolver:
server_params = {"url": server_url}
server_name = self._extract_server_name(server_url)
sanitized_specific_tool = sanitize_tool_name(specific_tool) if specific_tool else None
sanitized_specific_tool = (
sanitize_tool_name(specific_tool) if specific_tool else None
)
try:
tool_schemas = self._get_mcp_tool_schemas(server_params)
@@ -353,9 +356,10 @@ class MCPToolResolver:
asyncio.get_running_loop()
import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit(
asyncio.run, _setup_client_and_list_tools()
ctx.run, asyncio.run, _setup_client_and_list_tools()
)
tools_list = future.result()
except RuntimeError:

View File

@@ -308,7 +308,9 @@ def analyze_for_save(
return MemoryAnalysis.model_validate(response)
except Exception as e:
_logger.warning(
"Memory save analysis failed, using defaults: %s", e, exc_info=False,
"Memory save analysis failed, using defaults: %s",
e,
exc_info=False,
)
return _SAVE_DEFAULTS
@@ -366,6 +368,8 @@ def analyze_for_consolidation(
return ConsolidationPlan.model_validate(response)
except Exception as e:
_logger.warning(
"Consolidation analysis failed, defaulting to insert: %s", e, exc_info=False,
"Consolidation analysis failed, defaulting to insert: %s",
e,
exc_info=False,
)
return _CONSOLIDATION_DEFAULT

View File

@@ -11,7 +11,9 @@ Orchestrates the encoding side of memory in a single Flow with 5 steps:
from __future__ import annotations
from concurrent.futures import Future, ThreadPoolExecutor
import contextvars
from datetime import datetime
import logging
import math
from typing import Any
from uuid import uuid4
@@ -28,6 +30,8 @@ from crewai.memory.analyze import (
from crewai.memory.types import MemoryConfig, MemoryRecord, embed_texts
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# State models
# ---------------------------------------------------------------------------
@@ -164,14 +168,20 @@ class EncodingFlow(Flow[EncodingState]):
def parallel_find_similar(self) -> None:
"""Search storage for similar records, concurrently for all active items."""
items = list(self.state.items)
active = [(i, item) for i, item in enumerate(items) if not item.dropped and item.embedding]
active = [
(i, item)
for i, item in enumerate(items)
if not item.dropped and item.embedding
]
if not active:
return
def _search_one(item: ItemState) -> list[tuple[MemoryRecord, float]]:
def _search_one(
item: ItemState,
) -> list[tuple[MemoryRecord, float]]:
scope_prefix = item.scope if item.scope and item.scope.strip("/") else None
return self._storage.search(
return self._storage.search( # type: ignore[no-any-return]
item.embedding,
scope_prefix=scope_prefix,
categories=None,
@@ -181,14 +191,37 @@ class EncodingFlow(Flow[EncodingState]):
if len(active) == 1:
_, item = active[0]
raw = _search_one(item)
try:
raw = _search_one(item)
except Exception:
logger.warning(
"Storage search failed in parallel_find_similar, "
"treating item as new",
exc_info=True,
)
raw = []
item.similar_records = [r for r, _ in raw]
item.top_similarity = float(raw[0][1]) if raw else 0.0
else:
with ThreadPoolExecutor(max_workers=min(len(active), 8)) as pool:
futures = [(i, item, pool.submit(_search_one, item)) for i, item in active]
futures = [
(
i,
item,
pool.submit(contextvars.copy_context().run, _search_one, item),
)
for i, item in active
]
for _, item, future in futures:
raw = future.result()
try:
raw = future.result()
except Exception:
logger.warning(
"Storage search failed in parallel_find_similar, "
"treating item as new",
exc_info=True,
)
raw = []
item.similar_records = [r for r, _ in raw]
item.top_similarity = float(raw[0][1]) if raw else 0.0
@@ -250,24 +283,38 @@ class EncodingFlow(Flow[EncodingState]):
# Group B: consolidation only
self._apply_defaults(item)
consol_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_consolidation,
item.content, list(item.similar_records), self._llm,
item.content,
list(item.similar_records),
self._llm,
)
elif not fields_provided and not has_similar:
# Group C: field resolution only
save_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_save,
item.content, existing_scopes, existing_categories, self._llm,
item.content,
existing_scopes,
existing_categories,
self._llm,
)
else:
# Group D: both in parallel
save_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_save,
item.content, existing_scopes, existing_categories, self._llm,
item.content,
existing_scopes,
existing_categories,
self._llm,
)
consol_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_consolidation,
item.content, list(item.similar_records), self._llm,
item.content,
list(item.similar_records),
self._llm,
)
# Collect field-resolution results
@@ -300,8 +347,8 @@ class EncodingFlow(Flow[EncodingState]):
item.plan = ConsolidationPlan(actions=[], insert_new=True)
# Collect consolidation results
for i, future in consol_futures.items():
items[i].plan = future.result()
for i, consol_future in consol_futures.items():
items[i].plan = consol_future.result()
finally:
pool.shutdown(wait=False)
@@ -339,7 +386,9 @@ class EncodingFlow(Flow[EncodingState]):
# similar_records overlap). Collect one action per record_id, first wins.
# Also build a map from record_id to the original MemoryRecord for updates.
dedup_deletes: set[str] = set() # record_ids to delete
dedup_updates: dict[str, tuple[int, str]] = {} # record_id -> (item_idx, new_content)
dedup_updates: dict[
str, tuple[int, str]
] = {} # record_id -> (item_idx, new_content)
all_similar: dict[str, MemoryRecord] = {} # record_id -> MemoryRecord
for i, item in enumerate(items):
@@ -350,13 +399,24 @@ class EncodingFlow(Flow[EncodingState]):
all_similar[r.id] = r
for action in item.plan.actions:
rid = action.record_id
if action.action == "delete" and rid not in dedup_deletes and rid not in dedup_updates:
if (
action.action == "delete"
and rid not in dedup_deletes
and rid not in dedup_updates
):
dedup_deletes.add(rid)
elif action.action == "update" and action.new_content and rid not in dedup_deletes and rid not in dedup_updates:
elif (
action.action == "update"
and action.new_content
and rid not in dedup_deletes
and rid not in dedup_updates
):
dedup_updates[rid] = (i, action.new_content)
# --- Batch re-embed all update contents in ONE call ---
update_list = list(dedup_updates.items()) # [(record_id, (item_idx, new_content)), ...]
update_list = list(
dedup_updates.items()
) # [(record_id, (item_idx, new_content)), ...]
update_embeddings: list[list[float]] = []
if update_list:
update_contents = [content for _, (_, content) in update_list]
@@ -377,51 +437,52 @@ class EncodingFlow(Flow[EncodingState]):
if item.dropped or item.plan is None:
continue
if item.plan.insert_new:
to_insert.append((i, MemoryRecord(
content=item.content,
scope=item.resolved_scope,
categories=item.resolved_categories,
metadata=item.resolved_metadata,
importance=item.resolved_importance,
embedding=item.embedding if item.embedding else None,
source=item.resolved_source,
private=item.resolved_private,
)))
# All storage mutations under one lock so no other pipeline can
# interleave and cause version conflicts. The lock is reentrant
# (RLock) so the individual storage methods re-acquire it safely.
updated_records: dict[str, MemoryRecord] = {}
with self._storage.write_lock:
if dedup_deletes:
self._storage.delete(record_ids=list(dedup_deletes))
self.state.records_deleted += len(dedup_deletes)
for rid, (_item_idx, new_content) in dedup_updates.items():
existing = all_similar.get(rid)
if existing is not None:
new_emb = update_emb_map.get(rid, [])
updated = MemoryRecord(
id=existing.id,
content=new_content,
scope=existing.scope,
categories=existing.categories,
metadata=existing.metadata,
importance=existing.importance,
created_at=existing.created_at,
last_accessed=now,
embedding=new_emb if new_emb else existing.embedding,
to_insert.append(
(
i,
MemoryRecord(
content=item.content,
scope=item.resolved_scope,
categories=item.resolved_categories,
metadata=item.resolved_metadata,
importance=item.resolved_importance,
embedding=item.embedding if item.embedding else None,
source=item.resolved_source,
private=item.resolved_private,
),
)
self._storage.update(updated)
self.state.records_updated += 1
updated_records[rid] = updated
)
if to_insert:
records = [r for _, r in to_insert]
self._storage.save(records)
self.state.records_inserted += len(records)
for idx, record in to_insert:
items[idx].result_record = record
updated_records: dict[str, MemoryRecord] = {}
if dedup_deletes:
self._storage.delete(record_ids=list(dedup_deletes))
self.state.records_deleted += len(dedup_deletes)
for rid, (_item_idx, new_content) in dedup_updates.items():
existing = all_similar.get(rid)
if existing is not None:
new_emb = update_emb_map.get(rid, [])
updated = MemoryRecord(
id=existing.id,
content=new_content,
scope=existing.scope,
categories=existing.categories,
metadata=existing.metadata,
importance=existing.importance,
created_at=existing.created_at,
last_accessed=now,
embedding=new_emb if new_emb else existing.embedding,
)
self._storage.update(updated)
self.state.records_updated += 1
updated_records[rid] = updated
if to_insert:
records = [r for _, r in to_insert]
self._storage.save(records)
self.state.records_inserted += len(records)
for idx, record in to_insert:
items[idx].result_record = record
# Set result_record for non-insert items (after lock, using updated_records)
for _i, item in enumerate(items):

View File

@@ -11,7 +11,9 @@ Implements adaptive-depth retrieval with:
from __future__ import annotations
from concurrent.futures import ThreadPoolExecutor, as_completed
import contextvars
from datetime import datetime
import logging
from typing import Any
from uuid import uuid4
@@ -29,6 +31,9 @@ from crewai.memory.types import (
)
logger = logging.getLogger(__name__)
class RecallState(BaseModel):
"""State for the recall flow."""
@@ -103,13 +108,12 @@ class RecallFlow(Flow[RecallState]):
)
# Post-filter by time cutoff
if self.state.time_cutoff and raw:
raw = [
(r, s) for r, s in raw if r.created_at >= self.state.time_cutoff
]
raw = [(r, s) for r, s in raw if r.created_at >= self.state.time_cutoff]
# Privacy filter
if not self.state.include_private and raw:
raw = [
(r, s) for r, s in raw
(r, s)
for r, s in raw
if not r.private or r.source == self.state.source
]
return scope, raw
@@ -125,38 +129,57 @@ class RecallFlow(Flow[RecallState]):
if len(tasks) <= 1:
for emb, sc in tasks:
scope, results = _search_one(emb, sc)
try:
scope, results = _search_one(emb, sc)
except Exception:
logger.warning(
"Storage search failed in recall flow, skipping scope",
exc_info=True,
)
continue
if results:
top_composite, _ = compute_composite_score(
results[0][0], results[0][1], self._config
)
findings.append({
"scope": scope,
"results": results,
"top_score": top_composite,
})
findings.append(
{
"scope": scope,
"results": results,
"top_score": top_composite,
}
)
else:
with ThreadPoolExecutor(max_workers=min(len(tasks), 4)) as pool:
futures = {
pool.submit(_search_one, emb, sc): (emb, sc)
pool.submit(contextvars.copy_context().run, _search_one, emb, sc): (
emb,
sc,
)
for emb, sc in tasks
}
for future in as_completed(futures):
scope, results = future.result()
try:
scope, results = future.result()
except Exception:
logger.warning(
"Storage search failed in recall flow, skipping scope",
exc_info=True,
)
continue
if results:
top_composite, _ = compute_composite_score(
results[0][0], results[0][1], self._config
)
findings.append({
"scope": scope,
"results": results,
"top_score": top_composite,
})
findings.append(
{
"scope": scope,
"results": results,
"top_score": top_composite,
}
)
self.state.chunk_findings = findings
self.state.confidence = max(
(f["top_score"] for f in findings), default=0.0
)
self.state.confidence = max((f["top_score"] for f in findings), default=0.0)
return findings
# ------------------------------------------------------------------
@@ -210,12 +233,16 @@ class RecallFlow(Flow[RecallState]):
# Parse time_filter into a datetime cutoff
if analysis.time_filter:
try:
self.state.time_cutoff = datetime.fromisoformat(analysis.time_filter)
self.state.time_cutoff = datetime.fromisoformat(
analysis.time_filter
)
except ValueError:
pass
# Batch-embed all sub-queries in ONE call
queries = analysis.recall_queries if analysis.recall_queries else [self.state.query]
queries = (
analysis.recall_queries if analysis.recall_queries else [self.state.query]
)
queries = queries[:3]
embeddings = embed_texts(self._embedder, queries)
pairs: list[tuple[str, list[float]]] = [
@@ -237,13 +264,17 @@ class RecallFlow(Flow[RecallState]):
if analysis and analysis.suggested_scopes:
candidates = [s for s in analysis.suggested_scopes if s]
else:
candidates = self._storage.list_scopes(scope_prefix)
try:
candidates = self._storage.list_scopes(scope_prefix)
except Exception:
logger.warning(
"Storage list_scopes failed in filter_and_chunk, "
"falling back to scope prefix",
exc_info=True,
)
candidates = []
if not candidates:
info = self._storage.get_scope_info(scope_prefix)
if info.record_count > 0:
candidates = [scope_prefix]
else:
candidates = [scope_prefix]
candidates = [scope_prefix]
self.state.candidate_scopes = candidates[:20]
return self.state.candidate_scopes
@@ -296,17 +327,21 @@ class RecallFlow(Flow[RecallState]):
response = self._llm.call([{"role": "user", "content": prompt}])
if isinstance(response, str) and "missing" in response.lower():
self.state.evidence_gaps.append(response[:200])
enhanced.append({
"scope": finding["scope"],
"extraction": response,
"results": finding["results"],
})
enhanced.append(
{
"scope": finding["scope"],
"extraction": response,
"results": finding["results"],
}
)
except Exception:
enhanced.append({
"scope": finding["scope"],
"extraction": "",
"results": finding["results"],
})
enhanced.append(
{
"scope": finding["scope"],
"extraction": "",
"results": finding["results"],
}
)
self.state.chunk_findings = enhanced
return enhanced
@@ -318,7 +353,7 @@ class RecallFlow(Flow[RecallState]):
@router(re_search)
def re_decide_depth(self) -> str:
"""Re-evaluate depth after re-search. Same logic as decide_depth."""
return self.decide_depth()
return self.decide_depth() # type: ignore[call-arg]
@listen("synthesize")
def synthesize_results(self) -> list[MemoryMatch]:

View File

@@ -1,5 +1,6 @@
import json
import logging
import os
from pathlib import Path
import sqlite3
from typing import Any
@@ -8,6 +9,7 @@ from crewai.task import Task
from crewai.utilities import Printer
from crewai.utilities.crew_json_encoder import CrewJSONEncoder
from crewai.utilities.errors import DatabaseError, DatabaseOperationError
from crewai.utilities.lock_store import lock as store_lock
from crewai.utilities.paths import db_storage_path
@@ -24,6 +26,7 @@ class KickoffTaskOutputsSQLiteStorage:
# Get the parent directory of the default db path and create our db file there
db_path = str(Path(db_storage_path()) / "latest_kickoff_task_outputs.db")
self.db_path = db_path
self._lock_name = f"sqlite:{os.path.realpath(self.db_path)}"
self._printer: Printer = Printer()
self._initialize_db()
@@ -38,24 +41,25 @@ class KickoffTaskOutputsSQLiteStorage:
DatabaseOperationError: If database initialization fails due to SQLite errors.
"""
try:
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("PRAGMA journal_mode=WAL")
cursor = conn.cursor()
cursor.execute(
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("PRAGMA journal_mode=WAL")
cursor = conn.cursor()
cursor.execute(
"""
CREATE TABLE IF NOT EXISTS latest_kickoff_task_outputs (
task_id TEXT PRIMARY KEY,
expected_output TEXT,
output JSON,
task_index INTEGER,
inputs JSON,
was_replayed BOOLEAN,
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
)
"""
CREATE TABLE IF NOT EXISTS latest_kickoff_task_outputs (
task_id TEXT PRIMARY KEY,
expected_output TEXT,
output JSON,
task_index INTEGER,
inputs JSON,
was_replayed BOOLEAN,
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
)
"""
)
conn.commit()
conn.commit()
except sqlite3.Error as e:
error_msg = DatabaseError.format_error(DatabaseError.INIT_ERROR, e)
logger.error(error_msg)
@@ -83,25 +87,26 @@ class KickoffTaskOutputsSQLiteStorage:
"""
inputs = inputs or {}
try:
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor()
cursor.execute(
"""
INSERT OR REPLACE INTO latest_kickoff_task_outputs
(task_id, expected_output, output, task_index, inputs, was_replayed)
VALUES (?, ?, ?, ?, ?, ?)
""",
(
str(task.id),
task.expected_output,
json.dumps(output, cls=CrewJSONEncoder),
task_index,
json.dumps(inputs, cls=CrewJSONEncoder),
was_replayed,
),
)
conn.commit()
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor()
cursor.execute(
"""
INSERT OR REPLACE INTO latest_kickoff_task_outputs
(task_id, expected_output, output, task_index, inputs, was_replayed)
VALUES (?, ?, ?, ?, ?, ?)
""",
(
str(task.id),
task.expected_output,
json.dumps(output, cls=CrewJSONEncoder),
task_index,
json.dumps(inputs, cls=CrewJSONEncoder),
was_replayed,
),
)
conn.commit()
except sqlite3.Error as e:
error_msg = DatabaseError.format_error(DatabaseError.SAVE_ERROR, e)
logger.error(error_msg)
@@ -126,30 +131,31 @@ class KickoffTaskOutputsSQLiteStorage:
DatabaseOperationError: If updating the task output fails due to SQLite errors.
"""
try:
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor()
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor()
fields = []
values = []
for key, value in kwargs.items():
fields.append(f"{key} = ?")
values.append(
json.dumps(value, cls=CrewJSONEncoder)
if isinstance(value, dict)
else value
)
fields = []
values = []
for key, value in kwargs.items():
fields.append(f"{key} = ?")
values.append(
json.dumps(value, cls=CrewJSONEncoder)
if isinstance(value, dict)
else value
)
query = f"UPDATE latest_kickoff_task_outputs SET {', '.join(fields)} WHERE task_index = ?" # nosec # noqa: S608
values.append(task_index)
query = f"UPDATE latest_kickoff_task_outputs SET {', '.join(fields)} WHERE task_index = ?" # nosec # noqa: S608
values.append(task_index)
cursor.execute(query, tuple(values))
conn.commit()
cursor.execute(query, tuple(values))
conn.commit()
if cursor.rowcount == 0:
logger.warning(
f"No row found with task_index {task_index}. No update performed."
)
if cursor.rowcount == 0:
logger.warning(
f"No row found with task_index {task_index}. No update performed."
)
except sqlite3.Error as e:
error_msg = DatabaseError.format_error(DatabaseError.UPDATE_ERROR, e)
logger.error(error_msg)
@@ -206,11 +212,12 @@ class KickoffTaskOutputsSQLiteStorage:
DatabaseOperationError: If deleting task outputs fails due to SQLite errors.
"""
try:
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor()
cursor.execute("DELETE FROM latest_kickoff_task_outputs")
conn.commit()
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor()
cursor.execute("DELETE FROM latest_kickoff_task_outputs")
conn.commit()
except sqlite3.Error as e:
error_msg = DatabaseError.format_error(DatabaseError.DELETE_ERROR, e)
logger.error(error_msg)

View File

@@ -2,7 +2,7 @@
from __future__ import annotations
from contextlib import AbstractContextManager
import contextvars
from datetime import datetime
import json
import logging
@@ -10,9 +10,9 @@ import os
from pathlib import Path
import threading
import time
from typing import Any, ClassVar
from typing import Any
import lancedb
import lancedb # type: ignore[import-untyped]
from crewai.memory.types import MemoryRecord, ScopeInfo
from crewai.utilities.lock_store import lock as store_lock
@@ -41,15 +41,6 @@ _RETRY_BASE_DELAY = 0.2 # seconds; doubles on each retry
class LanceDBStorage:
"""LanceDB-backed storage for the unified memory system."""
# Class-level registry: maps resolved database path -> shared write lock.
# When multiple Memory instances (e.g. agent + crew) independently create
# LanceDBStorage pointing at the same directory, they share one lock so
# their writes don't conflict.
# Uses RLock (reentrant) so callers can hold the lock for a batch of
# operations while the individual methods re-acquire it without deadlocking.
_path_locks: ClassVar[dict[str, threading.RLock]] = {}
_path_locks_guard: ClassVar[threading.Lock] = threading.Lock()
def __init__(
self,
path: str | Path | None = None,
@@ -85,11 +76,6 @@ class LanceDBStorage:
self._table_name = table_name
self._db = lancedb.connect(str(self._path))
# On macOS and Linux the default per-process open-file limit is 256.
# A LanceDB table stores one file per fragment (one fragment per save()
# call by default). With hundreds of fragments, a single full-table
# scan opens all of them simultaneously, exhausting the limit.
# Raise it proactively so scans on large tables never hit OS error 24.
try:
import resource
@@ -104,67 +90,44 @@ class LanceDBStorage:
self._lock_name = f"lancedb:{self._path.resolve()}"
resolved = str(self._path.resolve())
with LanceDBStorage._path_locks_guard:
if resolved not in LanceDBStorage._path_locks:
LanceDBStorage._path_locks[resolved] = threading.RLock()
self._write_lock = LanceDBStorage._path_locks[resolved]
# Try to open an existing table and infer dimension from its schema.
# If no table exists yet, defer creation until the first save so the
# dimension can be auto-detected from the embedder's actual output.
try:
self._table: lancedb.table.Table | None = self._db.open_table(
self._table_name
)
self._table: Any = self._db.open_table(self._table_name)
self._vector_dim: int = self._infer_dim_from_table(self._table)
# Best-effort: create the scope index if it doesn't exist yet.
with self._file_lock():
with store_lock(self._lock_name):
self._ensure_scope_index()
# Compact in the background if the table has accumulated many
# fragments from previous runs (each save() creates one).
self._compact_if_needed()
except Exception:
_logger.debug(
"Failed to open existing LanceDB table %r", table_name, exc_info=True
)
self._table = None
self._vector_dim = vector_dim or 0 # 0 = not yet known
# Explicit dim provided: create the table immediately if it doesn't exist.
if self._table is None and vector_dim is not None:
self._vector_dim = vector_dim
with self._file_lock():
with store_lock(self._lock_name):
self._table = self._create_table(vector_dim)
@property
def write_lock(self) -> threading.RLock:
"""The shared reentrant write lock for this database path.
Callers can acquire this to hold the lock across multiple storage
operations (e.g. delete + update + save as one atomic batch).
Individual methods also acquire it internally, but since it's
reentrant (RLock), the same thread won't deadlock.
"""
return self._write_lock
@staticmethod
def _infer_dim_from_table(table: lancedb.table.Table) -> int:
def _infer_dim_from_table(table: Any) -> int:
"""Read vector dimension from an existing table's schema."""
schema = table.schema
for field in schema:
if field.name == "vector":
try:
return field.type.list_size
return int(field.type.list_size)
except Exception:
break
return DEFAULT_VECTOR_DIM
def _file_lock(self) -> AbstractContextManager[None]:
"""Return a cross-process lock for serialising writes."""
return store_lock(self._lock_name)
def _do_write(self, op: str, *args: Any, **kwargs: Any) -> Any:
"""Execute a single table write with retry on commit conflicts.
Caller must already hold the cross-process file lock.
Caller must already hold ``store_lock(self._lock_name)``.
"""
delay = _RETRY_BASE_DELAY
for attempt in range(_MAX_RETRIES + 1):
@@ -182,16 +145,16 @@ class LanceDBStorage:
)
try:
self._table = self._db.open_table(self._table_name)
except Exception: # noqa: S110
pass
except Exception:
_logger.debug("Failed to re-open table during retry", exc_info=True)
time.sleep(delay)
delay *= 2
return None # unreachable, but satisfies type checker
def _create_table(self, vector_dim: int) -> lancedb.table.Table:
def _create_table(self, vector_dim: int) -> Any:
"""Create a new table with the given vector dimension.
Caller must already hold the cross-process file lock.
Caller must already hold ``store_lock(self._lock_name)``.
"""
placeholder = [
{
@@ -229,8 +192,10 @@ class LanceDBStorage:
return
try:
self._table.create_scalar_index("scope", index_type="BTREE", replace=False)
except Exception: # noqa: S110
pass # index already exists, table empty, or unsupported version
except Exception:
_logger.debug(
"Scope index creation skipped (may already exist)", exc_info=True
)
# ------------------------------------------------------------------
# Automatic background compaction
@@ -250,8 +215,10 @@ class LanceDBStorage:
def _compact_async(self) -> None:
"""Fire-and-forget: compact the table in a daemon background thread."""
ctx = contextvars.copy_context()
threading.Thread(
target=self._compact_safe,
target=ctx.run,
args=(self._compact_safe,),
daemon=True,
name="lancedb-compact",
).start()
@@ -260,13 +227,13 @@ class LanceDBStorage:
"""Run ``table.optimize()`` in a background thread, absorbing errors."""
try:
if self._table is not None:
with self._file_lock():
with store_lock(self._lock_name):
self._table.optimize()
self._ensure_scope_index()
except Exception:
_logger.debug("LanceDB background compaction failed", exc_info=True)
def _ensure_table(self, vector_dim: int | None = None) -> lancedb.table.Table:
def _ensure_table(self, vector_dim: int | None = None) -> Any:
"""Return the table, creating it lazily if needed.
Args:
@@ -332,12 +299,12 @@ class LanceDBStorage:
dim = len(r.embedding)
break
is_new_table = self._table is None
with self._write_lock, self._file_lock():
with store_lock(self._lock_name):
self._ensure_table(vector_dim=dim)
rows = [self._record_to_row(r) for r in records]
for r in rows:
if r["vector"] is None or len(r["vector"]) != self._vector_dim:
r["vector"] = [0.0] * self._vector_dim
rows = [self._record_to_row(rec) for rec in records]
for row in rows:
if row["vector"] is None or len(row["vector"]) != self._vector_dim:
row["vector"] = [0.0] * self._vector_dim
self._do_write("add", rows)
if is_new_table:
self._ensure_scope_index()
@@ -348,7 +315,7 @@ class LanceDBStorage:
def update(self, record: MemoryRecord) -> None:
"""Update a record by ID. Preserves created_at, updates last_accessed."""
with self._write_lock, self._file_lock():
with store_lock(self._lock_name):
self._ensure_table()
safe_id = str(record.id).replace("'", "''")
self._do_write("delete", f"id = '{safe_id}'")
@@ -369,7 +336,7 @@ class LanceDBStorage:
"""
if not record_ids or self._table is None:
return
with self._write_lock, self._file_lock():
with store_lock(self._lock_name):
now = datetime.utcnow().isoformat()
safe_ids = [str(rid).replace("'", "''") for rid in record_ids]
ids_expr = ", ".join(f"'{rid}'" for rid in safe_ids)
@@ -435,12 +402,12 @@ class LanceDBStorage:
) -> int:
if self._table is None:
return 0
with self._write_lock, self._file_lock():
with store_lock(self._lock_name):
if record_ids and not (categories or metadata_filter):
before = self._table.count_rows()
before = int(self._table.count_rows())
ids_expr = ", ".join(f"'{rid}'" for rid in record_ids)
self._do_write("delete", f"id IN ({ids_expr})")
return before - self._table.count_rows()
return before - int(self._table.count_rows())
if categories or metadata_filter:
rows = self._scan_rows(scope_prefix)
to_delete: list[str] = []
@@ -459,10 +426,10 @@ class LanceDBStorage:
to_delete.append(record.id)
if not to_delete:
return 0
before = self._table.count_rows()
before = int(self._table.count_rows())
ids_expr = ", ".join(f"'{rid}'" for rid in to_delete)
self._do_write("delete", f"id IN ({ids_expr})")
return before - self._table.count_rows()
return before - int(self._table.count_rows())
conditions = []
if scope_prefix is not None and scope_prefix.strip("/"):
prefix = scope_prefix.rstrip("/")
@@ -472,13 +439,13 @@ class LanceDBStorage:
if older_than is not None:
conditions.append(f"created_at < '{older_than.isoformat()}'")
if not conditions:
before = self._table.count_rows()
before = int(self._table.count_rows())
self._do_write("delete", "id != ''")
return before - self._table.count_rows()
return before - int(self._table.count_rows())
where_expr = " AND ".join(conditions)
before = self._table.count_rows()
before = int(self._table.count_rows())
self._do_write("delete", where_expr)
return before - self._table.count_rows()
return before - int(self._table.count_rows())
def _scan_rows(
self,
@@ -505,7 +472,8 @@ class LanceDBStorage:
q = q.where(f"scope LIKE '{scope_prefix.rstrip('/')}%'")
if columns is not None:
q = q.select(columns)
return q.limit(limit).to_list()
result: list[dict[str, Any]] = q.limit(limit).to_list()
return result
def list_records(
self, scope_prefix: str | None = None, limit: int = 200, offset: int = 0
@@ -612,12 +580,12 @@ class LanceDBStorage:
if self._table is None:
return 0
if scope_prefix is None or scope_prefix.strip("/") == "":
return self._table.count_rows()
return int(self._table.count_rows())
info = self.get_scope_info(scope_prefix)
return info.record_count
def reset(self, scope_prefix: str | None = None) -> None:
with self._write_lock, self._file_lock():
with store_lock(self._lock_name):
if scope_prefix is None or scope_prefix.strip("/") == "":
if self._table is not None:
self._db.drop_table(self._table_name)
@@ -643,7 +611,7 @@ class LanceDBStorage:
"""
if self._table is None:
return
with self._write_lock, self._file_lock():
with store_lock(self._lock_name):
self._table.optimize()
self._ensure_scope_index()

View File

@@ -3,6 +3,7 @@
from __future__ import annotations
from concurrent.futures import Future, ThreadPoolExecutor
import contextvars
from datetime import datetime
import threading
import time
@@ -229,8 +230,9 @@ class Memory(BaseModel):
If the pool has been shut down (e.g. after ``close()``), the save
runs synchronously as a fallback so late saves still succeed.
"""
ctx = contextvars.copy_context()
try:
future: Future[Any] = self._save_pool.submit(fn, *args, **kwargs)
future: Future[Any] = self._save_pool.submit(ctx.run, fn, *args, **kwargs)
except RuntimeError:
# Pool shut down -- run synchronously as fallback
future = Future()

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio
from collections.abc import Callable
import contextvars
from functools import wraps
import inspect
from typing import TYPE_CHECKING, Any, Concatenate, ParamSpec, TypeVar, overload
@@ -169,8 +170,9 @@ def _call_method(method: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
if loop and loop.is_running():
import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as pool:
return pool.submit(asyncio.run, result).result()
return pool.submit(ctx.run, asyncio.run, result).result()
return asyncio.run(result)
return result

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio
from collections.abc import Callable
import contextvars
from functools import partial
import inspect
from pathlib import Path
@@ -146,8 +147,9 @@ def _resolve_result(result: Any) -> Any:
if loop and loop.is_running():
import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as pool:
return pool.submit(asyncio.run, result).result()
return pool.submit(ctx.run, asyncio.run, result).result()
return asyncio.run(result)
return result

View File

@@ -1,5 +1,8 @@
"""ChromaDB client implementation."""
import asyncio
from collections.abc import AsyncIterator
from contextlib import AbstractContextManager, asynccontextmanager, nullcontext
import logging
from typing import Any
@@ -29,6 +32,7 @@ from crewai.rag.core.base_client import (
BaseCollectionParams,
)
from crewai.rag.types import SearchResult
from crewai.utilities.lock_store import lock as store_lock
from crewai.utilities.logger_utils import suppress_logging
@@ -52,6 +56,7 @@ class ChromaDBClient(BaseClient):
default_limit: int = 5,
default_score_threshold: float = 0.6,
default_batch_size: int = 100,
lock_name: str = "",
) -> None:
"""Initialize ChromaDBClient with client and embedding function.
@@ -61,12 +66,32 @@ class ChromaDBClient(BaseClient):
default_limit: Default number of results to return in searches.
default_score_threshold: Default minimum score for search results.
default_batch_size: Default batch size for adding documents.
lock_name: Optional lock name for cross-process synchronization.
"""
self.client = client
self.embedding_function = embedding_function
self.default_limit = default_limit
self.default_score_threshold = default_score_threshold
self.default_batch_size = default_batch_size
self._lock_name = lock_name
def _locked(self) -> AbstractContextManager[None]:
"""Return a cross-process lock context manager, or nullcontext if no lock name."""
return store_lock(self._lock_name) if self._lock_name else nullcontext()
@asynccontextmanager
async def _alocked(self) -> AsyncIterator[None]:
"""Async cross-process lock that acquires/releases in an executor."""
if not self._lock_name:
yield
return
lock_cm = store_lock(self._lock_name)
loop = asyncio.get_event_loop()
await loop.run_in_executor(None, lock_cm.__enter__)
try:
yield
finally:
await loop.run_in_executor(None, lock_cm.__exit__, None, None, None)
def create_collection(
self, **kwargs: Unpack[ChromaDBCollectionCreateParams]
@@ -313,23 +338,24 @@ class ChromaDBClient(BaseClient):
if not documents:
raise ValueError("Documents list cannot be empty")
collection = self.client.get_or_create_collection(
name=_sanitize_collection_name(collection_name),
embedding_function=self.embedding_function,
)
prepared = _prepare_documents_for_chromadb(documents)
for i in range(0, len(prepared.ids), batch_size):
batch_ids, batch_texts, batch_metadatas = _create_batch_slice(
prepared=prepared, start_index=i, batch_size=batch_size
with self._locked():
collection = self.client.get_or_create_collection(
name=_sanitize_collection_name(collection_name),
embedding_function=self.embedding_function,
)
collection.upsert(
ids=batch_ids,
documents=batch_texts,
metadatas=batch_metadatas, # type: ignore[arg-type]
)
prepared = _prepare_documents_for_chromadb(documents)
for i in range(0, len(prepared.ids), batch_size):
batch_ids, batch_texts, batch_metadatas = _create_batch_slice(
prepared=prepared, start_index=i, batch_size=batch_size
)
collection.upsert(
ids=batch_ids,
documents=batch_texts,
metadatas=batch_metadatas, # type: ignore[arg-type]
)
async def aadd_documents(self, **kwargs: Unpack[BaseCollectionAddParams]) -> None:
"""Add documents with their embeddings to a collection asynchronously.
@@ -363,22 +389,23 @@ class ChromaDBClient(BaseClient):
if not documents:
raise ValueError("Documents list cannot be empty")
collection = await self.client.get_or_create_collection(
name=_sanitize_collection_name(collection_name),
embedding_function=self.embedding_function,
)
prepared = _prepare_documents_for_chromadb(documents)
for i in range(0, len(prepared.ids), batch_size):
batch_ids, batch_texts, batch_metadatas = _create_batch_slice(
prepared=prepared, start_index=i, batch_size=batch_size
async with self._alocked():
collection = await self.client.get_or_create_collection(
name=_sanitize_collection_name(collection_name),
embedding_function=self.embedding_function,
)
prepared = _prepare_documents_for_chromadb(documents)
await collection.upsert(
ids=batch_ids,
documents=batch_texts,
metadatas=batch_metadatas, # type: ignore[arg-type]
)
for i in range(0, len(prepared.ids), batch_size):
batch_ids, batch_texts, batch_metadatas = _create_batch_slice(
prepared=prepared, start_index=i, batch_size=batch_size
)
await collection.upsert(
ids=batch_ids,
documents=batch_texts,
metadatas=batch_metadatas, # type: ignore[arg-type]
)
def search(
self, **kwargs: Unpack[ChromaDBCollectionSearchParams]
@@ -531,7 +558,10 @@ class ChromaDBClient(BaseClient):
)
collection_name = kwargs["collection_name"]
self.client.delete_collection(name=_sanitize_collection_name(collection_name))
with self._locked():
self.client.delete_collection(
name=_sanitize_collection_name(collection_name)
)
async def adelete_collection(self, **kwargs: Unpack[BaseCollectionParams]) -> None:
"""Delete a collection and all its data asynchronously.
@@ -561,9 +591,10 @@ class ChromaDBClient(BaseClient):
)
collection_name = kwargs["collection_name"]
await self.client.delete_collection(
name=_sanitize_collection_name(collection_name)
)
async with self._alocked():
await self.client.delete_collection(
name=_sanitize_collection_name(collection_name)
)
def reset(self) -> None:
"""Reset the vector database by deleting all collections and data.
@@ -586,7 +617,8 @@ class ChromaDBClient(BaseClient):
"Use areset() for AsyncClientAPI."
)
self.client.reset()
with self._locked():
self.client.reset()
async def areset(self) -> None:
"""Reset the vector database by deleting all collections and data asynchronously.
@@ -612,4 +644,5 @@ class ChromaDBClient(BaseClient):
"Use reset() for ClientAPI."
)
await self.client.reset()
async with self._alocked():
await self.client.reset()

View File

@@ -39,4 +39,5 @@ def create_client(config: ChromaDBConfig) -> ChromaDBClient:
default_limit=config.limit,
default_score_threshold=config.score_threshold,
default_batch_size=config.batch_size,
lock_name=f"chromadb:{persist_dir}",
)

View File

@@ -2,6 +2,7 @@ from __future__ import annotations
import asyncio
from concurrent.futures import Future
import contextvars
from copy import copy as shallow_copy
import datetime
from hashlib import md5
@@ -524,10 +525,11 @@ class Task(BaseModel):
) -> Future[TaskOutput]:
"""Execute the task asynchronously."""
future: Future[TaskOutput] = Future()
ctx = contextvars.copy_context()
threading.Thread(
daemon=True,
target=self._execute_task_async,
args=(agent, context, tools, future),
target=ctx.run,
args=(self._execute_task_async, agent, context, tools, future),
).start()
return future

View File

@@ -5,6 +5,7 @@ import asyncio
from collections.abc import Awaitable, Callable
from inspect import Parameter, signature
import json
import threading
from typing import (
Any,
Generic,
@@ -18,6 +19,7 @@ from pydantic import (
BaseModel as PydanticBaseModel,
ConfigDict,
Field,
PrivateAttr,
create_model,
field_validator,
)
@@ -94,6 +96,7 @@ class BaseTool(BaseModel, ABC):
default=0,
description="Current number of times this tool has been used.",
)
_usage_lock: threading.Lock = PrivateAttr(default_factory=threading.Lock)
@field_validator("args_schema", mode="before")
@classmethod
@@ -173,6 +176,25 @@ class BaseTool(BaseModel, ABC):
) from e
return kwargs
def _claim_usage(self) -> str | None:
"""Atomically check max usage and increment the counter.
Returns:
None if usage was claimed successfully, or an error message
string if the tool has reached its usage limit.
"""
with self._usage_lock:
if (
self.max_usage_count is not None
and self.current_usage_count >= self.max_usage_count
):
return (
f"Tool '{self.name}' has reached its usage limit of "
f"{self.max_usage_count} times and cannot be used anymore."
)
self.current_usage_count += 1
return None
def run(
self,
*args: Any,
@@ -181,13 +203,15 @@ class BaseTool(BaseModel, ABC):
if not args:
kwargs = self._validate_kwargs(kwargs)
limit_error = self._claim_usage()
if limit_error:
return limit_error
result = self._run(*args, **kwargs)
if asyncio.iscoroutine(result):
result = asyncio.run(result)
self.current_usage_count += 1
return result
async def arun(
@@ -206,9 +230,12 @@ class BaseTool(BaseModel, ABC):
"""
if not args:
kwargs = self._validate_kwargs(kwargs)
result = await self._arun(*args, **kwargs)
self.current_usage_count += 1
return result
limit_error = self._claim_usage()
if limit_error:
return limit_error
return await self._arun(*args, **kwargs)
async def _arun(
self,
@@ -361,12 +388,15 @@ class Tool(BaseTool, Generic[P, R]):
if not args:
kwargs = self._validate_kwargs(kwargs) # type: ignore[assignment]
limit_error = self._claim_usage()
if limit_error:
return limit_error # type: ignore[return-value]
result = self.func(*args, **kwargs)
if asyncio.iscoroutine(result):
result = asyncio.run(result)
self.current_usage_count += 1
return result # type: ignore[return-value]
def _run(self, *args: P.args, **kwargs: P.kwargs) -> R:
@@ -393,9 +423,12 @@ class Tool(BaseTool, Generic[P, R]):
"""
if not args:
kwargs = self._validate_kwargs(kwargs) # type: ignore[assignment]
result = await self._arun(*args, **kwargs)
self.current_usage_count += 1
return result
limit_error = self._claim_usage()
if limit_error:
return limit_error # type: ignore[return-value]
return await self._arun(*args, **kwargs)
async def _arun(self, *args: P.args, **kwargs: P.kwargs) -> R:
"""Executes the wrapped function asynchronously.

View File

@@ -7,6 +7,7 @@ concurrently by the executor.
import asyncio
from collections.abc import Callable
import contextvars
from typing import Any
from crewai.tools import BaseTool
@@ -84,9 +85,10 @@ class MCPNativeTool(BaseTool):
import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as executor:
coro = self._run_async(**kwargs)
future = executor.submit(asyncio.run, coro)
future = executor.submit(ctx.run, asyncio.run, coro)
return future.result()
except RuntimeError:
return asyncio.run(self._run_async(**kwargs))

View File

@@ -74,9 +74,28 @@
"consolidation_user": "New content to consider storing:\n{new_content}\n\nExisting similar memories:\n{records_summary}\n\nReturn the consolidation plan as structured output."
},
"reasoning": {
"initial_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are creating a strategic plan for a task that requires your expertise and unique perspective.",
"refine_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are refining a strategic plan for a task that requires your expertise and unique perspective.",
"create_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou have been assigned the following task:\n{description}\n\nExpected output:\n{expected_output}\n\nAvailable tools: {tools}\n\nBefore executing this task, create a detailed plan that leverages your expertise as {role} and outlines:\n1. Your understanding of the task from your professional perspective\n2. The key steps you'll take to complete it, drawing on your background and skills\n3. How you'll approach any challenges that might arise, considering your expertise\n4. How you'll strategically use the available tools based on your experience, exactly what tools to use and how to use them\n5. The expected outcome and how it aligns with your goal\n\nAfter creating your plan, assess whether you feel ready to execute the task or if you could do better.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
"refine_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou created the following plan for this task:\n{current_plan}\n\nHowever, you indicated that you're not ready to execute the task yet.\n\nPlease refine your plan further, drawing on your expertise as {role} to address any gaps or uncertainties. As you refine your plan, be specific about which available tools you will use, how you will use them, and why they are the best choices for each step. Clearly outline your tool usage strategy as part of your improved plan.\n\nAfter refining your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan further because [specific reason].\""
"initial_plan": "You are {role}. Create a focused execution plan using only the essential steps needed.",
"refine_plan": "You are {role}. Refine your plan to address the specific gap while keeping it minimal.",
"create_plan_prompt": "You are {role}.\n\nTask: {description}\n\nExpected output: {expected_output}\n\nAvailable tools: {tools}\n\nCreate a focused plan with ONLY the essential steps needed. Most tasks require just 2-5 steps. Do NOT pad with unnecessary steps like \"review\", \"verify\", \"document\", or \"finalize\" unless explicitly required.\n\nFor each step, specify the action and which tool to use (if any).\n\nConclude with:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
"refine_plan_prompt": "Your plan:\n{current_plan}\n\nYou indicated you're not ready. Address the specific gap while keeping the plan minimal.\n\nConclude with READY or NOT READY."
},
"planning": {
"system_prompt": "You are a strategic planning assistant. Create concrete, executable plans where every step produces a verifiable result.",
"create_plan_prompt": "Create an execution plan for the following task:\n\n## Task\n{description}\n\n## Expected Output\n{expected_output}\n\n## Available Tools\n{tools}\n\n## Planning Principles\nFocus on CONCRETE, EXECUTABLE steps. Each step must clearly state WHAT ACTION to take and HOW to verify it succeeded. The number of steps should match the task complexity. Hard limit: {max_steps} steps.\n\n## Rules:\n- Each step must have a clear DONE criterion\n- Do NOT group unrelated actions: if steps can fail independently, keep them separate\n- NO standalone \"thinking\" or \"planning\" steps — act, don't just observe\n- The last step must produce the required output\n\nAfter your plan, state READY or NOT READY.",
"refine_plan_prompt": "Your previous plan:\n{current_plan}\n\nYou indicated you weren't ready. Refine your plan to address the specific gap.\n\nKeep the plan minimal - only add steps that directly address the issue.\n\nConclude with READY or NOT READY as before.",
"observation_system_prompt": "You are a Planning Agent observing execution progress. After each step completes, you analyze what happened and decide whether the remaining plan is still valid.\n\nReason step-by-step about:\n1. Did this step produce a concrete, verifiable result? (file created, command succeeded, service running, etc.) — or did it only explore without acting?\n2. What new information was learned from this step's result?\n3. Whether the remaining steps still make sense given this new information\n4. What refinements, if any, are needed for upcoming steps\n5. Whether the overall goal has already been achieved\n\nCritical: mark `step_completed_successfully=false` if:\n- The step result is only exploratory (ls, pwd, cat) without producing the required artifact or action\n- A command returned a non-zero exit code and the error was not recovered\n- The step description required creating/building/starting something and the result shows it was not done\n\nBe conservative about triggering full replans — only do so when the remaining plan is fundamentally wrong, not just suboptimal.\n\nIMPORTANT: Set step_completed_successfully=false if:\n- The step's stated goal was NOT achieved (even if other things were done)\n- The first meaningful action returned an error (file not found, command not found, etc.)\n- The result is exploration/discovery output rather than the concrete action the step required\n- The step ran out of attempts without producing the required output\nSet needs_full_replan=true if the current plan's remaining steps reference paths or state that don't exist yet and need to be created first.",
"observation_user_prompt": "## Original task\n{task_description}\n\n## Expected output\n{task_goal}\n{completed_summary}\n\n## Just completed step {step_number}\nDescription: {step_description}\nResult: {step_result}\n{remaining_summary}\n\nAnalyze this step's result and provide your observation.",
"step_executor_system_prompt": "You are {role}. {backstory}\n\nYour goal: {goal}\n\nYou are executing ONE specific step in a larger plan. Your ONLY job is to fully complete this step — not to plan ahead.\n\nKey rules:\n- **ACT FIRST.** Execute the primary action of this step immediately. Do NOT read or explore files before attempting the main action unless exploration IS the step's goal.\n- If the step says 'run X', run X NOW. If it says 'write file Y', write Y NOW.\n- If the step requires producing an output file (e.g. /app/move.txt, report.jsonl, summary.csv), you MUST write that file using a tool call — do NOT just state the answer in text.\n- You may use tools MULTIPLE TIMES. After each tool use, check the result. If it failed, try a different approach.\n- Only output your Final Answer AFTER the concrete outcome is verified (file written, build succeeded, command exited 0).\n- If a command is not found or a path does not exist, fix it (different PATH, install missing deps, use absolute paths).\n- Do NOT spend more than 3 tool calls on exploration/analysis before attempting the primary action.{tools_section}",
"step_executor_tools_section": "\n\nAvailable tools: {tool_names}\n\nYou may call tools multiple times in sequence. Use this format for EACH tool call:\nThought: <what you observed and what you will try next>\nAction: <tool_name>\nAction Input: <input>\n\nAfter observing each result, decide: is the step complete? If yes:\nThought: The step is done because <evidence>\nFinal Answer: <concise summary of what was accomplished and the key result>",
"step_executor_user_prompt": "## Current Step\n{step_description}",
"step_executor_suggested_tool": "\nSuggested tool: {tool_to_use}",
"step_executor_context_header": "\n## Context from previous steps:",
"step_executor_context_entry": "Step {step_number} result: {result}",
"step_executor_complete_step": "\n**Execute the primary action of this step NOW.** If the step requires writing a file, write it. If it requires running a command, run it. Verify the outcome with a follow-up tool call, then give your Final Answer. Your Final Answer must confirm what was DONE (file created at path X, command succeeded), not just what should be done.",
"todo_system_prompt": "You are {role}. Your goal: {goal}\n\nYou are executing a specific step in a multi-step plan. Focus only on completing the current step. Use the suggested tool if one is provided. Be concise and provide clear results that can be used by subsequent steps.",
"synthesis_system_prompt": "You are {role}. You have completed a multi-step task. Synthesize the results from all steps into a single, coherent final response that directly addresses the original task. Do NOT list step numbers or say 'Step 1 result'. Produce a clean, polished answer as if you did it all at once.",
"synthesis_user_prompt": "## Original Task\n{task_description}\n\n## Results from each step\n{combined_steps}\n\nSynthesize these results into a single, coherent final answer.",
"replan_enhancement_prompt": "\n\nIMPORTANT: Previous execution attempt did not fully succeed. Please create a revised plan that accounts for the following context from the previous attempt:\n\n{previous_context}\n\nConsider:\n1. What steps succeeded and can be built upon\n2. What steps failed and why they might have failed\n3. Alternative approaches that might work better\n4. Whether dependencies need to be restructured",
"step_executor_task_context": "## Task Context\nThe following is the full task you are helping complete. Keep this in mind — especially any required output files, exact filenames, and expected formats.\n\n{task_context}\n\n---\n"
}
}
}

View File

@@ -3,6 +3,9 @@ from __future__ import annotations
import asyncio
from collections.abc import Callable, Sequence
import concurrent.futures
import contextvars
from dataclasses import dataclass, field
from datetime import datetime
import inspect
import json
import re
@@ -39,6 +42,7 @@ from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.agents.tools_handler import ToolsHandler
from crewai.experimental.agent_executor import AgentExecutor
from crewai.lite_agent import LiteAgent
from crewai.llm import LLM
@@ -210,6 +214,30 @@ def convert_tools_to_openai_schema(
return openai_tools, available_functions, tool_name_mapping
def extract_task_section(text: str) -> str:
"""Extract the ## Task body from a structured enriched instruction.
For structured descriptions (e.g. with ## Task and ## Instructions sections),
extracts just the task body so the caller sees the requirements without
duplicating tool/verification instructions.
Falls back to the full text (up to 2000 chars, with a truncation marker)
for plain inputs.
"""
for marker in ("\n## Task\n", "\n## Task:", "## Task\n"):
idx = text.find(marker)
if idx >= 0:
start = idx + len(marker)
for end_marker in ("\n---\n", "\n## "):
end = text.find(end_marker, start)
if end > 0:
return text[start:end].strip()
return text[start : start + 2000].strip()
if len(text) > 2000:
return text[:2000] + "\n... [truncated]"
return text
def has_reached_max_iterations(iterations: int, max_iterations: int) -> bool:
"""Check if the maximum number of iterations has been reached.
@@ -335,6 +363,66 @@ def enforce_rpm_limit(
request_within_rpm_limit()
def _prepare_llm_call(
executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
messages: list[LLMMessage],
printer: Printer,
verbose: bool = True,
) -> list[LLMMessage]:
"""Shared pre-call logic: run before hooks and resolve messages.
Args:
executor_context: Optional executor context for hook invocation.
messages: The messages to send to the LLM.
printer: Printer instance for output.
verbose: Whether to print output.
Returns:
The resolved messages list (may come from executor_context).
Raises:
ValueError: If a before hook blocks the call.
"""
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
return messages
def _validate_and_finalize_llm_response(
answer: Any,
executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
printer: Printer,
verbose: bool = True,
) -> str | BaseModel | Any:
"""Shared post-call logic: validate response and run after hooks.
Args:
answer: The raw LLM response.
executor_context: Optional executor context for hook invocation.
printer: Printer instance for output.
verbose: Whether to print output.
Returns:
The potentially modified response.
Raises:
ValueError: If the response is None or empty.
"""
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks(
executor_context, answer, printer, verbose=verbose
)
def get_llm_response(
llm: LLM | BaseLLM,
messages: list[LLMMessage],
@@ -371,11 +459,7 @@ def get_llm_response(
Exception: If an error occurs.
ValueError: If the response is None or empty.
"""
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
messages = _prepare_llm_call(executor_context, messages, printer, verbose=verbose)
try:
answer = llm.call(
@@ -389,16 +473,9 @@ def get_llm_response(
)
except Exception as e:
raise e
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks(
executor_context, answer, printer, verbose=verbose
return _validate_and_finalize_llm_response(
answer, executor_context, printer, verbose=verbose
)
@@ -428,6 +505,7 @@ async def aget_llm_response(
from_agent: Optional agent context for the LLM call.
response_model: Optional Pydantic model for structured outputs.
executor_context: Optional executor context for hook invocation.
verbose: Whether to print output.
Returns:
The response from the LLM as a string, Pydantic model (when response_model is provided),
@@ -437,10 +515,7 @@ async def aget_llm_response(
Exception: If an error occurs.
ValueError: If the response is None or empty.
"""
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
messages = _prepare_llm_call(executor_context, messages, printer, verbose=verbose)
try:
answer = await llm.acall(
@@ -454,16 +529,9 @@ async def aget_llm_response(
)
except Exception as e:
raise e
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks(
executor_context, answer, printer, verbose=verbose
return _validate_and_finalize_llm_response(
answer, executor_context, printer, verbose=verbose
)
@@ -907,8 +975,9 @@ def summarize_messages(
chunks=chunks, llm=llm, callbacks=callbacks, i18n=i18n
)
if is_inside_event_loop():
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
summarized_contents = pool.submit(asyncio.run, coro).result()
summarized_contents = pool.submit(ctx.run, asyncio.run, coro).result()
else:
summarized_contents = asyncio.run(coro)
@@ -1157,6 +1226,372 @@ def extract_tool_call_info(
return None
def is_tool_call_list(response: list[Any]) -> bool:
"""Check if a response from the LLM is a list of tool calls.
Supports OpenAI, Anthropic, Bedrock, and Gemini formats.
Args:
response: The response to check.
Returns:
True if the response appears to be a list of tool calls.
"""
if not response:
return False
first_item = response[0]
# OpenAI-style
if hasattr(first_item, "function") or (
isinstance(first_item, dict) and "function" in first_item
):
return True
# Anthropic-style (ToolUseBlock)
if hasattr(first_item, "type") and getattr(first_item, "type", None) == "tool_use":
return True
if hasattr(first_item, "name") and hasattr(first_item, "input"):
return True
# Bedrock-style
if isinstance(first_item, dict) and "name" in first_item and "input" in first_item:
return True
# Gemini-style
if hasattr(first_item, "function_call") and first_item.function_call:
return True
return False
def check_native_tool_support(llm: Any, original_tools: list[BaseTool] | None) -> bool:
"""Check if the LLM supports native function calling and tools are available.
Args:
llm: The LLM instance.
original_tools: Original BaseTool instances.
Returns:
True if native function calling is supported and tools exist.
"""
return (
hasattr(llm, "supports_function_calling")
and callable(getattr(llm, "supports_function_calling", None))
and llm.supports_function_calling()
and bool(original_tools)
)
def setup_native_tools(
original_tools: list[BaseTool],
) -> tuple[
list[dict[str, Any]],
dict[str, Callable[..., Any]],
dict[str, BaseTool | CrewStructuredTool],
]:
"""Convert tools to OpenAI schema format for native function calling.
Args:
original_tools: Original BaseTool instances.
Returns:
Tuple of (openai_tools_schema, available_functions_dict, tool_name_mapping).
"""
return convert_tools_to_openai_schema(original_tools)
def build_tool_calls_assistant_message(
tool_calls: list[Any],
) -> tuple[LLMMessage | None, list[dict[str, Any]]]:
"""Build an assistant message containing tool call reports.
Extracts info from each tool call, builds the standard assistant message
format, and preserves raw Gemini parts when applicable.
Args:
tool_calls: Raw tool call objects from the LLM response.
Returns:
Tuple of (assistant_message, tool_calls_to_report).
assistant_message is None if no valid tool calls found.
"""
tool_calls_to_report: list[dict[str, Any]] = []
for tool_call in tool_calls:
info = extract_tool_call_info(tool_call)
if not info:
continue
call_id, func_name, func_args = info
tool_calls_to_report.append(
{
"id": call_id,
"type": "function",
"function": {
"name": func_name,
"arguments": func_args
if isinstance(func_args, str)
else json.dumps(func_args),
},
}
)
if not tool_calls_to_report:
return None, []
assistant_message: LLMMessage = {
"role": "assistant",
"content": None,
"tool_calls": tool_calls_to_report,
}
# Preserve raw parts for Gemini compatibility
if all(type(tc).__qualname__ == "Part" for tc in tool_calls):
assistant_message["raw_tool_call_parts"] = list(tool_calls)
return assistant_message, tool_calls_to_report
@dataclass
class NativeToolCallResult:
"""Result from executing a single native tool call."""
call_id: str
func_name: str
result: str
from_cache: bool = False
result_as_answer: bool = False
tool_message: LLMMessage = field(default_factory=dict) # type: ignore[assignment]
def execute_single_native_tool_call(
tool_call: Any,
*,
available_functions: dict[str, Callable[..., Any]],
original_tools: list[BaseTool],
structured_tools: list[CrewStructuredTool] | None,
tools_handler: ToolsHandler | None,
agent: Agent | None,
task: Task | None,
crew: Any | None,
event_source: Any,
printer: Printer | None = None,
verbose: bool = False,
) -> NativeToolCallResult:
"""Execute a single native tool call with full lifecycle management.
Handles: arg parsing, tool lookup, max-usage check, cache read/write,
before/after hooks, event emission, and result_as_answer detection.
Args:
tool_call: Raw tool call object from the LLM.
available_functions: Map of sanitized tool name -> callable.
original_tools: Original BaseTool list (for cache_function, result_as_answer).
structured_tools: Structured tools list (for hook context).
tools_handler: Optional handler with cache.
agent: The agent instance.
task: The current task.
crew: The crew instance.
event_source: The object to use as event emitter source.
printer: Optional printer for verbose logging.
verbose: Whether to print verbose output.
Returns:
NativeToolCallResult with all execution details.
"""
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import (
ToolUsageErrorEvent,
ToolUsageFinishedEvent,
ToolUsageStartedEvent,
)
from crewai.hooks.tool_hooks import (
ToolCallHookContext,
get_after_tool_call_hooks,
get_before_tool_call_hooks,
)
info = extract_tool_call_info(tool_call)
if not info:
return NativeToolCallResult(
call_id="", func_name="", result="Unrecognized tool call format"
)
call_id, func_name, func_args = info
# Parse arguments
if isinstance(func_args, str):
try:
args_dict = json.loads(func_args)
except json.JSONDecodeError:
args_dict = {}
else:
args_dict = func_args
agent_key = getattr(agent, "key", "unknown") if agent else "unknown"
# Find original tool for cache_function and result_as_answer
original_tool: BaseTool | None = None
for tool in original_tools:
if sanitize_tool_name(tool.name) == func_name:
original_tool = tool
break
# Check cache
from_cache = False
input_str = json.dumps(args_dict) if args_dict else ""
result = "Tool not found"
if tools_handler and tools_handler.cache:
cached_result = tools_handler.cache.read(tool=func_name, input=input_str)
if cached_result is not None:
result = (
str(cached_result)
if not isinstance(cached_result, str)
else cached_result
)
from_cache = True
# Emit tool started event
started_at = datetime.now()
crewai_event_bus.emit(
event_source,
event=ToolUsageStartedEvent(
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
),
)
track_delegation_if_needed(func_name, args_dict, task)
# Find structured tool for hooks
structured_tool: CrewStructuredTool | None = None
for structured in structured_tools or []:
if sanitize_tool_name(structured.name) == func_name:
structured_tool = structured
break
# Before hooks
hook_blocked = False
before_hook_context = ToolCallHookContext(
tool_name=func_name,
tool_input=args_dict,
tool=structured_tool, # type: ignore[arg-type]
agent=agent,
task=task,
crew=crew,
)
try:
for hook in get_before_tool_call_hooks():
if hook(before_hook_context) is False:
hook_blocked = True
break
except Exception: # noqa: S110
pass
error_event_emitted = False
if hook_blocked:
result = f"Tool execution blocked by hook. Tool: {func_name}"
elif not from_cache:
if func_name in available_functions:
try:
tool_func = available_functions[func_name]
raw_result = tool_func(**args_dict)
# Cache result
if tools_handler and tools_handler.cache:
should_cache = True
if original_tool:
should_cache = original_tool.cache_function(
args_dict, raw_result
)
if should_cache:
tools_handler.cache.add(
tool=func_name, input=input_str, output=raw_result
)
result = (
str(raw_result) if not isinstance(raw_result, str) else raw_result
)
except Exception as e:
result = f"Error executing tool: {e}"
if task:
task.increment_tools_errors()
crewai_event_bus.emit(
event_source,
event=ToolUsageErrorEvent(
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
error=e,
),
)
error_event_emitted = True
# After hooks
after_hook_context = ToolCallHookContext(
tool_name=func_name,
tool_input=args_dict,
tool=structured_tool, # type: ignore[arg-type]
agent=agent,
task=task,
crew=crew,
tool_result=result,
)
try:
for after_hook in get_after_tool_call_hooks():
hook_result = after_hook(after_hook_context)
if hook_result is not None:
result = hook_result
after_hook_context.tool_result = result
except Exception: # noqa: S110
pass
# Emit tool finished event (only if error event wasn't already emitted)
if not error_event_emitted:
crewai_event_bus.emit(
event_source,
event=ToolUsageFinishedEvent(
output=result,
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
started_at=started_at,
finished_at=datetime.now(),
),
)
# Build tool result message
tool_message: LLMMessage = {
"role": "tool",
"tool_call_id": call_id,
"name": func_name,
"content": result,
}
if verbose and printer:
cache_info = " (from cache)" if from_cache else ""
printer.print(
content=f"Tool {func_name} executed with result{cache_info}: {result[:200]}...",
color="green",
)
# Check result_as_answer
is_result_as_answer = bool(
original_tool
and hasattr(original_tool, "result_as_answer")
and original_tool.result_as_answer
)
return NativeToolCallResult(
call_id=call_id,
func_name=func_name,
result=result,
from_cache=from_cache,
result_as_answer=is_result_as_answer,
tool_message=tool_message,
)
def parse_tool_call_args(
func_args: dict[str, Any] | str,
func_name: str,

Some files were not shown because too many files have changed in this diff Show More