Merge branch 'main' into gl/ci/pr-checks-and-commitizen

Greyson LaLonde
2026-03-16 03:09:28 -04:00
committed by GitHub
203 changed files with 49984 additions and 14203 deletions

View File

@@ -59,6 +59,8 @@ jobs:
contents: read
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.release_tag || github.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v6
@@ -93,3 +95,72 @@ jobs:
echo "Some packages failed to publish"
exit 1
fi
- name: Build Slack payload
if: success()
id: slack
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
RELEASE_TAG: ${{ inputs.release_tag }}
run: |
payload=$(uv run python -c "
import json, re, subprocess, sys
with open('lib/crewai/src/crewai/__init__.py') as f:
m = re.search(r\"__version__\s*=\s*[\\\"']([^\\\"']+)\", f.read())
version = m.group(1) if m else 'unknown'
import os
tag = os.environ.get('RELEASE_TAG') or version
try:
r = subprocess.run(['gh','release','view',tag,'--json','body','-q','.body'],
capture_output=True, text=True, check=True)
body = r.stdout.strip()
except Exception:
body = ''
blocks = [
{'type':'section','text':{'type':'mrkdwn',
'text':f':rocket: \`crewai v{version}\` published to PyPI'}},
{'type':'section','text':{'type':'mrkdwn',
'text':f'<https://pypi.org/project/crewai/{version}/|View on PyPI> · <https://github.com/crewAIInc/crewAI/releases/tag/{tag}|Release notes>'}},
{'type':'divider'},
]
if body:
heading, items = '', []
for line in body.split('\n'):
line = line.strip()
if not line: continue
hm = re.match(r'^#{2,3}\s+(.*)', line)
if hm:
if heading and items:
skip = heading in ('What\\'s Changed','') or 'Contributors' in heading
if not skip:
txt = f'*{heading}*\n' + '\n'.join(f'• {i}' for i in items)
blocks.append({'type':'section','text':{'type':'mrkdwn','text':txt}})
heading, items = hm.group(1), []
elif line.startswith('- ') or line.startswith('* '):
items.append(re.sub(r'\*\*([^*]*)\*\*', r'*\1*', line[2:]))
if heading and items:
skip = heading in ('What\\'s Changed','') or 'Contributors' in heading
if not skip:
txt = f'*{heading}*\n' + '\n'.join(f'• {i}' for i in items)
blocks.append({'type':'section','text':{'type':'mrkdwn','text':txt}})
blocks.append({'type':'divider'})
blocks.append({'type':'section','text':{'type':'mrkdwn',
'text':f'\`\`\`uv add \"crewai[tools]=={version}\"\`\`\`'}})
print(json.dumps({'blocks':blocks}))
")
echo "payload=$payload" >> $GITHUB_OUTPUT
- name: Notify Slack
if: success()
uses: slackapi/slack-github-action@v2.1.0
with:
webhook: ${{ secrets.SLACK_WEBHOOK_URL }}
webhook-type: incoming-webhook
payload: ${{ steps.slack.outputs.payload }}
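
The inline script in the step above groups release-note bullets under their `##`/`###` headings and emits one Slack `mrkdwn` section per heading. A minimal standalone sketch of that grouping step follows; the helper name `notes_to_blocks` is hypothetical (the workflow keeps this logic inline), and the skip rules are copied from the script as shown.

```python
import re

# Headings that the workflow script leaves out of the Slack message.
SKIP_EXACT = {"What's Changed", ""}


def notes_to_blocks(body: str) -> list[dict]:
    """Group bullet items under ##/### headings into Slack mrkdwn sections."""
    blocks: list[dict] = []
    heading, items = "", []

    def flush() -> None:
        # Emit the pending heading only if it has items and is not skipped.
        if not (heading and items):
            return
        if heading in SKIP_EXACT or "Contributors" in heading:
            return
        text = f"*{heading}*\n" + "\n".join(f"• {item}" for item in items)
        blocks.append({"type": "section", "text": {"type": "mrkdwn", "text": text}})

    for raw in body.split("\n"):
        line = raw.strip()
        if not line:
            continue
        match = re.match(r"^#{2,3}\s+(.*)", line)
        if match:
            flush()
            heading, items = match.group(1), []
        elif line.startswith(("- ", "* ")):
            # Markdown **bold** becomes Slack *bold*.
            items.append(re.sub(r"\*\*([^*]*)\*\*", r"*\1*", line[2:]))
    flush()
    return blocks
```

Pulling the parsing into a function like this would make it testable outside the workflow; whether that is worth the extra file is a judgment call for the maintainers.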

View File

@@ -4,6 +4,49 @@ description: "Product updates, improvements, and bug fixes for CrewAI"
icon: "clock"
mode: "wide"
---
<Update label="Mar 14, 2026">
## v1.10.2rc2
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc2)
## What's Changed
### Bug Fixes
- Remove exclusive locks from read-only storage operations
### Documentation
- Update changelog and version for v1.10.2rc1
## Contributors
@greysonlalonde
</Update>
<Update label="Mar 13, 2026">
## v1.10.2rc1
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc1)
## What's Changed
### Features
- Add release command and trigger PyPI publish
### Bug Fixes
- Fix cross-process and thread-safe locking to unprotected I/O
- Propagate contextvars across all thread and executor boundaries
- Propagate ContextVars into async task threads
### Documentation
- Update changelog and version for v1.10.2a1
## Contributors
@danglies007, @greysonlalonde
</Update>
<Update label="Mar 11, 2026">
## v1.10.2a1

View File

@@ -4,6 +4,49 @@ description: "CrewAI의 제품 업데이트, 개선 사항 및 버그 수정"
icon: "clock"
mode: "wide"
---
<Update label="2026년 3월 14일">
## v1.10.2rc2
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc2)
## 변경 사항
### 버그 수정
- 읽기 전용 스토리지 작업에서 독점 잠금 제거
### 문서
- v1.10.2rc1에 대한 변경 로그 및 버전 업데이트
## 기여자
@greysonlalonde
</Update>
<Update label="2026년 3월 13일">
## v1.10.2rc1
[GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc1)
## 변경 사항
### 기능
- 릴리스 명령 추가 및 PyPI 게시 트리거
### 버그 수정
- 보호되지 않은 I/O에 대한 프로세스 간 및 스레드 안전 잠금 수정
- 모든 스레드 및 실행기 경계를 넘는 contextvars 전파
- async 작업 스레드로 ContextVars 전파
### 문서
- v1.10.2a1에 대한 변경 로그 및 버전 업데이트
## 기여자
@danglies007, @greysonlalonde
</Update>
<Update label="2026년 3월 11일">
## v1.10.2a1

View File

@@ -4,6 +4,49 @@ description: "Atualizações de produto, melhorias e correções do CrewAI"
icon: "clock"
mode: "wide"
---
<Update label="14 mar 2026">
## v1.10.2rc2
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc2)
## O que Mudou
### Correções de Bugs
- Remover bloqueios exclusivos de operações de armazenamento somente leitura
### Documentação
- Atualizar changelog e versão para v1.10.2rc1
## Contribuidores
@greysonlalonde
</Update>
<Update label="13 mar 2026">
## v1.10.2rc1
[Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.2rc1)
## O que Mudou
### Funcionalidades
- Adicionar comando de lançamento e acionar publicação no PyPI
### Correções de Bugs
- Corrigir bloqueio seguro entre processos e threads para I/O não protegido
- Propagar contextvars através de todos os limites de thread e executor
- Propagar ContextVars para threads de tarefas assíncronas
### Documentação
- Atualizar changelog e versão para v1.10.2a1
## Contribuidores
@danglies007, @greysonlalonde
</Update>
<Update label="14 mar 2026">
## v1.10.2a1

View File

@@ -152,4 +152,4 @@ __all__ = [
     "wrap_file_source",
 ]
-__version__ = "1.10.2a1"
+__version__ = "1.11.0rc1"

View File

@@ -11,7 +11,7 @@ dependencies = [
     "pytube~=15.0.0",
     "requests~=2.32.5",
     "docker~=7.1.0",
-    "crewai==1.10.2a1",
+    "crewai==1.11.0rc1",
     "tiktoken~=0.8.0",
     "beautifulsoup4~=4.13.4",
     "python-docx~=1.2.0",

View File

@@ -309,4 +309,4 @@ __all__ = [
     "ZapierActionTools",
 ]
-__version__ = "1.10.2a1"
+__version__ = "1.11.0rc1"

View File

@@ -1,7 +1,9 @@
 from collections.abc import Callable
+import os
 from pathlib import Path
 from typing import Any
+from crewai.utilities.lock_store import lock as store_lock
 from lancedb import (  # type: ignore[import-untyped]
     DBConnection as LanceDBConnection,
     connect as lancedb_connect,
@@ -33,10 +35,12 @@ class LanceDBAdapter(Adapter):
     _db: LanceDBConnection = PrivateAttr()
     _table: LanceDBTable = PrivateAttr()
+    _lock_name: str = PrivateAttr(default="")
     def model_post_init(self, __context: Any) -> None:
         self._db = lancedb_connect(self.uri)
         self._table = self._db.open_table(self.table_name)
+        self._lock_name = f"lancedb:{os.path.realpath(str(self.uri))}"
         super().model_post_init(__context)
@@ -56,4 +60,5 @@ class LanceDBAdapter(Adapter):
         *args: Any,
         **kwargs: Any,
     ) -> None:
-        self._table.add(*args, **kwargs)
+        with store_lock(self._lock_name):
+            self._table.add(*args, **kwargs)

View File

@@ -1,6 +1,9 @@
 from __future__ import annotations
+import asyncio
+import contextvars
 import logging
+import threading
 from typing import TYPE_CHECKING
@@ -18,6 +21,9 @@ class BrowserSessionManager:
     This class maintains separate browser sessions for different threads,
     enabling concurrent usage of browsers in multi-threaded environments.
     Browsers are created lazily only when needed by tools.
+    Uses per-key events to serialize creation for the same thread_id without
+    blocking unrelated callers or wasting resources on duplicate sessions.
     """
     def __init__(self, region: str = "us-west-2"):
@@ -27,8 +33,10 @@ class BrowserSessionManager:
             region: AWS region for browser client
         """
         self.region = region
+        self._lock = threading.Lock()
         self._async_sessions: dict[str, tuple[BrowserClient, AsyncBrowser]] = {}
         self._sync_sessions: dict[str, tuple[BrowserClient, SyncBrowser]] = {}
+        self._creating: dict[str, threading.Event] = {}
     async def get_async_browser(self, thread_id: str) -> AsyncBrowser:
         """Get or create an async browser for the specified thread.
@@ -39,10 +47,29 @@ class BrowserSessionManager:
         Returns:
             An async browser instance specific to the thread
         """
+        loop = asyncio.get_event_loop()
+        while True:
+            with self._lock:
-        if thread_id in self._async_sessions:
-            return self._async_sessions[thread_id][1]
+                if thread_id in self._async_sessions:
+                    return self._async_sessions[thread_id][1]
+                if thread_id not in self._creating:
+                    self._creating[thread_id] = threading.Event()
+                    break
+                event = self._creating[thread_id]
+            ctx = contextvars.copy_context()
+            await loop.run_in_executor(None, ctx.run, event.wait)
-        return await self._create_async_browser_session(thread_id)
+        try:
+            browser_client, browser = await self._create_async_browser_session(
+                thread_id
+            )
+            with self._lock:
+                self._async_sessions[thread_id] = (browser_client, browser)
+            return browser
+        finally:
+            with self._lock:
+                evt = self._creating.pop(thread_id)
+            evt.set()
     def get_sync_browser(self, thread_id: str) -> SyncBrowser:
         """Get or create a sync browser for the specified thread.
@@ -53,19 +80,33 @@ class BrowserSessionManager:
         Returns:
             A sync browser instance specific to the thread
         """
+        while True:
+            with self._lock:
-        if thread_id in self._sync_sessions:
-            return self._sync_sessions[thread_id][1]
+                if thread_id in self._sync_sessions:
+                    return self._sync_sessions[thread_id][1]
+                if thread_id not in self._creating:
+                    self._creating[thread_id] = threading.Event()
+                    break
+                event = self._creating[thread_id]
+            event.wait()
+        try:
-        return self._create_sync_browser_session(thread_id)
+            return self._create_sync_browser_session(thread_id)
+        finally:
+            with self._lock:
+                evt = self._creating.pop(thread_id)
+            evt.set()
-    async def _create_async_browser_session(self, thread_id: str) -> AsyncBrowser:
+    async def _create_async_browser_session(
+        self, thread_id: str
+    ) -> tuple[BrowserClient, AsyncBrowser]:
         """Create a new async browser session for the specified thread.
         Args:
             thread_id: Unique identifier for the thread
         Returns:
-            The newly created async browser instance
+            Tuple of (BrowserClient, AsyncBrowser).
         Raises:
             Exception: If browser session creation fails
@@ -75,10 +116,8 @@ class BrowserSessionManager:
         browser_client = BrowserClient(region=self.region)
         try:
-            # Start browser session
             browser_client.start()
-            # Get WebSocket connection info
             ws_url, headers = browser_client.generate_ws_headers()
             logger.info(
@@ -87,7 +126,6 @@ class BrowserSessionManager:
             from playwright.async_api import async_playwright
-            # Connect to browser using Playwright
             playwright = await async_playwright().start()
             browser = await playwright.chromium.connect_over_cdp(
                 endpoint_url=ws_url, headers=headers, timeout=30000
@@ -96,17 +134,13 @@ class BrowserSessionManager:
                 f"Successfully connected to async browser for thread {thread_id}"
             )
-            # Store session resources
-            self._async_sessions[thread_id] = (browser_client, browser)
-            return browser
+            return browser_client, browser
         except Exception as e:
             logger.error(
                 f"Failed to create async browser session for thread {thread_id}: {e}"
             )
-            # Clean up resources if session creation fails
             if browser_client:
                 try:
                     browser_client.stop()
@@ -132,10 +166,8 @@ class BrowserSessionManager:
         browser_client = BrowserClient(region=self.region)
         try:
-            # Start browser session
             browser_client.start()
-            # Get WebSocket connection info
             ws_url, headers = browser_client.generate_ws_headers()
             logger.info(
@@ -144,7 +176,6 @@ class BrowserSessionManager:
             from playwright.sync_api import sync_playwright
-            # Connect to browser using Playwright
             playwright = sync_playwright().start()
             browser = playwright.chromium.connect_over_cdp(
                 endpoint_url=ws_url, headers=headers, timeout=30000
@@ -153,7 +184,7 @@ class BrowserSessionManager:
                 f"Successfully connected to sync browser for thread {thread_id}"
             )
-            # Store session resources
-            self._sync_sessions[thread_id] = (browser_client, browser)
+            with self._lock:
+                self._sync_sessions[thread_id] = (browser_client, browser)
             return browser
@@ -163,7 +194,6 @@ class BrowserSessionManager:
                 f"Failed to create sync browser session for thread {thread_id}: {e}"
             )
-            # Clean up resources if session creation fails
             if browser_client:
                 try:
                     browser_client.stop()
@@ -178,13 +208,13 @@ class BrowserSessionManager:
         Args:
             thread_id: Unique identifier for the thread
         """
+        with self._lock:
-        if thread_id not in self._async_sessions:
-            logger.warning(f"No async browser session found for thread {thread_id}")
-            return
-        browser_client, browser = self._async_sessions[thread_id]
+            if thread_id not in self._async_sessions:
+                logger.warning(f"No async browser session found for thread {thread_id}")
+                return
+            browser_client, browser = self._async_sessions.pop(thread_id)
-        # Close browser
         if browser:
             try:
                 await browser.close()
@@ -193,7 +223,6 @@ class BrowserSessionManager:
                     f"Error closing async browser for thread {thread_id}: {e}"
                 )
-        # Stop browser client
         if browser_client:
             try:
                 browser_client.stop()
@@ -202,8 +231,6 @@ class BrowserSessionManager:
                     f"Error stopping browser client for thread {thread_id}: {e}"
                 )
-        # Remove session from dictionary
-        del self._async_sessions[thread_id]
         logger.info(f"Async browser session cleaned up for thread {thread_id}")
     def close_sync_browser(self, thread_id: str) -> None:
@@ -212,13 +239,13 @@ class BrowserSessionManager:
         Args:
             thread_id: Unique identifier for the thread
         """
+        with self._lock:
-        if thread_id not in self._sync_sessions:
-            logger.warning(f"No sync browser session found for thread {thread_id}")
-            return
-        browser_client, browser = self._sync_sessions[thread_id]
+            if thread_id not in self._sync_sessions:
+                logger.warning(f"No sync browser session found for thread {thread_id}")
+                return
+            browser_client, browser = self._sync_sessions.pop(thread_id)
-        # Close browser
         if browser:
             try:
                 browser.close()
@@ -227,7 +254,6 @@ class BrowserSessionManager:
                     f"Error closing sync browser for thread {thread_id}: {e}"
                 )
-        # Stop browser client
         if browser_client:
             try:
                 browser_client.stop()
@@ -236,19 +262,17 @@ class BrowserSessionManager:
                     f"Error stopping browser client for thread {thread_id}: {e}"
                 )
-        # Remove session from dictionary
-        del self._sync_sessions[thread_id]
         logger.info(f"Sync browser session cleaned up for thread {thread_id}")
     async def close_all_browsers(self) -> None:
         """Close all browser sessions."""
-        # Close all async browsers
-        async_thread_ids = list(self._async_sessions.keys())
+        with self._lock:
+            async_thread_ids = list(self._async_sessions.keys())
+            sync_thread_ids = list(self._sync_sessions.keys())
         for thread_id in async_thread_ids:
             await self.close_async_browser(thread_id)
-        # Close all sync browsers
-        sync_thread_ids = list(self._sync_sessions.keys())
         for thread_id in sync_thread_ids:
             self.close_sync_browser(thread_id)
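
The per-key event pattern used in `get_sync_browser` can be read in isolation: the first caller for a key registers an event and creates the value, later callers for the same key wait on that event and then re-check the cache, and callers for other keys are never blocked while a value is being built. A minimal synchronous sketch, assuming a hypothetical generic class `KeyedOnceCache` and an arbitrary `factory` callable in place of the browser session start-up:

```python
import threading
from collections.abc import Callable
from typing import Generic, TypeVar

T = TypeVar("T")


class KeyedOnceCache(Generic[T]):
    """Create at most one value per key, even under concurrent callers."""

    def __init__(self, factory: Callable[[str], T]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._values: dict[str, T] = {}
        self._creating: dict[str, threading.Event] = {}

    def get(self, key: str) -> T:
        while True:
            with self._lock:
                if key in self._values:
                    return self._values[key]
                if key not in self._creating:
                    # This caller becomes the creator for `key`.
                    self._creating[key] = threading.Event()
                    break
                event = self._creating[key]
            # Another caller is creating the value: wait outside the lock,
            # then loop to re-check the cache.
            event.wait()
        try:
            value = self._factory(key)
            with self._lock:
                self._values[key] = value
            return value
        finally:
            # Wake waiters whether creation succeeded or failed; on failure
            # the next waiter finds no value and takes over as creator.
            with self._lock:
                done = self._creating.pop(key)
            done.set()
```

The lock is held only for dictionary bookkeeping, never across the slow factory call, which is what keeps unrelated keys from queuing behind each other.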

View File

@@ -1,9 +1,11 @@
 import logging
+import os
 from pathlib import Path
 from typing import Any
 from uuid import uuid4
 import chromadb
+from crewai.utilities.lock_store import lock as store_lock
 from pydantic import BaseModel, Field, PrivateAttr
 from crewai_tools.rag.base_loader import BaseLoader
@@ -38,11 +40,21 @@ class RAG(Adapter):
     _client: Any = PrivateAttr()
     _collection: Any = PrivateAttr()
     _embedding_service: EmbeddingService = PrivateAttr()
+    _lock_name: str = PrivateAttr(default="")
     def model_post_init(self, __context: Any) -> None:
         try:
+            self._lock_name = (
+                f"chromadb:{os.path.realpath(self.persist_directory)}"
+                if self.persist_directory
+                else "chromadb:ephemeral"
+            )
+            with store_lock(self._lock_name):
-            if self.persist_directory:
-                self._client = chromadb.PersistentClient(path=self.persist_directory)
-            else:
-                self._client = chromadb.Client()
+                if self.persist_directory:
+                    self._client = chromadb.PersistentClient(
+                        path=self.persist_directory
+                    )
+                else:
+                    self._client = chromadb.Client()
@@ -87,29 +99,8 @@ class RAG(Adapter):
         loader_result = loader.load(source_content)
         doc_id = loader_result.doc_id
-        existing_doc = self._collection.get(
-            where={"source": source_content.source_ref}, limit=1
-        )
-        existing_doc_id = (
-            existing_doc and existing_doc["metadatas"][0]["doc_id"]
-            if existing_doc["metadatas"]
-            else None
-        )
-        if existing_doc_id == doc_id:
-            logger.warning(
-                f"Document with source {loader_result.source} already exists"
-            )
-            return
-        # Document with same source ref does exists but the content has changed, deleting the oldest reference
-        if existing_doc_id and existing_doc_id != loader_result.doc_id:
-            logger.warning(f"Deleting old document with doc_id {existing_doc_id}")
-            self._collection.delete(where={"doc_id": existing_doc_id})
-        documents = []
         chunks = chunker.chunk(loader_result.content)
+        documents = []
         for i, chunk in enumerate(chunks):
             doc_metadata = (metadata or {}).copy()
             doc_metadata["chunk_index"] = i
@@ -136,7 +127,6 @@ class RAG(Adapter):
         ids = [doc.id for doc in documents]
         metadatas = []
         for doc in documents:
             doc_metadata = doc.metadata.copy()
             doc_metadata.update(
@@ -148,6 +138,26 @@ class RAG(Adapter):
             )
             metadatas.append(doc_metadata)
+        with store_lock(self._lock_name):
+            existing_doc = self._collection.get(
+                where={"source": source_content.source_ref}, limit=1
+            )
+            existing_doc_id = (
+                existing_doc and existing_doc["metadatas"][0]["doc_id"]
+                if existing_doc["metadatas"]
+                else None
+            )
+            if existing_doc_id == doc_id:
+                logger.warning(
+                    f"Document with source {loader_result.source} already exists"
+                )
+                return
+            if existing_doc_id and existing_doc_id != loader_result.doc_id:
+                logger.warning(f"Deleting old document with doc_id {existing_doc_id}")
+                self._collection.delete(where={"doc_id": existing_doc_id})
         try:
             self._collection.add(
                 ids=ids,
@@ -201,6 +211,7 @@ class RAG(Adapter):
     def delete_collection(self) -> None:
         try:
-            self._client.delete_collection(self.collection_name)
+            with store_lock(self._lock_name):
+                self._client.delete_collection(self.collection_name)
             logger.info(f"Deleted collection: {self.collection_name}")
         except Exception as e:

View File

@@ -1,4 +1,3 @@
-from datetime import datetime
 import json
 import os
 import time
@@ -10,8 +9,8 @@ from pydantic import BaseModel, Field
 from pydantic.types import StringConstraints
 import requests
-from crewai_tools.tools.brave_search_tool.schemas import WebSearchParams
 from crewai_tools.tools.brave_search_tool.base import _save_results_to_file
+from crewai_tools.tools.brave_search_tool.schemas import WebSearchParams
 load_dotenv()

View File

@@ -1,13 +1,27 @@
 # CodeInterpreterTool
 ## Description
-This tool is used to give the Agent the ability to run code (Python3) from the code generated by the Agent itself. The code is executed in a sandboxed environment, so it is safe to run any code.
+This tool is used to give the Agent the ability to run code (Python3) from the code generated by the Agent itself. The code is executed in a Docker container for secure isolation.
-It is incredible useful since it allows the Agent to generate code, run it in the same environment, get the result and use it to make decisions.
+It is incredibly useful since it allows the Agent to generate code, run it in an isolated environment, get the result and use it to make decisions.
+## ⚠️ Security Requirements
+**Docker is REQUIRED** for safe code execution. The tool will refuse to execute code without Docker to prevent security vulnerabilities.
+### Why Docker is Required
+Previous versions included a "restricted sandbox" fallback when Docker was unavailable. This has been **removed** due to critical security vulnerabilities:
+- The Python-based sandbox could be escaped via object introspection
+- Attackers could recover the original `__import__` function and access any module
+- This allowed arbitrary command execution on the host system
+**Docker provides real process isolation** and is the only secure way to execute untrusted code.
 ## Requirements
-- Docker
+- **Docker (REQUIRED)** - Install from [docker.com](https://docs.docker.com/get-docker/)
 ## Installation
 Install the crewai_tools package
@@ -17,7 +31,9 @@ pip install 'crewai[tools]'
 ## Example
-Remember that when using this tool, the code must be generated by the Agent itself. The code must be a Python3 code. And it will take some time for the first time to run because it needs to build the Docker image.
+Remember that when using this tool, the code must be generated by the Agent itself. The code must be Python3 code. It will take some time the first time to run because it needs to build the Docker image.
+### Basic Usage (Docker Container - Recommended)
 ```python
 from crewai_tools import CodeInterpreterTool
@@ -28,7 +44,9 @@ Agent(
 )
 ```
-Or if you need to pass your own Dockerfile just do this
+### Custom Dockerfile
+If you need to pass your own Dockerfile:
 ```python
 from crewai_tools import CodeInterpreterTool
@@ -39,15 +57,39 @@ Agent(
 )
 ```
-If it is difficult to connect to docker daemon automatically (especially for macOS users), you can do this to setup docker host manually
+### Manual Docker Host Configuration
+If it is difficult to connect to the Docker daemon automatically (especially for macOS users), you can set up the Docker host manually:
 ```python
 from crewai_tools import CodeInterpreterTool
 Agent(
     ...
-    tools=[CodeInterpreterTool(user_docker_base_url="<Docker Host Base Url>",
-                               user_dockerfile_path="<Dockerfile_path>")],
+    tools=[CodeInterpreterTool(
+        user_docker_base_url="<Docker Host Base Url>",
+        user_dockerfile_path="<Dockerfile_path>"
+    )],
 )
 ```
+### Unsafe Mode (NOT RECOMMENDED)
+If you absolutely cannot use Docker and **fully trust the code source**, you can use unsafe mode:
+```python
+from crewai_tools import CodeInterpreterTool
+# WARNING: Only use with fully trusted code!
+Agent(
+    ...
+    tools=[CodeInterpreterTool(unsafe_mode=True)],
+)
+```
+**⚠️ SECURITY WARNING:** `unsafe_mode=True` executes code directly on the host without any isolation. Only use this if:
+- You completely trust the code being executed
+- You understand the security risks
+- You cannot install Docker in your environment
+For production use, **always use Docker** (the default mode).
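
The README change describes a fail-closed policy: Docker when available, unsafe mode only when explicitly requested, and an error otherwise, with no sandbox fallback. A minimal sketch of that decision follows. The function `select_execution_mode` and its boolean parameters are hypothetical and exist only for illustration; in particular, the order in which the real tool checks `unsafe_mode` and Docker availability is an assumption here.

```python
def select_execution_mode(docker_available: bool, unsafe_mode: bool = False) -> str:
    """Pick an execution mode, failing closed when no safe option exists."""
    if unsafe_mode:
        # Explicit opt-in by the caller: runs on the host with no isolation.
        return "unsafe"
    if docker_available:
        return "docker"
    # No silent downgrade to a weaker sandbox: refuse to run instead.
    raise RuntimeError(
        "Docker is required for safe code execution but is not available."
    )
```

The point of the policy is the final branch: a missing dependency produces a visible error rather than a quieter, weaker isolation mode.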

View File

@@ -8,6 +8,7 @@ potentially unsafe operations and importing restricted modules.
 import importlib.util
 import os
 import subprocess
+import sys
 from types import ModuleType
 from typing import Any, ClassVar, TypedDict
@@ -50,11 +51,16 @@ class CodeInterpreterSchema(BaseModel):
 class SandboxPython:
-    """A restricted Python execution environment for running code safely.
+    """INSECURE: A restricted Python execution environment with known vulnerabilities.
-    This class provides methods to safely execute Python code by restricting access to
-    potentially dangerous modules and built-in functions. It creates a sandboxed
-    environment where harmful operations are blocked.
+    WARNING: This class does NOT provide real security isolation and is vulnerable to
+    sandbox escape attacks via Python object introspection. Attackers can recover the
+    original __import__ function and bypass all restrictions.
+    DO NOT USE for untrusted code execution. Use Docker containers instead.
+    This class attempts to restrict access to dangerous modules and built-in functions
+    but provides no real security boundary against a motivated attacker.
     """
     BLOCKED_MODULES: ClassVar[set[str]] = {
@@ -299,8 +305,8 @@ class CodeInterpreterTool(BaseTool):
     def run_code_safety(self, code: str, libraries_used: list[str]) -> str:
         """Runs code in the safest available environment.
-        Attempts to run code in Docker if available, falls back to a restricted
-        sandbox if Docker is not available.
+        Requires Docker to be available for secure code execution. Fails closed
+        if Docker is not available to prevent sandbox escape vulnerabilities.
         Args:
             code: The Python code to execute as a string.
@@ -308,10 +314,24 @@ class CodeInterpreterTool(BaseTool):
         Returns:
             The output of the executed code as a string.
+        Raises:
+            RuntimeError: If Docker is not available, as the restricted sandbox
+                is vulnerable to escape attacks and should not be used
+                for untrusted code execution.
         """
         if self._check_docker_available():
             return self.run_code_in_docker(code, libraries_used)
-        return self.run_code_in_restricted_sandbox(code)
+        error_msg = (
+            "Docker is required for safe code execution but is not available. "
+            "The restricted sandbox fallback has been removed due to security vulnerabilities "
+            "that allow sandbox escape via Python object introspection. "
+            "Please install Docker (https://docs.docker.com/get-docker/) or use unsafe_mode=True "
+            "if you trust the code source and understand the security risks."
+        )
+        Printer.print(error_msg, color="bold_red")
+        raise RuntimeError(error_msg)
     def run_code_in_docker(self, code: str, libraries_used: list[str]) -> str:
         """Runs Python code in a Docker container for safe isolation.
@@ -342,10 +362,19 @@ class CodeInterpreterTool(BaseTool):
     @staticmethod
     def run_code_in_restricted_sandbox(code: str) -> str:
-        """Runs Python code in a restricted sandbox environment.
+        """DEPRECATED AND INSECURE: Runs Python code in a restricted sandbox environment.
-        Executes the code with restricted access to potentially dangerous modules and
-        built-in functions for basic safety when Docker is not available.
+        WARNING: This method is vulnerable to sandbox escape attacks via Python object
+        introspection and should NOT be used for untrusted code execution. It has been
+        deprecated and is only kept for backward compatibility with trusted code.
+        The "restricted" environment can be bypassed by attackers who can:
+        - Use object graph introspection to recover the original __import__ function
+        - Access any Python module including os, subprocess, sys, etc.
+        - Execute arbitrary commands on the host system
+        Use run_code_in_docker() for secure code execution, or run_code_unsafe()
+        if you explicitly acknowledge the security risks.
         Args:
             code: The Python code to execute as a string.
@@ -354,7 +383,10 @@ class CodeInterpreterTool(BaseTool):
             The value of the 'result' variable from the executed code,
             or an error message if execution failed.
         """
-        Printer.print("Running code in restricted sandbox", color="yellow")
+        Printer.print(
+            "WARNING: Running code in INSECURE restricted sandbox (vulnerable to escape attacks)",
+            color="bold_red"
+        )
         exec_locals: dict[str, Any] = {}
         try:
             SandboxPython.exec(code=code, locals_=exec_locals)
@@ -380,7 +412,7 @@ class CodeInterpreterTool(BaseTool):
Printer.print("WARNING: Running code in unsafe mode", color="bold_magenta") Printer.print("WARNING: Running code in unsafe mode", color="bold_magenta")
# Install libraries on the host machine # Install libraries on the host machine
for library in libraries_used: for library in libraries_used:
- os.system(f"pip install {library}") # noqa: S605
+ subprocess.run([sys.executable, "-m", "pip", "install", library], check=False) # noqa: S603
# Execute the code # Execute the code
try: try:
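Reviewer note: the move from `os.system` to list-form `subprocess.run` matters because no shell ever parses the library name. The sketch below is not part of this commit. It assumes a hypothetical helper, `echo_argument`, and uses a child Python process instead of pip so it can run anywhere. It shows that a string containing shell metacharacters arrives as a single argument.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass `value` to a child Python process as one argv entry and return what it received.

    Hypothetical helper for illustration only; the commit itself calls pip.
    """
    completed = subprocess.run(  # noqa: S603
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    # Strip only the trailing newline added by print().
    return completed.stdout.rstrip("\r\n")


if __name__ == "__main__":
    hostile = "numpy; echo INJECTED"
    # The child sees the whole string as one argument; nothing after ';' is executed.
    print(echo_argument(hostile))
```

With `os.system`, the same string would be handed to the shell, which would treat `;` as a command separator.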

View File

@@ -30,9 +30,8 @@ class FileWriterTool(BaseTool):
def _run(self, **kwargs: Any) -> str: def _run(self, **kwargs: Any) -> str:
try: try:
- # Create the directory if it doesn't exist
- if kwargs.get("directory") and not os.path.exists(kwargs["directory"]):
- os.makedirs(kwargs["directory"])
+ if kwargs.get("directory"):
+ os.makedirs(kwargs["directory"], exist_ok=True)
# Construct the full path # Construct the full path
filepath = os.path.join(kwargs.get("directory") or "", kwargs["filename"]) filepath = os.path.join(kwargs.get("directory") or "", kwargs["filename"])

View File

@@ -99,8 +99,8 @@ class FileCompressorTool(BaseTool):
def _prepare_output(output_path: str, overwrite: bool) -> bool: def _prepare_output(output_path: str, overwrite: bool) -> bool:
"""Ensures output path is ready for writing.""" """Ensures output path is ready for writing."""
output_dir = os.path.dirname(output_path) output_dir = os.path.dirname(output_path)
- if output_dir and not os.path.exists(output_dir):
- os.makedirs(output_dir)
+ if output_dir:
+ os.makedirs(output_dir, exist_ok=True)
if os.path.exists(output_path) and not overwrite: if os.path.exists(output_path) and not overwrite:
return False return False
return True return True
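Reviewer note: replacing the `os.path.exists` check with `exist_ok=True` removes the window in which another process could create the directory between the check and the `makedirs` call. A minimal sketch of the idempotent behavior, assuming a hypothetical helper name `ensure_directory` that does not appear in the commit:

```python
import os
import tempfile


def ensure_directory(path: str) -> None:
    """Create `path` and any missing parents; do nothing if it already exists.

    Hypothetical helper for illustration; the commit inlines the makedirs call.
    """
    os.makedirs(path, exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "out", "nested")
        ensure_directory(target)
        # A second call is a no-op rather than a FileExistsError.
        ensure_directory(target)
        print(os.path.isdir(target))
```

Note that `exist_ok=True` still raises if the path exists but is a regular file, so that case is not silently masked.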

View File

@@ -18,7 +18,6 @@ class MergeAgentHandlerToolError(Exception):
"""Base exception for Merge Agent Handler tool errors.""" """Base exception for Merge Agent Handler tool errors."""
class MergeAgentHandlerTool(BaseTool): class MergeAgentHandlerTool(BaseTool):
""" """
Wrapper for Merge Agent Handler tools. Wrapper for Merge Agent Handler tools.
@@ -174,7 +173,7 @@ class MergeAgentHandlerTool(BaseTool):
>>> tool = MergeAgentHandlerTool.from_tool_name( >>> tool = MergeAgentHandlerTool.from_tool_name(
... tool_name="linear__create_issue", ... tool_name="linear__create_issue",
... tool_pack_id="134e0111-0f67-44f6-98f0-597000290bb3", ... tool_pack_id="134e0111-0f67-44f6-98f0-597000290bb3",
- ... registered_user_id="91b2b905-e866-40c8-8be2-efe53827a0aa"
+ ... registered_user_id="91b2b905-e866-40c8-8be2-efe53827a0aa",
... ) ... )
""" """
# Create an empty args schema model (proper BaseModel subclass) # Create an empty args schema model (proper BaseModel subclass)
@@ -210,7 +209,10 @@ class MergeAgentHandlerTool(BaseTool):
if "parameters" in tool_schema: if "parameters" in tool_schema:
try: try:
params = tool_schema["parameters"] params = tool_schema["parameters"]
- if params.get("type") == "object" and "properties" in params:
+ if (
+ params.get("type") == "object"
+ and "properties" in params
+ ):
# Build field definitions for Pydantic # Build field definitions for Pydantic
fields = {} fields = {}
properties = params["properties"] properties = params["properties"]
@@ -298,7 +300,7 @@ class MergeAgentHandlerTool(BaseTool):
>>> tools = MergeAgentHandlerTool.from_tool_pack( >>> tools = MergeAgentHandlerTool.from_tool_pack(
... tool_pack_id="134e0111-0f67-44f6-98f0-597000290bb3", ... tool_pack_id="134e0111-0f67-44f6-98f0-597000290bb3",
... registered_user_id="91b2b905-e866-40c8-8be2-efe53827a0aa", ... registered_user_id="91b2b905-e866-40c8-8be2-efe53827a0aa",
- ... tool_names=["linear__create_issue", "linear__get_issues"]
+ ... tool_names=["linear__create_issue", "linear__get_issues"],
... ) ... )
""" """
# Create a temporary instance to fetch the tool list # Create a temporary instance to fetch the tool list

View File

@@ -110,11 +110,13 @@ class QdrantVectorSearchTool(BaseTool):
self.custom_embedding_fn(query) self.custom_embedding_fn(query)
if self.custom_embedding_fn if self.custom_embedding_fn
else ( else (
- lambda: __import__("openai")
+ lambda: (
__import__("openai")
.Client(api_key=os.getenv("OPENAI_API_KEY")) .Client(api_key=os.getenv("OPENAI_API_KEY"))
.embeddings.create(input=[query], model="text-embedding-3-large") .embeddings.create(input=[query], model="text-embedding-3-large")
.data[0] .data[0]
.embedding .embedding
)
)() )()
) )
results = self.client.query_points( results = self.client.query_points(

View File

@@ -3,6 +3,7 @@ from __future__ import annotations
import asyncio import asyncio
from concurrent.futures import ThreadPoolExecutor from concurrent.futures import ThreadPoolExecutor
import logging import logging
import threading
from typing import TYPE_CHECKING, Any from typing import TYPE_CHECKING, Any
from crewai.tools.base_tool import BaseTool from crewai.tools.base_tool import BaseTool
@@ -33,6 +34,7 @@ logger = logging.getLogger(__name__)
# Cache for query results # Cache for query results
_query_cache: dict[str, list[dict[str, Any]]] = {} _query_cache: dict[str, list[dict[str, Any]]] = {}
_cache_lock = threading.Lock()
class SnowflakeConfig(BaseModel): class SnowflakeConfig(BaseModel):
@@ -102,7 +104,7 @@ class SnowflakeSearchTool(BaseTool):
) )
_connection_pool: list[SnowflakeConnection] | None = None _connection_pool: list[SnowflakeConnection] | None = None
- _pool_lock: asyncio.Lock | None = None
+ _pool_lock: threading.Lock | None = None
_thread_pool: ThreadPoolExecutor | None = None _thread_pool: ThreadPoolExecutor | None = None
_model_rebuilt: bool = False _model_rebuilt: bool = False
package_dependencies: list[str] = Field( package_dependencies: list[str] = Field(
@@ -122,7 +124,7 @@ class SnowflakeSearchTool(BaseTool):
try: try:
if SNOWFLAKE_AVAILABLE: if SNOWFLAKE_AVAILABLE:
self._connection_pool = [] self._connection_pool = []
- self._pool_lock = asyncio.Lock()
+ self._pool_lock = threading.Lock()
self._thread_pool = ThreadPoolExecutor(max_workers=self.pool_size) self._thread_pool = ThreadPoolExecutor(max_workers=self.pool_size)
else: else:
raise ImportError raise ImportError
@@ -147,7 +149,7 @@ class SnowflakeSearchTool(BaseTool):
) )
self._connection_pool = [] self._connection_pool = []
- self._pool_lock = asyncio.Lock()
+ self._pool_lock = threading.Lock()
self._thread_pool = ThreadPoolExecutor(max_workers=self.pool_size) self._thread_pool = ThreadPoolExecutor(max_workers=self.pool_size)
except subprocess.CalledProcessError as e: except subprocess.CalledProcessError as e:
raise ImportError("Failed to install Snowflake dependencies") from e raise ImportError("Failed to install Snowflake dependencies") from e
@@ -163,13 +165,12 @@ class SnowflakeSearchTool(BaseTool):
raise RuntimeError("Pool lock not initialized") raise RuntimeError("Pool lock not initialized")
if self._connection_pool is None: if self._connection_pool is None:
raise RuntimeError("Connection pool not initialized") raise RuntimeError("Connection pool not initialized")
- async with self._pool_lock:
- if not self._connection_pool:
- conn = await asyncio.get_event_loop().run_in_executor(
- self._thread_pool, self._create_connection
- )
- self._connection_pool.append(conn)
- return self._connection_pool.pop()
+ with self._pool_lock:
+ if self._connection_pool:
+ return self._connection_pool.pop()
+ return await asyncio.get_event_loop().run_in_executor(
+ self._thread_pool, self._create_connection
+ )
def _create_connection(self) -> SnowflakeConnection: def _create_connection(self) -> SnowflakeConnection:
"""Create a new Snowflake connection.""" """Create a new Snowflake connection."""
@@ -204,6 +205,7 @@ class SnowflakeSearchTool(BaseTool):
"""Execute a query with retries and return results.""" """Execute a query with retries and return results."""
if self.enable_caching: if self.enable_caching:
cache_key = self._get_cache_key(query, timeout) cache_key = self._get_cache_key(query, timeout)
with _cache_lock:
if cache_key in _query_cache: if cache_key in _query_cache:
logger.info("Returning cached result") logger.info("Returning cached result")
return _query_cache[cache_key] return _query_cache[cache_key]
@@ -225,6 +227,7 @@ class SnowflakeSearchTool(BaseTool):
] ]
if self.enable_caching: if self.enable_caching:
with _cache_lock:
_query_cache[self._get_cache_key(query, timeout)] = results _query_cache[self._get_cache_key(query, timeout)] = results
return results return results
@@ -234,7 +237,7 @@ class SnowflakeSearchTool(BaseTool):
self._pool_lock is not None self._pool_lock is not None
and self._connection_pool is not None and self._connection_pool is not None
): ):
- async with self._pool_lock:
+ with self._pool_lock:
self._connection_pool.append(conn) self._connection_pool.append(conn)
except (DatabaseError, OperationalError) as e: # noqa: PERF203 except (DatabaseError, OperationalError) as e: # noqa: PERF203
if attempt == self.max_retries - 1: if attempt == self.max_retries - 1:
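Reviewer note: the pool and the module-level cache can be touched from worker threads, and `asyncio.Lock` is documented as not thread-safe, which is presumably why this file switches to `threading.Lock`. The sketch below illustrates the pattern only. `TinyPool` and `exercise` are hypothetical names, integers stand in for connections, and creation happens under the lock for simplicity (the flattened diff does not show where the real code creates connections relative to the lock).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TinyPool:
    """Hypothetical stand-in for a connection pool guarded by a threading.Lock."""

    def __init__(self) -> None:
        self._items: list[int] = []
        self._lock = threading.Lock()
        self.created = 0

    def acquire(self) -> int:
        with self._lock:
            if self._items:
                # Reuse an idle item when one is available.
                return self._items.pop()
            # Otherwise "create" a new one (an integer id in this sketch).
            self.created += 1
            return self.created

    def release(self, item: int) -> None:
        with self._lock:
            self._items.append(item)

    def idle_count(self) -> int:
        with self._lock:
            return len(self._items)


def exercise(pool: TinyPool, rounds: int = 200) -> None:
    """Repeatedly borrow and return an item, as a query loop would."""
    for _ in range(rounds):
        item = pool.acquire()
        pool.release(item)


if __name__ == "__main__":
    pool = TinyPool()
    with ThreadPoolExecutor(max_workers=4) as executor:
        for _ in range(4):
            executor.submit(exercise, pool)
    # Every item that was created has been returned to the pool.
    print(pool.idle_count() == pool.created)
```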

View File

@@ -1,4 +1,5 @@
import asyncio import asyncio
import contextvars
import json import json
import os import os
import re import re
@@ -137,7 +138,9 @@ class StagehandTool(BaseTool):
- 'observe': For finding elements in a specific area - 'observe': For finding elements in a specific area
""" """
args_schema: type[BaseModel] = StagehandToolSchema args_schema: type[BaseModel] = StagehandToolSchema
- package_dependencies: list[str] = Field(default_factory=lambda: ["stagehand<=0.5.9"])
+ package_dependencies: list[str] = Field(
+ default_factory=lambda: ["stagehand<=0.5.9"]
+ )
env_vars: list[EnvVar] = Field( env_vars: list[EnvVar] = Field(
default_factory=lambda: [ default_factory=lambda: [
EnvVar( EnvVar(
@@ -620,9 +623,12 @@ class StagehandTool(BaseTool):
# We're in an existing event loop, use it # We're in an existing event loop, use it
import concurrent.futures import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as executor: with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit( future = executor.submit(
- asyncio.run, self._async_run(instruction, url, command_type)
+ ctx.run,
+ asyncio.run,
+ self._async_run(instruction, url, command_type),
) )
result = future.result() result = future.result()
else: else:
@@ -706,11 +712,12 @@ class StagehandTool(BaseTool):
if loop.is_running(): if loop.is_running():
import concurrent.futures import concurrent.futures
ctx = contextvars.copy_context()
with ( with (
concurrent.futures.ThreadPoolExecutor() as executor concurrent.futures.ThreadPoolExecutor() as executor
): ):
future = executor.submit( future = executor.submit(
- asyncio.run, self._async_close()
+ ctx.run, asyncio.run, self._async_close()
) )
future.result() future.result()
else: else:

View File

@@ -1,3 +1,4 @@
import sys
from unittest.mock import patch from unittest.mock import patch
from crewai_tools.tools.code_interpreter_tool.code_interpreter_tool import ( from crewai_tools.tools.code_interpreter_tool.code_interpreter_tool import (
@@ -76,24 +77,22 @@ print("This is line 2")"""
) )
- def test_restricted_sandbox_basic_code_execution(printer_mock, docker_unavailable_mock):
- """Test basic code execution."""
+ def test_docker_unavailable_raises_error(printer_mock, docker_unavailable_mock):
+ """Test that execution fails when Docker is unavailable in safe mode."""
tool = CodeInterpreterTool() tool = CodeInterpreterTool()
code = """ code = """
result = 2 + 2 result = 2 + 2
print(result) print(result)
""" """
- result = tool.run(code=code, libraries_used=[])
- printer_mock.assert_called_with(
- "Running code in restricted sandbox", color="yellow"
- )
- assert result == 4
+ with pytest.raises(RuntimeError) as exc_info:
+ tool.run(code=code, libraries_used=[])
+ assert "Docker is required for safe code execution" in str(exc_info.value)
+ assert "sandbox escape" in str(exc_info.value)
- def test_restricted_sandbox_running_with_blocked_modules(
- printer_mock, docker_unavailable_mock
- ):
- """Test that restricted modules cannot be imported."""
+ def test_restricted_sandbox_running_with_blocked_modules():
+ """Test that restricted modules cannot be imported when using the deprecated sandbox directly."""
tool = CodeInterpreterTool() tool = CodeInterpreterTool()
restricted_modules = SandboxPython.BLOCKED_MODULES restricted_modules = SandboxPython.BLOCKED_MODULES
@@ -102,18 +101,15 @@ def test_restricted_sandbox_running_with_blocked_modules(
import {module} import {module}
result = "Import succeeded" result = "Import succeeded"
""" """
- result = tool.run(code=code, libraries_used=[])
- printer_mock.assert_called_with(
- "Running code in restricted sandbox", color="yellow"
- )
+ # Note: run_code_in_restricted_sandbox is deprecated and insecure
+ # This test verifies the old behavior but should not be used in production
+ result = tool.run_code_in_restricted_sandbox(code)
assert f"An error occurred: Importing '{module}' is not allowed" in result assert f"An error occurred: Importing '{module}' is not allowed" in result
- def test_restricted_sandbox_running_with_blocked_builtins(
- printer_mock, docker_unavailable_mock
- ):
- """Test that restricted builtins are not available."""
+ def test_restricted_sandbox_running_with_blocked_builtins():
+ """Test that restricted builtins are not available when using the deprecated sandbox directly."""
tool = CodeInterpreterTool() tool = CodeInterpreterTool()
restricted_builtins = SandboxPython.UNSAFE_BUILTINS restricted_builtins = SandboxPython.UNSAFE_BUILTINS
@@ -122,25 +118,23 @@ def test_restricted_sandbox_running_with_blocked_builtins(
{builtin}("test") {builtin}("test")
result = "Builtin available" result = "Builtin available"
""" """
- result = tool.run(code=code, libraries_used=[])
- printer_mock.assert_called_with(
- "Running code in restricted sandbox", color="yellow"
- )
+ # Note: run_code_in_restricted_sandbox is deprecated and insecure
+ # This test verifies the old behavior but should not be used in production
+ result = tool.run_code_in_restricted_sandbox(code)
assert f"An error occurred: name '{builtin}' is not defined" in result assert f"An error occurred: name '{builtin}' is not defined" in result
def test_restricted_sandbox_running_with_no_result_variable( def test_restricted_sandbox_running_with_no_result_variable(
printer_mock, docker_unavailable_mock printer_mock, docker_unavailable_mock
): ):
- """Test behavior when no result variable is set."""
+ """Test behavior when no result variable is set in deprecated sandbox."""
tool = CodeInterpreterTool() tool = CodeInterpreterTool()
code = """ code = """
x = 10 x = 10
""" """
- result = tool.run(code=code, libraries_used=[])
- printer_mock.assert_called_with(
- "Running code in restricted sandbox", color="yellow"
- )
+ # Note: run_code_in_restricted_sandbox is deprecated and insecure
+ # This test verifies the old behavior but should not be used in production
+ result = tool.run_code_in_restricted_sandbox(code)
assert result == "No result variable found." assert result == "No result variable found."
@@ -159,6 +153,44 @@ x = 10
assert result == "No result variable found." assert result == "No result variable found."
@patch("crewai_tools.tools.code_interpreter_tool.code_interpreter_tool.subprocess.run")
def test_unsafe_mode_installs_libraries_without_shell(
subprocess_run_mock, printer_mock, docker_unavailable_mock
):
"""Test that library installation uses subprocess.run with shell=False, not os.system."""
tool = CodeInterpreterTool(unsafe_mode=True)
code = "result = 1"
libraries_used = ["numpy", "pandas"]
tool.run(code=code, libraries_used=libraries_used)
assert subprocess_run_mock.call_count == 2
for call, library in zip(subprocess_run_mock.call_args_list, libraries_used):
args, kwargs = call
# Must be list form (no shell expansion possible)
assert args[0] == [sys.executable, "-m", "pip", "install", library]
# shell= must not be True (defaults to False)
assert kwargs.get("shell", False) is False
@patch("crewai_tools.tools.code_interpreter_tool.code_interpreter_tool.subprocess.run")
def test_unsafe_mode_library_name_with_shell_metacharacters_does_not_invoke_shell(
subprocess_run_mock, printer_mock, docker_unavailable_mock
):
"""Test that a malicious library name cannot inject shell commands."""
tool = CodeInterpreterTool(unsafe_mode=True)
code = "result = 1"
malicious_library = "numpy; rm -rf /"
tool.run(code=code, libraries_used=[malicious_library])
subprocess_run_mock.assert_called_once()
args, kwargs = subprocess_run_mock.call_args
# The entire malicious string is passed as a single argument — no shell parsing
assert args[0] == [sys.executable, "-m", "pip", "install", malicious_library]
assert kwargs.get("shell", False) is False
def test_unsafe_mode_running_unsafe_code(printer_mock, docker_unavailable_mock): def test_unsafe_mode_running_unsafe_code(printer_mock, docker_unavailable_mock):
"""Test that unsafe mode executes code directly without the sandbox.""" """Test that unsafe mode executes code directly without the sandbox."""
tool = CodeInterpreterTool(unsafe_mode=True) tool = CodeInterpreterTool(unsafe_mode=True)
@@ -172,3 +204,50 @@ result = eval("5/1")
"WARNING: Running code in unsafe mode", color="bold_magenta" "WARNING: Running code in unsafe mode", color="bold_magenta"
) )
assert 5.0 == result assert 5.0 == result
@pytest.mark.xfail(
reason=(
"run_code_in_restricted_sandbox is known to be vulnerable to sandbox "
"escape via object introspection. This test encodes the desired secure "
"behavior (no escape possible) and will start passing once the "
"vulnerability is fixed or the function is removed."
)
)
def test_sandbox_escape_vulnerability_demonstration(printer_mock):
"""Demonstrate that the restricted sandbox is vulnerable to escape attacks.
This test shows that an attacker can use Python object introspection to bypass
the restricted sandbox and access blocked modules like 'os'. This is why the
sandbox should never be used for untrusted code execution.
NOTE: This test uses the deprecated run_code_in_restricted_sandbox directly
to demonstrate the vulnerability. In production, Docker is now required.
"""
tool = CodeInterpreterTool()
# Classic Python sandbox escape via object introspection
escape_code = """
# Recover the real __import__ function via object introspection
for cls in ().__class__.__bases__[0].__subclasses__():
if cls.__name__ == 'catch_warnings':
# Get the real builtins module
real_builtins = cls()._module.__builtins__
real_import = real_builtins['__import__']
# Now we can import os and execute commands
os = real_import('os')
# Demonstrate we have escaped the sandbox
result = "SANDBOX_ESCAPED" if hasattr(os, 'system') else "FAILED"
break
"""
# The deprecated sandbox is vulnerable to this attack
result = tool.run_code_in_restricted_sandbox(escape_code)
# Desired behavior: the restricted sandbox should prevent this escape.
# If this assertion fails, run_code_in_restricted_sandbox remains vulnerable.
assert result != "SANDBOX_ESCAPED", (
"The restricted sandbox was bypassed via object introspection. "
"This indicates run_code_in_restricted_sandbox is still vulnerable and "
"is why Docker is now required for safe code execution."
)

View File

@@ -53,7 +53,7 @@ Repository = "https://github.com/crewAIInc/crewAI"
[project.optional-dependencies] [project.optional-dependencies]
tools = [ tools = [
- "crewai-tools==1.10.2a1",
+ "crewai-tools==1.11.0rc1",
] ]
embeddings = [ embeddings = [
"tiktoken~=0.8.0" "tiktoken~=0.8.0"

View File

@@ -1,9 +1,11 @@
import contextvars
import threading import threading
from typing import Any from typing import Any
import urllib.request import urllib.request
import warnings import warnings
from crewai.agent.core import Agent from crewai.agent.core import Agent
from crewai.agent.planning_config import PlanningConfig
from crewai.crew import Crew from crewai.crew import Crew
from crewai.crews.crew_output import CrewOutput from crewai.crews.crew_output import CrewOutput
from crewai.flow.flow import Flow from crewai.flow.flow import Flow
@@ -40,7 +42,7 @@ def _suppress_pydantic_deprecation_warnings() -> None:
_suppress_pydantic_deprecation_warnings() _suppress_pydantic_deprecation_warnings()
- __version__ = "1.10.2a1"
+ __version__ = "1.11.0rc1"
_telemetry_submitted = False _telemetry_submitted = False
@@ -66,7 +68,8 @@ def _track_install() -> None:
def _track_install_async() -> None: def _track_install_async() -> None:
"""Track installation in background thread to avoid blocking imports.""" """Track installation in background thread to avoid blocking imports."""
if not Telemetry._is_telemetry_disabled(): if not Telemetry._is_telemetry_disabled():
- thread = threading.Thread(target=_track_install, daemon=True)
+ ctx = contextvars.copy_context()
+ thread = threading.Thread(target=ctx.run, args=(_track_install,), daemon=True)
thread.start() thread.start()
@@ -100,6 +103,7 @@ __all__ = [
"Knowledge", "Knowledge",
"LLMGuardrail", "LLMGuardrail",
"Memory", "Memory",
"PlanningConfig",
"Process", "Process",
"Task", "Task",
"TaskOutput", "TaskOutput",

View File

@@ -13,6 +13,7 @@ from crewai.a2a.auth.client_schemes import (
) )
from crewai.a2a.auth.server_schemes import ( from crewai.a2a.auth.server_schemes import (
AuthenticatedUser, AuthenticatedUser,
EnterpriseTokenAuth,
OIDCAuth, OIDCAuth,
ServerAuthScheme, ServerAuthScheme,
SimpleTokenAuth, SimpleTokenAuth,
@@ -25,6 +26,7 @@ __all__ = [
"AuthenticatedUser", "AuthenticatedUser",
"BearerTokenAuth", "BearerTokenAuth",
"ClientAuthScheme", "ClientAuthScheme",
"EnterpriseTokenAuth",
"HTTPBasicAuth", "HTTPBasicAuth",
"HTTPDigestAuth", "HTTPDigestAuth",
"OAuth2AuthorizationCode", "OAuth2AuthorizationCode",

View File

@@ -4,6 +4,7 @@ These schemes validate incoming requests to A2A server endpoints.
Supported authentication methods: Supported authentication methods:
- Simple token validation with static bearer tokens - Simple token validation with static bearer tokens
- Enterprise token validation (via PlusAPI)
- OpenID Connect with JWT validation using JWKS - OpenID Connect with JWT validation using JWKS
- OAuth2 with JWT validation or token introspection - OAuth2 with JWT validation or token introspection
""" """
@@ -16,6 +17,7 @@ import logging
import os import os
from typing import TYPE_CHECKING, Annotated, Any, ClassVar, Literal from typing import TYPE_CHECKING, Annotated, Any, ClassVar, Literal
import httpx
import jwt import jwt
from jwt import PyJWKClient from jwt import PyJWKClient
from pydantic import ( from pydantic import (
@@ -33,6 +35,7 @@ from typing_extensions import Self
if TYPE_CHECKING: if TYPE_CHECKING:
from a2a.types import OAuth2SecurityScheme from a2a.types import OAuth2SecurityScheme
from jwt.types import Options
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -183,6 +186,24 @@ class SimpleTokenAuth(ServerAuthScheme):
) )
class EnterpriseTokenAuth(ServerAuthScheme):
"""Enterprise token authentication.
Validates tokens via the PlusAPI enterprise verification endpoint.
"""
async def authenticate(self, token: str) -> AuthenticatedUser:
"""Authenticate using enterprise token verification.
Args:
token: The bearer token to authenticate.
Raises:
NotImplementedError
"""
raise NotImplementedError
class OIDCAuth(ServerAuthScheme): class OIDCAuth(ServerAuthScheme):
"""OpenID Connect authentication. """OpenID Connect authentication.
@@ -475,7 +496,7 @@ class OAuth2ServerAuth(ServerAuthScheme):
try: try:
signing_key = self._jwk_client.get_signing_key_from_jwt(token) signing_key = self._jwk_client.get_signing_key_from_jwt(token)
- decode_options: dict[str, Any] = {
+ decode_options: Options = {
"require": self.required_claims, "require": self.required_claims,
} }
@@ -556,7 +577,6 @@ class OAuth2ServerAuth(ServerAuthScheme):
async def _authenticate_introspection(self, token: str) -> AuthenticatedUser: async def _authenticate_introspection(self, token: str) -> AuthenticatedUser:
"""Authenticate using OAuth2 token introspection (RFC 7662).""" """Authenticate using OAuth2 token introspection (RFC 7662)."""
import httpx
if not self.introspection_url: if not self.introspection_url:
raise HTTPException( raise HTTPException(

View File

@@ -633,6 +633,10 @@ class A2AServerConfig(BaseModel):
default=False, default=False,
description="Whether agent provides extended card to authenticated users", description="Whether agent provides extended card to authenticated users",
) )
extended_skills: list[AgentSkill] = Field(
default_factory=list,
description="Additional skills visible only to authenticated users in the extended card",
)
url: Url | None = Field( url: Url | None = Field(
default=None, default=None,
description="Preferred endpoint URL for the agent. Set at runtime if not provided.", description="Preferred endpoint URL for the agent. Set at runtime if not provided.",

View File

@@ -63,6 +63,9 @@ class A2AErrorCode(IntEnum):
INVALID_AGENT_RESPONSE = -32006 INVALID_AGENT_RESPONSE = -32006
"""The agent produced an invalid response.""" """The agent produced an invalid response."""
AUTHENTICATED_EXTENDED_CARD_NOT_CONFIGURED = -32007
"""Authenticated extended card feature is not configured."""
# CrewAI Custom Extensions (-32768 to -32100) # CrewAI Custom Extensions (-32768 to -32100)
UNSUPPORTED_VERSION = -32009 UNSUPPORTED_VERSION = -32009
"""The requested A2A protocol version is not supported.""" """The requested A2A protocol version is not supported."""
@@ -108,6 +111,7 @@ ERROR_MESSAGES: dict[int, str] = {
A2AErrorCode.UNSUPPORTED_OPERATION: "This operation is not supported", A2AErrorCode.UNSUPPORTED_OPERATION: "This operation is not supported",
A2AErrorCode.CONTENT_TYPE_NOT_SUPPORTED: "Incompatible content types", A2AErrorCode.CONTENT_TYPE_NOT_SUPPORTED: "Incompatible content types",
A2AErrorCode.INVALID_AGENT_RESPONSE: "Invalid agent response", A2AErrorCode.INVALID_AGENT_RESPONSE: "Invalid agent response",
A2AErrorCode.AUTHENTICATED_EXTENDED_CARD_NOT_CONFIGURED: "Authenticated Extended Card is not configured",
A2AErrorCode.UNSUPPORTED_VERSION: "Unsupported A2A version", A2AErrorCode.UNSUPPORTED_VERSION: "Unsupported A2A version",
A2AErrorCode.UNSUPPORTED_EXTENSION: "Client does not support required extensions", A2AErrorCode.UNSUPPORTED_EXTENSION: "Client does not support required extensions",
A2AErrorCode.AUTHENTICATION_REQUIRED: "Authentication required", A2AErrorCode.AUTHENTICATION_REQUIRED: "Authentication required",
@@ -284,6 +288,15 @@ class InvalidAgentResponseError(A2AError):
code: int = field(default=A2AErrorCode.INVALID_AGENT_RESPONSE, init=False) code: int = field(default=A2AErrorCode.INVALID_AGENT_RESPONSE, init=False)
@dataclass
class AuthenticatedExtendedCardNotConfiguredError(A2AError):
"""Authenticated extended card is not configured."""
code: int = field(
default=A2AErrorCode.AUTHENTICATED_EXTENDED_CARD_NOT_CONFIGURED, init=False
)
@dataclass @dataclass
class UnsupportedVersionError(A2AError): class UnsupportedVersionError(A2AError):
"""The requested A2A version is not supported.""" """The requested A2A version is not supported."""

View File

@@ -5,6 +5,7 @@ from __future__ import annotations
import asyncio import asyncio
from collections.abc import MutableMapping from collections.abc import MutableMapping
import concurrent.futures import concurrent.futures
import contextvars
from functools import lru_cache from functools import lru_cache
import ssl import ssl
import time import time
@@ -147,8 +148,9 @@ def fetch_agent_card(
has_running_loop = False has_running_loop = False
if has_running_loop: if has_running_loop:
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool: with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
- return pool.submit(asyncio.run, coro).result()
+ return pool.submit(ctx.run, asyncio.run, coro).result()
return asyncio.run(coro) return asyncio.run(coro)
@@ -215,8 +217,9 @@ def _fetch_agent_card_cached(
has_running_loop = False has_running_loop = False
if has_running_loop: if has_running_loop:
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool: with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
- return pool.submit(asyncio.run, coro).result()
+ return pool.submit(ctx.run, asyncio.run, coro).result()
return asyncio.run(coro) return asyncio.run(coro)
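Reviewer note: the recurring `ctx = contextvars.copy_context()` plus `pool.submit(ctx.run, ...)` change exists because a worker thread normally starts with its own context, so `ContextVar` values set by the caller are not visible there. The sketch below is an illustration under assumed names (`request_id`, `read_request_id`, and `run_in_worker` are hypothetical and not from the codebase) showing the propagating form used throughout this commit.

```python
import asyncio
import concurrent.futures
import contextvars

# Hypothetical context variable standing in for whatever state the callers rely on.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="unset"
)


async def read_request_id() -> str:
    """Return the value of request_id as seen from inside the coroutine."""
    return request_id.get()


def run_in_worker(propagate: bool) -> str:
    """Run the coroutine on a worker thread, optionally carrying the caller's context."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        if propagate:
            # Snapshot the caller's context and run asyncio.run inside it.
            ctx = contextvars.copy_context()
            return pool.submit(ctx.run, asyncio.run, read_request_id()).result()
        # Without the snapshot, the worker thread uses its own context.
        return pool.submit(asyncio.run, read_request_id()).result()


if __name__ == "__main__":
    request_id.set("req-42")
    print(run_in_worker(propagate=True))
```

Because `copy_context()` takes a snapshot, changes the worker makes to context variables do not leak back into the caller.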

View File

@@ -7,6 +7,7 @@ import base64
from collections.abc import AsyncIterator, Callable, MutableMapping from collections.abc import AsyncIterator, Callable, MutableMapping
import concurrent.futures import concurrent.futures
from contextlib import asynccontextmanager from contextlib import asynccontextmanager
import contextvars
import logging import logging
from typing import TYPE_CHECKING, Any, Final, Literal from typing import TYPE_CHECKING, Any, Final, Literal
import uuid import uuid
@@ -229,8 +230,9 @@ def execute_a2a_delegation(
has_running_loop = False has_running_loop = False
if has_running_loop: if has_running_loop:
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool: with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
- return pool.submit(asyncio.run, coro).result()
+ return pool.submit(ctx.run, asyncio.run, coro).result()
return asyncio.run(coro) return asyncio.run(coro)

View File

@@ -8,6 +8,7 @@ from __future__ import annotations
import asyncio import asyncio
from collections.abc import Callable, Coroutine, Mapping from collections.abc import Callable, Coroutine, Mapping
from concurrent.futures import ThreadPoolExecutor, as_completed from concurrent.futures import ThreadPoolExecutor, as_completed
import contextvars
from functools import wraps from functools import wraps
import json import json
from types import MethodType from types import MethodType
@@ -278,7 +279,9 @@ def _fetch_agent_cards_concurrently(
max_workers = min(len(a2a_agents), 10) max_workers = min(len(a2a_agents), 10)
with ThreadPoolExecutor(max_workers=max_workers) as executor: with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = { futures = {
- executor.submit(_fetch_card_from_config, config): config
+ executor.submit(
+ contextvars.copy_context().run, _fetch_card_from_config, config
+ ): config
for config in a2a_agents for config in a2a_agents
} }
for future in as_completed(futures): for future in as_completed(futures):

View File

@@ -2,6 +2,7 @@ from __future__ import annotations
import asyncio import asyncio
from collections.abc import Callable, Coroutine, Sequence from collections.abc import Callable, Coroutine, Sequence
import contextvars
import shutil import shutil
import subprocess import subprocess
import time import time
@@ -22,6 +23,7 @@ from pydantic import (
) )
from typing_extensions import Self from typing_extensions import Self
from crewai.agent.planning_config import PlanningConfig
from crewai.agent.utils import ( from crewai.agent.utils import (
ahandle_knowledge_retrieval, ahandle_knowledge_retrieval,
apply_training_data, apply_training_data,
@@ -191,13 +193,23 @@ class Agent(BaseAgent):
default="safe", default="safe",
description="Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct execution).", description="Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct execution).",
) )
- reasoning: bool = Field(
+ planning_config: PlanningConfig | None = Field(
default=None,
description="Configuration for agent planning before task execution.",
)
planning: bool = Field(
default=False, default=False,
description="Whether the agent should reflect and create a plan before executing a task.", description="Whether the agent should reflect and create a plan before executing a task.",
) )
reasoning: bool = Field(
default=False,
description="[DEPRECATED: Use planning_config instead] Whether the agent should reflect and create a plan before executing a task.",
deprecated=True,
)
max_reasoning_attempts: int | None = Field( max_reasoning_attempts: int | None = Field(
default=None, default=None,
- description="Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
+ description="[DEPRECATED: Use planning_config.max_attempts instead] Maximum number of reasoning attempts before executing the task. If None, will try until ready.",
deprecated=True,
) )
embedder: EmbedderConfig | None = Field( embedder: EmbedderConfig | None = Field(
default=None, default=None,
@@ -264,8 +276,26 @@ class Agent(BaseAgent):
if self.allow_code_execution: if self.allow_code_execution:
self._validate_docker_installation() self._validate_docker_installation()
# Handle backward compatibility: convert reasoning=True to planning_config
if self.reasoning and self.planning_config is None:
import warnings
warnings.warn(
"The 'reasoning' parameter is deprecated. Use 'planning_config=PlanningConfig()' instead.",
DeprecationWarning,
stacklevel=2,
)
self.planning_config = PlanningConfig(
max_attempts=self.max_reasoning_attempts,
)
return self return self
@property
def planning_enabled(self) -> bool:
"""Check if planning is enabled for this agent."""
return self.planning_config is not None or self.planning
def _setup_agent_executor(self) -> None: def _setup_agent_executor(self) -> None:
if not self.cache_handler: if not self.cache_handler:
self.cache_handler = CacheHandler() self.cache_handler = CacheHandler()
@@ -334,7 +364,11 @@ class Agent(BaseAgent):
ValueError: If the max execution time is not a positive integer. ValueError: If the max execution time is not a positive integer.
RuntimeError: If the agent execution fails for other reasons. RuntimeError: If the agent execution fails for other reasons.
""" """
# Only call handle_reasoning for legacy CrewAgentExecutor
# For AgentExecutor, planning is handled in AgentExecutor.generate_plan()
if self.executor_class is not AgentExecutor:
handle_reasoning(self, task) handle_reasoning(self, task)
self._inject_date_to_task(task) self._inject_date_to_task(task)
if self.tools_handler: if self.tools_handler:
@@ -513,9 +547,13 @@ class Agent(BaseAgent):
""" """
import concurrent.futures import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as executor: with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit( future = executor.submit(
self._execute_without_timeout, task_prompt=task_prompt, task=task ctx.run,
self._execute_without_timeout,
task_prompt=task_prompt,
task=task,
) )
try: try:
@@ -572,7 +610,10 @@ class Agent(BaseAgent):
ValueError: If the max execution time is not a positive integer. ValueError: If the max execution time is not a positive integer.
RuntimeError: If the agent execution fails for other reasons. RuntimeError: If the agent execution fails for other reasons.
""" """
handle_reasoning(self, task) if self.executor_class is not AgentExecutor:
handle_reasoning(
self, task
) # we need this till CrewAgentExecutor migrates to AgentExecutor
self._inject_date_to_task(task) self._inject_date_to_task(task)
if self.tools_handler: if self.tools_handler:
@@ -1418,17 +1459,19 @@ class Agent(BaseAgent):
except Exception as e: except Exception as e:
self._logger.log("error", f"Failed to save kickoff result to memory: {e}") self._logger.log("error", f"Failed to save kickoff result to memory: {e}")
def _execute_and_build_output( def _build_output_from_result(
self, self,
result: dict[str, Any],
executor: AgentExecutor, executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None, response_format: type[Any] | None = None,
) -> LiteAgentOutput: ) -> LiteAgentOutput:
"""Execute the agent and build the output object. """Build a LiteAgentOutput from an executor result dict.
Shared logic used by both sync and async execution paths.
Args: Args:
result: The result dictionary from executor.invoke / invoke_async.
executor: The executor instance. executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format. response_format: Optional response format.
Returns: Returns:
@@ -1436,8 +1479,6 @@ class Agent(BaseAgent):
""" """
import json import json
# Execute the agent (this is called from sync path, so invoke returns dict)
result = cast(dict[str, Any], executor.invoke(inputs))
output = result.get("output", "") output = result.get("output", "")
# Handle response format conversion # Handle response format conversion
@@ -1485,91 +1526,39 @@ class Agent(BaseAgent):
else str(raw_output) else str(raw_output)
) )
todo_results = LiteAgentOutput.from_todo_items(executor.state.todos.items)
return LiteAgentOutput( return LiteAgentOutput(
raw=raw_str, raw=raw_str,
pydantic=formatted_result, pydantic=formatted_result,
agent_role=self.role, agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None, usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages, messages=list(executor.state.messages),
plan=executor.state.plan,
todos=todo_results,
replan_count=executor.state.replan_count,
last_replan_reason=executor.state.last_replan_reason,
) )
def _execute_and_build_output(
self,
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent synchronously and build the output object."""
result = cast(dict[str, Any], executor.invoke(inputs))
return self._build_output_from_result(result, executor, response_format)
async def _execute_and_build_output_async( async def _execute_and_build_output_async(
self, self,
executor: AgentExecutor, executor: AgentExecutor,
inputs: dict[str, str], inputs: dict[str, str],
response_format: type[Any] | None = None, response_format: type[Any] | None = None,
) -> LiteAgentOutput: ) -> LiteAgentOutput:
"""Execute the agent asynchronously and build the output object. """Execute the agent asynchronously and build the output object."""
This is the async version of _execute_and_build_output that uses
invoke_async() for native async execution within event loops.
Args:
executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format.
Returns:
LiteAgentOutput with raw output, formatted result, and metrics.
"""
import json
# Execute the agent asynchronously
result = await executor.invoke_async(inputs) result = await executor.invoke_async(inputs)
output = result.get("output", "") return self._build_output_from_result(result, executor, response_format)
# Handle response format conversion
formatted_result: BaseModel | None = None
raw_output: str
if isinstance(output, BaseModel):
formatted_result = output
raw_output = output.model_dump_json()
elif response_format:
raw_output = str(output) if not isinstance(output, str) else output
try:
model_schema = generate_model_description(response_format)
schema = json.dumps(model_schema, indent=2)
instructions = self.i18n.slice("formatted_task_instructions").format(
output_format=schema
)
converter = Converter(
llm=self.llm,
text=raw_output,
model=response_format,
instructions=instructions,
)
conversion_result = converter.to_pydantic()
if isinstance(conversion_result, BaseModel):
formatted_result = conversion_result
except ConverterError:
pass # Keep raw output if conversion fails
else:
raw_output = str(output) if not isinstance(output, str) else output
# Get token usage metrics
if isinstance(self.llm, BaseLLM):
usage_metrics = self.llm.get_token_usage_summary()
else:
usage_metrics = self._token_process.get_summary()
raw_str = (
raw_output
if isinstance(raw_output, str)
else raw_output.model_dump_json()
if isinstance(raw_output, BaseModel)
else str(raw_output)
)
return LiteAgentOutput(
raw=raw_str,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages,
)
def _process_kickoff_guardrail( def _process_kickoff_guardrail(
self, self,
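Note on the `contextvars.copy_context()` change in the timeout path above: a worker thread in a `ThreadPoolExecutor` does not normally see context variables set in the submitting thread, which is presumably why the submitted call is now wrapped in `ctx.run`. The following is a minimal standalone sketch of that pattern, not the crewAI code itself; `request_id`, `read_request_id`, and `run_in_worker` are illustrative names that do not appear in the diff.

```python
import concurrent.futures
import contextvars

# Hypothetical context variable standing in for any per-call state
# (tracing ids, event scopes, and so on).
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="unset"
)


def read_request_id() -> str:
    """Return whatever value the current context holds."""
    return request_id.get()


def run_in_worker(propagate: bool) -> str:
    """Run read_request_id in a worker thread, optionally inside a copied context."""
    ctx = contextvars.copy_context()  # snapshot of the caller's context
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        if propagate:
            # Same shape as the diff: submit ctx.run, then the real callable.
            future = executor.submit(ctx.run, read_request_id)
        else:
            future = executor.submit(read_request_id)
        return future.result()


request_id.set("req-42")
with_ctx = run_in_worker(propagate=True)      # sees the caller's value
without_ctx = run_in_worker(propagate=False)  # typically falls back to the default
```

`Context.run` forwards positional and keyword arguments to the callable, so passing `task_prompt=...` and `task=...` through `executor.submit(ctx.run, ...)` as the diff does is consistent with the standard library API.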


@@ -0,0 +1,138 @@
from __future__ import annotations
from typing import Literal
from pydantic import BaseModel, Field
from crewai.llms.base_llm import BaseLLM
class PlanningConfig(BaseModel):
"""Configuration for agent planning/reasoning before task execution.
This allows users to customize the planning behavior including prompts,
iteration limits, the LLM used for planning, and the reasoning effort
level that controls post-step observation and replanning behavior.
Note: To disable planning, omit planning_config and leave planning=False
on the Agent. Passing a PlanningConfig enables planning.
Attributes:
reasoning_effort: Controls observation and replanning after each step.
- "low": Observe each step (validates success), but skip the
decide/replan/refine pipeline. Steps are marked complete and
execution continues linearly. Fastest option.
- "medium": Observe each step. On failure, trigger replanning.
On success, skip refinement and continue. Balanced option.
- "high": Full observation pipeline — observe every step, then
route through decide_next_action which can trigger early goal
achievement, full replanning, or lightweight refinement.
Most adaptive but adds latency per step.
max_attempts: Maximum number of planning refinement attempts.
If None, will continue until the agent indicates readiness.
max_steps: Maximum number of steps in the generated plan.
system_prompt: Custom system prompt for planning. Uses default if None.
plan_prompt: Custom prompt for creating the initial plan.
refine_prompt: Custom prompt for refining the plan.
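max_replans: Maximum number of full replanning attempts before finalizing.
max_step_iterations: Maximum LLM iterations per step in the StepExecutor
multi-turn loop.
step_timeout: Maximum wall-clock seconds for a single step. None means
no per-step timeout.
llm: LLM to use for planning. Uses agent's LLM if None.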
Example:
```python
from crewai import Agent
from crewai.agent.planning_config import PlanningConfig
# Simple usage: default settings (reasoning_effort="medium")
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(),
)
# Balanced — replan only when steps fail
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(
reasoning_effort="medium",
),
)
# Full adaptive planning with refinement and replanning
agent = Agent(
role="Researcher",
goal="Research topics",
backstory="Expert researcher",
planning_config=PlanningConfig(
reasoning_effort="high",
max_attempts=3,
max_steps=10,
plan_prompt="Create a focused plan for: {description}",
llm="gpt-4o-mini", # Use cheaper model for planning
),
)
```
"""
reasoning_effort: Literal["low", "medium", "high"] = Field(
default="medium",
description=(
"Controls post-step observation and replanning behavior. "
"'low' observes steps but skips replanning/refinement (fastest). "
"'medium' observes and replans only on step failure (balanced). "
"'high' runs full observation pipeline with replanning, refinement, "
"and early goal detection (most adaptive, highest latency)."
),
)
max_attempts: int | None = Field(
default=None,
description=(
"Maximum number of planning refinement attempts. "
"If None, will continue until the agent indicates readiness."
),
)
max_steps: int = Field(
default=20,
description="Maximum number of steps in the generated plan.",
ge=1,
)
system_prompt: str | None = Field(
default=None,
description="Custom system prompt for planning. Uses default if None.",
)
plan_prompt: str | None = Field(
default=None,
description="Custom prompt for creating the initial plan.",
)
refine_prompt: str | None = Field(
default=None,
description="Custom prompt for refining the plan.",
)
max_replans: int = Field(
default=3,
description="Maximum number of full replanning attempts before finalizing.",
ge=0,
)
max_step_iterations: int = Field(
default=15,
description=(
"Maximum LLM iterations per step in the StepExecutor multi-turn loop. "
"Lower values make steps faster but less thorough."
),
ge=1,
)
step_timeout: int | None = Field(
default=None,
description=(
"Maximum wall-clock seconds for a single step execution. "
"If exceeded, the step is marked as failed and observation decides "
"whether to continue or replan. None means no per-step timeout."
),
)
llm: str | BaseLLM | None = Field(
default=None,
description="LLM to use for planning. Uses agent's LLM if None.",
)
model_config = {"arbitrary_types_allowed": True}
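The three `reasoning_effort` levels described in the docstring above can be read as a small routing table applied after each step is observed. The sketch below is only an interpretation of that docstring, not the routing code from this PR: the function name `route_after_observation`, its parameters, and the returned labels are all assumptions made for illustration, and the behavior of "high" on a failed step is not specified in the diff, so it is left to the flags.

```python
from typing import Literal

Effort = Literal["low", "medium", "high"]


def route_after_observation(
    effort: Effort,
    step_succeeded: bool,
    needs_full_replan: bool = False,
    goal_achieved: bool = False,
    has_refinements: bool = False,
) -> str:
    """Return a label for the next action (hypothetical labels, not crewAI's)."""
    if effort == "low":
        # Observe only: mark the step complete and keep going linearly.
        return "continue"
    if effort == "medium":
        # Replan only when the step failed; never refine.
        return "continue" if step_succeeded else "replan"
    # "high": full pipeline with early exit, replanning, or light refinement.
    if goal_achieved:
        return "finalize"
    if needs_full_replan:
        return "replan"
    if has_refinements:
        return "refine"
    return "continue"


print(route_after_observation("medium", step_succeeded=False))
print(route_after_observation("high", step_succeeded=True, has_refinements=True))
```

Under this reading, "low" still pays for one observation call per step but never changes the plan, which matches the "fastest option" wording in the docstring.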


@@ -28,13 +28,20 @@ if TYPE_CHECKING:
def handle_reasoning(agent: Agent, task: Task) -> None: def handle_reasoning(agent: Agent, task: Task) -> None:
"""Handle the reasoning process for an agent before task execution. """Handle the reasoning/planning process for an agent before task execution.
This function checks if planning is enabled for the agent and, if so,
creates a plan that gets appended to the task description.
Note: This function is used by CrewAgentExecutor (legacy path).
For AgentExecutor, planning is handled in AgentExecutor.generate_plan().
Args: Args:
agent: The agent performing the task. agent: The agent performing the task.
task: The task to execute. task: The task to execute.
""" """
if not agent.reasoning: # Check if planning is enabled using the planning_enabled property
if not getattr(agent, "planning_enabled", False):
return return
try: try:
@@ -43,13 +50,13 @@ def handle_reasoning(agent: Agent, task: Task) -> None:
AgentReasoningOutput, AgentReasoningOutput,
) )
reasoning_handler = AgentReasoning(task=task, agent=agent) planning_handler = AgentReasoning(agent=agent, task=task)
reasoning_output: AgentReasoningOutput = ( planning_output: AgentReasoningOutput = (
reasoning_handler.handle_agent_reasoning() planning_handler.handle_agent_reasoning()
) )
task.description += f"\n\nReasoning Plan:\n{reasoning_output.plan.plan}" task.description += f"\n\nPlanning:\n{planning_output.plan.plan}"
except Exception as e: except Exception as e:
agent._logger.log("error", f"Error during reasoning process: {e!s}") agent._logger.log("error", f"Error during planning: {e!s}")
def build_task_prompt_with_schema(task: Task, task_prompt: str, i18n: I18N) -> str: def build_task_prompt_with_schema(task: Task, task_prompt: str, i18n: I18N) -> str:
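`handle_reasoning` now keys off `planning_enabled`, which in turn depends on the backward-compatibility shim that maps the deprecated `reasoning=True` onto a `PlanningConfig`. The sketch below reproduces that interaction with plain dataclasses so it runs without crewAI or pydantic installed; `SketchAgent` and `SketchPlanningConfig` are stand-ins invented for this example, and only the field names are taken from the diff.

```python
from __future__ import annotations

import warnings
from dataclasses import dataclass


@dataclass
class SketchPlanningConfig:
    """Stand-in for PlanningConfig; only the field used by the shim."""

    max_attempts: int | None = None


@dataclass
class SketchAgent:
    """Stand-in for Agent with the legacy and new planning fields."""

    reasoning: bool = False
    planning: bool = False
    max_reasoning_attempts: int | None = None
    planning_config: SketchPlanningConfig | None = None

    def __post_init__(self) -> None:
        # Legacy path: reasoning=True without an explicit config.
        if self.reasoning and self.planning_config is None:
            warnings.warn(
                "The 'reasoning' parameter is deprecated.",
                DeprecationWarning,
                stacklevel=2,
            )
            self.planning_config = SketchPlanningConfig(
                max_attempts=self.max_reasoning_attempts
            )

    @property
    def planning_enabled(self) -> bool:
        return self.planning_config is not None or self.planning


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = SketchAgent(reasoning=True, max_reasoning_attempts=2)

plain = SketchAgent()
explicit = SketchAgent(planning_config=SketchPlanningConfig())
```

In this sketch an explicit `planning_config` takes precedence over the legacy flags, which mirrors the `self.planning_config is None` guard in the diff.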


@@ -895,7 +895,9 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
ToolUsageStartedEvent, ToolUsageStartedEvent,
) )
args_dict, parse_error = parse_tool_call_args(func_args, func_name, call_id, original_tool) args_dict, parse_error = parse_tool_call_args(
func_args, func_name, call_id, original_tool
)
if parse_error is not None: if parse_error is not None:
return parse_error return parse_error


@@ -0,0 +1,345 @@
"""PlannerObserver: Observation phase after each step execution.
Implements the "Observe" phase. After every step execution, the Planner
analyzes what happened, what new information was learned, and whether the
remaining plan is still valid.
This is NOT an error detector — it runs on every step, including successes,
to incorporate runtime observations into the remaining plan.
Refinements are structured (StepRefinement objects) and applied directly
from the observation result — no second LLM call required.
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Any
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.observation_events import (
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.utilities.agent_utils import extract_task_section
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.llm_utils import create_llm
from crewai.utilities.planning_types import StepObservation, TodoItem
from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.task import Task
logger = logging.getLogger(__name__)
class PlannerObserver:
"""Observes step execution results and decides on plan continuation.
After EVERY step execution, this class:
1. Analyzes what the step accomplished
2. Identifies new information learned
3. Decides if the remaining plan is still valid
4. Suggests lightweight refinements or triggers full replanning
LLM resolution (automatic fallback):
- If ``agent.planning_config.llm`` is explicitly set → use that
- Otherwise → fall back to ``agent.llm`` (same LLM for everything)
Args:
agent: The agent instance (for LLM resolution and config).
task: Optional task context (for description and expected output).
kickoff_input: Raw kickoff input, used as task context when no Task is given.
"""
def __init__(
self,
agent: Agent,
task: Task | None = None,
kickoff_input: str = "",
) -> None:
self.agent = agent
self.task = task
self.kickoff_input = kickoff_input
self.llm = self._resolve_llm()
self._i18n: I18N = get_i18n()
def _resolve_llm(self) -> Any:
"""Resolve which LLM to use for observation/planning.
Mirrors AgentReasoning._resolve_llm(): uses planning_config.llm
if explicitly set, otherwise falls back to agent.llm.
Returns:
The resolved LLM instance.
"""
from crewai.llm import LLM
config = getattr(self.agent, "planning_config", None)
if config is not None and config.llm is not None:
if isinstance(config.llm, LLM):
return config.llm
return create_llm(config.llm)
return self.agent.llm
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def observe(
self,
completed_step: TodoItem,
result: str,
all_completed: list[TodoItem],
remaining_todos: list[TodoItem],
) -> StepObservation:
"""Observe a step's result and decide on plan continuation.
This runs after EVERY step execution — not just failures.
Args:
completed_step: The todo item that was just executed.
result: The final result string from the step.
all_completed: All previously completed todos (for context).
remaining_todos: The pending todos still in the plan.
Returns:
StepObservation with the Planner's analysis. Any suggested
refinements are structured StepRefinement objects ready for
direct application — no second LLM call needed.
"""
agent_role = self.agent.role
crewai_event_bus.emit(
self.agent,
event=StepObservationStartedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
from_task=self.task,
from_agent=self.agent,
),
)
messages = self._build_observation_messages(
completed_step, result, all_completed, remaining_todos
)
try:
response = self.llm.call(
messages,
response_model=StepObservation,
from_task=self.task,
from_agent=self.agent,
)
observation = self._parse_observation_response(response)
refinement_summaries = (
[
f"Step {r.step_number}: {r.new_description}"
for r in observation.suggested_refinements
]
if observation.suggested_refinements
else None
)
crewai_event_bus.emit(
self.agent,
event=StepObservationCompletedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
step_completed_successfully=observation.step_completed_successfully,
key_information_learned=observation.key_information_learned,
remaining_plan_still_valid=observation.remaining_plan_still_valid,
needs_full_replan=observation.needs_full_replan,
replan_reason=observation.replan_reason,
goal_already_achieved=observation.goal_already_achieved,
suggested_refinements=refinement_summaries,
from_task=self.task,
from_agent=self.agent,
),
)
return observation
except Exception as e:
logger.warning(
f"Observation LLM call failed: {e}. Defaulting to continue without replanning."
)
crewai_event_bus.emit(
self.agent,
event=StepObservationFailedEvent(
agent_role=agent_role,
step_number=completed_step.step_number,
step_description=completed_step.description,
error=str(e),
from_task=self.task,
from_agent=self.agent,
),
)
# Don't force a full replan — the step may have succeeded even if the
# observer LLM failed to parse the result. Defaulting to "continue" is
# far less disruptive than wiping the entire plan on every observer error.
return StepObservation(
step_completed_successfully=True,
key_information_learned="",
remaining_plan_still_valid=True,
needs_full_replan=False,
)
def apply_refinements(
self,
observation: StepObservation,
remaining_todos: list[TodoItem],
) -> list[TodoItem]:
"""Apply structured refinements from the observation directly to todo descriptions.
No LLM call needed — refinements are already structured StepRefinement
objects produced by the observation call. This is a pure in-memory update.
Args:
observation: The observation containing structured refinements.
remaining_todos: The pending todos to update in-place.
Returns:
The same todo list with updated descriptions where refinements applied.
"""
if not observation.suggested_refinements:
return remaining_todos
todo_by_step: dict[int, TodoItem] = {t.step_number: t for t in remaining_todos}
for refinement in observation.suggested_refinements:
if refinement.step_number in todo_by_step and refinement.new_description:
todo_by_step[
refinement.step_number
].description = refinement.new_description
return remaining_todos
# ------------------------------------------------------------------
# Internal: Message building
# ------------------------------------------------------------------
def _build_observation_messages(
self,
completed_step: TodoItem,
result: str,
all_completed: list[TodoItem],
remaining_todos: list[TodoItem],
) -> list[LLMMessage]:
"""Build messages for the observation LLM call."""
task_desc = ""
task_goal = ""
if self.task:
task_desc = self.task.description or ""
task_goal = self.task.expected_output or ""
elif self.kickoff_input:
# Standalone kickoff path — no Task object, but we have the raw input.
# Extract just the ## Task section so the observer sees the actual goal,
# not the full enriched instruction with env/tools/verification noise.
task_desc = extract_task_section(self.kickoff_input)
task_goal = "Complete the task successfully"
system_prompt = self._i18n.retrieve("planning", "observation_system_prompt")
# Build context of what's been done
completed_summary = ""
if all_completed:
completed_lines = []
for todo in all_completed:
result_preview = (todo.result or "")[:200]
completed_lines.append(
f" Step {todo.step_number}: {todo.description}\n"
f" Result: {result_preview}"
)
completed_summary = "\n## Previously completed steps:\n" + "\n".join(
completed_lines
)
# Build remaining plan
remaining_summary = ""
if remaining_todos:
remaining_lines = [
f" Step {todo.step_number}: {todo.description}"
for todo in remaining_todos
]
remaining_summary = "\n## Remaining plan steps:\n" + "\n".join(
remaining_lines
)
user_prompt = self._i18n.retrieve("planning", "observation_user_prompt").format(
task_description=task_desc,
task_goal=task_goal,
completed_summary=completed_summary,
step_number=completed_step.step_number,
step_description=completed_step.description,
step_result=result,
remaining_summary=remaining_summary,
)
return [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt},
]
@staticmethod
def _parse_observation_response(response: Any) -> StepObservation:
"""Parse the LLM response into a StepObservation.
The LLM may return:
- A StepObservation instance directly (streaming + litellm path)
- A JSON string (non-streaming path serialises model_dump_json())
- A dict (some provider paths)
- Something else (unexpected)
We handle all cases to avoid silently falling back to a
hardcoded success default.
"""
if isinstance(response, StepObservation):
return response
# JSON string path — most common miss before this fix
if isinstance(response, str):
text = response.strip()
try:
return StepObservation.model_validate_json(text)
except Exception: # noqa: S110
pass
# Some LLMs wrap the JSON in markdown fences
if text.startswith("```"):
lines = text.split("\n")
# Strip first and last lines (``` markers)
inner = "\n".join(
lines[1:-1] if lines[-1].strip() == "```" else lines[1:]
)
try:
return StepObservation.model_validate_json(inner.strip())
except Exception: # noqa: S110
pass
# Dict path
if isinstance(response, dict):
try:
return StepObservation.model_validate(response)
except Exception: # noqa: S110
pass
# Last resort — log what we got so it's diagnosable
logger.warning(
"Could not parse observation response (type=%s). "
"Falling back to default failure observation. Preview: %.200s",
type(response).__name__,
str(response),
)
return StepObservation(
step_completed_successfully=False,
key_information_learned=str(response) if response else "",
remaining_plan_still_valid=False,
)
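The fence-stripping branch of `_parse_observation_response` is easy to get subtly wrong, so here is a standalone sketch of the same idea using only the standard library. It is an illustration rather than the PR's code: `extract_json_payload` is a hypothetical helper name, and `json.loads` stands in for `StepObservation.model_validate_json`.

```python
import json

# Built at runtime so this example contains no literal fence line of its own.
FENCE = "`" * 3


def extract_json_payload(text: str) -> str:
    """Return the body of a fenced block, or the stripped text if unfenced."""
    text = text.strip()
    if not text.startswith(FENCE):
        return text
    lines = text.split("\n")
    # Drop the opening fence line; drop the last line only if it is a closing fence.
    inner = lines[1:-1] if lines[-1].strip() == FENCE else lines[1:]
    return "\n".join(inner).strip()


wrapped = FENCE + 'json\n{"step_completed_successfully": true}\n' + FENCE
unclosed = FENCE + 'json\n{"step_completed_successfully": false}'

parsed_wrapped = json.loads(extract_json_payload(wrapped))
parsed_unclosed = json.loads(extract_json_payload(unclosed))
parsed_plain = json.loads(extract_json_payload(' {"needs_full_replan": false} '))
```

Handling the unclosed-fence case separately matters because a truncated LLM response may open a fence without ever closing it.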


@@ -0,0 +1,629 @@
"""StepExecutor: Isolated executor for a single plan step.
Implements step execution in the style of Plan-and-Act
(arxiv 2503.09572): the Executor receives one step description and
runs a bounded multi-turn loop (LLM call, tool execution, observation)
until the LLM returns a text answer or max_step_iterations is reached.
Recovery from failure (retry, replan) is the responsibility of
PlannerObserver and AgentExecutor, which keeps this class
single-purpose.
"""
from __future__ import annotations
from collections.abc import Callable
from datetime import datetime
import json
import time
from typing import TYPE_CHECKING, Any, cast
from pydantic import BaseModel
from crewai.agents.parser import AgentAction, AgentFinish
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import (
ToolUsageErrorEvent,
ToolUsageFinishedEvent,
ToolUsageStartedEvent,
)
from crewai.utilities.agent_utils import (
build_tool_calls_assistant_message,
check_native_tool_support,
enforce_rpm_limit,
execute_single_native_tool_call,
extract_task_section,
format_message_for_llm,
is_tool_call_list,
process_llm_response,
setup_native_tools,
)
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.planning_types import TodoItem
from crewai.utilities.printer import Printer
from crewai.utilities.step_execution_context import StepExecutionContext, StepResult
from crewai.utilities.string_utils import sanitize_tool_name
from crewai.utilities.tool_utils import execute_tool_and_check_finality
from crewai.utilities.types import LLMMessage
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.agents.tools_handler import ToolsHandler
from crewai.crew import Crew
from crewai.llms.base_llm import BaseLLM
from crewai.task import Task
from crewai.tools.base_tool import BaseTool
from crewai.tools.structured_tool import CrewStructuredTool
class StepExecutor:
"""Executes a SINGLE todo item using a bounded multi-turn action loop.
The StepExecutor owns its own message list per invocation. It never reads
or writes the AgentExecutor's state. Results flow back via StepResult.
Execution pattern (adapted from Plan-and-Act, arxiv 2503.09572):
1. Build messages from todo + context
2. Call the LLM (with or without native tools)
3. If tool call → execute it, then feed the result back to the LLM
4. If text answer → return it directly
The loop stops at max_step_iterations or step_timeout. Recovery from a
failed step is PlannerObserver's responsibility.
Args:
llm: The language model to use for execution.
tools: Structured tools available to the executor.
agent: The agent instance (for role/goal/verbose/config).
original_tools: Original BaseTool instances (needed for native tool schema).
tools_handler: Optional tools handler for caching and delegation tracking.
task: Optional task context.
crew: Optional crew context.
function_calling_llm: Optional separate LLM for function calling.
request_within_rpm_limit: Optional RPM limit function.
callbacks: Optional list of callbacks.
i18n: Optional i18n instance.
"""
def __init__(
self,
llm: BaseLLM,
tools: list[CrewStructuredTool],
agent: Agent,
original_tools: list[BaseTool] | None = None,
tools_handler: ToolsHandler | None = None,
task: Task | None = None,
crew: Crew | None = None,
function_calling_llm: BaseLLM | None = None,
request_within_rpm_limit: Callable[[], bool] | None = None,
callbacks: list[Any] | None = None,
i18n: I18N | None = None,
) -> None:
self.llm = llm
self.tools = tools
self.agent = agent
self.original_tools = original_tools or []
self.tools_handler = tools_handler
self.task = task
self.crew = crew
self.function_calling_llm = function_calling_llm
self.request_within_rpm_limit = request_within_rpm_limit
self.callbacks = callbacks or []
self._i18n: I18N = i18n or get_i18n()
self._printer: Printer = Printer()
# Native tool support — set up once
self._use_native_tools = check_native_tool_support(
self.llm, self.original_tools
)
self._openai_tools: list[dict[str, Any]] = []
self._available_functions: dict[str, Callable[..., Any]] = {}
if self._use_native_tools and self.original_tools:
(
self._openai_tools,
self._available_functions,
_,
) = setup_native_tools(self.original_tools)
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def execute(
self,
todo: TodoItem,
context: StepExecutionContext,
max_step_iterations: int = 15,
step_timeout: int | None = None,
) -> StepResult:
"""Execute a single todo item using a multi-turn action loop.
Enforces the RPM limit, builds a fresh message list, then iterates
LLM call → tool execution → observation until the LLM signals it is
done (text answer) or max_step_iterations is reached. Never touches
external AgentExecutor state.
Args:
todo: The todo item to execute.
context: Immutable context with task info and dependency results.
max_step_iterations: Maximum LLM iterations in the multi-turn loop.
step_timeout: Maximum wall-clock seconds for this step. None = no limit.
Returns:
StepResult with the outcome.
"""
start_time = time.monotonic()
tool_calls_made: list[str] = []
try:
enforce_rpm_limit(self.request_within_rpm_limit)
messages = self._build_isolated_messages(todo, context)
if self._use_native_tools:
result_text = self._execute_native(
messages,
tool_calls_made,
max_step_iterations=max_step_iterations,
step_timeout=step_timeout,
start_time=start_time,
)
else:
result_text = self._execute_text_parsed(
messages,
tool_calls_made,
max_step_iterations=max_step_iterations,
step_timeout=step_timeout,
start_time=start_time,
)
self._validate_expected_tool_usage(todo, tool_calls_made)
elapsed = time.monotonic() - start_time
return StepResult(
success=True,
result=result_text,
tool_calls_made=tool_calls_made,
execution_time=elapsed,
)
except Exception as e:
elapsed = time.monotonic() - start_time
return StepResult(
success=False,
result="",
error=str(e),
tool_calls_made=tool_calls_made,
execution_time=elapsed,
)
# ------------------------------------------------------------------
# Internal: Message building
# ------------------------------------------------------------------
def _build_isolated_messages(
self, todo: TodoItem, context: StepExecutionContext
) -> list[LLMMessage]:
"""Build a fresh message list for this step's execution.
System prompt tells the LLM it is an Executor focused on one step.
User prompt provides the step description, dependencies, and tools.
"""
system_prompt = self._build_system_prompt()
user_prompt = self._build_user_prompt(todo, context)
return [
format_message_for_llm(system_prompt, role="system"),
format_message_for_llm(user_prompt, role="user"),
]
def _build_system_prompt(self) -> str:
"""Build the Executor's system prompt."""
role = self.agent.role if self.agent else "Assistant"
goal = self.agent.goal if self.agent else "Complete tasks efficiently"
backstory = getattr(self.agent, "backstory", "") or ""
tools_section = ""
if self.tools and not self._use_native_tools:
tool_names = ", ".join(sanitize_tool_name(t.name) for t in self.tools)
tools_section = self._i18n.retrieve(
"planning", "step_executor_tools_section"
).format(tool_names=tool_names)
elif self.tools:
tool_names = ", ".join(sanitize_tool_name(t.name) for t in self.tools)
tools_section = f"\n\nAvailable tools: {tool_names}"
return self._i18n.retrieve("planning", "step_executor_system_prompt").format(
role=role,
backstory=backstory,
goal=goal,
tools_section=tools_section,
)

    def _build_user_prompt(self, todo: TodoItem, context: StepExecutionContext) -> str:
        """Build the user prompt for this specific step."""
        parts: list[str] = []

        # Include overall task context so the executor knows the full goal and
        # required output format/location; this is critical for knowing WHAT to
        # produce. We extract only the task body (not tool instructions or
        # verification sections) to avoid duplicating directives already in the
        # system prompt.
        if context.task_description:
            task_section = extract_task_section(context.task_description)
            if task_section:
                parts.append(
                    self._i18n.retrieve(
                        "planning", "step_executor_task_context"
                    ).format(
                        task_context=task_section,
                    )
                )

        parts.append(
            self._i18n.retrieve("planning", "step_executor_user_prompt").format(
                step_description=todo.description,
            )
        )

        if todo.tool_to_use:
            parts.append(
                self._i18n.retrieve("planning", "step_executor_suggested_tool").format(
                    tool_to_use=todo.tool_to_use,
                )
            )

        # Include dependency results (final results only, no traces)
        if context.dependency_results:
            parts.append(
                self._i18n.retrieve("planning", "step_executor_context_header")
            )
            for step_num, result in sorted(context.dependency_results.items()):
                parts.append(
                    self._i18n.retrieve(
                        "planning", "step_executor_context_entry"
                    ).format(step_number=step_num, result=result)
                )

        parts.append(self._i18n.retrieve("planning", "step_executor_complete_step"))

        return "\n".join(parts)

    # ------------------------------------------------------------------
    # Internal: Multi-turn execution loop
    # ------------------------------------------------------------------

    def _execute_text_parsed(
        self,
        messages: list[LLMMessage],
        tool_calls_made: list[str],
        max_step_iterations: int = 15,
        step_timeout: int | None = None,
        start_time: float | None = None,
    ) -> str:
        """Execute step using text-parsed tool calling with a multi-turn loop.

        Iterates LLM call → tool execution → observation until the LLM
        produces a Final Answer or max_step_iterations is reached.
        This allows the agent to run a command, see the output, adjust its
        approach, and run another command, all within a single plan step.
        """
        use_stop_words = self.llm.supports_stop_words() if self.llm else False
        last_tool_result = ""

        for _ in range(max_step_iterations):
            # Check step timeout
            if step_timeout and start_time:
                elapsed = time.monotonic() - start_time
                if elapsed >= step_timeout:
                    return last_tool_result or f"Step timed out after {elapsed:.0f}s"

            answer = self.llm.call(
                messages,
                callbacks=self.callbacks,
                from_task=self.task,
                from_agent=self.agent,
            )

            if not answer:
                raise ValueError("Empty response from LLM")

            answer_str = str(answer)
            formatted = process_llm_response(answer_str, use_stop_words)

            if isinstance(formatted, AgentFinish):
                return str(formatted.output)

            if isinstance(formatted, AgentAction):
                tool_calls_made.append(formatted.tool)
                tool_result = self._execute_text_tool_with_events(formatted)
                last_tool_result = tool_result
                # Append the assistant's reasoning + action, then the observation.
                # _build_observation_message handles vision sentinels so the LLM
                # receives an image content block instead of raw base64 text.
                messages.append({"role": "assistant", "content": answer_str})
                messages.append(self._build_observation_message(tool_result))
                continue

            # Raw text response with no Final Answer marker: treat as done
            return answer_str

        # Max iterations reached: return the last tool result we accumulated
        return last_tool_result

    def _execute_text_tool_with_events(self, formatted: AgentAction) -> str:
        """Execute text-parsed tool calls with tool usage events."""
        args_dict = self._parse_tool_args(formatted.tool_input)
        agent_key = getattr(self.agent, "key", "unknown") if self.agent else "unknown"
        started_at = datetime.now()
        crewai_event_bus.emit(
            self,
            event=ToolUsageStartedEvent(
                tool_name=formatted.tool,
                tool_args=args_dict,
                from_agent=self.agent,
                from_task=self.task,
                agent_key=agent_key,
            ),
        )
        try:
            fingerprint_context = {}
            if (
                self.agent
                and hasattr(self.agent, "security_config")
                and hasattr(self.agent.security_config, "fingerprint")
            ):
                fingerprint_context = {
                    "agent_fingerprint": str(self.agent.security_config.fingerprint)
                }

            tool_result = execute_tool_and_check_finality(
                agent_action=formatted,
                fingerprint_context=fingerprint_context,
                tools=self.tools,
                i18n=self._i18n,
                agent_key=self.agent.key if self.agent else None,
                agent_role=self.agent.role if self.agent else None,
                tools_handler=self.tools_handler,
                task=self.task,
                agent=self.agent,
                function_calling_llm=self.function_calling_llm,
                crew=self.crew,
            )
        except Exception as e:
            crewai_event_bus.emit(
                self,
                event=ToolUsageErrorEvent(
                    tool_name=formatted.tool,
                    tool_args=args_dict,
                    from_agent=self.agent,
                    from_task=self.task,
                    agent_key=agent_key,
                    error=e,
                ),
            )
            raise

        crewai_event_bus.emit(
            self,
            event=ToolUsageFinishedEvent(
                output=str(tool_result.result),
                tool_name=formatted.tool,
                tool_args=args_dict,
                from_agent=self.agent,
                from_task=self.task,
                agent_key=agent_key,
                started_at=started_at,
                finished_at=datetime.now(),
            ),
        )
        return str(tool_result.result)

    def _parse_tool_args(self, tool_input: Any) -> dict[str, Any]:
        """Parse tool args from the parser output into a dict payload for events."""
        if isinstance(tool_input, dict):
            return tool_input
        if isinstance(tool_input, str):
            stripped_input = tool_input.strip()
            if not stripped_input:
                return {}
            try:
                parsed = json.loads(stripped_input)
                if isinstance(parsed, dict):
                    return parsed
                return {"input": parsed}
            except json.JSONDecodeError:
                return {"input": stripped_input}
        return {"input": str(tool_input)}

    # ------------------------------------------------------------------
    # Internal: Vision support
    # ------------------------------------------------------------------

    @staticmethod
    def _parse_vision_sentinel(raw: str) -> tuple[str, str] | None:
        """Parse a VISION_IMAGE sentinel into (media_type, base64_data), or None."""
        prefix = "VISION_IMAGE:"
        if not raw.startswith(prefix):
            return None
        rest = raw[len(prefix) :]
        sep = rest.find(":")
        if sep <= 0:
            return None
        return rest[:sep], rest[sep + 1 :]

    @staticmethod
    def _build_observation_message(tool_result: str) -> LLMMessage:
        """Build an observation message, converting vision sentinels to image blocks.

        When a tool returns a VISION_IMAGE sentinel (e.g. from read_image),
        we build a multimodal content block so the LLM can actually *see*
        the image rather than receiving a wall of base64 text.

        Uses the standard image_url / data-URI format so each LLM provider's
        SDK (OpenAI, LiteLLM, etc.) handles the provider-specific conversion.

        Format: ``VISION_IMAGE:<media_type>:<base64_data>``
        """
        parsed = StepExecutor._parse_vision_sentinel(tool_result)
        if parsed:
            media_type, b64_data = parsed
            return {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Observation: Here is the image:"},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:{media_type};base64,{b64_data}",
                        },
                    },
                ],
            }
        return {"role": "user", "content": f"Observation: {tool_result}"}

    def _validate_expected_tool_usage(
        self,
        todo: TodoItem,
        tool_calls_made: list[str],
    ) -> None:
        """Fail step execution when a required tool is configured but not called."""
        expected_tool = getattr(todo, "tool_to_use", None)
        if not expected_tool:
            return
        expected_tool_name = sanitize_tool_name(expected_tool)
        available_tool_names = {
            sanitize_tool_name(tool.name)
            for tool in self.tools
            if getattr(tool, "name", "")
        } | set(self._available_functions.keys())
        if expected_tool_name not in available_tool_names:
            return
        called_names = {sanitize_tool_name(name) for name in tool_calls_made}
        if expected_tool_name not in called_names:
            raise ValueError(
                f"Expected tool '{expected_tool_name}' was not called "
                f"for step {todo.step_number}."
            )

    def _execute_native(
        self,
        messages: list[LLMMessage],
        tool_calls_made: list[str],
        max_step_iterations: int = 15,
        step_timeout: int | None = None,
        start_time: float | None = None,
    ) -> str:
        """Execute step using native function calling with a multi-turn loop.

        Iterates LLM call → tool execution → appended results until the LLM
        returns a text answer (no more tool calls) or max_step_iterations is
        reached. This lets the agent run a shell command, observe the output,
        correct mistakes, and issue follow-up commands, all within one step.
        """
        accumulated_results: list[str] = []

        for _ in range(max_step_iterations):
            # Check step timeout
            if step_timeout and start_time:
                elapsed = time.monotonic() - start_time
                if elapsed >= step_timeout:
                    return (
                        "\n\n".join(accumulated_results)
                        if accumulated_results
                        else f"Step timed out after {elapsed:.0f}s"
                    )

            answer = self.llm.call(
                messages,
                tools=self._openai_tools,
                callbacks=self.callbacks,
                from_task=self.task,
                from_agent=self.agent,
            )

            if not answer:
                raise ValueError("Empty response from LLM")

            if isinstance(answer, BaseModel):
                return answer.model_dump_json()

            if isinstance(answer, list) and answer and is_tool_call_list(answer):
                # _execute_native_tool_calls appends assistant + tool messages
                # to `messages` as a side-effect, so the next LLM call will
                # see the full conversation history including tool outputs.
                result = self._execute_native_tool_calls(
                    answer, messages, tool_calls_made
                )
                accumulated_results.append(result)
                continue

            # Text answer → LLM decided the step is done
            return str(answer)

        # Max iterations reached: return everything we accumulated
        return "\n".join(filter(None, accumulated_results))

    def _execute_native_tool_calls(
        self,
        tool_calls: list[Any],
        messages: list[LLMMessage],
        tool_calls_made: list[str],
    ) -> str:
        """Execute a batch of native tool calls and return their results.

        Returns the result of the first tool marked result_as_answer if any,
        otherwise returns all tool results concatenated.
        """
        assistant_message, _reports = build_tool_calls_assistant_message(tool_calls)
        if assistant_message:
            messages.append(assistant_message)

        tool_results: list[str] = []
        for tool_call in tool_calls:
            call_result = execute_single_native_tool_call(
                tool_call,
                available_functions=self._available_functions,
                original_tools=self.original_tools,
                structured_tools=self.tools,
                tools_handler=self.tools_handler,
                agent=self.agent,
                task=self.task,
                crew=self.crew,
                event_source=self,
                printer=self._printer,
                verbose=bool(self.agent and self.agent.verbose),
            )

            if call_result.func_name:
                tool_calls_made.append(call_result.func_name)

            if call_result.result_as_answer:
                return str(call_result.result)

            if call_result.tool_message:
                raw_content = call_result.tool_message.get("content", "")
                if isinstance(raw_content, str):
                    parsed = self._parse_vision_sentinel(raw_content)
                    if parsed:
                        media_type, b64_data = parsed
                        # Replace the sentinel with a standard image_url content block.
                        # Each provider's _format_messages handles conversion to
                        # its native format (e.g. Anthropic image blocks).
                        modified: LLMMessage = cast(
                            LLMMessage, dict(call_result.tool_message)
                        )
                        modified["content"] = [
                            {
                                "type": "image_url",
                                "image_url": {
                                    "url": f"data:{media_type};base64,{b64_data}",
                                },
                            }
                        ]
                        messages.append(modified)
                        tool_results.append("[image]")
                    else:
                        messages.append(call_result.tool_message)
                        if raw_content:
                            tool_results.append(raw_content)
                else:
                    messages.append(call_result.tool_message)
                    if raw_content:
                        tool_results.append(str(raw_content))

        return "\n".join(tool_results) if tool_results else ""
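
Note on the vision sentinel handling in this file: the snippet below is a minimal standalone sketch of the same idea, not the executor's actual code. The function names `parse_vision_sentinel` and `build_observation` are hypothetical stand-ins for the two static methods, and `str.partition` is swapped in for the `find`-based split. It shows how a `VISION_IMAGE:<media_type>:<base64_data>` string can be turned into an `image_url` data-URI content block, while any other tool output falls back to a plain text observation.

```python
from __future__ import annotations

from typing import Any


def parse_vision_sentinel(raw: str) -> tuple[str, str] | None:
    """Split 'VISION_IMAGE:<media_type>:<base64_data>' into its two fields.

    Hypothetical standalone helper; returns None for anything that is not a
    well-formed sentinel (missing prefix, missing separator, empty media type).
    """
    prefix = "VISION_IMAGE:"
    if not raw.startswith(prefix):
        return None
    media_type, sep, b64_data = raw[len(prefix):].partition(":")
    if not sep or not media_type:
        return None
    return media_type, b64_data


def build_observation(tool_result: str) -> dict[str, Any]:
    """Return a chat message: an image block for sentinels, plain text otherwise."""
    parsed = parse_vision_sentinel(tool_result)
    if parsed is None:
        return {"role": "user", "content": f"Observation: {tool_result}"}
    media_type, b64_data = parsed
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Observation: Here is the image:"},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{media_type};base64,{b64_data}"},
            },
        ],
    }
```

Splitting only on the first colon after the prefix is safe here because base64 payloads never contain a colon.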

View File

@@ -182,15 +182,24 @@ def log_tasks_outputs() -> None:
@crewai.command() @crewai.command()
@click.option("-m", "--memory", is_flag=True, help="Reset MEMORY") @click.option("-m", "--memory", is_flag=True, help="Reset MEMORY")
@click.option( @click.option(
"-l", "--long", is_flag=True, hidden=True, "-l",
"--long",
is_flag=True,
hidden=True,
help="[Deprecated: use --memory] Reset memory", help="[Deprecated: use --memory] Reset memory",
) )
@click.option( @click.option(
"-s", "--short", is_flag=True, hidden=True, "-s",
"--short",
is_flag=True,
hidden=True,
help="[Deprecated: use --memory] Reset memory", help="[Deprecated: use --memory] Reset memory",
) )
@click.option( @click.option(
"-e", "--entities", is_flag=True, hidden=True, "-e",
"--entities",
is_flag=True,
hidden=True,
help="[Deprecated: use --memory] Reset memory", help="[Deprecated: use --memory] Reset memory",
) )
@click.option("-kn", "--knowledge", is_flag=True, help="Reset KNOWLEDGE storage") @click.option("-kn", "--knowledge", is_flag=True, help="Reset KNOWLEDGE storage")
@@ -218,7 +227,13 @@ def reset_memories(
# Treat legacy flags as --memory with a deprecation warning # Treat legacy flags as --memory with a deprecation warning
if long or short or entities: if long or short or entities:
legacy_used = [ legacy_used = [
f for f, v in [("--long", long), ("--short", short), ("--entities", entities)] if v f
for f, v in [
("--long", long),
("--short", short),
("--entities", entities),
]
if v
] ]
click.echo( click.echo(
f"Warning: {', '.join(legacy_used)} {'is' if len(legacy_used) == 1 else 'are'} " f"Warning: {', '.join(legacy_used)} {'is' if len(legacy_used) == 1 else 'are'} "
@@ -238,9 +253,7 @@ def reset_memories(
"Please specify at least one memory type to reset using the appropriate flags." "Please specify at least one memory type to reset using the appropriate flags."
) )
return return
reset_memories_command( reset_memories_command(memory, knowledge, agent_knowledge, kickoff_outputs, all)
memory, knowledge, agent_knowledge, kickoff_outputs, all
)
except Exception as e: except Exception as e:
click.echo(f"An error occurred while resetting memories: {e}", err=True) click.echo(f"An error occurred while resetting memories: {e}", err=True)
@@ -669,18 +682,11 @@ def traces_enable():
     from rich.console import Console
     from rich.panel import Panel
 
-    from crewai.events.listeners.tracing.utils import (
-        _load_user_data,
-        _save_user_data,
-    )
+    from crewai.events.listeners.tracing.utils import update_user_data
 
     console = Console()
 
-    # Update user data to enable traces
-    user_data = _load_user_data()
-    user_data["trace_consent"] = True
-    user_data["first_execution_done"] = True
-    _save_user_data(user_data)
+    update_user_data({"trace_consent": True, "first_execution_done": True})
 
     panel = Panel(
         "✅ Trace collection has been enabled!\n\n"
@@ -699,18 +705,11 @@ def traces_disable():
     from rich.console import Console
     from rich.panel import Panel
 
-    from crewai.events.listeners.tracing.utils import (
-        _load_user_data,
-        _save_user_data,
-    )
+    from crewai.events.listeners.tracing.utils import update_user_data
 
     console = Console()
 
-    # Update user data to disable traces
-    user_data = _load_user_data()
-    user_data["trace_consent"] = False
-    user_data["first_execution_done"] = True
-    _save_user_data(user_data)
+    update_user_data({"trace_consent": False, "first_execution_done": True})
 
     panel = Panel(
         "❌ Trace collection has been disabled!\n\n"

View File

@@ -1,3 +1,4 @@
+import contextvars
 import json
 from pathlib import Path
 import platform
@@ -80,7 +81,10 @@ def run_chat() -> None:
 
     # Start loading indicator
     loading_complete = threading.Event()
-    loading_thread = threading.Thread(target=show_loading, args=(loading_complete,))
+    ctx = contextvars.copy_context()
+    loading_thread = threading.Thread(
+        target=ctx.run, args=(show_loading, loading_complete)
+    )
     loading_thread.start()
 
     try:
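
Note on the `contextvars.copy_context()` change above: a thread started with `threading.Thread` does not, on the Python versions this project targets, automatically see `ContextVar` values set by the thread that created it. The sketch below illustrates the pattern the diff adopts, using a hypothetical `request_id` variable and helper names that are not part of crewAI: snapshot the current context and run the thread target through `ctx.run`.

```python
import contextvars
import threading

# Hypothetical context variable, used only for this illustration.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="unset"
)


def read_request_id(out: list[str]) -> None:
    """Record the value of request_id as seen by the running thread."""
    out.append(request_id.get())


def run_plain_and_propagated() -> tuple[str, str]:
    """Start one thread without and one with an explicit context snapshot."""
    request_id.set("req-42")

    # Plain thread: the value it sees depends on the interpreter's
    # thread-context inheritance behaviour, so it is not relied on here.
    plain: list[str] = []
    t = threading.Thread(target=read_request_id, args=(plain,))
    t.start()
    t.join()

    # Propagated thread: ctx.run executes the target inside the snapshot,
    # so it sees the value set above.
    propagated: list[str] = []
    ctx = contextvars.copy_context()
    t = threading.Thread(target=ctx.run, args=(read_request_id, propagated))
    t.start()
    t.join()

    return plain[0], propagated[0]
```

The snapshot is a copy, so values the worker thread sets inside `ctx.run` do not leak back into the caller's context.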

View File

@@ -125,13 +125,19 @@ class MemoryTUI(App[None]):
from crewai.memory.storage.lancedb_storage import LanceDBStorage from crewai.memory.storage.lancedb_storage import LanceDBStorage
from crewai.memory.unified_memory import Memory from crewai.memory.unified_memory import Memory
storage = LanceDBStorage(path=storage_path) if storage_path else LanceDBStorage() storage = (
LanceDBStorage(path=storage_path) if storage_path else LanceDBStorage()
)
embedder = None embedder = None
if embedder_config is not None: if embedder_config is not None:
from crewai.rag.embeddings.factory import build_embedder from crewai.rag.embeddings.factory import build_embedder
embedder = build_embedder(embedder_config) embedder = build_embedder(embedder_config)
self._memory = Memory(storage=storage, embedder=embedder) if embedder else Memory(storage=storage) self._memory = (
Memory(storage=storage, embedder=embedder)
if embedder
else Memory(storage=storage)
)
except Exception as e: except Exception as e:
self._init_error = str(e) self._init_error = str(e)
@@ -200,11 +206,7 @@ class MemoryTUI(App[None]):
if len(record.content) > 80 if len(record.content) > 80
else record.content else record.content
) )
label = ( label = f"{date_str} [bold]{record.importance:.1f}[/] {preview}"
f"{date_str} "
f"[bold]{record.importance:.1f}[/] "
f"{preview}"
)
option_list.add_option(label) option_list.add_option(label)
def _populate_recall_list(self) -> None: def _populate_recall_list(self) -> None:
@@ -220,9 +222,7 @@ class MemoryTUI(App[None]):
else m.record.content else m.record.content
) )
label = ( label = (
f"[bold]\\[{m.score:.2f}][/] " f"[bold]\\[{m.score:.2f}][/] {preview} [dim]scope={m.record.scope}[/]"
f"{preview} "
f"[dim]scope={m.record.scope}[/]"
) )
option_list.add_option(label) option_list.add_option(label)
@@ -251,8 +251,7 @@ class MemoryTUI(App[None]):
lines.append(f"[dim]Scope:[/] [bold]{record.scope}[/]") lines.append(f"[dim]Scope:[/] [bold]{record.scope}[/]")
lines.append(f"[dim]Importance:[/] [bold]{record.importance:.2f}[/]") lines.append(f"[dim]Importance:[/] [bold]{record.importance:.2f}[/]")
lines.append( lines.append(
f"[dim]Created:[/] " f"[dim]Created:[/] {record.created_at.strftime('%Y-%m-%d %H:%M:%S')}"
f"{record.created_at.strftime('%Y-%m-%d %H:%M:%S')}"
) )
lines.append( lines.append(
f"[dim]Last accessed:[/] " f"[dim]Last accessed:[/] "
@@ -362,17 +361,11 @@ class MemoryTUI(App[None]):
panel = self.query_one("#info-panel", Static) panel = self.query_one("#info-panel", Static)
panel.loading = True panel.loading = True
try: try:
scope = ( scope = self._selected_scope if self._selected_scope != "/" else None
self._selected_scope
if self._selected_scope != "/"
else None
)
loop = asyncio.get_event_loop() loop = asyncio.get_event_loop()
matches = await loop.run_in_executor( matches = await loop.run_in_executor(
None, None,
lambda: self._memory.recall( lambda: self._memory.recall(query, scope=scope, limit=10, depth="deep"),
query, scope=scope, limit=10, depth="deep"
),
) )
self._recall_matches = matches or [] self._recall_matches = matches or []
self._view_mode = "recall" self._view_mode = "recall"

View File

@@ -95,9 +95,7 @@ def reset_memories_command(
continue continue
if memory: if memory:
_reset_flow_memory(flow) _reset_flow_memory(flow)
click.echo( click.echo(f"[Flow ({flow_name})] Memory has been reset.")
f"[Flow ({flow_name})] Memory has been reset."
)
except subprocess.CalledProcessError as e: except subprocess.CalledProcessError as e:
click.echo(f"An error occurred while resetting the memories: {e}", err=True) click.echo(f"An error occurred while resetting the memories: {e}", err=True)

View File

@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.10.2a1"
+    "crewai[tools]==1.11.0rc1"
 ]
 
 [project.scripts]

View File

@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.10.2a1"
+    "crewai[tools]==1.11.0rc1"
 ]
 
 [project.scripts]

View File

@@ -5,7 +5,7 @@ description = "Power up your crews with {{folder_name}}"
 readme = "README.md"
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.10.2a1"
+    "crewai[tools]==1.11.0rc1"
 ]
 
 [tool.crewai]

View File

@@ -442,9 +442,7 @@ def get_flows(flow_path: str = "main.py") -> list[Flow]:
for search_path in search_paths: for search_path in search_paths:
for root, dirs, files in os.walk(search_path): for root, dirs, files in os.walk(search_path):
dirs[:] = [ dirs[:] = [
d d for d in dirs if d not in _SKIP_DIRS and not d.startswith(".")
for d in dirs
if d not in _SKIP_DIRS and not d.startswith(".")
] ]
if flow_path in files and "cli/templates" not in root: if flow_path in files and "cli/templates" not in root:
file_os_path = os.path.join(root, flow_path) file_os_path = os.path.join(root, flow_path)
@@ -464,9 +462,7 @@ def get_flows(flow_path: str = "main.py") -> list[Flow]:
for attr_name in dir(module): for attr_name in dir(module):
module_attr = getattr(module, attr_name) module_attr = getattr(module, attr_name)
try: try:
if flow_instance := get_flow_instance( if flow_instance := get_flow_instance(module_attr):
module_attr
):
flow_instances.append(flow_instance) flow_instances.append(flow_instance)
except Exception: # noqa: S112 except Exception: # noqa: S112
continue continue

View File

@@ -1410,9 +1410,7 @@ class Crew(FlowTrackable, BaseModel):
return self._merge_tools(tools, cast(list[BaseTool], code_tools)) return self._merge_tools(tools, cast(list[BaseTool], code_tools))
return tools return tools
def _add_memory_tools( def _add_memory_tools(self, tools: list[BaseTool], memory: Any) -> list[BaseTool]:
self, tools: list[BaseTool], memory: Any
) -> list[BaseTool]:
"""Add recall and remember tools when memory is available. """Add recall and remember tools when memory is available.
Args: Args:

View File

@@ -75,6 +75,14 @@ from crewai.events.types.mcp_events import (
MCPToolExecutionFailedEvent, MCPToolExecutionFailedEvent,
MCPToolExecutionStartedEvent, MCPToolExecutionStartedEvent,
) )
from crewai.events.types.observation_events import (
GoalAchievedEarlyEvent,
PlanRefinementEvent,
PlanReplanTriggeredEvent,
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.events.types.reasoning_events import ( from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent, AgentReasoningCompletedEvent,
AgentReasoningFailedEvent, AgentReasoningFailedEvent,
@@ -535,6 +543,64 @@ class EventListener(BaseEventListener):
event.error, event.error,
) )
# ----------- OBSERVATION EVENTS (Plan-and-Execute) -----------
@crewai_event_bus.on(StepObservationStartedEvent)
def on_step_observation_started(
_: Any, event: StepObservationStartedEvent
) -> None:
self.formatter.handle_observation_started(
event.agent_role,
event.step_number,
event.step_description,
)
@crewai_event_bus.on(StepObservationCompletedEvent)
def on_step_observation_completed(
_: Any, event: StepObservationCompletedEvent
) -> None:
self.formatter.handle_observation_completed(
event.agent_role,
event.step_number,
event.step_completed_successfully,
event.remaining_plan_still_valid,
event.key_information_learned,
event.needs_full_replan,
event.goal_already_achieved,
)
@crewai_event_bus.on(StepObservationFailedEvent)
def on_step_observation_failed(
_: Any, event: StepObservationFailedEvent
) -> None:
self.formatter.handle_observation_failed(
event.step_number,
event.error,
)
@crewai_event_bus.on(PlanRefinementEvent)
def on_plan_refinement(_: Any, event: PlanRefinementEvent) -> None:
self.formatter.handle_plan_refinement(
event.step_number,
event.refined_step_count,
event.refinements,
)
@crewai_event_bus.on(PlanReplanTriggeredEvent)
def on_plan_replan_triggered(_: Any, event: PlanReplanTriggeredEvent) -> None:
self.formatter.handle_plan_replan(
event.replan_reason,
event.replan_count,
event.completed_steps_preserved,
)
@crewai_event_bus.on(GoalAchievedEarlyEvent)
def on_goal_achieved_early(_: Any, event: GoalAchievedEarlyEvent) -> None:
self.formatter.handle_goal_achieved_early(
event.steps_completed,
event.steps_remaining,
)
# ----------- AGENT LOGGING EVENTS ----------- # ----------- AGENT LOGGING EVENTS -----------
@crewai_event_bus.on(AgentLogsStartedEvent) @crewai_event_bus.on(AgentLogsStartedEvent)

View File

@@ -93,6 +93,14 @@ from crewai.events.types.memory_events import (
MemorySaveFailedEvent, MemorySaveFailedEvent,
MemorySaveStartedEvent, MemorySaveStartedEvent,
) )
from crewai.events.types.observation_events import (
GoalAchievedEarlyEvent,
PlanRefinementEvent,
PlanReplanTriggeredEvent,
StepObservationCompletedEvent,
StepObservationFailedEvent,
StepObservationStartedEvent,
)
from crewai.events.types.reasoning_events import ( from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent, AgentReasoningCompletedEvent,
AgentReasoningFailedEvent, AgentReasoningFailedEvent,
@@ -437,6 +445,39 @@ class TraceCollectionListener(BaseEventListener):
) -> None: ) -> None:
self._handle_action_event("agent_reasoning_failed", source, event) self._handle_action_event("agent_reasoning_failed", source, event)
# Observation events (Plan-and-Execute)
@event_bus.on(StepObservationStartedEvent)
def on_step_observation_started(
source: Any, event: StepObservationStartedEvent
) -> None:
self._handle_action_event("step_observation_started", source, event)
@event_bus.on(StepObservationCompletedEvent)
def on_step_observation_completed(
source: Any, event: StepObservationCompletedEvent
) -> None:
self._handle_action_event("step_observation_completed", source, event)
@event_bus.on(StepObservationFailedEvent)
def on_step_observation_failed(
source: Any, event: StepObservationFailedEvent
) -> None:
self._handle_action_event("step_observation_failed", source, event)
@event_bus.on(PlanRefinementEvent)
def on_plan_refinement(source: Any, event: PlanRefinementEvent) -> None:
self._handle_action_event("plan_refinement", source, event)
@event_bus.on(PlanReplanTriggeredEvent)
def on_plan_replan_triggered(
source: Any, event: PlanReplanTriggeredEvent
) -> None:
self._handle_action_event("plan_replan_triggered", source, event)
@event_bus.on(GoalAchievedEarlyEvent)
def on_goal_achieved_early(source: Any, event: GoalAchievedEarlyEvent) -> None:
self._handle_action_event("goal_achieved_early", source, event)
@event_bus.on(KnowledgeRetrievalStartedEvent) @event_bus.on(KnowledgeRetrievalStartedEvent)
def on_knowledge_retrieval_started( def on_knowledge_retrieval_started(
source: Any, event: KnowledgeRetrievalStartedEvent source: Any, event: KnowledgeRetrievalStartedEvent

View File

@@ -1,4 +1,5 @@
 from collections.abc import Callable
+import contextvars
 from contextvars import ContextVar, Token
 from datetime import datetime
 import getpass
@@ -18,6 +19,7 @@ from rich.console import Console
 from rich.panel import Panel
 from rich.text import Text
 
+from crewai.utilities.lock_store import lock as store_lock
 from crewai.utilities.paths import db_storage_path
 from crewai.utilities.serialization import to_serializable
 
@@ -137,12 +139,25 @@ def _load_user_data() -> dict[str, Any]:
     return {}
 
 
-def _save_user_data(data: dict[str, Any]) -> None:
+def _user_data_lock_name() -> str:
+    """Return a stable lock name for the user data file."""
+    return f"file:{os.path.realpath(_user_data_file())}"
+
+
+def update_user_data(updates: dict[str, Any]) -> None:
+    """Atomically read-modify-write the user data file.
+
+    Args:
+        updates: Key-value pairs to merge into the existing user data.
+    """
     try:
-        p = _user_data_file()
-        p.write_text(json.dumps(data, indent=2))
+        with store_lock(_user_data_lock_name()):
+            data = _load_user_data()
+            data.update(updates)
+            p = _user_data_file()
+            p.write_text(json.dumps(data, indent=2))
     except (OSError, PermissionError) as e:
-        logger.warning(f"Failed to save user data: {e}")
+        logger.warning(f"Failed to update user data: {e}")
 
 
 def has_user_declined_tracing() -> bool:
@@ -357,23 +372,29 @@ def _get_generic_system_id() -> str | None:
     return None
 
 
-def get_user_id() -> str:
-    """Stable, anonymized user identifier with caching."""
-    data = _load_user_data()
-
-    if "user_id" in data:
-        return cast(str, data["user_id"])
-
+def _generate_user_id() -> str:
+    """Compute an anonymized user identifier from username and machine ID."""
     try:
         username = getpass.getuser()
     except Exception:
         username = "unknown"
 
     seed = f"{username}|{_get_machine_id()}"
-    uid = hashlib.sha256(seed.encode()).hexdigest()
+    return hashlib.sha256(seed.encode()).hexdigest()
 
+
+def get_user_id() -> str:
+    """Stable, anonymized user identifier with caching."""
+    with store_lock(_user_data_lock_name()):
+        data = _load_user_data()
+
+        if "user_id" in data:
+            return cast(str, data["user_id"])
+
+        uid = _generate_user_id()
-    data["user_id"] = uid
-    _save_user_data(data)
+        data["user_id"] = uid
+        p = _user_data_file()
+        p.write_text(json.dumps(data, indent=2))
     return uid
 
 
@@ -389,20 +410,23 @@ def mark_first_execution_done(user_consented: bool = False) -> None:
     Args:
         user_consented: Whether the user consented to trace collection.
     """
-    data = _load_user_data()
-    if data.get("first_execution_done", False):
-        return
+    with store_lock(_user_data_lock_name()):
+        data = _load_user_data()
+        if data.get("first_execution_done", False):
+            return
 
-    data.update(
-        {
-            "first_execution_done": True,
-            "first_execution_at": datetime.now().timestamp(),
-            "user_id": get_user_id(),
-            "machine_id": _get_machine_id(),
-            "trace_consent": user_consented,
-        }
-    )
-    _save_user_data(data)
+        uid = data.get("user_id") or _generate_user_id()
+        data.update(
+            {
+                "first_execution_done": True,
+                "first_execution_at": datetime.now().timestamp(),
+                "user_id": uid,
+                "machine_id": _get_machine_id(),
+                "trace_consent": user_consented,
+            }
+        )
+        p = _user_data_file()
+        p.write_text(json.dumps(data, indent=2))
 
 
 def safe_serialize_to_dict(obj: Any, exclude: set[str] | None = None) -> dict[str, Any]:
@@ -509,7 +533,8 @@ def prompt_user_for_trace_viewing(timeout_seconds: int = 20) -> bool:
                 # Handle all input-related errors silently
                 result[0] = False
 
-        input_thread = threading.Thread(target=get_input, daemon=True)
+        ctx = contextvars.copy_context()
+        input_thread = threading.Thread(target=ctx.run, args=(get_input,), daemon=True)
         input_thread.start()
         input_thread.join(timeout=timeout_seconds)
 
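
Note on the locked read-modify-write introduced in this file: the sketch below illustrates why the load, merge, and write steps are wrapped in a single lock. It is a simplified assumption-laden example, not the crewAI implementation: a module-level `threading.Lock` stands in for `store_lock` (whose semantics are not shown in this diff), so it only guards threads within one process, and `update_json_file` and `demo` are hypothetical names.

```python
import json
import tempfile
import threading
from pathlib import Path
from typing import Any

# Stand-in for store_lock: protects threads in this process only.
_file_lock = threading.Lock()


def update_json_file(path: Path, updates: dict[str, Any]) -> dict[str, Any]:
    """Merge `updates` into the JSON object stored at `path` under one lock."""
    with _file_lock:
        try:
            data = json.loads(path.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            data = {}  # missing or corrupt file: start from an empty object
        data.update(updates)
        path.write_text(json.dumps(data, indent=2))
        return data


def demo() -> dict[str, Any]:
    """Run eight concurrent writers and return the final file contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "user_data.json"
        threads = [
            threading.Thread(target=update_json_file, args=(path, {f"k{i}": i}))
            for i in range(8)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return json.loads(path.read_text())
```

If the lock covered only the write, two writers could both load the same stale contents and the later write would silently drop the earlier writer's keys.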

View File

@@ -0,0 +1,99 @@
+"""Observation events for the Plan-and-Execute architecture.
+
+Emitted during the Observation phase (PLAN-AND-ACT Section 3.3) when the
+PlannerObserver analyzes step execution results and decides on plan
+continuation, refinement, or replanning.
+"""
+
+from typing import Any
+
+from crewai.events.base_events import BaseEvent
+
+
+class ObservationEvent(BaseEvent):
+    """Base event for observation phase events."""
+
+    type: str
+    agent_role: str
+    step_number: int
+    step_description: str = ""
+    from_task: Any | None = None
+    from_agent: Any | None = None
+
+    def __init__(self, **data: Any) -> None:
+        super().__init__(**data)
+        self._set_task_params(data)
+        self._set_agent_params(data)
+
+
+class StepObservationStartedEvent(ObservationEvent):
+    """Emitted when the Planner begins observing a step's result.
+
+    Fires after every step execution, before the observation LLM call.
+    """
+
+    type: str = "step_observation_started"
+
+
+class StepObservationCompletedEvent(ObservationEvent):
+    """Emitted when the Planner finishes observing a step's result.
+
+    Contains the full observation analysis: what was learned, whether
+    the plan is still valid, and what action to take next.
+    """
+
+    type: str = "step_observation_completed"
+    step_completed_successfully: bool = True
+    key_information_learned: str = ""
+    remaining_plan_still_valid: bool = True
+    needs_full_replan: bool = False
+    replan_reason: str | None = None
+    goal_already_achieved: bool = False
+    suggested_refinements: list[str] | None = None
+
+
+class StepObservationFailedEvent(ObservationEvent):
+    """Emitted when the observation LLM call itself fails.
+
+    The system defaults to continuing the plan when this happens,
+    but the event allows monitoring/alerting on observation failures.
+    """
+
+    type: str = "step_observation_failed"
+    error: str = ""
+
+
+class PlanRefinementEvent(ObservationEvent):
+    """Emitted when the Planner refines upcoming step descriptions.
+
+    This is the lightweight refinement path — no full replan, just
+    sharpening pending todo descriptions based on new information.
+    """
+
+    type: str = "plan_refinement"
+    refined_step_count: int = 0
+    refinements: list[str] | None = None
+
+
+class PlanReplanTriggeredEvent(ObservationEvent):
+    """Emitted when the Planner triggers a full replan.
+
+    The remaining plan was deemed fundamentally wrong and will be
+    regenerated from scratch, preserving completed step results.
+    """
+
+    type: str = "plan_replan_triggered"
+    replan_reason: str = ""
+    replan_count: int = 0
+    completed_steps_preserved: int = 0
+
+
+class GoalAchievedEarlyEvent(ObservationEvent):
+    """Emitted when the Planner detects the goal was achieved early.
+
+    Remaining steps will be skipped and execution will finalize.
+    """
+
+    type: str = "goal_achieved_early"
+    steps_remaining: int = 0
+    steps_completed: int = 0


@@ -9,7 +9,7 @@ class ReasoningEvent(BaseEvent):
     type: str
     attempt: int = 1
     agent_role: str
-    task_id: str
+    task_id: str | None = None
     task_name: str | None = None
     from_task: Any | None = None
     agent_id: str | None = None


@@ -43,6 +43,7 @@ def should_suppress_console_output() -> bool:
 class ConsoleFormatter:
     tool_usage_counts: ClassVar[dict[str, int]] = {}
+    _tool_counts_lock: ClassVar[threading.Lock] = threading.Lock()

     current_a2a_turn_count: int = 0
     _pending_a2a_message: str | None = None
@@ -445,8 +446,10 @@ To enable tracing, do any one of these:
         if not self.verbose:
             return

-        # Update tool usage count
-        self.tool_usage_counts[tool_name] = self.tool_usage_counts.get(tool_name, 0) + 1
+        with self._tool_counts_lock:
+            self.tool_usage_counts[tool_name] = (
+                self.tool_usage_counts.get(tool_name, 0) + 1
+            )
             iteration = self.tool_usage_counts[tool_name]

         content = Text()
@@ -474,6 +477,7 @@ To enable tracing, do any one of these:
         if not self.verbose:
             return

+        with self._tool_counts_lock:
             iteration = self.tool_usage_counts.get(tool_name, 1)

         content = Text()
@@ -500,6 +504,7 @@ To enable tracing, do any one of these:
         if not self.verbose:
             return

+        with self._tool_counts_lock:
             iteration = self.tool_usage_counts.get(tool_name, 1)

         content = Text()
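
The hunks above guard a class-level counter with a class-level lock. A minimal, self-contained sketch of that pattern follows; `ToolCounter`, `record`, and `hammer` are hypothetical names used for illustration and are not part of `ConsoleFormatter`.

```python
import threading
from typing import ClassVar


class ToolCounter:
    """Hypothetical stand-in for the shared-counter part of a formatter."""

    counts: ClassVar[dict[str, int]] = {}
    _lock: ClassVar[threading.Lock] = threading.Lock()

    @classmethod
    def record(cls, tool_name: str) -> int:
        # The read-modify-write and the read of the new value both happen
        # under the lock, so concurrent callers cannot lose an increment.
        with cls._lock:
            cls.counts[tool_name] = cls.counts.get(tool_name, 0) + 1
            return cls.counts[tool_name]


def hammer(n: int) -> None:
    """Increment the same key n times from the calling thread."""
    for _ in range(n):
        ToolCounter.record("search")


threads = [threading.Thread(target=hammer, args=(1000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(ToolCounter.counts["search"])  # 8 threads x 1000 increments
```

Because the dictionary and the lock are both `ClassVar`s, every instance shares one counter and one lock, which is presumably why the diff declares the lock at class level too.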
@@ -936,6 +941,152 @@ To enable tracing, do any one of these:
         )
         self.print_panel(error_content, "❌ Reasoning Error", "red")

+    # ----------- OBSERVATION EVENTS (Plan-and-Execute) -----------
+
+    def handle_observation_started(
+        self,
+        agent_role: str,
+        step_number: int,
+        step_description: str,
+    ) -> None:
+        """Handle step observation started event."""
+        if not self.verbose:
+            return
+
+        content = Text()
+        content.append("Observation Started\n", style="cyan bold")
+        content.append("Agent: ", style="white")
+        content.append(f"{agent_role}\n", style="cyan")
+        content.append("Step: ", style="white")
+        content.append(f"{step_number}\n", style="cyan")
+        if step_description:
+            desc_preview = step_description[:80] + (
+                "..." if len(step_description) > 80 else ""
+            )
+            content.append("Description: ", style="white")
+            content.append(f"{desc_preview}\n", style="cyan")
+
+        self.print_panel(content, "🔍 Observing Step Result", "cyan")
+
+    def handle_observation_completed(
+        self,
+        agent_role: str,
+        step_number: int,
+        step_completed: bool,
+        plan_valid: bool,
+        key_info: str,
+        needs_replan: bool,
+        goal_achieved: bool,
+    ) -> None:
+        """Handle step observation completed event."""
+        if not self.verbose:
+            return
+
+        if goal_achieved:
+            style = "green"
+            status = "Goal Achieved Early"
+        elif needs_replan:
+            style = "yellow"
+            status = "Replan Needed"
+        elif plan_valid:
+            style = "green"
+            status = "Plan Valid — Continue"
+        else:
+            style = "red"
+            status = "Step Failed"
+
+        content = Text()
+        content.append("Observation Complete\n", style=f"{style} bold")
+        content.append("Step: ", style="white")
+        content.append(f"{step_number}\n", style=style)
+        content.append("Status: ", style="white")
+        content.append(f"{status}\n", style=style)
+        if key_info:
+            info_preview = key_info[:120] + ("..." if len(key_info) > 120 else "")
+            content.append("Learned: ", style="white")
+            content.append(f"{info_preview}\n", style=style)
+
+        self.print_panel(content, "🔍 Observation Result", style)
+
+    def handle_observation_failed(
+        self,
+        step_number: int,
+        error: str,
+    ) -> None:
+        """Handle step observation failure event."""
+        if not self.verbose:
+            return
+
+        error_content = self.create_status_content(
+            "Observation Failed",
+            "Error",
+            "red",
+            Step=str(step_number),
+            Error=error,
+        )
+        self.print_panel(error_content, "❌ Observation Error", "red")
+
+    def handle_plan_refinement(
+        self,
+        step_number: int,
+        refined_count: int,
+        refinements: list[str] | None,
+    ) -> None:
+        """Handle plan refinement event."""
+        if not self.verbose:
+            return
+
+        content = Text()
+        content.append("Plan Refined\n", style="cyan bold")
+        content.append("After Step: ", style="white")
+        content.append(f"{step_number}\n", style="cyan")
+        content.append("Steps Updated: ", style="white")
+        content.append(f"{refined_count}\n", style="cyan")
+        if refinements:
+            for r in refinements[:3]:
+                content.append(f"{r[:80]}\n", style="white")
+
+        self.print_panel(content, "✏️ Plan Refinement", "cyan")
+
+    def handle_plan_replan(
+        self,
+        reason: str,
+        replan_count: int,
+        preserved_count: int,
+    ) -> None:
+        """Handle plan replan triggered event."""
+        if not self.verbose:
+            return
+
+        content = Text()
+        content.append("Full Replan Triggered\n", style="yellow bold")
+        content.append("Reason: ", style="white")
+        content.append(f"{reason}\n", style="yellow")
+        content.append("Replan #: ", style="white")
+        content.append(f"{replan_count}\n", style="yellow")
+        content.append("Preserved Steps: ", style="white")
+        content.append(f"{preserved_count}\n", style="yellow")
+
+        self.print_panel(content, "🔄 Dynamic Replan", "yellow")
+
+    def handle_goal_achieved_early(
+        self,
+        steps_completed: int,
+        steps_remaining: int,
+    ) -> None:
+        """Handle goal achieved early event."""
+        if not self.verbose:
+            return
+
+        content = Text()
+        content.append("Goal Achieved Early!\n", style="green bold")
+        content.append("Completed: ", style="white")
+        content.append(f"{steps_completed} steps\n", style="green")
+        content.append("Skipped: ", style="white")
+        content.append(f"{steps_remaining} remaining steps\n", style="green")
+
+        self.print_panel(content, "🎯 Early Goal Achievement", "green")
+
     # ----------- AGENT LOGGING EVENTS -----------

     def handle_agent_logs_started(
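
The status shown by `handle_observation_completed` depends on the order in which the flags are checked, and the handlers truncate long text before printing it. The sketch below isolates those two pieces as plain functions. `observation_status` and `preview` are hypothetical helper names; the strings and limits are copied from the hunk above.

```python
def observation_status(
    goal_achieved: bool, needs_replan: bool, plan_valid: bool
) -> tuple[str, str]:
    """Return (style, status) using the same precedence as the handler."""
    if goal_achieved:
        return "green", "Goal Achieved Early"
    if needs_replan:
        return "yellow", "Replan Needed"
    if plan_valid:
        return "green", "Plan Valid — Continue"
    return "red", "Step Failed"


def preview(text: str, limit: int) -> str:
    """Cut text to `limit` characters and append '...' only when it was cut."""
    return text[:limit] + ("..." if len(text) > limit else "")


# goal_achieved wins even when a replan was also requested.
print(observation_status(True, True, False))
print(preview("x" * 200, 120)[-5:])
```

One consequence of this ordering is that an invalid plan with no replan request is reported as "Step Failed", regardless of the separate `step_completed` argument, which the handler shown above does not read.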

File diff suppressed because it is too large


@@ -34,6 +34,7 @@ class ConsoleProvider:
         ```python
         from crewai.flow.async_feedback import ConsoleProvider

+
         @human_feedback(
             message="Review this:",
             provider=ConsoleProvider(),
@@ -46,6 +47,7 @@ class ConsoleProvider:
         ```python
         from crewai.flow import Flow, start

+
         class MyFlow(Flow):
             @start()
             def gather_info(self):


@@ -17,6 +17,7 @@ from collections.abc import (
     ValuesView,
 )
 from concurrent.futures import Future, ThreadPoolExecutor
+import contextvars
 import copy
 import enum
 import inspect
@@ -497,7 +498,9 @@ class LockedListProxy(list, Generic[T]): # type: ignore[type-arg]
     def __bool__(self) -> bool:
         return bool(self._list)

-    def index(self, value: T, start: SupportsIndex = 0, stop: SupportsIndex | None = None) -> int:  # type: ignore[override]
+    def index(
+        self, value: T, start: SupportsIndex = 0, stop: SupportsIndex | None = None
+    ) -> int:  # type: ignore[override]
         if stop is None:
             return self._list.index(value, start)
         return self._list.index(value, start, stop)
@@ -1811,8 +1814,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
         try:
             asyncio.get_running_loop()
+            ctx = contextvars.copy_context()
             with ThreadPoolExecutor(max_workers=1) as pool:
-                return pool.submit(asyncio.run, _run_flow()).result()
+                return pool.submit(ctx.run, asyncio.run, _run_flow()).result()
         except RuntimeError:
             return asyncio.run(_run_flow())
@@ -2236,8 +2240,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
                 else:
                     # Run sync methods in thread pool for isolation
                     # This allows Agent.kickoff() to work synchronously inside Flow methods
-                    import contextvars
-
                     ctx = contextvars.copy_context()
                     result = await asyncio.to_thread(ctx.run, method, *args, **kwargs)
             finally:
@@ -2714,7 +2716,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
             from crewai.flow.async_feedback.types import HumanFeedbackPending

             if not isinstance(e, HumanFeedbackPending):
+                if not getattr(e, "_flow_listener_logged", False):
                     logger.error(f"Error executing listener {listener_name}: {e}")
+                    e._flow_listener_logged = True  # type: ignore[attr-defined]
             raise

     # ── User Input (self.ask) ────────────────────────────────────────
@@ -2856,8 +2860,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
             # Manual executor management to avoid shutdown(wait=True)
             # deadlock when the provider call outlives the timeout.
             executor = ThreadPoolExecutor(max_workers=1)
+            ctx = contextvars.copy_context()
             future = executor.submit(
-                provider.request_input, message, self, metadata
+                ctx.run, provider.request_input, message, self, metadata
             )
             try:
                 raw = future.result(timeout=timeout)
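
Several hunks in this file replace `pool.submit(fn, ...)` with `pool.submit(ctx.run, fn, ...)`. The sketch below shows what that buys, using only the standard library. `current_flow_id` and `read_flow_id` are hypothetical names, standing in for whatever context-scoped state the caller has set.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical context variable; the real code propagates whatever
# ContextVars happen to be set in the calling thread.
current_flow_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "current_flow_id", default="unset"
)


def read_flow_id() -> str:
    """Return the value visible to whichever thread and context runs this."""
    return current_flow_id.get()


current_flow_id.set("flow-123")

with ThreadPoolExecutor(max_workers=1) as pool:
    # Plain submit: on most interpreter builds the worker thread does not
    # see the caller's value here, but that is build-dependent.
    without_ctx = pool.submit(read_flow_id).result()

    # Snapshot the caller's context, then run the callable inside it.
    ctx = contextvars.copy_context()
    with_ctx = pool.submit(ctx.run, read_flow_id).result()

print("without ctx.run:", without_ctx)
print("with ctx.run:", with_ctx)
```

`copy_context()` takes a snapshot, so values the worker sets inside `ctx.run` do not leak back into the caller's context.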


@@ -188,7 +188,7 @@ def human_feedback(
     metadata: dict[str, Any] | None = None,
     provider: HumanFeedbackProvider | None = None,
     learn: bool = False,
-    learn_source: str = "hitl"
+    learn_source: str = "hitl",
 ) -> Callable[[F], F]:
     """Decorator for Flow methods that require human feedback.
@@ -328,9 +328,7 @@ def human_feedback(
             """Recall past HITL lessons and use LLM to pre-review the output."""
             try:
                 query = f"human feedback lessons for {func.__name__}: {method_output!s}"
-                matches = flow_instance.memory.recall(
-                    query, source=learn_source
-                )
+                matches = flow_instance.memory.recall(query, source=learn_source)
                 if not matches:
                     return method_output
@@ -341,7 +339,10 @@ def human_feedback(
                     lessons=lessons,
                 )
                 messages = [
-                    {"role": "system", "content": _get_hitl_prompt("hitl_pre_review_system")},
+                    {
+                        "role": "system",
+                        "content": _get_hitl_prompt("hitl_pre_review_system"),
+                    },
                     {"role": "user", "content": prompt},
                 ]
                 if getattr(llm_inst, "supports_function_calling", lambda: False)():
@@ -366,7 +367,10 @@ def human_feedback(
                     feedback=raw_feedback,
                 )
                 messages = [
-                    {"role": "system", "content": _get_hitl_prompt("hitl_distill_system")},
+                    {
+                        "role": "system",
+                        "content": _get_hitl_prompt("hitl_distill_system"),
+                    },
                     {"role": "user", "content": prompt},
                 ]
@@ -487,7 +491,11 @@ def human_feedback(
                 result = _process_feedback(self, method_output, raw_feedback)

                 # Distill: extract lessons from output + feedback, store in memory
-                if learn and getattr(self, "memory", None) is not None and raw_feedback.strip():
+                if (
+                    learn
+                    and getattr(self, "memory", None) is not None
+                    and raw_feedback.strip()
+                ):
                     _distill_and_store_lessons(self, method_output, raw_feedback)

                 return result
@@ -507,7 +515,11 @@ def human_feedback(
                 result = _process_feedback(self, method_output, raw_feedback)

                 # Distill: extract lessons from output + feedback, store in memory
-                if learn and getattr(self, "memory", None) is not None and raw_feedback.strip():
+                if (
+                    learn
+                    and getattr(self, "memory", None) is not None
+                    and raw_feedback.strip()
+                ):
                     _distill_and_store_lessons(self, method_output, raw_feedback)

                 return result
@@ -534,7 +546,7 @@ def human_feedback(
             metadata=metadata,
             provider=provider,
             learn=learn,
-            learn_source=learn_source
+            learn_source=learn_source,
         )
         wrapper.__is_flow_method__ = True


@@ -1,11 +1,10 @@
-"""
-SQLite-based implementation of flow state persistence.
-"""
+"""SQLite-based implementation of flow state persistence."""

 from __future__ import annotations

 from datetime import datetime, timezone
 import json
+import os
 from pathlib import Path
 import sqlite3
 from typing import TYPE_CHECKING, Any
@@ -13,6 +12,7 @@ from typing import TYPE_CHECKING, Any
 from pydantic import BaseModel

 from crewai.flow.persistence.base import FlowPersistence
+from crewai.utilities.lock_store import lock as store_lock
 from crewai.utilities.paths import db_storage_path
@@ -68,11 +68,15 @@ class SQLiteFlowPersistence(FlowPersistence):
             raise ValueError("Database path must be provided")

         self.db_path = path  # Now mypy knows this is str
+        self._lock_name = f"sqlite:{os.path.realpath(self.db_path)}"
         self.init_db()

     def init_db(self) -> None:
         """Create the necessary tables if they don't exist."""
-        with sqlite3.connect(self.db_path, timeout=30) as conn:
+        with (
+            store_lock(self._lock_name),
+            sqlite3.connect(self.db_path, timeout=30) as conn,
+        ):
             conn.execute("PRAGMA journal_mode=WAL")
             # Main state table
             conn.execute(
@@ -114,30 +118,21 @@ class SQLiteFlowPersistence(FlowPersistence):
                 """
             )

-    def save_state(
+    def _save_state_sql(
         self,
+        conn: sqlite3.Connection,
         flow_uuid: str,
         method_name: str,
-        state_data: dict[str, Any] | BaseModel,
+        state_dict: dict[str, Any],
     ) -> None:
-        """Save the current flow state to SQLite.
+        """Execute the save-state INSERT without acquiring the lock.

         Args:
-            flow_uuid: Unique identifier for the flow instance
-            method_name: Name of the method that just completed
-            state_data: Current state data (either dict or Pydantic model)
+            conn: An open SQLite connection.
+            flow_uuid: Unique identifier for the flow instance.
+            method_name: Name of the method that just completed.
+            state_dict: State data as a plain dict.
         """
-        # Convert state_data to dict, handling both Pydantic and dict cases
-        if isinstance(state_data, BaseModel):
-            state_dict = state_data.model_dump()
-        elif isinstance(state_data, dict):
-            state_dict = state_data
-        else:
-            raise ValueError(
-                f"state_data must be either a Pydantic BaseModel or dict, got {type(state_data)}"
-            )
-
-        with sqlite3.connect(self.db_path, timeout=30) as conn:
         conn.execute(
             """
             INSERT INTO flow_states (
@@ -155,6 +150,38 @@ class SQLiteFlowPersistence(FlowPersistence):
             ),
         )

+    @staticmethod
+    def _to_state_dict(state_data: dict[str, Any] | BaseModel) -> dict[str, Any]:
+        """Convert state_data to a plain dict."""
+        if isinstance(state_data, BaseModel):
+            return state_data.model_dump()
+        if isinstance(state_data, dict):
+            return state_data
+        raise ValueError(
+            f"state_data must be either a Pydantic BaseModel or dict, got {type(state_data)}"
+        )
+
+    def save_state(
+        self,
+        flow_uuid: str,
+        method_name: str,
+        state_data: dict[str, Any] | BaseModel,
+    ) -> None:
+        """Save the current flow state to SQLite.
+
+        Args:
+            flow_uuid: Unique identifier for the flow instance
+            method_name: Name of the method that just completed
+            state_data: Current state data (either dict or Pydantic model)
+        """
+        state_dict = self._to_state_dict(state_data)
+
+        with (
+            store_lock(self._lock_name),
+            sqlite3.connect(self.db_path, timeout=30) as conn,
+        ):
+            self._save_state_sql(conn, flow_uuid, method_name, state_dict)
+
     def load_state(self, flow_uuid: str) -> dict[str, Any] | None:
         """Load the most recent state for a given flow UUID.
@@ -198,24 +225,14 @@ class SQLiteFlowPersistence(FlowPersistence):
             context: The pending feedback context with all resume information
             state_data: Current state data
         """
-        # Import here to avoid circular imports
-
-        # Convert state_data to dict
-        if isinstance(state_data, BaseModel):
-            state_dict = state_data.model_dump()
-        elif isinstance(state_data, dict):
-            state_dict = state_data
-        else:
-            raise ValueError(
-                f"state_data must be either a Pydantic BaseModel or dict, got {type(state_data)}"
-            )
-
-        # Also save to regular state table for consistency
-        self.save_state(flow_uuid, context.method_name, state_data)
-
-        # Save pending feedback context
-        with sqlite3.connect(self.db_path, timeout=30) as conn:
-            # Use INSERT OR REPLACE to handle re-triggering feedback on same flow
+        state_dict = self._to_state_dict(state_data)
+
+        with (
+            store_lock(self._lock_name),
+            sqlite3.connect(self.db_path, timeout=30) as conn,
+        ):
+            self._save_state_sql(conn, flow_uuid, context.method_name, state_dict)
+
             conn.execute(
                 """
                 INSERT OR REPLACE INTO pending_feedback (
@@ -273,7 +290,10 @@ class SQLiteFlowPersistence(FlowPersistence):
         Args:
             flow_uuid: Unique identifier for the flow instance
         """
-        with sqlite3.connect(self.db_path, timeout=30) as conn:
+        with (
+            store_lock(self._lock_name),
+            sqlite3.connect(self.db_path, timeout=30) as conn,
+        ):
             conn.execute(
                 """
                 DELETE FROM pending_feedback
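
The refactor above splits "acquire lock and connect" from "run the INSERT", so that a method needing two writes can do both under one lock acquisition instead of calling `save_state` from inside another locked section. The sketch below illustrates that split with the standard library only. It rests on stated assumptions: a process-local `threading.Lock` stands in for `store_lock` (whose implementation is not shown in this diff), and the two-column tables, `locked_connection`, `_insert_state`, and `save_pending` are hypothetical simplifications.

```python
import json
import os
import sqlite3
import tempfile
import threading
from collections.abc import Iterator
from contextlib import contextmanager
from typing import Any

# Assumption: a plain, non-reentrant lock stands in for crewai's store_lock.
_db_lock = threading.Lock()


@contextmanager
def locked_connection(db_path: str) -> Iterator[sqlite3.Connection]:
    """Hold the lock for the whole lifetime of the connection."""
    with _db_lock:
        conn = sqlite3.connect(db_path, timeout=30)
        try:
            with conn:  # commits on success, rolls back on an exception
                yield conn
        finally:
            conn.close()


def _insert_state(
    conn: sqlite3.Connection, flow_uuid: str, state: dict[str, Any]
) -> None:
    """Run the INSERT only; the caller already holds the lock."""
    conn.execute(
        "INSERT INTO flow_states (flow_uuid, state_json) VALUES (?, ?)",
        (flow_uuid, json.dumps(state)),
    )


def save_state(db_path: str, flow_uuid: str, state: dict[str, Any]) -> None:
    with locked_connection(db_path) as conn:
        _insert_state(conn, flow_uuid, state)


def save_pending(
    db_path: str, flow_uuid: str, state: dict[str, Any], note: str
) -> None:
    # One lock acquisition and one transaction cover both writes. Calling
    # save_state() here instead would try to take the same lock twice.
    with locked_connection(db_path) as conn:
        _insert_state(conn, flow_uuid, state)
        conn.execute(
            "INSERT OR REPLACE INTO pending (flow_uuid, note) VALUES (?, ?)",
            (flow_uuid, note),
        )


db_path = os.path.join(tempfile.mkdtemp(), "flows.db")
with locked_connection(db_path) as conn:
    conn.execute("CREATE TABLE flow_states (flow_uuid TEXT, state_json TEXT)")
    conn.execute("CREATE TABLE pending (flow_uuid TEXT PRIMARY KEY, note TEXT)")

save_state(db_path, "f1", {"step": 1})
save_pending(db_path, "f1", {"step": 2}, "awaiting review")

with locked_connection(db_path) as conn:
    state_rows = conn.execute("SELECT COUNT(*) FROM flow_states").fetchone()[0]
    pending_rows = conn.execute("SELECT COUNT(*) FROM pending").fetchone()[0]
print(state_rows, pending_rows)
```

A side effect of sharing one connection is that the state row and the pending row are committed together, so a failure in the second write rolls back the first.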


@@ -6,9 +6,27 @@ from typing import Any
 from pydantic import BaseModel, Field

+from crewai.utilities.planning_types import TodoItem
 from crewai.utilities.types import LLMMessage


+class TodoExecutionResult(BaseModel):
+    """Summary of a single todo execution."""
+
+    step_number: int = Field(description="Step number in the plan")
+    description: str = Field(description="What the todo was supposed to do")
+    tool_used: str | None = Field(
+        default=None, description="Tool that was used for this step"
+    )
+    status: str = Field(description="Final status: completed, failed, pending")
+    result: str | None = Field(
+        default=None, description="Result or error message from execution"
+    )
+    depends_on: list[int] = Field(
+        default_factory=list, description="Step numbers this depended on"
+    )
+
+
 class LiteAgentOutput(BaseModel):
     """Class that represents the result of a LiteAgent execution."""
@@ -24,12 +42,75 @@ class LiteAgentOutput(BaseModel):
     )
     messages: list[LLMMessage] = Field(description="Messages of the agent", default=[])
+    plan: str | None = Field(
+        default=None, description="The execution plan that was generated, if any"
+    )
+    todos: list[TodoExecutionResult] = Field(
+        default_factory=list,
+        description="List of todos that were executed with their results",
+    )
+    replan_count: int = Field(
+        default=0, description="Number of times the plan was regenerated"
+    )
+    last_replan_reason: str | None = Field(
+        default=None, description="Reason for the last replan, if any"
+    )
+
+    @classmethod
+    def from_todo_items(cls, todo_items: list[TodoItem]) -> list[TodoExecutionResult]:
+        """Convert TodoItem objects to TodoExecutionResult summaries.
+
+        Args:
+            todo_items: List of TodoItem objects from execution.
+
+        Returns:
+            List of TodoExecutionResult summaries.
+        """
+        return [
+            TodoExecutionResult(
+                step_number=item.step_number,
+                description=item.description,
+                tool_used=item.tool_to_use,
+                status=item.status,
+                result=item.result,
+                depends_on=item.depends_on,
+            )
+            for item in todo_items
+        ]

     def to_dict(self) -> dict[str, Any]:
         """Convert pydantic_output to a dictionary."""
         if self.pydantic:
             return self.pydantic.model_dump()
         return {}

+    @property
+    def completed_todos(self) -> list[TodoExecutionResult]:
+        """Get only the completed todos."""
+        return [t for t in self.todos if t.status == "completed"]
+
+    @property
+    def failed_todos(self) -> list[TodoExecutionResult]:
+        """Get only the failed todos."""
+        return [t for t in self.todos if t.status == "failed"]
+
+    @property
+    def had_plan(self) -> bool:
+        """Check if the agent executed with a plan."""
+        return self.plan is not None or len(self.todos) > 0
+
     def __str__(self) -> str:
         """Return the raw output as a string."""
         return self.raw
+
+    def __repr__(self) -> str:
+        """Return a detailed representation including todo summary."""
+        parts = [f"LiteAgentOutput(role={self.agent_role!r}"]
+        if self.todos:
+            completed = len(self.completed_todos)
+            total = len(self.todos)
+            parts.append(f", todos={completed}/{total} completed")
+        if self.replan_count > 0:
+            parts.append(f", replans={self.replan_count}")
+        parts.append(")")
+        return "".join(parts)


@@ -618,6 +618,50 @@ class AnthropicCompletion(BaseLLM):
             return redacted_block
         return None

+    @staticmethod
+    def _convert_image_blocks(content: Any) -> Any:
+        """Convert OpenAI-style image_url blocks to Anthropic image blocks.
+
+        Upstream code (e.g. StepExecutor) uses the standard ``image_url``
+        format with a ``data:`` URI. Anthropic rejects that — it requires
+        ``{"type": "image", "source": {"type": "base64", ...}}``.
+
+        Non-list content and blocks that are not ``image_url`` are passed
+        through unchanged.
+        """
+        if not isinstance(content, list):
+            return content
+
+        converted: list[dict[str, Any]] = []
+        for block in content:
+            if not isinstance(block, dict) or block.get("type") != "image_url":
+                converted.append(block)
+                continue
+
+            image_info = block.get("image_url", {})
+            url = image_info.get("url", "") if isinstance(image_info, dict) else ""
+
+            if url.startswith("data:") and ";base64," in url:
+                # Parse data:<media_type>;base64,<data>
+                header, b64_data = url.split(";base64,", 1)
+                media_type = (
+                    header.split("data:", 1)[1] if "data:" in header else "image/png"
+                )
+                converted.append(
+                    {
+                        "type": "image",
+                        "source": {
+                            "type": "base64",
+                            "media_type": media_type,
+                            "data": b64_data,
+                        },
+                    }
+                )
+            else:
+                # Non-data URI — pass through as-is (Anthropic supports url source)
+                converted.append(block)
+
+        return converted
+
     def _format_messages_for_anthropic(
         self, messages: str | list[LLMMessage]
     ) -> tuple[list[LLMMessage], str | None]:
@@ -656,10 +700,11 @@ class AnthropicCompletion(BaseLLM):
                 tool_call_id = message.get("tool_call_id", "")
                 if not tool_call_id:
                     raise ValueError("Tool message missing required tool_call_id")
+                tool_content = self._convert_image_blocks(content) if content else ""
                 tool_result = {
                     "type": "tool_result",
                     "tool_use_id": tool_call_id,
-                    "content": content if content else "",
+                    "content": tool_content,
                 }
                 pending_tool_results.append(tool_result)
             elif role == "assistant":
@@ -718,7 +763,12 @@ class AnthropicCompletion(BaseLLM):
                 role_str = role if role is not None else "user"
                 if isinstance(content, list):
-                    formatted_messages.append({"role": role_str, "content": content})
+                    formatted_messages.append(
+                        {
+                            "role": role_str,
+                            "content": self._convert_image_blocks(content),
+                        }
+                    )
                 else:
                     content_str = content if content is not None else ""
                     formatted_messages.append(
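
The core of `_convert_image_blocks` is the data-URI split. The sketch below isolates that step as a standalone function so the input and output shapes are easy to see. `data_uri_to_image_block` is a hypothetical name, and the `or "image/png"` fallback for an empty media type is a small variation on the logic in the diff, not a copy of it.

```python
from __future__ import annotations

from typing import Any


def data_uri_to_image_block(url: str) -> dict[str, Any] | None:
    """Turn a base64 data URI into an Anthropic-style image block.

    Returns None when the URL is not a base64 data URI, so the caller
    can pass the original block through unchanged.
    """
    if not (url.startswith("data:") and ";base64," in url):
        return None
    # "data:image/jpeg;base64,AAAA" -> header "data:image/jpeg", payload "AAAA"
    header, b64_data = url.split(";base64,", 1)
    media_type = header[len("data:"):] or "image/png"  # fallback is an assumption
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": b64_data,
        },
    }


block = data_uri_to_image_block("data:image/jpeg;base64,AAAA")
print(block)
print(data_uri_to_image_block("https://example.com/cat.png"))
```

Splitting with `maxsplit=1` keeps any later occurrence of `;base64,` inside the payload rather than truncating it.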


@@ -1847,7 +1847,10 @@ class BedrockCompletion(BaseLLM):
             converse_messages.append({"role": "user", "content": pending_tool_results})

         # CRITICAL: Handle model-specific conversation requirements
-        # Cohere and some other models require conversation to end with user message
+        # Cohere and some other models require conversation to end with user message.
+        # Anthropic models on Bedrock also reject assistant messages in the final
+        # position when tools are present ("pre-filling the assistant response is
+        # not supported").
         if converse_messages:
             last_message = converse_messages[-1]
             if last_message["role"] == "assistant":
@@ -1874,6 +1877,20 @@ class BedrockCompletion(BaseLLM):
                             "content": [{"text": "Continue your response."}],
                         }
                     )
+                # Anthropic (Claude) models reject assistant-last messages when
+                # tools are in the request. Append a user message so the
+                # Converse API accepts the payload.
+                elif "anthropic" in self.model.lower() or "claude" in self.model.lower():
+                    converse_messages.append(
+                        {
+                            "role": "user",
+                            "content": [
+                                {
+                                    "text": "Please continue and provide your final answer."
+                                }
+                            ],
+                        }
+                    )

         # Ensure first message is from user (required by Converse API)
         if not converse_messages:
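
The new `elif` branch appends a user turn when a Claude conversation would otherwise end on an assistant message. A simplified sketch of just that rule follows. `ensure_user_last` and the model identifiers are hypothetical, and the earlier branches of the real method (only partly visible in this diff) are left out.

```python
from typing import Any

Message = dict[str, Any]


def ensure_user_last(messages: list[Message], model: str) -> list[Message]:
    """Return a copy that does not end on an assistant turn for Claude models."""
    result = list(messages)
    is_claude = "anthropic" in model.lower() or "claude" in model.lower()
    if result and result[-1]["role"] == "assistant" and is_claude:
        # Same nudge text as the diff; it only exists to make the final
        # turn a user turn.
        result.append(
            {
                "role": "user",
                "content": [
                    {"text": "Please continue and provide your final answer."}
                ],
            }
        )
    return result


history = [
    {"role": "user", "content": [{"text": "Summarize the report."}]},
    {"role": "assistant", "content": [{"text": "Working on it."}]},
]
fixed = ensure_user_last(history, "anthropic.claude-example")
print([m["role"] for m in fixed])
```

Returning a copy rather than mutating in place is a choice made for this sketch; the code in the diff appends to `converse_messages` directly.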


@@ -11,6 +11,7 @@ into a standalone MCPToolResolver. It handles three flavours of MCP reference:
 from __future__ import annotations

 import asyncio
+import contextvars
 import time
 from typing import TYPE_CHECKING, Any, Final, cast
 from urllib.parse import urlparse
@@ -22,10 +23,10 @@ from crewai.mcp.config import (
     MCPServerSSE,
     MCPServerStdio,
 )
-from crewai.utilities.string_utils import sanitize_tool_name
 from crewai.mcp.transports.http import HTTPTransport
 from crewai.mcp.transports.sse import SSETransport
 from crewai.mcp.transports.stdio import StdioTransport
+from crewai.utilities.string_utils import sanitize_tool_name


 if TYPE_CHECKING:
@@ -227,7 +228,9 @@ class MCPToolResolver:
         server_params = {"url": server_url}
         server_name = self._extract_server_name(server_url)

-        sanitized_specific_tool = sanitize_tool_name(specific_tool) if specific_tool else None
+        sanitized_specific_tool = (
+            sanitize_tool_name(specific_tool) if specific_tool else None
+        )

         try:
             tool_schemas = self._get_mcp_tool_schemas(server_params)
@@ -353,9 +356,10 @@ class MCPToolResolver:
             asyncio.get_running_loop()
             import concurrent.futures

+            ctx = contextvars.copy_context()
             with concurrent.futures.ThreadPoolExecutor() as executor:
                 future = executor.submit(
-                    asyncio.run, _setup_client_and_list_tools()
+                    ctx.run, asyncio.run, _setup_client_and_list_tools()
                 )
                 tools_list = future.result()
         except RuntimeError:


@@ -308,7 +308,9 @@ def analyze_for_save(
         return MemoryAnalysis.model_validate(response)
     except Exception as e:
         _logger.warning(
-            "Memory save analysis failed, using defaults: %s", e, exc_info=False,
+            "Memory save analysis failed, using defaults: %s",
+            e,
+            exc_info=False,
         )
         return _SAVE_DEFAULTS
@@ -366,6 +368,8 @@ def analyze_for_consolidation(
         return ConsolidationPlan.model_validate(response)
     except Exception as e:
         _logger.warning(
-            "Consolidation analysis failed, defaulting to insert: %s", e, exc_info=False,
+            "Consolidation analysis failed, defaulting to insert: %s",
+            e,
+            exc_info=False,
         )
         return _CONSOLIDATION_DEFAULT


@@ -11,7 +11,9 @@ Orchestrates the encoding side of memory in a single Flow with 5 steps:
 from __future__ import annotations

 from concurrent.futures import Future, ThreadPoolExecutor
+import contextvars
 from datetime import datetime
+import logging
 import math
 from typing import Any
 from uuid import uuid4
@@ -28,6 +30,8 @@ from crewai.memory.analyze import (
 from crewai.memory.types import MemoryConfig, MemoryRecord, embed_texts


+logger = logging.getLogger(__name__)
+
 # ---------------------------------------------------------------------------
 # State models
 # ---------------------------------------------------------------------------
@@ -164,14 +168,20 @@ class EncodingFlow(Flow[EncodingState]):
     def parallel_find_similar(self) -> None:
         """Search storage for similar records, concurrently for all active items."""
         items = list(self.state.items)
-        active = [(i, item) for i, item in enumerate(items) if not item.dropped and item.embedding]
+        active = [
+            (i, item)
+            for i, item in enumerate(items)
+            if not item.dropped and item.embedding
+        ]
         if not active:
             return

-        def _search_one(item: ItemState) -> list[tuple[MemoryRecord, float]]:
+        def _search_one(
+            item: ItemState,
+        ) -> list[tuple[MemoryRecord, float]]:
             scope_prefix = item.scope if item.scope and item.scope.strip("/") else None
-            return self._storage.search(
+            return self._storage.search(  # type: ignore[no-any-return]
                 item.embedding,
                 scope_prefix=scope_prefix,
                 categories=None,
@@ -181,14 +191,37 @@ class EncodingFlow(Flow[EncodingState]):
         if len(active) == 1:
             _, item = active[0]
+            try:
                 raw = _search_one(item)
+            except Exception:
+                logger.warning(
+                    "Storage search failed in parallel_find_similar, "
+                    "treating item as new",
+                    exc_info=True,
+                )
+                raw = []
             item.similar_records = [r for r, _ in raw]
             item.top_similarity = float(raw[0][1]) if raw else 0.0
         else:
             with ThreadPoolExecutor(max_workers=min(len(active), 8)) as pool:
-                futures = [(i, item, pool.submit(_search_one, item)) for i, item in active]
+                futures = [
+                    (
+                        i,
+                        item,
+                        pool.submit(contextvars.copy_context().run, _search_one, item),
+                    )
+                    for i, item in active
+                ]
                 for _, item, future in futures:
+                    try:
                         raw = future.result()
+                    except Exception:
+                        logger.warning(
+                            "Storage search failed in parallel_find_similar, "
+                            "treating item as new",
+                            exc_info=True,
+                        )
+                        raw = []
                     item.similar_records = [r for r, _ in raw]
                     item.top_similarity = float(raw[0][1]) if raw else 0.0
@@ -250,24 +283,38 @@ class EncodingFlow(Flow[EncodingState]):
# Group B: consolidation only # Group B: consolidation only
self._apply_defaults(item) self._apply_defaults(item)
consol_futures[i] = pool.submit( consol_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_consolidation, analyze_for_consolidation,
item.content, list(item.similar_records), self._llm, item.content,
list(item.similar_records),
self._llm,
) )
elif not fields_provided and not has_similar: elif not fields_provided and not has_similar:
# Group C: field resolution only # Group C: field resolution only
save_futures[i] = pool.submit( save_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_save, analyze_for_save,
item.content, existing_scopes, existing_categories, self._llm, item.content,
existing_scopes,
existing_categories,
self._llm,
) )
else: else:
# Group D: both in parallel # Group D: both in parallel
save_futures[i] = pool.submit( save_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_save, analyze_for_save,
item.content, existing_scopes, existing_categories, self._llm, item.content,
existing_scopes,
existing_categories,
self._llm,
) )
consol_futures[i] = pool.submit( consol_futures[i] = pool.submit(
contextvars.copy_context().run,
analyze_for_consolidation, analyze_for_consolidation,
item.content, list(item.similar_records), self._llm, item.content,
list(item.similar_records),
self._llm,
) )
# Collect field-resolution results # Collect field-resolution results
@@ -300,8 +347,8 @@ class EncodingFlow(Flow[EncodingState]):
item.plan = ConsolidationPlan(actions=[], insert_new=True) item.plan = ConsolidationPlan(actions=[], insert_new=True)
# Collect consolidation results # Collect consolidation results
for i, future in consol_futures.items(): for i, consol_future in consol_futures.items():
items[i].plan = future.result() items[i].plan = consol_future.result()
finally: finally:
pool.shutdown(wait=False) pool.shutdown(wait=False)
@@ -339,7 +386,9 @@ class EncodingFlow(Flow[EncodingState]):
# similar_records overlap). Collect one action per record_id, first wins. # similar_records overlap). Collect one action per record_id, first wins.
# Also build a map from record_id to the original MemoryRecord for updates. # Also build a map from record_id to the original MemoryRecord for updates.
dedup_deletes: set[str] = set() # record_ids to delete dedup_deletes: set[str] = set() # record_ids to delete
dedup_updates: dict[str, tuple[int, str]] = {} # record_id -> (item_idx, new_content) dedup_updates: dict[
str, tuple[int, str]
] = {} # record_id -> (item_idx, new_content)
all_similar: dict[str, MemoryRecord] = {} # record_id -> MemoryRecord all_similar: dict[str, MemoryRecord] = {} # record_id -> MemoryRecord
for i, item in enumerate(items): for i, item in enumerate(items):
@@ -350,13 +399,24 @@ class EncodingFlow(Flow[EncodingState]):
all_similar[r.id] = r all_similar[r.id] = r
for action in item.plan.actions: for action in item.plan.actions:
rid = action.record_id rid = action.record_id
if action.action == "delete" and rid not in dedup_deletes and rid not in dedup_updates: if (
action.action == "delete"
and rid not in dedup_deletes
and rid not in dedup_updates
):
dedup_deletes.add(rid) dedup_deletes.add(rid)
elif action.action == "update" and action.new_content and rid not in dedup_deletes and rid not in dedup_updates: elif (
action.action == "update"
and action.new_content
and rid not in dedup_deletes
and rid not in dedup_updates
):
dedup_updates[rid] = (i, action.new_content) dedup_updates[rid] = (i, action.new_content)
# --- Batch re-embed all update contents in ONE call --- # --- Batch re-embed all update contents in ONE call ---
update_list = list(dedup_updates.items()) # [(record_id, (item_idx, new_content)), ...] update_list = list(
dedup_updates.items()
) # [(record_id, (item_idx, new_content)), ...]
update_embeddings: list[list[float]] = [] update_embeddings: list[list[float]] = []
if update_list: if update_list:
update_contents = [content for _, (_, content) in update_list] update_contents = [content for _, (_, content) in update_list]
@@ -377,7 +437,10 @@ class EncodingFlow(Flow[EncodingState]):
if item.dropped or item.plan is None: if item.dropped or item.plan is None:
continue continue
if item.plan.insert_new: if item.plan.insert_new:
to_insert.append((i, MemoryRecord( to_insert.append(
(
i,
MemoryRecord(
content=item.content, content=item.content,
scope=item.resolved_scope, scope=item.resolved_scope,
categories=item.resolved_categories, categories=item.resolved_categories,
@@ -386,13 +449,11 @@ class EncodingFlow(Flow[EncodingState]):
embedding=item.embedding if item.embedding else None, embedding=item.embedding if item.embedding else None,
source=item.resolved_source, source=item.resolved_source,
private=item.resolved_private, private=item.resolved_private,
))) ),
)
)
# All storage mutations under one lock so no other pipeline can
# interleave and cause version conflicts. The lock is reentrant
# (RLock) so the individual storage methods re-acquire it safely.
updated_records: dict[str, MemoryRecord] = {} updated_records: dict[str, MemoryRecord] = {}
with self._storage.write_lock:
if dedup_deletes: if dedup_deletes:
self._storage.delete(record_ids=list(dedup_deletes)) self._storage.delete(record_ids=list(dedup_deletes))
self.state.records_deleted += len(dedup_deletes) self.state.records_deleted += len(dedup_deletes)
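
Review note on the `contextvars.copy_context().run` wrapping added to each `pool.submit` call in this file: a minimal sketch of the effect, assuming a hypothetical `request_id` context variable (not part of crewai). A `ThreadPoolExecutor` worker generally does not see values set in the submitting thread unless the callable is run inside a copied context. A fresh copy is taken per task because a single `Context` object cannot be entered by two threads at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
import contextvars

# Hypothetical context variable standing in for whatever ambient state
# (tracing ids, event scopes, ...) the submitting thread carries.
request_id = contextvars.ContextVar("request_id", default="unset")


def read_request_id(_: int) -> str:
    """Return request_id as seen by whichever thread executes the task."""
    return request_id.get()


def run_plain(n: int) -> list[str]:
    # Tasks run in the worker thread's own context; on a default CPython
    # build that context usually does not contain the submitter's values.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(read_request_id, i) for i in range(n)]
        return [f.result() for f in futures]


def run_with_context(n: int) -> list[str]:
    # One copy_context() per task: each copy is a snapshot of the
    # submitter's context and is entered by exactly one worker.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(contextvars.copy_context().run, read_request_id, i)
            for i in range(n)
        ]
        return [f.result() for f in futures]


request_id.set("req-42")
plain = run_plain(3)
propagated = run_with_context(3)
print(plain, propagated)
```

With the wrapper, every task observes the value set before submission; without it, the result depends on how the interpreter initializes worker-thread contexts.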

@@ -11,7 +11,9 @@ Implements adaptive-depth retrieval with:
from __future__ import annotations from __future__ import annotations
from concurrent.futures import ThreadPoolExecutor, as_completed from concurrent.futures import ThreadPoolExecutor, as_completed
import contextvars
from datetime import datetime from datetime import datetime
import logging
from typing import Any from typing import Any
from uuid import uuid4 from uuid import uuid4
@@ -29,6 +31,9 @@ from crewai.memory.types import (
) )
logger = logging.getLogger(__name__)
class RecallState(BaseModel): class RecallState(BaseModel):
"""State for the recall flow.""" """State for the recall flow."""
@@ -103,13 +108,12 @@ class RecallFlow(Flow[RecallState]):
) )
# Post-filter by time cutoff # Post-filter by time cutoff
if self.state.time_cutoff and raw: if self.state.time_cutoff and raw:
raw = [ raw = [(r, s) for r, s in raw if r.created_at >= self.state.time_cutoff]
(r, s) for r, s in raw if r.created_at >= self.state.time_cutoff
]
# Privacy filter # Privacy filter
if not self.state.include_private and raw: if not self.state.include_private and raw:
raw = [ raw = [
(r, s) for r, s in raw (r, s)
for r, s in raw
if not r.private or r.source == self.state.source if not r.private or r.source == self.state.source
] ]
return scope, raw return scope, raw
@@ -125,38 +129,57 @@ class RecallFlow(Flow[RecallState]):
if len(tasks) <= 1: if len(tasks) <= 1:
for emb, sc in tasks: for emb, sc in tasks:
try:
scope, results = _search_one(emb, sc) scope, results = _search_one(emb, sc)
except Exception:
logger.warning(
"Storage search failed in recall flow, skipping scope",
exc_info=True,
)
continue
if results: if results:
top_composite, _ = compute_composite_score( top_composite, _ = compute_composite_score(
results[0][0], results[0][1], self._config results[0][0], results[0][1], self._config
) )
findings.append({ findings.append(
{
"scope": scope, "scope": scope,
"results": results, "results": results,
"top_score": top_composite, "top_score": top_composite,
}) }
)
else: else:
with ThreadPoolExecutor(max_workers=min(len(tasks), 4)) as pool: with ThreadPoolExecutor(max_workers=min(len(tasks), 4)) as pool:
futures = { futures = {
pool.submit(_search_one, emb, sc): (emb, sc) pool.submit(contextvars.copy_context().run, _search_one, emb, sc): (
emb,
sc,
)
for emb, sc in tasks for emb, sc in tasks
} }
for future in as_completed(futures): for future in as_completed(futures):
try:
scope, results = future.result() scope, results = future.result()
except Exception:
logger.warning(
"Storage search failed in recall flow, skipping scope",
exc_info=True,
)
continue
if results: if results:
top_composite, _ = compute_composite_score( top_composite, _ = compute_composite_score(
results[0][0], results[0][1], self._config results[0][0], results[0][1], self._config
) )
findings.append({ findings.append(
{
"scope": scope, "scope": scope,
"results": results, "results": results,
"top_score": top_composite, "top_score": top_composite,
}) }
)
self.state.chunk_findings = findings self.state.chunk_findings = findings
self.state.confidence = max( self.state.confidence = max((f["top_score"] for f in findings), default=0.0)
(f["top_score"] for f in findings), default=0.0
)
return findings return findings
# ------------------------------------------------------------------ # ------------------------------------------------------------------
@@ -210,12 +233,16 @@ class RecallFlow(Flow[RecallState]):
# Parse time_filter into a datetime cutoff # Parse time_filter into a datetime cutoff
if analysis.time_filter: if analysis.time_filter:
try: try:
self.state.time_cutoff = datetime.fromisoformat(analysis.time_filter) self.state.time_cutoff = datetime.fromisoformat(
analysis.time_filter
)
except ValueError: except ValueError:
pass pass
# Batch-embed all sub-queries in ONE call # Batch-embed all sub-queries in ONE call
queries = analysis.recall_queries if analysis.recall_queries else [self.state.query] queries = (
analysis.recall_queries if analysis.recall_queries else [self.state.query]
)
queries = queries[:3] queries = queries[:3]
embeddings = embed_texts(self._embedder, queries) embeddings = embed_texts(self._embedder, queries)
pairs: list[tuple[str, list[float]]] = [ pairs: list[tuple[str, list[float]]] = [
@@ -237,12 +264,16 @@ class RecallFlow(Flow[RecallState]):
if analysis and analysis.suggested_scopes: if analysis and analysis.suggested_scopes:
candidates = [s for s in analysis.suggested_scopes if s] candidates = [s for s in analysis.suggested_scopes if s]
else: else:
try:
candidates = self._storage.list_scopes(scope_prefix) candidates = self._storage.list_scopes(scope_prefix)
except Exception:
logger.warning(
"Storage list_scopes failed in filter_and_chunk, "
"falling back to scope prefix",
exc_info=True,
)
candidates = []
if not candidates: if not candidates:
info = self._storage.get_scope_info(scope_prefix)
if info.record_count > 0:
candidates = [scope_prefix]
else:
candidates = [scope_prefix] candidates = [scope_prefix]
self.state.candidate_scopes = candidates[:20] self.state.candidate_scopes = candidates[:20]
return self.state.candidate_scopes return self.state.candidate_scopes
@@ -296,17 +327,21 @@ class RecallFlow(Flow[RecallState]):
response = self._llm.call([{"role": "user", "content": prompt}]) response = self._llm.call([{"role": "user", "content": prompt}])
if isinstance(response, str) and "missing" in response.lower(): if isinstance(response, str) and "missing" in response.lower():
self.state.evidence_gaps.append(response[:200]) self.state.evidence_gaps.append(response[:200])
enhanced.append({ enhanced.append(
{
"scope": finding["scope"], "scope": finding["scope"],
"extraction": response, "extraction": response,
"results": finding["results"], "results": finding["results"],
}) }
)
except Exception: except Exception:
enhanced.append({ enhanced.append(
{
"scope": finding["scope"], "scope": finding["scope"],
"extraction": "", "extraction": "",
"results": finding["results"], "results": finding["results"],
}) }
)
self.state.chunk_findings = enhanced self.state.chunk_findings = enhanced
return enhanced return enhanced
@@ -318,7 +353,7 @@ class RecallFlow(Flow[RecallState]):
@router(re_search) @router(re_search)
def re_decide_depth(self) -> str: def re_decide_depth(self) -> str:
"""Re-evaluate depth after re-search. Same logic as decide_depth.""" """Re-evaluate depth after re-search. Same logic as decide_depth."""
return self.decide_depth() return self.decide_depth() # type: ignore[call-arg]
@listen("synthesize") @listen("synthesize")
def synthesize_results(self) -> list[MemoryMatch]: def synthesize_results(self) -> list[MemoryMatch]:
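
Review note on the new `try`/`except` blocks around `future.result()` in this file: the sketch below shows the same skip-on-failure pattern in isolation. `search_scope` and the scope names are hypothetical stand-ins for the storage search, not crewai APIs. One failing scope is logged and skipped instead of aborting the whole recall.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import logging

logger = logging.getLogger(__name__)


def search_scope(scope: str) -> tuple[str, list[str]]:
    """Hypothetical search; one scope fails to simulate a storage error."""
    if scope == "/broken":
        raise RuntimeError("storage unavailable")
    return scope, [f"record-in-{scope}"]


def search_all(scopes: list[str]) -> dict[str, list[str]]:
    """Search every scope concurrently, keeping results from those that succeed."""
    found: dict[str, list[str]] = {}
    if not scopes:
        return found  # avoid creating a pool with max_workers=0
    with ThreadPoolExecutor(max_workers=min(len(scopes), 4)) as pool:
        futures = {pool.submit(search_scope, s): s for s in scopes}
        for future in as_completed(futures):
            try:
                scope, results = future.result()
            except Exception:
                # Log with traceback, then move on to the next scope.
                logger.warning(
                    "Search failed for %s, skipping scope",
                    futures[future],
                    exc_info=True,
                )
                continue
            if results:
                found[scope] = results
    return found


found = search_all(["/a", "/broken", "/b"])
print(sorted(found))
```

The trade-off is that a storage outage degrades to an empty or partial result set rather than an exception, so the warning log is the only signal that something went wrong.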

@@ -1,5 +1,6 @@
import json import json
import logging import logging
import os
from pathlib import Path from pathlib import Path
import sqlite3 import sqlite3
from typing import Any from typing import Any
@@ -8,6 +9,7 @@ from crewai.task import Task
from crewai.utilities import Printer from crewai.utilities import Printer
from crewai.utilities.crew_json_encoder import CrewJSONEncoder from crewai.utilities.crew_json_encoder import CrewJSONEncoder
from crewai.utilities.errors import DatabaseError, DatabaseOperationError from crewai.utilities.errors import DatabaseError, DatabaseOperationError
from crewai.utilities.lock_store import lock as store_lock
from crewai.utilities.paths import db_storage_path from crewai.utilities.paths import db_storage_path
@@ -24,6 +26,7 @@ class KickoffTaskOutputsSQLiteStorage:
# Get the parent directory of the default db path and create our db file there # Get the parent directory of the default db path and create our db file there
db_path = str(Path(db_storage_path()) / "latest_kickoff_task_outputs.db") db_path = str(Path(db_storage_path()) / "latest_kickoff_task_outputs.db")
self.db_path = db_path self.db_path = db_path
self._lock_name = f"sqlite:{os.path.realpath(self.db_path)}"
self._printer: Printer = Printer() self._printer: Printer = Printer()
self._initialize_db() self._initialize_db()
@@ -38,6 +41,7 @@ class KickoffTaskOutputsSQLiteStorage:
DatabaseOperationError: If database initialization fails due to SQLite errors. DatabaseOperationError: If database initialization fails due to SQLite errors.
""" """
try: try:
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn: with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("PRAGMA journal_mode=WAL") conn.execute("PRAGMA journal_mode=WAL")
cursor = conn.cursor() cursor = conn.cursor()
@@ -83,6 +87,7 @@ class KickoffTaskOutputsSQLiteStorage:
""" """
inputs = inputs or {} inputs = inputs or {}
try: try:
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn: with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION") conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor() cursor = conn.cursor()
@@ -126,6 +131,7 @@ class KickoffTaskOutputsSQLiteStorage:
DatabaseOperationError: If updating the task output fails due to SQLite errors. DatabaseOperationError: If updating the task output fails due to SQLite errors.
""" """
try: try:
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn: with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION") conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor() cursor = conn.cursor()
@@ -206,6 +212,7 @@ class KickoffTaskOutputsSQLiteStorage:
DatabaseOperationError: If deleting task outputs fails due to SQLite errors. DatabaseOperationError: If deleting task outputs fails due to SQLite errors.
""" """
try: try:
with store_lock(self._lock_name):
with sqlite3.connect(self.db_path, timeout=30) as conn: with sqlite3.connect(self.db_path, timeout=30) as conn:
conn.execute("BEGIN TRANSACTION") conn.execute("BEGIN TRANSACTION")
cursor = conn.cursor() cursor = conn.cursor()
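
Review note on `self._lock_name = f"sqlite:{os.path.realpath(self.db_path)}"`: deriving the lock name from the resolved path means two storage objects that spell the same file differently still serialize on one lock. The sketch below illustrates this with a hypothetical in-process `named_lock` registry. The real `crewai.utilities.lock_store.lock` is not shown in this diff and may be implemented differently (for example as a cross-process lock).

```python
from contextlib import contextmanager
import os
import sqlite3
import tempfile
import threading

# Hypothetical registry: lock name -> threading.Lock (in-process only).
_locks = {}
_locks_guard = threading.Lock()


@contextmanager
def named_lock(name):
    """Acquire the lock registered under `name`, creating it on first use."""
    with _locks_guard:
        lock = _locks.setdefault(name, threading.Lock())
    with lock:
        yield


def lock_name_for(db_path: str) -> str:
    # realpath collapses "." segments and symlinks, so aliases of one
    # file map to the same lock name.
    return f"sqlite:{os.path.realpath(db_path)}"


def insert_row(db_path: str, value: str) -> None:
    with named_lock(lock_name_for(db_path)):
        conn = sqlite3.connect(db_path, timeout=30)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("CREATE TABLE IF NOT EXISTS outputs (value TEXT)")
                conn.execute("INSERT INTO outputs (value) VALUES (?)", (value,))
        finally:
            conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "outputs.db")
    alias = os.path.join(tmp, ".", "outputs.db")
    same_lock = lock_name_for(path) == lock_name_for(alias)

    threads = [
        threading.Thread(target=insert_row, args=(p, f"v{i}"))
        for i, p in enumerate([path, alias, path, alias])
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    conn = sqlite3.connect(path)
    count = conn.execute("SELECT COUNT(*) FROM outputs").fetchone()[0]
    conn.close()

print(same_lock, count)
```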

@@ -2,7 +2,7 @@
from __future__ import annotations from __future__ import annotations
from contextlib import AbstractContextManager import contextvars
from datetime import datetime from datetime import datetime
import json import json
import logging import logging
@@ -10,9 +10,9 @@ import os
from pathlib import Path from pathlib import Path
import threading import threading
import time import time
from typing import Any, ClassVar from typing import Any
import lancedb import lancedb # type: ignore[import-untyped]
from crewai.memory.types import MemoryRecord, ScopeInfo from crewai.memory.types import MemoryRecord, ScopeInfo
from crewai.utilities.lock_store import lock as store_lock from crewai.utilities.lock_store import lock as store_lock
@@ -41,15 +41,6 @@ _RETRY_BASE_DELAY = 0.2 # seconds; doubles on each retry
class LanceDBStorage: class LanceDBStorage:
"""LanceDB-backed storage for the unified memory system.""" """LanceDB-backed storage for the unified memory system."""
# Class-level registry: maps resolved database path -> shared write lock.
# When multiple Memory instances (e.g. agent + crew) independently create
# LanceDBStorage pointing at the same directory, they share one lock so
# their writes don't conflict.
# Uses RLock (reentrant) so callers can hold the lock for a batch of
# operations while the individual methods re-acquire it without deadlocking.
_path_locks: ClassVar[dict[str, threading.RLock]] = {}
_path_locks_guard: ClassVar[threading.Lock] = threading.Lock()
def __init__( def __init__(
self, self,
path: str | Path | None = None, path: str | Path | None = None,
@@ -85,11 +76,6 @@ class LanceDBStorage:
self._table_name = table_name self._table_name = table_name
self._db = lancedb.connect(str(self._path)) self._db = lancedb.connect(str(self._path))
# On macOS and Linux the default per-process open-file limit is 256.
# A LanceDB table stores one file per fragment (one fragment per save()
# call by default). With hundreds of fragments, a single full-table
# scan opens all of them simultaneously, exhausting the limit.
# Raise it proactively so scans on large tables never hit OS error 24.
try: try:
import resource import resource
@@ -104,67 +90,44 @@ class LanceDBStorage:
self._lock_name = f"lancedb:{self._path.resolve()}" self._lock_name = f"lancedb:{self._path.resolve()}"
resolved = str(self._path.resolve())
with LanceDBStorage._path_locks_guard:
if resolved not in LanceDBStorage._path_locks:
LanceDBStorage._path_locks[resolved] = threading.RLock()
self._write_lock = LanceDBStorage._path_locks[resolved]
# Try to open an existing table and infer dimension from its schema. # Try to open an existing table and infer dimension from its schema.
# If no table exists yet, defer creation until the first save so the # If no table exists yet, defer creation until the first save so the
# dimension can be auto-detected from the embedder's actual output. # dimension can be auto-detected from the embedder's actual output.
try: try:
self._table: lancedb.table.Table | None = self._db.open_table( self._table: Any = self._db.open_table(self._table_name)
self._table_name
)
self._vector_dim: int = self._infer_dim_from_table(self._table) self._vector_dim: int = self._infer_dim_from_table(self._table)
# Best-effort: create the scope index if it doesn't exist yet. with store_lock(self._lock_name):
with self._file_lock():
self._ensure_scope_index() self._ensure_scope_index()
# Compact in the background if the table has accumulated many
# fragments from previous runs (each save() creates one).
self._compact_if_needed() self._compact_if_needed()
except Exception: except Exception:
_logger.debug(
"Failed to open existing LanceDB table %r", table_name, exc_info=True
)
self._table = None self._table = None
self._vector_dim = vector_dim or 0 # 0 = not yet known self._vector_dim = vector_dim or 0 # 0 = not yet known
# Explicit dim provided: create the table immediately if it doesn't exist. # Explicit dim provided: create the table immediately if it doesn't exist.
if self._table is None and vector_dim is not None: if self._table is None and vector_dim is not None:
self._vector_dim = vector_dim self._vector_dim = vector_dim
with self._file_lock(): with store_lock(self._lock_name):
self._table = self._create_table(vector_dim) self._table = self._create_table(vector_dim)
@property
def write_lock(self) -> threading.RLock:
"""The shared reentrant write lock for this database path.
Callers can acquire this to hold the lock across multiple storage
operations (e.g. delete + update + save as one atomic batch).
Individual methods also acquire it internally, but since it's
reentrant (RLock), the same thread won't deadlock.
"""
return self._write_lock
@staticmethod @staticmethod
def _infer_dim_from_table(table: lancedb.table.Table) -> int: def _infer_dim_from_table(table: Any) -> int:
"""Read vector dimension from an existing table's schema.""" """Read vector dimension from an existing table's schema."""
schema = table.schema schema = table.schema
for field in schema: for field in schema:
if field.name == "vector": if field.name == "vector":
try: try:
return field.type.list_size return int(field.type.list_size)
except Exception: except Exception:
break break
return DEFAULT_VECTOR_DIM return DEFAULT_VECTOR_DIM
def _file_lock(self) -> AbstractContextManager[None]:
"""Return a cross-process lock for serialising writes."""
return store_lock(self._lock_name)
def _do_write(self, op: str, *args: Any, **kwargs: Any) -> Any: def _do_write(self, op: str, *args: Any, **kwargs: Any) -> Any:
"""Execute a single table write with retry on commit conflicts. """Execute a single table write with retry on commit conflicts.
Caller must already hold the cross-process file lock. Caller must already hold ``store_lock(self._lock_name)``.
""" """
delay = _RETRY_BASE_DELAY delay = _RETRY_BASE_DELAY
for attempt in range(_MAX_RETRIES + 1): for attempt in range(_MAX_RETRIES + 1):
@@ -182,16 +145,16 @@ class LanceDBStorage:
) )
try: try:
self._table = self._db.open_table(self._table_name) self._table = self._db.open_table(self._table_name)
except Exception: # noqa: S110 except Exception:
pass _logger.debug("Failed to re-open table during retry", exc_info=True)
time.sleep(delay) time.sleep(delay)
delay *= 2 delay *= 2
return None # unreachable, but satisfies type checker return None # unreachable, but satisfies type checker
def _create_table(self, vector_dim: int) -> lancedb.table.Table: def _create_table(self, vector_dim: int) -> Any:
"""Create a new table with the given vector dimension. """Create a new table with the given vector dimension.
Caller must already hold the cross-process file lock. Caller must already hold ``store_lock(self._lock_name)``.
""" """
placeholder = [ placeholder = [
{ {
@@ -229,8 +192,10 @@ class LanceDBStorage:
return return
try: try:
self._table.create_scalar_index("scope", index_type="BTREE", replace=False) self._table.create_scalar_index("scope", index_type="BTREE", replace=False)
except Exception: # noqa: S110 except Exception:
pass # index already exists, table empty, or unsupported version _logger.debug(
"Scope index creation skipped (may already exist)", exc_info=True
)
# ------------------------------------------------------------------ # ------------------------------------------------------------------
# Automatic background compaction # Automatic background compaction
@@ -250,8 +215,10 @@ class LanceDBStorage:
def _compact_async(self) -> None: def _compact_async(self) -> None:
"""Fire-and-forget: compact the table in a daemon background thread.""" """Fire-and-forget: compact the table in a daemon background thread."""
ctx = contextvars.copy_context()
threading.Thread( threading.Thread(
target=self._compact_safe, target=ctx.run,
args=(self._compact_safe,),
daemon=True, daemon=True,
name="lancedb-compact", name="lancedb-compact",
).start() ).start()
@@ -260,13 +227,13 @@ class LanceDBStorage:
"""Run ``table.optimize()`` in a background thread, absorbing errors.""" """Run ``table.optimize()`` in a background thread, absorbing errors."""
try: try:
if self._table is not None: if self._table is not None:
with self._file_lock(): with store_lock(self._lock_name):
self._table.optimize() self._table.optimize()
self._ensure_scope_index() self._ensure_scope_index()
except Exception: except Exception:
_logger.debug("LanceDB background compaction failed", exc_info=True) _logger.debug("LanceDB background compaction failed", exc_info=True)
def _ensure_table(self, vector_dim: int | None = None) -> lancedb.table.Table: def _ensure_table(self, vector_dim: int | None = None) -> Any:
"""Return the table, creating it lazily if needed. """Return the table, creating it lazily if needed.
Args: Args:
@@ -332,12 +299,12 @@ class LanceDBStorage:
dim = len(r.embedding) dim = len(r.embedding)
break break
is_new_table = self._table is None is_new_table = self._table is None
with self._write_lock, self._file_lock(): with store_lock(self._lock_name):
self._ensure_table(vector_dim=dim) self._ensure_table(vector_dim=dim)
rows = [self._record_to_row(r) for r in records] rows = [self._record_to_row(rec) for rec in records]
for r in rows: for row in rows:
if r["vector"] is None or len(r["vector"]) != self._vector_dim: if row["vector"] is None or len(row["vector"]) != self._vector_dim:
r["vector"] = [0.0] * self._vector_dim row["vector"] = [0.0] * self._vector_dim
self._do_write("add", rows) self._do_write("add", rows)
if is_new_table: if is_new_table:
self._ensure_scope_index() self._ensure_scope_index()
@@ -348,7 +315,7 @@ class LanceDBStorage:
def update(self, record: MemoryRecord) -> None: def update(self, record: MemoryRecord) -> None:
"""Update a record by ID. Preserves created_at, updates last_accessed.""" """Update a record by ID. Preserves created_at, updates last_accessed."""
with self._write_lock, self._file_lock(): with store_lock(self._lock_name):
self._ensure_table() self._ensure_table()
safe_id = str(record.id).replace("'", "''") safe_id = str(record.id).replace("'", "''")
self._do_write("delete", f"id = '{safe_id}'") self._do_write("delete", f"id = '{safe_id}'")
@@ -369,7 +336,7 @@ class LanceDBStorage:
""" """
if not record_ids or self._table is None: if not record_ids or self._table is None:
return return
with self._write_lock, self._file_lock(): with store_lock(self._lock_name):
now = datetime.utcnow().isoformat() now = datetime.utcnow().isoformat()
safe_ids = [str(rid).replace("'", "''") for rid in record_ids] safe_ids = [str(rid).replace("'", "''") for rid in record_ids]
ids_expr = ", ".join(f"'{rid}'" for rid in safe_ids) ids_expr = ", ".join(f"'{rid}'" for rid in safe_ids)
@@ -435,12 +402,12 @@ class LanceDBStorage:
) -> int: ) -> int:
if self._table is None: if self._table is None:
return 0 return 0
with self._write_lock, self._file_lock(): with store_lock(self._lock_name):
if record_ids and not (categories or metadata_filter): if record_ids and not (categories or metadata_filter):
before = self._table.count_rows() before = int(self._table.count_rows())
ids_expr = ", ".join(f"'{rid}'" for rid in record_ids) ids_expr = ", ".join(f"'{rid}'" for rid in record_ids)
self._do_write("delete", f"id IN ({ids_expr})") self._do_write("delete", f"id IN ({ids_expr})")
return before - self._table.count_rows() return before - int(self._table.count_rows())
if categories or metadata_filter: if categories or metadata_filter:
rows = self._scan_rows(scope_prefix) rows = self._scan_rows(scope_prefix)
to_delete: list[str] = [] to_delete: list[str] = []
@@ -459,10 +426,10 @@ class LanceDBStorage:
to_delete.append(record.id) to_delete.append(record.id)
if not to_delete: if not to_delete:
return 0 return 0
before = self._table.count_rows() before = int(self._table.count_rows())
ids_expr = ", ".join(f"'{rid}'" for rid in to_delete) ids_expr = ", ".join(f"'{rid}'" for rid in to_delete)
self._do_write("delete", f"id IN ({ids_expr})") self._do_write("delete", f"id IN ({ids_expr})")
return before - self._table.count_rows() return before - int(self._table.count_rows())
conditions = [] conditions = []
if scope_prefix is not None and scope_prefix.strip("/"): if scope_prefix is not None and scope_prefix.strip("/"):
prefix = scope_prefix.rstrip("/") prefix = scope_prefix.rstrip("/")
@@ -472,13 +439,13 @@ class LanceDBStorage:
if older_than is not None: if older_than is not None:
conditions.append(f"created_at < '{older_than.isoformat()}'") conditions.append(f"created_at < '{older_than.isoformat()}'")
if not conditions: if not conditions:
before = self._table.count_rows() before = int(self._table.count_rows())
self._do_write("delete", "id != ''") self._do_write("delete", "id != ''")
return before - self._table.count_rows() return before - int(self._table.count_rows())
where_expr = " AND ".join(conditions) where_expr = " AND ".join(conditions)
before = self._table.count_rows() before = int(self._table.count_rows())
self._do_write("delete", where_expr) self._do_write("delete", where_expr)
return before - self._table.count_rows() return before - int(self._table.count_rows())
def _scan_rows( def _scan_rows(
self, self,
@@ -505,7 +472,8 @@ class LanceDBStorage:
q = q.where(f"scope LIKE '{scope_prefix.rstrip('/')}%'") q = q.where(f"scope LIKE '{scope_prefix.rstrip('/')}%'")
if columns is not None: if columns is not None:
q = q.select(columns) q = q.select(columns)
return q.limit(limit).to_list() result: list[dict[str, Any]] = q.limit(limit).to_list()
return result
def list_records( def list_records(
self, scope_prefix: str | None = None, limit: int = 200, offset: int = 0 self, scope_prefix: str | None = None, limit: int = 200, offset: int = 0
@@ -612,12 +580,12 @@ class LanceDBStorage:
if self._table is None: if self._table is None:
return 0 return 0
if scope_prefix is None or scope_prefix.strip("/") == "": if scope_prefix is None or scope_prefix.strip("/") == "":
return self._table.count_rows() return int(self._table.count_rows())
info = self.get_scope_info(scope_prefix) info = self.get_scope_info(scope_prefix)
return info.record_count return info.record_count
def reset(self, scope_prefix: str | None = None) -> None: def reset(self, scope_prefix: str | None = None) -> None:
with self._write_lock, self._file_lock(): with store_lock(self._lock_name):
if scope_prefix is None or scope_prefix.strip("/") == "": if scope_prefix is None or scope_prefix.strip("/") == "":
if self._table is not None: if self._table is not None:
self._db.drop_table(self._table_name) self._db.drop_table(self._table_name)
@@ -643,7 +611,7 @@ class LanceDBStorage:
""" """
if self._table is None: if self._table is None:
return return
with self._write_lock, self._file_lock(): with store_lock(self._lock_name):
self._table.optimize() self._table.optimize()
self._ensure_scope_index() self._ensure_scope_index()

View File

@@ -3,6 +3,7 @@
from __future__ import annotations from __future__ import annotations
from concurrent.futures import Future, ThreadPoolExecutor from concurrent.futures import Future, ThreadPoolExecutor
import contextvars
from datetime import datetime from datetime import datetime
import threading import threading
import time import time
@@ -229,8 +230,9 @@ class Memory(BaseModel):
If the pool has been shut down (e.g. after ``close()``), the save If the pool has been shut down (e.g. after ``close()``), the save
runs synchronously as a fallback so late saves still succeed. runs synchronously as a fallback so late saves still succeed.
""" """
ctx = contextvars.copy_context()
try: try:
future: Future[Any] = self._save_pool.submit(fn, *args, **kwargs) future: Future[Any] = self._save_pool.submit(ctx.run, fn, *args, **kwargs)
except RuntimeError: except RuntimeError:
# Pool shut down -- run synchronously as fallback # Pool shut down -- run synchronously as fallback
future = Future() future = Future()
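
Review note: the change above wraps the submitted callable in `ctx.run` so that context variables set by the caller are visible inside the pool thread. A minimal sketch of that pattern follows; `request_id` and the helper names are hypothetical and exist only for this illustration.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical context variable, used only for this illustration.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="unset"
)


def read_request_id() -> str:
    """Return the value visible to whichever thread runs this function."""
    return request_id.get()


def submit_plain(pool: ThreadPoolExecutor) -> str:
    # Without an explicit context, the worker thread usually does not see
    # values set by the submitting thread (exact behavior depends on the
    # Python version and build).
    return pool.submit(read_request_id).result()


def submit_with_context(pool: ThreadPoolExecutor) -> str:
    # Snapshot the caller's context and run the callable inside that snapshot.
    ctx = contextvars.copy_context()
    return pool.submit(ctx.run, read_request_id).result()


request_id.set("req-42")
with ThreadPoolExecutor(max_workers=1) as pool:
    plain = submit_plain(pool)
    propagated = submit_with_context(pool)
```

The snapshot is taken at submit time, so later changes in the caller are not seen by the worker, and changes made inside the worker do not leak back to the caller.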

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio import asyncio
from collections.abc import Callable from collections.abc import Callable
import contextvars
from functools import wraps from functools import wraps
import inspect import inspect
from typing import TYPE_CHECKING, Any, Concatenate, ParamSpec, TypeVar, overload from typing import TYPE_CHECKING, Any, Concatenate, ParamSpec, TypeVar, overload
@@ -169,8 +170,9 @@ def _call_method(method: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
if loop and loop.is_running(): if loop and loop.is_running():
import concurrent.futures import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as pool: with concurrent.futures.ThreadPoolExecutor() as pool:
return pool.submit(asyncio.run, result).result() return pool.submit(ctx.run, asyncio.run, result).result()
return asyncio.run(result) return asyncio.run(result)
return result return result
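
Review note: the hunk above resolves a coroutine from synchronous code, moving `asyncio.run` to a worker thread when an event loop is already running in the current thread. A self-contained sketch of that bridge, under the assumption that blocking the calling loop is acceptable; `run_coroutine_blocking` and `double` are hypothetical names.

```python
import asyncio
import concurrent.futures
import contextvars
from collections.abc import Coroutine
from typing import Any


def run_coroutine_blocking(coro: Coroutine[Any, Any, Any]) -> Any:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run can be used directly.
        return asyncio.run(coro)
    # A loop is already running here and asyncio.run would raise.
    # Run the coroutine on a fresh loop in a worker thread instead,
    # inside a copy of the caller's context.
    ctx = contextvars.copy_context()
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(ctx.run, asyncio.run, coro).result()


async def double(x: int) -> int:
    await asyncio.sleep(0)
    return x * 2


async def caller_inside_loop() -> int:
    # Called while a loop is running; this blocks that loop until the
    # worker thread finishes.
    return run_coroutine_blocking(double(21))


outside = run_coroutine_blocking(double(2))
inside = asyncio.run(caller_inside_loop())
```

Because `.result()` blocks the calling thread, this pattern stalls the outer event loop for the duration of the call. It suits short bridging calls more than long-running work.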

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio import asyncio
from collections.abc import Callable from collections.abc import Callable
import contextvars
from functools import partial from functools import partial
import inspect import inspect
from pathlib import Path from pathlib import Path
@@ -146,8 +147,9 @@ def _resolve_result(result: Any) -> Any:
if loop and loop.is_running(): if loop and loop.is_running():
import concurrent.futures import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as pool: with concurrent.futures.ThreadPoolExecutor() as pool:
return pool.submit(asyncio.run, result).result() return pool.submit(ctx.run, asyncio.run, result).result()
return asyncio.run(result) return asyncio.run(result)
return result return result

View File

@@ -1,5 +1,8 @@
"""ChromaDB client implementation.""" """ChromaDB client implementation."""
import asyncio
from collections.abc import AsyncIterator
from contextlib import AbstractContextManager, asynccontextmanager, nullcontext
import logging import logging
from typing import Any from typing import Any
@@ -29,6 +32,7 @@ from crewai.rag.core.base_client import (
BaseCollectionParams, BaseCollectionParams,
) )
from crewai.rag.types import SearchResult from crewai.rag.types import SearchResult
from crewai.utilities.lock_store import lock as store_lock
from crewai.utilities.logger_utils import suppress_logging from crewai.utilities.logger_utils import suppress_logging
@@ -52,6 +56,7 @@ class ChromaDBClient(BaseClient):
default_limit: int = 5, default_limit: int = 5,
default_score_threshold: float = 0.6, default_score_threshold: float = 0.6,
default_batch_size: int = 100, default_batch_size: int = 100,
lock_name: str = "",
) -> None: ) -> None:
"""Initialize ChromaDBClient with client and embedding function. """Initialize ChromaDBClient with client and embedding function.
@@ -61,12 +66,32 @@ class ChromaDBClient(BaseClient):
default_limit: Default number of results to return in searches. default_limit: Default number of results to return in searches.
default_score_threshold: Default minimum score for search results. default_score_threshold: Default minimum score for search results.
default_batch_size: Default batch size for adding documents. default_batch_size: Default batch size for adding documents.
lock_name: Optional lock name for cross-process synchronization.
""" """
self.client = client self.client = client
self.embedding_function = embedding_function self.embedding_function = embedding_function
self.default_limit = default_limit self.default_limit = default_limit
self.default_score_threshold = default_score_threshold self.default_score_threshold = default_score_threshold
self.default_batch_size = default_batch_size self.default_batch_size = default_batch_size
self._lock_name = lock_name
def _locked(self) -> AbstractContextManager[None]:
"""Return a cross-process lock context manager, or nullcontext if no lock name."""
return store_lock(self._lock_name) if self._lock_name else nullcontext()
@asynccontextmanager
async def _alocked(self) -> AsyncIterator[None]:
"""Async cross-process lock that acquires/releases in an executor."""
if not self._lock_name:
yield
return
lock_cm = store_lock(self._lock_name)
loop = asyncio.get_event_loop()
await loop.run_in_executor(None, lock_cm.__enter__)
try:
yield
finally:
await loop.run_in_executor(None, lock_cm.__exit__, None, None, None)
def create_collection( def create_collection(
self, **kwargs: Unpack[ChromaDBCollectionCreateParams] self, **kwargs: Unpack[ChromaDBCollectionCreateParams]
@@ -313,6 +338,7 @@ class ChromaDBClient(BaseClient):
if not documents: if not documents:
raise ValueError("Documents list cannot be empty") raise ValueError("Documents list cannot be empty")
with self._locked():
collection = self.client.get_or_create_collection( collection = self.client.get_or_create_collection(
name=_sanitize_collection_name(collection_name), name=_sanitize_collection_name(collection_name),
embedding_function=self.embedding_function, embedding_function=self.embedding_function,
@@ -363,6 +389,7 @@ class ChromaDBClient(BaseClient):
if not documents: if not documents:
raise ValueError("Documents list cannot be empty") raise ValueError("Documents list cannot be empty")
async with self._alocked():
collection = await self.client.get_or_create_collection( collection = await self.client.get_or_create_collection(
name=_sanitize_collection_name(collection_name), name=_sanitize_collection_name(collection_name),
embedding_function=self.embedding_function, embedding_function=self.embedding_function,
@@ -531,7 +558,10 @@ class ChromaDBClient(BaseClient):
) )
collection_name = kwargs["collection_name"] collection_name = kwargs["collection_name"]
self.client.delete_collection(name=_sanitize_collection_name(collection_name)) with self._locked():
self.client.delete_collection(
name=_sanitize_collection_name(collection_name)
)
async def adelete_collection(self, **kwargs: Unpack[BaseCollectionParams]) -> None: async def adelete_collection(self, **kwargs: Unpack[BaseCollectionParams]) -> None:
"""Delete a collection and all its data asynchronously. """Delete a collection and all its data asynchronously.
@@ -561,6 +591,7 @@ class ChromaDBClient(BaseClient):
) )
collection_name = kwargs["collection_name"] collection_name = kwargs["collection_name"]
async with self._alocked():
await self.client.delete_collection( await self.client.delete_collection(
name=_sanitize_collection_name(collection_name) name=_sanitize_collection_name(collection_name)
) )
@@ -586,6 +617,7 @@ class ChromaDBClient(BaseClient):
"Use areset() for AsyncClientAPI." "Use areset() for AsyncClientAPI."
) )
with self._locked():
self.client.reset() self.client.reset()
async def areset(self) -> None: async def areset(self) -> None:
@@ -612,4 +644,5 @@ class ChromaDBClient(BaseClient):
"Use reset() for ClientAPI." "Use reset() for ClientAPI."
) )
async with self._alocked():
await self.client.reset() await self.client.reset()
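
Review note: `_alocked` acquires and releases a blocking lock through the event loop's executor so that waiting for the lock does not block the loop itself. The sketch below shows the same idea with a plain `threading.Lock` as a stand-in. That substitution is an assumption, since the real `store_lock` implementation is not shown in this diff and may have different thread-affinity rules.

```python
import asyncio
import threading
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager

# Stand-in for the cross-process lock (assumption for this sketch).
_blocking_lock = threading.Lock()


@asynccontextmanager
async def alocked() -> AsyncIterator[None]:
    """Hold a blocking lock for the duration of an async block."""
    loop = asyncio.get_running_loop()
    # Acquire in a worker thread so the event loop stays responsive.
    await loop.run_in_executor(None, _blocking_lock.acquire)
    try:
        yield
    finally:
        # threading.Lock may be released from a thread other than the
        # one that acquired it, so any executor thread will do.
        await loop.run_in_executor(None, _blocking_lock.release)


async def worker(name: str, log: list[str]) -> None:
    async with alocked():
        log.append(f"{name}:enter")
        await asyncio.sleep(0.01)
        log.append(f"{name}:exit")


async def main() -> list[str]:
    log: list[str] = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


events = asyncio.run(main())
```

Each waiter occupies one executor thread while it blocks on the lock, so heavy contention can exhaust the default executor. A lock that must be released by the acquiring thread would need a different approach.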

View File

@@ -39,4 +39,5 @@ def create_client(config: ChromaDBConfig) -> ChromaDBClient:
default_limit=config.limit, default_limit=config.limit,
default_score_threshold=config.score_threshold, default_score_threshold=config.score_threshold,
default_batch_size=config.batch_size, default_batch_size=config.batch_size,
lock_name=f"chromadb:{persist_dir}",
) )

View File

@@ -2,6 +2,7 @@ from __future__ import annotations
import asyncio import asyncio
from concurrent.futures import Future from concurrent.futures import Future
import contextvars
from copy import copy as shallow_copy from copy import copy as shallow_copy
import datetime import datetime
from hashlib import md5 from hashlib import md5
@@ -524,10 +525,11 @@ class Task(BaseModel):
) -> Future[TaskOutput]: ) -> Future[TaskOutput]:
"""Execute the task asynchronously.""" """Execute the task asynchronously."""
future: Future[TaskOutput] = Future() future: Future[TaskOutput] = Future()
ctx = contextvars.copy_context()
threading.Thread( threading.Thread(
daemon=True, daemon=True,
target=self._execute_task_async, target=ctx.run,
args=(agent, context, tools, future), args=(self._execute_task_async, agent, context, tools, future),
).start() ).start()
return future return future

View File

@@ -5,6 +5,7 @@ import asyncio
from collections.abc import Awaitable, Callable from collections.abc import Awaitable, Callable
from inspect import Parameter, signature from inspect import Parameter, signature
import json import json
import threading
from typing import ( from typing import (
Any, Any,
Generic, Generic,
@@ -18,6 +19,7 @@ from pydantic import (
BaseModel as PydanticBaseModel, BaseModel as PydanticBaseModel,
ConfigDict, ConfigDict,
Field, Field,
PrivateAttr,
create_model, create_model,
field_validator, field_validator,
) )
@@ -94,6 +96,7 @@ class BaseTool(BaseModel, ABC):
default=0, default=0,
description="Current number of times this tool has been used.", description="Current number of times this tool has been used.",
) )
_usage_lock: threading.Lock = PrivateAttr(default_factory=threading.Lock)
@field_validator("args_schema", mode="before") @field_validator("args_schema", mode="before")
@classmethod @classmethod
@@ -173,6 +176,25 @@ class BaseTool(BaseModel, ABC):
) from e ) from e
return kwargs return kwargs
def _claim_usage(self) -> str | None:
"""Atomically check max usage and increment the counter.
Returns:
None if usage was claimed successfully, or an error message
string if the tool has reached its usage limit.
"""
with self._usage_lock:
if (
self.max_usage_count is not None
and self.current_usage_count >= self.max_usage_count
):
return (
f"Tool '{self.name}' has reached its usage limit of "
f"{self.max_usage_count} times and cannot be used anymore."
)
self.current_usage_count += 1
return None
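
Review note: `_claim_usage` performs the limit check and the increment under one lock, so concurrent callers cannot both pass the check before either increments. A minimal sketch of that check-and-increment pattern; `UsageLimiter` is a hypothetical stand-in and not the `BaseTool` class.

```python
from __future__ import annotations

import threading
from concurrent.futures import ThreadPoolExecutor


class UsageLimiter:
    """Hypothetical stand-in for a tool with an optional usage cap."""

    def __init__(self, max_usage_count: int | None) -> None:
        self.max_usage_count = max_usage_count
        self.current_usage_count = 0
        self._lock = threading.Lock()

    def claim(self) -> bool:
        """Atomically check the cap and count one use."""
        with self._lock:
            if (
                self.max_usage_count is not None
                and self.current_usage_count >= self.max_usage_count
            ):
                return False
            self.current_usage_count += 1
            return True


limiter = UsageLimiter(max_usage_count=3)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda _: limiter.claim(), range(50)))

granted = sum(results)
```

One consequence visible in the diff: the counter now increments before the tool body runs, so a call that later raises still counts as a use.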
def run( def run(
self, self,
*args: Any, *args: Any,
@@ -181,13 +203,15 @@ class BaseTool(BaseModel, ABC):
if not args: if not args:
kwargs = self._validate_kwargs(kwargs) kwargs = self._validate_kwargs(kwargs)
limit_error = self._claim_usage()
if limit_error:
return limit_error
result = self._run(*args, **kwargs) result = self._run(*args, **kwargs)
if asyncio.iscoroutine(result): if asyncio.iscoroutine(result):
result = asyncio.run(result) result = asyncio.run(result)
self.current_usage_count += 1
return result return result
async def arun( async def arun(
@@ -206,9 +230,12 @@ class BaseTool(BaseModel, ABC):
""" """
if not args: if not args:
kwargs = self._validate_kwargs(kwargs) kwargs = self._validate_kwargs(kwargs)
result = await self._arun(*args, **kwargs)
self.current_usage_count += 1 limit_error = self._claim_usage()
return result if limit_error:
return limit_error
return await self._arun(*args, **kwargs)
async def _arun( async def _arun(
self, self,
@@ -361,12 +388,15 @@ class Tool(BaseTool, Generic[P, R]):
if not args: if not args:
kwargs = self._validate_kwargs(kwargs) # type: ignore[assignment] kwargs = self._validate_kwargs(kwargs) # type: ignore[assignment]
limit_error = self._claim_usage()
if limit_error:
return limit_error # type: ignore[return-value]
result = self.func(*args, **kwargs) result = self.func(*args, **kwargs)
if asyncio.iscoroutine(result): if asyncio.iscoroutine(result):
result = asyncio.run(result) result = asyncio.run(result)
self.current_usage_count += 1
return result # type: ignore[return-value] return result # type: ignore[return-value]
def _run(self, *args: P.args, **kwargs: P.kwargs) -> R: def _run(self, *args: P.args, **kwargs: P.kwargs) -> R:
@@ -393,9 +423,12 @@ class Tool(BaseTool, Generic[P, R]):
""" """
if not args: if not args:
kwargs = self._validate_kwargs(kwargs) # type: ignore[assignment] kwargs = self._validate_kwargs(kwargs) # type: ignore[assignment]
result = await self._arun(*args, **kwargs)
self.current_usage_count += 1 limit_error = self._claim_usage()
return result if limit_error:
return limit_error # type: ignore[return-value]
return await self._arun(*args, **kwargs)
async def _arun(self, *args: P.args, **kwargs: P.kwargs) -> R: async def _arun(self, *args: P.args, **kwargs: P.kwargs) -> R:
"""Executes the wrapped function asynchronously. """Executes the wrapped function asynchronously.

View File

@@ -7,6 +7,7 @@ concurrently by the executor.
import asyncio import asyncio
from collections.abc import Callable from collections.abc import Callable
import contextvars
from typing import Any from typing import Any
from crewai.tools import BaseTool from crewai.tools import BaseTool
@@ -84,9 +85,10 @@ class MCPNativeTool(BaseTool):
import concurrent.futures import concurrent.futures
ctx = contextvars.copy_context()
with concurrent.futures.ThreadPoolExecutor() as executor: with concurrent.futures.ThreadPoolExecutor() as executor:
coro = self._run_async(**kwargs) coro = self._run_async(**kwargs)
future = executor.submit(asyncio.run, coro) future = executor.submit(ctx.run, asyncio.run, coro)
return future.result() return future.result()
except RuntimeError: except RuntimeError:
return asyncio.run(self._run_async(**kwargs)) return asyncio.run(self._run_async(**kwargs))

View File

@@ -74,9 +74,28 @@
"consolidation_user": "New content to consider storing:\n{new_content}\n\nExisting similar memories:\n{records_summary}\n\nReturn the consolidation plan as structured output." "consolidation_user": "New content to consider storing:\n{new_content}\n\nExisting similar memories:\n{records_summary}\n\nReturn the consolidation plan as structured output."
}, },
"reasoning": { "reasoning": {
"initial_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are creating a strategic plan for a task that requires your expertise and unique perspective.", "initial_plan": "You are {role}. Create a focused execution plan using only the essential steps needed.",
"refine_plan": "You are {role}, a professional with the following background: {backstory}\n\nYour primary goal is: {goal}\n\nAs {role}, you are refining a strategic plan for a task that requires your expertise and unique perspective.", "refine_plan": "You are {role}. Refine your plan to address the specific gap while keeping it minimal.",
"create_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou have been assigned the following task:\n{description}\n\nExpected output:\n{expected_output}\n\nAvailable tools: {tools}\n\nBefore executing this task, create a detailed plan that leverages your expertise as {role} and outlines:\n1. Your understanding of the task from your professional perspective\n2. The key steps you'll take to complete it, drawing on your background and skills\n3. How you'll approach any challenges that might arise, considering your expertise\n4. How you'll strategically use the available tools based on your experience, exactly what tools to use and how to use them\n5. The expected outcome and how it aligns with your goal\n\nAfter creating your plan, assess whether you feel ready to execute the task or if you could do better.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"", "create_plan_prompt": "You are {role}.\n\nTask: {description}\n\nExpected output: {expected_output}\n\nAvailable tools: {tools}\n\nCreate a focused plan with ONLY the essential steps needed. Most tasks require just 2-5 steps. Do NOT pad with unnecessary steps like \"review\", \"verify\", \"document\", or \"finalize\" unless explicitly required.\n\nFor each step, specify the action and which tool to use (if any).\n\nConclude with:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
"refine_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou created the following plan for this task:\n{current_plan}\n\nHowever, you indicated that you're not ready to execute the task yet.\n\nPlease refine your plan further, drawing on your expertise as {role} to address any gaps or uncertainties. As you refine your plan, be specific about which available tools you will use, how you will use them, and why they are the best choices for each step. Clearly outline your tool usage strategy as part of your improved plan.\n\nAfter refining your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan further because [specific reason].\"" "refine_plan_prompt": "Your plan:\n{current_plan}\n\nYou indicated you're not ready. Address the specific gap while keeping the plan minimal.\n\nConclude with READY or NOT READY."
},
"planning": {
"system_prompt": "You are a strategic planning assistant. Create concrete, executable plans where every step produces a verifiable result.",
"create_plan_prompt": "Create an execution plan for the following task:\n\n## Task\n{description}\n\n## Expected Output\n{expected_output}\n\n## Available Tools\n{tools}\n\n## Planning Principles\nFocus on CONCRETE, EXECUTABLE steps. Each step must clearly state WHAT ACTION to take and HOW to verify it succeeded. The number of steps should match the task complexity. Hard limit: {max_steps} steps.\n\n## Rules:\n- Each step must have a clear DONE criterion\n- Do NOT group unrelated actions: if steps can fail independently, keep them separate\n- NO standalone \"thinking\" or \"planning\" steps — act, don't just observe\n- The last step must produce the required output\n\nAfter your plan, state READY or NOT READY.",
"refine_plan_prompt": "Your previous plan:\n{current_plan}\n\nYou indicated you weren't ready. Refine your plan to address the specific gap.\n\nKeep the plan minimal - only add steps that directly address the issue.\n\nConclude with READY or NOT READY as before.",
"observation_system_prompt": "You are a Planning Agent observing execution progress. After each step completes, you analyze what happened and decide whether the remaining plan is still valid.\n\nReason step-by-step about:\n1. Did this step produce a concrete, verifiable result? (file created, command succeeded, service running, etc.) — or did it only explore without acting?\n2. What new information was learned from this step's result?\n3. Whether the remaining steps still make sense given this new information\n4. What refinements, if any, are needed for upcoming steps\n5. Whether the overall goal has already been achieved\n\nCritical: mark `step_completed_successfully=false` if:\n- The step result is only exploratory (ls, pwd, cat) without producing the required artifact or action\n- A command returned a non-zero exit code and the error was not recovered\n- The step description required creating/building/starting something and the result shows it was not done\n\nBe conservative about triggering full replans — only do so when the remaining plan is fundamentally wrong, not just suboptimal.\n\nIMPORTANT: Set step_completed_successfully=false if:\n- The step's stated goal was NOT achieved (even if other things were done)\n- The first meaningful action returned an error (file not found, command not found, etc.)\n- The result is exploration/discovery output rather than the concrete action the step required\n- The step ran out of attempts without producing the required output\nSet needs_full_replan=true if the current plan's remaining steps reference paths or state that don't exist yet and need to be created first.",
"observation_user_prompt": "## Original task\n{task_description}\n\n## Expected output\n{task_goal}\n{completed_summary}\n\n## Just completed step {step_number}\nDescription: {step_description}\nResult: {step_result}\n{remaining_summary}\n\nAnalyze this step's result and provide your observation.",
"step_executor_system_prompt": "You are {role}. {backstory}\n\nYour goal: {goal}\n\nYou are executing ONE specific step in a larger plan. Your ONLY job is to fully complete this step — not to plan ahead.\n\nKey rules:\n- **ACT FIRST.** Execute the primary action of this step immediately. Do NOT read or explore files before attempting the main action unless exploration IS the step's goal.\n- If the step says 'run X', run X NOW. If it says 'write file Y', write Y NOW.\n- If the step requires producing an output file (e.g. /app/move.txt, report.jsonl, summary.csv), you MUST write that file using a tool call — do NOT just state the answer in text.\n- You may use tools MULTIPLE TIMES. After each tool use, check the result. If it failed, try a different approach.\n- Only output your Final Answer AFTER the concrete outcome is verified (file written, build succeeded, command exited 0).\n- If a command is not found or a path does not exist, fix it (different PATH, install missing deps, use absolute paths).\n- Do NOT spend more than 3 tool calls on exploration/analysis before attempting the primary action.{tools_section}",
"step_executor_tools_section": "\n\nAvailable tools: {tool_names}\n\nYou may call tools multiple times in sequence. Use this format for EACH tool call:\nThought: <what you observed and what you will try next>\nAction: <tool_name>\nAction Input: <input>\n\nAfter observing each result, decide: is the step complete? If yes:\nThought: The step is done because <evidence>\nFinal Answer: <concise summary of what was accomplished and the key result>",
"step_executor_user_prompt": "## Current Step\n{step_description}",
"step_executor_suggested_tool": "\nSuggested tool: {tool_to_use}",
"step_executor_context_header": "\n## Context from previous steps:",
"step_executor_context_entry": "Step {step_number} result: {result}",
"step_executor_complete_step": "\n**Execute the primary action of this step NOW.** If the step requires writing a file, write it. If it requires running a command, run it. Verify the outcome with a follow-up tool call, then give your Final Answer. Your Final Answer must confirm what was DONE (file created at path X, command succeeded), not just what should be done.",
"todo_system_prompt": "You are {role}. Your goal: {goal}\n\nYou are executing a specific step in a multi-step plan. Focus only on completing the current step. Use the suggested tool if one is provided. Be concise and provide clear results that can be used by subsequent steps.",
"synthesis_system_prompt": "You are {role}. You have completed a multi-step task. Synthesize the results from all steps into a single, coherent final response that directly addresses the original task. Do NOT list step numbers or say 'Step 1 result'. Produce a clean, polished answer as if you did it all at once.",
"synthesis_user_prompt": "## Original Task\n{task_description}\n\n## Results from each step\n{combined_steps}\n\nSynthesize these results into a single, coherent final answer.",
"replan_enhancement_prompt": "\n\nIMPORTANT: Previous execution attempt did not fully succeed. Please create a revised plan that accounts for the following context from the previous attempt:\n\n{previous_context}\n\nConsider:\n1. What steps succeeded and can be built upon\n2. What steps failed and why they might have failed\n3. Alternative approaches that might work better\n4. Whether dependencies need to be restructured",
"step_executor_task_context": "## Task Context\nThe following is the full task you are helping complete. Keep this in mind — especially any required output files, exact filenames, and expected formats.\n\n{task_context}\n\n---\n"
} }
} }

View File

@@ -3,6 +3,9 @@ from __future__ import annotations
import asyncio import asyncio
from collections.abc import Callable, Sequence from collections.abc import Callable, Sequence
import concurrent.futures import concurrent.futures
import contextvars
from dataclasses import dataclass, field
from datetime import datetime
import inspect import inspect
import json import json
import re import re
@@ -39,6 +42,7 @@ from crewai.utilities.types import LLMMessage
if TYPE_CHECKING: if TYPE_CHECKING:
from crewai.agent import Agent from crewai.agent import Agent
from crewai.agents.crew_agent_executor import CrewAgentExecutor from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.agents.tools_handler import ToolsHandler
from crewai.experimental.agent_executor import AgentExecutor from crewai.experimental.agent_executor import AgentExecutor
from crewai.lite_agent import LiteAgent from crewai.lite_agent import LiteAgent
from crewai.llm import LLM from crewai.llm import LLM
@@ -210,6 +214,30 @@ def convert_tools_to_openai_schema(
return openai_tools, available_functions, tool_name_mapping return openai_tools, available_functions, tool_name_mapping
def extract_task_section(text: str) -> str:
"""Extract the ## Task body from a structured enriched instruction.
For structured descriptions (e.g. with ## Task and ## Instructions sections),
extracts just the task body so the caller sees the requirements without
duplicating tool/verification instructions.
Falls back to the full text (up to 2000 chars, with a truncation marker)
for plain inputs.
"""
for marker in ("\n## Task\n", "\n## Task:", "## Task\n"):
idx = text.find(marker)
if idx >= 0:
start = idx + len(marker)
for end_marker in ("\n---\n", "\n## "):
end = text.find(end_marker, start)
if end > 0:
return text[start:end].strip()
return text[start : start + 2000].strip()
if len(text) > 2000:
return text[:2000] + "\n... [truncated]"
return text
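
Review note: a few sample inputs make the behavior of `extract_task_section` easier to check. The function body below is transcribed from the added lines above, with indentation inferred because the flattened diff does not preserve it. The sample strings are invented for illustration.

```python
def extract_task_section(text: str) -> str:
    # Transcribed from the diff; indentation is an assumption.
    for marker in ("\n## Task\n", "\n## Task:", "## Task\n"):
        idx = text.find(marker)
        if idx >= 0:
            start = idx + len(marker)
            for end_marker in ("\n---\n", "\n## "):
                end = text.find(end_marker, start)
                if end > 0:
                    return text[start:end].strip()
            # No closing marker: cap the body at 2000 characters.
            return text[start : start + 2000].strip()
    if len(text) > 2000:
        return text[:2000] + "\n... [truncated]"
    return text


structured = (
    "## Task\nWrite the report to out.txt\n\n"
    "## Instructions\nUse the file tool.\n"
)
task_body = extract_task_section(structured)

plain = extract_task_section("Summarize the meeting notes.")
long_plain = extract_task_section("x" * 2500)
```

Note that the truncation marker is only appended on the plain-text fallback path. A `## Task` body with no closing marker is cut at 2000 characters silently.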
def has_reached_max_iterations(iterations: int, max_iterations: int) -> bool: def has_reached_max_iterations(iterations: int, max_iterations: int) -> bool:
"""Check if the maximum number of iterations has been reached. """Check if the maximum number of iterations has been reached.
@@ -335,6 +363,66 @@ def enforce_rpm_limit(
request_within_rpm_limit() request_within_rpm_limit()
def _prepare_llm_call(
executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
messages: list[LLMMessage],
printer: Printer,
verbose: bool = True,
) -> list[LLMMessage]:
"""Shared pre-call logic: run before hooks and resolve messages.
Args:
executor_context: Optional executor context for hook invocation.
messages: The messages to send to the LLM.
printer: Printer instance for output.
verbose: Whether to print output.
Returns:
The resolved messages list (may come from executor_context).
Raises:
ValueError: If a before hook blocks the call.
"""
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
return messages
def _validate_and_finalize_llm_response(
answer: Any,
executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
printer: Printer,
verbose: bool = True,
) -> str | BaseModel | Any:
"""Shared post-call logic: validate response and run after hooks.
Args:
answer: The raw LLM response.
executor_context: Optional executor context for hook invocation.
printer: Printer instance for output.
verbose: Whether to print output.
Returns:
The potentially modified response.
Raises:
ValueError: If the response is None or empty.
"""
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks(
executor_context, answer, printer, verbose=verbose
)
def get_llm_response( def get_llm_response(
llm: LLM | BaseLLM, llm: LLM | BaseLLM,
messages: list[LLMMessage], messages: list[LLMMessage],
@@ -371,11 +459,7 @@ def get_llm_response(
Exception: If an error occurs. Exception: If an error occurs.
ValueError: If the response is None or empty. ValueError: If the response is None or empty.
""" """
messages = _prepare_llm_call(executor_context, messages, printer, verbose=verbose)
if executor_context is not None:
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
try: try:
answer = llm.call( answer = llm.call(
@@ -389,16 +473,9 @@ def get_llm_response(
) )
except Exception as e: except Exception as e:
raise e raise e
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks( return _validate_and_finalize_llm_response(
executor_context, answer, printer, verbose=verbose answer, executor_context, printer, verbose=verbose
) )
@@ -428,6 +505,7 @@ async def aget_llm_response(
from_agent: Optional agent context for the LLM call. from_agent: Optional agent context for the LLM call.
response_model: Optional Pydantic model for structured outputs. response_model: Optional Pydantic model for structured outputs.
executor_context: Optional executor context for hook invocation. executor_context: Optional executor context for hook invocation.
verbose: Whether to print output.
Returns: Returns:
The response from the LLM as a string, Pydantic model (when response_model is provided), The response from the LLM as a string, Pydantic model (when response_model is provided),
@@ -437,10 +515,7 @@ async def aget_llm_response(
Exception: If an error occurs. Exception: If an error occurs.
ValueError: If the response is None or empty. ValueError: If the response is None or empty.
""" """
if executor_context is not None: messages = _prepare_llm_call(executor_context, messages, printer, verbose=verbose)
if not _setup_before_llm_call_hooks(executor_context, printer, verbose=verbose):
raise ValueError("LLM call blocked by before_llm_call hook")
messages = executor_context.messages
try: try:
answer = await llm.acall( answer = await llm.acall(
@@ -454,16 +529,9 @@ async def aget_llm_response(
) )
except Exception as e: except Exception as e:
raise e raise e
if not answer:
if verbose:
printer.print(
content="Received None or empty response from LLM call.",
color="red",
)
raise ValueError("Invalid response from LLM call - None or empty.")
return _setup_after_llm_call_hooks( return _validate_and_finalize_llm_response(
executor_context, answer, printer, verbose=verbose answer, executor_context, printer, verbose=verbose
) )
@@ -907,8 +975,9 @@ def summarize_messages(
         chunks=chunks, llm=llm, callbacks=callbacks, i18n=i18n
     )
     if is_inside_event_loop():
+        ctx = contextvars.copy_context()
         with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
-            summarized_contents = pool.submit(asyncio.run, coro).result()
+            summarized_contents = pool.submit(ctx.run, asyncio.run, coro).result()
     else:
         summarized_contents = asyncio.run(coro)
@@ -1157,6 +1226,372 @@ def extract_tool_call_info(
return None return None
def is_tool_call_list(response: list[Any]) -> bool:
"""Check if a response from the LLM is a list of tool calls.
Supports OpenAI, Anthropic, Bedrock, and Gemini formats.
Args:
response: The response to check.
Returns:
True if the response appears to be a list of tool calls.
"""
if not response:
return False
first_item = response[0]
# OpenAI-style
if hasattr(first_item, "function") or (
isinstance(first_item, dict) and "function" in first_item
):
return True
# Anthropic-style (ToolUseBlock)
if hasattr(first_item, "type") and getattr(first_item, "type", None) == "tool_use":
return True
if hasattr(first_item, "name") and hasattr(first_item, "input"):
return True
# Bedrock-style
if isinstance(first_item, dict) and "name" in first_item and "input" in first_item:
return True
# Gemini-style
if hasattr(first_item, "function_call") and first_item.function_call:
return True
return False
def check_native_tool_support(llm: Any, original_tools: list[BaseTool] | None) -> bool:
"""Check if the LLM supports native function calling and tools are available.
Args:
llm: The LLM instance.
original_tools: Original BaseTool instances.
Returns:
True if native function calling is supported and tools exist.
"""
return (
hasattr(llm, "supports_function_calling")
and callable(getattr(llm, "supports_function_calling", None))
and llm.supports_function_calling()
and bool(original_tools)
)
def setup_native_tools(
original_tools: list[BaseTool],
) -> tuple[
list[dict[str, Any]],
dict[str, Callable[..., Any]],
dict[str, BaseTool | CrewStructuredTool],
]:
"""Convert tools to OpenAI schema format for native function calling.
Args:
original_tools: Original BaseTool instances.
Returns:
Tuple of (openai_tools_schema, available_functions_dict, tool_name_mapping).
"""
return convert_tools_to_openai_schema(original_tools)
def build_tool_calls_assistant_message(
tool_calls: list[Any],
) -> tuple[LLMMessage | None, list[dict[str, Any]]]:
"""Build an assistant message containing tool call reports.
Extracts info from each tool call, builds the standard assistant message
format, and preserves raw Gemini parts when applicable.
Args:
tool_calls: Raw tool call objects from the LLM response.
Returns:
Tuple of (assistant_message, tool_calls_to_report).
assistant_message is None if no valid tool calls found.
"""
tool_calls_to_report: list[dict[str, Any]] = []
for tool_call in tool_calls:
info = extract_tool_call_info(tool_call)
if not info:
continue
call_id, func_name, func_args = info
tool_calls_to_report.append(
{
"id": call_id,
"type": "function",
"function": {
"name": func_name,
"arguments": func_args
if isinstance(func_args, str)
else json.dumps(func_args),
},
}
)
if not tool_calls_to_report:
return None, []
assistant_message: LLMMessage = {
"role": "assistant",
"content": None,
"tool_calls": tool_calls_to_report,
}
# Preserve raw parts for Gemini compatibility
if all(type(tc).__qualname__ == "Part" for tc in tool_calls):
assistant_message["raw_tool_call_parts"] = list(tool_calls)
return assistant_message, tool_calls_to_report
@dataclass
class NativeToolCallResult:
"""Result from executing a single native tool call."""
call_id: str
func_name: str
result: str
from_cache: bool = False
result_as_answer: bool = False
tool_message: LLMMessage = field(default_factory=dict) # type: ignore[assignment]
def execute_single_native_tool_call(
tool_call: Any,
*,
available_functions: dict[str, Callable[..., Any]],
original_tools: list[BaseTool],
structured_tools: list[CrewStructuredTool] | None,
tools_handler: ToolsHandler | None,
agent: Agent | None,
task: Task | None,
crew: Any | None,
event_source: Any,
printer: Printer | None = None,
verbose: bool = False,
) -> NativeToolCallResult:
"""Execute a single native tool call with full lifecycle management.
Handles: arg parsing, tool lookup, max-usage check, cache read/write,
before/after hooks, event emission, and result_as_answer detection.
Args:
tool_call: Raw tool call object from the LLM.
available_functions: Map of sanitized tool name -> callable.
original_tools: Original BaseTool list (for cache_function, result_as_answer).
structured_tools: Structured tools list (for hook context).
tools_handler: Optional handler with cache.
agent: The agent instance.
task: The current task.
crew: The crew instance.
event_source: The object to use as event emitter source.
printer: Optional printer for verbose logging.
verbose: Whether to print verbose output.
Returns:
NativeToolCallResult with all execution details.
"""
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import (
ToolUsageErrorEvent,
ToolUsageFinishedEvent,
ToolUsageStartedEvent,
)
from crewai.hooks.tool_hooks import (
ToolCallHookContext,
get_after_tool_call_hooks,
get_before_tool_call_hooks,
)
info = extract_tool_call_info(tool_call)
if not info:
return NativeToolCallResult(
call_id="", func_name="", result="Unrecognized tool call format"
)
call_id, func_name, func_args = info
# Parse arguments
if isinstance(func_args, str):
try:
args_dict = json.loads(func_args)
except json.JSONDecodeError:
args_dict = {}
else:
args_dict = func_args
agent_key = getattr(agent, "key", "unknown") if agent else "unknown"
# Find original tool for cache_function and result_as_answer
original_tool: BaseTool | None = None
for tool in original_tools:
if sanitize_tool_name(tool.name) == func_name:
original_tool = tool
break
# Check cache
from_cache = False
input_str = json.dumps(args_dict) if args_dict else ""
result = "Tool not found"
if tools_handler and tools_handler.cache:
cached_result = tools_handler.cache.read(tool=func_name, input=input_str)
if cached_result is not None:
result = (
str(cached_result)
if not isinstance(cached_result, str)
else cached_result
)
from_cache = True
# Emit tool started event
started_at = datetime.now()
crewai_event_bus.emit(
event_source,
event=ToolUsageStartedEvent(
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
),
)
track_delegation_if_needed(func_name, args_dict, task)
# Find structured tool for hooks
structured_tool: CrewStructuredTool | None = None
for structured in structured_tools or []:
if sanitize_tool_name(structured.name) == func_name:
structured_tool = structured
break
# Before hooks
hook_blocked = False
before_hook_context = ToolCallHookContext(
tool_name=func_name,
tool_input=args_dict,
tool=structured_tool, # type: ignore[arg-type]
agent=agent,
task=task,
crew=crew,
)
try:
for hook in get_before_tool_call_hooks():
if hook(before_hook_context) is False:
hook_blocked = True
break
except Exception: # noqa: S110
pass
error_event_emitted = False
if hook_blocked:
result = f"Tool execution blocked by hook. Tool: {func_name}"
elif not from_cache:
if func_name in available_functions:
try:
tool_func = available_functions[func_name]
raw_result = tool_func(**args_dict)
# Cache result
if tools_handler and tools_handler.cache:
should_cache = True
if original_tool:
should_cache = original_tool.cache_function(
args_dict, raw_result
)
if should_cache:
tools_handler.cache.add(
tool=func_name, input=input_str, output=raw_result
)
result = (
str(raw_result) if not isinstance(raw_result, str) else raw_result
)
except Exception as e:
result = f"Error executing tool: {e}"
if task:
task.increment_tools_errors()
crewai_event_bus.emit(
event_source,
event=ToolUsageErrorEvent(
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
error=e,
),
)
error_event_emitted = True
# After hooks
after_hook_context = ToolCallHookContext(
tool_name=func_name,
tool_input=args_dict,
tool=structured_tool, # type: ignore[arg-type]
agent=agent,
task=task,
crew=crew,
tool_result=result,
)
try:
for after_hook in get_after_tool_call_hooks():
hook_result = after_hook(after_hook_context)
if hook_result is not None:
result = hook_result
after_hook_context.tool_result = result
except Exception: # noqa: S110
pass
# Emit tool finished event (only if error event wasn't already emitted)
if not error_event_emitted:
crewai_event_bus.emit(
event_source,
event=ToolUsageFinishedEvent(
output=result,
tool_name=func_name,
tool_args=args_dict,
from_agent=agent,
from_task=task,
agent_key=agent_key,
started_at=started_at,
finished_at=datetime.now(),
),
)
# Build tool result message
tool_message: LLMMessage = {
"role": "tool",
"tool_call_id": call_id,
"name": func_name,
"content": result,
}
if verbose and printer:
cache_info = " (from cache)" if from_cache else ""
printer.print(
content=f"Tool {func_name} executed with result{cache_info}: {result[:200]}...",
color="green",
)
# Check result_as_answer
is_result_as_answer = bool(
original_tool
and hasattr(original_tool, "result_as_answer")
and original_tool.result_as_answer
)
return NativeToolCallResult(
call_id=call_id,
func_name=func_name,
result=result,
from_cache=from_cache,
result_as_answer=is_result_as_answer,
tool_message=tool_message,
)
def parse_tool_call_args( def parse_tool_call_args(
func_args: dict[str, Any] | str, func_args: dict[str, Any] | str,
func_name: str, func_name: str,
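
The duck-typed detection in `is_tool_call_list` and the normalization in `build_tool_calls_assistant_message` can be exercised without any provider SDK. The sketch below is a condensed stand-in, not the code from this commit: `looks_like_tool_calls`, `to_openai_tool_call`, and the sample payloads are hypothetical names and data chosen for illustration.

```python
from __future__ import annotations

import json
from types import SimpleNamespace
from typing import Any


def looks_like_tool_calls(response: list[Any]) -> bool:
    """Condensed stand-in for is_tool_call_list: inspect only the first item."""
    if not response:
        return False
    first = response[0]
    if isinstance(first, dict):
        # OpenAI-style dicts carry "function"; Bedrock-style carry "name" + "input".
        return "function" in first or ("name" in first and "input" in first)
    # Object-style payloads: OpenAI (.function), Anthropic (.type == "tool_use"
    # or .name + .input), Gemini (truthy .function_call).
    return bool(
        hasattr(first, "function")
        or getattr(first, "type", None) == "tool_use"
        or (hasattr(first, "name") and hasattr(first, "input"))
        or getattr(first, "function_call", None)
    )


def to_openai_tool_call(
    call_id: str, name: str, args: dict[str, Any] | str
) -> dict[str, Any]:
    """Serialize arguments to a JSON string unless they already are one."""
    return {
        "id": call_id,
        "type": "function",
        "function": {
            "name": name,
            "arguments": args if isinstance(args, str) else json.dumps(args),
        },
    }


# Hypothetical sample payloads in three shapes.
openai_style = [{"id": "call_1", "function": {"name": "search", "arguments": "{}"}}]
anthropic_style = [
    SimpleNamespace(type="tool_use", id="tu_1", name="search", input={"q": "x"})
]
plain_text = ["just a string"]

report = to_openai_tool_call("tu_1", "search", {"q": "x"})
```

Only the first element is inspected, so a mixed list would be classified by whatever happens to come first; the commit's version makes the same trade-off.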

View File

@@ -6,6 +6,8 @@ from typing import Any, TypedDict
from typing_extensions import Unpack from typing_extensions import Unpack
from crewai.utilities.lock_store import lock as store_lock
class LogEntry(TypedDict, total=False): class LogEntry(TypedDict, total=False):
"""TypedDict for log entry kwargs with optional fields for flexibility.""" """TypedDict for log entry kwargs with optional fields for flexibility."""
@@ -90,6 +92,7 @@ class FileHandler:
ValueError: If logging fails. ValueError: If logging fails.
""" """
try: try:
with store_lock(f"file:{os.path.realpath(self._path)}"):
now = datetime.now().strftime("%Y-%m-%d %H:%M:%S") now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
log_entry = {"timestamp": now, **kwargs} log_entry = {"timestamp": now, **kwargs}
@@ -112,7 +115,9 @@ class FileHandler:
# Append log in plain text format # Append log in plain text format
message = ( message = (
f"{now}: " f"{now}: "
+ ", ".join([f'{key}="{value}"' for key, value in kwargs.items()]) + ", ".join(
[f'{key}="{value}"' for key, value in kwargs.items()]
)
+ "\n" + "\n"
) )
with open(self._path, "a", encoding="utf-8") as file: with open(self._path, "a", encoding="utf-8") as file:
@@ -153,6 +158,7 @@ class PickleHandler:
         Args:
             data: The data to be saved to the file.
         """
+        with store_lock(f"file:{os.path.realpath(self.file_path)}"):
             with open(self.file_path, "wb") as f:
                 pickle.dump(obj=data, file=f)
@@ -162,13 +168,17 @@ class PickleHandler:
         Returns:
             The data loaded from the file.
         """
-        if not os.path.exists(self.file_path) or os.path.getsize(self.file_path) == 0:
-            return {}  # Return an empty dictionary if the file does not exist or is empty
+        with store_lock(f"file:{os.path.realpath(self.file_path)}"):
+            if (
+                not os.path.exists(self.file_path)
+                or os.path.getsize(self.file_path) == 0
+            ):
+                return {}
             with open(self.file_path, "rb") as file:
                 try:
                     return pickle.load(file)  # noqa: S301
                 except EOFError:
-                    return {}  # Return an empty dictionary if the file is empty or corrupted
+                    return {}
                 except Exception:
-                    raise  # Raise any other exceptions that occur during loading
+                    raise
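
The lock names in these handlers are built from `os.path.realpath(...)`, presumably so that different spellings of the same file map to one lock. A minimal sketch of that idea follows; `lock_key_for` is a hypothetical helper that only mirrors the `f"file:{...}"` pattern and does not call `store_lock` itself.

```python
import os
import tempfile


def lock_key_for(path: str) -> str:
    """Build a lock name from the canonical path (hypothetical helper)."""
    return f"file:{os.path.realpath(path)}"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "training_data.pkl")
    # A second spelling of the same file that goes through a redundant "." segment.
    alias = os.path.join(tmp, ".", "training_data.pkl")

    # The raw strings differ, but the canonicalized lock keys agree.
    raw_strings_differ = target != alias
    same_key = lock_key_for(target) == lock_key_for(alias)
```

Keying on the raw path string instead would let two writers using different spellings take different locks for the same file.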

View File

@@ -5,6 +5,7 @@ from __future__ import annotations
 import asyncio
 from collections.abc import Coroutine
 import concurrent.futures
+import contextvars
 import logging
 from typing import TYPE_CHECKING, TypeVar
 from uuid import UUID
@@ -46,8 +47,9 @@ def _run_sync(coro: Coroutine[None, None, T]) -> T:
     """
     try:
         asyncio.get_running_loop()
+        ctx = contextvars.copy_context()
         with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
-            future = executor.submit(asyncio.run, coro)
+            future = executor.submit(ctx.run, asyncio.run, coro)
             return future.result()
     except RuntimeError:
         return asyncio.run(coro)
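
The `ctx.run` change matters because a worker thread starts with an empty context, so any `ContextVar` set by the caller is invisible to a coroutine run there unless the context is copied across explicitly. The following is a small sketch of that behavior using only the standard library; `request_id`, `read_request_id`, and `run_in_worker` are illustrative names, not part of this commit.

```python
import asyncio
import concurrent.futures
import contextvars

# Illustrative context variable; the default is what a fresh thread context sees.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="unset"
)


async def read_request_id() -> str:
    return request_id.get()


def run_in_worker(propagate: bool) -> str:
    """Run the coroutine on a one-off worker thread, optionally carrying the caller's context."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        if propagate:
            # Snapshot the caller's context and run asyncio.run inside it.
            ctx = contextvars.copy_context()
            future = pool.submit(ctx.run, asyncio.run, read_request_id())
        else:
            future = pool.submit(asyncio.run, read_request_id())
        return future.result()


request_id.set("req-42")
without_ctx = run_in_worker(propagate=False)
with_ctx = run_in_worker(propagate=True)
```

Because `copy_context()` takes a snapshot, values set inside the worker do not leak back to the caller.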

View File

@@ -100,7 +100,13 @@ class I18N(BaseModel):
     def retrieve(
         self,
         kind: Literal[
-            "slices", "errors", "tools", "reasoning", "hierarchical_manager_agent", "memory"
+            "slices",
+            "errors",
+            "tools",
+            "reasoning",
+            "planning",
+            "hierarchical_manager_agent",
+            "memory",
         ],
         key: str,
     ) -> str:

View File

@@ -10,17 +10,21 @@ from collections.abc import Iterator
from contextlib import contextmanager from contextlib import contextmanager
from functools import lru_cache from functools import lru_cache
from hashlib import md5 from hashlib import md5
import logging
import os import os
import tempfile import tempfile
from typing import TYPE_CHECKING, Final from typing import TYPE_CHECKING, Final
import portalocker import portalocker
import portalocker.exceptions
if TYPE_CHECKING: if TYPE_CHECKING:
import redis import redis
logger = logging.getLogger(__name__)
_REDIS_URL: str | None = os.environ.get("REDIS_URL") _REDIS_URL: str | None = os.environ.get("REDIS_URL")
_DEFAULT_TIMEOUT: Final[int] = 120 _DEFAULT_TIMEOUT: Final[int] = 120
@@ -57,5 +61,16 @@ def lock(name: str, *, timeout: float = _DEFAULT_TIMEOUT) -> Iterator[None]:
     else:
         lock_dir = tempfile.gettempdir()
         lock_path = os.path.join(lock_dir, f"{channel}.lock")
-        with portalocker.Lock(lock_path, timeout=timeout):
+        try:
+            pl = portalocker.Lock(lock_path, timeout=timeout)
+            pl.acquire()
+        except portalocker.exceptions.BaseLockException as exc:
+            raise portalocker.exceptions.LockException(
+                f"Failed to acquire lock '{name}' at {lock_path} "
+                f"(timeout={timeout}s). This commonly occurs in "
+                f"multi-process environments. "
+            ) from exc
+        try:
             yield
+        finally:
+            pl.release()  # type: ignore[no-untyped-call]

View File

@@ -0,0 +1,279 @@
"""Types for agent planning and todo tracking."""
from __future__ import annotations
from typing import Literal
from uuid import uuid4
from pydantic import BaseModel, Field, field_validator
# Todo status type
TodoStatus = Literal["pending", "running", "completed", "failed"]
class PlanStep(BaseModel):
"""A single step in the reasoning plan."""
step_number: int = Field(description="Step number (1-based)")
description: str = Field(description="What to do in this step")
tool_to_use: str | None = Field(
default=None, description="Tool to use for this step, if any"
)
depends_on: list[int] = Field(
default_factory=list, description="Step numbers this step depends on"
)
class TodoItem(BaseModel):
"""A single todo item representing a step in the execution plan."""
id: str = Field(default_factory=lambda: str(uuid4()))
step_number: int = Field(description="Order of this step in the plan (1-based)")
description: str = Field(description="What needs to be done")
tool_to_use: str | None = Field(
default=None, description="Tool to use for this step, if any"
)
status: TodoStatus = Field(default="pending", description="Current status")
depends_on: list[int] = Field(
default_factory=list, description="Step numbers this depends on"
)
result: str | None = Field(
default=None, description="Result after completion, if any"
)
class TodoList(BaseModel):
"""Collection of todos for tracking plan execution."""
items: list[TodoItem] = Field(default_factory=list)
@property
def current_todo(self) -> TodoItem | None:
"""Get the currently running todo item."""
for item in self.items:
if item.status == "running":
return item
return None
@property
def next_pending(self) -> TodoItem | None:
"""Get the next pending todo item."""
for item in self.items:
if item.status == "pending":
return item
return None
@property
def is_complete(self) -> bool:
"""Check if all todos are in a terminal state (completed or failed)."""
return len(self.items) > 0 and all(
item.status in ("completed", "failed") for item in self.items
)
@property
def pending_count(self) -> int:
"""Count of pending todos."""
return sum(1 for item in self.items if item.status == "pending")
@property
def completed_count(self) -> int:
"""Count of completed todos."""
return sum(1 for item in self.items if item.status == "completed")
def get_by_step_number(self, step_number: int) -> TodoItem | None:
"""Get a todo by its step number."""
for item in self.items:
if item.step_number == step_number:
return item
return None
def mark_running(self, step_number: int) -> None:
"""Mark a todo as running by step number."""
item = self.get_by_step_number(step_number)
if item:
item.status = "running"
def mark_completed(self, step_number: int, result: str | None = None) -> None:
"""Mark a todo as completed by step number."""
item = self.get_by_step_number(step_number)
if item:
item.status = "completed"
if result is not None:
item.result = result
def mark_failed(self, step_number: int, result: str | None = None) -> None:
"""Mark a todo as failed by step number."""
item = self.get_by_step_number(step_number)
if item:
item.status = "failed"
if result is not None:
item.result = result
def _dependencies_satisfied(self, item: TodoItem) -> bool:
"""Check if all dependencies for a todo item are in a terminal state.
A dependency is satisfied when it has finished executing — either
successfully (completed) or not (failed). This prevents downstream
todos from being permanently blocked when a dependency fails.
The executor/observer is responsible for deciding whether to skip,
replan, or continue when a dependency has failed.
Args:
item: The todo item to check dependencies for.
Returns:
True if all dependencies are in a terminal state, False otherwise.
"""
for dep_num in item.depends_on:
dep = self.get_by_step_number(dep_num)
if dep is None or dep.status not in ("completed", "failed"):
return False
return True
def get_ready_todos(self) -> list[TodoItem]:
"""Get all todos that are ready to execute (pending with satisfied dependencies).
Returns:
List of TodoItem objects that can be executed now.
"""
ready: list[TodoItem] = []
for item in self.items:
if item.status != "pending":
continue
if self._dependencies_satisfied(item):
ready.append(item)
return ready
@property
def can_parallelize(self) -> bool:
"""Check if multiple todos can run in parallel.
Returns:
True if more than one todo is ready to execute.
"""
return len(self.get_ready_todos()) > 1
@property
def running_count(self) -> int:
"""Count of currently running todos."""
return sum(1 for item in self.items if item.status == "running")
def get_completed_todos(self) -> list[TodoItem]:
"""Get all completed todos.
Returns:
List of completed TodoItem objects.
"""
return [item for item in self.items if item.status == "completed"]
def get_failed_todos(self) -> list[TodoItem]:
"""Get all failed todos.
Returns:
List of failed TodoItem objects.
"""
return [item for item in self.items if item.status == "failed"]
def get_pending_todos(self) -> list[TodoItem]:
"""Get all pending todos.
Returns:
List of pending TodoItem objects.
"""
return [item for item in self.items if item.status == "pending"]
def replace_pending_todos(self, new_items: list[TodoItem]) -> None:
"""Replace all pending todos with new items.
Preserves completed, failed, and running todos, replaces only pending ones.
Used during replanning to swap in a new plan for remaining work.
Args:
new_items: The new todo items to replace pending ones.
"""
non_pending = [item for item in self.items if item.status != "pending"]
self.items = non_pending + new_items
class StepRefinement(BaseModel):
"""A structured in-place update for a single pending step.
Returned as part of StepObservation when the Planner learns new
information that makes a pending step description more specific.
Applied directly — no second LLM call required.
"""
step_number: int = Field(description="The step number to update (1-based)")
new_description: str = Field(
description="The updated, more specific description for this step"
)
class StepObservation(BaseModel):
"""Planner's observation after a step execution completes.
Returned by the PlannerObserver after EVERY step — not just failures.
The Planner uses this to decide whether to continue, refine, or replan.
Based on PLAN-AND-ACT (Section 3.3): the Planner observes what the Executor
did and incorporates new information into the remaining plan.
Attributes:
step_completed_successfully: Whether the step achieved its objective.
key_information_learned: New information revealed by this step
(e.g., "Found 3 products: A, B, C"). Used to refine upcoming steps.
remaining_plan_still_valid: Whether pending todos still make sense
given the new information. True does NOT mean no refinement needed.
suggested_refinements: Structured in-place updates to pending step
descriptions. Each entry targets a specific step by number. These
are applied directly without a second LLM call.
Example: [{"step_number": 3, "new_description": "Select product B (highest rated)"}]
needs_full_replan: The remaining plan is fundamentally wrong and must
be regenerated from scratch. Mutually exclusive with
remaining_plan_still_valid (if this is True, that should be False).
replan_reason: Explanation of why a full replan is needed (None if not).
goal_already_achieved: The overall task goal has been satisfied early.
No more steps needed — skip remaining todos and finalize.
"""
step_completed_successfully: bool = Field(
description="Whether the step achieved what it was asked to do"
)
key_information_learned: str = Field(
default="",
description="What new information this step revealed",
)
remaining_plan_still_valid: bool = Field(
default=True,
description="Whether the remaining pending todos still make sense given new information",
)
suggested_refinements: list[StepRefinement] | None = Field(
default=None,
description=(
"Structured updates to pending step descriptions based on new information. "
"Each entry specifies a step_number and new_description. "
"Applied directly — no separate replan needed."
),
)
@field_validator("suggested_refinements", mode="before")
@classmethod
def coerce_single_refinement_to_list(cls, v):
"""Coerce a single dict refinement into a list to handle LLM returning a single object."""
if isinstance(v, dict):
return [v]
return v
needs_full_replan: bool = Field(
default=False,
description="The remaining plan is fundamentally wrong and must be regenerated",
)
replan_reason: str | None = Field(
default=None,
description="Explanation of why a full replan is needed",
)
goal_already_achieved: bool = Field(
default=False,
description="The overall task goal has been satisfied early; no more steps needed",
)
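
The readiness rule described in `_dependencies_satisfied` treats a failed dependency as satisfied, so downstream steps are released rather than blocked forever, while an unknown dependency number keeps a step blocked. A minimal sketch of that rule follows, using plain dataclasses instead of the Pydantic models so it has no third-party imports; `Todo` and `ready_todos` are simplified, hypothetical stand-ins that assume unique step numbers.

```python
from __future__ import annotations

from dataclasses import dataclass, field

TERMINAL = ("completed", "failed")


@dataclass
class Todo:
    """Simplified stand-in for TodoItem with only the fields the rule needs."""

    step_number: int
    status: str = "pending"
    depends_on: list[int] = field(default_factory=list)


def ready_todos(items: list[Todo]) -> list[Todo]:
    """Return pending todos whose dependencies all exist and are terminal."""
    by_step = {item.step_number: item for item in items}
    ready: list[Todo] = []
    for item in items:
        if item.status != "pending":
            continue
        deps = [by_step.get(n) for n in item.depends_on]
        if all(dep is not None and dep.status in TERMINAL for dep in deps):
            ready.append(item)
    return ready


plan = [
    Todo(1, status="completed"),
    Todo(2, status="failed"),
    Todo(3, depends_on=[1, 2]),  # both dependencies are terminal, so this is ready
    Todo(4, depends_on=[3]),     # step 3 is still pending, so this is blocked
    Todo(5, depends_on=[99]),    # unknown dependency, so this stays blocked
]
ready_steps = [t.step_number for t in ready_todos(plan)]
```

Whether a step should actually run after one of its dependencies failed is left to the caller, which matches the docstring's note that the executor or observer decides to skip, replan, or continue.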

View File

@@ -657,7 +657,10 @@ def _json_schema_to_pydantic_field(
A tuple of (type, Field) for use with create_model. A tuple of (type, Field) for use with create_model.
""" """
type_ = _json_schema_to_pydantic_type( type_ = _json_schema_to_pydantic_type(
json_schema, root_schema, name_=name.title(), enrich_descriptions=enrich_descriptions json_schema,
root_schema,
name_=name.title(),
enrich_descriptions=enrich_descriptions,
) )
is_required = name in required is_required = name in required
@@ -806,7 +809,10 @@ def _json_schema_to_pydantic_type(
if ref: if ref:
ref_schema = _resolve_ref(ref, root_schema) ref_schema = _resolve_ref(ref, root_schema)
return _json_schema_to_pydantic_type( return _json_schema_to_pydantic_type(
ref_schema, root_schema, name_=name_, enrich_descriptions=enrich_descriptions ref_schema,
root_schema,
name_=name_,
enrich_descriptions=enrich_descriptions,
) )
enum_values = json_schema.get("enum") enum_values = json_schema.get("enum")
@@ -835,12 +841,16 @@ def _json_schema_to_pydantic_type(
if all_of_schemas: if all_of_schemas:
if len(all_of_schemas) == 1: if len(all_of_schemas) == 1:
return _json_schema_to_pydantic_type( return _json_schema_to_pydantic_type(
all_of_schemas[0], root_schema, name_=name_, all_of_schemas[0],
root_schema,
name_=name_,
enrich_descriptions=enrich_descriptions, enrich_descriptions=enrich_descriptions,
) )
merged = _merge_all_of_schemas(all_of_schemas, root_schema) merged = _merge_all_of_schemas(all_of_schemas, root_schema)
return _json_schema_to_pydantic_type( return _json_schema_to_pydantic_type(
merged, root_schema, name_=name_, merged,
root_schema,
name_=name_,
enrich_descriptions=enrich_descriptions, enrich_descriptions=enrich_descriptions,
) )
@@ -858,7 +868,9 @@ def _json_schema_to_pydantic_type(
items_schema = json_schema.get("items") items_schema = json_schema.get("items")
if items_schema: if items_schema:
item_type = _json_schema_to_pydantic_type( item_type = _json_schema_to_pydantic_type(
items_schema, root_schema, name_=name_, items_schema,
root_schema,
name_=name_,
enrich_descriptions=enrich_descriptions, enrich_descriptions=enrich_descriptions,
) )
return list[item_type] # type: ignore[valid-type] return list[item_type] # type: ignore[valid-type]
@@ -870,7 +882,8 @@ def _json_schema_to_pydantic_type(
if json_schema_.get("title") is None: if json_schema_.get("title") is None:
json_schema_["title"] = name_ or "DynamicModel" json_schema_["title"] = name_ or "DynamicModel"
return create_model_from_schema( return create_model_from_schema(
json_schema_, root_schema=root_schema, json_schema_,
root_schema=root_schema,
enrich_descriptions=enrich_descriptions, enrich_descriptions=enrich_descriptions,
) )
return dict return dict

View File

@@ -1,10 +1,13 @@
"""Handles planning/reasoning for agents before task execution."""
from __future__ import annotations
import json import json
import logging import logging
from typing import Any, Final, Literal, cast from typing import TYPE_CHECKING, Any, Final, Literal, cast
from pydantic import BaseModel, Field from pydantic import BaseModel, Field
from crewai.agent import Agent
from crewai.events.event_bus import crewai_event_bus from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.reasoning_events import ( from crewai.events.types.reasoning_events import (
AgentReasoningCompletedEvent, AgentReasoningCompletedEvent,
@@ -12,14 +15,24 @@ from crewai.events.types.reasoning_events import (
AgentReasoningStartedEvent, AgentReasoningStartedEvent,
) )
from crewai.llm import LLM from crewai.llm import LLM
from crewai.task import Task from crewai.utilities.llm_utils import create_llm
from crewai.utilities.planning_types import PlanStep
from crewai.utilities.string_utils import sanitize_tool_name from crewai.utilities.string_utils import sanitize_tool_name
if TYPE_CHECKING:
from crewai.agent import Agent
from crewai.agent.planning_config import PlanningConfig
from crewai.task import Task
class ReasoningPlan(BaseModel): class ReasoningPlan(BaseModel):
"""Model representing a reasoning plan for a task.""" """Model representing a reasoning plan for a task."""
plan: str = Field(description="The detailed reasoning plan for the task.") plan: str = Field(description="The detailed reasoning plan for the task.")
steps: list[PlanStep] = Field(
default_factory=list, description="Structured steps to execute"
)
ready: bool = Field(description="Whether the agent is ready to execute the task.") ready: bool = Field(description="Whether the agent is ready to execute the task.")
@@ -29,24 +42,63 @@ class AgentReasoningOutput(BaseModel):
plan: ReasoningPlan = Field(description="The reasoning plan for the task.") plan: ReasoningPlan = Field(description="The reasoning plan for the task.")
# Aliases for backward compatibility
PlanningPlan = ReasoningPlan
AgentPlanningOutput = AgentReasoningOutput
FUNCTION_SCHEMA: Final[dict[str, Any]] = { FUNCTION_SCHEMA: Final[dict[str, Any]] = {
"type": "function", "type": "function",
"function": { "function": {
"name": "create_reasoning_plan", "name": "create_reasoning_plan",
"description": "Create or refine a reasoning plan for a task", "description": "Create or refine a reasoning plan for a task with structured steps",
"parameters": { "parameters": {
"type": "object", "type": "object",
"properties": { "properties": {
"plan": { "plan": {
"type": "string", "type": "string",
"description": "The detailed reasoning plan for the task.", "description": "A brief summary of the overall plan.",
},
"steps": {
"type": "array",
"description": "List of discrete steps to execute the plan",
"items": {
"type": "object",
"properties": {
"step_number": {
"type": "integer",
"description": "Step number (1-based)",
},
"description": {
"type": "string",
"description": "What to do in this step",
},
"tool_to_use": {
"type": ["string", "null"],
"description": "Tool to use for this step, or null if no tool needed",
},
"depends_on": {
"type": "array",
"items": {"type": "integer"},
"description": "Step numbers this step depends on (empty array if none)",
},
},
"required": [
"step_number",
"description",
"tool_to_use",
"depends_on",
],
"additionalProperties": False,
},
}, },
"ready": { "ready": {
"type": "boolean", "type": "boolean",
"description": "Whether the agent is ready to execute the task.", "description": "Whether the agent is ready to execute the task.",
}, },
}, },
"required": ["plan", "ready"], "required": ["plan", "steps", "ready"],
"additionalProperties": False,
}, },
}, },
} }
@@ -54,41 +106,101 @@ FUNCTION_SCHEMA: Final[dict[str, Any]] = {
class AgentReasoning: class AgentReasoning:
""" """
Handles the agent reasoning process, enabling an agent to reflect and create a plan Handles the agent planning/reasoning process, enabling an agent to reflect
before executing a task. and create a plan before executing a task.
Attributes: Attributes:
task: The task for which the agent is reasoning. task: The task for which the agent is planning (optional).
agent: The agent performing the reasoning. agent: The agent performing the planning.
llm: The language model used for reasoning. config: The planning configuration.
llm: The language model used for planning.
logger: Logger for logging events and errors. logger: Logger for logging events and errors.
description: Task description or input text for planning.
expected_output: Expected output description.
""" """
def __init__(self, task: Task, agent: Agent) -> None: def __init__(
"""Initialize the AgentReasoning with a task and an agent. self,
agent: Agent,
task: Task | None = None,
*,
description: str | None = None,
expected_output: str | None = None,
) -> None:
"""Initialize the AgentReasoning with an agent and optional task.
Args: Args:
task: The task for which the agent is reasoning. agent: The agent performing the planning.
agent: The agent performing the reasoning. task: The task for which the agent is planning (optional).
description: Task description or input text (used if task is None).
expected_output: Expected output (used if task is None).
""" """
self.task = task
self.agent = agent self.agent = agent
self.llm = cast(LLM, agent.llm) self.task = task
# Use task attributes if available, otherwise use provided values
self._description = description or (
task.description if task else "Complete the requested task"
)
self._expected_output = expected_output or (
task.expected_output if task else "Complete the task successfully"
)
self.config = self._get_planning_config()
self.llm = self._resolve_llm()
self.logger = logging.getLogger(__name__) self.logger = logging.getLogger(__name__)
def handle_agent_reasoning(self) -> AgentReasoningOutput: @property
"""Public method for the reasoning process that creates and refines a plan for the task until the agent is ready to execute it. def description(self) -> str:
"""Get the task/input description."""
return self._description
@property
def expected_output(self) -> str:
"""Get the expected output."""
return self._expected_output
def _get_planning_config(self) -> PlanningConfig:
"""Get the planning configuration from the agent.
Returns: Returns:
AgentReasoningOutput: The output of the agent reasoning process. The planning configuration, using defaults if not set.
""" """
# Emit a reasoning started event (attempt 1) from crewai.agent.planning_config import PlanningConfig
if self.agent.planning_config is not None:
return self.agent.planning_config
# Fallback for backward compatibility
return PlanningConfig(
max_attempts=getattr(self.agent, "max_reasoning_attempts", None),
)
def _resolve_llm(self) -> LLM:
"""Resolve which LLM to use for planning.
Returns:
The LLM to use - either from config or the agent's LLM.
"""
if self.config.llm is not None:
if isinstance(self.config.llm, LLM):
return self.config.llm
return create_llm(self.config.llm)
return cast(LLM, self.agent.llm)
def handle_agent_reasoning(self) -> AgentReasoningOutput:
"""Public method for the planning process that creates and refines a plan
for the task until the agent is ready to execute it.
Returns:
AgentReasoningOutput: The output of the agent planning process.
"""
task_id = str(self.task.id) if self.task else "kickoff"
# Emit a planning started event (attempt 1)
try: try:
crewai_event_bus.emit( crewai_event_bus.emit(
self.agent, self.agent,
AgentReasoningStartedEvent( AgentReasoningStartedEvent(
agent_role=self.agent.role, agent_role=self.agent.role,
task_id=str(self.task.id), task_id=task_id,
attempt=1, attempt=1,
from_task=self.task, from_task=self.task,
), ),
@@ -98,13 +210,13 @@ class AgentReasoning:
pass pass
try: try:
output = self.__handle_agent_reasoning() output = self._execute_planning()
crewai_event_bus.emit( crewai_event_bus.emit(
self.agent, self.agent,
AgentReasoningCompletedEvent( AgentReasoningCompletedEvent(
agent_role=self.agent.role, agent_role=self.agent.role,
task_id=str(self.task.id), task_id=task_id,
plan=output.plan.plan, plan=output.plan.plan,
ready=output.plan.ready, ready=output.plan.ready,
attempt=1, attempt=1,
@@ -115,135 +227,158 @@ class AgentReasoning:
return output return output
except Exception as e: except Exception as e:
# Emit reasoning failed event # Emit planning failed event
try: try:
crewai_event_bus.emit( crewai_event_bus.emit(
self.agent, self.agent,
AgentReasoningFailedEvent( AgentReasoningFailedEvent(
agent_role=self.agent.role, agent_role=self.agent.role,
task_id=str(self.task.id), task_id=task_id,
error=str(e), error=str(e),
attempt=1, attempt=1,
from_task=self.task, from_task=self.task,
from_agent=self.agent, from_agent=self.agent,
), ),
) )
except Exception as e: except Exception as event_error:
logging.error(f"Error emitting reasoning failed event: {e}") logging.error(f"Error emitting planning failed event: {event_error}")
raise raise
def __handle_agent_reasoning(self) -> AgentReasoningOutput: def _execute_planning(self) -> AgentReasoningOutput:
"""Private method that handles the agent reasoning process. """Execute the planning process.
Returns: Returns:
The output of the agent reasoning process. The output of the agent planning process.
""" """
plan, ready = self.__create_initial_plan() plan, steps, ready = self._create_initial_plan()
plan, steps, ready = self._refine_plan_if_needed(plan, steps, ready)
plan, ready = self.__refine_plan_if_needed(plan, ready) reasoning_plan = ReasoningPlan(plan=plan, steps=steps, ready=ready)
reasoning_plan = ReasoningPlan(plan=plan, ready=ready)
return AgentReasoningOutput(plan=reasoning_plan) return AgentReasoningOutput(plan=reasoning_plan)
def __create_initial_plan(self) -> tuple[str, bool]: def _create_initial_plan(self) -> tuple[str, list[PlanStep], bool]:
"""Creates the initial reasoning plan for the task. """Creates the initial plan for the task.
Returns: Returns:
The initial plan and whether the agent is ready to execute the task. A tuple of the plan summary, list of steps, and whether the agent is ready.
""" """
reasoning_prompt = self.__create_reasoning_prompt() planning_prompt = self._create_planning_prompt()
if self.llm.supports_function_calling(): if self.llm.supports_function_calling():
plan, ready = self.__call_with_function(reasoning_prompt, "initial_plan") plan, steps, ready = self._call_with_function(
return plan, ready planning_prompt, "create_plan"
response = _call_llm_with_reasoning_prompt( )
llm=self.llm, return plan, steps, ready
prompt=reasoning_prompt,
task=self.task, response = self._call_llm_with_prompt(
reasoning_agent=self.agent, prompt=planning_prompt,
backstory=self.__get_agent_backstory(), plan_type="create_plan",
plan_type="initial_plan",
) )
return self.__parse_reasoning_response(str(response)) plan, ready = self._parse_planning_response(str(response))
return plan, [], ready # No structured steps from text parsing
def __refine_plan_if_needed(self, plan: str, ready: bool) -> tuple[str, bool]: def _refine_plan_if_needed(
"""Refines the reasoning plan if the agent is not ready to execute the task. self, plan: str, steps: list[PlanStep], ready: bool
) -> tuple[str, list[PlanStep], bool]:
"""Refines the plan if the agent is not ready to execute the task.
Args: Args:
plan: The current reasoning plan. plan: The current plan.
steps: The current list of steps.
ready: Whether the agent is ready to execute the task. ready: Whether the agent is ready to execute the task.
Returns: Returns:
The refined plan and whether the agent is ready to execute the task. The refined plan, steps, and whether the agent is ready to execute.
""" """
attempt = 1 attempt = 1
max_attempts = self.agent.max_reasoning_attempts max_attempts = self.config.max_attempts
task_id = str(self.task.id) if self.task else "kickoff"
while not ready and (max_attempts is None or attempt < max_attempts): while not ready and (max_attempts is None or attempt < max_attempts):
attempt += 1
# Emit event for each refinement attempt # Emit event for each refinement attempt
try: try:
crewai_event_bus.emit( crewai_event_bus.emit(
self.agent, self.agent,
AgentReasoningStartedEvent( AgentReasoningStartedEvent(
agent_role=self.agent.role, agent_role=self.agent.role,
task_id=str(self.task.id), task_id=task_id,
attempt=attempt + 1, attempt=attempt,
from_task=self.task, from_task=self.task,
), ),
) )
except Exception: # noqa: S110 except Exception: # noqa: S110
pass pass
refine_prompt = self.__create_refine_prompt(plan) refine_prompt = self._create_refine_prompt(plan)
if self.llm.supports_function_calling(): if self.llm.supports_function_calling():
plan, ready = self.__call_with_function(refine_prompt, "refine_plan") plan, steps, ready = self._call_with_function(
refine_prompt, "refine_plan"
)
else: else:
response = _call_llm_with_reasoning_prompt( response = self._call_llm_with_prompt(
llm=self.llm,
prompt=refine_prompt, prompt=refine_prompt,
task=self.task,
reasoning_agent=self.agent,
backstory=self.__get_agent_backstory(),
plan_type="refine_plan", plan_type="refine_plan",
) )
plan, ready = self.__parse_reasoning_response(str(response)) plan, ready = self._parse_planning_response(str(response))
steps = [] # No structured steps from text parsing
attempt += 1 # Emit completed event for this refinement attempt
try:
crewai_event_bus.emit(
self.agent,
AgentReasoningCompletedEvent(
agent_role=self.agent.role,
task_id=task_id,
plan=plan,
ready=ready,
attempt=attempt,
from_task=self.task,
from_agent=self.agent,
),
)
except Exception: # noqa: S110
pass
if max_attempts is not None and attempt >= max_attempts: if max_attempts is not None and attempt >= max_attempts:
self.logger.warning( self.logger.warning(
f"Agent reasoning reached maximum attempts ({max_attempts}) without being ready. Proceeding with current plan." f"Agent planning reached maximum attempts ({max_attempts}) "
"without being ready. Proceeding with current plan."
) )
break break
return plan, ready return plan, steps, ready
def __call_with_function(self, prompt: str, prompt_type: str) -> tuple[str, bool]: def _call_with_function(
"""Calls the LLM with function calling to get a reasoning plan. self, prompt: str, plan_type: Literal["create_plan", "refine_plan"]
) -> tuple[str, list[PlanStep], bool]:
"""Calls the LLM with function calling to get a plan.
Args: Args:
prompt: The prompt to send to the LLM. prompt: The prompt to send to the LLM.
prompt_type: The type of prompt (initial_plan or refine_plan). plan_type: The type of plan being created.
Returns: Returns:
A tuple containing the plan and whether the agent is ready. A tuple containing the plan summary, list of steps, and whether the agent is ready.
""" """
self.logger.debug(f"Using function calling for {prompt_type} reasoning") self.logger.debug(f"Using function calling for {plan_type} planning")
try: try:
system_prompt = self.agent.i18n.retrieve("reasoning", prompt_type).format( system_prompt = self._get_system_prompt()
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory(),
)
# Prepare a simple callable that just returns the tool arguments as JSON # Prepare a simple callable that just returns the tool arguments as JSON
def _create_reasoning_plan(plan: str, ready: bool = True) -> str: def _create_reasoning_plan(
"""Return the reasoning plan result in JSON string form.""" plan: str,
return json.dumps({"plan": plan, "ready": ready}) steps: list[dict[str, Any]] | None = None,
ready: bool = True,
) -> str:
"""Return the planning result in JSON string form."""
return json.dumps({"plan": plan, "steps": steps or [], "ready": ready})
response = self.llm.call( response = self.llm.call(
[ [
@@ -255,19 +390,33 @@ class AgentReasoning:
from_task=self.task, from_task=self.task,
from_agent=self.agent, from_agent=self.agent,
) )
self.logger.debug(f"Function calling response: {response[:100]}...")
try: try:
result = json.loads(response) result = json.loads(response)
if "plan" in result and "ready" in result: if "plan" in result and "ready" in result:
return result["plan"], result["ready"] # Parse steps from the response
steps: list[PlanStep] = []
raw_steps = result.get("steps", [])
try:
for step_data in raw_steps:
step = PlanStep(
step_number=step_data.get("step_number", 0),
description=step_data.get("description", ""),
tool_to_use=step_data.get("tool_to_use"),
depends_on=step_data.get("depends_on", []),
)
steps.append(step)
except Exception as step_error:
self.logger.warning(
f"Failed to parse step: {step_data}, error: {step_error}"
)
return result["plan"], steps, result["ready"]
except (json.JSONDecodeError, KeyError): except (json.JSONDecodeError, KeyError):
pass pass
response_str = str(response) response_str = str(response)
return ( return (
response_str, response_str,
[],
"READY: I am ready to execute the task." in response_str, "READY: I am ready to execute the task." in response_str,
) )
@@ -277,13 +426,7 @@ class AgentReasoning:
) )
try: try:
system_prompt = self.agent.i18n.retrieve( system_prompt = self._get_system_prompt()
"reasoning", prompt_type
).format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self.__get_agent_backstory(),
)
fallback_response = self.llm.call( fallback_response = self.llm.call(
[ [
@@ -297,78 +440,165 @@ class AgentReasoning:
fallback_str = str(fallback_response) fallback_str = str(fallback_response)
return ( return (
fallback_str, fallback_str,
[],
"READY: I am ready to execute the task." in fallback_str, "READY: I am ready to execute the task." in fallback_str,
) )
except Exception as inner_e: except Exception as inner_e:
self.logger.error(f"Error during fallback text parsing: {inner_e!s}") self.logger.error(f"Error during fallback text parsing: {inner_e!s}")
return ( return (
"Failed to generate a plan due to an error.", "Failed to generate a plan due to an error.",
[],
True, True,
) # Default to ready to avoid getting stuck ) # Default to ready to avoid getting stuck
def __get_agent_backstory(self) -> str: def _call_llm_with_prompt(
""" self,
Safely gets the agent's backstory, providing a default if not available. prompt: str,
plan_type: Literal["create_plan", "refine_plan"],
) -> str:
"""Calls the LLM with the planning prompt.
Args:
prompt: The prompt to send to the LLM.
plan_type: The type of plan being created.
Returns: Returns:
str: The agent's backstory or a default value. The LLM response.
"""
system_prompt = self._get_system_prompt()
response = self.llm.call(
[
{"role": "system", "content": system_prompt},
{"role": "user", "content": prompt},
],
from_task=self.task,
from_agent=self.agent,
)
return str(response)
def _get_system_prompt(self) -> str:
"""Get the system prompt for planning.
Returns:
The system prompt, either custom or from i18n.
"""
if self.config.system_prompt is not None:
return self.config.system_prompt
# Try new "planning" section first, fall back to "reasoning" for compatibility
try:
return self.agent.i18n.retrieve("planning", "system_prompt")
except (KeyError, AttributeError):
# Fallback to reasoning section for backward compatibility
return self.agent.i18n.retrieve("reasoning", "initial_plan").format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
)
def _get_agent_backstory(self) -> str:
"""Safely gets the agent's backstory, providing a default if not available.
Returns:
The agent's backstory or a default value.
""" """
return getattr(self.agent, "backstory", "No backstory provided") return getattr(self.agent, "backstory", "No backstory provided")
def __create_reasoning_prompt(self) -> str: def _create_planning_prompt(self) -> str:
""" """Creates a prompt for the agent to plan the task.
Creates a prompt for the agent to reason about the task.
Returns: Returns:
str: The reasoning prompt. The planning prompt.
""" """
available_tools = self.__format_available_tools() available_tools = self._format_available_tools()
# Use custom prompt if provided
if self.config.plan_prompt is not None:
return self.config.plan_prompt.format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
description=self.description,
expected_output=self.expected_output,
tools=available_tools,
max_steps=self.config.max_steps,
)
# Try new "planning" section first
try:
return self.agent.i18n.retrieve("planning", "create_plan_prompt").format(
description=self.description,
expected_output=self.expected_output,
tools=available_tools,
max_steps=self.config.max_steps,
)
except (KeyError, AttributeError):
# Fallback to reasoning section for backward compatibility
return self.agent.i18n.retrieve("reasoning", "create_plan_prompt").format( return self.agent.i18n.retrieve("reasoning", "create_plan_prompt").format(
role=self.agent.role, role=self.agent.role,
goal=self.agent.goal, goal=self.agent.goal,
backstory=self.__get_agent_backstory(), backstory=self._get_agent_backstory(),
description=self.task.description, description=self.description,
expected_output=self.task.expected_output, expected_output=self.expected_output,
tools=available_tools, tools=available_tools,
) )
def __format_available_tools(self) -> str: def _format_available_tools(self) -> str:
""" """Formats the available tools for inclusion in the prompt.
Formats the available tools for inclusion in the prompt.
Returns: Returns:
str: Comma-separated list of tool names. Comma-separated list of tool names.
""" """
try: try:
return ", ".join( # Try task tools first, then agent tools
[sanitize_tool_name(tool.name) for tool in (self.task.tools or [])] tools = []
) if self.task:
tools = self.task.tools or []
if not tools:
tools = getattr(self.agent, "tools", []) or []
if not tools:
return "No tools available"
return ", ".join([sanitize_tool_name(tool.name) for tool in tools])
except (AttributeError, TypeError): except (AttributeError, TypeError):
return "No tools available" return "No tools available"
def __create_refine_prompt(self, current_plan: str) -> str: def _create_refine_prompt(self, current_plan: str) -> str:
""" """Creates a prompt for the agent to refine its plan.
Creates a prompt for the agent to refine its reasoning plan.
Args: Args:
current_plan: The current reasoning plan. current_plan: The current plan.
Returns: Returns:
str: The refine prompt. The refine prompt.
""" """
# Use custom prompt if provided
if self.config.refine_prompt is not None:
return self.config.refine_prompt.format(
role=self.agent.role,
goal=self.agent.goal,
backstory=self._get_agent_backstory(),
current_plan=current_plan,
max_steps=self.config.max_steps,
)
# Try new "planning" section first
try:
return self.agent.i18n.retrieve("planning", "refine_plan_prompt").format(
current_plan=current_plan,
)
except (KeyError, AttributeError):
# Fallback to reasoning section for backward compatibility
return self.agent.i18n.retrieve("reasoning", "refine_plan_prompt").format( return self.agent.i18n.retrieve("reasoning", "refine_plan_prompt").format(
role=self.agent.role, role=self.agent.role,
goal=self.agent.goal, goal=self.agent.goal,
backstory=self.__get_agent_backstory(), backstory=self._get_agent_backstory(),
current_plan=current_plan, current_plan=current_plan,
) )
@staticmethod @staticmethod
def __parse_reasoning_response(response: str) -> tuple[str, bool]: def _parse_planning_response(response: str) -> tuple[str, bool]:
""" """Parses the planning response to extract the plan and readiness.
Parses the reasoning response to extract the plan and whether
the agent is ready to execute the task.
Args: Args:
response: The LLM response. response: The LLM response.
@@ -380,25 +610,13 @@ class AgentReasoning:
return "No plan was generated.", False return "No plan was generated.", False
plan = response plan = response
ready = False ready = "READY: I am ready to execute the task." in response
if "READY: I am ready to execute the task." in response:
ready = True
return plan, ready return plan, ready
def _handle_agent_reasoning(self) -> AgentReasoningOutput:
"""
Deprecated method for backward compatibility.
Use handle_agent_reasoning() instead.
Returns: # Alias for backward compatibility
AgentReasoningOutput: The output of the agent reasoning process. AgentPlanning = AgentReasoning
"""
self.logger.warning(
"The _handle_agent_reasoning method is deprecated. Use handle_agent_reasoning instead."
)
return self.handle_agent_reasoning()
def _call_llm_with_reasoning_prompt( def _call_llm_with_reasoning_prompt(
@@ -409,7 +627,9 @@ def _call_llm_with_reasoning_prompt(
backstory: str, backstory: str,
plan_type: Literal["initial_plan", "refine_plan"], plan_type: Literal["initial_plan", "refine_plan"],
) -> str: ) -> str:
"""Calls the LLM with the reasoning prompt. """Deprecated: Calls the LLM with the reasoning prompt.
This function is kept for backward compatibility.
Args: Args:
llm: The language model to use. llm: The language model to use.
@@ -417,7 +637,7 @@ def _call_llm_with_reasoning_prompt(
task: The task for which the agent is reasoning. task: The task for which the agent is reasoning.
reasoning_agent: The agent performing the reasoning. reasoning_agent: The agent performing the reasoning.
backstory: The agent's backstory. backstory: The agent's backstory.
plan_type: The type of plan being created ("initial_plan" or "refine_plan"). plan_type: The type of plan being created.
Returns: Returns:
The LLM response. The LLM response.
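
Review note (a sketch, not part of this commit): the attempt accounting in `_refine_plan_if_needed` can be checked in isolation. The names `refine_until_ready` and `parse_planning_response` are hypothetical stand-ins, and the empty-response guard is assumed from the visible `"No plan was generated."` return. Only the READY marker string and the loop condition are taken from the diff. Event emission and logging are left out.

```python
from __future__ import annotations

from collections.abc import Callable

READY_MARKER = "READY: I am ready to execute the task."


def parse_planning_response(response: str) -> tuple[str, bool]:
    """Text-parsing fallback: the whole response is treated as the plan."""
    if not response:  # assumed guard; the diff only shows the return value
        return "No plan was generated.", False
    return response, READY_MARKER in response


def refine_until_ready(
    plan: str,
    ready: bool,
    call_llm: Callable[[str], str],
    max_attempts: int | None = None,
) -> tuple[str, bool, int]:
    """Refine the plan until it is ready or the attempt budget is spent."""
    attempt = 1  # the initial plan counts as attempt 1
    while not ready and (max_attempts is None or attempt < max_attempts):
        attempt += 1  # incremented first, so events would carry this number
        plan, ready = parse_planning_response(call_llm(plan))
        if max_attempts is not None and attempt >= max_attempts:
            break
    return plan, ready, attempt
```

Under these assumptions, `max_attempts=2` allows exactly one refinement call after the initial plan, and `max_attempts=None` keeps refining until the marker appears.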


@@ -0,0 +1,64 @@
"""Context and result types for isolated step execution in Plan-and-Execute architecture.
These types mediate between the AgentExecutor (orchestrator) and StepExecutor (per-step worker).
StepExecutionContext carries only final results from dependencies — never LLM message histories.
StepResult carries only the outcome of a step — never internal execution traces.
"""
from __future__ import annotations
from dataclasses import dataclass, field
@dataclass(frozen=True)
class StepExecutionContext:
"""Immutable context passed to a StepExecutor for a single todo.
Contains only the information the Executor needs to complete one step:
the task description, goal, and final results from dependency steps.
No LLM message history, no execution traces, no shared mutable state.
Attributes:
task_description: The original task description (from Task or kickoff input).
task_goal: The expected output / goal of the overall task.
dependency_results: Mapping of step_number → final result string
for all completed dependencies of the current step.
"""
task_description: str
task_goal: str
dependency_results: dict[int, str] = field(default_factory=dict)
def get_dependency_result(self, step_number: int) -> str | None:
"""Get the final result of a dependency step.
Args:
step_number: The step number to look up.
Returns:
The result string if available, None otherwise.
"""
return self.dependency_results.get(step_number)
@dataclass(frozen=True)
class StepResult:
"""Result returned by a StepExecutor after executing a single todo.
Contains the final outcome and metadata for debugging/metrics.
Tool call details are for audit logging only — they are NOT passed
to subsequent steps or the Planner.
Attributes:
success: Whether the step completed successfully.
result: The final output string from the step.
error: Error message if the step failed (None on success).
tool_calls_made: List of tool names invoked (for debugging/logging only).
execution_time: Wall-clock time in seconds for the step execution.
"""
success: bool
result: str
error: str | None = None
tool_calls_made: list[str] = field(default_factory=list)
execution_time: float = 0.0
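
Review note (a sketch, not part of this commit): the snippet below uses a trimmed local copy of `StepExecutionContext` to show what `frozen=True` does and does not guarantee. The helper `try_rebind` and the sample strings are hypothetical. Rebinding a field raises `FrozenInstanceError`, but the `dependency_results` dict itself can still be mutated in place, so the "no shared mutable state" guarantee depends on callers passing a fresh dict.

```python
from __future__ import annotations

import dataclasses
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StepExecutionContext:
    """Trimmed copy of the class added in this diff, for illustration only."""

    task_description: str
    task_goal: str
    dependency_results: dict[int, str] = field(default_factory=dict)

    def get_dependency_result(self, step_number: int) -> str | None:
        return self.dependency_results.get(step_number)


def try_rebind(ctx: StepExecutionContext) -> bool:
    """Return True if a field could be reassigned, False if the class blocked it."""
    try:
        ctx.task_goal = "changed"  # type: ignore[misc]
    except dataclasses.FrozenInstanceError:
        return False
    return True


ctx = StepExecutionContext(
    task_description="Summarise the report",
    task_goal="A three-line summary",
    dependency_results={1: "raw text extracted"},
)
```

If stricter isolation is wanted, one option is to wrap the mapping in `types.MappingProxyType` before building the context; that is a suggestion, not something this commit does.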


@@ -2,6 +2,7 @@
import asyncio import asyncio
from collections.abc import AsyncIterator, Callable, Iterator from collections.abc import AsyncIterator, Callable, Iterator
import contextvars
import queue import queue
import threading import threading
from typing import Any, NamedTuple from typing import Any, NamedTuple
@@ -240,7 +241,8 @@ def create_chunk_generator(
Yields: Yields:
StreamChunk objects as they arrive. StreamChunk objects as they arrive.
""" """
thread = threading.Thread(target=run_func, daemon=True) ctx = contextvars.copy_context()
thread = threading.Thread(target=ctx.run, args=(run_func,), daemon=True)
thread.start() thread.start()
try: try:
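
Review note (a sketch, not part of this commit): the change above runs `run_func` inside a copy of the caller's context. The minimal example below shows the same pattern with a hypothetical `request_id` variable and a hypothetical `read_in_thread` helper. Without `copy_context()`, a worker thread on most CPython builds starts from an empty context and would see the variable's default instead of the caller's value.

```python
import contextvars
import threading

# Hypothetical context variable standing in for whatever state the caller set.
request_id = contextvars.ContextVar("request_id", default="unset")


def read_in_thread(propagate: bool) -> str:
    """Read request_id from a worker thread, optionally copying the caller's context."""
    seen: list[str] = []

    def worker() -> None:
        seen.append(request_id.get())

    if propagate:
        # Same pattern as the diff: snapshot the context, run the target inside it.
        ctx = contextvars.copy_context()
        thread = threading.Thread(target=ctx.run, args=(worker,), daemon=True)
    else:
        thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join()
    return seen[0]


request_id.set("abc-123")
```

The snapshot is taken when `copy_context()` is called, so values set by the caller afterwards are not visible to the worker, and values set inside the worker do not leak back.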


@@ -2,6 +2,7 @@
# https://github.com/un33k/python-slugify # https://github.com/un33k/python-slugify
# MIT License # MIT License
import functools
import hashlib import hashlib
import re import re
from typing import Any, Final from typing import Any, Final
@@ -17,6 +18,11 @@ _DUPLICATE_UNDERSCORE_PATTERN: Final[re.Pattern[str]] = re.compile(r"_+")
_MAX_TOOL_NAME_LENGTH: Final[int] = 64 _MAX_TOOL_NAME_LENGTH: Final[int] = 64
@functools.lru_cache(maxsize=8)
def _duplicate_separator_pattern(separator: str) -> re.Pattern[str]:
return re.compile(f"(?:{re.escape(separator)}){{2,}}")
def sanitize_tool_name(name: str, max_length: int = _MAX_TOOL_NAME_LENGTH) -> str: def sanitize_tool_name(name: str, max_length: int = _MAX_TOOL_NAME_LENGTH) -> str:
"""Sanitize tool name for LLM provider compatibility. """Sanitize tool name for LLM provider compatibility.
@@ -48,6 +54,28 @@ def sanitize_tool_name(name: str, max_length: int = _MAX_TOOL_NAME_LENGTH) -> st
return name return name
def slugify(text: str, separator: str = "_") -> str:
"""Convert text to a URL-safe slug.
Normalizes Unicode characters, removes special characters,
and replaces whitespace with the separator.
Args:
text: The text to slugify.
separator: The separator to use between words. Defaults to underscore.
Returns:
A URL-safe slug.
"""
text = unicodedata.normalize("NFKD", text)
text = text.encode("ascii", "ignore").decode("ascii")
text = text.lower()
text = _QUOTE_PATTERN.sub("", text)
text = _DISALLOWED_CHARS_PATTERN.sub(separator, text)
text = _duplicate_separator_pattern(separator).sub(separator, text)
return text.strip(separator)
def interpolate_only( def interpolate_only(
input_string: str | None, input_string: str | None,
inputs: dict[str, str | int | float | dict[str, Any] | list[Any]], inputs: dict[str, str | int | float | dict[str, Any] | list[Any]],


@@ -1456,7 +1456,7 @@ def test_agent_execute_task_with_tool():
) )
result = agent.execute_task(task) result = agent.execute_task(task)
assert "you should always think about what to do" in result assert "test query" in result
@pytest.mark.vcr() @pytest.mark.vcr()
@@ -1475,9 +1475,9 @@ def test_agent_execute_task_with_custom_llm():
) )
result = agent.execute_task(task) result = agent.execute_task(task)
assert "In circuits they thrive" in result assert "Artificial minds" in result
assert "Artificial minds awake" in result assert "Code and circuits" in result
assert "Future's coded drive" in result assert "Future undefined" in result
@pytest.mark.vcr() @pytest.mark.vcr()

File diff suppressed because it is too large


@@ -1,240 +1,345 @@
"""Tests for reasoning in agents.""" """Tests for planning/reasoning in agents."""
import json import warnings
import pytest import pytest
from crewai import Agent, Task from crewai import Agent, PlanningConfig, Task
from crewai.llm import LLM from crewai.llm import LLM
@pytest.fixture # =============================================================================
def mock_llm_responses(): # Tests for PlanningConfig configuration (no LLM calls needed)
"""Fixture for mock LLM responses.""" # =============================================================================
return {
"ready": "I'll solve this simple math problem.\n\nREADY: I am ready to execute the task.\n\n",
"not_ready": "I need to think about derivatives.\n\nNOT READY: I need to refine my plan because I'm not sure about the derivative rules.",
"ready_after_refine": "I'll use the power rule for derivatives where d/dx(x^n) = n*x^(n-1).\n\nREADY: I am ready to execute the task.",
"execution": "4",
}
def test_agent_with_reasoning(mock_llm_responses): def test_planning_config_default_values():
"""Test agent with reasoning.""" """Test PlanningConfig default values."""
llm = LLM("gpt-3.5-turbo") config = PlanningConfig()
assert config.max_attempts is None
assert config.max_steps == 20
assert config.system_prompt is None
assert config.plan_prompt is None
assert config.refine_prompt is None
assert config.llm is None
def test_planning_config_custom_values():
"""Test PlanningConfig with custom values."""
config = PlanningConfig(
max_attempts=5,
max_steps=15,
system_prompt="Custom system",
plan_prompt="Custom plan: {description}",
refine_prompt="Custom refine: {current_plan}",
llm="gpt-4",
)
assert config.max_attempts == 5
assert config.max_steps == 15
assert config.system_prompt == "Custom system"
assert config.plan_prompt == "Custom plan: {description}"
assert config.refine_prompt == "Custom refine: {current_plan}"
assert config.llm == "gpt-4"
def test_agent_with_planning_config_custom_prompts():
"""Test agent with PlanningConfig using custom prompts."""
llm = LLM("gpt-4o-mini")
custom_system_prompt = "You are a specialized planner."
custom_plan_prompt = "Plan this task: {description}"
agent = Agent(
role="Test Agent",
goal="To test custom prompts",
backstory="I am a test agent.",
llm=llm,
planning_config=PlanningConfig(
system_prompt=custom_system_prompt,
plan_prompt=custom_plan_prompt,
max_steps=10,
),
verbose=False,
)
# Just test that the agent is created properly
assert agent.planning_config is not None
assert agent.planning_config.system_prompt == custom_system_prompt
assert agent.planning_config.plan_prompt == custom_plan_prompt
assert agent.planning_config.max_steps == 10
def test_agent_with_planning_config_disabled():
"""Test agent with PlanningConfig disabled."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Test Agent",
goal="To test disabled planning",
backstory="I am a test agent.",
llm=llm,
planning=False,
verbose=False,
)
# Planning should be disabled
assert agent.planning_enabled is False
def test_planning_enabled_property():
"""Test the planning_enabled property on Agent."""
llm = LLM("gpt-4o-mini")
# With planning_config enabled
agent_with_planning = Agent(
role="Test Agent",
goal="Test",
backstory="Test",
llm=llm,
planning=True,
)
assert agent_with_planning.planning_enabled is True
# With planning_config disabled
agent_disabled = Agent(
role="Test Agent",
goal="Test",
backstory="Test",
llm=llm,
planning=False,
)
assert agent_disabled.planning_enabled is False
# Without planning_config
agent_no_planning = Agent(
role="Test Agent",
goal="Test",
backstory="Test",
llm=llm,
)
assert agent_no_planning.planning_enabled is False
# =============================================================================
# Tests for backward compatibility with reasoning=True (no LLM calls)
# =============================================================================
def test_agent_with_reasoning_backward_compat():
"""Test agent with reasoning=True (backward compatibility)."""
llm = LLM("gpt-4o-mini")
# This should emit a deprecation warning
with warnings.catch_warnings(record=True):
warnings.simplefilter("always")
agent = Agent( agent = Agent(
role="Test Agent", role="Test Agent",
goal="To test the reasoning feature", goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.", backstory="I am a test agent created to verify the reasoning feature works correctly.",
llm=llm, llm=llm,
reasoning=True, reasoning=True,
verbose=True, verbose=False,
) )
task = Task( # Should have created a PlanningConfig internally
description="Simple math task: What's 2+2?", assert agent.planning_config is not None
expected_output="The answer should be a number.", assert agent.planning_enabled is True
agent=agent,
)
agent.llm.call = lambda messages, *args, **kwargs: (
mock_llm_responses["ready"]
if any("create a detailed plan" in msg.get("content", "") for msg in messages)
else mock_llm_responses["execution"]
)
result = agent.execute_task(task)
assert result == mock_llm_responses["execution"]
assert "Reasoning Plan:" in task.description
def test_agent_with_reasoning_not_ready_initially(mock_llm_responses): def test_agent_with_reasoning_and_max_attempts_backward_compat():
"""Test agent with reasoning that requires refinement.""" """Test agent with reasoning=True and max_reasoning_attempts (backward compatibility)."""
llm = LLM("gpt-3.5-turbo") llm = LLM("gpt-4o-mini")
agent = Agent( agent = Agent(
role="Test Agent", role="Test Agent",
goal="To test the reasoning feature", goal="To test the reasoning feature",
backstory="I am a test agent created to verify the reasoning feature works correctly.", backstory="I am a test agent.",
llm=llm, llm=llm,
reasoning=True, reasoning=True,
max_reasoning_attempts=2, max_reasoning_attempts=5,
verbose=True, verbose=False,
) )
task = Task( # Should have created a PlanningConfig with max_attempts
description="Complex math task: What's the derivative of x²?", assert agent.planning_config is not None
expected_output="The answer should be a mathematical expression.", assert agent.planning_config.max_attempts == 5
agent=agent,
)
call_count = [0]
def mock_llm_call(messages, *args, **kwargs):
if any(
"create a detailed plan" in msg.get("content", "") for msg in messages
) or any("refine your plan" in msg.get("content", "") for msg in messages):
call_count[0] += 1
if call_count[0] == 1:
return mock_llm_responses["not_ready"]
return mock_llm_responses["ready_after_refine"]
return "2x"
agent.llm.call = mock_llm_call
result = agent.execute_task(task)
assert result == "2x"
assert call_count[0] == 2 # Should have made 2 reasoning calls
assert "Reasoning Plan:" in task.description
def test_agent_with_reasoning_max_attempts_reached(): # =============================================================================
"""Test agent with reasoning that reaches max attempts without being ready.""" # Tests for Agent.kickoff() with planning (uses AgentExecutor)
llm = LLM("gpt-3.5-turbo") # =============================================================================
@pytest.mark.vcr()
def test_agent_kickoff_with_planning():
"""Test Agent.kickoff() with planning enabled generates a plan."""
llm = LLM("gpt-4o-mini")
agent = Agent( agent = Agent(
role="Test Agent", role="Math Assistant",
goal="To test the reasoning feature", goal="Help solve math problems step by step",
backstory="I am a test agent created to verify the reasoning feature works correctly.", backstory="A helpful math tutor",
llm=llm, llm=llm,
reasoning=True, planning_config=PlanningConfig(max_attempts=1),
max_reasoning_attempts=2, verbose=False,
verbose=True,
) )
task = Task( result = agent.kickoff("What is 15 + 27?")
description="Complex math task: Solve the Riemann hypothesis.",
expected_output="A proof or disproof of the hypothesis.",
agent=agent,
)
call_count = [0] assert result is not None
assert "42" in str(result)
def mock_llm_call(messages, *args, **kwargs):
if any(
"create a detailed plan" in msg.get("content", "") for msg in messages
) or any("refine your plan" in msg.get("content", "") for msg in messages):
call_count[0] += 1
return f"Attempt {call_count[0]}: I need more time to think.\n\nNOT READY: I need to refine my plan further."
return "This is an unsolved problem in mathematics."
agent.llm.call = mock_llm_call
result = agent.execute_task(task)
assert result == "This is an unsolved problem in mathematics."
assert (
call_count[0] == 2
) # Should have made exactly 2 reasoning calls (max_attempts)
assert "Reasoning Plan:" in task.description
def test_agent_reasoning_error_handling(): @pytest.mark.vcr()
"""Test error handling during the reasoning process.""" def test_agent_kickoff_without_planning():
llm = LLM("gpt-3.5-turbo") """Test Agent.kickoff() without planning skips plan generation."""
llm = LLM("gpt-4o-mini")
agent = Agent( agent = Agent(
role="Test Agent", role="Math Assistant",
goal="To test the reasoning feature", goal="Help solve math problems",
backstory="I am a test agent created to verify the reasoning feature works correctly.", backstory="A helpful assistant",
llm=llm, llm=llm,
reasoning=True, # No planning_config = no planning
verbose=False,
) )
task = Task( result = agent.kickoff("What is 8 * 7?")
description="Task that will cause an error",
expected_output="Output that will never be generated",
agent=agent,
)
call_count = [0] assert result is not None
assert "56" in str(result)
def mock_llm_call_error(*args, **kwargs):
call_count[0] += 1
if call_count[0] <= 2: # First calls are for reasoning
raise Exception("LLM error during reasoning")
return "Fallback execution result" # Return a value for task execution
agent.llm.call = mock_llm_call_error
result = agent.execute_task(task)
assert result == "Fallback execution result"
assert call_count[0] > 2 # Ensure we called the mock multiple times
@pytest.mark.skip(reason="Test requires updates for native tool calling changes") @pytest.mark.vcr()
def test_agent_with_function_calling(): def test_agent_kickoff_with_planning_disabled():
"""Test agent with reasoning using function calling.""" """Test Agent.kickoff() with planning explicitly disabled via planning=False."""
llm = LLM("gpt-3.5-turbo") llm = LLM("gpt-4o-mini")
agent = Agent( agent = Agent(
role="Test Agent", role="Math Assistant",
goal="To test the reasoning feature", goal="Help solve math problems",
backstory="I am a test agent created to verify the reasoning feature works correctly.", backstory="A helpful assistant",
llm=llm, llm=llm,
reasoning=True, planning=False, # Explicitly disable planning
verbose=True, verbose=False,
) )
task = Task( result = agent.kickoff("What is 100 / 4?")
description="Simple math task: What's 2+2?",
expected_output="The answer should be a number.",
agent=agent,
)
agent.llm.supports_function_calling = lambda: True assert result is not None
assert "25" in str(result)
def mock_function_call(messages, *args, **kwargs):
if "tools" in kwargs:
return json.dumps(
{"plan": "I'll solve this simple math problem: 2+2=4.", "ready": True}
)
return "4"
agent.llm.call = mock_function_call
result = agent.execute_task(task)
assert result == "4"
assert "Reasoning Plan:" in task.description
assert "I'll solve this simple math problem: 2+2=4." in task.description
@pytest.mark.skip(reason="Test requires updates for native tool calling changes") @pytest.mark.vcr()
def test_agent_with_function_calling_fallback(): def test_agent_kickoff_multi_step_task_with_planning():
"""Test agent with reasoning using function calling that falls back to text parsing.""" """Test Agent.kickoff() with a multi-step task that benefits from planning."""
llm = LLM("gpt-3.5-turbo") llm = LLM("gpt-4o-mini")
agent = Agent( agent = Agent(
role="Test Agent", role="Math Tutor",
goal="To test the reasoning feature", goal="Solve multi-step math problems",
backstory="I am a test agent created to verify the reasoning feature works correctly.", backstory="An expert tutor who explains step by step",
llm=llm, llm=llm,
reasoning=True, planning_config=PlanningConfig(max_attempts=1, max_steps=5),
verbose=True, verbose=False,
)
# Task requires: find primes, sum them, then double
result = agent.kickoff(
"Find the first 3 prime numbers, add them together, then multiply by 2."
)
assert result is not None
# First 3 primes: 2, 3, 5 -> sum = 10 -> doubled = 20
assert "20" in str(result)
# =============================================================================
# Tests for Agent.execute_task() with planning (uses CrewAgentExecutor)
# These test the legacy path via handle_reasoning()
# =============================================================================
@pytest.mark.vcr()
def test_agent_execute_task_with_planning():
"""Test Agent.execute_task() with planning via CrewAgentExecutor."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Help solve math problems",
backstory="A helpful math tutor",
llm=llm,
planning_config=PlanningConfig(max_attempts=1),
verbose=False,
) )
task = Task( task = Task(
description="Simple math task: What's 2+2?", description="What is 9 + 11?",
expected_output="The answer should be a number.", expected_output="A number",
agent=agent, agent=agent,
) )
agent.llm.supports_function_calling = lambda: True result = agent.execute_task(task)
def mock_function_call(messages, *args, **kwargs): assert result is not None
if "tools" in kwargs: assert "20" in str(result)
return "Invalid JSON that will trigger fallback. READY: I am ready to execute the task." # Planning should be appended to task description
return "4" assert "Planning:" in task.description
agent.llm.call = mock_function_call
@pytest.mark.vcr()
def test_agent_execute_task_without_planning():
"""Test Agent.execute_task() without planning."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Assistant",
goal="Help solve math problems",
backstory="A helpful assistant",
llm=llm,
verbose=False,
)
task = Task(
description="What is 12 * 3?",
expected_output="A number",
agent=agent,
)
result = agent.execute_task(task) result = agent.execute_task(task)
assert result == "4" assert result is not None
assert "Reasoning Plan:" in task.description assert "36" in str(result)
assert "Invalid JSON that will trigger fallback" in task.description # No planning should be added
assert "Planning:" not in task.description
@pytest.mark.vcr()
def test_agent_execute_task_with_planning_refine():
"""Test Agent.execute_task() with planning that requires refinement."""
llm = LLM("gpt-4o-mini")
agent = Agent(
role="Math Tutor",
goal="Solve complex math problems step by step",
backstory="An expert tutor",
llm=llm,
planning_config=PlanningConfig(max_attempts=2),
verbose=False,
)
task = Task(
description="Calculate the area of a circle with radius 5 (use pi = 3.14)",
expected_output="The area as a number",
agent=agent,
)
result = agent.execute_task(task)
assert result is not None
# Area = pi * r^2 = 3.14 * 25 = 78.5
assert "78" in str(result) or "79" in str(result)
assert "Planning:" in task.description
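Note on the expected values asserted above: the multi-step and refinement tests hard-code "20" and "78"/"79". A minimal sketch that recomputes those constants independently of any LLM call follows; the helper name `first_primes` is hypothetical and not part of the test suite.

```python
def first_primes(count: int) -> list[int]:
    """Return the first `count` primes by trial division (fine for tiny counts)."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        # A candidate is prime if no smaller prime divides it.
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Multi-step task: first 3 primes, summed, then doubled.
primes = first_primes(3)
doubled_sum = sum(primes) * 2

# Refinement task: area of a circle with radius 5, using pi = 3.14 as the prompt states.
area = 3.14 * 5 ** 2

# These mirror the substring checks used by the tests.
multi_step_ok = "20" in str(doubled_sum)
area_ok = "78" in str(area) or "79" in str(area)
```

The substring checks are deliberately loose, so they pass when the model adds surrounding explanation, but they would also pass on any response that merely contains those digits.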

@@ -359,17 +359,34 @@ def test_sets_flow_context_when_inside_flow():
@pytest.mark.vcr() @pytest.mark.vcr()
def test_guardrail_is_called_using_string(): def test_guardrail_is_called_using_string():
"""Test that a string guardrail triggers events and retries correctly.
Uses a callable guardrail that deterministically fails on the first
attempt and passes on the second. This tests the guardrail event
machinery (started/completed events, retry loop) without depending
on the LLM to comply with contradictory constraints.
"""
guardrail_events: dict[str, list] = defaultdict(list) guardrail_events: dict[str, list] = defaultdict(list)
from crewai.events.event_types import ( from crewai.events.event_types import (
LLMGuardrailCompletedEvent, LLMGuardrailCompletedEvent,
LLMGuardrailStartedEvent, LLMGuardrailStartedEvent,
) )
# Deterministic guardrail: fail first call, pass second
call_count = {"n": 0}
def fail_then_pass_guardrail(output):
call_count["n"] += 1
if call_count["n"] == 1:
return (False, "Missing required format — please use a numbered list")
return (True, output)
agent = Agent( agent = Agent(
role="Sports Analyst", role="Sports Analyst",
goal="Gather information about the best soccer players", goal="List the best soccer players",
backstory="""You are an expert at gathering and organizing information. You carefully collect details and present them in a structured way.""", backstory="You are an expert at gathering and organizing information.",
guardrail="""Only include Brazilian players, both women and men""", guardrail=fail_then_pass_guardrail,
guardrail_max_retries=3,
) )
condition = threading.Condition() condition = threading.Condition()
@@ -388,7 +405,7 @@ def test_guardrail_is_called_using_string():
guardrail_events["completed"].append(event) guardrail_events["completed"].append(event)
condition.notify() condition.notify()
result = agent.kickoff(messages="Top 10 best players in the world?") result = agent.kickoff(messages="Top 5 best soccer players in the world?")
with condition: with condition:
success = condition.wait_for( success = condition.wait_for(
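Note on the deterministic guardrail in this test: the fail-then-pass pattern can be exercised without an agent or an LLM. The sketch below is an illustration under stated assumptions. `run_with_guardrail` is a hypothetical stand-in for a retry loop, not crewAI's actual implementation, and its `max_retries` semantics (one initial attempt plus up to `max_retries` retries) are assumed rather than taken from the library.

```python
from typing import Any, Callable, Optional, Tuple

GuardrailResult = Tuple[bool, Any]


def make_fail_then_pass_guardrail() -> Callable[[Any], GuardrailResult]:
    """Build a guardrail that rejects the first output and accepts later ones."""
    call_count = {"n": 0}

    def guardrail(output: Any) -> GuardrailResult:
        call_count["n"] += 1
        if call_count["n"] == 1:
            return (False, "Missing required format")
        return (True, output)

    return guardrail


def run_with_guardrail(
    produce: Callable[[Optional[str]], Any],
    guardrail: Callable[[Any], GuardrailResult],
    max_retries: int = 3,
) -> Tuple[Any, int]:
    """Hypothetical retry loop: call `produce` until the guardrail passes.

    Returns the accepted output and the number of attempts made.
    """
    attempts = 0
    feedback: Optional[str] = None
    while True:
        attempts += 1
        output = produce(feedback)
        ok, result = guardrail(output)
        if ok:
            return result, attempts
        if attempts > max_retries:
            raise RuntimeError(f"guardrail still failing after {max_retries} retries")
        # On failure the second tuple element carries the error message.
        feedback = result


accepted, attempts = run_with_guardrail(
    lambda feedback: "1. Example player",
    make_fail_then_pass_guardrail(),
)
```

Because the guardrail's outcome depends only on its call count, the number of started and completed events is fixed in advance, which is what makes the event assertions independent of what the model writes.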

@@ -1,15 +1,9 @@
interactions: interactions:
- request: - request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What type of document is this?\n\nBegin! This is VERY important to you, Task: What type of document is this?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
use the tools available and give your best Final Answer, your job depends on
it!\n\nThought:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer is: Analyze and describe files accurately"}'
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
headers: headers:
User-Agent: User-Agent:
- X-USER-AGENT-XXX - X-USER-AGENT-XXX
@@ -22,7 +16,7 @@ interactions:
connection: connection:
- keep-alive - keep-alive
content-length: content-length:
- '1351' - '950'
content-type: content-type:
- application/json - application/json
host: host:
@@ -38,35 +32,35 @@ interactions:
x-stainless-os: x-stainless-os:
- X-STAINLESS-OS-XXX - X-STAINLESS-OS-XXX
x-stainless-package-version: x-stainless-package-version:
- 0.71.1 - 0.73.0
x-stainless-retry-count: x-stainless-retry-count:
- '0' - '0'
x-stainless-runtime: x-stainless-runtime:
- CPython - CPython
x-stainless-runtime-version: x-stainless-runtime-version:
- 3.12.10 - 3.13.3
x-stainless-timeout: x-stainless-timeout:
- NOT_GIVEN - NOT_GIVEN
method: POST method: POST
uri: https://api.anthropic.com/v1/messages uri: https://api.anthropic.com/v1/messages
response: response:
body: body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01AcygCF93tRhc7A3bfXMqe7","type":"message","role":"assistant","content":[{"type":"text","text":"Thought: string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01C8ZkZMunUVDUDd8mh1r1We","type":"message","role":"assistant","content":[{"type":"text","text":"I
I can see this is a PDF document, but the image appears to be completely white apologize, but the image appears to be completely blank or white. Without
or blank. Without any visible content, I cannot definitively determine the any visible text, graphics, or distinguishing features, I cannot determine
specific type of document.\n\nFinal Answer: The document is a PDF file, but the type of document. The file is a PDF, but the content page seems to be
the provided image shows a blank white page with no discernible content or empty or failed to render properly."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1658,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":58,"service_tier":"standard","inference_geo":"not_available"}}'
text. More information or a clearer image would be needed to identify the
precise type of document."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1750,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":89,"service_tier":"standard"}}'
headers: headers:
CF-RAY: CF-RAY:
- CF-RAY-XXX - CF-RAY-XXX
Connection: Connection:
- keep-alive - keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type: Content-Type:
- application/json - application/json
Date: Date:
- Fri, 23 Jan 2026 19:08:04 GMT - Thu, 12 Feb 2026 19:30:55 GMT
Server: Server:
- cloudflare - cloudflare
Transfer-Encoding: Transfer-Encoding:
@@ -92,7 +86,7 @@ interactions:
anthropic-ratelimit-requests-remaining: anthropic-ratelimit-requests-remaining:
- '3999' - '3999'
anthropic-ratelimit-requests-reset: anthropic-ratelimit-requests-reset:
- '2026-01-23T19:08:01Z' - '2026-02-12T19:30:53Z'
anthropic-ratelimit-tokens-limit: anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX - ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining: anthropic-ratelimit-tokens-remaining:
@@ -106,7 +100,112 @@ interactions:
strict-transport-security: strict-transport-security:
- STS-XXX - STS-XXX
x-envoy-upstream-service-time: x-envoy-upstream-service-time:
- '2837' - '2129'
status:
code: 200
message: OK
- request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What type of document is this?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
anthropic-version:
- '2023-06-01'
connection:
- keep-alive
content-length:
- '950'
content-type:
- application/json
host:
- api.anthropic.com
x-api-key:
- X-API-KEY-XXX
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 0.73.0
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
x-stainless-timeout:
- NOT_GIVEN
method: POST
uri: https://api.anthropic.com/v1/messages
response:
body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_013jb7edagayZxqGs6ioACyU","type":"message","role":"assistant","content":[{"type":"text","text":"I
apologize, but the image appears to be completely blank or white. There are
no visible contents or text that I can analyze to determine the type of document.
Without any discernible information, I cannot definitively state what type
of document this is."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1658,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":55,"service_tier":"standard","inference_geo":"not_available"}}'
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:30:58 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Robots-Tag:
- none
anthropic-organization-id:
- ANTHROPIC-ORGANIZATION-ID-XXX
anthropic-ratelimit-input-tokens-limit:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-input-tokens-remaining:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-input-tokens-reset:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
anthropic-ratelimit-output-tokens-limit:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-output-tokens-remaining:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-output-tokens-reset:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
anthropic-ratelimit-requests-limit:
- '4000'
anthropic-ratelimit-requests-remaining:
- '3999'
anthropic-ratelimit-requests-reset:
- '2026-02-12T19:30:56Z'
anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining:
- ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
anthropic-ratelimit-tokens-reset:
- ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
cf-cache-status:
- DYNAMIC
request-id:
- REQUEST-ID-XXX
strict-transport-security:
- STS-XXX
x-envoy-upstream-service-time:
- '2005'
status: status:
code: 200 code: 200
message: OK message: OK

File diff suppressed because one or more lines are too long

@@ -1,14 +1,9 @@
interactions: interactions:
- request: - request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What is this document?\n\nBegin! This is VERY important to you, use the Task: What is this document?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
tools available and give your best Final Answer, your job depends on it!\n\nThought:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer is: Analyze and describe files accurately"}'
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
headers: headers:
User-Agent: User-Agent:
- X-USER-AGENT-XXX - X-USER-AGENT-XXX
@@ -21,7 +16,7 @@ interactions:
connection: connection:
- keep-alive - keep-alive
content-length: content-length:
- '1343' - '942'
content-type: content-type:
- application/json - application/json
host: host:
@@ -37,34 +32,35 @@ interactions:
x-stainless-os: x-stainless-os:
- X-STAINLESS-OS-XXX - X-STAINLESS-OS-XXX
x-stainless-package-version: x-stainless-package-version:
- 0.71.1 - 0.73.0
x-stainless-retry-count: x-stainless-retry-count:
- '0' - '0'
x-stainless-runtime: x-stainless-runtime:
- CPython - CPython
x-stainless-runtime-version: x-stainless-runtime-version:
- 3.12.10 - 3.13.3
x-stainless-timeout: x-stainless-timeout:
- NOT_GIVEN - NOT_GIVEN
method: POST method: POST
uri: https://api.anthropic.com/v1/messages uri: https://api.anthropic.com/v1/messages
response: response:
body: body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01XwAhfdaMxwTNzTy7YhmA5e","type":"message","role":"assistant","content":[{"type":"text","text":"Thought: string: '{"model":"claude-3-5-haiku-20241022","id":"msg_01RnyTYpTE9Dd8BfwyMfuwum","type":"message","role":"assistant","content":[{"type":"text","text":"I
I can see this is a PDF document, but the image appears to be blank or completely apologize, but the image appears to be blank or completely white. Without
white. Without any visible text or content, I cannot determine the specific any visible text or content, I cannot determine the type or nature of the
type or purpose of this document.\n\nFinal Answer: The document appears to document. If you intended to share a specific document, you may want to check
be a blank white PDF page with no discernible text, images, or content visible. the file and try uploading it again."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1656,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":59,"service_tier":"standard","inference_geo":"not_available"}}'
It could be an empty document, a scanning error, or a placeholder file."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1748,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":88,"service_tier":"standard"}}'
headers: headers:
CF-RAY: CF-RAY:
- CF-RAY-XXX - CF-RAY-XXX
Connection: Connection:
- keep-alive - keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type: Content-Type:
- application/json - application/json
Date: Date:
- Fri, 23 Jan 2026 19:08:19 GMT - Thu, 12 Feb 2026 19:29:25 GMT
Server: Server:
- cloudflare - cloudflare
Transfer-Encoding: Transfer-Encoding:
@@ -90,7 +86,7 @@ interactions:
anthropic-ratelimit-requests-remaining: anthropic-ratelimit-requests-remaining:
- '3999' - '3999'
anthropic-ratelimit-requests-reset: anthropic-ratelimit-requests-reset:
- '2026-01-23T19:08:16Z' - '2026-02-12T19:29:23Z'
anthropic-ratelimit-tokens-limit: anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX - ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining: anthropic-ratelimit-tokens-remaining:
@@ -104,7 +100,111 @@ interactions:
strict-transport-security: strict-transport-security:
- STS-XXX - STS-XXX
x-envoy-upstream-service-time: x-envoy-upstream-service-time:
- '3114' - '2072'
status:
code: 200
message: OK
- request:
body: '{"max_tokens":4096,"messages":[{"role":"user","content":[{"type":"text","text":"\nCurrent
Task: What is this document?\n\nProvide your complete response:"},{"type":"document","source":{"type":"base64","media_type":"application/pdf","data":"JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="},"cache_control":{"type":"ephemeral"}}]}],"model":"claude-3-5-haiku-20241022","stop_sequences":["\nObservation:"],"stream":false,"system":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
anthropic-version:
- '2023-06-01'
connection:
- keep-alive
content-length:
- '942'
content-type:
- application/json
host:
- api.anthropic.com
x-api-key:
- X-API-KEY-XXX
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 0.73.0
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
x-stainless-timeout:
- NOT_GIVEN
method: POST
uri: https://api.anthropic.com/v1/messages
response:
body:
string: '{"model":"claude-3-5-haiku-20241022","id":"msg_011J2La8KpjxAK255NsSpePY","type":"message","role":"assistant","content":[{"type":"text","text":"I
apologize, but the document appears to be a blank white page. No text, images,
or discernible content is visible in this PDF file. Without any readable information,
I cannot determine the type or purpose of this document."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":1656,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":51,"service_tier":"standard","inference_geo":"not_available"}}'
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Security-Policy:
- CSP-FILTERED
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:29:27 GMT
Server:
- cloudflare
Transfer-Encoding:
- chunked
X-Robots-Tag:
- none
anthropic-organization-id:
- ANTHROPIC-ORGANIZATION-ID-XXX
anthropic-ratelimit-input-tokens-limit:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-input-tokens-remaining:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-input-tokens-reset:
- ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
anthropic-ratelimit-output-tokens-limit:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
anthropic-ratelimit-output-tokens-remaining:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
anthropic-ratelimit-output-tokens-reset:
- ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
anthropic-ratelimit-requests-limit:
- '4000'
anthropic-ratelimit-requests-remaining:
- '3999'
anthropic-ratelimit-requests-reset:
- '2026-02-12T19:29:26Z'
anthropic-ratelimit-tokens-limit:
- ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
anthropic-ratelimit-tokens-remaining:
- ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
anthropic-ratelimit-tokens-reset:
- ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
cf-cache-status:
- DYNAMIC
request-id:
- REQUEST-ID-XXX
strict-transport-security:
- STS-XXX
x-envoy-upstream-service-time:
- '1802'
status: status:
code: 200 code: 200
message: OK message: OK

@@ -1,14 +1,9 @@
interactions: interactions:
- request: - request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What is this document?\n\nBegin! This is VERY important to you, use the Task: What is this document?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
tools available and give your best Final Answer, your job depends on it!\n\nThought:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately\nTo give my best complete final answer is: Analyze and describe files accurately"}'
to the task respond using the exact following format:\n\nThought: I now can
give a great answer\nFinal Answer: Your final answer must be the great and the
most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"}'
headers: headers:
User-Agent: User-Agent:
- X-USER-AGENT-XXX - X-USER-AGENT-XXX
@@ -21,7 +16,7 @@ interactions:
connection: connection:
- keep-alive - keep-alive
content-length: content-length:
- '1235' - '834'
content-type: content-type:
- application/json - application/json
host: host:
@@ -43,47 +38,37 @@ interactions:
x-stainless-runtime: x-stainless-runtime:
- CPython - CPython
x-stainless-runtime-version: x-stainless-runtime-version:
- 3.12.10 - 3.13.3
method: POST method: POST
uri: https://api.openai.com/v1/responses uri: https://api.openai.com/v1/responses
response: response:
body: body:
string: "{\n \"id\": \"resp_059d23bc71d450aa006973c72416788197bddcc99157e3a313\",\n string: "{\n \"id\": \"resp_0751868929a7aa7500698e2a23d5508194b8e4092ff79a8f41\",\n
\ \"object\": \"response\",\n \"created_at\": 1769195300,\n \"status\": \ \"object\": \"response\",\n \"created_at\": 1770924579,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\": \"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1769195307,\n \"error\": null,\n \"developer\"\n },\n \"completed_at\": 1770924581,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\": \ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal \"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\\nTo give my best complete goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
final answer to the task respond using the exact following format:\\n\\nThought:
I now can give a great answer\\nFinal Answer: Your final answer must be the
great and the most complete as possible, it must be outcome described.\\n\\nI
MUST use these formats, my job depends on it!\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n null,\n \"max_tool_calls\": null,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"output\": [\n {\n \"id\": \"msg_059d23bc71d450aa006973c724b1d881979787b0eeb53bdbd2\",\n \ \"output\": [\n {\n \"id\": \"msg_0751868929a7aa7500698e2a2474208194a7ea7e8d1179c3fa\",\n
\ \"type\": \"message\",\n \"status\": \"completed\",\n \"content\": \ \"type\": \"message\",\n \"status\": \"completed\",\n \"content\":
[\n {\n \"type\": \"output_text\",\n \"annotations\": [\n {\n \"type\": \"output_text\",\n \"annotations\":
[],\n \"logprobs\": [],\n \"text\": \"Thought: I now can [],\n \"logprobs\": [],\n \"text\": \"It seems that you
give a great answer. \\nFinal Answer: Without access to a specific document have not uploaded any document or file for analysis. Please provide the file
or its contents, I cannot provide a detailed analysis. However, in general, you'd like me to review, and I'll be happy to help you with the analysis and
important aspects of a document can include its format (such as PDF, DOCX, description.\"\n }\n ],\n \"role\": \"assistant\"\n }\n
or TXT), purpose (such as legal, informative, or persuasive), and key elements \ ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\": 0.0,\n \"previous_response_id\":
like headings, text structure, and any embedded media (such as images or charts). null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\": null,\n
For a thorough analysis, it's essential to understand the context, audience, \ \"reasoning\": {\n \"effort\": null,\n \"summary\": null\n },\n \"safety_identifier\":
and intended use of the document. If you can provide the document itself or null,\n \"service_tier\": \"default\",\n \"store\": true,\n \"temperature\":
more context about it, I would be able to give a complete assessment.\"\n 1.0,\n \"text\": {\n \"format\": {\n \"type\": \"text\"\n },\n
\ }\n ],\n \"role\": \"assistant\"\n }\n ],\n \"parallel_tool_calls\": \ \"verbosity\": \"medium\"\n },\n \"tool_choice\": \"auto\",\n \"tools\":
true,\n \"presence_penalty\": 0.0,\n \"previous_response_id\": null,\n \"prompt_cache_key\": [],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\": \"disabled\",\n
null,\n \"prompt_cache_retention\": null,\n \"reasoning\": {\n \"effort\": \ \"usage\": {\n \"input_tokens\": 51,\n \"input_tokens_details\": {\n
null,\n \"summary\": null\n },\n \"safety_identifier\": null,\n \"service_tier\": \ \"cached_tokens\": 0\n },\n \"output_tokens\": 38,\n \"output_tokens_details\":
\"default\",\n \"store\": true,\n \"temperature\": 1.0,\n \"text\": {\n {\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 89\n },\n
\ \"format\": {\n \"type\": \"text\"\n },\n \"verbosity\": \"medium\"\n \ \"user\": null,\n \"metadata\": {}\n}"
\ },\n \"tool_choice\": \"auto\",\n \"tools\": [],\n \"top_logprobs\":
0,\n \"top_p\": 1.0,\n \"truncation\": \"disabled\",\n \"usage\": {\n \"input_tokens\":
137,\n \"input_tokens_details\": {\n \"cached_tokens\": 0\n },\n
\ \"output_tokens\": 132,\n \"output_tokens_details\": {\n \"reasoning_tokens\":
0\n },\n \"total_tokens\": 269\n },\n \"user\": null,\n \"metadata\":
{}\n}"
headers: headers:
CF-RAY: CF-RAY:
- CF-RAY-XXX - CF-RAY-XXX
@@ -92,11 +77,9 @@ interactions:
Content-Type: Content-Type:
- application/json - application/json
Date: Date:
- Fri, 23 Jan 2026 19:08:27 GMT - Thu, 12 Feb 2026 19:29:41 GMT
Server: Server:
- cloudflare - cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security: Strict-Transport-Security:
- STS-XXX - STS-XXX
Transfer-Encoding: Transfer-Encoding:
@@ -110,13 +93,132 @@ interactions:
openai-organization: openai-organization:
- OPENAI-ORG-XXX - OPENAI-ORG-XXX
openai-processing-ms: openai-processing-ms:
- '7347' - '1581'
openai-project: openai-project:
- OPENAI-PROJECT-XXX - OPENAI-PROJECT-XXX
openai-version: openai-version:
- '2020-10-01' - '2020-10-01'
x-envoy-upstream-service-time: set-cookie:
- '7350' - SET-COOKIE-XXX
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"input":[{"role":"user","content":[{"type":"input_text","text":"\nCurrent
Task: What is this document?\n\nProvide your complete response:"},{"type":"input_file","filename":"document.pdf","file_data":"data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iaiA8PCAvVHlwZSAvQ2F0YWxvZyAvUGFnZXMgMiAwIFIgPj4gZW5kb2JqCjIgMCBvYmogPDwgL1R5cGUgL1BhZ2VzIC9LaWRzIFszIDAgUl0gL0NvdW50IDEgPj4gZW5kb2JqCjMgMCBvYmogPDwgL1R5cGUgL1BhZ2UgL1BhcmVudCAyIDAgUiAvTWVkaWFCb3ggWzAgMCA2MTIgNzkyXSA+PiBlbmRvYmoKeHJlZgowIDQKMDAwMDAwMDAwMCA2NTUzNSBmCjAwMDAwMDAwMDkgMDAwMDAgbgowMDAwMDAwMDU4IDAwMDAwIG4KMDAwMDAwMDExNSAwMDAwMCBuCnRyYWlsZXIgPDwgL1NpemUgNCAvUm9vdCAxIDAgUiA+PgpzdGFydHhyZWYKMTk2CiUlRU9GCg=="}]}],"model":"gpt-4o-mini","instructions":"You
are File Analyst. Expert at analyzing various file types.\nYour personal goal
is: Analyze and describe files accurately"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '834'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/responses
response:
body:
string: "{\n \"id\": \"resp_0c3ca22d310deec300698e2a25842881929a9aad25ea18eb77\",\n
\ \"object\": \"response\",\n \"created_at\": 1770924581,\n \"status\":
\"completed\",\n \"background\": false,\n \"billing\": {\n \"payer\":
\"developer\"\n },\n \"completed_at\": 1770924582,\n \"error\": null,\n
\ \"frequency_penalty\": 0.0,\n \"incomplete_details\": null,\n \"instructions\":
\"You are File Analyst. Expert at analyzing various file types.\\nYour personal
goal is: Analyze and describe files accurately\",\n \"max_output_tokens\":
null,\n \"max_tool_calls\": null,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"output\": [\n {\n \"id\": \"msg_0c3ca22d310deec300698e2a26058081929351f3632bd1aa8e\",\n
\ \"type\": \"message\",\n \"status\": \"completed\",\n \"content\":
[\n {\n \"type\": \"output_text\",\n \"annotations\":
[],\n \"logprobs\": [],\n \"text\": \"Please upload the
document you would like me to analyze, and I'll provide you with a detailed
description and analysis of its contents.\"\n }\n ],\n \"role\":
\"assistant\"\n }\n ],\n \"parallel_tool_calls\": true,\n \"presence_penalty\":
0.0,\n \"previous_response_id\": null,\n \"prompt_cache_key\": null,\n \"prompt_cache_retention\":
null,\n \"reasoning\": {\n \"effort\": null,\n \"summary\": null\n
\ },\n \"safety_identifier\": null,\n \"service_tier\": \"default\",\n \"store\":
true,\n \"temperature\": 1.0,\n \"text\": {\n \"format\": {\n \"type\":
\"text\"\n },\n \"verbosity\": \"medium\"\n },\n \"tool_choice\":
\"auto\",\n \"tools\": [],\n \"top_logprobs\": 0,\n \"top_p\": 1.0,\n \"truncation\":
\"disabled\",\n \"usage\": {\n \"input_tokens\": 51,\n \"input_tokens_details\":
{\n \"cached_tokens\": 0\n },\n \"output_tokens\": 26,\n \"output_tokens_details\":
{\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 77\n },\n
\ \"user\": null,\n \"metadata\": {}\n}"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 12 Feb 2026 19:29:42 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '870'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
set-cookie:
- SET-COOKIE-XXX
x-ratelimit-limit-requests: x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX - X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens: x-ratelimit-limit-tokens:

View File

@@ -1,16 +1,11 @@
interactions: interactions:
- request: - request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text.\n\nBegin! body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text.\n\nProvide
This is VERY important to you, use the tools available and give your best Final your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
Answer, your job depends on it!\n\nThought:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts": "mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour [{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately\nTo give my best complete personal goal is: Analyze and describe files accurately"}], "role": "user"},
final answer to the task respond using the exact following format:\n\nThought: "generationConfig": {"stopSequences": ["\nObservation:"]}}'
I now can give a great answer\nFinal Answer: Your final answer must be the great
and the most complete as possible, it must be outcome described.\n\nI MUST use
these formats, my job depends on it!"}], "role": "user"}, "generationConfig":
{"stopSequences": ["\nObservation:"]}}'
headers: headers:
User-Agent: User-Agent:
- X-USER-AGENT-XXX - X-USER-AGENT-XXX
@@ -21,13 +16,13 @@ interactions:
connection: connection:
- keep-alive - keep-alive
content-length: content-length:
- '1619' - '1218'
content-type: content-type:
- application/json - application/json
host: host:
- generativelanguage.googleapis.com - generativelanguage.googleapis.com
x-goog-api-client: x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.12.10 - google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key: x-goog-api-key:
- X-GOOG-API-KEY-XXX - X-GOOG-API-KEY-XXX
method: POST method: POST
@@ -35,34 +30,101 @@ interactions:
response: response:
body: body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\": string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"Thought: This text provides guidelines [\n {\n \"text\": \"The text provides guidelines for giving
for giving effective feedback. I need to summarize these guidelines in a clear effective feedback. Key principles include being clear, focusing on behavior
and concise manner.\\n\\nFinal Answer: The text outlines eight guidelines and outcomes with specific examples, balancing positive and constructive criticism,
for providing effective feedback: be clear and concise, focus on behavior remaining respectful, using objective criteria, suggesting actionable next
and outcomes, be specific with examples, balance positive aspects with areas steps, and proofreading for clarity and tone. In essence, feedback should
for improvement, be respectful and constructive by offering solutions, use be easily understood, objective, and geared towards improvement.\\n\"\n }\n
objective criteria, suggest actionable next steps, and proofread for tone, \ ],\n \"role\": \"model\"\n },\n \"finishReason\":
grammar, and clarity before submission. These guidelines aim to ensure feedback \"STOP\",\n \"avgLogprobs\": -0.24900928895864913\n }\n ],\n \"usageMetadata\":
is easily understood, impactful, and geared towards positive growth.\\n\"\n {\n \"promptTokenCount\": 163,\n \"candidatesTokenCount\": 67,\n \"totalTokenCount\":
\ }\n ],\n \"role\": \"model\"\n },\n \"finishReason\": 230,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\"STOP\",\n \"avgLogprobs\": -0.24753604923282657\n }\n ],\n \"usageMetadata\": \ \"tokenCount\": 163\n }\n ],\n \"candidatesTokensDetails\":
{\n \"promptTokenCount\": 252,\n \"candidatesTokenCount\": 111,\n \"totalTokenCount\": [\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 67\n
363,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 252\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 111\n
\ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash\",\n \"responseId\": \ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash\",\n \"responseId\":
\"88lzae_VGaGOjMcPxNCokQI\"\n}\n" \"SDSOaae8LLzRjMcPptjXkQ4\"\n}\n"
headers: headers:
Alt-Svc: Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type: Content-Type:
- application/json; charset=UTF-8 - application/json; charset=UTF-8
Date: Date:
- Fri, 23 Jan 2026 19:20:20 GMT - Thu, 12 Feb 2026 20:12:58 GMT
Server: Server:
- scaffolding on HTTPServer2 - scaffolding on HTTPServer2
Server-Timing: Server-Timing:
- gfet4t7; dur=1200 - gfet4t7; dur=1742
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
X-Frame-Options:
- X-FRAME-OPTIONS-XXX
X-XSS-Protection:
- '0'
status:
code: 200
message: OK
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Summarize this text.\n\nProvide
your complete response:"}, {"inlineData": {"data": "UmV2aWV3IEd1aWRlbGluZXMKCjEuIEJlIGNsZWFyIGFuZCBjb25jaXNlOiBXcml0ZSBmZWVkYmFjayB0aGF0IGlzIGVhc3kgdG8gdW5kZXJzdGFuZC4KMi4gRm9jdXMgb24gYmVoYXZpb3IgYW5kIG91dGNvbWVzOiBEZXNjcmliZSB3aGF0IGhhcHBlbmVkIGFuZCB3aHkgaXQgbWF0dGVycy4KMy4gQmUgc3BlY2lmaWM6IFByb3ZpZGUgZXhhbXBsZXMgdG8gc3VwcG9ydCB5b3VyIHBvaW50cy4KNC4gQmFsYW5jZSBwb3NpdGl2ZXMgYW5kIGltcHJvdmVtZW50czogSGlnaGxpZ2h0IHN0cmVuZ3RocyBhbmQgYXJlYXMgdG8gZ3Jvdy4KNS4gQmUgcmVzcGVjdGZ1bCBhbmQgY29uc3RydWN0aXZlOiBBc3N1bWUgcG9zaXRpdmUgaW50ZW50IGFuZCBvZmZlciBzb2x1dGlvbnMuCjYuIFVzZSBvYmplY3RpdmUgY3JpdGVyaWE6IFJlZmVyZW5jZSBnb2FscywgbWV0cmljcywgb3IgZXhwZWN0YXRpb25zIHdoZXJlIHBvc3NpYmxlLgo3LiBTdWdnZXN0IG5leHQgc3RlcHM6IFJlY29tbWVuZCBhY3Rpb25hYmxlIHdheXMgdG8gaW1wcm92ZS4KOC4gUHJvb2ZyZWFkOiBDaGVjayB0b25lLCBncmFtbWFyLCBhbmQgY2xhcml0eSBiZWZvcmUgc3VibWl0dGluZy4K",
"mimeType": "text/plain"}}], "role": "user"}], "systemInstruction": {"parts":
[{"text": "You are File Analyst. Expert at analyzing various file types.\nYour
personal goal is: Analyze and describe files accurately"}], "role": "user"},
"generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
content-length:
- '1218'
content-type:
- application/json
host:
- generativelanguage.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.49.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"parts\":
[\n {\n \"text\": \"The text provides guidelines for writing
effective feedback. Key recommendations include being clear, concise, specific,
and respectful. Feedback should focus on behavior and outcomes, balance positive
and negative aspects, use objective criteria, and suggest actionable next
steps. Proofreading is essential before submitting feedback.\\n\"\n }\n
\ ],\n \"role\": \"model\"\n },\n \"finishReason\":
\"STOP\",\n \"avgLogprobs\": -0.29874773892489348\n }\n ],\n \"usageMetadata\":
{\n \"promptTokenCount\": 163,\n \"candidatesTokenCount\": 55,\n \"totalTokenCount\":
218,\n \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 163\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 55\n
\ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash\",\n \"responseId\":
\"SjSOab3-HaajjMcP38-yyQw\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Thu, 12 Feb 2026 20:12:59 GMT
Server:
- scaffolding on HTTPServer2
Server-Timing:
- gfet4t7; dur=1198
Transfer-Encoding: Transfer-Encoding:
- chunked - chunked
Vary: Vary:

File diff suppressed because one or more lines are too long


Some files were not shown because too many files have changed in this diff