Compare commits


7 Commits

Author SHA1 Message Date
Devin AI
f13272e3a1 fix: address review feedback for streaming handlers and test robustness
- Fix sync streaming handler: check accumulated_tool_args (from deltas)
  instead of only checking tool_calls from last chunk. This mirrors the
  async streaming handler fix.
- Add isinstance(tool_calls, list) checks in non-streaming handlers for
  type safety (prevents false positives from auto-created MagicMock attrs).
- Update test_handle_streaming_tool_calls_no_available_functions to verify
  tool calls are returned (not discarded) when available_functions is None,
  matching the corrected behavior from issue #4788.
- Emit LLM completion event before returning accumulated tool calls from
  streaming handler.

Co-Authored-By: João <joao@crewai.com>
2026-03-09 13:57:58 +00:00
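The isinstance check called out in this commit matters because a `MagicMock` auto-creates any attribute on access, so a bare truthiness test on `tool_calls` can pass even when the real response has none. A minimal sketch (not CrewAI code) of the false positive it prevents:

```python
from unittest.mock import MagicMock

# A MagicMock response auto-creates any attribute on access, so a bare
# truthiness check on `tool_calls` passes even when no tool calls exist.
response_message = MagicMock()
tool_calls = getattr(response_message, "tool_calls", [])

assert bool(tool_calls)  # the auto-created MagicMock attribute is truthy

# The isinstance(list) guard rejects the auto-created attribute.
has_real_tool_calls = isinstance(tool_calls, list) and bool(tool_calls)
assert has_real_tool_calls is False
```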
Devin AI
6d4fcbd7ee fix: prioritize tool calls over text when available_functions is None
When LLMs like Anthropic return both text content AND tool calls in
the same response, the text response was being returned instead of
the tool calls when available_functions=None. This caused the executor
to treat the text as a final answer, discarding the tool calls.

The fix reorders the priority checks in all 4 response handlers
(_handle_non_streaming_response, _ahandle_non_streaming_response,
_handle_streaming_response, _ahandle_streaming_response) so that
tool calls are returned before falling back to text content when
available_functions is None.

Fixes #4788

Co-Authored-By: João <joao@crewai.com>
2026-03-09 13:45:12 +00:00
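The priority reordering this commit describes can be sketched as follows; the handler names and response shapes are illustrative, not the actual CrewAI signatures:

```python
# Hedged sketch (not the actual CrewAI handlers): order of checks before
# and after the fix when a response carries both text and tool calls.
def handle_response_before(text, tool_calls, available_functions):
    # Old order: text wins whenever available_functions is None,
    # silently discarding the tool calls.
    if (not tool_calls or not available_functions) and text:
        return text
    if tool_calls and not available_functions:
        return tool_calls
    return None

def handle_response_after(text, tool_calls, available_functions):
    # New order: tool calls are returned first so the caller (executor)
    # can execute them; text is only a fallback when no tool calls exist.
    if tool_calls and not available_functions:
        return tool_calls
    if not tool_calls and text:
        return text
    return None

before = handle_response_before("I'll check that.", [{"name": "search"}], None)
after = handle_response_after("I'll check that.", [{"name": "search"}], None)
assert before == "I'll check that."    # text returned, tool calls lost
assert after == [{"name": "search"}]   # tool calls surfaced to the caller
```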
Greyson LaLonde
cd42bcf035 refactor(memory): convert memory classes to serializable
* refactor(memory): convert Memory, MemoryScope, and MemorySlice to BaseModel

* fix(test): update mock memory attribute from _read_only to read_only

* fix: handle re-validation in wrap validators and patch BaseModel class in tests
2026-03-08 23:08:10 -04:00
Greyson LaLonde
bc45a7fbe3 feat: create action for nightly releases
2026-03-06 18:32:52 -05:00
Matt Aitchison
87759cdb14 fix(deps): bump gitpython to >=3.1.41 to resolve CVE path traversal vulnerability (#4740)
GitPython ==3.1.38 is affected by a high-severity path traversal
vulnerability (dependabot alert #1). Bump to >=3.1.41,<4 which
includes the fix.
2026-03-05 12:41:24 -06:00
Tiago Freire
059cb93aeb fix(executor): propagate contextvars context to parallel tool call threads
ThreadPoolExecutor threads do not inherit the calling thread's contextvars
context, causing _event_id_stack and _current_celery_task_id to be empty
in worker threads. This broke OTel span parenting for parallel tool calls
(missing parent_event_id) and lost the Celery task ID in the enterprise
tracking layer ([Task ID: no-task]).

Fix by capturing an independent context copy per submission via
contextvars.copy_context().run in CrewAgentExecutor._handle_native_tool_calls,
so each worker thread starts with the correct inherited context without
sharing mutable state across threads.
2026-03-05 08:20:09 -05:00
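The fix above uses the standard pattern for propagating contextvars into pool threads. A minimal stdlib sketch, with `task_id` as a hypothetical stand-in for `_current_celery_task_id`:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for _current_celery_task_id.
task_id: contextvars.ContextVar[str] = contextvars.ContextVar("task_id", default="no-task")

def read_task_id() -> str:
    # Runs in a worker thread; sees whatever context it was started with.
    return task_id.get()

with ThreadPoolExecutor(max_workers=2) as pool:
    task_id.set("task-a")
    # copy_context() captures the submitting thread's context at submit
    # time, so each worker call runs with an independent snapshot rather
    # than sharing mutable context state across threads.
    fut_a = pool.submit(contextvars.copy_context().run, read_task_id)
    task_id.set("task-b")
    fut_b = pool.submit(contextvars.copy_context().run, read_task_id)

assert fut_a.result() == "task-a"
assert fut_b.result() == "task-b"
```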
Lorenze Jay
cebc52694e docs: update changelog and version for v1.10.1
2026-03-04 18:20:02 -05:00
19 changed files with 2121 additions and 246 deletions

.github/workflows/nightly.yml (new file, 127 lines)

@@ -0,0 +1,127 @@
name: Nightly Canary Release

on:
  schedule:
    - cron: '0 6 * * *' # daily at 6am UTC
  workflow_dispatch:

jobs:
  check:
    name: Check for new commits
    runs-on: ubuntu-latest
    permissions:
      contents: read
    outputs:
      has_changes: ${{ steps.check.outputs.has_changes }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Check for commits in last 24h
        id: check
        run: |
          RECENT=$(git log --since="24 hours ago" --oneline | head -1)
          if [ -n "$RECENT" ]; then
            echo "has_changes=true" >> "$GITHUB_OUTPUT"
          else
            echo "has_changes=false" >> "$GITHUB_OUTPUT"
          fi

  build:
    name: Build nightly packages
    needs: check
    if: needs.check.outputs.has_changes == 'true' || github.event_name == 'workflow_dispatch'
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install uv
        uses: astral-sh/setup-uv@v4
      - name: Stamp nightly versions
        run: |
          DATE=$(date +%Y%m%d)
          for init_file in \
            lib/crewai/src/crewai/__init__.py \
            lib/crewai-tools/src/crewai_tools/__init__.py \
            lib/crewai-files/src/crewai_files/__init__.py; do
            CURRENT=$(python -c "
          import re
          text = open('$init_file').read()
          print(re.search(r'__version__\s*=\s*\"(.*?)\"\s*$', text, re.MULTILINE).group(1))
          ")
            NIGHTLY="${CURRENT}.dev${DATE}"
            sed -i "s/__version__ = .*/__version__ = \"${NIGHTLY}\"/" "$init_file"
            echo "$init_file: $CURRENT -> $NIGHTLY"
          done
          # Update cross-package dependency pins to nightly versions
          sed -i "s/\"crewai-tools==[^\"]*\"/\"crewai-tools==${NIGHTLY}\"/" lib/crewai/pyproject.toml
          sed -i "s/\"crewai==[^\"]*\"/\"crewai==${NIGHTLY}\"/" lib/crewai-tools/pyproject.toml
          echo "Updated cross-package dependency pins to ${NIGHTLY}"
      - name: Build packages
        run: |
          uv build --all-packages
          rm dist/.gitignore
      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: dist
          path: dist/

  publish:
    name: Publish nightly to PyPI
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: pypi
      url: https://pypi.org/p/crewai
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Install uv
        uses: astral-sh/setup-uv@v6
        with:
          version: "0.8.4"
          python-version: "3.12"
          enable-cache: false
      - name: Download artifacts
        uses: actions/download-artifact@v4
        with:
          name: dist
          path: dist
      - name: Publish to PyPI
        env:
          UV_PUBLISH_TOKEN: ${{ secrets.PYPI_API_TOKEN }}
        run: |
          failed=0
          for package in dist/*; do
            if [[ "$package" == *"crewai_devtools"* ]]; then
              echo "Skipping private package: $package"
              continue
            fi
            echo "Publishing $package"
            if ! uv publish "$package"; then
              echo "Failed to publish $package"
              failed=1
            fi
          done
          if [ $failed -eq 1 ]; then
            echo "Some packages failed to publish"
            exit 1
          fi

File diff suppressed because it is too large.

@@ -4,6 +4,38 @@ description: "Product updates, improvements, and bug fixes for CrewAI"
icon: "clock"
mode: "wide"
---
<Update label="Mar 04, 2026">
## v1.10.1
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1)
## What's Changed
### Features
- Upgrade Gemini GenAI
### Bug Fixes
- Adjust executor listener value to avoid recursion
- Group parallel function response parts in a single Content object in Gemini
- Surface thought output from thinking models in Gemini
- Load MCP and platform tools when agent tools are None
- Support Jupyter environments with running event loops in A2A
- Use anonymous ID for ephemeral traces
- Conditionally pass plus header
- Skip signal handler registration in non-main threads for telemetry
- Inject tool errors as observations and resolve name collisions
- Upgrade pypdf from 4.x to 6.7.4 to resolve Dependabot alerts
- Resolve critical and high Dependabot security alerts
### Documentation
- Sync Composio tool documentation across locales
## Contributors
@giulio-leone, @greysonlalonde, @haxzie, @joaomdmoura, @lorenzejay, @mattatcha, @mplachta, @nicoferdi96
</Update>
<Update label="Feb 27, 2026">
## v1.10.1a1


@@ -4,6 +4,38 @@ description: "CrewAI의 제품 업데이트, 개선 사항 및 버그 수정"
icon: "clock"
mode: "wide"
---
<Update label="Mar 04, 2026">
## v1.10.1
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1)
## What's Changed
### Features
- Upgrade Gemini GenAI
### Bug Fixes
- Adjust executor listener value to avoid recursion
- Group parallel function response parts in a single Content object in Gemini
- Surface thought output from thinking models in Gemini
- Load MCP and platform tools when agent tools are None
- Support Jupyter environments with running event loops in A2A
- Use anonymous ID for ephemeral traces
- Conditionally pass plus header
- Skip signal handler registration in non-main threads for telemetry
- Inject tool errors as observations and resolve name collisions
- Upgrade pypdf from 4.x to 6.7.4 to resolve Dependabot alerts
- Resolve critical and high Dependabot security alerts
### Documentation
- Sync Composio tool documentation across locales
## Contributors
@giulio-leone, @greysonlalonde, @haxzie, @joaomdmoura, @lorenzejay, @mattatcha, @mplachta, @nicoferdi96
</Update>
<Update label="Feb 27, 2026">
## v1.10.1a1


@@ -4,6 +4,38 @@ description: "Atualizações de produto, melhorias e correções do CrewAI"
icon: "clock"
mode: "wide"
---
<Update label="Mar 04, 2026">
## v1.10.1
[View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1)
## What's Changed
### Features
- Upgrade Gemini GenAI
### Bug Fixes
- Adjust executor listener value to avoid recursion
- Group parallel function response parts in a single Content object in Gemini
- Surface thought output from thinking models in Gemini
- Load MCP and platform tools when agent tools are None
- Support Jupyter environments with running event loops in A2A
- Use anonymous ID for ephemeral traces
- Conditionally pass plus header
- Skip signal handler registration in non-main threads for telemetry
- Inject tool errors as observations and resolve name collisions
- Upgrade pypdf from 4.x to 6.7.4 to resolve Dependabot alerts
- Resolve critical and high Dependabot security alerts
### Documentation
- Sync Composio tool documentation across locales
## Contributors
@giulio-leone, @greysonlalonde, @haxzie, @joaomdmoura, @lorenzejay, @mattatcha, @mplachta, @nicoferdi96
</Update>
<Update label="Feb 27, 2026">
## v1.10.1a1


@@ -108,7 +108,7 @@ stagehand = [
"stagehand>=0.4.1",
]
github = [
"gitpython==3.1.38",
"gitpython>=3.1.41,<4",
"PyGithub==1.59.1",
]
rag = [


@@ -30,12 +30,9 @@ class CrewAgentExecutorMixin:
memory = getattr(self.agent, "memory", None) or (
getattr(self.crew, "_memory", None) if self.crew else None
)
if memory is None or not self.task or getattr(memory, "_read_only", False):
if memory is None or not self.task or memory.read_only:
return
if (
f"Action: {sanitize_tool_name('Delegate work to coworker')}"
in output.text
):
if f"Action: {sanitize_tool_name('Delegate work to coworker')}" in output.text:
return
try:
raw = (
@@ -48,6 +45,4 @@ class CrewAgentExecutorMixin:
if extracted:
memory.remember_many(extracted, agent_role=self.agent.role)
except Exception as e:
self.agent._logger.log(
"error", f"Failed to save to memory: {e}"
)
self.agent._logger.log("error", f"Failed to save to memory: {e}")


@@ -8,6 +8,7 @@ from __future__ import annotations
import asyncio
from collections.abc import Callable
import contextvars
from concurrent.futures import ThreadPoolExecutor, as_completed
import inspect
import logging
@@ -755,6 +756,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
with ThreadPoolExecutor(max_workers=max_workers) as pool:
futures = {
pool.submit(
contextvars.copy_context().run,
self._execute_single_native_tool_call,
call_id=call_id,
func_name=func_name,


@@ -1,6 +1,7 @@
from __future__ import annotations
import asyncio
import contextvars
from collections.abc import Callable, Coroutine
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
@@ -728,7 +729,7 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
max_workers = min(8, len(runnable_tool_calls))
with ThreadPoolExecutor(max_workers=max_workers) as pool:
future_to_idx = {
pool.submit(self._execute_single_native_tool_call, tool_call): idx
pool.submit(contextvars.copy_context().run, self._execute_single_native_tool_call, tool_call): idx
for idx, tool_call in enumerate(runnable_tool_calls)
}
ordered_results: list[dict[str, Any] | None] = [None] * len(


@@ -600,7 +600,7 @@ class LiteAgent(FlowTrackable, BaseModel):
def _save_to_memory(self, output_text: str) -> None:
"""Extract discrete memories from the run and remember each. No-op if _memory is None or read-only."""
if self._memory is None or getattr(self._memory, "_read_only", False):
if self._memory is None or self._memory.read_only:
return
input_str = self._get_last_user_content() or "User request"
try:


@@ -967,7 +967,43 @@ class LLM(BaseLLM):
self._track_token_usage_internal(usage_info)
self._handle_streaming_callbacks(callbacks, usage_info, last_chunk)
if not tool_calls or not available_functions:
# --- 8) Check accumulated_tool_args from streaming deltas
# Streaming responses deliver tool calls via deltas accumulated in
# accumulated_tool_args, not via the final chunk's message. When
# available_functions is None (native tool handling), we must return
# the accumulated tool calls so the caller (e.g., executor) can
# handle them. When available_functions is provided, tool execution
# already happened during the chunk processing loop via
# _handle_streaming_tool_calls.
if accumulated_tool_args and not available_functions:
tool_calls_list: list[ChatCompletionDeltaToolCall] = [
ChatCompletionDeltaToolCall(
index=idx,
function=Function(
name=tool_arg.function.name,
arguments=tool_arg.function.arguments,
),
)
for idx, tool_arg in accumulated_tool_args.items()
if tool_arg.function.name
]
if tool_calls_list:
self._handle_emit_call_events(
response=full_response,
call_type=LLMCallType.LLM_CALL,
from_task=from_task,
from_agent=from_agent,
messages=params["messages"],
)
return tool_calls_list
# --- 8b) If there are tool calls from last chunk but no available functions,
# return the tool calls
if tool_calls and not available_functions:
return tool_calls
if not tool_calls and not accumulated_tool_args:
if response_model and self.is_litellm:
instructor_instance = InternalInstructor(
content=full_response,
@@ -994,10 +1030,11 @@ class LLM(BaseLLM):
)
return full_response
# --- 9) Handle tool calls if present
tool_result = self._handle_tool_call(tool_calls, available_functions)
if tool_result is not None:
return tool_result
# --- 9) Handle tool calls from last chunk if present (execute when available_functions provided)
if tool_calls and available_functions:
tool_result = self._handle_tool_call(tool_calls, available_functions)
if tool_result is not None:
return tool_result
# --- 10) Emit completion event and return response
self._handle_emit_call_events(
@@ -1234,8 +1271,17 @@ class LLM(BaseLLM):
# --- 4) Check for tool calls
tool_calls = getattr(response_message, "tool_calls", [])
# --- 5) If no tool calls or no available functions, return the text response directly as long as there is a text response
if (not tool_calls or not available_functions) and text_response:
# --- 5) If there are tool calls but no available functions, return the tool calls
# This allows the caller (e.g., executor) to handle tool execution
# This must be checked before the text response fallback because some LLMs
# (e.g., Anthropic) return both text content and tool calls in the same response.
# The isinstance check ensures we have actual tool call data (list), not
# auto-generated attributes from mocks or unexpected types.
if isinstance(tool_calls, list) and tool_calls and not available_functions:
return tool_calls
# --- 6) If no tool calls or no available functions, return the text response directly as long as there is a text response
if not tool_calls and text_response:
self._handle_emit_call_events(
response=text_response,
call_type=LLMCallType.LLM_CALL,
@@ -1245,13 +1291,8 @@ class LLM(BaseLLM):
)
return text_response
# --- 6) If there are tool calls but no available functions, return the tool calls
# This allows the caller (e.g., executor) to handle tool execution
if tool_calls and not available_functions:
return tool_calls
# --- 7) Handle tool calls if present (execute when available_functions provided)
if tool_calls and available_functions:
if isinstance(tool_calls, list) and tool_calls and available_functions:
tool_result = self._handle_tool_call(
tool_calls, available_functions, from_task, from_agent
)
@@ -1364,7 +1405,16 @@ class LLM(BaseLLM):
tool_calls = getattr(response_message, "tool_calls", [])
if (not tool_calls or not available_functions) and text_response:
# If there are tool calls but no available functions, return the tool calls
# This allows the caller (e.g., executor) to handle tool execution
# This must be checked before the text response fallback because some LLMs
# (e.g., Anthropic) return both text content and tool calls in the same response.
# The isinstance check ensures we have actual tool call data (list), not
# auto-generated attributes from mocks or unexpected types.
if isinstance(tool_calls, list) and tool_calls and not available_functions:
return tool_calls
if not tool_calls and text_response:
self._handle_emit_call_events(
response=text_response,
call_type=LLMCallType.LLM_CALL,
@@ -1374,13 +1424,8 @@ class LLM(BaseLLM):
)
return text_response
# If there are tool calls but no available functions, return the tool calls
# This allows the caller (e.g., executor) to handle tool execution
if tool_calls and not available_functions:
return tool_calls
# Handle tool calls if present (execute when available_functions provided)
if tool_calls and available_functions:
if isinstance(tool_calls, list) and tool_calls and available_functions:
tool_result = self._handle_tool_call(
tool_calls, available_functions, from_task, from_agent
)
@@ -1513,7 +1558,7 @@ class LLM(BaseLLM):
if usage_info:
self._track_token_usage_internal(usage_info)
if accumulated_tool_args and available_functions:
if accumulated_tool_args:
# Convert accumulated tool args to ChatCompletionDeltaToolCall objects
tool_calls_list: list[ChatCompletionDeltaToolCall] = [
ChatCompletionDeltaToolCall(
@@ -1527,7 +1572,14 @@ class LLM(BaseLLM):
if tool_arg.function.name
]
if tool_calls_list:
# If there are tool calls but no available functions, return the tool calls
# This allows the caller (e.g., executor) to handle tool execution.
# This must be checked before the text response fallback because some LLMs
# (e.g., Anthropic) return both text content and tool calls in the same response.
if tool_calls_list and not available_functions:
return tool_calls_list
if tool_calls_list and available_functions:
result = self._handle_streaming_tool_calls(
tool_calls=tool_calls_list,
accumulated_tool_args=accumulated_tool_args,

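The streaming hunks above hinge on the fact that tool calls arrive as partial deltas keyed by index, so checking only the final chunk's message misses them. A simplified sketch of the accumulation idea (the chunk shape here is illustrative, not litellm's actual delta type):

```python
# Each streamed delta carries an index plus a fragment of the function
# name/arguments; fragments for the same index must be concatenated.
chunks = [
    {"index": 0, "name": "search", "arguments": '{"query": '},
    {"index": 0, "name": None, "arguments": '"weather"}'},
]

accumulated: dict[int, dict] = {}
for delta in chunks:
    entry = accumulated.setdefault(delta["index"], {"name": None, "arguments": ""})
    if delta["name"]:
        entry["name"] = delta["name"]
    entry["arguments"] += delta["arguments"]

# Only entries that acquired a function name become real tool calls,
# mirroring the `if tool_arg.function.name` filter in the diff above.
tool_calls = [e for e in accumulated.values() if e["name"]]
assert tool_calls == [{"name": "search", "arguments": '{"query": "weather"}'}]
```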

@@ -3,11 +3,9 @@
from __future__ import annotations
from datetime import datetime
from typing import TYPE_CHECKING, Any
from typing import Any, Literal
if TYPE_CHECKING:
from crewai.memory.unified_memory import Memory
from pydantic import BaseModel, ConfigDict, Field, PrivateAttr, model_validator
from crewai.memory.types import (
_RECALL_OVERSAMPLE_FACTOR,
@@ -15,22 +13,38 @@ from crewai.memory.types import (
MemoryRecord,
ScopeInfo,
)
from crewai.memory.unified_memory import Memory
class MemoryScope:
class MemoryScope(BaseModel):
"""View of Memory restricted to a root path. All operations are scoped under that path."""
def __init__(self, memory: Memory, root_path: str) -> None:
"""Initialize scope.
model_config = ConfigDict(arbitrary_types_allowed=True)
Args:
memory: The underlying Memory instance.
root_path: Root path for this scope (e.g. /agent/1).
"""
self._memory = memory
self._root = root_path.rstrip("/") or ""
if self._root and not self._root.startswith("/"):
self._root = "/" + self._root
root_path: str = Field(default="/")
_memory: Memory = PrivateAttr()
_root: str = PrivateAttr()
@model_validator(mode="wrap")
@classmethod
def _accept_memory(cls, data: Any, handler: Any) -> MemoryScope:
"""Extract memory dependency and normalize root path before validation."""
if isinstance(data, MemoryScope):
return data
memory = data.pop("memory")
instance: MemoryScope = handler(data)
instance._memory = memory
root = instance.root_path.rstrip("/") or ""
if root and not root.startswith("/"):
root = "/" + root
instance._root = root
return instance
@property
def read_only(self) -> bool:
"""Whether the underlying memory is read-only."""
return self._memory.read_only
def _scope_path(self, scope: str | None) -> str:
if not scope or scope == "/":
@@ -52,7 +66,7 @@ class MemoryScope:
importance: float | None = None,
source: str | None = None,
private: bool = False,
) -> MemoryRecord:
) -> MemoryRecord | None:
"""Remember content; scope is relative to this scope's root."""
path = self._scope_path(scope)
return self._memory.remember(
@@ -71,7 +85,7 @@ class MemoryScope:
scope: str | None = None,
categories: list[str] | None = None,
limit: int = 10,
depth: str = "deep",
depth: Literal["shallow", "deep"] = "deep",
source: str | None = None,
include_private: bool = False,
) -> list[MemoryMatch]:
@@ -138,34 +152,34 @@ class MemoryScope:
"""Return a narrower scope under this scope."""
child = path.strip("/")
if not child:
return MemoryScope(self._memory, self._root or "/")
return MemoryScope(memory=self._memory, root_path=self._root or "/")
base = self._root.rstrip("/") or ""
new_root = f"{base}/{child}" if base else f"/{child}"
return MemoryScope(self._memory, new_root)
return MemoryScope(memory=self._memory, root_path=new_root)
class MemorySlice:
class MemorySlice(BaseModel):
"""View over multiple scopes: recall searches all, remember is a no-op when read_only."""
def __init__(
self,
memory: Memory,
scopes: list[str],
categories: list[str] | None = None,
read_only: bool = True,
) -> None:
"""Initialize slice.
model_config = ConfigDict(arbitrary_types_allowed=True)
Args:
memory: The underlying Memory instance.
scopes: List of scope paths to include.
categories: Optional category filter for recall.
read_only: If True, remember() is a silent no-op.
"""
self._memory = memory
self._scopes = [s.rstrip("/") or "/" for s in scopes]
self._categories = categories
self._read_only = read_only
scopes: list[str] = Field(default_factory=list)
categories: list[str] | None = Field(default=None)
read_only: bool = Field(default=True)
_memory: Memory = PrivateAttr()
@model_validator(mode="wrap")
@classmethod
def _accept_memory(cls, data: Any, handler: Any) -> MemorySlice:
"""Extract memory dependency and normalize scopes before validation."""
if isinstance(data, MemorySlice):
return data
memory = data.pop("memory")
data["scopes"] = [s.rstrip("/") or "/" for s in data.get("scopes", [])]
instance: MemorySlice = handler(data)
instance._memory = memory
return instance
def remember(
self,
@@ -178,7 +192,7 @@ class MemorySlice:
private: bool = False,
) -> MemoryRecord | None:
"""Remember into an explicit scope. No-op when read_only=True."""
if self._read_only:
if self.read_only:
return None
return self._memory.remember(
content,
@@ -196,14 +210,14 @@ class MemorySlice:
scope: str | None = None,
categories: list[str] | None = None,
limit: int = 10,
depth: str = "deep",
depth: Literal["shallow", "deep"] = "deep",
source: str | None = None,
include_private: bool = False,
) -> list[MemoryMatch]:
"""Recall across all slice scopes; results merged and re-ranked."""
cats = categories or self._categories
cats = categories or self.categories
all_matches: list[MemoryMatch] = []
for sc in self._scopes:
for sc in self.scopes:
matches = self._memory.recall(
query,
scope=sc,
@@ -231,7 +245,7 @@ class MemorySlice:
def list_scopes(self, path: str = "/") -> list[str]:
"""List scopes across all slice roots."""
out: list[str] = []
for sc in self._scopes:
for sc in self.scopes:
full = f"{sc.rstrip('/')}{path}" if sc != "/" else path
out.extend(self._memory.list_scopes(full))
return sorted(set(out))
@@ -243,15 +257,23 @@ class MemorySlice:
oldest: datetime | None = None
newest: datetime | None = None
children: list[str] = []
for sc in self._scopes:
for sc in self.scopes:
full = f"{sc.rstrip('/')}{path}" if sc != "/" else path
inf = self._memory.info(full)
total_records += inf.record_count
all_categories.update(inf.categories)
if inf.oldest_record:
oldest = inf.oldest_record if oldest is None else min(oldest, inf.oldest_record)
oldest = (
inf.oldest_record
if oldest is None
else min(oldest, inf.oldest_record)
)
if inf.newest_record:
newest = inf.newest_record if newest is None else max(newest, inf.newest_record)
newest = (
inf.newest_record
if newest is None
else max(newest, inf.newest_record)
)
children.extend(inf.child_scopes)
return ScopeInfo(
path=path,
@@ -265,7 +287,7 @@ class MemorySlice:
def list_categories(self, path: str | None = None) -> dict[str, int]:
"""Categories and counts across slice scopes."""
counts: dict[str, int] = {}
for sc in self._scopes:
for sc in self.scopes:
full = (f"{sc.rstrip('/')}{path}" if sc != "/" else path) if path else sc
for k, v in self._memory.list_categories(full).items():
counts[k] = counts.get(k, 0) + v

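The `_accept_memory` wrap validators in the diff above follow a general pydantic v2 pattern: pop a non-serializable runtime dependency out of the input, run normal field validation, then stash the dependency in a `PrivateAttr`. A minimal sketch, with a hypothetical `Engine` class standing in for `Memory`:

```python
from __future__ import annotations

from typing import Any

from pydantic import BaseModel, ConfigDict, Field, PrivateAttr, model_validator


class Engine:
    """Hypothetical stand-in for the non-serializable Memory dependency."""
    read_only = False


class Scope(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)

    root_path: str = Field(default="/")
    _engine: Engine = PrivateAttr()

    @model_validator(mode="wrap")
    @classmethod
    def _accept_engine(cls, data: Any, handler: Any) -> Scope:
        # Re-validation: pass already-built instances through untouched.
        if isinstance(data, Scope):
            return data
        # Pop the dependency before field validation (it is not a field),
        # then attach it to the validated instance as a private attribute.
        engine = data.pop("engine")
        instance: Scope = handler(data)
        instance._engine = engine
        # Normalize the root path after validation.
        root = instance.root_path.rstrip("/") or ""
        if root and not root.startswith("/"):
            root = "/" + root
        instance.root_path = root or "/"
        return instance


scope = Scope(engine=Engine(), root_path="agent/1")
assert scope.root_path == "/agent/1"
assert isinstance(scope._engine, Engine)
```

The wrap mode is what makes this work: unlike `before`/`after` validators, it controls both sides of validation, so the dependency can be removed from the input and reattached to the output in one place.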

@@ -6,7 +6,9 @@ from concurrent.futures import Future, ThreadPoolExecutor
from datetime import datetime
import threading
import time
from typing import TYPE_CHECKING, Any, Literal
from typing import TYPE_CHECKING, Annotated, Any, Literal
from pydantic import BaseModel, ConfigDict, Field, PlainValidator, PrivateAttr
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.memory_events import (
@@ -39,13 +41,18 @@ if TYPE_CHECKING:
)
def _passthrough(v: Any) -> Any:
"""PlainValidator that accepts any value, bypassing strict union discrimination."""
return v
def _default_embedder() -> OpenAIEmbeddingFunction:
"""Build default OpenAI embedder for memory."""
spec: OpenAIProviderSpec = {"provider": "openai", "config": {}}
return build_embedder(spec)
class Memory:
class Memory(BaseModel):
"""Unified memory: standalone, LLM-analyzed, with intelligent recall flow.
Works without agent/crew. Uses LLM to infer scope, categories, importance on save.
@@ -53,116 +60,119 @@ class Memory:
pluggable storage (LanceDB default).
"""
def __init__(
self,
llm: BaseLLM | str = "gpt-4o-mini",
storage: StorageBackend | str = "lancedb",
embedder: Any = None,
# -- Scoring weights --
# These three weights control how recall results are ranked.
# The composite score is: semantic_weight * similarity + recency_weight * decay + importance_weight * importance.
# They should sum to ~1.0 for intuitive scoring.
recency_weight: float = 0.3,
semantic_weight: float = 0.5,
importance_weight: float = 0.2,
# How quickly old memories lose relevance. The recency score halves every
# N days (exponential decay). Lower = faster forgetting; higher = longer relevance.
recency_half_life_days: int = 30,
# -- Consolidation --
# When remembering new content, if an existing record has similarity >= this
# threshold, the LLM is asked to merge/update/delete. Set to 1.0 to disable.
consolidation_threshold: float = 0.85,
# Max existing records to compare against when checking for consolidation.
consolidation_limit: int = 5,
# -- Save defaults --
# Importance assigned to new memories when no explicit value is given and
# the LLM analysis path is skipped (all fields provided by the caller).
default_importance: float = 0.5,
# -- Recall depth control --
# These thresholds govern the RecallFlow router that decides between
# returning results immediately ("synthesize") vs. doing an extra
# LLM-driven exploration round ("explore_deeper").
# confidence >= confidence_threshold_high => always synthesize
# confidence < confidence_threshold_low => explore deeper (if budget > 0)
# complex query + confidence < complex_query_threshold => explore deeper
confidence_threshold_high: float = 0.8,
confidence_threshold_low: float = 0.5,
complex_query_threshold: float = 0.7,
# How many LLM-driven exploration rounds the RecallFlow is allowed to run.
# 0 = always shallow (vector search only); higher = more thorough but slower.
exploration_budget: int = 1,
# Queries shorter than this skip LLM analysis (saving ~1-3s).
# Longer queries (full task descriptions) benefit from LLM distillation.
query_analysis_threshold: int = 200,
# When True, all write operations (remember, remember_many) are silently
# skipped. Useful for sharing a read-only view of memory across agents
# without any of them persisting new memories.
read_only: bool = False,
) -> None:
"""Initialize Memory.
model_config = ConfigDict(arbitrary_types_allowed=True)
Args:
llm: LLM for analysis (model name or BaseLLM instance).
storage: Backend: "lancedb" or a StorageBackend instance.
embedder: Embedding callable, provider config dict, or None (default OpenAI).
recency_weight: Weight for recency in the composite relevance score.
semantic_weight: Weight for semantic similarity in the composite relevance score.
importance_weight: Weight for importance in the composite relevance score.
recency_half_life_days: Recency score halves every N days (exponential decay).
consolidation_threshold: Similarity above which consolidation is triggered on save.
consolidation_limit: Max existing records to compare during consolidation.
default_importance: Default importance when not provided or inferred.
confidence_threshold_high: Recall confidence above which results are returned directly.
confidence_threshold_low: Recall confidence below which deeper exploration is triggered.
complex_query_threshold: For complex queries, explore deeper below this confidence.
exploration_budget: Number of LLM-driven exploration rounds during deep recall.
query_analysis_threshold: Queries shorter than this skip LLM analysis during deep recall.
read_only: If True, remember() and remember_many() are silent no-ops.
"""
self._read_only = read_only
llm: Annotated[BaseLLM | str, PlainValidator(_passthrough)] = Field(
default="gpt-4o-mini",
description="LLM for analysis (model name or BaseLLM instance).",
)
storage: Annotated[StorageBackend | str, PlainValidator(_passthrough)] = Field(
default="lancedb",
description="Storage backend instance or path string.",
)
embedder: Any = Field(
default=None,
description="Embedding callable, provider config dict, or None for default OpenAI.",
)
recency_weight: float = Field(
default=0.3,
description="Weight for recency in the composite relevance score.",
)
semantic_weight: float = Field(
default=0.5,
description="Weight for semantic similarity in the composite relevance score.",
)
importance_weight: float = Field(
default=0.2,
description="Weight for importance in the composite relevance score.",
)
recency_half_life_days: int = Field(
default=30,
description="Recency score halves every N days (exponential decay).",
)
consolidation_threshold: float = Field(
default=0.85,
description="Similarity above which consolidation is triggered on save.",
)
consolidation_limit: int = Field(
default=5,
description="Max existing records to compare during consolidation.",
)
default_importance: float = Field(
default=0.5,
description="Default importance when not provided or inferred.",
)
confidence_threshold_high: float = Field(
default=0.8,
description="Recall confidence above which results are returned directly.",
)
confidence_threshold_low: float = Field(
default=0.5,
description="Recall confidence below which deeper exploration is triggered.",
)
complex_query_threshold: float = Field(
default=0.7,
description="For complex queries, explore deeper below this confidence.",
)
exploration_budget: int = Field(
default=1,
description="Number of LLM-driven exploration rounds during deep recall.",
)
query_analysis_threshold: int = Field(
default=200,
description="Queries shorter than this skip LLM analysis during deep recall.",
)
read_only: bool = Field(
default=False,
description="If True, remember() and remember_many() are silent no-ops.",
)
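The weight and half-life fields above combine into a single composite relevance score. A minimal sketch of that blend (hypothetical helper names; defaults mirror the field defaults above, not the actual crewai implementation):

```python
def recency_score(age_days: float, half_life_days: int = 30) -> float:
    # Exponential decay: the score halves every `half_life_days` days.
    return 0.5 ** (age_days / half_life_days)

def composite_score(
    semantic: float,
    age_days: float,
    importance: float,
    recency_weight: float = 0.3,
    semantic_weight: float = 0.5,
    importance_weight: float = 0.2,
    half_life_days: int = 30,
) -> float:
    # Weighted blend of semantic similarity, recency, and importance.
    return (
        recency_weight * recency_score(age_days, half_life_days)
        + semantic_weight * semantic
        + importance_weight * importance
    )
```

With the defaults, a record exactly one half-life old contributes half its recency weight, so older records lose ground smoothly rather than falling off a cliff.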
_config: MemoryConfig = PrivateAttr()
_llm_instance: BaseLLM | None = PrivateAttr(default=None)
_embedder_instance: Any = PrivateAttr(default=None)
_storage: StorageBackend = PrivateAttr()
_save_pool: ThreadPoolExecutor = PrivateAttr(
default_factory=lambda: ThreadPoolExecutor(
max_workers=1, thread_name_prefix="memory-save"
)
)
_pending_saves: list[Future[Any]] = PrivateAttr(default_factory=list)
_pending_lock: threading.Lock = PrivateAttr(default_factory=threading.Lock)
def model_post_init(self, __context: Any) -> None:
"""Initialize runtime state from field values."""
self._config = MemoryConfig(
recency_weight=self.recency_weight,
semantic_weight=self.semantic_weight,
importance_weight=self.importance_weight,
recency_half_life_days=self.recency_half_life_days,
consolidation_threshold=self.consolidation_threshold,
consolidation_limit=self.consolidation_limit,
default_importance=self.default_importance,
confidence_threshold_high=self.confidence_threshold_high,
confidence_threshold_low=self.confidence_threshold_low,
complex_query_threshold=self.complex_query_threshold,
exploration_budget=self.exploration_budget,
query_analysis_threshold=self.query_analysis_threshold,
)
# Store raw config for lazy initialization. LLM and embedder are only
# built on first access so that Memory() never fails at construction
# time (e.g. when auto-created by Flow without an API key set).
self._llm_instance = None if isinstance(self.llm, str) else self.llm
self._embedder_instance = (
self.embedder
if (self.embedder is not None and not isinstance(self.embedder, dict))
else None
)
if isinstance(self.storage, str):
from crewai.memory.storage.lancedb_storage import LanceDBStorage
self._storage = (
LanceDBStorage()
if self.storage == "lancedb"
else LanceDBStorage(path=self.storage)
)
else:
self._storage = self.storage
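The single-worker save pool declared in the private attrs above serializes background saves. A standalone sketch of the pattern (illustrative names, not the crewai API):

```python
from concurrent.futures import ThreadPoolExecutor

# max_workers=1 means queued saves run strictly one at a time, in FIFO
# order, so two saves can never find the same similar record and both
# try to update or delete it.
save_pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="memory-save")

saved: list[int] = []

def save(record: int) -> int:
    # Only the single worker thread mutates `saved`, so no lock is needed
    # for this toy example.
    saved.append(record)
    return record

pending = [save_pool.submit(save, i) for i in range(5)]
results = [f.result() for f in pending]
save_pool.shutdown()
```

Each save can still fan out its own parallel LLM calls internally; only the storage mutations are forced into a single file.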
_MEMORY_DOCS_URL = "https://docs.crewai.com/concepts/memory"
@@ -173,11 +183,7 @@ class Memory:
from crewai.llm import LLM
try:
model_name = self.llm if isinstance(self.llm, str) else str(self.llm)
self._llm_instance = LLM(model=model_name)
except Exception as e:
raise RuntimeError(
@@ -197,8 +203,8 @@ class Memory:
"""Lazy embedder initialization -- only created when first needed."""
if self._embedder_instance is None:
try:
if isinstance(self.embedder, dict):
self._embedder_instance = build_embedder(self.embedder)
else:
self._embedder_instance = _default_embedder()
except Exception as e:
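Both hunks above follow the same lazy pattern: keep the raw config, build the instance on first access so construction itself never fails. A self-contained sketch (hypothetical class and builder, not the crewai API):

```python
class LazyLLM:
    """Defer building an expensive client until it is actually used."""

    def __init__(self, config: str):
        self._config = config
        self._instance: dict | None = None
        self.builds: list[str] = []  # instrumentation for this example only

    @property
    def instance(self) -> dict:
        # Built on first access, so LazyLLM("...") never raises at
        # construction time (e.g. when no API key is set yet).
        if self._instance is None:
            self.builds.append(self._config)
            self._instance = {"model": self._config}
        return self._instance

llm = LazyLLM("gpt-4o-mini")
first = llm.instance
second = llm.instance
```

Repeated accesses return the cached instance; the builder runs exactly once.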
@@ -356,7 +362,7 @@ class Memory:
Raises:
Exception: On save failure (events emitted).
"""
if self.read_only:
return None
_source_type = "unified_memory"
try:
@@ -444,7 +450,7 @@ class Memory:
Returns:
Empty list (records are not available until the background save completes).
"""
if not contents or self.read_only:
return []
self._submit_save(


@@ -121,7 +121,7 @@ def create_memory_tools(memory: Any) -> list[BaseTool]:
description=i18n.tools("recall_memory"),
),
]
if not memory.read_only:
tools.append(
RememberTool(
memory=memory,


@@ -1136,7 +1136,7 @@ def test_lite_agent_memory_instance_recall_and_save_called():
successful_requests=1,
)
mock_memory = Mock()
mock_memory.read_only = False
mock_memory.recall.return_value = []
mock_memory.extract_memories.return_value = ["Fact one.", "Fact two."]


@@ -172,8 +172,8 @@ def test_memory_scope_slice(tmp_path: Path, mock_embedder: MagicMock) -> None:
sc = mem.scope("/agent/1")
assert sc._root in ("/agent/1", "/agent/1/")
sl = mem.slice(["/a", "/b"], read_only=True)
assert sl.read_only is True
assert "/a" in sl.scopes and "/b" in sl.scopes
def test_memory_list_scopes_info_tree(tmp_path: Path, mock_embedder: MagicMock) -> None:
@@ -198,7 +198,7 @@ def test_memory_scope_remember_recall(tmp_path: Path, mock_embedder: MagicMock)
from crewai.memory.memory_scope import MemoryScope
mem = Memory(storage=str(tmp_path / "db5"), llm=MagicMock(), embedder=mock_embedder)
scope = MemoryScope(memory=mem, root_path="/crew/1")
scope.remember("Scoped note", scope="/", categories=[], importance=0.5, metadata={})
results = scope.recall("note", limit=5, depth="shallow")
assert len(results) >= 1
@@ -213,7 +213,7 @@ def test_memory_slice_recall(tmp_path: Path, mock_embedder: MagicMock) -> None:
mem = Memory(storage=str(tmp_path / "db6"), llm=MagicMock(), embedder=mock_embedder)
mem.remember("In scope A", scope="/a", categories=[], importance=0.5, metadata={})
sl = MemorySlice(memory=mem, scopes=["/a"], read_only=True)
matches = sl.recall("scope", limit=5, depth="shallow")
assert isinstance(matches, list)
@@ -223,7 +223,7 @@ def test_memory_slice_remember_is_noop_when_read_only(tmp_path: Path, mock_embed
from crewai.memory.memory_scope import MemorySlice
mem = Memory(storage=str(tmp_path / "db7"), llm=MagicMock(), embedder=mock_embedder)
sl = MemorySlice(memory=mem, scopes=["/a"], read_only=True)
result = sl.remember("x", scope="/a")
assert result is None
assert mem.list_records() == []
@@ -319,7 +319,7 @@ def test_executor_save_to_memory_calls_extract_then_remember_per_item() -> None:
from crewai.agents.parser import AgentFinish
mock_memory = MagicMock()
mock_memory.read_only = False
mock_memory.extract_memories.return_value = ["Fact A.", "Fact B."]
mock_agent = MagicMock()
@@ -360,7 +360,7 @@ def test_executor_save_to_memory_skips_delegation_output() -> None:
from crewai.utilities.string_utils import sanitize_tool_name
mock_memory = MagicMock()
mock_memory.read_only = False
mock_agent = MagicMock()
mock_agent.memory = mock_memory
mock_agent._logger = MagicMock()
@@ -393,7 +393,7 @@ def test_memory_scope_extract_memories_delegates() -> None:
mock_memory = MagicMock()
mock_memory.extract_memories.return_value = ["Scoped fact."]
scope = MemoryScope(memory=mock_memory, root_path="/agent/1")
result = scope.extract_memories("Some content")
mock_memory.extract_memories.assert_called_once_with("Some content")
assert result == ["Scoped fact."]
@@ -405,7 +405,7 @@ def test_memory_slice_extract_memories_delegates() -> None:
mock_memory = MagicMock()
mock_memory.extract_memories.return_value = ["Sliced fact."]
sl = MemorySlice(memory=mock_memory, scopes=["/a", "/b"], read_only=True)
result = sl.extract_memories("Some content")
mock_memory.extract_memories.assert_called_once_with("Some content")
assert result == ["Sliced fact."]
@@ -670,10 +670,10 @@ def test_agent_kickoff_memory_recall_and_save(tmp_path: Path, mock_embedder: Mag
verbose=False,
)
# Mock recall to verify it's called, but return real results
# Patch on the class to avoid Pydantic BaseModel __delattr__ restriction
with patch.object(Memory, "recall", wraps=mem.recall) as recall_mock, \
patch.object(Memory, "extract_memories", return_value=["PostgreSQL is used."]) as extract_mock, \
patch.object(Memory, "remember_many", wraps=mem.remember_many) as remember_many_mock:
result = agent.kickoff("What database do we use?")
assert result is not None


@@ -36,7 +36,7 @@ from crewai.flow import Flow, start
from crewai.knowledge.knowledge import Knowledge
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
from crewai.llm import LLM
from crewai.memory.unified_memory import Memory
from crewai.process import Process
from crewai.project import CrewBase, agent, before_kickoff, crew, task
from crewai.task import Task
@@ -2618,9 +2618,9 @@ def test_memory_remember_called_after_task():
)
with patch.object(
Memory, "extract_memories", wraps=crew._memory.extract_memories
) as extract_mock, patch.object(
Memory, "remember", wraps=crew._memory.remember
) as remember_mock:
crew.kickoff()
@@ -4773,13 +4773,13 @@ def test_memory_remember_receives_task_content():
# Mock extract_memories to return fake memories and capture the raw input.
# No wraps= needed -- the test only checks what args it receives, not the output.
patch.object(
Memory, "extract_memories", return_value=["Fake memory."]
) as extract_mock,
# Mock recall to avoid LLM calls for query analysis (not in cassette).
patch.object(Memory, "recall", return_value=[]),
# Mock remember_many to prevent the background save from triggering
# LLM calls (field resolution) that aren't in the cassette.
patch.object(Memory, "remember_many", return_value=[]),
):
crew.kickoff()


@@ -614,6 +614,11 @@ def test_handle_streaming_tool_calls_with_error(get_weather_tool_schema, mock_em
def test_handle_streaming_tool_calls_no_available_functions(
get_weather_tool_schema, mock_emit
):
"""When tools are provided but available_functions is not (defaults to None),
the streaming handler should return the accumulated tool calls so the caller
(e.g., CrewAgentExecutor) can handle them. This is the fix for issue #4788
where tool calls were previously discarded and an empty string was returned.
"""
llm = LLM(model="openai/gpt-4o", stream=True, is_litellm=True)
response = llm.call(
messages=[
@@ -621,7 +626,14 @@ def test_handle_streaming_tool_calls_no_available_functions(
],
tools=[get_weather_tool_schema],
)
# With the fix for #4788, tool calls should be returned as a list
# instead of being discarded (previously returned "")
assert isinstance(response, list), (
f"Expected list of tool calls but got {type(response)}: {response}"
)
assert len(response) == 1
assert response[0].function.name == "get_weather"
assert response[0].function.arguments == '{"location":"New York, NY"}'
assert_event_count(
mock_emit=mock_emit,
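The streaming fix checks the arguments accumulated across deltas rather than only the last chunk. A sketch of that accumulation over simplified dict-shaped deltas (the real handler works on provider chunk objects):

```python
def accumulate_tool_calls(chunks: list[dict]) -> list[dict]:
    # Streamed tool calls arrive as deltas: the name appears once, while
    # the JSON arguments are split across many chunks. Only the
    # accumulated state tells you whether a tool call happened at all.
    calls: dict[int, dict] = {}
    for chunk in chunks:
        for delta in chunk.get("tool_calls") or []:
            entry = calls.setdefault(delta["index"], {"name": "", "arguments": ""})
            if delta.get("name"):
                entry["name"] = delta["name"]
            entry["arguments"] += delta.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]

chunks = [
    {"tool_calls": [{"index": 0, "name": "get_weather", "arguments": '{"loca'}]},
    {"tool_calls": [{"index": 0, "arguments": 'tion":"New York, NY"}'}]},
    {"tool_calls": None},  # the final chunk often carries no tool_calls
]
result = accumulate_tool_calls(chunks)
```

Inspecting only the final chunk here would report no tool calls at all, which is exactly the bug the handler fix avoids.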
@@ -1022,3 +1034,166 @@ async def test_usage_info_streaming_with_acall():
assert llm._token_usage["total_tokens"] > 0
assert len(result) > 0
def test_non_streaming_tool_calls_returned_when_no_available_functions():
"""Test that tool calls are returned (not text) when available_functions is None.
This reproduces the bug from issue #4788 where LLMs like Anthropic return both
text content AND tool calls in the same response. When available_functions=None
(as used by the executor for native tool handling), tool calls should be returned
instead of the text content.
"""
from litellm.types.utils import ChatCompletionMessageToolCall, Function
llm = LLM(model="gpt-4o-mini", is_litellm=True)
# Mock a response that has BOTH text content AND tool calls
mock_tool_call = ChatCompletionMessageToolCall(
id="call_123",
type="function",
function=Function(
name="code_search",
arguments='{"query": "test query"}',
),
)
mock_message = MagicMock()
mock_message.content = "I will search for the given query."
mock_message.tool_calls = [mock_tool_call]
mock_choice = MagicMock()
mock_choice.message = mock_message
mock_response = MagicMock()
mock_response.choices = [mock_choice]
mock_response.usage = MagicMock()
mock_response.usage.prompt_tokens = 10
mock_response.usage.completion_tokens = 5
mock_response.usage.total_tokens = 15
with patch("litellm.completion", return_value=mock_response):
# Call WITHOUT available_functions (as the executor does for native tool handling)
result = llm.call(
messages=[{"role": "user", "content": "Search for something"}],
tools=[{"type": "function", "function": {"name": "code_search"}}],
available_functions=None,
)
# Result should be the tool calls list, NOT the text response
assert isinstance(result, list), (
f"Expected list of tool calls but got {type(result)}: {result}"
)
assert len(result) == 1
assert result[0].function.name == "code_search"
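The priority the tests above exercise can be condensed into one decision function (a sketch of the ordering, not the actual handler code):

```python
def resolve_response(text, tool_calls, available_functions):
    # Ordering fixed by #4788: when the model returns both text and tool
    # calls, surface the tool calls first; text is only a fallback.
    if tool_calls and isinstance(tool_calls, list):
        if available_functions:
            # Normal path: execute each matching function directly.
            return [
                available_functions[c["name"]](**c["args"]) for c in tool_calls
            ]
        # Native-tool path: hand the calls back to the caller (the executor).
        return tool_calls
    return text

calls = [{"name": "code_search", "args": {"query": "tests"}}]
native = resolve_response("I will search.", calls, None)
executed = resolve_response(
    "I will search.", calls, {"code_search": lambda query: f"hit:{query}"}
)
plain = resolve_response("Paris.", None, None)
```

The `isinstance` guard mirrors the type-safety check added to the non-streaming handlers, which prevents auto-created `MagicMock` attributes from being mistaken for real tool calls.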
def test_non_streaming_text_returned_when_no_tool_calls():
"""Test that text response is still returned when there are no tool calls."""
llm = LLM(model="gpt-4o-mini", is_litellm=True)
mock_message = MagicMock()
mock_message.content = "The capital of France is Paris."
mock_message.tool_calls = None
mock_choice = MagicMock()
mock_choice.message = mock_message
mock_response = MagicMock()
mock_response.choices = [mock_choice]
mock_response.usage = MagicMock()
mock_response.usage.prompt_tokens = 10
mock_response.usage.completion_tokens = 5
mock_response.usage.total_tokens = 15
with patch("litellm.completion", return_value=mock_response):
result = llm.call(
messages=[{"role": "user", "content": "What is the capital of France?"}],
)
assert isinstance(result, str)
assert result == "The capital of France is Paris."
@pytest.mark.asyncio
async def test_async_non_streaming_tool_calls_returned_when_no_available_functions():
"""Test async path: tool calls are returned (not text) when available_functions is None.
Same bug as #4788 but for the async non-streaming handler.
"""
from litellm.types.utils import ChatCompletionMessageToolCall, Function
llm = LLM(model="gpt-4o-mini", is_litellm=True, stream=False)
mock_tool_call = ChatCompletionMessageToolCall(
id="call_456",
type="function",
function=Function(
name="web_search",
arguments='{"query": "test"}',
),
)
mock_message = MagicMock()
mock_message.content = "I will search the web."
mock_message.tool_calls = [mock_tool_call]
mock_choice = MagicMock()
mock_choice.message = mock_message
mock_response = MagicMock()
mock_response.choices = [mock_choice]
mock_response.usage = MagicMock()
mock_response.usage.prompt_tokens = 10
mock_response.usage.completion_tokens = 5
mock_response.usage.total_tokens = 15
with patch("litellm.acompletion", return_value=mock_response):
result = await llm.acall(
messages=[{"role": "user", "content": "Search for something"}],
tools=[{"type": "function", "function": {"name": "web_search"}}],
available_functions=None,
)
assert isinstance(result, list), (
f"Expected list of tool calls but got {type(result)}: {result}"
)
assert len(result) == 1
assert result[0].function.name == "web_search"
def test_non_streaming_tool_calls_executed_when_available_functions_provided():
"""Test that tool calls are still executed when available_functions IS provided.
This ensures the fix doesn't break the normal tool execution path.
"""
llm = LLM(model="gpt-4o-mini", is_litellm=True)
mock_tool_call = MagicMock()
mock_tool_call.function.name = "get_weather"
mock_tool_call.function.arguments = '{"location": "New York"}'
mock_message = MagicMock()
mock_message.content = "I will check the weather."
mock_message.tool_calls = [mock_tool_call]
mock_choice = MagicMock()
mock_choice.message = mock_message
mock_response = MagicMock()
mock_response.choices = [mock_choice]
mock_response.usage = MagicMock()
mock_response.usage.prompt_tokens = 10
mock_response.usage.completion_tokens = 5
mock_response.usage.total_tokens = 15
def get_weather(location: str) -> str:
return f"Sunny in {location}"
with patch("litellm.completion", return_value=mock_response):
result = llm.call(
messages=[{"role": "user", "content": "What's the weather?"}],
tools=[{"type": "function", "function": {"name": "get_weather"}}],
available_functions={"get_weather": get_weather},
)
# When available_functions is provided, the tool should be executed
assert result == "Sunny in New York"

uv.lock generated

@@ -1426,7 +1426,7 @@ requires-dist = [
{ name = "docker", specifier = "~=7.1.0" },
{ name = "exa-py", marker = "extra == 'exa-py'", specifier = ">=1.8.7" },
{ name = "firecrawl-py", marker = "extra == 'firecrawl-py'", specifier = ">=1.8.0" },
{ name = "gitpython", marker = "extra == 'github'", specifier = ">=3.1.41,<4" },
{ name = "hyperbrowser", marker = "extra == 'hyperbrowser'", specifier = ">=0.18.0" },
{ name = "langchain-apify", marker = "extra == 'apify'", specifier = ">=0.1.2,<1.0.0" },
{ name = "linkup-sdk", marker = "extra == 'linkup-sdk'", specifier = ">=0.2.2" },
@@ -2201,14 +2201,14 @@ wheels = [
[[package]]
name = "gitpython"
version = "3.1.46"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "gitdb" },
]
sdist = { url = "https://files.pythonhosted.org/packages/df/b5/59d16470a1f0dfe8c793f9ef56fd3826093fc52b3bd96d6b9d6c26c7e27b/gitpython-3.1.46.tar.gz", hash = "sha256:400124c7d0ef4ea03f7310ac2fbf7151e09ff97f2a3288d64a440c584a29c37f", size = 215371, upload-time = "2026-01-01T15:37:32.073Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/6a/09/e21df6aef1e1ffc0c816f0522ddc3f6dcded766c3261813131c78a704470/gitpython-3.1.46-py3-none-any.whl", hash = "sha256:79812ed143d9d25b6d176a10bb511de0f9c67b1fa641d82097b0ab90398a2058", size = 208620, upload-time = "2026-01-01T15:37:30.574Z" },
]
[[package]]