mirror of https://github.com/crewAIInc/crewAI.git

Compare commits

4 Commits: devin/1768... → gl/feat/de...

| SHA1 |
|---|
| c1776aca8a |
| 8f99fa76ed |
| 17e3fcbe1f |
| b858d705a8 |
@@ -574,6 +574,10 @@ When you run this Flow, the output will change based on the random boolean value

### Human in the Loop (human feedback)

<Note>
The `@human_feedback` decorator requires **CrewAI version 1.8.0 or higher**.
</Note>

The `@human_feedback` decorator enables human-in-the-loop workflows by pausing flow execution to collect feedback from a human. This is useful for approval gates, quality review, and decision points that require human judgment.

```python Code
@@ -91,6 +91,10 @@ The `A2AConfig` class accepts the following parameters:
Update mechanism for receiving task status. Options: `StreamingConfig`, `PollingConfig`, or `PushNotificationConfig`.
</ParamField>

<ParamField path="transport_protocol" type="Literal['JSONRPC', 'GRPC', 'HTTP+JSON']" default="JSONRPC">
Transport protocol for A2A communication. Options: `JSONRPC` (default), `GRPC`, or `HTTP+JSON`.
</ParamField>
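To see how a `Literal`-typed option like `transport_protocol` behaves at validation time, here is a small self-contained Pydantic sketch that mirrors the field described above; it illustrates the pattern only and is not the actual `A2AConfig` class:

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class TransportDemo(BaseModel):
    """Illustrative stand-in for the transport_protocol field on A2AConfig."""

    transport_protocol: Literal["JSONRPC", "GRPC", "HTTP+JSON"] = Field(
        default="JSONRPC",
        description="Specified mode of A2A transport protocol",
    )


print(TransportDemo().transport_protocol)                            # JSONRPC (default)
print(TransportDemo(transport_protocol="GRPC").transport_protocol)   # GRPC

try:
    TransportDemo(transport_protocol="WEBSOCKET")  # not one of the allowed literals
except ValidationError as exc:
    print(exc.errors()[0]["type"])  # "literal_error" in Pydantic v2
```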

## Authentication

For A2A agents that require authentication, use one of the provided auth schemes:
@@ -7,6 +7,10 @@ mode: "wide"

## Overview

<Note>
The `@human_feedback` decorator requires **CrewAI version 1.8.0 or higher**. Make sure to update your installation before using this feature.
</Note>

The `@human_feedback` decorator enables human-in-the-loop (HITL) workflows directly within CrewAI Flows. It allows you to pause flow execution, present output to a human for review, collect their feedback, and optionally route to different listeners based on the feedback outcome.

This is particularly valuable for:
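To make the pause-and-review behavior concrete, here is a small runnable sketch of a manual human-review step built from existing Flow primitives and `input()`; it is roughly the pattern the `@human_feedback` decorator is described as automating, but it deliberately avoids the decorator itself, since its exact signature and options are not shown in this changeset:

```python
# Manual human-in-the-loop sketch using only established Flow APIs (@start/@listen).
# The @human_feedback decorator described above automates this pause-and-collect
# pattern; its exact parameters are not shown here, so this sketch does not use it.
from crewai.flow.flow import Flow, listen, start


class DraftReviewFlow(Flow):
    @start()
    def write_draft(self) -> str:
        return "Draft announcement: the new feature ships next week."

    @listen(write_draft)
    def review_draft(self, draft: str) -> str:
        print(f"\n--- Please review ---\n{draft}\n")
        feedback = input("Approve, or type requested changes: ")  # pauses for a human
        # Downstream listeners (or a router) could branch on this value,
        # e.g. approved vs. needs-revision.
        return feedback


# DraftReviewFlow().kickoff()  # blocks at review_draft until the reviewer responds
```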
@@ -11,10 +11,10 @@ Human-in-the-Loop (HITL) is a powerful approach that combines artificial intelli

CrewAI offers two main approaches for implementing human-in-the-loop workflows:

| Approach | Best For | Integration |
|----------|----------|-------------|
| **Flow-based** (`@human_feedback` decorator) | Local development, console-based review, synchronous workflows | [Human Feedback in Flows](/en/learn/human-feedback-in-flows) |
| **Webhook-based** (Enterprise) | Production deployments, async workflows, external integrations (Slack, Teams, etc.) | This guide |
| Approach | Best For | Integration | Version |
|----------|----------|-------------|---------|
| **Flow-based** (`@human_feedback` decorator) | Local development, console-based review, synchronous workflows | [Human Feedback in Flows](/en/learn/human-feedback-in-flows) | **1.8.0+** |
| **Webhook-based** (Enterprise) | Production deployments, async workflows, external integrations (Slack, Teams, etc.) | This guide | - |

<Tip>
If you're building flows and want to add human review steps with routing based on feedback, check out the [Human Feedback in Flows](/en/learn/human-feedback-in-flows) guide for the `@human_feedback` decorator.
@@ -567,6 +567,10 @@ Fourth method running

### Human in the Loop (human feedback)

<Note>
The `@human_feedback` decorator requires **CrewAI version 1.8.0 or higher**.
</Note>

The `@human_feedback` decorator enables human-in-the-loop workflows that pause flow execution to collect feedback from a human. This is useful for approval gates, quality review, and decision points that require human judgment.

```python Code
@@ -7,6 +7,10 @@ mode: "wide"

## Overview

<Note>
The `@human_feedback` decorator requires **CrewAI version 1.8.0 or higher**. Update your installation before using this feature.
</Note>

The `@human_feedback` decorator enables human-in-the-loop (HITL) workflows directly within CrewAI Flows. It lets you pause flow execution, present output to a human for review, collect their feedback, and optionally route to different listeners based on the feedback outcome.

This is particularly valuable for:
@@ -5,9 +5,22 @@ icon: "user-check"
mode: "wide"
---

Human-in-the-Loop (HITL) is a powerful approach that combines artificial intelligence with human expertise to enhance decision-making and improve task outcomes. This guide walks you through implementing HITL within CrewAI.
Human-in-the-Loop (HITL) is a powerful approach that combines artificial intelligence with human expertise to enhance decision-making and improve task outcomes. CrewAI offers several ways to implement HITL depending on your needs.

## Setting Up HITL Workflows
## Choosing Your HITL Approach

CrewAI offers two main approaches for implementing human-in-the-loop workflows:

| Approach | Best For | Integration | Version |
|----------|----------|-------------|---------|
| **Flow-based** (`@human_feedback` decorator) | Local development, console-based review, synchronous workflows | [Human Feedback in Flows](/ko/learn/human-feedback-in-flows) | **1.8.0+** |
| **Webhook-based** (Enterprise) | Production deployments, async workflows, external integrations (Slack, Teams, etc.) | This guide | - |

<Tip>
If you're building flows and want to add human review steps that route based on feedback, see the [Human Feedback in Flows](/ko/learn/human-feedback-in-flows) guide for the `@human_feedback` decorator.
</Tip>

## Setting Up Webhook-Based HITL Workflows

<Steps>
  <Step title="Configure Your Task">
@@ -309,6 +309,10 @@ When you run this Flow, the output will differ depending on the random boolean value

### Human in the Loop (human feedback)

<Note>
The `@human_feedback` decorator requires **CrewAI version 1.8.0 or higher**.
</Note>

The `@human_feedback` decorator enables human-in-the-loop workflows by pausing flow execution to collect feedback from a human. This is useful for approval gates, quality review, and decision points that require human judgment.

```python Code
@@ -7,6 +7,10 @@ mode: "wide"

## Overview

<Note>
The `@human_feedback` decorator requires **CrewAI version 1.8.0 or higher**. Make sure to update your installation before using this feature.
</Note>

The `@human_feedback` decorator enables human-in-the-loop (HITL) workflows directly within CrewAI Flows. It lets you pause flow execution, present output to a human for review, collect their feedback, and optionally route to different listeners based on the feedback outcome.

This is particularly valuable for:
@@ -5,9 +5,22 @@ icon: "user-check"
mode: "wide"
---

Human-in-the-Loop (HITL) is a powerful approach that combines artificial intelligence with human expertise to enhance decision-making and improve task outcomes. This guide shows how to implement HITL within CrewAI.
Human-in-the-Loop (HITL) is a powerful approach that combines artificial intelligence with human expertise to enhance decision-making and improve task outcomes. CrewAI offers several ways to implement HITL depending on your needs.

## Setting Up HITL Workflows
## Choosing Your HITL Approach

CrewAI offers two main approaches for implementing human-in-the-loop workflows:

| Approach | Best For | Integration | Version |
|----------|----------|-------------|---------|
| **Flow-based** (`@human_feedback` decorator) | Local development, console-based review, synchronous workflows | [Human Feedback in Flows](/pt-BR/learn/human-feedback-in-flows) | **1.8.0+** |
| **Webhook-based** (Enterprise) | Production deployments, async workflows, external integrations (Slack, Teams, etc.) | This guide | - |

<Tip>
If you're building flows and want to add human review steps with routing based on feedback, check out the [Human Feedback in Flows](/pt-BR/learn/human-feedback-in-flows) guide for the `@human_feedback` decorator.
</Tip>

## Setting Up Webhook-Based HITL Workflows

<Steps>
  <Step title="Configure Your Task">
@@ -5,7 +5,7 @@ This module is separate from experimental.a2a to avoid circular imports.

from __future__ import annotations

from typing import Annotated, Any, ClassVar
from typing import Annotated, Any, ClassVar, Literal

from pydantic import (
    BaseModel,
@@ -53,6 +53,7 @@ class A2AConfig(BaseModel):
        fail_fast: If True, raise error when agent unreachable; if False, skip and continue.
        trust_remote_completion_status: If True, return A2A agent's result directly when completed.
        updates: Update mechanism config.
        transport_protocol: A2A transport protocol (grpc, jsonrpc, http+json).
    """

    model_config: ClassVar[ConfigDict] = ConfigDict(extra="forbid")
@@ -82,3 +83,7 @@ class A2AConfig(BaseModel):
        default_factory=_get_default_update_config,
        description="Update mechanism config",
    )
    transport_protocol: Literal["JSONRPC", "GRPC", "HTTP+JSON"] = Field(
        default="JSONRPC",
        description="Specified mode of A2A transport protocol",
    )
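Two details of the class shown in this hunk are worth noting: `extra="forbid"` makes misspelled field names fail loudly, and the `Literal` default keeps the transport value constrained. Here is a self-contained Pydantic sketch of the `extra="forbid"` behavior (an illustrative model, not the real `A2AConfig`):

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class StrictConfig(BaseModel):
    """Illustrative model mirroring A2AConfig's extra="forbid" setting."""

    model_config = ConfigDict(extra="forbid")

    timeout: int = 30


print(StrictConfig(timeout=60).timeout)       # 60

try:
    StrictConfig(timeuot=60)                  # misspelled field name is rejected
except ValidationError as exc:
    print(exc.errors()[0]["type"])            # "extra_forbidden" in Pydantic v2
```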
@@ -7,7 +7,7 @@ from collections.abc import AsyncIterator, MutableMapping
from contextlib import asynccontextmanager
from functools import lru_cache
import time
from typing import TYPE_CHECKING, Any
from typing import TYPE_CHECKING, Any, Literal
import uuid

from a2a.client import A2AClientHTTPError, Client, ClientConfig, ClientFactory
@@ -18,7 +18,6 @@ from a2a.types import (
    PushNotificationConfig as A2APushNotificationConfig,
    Role,
    TextPart,
    TransportProtocol,
)
from aiocache import cached  # type: ignore[import-untyped]
from aiocache.serializers import PickleSerializer  # type: ignore[import-untyped]
@@ -259,6 +258,7 @@ async def _afetch_agent_card_impl(

def execute_a2a_delegation(
    endpoint: str,
    transport_protocol: Literal["JSONRPC", "GRPC", "HTTP+JSON"],
    auth: AuthScheme | None,
    timeout: int,
    task_description: str,
@@ -282,6 +282,23 @@ def execute_a2a_delegation(
    use aexecute_a2a_delegation directly.

    Args:
        endpoint: A2A agent endpoint URL (AgentCard URL)
        transport_protocol: Optional A2A transport protocol (grpc, jsonrpc, http+json)
        auth: Optional AuthScheme for authentication (Bearer, OAuth2, API Key, HTTP Basic/Digest)
        timeout: Request timeout in seconds
        task_description: The task to delegate
        context: Optional context information
        context_id: Context ID for correlating messages/tasks
        task_id: Specific task identifier
        reference_task_ids: List of related task IDs
        metadata: Additional metadata (external_id, request_id, etc.)
        extensions: Protocol extensions for custom fields
        conversation_history: Previous Message objects from conversation
        agent_id: Agent identifier for logging
        agent_role: Role of the CrewAI agent delegating the task
        agent_branch: Optional agent tree branch for logging
        response_model: Optional Pydantic model for structured outputs
        turn_number: Optional turn number for multi-turn conversations
        endpoint: A2A agent endpoint URL.
        auth: Optional AuthScheme for authentication.
        timeout: Request timeout in seconds.
@@ -323,6 +340,7 @@ def execute_a2a_delegation(
        agent_role=agent_role,
        agent_branch=agent_branch,
        response_model=response_model,
        transport_protocol=transport_protocol,
        turn_number=turn_number,
        updates=updates,
    )
@@ -333,6 +351,7 @@ def execute_a2a_delegation(

async def aexecute_a2a_delegation(
    endpoint: str,
    transport_protocol: Literal["JSONRPC", "GRPC", "HTTP+JSON"],
    auth: AuthScheme | None,
    timeout: int,
    task_description: str,
@@ -356,6 +375,23 @@ async def aexecute_a2a_delegation(
    in an async context (e.g., with Crew.akickoff() or agent.aexecute_task()).

    Args:
        endpoint: A2A agent endpoint URL
        transport_protocol: Optional A2A transport protocol (grpc, jsonrpc, http+json)
        auth: Optional AuthScheme for authentication
        timeout: Request timeout in seconds
        task_description: Task to delegate
        context: Optional context
        context_id: Context ID for correlation
        task_id: Specific task identifier
        reference_task_ids: Related task IDs
        metadata: Additional metadata
        extensions: Protocol extensions
        conversation_history: Previous Message objects
        turn_number: Current turn number
        agent_branch: Agent tree branch for logging
        agent_id: Agent identifier for logging
        agent_role: Agent role for logging
        response_model: Optional Pydantic model for structured outputs
        endpoint: A2A agent endpoint URL.
        auth: Optional AuthScheme for authentication.
        timeout: Request timeout in seconds.
@@ -414,6 +450,7 @@ async def aexecute_a2a_delegation(
        agent_role=agent_role,
        response_model=response_model,
        updates=updates,
        transport_protocol=transport_protocol,
    )

    crewai_event_bus.emit(
@@ -431,6 +468,7 @@ async def aexecute_a2a_delegation(

async def _aexecute_a2a_delegation_impl(
    endpoint: str,
    transport_protocol: Literal["JSONRPC", "GRPC", "HTTP+JSON"],
    auth: AuthScheme | None,
    timeout: int,
    task_description: str,
@@ -524,7 +562,6 @@ async def _aexecute_a2a_delegation_impl(
        extensions=extensions,
    )

    transport_protocol = TransportProtocol("JSONRPC")
    new_messages: list[Message] = [*conversation_history, message]
    crewai_event_bus.emit(
        None,
@@ -596,7 +633,7 @@ async def _aexecute_a2a_delegation_impl(
@asynccontextmanager
async def _create_a2a_client(
    agent_card: AgentCard,
    transport_protocol: TransportProtocol,
    transport_protocol: Literal["JSONRPC", "GRPC", "HTTP+JSON"],
    timeout: int,
    headers: MutableMapping[str, str],
    streaming: bool,
@@ -640,7 +677,7 @@ async def _create_a2a_client(

    config = ClientConfig(
        httpx_client=httpx_client,
        supported_transports=[str(transport_protocol.value)],
        supported_transports=[transport_protocol],
        streaming=streaming and not use_polling,
        polling=use_polling,
        accepted_output_modes=["application/json"],
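The `_create_a2a_client` change above narrows `transport_protocol` from a `TransportProtocol` enum to a plain `Literal` string and passes it straight into `supported_transports`. The following self-contained sketch, using a stand-in enum rather than the actual `a2a.types.TransportProtocol`, shows why the two spellings carry the same wire value when the enum is string-backed:

```python
from enum import Enum
from typing import Literal


class DemoTransport(str, Enum):
    """Stand-in for a string-backed transport enum (not the a2a SDK's)."""

    JSONRPC = "JSONRPC"
    GRPC = "GRPC"
    HTTP_JSON = "HTTP+JSON"


def show_supported(old: DemoTransport, new: Literal["JSONRPC", "GRPC", "HTTP+JSON"]) -> None:
    # Old style: convert the enum member's value to str before handing it over.
    print([str(old.value)])   # ['JSONRPC']
    # New style: the Literal-typed string is already the exact wire value.
    print([new])              # ['JSONRPC']


show_supported(DemoTransport.JSONRPC, "JSONRPC")
```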
@@ -771,6 +771,7 @@ def _delegate_to_a2a(
        response_model=agent_config.response_model,
        turn_number=turn_num + 1,
        updates=agent_config.updates,
        transport_protocol=agent_config.transport_protocol,
    )

    conversation_history = a2a_result.get("history", [])
@@ -1085,6 +1086,7 @@ async def _adelegate_to_a2a(
        agent_branch=agent_branch,
        response_model=agent_config.response_model,
        turn_number=turn_num + 1,
        transport_protocol=agent_config.transport_protocol,
        updates=agent_config.updates,
    )
@@ -209,10 +209,9 @@ class EventListener(BaseEventListener):
        @crewai_event_bus.on(TaskCompletedEvent)
        def on_task_completed(source: Any, event: TaskCompletedEvent) -> None:
            # Handle telemetry
            span = self.execution_spans.get(source)
            span = self.execution_spans.pop(source, None)
            if span:
                self._telemetry.task_ended(span, source, source.agent.crew)
            self.execution_spans[source] = None

            # Pass task name if it exists
            task_name = get_task_name(source)
@@ -222,11 +221,10 @@ class EventListener(BaseEventListener):

        @crewai_event_bus.on(TaskFailedEvent)
        def on_task_failed(source: Any, event: TaskFailedEvent) -> None:
            span = self.execution_spans.get(source)
            span = self.execution_spans.pop(source, None)
            if span:
                if source.agent and source.agent.crew:
                    self._telemetry.task_ended(span, source, source.agent.crew)
            self.execution_spans[source] = None

            # Pass task name if it exists
            task_name = get_task_name(source)
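The listener change above replaces a `get`-then-overwrite-with-`None` pattern with a single `pop`, which removes the entry from `execution_spans` instead of leaving a `None` placeholder behind. A minimal illustration of the difference:

```python
# Why pop(...) is preferable for one-shot cleanup: get() + "= None" keeps dead keys
# in the dict, while pop() removes the entry in one step.
spans: dict[str, object] = {"task-1": object(), "task-2": object()}

# Old pattern: the key stays in the dict, mapped to None.
span = spans.get("task-1")
spans["task-1"] = None
print(len(spans))                  # 2 -- "task-1" still present as a None placeholder

# New pattern: the key is removed; a repeated pop is a harmless no-op.
span = spans.pop("task-2", None)
print(len(spans))                  # 1
print(spans.pop("task-2", None))   # None -- safe even if the span was already cleaned up
```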
@@ -341,7 +341,6 @@ class AccumulatedToolArgs(BaseModel):

class LLM(BaseLLM):
    completion_cost: float | None = None
    _callback_lock: threading.RLock = threading.RLock()

    def __new__(cls, model: str, is_litellm: bool = False, **kwargs: Any) -> LLM:
        """Factory method that routes to native SDK or falls back to LiteLLM.
@@ -1145,7 +1144,7 @@ class LLM(BaseLLM):
            if response_model:
                params["response_model"] = response_model
            response = litellm.completion(**params)


            if hasattr(response,"usage") and not isinstance(response.usage, type) and response.usage:
                usage_info = response.usage
                self._track_token_usage_internal(usage_info)
@@ -1364,7 +1363,7 @@ class LLM(BaseLLM):
        """
        full_response = ""
        chunk_count = 0


        usage_info = None

        accumulated_tool_args: defaultdict[int, AccumulatedToolArgs] = defaultdict(
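The `__new__` signature above points at a factory pattern: construction inspects its arguments and hands back an instance of one backend or another. A generic, self-contained sketch of that routing idea (not CrewAI's actual `LLM` implementation):

```python
# Generic illustration of a __new__-based factory: the base class inspects its
# arguments and returns the appropriate subclass. Names here are made up.
from typing import Any


class Client:
    def __new__(cls, model: str, use_fallback: bool = False, **kwargs: Any) -> "Client":
        if cls is Client:  # only reroute direct construction, not subclass construction
            target = FallbackClient if use_fallback else NativeClient
            return super().__new__(target)
        return super().__new__(cls)

    def __init__(self, model: str, use_fallback: bool = False, **kwargs: Any) -> None:
        self.model = model


class NativeClient(Client):
    """Stand-in for a native SDK-backed implementation."""


class FallbackClient(Client):
    """Stand-in for a LiteLLM-style fallback implementation."""


print(type(Client("gpt-4o-mini")).__name__)                     # NativeClient
print(type(Client("gpt-4o-mini", use_fallback=True)).__name__)  # FallbackClient
```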
@@ -1658,92 +1657,78 @@ class LLM(BaseLLM):
|
||||
raise ValueError("LLM call blocked by before_llm_call hook")
|
||||
|
||||
# --- 5) Set up callbacks if provided
|
||||
# Use a class-level lock to synchronize access to global litellm callbacks.
|
||||
# This prevents race conditions when multiple LLM instances call set_callbacks
|
||||
# concurrently, which could cause callbacks to be removed before they fire.
|
||||
with suppress_warnings():
|
||||
with LLM._callback_lock:
|
||||
if callbacks and len(callbacks) > 0:
|
||||
self.set_callbacks(callbacks)
|
||||
try:
|
||||
# --- 6) Prepare parameters for the completion call
|
||||
params = self._prepare_completion_params(messages, tools)
|
||||
# --- 7) Make the completion call and handle response
|
||||
if self.stream:
|
||||
result = self._handle_streaming_response(
|
||||
params=params,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
else:
|
||||
result = self._handle_non_streaming_response(
|
||||
params=params,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
|
||||
if isinstance(result, str):
|
||||
result = self._invoke_after_llm_call_hooks(
|
||||
messages, result, from_agent
|
||||
)
|
||||
|
||||
return result
|
||||
except LLMContextLengthExceededError:
|
||||
# Re-raise LLMContextLengthExceededError as it should be handled
|
||||
# by the CrewAgentExecutor._invoke_loop method, which can then decide
|
||||
# whether to summarize the content or abort based on the respect_context_window flag
|
||||
raise
|
||||
except Exception as e:
|
||||
unsupported_stop = "Unsupported parameter" in str(
|
||||
e
|
||||
) and "'stop'" in str(e)
|
||||
|
||||
if unsupported_stop:
|
||||
if (
|
||||
"additional_drop_params" in self.additional_params
|
||||
and isinstance(
|
||||
self.additional_params["additional_drop_params"], list
|
||||
)
|
||||
):
|
||||
self.additional_params["additional_drop_params"].append(
|
||||
"stop"
|
||||
)
|
||||
else:
|
||||
self.additional_params = {
|
||||
"additional_drop_params": ["stop"]
|
||||
}
|
||||
|
||||
logging.info(
|
||||
"Retrying LLM call without the unsupported 'stop'"
|
||||
)
|
||||
|
||||
# Recursive call happens inside the lock since we're using
|
||||
# a reentrant-safe pattern (the lock is released when we
|
||||
# exit the with block, and the recursive call will acquire
|
||||
# it again)
|
||||
return self.call(
|
||||
messages,
|
||||
tools=tools,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
|
||||
crewai_event_bus.emit(
|
||||
self,
|
||||
event=LLMCallFailedEvent(
|
||||
error=str(e), from_task=from_task, from_agent=from_agent
|
||||
),
|
||||
if callbacks and len(callbacks) > 0:
|
||||
self.set_callbacks(callbacks)
|
||||
try:
|
||||
# --- 6) Prepare parameters for the completion call
|
||||
params = self._prepare_completion_params(messages, tools)
|
||||
# --- 7) Make the completion call and handle response
|
||||
if self.stream:
|
||||
result = self._handle_streaming_response(
|
||||
params=params,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
raise
|
||||
else:
|
||||
result = self._handle_non_streaming_response(
|
||||
params=params,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
|
||||
if isinstance(result, str):
|
||||
result = self._invoke_after_llm_call_hooks(
|
||||
messages, result, from_agent
|
||||
)
|
||||
|
||||
return result
|
||||
except LLMContextLengthExceededError:
|
||||
# Re-raise LLMContextLengthExceededError as it should be handled
|
||||
# by the CrewAgentExecutor._invoke_loop method, which can then decide
|
||||
# whether to summarize the content or abort based on the respect_context_window flag
|
||||
raise
|
||||
except Exception as e:
|
||||
unsupported_stop = "Unsupported parameter" in str(
|
||||
e
|
||||
) and "'stop'" in str(e)
|
||||
|
||||
if unsupported_stop:
|
||||
if (
|
||||
"additional_drop_params" in self.additional_params
|
||||
and isinstance(
|
||||
self.additional_params["additional_drop_params"], list
|
||||
)
|
||||
):
|
||||
self.additional_params["additional_drop_params"].append("stop")
|
||||
else:
|
||||
self.additional_params = {"additional_drop_params": ["stop"]}
|
||||
|
||||
logging.info("Retrying LLM call without the unsupported 'stop'")
|
||||
|
||||
return self.call(
|
||||
messages,
|
||||
tools=tools,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
|
||||
crewai_event_bus.emit(
|
||||
self,
|
||||
event=LLMCallFailedEvent(
|
||||
error=str(e), from_task=from_task, from_agent=from_agent
|
||||
),
|
||||
)
|
||||
raise
|
||||
|
||||
async def acall(
|
||||
self,
|
||||
@@ -1805,27 +1790,14 @@ class LLM(BaseLLM):
|
||||
msg_role: Literal["assistant"] = "assistant"
|
||||
message["role"] = msg_role
|
||||
|
||||
# Use a class-level lock to synchronize access to global litellm callbacks.
|
||||
# This prevents race conditions when multiple LLM instances call set_callbacks
|
||||
# concurrently, which could cause callbacks to be removed before they fire.
|
||||
with suppress_warnings():
|
||||
with LLM._callback_lock:
|
||||
if callbacks and len(callbacks) > 0:
|
||||
self.set_callbacks(callbacks)
|
||||
try:
|
||||
params = self._prepare_completion_params(messages, tools)
|
||||
if callbacks and len(callbacks) > 0:
|
||||
self.set_callbacks(callbacks)
|
||||
try:
|
||||
params = self._prepare_completion_params(messages, tools)
|
||||
|
||||
if self.stream:
|
||||
return await self._ahandle_streaming_response(
|
||||
params=params,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
|
||||
return await self._ahandle_non_streaming_response(
|
||||
if self.stream:
|
||||
return await self._ahandle_streaming_response(
|
||||
params=params,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
@@ -1833,49 +1805,52 @@ class LLM(BaseLLM):
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
except LLMContextLengthExceededError:
|
||||
raise
|
||||
except Exception as e:
|
||||
unsupported_stop = "Unsupported parameter" in str(
|
||||
e
|
||||
) and "'stop'" in str(e)
|
||||
|
||||
if unsupported_stop:
|
||||
if (
|
||||
"additional_drop_params" in self.additional_params
|
||||
and isinstance(
|
||||
self.additional_params["additional_drop_params"], list
|
||||
)
|
||||
):
|
||||
self.additional_params["additional_drop_params"].append(
|
||||
"stop"
|
||||
)
|
||||
else:
|
||||
self.additional_params = {
|
||||
"additional_drop_params": ["stop"]
|
||||
}
|
||||
return await self._ahandle_non_streaming_response(
|
||||
params=params,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
except LLMContextLengthExceededError:
|
||||
raise
|
||||
except Exception as e:
|
||||
unsupported_stop = "Unsupported parameter" in str(
|
||||
e
|
||||
) and "'stop'" in str(e)
|
||||
|
||||
logging.info(
|
||||
"Retrying LLM call without the unsupported 'stop'"
|
||||
if unsupported_stop:
|
||||
if (
|
||||
"additional_drop_params" in self.additional_params
|
||||
and isinstance(
|
||||
self.additional_params["additional_drop_params"], list
|
||||
)
|
||||
):
|
||||
self.additional_params["additional_drop_params"].append("stop")
|
||||
else:
|
||||
self.additional_params = {"additional_drop_params": ["stop"]}
|
||||
|
||||
return await self.acall(
|
||||
messages,
|
||||
tools=tools,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
logging.info("Retrying LLM call without the unsupported 'stop'")
|
||||
|
||||
crewai_event_bus.emit(
|
||||
self,
|
||||
event=LLMCallFailedEvent(
|
||||
error=str(e), from_task=from_task, from_agent=from_agent
|
||||
),
|
||||
return await self.acall(
|
||||
messages,
|
||||
tools=tools,
|
||||
callbacks=callbacks,
|
||||
available_functions=available_functions,
|
||||
from_task=from_task,
|
||||
from_agent=from_agent,
|
||||
response_model=response_model,
|
||||
)
|
||||
raise
|
||||
|
||||
crewai_event_bus.emit(
|
||||
self,
|
||||
event=LLMCallFailedEvent(
|
||||
error=str(e), from_task=from_task, from_agent=from_agent
|
||||
),
|
||||
)
|
||||
raise
|
||||
|
||||
def _handle_emit_call_events(
|
||||
self,
|
||||
|
||||
lib/crewai/src/crewai/utilities/deferred_import_utils.py (new file, 370 lines)
@@ -0,0 +1,370 @@
"""Lazy loader for Python packages.
|
||||
|
||||
Makes it easy to load subpackages and functions on demand.
|
||||
|
||||
Pulled from https://github.com/scientific-python/lazy-loader/blob/main/src/lazy_loader/__init__.py,
|
||||
modernized a little.
|
||||
"""
|
||||
|
||||
import ast
|
||||
from collections.abc import Callable, Sequence
|
||||
from dataclasses import dataclass, field
|
||||
import importlib
|
||||
import importlib.metadata
|
||||
import importlib.util
|
||||
import inspect
|
||||
import os
|
||||
from pathlib import Path
|
||||
import sys
|
||||
import threading
|
||||
import types
|
||||
from typing import Any, NoReturn
|
||||
import warnings
|
||||
|
||||
import packaging.requirements
|
||||
|
||||
|
||||
_threadlock = threading.Lock()
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class _FrameData:
|
||||
"""Captured stack frame information for delayed error reporting."""
|
||||
|
||||
filename: str
|
||||
lineno: int
|
||||
function: str
|
||||
code_context: Sequence[str] | None
|
||||
|
||||
|
||||
def attach(
|
||||
package_name: str,
|
||||
submodules: set[str] | None = None,
|
||||
submod_attrs: dict[str, list[str]] | None = None,
|
||||
) -> tuple[Callable[[str], Any], Callable[[], list[str]], list[str]]:
|
||||
"""Attach lazily loaded submodules, functions, or other attributes.
|
||||
|
||||
Replaces a package's `__getattr__`, `__dir__`, and `__all__` such that
|
||||
imports work normally but occur upon first use.
|
||||
|
||||
Example:
|
||||
__getattr__, __dir__, __all__ = lazy.attach(
|
||||
__name__, ["mysubmodule"], {"foo": ["someattr"]}
|
||||
)
|
||||
|
||||
Args:
|
||||
package_name: The package name, typically ``__name__``.
|
||||
submodules: Set of submodule names to attach.
|
||||
submod_attrs: Mapping of submodule names to lists of attributes.
|
||||
These attributes are imported as they are used.
|
||||
|
||||
Returns:
|
||||
A tuple of (__getattr__, __dir__, __all__) to assign in the package.
|
||||
"""
|
||||
submod_attrs = submod_attrs or {}
|
||||
submodules = set(submodules) if submodules else set()
|
||||
attr_to_modules = {
|
||||
attr: mod for mod, attrs in submod_attrs.items() for attr in attrs
|
||||
}
|
||||
__all__ = sorted(submodules | attr_to_modules.keys())
|
||||
|
||||
def __getattr__(name: str) -> Any: # noqa: N807
|
||||
if name in submodules:
|
||||
return importlib.import_module(f"{package_name}.{name}")
|
||||
if name in attr_to_modules:
|
||||
submod_path = f"{package_name}.{attr_to_modules[name]}"
|
||||
submod = importlib.import_module(submod_path)
|
||||
attr = getattr(submod, name)
|
||||
|
||||
# If the attribute lives in a file (module) with the same
|
||||
# name as the attribute, ensure that the attribute and *not*
|
||||
# the module is accessible on the package.
|
||||
if name == attr_to_modules[name]:
|
||||
pkg = sys.modules[package_name]
|
||||
pkg.__dict__[name] = attr
|
||||
|
||||
return attr
|
||||
raise AttributeError(f"No {package_name} attribute {name}")
|
||||
|
||||
def __dir__() -> list[str]: # noqa: N807
|
||||
return __all__.copy()
|
||||
|
||||
if os.environ.get("EAGER_IMPORT"):
|
||||
for attr in set(attr_to_modules.keys()) | submodules:
|
||||
__getattr__(attr)
|
||||
|
||||
return __getattr__, __dir__, __all__.copy()
|
||||
|
||||
|
||||
class DelayedImportErrorModule(types.ModuleType):
|
||||
"""Module type that delays raising ModuleNotFoundError until attribute access.
|
||||
|
||||
Captures stack frame data to provide helpful error messages showing where
|
||||
the original import was attempted.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
frame_data: _FrameData,
|
||||
*args: Any,
|
||||
message: str,
|
||||
**kwargs: Any,
|
||||
) -> None:
|
||||
"""Initialize the delayed error module.
|
||||
|
||||
Args:
|
||||
frame_data: Captured frame information for error reporting.
|
||||
*args: Positional arguments passed to ModuleType.
|
||||
message: The error message to display when accessed.
|
||||
**kwargs: Keyword arguments passed to ModuleType.
|
||||
"""
|
||||
self._frame_data = frame_data
|
||||
self._message = message
|
||||
super().__init__(*args, **kwargs)
|
||||
|
||||
def __getattr__(self, name: str) -> NoReturn:
|
||||
"""Raise ModuleNotFoundError with detailed context on any attribute access."""
|
||||
frame = self._frame_data
|
||||
code = "".join(frame.code_context) if frame.code_context else ""
|
||||
raise ModuleNotFoundError(
|
||||
f"{self._message}\n\n"
|
||||
"This error is lazily reported, having originally occurred in\n"
|
||||
f" File {frame.filename}, line {frame.lineno}, in {frame.function}\n\n"
|
||||
f"----> {code.strip()}"
|
||||
)
|
||||
|
||||
|
||||
def load(
|
||||
fullname: str,
|
||||
*,
|
||||
require: str | None = None,
|
||||
error_on_import: bool = False,
|
||||
suppress_warning: bool = False,
|
||||
) -> types.ModuleType:
|
||||
"""Return a lazily imported proxy for a module.
|
||||
|
||||
The proxy module delays actual import until first attribute access.
|
||||
|
||||
Example:
|
||||
np = lazy.load("numpy")
|
||||
|
||||
def myfunc():
|
||||
np.norm(...)
|
||||
|
||||
Warning:
|
||||
Lazily loading subpackages causes the parent package to be eagerly
|
||||
loaded. Use `lazy_loader.attach` instead for subpackages.
|
||||
|
||||
Args:
|
||||
fullname: The full name of the module to import (e.g., "scipy").
|
||||
require: A PEP-508 dependency requirement (e.g., "numpy >=1.24").
|
||||
If specified, raises an error if the installed version doesn't match.
|
||||
error_on_import: If True, raise import errors immediately.
|
||||
If False (default), delay errors until module is accessed.
|
||||
suppress_warning: If True, suppress the warning when loading subpackages.
|
||||
|
||||
Returns:
|
||||
A proxy module that loads on first attribute access.
|
||||
"""
|
||||
with _threadlock:
|
||||
module = sys.modules.get(fullname)
|
||||
|
||||
# Most common, short-circuit
|
||||
if module is not None and require is None:
|
||||
return module
|
||||
|
||||
have_module = module is not None
|
||||
|
||||
if not suppress_warning and "." in fullname:
|
||||
msg = (
|
||||
"subpackages can technically be lazily loaded, but it causes the "
|
||||
"package to be eagerly loaded even if it is already lazily loaded. "
|
||||
"So, you probably shouldn't use subpackages with this lazy feature."
|
||||
)
|
||||
warnings.warn(msg, RuntimeWarning, stacklevel=2)
|
||||
|
||||
spec = None
|
||||
|
||||
if not have_module:
|
||||
spec = importlib.util.find_spec(fullname)
|
||||
have_module = spec is not None
|
||||
|
||||
if not have_module:
|
||||
not_found_message = f"No module named '{fullname}'"
|
||||
elif require is not None:
|
||||
try:
|
||||
have_module = _check_requirement(require)
|
||||
except ModuleNotFoundError as e:
|
||||
raise ValueError(
    f"Found module '{fullname}' but cannot test "
    f"requirement '{require}'. "
|
||||
"Requirements must match distribution name, not module name."
|
||||
) from e
|
||||
|
||||
not_found_message = f"No distribution can be found matching '{require}'"
|
||||
|
||||
if not have_module:
|
||||
if error_on_import:
|
||||
raise ModuleNotFoundError(not_found_message)
|
||||
|
||||
parent = inspect.stack()[1]
|
||||
frame_data = _FrameData(
|
||||
filename=parent.filename,
|
||||
lineno=parent.lineno,
|
||||
function=parent.function,
|
||||
code_context=parent.code_context,
|
||||
)
|
||||
del parent
|
||||
return DelayedImportErrorModule(
|
||||
frame_data,
|
||||
"DelayedImportErrorModule",
|
||||
message=not_found_message,
|
||||
)
|
||||
|
||||
if spec is not None:
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
sys.modules[fullname] = module
|
||||
|
||||
if spec.loader is not None:
|
||||
loader = importlib.util.LazyLoader(spec.loader)
|
||||
loader.exec_module(module)
|
||||
|
||||
if module is None:
|
||||
raise ModuleNotFoundError(f"No module named '{fullname}'")
|
||||
|
||||
return module
|
||||
|
||||
|
||||
def _check_requirement(require: str) -> bool:
|
||||
"""Verify that a package requirement is satisfied.
|
||||
|
||||
Args:
|
||||
require: A dependency requirement as defined in PEP-508.
|
||||
|
||||
Returns:
|
||||
True if the installed version matches the requirement, False otherwise.
|
||||
|
||||
Raises:
|
||||
ModuleNotFoundError: If the package is not installed.
|
||||
"""
|
||||
req = packaging.requirements.Requirement(require)
|
||||
return req.specifier.contains(
|
||||
importlib.metadata.version(req.name),
|
||||
prereleases=True,
|
||||
)
|
||||
|
||||
|
||||
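Since `_check_requirement` leans on the `packaging` library, here is a small standalone example of the same PEP 508 check outside this module; the package name and versions are made up for illustration:

```python
# Standalone illustration of the PEP 508 requirement check used by _check_requirement.
from packaging.requirements import Requirement

req = Requirement("examplepkg >=1.24")
print(req.name)                                              # "examplepkg"
print(req.specifier.contains("1.26.0", prereleases=True))    # True
print(req.specifier.contains("1.20.0", prereleases=True))    # False
```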
@dataclass
|
||||
class _StubVisitor(ast.NodeVisitor):
|
||||
"""AST visitor to parse a stub file for submodules and submod_attrs."""
|
||||
|
||||
_submodules: set[str] = field(default_factory=set)
|
||||
_submod_attrs: dict[str, list[str]] = field(default_factory=dict)
|
||||
|
||||
def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
|
||||
"""Visit an ImportFrom node and extract submodule/attribute information.
|
||||
|
||||
Args:
|
||||
node: The AST ImportFrom node to visit.
|
||||
|
||||
Raises:
|
||||
ValueError: If the import is not a relative import or uses star import.
|
||||
"""
|
||||
if node.level != 1:
|
||||
raise ValueError(
|
||||
"Only within-module imports are supported (`from .* import`)"
|
||||
)
|
||||
names = [alias.name for alias in node.names]
|
||||
if node.module:
|
||||
if "*" in names:
|
||||
raise ValueError(
|
||||
f"lazy stub loader does not support star import "
|
||||
f"`from {node.module} import *`"
|
||||
)
|
||||
self._submod_attrs.setdefault(node.module, []).extend(names)
|
||||
else:
|
||||
self._submodules.update(names)
|
||||
|
||||
|
||||
def attach_stub(
|
||||
package_name: str,
|
||||
filename: str,
|
||||
) -> tuple[Callable[[str], Any], Callable[[], list[str]], list[str]]:
|
||||
"""Attach lazily loaded submodules and functions from a type stub.
|
||||
|
||||
Parses a `.pyi` stub file to infer submodules and attributes. This allows
|
||||
static type checkers to find imports while providing lazy loading at runtime.
|
||||
|
||||
Args:
|
||||
package_name: The package name, typically ``__name__``.
|
||||
filename: Path to `.py` file with an adjacent `.pyi` file.
|
||||
Typically use ``__file__``.
|
||||
|
||||
Returns:
|
||||
A tuple of (__getattr__, __dir__, __all__) to assign in the package.
|
||||
|
||||
Raises:
|
||||
ValueError: If stub file is not found or contains invalid imports.
|
||||
"""
|
||||
path = Path(filename)
|
||||
stubfile = path if path.suffix == ".pyi" else path.with_suffix(".pyi")
|
||||
|
||||
if not stubfile.exists():
|
||||
raise ValueError(f"Cannot load imports from non-existent stub {stubfile!r}")
|
||||
|
||||
visitor = _StubVisitor()
|
||||
visitor.visit(ast.parse(stubfile.read_text()))
|
||||
return attach(package_name, visitor._submodules, visitor._submod_attrs)
|
||||
|
||||
|
||||
def lazy_exports_stub(package_name: str, filename: str) -> None:
|
||||
"""Install lazy loading on a module based on its .pyi stub file.
|
||||
|
||||
Parses the adjacent `.pyi` stub file to determine what to export lazily.
|
||||
Type checkers see the stub, runtime gets lazy loading.
|
||||
|
||||
Example:
|
||||
# __init__.py
|
||||
from crewai.utilities.lazy import lazy_exports_stub
|
||||
lazy_exports_stub(__name__, __file__)
|
||||
|
||||
# __init__.pyi
|
||||
from .config import ChromaDBConfig, ChromaDBSettings
|
||||
from .types import EmbeddingType
|
||||
|
||||
Args:
|
||||
package_name: The package name, typically ``__name__``.
|
||||
filename: Path to the module file, typically ``__file__``.
|
||||
"""
|
||||
__getattr__, __dir__, __all__ = attach_stub(package_name, filename)
|
||||
module = sys.modules[package_name]
|
||||
module.__getattr__ = __getattr__ # type: ignore[method-assign]
|
||||
module.__dir__ = __dir__ # type: ignore[method-assign]
|
||||
module.__dict__["__all__"] = __all__
|
||||
|
||||
|
||||
def lazy_exports(
|
||||
package_name: str,
|
||||
submod_attrs: dict[str, list[str]],
|
||||
submodules: set[str] | None = None,
|
||||
) -> None:
|
||||
"""Install lazy loading on a module.
|
||||
|
||||
Example:
|
||||
from crewai.utilities.lazy import lazy_exports
|
||||
|
||||
lazy_exports(__name__, {
|
||||
'config': ['ChromaDBConfig', 'ChromaDBSettings'],
|
||||
'types': ['EmbeddingType'],
|
||||
})
|
||||
|
||||
Args:
|
||||
package_name: The package name, typically ``__name__``.
|
||||
submod_attrs: Mapping of submodule names to lists of attributes.
|
||||
submodules: Optional set of submodule names to expose directly.
|
||||
"""
|
||||
__getattr__, __dir__, __all__ = attach(package_name, submodules, submod_attrs)
|
||||
module = sys.modules[package_name]
|
||||
module.__getattr__ = __getattr__ # type: ignore[method-assign]
|
||||
module.__dir__ = __dir__ # type: ignore[method-assign]
|
||||
module.__dict__["__all__"] = __all__
|
||||
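Under the hood, `attach`/`lazy_exports` rely on module-level `__getattr__` and `__dir__` (PEP 562). A minimal self-contained demonstration of that mechanism, independent of this utility and with illustrative names:

```python
# demo_pkg/__init__.py -- minimal PEP 562 lazy attribute loading, the mechanism
# that attach()/lazy_exports() install automatically.
import importlib
from typing import Any

_lazy_attrs = {"sqrt": "math", "dumps": "json"}  # attribute -> providing module
__all__ = sorted(_lazy_attrs)


def __getattr__(name: str) -> Any:
    if name in _lazy_attrs:
        module = importlib.import_module(_lazy_attrs[name])  # imported on first access
        value = getattr(module, name)
        globals()[name] = value  # cache so later lookups skip __getattr__
        return value
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")


def __dir__() -> list[str]:
    return __all__.copy()
```

Importing `demo_pkg` and then touching `demo_pkg.sqrt` triggers the `math` import at that moment rather than at package import time, which is the effect `attach()` automates for a package's declared submodules and attributes.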
@@ -1,6 +1,6 @@
|
||||
import logging
|
||||
import os
|
||||
import threading
|
||||
from time import sleep
|
||||
from unittest.mock import MagicMock, patch
|
||||
|
||||
from crewai.agents.agent_builder.utilities.base_token_process import TokenProcess
|
||||
@@ -18,15 +18,9 @@ from pydantic import BaseModel
|
||||
import pytest
|
||||
|
||||
|
||||
# TODO: This test fails without print statement, which makes me think that something is happening asynchronously that we need to eventually fix and dive deeper into at a later date
|
||||
@pytest.mark.vcr()
|
||||
def test_llm_callback_replacement():
|
||||
"""Test that callbacks are properly isolated between LLM instances.
|
||||
|
||||
This test verifies that the race condition fix (using _callback_lock) works
|
||||
correctly. Previously, this test required a sleep(5) workaround because
|
||||
callbacks were being modified globally without synchronization, causing
|
||||
one LLM instance's callbacks to interfere with another's.
|
||||
"""
|
||||
llm1 = LLM(model="gpt-4o-mini", is_litellm=True)
|
||||
llm2 = LLM(model="gpt-4o-mini", is_litellm=True)
|
||||
|
||||
@@ -43,6 +37,7 @@ def test_llm_callback_replacement():
|
||||
messages=[{"role": "user", "content": "Hello, world from another agent!"}],
|
||||
callbacks=[calc_handler_2],
|
||||
)
|
||||
sleep(5)
|
||||
usage_metrics_2 = calc_handler_2.token_cost_process.get_summary()
|
||||
|
||||
# The first handler should not have been updated
|
||||
@@ -51,66 +46,6 @@ def test_llm_callback_replacement():
|
||||
assert usage_metrics_1 == calc_handler_1.token_cost_process.get_summary()
|
||||
|
||||
|
||||
def test_llm_callback_lock_prevents_race_condition():
|
||||
"""Test that the _callback_lock prevents race conditions in concurrent LLM calls.
|
||||
|
||||
This test verifies that multiple threads can safely call LLM.call() with
|
||||
different callbacks without interfering with each other. The lock ensures
|
||||
that callbacks are properly isolated between concurrent calls.
|
||||
"""
|
||||
num_threads = 5
|
||||
results: list[int] = []
|
||||
errors: list[Exception] = []
|
||||
lock = threading.Lock()
|
||||
|
||||
def make_llm_call(thread_id: int, mock_completion: MagicMock) -> None:
|
||||
try:
|
||||
llm = LLM(model="gpt-4o-mini", is_litellm=True)
|
||||
calc_handler = TokenCalcHandler(token_cost_process=TokenProcess())
|
||||
|
||||
mock_message = MagicMock()
|
||||
mock_message.content = f"Response from thread {thread_id}"
|
||||
mock_choice = MagicMock()
|
||||
mock_choice.message = mock_message
|
||||
mock_response = MagicMock()
|
||||
mock_response.choices = [mock_choice]
|
||||
mock_response.usage = {
|
||||
"prompt_tokens": 10,
|
||||
"completion_tokens": 10,
|
||||
"total_tokens": 20,
|
||||
}
|
||||
mock_completion.return_value = mock_response
|
||||
|
||||
llm.call(
|
||||
messages=[{"role": "user", "content": f"Hello from thread {thread_id}"}],
|
||||
callbacks=[calc_handler],
|
||||
)
|
||||
|
||||
usage = calc_handler.token_cost_process.get_summary()
|
||||
with lock:
|
||||
results.append(usage.successful_requests)
|
||||
except Exception as e:
|
||||
with lock:
|
||||
errors.append(e)
|
||||
|
||||
with patch("litellm.completion") as mock_completion:
|
||||
threads = [
|
||||
threading.Thread(target=make_llm_call, args=(i, mock_completion))
|
||||
for i in range(num_threads)
|
||||
]
|
||||
|
||||
for t in threads:
|
||||
t.start()
|
||||
for t in threads:
|
||||
t.join()
|
||||
|
||||
assert len(errors) == 0, f"Errors occurred: {errors}"
|
||||
assert len(results) == num_threads
|
||||
assert all(
|
||||
r == 1 for r in results
|
||||
), f"Expected all callbacks to have 1 successful request, got {results}"
|
||||
|
||||
|
||||
@pytest.mark.vcr()
|
||||
def test_llm_call_with_string_input():
|
||||
llm = LLM(model="gpt-4o-mini")
|
||||
@@ -1054,4 +989,4 @@ async def test_usage_info_streaming_with_acall():
|
||||
assert llm._token_usage["completion_tokens"] > 0
|
||||
assert llm._token_usage["total_tokens"] > 0
|
||||
|
||||
assert len(result) > 0
|
||||
assert len(result) > 0
|
||||