mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-07-01 21:28:10 +00:00
* feat(cli): introduce JSON crew project support and TUI enhancements
  - Added support for creating and running JSON-defined crew projects, allowing users to scaffold projects with a new `create_json_crew.py` file.
  - Implemented a full-screen Textual TUI for crew execution in `crew_run_tui.py`, enhancing user interaction with a two-column layout.
  - Updated `run_crew.py` to prioritize JSON crew projects and added daemon mode for running without TUI.
  - Introduced interactive pickers in `tui_picker.py` for improved CLI prompts.
  - Enhanced validation for JSON crew files in `validate.py` to ensure proper structure and agent definitions.
  - Updated `.gitignore` to exclude demo and crewai directories.

* feat: update LLM model references to gpt-5.4-mini
  - Changed default LLM model from gpt-4o-mini to gpt-5.4-mini across various files, including CLI options, JSON crew configurations, and agent definitions.
  - Enhanced benchmark and human feedback functionalities to utilize the new model.
  - Improved user interface elements in the TUI for better interaction and feedback during execution.
  - Added support for new skills directory in JSON crew project creation.

* feat(benchmark): add crew-level benchmarking functionality
  - Introduced a new `benchmark` command in the CLI for crew-level benchmarking, allowing users to specify agents, models, and timeout settings.
  - Implemented `CrewBenchmarkCase` to handle crew-level benchmark cases with inputs and criteria.
  - Enhanced the benchmark runner to support progress tracking and detailed reporting of results for multiple models.
  - Added tests for loading crew benchmark cases and validating their structure.
  - Updated existing benchmark functions to accommodate the new crew-level execution model.

* feat(cli): enhance JSON crew project functionality and TUI improvements
  - Added optional agent-level guardrails and advanced options in JSON crew configurations to improve output validation and flexibility.
  - Updated the TUI to better handle plan step statuses, including visual indicators for task completion and failure.
  - Introduced methods for parsing and managing step observation events, ensuring accurate updates to task statuses during execution.
  - Enhanced validation for JSON crew projects, ensuring proper structure and error handling for agent and task definitions.
  - Added comprehensive tests for new features and validation logic, ensuring robustness in JSON crew project handling.

* refactor(cli): streamline JSON crew project handling and improve validation
  - Refactored JSON crew project loading and validation logic to enhance clarity and maintainability.
  - Introduced utility functions for finding JSON crew files, improving code reuse across modules.
  - Removed deprecated benchmark functionality and associated tests to simplify the codebase.
  - Updated CLI commands to utilize the new JSON project structure, ensuring compatibility with recent changes.
  - Enhanced test coverage for JSON crew project features, ensuring robust validation and error handling.

* feat(cli): enhance activity log navigation and focus management
  - Added functionality to focus on the activity log when navigating through log entries.
  - Implemented refresh logic for the log panel to ensure updates are displayed correctly during navigation.
  - Improved keyboard navigation for log entries, allowing users to expand and scroll through logs seamlessly.
  - Added tests to verify the correct behavior of log navigation and focus management in the TUI.

* feat(cli): enhance JSON crew project interaction and input handling
  - Introduced a new function to enable prompt line editing for better user experience during input prompts.
  - Updated the JSON crew project wizards to show interpolation hints for dynamic values, improving user guidance.
  - Enhanced the handling of missing input placeholders by prompting users for required values during crew setup.
  - Refactored the crew run logic to ensure proper loading and preparation of JSON-defined crews, including runtime input management.
  - Added tests to verify the correct behavior of new input handling features and JSON crew project interactions.

* feat(cli): improve crew project input prompts and event handling
  - Enhanced the `_prompt_text` function to allow for configurable spacing before prompts, improving user experience during input collection.
  - Updated the wizards for agent and task creation to utilize the new prompt configuration, ensuring a more compact and streamlined interaction.
  - Introduced new plan step lifecycle events (`PlanStepStartedEvent`, `PlanStepCompletedEvent`) to better track the execution status of plan steps.
  - Refactored the step executor to emit these events during the execution of tasks, improving observability and debugging capabilities.
  - Added tests to verify the correct behavior of new prompt handling and event emissions during crew project execution.

* fix: refine json-first crew interactions

* fix: prioritize common json crew tools

* fix: make json crew more tools expandable

* fix: show json crew tools by category

* feat(memory): update default embedder to OpenAI text-embedding-3-large and enhance memory compatibility
  - Changed the default embedding model for Memory to OpenAI text-embedding-3-large, which uses 3072-dimensional vectors.
  - Added warnings regarding compatibility issues with existing local memory stores created with 1536-dimensional embeddings.
  - Updated documentation to reflect the new default embedder and its configuration options.
  - Enhanced the CLI and codebase to support the new embedding model across various components, ensuring a seamless transition for users.

* fix: address PR review feedback for JSON-first crews

  Review blockers:
  - Forward trained_agents_file to JSON crews: crewai run -f now exports CREWAI_TRAINED_AGENTS_FILE for the in-process JSON crew path
  - Wizard agent picker: Esc/cancel now reprompts instead of silently assigning the first agent
  - JSON tool resolution hard-fails: unknown tool names, missing custom tool files, and invalid custom tool modules raise JSONProjectError with actionable messages instead of warn-and-continue
  - Embedding dimension mismatch: LanceDB and Qdrant Edge storages raise EmbeddingDimensionMismatchError with reset/pin guidance instead of silently zero-filling vectors or returning empty search results
  - Custom tool code execution documented in loader docstring and the scaffolded project README

  CI fixes:
  - ruff format across lib/
  - All 133 PR-introduced mypy errors fixed (llm.py lazy-litellm and cli.py lazy command shims now use TYPE_CHECKING imports; textual is_mounted misuse fixed; pick_many overloads; misc annotations)

  Bot review comments:
  - Empty except blocks now have explanatory comments or debug logging
  - Removed unused _C_BG/_C_PANEL/_C_BORDER globals and redundant import re; tests use a single import style for create_json_crew

  Tests: trained-agents propagation, wizard cancel, tool resolution failures, and dimension mismatch guidance.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: address second round of PR review comments

  Cursor Bugbot:
  - Wizard agent slugs: strip to [a-z0-9_] and fall back to agent_<n> so symbol-only roles can't produce an empty agents/.jsonc filename
  - Wizard task names: dedupe against prior task names and fall back to task_<n> for symbol-only descriptions

  CodeRabbit:
  - Agent.message(): import Task explicitly at runtime instead of relying on the namespace injection done by crewai/__init__
  - Async executor: move the native-tools-unsupported fallback from _ainvoke_loop_react (self-recursion) to _ainvoke_loop_native_tools, mirroring the sync implementation
  - StepExecutor downgrade: keep the in-step conversation and append the text-tooling instructions instead of rebuilding messages, so completed native tool calls are not re-executed
  - crewai-files: extension-based MIME lookup now runs before byte sniffing so csv/xml types are not degraded to text/plain
  - Memory storages: validate every record in a save() batch against a consistent embedding dimension (LanceDB previously checked only the first record); added mixed-batch tests
  - _print_post_tui_summary now typed against CrewRunApp
  - Docs: Azure OpenAI default embedder change called out in the memory migration warning and provider table

  Code quality bots:
  - Removed unused _C_YELLOW/_C_CYAN (crew_run_tui) and _GREEN (tui_picker)

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(cli): accordion tool picker in JSON crew wizard

  The flat tool list had grown to ~90 rows. The picker now shows:
  - Common tools always visible at the top
  - Every other category as a single expandable row with tool and selection counts (e.g. "Search & Research (27 tools, 2 selected)")
  - Expanding a category collapses the previously expanded one
  - Selections persist across expand/collapse via new preselected support in pick_many; cursor follows the toggled category row

  tui_picker gains preselected + initial_cursor options on pick_many, and Esc in multi-select now confirms the current selection instead of discarding it (required so collapsing can't silently drop choices).

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* refactor(cli): remove --daemon flag from crewai run

  The flag only affected JSON crew projects; classic and flow projects ignored it entirely, which made the behavior inconsistent. Removed the option, the daemon code path (_run_json_crew_daemon), and its helper (_load_json_crew_with_inputs).

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test: update run command tests after --daemon removal

  lib/crewai/tests/cli/test_run_crew.py still asserted the old run_crew(trained_agents_file=..., daemon=False) call signature.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): exit codes, mid-run quit, async statuses, hyphen placeholders

  Addresses the latest Bugbot review round:
  - Failed JSON crew runs now exit non-zero (SystemExit(1)) so scripts and CI don't treat failures as success, mirroring the classic path
  - Quitting the TUI mid-run now ends the process (os._exit(130)); kickoff runs in a thread worker that cannot be force-cancelled, so letting the CLI return would leave LLM/tool work burning tokens in the background
  - Sidebar task statuses are now async-safe: completion/failure events resolve the task's own row via identity instead of assuming the most recently started task, and starting a task no longer blanket-marks earlier active rows as done
  - The runtime-input prompt regex now accepts hyphenated placeholder names ({my-topic}), matching kickoff's interpolation pattern

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: validation safety, custom tool sandboxing, TUI log integrity, memory error surfacing
  - Deploy validation no longer executes project code: validation mode checks tool declarations structurally (well-formed entries, custom tool file exists) without importing or instantiating anything. custom:<name> resolution only happens on the actual run path.
  - custom:<name> is constrained to [A-Za-z_][A-Za-z0-9_]* and the resolved path must stay inside the project's tools/ directory, so custom:../foo or absolute-path names cannot execute code outside it. Tool paths resolve relative to the crew project root, not cwd.
  - TUI task logs are built from per-task state captured at task start (idx, description, agent, start time); an out-of-order completion takes its output from the event and no longer steals or resets the current task's streamed steps/output.
  - EmbeddingDimensionMismatchError now inherits ValueError instead of RuntimeError so background saves surface it through MemorySaveFailedEvent instead of silently dropping the save; the shutdown catch in _background_encode_batch is narrowed to the "cannot schedule new futures" case.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): declared project type wins over crew.json presence

  A flow project that also contains a crew.json(c) file now runs and validates as the flow it declares in pyproject.toml instead of being hijacked by the JSON crew path. Both crewai run (_has_json_crew) and deploy validation (_is_json_crew) check tool.crewai.type; a missing or unreadable pyproject still means a bare JSON crew project.

  Also documents why StepObservationFailedEvent intentionally marks the plan step "done": the event signals an observer failure, not a step failure, and the executor continues past it.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): type the declared_type locals so mypy stays clean

  Comparing an Any-typed .get() chain returns Any, which tripped no-any-return on the previous commit.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
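The custom tool constraint described in the commit notes (names limited to `[A-Za-z_][A-Za-z0-9_]*`, resolved path confined to the project's `tools/` directory) can be sketched roughly as below. This is not the project's loader: the function name `resolve_custom_tool_path`, the `<name>.py` file layout, and the use of `ValueError` are assumptions made for illustration only.

```python
import re
from pathlib import Path

# Hypothetical helper; the real loader's names and error types may differ.
_CUSTOM_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_custom_tool_path(project_root: Path, name: str) -> Path:
    """Map a custom:<name> reference to <project_root>/tools/<name>.py.

    Assumes one Python file per custom tool (an illustrative layout).
    """
    # Reject anything that is not a plain identifier ("../foo", "/abs", "a-b").
    if not _CUSTOM_NAME.fullmatch(name):
        raise ValueError(f"invalid custom tool name: {name!r}")

    # Resolve relative to the project root rather than the current directory.
    tools_dir = (project_root / "tools").resolve()
    candidate = (tools_dir / f"{name}.py").resolve()

    # Re-check after resolving, since a symlink could still point elsewhere.
    if not candidate.is_relative_to(tools_dir):
        raise ValueError(f"custom tool {name!r} resolves outside {tools_dir}")
    return candidate
```

Checking the name first keeps the error message simple; the containment check after `resolve()` is the second line of defence against symlinks. `Path.is_relative_to` requires Python 3.9 or newer.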
1141 lines
37 KiB
Python
# mypy: ignore-errors
import threading
from collections import defaultdict
from typing import cast
from unittest.mock import Mock, patch

from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.agent_events import LiteAgentExecutionStartedEvent
from crewai.events.types.tool_usage_events import ToolUsageStartedEvent
from crewai.lite_agent import LiteAgent
from crewai.lite_agent_output import LiteAgentOutput
from crewai.llms.base_llm import BaseLLM
from pydantic import BaseModel, Field
import pytest

from crewai import LLM, Agent
from crewai.flow import Flow, start
from crewai.tools import BaseTool
from crewai.types.usage_metrics import UsageMetrics


class SecretLookupTool(BaseTool):
    name: str = "secret_lookup"
    description: str = "A tool to lookup secrets"

    def _run(self) -> str:
        return "SUPERSECRETPASSWORD123"


class WebSearchTool(BaseTool):
    """Tool for searching the web for information."""

    name: str = "search_web"
    description: str = "Search the web for information about a topic."

    def _run(self, query: str) -> str:
        """Search the web for information about a topic."""
        if "tokyo" in query.lower():
            return "Tokyo's population in 2023 was approximately 21 million people in the city proper, and 37 million in the greater metropolitan area."
        if "climate change" in query.lower() and "coral" in query.lower():
            return "Climate change severely impacts coral reefs through: 1) Ocean warming causing coral bleaching, 2) Ocean acidification reducing calcification, 3) Sea level rise affecting light availability, 4) Increased storm frequency damaging reef structures. Sources: NOAA Coral Reef Conservation Program, Global Coral Reef Alliance."
        return f"Found information about {query}: This is a simulated search result for demonstration purposes."


class CalculatorTool(BaseTool):
    """Tool for performing calculations."""

    name: str = "calculate"
    description: str = "Calculate the result of a mathematical expression."

    def _run(self, expression: str) -> str:
        """Calculate the result of a mathematical expression."""
        try:
            result = eval(expression, {"__builtins__": {}})  # noqa: S307
            return f"The result of {expression} is {result}"
        except Exception as e:
            return f"Error calculating {expression}: {e!s}"


# Define a custom response format using Pydantic
class ResearchResult(BaseModel):
    """Structure for research results."""

    main_findings: str = Field(description="The main findings from the research")
    key_points: list[str] = Field(description="List of key points")
    sources: list[str] = Field(description="List of sources used")


@pytest.mark.vcr()
@pytest.mark.parametrize("verbose", [True, False])
def test_agent_kickoff_preserves_parameters(verbose):
    """Test that Agent.kickoff() uses the correct parameters from the Agent."""
    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: Test response"
    mock_llm.stop = []

    # UsageMetrics is already imported at module level.
    mock_usage_metrics = UsageMetrics(
        total_tokens=100,
        prompt_tokens=50,
        completion_tokens=50,
        cached_prompt_tokens=0,
        successful_requests=1,
    )
    mock_llm.get_token_usage_summary.return_value = mock_usage_metrics

    custom_tools = [WebSearchTool(), CalculatorTool()]
    max_iter = 10

    agent = Agent(
        role="Test Agent",
        goal="Test Goal",
        backstory="Test Backstory",
        llm=mock_llm,
        tools=custom_tools,
        max_iter=max_iter,
        verbose=verbose,
    )

    result = agent.kickoff("Test query")

    assert agent.role == "Test Agent"
    assert agent.goal == "Test Goal"
    assert agent.backstory == "Test Backstory"
    assert len(agent.tools) == 2
    assert isinstance(agent.tools[0], WebSearchTool)
    assert isinstance(agent.tools[1], CalculatorTool)
    assert agent.max_iter == max_iter
    assert agent.verbose == verbose

    assert result is not None
    assert result.raw is not None


@pytest.mark.vcr()
def test_lite_agent_with_tools():
    """Test that Agent can use tools."""
    llm = LLM(model="gpt-4o-mini")
    agent = Agent(
        role="Research Assistant",
        goal="Find information about the population of Tokyo",
        backstory="You are a helpful research assistant who can search for information about the population of Tokyo.",
        llm=llm,
        tools=[WebSearchTool()],
        verbose=True,
    )

    result = agent.kickoff(
        "What is the population of Tokyo and how many people would that be per square kilometer if Tokyo's area is 2,194 square kilometers?"
    )

    assert "21 million" in result.raw or "37 million" in result.raw, (
        "Agent should find Tokyo's population"
    )
    assert "per square kilometer" in result.raw, (
        "Agent should calculate population density"
    )

    received_events = []
    event_received = threading.Event()

    @crewai_event_bus.on(ToolUsageStartedEvent)
    def event_handler(source, event):
        received_events.append(event)
        event_received.set()

    agent.kickoff("What are the effects of climate change on coral reefs?")

    assert event_received.wait(timeout=5), "Timeout waiting for tool usage events"
    assert len(received_events) > 0, "Tool usage events should be emitted"
    event = received_events[0]
    assert isinstance(event, ToolUsageStartedEvent)
    assert event.agent_role == "Research Assistant"
    assert event.tool_name == "search_web"


@pytest.mark.vcr()
def test_lite_agent_structured_output():
    """Test that Agent can return a simple structured output."""

    class SimpleOutput(BaseModel):
        """Simple structure for agent outputs."""

        summary: str = Field(description="A brief summary of findings")
        confidence: int = Field(description="Confidence level from 1-100")

    web_search_tool = WebSearchTool()

    llm = LLM(model="gpt-4o-mini")
    agent = Agent(
        role="Info Gatherer",
        goal="Provide brief information",
        backstory="You gather and summarize information quickly.",
        llm=llm,
        tools=[web_search_tool],
        verbose=True,
    )

    result = agent.kickoff(
        "What is the population of Tokyo? Return your structured output in JSON format with the following fields: summary, confidence",
        response_format=SimpleOutput,
    )

    assert result.pydantic is not None, "Should return a Pydantic model"

    output = cast(SimpleOutput, result.pydantic)

    assert isinstance(output.summary, str), "Summary should be a string"
    assert len(output.summary) > 0, "Summary should not be empty"
    assert isinstance(output.confidence, int), "Confidence should be an integer"
    assert 1 <= output.confidence <= 100, "Confidence should be between 1 and 100"

    assert "tokyo" in output.summary.lower() or "population" in output.summary.lower()

    assert result.usage_metrics is not None

    return result


@pytest.mark.vcr()
def test_lite_agent_returns_usage_metrics():
    """Test that LiteAgent returns usage metrics."""
    llm = LLM(model="gpt-4o-mini")
    agent = Agent(
        role="Research Assistant",
        goal="Find information about the population of Tokyo",
        backstory="You are a helpful research assistant who can search for information about the population of Tokyo.",
        llm=llm,
        tools=[WebSearchTool()],
        verbose=True,
    )

    result = agent.kickoff(
        "What is the population of Tokyo? Return your structured output in JSON format with the following fields: summary, confidence"
    )

    assert result.usage_metrics is not None
    assert result.usage_metrics["total_tokens"] > 0


@pytest.mark.vcr()
def test_lite_agent_output_includes_messages():
    """Test that LiteAgentOutput includes messages from agent execution."""
    llm = LLM(model="gpt-4o-mini")
    agent = Agent(
        role="Research Assistant",
        goal="Find information about the population of Tokyo",
        backstory="You are a helpful research assistant who can search for information about the population of Tokyo.",
        llm=llm,
        tools=[WebSearchTool()],
        verbose=True,
    )

    result = agent.kickoff("What is the population of Tokyo?")

    assert isinstance(result, LiteAgentOutput)
    assert hasattr(result, "messages")
    assert isinstance(result.messages, list)
    assert len(result.messages) > 0


@pytest.mark.vcr()
@pytest.mark.asyncio
async def test_lite_agent_returns_usage_metrics_async():
    """Test that LiteAgent returns usage metrics when run asynchronously."""
    llm = LLM(model="gpt-4o-mini")
    agent = Agent(
        role="Research Assistant",
        goal="Find information about the population of Tokyo",
        backstory="You are a helpful research assistant who can search for information about the population of Tokyo.",
        llm=llm,
        tools=[WebSearchTool()],
        verbose=True,
    )

    result = await agent.kickoff_async(
        "What is the population of Tokyo? Return your structured output in JSON format with the following fields: summary, confidence"
    )
    assert isinstance(result, LiteAgentOutput)
    assert (
        "21 million" in result.raw
        or "37 million" in result.raw
        or "21000000" in result.raw
        or "37000000" in result.raw
    )
    assert result.usage_metrics is not None
    assert result.usage_metrics["total_tokens"] > 0


class TestFlow(Flow):
    """A test flow that creates and runs an agent."""

    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = tools
        super().__init__()

    @start()
    def start(self):
        agent = Agent(
            role="Test Agent",
            goal="Test Goal",
            backstory="Test Backstory",
            llm=self.llm,
            tools=self.tools,
        )
        return agent.kickoff("Test query")


def verify_agent_flow_context(result, agent, flow):
    """Verify that both the result and agent have the correct flow context."""
    assert result._flow_id == flow.flow_id  # type: ignore[attr-defined]
    assert result._request_id == flow.flow_id  # type: ignore[attr-defined]
    assert agent is not None
    assert agent._flow_id == flow.flow_id  # type: ignore[attr-defined]
    assert agent._request_id == flow.flow_id  # type: ignore[attr-defined]


def test_sets_flow_context_when_inside_flow():
    """Test that an Agent can be created and executed inside a Flow context."""
    captured_event = None

    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Test response"
    mock_llm.stop = []

    # UsageMetrics is already imported at module level.
    mock_usage_metrics = UsageMetrics(
        total_tokens=100,
        prompt_tokens=50,
        completion_tokens=50,
        cached_prompt_tokens=0,
        successful_requests=1,
    )
    mock_llm.get_token_usage_summary.return_value = mock_usage_metrics

    class MyFlow(Flow):
        @start()
        def start(self):
            agent = Agent(
                role="Test Agent",
                goal="Test Goal",
                backstory="Test Backstory",
                llm=mock_llm,
                tools=[WebSearchTool()],
            )
            return agent.kickoff("Test query")

    flow = MyFlow()
    event_received = threading.Event()

    @crewai_event_bus.on(LiteAgentExecutionStartedEvent)
    def capture_event(source, event):
        nonlocal captured_event
        captured_event = event
        event_received.set()

    result = flow.kickoff()

    assert event_received.wait(timeout=5), "Timeout waiting for agent execution event"
    assert captured_event is not None
    assert captured_event.agent_info["role"] == "Test Agent"
    assert result is not None


@pytest.mark.vcr()
def test_guardrail_is_called_using_string():
    """Test that a failing guardrail triggers events and a retry.

    Uses a callable guardrail that deterministically fails on the first
    attempt and passes on the second. This tests the guardrail event
    machinery (started/completed events, retry loop) without depending
    on the LLM to comply with contradictory constraints.
    """
    guardrail_events: dict[str, list] = defaultdict(list)
    from crewai.events.event_types import (
        LLMGuardrailCompletedEvent,
        LLMGuardrailStartedEvent,
    )

    # Deterministic guardrail: fail first call, pass second
    call_count = {"n": 0}

    def fail_then_pass_guardrail(output):
        call_count["n"] += 1
        if call_count["n"] == 1:
            return (False, "Missing required format — please use a numbered list")
        return (True, output)

    agent = Agent(
        role="Sports Analyst",
        goal="List the best soccer players",
        backstory="You are an expert at gathering and organizing information.",
        guardrail=fail_then_pass_guardrail,
        guardrail_max_retries=3,
    )

    condition = threading.Condition()

    @crewai_event_bus.on(LLMGuardrailStartedEvent)
    def capture_guardrail_started(source, event):
        assert isinstance(source, Agent)
        with condition:
            guardrail_events["started"].append(event)
            condition.notify()

    @crewai_event_bus.on(LLMGuardrailCompletedEvent)
    def capture_guardrail_completed(source, event):
        assert isinstance(source, Agent)
        with condition:
            guardrail_events["completed"].append(event)
            condition.notify()

    result = agent.kickoff(messages="Top 5 best soccer players in the world?")

    with condition:
        success = condition.wait_for(
            lambda: len(guardrail_events["started"]) >= 2
            and any(e.success for e in guardrail_events["completed"]),
            timeout=10,
        )
    assert success, "Timeout waiting for successful guardrail event"
    assert len(guardrail_events["started"]) >= 2
    assert len(guardrail_events["completed"]) >= 2
    assert not guardrail_events["completed"][0].success
    successful_events = [e for e in guardrail_events["completed"] if e.success]
    assert len(successful_events) >= 1, "Expected at least one successful guardrail completion"
    assert result is not None


@pytest.mark.vcr()
def test_guardrail_is_called_using_callable():
    """Test that a passing callable guardrail runs once and its value replaces the output."""
    guardrail_events: dict[str, list] = defaultdict(list)
    from crewai.events.event_types import (
        LLMGuardrailCompletedEvent,
        LLMGuardrailStartedEvent,
    )

    condition = threading.Condition()

    @crewai_event_bus.on(LLMGuardrailStartedEvent)
    def capture_guardrail_started(source, event):
        with condition:
            guardrail_events["started"].append(event)
            condition.notify()

    @crewai_event_bus.on(LLMGuardrailCompletedEvent)
    def capture_guardrail_completed(source, event):
        with condition:
            guardrail_events["completed"].append(event)
            condition.notify()

    agent = Agent(
        role="Sports Analyst",
        goal="Gather information about the best soccer players",
        backstory="""You are an expert at gathering and organizing information. You carefully collect details and present them in a structured way.""",
        guardrail=lambda output: (True, "Pelé - Santos, 1958"),
    )

    result = agent.kickoff(messages="Top 1 best players in the world?")

    with condition:
        success = condition.wait_for(
            lambda: len(guardrail_events["started"]) >= 1
            and len(guardrail_events["completed"]) >= 1,
            timeout=10,
        )
    assert success, "Timeout waiting for all guardrail events"
    assert len(guardrail_events["started"]) == 1
    assert len(guardrail_events["completed"]) == 1
    assert guardrail_events["completed"][0].success
    assert "Pelé - Santos, 1958" in result.raw


@pytest.mark.vcr()
def test_guardrail_reached_attempt_limit():
    """Test that kickoff raises once the guardrail has failed the initial call and every retry."""
    guardrail_events: dict[str, list] = defaultdict(list)
    from crewai.events.event_types import (
        LLMGuardrailCompletedEvent,
        LLMGuardrailStartedEvent,
    )

    condition = threading.Condition()

    @crewai_event_bus.on(LLMGuardrailStartedEvent)
    def capture_guardrail_started(source, event):
        with condition:
            guardrail_events["started"].append(event)
            condition.notify()

    @crewai_event_bus.on(LLMGuardrailCompletedEvent)
    def capture_guardrail_completed(source, event):
        with condition:
            guardrail_events["completed"].append(event)
            condition.notify()

    agent = Agent(
        role="Sports Analyst",
        goal="Gather information about the best soccer players",
        backstory="""You are an expert at gathering and organizing information. You carefully collect details and present them in a structured way.""",
        guardrail=lambda output: (
            False,
            "You are not allowed to include Brazilian players",
        ),
        guardrail_max_retries=2,
    )

    with pytest.raises(
        Exception, match="Agent's guardrail failed validation after 2 retries"
    ):
        agent.kickoff(messages="Top 10 best players in the world?")

    with condition:
        success = condition.wait_for(
            lambda: len(guardrail_events["started"]) >= 3
            and len(guardrail_events["completed"]) >= 3,
            timeout=10,
        )
    assert success, "Timeout waiting for all guardrail events"
    assert len(guardrail_events["started"]) == 3  # 2 retries + 1 initial call
    assert len(guardrail_events["completed"]) == 3  # 2 retries + 1 initial call
    assert not guardrail_events["completed"][0].success
    assert not guardrail_events["completed"][1].success
    assert not guardrail_events["completed"][2].success


@pytest.mark.vcr()
def test_agent_output_when_guardrail_returns_base_model():
    """Test that a BaseModel returned by the guardrail is exposed as result.pydantic."""

    class Player(BaseModel):
        name: str
        country: str

    agent = Agent(
        role="Sports Analyst",
        goal="Gather information about the best soccer players",
        backstory="""You are an expert at gathering and organizing information. You carefully collect details and present them in a structured way.""",
        guardrail=lambda output: (
            True,
            Player(name="Lionel Messi", country="Argentina"),
        ),
    )

    result = agent.kickoff(messages="Top 10 best players in the world?")

    assert result.pydantic == Player(name="Lionel Messi", country="Argentina")


def test_lite_agent_with_custom_llm_and_guardrails():
    """Test that CustomLLM (inheriting from BaseLLM) works with guardrails."""

    class CustomLLM(BaseLLM):
        def __init__(self, response: str = "Custom response"):
            super().__init__(model="custom-model")
            self.response = response
            self.call_count = 0

        def call(
            self,
            messages,
            tools=None,
            callbacks=None,
            available_functions=None,
            from_task=None,
            from_agent=None,
            response_model=None,
        ) -> str:
            self.call_count += 1

            if "valid" in str(messages) and "feedback" in str(messages):
                return '{"valid": true, "feedback": null}'

            if "Thought:" in str(messages):
                return f"Thought: I will analyze soccer players\nFinal Answer: {self.response}"

            return self.response

        def supports_function_calling(self) -> bool:
            return False

        def supports_stop_words(self) -> bool:
            return False

        def get_context_window_size(self) -> int:
            return 4096

    custom_llm = CustomLLM(response="Brazilian soccer players are the best!")

    agent = LiteAgent(
        role="Sports Analyst",
        goal="Analyze soccer players",
        backstory="You analyze soccer players and their performance.",
        llm=custom_llm,
        guardrail="Only include Brazilian players",
    )

    result = agent.kickoff("Tell me about the best soccer players")

    assert custom_llm.call_count > 0
    assert "Brazilian" in result.raw

    custom_llm2 = CustomLLM(response="Original response")

    def test_guardrail(output):
        return (True, "Modified by guardrail")

    agent2 = LiteAgent(
        role="Test Agent",
        goal="Test goal",
        backstory="Test backstory",
        llm=custom_llm2,
        guardrail=test_guardrail,
    )

    result2 = agent2.kickoff("Test message")
    assert result2.raw == "Modified by guardrail"


@pytest.mark.vcr()
def test_lite_agent_with_invalid_llm():
    """Test that LiteAgent raises proper error when create_llm returns None."""
    with patch("crewai.lite_agent.create_llm", return_value=None):
        with pytest.raises(ValueError) as exc_info:
            LiteAgent(
                role="Test Agent",
                goal="Test goal",
                backstory="Test backstory",
                llm="invalid-model",
            )
        assert "Expected LLM instance of type BaseLLM" in str(exc_info.value)


@patch.dict("os.environ", {"CREWAI_PLATFORM_INTEGRATION_TOKEN": "test_token"})
@patch("crewai_tools.tools.crewai_platform_tools.crewai_platform_action_tool.requests.post")
@patch("crewai_tools.tools.crewai_platform_tools.crewai_platform_tool_builder.requests.get")
@pytest.mark.vcr()
def test_agent_kickoff_with_platform_tools(mock_get, mock_post):
    """Test that Agent.kickoff() properly integrates platform tools with LiteAgent"""
    mock_response = Mock()
    mock_response.raise_for_status.return_value = None
    mock_response.json.return_value = {
        "actions": {
            "github": [
                {
                    "name": "create_issue",
                    "description": "Create a GitHub issue",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "title": {"type": "string", "description": "Issue title"},
                            "body": {"type": "string", "description": "Issue body"},
                        },
                        "required": ["title"],
                    },
                }
            ]
        }
    }
    mock_get.return_value = mock_response

    mock_post_response = Mock()
    mock_post_response.ok = True
    mock_post_response.json.return_value = {
        "success": True,
        "issue_url": "https://github.com/test/repo/issues/1"
    }
    mock_post.return_value = mock_post_response

    agent = Agent(
        role="Test Agent",
        goal="Test goal",
        backstory="Test backstory",
        llm=LLM(model="gpt-3.5-turbo"),
        apps=["github"],
        verbose=True
    )

    result = agent.kickoff("Create a GitHub issue")

    assert isinstance(result, LiteAgentOutput)
    assert result.raw is not None


@patch.dict("os.environ", {"EXA_API_KEY": "test_exa_key"})
@patch("crewai.agent.Agent.get_mcp_tools")
@pytest.mark.vcr()
def test_agent_kickoff_with_mcp_tools(mock_get_mcp_tools):
    """Test that Agent.kickoff() properly integrates MCP tools with LiteAgent"""
    class MockMCPTool(BaseTool):
        name: str = "exa_search"
        description: str = "Search the web using Exa"

        def _run(self, query: str) -> str:
            return f"Mock search results for: {query}"

    mock_get_mcp_tools.return_value = [MockMCPTool()]

    agent = Agent(
        role="Test Agent",
        goal="Test goal",
        backstory="Test backstory",
        llm=LLM(model="gpt-3.5-turbo"),
        mcps=["https://mcp.exa.ai/mcp?api_key=test_exa_key&profile=research"],
        verbose=True
    )

    result = agent.kickoff("Search for information about AI")

    assert isinstance(result, LiteAgentOutput)
    assert result.raw is not None

    mock_get_mcp_tools.assert_called_once_with(["https://mcp.exa.ai/mcp?api_key=test_exa_key&profile=research"])


from crewai.flow.flow import listen


@pytest.mark.vcr()
def test_lite_agent_inside_flow_sync():
    """Test that LiteAgent.kickoff() works magically inside a Flow.

    This tests the "magic auto-async" pattern where calling agent.kickoff()
    from within a Flow automatically detects the event loop and returns a
    coroutine that the Flow framework awaits. Users don't need to use async/await.
    """
    execution_log = []

    class TestFlow(Flow):
        @start()
        def run_agent(self):
            execution_log.append("flow_started")
            agent = Agent(
                role="Test Agent",
                goal="Answer questions",
                backstory="A helpful test assistant",
                llm=LLM(model="gpt-4o-mini"),
                verbose=False,
            )
            # Magic: just call kickoff() normally - it auto-detects Flow context
            result = agent.kickoff(messages="What is 2+2? Reply with just the number.")
            execution_log.append("agent_completed")
            return result

    flow = TestFlow()
    result = flow.kickoff()

    assert "flow_started" in execution_log
    assert "agent_completed" in execution_log
    assert result is not None
    assert isinstance(result, LiteAgentOutput)


@pytest.mark.vcr()
def test_lite_agent_inside_flow_with_tools():
    """Test that LiteAgent with tools works correctly inside a Flow."""
    class TestFlow(Flow):
        @start()
        def run_agent_with_tools(self):
            agent = Agent(
                role="Calculator Agent",
                goal="Perform calculations",
                backstory="A math expert",
                llm=LLM(model="gpt-4o-mini"),
                tools=[CalculatorTool()],
                verbose=False,
            )
            result = agent.kickoff(messages="Calculate 10 * 5")
            return result

    flow = TestFlow()
    result = flow.kickoff()

    assert result is not None
    assert isinstance(result, LiteAgentOutput)
    assert result.raw is not None


@pytest.mark.vcr()
def test_multiple_agents_in_same_flow():
    """Test that multiple LiteAgents can run sequentially in the same Flow."""
    class MultiAgentFlow(Flow):
        @start()
        def first_step(self):
            agent1 = Agent(
                role="First Agent",
                goal="Greet users",
                backstory="A friendly greeter",
                llm=LLM(model="gpt-4o-mini"),
                verbose=False,
            )
            return agent1.kickoff(messages="Say hello")

        @listen(first_step)
        def second_step(self, first_result):
            agent2 = Agent(
                role="Second Agent",
                goal="Say goodbye",
                backstory="A polite farewell agent",
                llm=LLM(model="gpt-4o-mini"),
                verbose=False,
            )
            return agent2.kickoff(messages="Say goodbye")

    flow = MultiAgentFlow()
    result = flow.kickoff()

    assert result is not None
    assert isinstance(result, LiteAgentOutput)


@pytest.mark.vcr()
def test_lite_agent_kickoff_async_inside_flow():
    """Test that Agent.kickoff_async() works correctly from async Flow methods."""
    class AsyncAgentFlow(Flow):
        @start()
        async def async_agent_step(self):
            agent = Agent(
                role="Async Test Agent",
                goal="Answer questions asynchronously",
                backstory="An async helper",
                llm=LLM(model="gpt-4o-mini"),
                verbose=False,
            )
            result = await agent.kickoff_async(messages="What is 3+3?")
            return result

    flow = AsyncAgentFlow()
    result = flow.kickoff()

    assert result is not None
    assert isinstance(result, LiteAgentOutput)


@pytest.mark.vcr()
def test_lite_agent_standalone_still_works():
    """Test that LiteAgent.kickoff() still works normally outside of a Flow.

    This verifies that the magic auto-async pattern doesn't break standalone usage
    where there's no event loop running.
    """
    agent = Agent(
        role="Standalone Agent",
        goal="Answer questions",
        backstory="A helpful assistant",
        llm=LLM(model="gpt-4o-mini"),
        verbose=False,
    )

    result = agent.kickoff(messages="What is 5+5? Reply with just the number.")

    assert result is not None
    assert isinstance(result, LiteAgentOutput)
    assert result.raw is not None


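# Illustrative sketch, not part of the crewAI test suite. The Flow tests above
# describe an "auto-async" pattern: kickoff() returns a coroutine when an
# event loop is already running and a plain result otherwise. One way such
# detection could work is shown below using asyncio.get_running_loop(). The
# names kickoff, _answer_async and flow_step are hypothetical, and crewAI's
# real implementation may detect the Flow context differently.

```python
import asyncio
import inspect


async def _answer_async(question: str) -> str:
    # Stand-in for the real asynchronous agent execution.
    await asyncio.sleep(0)
    return f"answer to: {question}"


def kickoff(question: str):
    """Return a result when standalone, or a coroutine inside a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running: drive the coroutine to completion ourselves.
        return asyncio.run(_answer_async(question))
    # A loop is running: hand the coroutine back for the caller to await.
    return _answer_async(question)


async def flow_step() -> str:
    # Mimics a framework that awaits whatever a synchronous step returns.
    maybe_coro = kickoff("2+2")
    if inspect.isawaitable(maybe_coro):
        return await maybe_coro
    return maybe_coro


standalone_result = kickoff("5+5")
flow_result = asyncio.run(flow_step())
```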
def test_agent_kickoff_with_files_parameter():
    """Test that Agent.kickoff() accepts and passes files to the executor."""
    from unittest.mock import Mock, patch

    from crewai_files import File

    from crewai.types.usage_metrics import UsageMetrics

    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: I can see the file content."
    mock_llm.stop = []
    mock_llm.supports_stop_words.return_value = False
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=100,
        prompt_tokens=50,
        completion_tokens=50,
        cached_prompt_tokens=0,
        successful_requests=1,
    )

    agent = Agent(
        role="File Analyzer",
        goal="Analyze files",
        backstory="An agent that analyzes files",
        llm=mock_llm,
        verbose=False,
    )

    test_file = File(source=b"mock pdf content")
    input_files = {"document.pdf": test_file}

    with patch.object(
        agent, "_prepare_kickoff", wraps=agent._prepare_kickoff
    ) as mock_prepare:
        result = agent.kickoff(messages="Analyze the document", input_files=input_files)

    mock_prepare.assert_called_once()
    call_args = mock_prepare.call_args
    assert call_args.args[0] == "Analyze the document"
    called_files = call_args.kwargs.get("input_files") or call_args.args[2]
    assert "document.pdf" in called_files
    assert called_files["document.pdf"] is test_file

    assert result is not None


def test_prepare_kickoff_extracts_files_from_messages():
    """Test that _prepare_kickoff extracts files from messages."""
    from unittest.mock import Mock

    from crewai_files import File

    from crewai.types.usage_metrics import UsageMetrics

    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: Done."
    mock_llm.stop = []
    mock_llm.supports_stop_words.return_value = False
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=100,
        prompt_tokens=50,
        completion_tokens=50,
        cached_prompt_tokens=0,
        successful_requests=1,
    )

    agent = Agent(
        role="Test Agent",
        goal="Test files",
        backstory="Test backstory",
        llm=mock_llm,
        verbose=False,
    )

    test_file = File(source=b"mock image content")
    messages = [
        {"role": "user", "content": "Analyze this", "files": {"img.png": test_file}}
    ]

    executor, inputs, agent_info, parsed_tools = agent._prepare_kickoff(messages=messages)

    assert "files" in inputs
    assert "img.png" in inputs["files"]
    assert inputs["files"]["img.png"] is test_file


def test_prepare_kickoff_merges_files_from_messages_and_parameter():
    """Test that _prepare_kickoff merges files from messages and parameter."""
    from unittest.mock import Mock

    from crewai_files import File

    from crewai.types.usage_metrics import UsageMetrics

    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: Done."
    mock_llm.stop = []
    mock_llm.supports_stop_words.return_value = False
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=100,
        prompt_tokens=50,
        completion_tokens=50,
        cached_prompt_tokens=0,
        successful_requests=1,
    )

    agent = Agent(
        role="Test Agent",
        goal="Test files",
        backstory="Test backstory",
        llm=mock_llm,
        verbose=False,
    )

    msg_file = File(source=b"message file content")
    param_file = File(source=b"param file content")
    messages = [
        {"role": "user", "content": "Analyze these", "files": {"from_msg.png": msg_file}}
    ]
    input_files = {"from_param.pdf": param_file}

    executor, inputs, agent_info, parsed_tools = agent._prepare_kickoff(
        messages=messages, input_files=input_files
    )

    assert "files" in inputs
    assert "from_msg.png" in inputs["files"]
    assert "from_param.pdf" in inputs["files"]
    assert inputs["files"]["from_msg.png"] is msg_file
    assert inputs["files"]["from_param.pdf"] is param_file


def test_prepare_kickoff_param_files_override_message_files():
    """Test that files parameter overrides files from messages with same name."""
    from unittest.mock import Mock

    from crewai_files import File

    from crewai.types.usage_metrics import UsageMetrics

    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: Done."
    mock_llm.stop = []
    mock_llm.supports_stop_words.return_value = False
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=100,
        prompt_tokens=50,
        completion_tokens=50,
        cached_prompt_tokens=0,
        successful_requests=1,
    )

    agent = Agent(
        role="Test Agent",
        goal="Test files",
        backstory="Test backstory",
        llm=mock_llm,
        verbose=False,
    )

    msg_file = File(source=b"message file content")
    param_file = File(source=b"param file content")
    messages = [
        {"role": "user", "content": "Analyze", "files": {"same.png": msg_file}}
    ]
    input_files = {"same.png": param_file}

    executor, inputs, agent_info, parsed_tools = agent._prepare_kickoff(
        messages=messages, input_files=input_files
    )

    assert "files" in inputs
    assert inputs["files"]["same.png"] is param_file


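# Illustrative sketch, not part of the crewAI test suite. The three
# _prepare_kickoff tests above pin down a merge rule: files attached to
# messages are collected, files passed via the input_files parameter are
# added, and the parameter wins when both use the same name. The helper
# merge_files below is a hypothetical stand-in that reproduces only that rule
# with plain dictionaries; it is not crewAI's implementation.

```python
from typing import Any, Dict, List, Optional


def merge_files(
    messages: List[Dict[str, Any]],
    input_files: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Collect files from messages, then let input_files override by name."""
    merged: Dict[str, Any] = {}
    for message in messages:
        merged.update(message.get("files") or {})
    # Applied last so that parameter files take precedence on a name clash.
    merged.update(input_files or {})
    return merged


msg_file = object()
param_file = object()

combined = merge_files(
    [{"role": "user", "content": "Analyze these", "files": {"from_msg.png": msg_file}}],
    {"from_param.pdf": param_file},
)
overridden = merge_files(
    [{"role": "user", "content": "Analyze", "files": {"same.png": msg_file}}],
    {"same.png": param_file},
)
```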
def test_lite_agent_verbose_false_suppresses_printer_output():
    """Test that setting verbose=False suppresses all printer output."""
    from crewai.agents.parser import AgentFinish
    from crewai.types.usage_metrics import UsageMetrics

    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: Hello!"
    mock_llm.stop = []
    mock_llm.supports_stop_words.return_value = False
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=100,
        prompt_tokens=50,
        completion_tokens=50,
        cached_prompt_tokens=0,
        successful_requests=1,
    )

    with pytest.warns(FutureWarning):
        agent = LiteAgent(
            role="Test Agent",
            goal="Test goal",
            backstory="Test backstory",
            llm=mock_llm,
            verbose=False,
        )

    mock_printer = Mock()
    with patch("crewai.lite_agent.PRINTER", mock_printer):
        result = agent.kickoff("Say hello")

    assert result is not None
    assert isinstance(result, LiteAgentOutput)
    mock_printer.print.assert_not_called()


@pytest.mark.filterwarnings("ignore:LiteAgent is deprecated")
def test_lite_agent_memory_none_default():
    """With memory=None (default), _memory is None and no memory is used."""
    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: Ok"
    mock_llm.stop = []
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=10,
        prompt_tokens=5,
        completion_tokens=5,
        cached_prompt_tokens=0,
        successful_requests=1,
    )
    agent = LiteAgent(
        role="Test",
        goal="Test goal",
        backstory="Test backstory",
        llm=mock_llm,
        memory=None,
        verbose=False,
    )
    assert agent._memory is None


@pytest.mark.filterwarnings("ignore:LiteAgent is deprecated")
def test_lite_agent_memory_true_resolves_to_default_memory():
    """With memory=True, _memory is a Memory instance."""
    from crewai.memory.unified_memory import Memory

    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: Ok"
    mock_llm.stop = []
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=10,
        prompt_tokens=5,
        completion_tokens=5,
        cached_prompt_tokens=0,
        successful_requests=1,
    )
    agent = LiteAgent(
        role="Test",
        goal="Test goal",
        backstory="Test backstory",
        llm=mock_llm,
        memory=True,
        verbose=False,
    )
    assert agent._memory is not None
    assert isinstance(agent._memory, Memory)
    assert agent._memory.llm is agent.llm


@pytest.mark.filterwarnings("ignore:LiteAgent is deprecated")
def test_lite_agent_memory_instance_recall_and_save_called():
    """With a custom memory instance, kickoff calls recall and then extract_memories/remember."""
    mock_llm = Mock(spec=LLM)
    mock_llm.call.return_value = "Final Answer: The answer is 42."
    mock_llm.stop = []
    mock_llm.supports_stop_words.return_value = False
    mock_llm.get_token_usage_summary.return_value = UsageMetrics(
        total_tokens=10,
        prompt_tokens=5,
        completion_tokens=5,
        cached_prompt_tokens=0,
        successful_requests=1,
    )
    mock_memory = Mock()
    mock_memory.read_only = False
    mock_memory.recall.return_value = []
    mock_memory.extract_memories.return_value = ["Fact one.", "Fact two."]

    agent = LiteAgent(
        role="Test",
        goal="Test goal",
        backstory="Test backstory",
        llm=mock_llm,
        memory=mock_memory,
        verbose=False,
    )
    assert agent._memory is mock_memory

    agent.kickoff("What is the answer?")

    mock_memory.recall.assert_called_once()
    call_kw = mock_memory.recall.call_args[1]
    assert call_kw.get("limit") == 10
    # depth is not passed explicitly; Memory.recall() defaults to "deep"
    mock_memory.extract_memories.assert_called_once()
    mock_memory.remember_many.assert_called_once_with(
        ["Fact one.", "Fact two."], agent_role="Test"
    )
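

# Illustrative sketch, not part of the crewAI test suite. The memory test
# above checks three interactions: recall() is called once with limit=10,
# extract_memories() is called once, and remember_many() receives the
# extracted facts together with agent_role. The function kickoff_with_memory
# is a hypothetical stand-in that produces the same call sequence; what
# crewAI actually passes to recall() and extract_memories(), and how it
# honours read_only, are assumptions here.

```python
from unittest.mock import Mock


def kickoff_with_memory(memory, answer_fn, query: str, role: str) -> str:
    """Hypothetical flow: recall context, answer, then persist extracted facts."""
    memory.recall(query, limit=10)
    answer = answer_fn(query)
    # Assumption: a read-only memory is consulted but never written to.
    if not memory.read_only:
        facts = memory.extract_memories(answer)
        if facts:
            memory.remember_many(facts, agent_role=role)
    return answer


mock_memory = Mock()
mock_memory.read_only = False
mock_memory.recall.return_value = []
mock_memory.extract_memories.return_value = ["Fact one.", "Fact two."]

answer = kickoff_with_memory(
    mock_memory, lambda q: "The answer is 42.", "What is the answer?", role="Test"
)
```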