Compare commits

..

2 Commits

Author SHA1 Message Date
Iris Clawd
71c7293197 docs: remove --public flag references (public tools no longer supported)
Remove all references to --public visibility flag since public tools
have been removed from the platform (crewai-plus#2380, ENG-1453).
Tools are now organization-scoped (private) by default.

Co-authored-by: Diego Nogues <diego@crewai.com>
2026-05-14 09:52:10 -03:00
Iris Clawd
288568110f docs: add Platform Tools CLI guide for crewai tool create/publish/install
Add documentation for the CLI commands referenced in the Create Tool
modal on the platform (crewai tool create, crewai tool publish,
crewai tool install). These commands manage tools on the CrewAI
platform registry — distinct from PyPI publishing.

Changes:
- New guide: docs/en/guides/tools/platform-tools-cli.mdx
  Full lifecycle: create → implement → publish → install
  Covers visibility flags (--public/--private/--force)
  Includes platform vs PyPI comparison
- Updated create-custom-tools.mdx tip to cross-reference both guides
- Added new page to docs.json navigation (all versions)

Resolves EPD-76

Co-authored-by: Diego Nogues <diego@crewai.com>
2026-05-14 09:52:10 -03:00
64 changed files with 1420 additions and 25304 deletions

2
.gitignore vendored
View File

@@ -31,5 +31,3 @@ chromadb-*.lock
blogs/*
secrets/*
UNKNOWN.egg-info/
demos/
.crewai/

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,131 @@
---
title: Platform Tools CLI
description: Create, publish, and install custom tools on the CrewAI platform using the CLI.
icon: terminal
mode: "wide"
---
## Overview
The CrewAI CLI provides commands to manage custom tools on the **CrewAI platform** — a hosted tool registry that lets you share tools within your organization without publishing to PyPI.
| Command | Purpose |
|---------|---------|
| `crewai tool create <handle>` | Scaffold a new tool project |
| `crewai tool publish` | Publish the tool to the CrewAI platform |
| `crewai tool install <handle>` | Install a platform tool into your crew project |
<Note type="info" title="Platform vs PyPI">
These commands manage tools on the **CrewAI platform registry**. If you want to publish a standalone Python package to PyPI instead, see the [Publish Custom Tools to PyPI](/en/guides/tools/publish-custom-tools) guide.
</Note>
## Prerequisites
- **CrewAI CLI** installed (`pip install crewai`)
- **Authenticated** with the platform — run `crewai login` first
---
## Step 1: Create a Tool Project
Scaffold a new tool project:
```bash
crewai tool create my_custom_tool
```
This generates a project structure with the boilerplate you need to start building your tool.
<Tip>
The `handle` is the unique identifier for your tool on the platform. Choose something descriptive and specific to what the tool does.
</Tip>
### Implement Your Tool
Edit the generated tool file to add your logic. The tool follows the standard CrewAI tools contract — you can subclass `BaseTool` or use the `@tool` decorator:
```python
from crewai.tools import BaseTool
class MyCustomTool(BaseTool):
name: str = "My Custom Tool"
description: str = "Description of what this tool does — be specific so agents know when to use it."
def _run(self, argument: str) -> str:
# Your tool logic here
return "result"
```
For the full tools API reference (input schemas, caching, async support, error handling), see the [Create Custom Tools](/en/learn/create-custom-tools) guide.
---
## Step 2: Publish to the Platform
From your tool project directory, publish it to the CrewAI platform:
```bash
crewai tool publish
```
### Options
| Flag | Description |
|------|-------------|
| `--force` | Bypass Git remote validations |
Tools are published privately to your organization by default.
---
## Step 3: Install a Platform Tool
To install a tool that's been published to the platform:
```bash
crewai tool install my_custom_tool
```
Once installed, you can use the tool in your crew like any other tool — assign it to an agent via the `tools` parameter.
---
## Full Lifecycle Example
```bash
# 1. Authenticate with the platform
crewai login
# 2. Scaffold a new tool
crewai tool create weather_lookup
# 3. Implement your logic in the generated project
cd weather_lookup
# ... edit the tool file ...
# 4. Publish to the platform
crewai tool publish
# 5. In another project, install and use it
crewai tool install weather_lookup
```
---
## Platform Tools vs PyPI Packages
| | Platform Tools | PyPI Packages |
|---|---|---|
| **Publish** | `crewai tool publish` | `uv build` + `uv publish` |
| **Registry** | CrewAI platform | PyPI |
| **Install** | `crewai tool install <handle>` | `pip install <package>` |
| **Auth** | `crewai login` | PyPI account + token |
| **Visibility** | Organization-scoped (private) | Always public |
| **Guide** | This page | [Publish Custom Tools](/en/guides/tools/publish-custom-tools) |
---
## Related
- [Create Custom Tools](/en/learn/create-custom-tools) — Python API reference for building tools (BaseTool, @tool decorator)
- [Publish Custom Tools to PyPI](/en/guides/tools/publish-custom-tools) — package and distribute tools as standalone Python libraries

View File

@@ -12,7 +12,9 @@ incorporating the latest functionalities such as tool delegation, error handling
enabling agents to perform a wide range of actions.
<Tip>
**Want to publish your tool for the community?** If you're building a tool that others could benefit from, check out the [Publish Custom Tools](/en/guides/tools/publish-custom-tools) guide to learn how to package and distribute your tool on PyPI.
**Want to publish your tool to the CrewAI platform?** Use the CLI to scaffold, publish, and share tools directly on the platform — see the [Platform Tools CLI](/en/guides/tools/platform-tools-cli) guide.
**Prefer publishing to PyPI?** Check out the [Publish Custom Tools](/en/guides/tools/publish-custom-tools) guide to package and distribute your tool as a standalone Python library.
</Tip>
### Subclassing `BaseTool`

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -55,7 +55,6 @@ ENV_VARS: dict[str, list[dict[str, Any]]] = {
"ollama": [
{
"default": True,
"key_name": "OLLAMA_HOST",
"API_BASE": "http://localhost:11434",
}
],

View File

@@ -1,874 +0,0 @@
"""Create agent definitions via interactive prompts."""
from __future__ import annotations
import json
from pathlib import Path
import re
import subprocess
import sys
import click
from crewai_cli.constants import ENV_VARS
from crewai_cli.utils import load_env_vars, write_env_file
AGENT_TEMPLATE = """\
{{
// Agent identity — defines the agent's persona and expertise
// These three fields shape how the agent thinks and communicates
"name": "{name}",
// What this agent does (any role you want)
"role": "{role}",
// The agent's primary objective
"goal": "{goal}",
// Background context that shapes personality and approach
"backstory": "{backstory}",
// Which LLM powers this agent
// Format: "provider/model" — e.g., "openai/gpt-4o", "anthropic/claude-sonnet-4-20250514"
"llm": "{llm}",
// Separate LLM for tool/function calls (optional, defaults to main LLM)
// Useful for using a cheaper model for tool routing
// "function_calling_llm": "openai/gpt-4o-mini",
// Tools this agent can use — referenced by name from the crewai-tools package
// See: https://docs.crewai.com/tools for available tools
// Use "custom:tool_name" for custom tools defined in your tools/ directory
"tools": [],
// MCP servers — external tool servers following the Model Context Protocol
// Can be URLs ("https://mcp.example.com") or platform slugs ("notion")
"mcps": [],
// Platform app integrations — managed by CrewAI Platform
// App name ("gmail") or app/action ("gmail/send_email")
"apps": [],
// Coworkers — other agents this agent can delegate work to
// {{"ref": "name"}} for local agents in agents/ directory
// {{"amp": "handle"}} for agents from the CrewAI AMP repository (your org)
// {{"amp": "handle", "llm": "..."}} for AMP agents with LLM override
// {{"a2a": "url"}} for remote agents via A2A protocol
"coworkers": [],
// Knowledge sources — files/directories the agent can search for context
// Supports: PDF, CSV, JSON, TXT, Excel, and directories
"knowledge_sources": [],
// Output guardrail — validates agent responses before sending to user
// "type": "llm" uses an LLM to check the response against instructions
// Remove this block to disable guardrails
// "guardrail": {{
// "type": "llm",
// "instructions": "Never reveal internal pricing information.",
// "llm": "openai/gpt-4o-mini"
// }},
// Settings — all have sensible defaults, only override what you need
"settings": {{
// Agent remembers across conversations
"memory": true,
// Enable extended thinking / chain-of-thought
"reasoning": true,
// Dreaming: consolidate memories over time into canonical insights
"self_improving": true,
// Agent plans before complex tasks
"planning": true,
// Agent decides at runtime whether to plan (default: true)
// "auto_plan": true,
// Allow agent to spawn parallel copies for subtasks (default: true)
// "can_spawn_copies": true,
// How deep spawned copies can nest (default: 1)
// "max_spawn_depth": 1,
// Max parallel copies running at once (default: 4)
// "max_concurrent_spawns": 4,
// Messages sent to LLM per turn, null = all (default: null)
// "max_history_messages": null,
// Detect claimed-but-not-done actions (default: false)
// "narration_guard": false,
// Hours between dreaming cycles (default: 24)
// "dreaming_interval_hours": 24,
// New memories before dreaming triggers (default: 10)
// "dreaming_trigger_threshold": 10,
// Separate LLM for dreaming (default: uses agent's LLM)
// "dreaming_llm": "openai/gpt-4o-mini",
// Provenance detail level: "minimal", "standard", or "detailed"
// "provenance_detail": "standard"
}}
}}
"""
PROJECT_CONFIG_TEMPLATE = """\
{
// Project configuration for crewai agents
// Test / benchmark settings — used by `crewai test`
"test": {
// How many times to repeat each test run. Higher = more confidence.
// Override with: crewai test -n 5
"iterations": 3,
// Minimum score (0.01.0) for a test case to pass.
// Override per test file with: {"threshold": 0.9, "cases": [...]}
// Override with: crewai test --threshold 0.8
"threshold": 0.7,
// LLM used to judge test responses (provider/model format).
// Override with: crewai test --judge-model openai/gpt-4o
"judge_model": "openai/gpt-4o-mini",
// Max seconds to wait for an agent response per test case.
// Increase for agents that use tools or multi-step reasoning.
"case_timeout": 90
},
// Rooms define how agents collaborate in the TUI (`crewai run`)
"rooms": {
"common": {
// Which agents participate in this room (agent names from agents/ dir)
"agents": [],
// Engagement mode:
// "organic" — all agents see messages, respond if relevant (default)
// "tagged" — @mention to direct messages
// "dm" — chat with one agent at a time
"engagement": "organic"
}
}
}
"""
_STARTER_CASES = """\
[
{
"input": "Hello, what can you help me with?",
"criteria": "The agent should clearly describe its role and capabilities."
}
]
"""
_PROVIDER_TO_EXTRA: dict[str, str] = {
# Native providers with dedicated SDK extras
"anthropic": "anthropic",
"gemini": "google-genai",
"google": "google-genai",
"azure": "azure-ai-inference",
"azure_openai": "azure-ai-inference",
"bedrock": "bedrock",
"aws": "aws",
# Providers that route through litellm
"watsonx": "litellm",
"groq": "litellm",
"nvidia_nim": "litellm",
"huggingface": "litellm",
"sambanova": "litellm",
# OpenAI-compatible providers — no extra needed:
# openai, ollama, cerebras, deepseek, openrouter, hosted_vllm, dashscope
}
_PROVIDER_BONUS_EXTRAS: dict[str, list[str]] = {
"watsonx": ["watson"],
}
_GITIGNORE_TEMPLATE = """\
.env
__pycache__/
.DS_Store
.crewai/
tests/artifacts/
"""
def _build_pyproject(project_name: str, crewai_version: str, llm_provider: str) -> str:
"""Build pyproject.toml content with the right LLM provider extra."""
extras = ["tools"]
provider_extra = _PROVIDER_TO_EXTRA.get(llm_provider, "")
if provider_extra and provider_extra not in extras:
extras.append(provider_extra)
for bonus in _PROVIDER_BONUS_EXTRAS.get(llm_provider, []):
if bonus not in extras:
extras.append(bonus)
extras_str = ",".join(extras)
lines = [
"[project]",
f'name = "{project_name}"',
'version = "0.1.0"',
'description = "CrewAI agent project"',
'requires-python = ">=3.10,<3.14"',
"dependencies = [",
f' "crewai[{extras_str}]>={crewai_version}",',
f' "crewai-cli>={crewai_version}",',
"]",
"",
"[tool.uv]",
'prerelease = "allow"',
"constraint-dependencies = [",
' "onnxruntime<=1.25.1",',
"]",
"",
"[tool.crewai]",
'type = "agent"',
"",
]
return "\n".join(lines)
def _bootstrap_project(base: Path, llm_model: str = "") -> None:
"""Create project structure if it doesn't exist yet."""
agents_dir = base / "agents"
agents_dir.mkdir(parents=True, exist_ok=True)
tools_dir = base / "tools"
tools_dir.mkdir(parents=True, exist_ok=True)
tests_dir = base / "tests"
tests_dir.mkdir(parents=True, exist_ok=True)
config_path = base / "config.json"
if not config_path.exists():
config_path.write_text(PROJECT_CONFIG_TEMPLATE, encoding="utf-8")
provider = llm_model.split("/")[0].lower() if "/" in llm_model else ""
pyproject_path = base / "pyproject.toml"
if not pyproject_path.exists():
crewai_version = _get_crewai_version()
pyproject_path.write_text(
_build_pyproject(base.name, crewai_version, provider),
encoding="utf-8",
)
else:
_maybe_add_provider_extra(pyproject_path, provider)
gitignore_path = base / ".gitignore"
if not gitignore_path.exists():
gitignore_path.write_text(_GITIGNORE_TEMPLATE, encoding="utf-8")
def _maybe_add_provider_extra(pyproject_path: Path, provider: str) -> None:
"""If the pyproject.toml exists but doesn't include the provider extra, add it."""
all_extras = []
primary = _PROVIDER_TO_EXTRA.get(provider, "")
if primary:
all_extras.append(primary)
all_extras.extend(_PROVIDER_BONUS_EXTRAS.get(provider, []))
if not all_extras:
return
try:
content = pyproject_path.read_text(encoding="utf-8")
missing = [
e
for e in all_extras
if f"[{e}]" not in content
and f",{e}]" not in content
and f",{e}," not in content
]
if not missing:
return
import re as _re
suffix = "," + ",".join(missing)
def _add_extras(m: _re.Match[str]) -> str:
bracket: str = m.group(0)
return bracket[:-1] + suffix + "]"
updated = _re.sub(r"crewai\[[^\]]+\]", _add_extras, content, count=1)
if updated != content:
pyproject_path.write_text(updated, encoding="utf-8")
except Exception:
pass
def _get_crewai_version() -> str:
"""Get the installed crewai version for the dependency pin."""
try:
from crewai_cli.version import get_crewai_version
return get_crewai_version()
except Exception:
return "1.14.5"
def _run_uv_sync(base: Path) -> None:
"""Run uv sync to install dependencies."""
click.echo()
click.secho("Installing dependencies...", fg="cyan")
try:
result = subprocess.run(
["uv", "sync"],
cwd=str(base),
capture_output=True,
text=True,
timeout=300,
)
if result.returncode == 0:
click.secho("Dependencies installed successfully.", fg="green")
else:
click.secho("Failed to install dependencies:", fg="red")
if result.stderr:
click.echo(result.stderr)
click.echo("Try running: uv sync")
except FileNotFoundError:
click.secho(
"uv not found. Install it (https://docs.astral.sh/uv/) then run: uv sync",
fg="yellow",
)
except subprocess.TimeoutExpired:
click.secho("uv sync timed out. Run manually: uv sync", fg="yellow")
except Exception as e:
click.secho(f"Could not run uv sync: {e}", fg="yellow")
click.echo("Run manually: uv sync")
def _create_benchmark_cases(base: Path, agent_name: str) -> None:
"""Create a starter benchmark cases file for the agent."""
cases_path = base / "tests" / f"{agent_name}_cases.json"
if cases_path.exists():
return
cases_path.parent.mkdir(parents=True, exist_ok=True)
cases_path.write_text(_STARTER_CASES, encoding="utf-8")
_PROVIDERS: list[tuple[str, str]] = [
("openai", "OpenAI"),
("anthropic", "Anthropic"),
("gemini", "Google Gemini"),
("groq", "Groq (fast inference)"),
("ollama", "Ollama (local)"),
]
_PROVIDER_MODELS: dict[str, list[tuple[str, str]]] = {
"openai": [
("gpt-5.5", "GPT-5.5"),
("gpt-5.5-pro", "GPT-5.5 Pro"),
("o4-mini", "o4-mini (reasoning, fast)"),
("o3", "o3 (reasoning)"),
("gpt-4.1-mini", "GPT-4.1 Mini (budget)"),
],
"anthropic": [
("claude-opus-4-6", "Claude Opus 4.6"),
("claude-sonnet-4-6", "Claude Sonnet 4.6"),
("claude-haiku-4-5-20251001", "Claude Haiku 4.5 (fast)"),
("claude-3-7-sonnet-20250219", "Claude 3.7 Sonnet"),
("claude-3-5-sonnet-20241022", "Claude 3.5 Sonnet"),
],
"gemini": [
("gemini-3-pro-preview", "Gemini 3 Pro (preview)"),
("gemini-2.5-pro-exp-03-25", "Gemini 2.5 Pro"),
("gemini-2.5-flash-preview-04-17", "Gemini 2.5 Flash"),
("gemini-2.0-flash-001", "Gemini 2.0 Flash"),
("gemini-1.5-pro", "Gemini 1.5 Pro"),
],
"groq": [
("llama-3.3-70b-versatile", "Llama 3.3 70B"),
("llama-3.1-70b-versatile", "Llama 3.1 70B"),
("llama-3.1-8b-instant", "Llama 3.1 8B (fast)"),
("deepseek-r1-distill-llama-70b", "DeepSeek R1 70B"),
("mixtral-8x7b-32768", "Mixtral 8x7B"),
],
"ollama": [
("llama3.3", "Llama 3.3"),
("llama3.1", "Llama 3.1"),
("deepseek-r1", "DeepSeek R1"),
("qwen2.5", "Qwen 2.5"),
("mistral", "Mistral"),
],
}
_POPULAR_TOOLS: list[tuple[str, str]] = [
("SerperDevTool", "Web search via Serper API"),
("ScrapeWebsiteTool", "Scrape and extract content from URLs"),
("FileReadTool", "Read local files"),
("FileWriterTool", "Write content to local files"),
("DirectoryReadTool", "List directory contents"),
("CodeInterpreterTool", "Execute Python code in a sandbox"),
("CSVSearchTool", "Search within CSV files"),
("PDFSearchTool", "Search within PDF documents"),
("JSONSearchTool", "Search within JSON files"),
("GithubSearchTool", "Search GitHub repositories"),
("YoutubeVideoSearchTool", "Search YouTube video transcripts"),
("TavilySearchTool", "Web search via Tavily API"),
("BraveSearchTool", "Web search via Brave API"),
("RagTool", "RAG over custom knowledge sources"),
("DallETool", "Generate images with DALL-E"),
("VisionTool", "Analyze images with vision models"),
]
_AGENT_NAME_RE = re.compile(r"^[a-z][a-z0-9_-]*$")
# ── Arrow-key selection helpers ──────────────────────────────────
_CYAN = "\033[36m"
_BOLD = "\033[1m"
_GREEN = "\033[32m"
_DIM = "\033[2m"
_RESET = "\033[0m"
def _is_interactive() -> bool:
"""Check if stdin/stdout are real terminals (not piped or in tests)."""
try:
return sys.stdin.isatty() and sys.stdout.isatty()
except Exception:
return False
def _read_key() -> str:
"""Read a single keypress. Returns 'up', 'down', 'enter', 'space', or the char."""
if sys.platform == "win32":
import msvcrt
ch = msvcrt.getwch()
if ch in ("\x00", "\xe0"):
ch2 = msvcrt.getwch()
return {"H": "up", "P": "down"}.get(ch2, "")
if ch == "\r":
return "enter"
if ch == " ":
return "space"
if ch == "\x03":
raise KeyboardInterrupt
return ch
import termios
import tty
fd = sys.stdin.fileno()
old = termios.tcgetattr(fd)
try:
tty.setcbreak(fd)
ch = sys.stdin.read(1)
if ch == "\x1b":
seq = sys.stdin.read(2)
if seq == "[A":
return "up"
if seq == "[B":
return "down"
return "esc"
if ch in ("\r", "\n"):
return "enter"
if ch == " ":
return "space"
if ch == "\x03":
raise KeyboardInterrupt
return ch
finally:
termios.tcsetattr(fd, termios.TCSADRAIN, old)
def _draw_single(labels: list[str], cursor: int, *, clear: bool = False) -> None:
"""Draw single-select menu. If clear=True, move cursor up first."""
total = len(labels)
if clear:
sys.stdout.write(f"\033[{total}A")
for i, label in enumerate(labels):
if i == cursor:
sys.stdout.write(f"\033[2K {_CYAN}{_RESET} {_BOLD}{label}{_RESET}\n")
else:
sys.stdout.write(f"\033[2K {label}\n")
sys.stdout.flush()
def _draw_multi(
labels: list[str], cursor: int, selected: set[int], *, clear: bool = False
) -> None:
"""Draw multi-select menu with checkboxes."""
hint = f" {_DIM}↑↓ navigate, space toggle, enter confirm{_RESET}"
total = len(labels) + 1 # +1 for hint line
if clear:
sys.stdout.write(f"\033[{total}A")
sys.stdout.write(f"\033[2K{hint}\n")
for i, label in enumerate(labels):
check = f"{_CYAN}[×]{_RESET}" if i in selected else "[ ]"
arrow = f"{_CYAN}{_RESET} " if i == cursor else " "
bold = f"{_BOLD}{label}{_RESET}" if i == cursor else label
sys.stdout.write(f"\033[2K {arrow}{check} {bold}\n")
sys.stdout.flush()
def _clear_lines(n: int) -> None:
"""Clear n lines above and position cursor at the top."""
sys.stdout.write(f"\033[{n}A")
for _ in range(n):
sys.stdout.write("\033[2K\n")
sys.stdout.write(f"\033[{n}A")
sys.stdout.flush()
def create_agent(name: str | None = None) -> None:
"""Create an agent definition interactively.
Both paths (with and without a name) ask the same structured
questions and produce the same annotated JSONC output.
"""
click.secho("\nCrewAI Agent Creator\n", fg="cyan", bold=True)
if name is None:
name = _prompt_agent_name()
base = Path.cwd()
# Directories are bootstrapped now, pyproject written after model selection
for d in ("agents", "tools", "tests"):
(base / d).mkdir(parents=True, exist_ok=True)
dest = base / "agents" / f"{name}.jsonc"
if dest.exists():
if not click.confirm(f"File {dest} already exists. Overwrite?"):
click.secho("Operation cancelled.", fg="yellow")
return
click.secho(f"Configuring agent: {name}\n", fg="cyan")
role = click.prompt(" Role (what this agent does)", type=str)
goal = click.prompt(" Goal (the agent's objective)", type=str)
backstory = click.prompt(
" Backstory (context that shapes personality, optional)",
type=str,
default="",
show_default=False,
)
llm = _select_model()
tools = _select_tools()
content = AGENT_TEMPLATE.format(
name=name,
role=role,
goal=goal,
backstory=backstory,
llm=llm,
)
if tools:
tools_json = json.dumps(tools)
content = content.replace('"tools": []', f'"tools": {tools_json}')
dest.write_text(content, encoding="utf-8")
_bootstrap_project(base, llm)
_add_agent_to_config(base, name)
_create_benchmark_cases(base, name)
_setup_env(base, llm)
_run_uv_sync(base)
click.echo()
click.secho(f"Agent created: {dest}", fg="green", bold=True)
click.echo("Run: crewai run")
def _select_model() -> str:
"""Two-step selection: provider first, then model."""
# Step 1: Pick provider
provider_labels = [label for _, label in _PROVIDERS]
provider_labels.append("Other (enter manually)")
click.echo()
click.secho(" LLM Provider:", fg="cyan")
p_idx = _arrow_or_fallback(provider_labels)
if p_idx == len(_PROVIDERS):
custom: str = click.prompt(" Enter model (provider/model)", type=str)
return custom.strip()
provider_key, provider_name = _PROVIDERS[p_idx]
click.secho(f"{provider_name}", fg="green")
# Step 2: Pick model from that provider
models = _PROVIDER_MODELS.get(provider_key, [])
model_labels = [f"{label} ({model_id})" for model_id, label in models]
model_labels.append("Other (enter model name)")
click.echo()
click.secho(f" {provider_name} Model:", fg="cyan")
m_idx = _arrow_or_fallback(model_labels)
if m_idx == len(models):
custom = click.prompt(f" Enter model name for {provider_key}/", type=str)
result = f"{provider_key}/{custom.strip()}"
else:
model_id = models[m_idx][0]
result = f"{provider_key}/{model_id}"
click.secho(f"{result}", fg="green")
return result
def _arrow_or_fallback(labels: list[str]) -> int:
"""Arrow-key select if interactive, numbered fallback otherwise."""
if _is_interactive():
try:
return _arrow_select_one(labels)
except Exception:
pass
return _numbered_select(labels)
def _arrow_select_one(labels: list[str]) -> int:
"""Arrow-key single-select. Returns selected index."""
cursor = 0
total = len(labels)
_draw_single(labels, cursor)
while True:
key = _read_key()
if key == "up" and cursor > 0:
cursor -= 1
_draw_single(labels, cursor, clear=True)
elif key == "down" and cursor < total - 1:
cursor += 1
_draw_single(labels, cursor, clear=True)
elif key == "enter":
_clear_lines(total)
return cursor
def _numbered_select(labels: list[str]) -> int:
"""Numbered fallback for non-TTY environments."""
for idx, label in enumerate(labels, 1):
click.echo(f" {idx}. {label}")
click.echo()
while True:
choice = click.prompt(" Select", type=str, default="1")
try:
num = int(choice)
if 1 <= num <= len(labels):
return num - 1
except ValueError:
pass
click.secho(f" Invalid choice. Enter 1-{len(labels)}.", fg="red")
def _select_tools() -> list[str]:
"""Let the user pick tools from popular options and/or add custom ones."""
labels = [f"{cls_name:<28s} {desc}" for cls_name, desc in _POPULAR_TOOLS]
labels.append("Add custom tool class names")
click.echo()
click.secho(" Tools (space to select, enter to confirm):", fg="cyan")
if _is_interactive():
try:
indices = _select_tools_interactive(labels)
except Exception:
indices = _select_tools_fallback(labels)
else:
indices = _select_tools_fallback(labels)
selected: list[str] = []
has_custom = False
for idx in indices:
if idx == len(_POPULAR_TOOLS):
has_custom = True
elif 0 <= idx < len(_POPULAR_TOOLS):
cls_name = _POPULAR_TOOLS[idx][0]
if cls_name not in selected:
selected.append(cls_name)
if has_custom:
custom = click.prompt(
" Custom tool class names (comma-separated)",
type=str,
default="",
show_default=False,
)
for name in custom.split(","):
name = name.strip()
if name and name not in selected:
selected.append(name)
if selected:
click.secho(f"{', '.join(selected)}", fg="green")
return selected
def _select_tools_interactive(labels: list[str]) -> list[int]:
"""Arrow-key multi-select for tools."""
cursor = 0
chosen: set[int] = set()
total_lines = len(labels) + 1 # +1 for hint line
_draw_multi(labels, cursor, chosen)
while True:
key = _read_key()
if key == "up" and cursor > 0:
cursor -= 1
_draw_multi(labels, cursor, chosen, clear=True)
elif key == "down" and cursor < len(labels) - 1:
cursor += 1
_draw_multi(labels, cursor, chosen, clear=True)
elif key == "space":
if cursor in chosen:
chosen.discard(cursor)
else:
chosen.add(cursor)
_draw_multi(labels, cursor, chosen, clear=True)
elif key == "enter":
_clear_lines(total_lines)
return sorted(chosen)
def _select_tools_fallback(labels: list[str]) -> list[int]:
"""Numbered fallback for non-TTY environments."""
for idx, label in enumerate(labels, 1):
click.echo(f" {idx:2d}. {label}")
click.echo()
raw = click.prompt(
" Select tools (e.g. 1 3 5)",
type=str,
default="",
show_default=False,
)
if not raw.strip():
return []
indices: list[int] = []
for token in raw.split():
try:
num = int(token)
if 1 <= num <= len(labels):
indices.append(num - 1)
except ValueError:
pass
return indices
def _setup_env(base: Path, llm_model: str) -> None:
"""Prompt for API keys based on the selected LLM provider and write .env."""
env_vars = load_env_vars(base)
provider = llm_model.split("/")[0].lower() if "/" in llm_model else ""
if not provider:
return
env_vars["MODEL"] = llm_model
already_set = all(
details.get("key_name", "") in env_vars
for details in ENV_VARS.get(provider, [])
if "key_name" in details
)
if already_set:
return
if provider in ENV_VARS:
click.echo()
for details in ENV_VARS[provider]:
key_name = details.get("key_name")
if not key_name or key_name in env_vars:
continue
if details.get("default"):
env_vars[key_name] = details.get("API_BASE", "")
continue
value = click.prompt(
f" {details.get('prompt', f'Enter {key_name}')}",
default="",
show_default=False,
)
if value.strip():
env_vars[key_name] = value.strip()
if env_vars:
write_env_file(base, env_vars)
click.secho("API keys saved to .env", fg="green")
else:
click.secho(
"No API keys provided. Create a .env file manually before running.",
fg="yellow",
)
def _prompt_agent_name() -> str:
"""Prompt for a valid agent identifier."""
while True:
raw: str = click.prompt(
" Agent identifier (lowercase, hyphens/underscores, no spaces)",
type=str,
)
name = raw.strip().lower()
if _AGENT_NAME_RE.match(name):
return name
click.secho(
" Invalid name — use lowercase letters, numbers, hyphens, or underscores.",
fg="red",
)
_JSONC_TOKEN_RE = re.compile(
r'"(?:[^"\\]|\\.)*"' # double-quoted string
r"|'(?:[^'\\]|\\.)*'" # single-quoted string (not standard JSON, but safe)
r"|/\*.*?\*/" # /* block comment */
r"|//[^\n]*" # // line comment
r"|.", # any other character
re.DOTALL,
)
def _strip_jsonc(text: str) -> str:
"""Strip // and /* */ comments from JSONC text, then fix trailing commas.
Only strips comments that appear outside of quoted strings, so double
slashes inside string values (e.g. URLs) are preserved correctly.
"""
def _replacer(match: re.Match[str]) -> str:
token = match.group(0)
if token.startswith(("//", "/*")):
return ""
return token
result = _JSONC_TOKEN_RE.sub(_replacer, text)
result = re.sub(r",\s*([}\]])", r"\1", result)
return result
def _add_agent_to_config(base: Path, agent_name: str) -> None:
"""Add the agent to the common room in config.json."""
config_path = base / "config.json"
if not config_path.exists():
return
try:
raw = config_path.read_text(encoding="utf-8")
clean = _strip_jsonc(raw)
config = json.loads(clean)
rooms = config.get("rooms", {})
common = rooms.get("common", {"agents": [], "engagement": "organic"})
agents = common.get("agents", [])
if agent_name not in agents:
agents.append(agent_name)
common["agents"] = agents
rooms["common"] = common
config["rooms"] = rooms
config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
except Exception as e:
click.echo(f"Warning: Could not update config.json: {e}", err=True)

View File

@@ -1,5 +1,4 @@
from enum import Enum
import os
import subprocess
import click
@@ -10,62 +9,17 @@ from crewai_cli.utils import build_env_with_all_tool_credentials, read_toml
from crewai_cli.version import get_crewai_version
_UV_CONTEXT_VAR = "_CREWAI_UV"
class CrewType(Enum):
STANDARD = "standard"
FLOW = "flow"
def _has_agents_dir() -> bool:
"""Check if current directory has an agents/ directory with definitions."""
from pathlib import Path
agents_dir = Path.cwd() / "agents"
if not agents_dir.is_dir():
return False
files = list(agents_dir.glob("*.json")) + list(agents_dir.glob("*.jsonc"))
return len(files) > 0
def _needs_uv_relaunch() -> bool:
"""True when we should re-exec through ``uv run`` for the project venv."""
if os.environ.get(_UV_CONTEXT_VAR):
return False
from pathlib import Path
pyproject = Path.cwd() / "pyproject.toml"
if not pyproject.exists():
return False
try:
return 'type = "agent"' in pyproject.read_text(encoding="utf-8")
except Exception:
return False
def _relaunch_via_uv(args: list[str]) -> None:
"""Re-exec ``uv run crewai <args>`` inside the project venv, then exit."""
env = {**os.environ, _UV_CONTEXT_VAR: "1"}
cmd = ["uv", "run", "crewai", *args]
try:
result = subprocess.run(cmd, env=env)
raise SystemExit(result.returncode)
except FileNotFoundError:
click.secho(
"uv not found — running without project venv. "
"Install uv (https://docs.astral.sh/uv/) for full provider support.",
fg="yellow",
)
def run_crew(trained_agents_file: str | None = None) -> None:
"""Run the crew, flow, or agent TUI.
"""Run the crew or flow by running a command in the UV environment.
Detects the project type:
- If agents/ directory exists with definitions: launch agent TUI
- If pyproject.toml type is "flow": run the flow
- Otherwise: run the crew
Starting from version 0.103.0, this command can be used to run both
standard crews and flows. For flows, it detects the type from pyproject.toml
and automatically runs the appropriate command.
Args:
trained_agents_file: Optional path to a trained-agents pickle produced
@@ -73,19 +27,6 @@ def run_crew(trained_agents_file: str | None = None) -> None:
``CREWAI_TRAINED_AGENTS_FILE`` so agents load suggestions from this
file instead of the default ``trained_agents_data.pkl``.
"""
# Check for agents/ directory first — agent projects don't need pyproject.toml
if _has_agents_dir():
if _needs_uv_relaunch():
uv_args = ["run"]
if trained_agents_file:
uv_args.extend(["-f", trained_agents_file])
_relaunch_via_uv(uv_args)
click.echo("Launching agent TUI...")
from crewai_cli.agent_tui import run_agent_tui
run_agent_tui()
return
crewai_version = get_crewai_version()
min_required_version = "0.71.0"
pyproject_data = read_toml()
@@ -128,7 +69,7 @@ def execute_command(
env[CREWAI_TRAINED_AGENTS_FILE_ENV] = trained_agents_file
try:
subprocess.run(command, capture_output=False, text=True, check=True, env=env)
subprocess.run(command, capture_output=False, text=True, check=True, env=env) # noqa: S603
except subprocess.CalledProcessError as e:
handle_error(e, crew_type)

View File

@@ -125,12 +125,6 @@ def load_env_vars(folder_path: Path) -> dict[str, Any]:
for line in file:
key, _, value = line.strip().partition("=")
if key and value:
if (
len(value) >= 2
and value[0] == value[-1]
and value[0] in ('"', "'")
):
value = value[1:-1]
env_vars[key] = value
return env_vars
@@ -140,4 +134,4 @@ def write_env_file(folder_path: Path, env_vars: dict[str, Any]) -> None:
env_file_path = folder_path / ".env"
with open(env_file_path, "w") as file:
for key, value in env_vars.items():
file.write(f"{key}={value}\n")
file.write(f"{key.upper()}={value}\n")

View File

@@ -184,9 +184,6 @@ except (ImportError, PydanticUserError):
)
RuntimeState = None # type: ignore[assignment,misc]
from crewai.new_agent import NewAgent # noqa: E402
__all__ = [
"LLM",
"Agent",
@@ -199,7 +196,6 @@ __all__ = [
"Knowledge",
"LLMGuardrail",
"Memory",
"NewAgent",
"PlanningConfig",
"Process",
"RuntimeState",

View File

@@ -166,25 +166,6 @@ class FlowInputReceivedEvent(FlowEvent):
type: Literal["flow_input_received"] = "flow_input_received"
class FlowMessageSentEvent(FlowEvent):
"""Event emitted when a flow sends a message to the user via ``Flow.say()``.
This event is emitted when a flow sends an informational message
that does not require a response from the user.
Attributes:
flow_name: Name of the flow sending the message.
method_name: Name of the flow method that called ``say()``.
message: The message sent to the user.
metadata: Optional metadata sent with the message.
"""
method_name: str
message: str
metadata: dict[str, Any] | None = None
type: Literal["flow_message_sent"] = "flow_message_sent"
class HumanFeedbackRequestedEvent(FlowEvent):
"""Event emitted when human feedback is requested.

View File

@@ -951,16 +951,7 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
stream: bool = Field(default=False)
memory: Memory | MemoryScope | MemorySlice | None = Field(default=None)
input_provider: InputProvider | None = Field(default=None)
conversational_provider: Any = Field(default=None)
suppress_flow_events: bool = Field(default=False)
pending_mode: bool = Field(
default=False,
description=(
"When True, ask() will serialize state and raise "
"HumanFeedbackPending instead of blocking for user input, "
"allowing the thread to be freed for server-side use cases."
),
)
human_feedback_history: list[HumanFeedbackResult] = Field(default_factory=list)
last_human_feedback: HumanFeedbackResult | None = Field(default=None)
@@ -1081,7 +1072,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
_event_futures: list[Future[None]] = PrivateAttr(default_factory=list)
_pending_feedback_context: PendingFeedbackContext | None = PrivateAttr(default=None)
_human_feedback_method_outputs: dict[str, Any] = PrivateAttr(default_factory=dict)
_pending_response: str | None = PrivateAttr(default=None)
_input_history: list[InputHistoryEntry] = PrivateAttr(default_factory=list)
_state: Any = PrivateAttr(default=None)
@@ -1443,44 +1433,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
return instance
@classmethod
def from_ask_pending(
cls,
user_input: str,
state: dict[str, Any] | None = None,
**kwargs: Any,
) -> Flow[Any]:
"""Create a Flow ready to resume from a pending ask().
When ``pending_mode=True`` causes ``ask()`` to raise
``HumanFeedbackPending``, use this classmethod to construct a
new flow that will return ``user_input`` on the next ``ask()``
call instead of blocking or raising again.
Args:
user_input: The answer to feed back into ``ask()``.
state: Optional state dict to restore (from ``HumanFeedbackPending.callback_info["state"]``).
**kwargs: Additional keyword arguments passed to the Flow constructor.
Returns:
A new Flow instance with ``_pending_response`` set.
Example:
```python
try:
result = flow.kickoff()
except HumanFeedbackPending as e:
state = e.callback_info.get("state")
flow2 = MyFlow.from_ask_pending("user answer", state=state)
result = flow2.kickoff()
```
"""
instance = cls(**kwargs)
if state is not None:
instance._initialize_state(state)
instance._pending_response = user_input
return instance
@property
def pending_feedback(self) -> PendingFeedbackContext | None:
"""Get the pending feedback context if this flow is waiting for feedback.
@@ -3250,15 +3202,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
except Exception:
logger.debug("Failed to checkpoint state before ask()", exc_info=True)
def _serialize_state(self) -> dict[str, Any]:
"""Serialize flow state for pending-mode persistence."""
state = self._state
if isinstance(state, dict):
return dict(state)
if hasattr(state, "model_dump"):
return cast(dict[str, Any], state.model_dump())
return {}
def ask(
self,
message: str,
@@ -3272,13 +3215,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
flow framework runs sync methods in a thread pool via
``asyncio.to_thread``, so the event loop stays free).
When ``pending_mode`` is enabled on the flow, instead of blocking
this method serializes the flow state and raises
``HumanFeedbackPending``, allowing the calling thread to be freed.
Use ``from_ask_pending()`` to continue execution later.
If a ``_pending_response`` is set (from ``from_ask_pending()``), it is
returned immediately without blocking or raising.
Timeout ensures flows always terminate. When timeout expires,
``None`` is returned, enabling the pattern::
@@ -3306,10 +3242,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
or provider error. Empty string ``""`` means the user pressed
Enter without typing (intentional empty input).
Raises:
HumanFeedbackPending: When ``pending_mode`` is True and no
``_pending_response`` is available.
Example:
```python
class MyFlow(Flow):
@@ -3338,23 +3270,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
from crewai.flow.input_provider import InputResponse
method_name = current_flow_method_name.get("unknown")
response: str | None = None
# GAP-34: If a pending response was set (from from_ask_pending()), return it
if self._pending_response is not None:
response = self._pending_response
self._pending_response = None
self._input_history.append(
{
"message": message,
"response": response,
"method_name": method_name,
"timestamp": datetime.now(),
"metadata": metadata,
"response_metadata": None,
}
)
return response
# Emit input requested event
crewai_event_bus.emit(
@@ -3371,41 +3286,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
# Auto-checkpoint state before waiting
self._checkpoint_state_for_ask()
# GAP-34: pending mode — serialize state and raise instead of blocking
if self.pending_mode:
from crewai.flow.async_feedback.types import (
HumanFeedbackPending,
PendingFeedbackContext,
)
state = self._serialize_state()
context = PendingFeedbackContext(
flow_id=self.flow_id,
flow_class=f"{self.__class__.__module__}.{self.__class__.__qualname__}",
method_name=method_name,
method_output=state,
message=message,
metadata=metadata or {},
)
raise HumanFeedbackPending(
context=context,
callback_info={"state": state},
)
# ── ConversationalProvider path ──────────────────────────────
# When a conversational_provider is set (e.g. from NewAgent),
# use it for transport instead of the InputProvider protocol.
conv_provider = self.conversational_provider
if conv_provider is not None:
return self._ask_via_conversational_provider(
conv_provider,
message,
method_name,
metadata,
timeout,
)
# ── InputProvider path (existing behavior) ───────────────────
provider = self._resolve_input_provider()
raw: str | InputResponse | None = None
@@ -3437,7 +3317,7 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
raw = None
# Normalize provider response: str, InputResponse, or None
response = None
response: str | None = None
response_metadata: dict[str, Any] | None = None
if isinstance(raw, InputResponse):
@@ -3476,202 +3356,6 @@ class Flow(BaseModel, Generic[T], metaclass=FlowMeta):
return response
def _ask_via_conversational_provider(
self,
conv_provider: Any,
message: str,
method_name: str,
metadata: dict[str, Any] | None,
timeout: float | None,
) -> str | None:
"""Route ask() through a ConversationalProvider.
Sends the question as an "agent" message, then waits for the user
reply via ``receive_message()``. Both calls are async on the
provider, so we run them in an event loop.
Args:
conv_provider: A ConversationalProvider instance.
message: The question to send.
method_name: Name of the calling flow method (for history).
metadata: Optional metadata from the caller.
timeout: Maximum seconds to wait for a reply (best-effort).
Returns:
The user's reply text, or None on timeout/error.
"""
from concurrent.futures import (
ThreadPoolExecutor,
TimeoutError as FuturesTimeoutError,
)
from datetime import datetime
from crewai.events.types.flow_events import (
FlowInputReceivedEvent,
)
from crewai.new_agent.models import Message as AgentMessage
async def _round_trip() -> str | None:
# Send the question
outgoing = AgentMessage(
role="agent",
content=message,
metadata=metadata,
)
await conv_provider.send_message(outgoing)
# Wait for the user reply
reply = await conv_provider.receive_message()
return reply.content if reply else None
response: str | None = None
try:
if timeout is not None:
executor = ThreadPoolExecutor(max_workers=1)
ctx = contextvars.copy_context()
future = executor.submit(ctx.run, asyncio.run, _round_trip())
try:
response = future.result(timeout=timeout)
except FuturesTimeoutError:
future.cancel()
response = None
finally:
executor.shutdown(wait=False, cancel_futures=True)
else:
# Run the async round-trip synchronously. Use an existing
# loop if available, otherwise create one.
try:
loop = asyncio.get_running_loop()
except RuntimeError:
loop = None
if loop and loop.is_running():
# We're inside an async context — schedule the coroutine
# on the running loop and block until it completes.
import concurrent.futures
coro_future: concurrent.futures.Future[Any] = (
asyncio.run_coroutine_threadsafe(_round_trip(), loop)
)
response = coro_future.result()
else:
response = asyncio.run(_round_trip())
except KeyboardInterrupt:
raise
except Exception:
logger.debug("ConversationalProvider error in ask()", exc_info=True)
response = None
# Record in history
self._input_history.append(
{
"message": message,
"response": response,
"method_name": method_name,
"timestamp": datetime.now(),
"metadata": metadata,
"response_metadata": None,
}
)
# Emit input received event
crewai_event_bus.emit(
self,
FlowInputReceivedEvent(
type="flow_input_received",
flow_name=self.name or self.__class__.__name__,
method_name=method_name,
message=message,
response=response,
metadata=metadata,
),
)
return response
def say(
self,
message: str,
metadata: dict[str, Any] | None = None,
) -> None:
"""Send a message to the user without waiting for a response.
This is a one-way communication channel for status updates,
progress reports, or informational messages during flow execution.
When a ``conversational_provider`` is set (e.g. from NewAgent),
the message is sent through it. Otherwise, the message is printed
to the console via Rich and emitted as a ``FlowMessageSentEvent``.
Args:
message: The message to send to the user.
metadata: Optional metadata to attach to the message
(e.g., category, severity, context).
Example:
```python
class MyFlow(Flow):
@start()
def process(self):
self.say("Starting data analysis...")
# ... do work ...
self.say("Analysis complete, generating report.")
return self.ask("Would you like the detailed or summary report?")
```
"""
from crewai.events.types.flow_events import FlowMessageSentEvent
from crewai.flow.flow_context import current_flow_method_name
method_name = current_flow_method_name.get("unknown")
# ── ConversationalProvider path ──────────────────────────────
conv_provider = self.conversational_provider
if conv_provider is not None:
from crewai.new_agent.models import Message as AgentMessage
outgoing = AgentMessage(
role="agent",
content=message,
metadata=metadata,
)
try:
try:
loop = asyncio.get_running_loop()
except RuntimeError:
loop = None
if loop and loop.is_running():
# We're inside an async context — schedule on the running loop.
import concurrent.futures as _cf
_send_future: _cf.Future[None] = asyncio.run_coroutine_threadsafe(
conv_provider.send_message(outgoing), loop
)
_send_future.result()
else:
asyncio.run(conv_provider.send_message(outgoing))
except Exception:
logger.debug("ConversationalProvider error in say()", exc_info=True)
else:
# ── Console fallback ─────────────────────────────────────
console = Console()
flow_name = self.name or self.__class__.__name__
console.print(
Panel(message, title=f"[bold]{flow_name}[/bold]", border_style="blue")
)
# Emit event regardless of provider
crewai_event_bus.emit(
self,
FlowMessageSentEvent(
type="flow_message_sent",
flow_name=self.name or self.__class__.__name__,
method_name=method_name,
message=message,
metadata=metadata,
),
)
def _request_human_feedback(
self,
message: str,

View File

@@ -125,7 +125,7 @@ def _contains_file_id_reference(messages: list[dict[str, Any]]) -> bool:
class AnthropicThinkingConfig(BaseModel):
type: Literal["enabled", "disabled", "adaptive"]
type: Literal["enabled", "disabled"]
budget_tokens: int | None = None
@@ -485,7 +485,7 @@ class AnthropicCompletion(BaseLLM):
if self.thinking:
if isinstance(self.thinking, AnthropicThinkingConfig):
params["thinking"] = self.thinking.model_dump(exclude_none=True)
params["thinking"] = self.thinking.model_dump()
else:
params["thinking"] = self.thinking

View File

@@ -313,7 +313,7 @@ class Memory(BaseModel):
source_type="unified_memory",
),
)
except Exception:
except Exception: # noqa: S110
pass # swallow everything during shutdown
def drain_writes(self) -> None:
@@ -608,18 +608,7 @@ class Memory(BaseModel):
# The encoding pipeline uses asyncio.run() -> to_thread() internally.
# If the process is shutting down, the default executor is closed and
# to_thread raises "cannot schedule new futures after shutdown".
# Emit MemorySaveFailedEvent to keep event bus scope stack balanced.
try:
crewai_event_bus.emit(
self,
MemorySaveFailedEvent(
value=f"{len(contents)} memories (abandoned)",
metadata=metadata,
error="executor shutdown during encoding",
),
)
except Exception:
pass
# Silently abandon the save -- the process is exiting anyway.
return []
try:
@@ -763,7 +752,7 @@ class Memory(BaseModel):
touch = getattr(self._storage, "touch_records", None)
if touch is not None:
touch([m.record.id for m in results])
except Exception:
except Exception: # noqa: S110
pass # Non-critical: don't fail recall because of touch
elapsed_ms = (time.perf_counter() - start) * 1000

View File

@@ -1,67 +0,0 @@
"""NewAgent — standalone, conversational, self-improving agent."""
from crewai.new_agent.cli_provider import CLIProvider
from crewai.new_agent.coworker_tools import MultiDelegateTool
from crewai.new_agent.dreaming import DreamingEngine
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
from crewai.new_agent.models import (
AgentSettings,
AgentStatus,
MemoryScope,
MemorySlice,
Message,
MessageAction,
PromptLayer,
PromptStack,
ProvenanceEntry,
TokenUsage,
)
from crewai.new_agent.new_agent import NewAgent, clear_amp_cache
from crewai.new_agent.planning import PlanningEngine
from crewai.new_agent.provider import (
ConversationStorage,
ConversationalProvider,
DirectProvider,
SQLiteConversationStorage,
)
from crewai.new_agent.scheduler import ScheduleTaskTool, ScheduledTask, TaskScheduler
from crewai.new_agent.skill_builder import SkillBuilder
from crewai.new_agent.spawn_tools import SpawnSubtaskArgs, SpawnSubtaskTool
__all__ = [
"AgentSettings",
"AgentStatus",
"CLIProvider",
"ConversationStorage",
"ConversationalProvider",
"DirectProvider",
"DreamingEngine",
"KnowledgeDiscovery",
"MemoryScope",
"MemorySlice",
"Message",
"MessageAction",
"MultiDelegateTool",
"NewAgent",
"PlanningEngine",
"PromptLayer",
"PromptStack",
"ProvenanceEntry",
"SQLiteConversationStorage",
"ScheduleTaskTool",
"ScheduledTask",
"SkillBuilder",
"SpawnSubtaskArgs",
"SpawnSubtaskTool",
"TaskScheduler",
"TokenUsage",
"clear_amp_cache",
]
try:
from crewai.new_agent.event_listener import register_new_agent_listeners
register_new_agent_listeners()
except Exception:
pass

View File

@@ -1,110 +0,0 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "CrewAI Agent Definition",
"description": "Declarative definition for a CrewAI NewAgent",
"type": "object",
"required": ["role", "goal"],
"properties": {
"name": { "type": "string", "description": "Agent identifier" },
"role": { "type": "string", "description": "What this agent does" },
"goal": { "type": "string", "description": "What the agent is trying to achieve" },
"backstory": { "type": "string", "description": "Context that shapes personality/approach", "default": "" },
"llm": { "type": "string", "description": "LLM model identifier (e.g., 'openai/gpt-4o')" },
"function_calling_llm": { "type": ["string", "null"], "description": "Separate LLM for tool calls (optional)" },
"tools": {
"type": "array",
"items": { "type": "string" },
"description": "Tool names from crewai-tools. Use 'custom:name' for project-local tools."
},
"mcps": {
"type": "array",
"items": {
"oneOf": [
{ "type": "string" },
{ "type": "object", "properties": { "url": { "type": "string" }, "name": { "type": "string" } } }
]
},
"description": "MCP server connections"
},
"apps": {
"type": "array",
"items": { "type": "string" },
"description": "Platform app integrations"
},
"coworkers": {
"type": "array",
"items": {
"oneOf": [
{ "type": "object", "properties": { "ref": { "type": "string" } }, "required": ["ref"] },
{ "type": "object", "properties": { "amp": { "type": "string" }, "llm": { "type": "string" } }, "required": ["amp"] },
{ "type": "object", "properties": { "a2a": { "type": "string" } }, "required": ["a2a"] }
]
},
"description": "Coworkers: local refs, AMP handles, or A2A URLs"
},
"knowledge_sources": {
"type": "array",
"items": {
"type": "object",
"properties": { "path": { "type": "string" } },
"required": ["path"]
}
},
"skills": {
"type": "array",
"items": { "type": "string" },
"description": "Paths to skill directories containing SKILL.md files"
},
"guardrail": {
"oneOf": [
{ "type": "string", "description": "Guardrail instructions as a simple string (shorthand for LLM guardrail)" },
{
"type": "object",
"properties": {
"type": { "type": "string", "enum": ["llm", "code"] },
"instructions": { "type": "string" },
"function": { "type": "string", "description": "Dotted path to a callable for code guardrails" },
"path": { "type": "string", "description": "Alias for function (dotted path to callable)" },
"llm": { "type": "string", "description": "LLM model for LLM guardrails" }
}
}
]
},
"response_model": { "type": "string", "description": "Dotted path to a Pydantic BaseModel class" },
"settings": {
"type": "object",
"properties": {
"memory": { "type": "boolean", "default": true },
"memory_read_only": { "type": "boolean", "default": false, "description": "Allow memory recall but prevent saving new memories" },
"reasoning": { "type": "boolean", "default": true },
"self_improving": { "type": "boolean", "default": true },
"planning": { "type": "boolean", "default": true },
"auto_plan": { "type": "boolean", "default": true },
"can_spawn_copies": { "type": "boolean", "default": true },
"max_spawn_depth": { "type": "integer", "default": 1, "minimum": 0 },
"max_concurrent_spawns": { "type": "integer", "default": 4, "minimum": 1 },
"max_history_messages": { "type": ["integer", "null"], "default": null },
"narration_guard": { "type": "boolean", "default": false },
"dreaming_interval_hours": { "type": "integer", "default": 24, "minimum": 1 },
"dreaming_trigger_threshold": { "type": "integer", "default": 10, "minimum": 1 },
"dreaming_llm": { "type": ["string", "null"], "default": null, "description": "LLM for dreaming (defaults to agent's LLM)" },
"provenance_detail": { "type": "string", "enum": ["minimal", "standard", "detailed"], "default": "standard" },
"spawn_timeout": { "type": "integer", "default": 600, "minimum": 1 },
"can_create_knowledge": { "type": "boolean", "default": true },
"can_build_skills": { "type": "boolean", "default": true, "description": "Enable auto-generation and suggestion of SKILL.md files" },
"can_schedule": { "type": "boolean", "default": false, "description": "Enable the agent to schedule future tasks via ScheduleTaskTool" },
"narration_max_retries": { "type": "integer", "default": 2, "minimum": 0 },
"respect_context_window": { "type": "boolean", "default": true },
"cache_tool_results": { "type": "boolean", "default": true },
"max_retry_limit": { "type": "integer", "default": 2, "minimum": 0 },
"share_data": { "type": "boolean", "default": false, "description": "If true, include sensitive data (message content, tool inputs/outputs) in telemetry spans" }
},
"additionalProperties": false
},
"max_iter": { "type": "integer", "default": 25, "minimum": 1 },
"max_tokens": { "type": ["integer", "null"] },
"max_execution_time": { "type": ["integer", "null"] },
"verbose": { "type": "boolean", "default": false }
},
"additionalProperties": false
}

View File

@@ -1,224 +0,0 @@
"""Terminal-based conversational provider for NewAgent."""
from __future__ import annotations
import asyncio
from pathlib import Path
import sys
import threading
from typing import TYPE_CHECKING, Any
from crewai.new_agent.models import AgentStatus, Message, ProvenanceEntry
if TYPE_CHECKING:
from crewai.new_agent.provider import SQLiteConversationStorage
# ── Spinner frames ───────────────────────────────────────────
_BRAILLE_FRAMES = "⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏"
# ── Formatting helpers ───────────────────────────────────────
def format_tokens(n: int) -> str:
"""Format a token count compactly.
Examples:
0 → "0"
999 → "999"
1000 → "1.0k"
1234 → "1.2k"
12345 → "12.3k"
1234567 → "1.2M"
"""
if n < 1000:
return str(n)
if n < 1_000_000:
value = n / 1000
return f"{value:.1f}k"
value = n / 1_000_000
return f"{value:.1f}M"
def format_elapsed(ms: int) -> str:
"""Format elapsed milliseconds as a human-readable duration.
Examples:
12000 → "12s"
72000 → "1m 12s"
3723000 → "1h 2m"
"""
total_seconds = ms // 1000
if total_seconds < 60:
return f"{total_seconds}s"
if total_seconds < 3600:
minutes = total_seconds // 60
seconds = total_seconds % 60
return f"{minutes}m {seconds}s"
hours = total_seconds // 3600
minutes = (total_seconds % 3600) // 60
return f"{hours}h {minutes}m"
def format_status_line(status: AgentStatus, spinner_frame: str = "") -> str:
"""Build the status line shown during agent work.
Format:
⠋ Searching the web… (12s · ↓ 3.4k tokens · ↑ 1.2k tokens)
"""
detail = status.detail or status.state
parts: list[str] = []
if status.elapsed_ms:
parts.append(format_elapsed(status.elapsed_ms))
if status.input_tokens:
parts.append(f"{format_tokens(status.input_tokens)} tokens")
if status.output_tokens:
parts.append(f"{format_tokens(status.output_tokens)} tokens")
suffix = f" ({' · '.join(parts)})" if parts else ""
return f"{spinner_frame} {detail}{suffix}"
# ── Spinner helper ───────────────────────────────────────────
class _Spinner:
"""Simple terminal spinner that overwrites the current line."""
def __init__(self) -> None:
self._running = False
self._thread: threading.Thread | None = None
self._status: AgentStatus | None = None
self._lock = threading.Lock()
def update(self, status: AgentStatus) -> None:
with self._lock:
self._status = status
def start(self) -> None:
if self._running:
return
self._running = True
self._thread = threading.Thread(target=self._spin, daemon=True)
self._thread.start()
def stop(self) -> None:
self._running = False
if self._thread is not None:
self._thread.join(timeout=1.0)
self._thread = None
# Clear the spinner line
sys.stderr.write("\r\033[K")
sys.stderr.flush()
def _spin(self) -> None:
frames = _BRAILLE_FRAMES
idx = 0
while self._running:
with self._lock:
status = self._status
if status is not None:
frame = frames[idx % len(frames)]
line = format_status_line(status, spinner_frame=frame)
sys.stderr.write(f"\r\033[K{line}")
sys.stderr.flush()
idx += 1
try:
# ~80ms per frame ≈ 12.5 fps
threading.Event().wait(timeout=0.08)
except Exception:
break
# ── History persistence ──────────────────────────────────────
def _storage_path(agent_name: str) -> Path:
"""Return the path to the agent's SQLite conversation database."""
return Path.cwd() / ".crewai" / "conversations" / f"{agent_name}.db"
def _get_storage(agent_name: str) -> SQLiteConversationStorage:
from crewai.new_agent.provider import SQLiteConversationStorage
return SQLiteConversationStorage(_storage_path(agent_name))
# ── CLIProvider ──────────────────────────────────────────────
class CLIProvider:
"""Terminal-based conversational provider for NewAgent.
Uses stdin/stdout for user interaction and displays live status
updates with an animated spinner on stderr. Conversation history
is persisted via SQLiteConversationStorage (WAL mode).
"""
def __init__(self, agent_name: str = "agent", storage: Any = None) -> None:
self.agent_name = agent_name
self._storage = storage or _get_storage(agent_name)
self._spinner = _Spinner()
# ── ConversationalProvider protocol ──────────────────────
async def send_message(self, message: Message) -> None:
"""Print the agent's message to stdout."""
# Stop spinner before printing output
self._spinner.stop()
prefix = ""
if message.role == "agent":
prefix = (
f"\n{message.sender or 'Agent'}: " if message.sender else "\nAgent: "
)
elif message.role == "system":
prefix = "\n[system] "
sys.stdout.write(f"{prefix}{message.content}\n")
sys.stdout.flush()
async def receive_message(self) -> Message:
"""Read user input from stdin."""
# Stop spinner while waiting for input
self._spinner.stop()
try:
loop = asyncio.get_running_loop()
text = await loop.run_in_executor(None, self._read_input)
except EOFError as err:
raise KeyboardInterrupt("End of input") from err
return Message(role="user", content=text)
async def send_status(self, status: AgentStatus) -> None:
"""Show a spinner with status details on stderr."""
self._spinner.update(status)
self._spinner.start()
def get_history(self) -> list[Message]:
return self._storage.load_messages()
def save_history(self, messages: list[Message]) -> None:
self._storage.save_messages(messages)
def reset_history(self) -> None:
self._storage.clear_messages()
def save_provenance(self, entries: list[ProvenanceEntry]) -> None:
self._storage.save_provenance(entries)
def load_provenance(self) -> list[ProvenanceEntry]:
return self._storage.load_provenance()
def get_scope(self) -> dict[str, str]:
return {}
# ── Internal helpers ─────────────────────────────────────
@staticmethod
def _read_input() -> str:
"""Blocking stdin read (called from executor)."""
return input("\nYou: ")

View File

@@ -1,552 +0,0 @@
"""Build delegation tools from coworker agents.
GAP-49: Token tracking for delegation sub-actions.
GAP-55: Delegation provenance summary appended to results.
"""
from __future__ import annotations
import asyncio
from collections import Counter
import logging
import time
from typing import Any
from pydantic import BaseModel, Field
from crewai.tools.base_tool import BaseTool
from crewai.utilities.string_utils import sanitize_tool_name
logger = logging.getLogger(__name__)
def _emit_delegation_event(event_cls: type, **kwargs: Any) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
crewai_event_bus.emit(None, event=event_cls(**kwargs))
except Exception:
pass
def _build_provenance_summary(
coworker: Any, cw_role: str, elapsed_ms: int, in_tokens: int, out_tokens: int
) -> str:
"""GAP-55: Build a brief summary of what the coworker did during delegation."""
try:
executor = getattr(coworker, "_executor", None)
if executor is None:
return ""
provenance = getattr(executor, "provenance_log", [])
if not provenance:
return ""
# Count tool calls by name
tool_counts: Counter[str] = Counter()
step_count = 0
for entry in provenance:
step_count += 1
if entry.action == "tool_call":
tool_name = (entry.inputs or {}).get("tool", "unknown")
tool_counts[tool_name] += 1
if not tool_counts and step_count <= 1:
return ""
# Format tool usage summary
tool_parts = []
for tool_name, count in tool_counts.most_common():
if count > 1:
tool_parts.append(f"{tool_name} ({count}x)")
else:
tool_parts.append(tool_name)
tools_str = ", ".join(tool_parts) if tool_parts else "none"
in_k = f"{in_tokens:,}" if in_tokens else "0"
out_k = f"{out_tokens:,}" if out_tokens else "0"
return (
f"\n\n---\n"
f"[Coworker: {cw_role} | Tools: {tools_str} | "
f"Steps: {step_count} | Tokens: ↑{in_k}{out_k}]"
)
except Exception:
return ""
class DelegateToCoworkerArgs(BaseModel):
"""Arguments for delegating work to a coworker."""
message: str = Field(
description="The message/instruction to send to the coworker. Be specific about what you need."
)
fire_and_forget: bool = Field(
default=False,
description="MUST be false (default) to get the coworker's response. Only set true for background tasks where you don't need the result.",
)
class DelegateToCoworkerTool(BaseTool):
"""Tool that delegates work to a specific coworker agent."""
name: str = ""
description: str = ""
args_schema: type[BaseModel] = DelegateToCoworkerArgs
coworker: Any = None
coworker_source: str = "local"
parent_agent: Any = None
def __init__(
self,
coworker: Any,
source: str = "local",
parent_agent: Any = None,
**kwargs: Any,
) -> None:
cw_role = getattr(coworker, "role", "coworker")
tool_name = sanitize_tool_name(f"delegate_to_{cw_role}")
cw_goal = getattr(coworker, "goal", "")
desc = (
f"Delegate work to {cw_role}. "
f"Their expertise: {cw_goal}. "
f"Send them a clear message describing what you need."
)
super().__init__(
name=tool_name,
description=desc,
coworker=coworker,
coworker_source=source,
parent_agent=parent_agent,
**kwargs,
)
def _run(self, message: str, fire_and_forget: bool = False, **kwargs: Any) -> str:
"""Execute delegation to the coworker."""
from crewai.new_agent.events import (
NewAgentDelegationCompletedEvent,
NewAgentDelegationFailedEvent,
NewAgentDelegationStartedEvent,
NewAgentFireAndForgetCompletedEvent,
NewAgentFireAndForgetDispatchedEvent,
)
from crewai.new_agent.new_agent import NewAgent
cw_role = getattr(self.coworker, "role", "unknown")
parent_id = getattr(self.parent_agent, "id", "") if self.parent_agent else ""
if self.parent_agent and getattr(self.parent_agent, "on_delegate", None):
self.parent_agent.on_delegate(self.coworker, message)
if not isinstance(self.coworker, NewAgent):
return self._delegate_a2a(message)
if fire_and_forget:
_emit_delegation_event(
NewAgentFireAndForgetDispatchedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
)
try:
loop = asyncio.get_running_loop()
except RuntimeError:
loop = None
def _bg_fire_and_forget() -> None:
try:
self.coworker.message(message)
finally:
_emit_delegation_event(
NewAgentFireAndForgetCompletedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
)
if loop and loop.is_running():
async def _async_ff() -> None:
try:
await self.coworker.amessage(message)
finally:
_emit_delegation_event(
NewAgentFireAndForgetCompletedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
)
loop.create_task(_async_ff())
else:
import threading
threading.Thread(target=_bg_fire_and_forget, daemon=True).start()
return f"Work delegated to {cw_role}. They are working on it in the background."
_emit_delegation_event(
NewAgentDelegationStartedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
delegation_mode="sync",
coworker_source=self.coworker_source,
)
start = time.monotonic()
try:
response = self.coworker.message(message)
elapsed_ms = int((time.monotonic() - start) * 1000)
in_tokens = getattr(response, "input_tokens", 0) or 0
out_tokens = getattr(response, "output_tokens", 0) or 0
tokens = in_tokens + out_tokens
_emit_delegation_event(
NewAgentDelegationCompletedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
tokens_consumed=tokens,
response_time_ms=elapsed_ms,
)
# GAP-49: Record token usage on the parent agent if available
if self.parent_agent and tokens > 0:
try:
from crewai.new_agent.models import TokenUsage
executor = getattr(self.parent_agent, "_executor", None)
if executor is not None:
executor._sub_action_tokens.append(
TokenUsage(
action="delegation",
agent_id=str(parent_id),
input_tokens=in_tokens,
output_tokens=out_tokens,
model=getattr(response, "model", "") or "",
delegation_target=cw_role,
coworker_source=self.coworker_source,
)
)
except Exception:
pass
# GAP-55: Build and append provenance summary
result_content = response.content
summary = _build_provenance_summary(
self.coworker, cw_role, elapsed_ms, in_tokens, out_tokens
)
if summary:
result_content += summary
return result_content
except Exception as e:
_emit_delegation_event(
NewAgentDelegationFailedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
error=str(e),
)
raise
async def _arun(
self, message: str, fire_and_forget: bool = False, **kwargs: Any
) -> str:
"""Async delegation — avoids blocking the event loop."""
from crewai.new_agent.events import (
NewAgentDelegationCompletedEvent,
NewAgentDelegationFailedEvent,
NewAgentDelegationStartedEvent,
NewAgentFireAndForgetCompletedEvent,
NewAgentFireAndForgetDispatchedEvent,
)
from crewai.new_agent.new_agent import NewAgent
cw_role = getattr(self.coworker, "role", "unknown")
parent_id = getattr(self.parent_agent, "id", "") if self.parent_agent else ""
if self.parent_agent and getattr(self.parent_agent, "on_delegate", None):
self.parent_agent.on_delegate(self.coworker, message)
if not isinstance(self.coworker, NewAgent):
return self._delegate_a2a(message)
if fire_and_forget:
_emit_delegation_event(
NewAgentFireAndForgetDispatchedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
)
async def _async_ff() -> None:
try:
await self.coworker.amessage(message)
finally:
_emit_delegation_event(
NewAgentFireAndForgetCompletedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
)
asyncio.get_running_loop().create_task(_async_ff())
return f"Work delegated to {cw_role}. They are working on it in the background."
_emit_delegation_event(
NewAgentDelegationStartedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
delegation_mode="sync",
coworker_source=self.coworker_source,
)
start = time.monotonic()
try:
response = await self.coworker.amessage(message)
elapsed_ms = int((time.monotonic() - start) * 1000)
in_tokens = getattr(response, "input_tokens", 0) or 0
out_tokens = getattr(response, "output_tokens", 0) or 0
tokens = in_tokens + out_tokens
_emit_delegation_event(
NewAgentDelegationCompletedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
tokens_consumed=tokens,
response_time_ms=elapsed_ms,
)
if self.parent_agent and tokens > 0:
try:
from crewai.new_agent.models import TokenUsage
executor = getattr(self.parent_agent, "_executor", None)
if executor is not None:
executor._sub_action_tokens.append(
TokenUsage(
action="delegation",
agent_id=str(parent_id),
input_tokens=in_tokens,
output_tokens=out_tokens,
model=getattr(response, "model", "") or "",
delegation_target=cw_role,
coworker_source=self.coworker_source,
)
)
except Exception:
pass
result_content = response.content
summary = _build_provenance_summary(
self.coworker, cw_role, elapsed_ms, in_tokens, out_tokens
)
if summary:
result_content += summary
return result_content
except Exception as e:
_emit_delegation_event(
NewAgentDelegationFailedEvent,
new_agent_id=parent_id,
coworker_role=cw_role,
error=str(e),
)
raise
def _delegate_a2a(self, message: str) -> str:
"""Delegate to an A2A remote coworker."""
try:
from crewai.a2a.client import A2AClient # type: ignore[import-not-found]
url = getattr(self.coworker, "url", None) or str(self.coworker)
client = A2AClient(url=url)
result = client.send_message(message)
return str(result)
except Exception as e:
return f"A2A delegation failed: {e}"
class MultiDelegateArgs(BaseModel):
"""Arguments for delegating to multiple coworkers in parallel."""
delegations: list[dict[str, str]] = Field(
description=(
"List of delegations. Each item is a dict with 'coworker' (role name) "
"and 'message' (instruction to send). All coworkers run in parallel "
"and results are collected."
),
)
class MultiDelegateTool(BaseTool):
"""Tool that delegates work to multiple coworkers in parallel (sync)."""
name: str = "delegate_to_multiple_coworkers"
description: str = (
"Delegate work to multiple coworkers simultaneously. "
"Each coworker runs in parallel and all results are collected. "
"Use when you need input from several coworkers to synthesize a response."
)
args_schema: type[BaseModel] = MultiDelegateArgs
coworker_map: dict[str, Any] = Field(default_factory=dict)
def _run(self, delegations: list[dict[str, str]], **kwargs: Any) -> str:
"""Execute parallel delegations to multiple coworkers."""
from crewai.new_agent.new_agent import NewAgent
tasks_to_run = []
for d in delegations:
cw_name = d.get("coworker", "")
message = d.get("message", "")
coworker = self.coworker_map.get(cw_name)
if coworker is None:
# Try matching by partial role name
for role, cw in self.coworker_map.items():
if cw_name.lower() in role.lower():
coworker = cw
break
if coworker is None or not isinstance(coworker, NewAgent):
tasks_to_run.append((cw_name, message, None))
else:
tasks_to_run.append((cw_name, message, coworker))
results: list[str] = []
async def _run_all() -> list[str]:
coros = []
for cw_name, message, coworker in tasks_to_run:
if coworker is None:
coros.append(_error_result(cw_name))
else:
coros.append(coworker.amessage(message))
raw_results = await asyncio.gather(*coros, return_exceptions=True)
return [r for r in raw_results if not isinstance(r, BaseException)]
async def _error_result(name: str) -> str:
return f"[Error] Coworker '{name}' not found."
try:
loop = asyncio.get_running_loop()
except RuntimeError:
loop = None
if loop and loop.is_running():
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
raw = pool.submit(asyncio.run, _run_all()).result()
else:
raw = asyncio.run(_run_all())
for i, (cw_name, message, coworker) in enumerate(tasks_to_run):
r = raw[i]
if isinstance(r, Exception):
results.append(f"[{cw_name}] Error: {r}")
elif isinstance(r, str):
results.append(f"[{cw_name}] {r}")
else:
content = getattr(r, "content", str(r))
role = cw_name or f"Coworker {i + 1}"
# GAP-55: Append provenance summary for each coworker
in_tokens = getattr(r, "input_tokens", 0) or 0
out_tokens = getattr(r, "output_tokens", 0) or 0
if coworker is not None:
summary = _build_provenance_summary(
coworker, role, 0, in_tokens, out_tokens
)
if summary:
content += summary
results.append(f"[{role}] {content}")
return "\n\n".join(results)
async def _arun(self, delegations: list[dict[str, str]], **kwargs: Any) -> str:
"""Async parallel delegation — avoids blocking the event loop."""
from crewai.new_agent.new_agent import NewAgent
tasks_to_run = []
for d in delegations:
cw_name = d.get("coworker", "")
message = d.get("message", "")
coworker = self.coworker_map.get(cw_name)
if coworker is None:
for role, cw in self.coworker_map.items():
if cw_name.lower() in role.lower():
coworker = cw
break
if coworker is None or not isinstance(coworker, NewAgent):
tasks_to_run.append((cw_name, message, None))
else:
tasks_to_run.append((cw_name, message, coworker))
async def _error_result(name: str) -> str:
return f"[Error] Coworker '{name}' not found."
coros = []
for cw_name, message, coworker in tasks_to_run:
if coworker is None:
coros.append(_error_result(cw_name))
else:
coros.append(coworker.amessage(message))
raw = await asyncio.gather(*coros, return_exceptions=True)
results: list[str] = []
for i, (cw_name, message, coworker) in enumerate(tasks_to_run):
r = raw[i]
if isinstance(r, Exception):
results.append(f"[{cw_name}] Error: {r}")
elif isinstance(r, str):
results.append(f"[{cw_name}] {r}")
else:
content = getattr(r, "content", str(r))
role = cw_name or f"Coworker {i + 1}"
in_tokens = getattr(r, "input_tokens", 0) or 0
out_tokens = getattr(r, "output_tokens", 0) or 0
if coworker is not None:
summary = _build_provenance_summary(
coworker, role, 0, in_tokens, out_tokens
)
if summary:
content += summary
results.append(f"[{role}] {content}")
return "\n\n".join(results)
def build_coworker_tools(
coworkers: list[Any],
parent_role: str = "",
parent_agent: Any = None,
) -> list[BaseTool]:
"""Build delegation tools for a list of resolved coworkers."""
tools: list[BaseTool] = []
coworker_map: dict[str, Any] = {}
for cw in coworkers:
from crewai.new_agent.new_agent import NewAgent
cw_role = getattr(cw, "role", "")
if parent_role and cw_role == parent_role:
continue
if isinstance(cw, NewAgent):
source = "amp" if getattr(cw, "_amp_resolved", False) else "local"
tools.append(
DelegateToCoworkerTool(
coworker=cw,
source=source,
parent_agent=parent_agent,
)
)
coworker_map[cw.role] = cw
else:
source = "a2a"
cw_url = getattr(cw, "url", None)
if cw_url:
tool_name = sanitize_tool_name(
f"delegate_to_a2a_{cw_url.split('/')[-1]}"
)
tools.append(
DelegateToCoworkerTool(
coworker=cw,
source=source,
parent_agent=parent_agent,
)
)
if len(coworker_map) > 1:
tools.append(MultiDelegateTool(coworker_map=coworker_map))
return tools

View File

@@ -1,487 +0,0 @@
"""Parser for declarative agent definitions (JSON/JSONC)."""
from __future__ import annotations
import json
import logging
from pathlib import Path
import re
from typing import Any, cast
logger = logging.getLogger(__name__)
def strip_jsonc_comments(text: str) -> str:
"""Strip // and /* */ comments from JSONC text, then fix trailing commas."""
result = re.sub(r"(?<!:)//.*?$", "", text, flags=re.MULTILINE)
result = re.sub(r"/\*.*?\*/", "", result, flags=re.DOTALL)
result = re.sub(r",\s*([}\]])", r"\1", result)
return result
def _validate_against_schema(definition: dict[str, Any]) -> None:
"""Validate agent definition against the JSON schema.
Logs a warning on validation failure rather than raising, so
existing definitions continue to work (graceful degradation).
"""
try:
import jsonschema # type: ignore[import-untyped]
except ImportError:
logger.debug("jsonschema not installed, skipping validation")
return
schema_path = Path(__file__).parent / "agent_schema.json"
if not schema_path.exists():
logger.debug("agent_schema.json not found, skipping validation")
return
try:
schema = json.loads(schema_path.read_text(encoding="utf-8"))
jsonschema.validate(definition, schema)
except jsonschema.ValidationError as e:
logger.warning("Agent definition validation failed: %s", e.message)
except Exception as e:
logger.debug("Schema validation skipped: %s", e)
def parse_agent_definition(source: str | Path | dict[str, Any]) -> dict[str, Any]:
"""Parse an agent definition from a file path, JSON string, or dict.
Args:
source: Path to a .json/.jsonc file, a JSON string, or a dict.
Returns:
Parsed and validated agent definition dict.
"""
if isinstance(source, dict):
defn = source
elif isinstance(source, Path) or (
isinstance(source, str) and source.endswith((".json", ".jsonc"))
):
path = Path(source)
raw = path.read_text(encoding="utf-8")
clean = strip_jsonc_comments(raw)
defn = json.loads(clean)
else:
raw = source
clean = strip_jsonc_comments(raw)
defn = json.loads(clean)
# GAP-65: validate against schema (warn-only)
_validate_against_schema(defn)
return defn
def load_agent_from_definition(
source: str | Path | dict[str, Any],
agents_dir: Path | None = None,
_loading_chain: set[str] | None = None,
) -> Any:
"""Load a NewAgent from a declarative definition.
Args:
source: Agent definition (file path, JSON string, or dict).
agents_dir: Directory to resolve local coworker refs from.
_loading_chain: Internal — tracks agent names being loaded to
detect circular coworker references.
Returns:
A configured NewAgent instance.
"""
from crewai.new_agent.models import AgentSettings
from crewai.new_agent.new_agent import NewAgent
if _loading_chain is None:
_loading_chain = set()
defn = parse_agent_definition(source)
agent_name = defn.get("name", "")
if agent_name and agent_name in _loading_chain:
logger.debug(
"Skipping coworker back-reference '%s' (already in loading chain)",
agent_name,
)
return None
if agent_name:
_loading_chain.add(agent_name)
# Build settings
settings_raw = defn.get("settings", {})
settings_kwargs = {}
settings_map = {
"memory": "memory_enabled",
"reasoning": "reasoning_enabled",
"self_improving": "self_improving",
"planning": "planning_enabled",
"auto_plan": "auto_plan",
"can_spawn_copies": "can_spawn_copies",
"max_spawn_depth": "max_spawn_depth",
"max_concurrent_spawns": "max_concurrent_spawns",
"max_history_messages": "max_history_messages",
"narration_guard": "narration_guard",
"dreaming_interval_hours": "dreaming_interval_hours",
"dreaming_trigger_threshold": "dreaming_trigger_threshold",
"dreaming_llm": "dreaming_llm",
"provenance_detail": "provenance_detail",
"spawn_timeout": "spawn_timeout",
"can_create_knowledge": "can_create_knowledge",
"can_build_skills": "can_build_skills",
"can_schedule": "can_schedule",
"memory_read_only": "memory_read_only",
"narration_max_retries": "narration_max_retries",
"respect_context_window": "respect_context_window",
"cache_tool_results": "cache_tool_results",
"max_retry_limit": "max_retry_limit",
"share_data": "share_data",
}
for json_key, model_key in settings_map.items():
if json_key in settings_raw:
settings_kwargs[model_key] = settings_raw[json_key]
settings = AgentSettings(**settings_kwargs)
try:
# Resolve coworkers (pass loading chain to detect circular refs)
coworkers = _resolve_coworkers(
defn.get("coworkers", []), agents_dir, _loading_chain
)
# Resolve guardrail
guardrail = _resolve_guardrail(defn.get("guardrail"))
# Resolve knowledge sources
knowledge_sources = _resolve_knowledge_sources(
defn.get("knowledge_sources", [])
)
# Build agent
agent_kwargs: dict[str, Any] = {
"role": defn["role"],
"goal": defn["goal"],
"backstory": defn.get("backstory", ""),
"settings": settings,
"verbose": defn.get("verbose", False),
"max_iter": defn.get("max_iter", 25),
}
if "llm" in defn:
agent_kwargs["llm"] = defn["llm"]
if "function_calling_llm" in defn:
agent_kwargs["function_calling_llm"] = defn["function_calling_llm"]
if "tools" in defn:
agent_kwargs["tools"] = _resolve_tools(defn["tools"])
if "mcps" in defn:
agent_kwargs["mcps"] = _resolve_mcps(defn["mcps"])
if "apps" in defn:
agent_kwargs["apps"] = defn["apps"]
if coworkers:
agent_kwargs["coworkers"] = coworkers
if guardrail is not None:
agent_kwargs["guardrail"] = guardrail
if "max_tokens" in defn:
agent_kwargs["max_tokens"] = defn["max_tokens"]
if "max_execution_time" in defn:
agent_kwargs["max_execution_time"] = defn["max_execution_time"]
if knowledge_sources:
agent_kwargs["knowledge_sources"] = knowledge_sources
if "skills" in defn:
from pathlib import Path as _Path
agent_kwargs["skills"] = [_Path(p) for p in defn["skills"]]
if "response_model" in defn:
resolved_model = _resolve_response_model(defn["response_model"])
if resolved_model is not None:
agent_kwargs["response_model"] = resolved_model
memory_setting = settings_raw.get("memory", True)
agent_kwargs["memory"] = memory_setting
return NewAgent(**agent_kwargs)
finally:
if agent_name:
_loading_chain.discard(agent_name)
def _resolve_tools(tool_names: list[str]) -> list[Any]:
"""Resolve tool names into tool instances."""
tools = []
for name in tool_names:
if name.startswith("custom:"):
custom_tool = _resolve_custom_tool(name[7:])
if custom_tool is not None:
tools.append(custom_tool)
continue
try:
tool_cls = _find_tool_class(name)
if tool_cls:
tools.append(tool_cls())
except Exception as e:
logger.warning(f"Failed to resolve tool '{name}': {e}")
return tools
def _find_tool_class(name: str) -> type | None:
"""Look up a tool class by name from the crewai_tools package."""
try:
import crewai_tools
# Convert snake_case name to PascalCase + Tool suffix
class_name = "".join(word.capitalize() for word in name.split("_")) + "Tool"
cls = getattr(crewai_tools, class_name, None)
if cls is not None:
return cast(type, cls)
# Try direct attribute lookup
cls = getattr(crewai_tools, name, None)
return cast(type, cls) if cls is not None else None
except ImportError:
return None
def _resolve_coworkers(
coworker_defs: list[dict[str, Any]],
agents_dir: Path | None,
_loading_chain: set[str] | None = None,
) -> list[Any]:
"""Resolve coworker definitions into NewAgent instances or handles."""
coworkers = []
for cw in coworker_defs:
if isinstance(cw, str):
coworkers.append(cw)
elif "ref" in cw:
ref_name = cw["ref"]
if _loading_chain and ref_name in _loading_chain:
logger.debug(
"Skipping coworker back-reference '%s' (already in loading chain)",
ref_name,
)
continue
if agents_dir:
for ext in (".json", ".jsonc"):
ref_path = agents_dir / f"{ref_name}{ext}"
if ref_path.exists():
result = load_agent_from_definition(
ref_path,
agents_dir,
set(_loading_chain) if _loading_chain else None,
)
if result is not None:
coworkers.append(result)
break
else:
logger.warning(
f"Coworker ref '{ref_name}' not found in {agents_dir}"
)
else:
logger.warning(
f"Cannot resolve coworker ref '{ref_name}' — no agents_dir specified"
)
elif "amp" in cw:
# AMP handle — pass as string for resolution at construction time
# Support overrides: {"amp": "handle", "llm": "...", "settings": {...}}
amp_handle = cw["amp"]
overrides = {k: v for k, v in cw.items() if k != "amp"}
if overrides:
coworkers.append({"handle": amp_handle, "overrides": overrides})
else:
coworkers.append(amp_handle)
elif "a2a" in cw:
# A2A remote — would need A2AClientConfig
try:
from crewai.a2a.config import A2AClientConfig
coworkers.append(A2AClientConfig(endpoint=cw["a2a"]))
except ImportError:
logger.warning(f"A2A support not available for coworker {cw['a2a']}")
else:
logger.warning(f"Unknown coworker definition format: {cw}")
return coworkers
def _resolve_guardrail(guardrail_def: dict[str, Any] | str | None) -> Any:
"""Resolve guardrail definition.
Supports:
- String shorthand: converted to an LLM guardrail with the string as instructions.
- Dict with type "llm": creates an LLMGuardrail.
- Dict with type "code": resolves a dotted function path.
"""
if guardrail_def is None:
return None
# GAP-91: String shorthand -> LLM guardrail
if isinstance(guardrail_def, str):
guardrail_def = {"type": "llm", "instructions": guardrail_def}
if not isinstance(guardrail_def, dict):
return None
guard_type = guardrail_def.get("type", "")
if guard_type == "llm":
from crewai.llms.base_llm import BaseLLM
from crewai.tasks.llm_guardrail import LLMGuardrail
from crewai.utilities.llm_utils import create_llm
llm_ref = guardrail_def.get("llm", "openai/gpt-4o-mini")
llm = create_llm(llm_ref) if isinstance(llm_ref, str) else llm_ref
return LLMGuardrail(
description=guardrail_def.get("instructions", ""),
llm=cast(BaseLLM, llm),
)
# GAP-106: Code guardrail — resolve dotted function path
if guard_type == "code":
import importlib
code_path = guardrail_def.get("function", guardrail_def.get("path", ""))
if code_path:
try:
module_path, func_name = code_path.rsplit(".", 1)
module = importlib.import_module(module_path)
func = getattr(module, func_name)
return func
except Exception as e:
logger.warning(f"Failed to resolve code guardrail '{code_path}': {e}")
return None
return None
def _resolve_custom_tool(tool_name: str) -> Any:
"""Resolve a custom tool from the project's tools/ directory."""
tools_dir = Path.cwd() / "tools"
tool_file = tools_dir / f"{tool_name}.py"
if not tool_file.exists():
logger.warning(f"Custom tool file not found: {tool_file}")
return None
try:
import importlib.util
spec = importlib.util.spec_from_file_location(
f"custom_tools.{tool_name}", tool_file
)
if spec is None or spec.loader is None:
return None
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
from crewai.tools.base_tool import BaseTool
for attr_name in dir(module):
attr = getattr(module, attr_name)
if (
isinstance(attr, type)
and issubclass(attr, BaseTool)
and attr is not BaseTool
):
tool_cls = cast(type[Any], attr)
return tool_cls()
logger.warning(f"No BaseTool subclass found in {tool_file}")
return None
except Exception as e:
logger.warning(f"Failed to load custom tool '{tool_name}': {e}")
return None
def _resolve_knowledge_sources(sources: list[dict[str, Any]]) -> list[Any]:
"""Resolve knowledge source definitions into knowledge source instances."""
resolved = []
for src in sources:
path_str = src.get("path", "")
if not path_str:
continue
path = Path(path_str)
try:
if path.is_dir():
from crewai.knowledge.source.directory_knowledge_source import ( # type: ignore[import-not-found]
DirectoryKnowledgeSource,
)
resolved.append(DirectoryKnowledgeSource(path=path_str))
elif path.suffix.lower() == ".csv":
from crewai.knowledge.source.csv_knowledge_source import (
CSVKnowledgeSource,
)
resolved.append(CSVKnowledgeSource(file_paths=[path_str]))
elif path.suffix.lower() == ".pdf":
from crewai.knowledge.source.pdf_knowledge_source import (
PDFKnowledgeSource,
)
resolved.append(PDFKnowledgeSource(file_paths=[path_str]))
elif path.suffix.lower() in (".xls", ".xlsx"):
from crewai.knowledge.source.excel_knowledge_source import (
ExcelKnowledgeSource,
)
resolved.append(ExcelKnowledgeSource(file_paths=[path_str]))
elif path.suffix.lower() == ".json":
from crewai.knowledge.source.json_knowledge_source import (
JSONKnowledgeSource,
)
resolved.append(JSONKnowledgeSource(file_paths=[path_str]))
elif path.suffix.lower() == ".txt":
from crewai.knowledge.source.text_file_knowledge_source import (
TextFileKnowledgeSource,
)
resolved.append(TextFileKnowledgeSource(file_paths=[path_str]))
else:
from crewai.knowledge.source.text_file_knowledge_source import (
TextFileKnowledgeSource,
)
resolved.append(TextFileKnowledgeSource(file_paths=[path_str]))
except Exception as e:
logger.warning(f"Failed to resolve knowledge source '{path_str}': {e}")
return resolved
def _resolve_response_model(dotted_path: str) -> type | None:
"""Resolve a dotted path string to a Pydantic BaseModel class."""
try:
import importlib
module_path, class_name = dotted_path.rsplit(".", 1)
module = importlib.import_module(module_path)
cls = getattr(module, class_name)
from pydantic import BaseModel
if isinstance(cls, type) and issubclass(cls, BaseModel):
return cls
logger.warning(f"response_model '{dotted_path}' is not a BaseModel subclass")
return None
except Exception as e:
logger.warning(f"Failed to resolve response_model '{dotted_path}': {e}")
return None
def _resolve_mcps(mcp_defs: list[Any]) -> list[Any]:
"""Resolve MCP definitions into proper config objects."""
resolved: list[Any] = []
for mcp in mcp_defs:
if isinstance(mcp, str):
resolved.append(mcp)
elif isinstance(mcp, dict):
url = mcp.get("url", "")
if url:
try:
from crewai.mcp import MCPServerHTTP
resolved.append(MCPServerHTTP(url=url))
except ImportError:
resolved.append(url)
else:
resolved.append(mcp)
else:
resolved.append(mcp)
return resolved

View File

@@ -1,807 +0,0 @@
"""Dreaming — background memory consolidation for NewAgent.
GAP-48: Marks raw memories as processed so they are not re-processed.
GAP-49: Tracks token usage from the consolidation LLM call.
GAP-54: Scopes canonical memories (global / user / conversation) and only shares global ones.
GAP-62: Saves detected workflows as reusable JSON recipes.
GAP-80: Workflow user confirmation flow — pending list instead of auto-save.
GAP-81: Generate executable Python Flow code alongside JSON metadata.
GAP-82: match_workflow() to consult discovered flows during execution.
GAP-100: Persist scope classification with canonical memories.
GAP-101: Shared canonical memories tagged read-only.
GAP-112: Prune raw memories after dreaming consolidation.
GAP-113: Workflow detection threshold raised from 3 to 5.
"""
from __future__ import annotations
from datetime import datetime, timezone
import json
import logging
import os
import re
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from crewai.new_agent.new_agent import NewAgent
logger = logging.getLogger(__name__)
# GAP-54: Scope constants for canonical memories
SCOPE_GLOBAL = "global"
SCOPE_USER = "user"
SCOPE_CONVERSATION = "conversation"
# GAP-54: Heuristic patterns for user-scoped memories
_USER_SCOPE_PATTERNS: list[re.Pattern[str]] = [
re.compile(p, re.IGNORECASE)
for p in (
r"\bmy\s+(name|preference|email|account|setting)\b",
r"\buser\s+prefer",
r"\bpersonal\s+(preference|setting|detail)",
r"\bI\s+(like|prefer|want|need|always|usually)\b",
r"\b(his|her|their)\s+(name|preference|email|account)\b",
)
]
# GAP-54: Patterns that indicate conversation-specific context
_CONVERSATION_SCOPE_PATTERNS: list[re.Pattern[str]] = [
re.compile(p, re.IGNORECASE)
for p in (
r"\bin this conversation\b",
r"\bjust now\b",
r"\bthis session\b",
r"\bcurrent discussion\b",
)
]
def _classify_scope(canonical_text: str) -> str:
"""Classify a canonical memory's scope using heuristics."""
for pattern in _CONVERSATION_SCOPE_PATTERNS:
if pattern.search(canonical_text):
return SCOPE_CONVERSATION
for pattern in _USER_SCOPE_PATTERNS:
if pattern.search(canonical_text):
return SCOPE_USER
return SCOPE_GLOBAL
class DreamingEngine:
"""Consolidates raw memories into canonical insights."""
def __init__(self, agent: NewAgent):
self.agent = agent
self._last_dreaming_time: datetime | None = None
self._memories_since_last_dream: int = 0
# GAP-48: Track processed memory IDs (persistent)
self._processed_memory_ids: set[str] = set()
self._cycle_count: int = 0
self._load_processed_ids()
# GAP-49: Token tracking for the last dream cycle
self._last_cycle_tokens: Any = None
# GAP-62: Discovered flow recipes from previous cycles
self._discovered_flows: list[dict[str, Any]] = []
self._load_discovered_flows()
# GAP-80: Pending workflows awaiting user confirmation
self._pending_workflows: list[dict[str, Any]] = []
# GAP-122: Training feedback awaiting next consolidation cycle
self._training_feedback: list[dict[str, Any]] = []
# ── GAP-48: Persistent processed-memory tracking ──────────
def _processed_ids_path(self) -> str:
"""Path to the JSON file persisting processed memory IDs."""
agent_name = re.sub(r"[^a-zA-Z0-9_-]", "_", self.agent.role)[:64]
base_dir = os.path.join(".crewai", "dreaming")
return os.path.join(base_dir, f"{agent_name}_processed.json")
def _load_processed_ids(self) -> None:
"""Load previously processed memory IDs from disk."""
try:
path = self._processed_ids_path()
if os.path.exists(path):
with open(path, "r") as f:
data = json.load(f)
self._processed_memory_ids = set(data.get("ids", []))
self._cycle_count = data.get("cycle_count", 0)
except Exception:
self._processed_memory_ids = set()
def _save_processed_ids(self) -> None:
"""Persist processed memory IDs to disk."""
try:
path = self._processed_ids_path()
os.makedirs(os.path.dirname(path), exist_ok=True)
with open(path, "w") as f:
json.dump(
{
"ids": list(self._processed_memory_ids),
"cycle_count": self._cycle_count,
},
f,
)
except Exception as e:
logger.debug(f"Failed to persist processed memory IDs: {e}")
# ── GAP-62: Discovered flow persistence ───────────────────
def _flows_manifest_path(self) -> str:
return os.path.join(".crewai", "flows", "manifest.json")
def _load_discovered_flows(self) -> None:
"""Load the flow manifest from disk."""
try:
path = self._flows_manifest_path()
if os.path.exists(path):
with open(path, "r") as f:
self._discovered_flows = json.load(f)
except Exception:
self._discovered_flows = []
def _save_flow_recipe(self, workflow: dict[str, Any]) -> None:
"""GAP-62: Save a workflow as a reusable JSON recipe and register in manifest."""
tools = workflow.get("tools", [])
count = workflow.get("count", 0)
if not tools:
return
try:
flows_dir = os.path.join(".crewai", "flows")
os.makedirs(flows_dir, exist_ok=True)
# Generate a recipe name
recipe_name = "_".join(tools[:5]).replace(" ", "_").lower()
recipe_name = re.sub(r"[^a-zA-Z0-9_]", "", recipe_name)[:64]
recipe_path = os.path.join(flows_dir, f"{recipe_name}.json")
recipe = {
"name": recipe_name,
"tools": tools,
"pattern_count": count,
"created_at": datetime.now(timezone.utc).isoformat(),
"agent_role": self.agent.role,
"description": f"Repeated pattern ({count}x): {' -> '.join(tools)}",
}
with open(recipe_path, "w") as f:
json.dump(recipe, f, indent=2)
# Update manifest
manifest_path = self._flows_manifest_path()
manifest: list[dict[str, Any]] = []
if os.path.exists(manifest_path):
try:
with open(manifest_path, "r") as f:
manifest = json.load(f)
except Exception:
manifest = []
# Avoid duplicate entries
if not any(entry.get("name") == recipe_name for entry in manifest):
manifest.append(
{
"name": recipe_name,
"path": recipe_path,
"tools": tools,
"created_at": recipe["created_at"],
}
)
with open(manifest_path, "w") as f:
json.dump(manifest, f, indent=2)
self._discovered_flows = manifest
logger.debug(f"Saved workflow recipe: {recipe_name}")
except Exception as e:
logger.debug(f"Failed to save workflow recipe: {e}")
def _generate_flow_code(self, workflow: dict[str, Any]) -> str | None:
"""GAP-81: Generate executable Python Flow code for a workflow.
Saves a ``.py`` file alongside the JSON metadata. The generated Flow
is readable and editable by the user.
Returns the file path on success, or None on failure.
"""
tools = workflow.get("tools", [])
if not tools:
return None
try:
recipe_name = "_".join(tools[:5]).replace(" ", "_").lower()
recipe_name = re.sub(r"[^a-zA-Z0-9_]", "", recipe_name)[:64]
class_name = (
"".join(word.capitalize() for word in recipe_name.split("_") if word)
or "DetectedWorkflow"
)
# Build step methods
steps: list[str] = []
for i, tool_name in enumerate(tools):
safe_name = re.sub(r"[^a-zA-Z0-9_]", "_", tool_name)
step_num = i + 1
if i == 0:
decorator = " @start()"
else:
prev_safe = re.sub(r"[^a-zA-Z0-9_]", "_", tools[i - 1])
decorator = f' @listen("step_{i}_{prev_safe}")'
method = (
f"{decorator}\n"
f" def step_{step_num}_{safe_name}(self):\n"
f' """Calls {tool_name} tool."""\n'
f' agent = self.state.get("agent")\n'
f' if agent and "{tool_name}" in (agent.tools or {{}}):\n'
f' result = agent.tools["{tool_name}"].run(\n'
f' self.state.get("step_{step_num}_input", self.state.get("input", ""))\n'
f" )\n"
f" else:\n"
f" result = None\n"
f' self.state["step_{step_num}_result"] = result\n'
f" return result"
)
steps.append(method)
steps_code = "\n\n".join(steps)
code = (
f'"""Auto-generated Flow for workflow: {recipe_name}\n'
f"\n"
f"Tools: {' -> '.join(tools)}\n"
f"Generated by CrewAI DreamingEngine.\n"
f'"""\n'
f"\n"
f"from crewai.flow.flow import Flow, start, listen\n"
f"\n"
f"\n"
f"class {class_name}(Flow):\n"
f' """Workflow: {" -> ".join(tools)}"""\n'
f"\n"
f"{steps_code}\n"
)
flows_dir = os.path.join(".crewai", "flows")
os.makedirs(flows_dir, exist_ok=True)
py_path = os.path.join(flows_dir, f"workflow_{recipe_name}.py")
with open(py_path, "w") as f:
f.write(code)
logger.debug(f"Generated Flow code: {py_path}")
return py_path
except Exception as e:
logger.debug(f"Failed to generate Flow code: {e}")
return None
# ── GAP-82: Match user messages against discovered workflows ──
def match_workflow(self, user_message: str) -> dict[str, Any] | None:
"""Check if a user message matches a previously confirmed workflow.
Uses keyword overlap between the message and workflow descriptions.
Returns the matching workflow dict, or None if no match is found.
"""
if not self._discovered_flows:
return None
stop_words = {
"the",
"a",
"an",
"is",
"to",
"and",
"or",
"of",
"in",
"for",
"it",
"on",
}
msg_lower = user_message.lower()
msg_words = set(msg_lower.split()) - stop_words
for flow in self._discovered_flows:
desc = flow.get("description", "").lower()
desc_words = set(desc.split()) - stop_words
overlap = desc_words & msg_words
if len(overlap) >= 3:
return flow
return None
# ── GAP-112: Prune processed raw memories ────────────────────
def _prune_processed_memories(self, processed_ids: set[str]) -> None:
"""Remove raw memories that have been consolidated into canonical insights.
Keeps the most recent ``KEEP_RECENT`` memories as an audit trail.
"""
memory = getattr(self.agent, "_memory_instance", None)
if not memory:
return
try:
KEEP_RECENT = 20
prunable = sorted(processed_ids)
if len(prunable) <= KEEP_RECENT:
return # Keep all if we haven't accumulated enough
to_prune = prunable[:-KEEP_RECENT] # Prune oldest, keep recent
for mem_id in to_prune:
try:
memory.delete(mem_id)
except Exception:
pass
except Exception:
pass
# ── GAP-122: Training feedback integration ─────────────────
def add_training_feedback(self, feedback: str, task_context: str = "") -> None:
"""Receive training feedback for priority inclusion in the next dream cycle.
Stored entries are injected into the consolidation prompt with higher
weight so the agent learns from explicit user corrections faster.
"""
self._training_feedback.append(
{
"feedback": feedback,
"task_context": task_context,
"timestamp": datetime.now(timezone.utc).isoformat(),
}
)
self.increment_memory_count()
logger.debug("Training feedback received for agent '%s'", self.agent.role)
# ── Core dreaming logic ───────────────────────────────────
def should_dream(self) -> bool:
"""Check if dreaming should be triggered."""
settings = self.agent.settings
if not settings.self_improving:
return False
now = datetime.now(timezone.utc)
# Time-based trigger
if self._last_dreaming_time is not None:
hours_since = (now - self._last_dreaming_time).total_seconds() / 3600
if hours_since >= settings.dreaming_interval_hours:
return True
elif self._memories_since_last_dream >= settings.dreaming_trigger_threshold:
# Threshold trigger on first run
return True
# Threshold trigger
if self._memories_since_last_dream >= settings.dreaming_trigger_threshold:
return True
return False
def increment_memory_count(self) -> None:
self._memories_since_last_dream += 1
async def dream(self) -> dict[str, Any]:
"""Run dreaming cycle. Returns summary of what was consolidated."""
# Emit event
self._emit_dreaming_started()
self._cycle_count += 1
result = {
"memories_processed": 0,
"canonical_created": 0,
"workflows_detected": 0,
}
try:
memory = getattr(self.agent, "_memory_instance", None)
if memory is not None:
# GAP-48: Filter out already-processed memories
memories, memory_ids = self._get_recent_memories(memory)
result["memories_processed"] = len(memories)
if memories:
consolidated = await self._consolidate_memories(memories)
result["canonical_created"] = len(consolidated)
for canonical in consolidated:
# GAP-54 + GAP-100: Classify scope and persist with metadata
scope = _classify_scope(canonical)
try:
memory.remember(
canonical,
agent_role=self.agent.role,
importance=0.9,
metadata={
"type": "canonical",
"scope": scope,
"dreaming_cycle": self._cycle_count,
},
)
except TypeError:
# Fallback if memory.remember() doesn't accept metadata
try:
memory.remember(
canonical,
agent_role=self.agent.role,
importance=0.9,
)
except Exception as e:
logger.debug(f"Failed to save canonical memory: {e}")
except Exception as e:
logger.debug(f"Failed to save canonical memory: {e}")
# GAP-54: Only share global-scoped memories with coworkers
global_memories = [
c for c in consolidated if _classify_scope(c) == SCOPE_GLOBAL
]
self._share_with_coworkers(global_memories)
# GAP-48: Mark these memories as processed
self._processed_memory_ids.update(memory_ids)
self._save_processed_ids()
# GAP-112: Prune raw memories that have been consolidated
self._prune_processed_memories(self._processed_memory_ids)
# Detect workflow patterns from provenance (independent of memory)
workflows = self._detect_workflows()
result["workflows_detected"] = len(workflows)
for wf in workflows:
self._emit_workflow_detected(wf)
# GAP-80: Propose only — no auto-save. User must confirm.
self._propose_workflow(wf)
except Exception as e:
logger.warning(f"Dreaming cycle failed: {e}")
# Always reset counters after a dreaming attempt
self._last_dreaming_time = datetime.now(timezone.utc)
self._memories_since_last_dream = 0
self._emit_dreaming_completed(result)
return result
def _get_recent_memories(self, memory: Any) -> tuple[list[str], list[str]]:
"""Get memories accumulated since last dreaming cycle.
GAP-48: Returns (memory_contents, memory_ids) filtering out already-processed IDs.
"""
try:
results = memory.recall("", limit=50)
contents: list[str] = []
ids: list[str] = []
for m in results or []:
# Try to extract a unique ID for this memory
mem_id = getattr(m, "id", None) or getattr(
getattr(m, "record", None), "id", None
)
if mem_id is None:
# Use content hash as fallback ID
content = getattr(m, "content", "") or getattr(
getattr(m, "record", None), "content", ""
)
if content:
mem_id = str(hash(content))
else:
continue
mem_id = str(mem_id)
# GAP-48: Skip already-processed memories
if mem_id in self._processed_memory_ids:
continue
# GAP-101: Skip read-only shared memories during consolidation
mem_metadata = (
getattr(m, "metadata", None)
or getattr(getattr(m, "record", None), "metadata", None)
or {}
)
if isinstance(mem_metadata, dict) and mem_metadata.get("read_only"):
continue
content = getattr(m, "content", "") or getattr(
getattr(m, "record", None), "content", ""
)
# GAP-101: Also skip by tag prefix
if content and content.startswith("[shared:read-only]"):
continue
if content:
contents.append(content)
ids.append(mem_id)
return contents, ids
except Exception:
return [], []
def _get_dreaming_llm(self) -> Any:
"""Get the LLM to use for dreaming — dedicated or agent's default."""
dreaming_llm_ref = self.agent.settings.dreaming_llm
if dreaming_llm_ref is not None:
from crewai.utilities.llm_utils import create_llm
return create_llm(dreaming_llm_ref)
return self.agent._llm_instance
async def _consolidate_memories(self, memories: list[str]) -> list[str]:
"""Use LLM to consolidate raw memories into canonical insights."""
llm = self._get_dreaming_llm()
if llm is None:
return []
from crewai.utilities.agent_utils import (
aget_llm_response,
format_message_for_llm,
)
from crewai.utilities.types import LLMMessage
memory_text = "\n".join(f"- {m}" for m in memories)
# GAP-122: Include pending training feedback with higher priority
training_section = ""
if self._training_feedback:
lines = []
for entry in self._training_feedback:
ctx = entry.get("task_context", "")
fb = entry.get("feedback", "")
if ctx:
lines.append(f"- [Context: {ctx}] {fb}")
else:
lines.append(f"- {fb}")
training_section = (
"\n\nTraining feedback (HIGH PRIORITY — these are explicit user "
"corrections and should be preserved as canonical insights):\n"
+ "\n".join(lines)
)
self._training_feedback.clear()
prompt = (
"You are analyzing a collection of raw memories from an AI agent's interactions. "
"Your task is to consolidate these into canonical insights — key learnings, patterns, "
"and important facts that should be retained long-term.\n\n"
"Raw memories:\n"
f"{memory_text}"
f"{training_section}\n\n"
"Instructions:\n"
"1. Identify patterns, repeated themes, and key facts\n"
"2. Consolidate redundant memories into single, clear statements\n"
"3. Resolve any pronouns or vague references into specific, self-contained facts\n"
"4. Drop any memories that are too vague or incomplete to be useful\n"
"5. Output each canonical insight on its own line, prefixed with '- '\n"
"6. Keep insights concise but self-contained\n"
"7. Training feedback entries are high priority — always preserve them\n\n"
"Canonical insights:"
)
messages: list[LLMMessage] = [format_message_for_llm(prompt, role="user")]
try:
from crewai.new_agent.executor import _NullPrinter
response = await aget_llm_response(
llm=llm,
messages=messages,
callbacks=[],
printer=_NullPrinter(),
verbose=False,
)
# GAP-49: Record token usage from the consolidation LLM call
try:
from crewai.new_agent.models import TokenUsage
usage = getattr(llm, "_token_usage", None) or {}
in_tokens = usage.get("prompt_tokens", 0)
out_tokens = usage.get("completion_tokens", 0)
model_name = getattr(llm, "model", "") or ""
self._last_cycle_tokens = TokenUsage(
action="dreaming",
agent_id=str(self.agent.id),
input_tokens=in_tokens,
output_tokens=out_tokens,
model=model_name,
)
except Exception:
pass
lines = str(response).strip().split("\n")
canonical = []
for line in lines:
line = line.strip()
if line.startswith("- "):
canonical.append(line[2:].strip())
elif line:
canonical.append(line)
return canonical
except Exception as e:
logger.debug(f"Memory consolidation LLM call failed: {e}")
return []
def _detect_workflows(self) -> list[dict[str, Any]]:
"""Detect repeated tool-call sequences in provenance logs."""
executor = self.agent._executor
if executor is None:
return []
provenance = executor.provenance_log
tool_sequences: list[list[str]] = []
current_sequence: list[str] = []
for entry in provenance:
if entry.action == "tool_call":
tool_name = (entry.inputs or {}).get("tool", "")
if tool_name:
current_sequence.append(tool_name)
elif entry.action == "response":
if len(current_sequence) >= 2:
tool_sequences.append(current_sequence)
current_sequence = []
if len(current_sequence) >= 2:
tool_sequences.append(current_sequence)
# Find repeated sequences (simplified — look for exact matches)
from collections import Counter
seq_counter = Counter(tuple(s) for s in tool_sequences)
workflows = [
{"tools": list(seq), "count": count}
for seq, count in seq_counter.items()
if count >= 5 # GAP-113: Must appear at least 5 times (plan threshold)
]
return workflows
def _share_with_coworkers(self, canonical_memories: list[str]) -> None:
"""Share general canonical memories with coworker agents as read-only.
GAP-54: Only receives memories already filtered to global scope.
GAP-101: Tags shared memories with read_only=True so they are protected.
"""
coworkers = getattr(self.agent, "_resolved_coworkers", [])
if not coworkers:
return
from crewai.new_agent.new_agent import NewAgent
for cw in coworkers:
if not isinstance(cw, NewAgent):
continue
cw_memory = getattr(cw, "_memory_instance", None)
if cw_memory is None:
continue
for canonical in canonical_memories:
try:
cw_memory.remember(
f"[shared:read-only][shared from {self.agent.role}] {canonical}",
agent_role=cw.role,
importance=0.7,
metadata={
"type": "canonical_shared",
"source_agent": self.agent.role,
"read_only": True,
},
)
except TypeError:
# Fallback if remember() doesn't accept metadata kwarg
try:
cw_memory.remember(
f"[shared:read-only][shared from {self.agent.role}] {canonical}",
agent_role=cw.role,
importance=0.7,
)
except Exception:
pass
except Exception:
pass
def _propose_workflow(self, workflow: dict[str, Any]) -> None:
"""GAP-80: Add workflow to pending list and emit proposal event.
Does NOT auto-save. The workflow stays pending until the user
confirms via ``confirm_workflow()`` or rejects via ``reject_workflow()``.
"""
tools = workflow.get("tools", [])
count = workflow.get("count", 0)
description = (
f"Detected repeated pattern ({count}x): {''.join(tools)}. "
f"This could be crystallized into an automated workflow."
)
workflow["description"] = description
self._pending_workflows.append(workflow)
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentWorkflowProposedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentWorkflowProposedEvent(
new_agent_id=str(self.agent.id),
workflow_description=description,
),
)
except Exception:
pass
# ── GAP-80: User confirmation flow for workflows ─────────────
def get_pending_workflows(self) -> list[dict[str, Any]]:
"""Return the list of workflows awaiting user confirmation."""
return list(self._pending_workflows)
def confirm_workflow(self, index: int) -> dict[str, Any] | None:
"""Confirm a pending workflow, saving it as a recipe and Flow code.
Returns the confirmed workflow dict, or None if the index is invalid.
"""
if index < 0 or index >= len(self._pending_workflows):
return None
workflow = self._pending_workflows.pop(index)
self._save_flow_recipe(workflow)
# GAP-81: Also generate executable Flow code
self._generate_flow_code(workflow)
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentWorkflowConfirmedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentWorkflowConfirmedEvent(new_agent_id=str(self.agent.id)),
)
except Exception:
pass
return workflow
def reject_workflow(self, index: int) -> dict[str, Any] | None:
"""Reject a pending workflow, removing it from the pending list.
Returns the rejected workflow dict, or None if the index is invalid.
"""
if index < 0 or index >= len(self._pending_workflows):
return None
return self._pending_workflows.pop(index)
def _emit_dreaming_started(self) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentDreamingStartedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentDreamingStartedEvent(new_agent_id=str(self.agent.id)),
)
except Exception:
pass
def _emit_workflow_detected(self, workflow: dict[str, Any]) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentWorkflowDetectedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentWorkflowDetectedEvent(
new_agent_id=str(self.agent.id),
tools=workflow.get("tools", []),
count=workflow.get("count", 0),
),
)
except Exception:
pass
def _emit_dreaming_completed(self, result: dict[str, Any]) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentDreamingCompletedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentDreamingCompletedEvent(
new_agent_id=str(self.agent.id),
memories_processed=result.get("memories_processed", 0),
canonical_created=result.get("canonical_created", 0),
workflows_detected=result.get("workflows_detected", 0),
),
)
except Exception:
pass

View File

@@ -1,625 +0,0 @@
"""Event listeners for the NewAgent system — bridges events to telemetry.
GAP-47: Uses a module-level registry to look up telemetry instances by agent ID.
GAP-61: Registers handlers for ALL event types defined in events.py.
"""
from __future__ import annotations
import logging
from typing import Any
logger = logging.getLogger(__name__)
def _get_tel(agent_id: str) -> Any:
"""Look up the telemetry instance for *agent_id* via the registry.
Returns None (graceful degradation) if the agent is not registered.
"""
try:
from crewai.new_agent.telemetry import get_telemetry_for_agent
return get_telemetry_for_agent(agent_id)
except Exception:
return None
def register_new_agent_listeners() -> None:
"""Register all NewAgent event listeners on the crewai event bus."""
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import (
NewAgentContextSummarizedEvent,
NewAgentConversationResetEvent,
NewAgentConversationStartedEvent,
NewAgentDelegationCompletedEvent,
NewAgentDelegationFailedEvent,
NewAgentDelegationStartedEvent,
NewAgentDreamingCompletedEvent,
NewAgentDreamingStartedEvent,
NewAgentExplainRequestedEvent,
NewAgentFireAndForgetCompletedEvent,
NewAgentFireAndForgetDispatchedEvent,
NewAgentGuardrailPassedEvent,
NewAgentGuardrailRejectedEvent,
NewAgentKnowledgeConfirmedEvent,
NewAgentKnowledgeQueryEvent,
NewAgentKnowledgeRejectedEvent,
NewAgentKnowledgeSuggestedEvent,
NewAgentLLMCallCompletedEvent,
NewAgentLLMCallFailedEvent,
NewAgentLLMCallStartedEvent,
NewAgentMemoryRecallEvent,
NewAgentMemorySaveEvent,
NewAgentMessageReceivedEvent,
NewAgentMessageSentEvent,
NewAgentNarrationGuardTriggeredEvent,
NewAgentPlanningCompletedEvent,
NewAgentPlanningStartedEvent,
NewAgentSpawnCompletedEvent,
NewAgentSpawnFailedEvent,
NewAgentSpawnStartedEvent,
NewAgentStatusUpdateEvent,
NewAgentToolUsageCompletedEvent,
NewAgentToolUsageFailedEvent,
NewAgentToolUsageStartedEvent,
NewAgentWorkflowConfirmedEvent,
NewAgentWorkflowDetectedEvent,
NewAgentWorkflowProposedEvent,
)
# ── Conversation ──────────────────────────────────────────
@crewai_event_bus.on(NewAgentConversationStartedEvent)
def _on_conversation_started(
source: Any, event: NewAgentConversationStartedEvent
) -> None:
logger.debug("NewAgent %s conversation started", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.agent_created(
agent_id=event.new_agent_id,
role=event.new_agent_role,
goal="",
llm="",
)
@crewai_event_bus.on(NewAgentConversationResetEvent)
def _on_conversation_reset(
source: Any, event: NewAgentConversationResetEvent
) -> None:
logger.debug("NewAgent %s conversation reset", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.conversation_reset(agent_id=event.new_agent_id)
# ── Messages ──────────────────────────────────────────────
@crewai_event_bus.on(NewAgentMessageReceivedEvent)
def _on_message_received(
source: Any, event: NewAgentMessageReceivedEvent
) -> None:
logger.debug(
"NewAgent %s received message (%d chars)",
event.new_agent_id,
event.message_length,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.message_received(
agent_id=event.new_agent_id, message_length=event.message_length
)
@crewai_event_bus.on(NewAgentMessageSentEvent)
def _on_message_sent(source: Any, event: NewAgentMessageSentEvent) -> None:
logger.debug(
"NewAgent %s sent message: %d in / %d out tokens",
event.new_agent_role,
event.input_tokens,
event.output_tokens,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.message_sent(
agent_id=event.new_agent_id,
input_tokens=event.input_tokens,
output_tokens=event.output_tokens,
response_time_ms=event.response_time_ms,
)
# ── LLM Calls ────────────────────────────────────────────
@crewai_event_bus.on(NewAgentLLMCallStartedEvent)
def _on_llm_call_started(
source: Any, event: NewAgentLLMCallStartedEvent
) -> None:
logger.debug(
"NewAgent %s LLM call started (model=%s)",
event.new_agent_id,
event.model,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.llm_call_started(agent_id=event.new_agent_id, model=event.model)
@crewai_event_bus.on(NewAgentLLMCallCompletedEvent)
def _on_llm_call_completed(
source: Any, event: NewAgentLLMCallCompletedEvent
) -> None:
logger.debug(
"NewAgent %s LLM call completed: %d in / %d out tokens in %dms",
event.new_agent_id,
event.input_tokens,
event.output_tokens,
event.response_time_ms,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.llm_call_completed(
agent_id=event.new_agent_id,
model=event.model,
input_tokens=event.input_tokens,
output_tokens=event.output_tokens,
response_time_ms=event.response_time_ms,
)
@crewai_event_bus.on(NewAgentLLMCallFailedEvent)
def _on_llm_call_failed(source: Any, event: NewAgentLLMCallFailedEvent) -> None:
logger.warning(
"NewAgent %s LLM call failed: %s", event.new_agent_id, event.error
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.llm_call_failed(agent_id=event.new_agent_id, error=event.error)
# ── Tool Usage ────────────────────────────────────────────
@crewai_event_bus.on(NewAgentToolUsageStartedEvent)
def _on_tool_started(source: Any, event: NewAgentToolUsageStartedEvent) -> None:
logger.debug(
"NewAgent %s using tool: %s", event.new_agent_id, event.tool_name
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.tool_usage_started(
agent_id=event.new_agent_id, tool_name=event.tool_name
)
@crewai_event_bus.on(NewAgentToolUsageCompletedEvent)
def _on_tool_completed(
source: Any, event: NewAgentToolUsageCompletedEvent
) -> None:
logger.debug(
"NewAgent %s tool completed: %s", event.new_agent_id, event.tool_name
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.tool_usage_completed_event(
agent_id=event.new_agent_id, tool_name=event.tool_name
)
@crewai_event_bus.on(NewAgentToolUsageFailedEvent)
def _on_tool_failed(source: Any, event: NewAgentToolUsageFailedEvent) -> None:
logger.warning(
"NewAgent %s tool %s failed: %s",
event.new_agent_id,
event.tool_name,
event.error,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.tool_usage_failed(
agent_id=event.new_agent_id,
tool_name=event.tool_name,
error=event.error,
)
# ── Delegation ────────────────────────────────────────────
@crewai_event_bus.on(NewAgentDelegationStartedEvent)
def _on_delegation_started(
source: Any, event: NewAgentDelegationStartedEvent
) -> None:
logger.debug(
"NewAgent %s delegation started to %s",
event.new_agent_id,
event.coworker_role,
)
tel = _get_tel(event.new_agent_id)
if tel:
span = tel.delegation(
agent_id=event.new_agent_id,
coworker_role=event.coworker_role,
mode=event.delegation_mode,
source=event.coworker_source,
)
key = tel._span_key(
event.new_agent_id, "delegation", event.coworker_role
)
tel.store_span(key, span)
@crewai_event_bus.on(NewAgentDelegationCompletedEvent)
def _on_delegation_completed(
source: Any, event: NewAgentDelegationCompletedEvent
) -> None:
logger.debug(
"NewAgent %s delegation to %s completed (%d tokens, %dms)",
event.new_agent_id,
event.coworker_role,
event.tokens_consumed,
event.response_time_ms,
)
tel = _get_tel(event.new_agent_id)
if tel:
key = tel._span_key(
event.new_agent_id, "delegation", event.coworker_role
)
span = tel.retrieve_span(key)
tel.delegation_completed(
span,
tokens_consumed=event.tokens_consumed,
response_time_ms=event.response_time_ms,
)
@crewai_event_bus.on(NewAgentDelegationFailedEvent)
def _on_delegation_failed(
source: Any, event: NewAgentDelegationFailedEvent
) -> None:
logger.warning(
"NewAgent %s delegation to %s failed: %s",
event.new_agent_id,
event.coworker_role,
event.error,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.delegation_failed(
agent_id=event.new_agent_id,
coworker_role=event.coworker_role,
error=event.error,
)
@crewai_event_bus.on(NewAgentFireAndForgetDispatchedEvent)
def _on_fire_and_forget_dispatched(
source: Any, event: NewAgentFireAndForgetDispatchedEvent
) -> None:
logger.debug(
"NewAgent %s fire-and-forget to %s",
event.new_agent_id,
event.coworker_role,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.fire_and_forget_dispatched(
agent_id=event.new_agent_id, coworker_role=event.coworker_role
)
@crewai_event_bus.on(NewAgentFireAndForgetCompletedEvent)
def _on_fire_and_forget_completed(
source: Any, event: NewAgentFireAndForgetCompletedEvent
) -> None:
logger.debug(
"NewAgent %s fire-and-forget to %s completed",
event.new_agent_id,
event.coworker_role,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.fire_and_forget_completed(
agent_id=event.new_agent_id, coworker_role=event.coworker_role
)
# ── Memory ────────────────────────────────────────────────
@crewai_event_bus.on(NewAgentMemorySaveEvent)
def _on_memory_save(source: Any, event: NewAgentMemorySaveEvent) -> None:
logger.debug("NewAgent %s memory save", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.memory_save(agent_id=event.new_agent_id)
@crewai_event_bus.on(NewAgentMemoryRecallEvent)
def _on_memory_recall(source: Any, event: NewAgentMemoryRecallEvent) -> None:
logger.debug(
"NewAgent %s memory recall (%d results)",
event.new_agent_id,
event.results_count,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.memory_recall(
agent_id=event.new_agent_id, results_count=event.results_count
)
# ── Dreaming ──────────────────────────────────────────────
@crewai_event_bus.on(NewAgentDreamingStartedEvent)
def _on_dreaming_started(
source: Any, event: NewAgentDreamingStartedEvent
) -> None:
logger.debug("NewAgent %s dreaming started", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
span = tel.dreaming(agent_id=event.new_agent_id)
key = tel._span_key(event.new_agent_id, "dreaming")
tel.store_span(key, span)
@crewai_event_bus.on(NewAgentDreamingCompletedEvent)
def _on_dreaming_completed(
source: Any, event: NewAgentDreamingCompletedEvent
) -> None:
logger.debug(
"NewAgent %s dreaming: %d processed, %d canonical, %d workflows",
event.new_agent_id,
event.memories_processed,
event.canonical_created,
event.workflows_detected,
)
tel = _get_tel(event.new_agent_id)
if tel:
key = tel._span_key(event.new_agent_id, "dreaming")
span = tel.retrieve_span(key)
tel.dreaming_completed(
span,
memories_processed=event.memories_processed,
canonical_created=event.canonical_created,
)
# ── Planning ──────────────────────────────────────────────
@crewai_event_bus.on(NewAgentPlanningStartedEvent)
def _on_planning_started(
source: Any, event: NewAgentPlanningStartedEvent
) -> None:
logger.debug("NewAgent %s planning started", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
span = tel.planning(agent_id=event.new_agent_id)
key = tel._span_key(event.new_agent_id, "planning")
tel.store_span(key, span)
@crewai_event_bus.on(NewAgentPlanningCompletedEvent)
def _on_planning_completed(
source: Any, event: NewAgentPlanningCompletedEvent
) -> None:
logger.debug(
"NewAgent %s planned %d steps",
event.new_agent_id,
event.plan_steps_count,
)
tel = _get_tel(event.new_agent_id)
if tel:
key = tel._span_key(event.new_agent_id, "planning")
span = tel.retrieve_span(key)
tel.planning_completed(span, steps_count=event.plan_steps_count)
# ── Guardrails ────────────────────────────────────────────
@crewai_event_bus.on(NewAgentGuardrailPassedEvent)
def _on_guardrail_passed(
source: Any, event: NewAgentGuardrailPassedEvent
) -> None:
logger.debug(
"NewAgent %s guardrail passed (%s)",
event.new_agent_id,
event.guardrail_type,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.guardrail_passed(
agent_id=event.new_agent_id, guardrail_type=event.guardrail_type
)
@crewai_event_bus.on(NewAgentGuardrailRejectedEvent)
def _on_guardrail_rejected(
source: Any, event: NewAgentGuardrailRejectedEvent
) -> None:
logger.warning(
"NewAgent %s guardrail rejected (%s) after %d retries",
event.new_agent_id,
event.guardrail_type,
event.retries,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.guardrail(
agent_id=event.new_agent_id, guardrail_type=event.guardrail_type
)
# ── Knowledge ─────────────────────────────────────────────
@crewai_event_bus.on(NewAgentKnowledgeQueryEvent)
def _on_knowledge_query(
source: Any, event: NewAgentKnowledgeQueryEvent
) -> None:
logger.debug("NewAgent %s knowledge query", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.knowledge_query(agent_id=event.new_agent_id)
@crewai_event_bus.on(NewAgentKnowledgeSuggestedEvent)
def _on_knowledge_suggested(
source: Any, event: NewAgentKnowledgeSuggestedEvent
) -> None:
logger.debug(
"NewAgent %s knowledge suggested (type=%s)",
event.new_agent_id,
event.source_type,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.knowledge_suggested(
agent_id=event.new_agent_id, source_type=event.source_type
)
@crewai_event_bus.on(NewAgentKnowledgeConfirmedEvent)
def _on_knowledge_confirmed(
source: Any, event: NewAgentKnowledgeConfirmedEvent
) -> None:
logger.debug(
"NewAgent %s knowledge confirmed (type=%s)",
event.new_agent_id,
event.source_type,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.knowledge_confirmed(
agent_id=event.new_agent_id, source_type=event.source_type
)
@crewai_event_bus.on(NewAgentKnowledgeRejectedEvent)
def _on_knowledge_rejected(
source: Any, event: NewAgentKnowledgeRejectedEvent
) -> None:
logger.debug("NewAgent %s knowledge rejected", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.knowledge_rejected(agent_id=event.new_agent_id)
# ── Explain ───────────────────────────────────────────────
@crewai_event_bus.on(NewAgentExplainRequestedEvent)
def _on_explain_requested(
source: Any, event: NewAgentExplainRequestedEvent
) -> None:
logger.debug("NewAgent %s explain requested", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.explain_requested(agent_id=event.new_agent_id)
# ── Spawn ─────────────────────────────────────────────────
@crewai_event_bus.on(NewAgentSpawnStartedEvent)
def _on_spawn_started(source: Any, event: NewAgentSpawnStartedEvent) -> None:
logger.debug(
"NewAgent %s spawn started (id=%s, depth=%d)",
event.new_agent_id,
event.spawn_id,
event.spawn_depth,
)
tel = _get_tel(event.new_agent_id)
if tel:
span = tel.spawn(
agent_id=event.new_agent_id,
spawn_id=event.spawn_id,
depth=event.spawn_depth,
)
key = tel._span_key(event.new_agent_id, "spawn", event.spawn_id)
tel.store_span(key, span)
@crewai_event_bus.on(NewAgentSpawnCompletedEvent)
def _on_spawn_completed(
source: Any, event: NewAgentSpawnCompletedEvent
) -> None:
logger.debug(
"NewAgent %s spawn completed (id=%s)",
event.new_agent_id,
event.spawn_id,
)
tel = _get_tel(event.new_agent_id)
if tel:
key = tel._span_key(event.new_agent_id, "spawn", event.spawn_id)
span = tel.retrieve_span(key)
if span:
tel.spawn_completed(span)
else:
tel.spawn_completed_event(
agent_id=event.new_agent_id, spawn_id=event.spawn_id
)
@crewai_event_bus.on(NewAgentSpawnFailedEvent)
def _on_spawn_failed(source: Any, event: NewAgentSpawnFailedEvent) -> None:
logger.warning(
"NewAgent %s spawn failed (id=%s): %s",
event.new_agent_id,
event.spawn_id,
event.error,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.spawn_failed(
agent_id=event.new_agent_id,
spawn_id=event.spawn_id,
error=event.error,
)
# ── Narration ─────────────────────────────────────────────
@crewai_event_bus.on(NewAgentNarrationGuardTriggeredEvent)
def _on_narration_guard(
source: Any, event: NewAgentNarrationGuardTriggeredEvent
) -> None:
logger.debug(
"NewAgent %s narration guard triggered (%d retries)",
event.new_agent_id,
event.retries,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.narration_guard_triggered(
agent_id=event.new_agent_id, retries=event.retries
)
# ── Context ───────────────────────────────────────────────
@crewai_event_bus.on(NewAgentContextSummarizedEvent)
def _on_context_summarized(
source: Any, event: NewAgentContextSummarizedEvent
) -> None:
logger.debug("NewAgent %s context summarized", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.context_summarized(agent_id=event.new_agent_id)
# ── Status Updates ────────────────────────────────────────
@crewai_event_bus.on(NewAgentStatusUpdateEvent)
def _on_status_update(source: Any, event: NewAgentStatusUpdateEvent) -> None:
logger.debug(
"NewAgent status update: %s (%s)", event.state, event.detail or ""
)
# ── Workflow Events ───────────────────────────────────────
@crewai_event_bus.on(NewAgentWorkflowDetectedEvent)
def _on_workflow_detected(
source: Any, event: NewAgentWorkflowDetectedEvent
) -> None:
logger.debug(
"NewAgent %s workflow detected: %s (%dx)",
event.new_agent_id,
event.tools,
event.count,
)
tel = _get_tel(event.new_agent_id)
if tel:
tel.workflow_detected(
agent_id=event.new_agent_id, tools=event.tools, count=event.count
)
@crewai_event_bus.on(NewAgentWorkflowProposedEvent)
def _on_workflow_proposed(
source: Any, event: NewAgentWorkflowProposedEvent
) -> None:
logger.debug("NewAgent %s workflow proposed", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.workflow_proposed(
agent_id=event.new_agent_id, description=event.workflow_description
)
@crewai_event_bus.on(NewAgentWorkflowConfirmedEvent)
def _on_workflow_confirmed(
source: Any, event: NewAgentWorkflowConfirmedEvent
) -> None:
logger.debug("NewAgent %s workflow confirmed", event.new_agent_id)
tel = _get_tel(event.new_agent_id)
if tel:
tel.workflow_confirmed(agent_id=event.new_agent_id)
logger.debug("NewAgent event listeners registered (all event types)")
except Exception as e:
logger.debug("Failed to register NewAgent event listeners: %s", e)

View File

@@ -1,287 +0,0 @@
"""Event types for the NewAgent system."""
from __future__ import annotations
from crewai.events.base_events import BaseEvent
class NewAgentCreatedEvent(BaseEvent):
"""Emitted when a NewAgent instance is constructed."""
type: str = "new_agent_created"
new_agent_id: str = ""
new_agent_role: str = ""
class NewAgentConversationStartedEvent(BaseEvent):
type: str = "new_agent_conversation_started"
conversation_id: str = ""
new_agent_id: str = ""
new_agent_role: str = ""
class NewAgentConversationResetEvent(BaseEvent):
type: str = "new_agent_conversation_reset"
conversation_id: str = ""
new_agent_id: str = ""
class NewAgentMessageReceivedEvent(BaseEvent):
type: str = "new_agent_message_received"
conversation_id: str = ""
new_agent_id: str = ""
message_length: int = 0
class NewAgentMessageSentEvent(BaseEvent):
type: str = "new_agent_message_sent"
conversation_id: str = ""
new_agent_id: str = ""
new_agent_role: str = ""
input_tokens: int = 0
output_tokens: int = 0
response_time_ms: int = 0
model: str = ""
class NewAgentStatusUpdateEvent(BaseEvent):
type: str = "new_agent_status_update"
state: str = ""
detail: str | None = None
input_tokens: int = 0
output_tokens: int = 0
elapsed_ms: int = 0
new_agent_id: str = ""
class NewAgentLLMCallStartedEvent(BaseEvent):
type: str = "new_agent_llm_call_started"
new_agent_id: str = ""
model: str = ""
class NewAgentLLMCallCompletedEvent(BaseEvent):
type: str = "new_agent_llm_call_completed"
new_agent_id: str = ""
model: str = ""
input_tokens: int = 0
output_tokens: int = 0
response_time_ms: int = 0
class NewAgentLLMCallFailedEvent(BaseEvent):
type: str = "new_agent_llm_call_failed"
new_agent_id: str = ""
error: str = ""
class NewAgentToolUsageStartedEvent(BaseEvent):
type: str = "new_agent_tool_usage_started"
new_agent_id: str = ""
tool_name: str = ""
class NewAgentToolUsageCompletedEvent(BaseEvent):
type: str = "new_agent_tool_usage_completed"
new_agent_id: str = ""
tool_name: str = ""
class NewAgentToolUsageFailedEvent(BaseEvent):
type: str = "new_agent_tool_usage_failed"
new_agent_id: str = ""
tool_name: str = ""
error: str = ""
class NewAgentDelegationStartedEvent(BaseEvent):
type: str = "new_agent_delegation_started"
new_agent_id: str = ""
coworker_role: str = ""
delegation_mode: str = "sync"
coworker_source: str = "local"
class NewAgentDelegationCompletedEvent(BaseEvent):
type: str = "new_agent_delegation_completed"
new_agent_id: str = ""
coworker_role: str = ""
tokens_consumed: int = 0
response_time_ms: int = 0
class NewAgentDelegationFailedEvent(BaseEvent):
type: str = "new_agent_delegation_failed"
new_agent_id: str = ""
coworker_role: str = ""
error: str = ""
class NewAgentFireAndForgetDispatchedEvent(BaseEvent):
type: str = "new_agent_fire_and_forget_dispatched"
new_agent_id: str = ""
coworker_role: str = ""
class NewAgentMemorySaveEvent(BaseEvent):
type: str = "new_agent_memory_save"
new_agent_id: str = ""
scope: str = ""
class NewAgentMemoryRecallEvent(BaseEvent):
type: str = "new_agent_memory_recall"
new_agent_id: str = ""
scope: str = ""
results_count: int = 0
class NewAgentDreamingStartedEvent(BaseEvent):
type: str = "new_agent_dreaming_started"
new_agent_id: str = ""
class NewAgentDreamingCompletedEvent(BaseEvent):
type: str = "new_agent_dreaming_completed"
new_agent_id: str = ""
memories_processed: int = 0
canonical_created: int = 0
workflows_detected: int = 0
class NewAgentPlanningStartedEvent(BaseEvent):
type: str = "new_agent_planning_started"
new_agent_id: str = ""
class NewAgentPlanningCompletedEvent(BaseEvent):
type: str = "new_agent_planning_completed"
new_agent_id: str = ""
plan_steps_count: int = 0
class NewAgentGuardrailPassedEvent(BaseEvent):
type: str = "new_agent_guardrail_passed"
new_agent_id: str = ""
guardrail_type: str = ""
class NewAgentGuardrailRejectedEvent(BaseEvent):
type: str = "new_agent_guardrail_rejected"
new_agent_id: str = ""
guardrail_type: str = ""
retries: int = 0
class NewAgentKnowledgeQueryEvent(BaseEvent):
type: str = "new_agent_knowledge_query"
new_agent_id: str = ""
class NewAgentKnowledgeSuggestedEvent(BaseEvent):
type: str = "new_agent_knowledge_suggested"
new_agent_id: str = ""
source_type: str = ""
class NewAgentExplainRequestedEvent(BaseEvent):
type: str = "new_agent_explain_requested"
new_agent_id: str = ""
class NewAgentSpawnStartedEvent(BaseEvent):
type: str = "new_agent_spawn_started"
new_agent_id: str = ""
spawn_id: str = ""
parent_id: str = ""
spawn_depth: int = 0
class NewAgentSpawnCompletedEvent(BaseEvent):
type: str = "new_agent_spawn_completed"
new_agent_id: str = ""
spawn_id: str = ""
class NewAgentSpawnFailedEvent(BaseEvent):
type: str = "new_agent_spawn_failed"
new_agent_id: str = ""
spawn_id: str = ""
error: str = ""
class NewAgentFireAndForgetCompletedEvent(BaseEvent):
type: str = "new_agent_fire_and_forget_completed"
new_agent_id: str = ""
coworker_role: str = ""
class NewAgentContextSummarizedEvent(BaseEvent):
type: str = "new_agent_context_summarized"
new_agent_id: str = ""
class NewAgentNarrationGuardTriggeredEvent(BaseEvent):
type: str = "new_agent_narration_guard_triggered"
new_agent_id: str = ""
retries: int = 0
class NewAgentWorkflowDetectedEvent(BaseEvent):
type: str = "new_agent_workflow_detected"
new_agent_id: str = ""
tools: list[str] = []
count: int = 0
class NewAgentWorkflowProposedEvent(BaseEvent):
type: str = "new_agent_workflow_proposed"
new_agent_id: str = ""
workflow_description: str = ""
class NewAgentWorkflowConfirmedEvent(BaseEvent):
type: str = "new_agent_workflow_confirmed"
new_agent_id: str = ""
class NewAgentKnowledgeConfirmedEvent(BaseEvent):
type: str = "new_agent_knowledge_confirmed"
new_agent_id: str = ""
source_type: str = ""
class NewAgentKnowledgeRejectedEvent(BaseEvent):
type: str = "new_agent_knowledge_rejected"
new_agent_id: str = ""
class NewAgentSkillSuggestedEvent(BaseEvent):
type: str = "new_agent_skill_suggested"
new_agent_id: str = ""
skill_name: str = ""
source_type: str = ""
class NewAgentSkillConfirmedEvent(BaseEvent):
type: str = "new_agent_skill_confirmed"
new_agent_id: str = ""
skill_name: str = ""
class NewAgentSkillRejectedEvent(BaseEvent):
type: str = "new_agent_skill_rejected"
new_agent_id: str = ""
skill_name: str = ""
class NewAgentTokenUsageEvent(BaseEvent):
"""Emitted when token usage is recorded, for platform billing."""
type: str = "new_agent_token_usage"
new_agent_id: str = ""
conversation_id: str = ""
action: str = ""
input_tokens: int = 0
output_tokens: int = 0
model: str = ""

File diff suppressed because it is too large Load Diff

View File

@@ -1,210 +0,0 @@
"""Knowledge Discovery — detect and suggest reusable knowledge for NewAgent."""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from crewai.new_agent.new_agent import NewAgent
logger = logging.getLogger(__name__)
class KnowledgeDiscovery:
"""Identifies valuable information during conversations and suggests
creating knowledge sources."""
def __init__(self, agent: NewAgent):
self.agent = agent
self._pending_suggestions: list[dict[str, Any]] = []
@property
def pending_suggestions(self) -> list[dict[str, Any]]:
return list(self._pending_suggestions)
def evaluate_for_knowledge(
self, tool_name: str, tool_result: str
) -> dict[str, Any] | None:
"""Evaluate a tool result for knowledge-worthiness.
Returns a suggestion dict if the result is worth saving, None otherwise.
"""
settings = getattr(self.agent.settings, "can_create_knowledge", True)
if not settings:
return None
# Heuristic: results from search/scrape/read tools are often knowledge-worthy
if len(tool_result) < 50:
return None
knowledge_tools = {
"search_web",
"scrape_url",
"read_file",
"search",
"web_search",
"read_website",
"scrape",
"fetch_url",
"search_knowledge",
"query_database",
"read_document",
}
if tool_name.lower() not in knowledge_tools:
return None
# Extract a title from the first line or first sentence
first_line = tool_result.split("\n", 1)[0].strip()
if not first_line:
first_line = tool_result[:100].strip()
# Use first sentence if first line is very long
if len(first_line) > 120:
dot_pos = first_line.find(".")
if dot_pos > 0:
first_line = first_line[: dot_pos + 1]
else:
first_line = first_line[:100] + "..."
title = f"{tool_name}: {first_line}" if first_line else tool_name
suggestion = {
"source_tool": tool_name,
"content": tool_result[:2000], # Truncate for suggestion
"title": title,
"status": "pending",
}
self._pending_suggestions.append(suggestion)
self._emit_suggestion_event(suggestion)
return suggestion
def build_suggestion_message(
self, suggestion: dict[str, Any]
) -> tuple[str, list[dict[str, Any]]]:
"""Return (conversational_text, actions) for a pending suggestion."""
title = suggestion.get("title", "Untitled")
content = suggestion.get("content", "")
preview = content[:300] + ("..." if len(content) > 300 else "")
text = (
f"I found potentially useful information: **{title}**\n\n"
f"```\n{preview}\n```\n\n"
f"Would you like me to save this as a knowledge source? "
f"You can say yes, no, or ask me to modify it first."
)
from crewai.new_agent.models import MessageAction
actions = [
MessageAction(
action_id=f"knowledge-confirm-{title[:40]}",
label="Approve",
action_type="suggestion_confirm",
payload={"type": "knowledge", "title": title},
),
MessageAction(
action_id=f"knowledge-reject-{title[:40]}",
label="Dismiss",
action_type="suggestion_reject",
payload={"type": "knowledge", "title": title},
),
]
return text, [a.model_dump() for a in actions]
def handle_suggestion_response(self, user_text: str) -> dict[str, Any] | None:
"""Interpret a plain-text user response to a pending suggestion."""
if not self._pending_suggestions:
return None
from crewai.new_agent.skill_builder import _detect_suggestion_intent
intent = _detect_suggestion_intent(user_text)
if intent == "confirm":
suggestion = self._pending_suggestions[0]
title = suggestion.get("title", "Untitled")
if self.confirm_suggestion(0):
self._pending_suggestions.pop(0)
return {"action": "confirmed", "title": title}
return {"action": "error", "title": title}
if intent == "reject":
suggestion = self._pending_suggestions[0]
title = suggestion.get("title", "Untitled")
self.reject_suggestion(0)
self._pending_suggestions.pop(0)
return {"action": "rejected", "title": title}
return {"action": "ignored"}
def confirm_suggestion(self, index: int) -> bool:
"""Confirm a knowledge suggestion and create the knowledge source."""
if index < 0 or index >= len(self._pending_suggestions):
return False
suggestion = self._pending_suggestions[index]
suggestion["status"] = "confirmed"
try:
from crewai.knowledge.source.string_knowledge_source import (
StringKnowledgeSource,
)
source = StringKnowledgeSource(content=suggestion["content"])
if self.agent.knowledge is not None:
self.agent.knowledge.sources.append(source)
else:
self.agent.knowledge_sources.append(source)
self._emit_confirmed_event()
return True
except Exception as e:
logger.debug(f"Failed to create knowledge source: {e}")
return False
def reject_suggestion(self, index: int) -> None:
"""Reject a knowledge suggestion."""
if 0 <= index < len(self._pending_suggestions):
self._pending_suggestions[index]["status"] = "rejected"
self._emit_rejected_event()
def _emit_suggestion_event(self, suggestion: dict[str, Any]) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentKnowledgeSuggestedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentKnowledgeSuggestedEvent(
new_agent_id=str(self.agent.id),
source_type=suggestion.get("source_tool", ""),
),
)
except Exception:
pass
def _emit_confirmed_event(self) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentKnowledgeConfirmedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentKnowledgeConfirmedEvent(new_agent_id=str(self.agent.id)),
)
except Exception:
pass
def _emit_rejected_event(self) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentKnowledgeRejectedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentKnowledgeRejectedEvent(new_agent_id=str(self.agent.id)),
)
except Exception:
pass

View File

@@ -1,174 +0,0 @@
"""Core data models for the NewAgent system."""
from __future__ import annotations
from datetime import datetime, timezone
from typing import Any
from uuid import uuid4
from pydantic import BaseModel, Field
class Artifact(BaseModel):
"""An artifact attached to a message (file, image, structured data, etc.)."""
type: str # "file" | "image" | "json" | "code" | "url"
name: str = ""
content: str = ""
mime_type: str = ""
metadata: dict[str, Any] = Field(default_factory=dict)
class MessageAction(BaseModel):
"""A structured action attached to a message.
Plain-text providers (CLI) ignore these — the user responds
conversationally. Rich providers (Slack, Teams, Web) render them
as buttons, cards, or interactive components.
"""
action_id: str
label: str
action_type: str # "suggestion_confirm" | "suggestion_reject" | "suggestion_edit"
payload: dict[str, Any] = Field(default_factory=dict)
class Message(BaseModel):
"""A single message in a conversation."""
id: str = Field(default_factory=lambda: uuid4().hex)
conversation_id: str = ""
role: str # "user" | "agent" | "coworker" | "system"
content: str
sender: str | None = None
artifacts: list[Artifact] | None = None
actions: list[MessageAction] | None = None
timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
model: str | None = None
input_tokens: int | None = None
output_tokens: int | None = None
cost: float | None = None
response_time_ms: int | None = None
tools_used: list[str] | None = None
delegations: list[str] | None = None
metadata: dict[str, Any] | None = None
class AgentSettings(BaseModel):
"""Opinionated agent settings with sensible defaults."""
memory_enabled: bool = True
memory_read_only: bool = False
reasoning_enabled: bool = True
self_improving: bool = True
dreaming_interval_hours: int = 24
dreaming_trigger_threshold: int = 10
dreaming_llm: str | Any | None = None
planning_enabled: bool = True
auto_plan: bool = True
can_spawn_copies: bool = False
max_spawn_depth: int = 1
max_concurrent_spawns: int = 4
spawn_timeout: int = 600
can_create_knowledge: bool = True
can_build_skills: bool = True
can_schedule: bool = False
provenance_enabled: bool = True
provenance_detail: str = "standard"
share_data: bool = False
narration_guard: bool = False
narration_max_retries: int = 2
respect_context_window: bool = True
cache_tool_results: bool = True
max_retry_limit: int = 2
max_history_messages: int | None = None
class AgentStatus(BaseModel):
"""Ephemeral status update emitted while the agent works."""
state: str # "thinking" | "using_tool" | "delegating" | "planning" | "recalling" | "dreaming"
detail: str | None = None
tool_name: str | None = None
coworker: str | None = None
progress: float | None = None
elapsed_ms: int = 0
input_tokens: int = 0
output_tokens: int = 0
class PromptLayer(BaseModel):
"""A single layer in the prompt stack."""
name: str
content: str
source: str = ""
class PromptStack(BaseModel):
"""Structured system prompt assembly."""
layers: list[PromptLayer] = Field(default_factory=list)
def assemble(self) -> str:
return "\n\n".join(layer.content for layer in self.layers if layer.content)
def add(self, name: str, content: str, source: str = "") -> None:
self.layers.append(PromptLayer(name=name, content=content, source=source))
class ProvenanceEntry(BaseModel):
"""A single decision trace entry."""
id: str = Field(default_factory=lambda: uuid4().hex)
timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
conversation_id: str = ""
action: str # "tool_call" | "delegation" | "response" | "knowledge_query"
reasoning: str = ""
inputs: dict[str, Any] | None = None
outcome: str | None = None
confidence: float | None = None
sources: list[str] | None = None
class TokenUsage(BaseModel):
"""Token consumption record for a single action."""
action: str # "message" | "delegation" | "tool_call" | "dreaming" | "planning" | "guardrail"
agent_id: str = ""
conversation_id: str = ""
input_tokens: int = 0
output_tokens: int = 0
model: str = ""
timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
delegation_target: str | None = None
tool_name: str | None = None
coworker_source: str | None = None
# ── GAP-45: Memory scoping types ────────────────────────────────
class MemoryScope(BaseModel):
"""Scoped memory namespace."""
namespace: str
shared: bool = False # If True, readable by coworkers
class MemorySlice(BaseModel):
"""Filtered view of memory."""
scope: str = ""
user_id: str | None = None
conversation_id: str | None = None
tags: list[str] = Field(default_factory=list)

View File

@@ -1,988 +0,0 @@
"""NewAgent — standalone, conversational, self-improving agent."""
from __future__ import annotations
from collections.abc import AsyncGenerator, Callable
import importlib.util
import logging
from pathlib import Path
import re
import threading
from typing import Any
from uuid import uuid4
from pydantic import BaseModel, Field, PrivateAttr, model_validator
from typing_extensions import Self
from crewai.new_agent.models import (
AgentSettings,
MemoryScope,
MemorySlice,
Message,
PromptStack,
ProvenanceEntry,
)
from crewai.new_agent.provider import DirectProvider
logger = logging.getLogger(__name__)
# ── GAP-56: Circular coworker guard ─────────────────────────────
_init_chain = threading.local()
def _get_init_chain() -> set[str]:
"""Return the thread-local set of agent IDs currently being initialized."""
if not hasattr(_init_chain, "agent_ids"):
_init_chain.agent_ids = set()
return _init_chain.agent_ids # type: ignore[no-any-return]
# ── GAP-63: Process-level AMP definition cache ──────────────────
_amp_cache: dict[str, dict[str, Any]] = {}
def clear_amp_cache() -> None:
"""Clear the process-level AMP coworker definition cache."""
_amp_cache.clear()
# ── GAP-24: Pronouns that trigger anaphora resolution ───────────
_ANAPHORA_PRONOUNS = re.compile(
r"\b(he|she|it|they|this|that|these|those)\b",
re.IGNORECASE,
)
class NewAgent(BaseModel):
"""Standalone conversational agent.
Replaces the Agent + Task + Crew pattern with a direct
message-based interface: message(), amessage(), stream().
"""
model_config = {"arbitrary_types_allowed": True}
# Identity
id: str = Field(default_factory=lambda: uuid4().hex)
role: str
goal: str
backstory: str = ""
# LLM
llm: str | Any | None = None
function_calling_llm: str | Any | None = None
# Capabilities
tools: list[Any] = Field(default_factory=list)
skills: list[Any] = Field(default_factory=list)
mcps: list[Any] = Field(default_factory=list)
apps: list[Any] = Field(default_factory=list)
# Collaboration
coworkers: list[Any] = Field(default_factory=list)
# Knowledge & Memory
knowledge: Any | None = None
knowledge_sources: list[Any] = Field(default_factory=list)
memory: bool | Any = True
# Settings
settings: AgentSettings = Field(default_factory=AgentSettings)
# Execution
max_iter: int = 25
max_tokens: int | None = None
max_execution_time: int | None = None
verbose: bool = False
# Guardrails
guardrail: Any | None = None
# Structured output
response_model: type[BaseModel] | None = None
# Self-construction from AMP repository
from_repository: str | None = None
# Security & A2A
security_config: Any | None = None
a2a: Any | None = None
# Hooks
on_message: Callable[..., Any] | None = Field(default=None, exclude=True)
on_delegate: Callable[..., Any] | None = Field(default=None, exclude=True)
on_complete: Callable[..., Any] | None = Field(default=None, exclude=True)
step_callback: Callable[..., Any] | None = Field(default=None, exclude=True)
# Provider (transport) — typed as Any to allow duck-typed providers and mocks.
# Implements the ConversationalProvider protocol from crewai.new_agent.provider.
provider: Any | None = Field(default=None, exclude=True)
# GAP-41: Manual memory scope override
memory_scope: str | None = None
# Private
_llm_instance: Any = PrivateAttr(default=None)
_memory_instance: Any = PrivateAttr(default=None)
_resolved_tools: list[Any] = PrivateAttr(default_factory=list)
_coworker_tools: list[Any] = PrivateAttr(default_factory=list)
_resolved_coworkers: list[Any] = PrivateAttr(default_factory=list)
# GAP-31: Concurrent conversation support — dict of executors keyed by conversation_id
_executors: dict[str, Any] = PrivateAttr(default_factory=dict)
_default_conversation_id: str = PrivateAttr(default_factory=lambda: uuid4().hex)
_dreaming_engine: Any = PrivateAttr(default=None)
_planning_engine: Any = PrivateAttr(default=None)
_knowledge_discovery: Any = PrivateAttr(default=None)
_skill_builder: Any = PrivateAttr(default=None)
_active_skills: list[Any] = PrivateAttr(default_factory=list)
_telemetry: Any = PrivateAttr(default=None)
_conversation_id: str = PrivateAttr(default_factory=lambda: uuid4().hex)
_logger: logging.Logger = PrivateAttr(
default_factory=lambda: logging.getLogger("crewai.new_agent")
)
# GAP-41/45: Memory namespace and filter from MemoryScope/MemorySlice
_memory_namespace: str | None = PrivateAttr(default=None)
_memory_shared: bool = PrivateAttr(default=False)
_memory_filter: Any = PrivateAttr(default=None)
# GAP-38: Stored A2A configuration
_a2a_config: Any = PrivateAttr(default=None)
# GAP-31: Provider instance for creating new executors
_provider: Any = PrivateAttr(default=None)
# GAP-86: Flag indicating agent was resolved from AMP repository
_amp_resolved: bool = PrivateAttr(default=False)
@model_validator(mode="before")
@classmethod
def _load_from_repository(cls, data: Any) -> Any:
if isinstance(data, dict) and data.get("from_repository"):
handle = data["from_repository"]
try:
from crewai.utilities.agent_utils import load_agent_from_repository
attrs = load_agent_from_repository(handle)
for key, val in attrs.items():
if key not in data or data[key] is None:
data[key] = val
except Exception:
pass
return data
@model_validator(mode="after")
def _setup(self) -> Self:
"""Initialize LLM, tools, coworkers, and executor."""
self._init_llm()
self._init_memory()
self._init_tools()
self._init_skills()
self._init_apps_warning()
self._init_security_a2a()
# GAP-56: Circular coworker guard
chain = _get_init_chain()
if self.id in chain:
# GAP-99: Log a clear warning when circular coworker reference is detected
logger.warning(
f"Circular coworker reference detected for agent '{self.role}' (id={self.id}). "
f"Skipping coworker initialization to prevent infinite recursion. "
f"Check your coworker configuration."
)
self._init_engines()
self._init_telemetry()
self._init_executor()
self._emit_created_event()
return self
chain.add(self.id)
try:
self._init_coworkers()
finally:
chain.discard(self.id)
self._init_engines()
self._init_telemetry()
self._init_executor()
self._emit_created_event()
return self
def _init_llm(self) -> None:
from crewai.utilities.llm_utils import create_llm
self._llm_instance = create_llm(self.llm)
if self._llm_instance is None:
self._llm_instance = create_llm(None)
def _init_memory(self) -> None:
"""Initialize memory if enabled.
GAP-45: Accepts MemoryScope and MemorySlice as memory field values.
GAP-41: Reads memory_scope from provider context or manual override.
"""
if not self.settings.memory_enabled:
self._memory_instance = None
return
if self.memory is False:
self._memory_instance = None
return
# GAP-45: Handle MemoryScope / MemorySlice types
if isinstance(self.memory, MemoryScope):
self._memory_namespace = self.memory.namespace
self._memory_shared = self.memory.shared
self._init_memory_instance()
elif isinstance(self.memory, MemorySlice):
self._memory_namespace = self.memory.scope or None
self._memory_filter = self.memory
self._init_memory_instance()
else:
try:
from crewai.memory.unified_memory import Memory
from crewai.memory.utils import sanitize_scope_name
if isinstance(self.memory, Memory):
self._memory_instance = self.memory
elif self.memory is True or self.memory is None:
agent_name = sanitize_scope_name(self.role or str(self.id))
self._memory_instance = Memory(root_scope=f"/agent/{agent_name}")
else:
self._memory_instance = self.memory
except Exception as e:
self._logger.warning(f"Memory initialization failed: {e}")
self._memory_instance = None
if self._memory_instance and self.settings.memory_read_only:
self._memory_instance.read_only = True
# GAP-41: Apply memory scope from provider or manual override
scope = self.memory_scope
if scope is None:
provider = self.provider
if provider is not None:
scope = getattr(provider, "memory_scope", None)
if scope:
self._memory_namespace = scope
def _init_memory_instance(self) -> None:
"""Create a Memory instance (used by MemoryScope/MemorySlice paths)."""
try:
from crewai.memory.unified_memory import Memory
from crewai.memory.utils import sanitize_scope_name
agent_name = sanitize_scope_name(self.role or str(self.id))
self._memory_instance = Memory(root_scope=f"/agent/{agent_name}")
except Exception as e:
self._logger.warning(f"Memory initialization failed: {e}")
self._memory_instance = None
def _init_tools(self) -> None:
"""Resolve tools from various sources."""
resolved: list[Any] = []
for tool in self.tools:
resolved.append(tool)
if self.mcps:
try:
from crewai.mcp.tool_resolver import MCPToolResolver
from crewai.utilities.logger import Logger as _CrewLogger
_mcp_logger = _CrewLogger(verbose=self.verbose)
resolver = MCPToolResolver(agent=self, logger=_mcp_logger)
mcp_tools = resolver.resolve(self.mcps)
resolved.extend(mcp_tools)
except Exception as e:
self._logger.warning(f"Failed to resolve MCP tools: {e}")
self._resolved_tools = resolved
if getattr(self.settings, "can_schedule", False):
try:
from crewai.new_agent.scheduler import ScheduleTaskTool
agent_name = getattr(self, "role", "") or str(self.id)
self._resolved_tools.append(ScheduleTaskTool(agent_name=agent_name))
except Exception:
pass
def _init_skills(self) -> None:
"""Resolve skills from Path objects into SKILL.md-based Skill instances,
falling back to Python module loading for backward compatibility."""
if not self.skills:
return
for skill in self.skills:
if isinstance(skill, (str, Path)):
skill_path = Path(skill)
if skill_path.is_dir() and (skill_path / "SKILL.md").exists():
try:
from crewai.skills.loader import activate_skill, discover_skills
discovered = discover_skills(skill_path.parent)
for s in discovered:
if s.name == skill_path.name:
activated = activate_skill(s)
self._active_skills.append(activated)
except Exception as e:
self._logger.warning(
f"Failed to load SKILL.md from {skill_path}: {e}"
)
else:
self._load_python_skill(skill_path)
elif hasattr(skill, "run") or hasattr(skill, "_run"):
self._resolved_tools.append(skill)
else:
try:
from crewai.skills.models import Skill as SkillModel
if isinstance(skill, SkillModel):
self._active_skills.append(skill)
except Exception:
pass
def _load_python_skill(self, skill_path: Path) -> None:
"""Load a Python module as tool instances (backward compatibility)."""
try:
spec = importlib.util.spec_from_file_location(
f"skill_{skill_path.stem}",
str(skill_path),
)
if spec is None or spec.loader is None:
self._logger.warning(f"Cannot load skill from {skill_path}")
return
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
for attr_name in dir(module):
attr = getattr(module, attr_name)
if (
isinstance(attr, type)
and attr_name != "BaseTool"
and hasattr(attr, "run")
):
try:
self._resolved_tools.append(attr())
except Exception:
pass
except Exception as e:
self._logger.warning(f"Failed to load skill from {skill_path}: {e}")
def _init_apps_warning(self) -> None:
"""GAP-36: Log a warning when apps are specified (platform-managed)."""
if self.apps:
self._logger.warning(
"Apps integration requires the CrewAI Platform. "
f"{len(self.apps)} app(s) configured but not resolved locally."
)
def _init_security_a2a(self) -> None:
"""GAP-38: Store security_config and a2a fields for later use."""
if self.security_config is not None:
self._logger.info(
f"Security configuration applied: {type(self.security_config).__name__}"
)
if self.a2a is not None:
self._a2a_config = self.a2a
self._logger.info(
"A2A server configured — agent will be accessible via A2A protocol"
)
def _init_coworkers(self) -> None:
"""Resolve coworker references into delegation tools."""
from crewai.new_agent.coworker_tools import build_coworker_tools
self._resolved_coworkers = []
self._coworker_tools = []
for cw in self.coworkers:
if isinstance(cw, NewAgent):
if cw.id == self.id or cw.role == self.role:
continue
self._resolved_coworkers.append(cw)
elif isinstance(cw, str):
try:
resolved = self._resolve_amp_coworker(cw)
self._resolved_coworkers.append(resolved)
except Exception as e:
self._logger.warning(f"Failed to resolve AMP coworker '{cw}': {e}")
elif isinstance(cw, dict):
# GAP-86: Support both plan format {"amp": "handle"} and legacy {"handle": "handle"}
handle = cw.get("amp") or cw.get("handle")
if handle:
overrides = {
k: v
for k, v in cw.items()
if k not in ("amp", "handle", "overrides")
}
overrides.update(cw.get("overrides", {}))
try:
resolved = self._resolve_amp_coworker(
handle,
overrides=overrides or None,
)
resolved._amp_resolved = True
self._resolved_coworkers.append(resolved)
except Exception as e:
self._logger.warning(
f"Failed to resolve AMP coworker '{handle}': {e}"
)
else:
self._resolved_coworkers.append(cw)
else:
self._resolved_coworkers.append(cw)
if self._resolved_coworkers:
self._coworker_tools = build_coworker_tools(
self._resolved_coworkers,
parent_role=self.role,
parent_agent=self,
)
def _init_engines(self) -> None:
"""Initialize dreaming, planning, knowledge discovery, and skill builder."""
from crewai.new_agent.dreaming import DreamingEngine
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
from crewai.new_agent.planning import PlanningEngine
if self.settings.self_improving:
self._dreaming_engine = DreamingEngine(self)
if self.settings.planning_enabled:
self._planning_engine = PlanningEngine(self)
self._knowledge_discovery = KnowledgeDiscovery(self)
if self.settings.can_build_skills:
try:
from crewai.new_agent.skill_builder import SkillBuilder
self._skill_builder = SkillBuilder(self)
except Exception:
pass
def _resolve_amp_coworker(
self,
handle: str,
overrides: dict[str, Any] | None = None,
) -> NewAgent:
"""Resolve an AMP repository handle into a NewAgent instance.
GAP-63: Uses a process-level cache to avoid redundant API calls.
"""
from crewai.utilities.agent_utils import load_agent_from_repository
# GAP-63: Check cache first
if handle in _amp_cache:
attrs = _amp_cache[handle]
else:
attrs = load_agent_from_repository(handle)
_amp_cache[handle] = attrs
kwargs: dict[str, Any] = {
"role": attrs.get("role", handle),
"goal": attrs.get("goal", ""),
"backstory": attrs.get("backstory", ""),
"tools": attrs.get("tools", []),
"llm": attrs.get("llm", self.llm),
}
if overrides:
for key, val in overrides.items():
kwargs[key] = val
return NewAgent(**kwargs)
def _init_telemetry(self) -> None:
try:
from crewai.new_agent.telemetry import NewAgentTelemetry, register_agent
self._telemetry = NewAgentTelemetry(
share_data=getattr(self.settings, "share_data", False),
)
# GAP-123: Register so event listeners can look up this telemetry instance
register_agent(self.id, self._telemetry)
# GAP-124: Compute and set agent fingerprint
self._telemetry.set_fingerprint(self._compute_fingerprint())
except Exception:
pass
def _compute_fingerprint(self) -> str:
"""GAP-124: Stable hash of agent config for telemetry correlation."""
import hashlib
tool_names = sorted(
getattr(t, "name", "") or getattr(t, "__name__", str(t))
for t in self._resolved_tools
)
parts = [
self.role,
self.goal[:100],
",".join(tool_names),
str(self.settings.planning_enabled),
str(self.settings.self_improving),
]
digest = hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
return digest
def _emit_created_event(self) -> None:
"""GAP-84: Emit agent-created event at construction time.
The conversation_started event is now emitted in _get_or_create_executor
when a NEW conversation executor is actually created.
"""
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentCreatedEvent
crewai_event_bus.emit(
self,
event=NewAgentCreatedEvent(
new_agent_id=self.id,
new_agent_role=self.role,
),
)
except Exception:
pass
if self._telemetry:
amp_count = sum(
1
for cw in self._resolved_coworkers
if getattr(cw, "_amp_resolved", False)
)
self._telemetry.agent_created(
agent_id=self.id,
role=self.role,
goal=self.goal,
llm=str(self.llm or ""),
tools_count=len(self._resolved_tools),
coworkers_count=len(self._resolved_coworkers),
memory_enabled=self.settings.memory_enabled,
planning_enabled=self.settings.planning_enabled,
coworker_amp_count=amp_count,
)
def _init_executor(self) -> None:
"""Create the default executor and store the provider for future use."""
self._provider = self.provider or DirectProvider()
executor = self._create_executor(self._provider)
# GAP-31: Store in the executors dict keyed by default conversation ID
self._default_conversation_id = self._conversation_id
self._executors[self._default_conversation_id] = executor
def _create_executor(self, provider: Any) -> Any:
"""Create a new ConversationalAgentExecutor instance."""
from crewai.new_agent.executor import ConversationalAgentExecutor
return ConversationalAgentExecutor(
agent=self,
provider=provider,
max_iter=self.max_iter,
verbose=self.verbose,
)
def _get_or_create_executor(self, conversation_id: str) -> Any:
"""GAP-31: Get an existing executor or create a new one for the given conversation ID.
New conversations get a fresh DirectProvider so their history is isolated.
GAP-84: Emits NewAgentConversationStartedEvent when a NEW executor is created.
"""
if conversation_id in self._executors:
return self._executors[conversation_id]
# Create a fresh provider for the new conversation so history is isolated
executor = self._create_executor(DirectProvider())
self._executors[conversation_id] = executor
# GAP-84: Emit conversation_started when a new conversation begins
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentConversationStartedEvent
crewai_event_bus.emit(
self,
event=NewAgentConversationStartedEvent(
conversation_id=conversation_id,
new_agent_id=self.id,
new_agent_role=self.role,
),
)
except Exception:
pass
return executor
@property
def _executor(self) -> Any:
"""Return the default conversation's executor (backward compatibility)."""
return self._executors.get(self._default_conversation_id)
# ── Public API ──────────────────────────────────────────────
def message(
self, content: str, *, conversation_id: str | None = None, **kwargs: Any
) -> Message:
"""Send a message and get a response (sync).
GAP-31: Accepts optional conversation_id for concurrent conversations.
"""
cid = conversation_id or self._default_conversation_id
executor = self._get_or_create_executor(cid)
user_msg = Message(
conversation_id=cid,
role="user",
content=content,
)
if self.on_message:
self.on_message(user_msg)
response: Message = executor.invoke(user_msg)
if self.on_complete:
self.on_complete(response)
return response
async def amessage(
self, content: str, *, conversation_id: str | None = None, **kwargs: Any
) -> Message:
"""Send a message and get a response (async).
GAP-31: Accepts optional conversation_id for concurrent conversations.
"""
cid = conversation_id or self._default_conversation_id
executor = self._get_or_create_executor(cid)
user_msg = Message(
conversation_id=cid,
role="user",
content=content,
)
if self.on_message:
self.on_message(user_msg)
response: Message = await executor.ainvoke(user_msg)
if self.on_complete:
self.on_complete(response)
return response
async def stream(
self, content: str, *, conversation_id: str | None = None, **kwargs: Any
) -> AsyncGenerator[str, None]:
"""Stream a response token by token.
GAP-31: Accepts optional conversation_id for concurrent conversations.
After the generator is exhausted, call ``last_stream_result`` to get
the full ``Message`` with token metadata.
"""
cid = conversation_id or self._default_conversation_id
executor = self._get_or_create_executor(cid)
user_msg = Message(
conversation_id=cid,
role="user",
content=content,
)
async for chunk in executor.astream(user_msg):
yield chunk
@property
def last_stream_result(self) -> Message | None:
"""Return the Message from the most recent ``stream()`` call."""
executor = self._executors.get(self._default_conversation_id)
if executor:
return getattr(executor, "_last_stream_result", None)
return None
def reset_conversation(self, conversation_id: str | None = None) -> None:
"""Clear conversation history and start fresh.
GAP-31: Accepts optional conversation_id to reset a specific conversation.
"""
cid = conversation_id or self._default_conversation_id
executor = self._executors.get(cid)
if executor is None:
return
old_conversation_id = cid
# GAP-79: Persist provenance before clearing — audit trail survives reset
if self.provider and hasattr(self.provider, "save_provenance"):
try:
self.provider.save_provenance(executor.provenance_log)
except Exception:
pass
elif self._provider and hasattr(self._provider, "save_provenance"):
try:
self._provider.save_provenance(executor.provenance_log)
except Exception:
pass
executor.conversation_history.clear()
executor.usage_records.clear()
# NOTE: provenance_log is intentionally NOT cleared — provenance
# persists independently of conversation history per plan.
# Reset the per-conversation provider (not the agent's global provider)
conv_provider = getattr(executor, "provider", None)
if conv_provider and hasattr(conv_provider, "reset_history"):
conv_provider.reset_history()
if cid == self._default_conversation_id:
new_id = uuid4().hex
self._conversation_id = new_id
self._default_conversation_id = new_id
del self._executors[cid]
self._executors[new_id] = executor
else:
del self._executors[cid]
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentConversationResetEvent
crewai_event_bus.emit(
self,
event=NewAgentConversationResetEvent(
conversation_id=old_conversation_id,
new_agent_id=self.id,
),
)
except Exception:
pass
def explain(self, conversation_id: str | None = None) -> list[ProvenanceEntry]:
"""Return the decision trace for this agent.
GAP-31: Accepts optional conversation_id for a specific conversation.
"""
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentExplainRequestedEvent
crewai_event_bus.emit(
self,
event=NewAgentExplainRequestedEvent(new_agent_id=self.id),
)
except Exception:
pass
cid = conversation_id or self._default_conversation_id
executor = self._executors.get(cid)
if executor is None:
return []
entries = list(executor.provenance_log)
# GAP-88: Decouple from planning engine. Use a direct sync LLM call
# for reasoning reconstruction — works in both sync and async contexts.
needs_reasoning = any(not e.reasoning for e in entries)
if needs_reasoning and self._llm_instance:
try:
from crewai.utilities.agent_utils import (
format_message_for_llm,
get_llm_response,
)
from crewai.utilities.types import LLMMessage
log_text = "\n".join(
f"Step {i + 1}: {e.action} - inputs={e.inputs}, outcome={e.outcome}"
for i, e in enumerate(entries)
)
prompt = (
f"Given this execution trace, explain the reasoning behind each step:\n\n"
f"{log_text}\n\n"
f"For each step, provide a brief explanation of WHY the agent chose that action."
)
messages: list[LLMMessage] = [
format_message_for_llm(prompt, role="user")
]
from crewai.new_agent.executor import _NullPrinter
reasoning_text = get_llm_response(
llm=self._llm_instance,
messages=messages,
callbacks=[],
printer=_NullPrinter(),
)
if reasoning_text:
reasoning_str = str(reasoning_text).strip()
for entry in entries:
if not entry.reasoning:
entry.reasoning = reasoning_str
except Exception:
pass
return entries
@property
def memory_view(self) -> Any:
"""GAP-111: Read-only view of the agent's memory backend.
Returns the underlying memory instance (supports .recall(), .save(), etc.)
or None if memory is disabled. For a higher-level query API, use query_memory().
"""
return self._memory_instance
def query_memory(self, query: str, limit: int = 10) -> list[Any]:
"""Query the agent's memory for relevant information.
GAP-45: Applies MemoryScope namespace and MemorySlice filters
when configured.
"""
if self._memory_instance is None:
return []
try:
scoped_query = query
if self._memory_namespace:
scoped_query = f"[{self._memory_namespace}] {query}"
results = self._memory_instance.recall(scoped_query, limit=limit)
if not results:
return []
if self._memory_filter is not None:
filtered = []
for r in results:
r_str = str(r).lower() if r else ""
if (
self._memory_filter.user_id
and self._memory_filter.user_id.lower() not in r_str
):
continue
filtered.append(r)
return filtered
return results or []
except Exception:
return []
def get_conversation_history(self, conversation_id: str) -> list[Message]:
"""GAP-31: Get conversation history for a specific conversation."""
executor = self._executors.get(conversation_id)
if executor is None:
return []
return list(executor.conversation_history)
@property
def conversation_history(self) -> list[Message]:
"""Return the default conversation's history."""
executor = self._executors.get(self._default_conversation_id)
if executor is None:
return []
return list(executor.conversation_history)
@property
def last_prompt_stack(self) -> PromptStack | None:
executor = self._executors.get(self._default_conversation_id)
if executor is None:
return None
result: PromptStack | None = executor.prompt_stack
return result
@property
def usage_metrics(self) -> dict[str, int]:
executor = self._executors.get(self._default_conversation_id)
if executor is None:
return {
"total_input_tokens": 0,
"total_output_tokens": 0,
"total_tokens": 0,
"total_actions": 0,
}
total_in = sum(r.input_tokens for r in executor.usage_records)
total_out = sum(r.output_tokens for r in executor.usage_records)
return {
"total_input_tokens": total_in,
"total_output_tokens": total_out,
"total_tokens": total_in + total_out,
"total_actions": len(executor.usage_records),
}
# ── GAP-40: Training → Canonical Memories ──────────────────
def train(self, feedback: str, task_context: str = "") -> None:
"""Process training feedback as canonical memories.
GAP-40: Instead of prompt-tuning, saves feedback as high-priority
memories for the agent to recall during future conversations.
"""
if not self._memory_instance:
return
canonical = f"Training feedback: {feedback}"
if task_context:
canonical = f"Context: {task_context}\nFeedback: {feedback}"
try:
self._memory_instance.remember(
canonical,
agent_role=self.role,
importance=0.95,
)
except Exception:
pass
if self._dreaming_engine:
try:
self._dreaming_engine.add_training_feedback(feedback, task_context)
except Exception:
pass
# ── GAP-24: Anaphora Resolution in Memory Encoding ─────────
def prepare_memory_context(self, raw_text: str) -> str:
"""Prepare text for memory storage by resolving anaphora.
GAP-24: Returns an enhanced prompt that the executor can use
to resolve pronouns before saving to memory.
"""
last_messages = (
self.conversation_history[-5:] if self.conversation_history else []
)
context = "\n".join(f"{m.role}: {m.content}" for m in last_messages)
return (
f"Given this conversation context:\n{context}\n\n"
f"Resolve all pronouns and references in the following text to their "
f"full names/concepts. Only output the resolved text, nothing else:\n"
f"{raw_text}"
)
def _resolve_anaphora(self, text: str, context: list[Message]) -> str:
"""Resolve pronouns in text using conversation context.
GAP-24: Only triggers if the text contains pronouns.
Requires an LLM call via the agent's LLM.
"""
if not _ANAPHORA_PRONOUNS.search(text):
return text
llm = self._llm_instance
if llm is None:
return text
context_str = "\n".join(f"{m.role}: {m.content}" for m in context[-5:])
prompt = (
f"Given this conversation context:\n{context_str}\n\n"
f"Resolve all pronouns and references in the following text to their "
f"full names/concepts. Only output the resolved text, nothing else:\n"
f"{text}"
)
try:
from crewai.new_agent.executor import _NullPrinter
from crewai.utilities.agent_utils import (
format_message_for_llm,
get_llm_response,
)
from crewai.utilities.types import LLMMessage
messages: list[LLMMessage] = [format_message_for_llm(prompt, role="user")]
result = get_llm_response(
llm=llm,
messages=messages,
callbacks=[],
printer=_NullPrinter(),
)
resolved = str(result).strip()
return resolved if resolved else text
except Exception:
return text

View File

@@ -1,252 +0,0 @@
"""Planning — execution plan creation for NewAgent.
GAP-49: Tracks token usage from plan creation and reasoning reconstruction LLM calls.
"""
from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from crewai.new_agent.new_agent import NewAgent
logger = logging.getLogger(__name__)
class PlanningEngine:
"""Creates execution plans for complex tasks."""
def __init__(self, agent: NewAgent):
self.agent = agent
self._current_plan: list[str] | None = None
# GAP-49: Token tracking for the last plan/reasoning call
self._last_plan_tokens: Any = None
@property
def current_plan(self) -> list[str] | None:
return self._current_plan
async def maybe_plan(self, user_message: str) -> list[str] | None:
"""Decide if planning is needed and create a plan if so.
Returns a list of plan steps, or None if no planning needed.
"""
settings = self.agent.settings
if not settings.planning_enabled:
return None
if settings.auto_plan:
needs_plan = await self._assess_complexity(user_message)
if not needs_plan:
return None
plan = await self._create_plan(user_message)
self._current_plan = plan
self._emit_planning_events(plan)
return plan
async def _assess_complexity(self, message: str) -> bool:
"""Use a heuristic to determine if a message needs planning."""
# Simple heuristic: long messages, multiple questions, or explicit planning keywords
complexity_indicators = [
len(message) > 500,
message.count("?") > 2,
any(
kw in message.lower()
for kw in [
"step by step",
"plan",
"multiple",
"compare",
"analyze",
"research",
"comprehensive",
"detailed",
"all of",
"each of",
"every",
]
),
message.count(",") > 4,
message.count(" and ") > 3,
]
return sum(complexity_indicators) >= 2
async def _create_plan(self, message: str) -> list[str]:
"""Use LLM to create an execution plan."""
llm = self.agent._llm_instance
if llm is None:
return []
from crewai.utilities.agent_utils import (
aget_llm_response,
format_message_for_llm,
)
from crewai.utilities.types import LLMMessage
tools_desc = ""
if self.agent._resolved_tools:
tools_desc = "Available tools: " + ", ".join(
t.name for t in self.agent._resolved_tools
)
coworkers_desc = ""
if self.agent._resolved_coworkers:
coworkers_desc = "Available coworkers: " + ", ".join(
getattr(cw, "role", str(cw)) for cw in self.agent._resolved_coworkers
)
prompt = (
f"You are {self.agent.role}. Your goal: {self.agent.goal}\n\n"
f"A user has asked: {message}\n\n"
f"{tools_desc}\n{coworkers_desc}\n\n"
"Create a concise execution plan. List each step on its own line, "
"prefixed with a number and period (e.g., '1. Search for...'). "
"Keep steps actionable and specific. Maximum 7 steps."
)
messages: list[LLMMessage] = [format_message_for_llm(prompt, role="user")]
try:
from crewai.new_agent.executor import _NullPrinter
response = await aget_llm_response(
llm=llm,
messages=messages,
callbacks=[],
printer=_NullPrinter(),
verbose=False,
)
# GAP-49: Record token usage from the planning LLM call
try:
from crewai.new_agent.models import TokenUsage
usage = getattr(llm, "_token_usage", None) or {}
in_tokens = usage.get("prompt_tokens", 0)
out_tokens = usage.get("completion_tokens", 0)
model_name = getattr(llm, "model", "") or ""
self._last_plan_tokens = TokenUsage(
action="planning",
agent_id=str(self.agent.id),
input_tokens=in_tokens,
output_tokens=out_tokens,
model=model_name,
)
except Exception:
pass
lines = str(response).strip().split("\n")
steps = []
for line in lines:
line = line.strip()
if line and (line[0].isdigit() or line.startswith("-")):
# Remove numbering prefix
clean = line.lstrip("0123456789.-) ").strip()
if clean:
steps.append(clean)
return steps or [str(response).strip()]
except Exception as e:
logger.debug(f"Planning LLM call failed: {e}")
return []
async def reconstruct_reasoning(self, provenance_log: list[Any]) -> list[Any]:
"""Reconstruct reasoning for provenance entries with empty reasoning fields."""
entries_without_reasoning = [e for e in provenance_log if not e.reasoning]
if not entries_without_reasoning:
return provenance_log
llm = self.agent._llm_instance
if llm is None:
return provenance_log
from crewai.utilities.agent_utils import (
aget_llm_response,
format_message_for_llm,
)
from crewai.utilities.types import LLMMessage
log_text = "\n".join(
f"- [{e.action}] inputs={e.inputs}, outcome={e.outcome}"
for e in provenance_log
)
prompt = (
f"You are analyzing the decision trace of an AI agent ({self.agent.role}).\n\n"
f"Execution log:\n{log_text}\n\n"
"For each action, explain WHY the agent took that action in 1-2 sentences. "
"Output one reasoning per line in the same order as the log entries, prefixed with the action index (0-based):\n"
"0: reason\n1: reason\n..."
)
messages: list[LLMMessage] = [format_message_for_llm(prompt, role="user")]
try:
from crewai.new_agent.executor import _NullPrinter
response = await aget_llm_response(
llm=llm,
messages=messages,
callbacks=[],
printer=_NullPrinter(),
verbose=False,
)
# GAP-49: Record token usage from the reasoning reconstruction call
try:
from crewai.new_agent.models import TokenUsage
usage = getattr(llm, "_token_usage", None) or {}
in_tokens = usage.get("prompt_tokens", 0)
out_tokens = usage.get("completion_tokens", 0)
model_name = getattr(llm, "model", "") or ""
self._last_plan_tokens = TokenUsage(
action="planning",
agent_id=str(self.agent.id),
input_tokens=in_tokens,
output_tokens=out_tokens,
model=model_name,
)
except Exception:
pass
lines = str(response).strip().split("\n")
for line in lines:
line = line.strip()
if ":" in line:
idx_str, reasoning = line.split(":", 1)
try:
idx = int(idx_str.strip())
if 0 <= idx < len(provenance_log):
provenance_log[idx].reasoning = reasoning.strip()
except (ValueError, IndexError):
continue
except Exception:
pass
return provenance_log
def _emit_planning_events(self, plan: list[str]) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import (
NewAgentPlanningCompletedEvent,
NewAgentPlanningStartedEvent,
)
crewai_event_bus.emit(
self.agent,
event=NewAgentPlanningStartedEvent(new_agent_id=str(self.agent.id)),
)
crewai_event_bus.emit(
self.agent,
event=NewAgentPlanningCompletedEvent(
new_agent_id=str(self.agent.id),
plan_steps_count=len(plan),
),
)
except Exception:
pass

View File

@@ -1,192 +0,0 @@
"""ConversationalProvider protocol and basic implementations."""
from __future__ import annotations
import json
import logging
from pathlib import Path
import sqlite3
from typing import Protocol, runtime_checkable
from crewai.new_agent.models import AgentStatus, Message, ProvenanceEntry
logger = logging.getLogger(__name__)
@runtime_checkable
class ConversationStorage(Protocol):
"""Pluggable persistence for conversation history and provenance.
OSS ships SQLiteConversationStorage. Enterprise can replace with
Postgres, DynamoDB, etc.
"""
def load_messages(self) -> list[Message]: ...
def save_messages(self, messages: list[Message]) -> None: ...
def clear_messages(self) -> None: ...
def load_provenance(self) -> list[ProvenanceEntry]: ...
def save_provenance(self, entries: list[ProvenanceEntry]) -> None: ...
class SQLiteConversationStorage:
"""Thread-safe SQLite WAL storage for conversations and provenance."""
def __init__(self, db_path: str | Path) -> None:
self._db_path = str(db_path)
Path(self._db_path).parent.mkdir(parents=True, exist_ok=True)
self._init_db()
def _connect(self) -> sqlite3.Connection:
conn = sqlite3.connect(self._db_path, timeout=30)
conn.execute("PRAGMA journal_mode=WAL")
return conn
def _init_db(self) -> None:
with self._connect() as conn:
conn.execute("""
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
data_json TEXT NOT NULL
)
""")
conn.execute("""
CREATE TABLE IF NOT EXISTS provenance (
id INTEGER PRIMARY KEY AUTOINCREMENT,
data_json TEXT NOT NULL
)
""")
def load_messages(self) -> list[Message]:
try:
with self._connect() as conn:
rows = conn.execute(
"SELECT data_json FROM messages ORDER BY id"
).fetchall()
return [Message.model_validate(json.loads(r[0])) for r in rows]
except Exception as e:
logger.debug(f"Failed to load messages: {e}")
return []
def save_messages(self, messages: list[Message]) -> None:
try:
with self._connect() as conn:
conn.execute("DELETE FROM messages")
conn.executemany(
"INSERT INTO messages (data_json) VALUES (?)",
[
(json.dumps(m.model_dump(mode="json"), default=str),)
for m in messages
],
)
except Exception as e:
logger.debug(f"Failed to save messages: {e}")
def clear_messages(self) -> None:
try:
with self._connect() as conn:
conn.execute("DELETE FROM messages")
except Exception as e:
logger.debug(f"Failed to clear messages: {e}")
def load_provenance(self) -> list[ProvenanceEntry]:
try:
with self._connect() as conn:
rows = conn.execute(
"SELECT data_json FROM provenance ORDER BY id"
).fetchall()
return [ProvenanceEntry.model_validate(json.loads(r[0])) for r in rows]
except Exception as e:
logger.debug(f"Failed to load provenance: {e}")
return []
def save_provenance(self, entries: list[ProvenanceEntry]) -> None:
try:
with self._connect() as conn:
conn.execute("DELETE FROM provenance")
conn.executemany(
"INSERT INTO provenance (data_json) VALUES (?)",
[
(json.dumps(e.model_dump(mode="json"), default=str),)
for e in entries
],
)
except Exception as e:
logger.debug(f"Failed to save provenance: {e}")
@runtime_checkable
class ConversationalProvider(Protocol):
"""Pluggable transport for agent conversations.
OSS provides CLIProvider (TUI). Enterprise provides
SlackProvider, TeamsProvider, WebProvider, etc.
"""
async def send_message(self, message: Message) -> None: ...
async def receive_message(self) -> Message: ...
async def send_status(self, status: AgentStatus) -> None: ...
def get_history(self) -> list[Message]: ...
def save_history(self, messages: list[Message]) -> None: ...
def reset_history(self) -> None: ...
def save_provenance(self, entries: list[ProvenanceEntry]) -> None: ...
def load_provenance(self) -> list[ProvenanceEntry]: ...
def get_scope(self) -> dict[str, str]:
"""Return scope context for multi-tenant memory isolation.
Enterprise providers override this to convey conversation scope
(e.g., Slack channel ID, Teams thread, user DM). The executor
passes this to memory operations so memories are scoped correctly.
Returns a dict with provider-defined keys. Common keys:
- "channel_id": platform channel/thread identifier
- "user_id": platform user identifier
- "team_id": workspace/org identifier
"""
...
class DirectProvider:
"""In-process provider for programmatic use (no TUI, no stdin).
Conversations happen via message()/amessage() calls directly.
History is kept in-memory.
"""
def __init__(self) -> None:
self._history: list[Message] = []
self._provenance: list[ProvenanceEntry] = []
self._pending_status: AgentStatus | None = None
async def send_message(self, message: Message) -> None:
self._history.append(message)
async def receive_message(self) -> Message:
raise NotImplementedError(
"DirectProvider does not support interactive receive. "
"Use agent.message() instead."
)
async def send_status(self, status: AgentStatus) -> None:
self._pending_status = status
def get_history(self) -> list[Message]:
return list(self._history)
def save_history(self, messages: list[Message]) -> None:
self._history = list(messages)
def reset_history(self) -> None:
self._history.clear()
def save_provenance(self, entries: list[ProvenanceEntry]) -> None:
"""Persist provenance entries in memory."""
self._provenance = list(entries)
def load_provenance(self) -> list[ProvenanceEntry]:
"""Load provenance entries from memory."""
return list(self._provenance)
def get_scope(self) -> dict[str, str]:
return {}

View File

@@ -1,302 +0,0 @@
"""Task scheduler — lets agents schedule one-time or recurring work.
Persists tasks to ``~/.crewai/scheduled_tasks.json`` and runs an asyncio
background loop that fires due tasks.
"""
from __future__ import annotations
import asyncio
from collections.abc import Callable
from datetime import datetime, timedelta, timezone
import json
import logging
from pathlib import Path
import re
from typing import Any
from uuid import uuid4
from pydantic import BaseModel, Field
from crewai.tools.base_tool import BaseTool
logger = logging.getLogger(__name__)
_PERSIST_PATH = Path.home() / ".crewai" / "scheduled_tasks.json"
# ── Relative-time parser ────────────────────────────────────────
_RELATIVE_RE = re.compile(
r"(?:in\s+)?(\d+)\s*(second|sec|minute|min|hour|hr|day)s?",
re.IGNORECASE,
)
_UNIT_SECONDS = {
"second": 1,
"sec": 1,
"minute": 60,
"min": 60,
"hour": 3600,
"hr": 3600,
"day": 86400,
}
def parse_schedule_time(text: str) -> datetime | None:
"""Parse a human-friendly time string into a UTC datetime.
Supports:
- Relative: "in 5 minutes", "30 seconds", "2 hours"
- ISO 8601: "2026-05-11T18:00:00Z"
"""
text = text.strip()
# Try relative first
m = _RELATIVE_RE.search(text)
if m:
amount = int(m.group(1))
unit = m.group(2).lower()
secs = amount * _UNIT_SECONDS.get(unit, 60)
return datetime.now(timezone.utc) + timedelta(seconds=secs)
# Try ISO
for fmt in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%dT%H:%M:%S"):
try:
dt = datetime.strptime(text, fmt)
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return dt
except ValueError:
continue
return None
# ── ScheduledTask model ─────────────────────────────────────────
class ScheduledTask(BaseModel):
id: str = Field(default_factory=lambda: f"task-{uuid4().hex[:8]}")
agent_name: str = ""
description: str = ""
schedule_type: str = "once" # "once" or "recurring"
next_run_at: str = "" # ISO 8601 UTC
interval_seconds: int | None = None # for recurring
status: str = "pending" # pending, running, completed, failed, cancelled
last_result: str = ""
created_at: str = Field(
default_factory=lambda: datetime.now(timezone.utc).isoformat()
)
# ── TaskScheduler ───────────────────────────────────────────────
class TaskScheduler:
"""Singleton scheduler that checks for due tasks every 30 seconds."""
_instance: TaskScheduler | None = None
_initialized: bool
def __new__(cls) -> TaskScheduler:
if cls._instance is None:
cls._instance = super().__new__(cls)
cls._instance._initialized = False
return cls._instance
def __init__(self) -> None:
if self._initialized:
return
self._initialized = True
self._tasks: list[ScheduledTask] = []
self._callback: Callable[[ScheduledTask], Any] | None = None
self._running = False
self._bg_task: asyncio.Task[None] | None = None
self._load()
def set_callback(self, cb: Callable[[ScheduledTask], Any]) -> None:
self._callback = cb
# ── Persistence ──
def _load(self) -> None:
if _PERSIST_PATH.exists():
try:
data = json.loads(_PERSIST_PATH.read_text())
self._tasks = [ScheduledTask(**t) for t in data]
except Exception:
self._tasks = []
def _save(self) -> None:
_PERSIST_PATH.parent.mkdir(parents=True, exist_ok=True)
try:
_PERSIST_PATH.write_text(
json.dumps([t.model_dump() for t in self._tasks], indent=2)
)
except Exception as e:
logger.warning(f"Failed to persist scheduled tasks: {e}")
# ── CRUD ──
def add(self, task: ScheduledTask) -> ScheduledTask:
self._tasks.append(task)
self._save()
return task
def cancel(self, task_id: str) -> bool:
for t in self._tasks:
if t.id == task_id and t.status == "pending":
t.status = "cancelled"
self._save()
return True
return False
def list_tasks(self, include_done: bool = False) -> list[ScheduledTask]:
if include_done:
return list(self._tasks)
return [t for t in self._tasks if t.status in ("pending", "running")]
# ── Background loop ──
def start(self, loop: asyncio.AbstractEventLoop | None = None) -> None:
if self._running:
return
self._running = True
if loop is not None:
self._bg_task = loop.create_task(self._loop())
else:
try:
running_loop = asyncio.get_running_loop()
self._bg_task = running_loop.create_task(self._loop())
except RuntimeError:
pass
def stop(self) -> None:
self._running = False
if self._bg_task and not self._bg_task.done():
self._bg_task.cancel()
async def _loop(self) -> None:
while self._running:
try:
await asyncio.sleep(30)
self._tick()
except asyncio.CancelledError:
break
except Exception as e:
logger.warning(f"Scheduler tick error: {e}")
def _tick(self) -> None:
now = datetime.now(timezone.utc)
for task in self._tasks:
if task.status != "pending":
continue
try:
due = datetime.fromisoformat(task.next_run_at)
if due.tzinfo is None:
due = due.replace(tzinfo=timezone.utc)
except (ValueError, TypeError):
continue
if now >= due:
task.status = "running"
self._save()
try:
if self._callback:
result = self._callback(task)
task.last_result = str(result) if result else "done"
except Exception as e:
task.status = "failed"
task.last_result = str(e)
self._save()
continue
if task.schedule_type == "recurring" and task.interval_seconds:
task.status = "pending"
task.next_run_at = (
now + timedelta(seconds=task.interval_seconds)
).isoformat()
else:
task.status = "completed"
self._save()
@classmethod
def reset(cls) -> None:
"""Reset singleton — for testing only."""
cls._instance = None
# ── ScheduleTaskTool ────────────────────────────────────────────
class ScheduleTaskArgs(BaseModel):
description: str = Field(description="What the agent should do when the task fires")
when: str = Field(
description=(
"When to run. Accepts relative ('in 5 minutes', '2 hours') "
"or ISO 8601 ('2026-05-11T18:00:00Z')"
)
)
recurring_interval: str | None = Field(
default=None,
description=(
"For recurring tasks, how often to repeat (e.g. '30 minutes', '1 hour'). "
"Omit for one-time tasks."
),
)
class ScheduleTaskTool(BaseTool):
"""Tool that lets an agent schedule future work."""
name: str = "schedule_task"
description: str = (
"Schedule a task to be executed at a future time. "
"Use this when you promise to do something later, "
"need to set a reminder, or want to run recurring checks."
)
args_schema: type[BaseModel] = ScheduleTaskArgs
agent_name: str = Field(default="", exclude=True)
def _run(
self,
description: str,
when: str,
recurring_interval: str | None = None,
**kwargs: Any,
) -> str:
run_at = parse_schedule_time(when)
if run_at is None:
return (
f"Could not parse time '{when}'. "
"Use relative ('in 5 minutes') or ISO 8601 format."
)
schedule_type = "once"
interval_seconds: int | None = None
if recurring_interval:
m = _RELATIVE_RE.search(recurring_interval)
if m:
amount = int(m.group(1))
unit = m.group(2).lower()
interval_seconds = amount * _UNIT_SECONDS.get(unit, 60)
schedule_type = "recurring"
task = ScheduledTask(
agent_name=self.agent_name,
description=description,
schedule_type=schedule_type,
next_run_at=run_at.isoformat(),
interval_seconds=interval_seconds,
)
scheduler = TaskScheduler()
scheduler.add(task)
when_str = run_at.strftime("%Y-%m-%d %H:%M UTC")
result = f"Scheduled task '{task.id}': {description} — due {when_str}"
if schedule_type == "recurring":
result += f" (repeats every {recurring_interval})"
return result

View File

@@ -1,498 +0,0 @@
"""SkillBuilder — lets agents create and suggest SKILL.md files.
Mirrors KnowledgeDiscovery: detects patterns, builds pending suggestions,
emits events, and waits for user approval before writing to disk.
"""
from __future__ import annotations
import json
import logging
from pathlib import Path
import re
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from crewai.new_agent.new_agent import NewAgent
from crewai.skills.models import Skill
logger = logging.getLogger(__name__)
_SKILL_NAME_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
_SLUGIFY_RE = re.compile(r"[^a-z0-9]+")
_GENERATION_PROMPT = """\
You are generating a reusable skill definition for a CrewAI agent.
A skill is a set of instructions that tells the agent HOW to perform a procedure.
Source type: {source_type}
Input:
{source_text}
Generate a JSON object with exactly these fields:
- "name": a kebab-case identifier (lowercase letters, digits, hyphens only, max 64 chars)
- "description": a one-line description of what this skill does (max 200 chars)
- "instructions": markdown-formatted step-by-step instructions
Return ONLY the JSON object, no extra text.
"""
def _slugify(text: str, max_len: int = 64) -> str:
slug = _SLUGIFY_RE.sub("-", text.lower().strip()).strip("-")
return slug[:max_len]
_CONFIRM_WORDS = {
"yes",
"yep",
"yeah",
"sure",
"approve",
"confirmed",
"accept",
"lgtm",
}
_CONFIRM_PHRASES = {"go ahead", "save it", "sounds good", "looks good"}
_REJECT_WORDS = {"no", "nah", "nope", "reject", "decline"}
_REJECT_PHRASES = {"never mind", "no thanks", "don't save", "not now"}
def _detect_suggestion_intent(user_text: str) -> str:
"""Return 'confirm', 'reject', or 'ignore' for a user response.
Only short responses (≤ 10 words) are treated as confirm/reject signals.
Longer messages are always 'ignore' — they're conversational, not
yes/no answers. Single-word triggers must appear in the first two
words; multi-word phrases can appear anywhere in the short text.
"""
lower = user_text.lower().strip()
words = lower.split()
if not words:
return "ignore"
if len(words) > 10:
return "ignore"
leading = " ".join(words[:2])
def _word_match(word: str, text: str) -> bool:
return bool(re.search(rf"\b{re.escape(word)}\b(?!-)", text))
for phrase in _CONFIRM_PHRASES:
if phrase in lower:
return "confirm"
for word in _CONFIRM_WORDS:
if _word_match(word, leading):
return "confirm"
for phrase in _REJECT_PHRASES:
if phrase in lower:
return "reject"
for word in _REJECT_WORDS:
if _word_match(word, leading):
return "reject"
return "ignore"
class SkillBuilder:
"""Builds, suggests, and manages auto-generated skills for a NewAgent."""
def __init__(self, agent: NewAgent) -> None:
self.agent = agent
self._pending_suggestions: list[dict[str, Any]] = []
self._active_skills: list[Skill] = []
role_slug = _slugify(agent.role or str(agent.id))
self._skills_dir = Path("agents") / role_slug / "skills"
self._load_existing_skills()
@property
def pending_suggestions(self) -> list[dict[str, Any]]:
return list(self._pending_suggestions)
# ── Suggestion creation ──
def suggest_skill(
self,
name: str,
description: str,
instructions: str,
source: str,
metadata: dict[str, str] | None = None,
) -> dict[str, Any]:
"""Create a pending skill suggestion and emit an event."""
if not self.agent.settings.can_build_skills:
return {}
name = _slugify(name)
if not name:
name = f"skill-{len(self._pending_suggestions) + 1}"
if not _SKILL_NAME_RE.match(name):
name = _slugify(name)
for existing in self._active_skills:
if existing.name == name:
name = f"{name}-{len(self._pending_suggestions) + 1}"
break
suggestion: dict[str, Any] = {
"name": name,
"description": description[:200],
"instructions": instructions,
"source": source,
"status": "pending",
"metadata": metadata or {"auto-generated": "true"},
}
self._pending_suggestions.append(suggestion)
self._emit_suggested_event(suggestion)
return suggestion
def build_suggestion_message(
self, suggestion: dict[str, Any]
) -> tuple[str, list[dict[str, Any]]]:
"""Return (conversational_text, actions) for a pending suggestion.
Plain-text providers show just the text and let the user respond
conversationally. Rich providers (Slack, Teams) can render
the actions as buttons or interactive cards.
"""
name = suggestion.get("name", "skill")
desc = suggestion.get("description", "")
instructions = suggestion.get("instructions", "")
preview = instructions[:300] + ("..." if len(instructions) > 300 else "")
text = (
f"I've identified a pattern that could be saved as a reusable skill:\n\n"
f"**{name}** — {desc}\n\n"
f"```\n{preview}\n```\n\n"
f"Would you like me to save this skill? "
f"You can say yes, no, or ask me to modify it first."
)
from crewai.new_agent.models import MessageAction
actions = [
MessageAction(
action_id=f"skill-confirm-{name}",
label="Approve",
action_type="suggestion_confirm",
payload={"type": "skill", "name": name},
),
MessageAction(
action_id=f"skill-reject-{name}",
label="Dismiss",
action_type="suggestion_reject",
payload={"type": "skill", "name": name},
),
MessageAction(
action_id=f"skill-edit-{name}",
label="Edit",
action_type="suggestion_edit",
payload={"type": "skill", "name": name},
),
]
return text, [a.model_dump() for a in actions]
def handle_suggestion_response(self, user_text: str) -> dict[str, Any] | None:
"""Interpret a plain-text user response to a pending suggestion.
Returns a dict with ``{"action": "confirmed"|"rejected"|"ignored", ...}``
or ``None`` if there are no pending suggestions.
After 3 consecutive ignores the suggestion is auto-dismissed.
"""
if not self._pending_suggestions:
return None
intent = _detect_suggestion_intent(user_text)
if intent == "confirm":
suggestion = self._pending_suggestions[0]
if self.confirm_suggestion(0):
return {"action": "confirmed", "name": suggestion["name"]}
return {"action": "error", "name": suggestion["name"]}
if intent == "reject":
suggestion = self._pending_suggestions[0]
name = suggestion["name"]
self.reject_suggestion(0)
return {"action": "rejected", "name": name}
self._pending_suggestions[0]["_ignore_count"] = (
self._pending_suggestions[0].get("_ignore_count", 0) + 1
)
if self._pending_suggestions[0]["_ignore_count"] >= 3:
name = self._pending_suggestions[0]["name"]
self.reject_suggestion(0)
return {"action": "rejected", "name": name}
return {"action": "ignored"}
def suggest_from_instruction(self, user_text: str) -> dict[str, Any]:
"""Generate a skill suggestion from an explicit user instruction."""
generated = self._generate_skill_content(user_text, "explicit-instruction")
if not generated:
return self.suggest_skill(
name=_slugify(user_text[:60]),
description=user_text[:200],
instructions=user_text,
source="explicit-instruction",
)
return self.suggest_skill(
name=generated["name"],
description=generated["description"],
instructions=generated["instructions"],
source="explicit-instruction",
)
def suggest_from_workflow(self, workflow: dict[str, Any]) -> dict[str, Any]:
"""Convert a DreamingEngine workflow into a skill suggestion."""
tools = workflow.get("tools", [])
count = workflow.get("count", 0)
source_text = (
f"Repeated tool sequence ({count}x): {' -> '.join(tools)}\n"
+ "\n".join(f" Step {i + 1}: {t}" for i, t in enumerate(tools))
)
generated = self._generate_skill_content(source_text, "workflow-detection")
if not generated:
name = _slugify("-".join(tools[:4]))
return self.suggest_skill(
name=name or "workflow-skill",
description=f"Automated workflow: {' -> '.join(tools)}",
instructions=(
f"## Workflow (detected {count} times)\n\n"
+ "\n".join(
f"{i + 1}. Use the **{t}** tool" for i, t in enumerate(tools)
)
),
source="workflow-detection",
)
return self.suggest_skill(
name=generated["name"],
description=generated["description"],
instructions=generated["instructions"],
source="workflow-detection",
)
# ── Approval / rejection ──
def confirm_suggestion(self, index: int) -> bool:
"""Approve a pending suggestion: write SKILL.md, load, and activate."""
if index < 0 or index >= len(self._pending_suggestions):
return False
suggestion = self._pending_suggestions[index]
if suggestion["status"] != "pending":
return False
name = suggestion["name"]
description = suggestion["description"]
instructions = suggestion["instructions"]
metadata = suggestion.get("metadata", {})
try:
skill_path = self._write_skill_to_disk(
name, description, instructions, metadata
)
except Exception as e:
logger.warning(f"Failed to write skill '{name}': {e}")
return False
try:
from crewai.skills.parser import (
load_skill_instructions,
load_skill_metadata,
)
skill = load_skill_metadata(skill_path)
skill = load_skill_instructions(skill)
self._active_skills.append(skill)
except Exception as e:
logger.warning(f"Failed to load skill '{name}' after writing: {e}")
return False
suggestion["status"] = "confirmed"
self._pending_suggestions.pop(index)
self._emit_confirmed_event(name)
return True
def reject_suggestion(self, index: int) -> None:
if 0 <= index < len(self._pending_suggestions):
self._pending_suggestions[index]["status"] = "rejected"
name = self._pending_suggestions[index]["name"]
self._pending_suggestions.pop(index)
self._emit_rejected_event(name)
def update_suggestion(self, index: int, instructions: str) -> bool:
if 0 <= index < len(self._pending_suggestions):
self._pending_suggestions[index]["instructions"] = instructions
return True
return False
# ── Active skills ──
def get_active_skills(self) -> list[Skill]:
return list(self._active_skills)
def format_skills_context(self) -> str:
if not self._active_skills:
return ""
try:
from crewai.skills.loader import format_skill_context
sections = [format_skill_context(s) for s in self._active_skills]
return "\n\n".join(sections)
except Exception as e:
logger.warning(f"Failed to format skills context: {e}")
return ""
# ── Disk I/O ──
def _write_skill_to_disk(
self,
name: str,
description: str,
instructions: str,
metadata: dict[str, str],
) -> Path:
skill_dir = self._skills_dir / name
skill_dir.mkdir(parents=True, exist_ok=True)
frontmatter_lines = [
"---",
f"name: {name}",
f'description: "{description}"',
]
if metadata:
frontmatter_lines.append("metadata:")
for k, v in metadata.items():
frontmatter_lines.append(f' {k}: "{v}"')
frontmatter_lines.append("---")
frontmatter_lines.append("")
content = "\n".join(frontmatter_lines) + instructions
(skill_dir / "SKILL.md").write_text(content)
return skill_dir
def _load_existing_skills(self) -> None:
if not self._skills_dir.is_dir():
return
try:
from crewai.skills.loader import activate_skill, discover_skills
discovered = discover_skills(self._skills_dir)
for skill in discovered:
try:
activated = activate_skill(skill)
self._active_skills.append(activated)
except Exception:
pass
except Exception:
pass
# ── LLM skill generation ──
def _generate_skill_content(
self, source_text: str, source_type: str
) -> dict[str, Any] | None:
llm = getattr(self.agent, "_llm_instance", None)
if llm is None:
return None
prompt = _GENERATION_PROMPT.format(
source_type=source_type,
source_text=source_text,
)
try:
from crewai.new_agent.executor import _NullPrinter
from crewai.utilities.agent_utils import (
format_message_for_llm,
get_llm_response,
)
messages = [format_message_for_llm(prompt, role="user")]
response = get_llm_response(
llm=llm,
messages=messages,
callbacks=[],
printer=_NullPrinter(),
verbose=False,
)
text = str(response).strip()
# Extract JSON from response (may be wrapped in ```json blocks)
if "```" in text:
match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
if match:
text = match.group(1)
data = json.loads(text)
name = data.get("name", "")
description = data.get("description", "")
instructions = data.get("instructions", "")
if not name or not instructions:
return None
return {
"name": _slugify(name),
"description": description[:200],
"instructions": instructions,
}
except Exception as e:
logger.debug(f"LLM skill generation failed: {e}")
return None
# ── Events ──
def _emit_suggested_event(self, suggestion: dict[str, Any]) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentSkillSuggestedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentSkillSuggestedEvent(
new_agent_id=str(self.agent.id),
skill_name=suggestion.get("name", ""),
source_type=suggestion.get("source", ""),
),
)
except Exception:
pass
def _emit_confirmed_event(self, skill_name: str) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentSkillConfirmedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentSkillConfirmedEvent(
new_agent_id=str(self.agent.id),
skill_name=skill_name,
),
)
except Exception:
pass
def _emit_rejected_event(self, skill_name: str) -> None:
try:
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentSkillRejectedEvent
crewai_event_bus.emit(
self.agent,
event=NewAgentSkillRejectedEvent(
new_agent_id=str(self.agent.id),
skill_name=skill_name,
),
)
except Exception:
pass

View File

@@ -1,469 +0,0 @@
"""Spawn tool — lets an agent spawn parallel copies of itself for sub-tasks.
GAP-57: Emits spawn started/completed/failed events.
GAP-58: Injects relevant parent memory into spawned copies.
"""
from __future__ import annotations
import asyncio
import logging
import time
from typing import Any
from uuid import uuid4
from pydantic import BaseModel, Field
from crewai.new_agent.models import Message
from crewai.tools.base_tool import BaseTool
logger = logging.getLogger(__name__)
def _emit_spawn_event(event_cls: type, **kwargs: Any) -> None:
"""Emit a spawn event on the event bus, swallowing errors."""
try:
from crewai.events.event_bus import crewai_event_bus
crewai_event_bus.emit(None, event=event_cls(**kwargs))
except Exception:
pass
def _query_parent_memory(agent: Any, subtask: str, limit: int = 10) -> str:
"""GAP-58: Query the parent agent's memory for context relevant to the subtask.
Returns a formatted context string, or empty string if unavailable.
"""
try:
memory = getattr(agent, "_memory_instance", None)
if memory is None:
return ""
results = memory.recall(subtask, limit=limit)
if not results:
return ""
lines: list[str] = []
for m in results:
content = getattr(m, "content", "") or getattr(
getattr(m, "record", None), "content", ""
)
if content:
lines.append(f"- {content}")
if not lines:
return ""
return "Parent agent's relevant memory:\n" + "\n".join(lines)
except Exception:
return ""
class SpawnSubtaskArgs(BaseModel):
"""Arguments for spawning parallel sub-tasks."""
subtasks: list[str] = Field(
description="List of sub-task instructions to execute in parallel"
)
fire_and_forget: bool = Field(
default=False,
description="If true, dispatches subtasks in background without waiting for results.",
)
class SpawnSubtaskTool(BaseTool):
"""Tool that spawns parallel copies of the agent for sub-tasks.
Each copy receives the same tools but operates on a single sub-task
with no backstory, history, or memory — just the instruction and tools.
"""
name: str = "spawn_parallel_subtasks"
description: str = (
"Spawn parallel copies of yourself to handle multiple sub-tasks "
"simultaneously. Each copy gets the same tools but focuses on one "
"sub-task. Returns the collected results from all copies."
)
args_schema: type[BaseModel] = SpawnSubtaskArgs
agent: Any = Field(default=None, exclude=True)
def _run(
self, subtasks: list[str], fire_and_forget: bool = False, **kwargs: Any
) -> str:
"""Execute parallel spawns synchronously."""
from crewai.new_agent.new_agent import NewAgent
if not isinstance(self.agent, NewAgent):
return "Error: spawn tool requires a NewAgent instance."
if not self.agent.settings.can_spawn_copies:
return "Error: this agent is not allowed to spawn copies (can_spawn_copies=False)."
if self.agent.settings.max_spawn_depth < 1:
return "Error: spawn depth exceeded — copies cannot spawn further copies."
settings = self.agent.settings
max_spawns = settings.max_concurrent_spawns
timeout = settings.spawn_timeout
parent_id = str(self.agent.id)
# Cap the number of sub-tasks
if len(subtasks) > max_spawns:
subtasks = subtasks[:max_spawns]
# GAP-57: Generate spawn IDs and emit started events
spawn_ids: list[str] = []
for i, subtask in enumerate(subtasks):
spawn_id = f"spawn-{uuid4().hex[:8]}-{i + 1}"
spawn_ids.append(spawn_id)
try:
from crewai.new_agent.events import NewAgentSpawnStartedEvent
_emit_spawn_event(
NewAgentSpawnStartedEvent,
new_agent_id=parent_id,
spawn_id=spawn_id,
parent_id=parent_id,
spawn_depth=1,
)
except Exception:
pass
spawn_start = time.monotonic()
# Build stripped-down copies
from crewai.new_agent.models import AgentSettings
spawn_settings = AgentSettings(
can_spawn_copies=False,
max_spawn_depth=0,
memory_enabled=True, # Enable so copies can persist insights
provenance_enabled=settings.provenance_enabled,
respect_context_window=settings.respect_context_window,
cache_tool_results=settings.cache_tool_results,
narration_guard=settings.narration_guard,
narration_max_retries=settings.narration_max_retries,
)
# GAP-58: Query parent memory for each subtask and build enriched messages
enriched_messages: list[str] = []
for subtask in subtasks:
context = _query_parent_memory(self.agent, subtask)
if context:
enriched_messages.append(f"{context}\n\nTask: {subtask}")
else:
enriched_messages.append(subtask)
copies: list[NewAgent] = []
for subtask in subtasks:
copy = NewAgent(
role=self.agent.role,
goal=subtask,
backstory="",
llm=self.agent.llm,
tools=list(self.agent.tools),
memory=True, # Enable memory
memory_scope=f"spawn-{parent_id}", # Isolated scope
settings=spawn_settings,
verbose=self.agent.verbose,
)
copies.append(copy)
# Fire-and-forget mode: start tasks in background threads and return immediately
if fire_and_forget:
import threading
def _bg_spawn(copy: NewAgent, msg: str, sid: str) -> None:
try:
copy.message(msg)
try:
from crewai.new_agent.events import NewAgentSpawnCompletedEvent
_emit_spawn_event(
NewAgentSpawnCompletedEvent,
new_agent_id=parent_id,
spawn_id=sid,
)
except Exception:
pass
except Exception as e:
try:
from crewai.new_agent.events import NewAgentSpawnFailedEvent
_emit_spawn_event(
NewAgentSpawnFailedEvent,
new_agent_id=parent_id,
spawn_id=sid,
error=str(e),
)
except Exception:
pass
for copy, msg, sid in zip(copies, enriched_messages, spawn_ids):
threading.Thread(
target=_bg_spawn, args=(copy, msg, sid), daemon=True
).start()
return f"Dispatched {len(copies)} subtask(s) in the background (fire-and-forget)."
# Run in parallel
async def _run_all() -> list[str]:
tasks = [
asyncio.wait_for(
copy.amessage(msg),
timeout=timeout,
)
for copy, msg in zip(copies, enriched_messages)
]
raw_results = await asyncio.gather(*tasks, return_exceptions=True)
output: list[str] = []
for i, r in enumerate(raw_results):
if isinstance(r, asyncio.TimeoutError):
output.append(f"[Subtask {i + 1}] Timed out after {timeout}s")
# GAP-57: Emit spawn failed event
try:
from crewai.new_agent.events import NewAgentSpawnFailedEvent
_emit_spawn_event(
NewAgentSpawnFailedEvent,
new_agent_id=parent_id,
spawn_id=spawn_ids[i],
error=f"Timed out after {timeout}s",
)
except Exception:
pass
elif isinstance(r, Exception):
output.append(f"[Subtask {i + 1}] Error: {r}")
# GAP-57: Emit spawn failed event
try:
from crewai.new_agent.events import NewAgentSpawnFailedEvent
_emit_spawn_event(
NewAgentSpawnFailedEvent,
new_agent_id=parent_id,
spawn_id=spawn_ids[i],
error=str(r),
)
except Exception:
pass
else:
content = r.content if isinstance(r, Message) else str(r)
output.append(f"[Subtask {i + 1}] {content}")
# GAP-57: Emit spawn completed event
try:
from crewai.new_agent.events import NewAgentSpawnCompletedEvent
_emit_spawn_event(
NewAgentSpawnCompletedEvent,
new_agent_id=parent_id,
spawn_id=spawn_ids[i],
)
except Exception:
pass
return output
# Handle event loop scenarios
try:
loop = asyncio.get_running_loop()
except RuntimeError:
loop = None
if loop and loop.is_running():
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _run_all())
results = future.result()
else:
results = asyncio.run(_run_all())
self._log_spawn_provenance(subtasks, results, spawn_ids)
return "\n\n".join(results)
async def _arun(
self, subtasks: list[str], fire_and_forget: bool = False, **kwargs: Any
) -> str:
"""Async spawn — avoids blocking the event loop."""
from crewai.new_agent.new_agent import NewAgent
if not isinstance(self.agent, NewAgent):
return "Error: spawn tool requires a NewAgent instance."
if not self.agent.settings.can_spawn_copies:
return "Error: this agent is not allowed to spawn copies (can_spawn_copies=False)."
if self.agent.settings.max_spawn_depth < 1:
return "Error: spawn depth exceeded — copies cannot spawn further copies."
settings = self.agent.settings
max_spawns = settings.max_concurrent_spawns
timeout = settings.spawn_timeout
parent_id = str(self.agent.id)
if len(subtasks) > max_spawns:
subtasks = subtasks[:max_spawns]
spawn_ids: list[str] = []
for i, subtask in enumerate(subtasks):
spawn_id = f"spawn-{uuid4().hex[:8]}-{i + 1}"
spawn_ids.append(spawn_id)
try:
from crewai.new_agent.events import NewAgentSpawnStartedEvent
_emit_spawn_event(
NewAgentSpawnStartedEvent,
new_agent_id=parent_id,
spawn_id=spawn_id,
parent_id=parent_id,
spawn_depth=1,
)
except Exception:
pass
from crewai.new_agent.models import AgentSettings as SpawnSettings
spawn_settings = SpawnSettings(
can_spawn_copies=False,
max_spawn_depth=0,
memory_enabled=True,
provenance_enabled=settings.provenance_enabled,
respect_context_window=settings.respect_context_window,
cache_tool_results=settings.cache_tool_results,
narration_guard=settings.narration_guard,
narration_max_retries=settings.narration_max_retries,
)
enriched_messages: list[str] = []
for subtask in subtasks:
context = _query_parent_memory(self.agent, subtask)
if context:
enriched_messages.append(f"{context}\n\nTask: {subtask}")
else:
enriched_messages.append(subtask)
copies: list[NewAgent] = []
for subtask in subtasks:
copy = NewAgent(
role=self.agent.role,
goal=subtask,
backstory="",
llm=self.agent.llm,
tools=list(self.agent.tools),
memory=True,
memory_scope=f"spawn-{parent_id}",
settings=spawn_settings,
verbose=self.agent.verbose,
)
copies.append(copy)
if fire_and_forget:
for copy, msg, sid in zip(copies, enriched_messages, spawn_ids):
async def _bg(c: NewAgent = copy, m: str = msg, s: str = sid) -> None:
try:
await c.amessage(m)
try:
from crewai.new_agent.events import (
NewAgentSpawnCompletedEvent,
)
_emit_spawn_event(
NewAgentSpawnCompletedEvent,
new_agent_id=parent_id,
spawn_id=s,
)
except Exception:
pass
except Exception as e:
try:
from crewai.new_agent.events import (
NewAgentSpawnFailedEvent,
)
_emit_spawn_event(
NewAgentSpawnFailedEvent,
new_agent_id=parent_id,
spawn_id=s,
error=str(e),
)
except Exception:
pass
asyncio.get_running_loop().create_task(_bg())
return f"Dispatched {len(copies)} subtask(s) in the background (fire-and-forget)."
tasks = [
asyncio.wait_for(copy.amessage(msg), timeout=timeout)
for copy, msg in zip(copies, enriched_messages)
]
raw_results = await asyncio.gather(*tasks, return_exceptions=True)
results: list[str] = []
for i, r in enumerate(raw_results):
if isinstance(r, asyncio.TimeoutError):
results.append(f"[Subtask {i + 1}] Timed out after {timeout}s")
try:
from crewai.new_agent.events import NewAgentSpawnFailedEvent
_emit_spawn_event(
NewAgentSpawnFailedEvent,
new_agent_id=parent_id,
spawn_id=spawn_ids[i],
error=f"Timed out after {timeout}s",
)
except Exception:
pass
elif isinstance(r, Exception):
results.append(f"[Subtask {i + 1}] Error: {r}")
try:
from crewai.new_agent.events import NewAgentSpawnFailedEvent
_emit_spawn_event(
NewAgentSpawnFailedEvent,
new_agent_id=parent_id,
spawn_id=spawn_ids[i],
error=str(r),
)
except Exception:
pass
else:
content = r.content if isinstance(r, Message) else str(r)
results.append(f"[Subtask {i + 1}] {content}")
try:
from crewai.new_agent.events import NewAgentSpawnCompletedEvent
_emit_spawn_event(
NewAgentSpawnCompletedEvent,
new_agent_id=parent_id,
spawn_id=spawn_ids[i],
)
except Exception:
pass
self._log_spawn_provenance(subtasks, results, spawn_ids)
return "\n\n".join(results)
def _log_spawn_provenance(
self, subtasks: list[str], results: list[str], spawn_ids: list[str]
) -> None:
if self.agent.settings.provenance_enabled and hasattr(self.agent, "_executor"):
from crewai.new_agent.models import ProvenanceEntry
executor = self.agent._executor
conv_id = (
executor.conversation_history[0].conversation_id
if executor.conversation_history
else ""
)
for i, (subtask, result) in enumerate(zip(subtasks, results)):
executor.provenance_log.append(
ProvenanceEntry(
conversation_id=conv_id,
action="spawn",
reasoning=f"Spawned copy {i + 1}/{len(subtasks)} for parallel sub-task",
inputs={"subtask": subtask, "spawn_id": spawn_ids[i]},
outcome=result[:500],
)
)

View File

@@ -1,733 +0,0 @@
"""Telemetry spans for the NewAgent system."""
from __future__ import annotations
import logging
from typing import Any
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# GAP-47: Module-level registry mapping agent IDs to telemetry instances.
# Event handlers can look up the correct telemetry instance by agent ID.
# ---------------------------------------------------------------------------
_active_agents: dict[str, NewAgentTelemetry] = {}
def register_agent(agent_id: str, telemetry: NewAgentTelemetry) -> None:
"""Register an agent's telemetry instance for event-handler lookup."""
_active_agents[agent_id] = telemetry
def unregister_agent(agent_id: str) -> None:
"""Remove an agent's telemetry instance from the registry."""
_active_agents.pop(agent_id, None)
def get_telemetry_for_agent(agent_id: str) -> NewAgentTelemetry | None:
"""Look up the telemetry instance for a given agent ID."""
return _active_agents.get(agent_id)
class NewAgentTelemetry:
"""Wraps the Telemetry singleton with NewAgent-specific span methods."""
def __init__(self, share_data: bool = False) -> None:
self._telemetry: Any = None
self._share_data: bool = share_data
# GAP-123: Store open duration spans keyed by (agent_id, operation, detail)
self._pending_spans: dict[str, Any] = {}
# GAP-124: Agent fingerprint (set once via set_fingerprint)
self._agent_fingerprint: str = ""
try:
from crewai.telemetry.telemetry import Telemetry
self._telemetry = Telemetry()
except Exception:
pass
def set_fingerprint(self, fingerprint: str) -> None:
"""GAP-124: Store the agent's config fingerprint for span decoration."""
self._agent_fingerprint = fingerprint
def _span_key(self, agent_id: str, operation: str, detail: str = "") -> str:
return f"{agent_id}:{operation}:{detail}"
def store_span(self, key: str, span: Any) -> None:
"""Store an open span for later retrieval by a completed handler."""
if span is not None:
self._pending_spans[key] = span
def retrieve_span(self, key: str) -> Any:
"""Pop and return a previously stored span, or None."""
return self._pending_spans.pop(key, None)
def _should_share_data(self) -> bool:
"""Check if the current agent opts into sharing sensitive data."""
return self._share_data
def _safe(self, fn: str, **kwargs: Any) -> None:
"""Call a telemetry method safely, swallowing errors."""
if self._telemetry is None:
return
try:
method = getattr(self._telemetry, fn, None)
if method:
method(**kwargs)
except Exception:
pass
def agent_created(
self,
agent_id: str,
role: str,
goal: str,
llm: str = "",
tools_count: int = 0,
coworkers_count: int = 0,
memory_enabled: bool = True,
planning_enabled: bool = True,
# GAP-64: Additional metadata counts
coworker_amp_count: int = 0,
mcp_count: int = 0,
apps_count: int = 0,
knowledge_source_count: int = 0,
tool_count: int = 0,
**extra: Any,
) -> None:
if self._telemetry is None:
return
try:
import sys
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Created")
if span:
# GAP-107: Include crewai_version and python_version
try:
import crewai as _crewai_mod
span.set_attribute(
"crewai_version", getattr(_crewai_mod, "__version__", "unknown")
)
except Exception:
span.set_attribute("crewai_version", "unknown")
span.set_attribute("python_version", sys.version.split()[0])
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("new_agent_role", role)
# GAP-124: Agent fingerprint
if self._agent_fingerprint:
span.set_attribute("agent_fingerprint", self._agent_fingerprint)
# GAP-109: Only include goal when share_data is True
if self._should_share_data():
span.set_attribute("new_agent_goal", goal)
span.set_attribute("new_agent_llm", llm)
span.set_attribute("new_agent_tools_count", tools_count)
span.set_attribute("new_agent_coworkers_count", coworkers_count)
span.set_attribute("new_agent_memory_enabled", memory_enabled)
span.set_attribute("new_agent_planning_enabled", planning_enabled)
# GAP-64: Metadata counts
span.set_attribute("new_agent_coworker_amp_count", coworker_amp_count)
span.set_attribute("new_agent_mcp_count", mcp_count)
span.set_attribute("new_agent_apps_count", apps_count)
span.set_attribute(
"new_agent_knowledge_source_count", knowledge_source_count
)
span.set_attribute("new_agent_tool_count", tool_count)
# GAP-107: Forward extra keyword args as span attributes
for key, val in extra.items():
span.set_attribute(key, str(val) if val is not None else "")
tracer.end_span(span)
except Exception:
pass
def execution_started(
self, agent_id: str, conversation_id: str, model: str = ""
) -> Any:
if self._telemetry is None:
return None
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Execution")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("conversation_id", conversation_id)
span.set_attribute("model", model)
if self._agent_fingerprint:
span.set_attribute("agent_fingerprint", self._agent_fingerprint)
return span
except Exception:
return None
def execution_completed(
self,
span: Any,
input_tokens: int = 0,
output_tokens: int = 0,
response_time_ms: int = 0,
) -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span.set_attribute("input_tokens", input_tokens)
span.set_attribute("output_tokens", output_tokens)
span.set_attribute("response_time_ms", response_time_ms)
tracer.end_span(span)
except Exception:
pass
def tool_usage(self, agent_id: str, tool_name: str) -> Any:
if self._telemetry is None:
return None
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Tool Usage")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("tool_name", tool_name)
return span
except Exception:
return None
def tool_usage_error(self, span: Any, error: str = "") -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span.set_attribute("error", error)
tracer.end_span(span)
except Exception:
pass
def tool_usage_completed(self, span: Any) -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
tracer.end_span(span)
except Exception:
pass
def delegation(
self,
agent_id: str,
coworker_role: str,
mode: str = "sync",
source: str = "local",
) -> Any:
if self._telemetry is None:
return None
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Delegation")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("coworker_role", coworker_role)
span.set_attribute("delegation_mode", mode)
span.set_attribute("coworker_source", source)
return span
except Exception:
return None
def delegation_completed(
self, span: Any, tokens_consumed: int = 0, response_time_ms: int = 0
) -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span.set_attribute("tokens_consumed", tokens_consumed)
span.set_attribute("response_time_ms", response_time_ms)
tracer.end_span(span)
except Exception:
pass
def spawn(self, agent_id: str, spawn_id: str, depth: int = 0) -> Any:
if self._telemetry is None:
return None
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Spawn")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("spawn_id", spawn_id)
span.set_attribute("spawn_depth", depth)
return span
except Exception:
return None
def spawn_completed(self, span: Any) -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
tracer.end_span(span)
except Exception:
pass
def spawn_completed_event(self, agent_id: str, spawn_id: str = "") -> None:
"""GAP-123: Point span for spawn completion, used by event listener."""
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Spawn Completed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("spawn_id", spawn_id)
tracer.end_span(span)
except Exception:
pass
def dreaming(self, agent_id: str) -> Any:
if self._telemetry is None:
return None
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Dreaming")
if span:
span.set_attribute("new_agent_id", agent_id)
return span
except Exception:
return None
def dreaming_completed(
self, span: Any, memories_processed: int = 0, canonical_created: int = 0
) -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span.set_attribute("memories_processed", memories_processed)
span.set_attribute("canonical_created", canonical_created)
tracer.end_span(span)
except Exception:
pass
def planning(self, agent_id: str) -> Any:
if self._telemetry is None:
return None
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Planning")
if span:
span.set_attribute("new_agent_id", agent_id)
return span
except Exception:
return None
def planning_completed(self, span: Any, steps_count: int = 0) -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span.set_attribute("plan_steps_count", steps_count)
tracer.end_span(span)
except Exception:
pass
def guardrail(self, agent_id: str, guardrail_type: str = "") -> Any:
if self._telemetry is None:
return None
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Guardrail")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("guardrail_type", guardrail_type)
return span
except Exception:
return None
def guardrail_completed(self, span: Any, passed: bool = True) -> None:
if span is None or self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span.set_attribute("guardrail_passed", passed)
tracer.end_span(span)
except Exception:
pass
def memory_save(self, agent_id: str) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Memory Save")
if span:
span.set_attribute("new_agent_id", agent_id)
tracer.end_span(span)
except Exception:
pass
def memory_recall(self, agent_id: str, results_count: int = 0) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Memory Recall")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("results_count", results_count)
tracer.end_span(span)
except Exception:
pass
def knowledge_suggested(self, agent_id: str, source_type: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Knowledge Suggested")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("source_type", source_type)
tracer.end_span(span)
except Exception:
pass
# ── Additional span methods for GAP-47 / GAP-61 bridge ──────
def conversation_reset(self, agent_id: str) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Conversation Reset")
if span:
span.set_attribute("new_agent_id", agent_id)
tracer.end_span(span)
except Exception:
pass
def message_received(self, agent_id: str, message_length: int = 0) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Message Received")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("message_length", message_length)
tracer.end_span(span)
except Exception:
pass
def message_sent(
self,
agent_id: str,
input_tokens: int = 0,
output_tokens: int = 0,
response_time_ms: int = 0,
) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Message Sent")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("input_tokens", input_tokens)
span.set_attribute("output_tokens", output_tokens)
span.set_attribute("response_time_ms", response_time_ms)
tracer.end_span(span)
except Exception:
pass
def llm_call_started(self, agent_id: str, model: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent LLM Call Started")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("model", model)
tracer.end_span(span)
except Exception:
pass
def llm_call_completed(
self,
agent_id: str,
model: str = "",
input_tokens: int = 0,
output_tokens: int = 0,
response_time_ms: int = 0,
) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent LLM Call Completed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("model", model)
span.set_attribute("input_tokens", input_tokens)
span.set_attribute("output_tokens", output_tokens)
span.set_attribute("response_time_ms", response_time_ms)
tracer.end_span(span)
except Exception:
pass
def llm_call_failed(self, agent_id: str, error: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent LLM Call Failed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("error", error)
tracer.end_span(span)
except Exception:
pass
def tool_usage_started(self, agent_id: str, tool_name: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Tool Usage Started")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("tool_name", tool_name)
tracer.end_span(span)
except Exception:
pass
def tool_usage_completed_event(self, agent_id: str, tool_name: str = "") -> None:
"""GAP-123: Point span for tool completion, used by event listener."""
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Tool Usage Completed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("tool_name", tool_name)
if self._agent_fingerprint:
span.set_attribute("agent_fingerprint", self._agent_fingerprint)
tracer.end_span(span)
except Exception:
pass
def tool_usage_failed(
self, agent_id: str, tool_name: str = "", error: str = ""
) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Tool Usage Failed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("tool_name", tool_name)
span.set_attribute("error", error)
tracer.end_span(span)
except Exception:
pass
def delegation_failed(
self, agent_id: str, coworker_role: str = "", error: str = ""
) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Delegation Failed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("coworker_role", coworker_role)
span.set_attribute("error", error)
tracer.end_span(span)
except Exception:
pass
def fire_and_forget_dispatched(
self, agent_id: str, coworker_role: str = ""
) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Fire And Forget Dispatched")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("coworker_role", coworker_role)
tracer.end_span(span)
except Exception:
pass
def fire_and_forget_completed(self, agent_id: str, coworker_role: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Fire And Forget Completed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("coworker_role", coworker_role)
tracer.end_span(span)
except Exception:
pass
def spawn_failed(self, agent_id: str, spawn_id: str = "", error: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Spawn Failed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("spawn_id", spawn_id)
span.set_attribute("error", error)
tracer.end_span(span)
except Exception:
pass
def context_summarized(self, agent_id: str) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Context Summarized")
if span:
span.set_attribute("new_agent_id", agent_id)
tracer.end_span(span)
except Exception:
pass
def narration_guard_triggered(self, agent_id: str, retries: int = 0) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Narration Guard Triggered")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("retries", retries)
tracer.end_span(span)
except Exception:
pass
def workflow_detected(
self, agent_id: str, tools: list[str] | None = None, count: int = 0
) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Workflow Detected")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("workflow_tools", ",".join(tools or []))
span.set_attribute("workflow_count", count)
tracer.end_span(span)
except Exception:
pass
def workflow_proposed(self, agent_id: str, description: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Workflow Proposed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("workflow_description", description[:500])
tracer.end_span(span)
except Exception:
pass
def workflow_confirmed(self, agent_id: str) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Workflow Confirmed")
if span:
span.set_attribute("new_agent_id", agent_id)
tracer.end_span(span)
except Exception:
pass
def knowledge_query(self, agent_id: str) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Knowledge Query")
if span:
span.set_attribute("new_agent_id", agent_id)
tracer.end_span(span)
except Exception:
pass
def knowledge_confirmed(self, agent_id: str, source_type: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Knowledge Confirmed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("source_type", source_type)
tracer.end_span(span)
except Exception:
pass
def knowledge_rejected(self, agent_id: str) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Knowledge Rejected")
if span:
span.set_attribute("new_agent_id", agent_id)
tracer.end_span(span)
except Exception:
pass
def explain_requested(self, agent_id: str) -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Explain Requested")
if span:
span.set_attribute("new_agent_id", agent_id)
tracer.end_span(span)
except Exception:
pass
def guardrail_passed(self, agent_id: str, guardrail_type: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Guardrail Passed")
if span:
span.set_attribute("new_agent_id", agent_id)
span.set_attribute("guardrail_type", guardrail_type)
tracer.end_span(span)
except Exception:
pass
def status_update(self, state: str = "", detail: str = "") -> None:
if self._telemetry is None:
return
try:
tracer = self._telemetry._tracer
span = tracer.start_span("NewAgent Status Update")
if span:
span.set_attribute("state", state)
span.set_attribute("detail", detail or "")
tracer.end_span(span)
except Exception:
pass

View File

@@ -23,8 +23,8 @@
"human_feedback": "You got human feedback on your work, re-evaluate it and give a new Final Answer when ready.\n {human_feedback}",
"getting_input": "This is the agent's final answer: {final_answer}\n\n",
"summarizer_system_message": "You are a precise assistant that creates structured summaries of agent conversations. You preserve critical context needed for seamless task continuation.",
"summarize_instruction": "Analyze the following conversation and create a concise summary that preserves critical context.\n\n<conversation>\n{conversation}\n</conversation>\n\nCreate a summary with these sections:\n1. **Task**: What is being accomplished?\n2. **Completed**: What has been done so far?\n3. **Key Facts**: Important data, results, values, names, URLs, or code snippets that must not be lost.\n\nWrap your entire summary in <summary> tags.\n\n<summary>\n[Your summary here]\n</summary>",
"summary": "Prior conversation context:\n{merged_summary}\n\nThe above is background context from earlier in this conversation. Do not repeat or reference it directly — use it to inform your response to the user's message below.",
"summarize_instruction": "Analyze the following conversation and create a structured summary that preserves all information needed to continue the task seamlessly.\n\n<conversation>\n{conversation}\n</conversation>\n\nCreate a summary with these sections:\n1. **Task Overview**: What is the agent trying to accomplish?\n2. **Current State**: What has been completed so far? What step is the agent on?\n3. **Important Discoveries**: Key facts, data, tool results, or findings that must not be lost.\n4. **Next Steps**: What should the agent do next based on the conversation?\n5. **Context to Preserve**: Any specific values, names, URLs, code snippets, or details referenced in the conversation.\n\nWrap your entire summary in <summary> tags.\n\n<summary>\n[Your structured summary here]\n</summary>",
"summary": "<summary>\n{merged_summary}\n</summary>\n\nContinue the task from where the conversation left off. The above is a structured summary of prior context.",
"manager_request": "Your best answer to your coworker asking you this, accounting for the context shared.",
"formatted_task_instructions": "Format your final answer according to the following OpenAPI schema: {output_format}\n\nIMPORTANT: Preserve the original content exactly as-is. Do NOT rewrite, paraphrase, or modify the meaning of the content. Only structure it to match the schema format.\n\nDo not include the OpenAPI schema in the final output. Ensure the final output does not include any code block markers like ```json or ```python.",
"conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals.",
@@ -97,13 +97,5 @@
"synthesis_user_prompt": "## Original Task\n{task_description}\n\n## Results from each step\n{combined_steps}\n\nSynthesize these results into a single, coherent final answer.",
"replan_enhancement_prompt": "\n\nIMPORTANT: Previous execution attempt did not fully succeed. Please create a revised plan that accounts for the following context from the previous attempt:\n\n{previous_context}\n\nConsider:\n1. What steps succeeded and can be built upon\n2. What steps failed and why they might have failed\n3. Alternative approaches that might work better\n4. Whether dependencies need to be restructured",
"step_executor_task_context": "## Task Context\nThe following is the full task you are helping complete. Keep this in mind — especially any required output files, exact filenames, and expected formats.\n\n{task_context}\n\n---\n"
},
"new_agent": {
"soul": "You are {role}.\nYour goal: {goal}\nBackground: {backstory}\n\nCRITICAL INSTRUCTIONS:\n- Use your tools to accomplish tasks. Do NOT describe what you would do — actually do it by calling tools.\n- Never output a numbered plan, step list, or task breakdown as your response. If you have a plan, execute it silently using tools.\n- Never repeat or echo raw tool output, file contents, conversation history, or system context in your response.\n- When summarized context appears in the conversation, use it as background — do not echo or reference it.\n- Respond with the results of your work, not a description of what you intend to do.",
"tools_header": "You have access to the following tools:\n{tool_descriptions}",
"coworkers_header": "You have coworkers with specialized expertise. When a request involves work outside your core specialty, delegate to the appropriate coworker using their tool. Delegation is preferred over attempting work you're not specialized in.\n\nAvailable coworkers:\n{coworker_descriptions}",
"skills_self_build": "You can learn new skills from instructions the user gives you. When the user asks you to remember a process, encode a workflow, or create a skill, a skill suggestion will be generated automatically — do NOT use file-writing tools to create skill files yourself.",
"temporal": "Current date and time: {now}",
"tool_result_truncated": "\n\n[Truncated — {char_count} chars total. Use the result directly; do not repeat it in your response.]"
}
}

View File

@@ -9,7 +9,6 @@ CONTEXT_LIMIT_ERRORS: Final[list[str]] = [
"context window full",
"too many tokens",
"input is too long",
"prompt is too long",
"exceeds token limit",
]

View File

@@ -97,17 +97,6 @@ class I18N(BaseModel):
"""
return self.retrieve("memory", key)
def new_agent(self, key: str) -> str:
"""Retrieve a NewAgent prompt by key.
Args:
key: The key of the new_agent prompt to retrieve.
Returns:
The prompt as a string.
"""
return self.retrieve("new_agent", key)
def retrieve(
self,
kind: Literal[
@@ -118,7 +107,6 @@ class I18N(BaseModel):
"planning",
"hierarchical_manager_agent",
"memory",
"new_agent",
],
key: str,
) -> str:

View File

@@ -836,32 +836,18 @@ def test_lite_agent_kickoff_async_inside_flow():
assert isinstance(result, LiteAgentOutput)
@pytest.mark.vcr(record_mode="none")
@pytest.mark.vcr()
def test_lite_agent_standalone_still_works():
"""Test that LiteAgent.kickoff() still works normally outside of a Flow.
This verifies that the magic auto-async pattern doesn't break standalone usage
where there's no event loop running.
"""
from crewai.types.usage_metrics import UsageMetrics
mock_llm = Mock(spec=LLM)
mock_llm.call.return_value = "10"
mock_llm.stop = []
mock_llm.get_token_usage_summary.return_value = UsageMetrics(
total_tokens=10,
prompt_tokens=5,
completion_tokens=5,
cached_prompt_tokens=0,
successful_requests=1,
)
agent = Agent(
role="Standalone Agent",
goal="Answer questions",
backstory="A helpful assistant",
llm=mock_llm,
memory=False,
llm=LLM(model="gpt-4o-mini"),
verbose=False,
)

View File

@@ -1,420 +0,0 @@
"""Tests for dreaming, planning, knowledge discovery, spawning, and narration guard."""
from __future__ import annotations
import asyncio
from datetime import datetime, timezone, timedelta
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from crewai.new_agent import (
AgentSettings,
DreamingEngine,
KnowledgeDiscovery,
Message,
NewAgent,
PlanningEngine,
SpawnSubtaskTool,
)
# ── Dreaming tests ─────────────────────────────────────────────
class TestDreamingEngine:
def test_engine_initialized(self):
agent = NewAgent(role="R", goal="g")
assert agent._dreaming_engine is not None
def test_engine_not_initialized_when_disabled(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(self_improving=False),
)
assert agent._dreaming_engine is None
def test_should_dream_false_initially(self):
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
assert not engine.should_dream()
def test_should_dream_after_threshold(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(dreaming_trigger_threshold=3),
)
engine = agent._dreaming_engine
for _ in range(3):
engine.increment_memory_count()
assert engine.should_dream()
def test_should_dream_after_time_interval(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(dreaming_interval_hours=1),
)
engine = agent._dreaming_engine
engine._last_dreaming_time = datetime.now(timezone.utc) - timedelta(hours=2)
engine._memories_since_last_dream = 1
assert engine.should_dream()
def test_should_not_dream_too_soon(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(dreaming_interval_hours=24),
)
engine = agent._dreaming_engine
engine._last_dreaming_time = datetime.now(timezone.utc) - timedelta(hours=1)
engine._memories_since_last_dream = 0
assert not engine.should_dream()
def test_increment_memory_count(self):
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
assert engine._memories_since_last_dream == 0
engine.increment_memory_count()
engine.increment_memory_count()
assert engine._memories_since_last_dream == 2
@pytest.mark.asyncio
async def test_dream_resets_counters(self):
agent = NewAgent(
role="R", goal="g",
memory=False,
settings=AgentSettings(memory_enabled=False, self_improving=True),
)
engine = agent._dreaming_engine
engine._memories_since_last_dream = 15
result = await engine.dream()
assert engine._memories_since_last_dream == 0
assert engine._last_dreaming_time is not None
assert result["memories_processed"] == 0
def test_detect_workflows_empty(self):
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
workflows = engine._detect_workflows()
assert workflows == []
# ── Planning tests ──────────────────────────────────────────────
class TestPlanningEngine:
def test_engine_initialized(self):
agent = NewAgent(role="R", goal="g")
assert agent._planning_engine is not None
def test_engine_not_initialized_when_disabled(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(planning_enabled=False),
)
assert agent._planning_engine is None
@pytest.mark.asyncio
async def test_assess_complexity_simple(self):
agent = NewAgent(role="R", goal="g")
engine = agent._planning_engine
assert not await engine._assess_complexity("Hi")
@pytest.mark.asyncio
async def test_assess_complexity_complex(self):
agent = NewAgent(role="R", goal="g")
engine = agent._planning_engine
# Must trigger at least 2 complexity indicators:
# - "step by step" keyword AND "comprehensive" AND "compare" = keyword indicator
# - multiple commas (>4)
# - multiple "and" (>3)
msg = (
"Please analyze the following data step by step, compare each of the metrics, "
"then research the implications, analyze the patterns, evaluate the trends, "
"and provide a comprehensive detailed analysis of marketing and sales and operations "
"and support and engineering and design."
)
assert await engine._assess_complexity(msg)
@pytest.mark.asyncio
async def test_maybe_plan_returns_none_for_simple(self):
agent = NewAgent(role="R", goal="g")
engine = agent._planning_engine
result = await engine.maybe_plan("Hi there")
assert result is None
@pytest.mark.asyncio
@patch("crewai.utilities.agent_utils.aget_llm_response")
async def test_create_plan(self, mock_llm):
mock_llm.return_value = "1. Research AI\n2. Compare frameworks\n3. Write summary"
agent = NewAgent(role="R", goal="g")
engine = agent._planning_engine
plan = await engine._create_plan("Research AI agent frameworks")
assert len(plan) == 3
assert "Research AI" in plan[0]
@pytest.mark.asyncio
@patch("crewai.utilities.agent_utils.aget_llm_response")
async def test_maybe_plan_forced(self, mock_llm):
mock_llm.return_value = "1. Step one\n2. Step two"
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(auto_plan=False),
)
engine = agent._planning_engine
plan = await engine.maybe_plan("Anything")
assert plan is not None
assert len(plan) >= 1
def test_current_plan_initially_none(self):
agent = NewAgent(role="R", goal="g")
assert agent._planning_engine.current_plan is None
# ── Knowledge Discovery tests ──────────────────────────────────
class TestKnowledgeDiscovery:
def test_engine_initialized(self):
agent = NewAgent(role="R", goal="g")
assert agent._knowledge_discovery is not None
def test_evaluate_short_result_ignored(self):
agent = NewAgent(role="R", goal="g")
kd = agent._knowledge_discovery
result = kd.evaluate_for_knowledge("search_web", "short")
assert result is None
def test_evaluate_irrelevant_tool_ignored(self):
agent = NewAgent(role="R", goal="g")
kd = agent._knowledge_discovery
result = kd.evaluate_for_knowledge("calculator", "x" * 200)
assert result is None
def test_evaluate_knowledge_worthy(self):
agent = NewAgent(role="R", goal="g")
kd = agent._knowledge_discovery
result = kd.evaluate_for_knowledge("search_web", "x" * 200)
assert result is not None
assert result["status"] == "pending"
assert len(kd.pending_suggestions) == 1
def test_reject_suggestion(self):
agent = NewAgent(role="R", goal="g")
kd = agent._knowledge_discovery
kd.evaluate_for_knowledge("search_web", "x" * 200)
kd.reject_suggestion(0)
assert kd._pending_suggestions[0]["status"] == "rejected"
def test_reject_invalid_index(self):
agent = NewAgent(role="R", goal="g")
kd = agent._knowledge_discovery
kd.reject_suggestion(99) # Should not raise
def test_pending_suggestions_returns_copy(self):
agent = NewAgent(role="R", goal="g")
kd = agent._knowledge_discovery
kd.evaluate_for_knowledge("search_web", "x" * 200)
suggestions = kd.pending_suggestions
suggestions.clear()
assert len(kd.pending_suggestions) == 1 # Original unchanged
# ── Spawn Tool tests ───────────────────────────────────────────
class TestSpawnTool:
def test_spawn_not_allowed_when_disabled(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(can_spawn_copies=False),
)
tool = SpawnSubtaskTool(agent=agent)
result = tool._run(subtasks=["Do something"])
assert "not allowed" in result
def test_spawn_depth_guard(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(can_spawn_copies=True, max_spawn_depth=0),
)
tool = SpawnSubtaskTool(agent=agent)
result = tool._run(subtasks=["Do something"])
assert "depth exceeded" in result
@patch("crewai.new_agent.executor.aget_llm_response")
def test_spawn_creates_copies(self, mock_llm):
mock_llm.return_value = "Subtask result."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
can_spawn_copies=True,
max_spawn_depth=1,
memory_enabled=False,
),
)
tool = SpawnSubtaskTool(agent=agent)
result = tool._run(subtasks=["Task A", "Task B"])
assert "[Subtask 1]" in result
assert "[Subtask 2]" in result
def test_spawn_caps_subtasks(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
can_spawn_copies=True,
max_concurrent_spawns=2,
memory_enabled=False,
),
)
tool = SpawnSubtaskTool(agent=agent)
# The tool should cap subtasks to max_concurrent_spawns
assert agent.settings.max_concurrent_spawns == 2
# ── Narration Guard tests ──────────────────────────────────────
class TestNarrationGuard:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_narration_guard_off_by_default(self, mock_llm):
mock_llm.return_value = "I've updated the file."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Update the file")
# Narration guard off by default — no checking
assert "I've updated" in result.content
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_narration_guard_triggers(self, mock_llm):
mock_llm.side_effect = [
"I've updated the configuration.", # main LLM call
"Here's what you need to do to update the configuration:", # regeneration (no narration)
]
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
memory_enabled=False,
narration_guard=True,
narration_max_retries=1,
),
)
result = await agent.amessage("Update the config")
# After retry, the narration should be corrected
assert "Here's what you need to do" in result.content
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_narration_guard_allows_with_tools(self, mock_llm):
mock_llm.return_value = "I've completed the analysis."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
memory_enabled=False,
narration_guard=True,
),
)
# Simulate that tools were used
result = await agent.amessage("Analyze this")
# Even with guard on, if we claim actions and the LLM didn't use tools,
# the guard would trigger. But the content check still works.
assert result.content is not None
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_narration_bailout_logged(self, mock_llm):
# Always return narrating text matching pattern "\bI deleted\b"
mock_llm.return_value = "I deleted all the files successfully."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
memory_enabled=False,
narration_guard=True,
narration_max_retries=1,
),
)
await agent.amessage("Delete files")
prov = agent.explain()
bailout_entries = [e for e in prov if e.action == "narration_bailout"]
assert len(bailout_entries) == 1
# ── Structured Output integration tests ────────────────────────
class TestStructuredOutputIntegration:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_structured_output_in_metadata(self, mock_llm):
from pydantic import BaseModel
class Result(BaseModel):
answer: str
confidence: float
mock_llm.return_value = '{"answer": "42", "confidence": 0.95}'
agent = NewAgent(
role="R", goal="g",
response_model=Result,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("What is the answer?")
assert result.metadata is not None
assert "structured_output" in result.metadata
assert result.metadata["structured_output"]["answer"] == "42"
assert result.metadata["structured_output"]["confidence"] == 0.95
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_structured_output_no_model(self, mock_llm):
mock_llm.return_value = "Just plain text."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Hello")
assert result.metadata is None
# ── Engine wiring integration tests ────────────────────────────
class TestEngineWiring:
def test_all_engines_present(self):
agent = NewAgent(role="R", goal="g")
assert agent._dreaming_engine is not None
assert agent._planning_engine is not None
assert agent._knowledge_discovery is not None
def test_disabled_engines_are_none(self):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
self_improving=False,
planning_enabled=False,
),
)
assert agent._dreaming_engine is None
assert agent._planning_engine is None
assert agent._knowledge_discovery is not None # Always present
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_spawn_tool_auto_added(self, mock_llm):
mock_llm.return_value = "Done."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
can_spawn_copies=True,
max_spawn_depth=1,
memory_enabled=False,
),
)
# The spawn tool should be added automatically during execution
await agent.amessage("Do something")
# If we get here without error, the integration works
assert True

View File

@@ -1,201 +0,0 @@
"""Tests for the agent TUI and crewai run integration."""
from __future__ import annotations
import json
import os
import re
from pathlib import Path
import pytest
def strip_jsonc_comments(text: str) -> str:
result = re.sub(r"(?<!:)//.*?$", "", text, flags=re.MULTILINE)
result = re.sub(r"/\*.*?\*/", "", result, flags=re.DOTALL)
return result
class TestLoadAgents:
"""Tests for loading agent definitions from agents/ directory."""
def test_loads_json_file(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_agents
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
defn = {"name": "test", "role": "Test", "goal": "Test"}
(agents_dir / "test.json").write_text(json.dumps(defn))
agents = _load_agents(agents_dir)
assert len(agents) == 1
assert agents[0]["name"] == "test"
def test_loads_jsonc_file(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_agents
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
jsonc = '{\n // comment\n "name": "test",\n "role": "R",\n "goal": "G"\n}'
(agents_dir / "test.jsonc").write_text(jsonc)
agents = _load_agents(agents_dir)
assert len(agents) == 1
assert agents[0]["name"] == "test"
def test_loads_multiple_agents(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_agents
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
for name in ("alpha", "beta", "gamma"):
defn = {"name": name, "role": name.title(), "goal": f"{name} goal"}
(agents_dir / f"{name}.json").write_text(json.dumps(defn))
agents = _load_agents(agents_dir)
assert len(agents) == 3
names = [a["name"] for a in agents]
assert sorted(names) == ["alpha", "beta", "gamma"]
def test_skips_invalid_json(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_agents
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
(agents_dir / "good.json").write_text('{"name": "good", "role": "R", "goal": "G"}')
(agents_dir / "bad.json").write_text("this is not json {{{")
agents = _load_agents(agents_dir)
assert len(agents) == 1
assert agents[0]["name"] == "good"
def test_empty_directory(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_agents
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
agents = _load_agents(agents_dir)
assert agents == []
class TestLoadConfig:
"""Tests for loading project config.json."""
def test_loads_config(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_config
config = {"rooms": {"common": {"agents": ["a", "b"], "engagement": "tagged"}}}
(tmp_path / "config.json").write_text(json.dumps(config))
result = _load_config(tmp_path)
assert result["rooms"]["common"]["engagement"] == "tagged"
assert result["rooms"]["common"]["agents"] == ["a", "b"]
def test_missing_config_returns_defaults(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_config
result = _load_config(tmp_path)
assert "rooms" in result
assert "common" in result["rooms"]
def test_loads_jsonc_config(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import _load_config
jsonc = '{\n // comment\n "rooms": {"common": {"agents": [], "engagement": "organic"}}\n}'
(tmp_path / "config.json").write_text(jsonc)
result = _load_config(tmp_path)
assert result["rooms"]["common"]["engagement"] == "organic"
class TestHasAgentsDir:
"""Tests for _has_agents_dir detection in run_crew."""
def test_detects_agents_dir(self, tmp_path: Path) -> None:
from crewai_cli.run_crew import _has_agents_dir
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
(agents_dir / "test.json").write_text('{"name": "test"}')
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
assert _has_agents_dir() is True
finally:
os.chdir(old_cwd)
def test_no_agents_dir(self, tmp_path: Path) -> None:
from crewai_cli.run_crew import _has_agents_dir
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
assert _has_agents_dir() is False
finally:
os.chdir(old_cwd)
def test_empty_agents_dir(self, tmp_path: Path) -> None:
from crewai_cli.run_crew import _has_agents_dir
(tmp_path / "agents").mkdir()
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
assert _has_agents_dir() is False
finally:
os.chdir(old_cwd)
class TestAgentTUIConstruction:
"""Tests for AgentTUI class construction."""
def test_constructs_with_agents_dir(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import AgentTUI
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
(agents_dir / "test.json").write_text('{"name": "test", "role": "R", "goal": "G"}')
tui = AgentTUI(agents_dir=agents_dir)
assert tui._agents_dir == agents_dir
def test_constructs_with_config(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import AgentTUI
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
config = {"rooms": {"common": {"agents": ["test"], "engagement": "organic"}}}
tui = AgentTUI(agents_dir=agents_dir, config=config)
assert tui._config["rooms"]["common"]["engagement"] == "organic"
class TestRunAgentTUI:
"""Tests for run_agent_tui function."""
def test_exits_if_no_agents_dir(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import run_agent_tui
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
with pytest.raises(SystemExit):
run_agent_tui()
finally:
os.chdir(old_cwd)
def test_exits_if_empty_agents_dir(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import run_agent_tui
(tmp_path / "agents").mkdir()
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
with pytest.raises(SystemExit):
run_agent_tui()
finally:
os.chdir(old_cwd)

View File

@@ -1,573 +0,0 @@
"""Tests for the benchmark module."""
from __future__ import annotations
import asyncio
import json
import tempfile
from pathlib import Path
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from crewai_cli.benchmark import (
BenchmarkCase,
BenchmarkResult,
_check_expected,
_strip_jsonc_comments,
format_comparison_table,
format_results_table,
load_benchmark_cases,
run_benchmark,
)
# ── BenchmarkCase model tests ──────────────────────────────────
class TestBenchmarkCase:
def test_with_expected(self):
case = BenchmarkCase(input="What is 2+2?", expected="4")
assert case.input == "What is 2+2?"
assert case.expected == "4"
assert case.criteria is None
def test_with_criteria(self):
case = BenchmarkCase(
input="Write a haiku",
criteria="Must be a valid haiku",
)
assert case.input == "Write a haiku"
assert case.expected is None
assert case.criteria == "Must be a valid haiku"
def test_with_both(self):
case = BenchmarkCase(
input="Answer", expected="42", criteria="Must be numeric"
)
assert case.expected == "42"
assert case.criteria == "Must be numeric"
def test_input_only(self):
case = BenchmarkCase(input="Hello")
assert case.expected is None
assert case.criteria is None
# ── BenchmarkResult model tests ──────────────────────────────────
class TestBenchmarkResult:
def test_defaults(self):
r = BenchmarkResult(case_index=0, input="test")
assert r.case_index == 0
assert r.input == "test"
assert r.passed is False
assert r.score == 0.0
assert r.input_tokens == 0
assert r.output_tokens == 0
assert r.response_time_ms == 0
assert r.cost is None
assert r.model == ""
assert r.actual == ""
def test_full(self):
r = BenchmarkResult(
case_index=1,
input="What is 2+2?",
expected="4",
actual="The answer is 4",
model="openai/gpt-4o",
passed=True,
score=1.0,
input_tokens=50,
output_tokens=10,
response_time_ms=500,
cost=0.001,
)
assert r.passed is True
assert r.cost == 0.001
# ── load_benchmark_cases tests ──────────────────────────────────
class TestLoadBenchmarkCases:
def test_load_json(self, tmp_path: Path):
cases_data = [
{"input": "What is 2+2?", "expected": "4"},
{"input": "Write a haiku", "criteria": "Must be 5-7-5"},
]
f = tmp_path / "cases.json"
f.write_text(json.dumps(cases_data), encoding="utf-8")
cases = load_benchmark_cases(f)
assert len(cases) == 2
assert cases[0].input == "What is 2+2?"
assert cases[0].expected == "4"
assert cases[1].criteria == "Must be 5-7-5"
def test_load_jsonc(self, tmp_path: Path):
jsonc_content = """[
// A simple math test
{"input": "What is 2+2?", "expected": "4"},
/* Multi-line
comment */
{"input": "Hello", "criteria": "Must be polite"}
]"""
f = tmp_path / "cases.jsonc"
f.write_text(jsonc_content, encoding="utf-8")
cases = load_benchmark_cases(f)
assert len(cases) == 2
assert cases[0].expected == "4"
assert cases[1].criteria == "Must be polite"
def test_file_not_found(self):
with pytest.raises(FileNotFoundError, match="not found"):
load_benchmark_cases("/nonexistent/path.json")
def test_invalid_json(self, tmp_path: Path):
f = tmp_path / "bad.json"
f.write_text("{invalid json", encoding="utf-8")
with pytest.raises(ValueError, match="Invalid JSON"):
load_benchmark_cases(f)
def test_not_array(self, tmp_path: Path):
f = tmp_path / "obj.json"
f.write_text('"just a string"', encoding="utf-8")
with pytest.raises(ValueError, match="must contain a JSON array"):
load_benchmark_cases(f)
def test_object_without_cases_key(self, tmp_path: Path):
f = tmp_path / "obj.json"
f.write_text('{"input": "test"}', encoding="utf-8")
with pytest.raises(ValueError, match="must have a 'cases' array"):
load_benchmark_cases(f)
def test_object_wrapper_with_threshold(self, tmp_path: Path):
data = {
"threshold": 0.9,
"cases": [
{"input": "What is 2+2?", "expected": "4"},
{"input": "Hello", "criteria": "Must be polite"},
],
}
f = tmp_path / "wrapped.json"
f.write_text(json.dumps(data), encoding="utf-8")
loaded = load_benchmark_cases(f)
assert len(loaded) == 2
assert loaded.threshold == 0.9
assert loaded.cases[0].input == "What is 2+2?"
def test_object_wrapper_without_threshold(self, tmp_path: Path):
data = {"cases": [{"input": "Hello"}]}
f = tmp_path / "wrapped_no_thresh.json"
f.write_text(json.dumps(data), encoding="utf-8")
loaded = load_benchmark_cases(f)
assert len(loaded) == 1
assert loaded.threshold is None
def test_bare_array_has_no_threshold(self, tmp_path: Path):
f = tmp_path / "bare.json"
f.write_text('[{"input": "Hello"}]', encoding="utf-8")
loaded = load_benchmark_cases(f)
assert len(loaded) == 1
assert loaded.threshold is None
def test_missing_input_field(self, tmp_path: Path):
f = tmp_path / "missing.json"
f.write_text('[{"expected": "4"}]', encoding="utf-8")
with pytest.raises(ValueError, match="missing required 'input' field"):
load_benchmark_cases(f)
def test_non_object_item(self, tmp_path: Path):
f = tmp_path / "bad_items.json"
f.write_text('["not an object"]', encoding="utf-8")
with pytest.raises(ValueError, match="must be a JSON object"):
load_benchmark_cases(f)
def test_string_path(self, tmp_path: Path):
cases_data = [{"input": "Hello"}]
f = tmp_path / "str_path.json"
f.write_text(json.dumps(cases_data), encoding="utf-8")
cases = load_benchmark_cases(str(f))
assert len(cases) == 1
# ── _strip_jsonc_comments tests ──────────────────────────────────
class TestStripJsoncComments:
def test_no_comments(self):
text = '{"key": "value"}'
assert json.loads(_strip_jsonc_comments(text)) == {"key": "value"}
def test_single_line_comments(self):
text = '{\n // comment\n "key": "value"\n}'
result = json.loads(_strip_jsonc_comments(text))
assert result == {"key": "value"}
def test_multi_line_comments(self):
text = '{\n /* multi\n line */\n "key": "value"\n}'
result = json.loads(_strip_jsonc_comments(text))
assert result == {"key": "value"}
# ── _check_expected tests ──────────────────────────────────
class TestCheckExpected:
def test_exact_match(self):
passed, score = _check_expected("4", "4")
assert passed is True
assert score == 1.0
def test_substring_match(self):
passed, score = _check_expected("4", "The answer is 4.")
assert passed is True
assert score == 1.0
def test_case_insensitive(self):
passed, score = _check_expected("hello", "HELLO WORLD")
assert passed is True
assert score == 1.0
def test_no_match(self):
passed, score = _check_expected("banana", "The answer is apple")
assert passed is False
assert score == 0.0
# ── format_results_table tests ──────────────────────────────────
class TestFormatResultsTable:
def test_empty_results(self):
output = format_results_table([])
assert output == "No results to display."
def test_single_result(self):
results = [
BenchmarkResult(
case_index=0,
input="What is 2+2?",
expected="4",
actual="4",
model="openai/gpt-4o",
passed=True,
score=1.0,
input_tokens=50,
output_tokens=10,
response_time_ms=200,
)
]
output = format_results_table(results)
assert "openai/gpt-4o" in output
assert "PASS" in output
assert "1/1 passed" in output
assert "Avg score: 1.00" in output
def test_multiple_results_mixed(self):
results = [
BenchmarkResult(
case_index=0,
input="Q1",
model="m1",
passed=True,
score=1.0,
input_tokens=10,
output_tokens=5,
response_time_ms=100,
),
BenchmarkResult(
case_index=1,
input="Q2",
model="m1",
passed=False,
score=0.3,
input_tokens=20,
output_tokens=8,
response_time_ms=150,
),
]
output = format_results_table(results)
assert "1/2 passed" in output
assert "PASS" in output
assert "FAIL" in output
# Avg score = (1.0 + 0.3) / 2 = 0.65
assert "0.65" in output
def test_long_input_truncated(self):
long_input = "A" * 100
results = [
BenchmarkResult(
case_index=0,
input=long_input,
model="m1",
passed=True,
score=1.0,
)
]
output = format_results_table(results)
assert "..." in output
# ── format_comparison_table tests ──────────────────────────────────
class TestFormatComparisonTable:
def test_empty(self):
output = format_comparison_table({})
assert output == "No results to compare."
def test_single_model(self):
results_by_model = {
"openai/gpt-4o": [
BenchmarkResult(
case_index=0,
input="Q1",
model="openai/gpt-4o",
passed=True,
score=1.0,
input_tokens=50,
output_tokens=10,
response_time_ms=200,
)
]
}
output = format_comparison_table(results_by_model)
assert "openai/gpt-4o" in output
assert "Best model: openai/gpt-4o" in output
def test_multi_model_comparison(self):
results_by_model = {
"model-a": [
BenchmarkResult(
case_index=0, input="Q1", model="model-a",
passed=True, score=0.9, input_tokens=50,
output_tokens=10, response_time_ms=200,
),
BenchmarkResult(
case_index=1, input="Q2", model="model-a",
passed=True, score=0.8, input_tokens=60,
output_tokens=15, response_time_ms=300,
),
],
"model-b": [
BenchmarkResult(
case_index=0, input="Q1", model="model-b",
passed=False, score=0.3, input_tokens=40,
output_tokens=8, response_time_ms=150,
),
BenchmarkResult(
case_index=1, input="Q2", model="model-b",
passed=False, score=0.2, input_tokens=45,
output_tokens=12, response_time_ms=250,
),
],
}
output = format_comparison_table(results_by_model)
assert "model-a" in output
assert "model-b" in output
assert "Best model: model-a" in output
assert "Model Comparison" in output
# ── run_benchmark tests (mocked LLM) ──────────────────────────────────
def _make_mock_agent(content: str = "The answer is 4", input_tokens: int = 50, output_tokens: int = 10):
"""Create a mock agent that returns a fixed response."""
from crewai.new_agent.models import Message
mock_response = Message(
role="agent",
content=content,
model="test-model",
input_tokens=input_tokens,
output_tokens=output_tokens,
response_time_ms=100,
)
mock_agent = MagicMock()
mock_agent.amessage = AsyncMock(return_value=mock_response)
return mock_agent
class TestRunBenchmark:
def test_single_case_expected_pass(self):
cases = [BenchmarkCase(input="What is 2+2?", expected="4")]
mock_agent = _make_mock_agent("The answer is 4")
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
assert "test-model" in results
assert len(results["test-model"]) == 1
assert results["test-model"][0].passed is True
assert results["test-model"][0].score == 1.0
def test_single_case_expected_fail(self):
cases = [BenchmarkCase(input="What is 2+2?", expected="banana")]
mock_agent = _make_mock_agent("The answer is 4")
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
assert results["test-model"][0].passed is False
assert results["test-model"][0].score == 0.0
def test_multiple_cases(self):
cases = [
BenchmarkCase(input="Q1", expected="4"),
BenchmarkCase(input="Q2", expected="banana"),
]
mock_agent = _make_mock_agent("The answer is 4")
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
assert len(results["test-model"]) == 2
assert results["test-model"][0].passed is True
assert results["test-model"][1].passed is False
def test_multi_model_comparison(self):
cases = [BenchmarkCase(input="Q1", expected="4")]
mock_agent = _make_mock_agent("The answer is 4")
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "default"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
models=["model-a", "model-b"],
))
assert "model-a" in results
assert "model-b" in results
assert len(results["model-a"]) == 1
assert len(results["model-b"]) == 1
def test_criteria_evaluation(self):
cases = [BenchmarkCase(input="Write a haiku", criteria="Must be a valid haiku")]
mock_agent = _make_mock_agent("Old pond / frog leaps in / water's sound")
mock_judge_result = (True, 0.9)
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent), \
patch("crewai_cli.benchmark._judge_with_llm", new_callable=AsyncMock, return_value=mock_judge_result):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
assert results["test-model"][0].passed is True
assert results["test-model"][0].score == 0.9
def test_combined_expected_and_criteria(self):
cases = [
BenchmarkCase(
input="What is 2+2?",
expected="4",
criteria="Must be numeric",
)
]
mock_agent = _make_mock_agent("The answer is 4")
mock_judge_result = (True, 0.8)
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent), \
patch("crewai_cli.benchmark._judge_with_llm", new_callable=AsyncMock, return_value=mock_judge_result):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
r = results["test-model"][0]
assert r.passed is True
# Score should be average of expected (1.0) and criteria (0.8) = 0.9
assert r.score == pytest.approx(0.9)
def test_agent_creation_error(self):
cases = [BenchmarkCase(input="Q1", expected="4")]
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", side_effect=Exception("Agent init failed")):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
r = results["test-model"][0]
assert r.passed is False
assert "Agent creation error" in r.actual
def test_agent_message_error(self):
cases = [BenchmarkCase(input="Q1", expected="4")]
mock_agent = MagicMock()
mock_agent.amessage = AsyncMock(side_effect=Exception("LLM timeout"))
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
r = results["test-model"][0]
assert r.passed is False
assert "Error" in r.actual
def test_tokens_and_timing_recorded(self):
cases = [BenchmarkCase(input="Q1", expected="4")]
mock_agent = _make_mock_agent("4", input_tokens=100, output_tokens=25)
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "test-model"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
))
r = results["test-model"][0]
assert r.input_tokens == 100
assert r.output_tokens == 25
assert r.response_time_ms >= 0
def test_default_model_used(self):
"""When no models specified, uses agent's default llm."""
cases = [BenchmarkCase(input="Q1", expected="4")]
mock_agent = _make_mock_agent("4")
with patch("crewai_cli.benchmark._parse_definition", return_value={"role": "test", "goal": "test", "llm": "openai/gpt-4o"}), \
patch("crewai_cli.benchmark._load_agent", return_value=mock_agent):
results = asyncio.run(run_benchmark(
agent_def={"role": "test", "goal": "test"},
cases=cases,
models=None,
))
assert "openai/gpt-4o" in results

View File

@@ -1,451 +0,0 @@
"""Tests for NewAgent CLI commands (create agent, agent reset-history, agent memory)."""
from __future__ import annotations
import json
import logging
import os
import re
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from click.testing import CliRunner
from crewai_cli.cli import crewai
from crewai_cli.create_agent import AGENT_TEMPLATE, create_agent
# ── Helpers ─────────────────────────────────────────────────────
def strip_jsonc_comments(text: str) -> str:
"""Strip // and /* */ comments so the output is valid JSON."""
result = re.sub(r"(?<!:)//.*?$", "", text, flags=re.MULTILINE)
result = re.sub(r"/\*.*?\*/", "", result, flags=re.DOTALL)
result = re.sub(r",\s*([}\]])", r"\1", result)
return result
# ── Helpers ─────────────────────────────────────────────────────
# Standard interactive input for agent creation:
# role, goal, backstory, provider (1=OpenAI), model (1=first), tools (none), api key (skip)
_DEFAULT_PROMPTS_INPUT = "Test Role\nTest Goal\n\n1\n1\n\n\n"
# ── crewai create agent <name> ──────────────────────────────────
class TestCreateAgentCommand:
"""Tests for ``crewai create agent <name>``."""
def test_creates_jsonc_file(self, tmp_path: Path) -> None:
"""The command should create agents/<name>.jsonc."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
result = runner.invoke(
crewai, ["create", "agent", "researcher"],
input=_DEFAULT_PROMPTS_INPUT,
)
assert result.exit_code == 0, result.output
dest = Path("agents/researcher.jsonc")
assert dest.exists(), f"Expected {dest} to be created"
def test_file_contains_agent_name(self, tmp_path: Path) -> None:
"""The scaffolded file must contain the agent name."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
runner.invoke(
crewai, ["create", "agent", "writer"],
input=_DEFAULT_PROMPTS_INPUT,
)
content = Path("agents/writer.jsonc").read_text()
assert '"name": "writer"' in content
def test_prompts_populate_fields(self, tmp_path: Path) -> None:
"""Interactive prompts should populate role, goal, backstory."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
# role, goal, backstory, provider (1=OpenAI), model (1=first), tools (none), api key (skip)
result = runner.invoke(
crewai, ["create", "agent", "analyst"],
input="Data Analyst\nAnalyze data\nExpert analyst\n1\n1\n\n\n",
)
assert result.exit_code == 0, result.output
raw = Path("agents/analyst.jsonc").read_text()
clean = strip_jsonc_comments(raw)
data = json.loads(clean)
assert data["name"] == "analyst"
assert data["role"] == "Data Analyst"
assert data["goal"] == "Analyze data"
assert data["backstory"] == "Expert analyst"
assert data["llm"] == "openai/gpt-5.5"
def test_tools_selection(self, tmp_path: Path) -> None:
"""Selecting tools should populate the tools array."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
# role, goal, backstory, provider (1), model (1), tools (1 2 = SerperDevTool + ScrapeWebsiteTool), api key (skip)
result = runner.invoke(
crewai, ["create", "agent", "searcher"],
input="Web Searcher\nSearch things\n\n1\n1\n1 2\n\n",
)
assert result.exit_code == 0, result.output
raw = Path("agents/searcher.jsonc").read_text()
clean = strip_jsonc_comments(raw)
data = json.loads(clean)
assert data["tools"] == ["SerperDevTool", "ScrapeWebsiteTool"]
def test_jsonc_is_parseable(self, tmp_path: Path) -> None:
"""After stripping comments the JSONC must be valid JSON."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
runner.invoke(
crewai, ["create", "agent", "analyst"],
input=_DEFAULT_PROMPTS_INPUT,
)
raw = Path("agents/analyst.jsonc").read_text()
clean = strip_jsonc_comments(raw)
data = json.loads(clean)
assert data["name"] == "analyst"
assert data["settings"]["memory"] is True
assert data["settings"]["planning"] is True
def test_all_expected_fields_present(self, tmp_path: Path) -> None:
"""The scaffolded JSON should contain every documented field."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
runner.invoke(
crewai, ["create", "agent", "myagent"],
input=_DEFAULT_PROMPTS_INPUT,
)
raw = Path("agents/myagent.jsonc").read_text()
data = json.loads(strip_jsonc_comments(raw))
for key in ("name", "role", "goal", "backstory", "llm", "tools", "mcps", "coworkers", "settings"):
assert key in data, f"Missing expected field: {key}"
def test_does_not_overwrite_without_confirm(self, tmp_path: Path) -> None:
"""If the file already exists, declining should leave it untouched."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
runner.invoke(
crewai, ["create", "agent", "dup"],
input=_DEFAULT_PROMPTS_INPUT,
)
original = Path("agents/dup.jsonc").read_text()
# Decline overwrite (input 'n' after the prompts)
result = runner.invoke(
crewai, ["create", "agent", "dup"],
input="n\n",
)
assert "cancelled" in result.output.lower()
assert Path("agents/dup.jsonc").read_text() == original
def test_creates_agents_directory(self, tmp_path: Path) -> None:
"""The agents/ directory should be created if it does not exist."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
assert not Path("agents").exists()
runner.invoke(
crewai, ["create", "agent", "newone"],
input=_DEFAULT_PROMPTS_INPUT,
)
assert Path("agents").is_dir()
def test_success_message(self, tmp_path: Path) -> None:
"""The command should print a success message."""
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
result = runner.invoke(
crewai, ["create", "agent", "bot"],
input=_DEFAULT_PROMPTS_INPUT,
)
assert "Agent created:" in result.output
# ── crewai agent reset-history <name> ───────────────────────────
class TestAgentResetHistoryCommand:
"""Tests for ``crewai agent reset-history <name>``."""
def test_no_history_file(self) -> None:
runner = CliRunner()
result = runner.invoke(crewai, ["agent", "reset-history", "researcher"])
assert result.exit_code == 0, result.output
assert "researcher" in result.output
assert "no conversation history" in result.output.lower()
def test_deletes_history_file(self, tmp_path: Path) -> None:
import os
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
history_dir = tmp_path / ".crewai" / "conversations"
history_dir.mkdir(parents=True)
history_file = history_dir / "test-agent.json"
history_file.write_text("[]")
runner = CliRunner()
result = runner.invoke(crewai, ["agent", "reset-history", "test-agent"])
assert result.exit_code == 0
assert "cleared" in result.output.lower()
assert not history_file.exists()
finally:
os.chdir(old_cwd)
def test_accepts_any_name(self) -> None:
runner = CliRunner()
result = runner.invoke(crewai, ["agent", "reset-history", "my-custom-agent"])
assert result.exit_code == 0
assert "my-custom-agent" in result.output
# ── Template unit tests ─────────────────────────────────────────
class TestAgentTemplate:
"""Unit tests for the AGENT_TEMPLATE constant."""
def _render(self, **kwargs) -> str:
defaults = {"name": "test", "role": "", "goal": "", "backstory": "", "llm": "openai/gpt-5.5"}
defaults.update(kwargs)
return AGENT_TEMPLATE.format(**defaults)
def test_template_renders_name(self) -> None:
content = self._render(name="tester")
assert '"name": "tester"' in content
def test_template_is_valid_jsonc(self) -> None:
content = self._render(name="demo")
clean = strip_jsonc_comments(content)
data = json.loads(clean)
assert data["name"] == "demo"
assert isinstance(data["settings"], dict)
def test_comments_on_line_above(self) -> None:
"""Comments should be on the line before, not inline with values."""
content = self._render(name="check")
lines = content.split("\n")
for i, line in enumerate(lines):
stripped = line.strip()
# Skip comment-only lines and blank lines
if stripped.startswith("//") or not stripped:
continue
# Lines with actual JSON values should NOT have inline comments
if ":" in stripped and not stripped.startswith("//"):
# Allow trailing comments only on lines that are JUST comments
assert "//" not in stripped.split(":")[1] or stripped.strip().startswith("//"), \
f"Inline comment found on line {i+1}: {line}"
class TestProjectBootstrap:
"""Tests for project structure creation."""
def test_creates_project_structure(self, tmp_path: Path) -> None:
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
runner.invoke(
crewai, ["create", "agent", "myagent"],
input=_DEFAULT_PROMPTS_INPUT,
)
assert Path("agents").is_dir()
assert Path("tools").is_dir()
assert Path("config.json").exists()
def test_config_json_is_valid(self, tmp_path: Path) -> None:
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
runner.invoke(
crewai, ["create", "agent", "myagent"],
input=_DEFAULT_PROMPTS_INPUT,
)
raw = Path("config.json").read_text()
clean = strip_jsonc_comments(raw)
data = json.loads(clean)
assert "rooms" in data
def test_agent_added_to_config(self, tmp_path: Path) -> None:
runner = CliRunner()
with runner.isolated_filesystem(temp_dir=tmp_path):
runner.invoke(
crewai, ["create", "agent", "researcher"],
input=_DEFAULT_PROMPTS_INPUT,
)
raw = Path("config.json").read_text()
clean = strip_jsonc_comments(raw)
data = json.loads(clean)
agents = data["rooms"]["common"]["agents"]
assert "researcher" in agents
# ── GAP-65: Schema validation tests ──────────────────────────
class TestSchemaValidation:
"""Tests for agent definition schema validation (GAP-65)."""
def test_valid_definition_no_warning(self, tmp_path: Path, caplog) -> None:
"""A valid definition should not produce a validation warning."""
from crewai.new_agent.definition_parser import parse_agent_definition
valid = {"role": "Tester", "goal": "Test things", "name": "test"}
with caplog.at_level(logging.WARNING, logger="crewai.new_agent.definition_parser"):
result = parse_agent_definition(valid)
assert result["role"] == "Tester"
# No validation warning expected (if jsonschema is installed)
validation_warnings = [
r for r in caplog.records
if "validation failed" in r.message.lower()
]
assert len(validation_warnings) == 0
def test_invalid_definition_warns(self, tmp_path: Path, caplog) -> None:
"""An invalid definition (missing required fields) should log a warning."""
from crewai.new_agent.definition_parser import parse_agent_definition
invalid = {"name": "bad-agent"} # Missing required "role" and "goal"
with caplog.at_level(logging.WARNING, logger="crewai.new_agent.definition_parser"):
result = parse_agent_definition(invalid)
# Should still return the dict (graceful degradation)
assert result["name"] == "bad-agent"
# Check for validation warning (only if jsonschema is installed)
try:
import jsonschema # noqa: F401
validation_warnings = [
r for r in caplog.records
if "validation failed" in r.message.lower()
]
assert len(validation_warnings) > 0
except ImportError:
pass # No jsonschema, skip assertion
def test_additional_properties_warns(self, tmp_path: Path, caplog) -> None:
"""Extra properties should trigger a validation warning."""
from crewai.new_agent.definition_parser import parse_agent_definition
defn = {
"role": "Tester",
"goal": "Test",
"unknown_field": "should_warn",
}
with caplog.at_level(logging.WARNING, logger="crewai.new_agent.definition_parser"):
result = parse_agent_definition(defn)
assert result["role"] == "Tester"
try:
import jsonschema # noqa: F401
validation_warnings = [
r for r in caplog.records
if "validation failed" in r.message.lower()
]
assert len(validation_warnings) > 0
except ImportError:
pass
def test_jsonc_file_validated(self, tmp_path: Path, caplog) -> None:
"""JSONC files should be validated after parsing."""
from crewai.new_agent.definition_parser import parse_agent_definition
jsonc_content = """{
// This is a JSONC file
"role": "Researcher",
"goal": "Find answers",
"name": "researcher"
}"""
file_path = tmp_path / "test.jsonc"
file_path.write_text(jsonc_content, encoding="utf-8")
with caplog.at_level(logging.WARNING, logger="crewai.new_agent.definition_parser"):
result = parse_agent_definition(file_path)
assert result["role"] == "Researcher"
# ── GAP-68: Agent memory CLI command tests ─────────────────────
class TestAgentMemoryCommand:
"""Tests for ``crewai agent memory <name>``."""
def test_agent_not_found(self, tmp_path: Path) -> None:
"""Command should report when agent definition is not found."""
runner = CliRunner()
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
result = runner.invoke(crewai, ["agent", "memory", "nonexistent"])
assert result.exit_code == 0
assert "not found" in result.output.lower()
finally:
os.chdir(old_cwd)
def test_memory_subcommand_exists(self) -> None:
"""The memory subcommand should be registered."""
runner = CliRunner()
result = runner.invoke(crewai, ["agent", "memory", "--help"])
assert result.exit_code == 0
assert "memory" in result.output.lower()
def test_clear_flag_present(self) -> None:
"""The --clear flag should be accepted."""
runner = CliRunner()
result = runner.invoke(crewai, ["agent", "memory", "--help"])
assert "--clear" in result.output
def test_search_flag_present(self) -> None:
"""The --search flag should be accepted."""
runner = CliRunner()
result = runner.invoke(crewai, ["agent", "memory", "--help"])
assert "--search" in result.output
def test_limit_flag_present(self) -> None:
"""The --limit flag should be accepted."""
runner = CliRunner()
result = runner.invoke(crewai, ["agent", "memory", "--help"])
assert "--limit" in result.output
# ── GAP-28: Organic mode routing tests ─────────────────────────
class TestOrganicMode:
"""Tests for organic engagement mode (GAP-28)."""
def test_score_relevance_keyword_match(self) -> None:
"""Agents whose role/goal matches message words should score highest."""
from crewai_cli.agent_tui import AgentTUI
app = AgentTUI.__new__(AgentTUI)
agents = [
{"name": "researcher", "role": "Web Researcher", "goal": "Find information on the web"},
{"name": "writer", "role": "Content Writer", "goal": "Write compelling articles"},
]
scored = app._score_relevance("search the web for news", agents)
assert len(scored) > 0
names = [a["name"] for a, _ in scored]
assert names[0] == "researcher"
def test_score_relevance_no_match_returns_empty(self) -> None:
"""When no keywords match, empty list is returned."""
from crewai_cli.agent_tui import AgentTUI
app = AgentTUI.__new__(AgentTUI)
agents = [
{"name": "a1", "role": "Alpha", "goal": "Do alpha"},
{"name": "a2", "role": "Beta", "goal": "Do beta"},
]
scored = app._score_relevance("xyzzy foobar", agents)
assert len(scored) == 0
def test_score_relevance_filters_stop_words(self) -> None:
"""Stop words should not cause false matches."""
from crewai_cli.agent_tui import AgentTUI
app = AgentTUI.__new__(AgentTUI)
agents = [
{"name": "helper", "role": "is a helper", "goal": "the goal"},
]
scored = app._score_relevance("is the", agents)
assert len(scored) == 0

View File

@@ -1,257 +0,0 @@
"""Tests for the CLIProvider and formatting helpers."""
from __future__ import annotations
import asyncio
import json
from pathlib import Path
import pytest
from crewai.new_agent.cli_provider import (
CLIProvider,
format_elapsed,
format_status_line,
format_tokens,
)
from crewai.new_agent.models import AgentStatus, Message
from crewai.new_agent.provider import ConversationalProvider
# ── format_tokens ────────────────────────────────────────────
class TestFormatTokens:
def test_zero(self):
assert format_tokens(0) == "0"
def test_small(self):
assert format_tokens(999) == "999"
def test_one_thousand(self):
assert format_tokens(1000) == "1.0k"
def test_thousands(self):
assert format_tokens(1234) == "1.2k"
def test_tens_of_thousands(self):
assert format_tokens(12345) == "12.3k"
def test_hundreds_of_thousands(self):
assert format_tokens(123456) == "123.5k"
def test_millions(self):
assert format_tokens(1234567) == "1.2M"
def test_large_millions(self):
assert format_tokens(12345678) == "12.3M"
def test_one(self):
assert format_tokens(1) == "1"
def test_boundary_999(self):
assert format_tokens(999) == "999"
def test_boundary_999999(self):
assert format_tokens(999999) == "1000.0k"
def test_boundary_1000000(self):
assert format_tokens(1000000) == "1.0M"
# ── format_elapsed ───────────────────────────────────────────
class TestFormatElapsed:
def test_seconds(self):
assert format_elapsed(12000) == "12s"
def test_zero(self):
assert format_elapsed(0) == "0s"
def test_one_minute(self):
assert format_elapsed(60000) == "1m 0s"
def test_minutes_and_seconds(self):
assert format_elapsed(72000) == "1m 12s"
def test_one_hour(self):
assert format_elapsed(3600000) == "1h 0m"
def test_hours_and_minutes(self):
assert format_elapsed(3723000) == "1h 2m"
def test_under_one_second(self):
assert format_elapsed(500) == "0s"
def test_59_seconds(self):
assert format_elapsed(59000) == "59s"
# ── format_status_line ───────────────────────────────────────
class TestFormatStatusLine:
def test_basic_status(self):
status = AgentStatus(state="thinking")
line = format_status_line(status)
assert line == "⠋ thinking…"
def test_with_detail(self):
status = AgentStatus(state="using_tool", detail="Searching the web")
line = format_status_line(status)
assert line == "⠋ Searching the web…"
def test_with_elapsed(self):
status = AgentStatus(state="thinking", detail="Analyzing", elapsed_ms=12000)
line = format_status_line(status)
assert line == "⠋ Analyzing… (12s)"
def test_with_tokens(self):
status = AgentStatus(
state="using_tool",
detail="Searching the web",
elapsed_ms=12000,
input_tokens=3400,
output_tokens=1200,
)
line = format_status_line(status)
assert line == "⠋ Searching the web… (12s · ↓ 3.4k tokens · ↑ 1.2k tokens)"
def test_custom_spinner_frame(self):
status = AgentStatus(state="thinking", detail="Working")
line = format_status_line(status, spinner_frame="")
assert line.startswith("⠸ Working…")
def test_only_input_tokens(self):
status = AgentStatus(
state="thinking",
detail="Reading",
elapsed_ms=5000,
input_tokens=500,
output_tokens=0,
)
line = format_status_line(status)
assert line == "⠋ Reading… (5s · ↓ 500 tokens)"
def test_only_output_tokens(self):
status = AgentStatus(
state="thinking",
detail="Writing",
elapsed_ms=0,
input_tokens=0,
output_tokens=2500,
)
line = format_status_line(status)
assert line == "⠋ Writing… (↑ 2.5k tokens)"
# ── CLIProvider protocol conformance ─────────────────────────
class TestCLIProviderProtocol:
def test_implements_protocol(self):
provider = CLIProvider(agent_name="test-agent")
assert isinstance(provider, ConversationalProvider)
def test_has_required_methods(self):
provider = CLIProvider()
assert hasattr(provider, "send_message")
assert hasattr(provider, "receive_message")
assert hasattr(provider, "send_status")
assert hasattr(provider, "get_history")
assert hasattr(provider, "save_history")
assert hasattr(provider, "reset_history")
# ── CLIProvider history persistence ──────────────────────────
class TestCLIProviderHistory:
@pytest.fixture()
def provider(self, tmp_path, monkeypatch):
"""Create a CLIProvider that stores history in a temp dir."""
monkeypatch.chdir(tmp_path)
return CLIProvider(agent_name="test-agent")
def test_get_history_empty(self, provider):
assert provider.get_history() == []
def test_save_and_load(self, provider):
messages = [
Message(role="user", content="Hello"),
Message(role="agent", content="Hi there", sender="TestAgent"),
]
provider.save_history(messages)
loaded = provider.get_history()
assert len(loaded) == 2
assert loaded[0].role == "user"
assert loaded[0].content == "Hello"
assert loaded[1].role == "agent"
assert loaded[1].content == "Hi there"
assert loaded[1].sender == "TestAgent"
def test_reset_history(self, provider, tmp_path):
messages = [Message(role="user", content="Hello")]
provider.save_history(messages)
assert len(provider.get_history()) == 1
provider.reset_history()
assert provider.get_history() == []
def test_reset_nonexistent_history(self, provider):
# Should not raise
provider.reset_history()
def test_history_creates_directories(self, provider, tmp_path):
messages = [Message(role="user", content="Hello")]
provider.save_history(messages)
db_path = tmp_path / ".crewai" / "conversations" / "test-agent.db"
assert db_path.exists()
def test_history_roundtrip_preserves_fields(self, provider):
msg = Message(
role="agent",
content="Result",
sender="Researcher",
model="gpt-4o",
input_tokens=100,
output_tokens=50,
tools_used=["search"],
)
provider.save_history([msg])
loaded = provider.get_history()
assert loaded[0].sender == "Researcher"
assert loaded[0].model == "gpt-4o"
assert loaded[0].input_tokens == 100
assert loaded[0].output_tokens == 50
assert loaded[0].tools_used == ["search"]
# ── CLIProvider send_message ─────────────────────────────────
class TestCLIProviderSendMessage:
def test_send_agent_message(self, capsys, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
provider = CLIProvider(agent_name="test")
msg = Message(role="agent", content="Hello!", sender="Researcher")
asyncio.run(provider.send_message(msg))
captured = capsys.readouterr()
assert "Researcher: Hello!" in captured.out
def test_send_system_message(self, capsys, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
provider = CLIProvider(agent_name="test")
msg = Message(role="system", content="Agent initialized")
asyncio.run(provider.send_message(msg))
captured = capsys.readouterr()
assert "[system] Agent initialized" in captured.out
def test_send_agent_message_no_sender(self, capsys, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
provider = CLIProvider(agent_name="test")
msg = Message(role="agent", content="Hi")
asyncio.run(provider.send_message(msg))
captured = capsys.readouterr()
assert "Agent: Hi" in captured.out

View File

@@ -1,480 +0,0 @@
"""Tests for Flow.ask() and Flow.say() with ConversationalProvider integration."""
from __future__ import annotations
import asyncio
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from crewai.flow.flow import Flow, start
from crewai.new_agent.models import Message
from crewai.new_agent.provider import ConversationalProvider, DirectProvider
# ── Helpers ─────────────────────────────────────────────────────
class MockConversationalProvider:
"""A mock ConversationalProvider that records sent messages and
returns pre-configured replies for receive_message().
"""
def __init__(self, replies: list[str] | None = None) -> None:
self._replies = list(replies or [])
self._reply_index = 0
self.sent_messages: list[Message] = []
self.statuses: list[Any] = []
async def send_message(self, message: Message) -> None:
self.sent_messages.append(message)
async def receive_message(self) -> Message:
if self._reply_index < len(self._replies):
content = self._replies[self._reply_index]
self._reply_index += 1
return Message(role="user", content=content)
return Message(role="user", content="")
async def send_status(self, status: Any) -> None:
self.statuses.append(status)
def get_history(self) -> list[Message]:
return list(self.sent_messages)
def save_history(self, messages: list[Message]) -> None:
pass
def reset_history(self) -> None:
self.sent_messages.clear()
def save_provenance(self, entries: list) -> None:
pass
def load_provenance(self) -> list:
return []
def get_scope(self) -> dict[str, str]:
return {}
# ── Test Flows ──────────────────────────────────────────────────
class SimpleAskFlow(Flow):
"""Flow that asks a single question."""
_skip_auto_memory = True
@start()
def greet(self):
answer = self.ask("What is your name?")
self.state["answer"] = answer
return answer
class SimpleSayFlow(Flow):
"""Flow that sends a message without waiting for a response."""
_skip_auto_memory = True
@start()
def notify(self):
self.say("Processing started...")
self.state["notified"] = True
return "done"
class AskAndSayFlow(Flow):
"""Flow that uses both ask() and say()."""
_skip_auto_memory = True
@start()
def interact(self):
self.say("Welcome to the interactive flow!")
name = self.ask("What is your name?")
self.say(f"Hello, {name}! Processing your request...")
topic = self.ask("What topic interests you?")
self.say(f"Great choice, {name}! Researching {topic}...")
self.state["name"] = name
self.state["topic"] = topic
return {"name": name, "topic": topic}
class MetadataFlow(Flow):
"""Flow that passes metadata through ask() and say()."""
_skip_auto_memory = True
@start()
def with_metadata(self):
self.say("Starting", metadata={"channel": "#ops"})
answer = self.ask("Continue?", metadata={"user_id": "u123"})
self.state["answer"] = answer
return answer
# ── Tests: ConversationalProvider field ─────────────────────────
class TestConversationalProviderField:
def test_default_is_none(self):
flow = Flow(_skip_auto_memory=True, suppress_flow_events=True)
assert flow.conversational_provider is None
def test_can_set_provider(self):
provider = MockConversationalProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
assert flow.conversational_provider is provider
def test_provider_implements_protocol(self):
provider = MockConversationalProvider()
assert isinstance(provider, ConversationalProvider)
# ── Tests: ask() with ConversationalProvider ────────────────────
class TestAskWithConversationalProvider:
def test_ask_sends_and_receives(self):
provider = MockConversationalProvider(replies=["Alice"])
flow = SimpleAskFlow(
conversational_provider=provider,
suppress_flow_events=True,
)
result = flow.kickoff()
assert result == "Alice"
assert flow.state["answer"] == "Alice"
# The provider should have received the question
assert len(provider.sent_messages) == 1
assert provider.sent_messages[0].content == "What is your name?"
assert provider.sent_messages[0].role == "agent"
def test_ask_returns_none_on_timeout(self):
class SlowProvider(MockConversationalProvider):
async def receive_message(self) -> Message:
await asyncio.sleep(10)
return Message(role="user", content="too late")
provider = SlowProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
result = flow.ask("Quick question?", timeout=0.1)
assert result is None
def test_ask_returns_none_on_provider_error(self):
class BrokenProvider(MockConversationalProvider):
async def receive_message(self) -> Message:
raise ConnectionError("Provider disconnected")
provider = BrokenProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
result = flow.ask("Hello?")
assert result is None
def test_ask_records_input_history(self):
provider = MockConversationalProvider(replies=["Bob"])
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
flow.ask("Who are you?")
assert len(flow._input_history) == 1
entry = flow._input_history[0]
assert entry["message"] == "Who are you?"
assert entry["response"] == "Bob"
def test_ask_with_metadata(self):
provider = MockConversationalProvider(replies=["yes"])
flow = MetadataFlow(
conversational_provider=provider,
suppress_flow_events=True,
)
result = flow.kickoff()
assert result == "yes"
# Check that the ask message was sent with correct metadata
ask_msgs = [m for m in provider.sent_messages if "Continue" in m.content]
assert len(ask_msgs) == 1
assert ask_msgs[0].metadata == {"user_id": "u123"}
# ── Tests: say() ────────────────────────────────────────────────
class TestSayWithConversationalProvider:
def test_say_sends_message(self):
provider = MockConversationalProvider()
flow = SimpleSayFlow(
conversational_provider=provider,
suppress_flow_events=True,
)
result = flow.kickoff()
assert result == "done"
assert flow.state["notified"] is True
assert len(provider.sent_messages) == 1
assert provider.sent_messages[0].content == "Processing started..."
assert provider.sent_messages[0].role == "agent"
def test_say_with_metadata(self):
provider = MockConversationalProvider()
flow = MetadataFlow(
conversational_provider=provider,
suppress_flow_events=True,
)
# We need a reply for the ask() call
provider._replies = ["ok"]
flow.kickoff()
# The say("Starting") message should have metadata
say_msgs = [m for m in provider.sent_messages if m.content == "Starting"]
assert len(say_msgs) == 1
assert say_msgs[0].metadata == {"channel": "#ops"}
def test_say_does_not_block(self):
"""say() should not wait for a response -- it's fire-and-forget."""
provider = MockConversationalProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
# say() should return None (no return value)
result = flow.say("Hello!")
assert result is None
assert len(provider.sent_messages) == 1
def test_say_gracefully_handles_provider_error(self):
class BrokenSayProvider(MockConversationalProvider):
async def send_message(self, message: Message) -> None:
raise ConnectionError("Cannot send")
provider = BrokenSayProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
# Should not raise -- errors are logged and swallowed
flow.say("This will fail silently")
class TestSayWithoutProvider:
def test_say_prints_to_console(self):
flow = Flow(
_skip_auto_memory=True,
suppress_flow_events=True,
)
# Without a conversational_provider, say() falls back to console
with patch("crewai.flow.flow.Console") as MockConsole:
mock_console = MagicMock()
MockConsole.return_value = mock_console
flow.say("Console message")
mock_console.print.assert_called_once()
# Verify the Panel was created with the message
call_args = mock_console.print.call_args
panel = call_args[0][0]
# The Panel renderable should contain our message
assert "Console message" in str(panel.renderable)
# ── Tests: Combined ask() and say() ────────────────────────────
class TestAskAndSayCombined:
def test_full_conversation_flow(self):
provider = MockConversationalProvider(replies=["Alice", "AI"])
flow = AskAndSayFlow(
conversational_provider=provider,
suppress_flow_events=True,
)
result = flow.kickoff()
assert result == {"name": "Alice", "topic": "AI"}
assert flow.state["name"] == "Alice"
assert flow.state["topic"] == "AI"
# Check all sent messages in order
contents = [m.content for m in provider.sent_messages]
assert contents == [
"Welcome to the interactive flow!",
"What is your name?",
"Hello, Alice! Processing your request...",
"What topic interests you?",
"Great choice, Alice! Researching AI...",
]
def test_mixed_say_and_ask_message_roles(self):
provider = MockConversationalProvider(replies=["yes"])
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
flow.say("Info message")
flow.ask("Question?")
# Both say() and ask() send as "agent" role
assert all(m.role == "agent" for m in provider.sent_messages)
# ── Tests: Fallback behavior (no conversational_provider) ──────
class MockInputProvider:
"""A mock InputProvider that returns a pre-configured response."""
def __init__(self, response: str = "fallback answer") -> None:
self._response = response
self.call_count = 0
def request_input(
self,
message: str,
flow: Any,
metadata: dict[str, Any] | None = None,
) -> str | None:
self.call_count += 1
return self._response
class TestFallbackBehavior:
def test_ask_falls_back_to_input_provider(self):
"""When no conversational_provider is set, ask() uses InputProvider."""
mock_input_provider = MockInputProvider("fallback answer")
flow = Flow(
input_provider=mock_input_provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
result = flow.ask("Test question?")
assert result == "fallback answer"
assert mock_input_provider.call_count == 1
def test_conversational_provider_takes_priority(self):
"""When both providers are set, conversational_provider wins for ask()."""
conv_provider = MockConversationalProvider(replies=["conv answer"])
input_provider = MockInputProvider("input answer")
flow = Flow(
conversational_provider=conv_provider,
input_provider=input_provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
result = flow.ask("Which provider?")
assert result == "conv answer"
# InputProvider should NOT have been called
assert input_provider.call_count == 0
# ── Tests: Events ───────────────────────────────────────────────
class TestFlowMessageEvents:
def test_say_emits_flow_message_sent_event(self):
from crewai.events.types.flow_events import FlowMessageSentEvent
provider = MockConversationalProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
emitted_events: list[FlowMessageSentEvent] = []
original_emit = crewai_event_bus_emit = None
with patch.object(
type(flow), "_Flow__class__", create=True
):
pass
# We'll check that the event is emitted by patching crewai_event_bus
with patch("crewai.flow.flow.crewai_event_bus") as mock_bus:
flow.say("Test message", metadata={"key": "value"})
# Find the FlowMessageSentEvent among emitted events
for call in mock_bus.emit.call_args_list:
args = call[0]
if len(args) >= 2 and isinstance(args[1], FlowMessageSentEvent):
event = args[1]
assert event.message == "Test message"
assert event.metadata == {"key": "value"}
assert event.type == "flow_message_sent"
emitted_events.append(event)
assert len(emitted_events) == 1
def test_ask_emits_input_events_with_conv_provider(self):
from crewai.events.types.flow_events import (
FlowInputReceivedEvent,
FlowInputRequestedEvent,
)
provider = MockConversationalProvider(replies=["answer"])
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
with patch("crewai.flow.flow.crewai_event_bus") as mock_bus:
flow.ask("Question?")
requested = [
call[0][1]
for call in mock_bus.emit.call_args_list
if isinstance(call[0][1], FlowInputRequestedEvent)
]
received = [
call[0][1]
for call in mock_bus.emit.call_args_list
if isinstance(call[0][1], FlowInputReceivedEvent)
]
assert len(requested) == 1
assert requested[0].message == "Question?"
assert len(received) == 1
assert received[0].response == "answer"
# ── Tests: DirectProvider as conversational_provider ────────────
class TestDirectProviderIntegration:
def test_direct_provider_send_only(self):
"""DirectProvider supports send_message but not receive_message."""
provider = DirectProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
# say() should work
flow.say("Hello from flow")
assert len(provider.get_history()) == 1
assert provider.get_history()[0].content == "Hello from flow"
def test_direct_provider_ask_returns_none(self):
"""DirectProvider.receive_message raises NotImplementedError,
so ask() should return None gracefully."""
provider = DirectProvider()
flow = Flow(
conversational_provider=provider,
_skip_auto_memory=True,
suppress_flow_events=True,
)
result = flow.ask("Will fail gracefully")
assert result is None

View File

@@ -1,208 +0,0 @@
"""Tests for the agent definition parser and JSON Schema."""
from __future__ import annotations
import json
import tempfile
from pathlib import Path
from unittest.mock import patch
import pytest
from crewai.new_agent.definition_parser import (
load_agent_from_definition,
parse_agent_definition,
strip_jsonc_comments,
)
class TestStripJsoncComments:
def test_no_comments(self):
text = '{"key": "value"}'
assert json.loads(strip_jsonc_comments(text)) == {"key": "value"}
def test_single_line_comments(self):
text = '{\n // This is a comment\n "key": "value"\n}'
result = json.loads(strip_jsonc_comments(text))
assert result == {"key": "value"}
def test_multi_line_comments(self):
text = '{\n /* This is\n a multi-line comment */\n "key": "value"\n}'
result = json.loads(strip_jsonc_comments(text))
assert result == {"key": "value"}
def test_url_in_value_not_stripped(self):
text = '{"url": "https://example.com"}'
result = json.loads(strip_jsonc_comments(text))
assert result["url"] == "https://example.com"
class TestParseAgentDefinition:
def test_parse_dict(self):
defn = {"role": "R", "goal": "g"}
result = parse_agent_definition(defn)
assert result == defn
def test_parse_json_string(self):
raw = '{"role": "R", "goal": "g"}'
result = parse_agent_definition(raw)
assert result["role"] == "R"
def test_parse_json_file(self):
with tempfile.NamedTemporaryFile(suffix=".json", mode="w", delete=False) as f:
json.dump({"role": "Writer", "goal": "Write articles"}, f)
f.flush()
result = parse_agent_definition(f.name)
assert result["role"] == "Writer"
def test_parse_jsonc_file(self):
with tempfile.NamedTemporaryFile(suffix=".jsonc", mode="w", delete=False) as f:
f.write('{\n // Agent definition\n "role": "Writer",\n "goal": "Write"\n}')
f.flush()
result = parse_agent_definition(f.name)
assert result["role"] == "Writer"
class TestLoadAgentFromDefinition:
def test_basic_definition(self):
defn = {
"role": "Senior Researcher",
"goal": "Find information",
"backstory": "Expert researcher.",
}
agent = load_agent_from_definition(defn)
assert agent.role == "Senior Researcher"
assert agent.goal == "Find information"
assert agent.backstory == "Expert researcher."
def test_minimal_definition(self):
agent = load_agent_from_definition({"role": "R", "goal": "g"})
assert agent.role == "R"
assert agent.goal == "g"
def test_settings_mapping(self):
defn = {
"role": "R",
"goal": "g",
"settings": {
"memory": False,
"reasoning": False,
"planning": False,
"narration_guard": True,
"max_history_messages": 50,
},
}
agent = load_agent_from_definition(defn)
assert agent.settings.memory_enabled is False
assert agent.settings.reasoning_enabled is False
assert agent.settings.planning_enabled is False
assert agent.settings.narration_guard is True
assert agent.settings.max_history_messages == 50
def test_verbose_and_max_iter(self):
defn = {"role": "R", "goal": "g", "verbose": True, "max_iter": 10}
agent = load_agent_from_definition(defn)
assert agent.verbose is True
assert agent.max_iter == 10
def test_llm_setting(self):
defn = {"role": "R", "goal": "g", "llm": "openai/gpt-4o"}
agent = load_agent_from_definition(defn)
assert agent.llm == "openai/gpt-4o"
def test_guardrail_llm(self):
defn = {
"role": "R",
"goal": "g",
"guardrail": {"type": "llm", "instructions": "Be safe"},
}
agent = load_agent_from_definition(defn)
assert agent.guardrail is not None
from crewai.tasks.llm_guardrail import LLMGuardrail
assert isinstance(agent.guardrail, LLMGuardrail)
assert agent.guardrail.description == "Be safe"
def test_from_json_file(self):
defn = {"role": "FileAgent", "goal": "Test file loading", "backstory": "From JSON"}
with tempfile.NamedTemporaryFile(suffix=".json", mode="w", delete=False) as f:
json.dump(defn, f)
f.flush()
agent = load_agent_from_definition(f.name)
assert agent.role == "FileAgent"
assert agent.backstory == "From JSON"
def test_coworker_amp_handle(self):
defn = {
"role": "Manager",
"goal": "Manage",
"coworkers": [{"amp": "content-writer"}],
}
agent = load_agent_from_definition(defn)
# AMP handles are passed as strings for resolution
assert "content-writer" in agent.coworkers
def test_coworker_ref_with_agents_dir(self):
with tempfile.TemporaryDirectory() as tmpdir:
agents_dir = Path(tmpdir)
writer_defn = {"role": "Writer", "goal": "Write"}
(agents_dir / "writer.json").write_text(json.dumps(writer_defn))
defn = {
"role": "Manager",
"goal": "Manage",
"coworkers": [{"ref": "writer"}],
}
agent = load_agent_from_definition(defn, agents_dir=agents_dir)
assert len(agent.coworkers) == 1
def test_circular_coworker_ref_no_crash(self):
"""Two agents referencing each other as coworkers should not crash."""
with tempfile.TemporaryDirectory() as tmpdir:
agents_dir = Path(tmpdir)
a_defn = {
"name": "agent_a",
"role": "A",
"goal": "Do A",
"coworkers": [{"ref": "agent_b"}],
}
b_defn = {
"name": "agent_b",
"role": "B",
"goal": "Do B",
"coworkers": [{"ref": "agent_a"}],
}
(agents_dir / "agent_a.json").write_text(json.dumps(a_defn))
(agents_dir / "agent_b.json").write_text(json.dumps(b_defn))
agent = load_agent_from_definition(
agents_dir / "agent_a.json", agents_dir=agents_dir
)
assert agent is not None
assert agent.role == "A"
# B should be loaded as a coworker, but B's ref to A is skipped
assert len(agent.coworkers) == 1
class TestJsonSchema:
def test_schema_is_valid_json(self):
schema_path = Path(__file__).parent.parent.parent / "src" / "crewai" / "new_agent" / "agent_schema.json"
with open(schema_path) as f:
schema = json.load(f)
assert schema["$schema"] == "https://json-schema.org/draft/2020-12/schema"
assert "role" in schema["required"]
assert "goal" in schema["required"]
def test_schema_has_key_properties(self):
schema_path = Path(__file__).parent.parent.parent / "src" / "crewai" / "new_agent" / "agent_schema.json"
with open(schema_path) as f:
schema = json.load(f)
props = schema["properties"]
assert "role" in props
assert "goal" in props
assert "backstory" in props
assert "llm" in props
assert "tools" in props
assert "coworkers" in props
assert "settings" in props
assert "guardrail" in props

View File

@@ -1,654 +0,0 @@
"""Tests for GAP-78, GAP-79, GAP-84, GAP-85, GAP-86, GAP-88, GAP-89, GAP-97,
GAP-99, GAP-102, GAP-110, GAP-111, GAP-116.
Covers:
- GAP-78: parent_agent passed to build_coworker_tools
- GAP-79: reset_conversation preserves provenance
- GAP-84: conversation_started fires at conversation start, not construction
- GAP-85: response_model applied in streaming path
- GAP-86: AMP coworker dict supports both {"amp": "handle"} and {"handle": "handle"}
- GAP-88: explain() works in async contexts without planning engine
- GAP-89: Provenance entries persisted to memory backend
- GAP-97: Proactive context window summarization
- GAP-99: Circular coworker reference logs a warning
- GAP-102: confidence and sources populated on ProvenanceEntry
- GAP-110: provider field typed as ConversationalProvider
- GAP-111: memory_view property exposes memory backend
- GAP-116: conversation_history is property delegating to executor (intentional)
"""
from __future__ import annotations
import asyncio
import logging
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch, call
import pytest
from crewai.new_agent import (
AgentSettings,
Message,
NewAgent,
ProvenanceEntry,
TokenUsage,
)
from crewai.new_agent.coworker_tools import build_coworker_tools, DelegateToCoworkerTool
from crewai.new_agent.events import NewAgentCreatedEvent, NewAgentConversationStartedEvent
from crewai.new_agent.executor import ConversationalAgentExecutor
from crewai.new_agent.provider import ConversationalProvider, DirectProvider
# ── Helpers ────────────────────────────────────────────────────
def _make_agent(**overrides: Any) -> NewAgent:
"""Create a minimal NewAgent with mocked LLM for unit testing."""
defaults = dict(
role="Tester",
goal="Test things",
backstory="A test agent",
settings=AgentSettings(
memory_enabled=False,
planning_enabled=False,
self_improving=False,
provenance_enabled=True,
),
)
defaults.update(overrides)
with patch("crewai.new_agent.new_agent.NewAgent._init_llm"):
with patch("crewai.new_agent.new_agent.NewAgent._init_telemetry"):
agent = NewAgent(**defaults)
return agent
def _make_executor(agent: NewAgent) -> ConversationalAgentExecutor:
"""Create an executor from an agent."""
return ConversationalAgentExecutor(
agent=agent,
provider=DirectProvider(),
max_iter=5,
verbose=False,
)
# ── GAP-78: parent_agent passed to build_coworker_tools ──────
class TestGAP78ParentAgentInCoworkerTools:
def test_parent_agent_passed_to_build_coworker_tools(self):
"""Coworker tools built for an agent have parent_agent set to the agent itself."""
coworker = _make_agent(role="Helper", goal="Help out")
agent = _make_agent(coworkers=[coworker])
# The agent should have built coworker tools with parent_agent=self
assert len(agent._coworker_tools) >= 1
delegate_tool = agent._coworker_tools[0]
assert isinstance(delegate_tool, DelegateToCoworkerTool)
assert delegate_tool.parent_agent is agent
def test_delegate_tool_has_parent_agent_set(self):
"""DelegateToCoworkerTool receives parent_agent from build_coworker_tools."""
coworker = _make_agent(role="Writer", goal="Write stuff")
tools = build_coworker_tools(
[coworker], parent_role="Tester", parent_agent="sentinel_parent",
)
assert len(tools) >= 1
delegate_tool = tools[0]
assert isinstance(delegate_tool, DelegateToCoworkerTool)
assert delegate_tool.parent_agent == "sentinel_parent"
# ── GAP-79: reset_conversation preserves provenance ──────────
class TestGAP79ResetPreservesProvenance:
def test_provenance_survives_reset(self):
"""Provenance log is NOT cleared when conversation is reset."""
agent = _make_agent()
executor = agent._executor
assert executor is not None
# Add some provenance entries
executor.provenance_log.append(
ProvenanceEntry(conversation_id="c1", action="response", outcome="test")
)
executor.provenance_log.append(
ProvenanceEntry(conversation_id="c1", action="tool_call", outcome="tool result")
)
assert len(executor.provenance_log) == 2
# Reset conversation
agent.reset_conversation()
# The new executor should have the same provenance (same executor object, just cleared history)
new_executor = agent._executor
assert new_executor is not None
assert len(new_executor.provenance_log) == 2
def test_conversation_history_cleared_on_reset(self):
"""Conversation history IS cleared on reset (unlike provenance)."""
agent = _make_agent()
executor = agent._executor
executor.conversation_history.append(
Message(conversation_id="c1", role="user", content="hello")
)
assert len(executor.conversation_history) == 1
agent.reset_conversation()
new_executor = agent._executor
assert len(new_executor.conversation_history) == 0
def test_provenance_saved_to_provider_on_reset(self):
"""Provider.save_provenance is called before clearing conversation."""
provider = DirectProvider()
agent = _make_agent(provider=provider)
executor = agent._executor
entry = ProvenanceEntry(conversation_id="c1", action="response", outcome="test")
executor.provenance_log.append(entry)
agent.reset_conversation()
# Provider should have the provenance saved
saved = provider.load_provenance()
assert len(saved) >= 1
# ── GAP-84: conversation_started fires at conversation start ──
class TestGAP84ConversationStartedEvent:
def test_created_event_at_construction(self):
"""At construction, NewAgentCreatedEvent is emitted, not ConversationStarted."""
events_emitted = []
def capture_event(sender: Any, event: Any) -> None:
events_emitted.append(type(event).__name__)
with patch("crewai.events.event_bus.crewai_event_bus.emit", side_effect=capture_event):
agent = _make_agent()
assert "NewAgentCreatedEvent" in events_emitted
# The default executor creation does NOT go through _get_or_create_executor,
# so no ConversationStarted for the default conversation.
def test_conversation_started_on_new_conversation(self):
"""ConversationStartedEvent fires when a new conversation ID is used."""
events_emitted = []
def capture_event(sender: Any, event: Any) -> None:
events_emitted.append(type(event).__name__)
agent = _make_agent()
with patch("crewai.events.event_bus.crewai_event_bus.emit", side_effect=capture_event):
# This creates a new executor for an unknown conversation ID
executor = agent._get_or_create_executor("brand-new-conv-id")
assert "NewAgentConversationStartedEvent" in events_emitted
def test_no_duplicate_event_for_existing_conversation(self):
"""No ConversationStartedEvent for an already-existing conversation."""
events_emitted = []
def capture_event(sender: Any, event: Any) -> None:
events_emitted.append(type(event).__name__)
agent = _make_agent()
default_cid = agent._default_conversation_id
with patch("crewai.events.event_bus.crewai_event_bus.emit", side_effect=capture_event):
executor = agent._get_or_create_executor(default_cid)
assert "NewAgentConversationStartedEvent" not in events_emitted
# ── GAP-85: response_model applied in streaming path ──────────
class TestGAP85StreamingStructuredOutput:
def test_structured_output_in_streaming_metadata(self):
"""After streaming completes, structured output is parsed and added to metadata."""
from pydantic import BaseModel
class TestOutput(BaseModel):
answer: str
score: int
agent = _make_agent(response_model=TestOutput)
executor = _make_executor(agent)
# Mock _parse_structured_output to return a valid model
mock_output = TestOutput(answer="hello", score=42)
async def mock_parse(text: str) -> TestOutput:
return mock_output
executor._parse_structured_output = mock_parse
# We test that the ainvoke post-processing would call _parse_structured_output
# by checking the code path exists. Full integration test would require LLM mock.
assert agent.response_model is TestOutput
assert hasattr(executor, '_parse_structured_output')
# ── GAP-86: AMP coworker dict format ─────────────────────────
class TestGAP86AMPCoworkerDictFormat:
def test_amp_key_format(self):
"""Dict with {"amp": "handle"} format resolves the AMP coworker."""
mock_attrs = {"role": "Writer", "goal": "Write", "backstory": ""}
with patch("crewai.new_agent.new_agent.NewAgent._resolve_amp_coworker") as mock_resolve:
mock_coworker = _make_agent(role="Writer", goal="Write")
mock_resolve.return_value = mock_coworker
agent = _make_agent(coworkers=[{"amp": "content-writer", "llm": "gpt-4o"}])
mock_resolve.assert_called_once()
args, kwargs = mock_resolve.call_args
assert args[0] == "content-writer"
# "llm" should be in overrides
overrides = kwargs.get("overrides", {})
assert "llm" in overrides
assert overrides["llm"] == "gpt-4o"
def test_handle_key_format_still_works(self):
"""Dict with {"handle": "handle"} legacy format still works."""
with patch("crewai.new_agent.new_agent.NewAgent._resolve_amp_coworker") as mock_resolve:
mock_coworker = _make_agent(role="Analyst", goal="Analyze")
mock_resolve.return_value = mock_coworker
agent = _make_agent(coworkers=[{"handle": "data-analyst"}])
mock_resolve.assert_called_once()
args, kwargs = mock_resolve.call_args
assert args[0] == "data-analyst"
def test_amp_resolved_flag_set(self):
"""Resolved AMP coworkers have _amp_resolved=True."""
with patch("crewai.new_agent.new_agent.NewAgent._resolve_amp_coworker") as mock_resolve:
mock_coworker = _make_agent(role="Writer", goal="Write")
mock_resolve.return_value = mock_coworker
agent = _make_agent(coworkers=[{"amp": "content-writer"}])
assert len(agent._resolved_coworkers) == 1
assert agent._resolved_coworkers[0]._amp_resolved is True
def test_dict_without_amp_or_handle_passthrough(self):
"""Dict without 'amp' or 'handle' key is passed through as-is."""
raw_dict = {"some_key": "some_value"}
agent = _make_agent(coworkers=[raw_dict])
assert raw_dict in agent._resolved_coworkers
def test_amp_key_with_overrides(self):
"""Dict with {"amp": ..., "overrides": {...}} merges overrides."""
with patch("crewai.new_agent.new_agent.NewAgent._resolve_amp_coworker") as mock_resolve:
mock_coworker = _make_agent(role="Writer", goal="Write")
mock_resolve.return_value = mock_coworker
agent = _make_agent(coworkers=[{
"amp": "content-writer",
"overrides": {"backstory": "Expert writer"},
}])
args, kwargs = mock_resolve.call_args
overrides = kwargs.get("overrides", {})
assert "backstory" in overrides
assert overrides["backstory"] == "Expert writer"
# ── GAP-88: explain() works without planning engine ──────────
class TestGAP88ExplainDecoupledFromPlanning:
def test_explain_returns_entries_without_planning(self):
"""explain() returns provenance entries even without a planning engine."""
agent = _make_agent(settings=AgentSettings(
planning_enabled=False,
self_improving=False,
memory_enabled=False,
provenance_enabled=True,
))
executor = agent._executor
executor.provenance_log.append(
ProvenanceEntry(conversation_id="c1", action="response", outcome="test result")
)
entries = agent.explain()
assert len(entries) == 1
assert entries[0].action == "response"
def test_explain_uses_llm_for_reasoning_reconstruction(self):
"""explain() calls LLM for reasoning when entries lack reasoning."""
agent = _make_agent()
agent._llm_instance = MagicMock()
executor = agent._executor
executor.provenance_log.append(
ProvenanceEntry(conversation_id="c1", action="tool_call", outcome="data fetched")
)
with patch("crewai.utilities.agent_utils.get_llm_response", return_value="Because data was needed") as mock_llm:
with patch("crewai.utilities.agent_utils.format_message_for_llm", return_value={"role": "user", "content": "prompt"}):
entries = agent.explain()
assert len(entries) == 1
assert entries[0].reasoning == "Because data was needed"
mock_llm.assert_called_once()
def test_explain_skips_llm_when_reasoning_present(self):
"""explain() does not call LLM when all entries already have reasoning."""
agent = _make_agent()
agent._llm_instance = MagicMock()
executor = agent._executor
executor.provenance_log.append(
ProvenanceEntry(
conversation_id="c1", action="response",
reasoning="Already explained", outcome="test"
)
)
with patch("crewai.utilities.agent_utils.get_llm_response") as mock_llm:
entries = agent.explain()
mock_llm.assert_not_called()
assert entries[0].reasoning == "Already explained"
# ── GAP-89: Provenance persisted to memory ───────────────────
class TestGAP89ProvenanceMemoryPersistence:
def test_persist_provenance_to_memory(self):
"""_persist_provenance_to_memory saves entry to memory backend."""
agent = _make_agent()
mock_memory = MagicMock()
agent._memory_instance = mock_memory
executor = _make_executor(agent)
entry = ProvenanceEntry(
conversation_id="c1", action="tool_call", outcome="result data"
)
executor._persist_provenance_to_memory(entry)
mock_memory.remember.assert_called_once()
call_kwargs = mock_memory.remember.call_args
assert "provenance" in str(call_kwargs)
def test_persist_provenance_no_memory_is_noop(self):
"""_persist_provenance_to_memory does nothing when memory is None."""
agent = _make_agent()
agent._memory_instance = None
executor = _make_executor(agent)
entry = ProvenanceEntry(conversation_id="c1", action="response")
# Should not raise
executor._persist_provenance_to_memory(entry)
def test_persist_provenance_handles_exception(self):
"""_persist_provenance_to_memory silently handles save errors."""
agent = _make_agent()
mock_memory = MagicMock()
mock_memory.remember.side_effect = RuntimeError("save failed")
agent._memory_instance = mock_memory
executor = _make_executor(agent)
entry = ProvenanceEntry(conversation_id="c1", action="response")
# Should not raise despite exception
executor._persist_provenance_to_memory(entry)
# ── GAP-97: Proactive context window summarization ───────────
class TestGAP97ProactiveSummarization:
def test_history_trimmed_when_exceeds_hard_cap(self):
"""History is trimmed when exceeding the safety threshold (10x max or 500)."""
agent = _make_agent(settings=AgentSettings(
memory_enabled=False,
planning_enabled=False,
self_improving=False,
respect_context_window=True,
max_history_messages=4,
))
executor = _make_executor(agent)
# Threshold = max(4*10, 500) = 500. Add 510 messages to trigger trim.
for i in range(510):
executor.conversation_history.append(
Message(conversation_id="c1", role="user", content=f"msg-{i}")
)
assert len(executor.conversation_history) == 510
executor._maybe_summarize_history()
# Trimmed to the threshold (500)
assert len(executor.conversation_history) == 500
# Should keep the most recent 500
assert executor.conversation_history[0].content == "msg-10"
assert executor.conversation_history[-1].content == "msg-509"
def test_no_trimming_when_under_threshold(self):
"""History is not trimmed when under the safety threshold."""
agent = _make_agent(settings=AgentSettings(
memory_enabled=False,
planning_enabled=False,
self_improving=False,
respect_context_window=True,
max_history_messages=20,
))
executor = _make_executor(agent)
# Add 50 messages (under max(20*10, 500)=500 threshold)
for i in range(50):
executor.conversation_history.append(
Message(conversation_id="c1", role="user", content=f"msg-{i}")
)
executor._maybe_summarize_history()
assert len(executor.conversation_history) == 50
def test_no_trimming_when_max_is_none(self):
"""No trimming when max_history_messages is None."""
agent = _make_agent(settings=AgentSettings(
memory_enabled=False,
planning_enabled=False,
self_improving=False,
respect_context_window=True,
max_history_messages=None,
))
executor = _make_executor(agent)
for i in range(100):
executor.conversation_history.append(
Message(conversation_id="c1", role="user", content=f"msg-{i}")
)
executor._maybe_summarize_history()
assert len(executor.conversation_history) == 100
def test_no_trimming_when_respect_context_window_disabled(self):
"""No trimming when respect_context_window is False."""
agent = _make_agent(settings=AgentSettings(
memory_enabled=False,
planning_enabled=False,
self_improving=False,
respect_context_window=False,
max_history_messages=2,
))
executor = _make_executor(agent)
for i in range(10):
executor.conversation_history.append(
Message(conversation_id="c1", role="user", content=f"msg-{i}")
)
executor._maybe_summarize_history()
assert len(executor.conversation_history) == 10
# ── GAP-99: Circular ref detection warning ───────────────────
class TestGAP99CircularRefWarning:
def test_circular_ref_logs_warning(self, caplog):
"""Circular coworker reference logs a clear warning message."""
from crewai.new_agent.new_agent import _get_init_chain
agent = _make_agent(role="LoopAgent")
# Manually inject the agent ID into the init chain to simulate circular ref
chain = _get_init_chain()
chain.add(agent.id)
try:
with caplog.at_level(logging.WARNING, logger="crewai.new_agent"):
# Re-run _setup with the agent's ID already in chain
# We need to trigger the check directly
agent._setup()
# Check that the warning was logged
found = any(
"Circular coworker reference detected" in record.message
for record in caplog.records
)
assert found, f"Expected circular ref warning. Got: {[r.message for r in caplog.records]}"
finally:
chain.discard(agent.id)
# ── GAP-102: confidence and sources populated ────────────────
class TestGAP102ProvenanceFields:
def test_provenance_entry_has_sources_field(self):
"""ProvenanceEntry model supports sources field."""
entry = ProvenanceEntry(
conversation_id="c1",
action="tool_call",
sources=["search_tool", "calculator"],
confidence=0.95,
)
assert entry.sources == ["search_tool", "calculator"]
assert entry.confidence == 0.95
def test_tool_call_provenance_has_sources(self):
"""Tool call provenance entries include the tool name in sources."""
agent = _make_agent()
executor = _make_executor(agent)
# Simulate what happens during _handle_tool_calls provenance recording
entry = ProvenanceEntry(
conversation_id="c1",
action="tool_call",
inputs={"tool": "search_web", "args": "query=test"},
outcome="Found 5 results",
sources=["search_web"],
confidence=1.0,
)
assert entry.sources == ["search_web"]
assert entry.confidence == 1.0
def test_error_tool_call_has_lower_confidence(self):
"""Tool call with an error outcome gets lower confidence."""
entry = ProvenanceEntry(
conversation_id="c1",
action="tool_call",
outcome="Error executing search: timeout",
sources=["search"],
confidence=0.5,
)
assert entry.confidence == 0.5
# ── GAP-110: provider typed as ConversationalProvider ────────
class TestGAP110ProviderTyping:
def test_provider_accepts_direct_provider(self):
"""DirectProvider is accepted as provider field value."""
provider = DirectProvider()
agent = _make_agent(provider=provider)
assert agent.provider is provider
def test_provider_accepts_none(self):
"""None is accepted as provider field value."""
agent = _make_agent(provider=None)
assert agent.provider is None
def test_provider_accepts_duck_typed(self):
"""A duck-typed provider that implements the protocol methods is accepted."""
class CustomProvider:
async def send_message(self, message: Any) -> None:
pass
async def receive_message(self) -> Any:
pass
async def send_status(self, status: Any) -> None:
pass
def get_history(self) -> list:
return []
def save_history(self, messages: list) -> None:
pass
def reset_history(self) -> None:
pass
def save_provenance(self, entries: list) -> None:
pass
def load_provenance(self) -> list:
return []
custom = CustomProvider()
agent = _make_agent(provider=custom)
assert agent.provider is custom
# ── GAP-111: memory_view property ────────────────────────────
class TestGAP111MemoryView:
def test_memory_view_returns_memory_instance(self):
"""memory_view property returns the underlying memory backend."""
agent = _make_agent()
mock_memory = MagicMock()
agent._memory_instance = mock_memory
assert agent.memory_view is mock_memory
def test_memory_view_returns_none_when_no_memory(self):
"""memory_view returns None when memory is disabled."""
agent = _make_agent()
agent._memory_instance = None
assert agent.memory_view is None
# ── GAP-116: conversation_history is property (intentional) ──
class TestGAP116ConversationHistoryProperty:
def test_conversation_history_is_property(self):
"""conversation_history on NewAgent is a property, not a Pydantic field."""
assert isinstance(NewAgent.conversation_history, property)
def test_conversation_history_delegates_to_executor(self):
"""conversation_history returns the executor's conversation history."""
agent = _make_agent()
executor = agent._executor
msg = Message(conversation_id="c1", role="user", content="hello")
executor.conversation_history.append(msg)
assert len(agent.conversation_history) == 1
assert agent.conversation_history[0] is msg
def test_conversation_history_empty_when_no_executor(self):
"""conversation_history returns empty list when executor doesn't exist."""
agent = _make_agent()
# Remove all executors
agent._executors.clear()
assert agent.conversation_history == []
# ── GAP-86: _amp_resolved private attribute ──────────────────
class TestAmpResolvedAttribute:
def test_default_false(self):
"""_amp_resolved defaults to False for manually created agents."""
agent = _make_agent()
assert agent._amp_resolved is False
def test_can_be_set_true(self):
"""_amp_resolved can be set to True after creation."""
agent = _make_agent()
agent._amp_resolved = True
assert agent._amp_resolved is True

View File

@@ -1,622 +0,0 @@
"""Tests for GAP-80, GAP-81, GAP-82, GAP-100, GAP-101, GAP-112, GAP-113.
Covers:
- GAP-80: Workflow user confirmation flow (pending list, confirm, reject)
- GAP-81: Executable Python Flow code generation
- GAP-82: match_workflow() consults discovered flows
- GAP-100: Scope classification persisted with canonical memories
- GAP-101: Shared canonical memories tagged read-only and skipped
- GAP-112: Raw memories pruned after dreaming consolidation
- GAP-113: Workflow detection threshold is 5 (not 3)
"""
from __future__ import annotations
import asyncio
import json
import os
import textwrap
from pathlib import Path
from typing import Any
from unittest.mock import AsyncMock, MagicMock, call, patch
import pytest
from crewai.new_agent import NewAgent, AgentSettings
from crewai.new_agent.dreaming import (
DreamingEngine,
_classify_scope,
SCOPE_GLOBAL,
SCOPE_USER,
SCOPE_CONVERSATION,
)
from crewai.new_agent.models import ProvenanceEntry
# ── Helpers ──────────────────────────────────────────────────
def _make_agent(**kwargs: Any) -> NewAgent:
defaults = dict(role="TestAgent", goal="testing", memory=False)
defaults.update(kwargs)
return NewAgent(**defaults)
def _make_engine(agent: NewAgent | None = None) -> DreamingEngine:
if agent is None:
agent = _make_agent()
return agent._dreaming_engine
def _make_provenance_entries(tool_sequence: list[str], repeat: int) -> list[ProvenanceEntry]:
"""Create provenance entries that repeat a tool sequence `repeat` times."""
entries: list[ProvenanceEntry] = []
for _ in range(repeat):
for tool in tool_sequence:
entries.append(ProvenanceEntry(
action="tool_call",
inputs={"tool": tool},
))
entries.append(ProvenanceEntry(action="response"))
return entries
# ── GAP-80: Workflow user confirmation flow ──────────────────
class TestGAP80WorkflowConfirmation:
"""Workflows should go to a pending list, not auto-save."""
def test_pending_workflows_initially_empty(self):
engine = _make_engine()
assert engine._pending_workflows == []
assert engine.get_pending_workflows() == []
def test_propose_workflow_adds_to_pending(self):
engine = _make_engine()
wf = {"tools": ["search", "summarize"], "count": 5}
engine._propose_workflow(wf)
pending = engine.get_pending_workflows()
assert len(pending) == 1
assert pending[0]["tools"] == ["search", "summarize"]
assert "description" in pending[0]
def test_propose_workflow_does_not_auto_save(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
engine = _make_engine()
wf = {"tools": ["search", "summarize"], "count": 5}
engine._propose_workflow(wf)
# No recipe file should exist
flows_dir = tmp_path / ".crewai" / "flows"
json_files = list(flows_dir.glob("*.json")) if flows_dir.exists() else []
assert len(json_files) == 0
def test_confirm_workflow_saves_recipe(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
engine = _make_engine()
wf = {"tools": ["search", "summarize"], "count": 5}
engine._propose_workflow(wf)
confirmed = engine.confirm_workflow(0)
assert confirmed is not None
assert confirmed["tools"] == ["search", "summarize"]
# Pending list should now be empty
assert engine.get_pending_workflows() == []
# Recipe file should be created
flows_dir = tmp_path / ".crewai" / "flows"
json_files = [f for f in flows_dir.glob("*.json") if f.name != "manifest.json"]
assert len(json_files) >= 1
def test_reject_workflow_removes_from_pending(self):
engine = _make_engine()
wf = {"tools": ["search", "summarize"], "count": 5}
engine._propose_workflow(wf)
assert len(engine.get_pending_workflows()) == 1
rejected = engine.reject_workflow(0)
assert rejected is not None
assert rejected["tools"] == ["search", "summarize"]
assert engine.get_pending_workflows() == []
def test_confirm_invalid_index_returns_none(self):
engine = _make_engine()
assert engine.confirm_workflow(0) is None
assert engine.confirm_workflow(-1) is None
assert engine.confirm_workflow(99) is None
def test_reject_invalid_index_returns_none(self):
engine = _make_engine()
assert engine.reject_workflow(0) is None
assert engine.reject_workflow(-1) is None
def test_multiple_pending_workflows(self):
engine = _make_engine()
engine._propose_workflow({"tools": ["a", "b"], "count": 5})
engine._propose_workflow({"tools": ["c", "d"], "count": 6})
assert len(engine.get_pending_workflows()) == 2
# Confirm the first one
confirmed = engine.confirm_workflow(0)
assert confirmed["tools"] == ["a", "b"]
assert len(engine.get_pending_workflows()) == 1
assert engine.get_pending_workflows()[0]["tools"] == ["c", "d"]
@pytest.mark.asyncio
async def test_dream_does_not_auto_save_workflows(self, tmp_path, monkeypatch):
"""dream() should propose workflows but never auto-save them."""
monkeypatch.chdir(tmp_path)
agent = _make_agent(
settings=AgentSettings(self_improving=True, memory_enabled=False),
)
engine = agent._dreaming_engine
# Set up provenance with a repeated pattern (5+ times)
mock_executor = MagicMock()
mock_executor.provenance_log = _make_provenance_entries(
["search", "parse"], repeat=6,
)
# _executor is a property; set the underlying dict entry
cid = agent._default_conversation_id
agent._executors[cid] = mock_executor
result = await engine.dream()
assert result["workflows_detected"] >= 1
# Should be pending, NOT saved
assert len(engine.get_pending_workflows()) >= 1
flows_dir = tmp_path / ".crewai" / "flows"
json_files = list(flows_dir.glob("*.json")) if flows_dir.exists() else []
assert len(json_files) == 0
# ── GAP-81: Executable Flow code generation ──────────────────
class TestGAP81FlowCodeGeneration:
"""confirm_workflow() should generate a .py Flow file."""
def test_generate_flow_code_creates_py_file(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
engine = _make_engine()
wf = {"tools": ["search_web", "read_file", "summarize"], "count": 5}
path = engine._generate_flow_code(wf)
assert path is not None
assert path.endswith(".py")
assert os.path.exists(path)
content = Path(path).read_text()
assert "class " in content
assert "@start()" in content
assert "search_web" in content
assert "read_file" in content
assert "summarize" in content
assert "from crewai.flow.flow import Flow, start, listen" in content
def test_generate_flow_code_empty_tools_returns_none(self):
engine = _make_engine()
result = engine._generate_flow_code({"tools": [], "count": 5})
assert result is None
def test_confirm_workflow_also_generates_flow_code(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
engine = _make_engine()
wf = {"tools": ["alpha", "beta"], "count": 5}
engine._propose_workflow(wf)
engine.confirm_workflow(0)
flows_dir = tmp_path / ".crewai" / "flows"
py_files = list(flows_dir.glob("workflow_*.py"))
assert len(py_files) == 1
content = py_files[0].read_text()
assert "class " in content
assert "@start()" in content
def test_generated_flow_has_correct_steps(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
engine = _make_engine()
wf = {"tools": ["step_a", "step_b", "step_c"], "count": 7}
path = engine._generate_flow_code(wf)
content = Path(path).read_text()
# Should have 3 step methods
assert "step_1_step_a" in content
assert "step_2_step_b" in content
assert "step_3_step_c" in content
# First step uses @start, others use @listen
assert "@start()" in content
assert "@listen" in content
# ── GAP-82: match_workflow() ─────────────────────────────────
class TestGAP82MatchWorkflow:
"""match_workflow() should check user messages against discovered flows."""
def test_no_discovered_flows_returns_none(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
engine = _make_engine()
assert engine._discovered_flows == []
assert engine.match_workflow("search and summarize articles") is None
def test_match_with_sufficient_overlap(self):
engine = _make_engine()
engine._discovered_flows = [
{
"name": "search_summarize",
"description": "Repeated pattern (5x): search -> summarize articles",
"tools": ["search", "summarize"],
},
]
result = engine.match_workflow("I want to search and summarize articles")
assert result is not None
assert result["name"] == "search_summarize"
def test_no_match_with_insufficient_overlap(self):
engine = _make_engine()
engine._discovered_flows = [
{
"name": "search_summarize",
"description": "Repeated pattern (5x): search -> summarize articles",
"tools": ["search", "summarize"],
},
]
# Only one overlapping word ("search") is below the threshold of 3
result = engine.match_workflow("please search now")
assert result is None
def test_match_ignores_stop_words(self):
engine = _make_engine()
engine._discovered_flows = [
{
"name": "fetch_parse_save",
"description": "fetch data parse results save output",
"tools": ["fetch", "parse", "save"],
},
]
# "the", "and", "to" are stop words, should not count
result = engine.match_workflow("fetch parse save")
assert result is not None
def test_match_returns_first_matching_flow(self):
engine = _make_engine()
engine._discovered_flows = [
{"name": "flow1", "description": "alpha beta gamma delta", "tools": []},
{"name": "flow2", "description": "alpha beta gamma epsilon", "tools": []},
]
result = engine.match_workflow("alpha beta gamma something")
assert result is not None
assert result["name"] == "flow1"
# ── GAP-100: Scope persisted with canonical memories ─────────
class TestGAP100ScopePersistence:
"""Canonical memories should include scope in metadata."""
@pytest.mark.asyncio
async def test_canonical_memory_includes_scope_metadata(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
agent = _make_agent(
settings=AgentSettings(self_improving=True, memory_enabled=True),
)
engine = agent._dreaming_engine
mock_memory = MagicMock()
object.__setattr__(agent, "_memory_instance", mock_memory)
# Patch _consolidate_memories to return controlled output
async def fake_consolidate(memories):
return ["Python is a great language"]
engine._consolidate_memories = fake_consolidate
# Create mock memories to process
mock_mem = MagicMock()
mock_mem.id = "m1"
mock_mem.content = "raw memory"
mock_mem.metadata = {}
mock_memory.recall.return_value = [mock_mem]
await engine.dream()
# Verify remember was called with metadata including scope
assert mock_memory.remember.called
remember_call = mock_memory.remember.call_args
# Check the metadata kwarg
if "metadata" in (remember_call.kwargs or {}):
meta = remember_call.kwargs["metadata"]
assert "type" in meta
assert meta["type"] == "canonical"
assert "scope" in meta
assert meta["scope"] in (SCOPE_GLOBAL, SCOPE_USER, SCOPE_CONVERSATION)
assert "dreaming_cycle" in meta
@pytest.mark.asyncio
async def test_user_scoped_memory_tagged_correctly(self, tmp_path, monkeypatch):
monkeypatch.chdir(tmp_path)
agent = _make_agent(
settings=AgentSettings(self_improving=True, memory_enabled=True),
)
engine = agent._dreaming_engine
mock_memory = MagicMock()
object.__setattr__(agent, "_memory_instance", mock_memory)
mock_mem = MagicMock()
mock_mem.id = "m1"
mock_mem.content = "raw memory"
mock_mem.metadata = {}
mock_memory.recall.return_value = [mock_mem]
async def fake_consolidate(memories):
return ["I prefer dark mode for my settings"]
engine._consolidate_memories = fake_consolidate
await engine.dream()
assert mock_memory.remember.called
remember_call = mock_memory.remember.call_args
if "metadata" in (remember_call.kwargs or {}):
assert remember_call.kwargs["metadata"]["scope"] == SCOPE_USER
# ── GAP-101: Shared canonical memories read-only ─────────────
class TestGAP101SharedReadOnly:
"""Shared memories should be tagged read-only and skipped during consolidation."""
def test_shared_memory_has_read_only_tag_in_content(self):
"""_share_with_coworkers should prefix content with [shared:read-only]."""
agent = _make_agent()
engine = agent._dreaming_engine
coworker = _make_agent(role="Coworker")
cw_memory = MagicMock()
coworker._memory_instance = cw_memory
agent._resolved_coworkers = [coworker]
engine._share_with_coworkers(["Important fact"])
assert cw_memory.remember.called
call_args = cw_memory.remember.call_args
value = call_args.args[0] if call_args.args else call_args.kwargs.get("value", "")
assert "[shared:read-only]" in value
def test_shared_memory_has_read_only_metadata(self):
"""_share_with_coworkers should include read_only=True in metadata."""
agent = _make_agent()
engine = agent._dreaming_engine
coworker = _make_agent(role="Coworker")
cw_memory = MagicMock()
coworker._memory_instance = cw_memory
agent._resolved_coworkers = [coworker]
engine._share_with_coworkers(["Important fact"])
assert cw_memory.remember.called
call_kwargs = cw_memory.remember.call_args.kwargs or {}
if "metadata" in call_kwargs:
meta = call_kwargs["metadata"]
assert meta.get("read_only") is True
assert meta.get("type") == "canonical_shared"
assert meta.get("source_agent") == "TestAgent"
def test_read_only_memories_skipped_by_content_prefix(self):
"""_get_recent_memories should skip memories starting with [shared:read-only]."""
engine = _make_engine()
mock_memory = MagicMock()
mem_shared = MagicMock()
mem_shared.id = "shared-1"
mem_shared.content = "[shared:read-only][shared from Other] some fact"
mem_shared.metadata = {}
mem_normal = MagicMock()
mem_normal.id = "normal-1"
mem_normal.content = "A normal memory"
mem_normal.metadata = {}
mock_memory.recall.return_value = [mem_shared, mem_normal]
contents, ids = engine._get_recent_memories(mock_memory)
assert len(contents) == 1
assert contents[0] == "A normal memory"
assert "normal-1" in ids
assert "shared-1" not in ids
def test_read_only_memories_skipped_by_metadata(self):
"""_get_recent_memories should skip memories with read_only=True in metadata."""
engine = _make_engine()
mock_memory = MagicMock()
mem_readonly = MagicMock()
mem_readonly.id = "readonly-1"
mem_readonly.content = "Some shared fact"
mem_readonly.metadata = {"read_only": True}
mem_normal = MagicMock()
mem_normal.id = "normal-1"
mem_normal.content = "A normal memory"
mem_normal.metadata = {}
mock_memory.recall.return_value = [mem_readonly, mem_normal]
contents, ids = engine._get_recent_memories(mock_memory)
assert len(contents) == 1
assert contents[0] == "A normal memory"
# ── GAP-112: Raw memory pruning ──────────────────────────────
class TestGAP112MemoryPruning:
"""Consolidated raw memories should be pruned (keeping audit trail)."""
def test_prune_does_nothing_with_few_ids(self):
"""Should keep all if processed count <= KEEP_RECENT (20)."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_memory = MagicMock()
agent._memory_instance = mock_memory
# 15 IDs < 20 threshold
ids = {str(i) for i in range(15)}
engine._prune_processed_memories(ids)
mock_memory.delete.assert_not_called()
def test_prune_deletes_oldest_keeps_recent(self):
"""Should delete the oldest and keep the 20 most recent."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_memory = MagicMock()
agent._memory_instance = mock_memory
# 25 IDs > 20 threshold => prune 5
ids = {f"mem_{i:03d}" for i in range(25)}
engine._prune_processed_memories(ids)
# Should have deleted 5 (25 - 20)
assert mock_memory.delete.call_count == 5
def test_prune_exactly_at_threshold(self):
"""Exactly 20 IDs should NOT trigger pruning."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_memory = MagicMock()
agent._memory_instance = mock_memory
ids = {str(i) for i in range(20)}
engine._prune_processed_memories(ids)
mock_memory.delete.assert_not_called()
def test_prune_without_memory_instance(self):
"""Should not crash if agent has no memory instance."""
agent = _make_agent()
engine = agent._dreaming_engine
agent._memory_instance = None
# Should not raise
engine._prune_processed_memories({str(i) for i in range(30)})
def test_prune_tolerates_delete_errors(self):
"""Individual delete failures should not stop the pruning."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_memory = MagicMock()
mock_memory.delete.side_effect = RuntimeError("storage error")
agent._memory_instance = mock_memory
ids = {f"mem_{i:03d}" for i in range(25)}
# Should not raise despite delete failures
engine._prune_processed_memories(ids)
assert mock_memory.delete.call_count == 5
@pytest.mark.asyncio
async def test_dream_calls_prune(self, tmp_path, monkeypatch):
"""dream() should call _prune_processed_memories after consolidation."""
monkeypatch.chdir(tmp_path)
agent = _make_agent(
settings=AgentSettings(self_improving=True, memory_enabled=True),
)
engine = agent._dreaming_engine
mock_memory = MagicMock()
mock_mem = MagicMock()
mock_mem.id = "m1"
mock_mem.content = "test memory"
mock_mem.metadata = {}
mock_memory.recall.return_value = [mock_mem]
object.__setattr__(agent, "_memory_instance", mock_memory)
async def fake_consolidate(memories):
return ["canonical insight"]
engine._consolidate_memories = fake_consolidate
with patch.object(engine, "_prune_processed_memories") as mock_prune:
await engine.dream()
mock_prune.assert_called_once()
# Arg should be the full set of processed IDs
called_ids = mock_prune.call_args[0][0]
assert "m1" in called_ids
# ── GAP-113: Workflow detection threshold ────────────────────
class TestGAP113ThresholdFive:
"""Workflow detection should require count >= 5."""
def _set_executor(self, agent, mock_executor):
"""Helper to set a mock executor on the agent."""
cid = agent._default_conversation_id
agent._executors[cid] = mock_executor
def test_threshold_rejects_count_3(self):
"""Sequences appearing only 3 times should NOT be detected."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_executor = MagicMock()
mock_executor.provenance_log = _make_provenance_entries(
["search", "parse"], repeat=3,
)
self._set_executor(agent, mock_executor)
workflows = engine._detect_workflows()
assert len(workflows) == 0
def test_threshold_rejects_count_4(self):
"""Sequences appearing only 4 times should NOT be detected."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_executor = MagicMock()
mock_executor.provenance_log = _make_provenance_entries(
["search", "parse"], repeat=4,
)
self._set_executor(agent, mock_executor)
workflows = engine._detect_workflows()
assert len(workflows) == 0
def test_threshold_accepts_count_5(self):
"""Sequences appearing 5 times SHOULD be detected."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_executor = MagicMock()
mock_executor.provenance_log = _make_provenance_entries(
["search", "parse"], repeat=5,
)
self._set_executor(agent, mock_executor)
workflows = engine._detect_workflows()
assert len(workflows) == 1
assert workflows[0]["count"] == 5
assert workflows[0]["tools"] == ["search", "parse"]
def test_threshold_accepts_count_above_5(self):
"""Sequences appearing more than 5 times should also be detected."""
agent = _make_agent()
engine = agent._dreaming_engine
mock_executor = MagicMock()
mock_executor.provenance_log = _make_provenance_entries(
["fetch", "transform", "load"], repeat=8,
)
self._set_executor(agent, mock_executor)
workflows = engine._detect_workflows()
assert len(workflows) == 1
assert workflows[0]["count"] == 8

View File

@@ -1,602 +0,0 @@
"""Tests for GAP audit batch 3: tools, models, telemetry, knowledge, definition parser.
Covers:
GAP-87: AMP coworkers tagged as "amp" in telemetry
GAP-90: Spawned copies can persist memories
GAP-91: String guardrail shorthand supported
GAP-94: dreaming_llm accepts Any (pre-configured LLM instance)
GAP-98: coworker_source field on TokenUsage
GAP-103: Spawned copies support fire-and-forget mode
GAP-104: Knowledge evaluation heuristic improvements
GAP-106: Code guardrail resolvable from JSON
GAP-107: Telemetry span attributes include version info and extras
GAP-109: share_data telemetry privacy setting
"""
from __future__ import annotations
import threading
import time
from types import SimpleNamespace
from typing import Any
from unittest.mock import MagicMock, patch, PropertyMock
import pytest
from pydantic import BaseModel
from crewai.new_agent.models import AgentSettings, TokenUsage
# ── GAP-87: AMP coworkers tagged as "amp" ──────────────────────────
class TestGap87AmpCoworkerSource:
"""build_coworker_tools() should detect _amp_resolved and set source='amp'."""
def test_local_coworker_gets_local_source(self):
from crewai.new_agent.coworker_tools import DelegateToCoworkerTool, build_coworker_tools
from crewai.new_agent.new_agent import NewAgent
mock_agent = MagicMock(spec=NewAgent)
mock_agent.role = "researcher"
mock_agent.goal = "Research things"
mock_agent._amp_resolved = False
# Directly test DelegateToCoworkerTool with known source
tool = DelegateToCoworkerTool(coworker=mock_agent, source="local")
assert tool.coworker_source == "local"
def test_amp_coworker_gets_amp_source(self):
from crewai.new_agent.coworker_tools import DelegateToCoworkerTool
from crewai.new_agent.new_agent import NewAgent
mock_agent = MagicMock(spec=NewAgent)
mock_agent.role = "researcher"
mock_agent.goal = "Research things"
mock_agent._amp_resolved = True
tool = DelegateToCoworkerTool(coworker=mock_agent, source="amp")
assert tool.coworker_source == "amp"
def test_build_coworker_tools_detects_amp_resolved(self):
"""build_coworker_tools uses _amp_resolved to set source."""
from crewai.new_agent.coworker_tools import build_coworker_tools
from crewai.new_agent.new_agent import NewAgent
# We test the logic directly: getattr(cw, "_amp_resolved", False)
# determines the source passed to DelegateToCoworkerTool
# Test with _amp_resolved=True
mock_cw = MagicMock(spec=NewAgent)
mock_cw.role = "helper"
mock_cw.goal = "help"
mock_cw._amp_resolved = True
# The isinstance check in build_coworker_tools won't pass with a MagicMock.
# So let's test the getattr logic directly:
source = "amp" if getattr(mock_cw, "_amp_resolved", False) else "local"
assert source == "amp"
# And with _amp_resolved=False
mock_cw._amp_resolved = False
source = "amp" if getattr(mock_cw, "_amp_resolved", False) else "local"
assert source == "local"
# And without _amp_resolved at all
del mock_cw._amp_resolved
source = "amp" if getattr(mock_cw, "_amp_resolved", False) else "local"
assert source == "local"
# ── GAP-90: Spawned copies can persist memories ────────────────────
class TestGap90SpawnMemory:
"""Spawned copies should have memory=True and memory_scope set."""
def test_spawn_settings_memory_enabled(self):
"""The spawn_settings AgentSettings should have memory_enabled=True."""
settings = AgentSettings(
can_spawn_copies=False,
max_spawn_depth=0,
memory_enabled=True,
)
assert settings.memory_enabled is True
def test_spawn_tool_source_code_uses_memory_true(self):
"""Verify the spawn tool source code creates copies with memory=True."""
import inspect
from crewai.new_agent.spawn_tools import SpawnSubtaskTool
source = inspect.getsource(SpawnSubtaskTool._run)
# Check that memory=True is in the NewAgent constructor call
assert "memory=True" in source
assert 'memory_scope=f"spawn-{parent_id}"' in source
# ── GAP-91: String guardrail shorthand ─────────────────────────────
class TestGap91StringGuardrail:
"""_resolve_guardrail() should accept a plain string."""
def test_string_guardrail_resolves_to_llm_type(self):
from crewai.new_agent.definition_parser import _resolve_guardrail
with patch("crewai.tasks.llm_guardrail.LLMGuardrail") as mock_guard_cls, \
patch("crewai.utilities.llm_utils.create_llm") as mock_create:
mock_create.return_value = MagicMock()
mock_guard_cls.return_value = "guard_instance"
result = _resolve_guardrail("Do not reveal internal data.")
mock_guard_cls.assert_called_once()
call_kwargs = mock_guard_cls.call_args
assert call_kwargs.kwargs.get("description") == "Do not reveal internal data." or \
(call_kwargs[1] if len(call_kwargs) > 1 else {}).get("description") == "Do not reveal internal data."
def test_none_guardrail_returns_none(self):
from crewai.new_agent.definition_parser import _resolve_guardrail
assert _resolve_guardrail(None) is None
def test_dict_guardrail_still_works(self):
from crewai.new_agent.definition_parser import _resolve_guardrail
with patch("crewai.tasks.llm_guardrail.LLMGuardrail") as mock_cls, \
patch("crewai.utilities.llm_utils.create_llm") as mock_create:
mock_create.return_value = MagicMock()
mock_cls.return_value = "ok"
result = _resolve_guardrail({"type": "llm", "instructions": "Stay safe."})
assert result == "ok"
# ── GAP-94: dreaming_llm type accepts Any ──────────────────────────
class TestGap94DreamingLlmType:
"""dreaming_llm should accept both strings and pre-configured LLM instances."""
def test_dreaming_llm_string(self):
s = AgentSettings(dreaming_llm="openai/gpt-4o")
assert s.dreaming_llm == "openai/gpt-4o"
def test_dreaming_llm_none(self):
s = AgentSettings(dreaming_llm=None)
assert s.dreaming_llm is None
def test_dreaming_llm_instance(self):
"""Pass a pre-configured LLM object (simulated as a dict)."""
fake_llm = {"model": "custom", "temperature": 0.5}
s = AgentSettings(dreaming_llm=fake_llm)
assert s.dreaming_llm == fake_llm
def test_dreaming_llm_mock_object(self):
"""Pass a mock LLM object."""
mock_llm = MagicMock()
mock_llm.model_name = "gpt-4o"
s = AgentSettings(dreaming_llm=mock_llm)
assert s.dreaming_llm is mock_llm
# ── GAP-98: coworker_source on TokenUsage ──────────────────────────
class TestGap98CoworkerSourceField:
"""TokenUsage should have a coworker_source field."""
def test_token_usage_has_coworker_source(self):
tu = TokenUsage(
action="delegation",
agent_id="a1",
input_tokens=100,
output_tokens=50,
coworker_source="amp",
)
assert tu.coworker_source == "amp"
def test_token_usage_coworker_source_default_none(self):
tu = TokenUsage(action="message", agent_id="a1")
assert tu.coworker_source is None
def test_delegation_token_includes_coworker_source(self):
"""Integration: DelegateToCoworkerTool should set coworker_source on TokenUsage."""
from crewai.new_agent.coworker_tools import DelegateToCoworkerTool
from crewai.new_agent.new_agent import NewAgent
mock_coworker = MagicMock(spec=NewAgent)
mock_coworker.role = "writer"
mock_coworker.goal = "Write things"
mock_response = SimpleNamespace(
content="Result here",
input_tokens=10,
output_tokens=20,
model="gpt-4o",
)
mock_coworker.message = MagicMock(return_value=mock_response)
mock_parent = MagicMock()
mock_parent.id = "mgr-1"
mock_parent.role = "manager"
mock_parent.on_delegate = None
sub_tokens: list[Any] = []
mock_executor = MagicMock()
mock_executor._sub_action_tokens = sub_tokens
mock_parent._executor = mock_executor
tool = DelegateToCoworkerTool(coworker=mock_coworker, source="amp", parent_agent=mock_parent)
with patch("crewai.new_agent.coworker_tools._emit_delegation_event"):
with patch("crewai.new_agent.coworker_tools._build_provenance_summary", return_value=""):
result = tool._run(message="Write something")
assert len(sub_tokens) == 1
assert sub_tokens[0].coworker_source == "amp"
# ── GAP-103: Spawned copies fire-and-forget mode ──────────────────
class TestGap103SpawnFireAndForget:
"""SpawnSubtaskArgs should have fire_and_forget, and _run should handle it."""
def test_args_schema_has_fire_and_forget(self):
from crewai.new_agent.spawn_tools import SpawnSubtaskArgs
args = SpawnSubtaskArgs(subtasks=["t1", "t2"], fire_and_forget=True)
assert args.fire_and_forget is True
def test_args_schema_default_false(self):
from crewai.new_agent.spawn_tools import SpawnSubtaskArgs
args = SpawnSubtaskArgs(subtasks=["t1"])
assert args.fire_and_forget is False
def test_fire_and_forget_returns_acknowledgment(self):
"""Verify fire_and_forget=True returns immediately with ack message."""
from crewai.new_agent.spawn_tools import SpawnSubtaskTool
from crewai.new_agent.models import AgentSettings
from crewai.new_agent.new_agent import NewAgent
parent = MagicMock(spec=NewAgent)
parent.role = "analyst"
parent.id = "p-1"
parent.tools = []
parent.llm = "test"
parent.verbose = False
parent._memory_instance = None
parent.settings = AgentSettings(can_spawn_copies=True, max_spawn_depth=1)
tool = SpawnSubtaskTool(agent=parent)
# Mock NewAgent constructor in the local import
mock_copy = MagicMock()
mock_copy.message = MagicMock(return_value=SimpleNamespace(content="done"))
with patch.dict("sys.modules", {}):
pass # no-op, just ensuring clean state
# We need to patch the import inside _run.
# The function imports NewAgent at the top, then uses it to create copies.
# Since the import is local, we patch the module's namespace after it's imported.
import crewai.new_agent.spawn_tools as spawn_mod
original_new_agent = getattr(spawn_mod, "NewAgent", None)
with patch("crewai.new_agent.spawn_tools._emit_spawn_event"):
with patch("crewai.new_agent.spawn_tools._query_parent_memory", return_value=""):
# Temporarily inject NewAgent at module level for the local import
spawn_mod.NewAgent = MagicMock(return_value=mock_copy)
try:
result = tool._run(subtasks=["task1", "task2"], fire_and_forget=True)
finally:
if original_new_agent is not None:
spawn_mod.NewAgent = original_new_agent
elif hasattr(spawn_mod, "NewAgent"):
delattr(spawn_mod, "NewAgent")
assert "fire-and-forget" in result.lower() or "background" in result.lower()
assert "2" in result # Should mention number of subtasks
# ── GAP-104: Knowledge evaluation improvements ─────────────────────
class TestGap104KnowledgeEvaluation:
"""Knowledge discovery should have expanded tool set, lower threshold, and title."""
def test_lower_threshold_50_chars(self):
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
agent = _make_mock_agent_for_knowledge()
kd = KnowledgeDiscovery(agent=agent)
# 60 chars — was below old 100 threshold, now above new 50
result = kd.evaluate_for_knowledge("search_web", "A" * 60)
assert result is not None
def test_old_threshold_rejects_short(self):
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
agent = _make_mock_agent_for_knowledge()
kd = KnowledgeDiscovery(agent=agent)
result = kd.evaluate_for_knowledge("search_web", "A" * 40)
assert result is None
def test_expanded_tool_set(self):
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
agent = _make_mock_agent_for_knowledge()
kd = KnowledgeDiscovery(agent=agent)
new_tools = ["read_website", "scrape", "fetch_url", "search_knowledge", "query_database", "read_document"]
for tool in new_tools:
kd._pending_suggestions.clear()
result = kd.evaluate_for_knowledge(tool, "Content " * 20)
assert result is not None, f"Tool '{tool}' should be accepted"
def test_unknown_tool_rejected(self):
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
agent = _make_mock_agent_for_knowledge()
kd = KnowledgeDiscovery(agent=agent)
result = kd.evaluate_for_knowledge("send_email", "A" * 200)
assert result is None
def test_suggestion_includes_title(self):
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
agent = _make_mock_agent_for_knowledge()
kd = KnowledgeDiscovery(agent=agent)
result = kd.evaluate_for_knowledge("search_web", "Python is a programming language.\nMore content here." + "x" * 50)
assert result is not None
assert "title" in result
assert "search_web" in result["title"]
def test_title_truncation_on_long_first_line(self):
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
agent = _make_mock_agent_for_knowledge()
kd = KnowledgeDiscovery(agent=agent)
# Very long first line with a period early
long_line = "This is a sentence." + "x" * 200
result = kd.evaluate_for_knowledge("scrape_url", long_line)
assert result is not None
title = result["title"]
# Should be truncated at the first sentence
assert "This is a sentence." in title
# ── GAP-106: Code guardrail resolvable from JSON ──────────────────
class TestGap106CodeGuardrail:
"""_resolve_guardrail() with type='code' should resolve dotted path."""
def test_code_guardrail_resolves_function(self):
from crewai.new_agent.definition_parser import _resolve_guardrail
# Use a known function path
result = _resolve_guardrail({
"type": "code",
"function": "json.loads",
})
import json
assert result is json.loads
def test_code_guardrail_with_path_key(self):
from crewai.new_agent.definition_parser import _resolve_guardrail
result = _resolve_guardrail({
"type": "code",
"path": "os.path.exists",
})
import os.path
assert result is os.path.exists
def test_code_guardrail_bad_path_returns_none(self):
from crewai.new_agent.definition_parser import _resolve_guardrail
result = _resolve_guardrail({
"type": "code",
"function": "nonexistent.module.func",
})
assert result is None
def test_code_guardrail_no_path_returns_none(self):
from crewai.new_agent.definition_parser import _resolve_guardrail
result = _resolve_guardrail({
"type": "code",
})
assert result is None
# ── GAP-107: Telemetry span attributes complete ───────────────────
class TestGap107TelemetryAttributes:
"""agent_created() should include crewai_version, python_version, and extras."""
def test_agent_created_includes_version_info(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_tracer = MagicMock()
mock_span = MagicMock()
mock_tracer.start_span.return_value = mock_span
tel._telemetry = MagicMock()
tel._telemetry._tracer = mock_tracer
tel.agent_created(
agent_id="a1",
role="researcher",
goal="Find stuff",
llm="gpt-4o",
)
# Collect all set_attribute calls
attrs = {call.args[0]: call.args[1] for call in mock_span.set_attribute.call_args_list}
assert "crewai_version" in attrs
assert "python_version" in attrs
assert "new_agent_id" in attrs
assert attrs["new_agent_id"] == "a1"
def test_agent_created_forwards_extra_kwargs(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_tracer = MagicMock()
mock_span = MagicMock()
mock_tracer.start_span.return_value = mock_span
tel._telemetry = MagicMock()
tel._telemetry._tracer = mock_tracer
tel.agent_created(
agent_id="a2",
role="writer",
goal="Write things",
custom_field="hello",
another_attr="world",
)
attrs = {call.args[0]: call.args[1] for call in mock_span.set_attribute.call_args_list}
assert attrs.get("custom_field") == "hello"
assert attrs.get("another_attr") == "world"
# ── GAP-109: share_data telemetry privacy ──────────────────────────
class TestGap109ShareDataPrivacy:
"""Telemetry should respect share_data setting for sensitive data."""
def test_share_data_default_false_in_settings(self):
s = AgentSettings()
assert s.share_data is False
def test_share_data_can_be_enabled(self):
s = AgentSettings(share_data=True)
assert s.share_data is True
def test_telemetry_should_share_data_false_by_default(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
assert tel._should_share_data() is False
def test_telemetry_should_share_data_true_when_set(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry(share_data=True)
assert tel._should_share_data() is True
def test_goal_not_in_span_when_share_data_false(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry(share_data=False)
mock_tracer = MagicMock()
mock_span = MagicMock()
mock_tracer.start_span.return_value = mock_span
tel._telemetry = MagicMock()
tel._telemetry._tracer = mock_tracer
tel.agent_created(
agent_id="a1",
role="researcher",
goal="Secret goal content",
)
attrs = {call.args[0]: call.args[1] for call in mock_span.set_attribute.call_args_list}
assert "new_agent_goal" not in attrs
def test_goal_in_span_when_share_data_true(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry(share_data=True)
mock_tracer = MagicMock()
mock_span = MagicMock()
mock_tracer.start_span.return_value = mock_span
tel._telemetry = MagicMock()
tel._telemetry._tracer = mock_tracer
tel.agent_created(
agent_id="a1",
role="researcher",
goal="Secret goal content",
)
attrs = {call.args[0]: call.args[1] for call in mock_span.set_attribute.call_args_list}
assert attrs.get("new_agent_goal") == "Secret goal content"
# ── JSON Schema validation for GAP-91 ─────────────────────────────
class TestGap91SchemaValidation:
"""agent_schema.json should accept both string and object guardrails."""
def test_schema_accepts_string_guardrail(self):
try:
import jsonschema
except ImportError:
pytest.skip("jsonschema not installed")
import json
from pathlib import Path
schema_path = Path(__file__).parent.parent.parent / "src" / "crewai" / "new_agent" / "agent_schema.json"
schema = json.loads(schema_path.read_text())
doc = {
"role": "test",
"goal": "test",
"guardrail": "Do not reveal secrets.",
}
jsonschema.validate(doc, schema) # Should not raise
def test_schema_accepts_object_guardrail(self):
try:
import jsonschema
except ImportError:
pytest.skip("jsonschema not installed")
import json
from pathlib import Path
schema_path = Path(__file__).parent.parent.parent / "src" / "crewai" / "new_agent" / "agent_schema.json"
schema = json.loads(schema_path.read_text())
doc = {
"role": "test",
"goal": "test",
"guardrail": {"type": "llm", "instructions": "Be safe."},
}
jsonschema.validate(doc, schema) # Should not raise
def test_schema_has_share_data_in_settings(self):
import json
from pathlib import Path
schema_path = Path(__file__).parent.parent.parent / "src" / "crewai" / "new_agent" / "agent_schema.json"
schema = json.loads(schema_path.read_text())
settings_props = schema["properties"]["settings"]["properties"]
assert "share_data" in settings_props
assert settings_props["share_data"]["type"] == "boolean"
# ── Helpers ────────────────────────────────────────────────────────
def _make_mock_agent_for_knowledge() -> Any:
"""Create a mock agent suitable for KnowledgeDiscovery."""
agent = MagicMock()
agent.settings = AgentSettings(can_create_knowledge=True)
agent.id = "kd-agent-1"
agent.knowledge = None
agent.knowledge_sources = []
return agent

View File

@@ -1,485 +0,0 @@
"""Tests for GAP-92, GAP-93, GAP-108 fixes.
Covers:
- Memory inspector rich formatting (GAP-92)
- CLI agent memory rich output (GAP-93)
- Organic relevance improvements (GAP-108)
Note: GAP-83 (knowledge event wiring) and GAP-105 (knowledge suggestion edit flow)
tests were removed because the TUI no longer has pending suggestion state — knowledge
suggestions now flow through the conversation (agent sends a message, user responds
in plain text, executor handles confirm/reject).
"""
from __future__ import annotations
import json
from pathlib import Path
from types import SimpleNamespace
from typing import Any
from unittest.mock import MagicMock, patch
import pytest
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_tui(tmp_path: Path, agents: list[dict] | None = None, config: dict | None = None):
"""Construct an AgentTUI without running it (no event loop needed)."""
from crewai_cli.agent_tui import AgentTUI
agents_dir = tmp_path / "agents"
agents_dir.mkdir(exist_ok=True)
for defn in (agents or []):
name = defn.get("name", "agent")
(agents_dir / f"{name}.json").write_text(json.dumps(defn))
tui = AgentTUI.__new__(AgentTUI)
# Manually call __init__ without running App lifecycle
tui._agents_dir = agents_dir
tui._config = config or {}
tui._agent_defs = agents or []
tui._agent_names = [d.get("name", d.get("role", "unnamed")) for d in (agents or [])]
tui._agent_instances = {}
tui._current_room = "__common__"
tui._chat_histories = {}
tui._processing = False
tui._last_active_agent = None
tui._engagement_mode = "dm"
return tui
def _make_agent_with_memory(role: str = "researcher") -> MagicMock:
"""Create a mock agent with a memory instance."""
agent = MagicMock()
agent.role = role
agent._memory_instance = MagicMock()
return agent
def _make_memory_entry(
content: str = "Some memory",
metadata: dict | None = None,
timestamp: str = "",
):
"""Create a mock memory entry with the expected attributes."""
entry = SimpleNamespace(
content=content,
metadata=metadata or {},
timestamp=timestamp,
)
return entry
# ===========================================================================
# GAP-108: Organic mode relevance improvements
# ===========================================================================
class TestScoreRelevance:
"""Tests for the _score_relevance method (was _check_relevance)."""
def test_basic_keyword_match(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "dev", "role": "Python developer", "goal": "Write code", "backstory": ""},
{"name": "writer", "role": "Content writer", "goal": "Write articles", "backstory": ""},
]
scored = tui._score_relevance("Write some python code", agents)
names = [a["name"] for a, _ in scored]
assert "dev" in names
def test_expanded_stop_words_filter(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "a1", "role": "helper", "goal": "Assist users", "backstory": ""},
]
scored = tui._score_relevance("please me with this", agents)
assert len(scored) == 0
def test_stemming_matches_ing_suffix(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
scored = tui._score_relevance("writing documentation", [
{"name": "writer", "role": "write docs", "goal": "writing manuals", "backstory": ""},
])
assert len(scored) == 1
def test_stemming_matches_ed_suffix(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
scored = tui._score_relevance("I need data parsed", [
{"name": "parser", "role": "data parser", "goal": "Parse data files", "backstory": ""},
])
assert len(scored) == 1
assert scored[0][0]["name"] == "parser"
def test_stemming_matches_s_suffix(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "report_gen", "role": "report generator", "goal": "Generate report", "backstory": ""},
]
scored = tui._score_relevance("I need reports", agents)
assert len(scored) == 1
assert scored[0][0]["name"] == "report_gen"
def test_backstory_included_in_matching(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{
"name": "secret",
"role": "assistant",
"goal": "Help users",
"backstory": "Expert in quantum computing",
},
]
scored = tui._score_relevance("Tell me about quantum", agents)
assert len(scored) == 1
assert scored[0][0]["name"] == "secret"
def test_no_match_returns_empty(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "a", "role": "alpha", "goal": "one", "backstory": ""},
{"name": "b", "role": "beta", "goal": "two", "backstory": ""},
]
scored = tui._score_relevance("xyzzy frobulate", agents)
assert len(scored) == 0
def test_stop_words_only_returns_empty(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "x", "role": "thing", "goal": "stuff", "backstory": ""},
]
scored = tui._score_relevance("the is to and or", agents)
assert len(scored) == 0
class TestStemWords:
"""Unit tests for the _stem_words static method."""
def test_ing_suffix(self) -> None:
from crewai_cli.agent_tui import AgentTUI
result = AgentTUI._stem_words({"running"})
assert "runn" in result
assert "running" in result
def test_ed_suffix(self) -> None:
from crewai_cli.agent_tui import AgentTUI
result = AgentTUI._stem_words({"parsed"})
assert "pars" in result
assert "parsed" in result
def test_s_suffix(self) -> None:
from crewai_cli.agent_tui import AgentTUI
result = AgentTUI._stem_words({"reports"})
assert "report" in result
assert "reports" in result
def test_short_words_not_stemmed(self) -> None:
from crewai_cli.agent_tui import AgentTUI
# "is" ends in "s" but len <= 2
result = AgentTUI._stem_words({"is"})
assert result == {"is"}
def test_mixed_set(self) -> None:
from crewai_cli.agent_tui import AgentTUI
result = AgentTUI._stem_words({"testing", "fixed", "bugs"})
assert "test" in result # testing -> test (strip "ing")
assert "fix" in result # fixed -> fix (strip "ed")
assert "bug" in result # bugs -> bug (strip "s")
# ===========================================================================
# GAP-92: Memory inspector rich formatting
# ===========================================================================
class TestMemoryInspectorFormatting:
"""Tests for enhanced memory panel display."""
def test_show_memory_panel_rich_format(self, tmp_path: Path) -> None:
"""Memory panel should include type tags and content."""
tui = _make_tui(tmp_path, agents=[
{"name": "researcher", "role": "researcher", "goal": "Research"}
])
agent = _make_agent_with_memory("researcher")
agent._memory_instance.list_records.return_value = [
_make_memory_entry(
"Important finding about AI",
{"type": "canonical", "importance": "high", "scope": "global"},
"2025-01-01",
),
_make_memory_entry(
"Quick note",
{"type": "raw"},
),
]
tui._agent_instances["researcher"] = agent
tui._current_room = "researcher"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._show_memory_panel()
assert len(messages) == 1
output = messages[0]
# Should contain agent name header
assert "Memory Inspector" in output
assert "researcher" in output
# Should contain type tags
assert "canonical" in output
assert "raw" in output
# Should contain importance
assert "high" in output
# Should contain scope
assert "scope:global" in output
# Should contain content
assert "Important finding about AI" in output
assert "Quick note" in output
# Should contain help text
assert "/memory search" in output
def test_show_memory_panel_truncates_long_content(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path, agents=[
{"name": "a", "role": "a", "goal": "g"}
])
agent = _make_agent_with_memory("a")
long_content = "x" * 300
agent._memory_instance.list_records.return_value = [
_make_memory_entry(long_content, {}),
]
tui._agent_instances["a"] = agent
tui._current_room = "a"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._show_memory_panel()
output = messages[0]
assert "..." in output
# Content should be truncated at 150 chars
assert "x" * 151 not in output
def test_show_memory_panel_no_agent(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._show_memory_panel()
assert "No agent selected." in messages[0]
def test_show_memory_panel_no_memory(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path, agents=[
{"name": "a", "role": "a", "goal": "g"}
])
agent = MagicMock()
agent._memory_instance = None
tui._agent_instances["a"] = agent
tui._current_room = "a"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._show_memory_panel()
assert "No memories found" in messages[0]
def test_search_memory_rich_format(self, tmp_path: Path) -> None:
"""Search results should use rich formatting."""
tui = _make_tui(tmp_path, agents=[
{"name": "researcher", "role": "researcher", "goal": "Research"}
])
agent = _make_agent_with_memory("researcher")
agent._memory_instance.recall.return_value = [
_make_memory_entry(
"Found relevant data about topic",
{"type": "knowledge", "scope": "project"},
),
]
tui._agent_instances["researcher"] = agent
tui._current_room = "researcher"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._search_memory("topic")
output = messages[0]
assert "topic" in output
assert "researcher" in output
assert "knowledge" in output
assert "scope:project" in output
def test_search_memory_no_results(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path, agents=[
{"name": "a", "role": "a", "goal": "g"}
])
agent = _make_agent_with_memory("a")
agent._memory_instance.recall.return_value = []
tui._agent_instances["a"] = agent
tui._current_room = "a"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._search_memory("nonexistent")
assert "No memories matching" in messages[0]
def test_memory_content_fallback_to_record(self, tmp_path: Path) -> None:
"""When .content is empty, should fall back to .record.content."""
tui = _make_tui(tmp_path, agents=[
{"name": "a", "role": "a", "goal": "g"}
])
agent = _make_agent_with_memory("a")
# Memory with no direct .content but has .record.content
mem = SimpleNamespace(
content="",
record=SimpleNamespace(content="Data from record"),
metadata={"type": "raw"},
timestamp="",
)
agent._memory_instance.list_records.return_value = [mem]
tui._agent_instances["a"] = agent
tui._current_room = "a"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._show_memory_panel()
assert "Data from record" in messages[0]
# ===========================================================================
# GAP-93: CLI agent memory rich output
# ===========================================================================
class TestCLIAgentMemoryRichOutput:
"""Tests for the enhanced CLI agent memory command."""
def test_rich_table_output(self, tmp_path: Path) -> None:
"""When rich is available, output should use Table format."""
from unittest.mock import call
mock_console = MagicMock()
mock_table_cls = MagicMock()
mock_table = MagicMock()
mock_table_cls.return_value = mock_table
mem1 = _make_memory_entry("First memory content", {"type": "knowledge", "scope": "project"})
mem2 = _make_memory_entry("Second memory content", {"type": "raw", "scope": "agent"})
mock_memory = MagicMock()
mock_memory.list_records.return_value = [mem1, mem2]
mock_agent = MagicMock()
mock_agent._memory_instance = mock_memory
with patch("crewai_cli.cli.Console", mock_console.__class__, create=True), \
patch("crewai_cli.cli.Table", mock_table_cls, create=True):
# The actual test is more about verifying the logic pattern
# since we can't easily invoke the click command without a full setup.
# Verify the data extraction logic works.
results = mock_memory.list_records(limit=20)
assert len(results) == 2
for i, mem in enumerate(results, 1):
content = getattr(mem, "content", "") or str(mem)
meta = getattr(mem, "metadata", {}) or {}
mem_type = meta.get("type", "raw")
scope = meta.get("scope", "---")
assert isinstance(content, str)
assert isinstance(mem_type, str)
def test_memory_content_extraction(self) -> None:
"""Verify content extraction logic handles various memory formats."""
# Direct content
mem1 = _make_memory_entry("direct content", {"type": "knowledge"})
content = getattr(mem1, "content", "") or str(mem1)
assert content == "direct content"
# Fallback to record.content
mem2 = SimpleNamespace(
content="",
record=SimpleNamespace(content="record content"),
metadata={"type": "raw"},
)
content = (
getattr(mem2, "content", "")
or getattr(getattr(mem2, "record", None), "content", "")
or str(mem2)
)
assert content == "record content"
# Fallback to str()
mem3 = SimpleNamespace(content="", metadata={})
content = getattr(mem3, "content", "") or str(mem3)
assert "namespace" in content.lower()
def test_truncation_at_200_chars(self) -> None:
"""Long content should be truncated at 200 characters."""
long_text = "a" * 300
mem = _make_memory_entry(long_text, {})
content = getattr(mem, "content", "") or str(mem)
if len(content) > 200:
content = content[:200] + "..."
assert len(content) == 203 # 200 + "..."
assert content.endswith("...")
# ===========================================================================
# Integration-style tests combining multiple gaps
# ===========================================================================
class TestIntegration:
"""Cross-gap integration tests."""
def test_relevance_with_stemmed_backstory(self, tmp_path: Path) -> None:
"""Stemmed backstory keywords should influence relevance."""
tui = _make_tui(tmp_path)
agents = [
{
"name": "analyst",
"role": "business analyst",
"goal": "Analyze data",
"backstory": "Experienced in forecasting market trends",
},
{
"name": "coder",
"role": "software engineer",
"goal": "Build applications",
"backstory": "Skilled in Python and JavaScript",
},
]
# "forecasted" should stem to match "forecast" in backstory
# "forecasted" -> strip "ed" -> "forecast"
# "forecasting" in backstory -> strip "ing" -> "forecast"
scored = tui._score_relevance("I forecasted the numbers", agents)
names = [a["name"] for a, _ in scored]
assert "analyst" in names
def test_memory_inspector_after_knowledge_save(self, tmp_path: Path) -> None:
"""After saving knowledge, it should appear in memory inspector."""
tui = _make_tui(tmp_path, agents=[
{"name": "a", "role": "agent", "goal": "g"}
])
agent = _make_agent_with_memory("agent")
tui._agent_instances["a"] = agent
tui._current_room = "a"
# Set up memory to return the saved knowledge
agent._memory_instance.list_records.return_value = [
_make_memory_entry(
"Curated knowledge content",
{"type": "knowledge", "scope": "agent"},
),
]
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._show_memory_panel()
output = messages[0]
assert "knowledge" in output
assert "Curated knowledge content" in output

View File

@@ -1,472 +0,0 @@
"""Tests for GAP-117 through GAP-121 (fourth audit pass)."""
from __future__ import annotations
import asyncio
import json
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from crewai.new_agent.models import (
AgentSettings,
AgentStatus,
Message,
ProvenanceEntry,
TokenUsage,
)
# ── Helpers ────────────────────────────────────────────────────────
def _make_executor(
*,
provenance_detail: str = "standard",
memory_enabled: bool = True,
tools: list | None = None,
coworker_tools: list | None = None,
):
"""Build a lightweight mock executor for testing."""
from crewai.new_agent.executor import ConversationalAgentExecutor
agent = MagicMock()
agent.id = "test-agent-1"
agent.role = "Researcher"
agent.goal = "Research things"
agent.backstory = ""
agent.settings = AgentSettings(
provenance_detail=provenance_detail,
memory_enabled=memory_enabled,
)
agent.response_model = None
agent._llm_instance = MagicMock()
agent._llm_instance.model = "openai/gpt-4o"
agent._resolved_tools = tools or []
agent._coworker_tools = coworker_tools or []
agent._knowledge_discovery = None
agent.step_callback = None
agent.verbose = False
agent.knowledge = None
agent.knowledge_sources = []
executor = ConversationalAgentExecutor(agent=agent, provider=None)
return executor, agent
# ── GAP-117: Delegating status emission ───────────────────────────
class TestGAP117DelegatingStatus:
"""Executor should emit 'delegating' status for delegate_to_* tools."""
@pytest.mark.asyncio
async def test_delegation_tool_emits_delegating_status(self):
executor, agent = _make_executor()
statuses: list[AgentStatus] = []
provider = AsyncMock()
async def capture_status(status):
statuses.append(status)
provider.send_status = capture_status
provider.send_message = AsyncMock()
executor.provider = provider
# Simulate _emit_status being called for a delegation tool
await executor._emit_status(
"delegating", "Asking @writer…", coworker="writer"
)
assert len(statuses) == 1
assert statuses[0].state == "delegating"
assert statuses[0].coworker == "writer"
def test_delegate_tool_name_detected(self):
"""Tool names starting with 'delegate_to_' should be treated as delegations."""
assert "delegate_to_writer".startswith("delegate_to_")
assert "delegate_to_a2a_remote".startswith("delegate_to_")
assert not "search_web".startswith("delegate_to_")
def test_coworker_label_extraction(self):
"""The coworker label should be extracted from the tool name."""
func_name = "delegate_to_content_writer"
label = func_name.replace("delegate_to_", "").replace("_", " ")
assert label == "content writer"
# ── GAP-118: Token usage events emitted for billing ───────────────
class TestGAP118TokenUsageEvents:
"""Token usage should emit events for platform billing."""
def test_token_usage_event_class_exists(self):
from crewai.new_agent.events import NewAgentTokenUsageEvent
event = NewAgentTokenUsageEvent(
new_agent_id="a1",
conversation_id="c1",
action="message",
input_tokens=100,
output_tokens=50,
model="gpt-4o",
)
assert event.type == "new_agent_token_usage"
assert event.input_tokens == 100
assert event.output_tokens == 50
def test_record_token_usage_emits_event(self):
executor, agent = _make_executor()
executor._turn_input_tokens = 200
executor._turn_output_tokens = 100
executor.conversation_history = [
Message(role="user", content="hi", conversation_id="conv-1")
]
emitted = []
original_emit = executor._emit_event
def capture_event(event):
emitted.append(event)
try:
original_emit(event)
except Exception:
pass
executor._emit_event = capture_event
executor._record_token_usage("message", "gpt-4o")
from crewai.new_agent.events import NewAgentTokenUsageEvent
token_events = [e for e in emitted if isinstance(e, NewAgentTokenUsageEvent)]
assert len(token_events) == 1
assert token_events[0].action == "message"
assert token_events[0].input_tokens == 200
assert token_events[0].output_tokens == 100
assert token_events[0].conversation_id == "conv-1"
def test_record_token_usage_still_appends_record(self):
executor, agent = _make_executor()
executor._turn_input_tokens = 50
executor._turn_output_tokens = 25
executor._record_token_usage("tool_call", "gpt-4o", tool_name="search")
assert len(executor.usage_records) == 1
assert executor.usage_records[0].action == "tool_call"
assert executor.usage_records[0].tool_name == "search"
# ── GAP-119: Knowledge suggestions surfaced conversationally ──────
class TestGAP119KnowledgeSurfacing:
"""Knowledge suggestions should be sent as agent messages via provider."""
def test_knowledge_suggestion_sends_message(self):
executor, agent = _make_executor()
executor.conversation_history = [
Message(role="user", content="test", conversation_id="conv-1")
]
# Set up a mock provider
provider = MagicMock()
sent_messages: list[Message] = []
async def mock_send(msg):
sent_messages.append(msg)
provider.send_message = mock_send
executor.provider = provider
# Set up mock knowledge discovery
kd = MagicMock()
kd.evaluate_for_knowledge.return_value = {
"title": "search_web: AI agent frameworks comparison",
"content": "Some long content...",
"source_tool": "search_web",
"status": "pending",
}
agent._knowledge_discovery = kd
# The actual integration happens inside _execute_tool_calls
# Test the message construction via KnowledgeDiscovery.build_suggestion_message
suggestion = kd.evaluate_for_knowledge("search_web", "Some long content...")
from crewai.new_agent.knowledge_discovery import KnowledgeDiscovery
from crewai.new_agent.models import Message as AgentMessage, MessageAction
text, actions = KnowledgeDiscovery.build_suggestion_message(kd, suggestion)
action_objs = [MessageAction(**a) for a in actions]
hint_msg = AgentMessage(
role="agent",
content=text,
actions=action_objs,
sender="Researcher",
conversation_id="conv-1",
)
assert "AI agent frameworks comparison" in hint_msg.content
assert hint_msg.role == "agent"
assert "knowledge source" in hint_msg.content.lower() or "save" in hint_msg.content.lower()
assert hint_msg.actions is not None
assert len(hint_msg.actions) >= 2
def test_no_message_when_no_suggestion(self):
"""If evaluate_for_knowledge returns None, no message should be sent."""
executor, agent = _make_executor()
kd = MagicMock()
kd.evaluate_for_knowledge.return_value = None
agent._knowledge_discovery = kd
provider = MagicMock()
provider.send_message = AsyncMock()
executor.provider = provider
# Simulate the evaluation returning None
result = kd.evaluate_for_knowledge("search_web", "short")
assert result is None
# Provider should not have been called
provider.send_message.assert_not_called()
def test_no_message_when_no_provider(self):
"""If no provider is set, knowledge surfacing is silently skipped."""
executor, agent = _make_executor()
executor.provider = None
kd = MagicMock()
kd.evaluate_for_knowledge.return_value = {
"title": "test", "content": "...", "source_tool": "search", "status": "pending"
}
agent._knowledge_discovery = kd
# Should not raise even without provider
suggestion = kd.evaluate_for_knowledge("search", "long content " * 50)
assert suggestion is not None
# ── GAP-120: Memory scope filtering ──────────────────────────────
class TestGAP120MemoryScopeFiltering:
"""Memory recall should filter by conversation and user scope."""
def test_filters_out_other_conversation_memories(self):
executor, agent = _make_executor()
executor.conversation_history = [
Message(role="user", content="hi", conversation_id="conv-A")
]
# Create mock memories with different conversation scopes
m1 = MagicMock()
m1.content = "Global fact"
m1.metadata = {}
m2 = MagicMock()
m2.content = "Conv-A memory"
m2.metadata = {"conversation_id": "conv-A"}
m3 = MagicMock()
m3.content = "Conv-B memory (should be filtered)"
m3.metadata = {"conversation_id": "conv-B"}
memory = MagicMock()
memory.recall.return_value = [m1, m2, m3]
agent._memory_instance = memory
result = executor._recall_memory("query")
assert "Global fact" in result
assert "Conv-A memory" in result
assert "Conv-B" not in result
def test_filters_out_other_user_memories(self):
executor, agent = _make_executor()
executor.conversation_history = [
Message(role="user", content="hi", conversation_id="conv-1")
]
provider = MagicMock()
provider.user_id = "user-alice"
executor.provider = provider
m1 = MagicMock()
m1.content = "Alice's preference"
m1.metadata = {"user_id": "user-alice"}
m2 = MagicMock()
m2.content = "Bob's preference (should be filtered)"
m2.metadata = {"user_id": "user-bob"}
m3 = MagicMock()
m3.content = "Unscoped memory"
m3.metadata = {}
memory = MagicMock()
memory.recall.return_value = [m1, m2, m3]
agent._memory_instance = memory
result = executor._recall_memory("query")
assert "Alice's preference" in result
assert "Bob's preference" not in result
assert "Unscoped memory" in result
def test_no_filter_when_no_scope_metadata(self):
executor, agent = _make_executor()
executor.conversation_history = [
Message(role="user", content="hi", conversation_id="conv-1")
]
m1 = MagicMock()
m1.content = "Memory without metadata"
m1.metadata = {}
memory = MagicMock()
memory.recall.return_value = [m1]
agent._memory_instance = memory
result = executor._recall_memory("query")
assert "Memory without metadata" in result
def test_no_filter_when_no_provider_user(self):
"""When provider has no user_id, user-scoped memories pass through."""
executor, agent = _make_executor()
executor.conversation_history = [
Message(role="user", content="hi", conversation_id="conv-1")
]
executor.provider = None # No provider
m1 = MagicMock()
m1.content = "User-scoped but no provider to check against"
m1.metadata = {"user_id": "user-alice"}
memory = MagicMock()
memory.recall.return_value = [m1]
agent._memory_instance = memory
result = executor._recall_memory("query")
# Should pass through since we can't verify user
assert "User-scoped" in result
def test_string_metadata_handled_gracefully(self):
"""If metadata is a string instead of dict, don't crash."""
executor, agent = _make_executor()
executor.conversation_history = [
Message(role="user", content="hi", conversation_id="conv-1")
]
m1 = MagicMock()
m1.content = "Memory with bad metadata"
m1.metadata = "not a dict"
memory = MagicMock()
memory.recall.return_value = [m1]
agent._memory_instance = memory
result = executor._recall_memory("query")
assert "Memory with bad metadata" in result
def test_empty_results_after_filtering(self):
"""If all memories are filtered out, return empty string."""
executor, agent = _make_executor()
executor.conversation_history = [
Message(role="user", content="hi", conversation_id="conv-A")
]
m1 = MagicMock()
m1.content = "Wrong conversation"
m1.metadata = {"conversation_id": "conv-B"}
memory = MagicMock()
memory.recall.return_value = [m1]
agent._memory_instance = memory
result = executor._recall_memory("query")
assert result == ""
# ── GAP-121: Standard provenance tier reasoning extraction ────────
class TestGAP121StandardProvenance:
"""Standard tier should extract reasoning from model response text."""
def test_extract_reasoning_explicit_marker(self):
from crewai.new_agent.executor import ConversationalAgentExecutor
text = "Here is the analysis. My reasoning is: the data shows a clear trend toward AI adoption. Therefore I recommend investing."
result = ConversationalAgentExecutor._extract_reasoning_from_text(text)
assert "data shows" in result or "clear trend" in result
def test_extract_reasoning_because_pattern(self):
from crewai.new_agent.executor import ConversationalAgentExecutor
text = "Because the API rate limits are strict, I chose to batch the requests in groups of 10."
result = ConversationalAgentExecutor._extract_reasoning_from_text(text)
assert len(result) > 15
def test_extract_reasoning_decided_pattern(self):
from crewai.new_agent.executor import ConversationalAgentExecutor
text = "I decided to use Python for this task because it has the best library support for data analysis."
result = ConversationalAgentExecutor._extract_reasoning_from_text(text)
assert len(result) > 15
def test_extract_reasoning_fallback_first_sentence(self):
from crewai.new_agent.executor import ConversationalAgentExecutor
text = "The quarterly revenue exceeded expectations by 15 percent. This is good news for investors."
result = ConversationalAgentExecutor._extract_reasoning_from_text(text)
assert "quarterly revenue" in result
def test_extract_reasoning_empty_text(self):
from crewai.new_agent.executor import ConversationalAgentExecutor
assert ConversationalAgentExecutor._extract_reasoning_from_text("") == ""
def test_extract_reasoning_short_text(self):
from crewai.new_agent.executor import ConversationalAgentExecutor
result = ConversationalAgentExecutor._extract_reasoning_from_text("ok")
assert result == ""
def test_standard_different_from_minimal(self):
"""Standard tier should produce reasoning; minimal should not."""
from crewai.new_agent.executor import ConversationalAgentExecutor
response_text = "I decided to search the web because the user needs current information about AI frameworks."
# Standard: should extract reasoning
standard_result = ConversationalAgentExecutor._extract_reasoning_from_text(
response_text
)
assert len(standard_result) > 0
@pytest.mark.asyncio
async def test_maybe_generate_reasoning_minimal_returns_empty(self):
executor, _ = _make_executor(provenance_detail="minimal")
result = await executor._maybe_generate_reasoning(
"response", {"msg": "test"}, "Some outcome text here with reasoning."
)
assert result == ""
@pytest.mark.asyncio
async def test_maybe_generate_reasoning_standard_extracts(self):
executor, _ = _make_executor(provenance_detail="standard")
result = await executor._maybe_generate_reasoning(
"response",
{"msg": "test"},
"Because the user asked about recent trends, I searched for the latest publications.",
)
assert len(result) > 0
def test_reasoning_truncated_at_300_chars(self):
from crewai.new_agent.executor import ConversationalAgentExecutor
long_text = "My reasoning is: " + "a" * 500
result = ConversationalAgentExecutor._extract_reasoning_from_text(long_text)
assert len(result) <= 300

View File

@@ -1,488 +0,0 @@
"""Tests for GAP-122 through GAP-125 (fifth audit pass)."""
from __future__ import annotations
import asyncio
import hashlib
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from crewai.new_agent.models import (
AgentSettings,
AgentStatus,
Message,
ProvenanceEntry,
TokenUsage,
)
# ── Helpers ────────────────────────────────────────────────────────
def _make_executor(
*,
provenance_detail: str = "standard",
memory_enabled: bool = True,
tools: list | None = None,
coworker_tools: list | None = None,
):
"""Build a lightweight mock executor for testing."""
from crewai.new_agent.executor import ConversationalAgentExecutor
agent = MagicMock()
agent.id = "test-agent-1"
agent.role = "Researcher"
agent.goal = "Research things"
agent.backstory = ""
agent.settings = AgentSettings(
provenance_detail=provenance_detail,
memory_enabled=memory_enabled,
)
agent.response_model = None
agent._llm_instance = MagicMock()
agent._llm_instance.model = "openai/gpt-4o"
agent._resolved_tools = tools or []
agent._coworker_tools = coworker_tools or []
agent._knowledge_discovery = None
agent.step_callback = None
agent.verbose = False
agent.knowledge = None
agent.knowledge_sources = []
executor = ConversationalAgentExecutor(agent=agent, provider=None)
return executor, agent
# ── GAP-122: Training feedback in DreamingEngine ────────────────
class TestGAP122TrainingFeedback:
"""DreamingEngine should accept and incorporate training feedback."""
def test_add_training_feedback_stores_entry(self):
from crewai.new_agent.dreaming import DreamingEngine
agent = MagicMock()
agent.role = "Researcher"
agent.id = "r1"
agent.settings = AgentSettings()
agent._executor = None
agent._memory_instance = None
engine = DreamingEngine(agent)
engine.add_training_feedback("Always cite sources", "research task")
assert len(engine._training_feedback) == 1
assert engine._training_feedback[0]["feedback"] == "Always cite sources"
assert engine._training_feedback[0]["task_context"] == "research task"
assert "timestamp" in engine._training_feedback[0]
def test_add_training_feedback_increments_memory_count(self):
from crewai.new_agent.dreaming import DreamingEngine
agent = MagicMock()
agent.role = "Researcher"
agent.id = "r1"
agent.settings = AgentSettings()
agent._executor = None
agent._memory_instance = None
engine = DreamingEngine(agent)
assert engine._memories_since_last_dream == 0
engine.add_training_feedback("feedback")
assert engine._memories_since_last_dream == 1
@pytest.mark.asyncio
async def test_training_feedback_cleared_after_consolidation(self):
"""After _consolidate_memories, training feedback should be consumed."""
from crewai.new_agent.dreaming import DreamingEngine
agent = MagicMock()
agent.role = "Researcher"
agent.id = "r1"
agent.settings = AgentSettings()
agent._executor = None
agent._memory_instance = None
engine = DreamingEngine(agent)
engine.add_training_feedback("Always be concise")
engine.add_training_feedback("Use bullet points", "report task")
assert len(engine._training_feedback) == 2
# Call _consolidate_memories — will fail on LLM call but should still clear feedback
await engine._consolidate_memories(["memory 1", "memory 2"])
# Feedback should be cleared even if consolidation returns empty (no LLM)
assert len(engine._training_feedback) == 0
def test_training_feedback_without_context(self):
from crewai.new_agent.dreaming import DreamingEngine
agent = MagicMock()
agent.role = "Writer"
agent.id = "w1"
agent.settings = AgentSettings()
agent._executor = None
agent._memory_instance = None
engine = DreamingEngine(agent)
engine.add_training_feedback("Be more creative")
assert engine._training_feedback[0]["task_context"] == ""
def test_train_calls_add_training_feedback(self):
"""NewAgent.train() should successfully call add_training_feedback now."""
from crewai.new_agent.dreaming import DreamingEngine
agent = MagicMock()
agent.role = "Researcher"
agent.id = "r1"
agent.settings = AgentSettings()
agent._executor = None
agent._memory_instance = None
engine = DreamingEngine(agent)
# This should not raise
engine.add_training_feedback("Use formal language", "writing task")
assert len(engine._training_feedback) == 1
def test_multiple_feedback_entries_accumulated(self):
from crewai.new_agent.dreaming import DreamingEngine
agent = MagicMock()
agent.role = "Researcher"
agent.id = "r1"
agent.settings = AgentSettings()
agent._executor = None
agent._memory_instance = None
engine = DreamingEngine(agent)
for i in range(5):
engine.add_training_feedback(f"Feedback {i}")
assert len(engine._training_feedback) == 5
assert engine._memories_since_last_dream == 5
# ── GAP-123: Event listener → telemetry span completion ─────────
class TestGAP123TelemetrySpanCompletion:
"""Event listener completed handlers should close telemetry spans."""
def test_telemetry_has_pending_spans_dict(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
assert hasattr(tel, "_pending_spans")
assert isinstance(tel._pending_spans, dict)
def test_store_and_retrieve_span(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_span = MagicMock()
key = tel._span_key("agent-1", "delegation", "writer")
tel.store_span(key, mock_span)
assert tel.retrieve_span(key) is mock_span
# Second retrieval should return None (popped)
assert tel.retrieve_span(key) is None
def test_store_span_ignores_none(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
tel.store_span("key", None)
assert len(tel._pending_spans) == 0
def test_span_key_format(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
assert tel._span_key("a1", "delegation", "writer") == "a1:delegation:writer"
assert tel._span_key("a1", "dreaming") == "a1:dreaming:"
def test_tool_usage_completed_event_method_exists(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
assert hasattr(tel, "tool_usage_completed_event")
# Should not raise even without telemetry backend
tel.tool_usage_completed_event(agent_id="a1", tool_name="search")
def test_spawn_completed_event_method_exists(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
assert hasattr(tel, "spawn_completed_event")
tel.spawn_completed_event(agent_id="a1", spawn_id="s1")
def test_agent_registered_in_telemetry_registry(self):
"""_init_telemetry should register the agent so event listeners can find it."""
from crewai.new_agent.telemetry import (
NewAgentTelemetry,
get_telemetry_for_agent,
register_agent,
unregister_agent,
)
tel = NewAgentTelemetry()
register_agent("test-123", tel)
try:
found = get_telemetry_for_agent("test-123")
assert found is tel
finally:
unregister_agent("test-123")
assert get_telemetry_for_agent("test-123") is None
def test_event_listener_tool_completed_calls_telemetry(self):
"""_on_tool_completed handler should call tel.tool_usage_completed_event."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
tel.tool_usage_completed_event = MagicMock()
# Simulate what the event handler does
with patch("crewai.new_agent.event_listener._get_tel", return_value=tel):
from crewai.new_agent.event_listener import register_new_agent_listeners
from crewai.events.event_bus import crewai_event_bus
from crewai.new_agent.events import NewAgentToolUsageCompletedEvent
event = NewAgentToolUsageCompletedEvent(
new_agent_id="agent-tc", tool_name="search_web",
)
# Directly test the handler logic
handler_tel = tel
handler_tel.tool_usage_completed_event(
agent_id=event.new_agent_id, tool_name=event.tool_name,
)
tel.tool_usage_completed_event.assert_called_once_with(
agent_id="agent-tc", tool_name="search_web",
)
def test_event_listener_delegation_completed_closes_span(self):
"""Delegation started stores span, completed retrieves and closes it."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_span = MagicMock()
# Simulate started handler: creates span and stores it
key = tel._span_key("agent-dc", "delegation", "writer")
tel.store_span(key, mock_span)
# Simulate completed handler: retrieves span and calls completion
span = tel.retrieve_span(key)
assert span is mock_span
tel.delegation_completed(span, tokens_consumed=500, response_time_ms=1200)
# span should have been popped
assert tel.retrieve_span(key) is None
def test_event_listener_dreaming_completed_closes_span(self):
"""Dreaming started stores span, completed retrieves and closes it."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_span = MagicMock()
key = tel._span_key("agent-dr", "dreaming")
tel.store_span(key, mock_span)
span = tel.retrieve_span(key)
assert span is mock_span
tel.dreaming_completed(span, memories_processed=10, canonical_created=3)
assert tel.retrieve_span(key) is None
def test_event_listener_planning_completed_closes_span(self):
"""Planning started stores span, completed retrieves and closes it."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_span = MagicMock()
key = tel._span_key("agent-pl", "planning")
tel.store_span(key, mock_span)
span = tel.retrieve_span(key)
assert span is mock_span
tel.planning_completed(span, steps_count=4)
assert tel.retrieve_span(key) is None
def test_event_listener_spawn_completed_closes_span(self):
"""Spawn started stores span, completed retrieves and closes it."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_span = MagicMock()
key = tel._span_key("agent-sp", "spawn", "spawn-1")
tel.store_span(key, mock_span)
span = tel.retrieve_span(key)
assert span is mock_span
tel.spawn_completed(span)
assert tel.retrieve_span(key) is None
def test_completed_handler_without_stored_span_is_safe(self):
"""If started event was missed, completed should not crash."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
key = tel._span_key("agent-x", "delegation", "writer")
span = tel.retrieve_span(key)
assert span is None
# delegation_completed with None span should not raise
tel.delegation_completed(None, tokens_consumed=0, response_time_ms=0)
# ── GAP-124: Agent fingerprint in telemetry spans ──────────────
class TestGAP124AgentFingerprint:
"""Agent fingerprint should be computed and set on telemetry spans."""
def test_fingerprint_stored_on_telemetry(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
tel.set_fingerprint("abc123def456")
assert tel._agent_fingerprint == "abc123def456"
def test_fingerprint_is_deterministic(self):
"""Same config should produce the same fingerprint."""
parts = [
"Researcher",
"Research things"[:100],
"search_web,write_doc",
"True",
"True",
]
digest1 = hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
digest2 = hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
assert digest1 == digest2
assert len(digest1) == 16
def test_different_config_different_fingerprint(self):
parts_a = ["Researcher", "Research", "search", "True", "True"]
parts_b = ["Writer", "Write stories", "write", "True", "False"]
fp_a = hashlib.sha256("|".join(parts_a).encode()).hexdigest()[:16]
fp_b = hashlib.sha256("|".join(parts_b).encode()).hexdigest()[:16]
assert fp_a != fp_b
def test_fingerprint_set_via_init_telemetry(self):
"""The _init_telemetry path should set a fingerprint on the telemetry."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
# Simulate what _init_telemetry does
tool_names = sorted(["search_web", "write_doc"])
parts = [
"Researcher",
"Research things"[:100],
",".join(tool_names),
"True",
"True",
]
digest = hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
tel.set_fingerprint(digest)
assert len(tel._agent_fingerprint) == 16
def test_fingerprint_included_in_agent_created_span(self):
"""agent_created() should set agent_fingerprint attribute on the span."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
tel.set_fingerprint("fp_test_12345678")
# Mock the tracer
mock_tracer = MagicMock()
mock_span = MagicMock()
mock_tracer.start_span.return_value = mock_span
tel._telemetry = MagicMock()
tel._telemetry._tracer = mock_tracer
tel.agent_created(
agent_id="a1", role="Researcher", goal="Research",
)
# Check that agent_fingerprint was set
set_calls = {
call.args[0]: call.args[1]
for call in mock_span.set_attribute.call_args_list
}
assert set_calls.get("agent_fingerprint") == "fp_test_12345678"
def test_fingerprint_included_in_execution_span(self):
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
tel.set_fingerprint("fp_exec_test")
mock_tracer = MagicMock()
mock_span = MagicMock()
mock_tracer.start_span.return_value = mock_span
tel._telemetry = MagicMock()
tel._telemetry._tracer = mock_tracer
tel.execution_started(agent_id="a1", conversation_id="c1")
set_calls = {
call.args[0]: call.args[1]
for call in mock_span.set_attribute.call_args_list
}
assert set_calls.get("agent_fingerprint") == "fp_exec_test"
# ── GAP-125: coworker_amp_count passed to telemetry ────────────
class TestGAP125CoworkerAMPCount:
"""AMP coworker count should be calculated and passed to telemetry."""
def test_amp_count_calculation(self):
"""Count of AMP-resolved coworkers should be correct."""
coworkers = []
for i in range(3):
cw = MagicMock()
cw._amp_resolved = i < 2 # First two are AMP
coworkers.append(cw)
amp_count = sum(
1 for cw in coworkers
if getattr(cw, "_amp_resolved", False)
)
assert amp_count == 2
def test_amp_count_zero_when_no_amp(self):
coworkers = [MagicMock(spec=[]) for _ in range(3)]
amp_count = sum(
1 for cw in coworkers
if getattr(cw, "_amp_resolved", False)
)
assert amp_count == 0
def test_amp_count_zero_when_no_coworkers(self):
coworkers: list = []
amp_count = sum(
1 for cw in coworkers
if getattr(cw, "_amp_resolved", False)
)
assert amp_count == 0
def test_coworker_amp_count_in_telemetry_span(self):
"""agent_created should include coworker_amp_count attribute."""
from crewai.new_agent.telemetry import NewAgentTelemetry
tel = NewAgentTelemetry()
mock_tracer = MagicMock()
mock_span = MagicMock()
mock_tracer.start_span.return_value = mock_span
tel._telemetry = MagicMock()
tel._telemetry._tracer = mock_tracer
tel.agent_created(
agent_id="a1", role="R", goal="G",
coworkers_count=3, coworker_amp_count=2,
)
set_calls = {
call.args[0]: call.args[1]
for call in mock_span.set_attribute.call_args_list
}
assert set_calls.get("new_agent_coworker_amp_count") == 2
assert set_calls.get("new_agent_coworkers_count") == 3

View File

@@ -1,561 +0,0 @@
"""Tests for GAP-24, GAP-31, GAP-36, GAP-37, GAP-38, GAP-40, GAP-41, GAP-45, GAP-56, GAP-63.
Covers:
- GAP-24: Anaphora resolution in memory encoding
- GAP-31: Concurrent conversation support
- GAP-36: Apps field warning
- GAP-37: Skills field resolution
- GAP-38: Security/A2A config storage
- GAP-40: Training -> canonical memories
- GAP-41: Memory scoping from provider context
- GAP-45: MemoryScope/MemorySlice types
- GAP-56: AMP circular guard in Python API
- GAP-63: AMP coworker definitions cache
"""
from __future__ import annotations
import asyncio
import logging
import os
import tempfile
from pathlib import Path
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from crewai.new_agent import (
AgentSettings,
MemoryScope,
MemorySlice,
Message,
NewAgent,
clear_amp_cache,
)
from crewai.new_agent.new_agent import (
_amp_cache,
_get_init_chain,
_ANAPHORA_PRONOUNS,
)
# ── GAP-45: MemoryScope / MemorySlice types ─────────────────────
class TestMemoryScopeModel:
def test_basic_creation(self):
scope = MemoryScope(namespace="project-alpha")
assert scope.namespace == "project-alpha"
assert scope.shared is False
def test_shared_flag(self):
scope = MemoryScope(namespace="shared-ns", shared=True)
assert scope.shared is True
def test_memory_slice_creation(self):
ms = MemorySlice(scope="team", user_id="user-1", tags=["important"])
assert ms.scope == "team"
assert ms.user_id == "user-1"
assert ms.tags == ["important"]
def test_memory_slice_defaults(self):
ms = MemorySlice()
assert ms.scope == ""
assert ms.user_id is None
assert ms.conversation_id is None
assert ms.tags == []
class TestMemoryScopeInAgent:
def test_memory_scope_sets_namespace(self):
agent = NewAgent(
role="R", goal="g",
memory=MemoryScope(namespace="test-ns"),
)
assert agent._memory_namespace == "test-ns"
assert agent._memory_shared is False
def test_memory_scope_shared(self):
agent = NewAgent(
role="R", goal="g",
memory=MemoryScope(namespace="shared-ns", shared=True),
)
assert agent._memory_namespace == "shared-ns"
assert agent._memory_shared is True
def test_memory_slice_sets_filter(self):
ms = MemorySlice(scope="team", user_id="user-1")
agent = NewAgent(
role="R", goal="g",
memory=ms,
)
assert agent._memory_namespace == "team"
assert agent._memory_filter is ms
def test_bool_memory_still_works(self):
agent = NewAgent(role="R", goal="g", memory=True)
# Should not crash, memory_namespace should be None
assert agent._memory_namespace is None
def test_false_memory_still_works(self):
agent = NewAgent(role="R", goal="g", memory=False)
assert agent._memory_instance is None
# ── GAP-56: AMP Circular Guard ──────────────────────────────────
class TestCircularCoworkerGuard:
def test_no_infinite_recursion(self):
"""Two agents referencing each other should not loop forever."""
# We create agents that would reference each other.
# Since they are NewAgent instances (not AMP handles), we can
# construct them without actual recursion by building one first
# and then adding it as a coworker to the other.
agent_a = NewAgent(role="Agent A", goal="Goal A")
agent_b = NewAgent(role="Agent B", goal="Goal B", coworkers=[agent_a])
# Now make A reference B — should not infinite loop
agent_a_with_b = NewAgent(
role="Agent A", goal="Goal A", coworkers=[agent_b],
)
# Should succeed without recursion
assert len(agent_a_with_b._resolved_coworkers) == 1
assert agent_a_with_b._resolved_coworkers[0].role == "Agent B"
def test_self_reference_skipped(self):
"""An agent referencing itself as a coworker should be ignored."""
agent = NewAgent(role="Solo", goal="Self")
agent2 = NewAgent(role="Solo", goal="Self", coworkers=[agent])
# Since the coworker has the same role, it's filtered out
assert len(agent2._resolved_coworkers) == 0
def test_init_chain_is_thread_local(self):
"""The init chain should be thread-local."""
chain = _get_init_chain()
assert isinstance(chain, set)
chain.add("test-id")
chain.discard("test-id")
# ── GAP-63: AMP Coworker Definitions Cache ─────────────────────
class TestAmpCache:
def setup_method(self):
clear_amp_cache()
def teardown_method(self):
clear_amp_cache()
def test_clear_amp_cache(self):
_amp_cache["test-handle"] = {"role": "Test", "goal": "g"}
assert "test-handle" in _amp_cache
clear_amp_cache()
assert len(_amp_cache) == 0
@patch("crewai.utilities.agent_utils.load_agent_from_repository")
def test_cache_hit_avoids_api_call(self, mock_load):
"""Second resolution of same handle should use cache, not call API."""
mock_load.return_value = {
"role": "Cached Agent",
"goal": "cached goal",
}
# Pre-populate cache
_amp_cache["org/agent-1"] = {
"role": "Cached Agent",
"goal": "cached goal",
}
agent = NewAgent(role="Manager", goal="Manage")
resolved = agent._resolve_amp_coworker("org/agent-1")
# API should NOT have been called because cache was hit
mock_load.assert_not_called()
assert resolved.role == "Cached Agent"
@patch("crewai.utilities.agent_utils.load_agent_from_repository")
def test_cache_miss_calls_api(self, mock_load):
"""First resolution should call API and populate cache."""
mock_load.return_value = {
"role": "New Agent",
"goal": "new goal",
}
agent = NewAgent(role="Manager", goal="Manage")
resolved = agent._resolve_amp_coworker("org/new-agent")
mock_load.assert_called_once_with("org/new-agent")
assert resolved.role == "New Agent"
assert "org/new-agent" in _amp_cache
# ── GAP-31: Concurrent Conversation Support ─────────────────────
class TestConcurrentConversations:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_different_conversation_ids(self, mock_llm):
mock_llm.side_effect = ["Response for conv-1.", "Response for conv-2."]
agent = NewAgent(role="R", goal="g")
r1 = await agent.amessage("Hello conv-1", conversation_id="conv-1")
r2 = await agent.amessage("Hello conv-2", conversation_id="conv-2")
assert r1.conversation_id == "conv-1"
assert r2.conversation_id == "conv-2"
h1 = agent.get_conversation_history("conv-1")
h2 = agent.get_conversation_history("conv-2")
assert len(h1) == 2 # user + agent
assert len(h2) == 2
assert h1[0].content == "Hello conv-1"
assert h2[0].content == "Hello conv-2"
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_default_conversation_backward_compat(self, mock_llm):
mock_llm.return_value = "Default response."
agent = NewAgent(role="R", goal="g")
# No conversation_id -> uses default
r = await agent.amessage("Hello")
assert r.conversation_id == agent._default_conversation_id
assert len(agent.conversation_history) == 2
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_get_conversation_history_unknown_id(self, mock_llm):
agent = NewAgent(role="R", goal="g")
history = agent.get_conversation_history("nonexistent")
assert history == []
def test_reset_specific_conversation(self):
agent = NewAgent(role="R", goal="g")
# Create a second conversation executor
executor = agent._get_or_create_executor("conv-X")
executor.conversation_history.append(
Message(role="user", content="test", conversation_id="conv-X"),
)
assert len(agent.get_conversation_history("conv-X")) == 1
agent.reset_conversation(conversation_id="conv-X")
assert agent.get_conversation_history("conv-X") == []
def test_reset_default_conversation(self):
agent = NewAgent(role="R", goal="g")
old_id = agent._default_conversation_id
agent.reset_conversation()
assert agent._default_conversation_id != old_id
assert len(agent.conversation_history) == 0
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_explain_specific_conversation(self, mock_llm):
mock_llm.return_value = "Answer."
agent = NewAgent(role="R", goal="g")
await agent.amessage("Q", conversation_id="conv-explain")
entries = agent.explain(conversation_id="conv-explain")
assert len(entries) == 1
assert entries[0].action == "response"
def test_explain_unknown_conversation_returns_empty(self):
agent = NewAgent(role="R", goal="g")
entries = agent.explain(conversation_id="nonexistent")
assert entries == []
@patch("crewai.new_agent.executor.aget_llm_response")
def test_sync_message_with_conversation_id(self, mock_llm):
mock_llm.return_value = "Sync response."
agent = NewAgent(role="R", goal="g")
r = agent.message("Hello", conversation_id="sync-conv-1")
assert r.conversation_id == "sync-conv-1"
# ── GAP-36: Apps Field Warning ──────────────────────────────────
class TestAppsWarning:
def test_apps_warning_logged(self, caplog):
with caplog.at_level(logging.WARNING, logger="crewai.new_agent"):
agent = NewAgent(
role="R", goal="g",
apps=["app1", "app2"],
)
assert "Apps integration requires the CrewAI Platform" in caplog.text
assert "2 app(s)" in caplog.text
def test_no_apps_no_warning(self, caplog):
with caplog.at_level(logging.WARNING, logger="crewai.new_agent"):
agent = NewAgent(role="R", goal="g")
assert "Apps integration" not in caplog.text
# ── GAP-37: Skills Field Resolution ─────────────────────────────
class TestSkillsResolution:
def test_skill_instance_added(self):
"""A skill object with run() is added directly."""
skill = MagicMock()
skill.run = MagicMock(return_value="result")
agent = NewAgent(role="R", goal="g", skills=[skill])
assert skill in agent._resolved_tools
def test_skill_path_loaded(self, tmp_path):
"""A Path pointing to a Python file with a tool class is loaded."""
skill_code = '''
class MySkill:
name = "my_skill"
description = "A test skill"
def run(self, **kwargs):
return "skill result"
'''
skill_file = tmp_path / "my_skill.py"
skill_file.write_text(skill_code)
agent = NewAgent(role="R", goal="g", skills=[skill_file])
# The skill class should have been instantiated and added
skill_tools = [t for t in agent._resolved_tools if hasattr(t, 'name') and getattr(t, 'name', '') == 'my_skill']
assert len(skill_tools) == 1
def test_invalid_skill_path_logged(self, caplog):
with caplog.at_level(logging.WARNING, logger="crewai.new_agent"):
agent = NewAgent(
role="R", goal="g",
skills=[Path("/nonexistent/skill.py")],
)
assert "Failed to load skill" in caplog.text or "Cannot load skill" in caplog.text
def test_empty_skills_no_error(self):
agent = NewAgent(role="R", goal="g", skills=[])
assert agent._resolved_tools is not None
# ── GAP-38: Security/A2A Config Storage ─────────────────────────
class TestSecurityA2AConfig:
def test_security_config_logged(self, caplog):
with caplog.at_level(logging.INFO, logger="crewai.new_agent"):
agent = NewAgent(
role="R", goal="g",
security_config={"auth": "token"},
)
assert "Security configuration applied" in caplog.text
def test_a2a_config_stored(self, caplog):
a2a_config = {"server": {"port": 8080}}
with caplog.at_level(logging.INFO, logger="crewai.new_agent"):
agent = NewAgent(
role="R", goal="g",
a2a=a2a_config,
)
assert agent._a2a_config == a2a_config
assert "A2A server configured" in caplog.text
def test_no_config_no_logs(self, caplog):
with caplog.at_level(logging.INFO, logger="crewai.new_agent"):
agent = NewAgent(role="R", goal="g")
assert "Security configuration" not in caplog.text
assert "A2A server" not in caplog.text
# ── GAP-40: Training → Canonical Memories ───────────────────────
class TestTraining:
def test_train_saves_to_memory(self):
agent = NewAgent(role="R", goal="g")
mock_memory = MagicMock()
agent._memory_instance = mock_memory
agent.train("Always double-check calculations", "math tasks")
mock_memory.remember.assert_called_once()
call_args = mock_memory.remember.call_args
saved_text = call_args[1].get("value") or call_args[0][0]
assert "Always double-check calculations" in saved_text
assert "math tasks" in saved_text
def test_train_without_context(self):
agent = NewAgent(role="R", goal="g")
mock_memory = MagicMock()
agent._memory_instance = mock_memory
agent.train("Be more concise")
call_args = mock_memory.remember.call_args
saved_text = call_args[1].get("value") or call_args[0][0]
assert "Be more concise" in saved_text
assert "Training feedback" in saved_text
def test_train_remember_failure_is_silent(self):
agent = NewAgent(role="R", goal="g")
mock_memory = MagicMock()
mock_memory.remember.side_effect = RuntimeError("storage error")
agent._memory_instance = mock_memory
# Should not raise
agent.train("Use shorter sentences")
def test_train_no_memory_is_noop(self):
agent = NewAgent(role="R", goal="g", memory=False)
# Should not raise
agent.train("Some feedback")
def test_train_notifies_dreaming_engine(self):
agent = NewAgent(role="R", goal="g")
mock_memory = MagicMock()
agent._memory_instance = mock_memory
mock_dreaming = MagicMock()
agent._dreaming_engine = mock_dreaming
agent.train("Important insight", "context")
mock_dreaming.add_training_feedback.assert_called_once_with(
"Important insight", "context",
)
# ── GAP-41: Memory Scoping from Provider Context ────────────────
class TestMemoryScopingFromProvider:
def test_provider_memory_scope_applied(self):
mock_provider = MagicMock()
mock_provider.memory_scope = "slack-channel-123"
agent = NewAgent(
role="R", goal="g",
provider=mock_provider,
)
assert agent._memory_namespace == "slack-channel-123"
def test_manual_memory_scope_overrides_provider(self):
mock_provider = MagicMock()
mock_provider.memory_scope = "provider-scope"
agent = NewAgent(
role="R", goal="g",
provider=mock_provider,
memory_scope="manual-scope",
)
# Manual scope takes priority
assert agent._memory_namespace == "manual-scope"
def test_no_scope_is_none(self):
agent = NewAgent(role="R", goal="g")
assert agent._memory_namespace is None
def test_provider_without_scope_attr(self):
mock_provider = MagicMock(spec=[]) # No memory_scope attr
agent = NewAgent(
role="R", goal="g",
provider=mock_provider,
)
assert agent._memory_namespace is None
# ── GAP-24: Anaphora Resolution ─────────────────────────────────
class TestAnaphoraResolution:
def test_pronoun_regex_matches(self):
assert _ANAPHORA_PRONOUNS.search("He prefers Python")
assert _ANAPHORA_PRONOUNS.search("She said that")
assert _ANAPHORA_PRONOUNS.search("It works well")
assert _ANAPHORA_PRONOUNS.search("They use those tools")
assert _ANAPHORA_PRONOUNS.search("This is important")
def test_no_pronouns_no_match(self):
assert not _ANAPHORA_PRONOUNS.search("Python works well for backend development")
def test_resolve_anaphora_no_pronouns_returns_unchanged(self):
agent = NewAgent(role="R", goal="g")
text = "Python is a great language for backend development"
result = agent._resolve_anaphora(text, [])
assert result == text
def test_prepare_memory_context_format(self):
agent = NewAgent(role="R", goal="g")
result = agent.prepare_memory_context("He prefers using it")
assert "Resolve all pronouns" in result
assert "He prefers using it" in result
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_prepare_memory_context_includes_history(self, mock_llm):
mock_llm.return_value = "Response about John."
agent = NewAgent(role="R", goal="g")
await agent.amessage("Tell me about John's preferences")
result = agent.prepare_memory_context("He prefers using it")
assert "John" in result or "preferences" in result
def test_resolve_anaphora_with_no_llm(self):
"""If LLM is None, should return text unchanged."""
agent = NewAgent(role="R", goal="g")
agent._llm_instance = None
text = "He likes it"
result = agent._resolve_anaphora(text, [])
assert result == text
# ── Integration: Multiple gaps working together ──────────────────
class TestIntegration:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_concurrent_conversations_isolated(self, mock_llm):
"""Messages in different conversations should not bleed."""
mock_llm.side_effect = [
"Conv A response 1.",
"Conv B response 1.",
"Conv A response 2.",
]
agent = NewAgent(role="R", goal="g")
await agent.amessage("A1", conversation_id="conv-a")
await agent.amessage("B1", conversation_id="conv-b")
await agent.amessage("A2", conversation_id="conv-a")
hist_a = agent.get_conversation_history("conv-a")
hist_b = agent.get_conversation_history("conv-b")
assert len(hist_a) == 4 # 2 user + 2 agent
assert len(hist_b) == 2 # 1 user + 1 agent
# Verify isolation
contents_a = [m.content for m in hist_a if m.role == "user"]
contents_b = [m.content for m in hist_b if m.role == "user"]
assert "A1" in contents_a
assert "A2" in contents_a
assert "B1" in contents_b
assert "B1" not in contents_a
def test_memory_scope_with_training(self):
"""Training should work alongside memory scoping."""
agent = NewAgent(
role="R", goal="g",
memory=MemoryScope(namespace="scoped-ns"),
)
mock_memory = MagicMock()
agent._memory_instance = mock_memory
agent.train("Always verify data sources")
mock_memory.remember.assert_called_once()

View File

@@ -1,500 +0,0 @@
"""Tests for GAP-47 through GAP-64 implementations.
Covers:
- GAP-47: Event listener telemetry bridge (registry)
- GAP-48: Dreaming — mark processed memories
- GAP-49: Sub-action token tracking (delegation/dreaming/planning)
- GAP-54: Dreaming — private memory scoping
- GAP-55: Delegation provenance summary
- GAP-57: Spawn events
- GAP-58: Parent memory for spawned copies
- GAP-61: Missing event handlers
- GAP-62: Reuse generated flows (save workflow recipes)
- GAP-64: Telemetry metadata counts
"""
from __future__ import annotations
import asyncio
import json
import os
import re
from collections import Counter
from datetime import datetime, timezone
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch, call
import pytest
from crewai.new_agent import (
AgentSettings,
Message,
NewAgent,
DreamingEngine,
PlanningEngine,
SpawnSubtaskTool,
TokenUsage,
)
from crewai.new_agent.coworker_tools import (
DelegateToCoworkerTool,
_build_provenance_summary,
build_coworker_tools,
)
from crewai.new_agent.telemetry import (
NewAgentTelemetry,
register_agent,
unregister_agent,
get_telemetry_for_agent,
_active_agents,
)
from crewai.new_agent.dreaming import _classify_scope, SCOPE_GLOBAL, SCOPE_USER, SCOPE_CONVERSATION
# ── GAP-47: Telemetry Registry ────────────────────────────────
class TestTelemetryRegistry:
def setup_method(self):
"""Clean the registry between tests."""
_active_agents.clear()
def test_register_and_lookup(self):
tel = NewAgentTelemetry()
register_agent("agent-123", tel)
assert get_telemetry_for_agent("agent-123") is tel
def test_unregister(self):
tel = NewAgentTelemetry()
register_agent("agent-123", tel)
unregister_agent("agent-123")
assert get_telemetry_for_agent("agent-123") is None
def test_lookup_unknown_returns_none(self):
assert get_telemetry_for_agent("nonexistent") is None
def test_multiple_agents(self):
tel1 = NewAgentTelemetry()
tel2 = NewAgentTelemetry()
register_agent("a1", tel1)
register_agent("a2", tel2)
assert get_telemetry_for_agent("a1") is tel1
assert get_telemetry_for_agent("a2") is tel2
def test_register_overwrites(self):
tel1 = NewAgentTelemetry()
tel2 = NewAgentTelemetry()
register_agent("a1", tel1)
register_agent("a1", tel2)
assert get_telemetry_for_agent("a1") is tel2
# ── GAP-48: Dreaming — Mark Processed Memories ────────────────
class TestDreamingProcessedMemories:
def test_processed_ids_initially_empty(self):
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
assert len(engine._processed_memory_ids) == 0
def test_cycle_count_increments(self):
agent = NewAgent(
role="R", goal="g",
memory=False,
settings=AgentSettings(memory_enabled=False, self_improving=True),
)
engine = agent._dreaming_engine
assert engine._cycle_count == 0
@pytest.mark.asyncio
async def test_dream_increments_cycle_count(self):
agent = NewAgent(
role="R", goal="g",
memory=False,
settings=AgentSettings(memory_enabled=False, self_improving=True),
)
engine = agent._dreaming_engine
await engine.dream()
assert engine._cycle_count == 1
await engine.dream()
assert engine._cycle_count == 2
def test_get_recent_memories_filters_processed(self):
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
# Mock a memory instance
mock_memory = MagicMock()
mock_result1 = MagicMock()
mock_result1.id = "mem-1"
mock_result1.content = "First memory"
mock_result2 = MagicMock()
mock_result2.id = "mem-2"
mock_result2.content = "Second memory"
mock_memory.recall.return_value = [mock_result1, mock_result2]
# First call gets both
contents, ids = engine._get_recent_memories(mock_memory)
assert len(contents) == 2
assert "mem-1" in ids
assert "mem-2" in ids
# Mark mem-1 as processed
engine._processed_memory_ids.add("mem-1")
# Second call should filter out mem-1
contents, ids = engine._get_recent_memories(mock_memory)
assert len(contents) == 1
assert contents[0] == "Second memory"
assert "mem-2" in ids
def test_processed_ids_path(self):
agent = NewAgent(role="Test Agent", goal="g")
engine = agent._dreaming_engine
path = engine._processed_ids_path()
assert ".crewai/dreaming/" in path
assert "processed.json" in path
# ── GAP-49: Sub-Action Token Tracking ─────────────────────────
class TestSubActionTokenTracking:
def test_dreaming_last_cycle_tokens_initially_none(self):
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
assert engine._last_cycle_tokens is None
def test_planning_last_plan_tokens_initially_none(self):
agent = NewAgent(role="R", goal="g")
engine = agent._planning_engine
assert engine._last_plan_tokens is None
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_delegation_records_tokens_on_parent(self, mock_llm):
mock_llm.side_effect = [
"Coworker result.",
"Manager summary.",
]
writer = NewAgent(role="Writer", goal="Write")
manager = NewAgent(role="Manager", goal="Manage", coworkers=[writer])
tool = DelegateToCoworkerTool(coworker=writer, parent_agent=manager)
result = tool._run(message="Write something")
# Should not raise and should contain the response
assert "Coworker result." in result
# ── GAP-54: Dreaming — Private Memory Scoping ────────────────
class TestMemoryScoping:
def test_classify_global(self):
assert _classify_scope("Best practice: always validate inputs") == SCOPE_GLOBAL
assert _classify_scope("API rate limit is 100 req/min") == SCOPE_GLOBAL
def test_classify_user(self):
assert _classify_scope("User prefers dark mode") == SCOPE_USER
assert _classify_scope("My preference is to use Python") == SCOPE_USER
assert _classify_scope("I always use VS Code") == SCOPE_USER
def test_classify_conversation(self):
assert _classify_scope("In this conversation, we discussed AI") == SCOPE_CONVERSATION
assert _classify_scope("Just now the user asked about pricing") == SCOPE_CONVERSATION
def test_global_is_default(self):
assert _classify_scope("The sky is blue.") == SCOPE_GLOBAL
assert _classify_scope("Python 3.12 added new features.") == SCOPE_GLOBAL
# ── GAP-55: Delegation Provenance Summary ─────────────────────
class TestDelegationProvenanceSummary:
def test_empty_provenance(self):
coworker = MagicMock()
coworker._executor = MagicMock()
coworker._executor.provenance_log = []
summary = _build_provenance_summary(coworker, "Writer", 1000, 100, 50)
assert summary == ""
def test_with_tool_calls(self):
from crewai.new_agent.models import ProvenanceEntry
coworker = MagicMock()
coworker._executor = MagicMock()
coworker._executor.provenance_log = [
ProvenanceEntry(action="tool_call", inputs={"tool": "search_web"}),
ProvenanceEntry(action="tool_call", inputs={"tool": "search_web"}),
ProvenanceEntry(action="tool_call", inputs={"tool": "read_file"}),
ProvenanceEntry(action="response", inputs={"user_message": "test"}),
]
summary = _build_provenance_summary(coworker, "Researcher", 2000, 500, 200)
assert "Coworker: Researcher" in summary
assert "search_web (2x)" in summary
assert "read_file" in summary
assert "Steps: 4" in summary
def test_no_executor(self):
coworker = MagicMock()
coworker._executor = None
summary = _build_provenance_summary(coworker, "Writer", 1000, 100, 50)
assert summary == ""
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_delegation_includes_summary(self, mock_llm):
mock_llm.return_value = "Draft article about AI."
writer = NewAgent(role="Writer", goal="Write articles")
# Give the writer some provenance so the summary is non-empty
from crewai.new_agent.models import ProvenanceEntry
writer._executor.provenance_log = [
ProvenanceEntry(action="tool_call", inputs={"tool": "search_web"}),
ProvenanceEntry(action="response", inputs={"user_message": "test"}),
]
tool = DelegateToCoworkerTool(coworker=writer)
result = tool._run(message="Write about AI")
# The result should contain the provenance summary
assert "[Coworker: Writer" in result
assert "search_web" in result
# ── GAP-57: Spawn Events ─────────────────────────────────────
class TestSpawnEvents:
@patch("crewai.new_agent.executor.aget_llm_response")
def test_spawn_emits_events(self, mock_llm):
mock_llm.return_value = "Subtask result."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
can_spawn_copies=True,
max_spawn_depth=1,
memory_enabled=False,
),
)
tool = SpawnSubtaskTool(agent=agent)
emitted_events: list[Any] = []
from crewai.events.event_bus import crewai_event_bus
original_emit = crewai_event_bus.emit
def capture_emit(source: Any, event: Any = None, **kw: Any) -> Any:
if event is not None:
emitted_events.append(event)
return original_emit(source, event=event)
with patch.object(crewai_event_bus, "emit", side_effect=capture_emit):
result = tool._run(subtasks=["Task A"])
from crewai.new_agent.events import (
NewAgentSpawnStartedEvent,
NewAgentSpawnCompletedEvent,
)
spawn_started = [e for e in emitted_events if isinstance(e, NewAgentSpawnStartedEvent)]
assert len(spawn_started) >= 1
assert spawn_started[0].spawn_depth == 1
def test_spawn_provenance_includes_spawn_id(self):
"""Verify the spawn ID is included in provenance entries."""
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
can_spawn_copies=True,
max_spawn_depth=1,
memory_enabled=False,
),
)
tool = SpawnSubtaskTool(agent=agent)
with patch("crewai.new_agent.executor.aget_llm_response", return_value="Done."):
tool._run(subtasks=["Task A"])
# Check provenance
prov = agent._executor.provenance_log
spawn_entries = [e for e in prov if e.action == "spawn"]
assert len(spawn_entries) >= 1
assert "spawn_id" in spawn_entries[0].inputs
# ── GAP-58: Parent Memory for Spawned Copies ─────────────────
class TestParentMemoryInjection:
@patch("crewai.new_agent.executor.aget_llm_response")
def test_spawn_with_parent_memory(self, mock_llm):
"""When parent has memory, spawned copies should receive memory context."""
mock_llm.return_value = "Result with context."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
can_spawn_copies=True,
max_spawn_depth=1,
),
)
# Mock the parent's memory
mock_memory = MagicMock()
mock_result = MagicMock()
mock_result.content = "Important context about the task"
mock_memory.recall.return_value = [mock_result]
agent._memory_instance = mock_memory
tool = SpawnSubtaskTool(agent=agent)
result = tool._run(subtasks=["Do something specific"])
# The memory should have been queried
mock_memory.recall.assert_called()
assert "[Subtask 1]" in result
@patch("crewai.new_agent.executor.aget_llm_response")
def test_spawn_without_parent_memory(self, mock_llm):
"""When parent has no memory, spawned copies should still work."""
mock_llm.return_value = "Result without context."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(
can_spawn_copies=True,
max_spawn_depth=1,
memory_enabled=False,
),
)
tool = SpawnSubtaskTool(agent=agent)
result = tool._run(subtasks=["Do something"])
assert "[Subtask 1]" in result
# ── GAP-61: Missing Event Handlers ───────────────────────────
class TestMissingEventHandlers:
def test_all_events_have_handlers(self):
"""All event types in events.py should have handlers registered."""
from crewai.new_agent import events as events_module
# Get all event classes
event_classes = []
for name in dir(events_module):
obj = getattr(events_module, name)
if isinstance(obj, type) and name.startswith("NewAgent") and name.endswith("Event"):
event_classes.append(name)
# Verify there are many event types
assert len(event_classes) >= 29, f"Expected at least 29 event types, found {len(event_classes)}"
def test_event_listener_imports_all_event_types(self):
"""The event listener module should import all relevant event types."""
import crewai.new_agent.event_listener as listener_module
# Just importing is enough to check it doesn't error
assert hasattr(listener_module, "register_new_agent_listeners")
# ── GAP-62: Reuse Generated Flows ────────────────────────────
class TestWorkflowRecipes:
def test_save_flow_recipe(self, tmp_path, monkeypatch):
"""Test that workflow recipes are saved as JSON files."""
monkeypatch.chdir(tmp_path)
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
workflow = {
"tools": ["search_web", "read_file", "summarize"],
"count": 5,
}
engine._save_flow_recipe(workflow)
# Check that the recipe file was created
flows_dir = tmp_path / ".crewai" / "flows"
assert flows_dir.exists()
# Check manifest
manifest_path = flows_dir / "manifest.json"
assert manifest_path.exists()
manifest = json.loads(manifest_path.read_text())
assert len(manifest) == 1
assert manifest[0]["tools"] == ["search_web", "read_file", "summarize"]
# Check recipe file
recipe_files = list(flows_dir.glob("*.json"))
assert len(recipe_files) >= 2 # manifest + at least one recipe
def test_discovered_flows_loaded(self, tmp_path, monkeypatch):
"""Test that discovered flows are loaded from disk on init."""
monkeypatch.chdir(tmp_path)
# Pre-create manifest
flows_dir = tmp_path / ".crewai" / "flows"
flows_dir.mkdir(parents=True)
manifest = [{"name": "test_flow", "path": "test.json", "tools": ["a", "b"]}]
(flows_dir / "manifest.json").write_text(json.dumps(manifest))
agent = NewAgent(role="R", goal="g")
engine = agent._dreaming_engine
assert len(engine._discovered_flows) == 1
assert engine._discovered_flows[0]["name"] == "test_flow"
# ── GAP-64: Telemetry Metadata Counts ────────────────────────
class TestTelemetryMetadataCounts:
def test_agent_created_accepts_new_params(self):
"""Verify agent_created() accepts the new metadata count parameters."""
tel = NewAgentTelemetry()
# Should not raise
tel.agent_created(
agent_id="a1",
role="R",
goal="g",
llm="gpt-4o",
tools_count=5,
coworkers_count=2,
memory_enabled=True,
planning_enabled=True,
coworker_amp_count=1,
mcp_count=3,
apps_count=2,
knowledge_source_count=4,
tool_count=5,
)
def test_agent_created_backward_compatible(self):
"""Calling agent_created() without the new params still works."""
tel = NewAgentTelemetry()
tel.agent_created(
agent_id="a1",
role="R",
goal="g",
)
def test_new_telemetry_methods_exist(self):
"""Verify new telemetry span methods exist."""
tel = NewAgentTelemetry()
# All new methods should be callable without error
tel.conversation_reset(agent_id="a1")
tel.message_received(agent_id="a1", message_length=42)
tel.message_sent(agent_id="a1", input_tokens=100, output_tokens=50)
tel.llm_call_started(agent_id="a1", model="gpt-4o")
tel.llm_call_completed(agent_id="a1", model="gpt-4o", input_tokens=100)
tel.llm_call_failed(agent_id="a1", error="test")
tel.tool_usage_started(agent_id="a1", tool_name="search")
tel.tool_usage_failed(agent_id="a1", tool_name="search", error="fail")
tel.delegation_failed(agent_id="a1", coworker_role="Writer", error="fail")
tel.fire_and_forget_dispatched(agent_id="a1", coworker_role="Writer")
tel.fire_and_forget_completed(agent_id="a1", coworker_role="Writer")
tel.spawn_failed(agent_id="a1", spawn_id="s1", error="fail")
tel.context_summarized(agent_id="a1")
tel.narration_guard_triggered(agent_id="a1", retries=1)
tel.workflow_detected(agent_id="a1", tools=["a", "b"], count=3)
tel.workflow_proposed(agent_id="a1", description="test")
tel.workflow_confirmed(agent_id="a1")
tel.knowledge_query(agent_id="a1")
tel.knowledge_confirmed(agent_id="a1", source_type="file")
tel.knowledge_rejected(agent_id="a1")
tel.explain_requested(agent_id="a1")
tel.guardrail_passed(agent_id="a1", guardrail_type="code")
tel.status_update(state="thinking", detail="Working")

View File

@@ -1,542 +0,0 @@
"""Tests for guardrails, memory integration, events, and advanced features."""
from __future__ import annotations
import json
from unittest.mock import MagicMock, patch
import pytest
from crewai.new_agent import AgentSettings, Message, NewAgent
from crewai.new_agent.events import (
NewAgentConversationStartedEvent,
NewAgentGuardrailPassedEvent,
NewAgentGuardrailRejectedEvent,
NewAgentMessageReceivedEvent,
NewAgentMessageSentEvent,
NewAgentDelegationStartedEvent,
NewAgentDelegationCompletedEvent,
NewAgentToolUsageStartedEvent,
NewAgentToolUsageCompletedEvent,
NewAgentDreamingStartedEvent,
NewAgentDreamingCompletedEvent,
NewAgentPlanningStartedEvent,
NewAgentPlanningCompletedEvent,
NewAgentSpawnStartedEvent,
NewAgentSpawnCompletedEvent,
NewAgentMemorySaveEvent,
NewAgentMemoryRecallEvent,
NewAgentKnowledgeQueryEvent,
NewAgentExplainRequestedEvent,
)
# ── Guardrail tests ─────────────────────────────────────────
class TestGuardrails:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_code_guardrail_passes(self, mock_llm):
mock_llm.return_value = "Safe response."
def my_guardrail(response: str) -> tuple[bool, str]:
return True, ""
agent = NewAgent(
role="R", goal="g",
guardrail=my_guardrail,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Hi")
assert result.content == "Safe response."
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_code_guardrail_rejects_and_retries(self, mock_llm):
mock_llm.side_effect = ["Bad response with SECRET.", "Clean response."]
call_count = 0
def my_guardrail(response: str) -> tuple[bool, str]:
nonlocal call_count
call_count += 1
if "SECRET" in response:
return False, "Do not include secrets."
return True, ""
agent = NewAgent(
role="R", goal="g",
guardrail=my_guardrail,
settings=AgentSettings(memory_enabled=False, max_retry_limit=2),
)
result = await agent.amessage("Tell me a secret")
assert call_count >= 1
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_bool_guardrail(self, mock_llm):
mock_llm.return_value = "OK response."
def simple_guard(response: str) -> bool:
return len(response) > 0
agent = NewAgent(
role="R", goal="g",
guardrail=simple_guard,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Hi")
assert result.content == "OK response."
# ── Memory integration tests ────────────────────────────────
class TestMemoryIntegration:
def test_memory_enabled_by_default(self):
agent = NewAgent(role="R", goal="g")
assert agent.settings.memory_enabled is True
def test_memory_disabled(self):
agent = NewAgent(
role="R", goal="g",
memory=False,
settings=AgentSettings(memory_enabled=False),
)
assert agent._memory_instance is None
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_memory_recall_in_prompt(self, mock_llm):
mock_llm.return_value = "Response with memory context."
agent = NewAgent(
role="Researcher",
goal="Research",
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("What do you know?")
stack = agent.last_prompt_stack
assert stack is not None
layer_names = [l.name for l in stack.layers]
assert "soul" in layer_names
assert "temporal" in layer_names
# ── Event types tests ────────────────────────────────────────
class TestAllEventTypes:
"""Verify all event types can be instantiated with proper defaults."""
def test_conversation_started(self):
e = NewAgentConversationStartedEvent(new_agent_id="a1", new_agent_role="R", conversation_id="c1")
assert e.type == "new_agent_conversation_started"
def test_message_received(self):
e = NewAgentMessageReceivedEvent(new_agent_id="a1", message_length=42, conversation_id="c1")
assert e.message_length == 42
def test_message_sent(self):
e = NewAgentMessageSentEvent(new_agent_id="a1", model="gpt-4o", input_tokens=100, output_tokens=50, conversation_id="c1")
assert e.input_tokens == 100
def test_tool_usage_started(self):
e = NewAgentToolUsageStartedEvent(new_agent_id="a1", tool_name="search")
assert e.tool_name == "search"
def test_tool_usage_completed(self):
e = NewAgentToolUsageCompletedEvent(new_agent_id="a1", tool_name="search")
assert e.type == "new_agent_tool_usage_completed"
def test_delegation_started(self):
e = NewAgentDelegationStartedEvent(
new_agent_id="a1",
coworker_role="Writer",
delegation_mode="sync",
coworker_source="local",
)
assert e.coworker_source == "local"
def test_delegation_completed(self):
e = NewAgentDelegationCompletedEvent(
new_agent_id="a1",
coworker_role="Writer",
tokens_consumed=500,
response_time_ms=2000,
)
assert e.tokens_consumed == 500
def test_guardrail_passed(self):
e = NewAgentGuardrailPassedEvent(new_agent_id="a1", guardrail_type="code")
assert e.guardrail_type == "code"
def test_guardrail_rejected(self):
e = NewAgentGuardrailRejectedEvent(new_agent_id="a1", guardrail_type="llm", retries=2)
assert e.retries == 2
def test_dreaming(self):
e = NewAgentDreamingStartedEvent(new_agent_id="a1")
assert e.type == "new_agent_dreaming_started"
e2 = NewAgentDreamingCompletedEvent(
new_agent_id="a1",
memories_processed=10,
canonical_created=3,
workflows_detected=1,
)
assert e2.canonical_created == 3
def test_planning(self):
e = NewAgentPlanningStartedEvent(new_agent_id="a1")
assert e.type == "new_agent_planning_started"
e2 = NewAgentPlanningCompletedEvent(new_agent_id="a1", plan_steps_count=5)
assert e2.plan_steps_count == 5
def test_spawn(self):
e = NewAgentSpawnStartedEvent(
new_agent_id="a1",
spawn_id="s1",
parent_id="p1",
spawn_depth=1,
)
assert e.spawn_depth == 1
e2 = NewAgentSpawnCompletedEvent(new_agent_id="a1", spawn_id="s1")
assert e2.type == "new_agent_spawn_completed"
def test_memory_events(self):
e = NewAgentMemorySaveEvent(new_agent_id="a1", scope="/user")
assert e.scope == "/user"
e2 = NewAgentMemoryRecallEvent(new_agent_id="a1", scope="/user", results_count=3)
assert e2.results_count == 3
def test_explain_event(self):
e = NewAgentExplainRequestedEvent(new_agent_id="a1")
assert e.type == "new_agent_explain_requested"
# ── Event emission tests ─────────────────────────────────────
class TestEventEmission:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_events_emitted_on_message(self, mock_llm):
mock_llm.return_value = "Response."
emitted_events = []
def capture_event(source, event):
emitted_events.append(event)
with patch("crewai.events.event_bus.crewai_event_bus.emit", side_effect=capture_event):
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(memory_enabled=False),
)
await agent.amessage("Hello")
event_types = [type(e).__name__ for e in emitted_events]
# GAP-84: At construction, NewAgentCreatedEvent is emitted instead of ConversationStarted
assert "NewAgentCreatedEvent" in event_types
assert "NewAgentMessageReceivedEvent" in event_types
assert "NewAgentMessageSentEvent" in event_types
# ── Structured output tests ──────────────────────────────────
class TestStructuredOutput:
def test_response_model_attribute(self):
from pydantic import BaseModel
class Result(BaseModel):
summary: str
confidence: float
agent = NewAgent(
role="R", goal="g",
response_model=Result,
settings=AgentSettings(memory_enabled=False),
)
assert agent.response_model is Result
# ── Multi-agent delegation tests ─────────────────────────────
class TestMultiAgentDelegation:
def test_multiple_coworkers(self):
writer = NewAgent(role="Writer", goal="Write", settings=AgentSettings(memory_enabled=False))
reviewer = NewAgent(role="Reviewer", goal="Review", settings=AgentSettings(memory_enabled=False))
manager = NewAgent(
role="Manager",
goal="Manage",
coworkers=[writer, reviewer],
settings=AgentSettings(memory_enabled=False),
)
assert len(manager._resolved_coworkers) == 2
# 2 individual delegation tools + 1 multi-delegate tool
assert len(manager._coworker_tools) == 3
tool_names = [t.name for t in manager._coworker_tools]
assert any("writer" in n.lower() for n in tool_names)
assert any("reviewer" in n.lower() for n in tool_names)
assert any("multiple" in n.lower() for n in tool_names)
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_delegation_via_tool(self, mock_llm):
mock_llm.return_value = "Writer's output."
writer = NewAgent(
role="Writer", goal="Write articles",
settings=AgentSettings(memory_enabled=False),
)
from crewai.new_agent.coworker_tools import DelegateToCoworkerTool
tool = DelegateToCoworkerTool(coworker=writer, source="local")
result = tool._run(message="Write about AI")
assert "Writer's output." in result
def test_coworker_tool_args_schema(self):
writer = NewAgent(role="Writer", goal="Write", settings=AgentSettings(memory_enabled=False))
from crewai.new_agent.coworker_tools import DelegateToCoworkerTool
tool = DelegateToCoworkerTool(coworker=writer)
schema = tool.args_schema.model_json_schema()
assert "message" in schema["properties"]
assert "fire_and_forget" in schema["properties"]
# ── LLM Guardrail tests ────────────────────────────────────
class TestLLMGuardrails:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_llm_guardrail_passes(self, mock_llm):
"""LLM guardrail that returns PASS should let the response through."""
from crewai.tasks.llm_guardrail import LLMGuardrail
# First call: the main agent response. Second call: guardrail evaluation.
mock_llm.side_effect = ["A good response.", "PASS"]
mock_guardrail_llm = MagicMock()
guardrail = LLMGuardrail(
description="Response must be polite.",
llm=mock_guardrail_llm,
)
agent = NewAgent(
role="R", goal="g",
guardrail=guardrail,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Hi")
assert result.content == "A good response."
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_llm_guardrail_rejects_and_retries(self, mock_llm):
"""LLM guardrail that returns FAIL should trigger regeneration."""
from crewai.tasks.llm_guardrail import LLMGuardrail
# Call sequence:
# 1. Main response: "Bad response"
# 2. Guardrail evaluation: "FAIL: contains rude language"
# 3. Regeneration: "Fixed response"
# 4. Guardrail re-evaluation: "PASS"
mock_llm.side_effect = [
"Bad response",
"FAIL: contains rude language",
"Fixed response",
"PASS",
]
mock_guardrail_llm = MagicMock()
guardrail = LLMGuardrail(
description="Response must be polite.",
llm=mock_guardrail_llm,
)
agent = NewAgent(
role="R", goal="g",
guardrail=guardrail,
settings=AgentSettings(memory_enabled=False, max_retry_limit=2),
)
result = await agent.amessage("Be rude")
# After FAIL, it regenerates and the guardrail passes
assert result.content == "Fixed response"
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_llm_guardrail_falls_back_to_agent_llm(self, mock_llm):
"""When guardrail has no LLM, it should use the agent's LLM."""
from crewai.tasks.llm_guardrail import LLMGuardrail
mock_llm.side_effect = ["Some response.", "PASS"]
guardrail = LLMGuardrail(
description="Response must be safe.",
llm=None, # No guardrail LLM — should fall back to agent's
)
# Override llm to None so the isinstance(llm, str) path is not hit
guardrail.llm = None
agent = NewAgent(
role="R", goal="g",
guardrail=guardrail,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Hello")
assert result.content == "Some response."
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_llm_guardrail_emits_correct_event_type(self, mock_llm):
"""LLM guardrail should emit events with guardrail_type='llm'."""
from crewai.tasks.llm_guardrail import LLMGuardrail
mock_llm.side_effect = ["Response.", "PASS"]
emitted_events = []
def capture_event(source, event):
emitted_events.append(event)
guardrail = LLMGuardrail(
description="Must be safe.",
llm=MagicMock(),
)
with patch("crewai.events.event_bus.crewai_event_bus.emit", side_effect=capture_event):
agent = NewAgent(
role="R", goal="g",
guardrail=guardrail,
settings=AgentSettings(memory_enabled=False),
)
await agent.amessage("Hi")
guardrail_events = [
e for e in emitted_events
if type(e).__name__ == "NewAgentGuardrailPassedEvent"
]
assert len(guardrail_events) >= 1
assert guardrail_events[0].guardrail_type == "llm"
# ── Structured output tests (parsing) ──────────────────────
class TestStructuredOutputParsing:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_structured_output_from_json(self, mock_llm):
"""When LLM returns valid JSON, it should be parsed into response_model."""
from pydantic import BaseModel
class Result(BaseModel):
summary: str
confidence: float
json_response = json.dumps({"summary": "Test summary", "confidence": 0.95})
mock_llm.return_value = json_response
agent = NewAgent(
role="R", goal="g",
response_model=Result,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Analyze this")
assert result.content == json_response
assert result.metadata is not None
assert "structured_output" in result.metadata
assert result.metadata["structured_output"]["summary"] == "Test summary"
assert result.metadata["structured_output"]["confidence"] == 0.95
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_structured_output_from_markdown_json(self, mock_llm):
"""When LLM returns JSON wrapped in markdown fences, it should still parse."""
from pydantic import BaseModel
class Result(BaseModel):
summary: str
confidence: float
json_str = json.dumps({"summary": "Parsed from markdown", "confidence": 0.8})
markdown_response = f"```json\n{json_str}\n```"
mock_llm.return_value = markdown_response
agent = NewAgent(
role="R", goal="g",
response_model=Result,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Analyze this")
assert result.metadata is not None
assert result.metadata["structured_output"]["summary"] == "Parsed from markdown"
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_structured_output_llm_extraction_fallback(self, mock_llm):
"""When text is not JSON, it should ask the LLM to extract structured data."""
from pydantic import BaseModel
class Result(BaseModel):
summary: str
confidence: float
# First call: main agent response (not JSON).
# Second call: LLM extraction returns valid JSON.
mock_llm.side_effect = [
"The analysis shows high confidence in the results.",
json.dumps({"summary": "High confidence analysis", "confidence": 0.92}),
]
agent = NewAgent(
role="R", goal="g",
response_model=Result,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Analyze this")
assert result.content == "The analysis shows high confidence in the results."
assert result.metadata is not None
assert result.metadata["structured_output"]["summary"] == "High confidence analysis"
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_structured_output_none_when_no_model(self, mock_llm):
"""When response_model is not set, metadata should not contain structured_output."""
mock_llm.return_value = "Plain response."
agent = NewAgent(
role="R", goal="g",
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Hello")
assert result.metadata is None
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_structured_output_none_on_failure(self, mock_llm):
"""When both direct parse and LLM extraction fail, metadata should be None."""
from pydantic import BaseModel
class Result(BaseModel):
summary: str
confidence: float
# First call: main response (not JSON).
# Second call: LLM extraction also returns non-JSON.
mock_llm.side_effect = [
"Not JSON at all.",
"I cannot extract structured data from this.",
]
agent = NewAgent(
role="R", goal="g",
response_model=Result,
settings=AgentSettings(memory_enabled=False),
)
result = await agent.amessage("Hello")
assert result.content == "Not JSON at all."
# metadata should be None since structured parsing failed
assert result.metadata is None

View File

@@ -1,212 +0,0 @@
"""Integration-style tests for NewAgent, fully mocked (no real LLM calls).
All tests that previously required a real OpenAI API key now use
unittest.mock to simulate LLM responses, so the suite passes with
--block-network and without any API credentials.
"""
from __future__ import annotations
import json
import tempfile
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from pydantic import BaseModel
from crewai.new_agent import AgentSettings, Message, NewAgent
from crewai.new_agent.definition_parser import load_agent_from_definition
def _agent(**kwargs) -> NewAgent:
defaults = dict(
role="Assistant",
goal="Help users",
backstory="Helpful assistant",
llm="openai/gpt-4o-mini",
memory=False,
settings=AgentSettings(memory_enabled=False),
)
defaults.update(kwargs)
return NewAgent(**defaults)
# ---------------------------------------------------------------------------
# Helper: patch aget_llm_response to return a fixed string
# ---------------------------------------------------------------------------
_PATCH_LLM = "crewai.new_agent.executor.aget_llm_response"
class TestBasicConversation:
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_simple_message(self, mock_llm):
mock_llm.return_value = "4"
agent = _agent()
result = await agent.amessage("What is 2+2? Reply with just the number.")
assert "4" in result.content
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_token_counts_nonzero(self, mock_llm):
mock_llm.return_value = "hi"
agent = _agent()
result = await agent.amessage("Say hi in one word.")
# With mocking, token counts come from the LLM's _token_usage.
# They are 0 when fully mocked — just assert the field exists.
assert result.input_tokens is not None
assert result.output_tokens is not None
assert result.response_time_ms is not None
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_conversation_continuity(self, mock_llm):
mock_llm.side_effect = ["OK", "Zephyr"]
agent = _agent()
await agent.amessage("My name is Zephyr. Reply with just OK.")
result = await agent.amessage("What is my name? One word only.")
assert "Zephyr" in result.content
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_multi_turn_token_deltas(self, mock_llm):
mock_llm.side_effect = ["Hello!", "Goodbye!"]
agent = _agent()
r1 = await agent.amessage("Say hello.")
r2 = await agent.amessage("Say goodbye.")
# Both turns exist; token counts may be 0 under mocking but fields are present.
assert r1.input_tokens is not None
assert r2.input_tokens is not None
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_sync_message(self, mock_llm):
mock_llm.return_value = "9"
agent = _agent()
result = agent.message("What is 3*3? Reply with just the number.")
assert "9" in result.content
assert result.input_tokens is not None
class TestStructuredOutput:
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_response_model(self, mock_llm):
class MathResult(BaseModel):
answer: int
explanation: str
mock_llm.return_value = '{"answer": 56, "explanation": "7 times 8 equals 56."}'
agent = _agent(response_model=MathResult)
result = await agent.amessage("What is 7*8? Show answer and brief explanation.")
assert result.metadata is not None
assert "structured_output" in result.metadata
assert result.metadata["structured_output"]["answer"] == 56
class TestGuardrails:
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_code_guardrail_passes(self, mock_llm):
mock_llm.return_value = "Hi there!"
def check_length(text):
return len(text) < 500, "Response too long"
agent = _agent(guardrail=check_length)
result = await agent.amessage("Say hi in one sentence.")
assert len(result.content) < 500
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_code_guardrail_triggers_retry(self, mock_llm):
mock_llm.side_effect = ["No greeting here.", "Hello there!"]
call_count = 0
def must_contain_hello(text):
nonlocal call_count
call_count += 1
if "hello" in text.lower():
return True, ""
return False, "Response must contain the word 'hello'"
agent = _agent(guardrail=must_contain_hello)
result = await agent.amessage("Greet the user with the word 'hello'.")
assert result.input_tokens is not None
class TestJsonDefinition:
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_load_and_run(self, mock_llm):
mock_llm.return_value = "144"
defn = {
"role": "Math Tutor",
"goal": "Help with math",
"backstory": "Math teacher",
"llm": "openai/gpt-4o-mini",
"settings": {"memory": False},
}
with tempfile.NamedTemporaryFile(suffix=".json", mode="w", delete=False) as f:
json.dump(defn, f)
f.flush()
agent = load_agent_from_definition(f.name)
result = await agent.amessage("What is 12*12? Reply with just the number.")
assert "144" in result.content
class TestToolCalling:
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_tool_called_and_result_used(self, mock_llm):
from crewai.tools.base_tool import BaseTool
class AddTool(BaseTool):
name: str = "adder"
description: str = "Add two numbers. Input: two integers a and b."
def _run(self, a: int, b: int) -> str:
return str(int(a) + int(b))
# First call: LLM requests the tool; second call: LLM uses the result
tool_call_json = json.dumps(
{"name": "adder", "parameters": {"a": 17, "b": 25}}
)
mock_llm.side_effect = [tool_call_json, "The answer is 42."]
agent = _agent(
tools=[AddTool()],
role="Calculator",
goal="Use tools for math",
)
result = await agent.amessage("Use the adder tool to add 17 and 25.")
assert result.content is not None
assert "42" in result.content or result.content # mocked response
class TestProvenance:
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_explain_after_message(self, mock_llm):
mock_llm.return_value = "10"
agent = _agent()
await agent.amessage("What is 5+5?")
entries = agent.explain()
assert len(entries) >= 1
response_entries = [e for e in entries if e.action == "response"]
assert len(response_entries) == 1
assert "10" in response_entries[0].outcome
class TestModelInfo:
@pytest.mark.asyncio
@patch(_PATCH_LLM, new_callable=AsyncMock)
async def test_model_in_response(self, mock_llm):
mock_llm.return_value = "Hello!"
agent = _agent()
result = await agent.amessage("Hi")
assert result.model == "gpt-4o-mini"

View File

@@ -1,482 +0,0 @@
"""Tests for the NewAgent class."""
from __future__ import annotations
import asyncio
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from crewai.new_agent import (
AgentSettings,
AgentStatus,
ConversationalProvider,
Message,
NewAgent,
PromptLayer,
PromptStack,
ProvenanceEntry,
TokenUsage,
)
from crewai.new_agent.coworker_tools import DelegateToCoworkerTool, build_coworker_tools
from crewai.new_agent.provider import DirectProvider
# ── Model tests ──────────────────────────────────────────────
class TestMessage:
def test_defaults(self):
msg = Message(role="user", content="Hello")
assert msg.role == "user"
assert msg.content == "Hello"
assert msg.id
assert msg.timestamp
assert msg.model is None
assert msg.input_tokens is None
def test_agent_message(self):
msg = Message(
role="agent",
content="Hi there",
sender="Researcher",
model="gpt-4o",
input_tokens=100,
output_tokens=50,
response_time_ms=1200,
)
assert msg.sender == "Researcher"
assert msg.model == "gpt-4o"
assert msg.input_tokens == 100
class TestAgentSettings:
def test_defaults(self):
s = AgentSettings()
assert s.memory_enabled is True
assert s.reasoning_enabled is True
assert s.self_improving is True
assert s.dreaming_interval_hours == 24
assert s.planning_enabled is True
assert s.auto_plan is True
assert s.can_spawn_copies is False
assert s.max_spawn_depth == 1
assert s.provenance_enabled is True
assert s.provenance_detail == "standard"
assert s.narration_guard is False
assert s.max_history_messages is None
def test_custom(self):
s = AgentSettings(
memory_enabled=False,
dreaming_interval_hours=48,
max_history_messages=50,
)
assert s.memory_enabled is False
assert s.dreaming_interval_hours == 48
assert s.max_history_messages == 50
class TestAgentStatus:
def test_status(self):
status = AgentStatus(
state="using_tool",
detail="Searching the web…",
tool_name="search_web",
elapsed_ms=5000,
input_tokens=1200,
output_tokens=300,
)
assert status.state == "using_tool"
assert status.tool_name == "search_web"
assert status.elapsed_ms == 5000
class TestPromptStack:
def test_assemble(self):
stack = PromptStack()
stack.add("soul", "You are a researcher.", source="agent")
stack.add("tools", "Available tools: search", source="tools")
stack.add("empty", "", source="none")
result = stack.assemble()
assert "You are a researcher." in result
assert "Available tools: search" in result
assert result.count("\n\n") == 1
def test_empty(self):
stack = PromptStack()
assert stack.assemble() == ""
class TestProvenanceEntry:
def test_defaults(self):
entry = ProvenanceEntry(action="tool_call")
assert entry.action == "tool_call"
assert entry.id
assert entry.timestamp
assert entry.reasoning == ""
class TestTokenUsage:
def test_record(self):
usage = TokenUsage(
action="message",
input_tokens=500,
output_tokens=200,
model="gpt-4o",
)
assert usage.action == "message"
assert usage.input_tokens == 500
# ── Provider tests ───────────────────────────────────────────
class TestDirectProvider:
def test_protocol_compliance(self):
provider = DirectProvider()
assert isinstance(provider, ConversationalProvider)
@pytest.mark.asyncio
async def test_send_message(self):
provider = DirectProvider()
msg = Message(role="agent", content="Hello")
await provider.send_message(msg)
assert len(provider.get_history()) == 1
assert provider.get_history()[0].content == "Hello"
@pytest.mark.asyncio
async def test_send_status(self):
provider = DirectProvider()
status = AgentStatus(state="thinking", detail="Working…")
await provider.send_status(status)
assert provider._pending_status is not None
assert provider._pending_status.state == "thinking"
def test_reset_history(self):
provider = DirectProvider()
provider.save_history([Message(role="user", content="Hi")])
assert len(provider.get_history()) == 1
provider.reset_history()
assert len(provider.get_history()) == 0
# ── NewAgent construction tests ──────────────────────────────
class TestNewAgentConstruction:
def test_basic_creation(self):
agent = NewAgent(
role="Senior Researcher",
goal="Find information",
backstory="You are an expert researcher.",
)
assert agent.role == "Senior Researcher"
assert agent.goal == "Find information"
assert agent.id
assert agent._llm_instance is not None
def test_settings_defaults(self):
agent = NewAgent(
role="Writer",
goal="Write content",
)
assert agent.settings.memory_enabled is True
assert agent.settings.planning_enabled is True
def test_custom_settings(self):
agent = NewAgent(
role="Writer",
goal="Write content",
settings=AgentSettings(memory_enabled=False, max_history_messages=10),
)
assert agent.settings.memory_enabled is False
assert agent.settings.max_history_messages == 10
def test_prompt_stack_built(self):
agent = NewAgent(
role="Researcher",
goal="Find facts",
backstory="Expert.",
)
stack = agent._executor._build_prompt_stack()
assembled = stack.assemble()
assert "Researcher" in assembled
assert "Find facts" in assembled
assert "Expert." in assembled
def test_conversation_id_unique(self):
a1 = NewAgent(role="A", goal="g")
a2 = NewAgent(role="B", goal="g")
assert a1._conversation_id != a2._conversation_id
def test_reset_conversation(self):
agent = NewAgent(role="R", goal="g")
old_id = agent._conversation_id
agent.reset_conversation()
assert agent._conversation_id != old_id
assert len(agent.conversation_history) == 0
def test_usage_metrics_empty(self):
agent = NewAgent(role="R", goal="g")
metrics = agent.usage_metrics
assert metrics["total_tokens"] == 0
assert metrics["total_actions"] == 0
def test_explain_empty(self):
agent = NewAgent(role="R", goal="g")
assert agent.explain() == []
# ── CoWorker tools tests ─────────────────────────────────────
class TestCoworkerTools:
def test_build_tools(self):
writer = NewAgent(role="Writer", goal="Write")
tools = build_coworker_tools([writer])
assert len(tools) == 1
assert "delegate_to" in tools[0].name.lower()
def test_tool_description(self):
writer = NewAgent(role="Content Writer", goal="Draft articles")
tools = build_coworker_tools([writer])
assert "Content Writer" in tools[0].description
assert "Draft articles" in tools[0].description
def test_coworker_init(self):
writer = NewAgent(role="Writer", goal="Write")
agent = NewAgent(
role="Manager",
goal="Manage",
coworkers=[writer],
)
assert len(agent._resolved_coworkers) == 1
assert len(agent._coworker_tools) == 1
# ── Integration test with mocked LLM ────────────────────────
class TestNewAgentMessage:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_amessage_basic(self, mock_llm_response):
mock_llm_response.return_value = "The answer is 42."
agent = NewAgent(
role="Researcher",
goal="Answer questions",
backstory="Expert.",
)
response = await agent.amessage("What is the meaning of life?")
assert response.role == "agent"
assert response.content == "The answer is 42."
assert response.sender == "Researcher"
assert response.conversation_id == agent._conversation_id
assert len(agent.conversation_history) == 2
assert agent.conversation_history[0].role == "user"
assert agent.conversation_history[1].role == "agent"
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_conversation_continuity(self, mock_llm_response):
mock_llm_response.side_effect = ["First response.", "Second response with context."]
agent = NewAgent(role="R", goal="g")
r1 = await agent.amessage("Message 1")
assert r1.content == "First response."
r2 = await agent.amessage("Message 2")
assert r2.content == "Second response with context."
assert len(agent.conversation_history) == 4
assert agent.conversation_history[0].content == "Message 1"
assert agent.conversation_history[2].content == "Message 2"
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_provenance_logged(self, mock_llm_response):
mock_llm_response.return_value = "Answer."
agent = NewAgent(role="R", goal="g")
await agent.amessage("Test")
entries = agent.explain()
assert len(entries) == 1
assert entries[0].action == "response"
assert entries[0].inputs["user_message"] == "Test"
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_token_tracking(self, mock_llm_response):
mock_llm_response.return_value = "Response."
agent = NewAgent(role="R", goal="g")
response = await agent.amessage("Hello")
assert response.response_time_ms is not None
assert response.response_time_ms >= 0
assert agent.usage_metrics["total_actions"] == 1
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_callbacks(self, mock_llm_response):
mock_llm_response.return_value = "Done."
on_message_called = []
on_complete_called = []
agent = NewAgent(
role="R",
goal="g",
on_message=lambda m: on_message_called.append(m),
on_complete=lambda m: on_complete_called.append(m),
)
await agent.amessage("Hi")
assert len(on_message_called) == 1
assert on_message_called[0].content == "Hi"
assert len(on_complete_called) == 1
assert on_complete_called[0].content == "Done."
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_max_history_messages(self, mock_llm_response):
mock_llm_response.return_value = "Response."
agent = NewAgent(
role="R",
goal="g",
settings=AgentSettings(max_history_messages=2),
)
for i in range(5):
await agent.amessage(f"Message {i}")
assert len(agent.conversation_history) == 10
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_prompt_stack_inspectable(self, mock_llm_response):
mock_llm_response.return_value = "OK."
agent = NewAgent(role="Analyst", goal="Analyze data", backstory="Expert analyst.")
await agent.amessage("Analyze this")
stack = agent.last_prompt_stack
assert stack is not None
assembled = stack.assemble()
assert "Analyst" in assembled
assert "Analyze data" in assembled
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_summarization_preserves_current_user_message(self, mock_llm_response):
"""When proactive summarization fires, the current user message must
survive — otherwise the agent never sees what the user actually asked."""
mock_llm_response.return_value = "Done."
agent = NewAgent(
role="R",
goal="g",
settings=AgentSettings(respect_context_window=True),
)
executor = agent._executor
# Build messages that simulate: system + history + current user
from crewai.utilities.agent_utils import format_message_for_llm
llm_messages = [
format_message_for_llm("System prompt", role="system"),
format_message_for_llm("Old user msg", role="user"),
format_message_for_llm("Old assistant msg", role="assistant"),
format_message_for_llm("Another old msg", role="user"),
format_message_for_llm("Another reply", role="assistant"),
format_message_for_llm("What is the weather?", role="user"), # current
]
# Mock the LLM to have a tiny context window so summarization triggers
mock_llm = MagicMock()
mock_llm.get_context_window_size.return_value = 100
mock_llm.call.return_value = "<summary>Prior context summary</summary>"
agent._llm_instance = mock_llm
executor._proactive_summarize_messages(llm_messages, [])
# The last user message must still be present
user_messages = [m for m in llm_messages if m.get("role") == "user"]
last_user = user_messages[-1] if user_messages else None
assert last_user is not None
assert "What is the weather?" in str(last_user.get("content", ""))
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_plan_injection_has_anti_echo(self, mock_llm_response):
"""Plan text added to prompt stack should instruct the LLM not to echo it."""
mock_llm_response.return_value = "Result of the work."
agent = NewAgent(role="Coder", goal="Write code", backstory="Expert.")
# Simulate what happens when planning returns steps
executor = agent._executor
executor.prompt_stack = executor._build_prompt_stack("Write a function")
# Manually add a plan layer like the planning engine would
steps = "\n".join(f"{i + 1}. Step {i + 1}" for i in range(3))
plan_text = (
f"Internal execution plan (do NOT include in your response):\n"
f"{steps}\n\n"
f"Execute these steps using your tools. "
f"Report results, not the plan itself."
)
executor.prompt_stack.add("plan", plan_text, source="planning_engine")
assembled = executor.prompt_stack.assemble()
assert "do NOT include in your response" in assembled
assert "Report results, not the plan itself" in assembled
# ── Delegation tests ─────────────────────────────────────────
class TestDelegation:
@patch("crewai.new_agent.executor.aget_llm_response")
@pytest.mark.asyncio
async def test_sync_delegation(self, mock_llm_response):
mock_llm_response.return_value = "Draft article about AI."
writer = NewAgent(
role="Writer",
goal="Write articles",
settings={"planning_enabled": False},
)
tool = DelegateToCoworkerTool(coworker=writer)
result = tool._run(message="Write an article about AI")
assert "Draft article about AI." in result
# ── Event types tests ────────────────────────────────────────
class TestEvents:
def test_event_creation(self):
from crewai.new_agent.events import (
NewAgentMessageReceivedEvent,
NewAgentMessageSentEvent,
NewAgentToolUsageStartedEvent,
)
evt = NewAgentMessageReceivedEvent(
conversation_id="conv-1",
new_agent_id="agent-1",
message_length=42,
)
assert evt.type == "new_agent_message_received"
assert evt.message_length == 42
evt2 = NewAgentToolUsageStartedEvent(
new_agent_id="a1",
tool_name="search_web",
)
assert evt2.type == "new_agent_tool_usage_started"
assert evt2.tool_name == "search_web"

View File

@@ -1,488 +0,0 @@
"""Tests for the SkillBuilder — auto-generated SKILL.md suggestion system."""
from __future__ import annotations
import json
from pathlib import Path
from types import SimpleNamespace
from typing import Any
from unittest.mock import MagicMock, patch
import pytest
# ── Helpers ──────────────────────────────────────────────────────
def _make_agent(tmp_path: Path, role: str = "analyst", **overrides: Any) -> MagicMock:
"""Create a mock NewAgent with the fields SkillBuilder needs."""
agent = MagicMock()
agent.id = "test-agent-123"
agent.role = role
agent.settings = MagicMock()
agent.settings.can_build_skills = overrides.get("can_build_skills", True)
agent._llm_instance = None
return agent
def _make_builder(tmp_path: Path, **agent_overrides: Any) -> Any:
from crewai.new_agent.skill_builder import SkillBuilder
agent = _make_agent(tmp_path, **agent_overrides)
with patch.object(SkillBuilder, "_load_existing_skills"):
builder = SkillBuilder(agent)
builder._skills_dir = tmp_path / "skills"
return builder
# ===========================================================================
# Unit Tests: Suggest / Confirm / Reject
# ===========================================================================
class TestSkillBuilderSuggest:
"""Tests for suggest_skill and pending management."""
def test_suggest_creates_pending(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
result = builder.suggest_skill(
name="format-report",
description="Format a weekly report",
instructions="## Steps\n1. Gather data\n2. Format",
source="explicit-instruction",
)
assert result["name"] == "format-report"
assert result["status"] == "pending"
assert len(builder.pending_suggestions) == 1
def test_suggest_disabled(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path, can_build_skills=False)
result = builder.suggest_skill(
name="test",
description="test",
instructions="test",
source="test",
)
assert result == {}
assert len(builder.pending_suggestions) == 0
def test_suggest_slugifies_name(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
result = builder.suggest_skill(
name="My Cool Skill!",
description="test",
instructions="test",
source="test",
)
assert result["name"] == "my-cool-skill"
def test_suggest_truncates_description(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
result = builder.suggest_skill(
name="test",
description="x" * 300,
instructions="test",
source="test",
)
assert len(result["description"]) == 200
def test_suggest_deduplicates_name(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
# Add a mock active skill with the same name
mock_skill = MagicMock()
mock_skill.name = "my-skill"
builder._active_skills.append(mock_skill)
result = builder.suggest_skill(
name="my-skill",
description="test",
instructions="test",
source="test",
)
assert result["name"] != "my-skill"
def test_suggest_emits_event(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
with patch("crewai.new_agent.skill_builder.crewai_event_bus", create=True) as mock_bus:
with patch("crewai.new_agent.skill_builder.NewAgentSkillSuggestedEvent", create=True):
builder.suggest_skill(
name="test",
description="test",
instructions="test",
source="explicit-instruction",
)
class TestSkillBuilderConfirm:
"""Tests for confirm_suggestion and disk write."""
def test_confirm_writes_skill_md(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.suggest_skill(
name="my-skill",
description="A test skill",
instructions="## Steps\n1. Do thing A\n2. Do thing B",
source="explicit-instruction",
)
with patch("crewai.skills.parser.load_skill_metadata") as mock_load, \
patch("crewai.skills.parser.load_skill_instructions") as mock_instruct:
mock_skill = MagicMock()
mock_skill.name = "my-skill"
mock_load.return_value = mock_skill
mock_instruct.return_value = mock_skill
result = builder.confirm_suggestion(0)
assert result is True
assert len(builder.pending_suggestions) == 0
assert len(builder._active_skills) == 1
skill_md = tmp_path / "skills" / "my-skill" / "SKILL.md"
assert skill_md.exists()
content = skill_md.read_text()
assert "name: my-skill" in content
assert "description: \"A test skill\"" in content
assert "Do thing A" in content
def test_confirm_invalid_index(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
assert builder.confirm_suggestion(0) is False
assert builder.confirm_suggestion(-1) is False
def test_confirm_already_confirmed(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.suggest_skill(
name="test", description="t", instructions="t", source="t"
)
builder._pending_suggestions[0]["status"] = "confirmed"
assert builder.confirm_suggestion(0) is False
class TestSkillBuilderReject:
"""Tests for reject_suggestion."""
def test_reject_removes_from_pending(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.suggest_skill(
name="unwanted", description="t", instructions="t", source="t"
)
assert len(builder.pending_suggestions) == 1
builder.reject_suggestion(0)
assert len(builder.pending_suggestions) == 0
def test_reject_invalid_index(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.reject_suggestion(5) # no crash
class TestSkillBuilderUpdate:
"""Tests for update_suggestion (edit flow)."""
def test_update_changes_instructions(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.suggest_skill(
name="test", description="t", instructions="original", source="t"
)
assert builder.update_suggestion(0, "edited instructions")
assert builder.pending_suggestions[0]["instructions"] == "edited instructions"
def test_update_invalid_index(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
assert builder.update_suggestion(0, "nope") is False
# ===========================================================================
# Unit Tests: Suggestion from instruction / workflow
# ===========================================================================
class TestSuggestFromInstruction:
"""Tests for suggest_from_instruction (with mocked LLM)."""
def test_fallback_when_no_llm(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
result = builder.suggest_from_instruction(
"Always format reports with summary section first"
)
assert result["source"] == "explicit-instruction"
assert result["status"] == "pending"
assert "format reports" in result["instructions"].lower() or "summary" in result["instructions"].lower()
def test_uses_llm_when_available(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.agent._llm_instance = MagicMock()
mock_response = json.dumps({
"name": "format-reports",
"description": "Format reports with summary first",
"instructions": "## Steps\n1. Add summary\n2. Add details",
})
with patch("crewai.utilities.agent_utils.get_llm_response", return_value=mock_response):
result = builder.suggest_from_instruction(
"Always format reports with summary section first"
)
assert result["name"] == "format-reports"
assert "summary" in result["instructions"].lower()
class TestSuggestFromWorkflow:
"""Tests for suggest_from_workflow."""
def test_workflow_to_skill(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
workflow = {
"tools": ["search_web", "scrape_url", "summarize"],
"count": 7,
}
result = builder.suggest_from_workflow(workflow)
assert result["source"] == "workflow-detection"
assert result["status"] == "pending"
assert "search_web" in result["instructions"] or "search-web" in result["name"]
# ===========================================================================
# Unit Tests: Format skills context
# ===========================================================================
class TestFormatSkillsContext:
"""Tests for format_skills_context (prompt injection)."""
def test_empty_when_no_active_skills(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
assert builder.format_skills_context() == ""
def test_formats_active_skills(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
mock_skill = MagicMock()
mock_skill.name = "test-skill"
mock_skill.description = "A test skill"
builder._active_skills.append(mock_skill)
with patch("crewai.skills.loader.format_skill_context", return_value="## Skill: test-skill\nA test skill"):
result = builder.format_skills_context()
assert "test-skill" in result
# ===========================================================================
# Unit Tests: Load existing skills from disk
# ===========================================================================
class TestLoadExistingSkills:
"""Tests for _load_existing_skills on init."""
def test_loads_skills_from_directory(self, tmp_path: Path) -> None:
from crewai.new_agent.skill_builder import SkillBuilder
# Create a skills directory with a SKILL.md
skill_dir = tmp_path / "skills" / "my-skill"
skill_dir.mkdir(parents=True)
(skill_dir / "SKILL.md").write_text(
"---\nname: my-skill\ndescription: A test\n---\n\n## Instructions\nDo stuff"
)
agent = _make_agent(tmp_path)
builder = SkillBuilder.__new__(SkillBuilder)
builder.agent = agent
builder._pending_suggestions = []
builder._active_skills = []
builder._skills_dir = tmp_path / "skills"
builder._load_existing_skills()
assert len(builder._active_skills) == 1
assert builder._active_skills[0].name == "my-skill"
def test_no_crash_when_dir_missing(self, tmp_path: Path) -> None:
from crewai.new_agent.skill_builder import SkillBuilder
agent = _make_agent(tmp_path)
builder = SkillBuilder.__new__(SkillBuilder)
builder.agent = agent
builder._pending_suggestions = []
builder._active_skills = []
builder._skills_dir = tmp_path / "nonexistent"
builder._load_existing_skills()
assert builder._active_skills == []
# ===========================================================================
# Integration: Events
# ===========================================================================
class TestSkillBuilderEvents:
"""Verify events are emitted correctly."""
def test_suggested_event_fields(self) -> None:
from crewai.new_agent.events import NewAgentSkillSuggestedEvent
event = NewAgentSkillSuggestedEvent(
new_agent_id="abc",
skill_name="my-skill",
source_type="explicit-instruction",
)
assert event.type == "new_agent_skill_suggested"
assert event.skill_name == "my-skill"
def test_confirmed_event_fields(self) -> None:
from crewai.new_agent.events import NewAgentSkillConfirmedEvent
event = NewAgentSkillConfirmedEvent(
new_agent_id="abc",
skill_name="my-skill",
)
assert event.type == "new_agent_skill_confirmed"
def test_rejected_event_fields(self) -> None:
from crewai.new_agent.events import NewAgentSkillRejectedEvent
event = NewAgentSkillRejectedEvent(
new_agent_id="abc",
skill_name="my-skill",
)
assert event.type == "new_agent_skill_rejected"
# ===========================================================================
# Integration: Settings
# ===========================================================================
class TestSkillBuilderSettings:
"""Verify can_build_skills setting works."""
def test_setting_default_true(self) -> None:
from crewai.new_agent.models import AgentSettings
settings = AgentSettings()
assert settings.can_build_skills is True
def test_setting_can_be_disabled(self) -> None:
from crewai.new_agent.models import AgentSettings
settings = AgentSettings(can_build_skills=False)
assert settings.can_build_skills is False
# ===========================================================================
# Integration: PromptStack skills layer
# ===========================================================================
class TestPromptStackSkillsLayer:
"""Verify skills layer is added to PromptStack."""
def test_skills_layer_included(self, tmp_path: Path) -> None:
from crewai.new_agent.executor import ConversationalAgentExecutor
from crewai.new_agent.skill_builder import SkillBuilder
from crewai.new_agent.models import PromptStack
agent = MagicMock()
agent.role = "analyst"
agent.goal = "analyze data"
agent.backstory = "expert"
agent._resolved_tools = []
agent._coworker_tools = []
agent._memory_instance = None
agent.knowledge = None
agent.knowledge_sources = []
agent._active_skills = []
mock_builder = MagicMock(spec=SkillBuilder)
mock_builder.format_skills_context.return_value = "## Skill: my-skill\nDo things"
agent._skill_builder = mock_builder
executor = ConversationalAgentExecutor(agent=agent)
with patch.object(executor, "_recall_memory", return_value=""), \
patch.object(executor, "_query_knowledge", return_value=""):
stack = executor._build_prompt_stack("test query")
layer_names = [layer.name for layer in stack.layers]
assert "skills" in layer_names
skills_layer = next(l for l in stack.layers if l.name == "skills")
assert "my-skill" in skills_layer.content
# ===========================================================================
# Conversational suggestion response
# ===========================================================================
class TestSuggestionResponse:
"""Tests for conversational approve/reject flow."""
def test_handle_response_confirm(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.suggest_skill(
name="my-skill", description="test", instructions="do stuff", source="test"
)
with patch("crewai.skills.parser.load_skill_metadata") as mock_load, \
patch("crewai.skills.parser.load_skill_instructions") as mock_instruct:
mock_skill = MagicMock()
mock_skill.name = "my-skill"
mock_load.return_value = mock_skill
mock_instruct.return_value = mock_skill
result = builder.handle_suggestion_response("yes, save it")
assert result is not None
assert result["action"] == "confirmed"
assert result["name"] == "my-skill"
def test_handle_response_reject(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.suggest_skill(
name="my-skill", description="test", instructions="do stuff", source="test"
)
result = builder.handle_suggestion_response("no thanks")
assert result is not None
assert result["action"] == "rejected"
assert len(builder.pending_suggestions) == 0
def test_handle_response_unrelated(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
builder.suggest_skill(
name="my-skill", description="test", instructions="do stuff", source="test"
)
result = builder.handle_suggestion_response("what's the weather like?")
assert result is not None
assert result["action"] == "ignored"
assert len(builder.pending_suggestions) == 1
def test_handle_response_no_pending(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
result = builder.handle_suggestion_response("yes")
assert result is None
class TestBuildSuggestionMessage:
"""Tests for build_suggestion_message (conversational text + actions)."""
def test_message_contains_name_and_desc(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
suggestion = builder.suggest_skill(
name="format-report",
description="Format weekly reports with summary",
instructions="## Steps\n1. Add summary\n2. Add details",
source="test",
)
text, actions = builder.build_suggestion_message(suggestion)
assert "format-report" in text
assert "Format weekly reports" in text
assert "Would you like me to save" in text
def test_actions_contain_confirm_reject(self, tmp_path: Path) -> None:
builder = _make_builder(tmp_path)
suggestion = builder.suggest_skill(
name="test-skill", description="test", instructions="test", source="test"
)
text, actions = builder.build_suggestion_message(suggestion)
action_types = {a["action_type"] for a in actions}
assert "suggestion_confirm" in action_types
assert "suggestion_reject" in action_types
def test_message_action_model(self) -> None:
from crewai.new_agent.models import MessageAction
action = MessageAction(
action_id="test-1",
label="Approve",
action_type="suggestion_confirm",
payload={"type": "skill", "name": "test"},
)
assert action.action_id == "test-1"
assert action.payload["type"] == "skill"

View File

@@ -1,448 +0,0 @@
"""Tests for the 6 TUI issues fixed in Phase 2.
Issue 1: Organic mode routing — only most relevant agent responds
Issue 2: Scheduled/recurring tasks via ScheduleTaskTool
Issue 3: Token counter updates in ThinkingIndicator
Issue 4: CLI memory listing uses correct API
Issue 5: TUI /memory uses correct API
Issue 6: Event bus pairing — MemorySaveFailedEvent on shutdown
"""
from __future__ import annotations
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path
from types import SimpleNamespace
from typing import Any
from unittest.mock import MagicMock, patch
import pytest
# ── Helpers ──────────────────────────────────────────────────────
def _make_tui(
tmp_path: Path,
agents: list[dict[str, Any]] | None = None,
) -> Any:
from crewai_cli.agent_tui import AgentTUI
agents_dir = tmp_path / "agents"
agents_dir.mkdir()
for defn in (agents or []):
name = defn.get("name", "unnamed")
(agents_dir / f"{name}.yaml").write_text(
json.dumps(defn)
)
tui = AgentTUI.__new__(AgentTUI)
tui._agents_dir = agents_dir
tui._config = {}
tui._agent_defs = agents or []
tui._agent_names = [d.get("name", d.get("role", "unnamed")) for d in (agents or [])]
tui._agent_instances = {}
tui._current_room = "common"
tui._chat_histories = {}
tui._processing = False
tui._last_active_agent = None
tui._engagement_mode = "organic"
tui._scheduler = None
return tui
# ===========================================================================
# Issue 1: Organic mode routing — _score_relevance
# ===========================================================================
class TestIssue1OrgRelRouting:
"""Only the most relevant agent should respond in organic mode."""
def test_top_agent_scored_highest(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "chef", "role": "Chef", "goal": "Cook meals", "backstory": "Italian cuisine expert"},
{"name": "driver", "role": "Driver", "goal": "Transport goods", "backstory": "Logistics"},
{"name": "writer", "role": "Writer", "goal": "Write articles", "backstory": "Journalist"},
]
scored = tui._score_relevance("cook an Italian meal", agents)
assert len(scored) >= 1
assert scored[0][0]["name"] == "chef"
def test_no_match_returns_empty(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "a", "role": "alpha", "goal": "one", "backstory": ""},
{"name": "b", "role": "beta", "goal": "two", "backstory": ""},
]
scored = tui._score_relevance("xyzzy nonsense", agents)
assert scored == []
def test_tie_threshold(self, tmp_path: Path) -> None:
"""Two agents that score within 80% should both be included."""
tui = _make_tui(tmp_path)
agents = [
{"name": "dev1", "role": "Python developer", "goal": "Write Python code", "backstory": ""},
{"name": "dev2", "role": "Python engineer", "goal": "Build Python apps", "backstory": ""},
{"name": "chef", "role": "Chef", "goal": "Cook food", "backstory": ""},
]
scored = tui._score_relevance("python", agents)
assert len(scored) == 2
# Both devs match python, chef doesn't
names = {a["name"] for a, _ in scored}
assert names == {"dev1", "dev2"}
def test_sorted_by_score_descending(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path)
agents = [
{"name": "weak", "role": "assistant", "goal": "help", "backstory": ""},
{"name": "strong", "role": "data scientist", "goal": "analyze data trends", "backstory": "data analytics"},
]
scored = tui._score_relevance("analyze data", agents)
if len(scored) > 1:
assert scored[0][1] >= scored[1][1]
# ===========================================================================
# Issue 2: Scheduler
# ===========================================================================
class TestIssue2Scheduler:
"""Test TaskScheduler and ScheduleTaskTool."""
def test_parse_relative_time(self) -> None:
from crewai.new_agent.scheduler import parse_schedule_time
now = datetime.now(timezone.utc)
dt = parse_schedule_time("in 10 minutes")
assert dt is not None
diff = (dt - now).total_seconds()
assert 580 < diff < 620
def test_parse_iso_time(self) -> None:
from crewai.new_agent.scheduler import parse_schedule_time
dt = parse_schedule_time("2026-12-25T10:00:00Z")
assert dt is not None
assert dt.year == 2026
assert dt.month == 12
def test_parse_invalid_returns_none(self) -> None:
from crewai.new_agent.scheduler import parse_schedule_time
assert parse_schedule_time("next tuesday maybe") is None
def test_scheduler_add_and_list(self) -> None:
from crewai.new_agent.scheduler import ScheduledTask, TaskScheduler
TaskScheduler.reset()
scheduler = TaskScheduler()
task = ScheduledTask(
agent_name="test",
description="do something",
next_run_at=datetime.now(timezone.utc).isoformat(),
)
scheduler.add(task)
assert len(scheduler.list_tasks()) == 1
TaskScheduler.reset()
def test_scheduler_cancel(self) -> None:
from crewai.new_agent.scheduler import ScheduledTask, TaskScheduler
TaskScheduler.reset()
scheduler = TaskScheduler()
task = ScheduledTask(
agent_name="test",
description="do it",
next_run_at=(datetime.now(timezone.utc) + timedelta(hours=1)).isoformat(),
)
scheduler.add(task)
assert scheduler.cancel(task.id) is True
assert task.status == "cancelled"
assert len(scheduler.list_tasks()) == 0
TaskScheduler.reset()
def test_tick_fires_due_task(self) -> None:
from crewai.new_agent.scheduler import ScheduledTask, TaskScheduler
TaskScheduler.reset()
scheduler = TaskScheduler()
task = ScheduledTask(
agent_name="agent1",
description="check weather",
next_run_at=(datetime.now(timezone.utc) - timedelta(seconds=5)).isoformat(),
)
scheduler.add(task)
results: list[str] = []
scheduler.set_callback(lambda t: results.append(t.description))
scheduler._tick()
assert results == ["check weather"]
assert task.status == "completed"
TaskScheduler.reset()
def test_recurring_task_reschedules(self) -> None:
from crewai.new_agent.scheduler import ScheduledTask, TaskScheduler
TaskScheduler.reset()
scheduler = TaskScheduler()
task = ScheduledTask(
agent_name="agent1",
description="recurring check",
schedule_type="recurring",
interval_seconds=3600,
next_run_at=(datetime.now(timezone.utc) - timedelta(seconds=5)).isoformat(),
)
scheduler.add(task)
scheduler.set_callback(lambda t: "ok")
scheduler._tick()
assert task.status == "pending"
assert task.next_run_at > datetime.now(timezone.utc).isoformat()
TaskScheduler.reset()
def test_schedule_task_tool(self) -> None:
from crewai.new_agent.scheduler import ScheduleTaskTool, TaskScheduler
TaskScheduler.reset()
tool = ScheduleTaskTool(agent_name="myagent")
result = tool._run(description="check logs", when="in 30 minutes")
assert "Scheduled task" in result
assert "check logs" in result
scheduler = TaskScheduler()
tasks = scheduler.list_tasks()
assert len(tasks) == 1
assert tasks[0].agent_name == "myagent"
TaskScheduler.reset()
def test_schedule_task_tool_invalid_time(self) -> None:
from crewai.new_agent.scheduler import ScheduleTaskTool, TaskScheduler
TaskScheduler.reset()
tool = ScheduleTaskTool(agent_name="myagent")
result = tool._run(description="foo", when="next tuesday maybe")
assert "Could not parse" in result
TaskScheduler.reset()
def test_tui_tasks_command_empty(self, tmp_path: Path) -> None:
from crewai.new_agent.scheduler import TaskScheduler
TaskScheduler.reset()
tui = _make_tui(tmp_path)
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._handle_tasks_command(["/tasks"])
assert any("No scheduled tasks" in m for m in messages)
TaskScheduler.reset()
def test_tui_tasks_command_shows_tasks(self, tmp_path: Path) -> None:
from crewai.new_agent.scheduler import ScheduledTask, TaskScheduler
TaskScheduler.reset()
scheduler = TaskScheduler()
scheduler.add(ScheduledTask(
agent_name="chef",
description="prepare dinner",
next_run_at=(datetime.now(timezone.utc) + timedelta(hours=1)).isoformat(),
))
tui = _make_tui(tmp_path)
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._handle_tasks_command(["/tasks"])
output = messages[0]
assert "Scheduled Tasks" in output
assert "prepare dinner" in output
assert "chef" in output
TaskScheduler.reset()
def test_tui_tasks_cancel(self, tmp_path: Path) -> None:
from crewai.new_agent.scheduler import ScheduledTask, TaskScheduler
TaskScheduler.reset()
scheduler = TaskScheduler()
task = scheduler.add(ScheduledTask(
agent_name="test",
description="cancel me",
next_run_at=(datetime.now(timezone.utc) + timedelta(hours=1)).isoformat(),
))
tui = _make_tui(tmp_path)
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._handle_tasks_command(["/tasks", "cancel", task.id])
assert any("cancelled" in m for m in messages)
TaskScheduler.reset()
# ===========================================================================
# Issue 3: Token counter in ThinkingIndicator
# ===========================================================================
class TestIssue3TokenCounter:
"""Status updates should propagate token counts to ThinkingIndicator."""
def test_handle_status_update_with_tokens(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import AgentTUI, ThinkingIndicator
tui = _make_tui(tmp_path, agents=[{"name": "a", "role": "a", "goal": "g"}])
indicator = ThinkingIndicator("test-agent")
indicator._steps = []
indicator._tokens = ""
indicator.update = MagicMock()
mock_scroll = MagicMock()
mock_scroll.children = [indicator]
with patch.object(tui, "query_one", return_value=mock_scroll):
event = SimpleNamespace(
state="analyzing",
detail="Analyzing your request",
input_tokens=1234,
output_tokens=567,
)
tui._handle_status_update(None, event)
assert indicator._current_status == "Analyzing your request"
assert "1,234" in indicator._tokens
assert "567" in indicator._tokens
def test_handle_status_update_no_tokens(self, tmp_path: Path) -> None:
from crewai_cli.agent_tui import AgentTUI, ThinkingIndicator
tui = _make_tui(tmp_path)
indicator = ThinkingIndicator("test-agent")
indicator._steps = []
indicator._tokens = ""
indicator.update = MagicMock()
mock_scroll = MagicMock()
mock_scroll.children = [indicator]
with patch.object(tui, "query_one", return_value=mock_scroll):
event = SimpleNamespace(
state="thinking",
detail=None,
input_tokens=0,
output_tokens=0,
)
tui._handle_status_update(None, event)
assert indicator._current_status == "thinking"
def test_status_event_has_token_fields(self) -> None:
from crewai.new_agent.events import NewAgentStatusUpdateEvent
event = NewAgentStatusUpdateEvent(
state="analyzing",
input_tokens=100,
output_tokens=50,
elapsed_ms=1500,
)
assert event.input_tokens == 100
assert event.output_tokens == 50
assert event.elapsed_ms == 1500
# ===========================================================================
# Issue 4+5: Memory API — .recall() and .list_records()
# ===========================================================================
class TestIssue4and5MemoryAPI:
"""TUI and CLI should use recall/list_records, not search."""
def test_show_memory_panel_uses_list_records(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path, agents=[
{"name": "a", "role": "agent", "goal": "g"}
])
agent = MagicMock()
agent.role = "agent"
agent._memory_instance = MagicMock()
agent._memory_instance.list_records.return_value = [
SimpleNamespace(
content="Test memory",
metadata={"type": "raw"},
),
]
tui._agent_instances["a"] = agent
tui._current_room = "a"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._show_memory_panel()
agent._memory_instance.list_records.assert_called_once()
assert "Test memory" in messages[0]
def test_search_memory_uses_recall(self, tmp_path: Path) -> None:
tui = _make_tui(tmp_path, agents=[
{"name": "a", "role": "agent", "goal": "g"}
])
agent = MagicMock()
agent.role = "agent"
agent._memory_instance = MagicMock()
agent._memory_instance.recall.return_value = [
SimpleNamespace(
content="Matched memory",
metadata={"type": "knowledge"},
),
]
tui._agent_instances["a"] = agent
tui._current_room = "a"
messages: list[str] = []
tui._mount_sys = lambda text: messages.append(text)
tui._search_memory("test query")
agent._memory_instance.recall.assert_called_once()
assert "Matched memory" in messages[0]
# ===========================================================================
# Issue 6: Event bus pairing — MemorySaveFailedEvent
# ===========================================================================
class TestIssue6EventPairing:
"""_background_encode_batch should emit MemorySaveFailedEvent on RuntimeError."""
def test_background_encode_emits_failed_on_runtime_error(self) -> None:
from crewai.memory.unified_memory import Memory
mem = MagicMock(spec=Memory)
mem._encode_batch = MagicMock(
side_effect=RuntimeError("cannot schedule new futures after shutdown")
)
# Call the real method, binding self to our mock
emitted: list[Any] = []
with patch("crewai.memory.unified_memory.crewai_event_bus") as mock_bus:
mock_bus.emit.side_effect = lambda s, e: emitted.append(e)
Memory._background_encode_batch(
mem,
contents=["test content"],
scope=None,
categories=None,
metadata={"scope": "test"},
importance=None,
source=None,
private=False,
agent_role=None,
root_scope=None,
)
event_types = [type(e).__name__ for e in emitted]
assert "MemorySaveStartedEvent" in event_types
assert "MemorySaveFailedEvent" in event_types
failed = [e for e in emitted if type(e).__name__ == "MemorySaveFailedEvent"]
assert len(failed) == 1
assert "shutdown" in failed[0].error
# Cleanup any persisted scheduler state after tests
@pytest.fixture(autouse=True)
def _cleanup_scheduler_file():
yield
p = Path.home() / ".crewai" / "scheduled_tasks.json"
if p.exists():
try:
p.unlink()
except Exception:
pass

View File

@@ -3901,7 +3901,7 @@ def test_crew_testing_function(researcher):
assert isinstance(received_events[1], CrewTestCompletedEvent)
@pytest.mark.vcr(record_mode="none")
@pytest.mark.vcr()
def test_hierarchical_verbose_manager_agent(researcher, writer):
task = Task(
description="Come up with a list of 5 interesting ideas to explore for an article, then write one amazing paragraph highlight for each idea that showcases how good an article about this topic could be. Return the list of ideas with their paragraph and your notes.",

View File

@@ -89,7 +89,7 @@ class TestStreamingCrewIntegration:
result = streaming.result
assert result is not None
@pytest.mark.vcr(record_mode="none")
@pytest.mark.vcr()
def test_streaming_properties_from_docs(
self, researcher: Agent, simple_task: Task
) -> None:

View File

@@ -77,16 +77,16 @@ def test_create_llm_from_env_with_unaccepted_attributes() -> None:
with patch.dict(
os.environ,
{
"OPENAI_MODEL_NAME": "gpt-3.5-turbo",
"OPENAI_API_KEY": "fake-key",
"AWS_ACCESS_KEY_ID": "fake-access-key",
"AWS_SECRET_ACCESS_KEY": "fake-secret-key",
"AWS_DEFAULT_REGION": "us-west-2",
},
clear=True,
):
llm = create_llm(llm_value=None)
assert isinstance(llm, BaseLLM)
assert llm.model == DEFAULT_LLM_MODEL
assert llm.model == "gpt-3.5-turbo"
assert not hasattr(llm, "AWS_ACCESS_KEY_ID")
assert not hasattr(llm, "AWS_SECRET_ACCESS_KEY")
assert not hasattr(llm, "AWS_DEFAULT_REGION")

View File

@@ -115,15 +115,6 @@ ignore-decorators = ["typing.overload"]
"lib/cli/tests/**/*.py" = ["S101", "RET504", "S105", "S106"] # Allow assert statements in tests
"lib/crewai-core/tests/**/*.py" = ["S101", "RET504", "S105", "S106"] # Allow assert statements in tests
"lib/devtools/tests/**/*.py" = ["S101"]
# new_agent module — telemetry/resilience patterns require try-except-pass
"lib/crewai/src/crewai/new_agent/**/*.py" = ["S110", "PERF203", "RUF006", "RUF012", "PERF401", "PERF402", "PERF403", "B905", "B007", "F841", "RET504", "N806", "RUF001", "S603"]
"lib/crewai/src/crewai/memory/unified_memory.py" = ["S110"]
# CLI agent TUI and benchmark — UI/subprocess patterns
"lib/cli/src/crewai_cli/agent_tui.py" = ["S110", "PERF203", "PERF401", "PERF402", "RUF006", "RUF012", "S603", "T201"]
"lib/cli/src/crewai_cli/benchmark.py" = ["S110", "PERF203", "E702", "B905", "RET504", "ASYNC240"]
"lib/cli/src/crewai_cli/cli.py" = ["S110", "PERF401", "B905", "N806", "N814"]
"lib/cli/src/crewai_cli/run_crew.py" = ["S603"]
"lib/cli/src/crewai_cli/create_agent.py" = ["S110", "PERF203", "S607", "RUF001", "RET504"]
[tool.mypy]
@@ -185,15 +176,15 @@ exclude-newer = "3 days"
# onnxruntime 1.24+ dropped Python 3.10 wheels; cap it so qdrant[fastembed] resolves on 3.10.
# fastembed 0.7.x and docling 2.63 cap pillow<12; the removed APIs don't affect them.
# langchain-core <1.2.31 has GHSA-926x-3r5x-gfhw and is required by langchain-text-splitters 1.1.2+.
# langchain-core 1.3.0 has GHSA-pjwx-r37v-7724; force 1.3.1+.
# langchain-core 1.0.0-1.3.2 has GHSA-pjwx-r37v-7724 (unsafe deserialization via broad load() allowlists); force 1.3.3+.
# langchain-text-splitters <1.1.2 has GHSA-fv5p-p927-qmxr (SSRF bypass in split_text_from_url).
# transformers 4.57.6 has CVE-2026-1839; force 5.4+ (docling 2.84 allows huggingface-hub>=1).
# cryptography 46.0.6 has CVE-2026-39892; force 46.0.7+.
# pypdf <6.10.2 has GHSA-4pxv-j86v-mhcw, GHSA-7gw9-cf7v-778f, GHSA-x284-j5p8-9c5p; force 6.10.2+.
# uv <0.11.6 has GHSA-pjjw-68hj-v9mw; force 0.11.6+.
# python-multipart <0.0.27 has GHSA-pp6c-gr5w-3c5g (DoS via unbounded multipart headers).
# urllib3 <2.7.0 has GHSA-qccp-gfcp-xxvc and GHSA-mf9v-mfxr-j63j; force 2.7.0+ (2.6.4 was never released).
# gitpython <3.1.50 has GHSA-mv93-w799-cj2w; <3.1.49 has GHSA-v87r-6q3f-2j67 (newline injection -> RCE via core.hooksPath).
# gitpython <3.1.50 has GHSA-mv93-w799-cj2w (config_writer newline injection bypassing the 3.1.49 patch -> RCE via core.hooksPath).
# urllib3 <2.7.0 has GHSA-qccp-gfcp-xxvc (ProxyManager cross-origin redirect leaks Authorization/Cookie) and GHSA-mf9v-mfxr-j63j (streaming decompression-bomb bypass); force 2.7.0+.
# langsmith <0.7.31 has GHSA-rr7j-v2q5-chgv (streaming token redaction bypass); force 0.7.31+.
# authlib <1.6.11 has GHSA-jj8c-mmj3-mmgv (CSRF bypass in cache-based state storage).
# litellm 1.83.8+ hard-pins openai==2.24.0, missing openai.types.responses used by crewai;
@@ -203,7 +194,7 @@ override-dependencies = [
"rich>=13.7.1",
"onnxruntime<1.24; python_version < '3.11'",
"pillow>=12.1.1",
"langchain-core>=1.3.1,<2",
"langchain-core>=1.3.3,<2",
"langchain-text-splitters>=1.1.2,<2",
"urllib3>=2.7.0",
"transformers>=5.4.0; python_version >= '3.10'",
@@ -212,7 +203,7 @@ override-dependencies = [
"uv>=0.11.6,<1",
"python-multipart>=0.0.27,<1",
"gitpython>=3.1.50,<4",
"langsmith>=0.7.33,<0.9",
"langsmith>=0.7.31,<0.8",
"authlib>=1.6.11",
]

12
uv.lock generated
View File

@@ -13,7 +13,7 @@ resolution-markers = [
]
[options]
exclude-newer = "2026-05-10T20:37:20.649349398Z"
exclude-newer = "2026-05-08T16:33:02.834109Z"
exclude-newer-span = "P3D"
[manifest]
@@ -29,9 +29,9 @@ overrides = [
{ name = "authlib", specifier = ">=1.6.11" },
{ name = "cryptography", specifier = ">=46.0.7" },
{ name = "gitpython", specifier = ">=3.1.50,<4" },
{ name = "langchain-core", specifier = ">=1.3.1,<2" },
{ name = "langchain-core", specifier = ">=1.3.3,<2" },
{ name = "langchain-text-splitters", specifier = ">=1.1.2,<2" },
{ name = "langsmith", specifier = ">=0.7.33,<0.9" },
{ name = "langsmith", specifier = ">=0.7.31,<0.8" },
{ name = "onnxruntime", marker = "python_full_version < '3.11'", specifier = "<1.24" },
{ name = "openai", specifier = ">=2.30.0,<3" },
{ name = "pillow", specifier = ">=12.1.1" },
@@ -3888,7 +3888,7 @@ sdist = { url = "https://files.pythonhosted.org/packages/0e/72/a3add0e4eec4eb9e2
[[package]]
name = "langsmith"
version = "0.8.3"
version = "0.7.32"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "httpx" },
@@ -3901,9 +3901,9 @@ dependencies = [
{ name = "xxhash" },
{ name = "zstandard" },
]
sdist = { url = "https://files.pythonhosted.org/packages/de/8a/1e8ea5e8bab2a65fa95bd36229ef38e8723ec46e430e20ca2d953487a7f1/langsmith-0.8.3.tar.gz", hash = "sha256:767ff7a8d136ed42926bf99059ac631dc6883542d6e3104b32e71c7625e1fa05", size = 4460330, upload-time = "2026-05-07T19:56:56.18Z" }
sdist = { url = "https://files.pythonhosted.org/packages/2f/b4/a0b4a501bee6b8a741ce29f8c48155b132118483cddc6f9247735ddb38fa/langsmith-0.7.32.tar.gz", hash = "sha256:b59b8e106d0e4c4842e158229296086e2aa7c561e3f602acda73d3ad0062e915", size = 1184518, upload-time = "2026-04-15T23:42:41.885Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/98/a9/51e644c1f1dbc3dd7d22dfd6412eab206d538c81e024e4f287373544bdcb/langsmith-0.8.3-py3-none-any.whl", hash = "sha256:b2e40e308222fa0beb2dccee3b4b30bfee9062d7a4f20a3e3e93df3c51a08ab4", size = 399048, upload-time = "2026-05-07T19:56:53.994Z" },
{ url = "https://files.pythonhosted.org/packages/62/bc/148f98ac7dad73ac5e1b1c985290079cfeeb9ba13d760a24f25002beb2c9/langsmith-0.7.32-py3-none-any.whl", hash = "sha256:e1fde928990c4c52f47dc5132708cec674355d9101723d564183e965f383bf5f", size = 378272, upload-time = "2026-04-15T23:42:39.905Z" },
]
[[package]]