# EXPANDED_CLAUDE.md
Deep architectural reference for the CrewAI codebase. See [CLAUDE.md](./CLAUDE.md) for quick-start commands and overview.
## Table of Contents
- [1. Execution Flow: Crew.kickoff() to Agent Output](#1-execution-flow)
- [2. Agent System](#2-agent-system)
- [3. Task System](#3-task-system)
- [4. Flow System](#4-flow-system)
- [5. Memory System](#5-memory-system)
- [6. Tool System](#6-tool-system)
- [7. Event System](#7-event-system)
- [8. LLM Abstraction](#8-llm-abstraction)
- [9. crewai-tools Package](#9-crewai-tools-package)
- [10. crewai-files Package](#10-crewai-files-package)
- [11. CLI & Project Scaffolding](#11-cli--project-scaffolding)
- [12. Project Decorators (@CrewBase)](#12-project-decorators)
- [13. Knowledge & RAG](#13-knowledge--rag)
- [14. Security & Fingerprinting](#14-security--fingerprinting)
- [15. Agent-to-Agent (A2A)](#15-agent-to-agent-a2a)
- [16. Translations & i18n](#16-translations--i18n)
---
## 1. Execution Flow
The end-to-end path from `Crew.kickoff()` to final output:
```
Crew.kickoff(inputs)
├── prepare_kickoff()                 # Validate inputs, store files
├── Determine process type
│   ├── Sequential: _run_sequential_process()
│   └── Hierarchical: _run_hierarchical_process() → _create_manager_agent()
├── _execute_tasks(tasks)             # Main loop
│   └── For each task:
│       ├── If ConditionalTask: check condition(previous_output)
│       ├── If async_execution: create asyncio task
│       └── If sync: task.execute_sync(agent, context, tools)
│           └── agent.execute_task(task, context, tools)
│               ├── Memory recall (if enabled)
│               ├── Knowledge retrieval (if enabled)
│               ├── Build prompt with context
│               └── CrewAgentExecutor.invoke()
│                   └── Loop until AgentFinish:
│                       ├── Native tool calling (if LLM supports)
│                       └── OR ReAct text pattern (fallback)
├── Apply guardrails with retries
├── after_kickoff_callbacks()
└── Return CrewOutput
```
**Process types:**
- **Sequential**: Tasks execute in order; each gets context from all prior TaskOutputs
- **Hierarchical**: A manager agent delegates to other agents via delegation tools
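A minimal end-to-end sketch, assuming the standard top-level `crewai` exports (names and values are illustrative):
```python
from crewai import Agent, Crew, Process, Task

writer = Agent(role="Writer", goal="Draft concise summaries",
               backstory="A precise technical writer")
summary = Task(description="Summarize {topic} in one paragraph",
               expected_output="A single tight paragraph", agent=writer)

crew = Crew(agents=[writer], tasks=[summary], process=Process.sequential)
result = crew.kickoff(inputs={"topic": "vector databases"})  # inputs interpolate into task strings
print(result.raw)  # CrewOutput exposes the final text
```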
**Agent execution loop** (`agents/crew_agent_executor.py`):
- **Native function calling**: LLM returns structured `tool_calls`; executor runs first tool, appends result, loops
- **ReAct text pattern** (fallback): LLM outputs `Thought/Action/Action Input`; executor parses text, runs tool, appends `Observation`
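Illustrative shape of the ReAct fallback text (the exact prompt wording lives in `translations/en.json`; this is not verbatim):
```
Thought: I should look up the latest release notes
Action: web_search
Action Input: {"query": "latest release notes"}
Observation: <tool result appended by the executor>
...
Thought: I now know the final answer
Final Answer: <final response, parsed as AgentFinish>
```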
---
## 2. Agent System
**Key files:** `agent/core.py`, `agents/agent_builder/base_agent.py`, `agents/crew_agent_executor.py`
### Agent class (`agent/core.py`)
Extends `BaseAgent`. Core fields:
- `role`, `goal`, `backstory` — define agent identity/prompting
- `llm` — BaseLLM instance (auto-created from string)
- `function_calling_llm` — optional specialized LLM for tool calls
- `tools` — list of BaseTool instances
- `memory` — optional unified Memory instance
- `knowledge_sources` — optional knowledge base
- `max_iter` (default 25), `max_rpm`, `max_retry_limit` (default 2)
- `allow_delegation` — enables delegation tools
- `reasoning` — enables planning before execution
- `guardrail` — validation function for output
- `code_execution_mode` — "safe" (Docker) or "unsafe" (local)
- `apps` — platform integrations (Asana, GitHub, Slack, etc.)
- `mcps` — MCP server configurations
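A minimal construction sketch of these fields (values are illustrative, not required settings):
```python
from crewai import Agent

analyst = Agent(
    role="Support Analyst",
    goal="Resolve customer tickets accurately",
    backstory="Veteran support engineer with deep product knowledge",
    llm="gpt-4o-mini",       # a string spec is auto-wrapped in a BaseLLM
    max_iter=25,             # defaults shown explicitly for illustration
    max_retry_limit=2,
    allow_delegation=True,   # enables delegation tools
)
```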
### BaseAgent (`agents/agent_builder/base_agent.py`)
Abstract base with: `id` (UUID4), `agent_executor`, `cache_handler`, `tools_handler`, `security_config`, `i18n`. Defines abstract methods: `execute_task()`, `create_agent_executor()`, `get_delegation_tools()`, `get_platform_tools()`.
### CrewAgentExecutor (`agents/crew_agent_executor.py`)
The agent execution loop. Key attributes: `llm`, `task`, `crew`, `agent`, `prompt`, `tools`, `messages`, `iterations`, `max_iter`, `respect_context_window`. Entry point: `invoke(inputs)` → `_invoke_loop()`.
### LiteAgent (`lite_agent.py`)
Lightweight alternative agent implementation with: event-driven execution, memory integration, LLM hooks, guardrail support, structured output via Converter.
---
## 3. Task System
**Key files:** `task.py`, `tasks/task_output.py`, `tasks/conditional_task.py`
### Task class (`task.py`)
Core fields:
- `description`, `expected_output` — task prompt and LLM guidance
- `agent` — assigned BaseAgent
- `tools` — optional task-specific tools (override agent tools)
- `context` — list of prior Tasks whose output provides context
- `output_file`, `output_pydantic`, `output_json` — output format
- `guardrail` + `guardrail_max_retries` (default 3) — output validation
- `async_execution` — run in background thread
- `human_input` — request human feedback
- `callback` — post-completion callback
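A hedged sketch wiring these fields together (the guardrail's `(passed, payload)` return convention is assumed from CrewAI's documented guardrail contract):
```python
from pydantic import BaseModel
from crewai import Agent, Task

class Summary(BaseModel):
    title: str
    body: str

writer = Agent(role="Writer", goal="Summarize threads", backstory="A concise editor")

def non_empty(output):
    # Guardrail: return (passed, payload-or-error); retried up to guardrail_max_retries
    return (True, output) if output.raw.strip() else (False, "empty output")

task = Task(
    description="Summarize the ticket thread",
    expected_output="A structured summary",
    agent=writer,
    output_pydantic=Summary,   # parsed instance lands in TaskOutput.pydantic
    guardrail=non_empty,
    guardrail_max_retries=3,
)
```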
### TaskOutput (`tasks/task_output.py`)
Result container: `raw` (text), `pydantic` (model instance), `json_dict`, `agent` (role string), `output_format`, `messages`.
### ConditionalTask (`tasks/conditional_task.py`)
Extends Task with `condition: Callable[[TaskOutput], bool]`. Evaluates against previous output; if False, appends empty TaskOutput and skips. Cannot be first/only task or async.
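A minimal sketch (import paths assumed from the file layout above):
```python
from crewai import Agent
from crewai.tasks.conditional_task import ConditionalTask
from crewai.tasks.task_output import TaskOutput

analyst = Agent(role="Analyst", goal="Analyze findings", backstory="Detail-oriented")

def has_results(output: TaskOutput) -> bool:
    # The previous task's output decides whether this task runs
    return "no results" not in output.raw.lower()

follow_up = ConditionalTask(
    description="Dig deeper into the findings",
    expected_output="A detailed analysis",
    agent=analyst,
    condition=has_results,
)
```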
---
## 4. Flow System
**Key files:** `flow/flow.py`, `flow/flow_wrappers.py`, `flow/persistence/`, `flow/human_feedback.py`
### Flow class (`flow/flow.py`)
Generic `Flow[T]` where T is `dict` or a Pydantic `BaseModel` (must have an `id` field). Uses the `FlowMeta` metaclass to register decorated methods at class-definition time.
**Decorator API:**
```python
@start(condition=None) # Entry point (unconditional or conditional)
@listen(condition) # Event handler (fires when condition met)
@router(condition) # Decision point (return value becomes trigger)
@human_feedback(message, emit) # Collect human feedback, optionally route
or_(*conditions) # Fire when ANY condition met
and_(*conditions) # Fire when ALL conditions met
```
**Execution model:**
1. Execute all unconditional `@start` methods in parallel
2. After each method completes: find triggered routers (sequential), then listeners (parallel)
3. Continue chain until no more triggers
**Key rules:**
- Routers are sequential; listeners are parallel
- OR listeners fire once on first trigger; AND listeners wait for all
- State access is thread-safe via `StateProxy` with `_state_lock`
- Cyclic flows: methods cleared from `_completed_methods` to allow re-execution
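A small sketch of the decorator API and routing rules above (state keys and trigger strings are illustrative):
```python
from crewai.flow.flow import Flow, listen, router, start

class ReviewFlow(Flow):
    @start()
    def draft(self):
        self.state["text"] = "first draft"

    @router(draft)
    def triage(self):
        # The returned string becomes the trigger for downstream listeners
        return "short" if len(self.state["text"]) < 100 else "long"

    @listen("short")
    def expand(self):
        return self.state["text"] + " (expanded)"

    @listen("long")
    def trim(self):
        return self.state["text"][:100]

result = ReviewFlow().kickoff()
```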
### Persistence (`flow/persistence/`)
- `FlowPersistence` ABC: `save_state()`, `load_state()`, `save_pending_feedback()`, `load_pending_feedback()`
- `SQLiteFlowPersistence`: stores in `~/.crewai/flows.db`
- Enables resumption via `Flow.from_pending(flow_id, persistence)`
### Human Feedback (`flow/human_feedback.py`)
`@human_feedback` decorator wraps method to collect feedback. With `emit` parameter, acts as router (LLM collapses feedback to outcome). Supports async providers that raise `HumanFeedbackPending` to pause flow. Optional `learn=True` stores lessons in memory.
### Flow Methods
- `kickoff(inputs)` / `akickoff(inputs)` — sync/async execution
- `resume(feedback)` / `resume_async(feedback)` — resume from pause
- `ask(message, timeout)` — request user input (auto-checkpoints state)
- `state` — thread-safe state proxy
- `recall(query)` / `remember(content)` — memory integration
---
## 5. Memory System
**Key files:** `memory/unified_memory.py`, `memory/types.py`, `memory/encoding_flow.py`, `memory/recall_flow.py`, `memory/memory_scope.py`, `memory/analyze.py`, `memory/storage/`
### Memory class (`memory/unified_memory.py`)
Singleton-style with lazy LLM/embedder init. Pluggable storage backend (default LanceDB). Background save queue via ThreadPoolExecutor(max_workers=1).
**Public API:**
- **Write:** `remember(content, scope, categories, importance, ...)`, `remember_many(contents, ...)` (non-blocking batch)
- **Read:** `recall(query, scope, categories, limit, depth="shallow"|"deep")`
- **Manage:** `forget(scope, categories, older_than, ...)`, `update(record_id, ...)`, `drain_writes()`
- **Scoping:** `scope(path)` → `MemoryScope`, `slice(scopes, read_only)` → `MemorySlice`
- **Introspection:** `list_scopes()`, `list_records()`, `list_categories()`, `info()`, `tree()`
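A hedged usage sketch of this surface (the import path and default construction are assumed from the file layout above):
```python
from crewai.memory.unified_memory import Memory

memory = Memory()
memory.remember(
    "Acme prefers weekly status emails",
    scope="/clients/acme",
    categories=["preferences"],
    importance=0.7,
)
matches = memory.recall("How does Acme want updates?",
                        scope="/clients/acme", limit=5, depth="shallow")
for match in matches:
    print(match.score, match.record.content)  # MemoryMatch fields, per Data Types below
memory.drain_writes()  # flush the background save queue
```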
**Configuration:**
- Scoring weights: `semantic_weight=0.5`, `recency_weight=0.3`, `importance_weight=0.2`
- `recency_half_life_days=30` — exponential decay
- `consolidation_threshold=0.85` — dedup trigger similarity
- `confidence_threshold_high=0.8`, `confidence_threshold_low=0.5` — recall routing
- `exploration_budget=1` — LLM exploration rounds for deep recall
### Data Types (`memory/types.py`)
- **MemoryRecord**: `id`, `content`, `scope` (hierarchical path like `/company/team`), `categories`, `metadata`, `importance` (0-1), `created_at`, `last_accessed`, `embedding`, `source`, `private`
- **MemoryMatch**: `record`, `score` (composite), `match_reasons`, `evidence_gaps`
- **ScopeInfo**: `path`, `record_count`, `categories`, date range, `child_scopes`
**Composite scoring formula:**
```
score = semantic_weight × similarity + recency_weight × (0.5 ^ (age_days / half_life)) + importance_weight × importance
```
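The same formula as a worked sketch with the default weights:
```python
def composite_score(similarity: float, age_days: float, importance: float,
                    semantic_weight: float = 0.5, recency_weight: float = 0.3,
                    importance_weight: float = 0.2, half_life: float = 30.0) -> float:
    recency = 0.5 ** (age_days / half_life)  # exponential decay, 30-day half-life
    return (semantic_weight * similarity
            + recency_weight * recency
            + importance_weight * importance)

# A perfect semantic match saved 30 days ago with importance 0.8:
# 0.5*1.0 + 0.3*0.5 + 0.2*0.8 = 0.81
```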
### Encoding Flow (`memory/encoding_flow.py`)
5-step batch pipeline on save:
1. **Batch embed** all items (single API call)
2. **Intra-batch dedup** via cosine similarity matrix (threshold 0.98)
3. **Parallel find similar** records in storage (8 workers)
4. **Parallel analyze** — Groups: A (insert, 0 LLM), B (consolidation, 1 LLM), C (save analysis, 1 LLM), D (both, 2 LLM) — 10 workers
5. **Execute plans** — batch re-embed, atomic storage mutations (delete + update + insert under write lock)
### Recall Flow (`memory/recall_flow.py`)
Adaptive recall pipeline:
1. **Analyze query** — short queries skip LLM; long queries get sub-queries, scope suggestions, complexity classification, time filters
2. **Filter & chunk** candidate scopes (max 20)
3. **Parallel search** across queries × scopes (4 workers), apply filters, compute composite scores
4. **Route** — high confidence → synthesize; low confidence + budget → explore deeper
5. **Recursive exploration** (if deeper) — LLM extracts relevant info + gaps; decrements budget; re-searches
6. **Synthesize** — deduplicate by ID, rank by composite score, return top N
### Storage Backend (`memory/storage/backend.py`)
Protocol interface: `save()`, `update()`, `delete()`, `search()`, `get_record()`, `list_records()`, `get_scope_info()`, `list_scopes()`, `list_categories()`, `count()`, `reset()`, `write_lock` property.
**LanceDB implementation** (`memory/storage/lancedb_storage.py`): auto-detects vector dimensions, class-level shared RLock per DB path, auto-compaction every 100 saves, retry logic for commit conflicts (exponential backoff, 5 retries), oversamples 3x when filters present.
### Scoped Views (`memory/memory_scope.py`)
- **MemoryScope**: wraps Memory with root_path prefix; all operations relative to that root
- **MemorySlice**: multi-scope view; recall searches all scopes in parallel; optional `read_only=True`
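Continuing the hedged Memory sketch above:
```python
team = memory.scope("/company/team")     # MemoryScope: paths now relative to this root
team.remember("Standup moved to 9:30")   # stored under /company/team
view = memory.slice(["/company/team", "/clients/acme"], read_only=True)
hits = view.recall("meeting schedule")   # searches both scopes in parallel
```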
---
## 6. Tool System
**Key files:** `tools/base_tool.py`, `tools/structured_tool.py`, `tools/tool_calling.py`, `tools/tool_usage.py`, `tools/memory_tools.py`
### BaseTool (`tools/base_tool.py`)
Abstract Pydantic BaseModel. Key fields: `name`, `description`, `args_schema` (Pydantic model), `result_as_answer`, `max_usage_count`, `cache_function`. Subclasses implement `_run(**kwargs)` and optionally `_arun(**kwargs)`.
**`@tool` decorator:** creates tool from function, auto-infers schema from type hints.
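For example (the docstring and type hints supply the description and inferred schema):
```python
from crewai.tools import tool

@tool("Word Counter")
def word_counter(text: str) -> str:
    """Count the words in a block of text."""
    return str(len(text.split()))
```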
### CrewStructuredTool (`tools/structured_tool.py`)
Wraps functions as structured tools for LLM function calling. `from_function()` factory. Validates inputs before execution, enforces usage limits.
### Tool Execution Flow (`tools/tool_usage.py`)
`ToolUsage` manages selection → validation → execution:
1. Parse tool call from LLM output
2. Select tool (fuzzy matching, 85%+ ratio)
3. Validate arguments against schema
4. Execute with fingerprint metadata
5. Cache results if configured
6. Emit events throughout lifecycle
Retry: max 3 parsing attempts with fallback methods (JSON, JSON5, AST, JSON repair).
### Memory Tools (`tools/memory_tools.py`)
- **RecallMemoryTool**: searches memory with single/multiple queries, returns formatted results with deduplication
- **RememberTool**: stores facts/decisions, infers scope/categories/importance
- **CalculatorTool**: safe arithmetic via AST parser (no `eval()`), supports date differences
### MCP Integration
- **MCPToolWrapper** (`tools/mcp_tool_wrapper.py`): on-demand connections, retry with exponential backoff, timeouts (15s connect, 60s execute)
- **MCPNativeTool** (`tools/mcp_native_tool.py`): reuses persistent MCP sessions, auto-reconnect on event loop changes
---
## 7. Event System
**Key files:** `events/event_bus.py`, `events/event_listener.py`, `events/base_events.py`, `events/types/`
### Event Bus (`events/event_bus.py`)
Singleton `CrewAIEventsBus`. Thread-safe with RWLock. Supports sync handlers (ThreadPoolExecutor, 10 workers) and async handlers (dedicated daemon event loop). Handler dependency injection via `Depends()`.
**Key methods:** `emit(source, event)`, `aemit()`, `flush(timeout=30)`, `register_handler()`, `scoped_handlers()` (context manager for temporary handlers).
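A hedged registration sketch (the singleton name and event import paths are assumed from the file layout above):
```python
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.tool_usage_events import ToolUsageStartedEvent

@crewai_event_bus.on(ToolUsageStartedEvent)
def log_tool_start(source, event):
    # Handlers receive the emitting object and the event instance
    print(f"[{event.timestamp}] tool starting, emitted by {type(source).__name__}")
```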
### Event Types (`events/types/`)
- **Tool events**: `ToolUsageStartedEvent`, `ToolUsageFinishedEvent`, `ToolUsageErrorEvent`, `ToolValidateInputErrorEvent`, `ToolSelectionErrorEvent`
- **LLM events**: `LLMCallStartedEvent`, `LLMCallCompletedEvent`, `LLMCallFailedEvent`, `LLMStreamChunkEvent`, `LLMThinkingChunkEvent`
- **Agent/Task/Crew events**: lifecycle tracking (started, completed, failed)
- **Flow events**: method execution states, paused, input requested/received
- **Memory events**: retrieval started/completed/failed
- **MCP events**: connection, tool execution
- **A2A events**: agent-to-agent delegation
All events carry: UUID, timestamp, parent/previous chain, fingerprint context.
---
## 8. LLM Abstraction
**Key files:** `llm.py`, `llms/base_llm.py`, `llms/providers/`
### BaseLLM (`llms/base_llm.py`)
Abstract interface: `call(messages, tools, ...)` and `acall(...)`. Provider-specific constants for context windows (roughly 1K–2M tokens, per provider). Emits LLM events. Handles context window management, timeout/auth errors, streaming.
### LLM class (`llm.py`)
High-level wrapper integrating with litellm for multi-provider support. Handles model identification, tool function calling, JSON schema responses, streaming chunk aggregation, multimodal content formatting.
### Providers (`llms/providers/`)
Per-provider adapters: OpenAI, Azure, Gemini, Claude/Anthropic, Bedrock, Watson, etc.
---
## 9. crewai-tools Package
**Location:** `lib/crewai-tools/`
93+ pre-built tools. All inherit from `crewai.tools.BaseTool`.
**Pattern for creating tools:**
```python
from crewai.tools import BaseTool
from pydantic import BaseModel, Field


class MyToolSchema(BaseModel):
    param: str = Field(..., description="...")


class MyTool(BaseTool):
    name: str = "My Tool"
    description: str = "..."
    args_schema: type[BaseModel] = MyToolSchema

    def _run(self, param: str) -> str:
        result = f"processed {param}"  # tool logic goes here
        return result
```
**Tool categories:**
- **Search/Web**: BraveSearch, Tavily, EXASearch, Serper, Spider, SerpAPI
- **Scraping**: Firecrawl, Jina, Scrapfly, Selenium, Browserbase, Stagehand
- **File search**: PDF, CSV, JSON, XML, MDX, DOCX, TXT search tools
- **Database**: MySQL, Snowflake, SingleStore, MongoDB, Qdrant, Weaviate, Couchbase
- **File I/O**: FileRead, FileWriter, DirectoryRead, DirectorySearch, FileCompressor, OCR, Vision
- **Code**: CodeInterpreter, CodeDocsSearch, NL2SQL, DallE
- **AWS**: Bedrock agent/KB, S3 reader/writer
- **Integrations**: Composio, Zapier, MCP, LlamaIndex, GitHub
- **RAG**: RagTool base with 17 loaders (CSV, Directory, Docs, DOCX, GitHub, JSON, MySQL, Postgres, etc.)
**43+ optional dependency groups** for external services.
---
## 10. crewai-files Package
**Location:** `lib/crewai-files/`
Multimodal file handling for LLM providers.
**Structure:**
- `core/` — File type classes (Image, PDF, Audio, Video, Text), source types (FilePath, FileBytes, FileUrl, FileStream), resolved representations
- `processing/` — FileProcessor validates against per-provider constraints, optional transforms (resize, compress, chunk)
- `uploaders/` — Provider-specific uploaders (Anthropic, OpenAI, Gemini, Bedrock/S3)
- `formatting/` — Format files for provider APIs: `format_multimodal_content()`, `aformat_multimodal_content()`
- `resolution/` — FileResolver decides inline base64 vs upload based on size/provider
- `cache/` — UploadCache tracks uploads by content hash, cleanup utilities
**Provider constraints**: max file sizes, supported formats, image dimensions per provider (Anthropic, OpenAI, Gemini, Bedrock).
---
## 11. CLI & Project Scaffolding
**Key file:** `cli/cli.py` (Click-based)
**Core commands:**
- `crewai create <crew|flow> <name>` — scaffold project
- `crewai run` / `crewai flow kickoff` — execute crew/flow
- `crewai chat` — interactive conversation with crew
- `crewai train [-n N]` / `crewai test [-n N] [-m MODEL]` — training and evaluation
- `crewai replay [-t TASK_ID]` — replay from specific task
**Memory/config:**
- `crewai reset_memories` — reset memory, knowledge, or all
- `crewai memory` — open Memory TUI
- `crewai config list|set|reset` — CLI configuration
**Deployment:**
- `crewai deploy create|list|push|status|logs|remove`
**Tool repository:**
- `crewai tool create|install|publish`
**Flow-specific:**
- `crewai flow kickoff|plot|add-crew`
**Other:** `crewai login`, `crewai org list|switch|current`, `crewai traces enable|disable|status`, `crewai env view`
---
## 12. Project Decorators
**Key files:** `project/crew_base.py`, `project/annotations.py`
### @CrewBase decorator
Applies `CrewBaseMeta` metaclass. Auto-loads YAML configs (`config/agents.yaml`, `config/tasks.yaml`). Registers agent/task factory methods, MCP adapters, lifecycle hooks.
### Method decorators (`project/annotations.py`)
**Component factories** (all memoized):
- `@agent` — agent factory method
- `@task` — task factory method
- `@llm` — LLM provider factory
- `@tool` — tool factory
- `@callback`, `@cache_handler`
**Lifecycle:**
- `@before_kickoff` / `@after_kickoff` — pre/post execution hooks
- `@crew` — main crew entry point (instantiates agents/tasks, manages callbacks)
**Output format:** `@output_json`, `@output_pydantic`
**LLM/Tool hooks** (optional agent/tool filtering):
- `@before_llm_call_hook` / `@after_llm_call_hook`
- `@before_tool_call_hook` / `@after_tool_call_hook`
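A minimal scaffold sketch in the shape generated by `crewai create crew` (class name and config keys are illustrative):
```python
from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task

@CrewBase
class ResearchCrew:
    """Agents and tasks are loaded from config/agents.yaml and config/tasks.yaml."""

    @agent
    def researcher(self) -> Agent:
        return Agent(config=self.agents_config["researcher"])

    @task
    def research_task(self) -> Task:
        return Task(config=self.tasks_config["research_task"])

    @crew
    def crew(self) -> Crew:
        return Crew(agents=self.agents, tasks=self.tasks, process=Process.sequential)
```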
---
## 13. Knowledge & RAG
**Key files:** `knowledge/knowledge.py`, `rag/`
### Knowledge class
Vector store integration: `query(queries, results_limit, score_threshold)`, `add_sources()`, `reset()`. Async variants available. Used by agents via `knowledge_sources` parameter.
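A hedged usage sketch matching the signature above (building `knowledge` from sources is omitted):
```python
hits = knowledge.query(["onboarding checklist"], results_limit=3, score_threshold=0.35)
for hit in hits:
    print(hit)
```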
### RAG system (`rag/`)
- **Vector DBs**: ChromaDB, Qdrant (client wrappers, factories, config)
- **Embeddings**: 25+ providers (OpenAI, Cohere, HuggingFace, Jina, Voyage, Ollama, Bedrock, Azure, Vertex, etc.)
- **Core**: `BaseClient`, `BaseEmbeddingsProvider` abstractions
- **Storage**: `BaseRAGStorage` interface
---
## 14. Security & Fingerprinting
**Key files:** `security/security_config.py`, `security/fingerprint.py`
- **SecurityConfig**: manages component fingerprints, serialization
- **Fingerprint**: dual identifiers (human-readable ID + UUID), `uuid5()` with CrewAI namespace for deterministic seeding, metadata support (1-level nesting, 10KB limit), timestamp tracking
- Every event carries fingerprint context for audit trails
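Illustrative only; the real namespace UUID lives in `security/fingerprint.py`, and this stand-in just shows the deterministic-seeding idea:
```python
import uuid

NAMESPACE = uuid.NAMESPACE_URL  # stand-in for the CrewAI namespace
fp = uuid.uuid5(NAMESPACE, "agent:Senior Researcher")
assert fp == uuid.uuid5(NAMESPACE, "agent:Senior Researcher")  # stable across runs
```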
---
## 15. Agent-to-Agent (A2A)
**Key files:** `a2a/config.py`, `a2a/`
Protocol for inter-agent communication:
- `A2AClientConfig`, `A2AServerConfig` — configuration
- `AgentCardSigningConfig` — JWS signing (RS256, ES256, PS256)
- `GRPCServerConfig` — gRPC transport with TLS
- Supporting: `auth/`, `updates/` (polling/push/streaming), `extensions/`, `utils/`
---
## 16. Translations & i18n
**Key file:** `translations/en.json`
All agent-facing prompts are externalized. Key sections:
- `slices/` — agent prompting templates (task, memory, role_playing, tools, format, final_answer_format)
- `errors/` — tool execution, validation, format violation, guardrail failure messages
- `tools/` — tool descriptions (delegate_work, ask_question, recall_memory, calculator, save_to_memory)
- `memory/` — query analysis, extraction rules, consolidation logic, temporal reasoning
- HITL prompts — pre-review, lesson distillation
- Lite agent prompts — system prompts with/without tools