# EXPANDED_CLAUDE.md

Deep architectural reference for the CrewAI codebase. See CLAUDE.md for quick-start commands and overview.
## Table of Contents
- 1. Execution Flow: Crew.kickoff() to Agent Output
- 2. Agent System
- 3. Task System
- 4. Flow System
- 5. Memory System
- 6. Tool System
- 7. Event System
- 8. LLM Abstraction
- 9. crewai-tools Package
- 10. crewai-files Package
- 11. CLI & Project Scaffolding
- 12. Project Decorators (@CrewBase)
- 13. Knowledge & RAG
- 14. Security & Fingerprinting
- 15. Agent-to-Agent (A2A)
- 16. Translations & i18n
## 1. Execution Flow

The end-to-end path from `Crew.kickoff()` to final output:
```
Crew.kickoff(inputs)
├── prepare_kickoff()                  # Validate inputs, store files
├── Determine process type
│   ├── Sequential: _run_sequential_process()
│   └── Hierarchical: _run_hierarchical_process() → _create_manager_agent()
├── _execute_tasks(tasks)              # Main loop
│   └── For each task:
│       ├── If ConditionalTask: check condition(previous_output)
│       ├── If async_execution: create asyncio task
│       └── If sync: task.execute_sync(agent, context, tools)
│           └── agent.execute_task(task, context, tools)
│               ├── Memory recall (if enabled)
│               ├── Knowledge retrieval (if enabled)
│               ├── Build prompt with context
│               └── CrewAgentExecutor.invoke()
│                   └── Loop until AgentFinish:
│                       ├── Native tool calling (if LLM supports)
│                       └── OR ReAct text pattern (fallback)
├── Apply guardrails with retries
├── after_kickoff_callbacks()
└── Return CrewOutput
```
Process types:
- Sequential: Tasks execute in order; each gets context from all prior TaskOutputs
- Hierarchical: A manager agent delegates to other agents via delegation tools
Agent execution loop (`agents/crew_agent_executor.py`):

- Native function calling: LLM returns structured `tool_calls`; executor runs the first tool, appends the result, loops
- ReAct text pattern (fallback): LLM outputs Thought/Action/Action Input; executor parses the text, runs the tool, appends an Observation
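A minimal sequential crew that exercises this whole path (roles and inputs are illustrative; the constructor fields are the documented ones from sections 2-3):

```python
from crewai import Agent, Crew, Process, Task

researcher = Agent(
    role="Researcher",
    goal="Summarize a topic in three bullet points",
    backstory="A concise, source-citing research assistant.",
)

summarize = Task(
    description="Summarize {topic} for a technical audience.",  # {topic} filled from kickoff inputs
    expected_output="Exactly three bullet points.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[summarize], process=Process.sequential)
result = crew.kickoff(inputs={"topic": "vector databases"})  # returns a CrewOutput
print(result.raw)
```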
## 2. Agent System
Key files: `agent/core.py`, `agents/agent_builder/base_agent.py`, `agents/crew_agent_executor.py`
### Agent class (`agent/core.py`)

Extends BaseAgent. Core fields (a construction sketch follows the list):

- `role`, `goal`, `backstory` — define agent identity/prompting
- `llm` — BaseLLM instance (auto-created from string)
- `function_calling_llm` — optional specialized LLM for tool calls
- `tools` — list of BaseTool instances
- `memory` — optional unified Memory instance
- `knowledge_sources` — optional knowledge base
- `max_iter` (default 25), `max_rpm`, `max_retry_limit` (default 2)
- `allow_delegation` — enables delegation tools
- `reasoning` — enables planning before execution
- `guardrail` — validation function for output
- `code_execution_mode` — "safe" (Docker) or "unsafe" (local)
- `apps` — platform integrations (Asana, GitHub, Slack, etc.)
- `mcps` — MCP server configurations
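For instance, a sketch wiring a few of these fields (values are illustrative; the model string form is the auto-creation path noted above):

```python
from crewai import Agent

analyst = Agent(
    role="Data Analyst",
    goal="Answer questions about quarterly metrics",
    backstory="Careful with numbers and sources.",
    llm="openai/gpt-4o-mini",   # string form: a BaseLLM is auto-created
    max_iter=10,                # tighten the executor loop (default 25)
    allow_delegation=False,     # skip injecting delegation tools
)
```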
### BaseAgent (`agents/agent_builder/base_agent.py`)

Abstract base with: `id` (UUID4), `agent_executor`, `cache_handler`, `tools_handler`, `security_config`, `i18n`. Defines abstract methods: `execute_task()`, `create_agent_executor()`, `get_delegation_tools()`, `get_platform_tools()`.
### CrewAgentExecutor (`agents/crew_agent_executor.py`)

The agent execution loop. Key attributes: `llm`, `task`, `crew`, `agent`, `prompt`, `tools`, `messages`, `iterations`, `max_iter`, `respect_context_window`. Entry point: `invoke(inputs)` → `_invoke_loop()`.
### LiteAgent (`lite_agent.py`)

Lightweight alternative agent implementation with: event-driven execution, memory integration, LLM hooks, guardrail support, structured output via Converter.
## 3. Task System
Key files: `task.py`, `tasks/task_output.py`, `tasks/conditional_task.py`
### Task class (`task.py`)

Core fields (sketch below):

- `description`, `expected_output` — task prompt and LLM guidance
- `agent` — assigned BaseAgent
- `tools` — optional task-specific tools (override agent tools)
- `context` — list of prior Tasks whose output provides context
- `output_file`, `output_pydantic`, `output_json` — output format
- `guardrail` + `guardrail_max_retries` (default 3) — output validation
- `async_execution` — run in a background thread
- `human_input` — request human feedback
- `callback` — post-completion callback
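A sketch of a task with structured output and a guardrail (hedged: the `(bool, value)` guardrail return convention shown here matches CrewAI's documented pattern, but verify against `task.py`):

```python
from pydantic import BaseModel
from crewai import Task

class Report(BaseModel):
    title: str
    bullets: list[str]

def non_empty(output):
    # Guardrail convention: (ok, value_or_error); retried up to guardrail_max_retries.
    return (True, output) if output.raw.strip() else (False, "output was empty")

report_task = Task(
    description="Write a short report on {topic}.",
    expected_output="A title plus three bullets.",
    output_pydantic=Report,   # parsed result lands in TaskOutput.pydantic
    guardrail=non_empty,
)
```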
### TaskOutput (`tasks/task_output.py`)

Result container: `raw` (text), `pydantic` (model instance), `json_dict`, `agent` (role string), `output_format`, `messages`.
### ConditionalTask (`tasks/conditional_task.py`)

Extends Task with `condition: Callable[[TaskOutput], bool]`. Evaluated against the previous task's output; if False, an empty TaskOutput is appended and the task is skipped. Cannot be the first/only task, and cannot be async.
## 4. Flow System
Key files: `flow/flow.py`, `flow/flow_wrappers.py`, `flow/persistence/`, `flow/human_feedback.py`
### Flow class (`flow/flow.py`)

Generic `Flow[T]` where T is a dict or a Pydantic BaseModel (must have an `id` field). Uses the FlowMeta metaclass to register decorators at class-definition time.
Decorator API:

```python
@start(condition=None)          # Entry point (unconditional or conditional)
@listen(condition)              # Event handler (fires when condition met)
@router(condition)              # Decision point (return value becomes trigger)
@human_feedback(message, emit)  # Collect human feedback, optionally route
or_(*conditions)                # Fire when ANY condition met
and_(*conditions)               # Fire when ALL conditions met
```
Execution model:

- Execute all unconditional `@start` methods in parallel
- After each method completes: find triggered routers (sequential), then listeners (parallel)
- Continue the chain until no more triggers fire

Key rules (illustrated in the sketch below):

- Routers are sequential; listeners are parallel
- OR listeners fire once on the first trigger; AND listeners wait for all
- State access is thread-safe via `StateProxy` with `_state_lock`
- Cyclic flows: methods are cleared from `_completed_methods` to allow re-execution
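A minimal flow exercising these rules (unstructured dict state; the import path follows the key files above):

```python
from crewai.flow.flow import Flow, listen, or_, router, start

class PipelineFlow(Flow):
    @start()
    def fetch(self):
        self.state["items"] = ["a", "b"]

    @router(fetch)
    def triage(self):
        # The returned string becomes the trigger for downstream listeners.
        return "ok" if self.state["items"] else "empty"

    @listen("ok")
    def process(self):
        return f"processed {len(self.state['items'])} items"

    @listen(or_("empty", process))
    def finish(self):
        return self.state.get("items", [])

final = PipelineFlow().kickoff()
```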
### Persistence (`flow/persistence/`)

- `FlowPersistence` ABC: `save_state()`, `load_state()`, `save_pending_feedback()`, `load_pending_feedback()`
- `SQLiteFlowPersistence`: stores in `~/.crewai/flows.db`
- Enables resumption via `Flow.from_pending(flow_id, persistence)`, as sketched below
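Resuming a paused flow might look like this, reusing the `PipelineFlow` sketch above (hedged: `SQLiteFlowPersistence`'s constructor defaults and the exact `from_pending` signature are assumptions based on the list above):

```python
from crewai.flow.persistence import SQLiteFlowPersistence

persistence = SQLiteFlowPersistence()  # assumption: defaults to ~/.crewai/flows.db
flow = PipelineFlow.from_pending("my-flow-id", persistence)  # rebuild the paused flow
flow.resume(feedback="approved")  # continue from the checkpoint (see Flow Methods below)
```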
### Human Feedback (`flow/human_feedback.py`)

The `@human_feedback` decorator wraps a method to collect feedback. With the `emit` parameter it acts as a router (an LLM collapses the feedback to an outcome). Supports async providers that raise `HumanFeedbackPending` to pause the flow. Optional `learn=True` stores lessons in memory.
### Flow Methods

- `kickoff(inputs)` / `akickoff(inputs)` — sync/async execution
- `resume(feedback)` / `resume_async(feedback)` — resume from pause
- `ask(message, timeout)` — request user input (auto-checkpoints state)
- `state` — thread-safe state proxy
- `recall(query)` / `remember(content)` — memory integration
## 5. Memory System
Key files: `memory/unified_memory.py`, `memory/types.py`, `memory/encoding_flow.py`, `memory/recall_flow.py`, `memory/memory_scope.py`, `memory/analyze.py`, `memory/storage/`
### Memory class (`memory/unified_memory.py`)

Singleton-style with lazy LLM/embedder initialization. Pluggable storage backend (default LanceDB). Background save queue via `ThreadPoolExecutor(max_workers=1)`.
Public API (usage sketch below):

- Write: `remember(content, scope, categories, importance, ...)`, `remember_many(contents, ...)` (non-blocking batch)
- Read: `recall(query, scope, categories, limit, depth="shallow"|"deep")`
- Manage: `forget(scope, categories, older_than, ...)`, `update(record_id, ...)`, `drain_writes()`
- Scoping: `scope(path)` → `MemoryScope`, `slice(scopes, read_only)` → `MemorySlice`
- Introspection: `list_scopes()`, `list_records()`, `list_categories()`, `info()`, `tree()`
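Illustrative usage (hedged: the import path is inferred from the key-files list, default construction is assumed, and the argument values are made up):

```python
from crewai.memory.unified_memory import Memory  # path per "Key files" above (assumption)

memory = Memory()
memory.remember(
    "Acme prefers weekly status emails",
    scope="/clients/acme",
    categories=["preferences"],
    importance=0.8,
)
matches = memory.recall("How does Acme want updates?", scope="/clients", depth="shallow", limit=5)
for match in matches:
    print(match.score, match.record.content)

acme = memory.scope("/clients/acme")  # MemoryScope: operations relative to this root
memory.drain_writes()                 # flush the background save queue before exiting
```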
Configuration:

- Scoring weights: `semantic_weight=0.5`, `recency_weight=0.3`, `importance_weight=0.2`
- `recency_half_life_days=30` — exponential decay
- `consolidation_threshold=0.85` — dedup trigger similarity
- `confidence_threshold_high=0.8`, `confidence_threshold_low=0.5` — recall routing
- `exploration_budget=1` — LLM exploration rounds for deep recall
### Data Types (`memory/types.py`)

- MemoryRecord: `id`, `content`, `scope` (hierarchical path like `/company/team`), `categories`, `metadata`, `importance` (0-1), `created_at`, `last_accessed`, `embedding`, `source`, `private`
- MemoryMatch: `record`, `score` (composite), `match_reasons`, `evidence_gaps`
- ScopeInfo: `path`, `record_count`, `categories`, date range, `child_scopes`
Composite scoring formula:

```
score = semantic_weight × similarity
      + recency_weight × 0.5 ^ (age_days / half_life)
      + importance_weight × importance
```
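The same formula as a standalone function (a direct transcription using the default weights above, not the library's code):

```python
def composite_score(
    similarity: float,
    age_days: float,
    importance: float,
    semantic_weight: float = 0.5,
    recency_weight: float = 0.3,
    importance_weight: float = 0.2,
    half_life_days: float = 30.0,
) -> float:
    """Blend semantic similarity, exponential recency decay, and stored importance."""
    recency = 0.5 ** (age_days / half_life_days)
    return (
        semantic_weight * similarity
        + recency_weight * recency
        + importance_weight * importance
    )

# A 30-day-old record with similarity 0.9 and importance 0.6:
# 0.5*0.9 + 0.3*0.5 + 0.2*0.6 = 0.72
assert abs(composite_score(0.9, 30, 0.6) - 0.72) < 1e-9
```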
### Encoding Flow (`memory/encoding_flow.py`)

5-step batch pipeline on save (step 2 sketched below):

1. Batch-embed all items (single API call)
2. Intra-batch dedup via cosine similarity matrix (threshold 0.98)
3. Find similar records in storage, in parallel (8 workers)
4. Analyze in parallel (10 workers) — groups: A (insert, 0 LLM calls), B (consolidation, 1 LLM call), C (save analysis, 1 LLM call), D (both, 2 LLM calls)
5. Execute plans — batch re-embed, atomic storage mutations (delete + update + insert under the write lock)
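Step 2 reduces to thresholding a cosine-similarity matrix; a minimal NumPy sketch of the idea (the library's actual implementation may differ):

```python
import numpy as np

def dedup_batch(embeddings: np.ndarray, threshold: float = 0.98) -> list[int]:
    """Return indices of items to keep, dropping near-duplicates within the batch."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T  # pairwise cosine similarity matrix
    keep: list[int] = []
    for i in range(len(embeddings)):
        if all(sims[i, j] < threshold for j in keep):
            keep.append(i)
    return keep
```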
### Recall Flow (`memory/recall_flow.py`)

Adaptive recall pipeline:

1. Analyze query — short queries skip the LLM; long queries get sub-queries, scope suggestions, complexity classification, time filters
2. Filter & chunk candidate scopes (max 20)
3. Search queries × scopes in parallel (4 workers), apply filters, compute composite scores
4. Route — high confidence → synthesize; low confidence + remaining budget → explore deeper
5. Recursive exploration (if deeper) — LLM extracts relevant info + gaps; decrements the budget; re-searches
6. Synthesize — deduplicate by ID, rank by composite score, return top N
### Storage Backend (`memory/storage/backend.py`)

Protocol interface: `save()`, `update()`, `delete()`, `search()`, `get_record()`, `list_records()`, `get_scope_info()`, `list_scopes()`, `list_categories()`, `count()`, `reset()`, plus a `write_lock` property.
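Expressed as a `typing.Protocol`, the contract looks roughly like this (only the method names are documented; the signatures here are assumptions):

```python
import threading
from typing import Any, Protocol

class StorageBackend(Protocol):
    """Structural contract for memory storage backends (signatures assumed)."""

    @property
    def write_lock(self) -> threading.Lock: ...

    def save(self, record: Any) -> None: ...
    def update(self, record_id: str, **changes: Any) -> None: ...
    def delete(self, record_id: str) -> None: ...
    def search(self, query: Any, limit: int = 10, **filters: Any) -> list[Any]: ...
    def get_record(self, record_id: str) -> Any | None: ...
    def list_records(self, scope: str | None = None) -> list[Any]: ...
    def get_scope_info(self, scope: str) -> Any: ...
    def list_scopes(self) -> list[str]: ...
    def list_categories(self) -> list[str]: ...
    def count(self) -> int: ...
    def reset(self) -> None: ...
```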
LanceDB implementation (memory/storage/lancedb_storage.py): auto-detects vector dimensions, class-level shared RLock per DB path, auto-compaction every 100 saves, retry logic for commit conflicts (exponential backoff, 5 retries), oversamples 3x when filters present.
### Scoped Views (`memory/memory_scope.py`)

- MemoryScope: wraps Memory with a `root_path` prefix; all operations are relative to that root
- MemorySlice: multi-scope view; recall searches all scopes in parallel; optional `read_only=True`
## 6. Tool System
Key files: `tools/base_tool.py`, `tools/structured_tool.py`, `tools/tool_calling.py`, `tools/tool_usage.py`, `tools/memory_tools.py`
### BaseTool (`tools/base_tool.py`)

Abstract Pydantic BaseModel. Key fields: `name`, `description`, `args_schema` (Pydantic model), `result_as_answer`, `max_usage_count`, `cache_function`. Subclasses implement `_run(**kwargs)` and optionally `_arun(**kwargs)`.

The `@tool` decorator creates a tool from a function, auto-inferring the schema from type hints.
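The decorator route, for contrast with the class-based pattern in section 9:

```python
from crewai.tools import tool

@tool("Word Counter")
def word_counter(text: str) -> str:
    """Count the words in a piece of text."""  # the docstring becomes the tool description
    return f"{len(text.split())} words"
```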
### CrewStructuredTool (`tools/structured_tool.py`)

Wraps functions as structured tools for LLM function calling. `from_function()` factory. Validates inputs before execution, enforces usage limits.
### Tool Execution Flow (`tools/tool_usage.py`)

ToolUsage manages selection → validation → execution:

1. Parse the tool call from LLM output
2. Select the tool (fuzzy matching, 85%+ ratio; illustrated below)
3. Validate arguments against the schema
4. Execute with fingerprint metadata
5. Cache results if configured
6. Emit events throughout the lifecycle

Retry: max 3 parsing attempts with fallback methods (JSON, JSON5, AST, JSON repair).
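The fuzzy selection step can be pictured with stdlib `difflib` (illustrative only; `tool_usage.py` may use a different matcher):

```python
from difflib import SequenceMatcher

def select_tool(requested: str, tool_names: list[str], cutoff: float = 0.85) -> str | None:
    """Pick the best-matching tool name if it clears the similarity cutoff."""
    scored = [
        (SequenceMatcher(None, requested.lower(), name.lower()).ratio(), name)
        for name in tool_names
    ]
    best_ratio, best_name = max(scored)
    return best_name if best_ratio >= cutoff else None

select_tool("file read", ["FileRead", "FileWriter"])  # tolerant of near-miss names
```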
### Memory Tools (`tools/memory_tools.py`)

- RecallMemoryTool: searches memory with single/multiple queries, returns formatted results with deduplication
- RememberTool: stores facts/decisions, infers scope/categories/importance
- CalculatorTool: safe arithmetic via an AST parser (no `eval()`; technique sketched below), supports date differences
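The no-`eval()` technique behind CalculatorTool can be sketched with the stdlib `ast` module (an illustration of the approach, not the tool's actual code):

```python
import ast
import operator

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate arithmetic by walking the AST; anything but numbers/operators raises."""
    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression: {ast.dump(node)}")
    return walk(ast.parse(expr, mode="eval").body)

assert safe_eval("2 * (3 + 4)") == 14
```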
### MCP Integration

- MCPToolWrapper (`tools/mcp_tool_wrapper.py`): on-demand connections, retry with exponential backoff, timeouts (15s connect, 60s execute)
- MCPNativeTool (`tools/mcp_native_tool.py`): reuses persistent MCP sessions, auto-reconnects on event loop changes
## 7. Event System
Key files: `events/event_bus.py`, `events/event_listener.py`, `events/base_events.py`, `events/types/`
### Event Bus (`events/event_bus.py`)

Singleton `CrewAIEventsBus`. Thread-safe with an RWLock. Supports sync handlers (ThreadPoolExecutor, 10 workers) and async handlers (dedicated daemon event loop). Handler dependency injection via `Depends()`.

Key methods: `emit(source, event)`, `aemit()`, `flush(timeout=30)`, `register_handler()`, `scoped_handlers()` (context manager for temporary handlers).
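Registering a handler might look like this (hedged: the singleton access, the event-type module path, and the `register_handler` signature are all assumptions; only the method names above are documented):

```python
from crewai.events.event_bus import CrewAIEventsBus    # path per "Key files" above
from crewai.events.types import LLMCallCompletedEvent  # assumption: event module path

bus = CrewAIEventsBus()  # assumption: constructing returns the shared singleton

def on_llm_done(source, event):
    print(f"LLM call finished at {event.timestamp}")

bus.register_handler(LLMCallCompletedEvent, on_llm_done)  # signature assumed
```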
### Event Types (`events/types/`)

- Tool events: `ToolUsageStartedEvent`, `ToolUsageFinishedEvent`, `ToolUsageErrorEvent`, `ToolValidateInputErrorEvent`, `ToolSelectionErrorEvent`
- LLM events: `LLMCallStartedEvent`, `LLMCallCompletedEvent`, `LLMCallFailedEvent`, `LLMStreamChunkEvent`, `LLMThinkingChunkEvent`
- Agent/Task/Crew events: lifecycle tracking (started, completed, failed)
- Flow events: method execution states, paused, input requested/received
- Memory events: retrieval started/completed/failed
- MCP events: connection, tool execution
- A2A events: agent-to-agent delegation
All events carry: UUID, timestamp, parent/previous chain, fingerprint context.
## 8. LLM Abstraction
Key files: `llm.py`, `llms/base_llm.py`, `llms/providers/`
### BaseLLM (`llms/base_llm.py`)

Abstract interface: `call(messages, tools, ...)` and `acall(...)`. Provider-specific constants for context window sizes (roughly 1K to 2M tokens). Emits LLM events. Handles context window management, timeout/auth errors, streaming.
### LLM class (`llm.py`)

High-level wrapper integrating with litellm for multi-provider support. Handles model identification, tool/function calling, JSON-schema responses, streaming chunk aggregation, and multimodal content formatting.
### Providers (`llms/providers/`)

Per-provider adapters: OpenAI, Azure, Gemini, Claude/Anthropic, Bedrock, Watson, etc.
## 9. crewai-tools Package

Location: `lib/crewai-tools/`

93+ pre-built tools. All inherit from `crewai.tools.BaseTool`.
Pattern for creating tools:

```python
from crewai.tools import BaseTool
from pydantic import BaseModel, Field

class MyToolSchema(BaseModel):
    param: str = Field(..., description="...")

class MyTool(BaseTool):
    name: str = "My Tool"
    description: str = "..."
    args_schema: type[BaseModel] = MyToolSchema

    def _run(self, param: str) -> str:
        return f"processed {param}"  # tool result returned to the agent
```
Tool categories:
- Search/Web: BraveSearch, Tavily, EXASearch, Serper, Spider, SerpAPI
- Scraping: Firecrawl, Jina, Scrapfly, Selenium, Browserbase, Stagehand
- File search: PDF, CSV, JSON, XML, MDX, DOCX, TXT search tools
- Database: MySQL, Snowflake, SingleStore, MongoDB, Qdrant, Weaviate, Couchbase
- File I/O: FileRead, FileWriter, DirectoryRead, DirectorySearch, FileCompressor, OCR, Vision
- Code: CodeInterpreter, CodeDocsSearch, NL2SQL, DallE
- AWS: Bedrock agent/KB, S3 reader/writer
- Integrations: Composio, Zapier, MCP, LlamaIndex, GitHub
- RAG: RagTool base with 17 loaders (CSV, Directory, Docs, DOCX, GitHub, JSON, MySQL, Postgres, etc.)
43+ optional dependency groups for external services.
## 10. crewai-files Package

Location: `lib/crewai-files/`

Multimodal file handling for LLM providers.
Structure:

- `core/` — file type classes (Image, PDF, Audio, Video, Text), source types (FilePath, FileBytes, FileUrl, FileStream), resolved representations
- `processing/` — FileProcessor validates against per-provider constraints, optional transforms (resize, compress, chunk)
- `uploaders/` — provider-specific uploaders (Anthropic, OpenAI, Gemini, Bedrock/S3)
- `formatting/` — formats files for provider APIs: `format_multimodal_content()`, `aformat_multimodal_content()`
- `resolution/` — FileResolver decides inline base64 vs. upload based on size/provider
- `cache/` — UploadCache tracks uploads by content hash, cleanup utilities
Provider constraints: max file sizes, supported formats, image dimensions per provider (Anthropic, OpenAI, Gemini, Bedrock).
## 11. CLI & Project Scaffolding

Key file: `cli/cli.py` (Click-based)
Core commands:

- `crewai create <crew|flow> <name>` — scaffold a project
- `crewai run` / `crewai flow kickoff` — execute a crew/flow
- `crewai chat` — interactive conversation with a crew
- `crewai train [-n N]` / `crewai test [-n N] [-m MODEL]` — training and evaluation
- `crewai replay [-t TASK_ID]` — replay from a specific task

Memory/config:

- `crewai reset_memories` — reset memory, knowledge, or all
- `crewai memory` — open the Memory TUI
- `crewai config list|set|reset` — CLI configuration
Deployment:

- `crewai deploy create|list|push|status|logs|remove`

Tool repository:

- `crewai tool create|install|publish`

Flow-specific:

- `crewai flow kickoff|plot|add-crew`

Other: `crewai login`, `crewai org list|switch|current`, `crewai traces enable|disable|status`, `crewai env view`
## 12. Project Decorators

Key files: `project/crew_base.py`, `project/annotations.py`
### @CrewBase decorator

Applies the CrewBaseMeta metaclass. Auto-loads YAML configs (`config/agents.yaml`, `config/tasks.yaml`). Registers agent/task factory methods, MCP adapters, lifecycle hooks.
### Method decorators (`project/annotations.py`)

Component factories (all memoized):

- `@agent` — agent factory method
- `@task` — task factory method
- `@llm` — LLM provider factory
- `@tool` — tool factory
- `@callback`, `@cache_handler`

Lifecycle:

- `@before_kickoff` / `@after_kickoff` — pre/post execution hooks
- `@crew` — main crew entry point (instantiates agents/tasks, manages callbacks)

Output format: `@output_json`, `@output_pydantic`

LLM/tool hooks (optional agent/tool filtering; full scaffold sketch below):

- `@before_llm_call_hook` / `@after_llm_call_hook`
- `@before_tool_call_hook` / `@after_tool_call_hook`
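These pieces combine in the standard scaffold shape (the same pattern `crewai create crew` generates; the YAML keys are illustrative):

```python
from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task

@CrewBase
class ResearchCrew:
    """Agents/tasks are loaded from config/agents.yaml and config/tasks.yaml."""

    @agent
    def researcher(self) -> Agent:
        return Agent(config=self.agents_config["researcher"])

    @task
    def research_task(self) -> Task:
        return Task(config=self.tasks_config["research_task"])

    @crew
    def crew(self) -> Crew:
        # self.agents / self.tasks are collected from the decorated factories.
        return Crew(agents=self.agents, tasks=self.tasks, process=Process.sequential)
```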
## 13. Knowledge & RAG

Key files: `knowledge/knowledge.py`, `rag/`
### Knowledge class

Vector store integration: `query(queries, results_limit, score_threshold)`, `add_sources()`, `reset()`. Async variants available. Used by agents via the `knowledge_sources` parameter.
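Agent wiring sketch (hedged: `StringKnowledgeSource` and its import path come from CrewAI's broader knowledge-source family, not from this section):

```python
from crewai import Agent
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource

policy = StringKnowledgeSource(content="Refunds are issued within 30 days of purchase.")
support = Agent(
    role="Support Agent",
    goal="Answer policy questions accurately",
    backstory="Knows company policy by heart.",
    knowledge_sources=[policy],  # queried automatically during execute_task()
)
```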
### RAG system (`rag/`)

- Vector DBs: ChromaDB, Qdrant (client wrappers, factories, config)
- Embeddings: 25+ providers (OpenAI, Cohere, HuggingFace, Jina, Voyage, Ollama, Bedrock, Azure, Vertex, etc.)
- Core: `BaseClient`, `BaseEmbeddingsProvider` abstractions
- Storage: `BaseRAGStorage` interface
## 14. Security & Fingerprinting

Key files: `security/security_config.py`, `security/fingerprint.py`

- SecurityConfig: manages component fingerprints, serialization
- Fingerprint: dual identifiers (human-readable ID + UUID), `uuid5()` with a CrewAI namespace for deterministic seeding (sketched below), metadata support (1-level nesting, 10 KB limit), timestamp tracking
- Every event carries fingerprint context for audit trails
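Deterministic seeding via `uuid5()` in miniature (stdlib only; the real CrewAI namespace constant is internal, so a placeholder is used here):

```python
import uuid

CREWAI_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "crewai.example")  # placeholder namespace

def fingerprint_for(seed: str) -> uuid.UUID:
    """Same seed -> same UUID, so components keep stable identities across runs."""
    return uuid.uuid5(CREWAI_NAMESPACE, seed)

assert fingerprint_for("researcher-agent") == fingerprint_for("researcher-agent")
```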
## 15. Agent-to-Agent (A2A)

Key files: `a2a/config.py`, `a2a/`

Protocol for inter-agent communication:

- `A2AClientConfig`, `A2AServerConfig` — configuration
- `AgentCardSigningConfig` — JWS signing (RS256, ES256, PS256)
- `GRPCServerConfig` — gRPC transport with TLS
- Supporting: `auth/`, `updates/` (polling/push/streaming), `extensions/`, `utils/`
## 16. Translations & i18n

Key file: `translations/en.json`

All agent-facing prompts are externalized. Key sections:

- `slices/` — agent prompting templates (task, memory, role_playing, tools, format, final_answer_format)
- `errors/` — tool execution, validation, format violation, guardrail failure messages
- `tools/` — tool descriptions (delegate_work, ask_question, recall_memory, calculator, save_to_memory)
- `memory/` — query analysis, extraction rules, consolidation logic, temporal reasoning
- HITL prompts — pre-review, lesson distillation
- Lite agent prompts — system prompts with/without tools