EXPANDED_CLAUDE.md (crewAI/EXPANDED_CLAUDE.md, last updated 2026-03-09)

Deep architectural reference for the CrewAI codebase. See CLAUDE.md for quick-start commands and overview.

Table of Contents

  1. Execution Flow
  2. Agent System
  3. Task System
  4. Flow System
  5. Memory System
  6. Tool System
  7. Event System
  8. LLM Abstraction
  9. crewai-tools Package
  10. crewai-files Package
  11. CLI & Project Scaffolding
  12. Project Decorators
  13. Knowledge & RAG
  14. Security & Fingerprinting
  15. Agent-to-Agent (A2A)
  16. Translations & i18n

1. Execution Flow

The end-to-end path from Crew.kickoff() to final output:

Crew.kickoff(inputs)
├── prepare_kickoff()                    # Validate inputs, store files
├── Determine process type
│   ├── Sequential: _run_sequential_process()
│   └── Hierarchical: _run_hierarchical_process() → _create_manager_agent()
├── _execute_tasks(tasks)                # Main loop
│   └── For each task:
│       ├── If ConditionalTask: check condition(previous_output)
│       ├── If async_execution: create asyncio task
│       └── If sync: task.execute_sync(agent, context, tools)
│           └── agent.execute_task(task, context, tools)
│               ├── Memory recall (if enabled)
│               ├── Knowledge retrieval (if enabled)
│               ├── Build prompt with context
│               └── CrewAgentExecutor.invoke()
│                   └── Loop until AgentFinish:
│                       ├── Native tool calling (if LLM supports)
│                       └── OR ReAct text pattern (fallback)
├── Apply guardrails with retries
├── after_kickoff_callbacks()
└── Return CrewOutput

Process types:

  • Sequential: Tasks execute in order; each gets context from all prior TaskOutputs
  • Hierarchical: A manager agent delegates to other agents via delegation tools

Agent execution loop (agents/crew_agent_executor.py):

  • Native function calling: LLM returns structured tool_calls; executor runs first tool, appends result, loops
  • ReAct text pattern (fallback): LLM outputs Thought/Action/Action Input; executor parses text, runs tool, appends Observation

2. Agent System

Key files: agent/core.py, agents/agent_builder/base_agent.py, agents/crew_agent_executor.py

Agent class (agent/core.py)

Extends BaseAgent. Core fields:

  • role, goal, backstory — define agent identity/prompting
  • llm — BaseLLM instance (auto-created from string)
  • function_calling_llm — optional specialized LLM for tool calls
  • tools — list of BaseTool instances
  • memory — optional unified Memory instance
  • knowledge_sources — optional knowledge base
  • max_iter (default 25), max_rpm, max_retry_limit (default 2)
  • allow_delegation — enables delegation tools
  • reasoning — enables planning before execution
  • guardrail — validation function for output
  • code_execution_mode — "safe" (Docker) or "unsafe" (local)
  • apps — platform integrations (Asana, GitHub, Slack, etc.)
  • mcps — MCP server configurations

BaseAgent (agents/agent_builder/base_agent.py)

Abstract base with: id (UUID4), agent_executor, cache_handler, tools_handler, security_config, i18n. Defines abstract methods: execute_task(), create_agent_executor(), get_delegation_tools(), get_platform_tools().

CrewAgentExecutor (agents/crew_agent_executor.py)

The agent execution loop. Key attributes: llm, task, crew, agent, prompt, tools, messages, iterations, max_iter, respect_context_window. Entry point: invoke(inputs) → _invoke_loop().

LiteAgent (lite_agent.py)

Lightweight alternative agent implementation with: event-driven execution, memory integration, LLM hooks, guardrail support, structured output via Converter.


3. Task System

Key files: task.py, tasks/task_output.py, tasks/conditional_task.py

Task class (task.py)

Core fields:

  • description, expected_output — task prompt and LLM guidance
  • agent — assigned BaseAgent
  • tools — optional task-specific tools (override agent tools)
  • context — list of prior Tasks whose output provides context
  • output_file, output_pydantic, output_json — output format
  • guardrail + guardrail_max_retries (default 3) — output validation
  • async_execution — run in background thread
  • human_input — request human feedback
  • callback — post-completion callback

TaskOutput (tasks/task_output.py)

Result container: raw (text), pydantic (model instance), json_dict, agent (role string), output_format, messages.

ConditionalTask (tasks/conditional_task.py)

Extends Task with condition: Callable[[TaskOutput], bool]. Evaluates against previous output; if False, appends empty TaskOutput and skips. Cannot be first/only task or async.


4. Flow System

Key files: flow/flow.py, flow/flow_wrappers.py, flow/persistence/, flow/human_feedback.py

Flow class (flow/flow.py)

Generic Flow[T] where T is dict or Pydantic BaseModel (must have id field). Uses FlowMeta metaclass to register decorators at class definition.

Decorator API:

@start(condition=None)          # Entry point (unconditional or conditional)
@listen(condition)              # Event handler (fires when condition met)
@router(condition)              # Decision point (return value becomes trigger)
@human_feedback(message, emit)  # Collect human feedback, optionally route

or_(*conditions)                # Fire when ANY condition met
and_(*conditions)               # Fire when ALL conditions met

Execution model:

  1. Execute all unconditional @start methods in parallel
  2. After each method completes: find triggered routers (sequential), then listeners (parallel)
  3. Continue chain until no more triggers

Key rules:

  • Routers are sequential; listeners are parallel
  • OR listeners fire once on first trigger; AND listeners wait for all
  • State access is thread-safe via StateProxy with _state_lock
  • Cyclic flows: methods cleared from _completed_methods to allow re-execution
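The OR/AND trigger semantics can be illustrated with a minimal sketch (class and method names are hypothetical; the real logic lives in flow/flow.py and is thread-safe):

```python
class Listener:
    """Fires once. OR mode: any trigger suffices. AND mode: all triggers required."""

    def __init__(self, triggers: set[str], mode: str = "or") -> None:
        self.triggers = triggers
        self.mode = mode
        self.seen: set[str] = set()
        self.fired = False

    def notify(self, completed_method: str) -> bool:
        """Record a completed method; return True if the listener fires now."""
        if self.fired or completed_method not in self.triggers:
            return False
        self.seen.add(completed_method)
        ready = self.mode == "or" or self.seen == self.triggers
        if ready:
            self.fired = True  # listeners fire at most once
        return ready

any_done = Listener({"fetch", "parse"}, mode="or")
all_done = Listener({"fetch", "parse"}, mode="and")
print(any_done.notify("fetch"))   # True  (OR fires on the first trigger)
print(all_done.notify("fetch"))   # False (AND waits for both)
print(all_done.notify("parse"))   # True
```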

Persistence (flow/persistence/)

  • FlowPersistence ABC: save_state(), load_state(), save_pending_feedback(), load_pending_feedback()
  • SQLiteFlowPersistence: stores in ~/.crewai/flows.db
  • Enables resumption via Flow.from_pending(flow_id, persistence)

Human Feedback (flow/human_feedback.py)

@human_feedback decorator wraps method to collect feedback. With emit parameter, acts as router (LLM collapses feedback to outcome). Supports async providers that raise HumanFeedbackPending to pause flow. Optional learn=True stores lessons in memory.

Flow Methods

  • kickoff(inputs) / akickoff(inputs) — sync/async execution
  • resume(feedback) / resume_async(feedback) — resume from pause
  • ask(message, timeout) — request user input (auto-checkpoints state)
  • state — thread-safe state proxy
  • recall(query) / remember(content) — memory integration

5. Memory System

Key files: memory/unified_memory.py, memory/types.py, memory/encoding_flow.py, memory/recall_flow.py, memory/memory_scope.py, memory/analyze.py, memory/storage/

Memory class (memory/unified_memory.py)

Singleton-style with lazy LLM/embedder init. Pluggable storage backend (default LanceDB). Background save queue via ThreadPoolExecutor(max_workers=1).

Public API:

  • Write: remember(content, scope, categories, importance, ...), remember_many(contents, ...) (non-blocking batch)
  • Read: recall(query, scope, categories, limit, depth="shallow"|"deep")
  • Manage: forget(scope, categories, older_than, ...), update(record_id, ...), drain_writes()
  • Scoping: scope(path) → MemoryScope, slice(scopes, read_only) → MemorySlice
  • Introspection: list_scopes(), list_records(), list_categories(), info(), tree()

Configuration:

  • Scoring weights: semantic_weight=0.5, recency_weight=0.3, importance_weight=0.2
  • recency_half_life_days=30 — exponential decay
  • consolidation_threshold=0.85 — dedup trigger similarity
  • confidence_threshold_high=0.8, confidence_threshold_low=0.5 — recall routing
  • exploration_budget=1 — LLM exploration rounds for deep recall

Data Types (memory/types.py)

  • MemoryRecord: id, content, scope (hierarchical path like /company/team), categories, metadata, importance (0-1), created_at, last_accessed, embedding, source, private
  • MemoryMatch: record, score (composite), match_reasons, evidence_gaps
  • ScopeInfo: path, record_count, categories, date range, child_scopes

Composite scoring formula:

score = semantic_weight × similarity + recency_weight × (0.5 ^ (age_days / half_life)) + importance_weight × importance
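Transcribed into Python with the default weights from the configuration above (the helper name is ours):

```python
def composite_score(
    similarity: float,
    age_days: float,
    importance: float,
    semantic_weight: float = 0.5,
    recency_weight: float = 0.3,
    importance_weight: float = 0.2,
    half_life_days: float = 30.0,
) -> float:
    """Blend semantic similarity, exponential recency decay, and importance."""
    recency = 0.5 ** (age_days / half_life_days)  # halves every 30 days
    return (
        semantic_weight * similarity
        + recency_weight * recency
        + importance_weight * importance
    )

# A fresh, maximally similar, maximally important record scores 1.0:
print(composite_score(similarity=1.0, age_days=0, importance=1.0))   # 1.0
# After one half-life the recency term halves, dropping the score to 0.85:
print(composite_score(similarity=1.0, age_days=30, importance=1.0))  # 0.85
```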

Encoding Flow (memory/encoding_flow.py)

5-step batch pipeline on save:

  1. Batch embed all items (single API call)
  2. Intra-batch dedup via cosine similarity matrix (threshold 0.98)
  3. Parallel find similar records in storage (8 workers)
  4. Parallel analyze — Groups: A (insert, 0 LLM), B (consolidation, 1 LLM), C (save analysis, 1 LLM), D (both, 2 LLM) — 10 workers
  5. Execute plans — batch re-embed, atomic storage mutations (delete + update + insert under write lock)
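Step 2 amounts to dropping any item whose embedding is nearly identical to an earlier item in the same batch. A minimal sketch with plain lists (the real pipeline computes a full similarity matrix; function names are ours):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def dedup_batch(embeddings: list[list[float]], threshold: float = 0.98) -> list[int]:
    """Return indices of items to keep; later near-duplicates are dropped."""
    keep: list[int] = []
    for i, emb in enumerate(embeddings):
        if all(cosine(emb, embeddings[j]) < threshold for j in keep):
            keep.append(i)
    return keep

batch = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
print(dedup_batch(batch))  # [0, 2] -- the second vector duplicates the first
```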

Recall Flow (memory/recall_flow.py)

Adaptive recall pipeline:

  1. Analyze query — short queries skip LLM; long queries get sub-queries, scope suggestions, complexity classification, time filters
  2. Filter & chunk candidate scopes (max 20)
  3. Parallel search across queries × scopes (4 workers), apply filters, compute composite scores
  4. Route — high confidence → synthesize; low confidence + budget → explore deeper
  5. Recursive exploration (if deeper) — LLM extracts relevant info + gaps; decrements budget; re-searches
  6. Synthesize — deduplicate by ID, rank by composite score, return top N

Storage Backend (memory/storage/backend.py)

Protocol interface: save(), update(), delete(), search(), get_record(), list_records(), get_scope_info(), list_scopes(), list_categories(), count(), reset(), write_lock property.

LanceDB implementation (memory/storage/lancedb_storage.py): auto-detects vector dimensions, class-level shared RLock per DB path, auto-compaction every 100 saves, retry logic for commit conflicts (exponential backoff, 5 retries), oversamples 3x when filters present.
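The commit-conflict retry follows a standard exponential-backoff shape, roughly like this generic sketch (exception type, delays, and names are illustrative, not LanceDB's actual API):

```python
import time

class CommitConflict(Exception):
    """Stand-in for a storage-level concurrent-commit error."""

def with_retries(operation, max_retries: int = 5, base_delay: float = 0.05):
    """Run operation(); on conflict, back off exponentially and retry."""
    for attempt in range(max_retries):
        try:
            return operation()
        except CommitConflict:
            if attempt == max_retries - 1:
                raise  # budget exhausted -- surface the conflict
            time.sleep(base_delay * (2 ** attempt))  # 50ms, 100ms, 200ms, ...

calls = {"n": 0}
def flaky_commit():
    calls["n"] += 1
    if calls["n"] < 3:
        raise CommitConflict
    return "ok"

print(with_retries(flaky_commit))  # 'ok' after two simulated conflicts
```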

Scoped Views (memory/memory_scope.py)

  • MemoryScope: wraps Memory with root_path prefix; all operations relative to that root
  • MemorySlice: multi-scope view; recall searches all scopes in parallel; optional read_only=True

6. Tool System

Key files: tools/base_tool.py, tools/structured_tool.py, tools/tool_calling.py, tools/tool_usage.py, tools/memory_tools.py

BaseTool (tools/base_tool.py)

Abstract Pydantic BaseModel. Key fields: name, description, args_schema (Pydantic model), result_as_answer, max_usage_count, cache_function. Subclasses implement _run(**kwargs) and optionally _arun(**kwargs).

@tool decorator: creates tool from function, auto-infers schema from type hints.

CrewStructuredTool (tools/structured_tool.py)

Wraps functions as structured tools for LLM function calling. from_function() factory. Validates inputs before execution, enforces usage limits.

Tool Execution Flow (tools/tool_usage.py)

ToolUsage manages selection → validation → execution:

  1. Parse tool call from LLM output
  2. Select tool (fuzzy matching, 85%+ ratio)
  3. Validate arguments against schema
  4. Execute with fingerprint metadata
  5. Cache results if configured
  6. Emit events throughout lifecycle

Retry: max 3 parsing attempts with fallback methods (JSON, JSON5, AST, JSON repair).

Memory Tools (tools/memory_tools.py)

  • RecallMemoryTool: searches memory with single/multiple queries, returns formatted results with deduplication
  • RememberTool: stores facts/decisions, infers scope/categories/importance
  • CalculatorTool: safe arithmetic via AST parser (no eval()), supports date differences
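The eval()-free arithmetic approach can be sketched with the ast module. This is our simplified version of the technique, not CalculatorTool's code (which also supports date differences):

```python
import ast
import operator

# Whitelist of permitted operators -- anything else is rejected.
OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate arithmetic by walking the AST -- no eval(), no names, no calls."""
    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("Unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

print(safe_eval("2 + 3 * 4"))  # 14
print(safe_eval("-(2 ** 3)"))  # -8
```

Because only Constant, BinOp, and UnaryOp nodes are accepted, attempts to smuggle in names or function calls raise ValueError instead of executing.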

MCP Integration

  • MCPToolWrapper (tools/mcp_tool_wrapper.py): on-demand connections, retry with exponential backoff, timeouts (15s connect, 60s execute)
  • MCPNativeTool (tools/mcp_native_tool.py): reuses persistent MCP sessions, auto-reconnect on event loop changes

7. Event System

Key files: events/event_bus.py, events/event_listener.py, events/base_events.py, events/types/

Event Bus (events/event_bus.py)

Singleton CrewAIEventsBus. Thread-safe with RWLock. Supports sync handlers (ThreadPoolExecutor, 10 workers) and async handlers (dedicated daemon event loop). Handler dependency injection via Depends().

Key methods: emit(source, event), aemit(), flush(timeout=30), register_handler(), scoped_handlers() (context manager for temporary handlers).
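The registration-and-emit pattern can be mimicked with a minimal synchronous bus. This toy sketch only shows the type-keyed dispatch shape; the real CrewAIEventsBus adds thread pools, a dedicated async loop, locking, and dependency injection:

```python
from collections import defaultdict
from typing import Callable

class MiniEventBus:
    """Single-threaded sketch: handlers keyed by event type, fired on emit."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)

    def on(self, event_type: type):
        """Decorator that registers a handler for one event type."""
        def register(handler: Callable) -> Callable:
            self._handlers[event_type].append(handler)
            return handler
        return register

    def emit(self, source: object, event: object) -> None:
        for handler in self._handlers[type(event)]:
            handler(source, event)

class ToolUsageStartedEvent:
    def __init__(self, tool_name: str) -> None:
        self.tool_name = tool_name

bus = MiniEventBus()
seen: list[str] = []

@bus.on(ToolUsageStartedEvent)
def log_tool(source, event):
    seen.append(event.tool_name)

bus.emit(None, ToolUsageStartedEvent("calculator"))
print(seen)  # ['calculator']
```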

Event Types (events/types/)

  • Tool events: ToolUsageStartedEvent, ToolUsageFinishedEvent, ToolUsageErrorEvent, ToolValidateInputErrorEvent, ToolSelectionErrorEvent
  • LLM events: LLMCallStartedEvent, LLMCallCompletedEvent, LLMCallFailedEvent, LLMStreamChunkEvent, LLMThinkingChunkEvent
  • Agent/Task/Crew events: lifecycle tracking (started, completed, failed)
  • Flow events: method execution states, paused, input requested/received
  • Memory events: retrieval started/completed/failed
  • MCP events: connection, tool execution
  • A2A events: agent-to-agent delegation

All events carry: UUID, timestamp, parent/previous chain, fingerprint context.


8. LLM Abstraction

Key files: llm.py, llms/base_llm.py, llms/providers/

BaseLLM (llms/base_llm.py)

Abstract interface: call(messages, tools, ...) and acall(...). Provider-specific constants for context windows (1K–2M tokens). Emits LLM events. Handles context window management, timeout/auth errors, streaming.

LLM class (llm.py)

High-level wrapper integrating with litellm for multi-provider support. Handles model identification, tool function calling, JSON schema responses, streaming chunk aggregation, multimodal content formatting.

Providers (llms/providers/)

Per-provider adapters: OpenAI, Azure, Gemini, Claude/Anthropic, Bedrock, Watson, etc.


9. crewai-tools Package

Location: lib/crewai-tools/

93+ pre-built tools. All inherit from crewai.tools.BaseTool.

Pattern for creating tools:

from pydantic import BaseModel, Field

from crewai.tools import BaseTool

class MyToolSchema(BaseModel):
    param: str = Field(..., description="...")

class MyTool(BaseTool):
    name: str = "My Tool"
    description: str = "..."
    args_schema: type[BaseModel] = MyToolSchema

    def _run(self, param: str) -> str:
        # Compute and return the tool's result as a string.
        return f"processed {param}"

Tool categories:

  • Search/Web: BraveSearch, Tavily, EXASearch, Serper, Spider, SerpAPI
  • Scraping: Firecrawl, Jina, Scrapfly, Selenium, Browserbase, Stagehand
  • File search: PDF, CSV, JSON, XML, MDX, DOCX, TXT search tools
  • Database: MySQL, Snowflake, SingleStore, MongoDB, Qdrant, Weaviate, Couchbase
  • File I/O: FileRead, FileWriter, DirectoryRead, DirectorySearch, FileCompressor, OCR, Vision
  • Code: CodeInterpreter, CodeDocsSearch, NL2SQL, DallE
  • AWS: Bedrock agent/KB, S3 reader/writer
  • Integrations: Composio, Zapier, MCP, LlamaIndex, GitHub
  • RAG: RagTool base with 17 loaders (CSV, Directory, Docs, DOCX, GitHub, JSON, MySQL, Postgres, etc.)

43+ optional dependency groups for external services.


10. crewai-files Package

Location: lib/crewai-files/

Multimodal file handling for LLM providers.

Structure:

  • core/ — File type classes (Image, PDF, Audio, Video, Text), source types (FilePath, FileBytes, FileUrl, FileStream), resolved representations
  • processing/ — FileProcessor validates against per-provider constraints, optional transforms (resize, compress, chunk)
  • uploaders/ — Provider-specific uploaders (Anthropic, OpenAI, Gemini, Bedrock/S3)
  • formatting/ — Format files for provider APIs: format_multimodal_content(), aformat_multimodal_content()
  • resolution/ — FileResolver decides inline base64 vs upload based on size/provider
  • cache/ — UploadCache tracks uploads by content hash, cleanup utilities

Provider constraints: max file sizes, supported formats, image dimensions per provider (Anthropic, OpenAI, Gemini, Bedrock).


11. CLI & Project Scaffolding

Key file: cli/cli.py (Click-based)

Core commands:

  • crewai create <crew|flow> <name> — scaffold project
  • crewai run / crewai flow kickoff — execute crew/flow
  • crewai chat — interactive conversation with crew
  • crewai train [-n N] / crewai test [-n N] [-m MODEL] — training and evaluation
  • crewai replay [-t TASK_ID] — replay from specific task

Memory/config:

  • crewai reset_memories — reset memory, knowledge, or all
  • crewai memory — open Memory TUI
  • crewai config list|set|reset — CLI configuration

Deployment:

  • crewai deploy create|list|push|status|logs|remove

Tool repository:

  • crewai tool create|install|publish

Flow-specific:

  • crewai flow kickoff|plot|add-crew

Other: crewai login, crewai org list|switch|current, crewai traces enable|disable|status, crewai env view


12. Project Decorators

Key files: project/crew_base.py, project/annotations.py

@CrewBase decorator

Applies CrewBaseMeta metaclass. Auto-loads YAML configs (config/agents.yaml, config/tasks.yaml). Registers agent/task factory methods, MCP adapters, lifecycle hooks.
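A typical config/agents.yaml entry pairs with an @agent factory method of the same name. The fragment below is illustrative (field names follow the Agent model in section 2; the role/goal/backstory values are made up):

```yaml
researcher:
  role: "Senior Researcher"
  goal: "Find accurate, current information on {topic}"
  backstory: "A meticulous analyst who always cites sources."
```

The corresponding factory method then typically builds the agent from the loaded config, e.g. Agent(config=self.agents_config["researcher"]), with {topic} interpolated from kickoff inputs.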

Method decorators (project/annotations.py)

Component factories (all memoized):

  • @agent — agent factory method
  • @task — task factory method
  • @llm — LLM provider factory
  • @tool — tool factory
  • @callback, @cache_handler

Lifecycle:

  • @before_kickoff / @after_kickoff — pre/post execution hooks
  • @crew — main crew entry point (instantiates agents/tasks, manages callbacks)

Output format: @output_json, @output_pydantic

LLM/Tool hooks (optional agent/tool filtering):

  • @before_llm_call_hook / @after_llm_call_hook
  • @before_tool_call_hook / @after_tool_call_hook

13. Knowledge & RAG

Key files: knowledge/knowledge.py, rag/

Knowledge class

Vector store integration: query(queries, results_limit, score_threshold), add_sources(), reset(). Async variants available. Used by agents via knowledge_sources parameter.

RAG system (rag/)

  • Vector DBs: ChromaDB, Qdrant (client wrappers, factories, config)
  • Embeddings: 25+ providers (OpenAI, Cohere, HuggingFace, Jina, Voyage, Ollama, Bedrock, Azure, Vertex, etc.)
  • Core: BaseClient, BaseEmbeddingsProvider abstractions
  • Storage: BaseRAGStorage interface

14. Security & Fingerprinting

Key files: security/security_config.py, security/fingerprint.py

  • SecurityConfig: manages component fingerprints, serialization
  • Fingerprint: dual identifiers (human-readable ID + UUID), uuid5() with CrewAI namespace for deterministic seeding, metadata support (1-level nesting, 10KB limit), timestamp tracking
  • Every event carries fingerprint context for audit trails
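Deterministic seeding with uuid5() works as below. The namespace value here is illustrative; the actual CrewAI namespace UUID is defined in security/fingerprint.py:

```python
import uuid

# Hypothetical namespace -- CrewAI defines its own constant internally.
CREWAI_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "crewai.com")

def deterministic_fingerprint(seed: str) -> uuid.UUID:
    """Same seed + namespace always yields the same UUID (RFC 4122, version 5)."""
    return uuid.uuid5(CREWAI_NAMESPACE, seed)

a = deterministic_fingerprint("agent:researcher")
b = deterministic_fingerprint("agent:researcher")
print(a == b)  # True -- stable across processes and runs
```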

15. Agent-to-Agent (A2A)

Key files: a2a/config.py, a2a/

Protocol for inter-agent communication:

  • A2AClientConfig, A2AServerConfig — configuration
  • AgentCardSigningConfig — JWS signing (RS256, ES256, PS256)
  • GRPCServerConfig — gRPC transport with TLS
  • Supporting: auth/, updates/ (polling/push/streaming), extensions/, utils/

16. Translations & i18n

Key file: translations/en.json

All agent-facing prompts are externalized. Key sections:

  • slices/ — agent prompting templates (task, memory, role_playing, tools, format, final_answer_format)
  • errors/ — tool execution, validation, format violation, guardrail failure messages
  • tools/ — tool descriptions (delegate_work, ask_question, recall_memory, calculator, save_to_memory)
  • memory/ — query analysis, extraction rules, consolidation logic, temporal reasoning
  • HITL prompts — pre-review, lesson distillation
  • Lite agent prompts — system prompts with/without tools