Mirror of https://github.com/crewAIInc/crewAI.git (synced 2026-04-09 04:28:16 +00:00)

Compare commits: 5 commits on branch `devin/1774...` by joaomdmour

Commit SHAs: `5e08e03e43`, `88ad3a3ac4`, `a3ea6d280a`, `9d09a173e6`, `c06aa4d476`
## `.github/workflows/docs-stale-check.yml` (new file, vendored, 68 lines)
```yaml
name: Check EXPANDED_CLAUDE.md freshness

on:
  pull_request:
    paths:
      - "lib/crewai/src/crewai/crew.py"
      - "lib/crewai/src/crewai/task.py"
      - "lib/crewai/src/crewai/llm.py"
      - "lib/crewai/src/crewai/lite_agent.py"
      - "lib/crewai/src/crewai/agent/**"
      - "lib/crewai/src/crewai/agents/**"
      - "lib/crewai/src/crewai/flow/**"
      - "lib/crewai/src/crewai/memory/**"
      - "lib/crewai/src/crewai/tools/**"
      - "lib/crewai/src/crewai/events/**"
      - "lib/crewai/src/crewai/llms/**"
      - "lib/crewai/src/crewai/knowledge/**"
      - "lib/crewai/src/crewai/rag/**"
      - "lib/crewai/src/crewai/security/**"
      - "lib/crewai/src/crewai/a2a/**"
      - "lib/crewai/src/crewai/cli/**"
      - "lib/crewai/src/crewai/project/**"
      - "lib/crewai/src/crewai/translations/**"
      - "lib/crewai-tools/src/**"
      - "lib/crewai-files/src/**"

jobs:
  check-docs:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Check if EXPANDED_CLAUDE.md was updated
        id: check
        run: |
          # Escape the dot so grep matches the literal filename only
          if git diff --name-only origin/${{ github.base_ref }}...HEAD | grep -q "^EXPANDED_CLAUDE\.md$"; then
            echo "updated=true" >> "$GITHUB_OUTPUT"
          else
            echo "updated=false" >> "$GITHUB_OUTPUT"
          fi

      - name: Comment on PR
        if: steps.check.outputs.updated == 'false'
        uses: actions/github-script@v7
        with:
          script: |
            const marker = '<!-- docs-stale-check -->';
            const body = `${marker}\n**Heads up:** This PR changes core source files but \`EXPANDED_CLAUDE.md\` wasn't updated. If the changes affect architecture (new modules, changed APIs, renamed classes), consider running \`/update-docs\` in Claude Code before merging.`;

            const { data: comments } = await github.rest.issues.listComments({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
            });

            const existing = comments.find(c => c.body.includes(marker));
            if (!existing) {
              await github.rest.issues.createComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                issue_number: context.issue.number,
                body,
              });
            }
```
## `CLAUDE.md` (new file, 92 lines)
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

CrewAI is a standalone Python framework for orchestrating autonomous AI agents. It provides two complementary paradigms: **Crews** (autonomous agent teams) and **Flows** (event-driven workflows). This is a **UV workspace monorepo**.

## Repository Structure

```
lib/
├── crewai/        # Core framework (agents, tasks, crews, flows, memory, tools, LLMs)
├── crewai-tools/  # Pre-built tool library (70+ tools)
├── crewai-files/  # Multimodal file handling (cache, processing, uploading)
└── devtools/      # Internal dev utilities (version bumping)
```

Source code lives under `lib/<package>/src/` and tests under `lib/<package>/tests/`.

## Common Commands

```bash
# Install dependencies
uv lock && uv sync

# Run all tests (parallel by default via pytest-xdist)
uv run pytest

# Run a single test file
uv run pytest lib/crewai/tests/memory/test_unified_memory.py

# Run a single test
uv run pytest lib/crewai/tests/memory/test_unified_memory.py::test_function_name -x

# Run tests for a specific workspace member
uv run pytest lib/crewai/tests/
uv run pytest lib/crewai-tools/tests/
uv run pytest lib/crewai-files/tests/

# Linting and formatting (Ruff)
uv run ruff check lib/
uv run ruff format lib/

# Type checking (strict mypy)
uv run mypy lib/

# Pre-commit hooks
pre-commit install
pre-commit run --all-files
```

**Pytest defaults** (from pyproject.toml): `--tb=short -n auto --timeout=60 --dist=loadfile --block-network --import-mode=importlib`. Network is blocked in tests; use VCR cassettes for HTTP interactions.

## Deep Dive

For detailed architecture documentation on any subsystem, use `/deep-dive <subsystem>` (e.g. `/deep-dive memory`, `/deep-dive flow`). This pulls the relevant section from **[EXPANDED_CLAUDE.md](./EXPANDED_CLAUDE.md)**, which covers all major components, execution flows, data types, and integration patterns. To regenerate it after major changes, use `/update-docs`.

## Architecture

### Core modules (`lib/crewai/src/crewai/`)

- **`crew.py`** - `Crew` class: orchestrates agents executing tasks (sequential or hierarchical process)
- **`task.py`** - `Task` class: work units with description, expected output, assigned agent, guardrails
- **`agent/core.py`** - `Agent` class: autonomous entity with role/goal/backstory, LLM, tools, memory
- **`flow/flow.py`** - `Flow` class: event-driven workflows using `@start`, `@listen`, `@router` decorators
- **`llm.py`** + **`llms/`** - Provider-agnostic LLM abstraction with per-provider adapters (OpenAI, Gemini, Claude, Bedrock, etc.)
- **`memory/`** - Unified memory system (LanceDB-backed) with vector embeddings, encoding/recall flows, scope-based filtering
- **`tools/`** - Tool ecosystem: `BaseTool`, structured tools, MCP integration, memory tools
- **`events/`** - Central event bus for observability (agent, crew, flow, task, memory events)
- **`knowledge/`** - Knowledge base integration with multiple source types
- **`cli/`** - CLI for project scaffolding, deployment, and interactive crew chat
- **`utilities/`** - Shared helpers (prompt templates, schema utils, LLM utils, i18n, guardrails)

### Key patterns

- **Pydantic models** throughout for validation and type safety
- **Event-driven observability** via `events/event_bus.py` with sync/async handlers
- **Lazy loading** of heavy modules (Memory, EncodingFlow) via `__getattr__`
- **Pluggable storage** backends for memory (LanceDB default)
- **VCR cassettes** for recording/replaying HTTP interactions in tests
- **Translations** in `translations/en.json` for all agent-facing prompts

## Code Standards

- **Python 3.10+**, use modern syntax (`X | Y` unions, `collections.abc`, f-strings)
- **Ruff** for linting and formatting (E501 line length ignored)
- **mypy strict** mode: all functions need type annotations
- **Google-style docstrings**
- **No relative imports** (`ban-relative-imports = "all"` in Ruff config)
- **Commitizen** commit message format enforced via pre-commit
- Tests allow `assert`, unnecessary assignments, and hardcoded passwords (`S101`, `RET504`, `S105`, `S106` suppressed)
## `EXPANDED_CLAUDE.md` (new file, 482 lines)
# EXPANDED_CLAUDE.md

Deep architectural reference for the CrewAI codebase. See [CLAUDE.md](./CLAUDE.md) for quick-start commands and overview.

## Table of Contents

- [1. Execution Flow: Crew.kickoff() to Agent Output](#1-execution-flow)
- [2. Agent System](#2-agent-system)
- [3. Task System](#3-task-system)
- [4. Flow System](#4-flow-system)
- [5. Memory System](#5-memory-system)
- [6. Tool System](#6-tool-system)
- [7. Event System](#7-event-system)
- [8. LLM Abstraction](#8-llm-abstraction)
- [9. crewai-tools Package](#9-crewai-tools-package)
- [10. crewai-files Package](#10-crewai-files-package)
- [11. CLI & Project Scaffolding](#11-cli--project-scaffolding)
- [12. Project Decorators (@CrewBase)](#12-project-decorators)
- [13. Knowledge & RAG](#13-knowledge--rag)
- [14. Security & Fingerprinting](#14-security--fingerprinting)
- [15. Agent-to-Agent (A2A)](#15-agent-to-agent-a2a)
- [16. Translations & i18n](#16-translations--i18n)

---
## 1. Execution Flow

The end-to-end path from `Crew.kickoff()` to final output:

```
Crew.kickoff(inputs)
├── prepare_kickoff()              # Validate inputs, store files
├── Determine process type
│   ├── Sequential: _run_sequential_process()
│   └── Hierarchical: _run_hierarchical_process() → _create_manager_agent()
├── _execute_tasks(tasks)          # Main loop
│   └── For each task:
│       ├── If ConditionalTask: check condition(previous_output)
│       ├── If async_execution: create asyncio task
│       └── If sync: task.execute_sync(agent, context, tools)
│           └── agent.execute_task(task, context, tools)
│               ├── Memory recall (if enabled)
│               ├── Knowledge retrieval (if enabled)
│               ├── Build prompt with context
│               └── CrewAgentExecutor.invoke()
│                   └── Loop until AgentFinish:
│                       ├── Native tool calling (if LLM supports)
│                       └── OR ReAct text pattern (fallback)
├── Apply guardrails with retries
├── after_kickoff_callbacks()
└── Return CrewOutput
```

**Process types:**

- **Sequential**: Tasks execute in order; each gets context from all prior TaskOutputs
- **Hierarchical**: A manager agent delegates to other agents via delegation tools

**Agent execution loop** (`agents/crew_agent_executor.py`):

- **Native function calling**: LLM returns structured `tool_calls`; executor runs first tool, appends result, loops
- **ReAct text pattern** (fallback): LLM outputs `Thought/Action/Action Input`; executor parses text, runs tool, appends `Observation`

---
## 2. Agent System

**Key files:** `agent/core.py`, `agents/agent_builder/base_agent.py`, `agents/crew_agent_executor.py`

### Agent class (`agent/core.py`)

Extends `BaseAgent`. Core fields:

- `role`, `goal`, `backstory` — define agent identity/prompting
- `llm` — BaseLLM instance (auto-created from string)
- `function_calling_llm` — optional specialized LLM for tool calls
- `tools` — list of BaseTool instances
- `memory` — optional unified Memory instance
- `knowledge_sources` — optional knowledge base
- `max_iter` (default 25), `max_rpm`, `max_retry_limit` (default 2)
- `allow_delegation` — enables delegation tools
- `reasoning` — enables planning before execution
- `guardrail` — validation function for output
- `code_execution_mode` — "safe" (Docker) or "unsafe" (local)
- `apps` — platform integrations (Asana, GitHub, Slack, etc.)
- `mcps` — MCP server configurations

### BaseAgent (`agents/agent_builder/base_agent.py`)

Abstract base with: `id` (UUID4), `agent_executor`, `cache_handler`, `tools_handler`, `security_config`, `i18n`. Defines abstract methods: `execute_task()`, `create_agent_executor()`, `get_delegation_tools()`, `get_platform_tools()`.

### CrewAgentExecutor (`agents/crew_agent_executor.py`)

The agent execution loop. Key attributes: `llm`, `task`, `crew`, `agent`, `prompt`, `tools`, `messages`, `iterations`, `max_iter`, `respect_context_window`. Entry point: `invoke(inputs)` → `_invoke_loop()`.

### LiteAgent (`lite_agent.py`)

Lightweight alternative agent implementation with: event-driven execution, memory integration, LLM hooks, guardrail support, structured output via Converter.

---
## 3. Task System

**Key files:** `task.py`, `tasks/task_output.py`, `tasks/conditional_task.py`

### Task class (`task.py`)

Core fields:

- `description`, `expected_output` — task prompt and LLM guidance
- `agent` — assigned BaseAgent
- `tools` — optional task-specific tools (override agent tools)
- `context` — list of prior Tasks whose output provides context
- `output_file`, `output_pydantic`, `output_json` — output format
- `guardrail` + `guardrail_max_retries` (default 3) — output validation
- `async_execution` — run in background thread
- `human_input` — request human feedback
- `callback` — post-completion callback

### TaskOutput (`tasks/task_output.py`)

Result container: `raw` (text), `pydantic` (model instance), `json_dict`, `agent` (role string), `output_format`, `messages`.

### ConditionalTask (`tasks/conditional_task.py`)

Extends Task with `condition: Callable[[TaskOutput], bool]`. Evaluates against previous output; if the condition returns False, appends an empty TaskOutput and skips. Cannot be the first/only task or async.

---
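A guardrail of the shape described above can be sketched in isolation. This is a hedged illustration: `Output` is a stand-in dataclass, not the real `TaskOutput`, and the `(ok, payload)` tuple convention follows the retry behavior this section describes:

```python
import json
from dataclasses import dataclass

@dataclass
class Output:
    """Stand-in for TaskOutput: only the raw text field is modeled."""
    raw: str

def validate_json_list(output: Output) -> tuple[bool, object]:
    """Pass only if the raw output parses as a non-empty JSON list.

    On failure the returned message would be fed back to the agent as
    retry guidance (up to guardrail_max_retries times).
    """
    try:
        data = json.loads(output.raw)
    except json.JSONDecodeError as exc:
        return False, f"Invalid JSON: {exc}"
    if not isinstance(data, list) or not data:
        return False, "Expected a non-empty JSON list"
    return True, data

ok, payload = validate_json_list(Output(raw='["a", "b"]'))
```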
## 4. Flow System

**Key files:** `flow/flow.py`, `flow/flow_wrappers.py`, `flow/persistence/`, `flow/human_feedback.py`

### Flow class (`flow/flow.py`)

Generic `Flow[T]` where T is `dict` or a Pydantic `BaseModel` (must have an `id` field). Uses the `FlowMeta` metaclass to register decorators at class definition time.

**Decorator API:**

```python
@start(condition=None)          # Entry point (unconditional or conditional)
@listen(condition)              # Event handler (fires when condition met)
@router(condition)              # Decision point (return value becomes trigger)
@human_feedback(message, emit)  # Collect human feedback, optionally route

or_(*conditions)   # Fire when ANY condition met
and_(*conditions)  # Fire when ALL conditions met
```

**Execution model:**

1. Execute all unconditional `@start` methods in parallel
2. After each method completes: find triggered routers (sequential), then listeners (parallel)
3. Continue the chain until no more triggers

**Key rules:**

- Routers are sequential; listeners are parallel
- OR listeners fire once on first trigger; AND listeners wait for all
- State access is thread-safe via `StateProxy` with `_state_lock`
- Cyclic flows: methods cleared from `_completed_methods` to allow re-execution

### Persistence (`flow/persistence/`)

- `FlowPersistence` ABC: `save_state()`, `load_state()`, `save_pending_feedback()`, `load_pending_feedback()`
- `SQLiteFlowPersistence`: stores in `~/.crewai/flows.db`
- Enables resumption via `Flow.from_pending(flow_id, persistence)`

### Human Feedback (`flow/human_feedback.py`)

`@human_feedback` decorator wraps a method to collect feedback. With the `emit` parameter, it acts as a router (an LLM collapses the feedback to an outcome). Supports async providers that raise `HumanFeedbackPending` to pause the flow. Optional `learn=True` stores lessons in memory.

### Flow Methods

- `kickoff(inputs)` / `akickoff(inputs)` — sync/async execution
- `resume(feedback)` / `resume_async(feedback)` — resume from pause
- `ask(message, timeout)` — request user input (auto-checkpoints state)
- `state` — thread-safe state proxy
- `recall(query)` / `remember(content)` — memory integration

---
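The "register decorators at class definition time" mechanism can be sketched with a toy metaclass. This mirrors only the shape of `FlowMeta`, not its implementation; all names in the sketch (`__trigger__`, `_triggers`, `Pipeline`) are invented:

```python
def start(fn):
    """Tag a method as an unconditional entry point."""
    fn.__trigger__ = "start"
    return fn

def listen(condition):
    """Tag a method to fire after the named condition method completes."""
    def wrap(fn):
        fn.__trigger__ = condition.__name__
        return fn
    return wrap

class FlowMeta(type):
    """Collect decorator tags into a class-level trigger table."""
    def __new__(mcs, name, bases, ns):
        cls = super().__new__(mcs, name, bases, ns)
        cls._triggers = {
            n: m.__trigger__
            for n, m in ns.items()
            if callable(m) and hasattr(m, "__trigger__")
        }
        return cls

class Pipeline(metaclass=FlowMeta):
    @start
    def fetch(self):
        return "data"

    @listen(fetch)  # inside the class body, `fetch` is already bound
    def process(self):
        return "done"
```

At `class Pipeline` definition, `FlowMeta.__new__` builds `Pipeline._triggers`, which a runtime loop could then consult to decide which methods to fire next.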
## 5. Memory System

**Key files:** `memory/unified_memory.py`, `memory/types.py`, `memory/encoding_flow.py`, `memory/recall_flow.py`, `memory/memory_scope.py`, `memory/analyze.py`, `memory/storage/`

### Memory class (`memory/unified_memory.py`)

Singleton-style with lazy LLM/embedder init. Pluggable storage backend (default LanceDB). Background save queue via `ThreadPoolExecutor(max_workers=1)`.

**Public API:**

- **Write:** `remember(content, scope, categories, importance, ...)`, `remember_many(contents, ...)` (non-blocking batch)
- **Read:** `recall(query, scope, categories, limit, depth="shallow"|"deep")`
- **Manage:** `forget(scope, categories, older_than, ...)`, `update(record_id, ...)`, `drain_writes()`
- **Scoping:** `scope(path)` → `MemoryScope`, `slice(scopes, read_only)` → `MemorySlice`
- **Introspection:** `list_scopes()`, `list_records()`, `list_categories()`, `info()`, `tree()`

**Configuration:**

- Scoring weights: `semantic_weight=0.5`, `recency_weight=0.3`, `importance_weight=0.2`
- `recency_half_life_days=30` — exponential decay
- `consolidation_threshold=0.85` — dedup trigger similarity
- `confidence_threshold_high=0.8`, `confidence_threshold_low=0.5` — recall routing
- `exploration_budget=1` — LLM exploration rounds for deep recall

### Data Types (`memory/types.py`)

- **MemoryRecord**: `id`, `content`, `scope` (hierarchical path like `/company/team`), `categories`, `metadata`, `importance` (0-1), `created_at`, `last_accessed`, `embedding`, `source`, `private`
- **MemoryMatch**: `record`, `score` (composite), `match_reasons`, `evidence_gaps`
- **ScopeInfo**: `path`, `record_count`, `categories`, date range, `child_scopes`

**Composite scoring formula:**

```
score = semantic_weight × similarity
      + recency_weight × (0.5 ^ (age_days / half_life))
      + importance_weight × importance
```
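The formula translates directly to code. A minimal sketch using the default weights listed above (the function name and keyword layout are illustrative, not the library's API):

```python
def composite_score(
    similarity: float,
    age_days: float,
    importance: float,
    *,
    semantic_weight: float = 0.5,
    recency_weight: float = 0.3,
    importance_weight: float = 0.2,
    half_life_days: float = 30.0,
) -> float:
    """Blend semantic similarity, exponential recency decay, and importance."""
    recency = 0.5 ** (age_days / half_life_days)
    return (
        semantic_weight * similarity
        + recency_weight * recency
        + importance_weight * importance
    )

# A record exactly one half-life old contributes half the recency weight:
# 0.5*0.8 + 0.3*0.5 + 0.2*1.0 = 0.75
score = composite_score(similarity=0.8, age_days=30, importance=1.0)
```

Because recency decays exponentially, a stale but highly similar record can still outrank a fresh low-similarity one, which is the intended trade-off of the weight split.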
### Encoding Flow (`memory/encoding_flow.py`)

5-step batch pipeline on save:

1. **Batch embed** all items (single API call)
2. **Intra-batch dedup** via cosine similarity matrix (threshold 0.98)
3. **Parallel find similar** records in storage (8 workers)
4. **Parallel analyze** — Groups: A (insert, 0 LLM), B (consolidation, 1 LLM), C (save analysis, 1 LLM), D (both, 2 LLM) — 10 workers
5. **Execute plans** — batch re-embed, atomic storage mutations (delete + update + insert under write lock)

### Recall Flow (`memory/recall_flow.py`)

Adaptive recall pipeline:

1. **Analyze query** — short queries skip LLM; long queries get sub-queries, scope suggestions, complexity classification, time filters
2. **Filter & chunk** candidate scopes (max 20)
3. **Parallel search** across queries × scopes (4 workers), apply filters, compute composite scores
4. **Route** — high confidence → synthesize; low confidence + budget → explore deeper
5. **Recursive exploration** (if deeper) — LLM extracts relevant info + gaps; decrements budget; re-searches
6. **Synthesize** — deduplicate by ID, rank by composite score, return top N

### Storage Backend (`memory/storage/backend.py`)

Protocol interface: `save()`, `update()`, `delete()`, `search()`, `get_record()`, `list_records()`, `get_scope_info()`, `list_scopes()`, `list_categories()`, `count()`, `reset()`, `write_lock` property.

**LanceDB implementation** (`memory/storage/lancedb_storage.py`): auto-detects vector dimensions, class-level shared RLock per DB path, auto-compaction every 100 saves, retry logic for commit conflicts (exponential backoff, 5 retries), oversamples 3x when filters are present.

### Scoped Views (`memory/memory_scope.py`)

- **MemoryScope**: wraps Memory with a root_path prefix; all operations are relative to that root
- **MemorySlice**: multi-scope view; recall searches all scopes in parallel; optional `read_only=True`

---
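The scoped-view idea reduces to path prefixing. A self-contained sketch (this `ScopedView` over a plain dict is illustrative only; the real `MemoryScope` wraps the Memory object and its storage backend):

```python
class ScopedView:
    """A view over a store that prefixes every path with its root,
    so callers work with paths relative to that root."""

    def __init__(self, store: dict[str, list[str]], root: str) -> None:
        self._store = store
        self._root = root.rstrip("/")

    def _abs(self, path: str) -> str:
        return f"{self._root}/{path.strip('/')}" if path else self._root

    def remember(self, content: str, path: str = "") -> None:
        self._store.setdefault(self._abs(path), []).append(content)

    def recall(self, path: str = "") -> list[str]:
        return self._store.get(self._abs(path), [])

store: dict[str, list[str]] = {}
team = ScopedView(store, "/company/team")
team.remember("standup is at 9am", path="rituals")
```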
## 6. Tool System

**Key files:** `tools/base_tool.py`, `tools/structured_tool.py`, `tools/tool_calling.py`, `tools/tool_usage.py`, `tools/memory_tools.py`

### BaseTool (`tools/base_tool.py`)

Abstract Pydantic BaseModel. Key fields: `name`, `description`, `args_schema` (Pydantic model), `result_as_answer`, `max_usage_count`, `cache_function`. Subclasses implement `_run(**kwargs)` and optionally `_arun(**kwargs)`.

**`@tool` decorator:** creates a tool from a function, auto-inferring the schema from type hints.

### CrewStructuredTool (`tools/structured_tool.py`)

Wraps functions as structured tools for LLM function calling. `from_function()` factory. Validates inputs before execution, enforces usage limits.

### Tool Execution Flow (`tools/tool_usage.py`)

`ToolUsage` manages selection → validation → execution:

1. Parse tool call from LLM output
2. Select tool (fuzzy matching, 85%+ ratio)
3. Validate arguments against schema
4. Execute with fingerprint metadata
5. Cache results if configured
6. Emit events throughout lifecycle

Retry: max 3 parsing attempts with fallback methods (JSON, JSON5, AST, JSON repair).
### Memory Tools (`tools/memory_tools.py`)

- **RecallMemoryTool**: searches memory with single/multiple queries, returns formatted results with deduplication
- **RememberTool**: stores facts/decisions, infers scope/categories/importance
- **CalculatorTool**: safe arithmetic via AST parser (no `eval()`), supports date differences

### MCP Integration

- **MCPToolWrapper** (`tools/mcp_tool_wrapper.py`): on-demand connections, retry with exponential backoff, timeouts (15s connect, 60s execute)
- **MCPNativeTool** (`tools/mcp_native_tool.py`): reuses persistent MCP sessions, auto-reconnect on event loop changes

---
## 7. Event System

**Key files:** `events/event_bus.py`, `events/event_listener.py`, `events/base_events.py`, `events/types/`

### Event Bus (`events/event_bus.py`)

Singleton `CrewAIEventsBus`. Thread-safe with RWLock. Supports sync handlers (ThreadPoolExecutor, 10 workers) and async handlers (dedicated daemon event loop). Handler dependency injection via `Depends()`.

**Key methods:** `emit(source, event)`, `aemit()`, `flush(timeout=30)`, `register_handler()`, `scoped_handlers()` (context manager for temporary handlers).

### Event Types (`events/types/`)

- **Tool events**: `ToolUsageStartedEvent`, `ToolUsageFinishedEvent`, `ToolUsageErrorEvent`, `ToolValidateInputErrorEvent`, `ToolSelectionErrorEvent`
- **LLM events**: `LLMCallStartedEvent`, `LLMCallCompletedEvent`, `LLMCallFailedEvent`, `LLMStreamChunkEvent`, `LLMThinkingChunkEvent`
- **Agent/Task/Crew events**: lifecycle tracking (started, completed, failed)
- **Flow events**: method execution states, paused, input requested/received
- **Memory events**: retrieval started/completed/failed
- **MCP events**: connection, tool execution
- **A2A events**: agent-to-agent delegation

All events carry: UUID, timestamp, parent/previous chain, fingerprint context.

---
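The bus pattern itself is small. A deliberately minimal synchronous sketch (the real `CrewAIEventsBus` adds thread pools, an async loop, locking, and dependency injection; `TinyEventBus` and `TaskCompleted` are invented names):

```python
from collections import defaultdict
from typing import Callable

class TinyEventBus:
    """Dispatch events to handlers registered per event type."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)

    def on(self, event_type: type) -> Callable:
        """Decorator that registers a handler for one event type."""
        def register(handler: Callable) -> Callable:
            self._handlers[event_type].append(handler)
            return handler
        return register

    def emit(self, source: object, event: object) -> None:
        for handler in self._handlers[type(event)]:
            handler(source, event)

class TaskCompleted:
    def __init__(self, name: str) -> None:
        self.name = name

bus = TinyEventBus()
seen: list[str] = []

@bus.on(TaskCompleted)
def log_completion(source: object, event: TaskCompleted) -> None:
    seen.append(event.name)

bus.emit("crew", TaskCompleted("research"))
```

Keying handlers by event type is what lets observability plug in without the emitting code knowing who is listening.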
## 8. LLM Abstraction

**Key files:** `llm.py`, `llms/base_llm.py`, `llms/providers/`

### BaseLLM (`llms/base_llm.py`)

Abstract interface: `call(messages, tools, ...)` and `acall(...)`. Provider-specific constants for context windows (1KB–2MB). Emits LLM events. Handles context window management, timeout/auth errors, streaming.

### LLM class (`llm.py`)

High-level wrapper integrating with litellm for multi-provider support. Handles model identification, tool function calling, JSON schema responses, streaming chunk aggregation, multimodal content formatting.

### Providers (`llms/providers/`)

Per-provider adapters: OpenAI, Azure, Gemini, Claude/Anthropic, Bedrock, Watson, etc.

---
## 9. crewai-tools Package

**Location:** `lib/crewai-tools/`

93+ pre-built tools. All inherit from `crewai.tools.BaseTool`.

**Pattern for creating tools:**

```python
from crewai.tools import BaseTool
from pydantic import BaseModel, Field


class MyToolSchema(BaseModel):
    param: str = Field(..., description="...")


class MyTool(BaseTool):
    name: str = "My Tool"
    description: str = "..."
    args_schema: type[BaseModel] = MyToolSchema

    def _run(self, param: str) -> str:
        return f"result for {param}"
```

**Tool categories:**

- **Search/Web**: BraveSearch, Tavily, EXASearch, Serper, Spider, SerpAPI
- **Scraping**: Firecrawl, Jina, Scrapfly, Selenium, Browserbase, Stagehand
- **File search**: PDF, CSV, JSON, XML, MDX, DOCX, TXT search tools
- **Database**: MySQL, Snowflake, SingleStore, MongoDB, Qdrant, Weaviate, Couchbase
- **File I/O**: FileRead, FileWriter, DirectoryRead, DirectorySearch, FileCompressor, OCR, Vision
- **Code**: CodeInterpreter, CodeDocsSearch, NL2SQL, DallE
- **AWS**: Bedrock agent/KB, S3 reader/writer
- **Integrations**: Composio, Zapier, MCP, LlamaIndex, GitHub
- **RAG**: RagTool base with 17 loaders (CSV, Directory, Docs, DOCX, GitHub, JSON, MySQL, Postgres, etc.)

**43+ optional dependency groups** for external services.

---
## 10. crewai-files Package

**Location:** `lib/crewai-files/`

Multimodal file handling for LLM providers.

**Structure:**

- `core/` — File type classes (Image, PDF, Audio, Video, Text), source types (FilePath, FileBytes, FileUrl, FileStream), resolved representations
- `processing/` — FileProcessor validates against per-provider constraints, optional transforms (resize, compress, chunk)
- `uploaders/` — Provider-specific uploaders (Anthropic, OpenAI, Gemini, Bedrock/S3)
- `formatting/` — Format files for provider APIs: `format_multimodal_content()`, `aformat_multimodal_content()`
- `resolution/` — FileResolver decides inline base64 vs upload based on size/provider
- `cache/` — UploadCache tracks uploads by content hash, cleanup utilities

**Provider constraints**: max file sizes, supported formats, image dimensions per provider (Anthropic, OpenAI, Gemini, Bedrock).

---
## 11. CLI & Project Scaffolding

**Key file:** `cli/cli.py` (Click-based)

**Core commands:**

- `crewai create <crew|flow> <name>` — scaffold project
- `crewai run` / `crewai flow kickoff` — execute crew/flow
- `crewai chat` — interactive conversation with crew
- `crewai train [-n N]` / `crewai test [-n N] [-m MODEL]` — training and evaluation
- `crewai replay [-t TASK_ID]` — replay from specific task

**Memory/config:**

- `crewai reset_memories` — reset memory, knowledge, or all
- `crewai memory` — open Memory TUI
- `crewai config list|set|reset` — CLI configuration

**Deployment:**

- `crewai deploy create|list|push|status|logs|remove`

**Tool repository:**

- `crewai tool create|install|publish`

**Flow-specific:**

- `crewai flow kickoff|plot|add-crew`

**Other:** `crewai login`, `crewai org list|switch|current`, `crewai traces enable|disable|status`, `crewai env view`

---
## 12. Project Decorators

**Key files:** `project/crew_base.py`, `project/annotations.py`

### @CrewBase decorator

Applies the `CrewBaseMeta` metaclass. Auto-loads YAML configs (`config/agents.yaml`, `config/tasks.yaml`). Registers agent/task factory methods, MCP adapters, lifecycle hooks.

### Method decorators (`project/annotations.py`)

**Component factories** (all memoized):

- `@agent` — agent factory method
- `@task` — task factory method
- `@llm` — LLM provider factory
- `@tool` — tool factory
- `@callback`, `@cache_handler`

**Lifecycle:**

- `@before_kickoff` / `@after_kickoff` — pre/post execution hooks
- `@crew` — main crew entry point (instantiates agents/tasks, manages callbacks)

**Output format:** `@output_json`, `@output_pydantic`

**LLM/Tool hooks** (optional agent/tool filtering):

- `@before_llm_call_hook` / `@after_llm_call_hook`
- `@before_tool_call_hook` / `@after_tool_call_hook`

---
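Why "all memoized" matters: a factory referenced both from a task definition and from the crew assembly must yield the same component instance. A toy sketch of the pattern (`memoize` and `MyCrew` are invented names, not the actual `@agent`/`@task` implementation):

```python
import functools

def memoize(factory):
    """Cache the factory's result per instance, so repeated calls
    from different wiring sites return the same object."""
    @functools.wraps(factory)
    def wrapper(self):
        cache = self.__dict__.setdefault("_memoized", {})
        if factory.__name__ not in cache:
            cache[factory.__name__] = factory(self)
        return cache[factory.__name__]
    return wrapper

class MyCrew:
    @memoize
    def researcher(self):
        # stands in for constructing an Agent from YAML config
        return object()

crew = MyCrew()
```

Per-instance caching (rather than a module-level cache) keeps two crew instances from sharing mutable agent state.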
## 13. Knowledge & RAG

**Key files:** `knowledge/knowledge.py`, `rag/`

### Knowledge class

Vector store integration: `query(queries, results_limit, score_threshold)`, `add_sources()`, `reset()`. Async variants available. Used by agents via the `knowledge_sources` parameter.

### RAG system (`rag/`)

- **Vector DBs**: ChromaDB, Qdrant (client wrappers, factories, config)
- **Embeddings**: 25+ providers (OpenAI, Cohere, HuggingFace, Jina, Voyage, Ollama, Bedrock, Azure, Vertex, etc.)
- **Core**: `BaseClient`, `BaseEmbeddingsProvider` abstractions
- **Storage**: `BaseRAGStorage` interface

---
## 14. Security & Fingerprinting

**Key files:** `security/security_config.py`, `security/fingerprint.py`

- **SecurityConfig**: manages component fingerprints, serialization
- **Fingerprint**: dual identifiers (human-readable ID + UUID), `uuid5()` with a CrewAI namespace for deterministic seeding, metadata support (1-level nesting, 10KB limit), timestamp tracking
- Every event carries fingerprint context for audit trails

---
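The deterministic-seeding idea rests on `uuid.uuid5`: hashing a name under a fixed namespace always yields the same UUID. A sketch (the namespace constant here is hypothetical, not CrewAI's actual one):

```python
import uuid

# Hypothetical namespace; the real one lives in security/fingerprint.py.
CREWAI_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "crewai.example")

def fingerprint(seed: str) -> uuid.UUID:
    """Derive a stable, collision-resistant identifier from a seed string."""
    return uuid.uuid5(CREWAI_NAMESPACE, seed)

a = fingerprint("agent:researcher")
b = fingerprint("agent:researcher")  # same seed, same UUID
c = fingerprint("agent:writer")      # different seed, different UUID
```

Determinism is what makes audit trails joinable across runs: the same agent definition maps to the same fingerprint every time.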
## 15. Agent-to-Agent (A2A)

**Key files:** `a2a/config.py`, `a2a/`

Protocol for inter-agent communication:

- `A2AClientConfig`, `A2AServerConfig` — configuration
- `AgentCardSigningConfig` — JWS signing (RS256, ES256, PS256)
- `GRPCServerConfig` — gRPC transport with TLS
- Supporting: `auth/`, `updates/` (polling/push/streaming), `extensions/`, `utils/`

---
## 16. Translations & i18n

**Key file:** `translations/en.json`

All agent-facing prompts are externalized. Key sections:

- `slices/` — agent prompting templates (task, memory, role_playing, tools, format, final_answer_format)
- `errors/` — tool execution, validation, format violation, guardrail failure messages
- `tools/` — tool descriptions (delegate_work, ask_question, recall_memory, calculator, save_to_memory)
- `memory/` — query analysis, extraction rules, consolidation logic, temporal reasoning
- HITL prompts — pre-review, lesson distillation
- Lite agent prompts — system prompts with/without tools
@@ -1268,7 +1268,18 @@ class Agent(BaseAgent):
             ),
         )
         start_time = time.time()
-        matches = agent_memory.recall(formatted_messages, limit=5)
+        matches = agent_memory.recall(formatted_messages, limit=20)
+        # Filter low-relevance memories to reduce noise while
+        # guaranteeing a minimum context floor.
+        if matches:
+            _MIN_SCORE = 0.55
+            _MIN_RESULTS = 5
+            above = [m for m in matches if m.score >= _MIN_SCORE]
+            if len(above) >= _MIN_RESULTS:
+                matches = above
+            else:
+                # Keep at least top _MIN_RESULTS by score
+                matches = matches[:max(_MIN_RESULTS, len(above))]
         memory_block = ""
         if matches:
             memory_block = "Relevant memories:\n" + "\n".join(
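The threshold-plus-floor selection in the hunk above can be isolated into a small pure function for clarity; the match objects here are stand-in dicts rather than the real recall results:

```python
# Score-threshold filter with a minimum-results floor, mirroring the
# selection rule above. Assumes matches are sorted best-first.
MIN_SCORE = 0.55
MIN_RESULTS = 5

def filter_matches(matches: list[dict]) -> list[dict]:
    """Keep high-scoring matches, but never return fewer than MIN_RESULTS."""
    above = [m for m in matches if m["score"] >= MIN_SCORE]
    if len(above) >= MIN_RESULTS:
        return above
    # Fall back to the top slice. len(above) < MIN_RESULTS here, so the
    # max() effectively keeps MIN_RESULTS items when enough exist.
    return matches[:max(MIN_RESULTS, len(above))]

scores = [0.9, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2]
kept = filter_matches([{"score": s} for s in scores])
print(len(kept))  # 5: only three clear the threshold, so the floor applies
```

Raising the recall limit to 20 while filtering by score trades a larger candidate pool for a cleaner final context window.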
@@ -2,7 +2,6 @@
 
 Implements adaptive-depth retrieval with:
 - LLM query distillation into targeted sub-queries
-- Keyword-driven category filtering
 - Time-based filtering from temporal hints
 - Parallel multi-query, multi-scope search
 - Confidence-based routing with iterative deepening (budget loop)
@@ -37,7 +36,6 @@ class RecallState(BaseModel):
     query: str = ""
     scope: str | None = None
     categories: list[str] | None = None
-    inferred_categories: list[str] = Field(default_factory=list)
     time_cutoff: datetime | None = None
     source: str | None = None
     include_private: bool = False
@@ -82,11 +80,8 @@ class RecallFlow(Flow[RecallState]):
     # ------------------------------------------------------------------
 
     def _merged_categories(self) -> list[str] | None:
-        """Merge caller-supplied and LLM-inferred categories."""
-        merged = list(
-            set((self.state.categories or []) + self.state.inferred_categories)
-        )
-        return merged or None
+        """Return caller-supplied categories, or None if empty."""
+        return self.state.categories or None
 
     def _do_search(self) -> list[dict[str, Any]]:
         """Run parallel search across (embeddings x scopes) with filters.
@@ -212,10 +207,6 @@ class RecallFlow(Flow[RecallState]):
         )
         self.state.query_analysis = analysis
 
-        # Wire keywords -> category filter
-        if analysis.keywords:
-            self.state.inferred_categories = analysis.keywords
-
         # Parse time_filter into a datetime cutoff
         if analysis.time_filter:
             try:
@@ -91,10 +91,18 @@ class MemoryMatch(BaseModel):
         """Format this match as a human-readable string including metadata.
 
         Returns:
-            A multi-line string with score, content, categories, and non-empty
-            metadata fields.
+            A multi-line string with score, content, scope/date, categories,
+            and non-empty metadata fields.
         """
-        lines = [f"- (score={self.score:.2f}) {self.record.content}"]
+        # Extract date from scope (e.g. "/conversations/2023-05-29" -> "2023-05-29")
+        date_str = ""
+        if self.record.scope and self.record.scope != "/":
+            parts = self.record.scope.rstrip("/").rsplit("/", 1)
+            if len(parts) > 1 and len(parts[-1]) >= 10:
+                date_str = f" [date: {parts[-1]}]"
+        lines = [f"- (score={self.score:.2f}){date_str} {self.record.content}"]
         if self.record.scope and self.record.scope != "/":
             lines.append(f" scope: {self.record.scope}")
         if self.record.categories:
             lines.append(f" categories: {', '.join(self.record.categories)}")
         if self.record.metadata:
@@ -366,7 +374,13 @@ def compute_composite_score(
     Tuple of (composite_score, match_reasons). match_reasons includes
     "semantic" always; "recency" if decay > 0.5; "importance" if record.importance > 0.5.
     """
-    age_seconds = (datetime.utcnow() - record.created_at).total_seconds()
+    now = datetime.utcnow()
+    created = record.created_at
+    # Strip timezone info to avoid "can't compare offset-naive and
+    # offset-aware datetimes" when records have mixed tz awareness.
+    if created.tzinfo is not None:
+        created = created.replace(tzinfo=None)
+    age_seconds = (now - created).total_seconds()
     age_days = max(age_seconds / 86400.0, 0.0)
     decay = 0.5 ** (age_days / config.recency_half_life_days)
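The tz-normalization fix and the existing half-life decay combine into a small pure function. The half-life constant below is an assumed value for illustration; the real one comes from `config.recency_half_life_days`:

```python
from datetime import datetime, timezone

RECENCY_HALF_LIFE_DAYS = 30.0  # assumed; real value comes from config

def recency_decay(created_at: datetime, now: datetime) -> float:
    """Exponential decay: the score halves every RECENCY_HALF_LIFE_DAYS."""
    # Normalize to naive datetimes so mixed tz-awareness cannot raise
    # "can't compare offset-naive and offset-aware datetimes".
    # Note this drops the offset rather than converting to UTC.
    if created_at.tzinfo is not None:
        created_at = created_at.replace(tzinfo=None)
    if now.tzinfo is not None:
        now = now.replace(tzinfo=None)
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / RECENCY_HALF_LIFE_DAYS)

now = datetime(2026, 1, 31)
print(recency_decay(datetime(2026, 1, 1), now))  # 0.5 at exactly one half-life
print(recency_decay(datetime(2026, 1, 1, tzinfo=timezone.utc), now))  # aware input is fine
```

With a 30-day half-life, a fresh memory scores 1.0, a month-old one 0.5, a two-month-old one 0.25, and so on, which is what lets "recency" join the match reasons only while decay exceeds 0.5.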
@@ -2,6 +2,10 @@
 
 from __future__ import annotations
 
+import ast
+import operator
+import re
+from datetime import datetime
 from typing import Any
 
 from pydantic import BaseModel, Field
@@ -10,6 +14,80 @@ from crewai.tools.base_tool import BaseTool
 from crewai.utilities.i18n import get_i18n
 
 
+# ---------------------------------------------------------------------------
+# Safe arithmetic evaluator (no eval())
+# ---------------------------------------------------------------------------
+
+_BINARY_OPS: dict[type, Any] = {
+    ast.Add: operator.add,
+    ast.Sub: operator.sub,
+    ast.Mult: operator.mul,
+    ast.Div: operator.truediv,
+    ast.FloorDiv: operator.floordiv,
+    ast.Mod: operator.mod,
+    ast.Pow: operator.pow,
+}
+
+_UNARY_OPS: dict[type, Any] = {
+    ast.USub: operator.neg,
+    ast.UAdd: operator.pos,
+}
+
+
+def _safe_eval_node(node: ast.AST) -> float:
+    """Recursively evaluate an AST node containing only arithmetic."""
+    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
+        return float(node.value)
+    if isinstance(node, ast.BinOp):
+        op = _BINARY_OPS.get(type(node.op))
+        if op is None:
+            raise ValueError(f"Unsupported operator: {type(node.op).__name__}")
+        return op(_safe_eval_node(node.left), _safe_eval_node(node.right))
+    if isinstance(node, ast.UnaryOp):
+        op = _UNARY_OPS.get(type(node.op))
+        if op is None:
+            raise ValueError(f"Unsupported unary operator: {type(node.op).__name__}")
+        return op(_safe_eval_node(node.operand))
+    raise ValueError(f"Unsupported expression element: {ast.dump(node)}")
+
+
+def safe_calc(expression: str) -> float:
+    """Safely evaluate a mathematical expression string.
+
+    Only supports arithmetic operators (+, -, *, /, //, %, **) and numeric
+    literals. No variable access, function calls, or attribute lookups.
+    """
+    tree = ast.parse(expression.strip(), mode="eval")
+    return _safe_eval_node(tree.body)
+
+
+# ---------------------------------------------------------------------------
+# Date difference helper
+# ---------------------------------------------------------------------------
+
+_DATE_DIFF_RE = re.compile(
+    r"^\s*(\d{4}-\d{2}-\d{2})\s*-\s*(\d{4}-\d{2}-\d{2})\s*$"
+)
+
+
+def _try_date_diff(expression: str) -> str | None:
+    """If *expression* is ``YYYY-MM-DD - YYYY-MM-DD``, return the day difference.
+
+    Returns a human-readable string like ``12 days`` or ``-5 days``, or
+    *None* if the expression is not a date subtraction.
+    """
+    m = _DATE_DIFF_RE.match(expression.strip())
+    if m is None:
+        return None
+    try:
+        d1 = datetime.strptime(m.group(1), "%Y-%m-%d")
+        d2 = datetime.strptime(m.group(2), "%Y-%m-%d")
+    except ValueError:
+        return None
+    delta = (d1 - d2).days
+    return f"{expression.strip()} = {delta} days"
+
+
 class RecallMemorySchema(BaseModel):
     """Schema for the recall memory tool."""
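To see why the AST walk is safer than `eval()`, here is a condensed version of the same technique with a worked example. It whitelists node types, so anything that is not a numeric literal or an arithmetic operator is rejected; it deliberately supports fewer operators than the full `_BINARY_OPS` table above:

```python
import ast
import operator

# Condensed safe evaluator: whitelist AST node types instead of calling
# eval() on model-supplied input.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
}

def safe_calc(expression: str) -> float:
    def ev(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        # Names, calls, attributes, subscripts all land here.
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression.strip(), mode="eval").body)

print(safe_calc("(132 + 298) / 5"))  # 86.0
try:
    safe_calc("__import__('os').system('ls')")  # Call/Attribute nodes rejected
except ValueError as e:
    print(e)
```

`ast.parse(..., mode="eval")` only builds a tree and never executes anything, so malicious input fails at the whitelist check rather than running.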
@@ -49,7 +127,7 @@ class RecallMemoryTool(BaseTool):
         all_lines: list[str] = []
         seen_ids: set[str] = set()
         for query in queries:
-            matches = self.memory.recall(query)
+            matches = self.memory.recall(query, limit=30)
             for m in matches:
                 if m.record.id not in seen_ids:
                     seen_ids.add(m.record.id)
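The `seen_ids` dedup in this hunk, extracted into a standalone sketch with a toy in-memory backend (the names and record shape are hypothetical stand-ins for the real recall results):

```python
def recall_many(queries: list[str], search) -> list[dict]:
    """Run each query and merge results, keeping the first hit per record id."""
    seen_ids: set[str] = set()
    merged: list[dict] = []
    for query in queries:
        for record in search(query):
            if record["id"] not in seen_ids:
                seen_ids.add(record["id"])
                merged.append(record)
    return merged

# Toy search backend: different phrasings return overlapping records.
INDEX = {
    "yoga sessions": [{"id": "a"}, {"id": "b"}],
    "exercise classes": [{"id": "b"}, {"id": "c"}],
}
results = recall_many(list(INDEX), INDEX.get)
print([r["id"] for r in results])  # ['a', 'b', 'c']
```

This is why the prompt guidance tells agents to search with several phrasings: overlapping result sets are cheap because duplicates are dropped by id, while each new phrasing can surface records the others missed.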
@@ -101,6 +179,52 @@ class RememberTool(BaseTool):
         return f"Saving {len(contents)} items to memory in background."
 
 
+class CalculatorSchema(BaseModel):
+    """Schema for the calculator tool."""
+
+    expression: str = Field(
+        ...,
+        description=(
+            "A mathematical expression to evaluate, e.g. '(30 + 25 + 85)' "
+            "or '(132 + 298) / 5'. Supports +, -, *, /, //, %, **. "
+            "Also supports date differences: '2023-04-01 - 2023-03-20' returns the number of days."
+        ),
+    )
+
+
+class CalculatorTool(BaseTool):
+    """Lightweight calculator for arithmetic during memory-based reasoning."""
+
+    name: str = "Calculator"
+    description: str = ""
+    args_schema: type[BaseModel] = CalculatorSchema
+
+    def _run(self, expression: str, **kwargs: Any) -> str:
+        """Evaluate a mathematical expression safely.
+
+        Supports arithmetic expressions and date differences
+        (``YYYY-MM-DD - YYYY-MM-DD``).
+
+        Args:
+            expression: Arithmetic or date-difference expression string.
+
+        Returns:
+            The expression and its result, or an error message.
+        """
+        # Try date difference first (e.g. "2023-04-01 - 2023-03-20")
+        date_result = _try_date_diff(expression)
+        if date_result is not None:
+            return date_result
+        try:
+            result = safe_calc(expression)
+            # Format nicely: drop .0 for whole numbers
+            if result == int(result):
+                return f"{expression} = {int(result)}"
+            return f"{expression} = {result:.4g}"
+        except Exception as e:
+            return f"Error evaluating '{expression}': {e}"
+
+
 def create_memory_tools(memory: Any) -> list[BaseTool]:
     """Create Recall and Remember tools for the given memory instance.
@@ -120,6 +244,9 @@ def create_memory_tools(memory: Any) -> list[BaseTool]:
             memory=memory,
             description=i18n.tools("recall_memory"),
         ),
+        CalculatorTool(
+            description=i18n.tools("calculator"),
+        ),
     ]
     if not getattr(memory, "_read_only", False):
         tools.append(
@@ -7,7 +7,7 @@
   "slices": {
     "observation": "\nObservation:",
     "task": "\nCurrent Task: {input}\n\nBegin! This is VERY important to you, use the tools available and give your best Final Answer, your job depends on it!\n\nThought:",
-    "memory": "\n\n# Useful context: \n{memory}",
+    "memory": "\n\n# Memories from past conversations:\n{memory}\n\nGuidelines for using these memories:\n\n1. COMPLETENESS: The memories above are an automatic selection and may be INCOMPLETE. If the task involves counting, listing, or summing items (e.g. 'how many', 'total', 'list all'), you MUST use the Search memory tool with several different queries before answering.\n\n2. COUNTING & ARITHMETIC: When counting or computing totals:\n - You MUST search memory multiple times with DIFFERENT phrasings to find ALL items. After your first search, ask yourself 'could there be more items I haven't found?' and search again with different terms.\n - Enumerate EACH specific item individually with its details, then count. Do NOT guess a number — list them first.\n - Count instances/sessions, not categories (e.g. if yoga is 2x/week, that is 2 sessions, not 1).\n - Only exclude items if there is explicit confirmation of removal, sale, or cancellation — intent to sell or thinking about it still means the user currently has it.\n - Use the Calculator tool for all arithmetic (sums, averages, differences) instead of computing in your head.\n - Only count things the user personally did, owned, or participated in — not things merely mentioned or discussed informationally.\n\n3. PERSONALIZATION: When the user asks for advice, recommendations, tips, or opinions:\n - BUILD UPON what you know about the user — don't just restate their memories as new suggestions.\n - Explicitly reference their specific past experiences, preferences, tools, and interests by name.\n - Frame suggestions in terms of what they've already tried or expressed interest in (e.g. 'Since you enjoyed X, you might also like Y' or 'Building on your experience with X, consider trying Y').\n - NEVER give generic advice that ignores their memories. If you have relevant context, USE it.\n - NEVER say 'I don't have information' or 'I don't have recommendations' if you have ANY related memories. Instead, reason from what you know — even partial context is better than no answer.\n\n4. TEMPORAL REASONING: Each memory has a 'scope' field (e.g. '/conversations/2023-03-04') that tells you the date of the conversation it came from.\n - For 'how many days/weeks/months AGO did X happen': subtract the event's scope date from the question date. Use the Calculator with 'YYYY-MM-DD - YYYY-MM-DD' format, e.g. Calculator('2023-04-01 - 2023-03-20') returns '12 days'.\n - For 'how many days BETWEEN Event A and Event B': subtract the two events' scope dates from each other — do NOT use the question date. Example: if A is at scope 2023-02-10 and B at scope 2023-03-01, compute Calculator('2023-03-01 - 2023-02-10') = 19 days.\n - For 'how long ago was X WHEN Y happened': the reference point is when Y happened (Y's scope date), NOT the question date. Example: 'How many days ago did I launch my website when I signed my first client?' — use the client-signing date as the reference, not today.\n - For 'which happened first' or ordering: compare the scope dates directly — the earlier date happened first. If a memory says 'about a month ago' from scope 2023-05-29, compute the actual date (approximately 2023-04-29) and compare.\n - Always prefer the scope date over vague temporal references like 'recently' in the memory text.\n - NEVER compute date differences by manually counting days in each month. Always use the Calculator with YYYY-MM-DD format.\n\n5. KNOWLEDGE UPDATES: When multiple memories describe the same fact at different dates (different scope values), the LATEST one (most recent scope date) is the current truth.\n - If one memory says '3 sessions attended' (scope 2023-05-11) and a later one says '5 sessions attended' (scope 2023-10-30), the answer is 5 — NOT 3+5=8. The later value is a cumulative update, not an addition.\n - If one memory says 'class on Thursday' (scope 2023-06-16) and a later one says 'class on Friday' (scope 2023-06-30), the answer is Friday — the schedule changed.\n - If one memory says 'personal best 27:12' and a later one says 'personal best 25:50', the answer is 25:50 — the record was broken.\n - ALWAYS check the scope dates when you find conflicting information about the same topic. The most recent scope wins.\n - Do NOT sum values across time periods unless the question explicitly asks for a cumulative total across separate events (e.g. 'total hours spent across all sessions').",
     "role_playing": "You are {role}. {backstory}\nYour personal goal is: {goal}",
     "tools": "\nYou ONLY have access to the following tools, and should NEVER make up tools that are not listed here:\n\n{tools}\n\nIMPORTANT: Use the following format in your response:\n\n```\nThought: you should always think about what to do\nAction: the action to take, only one name of [{tool_names}], just the name, exactly as it's written.\nAction Input: the input to the action, just a simple JSON object, enclosed in curly braces, using \" to wrap keys and values.\nObservation: the result of the action\n```\n\nOnce all necessary information is gathered, return the following format:\n\n```\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n```",
     "no_tools": "",
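The "knowledge updates" rule in the new memory slice (the latest scope date is the current truth) can be sketched as a small resolver; the record shape and function name are illustrative:

```python
from datetime import date

def latest_fact(memories: list[dict]) -> str:
    """Pick the content whose scope path carries the most recent date.

    Conflicting statements about the same fact are treated as updates,
    not additions: the newest one wins.
    """
    def scope_date(m: dict) -> date:
        # "/conversations/2023-05-11" -> date(2023, 5, 11)
        return date.fromisoformat(m["scope"].rsplit("/", 1)[-1])
    return max(memories, key=scope_date)["content"]

memories = [
    {"scope": "/conversations/2023-05-11", "content": "3 sessions attended"},
    {"scope": "/conversations/2023-10-30", "content": "5 sessions attended"},
]
print(latest_fact(memories))  # "5 sessions attended" -- not 3 + 5 = 8
```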
@@ -60,12 +60,13 @@
       "description": "See image to understand its content, you can optionally ask a question about the image",
       "default_action": "Please provide a detailed description of this image, including all visual elements, context, and any notable details you can observe."
     },
-    "recall_memory": "Search through the team's shared memory for relevant information. Pass one or more queries to search for multiple things at once. Use this when you need to find facts, decisions, preferences, or past results that may have been stored previously.",
+    "recall_memory": "Search through the team's shared memory for relevant information. Pass one or more queries to search for multiple things at once. Use this when you need to find facts, decisions, preferences, or past results that may have been stored previously. IMPORTANT: For questions that require counting, summing, or listing items across multiple conversations (e.g. 'how many X', 'total Y', 'list all Z'), you MUST search multiple times with different phrasings to ensure you find ALL relevant items before giving a final count or total. Do not rely on a single search — items may be described differently across conversations.",
+    "calculator": "Perform arithmetic calculations and date differences. Use this tool whenever you need to add, subtract, multiply, divide, compute averages/totals, or calculate the number of days between two dates. Pass a mathematical expression like '(30 + 25 + 85)' or '(132 + 298) / 5', or a date difference like '2023-04-01 - 2023-03-20' (returns days). Always use this tool instead of computing in your head — NEVER manually count days across months.",
     "save_to_memory": "Store one or more important facts, decisions, observations, or lessons in memory so they can be recalled later by you or other agents. Pass multiple items at once when you have several things worth remembering."
   },
   "memory": {
     "query_system": "You analyze a query for searching memory.\nGiven the query and available scopes, output:\n1. keywords: Key entities or keywords that can be used to filter by category.\n2. suggested_scopes: Which available scopes are most relevant (empty for all).\n3. complexity: 'simple' or 'complex'.\n4. recall_queries: 1-3 short, targeted search phrases distilled from the query. Each should be a concise phrase optimized for semantic vector search. If the query is already short and focused, return it as-is in a single-item list. For long task descriptions, extract the distinct things worth searching for.\n5. time_filter: If the query references a time period (like 'last week', 'yesterday', 'in January'), return an ISO 8601 date string for the earliest relevant date (e.g. '2026-02-01'). Return null if no time constraint is implied.",
-    "extract_memories_system": "You extract discrete, reusable memory statements from raw content (e.g. a task description and its result).\n\nFor the given content, output a list of memory statements. Each memory must:\n- Be one clear sentence or short statement\n- Be understandable without the original context\n- Capture a decision, fact, outcome, preference, lesson, or observation worth remembering\n- NOT be a vague summary or a restatement of the task description\n- NOT duplicate the same idea in different words\n\nIf there is nothing worth remembering (e.g. empty result, no decisions or facts), return an empty list.\nOutput a JSON object with a single key \"memories\" whose value is a list of strings.",
+    "extract_memories_system": "You extract discrete, reusable memory statements from raw content (e.g. a task description and its result, or a conversation between a user and an assistant).\n\nFor the given content, output a list of memory statements. Each memory must:\n- Be one clear sentence or short statement\n- Be understandable without the original context\n- Capture a decision, fact, outcome, preference, lesson, or observation worth remembering\n- NOT be a vague summary or a restatement of the task description\n- NOT duplicate the same idea in different words\n\nCRITICAL — Extract ALL facts, not just the main topic:\nUsers often reveal important personal facts WHILE discussing something else. Extract these background facts as SEPARATE memories:\n- Casual asides: \"By the way, I just finished a 5K in 35 minutes\" → extract the 5K time as its own memory\n- Qualifiers in questions: \"What collar suits a Golden Retriever like Max?\" → extract \"The user's dog Max is a Golden Retriever\"\n- Brief mentions in lists: If the user lists several items (\"$100 gift card for brother, $75 earrings for sister, $100 baby gift for coworker\"), extract EACH item as a separate memory — do not skip items mentioned only once\n- Adjectival modifiers: \"our 10-day trip to Hawaii\" → extract the duration (10 days) explicitly\n\nWhen the user makes a request that reveals a personal fact, extract BOTH:\n- The fact: \"The user's dog Max is a Golden Retriever\"\n- The action: \"The user is looking for a new collar with a name tag for Max\"\nNever let the request overshadow the revealed fact.\n\nUser personal facts are HIGH PRIORITY and must always be extracted:\n- What the user did, bought, made, visited, attended, or completed\n- Names of people, pets, places, brands, and specific items the user mentions\n- Quantities, durations, dates, prices, and measurements the user states\n- Subordinate clauses and casual asides often contain important personal details\n\nPreserve exact names and numbers — never generalize:\n- Keep \"32 years old\" not \"in their 30s\"\n- Keep \"$24 after a discount\" not \"an impulse buy\"\n- Keep \"10-day trip\" not just \"trip\"\n- Keep \"35 minutes\" not just \"started running\"\n- Keep \"lavender gin fizz\" not just \"cocktail\"\n- Keep \"12 largemouth bass\" not just \"fish caught\"\n- Keep \"Golden Retriever\" not just \"dog\"\n\nWhen the content includes assistant responses, also extract useful factual information, recommendations, or solutions the assistant provides.\n\nAdditional extraction rules:\n- Presupposed facts: When the user reveals a fact indirectly in a question (e.g. \"What collar suits a Golden Retriever like Max?\" presupposes Max is a Golden Retriever), extract that fact as a separate memory.\n- Date precision: Always preserve the full date including day-of-month when stated (e.g. \"February 14th\" not just \"February\", \"March 5\" not just \"March\"). When a well-known holiday name implies a specific date (e.g. \"Valentine's Day\" = February 14th, \"Christmas\" = December 25th), include the calendar date.\n- Relative date resolution: The content often has a date header like \"[Conversation on 2023/05/21 (Sun) 18:59]\". When relative time expressions appear (\"two months ago\", \"last Thursday\", \"about a month ago\"), resolve them to approximate absolute dates using the conversation date (e.g. \"two months ago\" in a May 2023 conversation → \"around March 2023\"). Include the resolved date in the memory.\n- Life events in passing: When the user mentions a life event (birth, wedding, graduation, move, adoption) while discussing something else, extract the life event as its own FACTUAL memory (e.g. \"my friend David had a baby boy named Jasper\" → \"David had a baby boy named Jasper\"). Do NOT convert life events into task actions (do NOT store as \"set up birthday reminder for Jasper\").\n- Separate distinct actions: When the user describes multiple pending tasks (e.g. \"I need to return the old boots AND pick up the new pair\"), extract each as a separate memory. Do not merge them into one statement.\n- Completed vs planned: Clearly distinguish completed events (\"I baked cookies last Thursday\") from planned events (\"I'm thinking of baking chicken wings\"). Include the approximate date for both.\n\nIf there is nothing worth remembering (e.g. empty result, no decisions or facts), return an empty list.\nOutput a JSON object with a single key \"memories\" whose value is a list of strings.",
     "extract_memories_user": "Content:\n{content}\n\nExtract memory statements as described. Return structured output.",
     "query_user": "Query: {query}\n\nAvailable scopes: {available_scopes}\n{scope_desc}\n\nReturn the analysis as structured output.",
     "save_system": "You analyze content to be stored in a hierarchical memory system.\nGiven the content and the existing scopes and categories, output:\n1. suggested_scope: The best matching existing scope path, or a new path if none fit (use / for root).\n2. categories: A list of categories (reuse existing when relevant, add new ones if needed).\n3. importance: A number from 0.0 to 1.0 indicating how significant this memory is.\n4. extracted_metadata: A JSON object with any entities, dates, or topics you can extract."
@@ -79,4 +80,4 @@
     "create_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou have been assigned the following task:\n{description}\n\nExpected output:\n{expected_output}\n\nAvailable tools: {tools}\n\nBefore executing this task, create a detailed plan that leverages your expertise as {role} and outlines:\n1. Your understanding of the task from your professional perspective\n2. The key steps you'll take to complete it, drawing on your background and skills\n3. How you'll approach any challenges that might arise, considering your expertise\n4. How you'll strategically use the available tools based on your experience, exactly what tools to use and how to use them\n5. The expected outcome and how it aligns with your goal\n\nAfter creating your plan, assess whether you feel ready to execute the task or if you could do better.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan because [specific reason].\"",
     "refine_plan_prompt": "You are {role} with this background: {backstory}\n\nYour primary goal is: {goal}\n\nYou created the following plan for this task:\n{current_plan}\n\nHowever, you indicated that you're not ready to execute the task yet.\n\nPlease refine your plan further, drawing on your expertise as {role} to address any gaps or uncertainties. As you refine your plan, be specific about which available tools you will use, how you will use them, and why they are the best choices for each step. Clearly outline your tool usage strategy as part of your improved plan.\n\nAfter refining your plan, assess whether you feel ready to execute the task.\nConclude with one of these statements:\n- \"READY: I am ready to execute the task.\"\n- \"NOT READY: I need to refine my plan further because [specific reason].\""
   }
 }
}