Each trace event was serializing the ENTIRE Crew/Task/Agent object graph into
event_data JSONB, causing 500GB+ trace tables in production. For a crew with
5 agents and 10 tasks, each event could be 50-100KB because:
- Crew serialized full tasks AND full agents (with all tools, LLM configs)
- Each Task re-serialized its agent (same Agent already in Crew.agents)
- Each Task re-serialized context tasks (same Tasks already in Crew.tasks)
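
The duplication described above can be reproduced with a small sketch. The
dataclasses below are hypothetical stand-ins, not the real crewAI models, and
the tool payloads are made-up filler; the point is only that nested
serialization repeats the same Agent once per place it is referenced.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Agent:
    id: str
    role: str
    tools: list = field(default_factory=list)  # stand-in for heavy tool/LLM config


@dataclass
class Task:
    id: str
    name: str
    agent: Agent                                 # same object as in Crew.agents
    context: list = field(default_factory=list)  # earlier Task objects


@dataclass
class Crew:
    agents: list
    tasks: list


def build_crew(n_agents=5, n_tasks=10):
    """Build a toy crew where each task depends on up to two earlier tasks."""
    agents = [
        Agent(f"a{i}", f"role-{i}",
              [{"name": f"tool-{j}", "description": "x" * 200} for j in range(4)])
        for i in range(n_agents)
    ]
    tasks = []
    for i in range(n_tasks):
        tasks.append(Task(f"t{i}", f"task-{i}", agents[i % n_agents], list(tasks[-2:])))
    return Crew(agents, tasks)


def count_agent_copies(node, agent_id):
    """Count serialized Agent dicts with the given id anywhere in the payload."""
    if isinstance(node, dict):
        hit = 1 if "role" in node and node.get("id") == agent_id else 0
        return hit + sum(count_agent_copies(v, agent_id) for v in node.values())
    if isinstance(node, list):
        return sum(count_agent_copies(v, agent_id) for v in node)
    return 0


# asdict() recurses into nested dataclasses, mimicking full-graph serialization.
payload = asdict(build_crew())
copies = count_agent_copies(payload, "a0")
size_bytes = len(json.dumps(payload))
print(f"agent a0 serialized {copies} times, payload is {size_bytes} bytes")
```

The exact numbers depend on the toy data, but the agent appears once in the
crew's agent list and again under every task (and context task) that uses it.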
This fix:
1. Adds TRACE_EXCLUDE_FIELDS constant listing back-references and heavy fields
   to exclude (crew, agent, agents, tasks, context, tools, llm, callbacks, etc.)
2. Adds _serialize_for_trace() helper that uses safe_serialize_to_dict with
   the exclusion set, keeping scalar fields (agent_role, task_name, etc.)
   that the AMP frontend actually reads
3. Updates _build_event_data() to use lightweight serialization for all
   events except crew_kickoff_started
4. Adds _build_crew_started_data() that serializes the full crew structure
   ONCE with:
   - Agents with tool_names (list of strings, not full tool objects)
   - Tasks with agent_ref (just {id, role}) instead of full agent
   - Tasks with context_task_ids (just IDs) instead of full context tasks
5. Updates to_serializable() in serialization.py to:
   - Handle callable objects (functions/lambdas) by falling through to repr()
   - Handle regular classes with __dict__ (not just Pydantic models)
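
The steps above can be sketched roughly as follows. This is a simplified
illustration under stated assumptions, not the code in this commit: the
classes are hypothetical, the exclusion set contains only the names listed
above (the real constant includes more), and the helper names drop the
leading underscore to make clear they are stand-ins.

```python
import json
from dataclasses import dataclass, field

# Assumed contents: only the names spelled out in the commit message.
TRACE_EXCLUDE_FIELDS = frozenset(
    {"crew", "agent", "agents", "tasks", "context", "tools", "llm", "callbacks"}
)


def to_serializable(value):
    """Best-effort JSON-safe conversion (no cycle detection in this sketch)."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    if isinstance(value, (list, tuple, set)):
        return [to_serializable(v) for v in value]
    if isinstance(value, dict):
        return {str(k): to_serializable(v) for k, v in value.items()}
    if callable(value):
        # Functions also have __dict__, so check callables first.
        return repr(value)
    if hasattr(value, "__dict__"):
        # Plain classes, not only Pydantic models.
        return {k: to_serializable(v) for k, v in vars(value).items()}
    return repr(value)


def serialize_for_trace(obj, exclude=TRACE_EXCLUDE_FIELDS):
    """Keep scalar fields; drop back-references and heavy fields."""
    return {k: to_serializable(v) for k, v in vars(obj).items() if k not in exclude}


def build_crew_started_data(crew):
    """Serialize the crew structure once, using references instead of copies."""
    return {
        "agents": [
            {"id": a.id, "role": a.role, "tool_names": [t.name for t in a.tools]}
            for a in crew.agents
        ],
        "tasks": [
            {
                "id": t.id,
                "name": t.name,
                "agent_ref": {"id": t.agent.id, "role": t.agent.role},
                "context_task_ids": [c.id for c in t.context],
            }
            for t in crew.tasks
        ],
    }


@dataclass
class Tool:
    name: str
    run: object  # a callable


@dataclass
class Agent:
    id: str
    role: str
    tools: list = field(default_factory=list)


@dataclass
class Task:
    id: str
    name: str
    agent: Agent
    context: list = field(default_factory=list)


@dataclass
class Crew:
    agents: list
    tasks: list


@dataclass
class TaskEvent:
    type: str
    task_name: str
    agent_role: str
    agent: Agent
    crew: Crew


tool = Tool("search", lambda q: q)
agent = Agent("a1", "researcher", [tool])
task = Task("t1", "collect", agent)
crew = Crew([agent], [task])
event = TaskEvent("task_started", "collect", "researcher", agent, crew)

light = serialize_for_trace(event)       # per-event payload
started = build_crew_started_data(crew)  # emitted once at kickoff
print(json.dumps(light), json.dumps(started))
```

Every event other than the kickoff event carries only the small scalar
payload; anything that needs the full structure can resolve agent_ref and
context_task_ids against the single crew_kickoff_started record.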
Expected size reduction: 50-100KB per event down to ~1-2KB per event.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>