crewAI/lib
Alex a4f1164812 fix: reduce trace event serialization bloat by excluding redundant nested objects
Each trace event was serializing the ENTIRE Crew/Task/Agent object graph into
event_data JSONB, causing 500GB+ trace tables in production. For a crew with
5 agents and 10 tasks, each event could be 50-100KB because:
- Crew serialized full tasks AND full agents (with all tools, LLM configs)
- Each Task re-serialized its agent (same Agent already in Crew.agents)
- Each Task re-serialized context tasks (same Tasks already in Crew.tasks)
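The duplication described above can be reproduced with a small sketch. The `Agent`, `Task`, and `Crew` dataclasses here are hypothetical stand-ins, not the crewAI classes, and `dataclasses.asdict` stands in for whatever recursive serializer the tracing code used.

```python
import json
from dataclasses import dataclass, field, asdict


# Hypothetical stand-ins for illustration only; not the crewAI models.
@dataclass
class Agent:
    role: str
    tools: list = field(default_factory=list)


@dataclass
class Task:
    name: str
    agent: Agent
    context: list = field(default_factory=list)


@dataclass
class Crew:
    agents: list
    tasks: list


researcher = Agent(role="researcher", tools=["search " * 50, "scrape " * 50])
first = Task(name="collect", agent=researcher)
second = Task(name="summarize", agent=researcher, context=[first])
crew = Crew(agents=[researcher], tasks=[first, second])

# asdict() recurses into nested dataclasses and lists, so the same Agent is
# emitted under crew.agents, under each task.agent, and again inside
# second.context[0].agent.
full = json.dumps(asdict(crew))
agent_copies = full.count('"role": "researcher"')

# A flat payload keeps only the scalar identifiers.
light = json.dumps({"task_name": second.name, "agent_role": researcher.role})

print(agent_copies, len(full), len(light))
```

With one agent and two tasks the agent already appears four times in the payload; the count grows with every task and every context reference.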

This fix:
1. Adds TRACE_EXCLUDE_FIELDS constant listing back-references and heavy fields
   to exclude (crew, agent, agents, tasks, context, tools, llm, callbacks, etc.)

2. Adds _serialize_for_trace() helper that uses safe_serialize_to_dict with
   the exclusion set, keeping scalar fields (agent_role, task_name, etc.)
   that the AMP frontend actually reads

3. Updates _build_event_data() to use lightweight serialization for all
   events except crew_kickoff_started

4. Adds _build_crew_started_data() that serializes the full crew structure
   ONCE with:
   - Agents with tool_names (list of strings, not full tool objects)
   - Tasks with agent_ref (just {id, role}) instead of full agent
   - Tasks with context_task_ids (just IDs) instead of full context tasks

5. Updates to_serializable() in serialization.py to:
   - Handle callable objects (functions/lambdas) by falling through to repr()
   - Handle regular classes with __dict__ (not just Pydantic models)
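Items 1, 2, and 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this commit: the dataclasses are hypothetical, the exclusion set contains only the names the message lists (the real constant has more, per the "etc."), and `serialize_for_trace` and `build_crew_started_data` are simplified analogues that do not call `safe_serialize_to_dict`.

```python
from dataclasses import dataclass, field, fields
from typing import Any

# Assumed subset of the exclusion set, taken from the names listed above.
TRACE_EXCLUDE_FIELDS = frozenset(
    {"crew", "agent", "agents", "tasks", "context", "tools", "llm", "callbacks"}
)


# Hypothetical models for illustration only.
@dataclass
class Tool:
    name: str


@dataclass
class Agent:
    id: str
    role: str
    tools: list = field(default_factory=list)


@dataclass
class Task:
    id: str
    name: str
    agent: Agent
    context: list = field(default_factory=list)


@dataclass
class Crew:
    id: str
    agents: list
    tasks: list


def serialize_for_trace(obj: Any) -> dict:
    """Shallow dump of a dataclass that skips back-references and heavy fields."""
    return {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if f.name not in TRACE_EXCLUDE_FIELDS
    }


def build_crew_started_data(crew: Crew) -> dict:
    """Describe the crew structure once, using names and IDs instead of objects."""
    agents = [
        {**serialize_for_trace(a), "tool_names": [t.name for t in a.tools]}
        for a in crew.agents
    ]
    tasks = [
        {
            **serialize_for_trace(t),
            "agent_ref": {"id": t.agent.id, "role": t.agent.role},
            "context_task_ids": [c.id for c in t.context],
        }
        for t in crew.tasks
    ]
    return {**serialize_for_trace(crew), "agents": agents, "tasks": tasks}


agent = Agent("a1", "researcher", [Tool("search")])
t1 = Task("t1", "collect", agent)
t2 = Task("t2", "summarize", agent, [t1])
data = build_crew_started_data(Crew("c1", [agent], [t1, t2]))
print(data["tasks"][1])
```

Every other event would then carry only the output of the lightweight serializer, and a consumer can resolve `agent_ref` and `context_task_ids` against the single crew-started payload.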

Expected size reduction: 50-100KB per event down to ~1-2KB per event.
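The fallbacks from item 5 might look roughly like the sketch below. It is an assumption about the shape of the logic, not the actual `to_serializable()` in serialization.py: the function name, the `max_depth` guard, and the skipping of underscore-prefixed attributes are all illustrative choices.

```python
from typing import Any


def to_serializable_sketch(obj: Any, depth: int = 0, max_depth: int = 5) -> Any:
    """Convert obj into JSON-friendly data, using repr() as the last resort."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    if depth >= max_depth:
        # Guard against deep or cyclic object graphs.
        return repr(obj)
    if isinstance(obj, (list, tuple, set)):
        return [to_serializable_sketch(v, depth + 1, max_depth) for v in obj]
    if isinstance(obj, dict):
        return {
            str(k): to_serializable_sketch(v, depth + 1, max_depth)
            for k, v in obj.items()
        }
    if callable(obj):
        # Functions and lambdas also have a __dict__, so check them first.
        return repr(obj)
    if hasattr(obj, "__dict__"):
        # Plain class instances: dump their public attributes.
        return {
            k: to_serializable_sketch(v, depth + 1, max_depth)
            for k, v in vars(obj).items()
            if not k.startswith("_")
        }
    return repr(obj)


class Point:
    def __init__(self) -> None:
        self.x = 1
        self.on_move = len  # a callable attribute


print(to_serializable_sketch(Point()))
```

Checking `callable()` before `__dict__` matters in this sketch because function objects carry a `__dict__` of their own and would otherwise serialize to an empty mapping.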

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-30 09:25:01 -07:00