mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-07-01 05:08:12 +00:00
Add opt-in cache_preload and cache_preload_strategy parameters to the
Crew class that fire lightweight 1-token cache-warming probes against
each agent's system prompt at kickoff time. This warms the provider's
prompt cache (Anthropic, OpenAI prefix caching, etc.) before the first
real task runs, reducing first-step latency and cache-write costs.
Implementation:
- BaseLLM.preload_probe(): sends max_tokens=1 completion with the
agent's system prompt; failures are logged and never propagated
- Crew.cache_preload / Crew.cache_preload_strategy fields
- Crew._preload_caches() with three strategies:
* parallel: concurrent probes via ThreadPoolExecutor
* sequential: one-by-one in agent order
* shared_prefix: warm common prefix once then per-agent suffixes;
falls back to parallel when prefix < 1024 chars
The feature is opt-in (cache_preload=False by default) and only
activates for crews with 2+ agents.
Co-Authored-By: João <joao@crewai.com>