mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-07-02 05:38:12 +00:00
docs: restructure checkpointing page
@@ -5,225 +5,385 @@ icon: floppy-disk
|
||||
mode: "wide"
|
||||
---
|
||||
|
||||
<Warning>
|
||||
Checkpointing is in early release. APIs may change in future versions.
|
||||
</Warning>
|
||||
Checkpointing saves a snapshot of execution state during a run so a crew, flow, or agent can resume after a failure or be forked into an alternate branch.
|
||||
|
||||
## Overview
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Explanation" icon="lightbulb" href="#الشرح">
|
||||
How checkpointing works: events, storage, and inheritance.
|
||||
</Card>
|
||||
<Card title="Tutorial" icon="graduation-cap" href="#درس-تطبيقي-استئناف-طاقم-فاشل">
|
||||
A 5-minute walkthrough: run, interrupt, resume.
|
||||
</Card>
|
||||
<Card title="How-to guides" icon="screwdriver-wrench" href="#ادلة-عملية">
|
||||
Task-focused recipes for common workflows.
|
||||
</Card>
|
||||
<Card title="Reference" icon="book" href="#المرجع">
|
||||
`CheckpointConfig`, events, providers, and CLI.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
Checkpointing automatically saves execution state during a run. If a crew, flow, or agent fails mid-execution, you can restore from the last checkpoint and resume without re-running completed work.
|
||||
## Explanation
|
||||
|
||||
## Quick Start
|
||||
### What a checkpoint is
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
A checkpoint is a serialized snapshot of `RuntimeState` written at a point in execution. It records which tasks have completed, their outputs, the current inputs, and a lineage ID that identifies the run.
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=True, # uses defaults: ./.checkpoints, on task_completed
|
||||
)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
When you restore from a checkpoint, CrewAI rebuilds that state, skips already-completed work, and continues. When you fork from one, CrewAI restores the state under a new lineage so the new branch does not interfere with the original run.
|
||||
|
||||
Checkpoint files are written to `./.checkpoints/` after each completed task.
|
||||
### When checkpoints are written
|
||||
|
||||
## Configuration
|
||||
Checkpointing is event-driven. The runtime subscribes to the events you select via `on_events` and writes a checkpoint each time one fires. The default `task_completed` produces one checkpoint per finished task, a reasonable balance between granularity and disk use. Higher-frequency events like `llm_call_completed` are available for fine-grained recovery but write far more files.
|
||||
|
||||
Use `CheckpointConfig` for full control:
|
||||
### Storage
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
CrewAI includes two providers:
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
- `JsonProvider` writes one file per checkpoint. Human-readable and easy to inspect.
|
||||
- `SqliteProvider` writes to a single SQLite database. Better for high-frequency checkpointing.
|
||||
|
||||
### CheckpointConfig Fields
|
||||
Both prune the oldest checkpoints when `max_checkpoints` is set.
|
||||
|
||||
| Field | Type | Default | Description |
|
||||
|:------|:------|:----------|:------|
|
||||
| `location` | `str` | `"./.checkpoints"` | Path for checkpoint files |
|
||||
| `on_events` | `list[str]` | `["task_completed"]` | Event types that trigger a checkpoint |
|
||||
| `provider` | `BaseProvider` | `JsonProvider()` | Storage backend |
|
||||
| `max_checkpoints` | `int \| None` | `None` | Maximum number of files; the oldest are deleted first |
|
||||
<Note>
|
||||
Checkpoint writes are best-effort. A failed checkpoint is logged but does not interrupt the run.
|
||||
</Note>
|
||||
|
||||
### Inheritance and Opt-Out
|
||||
### Inheritance model
|
||||
|
||||
The `checkpoint` field on Crew, Flow, and Agent accepts `CheckpointConfig`, `True`, `False`, or `None`:
|
||||
`Crew`, `Flow`, and `Agent` all accept a `checkpoint` argument. Children inherit from their parent unless they set their own value or pass `False` to opt out. Enable checkpointing once on the crew and every agent participates, or selectively exclude one agent.
|
||||
|
||||
| Value | Behavior |
|
||||
|:-------|:-------|
|
||||
| `None` (default) | Inherit from parent. An agent inherits its crew's config. |
|
||||
| `True` | Enable with defaults. |
|
||||
| `False` | Explicit opt-out. Stops inheritance from parent. |
|
||||
| `CheckpointConfig(...)` | Custom configuration. |
|
||||
## Tutorial: Resume a failing crew
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...), # inherits crew's checkpoint
|
||||
Agent(role="Writer", ..., checkpoint=False), # opted out, no checkpoints
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
This walkthrough takes about 5 minutes. You will run a two-task crew, stop it midway, and resume from the saved checkpoint.
|
||||
|
||||
## Resuming from a Checkpoint
|
||||
<Steps>
|
||||
<Step title="Create the crew with checkpointing enabled">
|
||||
```python
|
||||
from crewai import Agent, Crew, Task
|
||||
|
||||
```python
|
||||
# Restore and resume
|
||||
crew = Crew.from_checkpoint("./my_checkpoints/20260407T120000_abc123.json")
|
||||
result = crew.kickoff() # resumes from the last completed task
|
||||
```
|
||||
researcher = Agent(role="Researcher", goal="Research", backstory="Expert")
|
||||
writer = Agent(role="Writer", goal="Write", backstory="Expert")
|
||||
|
||||
The restored crew skips completed tasks and resumes from the first incomplete task.
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[
|
||||
Task(description="Research AI trends", agent=researcher, expected_output="bullets"),
|
||||
Task(description="Write a summary", agent=writer, expected_output="paragraph"),
|
||||
],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Step>
|
||||
<Step title="Run it and interrupt after the first task">
|
||||
```python
|
||||
result = crew.kickoff()
|
||||
```
|
||||
|
||||
## Works on Crew, Flow, and Agent
|
||||
Press `Ctrl+C` after the first task finishes. In `./.checkpoints/`, the file named `<timestamp>_<uuid>.json` is the checkpoint.
|
||||
</Step>
|
||||
<Step title="Resume from the checkpoint">
|
||||
```python
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
### Crew
|
||||
result = crew.kickoff(
|
||||
from_checkpoint=CheckpointConfig(
|
||||
restore_from="./.checkpoints/<timestamp>_<uuid>.json",
|
||||
),
|
||||
)
|
||||
```
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
The research task is skipped, the writer runs against the saved research output, and the crew finishes.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
Default trigger: `task_completed` (one checkpoint per completed task).
|
||||
## How-to guides
|
||||
|
||||
### Flow
|
||||
<AccordionGroup>
|
||||
<Accordion title="Enable checkpointing with defaults" icon="play">
|
||||
```python
|
||||
crew = Crew(agents=[...], tasks=[...], checkpoint=True)
|
||||
```
|
||||
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
Writes to `./.checkpoints/` on every `task_completed`.
|
||||
</Accordion>
|
||||
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
<Accordion title="Customize storage and frequency" icon="sliders">
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
<Accordion title="Choose a storage provider" icon="database">
|
||||
<CodeGroup>
|
||||
```python JsonProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
|
||||
# Resume
|
||||
flow = MyFlow.from_checkpoint("./flow_cp/20260407T120000_abc123.json")
|
||||
result = flow.kickoff()
|
||||
```
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(),
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
```python SqliteProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
|
||||
### Agent
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
max_checkpoints=50,
|
||||
),
|
||||
)
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
<Tip>
|
||||
SQLite enables WAL journal mode for concurrent reads. Prefer it for high-frequency checkpointing.
|
||||
</Tip>
|
||||
</Accordion>
|
||||
|
||||
## Storage Providers
|
||||
<Accordion title="Opt one agent out" icon="user-slash">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...),
|
||||
Agent(role="Writer", ..., checkpoint=False),
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
CrewAI includes two checkpoint storage providers.
|
||||
<Accordion title="Resume via the classmethod" icon="rotate-left">
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.from_checkpoint(config)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
### JsonProvider (default)
|
||||
<Accordion title="Fork into a new branch" icon="code-branch">
|
||||
`fork()` restores a checkpoint under a fresh lineage so the new run does not collide with the original.
|
||||
|
||||
Writes each checkpoint as a separate JSON file.
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.fork(config, branch="experiment-a")
|
||||
result = crew.kickoff(inputs={"strategy": "aggressive"})
|
||||
```
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
The `branch` label is optional; one is generated if omitted.
|
||||
</Accordion>
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(),
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
<Accordion title="Checkpoint a Crew, Flow, or Agent" icon="cubes">
|
||||
<Tabs>
|
||||
<Tab title="Crew">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
|
||||
### SqliteProvider
|
||||
Default trigger: `task_completed`.
|
||||
</Tab>
|
||||
<Tab title="Flow">
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
Stores all checkpoints in a single SQLite database file.
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
),
|
||||
)
|
||||
```
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
|
||||
config = CheckpointConfig(restore_from="./flow_cp/<file>.json")
|
||||
flow = MyFlow.from_checkpoint(config)
|
||||
result = flow.kickoff()
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Agent">
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
</Accordion>
|
||||
|
||||
## Event Types
|
||||
<Accordion title="Write a checkpoint manually" icon="code">
|
||||
Register a handler on any event and call `state.checkpoint()`.
|
||||
|
||||
The `on_events` field accepts any combination of event type strings. Common choices:
|
||||
<CodeGroup>
|
||||
```python Sync
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
```
|
||||
```python Async
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
The `state` argument is passed automatically when the handler accepts three parameters. See [Event Listeners](/ar/concepts/event-listener) for the full event list.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Browse, resume, and fork from the CLI" icon="terminal">
|
||||
```bash
|
||||
crewai checkpoint # auto-detects .checkpoints/ or .checkpoints.db
|
||||
crewai checkpoint --location ./my_checkpoints
|
||||
crewai checkpoint --location ./.checkpoints.db
|
||||
```
|
||||
|
||||
<Frame>
|
||||
<img src="/images/checkpointing.png" alt="Checkpoint TUI" />
|
||||
</Frame>
|
||||
|
||||
The left panel groups checkpoints by branch; forks nest under their parent. Selecting a checkpoint shows its metadata, entity state, and task progress. **Resume** continues the run; **Fork** starts a new branch.
|
||||
|
||||
The detail panel shows two editable areas:
|
||||
|
||||
- **Inputs**: the original kickoff inputs, pre-filled and editable.
|
||||
- **Task outputs**: outputs of completed tasks. Editing an output and pressing **Fork** invalidates the downstream tasks so they rerun with the edited context.
|
||||
|
||||
<Tip>
|
||||
Useful for what-if exploration: fork, edit, observe.
|
||||
</Tip>
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Inspect checkpoints without the TUI" icon="magnifying-glass">
|
||||
```bash
|
||||
crewai checkpoint list ./my_checkpoints
|
||||
crewai checkpoint info ./my_checkpoints/<file>.json
|
||||
crewai checkpoint info ./.checkpoints.db
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Reference
|
||||
|
||||
### `CheckpointConfig`
|
||||
|
||||
<ParamField path="location" type="str" default='"./.checkpoints"'>
|
||||
Storage destination. A directory for `JsonProvider`, a database file path for `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="on_events" type="list[str]" default='["task_completed"]'>
|
||||
Event types that trigger a checkpoint. See [Event types](#انواع-الاحداث).
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="provider" type="BaseProvider" default="JsonProvider()">
|
||||
Storage backend. `JsonProvider` or `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="max_checkpoints" type="int | None" default="None">
|
||||
Maximum number of checkpoints to keep. The oldest are pruned after each write.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="restore_from" type="Path | str | None" default="None">
|
||||
The checkpoint to restore when the config is passed via `from_checkpoint`.
|
||||
</ParamField>
|
||||
|
||||
### `checkpoint` field values
|
||||
|
||||
Accepted on `Crew`, `Flow`, and `Agent`.
|
||||
|
||||
<ParamField path="None" type="default">
|
||||
Inherits from the parent.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="True" type="bool">
|
||||
Enable with defaults.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="False" type="bool">
|
||||
Explicit opt-out. Stops inheritance.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="CheckpointConfig(...)" type="CheckpointConfig">
|
||||
Custom configuration.
|
||||
</ParamField>
|
||||
|
||||
### Event types
|
||||
|
||||
Common values for `on_events`:
|
||||
|
||||
| Use Case | Events |
|
||||
|:---------------|:--------|
|
||||
| After each task (Crew) | `["task_completed"]` |
|
||||
| After each task | `["task_completed"]` |
|
||||
| After each flow method | `["method_execution_finished"]` |
|
||||
| After agent execution | `["agent_execution_completed"]`, `["lite_agent_execution_completed"]` |
|
||||
| On crew completion only | `["crew_kickoff_completed"]` |
|
||||
| After every LLM call | `["llm_call_completed"]` |
|
||||
| On everything | `["*"]` |
|
||||
| Everything | `["*"]` |
|
||||
|
||||
<Warning>
|
||||
Using `["*"]` or high-frequency events like `llm_call_completed` will write many checkpoint files and may affect performance. Use `max_checkpoints` to limit disk usage.
|
||||
`["*"]` and high-frequency events like `llm_call_completed` write many checkpoints and can hurt performance. Pair them with `max_checkpoints`.
|
||||
</Warning>
|
||||
|
||||
## Manual Checkpointing
|
||||
### Storage providers
|
||||
|
||||
For full control, register your own event handler and call `state.checkpoint()` directly:
|
||||
<ParamField path="JsonProvider" type="provider">
|
||||
One file per checkpoint, named `<timestamp>_<uuid>.json`, inside `location`.
|
||||
</ParamField>
|
||||
|
||||
```python
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
<ParamField path="SqliteProvider" type="provider">
|
||||
A single database file at `location` with WAL journaling.
|
||||
</ParamField>
|
||||
|
||||
# Sync handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
### CLI
|
||||
|
||||
# Async handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
```
|
||||
|
||||
The `state` argument is the `RuntimeState`, passed automatically by the event bus when the handler accepts 3 parameters. You can register handlers on any event type listed in the [Event Listeners](/ar/concepts/event-listener) documentation.
|
||||
|
||||
Checkpointing is best-effort: if a checkpoint write fails, the error is logged but execution continues uninterrupted.
|
||||
| Command | Purpose |
|
||||
|:------|:------|
|
||||
| `crewai checkpoint` | Launch the TUI; auto-detect storage. |
|
||||
| `crewai checkpoint --location <path>` | Launch the TUI on a specific location. |
|
||||
| `crewai checkpoint list <path>` | List checkpoints. |
|
||||
| `crewai checkpoint info <path>` | Inspect a checkpoint file or the latest entry in a SQLite database. |
|
||||
|
||||
@@ -5,301 +5,385 @@ icon: floppy-disk
|
||||
mode: "wide"
|
||||
---
|
||||
|
||||
<Warning>
|
||||
Checkpointing is in early release. APIs may change in future versions.
|
||||
</Warning>
|
||||
Checkpointing saves a snapshot of execution state during a run so a crew, flow, or agent can resume after a failure or be forked into an alternate branch.
|
||||
|
||||
## Overview
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Explanation" icon="lightbulb" href="#explanation">
|
||||
How checkpointing works: events, storage, and inheritance.
|
||||
</Card>
|
||||
<Card title="Tutorial" icon="graduation-cap" href="#tutorial-resume-a-failing-crew">
|
||||
A 5-minute walkthrough: run, interrupt, resume.
|
||||
</Card>
|
||||
<Card title="How-to guides" icon="screwdriver-wrench" href="#how-to-guides">
|
||||
Task-focused recipes for common workflows.
|
||||
</Card>
|
||||
<Card title="Reference" icon="book" href="#reference">
|
||||
`CheckpointConfig`, events, providers, and CLI.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
Checkpointing automatically saves execution state during a run. If a crew, flow, or agent fails mid-execution, you can restore from the last checkpoint and resume without re-running completed work.
|
||||
## Explanation
|
||||
|
||||
## Quick Start
|
||||
### What a checkpoint is
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
A checkpoint is a serialized snapshot of `RuntimeState` written at a point in execution. It records which tasks have completed, their outputs, the current inputs, and a lineage ID that identifies the run.
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=True, # uses defaults: ./.checkpoints, on task_completed
|
||||
)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
When you restore from a checkpoint, CrewAI rebuilds that state, skips already-completed work, and continues. When you fork from one, CrewAI restores the state under a new lineage so the new branch and the original run do not overwrite each other.
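The difference between resuming and forking can be sketched with a toy model. This is only an illustration under stated assumptions: `Snapshot`, `resume`, and `fork` are hypothetical names that use the standard library only, and they do not reflect how CrewAI's `RuntimeState` is implemented.

```python
import copy
import uuid
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Toy stand-in for a checkpoint; NOT CrewAI's RuntimeState."""
    lineage_id: str
    inputs: dict
    outputs: dict = field(default_factory=dict)  # task name -> saved output


def resume(snapshot, tasks):
    """Run only the tasks that have no saved output yet."""
    executed = []
    for name, fn in tasks:
        if name in snapshot.outputs:
            continue  # completed before the snapshot was taken: skip it
        snapshot.outputs[name] = fn(snapshot)
        executed.append(name)
    return executed


def fork(snapshot):
    """Copy the state under a new lineage ID, leaving the original untouched."""
    branch = copy.deepcopy(snapshot)
    branch.lineage_id = uuid.uuid4().hex
    return branch


tasks = [
    ("research", lambda s: "three bullet points"),
    ("write", lambda s: f"summary of {s.outputs['research']}"),
]

# The snapshot was taken after "research" finished.
saved = Snapshot(lineage_id="run-1", inputs={"topic": "AI"},
                 outputs={"research": "three bullet points"})
branch = fork(saved)
ran = resume(branch, tasks)
```

In this sketch only `write` runs on the branch, and the saved snapshot keeps its original lineage ID and outputs, which is the property that lets a fork and the original run coexist.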
|
||||
|
||||
Checkpoint files are written to `./.checkpoints/` after each completed task.
|
||||
### When checkpoints are written
|
||||
|
||||
## Configuration
|
||||
Checkpointing is event-driven. The runtime subscribes to events you select via `on_events` and writes a checkpoint each time one fires. The default `task_completed` produces one checkpoint per finished task — a sensible tradeoff between granularity and disk use. Higher-frequency events like `llm_call_completed` are available for fine-grained recovery but write far more files.
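As a rough illustration of the event-driven model, the sketch below wires a snapshot writer to a selected set of events. `TinyBus` and `enable_checkpointing` are hypothetical helpers written for this example; they are not `crewai_event_bus` or any CrewAI API, and the "checkpoints" are kept in a list instead of on disk.

```python
from collections import defaultdict


class TinyBus:
    """Minimal publish/subscribe bus; hypothetical, not crewai_event_bus."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event_type, state):
        for handler in self._handlers[event_type]:
            handler(state)


def enable_checkpointing(bus, on_events, sink):
    """Subscribe one snapshot writer per selected event type."""
    for event_type in on_events:
        # Bind event_type as a default argument so each handler keeps its own value.
        bus.on(event_type, lambda state, et=event_type: sink.append((et, dict(state))))


bus = TinyBus()
written = []  # stands in for checkpoint files
enable_checkpointing(bus, ["task_completed"], written)

state = {"completed": []}
for task in ["research", "write"]:
    bus.emit("llm_call_completed", state)  # not selected: nothing is written
    state["completed"] = state["completed"] + [task]
    bus.emit("task_completed", state)      # selected: one snapshot per task
```

Adding `"llm_call_completed"` to the selected events in this sketch would double the number of snapshots, which mirrors the granularity versus disk-use tradeoff described above.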
|
||||
|
||||
Use `CheckpointConfig` for full control:
|
||||
### Storage
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
Two providers ship with CrewAI:
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
- `JsonProvider` writes one file per checkpoint. Human-readable and easy to inspect.
|
||||
- `SqliteProvider` writes to a single SQLite database. Better for high-frequency checkpointing.
|
||||
|
||||
### CheckpointConfig Fields
|
||||
Both prune oldest checkpoints when `max_checkpoints` is set.
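A minimal sketch of what pruning could look like for one-file-per-checkpoint storage follows. It assumes file names follow the `<timestamp>_<uuid>.json` pattern so that sorting by name orders files from oldest to newest; `write_checkpoint` is a hypothetical helper, not the provider's actual code.

```python
import json
import tempfile
from pathlib import Path


def write_checkpoint(directory, timestamp, run_id, state, max_checkpoints=None):
    """Write one JSON file, then delete the oldest files beyond the limit."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{timestamp}_{run_id}.json"
    path.write_text(json.dumps(state))
    if max_checkpoints is not None:
        files = sorted(directory.glob("*.json"))  # timestamp prefix sorts oldest first
        excess = max(len(files) - max_checkpoints, 0)
        for stale in files[:excess]:
            stale.unlink()
    return path


with tempfile.TemporaryDirectory() as tmp:
    location = Path(tmp) / "my_checkpoints"
    for i in range(5):
        write_checkpoint(location, f"20260407T12000{i}", "abc123",
                         {"completed_tasks": i + 1}, max_checkpoints=3)
    kept = sorted(p.name for p in location.glob("*.json"))
```

After five writes with a limit of three, only the three newest files remain in this sketch.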
|
||||
|
||||
| Field | Type | Default | Description |
|
||||
|:------|:-----|:--------|:------------|
|
||||
| `location` | `str` | `"./.checkpoints"` | Storage destination — a directory for `JsonProvider`, a database file path for `SqliteProvider` |
|
||||
| `on_events` | `list[str]` | `["task_completed"]` | Event types that trigger a checkpoint |
|
||||
| `provider` | `BaseProvider` | `JsonProvider()` | Storage backend |
|
||||
| `max_checkpoints` | `int \| None` | `None` | Max checkpoints to keep. Oldest are pruned after each write. Pruning is handled by the provider. |
|
||||
| `restore_from` | `Path \| str \| None` | `None` | Path to a checkpoint to restore from. Used when passing config via a kickoff method's `from_checkpoint` parameter. |
|
||||
<Note>
|
||||
Checkpoint writes are best-effort. A failed checkpoint is logged but does not interrupt the run.
|
||||
</Note>
|
||||
|
||||
### Inheritance and Opt-Out
|
||||
### Inheritance model
|
||||
|
||||
The `checkpoint` field on Crew, Flow, and Agent accepts `CheckpointConfig`, `True`, `False`, or `None`:
|
||||
`Crew`, `Flow`, and `Agent` all accept a `checkpoint` argument. Children inherit from their parent unless they set their own value or pass `False` to opt out. Enable checkpointing once on the crew and every agent participates, or selectively exclude one agent.
|
||||
|
||||
| Value | Behavior |
|
||||
|:------|:---------|
|
||||
| `None` (default) | Inherit from parent. An agent inherits its crew's config. |
|
||||
| `True` | Enable with defaults. |
|
||||
| `False` | Explicit opt-out. Stops inheritance from parent. |
|
||||
| `CheckpointConfig(...)` | Custom configuration. |
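The four accepted values can be read as a small resolution rule. The sketch below encodes the table above in plain Python; `Config` and `resolve` are hypothetical names used for illustration and are not part of CrewAI.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Config:
    """Hypothetical stand-in for CheckpointConfig with two of its fields."""
    location: str = "./.checkpoints"
    on_events: tuple = ("task_completed",)


CheckpointValue = Union[Config, bool, None]


def resolve(own: CheckpointValue, parent: Optional[Config]) -> Optional[Config]:
    """Return the effective config for a child, or None when checkpointing is off."""
    if own is None:
        return parent      # inherit whatever the parent resolved to
    if own is False:
        return None        # explicit opt-out stops inheritance
    if own is True:
        return Config()    # enable with defaults
    return own             # custom configuration


crew_config = resolve(Config(location="./crew_cp"), parent=None)
researcher = resolve(None, parent=crew_config)   # inherits the crew's config
writer = resolve(False, parent=crew_config)      # opted out
standalone = resolve(True, parent=None)          # defaults
```

Under these assumptions, the researcher ends up with the crew's `./crew_cp` location, the writer has no checkpointing at all, and an entity with no parent and no value of its own stays disabled.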
|
||||
## Tutorial: Resume a failing crew
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...), # inherits crew's checkpoint
|
||||
Agent(role="Writer", ..., checkpoint=False), # opted out, no checkpoints
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
This walkthrough takes ~5 minutes. You will run a two-task crew, kill it midway, and resume from the saved checkpoint.
|
||||
|
||||
## Resuming from a Checkpoint
|
||||
<Steps>
|
||||
<Step title="Create the crew with checkpointing enabled">
|
||||
```python
|
||||
from crewai import Agent, Crew, Task
|
||||
|
||||
Pass a `CheckpointConfig` with `restore_from` to any kickoff method. The crew restores from that checkpoint, skips completed tasks, and resumes.
|
||||
researcher = Agent(role="Researcher", goal="Research", backstory="Expert")
|
||||
writer = Agent(role="Writer", goal="Write", backstory="Expert")
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[
|
||||
Task(description="Research AI trends", agent=researcher, expected_output="bullets"),
|
||||
Task(description="Write a summary", agent=writer, expected_output="paragraph"),
|
||||
],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Step>
|
||||
<Step title="Run it and interrupt after the first task">
|
||||
```python
|
||||
result = crew.kickoff()
|
||||
```
|
||||
|
||||
crew = Crew(agents=[...], tasks=[...])
|
||||
result = crew.kickoff(
|
||||
from_checkpoint=CheckpointConfig(
|
||||
restore_from="./my_checkpoints/20260407T120000_abc123.json",
|
||||
),
|
||||
)
|
||||
```
|
||||
Press `Ctrl+C` after the first task finishes. Look in `./.checkpoints/` — a file named `<timestamp>_<uuid>.json` is the checkpoint.
|
||||
</Step>
|
||||
<Step title="Resume from the checkpoint">
|
||||
```python
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
Remaining `CheckpointConfig` fields apply to the new run, so checkpointing continues after the restore.
|
||||
result = crew.kickoff(
|
||||
from_checkpoint=CheckpointConfig(
|
||||
restore_from="./.checkpoints/<timestamp>_<uuid>.json",
|
||||
),
|
||||
)
|
||||
```
|
||||
|
||||
You can also use the classmethod directly:
|
||||
The research task is skipped, the writer runs against the saved research output, and the crew finishes.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/20260407T120000_abc123.json")
|
||||
crew = Crew.from_checkpoint(config)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
## How-to guides
|
||||
|
||||
## Forking from a Checkpoint
|
||||
<AccordionGroup>
|
||||
<Accordion title="Enable checkpointing with defaults" icon="play">
|
||||
```python
|
||||
crew = Crew(agents=[...], tasks=[...], checkpoint=True)
|
||||
```
|
||||
|
||||
`fork()` restores a checkpoint and starts a new execution branch. Useful for exploring alternative paths from the same point.
|
||||
Writes to `./.checkpoints/` on every `task_completed`.
|
||||
</Accordion>
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
<Accordion title="Customize storage and frequency" icon="sliders">
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/20260407T120000_abc123.json")
|
||||
crew = Crew.fork(config, branch="experiment-a")
|
||||
result = crew.kickoff(inputs={"strategy": "aggressive"})
|
||||
```
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
Each fork gets a unique lineage ID so checkpoints from different branches don't collide. The `branch` label is optional and auto-generated if omitted.
|
||||
<Accordion title="Choose a storage provider" icon="database">
|
||||
<CodeGroup>
|
||||
```python JsonProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
|
||||
## Works on Crew, Flow, and Agent
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(),
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
```python SqliteProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
|
||||
### Crew
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
max_checkpoints=50,
|
||||
),
|
||||
)
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
<Tip>
|
||||
SQLite enables WAL journal mode for concurrent reads. Prefer it for high-frequency checkpointing.
|
||||
</Tip>
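For readers unfamiliar with WAL mode, here is a small standard-library sketch of single-file checkpoint storage with write-ahead logging enabled. The table name and columns are assumptions made for this example and do not describe the schema `SqliteProvider` actually uses.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "checkpoints.db"
    conn = sqlite3.connect(db_path)
    # WAL lets readers keep reading while a writer appends.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        "CREATE TABLE checkpoints ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, data TEXT NOT NULL)"
    )
    for step in range(4):
        with conn:  # one transaction per checkpoint write
            conn.execute(
                "INSERT INTO checkpoints (data) VALUES (?)",
                (json.dumps({"step": step}),),
            )
    latest = json.loads(
        conn.execute(
            "SELECT data FROM checkpoints ORDER BY id DESC LIMIT 1"
        ).fetchone()[0]
    )
    count = conn.execute("SELECT COUNT(*) FROM checkpoints").fetchone()[0]
    conn.close()
```

Every write lands in the same database file, so frequent checkpoints do not multiply the number of files on disk.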
|
||||
</Accordion>
|
||||
|
||||
Default trigger: `task_completed` (one checkpoint per finished task).
|
||||
<Accordion title="Opt one agent out" icon="user-slash">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...),
|
||||
Agent(role="Writer", ..., checkpoint=False),
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
### Flow
|
||||
<Accordion title="Resume via the classmethod" icon="rotate-left">
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.from_checkpoint(config)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
<Accordion title="Fork into a new branch" icon="code-branch">
|
||||
`fork()` restores a checkpoint under a fresh lineage so the new run does not collide with the original.
|
||||
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.fork(config, branch="experiment-a")
|
||||
result = crew.kickoff(inputs={"strategy": "aggressive"})
|
||||
```
|
||||
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
The `branch` label is optional; one is generated if omitted.
|
||||
</Accordion>
|
||||
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
<Accordion title="Checkpoint a Crew, Flow, or Agent" icon="cubes">
|
||||
<Tabs>
|
||||
<Tab title="Crew">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
|
||||
# Resume
|
||||
config = CheckpointConfig(restore_from="./flow_cp/20260407T120000_abc123.json")
|
||||
flow = MyFlow.from_checkpoint(config)
|
||||
result = flow.kickoff()
|
||||
```
|
||||
Default trigger: `task_completed`.
|
||||
</Tab>
|
||||
<Tab title="Flow">
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
### Agent
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
|
||||
## Storage Providers
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
|
||||
CrewAI ships with two checkpoint storage providers.
|
||||
config = CheckpointConfig(restore_from="./flow_cp/<file>.json")
|
||||
flow = MyFlow.from_checkpoint(config)
|
||||
result = flow.kickoff()
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Agent">
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
</Accordion>
|
||||
|
||||
### JsonProvider (default)
|
||||
<Accordion title="Write a checkpoint manually" icon="code">
|
||||
Register a handler on any event and call `state.checkpoint()`.
|
||||
|
||||
Writes each checkpoint as a separate JSON file. Simple, human-readable, easy to inspect.
|
||||
<CodeGroup>
|
||||
```python Sync
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
```
|
||||
```python Async
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(), # this is the default
|
||||
max_checkpoints=5, # prunes oldest files
|
||||
),
|
||||
)
|
||||
```
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
Files are named `<timestamp>_<uuid>.json` inside the location directory.
|
||||
A `state` argument is supplied automatically when the handler takes three parameters. See [Event Listeners](/en/concepts/event-listener) for the full event catalog.
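
The three-parameter rule can be pictured as a small arity check. This is a minimal sketch, not CrewAI's event bus: `dispatch` and both handlers are hypothetical names, and the only assumption is that the decision is made from the number of positional parameters a handler declares.

```python
import inspect


def dispatch(handler, source, event, state):
    """Call handler with (source, event) or (source, event, state).

    Hypothetical helper: the choice is made from the number of positional
    parameters the handler declares, mirroring the rule described above.
    """
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    params = [
        p
        for p in inspect.signature(handler).parameters.values()
        if p.kind in positional_kinds
    ]
    if len(params) >= 3:
        return handler(source, event, state)
    return handler(source, event)


def two_arg_handler(source, event):
    # Declares two parameters, so it never sees the state.
    return ("no state", event)


def three_arg_handler(source, event, state):
    # Declares three parameters, so the state is passed in.
    return ("with state", state)
```

Calling `dispatch(three_arg_handler, "crew", "evt", {"done": 1})` passes the state through, while the two-parameter handler is called without it.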
|
||||
</Accordion>
|
||||
|
||||
### SqliteProvider
|
||||
<Accordion title="Browse, resume, and fork from the CLI" icon="terminal">
|
||||
```bash
|
||||
crewai checkpoint # auto-detects .checkpoints/ or .checkpoints.db
|
||||
crewai checkpoint --location ./my_checkpoints
|
||||
crewai checkpoint --location ./.checkpoints.db
|
||||
```
|
||||
|
||||
Stores all checkpoints in a single SQLite database file. Better for high-frequency checkpointing and avoids many small files.
|
||||
<Frame>
|
||||
<img src="/images/checkpointing.png" alt="Checkpoint TUI" />
|
||||
</Frame>
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
The left panel groups checkpoints by branch; forks nest under their parent. Selecting a checkpoint shows its metadata, entity state, and task progress. **Resume** continues the run; **Fork** starts a new branch.
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
max_checkpoints=50,
|
||||
),
|
||||
)
|
||||
```
|
||||
The detail panel exposes two editable areas:
|
||||
|
||||
WAL journal mode is enabled for concurrent read access.
|
||||
- **Inputs** — original kickoff inputs, pre-filled and editable.
|
||||
- **Task outputs** — outputs of completed tasks. Editing an output and hitting **Fork** invalidates downstream tasks so they re-run against the modified context.
|
||||
|
||||
## Event Types
|
||||
<Tip>
|
||||
Useful for "what if" exploration: fork, tweak, observe.
|
||||
</Tip>
|
||||
</Accordion>
|
||||
|
||||
The `on_events` field accepts any combination of event type strings. Common choices:
|
||||
<Accordion title="Inspect checkpoints without the TUI" icon="magnifying-glass">
|
||||
```bash
|
||||
crewai checkpoint list ./my_checkpoints
|
||||
crewai checkpoint info ./my_checkpoints/<file>.json
|
||||
crewai checkpoint info ./.checkpoints.db
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
| Use Case | Events |
|
||||
## Reference
|
||||
|
||||
### `CheckpointConfig`
|
||||
|
||||
<ParamField path="location" type="str" default='"./.checkpoints"'>
|
||||
Storage destination. A directory for `JsonProvider`, a database file path for `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="on_events" type="list[str]" default='["task_completed"]'>
|
||||
Event types that trigger a checkpoint. See [event types](#event-types).
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="provider" type="BaseProvider" default="JsonProvider()">
|
||||
Storage backend. Either `JsonProvider` or `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="max_checkpoints" type="int | None" default="None">
|
||||
Maximum checkpoints to retain. Oldest are pruned after each write.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="restore_from" type="Path | str | None" default="None">
|
||||
Checkpoint to restore from when passed via `from_checkpoint`.
|
||||
</ParamField>
|
||||
|
||||
### `checkpoint` field values
|
||||
|
||||
Accepted by `Crew`, `Flow`, and `Agent`.
|
||||
|
||||
<ParamField path="None" type="default">
|
||||
Inherit from parent.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="True" type="bool">
|
||||
Enable with defaults.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="False" type="bool">
|
||||
Explicit opt-out. Stops inheritance.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="CheckpointConfig(...)" type="CheckpointConfig">
|
||||
Custom configuration.
|
||||
</ParamField>
|
||||
|
||||
### Event types
|
||||
|
||||
Common values for `on_events`:
|
||||
|
||||
| Use case | Events |
|
||||
|:---------|:-------|
|
||||
| After each task (Crew) | `["task_completed"]` |
|
||||
| After each task | `["task_completed"]` |
|
||||
| After each flow method | `["method_execution_finished"]` |
|
||||
| After agent execution | `["agent_execution_completed"]`, `["lite_agent_execution_completed"]` |
|
||||
| On crew completion only | `["crew_kickoff_completed"]` |
|
||||
| After every LLM call | `["llm_call_completed"]` |
|
||||
| On everything | `["*"]` |
|
||||
| Everything | `["*"]` |
|
||||
|
||||
<Warning>
|
||||
Using `["*"]` or high-frequency events like `llm_call_completed` will write many checkpoint files and may impact performance. Use `max_checkpoints` to limit disk usage.
|
||||
`["*"]` and high-frequency events like `llm_call_completed` write many checkpoints and can degrade performance. Pair them with `max_checkpoints`.
|
||||
</Warning>
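
The pruning that `max_checkpoints` implies can be sketched for a directory of JSON files. This is an illustration under stated assumptions, not the provider's code: `prune_oldest` is a hypothetical helper, and it assumes file names begin with a sortable timestamp so that name order is age order.

```python
from pathlib import Path


def prune_oldest(directory, max_checkpoints):
    """Delete the oldest *.json files so at most max_checkpoints remain.

    Assumes file names start with a sortable timestamp, so lexicographic
    order is chronological order. Returns the names that were removed.
    """
    if max_checkpoints is None:
        # None means "keep everything", matching the documented default.
        return []
    files = sorted(Path(directory).glob("*.json"))
    excess = files[: max(0, len(files) - max_checkpoints)]
    for path in excess:
        path.unlink()
    return [path.name for path in excess]
```

Running a step like this after every write keeps disk usage bounded even when a high-frequency event is selected.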
|
||||
|
||||
## Manual Checkpointing
|
||||
### Storage providers
|
||||
|
||||
For full control, register your own event handler and call `state.checkpoint()` directly:
|
||||
<ParamField path="JsonProvider" type="provider">
|
||||
One file per checkpoint, named `<timestamp>_<uuid>.json` inside `location`.
|
||||
</ParamField>
|
||||
|
||||
```python
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
<ParamField path="SqliteProvider" type="provider">
|
||||
Single database file at `location` with WAL journaling.
|
||||
</ParamField>
|
||||
|
||||
# Sync handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
### CLI
|
||||
|
||||
# Async handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
```
|
||||
|
||||
The `state` argument is the `RuntimeState` passed automatically by the event bus when your handler accepts 3 parameters. You can register handlers on any event type listed in the [Event Listeners](/en/concepts/event-listener) documentation.
|
||||
|
||||
Checkpointing is best-effort: if a checkpoint write fails, the error is logged but execution continues uninterrupted.
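
Best-effort behavior can be pictured as a wrapper that logs a failed write and lets the run continue. The sketch below is not CrewAI's implementation; `best_effort_checkpoint`, `failing_write`, and `working_write` are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger("checkpoint_sketch")


def best_effort_checkpoint(write, *args, **kwargs):
    """Run a checkpoint writer; log and swallow any failure.

    `write` is any callable that persists state and returns a location.
    Returns that location on success, or None if the write failed.
    """
    try:
        return write(*args, **kwargs)
    except Exception:
        # Log with traceback, then let the caller carry on with the run.
        logger.exception("checkpoint write failed; continuing")
        return None


def failing_write(location):
    # Simulates a disk or permission error.
    raise OSError(f"cannot write to {location}")


def working_write(location):
    return f"{location}/checkpoint.json"
```

The trade-off is that a run can finish without a usable checkpoint, so the logs are the place to look if a resume finds nothing to restore.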
|
||||
|
||||
## CLI
|
||||
|
||||
The `crewai checkpoint` command gives you a TUI for browsing, inspecting, resuming, and forking checkpoints. It auto-detects whether your checkpoints are JSON files or a SQLite database.
|
||||
|
||||
```bash
|
||||
# Launch the TUI — auto-detects .checkpoints/ or .checkpoints.db
|
||||
crewai checkpoint
|
||||
|
||||
# Point at a specific location
|
||||
crewai checkpoint --location ./my_checkpoints
|
||||
crewai checkpoint --location ./.checkpoints.db
|
||||
```
|
||||
|
||||
<Frame>
|
||||
<img src="/images/checkpointing.png" alt="Checkpoint TUI" />
|
||||
</Frame>
|
||||
|
||||
The left panel is a tree view. Checkpoints are grouped by branch, and forks nest under the checkpoint they diverged from. Select a checkpoint to see its metadata, entity state, and task progress in the detail panel. Hit **Resume** to pick up where it left off, or **Fork** to start a new branch from that point.
|
||||
|
||||
### Editing inputs and task outputs
|
||||
|
||||
When a checkpoint is selected, the detail panel shows:
|
||||
|
||||
- **Inputs** — if the original kickoff had inputs (e.g. `{topic}`), they appear as editable fields pre-filled with the original values. Change them before resuming or forking.
|
||||
- **Task outputs** — completed tasks show their output in editable text areas. Edit a task's output to change the context that downstream tasks receive. When you modify a task output and hit Fork, all subsequent tasks are invalidated and re-run with the new context.
|
||||
|
||||
This is useful for "what if" exploration — fork from a checkpoint, tweak a task's result, and see how it changes downstream behavior.
|
||||
|
||||
### Subcommands
|
||||
|
||||
```bash
|
||||
# List all checkpoints
|
||||
crewai checkpoint list ./my_checkpoints
|
||||
|
||||
# Inspect a specific checkpoint
|
||||
crewai checkpoint info ./my_checkpoints/20260407T120000_abc123.json
|
||||
|
||||
# Inspect latest in a SQLite database
|
||||
crewai checkpoint info ./.checkpoints.db
|
||||
```
|
||||
| Command | Purpose |
|
||||
|:--------|:--------|
|
||||
| `crewai checkpoint` | Launch the TUI; auto-detect storage. |
|
||||
| `crewai checkpoint --location <path>` | Launch the TUI against a specific location. |
|
||||
| `crewai checkpoint list <path>` | List checkpoints. |
|
||||
| `crewai checkpoint info <path>` | Inspect a checkpoint file or the latest entry in a SQLite database. |
|
||||
|
||||
@@ -5,194 +5,360 @@ icon: floppy-disk
|
||||
mode: "wide"
|
||||
---
|
||||
|
||||
<Warning>
|
||||
Checkpointing is in early release. APIs may change in future versions.
|
||||
</Warning>
|
||||
Checkpointing saves a snapshot of execution state during a run so that a crew, flow, or agent can resume after a failure or be forked into an alternative branch.
|
||||
|
||||
## 개요
|
||||
<CardGroup cols={2}>
|
||||
<Card title="설명" icon="lightbulb" href="#설명">
|
||||
How checkpointing works: events, storage, and inheritance.
|
||||
</Card>
|
||||
<Card title="튜토리얼" icon="graduation-cap" href="#튜토리얼-실패한-크루-재개하기">
|
||||
A 5-minute guide: run, interrupt, resume.
|
||||
</Card>
|
||||
<Card title="사용 방법" icon="screwdriver-wrench" href="#사용-방법">
|
||||
Task-focused recipes for common workflows.
|
||||
</Card>
|
||||
<Card title="레퍼런스" icon="book" href="#레퍼런스">
|
||||
`CheckpointConfig`, events, providers, and the CLI.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
Checkpointing automatically saves execution state during a run. If a crew, flow, or agent fails mid-run, you can restore from the last checkpoint and resume without re-running completed work.
|
||||
## 설명
|
||||
|
||||
## 빠른 시작
|
||||
### 체크포인트란
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
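A checkpoint is a serialized snapshot of `RuntimeState` written at a specific point in execution. It records which tasks have completed, their outputs, the current inputs, and a lineage ID that identifies the run.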
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=True, # uses the defaults: ./.checkpoints, on task_completed
|
||||
)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
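When you restore from a checkpoint, CrewAI rebuilds that state, skips work that has already completed, and continues. When you fork, CrewAI restores the state under a new lineage so that the new branch and the original run do not overwrite each other.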
|
||||
|
||||
A checkpoint file is written to `./.checkpoints/` after each task completes.
|
||||
### 체크포인트가 기록되는 시점
|
||||
|
||||
## 설정
|
||||
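Checkpointing is event-driven. The runtime subscribes to the events selected in `on_events` and writes a checkpoint each time one fires. The default, `task_completed`, produces one checkpoint per completed task, a reasonable balance between granularity and disk usage. High-frequency events such as `llm_call_completed` are available for finer-grained recovery but write many more files.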
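
The event-driven model can be illustrated with a toy listener. This is a sketch, not the CrewAI runtime: `CheckpointListener` and `run` are hypothetical names, the in-memory `written` list stands in for files on disk, and the event names are the ones used on this page.

```python
class CheckpointListener:
    """Record a checkpoint whenever a subscribed event name fires."""

    def __init__(self, on_events):
        self.on_events = set(on_events)
        self.written = []  # stands in for files on disk

    def wants(self, event_name):
        # "*" subscribes to everything, as in the event table.
        return "*" in self.on_events or event_name in self.on_events

    def emit(self, event_name, state):
        if self.wants(event_name):
            # Copy the state so later changes do not alter the snapshot.
            self.written.append((event_name, dict(state)))


def run(listener):
    """Simulate a two-task crew emitting events as it progresses."""
    state = {"completed": []}
    for task in ("research", "write"):
        listener.emit("llm_call_completed", state)
        state["completed"] = state["completed"] + [task]
        listener.emit("task_completed", state)
    listener.emit("crew_kickoff_completed", state)
    return listener.written
```

With `on_events=["task_completed"]` the simulated run records one snapshot per task; with `["*"]` it records one for every event emitted, which is why the wildcard is paired with a retention limit.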
|
||||
|
||||
Use `CheckpointConfig` to control the details:
|
||||
### 스토리지
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
CrewAI ships with two providers:
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
- `JsonProvider` writes one file per checkpoint. Human-readable and easy to inspect.
|
||||
- `SqliteProvider` writes to a single SQLite database. Well suited to high-frequency checkpointing.
|
||||
|
||||
### CheckpointConfig 필드
|
||||
When `max_checkpoints` is set, both providers automatically prune the oldest checkpoints.
|
||||
|
||||
| Field | Type | Default | Description |
|
||||
|:-----|:-----|:-------|:-----|
|
||||
| `location` | `str` | `"./.checkpoints"` | Path to the checkpoint files |
|
||||
| `on_events` | `list[str]` | `["task_completed"]` | Event types that trigger a checkpoint |
|
||||
| `provider` | `BaseProvider` | `JsonProvider()` | Storage backend |
|
||||
| `max_checkpoints` | `int \| None` | `None` | Maximum number of files to keep; the oldest are deleted first |
|
||||
<Note>
|
||||
Checkpoint writes are best-effort. A failed checkpoint is logged but does not interrupt execution.
|
||||
</Note>
|
||||
|
||||
### 상속 및 옵트아웃
|
||||
### 상속 모델
|
||||
|
||||
The `checkpoint` field on Crew, Flow, and Agent accepts `CheckpointConfig`, `True`, `False`, or `None`:
|
||||
`Crew`, `Flow`, and `Agent` all accept a `checkpoint` argument. A child inherits from its parent unless it sets its own value or opts out by passing `False`. Enable checkpointing once on the crew and every agent participates, or exclude specific agents selectively.
|
||||
|
||||
| Value | Behavior |
|
||||
|:---|:-----|
|
||||
| `None` (default) | Inherit from the parent. An agent inherits the crew's setting. |
|
||||
| `True` | Enable with defaults. |
|
||||
| `False` | Explicit opt-out. Stops inheritance from the parent. |
|
||||
| `CheckpointConfig(...)` | Custom configuration. |
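
The four accepted values can be read as a small resolution rule. The sketch below is an assumption about how such a rule might look, not CrewAI's code: `resolve_checkpoint` is a hypothetical function, and a plain dict stands in for `CheckpointConfig`, seeded with the defaults documented on this page.

```python
DEFAULTS = {"location": "./.checkpoints", "on_events": ["task_completed"]}


def resolve_checkpoint(child, parent):
    """Return the effective setting for a child given its parent's setting.

    None   -> inherit whatever the parent resolved to
    False  -> opt out, regardless of the parent
    True   -> enable with defaults
    other  -> treat as a custom configuration and use it as-is
    """
    if child is None:
        return parent
    if child is False:
        return None
    if child is True:
        return dict(DEFAULTS)
    return child


# A crew enables checkpointing; one agent inherits, one opts out.
crew_config = resolve_checkpoint(True, None)
researcher_config = resolve_checkpoint(None, crew_config)
writer_config = resolve_checkpoint(False, crew_config)
```

Here `researcher_config` is the crew's configuration and `writer_config` is `None`, which matches the opt-out example in this section.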
|
||||
## 튜토리얼: 실패한 크루 재개하기
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...), # inherits the crew's checkpoint setting
|
||||
Agent(role="Writer", ..., checkpoint=False), # opted out, no checkpoints
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
This guide takes about 5 minutes. You run a crew with two tasks, stop it partway through, and then resume from the saved checkpoint.
|
||||
|
||||
## 체크포인트에서 재개
|
||||
<Steps>
|
||||
<Step title="체크포인팅이 활성화된 크루를 생성합니다">
|
||||
```python
|
||||
from crewai import Agent, Crew, Task
|
||||
|
||||
```python
|
||||
# Restore and resume
|
||||
crew = Crew.from_checkpoint("./my_checkpoints/20260407T120000_abc123.json")
|
||||
result = crew.kickoff() # resumes after the last completed task
|
||||
```
|
||||
researcher = Agent(role="Researcher", goal="Research", backstory="Expert")
|
||||
writer = Agent(role="Writer", goal="Write", backstory="Expert")
|
||||
|
||||
The restored crew skips tasks that have already completed and resumes from the first incomplete task.
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[
|
||||
Task(description="Research AI trends", agent=researcher, expected_output="bullets"),
|
||||
Task(description="Write a summary", agent=writer, expected_output="paragraph"),
|
||||
],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Step>
|
||||
<Step title="실행하고 첫 번째 태스크 후에 중단합니다">
|
||||
```python
|
||||
result = crew.kickoff()
|
||||
```
|
||||
|
||||
## Crew, Flow, Agent에서 사용 가능
|
||||
Press `Ctrl+C` after the first task completes. In the `./.checkpoints/` directory, the files named `<timestamp>_<uuid>.json` are the checkpoints.
|
||||
</Step>
|
||||
<Step title="체크포인트에서 재개합니다">
|
||||
```python
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
### Crew
|
||||
result = crew.kickoff(
|
||||
from_checkpoint=CheckpointConfig(
|
||||
restore_from="./.checkpoints/<timestamp>_<uuid>.json",
|
||||
),
|
||||
)
|
||||
```
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
The research task is skipped, the writer runs against the saved research output, and the crew completes.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
Default trigger: `task_completed` (one checkpoint per completed task).
|
||||
## 사용 방법
|
||||
|
||||
### Flow
|
||||
<AccordionGroup>
|
||||
<Accordion title="기본값으로 체크포인팅 활성화" icon="play">
|
||||
```python
|
||||
crew = Crew(agents=[...], tasks=[...], checkpoint=True)
|
||||
```
|
||||
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
Writes to `./.checkpoints/` on every `task_completed` event.
|
||||
</Accordion>
|
||||
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
<Accordion title="스토리지와 빈도 사용자 정의" icon="sliders">
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
<Accordion title="스토리지 프로바이더 선택" icon="database">
|
||||
<CodeGroup>
|
||||
```python JsonProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
|
||||
# Resume
|
||||
flow = MyFlow.from_checkpoint("./flow_cp/20260407T120000_abc123.json")
|
||||
result = flow.kickoff()
|
||||
```
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(),
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
```python SqliteProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
|
||||
### Agent
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
max_checkpoints=50,
|
||||
),
|
||||
)
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
<Tip>
|
||||
SQLite enables WAL journal mode for concurrent reads. Prefer SQLite for high-frequency checkpointing.
|
||||
</Tip>
|
||||
</Accordion>
|
||||
|
||||
## 스토리지 프로바이더
|
||||
<Accordion title="특정 에이전트 옵트아웃" icon="user-slash">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...),
|
||||
Agent(role="Writer", ..., checkpoint=False),
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
CrewAI ships with two checkpoint storage providers.
|
||||
<Accordion title="classmethod로 재개" icon="rotate-left">
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.from_checkpoint(config)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
### JsonProvider (기본값)
|
||||
<Accordion title="새 브랜치로 포크" icon="code-branch">
|
||||
`fork()` restores a checkpoint under a new lineage so that the new run does not collide with the original.
|
||||
|
||||
Writes each checkpoint as a separate JSON file.
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.fork(config, branch="experiment-a")
|
||||
result = crew.kickoff(inputs={"strategy": "aggressive"})
|
||||
```
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
The `branch` label is optional; one is generated automatically if omitted.
|
||||
</Accordion>
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(),
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
<Accordion title="Crew, Flow, Agent 체크포인트" icon="cubes">
|
||||
<Tabs>
|
||||
<Tab title="Crew">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
|
||||
### SqliteProvider
|
||||
Default trigger: `task_completed`.
|
||||
</Tab>
|
||||
<Tab title="Flow">
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
Stores all checkpoints in a single SQLite database file.
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
),
|
||||
)
|
||||
```
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
|
||||
config = CheckpointConfig(restore_from="./flow_cp/<file>.json")
|
||||
flow = MyFlow.from_checkpoint(config)
|
||||
result = flow.kickoff()
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Agent">
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
</Accordion>
|
||||
|
||||
## 이벤트 타입
|
||||
<Accordion title="수동으로 체크포인트 기록" icon="code">
|
||||
Register a handler on any event and call `state.checkpoint()`.
|
||||
|
||||
The `on_events` field accepts any combination of event type strings. Common choices:
|
||||
<CodeGroup>
|
||||
```python Sync
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
```
|
||||
```python Async
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
A `state` argument is supplied automatically when the handler takes three parameters. See the [Event Listeners](/ko/concepts/event-listener) documentation for the full event catalog.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="CLI에서 탐색, 재개, 포크" icon="terminal">
|
||||
```bash
|
||||
crewai checkpoint # auto-detects .checkpoints/ or .checkpoints.db
|
||||
crewai checkpoint --location ./my_checkpoints
|
||||
crewai checkpoint --location ./.checkpoints.db
|
||||
```
|
||||
|
||||
<Frame>
|
||||
<img src="/images/checkpointing.png" alt="Checkpoint TUI" />
|
||||
</Frame>
|
||||
|
||||
The left panel groups checkpoints by branch, and forks nest under their parent. Selecting a checkpoint shows its metadata, entity state, and task progress. **Resume** continues the run; **Fork** starts a new branch.
|
||||
|
||||
The detail panel has two editable areas:
|
||||
|
||||
- **Inputs**: the original kickoff inputs, pre-filled and editable.
|
||||
- **Task outputs**: the outputs of completed tasks. Editing an output and pressing **Fork** invalidates downstream tasks so that they re-run against the modified context.
|
||||
|
||||
<Tip>
|
||||
Useful for "what if" exploration: fork, tweak, observe.
|
||||
</Tip>
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="TUI 없이 체크포인트 검사" icon="magnifying-glass">
|
||||
```bash
|
||||
crewai checkpoint list ./my_checkpoints
|
||||
crewai checkpoint info ./my_checkpoints/<file>.json
|
||||
crewai checkpoint info ./.checkpoints.db
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## 레퍼런스
|
||||
|
||||
### `CheckpointConfig`
|
||||
|
||||
<ParamField path="location" type="str" default='"./.checkpoints"'>
|
||||
Storage destination. A directory for `JsonProvider`, a database file path for `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="on_events" type="list[str]" default='["task_completed"]'>
|
||||
Event types that trigger a checkpoint. See [event types](#이벤트-타입).
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="provider" type="BaseProvider" default="JsonProvider()">
|
||||
Storage backend. Either `JsonProvider` or `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="max_checkpoints" type="int | None" default="None">
|
||||
Maximum number of checkpoints to retain. The oldest are pruned after each write.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="restore_from" type="Path | str | None" default="None">
|
||||
Checkpoint to restore from when passed via `from_checkpoint`.
|
||||
</ParamField>
|
||||
|
||||
### `checkpoint` 필드 값
|
||||
|
||||
Accepted by `Crew`, `Flow`, and `Agent`.
|
||||
|
||||
<ParamField path="None" type="기본값">
|
||||
Inherit from parent.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="True" type="bool">
|
||||
Enable with defaults.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="False" type="bool">
|
||||
Explicit opt-out. Stops inheritance.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="CheckpointConfig(...)" type="CheckpointConfig">
|
||||
Custom configuration.
|
||||
</ParamField>
|
||||
|
||||
### 이벤트 타입
|
||||
|
||||
Common values for `on_events`:
|
||||
|
||||
| Use case | Events |
|
||||
|:----------|:-------|
|
||||
| After each task (Crew) | `["task_completed"]` |
|
||||
| After each task | `["task_completed"]` |
|
||||
| After each flow method | `["method_execution_finished"]` |
|
||||
| After agent execution | `["agent_execution_completed"]`, `["lite_agent_execution_completed"]` |
|
||||
| On crew completion only | `["crew_kickoff_completed"]` |
|
||||
@@ -200,30 +366,24 @@ crew = Crew(
|
||||
| Everything | `["*"]` |
|
||||
|
||||
<Warning>
|
||||
Using `["*"]` or high-frequency events like `llm_call_completed` writes many checkpoint files and may impact performance. Use `max_checkpoints` to limit disk usage.
|
||||
`["*"]` and high-frequency events like `llm_call_completed` write many checkpoints and can degrade performance. Pair them with `max_checkpoints`.
|
||||
</Warning>
|
||||
|
||||
## 수동 체크포인팅
|
||||
### 스토리지 프로바이더
|
||||
|
||||
For full control, register your own event handler and call `state.checkpoint()` directly:
|
||||
<ParamField path="JsonProvider" type="provider">
|
||||
One file per checkpoint, named `<timestamp>_<uuid>.json` inside `location`.
|
||||
</ParamField>
|
||||
|
||||
```python
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
<ParamField path="SqliteProvider" type="provider">
|
||||
Single database file at `location` with WAL journaling.
|
||||
</ParamField>
|
||||
|
||||
# Sync handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
### CLI
|
||||
|
||||
# Async handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Saved checkpoint: {path}")
|
||||
```
|
||||
|
||||
The `state` argument is the `RuntimeState` that the event bus passes automatically when your handler accepts three parameters. You can register handlers on any event type listed in the [Event Listeners](/ko/concepts/event-listener) documentation.
|
||||
|
||||
Checkpointing is best-effort: if a checkpoint write fails, the error is logged but execution continues uninterrupted.
|
||||
| Command | Purpose |
|
||||
|:-----|:-----|
|
||||
| `crewai checkpoint` | Launch the TUI; auto-detect storage. |
|
||||
| `crewai checkpoint --location <path>` | Launch the TUI against a specific location. |
|
||||
| `crewai checkpoint list <path>` | List checkpoints. |
|
||||
| `crewai checkpoint info <path>` | Inspect a checkpoint file or the latest entry in a SQLite database. |
|
||||
|
||||
@@ -5,225 +5,385 @@ icon: floppy-disk
|
||||
mode: "wide"
|
||||
---
|
||||
|
||||
<Warning>
|
||||
Checkpointing is in early release. APIs may change in future versions.
|
||||
</Warning>
|
||||
Checkpointing saves a snapshot of execution state during a run so that a crew, flow, or agent can resume after a failure or be forked into an alternative branch.
|
||||
|
||||
## Visao Geral
|
||||
<CardGroup cols={2}>
|
||||
<Card title="Explicacao" icon="lightbulb" href="#explicacao">
|
||||
How checkpointing works: events, storage, and inheritance.
|
||||
</Card>
|
||||
<Card title="Tutorial" icon="graduation-cap" href="#tutorial-retomar-uma-crew-com-falha">
|
||||
A 5-minute walkthrough: run, interrupt, resume.
|
||||
</Card>
|
||||
<Card title="Guias de uso" icon="screwdriver-wrench" href="#guias-de-uso">
|
||||
Task-focused recipes for common workflows.
|
||||
</Card>
|
||||
<Card title="Referencia" icon="book" href="#referencia">
|
||||
`CheckpointConfig`, events, providers, and the CLI.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
Checkpointing automatically saves execution state during a run. If a crew, flow, or agent fails mid-run, you can restore from the last checkpoint and resume without re-running completed work.
|
||||
## Explicacao
|
||||
|
||||
## Inicio Rapido
|
||||
### O que e um checkpoint
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
A checkpoint is a serialized snapshot of `RuntimeState` written at a point in execution. It records which tasks have completed, their outputs, the current inputs, and a lineage ID that identifies the run.
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=True, # uses the defaults: ./.checkpoints, on task_completed
|
||||
)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
When you restore from a checkpoint, CrewAI rebuilds that state, skips work that has already completed, and continues. When you fork, CrewAI restores the state under a new lineage so that the new branch and the original run do not overlap.
|
||||
|
||||
Checkpoint files are written to `./.checkpoints/` after each completed task.
|
||||
### Quando os checkpoints sao gravados
|
||||
|
||||
## Configuracao
|
||||
Checkpointing is event-driven. The runtime subscribes to the events selected in `on_events` and writes a checkpoint each time one fires. The default, `task_completed`, produces one checkpoint per finished task, a reasonable balance between granularity and disk usage. High-frequency events such as `llm_call_completed` are available for finer-grained recovery but write many more files.
|
||||
|
||||
Use `CheckpointConfig` for full control:
|
||||
### Storage
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
CrewAI ships with two providers:
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
- `JsonProvider` writes one file per checkpoint. Human-readable and easy to inspect.
|
||||
- `SqliteProvider` writes to a single SQLite database. Better suited to high-frequency checkpointing.
|
||||
|
||||
### CheckpointConfig fields
|
||||
Both remove the oldest checkpoints when `max_checkpoints` is set.
|
||||
|
||||
| Field | Type | Default | Description |
|:------|:-----|:-------|:----------|
| `location` | `str` | `"./.checkpoints"` | Path for the checkpoint files |
| `on_events` | `list[str]` | `["task_completed"]` | Event types that trigger a checkpoint |
| `provider` | `BaseProvider` | `JsonProvider()` | Storage backend |
| `max_checkpoints` | `int \| None` | `None` | Maximum number of files to keep; the oldest are removed first |
|
||||
<Note>
|
||||
Checkpoint writes are best-effort. A failed checkpoint write is logged but does not interrupt execution.
|
||||
</Note>
|
||||
|
||||
### Inheritance and opt-out
|
||||
### Inheritance model
|
||||
|
||||
The `checkpoint` field on Crew, Flow, and Agent accepts a `CheckpointConfig`, `True`, `False`, or `None`:
|
||||
`Crew`, `Flow`, and `Agent` accept a `checkpoint` argument. Children inherit from their parent unless they set their own value or pass `False` to opt out. Enable checkpointing once on the crew and every agent participates, or exclude a single agent selectively.
|
||||
|
||||
| Value | Behavior |
|:------|:--------------|
| `None` (default) | Inherits from the parent. An agent inherits the crew's configuration. |
| `True` | Enables checkpointing with defaults. |
| `False` | Explicit opt-out. Stops inheritance from the parent. |
| `CheckpointConfig(...)` | Custom configuration. |
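The four accepted values can be read as a small resolution rule. The sketch below models that rule with hypothetical names (`ToyConfig`, `resolve`); it mirrors the behavior described in the table and is not taken from CrewAI's source.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a checkpoint configuration."""
    location: str = "./.checkpoints"


CheckpointValue = Union[ToyConfig, bool, None]


def resolve(child: CheckpointValue, parent: Optional[ToyConfig]) -> Optional[ToyConfig]:
    """Return the effective configuration for a child, or None if disabled."""
    if child is None:
        # No value set: inherit whatever the parent uses (possibly nothing).
        return parent
    if child is False:
        # Explicit opt-out: inheritance stops here.
        return None
    if child is True:
        # Enabled with defaults.
        return ToyConfig()
    # A custom configuration overrides the parent.
    return child
```

Under this model, an agent left at `None` inside a crew with `checkpoint=True` resolves to the crew's configuration, while an agent with `checkpoint=False` resolves to `None`.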
|
||||
## Tutorial: Retomar uma crew com falha
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...), # inherits checkpoint from the crew
|
||||
Agent(role="Writer", ..., checkpoint=False), # opted out, no checkpoints
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
This walkthrough takes about 5 minutes. You will run a two-task crew, interrupt it midway, and resume it from the saved checkpoint.
|
||||
|
||||
## Resuming from a checkpoint
|
||||
<Steps>
|
||||
<Step title="Crie a crew com checkpointing ativado">
|
||||
```python
|
||||
from crewai import Agent, Crew, Task
|
||||
|
||||
```python
|
||||
# Restore and resume
|
||||
crew = Crew.from_checkpoint("./my_checkpoints/20260407T120000_abc123.json")
|
||||
result = crew.kickoff() # resumes from the last completed task
|
||||
```
|
||||
researcher = Agent(role="Researcher", goal="Research", backstory="Expert")
|
||||
writer = Agent(role="Writer", goal="Write", backstory="Expert")
|
||||
|
||||
The restored crew skips tasks that have already completed and resumes from the first incomplete one.
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[
|
||||
Task(description="Research AI trends", agent=researcher, expected_output="bullets"),
|
||||
Task(description="Write a summary", agent=writer, expected_output="paragraph"),
|
||||
],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Step>
|
||||
<Step title="Execute e interrompa apos a primeira tarefa">
|
||||
```python
|
||||
result = crew.kickoff()
|
||||
```
|
||||
|
||||
## Works on Crew, Flow, and Agent
|
||||
Press `Ctrl+C` after the first task completes. In `./.checkpoints/`, the file named `<timestamp>_<uuid>.json` is the checkpoint.
|
||||
</Step>
|
||||
<Step title="Retome a partir do checkpoint">
|
||||
```python
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
### Crew
|
||||
result = crew.kickoff(
|
||||
from_checkpoint=CheckpointConfig(
|
||||
restore_from="./.checkpoints/<timestamp>_<uuid>.json",
|
||||
),
|
||||
)
|
||||
```
|
||||
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
The research task is skipped, the writer runs against the saved research output, and the crew finishes.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
Default trigger: `task_completed` (one checkpoint per finished task).
|
||||
## Guias de uso
|
||||
|
||||
### Flow
|
||||
<AccordionGroup>
|
||||
<Accordion title="Ativar checkpointing com padroes" icon="play">
|
||||
```python
|
||||
crew = Crew(agents=[...], tasks=[...], checkpoint=True)
|
||||
```
|
||||
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
Writes to `./.checkpoints/` on every `task_completed`.
|
||||
</Accordion>
|
||||
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
<Accordion title="Personalizar armazenamento e frequencia" icon="sliders">
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
on_events=["task_completed", "crew_kickoff_completed"],
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
<Accordion title="Escolher um provedor de armazenamento" icon="database">
|
||||
<CodeGroup>
|
||||
```python JsonProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
|
||||
# Resume
|
||||
flow = MyFlow.from_checkpoint("./flow_cp/20260407T120000_abc123.json")
|
||||
result = flow.kickoff()
|
||||
```
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(),
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
```python SqliteProvider
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
|
||||
### Agent
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
max_checkpoints=50,
|
||||
),
|
||||
)
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
<Tip>
|
||||
The SQLite provider enables WAL journal mode for concurrent reads. Prefer it for high-frequency checkpointing.
|
||||
</Tip>
|
||||
</Accordion>
|
||||
|
||||
## Storage providers
|
||||
<Accordion title="Desativar um agente especifico" icon="user-slash">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[
|
||||
Agent(role="Researcher", ...),
|
||||
Agent(role="Writer", ..., checkpoint=False),
|
||||
],
|
||||
tasks=[...],
|
||||
checkpoint=True,
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
CrewAI includes two storage providers for checkpoints.
|
||||
<Accordion title="Retomar via classmethod" icon="rotate-left">
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.from_checkpoint(config)
|
||||
result = crew.kickoff()
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
### JsonProvider (default)
|
||||
<Accordion title="Fazer fork em uma nova branch" icon="code-branch">
|
||||
`fork()` restores a checkpoint under a new lineage so that the new run does not collide with the original.
|
||||
|
||||
Writes each checkpoint as a separate JSON file.
|
||||
```python
|
||||
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
|
||||
crew = Crew.fork(config, branch="experiment-a")
|
||||
result = crew.kickoff(inputs={"strategy": "aggressive"})
|
||||
```
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import JsonProvider
|
||||
The `branch` label is optional; one is generated if omitted.
|
||||
</Accordion>
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./my_checkpoints",
|
||||
provider=JsonProvider(),
|
||||
max_checkpoints=5,
|
||||
),
|
||||
)
|
||||
```
|
||||
<Accordion title="Checkpoint em Crew, Flow ou Agent" icon="cubes">
|
||||
<Tabs>
|
||||
<Tab title="Crew">
|
||||
```python
|
||||
crew = Crew(
|
||||
agents=[researcher, writer],
|
||||
tasks=[research_task, write_task, review_task],
|
||||
checkpoint=CheckpointConfig(location="./crew_cp"),
|
||||
)
|
||||
```
|
||||
|
||||
### SqliteProvider
|
||||
Default trigger: `task_completed`.
|
||||
</Tab>
|
||||
<Tab title="Flow">
|
||||
```python
|
||||
from crewai.flow.flow import Flow, start, listen
|
||||
from crewai import CheckpointConfig
|
||||
|
||||
Stores all checkpoints in a single SQLite file.
|
||||
class MyFlow(Flow):
|
||||
@start()
|
||||
def step_one(self):
|
||||
return "data"
|
||||
|
||||
```python
|
||||
from crewai import Crew, CheckpointConfig
|
||||
from crewai.state import SqliteProvider
|
||||
@listen(step_one)
|
||||
def step_two(self, data):
|
||||
return process(data)
|
||||
|
||||
crew = Crew(
|
||||
agents=[...],
|
||||
tasks=[...],
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./.checkpoints.db",
|
||||
provider=SqliteProvider(),
|
||||
),
|
||||
)
|
||||
```
|
||||
flow = MyFlow(
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./flow_cp",
|
||||
on_events=["method_execution_finished"],
|
||||
),
|
||||
)
|
||||
result = flow.kickoff()
|
||||
|
||||
config = CheckpointConfig(restore_from="./flow_cp/<file>.json")
|
||||
flow = MyFlow.from_checkpoint(config)
|
||||
result = flow.kickoff()
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Agent">
|
||||
```python
|
||||
agent = Agent(
|
||||
role="Researcher",
|
||||
goal="Research topics",
|
||||
backstory="Expert researcher",
|
||||
checkpoint=CheckpointConfig(
|
||||
location="./agent_cp",
|
||||
on_events=["lite_agent_execution_completed"],
|
||||
),
|
||||
)
|
||||
result = agent.kickoff(messages=[{"role": "user", "content": "Research AI trends"}])
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
</Accordion>
|
||||
|
||||
## Tipos de Evento
|
||||
<Accordion title="Gravar um checkpoint manualmente" icon="code">
|
||||
Register a handler on any event and call `state.checkpoint()`.
|
||||
|
||||
The `on_events` field accepts any combination of event type strings. Common choices:
|
||||
<CodeGroup>
|
||||
```python Sync
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
| Use case | Events |
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
```
|
||||
```python Async
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
A `state` argument is supplied automatically when the handler takes three parameters. See [Event Listeners](/pt-BR/concepts/event-listener) for the full event catalog.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Navegar, retomar e fazer fork pela CLI" icon="terminal">
|
||||
```bash
|
||||
crewai checkpoint # auto-detects .checkpoints/ or .checkpoints.db
|
||||
crewai checkpoint --location ./my_checkpoints
|
||||
crewai checkpoint --location ./.checkpoints.db
|
||||
```
|
||||
|
||||
<Frame>
|
||||
<img src="/images/checkpointing.png" alt="Checkpoint TUI" />
|
||||
</Frame>
|
||||
|
||||
The left pane groups checkpoints by branch; forks nest under their parent. Selecting a checkpoint shows its metadata, entity state, and task progress. **Resume** continues the run; **Fork** starts a new branch.
|
||||
|
||||
The detail pane exposes two editable areas:
|
||||
|
||||
- **Inputs**: the original kickoff inputs, pre-filled and editable.
|
||||
- **Task outputs**: outputs of completed tasks. Editing an output and pressing **Fork** invalidates downstream tasks so that they re-run with the modified context.
|
||||
|
||||
<Tip>
|
||||
Useful for scenario exploration: fork, tweak, observe.
|
||||
</Tip>
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Inspecionar checkpoints sem a TUI" icon="magnifying-glass">
|
||||
```bash
|
||||
crewai checkpoint list ./my_checkpoints
|
||||
crewai checkpoint info ./my_checkpoints/<file>.json
|
||||
crewai checkpoint info ./.checkpoints.db
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Referencia
|
||||
|
||||
### `CheckpointConfig`
|
||||
|
||||
<ParamField path="location" type="str" default='"./.checkpoints"'>
|
||||
Storage destination. A directory for `JsonProvider`, a database file path for `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="on_events" type="list[str]" default='["task_completed"]'>
|
||||
Event types that trigger a checkpoint. See [event types](#tipos-de-evento).
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="provider" type="BaseProvider" default="JsonProvider()">
|
||||
Storage backend. `JsonProvider` or `SqliteProvider`.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="max_checkpoints" type="int | None" default="None">
|
||||
Maximum number of checkpoints to retain. The oldest are removed after each write.
|
||||
</ParamField>
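The retention rule for `max_checkpoints` can be illustrated with a short pruning routine. This is a sketch under assumptions, not the provider's actual code: `prune_checkpoints` is a hypothetical helper, and it assumes file names begin with a sortable timestamp so that name order matches age.

```python
import tempfile
from pathlib import Path


def prune_checkpoints(location, max_checkpoints):
    """Delete the oldest JSON files so at most `max_checkpoints` remain.

    Returns the names of the files that were removed.
    """
    if max_checkpoints is None:
        # No limit configured: keep everything.
        return []
    # Assumption: timestamp-prefixed names sort oldest first.
    files = sorted(Path(location).glob("*.json"))
    excess = files[: max(0, len(files) - max_checkpoints)]
    for path in excess:
        path.unlink()
    return [path.name for path in excess]
```

Running such a routine after every write keeps disk usage bounded, at the cost of losing the ability to resume from older points in the run.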
|
||||
|
||||
<ParamField path="restore_from" type="Path | str | None" default="None">
|
||||
Checkpoint to restore when passed via `from_checkpoint`.
|
||||
</ParamField>
|
||||
|
||||
### `checkpoint` field values
|
||||
|
||||
Accepted by `Crew`, `Flow`, and `Agent`.
|
||||
|
||||
<ParamField path="None" type="padrao">
|
||||
Inherits from the parent.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="True" type="bool">
|
||||
Enables checkpointing with defaults.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="False" type="bool">
|
||||
Explicit opt-out. Stops inheritance.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="CheckpointConfig(...)" type="CheckpointConfig">
|
||||
Custom configuration.
|
||||
</ParamField>
|
||||
|
||||
### Tipos de evento
|
||||
|
||||
Common values for `on_events`:
|
||||
|
||||
| Use case | Events |
|:------------|:--------|
| After each task (Crew) | `["task_completed"]` |
| After each task | `["task_completed"]` |
| After each flow method | `["method_execution_finished"]` |
| After agent execution | `["agent_execution_completed"]`, `["lite_agent_execution_completed"]` |
| Only on crew completion | `["crew_kickoff_completed"]` |
| After each LLM call | `["llm_call_completed"]` |
| On everything | `["*"]` |
| Everything | `["*"]` |
|
||||
|
||||
<Warning>
|
||||
Using `["*"]` or high-frequency events such as `llm_call_completed` writes many checkpoint files and can hurt performance. Use `max_checkpoints` to limit disk usage.
|
||||
`["*"]` and high-frequency events such as `llm_call_completed` write many checkpoints and can degrade performance. Combine them with `max_checkpoints`.
|
||||
</Warning>
|
||||
|
||||
## Manual checkpointing
|
||||
### Storage providers
|
||||
|
||||
For full control, register your own event handler and call `state.checkpoint()` directly:
|
||||
<ParamField path="JsonProvider" type="provider">
|
||||
One file per checkpoint, named `<timestamp>_<uuid>.json` inside `location`.
|
||||
</ParamField>
|
||||
|
||||
```python
|
||||
from crewai.events.event_bus import crewai_event_bus
|
||||
from crewai.events.types.llm_events import LLMCallCompletedEvent
|
||||
<ParamField path="SqliteProvider" type="provider">
|
||||
A single database file at `location` with WAL journaling.
|
||||
</ParamField>
|
||||
|
||||
# Synchronous handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
def on_llm_done(source, event, state):
|
||||
path = state.checkpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
### CLI
|
||||
|
||||
# Asynchronous handler
|
||||
@crewai_event_bus.on(LLMCallCompletedEvent)
|
||||
async def on_llm_done_async(source, event, state):
|
||||
path = await state.acheckpoint("./my_checkpoints")
|
||||
print(f"Checkpoint saved: {path}")
|
||||
```
|
||||
|
||||
The `state` argument is the `RuntimeState` passed automatically by the event bus when your handler accepts 3 parameters. You can register handlers on any event type listed in the [Event Listeners](/pt-BR/concepts/event-listener) documentation.
|
||||
|
||||
Checkpointing is best-effort: if a checkpoint write fails, the error is logged, but execution continues uninterrupted.
|
||||
| Command | Purpose |
|:--------|:----------|
| `crewai checkpoint` | Launches the TUI; auto-detects the storage location. |
| `crewai checkpoint --location <path>` | Launches the TUI at a specific location. |
| `crewai checkpoint list <path>` | Lists checkpoints. |
| `crewai checkpoint info <path>` | Inspects a checkpoint file or the most recent entry in a SQLite database. |
|
||||
|
||||