feat(flow): add restore_from_state_id kickoff parameter (#5674)

## Summary

- Reverts `b0e2fda` ("fix(flow): add execution_id separate from state.id", COR-48): removes `Flow.execution_id` and points `current_flow_id` / `current_flow_request_id` back at `flow_id` (i.e. `state.id`). The separate per-run tracking id was no longer the right abstraction once `restore_from_state_id` reshapes how `state.id` is assigned;

- Adds an optional `restore_from_state_id` kwarg to `Flow.kickoff` / `Flow.kickoff_async` that hydrates state from a previously-persisted flow's latest snapshot

- Reassigns `state.id` to a fresh value (or `inputs["id"]` if pinned) so the new run's `@persist` writes don't extend the source's history

- Existing `inputs["id"]` resume, `@persist`, and `from_checkpoint` paths are unchanged

## Problem
`@persist` only supports *resume* today: `kickoff(inputs={"id": <uuid>})` hydrates state and continues writing under the same `flow_uuid`. There's no way to **fork** — hydrate from a snapshot but persist under a separate key, leaving the source's history intact. This PR adds that.

| | `state.id` after kickoff | `@persist` writes land under |
|---|---|---|
| `inputs["id"]` (resume) | supplied id | supplied id (extends history) |
| `restore_from_state_id` (fork) | fresh id, or `inputs["id"]` if pinned | new id (source preserved) |

## Behavior

| `inputs.id` | `restore_from_state_id` | Effect |
|---|---|---|
| — | — | Fresh kickoff |
| set | — | Existing resume |
| — | UUID | Fork — new `state.id`, hydrated from source |
| set | UUID | Fork into a pinned `state.id`, hydrated from source |

- Source not found → silent fallback (mirrors existing resume)
- Both `from_checkpoint` and `restore_from_state_id` set → `ValueError`
- `restore_from_state_id=None` → byte-identical to current main

## Design
Fork hydration runs before the existing `inputs` block in `kickoff_async`. On a hit, it calls the same `_restore_state` primitive used by resume, then overwrites `state.id` with a fresh UUID (or `inputs["id"]`). A `fork_succeeded` flag gates the existing `inputs["id"]` path so we don't double-load. `_completed_methods` / `_is_execution_resuming` are intentionally untouched — skip-completed-methods remains the territory of `apply_checkpoint` and `from_pending`.

## Test plan
- [ ] `pytest tests/test_flow_persistence.py` — 5 new tests (four-row matrix, not-found fallback, default no-op, conflict raise) + 6 existing as regression
- [ ] `pytest tests/test_flow.py` — broader flow suite
- [ ] Manual end-to-end against an HITL `@persist` flow
This commit is contained in:
Tiago Freire
2026-05-01 12:46:07 -03:00
committed by GitHub
parent 07c4a30f2e
commit cd2b9ee38a
16 changed files with 683 additions and 162 deletions

View File

@@ -380,6 +380,42 @@ class AnotherFlow(Flow[dict]):
print("Method-level persisted runs:", self.state["runs"])
```
### تفرع الحالة المستمرة
يدعم `@persist` نمطين متميزين للترطيب في `kickoff` / `kickoff_async`:
- `kickoff(inputs={"id": <uuid>})` — **استئناف**: يحمّل أحدث لقطة لـ UUID المقدم ويستمر في الكتابة تحت نفس `flow_uuid`. يمتد التاريخ.
- `kickoff(restore_from_state_id=<uuid>)` — **تفرع**: يحمّل أحدث لقطة لـ UUID المقدم، يرطّب حالة التشغيل الجديد منها، ثم يعيّن `state.id` جديدًا (مولّدًا تلقائيًا، أو `inputs["id"]` إذا تم تثبيته). تذهب كتابات `@persist` للتشغيل الجديد تحت `state.id` الجديد؛ يتم الحفاظ على تاريخ تدفق المصدر.
```python
from crewai.flow.flow import Flow, start
from crewai.flow.persistence import persist
from pydantic import BaseModel
class CounterState(BaseModel):
id: str = ""
counter: int = 0
@persist
class CounterFlow(Flow[CounterState]):
@start()
def step(self):
self.state.counter += 1
print(f"[id={self.state.id}] counter={self.state.counter}")
# التشغيل 1: حالة جديدة، العداد 0 -> 1، محفوظ تحت flow_1.state.id
flow_1 = CounterFlow()
flow_1.kickoff()
# التفرع: ترطيب من أحدث لقطة لـ flow_1، لكن باستخدام state.id جديد
flow_2 = CounterFlow()
flow_2.kickoff(restore_from_state_id=flow_1.state.id)
# يبدأ flow_2.state.counter بـ 1 (مرطّب)، ثم تزيده step() إلى 2.
# flow_2.state.id != flow_1.state.id؛ تاريخ flow_1 لم يتغيّر.
```
إذا لم يطابق `restore_from_state_id` المقدم أي حالة مستمرة، يعود kickoff بصمت إلى السلوك الافتراضي — نفس سلوك `inputs["id"]` عند عدم العثور عليه. الجمع بين `restore_from_state_id` و `from_checkpoint` يطلق `ValueError`؛ اختر مصدر ترطيب واحدًا. تثبيت `inputs["id"]` أثناء التفرع يشارك مفتاح الاستمرارية مع تدفق آخر — عادةً ما تريد استخدام `restore_from_state_id` فقط.
### كيف تعمل
1. **تعريف الحالة الفريد**