---
title: Streaming Runtime Contract
description: Stream ordered runtime frames from Flows, direct LLM calls, and conversational turns.
icon: tower-broadcast
mode: "wide"
---

## Overview

CrewAI exposes a frame-based streaming contract for runtimes that need more than plain text chunks. The contract emits ordered `StreamFrame` objects for Flow lifecycle events, direct LLM tokens, tool activity, conversation messages, and custom events.

Use this API when you are building a UI, service bridge, terminal app, or deployment runtime that needs a stable stream of structured events while a Flow, chat turn, or direct LLM call is running.

## StreamFrame

Every frame has the same envelope:

```python
from crewai.types.streaming import StreamFrame

# `frame` is a StreamFrame yielded by any stream session.
frame.id           # unique frame id
frame.seq          # execution-local order, when available
frame.type         # source event type, such as "flow_started"
frame.channel      # "llm", "flow", "tools", "messages", "lifecycle", or "custom"
frame.namespace    # source/runtime namespace
frame.timestamp    # event timestamp
frame.parent_id    # parent event id, when available
frame.previous_id  # previous event id, when available
frame.data         # event payload
frame.event        # alias for frame.data
frame.content      # printable text for token-like frames, otherwise ""
```

The `channel` field is the fastest way to route frames in consumers:

| Channel | Contains |
|---------|----------|
| `llm` | Token and thinking chunks from LLM streaming events |
| `flow` | Flow lifecycle, method execution, routing, and pause/resume events |
| `tools` | Tool usage events |
| `messages` | Conversation transcript events |
| `lifecycle` | Runtime lifecycle events that are not specific to another channel |
| `custom` | Events that do not map to a built-in channel |

`frame.type` preserves the source event type, so consumers can handle specific events inside a channel.
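The two-level routing described here (channel first, then `frame.type`) can be sketched as a small dispatcher. This is an illustration only: `FakeFrame` and `FrameRouter` are hypothetical names, not part of CrewAI, and `FakeFrame` copies just a few of the `StreamFrame` fields so the example runs without the library.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class FakeFrame:
    """Hypothetical stand-in carrying a few of the StreamFrame fields."""

    channel: str
    type: str
    data: dict = field(default_factory=dict)
    content: str = ""


class FrameRouter:
    """Hypothetical helper: dispatches on frame.channel, then on frame.type."""

    def __init__(self) -> None:
        # Key is (channel, frame_type); frame_type None means "any type".
        self._handlers = defaultdict(list)

    def on(
        self,
        channel: str,
        handler: Callable[[Any], None],
        frame_type: Optional[str] = None,
    ) -> None:
        self._handlers[(channel, frame_type)].append(handler)

    def dispatch(self, frame: Any) -> int:
        """Call every matching handler and return how many ran."""
        handlers = (
            self._handlers.get((frame.channel, None), [])
            + self._handlers.get((frame.channel, frame.type), [])
        )
        for handler in handlers:
            handler(frame)
        return len(handlers)


seen = []
router = FrameRouter()
# Channel-wide handler: every frame on the "llm" channel.
router.on("llm", lambda f: seen.append(f.content))
# Type-specific handler: only "tool_usage_started" frames on "tools".
router.on(
    "tools",
    lambda f: seen.append(f.data["tool_name"]),
    frame_type="tool_usage_started",
)

router.dispatch(FakeFrame("llm", "llm_stream_chunk", content="Hi"))
router.dispatch(
    FakeFrame("tools", "tool_usage_started", data={"tool_name": "search"})
)
router.dispatch(FakeFrame("flow", "flow_started"))  # no handler, ignored
print(seen)
```

With real frames, the same `dispatch` call would sit inside the `for frame in stream` loop, since the router only reads `channel`, `type`, `data`, and `content`.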

## Stream a Flow

Set `stream=True` on a Flow to make `kickoff()` return a stream session:

```python
from crewai.flow import Flow, start


class ReportFlow(Flow):
    @start()
    def generate(self):
        return "done"


flow = ReportFlow(stream=True)
stream = flow.kickoff()

with stream:
    for chunk in stream:
        print(chunk.content, end="", flush=True)
        if chunk.type == "tool_usage_started":
            print(chunk.event["tool_name"])

result = stream.result
```

You must consume the stream before reading `stream.result`. Accessing the result early raises a `RuntimeError` so consumers do not accidentally treat a partial run as complete.
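The consume-then-read rule can be illustrated with a toy session. `ToySession` and `drain` are hypothetical names written for this sketch; `ToySession` is not CrewAI's session class and only mimics the behavior described above, so the example runs without the library.

```python
from types import SimpleNamespace


class ToySession:
    """Hypothetical stand-in that mimics the documented result guard."""

    def __init__(self, chunks, result):
        self._chunks = list(chunks)
        self._result = result
        self._exhausted = False

    def __iter__(self):
        for chunk in self._chunks:
            yield chunk
        # Only reached once the consumer has pulled every chunk.
        self._exhausted = True

    @property
    def result(self):
        if not self._exhausted:
            raise RuntimeError("stream must be consumed before reading result")
        return self._result


def drain(session):
    """Consume every chunk, then read the final result."""
    text = "".join(chunk.content for chunk in session)
    return text, session.result


chunks = [SimpleNamespace(content="Hel"), SimpleNamespace(content="lo")]

early = ToySession(chunks, result="done")
try:
    early.result  # read before iterating
except RuntimeError as exc:
    print("early read rejected:", exc)

text, result = drain(ToySession(chunks, result="done"))
print(text, result)
```

A helper like `drain` keeps the ordering in one place, so callers cannot read the result from a partially consumed stream by mistake.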

You can also call `flow.stream_events(...)` directly when you want streaming for a single invocation without setting `stream=True` on the Flow instance.

## Filter by Channel

`StreamSession` exposes channel projections. Each projection yields only the frames for its channel, in the order they occurred in the full stream:

```python
stream = flow.stream_events()

with stream:
    for frame in stream.llm:
        print(frame.content, end="", flush=True)

result = stream.result
```

Available projections are:

| Projection | Frames |
|------------|--------|
| `stream.events` | All frames |
| `stream.llm` | LLM frames |
| `stream.messages` | Conversation message frames |
| `stream.flow` | Flow frames |
| `stream.tools` | Tool frames |
| `stream.interleave([...])` | A selected set of channels |

Use `stream.interleave(["flow", "llm", "messages"])` when a consumer wants only some channels but still needs their relative order.
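Conceptually, selecting several channels is an order-preserving filter over the full stream. The sketch below shows that idea on plain data. `FakeFrame` and `select_channels` are hypothetical names, and this is not how `stream.interleave(...)` is implemented.

```python
from dataclasses import dataclass


@dataclass
class FakeFrame:
    """Hypothetical stand-in with only the fields this example needs."""

    seq: int
    channel: str
    type: str


def select_channels(frames, channels):
    """Yield frames whose channel is selected, keeping their original order."""
    wanted = set(channels)
    return (frame for frame in frames if frame.channel in wanted)


frames = [
    FakeFrame(1, "flow", "flow_started"),
    FakeFrame(2, "llm", "llm_stream_chunk"),
    FakeFrame(3, "tools", "tool_usage_started"),
    FakeFrame(4, "llm", "llm_stream_chunk"),
]

selected = list(select_channels(frames, ["flow", "llm"]))
print([frame.seq for frame in selected])
```

Because the filter never reorders frames, a consumer can still tell which flow event preceded which token even after dropping the `tools` channel.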

## Async Streaming

Use `astream()` for async consumers:

```python
import asyncio


async def main():
    flow = ReportFlow()
    stream = flow.astream()

    async with stream:
        async for chunk in stream.events:
            print(chunk.channel, chunk.type, chunk.content)

    return stream.result


result = asyncio.run(main())
```

The async session has the same projections as the sync session.

## Stream a Direct LLM Call

`llm.call(...)` still returns the final assembled result. Use `llm.stream_events(...)` when you want to iterate over chunks as they arrive while keeping the structured event payload:

```python
from crewai import LLM


llm = LLM(model="gpt-4o-mini")
stream = llm.stream_events(
    messages=[
        {
            "role": "user",
            "content": "Explain CrewAI streaming in two short sentences.",
        }
    ]
)

with stream:
    for chunk in stream:
        print(chunk.content, end="", flush=True)

result = stream.result
```

`llm.stream_events(...)` temporarily enables streaming for the wrapped call and restores the LLM's previous `stream` setting afterward. Provider integrations continue to emit the underlying LLM stream events; this helper provides a common iterator API over those events for every LLM provider.
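The enable-then-restore behavior follows a common pattern that can be sketched with a context manager. This is not CrewAI's implementation: `streaming_enabled` is a hypothetical helper, and the `SimpleNamespace` object stands in for an LLM that has a boolean `stream` attribute.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def streaming_enabled(llm):
    """Turn on llm.stream for the duration of the block, then restore it."""
    previous = llm.stream
    llm.stream = True
    try:
        yield llm
    finally:
        # Runs on normal exit and on exceptions alike.
        llm.stream = previous


fake_llm = SimpleNamespace(stream=False)  # stand-in, not a real LLM

with streaming_enabled(fake_llm):
    inside = fake_llm.stream

print(inside, fake_llm.stream)
```

Restoring in a `finally` block matters: if the wrapped call raises, the object still returns to its previous setting instead of silently staying in streaming mode.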

## Conversational Turns

Conversational Flows can stream one user turn with `stream_turn()`:

```python
from crewai import Flow
from crewai.experimental.conversational import ConversationConfig, ConversationState


@ConversationConfig(llm="gpt-4o-mini", defer_trace_finalization=True)
class ChatFlow(Flow[ConversationState]):
    conversational = True


flow = ChatFlow()
stream = flow.stream_turn("What can you help me with?", session_id="session-1")

with stream:
    for frame in stream.events:
        if frame.channel == "llm" and frame.type == "llm_stream_chunk":
            print(frame.content, end="", flush=True)

reply = stream.result
```

During `stream_turn()`, the built-in conversational answer path enables LLM token streaming for that turn and restores the LLM's previous `stream` setting afterward. Custom route handlers that create their own agents or LLM instances should configure those LLMs for streaming if they need token-level output.

## Cleanup

Use the session as a context manager when possible. If a client disconnects before the stream is exhausted, close the session explicitly:

```python
stream = flow.stream_events()

try:
    for frame in stream.events:
        print(frame.type)
finally:
    if not stream.is_exhausted:
        stream.close()
```

For async streams, use `await stream.aclose()`.
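A disconnect-aware forwarding loop for the async case might look like the sketch below. `ToyAsyncSession` and `forward_until_disconnect` are hypothetical names, and the toy session is a stand-in so the example runs without CrewAI. It also assumes the async session exposes `is_exhausted` like the sync example above, which this page does not confirm.

```python
import asyncio


class ToyAsyncSession:
    """Hypothetical stand-in for an async stream session."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.is_exhausted = False  # assumed attribute, mirrors the sync example
        self.closed = False

    async def _iterate(self):
        for frame in self._frames:
            await asyncio.sleep(0)  # yield control, like a real stream would
            yield frame
        self.is_exhausted = True

    def __aiter__(self):
        return self._iterate()

    async def aclose(self):
        self.closed = True


async def forward_until_disconnect(stream, send, is_connected):
    """Forward frames while the client is connected; always release the stream."""
    try:
        async for frame in stream:
            if not is_connected():
                break
            await send(frame)
    finally:
        if not stream.is_exhausted:
            await stream.aclose()


sent = []


async def send(frame):
    sent.append(frame)


stream = ToyAsyncSession(["a", "b", "c"])
# Simulate a client that disconnects after receiving two frames.
asyncio.run(forward_until_disconnect(stream, send, lambda: len(sent) < 2))
print(sent, stream.closed)
```

When the session is used with `async with`, the context manager should make the explicit `aclose()` unnecessary; the `try`/`finally` form is for code paths where a context manager does not fit.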

## Legacy Chunk Streaming

Crew streaming with `stream=True` still returns the chunk-oriented `CrewStreamingOutput` API described in [Streaming Crew Execution](/en/learn/streaming-crew-execution). Direct `llm.call(...)` still returns the final LLM result. The frame contract is intended for runtimes that need a stable event envelope across Flows, direct LLM calls, conversational turns, tools, and messages.