mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-07-01 05:08:12 +00:00
Two correctness fixes uncovered while landing the OTel finish_reason +
response_id plumbing:
- LiteLLM streaming (sync + async): `stream_options={"include_usage": True}`
causes LiteLLM to emit a final usage-only chunk with `choices=[]`. The
post-loop `_extract_finish_reason_and_response_id(last_chunk)` silently
returned `(None, None)` because the last chunk has no choices, even though
earlier chunks carried `finish_reason="stop"`. Track both fields
incrementally inside the loop (mirroring how OpenAI/Gemini/Azure already
handle their native streams) and use the tracked values for the
LLMCallCompletedEvent emission and the partial-response error path.
- Bedrock Converse: `ResponseMetadata.RequestId` is an AWS infra trace id,
not a model-level response id (semantically different from OpenAI's
`chatcmpl-XXX`). Return None for `response_id` rather than mislead
downstream telemetry consumers. The audit-fix's async propagation chain
still works — None propagates through unchanged.
Adds `test_llm_streaming_finish_reason.py` pinning both the sync and async
LiteLLM streaming paths against the include_usage chunk shape.