chore(otel): drop dead last_chunk variable from async streaming

The streaming-fix commit (49e5581b5) replaced the post-loop
`_extract_finish_reason_and_response_id(last_chunk)` call with the
incrementally-tracked `stream_finish_reason` / `stream_response_id`,
which removed the only reader of `last_chunk` in
`_ahandle_streaming_response`. The declaration and per-iteration
assignment were left behind — harmless but confusing for future
readers because the sync sibling still legitimately uses `last_chunk`
(for usage and content fallbacks via `_handle_streaming_callbacks`).
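
A minimal sketch of why the incremental tracking leaves `last_chunk` without a reader. `FakeChunk`, `extract`, and `track` are hypothetical stand-ins, not litellm types or the real `_extract_finish_reason_and_response_id`. The shape of the usage-only final chunk (no finish reason, no id) is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed chunk (not a litellm type)."""

    id: str | None = None
    finish_reason: str | None = None


def extract(chunk: FakeChunk | None) -> tuple[str | None, str | None]:
    """Hypothetical analogue of _extract_finish_reason_and_response_id."""
    if chunk is None:
        return None, None
    return chunk.finish_reason, chunk.id


def track(chunks: list[FakeChunk]) -> tuple[str | None, str | None]:
    """Keep the most recent non-empty values seen while iterating."""
    stream_finish_reason: str | None = None
    stream_response_id: str | None = None
    for chunk in chunks:
        chunk_finish, chunk_id = extract(chunk)
        if chunk_finish is not None:
            stream_finish_reason = chunk_finish
        if chunk_id is not None:
            stream_response_id = chunk_id
    return stream_finish_reason, stream_response_id


chunks = [
    FakeChunk(id="resp-1"),
    FakeChunk(id="resp-1", finish_reason="stop"),
    FakeChunk(),  # assumed shape of the usage-only final chunk
]

incremental = track(chunks)      # values accumulated inside the loop
post_loop = extract(chunks[-1])  # what a post-loop read of last_chunk would see
```

Under these assumptions the in-loop values survive the final chunk, while a post-loop read of the last chunk would come back empty, so nothing is lost by no longer keeping it.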

The async path inlines its usage extraction directly inside the loop
(`chunk.model_extra.get("usage")`), so there's no fallback consumer.
Drop both lines.
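
For illustration, a small sketch of usage being read inline inside an async loop. `fake_stream` and `consume` are hypothetical and only mimic the `chunk.model_extra.get("usage")` access named above; they are not the real `_ahandle_streaming_response` body.

```python
from __future__ import annotations

import asyncio
from types import SimpleNamespace
from typing import Any, AsyncIterator


async def fake_stream() -> AsyncIterator[Any]:
    """Hypothetical stream standing in for the acompletion() iterator."""
    yield SimpleNamespace(model_extra={})
    yield SimpleNamespace(model_extra={"usage": {"total_tokens": 7}})


async def consume() -> dict[str, int] | None:
    usage = None
    async for chunk in fake_stream():
        # Usage is taken from the chunk in hand, so nothing has to
        # remember the chunk once the loop has finished.
        chunk_usage = chunk.model_extra.get("usage")
        if chunk_usage:
            usage = chunk_usage
    return usage


result = asyncio.run(consume())
```

Because the value is captured during iteration, a variable holding the last chunk has no consumer after the loop.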

Sync path untouched — `last_chunk` there is still load-bearing.
Lucas Gomide
2026-05-27 11:11:17 -03:00
parent 16bd159eab
commit 34e8511294

@@ -1452,7 +1452,6 @@ class LLM(BaseLLM):
 params["stream"] = True
 params["stream_options"] = {"include_usage": True}
 response_id = None
-last_chunk: Any | None = None
 # See sync sibling: incrementally track finish_reason/response_id so the
 # usage-only final chunk doesn't wipe them.
 stream_finish_reason: str | None = None
@@ -1462,7 +1461,6 @@ class LLM(BaseLLM):
 async for chunk in await litellm.acompletion(**params):
     chunk_count += 1
     chunk_content = None
-    last_chunk = chunk
     response_id = chunk.id if isinstance(chunk, ModelResponseBase) else None
     chunk_finish, chunk_id = self._extract_finish_reason_and_response_id(