Compare commits


6 Commits

Author SHA1 Message Date
Greyson LaLonde
806863eae7 Merge branch 'main' into lorenze/fix-google-vertex-api-using-api-keys 2026-01-17 10:16:15 -05:00
Greyson LaLonde
ceef062426 feat: add additional a2a events and enrich event metadata
2026-01-16 16:57:31 -05:00
lorenzejay
e83b7554bf docs translations 2026-01-15 14:43:43 -08:00
lorenzejay
7834b07ce4 Merge branch 'main' of github.com:crewAIInc/crewAI into lorenze/fix-google-vertex-api-using-api-keys 2026-01-15 14:37:37 -08:00
lorenzejay
a9bb03ffa8 docs update here 2026-01-15 14:37:16 -08:00
lorenzejay
5beaea189b supporting vertex through api key use - expo mode 2026-01-15 14:34:07 -08:00
39 changed files with 3391 additions and 2980 deletions

View File

@@ -375,10 +375,13 @@ In this section, you'll find detailed examples that help you select, configure,
GOOGLE_API_KEY=<your-api-key>
GEMINI_API_KEY=<your-api-key>
# Optional - for Vertex AI
# For Vertex AI Express mode (API key authentication)
GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_API_KEY=<your-api-key>
# For Vertex AI with service account
GOOGLE_CLOUD_PROJECT=<your-project-id>
GOOGLE_CLOUD_LOCATION=<location> # Defaults to us-central1
GOOGLE_GENAI_USE_VERTEXAI=true # Set to use Vertex AI
```
**Basic Usage:**
@@ -412,7 +415,35 @@ In this section, you'll find detailed examples that help you select, configure,
)
```
**Vertex AI Configuration:**
**Vertex AI Express Mode (API Key Authentication):**
Vertex AI Express mode allows you to use Vertex AI with simple API key authentication instead of service account credentials. This is the quickest way to get started with Vertex AI.
To enable Express mode, set both environment variables in your `.env` file:
```toml .env
GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_API_KEY=<your-api-key>
```
Then use the LLM as usual:
```python Code
from crewai import LLM
llm = LLM(
model="gemini/gemini-2.0-flash",
temperature=0.7
)
```
<Info>
To get an Express mode API key:
- New Google Cloud users: Get an [express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
- Existing Google Cloud users: Get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
For more details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
</Info>
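If you prefer to configure Express mode programmatically rather than through a `.env` file, here is a minimal sketch assuming standard `os.environ` handling; the variable names are the same ones listed below, and only the `LLM` usage shown above is CrewAI-specific:
```python Code
import os

from crewai import LLM

# Sketch only: set the Express mode variables in the process environment
# before constructing the LLM (equivalent to placing them in .env).
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "true"
os.environ["GOOGLE_API_KEY"] = "<your-api-key>"

llm = LLM(
    model="gemini/gemini-2.0-flash",
    temperature=0.7,
)
```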
**Vertex AI Configuration (Service Account):**
```python Code
from crewai import LLM
@@ -424,10 +455,10 @@ In this section, you'll find detailed examples that help you select, configure,
```
**Supported Environment Variables:**
- `GOOGLE_API_KEY` or `GEMINI_API_KEY`: Your Google API key (required for Gemini API)
- `GOOGLE_CLOUD_PROJECT`: Google Cloud project ID (for Vertex AI)
- `GOOGLE_API_KEY` or `GEMINI_API_KEY`: Your Google API key (required for Gemini API and Vertex AI Express mode)
- `GOOGLE_GENAI_USE_VERTEXAI`: Set to `true` to use Vertex AI (required for Express mode)
- `GOOGLE_CLOUD_PROJECT`: Google Cloud project ID (for Vertex AI with service account)
- `GOOGLE_CLOUD_LOCATION`: GCP location (defaults to `us-central1`)
- `GOOGLE_GENAI_USE_VERTEXAI`: Set to `true` to use Vertex AI
**Features:**
- Native function calling support for Gemini 1.5+ and 2.x models

View File

@@ -107,7 +107,7 @@ CrewAI 코드 내에는 사용할 모델을 지정할 수 있는 여러 위치
## Provider Configuration Examples
CrewAI supports a wide range of LLM providers, each offering unique features, authentication methods, and model capabilities.
In this section, you'll find detailed examples that help you select, configure, and optimize the LLM that best fits your project's needs.
<AccordionGroup>
@@ -153,8 +153,8 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
</Accordion>
<Accordion title="Meta-Llama">
Meta's Llama API provides access to Meta's family of large language models.
The API is available through the [Meta Llama API](https://llama.developer.meta.com?utm_source=partner-crewai&utm_medium=website).
Set the following environment variables in your `.env` file:
```toml Code
@@ -207,11 +207,20 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
Set your API key in your `.env` file. If you need a key, or want to find an existing one, check [AI Studio](https://aistudio.google.com/apikey).
```toml .env
# https://ai.google.dev/gemini-api/docs/api-key
# For the Gemini API (one of the following)
GOOGLE_API_KEY=<your-api-key>
GEMINI_API_KEY=<your-api-key>
# For Vertex AI Express mode (API key authentication)
GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_API_KEY=<your-api-key>
# For Vertex AI with a service account
GOOGLE_CLOUD_PROJECT=<your-project-id>
GOOGLE_CLOUD_LOCATION=<location> # Defaults to us-central1
```
Example usage in your CrewAI project:
**Basic Usage:**
```python Code
from crewai import LLM
@@ -221,6 +230,34 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
)
```
**Vertex AI Express Mode (API Key Authentication):**
Vertex AI Express mode allows you to use Vertex AI with simple API key authentication instead of service account credentials. This is the quickest way to get started with Vertex AI.
To enable Express mode, set both environment variables in your `.env` file:
```toml .env
GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_API_KEY=<your-api-key>
```
Then use the LLM as usual:
```python Code
from crewai import LLM
llm = LLM(
model="gemini/gemini-2.0-flash",
temperature=0.7
)
```
<Info>
To get an Express mode API key:
- New Google Cloud users: Get an [express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
- Existing Google Cloud users: Get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
For more details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
</Info>
### Gemini Models
Google offers a range of powerful models optimized for different use cases.
@@ -476,7 +513,7 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
<Accordion title="Local NVIDIA NIM Deployed using WSL2">
NVIDIA NIM enables you to run powerful LLMs locally on Windows machines using WSL2 (Windows Subsystem for Linux).
This approach leverages your Nvidia GPU for private, secure, and cost-effective AI inference without relying on cloud services.
It is ideal for development, testing, or production environments that require data privacy or offline capability.
@@ -954,4 +991,4 @@ LLM 설정을 최대한 활용하는 방법을 알아보세요:
llm = LLM(model="openai/gpt-4o") # 128K tokens
```
</Tab>
</Tabs>

View File

@@ -79,7 +79,7 @@ Existem diferentes locais no código do CrewAI onde você pode especificar o mod
# Advanced configuration with detailed parameters
llm = LLM(
model="openai/gpt-4",
model="openai/gpt-4",
temperature=0.8,
max_tokens=150,
top_p=0.9,
@@ -207,11 +207,20 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
Set your API key in your `.env` file. If you need a key, or want to find an existing one, check [AI Studio](https://aistudio.google.com/apikey).
```toml .env
# https://ai.google.dev/gemini-api/docs/api-key
# For the Gemini API (one of the following)
GOOGLE_API_KEY=<your-api-key>
GEMINI_API_KEY=<your-api-key>
# For Vertex AI Express mode (API key authentication)
GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_API_KEY=<your-api-key>
# For Vertex AI with a service account
GOOGLE_CLOUD_PROJECT=<your-project-id>
GOOGLE_CLOUD_LOCATION=<location> # Defaults to us-central1
```
Example usage in your CrewAI project:
**Basic Usage:**
```python Code
from crewai import LLM
@@ -221,6 +230,34 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
)
```
**Vertex AI Express Mode (API Key Authentication):**
Vertex AI Express mode allows you to use Vertex AI with simple API key authentication instead of service account credentials. This is the quickest way to get started with Vertex AI.
To enable Express mode, set both environment variables in your `.env` file:
```toml .env
GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_API_KEY=<your-api-key>
```
Then use the LLM as usual:
```python Code
from crewai import LLM
llm = LLM(
model="gemini/gemini-2.0-flash",
temperature=0.7
)
```
<Info>
To get an Express mode API key:
- New Google Cloud users: Get an [express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
- Existing Google Cloud users: Get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
For more details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
</Info>
### Gemini Models
Google offers a variety of powerful models optimized for different use cases.
@@ -823,7 +860,7 @@ Saiba como obter o máximo da configuração do seu LLM:
Remember to regularly monitor your token usage and adjust your settings to optimize costs and performance.
</Info>
</Accordion>
<Accordion title="Descartar Parâmetros Adicionais">
O CrewAI usa Litellm internamente para chamadas LLM, permitindo descartar parâmetros adicionais desnecessários para seu caso de uso. Isso pode simplificar seu código e reduzir a complexidade da configuração do LLM.
Por exemplo, se não precisar enviar o parâmetro <code>stop</code>, basta omiti-lo na chamada do LLM:
@@ -882,4 +919,4 @@ Saiba como obter o máximo da configuração do seu LLM:
llm = LLM(model="openai/gpt-4o") # 128K tokens
```
</Tab>
</Tabs>

View File

@@ -3,9 +3,10 @@
from __future__ import annotations
from collections.abc import AsyncIterator
from typing import TYPE_CHECKING, TypedDict
from typing import TYPE_CHECKING, Any, TypedDict
import uuid
from a2a.client.errors import A2AClientHTTPError
from a2a.types import (
AgentCard,
Message,
@@ -20,7 +21,10 @@ from a2a.types import (
from typing_extensions import NotRequired
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import A2AResponseReceivedEvent
from crewai.events.types.a2a_events import (
A2AConnectionErrorEvent,
A2AResponseReceivedEvent,
)
if TYPE_CHECKING:
@@ -55,7 +59,8 @@ class TaskStateResult(TypedDict):
history: list[Message]
result: NotRequired[str]
error: NotRequired[str]
agent_card: NotRequired[AgentCard]
agent_card: NotRequired[dict[str, Any]]
a2a_agent_name: NotRequired[str | None]
def extract_task_result_parts(a2a_task: A2ATask) -> list[str]:
@@ -131,50 +136,69 @@ def process_task_state(
is_multiturn: bool,
agent_role: str | None,
result_parts: list[str] | None = None,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
from_task: Any | None = None,
from_agent: Any | None = None,
is_final: bool = True,
) -> TaskStateResult | None:
"""Process A2A task state and return result dictionary.
Shared logic for both polling and streaming handlers.
Args:
a2a_task: The A2A task to process
new_messages: List to collect messages (modified in place)
agent_card: The agent card
turn_number: Current turn number
is_multiturn: Whether multi-turn conversation
agent_role: Agent role for logging
a2a_task: The A2A task to process.
new_messages: List to collect messages (modified in place).
agent_card: The agent card.
turn_number: Current turn number.
is_multiturn: Whether multi-turn conversation.
agent_role: Agent role for logging.
result_parts: Accumulated result parts (streaming passes accumulated,
polling passes None to extract from task)
polling passes None to extract from task).
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
from_task: Optional CrewAI Task for event metadata.
from_agent: Optional CrewAI Agent for event metadata.
is_final: Whether this is the final response in the stream.
Returns:
Result dictionary if terminal/actionable state, None otherwise
Result dictionary if terminal/actionable state, None otherwise.
"""
should_extract = result_parts is None
if result_parts is None:
result_parts = []
if a2a_task.status.state == TaskState.completed:
if should_extract:
if not result_parts:
extracted_parts = extract_task_result_parts(a2a_task)
result_parts.extend(extracted_parts)
if a2a_task.history:
new_messages.extend(a2a_task.history)
response_text = " ".join(result_parts) if result_parts else ""
message_id = None
if a2a_task.status and a2a_task.status.message:
message_id = a2a_task.status.message.message_id
crewai_event_bus.emit(
None,
A2AResponseReceivedEvent(
response=response_text,
turn_number=turn_number,
context_id=a2a_task.context_id,
message_id=message_id,
is_multiturn=is_multiturn,
status="completed",
final=is_final,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.completed,
agent_card=agent_card,
agent_card=agent_card.model_dump(exclude_none=True),
result=response_text,
history=new_messages,
)
@@ -194,14 +218,24 @@ def process_task_state(
)
new_messages.append(agent_message)
input_message_id = None
if a2a_task.status and a2a_task.status.message:
input_message_id = a2a_task.status.message.message_id
crewai_event_bus.emit(
None,
A2AResponseReceivedEvent(
response=response_text,
turn_number=turn_number,
context_id=a2a_task.context_id,
message_id=input_message_id,
is_multiturn=is_multiturn,
status="input_required",
final=is_final,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -209,7 +243,7 @@ def process_task_state(
status=TaskState.input_required,
error=response_text,
history=new_messages,
agent_card=agent_card,
agent_card=agent_card.model_dump(exclude_none=True),
)
if a2a_task.status.state in {TaskState.failed, TaskState.rejected}:
@@ -248,6 +282,11 @@ async def send_message_and_get_task_id(
turn_number: int,
is_multiturn: bool,
agent_role: str | None,
from_task: Any | None = None,
from_agent: Any | None = None,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
context_id: str | None = None,
) -> str | TaskStateResult:
"""Send message and process initial response.
@@ -262,6 +301,11 @@ async def send_message_and_get_task_id(
turn_number: Current turn number
is_multiturn: Whether multi-turn conversation
agent_role: Agent role for logging
from_task: Optional CrewAI Task object for event metadata.
from_agent: Optional CrewAI Agent object for event metadata.
endpoint: Optional A2A endpoint URL.
a2a_agent_name: Optional A2A agent name.
context_id: Optional A2A context ID for correlation.
Returns:
Task ID string if agent needs polling/waiting, or TaskStateResult if done.
@@ -280,9 +324,16 @@ async def send_message_and_get_task_id(
A2AResponseReceivedEvent(
response=response_text,
turn_number=turn_number,
context_id=event.context_id,
message_id=event.message_id,
is_multiturn=is_multiturn,
status="completed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -290,7 +341,7 @@ async def send_message_and_get_task_id(
status=TaskState.completed,
result=response_text,
history=new_messages,
agent_card=agent_card,
agent_card=agent_card.model_dump(exclude_none=True),
)
if isinstance(event, tuple):
@@ -304,6 +355,10 @@ async def send_message_and_get_task_id(
turn_number=turn_number,
is_multiturn=is_multiturn,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
)
if result:
return result
@@ -316,6 +371,99 @@ async def send_message_and_get_task_id(
history=new_messages,
)
except A2AClientHTTPError as e:
error_msg = f"HTTP Error {e.status_code}: {e!s}"
error_message = Message(
role=Role.agent,
message_id=str(uuid.uuid4()),
parts=[Part(root=TextPart(text=error_msg))],
context_id=context_id,
)
new_messages.append(error_message)
crewai_event_bus.emit(
None,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(e),
error_type="http_error",
status_code=e.status_code,
a2a_agent_name=a2a_agent_name,
operation="send_message",
context_id=context_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
None,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.failed,
error=error_msg,
history=new_messages,
)
except Exception as e:
error_msg = f"Unexpected error during send_message: {e!s}"
error_message = Message(
role=Role.agent,
message_id=str(uuid.uuid4()),
parts=[Part(root=TextPart(text=error_msg))],
context_id=context_id,
)
new_messages.append(error_message)
crewai_event_bus.emit(
None,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(e),
error_type="unexpected_error",
a2a_agent_name=a2a_agent_name,
operation="send_message",
context_id=context_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
None,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.failed,
error=error_msg,
history=new_messages,
)
finally:
aclose = getattr(event_stream, "aclose", None)
if aclose:

View File

@@ -22,6 +22,13 @@ class BaseHandlerKwargs(TypedDict, total=False):
turn_number: int
is_multiturn: bool
agent_role: str | None
context_id: str | None
task_id: str | None
endpoint: str | None
agent_branch: Any
a2a_agent_name: str | None
from_task: Any
from_agent: Any
class PollingHandlerKwargs(BaseHandlerKwargs, total=False):
@@ -29,8 +36,6 @@ class PollingHandlerKwargs(BaseHandlerKwargs, total=False):
polling_interval: float
polling_timeout: float
endpoint: str
agent_branch: Any
history_length: int
max_polls: int | None
@@ -38,9 +43,6 @@ class PollingHandlerKwargs(BaseHandlerKwargs, total=False):
class StreamingHandlerKwargs(BaseHandlerKwargs, total=False):
"""Kwargs for streaming handler."""
context_id: str | None
task_id: str | None
class PushNotificationHandlerKwargs(BaseHandlerKwargs, total=False):
"""Kwargs for push notification handler."""
@@ -49,7 +51,6 @@ class PushNotificationHandlerKwargs(BaseHandlerKwargs, total=False):
result_store: PushNotificationResultStore
polling_timeout: float
polling_interval: float
agent_branch: Any
class PushNotificationResultStore(Protocol):

View File

@@ -31,6 +31,7 @@ from crewai.a2a.task_helpers import (
from crewai.a2a.updates.base import PollingHandlerKwargs
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import (
A2AConnectionErrorEvent,
A2APollingStartedEvent,
A2APollingStatusEvent,
A2AResponseReceivedEvent,
@@ -49,23 +50,33 @@ async def _poll_task_until_complete(
agent_branch: Any | None = None,
history_length: int = 100,
max_polls: int | None = None,
from_task: Any | None = None,
from_agent: Any | None = None,
context_id: str | None = None,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
) -> A2ATask:
"""Poll task status until terminal state reached.
Args:
client: A2A client instance
task_id: Task ID to poll
polling_interval: Seconds between poll attempts
polling_timeout: Max seconds before timeout
agent_branch: Agent tree branch for logging
history_length: Number of messages to retrieve per poll
max_polls: Max number of poll attempts (None = unlimited)
client: A2A client instance.
task_id: Task ID to poll.
polling_interval: Seconds between poll attempts.
polling_timeout: Max seconds before timeout.
agent_branch: Agent tree branch for logging.
history_length: Number of messages to retrieve per poll.
max_polls: Max number of poll attempts (None = unlimited).
from_task: Optional CrewAI Task object for event metadata.
from_agent: Optional CrewAI Agent object for event metadata.
context_id: A2A context ID for correlation.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
Returns:
Final task object in terminal state
Final task object in terminal state.
Raises:
A2APollingTimeoutError: If polling exceeds timeout or max_polls
A2APollingTimeoutError: If polling exceeds timeout or max_polls.
"""
start_time = time.monotonic()
poll_count = 0
@@ -77,13 +88,19 @@ async def _poll_task_until_complete(
)
elapsed = time.monotonic() - start_time
effective_context_id = task.context_id or context_id
crewai_event_bus.emit(
agent_branch,
A2APollingStatusEvent(
task_id=task_id,
context_id=effective_context_id,
state=str(task.status.state.value) if task.status.state else "unknown",
elapsed_seconds=elapsed,
poll_count=poll_count,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -137,6 +154,9 @@ class PollingHandler:
max_polls = kwargs.get("max_polls")
context_id = kwargs.get("context_id")
task_id = kwargs.get("task_id")
a2a_agent_name = kwargs.get("a2a_agent_name")
from_task = kwargs.get("from_task")
from_agent = kwargs.get("from_agent")
try:
result_or_task_id = await send_message_and_get_task_id(
@@ -146,6 +166,11 @@ class PollingHandler:
turn_number=turn_number,
is_multiturn=is_multiturn,
agent_role=agent_role,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
context_id=context_id,
)
if not isinstance(result_or_task_id, str):
@@ -157,8 +182,12 @@ class PollingHandler:
agent_branch,
A2APollingStartedEvent(
task_id=task_id,
context_id=context_id,
polling_interval=polling_interval,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -170,6 +199,11 @@ class PollingHandler:
agent_branch=agent_branch,
history_length=history_length,
max_polls=max_polls,
from_task=from_task,
from_agent=from_agent,
context_id=context_id,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
)
result = process_task_state(
@@ -179,6 +213,10 @@ class PollingHandler:
turn_number=turn_number,
is_multiturn=is_multiturn,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
)
if result:
return result
@@ -206,9 +244,15 @@ class PollingHandler:
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
@@ -229,14 +273,83 @@ class PollingHandler:
)
new_messages.append(error_message)
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint,
error=str(e),
error_type="http_error",
status_code=e.status_code,
a2a_agent_name=a2a_agent_name,
operation="polling",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
agent_branch,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.failed,
error=error_msg,
history=new_messages,
)
except Exception as e:
error_msg = f"Unexpected error during polling: {e!s}"
error_message = Message(
role=Role.agent,
message_id=str(uuid.uuid4()),
parts=[Part(root=TextPart(text=error_msg))],
context_id=context_id,
task_id=task_id,
)
new_messages.append(error_message)
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(e),
error_type="unexpected_error",
a2a_agent_name=a2a_agent_name,
operation="polling",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
agent_branch,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(

View File

@@ -29,6 +29,7 @@ from crewai.a2a.updates.base import (
)
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import (
A2AConnectionErrorEvent,
A2APushNotificationRegisteredEvent,
A2APushNotificationTimeoutEvent,
A2AResponseReceivedEvent,
@@ -48,6 +49,11 @@ async def _wait_for_push_result(
timeout: float,
poll_interval: float,
agent_branch: Any | None = None,
from_task: Any | None = None,
from_agent: Any | None = None,
context_id: str | None = None,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
) -> A2ATask | None:
"""Wait for push notification result.
@@ -57,6 +63,11 @@ async def _wait_for_push_result(
timeout: Max seconds to wait.
poll_interval: Seconds between polling attempts.
agent_branch: Agent tree branch for logging.
from_task: Optional CrewAI Task object for event metadata.
from_agent: Optional CrewAI Agent object for event metadata.
context_id: A2A context ID for correlation.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent.
Returns:
Final task object, or None if timeout.
@@ -72,7 +83,12 @@ async def _wait_for_push_result(
agent_branch,
A2APushNotificationTimeoutEvent(
task_id=task_id,
context_id=context_id,
timeout_seconds=timeout,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -115,18 +131,56 @@ class PushNotificationHandler:
agent_role = kwargs.get("agent_role")
context_id = kwargs.get("context_id")
task_id = kwargs.get("task_id")
endpoint = kwargs.get("endpoint")
a2a_agent_name = kwargs.get("a2a_agent_name")
from_task = kwargs.get("from_task")
from_agent = kwargs.get("from_agent")
if config is None:
error_msg = (
"PushNotificationConfig is required for push notification handler"
)
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=error_msg,
error_type="configuration_error",
a2a_agent_name=a2a_agent_name,
operation="push_notification",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.failed,
error="PushNotificationConfig is required for push notification handler",
error=error_msg,
history=new_messages,
)
if result_store is None:
error_msg = (
"PushNotificationResultStore is required for push notification handler"
)
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=error_msg,
error_type="configuration_error",
a2a_agent_name=a2a_agent_name,
operation="push_notification",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.failed,
error="PushNotificationResultStore is required for push notification handler",
error=error_msg,
history=new_messages,
)
@@ -138,6 +192,11 @@ class PushNotificationHandler:
turn_number=turn_number,
is_multiturn=is_multiturn,
agent_role=agent_role,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
context_id=context_id,
)
if not isinstance(result_or_task_id, str):
@@ -149,7 +208,12 @@ class PushNotificationHandler:
agent_branch,
A2APushNotificationRegisteredEvent(
task_id=task_id,
context_id=context_id,
callback_url=str(config.url),
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -165,6 +229,11 @@ class PushNotificationHandler:
timeout=polling_timeout,
poll_interval=polling_interval,
agent_branch=agent_branch,
from_task=from_task,
from_agent=from_agent,
context_id=context_id,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
)
if final_task is None:
@@ -181,6 +250,10 @@ class PushNotificationHandler:
turn_number=turn_number,
is_multiturn=is_multiturn,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
)
if result:
return result
@@ -203,14 +276,83 @@ class PushNotificationHandler:
)
new_messages.append(error_message)
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(e),
error_type="http_error",
status_code=e.status_code,
a2a_agent_name=a2a_agent_name,
operation="push_notification",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
agent_branch,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.failed,
error=error_msg,
history=new_messages,
)
except Exception as e:
error_msg = f"Unexpected error during push notification: {e!s}"
error_message = Message(
role=Role.agent,
message_id=str(uuid.uuid4()),
parts=[Part(root=TextPart(text=error_msg))],
context_id=context_id,
task_id=task_id,
)
new_messages.append(error_message)
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(e),
error_type="unexpected_error",
a2a_agent_name=a2a_agent_name,
operation="push_notification",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
agent_branch,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(

View File

@@ -26,7 +26,13 @@ from crewai.a2a.task_helpers import (
)
from crewai.a2a.updates.base import StreamingHandlerKwargs
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import A2AResponseReceivedEvent
from crewai.events.types.a2a_events import (
A2AArtifactReceivedEvent,
A2AConnectionErrorEvent,
A2AResponseReceivedEvent,
A2AStreamingChunkEvent,
A2AStreamingStartedEvent,
)
class StreamingHandler:
@@ -57,19 +63,57 @@ class StreamingHandler:
turn_number = kwargs.get("turn_number", 0)
is_multiturn = kwargs.get("is_multiturn", False)
agent_role = kwargs.get("agent_role")
endpoint = kwargs.get("endpoint")
a2a_agent_name = kwargs.get("a2a_agent_name")
from_task = kwargs.get("from_task")
from_agent = kwargs.get("from_agent")
agent_branch = kwargs.get("agent_branch")
result_parts: list[str] = []
final_result: TaskStateResult | None = None
event_stream = client.send_message(message)
chunk_index = 0
crewai_event_bus.emit(
agent_branch,
A2AStreamingStartedEvent(
task_id=task_id,
context_id=context_id,
endpoint=endpoint or "",
a2a_agent_name=a2a_agent_name,
turn_number=turn_number,
is_multiturn=is_multiturn,
agent_role=agent_role,
from_task=from_task,
from_agent=from_agent,
),
)
try:
async for event in event_stream:
if isinstance(event, Message):
new_messages.append(event)
message_context_id = event.context_id or context_id
for part in event.parts:
if part.root.kind == "text":
text = part.root.text
result_parts.append(text)
crewai_event_bus.emit(
agent_branch,
A2AStreamingChunkEvent(
task_id=event.task_id or task_id,
context_id=message_context_id,
chunk=text,
chunk_index=chunk_index,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
turn_number=turn_number,
is_multiturn=is_multiturn,
from_task=from_task,
from_agent=from_agent,
),
)
chunk_index += 1
elif isinstance(event, tuple):
a2a_task, update = event
@@ -81,10 +125,51 @@ class StreamingHandler:
for part in artifact.parts
if part.root.kind == "text"
)
artifact_size = None
if artifact.parts:
artifact_size = sum(
len(p.root.text.encode("utf-8"))
if p.root.kind == "text"
else len(getattr(p.root, "data", b""))
for p in artifact.parts
)
effective_context_id = a2a_task.context_id or context_id
crewai_event_bus.emit(
agent_branch,
A2AArtifactReceivedEvent(
task_id=a2a_task.id,
artifact_id=artifact.artifact_id,
artifact_name=artifact.name,
artifact_description=artifact.description,
mime_type=artifact.parts[0].root.kind
if artifact.parts
else None,
size_bytes=artifact_size,
append=update.append or False,
last_chunk=update.last_chunk or False,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
context_id=effective_context_id,
turn_number=turn_number,
is_multiturn=is_multiturn,
from_task=from_task,
from_agent=from_agent,
),
)
is_final_update = False
if isinstance(update, TaskStatusUpdateEvent):
is_final_update = update.final
if (
update.status
and update.status.message
and update.status.message.parts
):
result_parts.extend(
part.root.text
for part in update.status.message.parts
if part.root.kind == "text" and part.root.text
)
if (
not is_final_update
@@ -101,6 +186,11 @@ class StreamingHandler:
is_multiturn=is_multiturn,
agent_role=agent_role,
result_parts=result_parts,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
is_final=is_final_update,
)
if final_result:
break
@@ -118,13 +208,82 @@ class StreamingHandler:
new_messages.append(error_message)
crewai_event_bus.emit(
None,
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(e),
error_type="http_error",
status_code=e.status_code,
a2a_agent_name=a2a_agent_name,
operation="streaming",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
agent_branch,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
status=TaskState.failed,
error=error_msg,
history=new_messages,
)
except Exception as e:
error_msg = f"Unexpected error during streaming: {e!s}"
error_message = Message(
role=Role.agent,
message_id=str(uuid.uuid4()),
parts=[Part(root=TextPart(text=error_msg))],
context_id=context_id,
task_id=task_id,
)
new_messages.append(error_message)
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(e),
error_type="unexpected_error",
a2a_agent_name=a2a_agent_name,
operation="streaming",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
crewai_event_bus.emit(
agent_branch,
A2AResponseReceivedEvent(
response=error_msg,
turn_number=turn_number,
context_id=context_id,
is_multiturn=is_multiturn,
status="failed",
final=True,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
from_task=from_task,
from_agent=from_agent,
),
)
return TaskStateResult(
@@ -136,7 +295,23 @@ class StreamingHandler:
finally:
aclose = getattr(event_stream, "aclose", None)
if aclose:
await aclose()
try:
await aclose()
except Exception as close_error:
crewai_event_bus.emit(
agent_branch,
A2AConnectionErrorEvent(
endpoint=endpoint or "",
error=str(close_error),
error_type="stream_close_error",
a2a_agent_name=a2a_agent_name,
operation="stream_close",
context_id=context_id,
task_id=task_id,
from_task=from_task,
from_agent=from_agent,
),
)
if final_result:
return final_result
@@ -145,5 +320,5 @@ class StreamingHandler:
status=TaskState.completed,
result=" ".join(result_parts) if result_parts else "",
history=new_messages,
agent_card=agent_card,
agent_card=agent_card.model_dump(exclude_none=True),
)

View File

@@ -23,6 +23,12 @@ from crewai.a2a.auth.utils import (
)
from crewai.a2a.config import A2AServerConfig
from crewai.crew import Crew
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import (
A2AAgentCardFetchedEvent,
A2AAuthenticationFailedEvent,
A2AConnectionErrorEvent,
)
if TYPE_CHECKING:
@@ -183,6 +189,8 @@ async def _afetch_agent_card_impl(
timeout: int,
) -> AgentCard:
"""Internal async implementation of AgentCard fetching."""
start_time = time.perf_counter()
if "/.well-known/agent-card.json" in endpoint:
base_url = endpoint.replace("/.well-known/agent-card.json", "")
agent_card_path = "/.well-known/agent-card.json"
@@ -217,9 +225,29 @@ async def _afetch_agent_card_impl(
)
response.raise_for_status()
return AgentCard.model_validate(response.json())
agent_card = AgentCard.model_validate(response.json())
fetch_time_ms = (time.perf_counter() - start_time) * 1000
agent_card_dict = agent_card.model_dump(exclude_none=True)
crewai_event_bus.emit(
None,
A2AAgentCardFetchedEvent(
endpoint=endpoint,
a2a_agent_name=agent_card.name,
agent_card=agent_card_dict,
protocol_version=agent_card.protocol_version,
provider=agent_card_dict.get("provider"),
cached=False,
fetch_time_ms=fetch_time_ms,
),
)
return agent_card
except httpx.HTTPStatusError as e:
elapsed_ms = (time.perf_counter() - start_time) * 1000
response_body = e.response.text[:1000] if e.response.text else None
if e.response.status_code == 401:
error_details = ["Authentication failed"]
www_auth = e.response.headers.get("WWW-Authenticate")
@@ -228,7 +256,93 @@ async def _afetch_agent_card_impl(
if not auth:
error_details.append("No auth scheme provided")
msg = " | ".join(error_details)
auth_type = type(auth).__name__ if auth else None
crewai_event_bus.emit(
None,
A2AAuthenticationFailedEvent(
endpoint=endpoint,
auth_type=auth_type,
error=msg,
status_code=401,
metadata={
"elapsed_ms": elapsed_ms,
"response_body": response_body,
"www_authenticate": www_auth,
"request_url": str(e.request.url),
},
),
)
raise A2AClientHTTPError(401, msg) from e
crewai_event_bus.emit(
None,
A2AConnectionErrorEvent(
endpoint=endpoint,
error=str(e),
error_type="http_error",
status_code=e.response.status_code,
operation="fetch_agent_card",
metadata={
"elapsed_ms": elapsed_ms,
"response_body": response_body,
"request_url": str(e.request.url),
},
),
)
raise
except httpx.TimeoutException as e:
elapsed_ms = (time.perf_counter() - start_time) * 1000
crewai_event_bus.emit(
None,
A2AConnectionErrorEvent(
endpoint=endpoint,
error=str(e),
error_type="timeout",
operation="fetch_agent_card",
metadata={
"elapsed_ms": elapsed_ms,
"timeout_config": timeout,
"request_url": str(e.request.url) if e.request else None,
},
),
)
raise
except httpx.ConnectError as e:
elapsed_ms = (time.perf_counter() - start_time) * 1000
crewai_event_bus.emit(
None,
A2AConnectionErrorEvent(
endpoint=endpoint,
error=str(e),
error_type="connection_error",
operation="fetch_agent_card",
metadata={
"elapsed_ms": elapsed_ms,
"request_url": str(e.request.url) if e.request else None,
},
),
)
raise
except httpx.RequestError as e:
elapsed_ms = (time.perf_counter() - start_time) * 1000
crewai_event_bus.emit(
None,
A2AConnectionErrorEvent(
endpoint=endpoint,
error=str(e),
error_type="request_error",
operation="fetch_agent_card",
metadata={
"elapsed_ms": elapsed_ms,
"request_url": str(e.request.url) if e.request else None,
},
),
)
raise

View File

@@ -88,6 +88,9 @@ def execute_a2a_delegation(
response_model: type[BaseModel] | None = None,
turn_number: int | None = None,
updates: UpdateConfig | None = None,
from_task: Any | None = None,
from_agent: Any | None = None,
skill_id: str | None = None,
) -> TaskStateResult:
"""Execute a task delegation to a remote A2A agent synchronously.
@@ -129,6 +132,9 @@ def execute_a2a_delegation(
response_model: Optional Pydantic model for structured outputs.
turn_number: Optional turn number for multi-turn conversations.
updates: Update mechanism config from A2AConfig.updates.
from_task: Optional CrewAI Task object for event metadata.
from_agent: Optional CrewAI Agent object for event metadata.
skill_id: Optional skill ID to target a specific agent capability.
Returns:
TaskStateResult with status, result/error, history, and agent_card.
@@ -156,10 +162,16 @@ def execute_a2a_delegation(
transport_protocol=transport_protocol,
turn_number=turn_number,
updates=updates,
from_task=from_task,
from_agent=from_agent,
skill_id=skill_id,
)
)
finally:
loop.close()
try:
loop.run_until_complete(loop.shutdown_asyncgens())
finally:
loop.close()
async def aexecute_a2a_delegation(
@@ -181,6 +193,9 @@ async def aexecute_a2a_delegation(
response_model: type[BaseModel] | None = None,
turn_number: int | None = None,
updates: UpdateConfig | None = None,
from_task: Any | None = None,
from_agent: Any | None = None,
skill_id: str | None = None,
) -> TaskStateResult:
"""Execute a task delegation to a remote A2A agent asynchronously.
@@ -222,6 +237,9 @@ async def aexecute_a2a_delegation(
response_model: Optional Pydantic model for structured outputs.
turn_number: Optional turn number for multi-turn conversations.
updates: Update mechanism config from A2AConfig.updates.
from_task: Optional CrewAI Task object for event metadata.
from_agent: Optional CrewAI Agent object for event metadata.
skill_id: Optional skill ID to target a specific agent capability.
Returns:
TaskStateResult with status, result/error, history, and agent_card.
@@ -233,17 +251,6 @@ async def aexecute_a2a_delegation(
if turn_number is None:
turn_number = len([m for m in conversation_history if m.role == Role.user]) + 1
crewai_event_bus.emit(
agent_branch,
A2ADelegationStartedEvent(
endpoint=endpoint,
task_description=task_description,
agent_id=agent_id,
is_multiturn=is_multiturn,
turn_number=turn_number,
),
)
result = await _aexecute_a2a_delegation_impl(
endpoint=endpoint,
auth=auth,
@@ -264,15 +271,28 @@ async def aexecute_a2a_delegation(
response_model=response_model,
updates=updates,
transport_protocol=transport_protocol,
from_task=from_task,
from_agent=from_agent,
skill_id=skill_id,
)
agent_card_data: dict[str, Any] = result.get("agent_card") or {}
crewai_event_bus.emit(
agent_branch,
A2ADelegationCompletedEvent(
status=result["status"],
result=result.get("result"),
error=result.get("error"),
context_id=context_id,
is_multiturn=is_multiturn,
endpoint=endpoint,
a2a_agent_name=result.get("a2a_agent_name"),
agent_card=agent_card_data,
provider=agent_card_data.get("provider"),
metadata=metadata,
extensions=list(extensions.keys()) if extensions else None,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -299,6 +319,9 @@ async def _aexecute_a2a_delegation_impl(
agent_role: str | None,
response_model: type[BaseModel] | None,
updates: UpdateConfig | None,
from_task: Any | None = None,
from_agent: Any | None = None,
skill_id: str | None = None,
) -> TaskStateResult:
"""Internal async implementation of A2A delegation."""
if auth:
@@ -331,6 +354,28 @@ async def _aexecute_a2a_delegation_impl(
if agent_card.name:
a2a_agent_name = agent_card.name
agent_card_dict = agent_card.model_dump(exclude_none=True)
crewai_event_bus.emit(
agent_branch,
A2ADelegationStartedEvent(
endpoint=endpoint,
task_description=task_description,
agent_id=agent_id or endpoint,
context_id=context_id,
is_multiturn=is_multiturn,
turn_number=turn_number,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card_dict,
protocol_version=agent_card.protocol_version,
provider=agent_card_dict.get("provider"),
skill_id=skill_id,
metadata=metadata,
extensions=list(extensions.keys()) if extensions else None,
from_task=from_task,
from_agent=from_agent,
),
)
if turn_number == 1:
agent_id_for_event = agent_id or endpoint
crewai_event_bus.emit(
@@ -338,7 +383,17 @@ async def _aexecute_a2a_delegation_impl(
A2AConversationStartedEvent(
agent_id=agent_id_for_event,
endpoint=endpoint,
context_id=context_id,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card_dict,
protocol_version=agent_card.protocol_version,
provider=agent_card_dict.get("provider"),
skill_id=skill_id,
reference_task_ids=reference_task_ids,
metadata=metadata,
extensions=list(extensions.keys()) if extensions else None,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -364,6 +419,10 @@ async def _aexecute_a2a_delegation_impl(
}
)
message_metadata = metadata.copy() if metadata else {}
if skill_id:
message_metadata["skill_id"] = skill_id
message = Message(
role=Role.user,
message_id=str(uuid.uuid4()),
@@ -371,7 +430,7 @@ async def _aexecute_a2a_delegation_impl(
context_id=context_id,
task_id=task_id,
reference_task_ids=reference_task_ids,
metadata=metadata,
metadata=message_metadata if message_metadata else None,
extensions=extensions,
)
@@ -381,8 +440,17 @@ async def _aexecute_a2a_delegation_impl(
A2AMessageSentEvent(
message=message_text,
turn_number=turn_number,
context_id=context_id,
message_id=message.message_id,
is_multiturn=is_multiturn,
agent_role=agent_role,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
skill_id=skill_id,
metadata=message_metadata if message_metadata else None,
extensions=list(extensions.keys()) if extensions else None,
from_task=from_task,
from_agent=from_agent,
),
)
@@ -397,6 +465,9 @@ async def _aexecute_a2a_delegation_impl(
"task_id": task_id,
"endpoint": endpoint,
"agent_branch": agent_branch,
"a2a_agent_name": a2a_agent_name,
"from_task": from_task,
"from_agent": from_agent,
}
if isinstance(updates, PollingConfig):
@@ -434,13 +505,16 @@ async def _aexecute_a2a_delegation_impl(
use_polling=use_polling,
push_notification_config=push_config_for_client,
) as client:
return await handler.execute(
result = await handler.execute(
client=client,
message=message,
new_messages=new_messages,
agent_card=agent_card,
**handler_kwargs,
)
result["a2a_agent_name"] = a2a_agent_name
result["agent_card"] = agent_card.model_dump(exclude_none=True)
return result
@asynccontextmanager

View File

@@ -3,11 +3,14 @@
from __future__ import annotations
import asyncio
import base64
from collections.abc import Callable, Coroutine
from datetime import datetime
from functools import wraps
import logging
import os
from typing import TYPE_CHECKING, Any, ParamSpec, TypeVar, cast
from urllib.parse import urlparse
from a2a.server.agent_execution import RequestContext
from a2a.server.events import EventQueue
@@ -45,7 +48,14 @@ T = TypeVar("T")
def _parse_redis_url(url: str) -> dict[str, Any]:
from urllib.parse import urlparse
"""Parse a Redis URL into aiocache configuration.
Args:
url: Redis connection URL (e.g., redis://localhost:6379/0).
Returns:
Configuration dict for aiocache.RedisCache.
"""
parsed = urlparse(url)
config: dict[str, Any] = {
@@ -127,7 +137,7 @@ def cancellable(
async for message in pubsub.listen():
if message["type"] == "message":
return True
except Exception as e:
except (OSError, ConnectionError) as e:
logger.warning("Cancel watcher error for task_id=%s: %s", task_id, e)
return await poll_for_cancel()
return False
@@ -183,7 +193,12 @@ async def execute(
msg = "task_id and context_id are required"
crewai_event_bus.emit(
agent,
A2AServerTaskFailedEvent(a2a_task_id="", a2a_context_id="", error=msg),
A2AServerTaskFailedEvent(
task_id="",
context_id="",
error=msg,
from_agent=agent,
),
)
raise ServerError(InvalidParamsError(message=msg)) from None
@@ -195,7 +210,12 @@ async def execute(
crewai_event_bus.emit(
agent,
A2AServerTaskStartedEvent(a2a_task_id=task_id, a2a_context_id=context_id),
A2AServerTaskStartedEvent(
task_id=task_id,
context_id=context_id,
from_task=task,
from_agent=agent,
),
)
try:
@@ -215,20 +235,33 @@ async def execute(
crewai_event_bus.emit(
agent,
A2AServerTaskCompletedEvent(
a2a_task_id=task_id, a2a_context_id=context_id, result=str(result)
task_id=task_id,
context_id=context_id,
result=str(result),
from_task=task,
from_agent=agent,
),
)
except asyncio.CancelledError:
crewai_event_bus.emit(
agent,
A2AServerTaskCanceledEvent(a2a_task_id=task_id, a2a_context_id=context_id),
A2AServerTaskCanceledEvent(
task_id=task_id,
context_id=context_id,
from_task=task,
from_agent=agent,
),
)
raise
except Exception as e:
crewai_event_bus.emit(
agent,
A2AServerTaskFailedEvent(
a2a_task_id=task_id, a2a_context_id=context_id, error=str(e)
task_id=task_id,
context_id=context_id,
error=str(e),
from_task=task,
from_agent=agent,
),
)
raise ServerError(
@@ -282,3 +315,85 @@ async def cancel(
context.current_task.status = TaskStatus(state=TaskState.canceled)
return context.current_task
return None
def list_tasks(
tasks: list[A2ATask],
context_id: str | None = None,
status: TaskState | None = None,
status_timestamp_after: datetime | None = None,
page_size: int = 50,
page_token: str | None = None,
history_length: int | None = None,
include_artifacts: bool = False,
) -> tuple[list[A2ATask], str | None, int]:
"""Filter and paginate A2A tasks.
Provides filtering by context, status, and timestamp, along with
cursor-based pagination. This is a pure utility function that operates
on an in-memory list of tasks - storage retrieval is handled separately.
Args:
tasks: All tasks to filter.
context_id: Filter by context ID to get tasks in a conversation.
status: Filter by task state (e.g., completed, working).
status_timestamp_after: Filter to tasks updated after this time.
page_size: Maximum tasks per page (default 50).
page_token: Base64-encoded cursor from previous response.
history_length: Limit history messages per task (None = full history).
include_artifacts: Whether to include task artifacts (default False).
Returns:
Tuple of (filtered_tasks, next_page_token, total_count).
- filtered_tasks: Tasks matching filters, paginated and trimmed.
- next_page_token: Token for next page, or None if no more pages.
- total_count: Total number of tasks matching filters (before pagination).
"""
filtered: list[A2ATask] = []
for task in tasks:
if context_id and task.context_id != context_id:
continue
if status and task.status.state != status:
continue
if status_timestamp_after and task.status.timestamp:
ts = datetime.fromisoformat(task.status.timestamp.replace("Z", "+00:00"))
if ts <= status_timestamp_after:
continue
filtered.append(task)
def get_timestamp(t: A2ATask) -> datetime:
"""Extract timestamp from task status for sorting."""
if t.status.timestamp is None:
return datetime.min
return datetime.fromisoformat(t.status.timestamp.replace("Z", "+00:00"))
filtered.sort(key=get_timestamp, reverse=True)
total = len(filtered)
start = 0
if page_token:
try:
cursor_id = base64.b64decode(page_token).decode()
for idx, task in enumerate(filtered):
if task.id == cursor_id:
start = idx + 1
break
except (ValueError, UnicodeDecodeError):
pass
page = filtered[start : start + page_size]
result: list[A2ATask] = []
for task in page:
task = task.model_copy(deep=True)
if history_length is not None and task.history:
task.history = task.history[-history_length:]
if not include_artifacts:
task.artifacts = None
result.append(task)
next_token: str | None = None
if result and len(result) == page_size:
next_token = base64.b64encode(result[-1].id.encode()).decode()
return result, next_token, total
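A hypothetical usage sketch for `list_tasks`, assuming the caller has already loaded `all_tasks` from its task store; the context ID and page size below are illustrative, not part of the API:
```python
# Page through completed tasks for a single conversation, 25 at a time.
page_token: str | None = None
while True:
    page, page_token, _total = list_tasks(
        all_tasks,                     # assumed: tasks already retrieved from storage
        context_id="ctx-123",          # illustrative context/conversation ID
        status=TaskState.completed,
        page_size=25,
        page_token=page_token,
    )
    for a2a_task in page:
        print(a2a_task.id, a2a_task.status.state)
    if page_token is None:
        break
```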

View File

@@ -6,9 +6,10 @@ Wraps agent classes with A2A delegation capabilities.
from __future__ import annotations
import asyncio
from collections.abc import Callable, Coroutine
from collections.abc import Callable, Coroutine, Mapping
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import wraps
import json
from types import MethodType
from typing import TYPE_CHECKING, Any
@@ -189,7 +190,7 @@ def _execute_task_with_a2a(
a2a_agents: list[A2AConfig | A2AClientConfig],
original_fn: Callable[..., str],
task: Task,
agent_response_model: type[BaseModel],
agent_response_model: type[BaseModel] | None,
context: str | None,
tools: list[BaseTool] | None,
extension_registry: ExtensionRegistry,
@@ -277,7 +278,7 @@ def _execute_task_with_a2a(
def _augment_prompt_with_a2a(
a2a_agents: list[A2AConfig | A2AClientConfig],
task_description: str,
agent_cards: dict[str, AgentCard],
agent_cards: Mapping[str, AgentCard | dict[str, Any]],
conversation_history: list[Message] | None = None,
turn_num: int = 0,
max_turns: int | None = None,
@@ -309,7 +310,15 @@ def _augment_prompt_with_a2a(
for config in a2a_agents:
if config.endpoint in agent_cards:
card = agent_cards[config.endpoint]
agents_text += f"\n{card.model_dump_json(indent=2, exclude_none=True, include={'description', 'url', 'skills'})}\n"
if isinstance(card, dict):
filtered = {
k: v
for k, v in card.items()
if k in {"description", "url", "skills"} and v is not None
}
agents_text += f"\n{json.dumps(filtered, indent=2)}\n"
else:
agents_text += f"\n{card.model_dump_json(indent=2, exclude_none=True, include={'description', 'url', 'skills'})}\n"
failed_agents = failed_agents or {}
if failed_agents:
@@ -377,7 +386,7 @@ IMPORTANT: You have the ability to delegate this task to remote A2A agents.
def _parse_agent_response(
raw_result: str | dict[str, Any], agent_response_model: type[BaseModel]
raw_result: str | dict[str, Any], agent_response_model: type[BaseModel] | None
) -> BaseModel | str | dict[str, Any]:
"""Parse LLM output as AgentResponse or return raw agent response."""
if agent_response_model:
@@ -394,6 +403,11 @@ def _parse_agent_response(
def _handle_max_turns_exceeded(
conversation_history: list[Message],
max_turns: int,
from_task: Any | None = None,
from_agent: Any | None = None,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
agent_card: dict[str, Any] | None = None,
) -> str:
"""Handle the case when max turns is exceeded.
@@ -421,6 +435,11 @@ def _handle_max_turns_exceeded(
final_result=final_message,
error=None,
total_turns=max_turns,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card,
),
)
return final_message
@@ -432,6 +451,11 @@ def _handle_max_turns_exceeded(
final_result=None,
error=f"Conversation exceeded maximum turns ({max_turns})",
total_turns=max_turns,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card,
),
)
raise Exception(f"A2A conversation exceeded maximum turns ({max_turns})")
@@ -442,7 +466,12 @@ def _process_response_result(
disable_structured_output: bool,
turn_num: int,
agent_role: str,
agent_response_model: type[BaseModel],
agent_response_model: type[BaseModel] | None,
from_task: Any | None = None,
from_agent: Any | None = None,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
agent_card: dict[str, Any] | None = None,
) -> tuple[str | None, str | None]:
"""Process LLM response and determine next action.
@@ -461,6 +490,10 @@ def _process_response_result(
turn_number=final_turn_number,
is_multiturn=True,
agent_role=agent_role,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
),
)
crewai_event_bus.emit(
@@ -470,6 +503,11 @@ def _process_response_result(
final_result=result_text,
error=None,
total_turns=final_turn_number,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card,
),
)
return result_text, None
@@ -490,6 +528,10 @@ def _process_response_result(
turn_number=final_turn_number,
is_multiturn=True,
agent_role=agent_role,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
),
)
crewai_event_bus.emit(
@@ -499,6 +541,11 @@ def _process_response_result(
final_result=str(llm_response.message),
error=None,
total_turns=final_turn_number,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card,
),
)
return str(llm_response.message), None
@@ -510,13 +557,15 @@ def _process_response_result(
def _prepare_agent_cards_dict(
a2a_result: TaskStateResult,
agent_id: str,
agent_cards: dict[str, AgentCard] | None,
) -> dict[str, AgentCard]:
agent_cards: Mapping[str, AgentCard | dict[str, Any]] | None,
) -> dict[str, AgentCard | dict[str, Any]]:
"""Prepare agent cards dictionary from result and existing cards.
Shared logic for both sync and async response handlers.
"""
agent_cards_dict = agent_cards or {}
agent_cards_dict: dict[str, AgentCard | dict[str, Any]] = (
dict(agent_cards) if agent_cards else {}
)
if "agent_card" in a2a_result and agent_id not in agent_cards_dict:
agent_cards_dict[agent_id] = a2a_result["agent_card"]
return agent_cards_dict
@@ -529,7 +578,7 @@ def _prepare_delegation_context(
original_task_description: str | None,
) -> tuple[
list[A2AConfig | A2AClientConfig],
type[BaseModel],
type[BaseModel] | None,
str,
str,
A2AConfig | A2AClientConfig,
@@ -598,6 +647,11 @@ def _handle_task_completion(
reference_task_ids: list[str],
agent_config: A2AConfig | A2AClientConfig,
turn_num: int,
from_task: Any | None = None,
from_agent: Any | None = None,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
agent_card: dict[str, Any] | None = None,
) -> tuple[str | None, str | None, list[str]]:
"""Handle task completion state including reference task updates.
@@ -624,6 +678,11 @@ def _handle_task_completion(
final_result=result_text,
error=None,
total_turns=final_turn_number,
from_task=from_task,
from_agent=from_agent,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card,
),
)
return str(result_text), task_id_config, reference_task_ids
@@ -645,8 +704,11 @@ def _handle_agent_response_and_continue(
original_fn: Callable[..., str],
context: str | None,
tools: list[BaseTool] | None,
agent_response_model: type[BaseModel],
agent_response_model: type[BaseModel] | None,
remote_task_completed: bool = False,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
agent_card: dict[str, Any] | None = None,
) -> tuple[str | None, str | None]:
"""Handle A2A result and get CrewAI agent's response.
@@ -698,6 +760,11 @@ def _handle_agent_response_and_continue(
turn_num=turn_num,
agent_role=self.role,
agent_response_model=agent_response_model,
from_task=task,
from_agent=self,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card,
)
@@ -750,6 +817,12 @@ def _delegate_to_a2a(
conversation_history: list[Message] = []
current_agent_card = agent_cards.get(agent_id) if agent_cards else None
current_agent_card_dict = (
current_agent_card.model_dump() if current_agent_card else None
)
current_a2a_agent_name = current_agent_card.name if current_agent_card else None
try:
for turn_num in range(max_turns):
console_formatter = getattr(crewai_event_bus, "_console", None)
@@ -777,6 +850,8 @@ def _delegate_to_a2a(
turn_number=turn_num + 1,
updates=agent_config.updates,
transport_protocol=agent_config.transport_protocol,
from_task=task,
from_agent=self,
)
conversation_history = a2a_result.get("history", [])
@@ -797,6 +872,11 @@ def _delegate_to_a2a(
reference_task_ids,
agent_config,
turn_num,
from_task=task,
from_agent=self,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
)
if trusted_result is not None:
@@ -818,6 +898,9 @@ def _delegate_to_a2a(
tools=tools,
agent_response_model=agent_response_model,
remote_task_completed=(a2a_result["status"] == TaskState.completed),
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
if final_result is not None:
@@ -846,6 +929,9 @@ def _delegate_to_a2a(
tools=tools,
agent_response_model=agent_response_model,
remote_task_completed=False,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
if final_result is not None:
@@ -862,11 +948,24 @@ def _delegate_to_a2a(
final_result=None,
error=error_msg,
total_turns=turn_num + 1,
from_task=task,
from_agent=self,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
),
)
return f"A2A delegation failed: {error_msg}"
return _handle_max_turns_exceeded(conversation_history, max_turns)
return _handle_max_turns_exceeded(
conversation_history,
max_turns,
from_task=task,
from_agent=self,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
finally:
task.description = original_task_description
@@ -916,7 +1015,7 @@ async def _aexecute_task_with_a2a(
a2a_agents: list[A2AConfig | A2AClientConfig],
original_fn: Callable[..., Coroutine[Any, Any, str]],
task: Task,
agent_response_model: type[BaseModel],
agent_response_model: type[BaseModel] | None,
context: str | None,
tools: list[BaseTool] | None,
extension_registry: ExtensionRegistry,
@@ -1001,8 +1100,11 @@ async def _ahandle_agent_response_and_continue(
original_fn: Callable[..., Coroutine[Any, Any, str]],
context: str | None,
tools: list[BaseTool] | None,
agent_response_model: type[BaseModel],
agent_response_model: type[BaseModel] | None,
remote_task_completed: bool = False,
endpoint: str | None = None,
a2a_agent_name: str | None = None,
agent_card: dict[str, Any] | None = None,
) -> tuple[str | None, str | None]:
"""Async version of _handle_agent_response_and_continue."""
agent_cards_dict = _prepare_agent_cards_dict(a2a_result, agent_id, agent_cards)
@@ -1032,6 +1134,11 @@ async def _ahandle_agent_response_and_continue(
turn_num=turn_num,
agent_role=self.role,
agent_response_model=agent_response_model,
from_task=task,
from_agent=self,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
agent_card=agent_card,
)
@@ -1066,6 +1173,12 @@ async def _adelegate_to_a2a(
conversation_history: list[Message] = []
current_agent_card = agent_cards.get(agent_id) if agent_cards else None
current_agent_card_dict = (
current_agent_card.model_dump() if current_agent_card else None
)
current_a2a_agent_name = current_agent_card.name if current_agent_card else None
try:
for turn_num in range(max_turns):
console_formatter = getattr(crewai_event_bus, "_console", None)
@@ -1093,6 +1206,8 @@ async def _adelegate_to_a2a(
turn_number=turn_num + 1,
transport_protocol=agent_config.transport_protocol,
updates=agent_config.updates,
from_task=task,
from_agent=self,
)
conversation_history = a2a_result.get("history", [])
@@ -1113,6 +1228,11 @@ async def _adelegate_to_a2a(
reference_task_ids,
agent_config,
turn_num,
from_task=task,
from_agent=self,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
)
if trusted_result is not None:
@@ -1134,6 +1254,9 @@ async def _adelegate_to_a2a(
tools=tools,
agent_response_model=agent_response_model,
remote_task_completed=(a2a_result["status"] == TaskState.completed),
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
if final_result is not None:
@@ -1161,6 +1284,9 @@ async def _adelegate_to_a2a(
context=context,
tools=tools,
agent_response_model=agent_response_model,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
if final_result is not None:
@@ -1177,11 +1303,24 @@ async def _adelegate_to_a2a(
final_result=None,
error=error_msg,
total_turns=turn_num + 1,
from_task=task,
from_agent=self,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
),
)
return f"A2A delegation failed: {error_msg}"
return _handle_max_turns_exceeded(conversation_history, max_turns)
return _handle_max_turns_exceeded(
conversation_history,
max_turns,
from_task=task,
from_agent=self,
endpoint=agent_config.endpoint,
a2a_agent_name=current_a2a_agent_name,
agent_card=current_agent_card_dict,
)
finally:
task.description = original_task_description
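Not part of the diff above: a minimal sketch of how a custom listener could consume the metadata these changes attach to A2A conversation events (endpoint, a2a_agent_name, agent_card, total_turns). It assumes the global `crewai_event_bus` and the event types added in this PR; the handler body is purely illustrative.

```python
from typing import Any

from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import A2AConversationCompletedEvent


@crewai_event_bus.on(A2AConversationCompletedEvent)
def log_a2a_conversation(source: Any, event: A2AConversationCompletedEvent) -> None:
    # endpoint, a2a_agent_name and agent_card are the fields enriched by this change
    agent_name = event.a2a_agent_name or "unknown remote agent"
    print(
        f"A2A conversation with {agent_name} at {event.endpoint} "
        f"ended as {event.status} after {event.total_turns} turn(s)"
    )
    if event.status == "failed":
        print(f"error: {event.error}")
```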

View File

@@ -1,7 +1,7 @@
from __future__ import annotations
import asyncio
from collections.abc import Callable, Coroutine, Sequence
from collections.abc import Callable, Sequence
import shutil
import subprocess
import time
@@ -34,11 +34,6 @@ from crewai.agents.agent_builder.base_agent import BaseAgent
from crewai.agents.cache.cache_handler import CacheHandler
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.agent_events import (
LiteAgentExecutionCompletedEvent,
LiteAgentExecutionErrorEvent,
LiteAgentExecutionStartedEvent,
)
from crewai.events.types.knowledge_events import (
KnowledgeQueryCompletedEvent,
KnowledgeQueryFailedEvent,
@@ -48,10 +43,10 @@ from crewai.events.types.memory_events import (
MemoryRetrievalCompletedEvent,
MemoryRetrievalStartedEvent,
)
from crewai.experimental.agent_executor import AgentExecutor
from crewai.experimental.crew_agent_executor_flow import CrewAgentExecutorFlow
from crewai.knowledge.knowledge import Knowledge
from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
from crewai.lite_agent_output import LiteAgentOutput
from crewai.lite_agent import LiteAgent
from crewai.llms.base_llm import BaseLLM
from crewai.mcp import (
MCPClient,
@@ -69,18 +64,15 @@ from crewai.security.fingerprint import Fingerprint
from crewai.tools.agent_tools.agent_tools import AgentTools
from crewai.utilities.agent_utils import (
get_tool_names,
is_inside_event_loop,
load_agent_from_repository,
parse_tools,
render_text_description_and_args,
)
from crewai.utilities.constants import TRAINED_AGENTS_DATA_FILE, TRAINING_DATA_FILE
from crewai.utilities.converter import Converter, ConverterError
from crewai.utilities.guardrail import process_guardrail
from crewai.utilities.converter import Converter
from crewai.utilities.guardrail_types import GuardrailType
from crewai.utilities.llm_utils import create_llm
from crewai.utilities.prompts import Prompts, StandardPromptResult, SystemPromptResult
from crewai.utilities.pydantic_schema_utils import generate_model_description
from crewai.utilities.token_counter_callback import TokenCalcHandler
from crewai.utilities.training_handler import CrewTrainingHandler
@@ -97,9 +89,9 @@ if TYPE_CHECKING:
from crewai_tools import CodeInterpreterTool
from crewai.agents.agent_builder.base_agent import PlatformAppOrAction
from crewai.lite_agent_output import LiteAgentOutput
from crewai.task import Task
from crewai.tools.base_tool import BaseTool
from crewai.tools.structured_tool import CrewStructuredTool
from crewai.utilities.types import LLMMessage
@@ -121,7 +113,7 @@ class Agent(BaseAgent):
The agent can also have memory, can operate in verbose mode, and can delegate tasks to other agents.
Attributes:
agent_executor: An instance of the CrewAgentExecutor or AgentExecutor class.
agent_executor: An instance of the CrewAgentExecutor or CrewAgentExecutorFlow class.
role: The role of the agent.
goal: The objective of the agent.
backstory: The backstory of the agent.
@@ -246,9 +238,9 @@ class Agent(BaseAgent):
Can be a single A2AConfig/A2AClientConfig/A2AServerConfig, or a list of any number of A2AConfig/A2AClientConfig with a single A2AServerConfig.
""",
)
executor_class: type[CrewAgentExecutor] | type[AgentExecutor] = Field(
executor_class: type[CrewAgentExecutor] | type[CrewAgentExecutorFlow] = Field(
default=CrewAgentExecutor,
description="Class to use for the agent executor. Defaults to CrewAgentExecutor, can optionally use AgentExecutor.",
description="Class to use for the agent executor. Defaults to CrewAgentExecutor, can optionally use CrewAgentExecutorFlow.",
)
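As an illustrative aside (not part of this diff), opting into the flow-based executor would look roughly like the sketch below; the agent arguments are placeholders, and only `executor_class` and the import path come from the changes above.

```python
from crewai import Agent
from crewai.experimental.crew_agent_executor_flow import CrewAgentExecutorFlow

researcher = Agent(
    role="Researcher",
    goal="Summarize recent findings on a topic",
    backstory="An analyst who works through sources quickly.",
    executor_class=CrewAgentExecutorFlow,  # the default remains CrewAgentExecutor
)
```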
@model_validator(mode="before")
@@ -1591,25 +1583,26 @@ class Agent(BaseAgent):
)
return None
def _prepare_kickoff(
def kickoff(
self,
messages: str | list[LLMMessage],
response_format: type[Any] | None = None,
) -> tuple[AgentExecutor, dict[str, str], dict[str, Any], list[CrewStructuredTool]]:
"""Prepare common setup for kickoff execution.
) -> LiteAgentOutput:
"""
Execute the agent with the given messages using a LiteAgent instance.
This method handles all the common preparation logic shared between
kickoff() and kickoff_async(), including tool processing, prompt building,
executor creation, and input formatting.
This method is useful when you want to use the Agent configuration but
with the simpler and more direct execution flow of LiteAgent.
Args:
messages: Either a string query or a list of message dictionaries.
If a string is provided, it will be converted to a user message.
If a list is provided, each dict should have 'role' and 'content' keys.
response_format: Optional Pydantic model for structured output.
Returns:
Tuple of (executor, inputs, agent_info, parsed_tools) ready for execution.
LiteAgentOutput: The result of the agent execution.
"""
# Process platform apps and MCP tools
if self.apps:
platform_tools = self.get_platform_tools(self.apps)
if platform_tools and self.tools is not None:
@@ -1619,359 +1612,25 @@ class Agent(BaseAgent):
if mcps and self.tools is not None:
self.tools.extend(mcps)
# Prepare tools
raw_tools: list[BaseTool] = self.tools or []
parsed_tools = parse_tools(raw_tools)
# Build agent_info for backward-compatible event emission
agent_info = {
"id": self.id,
"role": self.role,
"goal": self.goal,
"backstory": self.backstory,
"tools": raw_tools,
"verbose": self.verbose,
}
# Build prompt for standalone execution
prompt = Prompts(
agent=self,
has_tools=len(raw_tools) > 0,
i18n=self.i18n,
use_system_prompt=self.use_system_prompt,
system_template=self.system_template,
prompt_template=self.prompt_template,
response_template=self.response_template,
).task_execution()
# Prepare stop words
stop_words = [self.i18n.slice("observation")]
if self.response_template:
stop_words.append(
self.response_template.split("{{ .Response }}")[1].strip()
)
# Get RPM limit function
rpm_limit_fn = (
self._rpm_controller.check_or_wait if self._rpm_controller else None
)
# Create the executor for standalone mode (no crew, no task)
executor = AgentExecutor(
task=None,
crew=None,
llm=cast(BaseLLM, self.llm),
agent=self,
prompt=prompt,
max_iter=self.max_iter,
tools=parsed_tools,
tools_names=get_tool_names(parsed_tools),
stop_words=stop_words,
tools_description=render_text_description_and_args(parsed_tools),
tools_handler=self.tools_handler,
original_tools=raw_tools,
step_callback=self.step_callback,
function_calling_llm=self.function_calling_llm,
lite_agent = LiteAgent(
id=self.id,
role=self.role,
goal=self.goal,
backstory=self.backstory,
llm=self.llm,
tools=self.tools or [],
max_iterations=self.max_iter,
max_execution_time=self.max_execution_time,
respect_context_window=self.respect_context_window,
request_within_rpm_limit=rpm_limit_fn,
callbacks=[TokenCalcHandler(self._token_process)],
response_model=response_format,
verbose=self.verbose,
response_format=response_format,
i18n=self.i18n,
original_agent=self,
guardrail=self.guardrail,
guardrail_max_retries=self.guardrail_max_retries,
)
# Format messages
if isinstance(messages, str):
formatted_messages = messages
else:
formatted_messages = "\n".join(
str(msg.get("content", "")) for msg in messages if msg.get("content")
)
# Build the input dict for the executor
inputs = {
"input": formatted_messages,
"tool_names": get_tool_names(parsed_tools),
"tools": render_text_description_and_args(parsed_tools),
}
return executor, inputs, agent_info, parsed_tools
def kickoff(
self,
messages: str | list[LLMMessage],
response_format: type[Any] | None = None,
) -> LiteAgentOutput | Coroutine[Any, Any, LiteAgentOutput]:
"""
Execute the agent with the given messages using the AgentExecutor.
This method provides standalone agent execution without requiring a Crew.
It supports tools, response formatting, and guardrails.
When called from within a Flow (sync or async method), this automatically
detects the event loop and returns a coroutine that the Flow framework
awaits. Users don't need to handle async explicitly.
Args:
messages: Either a string query or a list of message dictionaries.
If a string is provided, it will be converted to a user message.
If a list is provided, each dict should have 'role' and 'content' keys.
response_format: Optional Pydantic model for structured output.
Returns:
LiteAgentOutput: The result of the agent execution.
When inside a Flow, returns a coroutine that resolves to LiteAgentOutput.
Note:
For explicit async usage outside of Flow, use kickoff_async() directly.
"""
# Magic auto-async: if inside event loop (e.g., inside a Flow),
# return coroutine for Flow to await
if is_inside_event_loop():
return self.kickoff_async(messages, response_format)
executor, inputs, agent_info, parsed_tools = self._prepare_kickoff(
messages, response_format
)
try:
crewai_event_bus.emit(
self,
event=LiteAgentExecutionStartedEvent(
agent_info=agent_info,
tools=parsed_tools,
messages=messages,
),
)
output = self._execute_and_build_output(executor, inputs, response_format)
if self.guardrail is not None:
output = self._process_kickoff_guardrail(
output=output,
executor=executor,
inputs=inputs,
response_format=response_format,
)
crewai_event_bus.emit(
self,
event=LiteAgentExecutionCompletedEvent(
agent_info=agent_info,
output=output.raw,
),
)
return output
except Exception as e:
crewai_event_bus.emit(
self,
event=LiteAgentExecutionErrorEvent(
agent_info=agent_info,
error=str(e),
),
)
raise
def _execute_and_build_output(
self,
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent and build the output object.
Args:
executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format.
Returns:
LiteAgentOutput with raw output, formatted result, and metrics.
"""
import json
# Execute the agent (this is called from sync path, so invoke returns dict)
result = cast(dict[str, Any], executor.invoke(inputs))
raw_output = result.get("output", "")
# Handle response format conversion
formatted_result: BaseModel | None = None
if response_format:
try:
model_schema = generate_model_description(response_format)
schema = json.dumps(model_schema, indent=2)
instructions = self.i18n.slice("formatted_task_instructions").format(
output_format=schema
)
converter = Converter(
llm=self.llm,
text=raw_output,
model=response_format,
instructions=instructions,
)
conversion_result = converter.to_pydantic()
if isinstance(conversion_result, BaseModel):
formatted_result = conversion_result
except ConverterError:
pass # Keep raw output if conversion fails
# Get token usage metrics
if isinstance(self.llm, BaseLLM):
usage_metrics = self.llm.get_token_usage_summary()
else:
usage_metrics = self._token_process.get_summary()
return LiteAgentOutput(
raw=raw_output,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages,
)
async def _execute_and_build_output_async(
self,
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""Execute the agent asynchronously and build the output object.
This is the async version of _execute_and_build_output that uses
invoke_async() for native async execution within event loops.
Args:
executor: The executor instance.
inputs: Input dictionary for execution.
response_format: Optional response format.
Returns:
LiteAgentOutput with raw output, formatted result, and metrics.
"""
import json
# Execute the agent asynchronously
result = await executor.invoke_async(inputs)
raw_output = result.get("output", "")
# Handle response format conversion
formatted_result: BaseModel | None = None
if response_format:
try:
model_schema = generate_model_description(response_format)
schema = json.dumps(model_schema, indent=2)
instructions = self.i18n.slice("formatted_task_instructions").format(
output_format=schema
)
converter = Converter(
llm=self.llm,
text=raw_output,
model=response_format,
instructions=instructions,
)
conversion_result = converter.to_pydantic()
if isinstance(conversion_result, BaseModel):
formatted_result = conversion_result
except ConverterError:
pass # Keep raw output if conversion fails
# Get token usage metrics
if isinstance(self.llm, BaseLLM):
usage_metrics = self.llm.get_token_usage_summary()
else:
usage_metrics = self._token_process.get_summary()
return LiteAgentOutput(
raw=raw_output,
pydantic=formatted_result,
agent_role=self.role,
usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
messages=executor.messages,
)
def _process_kickoff_guardrail(
self,
output: LiteAgentOutput,
executor: AgentExecutor,
inputs: dict[str, str],
response_format: type[Any] | None = None,
retry_count: int = 0,
) -> LiteAgentOutput:
"""Process guardrail for kickoff execution with retry logic.
Args:
output: Current agent output.
executor: The executor instance.
inputs: Input dictionary for re-execution.
response_format: Optional response format.
retry_count: Current retry count.
Returns:
Validated/updated output.
"""
from crewai.utilities.guardrail_types import GuardrailCallable
# Ensure guardrail is callable
guardrail_callable: GuardrailCallable
if isinstance(self.guardrail, str):
from crewai.tasks.llm_guardrail import LLMGuardrail
guardrail_callable = cast(
GuardrailCallable,
LLMGuardrail(description=self.guardrail, llm=cast(BaseLLM, self.llm)),
)
elif callable(self.guardrail):
guardrail_callable = self.guardrail
else:
# Should not happen if called from kickoff with guardrail check
return output
guardrail_result = process_guardrail(
output=output,
guardrail=guardrail_callable,
retry_count=retry_count,
event_source=self,
from_agent=self,
)
if not guardrail_result.success:
if retry_count >= self.guardrail_max_retries:
raise ValueError(
f"Agent's guardrail failed validation after {self.guardrail_max_retries} retries. "
f"Last error: {guardrail_result.error}"
)
# Add feedback and re-execute
executor._append_message_to_state(
guardrail_result.error or "Guardrail validation failed",
role="user",
)
# Re-execute and build new output
output = self._execute_and_build_output(executor, inputs, response_format)
# Recursively retry guardrail
return self._process_kickoff_guardrail(
output=output,
executor=executor,
inputs=inputs,
response_format=response_format,
retry_count=retry_count + 1,
)
# Apply guardrail result if available
if guardrail_result.result is not None:
if isinstance(guardrail_result.result, str):
output.raw = guardrail_result.result
elif isinstance(guardrail_result.result, BaseModel):
output.pydantic = guardrail_result.result
return output
return lite_agent.kickoff(messages)
async def kickoff_async(
self,
@@ -1979,11 +1638,9 @@ class Agent(BaseAgent):
response_format: type[Any] | None = None,
) -> LiteAgentOutput:
"""
Execute the agent asynchronously with the given messages.
Execute the agent asynchronously with the given messages using a LiteAgent instance.
This is the async version of the kickoff method that uses native async
execution. It is designed for use within async contexts, such as when
called from within an async Flow method.
This is the async version of the kickoff method.
Args:
messages: Either a string query or a list of message dictionaries.
@@ -1994,48 +1651,21 @@ class Agent(BaseAgent):
Returns:
LiteAgentOutput: The result of the agent execution.
"""
executor, inputs, agent_info, parsed_tools = self._prepare_kickoff(
messages, response_format
lite_agent = LiteAgent(
role=self.role,
goal=self.goal,
backstory=self.backstory,
llm=self.llm,
tools=self.tools or [],
max_iterations=self.max_iter,
max_execution_time=self.max_execution_time,
respect_context_window=self.respect_context_window,
verbose=self.verbose,
response_format=response_format,
i18n=self.i18n,
original_agent=self,
guardrail=self.guardrail,
guardrail_max_retries=self.guardrail_max_retries,
)
try:
crewai_event_bus.emit(
self,
event=LiteAgentExecutionStartedEvent(
agent_info=agent_info,
tools=parsed_tools,
messages=messages,
),
)
output = await self._execute_and_build_output_async(
executor, inputs, response_format
)
if self.guardrail is not None:
output = self._process_kickoff_guardrail(
output=output,
executor=executor,
inputs=inputs,
response_format=response_format,
)
crewai_event_bus.emit(
self,
event=LiteAgentExecutionCompletedEvent(
agent_info=agent_info,
output=output.raw,
),
)
return output
except Exception as e:
crewai_event_bus.emit(
self,
event=LiteAgentExecutionErrorEvent(
agent_info=agent_info,
error=str(e),
),
)
raise
return await lite_agent.kickoff_async(messages)
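For context, a hedged usage sketch of the LiteAgent-backed kickoff path defined above. The agent definition and the Pydantic model are illustrative, and actually running it requires a configured LLM; the `raw` and `pydantic` fields match the `LiteAgentOutput` shape used in this file.

```python
from pydantic import BaseModel

from crewai import Agent


class CityFacts(BaseModel):
    city: str
    population: int


agent = Agent(
    role="Geography assistant",
    goal="Answer factual questions about cities",
    backstory="A concise, factual assistant.",
)

output = agent.kickoff(
    "What is the population of Tokyo?",
    response_format=CityFacts,
)
print(output.raw)       # raw text answer
print(output.pydantic)  # CityFacts instance when structured conversion succeeds
```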

View File

@@ -21,9 +21,9 @@ if TYPE_CHECKING:
class CrewAgentExecutorMixin:
crew: Crew | None
crew: Crew
agent: Agent
task: Task | None
task: Task
iterations: int
max_iter: int
messages: list[LLMMessage]

View File

@@ -1,19 +1,28 @@
from crewai.events.types.a2a_events import (
A2AAgentCardFetchedEvent,
A2AArtifactReceivedEvent,
A2AAuthenticationFailedEvent,
A2AConnectionErrorEvent,
A2AConversationCompletedEvent,
A2AConversationStartedEvent,
A2ADelegationCompletedEvent,
A2ADelegationStartedEvent,
A2AMessageSentEvent,
A2AParallelDelegationCompletedEvent,
A2AParallelDelegationStartedEvent,
A2APollingStartedEvent,
A2APollingStatusEvent,
A2APushNotificationReceivedEvent,
A2APushNotificationRegisteredEvent,
A2APushNotificationSentEvent,
A2APushNotificationTimeoutEvent,
A2AResponseReceivedEvent,
A2AServerTaskCanceledEvent,
A2AServerTaskCompletedEvent,
A2AServerTaskFailedEvent,
A2AServerTaskStartedEvent,
A2AStreamingChunkEvent,
A2AStreamingStartedEvent,
)
from crewai.events.types.agent_events import (
AgentExecutionCompletedEvent,
@@ -93,7 +102,11 @@ from crewai.events.types.tool_usage_events import (
EventTypes = (
A2AConversationCompletedEvent
A2AAgentCardFetchedEvent
| A2AArtifactReceivedEvent
| A2AAuthenticationFailedEvent
| A2AConnectionErrorEvent
| A2AConversationCompletedEvent
| A2AConversationStartedEvent
| A2ADelegationCompletedEvent
| A2ADelegationStartedEvent
@@ -102,12 +115,17 @@ EventTypes = (
| A2APollingStatusEvent
| A2APushNotificationReceivedEvent
| A2APushNotificationRegisteredEvent
| A2APushNotificationSentEvent
| A2APushNotificationTimeoutEvent
| A2AResponseReceivedEvent
| A2AServerTaskCanceledEvent
| A2AServerTaskCompletedEvent
| A2AServerTaskFailedEvent
| A2AServerTaskStartedEvent
| A2AStreamingChunkEvent
| A2AStreamingStartedEvent
| A2AParallelDelegationStartedEvent
| A2AParallelDelegationCompletedEvent
| CrewKickoffStartedEvent
| CrewKickoffCompletedEvent
| CrewKickoffFailedEvent

View File

@@ -1,7 +1,7 @@
"""Trace collection listener for orchestrating trace collection."""
import os
from typing import Any, ClassVar, cast
from typing import Any, ClassVar
import uuid
from typing_extensions import Self
@@ -18,6 +18,32 @@ from crewai.events.listeners.tracing.types import TraceEvent
from crewai.events.listeners.tracing.utils import (
safe_serialize_to_dict,
)
from crewai.events.types.a2a_events import (
A2AAgentCardFetchedEvent,
A2AArtifactReceivedEvent,
A2AAuthenticationFailedEvent,
A2AConnectionErrorEvent,
A2AConversationCompletedEvent,
A2AConversationStartedEvent,
A2ADelegationCompletedEvent,
A2ADelegationStartedEvent,
A2AMessageSentEvent,
A2AParallelDelegationCompletedEvent,
A2AParallelDelegationStartedEvent,
A2APollingStartedEvent,
A2APollingStatusEvent,
A2APushNotificationReceivedEvent,
A2APushNotificationRegisteredEvent,
A2APushNotificationSentEvent,
A2APushNotificationTimeoutEvent,
A2AResponseReceivedEvent,
A2AServerTaskCanceledEvent,
A2AServerTaskCompletedEvent,
A2AServerTaskFailedEvent,
A2AServerTaskStartedEvent,
A2AStreamingChunkEvent,
A2AStreamingStartedEvent,
)
from crewai.events.types.agent_events import (
AgentExecutionCompletedEvent,
AgentExecutionErrorEvent,
@@ -105,7 +131,7 @@ class TraceCollectionListener(BaseEventListener):
"""Create or return singleton instance."""
if cls._instance is None:
cls._instance = super().__new__(cls)
return cast(Self, cls._instance)
return cls._instance
def __init__(
self,
@@ -160,6 +186,7 @@ class TraceCollectionListener(BaseEventListener):
self._register_flow_event_handlers(crewai_event_bus)
self._register_context_event_handlers(crewai_event_bus)
self._register_action_event_handlers(crewai_event_bus)
self._register_a2a_event_handlers(crewai_event_bus)
self._register_system_event_handlers(crewai_event_bus)
self._listeners_setup = True
@@ -439,6 +466,147 @@ class TraceCollectionListener(BaseEventListener):
) -> None:
self._handle_action_event("knowledge_query_failed", source, event)
def _register_a2a_event_handlers(self, event_bus: CrewAIEventsBus) -> None:
"""Register handlers for A2A (Agent-to-Agent) events."""
@event_bus.on(A2ADelegationStartedEvent)
def on_a2a_delegation_started(
source: Any, event: A2ADelegationStartedEvent
) -> None:
self._handle_action_event("a2a_delegation_started", source, event)
@event_bus.on(A2ADelegationCompletedEvent)
def on_a2a_delegation_completed(
source: Any, event: A2ADelegationCompletedEvent
) -> None:
self._handle_action_event("a2a_delegation_completed", source, event)
@event_bus.on(A2AConversationStartedEvent)
def on_a2a_conversation_started(
source: Any, event: A2AConversationStartedEvent
) -> None:
self._handle_action_event("a2a_conversation_started", source, event)
@event_bus.on(A2AMessageSentEvent)
def on_a2a_message_sent(source: Any, event: A2AMessageSentEvent) -> None:
self._handle_action_event("a2a_message_sent", source, event)
@event_bus.on(A2AResponseReceivedEvent)
def on_a2a_response_received(
source: Any, event: A2AResponseReceivedEvent
) -> None:
self._handle_action_event("a2a_response_received", source, event)
@event_bus.on(A2AConversationCompletedEvent)
def on_a2a_conversation_completed(
source: Any, event: A2AConversationCompletedEvent
) -> None:
self._handle_action_event("a2a_conversation_completed", source, event)
@event_bus.on(A2APollingStartedEvent)
def on_a2a_polling_started(source: Any, event: A2APollingStartedEvent) -> None:
self._handle_action_event("a2a_polling_started", source, event)
@event_bus.on(A2APollingStatusEvent)
def on_a2a_polling_status(source: Any, event: A2APollingStatusEvent) -> None:
self._handle_action_event("a2a_polling_status", source, event)
@event_bus.on(A2APushNotificationRegisteredEvent)
def on_a2a_push_notification_registered(
source: Any, event: A2APushNotificationRegisteredEvent
) -> None:
self._handle_action_event("a2a_push_notification_registered", source, event)
@event_bus.on(A2APushNotificationReceivedEvent)
def on_a2a_push_notification_received(
source: Any, event: A2APushNotificationReceivedEvent
) -> None:
self._handle_action_event("a2a_push_notification_received", source, event)
@event_bus.on(A2APushNotificationSentEvent)
def on_a2a_push_notification_sent(
source: Any, event: A2APushNotificationSentEvent
) -> None:
self._handle_action_event("a2a_push_notification_sent", source, event)
@event_bus.on(A2APushNotificationTimeoutEvent)
def on_a2a_push_notification_timeout(
source: Any, event: A2APushNotificationTimeoutEvent
) -> None:
self._handle_action_event("a2a_push_notification_timeout", source, event)
@event_bus.on(A2AStreamingStartedEvent)
def on_a2a_streaming_started(
source: Any, event: A2AStreamingStartedEvent
) -> None:
self._handle_action_event("a2a_streaming_started", source, event)
@event_bus.on(A2AStreamingChunkEvent)
def on_a2a_streaming_chunk(source: Any, event: A2AStreamingChunkEvent) -> None:
self._handle_action_event("a2a_streaming_chunk", source, event)
@event_bus.on(A2AAgentCardFetchedEvent)
def on_a2a_agent_card_fetched(
source: Any, event: A2AAgentCardFetchedEvent
) -> None:
self._handle_action_event("a2a_agent_card_fetched", source, event)
@event_bus.on(A2AAuthenticationFailedEvent)
def on_a2a_authentication_failed(
source: Any, event: A2AAuthenticationFailedEvent
) -> None:
self._handle_action_event("a2a_authentication_failed", source, event)
@event_bus.on(A2AArtifactReceivedEvent)
def on_a2a_artifact_received(
source: Any, event: A2AArtifactReceivedEvent
) -> None:
self._handle_action_event("a2a_artifact_received", source, event)
@event_bus.on(A2AConnectionErrorEvent)
def on_a2a_connection_error(
source: Any, event: A2AConnectionErrorEvent
) -> None:
self._handle_action_event("a2a_connection_error", source, event)
@event_bus.on(A2AServerTaskStartedEvent)
def on_a2a_server_task_started(
source: Any, event: A2AServerTaskStartedEvent
) -> None:
self._handle_action_event("a2a_server_task_started", source, event)
@event_bus.on(A2AServerTaskCompletedEvent)
def on_a2a_server_task_completed(
source: Any, event: A2AServerTaskCompletedEvent
) -> None:
self._handle_action_event("a2a_server_task_completed", source, event)
@event_bus.on(A2AServerTaskCanceledEvent)
def on_a2a_server_task_canceled(
source: Any, event: A2AServerTaskCanceledEvent
) -> None:
self._handle_action_event("a2a_server_task_canceled", source, event)
@event_bus.on(A2AServerTaskFailedEvent)
def on_a2a_server_task_failed(
source: Any, event: A2AServerTaskFailedEvent
) -> None:
self._handle_action_event("a2a_server_task_failed", source, event)
@event_bus.on(A2AParallelDelegationStartedEvent)
def on_a2a_parallel_delegation_started(
source: Any, event: A2AParallelDelegationStartedEvent
) -> None:
self._handle_action_event("a2a_parallel_delegation_started", source, event)
@event_bus.on(A2AParallelDelegationCompletedEvent)
def on_a2a_parallel_delegation_completed(
source: Any, event: A2AParallelDelegationCompletedEvent
) -> None:
self._handle_action_event(
"a2a_parallel_delegation_completed", source, event
)
def _register_system_event_handlers(self, event_bus: CrewAIEventsBus) -> None:
"""Register handlers for system signal events (SIGTERM, SIGINT, etc.)."""
@@ -570,10 +738,15 @@ class TraceCollectionListener(BaseEventListener):
if event_type not in self.complex_events:
return safe_serialize_to_dict(event)
if event_type == "task_started":
task_name = event.task.name or event.task.description
task_display_name = (
task_name[:80] + "..." if len(task_name) > 80 else task_name
)
return {
"task_description": event.task.description,
"expected_output": event.task.expected_output,
"task_name": event.task.name or event.task.description,
"task_name": task_name,
"task_display_name": task_display_name,
"context": event.context,
"agent_role": source.agent.role,
"task_id": str(event.task.id),

View File

@@ -4,68 +4,120 @@ This module defines events emitted during A2A protocol delegation,
including both single-turn and multiturn conversation flows.
"""
from __future__ import annotations
from typing import Any, Literal
from pydantic import model_validator
from crewai.events.base_events import BaseEvent
class A2AEventBase(BaseEvent):
"""Base class for A2A events with task/agent context."""
from_task: Any | None = None
from_agent: Any | None = None
from_task: Any = None
from_agent: Any = None
def __init__(self, **data: Any) -> None:
"""Initialize A2A event, extracting task and agent metadata."""
if data.get("from_task"):
task = data["from_task"]
@model_validator(mode="before")
@classmethod
def extract_task_and_agent_metadata(cls, data: dict[str, Any]) -> dict[str, Any]:
"""Extract task and agent metadata before validation."""
if task := data.get("from_task"):
data["task_id"] = str(task.id)
data["task_name"] = task.name or task.description
data.setdefault("source_fingerprint", str(task.id))
data.setdefault("source_type", "task")
data.setdefault(
"fingerprint_metadata",
{
"task_id": str(task.id),
"task_name": task.name or task.description,
},
)
data["from_task"] = None
if data.get("from_agent"):
agent = data["from_agent"]
if agent := data.get("from_agent"):
data["agent_id"] = str(agent.id)
data["agent_role"] = agent.role
data.setdefault("source_fingerprint", str(agent.id))
data.setdefault("source_type", "agent")
data.setdefault(
"fingerprint_metadata",
{
"agent_id": str(agent.id),
"agent_role": agent.role,
},
)
data["from_agent"] = None
super().__init__(**data)
return data
class A2ADelegationStartedEvent(A2AEventBase):
"""Event emitted when A2A delegation starts.
Attributes:
endpoint: A2A agent endpoint URL (AgentCard URL)
task_description: Task being delegated to the A2A agent
agent_id: A2A agent identifier
is_multiturn: Whether this is part of a multiturn conversation
turn_number: Current turn number (1-indexed, 1 for single-turn)
endpoint: A2A agent endpoint URL (AgentCard URL).
task_description: Task being delegated to the A2A agent.
agent_id: A2A agent identifier.
context_id: A2A context ID grouping related tasks.
is_multiturn: Whether this is part of a multiturn conversation.
turn_number: Current turn number (1-indexed, 1 for single-turn).
a2a_agent_name: Name of the A2A agent from agent card.
agent_card: Full A2A agent card metadata.
protocol_version: A2A protocol version being used.
provider: Agent provider/organization info from agent card.
skill_id: ID of the specific skill being invoked.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_delegation_started"
endpoint: str
task_description: str
agent_id: str
context_id: str | None = None
is_multiturn: bool = False
turn_number: int = 1
a2a_agent_name: str | None = None
agent_card: dict[str, Any] | None = None
protocol_version: str | None = None
provider: dict[str, Any] | None = None
skill_id: str | None = None
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2ADelegationCompletedEvent(A2AEventBase):
"""Event emitted when A2A delegation completes.
Attributes:
status: Completion status (completed, input_required, failed, etc.)
result: Result message if status is completed
error: Error/response message (error for failed, response for input_required)
is_multiturn: Whether this is part of a multiturn conversation
status: Completion status (completed, input_required, failed, etc.).
result: Result message if status is completed.
error: Error/response message (error for failed, response for input_required).
context_id: A2A context ID grouping related tasks.
is_multiturn: Whether this is part of a multiturn conversation.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
agent_card: Full A2A agent card metadata.
provider: Agent provider/organization info from agent card.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_delegation_completed"
status: str
result: str | None = None
error: str | None = None
context_id: str | None = None
is_multiturn: bool = False
endpoint: str | None = None
a2a_agent_name: str | None = None
agent_card: dict[str, Any] | None = None
provider: dict[str, Any] | None = None
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2AConversationStartedEvent(A2AEventBase):
@@ -75,51 +127,95 @@ class A2AConversationStartedEvent(A2AEventBase):
before the first message exchange.
Attributes:
agent_id: A2A agent identifier
endpoint: A2A agent endpoint URL
a2a_agent_name: Name of the A2A agent from agent card
agent_id: A2A agent identifier.
endpoint: A2A agent endpoint URL.
context_id: A2A context ID grouping related tasks.
a2a_agent_name: Name of the A2A agent from agent card.
agent_card: Full A2A agent card metadata.
protocol_version: A2A protocol version being used.
provider: Agent provider/organization info from agent card.
skill_id: ID of the specific skill being invoked.
reference_task_ids: Related task IDs for context.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_conversation_started"
agent_id: str
endpoint: str
context_id: str | None = None
a2a_agent_name: str | None = None
agent_card: dict[str, Any] | None = None
protocol_version: str | None = None
provider: dict[str, Any] | None = None
skill_id: str | None = None
reference_task_ids: list[str] | None = None
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2AMessageSentEvent(A2AEventBase):
"""Event emitted when a message is sent to the A2A agent.
Attributes:
message: Message content sent to the A2A agent
turn_number: Current turn number (1-indexed)
is_multiturn: Whether this is part of a multiturn conversation
agent_role: Role of the CrewAI agent sending the message
message: Message content sent to the A2A agent.
turn_number: Current turn number (1-indexed).
context_id: A2A context ID grouping related tasks.
message_id: Unique A2A message identifier.
is_multiturn: Whether this is part of a multiturn conversation.
agent_role: Role of the CrewAI agent sending the message.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
skill_id: ID of the specific skill being invoked.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_message_sent"
message: str
turn_number: int
context_id: str | None = None
message_id: str | None = None
is_multiturn: bool = False
agent_role: str | None = None
endpoint: str | None = None
a2a_agent_name: str | None = None
skill_id: str | None = None
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2AResponseReceivedEvent(A2AEventBase):
"""Event emitted when a response is received from the A2A agent.
Attributes:
response: Response content from the A2A agent
turn_number: Current turn number (1-indexed)
is_multiturn: Whether this is part of a multiturn conversation
status: Response status (input_required, completed, etc.)
agent_role: Role of the CrewAI agent (for display)
response: Response content from the A2A agent.
turn_number: Current turn number (1-indexed).
context_id: A2A context ID grouping related tasks.
message_id: Unique A2A message identifier.
is_multiturn: Whether this is part of a multiturn conversation.
status: Response status (input_required, completed, etc.).
final: Whether this is the final response in the stream.
agent_role: Role of the CrewAI agent (for display).
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_response_received"
response: str
turn_number: int
context_id: str | None = None
message_id: str | None = None
is_multiturn: bool = False
status: str
final: bool = False
agent_role: str | None = None
endpoint: str | None = None
a2a_agent_name: str | None = None
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2AConversationCompletedEvent(A2AEventBase):
@@ -128,119 +224,433 @@ class A2AConversationCompletedEvent(A2AEventBase):
This is emitted once at the end of a multiturn conversation.
Attributes:
status: Final status (completed, failed, etc.)
final_result: Final result if completed successfully
error: Error message if failed
total_turns: Total number of turns in the conversation
status: Final status (completed, failed, etc.).
final_result: Final result if completed successfully.
error: Error message if failed.
context_id: A2A context ID grouping related tasks.
total_turns: Total number of turns in the conversation.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
agent_card: Full A2A agent card metadata.
reference_task_ids: Related task IDs for context.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_conversation_completed"
status: Literal["completed", "failed"]
final_result: str | None = None
error: str | None = None
context_id: str | None = None
total_turns: int
endpoint: str | None = None
a2a_agent_name: str | None = None
agent_card: dict[str, Any] | None = None
reference_task_ids: list[str] | None = None
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2APollingStartedEvent(A2AEventBase):
"""Event emitted when polling mode begins for A2A delegation.
Attributes:
task_id: A2A task ID being polled
polling_interval: Seconds between poll attempts
endpoint: A2A agent endpoint URL
task_id: A2A task ID being polled.
context_id: A2A context ID grouping related tasks.
polling_interval: Seconds between poll attempts.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_polling_started"
task_id: str
context_id: str | None = None
polling_interval: float
endpoint: str
a2a_agent_name: str | None = None
metadata: dict[str, Any] | None = None
class A2APollingStatusEvent(A2AEventBase):
"""Event emitted on each polling iteration.
Attributes:
task_id: A2A task ID being polled
state: Current task state from remote agent
elapsed_seconds: Time since polling started
poll_count: Number of polls completed
task_id: A2A task ID being polled.
context_id: A2A context ID grouping related tasks.
state: Current task state from remote agent.
elapsed_seconds: Time since polling started.
poll_count: Number of polls completed.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_polling_status"
task_id: str
context_id: str | None = None
state: str
elapsed_seconds: float
poll_count: int
endpoint: str | None = None
a2a_agent_name: str | None = None
metadata: dict[str, Any] | None = None
class A2APushNotificationRegisteredEvent(A2AEventBase):
"""Event emitted when push notification callback is registered.
Attributes:
task_id: A2A task ID for which callback is registered
callback_url: URL where agent will send push notifications
task_id: A2A task ID for which callback is registered.
context_id: A2A context ID grouping related tasks.
callback_url: URL where agent will send push notifications.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_push_notification_registered"
task_id: str
context_id: str | None = None
callback_url: str
endpoint: str | None = None
a2a_agent_name: str | None = None
metadata: dict[str, Any] | None = None
class A2APushNotificationReceivedEvent(A2AEventBase):
"""Event emitted when a push notification is received.
This event should be emitted by the user's webhook handler when it receives
a push notification from the remote A2A agent, before calling
`result_store.store_result()`.
Attributes:
task_id: A2A task ID from the notification
state: Current task state from the notification
task_id: A2A task ID from the notification.
context_id: A2A context ID grouping related tasks.
state: Current task state from the notification.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_push_notification_received"
task_id: str
context_id: str | None = None
state: str
endpoint: str | None = None
a2a_agent_name: str | None = None
metadata: dict[str, Any] | None = None
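A minimal sketch of the webhook pattern this docstring describes: the user's callback endpoint emits the event before storing the result. The FastAPI app, route, and payload keys are assumptions for illustration; only the event fields come from the definition above.

```python
from typing import Any

from fastapi import FastAPI

from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import A2APushNotificationReceivedEvent

app = FastAPI()


@app.post("/a2a/notifications")
async def handle_push_notification(payload: dict[str, Any]) -> dict[str, str]:
    # payload keys ("taskId", "state") are assumed; adapt to the remote agent's schema
    crewai_event_bus.emit(
        app,  # illustrative event source object
        event=A2APushNotificationReceivedEvent(
            task_id=payload["taskId"],
            state=payload["state"],
        ),
    )
    # ...then store the result so the waiting delegation can resume...
    return {"status": "received"}
```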
class A2APushNotificationSentEvent(A2AEventBase):
"""Event emitted when a push notification is sent to a callback URL.
Emitted by the A2A server when it sends a task status update to the
client's registered push notification callback URL.
Attributes:
task_id: A2A task ID being notified.
context_id: A2A context ID grouping related tasks.
callback_url: URL the notification was sent to.
state: Task state being reported.
success: Whether the notification was successfully delivered.
error: Error message if delivery failed.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_push_notification_sent"
task_id: str
context_id: str | None = None
callback_url: str
state: str
success: bool = True
error: str | None = None
metadata: dict[str, Any] | None = None
class A2APushNotificationTimeoutEvent(A2AEventBase):
"""Event emitted when push notification wait times out.
Attributes:
task_id: A2A task ID that timed out
timeout_seconds: Timeout duration in seconds
task_id: A2A task ID that timed out.
context_id: A2A context ID grouping related tasks.
timeout_seconds: Timeout duration in seconds.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_push_notification_timeout"
task_id: str
context_id: str | None = None
timeout_seconds: float
endpoint: str | None = None
a2a_agent_name: str | None = None
metadata: dict[str, Any] | None = None
class A2AStreamingStartedEvent(A2AEventBase):
"""Event emitted when streaming mode begins for A2A delegation.
Attributes:
task_id: A2A task ID for the streaming session.
context_id: A2A context ID grouping related tasks.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
turn_number: Current turn number (1-indexed).
is_multiturn: Whether this is part of a multiturn conversation.
agent_role: Role of the CrewAI agent.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_streaming_started"
task_id: str | None = None
context_id: str | None = None
endpoint: str
a2a_agent_name: str | None = None
turn_number: int = 1
is_multiturn: bool = False
agent_role: str | None = None
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2AStreamingChunkEvent(A2AEventBase):
"""Event emitted when a streaming chunk is received.
Attributes:
task_id: A2A task ID for the streaming session.
context_id: A2A context ID grouping related tasks.
chunk: The text content of the chunk.
chunk_index: Index of this chunk in the stream (0-indexed).
final: Whether this is the final chunk in the stream.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
turn_number: Current turn number (1-indexed).
is_multiturn: Whether this is part of a multiturn conversation.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_streaming_chunk"
task_id: str | None = None
context_id: str | None = None
chunk: str
chunk_index: int
final: bool = False
endpoint: str | None = None
a2a_agent_name: str | None = None
turn_number: int = 1
is_multiturn: bool = False
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
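Illustrative only: a listener that prints streamed chunks as they arrive, using the fields defined above and the same registration pattern the trace listener in this PR uses.

```python
from typing import Any

from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import A2AStreamingChunkEvent


@crewai_event_bus.on(A2AStreamingChunkEvent)
def print_stream_chunk(source: Any, event: A2AStreamingChunkEvent) -> None:
    print(f"[chunk {event.chunk_index}] {event.chunk}", flush=True)
    if event.final:
        print("-- stream finished --")
```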
class A2AAgentCardFetchedEvent(A2AEventBase):
"""Event emitted when an agent card is successfully fetched.
Attributes:
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
agent_card: Full A2A agent card metadata.
protocol_version: A2A protocol version from agent card.
provider: Agent provider/organization info from agent card.
cached: Whether the agent card was retrieved from cache.
fetch_time_ms: Time taken to fetch the agent card in milliseconds.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_agent_card_fetched"
endpoint: str
a2a_agent_name: str | None = None
agent_card: dict[str, Any] | None = None
protocol_version: str | None = None
provider: dict[str, Any] | None = None
cached: bool = False
fetch_time_ms: float | None = None
metadata: dict[str, Any] | None = None
class A2AAuthenticationFailedEvent(A2AEventBase):
"""Event emitted when authentication to an A2A agent fails.
Attributes:
endpoint: A2A agent endpoint URL.
auth_type: Type of authentication attempted (e.g., bearer, oauth2, api_key).
error: Error message describing the failure.
status_code: HTTP status code if applicable.
a2a_agent_name: Name of the A2A agent if known.
protocol_version: A2A protocol version being used.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_authentication_failed"
endpoint: str
auth_type: str | None = None
error: str
status_code: int | None = None
a2a_agent_name: str | None = None
protocol_version: str | None = None
metadata: dict[str, Any] | None = None
class A2AArtifactReceivedEvent(A2AEventBase):
"""Event emitted when an artifact is received from a remote A2A agent.
Attributes:
task_id: A2A task ID the artifact belongs to.
artifact_id: Unique identifier for the artifact.
artifact_name: Name of the artifact.
artifact_description: Purpose description of the artifact.
mime_type: MIME type of the artifact content.
size_bytes: Size of the artifact in bytes.
append: Whether content should be appended to existing artifact.
last_chunk: Whether this is the final chunk of the artifact.
endpoint: A2A agent endpoint URL.
a2a_agent_name: Name of the A2A agent from agent card.
context_id: Context ID for correlation.
turn_number: Current turn number (1-indexed).
is_multiturn: Whether this is part of a multiturn conversation.
metadata: Custom A2A metadata key-value pairs.
extensions: List of A2A extension URIs in use.
"""
type: str = "a2a_artifact_received"
task_id: str
artifact_id: str
artifact_name: str | None = None
artifact_description: str | None = None
mime_type: str | None = None
size_bytes: int | None = None
append: bool = False
last_chunk: bool = False
endpoint: str | None = None
a2a_agent_name: str | None = None
context_id: str | None = None
turn_number: int = 1
is_multiturn: bool = False
metadata: dict[str, Any] | None = None
extensions: list[str] | None = None
class A2AConnectionErrorEvent(A2AEventBase):
"""Event emitted when a connection error occurs during A2A communication.
Attributes:
endpoint: A2A agent endpoint URL.
error: Error message describing the connection failure.
error_type: Type of error (e.g., timeout, connection_refused, dns_error).
status_code: HTTP status code if applicable.
a2a_agent_name: Name of the A2A agent from agent card.
operation: The operation being attempted when error occurred.
context_id: A2A context ID grouping related tasks.
task_id: A2A task ID if applicable.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_connection_error"
endpoint: str
error: str
error_type: str | None = None
status_code: int | None = None
a2a_agent_name: str | None = None
operation: str | None = None
context_id: str | None = None
task_id: str | None = None
metadata: dict[str, Any] | None = None
class A2AServerTaskStartedEvent(A2AEventBase):
"""Event emitted when an A2A server task execution starts."""
"""Event emitted when an A2A server task execution starts.
Attributes:
task_id: A2A task ID for this execution.
context_id: A2A context ID grouping related tasks.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_server_task_started"
a2a_task_id: str
a2a_context_id: str
task_id: str
context_id: str
metadata: dict[str, Any] | None = None
class A2AServerTaskCompletedEvent(A2AEventBase):
"""Event emitted when an A2A server task execution completes."""
"""Event emitted when an A2A server task execution completes.
Attributes:
task_id: A2A task ID for this execution.
context_id: A2A context ID grouping related tasks.
result: The task result.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_server_task_completed"
a2a_task_id: str
a2a_context_id: str
task_id: str
context_id: str
result: str
metadata: dict[str, Any] | None = None
class A2AServerTaskCanceledEvent(A2AEventBase):
"""Event emitted when an A2A server task execution is canceled."""
"""Event emitted when an A2A server task execution is canceled.
Attributes:
task_id: A2A task ID for this execution.
context_id: A2A context ID grouping related tasks.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_server_task_canceled"
a2a_task_id: str
a2a_context_id: str
task_id: str
context_id: str
metadata: dict[str, Any] | None = None
class A2AServerTaskFailedEvent(A2AEventBase):
"""Event emitted when an A2A server task execution fails."""
"""Event emitted when an A2A server task execution fails.
Attributes:
task_id: A2A task ID for this execution.
context_id: A2A context ID grouping related tasks.
error: Error message describing the failure.
metadata: Custom A2A metadata key-value pairs.
"""
type: str = "a2a_server_task_failed"
a2a_task_id: str
a2a_context_id: str
task_id: str
context_id: str
error: str
metadata: dict[str, Any] | None = None
class A2AParallelDelegationStartedEvent(A2AEventBase):
"""Event emitted when parallel delegation to multiple A2A agents begins.
Attributes:
endpoints: List of A2A agent endpoints being delegated to.
task_description: Description of the task being delegated.
"""
type: str = "a2a_parallel_delegation_started"
endpoints: list[str]
task_description: str
class A2AParallelDelegationCompletedEvent(A2AEventBase):
"""Event emitted when parallel delegation to multiple A2A agents completes.
Attributes:
endpoints: List of A2A agent endpoints that were delegated to.
success_count: Number of successful delegations.
failure_count: Number of failed delegations.
results: Summary of results from each agent.
"""
type: str = "a2a_parallel_delegation_completed"
endpoints: list[str]
success_count: int
failure_count: int
results: dict[str, str] | None = None
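Closing the file with a hedged sketch of what the `A2AEventBase` validator above does when `from_task` and `from_agent` are passed: task and agent identifiers are copied onto the event and the originals are cleared before validation. The Task and Agent instances here are illustrative.

```python
from crewai import Agent, Task
from crewai.events.types.a2a_events import A2ADelegationStartedEvent

coordinator = Agent(role="Coordinator", goal="Route work", backstory="Dispatcher.")
report_task = Task(
    description="Summarize the quarterly report",
    expected_output="A short summary",
    agent=coordinator,
)

event = A2ADelegationStartedEvent(
    endpoint="https://remote-agent.example.com/.well-known/agent.json",
    task_description=report_task.description,
    agent_id="remote-summarizer",
    from_task=report_task,
    from_agent=coordinator,
)

print(event.task_id, event.task_name)     # populated from the Task by the validator
print(event.agent_role)                   # populated from the Agent by the validator
print(event.from_task, event.from_agent)  # both cleared to None after extraction
```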

View File

@@ -1,4 +1,4 @@
from crewai.experimental.agent_executor import AgentExecutor, CrewAgentExecutorFlow
from crewai.experimental.crew_agent_executor_flow import CrewAgentExecutorFlow
from crewai.experimental.evaluation import (
AgentEvaluationResult,
AgentEvaluator,
@@ -23,9 +23,8 @@ from crewai.experimental.evaluation import (
__all__ = [
"AgentEvaluationResult",
"AgentEvaluator",
"AgentExecutor",
"BaseEvaluator",
"CrewAgentExecutorFlow", # Deprecated alias for AgentExecutor
"CrewAgentExecutorFlow",
"EvaluationScore",
"EvaluationTraceCallback",
"ExperimentResult",

View File

@@ -1,6 +1,6 @@
from __future__ import annotations
from collections.abc import Callable, Coroutine
from collections.abc import Callable
import threading
from typing import TYPE_CHECKING, Any, Literal, cast
from uuid import uuid4
@@ -37,7 +37,6 @@ from crewai.utilities.agent_utils import (
handle_unknown_error,
has_reached_max_iterations,
is_context_length_exceeded,
is_inside_event_loop,
process_llm_response,
)
from crewai.utilities.constants import TRAINING_DATA_FILE
@@ -74,17 +73,13 @@ class AgentReActState(BaseModel):
ask_for_human_input: bool = Field(default=False)
class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
"""Flow-based agent executor for both standalone and crew-bound execution.
class CrewAgentExecutorFlow(Flow[AgentReActState], CrewAgentExecutorMixin):
"""Flow-based executor matching CrewAgentExecutor interface.
Inherits from:
- Flow[AgentReActState]: Provides flow orchestration capabilities
- CrewAgentExecutorMixin: Provides memory methods (short/long/external term)
This executor can operate in two modes:
- Standalone mode: When crew and task are None (used by Agent.kickoff())
- Crew mode: When crew and task are provided (used by Agent.execute_task())
Note: Multiple instances may be created during agent initialization
(cache setup, RPM controller setup, etc.) but only the final instance
should execute tasks via invoke().
@@ -93,6 +88,8 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
def __init__(
self,
llm: BaseLLM,
task: Task,
crew: Crew,
agent: Agent,
prompt: SystemPromptResult | StandardPromptResult,
max_iter: int,
@@ -101,8 +98,6 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
stop_words: list[str],
tools_description: str,
tools_handler: ToolsHandler,
task: Task | None = None,
crew: Crew | None = None,
step_callback: Any = None,
original_tools: list[BaseTool] | None = None,
function_calling_llm: BaseLLM | Any | None = None,
@@ -116,6 +111,8 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
Args:
llm: Language model instance.
task: Task to execute.
crew: Crew instance.
agent: Agent to execute.
prompt: Prompt templates.
max_iter: Maximum iterations.
@@ -124,8 +121,6 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
stop_words: Stop word list.
tools_description: Tool descriptions.
tools_handler: Tool handler instance.
task: Optional task to execute (None for standalone agent execution).
crew: Optional crew instance (None for standalone agent execution).
step_callback: Optional step callback.
original_tools: Original tool list.
function_calling_llm: Optional function calling LLM.
@@ -136,9 +131,9 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
"""
self._i18n: I18N = i18n or get_i18n()
self.llm = llm
self.task: Task | None = task
self.task = task
self.agent = agent
self.crew: Crew | None = crew
self.crew = crew
self.prompt = prompt
self.tools = tools
self.tools_names = tools_names
@@ -183,6 +178,7 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
else self.stop
)
)
self._state = AgentReActState()
def _ensure_flow_initialized(self) -> None:
@@ -268,7 +264,7 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
printer=self._printer,
from_task=self.task,
from_agent=self.agent,
response_model=None,
response_model=self.response_model,
executor_context=self,
)
@@ -453,99 +449,9 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
return "initialized"
def invoke(
self, inputs: dict[str, Any]
) -> dict[str, Any] | Coroutine[Any, Any, dict[str, Any]]:
def invoke(self, inputs: dict[str, Any]) -> dict[str, Any]:
"""Execute agent with given inputs.
When called from within an existing event loop (e.g., inside a Flow),
this method returns a coroutine that should be awaited. The Flow
framework handles this automatically.
Args:
inputs: Input dictionary containing prompt variables.
Returns:
Dictionary with agent output, or a coroutine if inside an event loop.
"""
# Magic auto-async: if inside event loop, return coroutine for Flow to await
if is_inside_event_loop():
return self.invoke_async(inputs)
self._ensure_flow_initialized()
with self._execution_lock:
if self._is_executing:
raise RuntimeError(
"Executor is already running. "
"Cannot invoke the same executor instance concurrently."
)
self._is_executing = True
self._has_been_invoked = True
try:
# Reset state for fresh execution
self.state.messages.clear()
self.state.iterations = 0
self.state.current_answer = None
self.state.is_finished = False
if "system" in self.prompt:
prompt = cast("SystemPromptResult", self.prompt)
system_prompt = self._format_prompt(prompt["system"], inputs)
user_prompt = self._format_prompt(prompt["user"], inputs)
self.state.messages.append(
format_message_for_llm(system_prompt, role="system")
)
self.state.messages.append(format_message_for_llm(user_prompt))
else:
user_prompt = self._format_prompt(self.prompt["prompt"], inputs)
self.state.messages.append(format_message_for_llm(user_prompt))
self.state.ask_for_human_input = bool(
inputs.get("ask_for_human_input", False)
)
self.kickoff()
formatted_answer = self.state.current_answer
if not isinstance(formatted_answer, AgentFinish):
raise RuntimeError(
"Agent execution ended without reaching a final answer."
)
if self.state.ask_for_human_input:
formatted_answer = self._handle_human_feedback(formatted_answer)
self._create_short_term_memory(formatted_answer)
self._create_long_term_memory(formatted_answer)
self._create_external_memory(formatted_answer)
return {"output": formatted_answer.output}
except AssertionError:
fail_text = Text()
fail_text.append("", style="red bold")
fail_text.append(
"Agent failed to reach a final answer. This is likely a bug - please report it.",
style="red",
)
self._console.print(fail_text)
raise
except Exception as e:
handle_unknown_error(self._printer, e)
raise
finally:
self._is_executing = False
async def invoke_async(self, inputs: dict[str, Any]) -> dict[str, Any]:
"""Execute agent asynchronously with given inputs.
This method is designed for use within async contexts, such as when
the agent is called from within an async Flow method. It uses
kickoff_async() directly instead of running in a separate thread.
Args:
inputs: Input dictionary containing prompt variables.
@@ -586,8 +492,7 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
inputs.get("ask_for_human_input", False)
)
# Use async kickoff directly since we're already in an async context
await self.kickoff_async()
self.kickoff()
formatted_answer = self.state.current_answer
@@ -678,14 +583,11 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
if self.agent is None:
raise ValueError("Agent cannot be None")
if self.task is None:
return
crewai_event_bus.emit(
self.agent,
AgentLogsStartedEvent(
agent_role=self.agent.role,
task_description=self.task.description,
task_description=(self.task.description if self.task else "Not Found"),
verbose=self.agent.verbose
or (hasattr(self, "crew") and getattr(self.crew, "verbose", False)),
),
@@ -719,12 +621,10 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
result: Agent's final output.
human_feedback: Optional feedback from human.
"""
# Early return if no crew (standalone mode)
if self.crew is None:
return
agent_id = str(self.agent.id)
train_iteration = getattr(self.crew, "_train_iteration", None)
train_iteration = (
getattr(self.crew, "_train_iteration", None) if self.crew else None
)
if train_iteration is None or not isinstance(train_iteration, int):
train_error = Text()
@@ -906,7 +806,3 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
requiring arbitrary_types_allowed=True.
"""
return core_schema.any_schema()
# Backward compatibility alias (deprecated)
CrewAgentExecutorFlow = AgentExecutor
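The `invoke()` docstring above describes the auto-async contract that this diff removes: when called from inside a running event loop (e.g. a Flow method), `invoke()` returns a coroutine instead of a dict, and the Flow framework awaits it automatically. A minimal sketch of a caller handling that union return type, assuming an already-constructed `executor` — this illustrates the described behavior and is not code from the repository:

```python
import asyncio
from typing import Any


async def run_inside_flow(executor, inputs: dict[str, Any]) -> dict[str, Any]:
    # A running event loop exists here, so invoke() hands back a coroutine.
    result = executor.invoke(inputs)
    if asyncio.iscoroutine(result):
        result = await result  # same auto-await the Flow framework performs
    return result


def run_standalone(executor, inputs: dict[str, Any]) -> dict[str, Any]:
    # No running loop: invoke() executes synchronously and returns the dict.
    return executor.invoke(inputs)
```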

View File

@@ -73,7 +73,6 @@ from crewai.flow.utils import (
is_simple_flow_condition,
)
if TYPE_CHECKING:
from crewai.flow.async_feedback.types import PendingFeedbackContext
from crewai.flow.human_feedback import HumanFeedbackResult
@@ -520,9 +519,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
self._methods: dict[FlowMethodName, FlowMethod[Any, Any]] = {}
self._method_execution_counts: dict[FlowMethodName, int] = {}
self._pending_and_listeners: dict[PendingListenerKey, set[FlowMethodName]] = {}
self._fired_or_listeners: set[FlowMethodName] = (
set()
) # Track OR listeners that already fired
self._method_outputs: list[Any] = [] # list to store all method outputs
self._completed_methods: set[FlowMethodName] = (
set()
@@ -574,7 +570,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
flow_id: str,
persistence: FlowPersistence | None = None,
**kwargs: Any,
) -> Flow[Any]:
) -> "Flow[Any]":
"""Create a Flow instance from a pending feedback state.
This classmethod is used to restore a flow that was paused waiting
@@ -635,7 +631,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
return instance
@property
def pending_feedback(self) -> PendingFeedbackContext | None:
def pending_feedback(self) -> "PendingFeedbackContext | None":
"""Get the pending feedback context if this flow is waiting for feedback.
Returns:
@@ -720,9 +716,8 @@ class Flow(Generic[T], metaclass=FlowMeta):
Raises:
ValueError: If no pending feedback context exists
"""
from datetime import datetime
from crewai.flow.human_feedback import HumanFeedbackResult
from datetime import datetime
if self._pending_feedback_context is None:
raise ValueError(
@@ -1300,7 +1295,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
self._completed_methods.clear()
self._method_outputs.clear()
self._pending_and_listeners.clear()
self._fired_or_listeners.clear()
else:
# We're restoring from persistence, set the flag
self._is_execution_resuming = True
@@ -1352,26 +1346,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
self._initialize_state(inputs)
try:
# Determine which start methods to execute at kickoff
# Conditional start methods (with __trigger_methods__) are only triggered by their conditions
# UNLESS there are no unconditional starts (then all starts run as entry points)
unconditional_starts = [
start_method
for start_method in self._start_methods
if not getattr(
self._methods.get(start_method), "__trigger_methods__", None
)
]
# If there are unconditional starts, only run those at kickoff
# If there are NO unconditional starts, run all starts (including conditional ones)
starts_to_execute = (
unconditional_starts
if unconditional_starts
else self._start_methods
)
tasks = [
self._execute_start_method(start_method)
for start_method in starts_to_execute
for start_method in self._start_methods
]
await asyncio.gather(*tasks)
except Exception as e:
@@ -1504,8 +1481,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
return
# For cyclic flows, clear from completed to allow re-execution
self._completed_methods.discard(start_method_name)
# Also clear fired OR listeners to allow them to fire again in new cycle
self._fired_or_listeners.clear()
method = self._methods[start_method_name]
enhanced_method = self._inject_trigger_payload_for_start_method(method)
@@ -1528,9 +1503,11 @@ class Flow(Generic[T], metaclass=FlowMeta):
if self.last_human_feedback is not None
else result
)
# Execute listeners sequentially to prevent race conditions on shared state
for listener_name in listeners_for_result:
await self._execute_single_listener(listener_name, listener_result)
tasks = [
self._execute_single_listener(listener_name, listener_result)
for listener_name in listeners_for_result
]
await asyncio.gather(*tasks)
else:
await self._execute_listeners(start_method_name, result)
@@ -1596,19 +1573,11 @@ class Flow(Generic[T], metaclass=FlowMeta):
if future:
self._event_futures.append(future)
if asyncio.iscoroutinefunction(method):
result = await method(*args, **kwargs)
else:
# Run sync methods in thread pool for isolation
# This allows Agent.kickoff() to work synchronously inside Flow methods
import contextvars
ctx = contextvars.copy_context()
result = await asyncio.to_thread(ctx.run, method, *args, **kwargs)
# Auto-await coroutines returned from sync methods (enables AgentExecutor pattern)
if asyncio.iscoroutine(result):
result = await result
result = (
await method(*args, **kwargs)
if asyncio.iscoroutinefunction(method)
else method(*args, **kwargs)
)
self._method_outputs.append(result)
self._method_execution_counts[method_name] = (
@@ -1755,11 +1724,11 @@ class Flow(Generic[T], metaclass=FlowMeta):
listener_result = router_result_to_feedback.get(
str(current_trigger), result
)
# Execute listeners sequentially to prevent race conditions on shared state
for listener_name in listeners_triggered:
await self._execute_single_listener(
listener_name, listener_result
)
tasks = [
self._execute_single_listener(listener_name, listener_result)
for listener_name in listeners_triggered
]
await asyncio.gather(*tasks)
if current_trigger in router_results:
# Find start methods triggered by this router result
@@ -1776,16 +1745,14 @@ class Flow(Generic[T], metaclass=FlowMeta):
should_trigger = current_trigger in all_methods
if should_trigger:
# Execute conditional start method triggered by router result
# Only execute if this is a cycle (method was already completed)
if method_name in self._completed_methods:
# For cyclic re-execution, temporarily clear resumption flag
# For router-triggered start methods in cycles, temporarily clear resumption flag
# to allow cyclic execution
was_resuming = self._is_execution_resuming
self._is_execution_resuming = False
await self._execute_start_method(method_name)
self._is_execution_resuming = was_resuming
else:
# First-time execution of conditional start
await self._execute_start_method(method_name)
def _evaluate_condition(
self,
@@ -1883,21 +1850,8 @@ class Flow(Generic[T], metaclass=FlowMeta):
condition_type, methods = condition_data
if condition_type == OR_CONDITION:
# Only trigger multi-source OR listeners (or_(A, B, C)) once - skip if already fired
# Simple single-method listeners fire every time their trigger occurs
# Routers also fire every time - they're decision points
has_multiple_triggers = len(methods) > 1
should_check_fired = has_multiple_triggers and not is_router
if (
not should_check_fired
or listener_name not in self._fired_or_listeners
):
if trigger_method in methods:
triggered.append(listener_name)
# Only track multi-source OR listeners (not single-method or routers)
if should_check_fired:
self._fired_or_listeners.add(listener_name)
if trigger_method in methods:
triggered.append(listener_name)
elif condition_type == AND_CONDITION:
pending_key = PendingListenerKey(listener_name)
if pending_key not in self._pending_and_listeners:
@@ -1910,26 +1864,10 @@ class Flow(Generic[T], metaclass=FlowMeta):
self._pending_and_listeners.pop(pending_key, None)
elif is_flow_condition_dict(condition_data):
# For complex conditions, check if top-level is OR and track accordingly
top_level_type = condition_data.get("type", OR_CONDITION)
is_or_based = top_level_type == OR_CONDITION
# Only track multi-source OR conditions (multiple sub-conditions), not routers
sub_conditions = condition_data.get("conditions", [])
has_multiple_triggers = is_or_based and len(sub_conditions) > 1
should_check_fired = has_multiple_triggers and not is_router
# Skip compound OR-based listeners that have already fired
if should_check_fired and listener_name in self._fired_or_listeners:
continue
if self._evaluate_condition(
condition_data, trigger_method, listener_name
):
triggered.append(listener_name)
# Track compound OR-based listeners so they only fire once
if should_check_fired:
self._fired_or_listeners.add(listener_name)
return triggered
@@ -1958,22 +1896,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
if self._is_execution_resuming:
# During resumption, skip execution but continue listeners
await self._execute_listeners(listener_name, None)
# For routers, also check if any conditional starts they triggered are completed
# If so, continue their chains
if listener_name in self._routers:
for start_method_name in self._start_methods:
if (
start_method_name in self._listeners
and start_method_name in self._completed_methods
):
# This conditional start was executed, continue its chain
await self._execute_start_method(start_method_name)
return
# For cyclic flows, clear from completed to allow re-execution
self._completed_methods.discard(listener_name)
# Also clear from fired OR listeners for cyclic flows
self._fired_or_listeners.discard(listener_name)
try:
method = self._methods[listener_name]
@@ -2006,9 +1931,11 @@ class Flow(Generic[T], metaclass=FlowMeta):
if self.last_human_feedback is not None
else listener_result
)
# Execute listeners sequentially to prevent race conditions on shared state
for name in listeners_for_result:
await self._execute_single_listener(name, feedback_result)
tasks = [
self._execute_single_listener(name, feedback_result)
for name in listeners_for_result
]
await asyncio.gather(*tasks)
except Exception as e:
# Don't log HumanFeedbackPending as an error - it's expected control flow

View File

@@ -10,7 +10,6 @@ from typing import (
get_origin,
)
import uuid
import warnings
from pydantic import (
UUID4,
@@ -81,11 +80,6 @@ class LiteAgent(FlowTrackable, BaseModel):
"""
A lightweight agent that can process messages and use tools.
.. deprecated::
LiteAgent is deprecated and will be removed in a future version.
Use ``Agent().kickoff(messages)`` instead, which provides the same
functionality with additional features like memory and knowledge support.
This agent is simpler than the full Agent class, focusing on direct execution
rather than task delegation. It's designed to be used for simple interactions
where a full crew is not needed.
@@ -170,18 +164,6 @@ class LiteAgent(FlowTrackable, BaseModel):
default_factory=get_after_llm_call_hooks
)
@model_validator(mode="after")
def emit_deprecation_warning(self) -> Self:
"""Emit deprecation warning for LiteAgent usage."""
warnings.warn(
"LiteAgent is deprecated and will be removed in a future version. "
"Use Agent().kickoff(messages) instead, which provides the same "
"functionality with additional features like memory and knowledge support.",
DeprecationWarning,
stacklevel=2,
)
return self
@model_validator(mode="after")
def setup_llm(self) -> Self:
"""Set up the LLM and other components after initialization."""

View File

@@ -54,15 +54,21 @@ class GeminiCompletion(BaseLLM):
safety_settings: dict[str, Any] | None = None,
client_params: dict[str, Any] | None = None,
interceptor: BaseInterceptor[Any, Any] | None = None,
use_vertexai: bool | None = None,
**kwargs: Any,
):
"""Initialize Google Gemini chat completion client.
Args:
model: Gemini model name (e.g., 'gemini-2.0-flash-001', 'gemini-1.5-pro')
api_key: Google API key (defaults to GOOGLE_API_KEY or GEMINI_API_KEY env var)
project: Google Cloud project ID (for Vertex AI)
location: Google Cloud location (for Vertex AI, defaults to 'us-central1')
api_key: Google API key for Gemini API authentication.
Defaults to GOOGLE_API_KEY or GEMINI_API_KEY env var.
NOTE: Cannot be used with Vertex AI (project parameter). Use Gemini API instead.
project: Google Cloud project ID for Vertex AI with ADC authentication.
Requires Application Default Credentials (gcloud auth application-default login).
NOTE: Vertex AI does NOT support API keys, only OAuth2/ADC.
If both api_key and project are set, api_key takes precedence.
location: Google Cloud location (for Vertex AI with ADC, defaults to 'us-central1')
temperature: Sampling temperature (0-2)
top_p: Nucleus sampling parameter
top_k: Top-k sampling parameter
@@ -73,6 +79,12 @@ class GeminiCompletion(BaseLLM):
client_params: Additional parameters to pass to the Google Gen AI Client constructor.
Supports parameters like http_options, credentials, debug_config, etc.
interceptor: HTTP interceptor (not yet supported for Gemini).
use_vertexai: Whether to use Vertex AI instead of Gemini API.
- True: Use Vertex AI (with ADC or Express mode with API key)
- False: Use Gemini API (explicitly override env var)
- None (default): Check GOOGLE_GENAI_USE_VERTEXAI env var
When using Vertex AI with API key (Express mode), http_options with
api_version="v1" is automatically configured.
**kwargs: Additional parameters
"""
if interceptor is not None:
@@ -95,7 +107,8 @@ class GeminiCompletion(BaseLLM):
self.project = project or os.getenv("GOOGLE_CLOUD_PROJECT")
self.location = location or os.getenv("GOOGLE_CLOUD_LOCATION") or "us-central1"
use_vertexai = os.getenv("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true"
if use_vertexai is None:
use_vertexai = os.getenv("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true"
self.client = self._initialize_client(use_vertexai)
@@ -146,13 +159,34 @@ class GeminiCompletion(BaseLLM):
Returns:
Initialized Google Gen AI Client
Note:
Google Gen AI SDK has two distinct endpoints with different auth requirements:
- Gemini API (generativelanguage.googleapis.com): Supports API key authentication
- Vertex AI (aiplatform.googleapis.com): Only supports OAuth2/ADC, NO API keys
When vertexai=True is set, it routes to aiplatform.googleapis.com which rejects
API keys. Use Gemini API endpoint for API key authentication instead.
"""
client_params = {}
if self.client_params:
client_params.update(self.client_params)
if use_vertexai or self.project:
# Determine authentication mode based on available credentials
has_api_key = bool(self.api_key)
has_project = bool(self.project)
if has_api_key and has_project:
logging.warning(
"Both API key and project provided. Using API key authentication. "
"Project/location parameters are ignored when using API keys. "
"To use Vertex AI with ADC, remove the api_key parameter."
)
has_project = False
# Vertex AI with ADC (project without API key)
if (use_vertexai or has_project) and not has_api_key:
client_params.update(
{
"vertexai": True,
@@ -161,12 +195,20 @@ class GeminiCompletion(BaseLLM):
}
)
client_params.pop("api_key", None)
elif self.api_key:
# API key authentication (works with both Gemini API and Vertex AI Express)
elif has_api_key:
client_params["api_key"] = self.api_key
client_params.pop("vertexai", None)
# Vertex AI Express mode: API key + vertexai=True + http_options with api_version="v1"
# See: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey
if use_vertexai:
client_params["vertexai"] = True
client_params["http_options"] = types.HttpOptions(api_version="v1")
else:
# This ensures we use the Gemini API (generativelanguage.googleapis.com)
client_params["vertexai"] = False
# Clean up project/location (not allowed with API key)
client_params.pop("project", None)
client_params.pop("location", None)
@@ -175,10 +217,13 @@ class GeminiCompletion(BaseLLM):
return genai.Client(**client_params)
except Exception as e:
raise ValueError(
"Either GOOGLE_API_KEY/GEMINI_API_KEY (for Gemini API) or "
"GOOGLE_CLOUD_PROJECT (for Vertex AI) must be set"
"Authentication required. Provide one of:\n"
" 1. API key via GOOGLE_API_KEY or GEMINI_API_KEY environment variable\n"
" (use_vertexai=True is optional for Vertex AI with API key)\n"
" 2. For Vertex AI with ADC: Set GOOGLE_CLOUD_PROJECT and run:\n"
" gcloud auth application-default login\n"
" 3. Pass api_key parameter directly to LLM constructor\n"
) from e
return genai.Client(**client_params)
def _get_client_params(self) -> dict[str, Any]:
@@ -202,6 +247,8 @@ class GeminiCompletion(BaseLLM):
"location": self.location,
}
)
if self.api_key:
params["api_key"] = self.api_key
elif self.api_key:
params["api_key"] = self.api_key

View File

@@ -1,6 +1,5 @@
from __future__ import annotations
import asyncio
from collections.abc import Callable, Sequence
import json
import re
@@ -55,23 +54,6 @@ console = Console()
_MULTIPLE_NEWLINES: Final[re.Pattern[str]] = re.compile(r"\n+")
def is_inside_event_loop() -> bool:
"""Check if code is currently running inside an asyncio event loop.
This is used to detect when code is being called from within an async context
(e.g., inside a Flow). In such cases, callers should return a coroutine
instead of executing synchronously to avoid nested event loop errors.
Returns:
True if inside a running event loop, False otherwise.
"""
try:
asyncio.get_running_loop()
return True
except RuntimeError:
return False
def parse_tools(tools: list[BaseTool]) -> list[CrewStructuredTool]:
"""Parse tools to be used for the task.

View File

@@ -26,9 +26,13 @@ def mock_agent() -> MagicMock:
@pytest.fixture
def mock_task() -> MagicMock:
def mock_task(mock_context: MagicMock) -> MagicMock:
"""Create a mock Task."""
return MagicMock()
task = MagicMock()
task.id = mock_context.task_id
task.name = "Mock Task"
task.description = "Mock task description"
return task
@pytest.fixture
@@ -179,8 +183,8 @@ class TestExecute:
event = first_call[0][1]
assert event.type == "a2a_server_task_started"
assert event.a2a_task_id == mock_context.task_id
assert event.a2a_context_id == mock_context.context_id
assert event.task_id == mock_context.task_id
assert event.context_id == mock_context.context_id
@pytest.mark.asyncio
async def test_emits_completed_event(
@@ -201,7 +205,7 @@ class TestExecute:
event = second_call[0][1]
assert event.type == "a2a_server_task_completed"
assert event.a2a_task_id == mock_context.task_id
assert event.task_id == mock_context.task_id
assert event.result == "Task completed successfully"
@pytest.mark.asyncio
@@ -250,7 +254,7 @@ class TestExecute:
event = canceled_call[0][1]
assert event.type == "a2a_server_task_canceled"
assert event.a2a_task_id == mock_context.task_id
assert event.task_id == mock_context.task_id
class TestCancel:

View File

@@ -14,6 +14,16 @@ except ImportError:
A2A_SDK_INSTALLED = False
def _create_mock_agent_card(name: str = "Test", url: str = "http://test-endpoint.com/"):
"""Create a mock agent card with proper model_dump behavior."""
mock_card = MagicMock()
mock_card.name = name
mock_card.url = url
mock_card.model_dump.return_value = {"name": name, "url": url}
mock_card.model_dump_json.return_value = f'{{"name": "{name}", "url": "{url}"}}'
return mock_card
@pytest.mark.skipif(not A2A_SDK_INSTALLED, reason="Requires a2a-sdk to be installed")
def test_trust_remote_completion_status_true_returns_directly():
"""When trust_remote_completion_status=True and A2A returns completed, return result directly."""
@@ -44,8 +54,7 @@ def test_trust_remote_completion_status_true_returns_directly():
patch("crewai.a2a.wrapper.execute_a2a_delegation") as mock_execute,
patch("crewai.a2a.wrapper._fetch_agent_cards_concurrently") as mock_fetch,
):
mock_card = MagicMock()
mock_card.name = "Test"
mock_card = _create_mock_agent_card()
mock_fetch.return_value = ({"http://test-endpoint.com/": mock_card}, {})
# A2A returns completed
@@ -110,8 +119,7 @@ def test_trust_remote_completion_status_false_continues_conversation():
patch("crewai.a2a.wrapper.execute_a2a_delegation") as mock_execute,
patch("crewai.a2a.wrapper._fetch_agent_cards_concurrently") as mock_fetch,
):
mock_card = MagicMock()
mock_card.name = "Test"
mock_card = _create_mock_agent_card()
mock_fetch.return_value = ({"http://test-endpoint.com/": mock_card}, {})
# A2A returns completed

View File

@@ -1,4 +1,4 @@
"""Unit tests for AgentExecutor.
"""Unit tests for CrewAgentExecutorFlow.
Tests the Flow-based agent executor implementation including state management,
flow methods, routing logic, and error handling.
@@ -8,9 +8,9 @@ from unittest.mock import Mock, patch
import pytest
from crewai.experimental.agent_executor import (
from crewai.experimental.crew_agent_executor_flow import (
AgentReActState,
AgentExecutor,
CrewAgentExecutorFlow,
)
from crewai.agents.parser import AgentAction, AgentFinish
@@ -43,8 +43,8 @@ class TestAgentReActState:
assert state.ask_for_human_input is True
class TestAgentExecutor:
"""Test AgentExecutor class."""
class TestCrewAgentExecutorFlow:
"""Test CrewAgentExecutorFlow class."""
@pytest.fixture
def mock_dependencies(self):
@@ -87,8 +87,8 @@ class TestAgentExecutor:
}
def test_executor_initialization(self, mock_dependencies):
"""Test AgentExecutor initialization."""
executor = AgentExecutor(**mock_dependencies)
"""Test CrewAgentExecutorFlow initialization."""
executor = CrewAgentExecutorFlow(**mock_dependencies)
assert executor.llm == mock_dependencies["llm"]
assert executor.task == mock_dependencies["task"]
@@ -100,9 +100,9 @@ class TestAgentExecutor:
def test_initialize_reasoning(self, mock_dependencies):
"""Test flow entry point."""
with patch.object(
AgentExecutor, "_show_start_logs"
CrewAgentExecutorFlow, "_show_start_logs"
) as mock_show_start:
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
result = executor.initialize_reasoning()
assert result == "initialized"
@@ -110,7 +110,7 @@ class TestAgentExecutor:
def test_check_max_iterations_not_reached(self, mock_dependencies):
"""Test routing when iterations < max."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.iterations = 5
result = executor.check_max_iterations()
@@ -118,7 +118,7 @@ class TestAgentExecutor:
def test_check_max_iterations_reached(self, mock_dependencies):
"""Test routing when iterations >= max."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.iterations = 10
result = executor.check_max_iterations()
@@ -126,7 +126,7 @@ class TestAgentExecutor:
def test_route_by_answer_type_action(self, mock_dependencies):
"""Test routing for AgentAction."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.current_answer = AgentAction(
thought="thinking", tool="search", tool_input="query", text="action text"
)
@@ -136,7 +136,7 @@ class TestAgentExecutor:
def test_route_by_answer_type_finish(self, mock_dependencies):
"""Test routing for AgentFinish."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.current_answer = AgentFinish(
thought="final thoughts", output="Final answer", text="complete"
)
@@ -146,7 +146,7 @@ class TestAgentExecutor:
def test_continue_iteration(self, mock_dependencies):
"""Test iteration continuation."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
result = executor.continue_iteration()
@@ -154,8 +154,8 @@ class TestAgentExecutor:
def test_finalize_success(self, mock_dependencies):
"""Test finalize with valid AgentFinish."""
with patch.object(AgentExecutor, "_show_logs") as mock_show_logs:
executor = AgentExecutor(**mock_dependencies)
with patch.object(CrewAgentExecutorFlow, "_show_logs") as mock_show_logs:
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.current_answer = AgentFinish(
thought="final thinking", output="Done", text="complete"
)
@@ -168,7 +168,7 @@ class TestAgentExecutor:
def test_finalize_failure(self, mock_dependencies):
"""Test finalize skips when given AgentAction instead of AgentFinish."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.current_answer = AgentAction(
thought="thinking", tool="search", tool_input="query", text="action text"
)
@@ -181,7 +181,7 @@ class TestAgentExecutor:
def test_format_prompt(self, mock_dependencies):
"""Test prompt formatting."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
inputs = {"input": "test input", "tool_names": "tool1, tool2", "tools": "desc"}
result = executor._format_prompt("Prompt {input} {tool_names} {tools}", inputs)
@@ -192,18 +192,18 @@ class TestAgentExecutor:
def test_is_training_mode_false(self, mock_dependencies):
"""Test training mode detection when not in training."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
assert executor._is_training_mode() is False
def test_is_training_mode_true(self, mock_dependencies):
"""Test training mode detection when in training."""
mock_dependencies["crew"]._train = True
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
assert executor._is_training_mode() is True
def test_append_message_to_state(self, mock_dependencies):
"""Test message appending to state."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
initial_count = len(executor.state.messages)
executor._append_message_to_state("test message")
@@ -216,7 +216,7 @@ class TestAgentExecutor:
callback = Mock()
mock_dependencies["step_callback"] = callback
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
answer = AgentFinish(thought="thinking", output="test", text="final")
executor._invoke_step_callback(answer)
@@ -226,14 +226,14 @@ class TestAgentExecutor:
def test_invoke_step_callback_none(self, mock_dependencies):
"""Test step callback when none provided."""
mock_dependencies["step_callback"] = None
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
# Should not raise error
executor._invoke_step_callback(
AgentFinish(thought="thinking", output="test", text="final")
)
@patch("crewai.experimental.agent_executor.handle_output_parser_exception")
@patch("crewai.experimental.crew_agent_executor_flow.handle_output_parser_exception")
def test_recover_from_parser_error(
self, mock_handle_exception, mock_dependencies
):
@@ -242,7 +242,7 @@ class TestAgentExecutor:
mock_handle_exception.return_value = None
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor._last_parser_error = OutputParserError("test error")
initial_iterations = executor.state.iterations
@@ -252,12 +252,12 @@ class TestAgentExecutor:
assert executor.state.iterations == initial_iterations + 1
mock_handle_exception.assert_called_once()
@patch("crewai.experimental.agent_executor.handle_context_length")
@patch("crewai.experimental.crew_agent_executor_flow.handle_context_length")
def test_recover_from_context_length(
self, mock_handle_context, mock_dependencies
):
"""Test recovery from context length error."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor._last_context_error = Exception("context too long")
initial_iterations = executor.state.iterations
@@ -270,16 +270,16 @@ class TestAgentExecutor:
def test_use_stop_words_property(self, mock_dependencies):
"""Test use_stop_words property."""
mock_dependencies["llm"].supports_stop_words.return_value = True
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
assert executor.use_stop_words is True
mock_dependencies["llm"].supports_stop_words.return_value = False
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
assert executor.use_stop_words is False
def test_compatibility_properties(self, mock_dependencies):
"""Test compatibility properties for mixin."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.messages = [{"role": "user", "content": "test"}]
executor.state.iterations = 5
@@ -321,8 +321,8 @@ class TestFlowErrorHandling:
"tools_handler": Mock(),
}
@patch("crewai.experimental.agent_executor.get_llm_response")
@patch("crewai.experimental.agent_executor.enforce_rpm_limit")
@patch("crewai.experimental.crew_agent_executor_flow.get_llm_response")
@patch("crewai.experimental.crew_agent_executor_flow.enforce_rpm_limit")
def test_call_llm_parser_error(
self, mock_enforce_rpm, mock_get_llm, mock_dependencies
):
@@ -332,15 +332,15 @@ class TestFlowErrorHandling:
mock_enforce_rpm.return_value = None
mock_get_llm.side_effect = OutputParserError("parse failed")
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
result = executor.call_llm_and_parse()
assert result == "parser_error"
assert executor._last_parser_error is not None
@patch("crewai.experimental.agent_executor.get_llm_response")
@patch("crewai.experimental.agent_executor.enforce_rpm_limit")
@patch("crewai.experimental.agent_executor.is_context_length_exceeded")
@patch("crewai.experimental.crew_agent_executor_flow.get_llm_response")
@patch("crewai.experimental.crew_agent_executor_flow.enforce_rpm_limit")
@patch("crewai.experimental.crew_agent_executor_flow.is_context_length_exceeded")
def test_call_llm_context_error(
self,
mock_is_context_exceeded,
@@ -353,7 +353,7 @@ class TestFlowErrorHandling:
mock_get_llm.side_effect = Exception("context length")
mock_is_context_exceeded.return_value = True
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
result = executor.call_llm_and_parse()
assert result == "context_error"
@@ -397,10 +397,10 @@ class TestFlowInvoke:
"tools_handler": Mock(),
}
@patch.object(AgentExecutor, "kickoff")
@patch.object(AgentExecutor, "_create_short_term_memory")
@patch.object(AgentExecutor, "_create_long_term_memory")
@patch.object(AgentExecutor, "_create_external_memory")
@patch.object(CrewAgentExecutorFlow, "kickoff")
@patch.object(CrewAgentExecutorFlow, "_create_short_term_memory")
@patch.object(CrewAgentExecutorFlow, "_create_long_term_memory")
@patch.object(CrewAgentExecutorFlow, "_create_external_memory")
def test_invoke_success(
self,
mock_external_memory,
@@ -410,7 +410,7 @@ class TestFlowInvoke:
mock_dependencies,
):
"""Test successful invoke without human feedback."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
# Mock kickoff to set the final answer in state
def mock_kickoff_side_effect():
@@ -429,10 +429,10 @@ class TestFlowInvoke:
mock_long_term_memory.assert_called_once()
mock_external_memory.assert_called_once()
@patch.object(AgentExecutor, "kickoff")
@patch.object(CrewAgentExecutorFlow, "kickoff")
def test_invoke_failure_no_agent_finish(self, mock_kickoff, mock_dependencies):
"""Test invoke fails without AgentFinish."""
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
executor.state.current_answer = AgentAction(
thought="thinking", tool="test", tool_input="test", text="action text"
)
@@ -442,10 +442,10 @@ class TestFlowInvoke:
with pytest.raises(RuntimeError, match="without reaching a final answer"):
executor.invoke(inputs)
@patch.object(AgentExecutor, "kickoff")
@patch.object(AgentExecutor, "_create_short_term_memory")
@patch.object(AgentExecutor, "_create_long_term_memory")
@patch.object(AgentExecutor, "_create_external_memory")
@patch.object(CrewAgentExecutorFlow, "kickoff")
@patch.object(CrewAgentExecutorFlow, "_create_short_term_memory")
@patch.object(CrewAgentExecutorFlow, "_create_long_term_memory")
@patch.object(CrewAgentExecutorFlow, "_create_external_memory")
def test_invoke_with_system_prompt(
self,
mock_external_memory,
@@ -459,7 +459,7 @@ class TestFlowInvoke:
"system": "System: {input}",
"user": "User: {input} {tool_names} {tools}",
}
executor = AgentExecutor(**mock_dependencies)
executor = CrewAgentExecutorFlow(**mock_dependencies)
def mock_kickoff_side_effect():
executor.state.current_answer = AgentFinish(

View File

@@ -72,53 +72,62 @@ class ResearchResult(BaseModel):
@pytest.mark.vcr()
@pytest.mark.parametrize("verbose", [True, False])
def test_agent_kickoff_preserves_parameters(verbose):
"""Test that Agent.kickoff() uses the correct parameters from the Agent."""
def test_lite_agent_created_with_correct_parameters(monkeypatch, verbose):
"""Test that LiteAgent is created with the correct parameters when Agent.kickoff() is called."""
# Create a test agent with specific parameters
mock_llm = Mock(spec=LLM)
mock_llm.call.return_value = "Final Answer: Test response"
mock_llm.stop = []
from crewai.types.usage_metrics import UsageMetrics
mock_usage_metrics = UsageMetrics(
total_tokens=100,
prompt_tokens=50,
completion_tokens=50,
cached_prompt_tokens=0,
successful_requests=1,
)
mock_llm.get_token_usage_summary.return_value = mock_usage_metrics
llm = LLM(model="gpt-4o-mini")
custom_tools = [WebSearchTool(), CalculatorTool()]
max_iter = 10
max_execution_time = 300
agent = Agent(
role="Test Agent",
goal="Test Goal",
backstory="Test Backstory",
llm=mock_llm,
llm=llm,
tools=custom_tools,
max_iter=max_iter,
max_execution_time=max_execution_time,
verbose=verbose,
)
# Call kickoff and verify it works
result = agent.kickoff("Test query")
# Create a mock to capture the created LiteAgent
created_lite_agent = None
original_lite_agent = LiteAgent
# Verify the agent was configured correctly
assert agent.role == "Test Agent"
assert agent.goal == "Test Goal"
assert agent.backstory == "Test Backstory"
assert len(agent.tools) == 2
assert isinstance(agent.tools[0], WebSearchTool)
assert isinstance(agent.tools[1], CalculatorTool)
assert agent.max_iter == max_iter
assert agent.verbose == verbose
# Define a mock LiteAgent class that captures its arguments
class MockLiteAgent(original_lite_agent):
def __init__(self, **kwargs):
nonlocal created_lite_agent
created_lite_agent = kwargs
super().__init__(**kwargs)
# Verify kickoff returned a result
assert result is not None
assert result.raw is not None
# Patch the LiteAgent class
monkeypatch.setattr("crewai.agent.core.LiteAgent", MockLiteAgent)
# Call kickoff to create the LiteAgent
agent.kickoff("Test query")
# Verify all parameters were passed correctly
assert created_lite_agent is not None
assert created_lite_agent["role"] == "Test Agent"
assert created_lite_agent["goal"] == "Test Goal"
assert created_lite_agent["backstory"] == "Test Backstory"
assert created_lite_agent["llm"] == llm
assert len(created_lite_agent["tools"]) == 2
assert isinstance(created_lite_agent["tools"][0], WebSearchTool)
assert isinstance(created_lite_agent["tools"][1], CalculatorTool)
assert created_lite_agent["max_iterations"] == max_iter
assert created_lite_agent["max_execution_time"] == max_execution_time
assert created_lite_agent["verbose"] == verbose
assert created_lite_agent["response_format"] is None
# Test with a response_format
class TestResponse(BaseModel):
test_field: str
agent.kickoff("Test query", response_format=TestResponse)
assert created_lite_agent["response_format"] == TestResponse
@pytest.mark.vcr()
@@ -301,8 +310,7 @@ def verify_agent_parent_flow(result, agent, flow):
def test_sets_parent_flow_when_inside_flow():
"""Test that an Agent can be created and executed inside a Flow context."""
captured_event = None
captured_agent = None
mock_llm = Mock(spec=LLM)
mock_llm.call.return_value = "Test response"
@@ -335,17 +343,15 @@ def test_sets_parent_flow_when_inside_flow():
event_received = threading.Event()
@crewai_event_bus.on(LiteAgentExecutionStartedEvent)
def capture_event(source, event):
nonlocal captured_event
captured_event = event
def capture_agent(source, event):
nonlocal captured_agent
captured_agent = source
event_received.set()
result = flow.kickoff()
flow.kickoff()
assert event_received.wait(timeout=5), "Timeout waiting for agent execution event"
assert captured_event is not None
assert captured_event.agent_info["role"] == "Test Agent"
assert result is not None
assert captured_agent.parent_flow is flow
@pytest.mark.vcr()
@@ -367,14 +373,16 @@ def test_guardrail_is_called_using_string():
@crewai_event_bus.on(LLMGuardrailStartedEvent)
def capture_guardrail_started(source, event):
assert isinstance(source, Agent)
assert isinstance(source, LiteAgent)
assert source.original_agent == agent
with condition:
guardrail_events["started"].append(event)
condition.notify()
@crewai_event_bus.on(LLMGuardrailCompletedEvent)
def capture_guardrail_completed(source, event):
assert isinstance(source, Agent)
assert isinstance(source, LiteAgent)
assert source.original_agent == agent
with condition:
guardrail_events["completed"].append(event)
condition.notify()
@@ -675,151 +683,3 @@ def test_agent_kickoff_with_mcp_tools(mock_get_mcp_tools):
# Verify MCP tools were retrieved
mock_get_mcp_tools.assert_called_once_with("https://mcp.exa.ai/mcp?api_key=test_exa_key&profile=research")
# ============================================================================
# Tests for LiteAgent inside Flow (magic auto-async pattern)
# ============================================================================
from crewai.flow.flow import listen
@pytest.mark.vcr()
def test_lite_agent_inside_flow_sync():
"""Test that LiteAgent.kickoff() works magically inside a Flow.
This tests the "magic auto-async" pattern where calling agent.kickoff()
from within a Flow automatically detects the event loop and returns a
coroutine that the Flow framework awaits. Users don't need to use async/await.
"""
# Track execution
execution_log = []
class TestFlow(Flow):
@start()
def run_agent(self):
execution_log.append("flow_started")
agent = Agent(
role="Test Agent",
goal="Answer questions",
backstory="A helpful test assistant",
llm=LLM(model="gpt-4o-mini"),
verbose=False,
)
# Magic: just call kickoff() normally - it auto-detects Flow context
result = agent.kickoff(messages="What is 2+2? Reply with just the number.")
execution_log.append("agent_completed")
return result
flow = TestFlow()
result = flow.kickoff()
# Verify the flow executed successfully
assert "flow_started" in execution_log
assert "agent_completed" in execution_log
assert result is not None
assert isinstance(result, LiteAgentOutput)
@pytest.mark.vcr()
def test_lite_agent_inside_flow_with_tools():
"""Test that LiteAgent with tools works correctly inside a Flow."""
class TestFlow(Flow):
@start()
def run_agent_with_tools(self):
agent = Agent(
role="Calculator Agent",
goal="Perform calculations",
backstory="A math expert",
llm=LLM(model="gpt-4o-mini"),
tools=[CalculatorTool()],
verbose=False,
)
result = agent.kickoff(messages="Calculate 10 * 5")
return result
flow = TestFlow()
result = flow.kickoff()
assert result is not None
assert isinstance(result, LiteAgentOutput)
assert result.raw is not None
@pytest.mark.vcr()
def test_multiple_agents_in_same_flow():
"""Test that multiple LiteAgents can run sequentially in the same Flow."""
class MultiAgentFlow(Flow):
@start()
def first_step(self):
agent1 = Agent(
role="First Agent",
goal="Greet users",
backstory="A friendly greeter",
llm=LLM(model="gpt-4o-mini"),
verbose=False,
)
return agent1.kickoff(messages="Say hello")
@listen(first_step)
def second_step(self, first_result):
agent2 = Agent(
role="Second Agent",
goal="Say goodbye",
backstory="A polite farewell agent",
llm=LLM(model="gpt-4o-mini"),
verbose=False,
)
return agent2.kickoff(messages="Say goodbye")
flow = MultiAgentFlow()
result = flow.kickoff()
assert result is not None
assert isinstance(result, LiteAgentOutput)
@pytest.mark.vcr()
def test_lite_agent_kickoff_async_inside_flow():
"""Test that Agent.kickoff_async() works correctly from async Flow methods."""
class AsyncAgentFlow(Flow):
@start()
async def async_agent_step(self):
agent = Agent(
role="Async Test Agent",
goal="Answer questions asynchronously",
backstory="An async helper",
llm=LLM(model="gpt-4o-mini"),
verbose=False,
)
result = await agent.kickoff_async(messages="What is 3+3?")
return result
flow = AsyncAgentFlow()
result = flow.kickoff()
assert result is not None
assert isinstance(result, LiteAgentOutput)
@pytest.mark.vcr()
def test_lite_agent_standalone_still_works():
"""Test that LiteAgent.kickoff() still works normally outside of a Flow.
This verifies that the magic auto-async pattern doesn't break standalone usage
where there's no event loop running.
"""
agent = Agent(
role="Standalone Agent",
goal="Answer questions",
backstory="A helpful assistant",
llm=LLM(model="gpt-4o-mini"),
verbose=False,
)
# This should work normally - no Flow, no event loop
result = agent.kickoff(messages="What is 5+5? Reply with just the number.")
assert result is not None
assert isinstance(result, LiteAgentOutput)
assert result.raw is not None

View File

@@ -1,119 +0,0 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Test Agent. A helpful
test assistant\nYour personal goal is: Answer questions\nTo give my best complete
final answer to the task respond using the exact following format:\n\nThought:
I now can give a great answer\nFinal Answer: Your final answer must be the great
and the most complete as possible, it must be outcome described.\n\nI MUST use
these formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task:
What is 2+2? Reply with just the number.\n\nBegin! This is VERY important to
you, use the tools available and give your best Final Answer, your job depends
on it!\n\nThought:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '673'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7b0HjL79y39EkUcMLrRhPFe3XGj\",\n \"object\":
\"chat.completion\",\n \"created\": 1768444914,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"I now can give a great answer \\nFinal
Answer: 4\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 136,\n \"completion_tokens\": 13,\n
\ \"total_tokens\": 149,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_8bbc38b4db\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 02:41:55 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '857'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '341'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '358'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -1,255 +0,0 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Calculator Agent. A math
expert\nYour personal goal is: Perform calculations\nYou ONLY have access to
the following tools, and should NEVER make up tools that are not listed here:\n\nTool
Name: calculate\nTool Arguments: {\n \"properties\": {\n \"expression\":
{\n \"title\": \"Expression\",\n \"type\": \"string\"\n }\n },\n \"required\":
[\n \"expression\"\n ],\n \"title\": \"CalculatorToolSchema\",\n \"type\":
\"object\",\n \"additionalProperties\": false\n}\nTool Description: Calculate
the result of a mathematical expression.\n\nIMPORTANT: Use the following format
in your response:\n\n```\nThought: you should always think about what to do\nAction:
the action to take, only one name of [calculate], just the name, exactly as
it''s written.\nAction Input: the input to the action, just a simple JSON object,
enclosed in curly braces, using \" to wrap keys and values.\nObservation: the
result of the action\n```\n\nOnce all necessary information is gathered, return
the following format:\n\n```\nThought: I now know the final answer\nFinal Answer:
the final answer to the original input question\n```"},{"role":"user","content":"\nCurrent
Task: Calculate 10 * 5\n\nBegin! This is VERY important to you, use the tools
available and give your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1403'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7avghVPSpszLmlbHpwDQlWDoD6O\",\n \"object\":
\"chat.completion\",\n \"created\": 1768444909,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"Thought: I need to calculate the expression
10 * 5.\\nAction: calculate\\nAction Input: {\\\"expression\\\":\\\"10 * 5\\\"}\\nObservation:
50\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 291,\n \"completion_tokens\": 33,\n
\ \"total_tokens\": 324,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_c4585b5b9c\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 02:41:49 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '939'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '579'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '598'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Calculator Agent. A math
expert\nYour personal goal is: Perform calculations\nYou ONLY have access to
the following tools, and should NEVER make up tools that are not listed here:\n\nTool
Name: calculate\nTool Arguments: {\n \"properties\": {\n \"expression\":
{\n \"title\": \"Expression\",\n \"type\": \"string\"\n }\n },\n \"required\":
[\n \"expression\"\n ],\n \"title\": \"CalculatorToolSchema\",\n \"type\":
\"object\",\n \"additionalProperties\": false\n}\nTool Description: Calculate
the result of a mathematical expression.\n\nIMPORTANT: Use the following format
in your response:\n\n```\nThought: you should always think about what to do\nAction:
the action to take, only one name of [calculate], just the name, exactly as
it''s written.\nAction Input: the input to the action, just a simple JSON object,
enclosed in curly braces, using \" to wrap keys and values.\nObservation: the
result of the action\n```\n\nOnce all necessary information is gathered, return
the following format:\n\n```\nThought: I now know the final answer\nFinal Answer:
the final answer to the original input question\n```"},{"role":"user","content":"\nCurrent
Task: Calculate 10 * 5\n\nBegin! This is VERY important to you, use the tools
available and give your best Final Answer, your job depends on it!\n\nThought:"},{"role":"assistant","content":"Thought:
I need to calculate the expression 10 * 5.\nAction: calculate\nAction Input:
{\"expression\":\"10 * 5\"}\nObservation: The result of 10 * 5 is 50"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '1591'
content-type:
- application/json
cookie:
- COOKIE-XXX
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7avDhDZCLvv8v2dh8ZQRrLdci6A\",\n \"object\":
\"chat.completion\",\n \"created\": 1768444909,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"Thought: I now know the final answer.\\nFinal
Answer: 50\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 337,\n \"completion_tokens\": 14,\n
\ \"total_tokens\": 351,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_c4585b5b9c\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 02:41:50 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '864'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '429'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '457'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -1,119 +0,0 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Async Test Agent. An async
helper\nYour personal goal is: Answer questions asynchronously\nTo give my best
complete final answer to the task respond using the exact following format:\n\nThought:
I now can give a great answer\nFinal Answer: Your final answer must be the great
and the most complete as possible, it must be outcome described.\n\nI MUST use
these formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task:
What is 3+3?\n\nBegin! This is VERY important to you, use the tools available
and give your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '657'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7atOGxtc4y3oYNI62WiQ0Vogsdv\",\n \"object\":
\"chat.completion\",\n \"created\": 1768444907,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"I now can give a great answer \\nFinal
Answer: The sum of 3 + 3 is 6. Therefore, the outcome is that if you add three
and three together, you will arrive at the total of six.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
131,\n \"completion_tokens\": 46,\n \"total_tokens\": 177,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_29330a9688\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 02:41:48 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '983'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '944'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '1192'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -1,119 +0,0 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are Standalone Agent. A helpful
assistant\nYour personal goal is: Answer questions\nTo give my best complete
final answer to the task respond using the exact following format:\n\nThought:
I now can give a great answer\nFinal Answer: Your final answer must be the great
and the most complete as possible, it must be outcome described.\n\nI MUST use
these formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task:
What is 5+5? Reply with just the number.\n\nBegin! This is VERY important to
you, use the tools available and give your best Final Answer, your job depends
on it!\n\nThought:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '674'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7azhPwUHQ0p5tdhxSAmLPoE8UgC\",\n \"object\":
\"chat.completion\",\n \"created\": 1768444913,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"I now can give a great answer \\nFinal
Answer: 10\",\n \"refusal\": null,\n \"annotations\": []\n },\n
\ \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n
\ \"usage\": {\n \"prompt_tokens\": 136,\n \"completion_tokens\": 13,\n
\ \"total_tokens\": 149,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_29330a9688\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 02:41:54 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '858'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '455'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '583'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -1,239 +0,0 @@
interactions:
- request:
body: '{"messages":[{"role":"system","content":"You are First Agent. A friendly
greeter\nYour personal goal is: Greet users\nTo give my best complete final
answer to the task respond using the exact following format:\n\nThought: I now
can give a great answer\nFinal Answer: Your final answer must be the great and
the most complete as possible, it must be outcome described.\n\nI MUST use these
formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task: Say
hello\n\nBegin! This is VERY important to you, use the tools available and give
your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '632'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-CyRKzgODZ9yn3F9OkaXsscLk2Ln3N\",\n \"object\":
\"chat.completion\",\n \"created\": 1768520801,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"I now can give a great answer \\nFinal
Answer: Hello! Welcome! I'm so glad to see you here. If you need any assistance
or have any questions, feel free to ask. Have a wonderful day!\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
127,\n \"completion_tokens\": 43,\n \"total_tokens\": 170,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_c4585b5b9c\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 23:46:42 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '990'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '880'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '1160'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"You are Second Agent. A polite
farewell agent\nYour personal goal is: Say goodbye\nTo give my best complete
final answer to the task respond using the exact following format:\n\nThought:
I now can give a great answer\nFinal Answer: Your final answer must be the great
and the most complete as possible, it must be outcome described.\n\nI MUST use
these formats, my job depends on it!"},{"role":"user","content":"\nCurrent Task:
Say goodbye\n\nBegin! This is VERY important to you, use the tools available
and give your best Final Answer, your job depends on it!\n\nThought:"}],"model":"gpt-4o-mini"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
connection:
- keep-alive
content-length:
- '640'
content-type:
- application/json
host:
- api.openai.com
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
x-stainless-package-version:
- 1.83.0
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-CyRL1Ua2PkK5xXPp3KeF0AnGAk3JP\",\n \"object\":
\"chat.completion\",\n \"created\": 1768520803,\n \"model\": \"gpt-4o-mini-2024-07-18\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"I now can give a great answer \\nFinal
Answer: As we reach the end of our conversation, I want to express my gratitude
for the time we've shared. It's been a pleasure assisting you, and I hope
you found our interaction helpful and enjoyable. Remember, whenever you need
assistance, I'm just a message away. Wishing you all the best in your future
endeavors. Goodbye and take care!\",\n \"refusal\": null,\n \"annotations\":
[]\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n
\ }\n ],\n \"usage\": {\n \"prompt_tokens\": 126,\n \"completion_tokens\":
79,\n \"total_tokens\": 205,\n \"prompt_tokens_details\": {\n \"cached_tokens\":
0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_29330a9688\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 23:46:44 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
Strict-Transport-Security:
- STS-XXX
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
access-control-expose-headers:
- ACCESS-CONTROL-XXX
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '1189'
openai-organization:
- OPENAI-ORG-XXX
openai-processing-ms:
- '1363'
openai-project:
- OPENAI-PROJECT-XXX
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '1605'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
x-request-id:
- X-REQUEST-ID-XXX
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,75 @@
interactions:
- request:
body: '{"contents": [{"parts": [{"text": "\nCurrent Task: What is the capital
of Japan?\n\nThis is the expected criteria for your final answer: The capital
of Japan\nyou MUST return the actual complete content as the final answer, not
a summary.\n\nBegin! This is VERY important to you, use the tools available
and give your best Final Answer, your job depends on it!\n\nThought:"}], "role":
"user"}], "systemInstruction": {"parts": [{"text": "You are Research Assistant.
You are a helpful research assistant.\nYour personal goal is: Find information
about the capital of Japan\nTo give my best complete final answer to the task
respond using the exact following format:\n\nThought: I now can give a great
answer\nFinal Answer: Your final answer must be the great and the most complete
as possible, it must be outcome described.\n\nI MUST use these formats, my job
depends on it!"}], "role": "user"}, "generationConfig": {"stopSequences": ["\nObservation:"]}}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- '*/*'
accept-encoding:
- ACCEPT-ENCODING-XXX
connection:
- keep-alive
content-length:
- '952'
content-type:
- application/json
host:
- aiplatform.googleapis.com
x-goog-api-client:
- google-genai-sdk/1.59.0 gl-python/3.13.3
x-goog-api-key:
- X-GOOG-API-KEY-XXX
method: POST
uri: https://aiplatform.googleapis.com/v1/publishers/google/models/gemini-2.0-flash-exp:generateContent
response:
body:
string: "{\n \"candidates\": [\n {\n \"content\": {\n \"role\":
\"model\",\n \"parts\": [\n {\n \"text\": \"The
capital of Japan is Tokyo.\\nFinal Answer: Tokyo\\n\"\n }\n ]\n
\ },\n \"finishReason\": \"STOP\",\n \"avgLogprobs\": -0.017845841554495003\n
\ }\n ],\n \"usageMetadata\": {\n \"promptTokenCount\": 163,\n \"candidatesTokenCount\":
13,\n \"totalTokenCount\": 176,\n \"trafficType\": \"ON_DEMAND\",\n
\ \"promptTokensDetails\": [\n {\n \"modality\": \"TEXT\",\n
\ \"tokenCount\": 163\n }\n ],\n \"candidatesTokensDetails\":
[\n {\n \"modality\": \"TEXT\",\n \"tokenCount\": 13\n
\ }\n ]\n },\n \"modelVersion\": \"gemini-2.0-flash-exp\",\n \"createTime\":
\"2026-01-15T22:27:38.066749Z\",\n \"responseId\": \"2mlpab2JBNOFidsPh5GigQs\"\n}\n"
headers:
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Type:
- application/json; charset=UTF-8
Date:
- Thu, 15 Jan 2026 22:27:38 GMT
Server:
- scaffolding on HTTPServer2
Transfer-Encoding:
- chunked
Vary:
- Origin
- X-Origin
- Referer
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
X-Frame-Options:
- X-FRAME-OPTIONS-XXX
X-XSS-Protection:
- '0'
content-length:
- '786'
status:
code: 200
message: OK
version: 1

File diff suppressed because one or more lines are too long

View File

@@ -1,528 +1,456 @@
interactions:
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are Guardrail Agent.
You are a expert at validating the output of a task. By providing effective
feedback if the output is not valid.\\nYour personal goal is: Validate the output
of the task\\nTo give my best complete final answer to the task respond using
the exact following format:\\n\\nThought: I now can give a great answer\\nFinal
Answer: Your final answer must be the great and the most complete as possible,
it must be outcome described.\\n\\nI MUST use these formats, my job depends
on it!\"},{\"role\":\"user\",\"content\":\"\\nCurrent Task: \\n Ensure
the following task result complies with the given guardrail.\\n\\n Task
result:\\n \\n Lorem Ipsum is simply dummy text of the printing
and typesetting industry. Lorem Ipsum has been the industry's standard dummy
text ever\\n \\n\\n Guardrail:\\n Ensure the result has
less than 10 words\\n\\n Your task:\\n - Confirm if the Task result
complies with the guardrail.\\n - If not, provide clear feedback explaining
what is wrong (e.g., by how much it violates the rule, or what specific part
fails).\\n - Focus only on identifying issues \u2014 do not propose corrections.\\n
\ - If the Task result complies with the guardrail, saying that is valid\\n
\ \\n\\nBegin! This is VERY important to you, use the tools available
and give your best Final Answer, your job depends on it!\\n\\nThought:\"}],\"model\":\"gpt-4o\"}"
body: '{"trace_id": "00000000-0000-0000-0000-000000000000", "execution_type": "crew", "user_identifier": null, "execution_context": {"crew_fingerprint": null, "crew_name": "Unknown Crew", "flow_name": null, "crewai_version": "1.3.0", "privacy_level": "standard"}, "execution_metadata": {"expected_duration_estimate": 300, "agent_count": 0, "task_count": 0, "flow_method_count": 0, "execution_started_at": "2025-11-05T22:19:56.074812+00:00"}}'
headers:
Accept:
- '*/*'
Accept-Encoding:
- gzip, deflate, zstd
Connection:
- keep-alive
Content-Length:
- '434'
Content-Type:
- application/json
User-Agent:
- X-USER-AGENT-XXX
- CrewAI-CLI/1.3.0
X-Crewai-Version:
- 1.3.0
method: POST
uri: https://app.crewai.com/crewai_plus/api/v1/tracing/batches
response:
body:
string: '{"error":"bad_credentials","message":"Bad credentials"}'
headers:
Connection:
- keep-alive
Content-Length:
- '55'
Content-Type:
- application/json; charset=utf-8
Date:
- Wed, 05 Nov 2025 22:19:56 GMT
cache-control:
- no-store
content-security-policy:
- 'default-src ''self'' *.app.crewai.com app.crewai.com; script-src ''self'' ''unsafe-inline'' *.app.crewai.com app.crewai.com https://cdn.jsdelivr.net/npm/apexcharts https://www.gstatic.com https://run.pstmn.io https://apis.google.com https://apis.google.com/js/api.js https://accounts.google.com https://accounts.google.com/gsi/client https://cdnjs.cloudflare.com/ajax/libs/normalize/8.0.1/normalize.min.css.map https://*.google.com https://docs.google.com https://slides.google.com https://js.hs-scripts.com https://js.sentry-cdn.com https://browser.sentry-cdn.com https://www.googletagmanager.com https://js-na1.hs-scripts.com https://js.hubspot.com http://js-na1.hs-scripts.com https://bat.bing.com https://cdn.amplitude.com https://cdn.segment.com https://d1d3n03t5zntha.cloudfront.net/ https://descriptusercontent.com https://edge.fullstory.com https://googleads.g.doubleclick.net https://js.hs-analytics.net https://js.hs-banner.com https://js.hsadspixel.net https://js.hscollectedforms.net
https://js.usemessages.com https://snap.licdn.com https://static.cloudflareinsights.com https://static.reo.dev https://www.google-analytics.com https://share.descript.com/; style-src ''self'' ''unsafe-inline'' *.app.crewai.com app.crewai.com https://cdn.jsdelivr.net/npm/apexcharts; img-src ''self'' data: *.app.crewai.com app.crewai.com https://zeus.tools.crewai.com https://dashboard.tools.crewai.com https://cdn.jsdelivr.net https://forms.hsforms.com https://track.hubspot.com https://px.ads.linkedin.com https://px4.ads.linkedin.com https://www.google.com https://www.google.com.br; font-src ''self'' data: *.app.crewai.com app.crewai.com; connect-src ''self'' *.app.crewai.com app.crewai.com https://zeus.tools.crewai.com https://connect.useparagon.com/ https://zeus.useparagon.com/* https://*.useparagon.com/* https://run.pstmn.io https://connect.tools.crewai.com/ https://*.sentry.io https://www.google-analytics.com https://edge.fullstory.com https://rs.fullstory.com https://api.hubspot.com
https://forms.hscollectedforms.net https://api.hubapi.com https://px.ads.linkedin.com https://px4.ads.linkedin.com https://google.com/pagead/form-data/16713662509 https://google.com/ccm/form-data/16713662509 https://www.google.com/ccm/collect https://worker-actionkit.tools.crewai.com https://api.reo.dev; frame-src ''self'' *.app.crewai.com app.crewai.com https://connect.useparagon.com/ https://zeus.tools.crewai.com https://zeus.useparagon.com/* https://connect.tools.crewai.com/ https://docs.google.com https://drive.google.com https://slides.google.com https://accounts.google.com https://*.google.com https://app.hubspot.com/ https://td.doubleclick.net https://www.googletagmanager.com/ https://www.youtube.com https://share.descript.com'
expires:
- '0'
permissions-policy:
- camera=(), microphone=(self), geolocation=()
pragma:
- no-cache
referrer-policy:
- strict-origin-when-cross-origin
strict-transport-security:
- max-age=63072000; includeSubDomains
vary:
- Accept
x-content-type-options:
- nosniff
x-frame-options:
- SAMEORIGIN
x-permitted-cross-domain-policies:
- none
x-request-id:
- 230c6cb5-92c7-448d-8c94-e5548a9f4259
x-runtime:
- '0.073220'
x-xss-protection:
- 1; mode=block
status:
code: 401
message: Unauthorized
- request:
body: '{"messages":[{"role":"system","content":"You are Guardrail Agent. You are a expert at validating the output of a task. By providing effective feedback if the output is not valid.\nYour personal goal is: Validate the output of the task\n\nTo give my best complete final answer to the task respond using the exact following format:\n\nThought: I now can give a great answer\nFinal Answer: Your final answer must be the great and the most complete as possible, it must be outcome described.\n\nI MUST use these formats, my job depends on it!Ensure your final answer strictly adheres to the following OpenAPI schema: {\n \"type\": \"json_schema\",\n \"json_schema\": {\n \"name\": \"LLMGuardrailResult\",\n \"strict\": true,\n \"schema\": {\n \"properties\": {\n \"valid\": {\n \"description\": \"Whether the task output complies with the guardrail\",\n \"title\": \"Valid\",\n \"type\": \"boolean\"\n },\n \"feedback\": {\n \"anyOf\":
[\n {\n \"type\": \"string\"\n },\n {\n \"type\": \"null\"\n }\n ],\n \"default\": null,\n \"description\": \"A feedback about the task output if it is not valid\",\n \"title\": \"Feedback\"\n }\n },\n \"required\": [\n \"valid\",\n \"feedback\"\n ],\n \"title\": \"LLMGuardrailResult\",\n \"type\": \"object\",\n \"additionalProperties\": false\n }\n }\n}\n\nDo not include the OpenAPI schema in the final output. Ensure the final output does not include any code block markers like ```json or ```python."},{"role":"user","content":"\n Ensure the following task result complies with the given guardrail.\n\n Task result:\n \n Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever\n \n\n Guardrail:\n Ensure
the result has less than 10 words\n\n Your task:\n - Confirm if the Task result complies with the guardrail.\n - If not, provide clear feedback explaining what is wrong (e.g., by how much it violates the rule, or what specific part fails).\n - Focus only on identifying issues — do not propose corrections.\n - If the Task result complies with the guardrail, saying that is valid\n "}],"model":"gpt-4o"}'
headers:
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
- gzip, deflate, zstd
connection:
- keep-alive
content-length:
- '1467'
- '2452'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.109.1
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
- MacOS
x-stainless-package-version:
- 1.83.0
- 1.109.1
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
- '600'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
- 3.12.9
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7yHRYTZi8yzRbcODnKr92keLKCb\",\n \"object\":
\"chat.completion\",\n \"created\": 1768446357,\n \"model\": \"gpt-4o-2024-08-06\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"The task result provided has more than
10 words. I will count the words to verify this.\\n\\nThe task result is the
following text:\\n\\\"Lorem Ipsum is simply dummy text of the printing and
typesetting industry. Lorem Ipsum has been the industry's standard dummy text
ever\\\"\\n\\nCounting the words:\\n\\n1. Lorem \\n2. Ipsum \\n3. is \\n4.
simply \\n5. dummy \\n6. text \\n7. of \\n8. the \\n9. printing \\n10. and
\\n11. typesetting \\n12. industry. \\n13. Lorem \\n14. Ipsum \\n15. has \\n16.
been \\n17. the \\n18. industry's \\n19. standard \\n20. dummy \\n21. text
\\n22. ever\\n\\nThe total word count is 22.\\n\\nThought: I now can give
a great answer\\nFinal Answer: The task result does not comply with the guardrail.
It contains 22 words, which exceeds the limit of 10 words.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
285,\n \"completion_tokens\": 195,\n \"total_tokens\": 480,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_deacdd5f6f\"\n}\n"
string: "{\n \"id\": \"chatcmpl-CYg96Riy2RJRxnBHvoROukymP9wvs\",\n \"object\": \"chat.completion\",\n \"created\": 1762381196,\n \"model\": \"gpt-4o-2024-08-06\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"Thought: I need to check if the task result meets the requirement of having less than 10 words.\\n\\nFinal Answer: {\\n \\\"valid\\\": false,\\n \\\"feedback\\\": \\\"The task result contains more than 10 words, violating the guardrail. The text provided contains about 21 words.\\\"\\n}\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 489,\n \"completion_tokens\": 61,\n \"total_tokens\": 550,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\"\
: 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\": \"default\",\n \"system_fingerprint\": \"fp_cbf1785567\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
- REDACTED-RAY
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 03:05:59 GMT
- Wed, 05 Nov 2025 22:19:58 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
- __cf_bm=REDACTED; path=/; expires=Wed, 05-Nov-25 22:49:58 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
- _cfuvid=REDACTED; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
Strict-Transport-Security:
- STS-XXX
- max-age=31536000; includeSubDomains; preload
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
- nosniff
access-control-expose-headers:
- ACCESS-CONTROL-XXX
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '1557'
openai-organization:
- OPENAI-ORG-XXX
- user-hortuttj2f3qtmxyik2zxf4q
openai-processing-ms:
- '2130'
- '2201'
openai-project:
- OPENAI-PROJECT-XXX
- proj_fL4UBWR1CMpAAdgzaSKqsVvA
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '2147'
- '2401'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
- '500'
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
- '30000'
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
- '499'
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
- '29439'
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
- 120ms
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
- 1.122s
x-request-id:
- X-REQUEST-ID-XXX
- req_REDACTED
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"Ensure your final answer strictly
adheres to the following OpenAPI schema: {\n \"type\": \"json_schema\",\n \"json_schema\":
{\n \"name\": \"LLMGuardrailResult\",\n \"strict\": true,\n \"schema\":
{\n \"properties\": {\n \"valid\": {\n \"description\":
\"Whether the task output complies with the guardrail\",\n \"title\":
\"Valid\",\n \"type\": \"boolean\"\n },\n \"feedback\":
{\n \"anyOf\": [\n {\n \"type\": \"string\"\n },\n {\n \"type\":
\"null\"\n }\n ],\n \"default\": null,\n \"description\":
\"A feedback about the task output if it is not valid\",\n \"title\":
\"Feedback\"\n }\n },\n \"required\": [\n \"valid\",\n \"feedback\"\n ],\n \"title\":
\"LLMGuardrailResult\",\n \"type\": \"object\",\n \"additionalProperties\":
false\n }\n }\n}\n\nDo not include the OpenAPI schema in the final output.
Ensure the final output does not include any code block markers like ```json
or ```python."},{"role":"user","content":"The task result does not comply with
the guardrail. It contains 22 words, which exceeds the limit of 10 words."}],"model":"gpt-4o","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"valid":{"description":"Whether
the task output complies with the guardrail","title":"Valid","type":"boolean"},"feedback":{"anyOf":[{"type":"string"},{"type":"null"}],"description":"A
feedback about the task output if it is not valid","title":"Feedback"}},"required":["valid","feedback"],"title":"LLMGuardrailResult","type":"object","additionalProperties":false},"name":"LLMGuardrailResult","strict":true}},"stream":false}'
body: '{"messages":[{"role":"system","content":"Ensure your final answer strictly adheres to the following OpenAPI schema: {\n \"type\": \"json_schema\",\n \"json_schema\": {\n \"name\": \"LLMGuardrailResult\",\n \"strict\": true,\n \"schema\": {\n \"properties\": {\n \"valid\": {\n \"description\": \"Whether the task output complies with the guardrail\",\n \"title\": \"Valid\",\n \"type\": \"boolean\"\n },\n \"feedback\": {\n \"anyOf\": [\n {\n \"type\": \"string\"\n },\n {\n \"type\": \"null\"\n }\n ],\n \"default\": null,\n \"description\": \"A feedback about the task output if it is not valid\",\n \"title\": \"Feedback\"\n }\n },\n \"required\": [\n \"valid\",\n \"feedback\"\n ],\n \"title\": \"LLMGuardrailResult\",\n \"type\": \"object\",\n \"additionalProperties\":
false\n }\n }\n}\n\nDo not include the OpenAPI schema in the final output. Ensure the final output does not include any code block markers like ```json or ```python."},{"role":"user","content":"{\n \"valid\": false,\n \"feedback\": \"The task result contains more than 10 words, violating the guardrail. The text provided contains about 21 words.\"\n}"}],"model":"gpt-4o","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"valid":{"description":"Whether the task output complies with the guardrail","title":"Valid","type":"boolean"},"feedback":{"anyOf":[{"type":"string"},{"type":"null"}],"description":"A feedback about the task output if it is not valid","title":"Feedback"}},"required":["valid","feedback"],"title":"LLMGuardrailResult","type":"object","additionalProperties":false},"name":"LLMGuardrailResult","strict":true}},"stream":false}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
- gzip, deflate, zstd
connection:
- keep-alive
content-length:
- '1835'
- '1884'
content-type:
- application/json
cookie:
- COOKIE-XXX
- __cf_bm=REDACTED; _cfuvid=REDACTED
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.109.1
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
- arm64
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
- chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
- MacOS
x-stainless-package-version:
- 1.83.0
- 1.109.1
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
- '600'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
- 3.12.9
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7yJiPCk4fXuogyT5e8XeGRLCSf8\",\n \"object\":
\"chat.completion\",\n \"created\": 1768446359,\n \"model\": \"gpt-4o-2024-08-06\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"valid\\\":false,\\\"feedback\\\":\\\"The
task output exceeds the word limit of 10 words by containing 22 words.\\\"}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
363,\n \"completion_tokens\": 25,\n \"total_tokens\": 388,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_a0e9480a2f\"\n}\n"
string: "{\n \"id\": \"chatcmpl-CYg98QlZ8NTrQ69676MpXXyCoZJT8\",\n \"object\": \"chat.completion\",\n \"created\": 1762381198,\n \"model\": \"gpt-4o-2024-08-06\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"{\\\"valid\\\":false,\\\"feedback\\\":\\\"The task result contains more than 10 words, violating the guardrail. The text provided contains about 21 words.\\\"}\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 374,\n \"completion_tokens\": 32,\n \"total_tokens\": 406,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n\
\ \"service_tier\": \"default\",\n \"system_fingerprint\": \"fp_cbf1785567\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
- REDACTED-RAY
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 03:05:59 GMT
- Wed, 05 Nov 2025 22:19:59 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
- max-age=31536000; includeSubDomains; preload
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
- nosniff
access-control-expose-headers:
- ACCESS-CONTROL-XXX
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '913'
openai-organization:
- OPENAI-ORG-XXX
- user-hortuttj2f3qtmxyik2zxf4q
openai-processing-ms:
- '488'
- '419'
openai-project:
- OPENAI-PROJECT-XXX
- proj_fL4UBWR1CMpAAdgzaSKqsVvA
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '507'
- '432'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
- '500'
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
- '30000'
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
- '499'
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
- '29702'
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
- 120ms
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
- 596ms
x-request-id:
- X-REQUEST-ID-XXX
- req_REDACTED
status:
code: 200
message: OK
- request:
body: "{\"messages\":[{\"role\":\"system\",\"content\":\"You are Guardrail Agent.
You are a expert at validating the output of a task. By providing effective
feedback if the output is not valid.\\nYour personal goal is: Validate the output
of the task\\nTo give my best complete final answer to the task respond using
the exact following format:\\n\\nThought: I now can give a great answer\\nFinal
Answer: Your final answer must be the great and the most complete as possible,
it must be outcome described.\\n\\nI MUST use these formats, my job depends
on it!\"},{\"role\":\"user\",\"content\":\"\\nCurrent Task: \\n Ensure
the following task result complies with the given guardrail.\\n\\n Task
result:\\n \\n Lorem Ipsum is simply dummy text of the printing
and typesetting industry. Lorem Ipsum has been the industry's standard dummy
text ever\\n \\n\\n Guardrail:\\n Ensure the result has
less than 500 words\\n\\n Your task:\\n - Confirm if the Task
result complies with the guardrail.\\n - If not, provide clear feedback
explaining what is wrong (e.g., by how much it violates the rule, or what specific
part fails).\\n - Focus only on identifying issues \u2014 do not propose
corrections.\\n - If the Task result complies with the guardrail, saying
that is valid\\n \\n\\nBegin! This is VERY important to you, use the
tools available and give your best Final Answer, your job depends on it!\\n\\nThought:\"}],\"model\":\"gpt-4o\"}"
body: '{"messages":[{"role":"system","content":"You are Guardrail Agent. You are a expert at validating the output of a task. By providing effective feedback if the output is not valid.\nYour personal goal is: Validate the output of the task\n\nTo give my best complete final answer to the task respond using the exact following format:\n\nThought: I now can give a great answer\nFinal Answer: Your final answer must be the great and the most complete as possible, it must be outcome described.\n\nI MUST use these formats, my job depends on it!Ensure your final answer strictly adheres to the following OpenAPI schema: {\n \"type\": \"json_schema\",\n \"json_schema\": {\n \"name\": \"LLMGuardrailResult\",\n \"strict\": true,\n \"schema\": {\n \"properties\": {\n \"valid\": {\n \"description\": \"Whether the task output complies with the guardrail\",\n \"title\": \"Valid\",\n \"type\": \"boolean\"\n },\n \"feedback\": {\n \"anyOf\":
[\n {\n \"type\": \"string\"\n },\n {\n \"type\": \"null\"\n }\n ],\n \"default\": null,\n \"description\": \"A feedback about the task output if it is not valid\",\n \"title\": \"Feedback\"\n }\n },\n \"required\": [\n \"valid\",\n \"feedback\"\n ],\n \"title\": \"LLMGuardrailResult\",\n \"type\": \"object\",\n \"additionalProperties\": false\n }\n }\n}\n\nDo not include the OpenAPI schema in the final output. Ensure the final output does not include any code block markers like ```json or ```python."},{"role":"user","content":"\n Ensure the following task result complies with the given guardrail.\n\n Task result:\n \n Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever\n \n\n Guardrail:\n Ensure
the result has less than 500 words\n\n Your task:\n - Confirm if the Task result complies with the guardrail.\n - If not, provide clear feedback explaining what is wrong (e.g., by how much it violates the rule, or what specific part fails).\n - Focus only on identifying issues — do not propose corrections.\n - If the Task result complies with the guardrail, saying that is valid\n "}],"model":"gpt-4o"}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
- gzip, deflate, zstd
connection:
- keep-alive
content-length:
- '1468'
- '2453'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.109.1
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
- MacOS
x-stainless-package-version:
- 1.83.0
- 1.109.1
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
- '600'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
- 3.12.9
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7yKa0rmi2YoTLpyXt9hjeLt2rTI\",\n \"object\":
\"chat.completion\",\n \"created\": 1768446360,\n \"model\": \"gpt-4o-2024-08-06\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"First, I'll count the number of words
in the Task result to ensure it complies with the guardrail. \\n\\nThe Task
result is: \\\"Lorem Ipsum is simply dummy text of the printing and typesetting
industry. Lorem Ipsum has been the industry's standard dummy text ever.\\\"\\n\\nBy
counting the words: \\n1. Lorem\\n2. Ipsum\\n3. is\\n4. simply\\n5. dummy\\n6.
text\\n7. of\\n8. the\\n9. printing\\n10. and\\n11. typesetting\\n12. industry\\n13.
Lorem\\n14. Ipsum\\n15. has\\n16. been\\n17. the\\n18. industry's\\n19. standard\\n20.
dummy\\n21. text\\n22. ever\\n\\nThere are 22 words total in the Task result.\\n\\nI
need to verify if the count of 22 words is less than the guardrail limit of
500 words.\\n\\nThought: I now can give a great answer\\nFinal Answer: The
Task result complies with the guardrail as it contains 22 words, which is
less than the 500-word limit. Therefore, the output is valid.\",\n \"refusal\":
null,\n \"annotations\": []\n },\n \"logprobs\": null,\n
\ \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
285,\n \"completion_tokens\": 227,\n \"total_tokens\": 512,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_deacdd5f6f\"\n}\n"
string: "{\n \"id\": \"chatcmpl-CYgBMV6fu7EvV2BqzMdJaKyLAg1WW\",\n \"object\": \"chat.completion\",\n \"created\": 1762381336,\n \"model\": \"gpt-4o-2024-08-06\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"Thought: I now can give a great answer\\nFinal Answer: {\\\"valid\\\": true, \\\"feedback\\\": null}\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 489,\n \"completion_tokens\": 23,\n \"total_tokens\": 512,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\": \"default\",\n \"system_fingerprint\"\
: \"fp_cbf1785567\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
- REDACTED-RAY
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 03:06:02 GMT
- Wed, 05 Nov 2025 22:22:16 GMT
Server:
- cloudflare
Set-Cookie:
- SET-COOKIE-XXX
- __cf_bm=REDACTED; path=/; expires=Wed, 05-Nov-25 22:52:16 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
- _cfuvid=REDACTED; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
Strict-Transport-Security:
- STS-XXX
- max-age=31536000; includeSubDomains; preload
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
- nosniff
access-control-expose-headers:
- ACCESS-CONTROL-XXX
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '1668'
openai-organization:
- OPENAI-ORG-XXX
- user-hortuttj2f3qtmxyik2zxf4q
openai-processing-ms:
- '2502'
- '327'
openai-project:
- OPENAI-PROJECT-XXX
- proj_fL4UBWR1CMpAAdgzaSKqsVvA
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '2522'
- '372'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
- '500'
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
- '30000'
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
- '499'
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
- '29438'
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
- 120ms
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
- 1.124s
x-request-id:
- X-REQUEST-ID-XXX
- req_REDACTED
status:
code: 200
message: OK
- request:
body: '{"messages":[{"role":"system","content":"Ensure your final answer strictly
adheres to the following OpenAPI schema: {\n \"type\": \"json_schema\",\n \"json_schema\":
{\n \"name\": \"LLMGuardrailResult\",\n \"strict\": true,\n \"schema\":
{\n \"properties\": {\n \"valid\": {\n \"description\":
\"Whether the task output complies with the guardrail\",\n \"title\":
\"Valid\",\n \"type\": \"boolean\"\n },\n \"feedback\":
{\n \"anyOf\": [\n {\n \"type\": \"string\"\n },\n {\n \"type\":
\"null\"\n }\n ],\n \"default\": null,\n \"description\":
\"A feedback about the task output if it is not valid\",\n \"title\":
\"Feedback\"\n }\n },\n \"required\": [\n \"valid\",\n \"feedback\"\n ],\n \"title\":
\"LLMGuardrailResult\",\n \"type\": \"object\",\n \"additionalProperties\":
false\n }\n }\n}\n\nDo not include the OpenAPI schema in the final output.
Ensure the final output does not include any code block markers like ```json
or ```python."},{"role":"user","content":"The Task result complies with the
guardrail as it contains 22 words, which is less than the 500-word limit. Therefore,
the output is valid."}],"model":"gpt-4o","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"valid":{"description":"Whether
the task output complies with the guardrail","title":"Valid","type":"boolean"},"feedback":{"anyOf":[{"type":"string"},{"type":"null"}],"description":"A
feedback about the task output if it is not valid","title":"Feedback"}},"required":["valid","feedback"],"title":"LLMGuardrailResult","type":"object","additionalProperties":false},"name":"LLMGuardrailResult","strict":true}},"stream":false}'
body: '{"messages":[{"role":"system","content":"Ensure your final answer strictly adheres to the following OpenAPI schema: {\n \"type\": \"json_schema\",\n \"json_schema\": {\n \"name\": \"LLMGuardrailResult\",\n \"strict\": true,\n \"schema\": {\n \"properties\": {\n \"valid\": {\n \"description\": \"Whether the task output complies with the guardrail\",\n \"title\": \"Valid\",\n \"type\": \"boolean\"\n },\n \"feedback\": {\n \"anyOf\": [\n {\n \"type\": \"string\"\n },\n {\n \"type\": \"null\"\n }\n ],\n \"default\": null,\n \"description\": \"A feedback about the task output if it is not valid\",\n \"title\": \"Feedback\"\n }\n },\n \"required\": [\n \"valid\",\n \"feedback\"\n ],\n \"title\": \"LLMGuardrailResult\",\n \"type\": \"object\",\n \"additionalProperties\":
false\n }\n }\n}\n\nDo not include the OpenAPI schema in the final output. Ensure the final output does not include any code block markers like ```json or ```python."},{"role":"user","content":"{\"valid\": true, \"feedback\": null}"}],"model":"gpt-4o","response_format":{"type":"json_schema","json_schema":{"schema":{"properties":{"valid":{"description":"Whether the task output complies with the guardrail","title":"Valid","type":"boolean"},"feedback":{"anyOf":[{"type":"string"},{"type":"null"}],"description":"A feedback about the task output if it is not valid","title":"Feedback"}},"required":["valid","feedback"],"title":"LLMGuardrailResult","type":"object","additionalProperties":false},"name":"LLMGuardrailResult","strict":true}},"stream":false}'
headers:
User-Agent:
- X-USER-AGENT-XXX
accept:
- application/json
accept-encoding:
- ACCEPT-ENCODING-XXX
authorization:
- AUTHORIZATION-XXX
- gzip, deflate, zstd
connection:
- keep-alive
content-length:
- '1864'
- '1762'
content-type:
- application/json
cookie:
- COOKIE-XXX
- __cf_bm=REDACTED; _cfuvid=REDACTED
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.109.1
x-stainless-arch:
- X-STAINLESS-ARCH-XXX
- arm64
x-stainless-async:
- 'false'
x-stainless-helper-method:
- beta.chat.completions.parse
- chat.completions.parse
x-stainless-lang:
- python
x-stainless-os:
- X-STAINLESS-OS-XXX
- MacOS
x-stainless-package-version:
- 1.83.0
- 1.109.1
x-stainless-read-timeout:
- X-STAINLESS-READ-TIMEOUT-XXX
- '600'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.13.3
- 3.12.9
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: "{\n \"id\": \"chatcmpl-Cy7yMAjNYSCz2foZPEcSVCuapzF8y\",\n \"object\":
\"chat.completion\",\n \"created\": 1768446362,\n \"model\": \"gpt-4o-2024-08-06\",\n
\ \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\":
\"assistant\",\n \"content\": \"{\\\"valid\\\":true,\\\"feedback\\\":null}\",\n
\ \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\":
null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\":
369,\n \"completion_tokens\": 9,\n \"total_tokens\": 378,\n \"prompt_tokens_details\":
{\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\":
{\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\":
0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\":
\"default\",\n \"system_fingerprint\": \"fp_a0e9480a2f\"\n}\n"
string: "{\n \"id\": \"chatcmpl-CYgBMU20R45qGGaLN6vNAmW1NR4R6\",\n \"object\": \"chat.completion\",\n \"created\": 1762381336,\n \"model\": \"gpt-4o-2024-08-06\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"{\\\"valid\\\":true,\\\"feedback\\\":null}\",\n \"refusal\": null,\n \"annotations\": []\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 347,\n \"completion_tokens\": 9,\n \"total_tokens\": 356,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"service_tier\": \"default\",\n \"system_fingerprint\": \"fp_cbf1785567\"\n}\n"
headers:
CF-RAY:
- CF-RAY-XXX
- REDACTED-RAY
Connection:
- keep-alive
Content-Type:
- application/json
Date:
- Thu, 15 Jan 2026 03:06:03 GMT
- Wed, 05 Nov 2025 22:22:17 GMT
Server:
- cloudflare
Strict-Transport-Security:
- STS-XXX
- max-age=31536000; includeSubDomains; preload
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- X-CONTENT-TYPE-XXX
- nosniff
access-control-expose-headers:
- ACCESS-CONTROL-XXX
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
content-length:
- '837'
openai-organization:
- OPENAI-ORG-XXX
- user-hortuttj2f3qtmxyik2zxf4q
openai-processing-ms:
- '413'
- '1081'
openai-project:
- OPENAI-PROJECT-XXX
- proj_fL4UBWR1CMpAAdgzaSKqsVvA
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '650'
- '1241'
x-openai-proxy-wasm:
- v0.1
x-ratelimit-limit-requests:
- X-RATELIMIT-LIMIT-REQUESTS-XXX
- '500'
x-ratelimit-limit-tokens:
- X-RATELIMIT-LIMIT-TOKENS-XXX
- '30000'
x-ratelimit-remaining-requests:
- X-RATELIMIT-REMAINING-REQUESTS-XXX
- '499'
x-ratelimit-remaining-tokens:
- X-RATELIMIT-REMAINING-TOKENS-XXX
- '29478'
x-ratelimit-reset-requests:
- X-RATELIMIT-RESET-REQUESTS-XXX
- 120ms
x-ratelimit-reset-tokens:
- X-RATELIMIT-RESET-TOKENS-XXX
- 1.042s
x-request-id:
- X-REQUEST-ID-XXX
- req_REDACTED
status:
code: 200
message: OK

View File

@@ -728,3 +728,39 @@ def test_google_streaming_returns_usage_metrics():
assert result.token_usage.prompt_tokens > 0
assert result.token_usage.completion_tokens > 0
assert result.token_usage.successful_requests >= 1
@pytest.mark.vcr()
def test_google_express_mode_works() -> None:
"""
    Test Google Vertex AI Express mode (aiplatform.googleapis.com) with API key
    authentication.
"""
with patch.dict(os.environ, {"GOOGLE_GENAI_USE_VERTEXAI": "true"}):
agent = Agent(
role="Research Assistant",
goal="Find information about the capital of Japan",
backstory="You are a helpful research assistant.",
llm=LLM(
model="gemini/gemini-2.0-flash-exp",
),
verbose=True,
)
task = Task(
description="What is the capital of Japan?",
expected_output="The capital of Japan",
agent=agent,
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
assert result.token_usage is not None
assert result.token_usage.total_tokens > 0
assert result.token_usage.prompt_tokens > 0
assert result.token_usage.completion_tokens > 0
assert result.token_usage.successful_requests >= 1

View File

@@ -1202,9 +1202,8 @@ def test_complex_and_or_branching():
)
assert execution_order.index("branch_2b") > min_branch_1_index
# Final should be after both 2a and 2b
# Note: final may not be absolutely last due to independent branches (like branch_1c)
# that don't contribute to the final result path with sequential listener execution
# Final should be last and after both 2a and 2b
assert execution_order[-1] == "final"
assert execution_order.index("final") > execution_order.index("branch_2a")
assert execution_order.index("final") > execution_order.index("branch_2b")

View File

@@ -185,8 +185,8 @@ def test_task_guardrail_process_output(task_output):
result = guardrail(task_output)
assert result[0] is False
# Check that feedback is provided (wording varies by LLM)
assert result[1] and len(result[1]) > 0
assert result[1] == "The task result contains more than 10 words, violating the guardrail. The text provided contains about 21 words."
guardrail = LLMGuardrail(
description="Ensure the result has less than 500 words", llm=LLM(model="gpt-4o")

View File

@@ -348,11 +348,11 @@ def test_agent_emits_execution_error_event(base_agent, base_task):
error_message = "Error happening while sending prompt to model."
base_agent.max_retry_limit = 0
# Patch at the class level since agent_executor is created lazily
with patch.object(
CrewAgentExecutor, "invoke", side_effect=Exception(error_message)
):
CrewAgentExecutor, "invoke", wraps=base_agent.agent_executor.invoke
) as invoke_mock:
invoke_mock.side_effect = Exception(error_message)
with pytest.raises(Exception): # noqa: B017
base_agent.execute_task(
task=base_task,