Mirror of https://github.com/crewAIInc/crewAI.git, synced 2026-01-17 12:08:30 +00:00

Compare commits: lorenze/fi ... devin/1768 (2 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 9b47afbee0 | |
| | 1dd566311e | |
@@ -375,13 +375,10 @@ In this section, you'll find detailed examples that help you select, configure,
 GOOGLE_API_KEY=<your-api-key>
 GEMINI_API_KEY=<your-api-key>
 
-# For Vertex AI Express mode (API key authentication)
-GOOGLE_GENAI_USE_VERTEXAI=true
-GOOGLE_API_KEY=<your-api-key>
-
-# For Vertex AI with service account
+# Optional - for Vertex AI
 GOOGLE_CLOUD_PROJECT=<your-project-id>
 GOOGLE_CLOUD_LOCATION=<location> # Defaults to us-central1
+GOOGLE_GENAI_USE_VERTEXAI=true # Set to use Vertex AI
 ```
 
 **Basic Usage:**
@@ -415,35 +412,7 @@ In this section, you'll find detailed examples that help you select, configure,
 )
 ```
 
-**Vertex AI Express Mode (API Key Authentication):**
-
-Vertex AI Express mode allows you to use Vertex AI with simple API key authentication instead of service account credentials. This is the quickest way to get started with Vertex AI.
-
-To enable Express mode, set both environment variables in your `.env` file:
-```toml .env
-GOOGLE_GENAI_USE_VERTEXAI=true
-GOOGLE_API_KEY=<your-api-key>
-```
-
-Then use the LLM as usual:
-```python Code
-from crewai import LLM
-
-llm = LLM(
-    model="gemini/gemini-2.0-flash",
-    temperature=0.7
-)
-```
-
-<Info>
-To get an Express mode API key:
-- New Google Cloud users: Get an [express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
-- Existing Google Cloud users: Get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
-
-For more details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
-</Info>
-
-**Vertex AI Configuration (Service Account):**
+**Vertex AI Configuration:**
 ```python Code
 from crewai import LLM
 
@@ -455,10 +424,10 @@ In this section, you'll find detailed examples that help you select, configure,
 ```
 
 **Supported Environment Variables:**
-- `GOOGLE_API_KEY` or `GEMINI_API_KEY`: Your Google API key (required for Gemini API and Vertex AI Express mode)
-- `GOOGLE_GENAI_USE_VERTEXAI`: Set to `true` to use Vertex AI (required for Express mode)
-- `GOOGLE_CLOUD_PROJECT`: Google Cloud project ID (for Vertex AI with service account)
+- `GOOGLE_API_KEY` or `GEMINI_API_KEY`: Your Google API key (required for Gemini API)
+- `GOOGLE_CLOUD_PROJECT`: Google Cloud project ID (for Vertex AI)
 - `GOOGLE_CLOUD_LOCATION`: GCP location (defaults to `us-central1`)
+- `GOOGLE_GENAI_USE_VERTEXAI`: Set to `true` to use Vertex AI
 
 **Features:**
 - Native function calling support for Gemini 1.5+ and 2.x models
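A minimal sketch tying the variables above together: the endpoint is selected purely through the environment, and the `LLM` call itself stays the same (values are placeholders):

```python
import os

from crewai import LLM

# Gemini API: an API key alone is enough.
os.environ["GEMINI_API_KEY"] = "<your-api-key>"

# Vertex AI instead: comment out the line above, authenticate with ADC
# (gcloud auth application-default login), then set:
# os.environ["GOOGLE_CLOUD_PROJECT"] = "<your-project-id>"
# os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "true"

llm = LLM(
    model="gemini/gemini-2.0-flash",
    temperature=0.7,
)
```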
@@ -107,7 +107,7 @@ There are several places in CrewAI code where you can specify the model to use
 
 ## Provider Configuration Examples
 
 CrewAI supports a wide range of LLM providers, each offering unique features, authentication methods, and model capabilities.
 In this section, you'll find detailed examples that help you select, configure, and optimize the LLM best suited to your project's needs.
 
 <AccordionGroup>
@@ -153,8 +153,8 @@ CrewAI supports a wide range of LLM providers, each offering unique features, authentication methods, and model capabilities.
 </Accordion>
 
 <Accordion title="Meta-Llama">
 Meta's Llama API provides access to Meta's family of large language models.
 The API is available at [Meta Llama API](https://llama.developer.meta.com?utm_source=partner-crewai&utm_medium=website).
 Set the following environment variables in your `.env` file:
 
 ```toml Code
@@ -207,20 +207,11 @@ CrewAI supports a wide range of LLM providers, each offering unique features, authentication methods, and model capabilities.
 Set your API key in your `.env` file. If you need a key, or need to find an existing one, check [AI Studio](https://aistudio.google.com/apikey).
 
 ```toml .env
 # When using the Gemini API (one of the following)
 GOOGLE_API_KEY=<your-api-key>
 # https://ai.google.dev/gemini-api/docs/api-key
 GEMINI_API_KEY=<your-api-key>
 
-# When using Vertex AI Express mode (API key authentication)
-GOOGLE_GENAI_USE_VERTEXAI=true
-GOOGLE_API_KEY=<your-api-key>
-
 # When using Vertex AI with a service account
 GOOGLE_CLOUD_PROJECT=<your-project-id>
 GOOGLE_CLOUD_LOCATION=<location> # Default: us-central1
 ```
 
 **Basic Usage:**
 Example usage in a CrewAI project:
 ```python Code
 from crewai import LLM
@@ -230,34 +221,6 @@ CrewAI supports a wide range of LLM providers, each offering unique features, authentication methods, and model capabilities.
 )
 ```
 
-**Vertex AI Express Mode (API Key Authentication):**
-
-Vertex AI Express mode lets you use Vertex AI with simple API key authentication instead of service account credentials. It is the quickest way to get started with Vertex AI.
-
-To enable Express mode, set both environment variables in your `.env` file:
-```toml .env
-GOOGLE_GENAI_USE_VERTEXAI=true
-GOOGLE_API_KEY=<your-api-key>
-```
-
-Then use the LLM as usual:
-```python Code
-from crewai import LLM
-
-llm = LLM(
-    model="gemini/gemini-2.0-flash",
-    temperature=0.7
-)
-```
-
-<Info>
-To get an Express mode API key:
-- New Google Cloud users: get an [Express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
-- Existing Google Cloud users: get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
-
-For details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
-</Info>
 
 ### Gemini Models
 
 Google offers a variety of powerful models optimized for different use cases.
@@ -513,7 +476,7 @@ CrewAI supports a wide range of LLM providers, each offering unique features, authentication methods, and model capabilities.
 
 <Accordion title="Local NVIDIA NIM Deployed using WSL2">
 
 NVIDIA NIM lets you run powerful LLMs locally on Windows machines via WSL2 (Windows Subsystem for Linux).
 This approach leverages your NVIDIA GPU for private, secure, and cost-effective AI inference without relying on cloud services.
 Ideal for development, testing, or production environments that require data privacy or offline capability.
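As a minimal sketch of the setup this accordion describes: a NIM container started inside WSL2 exposes an OpenAI-compatible endpoint that CrewAI can point at via `base_url`. The port and model name below are illustrative assumptions, not from the original docs:

```python
from crewai import LLM

# Hypothetical local NIM endpoint served from WSL2; adjust the port and
# model name to whatever your NIM container actually exposes.
llm = LLM(
    model="openai/meta/llama-3.1-8b-instruct",
    base_url="http://localhost:8000/v1",
    api_key="not-needed",  # local NIM endpoints typically ignore the key
)
```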
@@ -991,4 +954,4 @@ Learn how to get the most out of your LLM configuration:
 llm = LLM(model="openai/gpt-4o") # 128K tokens
 ```
 </Tab>
 </Tabs>
 </Tabs>
@@ -79,7 +79,7 @@ There are different places in CrewAI code where you can specify the model to be used
 
 # Advanced configuration with detailed parameters
 llm = LLM(
     model="openai/gpt-4",
     temperature=0.8,
     max_tokens=150,
     top_p=0.9,
@@ -207,20 +207,11 @@ In this section, you'll find detailed examples that help you select, configure, and optimize the LLM
 Set your API key in your `.env` file. If you need a key, or need to find an existing one, check [AI Studio](https://aistudio.google.com/apikey).
 
 ```toml .env
 # For the Gemini API (one of the following)
 GOOGLE_API_KEY=<your-api-key>
 # https://ai.google.dev/gemini-api/docs/api-key
 GEMINI_API_KEY=<your-api-key>
 
-# For Vertex AI Express mode (API key authentication)
-GOOGLE_GENAI_USE_VERTEXAI=true
-GOOGLE_API_KEY=<your-api-key>
-
 # For Vertex AI with a service account
 GOOGLE_CLOUD_PROJECT=<your-project-id>
 GOOGLE_CLOUD_LOCATION=<location> # Default: us-central1
 ```
 
 **Basic Usage:**
 Example usage in your CrewAI project:
 ```python Code
 from crewai import LLM
@@ -230,34 +221,6 @@ In this section, you'll find detailed examples that help you select, configure, and optimize the LLM
 )
 ```
 
-**Vertex AI Express Mode (API Key Authentication):**
-
-Vertex AI Express mode lets you use Vertex AI with simple API key authentication instead of service account credentials. It is the quickest way to get started with Vertex AI.
-
-To enable Express mode, set both environment variables in your `.env` file:
-```toml .env
-GOOGLE_GENAI_USE_VERTEXAI=true
-GOOGLE_API_KEY=<your-api-key>
-```
-
-Then use the LLM normally:
-```python Code
-from crewai import LLM
-
-llm = LLM(
-    model="gemini/gemini-2.0-flash",
-    temperature=0.7
-)
-```
-
-<Info>
-To get an Express mode API key:
-- New Google Cloud users: get an [Express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
-- Existing Google Cloud users: get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
-
-For more details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
-</Info>
 
 ### Gemini Models
 
 Google offers a variety of powerful models optimized for different use cases.
@@ -860,7 +823,7 @@ Learn how to get the most out of your LLM configuration:
 Remember to regularly monitor token usage and adjust your settings to optimize cost and performance.
 </Info>
 </Accordion>
 
 <Accordion title="Drop Additional Parameters">
 CrewAI uses LiteLLM internally for LLM calls, which lets you drop additional parameters that are unnecessary for your use case. This can simplify your code and reduce the complexity of your LLM configuration.
 For example, if you don't need to send the <code>stop</code> parameter, you can simply omit it from your LLM call, as sketched below:
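A minimal sketch of the omission described above; `additional_drop_params` is the knob CrewAI's docs pair with this accordion for parameters a provider rejects (treat the exact argument name as an assumption if your version differs):

```python
from crewai import LLM

# Drop the `stop` parameter from the underlying LiteLLM call entirely.
llm = LLM(
    model="openai/gpt-4o",
    additional_drop_params=["stop"],
)
```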
@@ -919,4 +882,4 @@ Learn how to get the most out of your LLM configuration:
 llm = LLM(model="openai/gpt-4o") # 128K tokens
 ```
 </Tab>
 </Tabs>
 </Tabs>
@@ -219,6 +219,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
             Final answer from the agent.
         """
         formatted_answer = None
+        last_raw_output: str | None = None
         while not isinstance(formatted_answer, AgentFinish):
             try:
                 if has_reached_max_iterations(self.iterations, self.max_iter):
@@ -244,6 +245,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                         response_model=self.response_model,
                         executor_context=self,
                     )
+                    last_raw_output = answer
                     if self.response_model is not None:
                         try:
                             self.response_model.model_validate_json(answer)
@@ -300,6 +302,8 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                     iterations=self.iterations,
                     log_error_after=self.log_error_after,
                     printer=self._printer,
+                    raw_output=last_raw_output,
+                    agent_role=self.agent.role if self.agent else None,
                 )
 
             except Exception as e:
@@ -386,6 +390,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
             Final answer from the agent.
         """
         formatted_answer = None
+        last_raw_output: str | None = None
         while not isinstance(formatted_answer, AgentFinish):
             try:
                 if has_reached_max_iterations(self.iterations, self.max_iter):
@@ -411,6 +416,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                         response_model=self.response_model,
                         executor_context=self,
                     )
+                    last_raw_output = answer
 
                     if self.response_model is not None:
                         try:
@@ -467,6 +473,8 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                     iterations=self.iterations,
                     log_error_after=self.log_error_after,
                     printer=self._printer,
+                    raw_output=last_raw_output,
+                    agent_role=self.agent.role if self.agent else None,
                 )
 
             except Exception as e:
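The same raw-output threading repeats in `LiteAgent` below. Its effect is easiest to see by invoking the handler directly, as the new tests at the end of this diff do; a minimal runnable sketch:

```python
from crewai.agents.parser import OutputParserError
from crewai.utilities.agent_utils import handle_output_parser_exception

messages: list[dict[str, str]] = []

# The executor now forwards the raw completion and the agent role so the
# handler can emit useful DEBUG logs before the retry.
action = handle_output_parser_exception(
    e=OutputParserError("Invalid Format: I missed the 'Action:' after 'Thought:'."),
    messages=messages,
    iterations=0,
    raw_output="Let me think about this...",  # new parameter
    agent_role="Researcher",                  # new parameter
)
print(action.text)  # the error text, fed back to the LLM as a user message
```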
@@ -533,6 +533,7 @@ class LiteAgent(FlowTrackable, BaseModel):
         """
         # Execute the agent loop
         formatted_answer: AgentAction | AgentFinish | None = None
+        last_raw_output: str | None = None
         while not isinstance(formatted_answer, AgentFinish):
             try:
                 if has_reached_max_iterations(self._iterations, self.max_iterations):
@@ -556,6 +557,7 @@ class LiteAgent(FlowTrackable, BaseModel):
                         from_agent=self,
                         executor_context=self,
                     )
+                    last_raw_output = answer
 
             except Exception as e:
                 raise e
@@ -594,6 +596,8 @@ class LiteAgent(FlowTrackable, BaseModel):
                     iterations=self._iterations,
                     log_error_after=3,
                     printer=self._printer,
+                    raw_output=last_raw_output,
+                    agent_role=self.role,
                 )
 
             except Exception as e:
@@ -54,21 +54,15 @@ class GeminiCompletion(BaseLLM):
         safety_settings: dict[str, Any] | None = None,
         client_params: dict[str, Any] | None = None,
         interceptor: BaseInterceptor[Any, Any] | None = None,
-        use_vertexai: bool | None = None,
         **kwargs: Any,
     ):
         """Initialize Google Gemini chat completion client.
 
         Args:
             model: Gemini model name (e.g., 'gemini-2.0-flash-001', 'gemini-1.5-pro')
-            api_key: Google API key for Gemini API authentication.
-                Defaults to GOOGLE_API_KEY or GEMINI_API_KEY env var.
-                NOTE: Cannot be used with Vertex AI (project parameter). Use Gemini API instead.
-            project: Google Cloud project ID for Vertex AI with ADC authentication.
-                Requires Application Default Credentials (gcloud auth application-default login).
-                NOTE: Vertex AI does NOT support API keys, only OAuth2/ADC.
-                If both api_key and project are set, api_key takes precedence.
-            location: Google Cloud location (for Vertex AI with ADC, defaults to 'us-central1')
+            api_key: Google API key (defaults to GOOGLE_API_KEY or GEMINI_API_KEY env var)
+            project: Google Cloud project ID (for Vertex AI)
+            location: Google Cloud location (for Vertex AI, defaults to 'us-central1')
             temperature: Sampling temperature (0-2)
             top_p: Nucleus sampling parameter
             top_k: Top-k sampling parameter
@@ -79,12 +73,6 @@ class GeminiCompletion(BaseLLM):
             client_params: Additional parameters to pass to the Google Gen AI Client constructor.
                 Supports parameters like http_options, credentials, debug_config, etc.
             interceptor: HTTP interceptor (not yet supported for Gemini).
-            use_vertexai: Whether to use Vertex AI instead of Gemini API.
-                - True: Use Vertex AI (with ADC or Express mode with API key)
-                - False: Use Gemini API (explicitly override env var)
-                - None (default): Check GOOGLE_GENAI_USE_VERTEXAI env var
-                When using Vertex AI with API key (Express mode), http_options with
-                api_version="v1" is automatically configured.
             **kwargs: Additional parameters
         """
         if interceptor is not None:
@@ -107,8 +95,7 @@ class GeminiCompletion(BaseLLM):
         self.project = project or os.getenv("GOOGLE_CLOUD_PROJECT")
         self.location = location or os.getenv("GOOGLE_CLOUD_LOCATION") or "us-central1"
 
-        if use_vertexai is None:
-            use_vertexai = os.getenv("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true"
+        use_vertexai = os.getenv("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true"
 
         self.client = self._initialize_client(use_vertexai)
 
@@ -159,34 +146,13 @@ class GeminiCompletion(BaseLLM):
 
         Returns:
             Initialized Google Gen AI Client
-
-        Note:
-            Google Gen AI SDK has two distinct endpoints with different auth requirements:
-            - Gemini API (generativelanguage.googleapis.com): Supports API key authentication
-            - Vertex AI (aiplatform.googleapis.com): Only supports OAuth2/ADC, NO API keys
-
-            When vertexai=True is set, it routes to aiplatform.googleapis.com which rejects
-            API keys. Use Gemini API endpoint for API key authentication instead.
         """
         client_params = {}
 
         if self.client_params:
             client_params.update(self.client_params)
 
-        # Determine authentication mode based on available credentials
-        has_api_key = bool(self.api_key)
-        has_project = bool(self.project)
-
-        if has_api_key and has_project:
-            logging.warning(
-                "Both API key and project provided. Using API key authentication. "
-                "Project/location parameters are ignored when using API keys. "
-                "To use Vertex AI with ADC, remove the api_key parameter."
-            )
-            has_project = False
-
-        # Vertex AI with ADC (project without API key)
-        if (use_vertexai or has_project) and not has_api_key:
+        if use_vertexai or self.project:
             client_params.update(
                 {
                     "vertexai": True,
@@ -195,20 +161,12 @@ class GeminiCompletion(BaseLLM):
                 }
             )
 
-        # API key authentication (works with both Gemini API and Vertex AI Express)
-        elif has_api_key:
-            client_params.pop("api_key", None)
+        elif self.api_key:
             client_params["api_key"] = self.api_key
 
-            # Vertex AI Express mode: API key + vertexai=True + http_options with api_version="v1"
-            # See: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey
-            if use_vertexai:
-                client_params["vertexai"] = True
-                client_params["http_options"] = types.HttpOptions(api_version="v1")
-            else:
-                # This ensures we use the Gemini API (generativelanguage.googleapis.com)
-                client_params["vertexai"] = False
-
-            # Clean up project/location (not allowed with API key)
+            client_params.pop("vertexai", None)
             client_params.pop("project", None)
             client_params.pop("location", None)
 
@@ -217,13 +175,10 @@ class GeminiCompletion(BaseLLM):
             return genai.Client(**client_params)
         except Exception as e:
             raise ValueError(
-                "Authentication required. Provide one of:\n"
-                "  1. API key via GOOGLE_API_KEY or GEMINI_API_KEY environment variable\n"
-                "     (use_vertexai=True is optional for Vertex AI with API key)\n"
-                "  2. For Vertex AI with ADC: Set GOOGLE_CLOUD_PROJECT and run:\n"
-                "     gcloud auth application-default login\n"
-                "  3. Pass api_key parameter directly to LLM constructor\n"
+                "Either GOOGLE_API_KEY/GEMINI_API_KEY (for Gemini API) or "
+                "GOOGLE_CLOUD_PROJECT (for Vertex AI) must be set"
             ) from e
 
         return genai.Client(**client_params)
 
     def _get_client_params(self) -> dict[str, Any]:
@@ -247,8 +202,6 @@ class GeminiCompletion(BaseLLM):
                     "location": self.location,
                 }
             )
-        if self.api_key:
-            params["api_key"] = self.api_key
+        elif self.api_key:
+            params["api_key"] = self.api_key
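Condensed, the post-change client selection reduces to the sketch below. This is a paraphrase of the diff, not the module itself; the `genai.Client` keyword arguments are as documented by the google-genai SDK:

```python
import os

from google import genai

# Vertex AI wins if GOOGLE_GENAI_USE_VERTEXAI=true or a project is set;
# otherwise the API key is sent to the Gemini API endpoint.
use_vertexai = os.getenv("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true"
api_key = os.getenv("GOOGLE_API_KEY") or os.getenv("GEMINI_API_KEY")

if use_vertexai or os.getenv("GOOGLE_CLOUD_PROJECT"):
    client = genai.Client(
        vertexai=True,
        project=os.getenv("GOOGLE_CLOUD_PROJECT"),
        location=os.getenv("GOOGLE_CLOUD_LOCATION", "us-central1"),
    )
elif api_key:
    client = genai.Client(api_key=api_key)
else:
    raise ValueError(
        "Either GOOGLE_API_KEY/GEMINI_API_KEY (for Gemini API) or "
        "GOOGLE_CLOUD_PROJECT (for Vertex AI) must be set"
    )
```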
@@ -2,6 +2,7 @@ from __future__ import annotations
 
 from collections.abc import Callable, Sequence
 import json
+import logging
 import re
 from typing import TYPE_CHECKING, Any, Final, Literal, TypedDict
 
@@ -51,6 +52,8 @@ class SummaryContent(TypedDict):
 
 console = Console()
 
+logger = logging.getLogger(__name__)
+
 _MULTIPLE_NEWLINES: Final[re.Pattern[str]] = re.compile(r"\n+")
 
 
@@ -430,6 +433,8 @@ def handle_output_parser_exception(
     iterations: int,
     log_error_after: int = 3,
     printer: Printer | None = None,
+    raw_output: str | None = None,
+    agent_role: str | None = None,
 ) -> AgentAction:
     """Handle OutputParserError by updating messages and formatted_answer.
 
@@ -439,6 +444,8 @@ def handle_output_parser_exception(
         iterations: Current iteration count
         log_error_after: Number of iterations after which to log errors
         printer: Optional printer instance for logging
+        raw_output: The raw LLM output that failed to parse
+        agent_role: The role of the agent for logging context
 
     Returns:
         AgentAction: A formatted answer with the error
@@ -452,6 +459,27 @@ def handle_output_parser_exception(
         thought="",
     )
 
+    retry_count = iterations + 1
+    agent_context = f" for agent '{agent_role}'" if agent_role else ""
+
+    logger.debug(
+        "Parse failed%s: %s",
+        agent_context,
+        e.error.split("\n")[0],
+    )
+
+    if raw_output is not None:
+        truncated_output = (
+            raw_output[:500] + "..." if len(raw_output) > 500 else raw_output
+        )
+        logger.debug(
+            "Raw output (truncated)%s: %s",
+            agent_context,
+            truncated_output.replace("\n", "\\n"),
+        )
+
+    logger.debug("Retry %d initiated%s", retry_count, agent_context)
+
     if iterations > log_error_after and printer:
         printer.print(
             content=f"Error parsing LLM output, agent will retry: {e.error}",
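The new messages go to the module logger at DEBUG level, so they are silent by default. A minimal sketch for surfacing them, assuming the module path `crewai.utilities.agent_utils` (the path the new tests import from):

```python
import logging

# Show DEBUG records from the parser-retry path only, keeping the rest of
# the application at INFO.
logging.basicConfig(level=logging.INFO)
logging.getLogger("crewai.utilities.agent_utils").setLevel(logging.DEBUG)
# Now "Parse failed...", "Raw output (truncated)..." and "Retry N initiated..."
# appear whenever an OutputParserError triggers a retry.
```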
@@ -1,75 +0,0 @@
interactions:
- request:
    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: What is the capital
      of Japan?\n\nThis is the expected criteria for your final answer: The capital
      of Japan\nyou MUST return the actual complete content as the final answer, not
      a summary.\n\nBegin! This is VERY important to you, use the tools available
      and give your best Final Answer, your job depends on it!\n\nThought:"}], "role":
      "user"}], "systemInstruction": {"parts": [{"text": "You are Research Assistant.
      You are a helpful research assistant.\nYour personal goal is: Find information
      about the capital of Japan\nTo give my best complete final answer to the task
      respond using the exact following format:\n\nThought: I now can give a great
      answer\nFinal Answer: Your final answer must be the great and the most complete
      as possible, it must be outcome described.\n\nI MUST use these formats, my job
      depends on it!"}], "role": "user"}, "generationConfig": {"stopSequences": ["\nObservation:"]}}'
    headers:
      User-Agent:
      - X-USER-AGENT-XXX
      accept:
      - '*/*'
      accept-encoding:
      - ACCEPT-ENCODING-XXX
      connection:
      - keep-alive
      content-length:
      - '952'
      content-type:
      - application/json
      host:
      - aiplatform.googleapis.com
      x-goog-api-client:
      - google-genai-sdk/1.59.0 gl-python/3.13.3
      x-goog-api-key:
      - X-GOOG-API-KEY-XXX
    method: POST
    uri: https://aiplatform.googleapis.com/v1/publishers/google/models/gemini-2.0-flash-exp:generateContent
  response:
    body:
      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"role\":
        \"model\",\n        \"parts\": [\n          {\n            \"text\": \"The
        capital of Japan is Tokyo.\\nFinal Answer: Tokyo\\n\"\n          }\n        ]\n
        \      },\n      \"finishReason\": \"STOP\",\n      \"avgLogprobs\": -0.017845841554495003\n
        \    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\": 163,\n    \"candidatesTokenCount\":
        13,\n    \"totalTokenCount\": 176,\n    \"trafficType\": \"ON_DEMAND\",\n
        \   \"promptTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
        \       \"tokenCount\": 163\n      }\n    ],\n    \"candidatesTokensDetails\":
        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 13\n
        \     }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-exp\",\n  \"createTime\":
        \"2026-01-15T22:27:38.066749Z\",\n  \"responseId\": \"2mlpab2JBNOFidsPh5GigQs\"\n}\n"
    headers:
      Alt-Svc:
      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
      Content-Type:
      - application/json; charset=UTF-8
      Date:
      - Thu, 15 Jan 2026 22:27:38 GMT
      Server:
      - scaffolding on HTTPServer2
      Transfer-Encoding:
      - chunked
      Vary:
      - Origin
      - X-Origin
      - Referer
      X-Content-Type-Options:
      - X-CONTENT-TYPE-XXX
      X-Frame-Options:
      - X-FRAME-OPTIONS-XXX
      X-XSS-Protection:
      - '0'
      content-length:
      - '786'
    status:
      code: 200
      message: OK
version: 1
@@ -728,39 +728,3 @@ def test_google_streaming_returns_usage_metrics():
     assert result.token_usage.prompt_tokens > 0
     assert result.token_usage.completion_tokens > 0
     assert result.token_usage.successful_requests >= 1
-
-
-@pytest.mark.vcr()
-def test_google_express_mode_works() -> None:
-    """
-    Test Google Vertex AI Express mode with API key authentication.
-    This tests Vertex AI Express mode (aiplatform.googleapis.com) with API key
-    authentication.
-
-    """
-    with patch.dict(os.environ, {"GOOGLE_GENAI_USE_VERTEXAI": "true"}):
-        agent = Agent(
-            role="Research Assistant",
-            goal="Find information about the capital of Japan",
-            backstory="You are a helpful research assistant.",
-            llm=LLM(
-                model="gemini/gemini-2.0-flash-exp",
-            ),
-            verbose=True,
-        )
-
-        task = Task(
-            description="What is the capital of Japan?",
-            expected_output="The capital of Japan",
-            agent=agent,
-        )
-
-        crew = Crew(agents=[agent], tasks=[task])
-        result = crew.kickoff()
-
-        assert result.token_usage is not None
-        assert result.token_usage.total_tokens > 0
-        assert result.token_usage.prompt_tokens > 0
-        assert result.token_usage.completion_tokens > 0
-        assert result.token_usage.successful_requests >= 1
lib/crewai/tests/utilities/test_agent_utils.py (new file, 240 lines)
@@ -0,0 +1,240 @@
"""Tests for agent_utils module, specifically debug logging for OutputParserError."""

import logging
from unittest.mock import MagicMock

import pytest

from crewai.agents.parser import AgentAction, OutputParserError
from crewai.utilities.agent_utils import handle_output_parser_exception


class TestHandleOutputParserExceptionDebugLogging:
    """Tests for debug logging in handle_output_parser_exception."""

    def test_debug_logging_with_raw_output_and_agent_role(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that debug logging includes raw output and agent role when provided."""
        error = OutputParserError("Invalid Format: I missed the 'Action:' after 'Thought:'.")
        messages: list[dict[str, str]] = []
        raw_output = "Let me think about this... The answer is..."
        agent_role = "Researcher"

        with caplog.at_level(logging.DEBUG):
            result = handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=0,
                raw_output=raw_output,
                agent_role=agent_role,
            )

        assert isinstance(result, AgentAction)
        assert "Parse failed for agent 'Researcher'" in caplog.text
        assert "Raw output (truncated) for agent 'Researcher'" in caplog.text
        assert "Let me think about this... The answer is..." in caplog.text
        assert "Retry 1 initiated for agent 'Researcher'" in caplog.text

    def test_debug_logging_without_agent_role(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that debug logging works without agent role."""
        error = OutputParserError("Invalid Format: I missed the 'Action:' after 'Thought:'.")
        messages: list[dict[str, str]] = []
        raw_output = "Some raw output"

        with caplog.at_level(logging.DEBUG):
            result = handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=0,
                raw_output=raw_output,
            )

        assert isinstance(result, AgentAction)
        assert "Parse failed:" in caplog.text
        assert "for agent" not in caplog.text.split("Parse failed:")[1].split("\n")[0]
        assert "Raw output (truncated):" in caplog.text
        assert "Retry 1 initiated" in caplog.text

    def test_debug_logging_without_raw_output(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that debug logging works without raw output."""
        error = OutputParserError("Invalid Format: I missed the 'Action:' after 'Thought:'.")
        messages: list[dict[str, str]] = []

        with caplog.at_level(logging.DEBUG):
            result = handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=0,
                agent_role="Researcher",
            )

        assert isinstance(result, AgentAction)
        assert "Parse failed for agent 'Researcher'" in caplog.text
        assert "Raw output (truncated)" not in caplog.text
        assert "Retry 1 initiated for agent 'Researcher'" in caplog.text

    def test_debug_logging_truncates_long_raw_output(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that raw output is truncated when longer than 500 characters."""
        error = OutputParserError("Invalid Format")
        messages: list[dict[str, str]] = []
        long_output = "A" * 600

        with caplog.at_level(logging.DEBUG):
            handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=0,
                raw_output=long_output,
                agent_role="Researcher",
            )

        assert "A" * 500 + "..." in caplog.text
        assert "A" * 600 not in caplog.text

    def test_debug_logging_does_not_truncate_short_raw_output(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that short raw output is not truncated."""
        error = OutputParserError("Invalid Format")
        messages: list[dict[str, str]] = []
        short_output = "Short output"

        with caplog.at_level(logging.DEBUG):
            handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=0,
                raw_output=short_output,
                agent_role="Researcher",
            )

        assert "Short output" in caplog.text
        assert "..." not in caplog.text.split("Short output")[1].split("\n")[0]

    def test_debug_logging_retry_count_increments(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that retry count is correctly calculated from iterations."""
        error = OutputParserError("Invalid Format")
        messages: list[dict[str, str]] = []

        with caplog.at_level(logging.DEBUG):
            handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=4,
                raw_output="test",
                agent_role="Researcher",
            )

        assert "Retry 5 initiated" in caplog.text

    def test_debug_logging_escapes_newlines_in_raw_output(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that newlines in raw output are escaped for readability."""
        error = OutputParserError("Invalid Format")
        messages: list[dict[str, str]] = []
        output_with_newlines = "Line 1\nLine 2\nLine 3"

        with caplog.at_level(logging.DEBUG):
            handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=0,
                raw_output=output_with_newlines,
                agent_role="Researcher",
            )

        assert "Line 1\\nLine 2\\nLine 3" in caplog.text

    def test_debug_logging_extracts_first_line_of_error(self, caplog: pytest.LogCaptureFixture) -> None:
        """Test that only the first line of the error message is logged."""
        error = OutputParserError("First line of error\nSecond line\nThird line")
        messages: list[dict[str, str]] = []

        with caplog.at_level(logging.DEBUG):
            handle_output_parser_exception(
                e=error,
                messages=messages,
                iterations=0,
                agent_role="Researcher",
            )

        assert "First line of error" in caplog.text
        parse_failed_line = [line for line in caplog.text.split("\n") if "Parse failed" in line][0]
        assert "Second line" not in parse_failed_line

    def test_messages_updated_with_error(self) -> None:
        """Test that messages list is updated with the error."""
        error = OutputParserError("Test error message")
        messages: list[dict[str, str]] = []

        handle_output_parser_exception(
            e=error,
            messages=messages,
            iterations=0,
        )

        assert len(messages) == 1
        assert messages[0]["role"] == "user"
        assert messages[0]["content"] == "Test error message"

    def test_returns_agent_action_with_error_text(self) -> None:
        """Test that the function returns an AgentAction with the error text."""
        error = OutputParserError("Test error message")
        messages: list[dict[str, str]] = []

        result = handle_output_parser_exception(
            e=error,
            messages=messages,
            iterations=0,
        )

        assert isinstance(result, AgentAction)
        assert result.text == "Test error message"
        assert result.tool == ""
        assert result.tool_input == ""
        assert result.thought == ""

    def test_printer_logs_after_log_error_after_iterations(self) -> None:
        """Test that printer logs error after log_error_after iterations."""
        error = OutputParserError("Test error")
        messages: list[dict[str, str]] = []
        printer = MagicMock()

        handle_output_parser_exception(
            e=error,
            messages=messages,
            iterations=4,
            log_error_after=3,
            printer=printer,
        )

        printer.print.assert_called_once()
        call_args = printer.print.call_args
        assert "Error parsing LLM output" in call_args.kwargs["content"]
        assert call_args.kwargs["color"] == "red"

    def test_printer_does_not_log_before_log_error_after_iterations(self) -> None:
        """Test that printer does not log before log_error_after iterations."""
        error = OutputParserError("Test error")
        messages: list[dict[str, str]] = []
        printer = MagicMock()

        handle_output_parser_exception(
            e=error,
            messages=messages,
            iterations=2,
            log_error_after=3,
            printer=printer,
        )

        printer.print.assert_not_called()

    def test_backward_compatibility_without_new_parameters(self) -> None:
        """Test that the function works without the new optional parameters."""
        error = OutputParserError("Test error")
        messages: list[dict[str, str]] = []

        result = handle_output_parser_exception(
            e=error,
            messages=messages,
            iterations=0,
        )

        assert isinstance(result, AgentAction)
        assert len(messages) == 1