ci: add PR size and title checks, configure commitizen

Add two new GitHub Actions workflows:
- pr-size.yml: auto-labels PRs by size and fails CI on PRs over 500 lines
- pr-title.yml: enforces conventional commit format on PR titles

Configure commitizen in pyproject.toml with strict schema pattern matching for conventional commits.
chore: update changelog and version for v1.10.1a1
2026-02-28 08:48:15 +00:00 · 2026-02-27 12:43:55 -05:00 · 2026-02-27 09:58:48 -05:00 · 2026-02-27 09:44:47 -05:00 · 2026-02-27 07:35:03 -03:00 · 2026-02-27 01:43:33 -08:00
140 changed files with 14366 additions and 5564 deletions
--- a/.env.test
+++ b/.env.test
@@ -21,7 +21,6 @@ OPENROUTER_API_KEY=fake-openrouter-key
 AWS_ACCESS_KEY_ID=fake-aws-access-key
 AWS_SECRET_ACCESS_KEY=fake-aws-secret-key
 AWS_DEFAULT_REGION=us-east-1
-AWS_REGION_NAME=us-east-1

 # -----------------------------------------------------------------------------
 # Azure OpenAI Configuration
--- a/.github/workflows/pr-size.yml
+++ b/.github/workflows/pr-size.yml
@@ -0,0 +1,37 @@
+name: PR Size Check
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened]
+
+jobs:
+  pr-size:
+    runs-on: ubuntu-latest
+    permissions:
+      pull-requests: write
+    steps:
+      - uses: codelytv/pr-size-labeler@v1
+        with:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          xs_label: "size/XS"
+          xs_max_size: 25
+          s_label: "size/S"
+          s_max_size: 100
+          m_label: "size/M"
+          m_max_size: 250
+          l_label: "size/L"
+          l_max_size: 500
+          xl_label: "size/XL"
+          fail_if_xl: true
+          message_if_xl: >
+            This PR exceeds 500 lines changed and has been labeled `size/XL`.
+            PRs of this size require release manager approval to merge.
+            Please consider splitting into smaller PRs, or add a justification
+            in the PR description for why this cannot be broken up.
+          files_to_ignore: |
+            uv.lock
+            *.lock
+            lib/crewai/src/crewai/cli/templates/**
+            **/*.json
+            **/test_durations/**
+            **/cassettes/**
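Review note: the thresholds configured in pr-size.yml can be read as the bucketing below. This is a minimal sketch, not the labeler action's own code. `size_label` and `fails_ci` are hypothetical helpers, and treating each `*_max_size` as an inclusive upper bound is an assumption drawn from the message text ("exceeds 500 lines"); the action's exact boundary handling is not shown in this diff.

```python
# Hypothetical sketch of the size buckets configured in pr-size.yml.
# Assumption: each *_max_size value is an inclusive upper bound.
THRESHOLDS = [
    (25, "size/XS"),
    (100, "size/S"),
    (250, "size/M"),
    (500, "size/L"),
]


def size_label(lines_changed: int) -> str:
    """Return the size label for a PR with the given number of changed lines."""
    for max_size, label in THRESHOLDS:
        if lines_changed <= max_size:
            return label
    # Anything above the largest threshold falls through to XL.
    return "size/XL"


def fails_ci(lines_changed: int) -> bool:
    """Mirror `fail_if_xl: true`: only XL-sized PRs fail the check."""
    return size_label(lines_changed) == "size/XL"
```

Paths listed under `files_to_ignore` (lock files, JSON, cassettes, CLI templates) are meant to be excluded from the count, so `lines_changed` here stands for the total after those exclusions.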
--- a/.github/workflows/pr-title.yml
+++ b/.github/workflows/pr-title.yml
@@ -0,0 +1,38 @@
+name: PR Title Check
+
+on:
+  pull_request:
+    types: [opened, edited, synchronize, reopened]
+
+jobs:
+  pr-title:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: amannn/action-semantic-pull-request@v5
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        with:
+          types: |
+            feat
+            fix
+            refactor
+            perf
+            test
+            docs
+            chore
+            ci
+            style
+            revert
+          requireScope: false
+          subjectPattern: ^[a-z].+[^.]$
+          subjectPatternError: >
+            The PR title "{title}" does not follow conventional commit format.
+
+            Expected: <type>(<scope>): <lowercase description without trailing period>
+
+            Examples:
+              feat(memory): add lancedb storage backend
+              fix(agents): resolve deadlock in concurrent execution
+              chore(deps): bump pydantic to 2.11.9
+
+            See RELEASE_PROCESS.md for the full commit message convention.
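Review note: the title rules in pr-title.yml can be tried locally with a few lines of Python. This is a sketch under assumptions: `SUBJECT_PATTERN` and the type list are copied from the workflow, but `HEADER` and `title_is_valid` are hypothetical and simplified (for example, they ignore `!` breaking-change markers), so the action itself may accept or reject edge cases differently.

```python
import re

# subjectPattern copied from pr-title.yml: lowercase first character,
# at least one more character, and no trailing period.
SUBJECT_PATTERN = re.compile(r"^[a-z].+[^.]$")

# Types allowed by the workflow.
ALLOWED_TYPES = {
    "feat", "fix", "refactor", "perf", "test",
    "docs", "chore", "ci", "style", "revert",
}

# Hypothetical header regex for "<type>(<scope>): <subject>"; scope is optional,
# matching `requireScope: false`.
HEADER = re.compile(r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<subject>.+)$")


def title_is_valid(title: str) -> bool:
    """Check a PR title against the allowed types and the subject pattern."""
    match = HEADER.match(title)
    if match is None or match.group("type") not in ALLOWED_TYPES:
        return False
    return SUBJECT_PATTERN.match(match.group("subject")) is not None
```

Under these assumptions, `feat(memory): add lancedb storage backend` passes, while a capitalized subject or a trailing period is rejected.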
--- a/.github/workflows/publish.yml
+++ b/.github/workflows/publish.yml
@@ -1,8 +1,6 @@
 name: Publish to PyPI

 on:
-  repository_dispatch:
-    types: [deployment-tests-passed]
  workflow_dispatch:
    inputs:
      release_tag:
@@ -20,11 +18,8 @@ jobs:
      - name: Determine release tag
        id: release
        run: |
-          # Priority: workflow_dispatch input > repository_dispatch payload > default branch
          if [ -n "${{ inputs.release_tag }}" ]; then
            echo "tag=${{ inputs.release_tag }}" >> $GITHUB_OUTPUT
-          elif [ -n "${{ github.event.client_payload.release_tag }}" ]; then
-            echo "tag=${{ github.event.client_payload.release_tag }}" >> $GITHUB_OUTPUT
          else
            echo "tag=" >> $GITHUB_OUTPUT
          fi
--- a/.github/workflows/trigger-deployment-tests.yml
+++ b/.github/workflows/trigger-deployment-tests.yml
@@ -1,18 +0,0 @@
-name: Trigger Deployment Tests
-
-on:
-  release:
-    types: [published]
-
-jobs:
-  trigger:
-    name: Trigger deployment tests
-    runs-on: ubuntu-latest
-    steps:
-      - name: Trigger deployment tests
-        uses: peter-evans/repository-dispatch@v3
-        with:
-          token: ${{ secrets.CREWAI_DEPLOYMENTS_PAT }}
-          repository: ${{ secrets.CREWAI_DEPLOYMENTS_REPOSITORY }}
-          event-type: crewai-release
-          client-payload: '{"release_tag": "${{ github.event.release.tag_name }}", "release_name": "${{ github.event.release.name }}"}'
--- a/docs/docs.json
+++ b/docs/docs.json
--- a/docs/en/changelog.mdx
+++ b/docs/en/changelog.mdx
@@ -4,6 +4,106 @@ description: "Product updates, improvements, and bug fixes for CrewAI"
 icon: "clock"
 mode: "wide"
 ---
+<Update label="Feb 27, 2026">
+  ## v1.10.1a1
+
+  [View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1a1)
+
+  ## What's Changed
+
+  ### Features
+  - Implement asynchronous invocation support in step callback methods
+  - Implement lazy loading for heavy dependencies in Memory module
+
+  ### Documentation
+  - Update changelog and version for v1.10.0
+
+  ### Refactoring
+  - Refactor step callback methods to support asynchronous invocation
+  - Refactor to implement lazy loading for heavy dependencies in Memory module
+
+  ### Bug Fixes
+  - Fix branch for release notes
+
+  ## Contributors
+
+  @greysonlalonde, @joaomdmoura
+
+</Update>
+
+<Update label="Feb 27, 2026">
+  ## v1.10.1a1
+
+  [View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1a1)
+
+  ## What's Changed
+
+  ### Refactoring
+  - Refactor step callback methods to support asynchronous invocation
+  - Implement lazy loading for heavy dependencies in Memory module
+
+  ### Documentation
+  - Update changelog and version for v1.10.0
+
+  ### Bug Fixes
+  - Make branch for release notes
+
+  ## Contributors
+
+  @greysonlalonde, @joaomdmoura
+
+</Update>
+
+<Update label="Feb 26, 2026">
+  ## v1.10.0
+
+  [View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.0)
+
+  ## What's Changed
+
+  ### Features
+  - Enhance MCP tool resolution and related events
+  - Update lancedb version and add lance-namespace packages
+  - Enhance JSON argument parsing and validation in CrewAgentExecutor and BaseTool
+  - Migrate CLI HTTP client from requests to httpx
+  - Add versioned documentation
+  - Add yanked detection for version notes
+  - Implement user input handling in Flows
+  - Enhance HITL self-loop functionality in human feedback integration tests
+  - Add started_event_id and set in eventbus
+  - Auto update tools.specs
+
+  ### Bug Fixes
+  - Validate tool kwargs even when empty to prevent cryptic TypeError
+  - Preserve null types in tool parameter schemas for LLM
+  - Map output_pydantic/output_json to native structured output
+  - Ensure callbacks are ran/awaited if promise
+  - Capture method name in exception context
+  - Preserve enum type in router result; improve types
+  - Fix cyclic flows silently breaking when persistence ID is passed in inputs
+  - Correct CLI flag format from --skip-provider to --skip_provider
+  - Ensure OpenAI tool call stream is finalized
+  - Resolve complex schema $ref pointers in MCP tools
+  - Enforce additionalProperties=false in schemas
+  - Reject reserved script names for crew folders
+  - Resolve race condition in guardrail event emission test
+
+  ### Documentation
+  - Add litellm dependency note for non-native LLM providers
+  - Clarify NL2SQL security model and hardening guidance
+  - Add 96 missing actions across 9 integrations
+
+  ### Refactoring
+  - Refactor crew to provider
+  - Extract HITL to provider pattern
+  - Improve hook typing and registration
+
+  ## Contributors
+
+  @dependabot[bot], @github-actions[bot], @github-code-quality[bot], @greysonlalonde, @heitorado, @hobostay, @joaomdmoura, @johnvan7, @jonathansampson, @lorenzejay, @lucasgomide, @mattatcha, @mplachta, @nicoferdi96, @theCyberTech, @thiagomoretto, @vinibrsl
+
+</Update>
+
 <Update label="Jan 26, 2026">
  ## v1.9.0

--- a/docs/en/concepts/llms.mdx
+++ b/docs/en/concepts/llms.mdx
@@ -106,6 +106,15 @@ There are different places in CrewAI code where you can specify the model to use
  </Tab>
 </Tabs>

+<Info>
+  CrewAI provides native SDK integrations for OpenAI, Anthropic, Google (Gemini API), Azure, and AWS Bedrock — no extra install needed beyond the provider-specific extras (e.g. `uv add "crewai[openai]"`).
+
+  All other providers are powered by **LiteLLM**. If you plan to use any of them, add it as a dependency to your project:
+  ```bash
+  uv add 'crewai[litellm]'
+  ```
+</Info>
+
 ## Provider Configuration Examples

 CrewAI supports a multitude of LLM providers, each offering unique features, authentication methods, and model capabilities.
@@ -275,6 +284,11 @@ In this section, you'll find detailed examples that help you select, configure,
    | `meta_llama/Llama-4-Maverick-17B-128E-Instruct-FP8` | 128k | 4028 | Text, Image | Text |
    | `meta_llama/Llama-3.3-70B-Instruct` | 128k | 4028 | Text | Text |
    | `meta_llama/Llama-3.3-8B-Instruct` | 128k | 4028 | Text | Text |
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Anthropic">
@@ -470,7 +484,7 @@ In this section, you'll find detailed examples that help you select, configure,
      To get an Express mode API key:
      - New Google Cloud users: Get an [express mode API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey)
      - Existing Google Cloud users: Get a [Google Cloud API key bound to a service account](https://cloud.google.com/docs/authentication/api-keys)
-      
+
      For more details, see the [Vertex AI Express mode documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/quickstart?usertype=apikey).
    </Info>

@@ -571,6 +585,11 @@ In this section, you'll find detailed examples that help you select, configure,
    | gemini-1.5-flash               | 1M tokens      | Balanced multimodal model, good for most tasks                    |
    | gemini-1.5-flash-8B            | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
    | gemini-1.5-pro                 | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Azure">
@@ -652,6 +671,7 @@ In this section, you'll find detailed examples that help you select, configure,
    # Optional
    AWS_SESSION_TOKEN=<your-session-token>  # For temporary credentials
    AWS_DEFAULT_REGION=<your-region>  # Defaults to us-east-1
+    AWS_REGION_NAME=<your-region>  # Alternative configuration for backwards compatibility with LiteLLM. Defaults to us-east-1
    ```

    **Basic Usage:**
@@ -695,6 +715,7 @@ In this section, you'll find detailed examples that help you select, configure,
    - `AWS_SECRET_ACCESS_KEY`: AWS secret key (required)
    - `AWS_SESSION_TOKEN`: AWS session token for temporary credentials (optional)
    - `AWS_DEFAULT_REGION`: AWS region (defaults to `us-east-1`)
+    - `AWS_REGION_NAME`: AWS region (defaults to `us-east-1`). Alternative configuration for backwards compatibility with LiteLLM

    **Features:**
    - Native tool calling support via Converse API
@@ -764,6 +785,11 @@ In this section, you'll find detailed examples that help you select, configure,
        model="sagemaker/<my-endpoint>"
    )
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Mistral">
@@ -779,6 +805,11 @@ In this section, you'll find detailed examples that help you select, configure,
        temperature=0.7
    )
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Nvidia NIM">
@@ -865,6 +896,11 @@ In this section, you'll find detailed examples that help you select, configure,
    | rakuten/rakutenai-7b-instruct                                     | 1,024 tokens   | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
    | rakuten/rakutenai-7b-chat                                         | 1,024 tokens   | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
    | baichuan-inc/baichuan2-13b-chat                                  | 4,096 tokens   | Support Chinese and English chat, coding, math, instruction following, solving quizzes |
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Local NVIDIA NIM Deployed using WSL2">
@@ -905,6 +941,11 @@ In this section, you'll find detailed examples that help you select, configure,

        # ...
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Groq">
@@ -926,6 +967,11 @@ In this section, you'll find detailed examples that help you select, configure,
    | Llama 3.1 70B/8B  | 131,072 tokens   | High-performance, large context tasks      |
    | Llama 3.2 Series  | 8,192 tokens     | General-purpose tasks                      |
    | Mixtral 8x7B      | 32,768 tokens    | Balanced performance and context           |
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="IBM watsonx.ai">
@@ -948,6 +994,11 @@ In this section, you'll find detailed examples that help you select, configure,
        base_url="https://api.watsonx.ai/v1"
    )
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Ollama (Local LLMs)">
@@ -961,6 +1012,11 @@ In this section, you'll find detailed examples that help you select, configure,
        base_url="http://localhost:11434"
    )
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Fireworks AI">
@@ -976,6 +1032,11 @@ In this section, you'll find detailed examples that help you select, configure,
        temperature=0.7
    )
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Perplexity AI">
@@ -991,6 +1052,11 @@ In this section, you'll find detailed examples that help you select, configure,
        base_url="https://api.perplexity.ai/"
    )
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Hugging Face">
@@ -1005,6 +1071,11 @@ In this section, you'll find detailed examples that help you select, configure,
        model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct"
    )
    ```
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="SambaNova">
@@ -1028,6 +1099,11 @@ In this section, you'll find detailed examples that help you select, configure,
    | Llama 3.2 Series   | 8,192 tokens           | General-purpose, multimodal tasks            |
    | Llama 3.3 70B      | Up to 131,072 tokens   | High-performance and output quality          |
    | Qwen2 familly      | 8,192 tokens           | High-performance and output quality          |
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Cerebras">
@@ -1053,6 +1129,11 @@ In this section, you'll find detailed examples that help you select, configure,
      - Good balance of speed and quality
      - Support for long context windows
    </Info>
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Open Router">
@@ -1075,6 +1156,11 @@ In this section, you'll find detailed examples that help you select, configure,
      - openrouter/deepseek/deepseek-r1
      - openrouter/deepseek/deepseek-chat
    </Info>
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Nebius AI Studio">
@@ -1097,6 +1183,11 @@ In this section, you'll find detailed examples that help you select, configure,
      - Competitive pricing
      - Good balance of speed and quality
    </Info>
+
+    **Note:** This provider uses LiteLLM. Add it as a dependency to your project:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>
 </AccordionGroup>

--- a/docs/en/enterprise/features/flow-hitl-management.mdx
+++ b/docs/en/enterprise/features/flow-hitl-management.mdx
@@ -38,22 +38,21 @@ CrewAI Enterprise provides a comprehensive Human-in-the-Loop (HITL) management s
 Configure human review checkpoints within your Flows using the `@human_feedback` decorator. When execution reaches a review point, the system pauses, notifies the assignee via email, and waits for a response.

 ```python
-from crewai.flow.flow import Flow, start, listen
+from crewai.flow.flow import Flow, start, listen, or_
 from crewai.flow.human_feedback import human_feedback, HumanFeedbackResult

 class ContentApprovalFlow(Flow):
    @start()
    def generate_content(self):
-        # AI generates content
        return "Generated marketing copy for Q1 campaign..."

-    @listen(generate_content)
    @human_feedback(
        message="Please review this content for brand compliance:",
        emit=["approved", "rejected", "needs_revision"],
    )
-    def review_content(self, content):
-        return content
+    @listen(or_("generate_content", "needs_revision"))
+    def review_content(self):
+        return "Marketing copy for review..."

    @listen("approved")
    def publish_content(self, result: HumanFeedbackResult):
@@ -62,10 +61,6 @@ class ContentApprovalFlow(Flow):
    @listen("rejected")
    def archive_content(self, result: HumanFeedbackResult):
        print(f"Content rejected. Reason: {result.feedback}")
-
-    @listen("needs_revision")
-    def revise_content(self, result: HumanFeedbackResult):
-        print(f"Revision requested: {result.feedback}")
 ```

 For complete implementation details, see the [Human Feedback in Flows](/en/learn/human-feedback-in-flows) guide.
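Review note: the effect of moving `review_content` onto `@listen(or_("generate_content", "needs_revision"))` is that the review step re-runs after every `needs_revision` outcome. The pure-Python simulation below only illustrates that control flow; it does not use or reproduce CrewAI's Flow engine, and `run_review_loop` is a hypothetical helper.

```python
from typing import Iterable, Tuple

# Outcomes that end the loop in the documented example.
TERMINAL_OUTCOMES = {"approved", "rejected"}
# Events that trigger the review step, as in or_("generate_content", "needs_revision").
REVIEW_TRIGGERS = {"generate_content", "needs_revision"}


def run_review_loop(outcomes: Iterable[str]) -> Tuple[int, str]:
    """Simulate the self-loop and return (number of reviews, last event)."""
    reviews = 0
    last_event = "generate_content"  # the start method fires once
    for outcome in outcomes:
        if last_event not in REVIEW_TRIGGERS:
            break  # nothing re-triggers the review step
        reviews += 1
        last_event = outcome  # the human's feedback, collapsed to one outcome
        if outcome in TERMINAL_OUTCOMES:
            break
    return reviews, last_event
```

Two revision requests followed by an approval result in three review rounds; an immediate rejection ends after one.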
--- a/docs/en/enterprise/guides/deploy-to-amp.mdx
+++ b/docs/en/enterprise/guides/deploy-to-amp.mdx
@@ -177,6 +177,11 @@ You need to push your crew to a GitHub repository. If you haven't created a crew
      ![Set Environment Variables](/images/enterprise/set-env-variables.png)
    </Frame>

+    <Info>
+      Using private Python packages? You'll need to add your registry credentials here too.
+      See [Private Package Registries](/en/enterprise/guides/private-package-registry) for the required variables.
+    </Info>
+
  </Step>

  <Step title="Deploy Your Crew">
--- a/docs/en/enterprise/guides/prepare-for-deployment.mdx
+++ b/docs/en/enterprise/guides/prepare-for-deployment.mdx
@@ -256,6 +256,12 @@ Before deployment, ensure you have:
 1. **LLM API keys** ready (OpenAI, Anthropic, Google, etc.)
 2. **Tool API keys** if using external tools (Serper, etc.)

+<Info>
+  If your project depends on packages from a **private PyPI registry**, you'll also need to configure
+  registry authentication credentials as environment variables. See the
+  [Private Package Registries](/en/enterprise/guides/private-package-registry) guide for details.
+</Info>
+
 <Tip>
  Test your project locally with the same environment variables before deploying
  to catch configuration issues early.
--- a/docs/en/enterprise/guides/private-package-registry.mdx
+++ b/docs/en/enterprise/guides/private-package-registry.mdx
@@ -0,0 +1,263 @@
+---
+title: "Private Package Registries"
+description: "Install private Python packages from authenticated PyPI registries in CrewAI AMP"
+icon: "lock"
+mode: "wide"
+---
+
+<Note>
+  This guide covers how to configure your CrewAI project to install Python packages
+  from private PyPI registries (Azure DevOps Artifacts, GitHub Packages, GitLab, AWS CodeArtifact, etc.)
+  when deploying to CrewAI AMP.
+</Note>
+
+## When You Need This
+
+If your project depends on internal or proprietary Python packages hosted on a private registry
+rather than the public PyPI, you'll need to:
+
+1. Tell UV **where** to find the package (an index URL)
+2. Tell UV **which** packages come from that index (a source mapping)
+3. Provide **credentials** so UV can authenticate during install
+
+CrewAI AMP uses [UV](https://docs.astral.sh/uv/) for dependency resolution and installation.
+UV supports authenticated private registries through `pyproject.toml` configuration combined
+with environment variables for credentials.
+
+## Step 1: Configure pyproject.toml
+
+Three pieces work together in your `pyproject.toml`:
+
+### 1a. Declare the dependency
+
+Add the private package to your `[project.dependencies]` like any other dependency:
+
+```toml
+[project]
+dependencies = [
+    "crewai[tools]>=0.100.1,<1.0.0",
+    "my-private-package>=1.2.0",
+]
+```
+
+### 1b. Define the index
+
+Register your private registry as a named index under `[[tool.uv.index]]`:
+
+```toml
+[[tool.uv.index]]
+name = "my-private-registry"
+url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
+explicit = true
+```
+
+<Info>
+  The `name` field is important — UV uses it to construct the environment variable names
+  for authentication (see [Step 2](#step-2-set-authentication-credentials) below).
+
+  Setting `explicit = true` means UV won't search this index for every package — only the
+  ones you explicitly map to it in `[tool.uv.sources]`. This avoids unnecessary queries
+  against your private registry and protects against dependency confusion attacks.
+</Info>
+
+### 1c. Map the package to the index
+
+Tell UV which packages should be resolved from your private index using `[tool.uv.sources]`:
+
+```toml
+[tool.uv.sources]
+my-private-package = { index = "my-private-registry" }
+```
+
+### Complete example
+
+```toml
+[project]
+name = "my-crew-project"
+version = "0.1.0"
+requires-python = ">=3.10,<3.14"
+dependencies = [
+    "crewai[tools]>=0.100.1,<1.0.0",
+    "my-private-package>=1.2.0",
+]
+
+[tool.crewai]
+type = "crew"
+
+[[tool.uv.index]]
+name = "my-private-registry"
+url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
+explicit = true
+
+[tool.uv.sources]
+my-private-package = { index = "my-private-registry" }
+```
+
+After updating `pyproject.toml`, regenerate your lock file:
+
+```bash
+uv lock
+```
+
+<Warning>
+  Always commit the updated `uv.lock` along with your `pyproject.toml` changes.
+  The lock file is required for deployment — see [Prepare for Deployment](/en/enterprise/guides/prepare-for-deployment).
+</Warning>
+
+## Step 2: Set Authentication Credentials
+
+UV authenticates against private indexes using environment variables that follow a naming convention
+based on the index name you defined in `pyproject.toml`:
+
+```
+UV_INDEX_{UPPER_NAME}_USERNAME
+UV_INDEX_{UPPER_NAME}_PASSWORD
+```
+
+Where `{UPPER_NAME}` is your index name converted to **uppercase** with **hyphens replaced by underscores**.
+
+For example, an index named `my-private-registry` uses:
+
+| Variable | Value |
+|----------|-------|
+| `UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME` | Your registry username or token name |
+| `UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD` | Your registry password or token/PAT |
+
+<Warning>
+  These environment variables **must** be added via the CrewAI AMP **Environment Variables** settings,
+  either globally or at the deployment level. Values that exist only in a local `.env` file or are hardcoded in your project are not available to the build.
+
+  See [Setting Environment Variables in AMP](#setting-environment-variables-in-amp) below.
+</Warning>
+
+## Registry Provider Reference
+
+The table below shows the index URL format and credential values for common registry providers.
+Replace placeholder values with your actual organization and feed details.
+
+| Provider | Index URL | Username | Password |
+|----------|-----------|----------|----------|
+| **Azure DevOps Artifacts** | `https://pkgs.dev.azure.com/{org}/_packaging/{feed}/pypi/simple/` | Any non-empty string (e.g. `token`) | Personal Access Token (PAT) with Packaging Read scope |
+| **GitHub Packages** | `https://pypi.pkg.github.com/{owner}/simple/` | GitHub username | Personal Access Token (classic) with `read:packages` scope |
+| **GitLab Package Registry** | `https://gitlab.com/api/v4/projects/{project_id}/packages/pypi/simple/` | `__token__` | Project or Personal Access Token with `read_api` scope |
+| **AWS CodeArtifact** | Use the URL from `aws codeartifact get-repository-endpoint` | `aws` | Token from `aws codeartifact get-authorization-token` |
+| **Google Artifact Registry** | `https://{region}-python.pkg.dev/{project}/{repo}/simple/` | `_json_key_base64` | Base64-encoded service account key |
+| **JFrog Artifactory** | `https://{instance}.jfrog.io/artifactory/api/pypi/{repo}/simple/` | Username or email | API key or identity token |
+| **Self-hosted (devpi, Nexus, etc.)** | Your registry's simple API URL | Registry username | Registry password |
+
+<Tip>
+  For **AWS CodeArtifact**, the authorization token expires periodically.
+  You'll need to refresh the `UV_INDEX_*_PASSWORD` value when it expires.
+  Consider automating this in your CI/CD pipeline.
+</Tip>
+
+## Setting Environment Variables in AMP
+
+Private registry credentials must be configured as environment variables in CrewAI AMP.
+You have two options:
+
+<Tabs>
+  <Tab title="Web Interface">
+    1. Log in to [CrewAI AMP](https://app.crewai.com)
+    2. Navigate to your automation
+    3. Open the **Environment Variables** tab
+    4. Add each variable (`UV_INDEX_*_USERNAME` and `UV_INDEX_*_PASSWORD`) with its value
+
+    See the [Deploy to AMP — Set Environment Variables](/en/enterprise/guides/deploy-to-amp#set-environment-variables) step for details.
+  </Tab>
+  <Tab title="CLI Deployment">
+    Add the variables to your local `.env` file before running `crewai deploy create`.
+    The CLI will securely transfer them to the platform:
+
+    ```bash
+    # .env
+    OPENAI_API_KEY=sk-...
+    UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
+    UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat-here
+    ```
+
+    ```bash
+    crewai deploy create
+    ```
+  </Tab>
+</Tabs>
+
+<Warning>
+  **Never** commit credentials to your repository. Use AMP environment variables for all secrets.
+  The `.env` file should be listed in `.gitignore`.
+</Warning>
+
+To update credentials on an existing deployment, see [Update Your Crew — Environment Variables](/en/enterprise/guides/update-crew).
+
+## How It All Fits Together
+
+When CrewAI AMP builds your automation, the resolution flow works like this:
+
+<Steps>
+  <Step title="Build starts">
+    AMP pulls your repository and reads `pyproject.toml` and `uv.lock`.
+  </Step>
+  <Step title="UV resolves dependencies">
+    UV reads `[tool.uv.sources]` to determine which index each package should come from.
+  </Step>
+  <Step title="UV authenticates">
+    For each private index, UV looks up `UV_INDEX_{NAME}_USERNAME` and `UV_INDEX_{NAME}_PASSWORD`
+    from the environment variables you configured in AMP.
+  </Step>
+  <Step title="Packages install">
+    UV downloads and installs all packages — both public (from PyPI) and private (from your registry).
+  </Step>
+  <Step title="Automation runs">
+    Your crew or flow starts with all dependencies available.
+  </Step>
+</Steps>
+
+## Troubleshooting
+
+### Authentication Errors During Build
+
+**Symptom**: Build fails with `401 Unauthorized` or `403 Forbidden` when resolving a private package.
+
+**Check**:
+- The `UV_INDEX_*` environment variable names match your index name exactly (uppercased, hyphens → underscores)
+- Credentials are set in AMP environment variables, not just in a local `.env`
+- Your token/PAT has the required read permissions for the package feed
+- The token hasn't expired (especially relevant for AWS CodeArtifact)
+
+### Package Not Found
+
+**Symptom**: `No matching distribution found for my-private-package`.
+
+**Check**:
+- The index URL in `pyproject.toml` ends with `/simple/`
+- The `[tool.uv.sources]` entry maps the correct package name to the correct index name
+- The package is actually published to your private registry
+- Run `uv lock` locally with the same credentials to verify resolution works
+
+### Lock File Conflicts
+
+**Symptom**: `uv lock` fails or produces unexpected results after adding a private index.
+
+**Solution**: Set the credentials locally and regenerate:
+
+```bash
+export UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
+export UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat
+uv lock
+```
+
+Then commit the updated `uv.lock`.
+
+## Related Guides
+
+<CardGroup cols={3}>
+  <Card title="Prepare for Deployment" icon="clipboard-check" href="/en/enterprise/guides/prepare-for-deployment">
+    Verify project structure and dependencies before deploying.
+  </Card>
+  <Card title="Deploy to AMP" icon="rocket" href="/en/enterprise/guides/deploy-to-amp">
+    Deploy your crew or flow and configure environment variables.
+  </Card>
+  <Card title="Update Your Crew" icon="arrows-rotate" href="/en/enterprise/guides/update-crew">
+    Update environment variables and push changes to a running deployment.
+  </Card>
+</CardGroup>
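Review note: the naming convention described in Step 2 of the new guide (uppercase the index name, replace hyphens with underscores) is easy to get wrong by hand. The sketch below derives the two variable names and reports which ones are missing from an environment mapping. Both helpers are hypothetical and follow only the rule as stated in the guide; they are not part of UV or CrewAI.

```python
from typing import List, Mapping, Tuple


def uv_index_env_vars(index_name: str) -> Tuple[str, str]:
    """Build the username/password variable names for a named UV index."""
    # Rule from the guide: uppercase, hyphens replaced by underscores.
    upper_name = index_name.upper().replace("-", "_")
    return (
        f"UV_INDEX_{upper_name}_USERNAME",
        f"UV_INDEX_{upper_name}_PASSWORD",
    )


def missing_credentials(index_name: str, env: Mapping[str, str]) -> List[str]:
    """Return the credential variables that are unset or empty in `env`."""
    return [name for name in uv_index_env_vars(index_name) if not env.get(name)]
```

Passing `os.environ` as `env` before running `uv lock` locally gives a quick check that both variables are set for the index named in `pyproject.toml`.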
--- a/docs/en/learn/human-feedback-in-flows.mdx
+++ b/docs/en/learn/human-feedback-in-flows.mdx
@@ -98,33 +98,43 @@ def handle_feedback(self, result):
 When you specify `emit`, the decorator becomes a router. The human's free-form feedback is interpreted by an LLM and collapsed into one of the specified outcomes:

 ```python Code
-@start()
-@human_feedback(
-    message="Do you approve this content for publication?",
-    emit=["approved", "rejected", "needs_revision"],
-    llm="gpt-4o-mini",
-    default_outcome="needs_revision",
-)
-def review_content(self):
-    return "Draft blog post content here..."
+from crewai.flow.flow import Flow, start, listen, or_
+from crewai.flow.human_feedback import human_feedback

-@listen("approved")
-def publish(self, result):
-    print(f"Publishing! User said: {result.feedback}")
+class ReviewFlow(Flow):
+    @start()
+    def generate_content(self):
+        return "Draft blog post content here..."

-@listen("rejected")
-def discard(self, result):
-    print(f"Discarding. Reason: {result.feedback}")
+    @human_feedback(
+        message="Do you approve this content for publication?",
+        emit=["approved", "rejected", "needs_revision"],
+        llm="gpt-4o-mini",
+        default_outcome="needs_revision",
+    )
+    @listen(or_("generate_content", "needs_revision"))
+    def review_content(self):
+        return "Draft blog post content here..."

-@listen("needs_revision")
-def revise(self, result):
-    print(f"Revising based on: {result.feedback}")
+    @listen("approved")
+    def publish(self, result):
+        print(f"Publishing! User said: {result.feedback}")
+
+    @listen("rejected")
+    def discard(self, result):
+        print(f"Discarding. Reason: {result.feedback}")
 ```

+When the human says something like "needs more detail", the LLM collapses that to `"needs_revision"`, which triggers `review_content` again via `or_()` — creating a revision loop. The loop continues until the outcome is `"approved"` or `"rejected"`.
+
 <Tip>
 The LLM uses structured outputs (function calling) when available to guarantee the response is one of your specified outcomes. This makes routing reliable and predictable.
 </Tip>

+<Warning>
+A `@start()` method only runs once at the beginning of the flow. If you need a revision loop, separate the start method from the review method and use `@listen(or_("trigger", "revision_outcome"))` on the review method to enable the self-loop.
+</Warning>
+
 ## HumanFeedbackResult

 The `HumanFeedbackResult` dataclass contains all information about a human feedback interaction:
@@ -188,127 +198,183 @@ Each `HumanFeedbackResult` is appended to `human_feedback_history`, so multiple

 ## Complete Example: Content Approval Workflow

-Here's a full example implementing a content review and approval workflow:
+Here's a full example implementing a content review and approval workflow with a revision loop:

 <CodeGroup>

 ```python Code
-from crewai.flow.flow import Flow, start, listen
+from crewai.flow.flow import Flow, start, listen, or_
 from crewai.flow.human_feedback import human_feedback, HumanFeedbackResult
 from pydantic import BaseModel


 class ContentState(BaseModel):
-    topic: str = ""
    draft: str = ""
-    final_content: str = ""
    revision_count: int = 0
+    status: str = "pending"


 class ContentApprovalFlow(Flow[ContentState]):
-    """A flow that generates content and gets human approval."""
+    """A flow that generates content and loops until the human approves."""

    @start()
-    def get_topic(self):
-        self.state.topic = input("What topic should I write about? ")
-        return self.state.topic
-
-    @listen(get_topic)
-    def generate_draft(self, topic):
-        # In real use, this would call an LLM
-        self.state.draft = f"# {topic}\n\nThis is a draft about {topic}..."
+    def generate_draft(self):
+        self.state.draft = "# AI Safety\n\nThis is a draft about AI Safety..."
        return self.state.draft

-    @listen(generate_draft)
    @human_feedback(
-        message="Please review this draft. Reply 'approved', 'rejected', or provide revision feedback:",
+        message="Please review this draft. Approve, reject, or describe what needs changing:",
        emit=["approved", "rejected", "needs_revision"],
        llm="gpt-4o-mini",
        default_outcome="needs_revision",
    )
-    def review_draft(self, draft):
-        return draft
+    @listen(or_("generate_draft", "needs_revision"))
+    def review_draft(self):
+        self.state.revision_count += 1
+        return f"{self.state.draft} (v{self.state.revision_count})"

    @listen("approved")
    def publish_content(self, result: HumanFeedbackResult):
-        self.state.final_content = result.output
-        print("\n✅ Content approved and published!")
-        print(f"Reviewer comment: {result.feedback}")
+        self.state.status = "published"
+        print(f"Content approved and published! Reviewer said: {result.feedback}")
        return "published"

    @listen("rejected")
    def handle_rejection(self, result: HumanFeedbackResult):
-        print("\n❌ Content rejected")
-        print(f"Reason: {result.feedback}")
+        self.state.status = "rejected"
+        print(f"Content rejected. Reason: {result.feedback}")
        return "rejected"

-    @listen("needs_revision")
-    def revise_content(self, result: HumanFeedbackResult):
-        self.state.revision_count += 1
-        print(f"\n📝 Revision #{self.state.revision_count} requested")
-        print(f"Feedback: {result.feedback}")

-        # In a real flow, you might loop back to generate_draft
-        # For this example, we just acknowledge
-        return "revision_requested"
-
-
-# Run the flow
 flow = ContentApprovalFlow()
 result = flow.kickoff()
-print(f"\nFlow completed. Revisions requested: {flow.state.revision_count}")
+print(f"\nFlow completed. Status: {flow.state.status}, Reviews: {flow.state.revision_count}")
 ```

 ```text Output
-What topic should I write about? AI Safety
+==================================================
+OUTPUT FOR REVIEW:
+==================================================
+# AI Safety
+
+This is a draft about AI Safety... (v1)
+==================================================
+
+Please review this draft. Approve, reject, or describe what needs changing:
+(Press Enter to skip, or type your feedback)
+
+Your feedback: Needs more detail on alignment research

 ==================================================
 OUTPUT FOR REVIEW:
 ==================================================
 # AI Safety

-This is a draft about AI Safety...
+This is a draft about AI Safety... (v2)
 ==================================================

-Please review this draft. Reply 'approved', 'rejected', or provide revision feedback:
+Please review this draft. Approve, reject, or describe what needs changing:
 (Press Enter to skip, or type your feedback)

 Your feedback: Looks good, approved!

-✅ Content approved and published!
-Reviewer comment: Looks good, approved!
+Content approved and published! Reviewer said: Looks good, approved!

-Flow completed. Revisions requested: 0
+Flow completed. Status: published, Reviews: 2
 ```

 </CodeGroup>

+The key pattern is `@listen(or_("generate_draft", "needs_revision"))` — the review method listens to both the initial trigger and its own revision outcome, creating a self-loop that repeats until the human approves or rejects.
+
 ## Combining with Other Decorators

-The `@human_feedback` decorator works with other flow decorators. Place it as the innermost decorator (closest to the function):
+The `@human_feedback` decorator works with `@start()`, `@listen()`, and `or_()`. Either decorator order works, because the framework propagates attributes in both directions, but the recommended patterns are:

 ```python Code
-# Correct: @human_feedback is innermost (closest to the function)
+# One-shot review at the start of a flow (no self-loop)
@start()
-@human_feedback(message="Review this:")
+@human_feedback(message="Review this:", emit=["approved", "rejected"], llm="gpt-4o-mini")
 def my_start_method(self):
    return "content"

+# Linear review on a listener (no self-loop)
@listen(other_method)
-@human_feedback(message="Review this too:")
+@human_feedback(message="Review this too:", emit=["good", "bad"], llm="gpt-4o-mini")
 def my_listener(self, data):
    return f"processed: {data}"
+
+# Self-loop: review that can loop back for revisions
+@human_feedback(message="Approve or revise?", emit=["approved", "revise"], llm="gpt-4o-mini")
+@listen(or_("upstream_method", "revise"))
+def review_with_loop(self):
+    return "content for review"
 ```

-<Tip>
-Place `@human_feedback` as the innermost decorator (last/closest to the function) so it wraps the method directly and can capture the return value before passing to the flow system.
-</Tip>
+### Self-loop pattern
+
+To create a revision loop, the review method must listen to **both** an upstream trigger and its own revision outcome using `or_()`:
+
+```python Code
+@start()
+def generate(self):
+    return "initial draft"
+
+@human_feedback(
+    message="Approve or request changes?",
+    emit=["revise", "approved"],
+    llm="gpt-4o-mini",
+    default_outcome="approved",
+)
+@listen(or_("generate", "revise"))
+def review(self):
+    return "content"
+
+@listen("approved")
+def publish(self):
+    return "published"
+```
+
+When the outcome is `"revise"`, the flow routes back to `review` (because it listens to `"revise"` via `or_()`). When the outcome is `"approved"`, the flow continues to `publish`. This works because the flow engine exempts routers from the "fire once" rule, allowing them to re-execute on each loop iteration.
+
+### Chained routers
+
+A listener triggered by one router's outcome can itself be a router:
+
+```python Code
+@start()
+def generate(self):
+    return "draft content"
+
+@human_feedback(message="First review:", emit=["approved", "rejected"], llm="gpt-4o-mini")
+@listen("generate")
+def first_review(self):
+    return "draft content"
+
+@human_feedback(message="Final review:", emit=["publish", "hold"], llm="gpt-4o-mini")
+@listen("approved")
+def final_review(self, prev):
+    return "final content"
+
+@listen("publish")
+def on_publish(self, prev):
+    return "published"
+
+@listen("hold")
+def on_hold(self, prev):
+    return "held for later"
+```
+
+### Limitations
+
+- **`@start()` methods run once**: A `@start()` method cannot self-loop. If you need a revision cycle, keep `@start()` on a separate entry-point method and put `@human_feedback` on a `@listen()` method that also listens to its own revision outcome via `or_()`.
+- **No `@start()` + `@listen()` on the same method**: This is a Flow framework constraint. A method is either a start point or a listener, not both.

 ## Best Practices

 ### 1. Write Clear Request Messages

-The `request` parameter is what the human sees. Make it actionable:
+The `message` parameter is what the human sees. Make it actionable:

 ```python Code
 # ✅ Good - clear and actionable
@@ -516,9 +582,9 @@ class ContentPipeline(Flow):
    @start()
    @human_feedback(
        message="Approve this content for publication?",
-        emit=["approved", "rejected", "needs_revision"],
+        emit=["approved", "rejected"],
        llm="gpt-4o-mini",
-        default_outcome="needs_revision",
+        default_outcome="rejected",
        provider=SlackNotificationProvider("#content-reviews"),
    )
    def generate_content(self):
@@ -534,11 +600,6 @@ class ContentPipeline(Flow):
        print(f"Archived. Reason: {result.feedback}")
        return {"status": "archived"}

-    @listen("needs_revision")
-    def queue_revision(self, result):
-        print(f"Queued for revision: {result.feedback}")
-        return {"status": "revision_needed"}
-

 # Starting the flow (will pause and wait for Slack response)
 def start_content_pipeline():
@@ -594,22 +655,22 @@ Over time, the human sees progressively better pre-reviewed output because each
 ```python Code
 class ArticleReviewFlow(Flow):
    @start()
+    def generate_article(self):
+        return self.crew.kickoff(inputs={"topic": "AI Safety"}).raw
+
    @human_feedback(
        message="Review this article draft:",
        emit=["approved", "needs_revision"],
        llm="gpt-4o-mini",
        learn=True,  # enable HITL learning
    )
-    def generate_article(self):
-        return self.crew.kickoff(inputs={"topic": "AI Safety"}).raw
+    @listen(or_("generate_article", "needs_revision"))
+    def review_article(self):
+        return self.last_human_feedback.output if self.last_human_feedback else "article draft"

    @listen("approved")
    def publish(self):
        print(f"Publishing: {self.last_human_feedback.output}")
-
-    @listen("needs_revision")
-    def revise(self):
-        print("Revising based on feedback...")
 ```

 **First run**: The human sees the raw output and says "Always include citations for factual claims." The lesson is distilled and stored in memory.
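The self-loop routing used in this file (a review method that listens to `or_("<upstream>", "needs_revision")`) can be modeled in plain Python, without CrewAI. This is only a toy sketch of the control flow as the docs describe it, not the flow engine's implementation. The function name `run_review_loop` and the `max_reviews` safety cap are hypothetical and exist only for illustration.

```python
def run_review_loop(outcomes, default_outcome="needs_revision", max_reviews=10):
    """Toy model of a review step that re-runs on its own revision outcome.

    `outcomes` stands in for the LLM-collapsed human feedback, one entry per review.
    `max_reviews` is a hypothetical cap for this sketch, not a CrewAI setting.
    """
    feed = iter(outcomes)
    event = "generate_draft"  # the upstream trigger fires once
    reviews = 0

    # The review step listens to both the upstream trigger and "needs_revision".
    while event in ("generate_draft", "needs_revision"):
        if reviews == max_reviews:
            return "aborted", reviews
        reviews += 1
        # With no feedback left, fall back to the default outcome.
        event = next(feed, default_outcome)

    # Terminal outcomes are handled by their own listeners.
    status = {"approved": "published", "rejected": "rejected"}.get(event, "unhandled")
    return status, reviews


if __name__ == "__main__":
    print(run_review_loop(["needs_revision", "approved"]))
```

One revision request followed by an approval gives two reviews, which matches the `Reviews: 2` line in the Content Approval Workflow output above.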
--- a/docs/en/learn/llm-connections.mdx
+++ b/docs/en/learn/llm-connections.mdx
@@ -7,7 +7,7 @@ mode: "wide"

 ## Connect CrewAI to LLMs

-CrewAI uses LiteLLM to connect to a wide variety of Language Models (LLMs). This integration provides extensive versatility, allowing you to use models from numerous providers with a simple, unified interface.
+CrewAI connects to LLMs through native SDK integrations for the most popular providers (OpenAI, Anthropic, Google Gemini, Azure, and AWS Bedrock), and uses LiteLLM as a flexible fallback for all other providers.

 <Note>
    By default, CrewAI uses the `gpt-4o-mini` model. This is determined by the `OPENAI_MODEL_NAME` environment variable, which defaults to "gpt-4o-mini" if not set.
@@ -41,6 +41,14 @@ LiteLLM supports a wide range of providers, including but not limited to:

 For a complete and up-to-date list of supported providers, please refer to the [LiteLLM Providers documentation](https://docs.litellm.ai/docs/providers).

+<Info>
+  To use any provider not covered by a native integration, add LiteLLM as a dependency to your project:
+  ```bash
+  uv add 'crewai[litellm]'
+  ```
+  Native providers (OpenAI, Anthropic, Google Gemini, Azure, AWS Bedrock) use their own SDK extras — see the [Provider Configuration Examples](/en/concepts/llms#provider-configuration-examples).
+</Info>
+
 ## Changing the LLM

 To use a different LLM with your CrewAI agents, you have several options:
--- a/docs/en/observability/tracing.mdx
+++ b/docs/en/observability/tracing.mdx
@@ -35,7 +35,7 @@ Visit [app.crewai.com](https://app.crewai.com) and create your free account. Thi
 If you haven't already, install CrewAI with the CLI tools:

 ```bash
-uv add crewai[tools]
+uv add 'crewai[tools]'
 ```

 Then authenticate your CLI with your CrewAI AMP account:
--- a/docs/ko/changelog.mdx
+++ b/docs/ko/changelog.mdx
@@ -4,6 +4,106 @@ description: "CrewAI의 제품 업데이트, 개선 사항 및 버그 수정"
 icon: "clock"
 mode: "wide"
 ---
+<Update label="2026년 2월 27일">
+  ## v1.10.1a1
+
+  [GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1a1)
+
+  ## 변경 사항
+
+  ### 기능
+  - 단계 콜백 메서드에서 비동기 호출 지원 구현
+  - 메모리 모듈의 무거운 의존성에 대한 지연 로딩 구현
+
+  ### 문서
+  - v1.10.0에 대한 변경 로그 및 버전 업데이트
+
+  ### 리팩토링
+  - 비동기 호출을 지원하기 위해 단계 콜백 메서드 리팩토링
+  - 메모리 모듈의 무거운 의존성에 대한 지연 로딩을 구현하기 위해 리팩토링
+
+  ### 버그 수정
+  - 릴리스 노트의 분기 수정
+
+  ## 기여자
+
+  @greysonlalonde, @joaomdmoura
+
+</Update>
+
+<Update label="2026년 2월 27일">
+  ## v1.10.1a1
+
+  [GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1a1)
+
+  ## 변경 사항
+
+  ### 리팩토링
+  - 비동기 호출을 지원하기 위해 단계 콜백 메서드 리팩토링
+  - 메모리 모듈의 무거운 의존성에 대해 지연 로딩 구현
+
+  ### 문서화
+  - v1.10.0에 대한 변경 로그 및 버전 업데이트
+
+  ### 버그 수정
+  - 릴리스 노트를 위한 브랜치 생성
+
+  ## 기여자
+
+  @greysonlalonde, @joaomdmoura
+
+</Update>
+
+<Update label="2026년 2월 26일">
+  ## v1.10.0
+
+  [GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.10.0)
+
+  ## 변경 사항
+
+  ### 기능
+  - MCP 도구 해상도 및 관련 이벤트 개선
+  - lancedb 버전 업데이트 및 lance-namespace 패키지 추가
+  - CrewAgentExecutor 및 BaseTool에서 JSON 인수 파싱 및 검증 개선
+  - CLI HTTP 클라이언트를 requests에서 httpx로 마이그레이션
+  - 버전화된 문서 추가
+  - 버전 노트에 대한 yanked 감지 추가
+  - Flows에서 사용자 입력 처리 구현
+  - 인간 피드백 통합 테스트에서 HITL 자기 루프 기능 개선
+  - eventbus에 started_event_id 추가 및 설정
+  - tools.specs 자동 업데이트
+
+  ### 버그 수정
+  - 빈 경우에도 도구 kwargs를 검증하여 모호한 TypeError 방지
+  - LLM을 위한 도구 매개변수 스키마에서 null 타입 유지
+  - output_pydantic/output_json을 네이티브 구조화된 출력으로 매핑
+  - 약속이 있는 경우 콜백이 실행/대기되도록 보장
+  - 예외 컨텍스트에서 메서드 이름 캡처
+  - 라우터 결과에서 enum 타입 유지; 타입 개선
+  - 입력으로 지속성 ID가 전달될 때 조용히 깨지는 순환 흐름 수정
+  - CLI 플래그 형식을 --skip-provider에서 --skip_provider로 수정
+  - OpenAI 도구 호출 스트림이 완료되도록 보장
+  - MCP 도구에서 복잡한 스키마 $ref 포인터 해결
+  - 스키마에서 additionalProperties=false 강제 적용
+  - 크루 폴더에 대해 예약된 스크립트 이름 거부
+  - 가드레일 이벤트 방출 테스트에서 경쟁 조건 해결
+
+  ### 문서
+  - 비네이티브 LLM 공급자를 위한 litellm 종속성 노트 추가
+  - NL2SQL 보안 모델 및 강화 지침 명확화
+  - 9개 통합에서 96개의 누락된 작업 추가
+
+  ### 리팩토링
+  - crew를 provider로 리팩토링
+  - HITL을 provider 패턴으로 추출
+  - 훅 타이핑 및 등록 개선
+
+  ## 기여자
+
+  @dependabot[bot], @github-actions[bot], @github-code-quality[bot], @greysonlalonde, @heitorado, @hobostay, @joaomdmoura, @johnvan7, @jonathansampson, @lorenzejay, @lucasgomide, @mattatcha, @mplachta, @nicoferdi96, @theCyberTech, @thiagomoretto, @vinibrsl
+
+</Update>
+
 <Update label="2026년 1월 26일">
  ## v1.9.0

--- a/docs/ko/concepts/llms.mdx
+++ b/docs/ko/concepts/llms.mdx
@@ -105,6 +105,15 @@ CrewAI 코드 내에는 사용할 모델을 지정할 수 있는 여러 위치
  </Tab>
 </Tabs>

+<Info>
+  CrewAI는 OpenAI, Anthropic, Google (Gemini API), Azure, AWS Bedrock에 대해 네이티브 SDK 통합을 제공합니다 — 제공자별 extras(예: `uv add "crewai[openai]"`) 외에 추가 설치가 필요하지 않습니다.
+
+  그 외 모든 제공자는 **LiteLLM**을 통해 지원됩니다. 이를 사용하려면 프로젝트에 의존성으로 추가하세요:
+  ```bash
+  uv add 'crewai[litellm]'
+  ```
+</Info>
+
 ## 공급자 구성 예시

 CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양한 LLM 공급자를 지원합니다.
@@ -214,6 +223,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
    | `meta_llama/Llama-4-Maverick-17B-128E-Instruct-FP8` | 128k | 4028 | 텍스트, 이미지     | 텍스트           |
    | `meta_llama/Llama-3.3-70B-Instruct`               | 128k | 4028       | 텍스트            | 텍스트           |
    | `meta_llama/Llama-3.3-8B-Instruct`                | 128k | 4028       | 텍스트            | 텍스트           |
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Anthropic">
@@ -354,6 +368,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
    | gemini-1.5-flash                 | 1M 토큰         | 밸런스 잡힌 멀티모달 모델, 대부분의 작업에 적합                         |
    | gemini-1.5-flash-8B              | 1M 토큰         | 가장 빠르고, 비용 효율적, 고빈도 작업에 적합                            |
    | gemini-1.5-pro                   | 2M 토큰         | 최고의 성능, 논리적 추론, 코딩, 창의적 협업 등 다양한 추론 작업에 적합   |
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Azure">
@@ -439,6 +458,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
        model="sagemaker/<my-endpoint>"
    )
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Mistral">
@@ -454,6 +478,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
        temperature=0.7
    )
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Nvidia NIM">
@@ -540,6 +569,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
    | rakuten/rakutenai-7b-instruct                                          | 1,024 토큰     | 언어 이해, 추론, 텍스트 생성이 탁월한 최첨단 LLM                         |
    | rakuten/rakutenai-7b-chat                                              | 1,024 토큰     | 언어 이해, 추론, 텍스트 생성이 탁월한 최첨단 LLM                         |
    | baichuan-inc/baichuan2-13b-chat                                        | 4,096 토큰     | 중국어 및 영어 대화, 코딩, 수학, 지시 따르기, 퀴즈 풀이 지원             |
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Local NVIDIA NIM Deployed using WSL2">
@@ -580,6 +614,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양

        # ...
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Groq">
@@ -601,6 +640,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
    | Llama 3.1 70B/8B| 131,072 토큰      | 고성능, 대용량 문맥 작업         |
    | Llama 3.2 Series| 8,192 토큰        | 범용 작업                        |
    | Mixtral 8x7B    | 32,768 토큰       | 성능과 문맥의 균형               |
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="IBM watsonx.ai">
@@ -623,6 +667,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
        base_url="https://api.watsonx.ai/v1"
    )
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Ollama (Local LLMs)">
@@ -636,6 +685,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
        base_url="http://localhost:11434"
    )
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Fireworks AI">
@@ -651,6 +705,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
        temperature=0.7
    )
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Perplexity AI">
@@ -666,6 +725,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
        base_url="https://api.perplexity.ai/"
    )
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Hugging Face">
@@ -680,6 +744,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
        model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct"
    )
    ```
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="SambaNova">
@@ -703,6 +772,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
    | Llama 3.2 Series| 8,192 토큰         | 범용, 멀티모달 작업                  |
    | Llama 3.3 70B   | 최대 131,072 토큰   | 고성능, 높은 출력 품질               |
    | Qwen2 familly   | 8,192 토큰         | 고성능, 높은 출력 품질               |
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Cerebras">
@@ -728,6 +802,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
      - 속도와 품질의 우수한 밸런스
      - 긴 컨텍스트 윈도우 지원
    </Info>
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Open Router">
@@ -750,6 +829,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
      - openrouter/deepseek/deepseek-r1
      - openrouter/deepseek/deepseek-chat
    </Info>
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Nebius AI Studio">
@@ -772,6 +856,11 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
      - 경쟁력 있는 가격
      - 속도와 품질의 우수한 밸런스
    </Info>
+
+    **참고:** 이 제공자는 LiteLLM을 사용합니다. 프로젝트에 의존성으로 추가하세요:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>
 </AccordionGroup>

--- a/docs/ko/enterprise/features/flow-hitl-management.mdx
+++ b/docs/ko/enterprise/features/flow-hitl-management.mdx
@@ -38,22 +38,21 @@ CrewAI Enterprise는 AI 워크플로우를 협업적인 인간-AI 프로세스
 `@human_feedback` 데코레이터를 사용하여 Flow 내에 인간 검토 체크포인트를 구성합니다. 실행이 검토 포인트에 도달하면 시스템이 일시 중지되고, 담당자에게 이메일로 알리며, 응답을 기다립니다.

 ```python
-from crewai.flow.flow import Flow, start, listen
+from crewai.flow.flow import Flow, start, listen, or_
 from crewai.flow.human_feedback import human_feedback, HumanFeedbackResult

 class ContentApprovalFlow(Flow):
    @start()
    def generate_content(self):
-        # AI가 콘텐츠 생성
        return "Q1 캠페인용 마케팅 카피 생성..."

-    @listen(generate_content)
    @human_feedback(
        message="브랜드 준수를 위해 이 콘텐츠를 검토해 주세요:",
        emit=["approved", "rejected", "needs_revision"],
    )
-    def review_content(self, content):
-        return content
+    @listen(or_("generate_content", "needs_revision"))
+    def review_content(self):
+        return "검토용 마케팅 카피..."

    @listen("approved")
    def publish_content(self, result: HumanFeedbackResult):
@@ -62,10 +61,6 @@ class ContentApprovalFlow(Flow):
    @listen("rejected")
    def archive_content(self, result: HumanFeedbackResult):
        print(f"콘텐츠 거부됨. 사유: {result.feedback}")
-
-    @listen("needs_revision")
-    def revise_content(self, result: HumanFeedbackResult):
-        print(f"수정 요청: {result.feedback}")
 ```

 완전한 구현 세부 사항은 [Flow에서 인간 피드백](/ko/learn/human-feedback-in-flows) 가이드를 참조하세요.
--- a/docs/ko/enterprise/guides/deploy-to-amp.mdx
+++ b/docs/ko/enterprise/guides/deploy-to-amp.mdx
@@ -176,6 +176,11 @@ Crew를 GitHub 저장소에 푸시해야 합니다. 아직 Crew를 만들지 않
      ![Set Environment Variables](/images/enterprise/set-env-variables.png)
    </Frame>

+    <Info>
+      프라이빗 Python 패키지를 사용하시나요? 여기에 레지스트리 자격 증명도 추가해야 합니다.
+      필요한 변수는 [프라이빗 패키지 레지스트리](/ko/enterprise/guides/private-package-registry)를 참조하세요.
+    </Info>
+
  </Step>

  <Step title="Crew 배포하기">
--- a/docs/ko/enterprise/guides/prepare-for-deployment.mdx
+++ b/docs/ko/enterprise/guides/prepare-for-deployment.mdx
@@ -256,6 +256,12 @@ Crews와 Flows 모두 `src/project_name/main.py`에 진입점이 있습니다:
 1. **LLM API 키** (OpenAI, Anthropic, Google 등)
 2. **도구 API 키** - 외부 도구를 사용하는 경우 (Serper 등)

+<Info>
+  프로젝트가 **프라이빗 PyPI 레지스트리**의 패키지에 의존하는 경우, 레지스트리 인증 자격 증명도
+  환경 변수로 구성해야 합니다. 자세한 내용은
+  [프라이빗 패키지 레지스트리](/ko/enterprise/guides/private-package-registry) 가이드를 참조하세요.
+</Info>
+
 <Tip>
  구성 문제를 조기에 발견하기 위해 배포 전에 동일한 환경 변수로
  로컬에서 프로젝트를 테스트하세요.
--- a/docs/ko/enterprise/guides/private-package-registry.mdx
+++ b/docs/ko/enterprise/guides/private-package-registry.mdx
@@ -0,0 +1,261 @@
+---
+title: "프라이빗 패키지 레지스트리"
+description: "CrewAI AMP에서 인증된 PyPI 레지스트리의 프라이빗 Python 패키지 설치하기"
+icon: "lock"
+mode: "wide"
+---
+
+<Note>
+  이 가이드는 CrewAI AMP에 배포할 때 프라이빗 PyPI 레지스트리(Azure DevOps Artifacts, GitHub Packages,
+  GitLab, AWS CodeArtifact 등)에서 Python 패키지를 설치하도록 CrewAI 프로젝트를 구성하는 방법을 다룹니다.
+</Note>
+
+## 이 가이드가 필요한 경우
+
+프로젝트가 공개 PyPI가 아닌 프라이빗 레지스트리에 호스팅된 내부 또는 독점 Python 패키지에
+의존하는 경우, 다음을 수행해야 합니다:
+
+1. UV에 패키지를 **어디서** 찾을지 알려줍니다 (index URL)
+2. UV에 **어떤** 패키지가 해당 index에서 오는지 알려줍니다 (source 매핑)
+3. UV가 설치 중에 인증할 수 있도록 **자격 증명**을 제공합니다
+
+CrewAI AMP는 의존성 해결 및 설치에 [UV](https://docs.astral.sh/uv/)를 사용합니다.
+UV는 `pyproject.toml` 구성과 자격 증명용 환경 변수를 결합하여 인증된 프라이빗 레지스트리를 지원합니다.
+
+## 1단계: pyproject.toml 구성
+
+`pyproject.toml`에서 세 가지 요소가 함께 작동합니다:
+
+### 1a. 의존성 선언
+
+프라이빗 패키지를 다른 의존성과 마찬가지로 `[project.dependencies]`에 추가합니다:
+
+```toml
+[project]
+dependencies = [
+    "crewai[tools]>=0.100.1,<1.0.0",
+    "my-private-package>=1.2.0",
+]
+```
+
+### 1b. index 정의
+
+프라이빗 레지스트리를 `[[tool.uv.index]]` 아래에 명명된 index로 등록합니다:
+
+```toml
+[[tool.uv.index]]
+name = "my-private-registry"
+url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
+explicit = true
+```
+
+<Info>
+  `name` 필드는 중요합니다 — UV는 이를 사용하여 인증을 위한 환경 변수 이름을
+  구성합니다 (아래 [2단계](#2단계-인증-자격-증명-설정)를 참조하세요).
+
+  `explicit = true`를 설정하면 UV가 모든 패키지에 대해 이 index를 검색하지 않습니다 —
+  `[tool.uv.sources]`에서 명시적으로 매핑한 패키지만 검색합니다. 이렇게 하면 프라이빗
+  레지스트리에 대한 불필요한 쿼리를 방지하고 의존성 혼동 공격을 차단할 수 있습니다.
+</Info>
+
+### 1c. 패키지를 index에 매핑
+
+`[tool.uv.sources]`를 사용하여 프라이빗 index에서 해결해야 할 패키지를 UV에 알려줍니다:
+
+```toml
+[tool.uv.sources]
+my-private-package = { index = "my-private-registry" }
+```
+
+### 전체 예시
+
+```toml
+[project]
+name = "my-crew-project"
+version = "0.1.0"
+requires-python = ">=3.10,<3.14"
+dependencies = [
+    "crewai[tools]>=0.100.1,<1.0.0",
+    "my-private-package>=1.2.0",
+]
+
+[tool.crewai]
+type = "crew"
+
+[[tool.uv.index]]
+name = "my-private-registry"
+url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
+explicit = true
+
+[tool.uv.sources]
+my-private-package = { index = "my-private-registry" }
+```
+
+`pyproject.toml`을 업데이트한 후 lock 파일을 다시 생성합니다:
+
+```bash
+uv lock
+```
+
+<Warning>
+  업데이트된 `uv.lock`을 항상 `pyproject.toml` 변경 사항과 함께 커밋하세요.
+  lock 파일은 배포에 필수입니다 — [배포 준비하기](/ko/enterprise/guides/prepare-for-deployment)를 참조하세요.
+</Warning>
+
+## 2단계: 인증 자격 증명 설정
+
+UV는 `pyproject.toml`에서 정의한 index 이름을 기반으로 한 명명 규칙을 따르는
+환경 변수를 사용하여 프라이빗 index에 인증합니다:
+
+```
+UV_INDEX_{UPPER_NAME}_USERNAME
+UV_INDEX_{UPPER_NAME}_PASSWORD
+```
+
+여기서 `{UPPER_NAME}`은 index 이름을 **대문자**로 변환하고 **하이픈을 언더스코어로 대체**한 것입니다.
+
+예를 들어, `my-private-registry`라는 이름의 index는 다음을 사용합니다:
+
+| 변수 | 값 |
+|------|-----|
+| `UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME` | 레지스트리 사용자 이름 또는 토큰 이름 |
+| `UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD` | 레지스트리 비밀번호 또는 토큰/PAT |
+
+<Warning>
+  이 환경 변수는 CrewAI AMP **환경 변수** 설정을 통해 **반드시** 추가해야 합니다 —
+  전역적으로 또는 배포 수준에서. `.env` 파일에 설정하거나 프로젝트에 하드코딩할 수 없습니다.
+
+  아래 [AMP에서 환경 변수 설정](#amp에서-환경-변수-설정)을 참조하세요.
+</Warning>
+
+## 레지스트리 제공업체 참조
+
+아래 표는 일반적인 레지스트리 제공업체의 index URL 형식과 자격 증명 값을 보여줍니다.
+자리 표시자 값을 실제 조직 및 피드 세부 정보로 대체하세요.
+
+| 제공업체 | Index URL | 사용자 이름 | 비밀번호 |
+|---------|-----------|-----------|---------|
+| **Azure DevOps Artifacts** | `https://pkgs.dev.azure.com/{org}/_packaging/{feed}/pypi/simple/` | 비어 있지 않은 임의의 문자열 (예: `token`) | Packaging Read 범위의 Personal Access Token (PAT) |
+| **GitHub Packages** | `https://pypi.pkg.github.com/{owner}/simple/` | GitHub 사용자 이름 | `read:packages` 범위의 Personal Access Token (classic) |
+| **GitLab Package Registry** | `https://gitlab.com/api/v4/projects/{project_id}/packages/pypi/simple/` | `__token__` | `read_api` 범위의 Project 또는 Personal Access Token |
+| **AWS CodeArtifact** | `aws codeartifact get-repository-endpoint`의 URL 사용 | `aws` | `aws codeartifact get-authorization-token`의 토큰 |
+| **Google Artifact Registry** | `https://{region}-python.pkg.dev/{project}/{repo}/simple/` | `_json_key_base64` | Base64로 인코딩된 서비스 계정 키 |
+| **JFrog Artifactory** | `https://{instance}.jfrog.io/artifactory/api/pypi/{repo}/simple/` | 사용자 이름 또는 이메일 | API 키 또는 ID 토큰 |
+| **자체 호스팅 (devpi, Nexus 등)** | 레지스트리의 simple API URL | 레지스트리 사용자 이름 | 레지스트리 비밀번호 |
+
+<Tip>
+  **AWS CodeArtifact**의 경우 인증 토큰이 주기적으로 만료됩니다.
+  만료되면 `UV_INDEX_*_PASSWORD` 값을 갱신해야 합니다.
+  CI/CD 파이프라인에서 이를 자동화하는 것을 고려하세요.
+</Tip>
+
+## AMP에서 환경 변수 설정
+
+프라이빗 레지스트리 자격 증명은 CrewAI AMP에서 환경 변수로 구성해야 합니다.
+두 가지 옵션이 있습니다:
+
+<Tabs>
+  <Tab title="웹 인터페이스">
+    1. [CrewAI AMP](https://app.crewai.com)에 로그인합니다
+    2. 자동화로 이동합니다
+    3. **Environment Variables** 탭을 엽니다
+    4. 각 변수 (`UV_INDEX_*_USERNAME` 및 `UV_INDEX_*_PASSWORD`)에 값을 추가합니다
+
+    자세한 내용은 [AMP에 배포하기 — 환경 변수 설정하기](/ko/enterprise/guides/deploy-to-amp#환경-변수-설정하기) 단계를 참조하세요.
+  </Tab>
+  <Tab title="CLI 배포">
+    `crewai deploy create`를 실행하기 전에 로컬 `.env` 파일에 변수를 추가합니다.
+    CLI가 이를 안전하게 플랫폼으로 전송합니다:
+
+    ```bash
+    # .env
+    OPENAI_API_KEY=sk-...
+    UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
+    UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat-here
+    ```
+
+    ```bash
+    crewai deploy create
+    ```
+  </Tab>
+</Tabs>
+
+<Warning>
+  자격 증명을 저장소에 **절대** 커밋하지 마세요. 모든 비밀 정보에는 AMP 환경 변수를 사용하세요.
+  `.env` 파일은 `.gitignore`에 포함되어야 합니다.
+</Warning>
+
+기존 배포의 자격 증명을 업데이트하려면 [Crew 업데이트하기 — 환경 변수](/ko/enterprise/guides/update-crew)를 참조하세요.
+
+## 전체 동작 흐름
+
+CrewAI AMP가 자동화를 빌드할 때, 해결 흐름은 다음과 같이 작동합니다:
+
+<Steps>
+  <Step title="빌드 시작">
+    AMP가 저장소를 가져오고 `pyproject.toml`과 `uv.lock`을 읽습니다.
+  </Step>
+  <Step title="UV가 의존성 해결">
+    UV가 `[tool.uv.sources]`를 읽어 각 패키지가 어떤 index에서 와야 하는지 결정합니다.
+  </Step>
+  <Step title="UV가 인증">
+    각 프라이빗 index에 대해 UV가 AMP에서 구성한 환경 변수에서
+    `UV_INDEX_{NAME}_USERNAME`과 `UV_INDEX_{NAME}_PASSWORD`를 조회합니다.
+  </Step>
+  <Step title="패키지 설치">
+    UV가 공개(PyPI) 및 프라이빗(레지스트리) 패키지를 모두 다운로드하고 설치합니다.
+  </Step>
+  <Step title="자동화 실행">
+    모든 의존성이 사용 가능한 상태에서 crew 또는 flow가 시작됩니다.
+  </Step>
+</Steps>
+
+## 문제 해결
+
+### 빌드 중 인증 오류
+
+**증상**: 프라이빗 패키지를 해결할 때 `401 Unauthorized` 또는 `403 Forbidden`으로 빌드가 실패합니다.
+
+**확인사항**:
+- `UV_INDEX_*` 환경 변수 이름이 index 이름과 정확히 일치하는지 확인합니다 (대문자, 하이픈 -> 언더스코어)
+- 자격 증명이 로컬 `.env`뿐만 아니라 AMP 환경 변수에 설정되어 있는지 확인합니다
+- 토큰/PAT에 패키지 피드에 필요한 읽기 권한이 있는지 확인합니다
+- 토큰이 만료되지 않았는지 확인합니다 (특히 AWS CodeArtifact의 경우)
+
+### 패키지를 찾을 수 없음
+
+**증상**: `No matching distribution found for my-private-package`.
+
+**확인사항**:
+- `pyproject.toml`의 index URL이 `/simple/`로 끝나는지 확인합니다
+- `[tool.uv.sources]` 항목이 올바른 패키지 이름을 올바른 index 이름에 매핑하는지 확인합니다
+- 패키지가 실제로 프라이빗 레지스트리에 게시되어 있는지 확인합니다
+- 동일한 자격 증명으로 로컬에서 `uv lock`을 실행하여 해결이 작동하는지 확인합니다
+
+### Lock 파일 충돌
+
+**증상**: 프라이빗 index를 추가한 후 `uv lock`이 실패하거나 예상치 못한 결과를 생성합니다.
+
+**해결책**: 로컬에서 자격 증명을 설정하고 다시 생성합니다:
+
+```bash
+export UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
+export UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat
+uv lock
+```
+
+그런 다음 업데이트된 `uv.lock`을 커밋합니다.
+
+## 관련 가이드
+
+<CardGroup cols={3}>
+  <Card title="배포 준비하기" icon="clipboard-check" href="/ko/enterprise/guides/prepare-for-deployment">
+    배포 전에 프로젝트 구조와 의존성을 확인합니다.
+  </Card>
+  <Card title="AMP에 배포하기" icon="rocket" href="/ko/enterprise/guides/deploy-to-amp">
+    crew 또는 flow를 배포하고 환경 변수를 구성합니다.
+  </Card>
+  <Card title="Crew 업데이트하기" icon="arrows-rotate" href="/ko/enterprise/guides/update-crew">
+    환경 변수를 업데이트하고 실행 중인 배포에 변경 사항을 푸시합니다.
+  </Card>
+</CardGroup>
--- a/docs/ko/learn/human-feedback-in-flows.mdx
+++ b/docs/ko/learn/human-feedback-in-flows.mdx
@@ -98,33 +98,43 @@ def handle_feedback(self, result):
 `emit`을 지정하면, 데코레이터는 라우터가 됩니다. 인간의 자유 형식 피드백이 LLM에 의해 해석되어 지정된 outcome 중 하나로 매핑됩니다:

 ```python Code
-@start()
-@human_feedback(
-    message="이 콘텐츠의 출판을 승인하시겠습니까?",
-    emit=["approved", "rejected", "needs_revision"],
-    llm="gpt-4o-mini",
-    default_outcome="needs_revision",
-)
-def review_content(self):
-    return "블로그 게시물 초안 내용..."
+from crewai.flow.flow import Flow, start, listen, or_
+from crewai.flow.human_feedback import human_feedback

-@listen("approved")
-def publish(self, result):
-    print(f"출판 중! 사용자 의견: {result.feedback}")
+class ReviewFlow(Flow):
+    @start()
+    def generate_content(self):
+        return "블로그 게시물 초안 내용..."

-@listen("rejected")
-def discard(self, result):
-    print(f"폐기됨. 이유: {result.feedback}")
+    @human_feedback(
+        message="이 콘텐츠의 출판을 승인하시겠습니까?",
+        emit=["approved", "rejected", "needs_revision"],
+        llm="gpt-4o-mini",
+        default_outcome="needs_revision",
+    )
+    @listen(or_("generate_content", "needs_revision"))
+    def review_content(self):
+        return "블로그 게시물 초안 내용..."

-@listen("needs_revision")
-def revise(self, result):
-    print(f"다음을 기반으로 수정 중: {result.feedback}")
+    @listen("approved")
+    def publish(self, result):
+        print(f"출판 중! 사용자 의견: {result.feedback}")
+
+    @listen("rejected")
+    def discard(self, result):
+        print(f"폐기됨. 이유: {result.feedback}")
 ```

+사용자가 "더 자세한 내용이 필요합니다"와 같이 말하면, LLM이 이를 `"needs_revision"`으로 매핑하고, `or_()`를 통해 `review_content`가 다시 트리거됩니다 — 수정 루프가 생성됩니다. outcome이 `"approved"` 또는 `"rejected"`가 될 때까지 루프가 계속됩니다.
+
 <Tip>
 LLM은 가능한 경우 구조화된 출력(function calling)을 사용하여 응답이 지정된 outcome 중 하나임을 보장합니다. 이로 인해 라우팅이 신뢰할 수 있고 예측 가능해집니다.
 </Tip>

+<Warning>
+`@start()` 메서드는 flow 시작 시 한 번만 실행됩니다. 수정 루프가 필요한 경우, start 메서드를 review 메서드와 분리하고 review 메서드에 `@listen(or_("trigger", "revision_outcome"))`를 사용하여 self-loop을 활성화하세요.
+</Warning>
+
 ## HumanFeedbackResult

 `HumanFeedbackResult` 데이터클래스는 인간 피드백 상호작용에 대한 모든 정보를 포함합니다:
@@ -193,116 +203,162 @@ def summarize(self):
 <CodeGroup>

 ```python Code
-from crewai.flow.flow import Flow, start, listen
+from crewai.flow.flow import Flow, start, listen, or_
 from crewai.flow.human_feedback import human_feedback, HumanFeedbackResult
 from pydantic import BaseModel


 class ContentState(BaseModel):
-    topic: str = ""
    draft: str = ""
-    final_content: str = ""
    revision_count: int = 0
+    status: str = "pending"


 class ContentApprovalFlow(Flow[ContentState]):
-    """콘텐츠를 생성하고 인간의 승인을 받는 Flow입니다."""
+    """콘텐츠를 생성하고 승인될 때까지 반복하는 Flow."""

    @start()
-    def get_topic(self):
-        self.state.topic = input("어떤 주제에 대해 글을 쓸까요? ")
-        return self.state.topic
-
-    @listen(get_topic)
-    def generate_draft(self, topic):
-        # 실제 사용에서는 LLM을 호출합니다
-        self.state.draft = f"# {topic}\n\n{topic}에 대한 초안입니다..."
+    def generate_draft(self):
+        self.state.draft = "# AI 안전\n\nAI 안전에 대한 초안..."
        return self.state.draft

-    @listen(generate_draft)
    @human_feedback(
-        message="이 초안을 검토해 주세요. 'approved', 'rejected'로 답하거나 수정 피드백을 제공해 주세요:",
+        message="이 초안을 검토해 주세요. 승인, 거부 또는 변경이 필요한 사항을 설명해 주세요:",
        emit=["approved", "rejected", "needs_revision"],
        llm="gpt-4o-mini",
        default_outcome="needs_revision",
    )
-    def review_draft(self, draft):
-        return draft
+    @listen(or_("generate_draft", "needs_revision"))
+    def review_draft(self):
+        self.state.revision_count += 1
+        return f"{self.state.draft} (v{self.state.revision_count})"

    @listen("approved")
    def publish_content(self, result: HumanFeedbackResult):
-        self.state.final_content = result.output
-        print("\n✅ 콘텐츠가 승인되어 출판되었습니다!")
-        print(f"검토자 코멘트: {result.feedback}")
+        self.state.status = "published"
+        print(f"콘텐츠 승인 및 게시! 리뷰어 의견: {result.feedback}")
        return "published"

    @listen("rejected")
    def handle_rejection(self, result: HumanFeedbackResult):
-        print("\n❌ 콘텐츠가 거부되었습니다")
-        print(f"이유: {result.feedback}")
+        self.state.status = "rejected"
+        print(f"콘텐츠 거부됨. 이유: {result.feedback}")
        return "rejected"

-    @listen("needs_revision")
-    def revise_content(self, result: HumanFeedbackResult):
-        self.state.revision_count += 1
-        print(f"\n📝 수정 #{self.state.revision_count} 요청됨")
-        print(f"피드백: {result.feedback}")

-        # 실제 Flow에서는 generate_draft로 돌아갈 수 있습니다
-        # 이 예제에서는 단순히 확인합니다
-        return "revision_requested"
-
-
-# Flow 실행
 flow = ContentApprovalFlow()
 result = flow.kickoff()
-print(f"\nFlow 완료. 요청된 수정: {flow.state.revision_count}")
+print(f"\nFlow 완료. 상태: {flow.state.status}, 검토 횟수: {flow.state.revision_count}")
 ```

 ```text Output
-어떤 주제에 대해 글을 쓸까요? AI 안전
+==================================================
+OUTPUT FOR REVIEW:
+==================================================
+# AI 안전
+
+AI 안전에 대한 초안... (v1)
+==================================================
+
+이 초안을 검토해 주세요. 승인, 거부 또는 변경이 필요한 사항을 설명해 주세요:
+(Press Enter to skip, or type your feedback)
+
+Your feedback: 더 자세한 내용이 필요합니다

 ==================================================
 OUTPUT FOR REVIEW:
 ==================================================
 # AI 안전

-AI 안전에 대한 초안입니다...
+AI 안전에 대한 초안... (v2)
 ==================================================

-이 초안을 검토해 주세요. 'approved', 'rejected'로 답하거나 수정 피드백을 제공해 주세요:
+이 초안을 검토해 주세요. 승인, 거부 또는 변경이 필요한 사항을 설명해 주세요:
 (Press Enter to skip, or type your feedback)

 Your feedback: 좋아 보입니다, 승인!

-✅ 콘텐츠가 승인되어 출판되었습니다!
-검토자 코멘트: 좋아 보입니다, 승인!
+콘텐츠 승인 및 게시! 리뷰어 의견: 좋아 보입니다, 승인!

-Flow 완료. 요청된 수정: 0
+Flow 완료. 상태: published, 검토 횟수: 2
 ```

 </CodeGroup>

 ## 다른 데코레이터와 결합하기

-`@human_feedback` 데코레이터는 다른 Flow 데코레이터와 함께 작동합니다. 가장 안쪽 데코레이터(함수에 가장 가까운)로 배치하세요:
+`@human_feedback` 데코레이터는 `@start()`, `@listen()`, `or_()`와 함께 작동합니다. 데코레이터 순서는 두 가지 모두 동작합니다—프레임워크가 양방향으로 속성을 전파합니다—하지만 권장 패턴은 다음과 같습니다:

 ```python Code
-# 올바름: @human_feedback이 가장 안쪽(함수에 가장 가까움)
+# Flow 시작 시 일회성 검토 (self-loop 없음)
 @start()
-@human_feedback(message="이것을 검토해 주세요:")
+@human_feedback(message="이것을 검토해 주세요:", emit=["approved", "rejected"], llm="gpt-4o-mini")
 def my_start_method(self):
    return "content"

+# 리스너에서 선형 검토 (self-loop 없음)
 @listen(other_method)
-@human_feedback(message="이것도 검토해 주세요:")
+@human_feedback(message="이것도 검토해 주세요:", emit=["good", "bad"], llm="gpt-4o-mini")
 def my_listener(self, data):
    return f"processed: {data}"
+
+# Self-loop: 수정을 위해 반복할 수 있는 검토
+@human_feedback(message="승인 또는 수정 요청?", emit=["approved", "revise"], llm="gpt-4o-mini")
+@listen(or_("upstream_method", "revise"))
+def review_with_loop(self):
+    return "content for review"
 ```

-<Tip>
-`@human_feedback`를 가장 안쪽 데코레이터(마지막/함수에 가장 가까움)로 배치하여 메서드를 직접 래핑하고 Flow 시스템에 전달하기 전에 반환 값을 캡처할 수 있도록 하세요.
-</Tip>
+### Self-loop 패턴
+
+수정 루프를 만들려면 `or_()`를 사용하여 검토 메서드가 **상위 트리거**와 **자체 수정 outcome**을 모두 리스닝해야 합니다:
+
+```python Code
+@start()
+def generate(self):
+    return "initial draft"
+
+@human_feedback(
+    message="승인하시겠습니까, 아니면 변경을 요청하시겠습니까?",
+    emit=["revise", "approved"],
+    llm="gpt-4o-mini",
+    default_outcome="approved",
+)
+@listen(or_("generate", "revise"))
+def review(self):
+    return "content"
+
+@listen("approved")
+def publish(self):
+    return "published"
+```
+
+outcome이 `"revise"`이면 flow가 `review`로 다시 라우팅됩니다 (`or_()`를 통해 `"revise"`를 리스닝하기 때문). outcome이 `"approved"`이면 flow가 `publish`로 계속됩니다. flow 엔진이 라우터를 "한 번만 실행" 규칙에서 제외하여 각 루프 반복마다 재실행할 수 있기 때문에 이 패턴이 동작합니다.
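
The routing described above can be pictured with a toy dispatcher. This is a sketch in plain Python under stated assumptions, not CrewAI's flow engine: `run_toy_flow`, the `listeners` table, and the scripted outcomes are all hypothetical stand-ins for the decorators and for the human reviewer.

```python
from collections import deque


def run_toy_flow(scripted_outcomes):
    """Simulate generate -> review (self-loop on 'revise') -> publish."""
    outcomes = iter(scripted_outcomes)
    trace = []

    def generate():
        trace.append("generate")
        return "generate"  # finishing a plain method fires its own name

    def review():
        trace.append("review")
        return next(outcomes)  # a router fires its outcome instead

    def publish():
        trace.append("publish")
        return None  # nothing listens to publish in this toy model

    # Trigger name -> methods listening for it. review is registered under two
    # triggers, the role or_("generate", "revise") plays in the real example.
    listeners = {"generate": [review], "revise": [review], "approved": [publish]}

    queue = deque([generate])  # the start method is queued exactly once
    while queue:
        fired = queue.popleft()()
        queue.extend(listeners.get(fired, []))
    return trace


print(run_toy_flow(["revise", "revise", "approved"]))
```

With two `"revise"` outcomes followed by `"approved"`, the trace shows `generate` once, `review` three times, and `publish` once. The point of the model is that the router is re-queued every time its own outcome fires, while the start method never is.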
+
+### 체인된 라우터
+
+한 라우터의 outcome으로 트리거된 리스너가 그 자체로 라우터가 될 수 있습니다:
+
+```python Code
+@start()
+@human_feedback(message="첫 번째 검토:", emit=["approved", "rejected"], llm="gpt-4o-mini")
+def draft(self):
+    return "draft content"
+
+@listen("approved")
+@human_feedback(message="최종 검토:", emit=["publish", "revise"], llm="gpt-4o-mini")
+def final_review(self, prev):
+    return "final content"
+
+@listen("publish")
+def on_publish(self, prev):
+    return "published"
+```
+
+### 제한 사항
+
+- **`@start()` 메서드는 한 번만 실행**: `@start()` 메서드는 self-loop할 수 없습니다. 수정 주기가 필요하면 별도의 `@start()` 메서드를 진입점으로 사용하고 `@listen()` 메서드에 `@human_feedback`를 배치하세요.
+- **동일 메서드에 `@start()` + `@listen()` 불가**: 이는 Flow 프레임워크 제약입니다. 메서드는 시작점이거나 리스너여야 하며, 둘 다일 수 없습니다.

 ## 모범 사례

@@ -516,9 +572,9 @@ class ContentPipeline(Flow):
    @start()
    @human_feedback(
        message="이 콘텐츠의 출판을 승인하시겠습니까?",
-        emit=["approved", "rejected", "needs_revision"],
+        emit=["approved", "rejected"],
        llm="gpt-4o-mini",
-        default_outcome="needs_revision",
+        default_outcome="rejected",
        provider=SlackNotificationProvider("#content-reviews"),
    )
    def generate_content(self):
@@ -534,11 +590,6 @@ class ContentPipeline(Flow):
        print(f"보관됨. 이유: {result.feedback}")
        return {"status": "archived"}

-    @listen("needs_revision")
-    def queue_revision(self, result):
-        print(f"수정 대기열에 추가됨: {result.feedback}")
-        return {"status": "revision_needed"}
-

 # Flow 시작 (Slack 응답을 기다리며 일시 중지)
 def start_content_pipeline():
@@ -594,22 +645,22 @@ async def on_slack_feedback_async(flow_id: str, slack_message: str):
 ```python Code
 class ArticleReviewFlow(Flow):
    @start()
-    @human_feedback(
-        message="Review this article draft:",
-        emit=["approved", "needs_revision"],
-        llm="gpt-4o-mini",
-        learn=True,  # HITL 학습 활성화
-    )
    def generate_article(self):
        return self.crew.kickoff(inputs={"topic": "AI Safety"}).raw

+    @human_feedback(
+        message="이 글 초안을 검토해 주세요:",
+        emit=["approved", "needs_revision"],
+        llm="gpt-4o-mini",
+        learn=True,
+    )
+    @listen(or_("generate_article", "needs_revision"))
+    def review_article(self):
+        return self.last_human_feedback.output if self.last_human_feedback else "article draft"
+
    @listen("approved")
    def publish(self):
        print(f"Publishing: {self.last_human_feedback.output}")
-
-    @listen("needs_revision")
-    def revise(self):
-        print("Revising based on feedback...")
 ```

 **첫 번째 실행**: 인간이 원시 출력을 보고 "사실에 대한 주장에는 항상 인용을 포함하세요."라고 말합니다. 교훈이 추출되어 메모리에 저장됩니다.
--- a/docs/ko/learn/llm-connections.mdx
+++ b/docs/ko/learn/llm-connections.mdx
@@ -7,7 +7,7 @@ mode: "wide"

 ## CrewAI를 LLM에 연결하기

-CrewAI는 LiteLLM을 사용하여 다양한 언어 모델(LLM)에 연결합니다. 이 통합은 높은 다양성을 제공하여, 여러 공급자의 모델을 간단하고 통합된 인터페이스로 사용할 수 있게 해줍니다.
+CrewAI는 가장 인기 있는 제공자(OpenAI, Anthropic, Google Gemini, Azure, AWS Bedrock)에 대해 네이티브 SDK 통합을 통해 LLM에 연결하며, 그 외 모든 제공자에 대해서는 LiteLLM을 유연한 폴백으로 사용합니다.

 <Note>
    기본적으로 CrewAI는 `gpt-4o-mini` 모델을 사용합니다. 이는 `OPENAI_MODEL_NAME` 환경 변수에 의해 결정되며, 설정되지 않은 경우 기본값은 "gpt-4o-mini"입니다.
@@ -41,6 +41,14 @@ LiteLLM은 다음을 포함하되 이에 국한되지 않는 다양한 프로바

 지원되는 프로바이더의 전체 및 최신 목록은 [LiteLLM 프로바이더 문서](https://docs.litellm.ai/docs/providers)를 참조하세요.

+<Info>
+  네이티브 통합에서 지원하지 않는 제공자를 사용하려면 LiteLLM을 프로젝트에 의존성으로 추가하세요:
+  ```bash
+  uv add 'crewai[litellm]'
+  ```
+  네이티브 제공자(OpenAI, Anthropic, Google Gemini, Azure, AWS Bedrock)는 자체 SDK extras를 사용합니다 — [공급자 구성 예시](/ko/concepts/llms#공급자-구성-예시)를 참조하세요.
+</Info>
+
 ## LLM 변경하기

 CrewAI agent에서 다른 LLM을 사용하려면 여러 가지 방법이 있습니다:
--- a/docs/ko/observability/tracing.mdx
+++ b/docs/ko/observability/tracing.mdx
@@ -35,7 +35,7 @@ crewai login
 아직 설치하지 않았다면 CLI 도구와 함께 CrewAI를 설치하세요:

 ```bash
-uv add crewai[tools]
+uv add 'crewai[tools]'
 ```

 그런 다음 CrewAI AMP 계정으로 CLI를 인증하세요:
--- a/docs/pt-BR/changelog.mdx
+++ b/docs/pt-BR/changelog.mdx
@@ -4,6 +4,106 @@ description: "Atualizações de produto, melhorias e correções do CrewAI"
 icon: "clock"
 mode: "wide"
 ---
+<Update label="27 fev 2026">
+  ## v1.10.1a1
+
+  [Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1a1)
+
+  ## O que Mudou
+
+  ### Funcionalidades
+  - Implementar suporte a invocação assíncrona em métodos de callback de etapas
+  - Implementar carregamento sob demanda para dependências pesadas no módulo de Memória
+
+  ### Documentação
+  - Atualizar changelog e versão para v1.10.0
+
+  ### Refatoração
+  - Refatorar métodos de callback de etapas para suportar invocação assíncrona
+  - Refatorar para implementar carregamento sob demanda para dependências pesadas no módulo de Memória
+
+  ### Correções de Bugs
+  - Corrigir branch para notas de lançamento
+
+  ## Contribuidores
+
+  @greysonlalonde, @joaomdmoura
+
+</Update>
+
+<Update label="27 fev 2026">
+  ## v1.10.1a1
+
+  [Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.1a1)
+
+  ## O que Mudou
+
+  ### Refatoração
+  - Refatorar métodos de callback de etapas para suportar invocação assíncrona
+  - Implementar carregamento sob demanda para dependências pesadas no módulo de Memória
+
+  ### Documentação
+  - Atualizar changelog e versão para v1.10.0
+
+  ### Correções de Bugs
+  - Criar branch para notas de lançamento
+
+  ## Contribuidores
+
+  @greysonlalonde, @joaomdmoura
+
+</Update>
+
+<Update label="26 fev 2026">
+  ## v1.10.0
+
+  [Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.10.0)
+
+  ## O que Mudou
+
+  ### Recursos
+  - Aprimorar a resolução da ferramenta MCP e eventos relacionados
+  - Atualizar a versão do lancedb e adicionar pacotes lance-namespace
+  - Aprimorar a análise e validação de argumentos JSON no CrewAgentExecutor e BaseTool
+  - Migrar o cliente HTTP da CLI de requests para httpx
+  - Adicionar documentação versionada
+  - Adicionar detecção de versões removidas para notas de versão
+  - Implementar tratamento de entrada do usuário em Flows
+  - Aprimorar a funcionalidade de auto-loop HITL nos testes de integração de feedback humano
+  - Adicionar started_event_id e definir no eventbus
+  - Atualizar automaticamente tools.specs
+
+  ### Correções de Bugs
+  - Validar kwargs da ferramenta mesmo quando vazios para evitar TypeError crípticos
+  - Preservar tipos nulos nos esquemas de parâmetros da ferramenta para LLM
+  - Mapear output_pydantic/output_json para saída estruturada nativa
+  - Garantir que callbacks sejam executados/aguardados se forem promessas
+  - Capturar o nome do método no contexto da exceção
+  - Preservar tipo enum no resultado do roteador; melhorar tipos
+  - Corrigir fluxos cíclicos que quebram silenciosamente quando o ID de persistência é passado nas entradas
+  - Corrigir o formato da flag da CLI de --skip-provider para --skip_provider
+  - Garantir que o fluxo de chamada da ferramenta OpenAI seja finalizado
+  - Resolver ponteiros $ref de esquema complexos nas ferramentas MCP
+  - Impor additionalProperties=false nos esquemas
+  - Rejeitar nomes de scripts reservados para pastas de equipe
+  - Resolver condição de corrida no teste de emissão de eventos de guardrail
+
+  ### Documentação
+  - Adicionar nota de dependência litellm para provedores de LLM não nativos
+  - Esclarecer o modelo de segurança NL2SQL e orientações de fortalecimento
+  - Adicionar 96 ações ausentes em 9 integrações
+
+  ### Refatoração
+  - Refatorar crew para provider
+  - Extrair HITL para padrão de provider
+  - Melhorar tipagem e registro de hooks
+
+  ## Contribuidores
+
+  @dependabot[bot], @github-actions[bot], @github-code-quality[bot], @greysonlalonde, @heitorado, @hobostay, @joaomdmoura, @johnvan7, @jonathansampson, @lorenzejay, @lucasgomide, @mattatcha, @mplachta, @nicoferdi96, @theCyberTech, @thiagomoretto, @vinibrsl
+
+</Update>
+
 <Update label="26 jan 2026">
  ## v1.9.0

--- a/docs/pt-BR/concepts/llms.mdx
+++ b/docs/pt-BR/concepts/llms.mdx
@@ -105,6 +105,15 @@ Existem diferentes locais no código do CrewAI onde você pode especificar o mod
  </Tab>
 </Tabs>

+<Info>
+  O CrewAI oferece integrações nativas via SDK para OpenAI, Anthropic, Google (Gemini API), Azure e AWS Bedrock — sem necessidade de instalação extra além dos extras específicos do provedor (ex.: `uv add "crewai[openai]"`).
+
+  Todos os outros provedores são alimentados pelo **LiteLLM**. Se você planeja usar algum deles, adicione-o como dependência ao seu projeto:
+  ```bash
+  uv add 'crewai[litellm]'
+  ```
+</Info>
+
 ## Exemplos de Configuração de Provedores

 O CrewAI suporta uma grande variedade de provedores de LLM, cada um com recursos, métodos de autenticação e capacidades de modelo únicos.
@@ -214,6 +223,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
    | `meta_llama/Llama-4-Maverick-17B-128E-Instruct-FP8` | 128k | 4028 | Texto, Imagem | Texto |
    | `meta_llama/Llama-3.3-70B-Instruct` | 128k | 4028 | Texto | Texto |
    | `meta_llama/Llama-3.3-8B-Instruct` | 128k | 4028 | Texto | Texto |
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Anthropic">
@@ -354,6 +368,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
    | gemini-1.5-flash                 | 1M tokens          | Modelo multimodal equilibrado, bom para maioria das tarefas         |
    | gemini-1.5-flash-8B              | 1M tokens          | Mais rápido, mais eficiente em custo, adequado para tarefas de alta frequência |
    | gemini-1.5-pro                   | 2M tokens          | Melhor desempenho para uma ampla variedade de tarefas de raciocínio, incluindo lógica, codificação e colaboração criativa |
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Azure">
@@ -438,6 +457,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
        model="sagemaker/<my-endpoint>"
    )
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Mistral">
@@ -453,6 +477,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
        temperature=0.7
    )
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Nvidia NIM">
@@ -539,6 +568,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
    | rakuten/rakutenai-7b-instruct                                            | 1.024 tokens       | LLM topo de linha, compreensão, raciocínio e geração textual.|
    | rakuten/rakutenai-7b-chat                                                | 1.024 tokens       | LLM topo de linha, compreensão, raciocínio e geração textual.|
    | baichuan-inc/baichuan2-13b-chat                                          | 4.096 tokens       | Suporte a chat em chinês/inglês, programação, matemática, seguir instruções, resolver quizzes.|
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Local NVIDIA NIM Deployed using WSL2">
@@ -579,6 +613,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co

        # ...
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Groq">
@@ -600,6 +639,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
    | Llama 3.1 70B/8B  | 131.072 tokens      | Alta performance e tarefas de contexto grande|
    | Llama 3.2 Série   | 8.192 tokens        | Tarefas gerais                          |
    | Mixtral 8x7B      | 32.768 tokens       | Equilíbrio entre performance e contexto  |
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="IBM watsonx.ai">
@@ -622,6 +666,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
        base_url="https://api.watsonx.ai/v1"
    )
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Ollama (LLMs Locais)">
@@ -635,6 +684,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
        base_url="http://localhost:11434"
    )
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Fireworks AI">
@@ -650,6 +704,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
        temperature=0.7
    )
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Perplexity AI">
@@ -665,6 +724,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
        base_url="https://api.perplexity.ai/"
    )
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Hugging Face">
@@ -679,6 +743,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
        model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct"
    )
    ```
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="SambaNova">
@@ -702,6 +771,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
    | Llama 3.2 Série   | 8.192 tokens              | Tarefas gerais e multimodais                 |
    | Llama 3.3 70B     | Até 131.072 tokens        | Desempenho e qualidade de saída elevada      |
    | Família Qwen2     | 8.192 tokens              | Desempenho e qualidade de saída elevada      |
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Cerebras">
@@ -727,6 +801,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
      - Equilíbrio entre velocidade e qualidade
      - Suporte a longas janelas de contexto
    </Info>
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>

  <Accordion title="Open Router">
@@ -749,6 +828,11 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
      - openrouter/deepseek/deepseek-r1
      - openrouter/deepseek/deepseek-chat
    </Info>
+
+    **Nota:** Este provedor usa o LiteLLM. Adicione-o como dependência ao seu projeto:
+    ```bash
+    uv add 'crewai[litellm]'
+    ```
  </Accordion>
 </AccordionGroup>

--- a/docs/pt-BR/enterprise/features/flow-hitl-management.mdx
+++ b/docs/pt-BR/enterprise/features/flow-hitl-management.mdx
@@ -38,22 +38,21 @@ O CrewAI Enterprise oferece um sistema abrangente de gerenciamento Human-in-the-
 Configure checkpoints de revisão humana em seus Flows usando o decorador `@human_feedback`. Quando a execução atinge um ponto de revisão, o sistema pausa, notifica o responsável via email e aguarda uma resposta.

 ```python
-from crewai.flow.flow import Flow, start, listen
+from crewai.flow.flow import Flow, start, listen, or_
 from crewai.flow.human_feedback import human_feedback, HumanFeedbackResult

 class ContentApprovalFlow(Flow):
    @start()
    def generate_content(self):
-        # IA gera conteúdo
        return "Texto de marketing gerado para campanha Q1..."

-    @listen(generate_content)
    @human_feedback(
        message="Por favor, revise este conteúdo para conformidade com a marca:",
        emit=["approved", "rejected", "needs_revision"],
    )
-    def review_content(self, content):
-        return content
+    @listen(or_("generate_content", "needs_revision"))
+    def review_content(self):
+        return "Texto de marketing para revisão..."

    @listen("approved")
    def publish_content(self, result: HumanFeedbackResult):
@@ -62,10 +61,6 @@ class ContentApprovalFlow(Flow):
    @listen("rejected")
    def archive_content(self, result: HumanFeedbackResult):
        print(f"Conteúdo rejeitado. Motivo: {result.feedback}")
-
-    @listen("needs_revision")
-    def revise_content(self, result: HumanFeedbackResult):
-        print(f"Revisão solicitada: {result.feedback}")
 ```

 Para detalhes completos de implementação, consulte o guia [Feedback Humano em Flows](/pt-BR/learn/human-feedback-in-flows).
--- a/docs/pt-BR/enterprise/guides/deploy-to-amp.mdx
+++ b/docs/pt-BR/enterprise/guides/deploy-to-amp.mdx
@@ -176,6 +176,11 @@ Você precisa enviar seu crew para um repositório do GitHub. Caso ainda não te
      ![Definir Variáveis de Ambiente](/images/enterprise/set-env-variables.png)
    </Frame>

+    <Info>
+      Usando pacotes Python privados? Você também precisará adicionar suas credenciais de registro aqui.
+      Consulte [Registros de Pacotes Privados](/pt-BR/enterprise/guides/private-package-registry) para as variáveis necessárias.
+    </Info>
+
  </Step>

  <Step title="Implante Seu Crew">
--- a/docs/pt-BR/enterprise/guides/prepare-for-deployment.mdx
+++ b/docs/pt-BR/enterprise/guides/prepare-for-deployment.mdx
@@ -256,6 +256,12 @@ Antes da implantação, certifique-se de ter:
 1. **Chaves de API de LLM** prontas (OpenAI, Anthropic, Google, etc.)
 2. **Chaves de API de ferramentas** se estiver usando ferramentas externas (Serper, etc.)

+<Info>
+  Se seu projeto depende de pacotes de um **registro PyPI privado**, você também precisará configurar
+  credenciais de autenticação do registro como variáveis de ambiente. Consulte o guia
+  [Registros de Pacotes Privados](/pt-BR/enterprise/guides/private-package-registry) para mais detalhes.
+</Info>
+
 <Tip>
  Teste seu projeto localmente com as mesmas variáveis de ambiente antes de implantar
  para detectar problemas de configuração antecipadamente.
--- a/docs/pt-BR/enterprise/guides/private-package-registry.mdx
+++ b/docs/pt-BR/enterprise/guides/private-package-registry.mdx
@@ -0,0 +1,263 @@
+---
+title: "Registros de Pacotes Privados"
+description: "Instale pacotes Python privados de registros PyPI autenticados no CrewAI AMP"
+icon: "lock"
+mode: "wide"
+---
+
+<Note>
+  Este guia aborda como configurar seu projeto CrewAI para instalar pacotes Python
+  de registros PyPI privados (Azure DevOps Artifacts, GitHub Packages, GitLab, AWS CodeArtifact, etc.)
+  ao implantar no CrewAI AMP.
+</Note>
+
+## Quando Você Precisa Disso
+
+Se seu projeto depende de pacotes Python internos ou proprietários hospedados em um registro privado
+em vez do PyPI público, você precisará:
+
+1. Informar ao UV **onde** encontrar o pacote (uma URL de index)
+2. Informar ao UV **quais** pacotes vêm desse index (um mapeamento de source)
+3. Fornecer **credenciais** para que o UV possa autenticar durante a instalação
+
+O CrewAI AMP usa [UV](https://docs.astral.sh/uv/) para resolução e instalação de dependências.
+O UV suporta registros privados autenticados por meio da configuração do `pyproject.toml` combinada
+com variáveis de ambiente para credenciais.
+
+## Passo 1: Configurar o pyproject.toml
+
+Três elementos trabalham juntos no seu `pyproject.toml`:
+
+### 1a. Declarar a dependência
+
+Adicione o pacote privado ao seu `[project.dependencies]` como qualquer outra dependência:
+
+```toml
+[project]
+dependencies = [
+    "crewai[tools]>=0.100.1,<1.0.0",
+    "my-private-package>=1.2.0",
+]
+```
+
+### 1b. Definir o index
+
+Registre seu registro privado como um index nomeado em `[[tool.uv.index]]`:
+
+```toml
+[[tool.uv.index]]
+name = "my-private-registry"
+url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
+explicit = true
+```
+
+<Info>
+  O campo `name` é importante — o UV o utiliza para construir os nomes das variáveis de ambiente
+  para autenticação (veja o [Passo 2](#passo-2-configurar-credenciais-de-autenticação) abaixo).
+
+  Definir `explicit = true` significa que o UV não consultará esse index para todos os pacotes — apenas
+  os que você mapear explicitamente em `[tool.uv.sources]`. Isso evita consultas desnecessárias
+  ao seu registro privado e protege contra ataques de confusão de dependências.
+</Info>
+
+### 1c. Mapear o pacote para o index
+
+Informe ao UV quais pacotes devem ser resolvidos a partir do seu index privado usando `[tool.uv.sources]`:
+
+```toml
+[tool.uv.sources]
+my-private-package = { index = "my-private-registry" }
+```
+
+### Exemplo completo
+
+```toml
+[project]
+name = "my-crew-project"
+version = "0.1.0"
+requires-python = ">=3.10,<3.14"
+dependencies = [
+    "crewai[tools]>=0.100.1,<1.0.0",
+    "my-private-package>=1.2.0",
+]
+
+[tool.crewai]
+type = "crew"
+
+[[tool.uv.index]]
+name = "my-private-registry"
+url = "https://pkgs.dev.azure.com/my-org/_packaging/my-feed/pypi/simple/"
+explicit = true
+
+[tool.uv.sources]
+my-private-package = { index = "my-private-registry" }
+```
+
+Após atualizar o `pyproject.toml`, regenere seu arquivo lock:
+
+```bash
+uv lock
+```
+
+<Warning>
+  Sempre faça commit do `uv.lock` atualizado junto com as alterações no `pyproject.toml`.
+  O arquivo lock é obrigatório para implantação — veja [Preparar para Implantação](/pt-BR/enterprise/guides/prepare-for-deployment).
+</Warning>
+
+## Passo 2: Configurar Credenciais de Autenticação
+
+O UV autentica em indexes privados usando variáveis de ambiente que seguem uma convenção de nomenclatura
+baseada no nome do index que você definiu no `pyproject.toml`:
+
+```
+UV_INDEX_{UPPER_NAME}_USERNAME
+UV_INDEX_{UPPER_NAME}_PASSWORD
+```
+
+Onde `{UPPER_NAME}` é o nome do seu index convertido para **maiúsculas** com **hifens substituídos por underscores**.
+
+Por exemplo, um index chamado `my-private-registry` usa:
+
+| Variável | Valor |
+|----------|-------|
+| `UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME` | Seu nome de usuário ou nome do token do registro |
+| `UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD` | Sua senha ou token/PAT do registro |
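
The naming convention above is easy to get wrong by hand, so a small helper can derive the variable names from the index name. This is a minimal sketch: `uv_index_env_names` and `missing_credentials` are hypothetical helpers, not part of uv or CrewAI, and they implement only the rule stated here (uppercase, hyphens replaced by underscores).

```python
import os


def uv_index_env_names(index_name: str) -> tuple[str, str]:
    """Derive the credential variable names for a named uv index."""
    # Uppercase the name and replace hyphens with underscores.
    upper = index_name.upper().replace("-", "_")
    return f"UV_INDEX_{upper}_USERNAME", f"UV_INDEX_{upper}_PASSWORD"


def missing_credentials(index_name: str, environ=os.environ) -> list[str]:
    """List the expected variables that are unset or empty in `environ`."""
    return [name for name in uv_index_env_names(index_name) if not environ.get(name)]


print(uv_index_env_names("my-private-registry"))

# Example environment with only the username set (placeholder value).
partial_env = {"UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME": "token"}
print(missing_credentials("my-private-registry", partial_env))
```

Calling `missing_credentials("my-private-registry")` with no second argument checks the current process environment, which is a quick way to confirm local setup before running `uv lock`.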
+
+<Warning>
+  Essas variáveis de ambiente **devem** ser adicionadas pelas configurações de **Variáveis de Ambiente** do CrewAI AMP —
+  globalmente ou no nível da implantação. Elas não podem ser definidas em arquivos `.env` ou codificadas no seu projeto.
+
+  Veja [Configurar Variáveis de Ambiente no AMP](#configurar-variáveis-de-ambiente-no-amp) abaixo.
+</Warning>
+
+## Referência de Provedores de Registro
+
+A tabela abaixo mostra o formato da URL de index e os valores de credenciais para provedores de registro comuns.
+Substitua os valores de exemplo pelos detalhes reais da sua organização e feed.
+
+| Provedor | URL do Index | Usuário | Senha |
+|----------|-------------|---------|-------|
+| **Azure DevOps Artifacts** | `https://pkgs.dev.azure.com/{org}/_packaging/{feed}/pypi/simple/` | Qualquer string não vazia (ex: `token`) | Personal Access Token (PAT) com escopo Packaging Read |
+| **GitHub Packages** | `https://pypi.pkg.github.com/{owner}/simple/` | Nome de usuário do GitHub | Personal Access Token (classic) com escopo `read:packages` |
+| **GitLab Package Registry** | `https://gitlab.com/api/v4/projects/{project_id}/packages/pypi/simple/` | `__token__` | Project ou Personal Access Token com escopo `read_api` |
+| **AWS CodeArtifact** | Use a URL de `aws codeartifact get-repository-endpoint` | `aws` | Token de `aws codeartifact get-authorization-token` |
+| **Google Artifact Registry** | `https://{region}-python.pkg.dev/{project}/{repo}/simple/` | `_json_key_base64` | Chave de conta de serviço codificada em Base64 |
+| **JFrog Artifactory** | `https://{instance}.jfrog.io/artifactory/api/pypi/{repo}/simple/` | Nome de usuário ou email | Chave API ou token de identidade |
+| **Auto-hospedado (devpi, Nexus, etc.)** | URL da API simple do seu registro | Nome de usuário do registro | Senha do registro |
+
+<Tip>
+  Para **AWS CodeArtifact**, o token de autorização expira periodicamente.
+  Você precisará atualizar o valor de `UV_INDEX_*_PASSWORD` quando ele expirar.
+  Considere automatizar isso no seu pipeline de CI/CD.
+</Tip>
+
+## Configurar Variáveis de Ambiente no AMP
+
+As credenciais do registro privado devem ser configuradas como variáveis de ambiente no CrewAI AMP.
+Você tem duas opções:
+
+<Tabs>
+  <Tab title="Interface Web">
+    1. Faça login no [CrewAI AMP](https://app.crewai.com)
+    2. Navegue até sua automação
+    3. Abra a aba **Environment Variables**
+    4. Adicione cada variável (`UV_INDEX_*_USERNAME` e `UV_INDEX_*_PASSWORD`) com seu valor
+
+    Veja o passo [Deploy para AMP — Definir Variáveis de Ambiente](/pt-BR/enterprise/guides/deploy-to-amp#definir-as-variáveis-de-ambiente) para detalhes.
+  </Tab>
+  <Tab title="Implantação via CLI">
+    Adicione as variáveis ao seu arquivo `.env` local antes de executar `crewai deploy create`.
+    A CLI as transferirá com segurança para a plataforma:
+
+    ```bash
+    # .env
+    OPENAI_API_KEY=sk-...
+    UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
+    UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat-here
+    ```
+
+    ```bash
+    crewai deploy create
+    ```
+  </Tab>
+</Tabs>
+
+<Warning>
+  **Nunca** faça commit de credenciais no seu repositório. Use variáveis de ambiente do AMP para todos os segredos.
+  O arquivo `.env` deve estar listado no `.gitignore`.
+</Warning>
+
+Para atualizar credenciais em uma implantação existente, veja [Atualizar Seu Crew — Variáveis de Ambiente](/pt-BR/enterprise/guides/update-crew).
+
+## Como Tudo se Conecta
+
+Quando o CrewAI AMP faz o build da sua automação, o fluxo de resolução funciona assim:
+
+<Steps>
+  <Step title="Build inicia">
+    O AMP busca seu repositório e lê o `pyproject.toml` e o `uv.lock`.
+  </Step>
+  <Step title="UV resolve dependências">
+    O UV lê `[tool.uv.sources]` para determinar de qual index cada pacote deve vir.
+  </Step>
+  <Step title="UV autentica">
+    Para cada index privado, o UV busca `UV_INDEX_{NAME}_USERNAME` e `UV_INDEX_{NAME}_PASSWORD`
+    nas variáveis de ambiente que você configurou no AMP.
+  </Step>
+  <Step title="Pacotes são instalados">
+    O UV baixa e instala todos os pacotes — tanto públicos (do PyPI) quanto privados (do seu registro).
+  </Step>
+  <Step title="Automação executa">
+    Seu crew ou flow inicia com todas as dependências disponíveis.
+  </Step>
+</Steps>
+
+## Solução de Problemas
+
+### Erros de Autenticação Durante o Build
+
+**Sintoma**: Build falha com `401 Unauthorized` ou `403 Forbidden` ao resolver um pacote privado.
+
+**Verifique**:
+- Os nomes das variáveis de ambiente `UV_INDEX_*` correspondem exatamente ao nome do seu index (maiúsculas, hifens -> underscores)
+- As credenciais estão definidas nas variáveis de ambiente do AMP, não apenas em um `.env` local
+- Seu token/PAT tem as permissões de leitura necessárias para o feed de pacotes
+- O token não expirou (especialmente relevante para AWS CodeArtifact)
+
+### Pacote Não Encontrado
+
+**Sintoma**: `No matching distribution found for my-private-package`.
+
+**Verifique**:
+- A URL do index no `pyproject.toml` termina com `/simple/`
+- A entrada `[tool.uv.sources]` mapeia o nome correto do pacote para o nome correto do index
+- O pacote está realmente publicado no seu registro privado
+- Execute `uv lock` localmente com as mesmas credenciais para verificar se a resolução funciona
+
+### Conflitos no Arquivo Lock
+
+**Sintoma**: `uv lock` falha ou produz resultados inesperados após adicionar um index privado.
+
+**Solução**: Defina as credenciais localmente e regenere:
+
+```bash
+export UV_INDEX_MY_PRIVATE_REGISTRY_USERNAME=token
+export UV_INDEX_MY_PRIVATE_REGISTRY_PASSWORD=your-pat
+uv lock
+```
+
+Em seguida, faça commit do `uv.lock` atualizado.
+
+## Guias Relacionados
+
+<CardGroup cols={3}>
+  <Card title="Preparar para Implantação" icon="clipboard-check" href="/pt-BR/enterprise/guides/prepare-for-deployment">
+    Verifique a estrutura do projeto e as dependências antes de implantar.
+  </Card>
+  <Card title="Deploy para AMP" icon="rocket" href="/pt-BR/enterprise/guides/deploy-to-amp">
+    Implante seu crew ou flow e configure variáveis de ambiente.
+  </Card>
+  <Card title="Atualizar Seu Crew" icon="arrows-rotate" href="/pt-BR/enterprise/guides/update-crew">
+    Atualize variáveis de ambiente e envie alterações para uma implantação em execução.
+  </Card>
+</CardGroup>
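
Review note on the private-registry guide added above: the guide says the `UV_INDEX_{UPPER_NAME}_*` variable names come from upper-casing the index name and replacing hyphens with underscores. A minimal sketch of that rule follows. The helper name `uv_index_env_names` is hypothetical and is not part of uv or CrewAI. It implements only the rule as the guide words it, so index names containing other punctuation are not covered.

```python
def uv_index_env_names(index_name: str) -> tuple[str, str]:
    """Return the (username, password) variable names for a named index.

    Hypothetical helper: it applies only the rule stated in the guide
    (upper-case the name, replace hyphens with underscores).
    """
    upper_name = index_name.upper().replace("-", "_")
    return (
        f"UV_INDEX_{upper_name}_USERNAME",
        f"UV_INDEX_{upper_name}_PASSWORD",
    )


username_var, password_var = uv_index_env_names("my-private-registry")
print(username_var)
print(password_var)
```

Checking the generated names against the ones configured in AMP is a quick way to rule out the naming mismatch listed under the authentication troubleshooting steps.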
--- a/docs/pt-BR/learn/human-feedback-in-flows.mdx
+++ b/docs/pt-BR/learn/human-feedback-in-flows.mdx
@@ -98,33 +98,43 @@ def handle_feedback(self, result):
 Quando você especifica `emit`, o decorador se torna um roteador. O feedback livre do humano é interpretado por um LLM e mapeado para um dos outcomes especificados:

 ```python Code
-@start()
-@human_feedback(
-    message="Você aprova este conteúdo para publicação?",
-    emit=["approved", "rejected", "needs_revision"],
-    llm="gpt-4o-mini",
-    default_outcome="needs_revision",
-)
-def review_content(self):
-    return "Rascunho do post do blog aqui..."
+from crewai.flow.flow import Flow, start, listen, or_
+from crewai.flow.human_feedback import human_feedback

-@listen("approved")
-def publish(self, result):
-    print(f"Publicando! Usuário disse: {result.feedback}")
+class ReviewFlow(Flow):
+    @start()
+    def generate_content(self):
+        return "Rascunho do post do blog aqui..."

-@listen("rejected")
-def discard(self, result):
-    print(f"Descartando. Motivo: {result.feedback}")
+    @human_feedback(
+        message="Você aprova este conteúdo para publicação?",
+        emit=["approved", "rejected", "needs_revision"],
+        llm="gpt-4o-mini",
+        default_outcome="needs_revision",
+    )
+    @listen(or_("generate_content", "needs_revision"))
+    def review_content(self):
+        return "Rascunho do post do blog aqui..."

-@listen("needs_revision")
-def revise(self, result):
-    print(f"Revisando baseado em: {result.feedback}")
+    @listen("approved")
+    def publish(self, result):
+        print(f"Publicando! Usuário disse: {result.feedback}")
+
+    @listen("rejected")
+    def discard(self, result):
+        print(f"Descartando. Motivo: {result.feedback}")
 ```

+Quando o humano diz algo como "precisa de mais detalhes", o LLM mapeia para `"needs_revision"`, que dispara `review_content` novamente via `or_()` — criando um loop de revisão. O loop continua até que o outcome seja `"approved"` ou `"rejected"`.
+
 <Tip>
 O LLM usa saídas estruturadas (function calling) quando disponível para garantir que a resposta seja um dos seus outcomes especificados. Isso torna o roteamento confiável e previsível.
 </Tip>

+<Warning>
+Um método `@start()` só executa uma vez no início do flow. Se você precisa de um loop de revisão, separe o método start do método de revisão e use `@listen(or_("trigger", "revision_outcome"))` no método de revisão para habilitar o self-loop.
+</Warning>
+
 ## HumanFeedbackResult

 O dataclass `HumanFeedbackResult` contém todas as informações sobre uma interação de feedback humano:
@@ -193,116 +203,162 @@ Aqui está um exemplo completo implementando um fluxo de revisão e aprovação
 <CodeGroup>

 ```python Code
-from crewai.flow.flow import Flow, start, listen
+from crewai.flow.flow import Flow, start, listen, or_
 from crewai.flow.human_feedback import human_feedback, HumanFeedbackResult
 from pydantic import BaseModel


 class ContentState(BaseModel):
-    topic: str = ""
    draft: str = ""
-    final_content: str = ""
    revision_count: int = 0
+    status: str = "pending"


 class ContentApprovalFlow(Flow[ContentState]):
-    """Um flow que gera conteúdo e obtém aprovação humana."""
+    """Um flow que gera conteúdo e faz loop até o humano aprovar."""

    @start()
-    def get_topic(self):
-        self.state.topic = input("Sobre qual tópico devo escrever? ")
-        return self.state.topic
-
-    @listen(get_topic)
-    def generate_draft(self, topic):
-        # Em uso real, isso chamaria um LLM
-        self.state.draft = f"# {topic}\n\nEste é um rascunho sobre {topic}..."
+    def generate_draft(self):
+        self.state.draft = "# IA Segura\n\nEste é um rascunho sobre IA Segura..."
        return self.state.draft

-    @listen(generate_draft)
    @human_feedback(
-        message="Por favor, revise este rascunho. Responda 'approved', 'rejected', ou forneça feedback de revisão:",
+        message="Por favor, revise este rascunho. Aprove, rejeite ou descreva o que precisa mudar:",
        emit=["approved", "rejected", "needs_revision"],
        llm="gpt-4o-mini",
        default_outcome="needs_revision",
    )
-    def review_draft(self, draft):
-        return draft
+    @listen(or_("generate_draft", "needs_revision"))
+    def review_draft(self):
+        self.state.revision_count += 1
+        return f"{self.state.draft} (v{self.state.revision_count})"

    @listen("approved")
    def publish_content(self, result: HumanFeedbackResult):
-        self.state.final_content = result.output
-        print("\n✅ Conteúdo aprovado e publicado!")
-        print(f"Comentário do revisor: {result.feedback}")
+        self.state.status = "published"
+        print(f"Conteúdo aprovado e publicado! Revisor disse: {result.feedback}")
        return "published"

    @listen("rejected")
    def handle_rejection(self, result: HumanFeedbackResult):
-        print("\n❌ Conteúdo rejeitado")
-        print(f"Motivo: {result.feedback}")
+        self.state.status = "rejected"
+        print(f"Conteúdo rejeitado. Motivo: {result.feedback}")
        return "rejected"

-    @listen("needs_revision")
-    def revise_content(self, result: HumanFeedbackResult):
-        self.state.revision_count += 1
-        print(f"\n📝 Revisão #{self.state.revision_count} solicitada")
-        print(f"Feedback: {result.feedback}")

-        # Em um flow real, você pode voltar para generate_draft
-        # Para este exemplo, apenas reconhecemos
-        return "revision_requested"
-
-
-# Executar o flow
 flow = ContentApprovalFlow()
 result = flow.kickoff()
-print(f"\nFlow concluído. Revisões solicitadas: {flow.state.revision_count}")
+print(f"\nFlow finalizado. Status: {flow.state.status}, Revisões: {flow.state.revision_count}")
 ```

 ```text Output
-Sobre qual tópico devo escrever? Segurança em IA
+==================================================
+OUTPUT FOR REVIEW:
+==================================================
+# IA Segura
+
+Este é um rascunho sobre IA Segura... (v1)
+==================================================
+
+Por favor, revise este rascunho. Aprove, rejeite ou descreva o que precisa mudar:
+(Press Enter to skip, or type your feedback)
+
+Your feedback: Preciso de mais detalhes sobre segurança em IA.

 ==================================================
 OUTPUT FOR REVIEW:
 ==================================================
-# Segurança em IA
+# IA Segura

-Este é um rascunho sobre Segurança em IA...
+Este é um rascunho sobre IA Segura... (v2)
 ==================================================

-Por favor, revise este rascunho. Responda 'approved', 'rejected', ou forneça feedback de revisão:
+Por favor, revise este rascunho. Aprove, rejeite ou descreva o que precisa mudar:
 (Press Enter to skip, or type your feedback)

 Your feedback: Parece bom, aprovado!

-✅ Conteúdo aprovado e publicado!
-Comentário do revisor: Parece bom, aprovado!
+Conteúdo aprovado e publicado! Revisor disse: Parece bom, aprovado!

-Flow concluído. Revisões solicitadas: 0
+Flow finalizado. Status: published, Revisões: 2
 ```

 </CodeGroup>

 ## Combinando com Outros Decoradores

-O decorador `@human_feedback` funciona com outros decoradores de flow. Coloque-o como o decorador mais interno (mais próximo da função):
+O decorador `@human_feedback` funciona com `@start()`, `@listen()` e `or_()`. Ambas as ordens de decoradores funcionam — o framework propaga atributos em ambas as direções — mas os padrões recomendados são:

 ```python Code
-# Correto: @human_feedback é o mais interno (mais próximo da função)
+# Revisão única no início do flow (sem self-loop)
 @start()
-@human_feedback(message="Revise isto:")
+@human_feedback(message="Revise isto:", emit=["approved", "rejected"], llm="gpt-4o-mini")
 def my_start_method(self):
    return "content"

+# Revisão linear em um listener (sem self-loop)
 @listen(other_method)
-@human_feedback(message="Revise isto também:")
+@human_feedback(message="Revise isto também:", emit=["good", "bad"], llm="gpt-4o-mini")
 def my_listener(self, data):
    return f"processed: {data}"
+
+# Self-loop: revisão que pode voltar para revisões
+@human_feedback(message="Aprovar ou revisar?", emit=["approved", "revise"], llm="gpt-4o-mini")
+@listen(or_("upstream_method", "revise"))
+def review_with_loop(self):
+    return "content for review"
 ```

-<Tip>
-Coloque `@human_feedback` como o decorador mais interno (último/mais próximo da função) para que ele envolva o método diretamente e possa capturar o valor de retorno antes de passar para o sistema de flow.
-</Tip>
+### Padrão de self-loop
+
+Para criar um loop de revisão, o método de revisão deve escutar **ambos** um gatilho upstream e seu próprio outcome de revisão usando `or_()`:
+
+```python Code
+@start()
+def generate(self):
+    return "initial draft"
+
+@human_feedback(
+    message="Aprovar ou solicitar alterações?",
+    emit=["revise", "approved"],
+    llm="gpt-4o-mini",
+    default_outcome="approved",
+)
+@listen(or_("generate", "revise"))
+def review(self):
+    return "content"
+
+@listen("approved")
+def publish(self):
+    return "published"
+```
+
+Quando o outcome é `"revise"`, o flow roteia de volta para `review` (porque ele escuta `"revise"` via `or_()`). Quando o outcome é `"approved"`, o flow continua para `publish`. Isso funciona porque o engine de flow isenta roteadores da regra "fire once", permitindo que eles re-executem em cada iteração do loop.
+
+### Roteadores encadeados
+
+Um listener disparado pelo outcome de um roteador pode ser ele mesmo um roteador:
+
+```python Code
+@start()
+@human_feedback(message="Primeira revisão:", emit=["approved", "rejected"], llm="gpt-4o-mini")
+def draft(self):
+    return "draft content"
+
+@listen("approved")
+@human_feedback(message="Revisão final:", emit=["publish", "revise"], llm="gpt-4o-mini")
+def final_review(self, prev):
+    return "final content"
+
+@listen("publish")
+def on_publish(self, prev):
+    return "published"
+```
+
+### Limitações
+
+- **Métodos `@start()` executam uma vez**: Um método `@start()` não pode fazer self-loop. Se você precisa de um ciclo de revisão, use um método `@start()` separado como ponto de entrada e coloque o `@human_feedback` em um método `@listen()`.
+- **Sem `@start()` + `@listen()` no mesmo método**: Esta é uma restrição do framework de Flow. Um método é ou um ponto de início ou um listener, não ambos.

 ## Melhores Práticas

@@ -516,9 +572,9 @@ class ContentPipeline(Flow):
    @start()
    @human_feedback(
        message="Aprova este conteúdo para publicação?",
-        emit=["approved", "rejected", "needs_revision"],
+        emit=["approved", "rejected"],
        llm="gpt-4o-mini",
-        default_outcome="needs_revision",
+        default_outcome="rejected",
        provider=SlackNotificationProvider("#content-reviews"),
    )
    def generate_content(self):
@@ -534,11 +590,6 @@ class ContentPipeline(Flow):
        print(f"Arquivado. Motivo: {result.feedback}")
        return {"status": "archived"}

-    @listen("needs_revision")
-    def queue_revision(self, result):
-        print(f"Na fila para revisão: {result.feedback}")
-        return {"status": "revision_needed"}
-

 # Iniciando o flow (vai pausar e aguardar resposta do Slack)
 def start_content_pipeline():
@@ -594,22 +645,22 @@ Com o tempo, o humano vê saídas pré-revisadas progressivamente melhores porqu
 ```python Code
 class ArticleReviewFlow(Flow):
    @start()
+    def generate_article(self):
+        return self.crew.kickoff(inputs={"topic": "AI Safety"}).raw
+
    @human_feedback(
-        message="Review this article draft:",
+        message="Revise este rascunho do artigo:",
        emit=["approved", "needs_revision"],
        llm="gpt-4o-mini",
        learn=True,  # enable HITL learning
    )
-    def generate_article(self):
-        return self.crew.kickoff(inputs={"topic": "AI Safety"}).raw
+    @listen(or_("generate_article", "needs_revision"))
+    def review_article(self):
+        return self.last_human_feedback.output if self.last_human_feedback else "article draft"

    @listen("approved")
    def publish(self):
        print(f"Publishing: {self.last_human_feedback.output}")
-
-    @listen("needs_revision")
-    def revise(self):
-        print("Revising based on feedback...")
 ```

 **Primeira execução**: O humano vê a saída bruta e diz "Sempre inclua citações para afirmações factuais." A lição é destilada e armazenada na memória.
--- a/docs/pt-BR/learn/llm-connections.mdx
+++ b/docs/pt-BR/learn/llm-connections.mdx
@@ -7,7 +7,7 @@ mode: "wide"

 ## Conecte o CrewAI a LLMs

-O CrewAI utiliza o LiteLLM para conectar-se a uma grande variedade de Modelos de Linguagem (LLMs). Essa integração proporciona grande versatilidade, permitindo que você utilize modelos de inúmeros provedores por meio de uma interface simples e unificada.
+O CrewAI conecta-se a LLMs por meio de integrações nativas via SDK para os provedores mais populares (OpenAI, Anthropic, Google Gemini, Azure e AWS Bedrock), e usa o LiteLLM como alternativa flexível para todos os demais provedores.

 <Note>
    Por padrão, o CrewAI usa o modelo `gpt-4o-mini`. Isso é determinado pela variável de ambiente `OPENAI_MODEL_NAME`, que tem como padrão "gpt-4o-mini" se não for definida.
@@ -40,6 +40,14 @@ O LiteLLM oferece suporte a uma ampla gama de provedores, incluindo, mas não se

 Para uma lista completa e sempre atualizada dos provedores suportados, consulte a [documentação de Provedores do LiteLLM](https://docs.litellm.ai/docs/providers).

+<Info>
+  Para usar qualquer provedor não coberto por uma integração nativa, adicione o LiteLLM como dependência ao seu projeto:
+  ```bash
+  uv add 'crewai[litellm]'
+  ```
+  Provedores nativos (OpenAI, Anthropic, Google Gemini, Azure, AWS Bedrock) usam seus próprios extras de SDK — consulte os [Exemplos de Configuração de Provedores](/pt-BR/concepts/llms#exemplos-de-configuração-de-provedores).
+</Info>
+
 ## Alterando a LLM

 Para utilizar uma LLM diferente com seus agentes CrewAI, você tem várias opções:
--- a/lib/crewai-files/src/crewai_files/__init__.py
+++ b/lib/crewai-files/src/crewai_files/__init__.py
@@ -152,4 +152,4 @@ __all__ = [
    "wrap_file_source",
 ]

-__version__ = "1.9.3"
+__version__ = "1.10.1a1"
--- a/lib/crewai-tools/pyproject.toml
+++ b/lib/crewai-tools/pyproject.toml
@@ -8,12 +8,10 @@ authors = [
 ]
 requires-python = ">=3.10, <3.14"
 dependencies = [
-    "lancedb~=0.5.4",
    "pytube~=15.0.0",
    "requests~=2.32.5",
    "docker~=7.1.0",
-    "crewai==1.9.3",
-    "lancedb~=0.5.4",
+    "crewai==1.10.1a1",
    "tiktoken~=0.8.0",
    "beautifulsoup4~=4.13.4",
    "python-docx~=1.2.0",
--- a/lib/crewai-tools/src/crewai_tools/__init__.py
+++ b/lib/crewai-tools/src/crewai_tools/__init__.py
@@ -291,4 +291,4 @@ __all__ = [
    "ZapierActionTools",
 ]

-__version__ = "1.9.3"
+__version__ = "1.10.1a1"
--- a/lib/crewai-tools/tool.specs.json
+++ b/lib/crewai-tools/tool.specs.json
@@ -20117,18 +20117,6 @@
      "humanized_name": "Web Automation Tool",
      "init_params_schema": {
        "$defs": {
-          "AvailableModel": {
-            "enum": [
-              "gpt-4o",
-              "gpt-4o-mini",
-              "claude-3-5-sonnet-latest",
-              "claude-3-7-sonnet-latest",
-              "computer-use-preview",
-              "gemini-2.0-flash"
-            ],
-            "title": "AvailableModel",
-            "type": "string"
-          },
          "EnvVar": {
            "properties": {
              "default": {
@@ -20206,17 +20194,6 @@
            "default": null,
            "title": "Model Api Key"
          },
-          "model_name": {
-            "anyOf": [
-              {
-                "$ref": "#/$defs/AvailableModel"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "default": "claude-3-7-sonnet-latest"
-          },
          "project_id": {
            "anyOf": [
              {
--- a/lib/crewai/pyproject.toml
+++ b/lib/crewai/pyproject.toml
@@ -38,10 +38,11 @@ dependencies = [
    "json5~=0.10.0",
    "portalocker~=2.7.0",
    "pydantic-settings~=2.10.1",
+    "httpx~=0.28.1",
    "mcp~=1.26.0",
    "uv~=0.9.13",
    "aiosqlite~=0.21.0",
-    "lancedb>=0.4.0",
+    "lancedb>=0.29.2",
 ]

 [project.urls]
@@ -52,7 +53,7 @@ Repository = "https://github.com/crewAIInc/crewAI"

 [project.optional-dependencies]
 tools = [
-    "crewai-tools==1.9.3",
+    "crewai-tools==1.10.1a1",
 ]
 embeddings = [
    "tiktoken~=0.8.0"
--- a/lib/crewai/src/crewai/__init__.py
+++ b/lib/crewai/src/crewai/__init__.py
@@ -10,7 +10,6 @@ from crewai.flow.flow import Flow
 from crewai.knowledge.knowledge import Knowledge
 from crewai.llm import LLM
 from crewai.llms.base_llm import BaseLLM
-from crewai.memory.unified_memory import Memory
 from crewai.process import Process
 from crewai.task import Task
 from crewai.tasks.llm_guardrail import LLMGuardrail
@@ -41,7 +40,7 @@ def _suppress_pydantic_deprecation_warnings() -> None:

 _suppress_pydantic_deprecation_warnings()

-__version__ = "1.9.3"
+__version__ = "1.10.1a1"
 _telemetry_submitted = False


@@ -72,6 +71,25 @@ def _track_install_async() -> None:


 _track_install_async()
+
+_LAZY_IMPORTS: dict[str, tuple[str, str]] = {
+    "Memory": ("crewai.memory.unified_memory", "Memory"),
+}
+
+
+def __getattr__(name: str) -> Any:
+    """Lazily import heavy modules (e.g. Memory → lancedb) on first access."""
+    if name in _LAZY_IMPORTS:
+        module_path, attr = _LAZY_IMPORTS[name]
+        import importlib
+
+        mod = importlib.import_module(module_path)
+        val = getattr(mod, attr)
+        globals()[name] = val
+        return val
+    raise AttributeError(f"module 'crewai' has no attribute {name!r}")
+
+
 __all__ = [
    "LLM",
    "Agent",
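
Review note on the lazy-import hunk above: the change relies on a module-level `__getattr__` (PEP 562), which Python calls only when normal attribute lookup on the module fails. The sketch below reproduces the pattern outside the package so it can be run on its own. The module name `demo_pkg` and the choice of `json.JSONDecoder` as the deferred attribute are illustrative assumptions, not part of the patch.

```python
import importlib
import types
from typing import Any

# Maps a public attribute name to (module path, attribute name).
# "JSONDecoder" stands in for a heavy import such as Memory.
_LAZY_IMPORTS: dict[str, tuple[str, str]] = {
    "JSONDecoder": ("json", "JSONDecoder"),
}

# A throwaway module object standing in for a package's __init__ module.
demo = types.ModuleType("demo_pkg")


def _lazy_getattr(name: str) -> Any:
    """Resolve a lazily imported attribute on first access and cache it."""
    if name in _LAZY_IMPORTS:
        module_path, attr = _LAZY_IMPORTS[name]
        value = getattr(importlib.import_module(module_path), attr)
        # Cache on the module so later lookups bypass __getattr__ entirely.
        setattr(demo, name, value)
        return value
    raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")


# Storing the hook in the module namespace is what activates PEP 562.
demo.__getattr__ = _lazy_getattr

cached_before = "JSONDecoder" in vars(demo)
decoder_cls = demo.JSONDecoder  # triggers the import
cached_after = "JSONDecoder" in vars(demo)
```

Caching the resolved value in the module namespace, as the patch does with `globals()[name] = val`, means the hook runs at most once per name.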
--- a/lib/crewai/src/crewai/agent/core.py
+++ b/lib/crewai/src/crewai/agent/core.py
@@ -8,11 +8,9 @@ import time
 from typing import (
    TYPE_CHECKING,
    Any,
-    Final,
    Literal,
    cast,
 )
-from urllib.parse import urlparse

 from pydantic import (
    BaseModel,
@@ -61,16 +59,8 @@ from crewai.knowledge.knowledge import Knowledge
 from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
 from crewai.lite_agent_output import LiteAgentOutput
 from crewai.llms.base_llm import BaseLLM
-from crewai.mcp import (
-    MCPClient,
-    MCPServerConfig,
-    MCPServerHTTP,
-    MCPServerSSE,
-    MCPServerStdio,
-)
-from crewai.mcp.transports.http import HTTPTransport
-from crewai.mcp.transports.sse import SSETransport
-from crewai.mcp.transports.stdio import StdioTransport
+from crewai.mcp import MCPServerConfig
+from crewai.mcp.tool_resolver import MCPToolResolver
 from crewai.rag.embeddings.types import EmbedderConfig
 from crewai.security.fingerprint import Fingerprint
 from crewai.tools.agent_tools.agent_tools import AgentTools
@@ -111,18 +101,8 @@ if TYPE_CHECKING:
    from crewai.utilities.types import LLMMessage


-# MCP Connection timeout constants (in seconds)
-MCP_CONNECTION_TIMEOUT: Final[int] = 10
-MCP_TOOL_EXECUTION_TIMEOUT: Final[int] = 30
-MCP_DISCOVERY_TIMEOUT: Final[int] = 15
-MCP_MAX_RETRIES: Final[int] = 3
-
 _passthrough_exceptions: tuple[type[Exception], ...] = ()

-# Simple in-memory cache for MCP tool schemas (duration: 5 minutes)
-_mcp_schema_cache: dict[str, Any] = {}
-_cache_ttl: Final[int] = 300  # 5 minutes
-

 class Agent(BaseAgent):
    """Represents an agent in a system.
@@ -154,7 +134,7 @@ class Agent(BaseAgent):
    model_config = ConfigDict()

    _times_executed: int = PrivateAttr(default=0)
-    _mcp_clients: list[Any] = PrivateAttr(default_factory=list)
+    _mcp_resolver: MCPToolResolver | None = PrivateAttr(default=None)
    _last_messages: list[LLMMessage] = PrivateAttr(default_factory=list)
    max_execution_time: int | None = Field(
        default=None,
@@ -384,10 +364,10 @@ class Agent(BaseAgent):
                )
                if unified_memory is not None:
                    query = task.description
-                    matches = unified_memory.recall(query, limit=10)
+                    matches = unified_memory.recall(query, limit=5)
                    if matches:
                        memory = "Relevant memories:\n" + "\n".join(
-                            f"- {m.record.content}" for m in matches
+                            m.format() for m in matches
                        )
                if memory.strip() != "":
                    task_prompt += self.i18n.slice("memory").format(memory=memory)
@@ -622,10 +602,10 @@ class Agent(BaseAgent):
                )
                if unified_memory is not None:
                    query = task.description
-                    matches = unified_memory.recall(query, limit=10)
+                    matches = unified_memory.recall(query, limit=5)
                    if matches:
                        memory = "Relevant memories:\n" + "\n".join(
-                            f"- {m.record.content}" for m in matches
+                            m.format() for m in matches
                        )
                if memory.strip() != "":
                    task_prompt += self.i18n.slice("memory").format(memory=memory)
@@ -864,7 +844,11 @@ class Agent(BaseAgent):
                respect_context_window=self.respect_context_window,
                request_within_rpm_limit=rpm_limit_fn,
                callbacks=[TokenCalcHandler(self._token_process)],
-                response_model=task.response_model if task else None,
+                response_model=(
+                    task.response_model or task.output_pydantic or task.output_json
+                )
+                if task
+                else None,
            )

    def _update_executor_parameters(
@@ -893,7 +877,11 @@ class Agent(BaseAgent):
        self.agent_executor.stop = stop_words
        self.agent_executor.tools_names = get_tool_names(tools)
        self.agent_executor.tools_description = render_text_description_and_args(tools)
-        self.agent_executor.response_model = task.response_model if task else None
+        self.agent_executor.response_model = (
+            (task.response_model or task.output_pydantic or task.output_json)
+            if task
+            else None
+        )

        self.agent_executor.tools_handler = self.tools_handler
        self.agent_executor.request_within_rpm_limit = rpm_limit_fn
@@ -926,544 +914,17 @@ class Agent(BaseAgent):
    def get_mcp_tools(self, mcps: list[str | MCPServerConfig]) -> list[BaseTool]:
        """Convert MCP server references/configs to CrewAI tools.

-        Supports both string references (backwards compatible) and structured
-        configuration objects (MCPServerStdio, MCPServerHTTP, MCPServerSSE).
-
-        Args:
-            mcps: List of MCP server references (strings) or configurations.
-
-        Returns:
-            List of BaseTool instances from MCP servers.
+        Delegates to :class:`~crewai.mcp.tool_resolver.MCPToolResolver`.
        """
-        all_tools = []
-        clients = []
-
-        for mcp_config in mcps:
-            if isinstance(mcp_config, str):
-                tools = self._get_mcp_tools_from_string(mcp_config)
-            else:
-                tools, client = self._get_native_mcp_tools(mcp_config)
-                if client:
-                    clients.append(client)
-
-            all_tools.extend(tools)
-
-        # Store clients for cleanup
-        self._mcp_clients.extend(clients)
-        return all_tools
+        self._cleanup_mcp_clients()
+        self._mcp_resolver = MCPToolResolver(agent=self, logger=self._logger)
+        return self._mcp_resolver.resolve(mcps)

    def _cleanup_mcp_clients(self) -> None:
        """Cleanup MCP client connections after task execution."""
-        if not self._mcp_clients:
-            return
-
-        async def _disconnect_all() -> None:
-            for client in self._mcp_clients:
-                if client and hasattr(client, "connected") and client.connected:
-                    await client.disconnect()
-
-        try:
-            asyncio.run(_disconnect_all())
-        except Exception as e:
-            self._logger.log("error", f"Error during MCP client cleanup: {e}")
-        finally:
-            self._mcp_clients.clear()
-
-    def _get_mcp_tools_from_string(self, mcp_ref: str) -> list[BaseTool]:
-        """Get tools from legacy string-based MCP references.
-
-        This method maintains backwards compatibility with string-based
-        MCP references (https://... and crewai-amp:...).
-
-        Args:
-            mcp_ref: String reference to MCP server.
-
-        Returns:
-            List of BaseTool instances.
-        """
-        if mcp_ref.startswith("crewai-amp:"):
-            return self._get_amp_mcp_tools(mcp_ref)
-        if mcp_ref.startswith("https://"):
-            return self._get_external_mcp_tools(mcp_ref)
-        return []
-
-    def _get_external_mcp_tools(self, mcp_ref: str) -> list[BaseTool]:
-        """Get tools from external HTTPS MCP server with graceful error handling."""
-        from crewai.tools.mcp_tool_wrapper import MCPToolWrapper
-
-        # Parse server URL and optional tool name
-        if "#" in mcp_ref:
-            server_url, specific_tool = mcp_ref.split("#", 1)
-        else:
-            server_url, specific_tool = mcp_ref, None
-
-        server_params = {"url": server_url}
-        server_name = self._extract_server_name(server_url)
-
-        try:
-            # Get tool schemas with timeout and error handling
-            tool_schemas = self._get_mcp_tool_schemas(server_params)
-
-            if not tool_schemas:
-                self._logger.log(
-                    "warning", f"No tools discovered from MCP server: {server_url}"
-                )
-                return []
-
-            tools = []
-            for tool_name, schema in tool_schemas.items():
-                # Skip if specific tool requested and this isn't it
-                if specific_tool and tool_name != specific_tool:
-                    continue
-
-                try:
-                    wrapper = MCPToolWrapper(
-                        mcp_server_params=server_params,
-                        tool_name=tool_name,
-                        tool_schema=schema,
-                        server_name=server_name,
-                    )
-                    tools.append(wrapper)
-                except Exception as e:
-                    self._logger.log(
-                        "warning",
-                        f"Failed to create MCP tool wrapper for {tool_name}: {e}",
-                    )
-                    continue
-
-            if specific_tool and not tools:
-                self._logger.log(
-                    "warning",
-                    f"Specific tool '{specific_tool}' not found on MCP server: {server_url}",
-                )
-
-            return cast(list[BaseTool], tools)
-
-        except Exception as e:
-            self._logger.log(
-                "warning", f"Failed to connect to MCP server {server_url}: {e}"
-            )
-            return []
-
-    def _get_native_mcp_tools(
-        self, mcp_config: MCPServerConfig
-    ) -> tuple[list[BaseTool], Any | None]:
-        """Get tools from MCP server using structured configuration.
-
-        This method creates an MCP client based on the configuration type,
-        connects to the server, discovers tools, applies filtering, and
-        returns wrapped tools along with the client instance for cleanup.
-
-        Args:
-            mcp_config: MCP server configuration (MCPServerStdio, MCPServerHTTP, or MCPServerSSE).
-
-        Returns:
-            Tuple of (list of BaseTool instances, MCPClient instance for cleanup).
-        """
-        from crewai.tools.base_tool import BaseTool
-        from crewai.tools.mcp_native_tool import MCPNativeTool
-
-        transport: StdioTransport | HTTPTransport | SSETransport
-        if isinstance(mcp_config, MCPServerStdio):
-            transport = StdioTransport(
-                command=mcp_config.command,
-                args=mcp_config.args,
-                env=mcp_config.env,
-            )
-            server_name = f"{mcp_config.command}_{'_'.join(mcp_config.args)}"
-        elif isinstance(mcp_config, MCPServerHTTP):
-            transport = HTTPTransport(
-                url=mcp_config.url,
-                headers=mcp_config.headers,
-                streamable=mcp_config.streamable,
-            )
-            server_name = self._extract_server_name(mcp_config.url)
-        elif isinstance(mcp_config, MCPServerSSE):
-            transport = SSETransport(
-                url=mcp_config.url,
-                headers=mcp_config.headers,
-            )
-            server_name = self._extract_server_name(mcp_config.url)
-        else:
-            raise ValueError(f"Unsupported MCP server config type: {type(mcp_config)}")
-
-        client = MCPClient(
-            transport=transport,
-            cache_tools_list=mcp_config.cache_tools_list,
-        )
-
-        async def _setup_client_and_list_tools() -> list[dict[str, Any]]:
-            """Async helper to connect and list tools in same event loop."""
-
-            try:
-                if not client.connected:
-                    await client.connect()
-
-                tools_list = await client.list_tools()
-
-                try:
-                    await client.disconnect()
-                    # Small delay to allow background tasks to finish cleanup
-                    # This helps prevent "cancel scope in different task" errors
-                    # when asyncio.run() closes the event loop
-                    await asyncio.sleep(0.1)
-                except Exception as e:
-                    self._logger.log("error", f"Error during disconnect: {e}")
-
-                return tools_list
-            except Exception as e:
-                if client.connected:
-                    await client.disconnect()
-                    await asyncio.sleep(0.1)
-                raise RuntimeError(
-                    f"Error during setup client and list tools: {e}"
-                ) from e
-
-        try:
-            try:
-                asyncio.get_running_loop()
-                import concurrent.futures
-
-                with concurrent.futures.ThreadPoolExecutor() as executor:
-                    future = executor.submit(
-                        asyncio.run, _setup_client_and_list_tools()
-                    )
-                    tools_list = future.result()
-            except RuntimeError:
-                try:
-                    tools_list = asyncio.run(_setup_client_and_list_tools())
-                except RuntimeError as e:
-                    error_msg = str(e).lower()
-                    if "cancel scope" in error_msg or "task" in error_msg:
-                        raise ConnectionError(
-                            "MCP connection failed due to event loop cleanup issues. "
-                            "This may be due to authentication errors or server unavailability."
-                        ) from e
-                except asyncio.CancelledError as e:
-                    raise ConnectionError(
-                        "MCP connection was cancelled. This may indicate an authentication "
-                        "error or server unavailability."
-                    ) from e
-
-            if mcp_config.tool_filter:
-                filtered_tools = []
-                for tool in tools_list:
-                    if callable(mcp_config.tool_filter):
-                        try:
-                            from crewai.mcp.filters import ToolFilterContext
-
-                            context = ToolFilterContext(
-                                agent=self,
-                                server_name=server_name,
-                                run_context=None,
-                            )
-                            if mcp_config.tool_filter(context, tool):  # type: ignore[call-arg, arg-type]
-                                filtered_tools.append(tool)
-                        except (TypeError, AttributeError):
-                            if mcp_config.tool_filter(tool):  # type: ignore[call-arg, arg-type]
-                                filtered_tools.append(tool)
-                    else:
-                        # Not callable - include tool
-                        filtered_tools.append(tool)
-                tools_list = filtered_tools
-
-            tools = []
-            for tool_def in tools_list:
-                tool_name = tool_def.get("name", "")
-                if not tool_name:
-                    continue
-
-                # Convert inputSchema to Pydantic model if present
-                args_schema = None
-                if tool_def.get("inputSchema"):
-                    args_schema = self._json_schema_to_pydantic(
-                        tool_name, tool_def["inputSchema"]
-                    )
-
-                tool_schema = {
-                    "description": tool_def.get("description", ""),
-                    "args_schema": args_schema,
-                }
-
-                try:
-                    native_tool = MCPNativeTool(
-                        mcp_client=client,
-                        tool_name=tool_name,
-                        tool_schema=tool_schema,
-                        server_name=server_name,
-                    )
-                    tools.append(native_tool)
-                except Exception as e:
-                    self._logger.log("error", f"Failed to create native MCP tool: {e}")
-                    continue
-
-            return cast(list[BaseTool], tools), client
-        except Exception as e:
-            if client.connected:
-                asyncio.run(client.disconnect())
-
-            raise RuntimeError(f"Failed to get native MCP tools: {e}") from e
-
-    def _get_amp_mcp_tools(self, amp_ref: str) -> list[BaseTool]:
-        """Get tools from CrewAI AMP MCP marketplace."""
-        # Parse: "crewai-amp:mcp-name" or "crewai-amp:mcp-name#tool_name"
-        amp_part = amp_ref.replace("crewai-amp:", "")
-        if "#" in amp_part:
-            mcp_name, specific_tool = amp_part.split("#", 1)
-        else:
-            mcp_name, specific_tool = amp_part, None
-
-        # Call AMP API to get MCP server URLs
-        mcp_servers = self._fetch_amp_mcp_servers(mcp_name)
-
-        tools = []
-        for server_config in mcp_servers:
-            server_ref = server_config["url"]
-            if specific_tool:
-                server_ref += f"#{specific_tool}"
-            server_tools = self._get_external_mcp_tools(server_ref)
-            tools.extend(server_tools)
-
-        return tools
-
-    @staticmethod
-    def _extract_server_name(server_url: str) -> str:
-        """Extract clean server name from URL for tool prefixing."""
-
-        parsed = urlparse(server_url)
-        domain = parsed.netloc.replace(".", "_")
-        path = parsed.path.replace("/", "_").strip("_")
-        return f"{domain}_{path}" if path else domain
-
-    def _get_mcp_tool_schemas(
-        self, server_params: dict[str, Any]
-    ) -> dict[str, dict[str, Any]]:
-        """Get tool schemas from MCP server for wrapper creation with caching."""
-        server_url = server_params["url"]
-
-        # Check cache first
-        cache_key = server_url
-        current_time = time.time()
-
-        if cache_key in _mcp_schema_cache:
-            cached_data, cache_time = _mcp_schema_cache[cache_key]
-            if current_time - cache_time < _cache_ttl:
-                self._logger.log(
-                    "debug", f"Using cached MCP tool schemas for {server_url}"
-                )
-                return cached_data  # type: ignore[no-any-return]
-
-        try:
-            schemas = asyncio.run(self._get_mcp_tool_schemas_async(server_params))
-
-            # Cache successful results
-            _mcp_schema_cache[cache_key] = (schemas, current_time)
-
-            return schemas
-        except Exception as e:
-            # Log warning but don't raise - this allows graceful degradation
-            self._logger.log(
-                "warning", f"Failed to get MCP tool schemas from {server_url}: {e}"
-            )
-            return {}
-
-    async def _get_mcp_tool_schemas_async(
-        self, server_params: dict[str, Any]
-    ) -> dict[str, dict[str, Any]]:
-        """Async implementation of MCP tool schema retrieval with timeouts and retries."""
-        server_url = server_params["url"]
-        return await self._retry_mcp_discovery(
-            self._discover_mcp_tools_with_timeout, server_url
-        )
-
-    async def _retry_mcp_discovery(
-        self, operation_func: Any, server_url: str
-    ) -> dict[str, dict[str, Any]]:
-        """Retry MCP discovery operation with exponential backoff, avoiding try-except in loop."""
-        last_error = None
-
-        for attempt in range(MCP_MAX_RETRIES):
-            # Execute single attempt outside try-except loop structure
-            result, error, should_retry = await self._attempt_mcp_discovery(
-                operation_func, server_url
-            )
-
-            # Success case - return immediately
-            if result is not None:
-                return result
-
-            # Non-retryable error - raise immediately
-            if not should_retry:
-                raise RuntimeError(error)
-
-            # Retryable error - continue with backoff
-            last_error = error
-            if attempt < MCP_MAX_RETRIES - 1:
-                wait_time = 2**attempt  # Exponential backoff
-                await asyncio.sleep(wait_time)
-
-        raise RuntimeError(
-            f"Failed to discover MCP tools after {MCP_MAX_RETRIES} attempts: {last_error}"
-        )
-
-    @staticmethod
-    async def _attempt_mcp_discovery(
-        operation_func: Any, server_url: str
-    ) -> tuple[dict[str, dict[str, Any]] | None, str, bool]:
-        """Attempt single MCP discovery operation and return (result, error_message, should_retry)."""
-        try:
-            result = await operation_func(server_url)
-            return result, "", False
-
-        except ImportError:
-            return (
-                None,
-                "MCP library not available. Please install with: pip install mcp",
-                False,
-            )
-
-        except asyncio.TimeoutError:
-            return (
-                None,
-                f"MCP discovery timed out after {MCP_DISCOVERY_TIMEOUT} seconds",
-                True,
-            )
-
-        except Exception as e:
-            error_str = str(e).lower()
-
-            # Classify errors as retryable or non-retryable
-            if "authentication" in error_str or "unauthorized" in error_str:
-                return None, f"Authentication failed for MCP server: {e!s}", False
-            if "connection" in error_str or "network" in error_str:
-                return None, f"Network connection failed: {e!s}", True
-            if "json" in error_str or "parsing" in error_str:
-                return None, f"Server response parsing error: {e!s}", True
-            return None, f"MCP discovery error: {e!s}", False
-
-    async def _discover_mcp_tools_with_timeout(
-        self, server_url: str
-    ) -> dict[str, dict[str, Any]]:
-        """Discover MCP tools with timeout wrapper."""
-        return await asyncio.wait_for(
-            self._discover_mcp_tools(server_url), timeout=MCP_DISCOVERY_TIMEOUT
-        )
-
-    async def _discover_mcp_tools(self, server_url: str) -> dict[str, dict[str, Any]]:
-        """Discover tools from MCP server with proper timeout handling."""
-        from mcp import ClientSession
-        from mcp.client.streamable_http import streamablehttp_client
-
-        async with streamablehttp_client(server_url) as (read, write, _):
-            async with ClientSession(read, write) as session:
-                # Initialize the connection with timeout
-                await asyncio.wait_for(
-                    session.initialize(), timeout=MCP_CONNECTION_TIMEOUT
-                )
-
-                # List available tools with timeout
-                tools_result = await asyncio.wait_for(
-                    session.list_tools(),
-                    timeout=MCP_DISCOVERY_TIMEOUT - MCP_CONNECTION_TIMEOUT,
-                )
-
-                schemas = {}
-                for tool in tools_result.tools:
-                    args_schema = None
-                    if hasattr(tool, "inputSchema") and tool.inputSchema:
-                        args_schema = self._json_schema_to_pydantic(
-                            sanitize_tool_name(tool.name), tool.inputSchema
-                        )
-
-                    schemas[sanitize_tool_name(tool.name)] = {
-                        "description": getattr(tool, "description", ""),
-                        "args_schema": args_schema,
-                    }
-                return schemas
-
-    def _json_schema_to_pydantic(
-        self, tool_name: str, json_schema: dict[str, Any]
-    ) -> type:
-        """Convert JSON Schema to Pydantic model for tool arguments.
-
-        Args:
-            tool_name: Name of the tool (used for model naming)
-            json_schema: JSON Schema dict with 'properties', 'required', etc.
-
-        Returns:
-            Pydantic BaseModel class
-        """
-        from pydantic import Field, create_model
-
-        properties = json_schema.get("properties", {})
-        required_fields = json_schema.get("required", [])
-
-        field_definitions: dict[str, Any] = {}
-
-        for field_name, field_schema in properties.items():
-            field_type = self._json_type_to_python(field_schema)
-            field_description = field_schema.get("description", "")
-
-            is_required = field_name in required_fields
-
-            if is_required:
-                field_definitions[field_name] = (
-                    field_type,
-                    Field(..., description=field_description),
-                )
-            else:
-                field_definitions[field_name] = (
-                    field_type | None,
-                    Field(default=None, description=field_description),
-                )
-
-        model_name = f"{tool_name.replace('-', '_').replace(' ', '_')}Schema"
-        return create_model(model_name, **field_definitions)  # type: ignore[no-any-return]
-
-    def _json_type_to_python(self, field_schema: dict[str, Any]) -> type:
-        """Convert JSON Schema type to Python type.
-
-        Args:
-            field_schema: JSON Schema field definition
-
-        Returns:
-            Python type
-        """
-
-        json_type = field_schema.get("type")
-
-        if "anyOf" in field_schema:
-            types: list[type] = []
-            for option in field_schema["anyOf"]:
-                if "const" in option:
-                    types.append(str)
-                else:
-                    types.append(self._json_type_to_python(option))
-            unique_types = list(set(types))
-            if len(unique_types) > 1:
-                result: Any = unique_types[0]
-                for t in unique_types[1:]:
-                    result = result | t
-                return result  # type: ignore[no-any-return]
-            return unique_types[0]
-
-        type_mapping: dict[str | None, type] = {
-            "string": str,
-            "number": float,
-            "integer": int,
-            "boolean": bool,
-            "array": list,
-            "object": dict,
-        }
-
-        return type_mapping.get(json_type, Any)
-
-    @staticmethod
-    def _fetch_amp_mcp_servers(mcp_name: str) -> list[dict[str, Any]]:
-        """Fetch MCP server configurations from CrewAI AMP API."""
-        # TODO: Implement AMP API call to "integrations/mcps" endpoint
-        # Should return list of server configs with URLs
-        return []
+        if self._mcp_resolver is not None:
+            self._mcp_resolver.cleanup()
+            self._mcp_resolver = None

    @staticmethod
    def get_multimodal_tools() -> Sequence[BaseTool]:
@@ -1712,7 +1173,8 @@ class Agent(BaseAgent):

            existing_names = {sanitize_tool_name(t.name) for t in raw_tools}
            raw_tools.extend(
-                mt for mt in create_memory_tools(agent_memory)
+                mt
+                for mt in create_memory_tools(agent_memory)
                if sanitize_tool_name(mt.name) not in existing_names
            )

@@ -1802,11 +1264,11 @@ class Agent(BaseAgent):
                    ),
                )
                start_time = time.time()
-                matches = agent_memory.recall(formatted_messages, limit=10)
+                matches = agent_memory.recall(formatted_messages, limit=5)
                memory_block = ""
                if matches:
                    memory_block = "Relevant memories:\n" + "\n".join(
-                        f"- {m.record.content}" for m in matches
+                        m.format() for m in matches
                    )
                if memory_block:
                    formatted_messages += "\n\n" + self.i18n.slice("memory").format(
@@ -1937,14 +1399,15 @@ class Agent(BaseAgent):
            if isinstance(messages, str):
                input_str = messages
            else:
-                input_str = "\n".join(
-                    str(msg.get("content", "")) for msg in messages if msg.get("content")
-                ) or "User request"
-            raw = (
-                f"Input: {input_str}\n"
-                f"Agent: {self.role}\n"
-                f"Result: {output_text}"
-            )
+                input_str = (
+                    "\n".join(
+                        str(msg.get("content", ""))
+                        for msg in messages
+                        if msg.get("content")
+                    )
+                    or "User request"
+                )
+            raw = f"Input: {input_str}\nAgent: {self.role}\nResult: {output_text}"
            extracted = agent_memory.extract_memories(raw)
            if extracted:
                agent_memory.remember_many(extracted)
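
Review note: the reformatted expression in the last hunk above keeps the earlier behavior. It joins the non-empty `content` values of the messages and falls back to the literal "User request" when nothing remains. A minimal sketch of that behavior, assuming a hypothetical helper name `build_input_str` that does not exist in the codebase:

```python
from __future__ import annotations

from typing import Any


def build_input_str(messages: str | list[dict[str, Any]]) -> str:
    """Illustrative helper (hypothetical name) mirroring the expression in the diff."""
    if isinstance(messages, str):
        return messages
    # Join only messages with truthy content; an empty join result is falsy,
    # so `or` supplies the fallback string.
    return (
        "\n".join(
            str(msg.get("content", ""))
            for msg in messages
            if msg.get("content")
        )
        or "User request"
    )
```

Messages whose `content` is `None` or an empty string are skipped rather than rendered as "None".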
--- a/lib/crewai/src/crewai/agents/agent_builder/base_agent.py
+++ b/lib/crewai/src/crewai/agents/agent_builder/base_agent.py
@@ -4,7 +4,8 @@ from abc import ABC, abstractmethod
 from collections.abc import Callable
 from copy import copy as shallow_copy
 from hashlib import md5
-from typing import Any, Literal
+import re
+from typing import Any, Final, Literal
 import uuid

 from pydantic import (
@@ -36,6 +37,11 @@ from crewai.utilities.rpm_controller import RPMController
 from crewai.utilities.string_utils import interpolate_only


+_SLUG_RE: Final[re.Pattern[str]] = re.compile(
+    r"^(?:crewai-amp:)?[a-zA-Z0-9][a-zA-Z0-9_-]*(?:#\w+)?$"
+)
+
+
 PlatformApp = Literal[
    "asana",
    "box",
@@ -197,7 +203,7 @@ class BaseAgent(BaseModel, ABC, metaclass=AgentMeta):
    )
    mcps: list[str | MCPServerConfig] | None = Field(
        default=None,
-        description="List of MCP server references. Supports 'https://server.com/path' for external servers and 'crewai-amp:mcp-name' for AMP marketplace. Use '#tool_name' suffix for specific tools.",
+        description="List of MCP server references. Supports 'https://server.com/path' for external servers and bare slugs like 'notion' for connected MCP integrations. Use '#tool_name' suffix for specific tools.",
    )
    memory: Any = Field(
        default=None,
@@ -276,14 +282,16 @@ class BaseAgent(BaseModel, ABC, metaclass=AgentMeta):
        validated_mcps: list[str | MCPServerConfig] = []
        for mcp in mcps:
            if isinstance(mcp, str):
-                if mcp.startswith(("https://", "crewai-amp:")):
+                if mcp.startswith("https://"):
+                    validated_mcps.append(mcp)
+                elif _SLUG_RE.match(mcp):
                    validated_mcps.append(mcp)
                else:
                    raise ValueError(
-                        f"Invalid MCP reference: {mcp}. "
-                        "String references must start with 'https://' or 'crewai-amp:'"
+                        f"Invalid MCP reference: {mcp!r}. "
+                        "String references must be an 'https://' URL or a valid "
+                        "slug (e.g. 'notion', 'notion#search', 'crewai-amp:notion')."
                    )
-
            elif isinstance(mcp, (MCPServerConfig)):
                validated_mcps.append(mcp)
            else:
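
Review note: the validator above now accepts either an `https://` URL or a string matching `_SLUG_RE`. The sketch below reproduces that rule in isolation so the accepted and rejected forms are easy to check. The pattern is copied from the hunk above; `is_valid_mcp_ref` is a hypothetical helper name used only for illustration.

```python
import re
from typing import Final

# Pattern copied from the diff; everything else here is illustrative.
_SLUG_RE: Final[re.Pattern[str]] = re.compile(
    r"^(?:crewai-amp:)?[a-zA-Z0-9][a-zA-Z0-9_-]*(?:#\w+)?$"
)


def is_valid_mcp_ref(ref: str) -> bool:
    """Return True for an https:// URL or a slug such as 'notion#search'."""
    return ref.startswith("https://") or _SLUG_RE.match(ref) is not None


if __name__ == "__main__":
    samples = ("notion", "notion#search", "crewai-amp:notion", "http://x", "-bad")
    for ref in samples:
        print(f"{ref!r}: {is_valid_mcp_ref(ref)}")
```

One detail worth keeping in mind: `$` also matches just before a trailing newline, so `"notion\n"` passes `.match`. If that input should be rejected, `fullmatch` would be one option.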
--- a/lib/crewai/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
+++ b/lib/crewai/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
@@ -30,7 +30,7 @@ class CrewAgentExecutorMixin:
        memory = getattr(self.agent, "memory", None) or (
            getattr(self.crew, "_memory", None) if self.crew else None
        )
-        if memory is None or not self.task:
+        if memory is None or not self.task or getattr(memory, "_read_only", False):
            return
        if (
            f"Action: {sanitize_tool_name('Delegate work to coworker')}"
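
Review note: the new condition above returns early when the memory object exposes a truthy `_read_only` attribute. Because it uses `getattr` with a default of `False`, memory objects that lack the attribute stay writable. A minimal sketch of the guard, assuming stand-in classes (`WritableMemory`, `ReadOnlyMemory`) and a hypothetical `should_skip_save` helper, none of which are part of the codebase:

```python
from typing import Any


class WritableMemory:
    """Stand-in memory object with no _read_only attribute."""


class ReadOnlyMemory:
    """Stand-in memory object that sets the flag the guard checks."""

    _read_only = True


def should_skip_save(memory: Any, task: Any) -> bool:
    """Mirror the guard: skip when memory is missing, task is falsy, or memory is read-only."""
    return bool(
        memory is None or not task or getattr(memory, "_read_only", False)
    )
```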
--- a/lib/crewai/src/crewai/agents/crew_agent_executor.py
+++ b/lib/crewai/src/crewai/agents/crew_agent_executor.py
@@ -6,7 +6,10 @@ and memory management.

 from __future__ import annotations

+import asyncio
 from collections.abc import Callable
+from concurrent.futures import ThreadPoolExecutor, as_completed
+import inspect
 import logging
 from typing import TYPE_CHECKING, Any, Literal, cast

@@ -47,6 +50,7 @@ from crewai.utilities.agent_utils import (
    handle_unknown_error,
    has_reached_max_iterations,
    is_context_length_exceeded,
+    parse_tool_call_args,
    process_llm_response,
    track_delegation_if_needed,
 )
@@ -685,30 +689,142 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
        Returns:
            AgentFinish if tool has result_as_answer=True, None otherwise.
        """
-        from datetime import datetime
-        import json
-
-        from crewai.events import crewai_event_bus
-        from crewai.events.types.tool_usage_events import (
-            ToolUsageErrorEvent,
-            ToolUsageFinishedEvent,
-            ToolUsageStartedEvent,
-        )
-
        if not tool_calls:
            return None

-        # Only process the FIRST tool call for sequential execution with reflection
-        tool_call = tool_calls[0]
+        parsed_calls = [
+            parsed
+            for tool_call in tool_calls
+            if (parsed := self._parse_native_tool_call(tool_call)) is not None
+        ]
+        if not parsed_calls:
+            return None

-        # Extract tool call info - handle OpenAI-style, Anthropic-style, and Gemini-style
+        original_tools_by_name: dict[str, Any] = {}
+        for tool in self.original_tools or []:
+            original_tools_by_name[sanitize_tool_name(tool.name)] = tool
+
+        if len(parsed_calls) > 1:
+            has_result_as_answer_in_batch = any(
+                bool(
+                    original_tools_by_name.get(func_name)
+                    and getattr(
+                        original_tools_by_name.get(func_name), "result_as_answer", False
+                    )
+                )
+                for _, func_name, _ in parsed_calls
+            )
+            has_max_usage_count_in_batch = any(
+                bool(
+                    original_tools_by_name.get(func_name)
+                    and getattr(
+                        original_tools_by_name.get(func_name),
+                        "max_usage_count",
+                        None,
+                    )
+                    is not None
+                )
+                for _, func_name, _ in parsed_calls
+            )
+
+            # Preserve historical sequential behavior for result_as_answer batches.
+            # Also avoid threading around usage counters for max_usage_count tools.
+            if has_result_as_answer_in_batch or has_max_usage_count_in_batch:
+                logger.debug(
+                    "Skipping parallel native execution because batch includes result_as_answer or max_usage_count tool"
+                )
+            else:
+                execution_plan: list[
+                    tuple[str, str, str | dict[str, Any], Any | None]
+                ] = []
+                for call_id, func_name, func_args in parsed_calls:
+                    original_tool = original_tools_by_name.get(func_name)
+                    execution_plan.append(
+                        (call_id, func_name, func_args, original_tool)
+                    )
+
+                self._append_assistant_tool_calls_message(
+                    [
+                        (call_id, func_name, func_args)
+                        for call_id, func_name, func_args, _ in execution_plan
+                    ]
+                )
+
+                max_workers = min(8, len(execution_plan))
+                ordered_results: list[dict[str, Any] | None] = [None] * len(
+                    execution_plan
+                )
+                with ThreadPoolExecutor(max_workers=max_workers) as pool:
+                    futures = {
+                        pool.submit(
+                            self._execute_single_native_tool_call,
+                            call_id=call_id,
+                            func_name=func_name,
+                            func_args=func_args,
+                            available_functions=available_functions,
+                            original_tool=original_tool,
+                            should_execute=True,
+                        ): idx
+                        for idx, (
+                            call_id,
+                            func_name,
+                            func_args,
+                            original_tool,
+                        ) in enumerate(execution_plan)
+                    }
+                    for future in as_completed(futures):
+                        idx = futures[future]
+                        ordered_results[idx] = future.result()
+
+                for execution_result in ordered_results:
+                    if not execution_result:
+                        continue
+                    tool_finish = self._append_tool_result_and_check_finality(
+                        execution_result
+                    )
+                    if tool_finish:
+                        return tool_finish
+
+                reasoning_prompt = self._i18n.slice("post_tool_reasoning")
+                reasoning_message: LLMMessage = {
+                    "role": "user",
+                    "content": reasoning_prompt,
+                }
+                self.messages.append(reasoning_message)
+                return None
+
+        # Sequential behavior: process only first tool call, then force reflection.
+        call_id, func_name, func_args = parsed_calls[0]
+        self._append_assistant_tool_calls_message([(call_id, func_name, func_args)])
+
+        execution_result = self._execute_single_native_tool_call(
+            call_id=call_id,
+            func_name=func_name,
+            func_args=func_args,
+            available_functions=available_functions,
+            original_tool=original_tools_by_name.get(func_name),
+            should_execute=True,
+        )
+        tool_finish = self._append_tool_result_and_check_finality(execution_result)
+        if tool_finish:
+            return tool_finish
+
+        reasoning_prompt = self._i18n.slice("post_tool_reasoning")
+        reasoning_message = {
+            "role": "user",
+            "content": reasoning_prompt,
+        }
+        self.messages.append(reasoning_message)
+        return None
+
+    def _parse_native_tool_call(
+        self, tool_call: Any
+    ) -> tuple[str, str, str | dict[str, Any]] | None:
        if hasattr(tool_call, "function"):
-            # OpenAI-style: has .function.name and .function.arguments
            call_id = getattr(tool_call, "id", f"call_{id(tool_call)}")
            func_name = sanitize_tool_name(tool_call.function.name)
-            func_args = tool_call.function.arguments
-        elif hasattr(tool_call, "function_call") and tool_call.function_call:
-            # Gemini-style: has .function_call.name and .function_call.args
+            return call_id, func_name, tool_call.function.arguments
+        if hasattr(tool_call, "function_call") and tool_call.function_call:
            call_id = f"call_{id(tool_call)}"
            func_name = sanitize_tool_name(tool_call.function_call.name)
            func_args = (
@@ -716,13 +832,12 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                if tool_call.function_call.args
                else {}
            )
-        elif hasattr(tool_call, "name") and hasattr(tool_call, "input"):
-            # Anthropic format: has .name and .input (ToolUseBlock)
+            return call_id, func_name, func_args
+        if hasattr(tool_call, "name") and hasattr(tool_call, "input"):
            call_id = getattr(tool_call, "id", f"call_{id(tool_call)}")
            func_name = sanitize_tool_name(tool_call.name)
-            func_args = tool_call.input  # Already a dict in Anthropic
-        elif isinstance(tool_call, dict):
-            # Support OpenAI "id", Bedrock "toolUseId", or generate one
+            return call_id, func_name, tool_call.input
+        if isinstance(tool_call, dict):
            call_id = (
                tool_call.get("id")
                or tool_call.get("toolUseId")
@@ -733,10 +848,15 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                func_info.get("name", "") or tool_call.get("name", "")
            )
            func_args = func_info.get("arguments", "{}") or tool_call.get("input", {})
-        else:
-            return None
+            return call_id, func_name, func_args
+        return None
+
+    def _append_assistant_tool_calls_message(
+        self,
+        parsed_calls: list[tuple[str, str, str | dict[str, Any]]],
+    ) -> None:
+        import json

-        # Append assistant message with single tool call
        assistant_message: LLMMessage = {
            "role": "assistant",
            "content": None,
@@ -751,42 +871,54 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                        else json.dumps(func_args),
                    },
                }
+                for call_id, func_name, func_args in parsed_calls
            ],
        }
-
        self.messages.append(assistant_message)

-        # Parse arguments for the single tool call
-        if isinstance(func_args, str):
-            try:
-                args_dict = json.loads(func_args)
-            except json.JSONDecodeError:
-                args_dict = {}
-        else:
-            args_dict = func_args
+    def _execute_single_native_tool_call(
+        self,
+        *,
+        call_id: str,
+        func_name: str,
+        func_args: str | dict[str, Any],
+        available_functions: dict[str, Callable[..., Any]],
+        original_tool: Any | None = None,
+        should_execute: bool = True,
+    ) -> dict[str, Any]:
+        from datetime import datetime
+        import json

-        agent_key = getattr(self.agent, "key", "unknown") if self.agent else "unknown"
+        from crewai.events.types.tool_usage_events import (
+            ToolUsageErrorEvent,
+            ToolUsageFinishedEvent,
+            ToolUsageStartedEvent,
+        )

-        # Find original tool by matching sanitized name (needed for cache_function and result_as_answer)
+        args_dict, parse_error = parse_tool_call_args(func_args, func_name, call_id, original_tool)
+        if parse_error is not None:
+            return parse_error

-        original_tool = None
-        for tool in self.original_tools or []:
-            if sanitize_tool_name(tool.name) == func_name:
-                original_tool = tool
-                break
+        if original_tool is None:
+            for tool in self.original_tools or []:
+                if sanitize_tool_name(tool.name) == func_name:
+                    original_tool = tool
+                    break

-        # Check if tool has reached max usage count
        max_usage_reached = False
-        if original_tool:
-            if (
-                hasattr(original_tool, "max_usage_count")
-                and original_tool.max_usage_count is not None
-                and original_tool.current_usage_count >= original_tool.max_usage_count
-            ):
-                max_usage_reached = True
+        if not should_execute and original_tool:
+            max_usage_reached = True
+        elif (
+            should_execute
+            and original_tool
+            and (max_count := getattr(original_tool, "max_usage_count", None))
+            is not None
+            and getattr(original_tool, "current_usage_count", 0) >= max_count
+        ):
+            max_usage_reached = True

-        # Check cache before executing
        from_cache = False
+        result: str = "Tool not found"
        input_str = json.dumps(args_dict) if args_dict else ""
        if self.tools_handler and self.tools_handler.cache:
            cached_result = self.tools_handler.cache.read(
@@ -800,7 +932,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                )
                from_cache = True

-        # Emit tool usage started event
+        agent_key = getattr(self.agent, "key", "unknown") if self.agent else "unknown"
        started_at = datetime.now()
        crewai_event_bus.emit(
            self,
@@ -816,14 +948,12 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):

        track_delegation_if_needed(func_name, args_dict, self.task)

-        # Find the structured tool for hook context
        structured_tool: CrewStructuredTool | None = None
        for structured in self.tools or []:
            if sanitize_tool_name(structured.name) == func_name:
                structured_tool = structured
                break

-        # Execute before_tool_call hooks
        hook_blocked = False
        before_hook_context = ToolCallHookContext(
            tool_name=func_name,
@@ -847,58 +977,48 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                    color="red",
                )

-        # If hook blocked execution, set result and skip tool execution
        if hook_blocked:
            result = f"Tool execution blocked by hook. Tool: {func_name}"
-        # Execute the tool (only if not cached, not at max usage, and not blocked by hook)
-        elif not from_cache and not max_usage_reached:
-            result = "Tool not found"
-            if func_name in available_functions:
-                try:
-                    tool_func = available_functions[func_name]
-                    raw_result = tool_func(**args_dict)
-
-                    # Add to cache after successful execution (before string conversion)
-                    if self.tools_handler and self.tools_handler.cache:
-                        should_cache = True
-                        if (
-                            original_tool
-                            and hasattr(original_tool, "cache_function")
-                            and callable(original_tool.cache_function)
-                        ):
-                            should_cache = original_tool.cache_function(
-                                args_dict, raw_result
-                            )
-                        if should_cache:
-                            self.tools_handler.cache.add(
-                                tool=func_name, input=input_str, output=raw_result
-                            )
-
-                    # Convert to string for message
-                    result = (
-                        str(raw_result)
-                        if not isinstance(raw_result, str)
-                        else raw_result
-                    )
-                except Exception as e:
-                    result = f"Error executing tool: {e}"
-                    if self.task:
-                        self.task.increment_tools_errors()
-                    crewai_event_bus.emit(
-                        self,
-                        event=ToolUsageErrorEvent(
-                            tool_name=func_name,
-                            tool_args=args_dict,
-                            from_agent=self.agent,
-                            from_task=self.task,
-                            agent_key=agent_key,
-                            error=e,
-                        ),
-                    )
-                    error_event_emitted = True
        elif max_usage_reached and original_tool:
-            # Return error message when max usage limit is reached
            result = f"Tool '{func_name}' has reached its usage limit of {original_tool.max_usage_count} times and cannot be used anymore."
+        elif not from_cache and func_name in available_functions:
+            try:
+                raw_result = available_functions[func_name](**args_dict)
+
+                if self.tools_handler and self.tools_handler.cache:
+                    should_cache = True
+                    if (
+                        original_tool
+                        and hasattr(original_tool, "cache_function")
+                        and callable(original_tool.cache_function)
+                    ):
+                        should_cache = original_tool.cache_function(
+                            args_dict, raw_result
+                        )
+                    if should_cache:
+                        self.tools_handler.cache.add(
+                            tool=func_name, input=input_str, output=raw_result
+                        )
+
+                result = (
+                    str(raw_result) if not isinstance(raw_result, str) else raw_result
+                )
+            except Exception as e:
+                result = f"Error executing tool: {e}"
+                if self.task:
+                    self.task.increment_tools_errors()
+                crewai_event_bus.emit(
+                    self,
+                    event=ToolUsageErrorEvent(
+                        tool_name=func_name,
+                        tool_args=args_dict,
+                        from_agent=self.agent,
+                        from_task=self.task,
+                        agent_key=agent_key,
+                        error=e,
+                    ),
+                )
+                error_event_emitted = True

        after_hook_context = ToolCallHookContext(
            tool_name=func_name,
@@ -938,7 +1058,23 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                ),
            )

-        # Append tool result message
+        return {
+            "call_id": call_id,
+            "func_name": func_name,
+            "result": result,
+            "from_cache": from_cache,
+            "original_tool": original_tool,
+        }
+
+    def _append_tool_result_and_check_finality(
+        self, execution_result: dict[str, Any]
+    ) -> AgentFinish | None:
+        call_id = cast(str, execution_result["call_id"])
+        func_name = cast(str, execution_result["func_name"])
+        result = cast(str, execution_result["result"])
+        from_cache = cast(bool, execution_result["from_cache"])
+        original_tool = execution_result["original_tool"]
+
        tool_message: LLMMessage = {
            "role": "tool",
            "tool_call_id": call_id,
@@ -947,7 +1083,6 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
        }
        self.messages.append(tool_message)

-        # Log the tool execution
        if self.agent and self.agent.verbose:
            cache_info = " (from cache)" if from_cache else ""
            self._printer.print(
@@ -960,20 +1095,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
            and hasattr(original_tool, "result_as_answer")
            and original_tool.result_as_answer
        ):
-            # Return immediately with tool result as final answer
            return AgentFinish(
                thought="Tool result is the final answer",
                output=result,
                text=result,
            )
-
-        # Inject post-tool reasoning prompt to enforce analysis
-        reasoning_prompt = self._i18n.slice("post_tool_reasoning")
-        reasoning_message: LLMMessage = {
-            "role": "user",
-            "content": reasoning_prompt,
-        }
-        self.messages.append(reasoning_message)
        return None

    async def ainvoke(self, inputs: dict[str, Any]) -> dict[str, Any]:
@@ -1133,7 +1259,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                        formatted_answer, tool_result
                    )

-                self._invoke_step_callback(formatted_answer)  # type: ignore[arg-type]
+                await self._ainvoke_step_callback(formatted_answer)  # type: ignore[arg-type]
                self._append_message(formatted_answer.text)  # type: ignore[union-attr]

            except OutputParserError as e:
@@ -1248,7 +1374,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                        output=answer,
                        text=answer,
                    )
-                    self._invoke_step_callback(formatted_answer)
+                    await self._ainvoke_step_callback(formatted_answer)
                    self._append_message(answer)  # Save final answer to messages
                    self._show_logs(formatted_answer)
                    return formatted_answer
@@ -1260,7 +1386,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                        output=answer,
                        text=output_json,
                    )
-                    self._invoke_step_callback(formatted_answer)
+                    await self._ainvoke_step_callback(formatted_answer)
                    self._append_message(output_json)
                    self._show_logs(formatted_answer)
                    return formatted_answer
@@ -1271,7 +1397,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                    output=str(answer),
                    text=str(answer),
                )
-                self._invoke_step_callback(formatted_answer)
+                await self._ainvoke_step_callback(formatted_answer)
                self._append_message(str(answer))  # Save final answer to messages
                self._show_logs(formatted_answer)
                return formatted_answer
@@ -1365,13 +1491,28 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
    def _invoke_step_callback(
        self, formatted_answer: AgentAction | AgentFinish
    ) -> None:
-        """Invoke step callback.
+        """Invoke step callback (sync context).

        Args:
            formatted_answer: Current agent response.
        """
        if self.step_callback:
-            self.step_callback(formatted_answer)
+            cb_result = self.step_callback(formatted_answer)
+            if inspect.iscoroutine(cb_result):
+                asyncio.run(cb_result)
+
+    async def _ainvoke_step_callback(
+        self, formatted_answer: AgentAction | AgentFinish
+    ) -> None:
+        """Invoke step callback (async context).
+
+        Args:
+            formatted_answer: Current agent response.
+        """
+        if self.step_callback:
+            cb_result = self.step_callback(formatted_answer)
+            if inspect.iscoroutine(cb_result):
+                await cb_result

    def _append_message(
        self, text: str, role: Literal["user", "assistant", "system"] = "assistant"
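
Note on the hunk above: the sync `_invoke_step_callback` and the new async `_ainvoke_step_callback` both accept a callback that may be a plain function or a coroutine function. The standalone sketch below illustrates that pattern outside the executor class. The free functions `invoke_step_callback` and `ainvoke_step_callback` and the demo callbacks are hypothetical names chosen for illustration, not part of crewAI.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional

# Hypothetical stand-ins for the two executor methods in the diff.
StepCallback = Optional[Callable[[Any], Any]]


def invoke_step_callback(callback: StepCallback, answer: Any) -> None:
    """Sync context: call the callback and, if it returned a coroutine, run it."""
    if callback is None:
        return
    cb_result = callback(answer)
    if inspect.iscoroutine(cb_result):
        # asyncio.run() starts a fresh event loop. It raises RuntimeError
        # when a loop is already running in this thread, which is why the
        # async code path needs its own variant.
        asyncio.run(cb_result)


async def ainvoke_step_callback(callback: StepCallback, answer: Any) -> None:
    """Async context: call the callback and await the result if it is a coroutine."""
    if callback is None:
        return
    cb_result = callback(answer)
    if inspect.iscoroutine(cb_result):
        await cb_result


seen = []


def sync_cb(answer: Any) -> None:
    seen.append(f"sync:{answer}")


async def async_cb(answer: Any) -> None:
    seen.append(f"async:{answer}")


invoke_step_callback(sync_cb, "a")                 # plain function, called directly
invoke_step_callback(async_cb, "b")                # coroutine driven by asyncio.run()
asyncio.run(ainvoke_step_callback(async_cb, "c"))  # coroutine awaited in a running loop
```

The split matters because `asyncio.run()` cannot be called from inside a running loop. That appears to be why the `ainvoke` call sites in the hunks above switch to the awaited variant.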
--- a/lib/crewai/src/crewai/cli/authentication/main.py
+++ b/lib/crewai/src/crewai/cli/authentication/main.py
@@ -2,8 +2,8 @@ import time
 from typing import TYPE_CHECKING, Any, TypeVar, cast
 import webbrowser

+import httpx
 from pydantic import BaseModel, Field
-import requests
 from rich.console import Console

 from crewai.cli.authentication.utils import validate_jwt_token
@@ -98,7 +98,7 @@ class AuthenticationCommand:
            "scope": " ".join(self.oauth2_provider.get_oauth_scopes()),
            "audience": self.oauth2_provider.get_audience(),
        }
-        response = requests.post(
+        response = httpx.post(
            url=self.oauth2_provider.get_authorize_url(),
            data=device_code_payload,
            timeout=20,
@@ -130,7 +130,7 @@ class AuthenticationCommand:

        attempts = 0
        while True and attempts < 10:
-            response = requests.post(
+            response = httpx.post(
                self.oauth2_provider.get_token_url(), data=token_payload, timeout=30
            )
            token_data = response.json()
@@ -149,7 +149,7 @@ class AuthenticationCommand:
                return

            if token_data["error"] not in ("authorization_pending", "slow_down"):
-                raise requests.HTTPError(
+                raise httpx.HTTPError(
                    token_data.get("error_description") or token_data.get("error")
                )

--- a/lib/crewai/src/crewai/cli/command.py
+++ b/lib/crewai/src/crewai/cli/command.py
@@ -1,5 +1,6 @@
-import requests
-from requests.exceptions import JSONDecodeError
+import json
+
+import httpx
 from rich.console import Console

 from crewai.cli.authentication.token import get_auth_token
@@ -30,16 +31,16 @@ class PlusAPIMixin:
            console.print("Run 'crewai login' to sign up/login.", style="bold green")
            raise SystemExit from None

-    def _validate_response(self, response: requests.Response) -> None:
+    def _validate_response(self, response: httpx.Response) -> None:
        """
        Handle and display error messages from API responses.

        Args:
-            response (requests.Response): The response from the Plus API
+            response (httpx.Response): The response from the Plus API
        """
        try:
            json_response = response.json()
-        except (JSONDecodeError, ValueError):
+        except (json.JSONDecodeError, ValueError):
            console.print(
                "Failed to parse response from Enterprise API failed. Details:",
                style="bold red",
@@ -62,7 +63,7 @@ class PlusAPIMixin:
                    )
            raise SystemExit

-        if not response.ok:
+        if not response.is_success:
            console.print(
                "Request to Enterprise API failed. Details:", style="bold red"
            )
--- a/lib/crewai/src/crewai/cli/constants.py
+++ b/lib/crewai/src/crewai/cli/constants.py
@@ -69,7 +69,7 @@ ENV_VARS: dict[str, list[dict[str, Any]]] = {
        },
        {
            "prompt": "Enter your AWS Region Name (press Enter to skip)",
-            "key_name": "AWS_REGION_NAME",
+            "key_name": "AWS_DEFAULT_REGION",
        },
    ],
    "azure": [
--- a/lib/crewai/src/crewai/cli/enterprise/main.py
+++ b/lib/crewai/src/crewai/cli/enterprise/main.py
@@ -1,7 +1,7 @@
+import json
 from typing import Any, cast

-import requests
-from requests.exceptions import JSONDecodeError, RequestException
+import httpx
 from rich.console import Console

 from crewai.cli.authentication.main import Oauth2Settings, ProviderFactory
@@ -47,12 +47,12 @@ class EnterpriseConfigureCommand(BaseCommand):
                "User-Agent": f"CrewAI-CLI/{get_crewai_version()}",
                "X-Crewai-Version": get_crewai_version(),
            }
-            response = requests.get(oauth_endpoint, timeout=30, headers=headers)
+            response = httpx.get(oauth_endpoint, timeout=30, headers=headers)
            response.raise_for_status()

            try:
                oauth_config = response.json()
-            except JSONDecodeError as e:
+            except json.JSONDecodeError as e:
                raise ValueError(f"Invalid JSON response from {oauth_endpoint}") from e

            self._validate_oauth_config(oauth_config)
@@ -62,7 +62,7 @@ class EnterpriseConfigureCommand(BaseCommand):
            )
            return cast(dict[str, Any], oauth_config)

-        except RequestException as e:
+        except httpx.HTTPError as e:
            raise ValueError(f"Failed to connect to enterprise URL: {e!s}") from e
        except Exception as e:
            raise ValueError(f"Error fetching OAuth2 configuration: {e!s}") from e
--- a/lib/crewai/src/crewai/cli/memory_tui.py
+++ b/lib/crewai/src/crewai/cli/memory_tui.py
@@ -290,13 +290,20 @@ class MemoryTUI(App[None]):
        if self._memory is None:
            panel.update(self._init_error or "No memory loaded.")
            return
+        display_limit = 1000
        info = self._memory.info(path)
        self._last_scope_info = info
-        self._entries = self._memory.list_records(scope=path, limit=200)
+        self._entries = self._memory.list_records(scope=path, limit=display_limit)
        panel.update(_format_scope_info(info))
        panel.border_title = "Detail"
        entry_list = self.query_one("#entry-list", OptionList)
-        entry_list.border_title = f"Entries ({len(self._entries)})"
+        capped = info.record_count > display_limit
+        count_label = (
+            f"Entries (showing {display_limit} of {info.record_count} — display limit)"
+            if capped
+            else f"Entries ({len(self._entries)})"
+        )
+        entry_list.border_title = count_label
        self._populate_entry_list()

    def on_option_list_option_highlighted(
@@ -376,6 +383,11 @@ class MemoryTUI(App[None]):
                return

            info_lines: list[str] = []
+            info_lines.append(
+                "[dim italic]Searched the full dataset"
+                + (f" within [bold]{scope}[/]" if scope else "")
+                + " using the recall flow (semantic + recency + importance).[/]\n"
+            )
            if not self._custom_embedder:
                info_lines.append(
                    "[dim italic]Note: Using default OpenAI embedder. "
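
Note on the first `memory_tui.py` hunk: the border title now says explicitly when the entry list has been capped at the display limit. The sketch below isolates that decision as a pure function. `entry_list_title` is a hypothetical helper and its label wording is simplified; the diff builds the label inline from `info.record_count` and `len(self._entries)`.

```python
def entry_list_title(shown: int, total: int, display_limit: int = 1000) -> str:
    """Return the entry-list border title.

    shown: number of records actually loaded (len(self._entries) in the diff).
    total: total records in the scope (info.record_count in the diff).
    """
    capped = total > display_limit
    if capped:
        # Make it explicit that the list is truncated for display only.
        return f"Entries (showing {display_limit} of {total}, display limit)"
    return f"Entries ({shown})"


small_scope = entry_list_title(shown=42, total=42)
large_scope = entry_list_title(shown=1000, total=2500)
at_limit = entry_list_title(shown=1000, total=1000)
```

A scope holding exactly `display_limit` records is not treated as capped, because the comparison is a strict `>`.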
--- a/lib/crewai/src/crewai/cli/organization/main.py
+++ b/lib/crewai/src/crewai/cli/organization/main.py
@@ -1,4 +1,4 @@
-from requests import HTTPError
+from httpx import HTTPStatusError
 from rich.console import Console
 from rich.table import Table

@@ -10,11 +10,11 @@ console = Console()


 class OrganizationCommand(BaseCommand, PlusAPIMixin):
-    def __init__(self):
+    def __init__(self) -> None:
        BaseCommand.__init__(self)
        PlusAPIMixin.__init__(self, telemetry=self._telemetry)

-    def list(self):
+    def list(self) -> None:
        try:
            response = self.plus_api_client.get_organizations()
            response.raise_for_status()
@@ -33,7 +33,7 @@ class OrganizationCommand(BaseCommand, PlusAPIMixin):
                table.add_row(org["name"], org["uuid"])

            console.print(table)
-        except HTTPError as e:
+        except HTTPStatusError as e:
            if e.response.status_code == 401:
                console.print(
                    "You are not logged in to any organization. Use 'crewai login' to login.",
@@ -50,7 +50,7 @@ class OrganizationCommand(BaseCommand, PlusAPIMixin):
            )
            raise SystemExit(1) from e

-    def switch(self, org_id):
+    def switch(self, org_id: str) -> None:
        try:
            response = self.plus_api_client.get_organizations()
            response.raise_for_status()
@@ -72,7 +72,7 @@ class OrganizationCommand(BaseCommand, PlusAPIMixin):
                f"Successfully switched to {org['name']} ({org['uuid']})",
                style="bold green",
            )
-        except HTTPError as e:
+        except HTTPStatusError as e:
            if e.response.status_code == 401:
                console.print(
                    "You are not logged in to any organization. Use 'crewai login' to login.",
@@ -87,7 +87,7 @@ class OrganizationCommand(BaseCommand, PlusAPIMixin):
            console.print(f"Failed to switch organization: {e!s}", style="bold red")
            raise SystemExit(1) from e

-    def current(self):
+    def current(self) -> None:
        settings = Settings()
        if settings.org_uuid:
            console.print(
--- a/lib/crewai/src/crewai/cli/plus_api.py
+++ b/lib/crewai/src/crewai/cli/plus_api.py
@@ -3,7 +3,6 @@ from typing import Any
 from urllib.parse import urljoin

 import httpx
-import requests

 from crewai.cli.config import Settings
 from crewai.cli.constants import DEFAULT_CREWAI_ENTERPRISE_URL
@@ -43,16 +42,16 @@ class PlusAPI:

    def _make_request(
        self, method: str, endpoint: str, **kwargs: Any
-    ) -> requests.Response:
+    ) -> httpx.Response:
        url = urljoin(self.base_url, endpoint)
-        session = requests.Session()
-        session.trust_env = False
-        return session.request(method, url, headers=self.headers, **kwargs)
+        verify = kwargs.pop("verify", True)
+        with httpx.Client(trust_env=False, verify=verify) as client:
+            return client.request(method, url, headers=self.headers, **kwargs)

-    def login_to_tool_repository(self) -> requests.Response:
+    def login_to_tool_repository(self) -> httpx.Response:
        return self._make_request("POST", f"{self.TOOLS_RESOURCE}/login")

-    def get_tool(self, handle: str) -> requests.Response:
+    def get_tool(self, handle: str) -> httpx.Response:
        return self._make_request("GET", f"{self.TOOLS_RESOURCE}/{handle}")

    async def get_agent(self, handle: str) -> httpx.Response:
@@ -68,7 +67,7 @@ class PlusAPI:
        description: str | None,
        encoded_file: str,
        available_exports: list[dict[str, Any]] | None = None,
-    ) -> requests.Response:
+    ) -> httpx.Response:
        params = {
            "handle": handle,
            "public": is_public,
@@ -79,54 +78,52 @@ class PlusAPI:
        }
        return self._make_request("POST", f"{self.TOOLS_RESOURCE}", json=params)

-    def deploy_by_name(self, project_name: str) -> requests.Response:
+    def deploy_by_name(self, project_name: str) -> httpx.Response:
        return self._make_request(
            "POST", f"{self.CREWS_RESOURCE}/by-name/{project_name}/deploy"
        )

-    def deploy_by_uuid(self, uuid: str) -> requests.Response:
+    def deploy_by_uuid(self, uuid: str) -> httpx.Response:
        return self._make_request("POST", f"{self.CREWS_RESOURCE}/{uuid}/deploy")

-    def crew_status_by_name(self, project_name: str) -> requests.Response:
+    def crew_status_by_name(self, project_name: str) -> httpx.Response:
        return self._make_request(
            "GET", f"{self.CREWS_RESOURCE}/by-name/{project_name}/status"
        )

-    def crew_status_by_uuid(self, uuid: str) -> requests.Response:
+    def crew_status_by_uuid(self, uuid: str) -> httpx.Response:
        return self._make_request("GET", f"{self.CREWS_RESOURCE}/{uuid}/status")

    def crew_by_name(
        self, project_name: str, log_type: str = "deployment"
-    ) -> requests.Response:
+    ) -> httpx.Response:
        return self._make_request(
            "GET", f"{self.CREWS_RESOURCE}/by-name/{project_name}/logs/{log_type}"
        )

-    def crew_by_uuid(
-        self, uuid: str, log_type: str = "deployment"
-    ) -> requests.Response:
+    def crew_by_uuid(self, uuid: str, log_type: str = "deployment") -> httpx.Response:
        return self._make_request(
            "GET", f"{self.CREWS_RESOURCE}/{uuid}/logs/{log_type}"
        )

-    def delete_crew_by_name(self, project_name: str) -> requests.Response:
+    def delete_crew_by_name(self, project_name: str) -> httpx.Response:
        return self._make_request(
            "DELETE", f"{self.CREWS_RESOURCE}/by-name/{project_name}"
        )

-    def delete_crew_by_uuid(self, uuid: str) -> requests.Response:
+    def delete_crew_by_uuid(self, uuid: str) -> httpx.Response:
        return self._make_request("DELETE", f"{self.CREWS_RESOURCE}/{uuid}")

-    def list_crews(self) -> requests.Response:
+    def list_crews(self) -> httpx.Response:
        return self._make_request("GET", self.CREWS_RESOURCE)

-    def create_crew(self, payload: dict[str, Any]) -> requests.Response:
+    def create_crew(self, payload: dict[str, Any]) -> httpx.Response:
        return self._make_request("POST", self.CREWS_RESOURCE, json=payload)

-    def get_organizations(self) -> requests.Response:
+    def get_organizations(self) -> httpx.Response:
        return self._make_request("GET", self.ORGANIZATIONS_RESOURCE)

-    def initialize_trace_batch(self, payload: dict[str, Any]) -> requests.Response:
+    def initialize_trace_batch(self, payload: dict[str, Any]) -> httpx.Response:
        return self._make_request(
            "POST",
            f"{self.TRACING_RESOURCE}/batches",
@@ -136,7 +133,7 @@ class PlusAPI:

    def initialize_ephemeral_trace_batch(
        self, payload: dict[str, Any]
-    ) -> requests.Response:
+    ) -> httpx.Response:
        return self._make_request(
            "POST",
            f"{self.EPHEMERAL_TRACING_RESOURCE}/batches",
@@ -145,7 +142,7 @@ class PlusAPI:

    def send_trace_events(
        self, trace_batch_id: str, payload: dict[str, Any]
-    ) -> requests.Response:
+    ) -> httpx.Response:
        return self._make_request(
            "POST",
            f"{self.TRACING_RESOURCE}/batches/{trace_batch_id}/events",
@@ -155,7 +152,7 @@ class PlusAPI:

    def send_ephemeral_trace_events(
        self, trace_batch_id: str, payload: dict[str, Any]
-    ) -> requests.Response:
+    ) -> httpx.Response:
        return self._make_request(
            "POST",
            f"{self.EPHEMERAL_TRACING_RESOURCE}/batches/{trace_batch_id}/events",
@@ -165,7 +162,7 @@ class PlusAPI:

    def finalize_trace_batch(
        self, trace_batch_id: str, payload: dict[str, Any]
-    ) -> requests.Response:
+    ) -> httpx.Response:
        return self._make_request(
            "PATCH",
            f"{self.TRACING_RESOURCE}/batches/{trace_batch_id}/finalize",
@@ -175,7 +172,7 @@ class PlusAPI:

    def finalize_ephemeral_trace_batch(
        self, trace_batch_id: str, payload: dict[str, Any]
-    ) -> requests.Response:
+    ) -> httpx.Response:
        return self._make_request(
            "PATCH",
            f"{self.EPHEMERAL_TRACING_RESOURCE}/batches/{trace_batch_id}/finalize",
@@ -185,7 +182,7 @@ class PlusAPI:

    def mark_trace_batch_as_failed(
        self, trace_batch_id: str, error_message: str
-    ) -> requests.Response:
+    ) -> httpx.Response:
        return self._make_request(
            "PATCH",
            f"{self.TRACING_RESOURCE}/batches/{trace_batch_id}",
@@ -193,13 +190,20 @@ class PlusAPI:
            timeout=30,
        )

-    def get_triggers(self) -> requests.Response:
+    def get_mcp_configs(self, slugs: list[str]) -> httpx.Response:
+        """Get MCP server configurations for the given slugs."""
+        return self._make_request(
+            "GET",
+            f"{self.INTEGRATIONS_RESOURCE}/mcp_configs",
+            params={"slugs": ",".join(slugs)},
+            timeout=30,
+        )
+
+    def get_triggers(self) -> httpx.Response:
        """Get all available triggers from integrations."""
        return self._make_request("GET", f"{self.INTEGRATIONS_RESOURCE}/apps")

-    def get_trigger_payload(
-        self, app_slug: str, trigger_slug: str
-    ) -> requests.Response:
+    def get_trigger_payload(self, app_slug: str, trigger_slug: str) -> httpx.Response:
        """Get sample payload for a specific trigger."""
        return self._make_request(
            "GET", f"{self.INTEGRATIONS_RESOURCE}/{app_slug}/{trigger_slug}/payload"
--- a/lib/crewai/src/crewai/cli/provider.py
+++ b/lib/crewai/src/crewai/cli/provider.py
@@ -8,7 +8,7 @@ from typing import Any

 import certifi
 import click
-import requests
+import httpx

 from crewai.cli.constants import JSON_URL, MODELS, PROVIDERS

@@ -165,20 +165,20 @@ def fetch_provider_data(cache_file: Path) -> dict[str, Any] | None:
    ssl_config = os.environ["SSL_CERT_FILE"] = certifi.where()

    try:
-        response = requests.get(JSON_URL, stream=True, timeout=60, verify=ssl_config)
-        response.raise_for_status()
-        data = download_data(response)
-        with open(cache_file, "w") as f:
-            json.dump(data, f)
-        return data
-    except requests.RequestException as e:
+        with httpx.stream("GET", JSON_URL, timeout=60, verify=ssl_config) as response:
+            response.raise_for_status()
+            data = download_data(response)
+            with open(cache_file, "w") as f:
+                json.dump(data, f)
+            return data
+    except httpx.HTTPError as e:
        click.secho(f"Error fetching provider data: {e}", fg="red")
    except json.JSONDecodeError:
        click.secho("Error parsing provider data. Invalid JSON format.", fg="red")
    return None


-def download_data(response: requests.Response) -> dict[str, Any]:
+def download_data(response: httpx.Response) -> dict[str, Any]:
    """Downloads data from a given HTTP response and returns the JSON content.

    Args:
@@ -194,7 +194,7 @@ def download_data(response: requests.Response) -> dict[str, Any]:
    with click.progressbar(
        length=total_size, label="Downloading", show_pos=True
    ) as bar:
-        for chunk in response.iter_content(block_size):
+        for chunk in response.iter_bytes(block_size):
            if chunk:
                data_chunks.append(chunk)
                bar.update(len(chunk))
--- a/lib/crewai/src/crewai/cli/templates/crew/pyproject.toml
+++ b/lib/crewai/src/crewai/cli/templates/crew/pyproject.toml
@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.9.3"
+    "crewai[tools]==1.10.1a1"
 ]

 [project.scripts]
--- a/lib/crewai/src/crewai/cli/templates/flow/pyproject.toml
+++ b/lib/crewai/src/crewai/cli/templates/flow/pyproject.toml
@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.9.3"
+    "crewai[tools]==1.10.1a1"
 ]

 [project.scripts]
--- a/lib/crewai/src/crewai/cli/templates/tool/pyproject.toml
+++ b/lib/crewai/src/crewai/cli/templates/tool/pyproject.toml
@@ -5,7 +5,7 @@ description = "Power up your crews with {{folder_name}}"
 readme = "README.md"
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]>=0.203.1"
+    "crewai[tools]==1.10.1a1"
 ]

 [tool.crewai]
--- a/lib/crewai/src/crewai/events/__init__.py
+++ b/lib/crewai/src/crewai/events/__init__.py
@@ -63,6 +63,7 @@ from crewai.events.types.logging_events import (
    AgentLogsStartedEvent,
 )
 from crewai.events.types.mcp_events import (
+    MCPConfigFetchFailedEvent,
    MCPConnectionCompletedEvent,
    MCPConnectionFailedEvent,
    MCPConnectionStartedEvent,
@@ -165,6 +166,7 @@ __all__ = [
    "LiteAgentExecutionCompletedEvent",
    "LiteAgentExecutionErrorEvent",
    "LiteAgentExecutionStartedEvent",
+    "MCPConfigFetchFailedEvent",
    "MCPConnectionCompletedEvent",
    "MCPConnectionFailedEvent",
    "MCPConnectionStartedEvent",
--- a/lib/crewai/src/crewai/events/event_listener.py
+++ b/lib/crewai/src/crewai/events/event_listener.py
@@ -68,6 +68,7 @@ from crewai.events.types.logging_events import (
    AgentLogsStartedEvent,
 )
 from crewai.events.types.mcp_events import (
+    MCPConfigFetchFailedEvent,
    MCPConnectionCompletedEvent,
    MCPConnectionFailedEvent,
    MCPConnectionStartedEvent,
@@ -665,6 +666,16 @@ class EventListener(BaseEventListener):
                event.error_type,
            )

+        @crewai_event_bus.on(MCPConfigFetchFailedEvent)
+        def on_mcp_config_fetch_failed(
+            _: Any, event: MCPConfigFetchFailedEvent
+        ) -> None:
+            self.formatter.handle_mcp_config_fetch_failed(
+                event.slug,
+                event.error,
+                event.error_type,
+            )
+
        @crewai_event_bus.on(MCPToolExecutionStartedEvent)
        def on_mcp_tool_execution_started(
            _: Any, event: MCPToolExecutionStartedEvent
--- a/lib/crewai/src/crewai/events/event_types.py
+++ b/lib/crewai/src/crewai/events/event_types.py
@@ -67,6 +67,7 @@ from crewai.events.types.llm_guardrail_events import (
    LLMGuardrailStartedEvent,
 )
 from crewai.events.types.mcp_events import (
+    MCPConfigFetchFailedEvent,
    MCPConnectionCompletedEvent,
    MCPConnectionFailedEvent,
    MCPConnectionStartedEvent,
@@ -181,4 +182,5 @@ EventTypes = (
    | MCPToolExecutionStartedEvent
    | MCPToolExecutionCompletedEvent
    | MCPToolExecutionFailedEvent
+    | MCPConfigFetchFailedEvent
 )
--- a/lib/crewai/src/crewai/events/types/mcp_events.py
+++ b/lib/crewai/src/crewai/events/types/mcp_events.py
@@ -83,3 +83,16 @@ class MCPToolExecutionFailedEvent(MCPEvent):
    error_type: str | None = None  # "timeout", "validation", "server_error", etc.
    started_at: datetime | None = None
    failed_at: datetime | None = None
+
+
+class MCPConfigFetchFailedEvent(BaseEvent):
+    """Event emitted when fetching an AMP MCP server config fails.
+
+    This covers cases where the slug is not connected, the API call
+    failed, or native MCP resolution failed after config was fetched.
+    """
+
+    type: str = "mcp_config_fetch_failed"
+    slug: str
+    error: str
+    error_type: str | None = None  # "not_connected", "api_error", "connection_failed"
--- a/lib/crewai/src/crewai/events/utils/console_formatter.py
+++ b/lib/crewai/src/crewai/events/utils/console_formatter.py
@@ -1512,6 +1512,34 @@ To enable tracing, do any one of these:
        self.print(panel)
        self.print()

+    def handle_mcp_config_fetch_failed(
+        self,
+        slug: str,
+        error: str = "",
+        error_type: str | None = None,
+    ) -> None:
+        """Handle MCP config fetch failed event (AMP resolution failures)."""
+        if not self.verbose:
+            return
+
+        content = Text()
+        content.append("MCP Config Fetch Failed\n\n", style="red bold")
+        content.append("Server: ", style="white")
+        content.append(f"{slug}\n", style="red")
+
+        if error_type:
+            content.append("Error Type: ", style="white")
+            content.append(f"{error_type}\n", style="red")
+
+        if error:
+            content.append("\nError: ", style="white bold")
+            error_preview = error[:500] + "..." if len(error) > 500 else error
+            content.append(f"{error_preview}\n", style="red")
+
+        panel = self.create_panel(content, "❌ MCP Config Failed", "red")
+        self.print(panel)
+        self.print()
+
    def handle_mcp_tool_execution_started(
        self,
        server_name: str,
--- a/lib/crewai/src/crewai/experimental/agent_executor.py
+++ b/lib/crewai/src/crewai/experimental/agent_executor.py
@@ -1,7 +1,10 @@
 from __future__ import annotations

+import asyncio
 from collections.abc import Callable, Coroutine
+from concurrent.futures import ThreadPoolExecutor, as_completed
 from datetime import datetime
+import inspect
 import json
 import threading
 from typing import TYPE_CHECKING, Any, Literal, cast
@@ -63,6 +66,7 @@ from crewai.utilities.agent_utils import (
    has_reached_max_iterations,
    is_context_length_exceeded,
    is_inside_event_loop,
+    parse_tool_call_args,
    process_llm_response,
    track_delegation_if_needed,
 )
@@ -668,9 +672,12 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
        if not self.state.pending_tool_calls:
            return "native_tool_completed"

+        pending_tool_calls = list(self.state.pending_tool_calls)
+        self.state.pending_tool_calls.clear()
+
        # Group all tool calls into a single assistant message
        tool_calls_to_report = []
-        for tool_call in self.state.pending_tool_calls:
+        for tool_call in pending_tool_calls:
            info = extract_tool_call_info(tool_call)
            if not info:
                continue
@@ -695,202 +702,86 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
                "content": None,
                "tool_calls": tool_calls_to_report,
            }
-            if all(
-                type(tc).__qualname__ == "Part" for tc in self.state.pending_tool_calls
-            ):
-                assistant_message["raw_tool_call_parts"] = list(
-                    self.state.pending_tool_calls
-                )
+            if all(type(tc).__qualname__ == "Part" for tc in pending_tool_calls):
+                assistant_message["raw_tool_call_parts"] = list(pending_tool_calls)
            self.state.messages.append(assistant_message)

-        # Now execute each tool
-        while self.state.pending_tool_calls:
-            tool_call = self.state.pending_tool_calls.pop(0)
-            info = extract_tool_call_info(tool_call)
-            if not info:
-                continue
+        runnable_tool_calls = [
+            tool_call
+            for tool_call in pending_tool_calls
+            if extract_tool_call_info(tool_call) is not None
+        ]
+        should_parallelize = self._should_parallelize_native_tool_calls(
+            runnable_tool_calls
+        )

-            call_id, func_name, func_args = info
-
-            # Parse arguments
-            if isinstance(func_args, str):
-                try:
-                    args_dict = json.loads(func_args)
-                except json.JSONDecodeError:
-                    args_dict = {}
-            else:
-                args_dict = func_args
-
-            # Get agent_key for event tracking
-            agent_key = (
-                getattr(self.agent, "key", "unknown") if self.agent else "unknown"
-            )
-
-            # Find original tool by matching sanitized name (needed for cache_function and result_as_answer)
-            original_tool = None
-            for tool in self.original_tools or []:
-                if sanitize_tool_name(tool.name) == func_name:
-                    original_tool = tool
-                    break
-
-            # Check if tool has reached max usage count
-            max_usage_reached = False
-            if (
-                original_tool
-                and original_tool.max_usage_count is not None
-                and original_tool.current_usage_count >= original_tool.max_usage_count
-            ):
-                max_usage_reached = True
-
-            # Check cache before executing
-            from_cache = False
-            input_str = json.dumps(args_dict) if args_dict else ""
-            if self.tools_handler and self.tools_handler.cache:
-                cached_result = self.tools_handler.cache.read(
-                    tool=func_name, input=input_str
+        execution_results: list[dict[str, Any]] = []
+        if should_parallelize:
+            max_workers = min(8, len(runnable_tool_calls))
+            with ThreadPoolExecutor(max_workers=max_workers) as pool:
+                future_to_idx = {
+                    pool.submit(self._execute_single_native_tool_call, tool_call): idx
+                    for idx, tool_call in enumerate(runnable_tool_calls)
+                }
+                ordered_results: list[dict[str, Any] | None] = [None] * len(
+                    runnable_tool_calls
                )
-                if cached_result is not None:
-                    result = (
-                        str(cached_result)
-                        if not isinstance(cached_result, str)
-                        else cached_result
-                    )
-                    from_cache = True
+                for future in as_completed(future_to_idx):
+                    idx = future_to_idx[future]
+                    ordered_results[idx] = future.result()
+                execution_results = [
+                    result for result in ordered_results if result is not None
+                ]
+        else:
+            # Execute sequentially so result_as_answer tools can short-circuit
+            # immediately without running remaining calls.
+            for tool_call in runnable_tool_calls:
+                execution_result = self._execute_single_native_tool_call(tool_call)
+                call_id = cast(str, execution_result["call_id"])
+                func_name = cast(str, execution_result["func_name"])
+                result = cast(str, execution_result["result"])
+                from_cache = cast(bool, execution_result["from_cache"])
+                original_tool = execution_result["original_tool"]

-            # Emit tool usage started event
-            started_at = datetime.now()
-            crewai_event_bus.emit(
-                self,
-                event=ToolUsageStartedEvent(
-                    tool_name=func_name,
-                    tool_args=args_dict,
-                    from_agent=self.agent,
-                    from_task=self.task,
-                    agent_key=agent_key,
-                ),
-            )
-            error_event_emitted = False
+                tool_message: LLMMessage = {
+                    "role": "tool",
+                    "tool_call_id": call_id,
+                    "name": func_name,
+                    "content": result,
+                }
+                self.state.messages.append(tool_message)

-            track_delegation_if_needed(func_name, args_dict, self.task)
-
-            structured_tool: CrewStructuredTool | None = None
-            for structured in self.tools or []:
-                if sanitize_tool_name(structured.name) == func_name:
-                    structured_tool = structured
-                    break
-
-            hook_blocked = False
-            before_hook_context = ToolCallHookContext(
-                tool_name=func_name,
-                tool_input=args_dict,
-                tool=structured_tool,  # type: ignore[arg-type]
-                agent=self.agent,
-                task=self.task,
-                crew=self.crew,
-            )
-            before_hooks = get_before_tool_call_hooks()
-            try:
-                for hook in before_hooks:
-                    hook_result = hook(before_hook_context)
-                    if hook_result is False:
-                        hook_blocked = True
-                        break
-            except Exception as hook_error:
-                if self.agent.verbose:
+                # Log the tool execution
+                if self.agent and self.agent.verbose:
+                    cache_info = " (from cache)" if from_cache else ""
                    self._printer.print(
-                        content=f"Error in before_tool_call hook: {hook_error}",
-                        color="red",
+                        content=f"Tool {func_name} executed with result{cache_info}: {result[:200]}...",
+                        color="green",
                    )

-            if hook_blocked:
-                result = f"Tool execution blocked by hook. Tool: {func_name}"
-            elif not from_cache and not max_usage_reached:
-                result = "Tool not found"
-                if func_name in self._available_functions:
-                    try:
-                        tool_func = self._available_functions[func_name]
-                        raw_result = tool_func(**args_dict)
-
-                        # Add to cache after successful execution (before string conversion)
-                        if self.tools_handler and self.tools_handler.cache:
-                            should_cache = True
-                            if original_tool:
-                                should_cache = original_tool.cache_function(
-                                    args_dict, raw_result
-                                )
-                            if should_cache:
-                                self.tools_handler.cache.add(
-                                    tool=func_name, input=input_str, output=raw_result
-                                )
-
-                        # Convert to string for message
-                        result = (
-                            str(raw_result)
-                            if not isinstance(raw_result, str)
-                            else raw_result
-                        )
-                    except Exception as e:
-                        result = f"Error executing tool: {e}"
-                        if self.task:
-                            self.task.increment_tools_errors()
-                        # Emit tool usage error event
-                        crewai_event_bus.emit(
-                            self,
-                            event=ToolUsageErrorEvent(
-                                tool_name=func_name,
-                                tool_args=args_dict,
-                                from_agent=self.agent,
-                                from_task=self.task,
-                                agent_key=agent_key,
-                                error=e,
-                            ),
-                        )
-                        error_event_emitted = True
-            elif max_usage_reached and original_tool:
-                # Return error message when max usage limit is reached
-                result = f"Tool '{func_name}' has reached its usage limit of {original_tool.max_usage_count} times and cannot be used anymore."
-
-            # Execute after_tool_call hooks (even if blocked, to allow logging/monitoring)
-            after_hook_context = ToolCallHookContext(
-                tool_name=func_name,
-                tool_input=args_dict,
-                tool=structured_tool,  # type: ignore[arg-type]
-                agent=self.agent,
-                task=self.task,
-                crew=self.crew,
-                tool_result=result,
-            )
-            after_hooks = get_after_tool_call_hooks()
-            try:
-                for after_hook in after_hooks:
-                    after_hook_result = after_hook(after_hook_context)
-                    if after_hook_result is not None:
-                        result = after_hook_result
-                        after_hook_context.tool_result = result
-            except Exception as hook_error:
-                if self.agent.verbose:
-                    self._printer.print(
-                        content=f"Error in after_tool_call hook: {hook_error}",
-                        color="red",
-                    )
-
-            if not error_event_emitted:
-                crewai_event_bus.emit(
-                    self,
-                    event=ToolUsageFinishedEvent(
+                if (
+                    original_tool
+                    and hasattr(original_tool, "result_as_answer")
+                    and original_tool.result_as_answer
+                ):
+                    self.state.current_answer = AgentFinish(
+                        thought="Tool result is the final answer",
                        output=result,
-                        tool_name=func_name,
-                        tool_args=args_dict,
-                        from_agent=self.agent,
-                        from_task=self.task,
-                        agent_key=agent_key,
-                        started_at=started_at,
-                        finished_at=datetime.now(),
-                    ),
-                )
+                        text=result,
+                    )
+                    self.state.is_finished = True
+                    return "tool_result_is_final"

-            # Append tool result message
-            tool_message: LLMMessage = {
+            return "native_tool_completed"
+
+        for execution_result in execution_results:
+            call_id = cast(str, execution_result["call_id"])
+            func_name = cast(str, execution_result["func_name"])
+            result = cast(str, execution_result["result"])
+            from_cache = cast(bool, execution_result["from_cache"])
+            original_tool = execution_result["original_tool"]
+
+            tool_message = {
                "role": "tool",
                "tool_call_id": call_id,
                "name": func_name,
@@ -922,6 +813,220 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):

        return "native_tool_completed"

+    def _should_parallelize_native_tool_calls(self, tool_calls: list[Any]) -> bool:
+        """Determine if native tool calls are safe to run in parallel."""
+        if len(tool_calls) <= 1:
+            return False
+
+        for tool_call in tool_calls:
+            info = extract_tool_call_info(tool_call)
+            if not info:
+                continue
+            _, func_name, _ = info
+
+            original_tool = None
+            for tool in self.original_tools or []:
+                if sanitize_tool_name(tool.name) == func_name:
+                    original_tool = tool
+                    break
+
+            if not original_tool:
+                continue
+
+            if getattr(original_tool, "result_as_answer", False):
+                return False
+            if getattr(original_tool, "max_usage_count", None) is not None:
+                return False
+
+        return True
+
+    def _execute_single_native_tool_call(self, tool_call: Any) -> dict[str, Any]:
+        """Execute a single native tool call and return metadata/result."""
+        info = extract_tool_call_info(tool_call)
+        if not info:
+            raise ValueError("Invalid native tool call format")
+
+        call_id, func_name, func_args = info
+
+        # Parse arguments
+        args_dict, parse_error = parse_tool_call_args(func_args, func_name, call_id)
+        if parse_error is not None:
+            return parse_error
+
+        # Get agent_key for event tracking
+        agent_key = getattr(self.agent, "key", "unknown") if self.agent else "unknown"
+
+        # Find original tool by matching sanitized name (needed for cache_function and result_as_answer)
+        original_tool = None
+        for tool in self.original_tools or []:
+            if sanitize_tool_name(tool.name) == func_name:
+                original_tool = tool
+                break
+
+        # Check if tool has reached max usage count
+        max_usage_reached = False
+        if (
+            original_tool
+            and original_tool.max_usage_count is not None
+            and original_tool.current_usage_count >= original_tool.max_usage_count
+        ):
+            max_usage_reached = True
+
+        # Check cache before executing
+        from_cache = False
+        input_str = json.dumps(args_dict) if args_dict else ""
+        if self.tools_handler and self.tools_handler.cache:
+            cached_result = self.tools_handler.cache.read(
+                tool=func_name, input=input_str
+            )
+            if cached_result is not None:
+                result = (
+                    str(cached_result)
+                    if not isinstance(cached_result, str)
+                    else cached_result
+                )
+                from_cache = True
+
+        # Emit tool usage started event
+        started_at = datetime.now()
+        crewai_event_bus.emit(
+            self,
+            event=ToolUsageStartedEvent(
+                tool_name=func_name,
+                tool_args=args_dict,
+                from_agent=self.agent,
+                from_task=self.task,
+                agent_key=agent_key,
+            ),
+        )
+        error_event_emitted = False
+
+        track_delegation_if_needed(func_name, args_dict, self.task)
+
+        structured_tool: CrewStructuredTool | None = None
+        for structured in self.tools or []:
+            if sanitize_tool_name(structured.name) == func_name:
+                structured_tool = structured
+                break
+
+        hook_blocked = False
+        before_hook_context = ToolCallHookContext(
+            tool_name=func_name,
+            tool_input=args_dict,
+            tool=structured_tool,  # type: ignore[arg-type]
+            agent=self.agent,
+            task=self.task,
+            crew=self.crew,
+        )
+        before_hooks = get_before_tool_call_hooks()
+        try:
+            for hook in before_hooks:
+                hook_result = hook(before_hook_context)
+                if hook_result is False:
+                    hook_blocked = True
+                    break
+        except Exception as hook_error:
+            if self.agent.verbose:
+                self._printer.print(
+                    content=f"Error in before_tool_call hook: {hook_error}",
+                    color="red",
+                )
+
+        if hook_blocked:
+            result = f"Tool execution blocked by hook. Tool: {func_name}"
+        elif not from_cache and not max_usage_reached:
+            result = "Tool not found"
+            if func_name in self._available_functions:
+                try:
+                    tool_func = self._available_functions[func_name]
+                    raw_result = tool_func(**args_dict)
+
+                    # Add to cache after successful execution (before string conversion)
+                    if self.tools_handler and self.tools_handler.cache:
+                        should_cache = True
+                        if original_tool:
+                            should_cache = original_tool.cache_function(
+                                args_dict, raw_result
+                            )
+                        if should_cache:
+                            self.tools_handler.cache.add(
+                                tool=func_name, input=input_str, output=raw_result
+                            )
+
+                    # Convert to string for message
+                    result = (
+                        str(raw_result)
+                        if not isinstance(raw_result, str)
+                        else raw_result
+                    )
+                except Exception as e:
+                    result = f"Error executing tool: {e}"
+                    if self.task:
+                        self.task.increment_tools_errors()
+                    # Emit tool usage error event
+                    crewai_event_bus.emit(
+                        self,
+                        event=ToolUsageErrorEvent(
+                            tool_name=func_name,
+                            tool_args=args_dict,
+                            from_agent=self.agent,
+                            from_task=self.task,
+                            agent_key=agent_key,
+                            error=e,
+                        ),
+                    )
+                    error_event_emitted = True
+        elif max_usage_reached and original_tool:
+            # Return error message when max usage limit is reached
+            result = f"Tool '{func_name}' has reached its usage limit of {original_tool.max_usage_count} times and cannot be used anymore."
+
+        # Execute after_tool_call hooks (even if blocked, to allow logging/monitoring)
+        after_hook_context = ToolCallHookContext(
+            tool_name=func_name,
+            tool_input=args_dict,
+            tool=structured_tool,  # type: ignore[arg-type]
+            agent=self.agent,
+            task=self.task,
+            crew=self.crew,
+            tool_result=result,
+        )
+        after_hooks = get_after_tool_call_hooks()
+        try:
+            for after_hook in after_hooks:
+                after_hook_result = after_hook(after_hook_context)
+                if after_hook_result is not None:
+                    result = after_hook_result
+                    after_hook_context.tool_result = result
+        except Exception as hook_error:
+            if self.agent.verbose:
+                self._printer.print(
+                    content=f"Error in after_tool_call hook: {hook_error}",
+                    color="red",
+                )
+
+        if not error_event_emitted:
+            crewai_event_bus.emit(
+                self,
+                event=ToolUsageFinishedEvent(
+                    output=result,
+                    tool_name=func_name,
+                    tool_args=args_dict,
+                    from_agent=self.agent,
+                    from_task=self.task,
+                    agent_key=agent_key,
+                    started_at=started_at,
+                    finished_at=datetime.now(),
+                ),
+            )
+
+        return {
+            "call_id": call_id,
+            "func_name": func_name,
+            "result": result,
+            "from_cache": from_cache,
+            "original_tool": original_tool,
+        }
+
    def _extract_tool_name(self, tool_call: Any) -> str:
        """Extract tool name from various tool call formats."""
        if hasattr(tool_call, "function"):
@@ -1252,7 +1357,9 @@ class AgentExecutor(Flow[AgentReActState], CrewAgentExecutorMixin):
            formatted_answer: Current agent response.
        """
        if self.step_callback:
-            self.step_callback(formatted_answer)
+            cb_result = self.step_callback(formatted_answer)
+            if inspect.iscoroutine(cb_result):
+                asyncio.run(cb_result)

    def _append_message_to_state(
        self, text: str, role: Literal["user", "assistant", "system"] = "assistant"
--- a/lib/crewai/src/crewai/flow/flow.py
+++ b/lib/crewai/src/crewai/flow/flow.py
@@ -10,13 +10,15 @@ import asyncio
 from collections.abc import (
    Callable,
    ItemsView,
+    Iterable,
    Iterator,
    KeysView,
    Sequence,
    ValuesView,
 )
-from concurrent.futures import Future
+from concurrent.futures import Future, ThreadPoolExecutor
 import copy
+import enum
 import inspect
 import logging
 import threading
@@ -27,8 +29,10 @@ from typing import (
    Generic,
    Literal,
    ParamSpec,
+    SupportsIndex,
    TypeVar,
    cast,
+    overload,
 )
 from uuid import uuid4

@@ -77,7 +81,12 @@ from crewai.flow.flow_wrappers import (
    StartMethod,
 )
 from crewai.flow.persistence.base import FlowPersistence
-from crewai.flow.types import FlowExecutionData, FlowMethodName, InputHistoryEntry, PendingListenerKey
+from crewai.flow.types import (
+    FlowExecutionData,
+    FlowMethodName,
+    InputHistoryEntry,
+    PendingListenerKey,
+)
 from crewai.flow.utils import (
    _extract_all_methods,
    _extract_all_methods_recursive,
@@ -426,8 +435,7 @@ class LockedListProxy(list, Generic[T]):  # type: ignore[type-arg]
    """

    def __init__(self, lst: list[T], lock: threading.Lock) -> None:
-        # Do NOT call super().__init__() -- we don't want to copy data into
-        # the builtin list storage. All access goes through self._list.
+        super().__init__()  # empty builtin list; all access goes through self._list
        self._list = lst
        self._lock = lock

@@ -435,11 +443,11 @@ class LockedListProxy(list, Generic[T]):  # type: ignore[type-arg]
        with self._lock:
            self._list.append(item)

-    def extend(self, items: list[T]) -> None:
+    def extend(self, items: Iterable[T]) -> None:
        with self._lock:
            self._list.extend(items)

-    def insert(self, index: int, item: T) -> None:
+    def insert(self, index: SupportsIndex, item: T) -> None:
        with self._lock:
            self._list.insert(index, item)

@@ -447,7 +455,7 @@ class LockedListProxy(list, Generic[T]):  # type: ignore[type-arg]
        with self._lock:
            self._list.remove(item)

-    def pop(self, index: int = -1) -> T:
+    def pop(self, index: SupportsIndex = -1) -> T:
        with self._lock:
            return self._list.pop(index)

@@ -455,15 +463,23 @@ class LockedListProxy(list, Generic[T]):  # type: ignore[type-arg]
        with self._lock:
            self._list.clear()

-    def __setitem__(self, index: int, value: T) -> None:
+    @overload
+    def __setitem__(self, index: SupportsIndex, value: T) -> None: ...
+    @overload
+    def __setitem__(self, index: slice, value: Iterable[T]) -> None: ...
+    def __setitem__(self, index: Any, value: Any) -> None:
        with self._lock:
            self._list[index] = value

-    def __delitem__(self, index: int) -> None:
+    def __delitem__(self, index: SupportsIndex | slice) -> None:
        with self._lock:
            del self._list[index]

-    def __getitem__(self, index: int) -> T:
+    @overload
+    def __getitem__(self, index: SupportsIndex) -> T: ...
+    @overload
+    def __getitem__(self, index: slice) -> list[T]: ...
+    def __getitem__(self, index: Any) -> Any:
        return self._list[index]

    def __len__(self) -> int:
@@ -481,7 +497,7 @@ class LockedListProxy(list, Generic[T]):  # type: ignore[type-arg]
    def __bool__(self) -> bool:
        return bool(self._list)

-    def __eq__(self, other: object) -> bool:  # type: ignore[override]
+    def __eq__(self, other: object) -> bool:
        """Compare based on the underlying list contents."""
        if isinstance(other, LockedListProxy):
            # Avoid deadlocks by acquiring locks in a consistent order.
@@ -492,7 +508,7 @@ class LockedListProxy(list, Generic[T]):  # type: ignore[type-arg]
        with self._lock:
            return self._list == other

-    def __ne__(self, other: object) -> bool:  # type: ignore[override]
+    def __ne__(self, other: object) -> bool:
        return not self.__eq__(other)


@@ -505,8 +521,7 @@ class LockedDictProxy(dict, Generic[T]):  # type: ignore[type-arg]
    """

    def __init__(self, d: dict[str, T], lock: threading.Lock) -> None:
-        # Do NOT call super().__init__() -- we don't want to copy data into
-        # the builtin dict storage. All access goes through self._dict.
+        super().__init__()  # empty builtin dict; all access goes through self._dict
        self._dict = d
        self._lock = lock

@@ -518,11 +533,11 @@ class LockedDictProxy(dict, Generic[T]):  # type: ignore[type-arg]
        with self._lock:
            del self._dict[key]

-    def pop(self, key: str, *default: T) -> T:
+    def pop(self, key: str, *default: T) -> T:  # type: ignore[override]
        with self._lock:
            return self._dict.pop(key, *default)

-    def update(self, other: dict[str, T]) -> None:
+    def update(self, other: dict[str, T]) -> None:  # type: ignore[override]
        with self._lock:
            self._dict.update(other)

@@ -530,7 +545,7 @@ class LockedDictProxy(dict, Generic[T]):  # type: ignore[type-arg]
        with self._lock:
            self._dict.clear()

-    def setdefault(self, key: str, default: T) -> T:
+    def setdefault(self, key: str, default: T) -> T:  # type: ignore[override]
        with self._lock:
            return self._dict.setdefault(key, default)

@@ -546,16 +561,16 @@ class LockedDictProxy(dict, Generic[T]):  # type: ignore[type-arg]
    def __contains__(self, key: object) -> bool:
        return key in self._dict

-    def keys(self) -> KeysView[str]:
+    def keys(self) -> KeysView[str]:  # type: ignore[override]
        return self._dict.keys()

-    def values(self) -> ValuesView[T]:
+    def values(self) -> ValuesView[T]:  # type: ignore[override]
        return self._dict.values()

-    def items(self) -> ItemsView[str, T]:
+    def items(self) -> ItemsView[str, T]:  # type: ignore[override]
        return self._dict.items()

-    def get(self, key: str, default: T | None = None) -> T | None:
+    def get(self, key: str, default: T | None = None) -> T | None:  # type: ignore[override]
        return self._dict.get(key, default)

    def __repr__(self) -> str:
@@ -564,7 +579,7 @@ class LockedDictProxy(dict, Generic[T]):  # type: ignore[type-arg]
    def __bool__(self) -> bool:
        return bool(self._dict)

-    def __eq__(self, other: object) -> bool:  # type: ignore[override]
+    def __eq__(self, other: object) -> bool:
        """Compare based on the underlying dict contents."""
        if isinstance(other, LockedDictProxy):
            # Avoid deadlocks by acquiring locks in a consistent order.
@@ -575,7 +590,7 @@ class LockedDictProxy(dict, Generic[T]):  # type: ignore[type-arg]
        with self._lock:
            return self._dict == other

-    def __ne__(self, other: object) -> bool:  # type: ignore[override]
+    def __ne__(self, other: object) -> bool:
        return not self.__eq__(other)


@@ -737,7 +752,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
    name: str | None = None
    tracing: bool | None = None
    stream: bool = False
-    memory: Any = None  # Memory | MemoryScope | MemorySlice | None; auto-created if not set
+    memory: Any = (
+        None  # Memory | MemoryScope | MemorySlice | None; auto-created if not set
+    )
    input_provider: Any = None  # InputProvider | None; per-flow override for self.ask()

    def __class_getitem__(cls: type[Flow[T]], item: type[T]) -> type[Flow[T]]:
@@ -881,7 +898,8 @@ class Flow(Generic[T], metaclass=FlowMeta):
        """
        if self.memory is None:
            raise ValueError("No memory configured for this flow")
-        return self.memory.extract_memories(content)
+        result: list[str] = self.memory.extract_memories(content)
+        return result

    def _mark_or_listener_fired(self, listener_name: FlowMethodName) -> bool:
        """Mark an OR listener as fired atomically.
@@ -1352,8 +1370,10 @@ class Flow(Generic[T], metaclass=FlowMeta):
            ValueError: If structured state model lacks 'id' field
            TypeError: If state is neither BaseModel nor dictionary
        """
+        init_state = self.initial_state
+
        # Handle case where initial_state is None but we have a type parameter
-        if self.initial_state is None and hasattr(self, "_initial_state_t"):
+        if init_state is None and hasattr(self, "_initial_state_t"):
            state_type = self._initial_state_t
            if isinstance(state_type, type):
                if issubclass(state_type, FlowState):
@@ -1377,12 +1397,12 @@ class Flow(Generic[T], metaclass=FlowMeta):
                    return cast(T, {"id": str(uuid4())})

        # Handle case where no initial state is provided
-        if self.initial_state is None:
+        if init_state is None:
            return cast(T, {"id": str(uuid4())})

        # Handle case where initial_state is a type (class)
-        if isinstance(self.initial_state, type):
-            state_class: type[T] = self.initial_state
+        if isinstance(init_state, type):
+            state_class = init_state
            if issubclass(state_class, FlowState):
                return state_class()
            if issubclass(state_class, BaseModel):
@@ -1393,19 +1413,19 @@ class Flow(Generic[T], metaclass=FlowMeta):
                if not getattr(model_instance, "id", None):
                    object.__setattr__(model_instance, "id", str(uuid4()))
                return model_instance
-            if self.initial_state is dict:
+            if init_state is dict:
                return cast(T, {"id": str(uuid4())})

        # Handle dictionary instance case
-        if isinstance(self.initial_state, dict):
-            new_state = dict(self.initial_state)  # Copy to avoid mutations
+        if isinstance(init_state, dict):
+            new_state = dict(init_state)  # Copy to avoid mutations
            if "id" not in new_state:
                new_state["id"] = str(uuid4())
            return cast(T, new_state)

        # Handle BaseModel instance case
-        if isinstance(self.initial_state, BaseModel):
-            model = cast(BaseModel, self.initial_state)
+        if isinstance(init_state, BaseModel):
+            model = cast(BaseModel, init_state)
            if not hasattr(model, "id"):
                raise ValueError("Flow state model must have an 'id' field")

@@ -1719,7 +1739,12 @@ class Flow(Generic[T], metaclass=FlowMeta):
        async def _run_flow() -> Any:
            return await self.kickoff_async(inputs, input_files)

-        return asyncio.run(_run_flow())
+        try:
+            asyncio.get_running_loop()
+        except RuntimeError:  # no running loop in this thread
+            return asyncio.run(_run_flow())
+        with ThreadPoolExecutor(max_workers=1) as pool:
+            return pool.submit(asyncio.run, _run_flow()).result()

    async def kickoff_async(
        self,
@@ -2178,6 +2203,8 @@ class Flow(Generic[T], metaclass=FlowMeta):
            from crewai.flow.async_feedback.types import HumanFeedbackPending

            if isinstance(e, HumanFeedbackPending):
+                e.context.method_name = method_name
+
                # Auto-save pending feedback (create default persistence if needed)
                if self._persistence is None:
                    from crewai.flow.persistence import SQLiteFlowPersistence
@@ -2277,14 +2304,23 @@ class Flow(Generic[T], metaclass=FlowMeta):
                    router_name, router_input, current_triggering_event_id
                )
                if router_result:  # Only add non-None results
-                    router_results.append(FlowMethodName(str(router_result)))
+                    router_result_str = (
+                        router_result.value
+                        if isinstance(router_result, enum.Enum)
+                        else str(router_result)
+                    )
+                    router_results.append(FlowMethodName(router_result_str))
                    # If this was a human_feedback router, map the outcome to the feedback
                    if self.last_human_feedback is not None:
-                        router_result_to_feedback[str(router_result)] = (
+                        router_result_to_feedback[router_result_str] = (
                            self.last_human_feedback
                        )
                current_trigger = (
-                    FlowMethodName(str(router_result))
+                    FlowMethodName(
+                        router_result.value
+                        if isinstance(router_result, enum.Enum)
+                        else str(router_result)
+                    )
                    if router_result is not None
                    else FlowMethodName("")  # Update for next iteration of router chain
                )
@@ -2701,7 +2737,10 @@ class Flow(Generic[T], metaclass=FlowMeta):
                    return topic
            ```
        """
-        from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeoutError
+        from concurrent.futures import (
+            ThreadPoolExecutor,
+            TimeoutError as FuturesTimeoutError,
+        )
        from datetime import datetime

        from crewai.events.types.flow_events import (
@@ -2770,14 +2809,16 @@ class Flow(Generic[T], metaclass=FlowMeta):
            response = None

        # Record in history
-        self._input_history.append({
-            "message": message,
-            "response": response,
-            "method_name": method_name,
-            "timestamp": datetime.now(),
-            "metadata": metadata,
-            "response_metadata": response_metadata,
-        })
+        self._input_history.append(
+            {
+                "message": message,
+                "response": response,
+                "method_name": method_name,
+                "timestamp": datetime.now(),
+                "metadata": metadata,
+                "response_metadata": response_metadata,
+            }
+        )

        # Emit input received event
        crewai_event_bus.emit(
--- a/lib/crewai/src/crewai/lite_agent.py
+++ b/lib/crewai/src/crewai/lite_agent.py
@@ -2,10 +2,10 @@ from __future__ import annotations

 import asyncio
 from collections.abc import Callable
-import time
 from functools import wraps
 import inspect
 import json
+import time
 from types import MethodType
 from typing import (
    TYPE_CHECKING,
@@ -49,15 +49,20 @@ from crewai.events.types.agent_events import (
    LiteAgentExecutionErrorEvent,
    LiteAgentExecutionStartedEvent,
 )
+from crewai.events.types.logging_events import AgentLogsExecutionEvent
 from crewai.events.types.memory_events import (
    MemoryRetrievalCompletedEvent,
    MemoryRetrievalFailedEvent,
    MemoryRetrievalStartedEvent,
 )
-from crewai.events.types.logging_events import AgentLogsExecutionEvent
 from crewai.flow.flow_trackable import FlowTrackable
 from crewai.hooks.llm_hooks import get_after_llm_call_hooks, get_before_llm_call_hooks
-from crewai.hooks.types import AfterLLMCallHookType, BeforeLLMCallHookType
+from crewai.hooks.types import (
+    AfterLLMCallHookCallable,
+    AfterLLMCallHookType,
+    BeforeLLMCallHookCallable,
+    BeforeLLMCallHookType,
+)
 from crewai.lite_agent_output import LiteAgentOutput
 from crewai.llm import LLM
 from crewai.llms.base_llm import BaseLLM
@@ -270,11 +275,11 @@ class LiteAgent(FlowTrackable, BaseModel):
    _guardrail: GuardrailCallable | None = PrivateAttr(default=None)
    _guardrail_retry_count: int = PrivateAttr(default=0)
    _callbacks: list[TokenCalcHandler] = PrivateAttr(default_factory=list)
-    _before_llm_call_hooks: list[BeforeLLMCallHookType] = PrivateAttr(
-        default_factory=get_before_llm_call_hooks
+    _before_llm_call_hooks: list[BeforeLLMCallHookType | BeforeLLMCallHookCallable] = (
+        PrivateAttr(default_factory=get_before_llm_call_hooks)
    )
-    _after_llm_call_hooks: list[AfterLLMCallHookType] = PrivateAttr(
-        default_factory=get_after_llm_call_hooks
+    _after_llm_call_hooks: list[AfterLLMCallHookType | AfterLLMCallHookCallable] = (
+        PrivateAttr(default_factory=get_after_llm_call_hooks)
    )
    _memory: Any = PrivateAttr(default=None)

@@ -440,12 +445,16 @@ class LiteAgent(FlowTrackable, BaseModel):
        return self.role

    @property
-    def before_llm_call_hooks(self) -> list[BeforeLLMCallHookType]:
+    def before_llm_call_hooks(
+        self,
+    ) -> list[BeforeLLMCallHookType | BeforeLLMCallHookCallable]:
        """Get the before_llm_call hooks for this agent."""
        return self._before_llm_call_hooks

    @property
-    def after_llm_call_hooks(self) -> list[AfterLLMCallHookType]:
+    def after_llm_call_hooks(
+        self,
+    ) -> list[AfterLLMCallHookType | AfterLLMCallHookCallable]:
        """Get the after_llm_call hooks for this agent."""
        return self._after_llm_call_hooks

@@ -482,11 +491,12 @@ class LiteAgent(FlowTrackable, BaseModel):
        # Inject memory tools once if memory is configured (mirrors Agent._prepare_kickoff)
        if self._memory is not None:
            from crewai.tools.memory_tools import create_memory_tools
-            from crewai.utilities.agent_utils import sanitize_tool_name
+            from crewai.utilities.string_utils import sanitize_tool_name

            existing_names = {sanitize_tool_name(t.name) for t in self._parsed_tools}
            memory_tools = [
-                mt for mt in create_memory_tools(self._memory)
+                mt
+                for mt in create_memory_tools(self._memory)
                if sanitize_tool_name(mt.name) not in existing_names
            ]
            if memory_tools:
@@ -565,9 +575,10 @@ class LiteAgent(FlowTrackable, BaseModel):
            if memory_block:
                formatted = self.i18n.slice("memory").format(memory=memory_block)
                if self._messages and self._messages[0].get("role") == "system":
-                    self._messages[0]["content"] = (
-                        self._messages[0].get("content", "") + "\n\n" + formatted
-                    )
+                    existing_content = self._messages[0].get("content", "")
+                    if not isinstance(existing_content, str):
+                        existing_content = ""
+                    self._messages[0]["content"] = existing_content + "\n\n" + formatted
            crewai_event_bus.emit(
                self,
                event=MemoryRetrievalCompletedEvent(
@@ -588,16 +599,12 @@ class LiteAgent(FlowTrackable, BaseModel):
            )

    def _save_to_memory(self, output_text: str) -> None:
-        """Extract discrete memories from the run and remember each. No-op if _memory is None."""
-        if self._memory is None:
+        """Extract discrete memories from the run and remember each. No-op if _memory is None or read-only."""
+        if self._memory is None or getattr(self._memory, "_read_only", False):
            return
        input_str = self._get_last_user_content() or "User request"
        try:
-            raw = (
-                f"Input: {input_str}\n"
-                f"Agent: {self.role}\n"
-                f"Result: {output_text}"
-            )
+            raw = f"Input: {input_str}\nAgent: {self.role}\nResult: {output_text}"
            extracted = self._memory.extract_memories(raw)
            if extracted:
                self._memory.remember_many(extracted, agent_role=self.role)
@@ -622,13 +629,20 @@ class LiteAgent(FlowTrackable, BaseModel):
        )

        # Execute the agent using invoke loop
-        agent_finish = self._invoke_loop()
+        active_response_format = response_format or self.response_format
+        agent_finish = self._invoke_loop(response_model=active_response_format)
        if self._memory is not None:
-            self._save_to_memory(agent_finish.output)
+            output_text = (
+                agent_finish.output.model_dump_json()
+                if isinstance(agent_finish.output, BaseModel)
+                else agent_finish.output
+            )
+            self._save_to_memory(output_text)
        formatted_result: BaseModel | None = None

-        active_response_format = response_format or self.response_format
-        if active_response_format:
+        if isinstance(agent_finish.output, BaseModel):
+            formatted_result = agent_finish.output
+        elif active_response_format:
            try:
                model_schema = generate_model_description(active_response_format)
                schema = json.dumps(model_schema, indent=2)
@@ -660,8 +674,13 @@ class LiteAgent(FlowTrackable, BaseModel):
            usage_metrics = self._token_process.get_summary()

        # Create output
+        raw_output = (
+            agent_finish.output.model_dump_json()
+            if isinstance(agent_finish.output, BaseModel)
+            else agent_finish.output
+        )
        output = LiteAgentOutput(
-            raw=agent_finish.output,
+            raw=raw_output,
            pydantic=formatted_result,
            agent_role=self.role,
            usage_metrics=usage_metrics.model_dump() if usage_metrics else None,
@@ -838,10 +857,15 @@ class LiteAgent(FlowTrackable, BaseModel):

        return formatted_messages

-    def _invoke_loop(self) -> AgentFinish:
+    def _invoke_loop(
+        self, response_model: type[BaseModel] | None = None
+    ) -> AgentFinish:
        """
        Run the agent's thought process until it reaches a conclusion or max iterations.

+        Args:
+            response_model: Optional Pydantic model for native structured output.
+
        Returns:
            AgentFinish: The final result of the agent execution.
        """
@@ -870,12 +894,19 @@ class LiteAgent(FlowTrackable, BaseModel):
                        printer=self._printer,
                        from_agent=self,
                        executor_context=self,
+                        response_model=response_model,
                        verbose=self.verbose,
                    )

                except Exception as e:
                    raise e

+                if isinstance(answer, BaseModel):
+                    formatted_answer = AgentFinish(
+                        thought="", output=answer, text=answer.model_dump_json()
+                    )
+                    break
+
                formatted_answer = process_llm_response(
                    cast(str, answer), self.use_stop_words
                )
@@ -901,7 +932,7 @@ class LiteAgent(FlowTrackable, BaseModel):
                    )

                self._append_message(formatted_answer.text, role="assistant")
-            except OutputParserError as e:  # noqa: PERF203
+            except OutputParserError as e:
                if self.verbose:
                    self._printer.print(
                        content="Failed to parse LLM output. Retrying...",
--- a/lib/crewai/src/crewai/llm.py
+++ b/lib/crewai/src/crewai/llm.py
@@ -427,7 +427,7 @@ class LLM(BaseLLM):
                f"installed.\n\n"
                f"To fix this, either:\n"
                f"  1. Install LiteLLM for broad model support: "
-                f"uv add litellm\n"
+                f"uv add 'crewai[litellm]'\n"
                f"or\n"
                f"pip install litellm\n\n"
                f"For more details, see: "
--- a/lib/crewai/src/crewai/llms/providers/bedrock/completion.py
+++ b/lib/crewai/src/crewai/llms/providers/bedrock/completion.py
@@ -234,7 +234,7 @@ class BedrockCompletion(BaseLLM):
        aws_access_key_id: str | None = None,
        aws_secret_access_key: str | None = None,
        aws_session_token: str | None = None,
-        region_name: str = "us-east-1",
+        region_name: str | None = None,
        temperature: float | None = None,
        max_tokens: int | None = None,
        top_p: float | None = None,
@@ -287,15 +287,6 @@ class BedrockCompletion(BaseLLM):
            **kwargs,
        )

-        # Initialize Bedrock client with proper configuration
-        session = Session(
-            aws_access_key_id=aws_access_key_id or os.getenv("AWS_ACCESS_KEY_ID"),
-            aws_secret_access_key=aws_secret_access_key
-            or os.getenv("AWS_SECRET_ACCESS_KEY"),
-            aws_session_token=aws_session_token or os.getenv("AWS_SESSION_TOKEN"),
-            region_name=region_name,
-        )
-
        # Configure client with timeouts and retries following AWS best practices
        config = Config(
            read_timeout=300,
@@ -306,8 +297,12 @@ class BedrockCompletion(BaseLLM):
            tcp_keepalive=True,
        )

-        self.client = session.client("bedrock-runtime", config=config)
-        self.region_name = region_name
+        self.region_name = (
+            region_name
+            or os.getenv("AWS_DEFAULT_REGION")
+            or os.getenv("AWS_REGION_NAME")
+            or "us-east-1"
+        )

        self.aws_access_key_id = aws_access_key_id or os.getenv("AWS_ACCESS_KEY_ID")
        self.aws_secret_access_key = aws_secret_access_key or os.getenv(
@@ -315,6 +310,16 @@ class BedrockCompletion(BaseLLM):
        )
        self.aws_session_token = aws_session_token or os.getenv("AWS_SESSION_TOKEN")

+        # Initialize Bedrock client with proper configuration
+        session = Session(
+            aws_access_key_id=self.aws_access_key_id,
+            aws_secret_access_key=self.aws_secret_access_key,
+            aws_session_token=self.aws_session_token,
+            region_name=self.region_name,
+        )
+
+        self.client = session.client("bedrock-runtime", config=config)
+
        self._async_exit_stack = AsyncExitStack() if AIOBOTOCORE_AVAILABLE else None
        self._async_client_initialized = False

--- a/lib/crewai/src/crewai/llms/providers/gemini/completion.py
+++ b/lib/crewai/src/crewai/llms/providers/gemini/completion.py
@@ -894,7 +894,7 @@ class GeminiCompletion(BaseLLM):
        content = self._extract_text_from_response(response)

        effective_response_model = None if self.tools else response_model
-        if not effective_response_model:
+        if not response_model:
            content = self._apply_stop_words(content)

        return self._finalize_completion_response(
--- a/lib/crewai/src/crewai/mcp/__init__.py
+++ b/lib/crewai/src/crewai/mcp/__init__.py
@@ -18,6 +18,7 @@ from crewai.mcp.filters import (
    create_dynamic_tool_filter,
    create_static_tool_filter,
 )
+from crewai.mcp.tool_resolver import MCPToolResolver
 from crewai.mcp.transports.base import BaseTransport, TransportType


@@ -28,6 +29,7 @@ __all__ = [
    "MCPServerHTTP",
    "MCPServerSSE",
    "MCPServerStdio",
+    "MCPToolResolver",
    "StaticToolFilter",
    "ToolFilter",
    "ToolFilterContext",
--- a/lib/crewai/src/crewai/mcp/client.py
+++ b/lib/crewai/src/crewai/mcp/client.py
@@ -6,7 +6,7 @@ from contextlib import AsyncExitStack
 from datetime import datetime
 import logging
 import time
-from typing import Any
+from typing import Any, NamedTuple

 from typing_extensions import Self

@@ -34,6 +34,13 @@ from crewai.mcp.transports.stdio import StdioTransport
 from crewai.utilities.string_utils import sanitize_tool_name


+class _MCPToolResult(NamedTuple):
+    """Internal result from an MCP tool call, carrying the ``isError`` flag."""
+
+    content: str
+    is_error: bool
+
+
 # MCP Connection timeout constants (in seconds)
 MCP_CONNECTION_TIMEOUT = 30  # Increased for slow servers
 MCP_TOOL_EXECUTION_TIMEOUT = 30
@@ -420,6 +427,7 @@ class MCPClient:
        return [
            {
                "name": sanitize_tool_name(tool.name),
+                "original_name": tool.name,
                "description": getattr(tool, "description", ""),
                "inputSchema": getattr(tool, "inputSchema", {}),
            }
@@ -461,29 +469,46 @@ class MCPClient:
        )

        try:
-            result = await self._retry_operation(
+            tool_result: _MCPToolResult = await self._retry_operation(
                lambda: self._call_tool_impl(tool_name, cleaned_arguments),
                timeout=self.execution_timeout,
            )

-            completed_at = datetime.now()
-            execution_duration_ms = (completed_at - started_at).total_seconds() * 1000
-            crewai_event_bus.emit(
-                self,
-                MCPToolExecutionCompletedEvent(
-                    server_name=server_name,
-                    server_url=server_url,
-                    transport_type=transport_type,
-                    tool_name=tool_name,
-                    tool_args=cleaned_arguments,
-                    result=result,
-                    started_at=started_at,
-                    completed_at=completed_at,
-                    execution_duration_ms=execution_duration_ms,
-                ),
-            )
+            finished_at = datetime.now()
+            execution_duration_ms = (finished_at - started_at).total_seconds() * 1000

-            return result
+            if tool_result.is_error:
+                crewai_event_bus.emit(
+                    self,
+                    MCPToolExecutionFailedEvent(
+                        server_name=server_name,
+                        server_url=server_url,
+                        transport_type=transport_type,
+                        tool_name=tool_name,
+                        tool_args=cleaned_arguments,
+                        error=tool_result.content,
+                        error_type="tool_error",
+                        started_at=started_at,
+                        failed_at=finished_at,
+                    ),
+                )
+            else:
+                crewai_event_bus.emit(
+                    self,
+                    MCPToolExecutionCompletedEvent(
+                        server_name=server_name,
+                        server_url=server_url,
+                        transport_type=transport_type,
+                        tool_name=tool_name,
+                        tool_args=cleaned_arguments,
+                        result=tool_result.content,
+                        started_at=started_at,
+                        completed_at=finished_at,
+                        execution_duration_ms=execution_duration_ms,
+                    ),
+                )
+
+            return tool_result.content
        except Exception as e:
            failed_at = datetime.now()
            error_type = (
@@ -564,23 +589,27 @@ class MCPClient:

        return cleaned

-    async def _call_tool_impl(self, tool_name: str, arguments: dict[str, Any]) -> Any:
+    async def _call_tool_impl(
+        self, tool_name: str, arguments: dict[str, Any]
+    ) -> _MCPToolResult:
        """Internal implementation of call_tool."""
        result = await asyncio.wait_for(
            self.session.call_tool(tool_name, arguments),
            timeout=self.execution_timeout,
        )

+        is_error = getattr(result, "isError", False) or False
+
        # Extract result content
        if hasattr(result, "content") and result.content:
            if isinstance(result.content, list) and len(result.content) > 0:
                content_item = result.content[0]
                if hasattr(content_item, "text"):
-                    return str(content_item.text)
-                return str(content_item)
-            return str(result.content)
+                    return _MCPToolResult(str(content_item.text), is_error)
+                return _MCPToolResult(str(content_item), is_error)
+            return _MCPToolResult(str(result.content), is_error)

-        return str(result)
+        return _MCPToolResult(str(result), is_error)

    async def list_prompts(self) -> list[dict[str, Any]]:
        """List available prompts from MCP server.
--- a/lib/crewai/src/crewai/mcp/tool_resolver.py
+++ b/lib/crewai/src/crewai/mcp/tool_resolver.py
@@ -0,0 +1,592 @@
+"""MCP tool resolution for CrewAI agents.
+
+This module extracts all MCP-related tool resolution logic from the Agent class
+into a standalone MCPToolResolver. It handles three flavours of MCP reference:
+
+  1. Native configs:   MCPServerStdio / MCPServerHTTP / MCPServerSSE objects.
+  2. HTTPS URLs:       e.g. "https://mcp.example.com/api"
+  3. AMP references:   e.g. "notion" or "notion#search" (legacy "crewai-amp:" prefix also works)
+"""
+
+from __future__ import annotations
+
+import asyncio
+import time
+from typing import TYPE_CHECKING, Any, Final, cast
+from urllib.parse import urlparse
+
+from crewai.mcp.client import MCPClient
+from crewai.mcp.config import (
+    MCPServerConfig,
+    MCPServerHTTP,
+    MCPServerSSE,
+    MCPServerStdio,
+)
+from crewai.mcp.transports.http import HTTPTransport
+from crewai.mcp.transports.sse import SSETransport
+from crewai.mcp.transports.stdio import StdioTransport
+
+
+if TYPE_CHECKING:
+    from crewai.tools.base_tool import BaseTool
+    from crewai.utilities.logger import Logger
+
+MCP_CONNECTION_TIMEOUT: Final[int] = 10
+MCP_TOOL_EXECUTION_TIMEOUT: Final[int] = 30
+MCP_DISCOVERY_TIMEOUT: Final[int] = 15
+MCP_MAX_RETRIES: Final[int] = 3
+
+_mcp_schema_cache: dict[str, Any] = {}
+_cache_ttl: Final[int] = 300  # 5 minutes
+
+
+class MCPToolResolver:
+    """Resolves MCP server references / configs into CrewAI ``BaseTool`` instances.
+
+    Typical lifecycle::
+
+        resolver = MCPToolResolver(agent=my_agent, logger=my_agent._logger)
+        tools = resolver.resolve(my_agent.mcps)
+        # … agent executes tasks using *tools* …
+        resolver.cleanup()
+
+    The resolver owns the MCP client connections it creates and is responsible
+    for tearing them down via :meth:`cleanup`.
+    """
+
+    def __init__(self, agent: Any, logger: Logger) -> None:
+        self._agent = agent
+        self._logger = logger
+        self._clients: list[Any] = []
+
+    @property
+    def clients(self) -> list[Any]:
+        return list(self._clients)
+
+    def resolve(self, mcps: list[str | MCPServerConfig]) -> list[BaseTool]:
+        """Convert MCP server references/configs to CrewAI tools."""
+        all_tools: list[BaseTool] = []
+        amp_refs: list[tuple[str, str | None]] = []
+
+        for mcp_config in mcps:
+            if isinstance(mcp_config, str) and mcp_config.startswith("https://"):
+                all_tools.extend(self._resolve_external(mcp_config))
+            elif isinstance(mcp_config, str):
+                amp_refs.append(self._parse_amp_ref(mcp_config))
+            else:
+                tools, client = self._resolve_native(mcp_config)
+                all_tools.extend(tools)
+                if client:
+                    self._clients.append(client)
+
+        if amp_refs:
+            tools, clients = self._resolve_amp(amp_refs)
+            all_tools.extend(tools)
+            self._clients.extend(clients)
+
+        return all_tools
+
+    def cleanup(self) -> None:
+        """Disconnect all MCP client connections."""
+        if not self._clients:
+            return
+
+        async def _disconnect_all() -> None:
+            for client in self._clients:
+                if client and hasattr(client, "connected") and client.connected:
+                    await client.disconnect()
+
+        try:
+            asyncio.run(_disconnect_all())
+        except Exception as e:
+            self._logger.log("error", f"Error during MCP client cleanup: {e}")
+        finally:
+            self._clients.clear()
+
+    @staticmethod
+    def _parse_amp_ref(mcp_config: str) -> tuple[str, str | None]:
+        """Parse an AMP reference into *(slug, optional tool name)*.
+
+        Accepts both bare slugs (``"notion"``, ``"notion#search"``) and the
+        legacy ``"crewai-amp:notion"`` form.
+        """
+        bare = mcp_config.removeprefix("crewai-amp:")
+        slug, _, specific_tool = bare.partition("#")
+        return slug, specific_tool or None
+
+    def _resolve_amp(
+        self, amp_refs: list[tuple[str, str | None]]
+    ) -> tuple[list[BaseTool], list[Any]]:
+        """Fetch AMP configs in bulk and return their tools and clients.
+
+        Resolves each unique slug only once (single connection per server),
+        then applies per-ref tool filters to select specific tools.
+        """
+        from crewai.events.event_bus import crewai_event_bus
+        from crewai.events.types.mcp_events import MCPConfigFetchFailedEvent
+
+        unique_slugs = list(dict.fromkeys(slug for slug, _ in amp_refs))
+        amp_configs_map = self._fetch_amp_mcp_configs(unique_slugs)
+
+        all_tools: list[BaseTool] = []
+        all_clients: list[Any] = []
+
+        resolved_cache: dict[str, tuple[list[BaseTool], Any | None]] = {}
+
+        for slug in unique_slugs:
+            config_dict = amp_configs_map.get(slug)
+            if not config_dict:
+                crewai_event_bus.emit(
+                    self,
+                    MCPConfigFetchFailedEvent(
+                        slug=slug,
+                        error=f"Config for '{slug}' not found. Make sure it is connected in your account.",
+                        error_type="not_connected",
+                    ),
+                )
+                continue
+
+            mcp_server_config = self._build_mcp_config_from_dict(config_dict)
+
+            try:
+                tools, client = self._resolve_native(mcp_server_config)
+                resolved_cache[slug] = (tools, client)
+                if client:
+                    all_clients.append(client)
+            except Exception as e:
+                crewai_event_bus.emit(
+                    self,
+                    MCPConfigFetchFailedEvent(
+                        slug=slug,
+                        error=str(e),
+                        error_type="connection_failed",
+                    ),
+                )
+
+        for slug, specific_tool in amp_refs:
+            cached = resolved_cache.get(slug)
+            if not cached:
+                continue
+
+            slug_tools, _ = cached
+            if specific_tool:
+                all_tools.extend(
+                    t for t in slug_tools if t.name.endswith(f"_{specific_tool}")
+                )
+            else:
+                all_tools.extend(slug_tools)
+
+        return all_tools, all_clients
+
+    def _fetch_amp_mcp_configs(self, slugs: list[str]) -> dict[str, dict[str, Any]]:
+        """Fetch MCP server configurations via CrewAI+ API.
+
+        Sends a GET request to the CrewAI+ mcps/configs endpoint with
+        comma-separated slugs. CrewAI+ proxies the request to crewai-oauth.
+
+        API-level failures return ``{}``; individual slugs will then
+        surface as ``MCPConfigFetchFailedEvent`` in :meth:`_resolve_amp`.
+        """
+        import httpx
+
+        try:
+            from crewai_tools.tools.crewai_platform_tools.misc import (
+                get_platform_integration_token,
+            )
+
+            from crewai.cli.plus_api import PlusAPI
+
+            plus_api = PlusAPI(api_key=get_platform_integration_token())
+            response = plus_api.get_mcp_configs(slugs)
+
+            if response.status_code == 200:
+                configs: dict[str, dict[str, Any]] = response.json().get("configs", {})
+                return configs
+
+            self._logger.log(
+                "debug",
+                f"Failed to fetch MCP configs: HTTP {response.status_code}",
+            )
+            return {}
+
+        except httpx.HTTPError as e:
+            self._logger.log("debug", f"Failed to fetch MCP configs: {e}")
+            return {}
+        except Exception as e:
+            self._logger.log("debug", f"Cannot fetch AMP MCP configs: {e}")
+            return {}
+
+    def _resolve_external(self, mcp_ref: str) -> list[BaseTool]:
+        """Resolve an HTTPS MCP server URL into tools."""
+        from crewai.tools.mcp_tool_wrapper import MCPToolWrapper
+
+        if "#" in mcp_ref:
+            server_url, specific_tool = mcp_ref.split("#", 1)
+        else:
+            server_url, specific_tool = mcp_ref, None
+
+        server_params = {"url": server_url}
+        server_name = self._extract_server_name(server_url)
+
+        try:
+            tool_schemas = self._get_mcp_tool_schemas(server_params)
+
+            if not tool_schemas:
+                self._logger.log(
+                    "warning", f"No tools discovered from MCP server: {server_url}"
+                )
+                return []
+
+            tools = []
+            for tool_name, schema in tool_schemas.items():
+                if specific_tool and tool_name != specific_tool:
+                    continue
+
+                try:
+                    wrapper = MCPToolWrapper(
+                        mcp_server_params=server_params,
+                        tool_name=tool_name,
+                        tool_schema=schema,
+                        server_name=server_name,
+                    )
+                    tools.append(wrapper)
+                except Exception as e:
+                    self._logger.log(
+                        "warning",
+                        f"Failed to create MCP tool wrapper for {tool_name}: {e}",
+                    )
+                    continue
+
+            if specific_tool and not tools:
+                self._logger.log(
+                    "warning",
+                    f"Specific tool '{specific_tool}' not found on MCP server: {server_url}",
+                )
+
+            return cast(list[BaseTool], tools)
+
+        except Exception as e:
+            self._logger.log(
+                "warning", f"Failed to connect to MCP server {server_url}: {e}"
+            )
+            return []
+
+    def _resolve_native(
+        self, mcp_config: MCPServerConfig
+    ) -> tuple[list[BaseTool], Any | None]:
+        """Resolve an ``MCPServerConfig`` into tools, returning the client for cleanup."""
+        from crewai.tools.base_tool import BaseTool
+        from crewai.tools.mcp_native_tool import MCPNativeTool
+
+        transport: StdioTransport | HTTPTransport | SSETransport
+        if isinstance(mcp_config, MCPServerStdio):
+            transport = StdioTransport(
+                command=mcp_config.command,
+                args=mcp_config.args,
+                env=mcp_config.env,
+            )
+            server_name = f"{mcp_config.command}_{'_'.join(mcp_config.args)}"
+        elif isinstance(mcp_config, MCPServerHTTP):
+            transport = HTTPTransport(
+                url=mcp_config.url,
+                headers=mcp_config.headers,
+                streamable=mcp_config.streamable,
+            )
+            server_name = self._extract_server_name(mcp_config.url)
+        elif isinstance(mcp_config, MCPServerSSE):
+            transport = SSETransport(
+                url=mcp_config.url,
+                headers=mcp_config.headers,
+            )
+            server_name = self._extract_server_name(mcp_config.url)
+        else:
+            raise ValueError(f"Unsupported MCP server config type: {type(mcp_config)}")
+
+        client = MCPClient(
+            transport=transport,
+            cache_tools_list=mcp_config.cache_tools_list,
+        )
+
+        async def _setup_client_and_list_tools() -> list[dict[str, Any]]:
+            try:
+                if not client.connected:
+                    await client.connect()
+
+                tools_list = await client.list_tools()
+
+                try:
+                    await client.disconnect()
+                    await asyncio.sleep(0.1)
+                except Exception as e:
+                    self._logger.log("error", f"Error during disconnect: {e}")
+
+                return tools_list
+            except Exception as e:
+                if client.connected:
+                    await client.disconnect()
+                    await asyncio.sleep(0.1)
+                raise RuntimeError(
+                    f"Error during setup client and list tools: {e}"
+                ) from e
+
+        try:
+            try:
+                asyncio.get_running_loop()
+                import concurrent.futures
+
+                with concurrent.futures.ThreadPoolExecutor() as executor:
+                    future = executor.submit(
+                        asyncio.run, _setup_client_and_list_tools()
+                    )
+                    tools_list = future.result()
+            except RuntimeError:
+                try:
+                    tools_list = asyncio.run(_setup_client_and_list_tools())
+                except RuntimeError as e:
+                    if not any(s in str(e).lower() for s in ("cancel scope", "task")):
+                        raise  # unrelated RuntimeError: propagate instead of swallowing
+                    raise ConnectionError(
+                        "MCP connection failed due to event loop cleanup issues. "
+                        "This may be due to authentication errors or server unavailability."
+                    ) from e
+                except asyncio.CancelledError as e:
+                    raise ConnectionError(
+                        "MCP connection was cancelled. This may indicate an authentication "
+                        "error or server unavailability."
+                    ) from e
+
+            if mcp_config.tool_filter:
+                filtered_tools = []
+                for tool in tools_list:
+                    if callable(mcp_config.tool_filter):
+                        try:
+                            from crewai.mcp.filters import ToolFilterContext
+
+                            context = ToolFilterContext(
+                                agent=self._agent,
+                                server_name=server_name,
+                                run_context=None,
+                            )
+                            if mcp_config.tool_filter(context, tool):  # type: ignore[call-arg, arg-type]
+                                filtered_tools.append(tool)
+                        except (TypeError, AttributeError):
+                            if mcp_config.tool_filter(tool):  # type: ignore[call-arg, arg-type]
+                                filtered_tools.append(tool)
+                    else:
+                        filtered_tools.append(tool)
+                tools_list = filtered_tools
+
+            tools = []
+            for tool_def in tools_list:
+                tool_name = tool_def.get("name", "")
+                original_tool_name = tool_def.get("original_name", tool_name)
+                if not tool_name:
+                    continue
+
+                args_schema = None
+                if tool_def.get("inputSchema"):
+                    args_schema = self._json_schema_to_pydantic(
+                        tool_name, tool_def["inputSchema"]
+                    )
+
+                tool_schema = {
+                    "description": tool_def.get("description", ""),
+                    "args_schema": args_schema,
+                }
+
+                try:
+                    native_tool = MCPNativeTool(
+                        mcp_client=client,
+                        tool_name=tool_name,
+                        tool_schema=tool_schema,
+                        server_name=server_name,
+                        original_tool_name=original_tool_name,
+                    )
+                    tools.append(native_tool)
+                except Exception as e:
+                    self._logger.log("error", f"Failed to create native MCP tool: {e}")
+                    continue
+
+            return cast(list[BaseTool], tools), client
+        except Exception as e:
+            if client.connected:
+                asyncio.run(client.disconnect())
+
+            raise RuntimeError(f"Failed to get native MCP tools: {e}") from e
+
+    @staticmethod
+    def _build_mcp_config_from_dict(
+        config_dict: dict[str, Any],
+    ) -> MCPServerConfig:
+        """Convert a config dict from crewai-oauth into an MCPServerConfig."""
+        config_type = config_dict.get("type", "http")
+
+        if config_type == "sse":
+            return MCPServerSSE(
+                url=config_dict["url"],
+                headers=config_dict.get("headers"),
+                cache_tools_list=config_dict.get("cache_tools_list", False),
+            )
+
+        return MCPServerHTTP(
+            url=config_dict["url"],
+            headers=config_dict.get("headers"),
+            streamable=config_dict.get("streamable", True),
+            cache_tools_list=config_dict.get("cache_tools_list", False),
+        )
+
+    @staticmethod
+    def _extract_server_name(server_url: str) -> str:
+        """Extract clean server name from URL for tool prefixing."""
+        parsed = urlparse(server_url)
+        domain = parsed.netloc.replace(".", "_")
+        path = parsed.path.replace("/", "_").strip("_")
+        return f"{domain}_{path}" if path else domain
+
+    def _get_mcp_tool_schemas(
+        self, server_params: dict[str, Any]
+    ) -> dict[str, dict[str, Any]]:
+        """Get tool schemas from MCP server with caching."""
+        server_url = server_params["url"]
+
+        cache_key = server_url
+        current_time = time.time()
+
+        if cache_key in _mcp_schema_cache:
+            cached_data, cache_time = _mcp_schema_cache[cache_key]
+            if current_time - cache_time < _cache_ttl:
+                self._logger.log(
+                    "debug", f"Using cached MCP tool schemas for {server_url}"
+                )
+                return cached_data  # type: ignore[no-any-return]
+
+        try:
+            schemas = asyncio.run(self._get_mcp_tool_schemas_async(server_params))
+            _mcp_schema_cache[cache_key] = (schemas, current_time)
+            return schemas
+        except Exception as e:
+            self._logger.log(
+                "warning", f"Failed to get MCP tool schemas from {server_url}: {e}"
+            )
+            return {}
+
+    async def _get_mcp_tool_schemas_async(
+        self, server_params: dict[str, Any]
+    ) -> dict[str, dict[str, Any]]:
+        """Async implementation of MCP tool schema retrieval."""
+        server_url = server_params["url"]
+        return await self._retry_mcp_discovery(
+            self._discover_mcp_tools_with_timeout, server_url
+        )
+
+    async def _retry_mcp_discovery(
+        self, operation_func: Any, server_url: str
+    ) -> dict[str, dict[str, Any]]:
+        """Retry MCP discovery with exponential backoff."""
+        last_error = None
+
+        for attempt in range(MCP_MAX_RETRIES):
+            result, error, should_retry = await self._attempt_mcp_discovery(
+                operation_func, server_url
+            )
+
+            if result is not None:
+                return result
+
+            if not should_retry:
+                raise RuntimeError(error)
+
+            last_error = error
+            if attempt < MCP_MAX_RETRIES - 1:
+                wait_time = 2**attempt
+                await asyncio.sleep(wait_time)
+
+        raise RuntimeError(
+            f"Failed to discover MCP tools after {MCP_MAX_RETRIES} attempts: {last_error}"
+        )
+
+    @staticmethod
+    async def _attempt_mcp_discovery(
+        operation_func: Any, server_url: str
+    ) -> tuple[dict[str, dict[str, Any]] | None, str, bool]:
+        """Attempt single MCP discovery; returns *(result, error_message, should_retry)*."""
+        try:
+            result = await operation_func(server_url)
+            return result, "", False
+
+        except ImportError:
+            return (
+                None,
+                "MCP library not available. Please install with: pip install mcp",
+                False,
+            )
+
+        except asyncio.TimeoutError:
+            return (
+                None,
+                f"MCP discovery timed out after {MCP_DISCOVERY_TIMEOUT} seconds",
+                True,
+            )
+
+        except Exception as e:
+            error_str = str(e).lower()
+
+            if "authentication" in error_str or "unauthorized" in error_str:
+                return None, f"Authentication failed for MCP server: {e!s}", False
+            if "connection" in error_str or "network" in error_str:
+                return None, f"Network connection failed: {e!s}", True
+            if "json" in error_str or "parsing" in error_str:
+                return None, f"Server response parsing error: {e!s}", True
+            return None, f"MCP discovery error: {e!s}", False
+
+    async def _discover_mcp_tools_with_timeout(
+        self, server_url: str
+    ) -> dict[str, dict[str, Any]]:
+        """Discover MCP tools with timeout wrapper."""
+        return await asyncio.wait_for(
+            self._discover_mcp_tools(server_url), timeout=MCP_DISCOVERY_TIMEOUT
+        )
+
+    async def _discover_mcp_tools(self, server_url: str) -> dict[str, dict[str, Any]]:
+        """Discover tools from an MCP server (HTTPS / streamable-HTTP path)."""
+        from mcp import ClientSession
+        from mcp.client.streamable_http import streamablehttp_client
+
+        from crewai.utilities.string_utils import sanitize_tool_name
+
+        async with streamablehttp_client(server_url) as (read, write, _):
+            async with ClientSession(read, write) as session:
+                await asyncio.wait_for(
+                    session.initialize(), timeout=MCP_CONNECTION_TIMEOUT
+                )
+
+                tools_result = await asyncio.wait_for(
+                    session.list_tools(),
+                    timeout=MCP_DISCOVERY_TIMEOUT - MCP_CONNECTION_TIMEOUT,
+                )
+
+                schemas = {}
+                for tool in tools_result.tools:
+                    args_schema = None
+                    if hasattr(tool, "inputSchema") and tool.inputSchema:
+                        args_schema = self._json_schema_to_pydantic(
+                            sanitize_tool_name(tool.name), tool.inputSchema
+                        )
+
+                    schemas[sanitize_tool_name(tool.name)] = {
+                        "description": getattr(tool, "description", ""),
+                        "args_schema": args_schema,
+                    }
+                return schemas
+
+    @staticmethod
+    def _json_schema_to_pydantic(tool_name: str, json_schema: dict[str, Any]) -> type:
+        """Convert JSON Schema to a Pydantic model for tool arguments."""
+        from crewai.utilities.pydantic_schema_utils import create_model_from_schema
+
+        model_name = f"{tool_name.replace('-', '_').replace(' ', '_')}Schema"
+        return create_model_from_schema(
+            json_schema,
+            model_name=model_name,
+            enrich_descriptions=True,
+        )
--- a/lib/crewai/src/crewai/memory/__init__.py
+++ b/lib/crewai/src/crewai/memory/__init__.py
@@ -1,6 +1,14 @@
-"""Memory module: unified Memory with LLM analysis and pluggable storage."""
+"""Memory module: unified Memory with LLM analysis and pluggable storage.
+
+Heavy dependencies (such as lancedb) are imported lazily so that
+``import crewai`` does not initialise them at import time. This is
+critical for Celery pre-fork and similar deployment patterns.
+"""
+
+from __future__ import annotations
+
+from typing import Any

-from crewai.memory.encoding_flow import EncodingFlow
 from crewai.memory.memory_scope import MemoryScope, MemorySlice
 from crewai.memory.types import (
    MemoryMatch,
@@ -10,7 +18,24 @@ from crewai.memory.types import (
    embed_text,
    embed_texts,
 )
-from crewai.memory.unified_memory import Memory
+
+_LAZY_IMPORTS: dict[str, tuple[str, str]] = {
+    "Memory": ("crewai.memory.unified_memory", "Memory"),
+    "EncodingFlow": ("crewai.memory.encoding_flow", "EncodingFlow"),
+}
+
+
+def __getattr__(name: str) -> Any:
+    """Lazily import Memory / EncodingFlow to avoid pulling in lancedb at import time."""
+    if name in _LAZY_IMPORTS:
+        import importlib
+
+        module_path, attr = _LAZY_IMPORTS[name]
+        mod = importlib.import_module(module_path)
+        val = getattr(mod, attr)
+        globals()[name] = val
+        return val
+    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")


 __all__ = [
--- a/lib/crewai/src/crewai/memory/memory_scope.py
+++ b/lib/crewai/src/crewai/memory/memory_scope.py
@@ -145,7 +145,7 @@ class MemoryScope:


 class MemorySlice:
-    """View over multiple scopes: recall searches all, remember requires explicit scope unless read_only."""
+    """View over multiple scopes: recall searches all, remember is a no-op when read_only."""

    def __init__(
        self,
@@ -160,7 +160,7 @@ class MemorySlice:
            memory: The underlying Memory instance.
            scopes: List of scope paths to include.
            categories: Optional category filter for recall.
-            read_only: If True, remember() raises PermissionError.
+            read_only: If True, remember() is a silent no-op.
        """
        self._memory = memory
        self._scopes = [s.rstrip("/") or "/" for s in scopes]
@@ -176,10 +176,10 @@ class MemorySlice:
        importance: float | None = None,
        source: str | None = None,
        private: bool = False,
-    ) -> MemoryRecord:
-        """Remember into an explicit scope. Required when read_only=False."""
+    ) -> MemoryRecord | None:
+        """Remember into an explicit scope. No-op when read_only=True."""
        if self._read_only:
-            raise PermissionError("This MemorySlice is read-only")
+            return None
        return self._memory.remember(
            content,
            scope=scope,
--- a/lib/crewai/src/crewai/memory/storage/lancedb_storage.py
+++ b/lib/crewai/src/crewai/memory/storage/lancedb_storage.py
@@ -53,6 +53,7 @@ class LanceDBStorage:
        path: str | Path | None = None,
        table_name: str = "memories",
        vector_dim: int | None = None,
+        compact_every: int = 100,
    ) -> None:
        """Initialize LanceDB storage.

@@ -64,6 +65,10 @@ class LanceDBStorage:
            vector_dim: Dimensionality of the embedding vector. When ``None``
                  (default), the dimension is auto-detected from the existing
                  table schema or from the first saved embedding.
+            compact_every: Number of ``save()`` calls between automatic
+                  background compactions.  Each ``save()`` creates one new
+                  fragment file; compaction merges them, keeping query
+                  performance consistent.  Set to 0 to disable.
        """
        if path is None:
            storage_dir = os.environ.get("CREWAI_STORAGE_DIR")
@@ -78,6 +83,22 @@ class LanceDBStorage:
        self._table_name = table_name
        self._db = lancedb.connect(str(self._path))

+        # The default per-process soft open-file limit is low (256 on macOS,
+        # typically 1024 on Linux).  A LanceDB table stores one file per fragment
+        # (one per save() call by default), and a full-table scan over hundreds of
+        # fragments opens them all at once, exhausting the limit.  Raise the soft
+        # limit proactively so scans on large tables do not hit OS error 24 (EMFILE).
+        try:
+            import resource
+            soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
+            if soft < 4096:
+                resource.setrlimit(resource.RLIMIT_NOFILE, (min(hard, 4096), hard))
+        except Exception:  # noqa: S110
+            pass  # Windows or already at the max hard limit — safe to ignore
+
+        self._compact_every = compact_every
+        self._save_count = 0
+
        # Get or create a shared write lock for this database path.
        resolved = str(self._path.resolve())
        with LanceDBStorage._path_locks_guard:
@@ -91,6 +112,11 @@ class LanceDBStorage:
        try:
            self._table: lancedb.table.Table | None = self._db.open_table(self._table_name)
            self._vector_dim: int = self._infer_dim_from_table(self._table)
+            # Best-effort: create the scope index if it doesn't exist yet.
+            self._ensure_scope_index()
+            # Compact in the background if the table has accumulated many
+            # fragments from previous runs (each save() creates one).
+            self._compact_if_needed()
        except Exception:
            self._table = None
            self._vector_dim = vector_dim or 0  # 0 = not yet known
@@ -178,6 +204,56 @@ class LanceDBStorage:
        table.delete("id = '__schema_placeholder__'")
        return table

+    def _ensure_scope_index(self) -> None:
+        """Create a BTREE scalar index on the ``scope`` column if not present.
+
+        A scalar index lets LanceDB skip a full table scan when filtering by
+        scope prefix, which is the hot path for ``list_records``,
+        ``get_scope_info``, and ``list_scopes``.  The call is best-effort:
+        if the table is empty or the index already exists the exception is
+        swallowed silently.
+        """
+        if self._table is None:
+            return
+        try:
+            self._table.create_scalar_index("scope", index_type="BTREE", replace=False)
+        except Exception:  # noqa: S110
+            pass  # index already exists, table empty, or unsupported version
+
+    # ------------------------------------------------------------------
+    # Automatic background compaction
+    # ------------------------------------------------------------------
+
+    def _compact_if_needed(self) -> None:
+        """Spawn a background compaction on startup.
+
+        Called whenever an existing table is opened so that fragments from
+        previous sessions are merged in a daemon thread (queries issued before
+        it finishes still see the uncompacted table).  ``optimize()`` returns
+        quickly when the table is already compact, so the cost is negligible.
+        """
+        if self._table is None or self._compact_every <= 0:
+            return
+        self._compact_async()
+
+    def _compact_async(self) -> None:
+        """Fire-and-forget: compact the table in a daemon background thread."""
+        threading.Thread(
+            target=self._compact_safe,
+            daemon=True,
+            name="lancedb-compact",
+        ).start()
+
+    def _compact_safe(self) -> None:
+        """Run ``table.optimize()`` in a background thread, absorbing errors."""
+        try:
+            if self._table is not None:
+                self._table.optimize()
+                # Best-effort: make sure the scope index exists after compaction.
+                self._ensure_scope_index()
+        except Exception:
+            _logger.debug("LanceDB background compaction failed", exc_info=True)
+
    def _ensure_table(self, vector_dim: int | None = None) -> lancedb.table.Table:
        """Return the table, creating it lazily if needed.

@@ -239,6 +315,7 @@ class LanceDBStorage:
            if r.embedding and len(r.embedding) > 0:
                dim = len(r.embedding)
                break
+        is_new_table = self._table is None
        with self._write_lock:
            self._ensure_table(vector_dim=dim)
            rows = [self._record_to_row(r) for r in records]
@@ -246,6 +323,13 @@ class LanceDBStorage:
                if r["vector"] is None or len(r["vector"]) != self._vector_dim:
                    r["vector"] = [0.0] * self._vector_dim
            self._retry_write("add", rows)
+        # Create the scope index on the first save so it covers the initial dataset.
+        if is_new_table:
+            self._ensure_scope_index()
+        # Auto-compact every N saves so fragment files don't pile up.
+        self._save_count += 1
+        if self._compact_every > 0 and self._save_count % self._compact_every == 0:
+            self._compact_async()

    def update(self, record: MemoryRecord) -> None:
        """Update a record by ID. Preserves created_at, updates last_accessed."""
@@ -261,6 +345,10 @@ class LanceDBStorage:
    def touch_records(self, record_ids: list[str]) -> None:
        """Update last_accessed to now for the given record IDs.

+        Uses a single batch ``table.update()`` call instead of N
+        delete-and-re-add cycles, which is both faster and avoids
+        unnecessary write amplification.
+
        Args:
            record_ids: IDs of records to touch.
        """
@@ -268,25 +356,20 @@ class LanceDBStorage:
            return
        with self._write_lock:
            now = datetime.utcnow().isoformat()
-            for rid in record_ids:
-                safe_id = str(rid).replace("'", "''")
-                rows = (
-                    self._table.search([0.0] * self._vector_dim)
-                    .where(f"id = '{safe_id}'")
-                    .limit(1)
-                    .to_list()
-                )
-                if rows:
-                    rows[0]["last_accessed"] = now
-                    self._retry_write("delete", f"id = '{safe_id}'")
-                    self._retry_write("add", [rows[0]])
+            safe_ids = [str(rid).replace("'", "''") for rid in record_ids]
+            ids_expr = ", ".join(f"'{rid}'" for rid in safe_ids)
+            self._retry_write(
+                "update",
+                where=f"id IN ({ids_expr})",
+                values={"last_accessed": now},
+            )

    def get_record(self, record_id: str) -> MemoryRecord | None:
        """Return a single record by ID, or None if not found."""
        if self._table is None:
            return None
        safe_id = str(record_id).replace("'", "''")
-        rows = self._table.search([0.0] * self._vector_dim).where(f"id = '{safe_id}'").limit(1).to_list()
+        rows = self._table.search().where(f"id = '{safe_id}'").limit(1).to_list()
        if not rows:
            return None
        return self._row_to_record(rows[0])
@@ -374,13 +457,31 @@ class LanceDBStorage:
            self._retry_write("delete", where_expr)
            return before - self._table.count_rows()

-    def _scan_rows(self, scope_prefix: str | None = None, limit: int = _SCAN_ROWS_LIMIT) -> list[dict[str, Any]]:
-        """Scan rows optionally filtered by scope prefix."""
+    def _scan_rows(
+        self,
+        scope_prefix: str | None = None,
+        limit: int = _SCAN_ROWS_LIMIT,
+        columns: list[str] | None = None,
+    ) -> list[dict[str, Any]]:
+        """Scan rows optionally filtered by scope prefix.
+
+        Uses a full table scan (no vector query) so the limit is applied after
+        the scope filter, not to ANN candidates before filtering.
+
+        Args:
+            scope_prefix: Optional scope path prefix to filter by.
+            limit: Maximum number of rows to return (applied after filtering).
+            columns: Optional list of column names to fetch.  Pass only the
+                columns you need for metadata operations to avoid reading the
+                heavy ``vector`` column unnecessarily.
+        """
        if self._table is None:
            return []
-        q = self._table.search([0.0] * self._vector_dim)
+        q = self._table.search()
        if scope_prefix is not None and scope_prefix.strip("/"):
            q = q.where(f"scope LIKE '{scope_prefix.rstrip('/')}%'")
+        if columns is not None:
+            q = q.select(columns)
        return q.limit(limit).to_list()

    def list_records(
@@ -406,7 +507,10 @@ class LanceDBStorage:
        prefix = scope if scope != "/" else ""
        if prefix and not prefix.startswith("/"):
            prefix = "/" + prefix
-        rows = self._scan_rows(prefix or None)
+        rows = self._scan_rows(
+            prefix or None,
+            columns=["scope", "categories_str", "created_at"],
+        )
        if not rows:
            return ScopeInfo(
                path=scope or "/",
@@ -453,7 +557,7 @@ class LanceDBStorage:
    def list_scopes(self, parent: str = "/") -> list[str]:
        parent = parent.rstrip("/") or ""
        prefix = (parent + "/") if parent else "/"
-        rows = self._scan_rows(prefix if prefix != "/" else None)
+        rows = self._scan_rows(prefix if prefix != "/" else None, columns=["scope"])
        children: set[str] = set()
        for row in rows:
            sc = str(row.get("scope", ""))
@@ -465,7 +569,7 @@ class LanceDBStorage:
        return sorted(children)

    def list_categories(self, scope_prefix: str | None = None) -> dict[str, int]:
-        rows = self._scan_rows(scope_prefix)
+        rows = self._scan_rows(scope_prefix, columns=["categories_str"])
        counts: dict[str, int] = {}
        for row in rows:
            cat_str = row.get("categories_str") or "[]"
@@ -498,6 +602,21 @@ class LanceDBStorage:
        if prefix:
            self._table.delete(f"scope >= '{prefix}' AND scope < '{prefix}/\uFFFF'")

+    def optimize(self) -> None:
+        """Compact the table synchronously and refresh the scope index.
+
+        Under normal usage compaction runs automatically in the background
+        (every ``compact_every`` saves and whenever an existing table is
+        opened).  Call this explicitly only when you need the compaction
+        to be complete before the next operation — for example immediately
+        after a large bulk import, before a latency-sensitive recall.
+        It is a no-op if the table does not exist.
+        """
+        if self._table is None:
+            return
+        self._table.optimize()
+        self._ensure_scope_index()
+
    async def asave(self, records: list[MemoryRecord]) -> None:
        self.save(records)

--- a/lib/crewai/src/crewai/memory/types.py
+++ b/lib/crewai/src/crewai/memory/types.py
@@ -87,6 +87,22 @@ class MemoryMatch(BaseModel):
        description="Information the system looked for but could not find.",
    )

+    def format(self) -> str:
+        """Format this match as a human-readable string including metadata.
+
+        Returns:
+            A multi-line string with score, content, categories, and non-empty
+            metadata fields.
+        """
+        lines = [f"- (score={self.score:.2f}) {self.record.content}"]
+        if self.record.categories:
+            lines.append(f"  categories: {', '.join(self.record.categories)}")
+        if self.record.metadata:
+            for key, value in self.record.metadata.items():
+                if value is not None:
+                    lines.append(f"  {key}: {value}")
+        return "\n".join(lines)
+

 class ScopeInfo(BaseModel):
    """Information about a scope in the memory hierarchy."""
@@ -291,7 +307,7 @@ def embed_text(embedder: Any, text: str) -> list[float]:
        return []
    first = result[0]
    if hasattr(first, "tolist"):
-        return first.tolist()
+        return list(first.tolist())
    if isinstance(first, list):
        return [float(x) for x in first]
    return list(first)
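
Review note on `MemoryMatch.format()` (added in the first hunk of `types.py` above): the sketch below shows the string it produces for a typical match. `Record` and `Match` are hypothetical stand-ins, not the real `MemoryRecord` and `MemoryMatch` models. They carry only the fields the method reads, and the method body is copied from the diff.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Record:
    """Hypothetical stand-in for MemoryRecord (only the fields format() reads)."""

    content: str
    categories: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class Match:
    """Hypothetical stand-in for MemoryMatch."""

    score: float
    record: Record

    def format(self) -> str:
        # Same logic as the method added in the diff.
        lines = [f"- (score={self.score:.2f}) {self.record.content}"]
        if self.record.categories:
            lines.append(f"  categories: {', '.join(self.record.categories)}")
        if self.record.metadata:
            for key, value in self.record.metadata.items():
                if value is not None:  # None-valued metadata is skipped
                    lines.append(f"  {key}: {value}")
        return "\n".join(lines)


match = Match(
    score=0.8732,
    record=Record(
        content="Deploys happen on Fridays",
        categories=["ops", "process"],
        metadata={"author": "sam", "ticket": None},
    ),
)
formatted = match.format()
print(formatted)
# - (score=0.87) Deploys happen on Fridays
#   categories: ops, process
#   author: sam
```

Only `None` values are filtered, so falsy metadata such as `0` or an empty string still produces a line.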
--- a/lib/crewai/src/crewai/memory/unified_memory.py
+++ b/lib/crewai/src/crewai/memory/unified_memory.py
@@ -6,7 +6,7 @@ from concurrent.futures import Future, ThreadPoolExecutor
 from datetime import datetime
 import threading
 import time
-from typing import Any, Literal
+from typing import TYPE_CHECKING, Any, Literal

 from crewai.events.event_bus import crewai_event_bus
 from crewai.events.types.memory_events import (
@@ -21,7 +21,6 @@ from crewai.llms.base_llm import BaseLLM
 from crewai.memory.analyze import extract_memories_from_content
 from crewai.memory.recall_flow import RecallFlow
 from crewai.memory.storage.backend import StorageBackend
-from crewai.memory.storage.lancedb_storage import LanceDBStorage
 from crewai.memory.types import (
    MemoryConfig,
    MemoryMatch,
@@ -30,13 +29,20 @@ from crewai.memory.types import (
    compute_composite_score,
    embed_text,
 )
+from crewai.rag.embeddings.factory import build_embedder
+from crewai.rag.embeddings.providers.openai.types import OpenAIProviderSpec


-def _default_embedder() -> Any:
+if TYPE_CHECKING:
+    from chromadb.utils.embedding_functions.openai_embedding_function import (
+        OpenAIEmbeddingFunction,
+    )
+
+
+def _default_embedder() -> OpenAIEmbeddingFunction:
    """Build default OpenAI embedder for memory."""
-    from crewai.rag.embeddings.factory import build_embedder
-
-    return build_embedder({"provider": "openai", "config": {}})
+    spec: OpenAIProviderSpec = {"provider": "openai", "config": {}}
+    return build_embedder(spec)


 class Memory:
@@ -88,6 +94,10 @@ class Memory:
        # Queries shorter than this skip LLM analysis (saving ~1-3s).
        # Longer queries (full task descriptions) benefit from LLM distillation.
        query_analysis_threshold: int = 200,
+        # When True, all write operations (remember, remember_many) are silently
+        # skipped. Useful for sharing a read-only view of memory across agents
+        # without any of them persisting new memories.
+        read_only: bool = False,
    ) -> None:
        """Initialize Memory.

@@ -107,7 +117,9 @@ class Memory:
            complex_query_threshold: For complex queries, explore deeper below this confidence.
            exploration_budget: Number of LLM-driven exploration rounds during deep recall.
            query_analysis_threshold: Queries shorter than this skip LLM analysis during deep recall.
+            read_only: If True, remember() and remember_many() are silent no-ops.
        """
+        self._read_only = read_only
        self._config = MemoryConfig(
            recency_weight=recency_weight,
            semantic_weight=semantic_weight,
@@ -130,14 +142,15 @@ class Memory:
        self._llm_instance: BaseLLM | None = None if isinstance(llm, str) else llm
        self._embedder_config: Any = embedder
        self._embedder_instance: Any = (
-            embedder if (embedder is not None and not isinstance(embedder, dict)) else None
+            embedder
+            if (embedder is not None and not isinstance(embedder, dict))
+            else None
        )

-        # Storage is initialized eagerly (local, no API key needed).
-        if storage == "lancedb":
-            self._storage = LanceDBStorage()
-        elif isinstance(storage, str):
-            self._storage = LanceDBStorage(path=storage)
+        if isinstance(storage, str):
+            from crewai.memory.storage.lancedb_storage import LanceDBStorage
+
+            self._storage = LanceDBStorage() if storage == "lancedb" else LanceDBStorage(path=storage)
        else:
            self._storage = storage

@@ -160,12 +173,17 @@ class Memory:
            from crewai.llm import LLM

            try:
-                self._llm_instance = LLM(model=self._llm_config)
+                model_name = (
+                    self._llm_config
+                    if isinstance(self._llm_config, str)
+                    else str(self._llm_config)
+                )
+                self._llm_instance = LLM(model=model_name)
            except Exception as e:
                raise RuntimeError(
                    f"Memory requires an LLM for analysis but initialization failed: {e}\n\n"
                    "To fix this, do one of the following:\n"
-                    '  - Set OPENAI_API_KEY for the default model (gpt-4o-mini)\n'
+                    "  - Set OPENAI_API_KEY for the default model (gpt-4o-mini)\n"
                    '  - Pass a different model: Memory(llm="anthropic/claude-3-haiku-20240307")\n'
                    '  - Pass any LLM instance: Memory(llm=LLM(model="your-model"))\n'
                    "  - To skip LLM analysis, pass all fields explicitly to remember()\n"
@@ -180,8 +198,6 @@ class Memory:
        if self._embedder_instance is None:
            try:
                if isinstance(self._embedder_config, dict):
-                    from crewai.rag.embeddings.factory import build_embedder
-
                    self._embedder_instance = build_embedder(self._embedder_config)
                else:
                    self._embedder_instance = _default_embedder()
@@ -317,7 +333,7 @@ class Memory:
        source: str | None = None,
        private: bool = False,
        agent_role: str | None = None,
-    ) -> MemoryRecord:
+    ) -> MemoryRecord | None:
        """Store a single item in memory (synchronous).

        Routes through the same serialized save pool as ``remember_many``
@@ -335,11 +351,13 @@ class Memory:
            agent_role: Optional agent role for event metadata.

        Returns:
-            The created MemoryRecord.
+            The created MemoryRecord, or None if this memory is read-only.

        Raises:
            Exception: On save failure (events emitted).
        """
+        if self._read_only:
+            return None
        _source_type = "unified_memory"
        try:
            crewai_event_bus.emit(
@@ -356,7 +374,13 @@ class Memory:
            # then immediately wait for the result.
            future = self._submit_save(
                self._encode_batch,
-                [content], scope, categories, metadata, importance, source, private,
+                [content],
+                scope,
+                categories,
+                metadata,
+                importance,
+                source,
+                private,
            )
            records = future.result()
            record = records[0] if records else None
@@ -420,13 +444,19 @@ class Memory:
        Returns:
            Empty list (records are not available until the background save completes).
        """
-        if not contents:
+        if not contents or self._read_only:
            return []

        self._submit_save(
            self._background_encode_batch,
-            contents, scope, categories, metadata,
-            importance, source, private, agent_role,
+            contents,
+            scope,
+            categories,
+            metadata,
+            importance,
+            source,
+            private,
+            agent_role,
        )
        return []

@@ -566,14 +596,13 @@ class Memory:
                    # Privacy filter
                    if not include_private:
                        raw = [
-                            (r, s) for r, s in raw
+                            (r, s)
+                            for r, s in raw
                            if not r.private or r.source == source
                        ]
                    results = []
                    for r, s in raw:
-                        composite, reasons = compute_composite_score(
-                            r, s, self._config
-                        )
+                        composite, reasons = compute_composite_score(r, s, self._config)
                        results.append(
                            MemoryMatch(
                                record=r,
@@ -739,7 +768,9 @@ class Memory:
            limit: Maximum number of records to return.
            offset: Number of records to skip (for pagination).
        """
-        return self._storage.list_records(scope_prefix=scope, limit=limit, offset=offset)
+        return self._storage.list_records(
+            scope_prefix=scope, limit=limit, offset=offset
+        )

    def info(self, path: str = "/") -> ScopeInfo:
        """Return scope info for path."""
@@ -781,7 +812,7 @@ class Memory:
        importance: float | None = None,
        source: str | None = None,
        private: bool = False,
-    ) -> MemoryRecord:
+    ) -> MemoryRecord | None:
        """Async remember: delegates to sync for now."""
        return self.remember(
            content,
--- a/lib/crewai/src/crewai/rag/embeddings/factory.py
+++ b/lib/crewai/src/crewai/rag/embeddings/factory.py
@@ -216,6 +216,10 @@ def build_embedder_from_dict(
 def build_embedder_from_dict(spec: ONNXProviderSpec) -> ONNXMiniLM_L6_V2: ...


+@overload
+def build_embedder_from_dict(spec: dict[str, Any]) -> EmbeddingFunction[Any]: ...
+
+
 def build_embedder_from_dict(spec):  # type: ignore[no-untyped-def]
    """Build an embedding function instance from a dictionary specification.

@@ -341,6 +345,10 @@ def build_embedder(spec: Text2VecProviderSpec) -> Text2VecEmbeddingFunction: ...
 def build_embedder(spec: ONNXProviderSpec) -> ONNXMiniLM_L6_V2: ...


+@overload
+def build_embedder(spec: dict[str, Any]) -> EmbeddingFunction[Any]: ...
+
+
 def build_embedder(spec):  # type: ignore[no-untyped-def]
    """Build an embedding function from either a provider spec or a provider instance.

--- a/lib/crewai/src/crewai/task.py
+++ b/lib/crewai/src/crewai/task.py
@@ -1,5 +1,6 @@
 from __future__ import annotations

+import asyncio
 from concurrent.futures import Future
 from copy import copy as shallow_copy
 import datetime
@@ -585,16 +586,29 @@ class Task(BaseModel):

            self._post_agent_execution(agent)

-            if not self._guardrails and not self._guardrail:
+            if isinstance(result, BaseModel):
+                raw = result.model_dump_json()
+                if self.output_pydantic:
+                    pydantic_output = result
+                    json_output = None
+                elif self.output_json:
+                    pydantic_output = None
+                    json_output = result.model_dump()
+                else:
+                    pydantic_output = None
+                    json_output = None
+            elif not self._guardrails and not self._guardrail:
+                raw = result
                pydantic_output, json_output = self._export_output(result)
            else:
+                raw = result
                pydantic_output, json_output = None, None

            task_output = TaskOutput(
                name=self.name or self.description,
                description=self.description,
                expected_output=self.expected_output,
-                raw=result,
+                raw=raw,
                pydantic=pydantic_output,
                json_dict=json_output,
                agent=agent.role,
@@ -624,11 +638,15 @@ class Task(BaseModel):
            self.end_time = datetime.datetime.now()

            if self.callback:
-                self.callback(self.output)
+                cb_result = self.callback(self.output)
+                if inspect.isawaitable(cb_result):
+                    await cb_result

            crew = self.agent.crew  # type: ignore[union-attr]
            if crew and crew.task_callback and crew.task_callback != self.callback:
-                crew.task_callback(self.output)
+                cb_result = crew.task_callback(self.output)
+                if inspect.isawaitable(cb_result):
+                    await cb_result

            if self.output_file:
                content = (
@@ -682,16 +700,29 @@ class Task(BaseModel):

            self._post_agent_execution(agent)

-            if not self._guardrails and not self._guardrail:
+            if isinstance(result, BaseModel):
+                raw = result.model_dump_json()
+                if self.output_pydantic:
+                    pydantic_output = result
+                    json_output = None
+                elif self.output_json:
+                    pydantic_output = None
+                    json_output = result.model_dump()
+                else:
+                    pydantic_output = None
+                    json_output = None
+            elif not self._guardrails and not self._guardrail:
+                raw = result
                pydantic_output, json_output = self._export_output(result)
            else:
+                raw = result
                pydantic_output, json_output = None, None

            task_output = TaskOutput(
                name=self.name or self.description,
                description=self.description,
                expected_output=self.expected_output,
-                raw=result,
+                raw=raw,
                pydantic=pydantic_output,
                json_dict=json_output,
                agent=agent.role,
@@ -722,11 +753,15 @@ class Task(BaseModel):
            self.end_time = datetime.datetime.now()

            if self.callback:
-                self.callback(self.output)
+                cb_result = self.callback(self.output)
+                if inspect.iscoroutine(cb_result):
+                    asyncio.run(cb_result)

            crew = self.agent.crew  # type: ignore[union-attr]
            if crew and crew.task_callback and crew.task_callback != self.callback:
-                crew.task_callback(self.output)
+                cb_result = crew.task_callback(self.output)
+                if inspect.iscoroutine(cb_result):
+                    asyncio.run(cb_result)

            if self.output_file:
                content = (
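
Review note on the callback changes in `task.py` above: the async path awaits the callback result when it is awaitable, while the sync path drives a coroutine to completion with `asyncio.run`. The sketch below isolates the two patterns. The helper names `run_callback_sync` and `run_callback_async` are hypothetical and do not exist in the codebase.

```python
import asyncio
import inspect
from typing import Any, Callable


def run_callback_sync(callback: Callable[[Any], Any], output: Any) -> None:
    """Sync path: if the callback returned a coroutine, run it to completion."""
    result = callback(output)
    if inspect.iscoroutine(result):
        asyncio.run(result)


async def run_callback_async(callback: Callable[[Any], Any], output: Any) -> None:
    """Async path: await the result if the callback returned an awaitable."""
    result = callback(output)
    if inspect.isawaitable(result):
        await result


seen: list[str] = []


def sync_cb(output: str) -> None:
    seen.append(f"sync:{output}")


async def async_cb(output: str) -> None:
    await asyncio.sleep(0)
    seen.append(f"async:{output}")


# Both kinds of callback work on both paths.
run_callback_sync(sync_cb, "a")
run_callback_sync(async_cb, "b")
asyncio.run(run_callback_async(sync_cb, "c"))
asyncio.run(run_callback_async(async_cb, "d"))
print(seen)
```

One caveat for the sync path: `asyncio.run` raises `RuntimeError` when the calling thread already has a running event loop, so an async callback on a task executed synchronously from inside async code would likely fail there. Whether that situation arises depends on how callers invoke the sync method.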
--- a/lib/crewai/src/crewai/tools/base_tool.py
+++ b/lib/crewai/src/crewai/tools/base_tool.py
@@ -150,14 +150,38 @@ class BaseTool(BaseModel, ABC):

        super().model_post_init(__context)

+    def _validate_kwargs(self, kwargs: dict[str, Any]) -> dict[str, Any]:
+        """Validate keyword arguments against args_schema if present.
+
+        Args:
+            kwargs: The keyword arguments to validate.
+
+        Returns:
+            Validated (and possibly coerced) keyword arguments.
+
+        Raises:
+            ValueError: If validation against args_schema fails.
+        """
+        if self.args_schema is not None and self.args_schema.model_fields:
+            try:
+                validated = self.args_schema.model_validate(kwargs)
+                return validated.model_dump()
+            except Exception as e:
+                raise ValueError(
+                    f"Tool '{self.name}' arguments validation failed: {e}"
+                ) from e
+        return kwargs
+
    def run(
        self,
        *args: Any,
        **kwargs: Any,
    ) -> Any:
+        if not args:
+            kwargs = self._validate_kwargs(kwargs)
+
        result = self._run(*args, **kwargs)

-        # If _run is async, we safely run it
        if asyncio.iscoroutine(result):
            result = asyncio.run(result)

@@ -179,6 +203,8 @@ class BaseTool(BaseModel, ABC):
        Returns:
            The result of the tool execution.
        """
+        if not args:
+            kwargs = self._validate_kwargs(kwargs)
        result = await self._arun(*args, **kwargs)
        self.current_usage_count += 1
        return result
@@ -331,6 +357,9 @@ class Tool(BaseTool, Generic[P, R]):
        Returns:
            The result of the tool execution.
        """
+        if not args:
+            kwargs = self._validate_kwargs(kwargs)  # type: ignore[assignment]
+
        result = self.func(*args, **kwargs)

        if asyncio.iscoroutine(result):
@@ -361,6 +390,8 @@ class Tool(BaseTool, Generic[P, R]):
        Returns:
            The result of the tool execution.
        """
+        if not args:
+            kwargs = self._validate_kwargs(kwargs)  # type: ignore[assignment]
        result = await self._arun(*args, **kwargs)
        self.current_usage_count += 1
        return result
--- a/lib/crewai/src/crewai/tools/mcp_native_tool.py
+++ b/lib/crewai/src/crewai/tools/mcp_native_tool.py
@@ -27,14 +27,16 @@ class MCPNativeTool(BaseTool):
        tool_name: str,
        tool_schema: dict[str, Any],
        server_name: str,
+        original_tool_name: str | None = None,
    ) -> None:
        """Initialize native MCP tool.

        Args:
            mcp_client: MCPClient instance with active session.
-            tool_name: Original name of the tool on the MCP server.
+            tool_name: Name of the tool (may be prefixed).
            tool_schema: Schema information for the tool.
            server_name: Name of the MCP server for prefixing.
+            original_tool_name: Original name of the tool on the MCP server.
        """
        # Create tool name with server prefix to avoid conflicts
        prefixed_name = f"{server_name}_{tool_name}"
@@ -57,7 +59,7 @@ class MCPNativeTool(BaseTool):

        # Set instance attributes after super().__init__
        self._mcp_client = mcp_client
-        self._original_tool_name = tool_name
+        self._original_tool_name = original_tool_name or tool_name
        self._server_name = server_name
        # self._logger = logging.getLogger(__name__)

--- a/lib/crewai/src/crewai/tools/memory_tools.py
+++ b/lib/crewai/src/crewai/tools/memory_tools.py
@@ -20,14 +20,6 @@ class RecallMemorySchema(BaseModel):
            "or multiple items to search for several things at once."
        ),
    )
-    scope: str | None = Field(
-        default=None,
-        description="Optional scope to narrow the search (e.g. /project/alpha)",
-    )
-    depth: str = Field(
-        default="shallow",
-        description="'shallow' for fast vector search, 'deep' for LLM-analyzed retrieval",
-    )


 class RecallMemoryTool(BaseTool):
@@ -41,32 +33,27 @@ class RecallMemoryTool(BaseTool):
    def _run(
        self,
        queries: list[str] | str,
-        scope: str | None = None,
-        depth: str = "shallow",
        **kwargs: Any,
    ) -> str:
        """Search memory for relevant information.

        Args:
            queries: One or more search queries (string or list of strings).
-            scope: Optional scope prefix to narrow the search.
-            depth: "shallow" for fast vector search, "deep" for LLM-analyzed retrieval.

        Returns:
            Formatted string of matching memories, or a message if none found.
        """
        if isinstance(queries, str):
            queries = [queries]
-        actual_depth = depth if depth in ("shallow", "deep") else "shallow"

        all_lines: list[str] = []
        seen_ids: set[str] = set()
        for query in queries:
-            matches = self.memory.recall(query, scope=scope, limit=5, depth=actual_depth)
+            matches = self.memory.recall(query)
            for m in matches:
                if m.record.id not in seen_ids:
                    seen_ids.add(m.record.id)
-                    all_lines.append(f"- (score={m.score:.2f}) {m.record.content}")
+                    all_lines.append(m.format())

        if not all_lines:
            return "No relevant memories found."
@@ -117,20 +104,28 @@ class RememberTool(BaseTool):
 def create_memory_tools(memory: Any) -> list[BaseTool]:
    """Create Recall and Remember tools for the given memory instance.

+    When memory is read-only (``_read_only=True``), only the RecallMemoryTool
+    is returned — the RememberTool is omitted so agents are never offered a
+    save capability they cannot use.
+
    Args:
        memory: A Memory, MemoryScope, or MemorySlice instance.

    Returns:
-        List containing a RecallMemoryTool and a RememberTool.
+        List containing a RecallMemoryTool and, if not read-only, a RememberTool.
    """
    i18n = get_i18n()
-    return [
+    tools: list[BaseTool] = [
        RecallMemoryTool(
            memory=memory,
            description=i18n.tools("recall_memory"),
        ),
-        RememberTool(
-            memory=memory,
-            description=i18n.tools("save_to_memory"),
-        ),
    ]
+    if not getattr(memory, "_read_only", False):
+        tools.append(
+            RememberTool(
+                memory=memory,
+                description=i18n.tools("save_to_memory"),
+            )
+        )
+    return tools
--- a/lib/crewai/src/crewai/utilities/agent_utils.py
+++ b/lib/crewai/src/crewai/utilities/agent_utils.py
@@ -3,6 +3,7 @@ from __future__ import annotations
 import asyncio
 from collections.abc import Callable, Sequence
 import concurrent.futures
+import inspect
 import json
 import re
 from typing import TYPE_CHECKING, Any, Final, Literal, TypedDict
@@ -167,7 +168,9 @@ def convert_tools_to_openai_schema(
        parameters: dict[str, Any] = {}
        if hasattr(tool, "args_schema") and tool.args_schema is not None:
            try:
-                schema_output = generate_model_description(tool.args_schema)
+                schema_output = generate_model_description(
+                    tool.args_schema, strip_null_types=False
+                )
                parameters = schema_output.get("json_schema", {}).get("schema", {})
                # Remove title and description from schema root as they're redundant
                parameters.pop("title", None)
@@ -501,7 +504,9 @@ def handle_agent_action_core(
        - TODO: Remove messages parameter and its usage.
    """
    if step_callback:
-        step_callback(tool_result)
+        cb_result = step_callback(tool_result)
+        if inspect.iscoroutine(cb_result):
+            asyncio.run(cb_result)

    formatted_answer.text += f"\nObservation: {tool_result.result}"
    formatted_answer.result = tool_result.result
@@ -1143,6 +1148,36 @@ def extract_tool_call_info(
    return None


+def parse_tool_call_args(
+    func_args: dict[str, Any] | str,
+    func_name: str,
+    call_id: str,
+    original_tool: Any = None,
+) -> tuple[dict[str, Any], None] | tuple[None, dict[str, Any]]:
+    """Parse tool call arguments from a JSON string or dict.
+
+    Returns:
+        ``(args_dict, None)`` on success, or ``(None, error_result)`` on
+        JSON parse failure where ``error_result`` is a ready-to-return dict
+        with the same shape as ``_execute_single_native_tool_call`` return values.
+    """
+    if isinstance(func_args, str):
+        try:
+            return json.loads(func_args), None
+        except json.JSONDecodeError as e:
+            return None, {
+                "call_id": call_id,
+                "func_name": func_name,
+                "result": (
+                    f"Error: Failed to parse tool arguments as JSON: {e}. "
+                    f"Please provide valid JSON arguments for the '{func_name}' tool."
+                ),
+                "from_cache": False,
+                "original_tool": original_tool,
+            }
+    return func_args, None
+
+
 def _setup_before_llm_call_hooks(
    executor_context: CrewAgentExecutor | AgentExecutor | LiteAgent | None,
    printer: Printer,
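
Review note on `parse_tool_call_args` (added in `agent_utils.py` above): the function returns a two-element tuple in which exactly one side is `None`, so a caller can branch on the error slot. The sketch below copies the function body from the diff so it runs standalone. The tool name `search` and the call IDs are made up for illustration.

```python
from __future__ import annotations

import json
from typing import Any


def parse_tool_call_args(
    func_args: dict[str, Any] | str,
    func_name: str,
    call_id: str,
    original_tool: Any = None,
) -> tuple[dict[str, Any], None] | tuple[None, dict[str, Any]]:
    # Copied from the diff so the example is self-contained.
    if isinstance(func_args, str):
        try:
            return json.loads(func_args), None
        except json.JSONDecodeError as e:
            return None, {
                "call_id": call_id,
                "func_name": func_name,
                "result": (
                    f"Error: Failed to parse tool arguments as JSON: {e}. "
                    f"Please provide valid JSON arguments for the '{func_name}' tool."
                ),
                "from_cache": False,
                "original_tool": original_tool,
            }
    return func_args, None


# Valid JSON string: arguments are parsed, error slot is None.
ok_args, ok_error = parse_tool_call_args('{"query": "weather"}', "search", "call_1")

# Truncated JSON: arguments slot is None, error slot holds a ready-to-return dict.
bad_args, bad_error = parse_tool_call_args('{"query": ', "search", "call_2")

# A dict is passed through unchanged.
same_args, same_error = parse_tool_call_args({"query": "news"}, "search", "call_3")

print(ok_args)
print(bad_error["result"])
```

Note that `json.loads` also accepts valid JSON that is not an object (for example `"[1, 2]"` or `"null"`), so the first tuple element is not guaranteed to be a `dict` at runtime even though the annotation says so. Callers may want an `isinstance` check before unpacking it as keyword arguments.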
--- a/lib/crewai/src/crewai/utilities/llm_utils.py
+++ b/lib/crewai/src/crewai/utilities/llm_utils.py
@@ -69,7 +69,7 @@ def create_llm(
 UNACCEPTED_ATTRIBUTES: Final[list[str]] = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
-    "AWS_REGION_NAME",
+    "AWS_DEFAULT_REGION",
 ]


@@ -146,7 +146,7 @@ def _llm_via_environment_or_fallback() -> LLM | None:
    unaccepted_attributes = [
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
-        "AWS_REGION_NAME",
+        "AWS_DEFAULT_REGION",
    ]
    set_provider = model_name.partition("/")[0] if "/" in model_name else "openai"

--- a/lib/crewai/src/crewai/utilities/pydantic_schema_utils.py
+++ b/lib/crewai/src/crewai/utilities/pydantic_schema_utils.py
@@ -417,7 +417,11 @@ def strip_null_from_types(schema: dict[str, Any]) -> dict[str, Any]:
    return schema


-def generate_model_description(model: type[BaseModel]) -> ModelDescription:
+def generate_model_description(
+    model: type[BaseModel],
+    *,
+    strip_null_types: bool = True,
+) -> ModelDescription:
    """Generate JSON schema description of a Pydantic model.

    This function takes a Pydantic model class and returns its JSON schema,
@@ -426,6 +430,9 @@ def generate_model_description(model: type[BaseModel]) -> ModelDescription:

    Args:
        model: A Pydantic model class.
+        strip_null_types: When ``True`` (default), remove ``null`` from
+            ``anyOf`` / ``type`` arrays. Set to ``False`` to keep ``null`` in
+            the schema so that optional fields can be sent as ``null``.

    Returns:
        A ModelDescription with JSON schema representation of the model.
@@ -442,7 +449,9 @@ def generate_model_description(model: type[BaseModel]) -> ModelDescription:
    json_schema = fix_discriminator_mappings(json_schema)
    json_schema = convert_oneof_to_anyof(json_schema)
    json_schema = ensure_all_properties_required(json_schema)
-    json_schema = strip_null_from_types(json_schema)
+
+    if strip_null_types:
+        json_schema = strip_null_from_types(json_schema)

    return {
        "type": "json_schema",
@@ -482,10 +491,66 @@ FORMAT_TYPE_MAP: dict[str, type[Any]] = {
 }


+def build_rich_field_description(prop_schema: dict[str, Any]) -> str:
+    """Build a comprehensive field description including constraints.
+
+    Embeds format, enum, pattern, min/max, and example constraints into the
+    description text so that LLMs can understand tool parameter requirements
+    without inspecting the raw JSON Schema.
+
+    Args:
+        prop_schema: Property schema with description and constraints.
+
+    Returns:
+        Enhanced description with format, enum, and other constraints.
+    """
+    parts: list[str] = []
+
+    description = prop_schema.get("description", "")
+    if description:
+        parts.append(description)
+
+    format_type = prop_schema.get("format")
+    if format_type:
+        parts.append(f"Format: {format_type}")
+
+    enum_values = prop_schema.get("enum")
+    if enum_values:
+        enum_str = ", ".join(repr(v) for v in enum_values)
+        parts.append(f"Allowed values: [{enum_str}]")
+
+    pattern = prop_schema.get("pattern")
+    if pattern:
+        parts.append(f"Pattern: {pattern}")
+
+    minimum = prop_schema.get("minimum")
+    maximum = prop_schema.get("maximum")
+    if minimum is not None:
+        parts.append(f"Minimum: {minimum}")
+    if maximum is not None:
+        parts.append(f"Maximum: {maximum}")
+
+    min_length = prop_schema.get("minLength")
+    max_length = prop_schema.get("maxLength")
+    if min_length is not None:
+        parts.append(f"Min length: {min_length}")
+    if max_length is not None:
+        parts.append(f"Max length: {max_length}")
+
+    examples = prop_schema.get("examples")
+    if examples:
+        examples_str = ", ".join(repr(e) for e in examples[:3])
+        parts.append(f"Examples: {examples_str}")
+
+    return ". ".join(parts)
+
+
 def create_model_from_schema(  # type: ignore[no-any-unimported]
    json_schema: dict[str, Any],
    *,
    root_schema: dict[str, Any] | None = None,
+    model_name: str | None = None,
+    enrich_descriptions: bool = False,
    __config__: ConfigDict | None = None,
    __base__: type[BaseModel] | None = None,
    __module__: str = __name__,
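
Review note on `build_rich_field_description` (added in the hunk above): the sketch below shows what the enriched description looks like for a few property schemas. The function here is a condensed restatement that reads the same keys in the same order as the version in the diff, written so the snippet runs standalone. The sample schemas are invented for illustration.

```python
from typing import Any


def build_rich_field_description(prop_schema: dict[str, Any]) -> str:
    """Condensed restatement of the function in the diff (same keys, same order)."""
    parts: list[str] = []
    if prop_schema.get("description"):
        parts.append(prop_schema["description"])
    if prop_schema.get("format"):
        parts.append(f"Format: {prop_schema['format']}")
    if prop_schema.get("enum"):
        enum_str = ", ".join(repr(v) for v in prop_schema["enum"])
        parts.append(f"Allowed values: [{enum_str}]")
    if prop_schema.get("pattern"):
        parts.append(f"Pattern: {prop_schema['pattern']}")
    # Numeric and length bounds use "is not None" so that 0 is still reported.
    for key, label in (
        ("minimum", "Minimum"),
        ("maximum", "Maximum"),
        ("minLength", "Min length"),
        ("maxLength", "Max length"),
    ):
        if prop_schema.get(key) is not None:
            parts.append(f"{label}: {prop_schema[key]}")
    if prop_schema.get("examples"):
        # At most three examples are embedded.
        examples_str = ", ".join(repr(e) for e in prop_schema["examples"][:3])
        parts.append(f"Examples: {examples_str}")
    return ". ".join(parts)


status = build_rich_field_description(
    {"description": "Ticket status", "enum": ["open", "closed"]}
)
age = build_rich_field_description(
    {"description": "Age in years", "type": "integer", "minimum": 0, "maximum": 150}
)
start = build_rich_field_description({"description": "Start date.", "format": "date"})

print(status)  # Ticket status. Allowed values: ['open', 'closed']
print(age)     # Age in years. Minimum: 0. Maximum: 150
print(start)   # Start date.. Format: date
```

The last case shows a cosmetic side effect of joining with `". "`: a description that already ends in a period yields a doubled period. Stripping a trailing period before joining would avoid it, if that matters for the LLM-facing text.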
@@ -503,6 +568,13 @@ def create_model_from_schema(  # type: ignore[no-any-unimported]
        json_schema: A dictionary representing the JSON schema.
        root_schema: The root schema containing $defs. If not provided, the
            current schema is treated as the root schema.
+        model_name: Override for the model name. If not provided, the schema
+            ``title`` field is used, falling back to ``"DynamicModel"``.
+        enrich_descriptions: When True, augment field descriptions with
+            constraint info (format, enum, pattern, min/max, examples) via
+            :func:`build_rich_field_description`.  Useful for LLM-facing tool
+            schemas where constraints in the description help the model
+            understand parameter requirements.
        __config__: Pydantic configuration for the generated model.
        __base__: Base class for the generated model. Defaults to BaseModel.
        __module__: Module name for the generated model class.
@@ -539,10 +611,14 @@ def create_model_from_schema(  # type: ignore[no-any-unimported]
        if "title" not in json_schema and "title" in (root_schema or {}):
            json_schema["title"] = (root_schema or {}).get("title")

-    model_name = json_schema.get("title") or "DynamicModel"
+    effective_name = model_name or json_schema.get("title") or "DynamicModel"
    field_definitions = {
        name: _json_schema_to_pydantic_field(
-            name, prop, json_schema.get("required", []), effective_root
+            name,
+            prop,
+            json_schema.get("required", []),
+            effective_root,
+            enrich_descriptions=enrich_descriptions,
        )
        for name, prop in (json_schema.get("properties", {}) or {}).items()
    }
@@ -550,7 +626,7 @@ def create_model_from_schema(  # type: ignore[no-any-unimported]
    effective_config = __config__ or ConfigDict(extra="forbid")

    return create_model_base(
-        model_name,
+        effective_name,
        __config__=effective_config,
        __base__=__base__,
        __module__=__module__,
@@ -565,6 +641,8 @@ def _json_schema_to_pydantic_field(
    json_schema: dict[str, Any],
    required: list[str],
    root_schema: dict[str, Any],
+    *,
+    enrich_descriptions: bool = False,
 ) -> Any:
    """Convert a JSON schema property to a Pydantic field definition.

@@ -573,20 +651,29 @@ def _json_schema_to_pydantic_field(
        json_schema: The JSON schema for this field.
        required: List of required field names.
        root_schema: The root schema for resolving $ref.
+        enrich_descriptions: When True, embed constraints in the description.

    Returns:
        A tuple of (type, Field) for use with create_model.
    """
-    type_ = _json_schema_to_pydantic_type(json_schema, root_schema, name_=name.title())
-    description = json_schema.get("description")
-    examples = json_schema.get("examples")
+    type_ = _json_schema_to_pydantic_type(
+        json_schema, root_schema, name_=name.title(), enrich_descriptions=enrich_descriptions
+    )
    is_required = name in required

    field_params: dict[str, Any] = {}
    schema_extra: dict[str, Any] = {}

-    if description:
-        field_params["description"] = description
+    if enrich_descriptions:
+        rich_desc = build_rich_field_description(json_schema)
+        if rich_desc:
+            field_params["description"] = rich_desc
+    else:
+        description = json_schema.get("description")
+        if description:
+            field_params["description"] = description
+
+    examples = json_schema.get("examples")
    if examples:
        schema_extra["examples"] = examples

@@ -702,6 +789,7 @@ def _json_schema_to_pydantic_type(
    root_schema: dict[str, Any],
    *,
    name_: str | None = None,
+    enrich_descriptions: bool = False,
 ) -> Any:
    """Convert a JSON schema to a Python/Pydantic type.

@@ -709,6 +797,7 @@ def _json_schema_to_pydantic_type(
        json_schema: The JSON schema to convert.
        root_schema: The root schema for resolving $ref.
        name_: Optional name for nested models.
+        enrich_descriptions: Propagated to nested model creation.

    Returns:
        A Python type corresponding to the JSON schema.
@@ -716,7 +805,9 @@ def _json_schema_to_pydantic_type(
    ref = json_schema.get("$ref")
    if ref:
        ref_schema = _resolve_ref(ref, root_schema)
-        return _json_schema_to_pydantic_type(ref_schema, root_schema, name_=name_)
+        return _json_schema_to_pydantic_type(
+            ref_schema, root_schema, name_=name_, enrich_descriptions=enrich_descriptions
+        )

    enum_values = json_schema.get("enum")
    if enum_values:
@@ -731,7 +822,10 @@ def _json_schema_to_pydantic_type(
    if any_of_schemas:
        any_of_types = [
            _json_schema_to_pydantic_type(
-                schema, root_schema, name_=f"{name_ or 'Union'}Option{i}"
+                schema,
+                root_schema,
+                name_=f"{name_ or 'Union'}Option{i}",
+                enrich_descriptions=enrich_descriptions,
            )
            for i, schema in enumerate(any_of_schemas)
        ]
@@ -741,10 +835,14 @@ def _json_schema_to_pydantic_type(
    if all_of_schemas:
        if len(all_of_schemas) == 1:
            return _json_schema_to_pydantic_type(
-                all_of_schemas[0], root_schema, name_=name_
+                all_of_schemas[0], root_schema, name_=name_,
+                enrich_descriptions=enrich_descriptions,
            )
        merged = _merge_all_of_schemas(all_of_schemas, root_schema)
-        return _json_schema_to_pydantic_type(merged, root_schema, name_=name_)
+        return _json_schema_to_pydantic_type(
+            merged, root_schema, name_=name_,
+            enrich_descriptions=enrich_descriptions,
+        )

    type_ = json_schema.get("type")

@@ -760,7 +858,8 @@ def _json_schema_to_pydantic_type(
        items_schema = json_schema.get("items")
        if items_schema:
            item_type = _json_schema_to_pydantic_type(
-                items_schema, root_schema, name_=name_
+                items_schema, root_schema, name_=name_,
+                enrich_descriptions=enrich_descriptions,
            )
            return list[item_type]  # type: ignore[valid-type]
        return list
@@ -770,7 +869,10 @@ def _json_schema_to_pydantic_type(
            json_schema_ = json_schema.copy()
            if json_schema_.get("title") is None:
                json_schema_["title"] = name_ or "DynamicModel"
-            return create_model_from_schema(json_schema_, root_schema=root_schema)
+            return create_model_from_schema(
+                json_schema_, root_schema=root_schema,
+                enrich_descriptions=enrich_descriptions,
+            )
        return dict
    if type_ == "null":
        return None
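Review note: the hunks above thread `enrich_descriptions` through every recursive call, but `build_rich_field_description` itself is not shown in this part of the diff. The sketch below is only an illustration of what "embed constraints in the description" could mean. The helper name `describe_with_constraints` and the list of constraint keys are assumptions made for this example, not the project's implementation.

```python
from typing import Any

# Hypothetical helper for illustration only; it is NOT the project's
# build_rich_field_description. The constraint keys are standard JSON Schema
# keywords chosen for the example.
_CONSTRAINT_KEYS = ("minimum", "maximum", "minLength", "maxLength", "pattern", "enum")


def describe_with_constraints(json_schema: dict[str, Any]) -> str:
    """Join the schema's description with any constraint keywords it declares."""
    parts: list[str] = []
    description = json_schema.get("description")
    if description:
        parts.append(str(description))
    for key in _CONSTRAINT_KEYS:
        if key in json_schema:
            parts.append(f"{key}={json_schema[key]!r}")
    return "; ".join(parts)


print(describe_with_constraints({"description": "Age", "minimum": 0, "maximum": 120}))
```

A schema with neither a description nor constraints yields an empty string, which mirrors the `if rich_desc:` guard in `_json_schema_to_pydantic_field`.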
--- a/lib/crewai/tests/agents/test_agent_executor.py
+++ b/lib/crewai/tests/agents/test_agent_executor.py
@@ -4,6 +4,7 @@ Tests the Flow-based agent executor implementation including state management,
 flow methods, routing logic, and error handling.
 """

+import time
 from unittest.mock import Mock, patch

 import pytest
@@ -462,3 +463,176 @@ class TestFlowInvoke:

        assert result == {"output": "Done"}
        assert len(executor.state.messages) >= 2
+
+
+class TestNativeToolExecution:
+    """Test native tool execution behavior."""
+
+    @pytest.fixture
+    def mock_dependencies(self):
+        llm = Mock()
+        llm.supports_stop_words.return_value = True
+
+        task = Mock()
+        task.name = "Test Task"
+        task.description = "Test"
+        task.human_input = False
+        task.response_model = None
+
+        crew = Mock()
+        crew._memory = None
+        crew.verbose = False
+        crew._train = False
+
+        agent = Mock()
+        agent.id = "test-agent-id"
+        agent.role = "Test Agent"
+        agent.verbose = False
+        agent.key = "test-key"
+
+        prompt = {"prompt": "Test {input} {tool_names} {tools}"}
+
+        tools_handler = Mock()
+        tools_handler.cache = None
+
+        return {
+            "llm": llm,
+            "task": task,
+            "crew": crew,
+            "agent": agent,
+            "prompt": prompt,
+            "max_iter": 10,
+            "tools": [],
+            "tools_names": "",
+            "stop_words": [],
+            "tools_description": "",
+            "tools_handler": tools_handler,
+        }
+
+    def test_execute_native_tool_runs_parallel_for_multiple_calls(
+        self, mock_dependencies
+    ):
+        executor = AgentExecutor(**mock_dependencies)
+
+        def slow_one() -> str:
+            time.sleep(0.2)
+            return "one"
+
+        def slow_two() -> str:
+            time.sleep(0.2)
+            return "two"
+
+        executor._available_functions = {"slow_one": slow_one, "slow_two": slow_two}
+        executor.state.pending_tool_calls = [
+            {
+                "id": "call_1",
+                "function": {"name": "slow_one", "arguments": "{}"},
+            },
+            {
+                "id": "call_2",
+                "function": {"name": "slow_two", "arguments": "{}"},
+            },
+        ]
+
+        started = time.perf_counter()
+        result = executor.execute_native_tool()
+        elapsed = time.perf_counter() - started
+
+        assert result == "native_tool_completed"
+        assert elapsed < 0.4  # two sequential 0.2s sleeps would take >= 0.4s
+        tool_messages = [m for m in executor.state.messages if m.get("role") == "tool"]
+        assert len(tool_messages) == 2
+        assert tool_messages[0]["tool_call_id"] == "call_1"
+        assert tool_messages[1]["tool_call_id"] == "call_2"
+
+    def test_execute_native_tool_falls_back_to_sequential_for_result_as_answer(
+        self, mock_dependencies
+    ):
+        executor = AgentExecutor(**mock_dependencies)
+
+        def slow_one() -> str:
+            time.sleep(0.2)
+            return "one"
+
+        def slow_two() -> str:
+            time.sleep(0.2)
+            return "two"
+
+        result_tool = Mock()
+        result_tool.name = "slow_one"
+        result_tool.result_as_answer = True
+        result_tool.max_usage_count = None
+        result_tool.current_usage_count = 0
+
+        executor.original_tools = [result_tool]
+        executor._available_functions = {"slow_one": slow_one, "slow_two": slow_two}
+        executor.state.pending_tool_calls = [
+            {
+                "id": "call_1",
+                "function": {"name": "slow_one", "arguments": "{}"},
+            },
+            {
+                "id": "call_2",
+                "function": {"name": "slow_two", "arguments": "{}"},
+            },
+        ]
+
+        started = time.perf_counter()
+        result = executor.execute_native_tool()
+        elapsed = time.perf_counter() - started
+
+        assert result == "tool_result_is_final"
+        assert elapsed >= 0.2
+        assert elapsed < 0.8
+        assert isinstance(executor.state.current_answer, AgentFinish)
+        assert executor.state.current_answer.output == "one"
+
+    def test_execute_native_tool_result_as_answer_short_circuits_remaining_calls(
+        self, mock_dependencies
+    ):
+        executor = AgentExecutor(**mock_dependencies)
+        call_counts = {"slow_one": 0, "slow_two": 0}
+
+        def slow_one() -> str:
+            call_counts["slow_one"] += 1
+            time.sleep(0.2)
+            return "one"
+
+        def slow_two() -> str:
+            call_counts["slow_two"] += 1
+            time.sleep(0.2)
+            return "two"
+
+        result_tool = Mock()
+        result_tool.name = "slow_one"
+        result_tool.result_as_answer = True
+        result_tool.max_usage_count = None
+        result_tool.current_usage_count = 0
+
+        executor.original_tools = [result_tool]
+        executor._available_functions = {"slow_one": slow_one, "slow_two": slow_two}
+        executor.state.pending_tool_calls = [
+            {
+                "id": "call_1",
+                "function": {"name": "slow_one", "arguments": "{}"},
+            },
+            {
+                "id": "call_2",
+                "function": {"name": "slow_two", "arguments": "{}"},
+            },
+        ]
+
+        started = time.perf_counter()
+        result = executor.execute_native_tool()
+        elapsed = time.perf_counter() - started
+
+        assert result == "tool_result_is_final"
+        assert isinstance(executor.state.current_answer, AgentFinish)
+        assert executor.state.current_answer.output == "one"
+        assert call_counts["slow_one"] == 1
+        assert call_counts["slow_two"] == 0
+        assert elapsed < 0.5
+
+        tool_messages = [m for m in executor.state.messages if m.get("role") == "tool"]
+        assert len(tool_messages) == 1
+        assert tool_messages[0]["tool_call_id"] == "call_1"
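Review note: the tests above check two properties of `execute_native_tool`: independent calls overlap in time, and tool messages keep the order of the pending calls. The executor's implementation is not part of this hunk, so the following is only a minimal sketch of one way to get both properties with a thread pool. `run_calls_in_parallel` is a hypothetical name introduced for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import time
from typing import Callable


def run_calls_in_parallel(
    calls: list[tuple[str, Callable[[], str]]],
) -> list[tuple[str, str]]:
    """Run each callable on its own worker thread; results keep submission order."""
    if not calls:
        return []
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        # Submit everything first so the calls overlap, then collect in order.
        futures = [(call_id, pool.submit(fn)) for call_id, fn in calls]
        return [(call_id, future.result()) for call_id, future in futures]


def slow_one() -> str:
    time.sleep(0.2)
    return "one"


def slow_two() -> str:
    time.sleep(0.2)
    return "two"


started = time.perf_counter()
results = run_calls_in_parallel([("call_1", slow_one), ("call_2", slow_two)])
elapsed = time.perf_counter() - started
```

Collecting `future.result()` in submission order, rather than using `as_completed`, is what keeps `call_1` ahead of `call_2` even if the second call finishes first.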
--- a/lib/crewai/tests/agents/test_async_agent_executor.py
+++ b/lib/crewai/tests/agents/test_async_agent_executor.py
@@ -2,7 +2,7 @@

 import asyncio
 from typing import Any
-from unittest.mock import AsyncMock, MagicMock, patch
+from unittest.mock import AsyncMock, MagicMock, Mock, patch

 import pytest

@@ -291,6 +291,46 @@ class TestAsyncAgentExecutor:
        assert max_concurrent > 1, f"Expected concurrent execution, max concurrent was {max_concurrent}"


+class TestInvokeStepCallback:
+    """Tests for _invoke_step_callback with sync and async callbacks."""
+
+    def test_invoke_step_callback_with_sync_callback(
+        self, executor: CrewAgentExecutor
+    ) -> None:
+        """Test that a sync step callback is called normally."""
+        callback = Mock()
+        executor.step_callback = callback
+        answer = AgentFinish(thought="thinking", output="test", text="final")
+
+        executor._invoke_step_callback(answer)
+
+        callback.assert_called_once_with(answer)
+
+    def test_invoke_step_callback_with_async_callback(
+        self, executor: CrewAgentExecutor
+    ) -> None:
+        """Test that an async step callback's coroutine is passed to asyncio.run."""
+        async_callback = AsyncMock()
+        executor.step_callback = async_callback
+        answer = AgentFinish(thought="thinking", output="test", text="final")
+
+        with patch("crewai.agents.crew_agent_executor.asyncio.run") as mock_run:
+            executor._invoke_step_callback(answer)
+
+            async_callback.assert_called_once_with(answer)
+            mock_run.assert_called_once()
+
+    def test_invoke_step_callback_with_none(
+        self, executor: CrewAgentExecutor
+    ) -> None:
+        """Test that no error is raised when step_callback is None."""
+        executor.step_callback = None
+        answer = AgentFinish(thought="thinking", output="test", text="final")
+
+        # Should not raise
+        executor._invoke_step_callback(answer)
+
+
 class TestAsyncLLMResponseHelper:
    """Tests for aget_llm_response helper function."""

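Review note: `TestInvokeStepCallback` covers three cases (sync callback, async callback, no callback). `_invoke_step_callback` itself is not shown here, so the sketch below only illustrates one common way to dispatch between the three. `invoke_callback` is a hypothetical stand-in, and it assumes no event loop is already running in the calling thread, since `asyncio.run` raises `RuntimeError` in that case.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional


def invoke_callback(callback: Optional[Callable[[Any], Any]], payload: Any) -> None:
    """Call a sync or async callback with payload; do nothing when it is None."""
    if callback is None:
        return
    result = callback(payload)
    # An async callback returns a coroutine that still has to be driven to completion.
    if inspect.iscoroutine(result):
        asyncio.run(result)


seen: list[int] = []


async def async_cb(value: int) -> None:
    seen.append(value * 2)


invoke_callback(seen.append, 1)  # sync path
invoke_callback(async_cb, 2)     # async path, run via asyncio.run
invoke_callback(None, 3)         # no callback configured
```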
--- a/lib/crewai/tests/agents/test_lite_agent.py
+++ b/lib/crewai/tests/agents/test_lite_agent.py
@@ -659,7 +659,7 @@ def test_agent_kickoff_with_platform_tools(mock_get, mock_post):


@patch.dict("os.environ", {"EXA_API_KEY": "test_exa_key"})
-@patch("crewai.agent.Agent._get_external_mcp_tools")
+@patch("crewai.agent.Agent.get_mcp_tools")
@pytest.mark.vcr()
 def test_agent_kickoff_with_mcp_tools(mock_get_mcp_tools):
    """Test that Agent.kickoff() properly integrates MCP tools with LiteAgent"""
@@ -691,7 +691,7 @@ def test_agent_kickoff_with_mcp_tools(mock_get_mcp_tools):
    assert result.raw is not None

    # Verify MCP tools were retrieved
-    mock_get_mcp_tools.assert_called_once_with("https://mcp.exa.ai/mcp?api_key=test_exa_key&profile=research")
+    mock_get_mcp_tools.assert_called_once_with(["https://mcp.exa.ai/mcp?api_key=test_exa_key&profile=research"])


 # ============================================================================
@@ -1136,6 +1136,7 @@ def test_lite_agent_memory_instance_recall_and_save_called():
        successful_requests=1,
    )
    mock_memory = Mock()
+    mock_memory._read_only = False
    mock_memory.recall.return_value = []
    mock_memory.extract_memories.return_value = ["Fact one.", "Fact two."]

--- a/lib/crewai/tests/agents/test_native_tool_calling.py
+++ b/lib/crewai/tests/agents/test_native_tool_calling.py
@@ -6,13 +6,20 @@ when the LLM supports it, across multiple providers.

 from __future__ import annotations

+from collections import Counter
+from collections.abc import Generator
import os
-from unittest.mock import patch
+import threading
+import time
+from unittest.mock import Mock, patch

 import pytest
 from pydantic import BaseModel, Field

 from crewai import Agent, Crew, Task
+from crewai.events import crewai_event_bus
+from crewai.hooks import register_after_tool_call_hook, register_before_tool_call_hook
+from crewai.hooks.tool_hooks import ToolCallHookContext
 from crewai.llm import LLM
 from crewai.tools.base_tool import BaseTool

@@ -64,6 +71,73 @@ class FailingTool(BaseTool):
    def _run(self) -> str:
        raise Exception("This tool always fails")

+
+class LocalSearchInput(BaseModel):
+    query: str = Field(description="Search query")
+
+
+class ParallelProbe:
+    """Thread-safe in-memory recorder for tool execution windows."""
+
+    _lock = threading.Lock()
+    _windows: list[tuple[str, float, float]] = []
+
+    @classmethod
+    def reset(cls) -> None:
+        with cls._lock:
+            cls._windows = []
+
+    @classmethod
+    def record(cls, tool_name: str, start: float, end: float) -> None:
+        with cls._lock:
+            cls._windows.append((tool_name, start, end))
+
+    @classmethod
+    def windows(cls) -> list[tuple[str, float, float]]:
+        with cls._lock:
+            return list(cls._windows)
+
+
+def _parallel_prompt() -> str:
+    return (
+        "This is a tool-calling compliance test. "
+        "In your next assistant turn, emit exactly 3 tool calls in the same response (parallel tool calls), in this order: "
+        "1) parallel_local_search_one(query='latest OpenAI model release notes'), "
+        "2) parallel_local_search_two(query='latest Anthropic model release notes'), "
+        "3) parallel_local_search_three(query='latest Gemini model release notes'). "
+        "Do not call any other tools and do not answer before those 3 tool calls are emitted. "
+        "After the tool results return, provide a one paragraph summary."
+    )
+
+
+def _max_concurrency(windows: list[tuple[str, float, float]]) -> int:
+    points: list[tuple[float, int]] = []
+    for _, start, end in windows:
+        points.append((start, 1))
+        points.append((end, -1))
+    points.sort(key=lambda p: (p[0], p[1]))
+
+    current = 0
+    maximum = 0
+    for _, delta in points:
+        current += delta
+        if current > maximum:
+            maximum = current
+    return maximum
+
+
+def _assert_tools_overlapped() -> None:
+    windows = ParallelProbe.windows()
+    local_windows = [
+        w
+        for w in windows
+        if w[0].startswith("parallel_local_search_")
+    ]
+
+    assert len(local_windows) >= 3, f"Expected at least 3 local tool calls, got {len(local_windows)}"
+    assert _max_concurrency(local_windows) >= 2, "Expected overlapping local tool executions"
+
+
@pytest.fixture
 def calculator_tool() -> CalculatorTool:
    """Create a calculator tool for testing."""
@@ -82,6 +156,65 @@ def failing_tool() -> BaseTool:

    )

+
+@pytest.fixture
+def parallel_tools() -> list[BaseTool]:
+    """Create local tools used to verify native parallel execution deterministically."""
+
+    class ParallelLocalSearchOne(BaseTool):
+        name: str = "parallel_local_search_one"
+        description: str = "Local search tool #1 for concurrency testing."
+        args_schema: type[BaseModel] = LocalSearchInput
+
+        def _run(self, query: str) -> str:
+            start = time.perf_counter()
+            time.sleep(1.0)
+            end = time.perf_counter()
+            ParallelProbe.record(self.name, start, end)
+            return f"[one] {query}"
+
+    class ParallelLocalSearchTwo(BaseTool):
+        name: str = "parallel_local_search_two"
+        description: str = "Local search tool #2 for concurrency testing."
+        args_schema: type[BaseModel] = LocalSearchInput
+
+        def _run(self, query: str) -> str:
+            start = time.perf_counter()
+            time.sleep(1.0)
+            end = time.perf_counter()
+            ParallelProbe.record(self.name, start, end)
+            return f"[two] {query}"
+
+    class ParallelLocalSearchThree(BaseTool):
+        name: str = "parallel_local_search_three"
+        description: str = "Local search tool #3 for concurrency testing."
+        args_schema: type[BaseModel] = LocalSearchInput
+
+        def _run(self, query: str) -> str:
+            start = time.perf_counter()
+            time.sleep(1.0)
+            end = time.perf_counter()
+            ParallelProbe.record(self.name, start, end)
+            return f"[three] {query}"
+
+    return [
+        ParallelLocalSearchOne(),
+        ParallelLocalSearchTwo(),
+        ParallelLocalSearchThree(),
+    ]
+
+
+def _attach_parallel_probe_handler() -> None:
+    @crewai_event_bus.on(ToolUsageFinishedEvent)
+    def _capture_tool_window(_source, event: ToolUsageFinishedEvent):
+        if not event.tool_name.startswith("parallel_local_search_"):
+            return
+        ParallelProbe.record(
+            event.tool_name,
+            event.started_at.timestamp(),
+            event.finished_at.timestamp(),
+        )
+
 # =============================================================================
 # OpenAI Provider Tests
 # =============================================================================
@@ -122,7 +255,7 @@ class TestOpenAINativeToolCalling:
        self, calculator_tool: CalculatorTool
    ) -> None:
        """Test OpenAI agent kickoff with mocked LLM call."""
-        llm = LLM(model="gpt-4o-mini")
+        llm = LLM(model="gpt-5-nano")

        with patch.object(llm, "call", return_value="The answer is 120.") as mock_call:
            agent = Agent(
@@ -146,6 +279,174 @@ class TestOpenAINativeToolCalling:
            assert mock_call.called
            assert result is not None

+    @pytest.mark.vcr()
+    @pytest.mark.timeout(180)
+    def test_openai_parallel_native_tool_calling_test_crew(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="gpt-5-nano", temperature=1),
+            verbose=False,
+            max_iter=3,
+        )
+        task = Task(
+            description=_parallel_prompt(),
+            expected_output="A one sentence summary of both tool outputs",
+            agent=agent,
+        )
+        crew = Crew(agents=[agent], tasks=[task])
+        result = crew.kickoff()
+        assert result is not None
+        _assert_tools_overlapped()
+
+    @pytest.mark.vcr()
+    @pytest.mark.timeout(180)
+    def test_openai_parallel_native_tool_calling_test_agent_kickoff(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="gpt-4o-mini"),
+            verbose=False,
+            max_iter=3,
+        )
+        result = agent.kickoff(_parallel_prompt())
+        assert result is not None
+        _assert_tools_overlapped()
+
+    @pytest.mark.vcr()
+    @pytest.mark.timeout(180)
+    def test_openai_parallel_native_tool_calling_tool_hook_parity_crew(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        hook_calls: dict[str, list[dict[str, str]]] = {"before": [], "after": []}
+
+        def before_hook(context: ToolCallHookContext) -> bool | None:
+            if context.tool_name.startswith("parallel_local_search_"):
+                hook_calls["before"].append(
+                    {
+                        "tool_name": context.tool_name,
+                        "query": str(context.tool_input.get("query", "")),
+                    }
+                )
+            return None
+
+        def after_hook(context: ToolCallHookContext) -> str | None:
+            if context.tool_name.startswith("parallel_local_search_"):
+                hook_calls["after"].append(
+                    {
+                        "tool_name": context.tool_name,
+                        "query": str(context.tool_input.get("query", "")),
+                    }
+                )
+            return None
+
+        register_before_tool_call_hook(before_hook)
+        register_after_tool_call_hook(after_hook)
+
+        try:
+            agent = Agent(
+                role="Parallel Tool Agent",
+                goal="Use both tools exactly as instructed",
+                backstory="You follow tool instructions precisely.",
+                tools=parallel_tools,
+                llm=LLM(model="gpt-5-nano", temperature=1),
+                verbose=False,
+                max_iter=3,
+            )
+            task = Task(
+                description=_parallel_prompt(),
+                expected_output="A one sentence summary of both tool outputs",
+                agent=agent,
+            )
+            crew = Crew(agents=[agent], tasks=[task])
+            result = crew.kickoff()
+
+            assert result is not None
+            _assert_tools_overlapped()
+
+            before_names = [call["tool_name"] for call in hook_calls["before"]]
+            after_names = [call["tool_name"] for call in hook_calls["after"]]
+            assert len(before_names) >= 3, "Expected before hooks for all parallel calls"
+            assert Counter(before_names) == Counter(after_names)
+            assert all(call["query"] for call in hook_calls["before"])
+            assert all(call["query"] for call in hook_calls["after"])
+        finally:
+            from crewai.hooks import (
+                unregister_after_tool_call_hook,
+                unregister_before_tool_call_hook,
+            )
+
+            unregister_before_tool_call_hook(before_hook)
+            unregister_after_tool_call_hook(after_hook)
+
+    @pytest.mark.vcr()
+    @pytest.mark.timeout(180)
+    def test_openai_parallel_native_tool_calling_tool_hook_parity_agent_kickoff(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        hook_calls: dict[str, list[dict[str, str]]] = {"before": [], "after": []}
+
+        def before_hook(context: ToolCallHookContext) -> bool | None:
+            if context.tool_name.startswith("parallel_local_search_"):
+                hook_calls["before"].append(
+                    {
+                        "tool_name": context.tool_name,
+                        "query": str(context.tool_input.get("query", "")),
+                    }
+                )
+            return None
+
+        def after_hook(context: ToolCallHookContext) -> str | None:
+            if context.tool_name.startswith("parallel_local_search_"):
+                hook_calls["after"].append(
+                    {
+                        "tool_name": context.tool_name,
+                        "query": str(context.tool_input.get("query", "")),
+                    }
+                )
+            return None
+
+        register_before_tool_call_hook(before_hook)
+        register_after_tool_call_hook(after_hook)
+
+        try:
+            agent = Agent(
+                role="Parallel Tool Agent",
+                goal="Use both tools exactly as instructed",
+                backstory="You follow tool instructions precisely.",
+                tools=parallel_tools,
+                llm=LLM(model="gpt-5-nano", temperature=1),
+                verbose=False,
+                max_iter=3,
+            )
+            result = agent.kickoff(_parallel_prompt())
+
+            assert result is not None
+            _assert_tools_overlapped()
+
+            before_names = [call["tool_name"] for call in hook_calls["before"]]
+            after_names = [call["tool_name"] for call in hook_calls["after"]]
+            assert len(before_names) >= 3, "Expected before hooks for all parallel calls"
+            assert Counter(before_names) == Counter(after_names)
+            assert all(call["query"] for call in hook_calls["before"])
+            assert all(call["query"] for call in hook_calls["after"])
+        finally:
+            from crewai.hooks import (
+                unregister_after_tool_call_hook,
+                unregister_before_tool_call_hook,
+            )
+
+            unregister_before_tool_call_hook(before_hook)
+            unregister_after_tool_call_hook(after_hook)
+

 # =============================================================================
 # Anthropic Provider Tests
@@ -217,6 +518,46 @@ class TestAnthropicNativeToolCalling:
            assert mock_call.called
            assert result is not None

+    @pytest.mark.vcr()
+    def test_anthropic_parallel_native_tool_calling_test_crew(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="anthropic/claude-sonnet-4-6"),
+            verbose=False,
+            max_iter=3,
+        )
+        task = Task(
+            description=_parallel_prompt(),
+            expected_output="A one sentence summary of both tool outputs",
+            agent=agent,
+        )
+        crew = Crew(agents=[agent], tasks=[task])
+        result = crew.kickoff()
+        assert result is not None
+        _assert_tools_overlapped()
+
+    @pytest.mark.vcr()
+    def test_anthropic_parallel_native_tool_calling_test_agent_kickoff(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="anthropic/claude-sonnet-4-6"),
+            verbose=False,
+            max_iter=3,
+        )
+        result = agent.kickoff(_parallel_prompt())
+        assert result is not None
+        _assert_tools_overlapped()
+

 # =============================================================================
 # Google/Gemini Provider Tests
@@ -247,7 +588,7 @@ class TestGeminiNativeToolCalling:
            goal="Help users with mathematical calculations",
            backstory="You are a helpful math assistant.",
            tools=[calculator_tool],
-            llm=LLM(model="gemini/gemini-2.0-flash-exp"),
+            llm=LLM(model="gemini/gemini-2.5-flash"),
        )

        task = Task(
@@ -266,7 +607,7 @@ class TestGeminiNativeToolCalling:
        self, calculator_tool: CalculatorTool
    ) -> None:
        """Test Gemini agent kickoff with mocked LLM call."""
-        llm = LLM(model="gemini/gemini-2.0-flash-001")
+        llm = LLM(model="gemini/gemini-2.5-flash")

        with patch.object(llm, "call", return_value="The answer is 120.") as mock_call:
            agent = Agent(
@@ -290,6 +631,46 @@ class TestGeminiNativeToolCalling:
            assert mock_call.called
            assert result is not None

+    @pytest.mark.vcr()
+    def test_gemini_parallel_native_tool_calling_test_crew(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="gemini/gemini-2.5-flash"),
+            verbose=False,
+            max_iter=3,
+        )
+        task = Task(
+            description=_parallel_prompt(),
+            expected_output="A one sentence summary of both tool outputs",
+            agent=agent,
+        )
+        crew = Crew(agents=[agent], tasks=[task])
+        result = crew.kickoff()
+        assert result is not None
+        _assert_tools_overlapped()
+
+    @pytest.mark.vcr()
+    def test_gemini_parallel_native_tool_calling_test_agent_kickoff(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="gemini/gemini-2.5-flash"),
+            verbose=False,
+            max_iter=3,
+        )
+        result = agent.kickoff(_parallel_prompt())
+        assert result is not None
+        _assert_tools_overlapped()
+

 # =============================================================================
 # Azure Provider Tests
@@ -324,7 +705,7 @@ class TestAzureNativeToolCalling:
            goal="Help users with mathematical calculations",
            backstory="You are a helpful math assistant.",
            tools=[calculator_tool],
-            llm=LLM(model="azure/gpt-4o-mini"),
+            llm=LLM(model="azure/gpt-5-nano"),
            verbose=False,
            max_iter=3,
        )
@@ -347,7 +728,7 @@ class TestAzureNativeToolCalling:
    ) -> None:
        """Test Azure agent kickoff with mocked LLM call."""
        llm = LLM(
-            model="azure/gpt-4o-mini",
+            model="azure/gpt-5-nano",
            api_key="test-key",
            base_url="https://test.openai.azure.com",
        )
@@ -374,6 +755,46 @@ class TestAzureNativeToolCalling:
            assert mock_call.called
            assert result is not None

+    @pytest.mark.vcr()
+    def test_azure_parallel_native_tool_calling_test_crew(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="azure/gpt-5-nano"),
+            verbose=False,
+            max_iter=3,
+        )
+        task = Task(
+            description=_parallel_prompt(),
+            expected_output="A one sentence summary of both tool outputs",
+            agent=agent,
+        )
+        crew = Crew(agents=[agent], tasks=[task])
+        result = crew.kickoff()
+        assert result is not None
+        _assert_tools_overlapped()
+
+    @pytest.mark.vcr()
+    def test_azure_parallel_native_tool_calling_test_agent_kickoff(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="azure/gpt-5-nano"),
+            verbose=False,
+            max_iter=3,
+        )
+        result = agent.kickoff(_parallel_prompt())
+        assert result is not None
+        _assert_tools_overlapped()
+

 # =============================================================================
 # Bedrock Provider Tests
@@ -384,18 +805,30 @@ class TestBedrockNativeToolCalling:
    """Tests for native tool calling with AWS Bedrock models."""

    @pytest.fixture(autouse=True)
-    def mock_aws_env(self):
-        """Mock AWS environment variables for tests."""
-        env_vars = {
-        "AWS_ACCESS_KEY_ID": "test-key",
-        "AWS_SECRET_ACCESS_KEY": "test-secret",
-        "AWS_REGION": "us-east-1",
-        }
-        if "AWS_ACCESS_KEY_ID" not in os.environ:
-            with patch.dict(os.environ, env_vars):
-                yield
-        else:
-            yield
+    def validate_bedrock_credentials_for_live_recording(self):
+        """Skip unless RUN_BEDROCK_LIVE_TESTS=true and real AWS credentials are set."""
+        run_live_bedrock = os.getenv("RUN_BEDROCK_LIVE_TESTS", "false").lower() == "true"
+
+        if not run_live_bedrock:
+            pytest.skip(
+                "Skipping Bedrock tests by default. "
+                "Set RUN_BEDROCK_LIVE_TESTS=true with valid AWS credentials to enable."
+            )
+
+        access_key = os.getenv("AWS_ACCESS_KEY_ID", "")
+        secret_key = os.getenv("AWS_SECRET_ACCESS_KEY", "")
+        if (
+            not access_key
+            or not secret_key
+            or access_key.startswith(("fake-", "test-"))
+            or secret_key.startswith(("fake-", "test-"))
+        ):
+            pytest.skip(
+                "Skipping Bedrock tests: valid AWS credentials are required when "
+                "RUN_BEDROCK_LIVE_TESTS=true."
+            )
+
+        yield

    @pytest.mark.vcr()
    def test_bedrock_agent_kickoff_with_tools_mocked(
@@ -427,6 +860,46 @@ class TestBedrockNativeToolCalling:
        assert result.raw is not None
        assert "120" in str(result.raw)

+    @pytest.mark.vcr()
+    def test_bedrock_parallel_native_tool_calling_test_crew(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="bedrock/anthropic.claude-3-haiku-20240307-v1:0"),
+            verbose=False,
+            max_iter=3,
+        )
+        task = Task(
+            description=_parallel_prompt(),
+            expected_output="A one sentence summary of both tool outputs",
+            agent=agent,
+        )
+        crew = Crew(agents=[agent], tasks=[task])
+        result = crew.kickoff()
+        assert result is not None
+        _assert_tools_overlapped()
+
+    @pytest.mark.vcr()
+    def test_bedrock_parallel_native_tool_calling_test_agent_kickoff(
+        self, parallel_tools: list[BaseTool]
+    ) -> None:
+        agent = Agent(
+            role="Parallel Tool Agent",
+            goal="Use both tools exactly as instructed",
+            backstory="You follow tool instructions precisely.",
+            tools=parallel_tools,
+            llm=LLM(model="bedrock/anthropic.claude-3-haiku-20240307-v1:0"),
+            verbose=False,
+            max_iter=3,
+        )
+        result = agent.kickoff(_parallel_prompt())
+        assert result is not None
+        _assert_tools_overlapped()
+

 # =============================================================================
 # Cross-Provider Native Tool Calling Behavior Tests
@@ -439,7 +912,7 @@ class TestNativeToolCallingBehavior:
    def test_supports_function_calling_check(self) -> None:
        """Test that supports_function_calling() is properly checked."""
        # OpenAI should support function calling
-        openai_llm = LLM(model="gpt-4o-mini")
+        openai_llm = LLM(model="gpt-5-nano")
        assert hasattr(openai_llm, "supports_function_calling")
        assert openai_llm.supports_function_calling() is True

@@ -475,7 +948,7 @@ class TestNativeToolCallingTokenUsage:
            goal="Perform calculations efficiently",
            backstory="You calculate things.",
            tools=[calculator_tool],
-            llm=LLM(model="gpt-4o-mini"),
+            llm=LLM(model="gpt-5-nano"),
            verbose=False,
            max_iter=3,
        )
@@ -519,7 +992,7 @@ def test_native_tool_calling_error_handling(failing_tool: FailingTool):
        goal="Perform calculations efficiently",
        backstory="You calculate things.",
        tools=[failing_tool],
-        llm=LLM(model="gpt-4o-mini"),
+        llm=LLM(model="gpt-5-nano"),
        verbose=False,
        max_iter=3,
    )
@@ -578,7 +1051,7 @@ class TestMaxUsageCountWithNativeToolCalling:
            goal="Call the counting tool multiple times",
            backstory="You are an agent that counts things.",
            tools=[tool],
-            llm=LLM(model="gpt-4o-mini"),
+            llm=LLM(model="gpt-5-nano"),
            verbose=False,
            max_iter=5,
        )
@@ -606,7 +1079,7 @@ class TestMaxUsageCountWithNativeToolCalling:
            goal="Use the counting tool as many times as requested",
            backstory="You are an agent that counts things. You must try to use the tool for each value requested.",
            tools=[tool],
-            llm=LLM(model="gpt-4o-mini"),
+            llm=LLM(model="gpt-5-nano"),
            verbose=False,
            max_iter=5,
        )
@@ -638,7 +1111,7 @@ class TestMaxUsageCountWithNativeToolCalling:
            goal="Use the counting tool exactly as requested",
            backstory="You are an agent that counts things precisely.",
            tools=[tool],
-            llm=LLM(model="gpt-4o-mini"),
+            llm=LLM(model="gpt-5-nano"),
            verbose=False,
            max_iter=5,
        )
@@ -653,5 +1126,153 @@ class TestMaxUsageCountWithNativeToolCalling:
        result = crew.kickoff()

        assert result is not None
-        # Verify usage count was incremented for each successful call
-        assert tool.current_usage_count == 2
+        # Verify the requested calls occurred while keeping usage bounded.
+        assert tool.current_usage_count >= 2
+        assert tool.current_usage_count <= tool.max_usage_count
+
+
+# =============================================================================
+# JSON Parse Error Handling Tests
+# =============================================================================
+
+
+class TestNativeToolCallingJsonParseError:
+    """Tests that malformed JSON tool arguments produce clear errors
+    instead of silently dropping all arguments."""
+
+    def _make_executor(self, tools: list[BaseTool]) -> "CrewAgentExecutor":
+        """Create a minimal CrewAgentExecutor with mocked dependencies."""
+        from crewai.agents.crew_agent_executor import CrewAgentExecutor
+        from crewai.tools.base_tool import to_langchain
+
+        structured_tools = to_langchain(tools)
+        mock_agent = Mock()
+        mock_agent.key = "test_agent"
+        mock_agent.role = "tester"
+        mock_agent.verbose = False
+        mock_agent.fingerprint = None
+        mock_agent.tools_results = []
+
+        mock_task = Mock()
+        mock_task.name = "test"
+        mock_task.description = "test"
+        mock_task.id = "test-id"
+
+        executor = object.__new__(CrewAgentExecutor)
+        executor.agent = mock_agent
+        executor.task = mock_task
+        executor.crew = Mock()
+        executor.tools = structured_tools
+        executor.original_tools = tools
+        executor.tools_handler = None
+        executor._printer = Mock()
+        executor.messages = []
+
+        return executor
+
+    def test_malformed_json_returns_parse_error(self) -> None:
+        """Malformed JSON args must return a descriptive error, not silently become {}."""
+
+        class CodeTool(BaseTool):
+            name: str = "execute_code"
+            description: str = "Run code"
+
+            def _run(self, code: str) -> str:
+                return f"ran: {code}"
+
+        tool = CodeTool()
+        executor = self._make_executor([tool])
+
+        from crewai.utilities.agent_utils import convert_tools_to_openai_schema
+        _, available_functions = convert_tools_to_openai_schema([tool])
+
+        malformed_json = '{"code": "print("hello")"}'
+
+        result = executor._execute_single_native_tool_call(
+            call_id="call_123",
+            func_name="execute_code",
+            func_args=malformed_json,
+            available_functions=available_functions,
+        )
+
+        assert "Failed to parse tool arguments as JSON" in result["result"]
+        assert tool.current_usage_count == 0
+
+    def test_valid_json_still_executes_normally(self) -> None:
+        """Valid JSON args should execute the tool as before."""
+
+        class CodeTool(BaseTool):
+            name: str = "execute_code"
+            description: str = "Run code"
+
+            def _run(self, code: str) -> str:
+                return f"ran: {code}"
+
+        tool = CodeTool()
+        executor = self._make_executor([tool])
+
+        from crewai.utilities.agent_utils import convert_tools_to_openai_schema
+        _, available_functions = convert_tools_to_openai_schema([tool])
+
+        valid_json = '{"code": "print(1)"}'
+
+        result = executor._execute_single_native_tool_call(
+            call_id="call_456",
+            func_name="execute_code",
+            func_args=valid_json,
+            available_functions=available_functions,
+        )
+
+        assert result["result"] == "ran: print(1)"
+
+    def test_dict_args_bypass_json_parsing(self) -> None:
+        """When func_args is already a dict, no JSON parsing occurs."""
+
+        class CodeTool(BaseTool):
+            name: str = "execute_code"
+            description: str = "Run code"
+
+            def _run(self, code: str) -> str:
+                return f"ran: {code}"
+
+        tool = CodeTool()
+        executor = self._make_executor([tool])
+
+        from crewai.utilities.agent_utils import convert_tools_to_openai_schema
+        _, available_functions = convert_tools_to_openai_schema([tool])
+
+        result = executor._execute_single_native_tool_call(
+            call_id="call_789",
+            func_name="execute_code",
+            func_args={"code": "x = 42"},
+            available_functions=available_functions,
+        )
+
+        assert result["result"] == "ran: x = 42"
+
+    def test_schema_validation_catches_missing_args_on_native_path(self) -> None:
+        """The native function calling path should now enforce args_schema,
+        catching missing required fields before _run is called."""
+
+        class StrictTool(BaseTool):
+            name: str = "strict_tool"
+            description: str = "A tool with required args"
+
+            def _run(self, code: str, language: str) -> str:
+                return f"{language}: {code}"
+
+        tool = StrictTool()
+        executor = self._make_executor([tool])
+
+        from crewai.utilities.agent_utils import convert_tools_to_openai_schema
+        _, available_functions = convert_tools_to_openai_schema([tool])
+
+        result = executor._execute_single_native_tool_call(
+            call_id="call_schema",
+            func_name="strict_tool",
+            func_args={"code": "print(1)"},
+            available_functions=available_functions,
+        )
+
+        assert "Error" in result["result"]
+        assert "validation failed" in result["result"].lower() or "missing" in result["result"].lower()
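Note on the JSON parse error tests above: the behavior they pin down can be sketched in isolation. The helper below is a minimal illustration, not crewAI's implementation. `parse_tool_args` and its `(args, error)` return shape are hypothetical names chosen for this example; only the "Failed to parse tool arguments as JSON" message prefix is taken from the assertions in the tests. It shows the three cases covered: malformed JSON yields a descriptive error instead of an empty dict, valid JSON is decoded, and a dict is passed through without parsing.

```python
from __future__ import annotations

import json
from typing import Any


def parse_tool_args(
    func_args: str | dict[str, Any],
) -> tuple[dict[str, Any] | None, str | None]:
    """Return (args, error) for raw tool-call arguments.

    Hypothetical helper for illustration only; it is not part of crewAI.
    """
    # Arguments that are already a dict need no JSON parsing.
    if isinstance(func_args, dict):
        return func_args, None

    try:
        parsed = json.loads(func_args)
    except json.JSONDecodeError as exc:
        # Report the failure rather than silently falling back to {}.
        return None, f"Failed to parse tool arguments as JSON: {exc}"

    # Assumption of this sketch: tool arguments must decode to a JSON object.
    if not isinstance(parsed, dict):
        return None, "Tool arguments must be a JSON object"

    return parsed, None


if __name__ == "__main__":
    # The inner quotes around hello are unescaped, so this is invalid JSON.
    print(parse_tool_args('{"code": "print("hello")"}'))
    print(parse_tool_args('{"code": "print(1)"}'))
    print(parse_tool_args({"code": "x = 42"}))
```

Returning the error as a value lets the caller hand the message back to the model as the tool result, so the tool itself is never invoked with missing arguments. That is consistent with the `tool.current_usage_count == 0` assertion in the malformed-JSON test.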
--- a/lib/crewai/tests/cassettes/agents/TestAnthropicNativeToolCalling.test_anthropic_parallel_native_tool_calling_test_agent_kickoff.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestAnthropicNativeToolCalling.test_anthropic_parallel_native_tool_calling_test_agent_kickoff.yaml
@@ -0,0 +1,247 @@
+interactions:
+- request:
+    body: '{"max_tokens":4096,"messages":[{"role":"user","content":"\nCurrent Task:
+      This is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}],"model":"claude-sonnet-4-6","stop_sequences":["\nObservation:"],"stream":false,"system":"You
+      are Parallel Tool Agent. You follow tool instructions precisely.\nYour personal
+      goal is: Use both tools exactly as instructed","tools":[{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      anthropic-version:
+      - '2023-06-01'
+      connection:
+      - keep-alive
+      content-length:
+      - '1639'
+      content-type:
+      - application/json
+      host:
+      - api.anthropic.com
+      x-api-key:
+      - X-API-KEY-XXX
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 0.73.0
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+      x-stainless-timeout:
+      - NOT_GIVEN
+    method: POST
+    uri: https://api.anthropic.com/v1/messages
+  response:
+    body:
+      string: '{"model":"claude-sonnet-4-6","id":"msg_01XeN1XTXZgmPyLMMGjivabb","type":"message","role":"assistant","content":[{"type":"text","text":"I''ll
+        execute all 3 parallel searches simultaneously right now!"},{"type":"tool_use","id":"toolu_01NwzvrxEz6tvT3A8ydvMtHu","name":"parallel_local_search_one","input":{"query":"latest
+        OpenAI model release notes"},"caller":{"type":"direct"}},{"type":"tool_use","id":"toolu_01YCxzSB1suk9uPVC1uwfHz9","name":"parallel_local_search_two","input":{"query":"latest
+        Anthropic model release notes"},"caller":{"type":"direct"}},{"type":"tool_use","id":"toolu_01Mauvxzv58eDY7pUt9HMKGy","name":"parallel_local_search_three","input":{"query":"latest
+        Gemini model release notes"},"caller":{"type":"direct"}}],"stop_reason":"tool_use","stop_sequence":null,"usage":{"input_tokens":914,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":169,"service_tier":"standard","inference_geo":"global"}}'
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Security-Policy:
+      - CSP-FILTERED
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:54:43 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Robots-Tag:
+      - none
+      anthropic-organization-id:
+      - ANTHROPIC-ORGANIZATION-ID-XXX
+      anthropic-ratelimit-input-tokens-limit:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-input-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-input-tokens-reset:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-output-tokens-limit:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-output-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-output-tokens-reset:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-requests-limit:
+      - '20000'
+      anthropic-ratelimit-requests-remaining:
+      - '19999'
+      anthropic-ratelimit-requests-reset:
+      - '2026-02-18T23:54:41Z'
+      anthropic-ratelimit-tokens-limit:
+      - ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-tokens-reset:
+      - ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
+      cf-cache-status:
+      - DYNAMIC
+      request-id:
+      - REQUEST-ID-XXX
+      strict-transport-security:
+      - STS-XXX
+      x-envoy-upstream-service-time:
+      - '2099'
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"max_tokens":4096,"messages":[{"role":"user","content":"\nCurrent Task:
+      This is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."},{"role":"assistant","content":[{"type":"tool_use","id":"toolu_01NwzvrxEz6tvT3A8ydvMtHu","name":"parallel_local_search_one","input":{"query":"latest
+      OpenAI model release notes"}},{"type":"tool_use","id":"toolu_01YCxzSB1suk9uPVC1uwfHz9","name":"parallel_local_search_two","input":{"query":"latest
+      Anthropic model release notes"}},{"type":"tool_use","id":"toolu_01Mauvxzv58eDY7pUt9HMKGy","name":"parallel_local_search_three","input":{"query":"latest
+      Gemini model release notes"}}]},{"role":"user","content":[{"type":"tool_result","tool_use_id":"toolu_01NwzvrxEz6tvT3A8ydvMtHu","content":"[one]
+      latest OpenAI model release notes"},{"type":"tool_result","tool_use_id":"toolu_01YCxzSB1suk9uPVC1uwfHz9","content":"[two]
+      latest Anthropic model release notes"},{"type":"tool_result","tool_use_id":"toolu_01Mauvxzv58eDY7pUt9HMKGy","content":"[three]
+      latest Gemini model release notes"}]}],"model":"claude-sonnet-4-6","stop_sequences":["\nObservation:"],"stream":false,"system":"You
+      are Parallel Tool Agent. You follow tool instructions precisely.\nYour personal
+      goal is: Use both tools exactly as instructed","tools":[{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      anthropic-version:
+      - '2023-06-01'
+      connection:
+      - keep-alive
+      content-length:
+      - '2517'
+      content-type:
+      - application/json
+      host:
+      - api.anthropic.com
+      x-api-key:
+      - X-API-KEY-XXX
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 0.73.0
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+      x-stainless-timeout:
+      - NOT_GIVEN
+    method: POST
+    uri: https://api.anthropic.com/v1/messages
+  response:
+    body:
+      string: "{\"model\":\"claude-sonnet-4-6\",\"id\":\"msg_01PFXqwwdwwHWadPdtNU5tUZ\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"The
+        three parallel searches were executed successfully, each targeting the latest
+        release notes for the leading AI model families. The search results confirm
+        that queries were dispatched simultaneously to retrieve the most recent developments
+        from **OpenAI** (via tool one), **Anthropic** (via tool two), and **Google's
+        Gemini** (via tool three). While the local search tools returned placeholder
+        outputs in this test environment rather than detailed release notes, the structure
+        of the test validates that all three parallel tool calls were emitted correctly
+        and in the specified order \u2014 demonstrating proper concurrent tool-call
+        behavior with no dependencies between the three independent searches.\"}],\"stop_reason\":\"end_turn\",\"stop_sequence\":null,\"usage\":{\"input_tokens\":1197,\"cache_creation_input_tokens\":0,\"cache_read_input_tokens\":0,\"cache_creation\":{\"ephemeral_5m_input_tokens\":0,\"ephemeral_1h_input_tokens\":0},\"output_tokens\":131,\"service_tier\":\"standard\",\"inference_geo\":\"global\"}}"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Security-Policy:
+      - CSP-FILTERED
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:54:49 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Robots-Tag:
+      - none
+      anthropic-organization-id:
+      - ANTHROPIC-ORGANIZATION-ID-XXX
+      anthropic-ratelimit-input-tokens-limit:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-input-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-input-tokens-reset:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-output-tokens-limit:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-output-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-output-tokens-reset:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-requests-limit:
+      - '20000'
+      anthropic-ratelimit-requests-remaining:
+      - '19999'
+      anthropic-ratelimit-requests-reset:
+      - '2026-02-18T23:54:44Z'
+      anthropic-ratelimit-tokens-limit:
+      - ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-tokens-reset:
+      - ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
+      cf-cache-status:
+      - DYNAMIC
+      request-id:
+      - REQUEST-ID-XXX
+      strict-transport-security:
+      - STS-XXX
+      x-envoy-upstream-service-time:
+      - '4092'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestAnthropicNativeToolCalling.test_anthropic_parallel_native_tool_calling_test_crew.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestAnthropicNativeToolCalling.test_anthropic_parallel_native_tool_calling_test_crew.yaml
@@ -0,0 +1,254 @@
+interactions:
+- request:
+    body: '{"max_tokens":4096,"messages":[{"role":"user","content":"\nCurrent Task:
+      This is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}],"model":"claude-sonnet-4-6","stop_sequences":["\nObservation:"],"stream":false,"system":"You
+      are Parallel Tool Agent. You follow tool instructions precisely.\nYour personal
+      goal is: Use both tools exactly as instructed","tools":[{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      anthropic-version:
+      - '2023-06-01'
+      connection:
+      - keep-alive
+      content-length:
+      - '1820'
+      content-type:
+      - application/json
+      host:
+      - api.anthropic.com
+      x-api-key:
+      - X-API-KEY-XXX
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 0.73.0
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+      x-stainless-timeout:
+      - NOT_GIVEN
+    method: POST
+    uri: https://api.anthropic.com/v1/messages
+  response:
+    body:
+      string: '{"model":"claude-sonnet-4-6","id":"msg_01RJ4CphwpmkmsJFJjeCNvXz","type":"message","role":"assistant","content":[{"type":"text","text":"I''ll
+        execute all 3 parallel tool calls simultaneously right away!"},{"type":"tool_use","id":"toolu_01YWY3cSomRuv4USmq55Prk3","name":"parallel_local_search_one","input":{"query":"latest
+        OpenAI model release notes"},"caller":{"type":"direct"}},{"type":"tool_use","id":"toolu_01Aaqj3LMXksE1nB3pscRhV5","name":"parallel_local_search_two","input":{"query":"latest
+        Anthropic model release notes"},"caller":{"type":"direct"}},{"type":"tool_use","id":"toolu_01AcYxQvy8aYmAoUg9zx9qfq","name":"parallel_local_search_three","input":{"query":"latest
+        Gemini model release notes"},"caller":{"type":"direct"}}],"stop_reason":"tool_use","stop_sequence":null,"usage":{"input_tokens":951,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":170,"service_tier":"standard","inference_geo":"global"}}'
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Security-Policy:
+      - CSP-FILTERED
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:54:51 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Robots-Tag:
+      - none
+      anthropic-organization-id:
+      - ANTHROPIC-ORGANIZATION-ID-XXX
+      anthropic-ratelimit-input-tokens-limit:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-input-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-input-tokens-reset:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-output-tokens-limit:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-output-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-output-tokens-reset:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-requests-limit:
+      - '20000'
+      anthropic-ratelimit-requests-remaining:
+      - '19999'
+      anthropic-ratelimit-requests-reset:
+      - '2026-02-18T23:54:49Z'
+      anthropic-ratelimit-tokens-limit:
+      - ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-tokens-reset:
+      - ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
+      cf-cache-status:
+      - DYNAMIC
+      request-id:
+      - REQUEST-ID-XXX
+      strict-transport-security:
+      - STS-XXX
+      x-envoy-upstream-service-time:
+      - '1967'
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"max_tokens":4096,"messages":[{"role":"user","content":"\nCurrent Task:
+      This is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."},{"role":"assistant","content":[{"type":"tool_use","id":"toolu_01YWY3cSomRuv4USmq55Prk3","name":"parallel_local_search_one","input":{"query":"latest
+      OpenAI model release notes"}},{"type":"tool_use","id":"toolu_01Aaqj3LMXksE1nB3pscRhV5","name":"parallel_local_search_two","input":{"query":"latest
+      Anthropic model release notes"}},{"type":"tool_use","id":"toolu_01AcYxQvy8aYmAoUg9zx9qfq","name":"parallel_local_search_three","input":{"query":"latest
+      Gemini model release notes"}}]},{"role":"user","content":[{"type":"tool_result","tool_use_id":"toolu_01YWY3cSomRuv4USmq55Prk3","content":"[one]
+      latest OpenAI model release notes"},{"type":"tool_result","tool_use_id":"toolu_01Aaqj3LMXksE1nB3pscRhV5","content":"[two]
+      latest Anthropic model release notes"},{"type":"tool_result","tool_use_id":"toolu_01AcYxQvy8aYmAoUg9zx9qfq","content":"[three]
+      latest Gemini model release notes"}]},{"role":"user","content":"Analyze the
+      tool result. If requirements are met, provide the Final Answer. Otherwise, call
+      the next tool. Deliver only the answer without meta-commentary."}],"model":"claude-sonnet-4-6","stop_sequences":["\nObservation:"],"stream":false,"system":"You
+      are Parallel Tool Agent. You follow tool instructions precisely.\nYour personal
+      goal is: Use both tools exactly as instructed","tools":[{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}},{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","input_schema":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      anthropic-version:
+      - '2023-06-01'
+      connection:
+      - keep-alive
+      content-length:
+      - '2882'
+      content-type:
+      - application/json
+      host:
+      - api.anthropic.com
+      x-api-key:
+      - X-API-KEY-XXX
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 0.73.0
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+      x-stainless-timeout:
+      - NOT_GIVEN
+    method: POST
+    uri: https://api.anthropic.com/v1/messages
+  response:
+    body:
+      string: "{\"model\":\"claude-sonnet-4-6\",\"id\":\"msg_0143MHUne1az3Tt69EoLjyZd\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"text\",\"text\":\"Here
+        is the complete content returned from all three tool calls:\\n\\n- **parallel_local_search_one**
+        result: `[one] latest OpenAI model release notes`\\n- **parallel_local_search_two**
+        result: `[two] latest Anthropic model release notes`\\n- **parallel_local_search_three**
+        result: `[three] latest Gemini model release notes`\\n\\nAll three parallel
+        tool calls were executed successfully in the same response turn, returning
+        their respective outputs: the first tool searched for the latest OpenAI model
+        release notes, the second tool searched for the latest Anthropic model release
+        notes, and the third tool searched for the latest Gemini model release notes
+        \u2014 confirming that all search queries were dispatched concurrently and
+        their results retrieved as expected.\"}],\"stop_reason\":\"end_turn\",\"stop_sequence\":null,\"usage\":{\"input_tokens\":1272,\"cache_creation_input_tokens\":0,\"cache_read_input_tokens\":0,\"cache_creation\":{\"ephemeral_5m_input_tokens\":0,\"ephemeral_1h_input_tokens\":0},\"output_tokens\":172,\"service_tier\":\"standard\",\"inference_geo\":\"global\"}}"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Security-Policy:
+      - CSP-FILTERED
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:54:55 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Robots-Tag:
+      - none
+      anthropic-organization-id:
+      - ANTHROPIC-ORGANIZATION-ID-XXX
+      anthropic-ratelimit-input-tokens-limit:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-input-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-input-tokens-reset:
+      - ANTHROPIC-RATELIMIT-INPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-output-tokens-limit:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-output-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-output-tokens-reset:
+      - ANTHROPIC-RATELIMIT-OUTPUT-TOKENS-RESET-XXX
+      anthropic-ratelimit-requests-limit:
+      - '20000'
+      anthropic-ratelimit-requests-remaining:
+      - '19999'
+      anthropic-ratelimit-requests-reset:
+      - '2026-02-18T23:54:52Z'
+      anthropic-ratelimit-tokens-limit:
+      - ANTHROPIC-RATELIMIT-TOKENS-LIMIT-XXX
+      anthropic-ratelimit-tokens-remaining:
+      - ANTHROPIC-RATELIMIT-TOKENS-REMAINING-XXX
+      anthropic-ratelimit-tokens-reset:
+      - ANTHROPIC-RATELIMIT-TOKENS-RESET-XXX
+      cf-cache-status:
+      - DYNAMIC
+      request-id:
+      - REQUEST-ID-XXX
+      strict-transport-security:
+      - STS-XXX
+      x-envoy-upstream-service-time:
+      - '3144'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestAzureNativeToolCalling.test_azure_agent_with_native_tool_calling.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestAzureNativeToolCalling.test_azure_agent_with_native_tool_calling.yaml
@@ -5,20 +5,19 @@ interactions:
      calculations"}, {"role": "user", "content": "\nCurrent Task: Calculate what
      is 15 * 8\n\nThis is the expected criteria for your final answer: The result
      of the calculation\nyou MUST return the actual complete content as the final
-      answer, not a summary.\n\nThis is VERY important to you, your job depends on
-      it!"}], "stream": false, "stop": ["\nObservation:"], "tool_choice": "auto",
-      "tools": [{"function": {"name": "calculator", "description": "Perform mathematical
-      calculations. Use this for any math operations.", "parameters": {"properties":
-      {"expression": {"description": "Mathematical expression to evaluate", "title":
-      "Expression", "type": "string"}}, "required": ["expression"], "type": "object"}},
-      "type": "function"}]}'
+      answer, not a summary."}], "stream": false, "tool_choice": "auto", "tools":
+      [{"function": {"name": "calculator", "description": "Perform mathematical calculations.
+      Use this for any math operations.", "parameters": {"properties": {"expression":
+      {"description": "Mathematical expression to evaluate", "title": "Expression",
+      "type": "string"}}, "required": ["expression"], "type": "object", "additionalProperties":
+      false}}, "type": "function"}]}'
    headers:
      Accept:
      - application/json
      Connection:
      - keep-alive
      Content-Length:
-      - '883'
+      - '828'
      Content-Type:
      - application/json
      User-Agent:
@@ -32,20 +31,20 @@ interactions:
      x-ms-client-request-id:
      - X-MS-CLIENT-REQUEST-ID-XXX
    method: POST
-    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-12-01-preview
+    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-5-nano/chat/completions?api-version=2024-12-01-preview
  response:
    body:
      string: '{"choices":[{"content_filter_results":{},"finish_reason":"tool_calls","index":0,"logprobs":null,"message":{"annotations":[],"content":null,"refusal":null,"role":"assistant","tool_calls":[{"function":{"arguments":"{\"expression\":\"15
-        * 8\"}","name":"calculator"},"id":"call_cJWzKh5LdBpY3Sk8GATS3eRe","type":"function"}]}}],"created":1769122114,"id":"chatcmpl-D0xlavS0V3m00B9Fsjyv39xQWUGFV","model":"gpt-4o-mini-2024-07-18","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":"fp_f97eff32c5","usage":{"completion_tokens":18,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0},"prompt_tokens":137,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":155}}
+        * 8\"}","name":"calculator"},"id":"call_Cow46pNllpDx0pxUgZFeqlh1","type":"function"}]}}],"created":1771459544,"id":"chatcmpl-DAlq4osCP9ABJ1HyXFBoYWylMg0bi","model":"gpt-5-nano-2025-08-07","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":null,"usage":{"completion_tokens":219,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":192,"rejected_prediction_tokens":0},"prompt_tokens":208,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":427}}

        '
    headers:
      Content-Length:
-      - '1058'
+      - '1049'
      Content-Type:
      - application/json
      Date:
-      - Thu, 22 Jan 2026 22:48:34 GMT
+      - Thu, 19 Feb 2026 00:05:45 GMT
      Strict-Transport-Security:
      - STS-XXX
      apim-request-id:
@@ -59,7 +58,7 @@ interactions:
      x-ms-client-request-id:
      - X-MS-CLIENT-REQUEST-ID-XXX
      x-ms-deployment-name:
-      - gpt-4o-mini
+      - gpt-5-nano
      x-ms-rai-invoked:
      - 'true'
      x-ms-region:
@@ -83,26 +82,25 @@ interactions:
      calculations"}, {"role": "user", "content": "\nCurrent Task: Calculate what
      is 15 * 8\n\nThis is the expected criteria for your final answer: The result
      of the calculation\nyou MUST return the actual complete content as the final
-      answer, not a summary.\n\nThis is VERY important to you, your job depends on
-      it!"}, {"role": "assistant", "content": "", "tool_calls": [{"id": "call_cJWzKh5LdBpY3Sk8GATS3eRe",
-      "type": "function", "function": {"name": "calculator", "arguments": "{\"expression\":\"15
-      * 8\"}"}}]}, {"role": "tool", "tool_call_id": "call_cJWzKh5LdBpY3Sk8GATS3eRe",
-      "content": "The result of 15 * 8 is 120"}, {"role": "user", "content": "Analyze
-      the tool result. If requirements are met, provide the Final Answer. Otherwise,
-      call the next tool. Deliver only the answer without meta-commentary."}], "stream":
-      false, "stop": ["\nObservation:"], "tool_choice": "auto", "tools": [{"function":
-      {"name": "calculator", "description": "Perform mathematical calculations. Use
-      this for any math operations.", "parameters": {"properties": {"expression":
-      {"description": "Mathematical expression to evaluate", "title": "Expression",
-      "type": "string"}}, "required": ["expression"], "type": "object"}}, "type":
-      "function"}]}'
+      answer, not a summary."}, {"role": "assistant", "content": "", "tool_calls":
+      [{"id": "call_Cow46pNllpDx0pxUgZFeqlh1", "type": "function", "function": {"name":
+      "calculator", "arguments": "{\"expression\":\"15 * 8\"}"}}]}, {"role": "tool",
+      "tool_call_id": "call_Cow46pNllpDx0pxUgZFeqlh1", "content": "The result of 15
+      * 8 is 120"}, {"role": "user", "content": "Analyze the tool result. If requirements
+      are met, provide the Final Answer. Otherwise, call the next tool. Deliver only
+      the answer without meta-commentary."}], "stream": false, "tool_choice": "auto",
+      "tools": [{"function": {"name": "calculator", "description": "Perform mathematical
+      calculations. Use this for any math operations.", "parameters": {"properties":
+      {"expression": {"description": "Mathematical expression to evaluate", "title":
+      "Expression", "type": "string"}}, "required": ["expression"], "type": "object",
+      "additionalProperties": false}}, "type": "function"}]}'
    headers:
      Accept:
      - application/json
      Connection:
      - keep-alive
      Content-Length:
-      - '1375'
+      - '1320'
      Content-Type:
      - application/json
      User-Agent:
@@ -116,20 +114,19 @@ interactions:
      x-ms-client-request-id:
      - X-MS-CLIENT-REQUEST-ID-XXX
    method: POST
-    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-12-01-preview
+    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-5-nano/chat/completions?api-version=2024-12-01-preview
  response:
    body:
-      string: '{"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_code":{"filtered":false,"detected":false},"protected_material_text":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"finish_reason":"stop","index":0,"logprobs":null,"message":{"annotations":[],"content":"The
-        result of the calculation is 120.","refusal":null,"role":"assistant"}}],"created":1769122115,"id":"chatcmpl-D0xlbUNVA7RVkn0GsuBGoNhgQTtac","model":"gpt-4o-mini-2024-07-18","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":"fp_f97eff32c5","usage":{"completion_tokens":11,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0},"prompt_tokens":207,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":218}}
+      string: '{"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_code":{"filtered":false,"detected":false},"protected_material_text":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"finish_reason":"stop","index":0,"logprobs":null,"message":{"annotations":[],"content":"120","refusal":null,"role":"assistant"}}],"created":1771459547,"id":"chatcmpl-DAlq7zJimnIMoXieNww8jY5f2pIPd","model":"gpt-5-nano-2025-08-07","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":null,"usage":{"completion_tokens":203,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":192,"rejected_prediction_tokens":0},"prompt_tokens":284,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":487}}

        '
    headers:
      Content-Length:
-      - '1250'
+      - '1207'
      Content-Type:
      - application/json
      Date:
-      - Thu, 22 Jan 2026 22:48:34 GMT
+      - Thu, 19 Feb 2026 00:05:49 GMT
      Strict-Transport-Security:
      - STS-XXX
      apim-request-id:
@@ -143,7 +140,7 @@ interactions:
      x-ms-client-request-id:
      - X-MS-CLIENT-REQUEST-ID-XXX
      x-ms-deployment-name:
-      - gpt-4o-mini
+      - gpt-5-nano
      x-ms-rai-invoked:
      - 'true'
      x-ms-region:
--- a/lib/crewai/tests/cassettes/agents/TestAzureNativeToolCalling.test_azure_parallel_native_tool_calling_test_agent_kickoff.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestAzureNativeToolCalling.test_azure_parallel_native_tool_calling_test_agent_kickoff.yaml
@@ -0,0 +1,198 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "system", "content": "You are Parallel Tool Agent.
+      You follow tool instructions precisely.\nYour personal goal is: Use both tools
+      exactly as instructed"}, {"role": "user", "content": "\nCurrent Task: This is
+      a tool-calling compliance test. In your next assistant turn, emit exactly 3
+      tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}], "stream": false, "tool_choice": "auto", "tools": [{"function":
+      {"name": "parallel_local_search_one", "description": "Local search tool #1 for
+      concurrency testing.", "parameters": {"properties": {"query": {"description":
+      "Search query", "title": "Query", "type": "string"}}, "required": ["query"],
+      "type": "object", "additionalProperties": false}}, "type": "function"}, {"function":
+      {"name": "parallel_local_search_two", "description": "Local search tool #2 for
+      concurrency testing.", "parameters": {"properties": {"query": {"description":
+      "Search query", "title": "Query", "type": "string"}}, "required": ["query"],
+      "type": "object", "additionalProperties": false}}, "type": "function"}, {"function":
+      {"name": "parallel_local_search_three", "description": "Local search tool #3
+      for concurrency testing.", "parameters": {"properties": {"query": {"description":
+      "Search query", "title": "Query", "type": "string"}}, "required": ["query"],
+      "type": "object", "additionalProperties": false}}, "type": "function"}]}'
+    headers:
+      Accept:
+      - application/json
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '1763'
+      Content-Type:
+      - application/json
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      api-key:
+      - X-API-KEY-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+    method: POST
+    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-5-nano/chat/completions?api-version=2024-12-01-preview
+  response:
+    body:
+      string: '{"choices":[{"content_filter_results":{},"finish_reason":"tool_calls","index":0,"logprobs":null,"message":{"annotations":[],"content":null,"refusal":null,"role":"assistant","tool_calls":[{"function":{"arguments":"{\"query\":
+        \"latest OpenAI model release notes\"}","name":"parallel_local_search_one"},"id":"call_emQmocGydKuxvESfQopNngdm","type":"function"},{"function":{"arguments":"{\"query\":
+        \"latest Anthropic model release notes\"}","name":"parallel_local_search_two"},"id":"call_eNpK9WUYFCX2ZEUPhYCKvdMs","type":"function"},{"function":{"arguments":"{\"query\":
+        \"latest Gemini model release notes\"}","name":"parallel_local_search_three"},"id":"call_Wdtl6jFxGehSUMn5I1O4Mrdx","type":"function"}]}}],"created":1771459550,"id":"chatcmpl-DAlqAyJGnQKDkNCaTcjU2T8BeJaXM","model":"gpt-5-nano-2025-08-07","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":null,"usage":{"completion_tokens":666,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":576,"rejected_prediction_tokens":0},"prompt_tokens":343,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":1009}}
+
+        '
+    headers:
+      Content-Length:
+      - '1433'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:05:55 GMT
+      Strict-Transport-Security:
+      - STS-XXX
+      apim-request-id:
+      - APIM-REQUEST-ID-XXX
+      azureml-model-session:
+      - AZUREML-MODEL-SESSION-XXX
+      x-accel-buffering:
+      - 'no'
+      x-content-type-options:
+      - X-CONTENT-TYPE-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+      x-ms-deployment-name:
+      - gpt-5-nano
+      x-ms-rai-invoked:
+      - 'true'
+      x-ms-region:
+      - X-MS-REGION-XXX
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"messages": [{"role": "system", "content": "You are Parallel Tool Agent.
+      You follow tool instructions precisely.\nYour personal goal is: Use both tools
+      exactly as instructed"}, {"role": "user", "content": "\nCurrent Task: This is
+      a tool-calling compliance test. In your next assistant turn, emit exactly 3
+      tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}, {"role": "assistant", "content": "", "tool_calls": [{"id":
+      "call_emQmocGydKuxvESfQopNngdm", "type": "function", "function": {"name": "parallel_local_search_one",
+      "arguments": "{\"query\": \"latest OpenAI model release notes\"}"}}, {"id":
+      "call_eNpK9WUYFCX2ZEUPhYCKvdMs", "type": "function", "function": {"name": "parallel_local_search_two",
+      "arguments": "{\"query\": \"latest Anthropic model release notes\"}"}}, {"id":
+      "call_Wdtl6jFxGehSUMn5I1O4Mrdx", "type": "function", "function": {"name": "parallel_local_search_three",
+      "arguments": "{\"query\": \"latest Gemini model release notes\"}"}}]}, {"role":
+      "tool", "tool_call_id": "call_emQmocGydKuxvESfQopNngdm", "content": "[one] latest
+      OpenAI model release notes"}, {"role": "tool", "tool_call_id": "call_eNpK9WUYFCX2ZEUPhYCKvdMs",
+      "content": "[two] latest Anthropic model release notes"}, {"role": "tool", "tool_call_id":
+      "call_Wdtl6jFxGehSUMn5I1O4Mrdx", "content": "[three] latest Gemini model release
+      notes"}], "stream": false, "tool_choice": "auto", "tools": [{"function": {"name":
+      "parallel_local_search_one", "description": "Local search tool #1 for concurrency
+      testing.", "parameters": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}, "type": "function"}, {"function": {"name":
+      "parallel_local_search_two", "description": "Local search tool #2 for concurrency
+      testing.", "parameters": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}, "type": "function"}, {"function": {"name":
+      "parallel_local_search_three", "description": "Local search tool #3 for concurrency
+      testing.", "parameters": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}, "type": "function"}]}'
+    headers:
+      Accept:
+      - application/json
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '2727'
+      Content-Type:
+      - application/json
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      api-key:
+      - X-API-KEY-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+    method: POST
+    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-5-nano/chat/completions?api-version=2024-12-01-preview
+  response:
+    body:
+      string: '{"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_code":{"filtered":false,"detected":false},"protected_material_text":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"finish_reason":"stop","index":0,"logprobs":null,"message":{"annotations":[],"content":"The
+        latest release notes have been published for the OpenAI, Anthropic, and Gemini
+        models, signaling concurrent updates across the leading AI model families.
+        Each set outlines new capabilities and performance improvements, along with
+        changes to APIs, tooling, and deployment guidelines. Users should review the
+        individual notes to understand new features, adjustments to tokenization,
+        latency or throughput, safety and alignment enhancements, pricing or access
+        changes, and any breaking changes or migration steps required to adopt the
+        updated models in existing workflows.","refusal":null,"role":"assistant"}}],"created":1771459556,"id":"chatcmpl-DAlqGKWXfGNlTIbDY9F6oHQp6hbxM","model":"gpt-5-nano-2025-08-07","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":null,"usage":{"completion_tokens":747,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":640,"rejected_prediction_tokens":0},"prompt_tokens":467,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":1214}}
+
+        '
+    headers:
+      Content-Length:
+      - '1778'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:06:02 GMT
+      Strict-Transport-Security:
+      - STS-XXX
+      apim-request-id:
+      - APIM-REQUEST-ID-XXX
+      azureml-model-session:
+      - AZUREML-MODEL-SESSION-XXX
+      x-accel-buffering:
+      - 'no'
+      x-content-type-options:
+      - X-CONTENT-TYPE-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+      x-ms-deployment-name:
+      - gpt-5-nano
+      x-ms-rai-invoked:
+      - 'true'
+      x-ms-region:
+      - X-MS-REGION-XXX
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestAzureNativeToolCalling.test_azure_parallel_native_tool_calling_test_crew.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestAzureNativeToolCalling.test_azure_parallel_native_tool_calling_test_crew.yaml
@@ -0,0 +1,201 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "system", "content": "You are Parallel Tool Agent.
+      You follow tool instructions precisely.\nYour personal goal is: Use both tools
+      exactly as instructed"}, {"role": "user", "content": "\nCurrent Task: This is
+      a tool-calling compliance test. In your next assistant turn, emit exactly 3
+      tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}], "stream": false, "tool_choice":
+      "auto", "tools": [{"function": {"name": "parallel_local_search_one", "description":
+      "Local search tool #1 for concurrency testing.", "parameters": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, "type":
+      "function"}, {"function": {"name": "parallel_local_search_two", "description":
+      "Local search tool #2 for concurrency testing.", "parameters": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, "type":
+      "function"}, {"function": {"name": "parallel_local_search_three", "description":
+      "Local search tool #3 for concurrency testing.", "parameters": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, "type":
+      "function"}]}'
+    headers:
+      Accept:
+      - application/json
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '1944'
+      Content-Type:
+      - application/json
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      api-key:
+      - X-API-KEY-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+    method: POST
+    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-5-nano/chat/completions?api-version=2024-12-01-preview
+  response:
+    body:
+      string: '{"choices":[{"content_filter_results":{},"finish_reason":"tool_calls","index":0,"logprobs":null,"message":{"annotations":[],"content":null,"refusal":null,"role":"assistant","tool_calls":[{"function":{"arguments":"{\"query\":
+        \"latest OpenAI model release notes\"}","name":"parallel_local_search_one"},"id":"call_NEvGoF86nhPQfXRoJd5SOyLd","type":"function"},{"function":{"arguments":"{\"query\":
+        \"latest Anthropic model release notes\"}","name":"parallel_local_search_two"},"id":"call_q8Q2du4gAMQLrGTgWgfwfbDZ","type":"function"},{"function":{"arguments":"{\"query\":
+        \"latest Gemini model release notes\"}","name":"parallel_local_search_three"},"id":"call_yTBal9ofZzuo10j0pWqhHCSj","type":"function"}]}}],"created":1771459563,"id":"chatcmpl-DAlqN7kyC5ACI5Yl1Pj63rOH5HIvI","model":"gpt-5-nano-2025-08-07","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":null,"usage":{"completion_tokens":2457,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":2368,"rejected_prediction_tokens":0},"prompt_tokens":378,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":2835}}
+
+        '
+    headers:
+      Content-Length:
+      - '1435'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:06:17 GMT
+      Strict-Transport-Security:
+      - STS-XXX
+      apim-request-id:
+      - APIM-REQUEST-ID-XXX
+      azureml-model-session:
+      - AZUREML-MODEL-SESSION-XXX
+      x-accel-buffering:
+      - 'no'
+      x-content-type-options:
+      - X-CONTENT-TYPE-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+      x-ms-deployment-name:
+      - gpt-5-nano
+      x-ms-rai-invoked:
+      - 'true'
+      x-ms-region:
+      - X-MS-REGION-XXX
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"messages": [{"role": "system", "content": "You are Parallel Tool Agent.
+      You follow tool instructions precisely.\nYour personal goal is: Use both tools
+      exactly as instructed"}, {"role": "user", "content": "\nCurrent Task: This is
+      a tool-calling compliance test. In your next assistant turn, emit exactly 3
+      tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}, {"role": "assistant", "content":
+      "", "tool_calls": [{"id": "call_NEvGoF86nhPQfXRoJd5SOyLd", "type": "function",
+      "function": {"name": "parallel_local_search_one", "arguments": "{\"query\":
+      \"latest OpenAI model release notes\"}"}}, {"id": "call_q8Q2du4gAMQLrGTgWgfwfbDZ",
+      "type": "function", "function": {"name": "parallel_local_search_two", "arguments":
+      "{\"query\": \"latest Anthropic model release notes\"}"}}, {"id": "call_yTBal9ofZzuo10j0pWqhHCSj",
+      "type": "function", "function": {"name": "parallel_local_search_three", "arguments":
+      "{\"query\": \"latest Gemini model release notes\"}"}}]}, {"role": "tool", "tool_call_id":
+      "call_NEvGoF86nhPQfXRoJd5SOyLd", "content": "[one] latest OpenAI model release
+      notes"}, {"role": "tool", "tool_call_id": "call_q8Q2du4gAMQLrGTgWgfwfbDZ", "content":
+      "[two] latest Anthropic model release notes"}, {"role": "tool", "tool_call_id":
+      "call_yTBal9ofZzuo10j0pWqhHCSj", "content": "[three] latest Gemini model release
+      notes"}, {"role": "user", "content": "Analyze the tool result. If requirements
+      are met, provide the Final Answer. Otherwise, call the next tool. Deliver only
+      the answer without meta-commentary."}], "stream": false, "tool_choice": "auto",
+      "tools": [{"function": {"name": "parallel_local_search_one", "description":
+      "Local search tool #1 for concurrency testing.", "parameters": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, "type":
+      "function"}, {"function": {"name": "parallel_local_search_two", "description":
+      "Local search tool #2 for concurrency testing.", "parameters": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, "type":
+      "function"}, {"function": {"name": "parallel_local_search_three", "description":
+      "Local search tool #3 for concurrency testing.", "parameters": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, "type":
+      "function"}]}'
+    headers:
+      Accept:
+      - application/json
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '3096'
+      Content-Type:
+      - application/json
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      api-key:
+      - X-API-KEY-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+    method: POST
+    uri: https://fake-azure-endpoint.openai.azure.com/openai/deployments/gpt-5-nano/chat/completions?api-version=2024-12-01-preview
+  response:
+    body:
+      string: '{"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_code":{"filtered":false,"detected":false},"protected_material_text":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"finish_reason":"stop","index":0,"logprobs":null,"message":{"annotations":[],"content":"The
+        three tool results indicate the latest release notes are available for OpenAI
+        models, Anthropic models, and Gemini models.","refusal":null,"role":"assistant"}}],"created":1771459579,"id":"chatcmpl-DAlqdRtr8EefmFfazuh4jm7KvVxim","model":"gpt-5-nano-2025-08-07","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":null,"usage":{"completion_tokens":1826,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":1792,"rejected_prediction_tokens":0},"prompt_tokens":537,"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0},"total_tokens":2363}}
+
+        '
+    headers:
+      Content-Length:
+      - '1333'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:06:31 GMT
+      Strict-Transport-Security:
+      - STS-XXX
+      apim-request-id:
+      - APIM-REQUEST-ID-XXX
+      azureml-model-session:
+      - AZUREML-MODEL-SESSION-XXX
+      x-accel-buffering:
+      - 'no'
+      x-content-type-options:
+      - X-CONTENT-TYPE-XXX
+      x-ms-client-request-id:
+      - X-MS-CLIENT-REQUEST-ID-XXX
+      x-ms-deployment-name:
+      - gpt-5-nano
+      x-ms-rai-invoked:
+      - 'true'
+      x-ms-region:
+      - X-MS-REGION-XXX
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestBedrockNativeToolCalling.test_bedrock_parallel_native_tool_calling_test_agent_kickoff.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestBedrockNativeToolCalling.test_bedrock_parallel_native_tool_calling_test_agent_kickoff.yaml
@@ -0,0 +1,63 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": [{"text": "\nCurrent Task: This
+      is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}]}], "inferenceConfig": {"stopSequences": ["\nObservation:"]},
+      "system": [{"text": "You are Parallel Tool Agent. You follow tool instructions
+      precisely.\nYour personal goal is: Use both tools exactly as instructed"}],
+      "toolConfig": {"tools": [{"toolSpec": {"name": "parallel_local_search_one",
+      "description": "Local search tool #1 for concurrency testing.", "inputSchema":
+      {"json": {"properties": {"query": {"description": "Search query", "title": "Query",
+      "type": "string"}}, "required": ["query"], "type": "object", "additionalProperties":
+      false}}}}, {"toolSpec": {"name": "parallel_local_search_two", "description":
+      "Local search tool #2 for concurrency testing.", "inputSchema": {"json": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}}},
+      {"toolSpec": {"name": "parallel_local_search_three", "description": "Local search
+      tool #3 for concurrency testing.", "inputSchema": {"json": {"properties": {"query":
+      {"description": "Search query", "title": "Query", "type": "string"}}, "required":
+      ["query"], "type": "object", "additionalProperties": false}}}}]}}'
+    headers:
+      Content-Length:
+      - '1773'
+      Content-Type:
+      - !!binary |
+        YXBwbGljYXRpb24vanNvbg==
+      User-Agent:
+      - X-USER-AGENT-XXX
+      amz-sdk-invocation-id:
+      - AMZ-SDK-INVOCATION-ID-XXX
+      amz-sdk-request:
+      - !!binary |
+        YXR0ZW1wdD0x
+      authorization:
+      - AUTHORIZATION-XXX
+      x-amz-date:
+      - X-AMZ-DATE-XXX
+    method: POST
+    uri: https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-3-haiku-20240307-v1%3A0/converse
+  response:
+    body:
+      string: '{"message":"The security token included in the request is invalid."}'
+    headers:
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '68'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:00:08 GMT
+      x-amzn-ErrorType:
+      - UnrecognizedClientException:http://internal.amazon.com/coral/com.amazon.coral.service/
+      x-amzn-RequestId:
+      - X-AMZN-REQUESTID-XXX
+    status:
+      code: 403
+      message: Forbidden
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestBedrockNativeToolCalling.test_bedrock_parallel_native_tool_calling_test_crew.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestBedrockNativeToolCalling.test_bedrock_parallel_native_tool_calling_test_crew.yaml
@@ -0,0 +1,226 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": [{"text": "\nCurrent Task: This
+      is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}]}], "inferenceConfig": {"stopSequences":
+      ["\nObservation:"]}, "system": [{"text": "You are Parallel Tool Agent. You follow
+      tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"}], "toolConfig": {"tools": [{"toolSpec": {"name": "parallel_local_search_one",
+      "description": "Local search tool #1 for concurrency testing.", "inputSchema":
+      {"json": {"properties": {"query": {"description": "Search query", "title": "Query",
+      "type": "string"}}, "required": ["query"], "type": "object", "additionalProperties":
+      false}}}}, {"toolSpec": {"name": "parallel_local_search_two", "description":
+      "Local search tool #2 for concurrency testing.", "inputSchema": {"json": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}}},
+      {"toolSpec": {"name": "parallel_local_search_three", "description": "Local search
+      tool #3 for concurrency testing.", "inputSchema": {"json": {"properties": {"query":
+      {"description": "Search query", "title": "Query", "type": "string"}}, "required":
+      ["query"], "type": "object", "additionalProperties": false}}}}]}}'
+    headers:
+      Content-Length:
+      - '1954'
+      Content-Type:
+      - !!binary |
+        YXBwbGljYXRpb24vanNvbg==
+      User-Agent:
+      - X-USER-AGENT-XXX
+      amz-sdk-invocation-id:
+      - AMZ-SDK-INVOCATION-ID-XXX
+      amz-sdk-request:
+      - !!binary |
+        YXR0ZW1wdD0x
+      authorization:
+      - AUTHORIZATION-XXX
+      x-amz-date:
+      - X-AMZ-DATE-XXX
+    method: POST
+    uri: https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-3-haiku-20240307-v1%3A0/converse
+  response:
+    body:
+      string: '{"message":"The security token included in the request is invalid."}'
+    headers:
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '68'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:00:07 GMT
+      x-amzn-ErrorType:
+      - UnrecognizedClientException:http://internal.amazon.com/coral/com.amazon.coral.service/
+      x-amzn-RequestId:
+      - X-AMZN-REQUESTID-XXX
+    status:
+      code: 403
+      message: Forbidden
+- request:
+    body: '{"messages": [{"role": "user", "content": [{"text": "\nCurrent Task: This
+      is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}]}, {"role": "user", "content":
+      [{"text": "\nCurrent Task: This is a tool-calling compliance test. In your next
+      assistant turn, emit exactly 3 tool calls in the same response (parallel tool
+      calls), in this order: 1) parallel_local_search_one(query=''latest OpenAI model
+      release notes''), 2) parallel_local_search_two(query=''latest Anthropic model
+      release notes''), 3) parallel_local_search_three(query=''latest Gemini model
+      release notes''). Do not call any other tools and do not answer before those
+      3 tool calls are emitted. After the tool results return, provide a one paragraph
+      summary.\n\nThis is the expected criteria for your final answer: A one sentence
+      summary of both tool outputs\nyou MUST return the actual complete content as
+      the final answer, not a summary."}]}], "inferenceConfig": {"stopSequences":
+      ["\nObservation:"]}, "system": [{"text": "You are Parallel Tool Agent. You follow
+      tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed\n\nYou are Parallel Tool Agent. You follow tool instructions precisely.\nYour
+      personal goal is: Use both tools exactly as instructed"}], "toolConfig": {"tools":
+      [{"toolSpec": {"name": "parallel_local_search_one", "description": "Local search
+      tool #1 for concurrency testing.", "inputSchema": {"json": {"properties": {"query":
+      {"description": "Search query", "title": "Query", "type": "string"}}, "required":
+      ["query"], "type": "object", "additionalProperties": false}}}}, {"toolSpec":
+      {"name": "parallel_local_search_two", "description": "Local search tool #2 for
+      concurrency testing.", "inputSchema": {"json": {"properties": {"query": {"description":
+      "Search query", "title": "Query", "type": "string"}}, "required": ["query"],
+      "type": "object", "additionalProperties": false}}}}, {"toolSpec": {"name": "parallel_local_search_three",
+      "description": "Local search tool #3 for concurrency testing.", "inputSchema":
+      {"json": {"properties": {"query": {"description": "Search query", "title": "Query",
+      "type": "string"}}, "required": ["query"], "type": "object", "additionalProperties":
+      false}}}}]}}'
+    headers:
+      Content-Length:
+      - '2855'
+      Content-Type:
+      - !!binary |
+        YXBwbGljYXRpb24vanNvbg==
+      User-Agent:
+      - X-USER-AGENT-XXX
+      amz-sdk-invocation-id:
+      - AMZ-SDK-INVOCATION-ID-XXX
+      amz-sdk-request:
+      - !!binary |
+        YXR0ZW1wdD0x
+      authorization:
+      - AUTHORIZATION-XXX
+      x-amz-date:
+      - X-AMZ-DATE-XXX
+    method: POST
+    uri: https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-3-haiku-20240307-v1%3A0/converse
+  response:
+    body:
+      string: '{"message":"The security token included in the request is invalid."}'
+    headers:
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '68'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:00:07 GMT
+      x-amzn-ErrorType:
+      - UnrecognizedClientException:http://internal.amazon.com/coral/com.amazon.coral.service/
+      x-amzn-RequestId:
+      - X-AMZN-REQUESTID-XXX
+    status:
+      code: 403
+      message: Forbidden
+- request:
+    body: '{"messages": [{"role": "user", "content": [{"text": "\nCurrent Task: This
+      is a tool-calling compliance test. In your next assistant turn, emit exactly
+      3 tool calls in the same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}]}, {"role": "user", "content":
+      [{"text": "\nCurrent Task: This is a tool-calling compliance test. In your next
+      assistant turn, emit exactly 3 tool calls in the same response (parallel tool
+      calls), in this order: 1) parallel_local_search_one(query=''latest OpenAI model
+      release notes''), 2) parallel_local_search_two(query=''latest Anthropic model
+      release notes''), 3) parallel_local_search_three(query=''latest Gemini model
+      release notes''). Do not call any other tools and do not answer before those
+      3 tool calls are emitted. After the tool results return, provide a one paragraph
+      summary.\n\nThis is the expected criteria for your final answer: A one sentence
+      summary of both tool outputs\nyou MUST return the actual complete content as
+      the final answer, not a summary."}]}, {"role": "user", "content": [{"text":
+      "\nCurrent Task: This is a tool-calling compliance test. In your next assistant
+      turn, emit exactly 3 tool calls in the same response (parallel tool calls),
+      in this order: 1) parallel_local_search_one(query=''latest OpenAI model release
+      notes''), 2) parallel_local_search_two(query=''latest Anthropic model release
+      notes''), 3) parallel_local_search_three(query=''latest Gemini model release
+      notes''). Do not call any other tools and do not answer before those 3 tool
+      calls are emitted. After the tool results return, provide a one paragraph summary.\n\nThis
+      is the expected criteria for your final answer: A one sentence summary of both
+      tool outputs\nyou MUST return the actual complete content as the final answer,
+      not a summary."}]}], "inferenceConfig": {"stopSequences": ["\nObservation:"]},
+      "system": [{"text": "You are Parallel Tool Agent. You follow tool instructions
+      precisely.\nYour personal goal is: Use both tools exactly as instructed\n\nYou
+      are Parallel Tool Agent. You follow tool instructions precisely.\nYour personal
+      goal is: Use both tools exactly as instructed\n\nYou are Parallel Tool Agent.
+      You follow tool instructions precisely.\nYour personal goal is: Use both tools
+      exactly as instructed"}], "toolConfig": {"tools": [{"toolSpec": {"name": "parallel_local_search_one",
+      "description": "Local search tool #1 for concurrency testing.", "inputSchema":
+      {"json": {"properties": {"query": {"description": "Search query", "title": "Query",
+      "type": "string"}}, "required": ["query"], "type": "object", "additionalProperties":
+      false}}}}, {"toolSpec": {"name": "parallel_local_search_two", "description":
+      "Local search tool #2 for concurrency testing.", "inputSchema": {"json": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}}},
+      {"toolSpec": {"name": "parallel_local_search_three", "description": "Local search
+      tool #3 for concurrency testing.", "inputSchema": {"json": {"properties": {"query":
+      {"description": "Search query", "title": "Query", "type": "string"}}, "required":
+      ["query"], "type": "object", "additionalProperties": false}}}}]}}'
+    headers:
+      Content-Length:
+      - '3756'
+      Content-Type:
+      - !!binary |
+        YXBwbGljYXRpb24vanNvbg==
+      User-Agent:
+      - X-USER-AGENT-XXX
+      amz-sdk-invocation-id:
+      - AMZ-SDK-INVOCATION-ID-XXX
+      amz-sdk-request:
+      - !!binary |
+        YXR0ZW1wdD0x
+      authorization:
+      - AUTHORIZATION-XXX
+      x-amz-date:
+      - X-AMZ-DATE-XXX
+    method: POST
+    uri: https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-3-haiku-20240307-v1%3A0/converse
+  response:
+    body:
+      string: '{"message":"The security token included in the request is invalid."}'
+    headers:
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '68'
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 00:00:07 GMT
+      x-amzn-ErrorType:
+      - UnrecognizedClientException:http://internal.amazon.com/coral/com.amazon.coral.service/
+      x-amzn-RequestId:
+      - X-AMZN-REQUESTID-XXX
+    status:
+      code: 403
+      message: Forbidden
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestGeminiNativeToolCalling.test_gemini_agent_with_native_tool_calling.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestGeminiNativeToolCalling.test_gemini_agent_with_native_tool_calling.yaml
@@ -3,14 +3,14 @@ interactions:
    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate what is 15
      * 8\n\nThis is the expected criteria for your final answer: The result of the
      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"}],
-      "role": "user"}], "systemInstruction": {"parts": [{"text": "You are Math Assistant.
-      You are a helpful math assistant.\nYour personal goal is: Help users with mathematical
-      calculations"}], "role": "user"}, "tools": [{"functionDeclarations": [{"description":
-      "Perform mathematical calculations. Use this for any math operations.", "name":
-      "calculator", "parameters": {"properties": {"expression": {"description": "Mathematical
-      expression to evaluate", "title": "Expression", "type": "STRING"}}, "required":
-      ["expression"], "type": "OBJECT"}}]}], "generationConfig": {"stopSequences":
+      not a summary."}], "role": "user"}], "systemInstruction": {"parts": [{"text":
+      "You are Math Assistant. You are a helpful math assistant.\nYour personal goal
+      is: Help users with mathematical calculations"}], "role": "user"}, "tools":
+      [{"functionDeclarations": [{"description": "Perform mathematical calculations.
+      Use this for any math operations.", "name": "calculator", "parameters_json_schema":
+      {"properties": {"expression": {"description": "Mathematical expression to evaluate",
+      "title": "Expression", "type": "string"}}, "required": ["expression"], "type":
+      "object", "additionalProperties": false}}]}], "generationConfig": {"stopSequences":
      ["\nObservation:"]}}'
    headers:
      User-Agent:
@@ -22,7 +22,7 @@ interactions:
      connection:
      - keep-alive
      content-length:
-      - '907'
+      - '892'
      content-type:
      - application/json
      host:
@@ -32,31 +32,31 @@ interactions:
      x-goog-api-key:
      - X-GOOG-API-KEY-XXX
    method: POST
-    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
  response:
    body:
      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
        [\n          {\n            \"functionCall\": {\n              \"name\": \"calculator\",\n
        \             \"args\": {\n                \"expression\": \"15 * 8\"\n              }\n
-        \           }\n          }\n        ],\n        \"role\": \"model\"\n      },\n
-        \     \"finishReason\": \"STOP\",\n      \"avgLogprobs\": -0.00062879999833447594\n
-        \   }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\": 103,\n    \"candidatesTokenCount\":
-        7,\n    \"totalTokenCount\": 110,\n    \"promptTokensDetails\": [\n      {\n
-        \       \"modality\": \"TEXT\",\n        \"tokenCount\": 103\n      }\n    ],\n
-        \   \"candidatesTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
-        \       \"tokenCount\": 7\n      }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-exp\",\n
-        \ \"responseId\": \"PpByabfUHsih_uMPlu2ysAM\"\n}\n"
+        \           },\n            \"thoughtSignature\": \"Cp8DAb4+9vu74rJ0QQNTa6oMMh3QAlvx3cS4TL0I1od7EdQZtMBbsr5viQiTUR/LKj8nwPvtLjZxib5SXqmV0t2B2ZMdq1nqD62vLPD3i7tmUeRoysODfxomRGRhy/CPysMhobt5HWF1W/n6tNiQz3V36f0/dRx5yJeyN4tJL/RZePv77FUqywOfFlYOkOIyAkrE5LT6FicOjhHm/B9bGV/y7TNmN6TtwQDxoE9nU92Q/UNZ7rNyZE7aSR7KPJZuRXrrBBh+akt5dX5n6N9kGWkyRpWVgUox01+b22RSj4S/QY45IvadtmmkFk8DMVAtAnEiK0WazltC+TOdUJHwVgBD494fngoVcHU+R1yIJrVe7h6Ce3Ts5IYLrRCedDU3wW1ghn/hXx1nvTqQumpsGTGtE2v3KjF/7DmQA96WzB1X7+QUOF2J3pK9HemiKxAQl4U9fP2eNN8shvy2YykBlahWDujEwye7ji4wIWtNHbf0t+uFwGTQ3QruAKXvWB04ExjHM2I/8O9U5tOsH0cwPqnpFR2EaTqaPXXUllZ2K+DaaA==\"\n
+        \         }\n        ],\n        \"role\": \"model\"\n      },\n      \"finishReason\":
+        \"STOP\",\n      \"index\": 0,\n      \"finishMessage\": \"Model generated
+        function call(s).\"\n    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\":
+        115,\n    \"candidatesTokenCount\": 17,\n    \"totalTokenCount\": 227,\n    \"promptTokensDetails\":
+        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 115\n
+        \     }\n    ],\n    \"thoughtsTokenCount\": 95\n  },\n  \"modelVersion\":
+        \"gemini-2.5-flash\",\n  \"responseId\": \"Y1KWadvNMKz1jMcPiJeJmAI\"\n}\n"
    headers:
      Alt-Svc:
      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
      Content-Type:
      - application/json; charset=UTF-8
      Date:
-      - Thu, 22 Jan 2026 21:01:50 GMT
+      - Wed, 18 Feb 2026 23:59:32 GMT
      Server:
      - scaffolding on HTTPServer2
      Server-Timing:
-      - gfet4t7; dur=521
+      - gfet4t7; dur=956
      Transfer-Encoding:
      - chunked
      Vary:
@@ -76,18 +76,19 @@ interactions:
    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate what is 15
      * 8\n\nThis is the expected criteria for your final answer: The result of the
      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"}],
-      "role": "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text":
-      "The result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze
-      the tool result. If requirements are met, provide the Final Answer. Otherwise,
-      call the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}], "systemInstruction": {"parts": [{"text": "You are Math Assistant.
-      You are a helpful math assistant.\nYour personal goal is: Help users with mathematical
-      calculations"}], "role": "user"}, "tools": [{"functionDeclarations": [{"description":
-      "Perform mathematical calculations. Use this for any math operations.", "name":
-      "calculator", "parameters": {"properties": {"expression": {"description": "Mathematical
-      expression to evaluate", "title": "Expression", "type": "STRING"}}, "required":
-      ["expression"], "type": "OBJECT"}}]}], "generationConfig": {"stopSequences":
+      not a summary."}], "role": "user"}, {"parts": [{"functionCall": {"args": {"expression":
+      "15 * 8"}, "name": "calculator"}}], "role": "model"}, {"parts": [{"functionResponse":
+      {"name": "calculator", "response": {"result": "The result of 15 * 8 is 120"}}}],
+      "role": "user"}, {"parts": [{"text": "Analyze the tool result. If requirements
+      are met, provide the Final Answer. Otherwise, call the next tool. Deliver only
+      the answer without meta-commentary."}], "role": "user"}], "systemInstruction":
+      {"parts": [{"text": "You are Math Assistant. You are a helpful math assistant.\nYour
+      personal goal is: Help users with mathematical calculations"}], "role": "user"},
+      "tools": [{"functionDeclarations": [{"description": "Perform mathematical calculations.
+      Use this for any math operations.", "name": "calculator", "parameters_json_schema":
+      {"properties": {"expression": {"description": "Mathematical expression to evaluate",
+      "title": "Expression", "type": "string"}}, "required": ["expression"], "type":
+      "object", "additionalProperties": false}}]}], "generationConfig": {"stopSequences":
      ["\nObservation:"]}}'
    headers:
      User-Agent:
@@ -99,7 +100,7 @@ interactions:
      connection:
      - keep-alive
      content-length:
-      - '1219'
+      - '1326'
      content-type:
      - application/json
      host:
@@ -109,378 +110,28 @@ interactions:
      x-goog-api-key:
      - X-GOOG-API-KEY-XXX
    method: POST
-    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
  response:
    body:
      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
-        [\n          {\n            \"functionCall\": {\n              \"name\": \"calculator\",\n
-        \             \"args\": {\n                \"expression\": \"15 * 8\"\n              }\n
-        \           }\n          }\n        ],\n        \"role\": \"model\"\n      },\n
-        \     \"finishReason\": \"STOP\",\n      \"avgLogprobs\": -0.013549212898526872\n
-        \   }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\": 149,\n    \"candidatesTokenCount\":
-        7,\n    \"totalTokenCount\": 156,\n    \"promptTokensDetails\": [\n      {\n
-        \       \"modality\": \"TEXT\",\n        \"tokenCount\": 149\n      }\n    ],\n
-        \   \"candidatesTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
-        \       \"tokenCount\": 7\n      }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-exp\",\n
-        \ \"responseId\": \"P5Byadc8kJT-4w_p99XQAQ\"\n}\n"
+        [\n          {\n            \"text\": \"The result of 15 * 8 is 120\"\n          }\n
+        \       ],\n        \"role\": \"model\"\n      },\n      \"finishReason\":
+        \"STOP\",\n      \"index\": 0\n    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\":
+        191,\n    \"candidatesTokenCount\": 14,\n    \"totalTokenCount\": 205,\n    \"promptTokensDetails\":
+        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 191\n
+        \     }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.5-flash\",\n  \"responseId\":
+        \"ZFKWaf2BMM6MjMcP6P--kQM\"\n}\n"
    headers:
      Alt-Svc:
      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
      Content-Type:
      - application/json; charset=UTF-8
      Date:
-      - Thu, 22 Jan 2026 21:01:51 GMT
+      - Wed, 18 Feb 2026 23:59:33 GMT
      Server:
      - scaffolding on HTTPServer2
      Server-Timing:
-      - gfet4t7; dur=444
-      Transfer-Encoding:
-      - chunked
-      Vary:
-      - Origin
-      - X-Origin
-      - Referer
-      X-Content-Type-Options:
-      - X-CONTENT-TYPE-XXX
-      X-Frame-Options:
-      - X-FRAME-OPTIONS-XXX
-      X-XSS-Protection:
-      - '0'
-    status:
-      code: 200
-      message: OK
- request:
-    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate what is 15
-      * 8\n\nThis is the expected criteria for your final answer: The result of the
-      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"}],
-      "role": "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text":
-      "The result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze
-      the tool result. If requirements are met, provide the Final Answer. Otherwise,
-      call the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}], "systemInstruction": {"parts": [{"text": "You are Math Assistant.
-      You are a helpful math assistant.\nYour personal goal is: Help users with mathematical
-      calculations"}], "role": "user"}, "tools": [{"functionDeclarations": [{"description":
-      "Perform mathematical calculations. Use this for any math operations.", "name":
-      "calculator", "parameters": {"properties": {"expression": {"description": "Mathematical
-      expression to evaluate", "title": "Expression", "type": "STRING"}}, "required":
-      ["expression"], "type": "OBJECT"}}]}], "generationConfig": {"stopSequences":
-      ["\nObservation:"]}}'
-    headers:
-      User-Agent:
-      - X-USER-AGENT-XXX
-      accept:
-      - '*/*'
-      accept-encoding:
-      - ACCEPT-ENCODING-XXX
-      connection:
-      - keep-alive
-      content-length:
-      - '1531'
-      content-type:
-      - application/json
-      host:
-      - generativelanguage.googleapis.com
-      x-goog-api-client:
-      - google-genai-sdk/1.49.0 gl-python/3.13.3
-      x-goog-api-key:
-      - X-GOOG-API-KEY-XXX
-    method: POST
-    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent
-  response:
-    body:
-      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
-        [\n          {\n            \"functionCall\": {\n              \"name\": \"calculator\",\n
-        \             \"args\": {\n                \"expression\": \"15 * 8\"\n              }\n
-        \           }\n          }\n        ],\n        \"role\": \"model\"\n      },\n
-        \     \"finishReason\": \"STOP\",\n      \"avgLogprobs\": -0.0409286447933742\n
-        \   }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\": 195,\n    \"candidatesTokenCount\":
-        7,\n    \"totalTokenCount\": 202,\n    \"promptTokensDetails\": [\n      {\n
-        \       \"modality\": \"TEXT\",\n        \"tokenCount\": 195\n      }\n    ],\n
-        \   \"candidatesTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
-        \       \"tokenCount\": 7\n      }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-exp\",\n
-        \ \"responseId\": \"P5Byadn5HOK6_uMPnvmXwAk\"\n}\n"
-    headers:
-      Alt-Svc:
-      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
-      Content-Type:
-      - application/json; charset=UTF-8
-      Date:
-      - Thu, 22 Jan 2026 21:01:51 GMT
-      Server:
-      - scaffolding on HTTPServer2
-      Server-Timing:
-      - gfet4t7; dur=503
-      Transfer-Encoding:
-      - chunked
-      Vary:
-      - Origin
-      - X-Origin
-      - Referer
-      X-Content-Type-Options:
-      - X-CONTENT-TYPE-XXX
-      X-Frame-Options:
-      - X-FRAME-OPTIONS-XXX
-      X-XSS-Protection:
-      - '0'
-    status:
-      code: 200
-      message: OK
- request:
-    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate what is 15
-      * 8\n\nThis is the expected criteria for your final answer: The result of the
-      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"}],
-      "role": "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text":
-      "The result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze
-      the tool result. If requirements are met, provide the Final Answer. Otherwise,
-      call the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}], "systemInstruction": {"parts": [{"text": "You are Math Assistant.
-      You are a helpful math assistant.\nYour personal goal is: Help users with mathematical
-      calculations"}], "role": "user"}, "tools": [{"functionDeclarations": [{"description":
-      "Perform mathematical calculations. Use this for any math operations.", "name":
-      "calculator", "parameters": {"properties": {"expression": {"description": "Mathematical
-      expression to evaluate", "title": "Expression", "type": "STRING"}}, "required":
-      ["expression"], "type": "OBJECT"}}]}], "generationConfig": {"stopSequences":
-      ["\nObservation:"]}}'
-    headers:
-      User-Agent:
-      - X-USER-AGENT-XXX
-      accept:
-      - '*/*'
-      accept-encoding:
-      - ACCEPT-ENCODING-XXX
-      connection:
-      - keep-alive
-      content-length:
-      - '1843'
-      content-type:
-      - application/json
-      host:
-      - generativelanguage.googleapis.com
-      x-goog-api-client:
-      - google-genai-sdk/1.49.0 gl-python/3.13.3
-      x-goog-api-key:
-      - X-GOOG-API-KEY-XXX
-    method: POST
-    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent
-  response:
-    body:
-      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
-        [\n          {\n            \"functionCall\": {\n              \"name\": \"calculator\",\n
-        \             \"args\": {\n                \"expression\": \"15 * 8\"\n              }\n
-        \           }\n          }\n        ],\n        \"role\": \"model\"\n      },\n
-        \     \"finishReason\": \"STOP\",\n      \"avgLogprobs\": -0.018002046006066457\n
-        \   }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\": 241,\n    \"candidatesTokenCount\":
-        7,\n    \"totalTokenCount\": 248,\n    \"promptTokensDetails\": [\n      {\n
-        \       \"modality\": \"TEXT\",\n        \"tokenCount\": 241\n      }\n    ],\n
-        \   \"candidatesTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
-        \       \"tokenCount\": 7\n      }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-exp\",\n
-        \ \"responseId\": \"P5Byafi2PKbn_uMPtIbfuQI\"\n}\n"
-    headers:
-      Alt-Svc:
-      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
-      Content-Type:
-      - application/json; charset=UTF-8
-      Date:
-      - Thu, 22 Jan 2026 21:01:52 GMT
-      Server:
-      - scaffolding on HTTPServer2
-      Server-Timing:
-      - gfet4t7; dur=482
-      Transfer-Encoding:
-      - chunked
-      Vary:
-      - Origin
-      - X-Origin
-      - Referer
-      X-Content-Type-Options:
-      - X-CONTENT-TYPE-XXX
-      X-Frame-Options:
-      - X-FRAME-OPTIONS-XXX
-      X-XSS-Protection:
-      - '0'
-    status:
-      code: 200
-      message: OK
- request:
-    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate what is 15
-      * 8\n\nThis is the expected criteria for your final answer: The result of the
-      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"}],
-      "role": "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text":
-      "The result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze
-      the tool result. If requirements are met, provide the Final Answer. Otherwise,
-      call the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}], "systemInstruction": {"parts": [{"text": "You are Math Assistant.
-      You are a helpful math assistant.\nYour personal goal is: Help users with mathematical
-      calculations"}], "role": "user"}, "tools": [{"functionDeclarations": [{"description":
-      "Perform mathematical calculations. Use this for any math operations.", "name":
-      "calculator", "parameters": {"properties": {"expression": {"description": "Mathematical
-      expression to evaluate", "title": "Expression", "type": "STRING"}}, "required":
-      ["expression"], "type": "OBJECT"}}]}], "generationConfig": {"stopSequences":
-      ["\nObservation:"]}}'
-    headers:
-      User-Agent:
-      - X-USER-AGENT-XXX
-      accept:
-      - '*/*'
-      accept-encoding:
-      - ACCEPT-ENCODING-XXX
-      connection:
-      - keep-alive
-      content-length:
-      - '2155'
-      content-type:
-      - application/json
-      host:
-      - generativelanguage.googleapis.com
-      x-goog-api-client:
-      - google-genai-sdk/1.49.0 gl-python/3.13.3
-      x-goog-api-key:
-      - X-GOOG-API-KEY-XXX
-    method: POST
-    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent
-  response:
-    body:
-      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
-        [\n          {\n            \"functionCall\": {\n              \"name\": \"calculator\",\n
-        \             \"args\": {\n                \"expression\": \"15 * 8\"\n              }\n
-        \           }\n          }\n        ],\n        \"role\": \"model\"\n      },\n
-        \     \"finishReason\": \"STOP\",\n      \"avgLogprobs\": -0.10329001290457589\n
-        \   }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\": 287,\n    \"candidatesTokenCount\":
-        7,\n    \"totalTokenCount\": 294,\n    \"promptTokensDetails\": [\n      {\n
-        \       \"modality\": \"TEXT\",\n        \"tokenCount\": 287\n      }\n    ],\n
-        \   \"candidatesTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
-        \       \"tokenCount\": 7\n      }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-exp\",\n
-        \ \"responseId\": \"QJByaamVIP_g_uMPt6mI0Qg\"\n}\n"
-    headers:
-      Alt-Svc:
-      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
-      Content-Type:
-      - application/json; charset=UTF-8
-      Date:
-      - Thu, 22 Jan 2026 21:01:52 GMT
-      Server:
-      - scaffolding on HTTPServer2
-      Server-Timing:
-      - gfet4t7; dur=534
-      Transfer-Encoding:
-      - chunked
-      Vary:
-      - Origin
-      - X-Origin
-      - Referer
-      X-Content-Type-Options:
-      - X-CONTENT-TYPE-XXX
-      X-Frame-Options:
-      - X-FRAME-OPTIONS-XXX
-      X-XSS-Protection:
-      - '0'
-    status:
-      code: 200
-      message: OK
- request:
-    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate what is 15
-      * 8\n\nThis is the expected criteria for your final answer: The result of the
-      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"}],
-      "role": "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text":
-      "The result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze
-      the tool result. If requirements are met, provide the Final Answer. Otherwise,
-      call the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}, {"parts": [{"text": ""}], "role": "model"}, {"parts": [{"text": "The
-      result of 15 * 8 is 120"}], "role": "user"}, {"parts": [{"text": "Analyze the
-      tool result. If requirements are met, provide the Final Answer. Otherwise, call
-      the next tool. Deliver only the answer without meta-commentary."}], "role":
-      "user"}], "systemInstruction": {"parts": [{"text": "You are Math Assistant.
-      You are a helpful math assistant.\nYour personal goal is: Help users with mathematical
-      calculations"}], "role": "user"}, "tools": [{"functionDeclarations": [{"description":
-      "Perform mathematical calculations. Use this for any math operations.", "name":
-      "calculator", "parameters": {"properties": {"expression": {"description": "Mathematical
-      expression to evaluate", "title": "Expression", "type": "STRING"}}, "required":
-      ["expression"], "type": "OBJECT"}}]}], "generationConfig": {"stopSequences":
-      ["\nObservation:"]}}'
-    headers:
-      User-Agent:
-      - X-USER-AGENT-XXX
-      accept:
-      - '*/*'
-      accept-encoding:
-      - ACCEPT-ENCODING-XXX
-      connection:
-      - keep-alive
-      content-length:
-      - '2467'
-      content-type:
-      - application/json
-      host:
-      - generativelanguage.googleapis.com
-      x-goog-api-client:
-      - google-genai-sdk/1.49.0 gl-python/3.13.3
-      x-goog-api-key:
-      - X-GOOG-API-KEY-XXX
-    method: POST
-    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent
-  response:
-    body:
-      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
-        [\n          {\n            \"text\": \"120\\n\"\n          }\n        ],\n
-        \       \"role\": \"model\"\n      },\n      \"finishReason\": \"STOP\",\n
-        \     \"avgLogprobs\": -0.0097615998238325119\n    }\n  ],\n  \"usageMetadata\":
-        {\n    \"promptTokenCount\": 333,\n    \"candidatesTokenCount\": 4,\n    \"totalTokenCount\":
-        337,\n    \"promptTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
-        \       \"tokenCount\": 333\n      }\n    ],\n    \"candidatesTokensDetails\":
-        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 4\n      }\n
-        \   ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-exp\",\n  \"responseId\":
-        \"QZByaZHABO-i_uMP58aYqAk\"\n}\n"
-    headers:
-      Alt-Svc:
-      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
-      Content-Type:
-      - application/json; charset=UTF-8
-      Date:
-      - Thu, 22 Jan 2026 21:01:53 GMT
-      Server:
-      - scaffolding on HTTPServer2
-      Server-Timing:
-      - gfet4t7; dur=412
+      - gfet4t7; dur=421
      Transfer-Encoding:
      - chunked
      Vary:
--- a/lib/crewai/tests/cassettes/agents/TestGeminiNativeToolCalling.test_gemini_parallel_native_tool_calling_test_agent_kickoff.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestGeminiNativeToolCalling.test_gemini_parallel_native_tool_calling_test_agent_kickoff.yaml
@@ -0,0 +1,188 @@
+interactions:
+- request:
+    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}], "role": "user"}], "systemInstruction": {"parts": [{"text":
+      "You are Parallel Tool Agent. You follow tool instructions precisely.\nYour
+      personal goal is: Use both tools exactly as instructed"}], "role": "user"},
+      "tools": [{"functionDeclarations": [{"description": "Local search tool #1 for
+      concurrency testing.", "name": "parallel_local_search_one", "parameters_json_schema":
+      {"properties": {"query": {"description": "Search query", "title": "Query", "type":
+      "string"}}, "required": ["query"], "type": "object", "additionalProperties":
+      false}}, {"description": "Local search tool #2 for concurrency testing.", "name":
+      "parallel_local_search_two", "parameters_json_schema": {"properties": {"query":
+      {"description": "Search query", "title": "Query", "type": "string"}}, "required":
+      ["query"], "type": "object", "additionalProperties": false}}, {"description":
+      "Local search tool #3 for concurrency testing.", "name": "parallel_local_search_three",
+      "parameters_json_schema": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}]}], "generationConfig": {"stopSequences": ["\nObservation:"]}}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - '*/*'
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '1783'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      x-goog-api-client:
+      - google-genai-sdk/1.49.0 gl-python/3.13.3
+      x-goog-api-key:
+      - X-GOOG-API-KEY-XXX
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
+  response:
+    body:
+      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
+        [\n          {\n            \"functionCall\": {\n              \"name\": \"parallel_local_search_one\",\n
+        \             \"args\": {\n                \"query\": \"latest OpenAI model
+        release notes\"\n              }\n            },\n            \"thoughtSignature\":
+        \"CrICAb4+9vtrrkiSatPyOs7fssb9akcgCIiQdJKp/k+hcEZVNFvU/H0e4FFmLIhTCPRyHxmU+AQPtBZ5vg6y9ZCcv11RdcWgYW8rPQzCnC+YTUxPAfDzaObky1QsL5pl9+yglQqVoVM31ZcnoiH02z85pwAv6TSJxdJZEekW6XwcIrCoHNCgY3ghHFEd3y3wLJ5JWL7wmiRNTC9TCT8aJHXKFohYrb+4JMULCx8BqKVxOucZPiDHA8GsoqSlzkYEe2xCh9oSdaZpCFrxhZ9bwoVDbVmPrjaq2hj5BoJ5hNxscHJ/E0EOl4ogeKZW+hIVfdzpjAFZW9Oejkb9G4ZSLbxXsoO7x8bi4LHFRABniGrWvNuOOH0Udh4t57oXHXZO4u5NNTood/GkJGcP+aHqUAH1fwqL\"\n
+        \         },\n          {\n            \"functionCall\": {\n              \"name\":
+        \"parallel_local_search_two\",\n              \"args\": {\n                \"query\":
+        \"latest Anthropic model release notes\"\n              }\n            }\n
+        \         },\n          {\n            \"functionCall\": {\n              \"name\":
+        \"parallel_local_search_three\",\n              \"args\": {\n                \"query\":
+        \"latest Gemini model release notes\"\n              }\n            }\n          }\n
+        \       ],\n        \"role\": \"model\"\n      },\n      \"finishReason\":
+        \"STOP\",\n      \"index\": 0,\n      \"finishMessage\": \"Model generated
+        function call(s).\"\n    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\":
+        291,\n    \"candidatesTokenCount\": 70,\n    \"totalTokenCount\": 428,\n    \"promptTokensDetails\":
+        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 291\n
+        \     }\n    ],\n    \"thoughtsTokenCount\": 67\n  },\n  \"modelVersion\":
+        \"gemini-2.5-flash\",\n  \"responseId\": \"alKWacytCLi5jMcPhISaoAI\"\n}\n"
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Wed, 18 Feb 2026 23:59:39 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=999
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      X-Frame-Options:
+      - X-FRAME-OPTIONS-XXX
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}], "role": "user"}, {"parts": [{"functionCall": {"args":
+      {"query": "latest OpenAI model release notes"}, "name": "parallel_local_search_one"},
+      "thoughtSignature": "CrICAb4-9vtrrkiSatPyOs7fssb9akcgCIiQdJKp_k-hcEZVNFvU_H0e4FFmLIhTCPRyHxmU-AQPtBZ5vg6y9ZCcv11RdcWgYW8rPQzCnC-YTUxPAfDzaObky1QsL5pl9-yglQqVoVM31ZcnoiH02z85pwAv6TSJxdJZEekW6XwcIrCoHNCgY3ghHFEd3y3wLJ5JWL7wmiRNTC9TCT8aJHXKFohYrb-4JMULCx8BqKVxOucZPiDHA8GsoqSlzkYEe2xCh9oSdaZpCFrxhZ9bwoVDbVmPrjaq2hj5BoJ5hNxscHJ_E0EOl4ogeKZW-hIVfdzpjAFZW9Oejkb9G4ZSLbxXsoO7x8bi4LHFRABniGrWvNuOOH0Udh4t57oXHXZO4u5NNTood_GkJGcP-aHqUAH1fwqL"},
+      {"functionCall": {"args": {"query": "latest Anthropic model release notes"},
+      "name": "parallel_local_search_two"}}, {"functionCall": {"args": {"query": "latest
+      Gemini model release notes"}, "name": "parallel_local_search_three"}}], "role":
+      "model"}, {"parts": [{"functionResponse": {"name": "parallel_local_search_one",
+      "response": {"result": "[one] latest OpenAI model release notes"}}}], "role":
+      "user"}, {"parts": [{"functionResponse": {"name": "parallel_local_search_two",
+      "response": {"result": "[two] latest Anthropic model release notes"}}}], "role":
+      "user"}, {"parts": [{"functionResponse": {"name": "parallel_local_search_three",
+      "response": {"result": "[three] latest Gemini model release notes"}}}], "role":
+      "user"}], "systemInstruction": {"parts": [{"text": "You are Parallel Tool Agent.
+      You follow tool instructions precisely.\nYour personal goal is: Use both tools
+      exactly as instructed"}], "role": "user"}, "tools": [{"functionDeclarations":
+      [{"description": "Local search tool #1 for concurrency testing.", "name": "parallel_local_search_one",
+      "parameters_json_schema": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}, {"description": "Local search tool #2 for concurrency
+      testing.", "name": "parallel_local_search_two", "parameters_json_schema": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, {"description":
+      "Local search tool #3 for concurrency testing.", "name": "parallel_local_search_three",
+      "parameters_json_schema": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}]}], "generationConfig": {"stopSequences": ["\nObservation:"]}}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - '*/*'
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '3071'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      x-goog-api-client:
+      - google-genai-sdk/1.49.0 gl-python/3.13.3
+      x-goog-api-key:
+      - X-GOOG-API-KEY-XXX
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
+  response:
+    body:
+      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
+        [\n          {\n            \"text\": \"Here is a summary of the latest model
+        release notes: I have retrieved information regarding the latest OpenAI model
+        release notes, the latest Anthropic model release notes, and the latest Gemini
+        model release notes. The specific details of these release notes are available
+        through the respective tool outputs.\",\n            \"thoughtSignature\":
+        \"CsoBAb4+9vtPvWFM08lR1S4QrLN+Z1+Zpf04Y/bC8tjOpnxz3EEvHyRNEwkslUX5pftBi8J78Xk4/FUER0xjJZc8clUObTvayxLNup4h1JwJ5ZdatulInNGTEieFnF4w8KjSFB/vqNCZvXWZbiLkpzqAnsoAIf0x4VmMN11V0Ozo+3f2QftD+iBrfu3g21UI5tbG0Z+0QHxjRVKXrQOp7dmoZPzaxI0zalfDEI+A2jGpVl/VvauVNv0jQn0yItcA5tkVeWLq6717CjNoig==\"\n
+        \         }\n        ],\n        \"role\": \"model\"\n      },\n      \"finishReason\":
+        \"STOP\",\n      \"index\": 0\n    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\":
+        435,\n    \"candidatesTokenCount\": 54,\n    \"totalTokenCount\": 524,\n    \"promptTokensDetails\":
+        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 435\n
+        \     }\n    ],\n    \"thoughtsTokenCount\": 35\n  },\n  \"modelVersion\":
+        \"gemini-2.5-flash\",\n  \"responseId\": \"bFKWaZOZCqCvjMcPvvGNgAc\"\n}\n"
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Wed, 18 Feb 2026 23:59:41 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=967
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      X-Frame-Options:
+      - X-FRAME-OPTIONS-XXX
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestGeminiNativeToolCalling.test_gemini_parallel_native_tool_calling_test_crew.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestGeminiNativeToolCalling.test_gemini_parallel_native_tool_calling_test_crew.yaml
@@ -0,0 +1,192 @@
+interactions:
+- request:
+    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}], "role": "user"}], "systemInstruction":
+      {"parts": [{"text": "You are Parallel Tool Agent. You follow tool instructions
+      precisely.\nYour personal goal is: Use both tools exactly as instructed"}],
+      "role": "user"}, "tools": [{"functionDeclarations": [{"description": "Local
+      search tool #1 for concurrency testing.", "name": "parallel_local_search_one",
+      "parameters_json_schema": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}, {"description": "Local search tool #2 for concurrency
+      testing.", "name": "parallel_local_search_two", "parameters_json_schema": {"properties":
+      {"query": {"description": "Search query", "title": "Query", "type": "string"}},
+      "required": ["query"], "type": "object", "additionalProperties": false}}, {"description":
+      "Local search tool #3 for concurrency testing.", "name": "parallel_local_search_three",
+      "parameters_json_schema": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}]}], "generationConfig": {"stopSequences": ["\nObservation:"]}}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - '*/*'
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '1964'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      x-goog-api-client:
+      - google-genai-sdk/1.49.0 gl-python/3.13.3
+      x-goog-api-key:
+      - X-GOOG-API-KEY-XXX
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
+  response:
+    body:
+      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
+        [\n          {\n            \"functionCall\": {\n              \"name\": \"parallel_local_search_one\",\n
+        \             \"args\": {\n                \"query\": \"latest OpenAI model
+        release notes\"\n              }\n            },\n            \"thoughtSignature\":
+        \"CuMEAb4+9vu1V1iOC9o/a8+jQqow8F4RTrjlnjnDCwsisMHLLJ+Wj3pZxbFDeIjCJe9pa6+14InyYHh/ezgHrv+xPGIJtX9pJQatDCBAfCmcZ3fDipVIMAHLcl0Q660EVuZ+vRgvNhPSau+uSN9u303wJsaKvdzOQnfww2LfLtJMNtOhSHfkfhfw2bkBOtMa5/FuLqKSr6m94dSdE7HShR6+jLMLbiSXkBLWsRp0jGl85Wvd0hoA7dUyq+uIuyOBr5Myo9uMrLbxfnrRRbPMorOpYTCmHK0HE8mEBRjzh1hNwcBcfRL0VcgA2UnBIurStIeVbq51BJQ1UOq6r1wVi50Wdh1GjIQ/iN9C15T1Ql3adjom5QbmY+XY08RJOiNyVplh1YQ0qlWCVHEpueEfdzcIB+BUauVrLNqBcBr5g6ekO5QZCAdt7PLerQU8jhKjDQy367jCKQyaHir0GmAISS8RlZ8tkLKNZlZhd11D76ui6X8ep9yznViBbqH0AS1R2hMm+ielMVFjhidglTMjqB0X+yk1K2eZXkc+R/xsXRPlnlZWRygnV+IbU8RAnZWtneM464Wccmc1scfF45GKiji5bLYO7Zx+ZF8mSLcQaC8M3z121D6VbFonhaIdkJ3Wb7nI2vEyxFjdinVk3/P0zL8nu3nHeqQviTrQIoHMsZk0yPyqu9NWxg3wGJL5pbcaQh87ROQuTsInkuzzEr0QMzjw9W5iquhMh4/Wy/OKXAgf3maQB9Jb4HoHZlc0io+KYqewFSVx2BvqXbqJbIrTkTo6XRTbK7dkwlCbMmE1wKIwjrrzZQI=\"\n
+        \         },\n          {\n            \"functionCall\": {\n              \"name\":
+        \"parallel_local_search_two\",\n              \"args\": {\n                \"query\":
+        \"latest Anthropic model release notes\"\n              }\n            }\n
+        \         },\n          {\n            \"functionCall\": {\n              \"name\":
+        \"parallel_local_search_three\",\n              \"args\": {\n                \"query\":
+        \"latest Gemini model release notes\"\n              }\n            }\n          }\n
+        \       ],\n        \"role\": \"model\"\n      },\n      \"finishReason\":
+        \"STOP\",\n      \"index\": 0,\n      \"finishMessage\": \"Model generated
+        function call(s).\"\n    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\":
+        327,\n    \"candidatesTokenCount\": 70,\n    \"totalTokenCount\": 536,\n    \"promptTokensDetails\":
+        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 327\n
+        \     }\n    ],\n    \"thoughtsTokenCount\": 139\n  },\n  \"modelVersion\":
+        \"gemini-2.5-flash\",\n  \"responseId\": \"ZVKWabziF7bcjMcP3r2SuAg\"\n}\n"
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Wed, 18 Feb 2026 23:59:34 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=1262
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      X-Frame-Options:
+      - X-FRAME-OPTIONS-XXX
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}], "role": "user"}, {"parts": [{"functionCall":
+      {"args": {"query": "latest OpenAI model release notes"}, "name": "parallel_local_search_one"}},
+      {"functionCall": {"args": {"query": "latest Anthropic model release notes"},
+      "name": "parallel_local_search_two"}}, {"functionCall": {"args": {"query": "latest
+      Gemini model release notes"}, "name": "parallel_local_search_three"}}], "role":
+      "model"}, {"parts": [{"functionResponse": {"name": "parallel_local_search_one",
+      "response": {"result": "[one] latest OpenAI model release notes"}}}], "role":
+      "user"}, {"parts": [{"functionResponse": {"name": "parallel_local_search_two",
+      "response": {"result": "[two] latest Anthropic model release notes"}}}], "role":
+      "user"}, {"parts": [{"functionResponse": {"name": "parallel_local_search_three",
+      "response": {"result": "[three] latest Gemini model release notes"}}}], "role":
+      "user"}, {"parts": [{"text": "Analyze the tool result. If requirements are met,
+      provide the Final Answer. Otherwise, call the next tool. Deliver only the answer
+      without meta-commentary."}], "role": "user"}], "systemInstruction": {"parts":
+      [{"text": "You are Parallel Tool Agent. You follow tool instructions precisely.\nYour
+      personal goal is: Use both tools exactly as instructed"}], "role": "user"},
+      "tools": [{"functionDeclarations": [{"description": "Local search tool #1 for
+      concurrency testing.", "name": "parallel_local_search_one", "parameters_json_schema":
+      {"properties": {"query": {"description": "Search query", "title": "Query", "type":
+      "string"}}, "required": ["query"], "type": "object", "additionalProperties":
+      false}}, {"description": "Local search tool #2 for concurrency testing.", "name":
+      "parallel_local_search_two", "parameters_json_schema": {"properties": {"query":
+      {"description": "Search query", "title": "Query", "type": "string"}}, "required":
+      ["query"], "type": "object", "additionalProperties": false}}, {"description":
+      "Local search tool #3 for concurrency testing.", "name": "parallel_local_search_three",
+      "parameters_json_schema": {"properties": {"query": {"description": "Search query",
+      "title": "Query", "type": "string"}}, "required": ["query"], "type": "object",
+      "additionalProperties": false}}]}], "generationConfig": {"stopSequences": ["\nObservation:"]}}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - '*/*'
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '3014'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      x-goog-api-client:
+      - google-genai-sdk/1.49.0 gl-python/3.13.3
+      x-goog-api-key:
+      - X-GOOG-API-KEY-XXX
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
+  response:
+    body:
+      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
+        [\n          {\n            \"text\": \"The search results indicate the latest
+        model release notes for OpenAI, Anthropic, and Gemini are: [one] latest OpenAI
+        model release notes[two] latest Anthropic model release notes[three] latest
+        Gemini model release notes.\",\n            \"thoughtSignature\": \"CsUPAb4+9vs4hkuatQAakl1FSHx5DIde9nHYobJdlWs2HEzES9gHn7uwjMIlFPTzJUbnZqxpAK93hqsCofdfGANr8dwK+/IbZAiMSikpAq2ZjEbWADjfalU3ke4LcQMh6TEYFVGz1QCinjne3jZx5jOVaL8YdAtjOYnBZWA6KqdvfKjD7+Ct/BLoEqvu4LW6kxhXQgcV+D3M1QxGlr1dxpajj4wyYFI9LXchE2vCdAMPYTkPQ4WPbS3xjz0jJb6qFAwwg+BY5kGemkWWVHsvq28t09pd7FEH0bod5cEpR65qEefpJfhHsXYqmOwHDkfNePYnYC+5qmn7kvkN+fhF41SoMRZahMZGDjIo+q6vvru3eXKmZiuLsrh8AqQIks/4S3sSuxt16ogYKE+LlFxml2ygXFPww59nRAtc+xK6VW8jB2vyv9Eo5cpnG9ZBv1dOznJnmj4AWA1ddMlp+yq8AdaboTSo5dysYMwFcSXS3kuU+xi92dC+7GqZZbDr5frvnc+MnSuzYwHhNjSQqvTo5DKGit53zDwlFJT74kLBXk36BOFQp4xlfs+BpKkw11bow6qQoTvC68D023ZHami+McO1WYBDoO5CrDoosU8fAYljqaGArBoMlssF4O7VKHEaEbEZnYCr0Wxo6XP/mtPIpHQE4OyCz/GAJSJtQv1hO7DNCMzpSpkLyuemB1SOZGl3mlLQhosh3TAGP0xgqmHpKccdCSWoXGWjO48VluFuV9E1FwW1Xi++XhMRcUaljJXPZaNVjGcAG1uAxeVkUMsY8tBvQ0vaumUK2jkzbyQTWeStEWwl1yKmklI8JDXske/k6tYJOyF+8t0mF7oCEqNHSNicj7TomihpPlVjNl1Mm4l5fvwlKtAPJwiKrchCunlZB3uGN1AR0h0Hvznffutc/lV/FWFbNgFAaNJZKRs40vMk1xmRZyH2rs+Ob2fZriQ3BSwzzNeiwDLXxm0m/ytOai+K9ObFuC/IEh5fJfvQbNeo3TmiCAMCZPNXMDtlOyLqQzzKwmMFH4c53Ol+kkTiuAKECNQR1dOCufAL0U5lzEUFRxFvOq67lp6xqG8m+WzCIkbnF8QyJHfujtXVMJACaevUkM7+kAVyTwETEKQsanp0tBwzV42ieChp/h7pivcC++cFXdSG5dvR94BgkHmtpC9+jfNH32RREPLuyWfU5aBXiOkxjRs9fDexAFjrkGjM18I+jqHZNeuUR20BKe2jFsU8xJS3Fa4eXabm/YPL1t8R5jr572Ch/r4bspFp8MQ5RcFo8Nn/HiBmW8uZ2BcLEY1RPWUBvxVhfvh/hNxaRKu21x8vGz72RoiNuOjNbeADYAaBJqBGLp0MALxZ/rnXPzDLQUt6Mv07fWHAZr5p3r/skleot25lr2Tcl4qJCPM4/cfs6U0x4CY26ktBiCs4bWKqSEV1Q05nf5kpxVOIRSTgxqFOj/rWIAF3uw7mvsuRKd3YXILV5OrvEoETdQvf7BdYPbQbIQYDf7DBKhf51O8RKQgcfl6mVQswamdJ+PyqLbozTkFCjXMKI0PwJdy8tfKfCeeEe0TbOXSfeTczKQkL8WyWkBg4tS81JnWAVzfVlNjbvo/fk+wv7FyfJJS1HJGlxZ0kUlWi1369rSlldYPoSqopuekOxtYnpYpz92y/jVLNQXE1IVLqWYh9o3gTwjeyaHG7fCaWF2QRGrCUvejT8eJjevhj/sgadjPVcEP5o7Zcw5yTBCgc0+FX1j5KpCmfZ/dVvT4iIX8bOkhxjHQ8ifOx39BMM4EObgCA+g+BFN+Ra7kOf4hJ6tPNhqvJa4E4fyISlVrRiBqSt59ZkuLyWuY9SYy0nvbklP30WDUHSAvcuEwVMSuT524afHISfO/+tSgE7JAKzEPSOoVO3Z5NS9kcAqHuBSe/LL4XJbCKF9Oggm9/gwdAulnBANd4ydQ/raTPE/QUu/CGqqGhBd+wo8x0Jg/BMZWkwhz0fEzsh+OjnrEkHv4QIqZ9v/j1Rv9uc+cDeK7eGi62okGLrPFX2pNQtsZRdUM9aBSlTBUVSdCDpkvieENzLnR257EDZy1EV2HxGRfOFZVVdaW1n8XvL73pcFoQ5XABpfYuigOS8i4S8g43Qfe77GosnuXR5rcJCrL03q3hptb97K5ysKFLgumsaaWo92MBhZYKvQ6SwStgyWRlb22uQGQJYsS8OTD/uVNiQzFjOMsR/l71c9RI1Eb7SQJT6WWvL1YhA7sQw/lQf8soLKfWshoky6mMrGopjRak8xHpJe5VWbqK8PK6iXDd403JrHICyh4M3FpEja3eX2V3SN6U+EgIWKIE8lE/iQZakhLtG2KL7nNQy/cksxzIh5ElQCe5NkrQZO0fai6ek8qwbmz07RVg2FknD7F2hvmxZBqoJSXhsFVn/9+fnkcsZekEtUevFmlQQNspPc63XgO0XmpTye9uM/BbTEsNEWeHSFZTEQLLx1l+pgwsYO3NlNSIUN24/GIR7JrZFG4fAoljkDKjhrYQzr1Fiy3t5G+CmadZ0TcjRQQdDw36ETlf7cizcrQc4FNtnx5rNWEaf54vUvlsd2DD19UIkzP9omITsiuNPPcUNq0A6v1TkgnSNYfhb26nxJIg34r8MmCAhWzB2eCy54gvOHDGLFAwfFZrQdvl\"\n
+        \         }\n        ],\n        \"role\": \"model\"\n      },\n      \"finishReason\":
+        \"STOP\",\n      \"index\": 0\n    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\":
+        504,\n    \"candidatesTokenCount\": 45,\n    \"totalTokenCount\": 973,\n    \"promptTokensDetails\":
+        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 504\n
+        \     }\n    ],\n    \"thoughtsTokenCount\": 424\n  },\n  \"modelVersion\":
+        \"gemini-2.5-flash\",\n  \"responseId\": \"Z1KWaYbTKZvnjMcP7piEoAg\"\n}\n"
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Wed, 18 Feb 2026 23:59:37 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=2283
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      X-Frame-Options:
+      - X-FRAME-OPTIONS-XXX
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_agent_with_native_tool_calling.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_agent_with_native_tool_calling.yaml
@@ -5,9 +5,9 @@ interactions:
      calculations"},{"role":"user","content":"\nCurrent Task: Calculate what is 15
      * 8\n\nThis is the expected criteria for your final answer: The result of the
      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"calculator","description":"Perform
-      mathematical calculations. Use this for any math operations.","parameters":{"properties":{"expression":{"description":"Mathematical
-      expression to evaluate","title":"Expression","type":"string"}},"required":["expression"],"type":"object"}}}]}'
+      not a summary."}],"model":"gpt-5-nano","tool_choice":"auto","tools":[{"type":"function","function":{"name":"calculator","description":"Perform
+      mathematical calculations. Use this for any math operations.","strict":true,"parameters":{"properties":{"expression":{"description":"Mathematical
+      expression to evaluate","title":"Expression","type":"string"}},"required":["expression"],"type":"object","additionalProperties":false}}}]}'
    headers:
      User-Agent:
      - X-USER-AGENT-XXX
@@ -20,7 +20,7 @@ interactions:
      connection:
      - keep-alive
      content-length:
-      - '829'
+      - '813'
      content-type:
      - application/json
      host:
@@ -47,140 +47,17 @@ interactions:
    uri: https://api.openai.com/v1/chat/completions
  response:
    body:
-      string: "{\n  \"id\": \"chatcmpl-D0vm7joOuDBPcMpfmOnftOoTCPtc8\",\n  \"object\":
-        \"chat.completion\",\n  \"created\": 1769114459,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
-        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
-        \"assistant\",\n        \"content\": null,\n        \"tool_calls\": [\n          {\n
-        \           \"id\": \"call_G73UZDvL4wC9EEdvm1UcRIRM\",\n            \"type\":
-        \"function\",\n            \"function\": {\n              \"name\": \"calculator\",\n
-        \             \"arguments\": \"{\\\"expression\\\":\\\"15 * 8\\\"}\"\n            }\n
-        \         }\n        ],\n        \"refusal\": null,\n        \"annotations\":
-        []\n      },\n      \"logprobs\": null,\n      \"finish_reason\": \"tool_calls\"\n
-        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 137,\n    \"completion_tokens\":
-        17,\n    \"total_tokens\": 154,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
-        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
-        {\n      \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
-        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
-        \"default\",\n  \"system_fingerprint\": \"fp_c4585b5b9c\"\n}\n"
-    headers:
-      CF-RAY:
-      - CF-RAY-XXX
-      Connection:
-      - keep-alive
-      Content-Type:
-      - application/json
-      Date:
-      - Thu, 22 Jan 2026 20:40:59 GMT
-      Server:
-      - cloudflare
-      Set-Cookie:
-      - SET-COOKIE-XXX
-      Strict-Transport-Security:
-      - STS-XXX
-      Transfer-Encoding:
-      - chunked
-      X-Content-Type-Options:
-      - X-CONTENT-TYPE-XXX
-      access-control-expose-headers:
-      - ACCESS-CONTROL-XXX
-      alt-svc:
-      - h3=":443"; ma=86400
-      cf-cache-status:
-      - DYNAMIC
-      openai-organization:
-      - OPENAI-ORG-XXX
-      openai-processing-ms:
-      - '761'
-      openai-project:
-      - OPENAI-PROJECT-XXX
-      openai-version:
-      - '2020-10-01'
-      x-envoy-upstream-service-time:
-      - '1080'
-      x-openai-proxy-wasm:
-      - v0.1
-      x-ratelimit-limit-requests:
-      - X-RATELIMIT-LIMIT-REQUESTS-XXX
-      x-ratelimit-limit-tokens:
-      - X-RATELIMIT-LIMIT-TOKENS-XXX
-      x-ratelimit-remaining-requests:
-      - X-RATELIMIT-REMAINING-REQUESTS-XXX
-      x-ratelimit-remaining-tokens:
-      - X-RATELIMIT-REMAINING-TOKENS-XXX
-      x-ratelimit-reset-requests:
-      - X-RATELIMIT-RESET-REQUESTS-XXX
-      x-ratelimit-reset-tokens:
-      - X-RATELIMIT-RESET-TOKENS-XXX
-      x-request-id:
-      - X-REQUEST-ID-XXX
-    status:
-      code: 200
-      message: OK
- request:
-    body: '{"messages":[{"role":"system","content":"You are Math Assistant. You are
-      a helpful math assistant.\nYour personal goal is: Help users with mathematical
-      calculations"},{"role":"user","content":"\nCurrent Task: Calculate what is 15
-      * 8\n\nThis is the expected criteria for your final answer: The result of the
-      calculation\nyou MUST return the actual complete content as the final answer,
-      not a summary.\n\nThis is VERY important to you, your job depends on it!"},{"role":"assistant","content":null,"tool_calls":[{"id":"call_G73UZDvL4wC9EEdvm1UcRIRM","type":"function","function":{"name":"calculator","arguments":"{\"expression\":\"15
-      * 8\"}"}}]},{"role":"tool","tool_call_id":"call_G73UZDvL4wC9EEdvm1UcRIRM","content":"The
-      result of 15 * 8 is 120"},{"role":"user","content":"Analyze the tool result.
-      If requirements are met, provide the Final Answer. Otherwise, call the next
-      tool. Deliver only the answer without meta-commentary."}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"calculator","description":"Perform
-      mathematical calculations. Use this for any math operations.","parameters":{"properties":{"expression":{"description":"Mathematical
-      expression to evaluate","title":"Expression","type":"string"}},"required":["expression"],"type":"object"}}}]}'
-    headers:
-      User-Agent:
-      - X-USER-AGENT-XXX
-      accept:
-      - application/json
-      accept-encoding:
-      - ACCEPT-ENCODING-XXX
-      authorization:
-      - AUTHORIZATION-XXX
-      connection:
-      - keep-alive
-      content-length:
-      - '1299'
-      content-type:
-      - application/json
-      cookie:
-      - COOKIE-XXX
-      host:
-      - api.openai.com
-      x-stainless-arch:
-      - X-STAINLESS-ARCH-XXX
-      x-stainless-async:
-      - 'false'
-      x-stainless-lang:
-      - python
-      x-stainless-os:
-      - X-STAINLESS-OS-XXX
-      x-stainless-package-version:
-      - 1.83.0
-      x-stainless-read-timeout:
-      - X-STAINLESS-READ-TIMEOUT-XXX
-      x-stainless-retry-count:
-      - '0'
-      x-stainless-runtime:
-      - CPython
-      x-stainless-runtime-version:
-      - 3.13.3
-    method: POST
-    uri: https://api.openai.com/v1/chat/completions
-  response:
-    body:
-      string: "{\n  \"id\": \"chatcmpl-D0vm8mUnzLxu9pf1rc7MODkrMsCmf\",\n  \"object\":
-        \"chat.completion\",\n  \"created\": 1769114460,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
+      string: "{\n  \"id\": \"chatcmpl-DAlG9W2mJYuOgpf3FwCRgbqaiHWf3\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771457317,\n  \"model\": \"gpt-5-nano-2025-08-07\",\n
        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
        \"assistant\",\n        \"content\": \"120\",\n        \"refusal\": null,\n
-        \       \"annotations\": []\n      },\n      \"logprobs\": null,\n      \"finish_reason\":
-        \"stop\"\n    }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 207,\n    \"completion_tokens\":
-        2,\n    \"total_tokens\": 209,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+        \       \"annotations\": []\n      },\n      \"finish_reason\": \"stop\"\n
+        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 208,\n    \"completion_tokens\":
+        138,\n    \"total_tokens\": 346,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
-        {\n      \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        {\n      \"reasoning_tokens\": 128,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
-        \"default\",\n  \"system_fingerprint\": \"fp_c4585b5b9c\"\n}\n"
+        \"default\",\n  \"system_fingerprint\": null\n}\n"
    headers:
      CF-RAY:
      - CF-RAY-XXX
@@ -189,7 +66,7 @@ interactions:
      Content-Type:
      - application/json
      Date:
-      - Thu, 22 Jan 2026 20:41:00 GMT
+      - Wed, 18 Feb 2026 23:28:39 GMT
      Server:
      - cloudflare
      Strict-Transport-Security:
@@ -207,13 +84,13 @@ interactions:
      openai-organization:
      - OPENAI-ORG-XXX
      openai-processing-ms:
-      - '262'
+      - '1869'
      openai-project:
      - OPENAI-PROJECT-XXX
      openai-version:
      - '2020-10-01'
-      x-envoy-upstream-service-time:
-      - '496'
+      set-cookie:
+      - SET-COOKIE-XXX
      x-openai-proxy-wasm:
      - v0.1
      x-ratelimit-limit-requests:
--- a/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_test_agent_kickoff.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_test_agent_kickoff.yaml
@@ -0,0 +1,265 @@
+interactions:
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '1733'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DAldZHfQGVcV3FNwAJAtNooU3PAU7\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771458769,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": null,\n        \"tool_calls\": [\n          {\n
+        \           \"id\": \"call_kz1qLLRsugXwWiQMeH9oFAep\",\n            \"type\":
+        \"function\",\n            \"function\": {\n              \"name\": \"parallel_local_search_one\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest OpenAI model release
+        notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_yNouGq1Kv6P5W9fhTng6acZi\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_two\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Anthropic model
+        release notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_O7MqnuniDmyT6a0BS31GTunB\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_three\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Gemini model release
+        notes\\\"}\"\n            }\n          }\n        ],\n        \"refusal\":
+        null,\n        \"annotations\": []\n      },\n      \"logprobs\": null,\n
+        \     \"finish_reason\": \"tool_calls\"\n    }\n  ],\n  \"usage\": {\n    \"prompt_tokens\":
+        259,\n    \"completion_tokens\": 78,\n    \"total_tokens\": 337,\n    \"prompt_tokens_details\":
+        {\n      \"cached_tokens\": 0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
+        {\n      \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": \"fp_414ba99a04\"\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:52:50 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '1418'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      set-cookie:
+      - SET-COOKIE-XXX
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."},{"role":"assistant","content":null,"tool_calls":[{"id":"call_kz1qLLRsugXwWiQMeH9oFAep","type":"function","function":{"name":"parallel_local_search_one","arguments":"{\"query\":
+      \"latest OpenAI model release notes\"}"}},{"id":"call_yNouGq1Kv6P5W9fhTng6acZi","type":"function","function":{"name":"parallel_local_search_two","arguments":"{\"query\":
+      \"latest Anthropic model release notes\"}"}},{"id":"call_O7MqnuniDmyT6a0BS31GTunB","type":"function","function":{"name":"parallel_local_search_three","arguments":"{\"query\":
+      \"latest Gemini model release notes\"}"}}]},{"role":"tool","tool_call_id":"call_kz1qLLRsugXwWiQMeH9oFAep","name":"parallel_local_search_one","content":"[one]
+      latest OpenAI model release notes"},{"role":"tool","tool_call_id":"call_yNouGq1Kv6P5W9fhTng6acZi","name":"parallel_local_search_two","content":"[two]
+      latest Anthropic model release notes"},{"role":"tool","tool_call_id":"call_O7MqnuniDmyT6a0BS31GTunB","name":"parallel_local_search_three","content":"[three]
+      latest Gemini model release notes"}],"model":"gpt-4o-mini","tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '2756'
+      content-type:
+      - application/json
+      cookie:
+      - COOKIE-XXX
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DAldbawkFNpOeXbaJTkTlsSi7OiII\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771458771,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": \"The latest release notes for OpenAI,
+        Anthropic, and Gemini models highlight significant updates and improvements
+        in each respective technology. OpenAI's notes detail new features and optimizations
+        that enhance user interaction and performance. Anthropic's release emphasizes
+        their focus on safety and alignment in AI development, showcasing advancements
+        in responsible AI practices. Gemini's notes underline their innovative approaches
+        and cutting-edge functionalities designed to push the boundaries of current
+        AI capabilities.\",\n        \"refusal\": null,\n        \"annotations\":
+        []\n      },\n      \"logprobs\": null,\n      \"finish_reason\": \"stop\"\n
+        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 377,\n    \"completion_tokens\":
+        85,\n    \"total_tokens\": 462,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
+        {\n      \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": \"fp_414ba99a04\"\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:52:53 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '1755'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_test_crew.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_test_crew.yaml
@@ -0,0 +1,265 @@
+interactions:
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}],"model":"gpt-5-nano","temperature":1,"tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '1929'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DAlddfEozIpgleBufPaffZMQWK0Hj\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771458773,\n  \"model\": \"gpt-5-nano-2025-08-07\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": null,\n        \"tool_calls\": [\n          {\n
+        \           \"id\": \"call_Putc2jV5GhiIZMwx8mDcI61Q\",\n            \"type\":
+        \"function\",\n            \"function\": {\n              \"name\": \"parallel_local_search_one\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest OpenAI model release
+        notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_iyjwcvkL3PdoOddxsqkHCT9T\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_two\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Anthropic model
+        release notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_G728RseEU7SbGk5YTiyyp9IH\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_three\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Gemini model release
+        notes\\\"}\"\n            }\n          }\n        ],\n        \"refusal\":
+        null,\n        \"annotations\": []\n      },\n      \"finish_reason\": \"tool_calls\"\n
+        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 378,\n    \"completion_tokens\":
+        1497,\n    \"total_tokens\": 1875,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
+        {\n      \"reasoning_tokens\": 1408,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": null\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:53:08 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '14853'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      set-cookie:
+      - SET-COOKIE-XXX
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."},{"role":"assistant","content":null,"tool_calls":[{"id":"call_Putc2jV5GhiIZMwx8mDcI61Q","type":"function","function":{"name":"parallel_local_search_one","arguments":"{\"query\":
+      \"latest OpenAI model release notes\"}"}},{"id":"call_iyjwcvkL3PdoOddxsqkHCT9T","type":"function","function":{"name":"parallel_local_search_two","arguments":"{\"query\":
+      \"latest Anthropic model release notes\"}"}},{"id":"call_G728RseEU7SbGk5YTiyyp9IH","type":"function","function":{"name":"parallel_local_search_three","arguments":"{\"query\":
+      \"latest Gemini model release notes\"}"}}]},{"role":"tool","tool_call_id":"call_Putc2jV5GhiIZMwx8mDcI61Q","name":"parallel_local_search_one","content":"[one]
+      latest OpenAI model release notes"},{"role":"tool","tool_call_id":"call_iyjwcvkL3PdoOddxsqkHCT9T","name":"parallel_local_search_two","content":"[two]
+      latest Anthropic model release notes"},{"role":"tool","tool_call_id":"call_G728RseEU7SbGk5YTiyyp9IH","name":"parallel_local_search_three","content":"[three]
+      latest Gemini model release notes"},{"role":"user","content":"Analyze the tool
+      result. If requirements are met, provide the Final Answer. Otherwise, call the
+      next tool. Deliver only the answer without meta-commentary."}],"model":"gpt-5-nano","temperature":1,"tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '3136'
+      content-type:
+      - application/json
+      cookie:
+      - COOKIE-XXX
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DAldt2BXNqiYYLPgInjHCpYKfk2VK\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771458789,\n  \"model\": \"gpt-5-nano-2025-08-07\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": \"The results show the latest model release
+        notes for OpenAI, Anthropic, and Gemini.\",\n        \"refusal\": null,\n
+        \       \"annotations\": []\n      },\n      \"finish_reason\": \"stop\"\n
+        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 537,\n    \"completion_tokens\":
+        2011,\n    \"total_tokens\": 2548,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
+        {\n      \"reasoning_tokens\": 1984,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": null\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Wed, 18 Feb 2026 23:53:25 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '15368'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_tool_hook_parity_agent_kickoff.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_tool_hook_parity_agent_kickoff.yaml
@@ -0,0 +1,264 @@
+interactions:
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."}],"model":"gpt-5-nano","temperature":1,"tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '1748'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DB244zBgA66fzl8TNcIPRWoE4lDIQ\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771521916,\n  \"model\": \"gpt-5-nano-2025-08-07\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": null,\n        \"tool_calls\": [\n          {\n
+        \           \"id\": \"call_D2ojRWqkng6krQ51vWQEU8wR\",\n            \"type\":
+        \"function\",\n            \"function\": {\n              \"name\": \"parallel_local_search_one\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest OpenAI model release
+        notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_v1tpTKw1sYcI75SWG1LCkAC3\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_two\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Anthropic model
+        release notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_RrbyZClymnngoNLhlkQLLpwM\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_three\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Gemini model release
+        notes\\\"}\"\n            }\n          }\n        ],\n        \"refusal\":
+        null,\n        \"annotations\": []\n      },\n      \"finish_reason\": \"tool_calls\"\n
+        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 343,\n    \"completion_tokens\":
+        855,\n    \"total_tokens\": 1198,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
+        {\n      \"reasoning_tokens\": 768,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": null\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 17:25:23 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '6669'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      set-cookie:
+      - SET-COOKIE-XXX
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary."},{"role":"assistant","content":null,"tool_calls":[{"id":"call_D2ojRWqkng6krQ51vWQEU8wR","type":"function","function":{"name":"parallel_local_search_one","arguments":"{\"query\":
+      \"latest OpenAI model release notes\"}"}},{"id":"call_v1tpTKw1sYcI75SWG1LCkAC3","type":"function","function":{"name":"parallel_local_search_two","arguments":"{\"query\":
+      \"latest Anthropic model release notes\"}"}},{"id":"call_RrbyZClymnngoNLhlkQLLpwM","type":"function","function":{"name":"parallel_local_search_three","arguments":"{\"query\":
+      \"latest Gemini model release notes\"}"}}]},{"role":"tool","tool_call_id":"call_D2ojRWqkng6krQ51vWQEU8wR","name":"parallel_local_search_one","content":"[one]
+      latest OpenAI model release notes"},{"role":"tool","tool_call_id":"call_v1tpTKw1sYcI75SWG1LCkAC3","name":"parallel_local_search_two","content":"[two]
+      latest Anthropic model release notes"},{"role":"tool","tool_call_id":"call_RrbyZClymnngoNLhlkQLLpwM","name":"parallel_local_search_three","content":"[three]
+      latest Gemini model release notes"}],"model":"gpt-5-nano","temperature":1,"tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '2771'
+      content-type:
+      - application/json
+      cookie:
+      - COOKIE-XXX
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DB24DjyYsIHiQJ7hHXob8tQFfeXBs\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771521925,\n  \"model\": \"gpt-5-nano-2025-08-07\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": \"The three latest release-note references
+        retrieved encompass OpenAI, Anthropic, and Gemini, indicating that all three
+        major model families are actively updating their offerings. These notes typically
+        cover improvements to capabilities, safety measures, performance enhancements,
+        and any new APIs or features, suggesting a trend of ongoing refinement across
+        providers. If you\u2019d like, I can pull the full release notes or extract
+        and compare the key changes across the three sources.\",\n        \"refusal\":
+        null,\n        \"annotations\": []\n      },\n      \"finish_reason\": \"stop\"\n
+        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 467,\n    \"completion_tokens\":
+        1437,\n    \"total_tokens\": 1904,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
+        {\n      \"reasoning_tokens\": 1344,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": null\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 17:25:35 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '10369'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_tool_hook_parity_crew.yaml
+++ b/lib/crewai/tests/cassettes/agents/TestOpenAINativeToolCalling.test_openai_parallel_native_tool_calling_tool_hook_parity_crew.yaml
@@ -0,0 +1,339 @@
+interactions:
+- request:
+    body: '{"trace_id": "e456cc10-ce7b-4e68-a2cc-ddb806a2e7b9", "execution_type":
+      "crew", "user_identifier": null, "execution_context": {"crew_fingerprint": null,
+      "crew_name": "crew", "flow_name": null, "crewai_version": "1.9.3", "privacy_level":
+      "standard"}, "execution_metadata": {"expected_duration_estimate": 300, "agent_count":
+      0, "task_count": 0, "flow_method_count": 0, "execution_started_at": "2026-02-19T17:24:41.723158+00:00"},
+      "ephemeral_trace_id": "e456cc10-ce7b-4e68-a2cc-ddb806a2e7b9"}'
+    headers:
+      Accept:
+      - '*/*'
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '488'
+      Content-Type:
+      - application/json
+      User-Agent:
+      - X-USER-AGENT-XXX
+      X-Crewai-Organization-Id:
+      - 3433f0ee-8a94-4aa4-822b-2ac71aa38b18
+      X-Crewai-Version:
+      - 1.9.3
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+    method: POST
+    uri: https://app.crewai.com/crewai_plus/api/v1/tracing/ephemeral/batches
+  response:
+    body:
+      string: '{"id":"a78f2aca-0525-47c7-8f37-b3fca0ad6672","ephemeral_trace_id":"e456cc10-ce7b-4e68-a2cc-ddb806a2e7b9","execution_type":"crew","crew_name":"crew","flow_name":null,"status":"running","duration_ms":null,"crewai_version":"1.9.3","total_events":0,"execution_context":{"crew_fingerprint":null,"crew_name":"crew","flow_name":null,"crewai_version":"1.9.3","privacy_level":"standard"},"created_at":"2026-02-19T17:24:41.989Z","updated_at":"2026-02-19T17:24:41.989Z","access_code":"TRACE-bd80d6be74","user_identifier":null}'
+    headers:
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '515'
+      Content-Type:
+      - application/json; charset=utf-8
+      Date:
+      - Thu, 19 Feb 2026 17:24:41 GMT
+      cache-control:
+      - no-store
+      content-security-policy:
+      - CSP-FILTERED
+      etag:
+      - ETAG-XXX
+      expires:
+      - '0'
+      permissions-policy:
+      - PERMISSIONS-POLICY-XXX
+      pragma:
+      - no-cache
+      referrer-policy:
+      - REFERRER-POLICY-XXX
+      strict-transport-security:
+      - STS-XXX
+      vary:
+      - Accept
+      x-content-type-options:
+      - X-CONTENT-TYPE-XXX
+      x-frame-options:
+      - X-FRAME-OPTIONS-XXX
+      x-permitted-cross-domain-policies:
+      - X-PERMITTED-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+      x-runtime:
+      - X-RUNTIME-XXX
+      x-xss-protection:
+      - X-XSS-PROTECTION-XXX
+    status:
+      code: 201
+      message: Created
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."}],"model":"gpt-5-nano","temperature":1,"tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '1929'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DB23W8RBF6zlxweiHYGb6maVfyctt\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771521882,\n  \"model\": \"gpt-5-nano-2025-08-07\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": null,\n        \"tool_calls\": [\n          {\n
+        \           \"id\": \"call_sge1FXUkpmPEDe8nTOgn0tQG\",\n            \"type\":
+        \"function\",\n            \"function\": {\n              \"name\": \"parallel_local_search_one\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest OpenAI model release
+        notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_z5jRPH4DQ7Wp3HdDUlZe8gGh\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_two\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Anthropic model
+        release notes\\\"}\"\n            }\n          },\n          {\n            \"id\":
+        \"call_DNlgqnadODDsyQkSuLcXZCX2\",\n            \"type\": \"function\",\n
+        \           \"function\": {\n              \"name\": \"parallel_local_search_three\",\n
+        \             \"arguments\": \"{\\\"query\\\": \\\"latest Gemini model release
+        notes\\\"}\"\n            }\n          }\n        ],\n        \"refusal\":
+        null,\n        \"annotations\": []\n      },\n      \"finish_reason\": \"tool_calls\"\n
+        \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 378,\n    \"completion_tokens\":
+        2456,\n    \"total_tokens\": 2834,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+        0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
+        {\n      \"reasoning_tokens\": 2368,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+        0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": null\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 17:25:02 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '19582'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      set-cookie:
+      - SET-COOKIE-XXX
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"messages":[{"role":"system","content":"You are Parallel Tool Agent. You
+      follow tool instructions precisely.\nYour personal goal is: Use both tools exactly
+      as instructed"},{"role":"user","content":"\nCurrent Task: This is a tool-calling
+      compliance test. In your next assistant turn, emit exactly 3 tool calls in the
+      same response (parallel tool calls), in this order: 1) parallel_local_search_one(query=''latest
+      OpenAI model release notes''), 2) parallel_local_search_two(query=''latest Anthropic
+      model release notes''), 3) parallel_local_search_three(query=''latest Gemini
+      model release notes''). Do not call any other tools and do not answer before
+      those 3 tool calls are emitted. After the tool results return, provide a one
+      paragraph summary.\n\nThis is the expected criteria for your final answer: A
+      one sentence summary of both tool outputs\nyou MUST return the actual complete
+      content as the final answer, not a summary."},{"role":"assistant","content":null,"tool_calls":[{"id":"call_sge1FXUkpmPEDe8nTOgn0tQG","type":"function","function":{"name":"parallel_local_search_one","arguments":"{\"query\":
+      \"latest OpenAI model release notes\"}"}},{"id":"call_z5jRPH4DQ7Wp3HdDUlZe8gGh","type":"function","function":{"name":"parallel_local_search_two","arguments":"{\"query\":
+      \"latest Anthropic model release notes\"}"}},{"id":"call_DNlgqnadODDsyQkSuLcXZCX2","type":"function","function":{"name":"parallel_local_search_three","arguments":"{\"query\":
+      \"latest Gemini model release notes\"}"}}]},{"role":"tool","tool_call_id":"call_sge1FXUkpmPEDe8nTOgn0tQG","name":"parallel_local_search_one","content":"[one]
+      latest OpenAI model release notes"},{"role":"tool","tool_call_id":"call_z5jRPH4DQ7Wp3HdDUlZe8gGh","name":"parallel_local_search_two","content":"[two]
+      latest Anthropic model release notes"},{"role":"tool","tool_call_id":"call_DNlgqnadODDsyQkSuLcXZCX2","name":"parallel_local_search_three","content":"[three]
+      latest Gemini model release notes"},{"role":"user","content":"Analyze the tool
+      result. If requirements are met, provide the Final Answer. Otherwise, call the
+      next tool. Deliver only the answer without meta-commentary."}],"model":"gpt-5-nano","temperature":1,"tool_choice":"auto","tools":[{"type":"function","function":{"name":"parallel_local_search_one","description":"Local
+      search tool #1 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_two","description":"Local
+      search tool #2 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}},{"type":"function","function":{"name":"parallel_local_search_three","description":"Local
+      search tool #3 for concurrency testing.","strict":true,"parameters":{"properties":{"query":{"description":"Search
+      query","title":"Query","type":"string"}},"required":["query"],"type":"object","additionalProperties":false}}}]}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - application/json
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      authorization:
+      - AUTHORIZATION-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '3136'
+      content-type:
+      - application/json
+      cookie:
+      - COOKIE-XXX
+      host:
+      - api.openai.com
+      x-stainless-arch:
+      - X-STAINLESS-ARCH-XXX
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - X-STAINLESS-OS-XXX
+      x-stainless-package-version:
+      - 1.83.0
+      x-stainless-read-timeout:
+      - X-STAINLESS-READ-TIMEOUT-XXX
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.13.3
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: "{\n  \"id\": \"chatcmpl-DB23sY0Ahpd1yAgLZ882KkA50Zljx\",\n  \"object\":
+        \"chat.completion\",\n  \"created\": 1771521904,\n  \"model\": \"gpt-5-nano-2025-08-07\",\n
+        \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+        \"assistant\",\n        \"content\": \"Results returned three items: the latest
+        OpenAI model release notes, the latest Anthropic model release notes, and
+        the latest Gemini model release notes.\",\n        \"refusal\": null,\n        \"annotations\":
+        []\n      },\n      \"finish_reason\": \"stop\"\n    }\n  ],\n  \"usage\":
+        {\n    \"prompt_tokens\": 537,\n    \"completion_tokens\": 1383,\n    \"total_tokens\":
+        1920,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\": 0,\n      \"audio_tokens\":
+        0\n    },\n    \"completion_tokens_details\": {\n      \"reasoning_tokens\":
+        1344,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\": 0,\n
+        \     \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+        \"default\",\n  \"system_fingerprint\": null\n}\n"
+    headers:
+      CF-RAY:
+      - CF-RAY-XXX
+      Connection:
+      - keep-alive
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 19 Feb 2026 17:25:16 GMT
+      Server:
+      - cloudflare
+      Strict-Transport-Security:
+      - STS-XXX
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      access-control-expose-headers:
+      - ACCESS-CONTROL-XXX
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - OPENAI-ORG-XXX
+      openai-processing-ms:
+      - '12339'
+      openai-project:
+      - OPENAI-PROJECT-XXX
+      openai-version:
+      - '2020-10-01'
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-requests:
+      - X-RATELIMIT-LIMIT-REQUESTS-XXX
+      x-ratelimit-limit-tokens:
+      - X-RATELIMIT-LIMIT-TOKENS-XXX
+      x-ratelimit-remaining-requests:
+      - X-RATELIMIT-REMAINING-REQUESTS-XXX
+      x-ratelimit-remaining-tokens:
+      - X-RATELIMIT-REMAINING-TOKENS-XXX
+      x-ratelimit-reset-requests:
+      - X-RATELIMIT-RESET-REQUESTS-XXX
+      x-ratelimit-reset-tokens:
+      - X-RATELIMIT-RESET-TOKENS-XXX
+      x-request-id:
+      - X-REQUEST-ID-XXX
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/lib/crewai/tests/cassettes/llms/google/test_gemini_crew_structured_output_with_tools.yaml
+++ b/lib/crewai/tests/cassettes/llms/google/test_gemini_crew_structured_output_with_tools.yaml
@@ -0,0 +1,197 @@
+interactions:
+- request:
+    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate 15 + 27 using
+      your add_numbers tool. Report the result.\n\nThis is the expected criteria for
+      your final answer: A structured calculation result\nyou MUST return the actual
+      complete content as the final answer, not a summary.\nFormat your final answer
+      according to the following OpenAPI schema: {\n  \"properties\": {\n    \"operation\":
+      {\n      \"description\": \"The mathematical operation performed\",\n      \"title\":
+      \"Operation\",\n      \"type\": \"string\"\n    },\n    \"result\": {\n      \"description\":
+      \"The result of the calculation\",\n      \"title\": \"Result\",\n      \"type\":
+      \"integer\"\n    },\n    \"explanation\": {\n      \"description\": \"Brief
+      explanation of the calculation\",\n      \"title\": \"Explanation\",\n      \"type\":
+      \"string\"\n    }\n  },\n  \"required\": [\n    \"operation\",\n    \"result\",\n    \"explanation\"\n  ],\n  \"title\":
+      \"CalculationResult\",\n  \"type\": \"object\",\n  \"additionalProperties\":
+      false\n}\n\nIMPORTANT: Preserve the original content exactly as-is. Do NOT rewrite,
+      paraphrase, or modify the meaning of the content. Only structure it to match
+      the schema format.\n\nDo not include the OpenAPI schema in the final output.
+      Ensure the final output does not include any code block markers like ```json
+      or ```python."}], "role": "user"}], "systemInstruction": {"parts": [{"text":
+      "You are Calculator. You are a calculator assistant that uses tools to compute
+      results.\nYour personal goal is: Perform calculations using available tools"}],
+      "role": "user"}, "tools": [{"functionDeclarations": [{"description": "Add two
+      numbers together and return the sum.", "name": "add_numbers", "parameters_json_schema":
+      {"properties": {"a": {"title": "A", "type": "integer"}, "b": {"title": "B",
+      "type": "integer"}}, "required": ["a", "b"], "type": "object", "additionalProperties":
+      false}}, {"description": "Use this tool to provide your final structured response.
+      Call this tool when you have gathered all necessary information and are ready
+      to provide the final answer in the required format.", "name": "structured_output",
+      "parameters_json_schema": {"properties": {"operation": {"description": "The
+      mathematical operation performed", "title": "Operation", "type": "string"},
+      "result": {"description": "The result of the calculation", "title": "Result",
+      "type": "integer"}, "explanation": {"description": "Brief explanation of the
+      calculation", "title": "Explanation", "type": "string"}}, "required": ["operation",
+      "result", "explanation"], "title": "CalculationResult", "type": "object", "additionalProperties":
+      false, "propertyOrdering": ["operation", "result", "explanation"]}}]}], "generationConfig":
+      {"stopSequences": ["\nObservation:"]}}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - '*/*'
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '2763'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      x-goog-api-client:
+      - google-genai-sdk/1.49.0 gl-python/3.13.12
+      x-goog-api-key:
+      - X-GOOG-API-KEY-XXX
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-001:generateContent
+  response:
+    body:
+      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
+        [\n          {\n            \"functionCall\": {\n              \"name\": \"add_numbers\",\n
+        \             \"args\": {\n                \"a\": 15,\n                \"b\":
+        27\n              }\n            }\n          }\n        ],\n        \"role\":
+        \"model\"\n      },\n      \"finishReason\": \"STOP\",\n      \"avgLogprobs\":
+        4.3579145442760951e-06\n    }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\":
+        377,\n    \"candidatesTokenCount\": 7,\n    \"totalTokenCount\": 384,\n    \"promptTokensDetails\":
+        [\n      {\n        \"modality\": \"TEXT\",\n        \"tokenCount\": 377\n
+        \     }\n    ],\n    \"candidatesTokensDetails\": [\n      {\n        \"modality\":
+        \"TEXT\",\n        \"tokenCount\": 7\n      }\n    ]\n  },\n  \"modelVersion\":
+        \"gemini-2.0-flash-001\",\n  \"responseId\": \"vVefaYDSOouXjMcPicLCsQY\"\n}\n"
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Wed, 25 Feb 2026 20:12:46 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=718
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      X-Frame-Options:
+      - X-FRAME-OPTIONS-XXX
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"contents": [{"parts": [{"text": "\nCurrent Task: Calculate 15 + 27 using
+      your add_numbers tool. Report the result.\n\nThis is the expected criteria for
+      your final answer: A structured calculation result\nyou MUST return the actual
+      complete content as the final answer, not a summary.\nFormat your final answer
+      according to the following OpenAPI schema: {\n  \"properties\": {\n    \"operation\":
+      {\n      \"description\": \"The mathematical operation performed\",\n      \"title\":
+      \"Operation\",\n      \"type\": \"string\"\n    },\n    \"result\": {\n      \"description\":
+      \"The result of the calculation\",\n      \"title\": \"Result\",\n      \"type\":
+      \"integer\"\n    },\n    \"explanation\": {\n      \"description\": \"Brief
+      explanation of the calculation\",\n      \"title\": \"Explanation\",\n      \"type\":
+      \"string\"\n    }\n  },\n  \"required\": [\n    \"operation\",\n    \"result\",\n    \"explanation\"\n  ],\n  \"title\":
+      \"CalculationResult\",\n  \"type\": \"object\",\n  \"additionalProperties\":
+      false\n}\n\nIMPORTANT: Preserve the original content exactly as-is. Do NOT rewrite,
+      paraphrase, or modify the meaning of the content. Only structure it to match
+      the schema format.\n\nDo not include the OpenAPI schema in the final output.
+      Ensure the final output does not include any code block markers like ```json
+      or ```python."}], "role": "user"}, {"parts": [{"functionCall": {"args": {"a":
+      15, "b": 27}, "name": "add_numbers"}}], "role": "model"}, {"parts": [{"functionResponse":
+      {"name": "add_numbers", "response": {"result": 42}}}], "role": "user"}, {"parts":
+      [{"text": "Analyze the tool result. If requirements are met, provide the Final
+      Answer. Otherwise, call the next tool. Deliver only the answer without meta-commentary."}],
+      "role": "user"}], "systemInstruction": {"parts": [{"text": "You are Calculator.
+      You are a calculator assistant that uses tools to compute results.\nYour personal
+      goal is: Perform calculations using available tools"}], "role": "user"}, "tools":
+      [{"functionDeclarations": [{"description": "Add two numbers together and return
+      the sum.", "name": "add_numbers", "parameters_json_schema": {"properties": {"a":
+      {"title": "A", "type": "integer"}, "b": {"title": "B", "type": "integer"}},
+      "required": ["a", "b"], "type": "object", "additionalProperties": false}}, {"description":
+      "Use this tool to provide your final structured response. Call this tool when
+      you have gathered all necessary information and are ready to provide the final
+      answer in the required format.", "name": "structured_output", "parameters_json_schema":
+      {"properties": {"operation": {"description": "The mathematical operation performed",
+      "title": "Operation", "type": "string"}, "result": {"description": "The result
+      of the calculation", "title": "Result", "type": "integer"}, "explanation": {"description":
+      "Brief explanation of the calculation", "title": "Explanation", "type": "string"}},
+      "required": ["operation", "result", "explanation"], "title": "CalculationResult",
+      "type": "object", "additionalProperties": false, "propertyOrdering": ["operation",
+      "result", "explanation"]}}]}], "generationConfig": {"stopSequences": ["\nObservation:"]}}'
+    headers:
+      User-Agent:
+      - X-USER-AGENT-XXX
+      accept:
+      - '*/*'
+      accept-encoding:
+      - ACCEPT-ENCODING-XXX
+      connection:
+      - keep-alive
+      content-length:
+      - '3166'
+      content-type:
+      - application/json
+      host:
+      - generativelanguage.googleapis.com
+      x-goog-api-client:
+      - google-genai-sdk/1.49.0 gl-python/3.13.12
+      x-goog-api-key:
+      - X-GOOG-API-KEY-XXX
+    method: POST
+    uri: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-001:generateContent
+  response:
+    body:
+      string: "{\n  \"candidates\": [\n    {\n      \"content\": {\n        \"parts\":
+        [\n          {\n            \"functionCall\": {\n              \"name\": \"structured_output\",\n
+        \             \"args\": {\n                \"result\": 42,\n                \"explanation\":
+        \"15 + 27 = 42\",\n                \"operation\": \"addition\"\n              }\n
+        \           }\n          }\n        ],\n        \"role\": \"model\"\n      },\n
+        \     \"finishReason\": \"STOP\",\n      \"avgLogprobs\": -0.07498827245500353\n
+        \   }\n  ],\n  \"usageMetadata\": {\n    \"promptTokenCount\": 421,\n    \"candidatesTokenCount\":
+        18,\n    \"totalTokenCount\": 439,\n    \"promptTokensDetails\": [\n      {\n
+        \       \"modality\": \"TEXT\",\n        \"tokenCount\": 421\n      }\n    ],\n
+        \   \"candidatesTokensDetails\": [\n      {\n        \"modality\": \"TEXT\",\n
+        \       \"tokenCount\": 18\n      }\n    ]\n  },\n  \"modelVersion\": \"gemini-2.0-flash-001\",\n
+        \ \"responseId\": \"vlefac7bJb6TjMcPzYWh0Ag\"\n}\n"
+    headers:
+      Alt-Svc:
+      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
+      Content-Type:
+      - application/json; charset=UTF-8
+      Date:
+      - Wed, 25 Feb 2026 20:12:47 GMT
+      Server:
+      - scaffolding on HTTPServer2
+      Server-Timing:
+      - gfet4t7; dur=774
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Origin
+      - X-Origin
+      - Referer
+      X-Content-Type-Options:
+      - X-CONTENT-TYPE-XXX
+      X-Frame-Options:
+      - X-FRAME-OPTIONS-XXX
+      X-XSS-Protection:
+      - '0'
+    status:
+      code: 200
+      message: OK
+version: 1
Author	SHA1	Message	Date
Greyson LaLonde	156b9d3285	ci: add PR size and title checks, configure commitizen Add two new GitHub Actions workflows: - pr-size.yml: auto-labels PRs by size and fails CI on PRs over 500 lines - pr-title.yml: enforces conventional commit format on PR titles Configure commitizen in pyproject.toml with strict schema pattern matching for conventional commits.	2026-02-27 12:43:55 -05:00
Greyson LaLonde	757a435ee3	chore: update changelog and version for v1.10.1a1	2026-02-27 09:58:48 -05:00
Greyson LaLonde	8bfdb188f7	feat: bump versions to 1.10.1a1	2026-02-27 09:44:47 -05:00
João Moura	1bdb9496a3	refactor: update step callback methods to support asynchronous invocation (#4633) * refactor: update step callback methods to support asynchronous invocation - Replaced synchronous step callback invocations with asynchronous counterparts in the CrewAgentExecutor class. - Introduced a new async method _ainvoke_step_callback to handle step callbacks in an async context, improving responsiveness and performance in asynchronous workflows. * chore: bump version to 1.10.1b1 across multiple files - Updated version strings from 1.10.1b to 1.10.1b1 in various project files including pyproject.toml and __init__.py files. - Adjusted dependency specifications to reflect the new version in relevant templates and modules.	2026-02-27 07:35:03 -03:00
Joao Moura	979aa26c3d	bump new alpha version	2026-02-27 01:43:33 -08:00
João Moura	514c082882	refactor: implement lazy loading for heavy dependencies in Memory module (#4632) - Introduced lazy imports for the Memory and EncodingFlow classes to optimize import time and reduce initial load, particularly beneficial for deployment scenarios like Celery pre-fork. - Updated the Memory class to include new configuration options for aggregation queries, enhancing its functionality. - Adjusted the __getattr__ method in both the crewai and memory modules to support lazy loading of specified attributes.	2026-02-27 03:20:02 -03:00
Greyson LaLonde	c9e8068578	docs: update changelog and version for v1.10.0	2026-02-26 19:14:25 -05:00
Greyson LaLonde	df2778f08b	fix: make branch for release notes	2026-02-26 18:49:13 -05:00
Greyson LaLonde	d8fea2518d	feat: bump versions to 1.10.0 * feat: bump versions to 1.10.0 * chore: update tool specifications --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>	2026-02-26 18:31:14 -05:00
Lucas Gomide	d259150d8d	Enhance MCP tool resolution and related events (#4580) * feat: enhance MCP tool resolution * feat: emit event when MCP configuration fails * feat: emit event when MCP tool execution has failed * style: resolve linter issues * refactor: use clear and natural mcp tool name resolution * test: fix broken tests * fix: resolve MCP connection leaks, slug validation, duplicate connections, and httpx exception handling --------- Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com> Co-authored-by: Greyson LaLonde <greyson@crewai.com>	2026-02-26 13:59:30 -08:00
Greyson LaLonde	c4a328c9d5	fix: validate tool kwargs even when empty to prevent cryptic TypeError (#4611)	2026-02-26 16:18:03 -05:00
Greyson LaLonde	373abbb6b7	fix: add dict overload to build_embedder and type default embedder	2026-02-26 16:04:28 -05:00
João Moura	86d3ee022d	feat: update lancedb version and add lance-namespace packages * chore(deps): update lancedb version and add lance-namespace packages - Updated lancedb dependency version from 0.4.0 to 0.29.2 in multiple files. - Added new packages: lance-namespace and lance-namespace-urllib3-client with version 0.5.2, including their dependencies and installation details. - Enhanced MemoryTUI to display a limit on entries and improved the LanceDBStorage class with automatic background compaction and index creation for better performance. * linter * refactor: update memory recall limit and formatting in Agent class - Reduced the memory recall limit from 10 to 5 in multiple locations within the Agent class. - Updated the memory formatting to use a new `format` method in the MemoryMatch class for improved readability and metadata inclusion. * refactor: enhance memory handling with read-only support - Updated memory-related classes and methods to support read-only functionality, allowing for silent no-ops when attempting to remember data in read-only mode. - Modified the LiteAgent and CrewAgentExecutorMixin classes to check for read-only status before saving memories. - Adjusted MemorySlice and Memory classes to reflect changes in behavior when read-only is enabled. - Updated tests to verify that memory operations behave correctly under read-only conditions. * test: set mock memory to read-write in unit tests - Updated unit tests in test_unified_memory.py to set mock_memory._read_only to False, ensuring that memory operations can be tested in a writable state. * fix test * fix: preserve falsy metadata values and fix remember() return type --------- Co-authored-by: lorenzejay <lorenzejaytech@gmail.com> Co-authored-by: Greyson LaLonde <greyson@crewai.com>	2026-02-26 15:05:10 -05:00
Lucas Gomide	09e3b81ca3	fix: preserve null types in tool parameter schemas for LLM (#4579) * fix: preserve null types in tool parameter schemas for LLM Tool parameter schemas were stripping null from optional fields via generate_model_description, forcing the LLM to provide non-null values for fields. Adds strip_null_types parameter to generate_model_description and passes False when generating tool schemas, so optional fields keep anyOf: [{type: T}, {type: null}] * Update lib/crewai/src/crewai/utilities/pydantic_schema_utils.py Co-authored-by: Gabe Milani <gabriel@crewai.com> --------- Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com> Co-authored-by: Gabe Milani <gabriel@crewai.com> Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>	2026-02-26 11:51:34 -05:00
Heitor Carvalho	b6d8ce5c55	docs: add litellm dependency note for non-native LLM providers (#4600)	2026-02-26 10:57:37 -03:00
Greyson LaLonde	b371f97a2f	fix: map output_pydantic/output_json to native structured output * fix: map output_pydantic/output_json to native structured output * test: add crew+tools+structured output integration test for Gemini * fix: re-record stale cassette for test_crew_testing_function * fix: re-record remaining stale cassettes for native structured output * fix: enable native structured output for lite agent and fix mypy errors	2026-02-25 17:13:34 -05:00
dependabot[bot]	017189db78	chore(deps): bump nltk in the security-updates group across 1 directory (#4598) Bumps the security-updates group with 1 update in the / directory: [nltk](https://github.com/nltk/nltk). Updates `nltk` from 3.9.2 to 3.9.3 - [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog) - [Commits](https://github.com/nltk/nltk/compare/3.9.2...3.9.3) --- updated-dependencies: - dependency-name: nltk dependency-version: 3.9.3 dependency-type: indirect dependency-group: security-updates ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-25 15:37:21 -06:00
dependabot[bot]	02d911494f	chore(deps): bump cryptography (#4506) Bumps the security-updates group with 1 update in the / directory: [cryptography](https://github.com/pyca/cryptography). Updates `cryptography` from 46.0.4 to 46.0.5 - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pyca/cryptography/compare/46.0.4...46.0.5) --- updated-dependencies: - dependency-name: cryptography dependency-version: 46.0.5 dependency-type: indirect dependency-group: security-updates ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-25 15:04:07 -06:00
João Moura	8102d0a6ca	feat: enhance JSON argument parsing and validation in CrewAgentExecutor and BaseTool * feat: enhance JSON argument parsing and validation in CrewAgentExecutor and BaseTool - Added error handling for malformed JSON tool arguments in CrewAgentExecutor, providing descriptive error messages. - Implemented schema validation for tool arguments in BaseTool, ensuring that invalid arguments raise appropriate exceptions. - Introduced tests to verify correct behavior for both valid and invalid JSON inputs, enhancing robustness of tool execution. * refactor: improve argument validation in BaseTool - Introduced a new private method to handle argument validation for tools, enhancing code clarity and reusability. - Updated the method to utilize the new validation method, ensuring consistent error handling for invalid arguments. - Enhanced exception handling to specifically catch , providing clearer error messages for tool argument validation failures. * feat: introduce parse_tool_call_args for improved argument parsing - Added a new utility function, parse_tool_call_args, to handle parsing of tool call arguments from JSON strings or dictionaries, enhancing error handling for malformed JSON inputs. - Updated CrewAgentExecutor and AgentExecutor to utilize the new parsing function, streamlining argument validation and improving clarity in error reporting. - Introduced unit tests for parse_tool_call_args to ensure robust functionality and correct handling of various input scenarios. * feat: add keyword argument validation in BaseTool and Tool classes - Introduced a new method `_validate_kwargs` in BaseTool to validate keyword arguments against the defined schema, ensuring proper argument handling. - Updated the `run` and `arun` methods in both BaseTool and Tool classes to utilize the new validation method, improving error handling and robustness. 
- Added comprehensive tests for asynchronous execution in `TestBaseToolArunValidation` to verify correct behavior for valid and invalid keyword arguments. * Potential fix for pull request finding 'Syntax error' Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> --------- Co-authored-by: lorenzejay <lorenzejaytech@gmail.com> Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com> Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com> Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>	2026-02-25 13:13:31 -05:00
Greyson LaLonde	ee374d01de	chore: add versioning logic for devtools	2026-02-25 12:13:00 -05:00
Greyson LaLonde	9914e51199	feat: add versioned docs starting with 1.10.0	2026-02-25 11:05:31 -05:00
nicoferdi96	2dbb83ae31	Private package registry (#4583) adding reference and explanation for package registry Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>	2026-02-24 19:37:17 +01:00
Mike Plachta	7377e1aa26	fix: bedrock region was always set to "us-east-1" not respecting the env var. (#4582) * fix: bedrock region was always set to "us-east-1" not respecting the env var. code had AWS_REGION_NAME referenced, but not used, unified to AWS_DEFAULT_REGION as per documentation * DRY code improvement and fix caught by tests. * Supporting litellm configuration	2026-02-24 09:59:01 -08:00
Greyson LaLonde	51754899a2	feat: migrate CLI http client from requests to httpx	2026-02-20 18:21:05 -05:00
Greyson LaLonde	71b4f8402a	fix: ensure callbacks are ran/awaited if promise	2026-02-20 13:15:50 -05:00
Greyson LaLonde	4a4c99d8a2	fix: capture method name in exception context	2026-02-19 17:51:18 -05:00
Greyson LaLonde	28a6b855a2	fix: preserve enum type in router result; improve types	2026-02-19 17:30:47 -05:00
Lorenze Jay	d09656664d	supporting parallel tool use (#4513) * supporting parallel tool use * ensure we respect max_usage_count * ensure result_as_answer, hooks, and cache parity * improve crew agent executor * address test comments	2026-02-19 14:07:28 -08:00
Lucas Gomide	49aa29bb41	docs: correct broken human_feedback examples with working self-loop patterns (#4520)	2026-02-19 09:02:01 -08:00