Add native Snowflake Cortex LLM provider (#6005)

2026-07-03 14:09:24 +00:00 · 2026-06-02 04:10:13 -07:00
parent fee5b3e395
commit 4a0769d97c
8 changed files with 635 additions and 4 deletions
--- a/docs/ar/concepts/llms.mdx
+++ b/docs/ar/concepts/llms.mdx
@@ -107,7 +107,7 @@ mode: "wide"
 </Tabs>

 <Info>
-  يوفر CrewAI تكاملات SDK أصلية لـ OpenAI و Anthropic و Google (Gemini API) و Azure و AWS Bedrock -- لا حاجة لتثبيت إضافي بخلاف الملحقات الخاصة بالمزود (مثل `uv add "crewai[openai]"`).
+  يوفر CrewAI تكاملات SDK أصلية لـ OpenAI و Anthropic و Google (Gemini API) و Azure و AWS Bedrock و Snowflake Cortex -- لا حاجة لتثبيت إضافي بخلاف الملحقات الخاصة بالمزود (مثل `uv add "crewai[openai]"`).

  جميع المزودين الآخرين مدعومون بواسطة **LiteLLM**. إذا كنت تخطط لاستخدام أي منهم، أضفه كتبعية لمشروعك:
  ```bash
@@ -291,6 +291,55 @@ mode: "wide"
    ```
  </Accordion>

+  <Accordion title="Snowflake Cortex">
+    يوفر CrewAI تكاملًا أصليًا مع Snowflake Cortex REST API عبر endpoint Chat Completions المتوافق مع OpenAI. تستخدم نماذج `snowflake/...` هذا المسار بدون fallback إلى LiteLLM. يدعم Snowflake Cortex في CrewAI حاليًا Chat Completions فقط، لذلك استخدم وضع `api` الافتراضي ولا تضبط `api="responses"`.
+
+    ```toml Code
+    # Required
+    SNOWFLAKE_PAT=<your-programmatic-access-token>
+    SNOWFLAKE_ACCOUNT_URL=https://<account-identifier>.snowflakecomputing.com
+
+    # Alternative account configuration
+    SNOWFLAKE_ACCOUNT=<account-identifier>
+    ```
+
+    **الاستخدام الأساسي:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/openai-gpt-4.1",
+        temperature=0.7,
+        max_completion_tokens=1024,
+    )
+    ```
+
+    **نماذج Claude على Cortex:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/claude-sonnet-4-5",
+        max_completion_tokens=1024,
+        stream=True,
+    )
+    ```
+
+    **متغيرات البيئة المدعومة:**
+    - `SNOWFLAKE_PAT` أو `SNOWFLAKE_TOKEN` أو `SNOWFLAKE_JWT`: الرمز المستخدم كاعتماد Bearer
+    - `SNOWFLAKE_ACCOUNT_URL`: عنوان URL الكامل لحساب Snowflake
+    - `SNOWFLAKE_ACCOUNT` أو `SNOWFLAKE_ACCOUNT_ID` أو `SNOWFLAKE_ACCOUNT_IDENTIFIER`: معرف الحساب المستخدم لبناء URL
+
+    تستخدم طلبات Snowflake REST الدور الافتراضي للمستخدم. تأكد من أن هذا الدور لديه `SNOWFLAKE.CORTEX_USER` أو `SNOWFLAKE.CORTEX_REST_API_USER`. لا يتطلب endpoint Cortex REST Chat Completions معاملات database أو schema أو warehouse أو role صريح.
+
+    **الميزات:**
+    - اختيار provider أصلي باستخدام `model="snowflake/<model-name>"`
+    - Chat Completions مع streaming وبدونه فقط؛ `api="responses"` غير مدعوم
+    - تتبع استخدام الرموز
+    - استدعاء الدوال لنماذج OpenAI و Claude المستضافة في Snowflake
+    - إزالة assistant prefill النهائي غير الصالح تلقائيًا لنماذج Claude في Snowflake
+  </Accordion>
+
  <Accordion title="Anthropic">
    يوفر CrewAI تكاملًا أصليًا مع Anthropic من خلال Anthropic Python SDK.

--- a/docs/en/concepts/llms.mdx
+++ b/docs/en/concepts/llms.mdx
@@ -107,7 +107,7 @@ There are different places in CrewAI code where you can specify the model to use
 </Tabs>

 <Info>
-  CrewAI provides native SDK integrations for OpenAI, Anthropic, Google (Gemini API), Azure, and AWS Bedrock — no extra install needed beyond the provider-specific extras (e.g. `uv add "crewai[openai]"`).
+  CrewAI provides native SDK integrations for OpenAI, Anthropic, Google (Gemini API), Azure, AWS Bedrock, and Snowflake Cortex — no extra install needed beyond the provider-specific extras (e.g. `uv add "crewai[openai]"`).

  All other providers are powered by **LiteLLM**. If you plan to use any of them, add it as a dependency to your project:
  ```bash
@@ -291,6 +291,55 @@ In this section, you'll find detailed examples that help you select, configure,
    ```
  </Accordion>

+  <Accordion title="Snowflake Cortex">
+    CrewAI provides native integration with the Snowflake Cortex REST API through its OpenAI-compatible Chat Completions endpoint. Models prefixed with `snowflake/` are routed to this provider and do not fall back to LiteLLM. The provider currently supports Chat Completions only, so leave `api` at its default and do not set `api="responses"`.
+
+    ```toml Code
+    # Required
+    SNOWFLAKE_PAT=<your-programmatic-access-token>
+    SNOWFLAKE_ACCOUNT_URL=https://<account-identifier>.snowflakecomputing.com
+
+    # Alternative account configuration
+    SNOWFLAKE_ACCOUNT=<account-identifier>
+    ```
+
+    **Basic Usage:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/openai-gpt-4.1",
+        temperature=0.7,
+        max_completion_tokens=1024,
+    )
+    ```
+
+    **Claude Models on Cortex:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/claude-sonnet-4-5",
+        max_completion_tokens=1024,
+        stream=True,
+    )
+    ```
+
+    **Supported Environment Variables:**
+    - `SNOWFLAKE_PAT`, `SNOWFLAKE_TOKEN`, or `SNOWFLAKE_JWT`: token used as the Bearer credential
+    - `SNOWFLAKE_ACCOUNT_URL`: full Snowflake account URL
+    - `SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_ACCOUNT_ID`, or `SNOWFLAKE_ACCOUNT_IDENTIFIER`: account identifier used to build the account URL
+
+    Snowflake REST requests use the user's default Snowflake role. Make sure that role has `SNOWFLAKE.CORTEX_USER` or `SNOWFLAKE.CORTEX_REST_API_USER`. Database, schema, warehouse, and explicit role parameters are not required by the Cortex REST Chat Completions endpoint.
+
+    **Features:**
+    - Native provider selection with `model="snowflake/<model-name>"`
+    - Streaming and non-streaming Chat Completions only; `api="responses"` is not supported
+    - Token usage tracking
+    - Function calling for Snowflake-hosted OpenAI and Claude models
+    - Automatic removal of a trailing assistant prefill message, which is invalid for Snowflake-hosted Claude models
+  </Accordion>
+
  <Accordion title="Anthropic">
    CrewAI provides native integration with Anthropic through the Anthropic Python SDK.
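
The token variables listed in the English accordion above are checked in a fixed order by `_resolve_token` later in this diff. The following is a minimal standalone sketch of that precedence, not the provider code itself: the function name `resolve_token` and the `env` mapping parameter are assumptions made so the example runs without CrewAI or real environment variables (the actual implementation reads `os.getenv`).

```python
from __future__ import annotations

from typing import Mapping

# Same order as SNOWFLAKE_TOKEN_ENV_VARS in completion.py.
TOKEN_ENV_VARS = ("SNOWFLAKE_PAT", "SNOWFLAKE_TOKEN", "SNOWFLAKE_JWT")


def resolve_token(api_key: str | None, env: Mapping[str, str]) -> str:
    """Pick the Bearer token: explicit api_key first, then env vars in order.

    `env` is a hypothetical stand-in for os.environ so the sketch is testable.
    """
    token = api_key
    if not token:  # None and "" both fall through to the environment
        for name in TOKEN_ENV_VARS:
            token = env.get(name)
            if token:
                break

    if not token:
        raise ValueError("Snowflake token is required")

    # LiteLLM-style "pat/<token>" values are accepted; the prefix is dropped.
    return token.removeprefix("pat/")


print(resolve_token(None, {"SNOWFLAKE_TOKEN": "t", "SNOWFLAKE_JWT": "j"}))
print(resolve_token("", {"SNOWFLAKE_PAT": "pat/test-pat"}))
print(resolve_token("explicit", {"SNOWFLAKE_PAT": "ignored"}))
```

An explicit `api_key` wins over every environment variable, and an empty string is treated the same as a missing key, which matches `test_empty_api_key_falls_back_to_env_token` in the test file.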

--- a/docs/ko/concepts/llms.mdx
+++ b/docs/ko/concepts/llms.mdx
@@ -106,7 +106,7 @@ CrewAI 코드 내에는 사용할 모델을 지정할 수 있는 여러 위치
 </Tabs>

 <Info>
-  CrewAI는 OpenAI, Anthropic, Google (Gemini API), Azure, AWS Bedrock에 대해 네이티브 SDK 통합을 제공합니다 — 제공자별 extras(예: `uv add "crewai[openai]"`) 외에 추가 설치가 필요하지 않습니다.
+  CrewAI는 OpenAI, Anthropic, Google (Gemini API), Azure, AWS Bedrock, Snowflake Cortex에 대해 네이티브 SDK 통합을 제공합니다 — 제공자별 extras(예: `uv add "crewai[openai]"`) 외에 추가 설치가 필요하지 않습니다.

  그 외 모든 제공자는 **LiteLLM**을 통해 지원됩니다. 이를 사용하려면 프로젝트에 의존성으로 추가하세요:
  ```bash
@@ -230,6 +230,55 @@ CrewAI는 고유한 기능, 인증 방법, 모델 역량을 제공하는 다양
    ```
  </Accordion>

+  <Accordion title="Snowflake Cortex">
+    CrewAI는 OpenAI 호환 Chat Completions 엔드포인트를 통해 Snowflake Cortex REST API와 네이티브로 통합됩니다. `snowflake/...` 모델은 LiteLLM fallback 없이 사용됩니다. CrewAI에서 Snowflake Cortex는 현재 Chat Completions만 지원하므로 기본 `api` 모드를 사용하고 `api="responses"`를 설정하지 마세요.
+
+    ```toml Code
+    # Required
+    SNOWFLAKE_PAT=<your-programmatic-access-token>
+    SNOWFLAKE_ACCOUNT_URL=https://<account-identifier>.snowflakecomputing.com
+
+    # Alternative account configuration
+    SNOWFLAKE_ACCOUNT=<account-identifier>
+    ```
+
+    **기본 사용법:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/openai-gpt-4.1",
+        temperature=0.7,
+        max_completion_tokens=1024,
+    )
+    ```
+
+    **Cortex의 Claude 모델:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/claude-sonnet-4-5",
+        max_completion_tokens=1024,
+        stream=True,
+    )
+    ```
+
+    **지원 환경 변수:**
+    - `SNOWFLAKE_PAT`, `SNOWFLAKE_TOKEN`, 또는 `SNOWFLAKE_JWT`: Bearer 자격 증명으로 사용할 토큰
+    - `SNOWFLAKE_ACCOUNT_URL`: 전체 Snowflake 계정 URL
+    - `SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_ACCOUNT_ID`, 또는 `SNOWFLAKE_ACCOUNT_IDENTIFIER`: 계정 URL을 만들 계정 식별자
+
+    Snowflake REST 요청은 사용자의 기본 Snowflake role을 사용합니다. 해당 role에 `SNOWFLAKE.CORTEX_USER` 또는 `SNOWFLAKE.CORTEX_REST_API_USER`가 있는지 확인하세요. Cortex REST Chat Completions 엔드포인트에는 database, schema, warehouse, 명시적 role 파라미터가 필요하지 않습니다.
+
+    **기능:**
+    - `model="snowflake/<model-name>"`을 통한 네이티브 provider 선택
+    - Streaming 및 non-streaming Chat Completions만 지원; `api="responses"`는 지원되지 않음
+    - 토큰 사용량 추적
+    - Snowflake 호스팅 OpenAI 및 Claude 모델의 함수 호출
+    - Snowflake Claude 모델에서 유효하지 않은 마지막 assistant prefill 자동 제거
+  </Accordion>
+
  <Accordion title="Anthropic">
    ```toml Code
    # Required
--- a/docs/pt-BR/concepts/llms.mdx
+++ b/docs/pt-BR/concepts/llms.mdx
@@ -106,7 +106,7 @@ Existem diferentes locais no código do CrewAI onde você pode especificar o mod
 </Tabs>

 <Info>
-  O CrewAI oferece integrações nativas via SDK para OpenAI, Anthropic, Google (Gemini API), Azure e AWS Bedrock — sem necessidade de instalação extra além dos extras específicos do provedor (ex.: `uv add "crewai[openai]"`).
+  O CrewAI oferece integrações nativas via SDK para OpenAI, Anthropic, Google (Gemini API), Azure, AWS Bedrock e Snowflake Cortex — sem necessidade de instalação extra além dos extras específicos do provedor (ex.: `uv add "crewai[openai]"`).

  Todos os outros provedores são alimentados pelo **LiteLLM**. Se você planeja usar algum deles, adicione-o como dependência ao seu projeto:
  ```bash
@@ -230,6 +230,55 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
    ```
  </Accordion>

+  <Accordion title="Snowflake Cortex">
+    O CrewAI oferece integração nativa com a API REST do Snowflake Cortex pelo endpoint Chat Completions compatível com OpenAI. Isso evita fallback para LiteLLM em modelos `snowflake/...`. Atualmente, o Snowflake Cortex no CrewAI oferece suporte apenas a Chat Completions, então use o modo `api` padrão e não defina `api="responses"`.
+
+    ```toml Code
+    # Obrigatório
+    SNOWFLAKE_PAT=<your-programmatic-access-token>
+    SNOWFLAKE_ACCOUNT_URL=https://<account-identifier>.snowflakecomputing.com
+
+    # Configuração alternativa da conta
+    SNOWFLAKE_ACCOUNT=<account-identifier>
+    ```
+
+    **Uso básico:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/openai-gpt-4.1",
+        temperature=0.7,
+        max_completion_tokens=1024,
+    )
+    ```
+
+    **Modelos Claude no Cortex:**
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="snowflake/claude-sonnet-4-5",
+        max_completion_tokens=1024,
+        stream=True,
+    )
+    ```
+
+    **Variáveis de ambiente suportadas:**
+    - `SNOWFLAKE_PAT`, `SNOWFLAKE_TOKEN` ou `SNOWFLAKE_JWT`: token usado como credencial Bearer
+    - `SNOWFLAKE_ACCOUNT_URL`: URL completa da conta Snowflake
+    - `SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_ACCOUNT_ID` ou `SNOWFLAKE_ACCOUNT_IDENTIFIER`: identificador da conta usado para montar a URL
+
+    As requisições REST do Snowflake usam a role padrão do usuário. Garanta que essa role tenha `SNOWFLAKE.CORTEX_USER` ou `SNOWFLAKE.CORTEX_REST_API_USER`. Parâmetros de banco de dados, schema, warehouse e role explícita não são exigidos pelo endpoint Cortex REST Chat Completions.
+
+    **Recursos:**
+    - Seleção nativa com `model="snowflake/<model-name>"`
+    - Chat Completions com e sem streaming apenas; `api="responses"` não é compatível
+    - Rastreamento de uso de tokens
+    - Chamadas de função para modelos OpenAI e Claude hospedados no Snowflake
+    - Remoção automática de prefill final de assistant inválido para modelos Claude no Snowflake
+  </Accordion>
+
  <Accordion title="Anthropic">
    ```toml Code
    # Obrigatório
--- a/lib/crewai/src/crewai/llm.py
+++ b/lib/crewai/src/crewai/llm.py
@@ -288,6 +288,7 @@ SUPPORTED_NATIVE_PROVIDERS: Final[list[str]] = [
    "hosted_vllm",
    "cerebras",
    "dashscope",
+    "snowflake",
 ]


@@ -376,6 +377,7 @@ class LLM(BaseLLM):
                "hosted_vllm": "hosted_vllm",
                "cerebras": "cerebras",
                "dashscope": "dashscope",
+                "snowflake": "snowflake",
            }

            canonical_provider = provider_mapping.get(prefix.lower())
@@ -494,6 +496,9 @@ class LLM(BaseLLM):
            # OpenRouter uses org/model format but accepts anything
            return True

+        if provider == "snowflake":
+            return True
+
        return False

    @classmethod
@@ -592,6 +597,11 @@ class LLM(BaseLLM):

            return BedrockCompletion

+        if provider == "snowflake":
+            from crewai.llms.providers.snowflake.completion import SnowflakeCompletion
+
+            return SnowflakeCompletion
+
        openai_compatible_providers = {
            "openrouter",
            "deepseek",
--- a/lib/crewai/src/crewai/llms/providers/snowflake/__init__.py
+++ b/lib/crewai/src/crewai/llms/providers/snowflake/__init__.py
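
The `llm.py` hunks above register `snowflake` as a native provider prefix and accept any model name after it. A rough sketch of that prefix routing, under stated assumptions: `split_provider_and_model` and `PROVIDER_MAPPING` are hypothetical names, and the mapping holds only the entries visible in this diff rather than CrewAI's full table.

```python
from __future__ import annotations

# Entries visible in the llm.py hunk; the real mapping has more providers.
PROVIDER_MAPPING = {
    "hosted_vllm": "hosted_vllm",
    "cerebras": "cerebras",
    "dashscope": "dashscope",
    "snowflake": "snowflake",
}


def split_provider_and_model(model: str) -> tuple[str | None, str]:
    """Split "prefix/model" into (canonical provider, bare model name).

    Returns (None, model) when there is no prefix or it is not recognised.
    """
    prefix, sep, rest = model.partition("/")
    if not sep:
        return None, model

    # The lookup is case-insensitive, mirroring provider_mapping.get(prefix.lower()).
    provider = PROVIDER_MAPPING.get(prefix.lower())
    if provider is None:
        return None, model
    return provider, rest


print(split_provider_and_model("snowflake/openai-gpt-4.1"))
print(split_provider_and_model("Snowflake/claude-sonnet-4-5"))
print(split_provider_and_model("gpt-4o"))
```

The provider receives the bare model name, which is why the factory test later in this diff asserts `llm.model == "openai-gpt-4.1"` for `LLM(model="snowflake/openai-gpt-4.1")`.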
--- a/lib/crewai/src/crewai/llms/providers/snowflake/completion.py
+++ b/lib/crewai/src/crewai/llms/providers/snowflake/completion.py
@@ -0,0 +1,183 @@
+from __future__ import annotations
+
+import os
+from typing import Any, Literal
+
+from pydantic import model_validator
+
+from crewai.llms.providers.openai.completion import OpenAICompletion
+from crewai.utilities.types import LLMMessage
+
+
+SNOWFLAKE_CORTEX_PATH = "/api/v2/cortex/v1"
+SNOWFLAKE_TOKEN_ENV_VARS = (
+    "SNOWFLAKE_PAT",
+    "SNOWFLAKE_TOKEN",
+    "SNOWFLAKE_JWT",
+)
+
+
+def _normalize_snowflake_base_url(value: str) -> str:
+    """Return a Snowflake Cortex REST OpenAI-compatible base URL."""
+    base_url = value.strip().rstrip("/")
+    if not base_url:
+        raise ValueError("Snowflake account URL cannot be empty")
+
+    if "://" not in base_url:
+        base_url = f"https://{base_url}"
+
+    if base_url.endswith(SNOWFLAKE_CORTEX_PATH):
+        return base_url
+
+    if "/api/v2/cortex" in base_url:
+        raise ValueError(
+            "Snowflake base URL must be the account URL or Cortex API root "
+            f"ending in {SNOWFLAKE_CORTEX_PATH}; do not include endpoint paths."
+        )
+
+    return f"{base_url}{SNOWFLAKE_CORTEX_PATH}"
+
+
+def _base_url_from_account_identifier(account_identifier: str) -> str:
+    account = account_identifier.strip()
+    if not account:
+        raise ValueError("Snowflake account identifier cannot be empty")
+    return _normalize_snowflake_base_url(f"{account}.snowflakecomputing.com")
+
+
+class SnowflakeCompletion(OpenAICompletion):
+    """Snowflake Cortex REST API native completion implementation.
+
+    Snowflake exposes an OpenAI-compatible Chat Completions endpoint at
+    ``/api/v2/cortex/v1/chat/completions``. This provider reuses CrewAI's
+    native OpenAI transport while applying Snowflake-specific authentication,
+    endpoint normalization, and Claude-family message constraints.
+    """
+
+    provider: str = "snowflake"
+    api: Literal["completions"] = "completions"
+    account_url: str | None = None
+    account_identifier: str | None = None
+    database: str | None = None
+    schema_name: str | None = None
+    warehouse: str | None = None
+    role: str | None = None
+
+    @model_validator(mode="before")
+    @classmethod
+    def _normalize_snowflake_fields(cls, data: Any) -> Any:
+        if not isinstance(data, dict):
+            return data
+
+        data["provider"] = "snowflake"
+        api = data.get("api")
+        if api and api != "completions":
+            raise ValueError(
+                "Snowflake Cortex native provider supports only the Chat Completions API"
+            )
+        data["api"] = "completions"
+
+        data["api_key"] = cls._resolve_token(data.get("api_key"))
+        resolved_base_url = cls._resolve_base_url(data)
+        data["base_url"] = resolved_base_url
+        data["account_url"] = resolved_base_url
+
+        return data
+
+    @staticmethod
+    def _resolve_token(api_key: str | None) -> str:
+        token = api_key
+        if not token:
+            for env_var in SNOWFLAKE_TOKEN_ENV_VARS:
+                token = os.getenv(env_var)
+                if token:
+                    break
+
+        if not token:
+            raise ValueError(
+                "Snowflake token is required. Set SNOWFLAKE_PAT, SNOWFLAKE_TOKEN, "
+                "or SNOWFLAKE_JWT, or pass api_key."
+            )
+
+        if token.startswith("pat/"):
+            token = token.removeprefix("pat/")
+
+        return token
+
+    @classmethod
+    def _resolve_base_url(cls, data: dict[str, Any]) -> str:
+        explicit_base_url = data.get("base_url") or data.get("api_base")
+        if explicit_base_url:
+            return _normalize_snowflake_base_url(explicit_base_url)
+
+        account_url = data.get("account_url") or os.getenv("SNOWFLAKE_ACCOUNT_URL")
+        if account_url:
+            return _normalize_snowflake_base_url(account_url)
+
+        account_identifier = (
+            data.get("account_identifier")
+            or data.get("account")
+            or data.get("snowflake_account")
+            or os.getenv("SNOWFLAKE_ACCOUNT")
+            or os.getenv("SNOWFLAKE_ACCOUNT_ID")
+            or os.getenv("SNOWFLAKE_ACCOUNT_IDENTIFIER")
+        )
+        if account_identifier:
+            return _base_url_from_account_identifier(account_identifier)
+
+        raise ValueError(
+            "Snowflake account URL is required. Set SNOWFLAKE_ACCOUNT_URL or "
+            "SNOWFLAKE_ACCOUNT, or pass account_url/base_url/account_identifier."
+        )
+
+    def _format_messages(self, messages: str | list[LLMMessage]) -> list[LLMMessage]:
+        formatted_messages = super()._format_messages(messages)
+        if self._is_claude_model():
+            return self._ensure_claude_conversation_ends_with_user(formatted_messages)
+        return formatted_messages
+
+    def _is_claude_model(self) -> bool:
+        model = self.model.lower()
+        return model.startswith(("claude-", "anthropic."))
+
+    @staticmethod
+    def _ensure_claude_conversation_ends_with_user(
+        messages: list[LLMMessage],
+    ) -> list[LLMMessage]:
+        if not messages:
+            return [{"role": "user", "content": "Hello"}]
+
+        if messages[-1].get("role") == "assistant" and not messages[-1].get(
+            "tool_calls"
+        ):
+            messages = messages[:-1]
+
+        if not messages:
+            return [{"role": "user", "content": "Hello"}]
+
+        if messages[-1].get("role") == "user":
+            return messages
+
+        return [
+            *messages,
+            {
+                "role": "user",
+                "content": "Please continue and provide your final answer.",
+            },
+        ]
+
+    def _prepare_completion_params(
+        self, messages: list[LLMMessage], tools: list[dict[str, Any]] | None = None
+    ) -> dict[str, Any]:
+        params = super()._prepare_completion_params(messages=messages, tools=tools)
+        if self._is_claude_model() and "max_tokens" in params:
+            params["max_completion_tokens"] = params.pop("max_tokens")
+        return params
+
+    def supports_function_calling(self) -> bool:
+        model = self.model.lower()
+        return model.startswith(("openai-", "claude-", "anthropic."))
+
+    def supports_multimodal(self) -> bool:
+        model = self.model.lower()
+        return model.startswith(("openai-", "claude-", "anthropic."))
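
To make the base URL rules in `completion.py` concrete, here is a standalone sketch that restates `_normalize_snowflake_base_url` and `_base_url_from_account_identifier` without the CrewAI imports and runs them on a few inputs. The shortened function names and the `org-account` identifier (borrowed from the tests) are illustrative only.

```python
CORTEX_PATH = "/api/v2/cortex/v1"


def normalize_base_url(value: str) -> str:
    """Turn an account URL or Cortex API root into the Cortex API root."""
    base_url = value.strip().rstrip("/")
    if not base_url:
        raise ValueError("Snowflake account URL cannot be empty")

    if "://" not in base_url:
        base_url = f"https://{base_url}"  # a bare host defaults to HTTPS

    if base_url.endswith(CORTEX_PATH):
        return base_url  # already the API root; leave untouched

    if "/api/v2/cortex" in base_url:
        # e.g. a full .../chat/completions URL: the client appends that itself
        raise ValueError("do not include endpoint paths")

    return f"{base_url}{CORTEX_PATH}"


def base_url_from_account(account_identifier: str) -> str:
    """Build the API root from an account identifier alone."""
    account = account_identifier.strip()
    if not account:
        raise ValueError("Snowflake account identifier cannot be empty")
    return normalize_base_url(f"{account}.snowflakecomputing.com")


for candidate in (
    "https://org-account.snowflakecomputing.com/",
    "org-account.snowflakecomputing.com",
    "https://org-account.snowflakecomputing.com/api/v2/cortex/v1",
):
    print(candidate, "->", normalize_base_url(candidate))

print("org-account ->", base_url_from_account("org-account"))
```

All four inputs resolve to the same root, `https://org-account.snowflakecomputing.com/api/v2/cortex/v1`. A URL that already includes `/chat/completions` raises `ValueError`, presumably because the OpenAI-compatible client appends that path and would otherwise double it.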
--- a/lib/crewai/tests/llms/snowflake/test_snowflake.py
+++ b/lib/crewai/tests/llms/snowflake/test_snowflake.py
@@ -0,0 +1,242 @@
+from __future__ import annotations
+
+from types import SimpleNamespace
+from unittest.mock import Mock, patch
+
+import pytest
+
+from crewai.llm import LLM
+from crewai.llms.providers.snowflake.completion import (
+    SNOWFLAKE_CORTEX_PATH,
+    SnowflakeCompletion,
+    _normalize_snowflake_base_url,
+)
+
+
+def _snowflake_env(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setenv("SNOWFLAKE_PAT", "test-pat")
+    monkeypatch.setenv("SNOWFLAKE_ACCOUNT_URL", "https://org-account.snowflakecomputing.com")
+    monkeypatch.delenv("SNOWFLAKE_TOKEN", raising=False)
+    monkeypatch.delenv("SNOWFLAKE_JWT", raising=False)
+    monkeypatch.delenv("SNOWFLAKE_ACCOUNT", raising=False)
+    monkeypatch.delenv("SNOWFLAKE_ACCOUNT_ID", raising=False)
+    monkeypatch.delenv("SNOWFLAKE_ACCOUNT_IDENTIFIER", raising=False)
+
+
+class TestSnowflakeConfig:
+    def test_normalizes_account_url_to_cortex_base_url(self):
+        assert (
+            _normalize_snowflake_base_url("https://org-account.snowflakecomputing.com")
+            == f"https://org-account.snowflakecomputing.com{SNOWFLAKE_CORTEX_PATH}"
+        )
+
+    def test_preserves_existing_cortex_base_url(self):
+        base_url = f"https://org-account.snowflakecomputing.com{SNOWFLAKE_CORTEX_PATH}"
+        assert _normalize_snowflake_base_url(base_url) == base_url
+
+    def test_rejects_endpoint_path_in_base_url(self):
+        with pytest.raises(ValueError, match="do not include endpoint paths"):
+            _normalize_snowflake_base_url(
+                "https://org-account.snowflakecomputing.com"
+                f"{SNOWFLAKE_CORTEX_PATH}/chat/completions"
+            )
+
+    def test_empty_api_key_falls_back_to_env_token(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+
+        llm = SnowflakeCompletion(model="openai-gpt-4.1", api_key="")
+
+        assert llm.api_key == "test-pat"
+
+    def test_uses_env_token_and_account_url(self, monkeypatch: pytest.MonkeyPatch):
+        _snowflake_env(monkeypatch)
+
+        llm = SnowflakeCompletion(model="openai-gpt-4.1")
+
+        assert llm.api_key == "test-pat"
+        assert llm.base_url == (
+            f"https://org-account.snowflakecomputing.com{SNOWFLAKE_CORTEX_PATH}"
+        )
+        assert llm.account_url == llm.base_url
+
+    def test_strips_litellm_pat_prefix_for_compatibility(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        monkeypatch.setenv("SNOWFLAKE_PAT", "pat/test-pat")
+        monkeypatch.setenv("SNOWFLAKE_ACCOUNT", "org-account")
+
+        llm = SnowflakeCompletion(model="openai-gpt-4.1")
+
+        assert llm.api_key == "test-pat"
+
+    def test_missing_token_raises_clear_error(self, monkeypatch: pytest.MonkeyPatch):
+        monkeypatch.delenv("SNOWFLAKE_PAT", raising=False)
+        monkeypatch.delenv("SNOWFLAKE_TOKEN", raising=False)
+        monkeypatch.delenv("SNOWFLAKE_JWT", raising=False)
+        monkeypatch.setenv("SNOWFLAKE_ACCOUNT_URL", "https://org-account.snowflakecomputing.com")
+
+        with pytest.raises(ValueError, match="Snowflake token is required"):
+            SnowflakeCompletion(model="openai-gpt-4.1")
+
+    def test_missing_account_raises_clear_error(self, monkeypatch: pytest.MonkeyPatch):
+        monkeypatch.setenv("SNOWFLAKE_PAT", "test-pat")
+        monkeypatch.delenv("SNOWFLAKE_ACCOUNT_URL", raising=False)
+        monkeypatch.delenv("SNOWFLAKE_ACCOUNT", raising=False)
+        monkeypatch.delenv("SNOWFLAKE_ACCOUNT_ID", raising=False)
+        monkeypatch.delenv("SNOWFLAKE_ACCOUNT_IDENTIFIER", raising=False)
+
+        with pytest.raises(ValueError, match="Snowflake account URL is required"):
+            SnowflakeCompletion(model="openai-gpt-4.1")
+
+    def test_responses_api_is_rejected(self, monkeypatch: pytest.MonkeyPatch):
+        _snowflake_env(monkeypatch)
+
+        with pytest.raises(ValueError, match="supports only the Chat Completions API"):
+            SnowflakeCompletion(model="openai-gpt-4.1", api="responses")
+
+
+class TestSnowflakeFactory:
+    def test_llm_creates_native_snowflake_provider(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+
+        llm = LLM(model="snowflake/openai-gpt-4.1")
+
+        assert isinstance(llm, SnowflakeCompletion)
+        assert llm.provider == "snowflake"
+        assert llm.model == "openai-gpt-4.1"
+        assert llm.is_litellm is False
+
+    def test_explicit_provider_creates_native_snowflake_provider(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+
+        llm = LLM(model="claude-sonnet-4-5", provider="snowflake")
+
+        assert isinstance(llm, SnowflakeCompletion)
+        assert llm.model == "claude-sonnet-4-5"
+
+
+class TestSnowflakeRequests:
+    def test_prepare_completion_params_uses_snowflake_model_name(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+        llm = SnowflakeCompletion(
+            model="openai-gpt-4.1",
+            temperature=0.2,
+            max_completion_tokens=128,
+        )
+
+        params = llm._prepare_completion_params(
+            [{"role": "user", "content": "Hello"}]
+        )
+
+        assert params["model"] == "openai-gpt-4.1"
+        assert params["temperature"] == 0.2
+        assert params["max_completion_tokens"] == 128
+        assert params["messages"] == [{"role": "user", "content": "Hello"}]
+
+    def test_claude_model_removes_trailing_assistant_prefill(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+        llm = SnowflakeCompletion(model="claude-sonnet-4-5")
+
+        messages = llm._format_messages(
+            [
+                {"role": "user", "content": "Write a summary."},
+                {"role": "assistant", "content": "Here is"},
+            ]
+        )
+
+        assert messages == [{"role": "user", "content": "Write a summary."}]
+
+    def test_claude_model_adds_user_turn_after_tool_call_assistant_message(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+        llm = SnowflakeCompletion(model="claude-sonnet-4-5")
+
+        messages = llm._format_messages(
+            [
+                {"role": "user", "content": "Use the tool."},
+                {
+                    "role": "assistant",
+                    "content": None,
+                    "tool_calls": [
+                        {
+                            "id": "call_1",
+                            "type": "function",
+                            "function": {"name": "lookup", "arguments": "{}"},
+                        }
+                    ],
+                },
+            ]
+        )
+
+        assert messages[-2]["role"] == "assistant"
+        assert messages[-2]["tool_calls"][0]["id"] == "call_1"
+        assert messages[-1]["role"] == "user"
+
+    def test_claude_model_maps_max_tokens_to_max_completion_tokens(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+        llm = SnowflakeCompletion(model="claude-sonnet-4-5", max_tokens=256)
+
+        params = llm._prepare_completion_params(
+            [{"role": "user", "content": "Hello"}]
+        )
+
+        assert "max_tokens" not in params
+        assert params["max_completion_tokens"] == 256
+
+    def test_streaming_params_include_usage(self, monkeypatch: pytest.MonkeyPatch):
+        _snowflake_env(monkeypatch)
+        llm = SnowflakeCompletion(model="openai-gpt-4.1", stream=True)
+
+        params = llm._prepare_completion_params(
+            [{"role": "user", "content": "Hello"}]
+        )
+
+        assert params["stream"] is True
+        assert params["stream_options"] == {"include_usage": True}
+
+    def test_non_streaming_call_uses_native_openai_client(
+        self, monkeypatch: pytest.MonkeyPatch
+    ):
+        _snowflake_env(monkeypatch)
+        llm = SnowflakeCompletion(model="openai-gpt-4.1")
+        fake_response = SimpleNamespace(
+            usage=SimpleNamespace(
+                prompt_tokens=3,
+                completion_tokens=2,
+                total_tokens=5,
+                prompt_tokens_details=None,
+                completion_tokens_details=None,
+            ),
+            choices=[
+                SimpleNamespace(
+                    message=SimpleNamespace(content="Snowflake response", tool_calls=None)
+                )
+            ],
+        )
+        create = Mock(return_value=fake_response)
+        fake_client = SimpleNamespace(
+            chat=SimpleNamespace(completions=SimpleNamespace(create=create))
+        )
+
+        with patch.object(llm, "_get_sync_client", return_value=fake_client):
+            response = llm.call([{"role": "user", "content": "Hello"}])
+
+        assert response == "Snowflake response"
+        create.assert_called_once()
+        assert create.call_args.kwargs["model"] == "openai-gpt-4.1"
+        assert create.call_args.kwargs["messages"] == [
+            {"role": "user", "content": "Hello"}
+        ]
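
The Claude-specific tests above exercise `_ensure_claude_conversation_ends_with_user`. The sketch below restates that logic as a plain function so each branch can be tried in isolation. The name `ensure_ends_with_user` is an assumption, and messages are modelled as plain dicts rather than CrewAI's `LLMMessage` type.

```python
from __future__ import annotations

from typing import Any

CONTINUE_PROMPT = "Please continue and provide your final answer."


def ensure_ends_with_user(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Make a conversation end with a user turn, as the Claude path requires."""
    if not messages:
        return [{"role": "user", "content": "Hello"}]

    last = messages[-1]
    # A trailing assistant message without tool calls is a prefill: drop it.
    if last.get("role") == "assistant" and not last.get("tool_calls"):
        messages = messages[:-1]

    if not messages:
        return [{"role": "user", "content": "Hello"}]

    if messages[-1].get("role") == "user":
        return messages

    # Anything else (tool-call assistant turn, tool result, system prompt)
    # gets an explicit user turn appended.
    return [*messages, {"role": "user", "content": CONTINUE_PROMPT}]


prefill = [
    {"role": "user", "content": "Write a summary."},
    {"role": "assistant", "content": "Here is"},
]
print(ensure_ends_with_user(prefill))

tool_turn = [
    {"role": "user", "content": "Use the tool."},
    {"role": "assistant", "content": None, "tool_calls": [{"id": "call_1"}]},
]
print(ensure_ends_with_user(tool_turn)[-1])
```

Note that the input list is never mutated: the prefill is removed by slicing and the extra user turn is added by building a new list.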