update litellm to support o3-mini and deepseek. Update docs.

Brandon/provide llm additional params (#2018)
* Clean up to match enterprise
* add additional params to LLM calls
* make sure additional params are getting passed to llm
* update docs
* drop print
2026-04-12 22:12:37 +00:00 · 2025-02-04 10:58:34 -05:00 · 2025-01-31 12:53:58 -05:00 · 2025-01-30 18:16:10 -05:00 · 2025-01-29 19:41:09 -05:00 · 2025-01-29 19:11:14 -05:00
13 changed files with 595 additions and 20 deletions
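The "additional params" part of this change set comes down to two steps in `src/crewai/llm.py`: extra keyword arguments are unpacked into the call parameters after the named ones, and `None` values are then dropped. A minimal sketch of that pattern, assuming a hypothetical helper name `build_completion_params` (it is not part of CrewAI or LiteLLM, and the real parameter dict has many more keys):

```python
from typing import Any, Dict, Optional


def build_completion_params(
    model: str,
    base_url: Optional[str] = None,
    api_base: Optional[str] = None,
    reasoning_effort: Optional[str] = None,
    **additional_params: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: merge named and extra params, then drop None values."""
    params: Dict[str, Any] = {
        "model": model,
        "api_base": api_base,
        "base_url": base_url,
        "reasoning_effort": reasoning_effort,
        # Extra keyword arguments (e.g. vertex_project) are unpacked last.
        **additional_params,
    }
    # Unset options are removed so only explicit values are forwarded.
    return {key: value for key, value in params.items() if value is not None}


params = build_completion_params(
    "o3-mini", reasoning_effort="high", vertex_project="test_project"
)
```

Here `params` holds only `model`, `reasoning_effort`, and `vertex_project`; the unset `base_url` and `api_base` are filtered out.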
--- a/docs/concepts/llms.mdx
+++ b/docs/concepts/llms.mdx
@@ -38,6 +38,7 @@ Here's a detailed breakdown of supported models and their capabilities, you can
    | GPT-4 | 8,192 tokens | High-accuracy tasks, complex reasoning |
    | GPT-4 Turbo | 128,000 tokens | Long-form content, document analysis |
    | GPT-4o & GPT-4o-mini | 128,000 tokens | Cost-effective large context processing |
+    | o3-mini | 200,000 tokens | Fast reasoning, complex reasoning |

    <Note>
      1 token ≈ 4 characters in English. For example, 8,192 tokens ≈ 32,768 characters or about 6,000 words.
@@ -162,7 +163,8 @@ Here's a detailed breakdown of supported models and their capabilities, you can
  <Tab title="Others">
    | Provider | Context Window | Key Features |
    |----------|---------------|--------------|
-    | Deepseek Chat | 128,000 tokens | Specialized in technical discussions |
+    | Deepseek Chat | 64,000 tokens | Specialized in technical discussions |
+    | Deepseek R1 | 64,000 tokens | Affordable reasoning model |
    | Claude 3 | Up to 200K tokens | Strong reasoning, code understanding |
    | Gemma Series | 8,192 tokens | Efficient, smaller-scale tasks |

@@ -296,6 +298,10 @@ There are three ways to configure LLMs in CrewAI. Choose the method that best fi
        # llm: sambanova/Meta-Llama-3.1-8B-Instruct
        # llm: sambanova/BioMistral-7B
        # llm: sambanova/Falcon-180B
+
+        # Open Router Models - Affordable reasoning
+        # llm: openrouter/deepseek/deepseek-r1
+        # llm: openrouter/deepseek/deepseek-chat
    ```

    <Info>
@@ -465,11 +471,22 @@ Learn how to get the most out of your LLM configuration:
    # https://cloud.google.com/vertex-ai/generative-ai/docs/overview
    ```

+    ## GET CREDENTIALS 
+    file_path = 'path/to/vertex_ai_service_account.json'
+
+    # Load the JSON file
+    with open(file_path, 'r') as file:
+        vertex_credentials = json.load(file)
+
+    # Convert to JSON string
+    vertex_credentials_json = json.dumps(vertex_credentials)
+
    Example usage:
    ```python Code
    llm = LLM(
        model="gemini/gemini-1.5-pro-latest",
-        temperature=0.7
+        temperature=0.7,
+        vertex_credentials=vertex_credentials_json
    )
    ```
  </Accordion>
@@ -680,6 +697,27 @@ Learn how to get the most out of your LLM configuration:
      - Support for long context windows
    </Info>
  </Accordion>
+
+  <Accordion title="Open Router">
+    ```python Code
+    OPENROUTER_API_KEY=<your-api-key>
+    ```
+    
+    Example usage:
+    ```python Code
+    llm = LLM(
+        model="openrouter/deepseek/deepseek-r1",
+        base_url="https://openrouter.ai/api/v1",
+        api_key=OPENROUTER_API_KEY
+    )
+    ```
+
+    <Info>
+      Open Router models:
+      - openrouter/deepseek/deepseek-r1
+      - openrouter/deepseek/deepseek-chat
+    </Info>
+  </Accordion>
 </AccordionGroup>

 ## Common Issues and Solutions
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -11,7 +11,7 @@ dependencies = [
    # Core Dependencies
    "pydantic>=2.4.2",
    "openai>=1.13.3",
-    "litellm==1.59.8",
+    "litellm==1.60.2",
    "instructor>=1.3.3",
    # Text Processing
    "pdfplumber>=0.11.4",
--- a/src/crewai/agents/crew_agent_executor.py
+++ b/src/crewai/agents/crew_agent_executor.py
@@ -519,7 +519,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
            color="yellow",
        )
        self._handle_crew_training_output(initial_answer, feedback)
-        self.messages.append(self._format_msg(f"Feedback: {feedback}"))
+        self.messages.append(
+            self._format_msg(
+                self._i18n.slice("feedback_instructions").format(feedback=feedback)
+            )
+        )
        improved_answer = self._invoke_loop()
        self._handle_crew_training_output(improved_answer)
        self.ask_for_human_input = False
@@ -566,7 +570,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):

    def _process_feedback_iteration(self, feedback: str) -> AgentFinish:
        """Process a single feedback iteration."""
-        self.messages.append(self._format_msg(f"Feedback: {feedback}"))
+        self.messages.append(
+            self._format_msg(
+                self._i18n.slice("feedback_instructions").format(feedback=feedback)
+            )
+        )
        return self._invoke_loop()

    def _log_feedback_error(self, retry_count: int, error: Exception) -> None:
--- a/src/crewai/llm.py
+++ b/src/crewai/llm.py
@@ -5,7 +5,7 @@ import sys
 import threading
 import warnings
 from contextlib import contextmanager
-from typing import Any, Dict, List, Optional, Union, cast
+from typing import Any, Dict, List, Literal, Optional, Union, cast

 from dotenv import load_dotenv

@@ -133,9 +133,12 @@ class LLM:
        logprobs: Optional[int] = None,
        top_logprobs: Optional[int] = None,
        base_url: Optional[str] = None,
+        api_base: Optional[str] = None,
        api_version: Optional[str] = None,
        api_key: Optional[str] = None,
        callbacks: List[Any] = [],
+        reasoning_effort: Optional[Literal["none", "low", "medium", "high"]] = None,
+        **kwargs,
    ):
        self.model = model
        self.timeout = timeout
@@ -152,10 +155,13 @@ class LLM:
        self.logprobs = logprobs
        self.top_logprobs = top_logprobs
        self.base_url = base_url
+        self.api_base = api_base
        self.api_version = api_version
        self.api_key = api_key
        self.callbacks = callbacks
        self.context_window_size = 0
+        self.reasoning_effort = reasoning_effort
+        self.additional_params = kwargs

        litellm.drop_params = True

@@ -232,11 +238,14 @@ class LLM:
                    "seed": self.seed,
                    "logprobs": self.logprobs,
                    "top_logprobs": self.top_logprobs,
-                    "api_base": self.base_url,
+                    "api_base": self.api_base,
+                    "base_url": self.base_url,
                    "api_version": self.api_version,
                    "api_key": self.api_key,
                    "stream": False,
                    "tools": tools,
+                    "reasoning_effort": self.reasoning_effort,
+                    **self.additional_params,
                }

                # Remove None values from params
--- a/src/crewai/translations/en.json
+++ b/src/crewai/translations/en.json
@@ -24,7 +24,8 @@
    "manager_request": "Your best answer to your coworker asking you this, accounting for the context shared.",
    "formatted_task_instructions": "Ensure your final answer contains only the content in the following format: {output_format}\n\nEnsure the final output does not include any code block markers like ```json or ```python.",
    "human_feedback_classification": "Determine if the following feedback indicates that the user is satisfied or if further changes are needed. Respond with 'True' if further changes are needed, or 'False' if the user is satisfied. **Important** Do not include any additional commentary outside of your 'True' or 'False' response.\n\nFeedback: \"{feedback}\"",
-    "conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals."
+    "conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals.",
+    "feedback_instructions": "User feedback: {feedback}\nInstructions: Use this feedback to enhance the next output iteration.\nNote: Do not respond or add commentary."
  },
  "errors": {
    "force_final_answer_error": "You can't keep going, here is the best final answer you generated:\n\n {formatted_answer}",
--- a/src/crewai/utilities/llm_utils.py
+++ b/src/crewai/utilities/llm_utils.py
@@ -53,6 +53,7 @@ def create_llm(
        timeout: Optional[float] = getattr(llm_value, "timeout", None)
        api_key: Optional[str] = getattr(llm_value, "api_key", None)
        base_url: Optional[str] = getattr(llm_value, "base_url", None)
+        api_base: Optional[str] = getattr(llm_value, "api_base", None)

        created_llm = LLM(
            model=model,
@@ -62,6 +63,7 @@ def create_llm(
            timeout=timeout,
            api_key=api_key,
            base_url=base_url,
+            api_base=api_base,
        )
        return created_llm
    except Exception as e:
@@ -101,8 +103,18 @@ def _llm_via_environment_or_fallback() -> Optional[LLM]:
    callbacks: List[Any] = []

    # Optional base URL from env
-    api_base = os.environ.get("OPENAI_API_BASE") or os.environ.get("OPENAI_BASE_URL")
-    if api_base:
+    base_url = (
+        os.environ.get("BASE_URL")
+        or os.environ.get("OPENAI_API_BASE")
+        or os.environ.get("OPENAI_BASE_URL")
+    )
+
+    api_base = os.environ.get("API_BASE") or os.environ.get("AZURE_API_BASE")
+
+    # Synchronize base_url and api_base if one is populated and the other is not
+    if base_url and not api_base:
+        api_base = base_url
+    elif api_base and not base_url:
        base_url = api_base

    # Initialize llm_params dictionary
@@ -115,6 +127,7 @@ def _llm_via_environment_or_fallback() -> Optional[LLM]:
        "timeout": timeout,
        "api_key": api_key,
        "base_url": base_url,
+        "api_base": api_base,
        "api_version": api_version,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
--- a/src/crewai/utilities/training_handler.py
+++ b/src/crewai/utilities/training_handler.py
@@ -35,6 +35,4 @@ class CrewTrainingHandler(PickleHandler):
    def clear(self) -> None:
        """Clear the training data by removing the file or resetting its contents."""
        if os.path.exists(self.file_path):
-            with open(self.file_path, "wb") as file:
-                # Overwrite with an empty dictionary
-                self.save({})
+            self.save({})
--- a/tests/cassettes/test_deepseek_r1_with_open_router.yaml
+++ b/tests/cassettes/test_deepseek_r1_with_open_router.yaml
@@ -0,0 +1,100 @@
+interactions:
+- request:
+    body: '{"model": "deepseek/deepseek-r1", "messages": [{"role": "user", "content":
+      "What is the capital of France?"}], "stop": [], "stream": false}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '139'
+      host:
+      - openrouter.ai
+      http-referer:
+      - https://litellm.ai
+      user-agent:
+      - litellm/1.60.2
+      x-title:
+      - liteLLM
+    method: POST
+    uri: https://openrouter.ai/api/v1/chat/completions
+  response:
+    content: "\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n{\"id\":\"gen-1738684300-YnD5WOSczQWsW0vQG78a\",\"provider\":\"Nebius\",\"model\":\"deepseek/deepseek-r1\",\"object\":\"chat.completion\",\"created\":1738684300,\"choices\":[{\"logprobs\":null,\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"The
+      capital of France is **Paris**. Known for its iconic landmarks such as the Eiffel
+      Tower, Notre-Dame Cathedral, and the Louvre Museum, Paris has served as the
+      political and cultural center of France for centuries. \U0001F1EB\U0001F1F7\",\"refusal\":null}}],\"usage\":{\"prompt_tokens\":10,\"completion_tokens\":261,\"total_tokens\":271}}"
+    headers:
+      Access-Control-Allow-Origin:
+      - '*'
+      CF-RAY:
+      - 90cbd2ceaf3ead5e-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:51:40 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Accept-Encoding
+      x-clerk-auth-message:
+      - Invalid JWT form. A JWT consists of three parts separated by dots. (reason=token-invalid,
+        token-carrier=header)
+      x-clerk-auth-reason:
+      - token-invalid
+      x-clerk-auth-status:
+      - signed-out
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_high.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_high.yaml
@@ -0,0 +1,107 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "high", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '137'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=etTqqA9SBOnENmrFAUBIexdW0v2ZeO1x9_Ek_WChlfU-1737568920137-0.0.1.1-604800000
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFNUz7l4pwtY9xhFSPIGlwNfE4Sj\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738683828,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      81,\n    \"total_tokens\": 94,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 64,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbc745d91fb0ca-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:43:50 GMT
+      Server:
+      - cloudflare
+      Set-Cookie:
+      - __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw;
+        path=/; expires=Tue, 04-Feb-25 16:13:50 GMT; domain=.api.openai.com; HttpOnly;
+        Secure; SameSite=None
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '2169'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_163e7bd79cb5a5e62d4688245b97d1d9
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_low.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_low.yaml
@@ -0,0 +1,102 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "low", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '136'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFNWljEYFrf5qRwYj73OPQtAnPbF\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738683830,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      17,\n    \"total_tokens\": 30,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbc7551fe0b0ca-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:43:51 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '1103'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_fd7178a0e5060216d04f3bd023e8bca1
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_medium.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_medium.yaml
@@ -0,0 +1,102 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "medium", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '139'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFS8IuMeYs6Rky2UbG8wH8P5PR4k\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738684116,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      145,\n    \"total_tokens\": 158,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 128,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbce51b946afb4-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:48:39 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '2365'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_bfd83679e674c3894991477f1fb043b2
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/llm_test.py
+++ b/tests/llm_test.py
@@ -1,4 +1,6 @@
+import os
 from time import sleep
+from unittest.mock import MagicMock, patch

 import pytest

@@ -154,3 +156,98 @@ def test_llm_call_with_tool_and_message_list():

    assert isinstance(result, int)
    assert result == 25
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_llm_passes_additional_params():
+    llm = LLM(
+        model="gpt-4o-mini",
+        vertex_credentials="test_credentials",
+        vertex_project="test_project",
+    )
+
+    messages = [{"role": "user", "content": "Hello, world!"}]
+
+    with patch("litellm.completion") as mocked_completion:
+        # Create mocks for response structure
+        mock_message = MagicMock()
+        mock_message.content = "Test response"
+        mock_choice = MagicMock()
+        mock_choice.message = mock_message
+        mock_response = MagicMock()
+        mock_response.choices = [mock_choice]
+        mock_response.usage = {
+            "prompt_tokens": 5,
+            "completion_tokens": 5,
+            "total_tokens": 10,
+        }
+
+        # Set up the mocked completion to return the mock response
+        mocked_completion.return_value = mock_response
+
+        result = llm.call(messages)
+
+        # Assert that litellm.completion was called once
+        mocked_completion.assert_called_once()
+
+        # Retrieve the actual arguments with which litellm.completion was called
+        _, kwargs = mocked_completion.call_args
+
+        # Check that the additional_params were passed to litellm.completion
+        assert kwargs["vertex_credentials"] == "test_credentials"
+        assert kwargs["vertex_project"] == "test_project"
+
+        # Also verify that other expected parameters are present
+        assert kwargs["model"] == "gpt-4o-mini"
+        assert kwargs["messages"] == messages
+
+        # Check the result from llm.call
+        assert result == "Test response"
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_o3_mini_reasoning_effort_high():
+    llm = LLM(
+        model="o3-mini",
+        reasoning_effort="high",
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_o3_mini_reasoning_effort_low():
+    llm = LLM(
+        model="o3-mini",
+        reasoning_effort="low",
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_o3_mini_reasoning_effort_medium():
+    llm = LLM(
+        model="o3-mini",
+        reasoning_effort="medium",
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_deepseek_r1_with_open_router():
+    if not os.getenv("OPEN_ROUTER_API_KEY"):
+        pytest.skip("OPEN_ROUTER_API_KEY not set; skipping test.")
+
+    llm = LLM(
+        model="openrouter/deepseek/deepseek-r1",
+        base_url="https://openrouter.ai/api/v1",
+        api_key=os.getenv("OPEN_ROUTER_API_KEY"),
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
--- a/uv.lock
+++ b/uv.lock
@@ -740,7 +740,7 @@ requires-dist = [
    { name = "json-repair", specifier = ">=0.25.2" },
    { name = "json5", specifier = ">=0.10.0" },
    { name = "jsonref", specifier = ">=1.1.0" },
-    { name = "litellm", specifier = "==1.59.8" },
+    { name = "litellm", specifier = "==1.60.2" },
    { name = "mem0ai", marker = "extra == 'mem0'", specifier = ">=0.1.29" },
    { name = "openai", specifier = ">=1.13.3" },
    { name = "openpyxl", specifier = ">=3.1.5" },
@@ -2374,7 +2374,7 @@ wheels = [

 [[package]]
 name = "litellm"
-version = "1.59.8"
+version = "1.60.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "aiohttp" },
@@ -2389,9 +2389,9 @@ dependencies = [
    { name = "tiktoken" },
    { name = "tokenizers" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/86/b0/c8ec06bd1c87a92d6d824008982b3c82b450d7bd3be850a53913f1ac4907/litellm-1.59.8.tar.gz", hash = "sha256:9d645cc4460f6a9813061f07086648c4c3d22febc8e1f21c663f2b7750d90512", size = 6428607 }
+sdist = { url = "https://files.pythonhosted.org/packages/94/8f/704cdb0fdbdd49dc5062a39ae5f1a8f308ae0ffd746df6e0137fc1776b8a/litellm-1.60.2.tar.gz", hash = "sha256:a8170584fcfd6f5175201d869e61ccd8a40ffe3264fc5e53c5b805ddf8a6e05a", size = 6447447 }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/b9/38/889da058f566ef9ea321aafa25e423249492cf2a508dfdc0e5acfcf04526/litellm-1.59.8-py3-none-any.whl", hash = "sha256:2473914bd2343485a185dfe7eedb12ee5fda32da3c9d9a8b73f6966b9b20cf39", size = 6716233 },
+    { url = "https://files.pythonhosted.org/packages/8a/ba/0eaec9aee9f99fdf46ef1c0bddcfe7f5720b182f84f6ed27f13145d5ded2/litellm-1.60.2-py3-none-any.whl", hash = "sha256:1cb08cda04bf8c5ef3e690171a779979e4b16a5e3a24cd8dc1f198e7f198d5c4", size = 6746809 },
 ]

 [[package]]
@@ -3185,7 +3185,7 @@ wheels = [

 [[package]]
 name = "openai"
-version = "1.59.6"
+version = "1.61.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "anyio" },
@@ -3197,9 +3197,9 @@ dependencies = [
    { name = "tqdm" },
    { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/2e/7a/07fbe7bdabffd0a5be1bfe5903a02c4fff232e9acbae894014752a8e4def/openai-1.59.6.tar.gz", hash = "sha256:c7670727c2f1e4473f62fea6fa51475c8bc098c9ffb47bfb9eef5be23c747934", size = 344915 }
+sdist = { url = "https://files.pythonhosted.org/packages/32/2a/b3fa8790be17d632f59d4f50257b909a3f669036e5195c1ae55737274620/openai-1.61.0.tar.gz", hash = "sha256:216f325a24ed8578e929b0f1b3fb2052165f3b04b0461818adaa51aa29c71f8a", size = 350174 }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/70/45/6de8e5fd670c804b29c777e4716f1916741c71604d5c7d952eee8432f7d3/openai-1.59.6-py3-none-any.whl", hash = "sha256:b28ed44eee3d5ebe1a3ea045ee1b4b50fea36ecd50741aaa5ce5a5559c900cb6", size = 454817 },
+    { url = "https://files.pythonhosted.org/packages/93/76/70c5ad6612b3e4c89fa520266bbf2430a89cae8bd87c1e2284698af5927e/openai-1.61.0-py3-none-any.whl", hash = "sha256:e8c512c0743accbdbe77f3429a1490d862f8352045de8dc81969301eb4a4f666", size = 460623 },
 ]

 [[package]]
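The feedback prompt change in `crew_agent_executor.py` and `en.json` replaces the inline `f"Feedback: {feedback}"` string with a translated template. A rough sketch of the formatting step, assuming a hypothetical `format_feedback_message` helper and a `"user"` role (the diff does not show what `_format_msg` returns):

```python
from typing import Dict

# Template text copied from the "feedback_instructions" entry in en.json.
FEEDBACK_INSTRUCTIONS = (
    "User feedback: {feedback}\n"
    "Instructions: Use this feedback to enhance the next output iteration.\n"
    "Note: Do not respond or add commentary."
)


def format_feedback_message(feedback: str) -> Dict[str, str]:
    """Hypothetical stand-in for slice(...).format(...) followed by _format_msg."""
    content = FEEDBACK_INSTRUCTIONS.format(feedback=feedback)
    # The role is an assumption made for illustration only.
    return {"role": "user", "content": content}


message = format_feedback_message("Make it shorter")
```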
Author	SHA1	Message	Date
Brandon Hancock	748383d74c	update litellm to support o3-mini and deepseek. Update docs.	2025-02-04 10:58:34 -05:00
Brandon Hancock (bhancock_ai)	23b9e10323	Brandon/provide llm additional params (#2018) * Clean up to match enterprise * add additional params to LLM calls * make sure additional params are getting passed to llm * update docs * drop print	2025-01-31 12:53:58 -05:00
Brandon Hancock (bhancock_ai)	ddb7958da7	Clean up to match enterprise (#2009) * Clean up to match enterprise * improve feedback prompting	2025-01-30 18:16:10 -05:00
Brandon Hancock (bhancock_ai)	477cce321f	Fix llms (#2003) * iwp * add in api_base --------- Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>	2025-01-29 19:41:09 -05:00
Brandon Hancock (bhancock_ai)	7bed63a693	Bugfix/fix broken training (#1993) * Fixing training while refactoring code * improve prompts * make sure to raise an error when missing training data * Drop comment * fix failing tests * add clear * drop bad code * fix failing test * Fix type issues pointed out by lorenze * simplify training	2025-01-29 19:11:14 -05:00
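The `api_base` handling added across these commits resolves two values from the environment and copies one into the other when only one is set. A minimal sketch of that lookup order as it appears in `_llm_via_environment_or_fallback`, assuming a hypothetical standalone function `resolve_base_urls` that takes the environment as a mapping so it can be exercised without touching `os.environ`:

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_base_urls(
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: return (base_url, api_base) using the diff's lookup order."""
    base_url = (
        env.get("BASE_URL")
        or env.get("OPENAI_API_BASE")
        or env.get("OPENAI_BASE_URL")
    )
    api_base = env.get("API_BASE") or env.get("AZURE_API_BASE")

    # Synchronize the two values when only one of them is populated.
    if base_url and not api_base:
        api_base = base_url
    elif api_base and not base_url:
        base_url = api_base
    return base_url, api_base


resolved = resolve_base_urls({"OPENAI_API_BASE": "https://example.test/v1"})
```

When both values are set they are kept distinct, which is presumably why `LLM` now accepts `base_url` and `api_base` as separate arguments.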