Merge branch 'main' into brandon/improve-llm-structured-output

Fix: short_term_memory with Bedrock now uses the user-defined model (when passed as an attribute) rather than the default (#1959)
* Update embedding_configurator.py

  Modified the _configure_bedrock method to use the user-submitted model_name rather than the default amazon.titan-embed-text-v1. Sending model_name in short_term_memory (embedder_config/config) was not working. The model_name provided by the user is now passed through instead of the default. Added if/else for backward compatibility.

* Update embedding_configurator.py

  Incorporated review comments.

---------

Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
2025-12-24 08:18:31 +00:00 · 2025-02-04 13:44:28 -08:00 · 2025-02-04 16:44:07 -05:00 · 2025-02-04 16:18:50 -05:00 · 2025-02-04 16:07:22 -05:00 · 2025-02-04 16:03:38 -05:00
28 changed files with 938 additions and 47 deletions
--- a/README.md
+++ b/README.md
@@ -190,7 +190,7 @@ research_task:
  description: >
    Conduct a thorough research about {topic}
    Make sure you find any interesting and relevant information given
-    the current year is 2024.
+    the current year is 2025.
  expected_output: >
    A list with 10 bullet points of the most relevant information about {topic}
  agent: researcher
--- a/docs/concepts/crews.mdx
+++ b/docs/concepts/crews.mdx
@@ -279,9 +279,9 @@ print(result)
 Once your crew is assembled, initiate the workflow with the appropriate kickoff method. CrewAI provides several methods for better control over the kickoff process: `kickoff()`, `kickoff_for_each()`, `kickoff_async()`, and `kickoff_for_each_async()`.

 - `kickoff()`: Starts the execution process according to the defined process flow.
- `kickoff_for_each()`: Executes tasks for each agent individually.
+- `kickoff_for_each()`: Executes tasks sequentially for each provided input event or item in the collection.
 - `kickoff_async()`: Initiates the workflow asynchronously.
- `kickoff_for_each_async()`: Executes tasks for each agent individually in an asynchronous manner.
+- `kickoff_for_each_async()`: Executes tasks concurrently for each provided input event or item, leveraging asynchronous processing.

 ```python Code
 # Start the crew's task execution
--- a/docs/concepts/llms.mdx
+++ b/docs/concepts/llms.mdx
@@ -38,6 +38,7 @@ Here's a detailed breakdown of supported models and their capabilities, you can
    | GPT-4 | 8,192 tokens | High-accuracy tasks, complex reasoning |
    | GPT-4 Turbo | 128,000 tokens | Long-form content, document analysis |
    | GPT-4o & GPT-4o-mini | 128,000 tokens | Cost-effective large context processing |
+    | o3-mini | 200,000 tokens | Fast reasoning, complex reasoning |

    <Note>
      1 token ≈ 4 characters in English. For example, 8,192 tokens ≈ 32,768 characters or about 6,000 words.
@@ -162,7 +163,8 @@ Here's a detailed breakdown of supported models and their capabilities, you can
  <Tab title="Others">
    | Provider | Context Window | Key Features |
    |----------|---------------|--------------|
-    | Deepseek Chat | 128,000 tokens | Specialized in technical discussions |
+    | Deepseek Chat | 64,000 tokens | Specialized in technical discussions |
+    | Deepseek R1 | 64,000 tokens | Affordable reasoning model |
    | Claude 3 | Up to 200K tokens | Strong reasoning, code understanding |
    | Gemma Series | 8,192 tokens | Efficient, smaller-scale tasks |

@@ -296,6 +298,10 @@ There are three ways to configure LLMs in CrewAI. Choose the method that best fi
        # llm: sambanova/Meta-Llama-3.1-8B-Instruct
        # llm: sambanova/BioMistral-7B
        # llm: sambanova/Falcon-180B
+
+        # Open Router Models - Affordable reasoning
+        # llm: openrouter/deepseek/deepseek-r1
+        # llm: openrouter/deepseek/deepseek-chat
    ```

    <Info>
@@ -465,11 +471,22 @@ Learn how to get the most out of your LLM configuration:
    # https://cloud.google.com/vertex-ai/generative-ai/docs/overview
    ```

+    ## GET CREDENTIALS
+    import json
+
+    file_path = 'path/to/vertex_ai_service_account.json'
+
+    # Load the JSON file
+    with open(file_path, 'r') as file:
+        vertex_credentials = json.load(file)
+
+    # Convert to JSON string
+    vertex_credentials_json = json.dumps(vertex_credentials)
+
    Example usage:
    ```python Code
    llm = LLM(
        model="gemini/gemini-1.5-pro-latest",
-        temperature=0.7
+        temperature=0.7,
+        vertex_credentials=vertex_credentials_json
    )
    ```
  </Accordion>
@@ -680,8 +697,53 @@ Learn how to get the most out of your LLM configuration:
      - Support for long context windows
    </Info>
  </Accordion>
+
+  <Accordion title="Open Router">
+    ```python Code
+    OPENROUTER_API_KEY=<your-api-key>
+    ```
+    
+    Example usage:
+    ```python Code
+    llm = LLM(
+        model="openrouter/deepseek/deepseek-r1",
+        base_url="https://openrouter.ai/api/v1",
+        api_key=OPENROUTER_API_KEY
+    )
+    ```
+
+    <Info>
+      Open Router models:
+      - openrouter/deepseek/deepseek-r1
+      - openrouter/deepseek/deepseek-chat
+    </Info>
+  </Accordion>
 </AccordionGroup>

+## Structured LLM Calls
+
+CrewAI supports structured responses from LLM calls by allowing you to define a `response_format` using a Pydantic model. This enables the framework to automatically parse and validate the output, making it easier to integrate the response into your application without manual post-processing.
+
+For example, you can define a Pydantic model to represent the expected response structure and pass it as the `response_format` when instantiating the LLM. The model will then be used to convert the LLM output into a structured Python object.
+
+```python Code
+from crewai import LLM
+from pydantic import BaseModel
+
+class Dog(BaseModel):
+    name: str
+    age: int
+    breed: str
+
+
+llm = LLM(model="gpt-4o", response_format=Dog)
+
+response = llm.call(
+    "Analyze the following messages and return the name, age, and breed. "
+    "Meet Kona! She is 3 years old and is a black german shepherd."
+)
+print(response)
+```
+
 ## Common Issues and Solutions

 <Tabs>
--- a/docs/concepts/memory.mdx
+++ b/docs/concepts/memory.mdx
@@ -185,7 +185,7 @@ my_crew = Crew(
    process=Process.sequential,
    memory=True,
    verbose=True,
-    embedder=OpenAIEmbeddingFunction(api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"),
+    embedder=OpenAIEmbeddingFunction(api_key=os.getenv("OPENAI_API_KEY"), model="text-embedding-3-small"),
 )
 ```

@@ -224,7 +224,7 @@ my_crew = Crew(
        "provider": "google",
        "config": {
            "api_key": "<YOUR_API_KEY>",
-            "model_name": "<model_name>"
+            "model": "<model_name>"
        }
    }
 )
@@ -247,7 +247,7 @@ my_crew = Crew(
        api_base="YOUR_API_BASE_PATH",
        api_type="azure",
        api_version="YOUR_API_VERSION",
-        model_name="text-embedding-3-small"
+        model="text-embedding-3-small"
    )
 )
 ```
@@ -268,7 +268,7 @@ my_crew = Crew(
        project_id="YOUR_PROJECT_ID",
        region="YOUR_REGION",
        api_key="YOUR_API_KEY",
-        model_name="textembedding-gecko"
+        model="textembedding-gecko"
    )
 )
 ```
@@ -288,7 +288,7 @@ my_crew = Crew(
        "provider": "cohere",
        "config": {
            "api_key": "YOUR_API_KEY",
-            "model_name": "<model_name>"
+            "model": "<model_name>"
        }
    }
 )
@@ -308,7 +308,7 @@ my_crew = Crew(
        "provider": "voyageai",
        "config": {
            "api_key": "YOUR_API_KEY",
-            "model_name": "<model_name>"
+            "model": "<model_name>"
        }
    }
 )
--- a/docs/concepts/planning.mdx
+++ b/docs/concepts/planning.mdx
@@ -81,8 +81,8 @@ my_crew.kickoff()

 3. **Collect Data:**

-   - Search for the latest papers, articles, and reports published in 2023 and early 2024.
-   - Use keywords like "Large Language Models 2024", "AI LLM advancements", "AI ethics 2024", etc.
+   - Search for the latest papers, articles, and reports published in 2024 and early 2025.
+   - Use keywords like "Large Language Models 2025", "AI LLM advancements", "AI ethics 2025", etc.

 4. **Analyze Findings:**

--- a/docs/concepts/tasks.mdx
+++ b/docs/concepts/tasks.mdx
@@ -69,7 +69,7 @@ research_task:
  description: >
    Conduct a thorough research about {topic}
    Make sure you find any interesting and relevant information given
-    the current year is 2024.
+    the current year is 2025.
  expected_output: >
    A list with 10 bullet points of the most relevant information about {topic}
  agent: researcher
@@ -155,7 +155,7 @@ research_task = Task(
    description="""
        Conduct a thorough research about AI Agents.
        Make sure you find any interesting and relevant information given
-        the current year is 2024.
+        the current year is 2025.
    """,
    expected_output="""
        A list with 10 bullet points of the most relevant information about AI Agents
--- a/docs/how-to/human-input-on-execution.mdx
+++ b/docs/how-to/human-input-on-execution.mdx
@@ -60,12 +60,12 @@ writer = Agent(
 # Create tasks for your agents
 task1 = Task(
    description=(
-        "Conduct a comprehensive analysis of the latest advancements in AI in 2024. "
+        "Conduct a comprehensive analysis of the latest advancements in AI in 2025. "
        "Identify key trends, breakthrough technologies, and potential industry impacts. "
        "Compile your findings in a detailed report. "
        "Make sure to check with a human if the draft is good before finalizing your answer."
    ),
-    expected_output='A comprehensive full report on the latest AI advancements in 2024, leave nothing out',
+    expected_output='A comprehensive full report on the latest AI advancements in 2025, leave nothing out',
    agent=researcher,
    human_input=True
 )
@@ -76,7 +76,7 @@ task2 = Task(
        "Your post should be informative yet accessible, catering to a tech-savvy audience. "
        "Aim for a narrative that captures the essence of these breakthroughs and their implications for the future."
    ),
-    expected_output='A compelling 3 paragraphs blog post formatted as markdown about the latest AI advancements in 2024',
+    expected_output='A compelling 3 paragraphs blog post formatted as markdown about the latest AI advancements in 2025',
    agent=writer,
    human_input=True
 )
--- a/docs/how-to/mlflow-observability.mdx
+++ b/docs/how-to/mlflow-observability.mdx
@@ -0,0 +1,206 @@
+---
+title: Agent Monitoring with MLflow
+description: Quickly start monitoring your Agents with MLflow.
+icon: bars-staggered
+---
+
+# MLflow Overview
+
+[MLflow](https://mlflow.org/) is an open-source platform to assist machine learning practitioners and teams in handling the complexities of the machine learning process.
+
+It provides a tracing feature that enhances LLM observability in your Generative AI applications by capturing detailed information about the execution of your application’s services. 
+Tracing provides a way to record the inputs, outputs, and metadata associated with each intermediate step of a request, enabling you to easily pinpoint the source of bugs and unexpected behaviors.
+
+![Overview of MLflow crewAI tracing usage](/images/mlflow-tracing.gif)
+
+### Features
+
+- **Tracing Dashboard**: Monitor activities of your crewAI agents with detailed dashboards that include inputs, outputs and metadata of spans.
+- **Automated Tracing**: A fully automated integration with crewAI, which can be enabled by running `mlflow.crewai.autolog()`. 
+- **Manual Trace Instrumentation with Minimal Effort**: Customize trace instrumentation through MLflow's high-level fluent APIs such as decorators, function wrappers, and context managers.
+- **OpenTelemetry Compatibility**: MLflow Tracing supports exporting traces to an OpenTelemetry Collector, which can then be used to export traces to various backends such as Jaeger, Zipkin, and AWS X-Ray.
+- **Package and Deploy Agents**: Package and deploy your crewAI agents to an inference server with a variety of deployment targets.
+- **Securely Host LLMs**: Host multiple LLMs from various providers behind one unified endpoint through the MLflow gateway.
+- **Evaluation**: Evaluate your crewAI agents with a wide range of metrics using a convenient API `mlflow.evaluate()`.
+
+## Setup Instructions
+
+<Steps>
+    <Step title="Install MLflow package">
+      ```shell
+      # The crewAI integration is available in mlflow>=2.19.0
+      pip install mlflow
+      ```
+    </Step>
+    <Step title="Start MLflow tracking server">
+      ```shell
+      # This step is optional, but running an MLflow tracking server is recommended for better visualization and broader features.
+      mlflow server
+      ```
+    </Step>
+    <Step title="Initialize MLflow in Your Application">
+      Add the following two lines to your application code:
+
+      ```python
+      import mlflow
+
+      mlflow.crewai.autolog()
+
+      # Optional: Set a tracking URI and an experiment name if you have a tracking server
+      mlflow.set_tracking_uri("http://localhost:5000")
+      mlflow.set_experiment("CrewAI")
+      ```
+      
+      Example Usage for tracing CrewAI Agents:
+
+      ```python
+      from crewai import Agent, Crew, Task
+      from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
+      from crewai_tools import SerperDevTool, WebsiteSearchTool
+
+      from textwrap import dedent
+
+      content = "Users name is John. He is 30 years old and lives in San Francisco."
+      string_source = StringKnowledgeSource(
+          content=content, metadata={"preference": "personal"}
+      )
+
+      search_tool = WebsiteSearchTool()
+
+
+      class TripAgents:
+          def city_selection_agent(self):
+              return Agent(
+                  role="City Selection Expert",
+                  goal="Select the best city based on weather, season, and prices",
+                  backstory="An expert in analyzing travel data to pick ideal destinations",
+                  tools=[
+                      search_tool,
+                  ],
+                  verbose=True,
+              )
+
+          def local_expert(self):
+              return Agent(
+                  role="Local Expert at this city",
+                  goal="Provide the BEST insights about the selected city",
+                  backstory="""A knowledgeable local guide with extensive information
+              about the city, its attractions and customs""",
+                  tools=[search_tool],
+                  verbose=True,
+              )
+
+
+      class TripTasks:
+          def identify_task(self, agent, origin, cities, interests, range):
+              return Task(
+                  description=dedent(
+                      f"""
+                      Analyze and select the best city for the trip based
+                      on specific criteria such as weather patterns, seasonal
+                      events, and travel costs. This task involves comparing
+                      multiple cities, considering factors like current weather
+                      conditions, upcoming cultural or seasonal events, and
+                      overall travel expenses.
+                      Your final answer must be a detailed
+                      report on the chosen city, and everything you found out
+                      about it, including the actual flight costs, weather
+                      forecast and attractions.
+
+                      Traveling from: {origin}
+                      City Options: {cities}
+                      Trip Date: {range}
+                      Traveler Interests: {interests}
+                  """
+                  ),
+                  agent=agent,
+                  expected_output="Detailed report on the chosen city including flight costs, weather forecast, and attractions",
+              )
+
+          def gather_task(self, agent, origin, interests, range):
+              return Task(
+                  description=dedent(
+                      f"""
+                      As a local expert on this city you must compile an
+                      in-depth guide for someone traveling there and wanting
+                      to have THE BEST trip ever!
+                      Gather information about key attractions, local customs,
+                      special events, and daily activity recommendations.
+                      Find the best spots to go to, the kind of place only a
+                      local would know.
+                      This guide should provide a thorough overview of what
+                      the city has to offer, including hidden gems, cultural
+                      hotspots, must-visit landmarks, weather forecasts, and
+                      high level costs.
+                      The final answer must be a comprehensive city guide,
+                      rich in cultural insights and practical tips,
+                      tailored to enhance the travel experience.
+
+                      Trip Date: {range}
+                      Traveling from: {origin}
+                      Traveler Interests: {interests}
+                  """
+                  ),
+                  agent=agent,
+                  expected_output="Comprehensive city guide including hidden gems, cultural hotspots, and practical travel tips",
+              )
+
+
+      class TripCrew:
+          def __init__(self, origin, cities, date_range, interests):
+              self.cities = cities
+              self.origin = origin
+              self.interests = interests
+              self.date_range = date_range
+
+          def run(self):
+              agents = TripAgents()
+              tasks = TripTasks()
+
+              city_selector_agent = agents.city_selection_agent()
+              local_expert_agent = agents.local_expert()
+
+              identify_task = tasks.identify_task(
+                  city_selector_agent,
+                  self.origin,
+                  self.cities,
+                  self.interests,
+                  self.date_range,
+              )
+              gather_task = tasks.gather_task(
+                  local_expert_agent, self.origin, self.interests, self.date_range
+              )
+
+              crew = Crew(
+                  agents=[city_selector_agent, local_expert_agent],
+                  tasks=[identify_task, gather_task],
+                  verbose=True,
+                  memory=True,
+                  knowledge={
+                      "sources": [string_source],
+                      "metadata": {"preference": "personal"},
+                  },
+              )
+
+              result = crew.kickoff()
+              return result
+
+
+      trip_crew = TripCrew("California", "Tokyo", "Dec 12 - Dec 20", "sports")
+      result = trip_crew.run()
+
+      print(result)
+      ```
+      Refer to [MLflow Tracing Documentation](https://mlflow.org/docs/latest/llms/tracing/index.html) for more configurations and use cases.
+    </Step>
+    <Step title="Visualize Activities of Agents">
+      Traces for your crewAI agents are now captured by MLflow.
+      Visit the MLflow tracking server to view the traces and get insights into your agents.
+
+      Open `127.0.0.1:5000` in your browser to reach the MLflow tracking server.
+      <Frame caption="MLflow Tracing Dashboard">
+        <img src="/images/mlflow1.png" alt="MLflow tracing example with crewai" />
+      </Frame>
+    </Step>
+</Steps> 
+
--- a/docs/images/mlflow-tracing.gif
+++ b/docs/images/mlflow-tracing.gif
--- a/docs/images/mlflow1.png
+++ b/docs/images/mlflow1.png
--- a/docs/mint.json
+++ b/docs/mint.json
@@ -101,6 +101,7 @@
        "how-to/conditional-tasks",
        "how-to/agentops-observability",
        "how-to/langtrace-observability",
+        "how-to/mlflow-observability",
        "how-to/openlit-observability",
        "how-to/portkey-observability"
      ]
--- a/docs/quickstart.mdx
+++ b/docs/quickstart.mdx
@@ -58,7 +58,7 @@ Follow the steps below to get crewing! 🚣‍♂️
      description: >
        Conduct a thorough research about {topic}
        Make sure you find any interesting and relevant information given
-        the current year is 2024.
+        the current year is 2025.
      expected_output: >
        A list with 10 bullet points of the most relevant information about {topic}
      agent: researcher
@@ -195,10 +195,10 @@ Follow the steps below to get crewing! 🚣‍♂️

  <CodeGroup>
    ```markdown output/report.md
-    # Comprehensive Report on the Rise and Impact of AI Agents in 2024
+    # Comprehensive Report on the Rise and Impact of AI Agents in 2025

    ## 1. Introduction to AI Agents
-    In 2024, Artificial Intelligence (AI) agents are at the forefront of innovation across various industries. As intelligent systems that can perform tasks typically requiring human cognition, AI agents are paving the way for significant advancements in operational efficiency, decision-making, and overall productivity within sectors like Human Resources (HR) and Finance. This report aims to detail the rise of AI agents, their frameworks, applications, and potential implications on the workforce.
+    In 2025, Artificial Intelligence (AI) agents are at the forefront of innovation across various industries. As intelligent systems that can perform tasks typically requiring human cognition, AI agents are paving the way for significant advancements in operational efficiency, decision-making, and overall productivity within sectors like Human Resources (HR) and Finance. This report aims to detail the rise of AI agents, their frameworks, applications, and potential implications on the workforce.

    ## 2. Benefits of AI Agents
    AI agents bring numerous advantages that are transforming traditional work environments. Key benefits include:
@@ -252,7 +252,7 @@ Follow the steps below to get crewing! 🚣‍♂️
    To stay competitive and harness the full potential of AI agents, organizations must remain vigilant about latest developments in AI technology and consider continuous learning and adaptation in their strategic planning.

    ## 8. Conclusion
-    The emergence of AI agents is undeniably reshaping the workplace landscape in 2024. With their ability to automate tasks, enhance efficiency, and improve decision-making, AI agents are critical in driving operational success. Organizations must embrace and adapt to AI developments to thrive in an increasingly digital business environment.
+    The emergence of AI agents is undeniably reshaping the workplace landscape in 2025. With their ability to automate tasks, enhance efficiency, and improve decision-making, AI agents are critical in driving operational success. Organizations must embrace and adapt to AI developments to thrive in an increasingly digital business environment.
    ```
  </CodeGroup>
  </Step>
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -152,6 +152,7 @@ nav:
    - Agent Monitoring with AgentOps: 'how-to/AgentOps-Observability.md'
    - Agent Monitoring with LangTrace: 'how-to/Langtrace-Observability.md'
    - Agent Monitoring with OpenLIT: 'how-to/openlit-Observability.md'
+    - Agent Monitoring with MLflow: 'how-to/mlflow-Observability.md'
  - Tools Docs:
    - Browserbase Web Loader: 'tools/BrowserbaseLoadTool.md'
    - Code Docs RAG Search: 'tools/CodeDocsSearchTool.md'
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -11,7 +11,7 @@ dependencies = [
    # Core Dependencies
    "pydantic>=2.4.2",
    "openai>=1.13.3",
-    "litellm==1.59.8",
+    "litellm==1.60.2",
    "instructor>=1.3.3",
    # Text Processing
    "pdfplumber>=0.11.4",
--- a/src/crewai/agents/crew_agent_executor.py
+++ b/src/crewai/agents/crew_agent_executor.py
@@ -519,7 +519,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
            color="yellow",
        )
        self._handle_crew_training_output(initial_answer, feedback)
-        self.messages.append(self._format_msg(f"Feedback: {feedback}"))
+        self.messages.append(
+            self._format_msg(
+                self._i18n.slice("feedback_instructions").format(feedback=feedback)
+            )
+        )
        improved_answer = self._invoke_loop()
        self._handle_crew_training_output(improved_answer)
        self.ask_for_human_input = False
@@ -566,7 +570,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):

    def _process_feedback_iteration(self, feedback: str) -> AgentFinish:
        """Process a single feedback iteration."""
-        self.messages.append(self._format_msg(f"Feedback: {feedback}"))
+        self.messages.append(
+            self._format_msg(
+                self._i18n.slice("feedback_instructions").format(feedback=feedback)
+            )
+        )
        return self._invoke_loop()

    def _log_feedback_error(self, retry_count: int, error: Exception) -> None:
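Reviewer note: both hunks above replace the inline `f"Feedback: {feedback}"` string with the `feedback_instructions` slice. A minimal sketch of what the resulting message looks like, assuming the slice resolves to the template text this change adds to `en.json`; the `format_feedback` helper is hypothetical and only stands in for `self._i18n.slice("feedback_instructions").format(...)`.

```python
# Template text copied from the "feedback_instructions" entry added to en.json.
FEEDBACK_INSTRUCTIONS = (
    "User feedback: {feedback}\n"
    "Instructions: Use this feedback to enhance the next output iteration.\n"
    "Note: Do not respond or add commentary."
)


def format_feedback(feedback: str) -> str:
    """Hypothetical helper: fill the template the way the executor does."""
    return FEEDBACK_INSTRUCTIONS.format(feedback=feedback)


if __name__ == "__main__":
    print(format_feedback("Shorten the summary to three bullet points."))
```

Because the feedback is passed as a format argument rather than being part of the template, braces inside the user's feedback are not interpreted as placeholders.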
--- a/src/crewai/llm.py
+++ b/src/crewai/llm.py
@@ -5,15 +5,17 @@ import sys
 import threading
 import warnings
 from contextlib import contextmanager
-from typing import Any, Dict, List, Optional, Union, cast
+from typing import Any, Dict, List, Literal, Optional, Type, Union, cast

 from dotenv import load_dotenv
+from pydantic import BaseModel

 with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    import litellm
    from litellm import Choices, get_supported_openai_params
    from litellm.types.utils import ModelResponse
+    from litellm.utils import supports_response_schema


 from crewai.utilities.exceptions.context_window_exceeding_exception import (
@@ -128,14 +130,17 @@ class LLM:
        presence_penalty: Optional[float] = None,
        frequency_penalty: Optional[float] = None,
        logit_bias: Optional[Dict[int, float]] = None,
-        response_format: Optional[Dict[str, Any]] = None,
+        response_format: Optional[Type[BaseModel]] = None,
        seed: Optional[int] = None,
        logprobs: Optional[int] = None,
        top_logprobs: Optional[int] = None,
        base_url: Optional[str] = None,
+        api_base: Optional[str] = None,
        api_version: Optional[str] = None,
        api_key: Optional[str] = None,
        callbacks: List[Any] = [],
+        reasoning_effort: Optional[Literal["none", "low", "medium", "high"]] = None,
+        **kwargs,
    ):
        self.model = model
        self.timeout = timeout
@@ -152,10 +157,13 @@ class LLM:
        self.logprobs = logprobs
        self.top_logprobs = top_logprobs
        self.base_url = base_url
+        self.api_base = api_base
        self.api_version = api_version
        self.api_key = api_key
        self.callbacks = callbacks
        self.context_window_size = 0
+        self.reasoning_effort = reasoning_effort
+        self.additional_params = kwargs

        litellm.drop_params = True

@@ -207,6 +215,9 @@ class LLM:
        response = llm.call(messages)
        print(response)
        """
+        # Validate parameters before proceeding with the call.
+        self._validate_call_params()
+
        if isinstance(messages, str):
            messages = [{"role": "user", "content": messages}]

@@ -232,11 +243,14 @@ class LLM:
                    "seed": self.seed,
                    "logprobs": self.logprobs,
                    "top_logprobs": self.top_logprobs,
-                    "api_base": self.base_url,
+                    "api_base": self.api_base,
+                    "base_url": self.base_url,
                    "api_version": self.api_version,
                    "api_key": self.api_key,
                    "stream": False,
                    "tools": tools,
+                    "reasoning_effort": self.reasoning_effort,
+                    **self.additional_params,
                }

                # Remove None values from params
@@ -303,6 +317,36 @@ class LLM:
                    logging.error(f"LiteLLM call failed: {str(e)}")
                raise

+    def _get_custom_llm_provider(self) -> str:
+        """
+        Derives the custom_llm_provider from the model string.
+        - For example, if the model is "openrouter/deepseek/deepseek-chat", returns "openrouter".
+        - If the model is "gemini/gemini-1.5-pro", returns "gemini".
+        - If there is no '/', defaults to "openai".
+        """
+        if "/" in self.model:
+            return self.model.split("/")[0]
+        return "openai"
+
+    def _validate_call_params(self) -> None:
+        """
+        Validate parameters before making a call. Currently this only checks if
+        a response_format is provided and whether the model supports it.
+        The custom_llm_provider is dynamically determined from the model:
+          - E.g., "openrouter/deepseek/deepseek-chat" yields "openrouter"
+          - "gemini/gemini-1.5-pro" yields "gemini"
+          - If no slash is present, "openai" is assumed.
+        """
+        provider = self._get_custom_llm_provider()
+        if self.response_format is not None and not supports_response_schema(
+            model=self.model,
+            custom_llm_provider=provider,
+        ):
+            raise ValueError(
+                f"The model {self.model} does not support response_format for provider '{provider}'. "
+                "Please remove response_format or use a supported model."
+            )
+
    def supports_function_calling(self) -> bool:
        try:
            params = get_supported_openai_params(model=self.model)
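Reviewer note: the provider derivation and the pre-call check added above can be exercised in isolation. The sketch below mirrors that logic under one assumption: `supports_schema` is a hypothetical callable standing in for litellm's `supports_response_schema`, so nothing here touches the network or the real library.

```python
from typing import Any, Callable, Optional


def derive_provider(model: str) -> str:
    """Return the text before the first '/', or 'openai' when no slash is present."""
    if "/" in model:
        return model.split("/")[0]
    return "openai"


def validate_response_format(
    model: str,
    response_format: Optional[Any],
    supports_schema: Callable[[str, str], bool],  # hypothetical stand-in
) -> None:
    """Raise ValueError when a response_format is set but the model cannot honor it."""
    provider = derive_provider(model)
    if response_format is not None and not supports_schema(model, provider):
        raise ValueError(
            f"The model {model} does not support response_format "
            f"for provider '{provider}'."
        )


if __name__ == "__main__":
    for name in ("openrouter/deepseek/deepseek-chat", "gemini/gemini-1.5-pro", "gpt-4o"):
        print(name, "->", derive_provider(name))
```

One consequence worth noting: any model string without a slash is treated as an OpenAI model, so bare names for other providers would be checked against the wrong provider.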
--- a/src/crewai/translations/en.json
+++ b/src/crewai/translations/en.json
@@ -24,7 +24,8 @@
    "manager_request": "Your best answer to your coworker asking you this, accounting for the context shared.",
    "formatted_task_instructions": "Ensure your final answer contains only the content in the following format: {output_format}\n\nEnsure the final output does not include any code block markers like ```json or ```python.",
    "human_feedback_classification": "Determine if the following feedback indicates that the user is satisfied or if further changes are needed. Respond with 'True' if further changes are needed, or 'False' if the user is satisfied. **Important** Do not include any additional commentary outside of your 'True' or 'False' response.\n\nFeedback: \"{feedback}\"",
-    "conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals."
+    "conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals.",
+    "feedback_instructions": "User feedback: {feedback}\nInstructions: Use this feedback to enhance the next output iteration.\nNote: Do not respond or add commentary."
  },
  "errors": {
    "force_final_answer_error": "You can't keep going, here is the best final answer you generated:\n\n {formatted_answer}",
--- a/src/crewai/utilities/embedding_configurator.py
+++ b/src/crewai/utilities/embedding_configurator.py
@@ -141,9 +141,11 @@ class EmbeddingConfigurator:
            AmazonBedrockEmbeddingFunction,
        )

-        return AmazonBedrockEmbeddingFunction(
-            session=config.get("session"),
-        )
+        # Allow custom model_name override with backwards compatibility
+        kwargs = {"session": config.get("session")}
+        if model_name is not None:
+            kwargs["model_name"] = model_name
+        return AmazonBedrockEmbeddingFunction(**kwargs)

    @staticmethod
    def _configure_huggingface(config, model_name):
--- a/src/crewai/utilities/llm_utils.py
+++ b/src/crewai/utilities/llm_utils.py
@@ -53,6 +53,7 @@ def create_llm(
        timeout: Optional[float] = getattr(llm_value, "timeout", None)
        api_key: Optional[str] = getattr(llm_value, "api_key", None)
        base_url: Optional[str] = getattr(llm_value, "base_url", None)
+        api_base: Optional[str] = getattr(llm_value, "api_base", None)

        created_llm = LLM(
            model=model,
@@ -62,6 +63,7 @@ def create_llm(
            timeout=timeout,
            api_key=api_key,
            base_url=base_url,
+            api_base=api_base,
        )
        return created_llm
    except Exception as e:
@@ -101,8 +103,18 @@ def _llm_via_environment_or_fallback() -> Optional[LLM]:
    callbacks: List[Any] = []

    # Optional base URL from env
-    api_base = os.environ.get("OPENAI_API_BASE") or os.environ.get("OPENAI_BASE_URL")
-    if api_base:
+    base_url = (
+        os.environ.get("BASE_URL")
+        or os.environ.get("OPENAI_API_BASE")
+        or os.environ.get("OPENAI_BASE_URL")
+    )
+
+    api_base = os.environ.get("API_BASE") or os.environ.get("AZURE_API_BASE")
+
+    # Synchronize base_url and api_base if one is populated and the other is not
+    if base_url and not api_base:
+        api_base = base_url
+    elif api_base and not base_url:
        base_url = api_base

    # Initialize llm_params dictionary
@@ -115,6 +127,7 @@ def _llm_via_environment_or_fallback() -> Optional[LLM]:
        "timeout": timeout,
        "api_key": api_key,
        "base_url": base_url,
+        "api_base": api_base,
        "api_version": api_version,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
--- a/src/crewai/utilities/training_handler.py
+++ b/src/crewai/utilities/training_handler.py
@@ -35,6 +35,4 @@ class CrewTrainingHandler(PickleHandler):
    def clear(self) -> None:
        """Clear the training data by removing the file or resetting its contents."""
        if os.path.exists(self.file_path):
-            with open(self.file_path, "wb") as file:
-                # Overwrite with an empty dictionary
-                self.save({})
+            self.save({})
--- a/tests/cassettes/test_deepseek_r1_with_open_router.yaml
+++ b/tests/cassettes/test_deepseek_r1_with_open_router.yaml
@@ -0,0 +1,100 @@
+interactions:
+- request:
+    body: '{"model": "deepseek/deepseek-r1", "messages": [{"role": "user", "content":
+      "What is the capital of France?"}], "stop": [], "stream": false}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '139'
+      host:
+      - openrouter.ai
+      http-referer:
+      - https://litellm.ai
+      user-agent:
+      - litellm/1.60.2
+      x-title:
+      - liteLLM
+    method: POST
+    uri: https://openrouter.ai/api/v1/chat/completions
+  response:
+    content: "\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n{\"id\":\"gen-1738684300-YnD5WOSczQWsW0vQG78a\",\"provider\":\"Nebius\",\"model\":\"deepseek/deepseek-r1\",\"object\":\"chat.completion\",\"created\":1738684300,\"choices\":[{\"logprobs\":null,\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"The
+      capital of France is **Paris**. Known for its iconic landmarks such as the Eiffel
+      Tower, Notre-Dame Cathedral, and the Louvre Museum, Paris has served as the
+      political and cultural center of France for centuries. \U0001F1EB\U0001F1F7\",\"refusal\":null}}],\"usage\":{\"prompt_tokens\":10,\"completion_tokens\":261,\"total_tokens\":271}}"
+    headers:
+      Access-Control-Allow-Origin:
+      - '*'
+      CF-RAY:
+      - 90cbd2ceaf3ead5e-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:51:40 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Accept-Encoding
+      x-clerk-auth-message:
+      - Invalid JWT form. A JWT consists of three parts separated by dots. (reason=token-invalid,
+        token-carrier=header)
+      x-clerk-auth-reason:
+      - token-invalid
+      x-clerk-auth-status:
+      - signed-out
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_high.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_high.yaml
@@ -0,0 +1,107 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "high", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '137'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=etTqqA9SBOnENmrFAUBIexdW0v2ZeO1x9_Ek_WChlfU-1737568920137-0.0.1.1-604800000
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFNUz7l4pwtY9xhFSPIGlwNfE4Sj\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738683828,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      81,\n    \"total_tokens\": 94,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 64,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbc745d91fb0ca-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:43:50 GMT
+      Server:
+      - cloudflare
+      Set-Cookie:
+      - __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw;
+        path=/; expires=Tue, 04-Feb-25 16:13:50 GMT; domain=.api.openai.com; HttpOnly;
+        Secure; SameSite=None
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '2169'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_163e7bd79cb5a5e62d4688245b97d1d9
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_low.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_low.yaml
@@ -0,0 +1,102 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "low", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '136'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFNWljEYFrf5qRwYj73OPQtAnPbF\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738683830,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      17,\n    \"total_tokens\": 30,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbc7551fe0b0ca-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:43:51 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '1103'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_fd7178a0e5060216d04f3bd023e8bca1
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_medium.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_medium.yaml
@@ -0,0 +1,102 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "medium", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '139'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFS8IuMeYs6Rky2UbG8wH8P5PR4k\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738684116,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      145,\n    \"total_tokens\": 158,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 128,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbce51b946afb4-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:48:39 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '2365'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_bfd83679e674c3894991477f1fb043b2
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/config/tasks.yaml
+++ b/tests/config/tasks.yaml
@@ -2,7 +2,7 @@ research_task:
  description: >
    Conduct a thorough research about {topic}
    Make sure you find any interesting and relevant information given
-    the current year is 2024.
+    the current year is 2025.
  expected_output: >
    A list with 10 bullet points of the most relevant information about {topic}
  agent: researcher
--- a/tests/llm_test.py
+++ b/tests/llm_test.py
@@ -1,6 +1,9 @@
+import os
 from time import sleep
+from unittest.mock import MagicMock, patch

 import pytest
+from pydantic import BaseModel

 from crewai.agents.agent_builder.utilities.base_token_process import TokenProcess
 from crewai.llm import LLM
@@ -154,3 +157,144 @@ def test_llm_call_with_tool_and_message_list():

    assert isinstance(result, int)
    assert result == 25
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_llm_passes_additional_params():
+    llm = LLM(
+        model="gpt-4o-mini",
+        vertex_credentials="test_credentials",
+        vertex_project="test_project",
+    )
+
+    messages = [{"role": "user", "content": "Hello, world!"}]
+
+    with patch("litellm.completion") as mocked_completion:
+        # Create mocks for response structure
+        mock_message = MagicMock()
+        mock_message.content = "Test response"
+        mock_choice = MagicMock()
+        mock_choice.message = mock_message
+        mock_response = MagicMock()
+        mock_response.choices = [mock_choice]
+        mock_response.usage = {
+            "prompt_tokens": 5,
+            "completion_tokens": 5,
+            "total_tokens": 10,
+        }
+
+        # Set up the mocked completion to return the mock response
+        mocked_completion.return_value = mock_response
+
+        result = llm.call(messages)
+
+        # Assert that litellm.completion was called once
+        mocked_completion.assert_called_once()
+
+        # Retrieve the actual arguments with which litellm.completion was called
+        _, kwargs = mocked_completion.call_args
+
+        # Check that the additional_params were passed to litellm.completion
+        assert kwargs["vertex_credentials"] == "test_credentials"
+        assert kwargs["vertex_project"] == "test_project"
+
+        # Also verify that other expected parameters are present
+        assert kwargs["model"] == "gpt-4o-mini"
+        assert kwargs["messages"] == messages
+
+        # Check the result from llm.call
+        assert result == "Test response"
+
+
+def test_get_custom_llm_provider_openrouter():
+    llm = LLM(model="openrouter/deepseek/deepseek-chat")
+    assert llm._get_custom_llm_provider() == "openrouter"
+
+
+def test_get_custom_llm_provider_gemini():
+    llm = LLM(model="gemini/gemini-1.5-pro")
+    assert llm._get_custom_llm_provider() == "gemini"
+
+
+def test_get_custom_llm_provider_openai():
+    llm = LLM(model="gpt-4")
+    assert llm._get_custom_llm_provider() == "openai"
+
+
+def test_validate_call_params_supported():
+    class DummyResponse(BaseModel):
+        a: int
+
+    # Patch supports_response_schema to simulate a supported model.
+    with patch("crewai.llm.supports_response_schema", return_value=True):
+        llm = LLM(
+            model="openrouter/deepseek/deepseek-chat", response_format=DummyResponse
+        )
+        # Should not raise any error.
+        llm._validate_call_params()
+
+
+def test_validate_call_params_not_supported():
+    class DummyResponse(BaseModel):
+        a: int
+
+    # Patch supports_response_schema to simulate an unsupported model.
+    with patch("crewai.llm.supports_response_schema", return_value=False):
+        llm = LLM(model="gemini/gemini-1.5-pro", response_format=DummyResponse)
+        with pytest.raises(ValueError) as excinfo:
+            llm._validate_call_params()
+        assert "does not support response_format" in str(excinfo.value)
+
+
+def test_validate_call_params_no_response_format():
+    # When no response_format is provided, no validation error should occur.
+    llm = LLM(model="gemini/gemini-1.5-pro", response_format=None)
+    llm._validate_call_params()
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_o3_mini_reasoning_effort_high():
+    llm = LLM(
+        model="o3-mini",
+        reasoning_effort="high",
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_o3_mini_reasoning_effort_low():
+    llm = LLM(
+        model="o3-mini",
+        reasoning_effort="low",
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_o3_mini_reasoning_effort_medium():
+    llm = LLM(
+        model="o3-mini",
+        reasoning_effort="medium",
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_deepseek_r1_with_open_router():
+    if not os.getenv("OPEN_ROUTER_API_KEY"):
+        pytest.skip("OPEN_ROUTER_API_KEY not set; skipping test.")
+
+    llm = LLM(
+        model="openrouter/deepseek/deepseek-r1",
+        base_url="https://openrouter.ai/api/v1",
+        api_key=os.getenv("OPEN_ROUTER_API_KEY"),
+    )
+    result = llm.call("What is the capital of France?")
+    assert isinstance(result, str)
+    assert "Paris" in result
--- a/tests/task_test.py
+++ b/tests/task_test.py
@@ -723,14 +723,14 @@ def test_interpolate_inputs():
    )

    task.interpolate_inputs_and_add_conversation_history(
-        inputs={"topic": "AI", "date": "2024"}
+        inputs={"topic": "AI", "date": "2025"}
    )
    assert (
        task.description
        == "Give me a list of 5 interesting ideas about AI to explore for an article, what makes them unique and interesting."
    )
    assert task.expected_output == "Bullet point list of 5 interesting ideas about AI."
-    assert task.output_file == "/tmp/AI/output_2024.txt"
+    assert task.output_file == "/tmp/AI/output_2025.txt"

    task.interpolate_inputs_and_add_conversation_history(
        inputs={"topic": "ML", "date": "2025"}
--- a/uv.lock
+++ b/uv.lock
@@ -740,7 +740,7 @@ requires-dist = [
    { name = "json-repair", specifier = ">=0.25.2" },
    { name = "json5", specifier = ">=0.10.0" },
    { name = "jsonref", specifier = ">=1.1.0" },
-    { name = "litellm", specifier = "==1.59.8" },
+    { name = "litellm", specifier = "==1.60.2" },
    { name = "mem0ai", marker = "extra == 'mem0'", specifier = ">=0.1.29" },
    { name = "openai", specifier = ">=1.13.3" },
    { name = "openpyxl", specifier = ">=3.1.5" },
@@ -2374,7 +2374,7 @@ wheels = [

 [[package]]
 name = "litellm"
-version = "1.59.8"
+version = "1.60.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "aiohttp" },
@@ -2389,9 +2389,9 @@ dependencies = [
    { name = "tiktoken" },
    { name = "tokenizers" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/86/b0/c8ec06bd1c87a92d6d824008982b3c82b450d7bd3be850a53913f1ac4907/litellm-1.59.8.tar.gz", hash = "sha256:9d645cc4460f6a9813061f07086648c4c3d22febc8e1f21c663f2b7750d90512", size = 6428607 }
+sdist = { url = "https://files.pythonhosted.org/packages/94/8f/704cdb0fdbdd49dc5062a39ae5f1a8f308ae0ffd746df6e0137fc1776b8a/litellm-1.60.2.tar.gz", hash = "sha256:a8170584fcfd6f5175201d869e61ccd8a40ffe3264fc5e53c5b805ddf8a6e05a", size = 6447447 }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/b9/38/889da058f566ef9ea321aafa25e423249492cf2a508dfdc0e5acfcf04526/litellm-1.59.8-py3-none-any.whl", hash = "sha256:2473914bd2343485a185dfe7eedb12ee5fda32da3c9d9a8b73f6966b9b20cf39", size = 6716233 },
+    { url = "https://files.pythonhosted.org/packages/8a/ba/0eaec9aee9f99fdf46ef1c0bddcfe7f5720b182f84f6ed27f13145d5ded2/litellm-1.60.2-py3-none-any.whl", hash = "sha256:1cb08cda04bf8c5ef3e690171a779979e4b16a5e3a24cd8dc1f198e7f198d5c4", size = 6746809 },
 ]

 [[package]]
@@ -3185,7 +3185,7 @@ wheels = [

 [[package]]
 name = "openai"
-version = "1.59.6"
+version = "1.61.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "anyio" },
@@ -3197,9 +3197,9 @@ dependencies = [
    { name = "tqdm" },
    { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/2e/7a/07fbe7bdabffd0a5be1bfe5903a02c4fff232e9acbae894014752a8e4def/openai-1.59.6.tar.gz", hash = "sha256:c7670727c2f1e4473f62fea6fa51475c8bc098c9ffb47bfb9eef5be23c747934", size = 344915 }
+sdist = { url = "https://files.pythonhosted.org/packages/32/2a/b3fa8790be17d632f59d4f50257b909a3f669036e5195c1ae55737274620/openai-1.61.0.tar.gz", hash = "sha256:216f325a24ed8578e929b0f1b3fb2052165f3b04b0461818adaa51aa29c71f8a", size = 350174 }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/70/45/6de8e5fd670c804b29c777e4716f1916741c71604d5c7d952eee8432f7d3/openai-1.59.6-py3-none-any.whl", hash = "sha256:b28ed44eee3d5ebe1a3ea045ee1b4b50fea36ecd50741aaa5ce5a5559c900cb6", size = 454817 },
+    { url = "https://files.pythonhosted.org/packages/93/76/70c5ad6612b3e4c89fa520266bbf2430a89cae8bd87c1e2284698af5927e/openai-1.61.0-py3-none-any.whl", hash = "sha256:e8c512c0743accbdbe77f3429a1490d862f8352045de8dc81969301eb4a4f666", size = 460623 },
 ]

 [[package]]
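
Reviewer note on the `_llm_via_environment_or_fallback` hunk in `src/crewai/utilities/llm_utils.py`: the new environment lookup can be read as a small pure function. The sketch below is only an illustration of that precedence and mirroring logic. The function name `resolve_base_urls` and the idea of passing the environment in as a mapping are assumptions made here for testability and are not part of the change.

```python
from typing import Mapping, Optional, Tuple


def resolve_base_urls(env: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (base_url, api_base) using the lookup order shown in the diff.

    Hypothetical helper: the real code reads os.environ inline.
    """
    # base_url: first non-empty value among the three variables wins.
    base_url = (
        env.get("BASE_URL")
        or env.get("OPENAI_API_BASE")
        or env.get("OPENAI_BASE_URL")
    )
    # api_base: generic variable first, then the Azure-specific one.
    api_base = env.get("API_BASE") or env.get("AZURE_API_BASE")

    # Mirror whichever value is set into the one that is missing.
    if base_url and not api_base:
        api_base = base_url
    elif api_base and not base_url:
        base_url = api_base
    return base_url, api_base
```

When both values are set independently, neither overwrites the other; when neither is set, both stay `None`.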
Author	SHA1	Message	Date
Lorenze Jay	75bd0310f3	Merge branch 'main' into brandon/improve-llm-structured-output	2025-02-04 13:44:28 -08:00
rishi154	515478473a	Fix : short_term_memory with bedrock - using user defined model(when passed as attribute) rather than default (#1959 ) * Update embedding_configurator.py Modified _configure_bedrock method to use user submitted model_name rather than default amazon.titan-embed-text-v1. Sending model_name in short_term_memory (embedder_config/config) was not working. # Passing model_name to use model_name provide by user than using default. Added if/else for backward compatibility * Update embedding_configurator.py Incorporated review comments --------- Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>	2025-02-04 16:44:07 -05:00
TomuHirata	9cf3fadd0f	Add documentation for mlflow tracing integration (#1988 ) Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com> Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>	2025-02-04 16:18:50 -05:00
jinx	89c4b3fe88	Correct current year in tasks, to get more up to date results (#2010 ) Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>	2025-02-04 16:07:22 -05:00
Vidit Ostwal	9e5c599f58	Fixed the memory documentation (#2031 )	2025-02-04 16:03:38 -05:00
Vidit Ostwal	a950e67c7d	Fixed the documentation (#2017 ) * Fixed the documentation * Fixed typo, improved description --------- Co-authored-by: Vidit Ostwal <vidit.ostwal@piramal.com> Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>	2025-02-04 12:56:00 -05:00
Brandon Hancock	3de4653023	Merge branch 'main' into brandon/improve-llm-structured-output	2025-02-04 12:42:30 -05:00
Brandon Hancock	ce6ffb1570	update docs	2025-02-04 12:41:02 -05:00
Tony Kipkemboi	de6933b2d2	Merge pull request #2028 from crewAIInc/brandon/update-litellm-for-o3 update litellm to support o3-mini and deepseek. Update docs.	2025-02-04 12:40:36 -05:00
Brandon Hancock	47b3d8f3fa	code and tests work	2025-02-04 11:44:48 -05:00
Brandon Hancock	748383d74c	update litellm to support o3-mini and deepseek. Update docs.	2025-02-04 10:58:34 -05:00
Brandon Hancock (bhancock_ai)	23b9e10323	Brandon/provide llm additional params (#2018 ) * Clean up to match enterprise * add additional params to LLM calls * make sure additional params are getting passed to llm * update docs * drop print	2025-01-31 12:53:58 -05:00
Brandon Hancock (bhancock_ai)	ddb7958da7	Clean up to match enterprise (#2009 ) * Clean up to match enterprise * improve feedback prompting	2025-01-30 18:16:10 -05:00
Brandon Hancock (bhancock_ai)	477cce321f	Fix llms (#2003 ) * iwp * add in api_base --------- Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>	2025-01-29 19:41:09 -05:00
Brandon Hancock (bhancock_ai)	7bed63a693	Bugfix/fix broken training (#1993 ) * Fixing training while refactoring code * improve prompts * make sure to raise an error when missing training data * Drop comment * fix failing tests * add clear * drop bad code * fix failing test * Fix type issues pointed out by lorenze * simplify training	2025-01-29 19:11:14 -05:00
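
Reviewer note on commit 515478473a (`_configure_bedrock` in `embedding_configurator.py`): the fix forwards `model_name` only when the caller supplied one, so the embedding function's own default still applies otherwise. A minimal sketch of that pattern follows. `FakeBedrockEmbeddingFunction` is a hypothetical stand-in for `AmazonBedrockEmbeddingFunction`, used so the example runs without external dependencies; its default model name is taken from the commit message and its constructor signature is an assumption.

```python
from typing import Any, Dict, Optional


class FakeBedrockEmbeddingFunction:
    """Hypothetical stand-in for AmazonBedrockEmbeddingFunction (assumed signature)."""

    def __init__(self, session: Any, model_name: str = "amazon.titan-embed-text-v1"):
        self.session = session
        self.model_name = model_name


def configure_bedrock(
    config: Dict[str, Any], model_name: Optional[str]
) -> FakeBedrockEmbeddingFunction:
    # Always pass the session; add model_name only when the user set one,
    # so omitting it keeps the library default (backwards compatible).
    kwargs: Dict[str, Any] = {"session": config.get("session")}
    if model_name is not None:
        kwargs["model_name"] = model_name
    return FakeBedrockEmbeddingFunction(**kwargs)


default_fn = configure_bedrock({"session": "boto-session"}, None)
custom_fn = configure_bedrock({"session": "boto-session"}, "my-custom-embed-model")
```

Building the keyword arguments conditionally avoids passing `model_name=None`, which would otherwise override the default with an invalid value.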