Merge branch 'main' into bugfix/async-flows

Drop coroutine
Better support async
2026-07-11 09:55:11 +00:00 · 2025-02-24 10:22:54 -05:00 · 2025-02-21 11:42:29 -05:00 · 2025-02-21 11:25:12 -05:00 · 2025-02-21 10:11:55 -05:00 · 2025-02-20 21:00:10 -06:00
127 changed files with 19820 additions and 4478 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -21,4 +21,5 @@ crew_tasks_output.json
 .mypy_cache
 .ruff_cache
 .venv
-agentops.log
+agentops.log
+test_flow.html
--- a/README.md
+++ b/README.md
@@ -1,10 +1,18 @@
 <div align="center">

-![Logo of CrewAI, two people rowing on a boat](./docs/crewai_logo.png)
+![Logo of CrewAI](./docs/crewai_logo.png)

 # **CrewAI**

-🤖 **CrewAI**: Production-grade framework for orchestrating sophisticated AI agent systems. From simple automations to complex real-world applications, CrewAI provides precise control and deep customization. By fostering collaborative intelligence through flexible, production-ready architecture, CrewAI empowers agents to work together seamlessly, tackling complex business challenges with predictable, consistent results.
+**CrewAI**: Production-grade framework for orchestrating sophisticated AI agent systems. From simple automations to complex real-world applications, CrewAI provides precise control and deep customization. By fostering collaborative intelligence through flexible, production-ready architecture, CrewAI empowers agents to work together seamlessly, tackling complex business challenges with predictable, consistent results.
+
+**CrewAI Enterprise**
+Want to plan, build (+ no code), deploy, monitor and iterate on your agents? See [CrewAI Enterprise](https://www.crewai.com/enterprise). Designed for complex, real-world applications, our enterprise solution offers:
+
+- **Seamless Integrations**
+- **Scalable & Secure Deployment**
+- **Actionable Insights**
+- **24/7 Support**

 <h3>

@@ -190,7 +198,7 @@ research_task:
  description: >
    Conduct a thorough research about {topic}
    Make sure you find any interesting and relevant information given
-    the current year is 2024.
+    the current year is 2025.
  expected_output: >
    A list with 10 bullet points of the most relevant information about {topic}
  agent: researcher
@@ -392,7 +400,7 @@ class AdvancedAnalysisFlow(Flow[MarketState]):
            goal="Gather and validate supporting market data",
            backstory="You excel at finding and correlating multiple data sources"
        )
-        
+
        analysis_task = Task(
            description="Analyze {sector} sector data for the past {timeframe}",
            expected_output="Detailed market analysis with confidence score",
@@ -403,7 +411,7 @@ class AdvancedAnalysisFlow(Flow[MarketState]):
            expected_output="Corroborating evidence and potential contradictions",
            agent=researcher
        )
-        
+
        # Demonstrate crew autonomy
        analysis_crew = Crew(
            agents=[analyst, researcher],
--- a/docs/concepts/agents.mdx
+++ b/docs/concepts/agents.mdx
@@ -43,7 +43,7 @@ Think of an agent as a specialized team member with specific skills, expertise,
 | **Max Retry Limit** _(optional)_        | `max_retry_limit`        | `int`                         | Maximum number of retries when an error occurs. Default is 2.                                                         |
 | **Respect Context Window** _(optional)_ | `respect_context_window` | `bool`                        | Keep messages under context window size by summarizing. Default is True.                                              |
 | **Code Execution Mode** _(optional)_    | `code_execution_mode`    | `Literal["safe", "unsafe"]`   | Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct). Default is 'safe'.                               |
-| **Embedder Config** _(optional)_        | `embedder_config`        | `Optional[Dict[str, Any]]`    | Configuration for the embedder used by the agent.                                                                     |
+| **Embedder** _(optional)_               | `embedder`               | `Optional[Dict[str, Any]]`    | Configuration for the embedder used by the agent.                                                                     |
 | **Knowledge Sources** _(optional)_      | `knowledge_sources`      | `Optional[List[BaseKnowledgeSource]]` | Knowledge sources available to the agent.                                                                     |
 | **Use System Prompt** _(optional)_      | `use_system_prompt`      | `Optional[bool]`              | Whether to use system prompt (for o1 model support). Default is True.                                                 |

@@ -152,7 +152,7 @@ agent = Agent(
    use_system_prompt=True,  # Default: True
    tools=[SerperDevTool()],  # Optional: List of tools
    knowledge_sources=None,  # Optional: List of knowledge sources
-    embedder_config=None,  # Optional: Custom embedder configuration
+    embedder=None,  # Optional: Custom embedder configuration
    system_template=None,  # Optional: Custom system prompt template
    prompt_template=None,  # Optional: Custom prompt template
    response_template=None,  # Optional: Custom response template
--- a/docs/concepts/crews.mdx
+++ b/docs/concepts/crews.mdx
@@ -23,14 +23,14 @@ A crew in crewAI represents a collaborative group of agents working together to
 | **Language** _(optional)_             | `language`             | Language used for the crew, defaults to English.                                                                                                                                                                                                          |
 | **Language File** _(optional)_        | `language_file`        | Path to the language file to be used for the crew.                                                                                                                                                                                                        |
 | **Memory** _(optional)_               | `memory`               | Utilized for storing execution memories (short-term, long-term, entity memory).                                                                                                                                                                           |
-| **Memory Config** _(optional)_        | `memory_config`        | Configuration for the memory provider to be used by the crew.                                                                                                                                                                                          |
-| **Cache** _(optional)_                | `cache`                | Specifies whether to use a cache for storing the results of tools' execution. Defaults to `True`.                                                                                                                                                          |
-| **Embedder** _(optional)_             | `embedder`             | Configuration for the embedder to be used by the crew. Mostly used by memory for now. Default is `{"provider": "openai"}`.                                                                                                                                                                     |
-| **Full Output** _(optional)_          | `full_output`          | Whether the crew should return the full output with all tasks outputs or just the final output. Defaults to `False`.                                                                                                                                       |
+| **Memory Config** _(optional)_        | `memory_config`        | Configuration for the memory provider to be used by the crew.                                                                                                                                                                                             |
+| **Cache** _(optional)_                | `cache`                | Specifies whether to use a cache for storing the results of tools' execution. Defaults to `True`.                                                                                                                                                         |
+| **Embedder** _(optional)_             | `embedder`             | Configuration for the embedder to be used by the crew. Mostly used by memory for now. Default is `{"provider": "openai"}`.                                                                                                                                |
+| **Full Output** _(optional)_          | `full_output`          | Whether the crew should return the full output with all tasks outputs or just the final output. Defaults to `False`.                                                                                                                                      |
 | **Step Callback** _(optional)_        | `step_callback`        | A function that is called after each step of every agent. This can be used to log the agent's actions or to perform other operations; it won't override the agent-specific `step_callback`.                                                               |
 | **Task Callback** _(optional)_        | `task_callback`        | A function that is called after the completion of each task. Useful for monitoring or additional operations post-task execution.                                                                                                                          |
 | **Share Crew** _(optional)_           | `share_crew`           | Whether you want to share the complete crew information and execution with the crewAI team to make the library better, and allow us to train models.                                                                                                      |
-| **Output Log File** _(optional)_      | `output_log_file`      | Whether you want to have a file with the complete crew output and execution. You can set it using True and it will default to the folder you are currently in and it will be called logs.txt or passing a string with the full path and name of the file. |
+| **Output Log File** _(optional)_      | `output_log_file`      | Set to True to save logs as logs.txt in the current directory or provide a file path. Logs will be in JSON format if the filename ends in .json, otherwise .txt. Defaults to `None`.                                                                      |
 | **Manager Agent** _(optional)_        | `manager_agent`        | `manager` sets a custom agent that will be used as a manager.                                                                                                                                                                                             |
 | **Prompt File** _(optional)_          | `prompt_file`          | Path to the prompt JSON file to be used for the crew.                                                                                                                                                                                                     |
 | **Planning** *(optional)*             | `planning`             | Adds planning ability to the Crew. When activated before each Crew iteration, all Crew data is sent to an AgentPlanner that will plan the tasks and this plan will be added to each task description.                                                     |
@@ -240,6 +240,23 @@ print(f"Tasks Output: {crew_output.tasks_output}")
 print(f"Token Usage: {crew_output.token_usage}")
 ```

+## Accessing Crew Logs
+
+You can view a real-time log of the crew execution by setting `output_log_file` to `True` (boolean) or to a file name (`str`). Events can be logged to either `file_name.txt` or `file_name.json`.
+If it is set to `True`, the logs are saved as `logs.txt`.
+
+If `output_log_file` is set to `False` (boolean) or `None`, no logs are written.
+
+```python Code
+# Save crew logs (file names must be passed as strings)
+crew = Crew(output_log_file=True)  # Logs will be saved as logs.txt
+crew = Crew(output_log_file="file_name")  # Logs will be saved as file_name.txt
+crew = Crew(output_log_file="file_name.txt")  # Logs will be saved as file_name.txt
+crew = Crew(output_log_file="file_name.json")  # Logs will be saved as file_name.json
+```
+
+
+
 ## Memory Utilization

 Crews can utilize memory (short-term, long-term, and entity memory) to enhance their execution and learning over time. This feature allows crews to store and recall execution memories, aiding in decision-making and task execution strategies.
@@ -279,9 +296,9 @@ print(result)
 Once your crew is assembled, initiate the workflow with the appropriate kickoff method. CrewAI provides several methods for better control over the kickoff process: `kickoff()`, `kickoff_for_each()`, `kickoff_async()`, and `kickoff_for_each_async()`.

 - `kickoff()`: Starts the execution process according to the defined process flow.
-- `kickoff_for_each()`: Executes tasks for each agent individually.
+- `kickoff_for_each()`: Executes tasks sequentially for each provided input event or item in the collection.
 - `kickoff_async()`: Initiates the workflow asynchronously.
-- `kickoff_for_each_async()`: Executes tasks for each agent individually in an asynchronous manner.
+- `kickoff_for_each_async()`: Executes tasks concurrently for each provided input event or item, leveraging asynchronous processing.

 ```python Code
 # Start the crew's task execution
--- a/docs/concepts/flows.mdx
+++ b/docs/concepts/flows.mdx
@@ -232,18 +232,18 @@ class UnstructuredExampleFlow(Flow):
    def first_method(self):
        # The state automatically includes an 'id' field
        print(f"State ID: {self.state['id']}")
-        self.state.message = "Hello from structured flow"
-        self.state.counter = 0
+        self.state['counter'] = 0
+        self.state['message'] = "Hello from unstructured flow"

    @listen(first_method)
    def second_method(self):
-        self.state.counter += 1
-        self.state.message += " - updated"
+        self.state['counter'] += 1
+        self.state['message'] += " - updated"

    @listen(second_method)
    def third_method(self):
-        self.state.counter += 1
-        self.state.message += " - updated again"
+        self.state['counter'] += 1
+        self.state['message'] += " - updated again"

        print(f"State after third_method: {self.state}")

--- a/docs/concepts/knowledge.mdx
+++ b/docs/concepts/knowledge.mdx
@@ -91,7 +91,7 @@ result = crew.kickoff(inputs={"question": "What city does John live in and how o
 ```


-Here's another example with the `CrewDoclingSource`. The CrewDoclingSource is actually quite versatile and can handle multiple file formats including TXT, PDF, DOCX, HTML, and more. 
+Here's another example with the `CrewDoclingSource`. The `CrewDoclingSource` is versatile and can handle multiple file formats, including MD, PDF, DOCX, HTML, and more.

 <Note>
  You need to install `docling` for the following example to work: `uv add docling`
@@ -152,10 +152,10 @@ Here are examples of how to use different types of knowledge sources:

 ### Text File Knowledge Source
 ```python
-from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
+from crewai.knowledge.source.text_file_knowledge_source import TextFileKnowledgeSource

 # Create a text file knowledge source
-text_source = CrewDoclingSource(
+text_source = TextFileKnowledgeSource(
    file_paths=["document.txt", "another.txt"]
 )

@@ -324,6 +324,13 @@ agent = Agent(
    verbose=True,
    allow_delegation=False,
    llm=gemini_llm,
+    embedder={
+        "provider": "google",
+        "config": {
+            "model": "models/text-embedding-004",
+            "api_key": GEMINI_API_KEY,
+        }
+    }
 )

 task = Task(
--- a/docs/concepts/llms.mdx
+++ b/docs/concepts/llms.mdx
@@ -27,155 +27,6 @@ Large Language Models (LLMs) are the core intelligence behind CrewAI agents. The
  </Card>
 </CardGroup>

-## Available Models and Their Capabilities
-
-Here's a detailed breakdown of supported models and their capabilities, you can compare performance at [lmarena.ai](https://lmarena.ai/?leaderboard) and [artificialanalysis.ai](https://artificialanalysis.ai/):
-
-<Tabs>
-  <Tab title="OpenAI">
-    | Model | Context Window | Best For |
-    |-------|---------------|-----------|
-    | GPT-4 | 8,192 tokens | High-accuracy tasks, complex reasoning |
-    | GPT-4 Turbo | 128,000 tokens | Long-form content, document analysis |
-    | GPT-4o & GPT-4o-mini | 128,000 tokens | Cost-effective large context processing |
-
-    <Note>
-      1 token ≈ 4 characters in English. For example, 8,192 tokens ≈ 32,768 characters or about 6,000 words.
-    </Note>
-  </Tab>
-  <Tab title="Nvidia NIM">
-    | Model | Context Window | Best For |
-    |-------|---------------|-----------|
-    | nvidia/mistral-nemo-minitron-8b-8k-instruct | 8,192 tokens | State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation. |
-    | nvidia/nemotron-4-mini-hindi-4b-instruct| 4,096 tokens | A bilingual Hindi-English SLM for on-device inference, tailored specifically for Hindi Language. |
-    | "nvidia/llama-3.1-nemotron-70b-instruct | 128k tokens | Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA in order to improve the helpfulness of LLM generated responses. |
-    | nvidia/llama3-chatqa-1.5-8b | 128k tokens | Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines. |
-    | nvidia/llama3-chatqa-1.5-70b | 128k tokens | Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines. |
-    | nvidia/vila | 128k tokens | Multi-modal vision-language model that understands text/img/video and creates informative responses |
-    | nvidia/neva-22| 4,096 tokens | Multi-modal vision-language model that understands text/images and generates informative responses |
-    | nvidia/nemotron-mini-4b-instruct | 8,192 tokens | General-purpose tasks |
-    | nvidia/usdcode-llama3-70b-instruct | 128k tokens | State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code. |
-    | nvidia/nemotron-4-340b-instruct | 4,096 tokens | Creates diverse synthetic data that mimics the characteristics of real-world data. |
-    | meta/codellama-70b | 100k tokens | LLM capable of generating code from natural language and vice versa. |
-    | meta/llama2-70b | 4,096 tokens | Cutting-edge large language AI model capable of generating text and code in response to prompts. |
-    | meta/llama3-8b-instruct | 8,192 tokens | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
-    | meta/llama3-70b-instruct | 8,192 tokens | Powers complex conversations with superior contextual understanding, reasoning and text generation. |
-    | meta/llama-3.1-8b-instruct | 128k tokens | Advanced state-of-the-art model with language understanding, superior reasoning, and text generation. |
-    | meta/llama-3.1-70b-instruct | 128k tokens | Powers complex conversations with superior contextual understanding, reasoning and text generation. |
-    | meta/llama-3.1-405b-instruct | 128k tokens | Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks. |
-    | meta/llama-3.2-1b-instruct | 128k tokens | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
-    | meta/llama-3.2-3b-instruct | 128k tokens | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
-    | meta/llama-3.2-11b-vision-instruct | 128k tokens | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
-    | meta/llama-3.2-90b-vision-instruct | 128k tokens | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
-    | meta/llama-3.1-70b-instruct | 128k tokens | Powers complex conversations with superior contextual understanding, reasoning and text generation. |
-    | google/gemma-7b | 8,192 tokens | Cutting-edge text generation model text understanding, transformation, and code generation. |
-    | google/gemma-2b | 8,192 tokens | Cutting-edge text generation model text understanding, transformation, and code generation. |
-    | google/codegemma-7b | 8,192 tokens | Cutting-edge model built on Google's Gemma-7B specialized for code generation and code completion. |
-    | google/codegemma-1.1-7b | 8,192 tokens | Advanced programming model for code generation, completion, reasoning, and instruction following. |
-    | google/recurrentgemma-2b | 8,192 tokens | Novel recurrent architecture based language model for faster inference when generating long sequences. |
-    | google/gemma-2-9b-it | 8,192 tokens | Cutting-edge text generation model text understanding, transformation, and code generation. |
-    | google/gemma-2-27b-it | 8,192 tokens | Cutting-edge text generation model text understanding, transformation, and code generation. |
-    | google/gemma-2-2b-it | 8,192 tokens | Cutting-edge text generation model text understanding, transformation, and code generation. |
-    | google/deplot | 512 tokens | One-shot visual language understanding model that translates images of plots into tables. |
-    | google/paligemma | 8,192 tokens | Vision language model adept at comprehending text and visual inputs to produce informative responses. |
-    | mistralai/mistral-7b-instruct-v0.2 | 32k tokens | This LLM follows instructions, completes requests, and generates creative text. |
-    | mistralai/mixtral-8x7b-instruct-v0.1 | 8,192 tokens | An MOE LLM that follows instructions, completes requests, and generates creative text. |
-    | mistralai/mistral-large | 4,096 tokens | Creates diverse synthetic data that mimics the characteristics of real-world data. |
-    | mistralai/mixtral-8x22b-instruct-v0.1 | 8,192 tokens | Creates diverse synthetic data that mimics the characteristics of real-world data. |
-    | mistralai/mistral-7b-instruct-v0.3 | 32k tokens | This LLM follows instructions, completes requests, and generates creative text. |
-    | nv-mistralai/mistral-nemo-12b-instruct | 128k tokens | Most advanced language model for reasoning, code, multilingual tasks; runs on a single GPU. |
-    | mistralai/mamba-codestral-7b-v0.1 | 256k tokens | Model for writing and interacting with code across a wide range of programming languages and tasks. |
-    | microsoft/phi-3-mini-128k-instruct | 128K tokens | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
-    | microsoft/phi-3-mini-4k-instruct | 4,096 tokens | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
-    | microsoft/phi-3-small-8k-instruct | 8,192 tokens | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
-    | microsoft/phi-3-small-128k-instruct | 128K tokens | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
-    | microsoft/phi-3-medium-4k-instruct | 4,096 tokens | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
-    | microsoft/phi-3-medium-128k-instruct | 128K tokens | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
-    | microsoft/phi-3.5-mini-instruct | 128K tokens | Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments |
-    | microsoft/phi-3.5-moe-instruct | 128K tokens | Advanced LLM based on Mixture of Experts architecure to deliver compute efficient content generation |
-    | microsoft/kosmos-2 | 1,024 tokens | Groundbreaking multimodal model designed to understand and reason about visual elements in images. |
-    | microsoft/phi-3-vision-128k-instruct | 128k tokens | Cutting-edge open multimodal model exceling in high-quality reasoning from images. |
-    | microsoft/phi-3.5-vision-instruct | 128k tokens | Cutting-edge open multimodal model exceling in high-quality reasoning from images. |
-    | databricks/dbrx-instruct | 12k tokens | A general-purpose LLM with state-of-the-art performance in language understanding, coding, and RAG. |
-    | snowflake/arctic | 1,024 tokens | Delivers high efficiency inference for enterprise applications focused on SQL generation and coding. |
-    | aisingapore/sea-lion-7b-instruct | 4,096 tokens | LLM to represent and serve the linguistic and cultural diversity of Southeast Asia |
-    | ibm/granite-8b-code-instruct | 4,096 tokens | Software programming LLM for code generation, completion, explanation, and multi-turn conversion. |
-    | ibm/granite-34b-code-instruct | 8,192 tokens | Software programming LLM for code generation, completion, explanation, and multi-turn conversion. |
-    | ibm/granite-3.0-8b-instruct | 4,096 tokens | Advanced Small Language Model supporting RAG, summarization, classification, code, and agentic AI |
-    | ibm/granite-3.0-3b-a800m-instruct | 4,096 tokens | Highly efficient Mixture of Experts model for RAG, summarization, entity extraction, and classification |
-    | mediatek/breeze-7b-instruct | 4,096 tokens | Creates diverse synthetic data that mimics the characteristics of real-world data. |
-    | upstage/solar-10.7b-instruct | 4,096 tokens | Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics. |
-    | writer/palmyra-med-70b-32k | 32k tokens | Leading LLM for accurate, contextually relevant responses in the medical domain. |
-    | writer/palmyra-med-70b | 32k tokens | Leading LLM for accurate, contextually relevant responses in the medical domain. |
-    | writer/palmyra-fin-70b-32k | 32k tokens | Specialized LLM for financial analysis, reporting, and data processing |
-    | 01-ai/yi-large | 32k tokens | Powerful model trained on English and Chinese for diverse tasks including chatbot and creative writing. |
-    | deepseek-ai/deepseek-coder-6.7b-instruct | 2k tokens | Powerful coding model offering advanced capabilities in code generation, completion, and infilling |
-    | rakuten/rakutenai-7b-instruct | 1,024 tokens | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
-    | rakuten/rakutenai-7b-chat | 1,024 tokens | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
-    | baichuan-inc/baichuan2-13b-chat | 4,096 tokens | Support Chinese and English chat, coding, math, instruction following, solving quizzes |
-
-    <Note>
-      NVIDIA's NIM support for models is expanding continuously! For the most up-to-date list of available models, please visit build.nvidia.com.
-    </Note>
-  </Tab>
-  <Tab title="Gemini">
-    | Model | Context Window | Best For |
-    |-------|---------------|-----------|
-    | gemini-2.0-flash-exp | 1M tokens | Higher quality at faster speed, multimodal model, good for most tasks |
-    | gemini-1.5-flash | 1M tokens | Balanced multimodal model, good for most tasks |
-    | gemini-1.5-flash-8B | 1M tokens | Fastest, most cost-efficient, good for high-frequency tasks |
-    | gemini-1.5-pro | 2M tokens | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
-
-    <Tip>
-      Google's Gemini models are all multimodal, supporting audio, images, video and text, supporting context caching, json schema, function calling, etc.
-
-      These models are available via API_KEY from 
-      [The Gemini API](https://ai.google.dev/gemini-api/docs) and also from 
-      [Google Cloud Vertex](https://cloud.google.com/vertex-ai/generative-ai/docs/migrate/migrate-google-ai) as part of the
-      [Model Garden](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models).
-    </Tip>
-  </Tab>
-  <Tab title="Groq">
-    | Model | Context Window | Best For |
-    |-------|---------------|-----------|
-    | Llama 3.1 70B/8B | 131,072 tokens | High-performance, large context tasks |
-    | Llama 3.2 Series | 8,192 tokens | General-purpose tasks |
-    | Mixtral 8x7B | 32,768 tokens | Balanced performance and context |
-
-    <Tip>
-      Groq is known for its fast inference speeds, making it suitable for real-time applications.
-    </Tip>
-  </Tab>
-  <Tab title="SambaNova">
-    | Model | Context Window | Best For |
-    |-------|---------------|-----------|
-    | Llama 3.1 70B/8B | Up to 131,072 tokens | High-performance, large context tasks |
-    | Llama 3.1 405B | 8,192 tokens | High-performance and output quality |
-    | Llama 3.2 Series | 8,192 tokens | General-purpose tasks, multimodal |
-    | Llama 3.3 70B | Up to 131,072 tokens | High-performance and output quality|
-    | Qwen2 familly | 8,192 tokens | High-performance and output quality |
-
-    <Tip>
-      [SambaNova](https://cloud.sambanova.ai/) has several models with fast inference speed at full precision.
-    </Tip>
-  </Tab>
-  <Tab title="Others">
-    | Provider | Context Window | Key Features |
-    |----------|---------------|--------------|
-    | Deepseek Chat | 128,000 tokens | Specialized in technical discussions |
-    | Claude 3 | Up to 200K tokens | Strong reasoning, code understanding |
-    | Gemma Series | 8,192 tokens | Efficient, smaller-scale tasks |
-
-    <Info>
-      Provider selection should consider factors like:
-      - API availability in your region
-      - Pricing structure
-      - Required features (e.g., streaming, function calling)
-      - Performance requirements
-    </Info>
-  </Tab>
-</Tabs>
-
 ## Setting Up Your LLM

 There are three ways to configure LLMs in CrewAI. Choose the method that best fits your workflow:
@@ -204,98 +55,12 @@ There are three ways to configure LLMs in CrewAI. Choose the method that best fi

    ```yaml
    researcher:
-        # Agent Definition
        role: Research Specialist
        goal: Conduct comprehensive research and analysis
        backstory: A dedicated research professional with years of experience
        verbose: true
-
-        # Model Selection (uncomment your choice)
-        
-        # OpenAI Models - Known for reliability and performance
-        llm: openai/gpt-4o-mini
-        # llm: openai/gpt-4        # More accurate but expensive
-        # llm: openai/gpt-4-turbo  # Fast with large context
-        # llm: openai/gpt-4o       # Optimized for longer texts
-        # llm: openai/o1-preview   # Latest features
-        # llm: openai/o1-mini      # Cost-effective
-
-        # Azure Models - For enterprise deployments
-        # llm: azure/gpt-4o-mini
-        # llm: azure/gpt-4
-        # llm: azure/gpt-35-turbo
-
-        # Anthropic Models - Strong reasoning capabilities
-        # llm: anthropic/claude-3-opus-20240229-v1:0
-        # llm: anthropic/claude-3-sonnet-20240229-v1:0
-        # llm: anthropic/claude-3-haiku-20240307-v1:0
-        # llm: anthropic/claude-2.1
-        # llm: anthropic/claude-2.0
-
-        # Google Models - Strong reasoning, large cachable context window, multimodal
-        # llm: gemini/gemini-1.5-pro-latest
-        # llm: gemini/gemini-1.5-flash-latest
-        # llm: gemini/gemini-1.5-flash-8b-latest
-
-        # AWS Bedrock Models - Enterprise-grade
-        # llm: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
-        # llm: bedrock/anthropic.claude-v2:1
-        # llm: bedrock/amazon.titan-text-express-v1
-        # llm: bedrock/meta.llama2-70b-chat-v1
-
-        # Amazon SageMaker Models - Enterprise-grade
-        # llm: sagemaker/<my-endpoint>
-
-        # Mistral Models - Open source alternative
-        # llm: mistral/mistral-large-latest
-        # llm: mistral/mistral-medium-latest
-        # llm: mistral/mistral-small-latest
-
-        # Groq Models - Fast inference
-        # llm: groq/mixtral-8x7b-32768
-        # llm: groq/llama-3.1-70b-versatile
-        # llm: groq/llama-3.2-90b-text-preview
-        # llm: groq/gemma2-9b-it
-        # llm: groq/gemma-7b-it
-
-        # IBM watsonx.ai Models - Enterprise features
-        # llm: watsonx/ibm/granite-13b-chat-v2
-        # llm: watsonx/meta-llama/llama-3-1-70b-instruct
-        # llm: watsonx/bigcode/starcoder2-15b
-
-        # Ollama Models - Local deployment
-        # llm: ollama/llama3:70b
-        # llm: ollama/codellama
-        # llm: ollama/mistral
-        # llm: ollama/mixtral
-        # llm: ollama/phi
-
-        # Fireworks AI Models - Specialized tasks
-        # llm: fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct
-        # llm: fireworks_ai/accounts/fireworks/models/mixtral-8x7b
-        # llm: fireworks_ai/accounts/fireworks/models/zephyr-7b-beta
-
-        # Perplexity AI Models - Research focused
-        # llm: pplx/llama-3.1-sonar-large-128k-online
-        # llm: pplx/mistral-7b-instruct
-        # llm: pplx/codellama-34b-instruct
-        # llm: pplx/mixtral-8x7b-instruct
-
-        # Hugging Face Models - Community models
-        # llm: huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct
-        # llm: huggingface/mistralai/Mixtral-8x7B-Instruct-v0.1
-        # llm: huggingface/tiiuae/falcon-180B-chat
-        # llm: huggingface/google/gemma-7b-it
-
-        # Nvidia NIM Models - GPU-optimized
-        # llm: nvidia_nim/meta/llama3-70b-instruct
-        # llm: nvidia_nim/mistral/mixtral-8x7b
-        # llm: nvidia_nim/google/gemma-7b
-
-        # SambaNova Models - Enterprise AI
-        # llm: sambanova/Meta-Llama-3.1-8B-Instruct
-        # llm: sambanova/BioMistral-7B
-        # llm: sambanova/Falcon-180B
+        llm: openai/gpt-4o-mini # your model here 
+        # (see provider configuration examples below for more)
    ```

    <Info>
@@ -343,6 +108,465 @@ There are three ways to configure LLMs in CrewAI. Choose the method that best fi
  </Tab>
 </Tabs>

+## Provider Configuration Examples
+
+
+CrewAI supports many LLM providers, each with its own features, authentication methods, and model capabilities.
+This section shows how to configure each provider and, where available, lists its models so you can choose one that fits your project.
+
+<AccordionGroup>
+  <Accordion title="OpenAI">
+    Set the following environment variables in your `.env` file:
+
+    ```toml Code
+    # Required
+    OPENAI_API_KEY=sk-...
+    
+    # Optional
+    OPENAI_API_BASE=<custom-base-url>
+    OPENAI_ORGANIZATION=<your-org-id>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="openai/gpt-4", # call model by provider/model_name
+        temperature=0.8,
+        max_tokens=150,
+        top_p=0.9,
+        frequency_penalty=0.1,
+        presence_penalty=0.1,
+        stop=["END"],
+        seed=42
+    )
+    ```
+
+    OpenAI is one of the leading providers of LLMs with a wide range of models and features.
+
+    | Model               | Context Window   | Best For                                      |
+    |---------------------|------------------|-----------------------------------------------|
+    | GPT-4               | 8,192 tokens     | High-accuracy tasks, complex reasoning        |
+    | GPT-4 Turbo         | 128,000 tokens   | Long-form content, document analysis          |
+    | GPT-4o & GPT-4o-mini  | 128,000 tokens  | Cost-effective large context processing       |
+    | o3-mini             | 200,000 tokens   | Fast reasoning, complex reasoning             |
+    | o1-mini             | 128,000 tokens   | Fast reasoning, complex reasoning             |
+    | o1-preview          | 128,000 tokens   | Fast reasoning, complex reasoning             |
+    | o1                  | 200,000 tokens   | Fast reasoning, complex reasoning             |
+  </Accordion>
+
+  <Accordion title="Anthropic">
+    ```toml Code
+    ANTHROPIC_API_KEY=sk-ant-...
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="anthropic/claude-3-sonnet-20240229-v1:0",
+        temperature=0.7
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="Google">
+    Set the following environment variables in your `.env` file:
+
+    ```toml Code
+    # Option 1: Gemini accessed with an API key.
+    # https://ai.google.dev/gemini-api/docs/api-key
+    GEMINI_API_KEY=<your-api-key>
+
+    # Option 2: Vertex AI IAM credentials for Gemini, Anthropic, and Model Garden.
+    # https://cloud.google.com/vertex-ai/generative-ai/docs/overview
+    ```
+
+    Get credentials from your Google Cloud Console, save them to a JSON file, and load that file with the following code:
+    ```python Code
+    import json
+
+    file_path = 'path/to/vertex_ai_service_account.json'
+
+    # Load the JSON file
+    with open(file_path, 'r') as file:
+        vertex_credentials = json.load(file)
+
+    # Convert the credentials to a JSON string
+    vertex_credentials_json = json.dumps(vertex_credentials)
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="gemini/gemini-1.5-pro-latest",
+        temperature=0.7,
+        vertex_credentials=vertex_credentials_json
+    )
+    ```
+    Google offers a range of powerful models optimized for different use cases:
+
+    | Model                  | Context Window | Best For                                                          |
+    |-----------------------|----------------|------------------------------------------------------------------|
+    | gemini-2.0-flash-exp  | 1M tokens      | Higher quality at faster speed, multimodal model, good for most tasks |
+    | gemini-1.5-flash      | 1M tokens      | Balanced multimodal model, good for most tasks                    |
+    | gemini-1.5-flash-8B   | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
+    | gemini-1.5-pro        | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
+  </Accordion>
+
+  <Accordion title="Azure">
+    ```toml Code
+    # Required
+    AZURE_API_KEY=<your-api-key>
+    AZURE_API_BASE=<your-resource-url>
+    AZURE_API_VERSION=<api-version>
+    
+    # Optional
+    AZURE_AD_TOKEN=<your-azure-ad-token>
+    AZURE_API_TYPE=<your-azure-api-type>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="azure/gpt-4",
+        api_version="2023-05-15"
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="AWS Bedrock">
+    ```toml Code
+    AWS_ACCESS_KEY_ID=<your-access-key>
+    AWS_SECRET_ACCESS_KEY=<your-secret-key>
+    AWS_DEFAULT_REGION=<your-region>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0"
+    )
+    ```
+  </Accordion>
+  
+  <Accordion title="Amazon SageMaker">
+    ```toml Code
+    AWS_ACCESS_KEY_ID=<your-access-key>
+    AWS_SECRET_ACCESS_KEY=<your-secret-key>
+    AWS_DEFAULT_REGION=<your-region>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="sagemaker/<my-endpoint>"
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="Mistral">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    MISTRAL_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="mistral/mistral-large-latest",
+        temperature=0.7
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="Nvidia NIM">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    NVIDIA_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="nvidia_nim/meta/llama3-70b-instruct",
+        temperature=0.7
+    )
+    ```
+
+    Nvidia NIM provides a comprehensive suite of models for various use cases, from general-purpose tasks to specialized applications.
+
+    | Model                                                                   | Context Window | Best For                                                          |
+    |-------------------------------------------------------------------------|----------------|-------------------------------------------------------------------|
+    | nvidia/mistral-nemo-minitron-8b-8k-instruct                              | 8,192 tokens   | State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation. |
+    | nvidia/nemotron-4-mini-hindi-4b-instruct                                 | 4,096 tokens   | A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language. |
+    | nvidia/llama-3.1-nemotron-70b-instruct                                  | 128k tokens    | Customized for enhanced helpfulness in responses                  |
+    | nvidia/llama3-chatqa-1.5-8b                                                | 128k tokens    | Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines. |
+    | nvidia/llama3-chatqa-1.5-70b                                               | 128k tokens    | Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines. |
+    | nvidia/vila                                                             | 128k tokens    | Multi-modal vision-language model that understands text/img/video and creates informative responses |
+    | nvidia/neva-22                                                          | 4,096 tokens   | Multi-modal vision-language model that understands text/images and generates informative responses |
+    | nvidia/nemotron-mini-4b-instruct                                         | 8,192 tokens   | General-purpose tasks |
+    | nvidia/usdcode-llama3-70b-instruct                                       | 128k tokens    | State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code. |
+    | nvidia/nemotron-4-340b-instruct                                          | 4,096 tokens   | Creates diverse synthetic data that mimics the characteristics of real-world data. |
+    | meta/codellama-70b                                                      | 100k tokens    | LLM capable of generating code from natural language and vice versa. |
+    | meta/llama2-70b                                                         | 4,096 tokens   | Cutting-edge large language AI model capable of generating text and code in response to prompts. |
+    | meta/llama3-8b-instruct                                                | 8,192 tokens   | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
+    | meta/llama3-70b-instruct                                               | 8,192 tokens   | Powers complex conversations with superior contextual understanding, reasoning and text generation. |
+    | meta/llama-3.1-8b-instruct                                             | 128k tokens    | Advanced state-of-the-art model with language understanding, superior reasoning, and text generation. |
+    | meta/llama-3.1-70b-instruct                                            | 128k tokens    | Powers complex conversations with superior contextual understanding, reasoning and text generation. |
+    | meta/llama-3.1-405b-instruct                                           | 128k tokens    | Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks. |
+    | meta/llama-3.2-1b-instruct                                             | 128k tokens    | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
+    | meta/llama-3.2-3b-instruct                                             | 128k tokens    | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
+    | meta/llama-3.2-11b-vision-instruct                                     | 128k tokens    | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
+    | meta/llama-3.2-90b-vision-instruct                                     | 128k tokens    | Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation. |
+    | google/gemma-7b                                                        | 8,192 tokens   | Cutting-edge text generation model for text understanding, transformation, and code generation. |
+    | google/gemma-2b                                                        | 8,192 tokens   | Cutting-edge text generation model for text understanding, transformation, and code generation. |
+    | google/codegemma-7b                                                    | 8,192 tokens   | Cutting-edge model built on Google's Gemma-7B specialized for code generation and code completion. |
+    | google/codegemma-1.1-7b                                               | 8,192 tokens   | Advanced programming model for code generation, completion, reasoning, and instruction following. |
+    | google/recurrentgemma-2b                                              | 8,192 tokens   | Novel recurrent architecture based language model for faster inference when generating long sequences. |
+    | google/gemma-2-9b-it                                                  | 8,192 tokens   | Cutting-edge text generation model for text understanding, transformation, and code generation. |
+    | google/gemma-2-27b-it                                                 | 8,192 tokens   | Cutting-edge text generation model for text understanding, transformation, and code generation. |
+    | google/gemma-2-2b-it                                                  | 8,192 tokens   | Cutting-edge text generation model for text understanding, transformation, and code generation. |
+    | google/deplot                                                         | 512 tokens     | One-shot visual language understanding model that translates images of plots into tables. |
+    | google/paligemma                                                      | 8,192 tokens   | Vision language model adept at comprehending text and visual inputs to produce informative responses. |
+    | mistralai/mistral-7b-instruct-v0.2                                   | 32k tokens     | This LLM follows instructions, completes requests, and generates creative text. |
+    | mistralai/mixtral-8x7b-instruct-v0.1                                 | 8,192 tokens   | An MOE LLM that follows instructions, completes requests, and generates creative text. |
+    | mistralai/mistral-large                                              | 4,096 tokens   | Creates diverse synthetic data that mimics the characteristics of real-world data. |
+    | mistralai/mixtral-8x22b-instruct-v0.1                               | 8,192 tokens   | Creates diverse synthetic data that mimics the characteristics of real-world data. |
+    | mistralai/mistral-7b-instruct-v0.3                                  | 32k tokens     | This LLM follows instructions, completes requests, and generates creative text. |
+    | nv-mistralai/mistral-nemo-12b-instruct                              | 128k tokens    | Most advanced language model for reasoning, code, multilingual tasks; runs on a single GPU. |
+    | mistralai/mamba-codestral-7b-v0.1                                   | 256k tokens    | Model for writing and interacting with code across a wide range of programming languages and tasks. |
+    | microsoft/phi-3-mini-128k-instruct                                  | 128K tokens    | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
+    | microsoft/phi-3-mini-4k-instruct                                    | 4,096 tokens   | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
+    | microsoft/phi-3-small-8k-instruct                                   | 8,192 tokens   | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
+    | microsoft/phi-3-small-128k-instruct                                 | 128K tokens    | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
+    | microsoft/phi-3-medium-4k-instruct                                  | 4,096 tokens   | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
+    | microsoft/phi-3-medium-128k-instruct                                | 128K tokens    | Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills. |
+    | microsoft/phi-3.5-mini-instruct                                     | 128K tokens    | Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments |
+    | microsoft/phi-3.5-moe-instruct                                      | 128K tokens    | Advanced LLM based on Mixture of Experts architecture to deliver compute-efficient content generation |
+    | microsoft/kosmos-2                                                  | 1,024 tokens   | Groundbreaking multimodal model designed to understand and reason about visual elements in images. |
+    | microsoft/phi-3-vision-128k-instruct                               | 128k tokens    | Cutting-edge open multimodal model excelling in high-quality reasoning from images. |
+    | microsoft/phi-3.5-vision-instruct                                  | 128k tokens    | Cutting-edge open multimodal model excelling in high-quality reasoning from images. |
+    | databricks/dbrx-instruct                                           | 12k tokens     | A general-purpose LLM with state-of-the-art performance in language understanding, coding, and RAG. |
+    | snowflake/arctic                                                   | 1,024 tokens   | Delivers high efficiency inference for enterprise applications focused on SQL generation and coding. |
+    | aisingapore/sea-lion-7b-instruct                                  | 4,096 tokens   | LLM to represent and serve the linguistic and cultural diversity of Southeast Asia |
+    | ibm/granite-8b-code-instruct                                      | 4,096 tokens   | Software programming LLM for code generation, completion, explanation, and multi-turn conversion. |
+    | ibm/granite-34b-code-instruct                                     | 8,192 tokens   | Software programming LLM for code generation, completion, explanation, and multi-turn conversion. |
+    | ibm/granite-3.0-8b-instruct                                       | 4,096 tokens   | Advanced Small Language Model supporting RAG, summarization, classification, code, and agentic AI |
+    | ibm/granite-3.0-3b-a800m-instruct                                | 4,096 tokens   | Highly efficient Mixture of Experts model for RAG, summarization, entity extraction, and classification |
+    | mediatek/breeze-7b-instruct                                       | 4,096 tokens   | Creates diverse synthetic data that mimics the characteristics of real-world data. |
+    | upstage/solar-10.7b-instruct                                      | 4,096 tokens   | Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics. |
+    | writer/palmyra-med-70b-32k                                        | 32k tokens     | Leading LLM for accurate, contextually relevant responses in the medical domain. |
+    | writer/palmyra-med-70b                                            | 32k tokens     | Leading LLM for accurate, contextually relevant responses in the medical domain. |
+    | writer/palmyra-fin-70b-32k                                        | 32k tokens     | Specialized LLM for financial analysis, reporting, and data processing |
+    | 01-ai/yi-large                                                    | 32k tokens     | Powerful model trained on English and Chinese for diverse tasks including chatbot and creative writing. |
+    | deepseek-ai/deepseek-coder-6.7b-instruct                         | 2k tokens      | Powerful coding model offering advanced capabilities in code generation, completion, and infilling |
+    | rakuten/rakutenai-7b-instruct                                     | 1,024 tokens   | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
+    | rakuten/rakutenai-7b-chat                                         | 1,024 tokens   | Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation. |
+    | baichuan-inc/baichuan2-13b-chat                                  | 4,096 tokens   | Supports Chinese and English chat, coding, math, instruction following, and quiz solving |
+  </Accordion>
+
+  <Accordion title="Groq">
+    Set the following environment variables in your `.env` file:
+
+    ```toml Code
+    GROQ_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="groq/llama-3.2-90b-text-preview",
+        temperature=0.7
+    )
+    ```
+    | Model             | Context Window   | Best For                                   |
+    |-------------------|------------------|--------------------------------------------|
+    | Llama 3.1 70B/8B  | 131,072 tokens   | High-performance, large context tasks      |
+    | Llama 3.2 Series  | 8,192 tokens     | General-purpose tasks                      |
+    | Mixtral 8x7B      | 32,768 tokens    | Balanced performance and context           |
+  </Accordion>
+
+  <Accordion title="IBM watsonx.ai">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    # Required
+    WATSONX_URL=<your-url>
+    WATSONX_APIKEY=<your-apikey>
+    WATSONX_PROJECT_ID=<your-project-id>
+    
+    # Optional
+    WATSONX_TOKEN=<your-token>
+    WATSONX_DEPLOYMENT_SPACE_ID=<your-space-id>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="watsonx/meta-llama/llama-3-1-70b-instruct",
+        base_url="https://api.watsonx.ai/v1"
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="Ollama (Local LLMs)">
+    1. Install Ollama: [ollama.ai](https://ollama.ai/)
+    2. Run the model you plan to configure: `ollama run llama3:70b`
+    3. Configure:
+
+    ```python Code
+    llm = LLM(
+        model="ollama/llama3:70b",
+        base_url="http://localhost:11434"
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="Fireworks AI">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    FIREWORKS_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct",
+        temperature=0.7
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="Perplexity AI">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    PERPLEXITY_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="llama-3.1-sonar-large-128k-online",
+        base_url="https://api.perplexity.ai/"
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="Hugging Face">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    HUGGINGFACE_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct",
+        base_url="your_api_endpoint"
+    )
+    ```
+  </Accordion>
+
+  <Accordion title="SambaNova">
+    Set the following environment variables in your `.env` file:
+
+    ```toml Code
+    SAMBANOVA_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="sambanova/Meta-Llama-3.1-8B-Instruct",
+        temperature=0.7
+    )
+    ```
+    | Model              | Context Window         | Best For                                     |
+    |--------------------|------------------------|----------------------------------------------|
+    | Llama 3.1 70B/8B   | Up to 131,072 tokens   | High-performance, large context tasks        |
+    | Llama 3.1 405B     | 8,192 tokens           | High-performance and output quality          |
+    | Llama 3.2 Series   | 8,192 tokens           | General-purpose, multimodal tasks            |
+    | Llama 3.3 70B      | Up to 131,072 tokens   | High-performance and output quality          |
+    | Qwen2 family       | 8,192 tokens           | High-performance and output quality          |
+  </Accordion>
+
+  <Accordion title="Cerebras">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    # Required
+    CEREBRAS_API_KEY=<your-api-key>
+    ```
+
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="cerebras/llama3.1-70b",
+        temperature=0.7,
+        max_tokens=8192
+    )
+    ```
+
+    <Info>
+      Cerebras features:
+      - Fast inference speeds
+      - Competitive pricing
+      - Good balance of speed and quality
+      - Support for long context windows
+    </Info>
+  </Accordion>
+
+  <Accordion title="Open Router">
+    Set the following environment variables in your `.env` file:
+    ```toml Code
+    OPENROUTER_API_KEY=<your-api-key>
+    ```
+    
+    Example usage in your CrewAI project:
+    ```python Code
+    llm = LLM(
+        model="openrouter/deepseek/deepseek-r1",
+        base_url="https://openrouter.ai/api/v1",
+        api_key=os.environ["OPENROUTER_API_KEY"]  # requires "import os" at the top of your script
+    )
+    ```
+
+    <Info>
+      Open Router models:
+      - openrouter/deepseek/deepseek-r1
+      - openrouter/deepseek/deepseek-chat
+    </Info>
+  </Accordion>
+</AccordionGroup>
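
The required environment variables listed in the provider examples above can be checked before an `LLM` is constructed. The following is a minimal sketch in plain Python and is not part of CrewAI: `REQUIRED_ENV` and `missing_env_vars` are hypothetical names, and the table copies only a few of the variables marked as required above.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical lookup table: provider prefix -> variables the examples above list as required.
REQUIRED_ENV = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "azure": ["AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"],
    "groq": ["GROQ_API_KEY"],
    "watsonx": ["WATSONX_URL", "WATSONX_APIKEY", "WATSONX_PROJECT_ID"],
}


def missing_env_vars(model: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return required variables that are unset or empty for a provider/model_name string."""
    env = os.environ if env is None else env
    # The provider is the part before the first slash, e.g. "azure" in "azure/gpt-4".
    provider = model.split("/", 1)[0]
    return [name for name in REQUIRED_ENV.get(provider, []) if not env.get(name)]


if __name__ == "__main__":
    # Only one of the three Azure variables is set in this example environment.
    print(missing_env_vars("azure/gpt-4", env={"AZURE_API_KEY": "example"}))
```

Providers missing from the table, such as a local Ollama model, yield an empty list, so the check only reports variables it knows about.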
+
+## Structured LLM Calls
+
+CrewAI supports structured responses from LLM calls by allowing you to define a `response_format` using a Pydantic model. This enables the framework to automatically parse and validate the output, making it easier to integrate the response into your application without manual post-processing.
+
+For example, you can define a Pydantic model to represent the expected response structure and pass it as the `response_format` when instantiating the LLM. The model will then be used to convert the LLM output into a structured Python object.
+
+```python Code
+from crewai import LLM
+from pydantic import BaseModel
+
+class Dog(BaseModel):
+    name: str
+    age: int
+    breed: str
+
+llm = LLM(model="gpt-4o", response_format=Dog)
+
+response = llm.call(
+    "Analyze the following messages and return the name, age, and breed. "
+    "Meet Kona! She is 3 years old and is a black german shepherd."
+)
+print(response)
+
+# Output:
+# Dog(name='Kona', age=3, breed='black german shepherd')
+```
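
To illustrate what parsing and validating a response against a schema involves, here is a sketch that uses only the standard library. It does not use Pydantic or CrewAI and does not show how the framework implements `response_format`. `parse_dog` is a hypothetical helper, and the JSON string stands in for model output.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Dog:
    name: str
    age: int
    breed: str


def parse_dog(raw_json: str) -> Dog:
    """Parse a JSON object and check each Dog field is present with the declared type."""
    data = json.loads(raw_json)
    for field in fields(Dog):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        if not isinstance(data[field.name], field.type):
            raise ValueError(f"{field.name} must be {field.type.__name__}")
    # Build the object from the declared fields only, ignoring any extra keys.
    return Dog(**{field.name: data[field.name] for field in fields(Dog)})


if __name__ == "__main__":
    # Stand-in for model output; no LLM call is made here.
    dog = parse_dog('{"name": "Kona", "age": 3, "breed": "black german shepherd"}')
    print(dog)

    try:
        parse_dog('{"name": "Kona", "age": "three", "breed": "black german shepherd"}')
    except ValueError as exc:
        print(f"rejected: {exc}")
```

A Pydantic model performs this kind of check with much richer type coercion and error reporting, which is why the documented example passes a `BaseModel` subclass rather than hand-written validation.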
+
 ## Advanced Features and Optimization

 Learn how to get the most out of your LLM configuration:
@@ -411,277 +635,6 @@ Learn how to get the most out of your LLM configuration:
  </Accordion>
 </AccordionGroup>

-## Provider Configuration Examples
-
-<AccordionGroup>
-  <Accordion title="OpenAI">
-    ```python Code
-    # Required
-    OPENAI_API_KEY=sk-...
-    
-    # Optional
-    OPENAI_API_BASE=<custom-base-url>
-    OPENAI_ORGANIZATION=<your-org-id>
-    ```
-
-    Example usage:
-    ```python Code
-    from crewai import LLM
-
-    llm = LLM(
-        model="gpt-4",
-        temperature=0.8,
-        max_tokens=150,
-        top_p=0.9,
-        frequency_penalty=0.1,
-        presence_penalty=0.1,
-        stop=["END"],
-        seed=42
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Anthropic">
-    ```python Code
-    ANTHROPIC_API_KEY=sk-ant-...
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="anthropic/claude-3-sonnet-20240229-v1:0",
-        temperature=0.7
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Google">
-    ```python Code
-    # Option 1. Gemini accessed with an API key.
-    # https://ai.google.dev/gemini-api/docs/api-key
-    GEMINI_API_KEY=<your-api-key>
-
-    # Option 2. Vertex AI IAM credentials for Gemini, Anthropic, and anything in the Model Garden.
-    # https://cloud.google.com/vertex-ai/generative-ai/docs/overview
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="gemini/gemini-1.5-pro-latest",
-        temperature=0.7
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Azure">
-    ```python Code
-    # Required
-    AZURE_API_KEY=<your-api-key>
-    AZURE_API_BASE=<your-resource-url>
-    AZURE_API_VERSION=<api-version>
-    
-    # Optional
-    AZURE_AD_TOKEN=<your-azure-ad-token>
-    AZURE_API_TYPE=<your-azure-api-type>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="azure/gpt-4",
-        api_version="2023-05-15"
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="AWS Bedrock">
-    ```python Code
-    AWS_ACCESS_KEY_ID=<your-access-key>
-    AWS_SECRET_ACCESS_KEY=<your-secret-key>
-    AWS_DEFAULT_REGION=<your-region>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0"
-    )
-    ```
-  </Accordion>
-  
-  <Accordion title="Amazon SageMaker">
-    ```python Code
-    AWS_ACCESS_KEY_ID=<your-access-key>
-    AWS_SECRET_ACCESS_KEY=<your-secret-key>
-    AWS_DEFAULT_REGION=<your-region>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="sagemaker/<my-endpoint>"
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Mistral">
-    ```python Code
-    MISTRAL_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="mistral/mistral-large-latest",
-        temperature=0.7
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Nvidia NIM">
-    ```python Code
-    NVIDIA_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="nvidia_nim/meta/llama3-70b-instruct",
-        temperature=0.7
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Groq">
-    ```python Code
-    GROQ_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="groq/llama-3.2-90b-text-preview",
-        temperature=0.7
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="IBM watsonx.ai">
-    ```python Code
-    # Required
-    WATSONX_URL=<your-url>
-    WATSONX_APIKEY=<your-apikey>
-    WATSONX_PROJECT_ID=<your-project-id>
-    
-    # Optional
-    WATSONX_TOKEN=<your-token>
-    WATSONX_DEPLOYMENT_SPACE_ID=<your-space-id>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="watsonx/meta-llama/llama-3-1-70b-instruct",
-        base_url="https://api.watsonx.ai/v1"
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Ollama (Local LLMs)">
-    1. Install Ollama: [ollama.ai](https://ollama.ai/)
-    2. Run a model: `ollama run llama2`
-    3. Configure:
-
-    ```python Code
-    llm = LLM(
-        model="ollama/llama3:70b",
-        base_url="http://localhost:11434"
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Fireworks AI">
-    ```python Code
-    FIREWORKS_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct",
-        temperature=0.7
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Perplexity AI">
-    ```python Code
-    PERPLEXITY_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="llama-3.1-sonar-large-128k-online",
-        base_url="https://api.perplexity.ai/"
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Hugging Face">
-    ```python Code
-    HUGGINGFACE_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct",
-        base_url="your_api_endpoint"
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="SambaNova">
-    ```python Code
-    SAMBANOVA_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="sambanova/Meta-Llama-3.1-8B-Instruct",
-        temperature=0.7
-    )
-    ```
-  </Accordion>
-
-  <Accordion title="Cerebras">
-    ```python Code
-    # Required
-    CEREBRAS_API_KEY=<your-api-key>
-    ```
-
-    Example usage:
-    ```python Code
-    llm = LLM(
-        model="cerebras/llama3.1-70b",
-        temperature=0.7,
-        max_tokens=8192
-    )
-    ```
-
-    <Info>
-      Cerebras features:
-      - Fast inference speeds
-      - Competitive pricing
-      - Good balance of speed and quality
-      - Support for long context windows
-    </Info>
-  </Accordion>
-</AccordionGroup>
-
 ## Common Issues and Solutions

 <Tabs>
--- a/docs/concepts/memory.mdx
+++ b/docs/concepts/memory.mdx
@@ -58,41 +58,107 @@ my_crew = Crew(
 ### Example: Use Custom Memory Instances e.g FAISS as the VectorDB

 ```python Code
-from crewai import Crew, Agent, Task, Process
+from crewai import Crew, Process
+from crewai.memory import LongTermMemory, ShortTermMemory, EntityMemory
+from crewai.memory.storage import LTMSQLiteStorage, RAGStorage
+from typing import List, Optional

 # Assemble your crew with memory capabilities
-my_crew = Crew(
-    agents=[...],
-    tasks=[...],
-    process="Process.sequential",
-    memory=True,
-    long_term_memory=EnhanceLongTermMemory(
+my_crew: Crew = Crew(
+    agents = [...],
+    tasks = [...],
+    process = Process.sequential,
+    memory = True,
+    # Long-term memory for persistent storage across sessions
+    long_term_memory = LongTermMemory(
        storage=LTMSQLiteStorage(
-            db_path="/my_data_dir/my_crew1/long_term_memory_storage.db"
+            db_path="/my_crew1/long_term_memory_storage.db"
        )
    ),
-    short_term_memory=EnhanceShortTermMemory(
-        storage=CustomRAGStorage(
-            crew_name="my_crew",
-            storage_type="short_term",
-            data_dir="//my_data_dir",
-            model=embedder["model"],
-            dimension=embedder["dimension"],
+    # Short-term memory for current context using RAG
+    short_term_memory = ShortTermMemory(
+        storage = RAGStorage(
+            embedder_config={
+                "provider": "openai",
+                "config": {
+                    "model": 'text-embedding-3-small'
+                }
+            },
+            type="short_term",
+            # Directory where this store is persisted
+            path="/my_crew1/"
+        )
    ),
-    entity_memory=EnhanceEntityMemory(
-        storage=CustomRAGStorage(
-            crew_name="my_crew",
-            storage_type="entities",
-            data_dir="//my_data_dir",
-            model=embedder["model"],
-            dimension=embedder["dimension"],
-        ),
+    # Entity memory for tracking key information about entities
+    entity_memory = EntityMemory(
+        storage=RAGStorage(
+            embedder_config={
+                "provider": "openai",
+                "config": {
+                    "model": 'text-embedding-3-small'
+                }
+            },
+            type="entities",
+            path="/my_crew1/"
+        )
    ),
    verbose=True,
 )
 ```

+## Security Considerations
+
+When configuring memory storage:
+- Use environment variables for storage paths (e.g., `CREWAI_STORAGE_DIR`)
+- Never hardcode sensitive information like database credentials
+- Consider access permissions for storage directories
+- Use relative paths when possible to maintain portability
+
+Example using environment variables:
+```python
+import os
+from crewai import Crew
+from crewai.memory import LongTermMemory
+from crewai.memory.storage import LTMSQLiteStorage
+
+# Configure storage path using environment variable
+storage_path = os.getenv("CREWAI_STORAGE_DIR", "./storage")
+crew = Crew(
+    memory=True,
+    long_term_memory=LongTermMemory(
+        storage=LTMSQLiteStorage(
+            db_path=f"{storage_path}/memory.db"
+        )
+    )
+)
+```
+
+## Configuration Examples
+
+### Basic Memory Configuration
+```python
+from crewai import Crew
+from crewai.memory import LongTermMemory
+
+# Simple memory configuration
+crew = Crew(memory=True)  # Uses default storage locations
+```
+
+### Custom Storage Configuration
+```python
+from crewai import Crew
+from crewai.memory import LongTermMemory
+from crewai.memory.storage import LTMSQLiteStorage
+
+# Configure custom storage paths
+crew = Crew(
+    memory=True,
+    long_term_memory=LongTermMemory(
+        storage=LTMSQLiteStorage(db_path="./memory.db")
+    )
+)
+```
+
 ## Integrating Mem0 for Enhanced User Memory

 [Mem0](https://mem0.ai/) is a self-improving memory layer for LLM applications, enabling personalized AI experiences. 
@@ -185,7 +251,12 @@ my_crew = Crew(
    process=Process.sequential,
    memory=True,
    verbose=True,
-    embedder=OpenAIEmbeddingFunction(api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"),
+    embedder={
+        "provider": "openai",
+        "config": {
+            "model": 'text-embedding-3-small'
+        }
+    }
 )
 ```

@@ -211,6 +282,19 @@ my_crew = Crew(

 ### Using Google AI embeddings

+#### Prerequisites
+Before using Google AI embeddings, ensure you have:
+- Access to the Gemini API
+- The necessary API keys and permissions
+
+You will need to update your *pyproject.toml* dependencies:
+```toml
+dependencies = [
+    "google-generativeai>=0.8.4",  # main release as of January 2025, alongside crewai 0.100.0 and crewai-tools 0.33.0
+    "crewai[tools]>=0.100.0,<1.0.0"
+]
+```
+
 ```python Code
 from crewai import Crew, Agent, Task, Process

@@ -224,7 +308,7 @@ my_crew = Crew(
        "provider": "google",
        "config": {
            "api_key": "<YOUR_API_KEY>",
-            "model_name": "<model_name>"
+            "model": "<model_name>"
        }
    }
 )
@@ -242,13 +326,15 @@ my_crew = Crew(
    process=Process.sequential,
    memory=True,
    verbose=True,
-    embedder=OpenAIEmbeddingFunction(
-        api_key="YOUR_API_KEY",
-        api_base="YOUR_API_BASE_PATH",
-        api_type="azure",
-        api_version="YOUR_API_VERSION",
-        model_name="text-embedding-3-small"
-    )
+    embedder={
+        "provider": "openai",
+        "config": {
+            "api_key": "YOUR_API_KEY",
+            "api_base": "YOUR_API_BASE_PATH",
+            "api_version": "YOUR_API_VERSION",
+            "model_name": 'text-embedding-3-small'
+        }
+    }
 )
 ```

@@ -264,12 +350,15 @@ my_crew = Crew(
    process=Process.sequential,
    memory=True,
    verbose=True,
-    embedder=GoogleVertexEmbeddingFunction(
-        project_id="YOUR_PROJECT_ID",
-        region="YOUR_REGION",
-        api_key="YOUR_API_KEY",
-        model_name="textembedding-gecko"
-    )
+    embedder={
+        "provider": "vertexai",
+        "config": {
+            "project_id": "YOUR_PROJECT_ID",
+            "region": "YOUR_REGION",
+            "api_key": "YOUR_API_KEY",
+            "model_name": "textembedding-gecko"
+        }
+    }
 )
 ```

@@ -288,7 +377,7 @@ my_crew = Crew(
        "provider": "cohere",
        "config": {
            "api_key": "YOUR_API_KEY",
-            "model_name": "<model_name>"
+            "model": "<model_name>"
        }
    }
 )
@@ -308,7 +397,7 @@ my_crew = Crew(
        "provider": "voyageai",
        "config": {
            "api_key": "YOUR_API_KEY",
-            "model_name": "<model_name>"
+            "model": "<model_name>"
        }
    }
 )
@@ -358,6 +447,65 @@ my_crew = Crew(
 )
 ```

+### Using Amazon Bedrock embeddings
+
+```python Code
+# Note: Ensure you have installed `boto3` for Bedrock embeddings to work.
+
+import os
+import boto3
+from crewai import Crew, Agent, Task, Process
+
+boto3_session = boto3.Session(
+    region_name=os.environ.get("AWS_REGION_NAME"),
+    aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID"),
+    aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY")
+)
+
+my_crew = Crew(
+    agents=[...],
+    tasks=[...],
+    process=Process.sequential,
+    memory=True,
+    embedder={
+        "provider": "bedrock",
+        "config": {
+            "session": boto3_session,
+            "model": "amazon.titan-embed-text-v2:0",
+            "vector_dimension": 1024
+        }
+    },
+    verbose=True
+)
+```
+
+### Adding Custom Embedding Function
+
+```python Code
+from crewai import Crew, Agent, Task, Process
+from chromadb import Documents, EmbeddingFunction, Embeddings
+
+# Create a custom embedding function
+class CustomEmbedder(EmbeddingFunction):
+    def __call__(self, input: Documents) -> Embeddings:
+        # generate embeddings
+        return [[1.0, 2.0, 3.0] for _ in input]  # dummy embedding: one vector per document
+
+my_crew = Crew(
+    agents=[...],
+    tasks=[...],
+    process=Process.sequential,
+    memory=True,
+    verbose=True,
+    embedder={
+        "provider": "custom",
+        "config": {
+            "embedder": CustomEmbedder()
+        }
+    }
+)
+```
+
 ### Resetting Memory

 ```shell
--- a/docs/concepts/planning.mdx
+++ b/docs/concepts/planning.mdx
@@ -81,8 +81,8 @@ my_crew.kickoff()

 3. **Collect Data:**

-   - Search for the latest papers, articles, and reports published in 2023 and early 2024.
-   - Use keywords like "Large Language Models 2024", "AI LLM advancements", "AI ethics 2024", etc.
+   - Search for the latest papers, articles, and reports published in 2024 and early 2025.
+   - Use keywords like "Large Language Models 2025", "AI LLM advancements", "AI ethics 2025", etc.

 4. **Analyze Findings:**

--- a/docs/concepts/tasks.mdx
+++ b/docs/concepts/tasks.mdx
@@ -33,11 +33,12 @@ crew = Crew(
 | :------------------------------- | :---------------- | :---------------------------- | :------------------------------------------------------------------------------------------------------------------- |
 | **Description**                  | `description`     | `str`                         | A clear, concise statement of what the task entails.                                                                 |
 | **Expected Output**              | `expected_output` | `str`                         | A detailed description of what the task's completion looks like.                                                     |
-| **Name** _(optional)_           | `name`           | `Optional[str]`               | A name identifier for the task.                                                                                      |
-| **Agent** _(optional)_          | `agent`           | `Optional[BaseAgent]`         | The agent responsible for executing the task.                                                                        |
-| **Tools** _(optional)_          | `tools`           | `List[BaseTool]`             | The tools/resources the agent is limited to use for this task.                                                       |
+| **Name** _(optional)_            | `name`            | `Optional[str]`               | A name identifier for the task.                                                                                      |
+| **Agent** _(optional)_           | `agent`           | `Optional[BaseAgent]`         | The agent responsible for executing the task.                                                                        |
+| **Tools** _(optional)_           | `tools`           | `List[BaseTool]`              | The tools/resources the agent is limited to use for this task.                                                       |
 | **Context** _(optional)_         | `context`         | `Optional[List["Task"]]`      | Other tasks whose outputs will be used as context for this task.                                                     |
 | **Async Execution** _(optional)_ | `async_execution` | `Optional[bool]`              | Whether the task should be executed asynchronously. Defaults to False.                                               |
+| **Human Input** _(optional)_     | `human_input`     | `Optional[bool]`              | Whether the task should have a human review the final answer of the agent. Defaults to False.                        |
 | **Config** _(optional)_          | `config`          | `Optional[Dict[str, Any]]`    | Task-specific configuration parameters.                                                                              |
 | **Output File** _(optional)_     | `output_file`     | `Optional[str]`               | File path for storing the task output.                                                                               |
 | **Output JSON** _(optional)_     | `output_json`     | `Optional[Type[BaseModel]]`   | A Pydantic model to structure the JSON output.                                                                       |
@@ -68,7 +69,7 @@ research_task:
  description: >
    Conduct a thorough research about {topic}
    Make sure you find any interesting and relevant information given
-    the current year is 2024.
+    the current year is 2025.
  expected_output: >
    A list with 10 bullet points of the most relevant information about {topic}
  agent: researcher
@@ -154,7 +155,7 @@ research_task = Task(
    description="""
        Conduct a thorough research about AI Agents.
        Make sure you find any interesting and relevant information given
-        the current year is 2024.
+        the current year is 2025.
    """,
    expected_output="""
        A list with 10 bullet points of the most relevant information about AI Agents
@@ -267,7 +268,7 @@ analysis_task = Task(

 Task guardrails provide a way to validate and transform task outputs before they
 are passed to the next task. This feature helps ensure data quality and provides
-efeedback to agents when their output doesn't meet specific criteria.
+feedback to agents when their output doesn't meet specific criteria.

 ### Using Task Guardrails

--- a/docs/how-to/human-input-on-execution.mdx
+++ b/docs/how-to/human-input-on-execution.mdx
@@ -60,12 +60,12 @@ writer = Agent(
 # Create tasks for your agents
 task1 = Task(
    description=(
-        "Conduct a comprehensive analysis of the latest advancements in AI in 2024. "
+        "Conduct a comprehensive analysis of the latest advancements in AI in 2025. "
        "Identify key trends, breakthrough technologies, and potential industry impacts. "
        "Compile your findings in a detailed report. "
        "Make sure to check with a human if the draft is good before finalizing your answer."
    ),
-    expected_output='A comprehensive full report on the latest AI advancements in 2024, leave nothing out',
+    expected_output='A comprehensive full report on the latest AI advancements in 2025, leave nothing out',
    agent=researcher,
    human_input=True
 )
@@ -76,7 +76,7 @@ task2 = Task(
        "Your post should be informative yet accessible, catering to a tech-savvy audience. "
        "Aim for a narrative that captures the essence of these breakthroughs and their implications for the future."
    ),
-    expected_output='A compelling 3 paragraphs blog post formatted as markdown about the latest AI advancements in 2024',
+    expected_output='A compelling 3 paragraphs blog post formatted as markdown about the latest AI advancements in 2025',
    agent=writer,
    human_input=True
 )
--- a/docs/how-to/langfuse-observability.mdx
+++ b/docs/how-to/langfuse-observability.mdx
@@ -0,0 +1,100 @@
+---
+title: Agent Monitoring with Langfuse
+description: Learn how to integrate Langfuse with CrewAI via OpenTelemetry using OpenLit
+icon: magnifying-glass-chart
+---
+
+# Integrate Langfuse with CrewAI
+
+This notebook demonstrates how to integrate **Langfuse** with **CrewAI** using OpenTelemetry via the **OpenLit** SDK. By the end of this notebook, you will be able to trace your CrewAI applications with Langfuse for improved observability and debugging.
+
+> **What is Langfuse?** [Langfuse](https://langfuse.com) is an open-source LLM engineering platform. It provides tracing and monitoring capabilities for LLM applications, helping developers debug, analyze, and optimize their AI systems. Langfuse integrates with various tools and frameworks via native integrations, OpenTelemetry, and APIs/SDKs.
+
+[![Langfuse Overview Video](https://github.com/user-attachments/assets/3926b288-ff61-4b95-8aa1-45d041c70866)](https://langfuse.com/watch-demo)
+
+## Get Started
+
+We'll walk through a simple example of using CrewAI and integrating it with Langfuse via OpenTelemetry using OpenLit.
+
+### Step 1: Install Dependencies
+
+
+```python
+%pip install langfuse openlit crewai crewai_tools
+```
+
+### Step 2: Set Up Environment Variables
+
+Set your Langfuse API keys and configure OpenTelemetry export settings to send traces to Langfuse. Please refer to the [Langfuse OpenTelemetry Docs](https://langfuse.com/docs/opentelemetry/get-started) for more information on the Langfuse OpenTelemetry endpoint `/api/public/otel` and authentication.
+
+
+```python
+import os
+import base64
+
+LANGFUSE_PUBLIC_KEY="pk-lf-..."
+LANGFUSE_SECRET_KEY="sk-lf-..."
+LANGFUSE_AUTH=base64.b64encode(f"{LANGFUSE_PUBLIC_KEY}:{LANGFUSE_SECRET_KEY}".encode()).decode()
+
+os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel" # EU data region
+# os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://us.cloud.langfuse.com/api/public/otel" # US data region
+os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"
+
+# your openai key
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+### Step 3: Initialize OpenLit
+
+Initialize the OpenLit OpenTelemetry instrumentation SDK to start capturing OpenTelemetry traces.
+
+
+```python
+import openlit
+
+openlit.init()
+```
+
+### Step 4: Create a Simple CrewAI Application
+
+We'll create a simple CrewAI application where multiple agents collaborate to answer a user's question.
+
+
+```python
+from crewai import Agent, Task, Crew
+from crewai_tools import WebsiteSearchTool
+
+web_rag_tool = WebsiteSearchTool()
+
+writer = Agent(
+    role="Writer",
+    goal="You make math engaging and understandable for young children through poetry",
+    backstory="You're an expert in writing haikus but you know nothing of math.",
+    tools=[web_rag_tool],
+)
+
+task = Task(
+    description="What is {multiplication}?",
+    expected_output="Compose a haiku that includes the answer.",
+    agent=writer,
+)
+
+# Assemble the crew with the single writer agent and its task
+crew = Crew(agents=[writer], tasks=[task], share_crew=False)
+
+# Run the crew; the input value is only an example and fills {multiplication}
+result = crew.kickoff(inputs={"multiplication": "3 * 3"})
+print(result)
+```
+
+### Step 5: See Traces in Langfuse
+
+After running the agent, you can view the traces generated by your CrewAI application in [Langfuse](https://cloud.langfuse.com). You should see detailed steps of the LLM interactions, which can help you debug and optimize your AI agent.
+
+![CrewAI example trace in Langfuse](https://langfuse.com/images/cookbook/integration_crewai/crewai-example-trace.png)
+
+_[Public example trace in Langfuse](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/e2cf380ffc8d47d28da98f136140642b?timestamp=2025-02-05T15%3A12%3A02.717Z&observation=3b32338ee6a5d9af)_
+
+## References
+
+- [Langfuse OpenTelemetry Docs](https://langfuse.com/docs/opentelemetry/get-started)
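Note on Step 2 of the Langfuse guide above: the `OTEL_EXPORTER_OTLP_HEADERS` value is an HTTP Basic auth header built from the public and secret key. The sketch below isolates that construction so it can be checked without any network access. The helper name `langfuse_otlp_headers` and the placeholder keys are assumptions made for illustration; they are not part of Langfuse, OpenLit, or CrewAI.

```python
import base64


def langfuse_otlp_headers(public_key: str, secret_key: str) -> str:
    """Build the OTEL_EXPORTER_OTLP_HEADERS value (hypothetical helper)."""
    # HTTP Basic auth token is base64("<public_key>:<secret_key>")
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return f"Authorization=Basic {token}"


# Placeholder keys for illustration only, not real credentials
header = langfuse_otlp_headers("pk-lf-example", "sk-lf-example")
print(header)
```

Decoding the token back should yield the original `public:secret` pair, which is a quick way to spot a swapped or truncated key before traces silently fail to arrive.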
--- a/docs/how-to/mlflow-observability.mdx
+++ b/docs/how-to/mlflow-observability.mdx
@@ -0,0 +1,206 @@
+---
+title: Agent Monitoring with MLflow
+description: Quickly start monitoring your Agents with MLflow.
+icon: bars-staggered
+---
+
+# MLflow Overview
+
+[MLflow](https://mlflow.org/) is an open-source platform to assist machine learning practitioners and teams in handling the complexities of the machine learning process.
+
+It provides a tracing feature that enhances LLM observability in your Generative AI applications by capturing detailed information about the execution of your application’s services. 
+Tracing provides a way to record the inputs, outputs, and metadata associated with each intermediate step of a request, enabling you to easily pinpoint the source of bugs and unexpected behaviors.
+
+![Overview of MLflow crewAI tracing usage](/images/mlflow-tracing.gif)
+
+### Features
+
+- **Tracing Dashboard**: Monitor activities of your crewAI agents with detailed dashboards that include inputs, outputs and metadata of spans.
+- **Automated Tracing**: A fully automated integration with crewAI, which can be enabled by running `mlflow.crewai.autolog()`. 
+- **Manual Trace Instrumentation with Minimal Effort**: Customize trace instrumentation through MLflow's high-level fluent APIs such as decorators, function wrappers and context managers.
+- **OpenTelemetry Compatibility**: MLflow Tracing supports exporting traces to an OpenTelemetry Collector, which can then be used to export traces to various backends such as Jaeger, Zipkin, and AWS X-Ray.
+- **Package and Deploy Agents**: Package and deploy your crewAI agents to an inference server with a variety of deployment targets.
+- **Securely Host LLMs**: Host multiple LLMs from various providers behind one unified endpoint through the MLflow gateway.
+- **Evaluation**: Evaluate your crewAI agents with a wide range of metrics using a convenient API `mlflow.evaluate()`.
+
+## Setup Instructions
+
+<Steps>
+    <Step title="Install MLflow package">
+      ```shell
+      # The crewAI integration is available in mlflow>=2.19.0
+      pip install mlflow
+      ```
+    </Step>
+    <Step title="Start MLflow tracking server">
+      ```shell
+      # This step is optional, but running an MLflow tracking server is recommended for better visualization and broader features.
+      mlflow server
+      ```
+    </Step>
+    <Step title="Initialize MLflow in Your Application">
+      Add the following two lines to your application code:
+
+      ```python
+      import mlflow
+
+      mlflow.crewai.autolog()
+
+      # Optional: Set a tracking URI and an experiment name if you have a tracking server
+      mlflow.set_tracking_uri("http://localhost:5000")
+      mlflow.set_experiment("CrewAI")
+      ```
+      
+      Example Usage for tracing CrewAI Agents:
+
+      ```python
+      from crewai import Agent, Crew, Task
+      from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
+      from crewai_tools import SerperDevTool, WebsiteSearchTool
+
+      from textwrap import dedent
+
+      content = "User's name is John. He is 30 years old and lives in San Francisco."
+      string_source = StringKnowledgeSource(
+          content=content, metadata={"preference": "personal"}
+      )
+
+      search_tool = WebsiteSearchTool()
+
+
+      class TripAgents:
+          def city_selection_agent(self):
+              return Agent(
+                  role="City Selection Expert",
+                  goal="Select the best city based on weather, season, and prices",
+                  backstory="An expert in analyzing travel data to pick ideal destinations",
+                  tools=[
+                      search_tool,
+                  ],
+                  verbose=True,
+              )
+
+          def local_expert(self):
+              return Agent(
+                  role="Local Expert at this city",
+                  goal="Provide the BEST insights about the selected city",
+                  backstory="""A knowledgeable local guide with extensive information
+              about the city, its attractions and customs""",
+                  tools=[search_tool],
+                  verbose=True,
+              )
+
+
+      class TripTasks:
+          def identify_task(self, agent, origin, cities, interests, range):
+              return Task(
+                  description=dedent(
+                      f"""
+                      Analyze and select the best city for the trip based
+                      on specific criteria such as weather patterns, seasonal
+                      events, and travel costs. This task involves comparing
+                      multiple cities, considering factors like current weather
+                      conditions, upcoming cultural or seasonal events, and
+                      overall travel expenses.
+                      Your final answer must be a detailed
+                      report on the chosen city, and everything you found out
+                      about it, including the actual flight costs, weather
+                      forecast and attractions.
+
+                      Traveling from: {origin}
+                      City Options: {cities}
+                      Trip Date: {range}
+                      Traveler Interests: {interests}
+                  """
+                  ),
+                  agent=agent,
+                  expected_output="Detailed report on the chosen city including flight costs, weather forecast, and attractions",
+              )
+
+          def gather_task(self, agent, origin, interests, range):
+              return Task(
+                  description=dedent(
+                      f"""
+                      As a local expert on this city you must compile an
+                      in-depth guide for someone traveling there and wanting
+                      to have THE BEST trip ever!
+                      Gather information about key attractions, local customs,
+                      special events, and daily activity recommendations.
+                      Find the best spots to go to, the kind of place only a
+                      local would know.
+                      This guide should provide a thorough overview of what
+                      the city has to offer, including hidden gems, cultural
+                      hotspots, must-visit landmarks, weather forecasts, and
+                      high level costs.
+                      The final answer must be a comprehensive city guide,
+                      rich in cultural insights and practical tips,
+                      tailored to enhance the travel experience.
+
+                      Trip Date: {range}
+                      Traveling from: {origin}
+                      Traveler Interests: {interests}
+                  """
+                  ),
+                  agent=agent,
+                  expected_output="Comprehensive city guide including hidden gems, cultural hotspots, and practical travel tips",
+              )
+
+
+      class TripCrew:
+          def __init__(self, origin, cities, date_range, interests):
+              self.cities = cities
+              self.origin = origin
+              self.interests = interests
+              self.date_range = date_range
+
+          def run(self):
+              agents = TripAgents()
+              tasks = TripTasks()
+
+              city_selector_agent = agents.city_selection_agent()
+              local_expert_agent = agents.local_expert()
+
+              identify_task = tasks.identify_task(
+                  city_selector_agent,
+                  self.origin,
+                  self.cities,
+                  self.interests,
+                  self.date_range,
+              )
+              gather_task = tasks.gather_task(
+                  local_expert_agent, self.origin, self.interests, self.date_range
+              )
+
+              crew = Crew(
+                  agents=[city_selector_agent, local_expert_agent],
+                  tasks=[identify_task, gather_task],
+                  verbose=True,
+                  memory=True,
+                  knowledge={
+                      "sources": [string_source],
+                      "metadata": {"preference": "personal"},
+                  },
+              )
+
+              result = crew.kickoff()
+              return result
+
+
+      trip_crew = TripCrew("California", "Tokyo", "Dec 12 - Dec 20", "sports")
+      result = trip_crew.run()
+
+      print(result)
+      ```
+      Refer to [MLflow Tracing Documentation](https://mlflow.org/docs/latest/llms/tracing/index.html) for more configurations and use cases.
+    </Step>
+    <Step title="Visualize Activities of Agents">
+      Traces for your crewAI agents are now captured by MLflow.
+      Visit the MLflow tracking server to view the traces and get insights into your agents.
+
+      Open `127.0.0.1:5000` in your browser to visit the MLflow tracking server.
+      <Frame caption="MLflow Tracing Dashboard">
+        <img src="/images/mlflow1.png" alt="MLflow tracing example with crewai" />
+      </Frame>
+    </Step>
+</Steps> 
+
--- a/docs/how-to/multimodal-agents.mdx
+++ b/docs/how-to/multimodal-agents.mdx
@@ -45,6 +45,7 @@ image_analyst = Agent(
 # Create a task for image analysis
 task = Task(
    description="Analyze the product image at https://example.com/product.jpg and provide a detailed description",
+    expected_output="A detailed description of the product image",
    agent=image_analyst
 )

@@ -81,6 +82,7 @@ inspection_task = Task(
    3. Compliance with standards
    Provide a detailed report highlighting any issues found.
    """,
+    expected_output="A detailed report highlighting any issues found",
    agent=expert_analyst
 )

--- a/docs/how-to/portkey-observability-and-guardrails.mdx
+++ b/docs/how-to/portkey-observability-and-guardrails.mdx
@@ -1,211 +0,0 @@
-# Portkey Integration with CrewAI
-<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-CrewAI.png" alt="Portkey CrewAI Header Image" width="70%" />
-
-
-[Portkey](https://portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) is a 2-line upgrade to make your CrewAI agents reliable, cost-efficient, and fast.
-
-Portkey adds 4 core production capabilities to any CrewAI agent:
-1. Routing to **200+ LLMs**
-2. Making each LLM call more robust
-3. Full-stack tracing & cost, performance analytics
-4. Real-time guardrails to enforce behavior
-
-
-
-
-
-## Getting Started
-
-1. **Install Required Packages:**
-
-```bash
-pip install -qU crewai portkey-ai
-```
-
-2. **Configure the LLM Client:**
-
-To build CrewAI Agents with Portkey, you'll need two keys:
- **Portkey API Key**: Sign up on the [Portkey app](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) and copy your API key
- **Virtual Key**: Virtual Keys securely manage your LLM API keys in one place. Store your LLM provider API keys securely in Portkey's vault
-
-```python
-from crewai import LLM
-from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
-
-gpt_llm = LLM(
-    model="gpt-4",
-    base_url=PORTKEY_GATEWAY_URL,
-    api_key="dummy", # We are using Virtual key
-    extra_headers=createHeaders(
-        api_key="YOUR_PORTKEY_API_KEY",
-        virtual_key="YOUR_VIRTUAL_KEY", # Enter your Virtual key from Portkey
-    )
-)
-```
-
-3. **Create and Run Your First Agent:**
-
-```python
-from crewai import Agent, Task, Crew
-
-# Define your agents with roles and goals
-coder = Agent(
-    role='Software developer',
-    goal='Write clear, concise code on demand',
-    backstory='An expert coder with a keen eye for software trends.',
-    llm=gpt_llm
-)
-
-# Create tasks for your agents
-task1 = Task(
-    description="Define the HTML for making a simple website with heading- Hello World! Portkey is working!",
-    expected_output="A clear and concise HTML code",
-    agent=coder
-)
-
-# Instantiate your crew
-crew = Crew(
-    agents=[coder],
-    tasks=[task1],
-)
-
-result = crew.kickoff()
-print(result)
-```
-
-
-## Key Features
-
-| Feature | Description |
-|---------|-------------|
-| 🌐 Multi-LLM Support | Access OpenAI, Anthropic, Gemini, Azure, and 250+ providers through a unified interface |
-| 🛡️ Production Reliability | Implement retries, timeouts, load balancing, and fallbacks |
-| 📊 Advanced Observability | Track 40+ metrics including costs, tokens, latency, and custom metadata |
-| 🔍 Comprehensive Logging | Debug with detailed execution traces and function call logs |
-| 🚧 Security Controls | Set budget limits and implement role-based access control |
-| 🔄 Performance Analytics | Capture and analyze feedback for continuous improvement |
-| 💾 Intelligent Caching | Reduce costs and latency with semantic or simple caching |
-
-
-## Production Features with Portkey Configs
-
-All features mentioned below are through Portkey's Config system. Portkey's Config system allows you to define routing strategies using simple JSON objects in your LLM API calls. You can create and manage Configs directly in your code or through the Portkey Dashboard. Each Config has a unique ID for easy reference.
-
-<Frame>
-    <img src="https://raw.githubusercontent.com/Portkey-AI/docs-core/refs/heads/main/images/libraries/libraries-3.avif"/>
-</Frame>
-
-
-### 1. Use 250+ LLMs
-Access various LLMs like Anthropic, Gemini, Mistral, Azure OpenAI, and more with minimal code changes. Switch between providers or use them together seamlessly. [Learn more about Universal API](https://portkey.ai/docs/product/ai-gateway/universal-api)
-
-
-Easily switch between different LLM providers:
-
-```python
-# Anthropic Configuration
-anthropic_llm = LLM(
-    model="claude-3-5-sonnet-latest",
-    base_url=PORTKEY_GATEWAY_URL,
-    api_key="dummy",
-    extra_headers=createHeaders(
-        api_key="YOUR_PORTKEY_API_KEY",
-        virtual_key="YOUR_ANTHROPIC_VIRTUAL_KEY", #You don't need provider when using Virtual keys
-        trace_id="anthropic_agent"
-    )
-)
-
-# Azure OpenAI Configuration
-azure_llm = LLM(
-    model="gpt-4",
-    base_url=PORTKEY_GATEWAY_URL,
-    api_key="dummy",
-    extra_headers=createHeaders(
-        api_key="YOUR_PORTKEY_API_KEY",
-        virtual_key="YOUR_AZURE_VIRTUAL_KEY", #You don't need provider when using Virtual keys
-        trace_id="azure_agent"
-    )
-)
-```
-
-
-### 2. Caching
-Improve response times and reduce costs with two powerful caching modes:
- **Simple Cache**: Perfect for exact matches
- **Semantic Cache**: Matches responses for requests that are semantically similar
-[Learn more about Caching](https://portkey.ai/docs/product/ai-gateway/cache-simple-and-semantic)
-
-```py
-config = {
-    "cache": {
-        "mode": "semantic",  # or "simple" for exact matching
-    }
-}
-```
-
-### 3. Production Reliability
-Portkey provides comprehensive reliability features:
- **Automatic Retries**: Handle temporary failures gracefully
- **Request Timeouts**: Prevent hanging operations
- **Conditional Routing**: Route requests based on specific conditions
- **Fallbacks**: Set up automatic provider failovers
- **Load Balancing**: Distribute requests efficiently
-
-[Learn more about Reliability Features](https://portkey.ai/docs/product/ai-gateway/)
-
-
-
-### 4. Metrics
-
-Agent runs are complex. Portkey automatically logs **40+ comprehensive metrics** for your AI agents, including cost, tokens used, latency, etc. Whether you need a broad overview or granular insights into your agent runs, Portkey's customizable filters provide the metrics you need.
-
-
- Cost per agent interaction
- Response times and latency
- Token usage and efficiency
- Success/failure rates
- Cache hit rates
-
-<img src="https://github.com/siddharthsambharia-portkey/Portkey-Product-Images/blob/main/Portkey-Dashboard.png?raw=true" width="70%" alt="Portkey Dashboard" />
-
-### 5. Detailed Logging
-Logs are essential for understanding agent behavior, diagnosing issues, and improving performance. They provide a detailed record of agent activities and tool use, which is crucial for debugging and optimizing processes.
-
-
-Access a dedicated section to view records of agent executions, including parameters, outcomes, function calls, and errors. Filter logs based on multiple parameters such as trace ID, model, tokens used, and metadata.
-
-<details>
-  <summary><b>Traces</b></summary>
-  <img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-Traces.png" alt="Portkey Traces" width="70%" />
-</details>
-
-<details>
-  <summary><b>Logs</b></summary>
-  <img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-Logs.png" alt="Portkey Logs" width="70%" />
-</details>
-
-### 6. Enterprise Security Features
- Set budget limit and rate limts per Virtual Key (disposable API keys)
- Implement role-based access control
- Track system changes with audit logs
- Configure data retention policies
-
-
-
-For detailed information on creating and managing Configs, visit the [Portkey documentation](https://docs.portkey.ai/product/ai-gateway/configs).
-
-## Resources
-
- [📘 Portkey Documentation](https://docs.portkey.ai)
- [📊 Portkey Dashboard](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai)
- [🐦 Twitter](https://twitter.com/portkeyai)
- [💬 Discord Community](https://discord.gg/DD7vgKK299)
-
-
-
-
-
-
-
-
-
--- a/docs/how-to/portkey-observability.mdx
+++ b/docs/how-to/portkey-observability.mdx
@@ -1,5 +1,5 @@
 ---
-title: Portkey Observability and Guardrails
+title: Agent Monitoring with Portkey
 description: How to use Portkey with CrewAI
 icon: key
 ---
--- a/docs/images/mlflow-tracing.gif
+++ b/docs/images/mlflow-tracing.gif
--- a/docs/images/mlflow1.png
+++ b/docs/images/mlflow1.png
--- a/docs/mint.json
+++ b/docs/mint.json
@@ -101,8 +101,10 @@
        "how-to/conditional-tasks",
        "how-to/agentops-observability",
        "how-to/langtrace-observability",
+        "how-to/mlflow-observability",
        "how-to/openlit-observability",
-        "how-to/portkey-observability"
+        "how-to/portkey-observability",
+        "how-to/langfuse-observability"
      ]
    },
    {
--- a/docs/quickstart.mdx
+++ b/docs/quickstart.mdx
@@ -58,7 +58,7 @@ Follow the steps below to get crewing! 🚣‍♂️
      description: >
        Conduct a thorough research about {topic}
        Make sure you find any interesting and relevant information given
-        the current year is 2024.
+        the current year is 2025.
      expected_output: >
        A list with 10 bullet points of the most relevant information about {topic}
      agent: researcher
@@ -195,10 +195,10 @@ Follow the steps below to get crewing! 🚣‍♂️

  <CodeGroup>
    ```markdown output/report.md
-    # Comprehensive Report on the Rise and Impact of AI Agents in 2024
+    # Comprehensive Report on the Rise and Impact of AI Agents in 2025

    ## 1. Introduction to AI Agents
-    In 2024, Artificial Intelligence (AI) agents are at the forefront of innovation across various industries. As intelligent systems that can perform tasks typically requiring human cognition, AI agents are paving the way for significant advancements in operational efficiency, decision-making, and overall productivity within sectors like Human Resources (HR) and Finance. This report aims to detail the rise of AI agents, their frameworks, applications, and potential implications on the workforce.
+    In 2025, Artificial Intelligence (AI) agents are at the forefront of innovation across various industries. As intelligent systems that can perform tasks typically requiring human cognition, AI agents are paving the way for significant advancements in operational efficiency, decision-making, and overall productivity within sectors like Human Resources (HR) and Finance. This report aims to detail the rise of AI agents, their frameworks, applications, and potential implications on the workforce.

    ## 2. Benefits of AI Agents
    AI agents bring numerous advantages that are transforming traditional work environments. Key benefits include:
@@ -252,7 +252,7 @@ Follow the steps below to get crewing! 🚣‍♂️
    To stay competitive and harness the full potential of AI agents, organizations must remain vigilant about latest developments in AI technology and consider continuous learning and adaptation in their strategic planning.

    ## 8. Conclusion
-    The emergence of AI agents is undeniably reshaping the workplace landscape in 2024. With their ability to automate tasks, enhance efficiency, and improve decision-making, AI agents are critical in driving operational success. Organizations must embrace and adapt to AI developments to thrive in an increasingly digital business environment.
+    The emergence of AI agents is undeniably reshaping the workplace landscape in 2025. With their ability to automate tasks, enhance efficiency, and improve decision-making, AI agents are critical in driving operational success. Organizations must embrace and adapt to AI developments to thrive in an increasingly digital business environment.
    ```
  </CodeGroup>
  </Step>
--- a/docs/tools/filewritetool.mdx
+++ b/docs/tools/filewritetool.mdx
@@ -8,9 +8,9 @@ icon: file-pen

 ## Description

-The `FileWriterTool` is a component of the crewai_tools package, designed to simplify the process of writing content to files. 
+The `FileWriterTool` is a component of the crewai_tools package, designed to simplify the process of writing content to files with cross-platform compatibility (Windows, Linux, macOS). 
 It is particularly useful in scenarios such as generating reports, saving logs, creating configuration files, and more. 
-This tool supports creating new directories if they don't exist, making it easier to organize your output.
+This tool handles path differences across operating systems, supports UTF-8 encoding, and automatically creates directories if they don't exist, making it easier to organize your output reliably across different platforms.

 ## Installation

@@ -43,6 +43,8 @@ print(result)

 ## Conclusion

-By integrating the `FileWriterTool` into your crews, the agents can execute the process of writing content to files and creating directories. 
-This tool is essential for tasks that require saving output data, creating structured file systems, and more. By adhering to the setup and usage guidelines provided, 
-incorporating this tool into projects is straightforward and efficient.
+By integrating the `FileWriterTool` into your crews, the agents can reliably write content to files across different operating systems. 
+This tool is essential for tasks that require saving output data, creating structured file systems, and handling cross-platform file operations. 
+It's particularly recommended for Windows users who may encounter file writing issues with standard Python file operations.
+
+By adhering to the setup and usage guidelines provided, incorporating this tool into projects is straightforward and ensures consistent file writing behavior across all platforms.
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -152,6 +152,7 @@ nav:
    - Agent Monitoring with AgentOps: 'how-to/AgentOps-Observability.md'
    - Agent Monitoring with LangTrace: 'how-to/Langtrace-Observability.md'
    - Agent Monitoring with OpenLIT: 'how-to/openlit-Observability.md'
+    - Agent Monitoring with MLflow: 'how-to/mlflow-Observability.md'
  - Tools Docs:
    - Browserbase Web Loader: 'tools/BrowserbaseLoadTool.md'
    - Code Docs RAG Search: 'tools/CodeDocsSearchTool.md'
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "crewai"
-version = "0.98.0"
+version = "0.102.0"
 description = "Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks."
 readme = "README.md"
 requires-python = ">=3.10,<3.13"
@@ -11,7 +11,7 @@ dependencies = [
    # Core Dependencies
    "pydantic>=2.4.2",
    "openai>=1.13.3",
-    "litellm==1.57.4",
+    "litellm==1.60.2",
    "instructor>=1.3.3",
    # Text Processing
    "pdfplumber>=0.11.4",
@@ -45,7 +45,7 @@ Documentation = "https://docs.crewai.com"
 Repository = "https://github.com/crewAIInc/crewAI"

 [project.optional-dependencies]
-tools = ["crewai-tools>=0.32.1"]
+tools = ["crewai-tools>=0.36.0"]
 embeddings = [
    "tiktoken~=0.7.0"
 ]
--- a/src/crewai/__init__.py
+++ b/src/crewai/__init__.py
@@ -14,7 +14,7 @@ warnings.filterwarnings(
    category=UserWarning,
    module="pydantic.main",
 )
-__version__ = "0.98.0"
+__version__ = "0.102.0"
 __all__ = [
    "Agent",
    "Crew",
--- a/src/crewai/agent.py
+++ b/src/crewai/agent.py
@@ -1,6 +1,7 @@
+import re
 import shutil
 import subprocess
-from typing import Any, Dict, List, Literal, Optional, Union
+from typing import Any, Dict, List, Literal, Optional, Sequence, Union

 from pydantic import Field, InstanceOf, PrivateAttr, model_validator

@@ -15,29 +16,20 @@ from crewai.memory.contextual.contextual_memory import ContextualMemory
 from crewai.task import Task
 from crewai.tools import BaseTool
 from crewai.tools.agent_tools.agent_tools import AgentTools
-from crewai.tools.base_tool import Tool
 from crewai.utilities import Converter, Prompts
 from crewai.utilities.constants import TRAINED_AGENTS_DATA_FILE, TRAINING_DATA_FILE
 from crewai.utilities.converter import generate_model_description
+from crewai.utilities.events.agent_events import (
+    AgentExecutionCompletedEvent,
+    AgentExecutionErrorEvent,
+    AgentExecutionStartedEvent,
+)
+from crewai.utilities.events.crewai_event_bus import crewai_event_bus
 from crewai.utilities.llm_utils import create_llm
 from crewai.utilities.token_counter_callback import TokenCalcHandler
 from crewai.utilities.training_handler import CrewTrainingHandler

-agentops = None

-try:
-    import agentops  # type: ignore # Name "agentops" is already defined
-    from agentops import track_agent  # type: ignore
-except ImportError:
-
-    def track_agent():
-        def noop(f):
-            return f
-
-        return noop
-
-
-@track_agent()
 class Agent(BaseAgent):
    """Represents an agent in a system.

@@ -54,13 +46,13 @@ class Agent(BaseAgent):
            llm: The language model that will run the agent.
            function_calling_llm: The language model that will handle the tool calling for this agent, it overrides the crew function_calling_llm.
            max_iter: Maximum number of iterations for an agent to execute a task.
-            memory: Whether the agent should have memory or not.
            max_rpm: Maximum number of requests per minute for the agent execution to be respected.
            verbose: Whether the agent execution should be in verbose mode.
            allow_delegation: Whether the agent is allowed to delegate tasks to other agents.
            tools: Tools at agents disposal
            step_callback: Callback to be executed after each step of the agent execution.
            knowledge_sources: Knowledge sources for the agent.
+            embedder: Embedder configuration for the agent.
    """

    _times_executed: int = PrivateAttr(default=0)
@@ -70,9 +62,6 @@ class Agent(BaseAgent):
    )
    agent_ops_agent_name: str = None  # type: ignore # Incompatible types in assignment (expression has type "None", variable has type "str")
    agent_ops_agent_id: str = None  # type: ignore # Incompatible types in assignment (expression has type "None", variable has type "str")
-    cache_handler: InstanceOf[CacheHandler] = Field(
-        default=None, description="An instance of the CacheHandler class."
-    )
    step_callback: Optional[Any] = Field(
        default=None,
        description="Callback to be executed after each step of the agent execution.",
@@ -106,10 +95,6 @@ class Agent(BaseAgent):
        default=True,
        description="Keep messages under the context window size by summarizing content.",
    )
-    max_iter: int = Field(
-        default=20,
-        description="Maximum number of iterations for an agent to execute a task before giving it's best answer",
-    )
    max_retry_limit: int = Field(
        default=2,
        description="Maximum number of retries for an agent to execute a task when an error occurs.",
@@ -122,17 +107,10 @@ class Agent(BaseAgent):
        default="safe",
        description="Mode for code execution: 'safe' (using Docker) or 'unsafe' (direct execution).",
    )
-    embedder_config: Optional[Dict[str, Any]] = Field(
+    embedder: Optional[Dict[str, Any]] = Field(
        default=None,
        description="Embedder configuration for the agent.",
    )
-    knowledge_sources: Optional[List[BaseKnowledgeSource]] = Field(
-        default=None,
-        description="Knowledge sources for the agent.",
-    )
-    _knowledge: Optional[Knowledge] = PrivateAttr(
-        default=None,
-    )

    @model_validator(mode="after")
    def post_init_setup(self):
@@ -159,14 +137,16 @@ class Agent(BaseAgent):
    def _set_knowledge(self):
        try:
            if self.knowledge_sources:
-                knowledge_agent_name = f"{self.role.replace(' ', '_')}"
+                full_pattern = re.compile(r"[^a-zA-Z0-9\-_\r\n]|(\.\.)")
+                knowledge_agent_name = f"{re.sub(full_pattern, '_', self.role)}"
                if isinstance(self.knowledge_sources, list) and all(
                    isinstance(k, BaseKnowledgeSource) for k in self.knowledge_sources
                ):
-                    self._knowledge = Knowledge(
+                    self.knowledge = Knowledge(
                        sources=self.knowledge_sources,
-                        embedder_config=self.embedder_config,
+                        embedder=self.embedder,
                        collection_name=knowledge_agent_name,
+                        storage=self.knowledge_storage or None,
                    )
        except (TypeError, ValueError) as e:
            raise ValueError(f"Invalid Knowledge Configuration: {str(e)}")
@@ -200,13 +180,15 @@ class Agent(BaseAgent):
            if task.output_json:
                # schema = json.dumps(task.output_json, indent=2)
                schema = generate_model_description(task.output_json)
+                task_prompt += "\n" + self.i18n.slice(
+                    "formatted_task_instructions"
+                ).format(output_format=schema)

            elif task.output_pydantic:
                schema = generate_model_description(task.output_pydantic)
-
-            task_prompt += "\n" + self.i18n.slice("formatted_task_instructions").format(
-                output_format=schema
-            )
+                task_prompt += "\n" + self.i18n.slice(
+                    "formatted_task_instructions"
+                ).format(output_format=schema)

        if context:
            task_prompt = self.i18n.slice("task_with_context").format(
@@ -225,8 +207,8 @@ class Agent(BaseAgent):
            if memory.strip() != "":
                task_prompt += self.i18n.slice("memory").format(memory=memory)

-        if self._knowledge:
-            agent_knowledge_snippets = self._knowledge.query([task.prompt()])
+        if self.knowledge:
+            agent_knowledge_snippets = self.knowledge.query([task.prompt()])
            if agent_knowledge_snippets:
                agent_knowledge_context = extract_knowledge_context(
                    agent_knowledge_snippets
@@ -250,6 +232,15 @@ class Agent(BaseAgent):
            task_prompt = self._use_trained_data(task_prompt=task_prompt)

        try:
+            crewai_event_bus.emit(
+                self,
+                event=AgentExecutionStartedEvent(
+                    agent=self,
+                    tools=self.tools,
+                    task_prompt=task_prompt,
+                    task=task,
+                ),
+            )
            result = self.agent_executor.invoke(
                {
                    "input": task_prompt,
@@ -261,9 +252,25 @@ class Agent(BaseAgent):
        except Exception as e:
            if e.__class__.__module__.startswith("litellm"):
                # Do not retry on litellm errors
+                crewai_event_bus.emit(
+                    self,
+                    event=AgentExecutionErrorEvent(
+                        agent=self,
+                        task=task,
+                        error=str(e),
+                    ),
+                )
                raise e
            self._times_executed += 1
            if self._times_executed > self.max_retry_limit:
+                crewai_event_bus.emit(
+                    self,
+                    event=AgentExecutionErrorEvent(
+                        agent=self,
+                        task=task,
+                        error=str(e),
+                    ),
+                )
                raise e
            result = self.execute_task(task, context, tools)

@@ -276,7 +283,10 @@ class Agent(BaseAgent):
        for tool_result in self.tools_results:  # type: ignore # Item "None" of "list[Any] | None" has no attribute "__iter__" (not iterable)
            if tool_result.get("result_as_answer", False):
                result = tool_result["result"]
-
+        crewai_event_bus.emit(
+            self,
+            event=AgentExecutionCompletedEvent(agent=self, task=task, output=result),
+        )
        return result

    def create_agent_executor(
@@ -334,14 +344,14 @@ class Agent(BaseAgent):
        tools = agent_tools.tools()
        return tools

-    def get_multimodal_tools(self) -> List[Tool]:
+    def get_multimodal_tools(self) -> Sequence[BaseTool]:
        from crewai.tools.agent_tools.add_image_tool import AddImageTool

        return [AddImageTool()]

    def get_code_execution_tools(self):
        try:
-            from crewai_tools import CodeInterpreterTool
+            from crewai_tools import CodeInterpreterTool  # type: ignore

            # Set the unsafe_mode based on the code_execution_mode attribute
            unsafe_mode = self.code_execution_mode == "unsafe"
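Review note on the `_set_knowledge` hunk above: the collection name is now derived from the agent role with a regex substitution instead of a plain space replacement. The snippet below is a minimal standalone sketch of that substitution, reusing the pattern from the diff; the helper name `sanitize_collection_name` is hypothetical and not part of CrewAI.

```python
import re

# Pattern copied from the diff: anything outside letters, digits, '-', '_',
# CR and LF is replaced, which also covers each '.' of a '..' sequence.
_UNSAFE_CHARS = re.compile(r"[^a-zA-Z0-9\-_\r\n]|(\.\.)")


def sanitize_collection_name(role: str) -> str:
    """Hypothetical helper mirroring the substitution used in the diff."""
    return _UNSAFE_CHARS.sub("_", role)


if __name__ == "__main__":
    for role in ("Senior Researcher", "QA/Lead (EU)", "../etc"):
        # Compare with the previous behaviour, which only replaced spaces.
        print(repr(role), "->", sanitize_collection_name(role), "| old:", role.replace(" ", "_"))
```

With the old `replace(' ', '_')` a role such as `../etc` passed through unchanged; with the new pattern every dot and slash becomes an underscore.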
--- a/src/crewai/agents/agent_builder/base_agent.py
+++ b/src/crewai/agents/agent_builder/base_agent.py
@@ -18,10 +18,12 @@ from pydantic_core import PydanticCustomError
 from crewai.agents.agent_builder.utilities.base_token_process import TokenProcess
 from crewai.agents.cache.cache_handler import CacheHandler
 from crewai.agents.tools_handler import ToolsHandler
-from crewai.tools import BaseTool
-from crewai.tools.base_tool import Tool
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
+from crewai.tools.base_tool import BaseTool, Tool
 from crewai.utilities import I18N, Logger, RPMController
 from crewai.utilities.config import process_config
+from crewai.utilities.converter import Converter

 T = TypeVar("T", bound="BaseAgent")

@@ -40,7 +42,7 @@ class BaseAgent(ABC, BaseModel):
        max_rpm (Optional[int]): Maximum number of requests per minute for the agent execution.
        allow_delegation (bool): Allow delegation of tasks to agents.
        tools (Optional[List[Any]]): Tools at the agent's disposal.
-        max_iter (Optional[int]): Maximum iterations for an agent to execute a task.
+        max_iter (int): Maximum iterations for an agent to execute a task.
        agent_executor (InstanceOf): An instance of the CrewAgentExecutor class.
        llm (Any): Language model that will run the agent.
        crew (Any): Crew to which the agent belongs.
@@ -48,6 +50,8 @@ class BaseAgent(ABC, BaseModel):
        cache_handler (InstanceOf[CacheHandler]): An instance of the CacheHandler class.
        tools_handler (InstanceOf[ToolsHandler]): An instance of the ToolsHandler class.
        max_tokens: Maximum number of tokens for the agent to generate in a response.
+        knowledge_sources: Knowledge sources for the agent.
+        knowledge_storage: Custom knowledge storage for the agent.


    Methods:
@@ -107,10 +111,10 @@ class BaseAgent(ABC, BaseModel):
        default=False,
        description="Enable agent to delegate and ask questions among each other.",
    )
-    tools: Optional[List[Any]] = Field(
+    tools: Optional[List[BaseTool]] = Field(
        default_factory=list, description="Tools at agents' disposal"
    )
-    max_iter: Optional[int] = Field(
+    max_iter: int = Field(
        default=25, description="Maximum iterations for an agent to execute a task"
    )
    agent_executor: InstanceOf = Field(
@@ -121,15 +125,27 @@ class BaseAgent(ABC, BaseModel):
    )
    crew: Any = Field(default=None, description="Crew to which the agent belongs.")
    i18n: I18N = Field(default=I18N(), description="Internationalization settings.")
-    cache_handler: InstanceOf[CacheHandler] = Field(
+    cache_handler: Optional[InstanceOf[CacheHandler]] = Field(
        default=None, description="An instance of the CacheHandler class."
    )
    tools_handler: InstanceOf[ToolsHandler] = Field(
-        default=None, description="An instance of the ToolsHandler class."
+        default_factory=ToolsHandler,
+        description="An instance of the ToolsHandler class.",
    )
    max_tokens: Optional[int] = Field(
        default=None, description="Maximum number of tokens for the agent's execution."
    )
+    knowledge: Optional[Knowledge] = Field(
+        default=None, description="Knowledge for the agent."
+    )
+    knowledge_sources: Optional[List[BaseKnowledgeSource]] = Field(
+        default=None,
+        description="Knowledge sources for the agent.",
+    )
+    knowledge_storage: Optional[Any] = Field(
+        default=None,
+        description="Custom knowledge storage for the agent.",
+    )

    @model_validator(mode="before")
    @classmethod
@@ -239,7 +255,7 @@ class BaseAgent(ABC, BaseModel):
    @abstractmethod
    def get_output_converter(
        self, llm: Any, text: str, model: type[BaseModel] | None, instructions: str
-    ):
+    ) -> Converter:
        """Get the converter class for the agent to create json/pydantic outputs."""
        pass

@@ -256,13 +272,44 @@ class BaseAgent(ABC, BaseModel):
            "tools_handler",
            "cache_handler",
            "llm",
+            "knowledge_sources",
+            "knowledge_storage",
+            "knowledge",
        }

-        # Copy llm and clear callbacks
+        # Copy llm
        existing_llm = shallow_copy(self.llm)
+        copied_knowledge = shallow_copy(self.knowledge)
+        copied_knowledge_storage = shallow_copy(self.knowledge_storage)
+        # Properly copy knowledge sources if they exist
+        existing_knowledge_sources = None
+        if self.knowledge_sources:
+            # Create a shared storage instance for all knowledge sources
+            shared_storage = (
+                self.knowledge_sources[0].storage if self.knowledge_sources else None
+            )
+
+            existing_knowledge_sources = []
+            for source in self.knowledge_sources:
+                copied_source = (
+                    source.model_copy()
+                    if hasattr(source, "model_copy")
+                    else shallow_copy(source)
+                )
+                # Ensure all copied sources use the same storage instance
+                copied_source.storage = shared_storage
+                existing_knowledge_sources.append(copied_source)
+
        copied_data = self.model_dump(exclude=exclude)
        copied_data = {k: v for k, v in copied_data.items() if v is not None}
-        copied_agent = type(self)(**copied_data, llm=existing_llm, tools=self.tools)
+        copied_agent = type(self)(
+            **copied_data,
+            llm=existing_llm,
+            tools=self.tools,
+            knowledge_sources=existing_knowledge_sources,
+            knowledge=copied_knowledge,
+            knowledge_storage=copied_knowledge_storage,
+        )

        return copied_agent

--- a/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
+++ b/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
@@ -95,18 +95,34 @@ class CrewAgentExecutorMixin:
                pass

    def _ask_human_input(self, final_answer: str) -> str:
-        """Prompt human input for final decision making."""
+        """Prompt human input with mode-appropriate messaging."""
        self._printer.print(
            content=f"\033[1m\033[95m ## Final Result:\033[00m \033[92m{final_answer}\033[00m"
        )

-        self._printer.print(
-            content=(
+        # Training mode prompt (single iteration)
+        if self.crew and getattr(self.crew, "_train", False):
+            prompt = (
                "\n\n=====\n"
-                "## Please provide feedback on the Final Result and the Agent's actions. "
-                "Respond with 'looks good' or a similar phrase when you're satisfied.\n"
+                "## TRAINING MODE: Provide feedback to improve the agent's performance.\n"
+                "This will be used to train better versions of the agent.\n"
+                "Please provide detailed feedback about the result quality and reasoning process.\n"
                "=====\n"
-            ),
-            color="bold_yellow",
-        )
-        return input()
+            )
+        # Regular human-in-the-loop prompt (multiple iterations)
+        else:
+            prompt = (
+                "\n\n=====\n"
+                "## HUMAN FEEDBACK: Provide feedback on the Final Result and Agent's actions.\n"
+                "Please follow these guidelines:\n"
+                " - If you are happy with the result, simply hit Enter without typing anything.\n"
+                " - Otherwise, provide specific improvement requests.\n"
+                " - You can provide multiple rounds of feedback until satisfied.\n"
+                "=====\n"
+            )
+
+        self._printer.print(content=prompt, color="bold_yellow")
+        response = input()
+        if response.strip() != "":
+            self._printer.print(content="\nProcessing your feedback...", color="cyan")
+        return response
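Review note on the `_ask_human_input` hunk above: in the regular human-in-the-loop mode, an empty response (just pressing Enter) now signals satisfaction, and any other text is treated as a further improvement request. The following is a rough sketch of how a caller might loop on that convention. It is an assumption for illustration only: `collect_feedback`, `read_line`, and `max_rounds` are hypothetical names, and the real executor loop may differ.

```python
from typing import Callable, List


def collect_feedback(read_line: Callable[[], str], max_rounds: int = 5) -> List[str]:
    """Gather feedback rounds until the reviewer submits an empty line.

    `read_line` stands in for `input`; `max_rounds` is a hypothetical safety cap.
    """
    feedback: List[str] = []
    for _ in range(max_rounds):
        response = read_line()
        if response.strip() == "":
            # Empty or whitespace-only input means the reviewer is satisfied.
            break
        feedback.append(response)
    return feedback


if __name__ == "__main__":
    scripted = iter(["Make the summary shorter", "Add a source list", ""])
    print(collect_feedback(lambda: next(scripted)))
```

Passing the reader as a callable keeps the sketch testable without a terminal; in an interactive session `read_line` would simply be `input`.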
--- a/src/crewai/agents/agent_builder/utilities/base_output_converter.py
+++ b/src/crewai/agents/agent_builder/utilities/base_output_converter.py
@@ -31,11 +31,11 @@ class OutputConverter(BaseModel, ABC):
    )

    @abstractmethod
-    def to_pydantic(self, current_attempt=1):
+    def to_pydantic(self, current_attempt=1) -> BaseModel:
        """Convert text to pydantic."""
        pass

    @abstractmethod
-    def to_json(self, current_attempt=1):
+    def to_json(self, current_attempt=1) -> dict:
        """Convert text to json."""
        pass
--- a/src/crewai/agents/crew_agent_executor.py
+++ b/src/crewai/agents/crew_agent_executor.py
@@ -18,6 +18,12 @@ from crewai.tools.base_tool import BaseTool
 from crewai.tools.tool_usage import ToolUsage, ToolUsageErrorException
 from crewai.utilities import I18N, Printer
 from crewai.utilities.constants import MAX_LLM_RETRY, TRAINING_DATA_FILE
+from crewai.utilities.events import (
+    ToolUsageErrorEvent,
+    ToolUsageStartedEvent,
+    crewai_event_bus,
+)
+from crewai.utilities.events.tool_usage_events import ToolUsageStartedEvent
 from crewai.utilities.exceptions.context_window_exceeding_exception import (
    LLMContextLengthExceededException,
 )
@@ -100,12 +106,18 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):

        try:
            formatted_answer = self._invoke_loop()
+        except AssertionError:
+            self._printer.print(
+                content="Agent failed to reach a final answer. This is likely a bug - please report it.",
+                color="red",
+            )
+            raise
        except Exception as e:
+            self._handle_unknown_error(e)
            if e.__class__.__module__.startswith("litellm"):
                # Do not retry on litellm errors
                raise e
            else:
-                self._handle_unknown_error(e)
                raise e

        if self.ask_for_human_input:
@@ -115,7 +127,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
        self._create_long_term_memory(formatted_answer)
        return {"output": formatted_answer.output}

-    def _invoke_loop(self):
+    def _invoke_loop(self) -> AgentFinish:
        """
        Main loop to invoke the agent's thought process until it reaches a conclusion
        or the maximum number of iterations is reached.
@@ -161,6 +173,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
            finally:
                self.iterations += 1

+        # During the invoke loop, formatted_answer alternates between AgentAction
+        # (when the agent is using tools) and eventually becomes AgentFinish
+        # (when the agent reaches a final answer). This assertion confirms we've
+        # reached a final answer and helps type checking understand this transition.
+        assert isinstance(formatted_answer, AgentFinish)
        self._show_logs(formatted_answer)
        return formatted_answer

@@ -292,8 +309,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
            self._printer.print(
                content=f"\033[1m\033[95m# Agent:\033[00m \033[1m\033[92m{agent_role}\033[00m"
            )
+            description = (
+                self.task.description if self.task else "Not Found"
+            )
            self._printer.print(
-                content=f"\033[95m## Task:\033[00m \033[92m{self.task.description}\033[00m"
+                content=f"\033[95m## Task:\033[00m \033[92m{description}\033[00m"
            )

    def _show_logs(self, formatted_answer: Union[AgentAction, AgentFinish]):
@@ -335,40 +355,68 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                )

    def _execute_tool_and_check_finality(self, agent_action: AgentAction) -> ToolResult:
-        tool_usage = ToolUsage(
-            tools_handler=self.tools_handler,
-            tools=self.tools,
-            original_tools=self.original_tools,
-            tools_description=self.tools_description,
-            tools_names=self.tools_names,
-            function_calling_llm=self.function_calling_llm,
-            task=self.task,  # type: ignore[arg-type]
-            agent=self.agent,
-            action=agent_action,
-        )
-        tool_calling = tool_usage.parse_tool_calling(agent_action.text)
-
-        if isinstance(tool_calling, ToolUsageErrorException):
-            tool_result = tool_calling.message
-            return ToolResult(result=tool_result, result_as_answer=False)
-        else:
-            if tool_calling.tool_name.casefold().strip() in [
-                name.casefold().strip() for name in self.tool_name_to_tool_map
-            ] or tool_calling.tool_name.casefold().replace("_", " ") in [
-                name.casefold().strip() for name in self.tool_name_to_tool_map
-            ]:
-                tool_result = tool_usage.use(tool_calling, agent_action.text)
-                tool = self.tool_name_to_tool_map.get(tool_calling.tool_name)
-                if tool:
-                    return ToolResult(
-                        result=tool_result, result_as_answer=tool.result_as_answer
-                    )
-            else:
-                tool_result = self._i18n.errors("wrong_tool_name").format(
-                    tool=tool_calling.tool_name,
-                    tools=", ".join([tool.name.casefold() for tool in self.tools]),
+        try:
+            if self.agent:
+                crewai_event_bus.emit(
+                    self,
+                    event=ToolUsageStartedEvent(
+                        agent_key=self.agent.key,
+                        agent_role=self.agent.role,
+                        tool_name=agent_action.tool,
+                        tool_args=agent_action.tool_input,
+                        tool_class=agent_action.tool,
+                    ),
                )
-        return ToolResult(result=tool_result, result_as_answer=False)
+            tool_usage = ToolUsage(
+                tools_handler=self.tools_handler,
+                tools=self.tools,
+                original_tools=self.original_tools,
+                tools_description=self.tools_description,
+                tools_names=self.tools_names,
+                function_calling_llm=self.function_calling_llm,
+                task=self.task,  # type: ignore[arg-type]
+                agent=self.agent,
+                action=agent_action,
+            )
+            tool_calling = tool_usage.parse_tool_calling(agent_action.text)
+
+            if isinstance(tool_calling, ToolUsageErrorException):
+                tool_result = tool_calling.message
+                return ToolResult(result=tool_result, result_as_answer=False)
+            else:
+                if tool_calling.tool_name.casefold().strip() in [
+                    name.casefold().strip() for name in self.tool_name_to_tool_map
+                ] or tool_calling.tool_name.casefold().replace("_", " ") in [
+                    name.casefold().strip() for name in self.tool_name_to_tool_map
+                ]:
+                    tool_result = tool_usage.use(tool_calling, agent_action.text)
+                    tool = self.tool_name_to_tool_map.get(tool_calling.tool_name)
+                    if tool:
+                        return ToolResult(
+                            result=tool_result, result_as_answer=tool.result_as_answer
+                        )
+                else:
+                    tool_result = self._i18n.errors("wrong_tool_name").format(
+                        tool=tool_calling.tool_name,
+                        tools=", ".join([tool.name.casefold() for tool in self.tools]),
+                    )
+                return ToolResult(result=tool_result, result_as_answer=False)
+
+        except Exception as e:
+            # TODO: drop
+            if self.agent:
+                crewai_event_bus.emit(
+                    self,
+                    event=ToolUsageErrorEvent(  # validation error
+                        agent_key=self.agent.key,
+                        agent_role=self.agent.role,
+                        tool_name=agent_action.tool,
+                        tool_args=agent_action.tool_input,
+                        tool_class=agent_action.tool,
+                        error=str(e),
+                    ),
+                )
+            raise

    def _summarize_messages(self) -> None:
        messages_groups = []
@@ -418,58 +466,50 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
            )

    def _handle_crew_training_output(
-        self, result: AgentFinish, human_feedback: str | None = None
+        self, result: AgentFinish, human_feedback: Optional[str] = None
    ) -> None:
-        """Function to handle the process of the training data."""
+        """Handle the process of saving training data."""
        agent_id = str(self.agent.id)  # type: ignore
+        train_iteration = (
+            getattr(self.crew, "_train_iteration", None) if self.crew else None
+        )
+
+        if not isinstance(train_iteration, int):
+            self._printer.print(
+                content="Invalid or missing train iteration. Cannot save training data.",
+                color="red",
+            )
+            return

-        # Load training data
        training_handler = CrewTrainingHandler(TRAINING_DATA_FILE)
-        training_data = training_handler.load()
+        training_data = training_handler.load() or {}

-        # Check if training data exists, human input is not requested, and self.crew is valid
-        if training_data and not self.ask_for_human_input:
-            if self.crew is not None and hasattr(self.crew, "_train_iteration"):
-                train_iteration = self.crew._train_iteration
-                if agent_id in training_data and isinstance(train_iteration, int):
-                    training_data[agent_id][train_iteration][
-                        "improved_output"
-                    ] = result.output
-                    training_handler.save(training_data)
-                else:
-                    self._printer.print(
-                        content="Invalid train iteration type or agent_id not in training data.",
-                        color="red",
-                    )
-            else:
-                self._printer.print(
-                    content="Crew is None or does not have _train_iteration attribute.",
-                    color="red",
-                )
+        # Initialize or retrieve agent's training data
+        agent_training_data = training_data.get(agent_id, {})

-        if self.ask_for_human_input and human_feedback is not None:
-            training_data = {
+        if human_feedback is not None:
+            # Save initial output and human feedback
+            agent_training_data[train_iteration] = {
                "initial_output": result.output,
                "human_feedback": human_feedback,
-                "agent": agent_id,
-                "agent_role": self.agent.role,  # type: ignore
            }
-            if self.crew is not None and hasattr(self.crew, "_train_iteration"):
-                train_iteration = self.crew._train_iteration
-                if isinstance(train_iteration, int):
-                    CrewTrainingHandler(TRAINING_DATA_FILE).append(
-                        train_iteration, agent_id, training_data
-                    )
-                else:
-                    self._printer.print(
-                        content="Invalid train iteration type. Expected int.",
-                        color="red",
-                    )
+        else:
+            # Save improved output
+            if train_iteration in agent_training_data:
+                agent_training_data[train_iteration]["improved_output"] = result.output
            else:
                self._printer.print(
-                    content="Crew is None or does not have _train_iteration attribute.",
+                    content=(
+                        f"No existing training data for agent {agent_id} and iteration "
+                        f"{train_iteration}. Cannot save improved output."
+                    ),
                    color="red",
                )
+                return
+
+        # Update the training data and save
+        training_data[agent_id] = agent_training_data
+        training_handler.save(training_data)

    def _format_prompt(self, prompt: str, inputs: Dict[str, str]) -> str:
        prompt = prompt.replace("{input}", inputs["input"])
@@ -485,82 +525,85 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
        return {"role": role, "content": prompt}

    def _handle_human_feedback(self, formatted_answer: AgentFinish) -> AgentFinish:
-        """
-        Handles the human feedback loop, allowing the user to provide feedback
-        on the agent's output and determining if additional iterations are needed.
+        """Handle human feedback with different flows for training vs regular use.

-        Parameters:
-            formatted_answer (AgentFinish): The initial output from the agent.
+        Args:
+            formatted_answer: The initial AgentFinish result to get feedback on

        Returns:
-            AgentFinish: The final output after incorporating human feedback.
+            AgentFinish: The final answer after processing feedback
        """
+        human_feedback = self._ask_human_input(formatted_answer.output)
+
+        if self._is_training_mode():
+            return self._handle_training_feedback(formatted_answer, human_feedback)
+
+        return self._handle_regular_feedback(formatted_answer, human_feedback)
+
+    def _is_training_mode(self) -> bool:
+        """Check if crew is in training mode."""
+        return bool(self.crew and self.crew._train)
+
+    def _handle_training_feedback(
+        self, initial_answer: AgentFinish, feedback: str
+    ) -> AgentFinish:
+        """Process feedback for training scenarios with single iteration."""
+        self._handle_crew_training_output(initial_answer, feedback)
+        self.messages.append(
+            self._format_msg(
+                self._i18n.slice("feedback_instructions").format(feedback=feedback)
+            )
+        )
+        improved_answer = self._invoke_loop()
+        self._handle_crew_training_output(improved_answer)
+        self.ask_for_human_input = False
+        return improved_answer
+
+    def _handle_regular_feedback(
+        self, current_answer: AgentFinish, initial_feedback: str
+    ) -> AgentFinish:
+        """Process feedback for regular use with potential multiple iterations."""
+        feedback = initial_feedback
+        answer = current_answer
+
        while self.ask_for_human_input:
-            human_feedback = self._ask_human_input(formatted_answer.output)
-
-            if self.crew and self.crew._train:
-                self._handle_crew_training_output(formatted_answer, human_feedback)
-
-            # Make an LLM call to verify if additional changes are requested based on human feedback
-            additional_changes_prompt = self._i18n.slice(
-                "human_feedback_classification"
-            ).format(feedback=human_feedback)
-
-            retry_count = 0
-            llm_call_successful = False
-            additional_changes_response = None
-
-            while retry_count < MAX_LLM_RETRY and not llm_call_successful:
-                try:
-                    additional_changes_response = (
-                        self.llm.call(
-                            [
-                                self._format_msg(
-                                    additional_changes_prompt, role="system"
-                                )
-                            ],
-                            callbacks=self.callbacks,
-                        )
-                        .strip()
-                        .lower()
-                    )
-                    llm_call_successful = True
-                except Exception as e:
-                    retry_count += 1
-
-                    self._printer.print(
-                        content=f"Error during LLM call to classify human feedback: {e}. Retrying... ({retry_count}/{MAX_LLM_RETRY})",
-                        color="red",
-                    )
-
-            if not llm_call_successful:
-                self._printer.print(
-                    content="Error processing feedback after multiple attempts.",
-                    color="red",
-                )
+            # If the user provides a blank response, assume they are happy with the result
+            if feedback.strip() == "":
                self.ask_for_human_input = False
-                break
-
-            if additional_changes_response == "false":
-                self.ask_for_human_input = False
-            elif additional_changes_response == "true":
-                self.ask_for_human_input = True
-                # Add human feedback to messages
-                self.messages.append(self._format_msg(f"Feedback: {human_feedback}"))
-                # Invoke the loop again with updated messages
-                formatted_answer = self._invoke_loop()
-
-                if self.crew and self.crew._train:
-                    self._handle_crew_training_output(formatted_answer)
            else:
-                # Unexpected response
-                self._printer.print(
-                    content=f"Unexpected response from LLM: '{additional_changes_response}'. Assuming no additional changes requested.",
-                    color="red",
-                )
-                self.ask_for_human_input = False
+                answer = self._process_feedback_iteration(feedback)
+                feedback = self._ask_human_input(answer.output)

-        return formatted_answer
+        return answer
+
+    def _process_feedback_iteration(self, feedback: str) -> AgentFinish:
+        """Process a single feedback iteration."""
+        self.messages.append(
+            self._format_msg(
+                self._i18n.slice("feedback_instructions").format(feedback=feedback)
+            )
+        )
+        return self._invoke_loop()
+
+    def _log_feedback_error(self, retry_count: int, error: Exception) -> None:
+        """Log feedback processing errors."""
+        self._printer.print(
+            content=(
+                f"Error processing feedback: {error}. "
+                f"Retrying... ({retry_count + 1}/{MAX_LLM_RETRY})"
+            ),
+            color="red",
+        )
+
+    def _log_max_retries_exceeded(self) -> None:
+        """Log when max retries for feedback processing are exceeded."""
+        self._printer.print(
+            content=(
+                f"Failed to process feedback after {MAX_LLM_RETRY} attempts. "
+                "Ending feedback loop."
+            ),
+            color="red",
+        )

    def _handle_max_iterations_exceeded(self, formatted_answer):
        """
--- a/src/crewai/agents/parser.py
+++ b/src/crewai/agents/parser.py
@@ -94,6 +94,13 @@ class CrewAgentParser:

        elif includes_answer:
            final_answer = text.split(FINAL_ANSWER_ACTION)[-1].strip()
+            # Check whether the final answer ends with triple backticks.
+            if final_answer.endswith("```"):
+                # Count occurrences of triple backticks in the final answer.
+                count = final_answer.count("```")
+                # If count is odd then it's an unmatched trailing set; remove it.
+                if count % 2 != 0:
+                    final_answer = final_answer[:-3].rstrip()
            return AgentFinish(thought, final_answer, text)

        if not re.search(r"Action\s*\d*\s*:[\s]*(.*?)", text, re.DOTALL):
@@ -120,7 +127,10 @@ class CrewAgentParser:
        regex = r"(.*?)(?:\n\nAction|\n\nFinal Answer)"
        thought_match = re.search(regex, text, re.DOTALL)
        if thought_match:
-            return thought_match.group(1).strip()
+            thought = thought_match.group(1).strip()
+            # Remove any triple backticks from the thought string
+            thought = thought.replace("```", "").strip()
+            return thought
        return ""

    def _clean_action(self, text: str) -> str:
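Review note: the final-answer cleanup in the parser.py hunk above removes a trailing triple-backtick fence only when the total number of fences in the answer is odd, so balanced code blocks are left alone. A minimal standalone sketch of that rule follows. The helper name `strip_unmatched_trailing_fence` is hypothetical and is not part of crewai; the fence is built with `"`" * 3` only to avoid writing a literal fence inside the example.

```python
def strip_unmatched_trailing_fence(final_answer: str) -> str:
    """Drop a trailing triple-backtick fence only when the fences are unbalanced."""
    fence = "`" * 3  # three backticks, spelled indirectly
    # An odd fence count together with a trailing fence means the last one is unmatched.
    if final_answer.endswith(fence) and final_answer.count(fence) % 2 != 0:
        return final_answer[: -len(fence)].rstrip()
    return final_answer
```

An answer that ends with a complete fenced block has an even fence count and is returned unchanged, which is the case the odd/even check is meant to protect.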
--- a/src/crewai/cli/reset_memories_command.py
+++ b/src/crewai/cli/reset_memories_command.py
@@ -2,11 +2,7 @@ import subprocess

 import click

-from crewai.knowledge.storage.knowledge_storage import KnowledgeStorage
-from crewai.memory.entity.entity_memory import EntityMemory
-from crewai.memory.long_term.long_term_memory import LongTermMemory
-from crewai.memory.short_term.short_term_memory import ShortTermMemory
-from crewai.utilities.task_output_storage_handler import TaskOutputStorageHandler
+from crewai.cli.utils import get_crew


 def reset_memories_command(
@@ -30,30 +26,35 @@ def reset_memories_command(
    """

    try:
+        crew = get_crew()
+        if not crew:
+            raise ValueError("No crew found.")
        if all:
-            ShortTermMemory().reset()
-            EntityMemory().reset()
-            LongTermMemory().reset()
-            TaskOutputStorageHandler().reset()
-            KnowledgeStorage().reset()
+            crew.reset_memories(command_type="all")
            click.echo("All memories have been reset.")
-        else:
-            if long:
-                LongTermMemory().reset()
-                click.echo("Long term memory has been reset.")
+            return

-            if short:
-                ShortTermMemory().reset()
-                click.echo("Short term memory has been reset.")
-            if entity:
-                EntityMemory().reset()
-                click.echo("Entity memory has been reset.")
-            if kickoff_outputs:
-                TaskOutputStorageHandler().reset()
-                click.echo("Latest Kickoff outputs stored has been reset.")
-            if knowledge:
-                KnowledgeStorage().reset()
-                click.echo("Knowledge has been reset.")
+        if not any([long, short, entity, kickoff_outputs, knowledge]):
+            click.echo(
+                "No memory type specified. Please specify at least one type to reset."
+            )
+            return
+
+        if long:
+            crew.reset_memories(command_type="long")
+            click.echo("Long term memory has been reset.")
+        if short:
+            crew.reset_memories(command_type="short")
+            click.echo("Short term memory has been reset.")
+        if entity:
+            crew.reset_memories(command_type="entity")
+            click.echo("Entity memory has been reset.")
+        if kickoff_outputs:
+            crew.reset_memories(command_type="kickoff_outputs")
+            click.echo("Latest Kickoff outputs stored has been reset.")
+        if knowledge:
+            crew.reset_memories(command_type="knowledge")
+            click.echo("Knowledge has been reset.")

    except subprocess.CalledProcessError as e:
        click.echo(f"An error occurred while resetting the memories: {e}", err=True)
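Review note: the rewritten command above reduces to a small dispatch rule: `all` wins outright, otherwise each set flag maps to one `command_type` string, and no flags means nothing is reset. The sketch below isolates that rule as a pure function, assuming a hypothetical helper name `select_memory_resets` (not part of crewai) and the `command_type` values that appear in the hunk.

```python
from typing import List


def select_memory_resets(
    all_: bool = False,
    long: bool = False,
    short: bool = False,
    entity: bool = False,
    kickoff_outputs: bool = False,
    knowledge: bool = False,
) -> List[str]:
    """Return the command_type values to reset, in the order the CLI handles them."""
    if all_:
        # "all" short-circuits every other flag, mirroring the early return in the command.
        return ["all"]
    flags = {
        "long": long,
        "short": short,
        "entity": entity,
        "kickoff_outputs": kickoff_outputs,
        "knowledge": knowledge,
    }
    # Dicts preserve insertion order, so the result follows the order above.
    return [name for name, enabled in flags.items() if enabled]
```

An empty result corresponds to the "No memory type specified" branch in the command.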
--- a/src/crewai/cli/templates/crew/.gitignore
+++ b/src/crewai/cli/templates/crew/.gitignore
@@ -1,2 +1,3 @@
 .env
 __pycache__/
+.DS_Store
--- a/src/crewai/cli/templates/crew/main.py
+++ b/src/crewai/cli/templates/crew/main.py
@@ -56,7 +56,8 @@ def test():
    Test the crew execution and returns the results.
    """
    inputs = {
-        "topic": "AI LLMs"
+        "topic": "AI LLMs",
+        "current_year": str(datetime.now().year)
    }
    try:
        {{crew_name}}().crew().test(n_iterations=int(sys.argv[1]), openai_model_name=sys.argv[2], inputs=inputs)
--- a/src/crewai/cli/templates/crew/pyproject.toml
+++ b/src/crewai/cli/templates/crew/pyproject.toml
@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.13"
 dependencies = [
-    "crewai[tools]>=0.98.0,<1.0.0"
+    "crewai[tools]>=0.102.0,<1.0.0"
 ]

 [project.scripts]
--- a/src/crewai/cli/templates/flow/.gitignore
+++ b/src/crewai/cli/templates/flow/.gitignore
@@ -1,3 +1,4 @@
 .env
 __pycache__/
 lib/
+.DS_Store
--- a/src/crewai/cli/templates/flow/pyproject.toml
+++ b/src/crewai/cli/templates/flow/pyproject.toml
@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.13"
 dependencies = [
-    "crewai[tools]>=0.98.0,<1.0.0",
+    "crewai[tools]>=0.102.0,<1.0.0",
 ]

 [project.scripts]
--- a/src/crewai/cli/templates/tool/pyproject.toml
+++ b/src/crewai/cli/templates/tool/pyproject.toml
@@ -5,7 +5,7 @@ description = "Power up your crews with {{folder_name}}"
 readme = "README.md"
 requires-python = ">=3.10,<3.13"
 dependencies = [
-    "crewai[tools]>=0.98.0"
+    "crewai[tools]>=0.102.0"
 ]

 [tool.crewai]
--- a/src/crewai/cli/utils.py
+++ b/src/crewai/cli/utils.py
@@ -9,6 +9,7 @@ import tomli
 from rich.console import Console

 from crewai.cli.constants import ENV_VARS
+from crewai.crew import Crew

 if sys.version_info >= (3, 11):
    import tomllib
@@ -247,3 +248,64 @@ def write_env_file(folder_path, env_vars):
    with open(env_file_path, "w") as file:
        for key, value in env_vars.items():
            file.write(f"{key}={value}\n")
+
+
+def get_crew(crew_path: str = "crew.py", require: bool = False) -> Crew | None:
+    """Return a Crew built from the first crew.py found under the current directory."""
+    try:
+        import importlib.util
+        import os
+
+        for root, _, files in os.walk("."):
+            if "crew.py" in files:
+                crew_path = os.path.join(root, "crew.py")
+                try:
+                    spec = importlib.util.spec_from_file_location(
+                        "crew_module", crew_path
+                    )
+                    if not spec or not spec.loader:
+                        continue
+                    module = importlib.util.module_from_spec(spec)
+                    try:
+                        sys.modules[spec.name] = module
+                        spec.loader.exec_module(module)
+
+                        for attr_name in dir(module):
+                            attr = getattr(module, attr_name)
+                            try:
+                                if callable(attr) and hasattr(attr, "crew"):
+                                    crew_instance = attr().crew()
+                                    return crew_instance
+
+                            except Exception as e:
+                                print(f"Error processing attribute {attr_name}: {e}")
+                                continue
+
+                    except Exception as exec_error:
+                        print(f"Error executing module: {exec_error}")
+                        import traceback
+
+                        print(f"Traceback: {traceback.format_exc()}")
+
+                except (ImportError, AttributeError) as e:
+                    if require:
+                        console.print(
+                            f"Error importing crew from {crew_path}: {str(e)}",
+                            style="bold red",
+                        )
+                        continue
+
+                break
+
+        if require:
+            console.print("No valid Crew instance found in crew.py", style="bold red")
+            raise SystemExit
+        return None
+
+    except Exception as e:
+        if require:
+            console.print(
+                f"Unexpected error while loading crew: {str(e)}", style="bold red"
+            )
+            raise SystemExit
+        return None
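Review note: the core of `get_crew` above is the standard `importlib.util` recipe for executing a module from a file path. A minimal sketch of just that step is shown here, assuming a hypothetical helper name `load_module_from_path` (not part of crewai). Unlike `get_crew`, it raises instead of printing and continuing.

```python
import importlib.util
import sys
from pathlib import Path
from types import ModuleType


def load_module_from_path(path: Path, name: str = "crew_module") -> ModuleType:
    """Import the Python file at `path` as a module registered under `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code inside the file can resolve its own module.
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    return module
```

Raising on a missing spec keeps the failure visible to the caller; whether `get_crew` should do the same when `require` is true is a design choice for the PR author.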
--- a/src/crewai/crew.py
+++ b/src/crewai/crew.py
@@ -4,6 +4,7 @@ import re
 import uuid
 import warnings
 from concurrent.futures import Future
+from copy import copy as shallow_copy
 from hashlib import md5
 from typing import Any, Callable, Dict, List, Optional, Set, Tuple, Union

@@ -37,11 +38,24 @@ from crewai.tasks.task_output import TaskOutput
 from crewai.telemetry import Telemetry
 from crewai.tools.agent_tools.agent_tools import AgentTools
 from crewai.tools.base_tool import Tool
+from crewai.traces.unified_trace_controller import init_crew_main_trace
 from crewai.types.usage_metrics import UsageMetrics
 from crewai.utilities import I18N, FileHandler, Logger, RPMController
 from crewai.utilities.constants import TRAINING_DATA_FILE
 from crewai.utilities.evaluators.crew_evaluator_handler import CrewEvaluator
 from crewai.utilities.evaluators.task_evaluator import TaskEvaluator
+from crewai.utilities.events.crew_events import (
+    CrewKickoffCompletedEvent,
+    CrewKickoffFailedEvent,
+    CrewKickoffStartedEvent,
+    CrewTestCompletedEvent,
+    CrewTestFailedEvent,
+    CrewTestStartedEvent,
+    CrewTrainCompletedEvent,
+    CrewTrainFailedEvent,
+    CrewTrainStartedEvent,
+)
+from crewai.utilities.events.crewai_event_bus import crewai_event_bus
 from crewai.utilities.formatter import (
    aggregate_raw_outputs_from_task_outputs,
    aggregate_raw_outputs_from_tasks,
@@ -51,12 +65,6 @@ from crewai.utilities.planning_handler import CrewPlanner
 from crewai.utilities.task_output_storage_handler import TaskOutputStorageHandler
 from crewai.utilities.training_handler import CrewTrainingHandler

-try:
-    import agentops  # type: ignore
-except ImportError:
-    agentops = None
-
-
 warnings.filterwarnings("ignore", category=SyntaxWarning, module="pysbd")


@@ -182,9 +190,9 @@ class Crew(BaseModel):
        default=None,
        description="Path to the prompt json file to be used for the crew.",
    )
-    output_log_file: Optional[str] = Field(
+    output_log_file: Optional[Union[bool, str]] = Field(
        default=None,
-        description="output_log_file",
+        description="Path to the log file to be saved",
    )
    planning: Optional[bool] = Field(
        default=False,
@@ -210,8 +218,9 @@ class Crew(BaseModel):
        default=None,
        description="LLM used to handle chatting with the crew.",
    )
-    _knowledge: Optional[Knowledge] = PrivateAttr(
+    knowledge: Optional[Knowledge] = Field(
        default=None,
+        description="Knowledge for the crew.",
    )

    @field_validator("id", mode="before")
@@ -273,12 +282,26 @@ class Crew(BaseModel):
                if self.entity_memory
                else EntityMemory(crew=self, embedder_config=self.embedder)
            )
-            if hasattr(self, "memory_config") and self.memory_config is not None:
-                self._user_memory = (
-                    self.user_memory if self.user_memory else UserMemory(crew=self)
-                )
+            if (
+                self.memory_config and "user_memory" in self.memory_config
+            ):  # Check for user_memory in config
+                user_memory_config = self.memory_config["user_memory"]
+                if isinstance(
+                    user_memory_config, UserMemory
+                ):  # Check if it is already an instance
+                    self._user_memory = user_memory_config
+                elif isinstance(
+                    user_memory_config, dict
+                ):  # Check if it's a configuration dict
+                    self._user_memory = UserMemory(
+                        crew=self, **user_memory_config
+                    )  # Initialize with config
+                else:
+                    raise TypeError(
+                        "user_memory must be a UserMemory instance or a configuration dictionary"
+                    )
            else:
-                self._user_memory = None
+                self._user_memory = None  # No user memory if not in config
        return self

    @model_validator(mode="after")
@@ -289,9 +312,9 @@ class Crew(BaseModel):
                if isinstance(self.knowledge_sources, list) and all(
                    isinstance(k, BaseKnowledgeSource) for k in self.knowledge_sources
                ):
-                    self._knowledge = Knowledge(
+                    self.knowledge = Knowledge(
                        sources=self.knowledge_sources,
-                        embedder_config=self.embedder,
+                        embedder=self.embedder,
                        collection_name="crew",
                    )

@@ -378,6 +401,22 @@ class Crew(BaseModel):

        return self

+    @model_validator(mode="after")
+    def validate_must_have_non_conditional_task(self) -> "Crew":
+        """Ensure that a crew has at least one non-conditional task."""
+        if not self.tasks:
+            return self
+        non_conditional_count = sum(
+            1 for task in self.tasks if not isinstance(task, ConditionalTask)
+        )
+        if non_conditional_count == 0:
+            raise PydanticCustomError(
+                "only_conditional_tasks",
+                "Crew must include at least one non-conditional task",
+                {},
+            )
+        return self
+
    @model_validator(mode="after")
    def validate_first_task(self) -> "Crew":
        """Ensure the first task is not a ConditionalTask."""
@@ -489,83 +528,121 @@ class Crew(BaseModel):
        self, n_iterations: int, filename: str, inputs: Optional[Dict[str, Any]] = {}
    ) -> None:
        """Trains the crew for a given number of iterations."""
-        train_crew = self.copy()
-        train_crew._setup_for_training(filename)
+        try:
+            crewai_event_bus.emit(
+                self,
+                CrewTrainStartedEvent(
+                    crew_name=self.name or "crew",
+                    n_iterations=n_iterations,
+                    filename=filename,
+                    inputs=inputs,
+                ),
+            )
+            train_crew = self.copy()
+            train_crew._setup_for_training(filename)

-        for n_iteration in range(n_iterations):
-            train_crew._train_iteration = n_iteration
-            train_crew.kickoff(inputs=inputs)
+            for n_iteration in range(n_iterations):
+                train_crew._train_iteration = n_iteration
+                train_crew.kickoff(inputs=inputs)

-        training_data = CrewTrainingHandler(TRAINING_DATA_FILE).load()
+            training_data = CrewTrainingHandler(TRAINING_DATA_FILE).load()

-        for agent in train_crew.agents:
-            if training_data.get(str(agent.id)):
-                result = TaskEvaluator(agent).evaluate_training_data(
-                    training_data=training_data, agent_id=str(agent.id)
-                )
+            for agent in train_crew.agents:
+                if training_data.get(str(agent.id)):
+                    result = TaskEvaluator(agent).evaluate_training_data(
+                        training_data=training_data, agent_id=str(agent.id)
+                    )
+                    CrewTrainingHandler(filename).save_trained_data(
+                        agent_id=str(agent.role), trained_data=result.model_dump()
+                    )

-                CrewTrainingHandler(filename).save_trained_data(
-                    agent_id=str(agent.role), trained_data=result.model_dump()
-                )
+            crewai_event_bus.emit(
+                self,
+                CrewTrainCompletedEvent(
+                    crew_name=self.name or "crew",
+                    n_iterations=n_iterations,
+                    filename=filename,
+                ),
+            )
+        except Exception as e:
+            crewai_event_bus.emit(
+                self,
+                CrewTrainFailedEvent(error=str(e), crew_name=self.name or "crew"),
+            )
+            self._logger.log("error", f"Training failed: {e}", color="red")
+            CrewTrainingHandler(TRAINING_DATA_FILE).clear()
+            CrewTrainingHandler(filename).clear()
+            raise

+    @init_crew_main_trace
    def kickoff(
        self,
        inputs: Optional[Dict[str, Any]] = None,
    ) -> CrewOutput:
-        for before_callback in self.before_kickoff_callbacks:
-            if inputs is None:
-                inputs = {}
-            inputs = before_callback(inputs)
+        try:
+            for before_callback in self.before_kickoff_callbacks:
+                if inputs is None:
+                    inputs = {}
+                inputs = before_callback(inputs)

-        """Starts the crew to work on its assigned tasks."""
-        self._execution_span = self._telemetry.crew_execution_span(self, inputs)
-        self._task_output_handler.reset()
-        self._logging_color = "bold_purple"
-
-        if inputs is not None:
-            self._inputs = inputs
-            self._interpolate_inputs(inputs)
-        self._set_tasks_callbacks()
-
-        i18n = I18N(prompt_file=self.prompt_file)
-
-        for agent in self.agents:
-            agent.i18n = i18n
-            # type: ignore[attr-defined] # Argument 1 to "_interpolate_inputs" of "Crew" has incompatible type "dict[str, Any] | None"; expected "dict[str, Any]"
-            agent.crew = self  # type: ignore[attr-defined]
-            # TODO: Create an AgentFunctionCalling protocol for future refactoring
-            if not agent.function_calling_llm:  # type: ignore # "BaseAgent" has no attribute "function_calling_llm"
-                agent.function_calling_llm = self.function_calling_llm  # type: ignore # "BaseAgent" has no attribute "function_calling_llm"
-
-            if not agent.step_callback:  # type: ignore # "BaseAgent" has no attribute "step_callback"
-                agent.step_callback = self.step_callback  # type: ignore # "BaseAgent" has no attribute "step_callback"
-
-            agent.create_agent_executor()
-
-        if self.planning:
-            self._handle_crew_planning()
-
-        metrics: List[UsageMetrics] = []
-
-        if self.process == Process.sequential:
-            result = self._run_sequential_process()
-        elif self.process == Process.hierarchical:
-            result = self._run_hierarchical_process()
-        else:
-            raise NotImplementedError(
-                f"The process '{self.process}' is not implemented yet."
+            crewai_event_bus.emit(
+                self,
+                CrewKickoffStartedEvent(crew_name=self.name or "crew", inputs=inputs),
            )

-        for after_callback in self.after_kickoff_callbacks:
-            result = after_callback(result)
+            # Starts the crew to work on its assigned tasks.
+            self._task_output_handler.reset()
+            self._logging_color = "bold_purple"

-        metrics += [agent._token_process.get_summary() for agent in self.agents]
+            if inputs is not None:
+                self._inputs = inputs
+                self._interpolate_inputs(inputs)
+            self._set_tasks_callbacks()

-        self.usage_metrics = UsageMetrics()
-        for metric in metrics:
-            self.usage_metrics.add_usage_metrics(metric)
+            i18n = I18N(prompt_file=self.prompt_file)

-        return result
+            for agent in self.agents:
+                agent.i18n = i18n
+                # type: ignore[attr-defined] # Argument 1 to "_interpolate_inputs" of "Crew" has incompatible type "dict[str, Any] | None"; expected "dict[str, Any]"
+                agent.crew = self  # type: ignore[attr-defined]
+                # TODO: Create an AgentFunctionCalling protocol for future refactoring
+                if not agent.function_calling_llm:  # type: ignore # "BaseAgent" has no attribute "function_calling_llm"
+                    agent.function_calling_llm = self.function_calling_llm  # type: ignore # "BaseAgent" has no attribute "function_calling_llm"
+
+                if not agent.step_callback:  # type: ignore # "BaseAgent" has no attribute "step_callback"
+                    agent.step_callback = self.step_callback  # type: ignore # "BaseAgent" has no attribute "step_callback"
+
+                agent.create_agent_executor()
+
+            if self.planning:
+                self._handle_crew_planning()
+
+            metrics: List[UsageMetrics] = []
+
+            if self.process == Process.sequential:
+                result = self._run_sequential_process()
+            elif self.process == Process.hierarchical:
+                result = self._run_hierarchical_process()
+            else:
+                raise NotImplementedError(
+                    f"The process '{self.process}' is not implemented yet."
+                )
+
+            for after_callback in self.after_kickoff_callbacks:
+                result = after_callback(result)
+
+            metrics += [agent._token_process.get_summary() for agent in self.agents]
+
+            self.usage_metrics = UsageMetrics()
+            for metric in metrics:
+                self.usage_metrics.add_usage_metrics(metric)
+            return result
+        except Exception as e:
+            crewai_event_bus.emit(
+                self,
+                CrewKickoffFailedEvent(error=str(e), crew_name=self.name or "crew"),
+            )
+            raise

    def kickoff_for_each(self, inputs: List[Dict[str, Any]]) -> List[CrewOutput]:
        """Executes the Crew's workflow for each input in the list and aggregates results."""
@@ -674,12 +751,7 @@ class Crew(BaseModel):
                manager.tools = []
                raise Exception("Manager agent should not have tools")
        else:
-            self.manager_llm = (
-                getattr(self.manager_llm, "model_name", None)
-                or getattr(self.manager_llm, "model", None)
-                or getattr(self.manager_llm, "deployment_name", None)
-                or self.manager_llm
-            )
+            self.manager_llm = create_llm(self.manager_llm)
            manager = Agent(
                role=i18n.retrieve("hierarchical_manager_agent", "role"),
                goal=i18n.retrieve("hierarchical_manager_agent", "goal"),
@@ -739,6 +811,7 @@ class Crew(BaseModel):
                    task, task_outputs, futures, task_index, was_replayed
                )
                if skipped_task_output:
+                    task_outputs.append(skipped_task_output)
                    continue

            if task.async_execution:
@@ -762,7 +835,7 @@ class Crew(BaseModel):
                    context=context,
                    tools=tools_for_task,
                )
-                task_outputs = [task_output]
+                task_outputs.append(task_output)
                self._process_task_result(task, task_output)
                self._store_execution_log(task, task_output, task_index, was_replayed)

@@ -783,7 +856,7 @@ class Crew(BaseModel):
            task_outputs = self._process_async_tasks(futures, was_replayed)
            futures.clear()

-        previous_output = task_outputs[task_index - 1] if task_outputs else None
+        previous_output = task_outputs[-1] if task_outputs else None
        if previous_output is not None and not task.should_execute(previous_output):
            self._logger.log(
                "debug",
@@ -905,20 +978,29 @@ class Crew(BaseModel):
            )

    def _create_crew_output(self, task_outputs: List[TaskOutput]) -> CrewOutput:
-        if len(task_outputs) != 1:
-            raise ValueError(
-                "Something went wrong. Kickoff should return only one task output."
-            )
-        final_task_output = task_outputs[0]
+        if not task_outputs:
+            raise ValueError("No task outputs available to create crew output.")
+
+        # Filter out empty outputs and get the last valid one as the main output
+        valid_outputs = [t for t in task_outputs if t.raw]
+        if not valid_outputs:
+            raise ValueError("No valid task outputs available to create crew output.")
+        final_task_output = valid_outputs[-1]
+
        final_string_output = final_task_output.raw
        self._finish_execution(final_string_output)
        token_usage = self.calculate_usage_metrics()
-
+        crewai_event_bus.emit(
+            self,
+            CrewKickoffCompletedEvent(
+                crew_name=self.name or "crew", output=final_task_output
+            ),
+        )
        return CrewOutput(
            raw=final_task_output.raw,
            pydantic=final_task_output.pydantic,
            json_dict=final_task_output.json_dict,
-            tasks_output=[task.output for task in self.tasks if task.output],
+            tasks_output=task_outputs,
            token_usage=token_usage,
        )

@@ -991,8 +1073,8 @@ class Crew(BaseModel):
        return result

    def query_knowledge(self, query: List[str]) -> Union[List[Dict[str, Any]], None]:
-        if self._knowledge:
-            return self._knowledge.query(query)
+        if self.knowledge:
+            return self.knowledge.query(query)
        return None

    def fetch_inputs(self) -> Set[str]:
@@ -1036,6 +1118,8 @@ class Crew(BaseModel):
            "_telemetry",
            "agents",
            "tasks",
+            "knowledge_sources",
+            "knowledge",
        }

        cloned_agents = [agent.copy() for agent in self.agents]
@@ -1043,6 +1127,9 @@ class Crew(BaseModel):
        task_mapping = {}

        cloned_tasks = []
+        existing_knowledge_sources = shallow_copy(self.knowledge_sources)
+        existing_knowledge = shallow_copy(self.knowledge)
+
        for task in self.tasks:
            cloned_task = task.copy(cloned_agents, task_mapping)
            cloned_tasks.append(cloned_task)
@@ -1062,7 +1149,13 @@ class Crew(BaseModel):
        copied_data.pop("agents", None)
        copied_data.pop("tasks", None)

-        copied_crew = Crew(**copied_data, agents=cloned_agents, tasks=cloned_tasks)
+        copied_crew = Crew(
+            **copied_data,
+            agents=cloned_agents,
+            tasks=cloned_tasks,
+            knowledge_sources=existing_knowledge_sources,
+            knowledge=existing_knowledge,
+        )

        return copied_crew

@@ -1088,13 +1181,6 @@ class Crew(BaseModel):
    def _finish_execution(self, final_string_output: str) -> None:
        if self.max_rpm:
            self._rpm_controller.stop_rpm_counter()
-        if agentops:
-            agentops.end_session(
-                end_state="Success",
-                end_state_reason="Finished Execution",
-                is_auto_end=True,
-            )
-        self._telemetry.end_crew(self, final_string_output)

    def calculate_usage_metrics(self) -> UsageMetrics:
        """Calculates and returns the usage metrics."""
@@ -1112,25 +1198,122 @@ class Crew(BaseModel):
    def test(
        self,
        n_iterations: int,
-        openai_model_name: Optional[str] = None,
+        eval_llm: Union[str, InstanceOf[LLM]],
        inputs: Optional[Dict[str, Any]] = None,
    ) -> None:
        """Test and evaluate the Crew with the given inputs for n iterations concurrently using concurrent.futures."""
-        test_crew = self.copy()
+        try:
+            eval_llm = create_llm(eval_llm)
+            if not eval_llm:
+                raise ValueError("Failed to create LLM instance.")

-        self._test_execution_span = test_crew._telemetry.test_execution_span(
-            test_crew,
-            n_iterations,
-            inputs,
-            openai_model_name,  # type: ignore[arg-type]
-        )  # type: ignore[arg-type]
-        evaluator = CrewEvaluator(test_crew, openai_model_name)  # type: ignore[arg-type]
+            crewai_event_bus.emit(
+                self,
+                CrewTestStartedEvent(
+                    crew_name=self.name or "crew",
+                    n_iterations=n_iterations,
+                    eval_llm=eval_llm,
+                    inputs=inputs,
+                ),
+            )
+            test_crew = self.copy()
+            evaluator = CrewEvaluator(test_crew, eval_llm)  # type: ignore[arg-type]

-        for i in range(1, n_iterations + 1):
-            evaluator.set_iteration(i)
-            test_crew.kickoff(inputs=inputs)
+            for i in range(1, n_iterations + 1):
+                evaluator.set_iteration(i)
+                test_crew.kickoff(inputs=inputs)

-        evaluator.print_crew_evaluation_result()
+            evaluator.print_crew_evaluation_result()
+
+            crewai_event_bus.emit(
+                self,
+                CrewTestCompletedEvent(
+                    crew_name=self.name or "crew",
+                ),
+            )
+        except Exception as e:
+            crewai_event_bus.emit(
+                self,
+                CrewTestFailedEvent(error=str(e), crew_name=self.name or "crew"),
+            )
+            raise

    def __repr__(self):
        return f"Crew(id={self.id}, process={self.process}, number_of_agents={len(self.agents)}, number_of_tasks={len(self.tasks)})"
+
+    def reset_memories(self, command_type: str) -> None:
+        """Reset specific or all memories for the crew.
+
+        Args:
+            command_type: Type of memory to reset.
+                Valid options: 'long', 'short', 'entity', 'knowledge',
+                'kickoff_outputs', or 'all'
+
+        Raises:
+            ValueError: If an invalid command type is provided.
+            RuntimeError: If memory reset operation fails.
+        """
+        VALID_TYPES = frozenset(
+            ["long", "short", "entity", "knowledge", "kickoff_outputs", "all"]
+        )
+
+        if command_type not in VALID_TYPES:
+            raise ValueError(
+                f"Invalid command type. Must be one of: {', '.join(sorted(VALID_TYPES))}"
+            )
+
+        try:
+            if command_type == "all":
+                self._reset_all_memories()
+            else:
+                self._reset_specific_memory(command_type)
+
+            self._logger.log("info", f"{command_type} memory has been reset")
+
+        except Exception as e:
+            error_msg = f"Failed to reset {command_type} memory: {e}"
+            self._logger.log("error", error_msg)
+            raise RuntimeError(error_msg) from e
+
+    def _reset_all_memories(self) -> None:
+        """Reset all available memory systems."""
+        memory_systems = [
+            ("short term", self._short_term_memory),
+            ("entity", self._entity_memory),
+            ("long term", self._long_term_memory),
+            ("task output", self._task_output_handler),
+            ("knowledge", self.knowledge),
+        ]
+
+        for name, system in memory_systems:
+            if system is not None:
+                try:
+                    system.reset()
+                except Exception as e:
+                    raise RuntimeError(f"Failed to reset {name} memory") from e
+
+    def _reset_specific_memory(self, memory_type: str) -> None:
+        """Reset a specific memory system.
+
+        Args:
+            memory_type: Type of memory to reset
+
+        Raises:
+            RuntimeError: If the specified memory system fails to reset
+        """
+        reset_functions = {
+            "long": (self._long_term_memory, "long term"),
+            "short": (self._short_term_memory, "short term"),
+            "entity": (self._entity_memory, "entity"),
+            "knowledge": (self.knowledge, "knowledge"),
+            "kickoff_outputs": (self._task_output_handler, "task output"),
+        }
+
+        memory_system, name = reset_functions[memory_type]
+        if memory_system is None:
+            raise RuntimeError(f"{name} memory system is not initialized")
+
+        try:
+            memory_system.reset()
+        except Exception as e:
+            raise RuntimeError(f"Failed to reset {name} memory") from e
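Note on the `reset_memories` addition to `crew.py` above: the two helpers treat an uninitialized memory system differently. `_reset_all_memories` skips any system that is `None`, while `_reset_specific_memory` raises `RuntimeError` for it. The sketch below reproduces that dispatch in isolation. `FakeMemory`, the `systems` mapping, and the trimmed `VALID_TYPES` are hypothetical stand-ins rather than crewAI classes, and the logging and error re-wrapping from the diff are left out.

```python
from typing import Dict, Optional, Tuple


class FakeMemory:
    """Hypothetical stand-in for a memory backend that exposes reset()."""

    def __init__(self) -> None:
        self.items = ["stale entry"]

    def reset(self) -> None:
        self.items.clear()


# Trimmed for the example; the diff lists six valid command types.
VALID_TYPES = frozenset(["long", "short", "all"])


def reset_memories(
    systems: Dict[str, Tuple[Optional[FakeMemory], str]], command_type: str
) -> None:
    """Validate the command, then reset one system or every initialized one."""
    if command_type not in VALID_TYPES:
        raise ValueError(
            f"Invalid command type. Must be one of: {', '.join(sorted(VALID_TYPES))}"
        )
    if command_type == "all":
        # "all" silently skips systems that were never initialized.
        for system, _name in systems.values():
            if system is not None:
                system.reset()
        return
    system, name = systems[command_type]
    if system is None:
        # Asking for one specific, missing system is treated as an error.
        raise RuntimeError(f"{name} memory system is not initialized")
    system.reset()


long_memory = FakeMemory()
systems = {"long": (long_memory, "long term"), "short": (None, "short term")}
reset_memories(systems, "all")  # clears long_memory, ignores the missing one
```

In this sketch, `reset_memories(systems, "short")` raises `RuntimeError` while `"all"` succeeds. Callers that cannot be sure a given memory is configured may prefer `"all"` or should be ready to catch the error.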
--- a/src/crewai/flow/flow.py
+++ b/src/crewai/flow/flow.py
@@ -1,4 +1,5 @@
 import asyncio
+import copy
 import inspect
 import logging
 from typing import (
@@ -16,19 +17,25 @@ from typing import (
 )
 from uuid import uuid4

-from blinker import Signal
 from pydantic import BaseModel, Field, ValidationError

-from crewai.flow.flow_events import (
-    FlowFinishedEvent,
-    FlowStartedEvent,
-    MethodExecutionFinishedEvent,
-    MethodExecutionStartedEvent,
-)
 from crewai.flow.flow_visualizer import plot_flow
 from crewai.flow.persistence.base import FlowPersistence
 from crewai.flow.utils import get_possible_return_constants
-from crewai.telemetry import Telemetry
+from crewai.traces.unified_trace_controller import (
+    init_flow_main_trace,
+    trace_flow_step,
+)
+from crewai.utilities.events.crewai_event_bus import crewai_event_bus
+from crewai.utilities.events.flow_events import (
+    FlowCreatedEvent,
+    FlowFinishedEvent,
+    FlowPlotEvent,
+    FlowStartedEvent,
+    MethodExecutionFailedEvent,
+    MethodExecutionFinishedEvent,
+    MethodExecutionStartedEvent,
+)
 from crewai.utilities.printer import Printer

 logger = logging.getLogger(__name__)
@@ -394,7 +401,6 @@ class FlowMeta(type):
                or hasattr(attr_value, "__trigger_methods__")
                or hasattr(attr_value, "__is_router__")
            ):
-
                # Register start methods
                if hasattr(attr_value, "__is_start_method__"):
                    start_methods.append(attr_name)
@@ -427,7 +433,6 @@ class Flow(Generic[T], metaclass=FlowMeta):

    Type parameter T must be either Dict[str, Any] or a subclass of BaseModel."""

-    _telemetry = Telemetry()
    _printer = Printer()

    _start_methods: List[str] = []
@@ -435,7 +440,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
    _routers: Set[str] = set()
    _router_paths: Dict[str, List[str]] = {}
    initial_state: Union[Type[T], T, None] = None
-    event_emitter = Signal("event_emitter")

    def __class_getitem__(cls: Type["Flow"], item: Type[T]) -> Type["Flow"]:
        class _FlowGeneric(cls):  # type: ignore
@@ -469,7 +473,13 @@ class Flow(Generic[T], metaclass=FlowMeta):
        if kwargs:
            self._initialize_state(kwargs)

-        self._telemetry.flow_creation_span(self.__class__.__name__)
+        crewai_event_bus.emit(
+            self,
+            FlowCreatedEvent(
+                type="flow_created",
+                flow_name=self.__class__.__name__,
+            ),
+        )

        # Register all flow-related methods
        for method_name in dir(self):
@@ -569,6 +579,9 @@ class Flow(Generic[T], metaclass=FlowMeta):
            f"Initial state must be dict or BaseModel, got {type(self.initial_state)}"
        )

+    def _copy_state(self) -> T:
+        return copy.deepcopy(self._state)
+
    @property
    def state(self) -> T:
        return self._state
@@ -600,7 +613,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
            ```
        """
        try:
-            if not hasattr(self, '_state'):
+            if not hasattr(self, "_state"):
                return ""

            if isinstance(self._state, dict):
@@ -700,69 +713,90 @@ class Flow(Generic[T], metaclass=FlowMeta):
            raise TypeError(f"State must be dict or BaseModel, got {type(self._state)}")

    def kickoff(self, inputs: Optional[Dict[str, Any]] = None) -> Any:
-        """Start the flow execution.
+        """
+        Start the flow execution in a synchronous context.
+
+        This method wraps kickoff_async so that all state initialization and event
+        emission are handled in the asynchronous method.
+        """
+
+        async def run_flow():
+            return await self.kickoff_async(inputs)
+
+        return asyncio.run(run_flow())
+
+    @init_flow_main_trace
+    async def kickoff_async(self, inputs: Optional[Dict[str, Any]] = None) -> Any:
+        """
+        Start the flow execution asynchronously.
+
+        This method performs state restoration (if an 'id' is provided and persistence is available)
+        and updates the flow state with any additional inputs. It then emits the FlowStartedEvent,
+        logs the flow startup, and executes all start methods. Once completed, it emits the
+        FlowFinishedEvent and returns the final output.

        Args:
-            inputs: Optional dictionary containing input values and potentially a state ID to restore
+            inputs: Optional dictionary containing input values and/or a state ID for restoration.
+
+        Returns:
+            The final output from the flow, which is the result of the last executed method.
        """
-        # Handle state restoration if ID is provided in inputs
-        if inputs and 'id' in inputs and self._persistence is not None:
-            restore_uuid = inputs['id']
-            stored_state = self._persistence.load_state(restore_uuid)
-
+        if inputs:
            # Override the id in the state if it exists in inputs
-            if 'id' in inputs:
+            if "id" in inputs:
                if isinstance(self._state, dict):
-                    self._state['id'] = inputs['id']
+                    self._state["id"] = inputs["id"]
                elif isinstance(self._state, BaseModel):
-                    setattr(self._state, 'id', inputs['id'])
+                    setattr(self._state, "id", inputs["id"])

-            if stored_state:
-                self._log_flow_event(f"Loading flow state from memory for UUID: {restore_uuid}", color="yellow")
-                # Restore the state
-                self._restore_state(stored_state)
-            else:
-                self._log_flow_event(f"No flow state found for UUID: {restore_uuid}", color="red")
+            # If persistence is enabled, attempt to restore the stored state using the provided id.
+            if "id" in inputs and self._persistence is not None:
+                restore_uuid = inputs["id"]
+                stored_state = self._persistence.load_state(restore_uuid)
+                if stored_state:
+                    self._log_flow_event(
+                        f"Loading flow state from memory for UUID: {restore_uuid}",
+                        color="yellow",
+                    )
+                    self._restore_state(stored_state)
+                else:
+                    self._log_flow_event(
+                        f"No flow state found for UUID: {restore_uuid}", color="red"
+                    )

-            # Apply any additional inputs after restoration
-            filtered_inputs = {k: v for k, v in inputs.items() if k != 'id'}
+            # Update state with any additional inputs (ignoring the 'id' key)
+            filtered_inputs = {k: v for k, v in inputs.items() if k != "id"}
            if filtered_inputs:
                self._initialize_state(filtered_inputs)

-        # Start flow execution
-        self.event_emitter.send(
+        # Emit FlowStartedEvent and log the start of the flow.
+        crewai_event_bus.emit(
            self,
-            event=FlowStartedEvent(
+            FlowStartedEvent(
                type="flow_started",
                flow_name=self.__class__.__name__,
+                inputs=inputs,
            ),
        )
-        self._log_flow_event(f"Flow started with ID: {self.flow_id}", color="bold_magenta")
+        self._log_flow_event(
+            f"Flow started with ID: {self.flow_id}", color="bold_magenta"
+        )

-        if inputs is not None and 'id' not in inputs:
-            self._initialize_state(inputs)
-
-        return asyncio.run(self.kickoff_async())
-
-    async def kickoff_async(self, inputs: Optional[Dict[str, Any]] = None) -> Any:
        if not self._start_methods:
            raise ValueError("No start method defined")

-        self._telemetry.flow_execution_span(
-            self.__class__.__name__, list(self._methods.keys())
-        )
-
+        # Execute all start methods concurrently.
        tasks = [
            self._execute_start_method(start_method)
            for start_method in self._start_methods
        ]
        await asyncio.gather(*tasks)
-
        final_output = self._method_outputs[-1] if self._method_outputs else None

-        self.event_emitter.send(
+        # Emit FlowFinishedEvent after all processing is complete.
+        crewai_event_bus.emit(
            self,
-            event=FlowFinishedEvent(
+            FlowFinishedEvent(
                type="flow_finished",
                flow_name=self.__class__.__name__,
                result=final_output,
@@ -793,19 +827,59 @@ class Flow(Generic[T], metaclass=FlowMeta):
        )
        await self._execute_listeners(start_method_name, result)

+    @trace_flow_step
    async def _execute_method(
        self, method_name: str, method: Callable, *args: Any, **kwargs: Any
    ) -> Any:
-        result = (
-            await method(*args, **kwargs)
-            if asyncio.iscoroutinefunction(method)
-            else method(*args, **kwargs)
-        )
-        self._method_outputs.append(result)
-        self._method_execution_counts[method_name] = (
-            self._method_execution_counts.get(method_name, 0) + 1
-        )
-        return result
+        try:
+            dumped_params = {f"_{i}": arg for i, arg in enumerate(args)} | (
+                kwargs or {}
+            )
+            crewai_event_bus.emit(
+                self,
+                MethodExecutionStartedEvent(
+                    type="method_execution_started",
+                    method_name=method_name,
+                    flow_name=self.__class__.__name__,
+                    params=dumped_params,
+                    state=self._copy_state(),
+                ),
+            )
+
+            result = (
+                await method(*args, **kwargs)
+                if asyncio.iscoroutinefunction(method)
+                else method(*args, **kwargs)
+            )
+
+            self._method_outputs.append(result)
+            self._method_execution_counts[method_name] = (
+                self._method_execution_counts.get(method_name, 0) + 1
+            )
+
+            crewai_event_bus.emit(
+                self,
+                MethodExecutionFinishedEvent(
+                    type="method_execution_finished",
+                    method_name=method_name,
+                    flow_name=self.__class__.__name__,
+                    state=self._copy_state(),
+                    result=result,
+                ),
+            )
+
+            return result
+        except Exception as e:
+            crewai_event_bus.emit(
+                self,
+                MethodExecutionFailedEvent(
+                    type="method_execution_failed",
+                    method_name=method_name,
+                    flow_name=self.__class__.__name__,
+                    error=e,
+                ),
+            )
+            raise

    async def _execute_listeners(self, trigger_method: str, result: Any) -> None:
        """
@@ -944,15 +1018,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
        try:
            method = self._methods[listener_name]

-            self.event_emitter.send(
-                self,
-                event=MethodExecutionStartedEvent(
-                    type="method_execution_started",
-                    method_name=listener_name,
-                    flow_name=self.__class__.__name__,
-                ),
-            )
-
            sig = inspect.signature(method)
            params = list(sig.parameters.values())
            method_params = [p for p in params if p.name != "self"]
@@ -964,15 +1029,6 @@ class Flow(Generic[T], metaclass=FlowMeta):
            else:
                listener_result = await self._execute_method(listener_name, method)

-            self.event_emitter.send(
-                self,
-                event=MethodExecutionFinishedEvent(
-                    type="method_execution_finished",
-                    method_name=listener_name,
-                    flow_name=self.__class__.__name__,
-                ),
-            )
-
            # Execute listeners (and possibly routers) of this listener
            await self._execute_listeners(listener_name, listener_result)

@@ -984,7 +1040,9 @@ class Flow(Generic[T], metaclass=FlowMeta):

            traceback.print_exc()

-    def _log_flow_event(self, message: str, color: str = "yellow", level: str = "info") -> None:
+    def _log_flow_event(
+        self, message: str, color: str = "yellow", level: str = "info"
+    ) -> None:
        """Centralized logging method for flow events.

        This method provides a consistent interface for logging flow-related events,
@@ -1009,7 +1067,11 @@ class Flow(Generic[T], metaclass=FlowMeta):
            logger.warning(message)

    def plot(self, filename: str = "crewai_flow") -> None:
-        self._telemetry.flow_plotting_span(
-            self.__class__.__name__, list(self._methods.keys())
+        crewai_event_bus.emit(
+            self,
+            FlowPlotEvent(
+                type="flow_plot",
+                flow_name=self.__class__.__name__,
+            ),
        )
        plot_flow(self, filename)
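Note on the `flow.py` changes above: `kickoff` is now a thin synchronous wrapper around `kickoff_async`, and `_execute_method` emits started, finished, and failed events around each call, whether the method is a coroutine function or a plain function. The sketch below shows that shape under simplifying assumptions. `MiniFlow` and its string event log are hypothetical and stand in for `Flow` and `crewai_event_bus`. It uses `inspect.iscoroutinefunction`, where the diff uses `asyncio.iscoroutinefunction`.

```python
import asyncio
import inspect
from typing import Any, Callable, List


class MiniFlow:
    """Hypothetical, stripped-down flow; not the crewAI Flow class."""

    def __init__(self) -> None:
        self.events: List[str] = []  # stands in for the event bus
        self.outputs: List[Any] = []

    async def _execute_method(self, name: str, method: Callable[..., Any]) -> Any:
        self.events.append(f"started:{name}")
        try:
            # Await coroutine functions; call plain functions directly.
            if inspect.iscoroutinefunction(method):
                result = await method()
            else:
                result = method()
        except Exception:
            self.events.append(f"failed:{name}")
            raise
        self.outputs.append(result)
        self.events.append(f"finished:{name}")
        return result

    async def _async_step(self) -> str:
        await asyncio.sleep(0)
        return "async"

    async def kickoff_async(self) -> Any:
        await self._execute_method("sync_step", lambda: "sync")
        return await self._execute_method("async_step", self._async_step)

    def kickoff(self) -> Any:
        # Synchronous entry point: all real work lives in kickoff_async.
        return asyncio.run(self.kickoff_async())


flow = MiniFlow()
final = flow.kickoff()
```

Because `asyncio.run` refuses to start while another event loop is already running in the same thread, code that is itself async would need to `await flow.kickoff_async()` rather than call `kickoff()`.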
--- a/src/crewai/flow/flow_events.py
+++ b/src/crewai/flow/flow_events.py
@@ -1,33 +0,0 @@
-from dataclasses import dataclass, field
-from datetime import datetime
-from typing import Any, Optional
-
-
-@dataclass
-class Event:
-    type: str
-    flow_name: str
-    timestamp: datetime = field(init=False)
-
-    def __post_init__(self):
-        self.timestamp = datetime.now()
-
-
-@dataclass
-class FlowStartedEvent(Event):
-    pass
-
-
-@dataclass
-class MethodExecutionStartedEvent(Event):
-    method_name: str
-
-
-@dataclass
-class MethodExecutionFinishedEvent(Event):
-    method_name: str
-
-
-@dataclass
-class FlowFinishedEvent(Event):
-    result: Optional[Any] = None
--- a/src/crewai/flow/persistence/decorators.py
+++ b/src/crewai/flow/persistence/decorators.py
@@ -58,7 +58,7 @@ class PersistenceDecorator:
    _printer = Printer()  # Class-level printer instance

    @classmethod
-    def persist_state(cls, flow_instance: Any, method_name: str, persistence_instance: FlowPersistence) -> None:
+    def persist_state(cls, flow_instance: Any, method_name: str, persistence_instance: FlowPersistence, verbose: bool = False) -> None:
        """Persist flow state with proper error handling and logging.

        This method handles the persistence of flow state data, including proper
@@ -68,6 +68,7 @@ class PersistenceDecorator:
            flow_instance: The flow instance whose state to persist
            method_name: Name of the method that triggered persistence
            persistence_instance: The persistence backend to use
+            verbose: Whether to log persistence operations

        Raises:
            ValueError: If flow has no state or state lacks an ID
@@ -88,9 +89,10 @@ class PersistenceDecorator:
            if not flow_uuid:
                raise ValueError("Flow state must have an 'id' field for persistence")

-            # Log state saving with consistent message
-            cls._printer.print(LOG_MESSAGES["save_state"].format(flow_uuid), color="cyan")
-            logger.info(LOG_MESSAGES["save_state"].format(flow_uuid))
+            # Log state saving only if verbose is True
+            if verbose:
+                cls._printer.print(LOG_MESSAGES["save_state"].format(flow_uuid), color="cyan")
+                logger.info(LOG_MESSAGES["save_state"].format(flow_uuid))

            try:
                persistence_instance.save_state(
@@ -115,7 +117,7 @@ class PersistenceDecorator:
            raise ValueError(error_msg) from e


-def persist(persistence: Optional[FlowPersistence] = None):
+def persist(persistence: Optional[FlowPersistence] = None, verbose: bool = False):
    """Decorator to persist flow state.

    This decorator can be applied at either the class level or method level.
@@ -126,6 +128,7 @@ def persist(persistence: Optional[FlowPersistence] = None):
    Args:
        persistence: Optional FlowPersistence implementation to use.
                    If not provided, uses SQLiteFlowPersistence.
+        verbose: Whether to log persistence operations. Defaults to False.

    Returns:
        A decorator that can be applied to either a class or method
@@ -135,13 +138,12 @@ def persist(persistence: Optional[FlowPersistence] = None):
        RuntimeError: If state persistence fails

    Example:
-        @persist  # Class-level persistence with default SQLite
+        @persist(verbose=True)  # Class-level persistence with logging
        class MyFlow(Flow[MyState]):
            @start()
            def begin(self):
                pass
    """
-
    def decorator(target: Union[Type, Callable[..., T]]) -> Union[Type, Callable[..., T]]:
        """Decorator that handles both class and method decoration."""
        actual_persistence = persistence or SQLiteFlowPersistence()
@@ -179,7 +181,7 @@ def persist(persistence: Optional[FlowPersistence] = None):
                        @functools.wraps(original_method)
                        async def method_wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
                            result = await original_method(self, *args, **kwargs)
-                            PersistenceDecorator.persist_state(self, method_name, actual_persistence)
+                            PersistenceDecorator.persist_state(self, method_name, actual_persistence, verbose)
                            return result
                        return method_wrapper

@@ -199,7 +201,7 @@ def persist(persistence: Optional[FlowPersistence] = None):
                        @functools.wraps(original_method)
                        def method_wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
                            result = original_method(self, *args, **kwargs)
-                            PersistenceDecorator.persist_state(self, method_name, actual_persistence)
+                            PersistenceDecorator.persist_state(self, method_name, actual_persistence, verbose)
                            return result
                        return method_wrapper

@@ -228,7 +230,7 @@ def persist(persistence: Optional[FlowPersistence] = None):
                        result = await method_coro
                    else:
                        result = method_coro
-                    PersistenceDecorator.persist_state(flow_instance, method.__name__, actual_persistence)
+                    PersistenceDecorator.persist_state(flow_instance, method.__name__, actual_persistence, verbose)
                    return result

                for attr in ["__is_start_method__", "__trigger_methods__", "__condition_type__", "__is_router__"]:
@@ -240,7 +242,7 @@ def persist(persistence: Optional[FlowPersistence] = None):
                @functools.wraps(method)
                def method_sync_wrapper(flow_instance: Any, *args: Any, **kwargs: Any) -> T:
                    result = method(flow_instance, *args, **kwargs)
-                    PersistenceDecorator.persist_state(flow_instance, method.__name__, actual_persistence)
+                    PersistenceDecorator.persist_state(flow_instance, method.__name__, actual_persistence, verbose)
                    return result

                for attr in ["__is_start_method__", "__trigger_methods__", "__condition_type__", "__is_router__"]:
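Review note: the `verbose` flag added above is threaded from the decorator factory into every wrapper, so logging is decided once at decoration time. The sketch below is a minimal, self-contained illustration of that pattern only. `persist_sketch`, `DemoFlow`, the in-memory `saved` dict, and the log message text are all hypothetical stand-ins and are not part of crewai.

```python
import functools
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-ins for a persistence backend and a logger.
saved: Dict[str, Dict[str, Any]] = {}
log_lines: List[str] = []


def persist_sketch(verbose: bool = False) -> Callable:
    """Decorator factory: save state after the method runs, log only if verbose."""

    def decorator(method: Callable) -> Callable:
        @functools.wraps(method)
        def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
            result = method(self, *args, **kwargs)
            if verbose:
                # Logging is opt-in; the save itself always happens.
                log_lines.append(f"saving state for {self.state['id']}")
            saved[self.state["id"]] = dict(self.state)
            return result

        return wrapper

    return decorator


class DemoFlow:
    def __init__(self) -> None:
        self.state = {"id": "flow-1", "count": 0}

    @persist_sketch()
    def quiet_step(self) -> None:
        self.state["count"] += 1

    @persist_sketch(verbose=True)
    def loud_step(self) -> None:
        self.state["count"] += 1


flow = DemoFlow()
flow.quiet_step()
quiet_logs = list(log_lines)  # snapshot: nothing logged by the quiet step
flow.loud_step()
```

Both steps persist state; only the step decorated with `verbose=True` produces a log line.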
--- a/src/crewai/flow/state_utils.py
+++ b/src/crewai/flow/state_utils.py
@@ -0,0 +1,91 @@
+import json
+from datetime import date, datetime
+from typing import Any, Dict, List, Union
+
+from pydantic import BaseModel
+
+from crewai.flow import Flow
+
+SerializablePrimitive = Union[str, int, float, bool, None]
+Serializable = Union[
+    SerializablePrimitive, List["Serializable"], Dict[str, "Serializable"]
+]
+
+
+def export_state(flow: Flow) -> dict[str, Serializable]:
+    """Exports the Flow's internal state as JSON-compatible data structures.
+
+    Performs a one-way transformation of a Flow's state into basic Python types
+    that can be safely serialized to JSON. To prevent infinite recursion with
+    circular references, the conversion is limited to a depth of 5 levels.
+
+    Args:
+        flow: The Flow object whose state needs to be exported
+
+    Returns:
+        dict[str, Serializable]: The transformed state using JSON-compatible Python
+            types.
+    """
+    result = to_serializable(flow._state)
+    assert isinstance(result, dict)
+    return result
+
+
+def to_serializable(
+    obj: Any, max_depth: int = 5, _current_depth: int = 0
+) -> Serializable:
+    """Converts a Python object into a JSON-compatible representation.
+
+    Supports primitives, datetime objects, collections, dictionaries, and
+    Pydantic models. Recursion depth is limited to prevent infinite nesting.
+    Non-convertible objects default to their string representations.
+
+    Args:
+        obj (Any): Object to transform.
+        max_depth (int, optional): Maximum recursion depth. Defaults to 5.
+
+    Returns:
+        Serializable: A JSON-compatible structure.
+    """
+    if _current_depth >= max_depth:
+        return repr(obj)
+
+    if isinstance(obj, (str, int, float, bool, type(None))):
+        return obj
+    elif isinstance(obj, (date, datetime)):
+        return obj.isoformat()
+    elif isinstance(obj, (list, tuple, set)):
+        return [to_serializable(item, max_depth, _current_depth + 1) for item in obj]
+    elif isinstance(obj, dict):
+        return {
+            _to_serializable_key(key): to_serializable(
+                value, max_depth, _current_depth + 1
+            )
+            for key, value in obj.items()
+        }
+    elif isinstance(obj, BaseModel):
+        return to_serializable(obj.model_dump(), max_depth, _current_depth + 1)
+    else:
+        return repr(obj)
+
+
+def _to_serializable_key(key: Any) -> str:
+    if isinstance(key, (str, int)):
+        return str(key)
+    return f"key_{id(key)}_{repr(key)}"
+
+
+def to_string(obj: Any) -> str | None:
+    """Serializes an object into a JSON string.
+
+    Args:
+        obj (Any): Object to serialize.
+
+    Returns:
+        str | None: A JSON-formatted string, or `None` if the object serializes to `None`.
+    """
+    serializable = to_serializable(obj)
+    if serializable is None:
+        return None
+    else:
+        return json.dumps(serializable)
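Review note: a simplified, stdlib-only sketch of the depth-limited conversion added in `state_utils.py`, for checking its behavior in isolation. It assumes the same branch order as the diff but omits the Pydantic branch and the key-mangling helper. `to_serializable_sketch` is a hypothetical name, not the library function.

```python
import json
from datetime import date, datetime
from typing import Any


def to_serializable_sketch(obj: Any, max_depth: int = 5, _depth: int = 0) -> Any:
    """Convert obj to JSON-compatible types, falling back to repr() past max_depth."""
    # The depth check runs first, so even primitives at the limit become strings.
    if _depth >= max_depth:
        return repr(obj)
    if isinstance(obj, (str, int, float, bool, type(None))):
        return obj
    if isinstance(obj, (date, datetime)):
        return obj.isoformat()
    if isinstance(obj, (list, tuple, set)):
        return [to_serializable_sketch(item, max_depth, _depth + 1) for item in obj]
    if isinstance(obj, dict):
        return {
            str(key): to_serializable_sketch(value, max_depth, _depth + 1)
            for key, value in obj.items()
        }
    return repr(obj)


state = {
    "id": "abc",
    "created": date(2025, 2, 21),
    "tags": ("a", "b"),
    "nested": {"n": 1},
}
exported = to_serializable_sketch(state)
encoded = json.dumps(exported)  # succeeds because only basic types remain

# With a tight limit, anything at the cut-off depth is stringified via repr().
truncated = to_serializable_sketch({"a": {"b": 1}}, max_depth=1)
```

Because the depth check precedes the primitive check, values sitting exactly at the limit are returned as `repr()` strings rather than as their original types, which may be worth a test in the real module.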
--- a/src/crewai/knowledge/knowledge.py
+++ b/src/crewai/knowledge/knowledge.py
@@ -15,20 +15,20 @@ class Knowledge(BaseModel):
    Args:
        sources: List[BaseKnowledgeSource] = Field(default_factory=list)
        storage: Optional[KnowledgeStorage] = Field(default=None)
-        embedder_config: Optional[Dict[str, Any]] = None
+        embedder: Optional[Dict[str, Any]] = None
    """

    sources: List[BaseKnowledgeSource] = Field(default_factory=list)
    model_config = ConfigDict(arbitrary_types_allowed=True)
    storage: Optional[KnowledgeStorage] = Field(default=None)
-    embedder_config: Optional[Dict[str, Any]] = None
+    embedder: Optional[Dict[str, Any]] = None
    collection_name: Optional[str] = None

    def __init__(
        self,
        collection_name: str,
        sources: List[BaseKnowledgeSource],
-        embedder_config: Optional[Dict[str, Any]] = None,
+        embedder: Optional[Dict[str, Any]] = None,
        storage: Optional[KnowledgeStorage] = None,
        **data,
    ):
@@ -37,25 +37,23 @@ class Knowledge(BaseModel):
            self.storage = storage
        else:
            self.storage = KnowledgeStorage(
-                embedder_config=embedder_config, collection_name=collection_name
+                embedder=embedder, collection_name=collection_name
            )
        self.sources = sources
        self.storage.initialize_knowledge_storage()
-        for source in sources:
-            source.storage = self.storage
-            source.add()
+        self._add_sources()

    def query(self, query: List[str], limit: int = 3) -> List[Dict[str, Any]]:
        """
        Query across all knowledge sources to find the most relevant information.
        Returns the top_k most relevant chunks.
-        
+
        Raises:
            ValueError: If storage is not initialized.
        """
        if self.storage is None:
            raise ValueError("Storage is not initialized.")
-            
+
        results = self.storage.search(
            query,
            limit,
@@ -63,6 +61,15 @@ class Knowledge(BaseModel):
        return results

    def _add_sources(self):
-        for source in self.sources:
-            source.storage = self.storage
-            source.add()
+        try:
+            for source in self.sources:
+                source.storage = self.storage
+                source.add()
+        except Exception as e:
+            raise e
+
+    def reset(self) -> None:
+        if self.storage:
+            self.storage.reset()
+        else:
+            raise ValueError("Storage is not initialized.")
--- a/src/crewai/knowledge/source/base_file_knowledge_source.py
+++ b/src/crewai/knowledge/source/base_file_knowledge_source.py
@@ -29,7 +29,13 @@ class BaseFileKnowledgeSource(BaseKnowledgeSource, ABC):
    def validate_file_path(cls, v, info):
        """Validate that at least one of file_path or file_paths is provided."""
        # Single check if both are None, O(1) instead of nested conditions
-        if v is None and info.data.get("file_path" if info.field_name == "file_paths" else "file_paths") is None:
+        if (
+            v is None
+            and info.data.get(
+                "file_path" if info.field_name == "file_paths" else "file_paths"
+            )
+            is None
+        ):
            raise ValueError("Either file_path or file_paths must be provided")
        return v

--- a/src/crewai/knowledge/source/excel_knowledge_source.py
+++ b/src/crewai/knowledge/source/excel_knowledge_source.py
@@ -1,28 +1,138 @@
 from pathlib import Path
-from typing import Dict, List
+from typing import Dict, Iterator, List, Optional, Union
+from urllib.parse import urlparse

-from crewai.knowledge.source.base_file_knowledge_source import BaseFileKnowledgeSource
+from pydantic import Field, field_validator
+
+from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
+from crewai.utilities.constants import KNOWLEDGE_DIRECTORY
+from crewai.utilities.logger import Logger


-class ExcelKnowledgeSource(BaseFileKnowledgeSource):
+class ExcelKnowledgeSource(BaseKnowledgeSource):
    """A knowledge source that stores and queries Excel file content using embeddings."""

-    def load_content(self) -> Dict[Path, str]:
-        """Load and preprocess Excel file content."""
-        pd = self._import_dependencies()
+    # override content to be a dict of file paths to sheet names to csv content

+    _logger: Logger = Logger(verbose=True)
+
+    file_path: Optional[Union[Path, List[Path], str, List[str]]] = Field(
+        default=None,
+        description="[Deprecated] The path to the file. Use file_paths instead.",
+    )
+    file_paths: Optional[Union[Path, List[Path], str, List[str]]] = Field(
+        default_factory=list, description="The path to the file"
+    )
+    chunks: List[str] = Field(default_factory=list)
+    content: Dict[Path, Dict[str, str]] = Field(default_factory=dict)
+    safe_file_paths: List[Path] = Field(default_factory=list)
+
+    @field_validator("file_path", "file_paths", mode="before")
+    def validate_file_path(cls, v, info):
+        """Validate that at least one of file_path or file_paths is provided."""
+        # Single check if both are None, O(1) instead of nested conditions
+        if (
+            v is None
+            and info.data.get(
+                "file_path" if info.field_name == "file_paths" else "file_paths"
+            )
+            is None
+        ):
+            raise ValueError("Either file_path or file_paths must be provided")
+        return v
+
+    def _process_file_paths(self) -> List[Path]:
+        """Convert file_path to a list of Path objects."""
+
+        if hasattr(self, "file_path") and self.file_path is not None:
+            self._logger.log(
+                "warning",
+                "The 'file_path' attribute is deprecated and will be removed in a future version. Please use 'file_paths' instead.",
+                color="yellow",
+            )
+            self.file_paths = self.file_path
+
+        if self.file_paths is None:
+            raise ValueError("Your source must be provided with a file_paths: []")
+
+        # Convert single path to list
+        path_list: List[Union[Path, str]] = (
+            [self.file_paths]
+            if isinstance(self.file_paths, (str, Path))
+            else list(self.file_paths)
+            if isinstance(self.file_paths, list)
+            else []
+        )
+
+        if not path_list:
+            raise ValueError(
+                "file_path/file_paths must be a Path, str, or a list of these types"
+            )
+
+        return [self.convert_to_path(path) for path in path_list]
+
+    def validate_content(self):
+        """Validate the paths."""
+        for path in self.safe_file_paths:
+            if not path.exists():
+                self._logger.log(
+                    "error",
+                    f"File not found: {path}. Try adding sources to the knowledge directory. If it's inside the knowledge directory, use the relative path.",
+                    color="red",
+                )
+                raise FileNotFoundError(f"File not found: {path}")
+            if not path.is_file():
+                self._logger.log(
+                    "error",
+                    f"Path is not a file: {path}",
+                    color="red",
+                )
+
+    def model_post_init(self, _) -> None:
+        if self.file_path:
+            self._logger.log(
+                "warning",
+                "The 'file_path' attribute is deprecated and will be removed in a future version. Please use 'file_paths' instead.",
+                color="yellow",
+            )
+            self.file_paths = self.file_path
+        self.safe_file_paths = self._process_file_paths()
+        self.validate_content()
+        self.content = self._load_content()
+
+    def _load_content(self) -> Dict[Path, Dict[str, str]]:
+        """Load and preprocess Excel file content from multiple sheets.
+
+        Each sheet's content is converted to CSV format and stored.
+
+        Returns:
+            Dict[Path, Dict[str, str]]: A mapping of file paths to their respective sheet contents.
+
+        Raises:
+            ImportError: If required dependencies are missing.
+            FileNotFoundError: If the specified Excel file cannot be opened.
+        """
+        pd = self._import_dependencies()
        content_dict = {}
        for file_path in self.safe_file_paths:
            file_path = self.convert_to_path(file_path)
-            df = pd.read_excel(file_path)
-            content = df.to_csv(index=False)
-            content_dict[file_path] = content
+            with pd.ExcelFile(file_path) as xl:
+                sheet_dict = {
+                    str(sheet_name): str(
+                        pd.read_excel(xl, sheet_name).to_csv(index=False)
+                    )
+                    for sheet_name in xl.sheet_names
+                }
+            content_dict[file_path] = sheet_dict
        return content_dict

+    def convert_to_path(self, path: Union[Path, str]) -> Path:
+        """Convert a path to a Path object."""
+        return Path(KNOWLEDGE_DIRECTORY + "/" + path) if isinstance(path, str) else path
+
    def _import_dependencies(self):
        """Dynamically import dependencies."""
        try:
-            import openpyxl  # noqa
            import pandas as pd

            return pd
@@ -38,10 +148,14 @@ class ExcelKnowledgeSource(BaseFileKnowledgeSource):
        and save the embeddings.
        """
        # Convert dictionary values to a single string if content is a dictionary
-        if isinstance(self.content, dict):
-            content_str = "\n".join(str(value) for value in self.content.values())
-        else:
-            content_str = str(self.content)
+        # Updated to account for .xlsx workbooks with multiple tabs/sheets
+        content_str = ""
+        for value in self.content.values():
+            if isinstance(value, dict):
+                for sheet_value in value.values():
+                    content_str += str(sheet_value) + "\n"
+            else:
+                content_str += str(value) + "\n"

        new_chunks = self._chunk_text(content_str)
        self.chunks.extend(new_chunks)
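Review note: the updated `add` logic flattens a mapping of file path to sheet name to CSV text into one string before chunking. The sketch below reproduces that flattening with the standard library only. `flatten_content` and the sample workbook data are hypothetical and exist just to show the resulting string.

```python
from pathlib import Path
from typing import Dict, List, Union


def flatten_content(content: Dict[Path, Union[str, Dict[str, str]]]) -> str:
    """Join every sheet's CSV text (or plain file text) with a trailing newline."""
    parts: List[str] = []
    for value in content.values():
        if isinstance(value, dict):
            # Multi-sheet workbook: one entry per sheet, in sheet order.
            parts.extend(str(sheet_csv) for sheet_csv in value.values())
        else:
            parts.append(str(value))
    return "".join(part + "\n" for part in parts)


# Hypothetical workbook with two sheets, already converted to CSV text.
workbook_content = {
    Path("sales.xlsx"): {
        "Q1": "region,total\nEU,10\n",
        "Q2": "region,total\nEU,12\n",
    },
}
flattened = flatten_content(workbook_content)
```

Only the sheet values are concatenated, so sheet names and file paths do not appear in the text that gets chunked and embedded.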
--- a/src/crewai/knowledge/storage/knowledge_storage.py
+++ b/src/crewai/knowledge/storage/knowledge_storage.py
@@ -48,11 +48,11 @@ class KnowledgeStorage(BaseKnowledgeStorage):

    def __init__(
        self,
-        embedder_config: Optional[Dict[str, Any]] = None,
+        embedder: Optional[Dict[str, Any]] = None,
        collection_name: Optional[str] = None,
    ):
        self.collection_name = collection_name
-        self._set_embedder_config(embedder_config)
+        self._set_embedder_config(embedder)

    def search(
        self,
@@ -76,7 +76,7 @@ class KnowledgeStorage(BaseKnowledgeStorage):
                        "context": fetched["documents"][0][i],  # type: ignore
                        "score": fetched["distances"][0][i],  # type: ignore
                    }
-                    if result["score"] >= score_threshold:  # type: ignore
+                    if result["score"] >= score_threshold:
                        results.append(result)
                return results
            else:
@@ -99,7 +99,7 @@ class KnowledgeStorage(BaseKnowledgeStorage):
            )
            if self.app:
                self.collection = self.app.get_or_create_collection(
-                    name=collection_name, embedding_function=self.embedder_config
+                    name=collection_name, embedding_function=self.embedder
                )
            else:
                raise Exception("Vector Database Client not initialized")
@@ -187,17 +187,15 @@ class KnowledgeStorage(BaseKnowledgeStorage):
            api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"
        )

-    def _set_embedder_config(
-        self, embedder_config: Optional[Dict[str, Any]] = None
-    ) -> None:
+    def _set_embedder_config(self, embedder: Optional[Dict[str, Any]] = None) -> None:
        """Set the embedding configuration for the knowledge storage.

        Args:
            embedder_config (Optional[Dict[str, Any]]): Configuration dictionary for the embedder.
                If None or empty, defaults to the default embedding function.
        """
-        self.embedder_config = (
-            EmbeddingConfigurator().configure_embedder(embedder_config)
-            if embedder_config
+        self.embedder = (
+            EmbeddingConfigurator().configure_embedder(embedder)
+            if embedder
            else self._create_default_embedding_function()
        )
--- a/src/crewai/llm.py
+++ b/src/crewai/llm.py
@@ -1,3 +1,4 @@
+import inspect
 import json
 import logging
 import os
@@ -5,20 +6,37 @@ import sys
 import threading
 import warnings
 from contextlib import contextmanager
-from typing import Any, Dict, List, Optional, Union, cast
+from typing import (
+    Any,
+    Dict,
+    List,
+    Literal,
+    Optional,
+    Tuple,
+    Type,
+    Union,
+    cast,
+)

 from dotenv import load_dotenv
+from pydantic import BaseModel
+
+from crewai.utilities.events.tool_usage_events import ToolExecutionErrorEvent

 with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    import litellm
-    from litellm import Choices, get_supported_openai_params
+    from litellm import Choices
    from litellm.types.utils import ModelResponse
+    from litellm.utils import get_supported_openai_params, supports_response_schema


+from crewai.traces.unified_trace_controller import trace_llm_call
+from crewai.utilities.events import crewai_event_bus
 from crewai.utilities.exceptions.context_window_exceeding_exception import (
    LLMContextLengthExceededException,
 )
+from crewai.utilities.protocols import AgentExecutorProtocol

 load_dotenv()

@@ -128,14 +146,17 @@ class LLM:
        presence_penalty: Optional[float] = None,
        frequency_penalty: Optional[float] = None,
        logit_bias: Optional[Dict[int, float]] = None,
-        response_format: Optional[Dict[str, Any]] = None,
+        response_format: Optional[Type[BaseModel]] = None,
        seed: Optional[int] = None,
        logprobs: Optional[int] = None,
        top_logprobs: Optional[int] = None,
        base_url: Optional[str] = None,
+        api_base: Optional[str] = None,
        api_version: Optional[str] = None,
        api_key: Optional[str] = None,
        callbacks: List[Any] = [],
+        reasoning_effort: Optional[Literal["none", "low", "medium", "high"]] = None,
+        **kwargs,
    ):
        self.model = model
        self.timeout = timeout
@@ -152,10 +173,15 @@ class LLM:
        self.logprobs = logprobs
        self.top_logprobs = top_logprobs
        self.base_url = base_url
+        self.api_base = api_base
        self.api_version = api_version
        self.api_key = api_key
        self.callbacks = callbacks
        self.context_window_size = 0
+        self.reasoning_effort = reasoning_effort
+        self.additional_params = kwargs
+        self._message_history: List[Dict[str, str]] = []
+        self.is_anthropic = self._is_anthropic_model(model)

        litellm.drop_params = True

@@ -170,55 +196,94 @@ class LLM:
        self.set_callbacks(callbacks)
        self.set_env_callbacks()

+    @trace_llm_call
+    def _call_llm(self, params: Dict[str, Any]) -> Any:
+        with suppress_warnings():
+            response = litellm.completion(**params)
+            return response
+
+    def _is_anthropic_model(self, model: str) -> bool:
+        """Determine if the model is from Anthropic provider.
+
+        Args:
+            model: The model identifier string.
+
+        Returns:
+            bool: True if the model is from Anthropic, False otherwise.
+        """
+        ANTHROPIC_PREFIXES = ("anthropic/", "claude-", "claude/")
+        return any(prefix in model.lower() for prefix in ANTHROPIC_PREFIXES)
+
    def call(
        self,
        messages: Union[str, List[Dict[str, str]]],
        tools: Optional[List[dict]] = None,
        callbacks: Optional[List[Any]] = None,
        available_functions: Optional[Dict[str, Any]] = None,
-    ) -> str:
-        """
-        High-level llm call method that:
-          1) Accepts either a string or a list of messages
-          2) Converts string input to the required message format
-          3) Calls litellm.completion
-          4) Handles function/tool calls if any
-          5) Returns the final text response or tool result
+    ) -> Union[str, Any]:
+        """High-level LLM call method.

-        Parameters:
-        - messages (Union[str, List[Dict[str, str]]]): The input messages for the LLM.
-          - If a string is provided, it will be converted into a message list with a single entry.
-          - If a list of dictionaries is provided, each dictionary should have 'role' and 'content' keys.
-        - tools (Optional[List[dict]]): A list of tool schemas for function calling.
-        - callbacks (Optional[List[Any]]): A list of callback functions to be executed.
-        - available_functions (Optional[Dict[str, Any]]): A dictionary mapping function names to actual Python functions.
+        Args:
+            messages: Input messages for the LLM.
+                     Can be a string or list of message dictionaries.
+                     If string, it will be converted to a single user message.
+                     If list, each dict must have 'role' and 'content' keys.
+            tools: Optional list of tool schemas for function calling.
+                  Each tool should define its name, description, and parameters.
+            callbacks: Optional list of callback functions to be executed
+                      during and after the LLM call.
+            available_functions: Optional dict mapping function names to callables
+                               that can be invoked by the LLM.

        Returns:
-        - str: The final text response from the LLM or the result of a tool function call.
+            Union[str, Any]: Either a text response from the LLM (str) or
+                           the result of a tool function call (Any).
+
+        Raises:
+            TypeError: If messages format is invalid
+            ValueError: If response format is not supported
+            LLMContextLengthExceededException: If input exceeds model's context limit

        Examples:
-        ---------
-        # Example 1: Using a string input
-        response = llm.call("Return the name of a random city in the world.")
-        print(response)
+            # Example 1: Simple string input
+            >>> response = llm.call("Return the name of a random city.")
+            >>> print(response)
+            "Paris"

-        # Example 2: Using a list of messages
-        messages = [{"role": "user", "content": "What is the capital of France?"}]
-        response = llm.call(messages)
-        print(response)
+            # Example 2: Message list with system and user messages
+            >>> messages = [
+            ...     {"role": "system", "content": "You are a geography expert"},
+            ...     {"role": "user", "content": "What is France's capital?"}
+            ... ]
+            >>> response = llm.call(messages)
+            >>> print(response)
+            "The capital of France is Paris."
        """
+        # Validate parameters before proceeding with the call.
+        self._validate_call_params()
+
        if isinstance(messages, str):
            messages = [{"role": "user", "content": messages}]

+        # For O1 models, system messages are not supported.
+        # Convert any system messages into assistant messages.
+        if "o1" in self.model.lower():
+            for message in messages:
+                if message.get("role") == "system":
+                    message["role"] = "assistant"
+
        with suppress_warnings():
            if callbacks and len(callbacks) > 0:
                self.set_callbacks(callbacks)

            try:
-                # --- 1) Prepare the parameters for the completion call
+                # --- 1) Format messages according to provider requirements
+                formatted_messages = self._format_messages_for_provider(messages)
+
+                # --- 2) Prepare the parameters for the completion call
                params = {
                    "model": self.model,
-                    "messages": messages,
+                    "messages": formatted_messages,
                    "timeout": self.timeout,
                    "temperature": self.temperature,
                    "top_p": self.top_p,
@@ -232,18 +297,21 @@ class LLM:
                    "seed": self.seed,
                    "logprobs": self.logprobs,
                    "top_logprobs": self.top_logprobs,
-                    "api_base": self.base_url,
+                    "api_base": self.api_base,
+                    "base_url": self.base_url,
                    "api_version": self.api_version,
                    "api_key": self.api_key,
                    "stream": False,
                    "tools": tools,
+                    "reasoning_effort": self.reasoning_effort,
+                    **self.additional_params,
                }

                # Remove None values from params
                params = {k: v for k, v in params.items() if v is not None}

                # --- 2) Make the completion call
-                response = litellm.completion(**params)
+                response = self._call_llm(params)
                response_message = cast(Choices, cast(ModelResponse, response).choices)[
                    0
                ].message
@@ -270,7 +338,7 @@ class LLM:
                # --- 5) Handle the tool call
                tool_call = tool_calls[0]
                function_name = tool_call.function.name
-
+                logging.debug("function_name: %s", function_name)
                if function_name in available_functions:
                    try:
                        function_args = json.loads(tool_call.function.arguments)
@@ -288,6 +356,15 @@ class LLM:
                        logging.error(
                            f"Error executing function '{function_name}': {e}"
                        )
+                        crewai_event_bus.emit(
+                            self,
+                            event=ToolExecutionErrorEvent(
+                                tool_name=function_name,
+                                tool_args=function_args,
+                                tool_class=fn,
+                                error=str(e),
+                            ),
+                        )
                        return text_response

                else:
@@ -303,10 +380,76 @@ class LLM:
                    logging.error(f"LiteLLM call failed: {str(e)}")
                raise

+    def _format_messages_for_provider(
+        self, messages: List[Dict[str, str]]
+    ) -> List[Dict[str, str]]:
+        """Format messages according to provider requirements.
+
+        Args:
+            messages: List of message dictionaries with 'role' and 'content' keys.
+                     May be empty, but must not be None.
+
+        Returns:
+            List of formatted messages according to provider requirements.
+            For Anthropic models, ensures first message has 'user' role.
+
+        Raises:
+            TypeError: If messages is None or contains invalid message format.
+        """
+        if messages is None:
+            raise TypeError("Messages cannot be None")
+
+        # Validate message format first
+        for msg in messages:
+            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
+                raise TypeError(
+                    "Invalid message format. Each message must be a dict with 'role' and 'content' keys"
+                )
+
+        if not self.is_anthropic:
+            return messages
+
+        # Anthropic requires messages to start with 'user' role
+        if not messages or messages[0]["role"] == "system":
+            # If the list is empty or starts with a system message, prepend a placeholder user message
+            return [{"role": "user", "content": "."}, *messages]
+
+        return messages
+
+    def _get_custom_llm_provider(self) -> str:
+        """
+        Derives the custom_llm_provider from the model string.
+        - For example, if the model is "openrouter/deepseek/deepseek-chat", returns "openrouter".
+        - If the model is "gemini/gemini-1.5-pro", returns "gemini".
+        - If there is no '/', defaults to "openai".
+        """
+        if "/" in self.model:
+            return self.model.split("/")[0]
+        return "openai"
+
+    def _validate_call_params(self) -> None:
+        """
+        Validate parameters before making a call. Currently this only checks if
+        a response_format is provided and whether the model supports it.
+        The custom_llm_provider is dynamically determined from the model:
+          - E.g., "openrouter/deepseek/deepseek-chat" yields "openrouter"
+          - "gemini/gemini-1.5-pro" yields "gemini"
+          - If no slash is present, "openai" is assumed.
+        """
+        provider = self._get_custom_llm_provider()
+        if self.response_format is not None and not supports_response_schema(
+            model=self.model,
+            custom_llm_provider=provider,
+        ):
+            raise ValueError(
+                f"The model {self.model} does not support response_format for provider '{provider}'. "
+                "Please remove response_format or use a supported model."
+            )
+
    def supports_function_calling(self) -> bool:
        try:
            params = get_supported_openai_params(model=self.model)
-            return "response_format" in params
+            return params is not None and "tools" in params
        except Exception as e:
            logging.error(f"Failed to get supported params: {str(e)}")
            return False
@@ -314,7 +457,7 @@ class LLM:
    def supports_stop_words(self) -> bool:
        try:
            params = get_supported_openai_params(model=self.model)
-            return "stop" in params
+            return params is not None and "stop" in params
        except Exception as e:
            logging.error(f"Failed to get supported params: {str(e)}")
            return False
@@ -388,3 +531,95 @@ class LLM:

                litellm.success_callback = success_callbacks
                litellm.failure_callback = failure_callbacks
+
+    def _get_execution_context(self) -> Tuple[Optional[Any], Optional[Any]]:
+        """Get the agent and task from the execution context.
+
+        Returns:
+            tuple: (agent, task) from any AgentExecutor context, or (None, None) if not found
+        """
+        frame = inspect.currentframe()
+        caller_frame = frame.f_back if frame else None
+        agent = None
+        task = None
+
+        # Add a maximum depth to prevent infinite loops
+        max_depth = 100  # Reasonable limit for call stack depth
+        current_depth = 0
+
+        while caller_frame and current_depth < max_depth:
+            if "self" in caller_frame.f_locals:
+                caller_self = caller_frame.f_locals["self"]
+                if isinstance(caller_self, AgentExecutorProtocol):
+                    agent = caller_self.agent
+                    task = caller_self.task
+                    break
+            caller_frame = caller_frame.f_back
+            current_depth += 1
+
+        return agent, task
+
+    def _get_new_messages(self, messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
+        """Get only the new messages that haven't been processed before."""
+        if not hasattr(self, "_message_history"):
+            self._message_history = []
+
+        new_messages = []
+        for message in messages:
+            message_key = (message["role"], message["content"])
+            if message_key not in [
+                (m["role"], m["content"]) for m in self._message_history
+            ]:
+                new_messages.append(message)
+                self._message_history.append(message)
+        return new_messages
+
+    def _get_new_tool_results(self, agent) -> List[Dict]:
+        """Get only the new tool results that haven't been processed before."""
+        if not agent or not agent.tools_results:
+            return []
+
+        if not hasattr(self, "_tool_results_history"):
+            self._tool_results_history: List[Dict] = []
+
+        new_tool_results = []
+
+        for result in agent.tools_results:
+            # Process tool arguments to extract actual values
+            processed_args = {}
+            if isinstance(result["tool_args"], dict):
+                for key, value in result["tool_args"].items():
+                    if isinstance(value, dict) and "type" in value:
+                        # Skip schema-metadata entries (dicts with a "type" key); keep plain values
+                        continue
+                    processed_args[key] = value
+
+            # Create a clean result with processed arguments
+            clean_result = {
+                "tool_name": result["tool_name"],
+                "tool_args": processed_args,
+                "result": result["result"],
+                "content": result.get("content", ""),
+                "start_time": result.get("start_time", ""),
+            }
+
+            # Check if this exact tool execution exists in history
+            is_duplicate = False
+            for history_result in self._tool_results_history:
+                if (
+                    clean_result["tool_name"] == history_result["tool_name"]
+                    and str(clean_result["tool_args"])
+                    == str(history_result["tool_args"])
+                    and str(clean_result["result"]) == str(history_result["result"])
+                    and clean_result["content"] == history_result.get("content", "")
+                    and clean_result["start_time"]
+                    == history_result.get("start_time", "")
+                ):
+                    is_duplicate = True
+                    break
+
+            if not is_duplicate:
+                new_tool_results.append(clean_result)
+                self._tool_results_history.append(clean_result)
+
+        return new_tool_results
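
Note on the `llm.py` changes above: the provider-derivation and message-formatting helpers can be exercised in isolation. The following is a minimal sketch, assuming standalone functions (`get_custom_llm_provider` and `format_messages_for_provider` are hypothetical names, and `is_anthropic` is passed as a plain argument rather than read from `self`); it mirrors the logic in the diff and is not the `LLM` class itself.

```python
from typing import Dict, List


def get_custom_llm_provider(model: str) -> str:
    """Return the prefix before the first '/', or "openai" if there is none."""
    return model.split("/")[0] if "/" in model else "openai"


def format_messages_for_provider(
    messages: List[Dict[str, str]], is_anthropic: bool
) -> List[Dict[str, str]]:
    """Hypothetical free-function mirror of LLM._format_messages_for_provider."""
    if messages is None:
        raise TypeError("Messages cannot be None")

    # Validate every message before applying provider-specific rules.
    for msg in messages:
        if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
            raise TypeError(
                "Invalid message format. Each message must be a dict "
                "with 'role' and 'content' keys"
            )

    if not is_anthropic:
        return messages

    # An empty list or a leading system message gets a placeholder user
    # message prepended, as in the diff.
    if not messages or messages[0]["role"] == "system":
        return [{"role": "user", "content": "."}, *messages]

    return messages


if __name__ == "__main__":
    print(get_custom_llm_provider("openrouter/deepseek/deepseek-chat"))
    print(format_messages_for_provider([{"role": "system", "content": "Be brief."}], True))
```

The input list is never mutated: the Anthropic branch builds a new list, so callers can keep reusing their original messages.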
--- a/src/crewai/memory/entity/entity_memory.py
+++ b/src/crewai/memory/entity/entity_memory.py
@@ -1,3 +1,7 @@
+from typing import Optional
+
+from pydantic import PrivateAttr
+
 from crewai.memory.entity.entity_memory_item import EntityMemoryItem
 from crewai.memory.memory import Memory
 from crewai.memory.storage.rag_storage import RAGStorage
@@ -10,13 +14,15 @@ class EntityMemory(Memory):
    Inherits from the Memory class.
    """

-    def __init__(self, crew=None, embedder_config=None, storage=None, path=None):
-        if hasattr(crew, "memory_config") and crew.memory_config is not None:
-            self.memory_provider = crew.memory_config.get("provider")
-        else:
-            self.memory_provider = None
+    _memory_provider: Optional[str] = PrivateAttr()

-        if self.memory_provider == "mem0":
+    def __init__(self, crew=None, embedder_config=None, storage=None, path=None):
+        if crew and hasattr(crew, "memory_config") and crew.memory_config is not None:
+            memory_provider = crew.memory_config.get("provider")
+        else:
+            memory_provider = None
+
+        if memory_provider == "mem0":
            try:
                from crewai.memory.storage.mem0_storage import Mem0Storage
            except ImportError:
@@ -36,11 +42,13 @@ class EntityMemory(Memory):
                    path=path,
                )
            )
-        super().__init__(storage)
+
+        super().__init__(storage=storage)
+        self._memory_provider = memory_provider

    def save(self, item: EntityMemoryItem) -> None:  # type: ignore # BUG?: Signature of "save" incompatible with supertype "Memory"
        """Saves an entity item into the SQLite storage."""
-        if self.memory_provider == "mem0":
+        if self._memory_provider == "mem0":
            data = f"""
            Remember details about the following entity:
            Name: {item.name}
--- a/src/crewai/memory/long_term/long_term_memory.py
+++ b/src/crewai/memory/long_term/long_term_memory.py
@@ -17,7 +17,7 @@ class LongTermMemory(Memory):
    def __init__(self, storage=None, path=None):
        if not storage:
            storage = LTMSQLiteStorage(db_path=path) if path else LTMSQLiteStorage()
-        super().__init__(storage)
+        super().__init__(storage=storage)

    def save(self, item: LongTermMemoryItem) -> None:  # type: ignore # BUG?: Signature of "save" incompatible with supertype "Memory"
        metadata = item.metadata
--- a/src/crewai/memory/memory.py
+++ b/src/crewai/memory/memory.py
@@ -1,15 +1,19 @@
 from typing import Any, Dict, List, Optional

-from crewai.memory.storage.rag_storage import RAGStorage
+from pydantic import BaseModel


-class Memory:
+class Memory(BaseModel):
    """
    Base class for memory, now supporting agent tags and generic metadata.
    """

-    def __init__(self, storage: RAGStorage):
-        self.storage = storage
+    embedder_config: Optional[Dict[str, Any]] = None
+
+    storage: Any
+
+    def __init__(self, storage: Any, **data: Any):
+        super().__init__(storage=storage, **data)

    def save(
        self,
--- a/src/crewai/memory/short_term/short_term_memory.py
+++ b/src/crewai/memory/short_term/short_term_memory.py
@@ -1,5 +1,7 @@
 from typing import Any, Dict, Optional

+from pydantic import PrivateAttr
+
 from crewai.memory.memory import Memory
 from crewai.memory.short_term.short_term_memory_item import ShortTermMemoryItem
 from crewai.memory.storage.rag_storage import RAGStorage
@@ -14,13 +16,15 @@ class ShortTermMemory(Memory):
    MemoryItem instances.
    """

-    def __init__(self, crew=None, embedder_config=None, storage=None, path=None):
-        if hasattr(crew, "memory_config") and crew.memory_config is not None:
-            self.memory_provider = crew.memory_config.get("provider")
-        else:
-            self.memory_provider = None
+    _memory_provider: Optional[str] = PrivateAttr()

-        if self.memory_provider == "mem0":
+    def __init__(self, crew=None, embedder_config=None, storage=None, path=None):
+        if crew and hasattr(crew, "memory_config") and crew.memory_config is not None:
+            memory_provider = crew.memory_config.get("provider")
+        else:
+            memory_provider = None
+
+        if memory_provider == "mem0":
            try:
                from crewai.memory.storage.mem0_storage import Mem0Storage
            except ImportError:
@@ -39,7 +43,8 @@ class ShortTermMemory(Memory):
                    path=path,
                )
            )
-        super().__init__(storage)
+        super().__init__(storage=storage)
+        self._memory_provider = memory_provider

    def save(
        self,
@@ -48,7 +53,7 @@ class ShortTermMemory(Memory):
        agent: Optional[str] = None,
    ) -> None:
        item = ShortTermMemoryItem(data=value, metadata=metadata, agent=agent)
-        if self.memory_provider == "mem0":
+        if self._memory_provider == "mem0":
            item.data = f"Remember the following insights from Agent run: {item.data}"

        super().save(value=item.data, metadata=item.metadata, agent=item.agent)
--- a/src/crewai/memory/storage/base_rag_storage.py
+++ b/src/crewai/memory/storage/base_rag_storage.py
@@ -13,7 +13,7 @@ class BaseRAGStorage(ABC):
        self,
        type: str,
        allow_reset: bool = True,
-        embedder_config: Optional[Any] = None,
+        embedder_config: Optional[Dict[str, Any]] = None,
        crew: Any = None,
    ):
        self.type = type
--- a/src/crewai/task.py
+++ b/src/crewai/task.py
@@ -21,7 +21,6 @@ from typing import (
    Union,
 )

-from opentelemetry.trace import Span
 from pydantic import (
    UUID4,
    BaseModel,
@@ -36,10 +35,15 @@ from crewai.agents.agent_builder.base_agent import BaseAgent
 from crewai.tasks.guardrail_result import GuardrailResult
 from crewai.tasks.output_format import OutputFormat
 from crewai.tasks.task_output import TaskOutput
-from crewai.telemetry.telemetry import Telemetry
 from crewai.tools.base_tool import BaseTool
 from crewai.utilities.config import process_config
 from crewai.utilities.converter import Converter, convert_to_model
+from crewai.utilities.events import (
+    TaskCompletedEvent,
+    TaskFailedEvent,
+    TaskStartedEvent,
+)
+from crewai.utilities.events.crewai_event_bus import crewai_event_bus
 from crewai.utilities.i18n import I18N
 from crewai.utilities.printer import Printer

@@ -183,8 +187,6 @@ class Task(BaseModel):
                    )
        return v

-    _telemetry: Telemetry = PrivateAttr(default_factory=Telemetry)
-    _execution_span: Optional[Span] = PrivateAttr(default=None)
    _original_description: Optional[str] = PrivateAttr(default=None)
    _original_expected_output: Optional[str] = PrivateAttr(default=None)
    _original_output_file: Optional[str] = PrivateAttr(default=None)
@@ -348,94 +350,102 @@ class Task(BaseModel):
        tools: Optional[List[Any]],
    ) -> TaskOutput:
        """Run the core execution logic of the task."""
-        agent = agent or self.agent
-        self.agent = agent
-        if not agent:
-            raise Exception(
-                f"The task '{self.description}' has no agent assigned, therefore it can't be executed directly and should be executed in a Crew using a specific process that support that, like hierarchical."
+        try:
+            agent = agent or self.agent
+            self.agent = agent
+            if not agent:
+                raise Exception(
+                    f"The task '{self.description}' has no agent assigned, therefore it can't be executed directly and should be executed in a Crew using a specific process that support that, like hierarchical."
+                )
+
+            self.start_time = datetime.datetime.now()
+
+            self.prompt_context = context
+            tools = tools or self.tools or []
+
+            self.processed_by_agents.add(agent.role)
+            crewai_event_bus.emit(self, TaskStartedEvent(context=context))
+            result = agent.execute_task(
+                task=self,
+                context=context,
+                tools=tools,
            )

-        self.start_time = datetime.datetime.now()
-        self._execution_span = self._telemetry.task_started(crew=agent.crew, task=self)
+            pydantic_output, json_output = self._export_output(result)
+            task_output = TaskOutput(
+                name=self.name,
+                description=self.description,
+                expected_output=self.expected_output,
+                raw=result,
+                pydantic=pydantic_output,
+                json_dict=json_output,
+                agent=agent.role,
+                output_format=self._get_output_format(),
+            )

-        self.prompt_context = context
-        tools = tools or self.tools or []
+            if self.guardrail:
+                guardrail_result = GuardrailResult.from_tuple(
+                    self.guardrail(task_output)
+                )
+                if not guardrail_result.success:
+                    if self.retry_count >= self.max_retries:
+                        raise Exception(
+                            f"Task failed guardrail validation after {self.max_retries} retries. "
+                            f"Last error: {guardrail_result.error}"
+                        )

-        self.processed_by_agents.add(agent.role)
+                    self.retry_count += 1
+                    context = self.i18n.errors("validation_error").format(
+                        guardrail_result_error=guardrail_result.error,
+                        task_output=task_output.raw,
+                    )
+                    printer = Printer()
+                    printer.print(
+                        content=f"Guardrail blocked, retrying, due to: {guardrail_result.error}\n",
+                        color="yellow",
+                    )
+                    return self._execute_core(agent, context, tools)

-        result = agent.execute_task(
-            task=self,
-            context=context,
-            tools=tools,
-        )
-
-        pydantic_output, json_output = self._export_output(result)
-        task_output = TaskOutput(
-            name=self.name,
-            description=self.description,
-            expected_output=self.expected_output,
-            raw=result,
-            pydantic=pydantic_output,
-            json_dict=json_output,
-            agent=agent.role,
-            output_format=self._get_output_format(),
-        )
-
-        if self.guardrail:
-            guardrail_result = GuardrailResult.from_tuple(self.guardrail(task_output))
-            if not guardrail_result.success:
-                if self.retry_count >= self.max_retries:
+                if guardrail_result.result is None:
                    raise Exception(
-                        f"Task failed guardrail validation after {self.max_retries} retries. "
-                        f"Last error: {guardrail_result.error}"
+                        "Task guardrail returned None as result. This is not allowed."
                    )

-                self.retry_count += 1
-                context = self.i18n.errors("validation_error").format(
-                    guardrail_result_error=guardrail_result.error,
-                    task_output=task_output.raw,
+                if isinstance(guardrail_result.result, str):
+                    task_output.raw = guardrail_result.result
+                    pydantic_output, json_output = self._export_output(
+                        guardrail_result.result
+                    )
+                    task_output.pydantic = pydantic_output
+                    task_output.json_dict = json_output
+                elif isinstance(guardrail_result.result, TaskOutput):
+                    task_output = guardrail_result.result
+
+            self.output = task_output
+            self.end_time = datetime.datetime.now()
+
+            if self.callback:
+                self.callback(self.output)
+
+            crew = self.agent.crew  # type: ignore[union-attr]
+            if crew and crew.task_callback and crew.task_callback != self.callback:
+                crew.task_callback(self.output)
+
+            if self.output_file:
+                content = (
+                    json_output
+                    if json_output
+                    else pydantic_output.model_dump_json()
+                    if pydantic_output
+                    else result
                )
-                printer = Printer()
-                printer.print(
-                    content=f"Guardrail blocked, retrying, due to: {guardrail_result.error}\n",
-                    color="yellow",
-                )
-                return self._execute_core(agent, context, tools)
-
-            if guardrail_result.result is None:
-                raise Exception(
-                    "Task guardrail returned None as result. This is not allowed."
-                )
-
-            if isinstance(guardrail_result.result, str):
-                task_output.raw = guardrail_result.result
-                pydantic_output, json_output = self._export_output(
-                    guardrail_result.result
-                )
-                task_output.pydantic = pydantic_output
-                task_output.json_dict = json_output
-            elif isinstance(guardrail_result.result, TaskOutput):
-                task_output = guardrail_result.result
-
-        self.output = task_output
-        self.end_time = datetime.datetime.now()
-
-        if self.callback:
-            self.callback(self.output)
-
-        if self._execution_span:
-            self._telemetry.task_ended(self._execution_span, self, agent.crew)
-            self._execution_span = None
-
-        if self.output_file:
-            content = (
-                json_output
-                if json_output
-                else pydantic_output.model_dump_json() if pydantic_output else result
-            )
-            self._save_file(content)
-
-        return task_output
+                self._save_file(content)
+            crewai_event_bus.emit(self, TaskCompletedEvent(output=task_output))
+            return task_output
+        except Exception as e:
+            self.end_time = datetime.datetime.now()
+            crewai_event_bus.emit(self, TaskFailedEvent(error=str(e)))
+            raise  # Re-raise the exception after emitting the event

    def prompt(self) -> str:
        """Prompt the task.
@@ -452,7 +462,7 @@ class Task(BaseModel):
        return "\n".join(tasks_slices)

    def interpolate_inputs_and_add_conversation_history(
-        self, inputs: Dict[str, Union[str, int, float]]
+        self, inputs: Dict[str, Union[str, int, float, Dict[str, Any], List[Any]]]
    ) -> None:
        """Interpolate inputs into the task description, expected output, and output file path.
           Add conversation history if present.
@@ -524,7 +534,9 @@ class Task(BaseModel):
            )

    def interpolate_only(
-        self, input_string: Optional[str], inputs: Dict[str, Union[str, int, float]]
+        self,
+        input_string: Optional[str],
+        inputs: Dict[str, Union[str, int, float, Dict[str, Any], List[Any]]],
    ) -> str:
        """Interpolate placeholders (e.g., {key}) in a string while leaving JSON untouched.

@@ -532,17 +544,39 @@ class Task(BaseModel):
            input_string: The string containing template variables to interpolate.
                         Can be None or empty, in which case an empty string is returned.
            inputs: Dictionary mapping template variables to their values.
-                   Supported value types are strings, integers, and floats.
-                   If input_string is empty or has no placeholders, inputs can be empty.
+                   Supported value types are strings, integers, floats, booleans, None,
+                   and dicts/lists containing only these types and other nested dicts/lists.

        Returns:
            The interpolated string with all template variables replaced with their values.
            Empty string if input_string is None or empty.

        Raises:
-            ValueError: If a required template variable is missing from inputs.
-            KeyError: If a template variable is not found in the inputs dictionary.
+            ValueError: If a value contains unsupported types
        """
+
+        # Validation function for recursive type checking
+        def validate_type(value: Any) -> None:
+            if value is None:
+                return
+            if isinstance(value, (str, int, float, bool)):
+                return
+            if isinstance(value, (dict, list)):
+                for item in value.values() if isinstance(value, dict) else value:
+                    validate_type(item)
+                return
+            raise ValueError(
+                f"Unsupported type {type(value).__name__} in inputs. "
+                "Only str, int, float, bool, dict, and list are allowed."
+            )
+
+        # Validate all input values
+        for key, value in inputs.items():
+            try:
+                validate_type(value)
+            except ValueError as e:
+                raise ValueError(f"Invalid value for key '{key}': {str(e)}") from e
+
        if input_string is None or not input_string:
            return ""
        if "{" not in input_string and "}" not in input_string:
@@ -551,15 +585,7 @@ class Task(BaseModel):
            raise ValueError(
                "Inputs dictionary cannot be empty when interpolating variables"
            )
-
        try:
-            # Validate input types
-            for key, value in inputs.items():
-                if not isinstance(value, (str, int, float)):
-                    raise ValueError(
-                        f"Value for key '{key}' must be a string, integer, or float, got {type(value).__name__}"
-                    )
-
            escaped_string = input_string.replace("{", "{{").replace("}", "}}")

            for key in inputs.keys():
@@ -652,19 +678,32 @@ class Task(BaseModel):
            return OutputFormat.PYDANTIC
        return OutputFormat.RAW

-    def _save_file(self, result: Any) -> None:
+    def _save_file(self, result: Union[Dict, str, Any]) -> None:
        """Save task output to a file.

+        Note:
+            For cross-platform file writing, especially on Windows, consider using FileWriterTool
+            from the crewai_tools package:
+                pip install 'crewai[tools]'
+                from crewai_tools import FileWriterTool
+
        Args:
            result: The result to save to the file. Can be a dict or any stringifiable object.

        Raises:
            ValueError: If output_file is not set
-            RuntimeError: If there is an error writing to the file
+            RuntimeError: If there is an error writing to the file. For cross-platform
+                compatibility, especially on Windows, use FileWriterTool from crewai_tools
+                package.
        """
        if self.output_file is None:
            raise ValueError("output_file is not set.")

+        FILEWRITER_RECOMMENDATION = (
+            "For cross-platform file writing, especially on Windows, "
+            "use FileWriterTool from crewai_tools package."
+        )
+
        try:
            resolved_path = Path(self.output_file).expanduser().resolve()
            directory = resolved_path.parent
@@ -680,7 +719,11 @@ class Task(BaseModel):
                else:
                    file.write(str(result))
        except (OSError, IOError) as e:
-            raise RuntimeError(f"Failed to save output file: {e}")
+            raise RuntimeError(
+                "\n".join(
+                    [f"Failed to save output file: {e}", FILEWRITER_RECOMMENDATION]
+                )
+            ) from e
        return None

    def __repr__(self):
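
Note on the `task.py` changes above: the recursive input validation added to `interpolate_only` can be sketched on its own. This is a minimal sketch, assuming module-level functions; `validate_inputs` is a hypothetical wrapper name for the per-key loop, which in the diff sits inline in the method.

```python
from typing import Any, Dict


def validate_type(value: Any) -> None:
    """Accept None, scalars, and arbitrarily nested dicts/lists of them."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return
    if isinstance(value, (dict, list)):
        # For dicts only the values are checked, as in the diff.
        items = value.values() if isinstance(value, dict) else value
        for item in items:
            validate_type(item)
        return
    raise ValueError(
        f"Unsupported type {type(value).__name__} in inputs. "
        "Only str, int, float, bool, dict, and list are allowed."
    )


def validate_inputs(inputs: Dict[str, Any]) -> None:
    """Hypothetical wrapper: report which top-level key holds the bad value."""
    for key, value in inputs.items():
        try:
            validate_type(value)
        except ValueError as e:
            raise ValueError(f"Invalid value for key '{key}': {e}") from e


if __name__ == "__main__":
    validate_inputs({"topic": "AI", "limits": {"max": 3, "tags": ["a", None]}})
    try:
        validate_inputs({"bad": {"nested": {1, 2}}})
    except ValueError as err:
        print(err)
```

The error names only the top-level key, not the nested path, so a deeply nested offending value still has to be located by hand.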
--- a/src/crewai/tools/agent_tools/add_image_tool.py
+++ b/src/crewai/tools/agent_tools/add_image_tool.py
@@ -7,11 +7,11 @@ from crewai.utilities import I18N

 i18n = I18N()

+
 class AddImageToolSchema(BaseModel):
    image_url: str = Field(..., description="The URL or path of the image to add")
    action: Optional[str] = Field(
-        default=None,
-        description="Optional context or question about the image"
+        default=None, description="Optional context or question about the image"
    )


@@ -36,10 +36,7 @@ class AddImageTool(BaseTool):
                "image_url": {
                    "url": image_url,
                },
-            }
+            },
        ]

-        return {
-            "role": "user",
-            "content": content
-        }
+        return {"role": "user", "content": content}
--- a/src/crewai/tools/tool_usage.py
+++ b/src/crewai/tools/tool_usage.py
@@ -2,6 +2,7 @@ import ast
 import datetime
 import json
 import time
+from datetime import UTC
 from difflib import SequenceMatcher
 from json import JSONDecodeError
 from textwrap import dedent
@@ -10,20 +11,21 @@ from typing import Any, Dict, List, Optional, Union
 import json5
 from json_repair import repair_json

-import crewai.utilities.events as events
 from crewai.agents.tools_handler import ToolsHandler
 from crewai.task import Task
 from crewai.telemetry import Telemetry
 from crewai.tools import BaseTool
 from crewai.tools.structured_tool import CrewStructuredTool
 from crewai.tools.tool_calling import InstructorToolCalling, ToolCalling
-from crewai.tools.tool_usage_events import ToolUsageError, ToolUsageFinished
 from crewai.utilities import I18N, Converter, ConverterError, Printer
+from crewai.utilities.events.crewai_event_bus import crewai_event_bus
+from crewai.utilities.events.tool_usage_events import (
+    ToolSelectionErrorEvent,
+    ToolUsageErrorEvent,
+    ToolUsageFinishedEvent,
+    ToolValidateInputErrorEvent,
+)

-try:
-    import agentops  # type: ignore
-except ImportError:
-    agentops = None
 OPENAI_BIGGER_MODELS = [
    "gpt-4",
    "gpt-4o",
@@ -116,7 +118,10 @@ class ToolUsage:
                self._printer.print(content=f"\n\n{error}\n", color="red")
            return error

-        if isinstance(tool, CrewStructuredTool) and tool.name == self._i18n.tools("add_image")["name"]:  # type: ignore
+        if (
+            isinstance(tool, CrewStructuredTool)
+            and tool.name == self._i18n.tools("add_image")["name"]  # type: ignore
+        ):
            try:
                result = self._use(tool_string=tool_string, tool=tool, calling=calling)
                return result
@@ -136,7 +141,6 @@ class ToolUsage:
        tool: Any,
        calling: Union[ToolCalling, InstructorToolCalling],
    ) -> str:  # TODO: Fix this return type
-        tool_event = agentops.ToolEvent(name=calling.tool_name) if agentops else None  # type: ignore
        if self._check_tool_repeated_usage(calling=calling):  # type: ignore # _check_tool_repeated_usage of "ToolUsage" does not return a value (it only ever returns None)
            try:
                result = self._i18n.errors("task_repeated_usage").format(
@@ -154,6 +158,7 @@ class ToolUsage:
                self.task.increment_tools_errors()

        started_at = time.time()
+        started_at_trace = datetime.datetime.now(UTC)
        from_cache = False

        result = None  # type: ignore # Incompatible types in assignment (expression has type "None", variable has type "str")
@@ -181,7 +186,9 @@ class ToolUsage:

                if calling.arguments:
                    try:
-                        acceptable_args = tool.args_schema.model_json_schema()["properties"].keys()  # type: ignore
+                        acceptable_args = tool.args_schema.model_json_schema()[
+                            "properties"
+                        ].keys()  # type: ignore
                        arguments = {
                            k: v
                            for k, v in calling.arguments.items()
@@ -202,7 +209,7 @@ class ToolUsage:
                        error=e, tool=tool.name, tool_inputs=tool.description
                    )
                    error = ToolUsageErrorException(
-                        f'\n{error_message}.\nMoving on then. {self._i18n.slice("format").format(tool_names=self.tools_names)}'
+                        f"\n{error_message}.\nMoving on then. {self._i18n.slice('format').format(tool_names=self.tools_names)}"
                    ).message
                    self.task.increment_tools_errors()
                    if self.agent.verbose:
@@ -212,10 +219,6 @@ class ToolUsage:
                    return error  # type: ignore # No return value expected

                self.task.increment_tools_errors()
-                if agentops:
-                    agentops.record(
-                        agentops.ErrorEvent(exception=e, trigger_event=tool_event)
-                    )
                return self.use(calling=calling, tool_string=tool_string)  # type: ignore # No return value expected

            if self.tools_handler:
@@ -231,9 +234,6 @@ class ToolUsage:
                self.tools_handler.on_tool_use(
                    calling=calling, output=result, should_cache=should_cache
                )
-
-        if agentops:
-            agentops.record(tool_event)
        self._telemetry.tool_usage(
            llm=self.function_calling_llm,
            tool_name=tool.name,
@@ -244,6 +244,7 @@ class ToolUsage:
            "result": result,
            "tool_name": tool.name,
            "tool_args": calling.arguments,
+            "start_time": started_at_trace,
        }

        self.on_tool_use_finished(
@@ -308,14 +309,33 @@ class ToolUsage:
            ):
                return tool
        self.task.increment_tools_errors()
+        tool_selection_data = {
+            "agent_key": self.agent.key,
+            "agent_role": self.agent.role,
+            "tool_name": tool_name,
+            "tool_args": {},
+            "tool_class": self.tools_description,
+        }
        if tool_name and tool_name != "":
-            raise Exception(
-                f"Action '{tool_name}' don't exist, these are the only available Actions:\n{self.tools_description}"
+            error = f"Action '{tool_name}' doesn't exist, these are the only available Actions:\n{self.tools_description}"
+            crewai_event_bus.emit(
+                self,
+                ToolSelectionErrorEvent(
+                    **tool_selection_data,
+                    error=error,
+                ),
            )
+            raise Exception(error)
        else:
-            raise Exception(
-                f"I forgot the Action name, these are the only available Actions: {self.tools_description}"
+            error = f"I forgot the Action name, these are the only available Actions: {self.tools_description}"
+            crewai_event_bus.emit(
+                self,
+                ToolSelectionErrorEvent(
+                    **tool_selection_data,
+                    error=error,
+                ),
            )
+            raise Exception(error)

    def _render(self) -> str:
        """Render the tool name and description in plain text."""
@@ -368,7 +388,7 @@ class ToolUsage:
                raise
            else:
                return ToolUsageErrorException(
-                    f'{self._i18n.errors("tool_arguments_error")}'
+                    f"{self._i18n.errors('tool_arguments_error')}"
                )

        if not isinstance(arguments, dict):
@@ -376,7 +396,7 @@ class ToolUsage:
                raise
            else:
                return ToolUsageErrorException(
-                    f'{self._i18n.errors("tool_arguments_error")}'
+                    f"{self._i18n.errors('tool_arguments_error')}"
                )

        return ToolCalling(
@@ -404,7 +424,7 @@ class ToolUsage:
                if self.agent.verbose:
                    self._printer.print(content=f"\n\n{e}\n", color="red")
                return ToolUsageErrorException(  # type: ignore # Incompatible return value type (got "ToolUsageErrorException", expected "ToolCalling | InstructorToolCalling")
-                    f'{self._i18n.errors("tool_usage_error").format(error=e)}\nMoving on then. {self._i18n.slice("format").format(tool_names=self.tools_names)}'
+                    f"{self._i18n.errors('tool_usage_error').format(error=e)}\nMoving on then. {self._i18n.slice('format').format(tool_names=self.tools_names)}"
                )
            return self._tool_calling(tool_string)

@@ -451,18 +471,33 @@ class ToolUsage:
            if isinstance(arguments, dict):
                return arguments
        except Exception as e:
-            self._printer.print(content=f"Failed to repair JSON: {e}", color="red")
+            error = f"Failed to repair JSON: {e}"
+            self._printer.print(content=error, color="red")

-        # If all parsing attempts fail, raise an error
-        raise Exception(
+        error_message = (
            "Tool input must be a valid dictionary in JSON or Python literal format"
        )
+        self._emit_validate_input_error(error_message)
+        # If all parsing attempts fail, raise an error
+        raise Exception(error_message)
+
+    def _emit_validate_input_error(self, final_error: str):
+        tool_selection_data = {
+            "agent_key": self.agent.key,
+            "agent_role": self.agent.role,
+            "tool_name": self.action.tool,
+            "tool_args": str(self.action.tool_input),
+            "tool_class": self.__class__.__name__,
+        }
+
+        crewai_event_bus.emit(
+            self,
+            ToolValidateInputErrorEvent(**tool_selection_data, error=final_error),
+        )

    def on_tool_error(self, tool: Any, tool_calling: ToolCalling, e: Exception) -> None:
        event_data = self._prepare_event_data(tool, tool_calling)
-        events.emit(
-            source=self, event=ToolUsageError(**{**event_data, "error": str(e)})
-        )
+        crewai_event_bus.emit(self, ToolUsageErrorEvent(**{**event_data, "error": e}))

    def on_tool_use_finished(
        self, tool: Any, tool_calling: ToolCalling, from_cache: bool, started_at: float
@@ -476,7 +511,7 @@ class ToolUsage:
                "from_cache": from_cache,
            }
        )
-        events.emit(source=self, event=ToolUsageFinished(**event_data))
+        crewai_event_bus.emit(self, ToolUsageFinishedEvent(**event_data))

    def _prepare_event_data(self, tool: Any, tool_calling: ToolCalling) -> dict:
        return {
--- a/src/crewai/tools/tool_usage_events.py
+++ b/src/crewai/tools/tool_usage_events.py
@@ -1,24 +0,0 @@
-from datetime import datetime
-from typing import Any, Dict
-
-from pydantic import BaseModel
-
-
-class ToolUsageEvent(BaseModel):
-    agent_key: str
-    agent_role: str
-    tool_name: str
-    tool_args: Dict[str, Any]
-    tool_class: str
-    run_attempts: int | None = None
-    delegations: int | None = None
-
-
-class ToolUsageFinished(ToolUsageEvent):
-    started_at: datetime
-    finished_at: datetime
-    from_cache: bool = False
-
-
-class ToolUsageError(ToolUsageEvent):
-    error: str
--- a/src/crewai/traces/__init__.py
+++ b/src/crewai/traces/__init__.py
--- a/src/crewai/traces/context.py
+++ b/src/crewai/traces/context.py
@@ -0,0 +1,39 @@
+from contextlib import contextmanager
+from contextvars import ContextVar
+from typing import Generator
+
+
+class TraceContext:
+    """Maintains the current trace context throughout the execution stack.
+
+    This class provides a context manager for tracking trace execution across
+    async and sync code paths using ContextVars.
+    """
+
+    _context: ContextVar = ContextVar("trace_context", default=None)
+
+    @classmethod
+    def get_current(cls):
+        """Get the current trace context.
+
+        Returns:
+            Optional[UnifiedTraceController]: The current trace controller or None if not set.
+        """
+        return cls._context.get()
+
+    @classmethod
+    @contextmanager
+    def set_current(cls, trace):
+        """Set the current trace context within a context manager.
+
+        Args:
+            trace: The trace controller to set as current.
+
+        Yields:
+            UnifiedTraceController: The current trace controller.
+        """
+        token = cls._context.set(trace)
+        try:
+            yield trace
+        finally:
+            cls._context.reset(token)
--- a/src/crewai/traces/enums.py
+++ b/src/crewai/traces/enums.py
@@ -0,0 +1,19 @@
+from enum import Enum
+
+
+class TraceType(Enum):
+    LLM_CALL = "llm_call"
+    TOOL_CALL = "tool_call"
+    FLOW_STEP = "flow_step"
+    START_CALL = "start_call"
+
+
+class RunType(Enum):
+    KICKOFF = "kickoff"
+    TRAIN = "train"
+    TEST = "test"
+
+
+class CrewType(Enum):
+    CREW = "crew"
+    FLOW = "flow"
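The enums above are stored on traces as plain strings via `.value`. A small sketch of that round trip, with `TraceType` copied from the hunk above and two hypothetical helpers (`to_wire`, `from_wire`) that are not part of the diff:

```python
from enum import Enum


class TraceType(Enum):  # copied from enums.py above
    LLM_CALL = "llm_call"
    TOOL_CALL = "tool_call"
    FLOW_STEP = "flow_step"
    START_CALL = "start_call"


def to_wire(trace_type: TraceType) -> str:
    """Serialize an enum member to the string stored on a trace."""
    return trace_type.value


def from_wire(raw: str) -> TraceType:
    """Look a member up by value; raises ValueError for unknown strings."""
    return TraceType(raw)
```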
--- a/src/crewai/traces/models.py
+++ b/src/crewai/traces/models.py
@@ -0,0 +1,89 @@
+from datetime import datetime
+from typing import Any, Dict, List, Optional
+
+from pydantic import BaseModel, Field
+
+
+class ToolCall(BaseModel):
+    """Model representing a tool call during execution"""
+
+    name: str
+    arguments: Dict[str, Any]
+    output: str
+    start_time: datetime
+    end_time: Optional[datetime] = None
+    latency_ms: Optional[int] = None
+    error: Optional[str] = None
+
+
+class LLMRequest(BaseModel):
+    """Model representing the LLM request details"""
+
+    model: str
+    messages: List[Dict[str, str]]
+    temperature: Optional[float] = None
+    max_tokens: Optional[int] = None
+    stop_sequences: Optional[List[str]] = None
+    additional_params: Dict[str, Any] = Field(default_factory=dict)
+
+
+class LLMResponse(BaseModel):
+    """Model representing the LLM response details"""
+
+    content: str
+    finish_reason: Optional[str] = None
+
+
+class FlowStepIO(BaseModel):
+    """Model representing flow step input/output details"""
+
+    function_name: str
+    inputs: Dict[str, Any] = Field(default_factory=dict)
+    outputs: Any
+    metadata: Dict[str, Any] = Field(default_factory=dict)
+
+
+class CrewTrace(BaseModel):
+    """Model for tracking detailed information about LLM interactions and Flow steps"""
+
+    deployment_instance_id: Optional[str] = Field(
+        description="ID of the deployment instance"
+    )
+    trace_id: str = Field(description="Unique identifier for this trace")
+    run_id: str = Field(description="Identifier for the execution run")
+    agent_role: Optional[str] = Field(description="Role of the agent")
+    task_id: Optional[str] = Field(description="ID of the current task being executed")
+    task_name: Optional[str] = Field(description="Name of the current task")
+    task_description: Optional[str] = Field(
+        description="Description of the current task"
+    )
+    trace_type: str = Field(description="Type of the trace")
+    crew_type: str = Field(description="Type of the crew")
+    run_type: str = Field(description="Type of the run")
+
+    # Timing information
+    start_time: Optional[datetime] = None
+    end_time: Optional[datetime] = None
+    latency_ms: Optional[int] = None
+
+    # Request/Response for LLM calls
+    request: Optional[LLMRequest] = None
+    response: Optional[LLMResponse] = None
+
+    # Input/Output for Flow steps
+    flow_step: Optional[FlowStepIO] = None
+
+    # Tool usage
+    tool_calls: List[ToolCall] = Field(default_factory=list)
+
+    # Metrics
+    tokens_used: Optional[int] = None
+    prompt_tokens: Optional[int] = None
+    completion_tokens: Optional[int] = None
+    cost: Optional[float] = None
+
+    # Additional metadata
+    status: str = "running"  # running, completed, error
+    error: Optional[str] = None
+    metadata: Dict[str, Any] = Field(default_factory=dict)
+    tags: List[str] = Field(default_factory=list)
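The timing fields on `CrewTrace` (`start_time`, `end_time`, `latency_ms`) are related by simple arithmetic. The sketch below uses a stdlib dataclass as a stand-in for the Pydantic model so it runs without dependencies; `TimingSketch` is a hypothetical name, and the formula mirrors the one in `UnifiedTraceController.to_crew_trace` further down this diff.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class TimingSketch:
    """Stand-in for the timing portion of CrewTrace (not the real model)."""

    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None

    def latency_ms(self) -> Optional[int]:
        # Latency is only defined once both endpoints are known.
        if self.start_time and self.end_time:
            delta = self.end_time - self.start_time
            return int(delta.total_seconds() * 1000)
        return None


start = datetime(2025, 2, 21, 10, 0, 0, tzinfo=timezone.utc)
finished = TimingSketch(start, start + timedelta(seconds=2))
running = TimingSketch(start_time=start)
```

A trace that is still running has no `end_time`, so its latency stays `None` rather than reporting a partial value.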
--- a/src/crewai/traces/unified_trace_controller.py
+++ b/src/crewai/traces/unified_trace_controller.py
@@ -0,0 +1,543 @@
+import inspect
+import os
+from datetime import UTC, datetime
+from functools import wraps
+from typing import Any, Awaitable, Callable, Dict, List, Optional
+from uuid import uuid4
+
+from crewai.traces.context import TraceContext
+from crewai.traces.enums import CrewType, RunType, TraceType
+from crewai.traces.models import (
+    CrewTrace,
+    FlowStepIO,
+    LLMRequest,
+    LLMResponse,
+    ToolCall,
+)
+
+
+class UnifiedTraceController:
+    """Controls and manages trace execution and recording.
+
+    This class handles the lifecycle of traces including creation, execution tracking,
+    and recording of results for various types of operations (LLM calls, tool calls, flow steps).
+    """
+
+    _task_traces: Dict[str, List["UnifiedTraceController"]] = {}
+
+    def __init__(
+        self,
+        trace_type: TraceType,
+        run_type: RunType,
+        crew_type: CrewType,
+        run_id: str,
+        deployment_instance_id: str = os.environ.get(
+            "CREWAI_DEPLOYMENT_INSTANCE_ID", ""
+        ),
+        parent_trace_id: Optional[str] = None,
+        agent_role: Optional[str] = "unknown",
+        task_name: Optional[str] = None,
+        task_description: Optional[str] = None,
+        task_id: Optional[str] = None,
+        flow_step: Dict[str, Any] = {},
+        tool_calls: List[ToolCall] = [],
+        **context: Any,
+    ) -> None:
+        """Initialize a new trace controller.
+
+        Args:
+            trace_type: Type of trace being recorded.
+            run_type: Type of run being executed.
+            crew_type: Type of crew executing the trace.
+            run_id: Unique identifier for the run.
+            deployment_instance_id: Optional deployment instance identifier.
+            parent_trace_id: Optional parent trace identifier for nested traces.
+            agent_role: Role of the agent executing the trace.
+            task_name: Optional name of the task being executed.
+            task_description: Optional description of the task.
+            task_id: Optional unique identifier for the task.
+            flow_step: Optional flow step information.
+            tool_calls: Optional list of tool calls made during execution.
+            **context: Additional context parameters.
+        """
+        self.trace_id = str(uuid4())
+        self.run_id = run_id
+        self.parent_trace_id = parent_trace_id
+        self.trace_type = trace_type
+        self.run_type = run_type
+        self.crew_type = crew_type
+        self.context = context
+        self.agent_role = agent_role
+        self.task_name = task_name
+        self.task_description = task_description
+        self.task_id = task_id
+        self.deployment_instance_id = deployment_instance_id
+        self.children: List[Dict[str, Any]] = []
+        self.start_time: Optional[datetime] = None
+        self.end_time: Optional[datetime] = None
+        self.error: Optional[str] = None
+        self.tool_calls = list(tool_calls)  # copy: the default list is shared
+        self.flow_step = dict(flow_step)  # copy: the default dict is shared
+        self.status: str = "running"
+
+        # Add trace to task's trace collection if task_id is present
+        if task_id:
+            self._add_to_task_traces()
+
+    def _add_to_task_traces(self) -> None:
+        """Add this trace to the task's trace collection."""
+        if not hasattr(UnifiedTraceController, "_task_traces"):
+            UnifiedTraceController._task_traces = {}
+
+        if self.task_id is None:
+            return
+
+        traces = UnifiedTraceController._task_traces.setdefault(self.task_id, [])
+        # Called from both __init__ and _record_trace, so skip duplicates.
+        if self not in traces:
+            traces.append(self)
+
+    @classmethod
+    def get_task_traces(cls, task_id: str) -> List["UnifiedTraceController"]:
+        """Get all traces for a specific task.
+
+        Args:
+            task_id: The ID of the task to get traces for
+
+        Returns:
+            List of traces associated with the task
+        """
+        return cls._task_traces.get(task_id, [])
+
+    @classmethod
+    def clear_task_traces(cls, task_id: str) -> None:
+        """Clear traces for a specific task.
+
+        Args:
+            task_id: The ID of the task to clear traces for
+        """
+        if hasattr(cls, "_task_traces") and task_id in cls._task_traces:
+            del cls._task_traces[task_id]
+
+    def _get_current_trace(self) -> "UnifiedTraceController":
+        return TraceContext.get_current()
+
+    def start_trace(self) -> "UnifiedTraceController":
+        """Start the trace execution.
+
+        Returns:
+            UnifiedTraceController: Self for method chaining.
+        """
+        self.start_time = datetime.now(UTC)
+        return self
+
+    def end_trace(self, result: Any = None, error: Optional[str] = None) -> None:
+        """End the trace execution and record results.
+
+        Args:
+            result: Optional result from the trace execution.
+            error: Optional error message if the trace failed.
+        """
+        self.end_time = datetime.now(UTC)
+        self.status = "error" if error else "completed"
+        self.error = error
+        self._record_trace(result)
+
+    def add_child_trace(self, child_trace: Dict[str, Any]) -> None:
+        """Add a child trace to this trace's execution history.
+
+        Args:
+            child_trace: The child trace information to add.
+        """
+        self.children.append(child_trace)
+
+    def to_crew_trace(self) -> CrewTrace:
+        """Convert to CrewTrace format for storage.
+
+        Returns:
+            CrewTrace: The trace data in CrewTrace format.
+        """
+        latency_ms = None
+
+        if self.tool_calls and hasattr(self.tool_calls[0], "start_time"):
+            self.start_time = self.tool_calls[0].start_time
+
+        if self.start_time and self.end_time:
+            latency_ms = int((self.end_time - self.start_time).total_seconds() * 1000)
+
+        request = None
+        response = None
+        flow_step_obj = None
+
+        if self.trace_type in [TraceType.LLM_CALL, TraceType.TOOL_CALL]:
+            request = LLMRequest(
+                model=self.context.get("model", "unknown"),
+                messages=self.context.get("messages", []),
+                temperature=self.context.get("temperature"),
+                max_tokens=self.context.get("max_tokens"),
+                stop_sequences=self.context.get("stop_sequences"),
+            )
+            if "response" in self.context:
+                response = LLMResponse(
+                    content=self.context["response"].get("content", ""),
+                    finish_reason=self.context["response"].get("finish_reason"),
+                )
+
+        elif self.trace_type == TraceType.FLOW_STEP:
+            flow_step_obj = FlowStepIO(
+                function_name=self.flow_step.get("function_name", "unknown"),
+                inputs=self.flow_step.get("inputs", {}),
+                outputs={"result": self.context.get("response")},
+                metadata=self.flow_step.get("metadata", {}),
+            )
+
+        return CrewTrace(
+            deployment_instance_id=self.deployment_instance_id,
+            trace_id=self.trace_id,
+            task_id=self.task_id,
+            run_id=self.run_id,
+            agent_role=self.agent_role,
+            task_name=self.task_name,
+            task_description=self.task_description,
+            trace_type=self.trace_type.value,
+            crew_type=self.crew_type.value,
+            run_type=self.run_type.value,
+            start_time=self.start_time,
+            end_time=self.end_time,
+            latency_ms=latency_ms,
+            request=request,
+            response=response,
+            flow_step=flow_step_obj,
+            tool_calls=self.tool_calls,
+            tokens_used=self.context.get("tokens_used"),
+            prompt_tokens=self.context.get("prompt_tokens"),
+            completion_tokens=self.context.get("completion_tokens"),
+            status=self.status,
+            error=self.error,
+        )
+
+    def _record_trace(self, result: Any = None) -> None:
+        """Record the trace.
+
+        This method is called when a trace is completed. It ensures the trace
+        is properly recorded and associated with its task if applicable.
+
+        Args:
+            result: Optional result to include in the trace
+        """
+        if result is not None:
+            self.context["response"] = result
+
+        # Add to task traces if this trace belongs to a task
+        if self.task_id:
+            self._add_to_task_traces()
+
+
+def should_trace() -> bool:
+    """Check if tracing is enabled via environment variable."""
+    return os.getenv("CREWAI_ENABLE_TRACING", "false").lower() == "true"
+
+
+# Crew main trace
+def init_crew_main_trace(func: Callable[..., Any]) -> Callable[..., Any]:
+    """Decorator to initialize and track the main crew execution trace.
+
+    This decorator sets up the trace context for the main crew execution,
+    handling both synchronous and asynchronous crew operations.
+
+    Args:
+        func: The crew function to be traced.
+
+    Returns:
+        Wrapped function that creates and manages the main crew trace context.
+    """
+
+    @wraps(func)
+    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
+        if not should_trace():
+            return func(self, *args, **kwargs)
+
+        trace = build_crew_main_trace(self)
+        with TraceContext.set_current(trace):
+            try:
+                return func(self, *args, **kwargs)
+            except Exception as e:
+                trace.end_trace(error=str(e))
+                raise
+
+    return wrapper
+
+
+def build_crew_main_trace(self: Any) -> "UnifiedTraceController":
+    """Build the main trace controller for a crew execution.
+
+    This function creates a trace controller configured for the main crew execution,
+    handling different run types (kickoff, test, train) and maintaining context.
+
+    Args:
+        self: The crew instance.
+
+    Returns:
+        UnifiedTraceController: The configured trace controller for the crew.
+    """
+    run_type = RunType.KICKOFF
+    if hasattr(self, "_test") and self._test:
+        run_type = RunType.TEST
+    elif hasattr(self, "_train") and self._train:
+        run_type = RunType.TRAIN
+
+    current_trace = TraceContext.get_current()
+
+    trace = UnifiedTraceController(
+        trace_type=TraceType.LLM_CALL,
+        run_type=run_type,
+        crew_type=current_trace.crew_type if current_trace else CrewType.CREW,
+        run_id=current_trace.run_id if current_trace else str(self.id),
+        parent_trace_id=current_trace.trace_id if current_trace else None,
+    )
+    return trace
+
+
+# Flow main trace
+def init_flow_main_trace(
+    func: Callable[..., Awaitable[Any]],
+) -> Callable[..., Awaitable[Any]]:
+    """Decorator to initialize and track the main flow execution trace.
+
+    Args:
+        func: The async flow function to be traced.
+
+    Returns:
+        Wrapped async function that creates and manages the main flow trace context.
+    """
+
+    @wraps(func)
+    async def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
+        if not should_trace():
+            return await func(self, *args, **kwargs)
+
+        trace = build_flow_main_trace(self, *args, **kwargs)
+        with TraceContext.set_current(trace):
+            try:
+                return await func(self, *args, **kwargs)
+            except Exception:
+                raise
+
+    return wrapper
+
+
+def build_flow_main_trace(
+    self: Any, *args: Any, **kwargs: Any
+) -> "UnifiedTraceController":
+    """Build the main trace controller for a flow execution.
+
+    Args:
+        self: The flow instance.
+        *args: Variable positional arguments.
+        **kwargs: Variable keyword arguments.
+
+    Returns:
+        UnifiedTraceController: The configured trace controller for the flow.
+    """
+    current_trace = TraceContext.get_current()
+    trace = UnifiedTraceController(
+        trace_type=TraceType.FLOW_STEP,
+        run_id=current_trace.run_id if current_trace else str(self.flow_id),
+        parent_trace_id=current_trace.trace_id if current_trace else None,
+        crew_type=CrewType.FLOW,
+        run_type=RunType.KICKOFF,
+        context={
+            "crew_name": self.__class__.__name__,
+            "inputs": kwargs.get("inputs", {}),
+            "agents": [],
+            "tasks": [],
+        },
+    )
+    return trace
+
+
+# Flow step trace
+def trace_flow_step(
+    func: Callable[..., Awaitable[Any]],
+) -> Callable[..., Awaitable[Any]]:
+    """Decorator to trace individual flow step executions.
+
+    Args:
+        func: The async flow step function to be traced.
+
+    Returns:
+        Wrapped async function that creates and manages the flow step trace context.
+    """
+
+    @wraps(func)
+    async def wrapper(
+        self: Any,
+        method_name: str,
+        method: Callable[..., Any],
+        *args: Any,
+        **kwargs: Any,
+    ) -> Any:
+        if not should_trace():
+            return await func(self, method_name, method, *args, **kwargs)
+
+        trace = build_flow_step_trace(self, method_name, method, *args, **kwargs)
+        with TraceContext.set_current(trace):
+            trace.start_trace()
+            try:
+                result = await func(self, method_name, method, *args, **kwargs)
+                trace.end_trace(result=result)
+                return result
+            except Exception as e:
+                trace.end_trace(error=str(e))
+                raise
+
+    return wrapper
+
+
+def build_flow_step_trace(
+    self: Any, method_name: str, method: Callable[..., Any], *args: Any, **kwargs: Any
+) -> "UnifiedTraceController":
+    """Build a trace controller for an individual flow step.
+
+    Args:
+        self: The flow instance.
+        method_name: Name of the method being executed.
+        method: The actual method being executed.
+        *args: Variable positional arguments.
+        **kwargs: Variable keyword arguments.
+
+    Returns:
+        UnifiedTraceController: The configured trace controller for the flow step.
+    """
+    current_trace = TraceContext.get_current()
+
+    # Get method signature
+    sig = inspect.signature(method)
+    params = list(sig.parameters.values())
+
+    # Create inputs dictionary mapping parameter names to values
+    method_params = [p for p in params if p.name != "self"]
+    inputs: Dict[str, Any] = {}
+
+    # Map positional args to their parameter names
+    for i, param in enumerate(method_params):
+        if i < len(args):
+            inputs[param.name] = args[i]
+
+    # Add keyword arguments
+    inputs.update(kwargs)
+
+    trace = UnifiedTraceController(
+        trace_type=TraceType.FLOW_STEP,
+        run_type=current_trace.run_type if current_trace else RunType.KICKOFF,
+        crew_type=current_trace.crew_type if current_trace else CrewType.FLOW,
+        run_id=current_trace.run_id if current_trace else str(self.flow_id),
+        parent_trace_id=current_trace.trace_id if current_trace else None,
+        flow_step={
+            "function_name": method_name,
+            "inputs": inputs,
+            "metadata": {
+                "crew_name": self.__class__.__name__,
+            },
+        },
+    )
+    return trace
+
+
+# LLM trace
+def trace_llm_call(func: Callable[..., Any]) -> Callable[..., Any]:
+    """Decorator to trace LLM calls.
+
+    Args:
+        func: The function to trace.
+
+    Returns:
+        Wrapped function that creates and manages the LLM call trace context.
+    """
+
+    @wraps(func)
+    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
+        if not should_trace():
+            return func(self, *args, **kwargs)
+
+        trace = build_llm_trace(self, *args, **kwargs)
+        with TraceContext.set_current(trace):
+            trace.start_trace()
+            try:
+                response = func(self, *args, **kwargs)
+                # Extract relevant data from response
+                trace_response = {
+                    "content": response["choices"][0]["message"]["content"],
+                    "finish_reason": response["choices"][0].get("finish_reason"),
+                }
+
+                # Add usage metrics to context
+                if "usage" in response:
+                    trace.context["tokens_used"] = response["usage"].get(
+                        "total_tokens", 0
+                    )
+                    trace.context["prompt_tokens"] = response["usage"].get(
+                        "prompt_tokens", 0
+                    )
+                    trace.context["completion_tokens"] = response["usage"].get(
+                        "completion_tokens", 0
+                    )
+
+                trace.end_trace(trace_response)
+                return response
+            except Exception as e:
+                trace.end_trace(error=str(e))
+                raise
+
+    return wrapper
+
+
+def build_llm_trace(
+    self: Any, params: Dict[str, Any], *args: Any, **kwargs: Any
+) -> Any:
+    """Build a trace controller for an LLM call.
+
+    Args:
+        self: The LLM instance.
+        params: The parameters for the LLM call.
+        *args: Variable positional arguments.
+        **kwargs: Variable keyword arguments.
+
+    Returns:
+        UnifiedTraceController: The configured trace controller for the LLM call.
+    """
+    current_trace = TraceContext.get_current()
+    agent, task = self._get_execution_context()
+
+    # Get new messages and tool results
+    new_messages = self._get_new_messages(params.get("messages", []))
+    new_tool_results = self._get_new_tool_results(agent)
+
+    # Create trace context
+    trace = UnifiedTraceController(
+        trace_type=TraceType.TOOL_CALL if new_tool_results else TraceType.LLM_CALL,
+        crew_type=current_trace.crew_type if current_trace else CrewType.CREW,
+        run_type=current_trace.run_type if current_trace else RunType.KICKOFF,
+        run_id=current_trace.run_id if current_trace else str(uuid4()),
+        parent_trace_id=current_trace.trace_id if current_trace else None,
+        agent_role=agent.role if agent else "unknown",
+        task_id=str(task.id) if task else None,
+        task_name=task.name if task else None,
+        task_description=task.description if task else None,
+        model=self.model,
+        messages=new_messages,
+        temperature=self.temperature,
+        max_tokens=self.max_tokens,
+        stop_sequences=self.stop,
+        tool_calls=[
+            ToolCall(
+                name=result["tool_name"],
+                arguments=result["tool_args"],
+                output=str(result["result"]),
+                start_time=result.get("start_time") or datetime.now(UTC),
+                end_time=datetime.now(UTC),
+            )
+            for result in new_tool_results
+        ],
+    )
+    return trace
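`build_flow_step_trace` above records a step's inputs by pairing positional arguments with parameter names from `inspect.signature`. A standalone sketch of that mapping, assuming nothing about crewai itself; `map_inputs` and `summarize` are hypothetical names used only for illustration:

```python
import inspect
from typing import Any, Callable, Dict


def map_inputs(method: Callable[..., Any], *args: Any, **kwargs: Any) -> Dict[str, Any]:
    """Pair positional args with parameter names, then overlay keyword args."""
    params = [
        p for p in inspect.signature(method).parameters.values() if p.name != "self"
    ]
    inputs: Dict[str, Any] = {}
    for i, param in enumerate(params):
        if i < len(args):  # parameters without a positional value are skipped
            inputs[param.name] = args[i]
    inputs.update(kwargs)  # keyword arguments take precedence
    return inputs


def summarize(topic: str, limit: int = 3) -> str:
    return f"{topic}:{limit}"
```

Parameters that fall back to their defaults are not recorded, so the captured inputs reflect only what the caller actually passed.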
--- a/src/crewai/translations/en.json
+++ b/src/crewai/translations/en.json
@@ -15,7 +15,7 @@
    "final_answer_format": "If you don't need to use any more tools, you must give your best complete final answer, make sure it satisfies the expected criteria, use the EXACT format below:\n\n```\nThought: I now can give a great answer\nFinal Answer: my best complete final answer to the task.\n\n```",
    "format_without_tools": "\nSorry, I didn't use the right format. I MUST either use a tool (among the available ones), OR give my best final answer.\nHere is the expected format I must follow:\n\n```\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\n```\n This Thought/Action/Action Input/Result process can repeat N times. Once I know the final answer, I must return the following format:\n\n```\nThought: I now can give a great answer\nFinal Answer: Your final answer must be the great and the most complete as possible, it must be outcome described\n\n```",
    "task_with_context": "{task}\n\nThis is the context you're working with:\n{context}",
-    "expected_output": "\nThis is the expect criteria for your final answer: {expected_output}\nyou MUST return the actual complete content as the final answer, not a summary.",
+    "expected_output": "\nThis is the expected criteria for your final answer: {expected_output}\nyou MUST return the actual complete content as the final answer, not a summary.",
    "human_feedback": "You got human feedback on your work, re-evaluate it and give a new Final Answer when ready.\n {human_feedback}",
    "getting_input": "This is the agent's final answer: {final_answer}\n\n",
    "summarizer_system_message": "You are a helpful assistant that summarizes text.",
@@ -23,8 +23,8 @@
    "summary": "This is a summary of our conversation so far:\n{merged_summary}",
    "manager_request": "Your best answer to your coworker asking you this, accounting for the context shared.",
    "formatted_task_instructions": "Ensure your final answer contains only the content in the following format: {output_format}\n\nEnsure the final output does not include any code block markers like ```json or ```python.",
-    "human_feedback_classification": "Determine if the following feedback indicates that the user is satisfied or if further changes are needed. Respond with 'True' if further changes are needed, or 'False' if the user is satisfied. **Important** Do not include any additional commentary outside of your 'True' or 'False' response.\n\nFeedback: \"{feedback}\"",
-    "conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals."
+    "conversation_history_instruction": "You are a member of a crew collaborating to achieve a common goal. Your task is a specific action that contributes to this larger objective. For additional context, please review the conversation history between you and the user that led to the initiation of this crew. Use any relevant information or feedback from the conversation to inform your task execution and ensure your response aligns with both the immediate task and the crew's overall goals.",
+    "feedback_instructions": "User feedback: {feedback}\nInstructions: Use this feedback to enhance the next output iteration.\nNote: Do not respond or add commentary."
  },
  "errors": {
    "force_final_answer_error": "You can't keep going, here is the best final answer you generated:\n\n {formatted_answer}",
--- a/src/crewai/utilities/constants.py
+++ b/src/crewai/utilities/constants.py
@@ -4,3 +4,4 @@ DEFAULT_SCORE_THRESHOLD = 0.35
 KNOWLEDGE_DIRECTORY = "knowledge"
 MAX_LLM_RETRY = 3
 MAX_FILE_NAME_LENGTH = 255
+EMITTER_COLOR = "bold_blue"
--- a/src/crewai/utilities/converter.py
+++ b/src/crewai/utilities/converter.py
@@ -20,11 +20,11 @@ class ConverterError(Exception):
 class Converter(OutputConverter):
    """Class that converts text into either pydantic or json."""

-    def to_pydantic(self, current_attempt=1):
+    def to_pydantic(self, current_attempt=1) -> BaseModel:
        """Convert text to pydantic."""
        try:
            if self.llm.supports_function_calling():
-                return self._create_instructor().to_pydantic()
+                result = self._create_instructor().to_pydantic()
            else:
                response = self.llm.call(
                    [
@@ -32,18 +32,40 @@ class Converter(OutputConverter):
                        {"role": "user", "content": self.text},
                    ]
                )
-                return self.model.model_validate_json(response)
+                try:
+                    # Try to directly validate the response JSON
+                    result = self.model.model_validate_json(response)
+                except ValidationError:
+                    # If direct validation fails, attempt to extract valid JSON
+                    result = handle_partial_json(response, self.model, False, None)
+                    # Ensure result is a BaseModel instance
+                    if not isinstance(result, BaseModel):
+                        if isinstance(result, dict):
+                            result = self.model.model_validate(result)
+                        elif isinstance(result, str):
+                            try:
+                                parsed = json.loads(result)
+                                result = self.model.model_validate(parsed)
+                            except Exception as parse_err:
+                                raise ConverterError(
+                                    f"Failed to convert partial JSON result into Pydantic: {parse_err}"
+                                )
+                        else:
+                            raise ConverterError(
+                                "handle_partial_json returned an unexpected type."
+                            )
+            return result
        except ValidationError as e:
            if current_attempt < self.max_attempts:
                return self.to_pydantic(current_attempt + 1)
            raise ConverterError(
-                f"Failed to convert text into a Pydantic model due to the following validation error: {e}"
+                f"Failed to convert text into a Pydantic model due to validation error: {e}"
            )
        except Exception as e:
            if current_attempt < self.max_attempts:
                return self.to_pydantic(current_attempt + 1)
            raise ConverterError(
-                f"Failed to convert text into a Pydantic model due to the following error: {e}"
+                f"Failed to convert text into a Pydantic model due to error: {e}"
            )

    def to_json(self, current_attempt=1):
@@ -197,11 +219,15 @@ def get_conversion_instructions(model: Type[BaseModel], llm: Any) -> str:
    if llm.supports_function_calling():
        model_schema = PydanticSchemaParser(model=model).get_schema()
        instructions += (
-            f"\n\nThe JSON should follow this schema:\n```json\n{model_schema}\n```"
+            f"\n\nOutput ONLY the valid JSON and nothing else.\n\n"
+            f"The JSON must follow this schema exactly:\n```json\n{model_schema}\n```"
        )
    else:
        model_description = generate_model_description(model)
-        instructions += f"\n\nThe JSON should follow this format:\n{model_description}"
+        instructions += (
+            f"\n\nOutput ONLY the valid JSON and nothing else.\n\n"
+            f"The JSON must follow this format exactly:\n{model_description}"
+        )
    return instructions


--- a/src/crewai/utilities/embedding_configurator.py
+++ b/src/crewai/utilities/embedding_configurator.py
@@ -1,5 +1,5 @@
 import os
-from typing import Any, Dict, cast
+from typing import Any, Dict, Optional, cast

 from chromadb import Documents, EmbeddingFunction, Embeddings
 from chromadb.api.types import validate_embedding_function
@@ -18,11 +18,12 @@ class EmbeddingConfigurator:
            "bedrock": self._configure_bedrock,
            "huggingface": self._configure_huggingface,
            "watson": self._configure_watson,
+            "custom": self._configure_custom,
        }

    def configure_embedder(
        self,
-        embedder_config: Dict[str, Any] | None = None,
+        embedder_config: Optional[Dict[str, Any]] = None,
    ) -> EmbeddingFunction:
        """Configures and returns an embedding function based on the provided config."""
        if embedder_config is None:
@@ -30,21 +31,19 @@ class EmbeddingConfigurator:

        provider = embedder_config.get("provider")
        config = embedder_config.get("config", {})
-        model_name = config.get("model")
-
-        if isinstance(provider, EmbeddingFunction):
-            try:
-                validate_embedding_function(provider)
-                return provider
-            except Exception as e:
-                raise ValueError(f"Invalid custom embedding function: {str(e)}")
+        model_name = config.get("model") if provider != "custom" else None

        if provider not in self.embedding_functions:
            raise Exception(
                f"Unsupported embedding provider: {provider}, supported providers: {list(self.embedding_functions.keys())}"
            )

-        return self.embedding_functions[provider](config, model_name)
+        embedding_function = self.embedding_functions[provider]
+        return (
+            embedding_function(config)
+            if provider == "custom"
+            else embedding_function(config, model_name)
+        )

    @staticmethod
    def _create_default_embedding_function():
@@ -65,6 +64,13 @@ class EmbeddingConfigurator:
        return OpenAIEmbeddingFunction(
            api_key=config.get("api_key") or os.getenv("OPENAI_API_KEY"),
            model_name=model_name,
+            api_base=config.get("api_base", None),
+            api_type=config.get("api_type", None),
+            api_version=config.get("api_version", None),
+            default_headers=config.get("default_headers", None),
+            dimensions=config.get("dimensions", None),
+            deployment_id=config.get("deployment_id", None),
+            organization_id=config.get("organization_id", None),
        )

    @staticmethod
@@ -79,6 +85,10 @@ class EmbeddingConfigurator:
            api_type=config.get("api_type", "azure"),
            api_version=config.get("api_version"),
            model_name=model_name,
+            default_headers=config.get("default_headers"),
+            dimensions=config.get("dimensions"),
+            deployment_id=config.get("deployment_id"),
+            organization_id=config.get("organization_id"),
        )

    @staticmethod
@@ -101,6 +111,8 @@ class EmbeddingConfigurator:
        return GoogleVertexEmbeddingFunction(
            model_name=model_name,
            api_key=config.get("api_key"),
+            project_id=config.get("project_id"),
+            region=config.get("region"),
        )

    @staticmethod
@@ -112,6 +124,7 @@ class EmbeddingConfigurator:
        return GoogleGenerativeAiEmbeddingFunction(
            model_name=model_name,
            api_key=config.get("api_key"),
+            task_type=config.get("task_type"),
        )

    @staticmethod
@@ -142,9 +155,11 @@ class EmbeddingConfigurator:
            AmazonBedrockEmbeddingFunction,
        )

-        return AmazonBedrockEmbeddingFunction(
-            session=config.get("session"),
-        )
+        # Allow custom model_name override with backwards compatibility
+        kwargs = {"session": config.get("session")}
+        if model_name is not None:
+            kwargs["model_name"] = model_name
+        return AmazonBedrockEmbeddingFunction(**kwargs)

    @staticmethod
    def _configure_huggingface(config, model_name):
@@ -194,3 +209,28 @@ class EmbeddingConfigurator:
                    raise e

        return WatsonEmbeddingFunction()
+
+    @staticmethod
+    def _configure_custom(config):
+        custom_embedder = config.get("embedder")
+        if isinstance(custom_embedder, EmbeddingFunction):
+            try:
+                validate_embedding_function(custom_embedder)
+                return custom_embedder
+            except Exception as e:
+                raise ValueError(f"Invalid custom embedding function: {str(e)}")
+        elif callable(custom_embedder):
+            try:
+                instance = custom_embedder()
+                if isinstance(instance, EmbeddingFunction):
+                    validate_embedding_function(instance)
+                    return instance
+                raise ValueError(
+                    "Custom embedder does not create an EmbeddingFunction instance"
+                )
+            except Exception as e:
+                raise ValueError(f"Error instantiating custom embedder: {str(e)}")
+        else:
+            raise ValueError(
+                "Custom embedder must be an instance of `EmbeddingFunction` or a callable that creates one"
+            )
--- a/src/crewai/utilities/evaluators/crew_evaluator_handler.py
+++ b/src/crewai/utilities/evaluators/crew_evaluator_handler.py
@@ -1,11 +1,12 @@
 from collections import defaultdict

-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, InstanceOf
 from rich.box import HEAVY_EDGE
 from rich.console import Console
 from rich.table import Table

 from crewai.agent import Agent
+from crewai.llm import LLM
 from crewai.task import Task
 from crewai.tasks.task_output import TaskOutput
 from crewai.telemetry import Telemetry
@@ -23,7 +24,7 @@ class CrewEvaluator:

    Attributes:
        crew (Crew): The crew of agents to evaluate.
-        openai_model_name (str): The model to use for evaluating the performance of the agents (for now ONLY OpenAI accepted).
+        eval_llm (LLM): Language model instance to use for evaluations
        tasks_scores (defaultdict): A dictionary to store the scores of the agents for each task.
        iteration (int): The current iteration of the evaluation.
    """
@@ -32,9 +33,9 @@ class CrewEvaluator:
    run_execution_times: defaultdict = defaultdict(list)
    iteration: int = 0

-    def __init__(self, crew, openai_model_name: str):
+    def __init__(self, crew, eval_llm: InstanceOf[LLM]):
        self.crew = crew
-        self.openai_model_name = openai_model_name
+        self.llm = eval_llm
        self._telemetry = Telemetry()
        self._setup_for_evaluating()

@@ -51,7 +52,7 @@ class CrewEvaluator:
            ),
            backstory="Evaluator agent for crew evaluation with precise capabilities to evaluate the performance of the agents in the crew based on the tasks they have performed",
            verbose=False,
-            llm=self.openai_model_name,
+            llm=self.llm,
        )

    def _evaluation_task(
@@ -181,7 +182,7 @@ class CrewEvaluator:
                self.crew,
                evaluation_result.pydantic.quality,
                current_task.execution_duration,
-                self.openai_model_name,
+                self.llm.model,
            )
            self.tasks_scores[self.iteration].append(evaluation_result.pydantic.quality)
            self.run_execution_times[self.iteration].append(
--- a/src/crewai/utilities/evaluators/task_evaluator.py
+++ b/src/crewai/utilities/evaluators/task_evaluator.py
@@ -3,19 +3,9 @@ from typing import List
 from pydantic import BaseModel, Field

 from crewai.utilities import Converter
+from crewai.utilities.events import TaskEvaluationEvent, crewai_event_bus
 from crewai.utilities.pydantic_schema_parser import PydanticSchemaParser

-agentops = None
-try:
-    from agentops import track_agent  # type: ignore
-except ImportError:
-
-    def track_agent(name):
-        def noop(f):
-            return f
-
-        return noop
-

 class Entity(BaseModel):
    name: str = Field(description="The name of the entity.")
@@ -48,12 +38,15 @@ class TrainingTaskEvaluation(BaseModel):
    )


-@track_agent(name="Task Evaluator")
 class TaskEvaluator:
    def __init__(self, original_agent):
        self.llm = original_agent.llm
+        self.original_agent = original_agent

    def evaluate(self, task, output) -> TaskEvaluation:
+        crewai_event_bus.emit(
+            self, TaskEvaluationEvent(evaluation_type="task_evaluation")
+        )
        evaluation_query = (
            f"Assess the quality of the task completed based on the description, expected output, and actual results.\n\n"
            f"Task Description:\n{task.description}\n\n"
@@ -90,15 +83,39 @@ class TaskEvaluator:
            - training_data (dict): The training data to be evaluated.
            - agent_id (str): The ID of the agent.
        """
+        crewai_event_bus.emit(
+            self, TaskEvaluationEvent(evaluation_type="training_data_evaluation")
+        )

        output_training_data = training_data[agent_id]
-
        final_aggregated_data = ""
-        for _, data in output_training_data.items():
+
+        for iteration, data in output_training_data.items():
+            improved_output = data.get("improved_output")
+            initial_output = data.get("initial_output")
+            human_feedback = data.get("human_feedback")
+
+            if not all([improved_output, initial_output, human_feedback]):
+                missing_fields = [
+                    field
+                    for field in ["improved_output", "initial_output", "human_feedback"]
+                    if not data.get(field)
+                ]
+                error_msg = (
+                    f"Critical training data error: Missing fields ({', '.join(missing_fields)}) "
+                    f"for agent {agent_id} in iteration {iteration}.\n"
+                    "This indicates a broken training process. "
+                    "Cannot proceed with evaluation.\n"
+                    "Please check your training implementation."
+                )
+                raise ValueError(error_msg)
+
            final_aggregated_data += (
-                f"Initial Output:\n{data['initial_output']}\n\n"
-                f"Human Feedback:\n{data['human_feedback']}\n\n"
-                f"Improved Output:\n{data['improved_output']}\n\n"
+                f"Iteration: {iteration}\n"
+                f"Initial Output:\n{initial_output}\n\n"
+                f"Human Feedback:\n{human_feedback}\n\n"
+                f"Improved Output:\n{improved_output}\n\n"
+                "------------------------------------------------\n\n"
            )

        evaluation_query = (
--- a/src/crewai/utilities/events.py
+++ b/src/crewai/utilities/events.py
@@ -1,44 +0,0 @@
-from functools import wraps
-from typing import Any, Callable, Dict, Generic, List, Type, TypeVar
-
-from pydantic import BaseModel
-
-T = TypeVar("T")
-EVT = TypeVar("EVT", bound=BaseModel)
-
-
-class Emitter(Generic[T, EVT]):
-    _listeners: Dict[Type[EVT], List[Callable]] = {}
-
-    def on(self, event_type: Type[EVT]):
-        def decorator(func: Callable):
-            @wraps(func)
-            def wrapper(*args, **kwargs):
-                return func(*args, **kwargs)
-
-            self._listeners.setdefault(event_type, []).append(wrapper)
-            return wrapper
-
-        return decorator
-
-    def emit(self, source: T, event: EVT) -> None:
-        event_type = type(event)
-        for func in self._listeners.get(event_type, []):
-            func(source, event)
-
-
-default_emitter = Emitter[Any, BaseModel]()
-
-
-def emit(source: Any, event: BaseModel, raise_on_error: bool = False) -> None:
-    try:
-        default_emitter.emit(source, event)
-    except Exception as e:
-        if raise_on_error:
-            raise e
-        else:
-            print(f"Error emitting event: {e}")
-
-
-def on(event_type: Type[BaseModel]) -> Callable:
-    return default_emitter.on(event_type)
--- a/src/crewai/utilities/events/__init__.py
+++ b/src/crewai/utilities/events/__init__.py
@@ -0,0 +1,40 @@
+from .crew_events import (
+    CrewKickoffStartedEvent,
+    CrewKickoffCompletedEvent,
+    CrewKickoffFailedEvent,
+    CrewTrainStartedEvent,
+    CrewTrainCompletedEvent,
+    CrewTrainFailedEvent,
+    CrewTestStartedEvent,
+    CrewTestCompletedEvent,
+    CrewTestFailedEvent,
+)
+from .agent_events import (
+    AgentExecutionStartedEvent,
+    AgentExecutionCompletedEvent,
+    AgentExecutionErrorEvent,
+)
+from .task_events import TaskStartedEvent, TaskCompletedEvent, TaskFailedEvent, TaskEvaluationEvent
+from .flow_events import (
+    FlowCreatedEvent,
+    FlowStartedEvent,
+    FlowFinishedEvent,
+    FlowPlotEvent,
+    MethodExecutionStartedEvent,
+    MethodExecutionFinishedEvent,
+    MethodExecutionFailedEvent,
+)
+from .crewai_event_bus import CrewAIEventsBus, crewai_event_bus
+from .tool_usage_events import (
+    ToolUsageFinishedEvent,
+    ToolUsageErrorEvent,
+    ToolUsageStartedEvent,
+    ToolExecutionErrorEvent,
+    ToolSelectionErrorEvent,
+    ToolUsageEvent,
+    ToolValidateInputErrorEvent,
+)
+
+# listeners
+from .event_listener import EventListener
+from .third_party.agentops_listener import agentops_listener
--- a/src/crewai/utilities/events/agent_events.py
+++ b/src/crewai/utilities/events/agent_events.py
@@ -0,0 +1,40 @@
+from typing import TYPE_CHECKING, Any, Dict, Optional, Sequence, Union
+
+from crewai.agents.agent_builder.base_agent import BaseAgent
+from crewai.tools.base_tool import BaseTool
+from crewai.tools.structured_tool import CrewStructuredTool
+
+from .base_events import CrewEvent
+
+if TYPE_CHECKING:
+    from crewai.agents.agent_builder.base_agent import BaseAgent
+
+
+class AgentExecutionStartedEvent(CrewEvent):
+    """Event emitted when an agent starts executing a task"""
+
+    agent: BaseAgent
+    task: Any
+    tools: Optional[Sequence[Union[BaseTool, CrewStructuredTool]]]
+    task_prompt: str
+    type: str = "agent_execution_started"
+
+    model_config = {"arbitrary_types_allowed": True}
+
+
+class AgentExecutionCompletedEvent(CrewEvent):
+    """Event emitted when an agent completes executing a task"""
+
+    agent: BaseAgent
+    task: Any
+    output: str
+    type: str = "agent_execution_completed"
+
+
+class AgentExecutionErrorEvent(CrewEvent):
+    """Event emitted when an agent encounters an error during execution"""
+
+    agent: BaseAgent
+    task: Any
+    error: str
+    type: str = "agent_execution_error"
--- a/src/crewai/utilities/events/base_event_listener.py
+++ b/src/crewai/utilities/events/base_event_listener.py
@@ -0,0 +1,14 @@
+from abc import ABC, abstractmethod
+from logging import Logger
+
+from crewai.utilities.events.crewai_event_bus import CrewAIEventsBus, crewai_event_bus
+
+
+class BaseEventListener(ABC):
+    def __init__(self):
+        super().__init__()
+        self.setup_listeners(crewai_event_bus)
+
+    @abstractmethod
+    def setup_listeners(self, crewai_event_bus: CrewAIEventsBus):
+        pass
--- a/src/crewai/utilities/events/base_events.py
+++ b/src/crewai/utilities/events/base_events.py
@@ -0,0 +1,10 @@
+from datetime import datetime
+
+from pydantic import BaseModel, Field
+
+
+class CrewEvent(BaseModel):
+    """Base class for all crew events"""
+
+    timestamp: datetime = Field(default_factory=datetime.now)
+    type: str
--- a/src/crewai/utilities/events/crew_events.py
+++ b/src/crewai/utilities/events/crew_events.py
@@ -0,0 +1,81 @@
+from typing import Any, Dict, Optional, Union
+
+from pydantic import InstanceOf
+
+from crewai.utilities.events.base_events import CrewEvent
+
+
+class CrewKickoffStartedEvent(CrewEvent):
+    """Event emitted when a crew starts execution"""
+
+    crew_name: Optional[str]
+    inputs: Optional[Dict[str, Any]]
+    type: str = "crew_kickoff_started"
+
+
+class CrewKickoffCompletedEvent(CrewEvent):
+    """Event emitted when a crew completes execution"""
+
+    crew_name: Optional[str]
+    output: Any
+    type: str = "crew_kickoff_completed"
+
+
+class CrewKickoffFailedEvent(CrewEvent):
+    """Event emitted when a crew fails to complete execution"""
+
+    error: str
+    crew_name: Optional[str]
+    type: str = "crew_kickoff_failed"
+
+
+class CrewTrainStartedEvent(CrewEvent):
+    """Event emitted when a crew starts training"""
+
+    crew_name: Optional[str]
+    n_iterations: int
+    filename: str
+    inputs: Optional[Dict[str, Any]]
+    type: str = "crew_train_started"
+
+
+class CrewTrainCompletedEvent(CrewEvent):
+    """Event emitted when a crew completes training"""
+
+    crew_name: Optional[str]
+    n_iterations: int
+    filename: str
+    type: str = "crew_train_completed"
+
+
+class CrewTrainFailedEvent(CrewEvent):
+    """Event emitted when a crew fails to complete training"""
+
+    error: str
+    crew_name: Optional[str]
+    type: str = "crew_train_failed"
+
+
+class CrewTestStartedEvent(CrewEvent):
+    """Event emitted when a crew starts testing"""
+
+    crew_name: Optional[str]
+    n_iterations: int
+    eval_llm: Optional[Union[str, Any]]
+    inputs: Optional[Dict[str, Any]]
+    type: str = "crew_test_started"
+
+
+class CrewTestCompletedEvent(CrewEvent):
+    """Event emitted when a crew completes testing"""
+
+    crew_name: Optional[str]
+    type: str = "crew_test_completed"
+
+
+class CrewTestFailedEvent(CrewEvent):
+    """Event emitted when a crew fails to complete testing"""
+
+    error: str
+    crew_name: Optional[str]
+    type: str = "crew_test_failed"
--- a/src/crewai/utilities/events/crewai_event_bus.py
+++ b/src/crewai/utilities/events/crewai_event_bus.py
@@ -0,0 +1,113 @@
+import threading
+from contextlib import contextmanager
+from typing import Any, Callable, Dict, List, Type, TypeVar, cast
+
+from blinker import Signal
+
+from crewai.utilities.events.base_events import CrewEvent
+from crewai.utilities.events.event_types import EventTypes
+
+EventT = TypeVar("EventT", bound=CrewEvent)
+
+
+class CrewAIEventsBus:
+    """
+    A singleton event bus that uses blinker signals for event handling.
+    Allows both internal (Flow/Crew) and external event handling.
+    """
+
+    _instance = None
+    _lock = threading.Lock()
+
+    def __new__(cls):
+        if cls._instance is None:
+            with cls._lock:
+                if cls._instance is None:  # prevent race condition
+                    cls._instance = super(CrewAIEventsBus, cls).__new__(cls)
+                    cls._instance._initialize()
+        return cls._instance
+
+    def _initialize(self) -> None:
+        """Initialize the event bus internal state"""
+        self._signal = Signal("crewai_event_bus")
+        self._handlers: Dict[Type[CrewEvent], List[Callable]] = {}
+
+    def on(
+        self, event_type: Type[EventT]
+    ) -> Callable[[Callable[[Any, EventT], None]], Callable[[Any, EventT], None]]:
+        """
+        Decorator to register an event handler for a specific event type.
+
+        Usage:
+            @crewai_event_bus.on(AgentExecutionCompletedEvent)
+            def on_agent_execution_completed(
+                source: Any, event: AgentExecutionCompletedEvent
+            ):
+                print(f"👍 Agent '{event.agent}' completed task")
+                print(f"   Output: {event.output}")
+        """
+
+        def decorator(
+            handler: Callable[[Any, EventT], None],
+        ) -> Callable[[Any, EventT], None]:
+            if event_type not in self._handlers:
+                self._handlers[event_type] = []
+            self._handlers[event_type].append(
+                cast(Callable[[Any, EventT], None], handler)
+            )
+            return handler
+
+        return decorator
+
+    def emit(self, source: Any, event: CrewEvent) -> None:
+        """
+        Emit an event to all registered handlers
+
+        Args:
+            source: The object emitting the event
+            event: The event instance to emit
+        """
+        event_type = type(event)
+        if event_type in self._handlers:
+            for handler in self._handlers[event_type]:
+                handler(source, event)
+        self._signal.send(source, event=event)
+
+    def clear_handlers(self) -> None:
+        """Clear all registered event handlers - useful for testing"""
+        self._handlers.clear()
+
+    def register_handler(
+        self, event_type: Type[EventTypes], handler: Callable[[Any, EventTypes], None]
+    ) -> None:
+        """Register an event handler for a specific event type"""
+        if event_type not in self._handlers:
+            self._handlers[event_type] = []
+        self._handlers[event_type].append(
+            cast(Callable[[Any, EventTypes], None], handler)
+        )
+
+    @contextmanager
+    def scoped_handlers(self):
+        """
+        Context manager for temporary event handling scope.
+        Useful for testing or temporary event handling.
+
+        Usage:
+            with crewai_event_bus.scoped_handlers():
+                @crewai_event_bus.on(CrewKickoffStartedEvent)
+                def temp_handler(source, event):
+                    print("Temporary handler")
+                # Do stuff...
+            # Handlers are cleared after the context
+        """
+        previous_handlers = self._handlers.copy()
+        self._handlers.clear()
+        try:
+            yield
+        finally:
+            self._handlers = previous_handlers
+
+
+# Global instance
+crewai_event_bus = CrewAIEventsBus()
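
Review note: the handler registry and `scoped_handlers` behaviour added in `crewai_event_bus.py` can be sketched in isolation as below. This is a simplified stand-in under stated assumptions, not the real `CrewAIEventsBus`: `MiniBus` and `DemoEvent` are hypothetical names, and the singleton logic and the blinker `Signal` are left out on purpose so the snippet runs without crewai installed.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Type


@dataclass
class DemoEvent:
    """Hypothetical event type standing in for a CrewEvent subclass."""

    message: str


class MiniBus:
    """Simplified stand-in for the handler registry (not the real bus)."""

    def __init__(self) -> None:
        self._handlers: Dict[Type[Any], List[Callable[[Any, Any], None]]] = {}

    def on(self, event_type: Type[Any]):
        # Decorator that registers a handler under the given event class.
        def decorator(handler: Callable[[Any, Any], None]):
            self._handlers.setdefault(event_type, []).append(handler)
            return handler

        return decorator

    def emit(self, source: Any, event: Any) -> None:
        # Dispatch on the exact class of the event, as emit() in the diff does.
        for handler in self._handlers.get(type(event), []):
            handler(source, event)

    @contextmanager
    def scoped_handlers(self):
        # Set the current handlers aside, start empty, and restore on exit.
        previous = self._handlers.copy()
        self._handlers.clear()
        try:
            yield
        finally:
            self._handlers = previous


bus = MiniBus()
seen: List[str] = []


@bus.on(DemoEvent)
def record(source: Any, event: DemoEvent) -> None:
    seen.append(f"outer:{event.message}")


with bus.scoped_handlers():

    @bus.on(DemoEvent)
    def temp(source: Any, event: DemoEvent) -> None:
        seen.append(f"scoped:{event.message}")

    # Only the temporary handler is active inside the scope.
    bus.emit(None, DemoEvent("a"))

# The outer handler is back once the scope exits; the temporary one is gone.
bus.emit(None, DemoEvent("b"))
print(seen)
```

In this sketch the outer handler is silenced for the duration of the scope, which is what makes the pattern handy in tests. Because dispatch is keyed on `type(event)`, a handler registered for a base class would not receive subclass events.
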
--- a/src/crewai/utilities/events/event_listener.py
+++ b/src/crewai/utilities/events/event_listener.py
@@ -0,0 +1,257 @@
+from pydantic import PrivateAttr
+
+from crewai.telemetry.telemetry import Telemetry
+from crewai.utilities import Logger
+from crewai.utilities.constants import EMITTER_COLOR
+from crewai.utilities.events.base_event_listener import BaseEventListener
+
+from .agent_events import AgentExecutionCompletedEvent, AgentExecutionStartedEvent
+from .crew_events import (
+    CrewKickoffCompletedEvent,
+    CrewKickoffFailedEvent,
+    CrewKickoffStartedEvent,
+    CrewTestCompletedEvent,
+    CrewTestFailedEvent,
+    CrewTestStartedEvent,
+    CrewTrainCompletedEvent,
+    CrewTrainFailedEvent,
+    CrewTrainStartedEvent,
+)
+from .flow_events import (
+    FlowCreatedEvent,
+    FlowFinishedEvent,
+    FlowStartedEvent,
+    MethodExecutionFailedEvent,
+    MethodExecutionFinishedEvent,
+    MethodExecutionStartedEvent,
+)
+from .task_events import TaskCompletedEvent, TaskFailedEvent, TaskStartedEvent
+from .tool_usage_events import (
+    ToolUsageErrorEvent,
+    ToolUsageFinishedEvent,
+    ToolUsageStartedEvent,
+)
+
+
+class EventListener(BaseEventListener):
+    _instance = None
+    _telemetry: Telemetry = PrivateAttr(default_factory=lambda: Telemetry())
+    logger = Logger(verbose=True, default_color=EMITTER_COLOR)
+
+    def __new__(cls):
+        if cls._instance is None:
+            cls._instance = super().__new__(cls)
+            cls._instance._initialized = False
+        return cls._instance
+
+    def __init__(self):
+        if not hasattr(self, "_initialized") or not self._initialized:
+            super().__init__()
+            self._telemetry = Telemetry()
+            self._telemetry.set_tracer()
+            self._initialized = True
+
+    # ----------- CREW EVENTS -----------
+
+    def setup_listeners(self, crewai_event_bus):
+        @crewai_event_bus.on(CrewKickoffStartedEvent)
+        def on_crew_started(source, event: CrewKickoffStartedEvent):
+            self.logger.log(
+                f"🚀 Crew '{event.crew_name}' started",
+                event.timestamp,
+            )
+            self._telemetry.crew_execution_span(source, event.inputs)
+
+        @crewai_event_bus.on(CrewKickoffCompletedEvent)
+        def on_crew_completed(source, event: CrewKickoffCompletedEvent):
+            final_string_output = event.output.raw
+            self._telemetry.end_crew(source, final_string_output)
+            self.logger.log(
+                f"✅ Crew '{event.crew_name}' completed",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(CrewKickoffFailedEvent)
+        def on_crew_failed(source, event: CrewKickoffFailedEvent):
+            self.logger.log(
+                f"❌ Crew '{event.crew_name}' failed",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(CrewTestStartedEvent)
+        def on_crew_test_started(source, event: CrewTestStartedEvent):
+            cloned_crew = source.copy()
+            cloned_crew._telemetry.test_execution_span(
+                cloned_crew,
+                event.n_iterations,
+                event.inputs,
+                event.eval_llm,
+            )
+            self.logger.log(
+                f"🚀 Crew '{event.crew_name}' started test",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(CrewTestCompletedEvent)
+        def on_crew_test_completed(source, event: CrewTestCompletedEvent):
+            self.logger.log(
+                f"✅ Crew '{event.crew_name}' completed test",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(CrewTestFailedEvent)
+        def on_crew_test_failed(source, event: CrewTestFailedEvent):
+            self.logger.log(
+                f"❌ Crew '{event.crew_name}' failed test",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(CrewTrainStartedEvent)
+        def on_crew_train_started(source, event: CrewTrainStartedEvent):
+            self.logger.log(
+                f"📋 Crew '{event.crew_name}' started train",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(CrewTrainCompletedEvent)
+        def on_crew_train_completed(source, event: CrewTrainCompletedEvent):
+            self.logger.log(
+                f"✅ Crew '{event.crew_name}' completed train",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(CrewTrainFailedEvent)
+        def on_crew_train_failed(source, event: CrewTrainFailedEvent):
+            self.logger.log(
+                f"❌ Crew '{event.crew_name}' failed train",
+                event.timestamp,
+            )
+
+        # ----------- TASK EVENTS -----------
+
+        @crewai_event_bus.on(TaskStartedEvent)
+        def on_task_started(source, event: TaskStartedEvent):
+            source._execution_span = self._telemetry.task_started(
+                crew=source.agent.crew, task=source
+            )
+            self.logger.log(
+                f"📋 Task started: {source.description}",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(TaskCompletedEvent)
+        def on_task_completed(source, event: TaskCompletedEvent):
+            if source._execution_span:
+                self._telemetry.task_ended(
+                    source._execution_span, source, source.agent.crew
+                )
+            self.logger.log(
+                f"✅ Task completed: {source.description}",
+                event.timestamp,
+            )
+            source._execution_span = None
+
+        @crewai_event_bus.on(TaskFailedEvent)
+        def on_task_failed(source, event: TaskFailedEvent):
+            if source._execution_span:
+                if source.agent and source.agent.crew:
+                    self._telemetry.task_ended(
+                        source._execution_span, source, source.agent.crew
+                    )
+                source._execution_span = None
+            self.logger.log(
+                f"❌ Task failed: {source.description}",
+                event.timestamp,
+            )
+
+        # ----------- AGENT EVENTS -----------
+
+        @crewai_event_bus.on(AgentExecutionStartedEvent)
+        def on_agent_execution_started(source, event: AgentExecutionStartedEvent):
+            self.logger.log(
+                f"🤖 Agent '{event.agent.role}' started task",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(AgentExecutionCompletedEvent)
+        def on_agent_execution_completed(source, event: AgentExecutionCompletedEvent):
+            self.logger.log(
+                f"✅ Agent '{event.agent.role}' completed task",
+                event.timestamp,
+            )
+
+        # ----------- FLOW EVENTS -----------
+
+        @crewai_event_bus.on(FlowCreatedEvent)
+        def on_flow_created(source, event: FlowCreatedEvent):
+            self._telemetry.flow_creation_span(source.__class__.__name__)
+            self.logger.log(
+                f"🌊 Flow Created: '{event.flow_name}'",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(FlowStartedEvent)
+        def on_flow_started(source, event: FlowStartedEvent):
+            self._telemetry.flow_execution_span(
+                source.__class__.__name__, list(source._methods.keys())
+            )
+            self.logger.log(
+                f"🤖 Flow Started: '{event.flow_name}'",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(FlowFinishedEvent)
+        def on_flow_finished(source, event: FlowFinishedEvent):
+            self.logger.log(
+                f"👍 Flow Finished: '{event.flow_name}'",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(MethodExecutionStartedEvent)
+        def on_method_execution_started(source, event: MethodExecutionStartedEvent):
+            self.logger.log(
+                f"🤖 Flow Method Started: '{event.method_name}'",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(MethodExecutionFailedEvent)
+        def on_method_execution_failed(source, event: MethodExecutionFailedEvent):
+            self.logger.log(
+                f"❌ Flow Method Failed: '{event.method_name}'",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(MethodExecutionFinishedEvent)
+        def on_method_execution_finished(source, event: MethodExecutionFinishedEvent):
+            self.logger.log(
+                f"👍 Flow Method Finished: '{event.method_name}'",
+                event.timestamp,
+            )
+
+        # ----------- TOOL USAGE EVENTS -----------
+
+        @crewai_event_bus.on(ToolUsageStartedEvent)
+        def on_tool_usage_started(source, event: ToolUsageStartedEvent):
+            self.logger.log(
+                f"🤖 Tool Usage Started: '{event.tool_name}'",
+                event.timestamp,
+            )
+
+        @crewai_event_bus.on(ToolUsageFinishedEvent)
+        def on_tool_usage_finished(source, event: ToolUsageFinishedEvent):
+            self.logger.log(
+                f"✅ Tool Usage Finished: '{event.tool_name}'",
+                event.timestamp,
+                #
+            )
+
+        @crewai_event_bus.on(ToolUsageErrorEvent)
+        def on_tool_usage_error(source, event: ToolUsageErrorEvent):
+            self.logger.log(
+                f"❌ Tool Usage Error: '{event.tool_name}'",
+                event.timestamp,
+                #
+            )
+
+
+event_listener = EventListener()
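
The listener above registers each handler with a decorator keyed on the event class, and every handler receives `(source, event)`. A minimal, self-contained sketch of that registration pattern follows. `MiniEventBus` and `ToolFinished` are hypothetical stand-ins written for illustration only; they are not the `crewai_event_bus` implementation, whose internals are not shown in this diff.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, DefaultDict, List


@dataclass
class ToolFinished:
    """Hypothetical stand-in for an event such as ToolUsageFinishedEvent."""

    tool_name: str


class MiniEventBus:
    """Illustrative bus (an assumption, not the library's code)."""

    def __init__(self) -> None:
        # Maps an event class to the handlers registered for it.
        self._handlers: DefaultDict[type, List[Callable[[Any, Any], None]]] = (
            defaultdict(list)
        )

    def on(self, event_type: type):
        # Decorator that registers a handler for one event class.
        def decorator(handler):
            self._handlers[event_type].append(handler)
            return handler

        return decorator

    def emit(self, source: Any, event: Any) -> None:
        # Call every handler registered for the event's exact class.
        for handler in self._handlers[type(event)]:
            handler(source, event)


bus = MiniEventBus()
received = []


@bus.on(ToolFinished)
def handle_tool_end(source, event):
    received.append((source, event.tool_name))


bus.emit("agent-1", ToolFinished(tool_name="multiplier"))
```

Collecting events in a list, as the updated tests in this change do, lets an assertion inspect the real event object instead of a mocked `emit` call.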
--- a/src/crewai/utilities/events/event_types.py
+++ b/src/crewai/utilities/events/event_types.py
@@ -0,0 +1,61 @@
+from typing import Union
+
+from .agent_events import (
+    AgentExecutionCompletedEvent,
+    AgentExecutionErrorEvent,
+    AgentExecutionStartedEvent,
+)
+from .crew_events import (
+    CrewKickoffCompletedEvent,
+    CrewKickoffFailedEvent,
+    CrewKickoffStartedEvent,
+    CrewTestCompletedEvent,
+    CrewTestFailedEvent,
+    CrewTestStartedEvent,
+    CrewTrainCompletedEvent,
+    CrewTrainFailedEvent,
+    CrewTrainStartedEvent,
+)
+from .flow_events import (
+    FlowFinishedEvent,
+    FlowStartedEvent,
+    MethodExecutionFailedEvent,
+    MethodExecutionFinishedEvent,
+    MethodExecutionStartedEvent,
+)
+from .task_events import (
+    TaskCompletedEvent,
+    TaskFailedEvent,
+    TaskStartedEvent,
+)
+from .tool_usage_events import (
+    ToolUsageErrorEvent,
+    ToolUsageFinishedEvent,
+    ToolUsageStartedEvent,
+)
+
+EventTypes = Union[
+    CrewKickoffStartedEvent,
+    CrewKickoffCompletedEvent,
+    CrewKickoffFailedEvent,
+    CrewTestStartedEvent,
+    CrewTestCompletedEvent,
+    CrewTestFailedEvent,
+    CrewTrainStartedEvent,
+    CrewTrainCompletedEvent,
+    CrewTrainFailedEvent,
+    AgentExecutionStartedEvent,
+    AgentExecutionCompletedEvent,
+    TaskStartedEvent,
+    TaskCompletedEvent,
+    TaskFailedEvent,
+    FlowStartedEvent,
+    FlowFinishedEvent,
+    MethodExecutionStartedEvent,
+    MethodExecutionFinishedEvent,
+    MethodExecutionFailedEvent,
+    AgentExecutionErrorEvent,
+    ToolUsageFinishedEvent,
+    ToolUsageErrorEvent,
+    ToolUsageStartedEvent,
+]
--- a/src/crewai/utilities/events/flow_events.py
+++ b/src/crewai/utilities/events/flow_events.py
@@ -0,0 +1,71 @@
+from typing import Any, Dict, Optional, Union
+
+from pydantic import BaseModel
+
+from .base_events import CrewEvent
+
+
+class FlowEvent(CrewEvent):
+    """Base class for all flow events"""
+
+    type: str
+    flow_name: str
+
+
+class FlowStartedEvent(FlowEvent):
+    """Event emitted when a flow starts execution"""
+
+    flow_name: str
+    inputs: Optional[Dict[str, Any]] = None
+    type: str = "flow_started"
+
+
+class FlowCreatedEvent(FlowEvent):
+    """Event emitted when a flow is created"""
+
+    flow_name: str
+    type: str = "flow_created"
+
+
+class MethodExecutionStartedEvent(FlowEvent):
+    """Event emitted when a flow method starts execution"""
+
+    flow_name: str
+    method_name: str
+    state: Union[Dict[str, Any], BaseModel]
+    params: Optional[Dict[str, Any]] = None
+    type: str = "method_execution_started"
+
+
+class MethodExecutionFinishedEvent(FlowEvent):
+    """Event emitted when a flow method completes execution"""
+
+    flow_name: str
+    method_name: str
+    result: Any = None
+    state: Union[Dict[str, Any], BaseModel]
+    type: str = "method_execution_finished"
+
+
+class MethodExecutionFailedEvent(FlowEvent):
+    """Event emitted when a flow method fails execution"""
+
+    flow_name: str
+    method_name: str
+    error: Any
+    type: str = "method_execution_failed"
+
+
+class FlowFinishedEvent(FlowEvent):
+    """Event emitted when a flow completes execution"""
+
+    flow_name: str
+    result: Optional[Any] = None
+    type: str = "flow_finished"
+
+
+class FlowPlotEvent(FlowEvent):
+    """Event emitted when a flow plot is created"""
+
+    flow_name: str
+    type: str = "flow_plot"
--- a/src/crewai/utilities/events/task_events.py
+++ b/src/crewai/utilities/events/task_events.py
@@ -0,0 +1,32 @@
+from typing import Any, Optional
+
+from crewai.tasks.task_output import TaskOutput
+from crewai.utilities.events.base_events import CrewEvent
+
+
+class TaskStartedEvent(CrewEvent):
+    """Event emitted when a task starts"""
+
+    type: str = "task_started"
+    context: Optional[str]
+
+
+class TaskCompletedEvent(CrewEvent):
+    """Event emitted when a task completes"""
+
+    output: TaskOutput
+    type: str = "task_completed"
+
+
+class TaskFailedEvent(CrewEvent):
+    """Event emitted when a task fails"""
+
+    error: str
+    type: str = "task_failed"
+
+
+class TaskEvaluationEvent(CrewEvent):
+    """Event emitted when a task evaluation is completed"""
+
+    type: str = "task_evaluation"
+    evaluation_type: str
--- a/src/crewai/utilities/events/third_party/__init__.py
+++ b/src/crewai/utilities/events/third_party/__init__.py
@@ -0,0 +1 @@
+from .agentops_listener import agentops_listener
--- a/src/crewai/utilities/events/third_party/agentops_listener.py
+++ b/src/crewai/utilities/events/third_party/agentops_listener.py
@@ -0,0 +1,67 @@
+from typing import Optional
+
+from crewai.utilities.events import (
+    CrewKickoffCompletedEvent,
+    ToolUsageErrorEvent,
+    ToolUsageStartedEvent,
+)
+from crewai.utilities.events.base_event_listener import BaseEventListener
+from crewai.utilities.events.crew_events import CrewKickoffStartedEvent
+from crewai.utilities.events.task_events import TaskEvaluationEvent
+
+try:
+    import agentops
+
+    AGENTOPS_INSTALLED = True
+except ImportError:
+    AGENTOPS_INSTALLED = False
+
+
+class AgentOpsListener(BaseEventListener):
+    tool_event: Optional["agentops.ToolEvent"] = None
+    session: Optional["agentops.Session"] = None
+
+    def __init__(self):
+        super().__init__()
+
+    def setup_listeners(self, crewai_event_bus):
+        if not AGENTOPS_INSTALLED:
+            return
+
+        @crewai_event_bus.on(CrewKickoffStartedEvent)
+        def on_crew_kickoff_started(source, event: CrewKickoffStartedEvent):
+            self.session = agentops.init()
+            for agent in source.agents:
+                if self.session:
+                    self.session.create_agent(
+                        name=agent.role,
+                        agent_id=str(agent.id),
+                    )
+
+        @crewai_event_bus.on(CrewKickoffCompletedEvent)
+        def on_crew_kickoff_completed(source, event: CrewKickoffCompletedEvent):
+            if self.session:
+                self.session.end_session(
+                    end_state="Success",
+                    end_state_reason="Finished Execution",
+                )
+
+        @crewai_event_bus.on(ToolUsageStartedEvent)
+        def on_tool_usage_started(source, event: ToolUsageStartedEvent):
+            self.tool_event = agentops.ToolEvent(name=event.tool_name)
+            if self.session:
+                self.session.record(self.tool_event)
+
+        @crewai_event_bus.on(ToolUsageErrorEvent)
+        def on_tool_usage_error(source, event: ToolUsageErrorEvent):
+            agentops.ErrorEvent(exception=event.error, trigger_event=self.tool_event)
+
+        @crewai_event_bus.on(TaskEvaluationEvent)
+        def on_task_evaluation(source, event: TaskEvaluationEvent):
+            if self.session:
+                self.session.create_agent(
+                    name="Task Evaluator", agent_id=str(source.original_agent.id)
+                )
+
+
+agentops_listener = AgentOpsListener()
--- a/src/crewai/utilities/events/tool_usage_events.py
+++ b/src/crewai/utilities/events/tool_usage_events.py
@@ -0,0 +1,64 @@
+from datetime import datetime
+from typing import Any, Callable, Dict
+
+from .base_events import CrewEvent
+
+
+class ToolUsageEvent(CrewEvent):
+    """Base event for tool usage tracking"""
+
+    agent_key: str
+    agent_role: str
+    tool_name: str
+    tool_args: Dict[str, Any] | str
+    tool_class: str
+    run_attempts: int | None = None
+    delegations: int | None = None
+
+    model_config = {"arbitrary_types_allowed": True}
+
+
+class ToolUsageStartedEvent(ToolUsageEvent):
+    """Event emitted when a tool execution is started"""
+
+    type: str = "tool_usage_started"
+
+
+class ToolUsageFinishedEvent(ToolUsageEvent):
+    """Event emitted when a tool execution is completed"""
+
+    started_at: datetime
+    finished_at: datetime
+    from_cache: bool = False
+    type: str = "tool_usage_finished"
+
+
+class ToolUsageErrorEvent(ToolUsageEvent):
+    """Event emitted when a tool execution encounters an error"""
+
+    error: Any
+    type: str = "tool_usage_error"
+
+
+class ToolValidateInputErrorEvent(ToolUsageEvent):
+    """Event emitted when a tool input validation encounters an error"""
+
+    error: Any
+    type: str = "tool_validate_input_error"
+
+
+class ToolSelectionErrorEvent(ToolUsageEvent):
+    """Event emitted when a tool selection encounters an error"""
+
+    error: Any
+    type: str = "tool_selection_error"
+
+
+class ToolExecutionErrorEvent(CrewEvent):
+    """Event emitted when a tool execution encounters an error"""
+
+    error: Any
+    type: str = "tool_execution_error"
+    tool_name: str
+    tool_args: Dict[str, Any]
+    tool_class: Callable
--- a/src/crewai/utilities/file_handler.py
+++ b/src/crewai/utilities/file_handler.py
@@ -1,30 +1,64 @@
+import json
 import os
 import pickle
 from datetime import datetime
+from typing import Union


 class FileHandler:
-    """take care of file operations, currently it only logs messages to a file"""
+    """Handler for file operations supporting both JSON and text-based logging.
+    
+    Args:
+        file_path (Union[bool, str]): Path to the log file, or True to use ./logs.txt
+    """

-    def __init__(self, file_path):
-        if isinstance(file_path, bool):
+    def __init__(self, file_path: Union[bool, str]):
+        self._initialize_path(file_path)
+        
+    def _initialize_path(self, file_path: Union[bool, str]):
+        if file_path is True:  # File path is boolean True
            self._path = os.path.join(os.curdir, "logs.txt")
-        elif isinstance(file_path, str):
-            self._path = file_path
+        
+        elif isinstance(file_path, str):  # File path is a string
+            if file_path.endswith((".json", ".txt")):
+                self._path = file_path  # No modification if the file ends with .json or .txt
+            else:
+                self._path = file_path + ".txt"  # Append .txt if the file doesn't end with .json or .txt
+        
        else:
-            raise ValueError("file_path must be either a boolean or a string.")
-
+            raise ValueError("file_path must be a string or boolean.")  # Handle the case where file_path isn't valid
+        
    def log(self, **kwargs):
-        now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-        message = (
-            f"{now}: "
-            + ", ".join([f'{key}="{value}"' for key, value in kwargs.items()])
-            + "\n"
-        )
-        with open(self._path, "a", encoding="utf-8") as file:
-            file.write(message + "\n")
+        try:
+            now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+            log_entry = {"timestamp": now, **kwargs}

+            if self._path.endswith(".json"):
+                # Append log in JSON format
+                with open(self._path, "a", encoding="utf-8") as file:
+                    # If the file is empty, start with a list; else, append to it
+                    try:
+                        # Try reading existing content to avoid overwriting
+                        with open(self._path, "r", encoding="utf-8") as read_file:
+                            existing_data = json.load(read_file)
+                            existing_data.append(log_entry)
+                    except (json.JSONDecodeError, FileNotFoundError):
+                        # If no valid JSON or file doesn't exist, start with an empty list
+                        existing_data = [log_entry]
+                    
+                    with open(self._path, "w", encoding="utf-8") as write_file:
+                        json.dump(existing_data, write_file, indent=4)
+                        write_file.write("\n")
+            
+            else:
+                # Append log in plain text format
+                message = f"{now}: " + ", ".join([f"{key}=\"{value}\"" for key, value in kwargs.items()]) + "\n"
+                with open(self._path, "a", encoding="utf-8") as file:
+                    file.write(message)

+        except Exception as e:
+            raise ValueError(f"Failed to log message: {e}") from e
+        
 class PickleHandler:
    def __init__(self, file_name: str) -> None:
        """
--- a/src/crewai/utilities/llm_utils.py
+++ b/src/crewai/utilities/llm_utils.py
@@ -53,6 +53,7 @@ def create_llm(
        timeout: Optional[float] = getattr(llm_value, "timeout", None)
        api_key: Optional[str] = getattr(llm_value, "api_key", None)
        base_url: Optional[str] = getattr(llm_value, "base_url", None)
+        api_base: Optional[str] = getattr(llm_value, "api_base", None)

        created_llm = LLM(
            model=model,
@@ -62,6 +63,7 @@ def create_llm(
            timeout=timeout,
            api_key=api_key,
            base_url=base_url,
+            api_base=api_base,
        )
        return created_llm
    except Exception as e:
@@ -101,8 +103,18 @@ def _llm_via_environment_or_fallback() -> Optional[LLM]:
    callbacks: List[Any] = []

    # Optional base URL from env
-    api_base = os.environ.get("OPENAI_API_BASE") or os.environ.get("OPENAI_BASE_URL")
-    if api_base:
+    base_url = (
+        os.environ.get("BASE_URL")
+        or os.environ.get("OPENAI_API_BASE")
+        or os.environ.get("OPENAI_BASE_URL")
+    )
+
+    api_base = os.environ.get("API_BASE") or os.environ.get("AZURE_API_BASE")
+
+    # Synchronize base_url and api_base if one is populated and the other is not
+    if base_url and not api_base:
+        api_base = base_url
+    elif api_base and not base_url:
        base_url = api_base

    # Initialize llm_params dictionary
@@ -115,6 +127,7 @@ def _llm_via_environment_or_fallback() -> Optional[LLM]:
        "timeout": timeout,
        "api_key": api_key,
        "base_url": base_url,
+        "api_base": api_base,
        "api_version": api_version,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
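
The environment lookup in this hunk resolves `base_url` and `api_base` separately and then copies whichever one is set into the other. The sketch below restates that precedence as a pure function so it can be exercised without touching `os.environ`. The function name `resolve_endpoints` and the example URLs are assumptions for illustration; only the variable names and their order come from the diff.

```python
from typing import Mapping, Optional, Tuple


def resolve_endpoints(env: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (base_url, api_base) using the precedence shown in the hunk."""
    base_url = (
        env.get("BASE_URL")
        or env.get("OPENAI_API_BASE")
        or env.get("OPENAI_BASE_URL")
    )
    api_base = env.get("API_BASE") or env.get("AZURE_API_BASE")

    # Mirror one value into the other when only one is populated.
    if base_url and not api_base:
        api_base = base_url
    elif api_base and not base_url:
        base_url = api_base
    return base_url, api_base


only_openai = resolve_endpoints({"OPENAI_API_BASE": "https://example.test/v1"})
only_azure = resolve_endpoints({"AZURE_API_BASE": "https://azure.example.test"})
both_set = resolve_endpoints({"BASE_URL": "https://a.test", "API_BASE": "https://b.test"})
neither = resolve_endpoints({})
```

When both values are set they are left untouched, so a caller can still point the two parameters at different endpoints.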
--- a/src/crewai/utilities/logger.py
+++ b/src/crewai/utilities/logger.py
@@ -8,8 +8,11 @@ from crewai.utilities.printer import Printer
 class Logger(BaseModel):
    verbose: bool = Field(default=False)
    _printer: Printer = PrivateAttr(default_factory=Printer)
+    default_color: str = Field(default="bold_yellow")

-    def log(self, level, message, color="bold_yellow"):
+    def log(self, level, message, color=None):
+        if color is None:
+            color = self.default_color
        if self.verbose:
            timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            self._printer.print(
--- a/src/crewai/utilities/protocols.py
+++ b/src/crewai/utilities/protocols.py
@@ -0,0 +1,12 @@
+from typing import Any, Protocol, runtime_checkable
+
+
+@runtime_checkable
+class AgentExecutorProtocol(Protocol):
+    """Protocol defining the expected interface for an agent executor."""
+
+    @property
+    def agent(self) -> Any: ...
+
+    @property
+    def task(self) -> Any: ...
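
Because the protocol is decorated with `@runtime_checkable`, `isinstance` can be used against it, but that check only verifies that the named members exist, not their types. A small sketch of this behavior follows. `ExecutorLike`, `FakeExecutor`, and `AgentOnly` are hypothetical names introduced here; `ExecutorLike` simply mirrors the shape of `AgentExecutorProtocol` so the snippet runs on its own.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ExecutorLike(Protocol):
    """Hypothetical mirror of the protocol added in this file."""

    @property
    def agent(self) -> Any: ...

    @property
    def task(self) -> Any: ...


class FakeExecutor:
    # Plain attributes satisfy a protocol that declares properties.
    def __init__(self, agent: Any, task: Any) -> None:
        self.agent = agent
        self.task = task


class AgentOnly:
    # Missing `task`, so the structural check fails.
    agent = "researcher"


matches = isinstance(FakeExecutor("researcher", "summarize"), ExecutorLike)
rejects = isinstance(AgentOnly(), ExecutorLike)
```

No inheritance is needed: any object exposing `agent` and `task` passes, which keeps the check independent of a concrete executor class.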
--- a/src/crewai/utilities/training_handler.py
+++ b/src/crewai/utilities/training_handler.py
@@ -1,3 +1,5 @@
+import os
+
 from crewai.utilities.file_handler import PickleHandler


@@ -29,3 +31,8 @@ class CrewTrainingHandler(PickleHandler):
            data[agent_id] = {train_iteration: new_data}

        self.save(data)
+
+    def clear(self) -> None:
+        """Clear the training data by resetting the file's contents, if the file exists."""
+        if os.path.exists(self.file_path):
+            self.save({})
--- a/tests/agent_test.py
+++ b/tests/agent_test.py
@@ -8,16 +8,17 @@ import pytest

 from crewai import Agent, Crew, Task
 from crewai.agents.cache import CacheHandler
-from crewai.agents.crew_agent_executor import CrewAgentExecutor
+from crewai.agents.crew_agent_executor import AgentFinish, CrewAgentExecutor
 from crewai.agents.parser import AgentAction, CrewAgentParser, OutputParserException
+from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
 from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
 from crewai.llm import LLM
 from crewai.tools import tool
 from crewai.tools.tool_calling import InstructorToolCalling
 from crewai.tools.tool_usage import ToolUsage
-from crewai.tools.tool_usage_events import ToolUsageFinished
-from crewai.utilities import Printer, RPMController
-from crewai.utilities.events import Emitter
+from crewai.utilities import RPMController
+from crewai.utilities.events import crewai_event_bus
+from crewai.utilities.events.tool_usage_events import ToolUsageFinishedEvent


 def test_agent_llm_creation_with_env_vars():
@@ -153,15 +154,19 @@ def test_agent_execution_with_tools():
        agent=agent,
        expected_output="The result of the multiplication.",
    )
-    with patch.object(Emitter, "emit") as emit:
-        output = agent.execute_task(task)
-        assert output == "The result of the multiplication is 12."
-        assert emit.call_count == 1
-        args, _ = emit.call_args
-        assert isinstance(args[1], ToolUsageFinished)
-        assert not args[1].from_cache
-        assert args[1].tool_name == "multiplier"
-        assert args[1].tool_args == {"first_number": 3, "second_number": 4}
+    received_events = []
+
+    @crewai_event_bus.on(ToolUsageFinishedEvent)
+    def handle_tool_end(source, event):
+        received_events.append(event)
+
+    output = agent.execute_task(task)
+    assert output == "The result of the multiplication is 12."
+
+    assert len(received_events) == 1
+    assert isinstance(received_events[0], ToolUsageFinishedEvent)
+    assert received_events[0].tool_name == "multiplier"
+    assert received_events[0].tool_args == {"first_number": 3, "second_number": 4}


 @pytest.mark.vcr(filter_headers=["authorization"])
@@ -248,10 +253,14 @@ def test_cache_hitting():
        "multiplier-{'first_number': 3, 'second_number': 3}": 9,
        "multiplier-{'first_number': 12, 'second_number': 3}": 36,
    }
+    received_events = []
+
+    @crewai_event_bus.on(ToolUsageFinishedEvent)
+    def handle_tool_end(source, event):
+        received_events.append(event)

    with (
        patch.object(CacheHandler, "read") as read,
-        patch.object(Emitter, "emit") as emit,
    ):
        read.return_value = "0"
        task = Task(
@@ -264,10 +273,9 @@ def test_cache_hitting():
        read.assert_called_with(
            tool="multiplier", input={"first_number": 2, "second_number": 6}
        )
-        assert emit.call_count == 1
-        args, _ = emit.call_args
-        assert isinstance(args[1], ToolUsageFinished)
-        assert args[1].from_cache
+        assert len(received_events) == 1
+        assert isinstance(received_events[0], ToolUsageFinishedEvent)
+        assert received_events[0].from_cache


 @pytest.mark.vcr(filter_headers=["authorization"])
@@ -907,6 +915,8 @@ def test_tool_result_as_answer_is_the_final_answer_for_the_agent():

 @pytest.mark.vcr(filter_headers=["authorization"])
 def test_tool_usage_information_is_appended_to_agent():
+    from datetime import UTC, datetime
+
    from crewai.tools import BaseTool

    class MyCustomTool(BaseTool):
@@ -916,30 +926,36 @@ def test_tool_usage_information_is_appended_to_agent():
        def _run(self) -> str:
            return "Howdy!"

-    agent1 = Agent(
-        role="Friendly Neighbor",
-        goal="Make everyone feel welcome",
-        backstory="You are the friendly neighbor",
-        tools=[MyCustomTool(result_as_answer=True)],
-    )
+    fixed_datetime = datetime(2025, 2, 10, 12, 0, 0, tzinfo=UTC)
+    with patch("datetime.datetime") as mock_datetime:
+        mock_datetime.now.return_value = fixed_datetime
+        mock_datetime.side_effect = lambda *args, **kw: datetime(*args, **kw)

-    greeting = Task(
-        description="Say an appropriate greeting.",
-        expected_output="The greeting.",
-        agent=agent1,
-    )
-    tasks = [greeting]
-    crew = Crew(agents=[agent1], tasks=tasks)
+        agent1 = Agent(
+            role="Friendly Neighbor",
+            goal="Make everyone feel welcome",
+            backstory="You are the friendly neighbor",
+            tools=[MyCustomTool(result_as_answer=True)],
+        )

-    crew.kickoff()
-    assert agent1.tools_results == [
-        {
-            "result": "Howdy!",
-            "tool_name": "Decide Greetings",
-            "tool_args": {},
-            "result_as_answer": True,
-        }
-    ]
+        greeting = Task(
+            description="Say an appropriate greeting.",
+            expected_output="The greeting.",
+            agent=agent1,
+        )
+        tasks = [greeting]
+        crew = Crew(agents=[agent1], tasks=tasks)
+
+        crew.kickoff()
+        assert agent1.tools_results == [
+            {
+                "result": "Howdy!",
+                "tool_name": "Decide Greetings",
+                "tool_args": {},
+                "result_as_answer": True,
+                "start_time": fixed_datetime,
+            }
+        ]


 def test_agent_definition_based_on_dict():
@@ -982,23 +998,35 @@ def test_agent_human_input():
    # Side effect function for _ask_human_input to simulate multiple feedback iterations
    feedback_responses = iter(
        [
-            "Don't say hi, say Hello instead!",  # First feedback
-            "looks good",  # Second feedback to exit loop
+            "Don't say hi, say Hello instead!",  # First feedback: instruct change
+            "",  # Second feedback: empty string signals acceptance
        ]
    )

    def ask_human_input_side_effect(*args, **kwargs):
        return next(feedback_responses)

-    with patch.object(
-        CrewAgentExecutor, "_ask_human_input", side_effect=ask_human_input_side_effect
-    ) as mock_human_input:
+    # Patch both _ask_human_input and _invoke_loop to avoid real API/network calls.
+    with (
+        patch.object(
+            CrewAgentExecutor,
+            "_ask_human_input",
+            side_effect=ask_human_input_side_effect,
+        ) as mock_human_input,
+        patch.object(
+            CrewAgentExecutor,
+            "_invoke_loop",
+            return_value=AgentFinish(output="Hello", thought="", text=""),
+        ) as mock_invoke_loop,
+    ):
        # Execute the task
        output = agent.execute_task(task)

-        # Assertions to ensure the agent behaves correctly
-        assert mock_human_input.call_count == 2  # Should have asked for feedback twice
-        assert output.strip().lower() == "hello"  # Final output should be 'Hello'
+        # Assertions to ensure the agent behaves correctly.
+        # It should have requested feedback twice.
+        assert mock_human_input.call_count == 2
+        # The final result should be processed to "Hello"
+        assert output.strip().lower() == "hello"


 def test_interpolate_inputs():
@@ -1182,7 +1210,7 @@ def test_agent_max_retry_limit():
            [
                mock.call(
                    {
-                        "input": "Say the word: Hi\n\nThis is the expect criteria for your final answer: The word: Hi\nyou MUST return the actual complete content as the final answer, not a summary.",
+                        "input": "Say the word: Hi\n\nThis is the expected criteria for your final answer: The word: Hi\nyou MUST return the actual complete content as the final answer, not a summary.",
                        "tool_names": "",
                        "tools": "",
                        "ask_for_human_input": True,
@@ -1190,7 +1218,7 @@ def test_agent_max_retry_limit():
                ),
                mock.call(
                    {
-                        "input": "Say the word: Hi\n\nThis is the expect criteria for your final answer: The word: Hi\nyou MUST return the actual complete content as the final answer, not a summary.",
+                        "input": "Say the word: Hi\n\nThis is the expected criteria for your final answer: The word: Hi\nyou MUST return the actual complete content as the final answer, not a summary.",
                        "tool_names": "",
                        "tools": "",
                        "ask_for_human_input": True,
@@ -1602,6 +1630,45 @@ def test_agent_with_knowledge_sources():
        assert "red" in result.raw.lower()


+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_agent_with_knowledge_sources_works_with_copy():
+    content = "Brandon's favorite color is red and he likes Mexican food."
+    string_source = StringKnowledgeSource(content=content)
+
+    with patch(
+        "crewai.knowledge.source.base_knowledge_source.BaseKnowledgeSource",
+        autospec=True,
+    ) as MockKnowledgeSource:
+        mock_knowledge_source_instance = MockKnowledgeSource.return_value
+        mock_knowledge_source_instance.__class__ = BaseKnowledgeSource
+        mock_knowledge_source_instance.sources = [string_source]
+
+        agent = Agent(
+            role="Information Agent",
+            goal="Provide information based on knowledge sources",
+            backstory="You have access to specific knowledge sources.",
+            llm=LLM(model="gpt-4o-mini"),
+            knowledge_sources=[string_source],
+        )
+
+        with patch(
+            "crewai.knowledge.storage.knowledge_storage.KnowledgeStorage"
+        ) as MockKnowledgeStorage:
+            mock_knowledge_storage = MockKnowledgeStorage.return_value
+            agent.knowledge_storage = mock_knowledge_storage
+
+            agent_copy = agent.copy()
+
+            assert agent_copy.role == agent.role
+            assert agent_copy.goal == agent.goal
+            assert agent_copy.backstory == agent.backstory
+            assert agent_copy.knowledge_sources is not None
+            assert len(agent_copy.knowledge_sources) == 1
+            assert isinstance(agent_copy.knowledge_sources[0], StringKnowledgeSource)
+            assert agent_copy.knowledge_sources[0].content == content
+            assert isinstance(agent_copy.llm, LLM)
+
+
 @pytest.mark.vcr(filter_headers=["authorization"])
 def test_litellm_auth_error_handling():
    """Test that LiteLLM authentication errors are handled correctly and not retried."""
--- a/tests/cassettes/test_agent_human_input.yaml
+++ b/tests/cassettes/test_agent_human_input.yaml
@@ -1,520 +0,0 @@
-interactions:
- request:
-    body: !!binary |
-      CqcXCiQKIgoMc2VydmljZS5uYW1lEhIKEGNyZXdBSS10ZWxlbWV0cnkS/hYKEgoQY3Jld2FpLnRl
-      bGVtZXRyeRJ5ChBuJJtOdNaB05mOW/p3915eEgj2tkAd3rZcASoQVG9vbCBVc2FnZSBFcnJvcjAB
-      OYa7/URvKBUYQUpcFEVvKBUYShoKDmNyZXdhaV92ZXJzaW9uEggKBjAuODYuMEoPCgNsbG0SCAoG
-      Z3B0LTRvegIYAYUBAAEAABLJBwoQifhX01E5i+5laGdALAlZBBIIBuGM1aN+OPgqDENyZXcgQ3Jl
-      YXRlZDABORVGruBvKBUYQaipwOBvKBUYShoKDmNyZXdhaV92ZXJzaW9uEggKBjAuODYuMEoaCg5w
-      eXRob25fdmVyc2lvbhIICgYzLjEyLjdKLgoIY3Jld19rZXkSIgogN2U2NjA4OTg5ODU5YTY3ZWVj
-      ODhlZWY3ZmNlODUyMjVKMQoHY3Jld19pZBImCiRiOThiNWEwMC01YTI1LTQxMDctYjQwNS1hYmYz
-      MjBhOGYzYThKHAoMY3Jld19wcm9jZXNzEgwKCnNlcXVlbnRpYWxKEQoLY3Jld19tZW1vcnkSAhAA
-      ShoKFGNyZXdfbnVtYmVyX29mX3Rhc2tzEgIYAUobChVjcmV3X251bWJlcl9vZl9hZ2VudHMSAhgB
-      SuQCCgtjcmV3X2FnZW50cxLUAgrRAlt7ImtleSI6ICIyMmFjZDYxMWU0NGVmNWZhYzA1YjUzM2Q3
-      NWU4ODkzYiIsICJpZCI6ICJkNWIyMzM1YS0yMmIyLTQyZWEtYmYwNS03OTc3NmU3MmYzOTIiLCAi
-      cm9sZSI6ICJEYXRhIFNjaWVudGlzdCIsICJ2ZXJib3NlPyI6IGZhbHNlLCAibWF4X2l0ZXIiOiAy
-      MCwgIm1heF9ycG0iOiBudWxsLCAiZnVuY3Rpb25fY2FsbGluZ19sbG0iOiAiIiwgImxsbSI6ICJn
-      cHQtNG8tbWluaSIsICJkZWxlZ2F0aW9uX2VuYWJsZWQ/IjogZmFsc2UsICJhbGxvd19jb2RlX2V4
-      ZWN1dGlvbj8iOiBmYWxzZSwgIm1heF9yZXRyeV9saW1pdCI6IDIsICJ0b29sc19uYW1lcyI6IFsi
-      Z2V0IGdyZWV0aW5ncyJdfV1KkgIKCmNyZXdfdGFza3MSgwIKgAJbeyJrZXkiOiAiYTI3N2IzNGIy
-      YzE0NmYwYzU2YzVlMTM1NmU4ZjhhNTciLCAiaWQiOiAiMjJiZWMyMzEtY2QyMS00YzU4LTgyN2Ut
-      MDU4MWE4ZjBjMTExIiwgImFzeW5jX2V4ZWN1dGlvbj8iOiBmYWxzZSwgImh1bWFuX2lucHV0PyI6
-      IGZhbHNlLCAiYWdlbnRfcm9sZSI6ICJEYXRhIFNjaWVudGlzdCIsICJhZ2VudF9rZXkiOiAiMjJh
-      Y2Q2MTFlNDRlZjVmYWMwNWI1MzNkNzVlODg5M2IiLCAidG9vbHNfbmFtZXMiOiBbImdldCBncmVl
-      dGluZ3MiXX1degIYAYUBAAEAABKOAgoQ5WYoxRtTyPjge4BduhL0rRIIv2U6rvWALfwqDFRhc2sg
-      Q3JlYXRlZDABOX068uBvKBUYQZkv8+BvKBUYSi4KCGNyZXdfa2V5EiIKIDdlNjYwODk4OTg1OWE2
-      N2VlYzg4ZWVmN2ZjZTg1MjI1SjEKB2NyZXdfaWQSJgokYjk4YjVhMDAtNWEyNS00MTA3LWI0MDUt
-      YWJmMzIwYThmM2E4Si4KCHRhc2tfa2V5EiIKIGEyNzdiMzRiMmMxNDZmMGM1NmM1ZTEzNTZlOGY4
-      YTU3SjEKB3Rhc2tfaWQSJgokMjJiZWMyMzEtY2QyMS00YzU4LTgyN2UtMDU4MWE4ZjBjMTExegIY
-      AYUBAAEAABKQAQoQXyeDtJDFnyp2Fjk9YEGTpxIIaNE7gbhPNYcqClRvb2wgVXNhZ2UwATkaXTvj
-      bygVGEGvx0rjbygVGEoaCg5jcmV3YWlfdmVyc2lvbhIICgYwLjg2LjBKHAoJdG9vbF9uYW1lEg8K
-      DUdldCBHcmVldGluZ3NKDgoIYXR0ZW1wdHMSAhgBegIYAYUBAAEAABLVBwoQMWfznt0qwauEzl7T
-      UOQxRBII9q+pUS5EdLAqDENyZXcgQ3JlYXRlZDABORONPORvKBUYQSAoS+RvKBUYShoKDmNyZXdh
-      aV92ZXJzaW9uEggKBjAuODYuMEoaCg5weXRob25fdmVyc2lvbhIICgYzLjEyLjdKLgoIY3Jld19r
-      ZXkSIgogYzMwNzYwMDkzMjY3NjE0NDRkNTdjNzFkMWRhM2YyN2NKMQoHY3Jld19pZBImCiQ3OTQw
-      MTkyNS1iOGU5LTQ3MDgtODUzMC00NDhhZmEzYmY4YjBKHAoMY3Jld19wcm9jZXNzEgwKCnNlcXVl
-      bnRpYWxKEQoLY3Jld19tZW1vcnkSAhAAShoKFGNyZXdfbnVtYmVyX29mX3Rhc2tzEgIYAUobChVj
-      cmV3X251bWJlcl9vZl9hZ2VudHMSAhgBSuoCCgtjcmV3X2FnZW50cxLaAgrXAlt7ImtleSI6ICI5
-      OGYzYjFkNDdjZTk2OWNmMDU3NzI3Yjc4NDE0MjVjZCIsICJpZCI6ICI5OTJkZjYyZi1kY2FiLTQy
-      OTUtOTIwNi05MDBkNDExNGIxZTkiLCAicm9sZSI6ICJGcmllbmRseSBOZWlnaGJvciIsICJ2ZXJi
-      b3NlPyI6IGZhbHNlLCAibWF4X2l0ZXIiOiAyMCwgIm1heF9ycG0iOiBudWxsLCAiZnVuY3Rpb25f
-      Y2FsbGluZ19sbG0iOiAiIiwgImxsbSI6ICJncHQtNG8tbWluaSIsICJkZWxlZ2F0aW9uX2VuYWJs
-      ZWQ/IjogZmFsc2UsICJhbGxvd19jb2RlX2V4ZWN1dGlvbj8iOiBmYWxzZSwgIm1heF9yZXRyeV9s
-      aW1pdCI6IDIsICJ0b29sc19uYW1lcyI6IFsiZGVjaWRlIGdyZWV0aW5ncyJdfV1KmAIKCmNyZXdf
-      dGFza3MSiQIKhgJbeyJrZXkiOiAiODBkN2JjZDQ5MDk5MjkwMDgzODMyZjBlOTgzMzgwZGYiLCAi
-      aWQiOiAiMmZmNjE5N2UtYmEyNy00YjczLWI0YTctNGZhMDQ4ZTYyYjQ3IiwgImFzeW5jX2V4ZWN1
-      dGlvbj8iOiBmYWxzZSwgImh1bWFuX2lucHV0PyI6IGZhbHNlLCAiYWdlbnRfcm9sZSI6ICJGcmll
-      bmRseSBOZWlnaGJvciIsICJhZ2VudF9rZXkiOiAiOThmM2IxZDQ3Y2U5NjljZjA1NzcyN2I3ODQx
-      NDI1Y2QiLCAidG9vbHNfbmFtZXMiOiBbImRlY2lkZSBncmVldGluZ3MiXX1degIYAYUBAAEAABKO
-      AgoQnjTp5boK7/+DQxztYIpqihIIgGnMUkBtzHEqDFRhc2sgQ3JlYXRlZDABOcpYcuRvKBUYQalE
-      c+RvKBUYSi4KCGNyZXdfa2V5EiIKIGMzMDc2MDA5MzI2NzYxNDQ0ZDU3YzcxZDFkYTNmMjdjSjEK
-      B2NyZXdfaWQSJgokNzk0MDE5MjUtYjhlOS00NzA4LTg1MzAtNDQ4YWZhM2JmOGIwSi4KCHRhc2tf
-      a2V5EiIKIDgwZDdiY2Q0OTA5OTI5MDA4MzgzMmYwZTk4MzM4MGRmSjEKB3Rhc2tfaWQSJgokMmZm
-      NjE5N2UtYmEyNy00YjczLWI0YTctNGZhMDQ4ZTYyYjQ3egIYAYUBAAEAABKTAQoQ26H9pLUgswDN
-      p9XhJwwL6BIIx3bw7mAvPYwqClRvb2wgVXNhZ2UwATmy7NPlbygVGEEvb+HlbygVGEoaCg5jcmV3
-      YWlfdmVyc2lvbhIICgYwLjg2LjBKHwoJdG9vbF9uYW1lEhIKEERlY2lkZSBHcmVldGluZ3NKDgoI
-      YXR0ZW1wdHMSAhgBegIYAYUBAAEAAA==
-    headers:
-      Accept:
-      - '*/*'
-      Accept-Encoding:
-      - gzip, deflate
-      Connection:
-      - keep-alive
-      Content-Length:
-      - '2986'
-      Content-Type:
-      - application/x-protobuf
-      User-Agent:
-      - OTel-OTLP-Exporter-Python/1.27.0
-    method: POST
-    uri: https://telemetry.crewai.com:4319/v1/traces
-  response:
-    body:
-      string: "\n\0"
-    headers:
-      Content-Length:
-      - '2'
-      Content-Type:
-      - application/x-protobuf
-      Date:
-      - Fri, 27 Dec 2024 22:14:53 GMT
-    status:
-      code: 200
-      message: OK
- request:
-    body: '{"messages": [{"role": "system", "content": "You are test role. test backstory\nYour
-      personal goal is: test goal\nTo give my best complete final answer to the task
-      use the exact following format:\n\nThought: I now can give a great answer\nFinal
-      Answer: Your final answer must be the great and the most complete as possible,
-      it must be outcome described.\n\nI MUST use these formats, my job depends on
-      it!"}, {"role": "user", "content": "\nCurrent Task: Say the word: Hi\n\nThis
-      is the expect criteria for your final answer: The word: Hi\nyou MUST return
-      the actual complete content as the final answer, not a summary.\n\nBegin! This
-      is VERY important to you, use the tools available and give your best Final Answer,
-      your job depends on it!\n\nThought:"}], "model": "gpt-4o-mini", "stop": ["\nObservation:"],
-      "stream": false}'
-    headers:
-      accept:
-      - application/json
-      accept-encoding:
-      - gzip, deflate
-      connection:
-      - keep-alive
-      content-length:
-      - '824'
-      content-type:
-      - application/json
-      cookie:
-      - _cfuvid=ePJSDFdHag2D8lj21_ijAMWjoA6xfnPNxN4uekvC728-1727226247743-0.0.1.1-604800000
-      host:
-      - api.openai.com
-      user-agent:
-      - OpenAI/Python 1.52.1
-      x-stainless-arch:
-      - x64
-      x-stainless-async:
-      - 'false'
-      x-stainless-lang:
-      - python
-      x-stainless-os:
-      - Linux
-      x-stainless-package-version:
-      - 1.52.1
-      x-stainless-raw-response:
-      - 'true'
-      x-stainless-retry-count:
-      - '0'
-      x-stainless-runtime:
-      - CPython
-      x-stainless-runtime-version:
-      - 3.12.7
-    method: POST
-    uri: https://api.openai.com/v1/chat/completions
-  response:
-    content: "{\n  \"id\": \"chatcmpl-AjCtZLLrWi8ZASpP9bz6HaCV7xBIn\",\n  \"object\":
-      \"chat.completion\",\n  \"created\": 1735337693,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
-      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
-      \"assistant\",\n        \"content\": \"I now can give a great answer  \\nFinal
-      Answer: Hi\",\n        \"refusal\": null\n      },\n      \"logprobs\": null,\n
-      \     \"finish_reason\": \"stop\"\n    }\n  ],\n  \"usage\": {\n    \"prompt_tokens\":
-      158,\n    \"completion_tokens\": 12,\n    \"total_tokens\": 170,\n    \"prompt_tokens_details\":
-      {\n      \"cached_tokens\": 0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
-      {\n      \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
-      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"system_fingerprint\":
-      \"fp_0aa8d3e20b\"\n}\n"
-    headers:
-      CF-Cache-Status:
-      - DYNAMIC
-      CF-RAY:
-      - 8f8caa83deca756b-SEA
-      Connection:
-      - keep-alive
-      Content-Encoding:
-      - gzip
-      Content-Type:
-      - application/json
-      Date:
-      - Fri, 27 Dec 2024 22:14:53 GMT
-      Server:
-      - cloudflare
-      Set-Cookie:
-      - __cf_bm=wJkq_yLkzE3OdxE0aMJz.G0kce969.9JxRmZ0ratl4c-1735337693-1.0.1.1-OKpUoRrSPFGvWv5Hp5ET1PNZ7iZNHPKEAuakpcQUxxPSeisUIIR3qIOZ31MGmYugqB5.wkvidgbxOAagqJvmnw;
-        path=/; expires=Fri, 27-Dec-24 22:44:53 GMT; domain=.api.openai.com; HttpOnly;
-        Secure; SameSite=None
-      - _cfuvid=A_ASCLNAVfQoyucWOAIhecWtEpNotYoZr0bAFihgNxs-1735337693273-0.0.1.1-604800000;
-        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
-      Transfer-Encoding:
-      - chunked
-      X-Content-Type-Options:
-      - nosniff
-      access-control-expose-headers:
-      - X-Request-ID
-      alt-svc:
-      - h3=":443"; ma=86400
-      openai-organization:
-      - crewai-iuxna1
-      openai-processing-ms:
-      - '404'
-      openai-version:
-      - '2020-10-01'
-      strict-transport-security:
-      - max-age=31536000; includeSubDomains; preload
-      x-ratelimit-limit-requests:
-      - '30000'
-      x-ratelimit-limit-tokens:
-      - '150000000'
-      x-ratelimit-remaining-requests:
-      - '29999'
-      x-ratelimit-remaining-tokens:
-      - '149999816'
-      x-ratelimit-reset-requests:
-      - 2ms
-      x-ratelimit-reset-tokens:
-      - 0s
-      x-request-id:
-      - req_6ac84634bff9193743c4b0911c09b4a6
-    http_version: HTTP/1.1
-    status_code: 200
- request:
-    body: '{"messages": [{"role": "system", "content": "Determine if the following
-      feedback indicates that the user is satisfied or if further changes are needed.
-      Respond with ''True'' if further changes are needed, or ''False'' if the user
-      is satisfied. **Important** Do not include any additional commentary outside
-      of your ''True'' or ''False'' response.\n\nFeedback: \"Don''t say hi, say Hello
-      instead!\""}], "model": "gpt-4o-mini", "stop": ["\nObservation:"], "stream":
-      false}'
-    headers:
-      accept:
-      - application/json
-      accept-encoding:
-      - gzip, deflate
-      connection:
-      - keep-alive
-      content-length:
-      - '461'
-      content-type:
-      - application/json
-      cookie:
-      - _cfuvid=A_ASCLNAVfQoyucWOAIhecWtEpNotYoZr0bAFihgNxs-1735337693273-0.0.1.1-604800000;
-        __cf_bm=wJkq_yLkzE3OdxE0aMJz.G0kce969.9JxRmZ0ratl4c-1735337693-1.0.1.1-OKpUoRrSPFGvWv5Hp5ET1PNZ7iZNHPKEAuakpcQUxxPSeisUIIR3qIOZ31MGmYugqB5.wkvidgbxOAagqJvmnw
-      host:
-      - api.openai.com
-      user-agent:
-      - OpenAI/Python 1.52.1
-      x-stainless-arch:
-      - x64
-      x-stainless-async:
-      - 'false'
-      x-stainless-lang:
-      - python
-      x-stainless-os:
-      - Linux
-      x-stainless-package-version:
-      - 1.52.1
-      x-stainless-raw-response:
-      - 'true'
-      x-stainless-retry-count:
-      - '0'
-      x-stainless-runtime:
-      - CPython
-      x-stainless-runtime-version:
-      - 3.12.7
-    method: POST
-    uri: https://api.openai.com/v1/chat/completions
-  response:
-    content: "{\n  \"id\": \"chatcmpl-AjCtZNlWdrrPZhq0MJDqd16sMuQEJ\",\n  \"object\":
-      \"chat.completion\",\n  \"created\": 1735337693,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
-      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
-      \"assistant\",\n        \"content\": \"True\",\n        \"refusal\": null\n
-      \     },\n      \"logprobs\": null,\n      \"finish_reason\": \"stop\"\n    }\n
-      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 78,\n    \"completion_tokens\":
-      1,\n    \"total_tokens\": 79,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
-      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
-      \     \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
-      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"system_fingerprint\":
-      \"fp_0aa8d3e20b\"\n}\n"
-    headers:
-      CF-Cache-Status:
-      - DYNAMIC
-      CF-RAY:
-      - 8f8caa87094f756b-SEA
-      Connection:
-      - keep-alive
-      Content-Encoding:
-      - gzip
-      Content-Type:
-      - application/json
-      Date:
-      - Fri, 27 Dec 2024 22:14:53 GMT
-      Server:
-      - cloudflare
-      Transfer-Encoding:
-      - chunked
-      X-Content-Type-Options:
-      - nosniff
-      access-control-expose-headers:
-      - X-Request-ID
-      alt-svc:
-      - h3=":443"; ma=86400
-      openai-organization:
-      - crewai-iuxna1
-      openai-processing-ms:
-      - '156'
-      openai-version:
-      - '2020-10-01'
-      strict-transport-security:
-      - max-age=31536000; includeSubDomains; preload
-      x-ratelimit-limit-requests:
-      - '30000'
-      x-ratelimit-limit-tokens:
-      - '150000000'
-      x-ratelimit-remaining-requests:
-      - '29999'
-      x-ratelimit-remaining-tokens:
-      - '149999898'
-      x-ratelimit-reset-requests:
-      - 2ms
-      x-ratelimit-reset-tokens:
-      - 0s
-      x-request-id:
-      - req_ec74bef2a9ef7b2144c03fd7f7bbeab0
-    http_version: HTTP/1.1
-    status_code: 200
- request:
-    body: '{"messages": [{"role": "system", "content": "You are test role. test backstory\nYour
-      personal goal is: test goal\nTo give my best complete final answer to the task
-      use the exact following format:\n\nThought: I now can give a great answer\nFinal
-      Answer: Your final answer must be the great and the most complete as possible,
-      it must be outcome described.\n\nI MUST use these formats, my job depends on
-      it!"}, {"role": "user", "content": "\nCurrent Task: Say the word: Hi\n\nThis
-      is the expect criteria for your final answer: The word: Hi\nyou MUST return
-      the actual complete content as the final answer, not a summary.\n\nBegin! This
-      is VERY important to you, use the tools available and give your best Final Answer,
-      your job depends on it!\n\nThought:"}, {"role": "assistant", "content": "I now
-      can give a great answer  \nFinal Answer: Hi"}, {"role": "user", "content": "Feedback:
-      Don''t say hi, say Hello instead!"}], "model": "gpt-4o-mini", "stop": ["\nObservation:"],
-      "stream": false}'
-    headers:
-      accept:
-      - application/json
-      accept-encoding:
-      - gzip, deflate
-      connection:
-      - keep-alive
-      content-length:
-      - '986'
-      content-type:
-      - application/json
-      cookie:
-      - _cfuvid=A_ASCLNAVfQoyucWOAIhecWtEpNotYoZr0bAFihgNxs-1735337693273-0.0.1.1-604800000;
-        __cf_bm=wJkq_yLkzE3OdxE0aMJz.G0kce969.9JxRmZ0ratl4c-1735337693-1.0.1.1-OKpUoRrSPFGvWv5Hp5ET1PNZ7iZNHPKEAuakpcQUxxPSeisUIIR3qIOZ31MGmYugqB5.wkvidgbxOAagqJvmnw
-      host:
-      - api.openai.com
-      user-agent:
-      - OpenAI/Python 1.52.1
-      x-stainless-arch:
-      - x64
-      x-stainless-async:
-      - 'false'
-      x-stainless-lang:
-      - python
-      x-stainless-os:
-      - Linux
-      x-stainless-package-version:
-      - 1.52.1
-      x-stainless-raw-response:
-      - 'true'
-      x-stainless-retry-count:
-      - '0'
-      x-stainless-runtime:
-      - CPython
-      x-stainless-runtime-version:
-      - 3.12.7
-    method: POST
-    uri: https://api.openai.com/v1/chat/completions
-  response:
-    content: "{\n  \"id\": \"chatcmpl-AjCtZGv4f3h7GDdhyOy9G0sB1lRgC\",\n  \"object\":
-      \"chat.completion\",\n  \"created\": 1735337693,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
-      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
-      \"assistant\",\n        \"content\": \"Thought: I understand the feedback and
-      will adjust my response accordingly.  \\nFinal Answer: Hello\",\n        \"refusal\":
-      null\n      },\n      \"logprobs\": null,\n      \"finish_reason\": \"stop\"\n
-      \   }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 188,\n    \"completion_tokens\":
-      18,\n    \"total_tokens\": 206,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
-      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
-      \     \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
-      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"system_fingerprint\":
-      \"fp_0aa8d3e20b\"\n}\n"
-    headers:
-      CF-Cache-Status:
-      - DYNAMIC
-      CF-RAY:
-      - 8f8caa88cac4756b-SEA
-      Connection:
-      - keep-alive
-      Content-Encoding:
-      - gzip
-      Content-Type:
-      - application/json
-      Date:
-      - Fri, 27 Dec 2024 22:14:54 GMT
-      Server:
-      - cloudflare
-      Transfer-Encoding:
-      - chunked
-      X-Content-Type-Options:
-      - nosniff
-      access-control-expose-headers:
-      - X-Request-ID
-      alt-svc:
-      - h3=":443"; ma=86400
-      openai-organization:
-      - crewai-iuxna1
-      openai-processing-ms:
-      - '358'
-      openai-version:
-      - '2020-10-01'
-      strict-transport-security:
-      - max-age=31536000; includeSubDomains; preload
-      x-ratelimit-limit-requests:
-      - '30000'
-      x-ratelimit-limit-tokens:
-      - '150000000'
-      x-ratelimit-remaining-requests:
-      - '29999'
-      x-ratelimit-remaining-tokens:
-      - '149999793'
-      x-ratelimit-reset-requests:
-      - 2ms
-      x-ratelimit-reset-tokens:
-      - 0s
-      x-request-id:
-      - req_ae1ab6b206d28ded6fee3c83ed0c2ab7
-    http_version: HTTP/1.1
-    status_code: 200
- request:
-    body: '{"messages": [{"role": "system", "content": "Determine if the following
-      feedback indicates that the user is satisfied or if further changes are needed.
-      Respond with ''True'' if further changes are needed, or ''False'' if the user
-      is satisfied. **Important** Do not include any additional commentary outside
-      of your ''True'' or ''False'' response.\n\nFeedback: \"looks good\""}], "model":
-      "gpt-4o-mini", "stop": ["\nObservation:"], "stream": false}'
-    headers:
-      accept:
-      - application/json
-      accept-encoding:
-      - gzip, deflate
-      connection:
-      - keep-alive
-      content-length:
-      - '439'
-      content-type:
-      - application/json
-      cookie:
-      - _cfuvid=A_ASCLNAVfQoyucWOAIhecWtEpNotYoZr0bAFihgNxs-1735337693273-0.0.1.1-604800000;
-        __cf_bm=wJkq_yLkzE3OdxE0aMJz.G0kce969.9JxRmZ0ratl4c-1735337693-1.0.1.1-OKpUoRrSPFGvWv5Hp5ET1PNZ7iZNHPKEAuakpcQUxxPSeisUIIR3qIOZ31MGmYugqB5.wkvidgbxOAagqJvmnw
-      host:
-      - api.openai.com
-      user-agent:
-      - OpenAI/Python 1.52.1
-      x-stainless-arch:
-      - x64
-      x-stainless-async:
-      - 'false'
-      x-stainless-lang:
-      - python
-      x-stainless-os:
-      - Linux
-      x-stainless-package-version:
-      - 1.52.1
-      x-stainless-raw-response:
-      - 'true'
-      x-stainless-retry-count:
-      - '0'
-      x-stainless-runtime:
-      - CPython
-      x-stainless-runtime-version:
-      - 3.12.7
-    method: POST
-    uri: https://api.openai.com/v1/chat/completions
-  response:
-    content: "{\n  \"id\": \"chatcmpl-AjCtaiHL4TY8Dssk0j2miqmjrzquy\",\n  \"object\":
-      \"chat.completion\",\n  \"created\": 1735337694,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
-      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
-      \"assistant\",\n        \"content\": \"False\",\n        \"refusal\": null\n
-      \     },\n      \"logprobs\": null,\n      \"finish_reason\": \"stop\"\n    }\n
-      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 73,\n    \"completion_tokens\":
-      1,\n    \"total_tokens\": 74,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
-      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
-      \     \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
-      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"system_fingerprint\":
-      \"fp_0aa8d3e20b\"\n}\n"
-    headers:
-      CF-Cache-Status:
-      - DYNAMIC
-      CF-RAY:
-      - 8f8caa8bdd26756b-SEA
-      Connection:
-      - keep-alive
-      Content-Encoding:
-      - gzip
-      Content-Type:
-      - application/json
-      Date:
-      - Fri, 27 Dec 2024 22:14:54 GMT
-      Server:
-      - cloudflare
-      Transfer-Encoding:
-      - chunked
-      X-Content-Type-Options:
-      - nosniff
-      access-control-expose-headers:
-      - X-Request-ID
-      alt-svc:
-      - h3=":443"; ma=86400
-      openai-organization:
-      - crewai-iuxna1
-      openai-processing-ms:
-      - '184'
-      openai-version:
-      - '2020-10-01'
-      strict-transport-security:
-      - max-age=31536000; includeSubDomains; preload
-      x-ratelimit-limit-requests:
-      - '30000'
-      x-ratelimit-limit-tokens:
-      - '150000000'
-      x-ratelimit-remaining-requests:
-      - '29999'
-      x-ratelimit-remaining-tokens:
-      - '149999902'
-      x-ratelimit-reset-requests:
-      - 2ms
-      x-ratelimit-reset-tokens:
-      - 0s
-      x-request-id:
-      - req_652891f79c1104a7a8436275d78a69f1
-    http_version: HTTP/1.1
-    status_code: 200
-version: 1
--- a/tests/cassettes/test_agent_tool_role_matching[
+++ b/tests/cassettes/test_agent_tool_role_matching[
@@ -1,117 +0,0 @@
-interactions:
- request:
-    body: '{"messages": [{"role": "system", "content": "You are Futel Official Infopoint.
-      Futel Football Club info\nYour personal goal is: Answer questions about Futel\nTo
-      give my best complete final answer to the task respond using the exact following
-      format:\n\nThought: I now can give a great answer\nFinal Answer: Your final
-      answer must be the great and the most complete as possible, it must be outcome
-      described.\n\nI MUST use these formats, my job depends on it!"}, {"role": "user",
-      "content": "\nCurrent Task: Test task\n\nThis is the expect criteria for your
-      final answer: Your best answer to your coworker asking you this, accounting
-      for the context shared.\nyou MUST return the actual complete content as the
-      final answer, not a summary.\n\nBegin! This is VERY important to you, use the
-      tools available and give your best Final Answer, your job depends on it!\n\nThought:"}],
-      "model": "gpt-4o", "stop": ["\nObservation:"], "stream": false}'
-    headers:
-      accept:
-      - application/json
-      accept-encoding:
-      - gzip, deflate
-      connection:
-      - keep-alive
-      content-length:
-      - '939'
-      content-type:
-      - application/json
-      cookie:
-      - __cf_bm=cwWdOaPJjFMNJaLtJfa8Kjqavswg5bzVRFzBX4gneGw-1736458417-1.0.1.1-bvf2HshgcMtgn7GdxqwySFDAIacGccDFfEXniBFTTDmbGMCiIIwf6t2DiwWnBldmUHixwc5kDO9gYs08g.feBA;
-        _cfuvid=WMw7PSqkYqQOieguBRs0uNkwNU92A.ZKbgDbCAcV3EQ-1736458417825-0.0.1.1-604800000
-      host:
-      - api.openai.com
-      user-agent:
-      - OpenAI/Python 1.52.1
-      x-stainless-arch:
-      - arm64
-      x-stainless-async:
-      - 'false'
-      x-stainless-lang:
-      - python
-      x-stainless-os:
-      - MacOS
-      x-stainless-package-version:
-      - 1.52.1
-      x-stainless-raw-response:
-      - 'true'
-      x-stainless-retry-count:
-      - '0'
-      x-stainless-runtime:
-      - CPython
-      x-stainless-runtime-version:
-      - 3.12.7
-    method: POST
-    uri: https://api.openai.com/v1/chat/completions
-  response:
-    content: "{\n  \"id\": \"chatcmpl-AnuRlxiTxduAVoXHHY58Fvfbll5IS\",\n  \"object\":
-      \"chat.completion\",\n  \"created\": 1736458417,\n  \"model\": \"gpt-4o-2024-08-06\",\n
-      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
-      \"assistant\",\n        \"content\": \"I now can give a great answer  \\nFinal
-      Answer: This is a test task, and the context or question from the coworker is
-      not specified. Therefore, my best effort would be to affirm my readiness to
-      answer accurately and in detail any question about Futel Football Club based
-      on the context described. If provided with specific information or questions,
-      I will ensure to respond comprehensively as required by my job directives.\",\n
-      \       \"refusal\": null\n      },\n      \"logprobs\": null,\n      \"finish_reason\":
-      \"stop\"\n    }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 177,\n    \"completion_tokens\":
-      82,\n    \"total_tokens\": 259,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
-      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
-      \     \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
-      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"system_fingerprint\":
-      \"fp_703d4ff298\"\n}\n"
-    headers:
-      CF-Cache-Status:
-      - DYNAMIC
-      CF-RAY:
-      - 8ff78bf7bd6cc002-ATL
-      Connection:
-      - keep-alive
-      Content-Encoding:
-      - gzip
-      Content-Type:
-      - application/json
-      Date:
-      - Thu, 09 Jan 2025 21:33:40 GMT
-      Server:
-      - cloudflare
-      Transfer-Encoding:
-      - chunked
-      X-Content-Type-Options:
-      - nosniff
-      access-control-expose-headers:
-      - X-Request-ID
-      alt-svc:
-      - h3=":443"; ma=86400
-      openai-organization:
-      - crewai-iuxna1
-      openai-processing-ms:
-      - '2263'
-      openai-version:
-      - '2020-10-01'
-      strict-transport-security:
-      - max-age=31536000; includeSubDomains; preload
-      x-ratelimit-limit-requests:
-      - '10000'
-      x-ratelimit-limit-tokens:
-      - '30000000'
-      x-ratelimit-remaining-requests:
-      - '9999'
-      x-ratelimit-remaining-tokens:
-      - '29999786'
-      x-ratelimit-reset-requests:
-      - 6ms
-      x-ratelimit-reset-tokens:
-      - 0s
-      x-request-id:
-      - req_7c1a31da73cd103e9f410f908e59187f
-    http_version: HTTP/1.1
-    status_code: 200
-version: 1
--- a/tests/cassettes/test_agent_tool_role_matching[
+++ b/tests/cassettes/test_agent_tool_role_matching[
@@ -1,119 +0,0 @@
-interactions:
- request:
-    body: '{"messages": [{"role": "system", "content": "You are Futel Official Infopoint.
-      Futel Football Club info\nYour personal goal is: Answer questions about Futel\nTo
-      give my best complete final answer to the task respond using the exact following
-      format:\n\nThought: I now can give a great answer\nFinal Answer: Your final
-      answer must be the great and the most complete as possible, it must be outcome
-      described.\n\nI MUST use these formats, my job depends on it!"}, {"role": "user",
-      "content": "\nCurrent Task: Test task\n\nThis is the expect criteria for your
-      final answer: Your best answer to your coworker asking you this, accounting
-      for the context shared.\nyou MUST return the actual complete content as the
-      final answer, not a summary.\n\nBegin! This is VERY important to you, use the
-      tools available and give your best Final Answer, your job depends on it!\n\nThought:"}],
-      "model": "gpt-4o", "stop": ["\nObservation:"], "stream": false}'
-    headers:
-      accept:
-      - application/json
-      accept-encoding:
-      - gzip, deflate
-      connection:
-      - keep-alive
-      content-length:
-      - '939'
-      content-type:
-      - application/json
-      cookie:
-      - __cf_bm=cwWdOaPJjFMNJaLtJfa8Kjqavswg5bzVRFzBX4gneGw-1736458417-1.0.1.1-bvf2HshgcMtgn7GdxqwySFDAIacGccDFfEXniBFTTDmbGMCiIIwf6t2DiwWnBldmUHixwc5kDO9gYs08g.feBA;
-        _cfuvid=WMw7PSqkYqQOieguBRs0uNkwNU92A.ZKbgDbCAcV3EQ-1736458417825-0.0.1.1-604800000
-      host:
-      - api.openai.com
-      user-agent:
-      - OpenAI/Python 1.52.1
-      x-stainless-arch:
-      - arm64
-      x-stainless-async:
-      - 'false'
-      x-stainless-lang:
-      - python
-      x-stainless-os:
-      - MacOS
-      x-stainless-package-version:
-      - 1.52.1
-      x-stainless-raw-response:
-      - 'true'
-      x-stainless-retry-count:
-      - '0'
-      x-stainless-runtime:
-      - CPython
-      x-stainless-runtime-version:
-      - 3.12.7
-    method: POST
-    uri: https://api.openai.com/v1/chat/completions
-  response:
-    content: "{\n  \"id\": \"chatcmpl-AnuRrFJZGKw8cIEshvuW1PKwFZFKs\",\n  \"object\":
-      \"chat.completion\",\n  \"created\": 1736458423,\n  \"model\": \"gpt-4o-2024-08-06\",\n
-      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
-      \"assistant\",\n        \"content\": \"I now can give a great answer  \\nFinal
-      Answer: Although you mentioned this being a \\\"Test task\\\" and haven't provided
-      a specific question regarding Futel Football Club, your request appears to involve
-      ensuring accuracy and detail in responses. For a proper answer about Futel,
-      I'd be ready to provide details about the club's history, management, players,
-      match schedules, and recent performance statistics. Remember to ask specific
-      questions to receive a targeted response. If this were a real context where
-      information was shared, I would respond precisely to what's been asked regarding
-      Futel Football Club.\",\n        \"refusal\": null\n      },\n      \"logprobs\":
-      null,\n      \"finish_reason\": \"stop\"\n    }\n  ],\n  \"usage\": {\n    \"prompt_tokens\":
-      177,\n    \"completion_tokens\": 113,\n    \"total_tokens\": 290,\n    \"prompt_tokens_details\":
-      {\n      \"cached_tokens\": 0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\":
-      {\n      \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
-      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"system_fingerprint\":
-      \"fp_703d4ff298\"\n}\n"
-    headers:
-      CF-Cache-Status:
-      - DYNAMIC
-      CF-RAY:
-      - 8ff78c1d0ecdc002-ATL
-      Connection:
-      - keep-alive
-      Content-Encoding:
-      - gzip
-      Content-Type:
-      - application/json
-      Date:
-      - Thu, 09 Jan 2025 21:33:47 GMT
-      Server:
-      - cloudflare
-      Transfer-Encoding:
-      - chunked
-      X-Content-Type-Options:
-      - nosniff
-      access-control-expose-headers:
-      - X-Request-ID
-      alt-svc:
-      - h3=":443"; ma=86400
-      openai-organization:
-      - crewai-iuxna1
-      openai-processing-ms:
-      - '3097'
-      openai-version:
-      - '2020-10-01'
-      strict-transport-security:
-      - max-age=31536000; includeSubDomains; preload
-      x-ratelimit-limit-requests:
-      - '10000'
-      x-ratelimit-limit-tokens:
-      - '30000000'
-      x-ratelimit-remaining-requests:
-      - '9999'
-      x-ratelimit-remaining-tokens:
-      - '29999786'
-      x-ratelimit-reset-requests:
-      - 6ms
-      x-ratelimit-reset-tokens:
-      - 0s
-      x-request-id:
-      - req_179e1d56e2b17303e40480baffbc7b08
-    http_version: HTTP/1.1
-    status_code: 200
-version: 1
--- a/tests/cassettes/test_agent_with_knowledge_sources_works_with_copy.yaml
+++ b/tests/cassettes/test_agent_with_knowledge_sources_works_with_copy.yaml
--- a/tests/cassettes/test_crew_with_knowledge_sources_works_with_copy.yaml
+++ b/tests/cassettes/test_crew_with_knowledge_sources_works_with_copy.yaml
--- a/tests/cassettes/test_deepseek_r1_with_open_router.yaml
+++ b/tests/cassettes/test_deepseek_r1_with_open_router.yaml
@@ -0,0 +1,100 @@
+interactions:
+- request:
+    body: '{"model": "deepseek/deepseek-r1", "messages": [{"role": "user", "content":
+      "What is the capital of France?"}], "stop": [], "stream": false}'
+    headers:
+      accept:
+      - '*/*'
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '139'
+      host:
+      - openrouter.ai
+      http-referer:
+      - https://litellm.ai
+      user-agent:
+      - litellm/1.60.2
+      x-title:
+      - liteLLM
+    method: POST
+    uri: https://openrouter.ai/api/v1/chat/completions
+  response:
+    content: "\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n\n
+      \        \n\n         \n\n         \n\n         \n\n         \n\n         \n{\"id\":\"gen-1738684300-YnD5WOSczQWsW0vQG78a\",\"provider\":\"Nebius\",\"model\":\"deepseek/deepseek-r1\",\"object\":\"chat.completion\",\"created\":1738684300,\"choices\":[{\"logprobs\":null,\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"The
+      capital of France is **Paris**. Known for its iconic landmarks such as the Eiffel
+      Tower, Notre-Dame Cathedral, and the Louvre Museum, Paris has served as the
+      political and cultural center of France for centuries. \U0001F1EB\U0001F1F7\",\"refusal\":null}}],\"usage\":{\"prompt_tokens\":10,\"completion_tokens\":261,\"total_tokens\":271}}"
+    headers:
+      Access-Control-Allow-Origin:
+      - '*'
+      CF-RAY:
+      - 90cbd2ceaf3ead5e-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:51:40 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      Vary:
+      - Accept-Encoding
+      x-clerk-auth-message:
+      - Invalid JWT form. A JWT consists of three parts separated by dots. (reason=token-invalid,
+        token-carrier=header)
+      x-clerk-auth-reason:
+      - token-invalid
+      x-clerk-auth-status:
+      - signed-out
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_high.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_high.yaml
@@ -0,0 +1,107 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "high", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '137'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=etTqqA9SBOnENmrFAUBIexdW0v2ZeO1x9_Ek_WChlfU-1737568920137-0.0.1.1-604800000
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFNUz7l4pwtY9xhFSPIGlwNfE4Sj\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738683828,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      81,\n    \"total_tokens\": 94,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 64,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbc745d91fb0ca-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:43:50 GMT
+      Server:
+      - cloudflare
+      Set-Cookie:
+      - __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw;
+        path=/; expires=Tue, 04-Feb-25 16:13:50 GMT; domain=.api.openai.com; HttpOnly;
+        Secure; SameSite=None
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '2169'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_163e7bd79cb5a5e62d4688245b97d1d9
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_low.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_low.yaml
@@ -0,0 +1,102 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "low", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '136'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFNWljEYFrf5qRwYj73OPQtAnPbF\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738683830,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      17,\n    \"total_tokens\": 30,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbc7551fe0b0ca-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:43:51 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '1103'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_fd7178a0e5060216d04f3bd023e8bca1
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_o3_mini_reasoning_effort_medium.yaml
+++ b/tests/cassettes/test_o3_mini_reasoning_effort_medium.yaml
@@ -0,0 +1,102 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "What is the capital of France?"}],
+      "model": "o3-mini", "reasoning_effort": "medium", "stop": []}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '139'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=JBfx8Sl7w82A0S_K1tQd5ZcwzWaZP5Gg5W1dqAdgwNU-1738683830528-0.0.1.1-604800000;
+        __cf_bm=.AP74BirsYr.lu61bSaimK2HRF6126qr5vCrr3HC6ak-1738683830-1.0.1.1-feh.bcMOv9wYnitoPpr.7UR7JrzCsbRLlzct09xCDm2SwmnRQQk5ZSSV41Ywer2S0rptbvufFwklV9wo9ATvWw
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-AxFS8IuMeYs6Rky2UbG8wH8P5PR4k\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1738684116,\n  \"model\": \"o3-mini-2025-01-31\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": \"The capital of France is Paris.\",\n
+      \       \"refusal\": null\n      },\n      \"finish_reason\": \"stop\"\n    }\n
+      \ ],\n  \"usage\": {\n    \"prompt_tokens\": 13,\n    \"completion_tokens\":
+      145,\n    \"total_tokens\": 158,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\":
+      0,\n      \"audio_tokens\": 0\n    },\n    \"completion_tokens_details\": {\n
+      \     \"reasoning_tokens\": 128,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\":
+      0,\n      \"rejected_prediction_tokens\": 0\n    }\n  },\n  \"service_tier\":
+      \"default\",\n  \"system_fingerprint\": \"fp_8bcaa0ca21\"\n}\n"
+    headers:
+      CF-RAY:
+      - 90cbce51b946afb4-ATL
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 04 Feb 2025 15:48:39 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '2365'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999974'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_bfd83679e674c3894991477f1fb043b2
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cassettes/test_tool_execution_error_event.yaml
+++ b/tests/cassettes/test_tool_execution_error_event.yaml
@@ -0,0 +1,112 @@
+interactions:
+- request:
+    body: '{"messages": [{"role": "user", "content": "Use the failing tool"}], "model":
+      "gpt-4o-mini", "stop": [], "tools": [{"type": "function", "function": {"name":
+      "failing_tool", "description": "This tool always fails.", "parameters": {"type":
+      "object", "properties": {"param": {"type": "string", "description": "A test
+      parameter"}}, "required": ["param"]}}}]}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '353'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.61.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.61.0
+      x-stainless-raw-response:
+      - 'true'
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.8
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    content: "{\n  \"id\": \"chatcmpl-B2P4zoJZuES7Aom8ugEq1modz5Vsl\",\n  \"object\":
+      \"chat.completion\",\n  \"created\": 1739912761,\n  \"model\": \"gpt-4o-mini-2024-07-18\",\n
+      \ \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\":
+      \"assistant\",\n        \"content\": null,\n        \"tool_calls\": [\n          {\n
+      \           \"id\": \"call_F6fJxISpMKUBIGV6dd2vjRNG\",\n            \"type\":
+      \"function\",\n            \"function\": {\n              \"name\": \"failing_tool\",\n
+      \             \"arguments\": \"{\\\"param\\\":\\\"test\\\"}\"\n            }\n
+      \         }\n        ],\n        \"refusal\": null\n      },\n      \"logprobs\":
+      null,\n      \"finish_reason\": \"tool_calls\"\n    }\n  ],\n  \"usage\": {\n
+      \   \"prompt_tokens\": 51,\n    \"completion_tokens\": 15,\n    \"total_tokens\":
+      66,\n    \"prompt_tokens_details\": {\n      \"cached_tokens\": 0,\n      \"audio_tokens\":
+      0\n    },\n    \"completion_tokens_details\": {\n      \"reasoning_tokens\":
+      0,\n      \"audio_tokens\": 0,\n      \"accepted_prediction_tokens\": 0,\n      \"rejected_prediction_tokens\":
+      0\n    }\n  },\n  \"service_tier\": \"default\",\n  \"system_fingerprint\":
+      \"fp_00428b782a\"\n}\n"
+    headers:
+      CF-RAY:
+      - 9140fa827f38eb1e-SJC
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 18 Feb 2025 21:06:02 GMT
+      Server:
+      - cloudflare
+      Set-Cookie:
+      - __cf_bm=xbuu3IQpCMh.43ZrqL1TRMECOc6QldgHV0hzOX1GrWI-1739912762-1.0.1.1-t7iyq5xMioPrwfeaHLvPT9rwRPp7Q9A9uIm69icH9dPxRD4xMA3cWqb1aXj1_e2IyAEQQWFe1UWjlmJ22aHh3Q;
+        path=/; expires=Tue, 18-Feb-25 21:36:02 GMT; domain=.api.openai.com; HttpOnly;
+        Secure; SameSite=None
+      - _cfuvid=x9l.Rhja8_wXDN.j8qcEU1PvvEqAwZp4Fd3s_aj4qwM-1739912762161-0.0.1.1-604800000;
+        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '861'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999978'
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_8666ec3aa6677cb346ba00993556051d
+    http_version: HTTP/1.1
+    status_code: 200
+version: 1
--- a/tests/cli/cli_test.py
+++ b/tests/cli/cli_test.py
@@ -55,72 +55,83 @@ def test_train_invalid_string_iterations(train_crew, runner):
    )


-@mock.patch("crewai.cli.reset_memories_command.ShortTermMemory")
-@mock.patch("crewai.cli.reset_memories_command.EntityMemory")
-@mock.patch("crewai.cli.reset_memories_command.LongTermMemory")
-@mock.patch("crewai.cli.reset_memories_command.TaskOutputStorageHandler")
-def test_reset_all_memories(
-    MockTaskOutputStorageHandler,
-    MockLongTermMemory,
-    MockEntityMemory,
-    MockShortTermMemory,
-    runner,
-):
-    result = runner.invoke(reset_memories, ["--all"])
-    MockShortTermMemory().reset.assert_called_once()
-    MockEntityMemory().reset.assert_called_once()
-    MockLongTermMemory().reset.assert_called_once()
-    MockTaskOutputStorageHandler().reset.assert_called_once()
+@mock.patch("crewai.cli.reset_memories_command.get_crew")
+def test_reset_all_memories(mock_get_crew, runner):
+    mock_crew = mock.Mock()
+    mock_get_crew.return_value = mock_crew
+    result = runner.invoke(reset_memories, ["-a"])

+    mock_crew.reset_memories.assert_called_once_with(command_type="all")
    assert result.output == "All memories have been reset.\n"


-@mock.patch("crewai.cli.reset_memories_command.ShortTermMemory")
-def test_reset_short_term_memories(MockShortTermMemory, runner):
+@mock.patch("crewai.cli.reset_memories_command.get_crew")
+def test_reset_short_term_memories(mock_get_crew, runner):
+    mock_crew = mock.Mock()
+    mock_get_crew.return_value = mock_crew
    result = runner.invoke(reset_memories, ["-s"])
-    MockShortTermMemory().reset.assert_called_once()
+
+    mock_crew.reset_memories.assert_called_once_with(command_type="short")
    assert result.output == "Short term memory has been reset.\n"


-@mock.patch("crewai.cli.reset_memories_command.EntityMemory")
-def test_reset_entity_memories(MockEntityMemory, runner):
+@mock.patch("crewai.cli.reset_memories_command.get_crew")
+def test_reset_entity_memories(mock_get_crew, runner):
+    mock_crew = mock.Mock()
+    mock_get_crew.return_value = mock_crew
    result = runner.invoke(reset_memories, ["-e"])
-    MockEntityMemory().reset.assert_called_once()
+
+    mock_crew.reset_memories.assert_called_once_with(command_type="entity")
    assert result.output == "Entity memory has been reset.\n"


-@mock.patch("crewai.cli.reset_memories_command.LongTermMemory")
-def test_reset_long_term_memories(MockLongTermMemory, runner):
+@mock.patch("crewai.cli.reset_memories_command.get_crew")
+def test_reset_long_term_memories(mock_get_crew, runner):
+    mock_crew = mock.Mock()
+    mock_get_crew.return_value = mock_crew
    result = runner.invoke(reset_memories, ["-l"])
-    MockLongTermMemory().reset.assert_called_once()
+
+    mock_crew.reset_memories.assert_called_once_with(command_type="long")
    assert result.output == "Long term memory has been reset.\n"


-@mock.patch("crewai.cli.reset_memories_command.TaskOutputStorageHandler")
-def test_reset_kickoff_outputs(MockTaskOutputStorageHandler, runner):
+@mock.patch("crewai.cli.reset_memories_command.get_crew")
+def test_reset_kickoff_outputs(mock_get_crew, runner):
+    mock_crew = mock.Mock()
+    mock_get_crew.return_value = mock_crew
    result = runner.invoke(reset_memories, ["-k"])
-    MockTaskOutputStorageHandler().reset.assert_called_once()
+
+    mock_crew.reset_memories.assert_called_once_with(command_type="kickoff_outputs")
    assert result.output == "Latest Kickoff outputs stored has been reset.\n"


-@mock.patch("crewai.cli.reset_memories_command.ShortTermMemory")
-@mock.patch("crewai.cli.reset_memories_command.LongTermMemory")
-def test_reset_multiple_memory_flags(MockShortTermMemory, MockLongTermMemory, runner):
-    result = runner.invoke(
-        reset_memories,
-        [
-            "-s",
-            "-l",
-        ],
+@mock.patch("crewai.cli.reset_memories_command.get_crew")
+def test_reset_multiple_memory_flags(mock_get_crew, runner):
+    mock_crew = mock.Mock()
+    mock_get_crew.return_value = mock_crew
+    result = runner.invoke(reset_memories, ["-s", "-l"])
+
+    # Check that reset_memories was called twice with the correct arguments
+    assert mock_crew.reset_memories.call_count == 2
+    mock_crew.reset_memories.assert_has_calls(
+        [mock.call(command_type="long"), mock.call(command_type="short")]
    )
-    MockShortTermMemory().reset.assert_called_once()
-    MockLongTermMemory().reset.assert_called_once()
    assert (
        result.output
        == "Long term memory has been reset.\nShort term memory has been reset.\n"
    )


+@mock.patch("crewai.cli.reset_memories_command.get_crew")
+def test_reset_knowledge(mock_get_crew, runner):
+    mock_crew = mock.Mock()
+    mock_get_crew.return_value = mock_crew
+    result = runner.invoke(reset_memories, ["--knowledge"])
+
+    mock_crew.reset_memories.assert_called_once_with(command_type="knowledge")
+    assert result.output == "Knowledge has been reset.\n"
+
+
 def test_reset_no_memory_flags(runner):
    result = runner.invoke(
        reset_memories,
--- a/tests/config/tasks.yaml
+++ b/tests/config/tasks.yaml
@@ -2,7 +2,7 @@ research_task:
  description: >
    Conduct a thorough research about {topic}
    Make sure you find any interesting and relevant information given
-    the current year is 2024.
+    the current year is 2025.
  expected_output: >
    A list with 10 bullet points of the most relevant information about {topic}
  agent: researcher
--- a/Show More
+++ b/Show More
@@ -0,0 +1 @@
+from .agentops_listener import agentops_listener