Clean up end of docs

Remove overly complicated test
Merge branch 'main' into feature/procedure_v2
2026-02-05 21:48:13 +00:00 · 2024-07-31 10:57:09 -04:00 · 2024-07-31 10:12:37 -04:00 · 2024-07-31 09:49:48 -04:00 · 2024-07-30 19:21:18 -04:00 · 2024-07-30 10:59:50 -07:00
79 changed files with 458949 additions and 7937 deletions
--- a/README.md
+++ b/README.md
@@ -254,7 +254,7 @@ pip install dist/*.tar.gz

 CrewAI uses anonymous telemetry to collect usage data with the main purpose of helping us improve the library by focusing our efforts on the most used features, integrations and tools.

-There is NO data being collected on the prompts, tasks descriptions agents backstories or goals nor tools usage, no API calls, nor responses nor any data that is being processed by the agents, nor any secrets and env vars.
+It's pivotal to understand that **NO data is collected** concerning prompts, task descriptions, agents' backstories or goals, usage of tools, API calls, responses, any data processed by the agents, or secrets and environment variables, with the exception of the conditions mentioned. When the `share_crew` feature is enabled, detailed data including task descriptions, agents' backstories or goals, and other specific attributes are collected to provide deeper insights while respecting user privacy. We don't offer a way to disable it now, but we will in the future.

 Data collected includes:

@@ -279,7 +279,7 @@ Data collected includes:
 - Tools names available
  - Understand out of the publically available tools, which ones are being used the most so we can improve them

-Users can opt-in sharing the complete telemetry data by setting the `share_crew` attribute to `True` on their Crews.
+Users can opt in to Further Telemetry, sharing the complete telemetry data, by setting the `share_crew` attribute to `True` on their Crews. Enabling `share_crew` results in the collection of detailed crew and task execution data, including `goal`, `backstory`, `context`, and `output` of tasks. This enables a deeper insight into usage patterns while respecting the user's choice to share.

 ## License

--- a/docs/core-concepts/Agents.md
+++ b/docs/core-concepts/Agents.md
@@ -114,7 +114,7 @@ from langchain.agents import load_tools
 langchain_tools = load_tools(["google-serper"], llm=llm)

 agent1 = CustomAgent(
-    role="backstory agent",
+    role="agent role",
    goal="who is {input}?",
    backstory="agent backstory",
    verbose=True,
@@ -127,7 +127,7 @@ task1 = Task(
 )

 agent2 = Agent(
-    role="bio agent",
+    role="agent role",
    goal="summarize the short bio for {input} and if needed do more research",
    backstory="agent backstory",
    verbose=True,
--- a/docs/core-concepts/Crews.md
+++ b/docs/core-concepts/Crews.md
@@ -32,6 +32,8 @@ A crew in crewAI represents a collaborative group of agents working together to
 | **Manager Agent** _(optional)_        | `manager_agent`        | `manager` sets a custom agent that will be used as a manager.                                                                                                                                                                                             |
 | **Manager Callbacks** _(optional)_    | `manager_callbacks`    | `manager_callbacks` takes a list of callback handlers to be executed by the manager agent when a hierarchical process is used.                                                                                                                            |
 | **Prompt File** _(optional)_          | `prompt_file`          | Path to the prompt JSON file to be used for the crew.                                                                                                                                                                                                     |
+| **Planning** *(optional)*             | `planning`             | Adds planning ability to the Crew. When activated, before each Crew iteration all Crew data is sent to an AgentPlanner that plans the tasks, and this plan is added to each task description. |
+| **Planning LLM** *(optional)*         | `planning_llm`         | The language model used by the AgentPlanner in a planning process. |

 !!! note "Crew Max RPM"
 The `max_rpm` attribute sets the maximum number of requests per minute the crew can perform to avoid rate limits and will override individual agents' `max_rpm` settings if you set it.
@@ -45,6 +47,12 @@ When assembling a crew, you combine agents with complementary roles and tools, a
 ```python
 from crewai import Crew, Agent, Task, Process
 from langchain_community.tools import DuckDuckGoSearchRun
+from crewai_tools import tool
+
+@tool('DuckDuckGoSearch')
+def search(search_query: str):
+    """Search the web for information on a given topic"""
+    return DuckDuckGoSearchRun().run(search_query)

 # Define agents with specific roles and tools
 researcher = Agent(
@@ -55,7 +63,7 @@ researcher = Agent(
        to the business.
        You're currently working on a project to analyze the
        trends and innovations in the space of artificial intelligence.""",
-    tools=[DuckDuckGoSearchRun()]
+    tools=[search]
 )

 writer = Agent(
@@ -129,7 +137,7 @@ crew = Crew(
    verbose=2
 )

-result = crew.kickoff()
+crew_output = crew.kickoff()

 # Accessing the crew output
 print(f"Raw Output: {crew_output.raw}")
@@ -213,9 +221,9 @@ These methods provide flexibility in how you manage and execute tasks within you
 ### Replaying from specific task:
 You can now replay from a specific task using our cli command replay.

-The replay_from_tasks feature in CrewAI allows you to replay from a specific task using the command-line interface (CLI). By running the command `crewai replay -t <task_id>`, you can specify the `task_id` for the replay process.
+The replay feature in CrewAI allows you to replay from a specific task using the command-line interface (CLI). By running the command `crewai replay -t <task_id>`, you can specify the `task_id` for the replay process.

-Kickoffs will now save the latest kickoffs returned task outputs locally for you to be able to replay from. 
+Kickoffs now save the latest kickoff's returned task outputs locally so that you can replay from them.


 ### Replaying from specific task Using the CLI
@@ -236,4 +244,4 @@ crewai log-tasks-outputs
 crewai replay -t <task_id>
 ```

-These commands let you replay from your latest kickoff tasks, still retaining context from previously executed tasks.
+These commands let you replay from your latest kickoff tasks, still retaining context from previously executed tasks.
--- a/docs/core-concepts/Memory.md
+++ b/docs/core-concepts/Memory.md
@@ -29,6 +29,11 @@ description: Leveraging memory systems in the crewAI framework to enhance agent
 When configuring a crew, you can enable and customize each memory component to suit the crew's objectives and the nature of tasks it will perform.
 By default, the memory system is disabled, and you can ensure it is active by setting `memory=True` in the crew configuration. The memory will use OpenAI Embeddings by default, but you can change it by setting `embedder` to a different model.

+The `embedder` setting only applies to **Short-Term Memory**, which uses Chroma for RAG via the EmbedChain package.  
+The **Long-Term Memory** uses SQLite3 to store task results. Currently, there is no way to override these storage implementations.
+The data storage files are saved in a platform-specific location determined by the appdirs package
+and the name of the project, which can be overridden using the **CREWAI_STORAGE_DIR** environment variable.
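As a minimal sketch of the override described above, the variable can be set from Python before any crew is created. The value `my_project_storage` is a placeholder, and the idea that the variable has to be set before the memory storage is initialized is an assumption, not something the text confirms.

```python
import os

# Hypothetical storage name; replace it with one that suits your project.
# Per the text above, this overrides the project name used to build
# the platform-specific storage location.
os.environ["CREWAI_STORAGE_DIR"] = "my_project_storage"

# Confirm the value that this process will expose to the crew's memory setup
print(os.environ["CREWAI_STORAGE_DIR"])
```

Exporting the same variable in the shell that launches the crew should have the same effect, since both approaches end up in the process environment.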
+
 ### Example: Configuring Memory for a Crew

 ```python
@@ -161,10 +166,43 @@ my_crew = Crew(
 )
 ```

+### Resetting Memory
+```sh
+crewai reset_memories [OPTIONS]
+```
+
+#### Resetting Memory Options
+- **`-l, --long`**
+  - **Description:** Reset LONG TERM memory.
+  - **Type:** Flag (boolean)
+  - **Default:** False
+
+- **`-s, --short`**
+  - **Description:** Reset SHORT TERM memory.
+  - **Type:** Flag (boolean)
+  - **Default:** False
+
+- **`-e, --entities`**
+  - **Description:** Reset ENTITIES memory.
+  - **Type:** Flag (boolean)
+  - **Default:** False
+
+- **`-k, --kickoff-outputs`**
+  - **Description:** Reset LATEST KICKOFF TASK OUTPUTS.
+  - **Type:** Flag (boolean)
+  - **Default:** False
+
+- **`-a, --all`**
+  - **Description:** Reset ALL memories.
+  - **Type:** Flag (boolean)
+  - **Default:** False
+
+
+
 ## Benefits of Using crewAI's Memory System
 - **Adaptive Learning:** Crews become more efficient over time, adapting to new information and refining their approach to tasks.
 - **Enhanced Personalization:** Memory enables agents to remember user preferences and historical interactions, leading to personalized experiences.
 - **Improved Problem Solving:** Access to a rich memory store aids agents in making more informed decisions, drawing on past learnings and contextual insights.

 ## Getting Started
-Integrating crewAI's memory system into your projects is straightforward. By leveraging the provided memory components and configurations, you can quickly empower your agents with the ability to remember, reason, and learn from their interactions, unlocking new levels of intelligence and capability.
+Integrating crewAI's memory system into your projects is straightforward. By leveraging the provided memory components and configurations, you can quickly empower your agents with the ability to remember, reason, and learn from their interactions, unlocking new levels of intelligence and capability.
--- a/docs/core-concepts/Pipeline.md
+++ b/docs/core-concepts/Pipeline.md
@@ -0,0 +1,196 @@
+---
+title: crewAI Pipelines
+description: Understanding and utilizing pipelines in the crewAI framework for efficient multi-stage task processing.
+---
+
+## What is a Pipeline?
+
+A pipeline in crewAI represents a structured workflow that allows for the sequential or parallel execution of multiple crews. It provides a way to organize complex processes involving multiple stages, where the output of one stage can serve as input for subsequent stages.
+
+## Key Terminology
+
+Understanding the following terms is crucial for working effectively with pipelines:
+
+- **Stage**: A distinct part of the pipeline, which can be either sequential (a single crew) or parallel (multiple crews executing concurrently).
+- **Run**: A specific execution of the pipeline for a given set of inputs, representing a single instance of processing through the pipeline.
+- **Branch**: Parallel executions within a stage (e.g., concurrent crew operations).
+- **Trace**: The journey of an individual input through the entire pipeline, capturing the path and transformations it undergoes.
+
+Example pipeline structure:
+
+```
+crew1 >> [crew2, crew3] >> crew4
+```
+
+This represents a pipeline with three stages:
+
+1. A sequential stage (crew1)
+2. A parallel stage with two branches (crew2 and crew3 executing concurrently)
+3. Another sequential stage (crew4)
+
+Each input creates its own run, flowing through all stages of the pipeline. Multiple runs can be processed concurrently, each following the defined pipeline structure.
+
+## Pipeline Attributes
+
+| Attribute  | Parameters | Description                                                                           |
+| :--------- | :--------- | :------------------------------------------------------------------------------------ |
+| **Stages** | `stages`   | A list of crews or lists of crews representing the stages to be executed in sequence. |
+
+## Creating a Pipeline
+
+When creating a pipeline, you define a series of stages, each consisting of either a single crew or a list of crews for parallel execution. The pipeline ensures that each stage is executed in order, with the output of one stage feeding into the next.
+
+### Example: Assembling a Pipeline
+
+```python
+from crewai import Crew, Agent, Task, Pipeline, Process
+
+# Define your crews
+research_crew = Crew(
+    agents=[researcher],
+    tasks=[research_task],
+    process=Process.sequential
+)
+
+analysis_crew = Crew(
+    agents=[analyst],
+    tasks=[analysis_task],
+    process=Process.sequential
+)
+
+writing_crew = Crew(
+    agents=[writer],
+    tasks=[writing_task],
+    process=Process.sequential
+)
+
+# Assemble the pipeline
+my_pipeline = Pipeline(
+    stages=[research_crew, analysis_crew, writing_crew]
+)
+```
+
+## Pipeline Methods
+
+| Method           | Description                                                                                                                                                                    |
+| :--------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **process_runs** | Executes the pipeline, processing all stages and returning the results. This method initiates one or more runs through the pipeline, handling the flow of data between stages. |
+
+## Pipeline Output
+
+!!! note "Understanding Pipeline Outputs"
+The output of a pipeline in the crewAI framework is encapsulated within two main classes: `PipelineOutput` and `PipelineRunResult`. These classes provide a structured way to access the results of the pipeline's execution, including various formats such as raw strings, JSON, and Pydantic models.
+
+### Pipeline Output Attributes
+
+| Attribute       | Parameters    | Type                      | Description                                                                                               |
+| :-------------- | :------------ | :------------------------ | :-------------------------------------------------------------------------------------------------------- |
+| **ID**          | `id`          | `UUID4`                   | A unique identifier for the pipeline output.                                                              |
+| **Run Results** | `run_results` | `List[PipelineRunResult]` | A list of `PipelineRunResult` objects, each representing the output of a single run through the pipeline. |
+
+### Pipeline Output Methods
+
+| Method/Property    | Description                                            |
+| :----------------- | :----------------------------------------------------- |
+| **add_run_result** | Adds a `PipelineRunResult` to the list of run results. |
+
+### Pipeline Run Result Attributes
+
+| Attribute         | Parameters      | Type                       | Description                                                                                   |
+| :---------------- | :-------------- | :------------------------- | :-------------------------------------------------------------------------------------------- |
+| **ID**            | `id`            | `UUID4`                    | A unique identifier for the run result.                                                       |
+| **Raw**           | `raw`           | `str`                      | The raw output of the final stage in the pipeline run.                                        |
+| **Pydantic**      | `pydantic`      | `Optional[BaseModel]`      | A Pydantic model object representing the structured output of the final stage, if applicable. |
+| **JSON Dict**     | `json_dict`     | `Optional[Dict[str, Any]]` | A dictionary representing the JSON output of the final stage, if applicable.                  |
+| **Token Usage**   | `token_usage`   | `Dict[str, Any]`           | A summary of token usage across all stages of the pipeline run.                               |
+| **Trace**         | `trace`         | `List[Any]`                | A trace of the journey of inputs through the pipeline run.                                    |
+| **Crews Outputs** | `crews_outputs` | `List[CrewOutput]`         | A list of `CrewOutput` objects, representing the outputs from each crew in the pipeline run.  |
+
+### Pipeline Run Result Methods and Properties
+
+| Method/Property | Description                                                                                              |
+| :-------------- | :------------------------------------------------------------------------------------------------------- |
+| **json**        | Returns the JSON string representation of the run result if the output format of the final task is JSON. |
+| **to_dict**     | Converts the JSON and Pydantic outputs to a dictionary.                                                  |
+| **\_\_str\_\_** | Returns the string representation of the run result, prioritizing Pydantic, then JSON, then raw.         |
+
+### Accessing Pipeline Outputs
+
+Once a pipeline has been executed, its output can be accessed through the `PipelineOutput` object returned by the `process_runs` method. The `PipelineOutput` class provides access to individual `PipelineRunResult` objects, each representing a single run through the pipeline.
+
+#### Example
+
+```python
+import json
+
+# Define input data for the pipeline
+input_data = [{"initial_query": "Latest advancements in AI"}, {"initial_query": "Future of robotics"}]
+
+# Execute the pipeline (process_runs is awaited, so this must run inside an async function)
+pipeline_output = await my_pipeline.process_runs(input_data)
+
+# Access the results
+for run_result in pipeline_output.run_results:
+    print(f"Run ID: {run_result.id}")
+    print(f"Final Raw Output: {run_result.raw}")
+    if run_result.json_dict:
+        print(f"JSON Output: {json.dumps(run_result.json_dict, indent=2)}")
+    if run_result.pydantic:
+        print(f"Pydantic Output: {run_result.pydantic}")
+    print(f"Token Usage: {run_result.token_usage}")
+    print(f"Trace: {run_result.trace}")
+    print("Crew Outputs:")
+    for crew_output in run_result.crews_outputs:
+        print(f"  Crew: {crew_output.raw}")
+    print("\n")
+```
+
+This example demonstrates how to access and work with the pipeline output, including individual run results and their associated data.
+
+## Using Pipelines
+
+Pipelines are particularly useful for complex workflows that involve multiple stages of processing, analysis, or content generation. They allow you to:
+
+1. **Sequence Operations**: Execute crews in a specific order, ensuring that the output of one crew is available as input to the next.
+2. **Parallel Processing**: Run multiple crews concurrently within a stage for increased efficiency.
+3. **Manage Complex Workflows**: Break down large tasks into smaller, manageable steps executed by specialized crews.
+
+### Example: Running a Pipeline
+
+```python
+# Define input data for the pipeline
+input_data = [{"initial_query": "Latest advancements in AI"}]
+
+# Execute the pipeline, initiating a run for each input
+results = await my_pipeline.process_runs(input_data)
+
+# Access the results
+for result in results:
+    print(f"Final Output: {result.raw}")
+    print(f"Token Usage: {result.token_usage}")
+    print(f"Trace: {result.trace}")  # Shows the path of the input through all stages
+```
+
+## Advanced Features
+
+### Parallel Execution within Stages
+
+You can define parallel execution within a stage by providing a list of crews, creating multiple branches:
+
+```python
+parallel_analysis_crew = Crew(agents=[financial_analyst], tasks=[financial_analysis_task])
+market_analysis_crew = Crew(agents=[market_analyst], tasks=[market_analysis_task])
+
+my_pipeline = Pipeline(
+    stages=[
+        research_crew,
+        [parallel_analysis_crew, market_analysis_crew],  # Parallel execution (branching)
+        writing_crew
+    ]
+)
+```
+
+### Error Handling and Validation
+
+The Pipeline class includes validation mechanisms to ensure the robustness of the pipeline structure:
+
+- Validates that stages contain only Crew instances or lists of Crew instances.
+- Prevents double nesting of stages to maintain a clear structure.
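The two rules above can be illustrated with a small standalone sketch. The `Crew` class here is only a stand-in, and `validate_stages` is a hypothetical helper written for this illustration; neither is the actual crewAI implementation, whose error types and messages may differ.

```python
from typing import List, Union


class Crew:
    """Stand-in for crewai.Crew, used only to keep this sketch self-contained."""


Stage = Union[Crew, List[Crew]]


def validate_stages(stages: List[Stage]) -> List[Stage]:
    """Hypothetical helper mirroring the two rules listed above."""
    for index, stage in enumerate(stages):
        if isinstance(stage, Crew):
            continue  # a sequential stage: a single crew
        if isinstance(stage, list):
            # a parallel stage: every branch must itself be a crew
            for branch in stage:
                if isinstance(branch, list):
                    raise ValueError(f"Stage {index}: double nesting is not allowed")
                if not isinstance(branch, Crew):
                    raise ValueError(f"Stage {index}: branches must be Crew instances")
            continue
        raise ValueError(f"Stage {index}: expected a Crew or a list of Crew instances")
    return stages


crew1, crew2, crew3, crew4 = Crew(), Crew(), Crew(), Crew()

# The structure crew1 >> [crew2, crew3] >> crew4 passes both checks
validate_stages([crew1, [crew2, crew3], crew4])

# A list inside a parallel stage is rejected as double nesting
try:
    validate_stages([crew1, [[crew2, crew3]], crew4])
except ValueError as error:
    print(error)
```

Rejecting nested lists up front keeps every stage either a single crew or one flat set of branches, which matches the stage and branch terminology used earlier.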
--- a/docs/core-concepts/Planning.md
+++ b/docs/core-concepts/Planning.md
@@ -0,0 +1,138 @@
+---
+title: crewAI Planning
+description: Learn how to add planning to your crewAI Crew and improve its performance.
+---
+
+## Introduction
+The planning feature in CrewAI allows you to add planning capability to your crew. When enabled, before each Crew iteration, all Crew information is sent to an AgentPlanner that will plan the tasks step by step, and this plan will be added to each task description.
+
+### Using the Planning Feature
+Getting started with the planning feature is easy. The only required step is to add `planning=True` to your Crew:
+
+```python
+from crewai import Crew, Agent, Task, Process
+
+# Assemble your crew with planning capabilities
+my_crew = Crew(
+    agents=self.agents,
+    tasks=self.tasks,
+    process=Process.sequential,
+    planning=True,
+)
+```
+
+From this point on, your crew will have planning enabled, and the tasks will be planned before each iteration.
+
+#### Planning LLM
+
+Now you can define the LLM that will be used to plan the tasks. You can use any available ChatOpenAI model.
+
+```python
+from crewai import Crew, Agent, Task, Process
+from langchain_openai import ChatOpenAI
+
+# Assemble your crew with planning capabilities and custom LLM
+my_crew = Crew(
+    agents=self.agents,
+    tasks=self.tasks,
+    process=Process.sequential,
+    planning=True,
+    planning_llm=ChatOpenAI(model="gpt-4o")
+)
+```
+
+
+### Example
+
+When running the base case example, you will see something like the following output, which represents the output of the AgentPlanner responsible for creating the step-by-step logic to add to the agents' tasks.
+
+```bash
+
+[2024-07-15 16:49:11][INFO]: Planning the crew execution
+**Step-by-Step Plan for Task Execution**
+
+**Task Number 1: Conduct a thorough research about AI LLMs**
+
+**Agent:** AI LLMs Senior Data Researcher
+
+**Agent Goal:** Uncover cutting-edge developments in AI LLMs
+
+**Task Expected Output:** A list with 10 bullet points of the most relevant information about AI LLMs
+
+**Task Tools:** None specified
+
+**Agent Tools:** None specified
+
+**Step-by-Step Plan:**
+
+1. **Define Research Scope:**
+   - Determine the specific areas of AI LLMs to focus on, such as advancements in architecture, use cases, ethical considerations, and performance metrics.
+
+2. **Identify Reliable Sources:**
+   - List reputable sources for AI research, including academic journals, industry reports, conferences (e.g., NeurIPS, ACL), AI research labs (e.g., OpenAI, Google AI), and online databases (e.g., IEEE Xplore, arXiv).
+
+3. **Collect Data:**
+   - Search for the latest papers, articles, and reports published in 2023 and early 2024.
+   - Use keywords like "Large Language Models 2024", "AI LLM advancements", "AI ethics 2024", etc.
+
+4. **Analyze Findings:**
+   - Read and summarize the key points from each source.
+   - Highlight new techniques, models, and applications introduced in the past year.
+
+5. **Organize Information:**
+   - Categorize the information into relevant topics (e.g., new architectures, ethical implications, real-world applications).
+   - Ensure each bullet point is concise but informative.
+
+6. **Create the List:**
+   - Compile the 10 most relevant pieces of information into a bullet point list.
+   - Review the list to ensure clarity and relevance.
+
+**Expected Output:**
+A list with 10 bullet points of the most relevant information about AI LLMs.
+
+---
+
+**Task Number 2: Review the context you got and expand each topic into a full section for a report**
+
+**Agent:** AI LLMs Reporting Analyst
+
+**Agent Goal:** Create detailed reports based on AI LLMs data analysis and research findings
+
+**Task Expected Output:** A fully fledge report with the main topics, each with a full section of information. Formatted as markdown without '```'
+
+**Task Tools:** None specified
+
+**Agent Tools:** None specified
+
+**Step-by-Step Plan:**
+
+1. **Review the Bullet Points:**
+   - Carefully read through the list of 10 bullet points provided by the AI LLMs Senior Data Researcher.
+
+2. **Outline the Report:**
+   - Create an outline with each bullet point as a main section heading.
+   - Plan sub-sections under each main heading to cover different aspects of the topic.
+
+3. **Research Further Details:**
+   - For each bullet point, conduct additional research if necessary to gather more detailed information.
+   - Look for case studies, examples, and statistical data to support each section.
+
+4. **Write Detailed Sections:**
+   - Expand each bullet point into a comprehensive section.
+   - Ensure each section includes an introduction, detailed explanation, examples, and a conclusion.
+   - Use markdown formatting for headings, subheadings, lists, and emphasis.
+
+5. **Review and Edit:**
+   - Proofread the report for clarity, coherence, and correctness.
+   - Make sure the report flows logically from one section to the next.
+   - Format the report according to markdown standards.
+
+6. **Finalize the Report:**
+   - Ensure the report is complete with all sections expanded and detailed.
+   - Double-check formatting and make any necessary adjustments.
+
+**Expected Output:**
+A fully-fledged report with the main topics, each with a full section of information. Formatted as markdown without '```'.
+
+---
+```
--- a/docs/core-concepts/Testing.md
+++ b/docs/core-concepts/Testing.md
@@ -0,0 +1,41 @@
+---
+title: crewAI Testing
+description: Learn how to test your crewAI Crew and evaluate its performance.
+---
+
+## Introduction
+
+Testing is a crucial part of the development process, and it is essential to ensure that your crew is performing as expected. With crewAI, you can easily test your crew and evaluate its performance using the built-in testing capabilities.
+
+### Using the Testing Feature
+
+We added the CLI command `crewai test` to make it easy to test your crew. This command will run your crew for a specified number of iterations and provide detailed performance metrics.
+The parameters `n_iterations` and `model` are optional and default to 2 and `gpt-4o-mini`, respectively. For now, the only provider available is OpenAI.
+
+```bash
+crewai test
+```
+
+If you want to run more iterations or use a different model, you can specify the parameters like this:
+
+```bash
+crewai test --n_iterations 5 --model gpt-4o
+```
+
+When you run the `crewai test` command, the crew is executed for the specified number of iterations, and the performance metrics are displayed at the end of the run.
+
+A table of scores at the end will show the performance of the crew in terms of the following metrics:
+```
+                Task Scores
+          (1-10 Higher is better)
+┏━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
+┃ Tasks/Crew ┃ Run 1 ┃ Run 2 ┃ Avg. Total ┃
+┡━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
+│ Task 1     │ 10.0  │ 9.0   │ 9.5        │
+│ Task 2     │ 9.0   │ 9.0   │ 9.0        │
+│ Crew       │ 9.5   │ 9.0   │ 9.2        │
+└────────────┴───────┴───────┴────────────┘
+```
+
+The example above shows the test results for two runs of the crew with two tasks, with the average total score for each task and the crew as a whole.
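The numbers in the example table can be reproduced with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: a task's "Avg. Total" is the mean of its per-run scores, and the crew score for a run is the mean of that run's task scores.

```python
from statistics import mean

# Scores copied from the example table: one list per task, one entry per run
task_scores = {
    "Task 1": [10.0, 9.0],
    "Task 2": [9.0, 9.0],
}

# Assumption: a task's "Avg. Total" is the mean of its per-run scores
task_averages = {task: mean(scores) for task, scores in task_scores.items()}

# Assumption: the crew score for a run is the mean of that run's task scores
crew_per_run = [mean(run) for run in zip(*task_scores.values())]
crew_average = mean(crew_per_run)

for task, average in task_averages.items():
    print(f"{task}: {average:.1f}")
print(f"Crew per run: {crew_per_run}, average: {crew_average:.2f}")
```

Under these assumptions the crew average works out to 9.25, which is consistent with the 9.2 shown in the table once it is formatted to one decimal place.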
+
--- a/docs/core-concepts/Tools.md
+++ b/docs/core-concepts/Tools.md
@@ -100,16 +100,24 @@ Here is a list of the available tools and their descriptions:

 | Tool                        | Description                                                                                   |
 | :-------------------------- | :-------------------------------------------------------------------------------------------- |
+| **BrowserbaseLoadTool**     | A tool for interacting with and extracting data from web browsers.                            |
 | **CodeDocsSearchTool**      | A RAG tool optimized for searching through code documentation and related technical documents. |
+| **CodeInterpreterTool**     | A tool for interpreting Python code.                                                          |
+| **ComposioTool**            | Enables use of Composio tools.                                                                |
 | **CSVSearchTool**           | A RAG tool designed for searching within CSV files, tailored to handle structured data.       |
 | **DirectorySearchTool**     | A RAG tool for searching within directories, useful for navigating through file systems.      |
 | **DOCXSearchTool**          | A RAG tool aimed at searching within DOCX documents, ideal for processing Word files.         |
 | **DirectoryReadTool**       | Facilitates reading and processing of directory structures and their contents.                |
+| **EXASearchTool**           | A tool designed for performing exhaustive searches across various data sources.               |
 | **FileReadTool**            | Enables reading and extracting data from files, supporting various file formats.              |
+| **FirecrawlSearchTool**     | A tool to search webpages using Firecrawl and return the results.                             |
+| **FirecrawlCrawlWebsiteTool** | A tool for crawling webpages using Firecrawl.                                               |
+| **FirecrawlScrapeWebsiteTool** | A tool for scraping a webpage URL using Firecrawl and returning its contents.              |
 | **GithubSearchTool**        | A RAG tool for searching within GitHub repositories, useful for code and documentation search.|
 | **SerperDevTool**           | A specialized tool for development purposes, with specific functionalities under development. |
 | **TXTSearchTool**           | A RAG tool focused on searching within text (.txt) files, suitable for unstructured data.     |
 | **JSONSearchTool**          | A RAG tool designed for searching within JSON files, catering to structured data handling.     |
+| **LlamaIndexTool**          | Enables the use of LlamaIndex tools.                                                          |
 | **MDXSearchTool**           | A RAG tool tailored for searching within Markdown (MDX) files, useful for documentation.      |
 | **PDFSearchTool**           | A RAG tool aimed at searching within PDF documents, ideal for processing scanned documents.    |
 | **PGSearchTool**            | A RAG tool optimized for searching within PostgreSQL databases, suitable for database queries. |
@@ -120,8 +128,6 @@ Here is a list of the available tools and their descriptions:
 | **XMLSearchTool**           | A RAG tool designed for searching within XML files, suitable for structured data formats.      |
 | **YoutubeChannelSearchTool**| A RAG tool for searching within YouTube channels, useful for video content analysis.           |
 | **YoutubeVideoSearchTool**  | A RAG tool aimed at searching within YouTube videos, ideal for video data extraction.          |
-| **BrowserbaseTool**         | A tool for interacting with and extracting data from web browsers.                            |
-| **ExaSearchTool**           | A tool designed for performing exhaustive searches across various data sources.               |

 ## Creating your own Tools

--- a/docs/getting-started/Installing-CrewAI.md
+++ b/docs/getting-started/Installing-CrewAI.md
@@ -18,4 +18,7 @@ pip install crewai
 # Install the main crewAI package and the tools package
 # that includes a series of helpful tools for your agents
 pip install 'crewai[tools]'
+
+# Alternatively, you can also use:
+pip install crewai crewai-tools
 ```
--- a/docs/getting-started/Start-a-New-CrewAI-Project-Template-Method.md
+++ b/docs/getting-started/Start-a-New-CrewAI-Project-Template-Method.md
@@ -0,0 +1,255 @@
+---
+title: Starting a New CrewAI Project - Using Template
+description: A comprehensive guide to starting a new CrewAI project, including the latest updates and project setup methods.
+---
+
+# Starting Your CrewAI Project
+
+Welcome to the ultimate guide for starting a new CrewAI project. This document will walk you through the steps to create, customize, and run your CrewAI project, ensuring you have everything you need to get started.
+
+Before we start, there are a couple of things to note:
+
+1. CrewAI is a Python package and requires Python >=3.10 and <=3.13 to run.
+2. The preferred way of setting up CrewAI is using the `crewai create` command. This will create a new project folder and install a skeleton template for you to work on.
+
+## Prerequisites
+
+Before getting started with CrewAI, make sure that you have installed it via pip:
+
+```shell
+$ pip install crewai crewai-tools
+```
+
+### Virtual Environments
+It is highly recommended that you use virtual environments to ensure that your CrewAI project is isolated from other projects and dependencies. Virtual environments provide a clean, separate workspace for each project, preventing conflicts between different versions of packages and libraries. This isolation is crucial for maintaining consistency and reproducibility in your development process. You have multiple options for setting up virtual environments depending on your operating system and Python version:
+
+1. Use venv (Python's built-in virtual environment tool):
+   venv is included with Python 3.3 and later, making it a convenient choice for many developers. It's lightweight and easy to use, perfect for simple project setups.
+
+   To set up virtual environments with venv, refer to the official [Python documentation](https://docs.python.org/3/tutorial/venv.html).
+
+2. Use Conda (A Python virtual environment manager):
+   Conda is an open-source package manager and environment management system for Python. It's widely used by data scientists, developers, and researchers to manage dependencies and environments in a reproducible way.
+
+   To set up virtual environments with Conda, refer to the official [Conda documentation](https://docs.conda.io/projects/conda/en/stable/user-guide/getting-started.html).
+
+3. Use Poetry (A Python package manager and dependency management tool):
+   Poetry is an open-source Python package manager that simplifies the installation of packages and their dependencies. Poetry offers a convenient way to manage virtual environments and dependencies.
+   Poetry is CrewAI's preferred tool for package and dependency management.
+
+### Code IDEs
+
+Most CrewAI users use a code editor or Integrated Development Environment (IDE) to build their Crews. You can use any IDE of your choice. Some popular options are listed below:
+
+- [Visual Studio Code](https://code.visualstudio.com/) - Most popular
+- [PyCharm](https://www.jetbrains.com/pycharm/)
+- [Cursor AI](https://cursor.com)
+
+Pick one that suits your style and needs.
+
+## Creating a New Project
+In this example, we will use venv as our virtual environment manager.
+
+To set up a virtual environment, run the following CLI command:
+
+```shell
+$ python3 -m venv <venv-name>
+```
+
+Activate your virtual environment by running the following CLI command:
+
+```shell
+$ source <venv-name>/bin/activate
+```
+
+Now, to create a new CrewAI project, run the following CLI command:
+
+```shell
+$ crewai create <project_name>
+```
+
+This command will create a new project folder with the following structure:
+
+```shell
+my_project/
+├── .gitignore
+├── pyproject.toml
+├── README.md
+└── src/
+    └── my_project/
+        ├── __init__.py
+        ├── main.py
+        ├── crew.py
+        ├── tools/
+        │   ├── custom_tool.py
+        │   └── __init__.py
+        └── config/
+            ├── agents.yaml
+            └── tasks.yaml
+```
+
+You can now start developing your project by editing the files in the `src/my_project` folder. The `main.py` file is the entry point of your project, and the `crew.py` file is where you define your agents and tasks.
+
+## Customizing Your Project
+
+To customize your project, you can:
+- Modify `src/my_project/config/agents.yaml` to define your agents.
+- Modify `src/my_project/config/tasks.yaml` to define your tasks.
+- Modify `src/my_project/crew.py` to add your own logic, tools, and specific arguments.
+- Modify `src/my_project/main.py` to add custom inputs for your agents and tasks.
+- Add your environment variables into the `.env` file.
+
+### Example: Defining Agents and Tasks
+
+#### agents.yaml
+
+```yaml
+researcher:
+  role: >
+    Job Candidate Researcher
+  goal: >
+    Find potential candidates for the job
+  backstory: >
+    You are adept at finding the right candidates by exploring various online
+    resources. Your skill in identifying suitable candidates ensures the best
+    match for job positions.
+```
+
+#### tasks.yaml
+
+```yaml
+research_candidates_task:
+  description: >
+    Conduct thorough research to find potential candidates for the specified job.
+    Utilize various online resources and databases to gather a comprehensive list of potential candidates.
+    Ensure that the candidates meet the job requirements provided.
+
+    Job Requirements:
+    {job_requirements}
+  expected_output: >
+    A list of 10 potential candidates with their contact information and brief profiles highlighting their suitability.
+  agent: researcher # Must match the agent name in agents.yaml and the agent method defined in crew.py
+  context: # Each entry must match a task name in tasks.yaml and the task method defined in crew.py
+    - researcher
+```
+
+### Referencing Variables:
+Functions you define in `crew.py` are matched by name. For example, a task in `tasks.yaml` can reference the agent that should run it. Make sure the name used in the YAML file matches the name of the annotated function; otherwise the task won't resolve the reference properly.
+
+#### Example References
+agents.yaml
+```yaml
+email_summarizer:
+    role: >
+      Email Summarizer
+    goal: >
+      Summarize emails into a concise and clear summary
+    backstory: >
+      You will create a 5 bullet point summary of the report
+    llm: mixtal_llm
+```
+
+tasks.yaml
+```yaml
+email_summarizer_task:
+    description: >
+      Summarize the email into a 5 bullet point summary
+    expected_output: >
+      A 5 bullet point summary of the email
+    agent: email_summarizer
+    context:
+      - reporting_task
+      - research_task
+```
+
+Annotations are used to properly reference the agent and task in the `crew.py` file.
+
+### Annotations include:
+* @agent
+* @task
+* @crew
+* @llm
+* @tool
+* @callback
+* @output_json
+* @output_pydantic
+* @cache_handler
+
+
+crew.py
+```py
+...
+    @llm
+    def mixtal_llm(self):
+        return ChatGroq(temperature=0, model_name="mixtral-8x7b-32768")
+
+    @agent
+    def email_summarizer(self) -> Agent:
+        return Agent(
+            config=self.agents_config["email_summarizer"],
+        )
+    ## ...other tasks defined
+    @task
+    def email_summarizer_task(self) -> Task:
+        return Task(
+            config=self.tasks_config["email_summarizer_task"],
+        )
+...
+```
+
+
+
+## Installing Dependencies
+
+To install the dependencies for your project, you can use Poetry. First, navigate to your project directory:
+
+```shell
+$ cd my_project
+$ poetry lock
+$ poetry install
+```
+
+This will install the dependencies specified in the `pyproject.toml` file.
+
+## Interpolating Variables
+
+Any variable interpolated in your `agents.yaml` and `tasks.yaml` files like `{variable}` will be replaced by the value of the variable in the `main.py` file.
+
+#### tasks.yaml
+
+```yaml
+research_task:
+  description: >
+    Conduct a thorough research about the customer and competitors in the context
+    of {customer_domain}.
+    Make sure you find any interesting and relevant information given the
+    current year is 2024.
+  expected_output: >
+    A complete report on the customer and their customers and competitors,
+    including their demographics, preferences, market positioning and audience engagement.
+```
+
+#### main.py
+
+```python
+# main.py
+def run():
+    inputs = {
+        "customer_domain": "crewai.com"
+    }
+    MyProjectCrew(inputs).crew().kickoff(inputs=inputs)
+```
+
+## Running Your Project
+
+To run your project, use the following command:
+
+```shell
+$ poetry run my_project
+```
+
+This will initialize your crew of AI agents and begin task execution as defined in your configuration in the `main.py` file.
+
+## Deploying Your Project
+
+The easiest way to deploy your crew is through [CrewAI+](https://www.crewai.com/crewaiplus), where you can deploy your crew in a few clicks.
--- a/docs/how-to/Creating-a-Crew-and-kick-it-off.md
+++ b/docs/how-to/Creating-a-Crew-and-kick-it-off.md
@@ -1,82 +0,0 @@
---
-title: Assembling and Activating Your CrewAI Team
-description: A comprehensive guide to creating a dynamic CrewAI team for your projects, with updated functionalities including verbose mode, memory capabilities, asynchronous execution, output customization, language model configuration, code execution, integration with third-party agents, and improved task management.
---
-
-## Introduction
-Embark on your CrewAI journey by setting up your environment and initiating your AI crew with the latest features. This guide ensures a smooth start, incorporating all recent updates for an enhanced experience, including code execution capabilities, integration with third-party agents, and advanced task management.
-
-## Step 0: Installation
-Install CrewAI and any necessary packages for your project. CrewAI is compatible with Python >=3.10,<=3.13.
-
-```shell
-pip install crewai
-pip install 'crewai[tools]'
-```
-
-## Step 1: Assemble Your Agents
-Define your agents with distinct roles, backstories, and enhanced capabilities. The Agent class now supports a wide range of attributes for fine-tuned control over agent behavior and interactions, including code execution and integration with third-party agents.
-
-```python
-import os
-from langchain.llms import OpenAI
-from crewai import Agent
-from crewai_tools import SerperDevTool, BrowserbaseTool, ExaSearchTool
-
-os.environ["OPENAI_API_KEY"] = "Your OpenAI Key"
-os.environ["SERPER_API_KEY"] = "Your Serper Key"
-
-search_tool = SerperDevTool()
-browser_tool = BrowserbaseTool()
-exa_search_tool = ExaSearchTool()
-
-# Creating a senior researcher agent with advanced configurations
-researcher = Agent(
-    role='Senior Researcher',
-    goal='Uncover groundbreaking technologies in {topic}',
-    backstory=("Driven by curiosity, you're at the forefront of innovation, "
-               "eager to explore and share knowledge that could change the world."),
-    memory=True,
-    verbose=True,
-    allow_delegation=False,
-    tools=[search_tool, browser_tool],
-    allow_code_execution=False,  # New attribute for enabling code execution
-    max_iter=15,  # Maximum number of iterations for task execution
-    max_rpm=100,  # Maximum requests per minute
-    max_execution_time=3600,  # Maximum execution time in seconds
-    system_template="Your custom system template here",  # Custom system template
-    prompt_template="Your custom prompt template here",  # Custom prompt template
-    response_template="Your custom response template here",  # Custom response template
-)
-
-# Creating a writer agent with custom tools and specific configurations
-writer = Agent(
-    role='Writer',
-    goal='Narrate compelling tech stories about {topic}',
-    backstory=("With a flair for simplifying complex topics, you craft engaging "
-               "narratives that captivate and educate, bringing new discoveries to light."),
-    verbose=True,
-    allow_delegation=False,
-    memory=True,
-    tools=[exa_search_tool],
-    function_calling_llm=OpenAI(model_name="gpt-3.5-turbo"),  # Separate LLM for function calling
-)
-
-# Setting a specific manager agent
-manager = Agent(
-  role='Manager',
-  goal='Ensure the smooth operation and coordination of the team',
-  verbose=True,
-  backstory=(
-    "As a seasoned project manager, you excel in organizing "
-    "tasks, managing timelines, and ensuring the team stays on track."
-  ),
-  allow_code_execution=True,  # Enable code execution for the manager
-)
-```
-
-### New Agent Attributes and Features
-
-1. `allow_code_execution`: Enable or disable code execution capabilities for the agent (default is False).
-2. `max_execution_time`: Set a maximum execution time (in seconds) for the agent to complete a task.
-3. `function_calling_llm`: Specify a separate language model for function calling.
--- a/docs/how-to/Force-Tool-Ouput-as-Result.md
+++ b/docs/how-to/Force-Tool-Ouput-as-Result.md
@@ -7,7 +7,7 @@ description: Learn how to force tool output as the result in of an Agent's task
 In CrewAI, you can force the output of a tool as the result of an agent's task. This feature is useful when you want to ensure that the tool output is captured and returned as the task result, and avoid the agent modifying the output during the task execution.

 ## Forcing Tool Output as Result
-To force the tool output as the result of an agent's task, you can set the `force_tool_output` parameter to `True` when creating the task. This parameter ensures that the tool output is captured and returned as the task result, without any modifications by the agent.
+To force the tool output as the result of an agent's task, set the `result_as_answer` parameter to `True` on the tool when adding it to the agent. This parameter ensures that the tool output is captured and returned as the task result, without any modifications by the agent.

 Here's an example of how to force the tool output as the result of an agent's task:

--- a/docs/how-to/Replay-tasks-from-latest-Crew-Kickoff.md
+++ b/docs/how-to/Replay-tasks-from-latest-Crew-Kickoff.md
@@ -36,14 +36,14 @@ To replay from a task programmatically, use the following steps:
 2. Execute the replay command within a try-except block to handle potential errors.

 ```python
-   def replay_from_task():
+   def replay():
    """
    Replay the crew execution from a specific task.
    """
    task_id = '<task_id>'
    inputs = {"topic": "CrewAI Training"} # this is optional, you can pass in the inputs you want to replay otherwise uses the previous kickoffs inputs
    try:
-        YourCrewName_Crew().crew().replay_from_task(task_id=task_id, inputs=inputs)
+        YourCrewName_Crew().crew().replay(task_id=task_id, inputs=inputs)

    except Exception as e:
        raise Exception(f"An error occurred while replaying the crew: {e}")
--- a/docs/how-to/Start-a-New-CrewAI-Project.md
+++ b/docs/how-to/Start-a-New-CrewAI-Project.md
@@ -1,137 +0,0 @@
---
-title: Starting a New CrewAI Project
-description: A comprehensive guide to starting a new CrewAI project, including the latest updates and project setup methods.
---
-
-# Starting Your CrewAI Project
-
-Welcome to the ultimate guide for starting a new CrewAI project. This document will walk you through the steps to create, customize, and run your CrewAI project, ensuring you have everything you need to get started.
-
-## Prerequisites
-
-We assume you have already installed CrewAI. If not, please refer to the [installation guide](how-to/Installing-CrewAI.md) to install CrewAI and its dependencies.
-
-## Creating a New Project
-
-To create a new project, run the following CLI command:
-
-```shell
-$ crewai create my_project
-```
-
-This command will create a new project folder with the following structure:
-
-```shell
-my_project/
-├── .gitignore
-├── pyproject.toml
-├── README.md
-└── src/
-    └── my_project/
-        ├── __init__.py
-        ├── main.py
-        ├── crew.py
-        ├── tools/
-        │   ├── custom_tool.py
-        │   └── __init__.py
-        └── config/
-            ├── agents.yaml
-            └── tasks.yaml
-```
-
-You can now start developing your project by editing the files in the `src/my_project` folder. The `main.py` file is the entry point of your project, and the `crew.py` file is where you define your agents and tasks.
-
-## Customizing Your Project
-
-To customize your project, you can:
- Modify `src/my_project/config/agents.yaml` to define your agents.
- Modify `src/my_project/config/tasks.yaml` to define your tasks.
- Modify `src/my_project/crew.py` to add your own logic, tools, and specific arguments.
- Modify `src/my_project/main.py` to add custom inputs for your agents and tasks.
- Add your environment variables into the `.env` file.
-
-### Example: Defining Agents and Tasks
-
-#### agents.yaml
-
-```yaml
-researcher:
-  role: >
-    Job Candidate Researcher
-  goal: >
-    Find potential candidates for the job
-  backstory: >
-    You are adept at finding the right candidates by exploring various online
-    resources. Your skill in identifying suitable candidates ensures the best
-    match for job positions.
-```
-
-#### tasks.yaml
-
-```yaml
-research_candidates_task:
-  description: >
-    Conduct thorough research to find potential candidates for the specified job.
-    Utilize various online resources and databases to gather a comprehensive list of potential candidates.
-    Ensure that the candidates meet the job requirements provided.
-
-    Job Requirements:
-    {job_requirements}
-  expected_output: >
-    A list of 10 potential candidates with their contact information and brief profiles highlighting their suitability.
-```
-
-## Installing Dependencies
-
-To install the dependencies for your project, you can use Poetry. First, navigate to your project directory:
-
-```shell
-$ cd my_project
-$ poetry lock
-$ poetry install
-```
-
-This will install the dependencies specified in the `pyproject.toml` file.
-
-## Interpolating Variables
-
-Any variable interpolated in your `agents.yaml` and `tasks.yaml` files like `{variable}` will be replaced by the value of the variable in the `main.py` file.
-
-#### agents.yaml
-
-```yaml
-research_task:
-  description: >
-    Conduct a thorough research about the customer and competitors in the context
-    of {customer_domain}.
-    Make sure you find any interesting and relevant information given the
-    current year is 2024.
-  expected_output: >
-    A complete report on the customer and their customers and competitors,
-    including their demographics, preferences, market positioning and audience engagement.
-```
-
-#### main.py
-
-```python
-# main.py
-def run():
-    inputs = {
-        "customer_domain": "crewai.com"
-    }
-    MyProjectCrew(inputs).crew().kickoff(inputs=inputs)
-```
-
-## Running Your Project
-
-To run your project, use the following command:
-
-```shell
-$ poetry run my_project
-```
-
-This will initialize your crew of AI agents and begin task execution as defined in your configuration in the `main.py` file.
-
-## Deploying Your Project
-
-The easiest way to deploy your crew is through [CrewAI+](https://www.crewai.com/crewaiplus), where you can deploy your crew in a few clicks.
--- a/docs/index.md
+++ b/docs/index.md
@@ -5,6 +5,19 @@
 Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

 <div style="display:flex; margin:0 auto; justify-content: center;">
+    <div style="width:25%">
+        <h2>Getting Started</h2>
+        <ul>
+            <li><a href='./getting-started/Installing-CrewAI'>
+                   Installing CrewAI
+                 </a>
+            </li>
+            <li><a href='./getting-started/Start-a-New-CrewAI-Project-Template-Method'>
+                   Start a New CrewAI Project: Template Method
+                 </a>
+            </li>
+        </ul>
+    </div>
    <div style="width:25%">
        <h2>Core Concepts</h2>
        <ul>
@@ -33,6 +46,11 @@ Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By
                    Crews
                </a>
            </li>
+            <li>
+                <a href="./core-concepts/Pipeline">
+                    Pipeline
+                </a>
+            </li>
            <li>
                <a href="./core-concepts/Training-Crew">
                    Training
@@ -43,26 +61,16 @@ Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By
                    Memory
                </a>
            </li>
+            <li>
+                <a href="./core-concepts/Planning">
+                    Planning
+                </a>
+            </li>
        </ul>
    </div>
    <div style="width:30%">
        <h2>How-To Guides</h2>
        <ul>
-            <li>
-                <a href="./how-to/Start-a-New-CrewAI-Project">
-                    Starting Your crewAI Project
-                </a>
-            </li>
-            <li>
-                <a href="./how-to/Installing-CrewAI">
-                    Installing crewAI
-                </a>
-            </li>
-            <li>
-                <a href="./how-to/Creating-a-Crew-and-kick-it-off">
-                    Getting Started
-                </a>
-            </li>
            <li>
                <a href="./how-to/Create-Custom-Tools">
                    Create Custom Tools
--- a/docs/telemetry/Telemetry.md
+++ b/docs/telemetry/Telemetry.md
@@ -5,7 +5,7 @@ description: Understanding the telemetry data collected by CrewAI and how it con

 ## Telemetry

-CrewAI utilizes anonymous telemetry to gather usage statistics with the primary goal of enhancing the library. Our focus is on improving and developing the features, integrations, and tools most utilized by our users.
+CrewAI utilizes anonymous telemetry to gather usage statistics with the primary goal of enhancing the library. Our focus is on improving and developing the features, integrations, and tools most utilized by our users. We don't offer a way to disable it now, but we will in the future.

 It's pivotal to understand that **NO data is collected** concerning prompts, task descriptions, agents' backstories or goals, usage of tools, API calls, responses, any data processed by the agents, or secrets and environment variables, with the exception of the conditions mentioned. When the `share_crew` feature is enabled, detailed data including task descriptions, agents' backstories or goals, and other specific attributes are collected to provide deeper insights while respecting user privacy.

@@ -22,7 +22,7 @@ It's pivotal to understand that **NO data is collected** concerning prompts, tas
 - **Tool Usage**: Identifying which tools are most frequently used allows us to prioritize improvements in those areas.

 ### Opt-In Further Telemetry Sharing
-Users can choose to share their complete telemetry data by enabling the `share_crew` attribute to `True` in their crew configurations. This opt-in approach respects user privacy and aligns with data protection standards by ensuring users have control over their data sharing preferences. Enabling `share_crew` results in the collection of detailed crew and task execution data, including `goal`, `backstory`, `context`, and `output` of tasks. This enables a deeper insight into usage patterns while respecting the user's choice to share.
+Users can choose to share their complete telemetry data by setting the `share_crew` attribute to `True` in their crew configurations. Enabling `share_crew` results in the collection of detailed crew and task execution data, including `goal`, `backstory`, `context`, and `output` of tasks. This enables a deeper insight into usage patterns while respecting the user's choice to share.

 ### Updates and Revisions
 We are committed to maintaining the accuracy and transparency of our documentation. Regular reviews and updates are performed to ensure our documentation accurately reflects the latest developments of our codebase and telemetry practices. Users are encouraged to review this section for the most current information on our data collection practices and how they contribute to the improvement of CrewAI.
--- a/docs/tools/SerperDevTool.md
+++ b/docs/tools/SerperDevTool.md
@@ -29,5 +29,70 @@ To effectively use the `SerperDevTool`, follow these steps:
 2. **API Key Acquisition**: Acquire a `serper.dev` API key by registering for a free account at `serper.dev`.
 3. **Environment Configuration**: Store your obtained API key in an environment variable named `SERPER_API_KEY` to facilitate its use by the tool.

+## Parameters
+
+The `SerperDevTool` comes with several parameters that are passed to the API:
+
+- **search_url**: The URL endpoint for the search API. (Default is `https://google.serper.dev/search`)
+
+- **country**: Optional. Specify the country for the search results.
+- **location**: Optional. Specify the location for the search results.
+- **locale**: Optional. Specify the locale for the search results.
+- **n_results**: Number of search results to return. Default is `10`.
+
+The values for `country`, `location`, `locale` and `search_url` can be found on the [Serper Playground](https://serper.dev/playground).
+
+## Example with Parameters
+
+Here is an example demonstrating how to use the tool with additional parameters:
+
+```python
+from crewai_tools import SerperDevTool
+
+tool = SerperDevTool(
+    search_url="https://google.serper.dev/scholar",
+    n_results=2,
+)
+
+print(tool.run(search_query="ChatGPT"))
+
+# Using Tool: Search the internet
+
+# Search results: Title: Role of chat gpt in public health
+# Link: https://link.springer.com/article/10.1007/s10439-023-03172-7
+# Snippet: … ChatGPT in public health. In this overview, we will examine the potential uses of ChatGPT in
+# ---
+# Title: Potential use of chat gpt in global warming
+# Link: https://link.springer.com/article/10.1007/s10439-023-03171-8
+# Snippet: … as ChatGPT, have the potential to play a critical role in advancing our understanding of climate
+# ---
+
+```
+
+```python
+from crewai_tools import SerperDevTool
+
+tool = SerperDevTool(
+    country="fr",
+    locale="fr",
+    location="Paris, Paris, Ile-de-France, France",
+    n_results=2,
+)
+
+print(tool.run(search_query="Jeux Olympiques"))
+
+# Using Tool: Search the internet
+
+# Search results: Title: Jeux Olympiques de Paris 2024 - Actualités, calendriers, résultats
+# Link: https://olympics.com/fr/paris-2024
+# Snippet: Quels sont les sports présents aux Jeux Olympiques de Paris 2024 ? · Athlétisme · Aviron · Badminton · Basketball · Basketball 3x3 · Boxe · Breaking · Canoë ...
+# ---
+# Title: Billetterie Officielle de Paris 2024 - Jeux Olympiques et Paralympiques
+# Link: https://tickets.paris2024.org/
+# Snippet: Achetez vos billets exclusivement sur le site officiel de la billetterie de Paris 2024 pour participer au plus grand événement sportif au monde.
+# ---
+
+```
+
 ## Conclusion
-By integrating the `SerperDevTool` into Python projects, users gain the ability to conduct real-time, relevant searches across the internet directly from their applications. By adhering to the setup and usage guidelines provided, incorporating this tool into projects is streamlined and straightforward.
+By integrating the `SerperDevTool` into Python projects, users gain the ability to conduct real-time, relevant searches across the internet directly from their applications. The updated parameters allow for more customized and localized search results. By adhering to the setup and usage guidelines provided, incorporating this tool into projects is streamlined and straightforward.
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -119,6 +119,9 @@ theme:

 nav:
  - Home: '/'
+  - Getting Started:
+    - Installing CrewAI: 'getting-started/Installing-CrewAI.md'
+    - Starting a new CrewAI project: 'getting-started/Start-a-New-CrewAI-Project-Template-Method.md'
  - Core Concepts:
    - Agents: 'core-concepts/Agents.md'
    - Tasks: 'core-concepts/Tasks.md'
@@ -128,6 +131,8 @@ nav:
    - Collaboration: 'core-concepts/Collaboration.md'
    - Training: 'core-concepts/Training-Crew.md'
    - Memory: 'core-concepts/Memory.md'
+    - Planning: 'core-concepts/Planning.md'
+    - Testing: 'core-concepts/Testing.md'
    - Using LangChain Tools: 'core-concepts/Using-LangChain-Tools.md'
    - Using LlamaIndex Tools: 'core-concepts/Using-LlamaIndex-Tools.md'
  - How to Guides:
--- a/poetry.lock
+++ b/poetry.lock
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "crewai"
-version = "0.36.0"
+version = "0.46.0"
 description = "Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks."
 authors = ["Joao Moura <joao@crewai.com>"]
 readme = "README.md"
@@ -21,12 +21,12 @@ opentelemetry-sdk = "^1.22.0"
 opentelemetry-exporter-otlp-proto-http = "^1.22.0"
 instructor = "1.3.3"
 regex = "^2023.12.25"
-crewai-tools = { version = "^0.4.8", optional = true }
+crewai-tools = { version = "^0.4.26", optional = true }
 click = "^8.1.7"
 python-dotenv = "^1.0.0"
 appdirs = "^1.4.4"
 jsonref = "^1.1.0"
-agentops = { version = "^0.1.9", optional = true }
+agentops = { version = "^0.3.0", optional = true }
 embedchain = "^0.1.114"
 json-repair = "^0.25.2"

@@ -46,12 +46,13 @@ mkdocs-material = { extras = ["imaging"], version = "^9.5.7" }
 mkdocs-material-extensions = "^1.3.1"
 pillow = "^10.2.0"
 cairosvg = "^2.7.1"
-crewai-tools = "^0.4.8"
+crewai-tools = "^0.4.26"

 [tool.poetry.group.test.dependencies]
 pytest = "^8.0.0"
 pytest-vcr = "^1.0.2"
 python-dotenv = "1.0.0"
+pytest-asyncio = "^0.23.7"

 [tool.poetry.scripts]
 crewai = "crewai.cli.cli:crewai"
--- a/src/crewai/agent.py
+++ b/src/crewai/agent.py
@@ -55,8 +55,6 @@ class Agent(BaseAgent):
            tools: Tools at agents disposal
            step_callback: Callback to be executed after each step of the agent execution.
            callbacks: A list of callback functions from the langchain library that are triggered during the agent's execution process
-            allow_code_execution: Enable code execution for the agent.
-            max_retry_limit: Maximum number of retries for an agent to execute a task when an error occurs.
    """

    _times_executed: int = PrivateAttr(default=0)
@@ -262,6 +260,7 @@ class Agent(BaseAgent):
            "tools_handler": self.tools_handler,
            "function_calling_llm": self.function_calling_llm,
            "callbacks": self.callbacks,
+            "max_tokens": self.max_tokens,
        }

        if self._rpm_controller:
--- a/src/crewai/agents/agent_builder/base_agent.py
+++ b/src/crewai/agents/agent_builder/base_agent.py
@@ -45,6 +45,7 @@ class BaseAgent(ABC, BaseModel):
        i18n (I18N): Internationalization settings.
        cache_handler (InstanceOf[CacheHandler]): An instance of the CacheHandler class.
        tools_handler (InstanceOf[ToolsHandler]): An instance of the ToolsHandler class.
+        max_tokens: Maximum number of tokens for the agent to generate in a response.


    Methods:
@@ -118,6 +119,9 @@ class BaseAgent(ABC, BaseModel):
    tools_handler: InstanceOf[ToolsHandler] = Field(
        default=None, description="An instance of the ToolsHandler class."
    )
+    max_tokens: Optional[int] = Field(
+        default=None, description="Maximum number of tokens for the agent's execution."
+    )

    _original_role: str | None = None
    _original_goal: str | None = None
--- a/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
+++ b/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
@@ -3,7 +3,6 @@ from typing import TYPE_CHECKING, Optional

 from crewai.memory.entity.entity_memory_item import EntityMemoryItem
 from crewai.memory.long_term.long_term_memory_item import LongTermMemoryItem
-from crewai.memory.short_term.short_term_memory_item import ShortTermMemoryItem
 from crewai.utilities.converter import ConverterError
 from crewai.utilities.evaluators.task_evaluator import TaskEvaluator
 from crewai.utilities import I18N
@@ -39,18 +38,17 @@ class CrewAgentExecutorMixin:
            and "Action: Delegate work to coworker" not in output.log
        ):
            try:
-                memory = ShortTermMemoryItem(
-                    data=output.log,
-                    agent=self.crew_agent.role,
-                    metadata={
-                        "observation": self.task.description,
-                    },
-                )
                if (
                    hasattr(self.crew, "_short_term_memory")
                    and self.crew._short_term_memory
                ):
-                    self.crew._short_term_memory.save(memory)
+                    self.crew._short_term_memory.save(
+                        value=output.log,
+                        metadata={
+                            "observation": self.task.description,
+                        },
+                        agent=self.crew_agent.role,
+                    )
            except Exception as e:
                print(f"Failed to add to short term memory: {e}")
                pass
--- a/src/crewai/agents/agent_builder/utilities/base_token_process.py
+++ b/src/crewai/agents/agent_builder/utilities/base_token_process.py
@@ -1,4 +1,4 @@
-from typing import Any, Dict
+from crewai.types.usage_metrics import UsageMetrics


 class TokenProcess:
@@ -18,10 +18,10 @@ class TokenProcess:
    def sum_successful_requests(self, requests: int):
        self.successful_requests = self.successful_requests + requests

-    def get_summary(self) -> Dict[str, Any]:
-        return {
-            "total_tokens": self.total_tokens,
-            "prompt_tokens": self.prompt_tokens,
-            "completion_tokens": self.completion_tokens,
-            "successful_requests": self.successful_requests,
-        }
+    def get_summary(self) -> UsageMetrics:
+        return UsageMetrics(
+            total_tokens=self.total_tokens,
+            prompt_tokens=self.prompt_tokens,
+            completion_tokens=self.completion_tokens,
+            successful_requests=self.successful_requests,
+        )
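
For orientation: the `UsageMetrics` type returned by `get_summary` is not defined in this hunk. The sketch below is a hypothetical stand-in, written as a plain dataclass, with field names taken from the `get_summary` call above and the `add_usage_metrics` method name taken from its use in `crew.py`. The real class in `crewai.types.usage_metrics` may differ; `UsageMetricsSketch` and `aggregate` are illustrative names only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class UsageMetricsSketch:
    """Hypothetical stand-in for crewai.types.usage_metrics.UsageMetrics."""

    total_tokens: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0
    successful_requests: int = 0

    def add_usage_metrics(self, other: "UsageMetricsSketch") -> None:
        # Accumulate another summary into this one, field by field.
        self.total_tokens += other.total_tokens
        self.prompt_tokens += other.prompt_tokens
        self.completion_tokens += other.completion_tokens
        self.successful_requests += other.successful_requests


def aggregate(summaries: Iterable[UsageMetricsSketch]) -> UsageMetricsSketch:
    # Mirrors the loop in Crew.kickoff: start from zeros and add each summary.
    total = UsageMetricsSketch()
    for summary in summaries:
        total.add_usage_metrics(summary)
    return total
```

Starting from a zero-valued object means an empty list of agents yields an all-zero summary, whereas the removed dict comprehension indexed `metrics[0]` and would fail on an empty list.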
--- a/src/crewai/agents/executor.py
+++ b/src/crewai/agents/executor.py
@@ -56,7 +56,7 @@ class CrewAgentExecutor(AgentExecutor, CrewAgentExecutorMixin):
        )
        intermediate_steps: List[Tuple[AgentAction, str]] = []
        # Allowing human input given task setting
-        if self.task.human_input:
+        if self.task and self.task.human_input:
            self.should_ask_for_human_input = True

        # Let's start tracking the number of iterations and time elapsed
--- a/src/crewai/cli/cli.py
+++ b/src/crewai/cli/cli.py
@@ -5,10 +5,11 @@ from crewai.memory.storage.kickoff_task_outputs_storage import (
    KickoffTaskOutputsSQLiteStorage,
 )

-
 from .create_crew import create_crew
-from .train_crew import train_crew
+from .evaluate_crew import evaluate_crew
 from .replay_from_task import replay_task_command
+from .reset_memories_command import reset_memories_command
+from .train_crew import train_crew


@click.group()
@@ -99,5 +100,52 @@ def log_tasks_outputs() -> None:
        click.echo(f"An error occurred while logging task outputs: {e}", err=True)


+@crewai.command()
+@click.option("-l", "--long", is_flag=True, help="Reset LONG TERM memory")
+@click.option("-s", "--short", is_flag=True, help="Reset SHORT TERM memory")
+@click.option("-e", "--entities", is_flag=True, help="Reset ENTITIES memory")
+@click.option(
+    "-k",
+    "--kickoff-outputs",
+    is_flag=True,
+    help="Reset LATEST KICKOFF TASK OUTPUTS",
+)
+@click.option("-a", "--all", is_flag=True, help="Reset ALL memories")
+def reset_memories(long, short, entities, kickoff_outputs, all):
+    """
+    Reset the crew memories (long, short, entity, latest_crew_kickoff_outputs). This will delete all the data saved.
+    """
+    try:
+        if not all and not (long or short or entities or kickoff_outputs):
+            click.echo(
+                "Please specify at least one memory type to reset using the appropriate flags."
+            )
+            return
+        reset_memories_command(long, short, entities, kickoff_outputs, all)
+    except Exception as e:
+        click.echo(f"An error occurred while resetting memories: {e}", err=True)
+
+
+@crewai.command()
+@click.option(
+    "-n",
+    "--n_iterations",
+    type=int,
+    default=3,
+    help="Number of iterations to Test the crew",
+)
+@click.option(
+    "-m",
+    "--model",
+    type=str,
+    default="gpt-4o-mini",
+    help="LLM Model to run the tests on the Crew. For now, only OpenAI models are accepted.",
+)
+def test(n_iterations: int, model: str):
+    """Test the crew and evaluate the results."""
+    click.echo(f"Testing the crew for {n_iterations} iterations with model {model}")
+    evaluate_crew(n_iterations, model)
+
+
 if __name__ == "__main__":
    crewai()
--- a/src/crewai/cli/evaluate_crew.py
+++ b/src/crewai/cli/evaluate_crew.py
@@ -0,0 +1,30 @@
+import subprocess
+
+import click
+
+
+def evaluate_crew(n_iterations: int, model: str) -> None:
+    """
+    Test and Evaluate the crew by running a command in the Poetry environment.
+
+    Args:
+        n_iterations (int): The number of iterations to test the crew.
+        model (str): The model to test the crew with.
+    """
+    command = ["poetry", "run", "test", str(n_iterations), model]
+
+    try:
+        if n_iterations <= 0:
+            raise ValueError("The number of iterations must be a positive integer.")
+
+        result = subprocess.run(command, capture_output=False, text=True, check=True)
+
+        if result.stderr:
+            click.echo(result.stderr, err=True)
+
+    except subprocess.CalledProcessError as e:
+        click.echo(f"An error occurred while testing the crew: {e}", err=True)
+        click.echo(e.output, err=True)
+
+    except Exception as e:
+        click.echo(f"An unexpected error occurred: {e}", err=True)
--- a/src/crewai/cli/reset_memories_command.py
+++ b/src/crewai/cli/reset_memories_command.py
@@ -0,0 +1,49 @@
+import subprocess
+import click
+
+from crewai.memory.entity.entity_memory import EntityMemory
+from crewai.memory.long_term.long_term_memory import LongTermMemory
+from crewai.memory.short_term.short_term_memory import ShortTermMemory
+from crewai.utilities.task_output_storage_handler import TaskOutputStorageHandler
+
+
+def reset_memories_command(long, short, entity, kickoff_outputs, all) -> None:
+    """
+    Reset the crew memories.
+
+    Args:
+      long (bool): Whether to reset the long-term memory.
+      short (bool): Whether to reset the short-term memory.
+      entity (bool): Whether to reset the entity memory.
+      kickoff_outputs (bool): Whether to reset the latest kickoff task outputs.
+      all (bool): Whether to reset all memories.
+    """
+
+    try:
+        if all:
+            ShortTermMemory().reset()
+            EntityMemory().reset()
+            LongTermMemory().reset()
+            TaskOutputStorageHandler().reset()
+            click.echo("All memories have been reset.")
+        else:
+            if long:
+                LongTermMemory().reset()
+                click.echo("Long term memory has been reset.")
+
+            if short:
+                ShortTermMemory().reset()
+                click.echo("Short term memory has been reset.")
+            if entity:
+                EntityMemory().reset()
+                click.echo("Entity memory has been reset.")
+            if kickoff_outputs:
+                TaskOutputStorageHandler().reset()
+                click.echo("Latest kickoff outputs have been reset.")
+
+    except subprocess.CalledProcessError as e:
+        click.echo(f"An error occurred while resetting the memories: {e}", err=True)
+        click.echo(e.output, err=True)
+
+    except Exception as e:
+        click.echo(f"An unexpected error occurred: {e}", err=True)
--- a/src/crewai/cli/templates/config/tasks.yaml
+++ b/src/crewai/cli/templates/config/tasks.yaml
@@ -5,6 +5,7 @@ research_task:
    the current year is 2024.
  expected_output: >
    A list with 10 bullet points of the most relevant information about {topic}
+  agent: researcher

 reporting_task:
  description: >
@@ -13,3 +14,4 @@ reporting_task:
  expected_output: >
    A fully fledge reports with the mains topics, each with a full section of information.
    Formatted as markdown without '```'
+  agent: reporting_analyst
--- a/src/crewai/cli/templates/crew.py
+++ b/src/crewai/cli/templates/crew.py
@@ -32,14 +32,12 @@ class {{crew_name}}Crew():
 	def research_task(self) -> Task:
 		return Task(
 			config=self.tasks_config['research_task'],
-			agent=self.researcher()
 		)

 	@task
 	def reporting_task(self) -> Task:
 		return Task(
 			config=self.tasks_config['reporting_task'],
-			agent=self.reporting_analyst(),
 			output_file='report.md'
 		)

--- a/src/crewai/cli/templates/main.py
+++ b/src/crewai/cli/templates/main.py
@@ -2,9 +2,15 @@
 import sys
 from {{folder_name}}.crew import {{crew_name}}Crew

+# This main file is intended to be a way for you to run your
+# crew locally, so refrain from adding unnecessary logic into this file.
+# Replace with inputs you want to test with, it will automatically
+# interpolate any tasks and agents information

 def run():
-    # Replace with your inputs, it will automatically interpolate any tasks and agents information
+    """
+    Run the crew.
+    """
    inputs = {
        'topic': 'AI LLMs'
    }
@@ -15,19 +21,34 @@ def train():
    """
    Train the crew for a given number of iterations.
    """
-    inputs = {"topic": "AI LLMs"}
+    inputs = {
+        "topic": "AI LLMs"
+    }
    try:
        {{crew_name}}Crew().crew().train(n_iterations=int(sys.argv[1]), inputs=inputs)

    except Exception as e:
        raise Exception(f"An error occurred while training the crew: {e}")

-def replay_from_task():
+def replay():
    """
    Replay the crew execution from a specific task.
    """
    try:
-        {{crew_name}}Crew().crew().replay_from_task(task_id=sys.argv[1])
+        {{crew_name}}Crew().crew().replay(task_id=sys.argv[1])
+
+    except Exception as e:
+        raise Exception(f"An error occurred while replaying the crew: {e}")
+
+def test():
+    """
+    Test the crew execution and return the results.
+    """
+    inputs = {
+        "topic": "AI LLMs"
+    }
+    try:
+        {{crew_name}}Crew().crew().test(n_iterations=int(sys.argv[1]), openai_model_name=sys.argv[2], inputs=inputs)

    except Exception as e:
        raise Exception(f"An error occurred while replaying the crew: {e}")
--- a/src/crewai/cli/templates/pyproject.toml
+++ b/src/crewai/cli/templates/pyproject.toml
@@ -6,12 +6,13 @@ authors = ["Your Name <you@example.com>"]

 [tool.poetry.dependencies]
 python = ">=3.10,<=3.13"
-crewai = { extras = ["tools"], version = "^0.35.8" }
+crewai = { extras = ["tools"], version = "^0.46.0" }

 [tool.poetry.scripts]
 {{folder_name}} = "{{folder_name}}.main:run"
 train = "{{folder_name}}.main:train"
-replay = "{{folder_name}}.main:replay_from_task"
+replay = "{{folder_name}}.main:replay"
+test = "{{folder_name}}.main:test"

 [build-system]
 requires = ["poetry-core"]
--- a/src/crewai/crew.py
+++ b/src/crewai/crew.py
@@ -3,7 +3,7 @@ import json
 import uuid
 from concurrent.futures import Future
 from hashlib import md5
-from typing import Any, Dict, List, Optional, Tuple, Union
+from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Union

 from langchain_core.callbacks import BaseCallbackHandler
 from pydantic import (
@@ -32,16 +32,16 @@ from crewai.tasks.conditional_task import ConditionalTask
 from crewai.tasks.task_output import TaskOutput
 from crewai.telemetry import Telemetry
 from crewai.tools.agent_tools import AgentTools
+from crewai.types.usage_metrics import UsageMetrics
 from crewai.utilities import I18N, FileHandler, Logger, RPMController
-from crewai.utilities.constants import (
-    TRAINED_AGENTS_DATA_FILE,
-    TRAINING_DATA_FILE,
-)
+from crewai.utilities.constants import TRAINED_AGENTS_DATA_FILE, TRAINING_DATA_FILE
+from crewai.utilities.evaluators.crew_evaluator_handler import CrewEvaluator
 from crewai.utilities.evaluators.task_evaluator import TaskEvaluator
 from crewai.utilities.formatter import (
    aggregate_raw_outputs_from_task_outputs,
    aggregate_raw_outputs_from_tasks,
 )
+from crewai.utilities.planning_handler import CrewPlanner
 from crewai.utilities.task_output_storage_handler import TaskOutputStorageHandler
 from crewai.utilities.training_handler import CrewTrainingHandler

@@ -50,6 +50,9 @@ try:
 except ImportError:
    agentops = None

+if TYPE_CHECKING:
+    from crewai.pipeline.pipeline import Pipeline
+

 class Crew(BaseModel):
    """
@@ -73,6 +76,7 @@ class Crew(BaseModel):
        task_callback: Callback to be executed after each task for every agents execution.
        step_callback: Callback to be executed after each step for every agents execution.
        share_crew: Whether you want to share the complete crew information and execution with crewAI to make the library better, and allow us to train models.
+        planning: Plan the crew execution and add the plan to the crew.
    """

    __hash__ = object.__hash__  # type: ignore
@@ -94,12 +98,13 @@ class Crew(BaseModel):
        default_factory=TaskOutputStorageHandler
    )

+    name: Optional[str] = Field(default=None)
    cache: bool = Field(default=True)
    model_config = ConfigDict(arbitrary_types_allowed=True)
    tasks: List[Task] = Field(default_factory=list)
    agents: List[BaseAgent] = Field(default_factory=list)
    process: Process = Field(default=Process.sequential)
-    verbose: Union[int, bool] = Field(default=0)
+    verbose: int = Field(default=0)
    memory: bool = Field(
        default=False,
        description="Whether the crew should use memory to store memories of it's execution",
@@ -108,7 +113,7 @@ class Crew(BaseModel):
        default={"provider": "openai"},
        description="Configuration for the embedder to be used for the crew.",
    )
-    usage_metrics: Optional[dict] = Field(
+    usage_metrics: Optional[UsageMetrics] = Field(
        default=None,
        description="Metrics for the LLM usage during all tasks execution.",
    )
@@ -144,10 +149,18 @@ class Crew(BaseModel):
        default=None,
        description="Path to the prompt json file to be used for the crew.",
    )
-    output_log_file: Optional[Union[bool, str]] = Field(
-        default=False,
+    output_log_file: Optional[str] = Field(
+        default=None,
        description="output_log_file",
    )
+    planning: Optional[bool] = Field(
+        default=False,
+        description="Plan the crew execution and add the plan to the crew.",
+    )
+    planning_llm: Optional[Any] = Field(
+        default=None,
+        description="Language model that will run the AgentPlanner if planning is True.",
+    )
    task_execution_output_json_files: Optional[List[str]] = Field(
        default=None,
        description="List of file paths for task execution JSON files.",
@@ -260,20 +273,6 @@ class Crew(BaseModel):

        return self

-    @model_validator(mode="after")
-    def check_tasks_in_hierarchical_process_not_async(self):
-        """Validates that the tasks in hierarchical process are not flagged with async_execution."""
-        if self.process == Process.hierarchical:
-            for task in self.tasks:
-                if task.async_execution:
-                    raise PydanticCustomError(
-                        "async_execution_in_hierarchical_process",
-                        "Hierarchical process error: Tasks cannot be flagged with async_execution.",
-                        {},
-                    )
-
-        return self
-
    @model_validator(mode="after")
    def validate_end_with_at_most_one_async_task(self):
        """Validates that the crew ends with at most one asynchronous task."""
@@ -453,7 +452,10 @@ class Crew(BaseModel):

            agent.create_agent_executor()

-        metrics = []
+        if self.planning:
+            self._handle_crew_planning()
+
+        metrics: List[UsageMetrics] = []

        if self.process == Process.sequential:
            result = self._run_sequential_process()
@@ -463,11 +465,12 @@ class Crew(BaseModel):
            raise NotImplementedError(
                f"The process '{self.process}' is not implemented yet."
            )
+
        metrics += [agent._token_process.get_summary() for agent in self.agents]

-        self.usage_metrics = {
-            key: sum([m[key] for m in metrics if m is not None]) for key in metrics[0]
-        }
+        self.usage_metrics = UsageMetrics()
+        for metric in metrics:
+            self.usage_metrics.add_usage_metrics(metric)

        return result

@@ -476,12 +479,7 @@ class Crew(BaseModel):
        results: List[CrewOutput] = []

        # Initialize the parent crew's usage metrics
-        total_usage_metrics = {
-            "total_tokens": 0,
-            "prompt_tokens": 0,
-            "completion_tokens": 0,
-            "successful_requests": 0,
-        }
+        total_usage_metrics = UsageMetrics()

        for input_data in inputs:
            crew = self.copy()
@@ -489,8 +487,7 @@ class Crew(BaseModel):
            output = crew.kickoff(inputs=input_data)

            if crew.usage_metrics:
-                for key in total_usage_metrics:
-                    total_usage_metrics[key] += crew.usage_metrics.get(key, 0)
+                total_usage_metrics.add_usage_metrics(crew.usage_metrics)

            results.append(output)

@@ -519,34 +516,25 @@ class Crew(BaseModel):

        results = await asyncio.gather(*tasks)

-        total_usage_metrics = {
-            "total_tokens": 0,
-            "prompt_tokens": 0,
-            "completion_tokens": 0,
-            "successful_requests": 0,
-        }
+        total_usage_metrics = UsageMetrics()
        for crew in crew_copies:
            if crew.usage_metrics:
-                for key in total_usage_metrics:
-                    total_usage_metrics[key] += crew.usage_metrics.get(key, 0)
-
-        self.usage_metrics = total_usage_metrics
-
-        total_usage_metrics = {
-            "total_tokens": 0,
-            "prompt_tokens": 0,
-            "completion_tokens": 0,
-            "successful_requests": 0,
-        }
-        for crew in crew_copies:
-            if crew.usage_metrics:
-                for key in total_usage_metrics:
-                    total_usage_metrics[key] += crew.usage_metrics.get(key, 0)
+                total_usage_metrics.add_usage_metrics(crew.usage_metrics)

        self.usage_metrics = total_usage_metrics
        self._task_output_handler.reset()
        return results

+    def _handle_crew_planning(self):
+        """Handles the Crew planning."""
+        self._logger.log("info", "Planning the crew execution")
+        result = CrewPlanner(
+            tasks=self.tasks, planning_agent_llm=self.planning_llm
+        )._handle_crew_planning()
+
+        for task, step_plan in zip(self.tasks, result.list_of_plans_per_task):
+            task.description += step_plan
+
    def _store_execution_log(
        self,
        task: Task,
@@ -583,7 +571,7 @@ class Crew(BaseModel):
    def _run_hierarchical_process(self) -> CrewOutput:
        """Creates and assigns a manager agent to make sure the crew completes the tasks."""
        self._create_manager_agent()
-        return self._execute_tasks(self.tasks, self.manager_agent)
+        return self._execute_tasks(self.tasks)

    def _create_manager_agent(self):
        i18n = I18N(prompt_file=self.prompt_file)
@@ -607,7 +595,6 @@ class Crew(BaseModel):
    def _execute_tasks(
        self,
        tasks: List[Task],
-        manager: Optional[BaseAgent] = None,
        start_index: Optional[int] = 0,
        was_replayed: bool = False,
    ) -> CrewOutput:
@@ -635,16 +622,14 @@ class Crew(BaseModel):
                        last_sync_output = task.output
                continue

-            self._prepare_task(task, manager)
-            if self.process == Process.hierarchical:
-                agent_to_use = manager
-            else:
-                agent_to_use = task.agent
+            agent_to_use = self._get_agent_to_use(task)
            if agent_to_use is None:
                raise ValueError(
                    f"No agent available for task: {task.description}. Ensure that either the task has an assigned agent or a manager agent is provided."
                )
-            self._log_task_start(task, agent_to_use)
+
+            self._prepare_agent_tools(task)
+            self._log_task_start(task, agent_to_use.role)

            if isinstance(task, ConditionalTask):
                skipped_task_output = self._handle_conditional_task(
@@ -657,7 +642,6 @@ class Crew(BaseModel):
                context = self._get_context(
                    task, [last_sync_output] if last_sync_output else []
                )
-                self._log_task_start(task, agent_to_use.role)
                future = task.execute_async(
                    agent=agent_to_use,
                    context=context,
@@ -670,7 +654,6 @@ class Crew(BaseModel):
                    futures.clear()

                context = self._get_context(task, task_outputs)
-                self._log_task_start(task, agent_to_use.role)
                task_output = task.execute_sync(
                    agent=agent_to_use,
                    context=context,
@@ -711,12 +694,20 @@ class Crew(BaseModel):
            return skipped_task_output
        return None

-    def _prepare_task(self, task: Task, manager: Optional[BaseAgent]):
+    def _prepare_agent_tools(self, task: Task):
        if self.process == Process.hierarchical:
-            self._update_manager_tools(task, manager)
+            if self.manager_agent:
+                self._update_manager_tools(task)
+            else:
+                raise ValueError("Manager agent is required for hierarchical process.")
        elif task.agent and task.agent.allow_delegation:
            self._add_delegation_tools(task)

+    def _get_agent_to_use(self, task: Task) -> Optional[BaseAgent]:
+        if self.process == Process.hierarchical:
+            return self.manager_agent
+        return task.agent
+
    def _add_delegation_tools(self, task: Task):
        agents_for_delegation = [agent for agent in self.agents if agent != task.agent]
        if len(self.agents) > 1 and len(agents_for_delegation) > 0 and task.agent:
@@ -750,11 +741,14 @@ class Crew(BaseModel):
        if self.output_log_file:
            self._file_handler.log(agent=role, task=task.description, status="started")

-    def _update_manager_tools(self, task: Task, manager: Optional[BaseAgent]):
-        if task.agent and manager:
-            manager.tools = task.agent.get_delegation_tools([task.agent])
-        if manager:
-            manager.tools = manager.get_delegation_tools(self.agents)
+    def _update_manager_tools(self, task: Task):
+        if self.manager_agent:
+            if task.agent:
+                self.manager_agent.tools = task.agent.get_delegation_tools([task.agent])
+            else:
+                self.manager_agent.tools = self.manager_agent.get_delegation_tools(
+                    self.agents
+                )

    def _get_context(self, task: Task, task_outputs: List[TaskOutput]):
        context = (
@@ -815,7 +809,7 @@ class Crew(BaseModel):
            None,
        )

-    def replay_from_task(
+    def replay(
        self, task_id: str, inputs: Optional[Dict[str, Any]] = None
    ) -> CrewOutput:
        stored_outputs = self._task_output_handler.load()
@@ -853,7 +847,7 @@ class Crew(BaseModel):
            self.tasks[i].output = task_output

        self._logging_color = "bold_blue"
-        result = self._execute_tasks(self.tasks, self.manager_agent, start_index, True)
+        result = self._execute_tasks(self.tasks, start_index, True)
        return result

    def copy(self):
@@ -916,27 +910,47 @@ class Crew(BaseModel):
            )
        self._telemetry.end_crew(self, final_string_output)

-    def calculate_usage_metrics(self) -> Dict[str, int]:
+    def calculate_usage_metrics(self) -> UsageMetrics:
        """Calculates and returns the usage metrics."""
-        total_usage_metrics = {
-            "total_tokens": 0,
-            "prompt_tokens": 0,
-            "completion_tokens": 0,
-            "successful_requests": 0,
-        }
+        total_usage_metrics = UsageMetrics()

        for agent in self.agents:
            if hasattr(agent, "_token_process"):
                token_sum = agent._token_process.get_summary()
-                for key in total_usage_metrics:
-                    total_usage_metrics[key] += token_sum.get(key, 0)
+                total_usage_metrics.add_usage_metrics(token_sum)

        if self.manager_agent and hasattr(self.manager_agent, "_token_process"):
            token_sum = self.manager_agent._token_process.get_summary()
-            for key in total_usage_metrics:
-                total_usage_metrics[key] += token_sum.get(key, 0)
+            total_usage_metrics.add_usage_metrics(token_sum)

        return total_usage_metrics

+    def test(
+        self,
+        n_iterations: int,
+        openai_model_name: str,
+        inputs: Optional[Dict[str, Any]] = None,
+    ) -> None:
+        """Test and evaluate the Crew with the given inputs for n iterations."""
+        evaluator = CrewEvaluator(self, openai_model_name)
+
+        for i in range(1, n_iterations + 1):
+            evaluator.set_iteration(i)
+            self.kickoff(inputs=inputs)
+
+        evaluator.print_crew_evaluation_result()
+
+    def __rshift__(self, other: "Crew") -> "Pipeline":
+        """
+        Implements the >> operator to chain this Crew with another Crew into a new Pipeline.
+        """
+        from crewai.pipeline.pipeline import Pipeline
+
+        if not isinstance(other, Crew):
+            raise TypeError(
+                f"Unsupported operand type for >>: '{type(self).__name__}' and '{type(other).__name__}'"
+            )
+        return Pipeline(stages=[self, other])
+
    def __repr__(self):
        return f"Crew(id={self.id}, process={self.process}, number_of_agents={len(self.agents)}, number_of_tasks={len(self.tasks)})"
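
The `__rshift__` method added above can be sketched in isolation as follows. `CrewSketch` and `PipelineSketch` are hypothetical stand-ins for `Crew` and `Pipeline`; only the `stages=[self, other]` construction and the `TypeError` check are taken from the diff, and the real `Pipeline` may validate its stages further.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineSketch:
    """Hypothetical stand-in for crewai.pipeline.pipeline.Pipeline."""

    stages: List["CrewSketch"] = field(default_factory=list)


@dataclass
class CrewSketch:
    """Hypothetical stand-in for Crew, reduced to a name."""

    name: str

    def __rshift__(self, other: "CrewSketch") -> PipelineSketch:
        # Only Crew >> Crew is supported; anything else is rejected.
        if not isinstance(other, CrewSketch):
            raise TypeError(
                f"Unsupported operand type for >>: "
                f"'{type(self).__name__}' and '{type(other).__name__}'"
            )
        # A new two-stage pipeline is created; neither crew is modified.
        return PipelineSketch(stages=[self, other])
```

With these stand-ins, `CrewSketch("research") >> CrewSketch("write")` yields a pipeline whose stages keep the left-to-right order of the expression.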
--- a/src/crewai/crews/crew_output.py
+++ b/src/crewai/crews/crew_output.py
@@ -5,6 +5,7 @@ from pydantic import BaseModel, Field

 from crewai.tasks.output_format import OutputFormat
 from crewai.tasks.task_output import TaskOutput
+from crewai.types.usage_metrics import UsageMetrics


 class CrewOutput(BaseModel):
@@ -20,9 +21,7 @@ class CrewOutput(BaseModel):
    tasks_output: list[TaskOutput] = Field(
        description="Output of each task", default=[]
    )
-    token_usage: Dict[str, Any] = Field(
-        description="Processed token summary", default={}
-    )
+    token_usage: UsageMetrics = Field(
+        description="Processed token summary", default_factory=UsageMetrics
+    )

    @property
    def json(self) -> Optional[str]:
--- a/src/crewai/memory/entity/entity_memory.py
+++ b/src/crewai/memory/entity/entity_memory.py
@@ -23,3 +23,9 @@ class EntityMemory(Memory):
        """Saves an entity item into the SQLite storage."""
        data = f"{item.name}({item.type}): {item.description}"
        super().save(data, item.metadata)
+
+    def reset(self) -> None:
+        try:
+            self.storage.reset()
+        except Exception as e:
+            raise Exception(f"An error occurred while resetting the entity memory: {e}")
--- a/src/crewai/memory/long_term/long_term_memory.py
+++ b/src/crewai/memory/long_term/long_term_memory.py
@@ -30,3 +30,6 @@ class LongTermMemory(Memory):

    def search(self, task: str, latest_n: int = 3) -> Dict[str, Any]:
        return self.storage.load(task, latest_n)  # type: ignore # BUG?: "Storage" has no attribute "load"
+
+    def reset(self) -> None:
+        self.storage.reset()
--- a/src/crewai/memory/short_term/short_term_memory.py
+++ b/src/crewai/memory/short_term/short_term_memory.py
@@ -1,3 +1,4 @@
+from typing import Any, Dict, Optional
 from crewai.memory.memory import Memory
 from crewai.memory.short_term.short_term_memory_item import ShortTermMemoryItem
 from crewai.memory.storage.rag_storage import RAGStorage
@@ -18,8 +19,23 @@ class ShortTermMemory(Memory):
        )
        super().__init__(storage)

-    def save(self, item: ShortTermMemoryItem) -> None:  # type: ignore # BUG?: Signature of "save" incompatible with supertype "Memory"
-        super().save(item.data, item.metadata, item.agent)
+    def save(
+        self,
+        value: Any,
+        metadata: Optional[Dict[str, Any]] = None,
+        agent: Optional[str] = None,
+    ) -> None:
+        item = ShortTermMemoryItem(data=value, metadata=metadata, agent=agent)
+
+        super().save(value=item.data, metadata=item.metadata, agent=item.agent)

    def search(self, query: str, score_threshold: float = 0.35):
        return self.storage.search(query=query, score_threshold=score_threshold)  # type: ignore # BUG? The reference is to the parent class, but the parent class does not have this parameters
+
+    def reset(self) -> None:
+        try:
+            self.storage.reset()
+        except Exception as e:
+            raise Exception(
+                f"An error occurred while resetting the short-term memory: {e}"
+            )
--- a/src/crewai/memory/short_term/short_term_memory_item.py
+++ b/src/crewai/memory/short_term/short_term_memory_item.py
@@ -3,7 +3,10 @@ from typing import Any, Dict, Optional

 class ShortTermMemoryItem:
    def __init__(
-        self, data: Any, agent: str, metadata: Optional[Dict[str, Any]] = None
+        self,
+        data: Any,
+        agent: Optional[str] = None,
+        metadata: Optional[Dict[str, Any]] = None,
    ):
        self.data = data
        self.agent = agent
--- a/src/crewai/memory/storage/interface.py
+++ b/src/crewai/memory/storage/interface.py
@@ -4,8 +4,11 @@ from typing import Any, Dict
 class Storage:
    """Abstract base class defining the storage interface"""

-    def save(self, key: str, value: Any, metadata: Dict[str, Any]) -> None:
+    def save(self, value: Any, metadata: Dict[str, Any]) -> None:
        pass

    def search(self, key: str) -> Dict[str, Any]:  # type: ignore
        pass
+
+    def reset(self) -> None:
+        pass
--- a/src/crewai/memory/storage/ltm_sqlite_storage.py
+++ b/src/crewai/memory/storage/ltm_sqlite_storage.py
@@ -103,3 +103,20 @@ class LTMSQLiteStorage:
                color="red",
            )
        return None
+
+    def reset(
+        self,
+    ) -> None:
+        """Resets the LTM table with error handling."""
+        try:
+            with sqlite3.connect(self.db_path) as conn:
+                cursor = conn.cursor()
+                cursor.execute("DELETE FROM long_term_memories")
+                conn.commit()
+
+        except sqlite3.Error as e:
+            self._printer.print(
+                content=f"MEMORY ERROR: An error occurred while deleting all rows in LTM: {e}",
+                color="red",
+            )
+        return None
--- a/src/crewai/memory/storage/rag_storage.py
+++ b/src/crewai/memory/storage/rag_storage.py
@@ -2,6 +2,7 @@ import contextlib
 import io
 import logging
 import os
+import shutil
 from typing import Any, Dict, List, Optional

 from embedchain import App
@@ -71,13 +72,13 @@ class RAGStorage(Storage):

        if embedder_config:
            config["embedder"] = embedder_config
-
+        self.type = type
        self.app = App.from_config(config=config)
        self.app.llm = FakeLLM()
        if allow_reset:
            self.app.reset()

-    def save(self, value: Any, metadata: Dict[str, Any]) -> None:  # type: ignore # BUG?: Should be save(key, value, metadata)  Signature of "save" incompatible with supertype "Storage"
+    def save(self, value: Any, metadata: Dict[str, Any]) -> None:
        self._generate_embedding(value, metadata)

    def search(  # type: ignore # BUG?: Signature of "search" incompatible with supertype "Storage"
@@ -102,3 +103,11 @@ class RAGStorage(Storage):
    def _generate_embedding(self, text: str, metadata: Dict[str, Any]) -> Any:
        with suppress_logging():
            self.app.add(text, data_type="text", metadata=metadata)
+
+    def reset(self) -> None:
+        try:
+            shutil.rmtree(f"{db_storage_path()}/{self.type}")
+        except Exception as e:
+            raise Exception(
+                f"An error occurred while resetting the {self.type} memory: {e}"
+            )
--- a/src/crewai/pipeline/__init__.py
+++ b/src/crewai/pipeline/__init__.py
@@ -0,0 +1,3 @@
+from crewai.pipeline.pipeline import Pipeline
+
+__all__ = ["Pipeline"]
--- a/src/crewai/pipeline/pipeline.py
+++ b/src/crewai/pipeline/pipeline.py
@@ -0,0 +1,371 @@
+import asyncio
+import copy
+from typing import Any, Dict, List, Tuple, Union
+
+from pydantic import BaseModel, Field, model_validator
+
+from crewai.crew import Crew
+from crewai.crews.crew_output import CrewOutput
+from crewai.pipeline.pipeline_kickoff_result import PipelineKickoffResult
+from crewai.types.usage_metrics import UsageMetrics
+
+Trace = Union[Union[str, Dict[str, Any]], List[Union[str, Dict[str, Any]]]]
+
+
+"""
+Developer Notes:
+
+This module defines a Pipeline class that represents a sequence of operations (stages)
+to process inputs. Each stage can be either sequential or parallel, and the pipeline
+can process multiple kickoffs concurrently.
+
+Core Loop Explanation:
+1. The `kickoff` method processes multiple kickoffs in parallel, each going through
+   all pipeline stages.
+2. The `process_single_kickoff` method handles the processing of a single kickoff through
+   all stages, updating metrics and input data along the way.
+3. The `_process_stage` method determines whether a stage is sequential or parallel
+   and processes it accordingly.
+4. The `_process_single_crew` and `_process_parallel_crews` methods handle the
+   execution of single and parallel crew stages.
+5. The `_update_metrics_and_input` method updates usage metrics and the current input
+   with the outputs from a stage.
+6. The `_build_pipeline_kickoff_results` method constructs the final results of the
+   pipeline kickoff, including traces and outputs.
+
+Handling Traces and Crew Outputs:
+- During the processing of stages, we handle the results (traces and crew outputs)
+  for all stages except the last one differently from the final stage.
+- For intermediate stages, the primary focus is on passing the input data between stages.
+  This involves merging the output dictionaries from all crews in a stage into a single
+  dictionary and passing it to the next stage. This merged dictionary allows for smooth
+  data flow between stages.
+- For the final stage, in addition to passing the input data, we also need to prepare
+  the final outputs and traces to be returned as the overall result of the pipeline kickoff.
+  In this case, we do not merge the results, as each result needs to be included
+  separately in its own pipeline kickoff result.
+
+Pipeline Terminology:
+- Pipeline: The overall structure that defines a sequence of operations.
+- Stage: A distinct part of the pipeline, which can be either sequential or parallel.
+- Kickoff: A specific execution of the pipeline for a given set of inputs, representing a single instance of processing through the pipeline.
+- Branch: Parallel executions within a stage (e.g., concurrent crew operations).
+- Trace: The journey of an individual input through the entire pipeline.
+
+Example pipeline structure:
+crew1 >> crew2 >> crew3
+
+This represents a pipeline with three sequential stages:
+1. crew1 is the first stage, which processes the input and passes its output to crew2.
+2. crew2 is the second stage, which takes the output from crew1 as its input, processes it, and passes its output to crew3.
+3. crew3 is the final stage, which takes the output from crew2 as its input and produces the final output of the pipeline.
+
+Each input creates its own kickoff, flowing through all stages of the pipeline.
+Multiple kickoffs can be processed concurrently, each following the defined pipeline structure.
+
+Another example pipeline structure:
+crew1 >> [crew2, crew3] >> crew4
+
+This represents a pipeline with three stages:
+1. A sequential stage (crew1)
+2. A parallel stage with two branches (crew2 and crew3 executing concurrently)
+3. Another sequential stage (crew4)
+
+Each input creates its own kickoff, flowing through all stages of the pipeline.
+Multiple kickoffs can be processed concurrently, each following the defined pipeline structure.
+"""
+
+
+class Pipeline(BaseModel):
+    stages: List[Union[Crew, List[Crew]]] = Field(
+        ..., description="List of crews representing stages to be executed in sequence"
+    )
+
+    @model_validator(mode="before")
+    @classmethod
+    def validate_stages(cls, values):
+        """
+        Validates the stages to ensure correct nesting and types.
+
+        Args:
+            values (dict): Dictionary containing the pipeline stages.
+
+        Returns:
+            dict: Validated stages.
+        """
+        stages = values.get("stages", [])
+
+        def check_nesting_and_type(item, depth=0):
+            if depth > 1:
+                raise ValueError("Double nesting is not allowed in pipeline stages")
+            if isinstance(item, list):
+                for sub_item in item:
+                    check_nesting_and_type(sub_item, depth + 1)
+            elif not isinstance(item, Crew):
+                raise ValueError(
+                    f"Expected Crew instance or list of Crews, got {type(item)}"
+                )
+
+        for stage in stages:
+            check_nesting_and_type(stage)
+        return values
+
+    async def kickoff(
+        self, inputs: List[Dict[str, Any]]
+    ) -> List[PipelineKickoffResult]:
+        """
+        Processes multiple runs in parallel, each going through all pipeline stages.
+
+        Args:
+            inputs (List[Dict[str, Any]]): List of inputs for each run.
+
+        Returns:
+            List[PipelineKickoffResult]: List of results from each run.
+        """
+        pipeline_results: List[PipelineKickoffResult] = []
+
+        # Process all runs in parallel
+        all_run_results = await asyncio.gather(
+            *(self.process_single_kickoff(input_data) for input_data in inputs)
+        )
+
+        # Flatten the list of lists into a single list of results
+        pipeline_results.extend(
+            result for run_result in all_run_results for result in run_result
+        )
+
+        return pipeline_results
+
+    async def process_single_kickoff(
+        self, kickoff_input: Dict[str, Any]
+    ) -> List[PipelineKickoffResult]:
+        """
+        Processes a single run through all pipeline stages.
+
+        Args:
+            kickoff_input (Dict[str, Any]): The input for the run.
+
+        Returns:
+            List[PipelineKickoffResult]: The results of processing the run.
+        """
+        initial_input = copy.deepcopy(kickoff_input)
+        current_input = copy.deepcopy(kickoff_input)
+        pipeline_usage_metrics: Dict[str, UsageMetrics] = {}
+        all_stage_outputs: List[List[CrewOutput]] = []
+        traces: List[List[Union[str, Dict[str, Any]]]] = [[initial_input]]
+
+        for stage in self.stages:
+            stage_input = copy.deepcopy(current_input)
+            stage_outputs, stage_trace = await self._process_stage(stage, stage_input)
+
+            self._update_metrics_and_input(
+                pipeline_usage_metrics, current_input, stage, stage_outputs
+            )
+            traces.append(stage_trace)
+            all_stage_outputs.append(stage_outputs)
+
+        return self._build_pipeline_kickoff_results(
+            all_stage_outputs, traces, pipeline_usage_metrics
+        )
+
+    async def _process_stage(
+        self, stage: Union[Crew, List[Crew]], current_input: Dict[str, Any]
+    ) -> Tuple[List[CrewOutput], List[Union[str, Dict[str, Any]]]]:
+        """
+        Processes a single stage of the pipeline, which can be either sequential or parallel.
+
+        Args:
+            stage (Union[Crew, List[Crew]]): The stage to process.
+            current_input (Dict[str, Any]): The input for the stage.
+
+        Returns:
+            Tuple[List[CrewOutput], List[Union[str, Dict[str, Any]]]]: The outputs and trace of the stage.
+        """
+        if isinstance(stage, Crew):
+            return await self._process_single_crew(stage, current_input)
+        else:
+            return await self._process_parallel_crews(stage, current_input)
+
+    async def _process_single_crew(
+        self, crew: Crew, current_input: Dict[str, Any]
+    ) -> Tuple[List[CrewOutput], List[Union[str, Dict[str, Any]]]]:
+        """
+        Processes a single crew.
+
+        Args:
+            crew (Crew): The crew to process.
+            current_input (Dict[str, Any]): The input for the crew.
+
+        Returns:
+            Tuple[List[CrewOutput], List[Union[str, Dict[str, Any]]]]: The output and trace of the crew.
+        """
+        output = await crew.kickoff_async(inputs=current_input)
+        return [output], [crew.name or str(crew.id)]
+
+    async def _process_parallel_crews(
+        self, crews: List[Crew], current_input: Dict[str, Any]
+    ) -> Tuple[List[CrewOutput], List[Union[str, Dict[str, Any]]]]:
+        """
+        Processes multiple crews in parallel.
+
+        Args:
+            crews (List[Crew]): The list of crews to process in parallel.
+            current_input (Dict[str, Any]): The input for the crews.
+
+        Returns:
+            Tuple[List[CrewOutput], List[Union[str, Dict[str, Any]]]]: The outputs and traces of the crews.
+        """
+        parallel_outputs = await asyncio.gather(
+            *[crew.kickoff_async(inputs=current_input) for crew in crews]
+        )
+        return parallel_outputs, [crew.name or str(crew.id) for crew in crews]
+
+    def _update_metrics_and_input(
+        self,
+        usage_metrics: Dict[str, UsageMetrics],
+        current_input: Dict[str, Any],
+        stage: Union[Crew, List[Crew]],
+        outputs: List[CrewOutput],
+    ) -> None:
+        """
+        Updates metrics and current input with the outputs of a stage.
+
+        Args:
+            usage_metrics (Dict[str, UsageMetrics]): The usage metrics to update.
+            current_input (Dict[str, Any]): The current input to update.
+            stage (Union[Crew, List[Crew]]): The stage that was processed.
+            outputs (List[CrewOutput]): The outputs of the stage.
+        """
+        for crew, output in zip([stage] if isinstance(stage, Crew) else stage, outputs):
+            usage_metrics[crew.name or str(crew.id)] = output.token_usage
+            current_input.update(output.to_dict())
+
+    def _build_pipeline_kickoff_results(
+        self,
+        all_stage_outputs: List[List[CrewOutput]],
+        traces: List[List[Union[str, Dict[str, Any]]]],
+        token_usage: Dict[str, UsageMetrics],
+    ) -> List[PipelineKickoffResult]:
+        """
+        Builds the results of a pipeline run.
+
+        Args:
+            all_stage_outputs (List[List[CrewOutput]]): All stage outputs.
+            traces (List[List[Union[str, Dict[str, Any]]]]): All traces.
+            token_usage (Dict[str, UsageMetrics]): Token usage metrics.
+
+        Returns:
+            List[PipelineKickoffResult]: The results of the pipeline run.
+        """
+        formatted_traces = self._format_traces(traces)
+        formatted_crew_outputs = self._format_crew_outputs(all_stage_outputs)
+
+        return [
+            PipelineKickoffResult(
+                token_usage=token_usage,
+                trace=formatted_trace,
+                raw=crews_outputs[-1].raw,
+                pydantic=crews_outputs[-1].pydantic,
+                json_dict=crews_outputs[-1].json_dict,
+                crews_outputs=crews_outputs,
+            )
+            for crews_outputs, formatted_trace in zip(
+                formatted_crew_outputs, formatted_traces
+            )
+        ]
+
+    def _format_traces(
+        self, traces: List[List[Union[str, Dict[str, Any]]]]
+    ) -> List[List[Trace]]:
+        """
+        Formats the traces of a pipeline run.
+
+        Args:
+            traces (List[List[Union[str, Dict[str, Any]]]]): The traces to format.
+
+        Returns:
+            List[List[Trace]]: The formatted traces.
+        """
+        formatted_traces: List[Trace] = self._format_single_trace(traces[:-1])
+        return self._format_multiple_traces(formatted_traces, traces[-1])
+
+    def _format_single_trace(
+        self, traces: List[List[Union[str, Dict[str, Any]]]]
+    ) -> List[Trace]:
+        """
+        Formats single traces.
+
+        Args:
+            traces (List[List[Union[str, Dict[str, Any]]]]): The traces to format.
+
+        Returns:
+            List[Trace]: The formatted single traces.
+        """
+        formatted_traces: List[Trace] = []
+        for trace in traces:
+            formatted_traces.append(trace[0] if len(trace) == 1 else trace)
+        return formatted_traces
+
+    def _format_multiple_traces(
+        self,
+        formatted_traces: List[Trace],
+        final_trace: List[Union[str, Dict[str, Any]]],
+    ) -> List[List[Trace]]:
+        """
+        Formats multiple traces.
+
+        Args:
+            formatted_traces (List[Trace]): The formatted single traces.
+            final_trace (List[Union[str, Dict[str, Any]]]): The final trace to format.
+
+        Returns:
+            List[List[Trace]]: The formatted multiple traces.
+        """
+        traces_to_return: List[List[Trace]] = []
+        if len(final_trace) == 1:
+            formatted_traces.append(final_trace[0])
+            traces_to_return.append(formatted_traces)
+        else:
+            for trace in final_trace:
+                copied_traces = formatted_traces.copy()
+                copied_traces.append(trace)
+                traces_to_return.append(copied_traces)
+        return traces_to_return
+
+    def _format_crew_outputs(
+        self, all_stage_outputs: List[List[CrewOutput]]
+    ) -> List[List[CrewOutput]]:
+        """
+        Formats the outputs of all stages into a list of crew outputs.
+
+        Args:
+            all_stage_outputs (List[List[CrewOutput]]): All stage outputs.
+
+        Returns:
+            List[List[CrewOutput]]: Formatted crew outputs.
+        """
+        crew_outputs: List[CrewOutput] = [
+            output
+            for stage_outputs in all_stage_outputs[:-1]
+            for output in stage_outputs
+        ]
+        return [crew_outputs + [output] for output in all_stage_outputs[-1]]
+
+    def __rshift__(self, other: Any) -> "Pipeline":
+        """
+        Implements the >> operator to add another Stage (Crew or List[Crew]) to an existing Pipeline.
+
+        Args:
+            other (Any): The stage to add.
+
+        Returns:
+            Pipeline: A new pipeline with the added stage.
+        """
+        if isinstance(other, Crew):
+            return type(self)(stages=self.stages + [other])
+        elif isinstance(other, list) and all(isinstance(crew, Crew) for crew in other):
+            return type(self)(stages=self.stages + [other])
+        else:
+            raise TypeError(
+                f"Unsupported operand type for >>: '{type(self).__name__}' and '{type(other).__name__}'"
+            )
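Reviewer note: the trace handling in `_format_traces`, `_format_single_trace`, and `_format_multiple_traces` can be hard to follow from the diff alone. The sketch below condenses the same logic into one standalone function so the fan-out on the final stage is easier to see. `format_traces` and `TraceItem` are names chosen for this illustration and do not exist in the module.

```python
from typing import Any, Dict, List, Union

TraceItem = Union[str, Dict[str, Any]]


def format_traces(traces: List[List[TraceItem]]) -> List[List[Any]]:
    """Build one trace per final-stage crew.

    `traces` holds one list per step: the initial input first, then the
    crew names of each stage. Intermediate single-crew stages collapse to
    a bare item; intermediate parallel stages stay as a list.
    """
    prefix = [step[0] if len(step) == 1 else step for step in traces[:-1]]
    # Each crew in the final stage gets its own copy of the shared prefix.
    return [prefix + [final_crew] for final_crew in traces[-1]]


# crew1 >> [crew2, crew3]: the parallel final stage yields two traces.
fan_out = format_traces([[{"topic": "AI"}], ["crew1"], ["crew2", "crew3"]])

# crew1 >> [crew2, crew3] >> crew4: the parallel middle stage stays nested.
single = format_traces([[{"topic": "AI"}], ["crew1"], ["crew2", "crew3"], ["crew4"]])
```

Under these assumptions `fan_out` contains two traces that differ only in their last element, which matches the docstring's statement that final-stage results are not merged.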
--- a/src/crewai/pipeline/pipeline_kickoff_result.py
+++ b/src/crewai/pipeline/pipeline_kickoff_result.py
@@ -0,0 +1,61 @@
+import json
+import uuid
+from typing import Any, Dict, List, Optional, Union
+
+from pydantic import UUID4, BaseModel, Field
+
+from crewai.crews.crew_output import CrewOutput
+from crewai.types.usage_metrics import UsageMetrics
+
+
+class PipelineKickoffResult(BaseModel):
+    """Class that represents the result of a pipeline run."""
+
+    id: UUID4 = Field(
+        default_factory=uuid.uuid4,
+        frozen=True,
+        description="Unique identifier for the object, not set by user.",
+    )
+    raw: str = Field(description="Raw output of the pipeline run", default="")
+    pydantic: Any = Field(
+        description="Pydantic output of the pipeline run", default=None
+    )
+    json_dict: Union[Dict[str, Any], None] = Field(
+        description="JSON dict output of the pipeline run", default={}
+    )
+
+    token_usage: Dict[str, UsageMetrics] = Field(
+        description="Token usage for each crew in the run"
+    )
+    trace: List[Any] = Field(
+        description="Trace of the journey of inputs through the run"
+    )
+    crews_outputs: List[CrewOutput] = Field(
+        description="Output from each crew in the run",
+        default=[],
+    )
+
+    @property
+    def json(self) -> Optional[str]:
+        if self.crews_outputs[-1].tasks_output[-1].output_format != "json":
+            raise ValueError(
+                "No JSON output found in the final task of the final crew. Please make sure to set the output_json property in the final task in your crew."
+            )
+
+        return json.dumps(self.json_dict)
+
+    def to_dict(self) -> Dict[str, Any]:
+        """Convert json_output and pydantic_output to a dictionary."""
+        output_dict = {}
+        if self.json_dict:
+            output_dict.update(self.json_dict)
+        elif self.pydantic:
+            output_dict.update(self.pydantic.model_dump())
+        return output_dict
+
+    def __str__(self):
+        if self.pydantic:
+            return str(self.pydantic)
+        if self.json_dict:
+            return str(self.json_dict)
+        return self.raw
--- a/src/crewai/pipeline/pipeline_output.py
+++ b/src/crewai/pipeline/pipeline_output.py
@@ -0,0 +1,20 @@
+import uuid
+from typing import List
+
+from pydantic import UUID4, BaseModel, Field
+
+from crewai.pipeline.pipeline_kickoff_result import PipelineKickoffResult
+
+
+class PipelineOutput(BaseModel):
+    id: UUID4 = Field(
+        default_factory=uuid.uuid4,
+        frozen=True,
+        description="Unique identifier for the object, not set by user.",
+    )
+    run_results: List[PipelineKickoffResult] = Field(
+        description="List of results for each run through the pipeline", default=[]
+    )
+
+    def add_run_result(self, result: PipelineKickoffResult):
+        self.run_results.append(result)
--- a/src/crewai/project/__init__.py
+++ b/src/crewai/project/__init__.py
@@ -1,2 +1,25 @@
-from .annotations import agent, crew, task
+from .annotations import (
+    agent,
+    crew,
+    task,
+    output_json,
+    output_pydantic,
+    tool,
+    callback,
+    llm,
+    cache_handler,
+)
 from .crew_base import CrewBase
+
+__all__ = [
+    "agent",
+    "crew",
+    "task",
+    "output_json",
+    "output_pydantic",
+    "tool",
+    "callback",
+    "CrewBase",
+    "llm",
+    "cache_handler",
+]
--- a/src/crewai/project/annotations.py
+++ b/src/crewai/project/annotations.py
@@ -30,6 +30,37 @@ def agent(func):
    return func


+def llm(func):
+    func.is_llm = True
+    func = memoize(func)
+    return func
+
+
+def output_json(cls):
+    cls.is_output_json = True
+    return cls
+
+
+def output_pydantic(cls):
+    cls.is_output_pydantic = True
+    return cls
+
+
+def tool(func):
+    func.is_tool = True
+    return memoize(func)
+
+
+def callback(func):
+    func.is_callback = True
+    return memoize(func)
+
+
+def cache_handler(func):
+    func.is_cache_handler = True
+    return memoize(func)
+
+
 def crew(func):
    def wrapper(self, *args, **kwargs):
        instantiated_tasks = []
--- a/src/crewai/project/crew_base.py
+++ b/src/crewai/project/crew_base.py
@@ -1,6 +1,7 @@
 import inspect
 import os
 from pathlib import Path
+from typing import Any, Callable, Dict

 import yaml
 from dotenv import load_dotenv
@@ -20,11 +21,6 @@ def CrewBase(cls):
                base_directory = Path(frame_info.filename).parent.resolve()
                break

-        if base_directory is None:
-            raise Exception(
-                "Unable to dynamically determine the project's base directory, you must run it from the project's root directory."
-            )
-
        original_agents_config_path = getattr(
            cls, "agents_config", "config/agents.yaml"
        )
@@ -32,12 +28,20 @@ def CrewBase(cls):

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
+
+            if self.base_directory is None:
+                raise Exception(
+                    "Unable to dynamically determine the project's base directory, you must run it from the project's root directory."
+                )
+
            self.agents_config = self.load_yaml(
                os.path.join(self.base_directory, self.original_agents_config_path)
            )
            self.tasks_config = self.load_yaml(
                os.path.join(self.base_directory, self.original_tasks_config_path)
            )
+            self.map_all_agent_variables()
+            self.map_all_task_variables()

        @staticmethod
        def load_yaml(config_path: str):
@@ -45,4 +49,138 @@ def CrewBase(cls):
                # parsedContent = YamlParser.parse(file)  # type: ignore # Argument 1 to "parse" has incompatible type "TextIOWrapper"; expected "YamlParser"
                return yaml.safe_load(file)

+        def _get_all_functions(self):
+            return {
+                name: getattr(self, name)
+                for name in dir(self)
+                if callable(getattr(self, name))
+            }
+
+        def _filter_functions(
+            self, functions: Dict[str, Callable], attribute: str
+        ) -> Dict[str, Callable]:
+            return {
+                name: func
+                for name, func in functions.items()
+                if hasattr(func, attribute)
+            }
+
+        def map_all_agent_variables(self) -> None:
+            all_functions = self._get_all_functions()
+            llms = self._filter_functions(all_functions, "is_llm")
+            tool_functions = self._filter_functions(all_functions, "is_tool")
+            cache_handler_functions = self._filter_functions(
+                all_functions, "is_cache_handler"
+            )
+            callbacks = self._filter_functions(all_functions, "is_callback")
+            agents = self._filter_functions(all_functions, "is_agent")
+
+            for agent_name, agent_info in self.agents_config.items():
+                self._map_agent_variables(
+                    agent_name,
+                    agent_info,
+                    agents,
+                    llms,
+                    tool_functions,
+                    cache_handler_functions,
+                    callbacks,
+                )
+
+        def _map_agent_variables(
+            self,
+            agent_name: str,
+            agent_info: Dict[str, Any],
+            agents: Dict[str, Callable],
+            llms: Dict[str, Callable],
+            tool_functions: Dict[str, Callable],
+            cache_handler_functions: Dict[str, Callable],
+            callbacks: Dict[str, Callable],
+        ) -> None:
+            if llm := agent_info.get("llm"):
+                self.agents_config[agent_name]["llm"] = llms[llm]()
+
+            if tools := agent_info.get("tools"):
+                self.agents_config[agent_name]["tools"] = [
+                    tool_functions[tool]() for tool in tools
+                ]
+
+            if function_calling_llm := agent_info.get("function_calling_llm"):
+                self.agents_config[agent_name]["function_calling_llm"] = agents[
+                    function_calling_llm
+                ]()
+
+            if step_callback := agent_info.get("step_callback"):
+                self.agents_config[agent_name]["step_callback"] = callbacks[
+                    step_callback
+                ]()
+
+            if cache_handler := agent_info.get("cache_handler"):
+                self.agents_config[agent_name]["cache_handler"] = (
+                    cache_handler_functions[cache_handler]()
+                )
+
+        def map_all_task_variables(self) -> None:
+            all_functions = self._get_all_functions()
+            agents = self._filter_functions(all_functions, "is_agent")
+            tasks = self._filter_functions(all_functions, "is_task")
+            output_json_functions = self._filter_functions(
+                all_functions, "is_output_json"
+            )
+            tool_functions = self._filter_functions(all_functions, "is_tool")
+            callback_functions = self._filter_functions(all_functions, "is_callback")
+            output_pydantic_functions = self._filter_functions(
+                all_functions, "is_output_pydantic"
+            )
+
+            for task_name, task_info in self.tasks_config.items():
+                self._map_task_variables(
+                    task_name,
+                    task_info,
+                    agents,
+                    tasks,
+                    output_json_functions,
+                    tool_functions,
+                    callback_functions,
+                    output_pydantic_functions,
+                )
+
+        def _map_task_variables(
+            self,
+            task_name: str,
+            task_info: Dict[str, Any],
+            agents: Dict[str, Callable],
+            tasks: Dict[str, Callable],
+            output_json_functions: Dict[str, Callable],
+            tool_functions: Dict[str, Callable],
+            callback_functions: Dict[str, Callable],
+            output_pydantic_functions: Dict[str, Callable],
+        ) -> None:
+            if context_list := task_info.get("context"):
+                self.tasks_config[task_name]["context"] = [
+                    tasks[context_task_name]() for context_task_name in context_list
+                ]
+
+            if tools := task_info.get("tools"):
+                self.tasks_config[task_name]["tools"] = [
+                    tool_functions[tool]() for tool in tools
+                ]
+
+            if agent_name := task_info.get("agent"):
+                self.tasks_config[task_name]["agent"] = agents[agent_name]()
+
+            if output_json := task_info.get("output_json"):
+                self.tasks_config[task_name]["output_json"] = output_json_functions[
+                    output_json
+                ]
+
+            if output_pydantic := task_info.get("output_pydantic"):
+                self.tasks_config[task_name]["output_pydantic"] = (
+                    output_pydantic_functions[output_pydantic]
+                )
+
+            if callbacks := task_info.get("callbacks"):
+                self.tasks_config[task_name]["callbacks"] = [
+                    callback_functions[callback]() for callback in callbacks
+                ]
+
    return WrappedClass
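Reviewer note: the mapping added to `CrewBase` relies on decorators that tag functions with marker attributes (`is_tool`, `is_llm`, and so on), then looks those functions up by the names used in the YAML config. The sketch below shows that mechanism in a self-contained form. It omits `memoize`, uses plain strings in place of real tool and LLM objects, and `ExampleCrew` and `resolve_agent_config` are hypothetical names for this illustration.

```python
from typing import Any, Callable, Dict


def tool(func):
    """Tag a method as a tool factory (memoization omitted in this sketch)."""
    func.is_tool = True
    return func


def llm(func):
    """Tag a method as an LLM factory (memoization omitted in this sketch)."""
    func.is_llm = True
    return func


class ExampleCrew:
    @tool
    def search_tool(self) -> str:
        return "search-tool-instance"

    @llm
    def fast_llm(self) -> str:
        return "fast-llm-instance"

    def helper(self) -> None:  # untagged, so it is never picked up
        return None


def get_all_functions(obj: Any) -> Dict[str, Callable]:
    """Collect every callable attribute of obj, keyed by attribute name."""
    return {
        name: getattr(obj, name) for name in dir(obj) if callable(getattr(obj, name))
    }


def filter_functions(functions: Dict[str, Callable], attribute: str) -> Dict[str, Callable]:
    """Keep only callables carrying the given marker attribute."""
    return {name: f for name, f in functions.items() if hasattr(f, attribute)}


def resolve_agent_config(crew: Any, config: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Replace the names in an agents config with the objects the factories return."""
    functions = get_all_functions(crew)
    tools = filter_functions(functions, "is_tool")
    llms = filter_functions(functions, "is_llm")
    resolved: Dict[str, Dict[str, Any]] = {}
    for agent_name, info in config.items():
        entry = dict(info)
        if llm_name := info.get("llm"):
            entry["llm"] = llms[llm_name]()
        if tool_names := info.get("tools"):
            entry["tools"] = [tools[name]() for name in tool_names]
        resolved[agent_name] = entry
    return resolved


resolved = resolve_agent_config(
    ExampleCrew(), {"researcher": {"llm": "fast_llm", "tools": ["search_tool"]}}
)
```

As in the diff, a name in the config that has no matching tagged method raises `KeyError`, so a typo in the YAML surfaces at construction time rather than later.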
--- a/src/crewai/task.py
+++ b/src/crewai/task.py
@@ -1,6 +1,6 @@
+import datetime
 import json
 import os
-import re
 import threading
 import uuid
 from concurrent.futures import Future
@@ -8,7 +8,6 @@ from copy import copy
 from hashlib import md5
 from typing import Any, Dict, List, Optional, Tuple, Type, Union

-from langchain_openai import ChatOpenAI
 from opentelemetry.trace import Span
 from pydantic import UUID4, BaseModel, Field, field_validator, model_validator
 from pydantic_core import PydanticCustomError
@@ -17,10 +16,8 @@ from crewai.agents.agent_builder.base_agent import BaseAgent
 from crewai.tasks.output_format import OutputFormat
 from crewai.tasks.task_output import TaskOutput
 from crewai.telemetry.telemetry import Telemetry
-from crewai.utilities.converter import Converter, ConverterError
+from crewai.utilities.converter import Converter, convert_to_model
 from crewai.utilities.i18n import I18N
-from crewai.utilities.printer import Printer
-from crewai.utilities.pydantic_schema_parser import PydanticSchemaParser


 class Task(BaseModel):
@@ -50,6 +47,7 @@ class Task(BaseModel):
    tools_errors: int = 0
    delegations: int = 0
    i18n: I18N = I18N()
+    name: Optional[str] = Field(default=None)
    prompt_context: Optional[str] = None
    description: str = Field(description="Description of the actual task.")
    expected_output: str = Field(
@@ -111,6 +109,7 @@ class Task(BaseModel):
    _original_description: str | None = None
    _original_expected_output: str | None = None
    _thread: threading.Thread | None = None
+    _execution_time: float | None = None

    def __init__(__pydantic_self__, **data):
        config = data.pop("config", {})
@@ -124,9 +123,15 @@ class Task(BaseModel):
                "may_not_set_field", "This field is not to be set by the user.", {}
            )

+    def _set_start_execution_time(self) -> float:
+        return datetime.datetime.now().timestamp()
+
+    def _set_end_execution_time(self, start_time: float) -> None:
+        self._execution_time = datetime.datetime.now().timestamp() - start_time
+
    @field_validator("output_file")
    @classmethod
-    def output_file_validattion(cls, value: str) -> str:
+    def output_file_validation(cls, value: str) -> str:
        """Validate the output file path by removing the / from the beginning of the path."""
        if value.startswith("/"):
            return value[1:]
@@ -213,13 +218,14 @@ class Task(BaseModel):
        tools: Optional[List[Any]],
    ) -> TaskOutput:
        """Run the core execution logic of the task."""
-        self.agent = agent
        agent = agent or self.agent
+        self.agent = agent
        if not agent:
            raise Exception(
                f"The task '{self.description}' has no agent assigned, therefore it can't be executed directly and should be executed in a Crew using a specific process that support that, like hierarchical."
            )

+        start_time = self._set_start_execution_time()
        self._execution_span = self._telemetry.task_started(crew=agent.crew, task=self)

        self.prompt_context = context
@@ -243,6 +249,7 @@ class Task(BaseModel):
        )
        self.output = task_output

+        self._set_end_execution_time(start_time)
        if self.callback:
            self.callback(self.output)

@@ -254,7 +261,9 @@ class Task(BaseModel):
            content = (
                json_output
                if json_output
-                else pydantic_output.model_dump_json() if pydantic_output else result
+                else pydantic_output.model_dump_json()
+                if pydantic_output
+                else result
            )
            self._save_file(content)

@@ -324,13 +333,6 @@ class Task(BaseModel):

        return copied_task

-    def _create_converter(self, *args, **kwargs) -> Converter:
-        """Create a converter instance."""
-        converter = self.agent.get_output_converter(*args, **kwargs)
-        if self.converter_cls:
-            converter = self.converter_cls(*args, **kwargs)
-        return converter
-
    def _export_output(
        self, result: str
    ) -> Tuple[Optional[BaseModel], Optional[Dict[str, Any]]]:
@@ -338,75 +340,26 @@ class Task(BaseModel):
        json_output: Optional[Dict[str, Any]] = None

        if self.output_pydantic or self.output_json:
-            model_output = self._convert_to_model(result)
-            pydantic_output = (
-                model_output if isinstance(model_output, BaseModel) else None
+            model_output = convert_to_model(
+                result,
+                self.output_pydantic,
+                self.output_json,
+                self.agent,
+                self.converter_cls,
            )
-            if isinstance(model_output, str):
+
+            if isinstance(model_output, BaseModel):
+                pydantic_output = model_output
+            elif isinstance(model_output, dict):
+                json_output = model_output
+            elif isinstance(model_output, str):
                try:
                    json_output = json.loads(model_output)
                except json.JSONDecodeError:
                    json_output = None
-            else:
-                json_output = model_output if isinstance(model_output, dict) else None

        return pydantic_output, json_output

-    def _convert_to_model(self, result: str) -> Union[dict, BaseModel, str]:
-        model = self.output_pydantic or self.output_json
-        if model is None:
-            return result
-
-        try:
-            return self._validate_model(result, model)
-        except Exception:
-            return self._handle_partial_json(result, model)
-
-    def _validate_model(
-        self, result: str, model: Type[BaseModel]
-    ) -> Union[dict, BaseModel]:
-        exported_result = model.model_validate_json(result)
-        if self.output_json:
-            return exported_result.model_dump()
-        return exported_result
-
-    def _handle_partial_json(
-        self, result: str, model: Type[BaseModel]
-    ) -> Union[dict, BaseModel, str]:
-        match = re.search(r"({.*})", result, re.DOTALL)
-        if match:
-            try:
-                exported_result = model.model_validate_json(match.group(0))
-                if self.output_json:
-                    return exported_result.model_dump()
-                return exported_result
-            except Exception:
-                pass
-
-        return self._convert_with_instructions(result, model)
-
-    def _convert_with_instructions(
-        self, result: str, model: Type[BaseModel]
-    ) -> Union[dict, BaseModel, str]:
-        llm = self.agent.function_calling_llm or self.agent.llm  # type: ignore # Item "None" of "BaseAgent | None" has no attribute "function_calling_llm"
-        instructions = self._get_conversion_instructions(model, llm)
-
-        converter = self._create_converter(
-            llm=llm, text=result, model=model, instructions=instructions
-        )
-        exported_result = (
-            converter.to_pydantic() if self.output_pydantic else converter.to_json()
-        )
-
-        if isinstance(exported_result, ConverterError):
-            Printer().print(
-                content=f"{exported_result.message} Using raw output instead.",
-                color="red",
-            )
-            return result
-
-        return exported_result
-
    def _get_output_format(self) -> OutputFormat:
        if self.output_json:
            return OutputFormat.JSON
@@ -414,34 +367,22 @@ class Task(BaseModel):
            return OutputFormat.PYDANTIC
        return OutputFormat.RAW

-    def _get_conversion_instructions(self, model: Type[BaseModel], llm: Any) -> str:
-        instructions = "I'm gonna convert this raw text into valid JSON."
-        if not self._is_gpt(llm):
-            model_schema = PydanticSchemaParser(model=model).get_schema()
-            instructions = f"{instructions}\n\nThe json should have the following structure, with the following keys:\n{model_schema}"
-        return instructions
-
-    def _save_output(self, content: str) -> None:
-        if not self.output_file:
-            raise Exception("Output file path is not set.")
-
-        directory = os.path.dirname(self.output_file)
-        if directory and not os.path.exists(directory):
-            os.makedirs(directory)
-        with open(self.output_file, "w", encoding="utf-8") as file:
-            file.write(content)
-
-    def _is_gpt(self, llm) -> bool:
-        return isinstance(llm, ChatOpenAI) and llm.openai_api_base is None
-
    def _save_file(self, result: Any) -> None:
+        if self.output_file is None:
+            raise ValueError("output_file is not set.")
+
        directory = os.path.dirname(self.output_file)  # type: ignore # Value of type variable "AnyOrLiteralStr" of "dirname" cannot be "str | None"

        if directory and not os.path.exists(directory):
            os.makedirs(directory)

-        with open(self.output_file, "w", encoding="utf-8") as file:  # type: ignore # Argument 1 to "open" has incompatible type "str | None"; expected "int | str | bytes | PathLike[str] | PathLike[bytes]"
-            file.write(result)
+        with open(self.output_file, "w", encoding="utf-8") as file:
+            if isinstance(result, dict):
+                import json
+
+                json.dump(result, file, ensure_ascii=False, indent=2)
+            else:
+                file.write(str(result))
        return None

    def __repr__(self):
--- a/src/crewai/telemetry/telemetry.py
+++ b/src/crewai/telemetry/telemetry.py
@@ -40,7 +40,7 @@ class Telemetry:
    - Roles of agents in a crew
    - Tools names available

-    Users can opt-in to sharing more complete data suing the `share_crew`
+    Users can opt-in to sharing more complete data using the `share_crew`
    attribute in the Crew class.
    """

--- a/src/crewai/tools/tool_usage.py
+++ b/src/crewai/tools/tool_usage.py
@@ -86,7 +86,8 @@ class ToolUsage:
    ) -> str:
        if isinstance(calling, ToolUsageErrorException):
            error = calling.message
-            self._printer.print(content=f"\n\n{error}\n", color="red")
+            if self.agent.verbose:
+                self._printer.print(content=f"\n\n{error}\n", color="red")
            self.task.increment_tools_errors()
            return error

@@ -96,7 +97,8 @@ class ToolUsage:
        except Exception as e:
            error = getattr(e, "message", str(e))
            self.task.increment_tools_errors()
-            self._printer.print(content=f"\n\n{error}\n", color="red")
+            if self.agent.verbose:
+                self._printer.print(content=f"\n\n{error}\n", color="red")
            return error
        return f"{self._use(tool_string=tool_string, tool=tool, calling=calling)}"  # type: ignore # BUG?: "_use" of "ToolUsage" does not return a value (it only ever returns None)

@@ -112,7 +114,8 @@ class ToolUsage:
                result = self._i18n.errors("task_repeated_usage").format(
                    tool_names=self.tools_names
                )
-                self._printer.print(content=f"\n\n{result}\n", color="purple")
+                if self.agent.verbose:
+                    self._printer.print(content=f"\n\n{result}\n", color="purple")
                self._telemetry.tool_repeated_usage(
                    llm=self.function_calling_llm,
                    tool_name=tool.name,
@@ -168,7 +171,10 @@ class ToolUsage:
                        f'\n{error_message}.\nMoving on then. {self._i18n.slice("format").format(tool_names=self.tools_names)}'
                    ).message
                    self.task.increment_tools_errors()
-                    self._printer.print(content=f"\n\n{error_message}\n", color="red")
+                    if self.agent.verbose:
+                        self._printer.print(
+                            content=f"\n\n{error_message}\n", color="red"
+                        )
                    return error  # type: ignore # No return value expected

                self.task.increment_tools_errors()
@@ -192,7 +198,8 @@ class ToolUsage:
                    calling=calling, output=result, should_cache=should_cache
                )

-        self._printer.print(content=f"\n\n{result}\n", color="purple")
+        if self.agent.verbose:
+            self._printer.print(content=f"\n\n{result}\n", color="purple")
        if agentops:
            agentops.record(tool_event)
        self._telemetry.tool_usage(
@@ -346,7 +353,8 @@ class ToolUsage:
            if self._run_attempts > self._max_parsing_attempts:
                self._telemetry.tool_usage_error(llm=self.function_calling_llm)
                self.task.increment_tools_errors()
-                self._printer.print(content=f"\n\n{e}\n", color="red")
+                if self.agent.verbose:
+                    self._printer.print(content=f"\n\n{e}\n", color="red")
                return ToolUsageErrorException(  # type: ignore # Incompatible return value type (got "ToolUsageErrorException", expected "ToolCalling | InstructorToolCalling")
                    f'{self._i18n.errors("tool_usage_error").format(error=e)}\nMoving on then. {self._i18n.slice("format").format(tool_names=self.tools_names)}'
                )
--- a/src/crewai/types/__init__.py
+++ b/src/crewai/types/__init__.py
--- a/src/crewai/types/usage_metrics.py
+++ b/src/crewai/types/usage_metrics.py
@@ -0,0 +1,36 @@
+from pydantic import BaseModel, Field
+
+
+class UsageMetrics(BaseModel):
+    """
+    Model to track usage metrics for the crew's execution.
+
+    Attributes:
+        total_tokens: Total number of tokens used.
+        prompt_tokens: Number of tokens used in prompts.
+        completion_tokens: Number of tokens used in completions.
+        successful_requests: Number of successful requests made.
+    """
+
+    total_tokens: int = Field(default=0, description="Total number of tokens used.")
+    prompt_tokens: int = Field(
+        default=0, description="Number of tokens used in prompts."
+    )
+    completion_tokens: int = Field(
+        default=0, description="Number of tokens used in completions."
+    )
+    successful_requests: int = Field(
+        default=0, description="Number of successful requests made."
+    )
+
+    def add_usage_metrics(self, usage_metrics: "UsageMetrics"):
+        """
+        Add the usage metrics from another UsageMetrics object.
+
+        Args:
+            usage_metrics (UsageMetrics): The usage metrics to add.
+        """
+        self.total_tokens += usage_metrics.total_tokens
+        self.prompt_tokens += usage_metrics.prompt_tokens
+        self.completion_tokens += usage_metrics.completion_tokens
+        self.successful_requests += usage_metrics.successful_requests
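Review note: `add_usage_metrics` mutates the receiver in place, so totals across agents can be built by folding into a fresh instance. A minimal sketch of that pattern follows. `UsageMetricsSketch` and `total_usage` are hypothetical names, and a dataclass stands in for the pydantic model so the snippet runs without dependencies.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class UsageMetricsSketch:
    """Dataclass stand-in (assumption) for the pydantic UsageMetrics model."""

    total_tokens: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0
    successful_requests: int = 0

    def add_usage_metrics(self, other: "UsageMetricsSketch") -> None:
        # Field-by-field in-place sum, mirroring the method in the diff.
        self.total_tokens += other.total_tokens
        self.prompt_tokens += other.prompt_tokens
        self.completion_tokens += other.completion_tokens
        self.successful_requests += other.successful_requests


def total_usage(per_agent: Iterable[UsageMetricsSketch]) -> UsageMetricsSketch:
    """Fold per-agent metrics into one fresh total (hypothetical helper)."""
    total = UsageMetricsSketch()
    for metrics in per_agent:
        total.add_usage_metrics(metrics)
    return total
```

Starting from a fresh instance avoids accidentally accumulating into one of the per-agent objects.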
--- a/src/crewai/utilities/converter.py
+++ b/src/crewai/utilities/converter.py
@@ -1,9 +1,14 @@
 import json
+import re
+from typing import Any, Optional, Type, Union

 from langchain.schema import HumanMessage, SystemMessage
 from langchain_openai import ChatOpenAI
+from pydantic import BaseModel, ValidationError

 from crewai.agents.agent_builder.utilities.base_output_converter import OutputConverter
+from crewai.utilities.printer import Printer
+from crewai.utilities.pydantic_schema_parser import PydanticSchemaParser


 class ConverterError(Exception):
@@ -72,3 +77,153 @@ class Converter(OutputConverter):
    def is_gpt(self) -> bool:
        """Return if llm provided is of gpt from openai."""
        return isinstance(self.llm, ChatOpenAI) and self.llm.openai_api_base is None
+
+
+def convert_to_model(
+    result: str,
+    output_pydantic: Optional[Type[BaseModel]],
+    output_json: Optional[Type[BaseModel]],
+    agent: Any,
+    converter_cls: Optional[Type[Converter]] = None,
+) -> Union[dict, BaseModel, str]:
+    model = output_pydantic or output_json
+    if model is None:
+        return result
+
+    try:
+        escaped_result = json.dumps(json.loads(result, strict=False))
+        return validate_model(escaped_result, model, bool(output_json))
+    except json.JSONDecodeError as e:
+        Printer().print(
+            content=f"Error parsing JSON: {e}. Attempting to handle partial JSON.",
+            color="yellow",
+        )
+        return handle_partial_json(
+            result, model, bool(output_json), agent, converter_cls
+        )
+    except ValidationError as e:
+        Printer().print(
+            content=f"Pydantic validation error: {e}. Attempting to handle partial JSON.",
+            color="yellow",
+        )
+        return handle_partial_json(
+            result, model, bool(output_json), agent, converter_cls
+        )
+    except Exception as e:
+        Printer().print(
+            content=f"Unexpected error during model conversion: {type(e).__name__}: {e}. Returning original result.",
+            color="red",
+        )
+        return result
+
+
+def validate_model(
+    result: str, model: Type[BaseModel], is_json_output: bool
+) -> Union[dict, BaseModel]:
+    exported_result = model.model_validate_json(result)
+    if is_json_output:
+        return exported_result.model_dump()
+    return exported_result
+
+
+def handle_partial_json(
+    result: str,
+    model: Type[BaseModel],
+    is_json_output: bool,
+    agent: Any,
+    converter_cls: Optional[Type[Converter]] = None,
+) -> Union[dict, BaseModel, str]:
+    match = re.search(r"({.*})", result, re.DOTALL)
+    if match:
+        try:
+            exported_result = model.model_validate_json(match.group(0))
+            if is_json_output:
+                return exported_result.model_dump()
+            return exported_result
+        except json.JSONDecodeError as e:
+            Printer().print(
+                content=f"Error parsing JSON: {e}. The extracted JSON-like string is not valid JSON. Attempting alternative conversion method.",
+                color="yellow",
+            )
+        except ValidationError as e:
+            Printer().print(
+                content=f"Pydantic validation error: {e}. The JSON structure doesn't match the expected model. Attempting alternative conversion method.",
+                color="yellow",
+            )
+        except Exception as e:
+            Printer().print(
+                content=f"Unexpected error during partial JSON handling: {type(e).__name__}: {e}. Attempting alternative conversion method.",
+                color="red",
+            )
+
+    return convert_with_instructions(
+        result, model, is_json_output, agent, converter_cls
+    )
+
+
+def convert_with_instructions(
+    result: str,
+    model: Type[BaseModel],
+    is_json_output: bool,
+    agent: Any,
+    converter_cls: Optional[Type[Converter]] = None,
+) -> Union[dict, BaseModel, str]:
+    llm = agent.function_calling_llm or agent.llm
+    instructions = get_conversion_instructions(model, llm)
+
+    converter = create_converter(
+        agent=agent,
+        converter_cls=converter_cls,
+        llm=llm,
+        text=result,
+        model=model,
+        instructions=instructions,
+    )
+    exported_result = (
+        converter.to_pydantic() if not is_json_output else converter.to_json()
+    )
+
+    if isinstance(exported_result, ConverterError):
+        Printer().print(
+            content=f"{exported_result.message} Using raw output instead.",
+            color="red",
+        )
+        return result
+
+    return exported_result
+
+
+def get_conversion_instructions(model: Type[BaseModel], llm: Any) -> str:
+    instructions = "I'm gonna convert this raw text into valid JSON."
+    if not is_gpt(llm):
+        model_schema = PydanticSchemaParser(model=model).get_schema()
+        instructions = f"{instructions}\n\nThe json should have the following structure, with the following keys:\n{model_schema}"
+    return instructions
+
+
+def is_gpt(llm: Any) -> bool:
+    from langchain_openai import ChatOpenAI
+
+    return isinstance(llm, ChatOpenAI) and llm.openai_api_base is None
+
+
+def create_converter(
+    agent: Optional[Any] = None,
+    converter_cls: Optional[Type[Converter]] = None,
+    *args,
+    **kwargs,
+) -> Converter:
+    if agent and not converter_cls:
+        if hasattr(agent, "get_output_converter"):
+            converter = agent.get_output_converter(*args, **kwargs)
+        else:
+            raise AttributeError("Agent does not have a 'get_output_converter' method")
+    elif converter_cls:
+        converter = converter_cls(*args, **kwargs)
+    else:
+        raise ValueError("Either agent or converter_cls must be provided")
+
+    if not converter:
+        raise Exception("No output converter found or set.")
+
+    return converter
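Review note: the new module-level helpers form a fallback chain: strict parse, then the outermost `{...}` span, then an LLM-driven converter, then the raw string. The sketch below illustrates only the first two steps and the final fallback using the standard library. `parse_with_fallback` is a hypothetical name; model validation and the LLM step are deliberately left out.

```python
import json
import re
from typing import Any


def parse_with_fallback(result: str) -> Any:
    """Try a direct parse, then the outermost {...} span, else return raw text."""
    try:
        # strict=False tolerates control characters (e.g. raw newlines)
        # inside JSON strings, as in convert_to_model above.
        return json.loads(result, strict=False)
    except json.JSONDecodeError:
        pass

    # Same pattern as handle_partial_json: greedy match from the first
    # opening brace to the last closing brace, across newlines.
    match = re.search(r"({.*})", result, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0), strict=False)
        except json.JSONDecodeError:
            pass

    # In the diff an LLM-backed converter is attempted at this point;
    # this sketch skips it and falls back to the raw text directly.
    return result
```

Because the pattern is greedy, text containing two separate JSON objects yields one span covering both, which will usually fail to parse and fall through to the next step.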
--- a/src/crewai/utilities/crew_pydantic_output_parser.py
+++ b/src/crewai/utilities/crew_pydantic_output_parser.py
@@ -1,5 +1,5 @@
 import json
-from typing import Any, List, Type, Union
+from typing import Any, List, Type

 import regex
 from langchain.output_parsers import PydanticOutputParser
@@ -7,29 +7,24 @@ from langchain_core.exceptions import OutputParserException
 from langchain_core.outputs import Generation
 from langchain_core.pydantic_v1 import ValidationError
 from pydantic import BaseModel
-from pydantic.v1 import BaseModel as V1BaseModel


 class CrewPydanticOutputParser(PydanticOutputParser):
    """Parses the text into pydantic models"""

-    pydantic_object: Union[Type[BaseModel], Type[V1BaseModel]]
+    pydantic_object: Type[BaseModel]

-    def parse_result(self, result: List[Generation], *, partial: bool = False) -> Any:
+    def parse_result(self, result: List[Generation]) -> Any:
        result[0].text = self._transform_in_valid_json(result[0].text)

        # Treating edge case of function calling llm returning the name instead of tool_name
        json_object = json.loads(result[0].text)
-        json_object["tool_name"] = (
-            json_object["name"]
-            if "tool_name" not in json_object
-            else json_object["tool_name"]
-        )
+        if "tool_name" not in json_object:
+            json_object["tool_name"] = json_object.get("name", "")
        result[0].text = json.dumps(json_object)

-        json_object = super().parse_result(result)
        try:
-            return self.pydantic_object.parse_obj(json_object)
+            return self.pydantic_object.model_validate(json_object)
        except ValidationError as e:
            name = self.pydantic_object.__name__
            msg = f"Failed to parse {name} from completion {json_object}. Got: {e}"
--- a/src/crewai/utilities/evaluators/crew_evaluator_handler.py
+++ b/src/crewai/utilities/evaluators/crew_evaluator_handler.py
@@ -0,0 +1,163 @@
+from collections import defaultdict
+
+from langchain_openai import ChatOpenAI
+from pydantic import BaseModel, Field
+from rich.console import Console
+from rich.table import Table
+
+from crewai.agent import Agent
+from crewai.task import Task
+from crewai.tasks.task_output import TaskOutput
+
+
+class TaskEvaluationPydanticOutput(BaseModel):
+    quality: float = Field(
+        description="A score from 1 to 10 evaluating on completion, quality, and overall performance from the task_description and task_expected_output to the actual Task Output."
+    )
+
+
+class CrewEvaluator:
+    """
+    A class to evaluate the performance of the agents in the crew based on the tasks they have performed.
+
+    Attributes:
+        crew (Crew): The crew of agents to evaluate.
+        openai_model_name (str): The model to use for evaluating the performance of the agents (for now ONLY OpenAI accepted).
+        tasks_scores (defaultdict): A dictionary to store the scores of the agents for each task.
+        iteration (int): The current iteration of the evaluation.
+    """
+
+    tasks_scores: defaultdict = defaultdict(list)
+    run_execution_times: defaultdict = defaultdict(list)
+    iteration: int = 0
+
+    def __init__(self, crew, openai_model_name: str):
+        self.crew = crew
+        self.openai_model_name = openai_model_name
+        self._setup_for_evaluating()
+
+    def _setup_for_evaluating(self) -> None:
+        """Sets up the crew for evaluating."""
+        for task in self.crew.tasks:
+            task.callback = self.evaluate
+
+    def _evaluator_agent(self):
+        return Agent(
+            role="Task Execution Evaluator",
+            goal=(
+                "Your goal is to evaluate the performance of the agents in the crew based on the tasks they have performed, using a score from 1 to 10 evaluating on completion, quality, and overall performance."
+            ),
+            backstory="Evaluator agent for crew evaluation with precise capabilities to evaluate the performance of the agents in the crew based on the tasks they have performed",
+            verbose=False,
+            llm=ChatOpenAI(model=self.openai_model_name),
+        )
+
+    def _evaluation_task(
+        self, evaluator_agent: Agent, task_to_evaluate: Task, task_output: str
+    ) -> Task:
+        return Task(
+            description=(
+                "Based on the task description and the expected output, compare and evaluate the performance of the agents in the crew based on the Task Output they have performed, using a score from 1 to 10 evaluating on completion, quality, and overall performance. "
+                f"task_description: {task_to_evaluate.description} "
+                f"task_expected_output: {task_to_evaluate.expected_output} "
+                f"agent: {task_to_evaluate.agent.role if task_to_evaluate.agent else None} "
+                f"agent_goal: {task_to_evaluate.agent.goal if task_to_evaluate.agent else None} "
+                f"Task Output: {task_output}"
+            ),
+            expected_output="Evaluation Score from 1 to 10 based on the performance of the agents on the tasks",
+            agent=evaluator_agent,
+            output_pydantic=TaskEvaluationPydanticOutput,
+        )
+
+    def set_iteration(self, iteration: int) -> None:
+        self.iteration = iteration
+
+    def print_crew_evaluation_result(self) -> None:
+        """
+        Prints the evaluation result of the crew in a table.
+        A Crew with 2 tasks using the command crewai test -n 2
+        will output the following table:
+
+                        Task Scores
+                    (1-10 Higher is better)
+            ┏━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
+            ┃ Tasks/Crew ┃ Run 1 ┃ Run 2 ┃ Avg. Total ┃
+            ┡━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
+            │ Task 1     │ 10.0  │ 9.0   │ 9.5        │
+            │ Task 2     │ 9.0   │ 9.0   │ 9.0        │
+            │ Crew       │ 9.5   │ 9.0   │ 9.2        │
+            └────────────┴───────┴───────┴────────────┘
+        """
+        task_averages = [
+            sum(scores) / len(scores) for scores in zip(*self.tasks_scores.values())
+        ]
+        crew_average = sum(task_averages) / len(task_averages)
+
+        # Create a table
+        table = Table(title="Tasks Scores \n (1-10 Higher is better)")
+
+        # Add columns for the table
+        table.add_column("Tasks/Crew")
+        for run in range(1, len(self.tasks_scores) + 1):
+            table.add_column(f"Run {run}")
+        table.add_column("Avg. Total")
+
+        # Add rows for each task
+        for task_index in range(len(task_averages)):
+            task_scores = [
+                self.tasks_scores[run][task_index]
+                for run in range(1, len(self.tasks_scores) + 1)
+            ]
+            avg_score = task_averages[task_index]
+            table.add_row(
+                f"Task {task_index + 1}", *map(str, task_scores), f"{avg_score:.1f}"
+            )
+
+        # Add a row for the crew average
+        crew_scores = [
+            sum(self.tasks_scores[run]) / len(self.tasks_scores[run])
+            for run in range(1, len(self.tasks_scores) + 1)
+        ]
+        table.add_row("Crew", *map(str, crew_scores), f"{crew_average:.1f}")
+
+        run_exec_times = [
+            int(sum(tasks_exec_times))
+            for _, tasks_exec_times in self.run_execution_times.items()
+        ]
+        execution_time_avg = int(sum(run_exec_times) / len(run_exec_times))
+        table.add_row(
+            "Execution Time (s)",
+            *map(str, run_exec_times),
+            f"{execution_time_avg}",
+        )
+        # Display the table in the terminal
+        console = Console()
+        console.print(table)
+
+    def evaluate(self, task_output: TaskOutput):
+        """Evaluates the performance of the agents in the crew based on the tasks they have performed."""
+        current_task = None
+        for task in self.crew.tasks:
+            if task.description == task_output.description:
+                current_task = task
+                break
+
+        if not current_task or not task_output:
+            raise ValueError(
+                "Task to evaluate and task output are required for evaluation"
+            )
+
+        evaluator_agent = self._evaluator_agent()
+        evaluation_task = self._evaluation_task(
+            evaluator_agent, current_task, task_output.raw
+        )
+
+        evaluation_result = evaluation_task.execute_sync()
+
+        if isinstance(evaluation_result.pydantic, TaskEvaluationPydanticOutput):
+            self.tasks_scores[self.iteration].append(evaluation_result.pydantic.quality)
+            self.run_execution_times[self.iteration].append(
+                current_task._execution_time
+            )
+        else:
+            raise ValueError("Evaluation result is not in the expected format")
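Review note: the table arithmetic in `print_crew_evaluation_result` can be checked in isolation. The sketch below reproduces it under the assumption that `tasks_scores` maps a 1-based run number to that run's per-task scores in task order. `summarize_scores` is a hypothetical helper, not part of the change.

```python
from typing import Dict, List, Tuple


def summarize_scores(
    tasks_scores: Dict[int, List[float]],
) -> Tuple[List[float], List[float], float]:
    """Return (per-task averages, per-run crew averages, overall crew average)."""
    # zip(*...) transposes run-major scores into one tuple per task.
    task_averages = [
        sum(scores) / len(scores) for scores in zip(*tasks_scores.values())
    ]
    # One crew score per run: the mean of that run's task scores.
    run_averages = [sum(scores) / len(scores) for scores in tasks_scores.values()]
    crew_average = sum(task_averages) / len(task_averages)
    return task_averages, run_averages, crew_average


# Scores taken from the docstring table: two tasks, two runs.
example = {1: [10.0, 9.0], 2: [9.0, 9.0]}
task_avgs, run_avgs, crew_avg = summarize_scores(example)
```

With the docstring's numbers this gives task averages of 9.5 and 9.0 and run averages of 9.5 and 9.0, which agrees with the sample table. Note that `zip` silently truncates if a run recorded fewer scores than the others.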
--- a/src/crewai/utilities/evaluators/task_evaluator.py
+++ b/src/crewai/utilities/evaluators/task_evaluator.py
@@ -54,23 +54,23 @@ class TaskEvaluator:
    def __init__(self, original_agent):
        self.llm = original_agent.llm

-    def evaluate(self, task, ouput) -> TaskEvaluation:
+    def evaluate(self, task, output) -> TaskEvaluation:
        evaluation_query = (
            f"Assess the quality of the task completed based on the description, expected output, and actual results.\n\n"
            f"Task Description:\n{task.description}\n\n"
            f"Expected Output:\n{task.expected_output}\n\n"
-            f"Actual Output:\n{ouput}\n\n"
+            f"Actual Output:\n{output}\n\n"
            "Please provide:\n"
            "- Bullet points suggestions to improve future similar tasks\n"
            "- A score from 0 to 10 evaluating on completion, quality, and overall performance"
            "- Entities extracted from the task output, if any, their type, description, and relationships"
        )

-        instructions = "I'm gonna convert this raw text into valid JSON."
+        instructions = "Convert all responses into valid JSON output."

        if not self._is_gpt(self.llm):
            model_schema = PydanticSchemaParser(model=TaskEvaluation).get_schema()
-            instructions = f"{instructions}\n\nThe json should have the following structure, with the following keys:\n{model_schema}"
+            instructions = f"{instructions}\n\nReturn only valid JSON with the following schema:\n```json\n{model_schema}\n```"

        converter = Converter(
            llm=self.llm,
--- a/src/crewai/utilities/parser.py
+++ b/src/crewai/utilities/parser.py
@@ -1,17 +1,28 @@
 import re

-
 class YamlParser:
+    @staticmethod
    def parse(file):
+        """
+        Parses a YAML file, modifies specific patterns, and checks for unsupported 'context' usage.
+        Args:
+            file (file object): The YAML file to parse.
+        Returns:
+            str: The modified content of the YAML file.
+        Raises:
+            ValueError: If 'context:' is not followed by a list (an opening '[').
+        """
        content = file.read()
+
        # Replace single { and } with doubled ones, while leaving already doubled ones intact and the other special characters {# and {%
        modified_content = re.sub(r"(?<!\{){(?!\{)(?!\#)(?!\%)", "{{", content)
-        modified_content = re.sub(
-            r"(?<!\})(?<!\%)(?<!\#)\}(?!})", "}}", modified_content
-        )
+        modified_content = re.sub(r"(?<!\})(?<!\%)(?<!\#)\}(?!})", "}}", modified_content)
+
        # Check for 'context:' not followed by '[' and raise an error
        if re.search(r"context:(?!\s*\[)", modified_content):
            raise ValueError(
-                "Context is currently only supported in code when creating a task. Please use the 'context' key in the task configuration."
+                "Context is currently only supported in code when creating a task. "
+                "Please use the 'context' key in the task configuration."
            )
+
        return modified_content
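Review note: the two substitutions in `YamlParser.parse` are easier to reason about with concrete inputs. The sketch below copies the two patterns verbatim from the diff into a standalone function. `double_braces` is a hypothetical name, and the `context:` check is omitted.

```python
import re


def double_braces(content: str) -> str:
    """Double lone { and }, leaving {{ }}, {% %} and {# #} untouched."""
    # A '{' not preceded by '{' and not followed by '{', '#' or '%'.
    content = re.sub(r"(?<!\{){(?!\{)(?!\#)(?!\%)", "{{", content)
    # A '}' not preceded by '}', '%' or '#' and not followed by '}'.
    content = re.sub(r"(?<!\})(?<!\%)(?<!\#)\}(?!})", "}}", content)
    return content
```

For example, `double_braces("Hello {name}")` yields `Hello {{name}}`, while already-doubled braces and `{% ... %}` blocks pass through unchanged.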
--- a/src/crewai/utilities/planning_handler.py
+++ b/src/crewai/utilities/planning_handler.py
@@ -0,0 +1,76 @@
+from typing import Any, List, Optional
+
+from langchain_openai import ChatOpenAI
+from pydantic import BaseModel
+
+from crewai.agent import Agent
+from crewai.task import Task
+
+
+class PlannerTaskPydanticOutput(BaseModel):
+    list_of_plans_per_task: List[str]
+
+
+class CrewPlanner:
+    def __init__(self, tasks: List[Task], planning_agent_llm: Optional[Any] = None):
+        self.tasks = tasks
+
+        if planning_agent_llm is None:
+            self.planning_agent_llm = ChatOpenAI(model="gpt-4o-mini")
+        else:
+            self.planning_agent_llm = planning_agent_llm
+
+    def _handle_crew_planning(self) -> PlannerTaskPydanticOutput:
+        """Handles the Crew planning by creating detailed step-by-step plans for each task."""
+        planning_agent = self._create_planning_agent()
+        tasks_summary = self._create_tasks_summary()
+
+        planner_task = self._create_planner_task(planning_agent, tasks_summary)
+
+        result = planner_task.execute_sync()
+
+        if isinstance(result.pydantic, PlannerTaskPydanticOutput):
+            return result.pydantic
+
+        raise ValueError("Failed to get the Planning output")
+
+    def _create_planning_agent(self) -> Agent:
+        """Creates the planning agent for the crew planning."""
+        return Agent(
+            role="Task Execution Planner",
+            goal=(
+                "Your goal is to create an extremely detailed, step-by-step plan based on the tasks and tools "
+                "available to each agent so that they can perform the tasks in an exemplary manner"
+            ),
+            backstory="Planner agent for crew planning",
+            llm=self.planning_agent_llm,
+        )
+
+    def _create_planner_task(self, planning_agent: Agent, tasks_summary: str) -> Task:
+        """Creates the planner task using the given agent and tasks summary."""
+        return Task(
+            description=(
+                f"Based on these tasks summary: {tasks_summary} \n Create the most descriptive plan based on the tasks "
+                "descriptions, tools available, and agents' goals for them to execute their goals with perfection."
+            ),
+            expected_output="Step by step plan on how the agents can execute their tasks using the available tools with mastery",
+            agent=planning_agent,
+            output_pydantic=PlannerTaskPydanticOutput,
+        )
+
+    def _create_tasks_summary(self) -> str:
+        """Creates a summary of all tasks."""
+        tasks_summary = []
+        for idx, task in enumerate(self.tasks):
+            tasks_summary.append(
+                f"""
+                Task Number {idx + 1} - {task.description}
+                "task_description": {task.description}
+                "task_expected_output": {task.expected_output}
+                "agent": {task.agent.role if task.agent else "None"}
+                "agent_goal": {task.agent.goal if task.agent else "None"}
+                "task_tools": {task.tools}
+                "agent_tools": {task.agent.tools if task.agent else "None"}
+                """
+            )
+        return " ".join(tasks_summary)
--- a/src/crewai/utilities/pydantic_schema_parser.py
+++ b/src/crewai/utilities/pydantic_schema_parser.py
@@ -16,11 +16,13 @@ class PydanticSchemaParser(BaseModel):
        return self._get_model_schema(self.model)

    def _get_model_schema(self, model, depth=0) -> str:
-        lines = []
+        indent = "    " * depth
+        lines = [f"{indent}{{"]
        for field_name, field in model.model_fields.items():
            field_type_str = self._get_field_type(field, depth + 1)
-            lines.append(f"{' ' * 4 * depth}- {field_name}: {field_type_str}")
-
+            lines.append(f"{indent}    {field_name}: {field_type_str},")
+        lines[-1] = lines[-1].rstrip(",")  # Remove trailing comma from last item
+        lines.append(f"{indent}}}")
        return "\n".join(lines)

    def _get_field_type(self, field, depth) -> str:
@@ -35,6 +37,6 @@ class PydanticSchemaParser(BaseModel):
            else:
                return f"List[{list_item_type.__name__}]"
        elif issubclass(field_type, BaseModel):
-            return f"\n{self._get_model_schema(field_type, depth)}"
+            return self._get_model_schema(field_type, depth)
        else:
            return field_type.__name__
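Note on the hunk above: the schema is now rendered as a brace-delimited block instead of a dash-prefixed list. The following is a minimal standalone sketch of that rendering logic, not the library's code. It assumes stdlib dataclasses in place of Pydantic models, and `render_schema`, `Address`, and `Person` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, fields, is_dataclass


def render_schema(model, depth=0):
    """Render a dataclass as a brace-delimited schema string (illustrative only)."""
    indent = "    " * depth
    lines = [f"{indent}{{"]
    for f in fields(model):
        if is_dataclass(f.type):
            # Nested models recurse one level deeper, mirroring the diff.
            type_str = render_schema(f.type, depth + 1)
        else:
            type_str = f.type.__name__
        lines.append(f"{indent}    {f.name}: {type_str},")
    # Drop the trailing comma from the last entry; a no-op when there are no fields.
    lines[-1] = lines[-1].rstrip(",")
    lines.append(f"{indent}}}")
    return "\n".join(lines)


@dataclass
class Address:  # hypothetical nested model
    city: str


@dataclass
class Person:  # hypothetical top-level model
    name: str
    age: int
    address: Address


print(render_schema(Person))
```

Because the nested call already carries its own indentation, the opening brace of a nested model appears after the field name with extra leading spaces. That looks like a side effect of dropping the leading newline rather than a deliberate layout choice.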
--- a/tests/agent_test.py
+++ b/tests/agent_test.py
@@ -397,7 +397,7 @@ def test_agent_moved_on_after_max_iterations():
    )

    task = Task(
-        description="The final answer is 42. But don't give it yet, instead keep using the `get_final_answer` tool over and over until you're told you can give yout final answer.",
+        description="The final answer is 42. But don't give it yet, instead keep using the `get_final_answer` tool over and over until you're told you can give your final answer.",
        expected_output="The final answer",
    )
    output = agent.execute_task(
@@ -948,7 +948,7 @@ def test_agent_use_trained_data(crew_training_handler):
    crew_training_handler().load.return_value = {
        agent.role: {
            "suggestions": [
-                "The result of the math operatio must be right.",
+                "The result of the math operation must be right.",
                "Result must be better than 1.",
            ]
        }
@@ -958,7 +958,7 @@ def test_agent_use_trained_data(crew_training_handler):

    assert (
        result == "What is 1 + 1?You MUST follow these feedbacks: \n "
-        "The result of the math operatio must be right.\n - Result must be better than 1."
+        "The result of the math operation must be right.\n - Result must be better than 1."
    )
    crew_training_handler.assert_has_calls(
        [mock.call(), mock.call("trained_agents_data.pkl"), mock.call().load()]
--- a/tests/cassettes/test_agent_usage_metrics_are_captured_for_hierarchical_process.yaml
+++ b/tests/cassettes/test_agent_usage_metrics_are_captured_for_hierarchical_process.yaml
--- a/tests/cassettes/test_hierarchical_crew_creation_tasks_with_async_execution.yaml
+++ b/tests/cassettes/test_hierarchical_crew_creation_tasks_with_async_execution.yaml
--- a/tests/cassettes/test_hierarchical_crew_creation_tasks_with_sync_last.yaml
+++ b/tests/cassettes/test_hierarchical_crew_creation_tasks_with_sync_last.yaml
--- a/tests/cassettes/test_manager_agent_delegating_to_all_agents.yaml
+++ b/tests/cassettes/test_manager_agent_delegating_to_all_agents.yaml
--- a/tests/cassettes/test_manager_agent_delegating_to_assigned_task_agent.yaml
+++ b/tests/cassettes/test_manager_agent_delegating_to_assigned_task_agent.yaml
--- a/tests/cassettes/test_replay_setup_context.yaml
+++ b/tests/cassettes/test_replay_setup_context.yaml
@@ -0,0 +1,163 @@
+interactions:
+- request:
+    body: '{"messages": [{"content": "You are test_agent. Test Description\nYour personal
+      goal is: Test GoalTo give my best complete final answer to the task use the
+      exact following format:\n\nThought: I now can give a great answer\nFinal Answer:
+      my best complete final answer to the task.\nYour final answer must be the great
+      and the most complete as possible, it must be outcome described.\n\nI MUST use
+      these formats, my job depends on it!\nCurrent Task: Test Task\n\nThis is the
+      expect criteria for your final answer: Say Hi to John \n you MUST return the
+      actual complete content as the final answer, not a summary.\n\nThis is the context
+      you''re working with:\ncontext raw output\n\nBegin! This is VERY important to
+      you, use the tools available and give your best Final Answer, your job depends
+      on it!\n\nThought:\n", "role": "user"}], "model": "gpt-4o", "logprobs": false,
+      "n": 1, "stop": ["\nObservation"], "stream": true, "temperature": 0.7}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '937'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.36.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.36.0
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.11.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: 'data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"Thought"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":":"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        I"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        now"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        can"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        give"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        a"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        great"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        answer"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"\n"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"Final"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        Answer"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":":"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        Hi"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{"content":"
+        John"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wQ7bzZKcXAmiNgs4nn5Of0EFiM","object":"chat.completion.chunk","created":1721491782,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_400f27fa1f","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
+
+
+        data: [DONE]
+
+
+        '
+    headers:
+      CF-Cache-Status:
+      - DYNAMIC
+      CF-RAY:
+      - 8a643794fe0341e9-EWR
+      Connection:
+      - keep-alive
+      Content-Type:
+      - text/event-stream; charset=utf-8
+      Date:
+      - Sat, 20 Jul 2024 16:09:42 GMT
+      Server:
+      - cloudflare
+      Set-Cookie:
+      - __cf_bm=7kfE3khl2E.6zM44yel5nToHzdtz0QeQ4wkLuGYyqSs-1721491782-1.0.1.1-XUb95eXTriHvSUSCH.TCyAmCGCbPK6L7p_tRTDBon8Fo6ns8TDbDoDGA.wVCFI4MTXSxkqrjD0GpYDj4GBTeSQ;
+        path=/; expires=Sat, 20-Jul-24 16:39:42 GMT; domain=.api.openai.com; HttpOnly;
+        Secure; SameSite=None
+      - _cfuvid=iN41lAEk.DjpRMAtG.K0NEvIN0xB9eS0CUCU2iWmjv4-1721491782137-0.0.1.1-604800000;
+        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      alt-svc:
+      - h3=":443"; ma=86400
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '104'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=15552000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '10000'
+      x-ratelimit-limit-tokens:
+      - '30000000'
+      x-ratelimit-remaining-requests:
+      - '9999'
+      x-ratelimit-remaining-tokens:
+      - '29999791'
+      x-ratelimit-reset-requests:
+      - 6ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_4d90924dd28a0fb48c857f03515f0ca8
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_replay_with_context.yaml
+++ b/tests/cassettes/test_replay_with_context.yaml
@@ -0,0 +1,159 @@
+interactions:
+- request:
+    body: '{"messages": [{"content": "You are test_agent. Test Description\nYour personal
+      goal is: Test GoalTo give my best complete final answer to the task use the
+      exact following format:\n\nThought: I now can give a great answer\nFinal Answer:
+      my best complete final answer to the task.\nYour final answer must be the great
+      and the most complete as possible, it must be outcome described.\n\nI MUST use
+      these formats, my job depends on it!\nCurrent Task: Test Task\n\nThis is the
+      expect criteria for your final answer: Say Hi \n you MUST return the actual
+      complete content as the final answer, not a summary.\n\nThis is the context
+      you''re working with:\ncontext raw output\n\nBegin! This is VERY important to
+      you, use the tools available and give your best Final Answer, your job depends
+      on it!\n\nThought:\n", "role": "user"}], "model": "gpt-4o", "logprobs": false,
+      "n": 1, "stop": ["\nObservation"], "stream": true, "temperature": 0.7}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '929'
+      content-type:
+      - application/json
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.36.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.36.0
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.11.7
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: 'data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"Thought"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":":"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        I"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        now"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        can"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        give"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        a"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        great"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        answer"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"\n"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"Final"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        Answer"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":":"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{"content":"
+        Hi"},"logprobs":null,"finish_reason":null}]}
+
+
+        data: {"id":"chatcmpl-9n6wPAClsh4tUGoLYKLh3VoX1vlAx","object":"chat.completion.chunk","created":1721491781,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_c4e5b6fa31","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
+
+
+        data: [DONE]
+
+
+        '
+    headers:
+      CF-Cache-Status:
+      - DYNAMIC
+      CF-RAY:
+      - 8a643791a80e8c96-EWR
+      Connection:
+      - keep-alive
+      Content-Type:
+      - text/event-stream; charset=utf-8
+      Date:
+      - Sat, 20 Jul 2024 16:09:41 GMT
+      Server:
+      - cloudflare
+      Set-Cookie:
+      - __cf_bm=cam5sECdaTzbttLIOaiuvh9flDIAXp_FLPODnDEOn6k-1721491781-1.0.1.1-hyFl43P7HIWZsGueyWuDeO579sZ41as2mvrM.cQS1E8KSLG2ZZ0DxDGbVvHYRO0eflTUJohgZu6CGltvjQfMtQ;
+        path=/; expires=Sat, 20-Jul-24 16:39:41 GMT; domain=.api.openai.com; HttpOnly;
+        Secure; SameSite=None
+      - _cfuvid=nmlgS.bqXAu0rZ.OlHPfXrIrdnVgrBSW3e0UuU3N5ng-1721491781661-0.0.1.1-604800000;
+        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      alt-svc:
+      - h3=":443"; ma=86400
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '126'
+      openai-version:
+      - '2020-10-01'
+      strict-transport-security:
+      - max-age=15552000; includeSubDomains; preload
+      x-ratelimit-limit-requests:
+      - '10000'
+      x-ratelimit-limit-tokens:
+      - '30000000'
+      x-ratelimit-remaining-requests:
+      - '9999'
+      x-ratelimit-remaining-tokens:
+      - '29999794'
+      x-ratelimit-reset-requests:
+      - 6ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_31484eeb0af939af4e0d9c47441ba2db
+    status:
+      code: 200
+      message: OK
+version: 1
--- a/tests/cassettes/test_sequential_async_task_execution_completion.yaml
+++ b/tests/cassettes/test_sequential_async_task_execution_completion.yaml
--- a/tests/cli/cli_test.py
+++ b/tests/cli/cli_test.py
@@ -3,7 +3,7 @@ from unittest import mock
 import pytest
 from click.testing import CliRunner

-from crewai.cli.cli import train, version
+from crewai.cli.cli import reset_memories, test, train, version


@pytest.fixture
@@ -41,6 +41,82 @@ def test_train_invalid_string_iterations(train_crew, runner):
    )


+@mock.patch("crewai.cli.reset_memories_command.ShortTermMemory")
+@mock.patch("crewai.cli.reset_memories_command.EntityMemory")
+@mock.patch("crewai.cli.reset_memories_command.LongTermMemory")
+@mock.patch("crewai.cli.reset_memories_command.TaskOutputStorageHandler")
+def test_reset_all_memories(
+    MockTaskOutputStorageHandler,
+    MockLongTermMemory,
+    MockEntityMemory,
+    MockShortTermMemory,
+    runner,
+):
+    result = runner.invoke(reset_memories, ["--all"])
+    MockShortTermMemory().reset.assert_called_once()
+    MockEntityMemory().reset.assert_called_once()
+    MockLongTermMemory().reset.assert_called_once()
+    MockTaskOutputStorageHandler().reset.assert_called_once()
+
+    assert result.output == "All memories have been reset.\n"
+
+
+@mock.patch("crewai.cli.reset_memories_command.ShortTermMemory")
+def test_reset_short_term_memories(MockShortTermMemory, runner):
+    result = runner.invoke(reset_memories, ["-s"])
+    MockShortTermMemory().reset.assert_called_once()
+    assert result.output == "Short term memory has been reset.\n"
+
+
+@mock.patch("crewai.cli.reset_memories_command.EntityMemory")
+def test_reset_entity_memories(MockEntityMemory, runner):
+    result = runner.invoke(reset_memories, ["-e"])
+    MockEntityMemory().reset.assert_called_once()
+    assert result.output == "Entity memory has been reset.\n"
+
+
+@mock.patch("crewai.cli.reset_memories_command.LongTermMemory")
+def test_reset_long_term_memories(MockLongTermMemory, runner):
+    result = runner.invoke(reset_memories, ["-l"])
+    MockLongTermMemory().reset.assert_called_once()
+    assert result.output == "Long term memory has been reset.\n"
+
+
+@mock.patch("crewai.cli.reset_memories_command.TaskOutputStorageHandler")
+def test_reset_kickoff_outputs(MockTaskOutputStorageHandler, runner):
+    result = runner.invoke(reset_memories, ["-k"])
+    MockTaskOutputStorageHandler().reset.assert_called_once()
+    assert result.output == "Latest Kickoff outputs stored has been reset.\n"
+
+
+@mock.patch("crewai.cli.reset_memories_command.ShortTermMemory")
+@mock.patch("crewai.cli.reset_memories_command.LongTermMemory")
+def test_reset_multiple_memory_flags(MockLongTermMemory, MockShortTermMemory, runner):
+    result = runner.invoke(
+        reset_memories,
+        [
+            "-s",
+            "-l",
+        ],
+    )
+    MockShortTermMemory().reset.assert_called_once()
+    MockLongTermMemory().reset.assert_called_once()
+    assert (
+        result.output
+        == "Long term memory has been reset.\nShort term memory has been reset.\n"
+    )
+
+
+def test_reset_no_memory_flags(runner):
+    result = runner.invoke(
+        reset_memories,
+    )
+    assert (
+        result.output
+        == "Please specify at least one memory type to reset using the appropriate flags.\n"
+    )
+
+
 def test_version_command(runner):
    result = runner.invoke(version)

@@ -57,3 +133,33 @@ def test_version_command_with_tools(runner):
        "crewai tools version:" in result.output
        or "crewai tools not installed" in result.output
    )
+
+
+@mock.patch("crewai.cli.cli.evaluate_crew")
+def test_test_default_iterations(evaluate_crew, runner):
+    result = runner.invoke(test)
+
+    evaluate_crew.assert_called_once_with(3, "gpt-4o-mini")
+    assert result.exit_code == 0
+    assert "Testing the crew for 3 iterations with model gpt-4o-mini" in result.output
+
+
+@mock.patch("crewai.cli.cli.evaluate_crew")
+def test_test_custom_iterations(evaluate_crew, runner):
+    result = runner.invoke(test, ["--n_iterations", "5", "--model", "gpt-4o"])
+
+    evaluate_crew.assert_called_once_with(5, "gpt-4o")
+    assert result.exit_code == 0
+    assert "Testing the crew for 5 iterations with model gpt-4o" in result.output
+
+
+@mock.patch("crewai.cli.cli.evaluate_crew")
+def test_test_invalid_string_iterations(evaluate_crew, runner):
+    result = runner.invoke(test, ["--n_iterations", "invalid"])
+
+    evaluate_crew.assert_not_called()
+    assert result.exit_code == 2
+    assert (
+        "Usage: test [OPTIONS]\nTry 'test --help' for help.\n\nError: Invalid value for '-n' / '--n_iterations': 'invalid' is not a valid integer.\n"
+        in result.output
+    )
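Note on the tests above: stacked `mock.patch` decorators are applied bottom-up, so the decorator closest to the function supplies the first mock argument. A minimal sketch of that ordering follows, patching two stdlib functions purely for illustration. `check_order` is a hypothetical name and does not appear in the test suite.

```python
import os
from unittest import mock


@mock.patch("os.getcwd")  # outermost decorator: its mock is the LAST positional argument
@mock.patch("os.getpid")  # innermost decorator: its mock is the FIRST positional argument
def check_order(mock_getpid, mock_getcwd):
    mock_getpid.return_value = 1234
    mock_getcwd.return_value = "/tmp/example"
    # Both names are patched for the duration of this call only.
    return os.getpid(), os.getcwd()


print(check_order())
```

Getting the parameter order wrong does not necessarily fail a test that only asserts each mock was called once, which is why mismatches of this kind tend to go unnoticed.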
--- a/tests/cli/test_crew_test.py
+++ b/tests/cli/test_crew_test.py
@@ -0,0 +1,97 @@
+import subprocess
+from unittest import mock
+
+import pytest
+
+from crewai.cli import evaluate_crew
+
+
+@pytest.mark.parametrize(
+    "n_iterations,model",
+    [
+        (1, "gpt-4o"),
+        (5, "gpt-3.5-turbo"),
+        (10, "gpt-4"),
+    ],
+)
+@mock.patch("crewai.cli.evaluate_crew.subprocess.run")
+def test_crew_success(mock_subprocess_run, n_iterations, model):
+    """Test the crew function for successful execution."""
+    mock_subprocess_run.return_value = subprocess.CompletedProcess(
+        args=f"poetry run test {n_iterations} {model}", returncode=0
+    )
+    result = evaluate_crew.evaluate_crew(n_iterations, model)
+
+    mock_subprocess_run.assert_called_once_with(
+        ["poetry", "run", "test", str(n_iterations), model],
+        capture_output=False,
+        text=True,
+        check=True,
+    )
+    assert result is None
+
+
+@mock.patch("crewai.cli.evaluate_crew.click")
+def test_test_crew_zero_iterations(click):
+    evaluate_crew.evaluate_crew(0, "gpt-4o")
+    click.echo.assert_called_once_with(
+        "An unexpected error occurred: The number of iterations must be a positive integer.",
+        err=True,
+    )
+
+
+@mock.patch("crewai.cli.evaluate_crew.click")
+def test_test_crew_negative_iterations(click):
+    evaluate_crew.evaluate_crew(-2, "gpt-4o")
+    click.echo.assert_called_once_with(
+        "An unexpected error occurred: The number of iterations must be a positive integer.",
+        err=True,
+    )
+
+
+@mock.patch("crewai.cli.evaluate_crew.click")
+@mock.patch("crewai.cli.evaluate_crew.subprocess.run")
+def test_test_crew_called_process_error(mock_subprocess_run, click):
+    n_iterations = 5
+    mock_subprocess_run.side_effect = subprocess.CalledProcessError(
+        returncode=1,
+        cmd=["poetry", "run", "test", str(n_iterations), "gpt-4o"],
+        output="Error",
+        stderr="Some error occurred",
+    )
+    evaluate_crew.evaluate_crew(n_iterations, "gpt-4o")
+
+    mock_subprocess_run.assert_called_once_with(
+        ["poetry", "run", "test", "5", "gpt-4o"],
+        capture_output=False,
+        text=True,
+        check=True,
+    )
+    click.echo.assert_has_calls(
+        [
+            mock.call(
+                "An error occurred while testing the crew: Command '['poetry', 'run', 'test', '5', 'gpt-4o']' returned non-zero exit status 1.",
+                err=True,
+            ),
+            mock.call("Error", err=True),
+        ]
+    )
+
+
+@mock.patch("crewai.cli.evaluate_crew.click")
+@mock.patch("crewai.cli.evaluate_crew.subprocess.run")
+def test_test_crew_unexpected_exception(mock_subprocess_run, click):
+    # Arrange
+    n_iterations = 5
+    mock_subprocess_run.side_effect = Exception("Unexpected error")
+    evaluate_crew.evaluate_crew(n_iterations, "gpt-4o")
+
+    mock_subprocess_run.assert_called_once_with(
+        ["poetry", "run", "test", "5", "gpt-4o"],
+        capture_output=False,
+        text=True,
+        check=True,
+    )
+    click.echo.assert_called_once_with(
+        "An unexpected error occurred: Unexpected error", err=True
+    )
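Note on the tests above: taken together they pin down the observable behaviour of `evaluate_crew`: the command it runs, and the messages it emits for invalid iteration counts, failed subprocesses, and unexpected exceptions. The sketch below is a hypothetical reconstruction from those assertions only. It is not the actual implementation, and the `run` and `echo` parameters are assumptions added so the example runs without Click or Poetry installed.

```python
import subprocess
import sys


def evaluate_crew_sketch(n_iterations, model, run=subprocess.run, echo=None):
    """Hypothetical reconstruction of the behaviour the tests assert."""
    if echo is None:
        # Stand-in for click.echo(..., err=True): write to stderr.
        echo = lambda msg: print(msg, file=sys.stderr)
    command = ["poetry", "run", "test", str(n_iterations), model]
    try:
        if n_iterations <= 0:
            raise ValueError("The number of iterations must be a positive integer.")
        run(command, capture_output=False, text=True, check=True)
    except subprocess.CalledProcessError as e:
        echo(f"An error occurred while testing the crew: {e}")
        echo(e.output)
    except Exception as e:
        # The validation error above also lands here, which matches the
        # "An unexpected error occurred: ..." prefix seen in the tests.
        echo(f"An unexpected error occurred: {e}")


# Usage with injected fakes, so no real subprocess is started.
messages = []
evaluate_crew_sketch(0, "gpt-4o", run=lambda *a, **k: None, echo=messages.append)
print(messages)
```

Injecting `run` and `echo` is a design choice of this sketch only; the tests in the diff achieve the same isolation by patching `subprocess.run` and `click` on the module.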
--- a/tests/crew_test.py
+++ b/tests/crew_test.py
@@ -18,6 +18,7 @@ from crewai.task import Task
 from crewai.tasks.conditional_task import ConditionalTask
 from crewai.tasks.output_format import OutputFormat
 from crewai.tasks.task_output import TaskOutput
+from crewai.types.usage_metrics import UsageMetrics
 from crewai.utilities import Logger, RPMController
 from crewai.utilities.task_output_storage_handler import TaskOutputStorageHandler

@@ -68,7 +69,7 @@ def test_crew_config_conditional_requirement():
                    "agent": "Senior Researcher",
                },
                {
-                    "description": "Write a 1 amazing paragraph highlight for each idead that showcases how good an article about this topic could be, check references if necessary or search for more content but make sure it's unique, interesting and well written. Return the list of ideas with their paragraph and your notes.",
+                    "description": "Write a 1 amazing paragraph highlight for each idea that showcases how good an article about this topic could be, check references if necessary or search for more content but make sure it's unique, interesting and well written. Return the list of ideas with their paragraph and your notes.",
                    "expected_output": "A 4 paragraph article about AI.",
                    "agent": "Senior Writer",
                },
@@ -286,7 +287,7 @@ def test_hierarchical_process():
    crew = Crew(
        agents=[researcher, writer],
        process=Process.hierarchical,
-        manager_llm=ChatOpenAI(temperature=0, model="gpt-4"),
+        manager_llm=ChatOpenAI(temperature=0, model="gpt-4o"),
        tasks=[task],
    )

@@ -312,6 +313,82 @@ def test_manager_llm_requirement_for_hierarchical_process():
        )


+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_manager_agent_delegating_to_assigned_task_agent():
+    """
+    Test that the manager agent delegates to the assigned task agent.
+    """
+    from langchain_openai import ChatOpenAI
+
+    task = Task(
+        description="Come up with a list of 5 interesting ideas to explore for an article, then write one amazing paragraph highlight for each idea that showcases how good an article about this topic could be. Return the list of ideas with their paragraph and your notes.",
+        expected_output="5 bullet points with a paragraph for each idea.",
+        agent=researcher,
+    )
+
+    crew = Crew(
+        agents=[researcher, writer],
+        process=Process.hierarchical,
+        manager_llm=ChatOpenAI(temperature=0, model="gpt-4o"),
+        tasks=[task],
+    )
+
+    crew.kickoff()
+
+    # Check if the manager agent has the correct tools
+    assert crew.manager_agent is not None
+    assert crew.manager_agent.tools is not None
+
+    assert len(crew.manager_agent.tools) == 2
+    assert (
+        "Delegate a specific task to one of the following coworkers: Researcher\n"
+        in crew.manager_agent.tools[0].description
+    )
+    assert (
+        "Ask a specific question to one of the following coworkers: Researcher\n"
+        in crew.manager_agent.tools[1].description
+    )
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_manager_agent_delegating_to_all_agents():
+    """
+    Test that the manager agent delegates to all agents when none are specified.
+    """
+    from langchain_openai import ChatOpenAI
+
+    task = Task(
+        description="Come up with a list of 5 interesting ideas to explore for an article, then write one amazing paragraph highlight for each idea that showcases how good an article about this topic could be. Return the list of ideas with their paragraph and your notes.",
+        expected_output="5 bullet points with a paragraph for each idea.",
+    )
+
+    crew = Crew(
+        agents=[researcher, writer],
+        process=Process.hierarchical,
+        manager_llm=ChatOpenAI(temperature=0, model="gpt-4o"),
+        tasks=[task],
+    )
+
+    crew.kickoff()
+
+    assert crew.manager_agent is not None
+    assert crew.manager_agent.tools is not None
+
+    assert len(crew.manager_agent.tools) == 2
+    print(
+        "crew.manager_agent.tools[0].description",
+        crew.manager_agent.tools[0].description,
+    )
+    assert (
+        "Delegate a specific task to one of the following coworkers: Researcher, Senior Writer\n"
+        in crew.manager_agent.tools[0].description
+    )
+    assert (
+        "Ask a specific question to one of the following coworkers: Researcher, Senior Writer\n"
+        in crew.manager_agent.tools[1].description
+    )
+
+
@pytest.mark.vcr(filter_headers=["authorization"])
 def test_crew_with_delegating_agents():
    tasks = [
@@ -521,14 +598,10 @@ def test_crew_kickoff_usage_metrics():
    assert len(results) == len(inputs)
    for result in results:
        # Assert that all required keys are in usage_metrics and their values are not None
-        for key in [
-            "total_tokens",
-            "prompt_tokens",
-            "completion_tokens",
-            "successful_requests",
-        ]:
-            assert key in result.token_usage
-            assert result.token_usage[key] > 0
+        assert result.token_usage.total_tokens > 0
+        assert result.token_usage.prompt_tokens > 0
+        assert result.token_usage.completion_tokens > 0
+        assert result.token_usage.successful_requests > 0


 def test_agents_rpm_is_never_set_if_crew_max_RPM_is_not_set():
@@ -580,7 +653,7 @@ def test_sequential_async_task_execution_completion():

    sequential_result = sequential_crew.kickoff()
    assert sequential_result.raw.startswith(
-        "**The Evolution of Artificial Intelligence: A Journey Through Milestones**"
+        "The history of artificial intelligence (AI) is marked by several pivotal events that have shaped its evolution and impact on various sectors."
    )


@@ -1112,7 +1185,7 @@ def test_task_with_no_arguments():
    )

    task = Task(
-        description="Look at the available data nd give me a sense on the total number of sales.",
+        description="Look at the available data and give me a sense on the total number of sales.",
        expected_output="The total number of sales as an integer",
        agent=researcher,
    )
@@ -1159,7 +1232,7 @@ def test_delegation_is_not_enabled_if_there_are_only_one_agent():
    )

    task = Task(
-        description="Look at the available data nd give me a sense on the total number of sales.",
+        description="Look at the available data and give me a sense on the total number of sales.",
        expected_output="The total number of sales as an integer",
        agent=researcher,
    )
@@ -1235,16 +1308,16 @@ def test_agent_usage_metrics_are_captured_for_hierarchical_process():
    )

    result = crew.kickoff()
-    assert result.raw == '"Howdy!"'
+    assert result.raw == "Howdy!"

    print(crew.usage_metrics)

-    assert crew.usage_metrics == {
-        "total_tokens": 311,
-        "prompt_tokens": 224,
-        "completion_tokens": 87,
-        "successful_requests": 1,
-    }
+    assert crew.usage_metrics == UsageMetrics(
+        total_tokens=219,
+        prompt_tokens=201,
+        completion_tokens=18,
+        successful_requests=1,
+    )


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -1279,28 +1352,66 @@ def test_hierarchical_crew_creation_tasks_with_agents():

@pytest.mark.vcr(filter_headers=["authorization"])
 def test_hierarchical_crew_creation_tasks_with_async_execution():
+    """
+    Agents are not required for tasks in a hierarchical process, but sometimes they are still added.
+    This test makes sure that the manager still delegates the task to the agent even if the agent is passed in the task.
+    """
    from langchain_openai import ChatOpenAI

    task = Task(
-        description="Come up with a list of 5 interesting ideas to explore for an article, then write one amazing paragraph highlight for each idea that showcases how good an article about this topic could be. Return the list of ideas with their paragraph and your notes.",
-        expected_output="5 bullet points with a paragraph for each idea.",
-        async_execution=True,  # should throw an error
+        description="Write one amazing paragraph about AI.",
+        expected_output="A single paragraph with 4 sentences.",
+        agent=writer,
+        async_execution=True,
    )

-    with pytest.raises(pydantic_core._pydantic_core.ValidationError) as exec_info:
-        Crew(
-            tasks=[task],
-            agents=[researcher],
-            process=Process.hierarchical,
-            manager_llm=ChatOpenAI(model="gpt-4o"),
-        )
-
-    assert (
-        exec_info.value.errors()[0]["type"] == "async_execution_in_hierarchical_process"
+    crew = Crew(
+        tasks=[task],
+        agents=[writer, researcher, ceo],
+        process=Process.hierarchical,
+        manager_llm=ChatOpenAI(model="gpt-4o"),
    )
-    assert (
-        "Hierarchical process error: Tasks cannot be flagged with async_execution."
-        in exec_info.value.errors()[0]["msg"]
+
+    crew.kickoff()
+    assert crew.manager_agent is not None
+    assert crew.manager_agent.tools is not None
+    assert crew.manager_agent.tools[0].description.startswith(
+        "Delegate a specific task to one of the following coworkers: Senior Writer\n"
+    )
+
+
+@pytest.mark.vcr(filter_headers=["authorization"])
+def test_hierarchical_crew_creation_tasks_with_sync_last():
+    """
+    Agents are not required for tasks in a hierarchical process, but sometimes they are still added.
+    This test makes sure that the manager still delegates the task to the agent even if the agent is passed in the task.
+    """
+    from langchain_openai import ChatOpenAI
+
+    task = Task(
+        description="Write one amazing paragraph about AI.",
+        expected_output="A single paragraph with 4 sentences.",
+        agent=writer,
+        async_execution=True,
+    )
+    task2 = Task(
+        description="Write one amazing paragraph about AI.",
+        expected_output="A single paragraph with 4 sentences.",
+        async_execution=False,
+    )
+
+    crew = Crew(
+        tasks=[task, task2],
+        agents=[writer, researcher, ceo],
+        process=Process.hierarchical,
+        manager_llm=ChatOpenAI(model="gpt-4o"),
+    )
+
+    crew.kickoff()
+    assert crew.manager_agent is not None
+    assert crew.manager_agent.tools is not None
+    assert crew.manager_agent.tools[0].description.startswith(
+        "Delegate a specific task to one of the following coworkers: Senior Writer, Researcher, CEO\n"
    )


@@ -1484,16 +1595,16 @@ def test_tools_with_custom_caching():

    writer1 = Agent(
        role="Writer",
-        goal="You write lesssons of math for kids.",
-        backstory="You're an expert in writting and you love to teach kids but you know nothing of math.",
+        goal="You write lessons of math for kids.",
+        backstory="You're an expert in writing and you love to teach kids but you know nothing of math.",
        tools=[multiplcation_tool],
        allow_delegation=False,
    )

    writer2 = Agent(
        role="Writer",
-        goal="You write lesssons of math for kids.",
-        backstory="You're an expert in writting and you love to teach kids but you know nothing of math.",
+        goal="You write lessons of math for kids.",
+        backstory="You're an expert in writing and you love to teach kids but you know nothing of math.",
        tools=[multiplcation_tool],
        allow_delegation=False,
    )
@@ -1841,13 +1952,13 @@ def test_replay_feature():
        )

        crew.kickoff()
-        crew.replay_from_task(str(write.id))
+        crew.replay(str(write.id))
        # Ensure context was passed correctly
        assert mock_execute_task.call_count == 3


@pytest.mark.vcr(filter_headers=["authorization"])
-def test_crew_replay_from_task_error():
+def test_crew_replay_error():
    task = Task(
        description="Come up with a list of 5 interesting ideas to explore for an article",
        expected_output="5 bullet points with a paragraph for each idea.",
@@ -1860,7 +1971,7 @@ def test_crew_replay_from_task_error():
    )

    with pytest.raises(TypeError) as e:
-        crew.replay_from_task()  # type: ignore purposefully throwing err
+        crew.replay()  # type: ignore purposefully throwing err
        assert "task_id is required" in str(e)


@@ -1995,14 +2106,14 @@ def test_replay_task_with_context():
        with patch.object(Task, "execute_sync") as mock_replay_task:
            mock_replay_task.return_value = mock_task_output4

-            replayed_output = crew.replay_from_task(str(task4.id))
+            replayed_output = crew.replay(str(task4.id))
            assert replayed_output.raw == "Presentation on AI advancements..."

        db_handler.reset()


@pytest.mark.vcr(filter_headers=["authorization"])
-def test_replay_from_task_with_context():
+def test_replay_with_context():
    agent = Agent(role="test_agent", backstory="Test Description", goal="Test Goal")
    task1 = Task(
        description="Context Task", expected_output="Say Task Output", agent=agent
@@ -2054,7 +2165,7 @@ def test_replay_from_task_with_context():
            },
        ],
    ):
-        crew.replay_from_task(str(task2.id))
+        crew.replay(str(task2.id))

        assert crew.tasks[1].context[0].output.raw == "context raw output"

@@ -2116,7 +2227,7 @@ def test_replay_with_invalid_task_id():
            ValueError,
            match="Task with id bf5b09c9-69bd-4eb8-be12-f9e5bae31c2d not found in the crew's tasks.",
        ):
-            crew.replay_from_task("bf5b09c9-69bd-4eb8-be12-f9e5bae31c2d")
+            crew.replay("bf5b09c9-69bd-4eb8-be12-f9e5bae31c2d")


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -2175,13 +2286,13 @@ def test_replay_interpolates_inputs_properly(mock_interpolate_inputs):
            },
        ],
    ):
-        crew.replay_from_task(str(task2.id))
+        crew.replay(str(task2.id))
        assert crew._inputs == {"name": "John"}
        assert mock_interpolate_inputs.call_count == 2


@pytest.mark.vcr(filter_headers=["authorization"])
-def test_replay_from_task_setup_context():
+def test_replay_setup_context():
    agent = Agent(role="test_agent", backstory="Test Description", goal="Test Goal")
    task1 = Task(description="Context Task", expected_output="Say {name}", agent=agent)
    task2 = Task(
@@ -2230,7 +2341,7 @@ def test_replay_from_task_setup_context():
            },
        ],
    ):
-        crew.replay_from_task(str(task2.id))
+        crew.replay(str(task2.id))

        # Check if the first task's output was set correctly
        assert crew.tasks[0].output is not None
@@ -2423,3 +2534,34 @@ def test_conditional_should_execute():
        assert condition_mock.call_count == 1
        assert condition_mock() is True
        assert mock_execute_sync.call_count == 2
+
+
+@mock.patch("crewai.crew.CrewEvaluator")
+@mock.patch("crewai.crew.Crew.kickoff")
+def test_crew_testing_function(mock_kickoff, crew_evaluator):
+    task = Task(
+        description="Come up with a list of 5 interesting ideas to explore for an article, then write one amazing paragraph highlight for each idea that showcases how good an article about this topic could be. Return the list of ideas with their paragraph and your notes.",
+        expected_output="5 bullet points with a paragraph for each idea.",
+        agent=researcher,
+    )
+
+    crew = Crew(
+        agents=[researcher],
+        tasks=[task],
+    )
+    n_iterations = 2
+    crew.test(n_iterations, openai_model_name="gpt-4o-mini", inputs={"topic": "AI"})
+
+    assert len(mock_kickoff.mock_calls) == n_iterations
+    mock_kickoff.assert_has_calls(
+        [mock.call(inputs={"topic": "AI"}), mock.call(inputs={"topic": "AI"})]
+    )
+
+    crew_evaluator.assert_has_calls(
+        [
+            mock.call(crew, "gpt-4o-mini"),
+            mock.call().set_iteration(1),
+            mock.call().set_iteration(2),
+            mock.call().print_crew_evaluation_result(),
+        ]
+    )
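The `assert_has_calls` expectation in `test_crew_testing_function` relies on how `unittest.mock` records calls made on the return value of a mocked class: they show up on the parent mock as `call().method(...)`. The sketch below illustrates only that recording behavior. `run_evaluation` is a hypothetical stand-in written for this example and is not crewai's `Crew.test` implementation.

```python
from unittest import mock


def run_evaluation(evaluator_cls, crew, model_name, n_iterations):
    """Hypothetical driver: build an evaluator, step through iterations, print."""
    evaluator = evaluator_cls(crew, model_name)
    for iteration in range(1, n_iterations + 1):
        evaluator.set_iteration(iteration)
    evaluator.print_crew_evaluation_result()


# A MagicMock standing in for the evaluator class (an assumption for this sketch).
evaluator_cls = mock.MagicMock()
run_evaluation(evaluator_cls, "crew", "gpt-4o-mini", 2)

# Calls on the instance (the mock's return value) are recorded on the
# parent mock as call().<method>(...), in the order they happened.
evaluator_cls.assert_has_calls(
    [
        mock.call("crew", "gpt-4o-mini"),
        mock.call().set_iteration(1),
        mock.call().set_iteration(2),
        mock.call().print_crew_evaluation_result(),
    ]
)
```

Because `assert_has_calls` checks for a contiguous sub-sequence of `mock_calls`, the expectation also pins down the order of the iterations.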
--- a/tests/memory/short_term_memory_test.py
+++ b/tests/memory/short_term_memory_test.py
@@ -23,10 +23,7 @@ def short_term_memory():
        expected_output="A list of relevant URLs based on the search query.",
        agent=agent,
    )
-    return ShortTermMemory(crew=Crew(
-        agents=[agent],
-        tasks=[task]
-    ))
+    return ShortTermMemory(crew=Crew(agents=[agent], tasks=[task]))


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -38,7 +35,11 @@ def test_save_and_search(short_term_memory):
        agent="test_agent",
        metadata={"task": "test_task"},
    )
-    short_term_memory.save(memory)
+    short_term_memory.save(
+        value=memory.data,
+        metadata=memory.metadata,
+        agent=memory.agent,
+    )

    find = short_term_memory.search("test value", score_threshold=0.01)[0]
    assert find["context"] == memory.data, "Data value mismatch."
--- a/tests/pipeline/test_pipeline.py
+++ b/tests/pipeline/test_pipeline.py
@@ -0,0 +1,468 @@
+import json
+from unittest.mock import MagicMock
+
+import pytest
+from crewai.agent import Agent
+from crewai.crew import Crew
+from crewai.crews.crew_output import CrewOutput
+from crewai.pipeline.pipeline import Pipeline
+from crewai.pipeline.pipeline_kickoff_result import PipelineKickoffResult
+from crewai.process import Process
+from crewai.task import Task
+from crewai.tasks.task_output import TaskOutput
+from crewai.types.usage_metrics import UsageMetrics
+from pydantic import BaseModel, ValidationError
+
+DEFAULT_TOKEN_USAGE = UsageMetrics(
+    total_tokens=100, prompt_tokens=50, completion_tokens=50, successful_requests=3
+)
+
+
+@pytest.fixture
+def mock_crew_factory():
+    def _create_mock_crew(name: str, output_json_dict=None, pydantic_output=None):
+        crew = MagicMock(spec=Crew)
+        task_output = TaskOutput(
+            description="Test task", raw="Task output", agent="Test Agent"
+        )
+        crew_output = CrewOutput(
+            raw="Test output",
+            tasks_output=[task_output],
+            token_usage=DEFAULT_TOKEN_USAGE,
+            json_dict=output_json_dict if output_json_dict else None,
+            pydantic=pydantic_output,
+        )
+
+        async def async_kickoff(inputs=None):
+            print("inputs in async_kickoff", inputs)
+            return crew_output
+
+        crew.kickoff_async.side_effect = async_kickoff
+
+        # Add more attributes that Pipeline might be expecting
+        crew.verbose = False
+        crew.output_log_file = None
+        crew.max_rpm = None
+        crew.memory = False
+        crew.process = Process.sequential
+        crew.config = None
+        crew.cache = True
+        crew.name = name
+
+        # Add non-empty agents and tasks
+        mock_agent = MagicMock(spec=Agent)
+        mock_task = MagicMock(spec=Task)
+        mock_task.agent = mock_agent
+        mock_task.async_execution = False
+        mock_task.context = None
+
+        crew.agents = [mock_agent]
+        crew.tasks = [mock_task]
+
+        return crew
+
+    return _create_mock_crew
+
+
+def test_pipeline_initialization(mock_crew_factory):
+    """
+    Test that a Pipeline is correctly initialized with the given stages.
+    """
+    crew1 = mock_crew_factory(name="Crew 1")
+    crew2 = mock_crew_factory(name="Crew 2")
+
+    pipeline = Pipeline(stages=[crew1, crew2])
+    assert len(pipeline.stages) == 2
+    assert pipeline.stages[0] == crew1
+    assert pipeline.stages[1] == crew2
+
+
+@pytest.mark.asyncio
+async def test_pipeline_with_empty_input(mock_crew_factory):
+    """
+    Ensure the pipeline handles an empty input list correctly.
+    """
+    crew = mock_crew_factory(name="Test Crew")
+    pipeline = Pipeline(stages=[crew])
+
+    input_data = []
+    pipeline_results = await pipeline.kickoff(input_data)
+
+    assert (
+        len(pipeline_results) == 0
+    ), "Pipeline should return empty results for empty input"
+
+
+@pytest.mark.asyncio
+async def test_pipeline_process_streams_single_input(mock_crew_factory):
+    """
+    Test that Pipeline.kickoff() correctly processes a single input
+    and returns the expected PipelineKickoffResult.
+    """
+    crew_name = "Test Crew"
+    mock_crew = mock_crew_factory(name=crew_name)
+    pipeline = Pipeline(stages=[mock_crew])
+    input_data = [{"key": "value"}]
+    pipeline_results = await pipeline.kickoff(input_data)
+
+    mock_crew.kickoff_async.assert_called_once_with(inputs={"key": "value"})
+
+    for pipeline_result in pipeline_results:
+        assert isinstance(pipeline_result, PipelineKickoffResult)
+        assert pipeline_result.raw == "Test output"
+        assert len(pipeline_result.crews_outputs) == 1
+        print("pipeline_result.token_usage", pipeline_result.token_usage)
+        assert pipeline_result.token_usage == {crew_name: DEFAULT_TOKEN_USAGE}
+        assert pipeline_result.trace == [input_data[0], "Test Crew"]
+
+
+@pytest.mark.asyncio
+async def test_pipeline_result_ordering(mock_crew_factory):
+    """
+    Ensure that results are returned in the same order as the inputs, especially with parallel processing.
+    """
+    crew1 = mock_crew_factory(name="Crew 1", output_json_dict={"output": "crew1"})
+    crew2 = mock_crew_factory(name="Crew 2", output_json_dict={"output": "crew2"})
+    crew3 = mock_crew_factory(name="Crew 3", output_json_dict={"output": "crew3"})
+
+    pipeline = Pipeline(
+        stages=[crew1, [crew2, crew3]]
+    )  # Parallel stage to test ordering
+
+    input_data = [{"id": 1}, {"id": 2}, {"id": 3}]
+    pipeline_results = await pipeline.kickoff(input_data)
+
+    assert (
+        len(pipeline_results) == 6
+    ), "Should have 2 results for each input due to the parallel final stage"
+
+    # Group results by their original input id
+    grouped_results = {}
+    for result in pipeline_results:
+        input_id = result.trace[0]["id"]
+        if input_id not in grouped_results:
+            grouped_results[input_id] = []
+        grouped_results[input_id].append(result)
+
+    # Check that we have the correct number of groups and results per group
+    assert len(grouped_results) == 3, "Should have results for each of the 3 inputs"
+    for input_id, results in grouped_results.items():
+        assert (
+            len(results) == 2
+        ), f"Each input should have 2 results, but input {input_id} has {len(results)}"
+
+    # Check the ordering and content of the results
+    for input_id in range(1, 4):
+        group = grouped_results[input_id]
+        assert group[0].trace == [
+            {"id": input_id},
+            "Crew 1",
+            "Crew 2",
+        ], f"Unexpected trace for first result of input {input_id}"
+        assert group[1].trace == [
+            {"id": input_id},
+            "Crew 1",
+            "Crew 3",
+        ], f"Unexpected trace for second result of input {input_id}"
+        assert (
+            group[0].json_dict["output"] == "crew2"
+        ), f"Unexpected output for first result of input {input_id}"
+        assert (
+            group[1].json_dict["output"] == "crew3"
+        ), f"Unexpected output for second result of input {input_id}"
+
+
+class TestPydanticOutput(BaseModel):
+    key: str
+    value: int
+
+
+@pytest.mark.asyncio
+async def test_pipeline_process_streams_single_input_pydantic_output(mock_crew_factory):
+    crew_name = "Test Crew"
+    mock_crew = mock_crew_factory(
+        name=crew_name,
+        output_json_dict=None,
+        pydantic_output=TestPydanticOutput(key="test", value=42),
+    )
+    pipeline = Pipeline(stages=[mock_crew])
+    input_data = [{"key": "value"}]
+    pipeline_results = await pipeline.kickoff(input_data)
+
+    assert len(pipeline_results) == 1
+    pipeline_result = pipeline_results[0]
+
+    print("pipeline_result.trace", pipeline_result.trace)
+
+    assert isinstance(pipeline_result, PipelineKickoffResult)
+    assert pipeline_result.raw == "Test output"
+    assert len(pipeline_result.crews_outputs) == 1
+    assert pipeline_result.token_usage == {crew_name: DEFAULT_TOKEN_USAGE}
+    print("INPUT DATA POST PROCESS", input_data)
+    assert pipeline_result.trace == [input_data[0], "Test Crew"]
+
+    assert isinstance(pipeline_result.pydantic, TestPydanticOutput)
+    assert pipeline_result.pydantic.key == "test"
+    assert pipeline_result.pydantic.value == 42
+    assert pipeline_result.json_dict is None
+
+
+@pytest.mark.asyncio
+async def test_pipeline_preserves_original_input(mock_crew_factory):
+    crew_name = "Test Crew"
+    mock_crew = mock_crew_factory(
+        name=crew_name,
+        output_json_dict={"new_key": "new_value"},
+    )
+    pipeline = Pipeline(stages=[mock_crew])
+
+    # Create a deep copy of the input data to ensure we're not comparing references
+    original_input_data = [{"key": "value", "nested": {"a": 1}}]
+    input_data = json.loads(json.dumps(original_input_data))
+
+    await pipeline.kickoff(input_data)
+
+    # Assert that the original input hasn't been modified
+    assert (
+        input_data == original_input_data
+    ), "The original input data should not be modified"
+
+    # Ensure that even nested structures haven't been modified
+    assert (
+        input_data[0]["nested"] == original_input_data[0]["nested"]
+    ), "Nested structures should not be modified"
+
+    # Verify that adding new keys to the crew output doesn't affect the original input
+    assert (
+        "new_key" not in input_data[0]
+    ), "New keys from crew output should not be added to the original input"
+
+
+@pytest.mark.asyncio
+async def test_pipeline_process_streams_multiple_inputs(mock_crew_factory):
+    """
+    Test that Pipeline.kickoff() correctly processes multiple inputs
+    and returns the expected CrewOutputs.
+    """
+    mock_crew = mock_crew_factory(name="Test Crew")
+    pipeline = Pipeline(stages=[mock_crew])
+    input_data = [{"key1": "value1"}, {"key2": "value2"}]
+    pipeline_results = await pipeline.kickoff(input_data)
+
+    assert mock_crew.kickoff_async.call_count == 2
+    assert len(pipeline_results) == 2
+    for pipeline_result in pipeline_results:
+        print("pipeline_result,", pipeline_result)
+        assert all(
+            isinstance(crew_output, CrewOutput)
+            for crew_output in pipeline_result.crews_outputs
+        )
+
+
+@pytest.mark.asyncio
+async def test_pipeline_with_parallel_stages(mock_crew_factory):
+    """
+    Test that Pipeline correctly handles parallel stages.
+    """
+    crew1 = mock_crew_factory(name="Crew 1")
+    crew2 = mock_crew_factory(name="Crew 2")
+    crew3 = mock_crew_factory(name="Crew 3")
+
+    pipeline = Pipeline(stages=[crew1, [crew2, crew3]])
+    input_data = [{"initial": "data"}]
+
+    pipeline_result = await pipeline.kickoff(input_data)
+
+    crew1.kickoff_async.assert_called_once_with(inputs={"initial": "data"})
+
+    assert len(pipeline_result) == 2
+    pipeline_result_1, pipeline_result_2 = pipeline_result
+
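+    # Each trace starts with the original input, then lists the crews that ran.
+    assert pipeline_result_1.trace == [
+        input_data[0], "Crew 1", "Crew 2",
+    ]
+    # The second result followed the other branch of the parallel stage.
+    assert pipeline_result_2.trace == [
+        input_data[0], "Crew 1", "Crew 3",
+    ]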
+
+    expected_token_usage = {
+        "Crew 1": DEFAULT_TOKEN_USAGE,
+        "Crew 2": DEFAULT_TOKEN_USAGE,
+        "Crew 3": DEFAULT_TOKEN_USAGE,
+    }
+
+    assert pipeline_result_1.token_usage == expected_token_usage
+    assert pipeline_result_2.token_usage == expected_token_usage
+
+
+@pytest.mark.asyncio
+async def test_pipeline_with_parallel_stages_end_in_single_stage(mock_crew_factory):
+    """
+    Test that Pipeline correctly handles parallel stages followed by a single final stage.
+    """
+    crew1 = mock_crew_factory(name="Crew 1")
+    crew2 = mock_crew_factory(name="Crew 2")
+    crew3 = mock_crew_factory(name="Crew 3")
+    crew4 = mock_crew_factory(name="Crew 4")
+
+    pipeline = Pipeline(stages=[crew1, [crew2, crew3], crew4])
+    input_data = [{"initial": "data"}]
+
+    pipeline_result = await pipeline.kickoff(input_data)
+
+    crew1.kickoff_async.assert_called_once_with(inputs={"initial": "data"})
+
+    assert len(pipeline_result) == 1
+    pipeline_result_1 = pipeline_result[0]
+
+    assert pipeline_result_1.trace == [
+        input_data[0],
+        "Crew 1",
+        ["Crew 2", "Crew 3"],
+        "Crew 4",
+    ]
+
+    expected_token_usage = {
+        "Crew 1": DEFAULT_TOKEN_USAGE,
+        "Crew 2": DEFAULT_TOKEN_USAGE,
+        "Crew 3": DEFAULT_TOKEN_USAGE,
+        "Crew 4": DEFAULT_TOKEN_USAGE,
+    }
+
+    assert pipeline_result_1.token_usage == expected_token_usage
+
+
+def test_pipeline_rshift_operator(mock_crew_factory):
+    """
+    Test that the >> operator correctly creates a Pipeline from Crews and lists of Crews.
+    """
+    crew1 = mock_crew_factory(name="Crew 1")
+    crew2 = mock_crew_factory(name="Crew 2")
+    crew3 = mock_crew_factory(name="Crew 3")
+
+    # Test single crew addition
+    pipeline = Pipeline(stages=[]) >> crew1
+    assert len(pipeline.stages) == 1
+    assert pipeline.stages[0] == crew1
+
+    # Test adding a list of crews
+    pipeline = Pipeline(stages=[crew1])
+    pipeline = pipeline >> [crew2, crew3]
+    print("pipeline.stages:", pipeline.stages)
+    assert len(pipeline.stages) == 2
+    assert pipeline.stages[1] == [crew2, crew3]
+
+    # Test error case: trying to shift with non-Crew object
+    with pytest.raises(TypeError):
+        pipeline >> "not a crew"
+
+
+@pytest.mark.asyncio
+async def test_pipeline_parallel_crews_to_parallel_crews(mock_crew_factory):
+    """
+    Test that feeding parallel crews to parallel crews works correctly.
+    """
+    crew1 = mock_crew_factory(name="Crew 1", output_json_dict={"output1": "crew1"})
+    crew2 = mock_crew_factory(name="Crew 2", output_json_dict={"output2": "crew2"})
+    crew3 = mock_crew_factory(name="Crew 3", output_json_dict={"output3": "crew3"})
+    crew4 = mock_crew_factory(name="Crew 4", output_json_dict={"output4": "crew4"})
+
+    pipeline = Pipeline(stages=[[crew1, crew2], [crew3, crew4]])
+
+    input_data = [{"input": "test"}]
+    pipeline_results = await pipeline.kickoff(input_data)
+
+    assert len(pipeline_results) == 2, "Should have 2 results for final parallel stage"
+
+    pipeline_result_1, pipeline_result_2 = pipeline_results
+
+    # Check the outputs
+    assert pipeline_result_1.json_dict == {"output3": "crew3"}
+    assert pipeline_result_2.json_dict == {"output4": "crew4"}
+
+    # Check the traces
+    expected_traces = [
+        [{"input": "test"}, ["Crew 1", "Crew 2"], "Crew 3"],
+        [{"input": "test"}, ["Crew 1", "Crew 2"], "Crew 4"],
+    ]
+
+    for result, expected_trace in zip(pipeline_results, expected_traces):
+        assert result.trace == expected_trace, f"Unexpected trace: {result.trace}"
+
+
+def test_pipeline_double_nesting_not_allowed(mock_crew_factory):
+    """
+    Test that double nesting in pipeline stages is not allowed.
+    """
+    crew1 = mock_crew_factory(name="Crew 1")
+    crew2 = mock_crew_factory(name="Crew 2")
+    crew3 = mock_crew_factory(name="Crew 3")
+    crew4 = mock_crew_factory(name="Crew 4")
+
+    with pytest.raises(ValidationError) as exc_info:
+        Pipeline(stages=[crew1, [[crew2, crew3], crew4]])
+
+    error_msg = str(exc_info.value)
+    print(f"Full error message: {error_msg}")  # For debugging
+    assert (
+        "Double nesting is not allowed in pipeline stages" in error_msg
+    ), f"Unexpected error message: {error_msg}"
+
+
+def test_pipeline_invalid_crew(mock_crew_factory):
+    """
+    Test that non-Crew objects are not allowed in pipeline stages.
+    """
+    crew1 = mock_crew_factory(name="Crew 1")
+    not_a_crew = "This is not a crew"
+
+    with pytest.raises(ValidationError) as exc_info:
+        Pipeline(stages=[crew1, not_a_crew])
+
+    error_msg = str(exc_info.value)
+    print(f"Full error message: {error_msg}")  # For debugging
+    assert (
+        "Expected Crew instance or list of Crews, got <class 'str'>" in error_msg
+    ), f"Unexpected error message: {error_msg}"
+
+
+"""
+TODO: Figure out what the proper output is for a pipeline with multiple stages
+
+Options:
+- Should the final output only include the last stage's output?
+- Should the final output include the accumulation of previous stages' outputs?
+"""
+
+
+@pytest.mark.asyncio
+async def test_pipeline_data_accumulation(mock_crew_factory):
+    crew1 = mock_crew_factory(name="Crew 1", output_json_dict={"key1": "value1"})
+    crew2 = mock_crew_factory(name="Crew 2", output_json_dict={"key2": "value2"})
+
+    pipeline = Pipeline(stages=[crew1, crew2])
+    input_data = [{"initial": "data"}]
+    results = await pipeline.kickoff(input_data)
+
+    # Check that crew1 was called with only the initial input
+    crew1.kickoff_async.assert_called_once_with(inputs={"initial": "data"})
+
+    # Check that crew2 was called with the combined input from the initial data and crew1's output
+    crew2.kickoff_async.assert_called_once_with(
+        inputs={"initial": "data", "key1": "value1"}
+    )
+
+    # Check the final output
+    assert len(results) == 1
+    final_result = results[0]
+    assert final_result.json_dict == {"key2": "value2"}
+
+    # Check that the trace includes all stages
+    assert final_result.trace == [{"initial": "data"}, "Crew 1", "Crew 2"]
+
+    # Check that crews_outputs contain the correct information
+    assert len(final_result.crews_outputs) == 2
+    assert final_result.crews_outputs[0].json_dict == {"key1": "value1"}
+    assert final_result.crews_outputs[1].json_dict == {"key2": "value2"}
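The behavior checked by `test_pipeline_data_accumulation` and `test_pipeline_preserves_original_input` can be sketched without crewai: each stage receives the initial input merged with the outputs of the earlier stages, the trace starts with the original input, and the caller's dictionary is left untouched. `ToyCrew` and `run_sequential` are hypothetical names invented for this illustration. This is a sketch of the observable behavior for sequential stages only, not the `Pipeline` implementation.

```python
import asyncio
import copy
from typing import Any, Dict, List, Tuple


class ToyCrew:
    """Hypothetical stand-in for a crew: records its inputs, returns a fixed dict."""

    def __init__(self, name: str, output: Dict[str, Any]) -> None:
        self.name = name
        self.output = output
        self.received: List[Dict[str, Any]] = []

    async def kickoff_async(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        self.received.append(copy.deepcopy(inputs))
        return dict(self.output)


async def run_sequential(
    stages: List[ToyCrew], initial: Dict[str, Any]
) -> Tuple[List[Any], List[Dict[str, Any]]]:
    current = copy.deepcopy(initial)  # work on a copy, never the caller's dict
    trace: List[Any] = [copy.deepcopy(initial)]
    outputs: List[Dict[str, Any]] = []
    for crew in stages:
        output = await crew.kickoff_async(current)
        outputs.append(output)
        trace.append(crew.name)
        current = {**current, **output}  # accumulate for the next stage
    return trace, outputs


crew1 = ToyCrew("Crew 1", {"key1": "value1"})
crew2 = ToyCrew("Crew 2", {"key2": "value2"})
original = {"initial": "data"}
trace, outputs = asyncio.run(run_sequential([crew1, crew2], original))
print(trace)
print(crew2.received)
```

The final output here is only the last stage's dictionary, which matches the assertion on `final_result.json_dict` in the test; whether that is the desired long-term behavior is the open question raised in the TODO above the test.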
--- a/tests/task_test.py
+++ b/tests/task_test.py
@@ -109,7 +109,7 @@ def test_task_callback():
        task_completed.assert_called_once_with(task.output)


-def test_task_callback_returns_task_ouput():
+def test_task_callback_returns_task_output():
    from crewai.tasks.output_format import OutputFormat

    researcher = Agent(
--- a/tests/utilities/evaluators/test_crew_evaluator_handler.py
+++ b/tests/utilities/evaluators/test_crew_evaluator_handler.py
@@ -0,0 +1,118 @@
+from unittest import mock
+
+import pytest
+
+from crewai.agent import Agent
+from crewai.crew import Crew
+from crewai.task import Task
+from crewai.tasks.task_output import TaskOutput
+from crewai.utilities.evaluators.crew_evaluator_handler import (
+    CrewEvaluator,
+    TaskEvaluationPydanticOutput,
+)
+
+
+class TestCrewEvaluator:
+    @pytest.fixture
+    def crew_planner(self):
+        agent = Agent(role="Agent 1", goal="Goal 1", backstory="Backstory 1")
+        task = Task(
+            description="Task 1",
+            expected_output="Output 1",
+            agent=agent,
+        )
+        crew = Crew(agents=[agent], tasks=[task])
+
+        return CrewEvaluator(crew, openai_model_name="gpt-4o-mini")
+
+    def test_setup_for_evaluating(self, crew_planner):
+        crew_planner._setup_for_evaluating()
+        assert crew_planner.crew.tasks[0].callback == crew_planner.evaluate
+
+    def test_set_iteration(self, crew_planner):
+        crew_planner.set_iteration(1)
+        assert crew_planner.iteration == 1
+
+    def test_evaluator_agent(self, crew_planner):
+        agent = crew_planner._evaluator_agent()
+        assert agent.role == "Task Execution Evaluator"
+        assert (
+            agent.goal
+            == "Your goal is to evaluate the performance of the agents in the crew based on the tasks they have performed using score from 1 to 10 evaluating on completion, quality, and overall performance."
+        )
+        assert (
+            agent.backstory
+            == "Evaluator agent for crew evaluation with precise capabilities to evaluate the performance of the agents in the crew based on the tasks they have performed"
+        )
+        assert agent.verbose is False
+        assert agent.llm.model_name == "gpt-4o-mini"
+
+    def test_evaluation_task(self, crew_planner):
+        evaluator_agent = Agent(
+            role="Evaluator Agent",
+            goal="Evaluate the performance of the agents in the crew",
+            backstory="Master in Evaluation",
+        )
+        task_to_evaluate = Task(
+            description="Task 1",
+            expected_output="Output 1",
+            agent=Agent(role="Agent 1", goal="Goal 1", backstory="Backstory 1"),
+        )
+        task_output = "Task Output 1"
+        task = crew_planner._evaluation_task(
+            evaluator_agent, task_to_evaluate, task_output
+        )
+
+        assert task.description.startswith(
+            "Based on the task description and the expected output, compare and evaluate the performance of the agents in the crew based on the Task Output they have performed using score from 1 to 10 evaluating on completion, quality, and overall performance."
+        )
+
+        assert task.agent == evaluator_agent
+        assert (
+            task.description
+            == "Based on the task description and the expected output, compare and evaluate "
+            "the performance of the agents in the crew based on the Task Output they have "
+            "performed using score from 1 to 10 evaluating on completion, quality, and overall "
+            "performance.task_description: Task 1 task_expected_output: Output 1 "
+            "agent: Agent 1 agent_goal: Goal 1 Task Output: Task Output 1"
+        )
+
+    @mock.patch("crewai.utilities.evaluators.crew_evaluator_handler.Console")
+    @mock.patch("crewai.utilities.evaluators.crew_evaluator_handler.Table")
+    def test_print_crew_evaluation_result(self, table, console, crew_planner):
+        crew_planner.tasks_scores = {
+            1: [10, 9, 8],
+            2: [9, 8, 7],
+        }
+        crew_planner.run_execution_times = {
+            1: [24, 45, 66],
+            2: [55, 33, 67],
+        }
+
+        crew_planner.print_crew_evaluation_result()
+
+        table.assert_has_calls(
+            [
+                mock.call(title="Tasks Scores \n (1-10 Higher is better)"),
+                mock.call().add_column("Tasks/Crew"),
+                mock.call().add_column("Run 1"),
+                mock.call().add_column("Run 2"),
+                mock.call().add_column("Avg. Total"),
+                mock.call().add_row("Task 1", "10", "9", "9.5"),
+                mock.call().add_row("Task 2", "9", "8", "8.5"),
+                mock.call().add_row("Task 3", "8", "7", "7.5"),
+                mock.call().add_row("Crew", "9.0", "8.0", "8.5"),
+                mock.call().add_row("Execution Time (s)", "135", "155", "145"),
+            ]
+        )
+        console.assert_has_calls([mock.call(), mock.call().print(table())])
+
+    def test_evaluate(self, crew_planner):
+        task_output = TaskOutput(
+            description="Task 1", agent=str(crew_planner.crew.agents[0])
+        )
+
+        with mock.patch.object(Task, "execute_sync") as execute:
+            execute().pydantic = TaskEvaluationPydanticOutput(quality=9.5)
+            crew_planner.evaluate(task_output)
+            assert crew_planner.tasks_scores[0] == [9.5]
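Reviewer note: the rows asserted in `test_print_crew_evaluation_result` follow from plain averaging over the mocked data. The sketch below reproduces that arithmetic. It assumes `tasks_scores` and `run_execution_times` map a run number to a list of per-task values, as in the test. The helper name `summarize` and the number formatting are inferred from the expected strings, not taken from the crewAI implementation.

```python
def summarize(tasks_scores, run_execution_times):
    """Hypothetical helper: rebuild the table rows asserted in the test."""
    runs = sorted(tasks_scores)
    n_tasks = len(tasks_scores[runs[0]])
    rows = []

    # One row per task: its score in each run, then the average across runs.
    for i in range(n_tasks):
        per_run = [tasks_scores[r][i] for r in runs]
        avg = sum(per_run) / len(per_run)
        rows.append((f"Task {i + 1}", *(str(s) for s in per_run), f"{avg:.1f}"))

    # Crew row: mean task score within each run, then the mean of those means.
    crew_per_run = [sum(tasks_scores[r]) / n_tasks for r in runs]
    crew_avg = sum(crew_per_run) / len(crew_per_run)
    rows.append(("Crew", *(f"{s:.1f}" for s in crew_per_run), f"{crew_avg:.1f}"))

    # Execution time row: total seconds per run, then the average total.
    time_per_run = [sum(run_execution_times[r]) for r in runs]
    time_avg = sum(time_per_run) / len(time_per_run)
    rows.append(
        ("Execution Time (s)", *(str(t) for t in time_per_run), str(int(time_avg)))
    )
    return rows


rows = summarize(
    {1: [10, 9, 8], 2: [9, 8, 7]},
    {1: [24, 45, 66], 2: [55, 33, 67]},
)
```

With the fixture values, run 1 totals 24 + 45 + 66 = 135 seconds and run 2 totals 55 + 33 + 67 = 155 seconds, which gives the 145 average in the last row.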
--- a/tests/utilities/evaluators/test_task_evaluator.py
+++ b/tests/utilities/evaluators/test_task_evaluator.py
@@ -56,8 +56,7 @@ def test_evaluate_training_data(converter_mock):
                "based on the human feedback\n",
                model=TrainingTaskEvaluation,
                instructions="I'm gonna convert this raw text into valid JSON.\n\nThe json should have the "
-                "following structure, with the following keys:\n- suggestions: List[str]\n- "
-                "quality: float\n- final_summary: str",
+                "following structure, with the following keys:\n{\n    suggestions: List[str],\n    quality: float,\n    final_summary: str\n}",
            ),
            mock.call().to_pydantic(),
        ]
--- a/tests/utilities/test_converter.py
+++ b/tests/utilities/test_converter.py
@@ -0,0 +1,266 @@
+import json
+from unittest.mock import MagicMock, Mock, patch
+
+import pytest
+from crewai.utilities.converter import (
+    Converter,
+    ConverterError,
+    convert_to_model,
+    convert_with_instructions,
+    create_converter,
+    get_conversion_instructions,
+    handle_partial_json,
+    is_gpt,
+    validate_model,
+)
+from pydantic import BaseModel
+
+
+# Sample Pydantic models for testing
+class EmailResponse(BaseModel):
+    previous_message_content: str
+
+
+class EmailResponses(BaseModel):
+    responses: list[EmailResponse]
+
+
+class SimpleModel(BaseModel):
+    name: str
+    age: int
+
+
+class NestedModel(BaseModel):
+    id: int
+    data: SimpleModel
+
+
+# Fixtures
+@pytest.fixture
+def mock_agent():
+    agent = Mock()
+    agent.function_calling_llm = None
+    agent.llm = Mock()
+    return agent
+
+
+# Tests for convert_to_model
+def test_convert_to_model_with_valid_json():
+    result = '{"name": "John", "age": 30}'
+    output = convert_to_model(result, SimpleModel, None, None)
+    assert isinstance(output, SimpleModel)
+    assert output.name == "John"
+    assert output.age == 30
+
+
+def test_convert_to_model_with_invalid_json():
+    result = '{"name": "John", "age": "thirty"}'
+    with patch("crewai.utilities.converter.handle_partial_json") as mock_handle:
+        mock_handle.return_value = "Fallback result"
+        output = convert_to_model(result, SimpleModel, None, None)
+        assert output == "Fallback result"
+
+
+def test_convert_to_model_with_no_model():
+    result = "Plain text"
+    output = convert_to_model(result, None, None, None)
+    assert output == "Plain text"
+
+
+def test_convert_to_model_with_special_characters():
+    json_string_test = """
+    {
+        "responses": [
+            {
+                "previous_message_content": "Hi Tom,\r\n\r\nNiamh has chosen the Mika phonics on"
+            }
+        ]
+    }
+    """
+    output = convert_to_model(json_string_test, EmailResponses, None, None)
+    assert isinstance(output, EmailResponses)
+    assert len(output.responses) == 1
+    assert (
+        output.responses[0].previous_message_content
+        == "Hi Tom,\r\n\r\nNiamh has chosen the Mika phonics on"
+    )
+
+
+def test_convert_to_model_with_escaped_special_characters():
+    json_string_test = json.dumps(
+        {
+            "responses": [
+                {
+                    "previous_message_content": "Hi Tom,\r\n\r\nNiamh has chosen the Mika phonics on"
+                }
+            ]
+        }
+    )
+    output = convert_to_model(json_string_test, EmailResponses, None, None)
+    assert isinstance(output, EmailResponses)
+    assert len(output.responses) == 1
+    assert (
+        output.responses[0].previous_message_content
+        == "Hi Tom,\r\n\r\nNiamh has chosen the Mika phonics on"
+    )
+
+
+def test_convert_to_model_with_multiple_special_characters():
+    json_string_test = """
+    {
+        "responses": [
+            {
+                "previous_message_content": "Line 1\r\nLine 2\tTabbed\nLine 3\r\n\rEscaped newline"
+            }
+        ]
+    }
+    """
+    output = convert_to_model(json_string_test, EmailResponses, None, None)
+    assert isinstance(output, EmailResponses)
+    assert len(output.responses) == 1
+    assert (
+        output.responses[0].previous_message_content
+        == "Line 1\r\nLine 2\tTabbed\nLine 3\r\n\rEscaped newline"
+    )
+
+
+# Tests for validate_model
+def test_validate_model_pydantic_output():
+    result = '{"name": "Alice", "age": 25}'
+    output = validate_model(result, SimpleModel, False)
+    assert isinstance(output, SimpleModel)
+    assert output.name == "Alice"
+    assert output.age == 25
+
+
+def test_validate_model_json_output():
+    result = '{"name": "Bob", "age": 40}'
+    output = validate_model(result, SimpleModel, True)
+    assert isinstance(output, dict)
+    assert output == {"name": "Bob", "age": 40}
+
+
+# Tests for handle_partial_json
+def test_handle_partial_json_with_valid_partial():
+    result = 'Some text {"name": "Charlie", "age": 35} more text'
+    output = handle_partial_json(result, SimpleModel, False, None)
+    assert isinstance(output, SimpleModel)
+    assert output.name == "Charlie"
+    assert output.age == 35
+
+
+def test_handle_partial_json_with_invalid_partial(mock_agent):
+    result = "No valid JSON here"
+    with patch("crewai.utilities.converter.convert_with_instructions") as mock_convert:
+        mock_convert.return_value = "Converted result"
+        output = handle_partial_json(result, SimpleModel, False, mock_agent)
+        assert output == "Converted result"
+
+
+# Tests for convert_with_instructions
+@patch("crewai.utilities.converter.create_converter")
+@patch("crewai.utilities.converter.get_conversion_instructions")
+def test_convert_with_instructions_success(
+    mock_get_instructions, mock_create_converter, mock_agent
+):
+    mock_get_instructions.return_value = "Instructions"
+    mock_converter = Mock()
+    mock_converter.to_pydantic.return_value = SimpleModel(name="David", age=50)
+    mock_create_converter.return_value = mock_converter
+
+    result = "Some text to convert"
+    output = convert_with_instructions(result, SimpleModel, False, mock_agent)
+
+    assert isinstance(output, SimpleModel)
+    assert output.name == "David"
+    assert output.age == 50
+
+
+@patch("crewai.utilities.converter.create_converter")
+@patch("crewai.utilities.converter.get_conversion_instructions")
+def test_convert_with_instructions_failure(
+    mock_get_instructions, mock_create_converter, mock_agent
+):
+    mock_get_instructions.return_value = "Instructions"
+    mock_converter = Mock()
+    mock_converter.to_pydantic.return_value = ConverterError("Conversion failed")
+    mock_create_converter.return_value = mock_converter
+
+    result = "Some text to convert"
+    with patch("crewai.utilities.converter.Printer") as mock_printer:
+        output = convert_with_instructions(result, SimpleModel, False, mock_agent)
+        assert output == result
+        mock_printer.return_value.print.assert_called_once()
+
+
+# Tests for get_conversion_instructions
+def test_get_conversion_instructions_gpt():
+    mock_llm = Mock()
+    mock_llm.openai_api_base = None
+    with patch("crewai.utilities.converter.is_gpt", return_value=True):
+        instructions = get_conversion_instructions(SimpleModel, mock_llm)
+        assert instructions == "I'm gonna convert this raw text into valid JSON."
+
+
+def test_get_conversion_instructions_non_gpt():
+    mock_llm = Mock()
+    with patch("crewai.utilities.converter.is_gpt", return_value=False):
+        with patch("crewai.utilities.converter.PydanticSchemaParser") as mock_parser:
+            mock_parser.return_value.get_schema.return_value = "Sample schema"
+            instructions = get_conversion_instructions(SimpleModel, mock_llm)
+            assert "Sample schema" in instructions
+
+
+# Tests for is_gpt
+def test_is_gpt_true():
+    from langchain_openai import ChatOpenAI
+
+    mock_llm = Mock(spec=ChatOpenAI)
+    mock_llm.openai_api_base = None
+    assert is_gpt(mock_llm) is True
+
+
+def test_is_gpt_false():
+    mock_llm = Mock()
+    assert is_gpt(mock_llm) is False
+
+
+class CustomConverter(Converter):
+    pass
+
+
+def test_create_converter_with_mock_agent():
+    mock_agent = MagicMock()
+    mock_agent.get_output_converter.return_value = MagicMock(spec=Converter)
+
+    converter = create_converter(
+        agent=mock_agent,
+        llm=Mock(),
+        text="Sample",
+        model=SimpleModel,
+        instructions="Convert",
+    )
+
+    assert isinstance(converter, Converter)
+    mock_agent.get_output_converter.assert_called_once()
+
+
+def test_create_converter_with_custom_converter():
+    converter = create_converter(
+        converter_cls=CustomConverter,
+        llm=Mock(),
+        text="Sample",
+        model=SimpleModel,
+        instructions="Convert",
+    )
+
+    assert isinstance(converter, CustomConverter)
+
+
+def test_create_converter_fails_without_agent_or_converter_cls():
+    with pytest.raises(
+        ValueError, match="Either agent or converter_cls must be provided"
+    ):
+        create_converter(
+            llm=Mock(), text="Sample", model=SimpleModel, instructions="Convert"
+        )
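Reviewer note: two behaviours covered by the converter tests can be illustrated with the standard library alone. The first is recovering a JSON object embedded in surrounding text, as in `test_handle_partial_json_with_valid_partial`. The second is accepting raw `\r`, `\n` and `\t` characters inside JSON strings, as in the special-character tests. The sketch below is an assumption about one way to get both and is not the code in `crewai.utilities.converter`. The helper name `extract_json_object` is hypothetical.

```python
import json
import re
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Hypothetical helper: parse the first {...} span found in free text."""
    # Greedy match from the first "{" to the last "}" so nested objects stay intact.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        # strict=False allows raw control characters inside JSON strings,
        # which the default strict parser rejects.
        return json.loads(match.group(), strict=False)
    except json.JSONDecodeError:
        return None


embedded = extract_json_object('Some text {"name": "Charlie", "age": 35} more text')
missing = extract_json_object("No valid JSON here")

# The Python literal below contains real CR/LF characters inside the JSON string.
raw = '{"previous_message_content": "Hi Tom,\r\n\r\nNiamh"}'
lenient = extract_json_object(raw)
```

A caller would still validate the resulting dict against the Pydantic model (for example with `SimpleModel(**embedded)`) and fall back to another strategy when the helper returns `None`.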
--- a/tests/utilities/test_planning_handler.py
+++ b/tests/utilities/test_planning_handler.py
@@ -0,0 +1,106 @@
+from unittest.mock import patch
+
+import pytest
+from langchain_openai import ChatOpenAI
+
+from crewai.agent import Agent
+from crewai.task import Task
+from crewai.tasks.task_output import TaskOutput
+from crewai.utilities.planning_handler import CrewPlanner, PlannerTaskPydanticOutput
+
+
+class TestCrewPlanner:
+    @pytest.fixture
+    def crew_planner(self):
+        tasks = [
+            Task(
+                description="Task 1",
+                expected_output="Output 1",
+                agent=Agent(role="Agent 1", goal="Goal 1", backstory="Backstory 1"),
+            ),
+            Task(
+                description="Task 2",
+                expected_output="Output 2",
+                agent=Agent(role="Agent 2", goal="Goal 2", backstory="Backstory 2"),
+            ),
+            Task(
+                description="Task 3",
+                expected_output="Output 3",
+                agent=Agent(role="Agent 3", goal="Goal 3", backstory="Backstory 3"),
+            ),
+        ]
+        return CrewPlanner(tasks, None)
+
+    @pytest.fixture
+    def crew_planner_different_llm(self):
+        tasks = [
+            Task(
+                description="Task 1",
+                expected_output="Output 1",
+                agent=Agent(role="Agent 1", goal="Goal 1", backstory="Backstory 1"),
+            )
+        ]
+        planning_agent_llm = ChatOpenAI(model="gpt-3.5-turbo")
+        return CrewPlanner(tasks, planning_agent_llm)
+
+    def test_handle_crew_planning(self, crew_planner):
+        with patch.object(Task, "execute_sync") as execute:
+            execute.return_value = TaskOutput(
+                description="Description",
+                agent="agent",
+                pydantic=PlannerTaskPydanticOutput(
+                    list_of_plans_per_task=["Plan 1", "Plan 2", "Plan 3"]
+                ),
+            )
+            result = crew_planner._handle_crew_planning()
+            assert crew_planner.planning_agent_llm.model_name == "gpt-4o-mini"
+            assert isinstance(result, PlannerTaskPydanticOutput)
+            assert len(result.list_of_plans_per_task) == len(crew_planner.tasks)
+            execute.assert_called_once()
+
+    def test_create_planning_agent(self, crew_planner):
+        agent = crew_planner._create_planning_agent()
+        assert isinstance(agent, Agent)
+        assert agent.role == "Task Execution Planner"
+
+    def test_create_planner_task(self, crew_planner):
+        planning_agent = Agent(
+            role="Planning Agent",
+            goal="Plan Step by Step Plan",
+            backstory="Master in Planning",
+        )
+        tasks_summary = "Summary of tasks"
+        task = crew_planner._create_planner_task(planning_agent, tasks_summary)
+
+        assert isinstance(task, Task)
+        assert task.description.startswith("Based on these tasks summary")
+        assert task.agent == planning_agent
+        assert (
+            task.expected_output
+            == "Step by step plan on how the agents can execute their tasks using the available tools with mastery"
+        )
+
+    def test_create_tasks_summary(self, crew_planner):
+        tasks_summary = crew_planner._create_tasks_summary()
+        assert isinstance(tasks_summary, str)
+        assert tasks_summary.startswith("\n                Task Number 1 - Task 1")
+        assert tasks_summary.endswith('"agent_tools": []\n                ')
+
+    def test_handle_crew_planning_different_llm(self, crew_planner_different_llm):
+        with patch.object(Task, "execute_sync") as execute:
+            execute.return_value = TaskOutput(
+                description="Description",
+                agent="agent",
+                pydantic=PlannerTaskPydanticOutput(list_of_plans_per_task=["Plan 1"]),
+            )
+            result = crew_planner_different_llm._handle_crew_planning()
+
+            assert (
+                crew_planner_different_llm.planning_agent_llm.model_name
+                == "gpt-3.5-turbo"
+            )
+            assert isinstance(result, PlannerTaskPydanticOutput)
+            assert len(result.list_of_plans_per_task) == len(
+                crew_planner_different_llm.tasks
+            )
+            execute.assert_called_once()
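Reviewer note: the planner tests stub `Task.execute_sync` with `patch.object` so that no LLM call is made. The self-contained sketch below shows the same pattern on a stand-in class. `Worker` and `run_plan` are hypothetical names used only for illustration. It also shows why setting `execute.return_value` is safer than configuring the mock through `execute()` when a call-count assertion follows: calling the mock during setup is itself recorded as a call.

```python
from unittest.mock import patch


class Worker:
    """Stand-in for a class whose method would hit an external service."""

    def execute_sync(self):
        raise RuntimeError("would call a real LLM")


def run_plan(worker):
    return worker.execute_sync()


def demo_return_value():
    # Configure the stub without calling it, then exercise the code under test.
    with patch.object(Worker, "execute_sync") as execute:
        execute.return_value = ["Plan 1"]
        result = run_plan(Worker())
        execute.assert_called_once()
        return result, execute.call_count


def demo_setup_by_calling():
    # Configuring via execute() records an extra call on the mock.
    with patch.object(Worker, "execute_sync") as execute:
        execute().plans = ["Plan 1"]
        result = run_plan(Worker())
        return result.plans, execute.call_count


first = demo_return_value()
second = demo_setup_by_calling()
```

In the second variant `assert_called_once()` would fail, because the setup call and the call made by `run_plan` are both counted.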
Author	SHA1	Message	Date
Brandon Hancock	06350a74ef	Clean up end of docs	2024-07-31 10:57:09 -04:00
Brandon Hancock	6aab0ebcbc	Remove overly complicated test	2024-07-31 10:12:37 -04:00
Brandon Hancock	41df368156	Merge branch 'main' into feature/procedure_v2	2024-07-31 09:49:48 -04:00
João Moura	c93b85ac53	Preparing for new version	2024-07-30 19:21:18 -04:00
Lorenze Jay	6378f6caec	WIP fixed mypy src types (#1036 )	2024-07-30 10:59:50 -07:00
Brandon Hancock	072044c537	Rename variables based on joaos feedback	2024-07-30 09:54:11 -04:00
Brandon Hancock	e1a03ad97d	Change names	2024-07-30 09:43:36 -04:00
Eduardo Chiarotti	d824db82a3	feat: Add execution time to both task and testing feature (#1031 ) * feat: Add execution time to both task and testing feature * feat: Remove unused functions * feat: change test_crew to evalaute_crew to avoid issues with testing libs * feat: fix tests	2024-07-29 23:17:07 -03:00
Matt Young	de6b597eff	telemetry.py - fix typo in comment. (#1020 )	2024-07-29 23:03:51 -03:00
Deepak Tammali	6111d05219	docs: Fix crewai-tools package name typo in getting-started docs (#1026 )	2024-07-29 23:03:32 -03:00
Monarch Wadia	f83c91d612	Fixed package name typo in pip install command (#1029 ) Changed `pip install crewai-tools` to `pip install crewai-tools`	2024-07-29 23:02:48 -03:00
Mackensie Alvarez	c8f360414e	Update Start-a-New-CrewAI-Project-Template-Method.md (#1030 )	2024-07-29 23:02:18 -03:00
Brandon Hancock	910c8df1a7	Update pipeline to use UsageMetric	2024-07-29 16:00:57 -04:00
Brandon Hancock	b9177f2d04	Merge branch 'brandon/cre-117-create-a-type-for-usage-metrics' into feature/procedure_v2	2024-07-29 15:55:11 -04:00
Brandon Hancock	03eafe1671	Fix 1 type error in pipeline	2024-07-29 15:54:49 -04:00
Brandon Hancock	f2830d9c7a	Drop todo	2024-07-29 15:51:14 -04:00
Brandon Hancock	619806f80d	Fix tests that were checking usage metrics	2024-07-29 15:48:48 -04:00
Brandon Hancock	e3182d135a	WIP. Converting usage metrics from a dict to an object	2024-07-29 15:30:54 -04:00
Brandon Hancock	75c7aaf585	Merge branch 'main' into feature/procedure_v2	2024-07-29 14:27:28 -04:00
Brandon Hancock (bhancock_ai)	fa4393d77e	Add in missing triple quote and execution time to resume agent functionality. (#1025 ) * Add in missing triple quote and execution time to resume agent functionality * Fixing broken kwargs and other issues causing our tests to fail	2024-07-29 14:39:02 -03:00
Brandon Hancock	083949fc23	Merge branch 'main' into feature/procedure_v2	2024-07-29 12:48:53 -04:00
Brandon Hancock	04de7730fa	Add doc strings	2024-07-29 12:44:10 -04:00
Brandon Hancock	de6950046d	Add developer notes to explain what is going on in pipelines.	2024-07-29 12:36:03 -04:00
Brandon Hancock	cb2276dc7d	Add in Eduardo feedback. Still need to add in more commentary describing the design decisions for pipeline	2024-07-29 12:20:18 -04:00
Rip&Tear	25c314befc	Minor fixes and updates (#1019 ) Co-authored-by: theCyberTech <mattrapidb@gmail.com>	2024-07-29 03:24:23 -03:00
Rip&Tear	2fe79e68cd	Small 404 error fixes (#1018 ) * Updated Docs: New Getting started section + content update / addition * fixed indentation issue * Minor updates to fix typos * Fixed up 404 error on latest commit --------- Co-authored-by: theCyberTech <the_t3ch@pm.me> Co-authored-by: theCyberTech <mattrapidb@gmail.com>	2024-07-28 22:01:04 -03:00
Nuraly	37d05a2365	Update Force-Tool-Ouput-as-Result.md (#964 ) I think there is some mistake, because there is no such parameter as force_output_result, and as the code shows, the correct parameter result_as_answer is set during agent creation, not task.	2024-07-28 15:41:56 -03:00
Carine Bruyndoncx	0111d261a4	Update Crews.md - correct result variable to crew_output (#972 )	2024-07-28 15:40:36 -03:00
Taleb	0a23e1dc13	Performed spell check across the rest of code base, and enahnced the yaml paraser code a little (#895 ) * Performed spell check across the entire documentation Thank you once again! * Performed spell check across the most of code base Folders been checked: - agents - cli - memory - project - tasks - telemetry - tools - translations * Trying to add a max_token for the agents, so they limited by number of tokens. * Performed spell check across the rest of code base, and enahnced the yaml paraser code a little * Small change in the main agent doc * Improve _save_file method to handle both dict and str inputs - Add check for dict type input - Use json.dump for dict serialization - Convert non-dict inputs to string - Remove type ignore comments --------- Co-authored-by: João Moura <joaomdmoura@gmail.com>	2024-07-28 15:39:54 -03:00
Henri Wenlin	ef5ff71346	feat: add verbose option for printing in ToolUsage (#990 )	2024-07-28 15:12:10 -03:00
Samuel Mallet	1697b4cacb	Add docs for new parameters to SerperDevTool (#993 )	2024-07-28 15:09:55 -03:00
Taleb	6b4710a8d1	Improve _save_file method to handle both dict and str inputs (#1011 ) - Add check for dict type input - Use json.dump for dict serialization - Convert non-dict inputs to string - Remove type ignore comments	2024-07-28 15:03:18 -03:00
Lennex Zinyando	6f2a8f08ba	Fixes getting started section links (#1016 )	2024-07-28 15:02:41 -03:00
João Moura	4e6abf596d	updating test	2024-07-28 13:23:03 -04:00
Rip&Tear	9018e2ab6a	Docs update (#1008 ) * Updated Docs: New Getting started section + content update / addition * fixed indentation issue * Minor updates to fix typos --------- Co-authored-by: theCyberTech <the_t3ch@pm.me>	2024-07-28 11:55:09 -03:00
ResearchAI	99d023c5f3	Update reset_memories_command.py (#974 )	2024-07-26 14:40:47 -07:00
Brandon Hancock (bhancock_ai)	da7d8256eb	Json Task Output Truncation with Escape Characters (#1009 ) * Fixed special character issue when converting json to models. Added numerous tests to ensure thigns work properly. * Fix linting error and cleaned up tests * Fix customer_converter_cls test failure * Fixed tests. Thank you lorenze for pointing that out. added a few more to ensure converter creation works properly * Address lorenze feedback * Fix linting issues	2024-07-26 17:27:01 -04:00
Brandon Hancock (bhancock_ai)	88bffaa0d0	Merge pull request #1012 from crewAIInc/fix/breaking-test-task-eval fix test due to asserting instructions model_schema change	2024-07-26 16:55:26 -04:00
Lorenze Jay	1159140d9f	fix test due to asserting instructions model_schema change	2024-07-26 13:37:44 -07:00
Lorenze Jay	5ac7050f7a	Patch/non gpt model pydantic output (#1003 ) * patching for non-gpt model * removal of json_object tool name assignment * fixed issue for smaller models due to instructions prompt * fixing for ollama llama3 models * closing brackets * removed not used and fixes	2024-07-26 10:57:56 -07:00
Lorenze Jay	8b513de64c	hierarchical process unblocked for async tasks (#995 ) * WIP: hierarchical unblock for async tasks * added better test * update name change * added more test and crew manager cleanup * remove prints * code cleanup, no need to pass manager	2024-07-26 10:55:51 -07:00
Eduardo Chiarotti	144e6d203f	feat: add ability to set LLM for AgentPLanner on Crew (#1001 ) * feat: add ability to set LLM for AgentPLanner on Crew * feat: fixes issue on instantiating the ChatOpenAI on the crew * docs: add docs for the planning_llm new parameter * docs: change message to ChatOpenAI llm * feat: add tests	2024-07-26 14:24:29 -03:00
Eduardo Chiarotti	2d2154ed65	feat: add crew Testing/Evaluating feature (#998 ) * feat: add crew Testing/evalauting feature * feat: add docs and add unit test * feat: improve testing output table * feat: add tests * feat: fix type checking issue * feat: add raise ValueError when testing if output is not the expected * docs: add docs for Testing * feat: improve tests and fix some issue * feat: back to sync * feat: change opdeai model * feat: fix test	2024-07-26 14:23:51 -03:00
Brandon Hancock	d9e60c8b57	Drop router for now. will add in separately	2024-07-25 14:29:17 -04:00
Brandon Hancock	2119ba7c32	Starting to work on router	2024-07-25 09:33:05 -04:00
Brandon Hancock	b00bc44921	Add back in commentary at top of pipeline file	2024-07-24 11:27:47 -04:00
Brandon Hancock (bhancock_ai)	2d086ab596	Merge pull request #994 from crewAIInc/fix/getting-started-docs fixed bullet points for crew yaml annoations	2024-07-23 14:36:45 -04:00
Lorenze Jay	776c67cc0f	clearer usage for crewai create command	2024-07-23 11:32:25 -07:00
Lorenze Jay	78ef490646	fixed bullet points for crew yaml annoations	2024-07-23 11:31:09 -07:00
Brandon Hancock	6b4ebe16d0	Update Pipeline docs	2024-07-23 11:24:57 -04:00
Brandon Hancock	602ade4cc4	Update pipeline to properly process input and ouput dictionary	2024-07-23 11:12:55 -04:00
Lorenze Jay	4da5cc9778	Feat yaml config all attributes (#985 ) * WIP: yaml proper mapping for agents and agent * WIP: added output_json and output_pydantic setup * WIP: core logic added, need cleanup * code cleanup * updated docs and example template to use yaml to reference agents within tasks * cleanup type errors * Update Start-a-New-CrewAI-Project.md --------- Co-authored-by: João Moura <joaomdmoura@gmail.com>	2024-07-23 00:21:01 -03:00
Brandon Hancock	471c5b970c	Update docs for pipeline	2024-07-22 19:28:32 -04:00
Brandon Hancock	33d9828edc	Implemented additional tests for pipeline. One test is failing. Need team support	2024-07-22 16:35:16 -04:00
Eduardo Chiarotti	6930656897	feat: add crewai test feature (#984 ) * feat: add crewai test feature * fix: remove unused import * feat: update docstirng * fix: tests	2024-07-22 17:21:05 -03:00
Brandon Hancock	e95ef6fca9	Merge branch 'main' into feature/procedure_v2	2024-07-22 09:55:03 -04:00
João Moura	349753a013	prepping new version	2024-07-20 12:26:32 -04:00
Eduardo Chiarotti	f53a3a00e1	fix: planning feature output (#969 ) * fix: planning feature output * fix: add validation for planning result	2024-07-20 11:56:53 -03:00
Brandon Hancock	afd6bff159	Fix pipelineoutput to look more like crewoutput and taskoutput	2024-07-19 15:18:19 -04:00
Brandon Hancock	392490c48b	new pipeline flow with traces and usage metrics working. need to add more tests and make sure PipelineOutput behaves likew CrewOutput	2024-07-19 14:39:31 -04:00
João Moura	e2113fe417	preparing new verions	2024-07-19 13:22:28 -04:00
Eduardo Chiarotti	f9288295e6	fix: agent missing fix (#966 )	2024-07-19 13:15:33 -03:00
João Moura	fcc57f2fc0	rmeoving extra logging	2024-07-19 01:16:15 -04:00
Dev Khant	5cb6ee9eeb	Docs: Update info about tools (#896 )	2024-07-19 01:38:42 -03:00
ariel	b38f0825e7	Fix broken link to the installation guide (#912 ) Updated the installation guide link to use the absolute URL instead of a relative path, ensuring it correctly points to 'https://docs.crewai.com/how-to/Installing-CrewAI/'.	2024-07-19 01:37:54 -03:00
Salman Faroz	f51e94dede	Update Crews.md (#889 ) To solve : I encountered an error while trying to use the tool. This was the error: DuckDuckGoSearchRun._run() got an unexpected keyword argument 'q'. Tool duckduckgo_search accepts these inputs: A wrapper around DuckDuckGo Search. Useful for when you need to answer questions about current events. Input should be a search query. refer : https://github.com/joaomdmoura/crewAI/issues/316	2024-07-19 01:37:24 -03:00
robbyriverside	47bf93d291	Update Memory.md (#728 ) The memory documentation left me with a lot of questions. After I went through the code to find an answer. I added this paragraph to explain what I found. Hope this is helpful.	2024-07-19 01:36:54 -03:00
Brandon Hancock	d094e178f1	Update terminology	2024-07-18 14:59:38 -04:00
Braelyn Boynton	41fd1c6124	upgrade agentops to 0.3 (#957 ) * upgrade agentops to 0.3 * lockfile	2024-07-18 13:30:04 -03:00
Lorenze Jay	be1b9a3994	Reset memory (#958 ) * reseting memory on cli * using storage.reset * deleting memories on command * added tests * handle when no flags are used * added docs	2024-07-18 13:29:42 -03:00
Eduardo Chiarotti	61a196394b	feat: Add planning feature to crew (#919 ) * feat: add planning feature to crew * feat: add test to planning handler and change to execute_async method * docs: add planning parameter to the Core documentation * docs: add planning docs * fix: fix type checking issue * fix: test and logic	2024-07-18 13:15:08 -03:00
Brandon Hancock	834c62feca	Going to start refactoring for pipeline_output	2024-07-18 11:20:26 -04:00
Brandon Hancock	c0c329b6e0	Merge branch 'main' into feature/procedure_v2	2024-07-17 13:37:27 -04:00
Lorenze Jay	5b442e4350	Merge pull request #951 from crewAIInc/test-hierarchical-tools-proper-setup Test hierarchical tools proper setup	2024-07-17 08:53:23 -07:00
Lorenze Jay	c9920b9823	better spacing	2024-07-17 08:40:52 -07:00
Lorenze Jay	2faa2dbddb	code cleanup	2024-07-17 08:39:57 -07:00
Lorenze Jay	76607062f0	using gpt4o	2024-07-17 08:27:43 -07:00
Lorenze Jay	a8cac9b7e9	Merge branch 'main' of github.com:joaomdmoura/crewAI into test-hierarchical-tools-proper-setup	2024-07-17 08:21:13 -07:00
Brandon Hancock (bhancock_ai)	dfacc8832f	Merge pull request #954 from crewAIInc/hotfix/improve-async-logging Fix logging for async and sync tasks	2024-07-17 11:20:13 -04:00
Lorenze Jay	93f643f851	fixed test	2024-07-17 08:20:05 -07:00
Lorenze Jay	6946b89e17	Merge branch 'main' of github.com:joaomdmoura/crewAI into test-hierarchical-tools-proper-setup	2024-07-17 08:16:44 -07:00
Lorenze Jay	021f2eb8a1	Merge branch 'conditional-task-f' of github.com:joaomdmoura/crewAI into test-hierarchical-tools-proper-setup	2024-07-16 20:35:27 -07:00
Lorenze Jay	731de2ff31	Merge branch 'test-hierarchical-tools-proper-setup' of github.com:joaomdmoura/crewAI into test-hierarchical-tools-proper-setup	2024-07-16 20:31:42 -07:00
Lorenze Jay	24e28da203	Merge branch 'conditional-task-f' of github.com:joaomdmoura/crewAI into test-hierarchical-tools-proper-setup	2024-07-16 20:28:50 -07:00
Lorenze Jay	0415b9982b	code cleanup	2024-07-16 20:07:05 -07:00
Lorenze Jay	ee32d36312	Merge branch 'conditional-task-f' of github.com:joaomdmoura/crewAI into test-hierarchical-tools-proper-setup	2024-07-16 16:05:09 -07:00
Lorenze Jay	c66559345f	Merge branch 'conditional-task-f' of github.com:joaomdmoura/crewAI into test-hierarchical-tools-proper-setup	2024-07-16 15:20:46 -07:00
Lorenze Jay	3ad95d50d4	ensures _update_manager_tools has a manager otherwise throw error	2024-07-16 15:15:50 -07:00
Lorenze Jay	e8cbdb7881	fixed hierarchial manager tools when assigned an agent	2024-07-16 14:00:25 -07:00
Brandon Hancock	f737b3b379	WIP	2024-07-16 11:35:24 -04:00
Brandon Hancock	467536b96a	Merge branch 'main' into feature/procedure_v2	2024-07-16 10:34:55 -04:00
Brandon Hancock	1988a00c60	Merge branch 'bugfix/langchain-tool-config-change' into feature/procedure_v2	2024-07-15 11:47:16 -04:00
Brandon Hancock	e2f4405291	Merge branch 'main' into feature/procedure_v2	2024-07-15 09:44:41 -04:00
Brandon Hancock	040e5a78d2	Add back in Gui's tool_usage fix	2024-07-15 09:21:21 -04:00
Brandon Hancock	c5002eedd9	rshift working	2024-07-12 16:48:19 -04:00
Brandon Hancock	f7680d6157	All tests are passing now	2024-07-12 11:00:29 -04:00
Brandon Hancock	adf93c91f7	WIP. Procedure appears to be working well. Working on mocking properly for tests	2024-07-11 15:26:46 -04:00