diff --git a/README.md b/README.md
index 01e08821e..b4782f418 100644
--- a/README.md
+++ b/README.md
@@ -254,7 +254,7 @@ pip install dist/*.tar.gz
 
 CrewAI uses anonymous telemetry to collect usage data with the main purpose of helping us improve the library by focusing our efforts on the most used features, integrations and tools.
 
-There is NO data being collected on the prompts, tasks descriptions agents backstories or goals nor tools usage, no API calls, nor responses nor any data that is being processed by the agents, nor any secrets and env vars.
+It's pivotal to understand that **NO data is collected** concerning prompts, task descriptions, agents' backstories or goals, usage of tools, API calls, responses, any data processed by the agents, or secrets and environment variables, with the exception of the conditions mentioned. When the `share_crew` feature is enabled, detailed data including task descriptions, agents' backstories or goals, and other specific attributes are collected to provide deeper insights while respecting user privacy. We don't offer a way to disable it now, but we will in the future.
 
 Data collected includes:
 
@@ -279,7 +279,7 @@ Data collected includes:
 - Tools names available
   - Understand out of the publically available tools, which ones are being used the most so we can improve them
 
-Users can opt-in sharing the complete telemetry data by setting the `share_crew` attribute to `True` on their Crews.
+Users can opt-in to Further Telemetry, sharing the complete telemetry data by setting the `share_crew` attribute to `True` on their Crews. Enabling `share_crew` results in the collection of detailed crew and task execution data, including `goal`, `backstory`, `context`, and `output` of tasks. This enables a deeper insight into usage patterns while respecting the user's choice to share.
 
 ## License
 
diff --git a/docs/getting-started/Start-a-New-CrewAI-Project-Template-Method.md b/docs/getting-started/Start-a-New-CrewAI-Project-Template-Method.md
index 0e6fcc446..71b0129c3 100644
--- a/docs/getting-started/Start-a-New-CrewAI-Project-Template-Method.md
+++ b/docs/getting-started/Start-a-New-CrewAI-Project-Template-Method.md
@@ -17,7 +17,7 @@ Beforre we start there are a couple of things to note:
 Before getting started with CrewAI, make sure that you have installed it via pip:
 
 ```shell
-$ pip install crewai crewi-tools
+$ pip install crewai crewai-tools
 ```
 
 ### Virtual Environemnts
diff --git a/docs/telemetry/Telemetry.md b/docs/telemetry/Telemetry.md
index a4898825d..63d5f5905 100644
--- a/docs/telemetry/Telemetry.md
+++ b/docs/telemetry/Telemetry.md
@@ -5,7 +5,7 @@ description: Understanding the telemetry data collected by CrewAI and how it con
 
 ## Telemetry
 
-CrewAI utilizes anonymous telemetry to gather usage statistics with the primary goal of enhancing the library. Our focus is on improving and developing the features, integrations, and tools most utilized by our users.
+CrewAI utilizes anonymous telemetry to gather usage statistics with the primary goal of enhancing the library. Our focus is on improving and developing the features, integrations, and tools most utilized by our users. We don't offer a way to disable it now, but we will in the future. It's pivotal to understand that **NO data is collected** concerning prompts, task descriptions, agents' backstories or goals, usage of tools, API calls, responses, any data processed by the agents, or secrets and environment variables, with the exception of the conditions mentioned.
 
 When the `share_crew` feature is enabled, detailed data including task descriptions, agents' backstories or goals, and other specific attributes are collected to provide deeper insights while respecting user privacy.
 
@@ -22,7 +22,7 @@ It's pivotal to understand that **NO data is collected** concerning prompts, tas
 - **Tool Usage**: Identifying which tools are most frequently used allows us to prioritize improvements in those areas.
 
 ### Opt-In Further Telemetry Sharing
-Users can choose to share their complete telemetry data by enabling the `share_crew` attribute to `True` in their crew configurations. This opt-in approach respects user privacy and aligns with data protection standards by ensuring users have control over their data sharing preferences. Enabling `share_crew` results in the collection of detailed crew and task execution data, including `goal`, `backstory`, `context`, and `output` of tasks. This enables a deeper insight into usage patterns while respecting the user's choice to share.
+Users can choose to share their complete telemetry data by enabling the `share_crew` attribute to `True` in their crew configurations. Enabling `share_crew` results in the collection of detailed crew and task execution data, including `goal`, `backstory`, `context`, and `output` of tasks. This enables a deeper insight into usage patterns while respecting the user's choice to share.
 
 ### Updates and Revisions
 We are committed to maintaining the accuracy and transparency of our documentation. Regular reviews and updates are performed to ensure our documentation accurately reflects the latest developments of our codebase and telemetry practices. Users are encouraged to review this section for the most current information on our data collection practices and how they contribute to the improvement of CrewAI.
\ No newline at end of file
diff --git a/pyproject.toml b/pyproject.toml
index 2eaaf84f5..8858953ef 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "crewai"
-version = "0.41.1"
+version = "0.46.0"
 description = "Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks."
 authors = ["Joao Moura "]
 readme = "README.md"
diff --git a/src/crewai/agents/agent_builder/base_agent_executor_mixin.py b/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
index e3c0aa211..83aad27b4 100644
--- a/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
+++ b/src/crewai/agents/agent_builder/base_agent_executor_mixin.py
@@ -3,7 +3,6 @@ from typing import TYPE_CHECKING, Optional
 
 from crewai.memory.entity.entity_memory_item import EntityMemoryItem
 from crewai.memory.long_term.long_term_memory_item import LongTermMemoryItem
-from crewai.memory.short_term.short_term_memory_item import ShortTermMemoryItem
 from crewai.utilities.converter import ConverterError
 from crewai.utilities.evaluators.task_evaluator import TaskEvaluator
 from crewai.utilities import I18N
@@ -39,18 +38,17 @@ class CrewAgentExecutorMixin:
             and "Action: Delegate work to coworker" not in output.log
         ):
             try:
-                memory = ShortTermMemoryItem(
-                    data=output.log,
-                    agent=self.crew_agent.role,
-                    metadata={
-                        "observation": self.task.description,
-                    },
-                )
                 if (
                     hasattr(self.crew, "_short_term_memory")
                     and self.crew._short_term_memory
                 ):
-                    self.crew._short_term_memory.save(memory)
+                    self.crew._short_term_memory.save(
+                        value=output.log,
+                        metadata={
+                            "observation": self.task.description,
+                        },
+                        agent=self.crew_agent.role,
+                    )
             except Exception as e:
                 print(f"Failed to add to short term memory: {e}")
                 pass
diff --git a/src/crewai/cli/cli.py b/src/crewai/cli/cli.py
index c9f03f3fb..52d2bc75c 100644
--- a/src/crewai/cli/cli.py
+++ b/src/crewai/cli/cli.py
@@ -6,9 +6,9 @@ from crewai.memory.storage.kickoff_task_outputs_storage import (
 )
 
 from .create_crew import create_crew
+from .evaluate_crew import evaluate_crew
 from .replay_from_task import replay_task_command
 from .reset_memories_command import reset_memories_command
-from .test_crew import test_crew
 from .train_crew import train_crew
 
 
@@ -144,7 +144,7 @@ def reset_memories(long, short, entities, kickoff_outputs, all):
 def test(n_iterations: int, model: str):
     """Test the crew and evaluate the results."""
     click.echo(f"Testing the crew for {n_iterations} iterations with model {model}")
-    test_crew(n_iterations, model)
+    evaluate_crew(n_iterations, model)
 
 
 if __name__ == "__main__":
diff --git a/src/crewai/cli/test_crew.py b/src/crewai/cli/evaluate_crew.py
similarity index 82%
rename from src/crewai/cli/test_crew.py
rename to src/crewai/cli/evaluate_crew.py
index b95669e55..30abda380 100644
--- a/src/crewai/cli/test_crew.py
+++ b/src/crewai/cli/evaluate_crew.py
@@ -1,13 +1,11 @@
 import subprocess
+
 import click
-import pytest
-
-pytest.skip(allow_module_level=True)
 
 
-def test_crew(n_iterations: int, model: str) -> None:
+def evaluate_crew(n_iterations: int, model: str) -> None:
     """
-    Test the crew by running a command in the Poetry environment.
+    Test and Evaluate the crew by running a command in the Poetry environment.
 
     Args:
         n_iterations (int): The number of iterations to test the crew.
diff --git a/src/crewai/cli/templates/pyproject.toml b/src/crewai/cli/templates/pyproject.toml
index 33781e14d..048782d1c 100644
--- a/src/crewai/cli/templates/pyproject.toml
+++ b/src/crewai/cli/templates/pyproject.toml
@@ -6,7 +6,7 @@ authors = ["Your Name "]
 
 [tool.poetry.dependencies]
 python = ">=3.10,<=3.13"
-crewai = { extras = ["tools"], version = "^0.41.1" }
+crewai = { extras = ["tools"], version = "^0.46.0" }
 
 [tool.poetry.scripts]
 {{folder_name}} = "{{folder_name}}.main:run"
diff --git a/src/crewai/memory/short_term/short_term_memory.py b/src/crewai/memory/short_term/short_term_memory.py
index 0824c6737..ea62f87f6 100644
--- a/src/crewai/memory/short_term/short_term_memory.py
+++ b/src/crewai/memory/short_term/short_term_memory.py
@@ -1,3 +1,4 @@
+from typing import Any, Dict, Optional
 from crewai.memory.memory import Memory
 from crewai.memory.short_term.short_term_memory_item import ShortTermMemoryItem
 from crewai.memory.storage.rag_storage import RAGStorage
@@ -18,7 +19,14 @@ class ShortTermMemory(Memory):
         )
         super().__init__(storage)
 
-    def save(self, item: ShortTermMemoryItem) -> None:
+    def save(
+        self,
+        value: Any,
+        metadata: Optional[Dict[str, Any]] = None,
+        agent: Optional[str] = None,
+    ) -> None:
+        item = ShortTermMemoryItem(data=value, metadata=metadata, agent=agent)
+
         super().save(value=item.data, metadata=item.metadata, agent=item.agent)
 
     def search(self, query: str, score_threshold: float = 0.35):
diff --git a/src/crewai/memory/short_term/short_term_memory_item.py b/src/crewai/memory/short_term/short_term_memory_item.py
index c20c08699..83b7f842f 100644
--- a/src/crewai/memory/short_term/short_term_memory_item.py
+++ b/src/crewai/memory/short_term/short_term_memory_item.py
@@ -3,7 +3,10 @@ from typing import Any, Dict, Optional
 
 class ShortTermMemoryItem:
     def __init__(
-        self, data: Any, agent: str, metadata: Optional[Dict[str, Any]] = None
+        self,
+        data: Any,
+        agent: Optional[str] = None,
+        metadata: Optional[Dict[str, Any]] = None,
     ):
         self.data = data
         self.agent = agent
diff --git a/src/crewai/memory/storage/interface.py b/src/crewai/memory/storage/interface.py
index e988862ba..0ffc1de16 100644
--- a/src/crewai/memory/storage/interface.py
+++ b/src/crewai/memory/storage/interface.py
@@ -4,7 +4,7 @@ from typing import Any, Dict
 class Storage:
     """Abstract base class defining the storage interface"""
 
-    def save(self, key: str, value: Any, metadata: Dict[str, Any]) -> None:
+    def save(self, value: Any, metadata: Dict[str, Any]) -> None:
         pass
 
     def search(self, key: str) -> Dict[str, Any]:  # type: ignore
diff --git a/src/crewai/task.py b/src/crewai/task.py
index bd7e328b0..8efaee5fc 100644
--- a/src/crewai/task.py
+++ b/src/crewai/task.py
@@ -1,3 +1,4 @@
+import datetime
 import json
 import os
 import threading
@@ -108,6 +109,7 @@ class Task(BaseModel):
     _original_description: str | None = None
     _original_expected_output: str | None = None
     _thread: threading.Thread | None = None
+    _execution_time: float | None = None
 
     def __init__(__pydantic_self__, **data):
         config = data.pop("config", {})
@@ -121,6 +123,12 @@ class Task(BaseModel):
             "may_not_set_field", "This field is not to be set by the user.", {}
         )
 
+    def _set_start_execution_time(self) -> float:
+        return datetime.datetime.now().timestamp()
+
+    def _set_end_execution_time(self, start_time: float) -> None:
+        self._execution_time = datetime.datetime.now().timestamp() - start_time
+
     @field_validator("output_file")
     @classmethod
     def output_file_validation(cls, value: str) -> str:
@@ -217,6 +225,7 @@ class Task(BaseModel):
                 f"The task '{self.description}' has no agent assigned, therefore it can't be executed directly and should be executed in a Crew using a specific process that support that, like hierarchical."
             )
 
+        start_time = self._set_start_execution_time()
         self._execution_span = self._telemetry.task_started(crew=agent.crew, task=self)
 
         self.prompt_context = context
@@ -240,6 +249,7 @@ class Task(BaseModel):
         )
 
         self.output = task_output
+        self._set_end_execution_time(start_time)
 
         if self.callback:
             self.callback(self.output)
@@ -251,7 +261,9 @@ class Task(BaseModel):
             content = (
                 json_output
                 if json_output
-                else pydantic_output.model_dump_json() if pydantic_output else result
+                else pydantic_output.model_dump_json()
+                if pydantic_output
+                else result
             )
             self._save_file(content)
diff --git a/src/crewai/telemetry/telemetry.py b/src/crewai/telemetry/telemetry.py
index 3983de0ec..d6f29685e 100644
--- a/src/crewai/telemetry/telemetry.py
+++ b/src/crewai/telemetry/telemetry.py
@@ -40,7 +40,7 @@ class Telemetry:
     - Roles of agents in a crew
     - Tools names available
 
-    Users can opt-in to sharing more complete data suing the `share_crew`
+    Users can opt-in to sharing more complete data using the `share_crew`
     attribute in the Crew class.
     """
diff --git a/src/crewai/utilities/evaluators/crew_evaluator_handler.py b/src/crewai/utilities/evaluators/crew_evaluator_handler.py
index 3f1abb8b8..fbc5d341e 100644
--- a/src/crewai/utilities/evaluators/crew_evaluator_handler.py
+++ b/src/crewai/utilities/evaluators/crew_evaluator_handler.py
@@ -28,6 +28,7 @@ class CrewEvaluator:
     """
 
     tasks_scores: defaultdict = defaultdict(list)
+    run_execution_times: defaultdict = defaultdict(list)
     iteration: int = 0
 
     def __init__(self, crew, openai_model_name: str):
@@ -40,9 +41,6 @@ class CrewEvaluator:
         for task in self.crew.tasks:
             task.callback = self.evaluate
 
-    def set_iteration(self, iteration: int) -> None:
-        self.iteration = iteration
-
     def _evaluator_agent(self):
         return Agent(
             role="Task Execution Evaluator",
@@ -71,6 +69,9 @@ class CrewEvaluator:
             output_pydantic=TaskEvaluationPydanticOutput,
         )
 
+    def set_iteration(self, iteration: int) -> None:
+        self.iteration = iteration
+
     def print_crew_evaluation_result(self) -> None:
         """
         Prints the evaluation result of the crew in a table.
@@ -119,6 +120,16 @@ class CrewEvaluator:
         ]
         table.add_row("Crew", *map(str, crew_scores), f"{crew_average:.1f}")
 
+        run_exec_times = [
+            int(sum(tasks_exec_times))
+            for _, tasks_exec_times in self.run_execution_times.items()
+        ]
+        execution_time_avg = int(sum(run_exec_times) / len(run_exec_times))
+        table.add_row(
+            "Execution Time (s)",
+            *map(str, run_exec_times),
+            f"{execution_time_avg}",
+        )
         # Display the table in the terminal
         console = Console()
         console.print(table)
@@ -145,5 +156,8 @@ class CrewEvaluator:
 
         if isinstance(evaluation_result.pydantic, TaskEvaluationPydanticOutput):
             self.tasks_scores[self.iteration].append(evaluation_result.pydantic.quality)
+            self.run_execution_times[self.iteration].append(
+                current_task._execution_time
+            )
         else:
             raise ValueError("Evaluation result is not in the expected format")
diff --git a/tests/cli/cli_test.py b/tests/cli/cli_test.py
index 504975dc7..509b9193a 100644
--- a/tests/cli/cli_test.py
+++ b/tests/cli/cli_test.py
@@ -135,29 +135,29 @@ def test_version_command_with_tools(runner):
     )
 
 
-@mock.patch("crewai.cli.cli.test_crew")
-def test_test_default_iterations(test_crew, runner):
+@mock.patch("crewai.cli.cli.evaluate_crew")
+def test_test_default_iterations(evaluate_crew, runner):
     result = runner.invoke(test)
 
-    test_crew.assert_called_once_with(3, "gpt-4o-mini")
+    evaluate_crew.assert_called_once_with(3, "gpt-4o-mini")
     assert result.exit_code == 0
     assert "Testing the crew for 3 iterations with model gpt-4o-mini" in result.output
 
 
-@mock.patch("crewai.cli.cli.test_crew")
-def test_test_custom_iterations(test_crew, runner):
+@mock.patch("crewai.cli.cli.evaluate_crew")
+def test_test_custom_iterations(evaluate_crew, runner):
     result = runner.invoke(test, ["--n_iterations", "5", "--model", "gpt-4o"])
 
-    test_crew.assert_called_once_with(5, "gpt-4o")
+    evaluate_crew.assert_called_once_with(5, "gpt-4o")
     assert result.exit_code == 0
     assert "Testing the crew for 5 iterations with model gpt-4o" in result.output
 
 
-@mock.patch("crewai.cli.cli.test_crew")
-def test_test_invalid_string_iterations(test_crew, runner):
+@mock.patch("crewai.cli.cli.evaluate_crew")
+def test_test_invalid_string_iterations(evaluate_crew, runner):
     result = runner.invoke(test, ["--n_iterations", "invalid"])
 
-    test_crew.assert_not_called()
+    evaluate_crew.assert_not_called()
     assert result.exit_code == 2
     assert (
         "Usage: test [OPTIONS]\nTry 'test --help' for help.\n\nError: Invalid value for '-n' / '--n_iterations': 'invalid' is not a valid integer.\n"
diff --git a/tests/cli/test_crew_test.py b/tests/cli/test_crew_test.py
index 90649710a..578e413bc 100644
--- a/tests/cli/test_crew_test.py
+++ b/tests/cli/test_crew_test.py
@@ -3,7 +3,7 @@ from unittest import mock
 
 import pytest
 
-from crewai.cli import test_crew
+from crewai.cli import evaluate_crew
 
 
@@ -14,13 +14,13 @@ from crewai.cli import test_crew
         (10, "gpt-4"),
     ],
 )
-@mock.patch("crewai.cli.test_crew.subprocess.run")
+@mock.patch("crewai.cli.evaluate_crew.subprocess.run")
 def test_crew_success(mock_subprocess_run, n_iterations, model):
     """Test the crew function for successful execution."""
     mock_subprocess_run.return_value = subprocess.CompletedProcess(
         args=f"poetry run test {n_iterations} {model}", returncode=0
     )
-    result = test_crew.test_crew(n_iterations, model)
+    result = evaluate_crew.evaluate_crew(n_iterations, model)
 
     mock_subprocess_run.assert_called_once_with(
         ["poetry", "run", "test", str(n_iterations), model],
@@ -31,26 +31,26 @@ def test_crew_success(mock_subprocess_run, n_iterations, model):
     assert result is None
 
 
-@mock.patch("crewai.cli.test_crew.click")
+@mock.patch("crewai.cli.evaluate_crew.click")
 def test_test_crew_zero_iterations(click):
-    test_crew.test_crew(0, "gpt-4o")
+    evaluate_crew.evaluate_crew(0, "gpt-4o")
     click.echo.assert_called_once_with(
         "An unexpected error occurred: The number of iterations must be a positive integer.",
         err=True,
     )
 
 
-@mock.patch("crewai.cli.test_crew.click")
+@mock.patch("crewai.cli.evaluate_crew.click")
 def test_test_crew_negative_iterations(click):
-    test_crew.test_crew(-2, "gpt-4o")
+    evaluate_crew.evaluate_crew(-2, "gpt-4o")
     click.echo.assert_called_once_with(
         "An unexpected error occurred: The number of iterations must be a positive integer.",
         err=True,
     )
 
 
-@mock.patch("crewai.cli.test_crew.click")
-@mock.patch("crewai.cli.test_crew.subprocess.run")
+@mock.patch("crewai.cli.evaluate_crew.click")
+@mock.patch("crewai.cli.evaluate_crew.subprocess.run")
 def test_test_crew_called_process_error(mock_subprocess_run, click):
     n_iterations = 5
     mock_subprocess_run.side_effect = subprocess.CalledProcessError(
@@ -59,7 +59,7 @@ def test_test_crew_called_process_error(mock_subprocess_run, click):
         output="Error",
         stderr="Some error occurred",
     )
-    test_crew.test_crew(n_iterations, "gpt-4o")
+    evaluate_crew.evaluate_crew(n_iterations, "gpt-4o")
 
     mock_subprocess_run.assert_called_once_with(
         ["poetry", "run", "test", "5", "gpt-4o"],
@@ -78,13 +78,13 @@ def test_test_crew_called_process_error(mock_subprocess_run, click):
     )
 
 
-@mock.patch("crewai.cli.test_crew.click")
-@mock.patch("crewai.cli.test_crew.subprocess.run")
+@mock.patch("crewai.cli.evaluate_crew.click")
+@mock.patch("crewai.cli.evaluate_crew.subprocess.run")
 def test_test_crew_unexpected_exception(mock_subprocess_run, click):
     # Arrange
     n_iterations = 5
     mock_subprocess_run.side_effect = Exception("Unexpected error")
-    test_crew.test_crew(n_iterations, "gpt-4o")
+    evaluate_crew.evaluate_crew(n_iterations, "gpt-4o")
 
     mock_subprocess_run.assert_called_once_with(
         ["poetry", "run", "test", "5", "gpt-4o"],
diff --git a/tests/crew_test.py b/tests/crew_test.py
index f3fb27872..291e4eb68 100644
--- a/tests/crew_test.py
+++ b/tests/crew_test.py
@@ -629,21 +629,18 @@ def test_sequential_async_task_execution_completion():
     list_ideas = Task(
         description="Give me a list of 5 interesting ideas to explore for an article, what makes them unique and interesting.",
         expected_output="Bullet point list of 5 important events.",
-        max_retry_limit=3,
         agent=researcher,
         async_execution=True,
     )
     list_important_history = Task(
         description="Research the history of AI and give me the 5 most important events that shaped the technology.",
         expected_output="Bullet point list of 5 important events.",
-        max_retry_limit=3,
         agent=researcher,
         async_execution=True,
     )
     write_article = Task(
         description="Write an article about the history of AI and its most important events.",
         expected_output="A 4 paragraph article about AI.",
-        max_retry_limit=3,
         agent=writer,
         context=[list_ideas, list_important_history],
     )
diff --git a/tests/memory/short_term_memory_test.py b/tests/memory/short_term_memory_test.py
index fa8cc41f9..8ae4e714c 100644
--- a/tests/memory/short_term_memory_test.py
+++ b/tests/memory/short_term_memory_test.py
@@ -23,10 +23,7 @@ def short_term_memory():
         expected_output="A list of relevant URLs based on the search query.",
         agent=agent,
     )
-    return ShortTermMemory(crew=Crew(
-        agents=[agent],
-        tasks=[task]
-    ))
+    return ShortTermMemory(crew=Crew(agents=[agent], tasks=[task]))
 
 
 @pytest.mark.vcr(filter_headers=["authorization"])
@@ -38,7 +35,11 @@ def test_save_and_search(short_term_memory):
         agent="test_agent",
         metadata={"task": "test_task"},
     )
-    short_term_memory.save(memory)
+    short_term_memory.save(
+        value=memory.data,
+        metadata=memory.metadata,
+        agent=memory.agent,
+    )
 
     find = short_term_memory.search("test value", score_threshold=0.01)[0]
     assert find["context"] == memory.data, "Data value mismatch."
diff --git a/tests/utilities/evaluators/test_crew_evaluator_handler.py b/tests/utilities/evaluators/test_crew_evaluator_handler.py
index 39fa35c44..30fb7bf76 100644
--- a/tests/utilities/evaluators/test_crew_evaluator_handler.py
+++ b/tests/utilities/evaluators/test_crew_evaluator_handler.py
@@ -84,6 +84,10 @@ class TestCrewEvaluator:
             1: [10, 9, 8],
             2: [9, 8, 7],
         }
+        crew_planner.run_execution_times = {
+            1: [24, 45, 66],
+            2: [55, 33, 67],
+        }
 
         crew_planner.print_crew_evaluation_result()
 
@@ -98,6 +102,7 @@ class TestCrewEvaluator:
                 mock.call().add_row("Task 2", "9", "8", "8.5"),
                 mock.call().add_row("Task 3", "8", "7", "7.5"),
                 mock.call().add_row("Crew", "9.0", "8.0", "8.5"),
+                mock.call().add_row("Execution Time (s)", "135", "155", "145"),
             ]
         )
         console.assert_has_calls([mock.call(), mock.call().print(table())])