Lorenze/better tracing events (#3382)

* feat: implement tool usage limit exception handling

- Introduced `ToolUsageLimitExceeded` exception to manage maximum usage limits for tools.
- Enhanced `CrewStructuredTool` to check and raise this exception when the usage limit is reached.
- Updated `_run` and `_execute` methods to include usage limit checks and handle exceptions appropriately, improving reliability and user feedback.

* feat: enhance PlusAPI and ToolUsage with task metadata

- Removed the `send_trace_batch` method from PlusAPI to streamline the API.
- Added timeout parameters to trace event methods in PlusAPI for improved reliability.
- Updated ToolUsage to include task metadata (task name and ID) in event emissions, enhancing traceability and context during tool usage.
- Refactored event handling in LLM and ToolUsage events to ensure task information is consistently captured.

* feat: enhance memory and event handling with task and agent metadata

- Added task and agent metadata to various memory and event classes, improving traceability and context during memory operations.
- Updated the `ContextualMemory` and `Memory` classes to associate tasks and agents, allowing for better context management.
- Enhanced event emissions in `LLM`, `ToolUsage`, and memory events to include task and agent information, facilitating improved debugging and monitoring.
- Refactored event handling to ensure consistent capture of task and agent details across the system.

* drop

* refactor: clean up unused imports in memory and event modules

- Removed unused TYPE_CHECKING imports from long_term_memory.py to streamline the code.
- Eliminated unnecessary import from memory_events.py, enhancing clarity and maintainability.

* fix memory tests

* fix task_completed payload

* fix: remove unused test agent variable in external memory tests

* refactor: remove unused agent parameter from Memory class save method

- Eliminated the agent parameter from the save method in the Memory class to streamline the code and improve clarity.
- Updated the TraceBatchManager class by moving initialization of attributes into the constructor for better organization and readability.

* refactor: enhance ExecutionState and ReasoningEvent classes with optional task and agent identifiers

- Added optional `current_agent_id` and `current_task_id` attributes to the `ExecutionState` class for better tracking of agent and task states.
- Updated the `from_task` attribute in the `ReasoningEvent` class to use `Optional[Any]` instead of a specific type, improving flexibility in event handling.

* refactor: update ExecutionState class by removing unused agent and task identifiers

- Removed the `current_agent_id` and `current_task_id` attributes from the `ExecutionState` class to simplify the code and enhance clarity.
- Adjusted the import statements to include `Optional` for better type handling.

* refactor: streamline LLM event handling in LiteAgent

- Removed unused LLM event emissions (LLMCallStartedEvent, LLMCallCompletedEvent, LLMCallFailedEvent) from the LiteAgent class to simplify the code and improve performance.
- Adjusted the flow of LLM response handling by eliminating unnecessary event bus interactions, enhancing clarity and maintainability.

* flow ownership and not emitting events when a crew is done

* refactor: remove unused agent parameter from ShortTermMemory save method

- Eliminated the agent parameter from the save method in the ShortTermMemory class to streamline the code and improve clarity.
- This change enhances the maintainability of the memory management system by reducing unnecessary complexity.

* runtype check fix

* fixing tests

* fix lints

* fix: update event assertions in test_llm_emits_event_with_lite_agent

- Adjusted the expected counts for completed and started events in the test to reflect the correct behavior of the LiteAgent.
- Updated assertions for agent roles and IDs to match the expected values after recent changes in event handling.

* fix: update task name assertions in event tests

- Modified assertions in `test_stream_llm_emits_event_with_task_and_agent_info` and `test_llm_emits_event_with_task_and_agent_info` to use `task.description` as a fallback for `task.name`. This ensures that the tests correctly validate the task name even when it is not explicitly set.

* fix: update test assertions for output values and improve readability

- Updated assertions in `test_output_json_dict_hierarchical` to reflect the correct expected score value.
- Enhanced readability of assertions in `test_output_pydantic_to_another_task` and `test_key` by formatting the error messages for clarity.
- These changes ensure that the tests accurately validate the expected outputs and improve overall code quality.

* test fixes

* fix crew_test

* added another fixture

* fix: ensure agent and task assignments in contextual memory are conditional

- Updated the ContextualMemory class to check for the existence of short-term, long-term, external, and extended memory before assigning agent and task attributes. This prevents potential attribute errors when memory types are not initialized.
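
For readers unfamiliar with the pattern, the usage limit check described in the first item of this message might look roughly like the sketch below. Only the exception name `ToolUsageLimitExceeded` comes from the commit message; `LimitedTool`, `max_usage_count`, `current_usage_count`, and `run` are hypothetical names chosen for illustration, and the real `CrewStructuredTool` may be organized differently.

```python
from typing import Any, Callable, Optional


class ToolUsageLimitExceeded(Exception):
    """Raised when a tool is called after its usage limit was reached."""


class LimitedTool:
    """Hypothetical stand-in for a structured tool (not the crewAI class)."""

    def __init__(
        self,
        name: str,
        func: Callable[..., Any],
        max_usage_count: Optional[int] = None,  # assumed name; None means unlimited
    ) -> None:
        self.name = name
        self.func = func
        self.max_usage_count = max_usage_count
        self.current_usage_count = 0

    def run(self, **kwargs: Any) -> Any:
        # Check the limit before doing any work, so a rejected call
        # neither executes the tool nor consumes a usage slot.
        if (
            self.max_usage_count is not None
            and self.current_usage_count >= self.max_usage_count
        ):
            raise ToolUsageLimitExceeded(
                f"Tool '{self.name}' has reached its maximum usage limit "
                f"of {self.max_usage_count}."
            )
        self.current_usage_count += 1
        return self.func(**kwargs)
```

With `max_usage_count=1`, the first `run(...)` call returns the wrapped function's result and the second raises `ToolUsageLimitExceeded`, which a caller can catch and report back to the agent instead of failing the whole task.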
2026-01-11 00:58:30 +00:00 · 2025-08-26 09:09:46 -07:00
parent 4b4a119a9f
commit 7addda9398
43 changed files with 5151 additions and 295 deletions
--- a/tests/test_crew.py
+++ b/tests/test_crew.py
@@ -366,7 +366,7 @@ def test_hierarchical_process(researcher, writer):

    assert (
        result.raw
-        == "**1. The Rise of Autonomous AI Agents in Daily Life**  \nAs artificial intelligence technology progresses, the integration of autonomous AI agents into everyday life becomes increasingly prominent. These agents, capable of making decisions without human intervention, are reshaping industries from healthcare to finance. Exploring case studies where autonomous AI has successfully decreased operational costs or improved efficiency can reveal not only the benefits but also the ethical implications of delegating decision-making to machines. This topic offers an exciting opportunity to dive into the AI landscape, showcasing current developments such as AI assistants and autonomous vehicles.\n\n**2. Ethical Implications of Generative AI in Creative Industries**  \nThe surge of generative AI tools in creative fields, such as art, music, and writing, has sparked a heated debate about authorship and originality. This article could investigate how these tools are being used by artists and creators, examining both the potential for innovation and the risk of devaluing traditional art forms. Highlighting perspectives from creators, legal experts, and ethicists could provide a comprehensive overview of the challenges faced, including copyright concerns and the emotional impact on human artists. This discussion is vital as the creative landscape evolves alongside technological advancements, making it ripe for exploration.\n\n**3. AI in Climate Change Mitigation: Current Solutions and Future Potential**  \nAs the world grapples with climate change, AI technology is increasingly being harnessed to develop innovative solutions for sustainability. From predictive analytics that optimize energy consumption to machine learning algorithms that improve carbon capture methods, AI's potential in environmental science is vast. 
This topic invites an exploration of existing AI applications in climate initiatives, with a focus on groundbreaking research and initiatives aimed at reducing humanity's carbon footprint. Highlighting successful projects and technology partnerships can illustrate the positive impact AI can have on global climate efforts, inspiring further exploration and investment in this area.\n\n**4. The Future of Work: How AI is Reshaping Employment Landscapes**  \nThe discussions around AI's impact on the workforce are both urgent and complex, as advances in automation and machine learning continue to transform the job market. This article could delve into the current trends of AI-driven job displacement alongside opportunities for upskilling and the creation of new job roles. By examining case studies of companies that integrate AI effectively and the resulting workforce adaptations, readers can gain valuable insights into preparing for a future where humans and AI collaborate. This exploration highlights the importance of policies that promote workforce resilience in the face of change.\n\n**5. Decentralized AI: Exploring the Role of Blockchain in AI Development**  \nAs blockchain technology sweeps through various sectors, its application in AI development presents a fascinating topic worth examining. Decentralized AI could address issues of data privacy, security, and democratization in AI models by allowing users to retain ownership of data while benefiting from AI's capabilities. This article could analyze how decentralized networks are disrupting traditional AI development models, featuring innovative projects that harness the synergy between blockchain and AI. 
Highlighting potential pitfalls and the future landscape of decentralized AI could stimulate discussion among technologists, entrepreneurs, and policymakers alike.\n\nThese topics not only reflect current trends but also probe deeper into ethical and practical considerations, making them timely and relevant for contemporary audiences."
+        == "1. **The Rise of Autonomous AI Agents in Daily Life**  \n   As artificial intelligence technology progresses, the integration of autonomous AI agents into everyday life becomes increasingly prominent. These agents, capable of making decisions without human intervention, are reshaping industries from healthcare to finance. Exploring case studies where autonomous AI has successfully decreased operational costs or improved efficiency can reveal not only the benefits but also the ethical implications of delegating decision-making to machines. This topic offers an exciting opportunity to dive into the AI landscape, showcasing current developments such as AI assistants and autonomous vehicles.\n\n2. **Ethical Implications of Generative AI in Creative Industries**  \n   The surge of generative AI tools in creative fields, such as art, music, and writing, has sparked a heated debate about authorship and originality. This article could investigate how these tools are being used by artists and creators, examining both the potential for innovation and the risk of devaluing traditional art forms. Highlighting perspectives from creators, legal experts, and ethicists could provide a comprehensive overview of the challenges faced, including copyright concerns and the emotional impact on human artists. This discussion is vital as the creative landscape evolves alongside technological advancements, making it ripe for exploration.\n\n3. **AI in Climate Change Mitigation: Current Solutions and Future Potential**  \n   As the world grapples with climate change, AI technology is increasingly being harnessed to develop innovative solutions for sustainability. From predictive analytics that optimize energy consumption to machine learning algorithms that improve carbon capture methods, AI's potential in environmental science is vast. 
This topic invites an exploration of existing AI applications in climate initiatives, with a focus on groundbreaking research and initiatives aimed at reducing humanity's carbon footprint. Highlighting successful projects and technology partnerships can illustrate the positive impact AI can have on global climate efforts, inspiring further exploration and investment in this area.\n\n4. **The Future of Work: How AI is Reshaping Employment Landscapes**  \n   The discussions around AI's impact on the workforce are both urgent and complex, as advances in automation and machine learning continue to transform the job market. This article could delve into the current trends of AI-driven job displacement alongside opportunities for upskilling and the creation of new job roles. By examining case studies of companies that integrate AI effectively and the resulting workforce adaptations, readers can gain valuable insights into preparing for a future where humans and AI collaborate. This exploration highlights the importance of policies that promote workforce resilience in the face of change.\n\n5. **Decentralized AI: Exploring the Role of Blockchain in AI Development**  \n   As blockchain technology sweeps through various sectors, its application in AI development presents a fascinating topic worth examining. Decentralized AI could address issues of data privacy, security, and democratization in AI models by allowing users to retain ownership of data while benefiting from AI's capabilities. This article could analyze how decentralized networks are disrupting traditional AI development models, featuring innovative projects that harness the synergy between blockchain and AI. Highlighting potential pitfalls and the future landscape of decentralized AI could stimulate discussion among technologists, entrepreneurs, and policymakers alike."
    )


@@ -624,12 +624,12 @@ def test_crew_with_delegating_agents_should_not_override_task_tools(ceo, writer)
        _, kwargs = mock_execute_sync.call_args
        tools = kwargs["tools"]

-        assert any(
-            isinstance(tool, TestTool) for tool in tools
-        ), "TestTool should be present"
-        assert any(
-            "delegate" in tool.name.lower() for tool in tools
-        ), "Delegation tool should be present"
+        assert any(isinstance(tool, TestTool) for tool in tools), (
+            "TestTool should be present"
+        )
+        assert any("delegate" in tool.name.lower() for tool in tools), (
+            "Delegation tool should be present"
+        )


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -688,12 +688,12 @@ def test_crew_with_delegating_agents_should_not_override_agent_tools(ceo, writer
        _, kwargs = mock_execute_sync.call_args
        tools = kwargs["tools"]

-        assert any(
-            isinstance(tool, TestTool) for tool in new_ceo.tools
-        ), "TestTool should be present"
-        assert any(
-            "delegate" in tool.name.lower() for tool in tools
-        ), "Delegation tool should be present"
+        assert any(isinstance(tool, TestTool) for tool in new_ceo.tools), (
+            "TestTool should be present"
+        )
+        assert any("delegate" in tool.name.lower() for tool in tools), (
+            "Delegation tool should be present"
+        )


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -817,17 +817,17 @@ def test_task_tools_override_agent_tools_with_allow_delegation(researcher, write
        used_tools = kwargs["tools"]

        # Confirm AnotherTestTool is present but TestTool is not
-        assert any(
-            isinstance(tool, AnotherTestTool) for tool in used_tools
-        ), "AnotherTestTool should be present"
-        assert not any(
-            isinstance(tool, TestTool) for tool in used_tools
-        ), "TestTool should not be present among used tools"
+        assert any(isinstance(tool, AnotherTestTool) for tool in used_tools), (
+            "AnotherTestTool should be present"
+        )
+        assert not any(isinstance(tool, TestTool) for tool in used_tools), (
+            "TestTool should not be present among used tools"
+        )

        # Confirm delegation tool(s) are present
-        assert any(
-            "delegate" in tool.name.lower() for tool in used_tools
-        ), "Delegation tool should be present"
+        assert any("delegate" in tool.name.lower() for tool in used_tools), (
+            "Delegation tool should be present"
+        )

    # Finally, make sure the agent's original tools remain unchanged
    assert len(researcher_with_delegation.tools) == 1
@@ -931,9 +931,9 @@ def test_cache_hitting_between_agents(researcher, writer, ceo):
            tool="multiplier", input={"first_number": 2, "second_number": 6}
        )
        assert cache_calls[0] == expected_call, f"First call mismatch: {cache_calls[0]}"
-        assert (
-            cache_calls[1] == expected_call
-        ), f"Second call mismatch: {cache_calls[1]}"
+        assert cache_calls[1] == expected_call, (
+            f"Second call mismatch: {cache_calls[1]}"
+        )


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -1676,9 +1676,9 @@ def test_code_execution_flag_adds_code_tool_upon_kickoff():

            # Verify that exactly one tool was used and it was a CodeInterpreterTool
            assert len(used_tools) == 1, "Should have exactly one tool"
-            assert isinstance(
-                used_tools[0], CodeInterpreterTool
-            ), "Tool should be CodeInterpreterTool"
+            assert isinstance(used_tools[0], CodeInterpreterTool), (
+                "Tool should be CodeInterpreterTool"
+            )


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -1762,10 +1762,10 @@ def test_agent_usage_metrics_are_captured_for_hierarchical_process():
    assert result.raw == "Howdy!"

    assert result.token_usage == UsageMetrics(
-        total_tokens=1673,
-        prompt_tokens=1562,
-        completion_tokens=111,
-        successful_requests=3,
+        total_tokens=2390,
+        prompt_tokens=2264,
+        completion_tokens=126,
+        successful_requests=4,
        cached_prompt_tokens=0,
    )

@@ -3817,9 +3817,9 @@ def test_fetch_inputs():
    expected_placeholders = {"role_detail", "topic", "field"}
    actual_placeholders = crew.fetch_inputs()

-    assert (
-        actual_placeholders == expected_placeholders
-    ), f"Expected {expected_placeholders}, but got {actual_placeholders}"
+    assert actual_placeholders == expected_placeholders, (
+        f"Expected {expected_placeholders}, but got {actual_placeholders}"
+    )


 def test_task_tools_preserve_code_execution_tools():
@@ -3894,20 +3894,20 @@ def test_task_tools_preserve_code_execution_tools():
        used_tools = kwargs["tools"]

        # Verify all expected tools are present
-        assert any(
-            isinstance(tool, TestTool) for tool in used_tools
-        ), "Task's TestTool should be present"
-        assert any(
-            isinstance(tool, CodeInterpreterTool) for tool in used_tools
-        ), "CodeInterpreterTool should be present"
-        assert any(
-            "delegate" in tool.name.lower() for tool in used_tools
-        ), "Delegation tool should be present"
+        assert any(isinstance(tool, TestTool) for tool in used_tools), (
+            "Task's TestTool should be present"
+        )
+        assert any(isinstance(tool, CodeInterpreterTool) for tool in used_tools), (
+            "CodeInterpreterTool should be present"
+        )
+        assert any("delegate" in tool.name.lower() for tool in used_tools), (
+            "Delegation tool should be present"
+        )

        # Verify the total number of tools (TestTool + CodeInterpreter + 2 delegation tools)
-        assert (
-            len(used_tools) == 4
-        ), "Should have TestTool, CodeInterpreter, and 2 delegation tools"
+        assert len(used_tools) == 4, (
+            "Should have TestTool, CodeInterpreter, and 2 delegation tools"
+        )


@pytest.mark.vcr(filter_headers=["authorization"])
@@ -3951,9 +3951,9 @@ def test_multimodal_flag_adds_multimodal_tools():
        used_tools = kwargs["tools"]

        # Check that the multimodal tool was added
-        assert any(
-            isinstance(tool, AddImageTool) for tool in used_tools
-        ), "AddImageTool should be present when agent is multimodal"
+        assert any(isinstance(tool, AddImageTool) for tool in used_tools), (
+            "AddImageTool should be present when agent is multimodal"
+        )

        # Verify we have exactly one tool (just the AddImageTool)
        assert len(used_tools) == 1, "Should only have the AddImageTool"
@@ -4217,9 +4217,9 @@ def test_crew_guardrail_feedback_in_context():
    assert len(execution_contexts) > 1, "Task should have been executed multiple times"

    # Verify that the second execution included the guardrail feedback
-    assert (
-        "Output must contain the keyword 'IMPORTANT'" in execution_contexts[1]
-    ), "Guardrail feedback should be included in retry context"
+    assert "Output must contain the keyword 'IMPORTANT'" in execution_contexts[1], (
+        "Guardrail feedback should be included in retry context"
+    )

    # Verify final output meets guardrail requirements
    assert "IMPORTANT" in result.raw, "Final output should contain required keyword"
@@ -4435,46 +4435,46 @@ def test_crew_copy_with_memory():
    try:
        crew_copy = crew.copy()

-        assert hasattr(
-            crew_copy, "_short_term_memory"
-        ), "Copied crew should have _short_term_memory"
-        assert (
-            crew_copy._short_term_memory is not None
-        ), "Copied _short_term_memory should not be None"
-        assert (
-            id(crew_copy._short_term_memory) != original_short_term_id
-        ), "Copied _short_term_memory should be a new object"
+        assert hasattr(crew_copy, "_short_term_memory"), (
+            "Copied crew should have _short_term_memory"
+        )
+        assert crew_copy._short_term_memory is not None, (
+            "Copied _short_term_memory should not be None"
+        )
+        assert id(crew_copy._short_term_memory) != original_short_term_id, (
+            "Copied _short_term_memory should be a new object"
+        )

-        assert hasattr(
-            crew_copy, "_long_term_memory"
-        ), "Copied crew should have _long_term_memory"
-        assert (
-            crew_copy._long_term_memory is not None
-        ), "Copied _long_term_memory should not be None"
-        assert (
-            id(crew_copy._long_term_memory) != original_long_term_id
-        ), "Copied _long_term_memory should be a new object"
+        assert hasattr(crew_copy, "_long_term_memory"), (
+            "Copied crew should have _long_term_memory"
+        )
+        assert crew_copy._long_term_memory is not None, (
+            "Copied _long_term_memory should not be None"
+        )
+        assert id(crew_copy._long_term_memory) != original_long_term_id, (
+            "Copied _long_term_memory should be a new object"
+        )

-        assert hasattr(
-            crew_copy, "_entity_memory"
-        ), "Copied crew should have _entity_memory"
-        assert (
-            crew_copy._entity_memory is not None
-        ), "Copied _entity_memory should not be None"
-        assert (
-            id(crew_copy._entity_memory) != original_entity_id
-        ), "Copied _entity_memory should be a new object"
+        assert hasattr(crew_copy, "_entity_memory"), (
+            "Copied crew should have _entity_memory"
+        )
+        assert crew_copy._entity_memory is not None, (
+            "Copied _entity_memory should not be None"
+        )
+        assert id(crew_copy._entity_memory) != original_entity_id, (
+            "Copied _entity_memory should be a new object"
+        )

        if original_external_id:
-            assert hasattr(
-                crew_copy, "_external_memory"
-            ), "Copied crew should have _external_memory"
-            assert (
-                crew_copy._external_memory is not None
-            ), "Copied _external_memory should not be None"
-            assert (
-                id(crew_copy._external_memory) != original_external_id
-            ), "Copied _external_memory should be a new object"
+            assert hasattr(crew_copy, "_external_memory"), (
+                "Copied crew should have _external_memory"
+            )
+            assert crew_copy._external_memory is not None, (
+                "Copied _external_memory should not be None"
+            )
+            assert id(crew_copy._external_memory) != original_external_id, (
+                "Copied _external_memory should be a new object"
+            )
        else:
            assert (
                not hasattr(crew_copy, "_external_memory")
@@ -4754,5 +4754,4 @@ def test_ensure_exchanged_messages_are_propagated_to_external_memory():
    external_memory_save.assert_called_once_with(
        value=ANY,
        metadata={"description": ANY, "messages": expected_messages},
-        agent=ANY,
    )
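
Most hunks in this file only move the parentheses in `assert` statements: the condition used to be wrapped, and now the message is. A minimal sketch, using hypothetical helper names that do not appear in the test suite, suggests the two layouts behave identically:

```python
def old_style(actual, expected):
    # Layout removed by this diff: parentheses wrap the condition.
    assert (
        actual == expected
    ), f"Expected {expected}, but got {actual}"


def new_style(actual, expected):
    # Layout added by this diff: parentheses wrap the message.
    assert actual == expected, (
        f"Expected {expected}, but got {actual}"
    )


def failure_message(check, actual, expected):
    """Return the AssertionError text raised by `check`, or None if it passes."""
    try:
        check(actual, expected)
    except AssertionError as exc:
        return str(exc)
    return None
```

Calling `failure_message` with either helper and the same mismatched arguments yields the same text, so these hunks should be read as formatting changes rather than changes to what the tests verify.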