Lorenze/better tracing events (#3382)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled

* feat: implement tool usage limit exception handling

- Introduced `ToolUsageLimitExceeded` exception to manage maximum usage limits for tools.
- Enhanced `CrewStructuredTool` to check and raise this exception when the usage limit is reached.
- Updated `_run` and `_execute` methods to include usage limit checks and handle exceptions appropriately, improving reliability and user feedback.

* feat: enhance PlusAPI and ToolUsage with task metadata

- Removed the `send_trace_batch` method from PlusAPI to streamline the API.
- Added timeout parameters to trace event methods in PlusAPI for improved reliability.
- Updated ToolUsage to include task metadata (task name and ID) in event emissions, enhancing traceability and context during tool usage.
- Refactored event handling in LLM and ToolUsage events to ensure task information is consistently captured.

* feat: enhance memory and event handling with task and agent metadata

- Added task and agent metadata to various memory and event classes, improving traceability and context during memory operations.
- Updated the `ContextualMemory` and `Memory` classes to associate tasks and agents, allowing for better context management.
- Enhanced event emissions in `LLM`, `ToolUsage`, and memory events to include task and agent information, facilitating improved debugging and monitoring.
- Refactored event handling to ensure consistent capture of task and agent details across the system.

* drop

* refactor: clean up unused imports in memory and event modules

- Removed unused TYPE_CHECKING imports from long_term_memory.py to streamline the code.
- Eliminated unnecessary import from memory_events.py, enhancing clarity and maintainability.

* fix memory tests

* fix task_completed payload

* fix: remove unused test agent variable in external memory tests

* refactor: remove unused agent parameter from Memory class save method

- Eliminated the agent parameter from the save method in the Memory class to streamline the code and improve clarity.
- Updated the TraceBatchManager class by moving initialization of attributes into the constructor for better organization and readability.

* refactor: enhance ExecutionState and ReasoningEvent classes with optional task and agent identifiers

- Added optional `current_agent_id` and `current_task_id` attributes to the `ExecutionState` class for better tracking of agent and task states.
- Updated the `from_task` attribute in the `ReasoningEvent` class to use `Optional[Any]` instead of a specific type, improving flexibility in event handling.

* refactor: update ExecutionState class by removing unused agent and task identifiers

- Removed the `current_agent_id` and `current_task_id` attributes from the `ExecutionState` class to simplify the code and enhance clarity.
- Adjusted the import statements to include `Optional` for better type handling.

* refactor: streamline LLM event handling in LiteAgent

- Removed unused LLM event emissions (LLMCallStartedEvent, LLMCallCompletedEvent, LLMCallFailedEvent) from the LiteAgent class to simplify the code and improve performance.
- Adjusted the flow of LLM response handling by eliminating unnecessary event bus interactions, enhancing clarity and maintainability.

* flow ownership and not emitting events when a crew is done

* refactor: remove unused agent parameter from ShortTermMemory save method

- Eliminated the agent parameter from the save method in the ShortTermMemory class to streamline the code and improve clarity.
- This change enhances the maintainability of the memory management system by reducing unnecessary complexity.

* runtype check fix

* fixing tests

* fix lints

* fix: update event assertions in test_llm_emits_event_with_lite_agent

- Adjusted the expected counts for completed and started events in the test to reflect the correct behavior of the LiteAgent.
- Updated assertions for agent roles and IDs to match the expected values after recent changes in event handling.

* fix: update task name assertions in event tests

- Modified assertions in `test_stream_llm_emits_event_with_task_and_agent_info` and `test_llm_emits_event_with_task_and_agent_info` to use `task.description` as a fallback for `task.name`. This ensures that the tests correctly validate the task name even when it is not explicitly set.

* fix: update test assertions for output values and improve readability

- Updated assertions in `test_output_json_dict_hierarchical` to reflect the correct expected score value.
- Enhanced readability of assertions in `test_output_pydantic_to_another_task` and `test_key` by formatting the error messages for clarity.
- These changes ensure that the tests accurately validate the expected outputs and improve overall code quality.

* test fixes

* fix crew_test

* added another fixture

* fix: ensure agent and task assignments in contextual memory are conditional

- Updated the ContextualMemory class to check for the existence of short-term, long-term, external, and extended memory before assigning agent and task attributes. This prevents potential attribute errors when memory types are not initialized.
This commit is contained in:
Lorenze Jay
2025-08-26 09:09:46 -07:00
committed by GitHub
parent 4b4a119a9f
commit 7addda9398
43 changed files with 5151 additions and 295 deletions

View File

@@ -607,4 +607,163 @@ interactions:
status:
code: 200
message: OK
- request:
body: '{"messages": [{"role": "system", "content": "You are Crew Manager. You
are a seasoned manager with a knack for getting the best out of your team.\nYou
are also known for your ability to delegate work to the right people, and to
ask the right questions to get the best out of your team.\nEven though you don''t
perform tasks by yourself, you have a lot of experience in the field, which
allows you to properly evaluate the work of your team members.\nYour personal
goal is: Manage the team to complete the task in the best way possible.\nYou
ONLY have access to the following tools, and should NEVER make up tools that
are not listed here:\n\nTool Name: Delegate work to coworker\nTool Arguments:
{''task'': {''description'': ''The task to delegate'', ''type'': ''str''}, ''context'':
{''description'': ''The context for the task'', ''type'': ''str''}, ''coworker'':
{''description'': ''The role/name of the coworker to delegate to'', ''type'':
''str''}}\nTool Description: Delegate a specific task to one of the following
coworkers: Scorer\nThe input to this tool should be the coworker, the task you
want them to do, and ALL necessary context to execute the task, they know nothing
about the task, so share absolutely everything you know, don''t reference things
but instead explain them.\nTool Name: Ask question to coworker\nTool Arguments:
{''question'': {''description'': ''The question to ask'', ''type'': ''str''},
''context'': {''description'': ''The context for the question'', ''type'': ''str''},
''coworker'': {''description'': ''The role/name of the coworker to ask'', ''type'':
''str''}}\nTool Description: Ask a specific question to one of the following
coworkers: Scorer\nThe input to this tool should be the coworker, the question
you have for them, and ALL necessary context to ask the question properly, they
know nothing about the question, so share absolutely everything you know, don''t
reference things but instead explain them.\n\nIMPORTANT: Use the following format
in your response:\n\n```\nThought: you should always think about what to do\nAction:
the action to take, only one name of [Delegate work to coworker, Ask question
to coworker], just the name, exactly as it''s written.\nAction Input: the input
to the action, just a simple JSON object, enclosed in curly braces, using \"
to wrap keys and values.\nObservation: the result of the action\n```\n\nOnce
all necessary information is gathered, return the following format:\n\n```\nThought:
I now know the final answer\nFinal Answer: the final answer to the original
input question\n```"}, {"role": "user", "content": "\nCurrent Task: Give me
an integer score between 1-5 for the following title: ''The impact of AI in
the future of work''\n\nThis is the expected criteria for your final answer:
The score of the title.\nyou MUST return the actual complete content as the
final answer, not a summary.\n\nBegin! This is VERY important to you, use the
tools available and give your best Final Answer, your job depends on it!\n\nThought:"},
{"role": "assistant", "content": "Thought: I need to get an integer score between
1-5 for the given title. To do this, I''ll ask the Scorer to evaluate the title
based on specific criteria.\n\nAction: Ask question to coworker\nAction Input:
{\"question\": \"Can you provide an integer score between 1-5 for the following
title: ''The impact of AI in the future of work''?\", \"context\": \"We need
a score for the title to assess its impact and relevance. Please evaluate based
on factors such as clarity, relevance, and potential interest to readers.\",
\"coworker\": \"Scorer\"}\nObservation: The score for the title ''The impact
of AI in the future of work'' is 4 out of 5."}], "model": "gpt-4o", "stop":
["\nObservation:"]}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate, zstd
connection:
- keep-alive
content-length:
- '3675'
content-type:
- application/json
cookie:
- _cfuvid=lnKPmS3pW0Tw6qE4ZI8cf4lyVDuLg7ZNxmzxi7CqlME-1756165116673-0.0.1.1-604800000;
__cf_bm=r95s2Hh1kHgI6PI08wwTySLyufxeDnmixIX5GVDEyoA-1756165116-1.0.1.1-X5dML3NG4uvFgDnjqJ4pQWbB9MyxJisxAZJlRivMc7T1f83cCNBH_cy5fJjn4XjKks_UKUUj5PvBGifu2gLkaiqYYWvbSjrhjKseuPN8.Q4
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.93.0
x-stainless-arch:
- arm64
x-stainless-async:
- 'false'
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.93.0
x-stainless-raw-response:
- 'true'
x-stainless-read-timeout:
- '600.0'
x-stainless-retry-count:
- '0'
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.12.9
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAAwAAAP//jFLva9swEP3uv0LoczzsJnYTfyuFsUHHftDA2FJsRT7bamRJSOd1o+R/
H5LT2F1b2BfB3bv3dO/uHiNCqKhpQSjvGPLeyPh6vV8e7vHbdvv5dtvfaPhw/WX59dOP71l+dUMX
nqH398DxifWO695IQKHVCHMLDMGrppdZnuZZmuYB6HUN0tNag/FKxxfJxSpO1nGSn4idFhwcLcjP
iBBCHsPrW1Q1/KYFSRZPmR6cYy3Q4lxECLVa+gxlzgmHTCFdTCDXCkGFrquq2qnbTg9thwX5SJR+
IAf/YAekEYpJwpR7ALtT70N0FaKCrHaqqqq5qoVmcMybUoOUM4AppZH5oQQ/dyfkeHYgdWus3rt/
qLQRSriutMCcVr5bh9rQgB4jQu7CpIZn5qmxujdYoj5A+O5yk456dNrNhKbrE4gamZzy62SzeEWv
rAGZkG42a8oZ76CeqNNi2FALPQOimeuX3bymPToXqv0f+QngHAxCXRoLteDPHU9lFvzpvlV2nnJo
mDqwvwSHEgVYv4kaGjbI8aqo++MQ+rIRqgVrrBhPqzHlOtlkebZc8j2NjtFfAAAA//8DAJCAq9Rj
AwAA
headers:
CF-RAY:
- 974eec8be8b98486-SJC
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Mon, 25 Aug 2025 23:38:37 GMT
Server:
- cloudflare
Strict-Transport-Security:
- max-age=31536000; includeSubDomains; preload
Transfer-Encoding:
- chunked
X-Content-Type-Options:
- nosniff
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
cf-cache-status:
- DYNAMIC
openai-organization:
- crewai-iuxna1
openai-processing-ms:
- '383'
openai-project:
- proj_xitITlrFeen7zjNSzML82h9x
openai-version:
- '2020-10-01'
x-envoy-upstream-service-time:
- '488'
x-ratelimit-limit-project-requests:
- '10000'
x-ratelimit-limit-requests:
- '10000'
x-ratelimit-limit-tokens:
- '30000000'
x-ratelimit-remaining-project-requests:
- '9999'
x-ratelimit-remaining-requests:
- '9999'
x-ratelimit-remaining-tokens:
- '29999132'
x-ratelimit-reset-project-requests:
- 6ms
x-ratelimit-reset-requests:
- 6ms
x-ratelimit-reset-tokens:
- 1ms
x-request-id:
- req_58742f78663b48c2be01fcc6497a46da
status:
code: 200
message: OK
version: 1