Lorenze/native inference sdks (#3619)

* ruff linted
* using native sdks with litellm fallback
* drop exa
* drop print on completion
* Refactor LLM and utility functions for type consistency
  - Updated `max_tokens` parameter in `LLM` class to accept `float` in addition to `int`.
  - Modified `create_llm` function to ensure consistent type hints and return types, now returning `LLM | BaseLLM | None`.
  - Adjusted type hints for various parameters in `create_llm` and `_llm_via_environment_or_fallback` functions for improved clarity and type safety.
  - Enhanced test cases to reflect changes in type handling and ensure proper instantiation of LLM instances.
* fix agent_tests
* fix litellm tests and usagemetrics fix
* drop print
* Refactor LLM event handling and improve test coverage
  - Removed commented-out event emission for LLM call failures in `llm.py`.
  - Added `from_agent` parameter to `CrewAgentExecutor` for better context in LLM responses.
  - Enhanced test for LLM call failure to simulate OpenAI API failure and updated assertions for clarity.
  - Updated agent and task ID assertions in tests to ensure they are consistently treated as strings.
* fix test_converter
* fixed tests/agents/test_agent.py
* Refactor LLM context length exception handling and improve provider integration
  - Renamed `LLMContextLengthExceededException` to `LLMContextLengthExceededExceptionError` for clarity and consistency.
  - Updated LLM class to pass the provider parameter correctly during initialization.
  - Enhanced error handling in various LLM provider implementations to raise the new exception type.
  - Adjusted tests to reflect the updated exception name and ensure proper error handling in context length scenarios.
* Enhance LLM context window handling across providers
  - Introduced CONTEXT_WINDOW_USAGE_RATIO to adjust context window sizes dynamically for Anthropic, Azure, Gemini, and OpenAI LLMs.
  - Added validation for context window sizes in Azure and Gemini providers to ensure they fall within acceptable limits.
  - Updated context window size calculations to use the new ratio, improving consistency and adaptability across different models.
  - Removed hardcoded context window sizes in favor of ratio-based calculations for better flexibility.
* fix test agent again
* fix test agent
* feat: add native LLM providers for Anthropic, Azure, and Gemini
  - Introduced new completion implementations for Anthropic, Azure, and Gemini, integrating their respective SDKs.
  - Added utility functions for tool validation and extraction to support function calling across LLM providers.
  - Enhanced context window management and token usage extraction for each provider.
  - Created a common utility module for shared functionality among LLM providers.
* chore: update dependencies and improve context management
  - Removed direct dependency on `litellm` from the main dependencies and added it under extras for better modularity.
  - Updated the `litellm` dependency specification to allow for greater flexibility in versioning.
  - Refactored context length exception handling across various LLM providers to use a consistent error class.
  - Enhanced platform-specific dependency markers for NVIDIA packages to ensure compatibility across different systems.
* refactor(tests): update LLM instantiation to include is_litellm flag in test cases
  - Modified multiple test cases in test_llm.py to set the is_litellm parameter to True when instantiating the LLM class.
  - This change ensures that the tests are aligned with the latest LLM configuration requirements and improves consistency across test scenarios.
  - Adjusted relevant assertions and comments to reflect the updated LLM behavior.
* linter
* linted
* revert constants
* fix(tests): correct type hint in expected model description
  - Updated the expected description in the test_generate_model_description_dict_field function to use 'Dict' instead of 'dict' for consistency with type hinting conventions.
  - This change ensures that the test accurately reflects the expected output format for model descriptions.
* refactor(llm): enhance LLM instantiation and error handling
  - Updated the LLM class to include validation for the model parameter, ensuring it is a non-empty string.
  - Improved error handling by logging warnings when the native SDK fails, allowing for a fallback to LiteLLM.
  - Adjusted the instantiation of LLM in test cases to consistently include the is_litellm flag, aligning with recent changes in LLM configuration.
  - Modified relevant tests to reflect these updates, ensuring better coverage and accuracy in testing scenarios.
* fixed test
* refactor(llm): enhance token usage tracking and add copy methods
  - Updated the LLM class to track token usage and log callbacks in streaming mode, improving monitoring capabilities.
  - Introduced shallow and deep copy methods for the LLM instance, allowing for better management of LLM configurations and parameters.
  - Adjusted test cases to instantiate LLM with the is_litellm flag, ensuring alignment with recent changes in LLM configuration.
* refactor(tests): reorganize imports and enhance error messages in test cases
  - Cleaned up import statements in test_crew.py for better organization and readability.
  - Enhanced error messages in test cases to use `re.escape` for improved regex matching, ensuring more robust error handling.
  - Adjusted comments for clarity and consistency across test scenarios.
  - Ensured that all necessary modules are imported correctly to avoid potential runtime issues.
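
For readers unfamiliar with the ratio-based context window sizing mentioned in the commit message, a minimal sketch of the idea follows. It is not the code from this commit: the ratio value, the bounds, the model table, and the function name are all assumptions made for illustration; only the name `CONTEXT_WINDOW_USAGE_RATIO` comes from the message itself.

```python
# Illustrative sketch only; values and helper names are assumed, not taken from the commit.
CONTEXT_WINDOW_USAGE_RATIO = 0.85  # assumed value: fraction of the raw window to use

# Hypothetical lookup table of raw context window sizes, in tokens.
HYPOTHETICAL_CONTEXT_WINDOWS = {
    "gpt-4o-mini": 128_000,
}

# Assumed validation bounds for a plausible context window size.
MIN_CONTEXT_WINDOW = 1_024
MAX_CONTEXT_WINDOW = 2_097_152


def usable_context_window(model: str, default: int = 8_192) -> int:
    """Return the number of tokens budgeted for a model after applying the ratio."""
    raw = HYPOTHETICAL_CONTEXT_WINDOWS.get(model, default)
    # Reject sizes outside the assumed acceptable limits before scaling.
    if not MIN_CONTEXT_WINDOW <= raw <= MAX_CONTEXT_WINDOW:
        raise ValueError(f"context window {raw} for {model!r} is out of range")
    # Leave headroom for the response by using only a fraction of the window.
    return int(raw * CONTEXT_WINDOW_USAGE_RATIO)


budget = usable_context_window("gpt-4o-mini")
fallback_budget = usable_context_window("unknown-model")
```

Scaling a single raw size by one shared ratio keeps the headroom policy in one place, which is the flexibility the commit message attributes to dropping hardcoded per-model sizes.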
2026-05-03 08:12:39 +00:00 · 2025-10-03 14:32:35 -07:00
parent 428810bd6f
commit 126b91eab3
77 changed files with 25026 additions and 493 deletions
--- a/lib/crewai/tests/cassettes/test_lite_agent_structured_output.yaml
+++ b/lib/crewai/tests/cassettes/test_lite_agent_structured_output.yaml
@@ -128,4 +128,214 @@ interactions:
      - req_824c5fb422e466b60dacb6e27a0cbbda
    http_version: HTTP/1.1
    status_code: 200
+- request:
+    body: '{"messages": [{"role": "system", "content": "You are Info Gatherer. You
+      gather and summarize information quickly.\nYour personal goal is: Provide brief
+      information\n\nYou ONLY have access to the following tools, and should NEVER
+      make up tools that are not listed here:\n\nTool Name: search_web\nTool Arguments:
+      {''query'': {''description'': None, ''type'': ''str''}}\nTool Description: Search
+      the web for information about a topic.\n\nIMPORTANT: Use the following format
+      in your response:\n\n```\nThought: you should always think about what to do\nAction:
+      the action to take, only one name of [search_web], just the name, exactly as
+      it''s written.\nAction Input: the input to the action, just a simple JSON object,
+      enclosed in curly braces, using \" to wrap keys and values.\nObservation: the
+      result of the action\n```\n\nOnce all necessary information is gathered, return
+      the following format:\n\n```\nThought: I now know the final answer\nFinal Answer:
+      the final answer to the original input question\n```\nIMPORTANT: Your final
+      answer MUST contain all the information requested in the following format: {\n  \"summary\":
+      str,\n  \"confidence\": int\n}\n\nIMPORTANT: Ensure the final output does not
+      include any code block markers like ```json or ```python."}, {"role": "user",
+      "content": "What is the population of Tokyo? Return your structured output in
+      JSON format with the following fields: summary, confidence"}, {"role": "assistant",
+      "content": "Thought: I need to find the current population of Tokyo.\nAction:
+      search_web\nAction Input: {\"query\":\"population of Tokyo 2023\"}\nObservation:
+      Tokyo''s population in 2023 was approximately 21 million people in the city
+      proper, and 37 million in the greater metropolitan area."}], "model": "gpt-4o-mini",
+      "stop": ["\nObservation:"], "stream": false}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      connection:
+      - keep-alive
+      content-length:
+      - '1796'
+      content-type:
+      - application/json
+      cookie:
+      - _cfuvid=u769MG.poap6iEjFpbByMFUC0FygMEqYSurr5DfLbas-1743447969501-0.0.1.1-604800000
+      host:
+      - api.openai.com
+      user-agent:
+      - OpenAI/Python 1.93.0
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 1.93.0
+      x-stainless-retry-count:
+      - '0'
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.12.9
+    method: POST
+    uri: https://api.openai.com/v1/chat/completions
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAAAwAAAP//jFPLbtswELz7Kxa89GIHsuOnbkmBvg7tIbkUVSBsqJXFmuQSJNXECPzv
+        BWk3ctIU6IUAOTvD2eHyaQQgVCNKELLDKI3Tk/df5h9vvq3V96/XvvfTT7e7qb6+MYt2fXWPYpwY
+        fP+TZPzDupBsnKao2B5h6QkjJdXparFeLovlssiA4YZ0om1dnMx5YpRVk1kxm0+K1WS6PrE7VpKC
+        KOHHCADgKa/Jp23oUZSQtfKJoRBwS6J8LgIQnnU6ERiCChFtFOMBlGwj2Wz9tuN+28USPoPlB9il
+        JXYErbKoAW14IF/ZD3l3lXclPFUWoBKhNwb9vhIlVOKWd3t+F8Cx6zWmFEBZmBWzS1AB0DnPj8pg
+        JL2H2RSM0vpUk26TKu7BeXbkAW0D6Lm3DVyuXhcaip4daxXRAnrCi0qMj3Yk21Y1ZCUlR5uisofz
+        nj21fcCUu+21PgPQWo7ZcU777oQcnvPVvHWe78MrqmiVVaGrPWFgm7IMkZ3I6GEEcJffsX/xNMJ5
+        Ni7WkXeUr7ucr456YhifAV3MTmDkiPqMtdmM39CrG4qodDibBCFRdtQM1GFssG8UnwGjs67/dvOW
+        9rFzZbf/Iz8AUpKL1NTOU6Pky46HMk/pd/2r7DnlbFgE8r+UpDoq8uklGmqx18eZF2EfIpm6VXZL
+        3nl1HPzW1Ytlge2SFouNGB1GvwEAAP//AwBMppztBgQAAA==
+    headers:
+      CF-RAY:
+      - 983ceae938953023-SJC
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Tue, 23 Sep 2025 20:51:02 GMT
+      Server:
+      - cloudflare
+      Set-Cookie:
+      - __cf_bm=GCRvAgKG_bNwYFqI4.V.ETNDFENlZGsSPgqfmPRweBE-1758660662-1.0.1.1-BbV_KqvF6uEt_DEfefPzisFvVJNAN5NBAn7UyvcCjL4cC0Earh6WKRSQEBgXDhltOn0zo_0LaT1GsrScK1y2R6EE8NtKLTLI0DvmUDiiTdo;
+        path=/; expires=Tue, 23-Sep-25 21:21:02 GMT; domain=.api.openai.com; HttpOnly;
+        Secure; SameSite=None
+      - _cfuvid=satXYLU.6M.wV_6k7mFk5Z6V97uowThF_xldugIJSJQ-1758660662273-0.0.1.1-604800000;
+        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Strict-Transport-Security:
+      - max-age=31536000; includeSubDomains; preload
+      Transfer-Encoding:
+      - chunked
+      X-Content-Type-Options:
+      - nosniff
+      access-control-expose-headers:
+      - X-Request-ID
+      alt-svc:
+      - h3=":443"; ma=86400
+      cf-cache-status:
+      - DYNAMIC
+      openai-organization:
+      - crewai-iuxna1
+      openai-processing-ms:
+      - '1464'
+      openai-project:
+      - proj_xitITlrFeen7zjNSzML82h9x
+      openai-version:
+      - '2020-10-01'
+      x-envoy-upstream-service-time:
+      - '1521'
+      x-openai-proxy-wasm:
+      - v0.1
+      x-ratelimit-limit-project-tokens:
+      - '150000000'
+      x-ratelimit-limit-requests:
+      - '30000'
+      x-ratelimit-limit-tokens:
+      - '150000000'
+      x-ratelimit-remaining-project-tokens:
+      - '149999605'
+      x-ratelimit-remaining-requests:
+      - '29999'
+      x-ratelimit-remaining-tokens:
+      - '149999602'
+      x-ratelimit-reset-project-tokens:
+      - 0s
+      x-ratelimit-reset-requests:
+      - 2ms
+      x-ratelimit-reset-tokens:
+      - 0s
+      x-request-id:
+      - req_b7cf0ed387424a5f913d455e7bcc6949
+    status:
+      code: 200
+      message: OK
+- request:
+    body: '{"trace_id": "df56ad93-ab2e-4de8-b57c-e52cd231320c", "execution_type":
+      "crew", "user_identifier": null, "execution_context": {"crew_fingerprint": null,
+      "crew_name": "Unknown Crew", "flow_name": null, "crewai_version": "0.193.2",
+      "privacy_level": "standard"}, "execution_metadata": {"expected_duration_estimate":
+      300, "agent_count": 0, "task_count": 0, "flow_method_count": 0, "execution_started_at":
+      "2025-09-23T21:03:51.621012+00:00"}}'
+    headers:
+      Accept:
+      - '*/*'
+      Accept-Encoding:
+      - gzip, deflate
+      Connection:
+      - keep-alive
+      Content-Length:
+      - '436'
+      Content-Type:
+      - application/json
+      User-Agent:
+      - CrewAI-CLI/0.193.2
+      X-Crewai-Version:
+      - 0.193.2
+    method: POST
+    uri: http://localhost:3000/crewai_plus/api/v1/tracing/batches
+  response:
+    body:
+      string: '{"error":"bad_credentials","message":"Bad credentials"}'
+    headers:
+      Content-Length:
+      - '55'
+      cache-control:
+      - no-cache
+      content-security-policy:
+      - 'default-src ''self'' *.crewai.com crewai.com; script-src ''self'' ''unsafe-inline''
+        *.crewai.com crewai.com https://cdn.jsdelivr.net/npm/apexcharts https://www.gstatic.com
+        https://run.pstmn.io https://share.descript.com/; style-src ''self'' ''unsafe-inline''
+        *.crewai.com crewai.com https://cdn.jsdelivr.net/npm/apexcharts; img-src ''self''
+        data: *.crewai.com crewai.com https://zeus.tools.crewai.com https://dashboard.tools.crewai.com
+        https://cdn.jsdelivr.net; font-src ''self'' data: *.crewai.com crewai.com;
+        connect-src ''self'' *.crewai.com crewai.com https://zeus.tools.crewai.com
+        https://connect.useparagon.com/ https://zeus.useparagon.com/* https://*.useparagon.com/*
+        https://run.pstmn.io https://connect.tools.crewai.com/ ws://localhost:3036
+        wss://localhost:3036; frame-src ''self'' *.crewai.com crewai.com https://connect.useparagon.com/
+        https://zeus.tools.crewai.com https://zeus.useparagon.com/* https://connect.tools.crewai.com/
+        https://www.youtube.com https://share.descript.com'
+      content-type:
+      - application/json; charset=utf-8
+      permissions-policy:
+      - camera=(), microphone=(self), geolocation=()
+      referrer-policy:
+      - strict-origin-when-cross-origin
+      server-timing:
+      - cache_read.active_support;dur=0.05, sql.active_record;dur=1.55, cache_generate.active_support;dur=2.03,
+        cache_write.active_support;dur=0.18, cache_read_multi.active_support;dur=0.11,
+        start_processing.action_controller;dur=0.00, process_action.action_controller;dur=2.68
+      vary:
+      - Accept
+      x-content-type-options:
+      - nosniff
+      x-frame-options:
+      - SAMEORIGIN
+      x-permitted-cross-domain-policies:
+      - none
+      x-request-id:
+      - 3fadc173-fe84-48e8-b34f-d6ce5be9b584
+      x-runtime:
+      - '0.046122'
+      x-xss-protection:
+      - 1; mode=block
+    status:
+      code: 401
+      message: Unauthorized
 version: 1
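
The `!!binary` response body recorded in this cassette is base64 text, and its `H4sI` prefix is the base64 form of the gzip magic bytes, consistent with the `Content-Encoding: gzip` response header. A minimal sketch of how such a body could be inspected is below. It round-trips a made-up payload rather than the recorded one, the helper names are hypothetical, and it assumes the decompressed body is JSON.

```python
import base64
import gzip
import json


def decode_cassette_body(b64_text: str) -> dict:
    """Decode a base64 + gzip body (as stored under `!!binary`) into a dict."""
    # YAML folds the base64 text across lines, so drop all whitespace first.
    compressed = base64.b64decode("".join(b64_text.split()))
    # Assumption: the decompressed payload is a JSON document.
    return json.loads(gzip.decompress(compressed))


def encode_cassette_body(payload: dict) -> str:
    """Inverse helper, used here only to build a sample body for the demo."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


# Made-up payload, not the content of the recorded response.
sample = {"summary": "example", "confidence": 1}
encoded = encode_cassette_body(sample)
decoded = decode_cassette_body(encoded)
```

Pasting the folded base64 lines from a cassette into `decode_cassette_body` should work the same way, since whitespace is stripped before decoding.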