Lorenze/native inference sdks (#3619)

* ruff linted

* using native sdks with litellm fallback

* drop exa

* drop print on completion

* Refactor LLM and utility functions for type consistency
- Updated `max_tokens` parameter in `LLM` class to accept `float` in addition to `int`.
- Modified `create_llm` function to ensure consistent type hints and return types, now returning `LLM | BaseLLM | None`.
- Adjusted type hints for various parameters in `create_llm` and `_llm_via_environment_or_fallback` functions for improved clarity and type safety.
- Enhanced test cases to reflect changes in type handling and ensure proper instantiation of LLM instances.

* fix agent_tests

* fix litellm tests and usagemetrics fix

* drop print

* Refactor LLM event handling and improve test coverage
- Removed commented-out event emission for LLM call failures in `llm.py`.
- Added `from_agent` parameter to `CrewAgentExecutor` for better context in LLM responses.
- Enhanced test for LLM call failure to simulate OpenAI API failure and updated assertions for clarity.
- Updated agent and task ID assertions in tests to ensure they are consistently treated as strings.

* fix test_converter

* fixed tests/agents/test_agent.py

* Refactor LLM context length exception handling and improve provider integration
- Renamed `LLMContextLengthExceededException` to `LLMContextLengthExceededExceptionError` for clarity and consistency.
- Updated LLM class to pass the provider parameter correctly during initialization.
- Enhanced error handling in various LLM provider implementations to raise the new exception type.
- Adjusted tests to reflect the updated exception name and ensure proper error handling in context length scenarios.

* Enhance LLM context window handling across providers
- Introduced CONTEXT_WINDOW_USAGE_RATIO to adjust context window sizes dynamically for Anthropic, Azure, Gemini, and OpenAI LLMs.
- Added validation for context window sizes in Azure and Gemini providers to ensure they fall within acceptable limits.
- Updated context window size calculations to use the new ratio, improving consistency and adaptability across different models.
- Removed hardcoded context window sizes in favor of ratio-based calculations for better flexibility.

* fix test agent again

* fix test agent

* feat: add native LLM providers for Anthropic, Azure, and Gemini
- Introduced new completion implementations for Anthropic, Azure, and Gemini, integrating their respective SDKs.
- Added utility functions for tool validation and extraction to support function calling across LLM providers.
- Enhanced context window management and token usage extraction for each provider.
- Created a common utility module for shared functionality among LLM providers.

* chore: update dependencies and improve context management
- Removed direct dependency on `litellm` from the main dependencies and added it under extras for better modularity.
- Updated the `litellm` dependency specification to allow for greater flexibility in versioning.
- Refactored context length exception handling across various LLM providers to use a consistent error class.
- Enhanced platform-specific dependency markers for NVIDIA packages to ensure compatibility across different systems.

* refactor(tests): update LLM instantiation to include is_litellm flag in test cases
- Modified multiple test cases in test_llm.py to set the is_litellm parameter to True when instantiating the LLM class.
- This change ensures that the tests are aligned with the latest LLM configuration requirements and improves consistency across test scenarios.
- Adjusted relevant assertions and comments to reflect the updated LLM behavior.

* linter

* linted

* revert constants

* fix(tests): correct type hint in expected model description
- Updated the expected description in the test_generate_model_description_dict_field function to use 'Dict' instead of 'dict' for consistency with type hinting conventions.
- This change ensures that the test accurately reflects the expected output format for model descriptions.

* refactor(llm): enhance LLM instantiation and error handling
- Updated the LLM class to include validation for the model parameter, ensuring it is a non-empty string.
- Improved error handling by logging warnings when the native SDK fails, allowing for a fallback to LiteLLM.
- Adjusted the instantiation of LLM in test cases to consistently include the is_litellm flag, aligning with recent changes in LLM configuration.
- Modified relevant tests to reflect these updates, ensuring better coverage and accuracy in testing scenarios.

* fixed test

* refactor(llm): enhance token usage tracking and add copy methods
- Updated the LLM class to track token usage and log callbacks in streaming mode, improving monitoring capabilities.
- Introduced shallow and deep copy methods for the LLM instance, allowing for better management of LLM configurations and parameters.
- Adjusted test cases to instantiate LLM with the is_litellm flag, ensuring alignment with recent changes in LLM configuration.

* refactor(tests): reorganize imports and enhance error messages in test cases
- Cleaned up import statements in test_crew.py for better organization and readability.
- Enhanced error messages in test cases to use `re.escape` for improved regex matching, ensuring more robust error handling.
- Adjusted comments for clarity and consistency across test scenarios.
- Ensured that all necessary modules are imported correctly to avoid potential runtime issues.
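
The ratio-based context window sizing described in the commit message can be sketched roughly as below. This is an illustration under stated assumptions, not the code from this commit: only the name `CONTEXT_WINDOW_USAGE_RATIO` appears in the message, while its value, the two bounds, and the `usable_context_window` helper are hypothetical.

```python
# Illustrative sketch only: the ratio value, the bounds, and the helper name
# are assumptions and are not taken from this commit.
CONTEXT_WINDOW_USAGE_RATIO = 0.85  # assumed value
MIN_CONTEXT_WINDOW = 1_024  # hypothetical lower bound
MAX_CONTEXT_WINDOW = 2_097_152  # hypothetical upper bound


def usable_context_window(
    model_window: int, ratio: float = CONTEXT_WINDOW_USAGE_RATIO
) -> int:
    """Return the share of a model's context window that callers may fill."""
    # Reject window sizes outside the accepted range before scaling.
    if not MIN_CONTEXT_WINDOW <= model_window <= MAX_CONTEXT_WINDOW:
        raise ValueError(
            f"context window {model_window} is outside "
            f"[{MIN_CONTEXT_WINDOW}, {MAX_CONTEXT_WINDOW}]"
        )
    # Scale by the ratio and truncate to a whole number of tokens.
    return int(model_window * ratio)
```

Deriving the usable size from one ratio, rather than hardcoding a number per model, leaves headroom for the response in every provider without maintaining a separate table.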
2026-05-03 00:02:36 +00:00 · 2025-10-03 14:32:35 -07:00
parent 428810bd6f
commit 126b91eab3
77 changed files with 25026 additions and 493 deletions
--- a/lib/crewai/tests/utilities/test_llm_utils.py
+++ b/lib/crewai/tests/utilities/test_llm_utils.py
@@ -1,10 +1,16 @@
 import os
 from unittest.mock import patch

-import pytest
 from crewai.llm import LLM
+from crewai.llms.base_llm import BaseLLM
 from crewai.utilities.llm_utils import create_llm
-from litellm.exceptions import BadRequestError
+import pytest
+
+
+try:
+    from litellm.exceptions import BadRequestError
+except ImportError:
+    BadRequestError = Exception


 def test_create_llm_with_llm_instance():
@@ -15,13 +21,19 @@ def test_create_llm_with_llm_instance():

 def test_create_llm_with_valid_model_string():
    llm = create_llm(llm_value="gpt-4o")
-    assert isinstance(llm, LLM)
+    assert isinstance(llm, BaseLLM)
    assert llm.model == "gpt-4o"


 def test_create_llm_with_invalid_model_string():
-    with pytest.raises(BadRequestError, match="LLM Provider NOT provided"):
-        llm = create_llm(llm_value="invalid-model")
+    # For invalid model strings, create_llm succeeds but call() fails with API error
+    llm = create_llm(llm_value="invalid-model")
+    assert llm is not None
+    assert isinstance(llm, BaseLLM)
+
+    # The error should occur when making the actual API call
+    # We expect some kind of API error (NotFoundError, etc.)
+    with pytest.raises(Exception):  # noqa: B017
        llm.call(messages=[{"role": "user", "content": "Hello, world!"}])


@@ -32,16 +44,16 @@ def test_create_llm_with_unknown_object_missing_attributes():
    unknown_obj = UnknownObject()
    llm = create_llm(llm_value=unknown_obj)

-    # Attempt to call the LLM and expect it to raise an error due to missing attributes
-    with pytest.raises(BadRequestError, match="LLM Provider NOT provided"):
-        llm.call(messages=[{"role": "user", "content": "Hello, world!"}])
+    # Should succeed because str(unknown_obj) provides a model name
+    assert llm is not None
+    assert isinstance(llm, BaseLLM)


 def test_create_llm_with_none_uses_default_model():
-    with patch.dict(os.environ, {}, clear=True):
-        with patch("crewai.cli.constants.DEFAULT_LLM_MODEL", "gpt-4o"):
+    with patch.dict(os.environ, {"OPENAI_API_KEY": "fake-key"}, clear=True):
+        with patch("crewai.utilities.llm_utils.DEFAULT_LLM_MODEL", "gpt-4o-mini"):
            llm = create_llm(llm_value=None)
-            assert isinstance(llm, LLM)
+            assert isinstance(llm, BaseLLM)
            assert llm.model == "gpt-4o-mini"


@@ -53,7 +65,7 @@ def test_create_llm_with_unknown_object():

    unknown_obj = UnknownObject()
    llm = create_llm(llm_value=unknown_obj)
-    assert isinstance(llm, LLM)
+    assert isinstance(llm, BaseLLM)
    assert llm.model == "gpt-4o"
    assert llm.temperature == 0.7
    assert llm.max_tokens == 1500
@@ -64,13 +76,14 @@ def test_create_llm_from_env_with_unaccepted_attributes():
        os.environ,
        {
            "OPENAI_MODEL_NAME": "gpt-3.5-turbo",
+            "OPENAI_API_KEY": "fake-key",
            "AWS_ACCESS_KEY_ID": "fake-access-key",
            "AWS_SECRET_ACCESS_KEY": "fake-secret-key",
            "AWS_REGION_NAME": "us-west-2",
        },
    ):
        llm = create_llm(llm_value=None)
-        assert isinstance(llm, LLM)
+        assert isinstance(llm, BaseLLM)
        assert llm.model == "gpt-3.5-turbo"
        assert not hasattr(llm, "AWS_ACCESS_KEY_ID")
        assert not hasattr(llm, "AWS_SECRET_ACCESS_KEY")
@@ -84,12 +97,18 @@ def test_create_llm_with_partial_attributes():

    obj = PartialAttributes()
    llm = create_llm(llm_value=obj)
-    assert isinstance(llm, LLM)
+    assert isinstance(llm, BaseLLM)
    assert llm.model == "gpt-4o"
    assert llm.temperature is None  # Should handle missing attributes gracefully


 def test_create_llm_with_invalid_type():
-    with pytest.raises(BadRequestError, match="LLM Provider NOT provided"):
-        llm = create_llm(llm_value=42)
+    # For integers, create_llm succeeds because str(42) becomes "42"
+    llm = create_llm(llm_value=42)
+    assert llm is not None
+    assert isinstance(llm, BaseLLM)
+    assert llm.model == "42"
+
+    # The error should occur when making the actual API call
+    with pytest.raises(Exception):  # noqa: B017
        llm.call(messages=[{"role": "user", "content": "Hello, world!"}])