Lorenze/native inference sdks (#3619)

* ruff linted

* using native sdks with litellm fallback

* drop exa

* drop print on completion

* Refactor LLM and utility functions for type consistency

- Updated `max_tokens` parameter in `LLM` class to accept `float` in addition to `int`.
- Modified `create_llm` function to ensure consistent type hints and return types, now returning `LLM | BaseLLM | None`.
- Adjusted type hints for various parameters in `create_llm` and `_llm_via_environment_or_fallback` functions for improved clarity and type safety.
- Enhanced test cases to reflect changes in type handling and ensure proper instantiation of LLM instances.
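For illustration, a minimal sketch of what a `create_llm` with that return type can look like; the `BaseLLM` import path, the environment-fallback helper, and the body are assumptions, not the shipped implementation:

```python
# Minimal sketch, assuming crewai is installed; the BaseLLM import path and the
# environment-fallback helper are assumptions, not the shipped implementation.
from __future__ import annotations

import os

from crewai.llm import LLM
from crewai.llms.base_llm import BaseLLM  # import path assumed


def _llm_via_environment_or_fallback() -> LLM | None:
    """Hypothetical stand-in: build an LLM from a MODEL env var, else return None."""
    model = os.getenv("MODEL")
    return LLM(model=model) if model else None


def create_llm(llm_value: str | LLM | BaseLLM | None = None) -> LLM | BaseLLM | None:
    """Normalize whatever the caller passed into a concrete LLM instance."""
    if isinstance(llm_value, (LLM, BaseLLM)):
        return llm_value  # already an LLM/BaseLLM, pass through unchanged
    if isinstance(llm_value, str) and llm_value:
        return LLM(model=llm_value)  # bare model name -> wrap in LLM
    return _llm_via_environment_or_fallback()
```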

* fix agent_tests

* fix litellm tests and usage metrics

* drop print

* Refactor LLM event handling and improve test coverage

- Removed commented-out event emission for LLM call failures in `llm.py`.
- Added `from_agent` parameter to `CrewAgentExecutor` for better context in LLM responses.
- Enhanced test for LLM call failure to simulate OpenAI API failure and updated assertions for clarity.
- Updated agent and task ID assertions in tests to ensure they are consistently treated as strings.
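A hedged sketch of the `from_agent` wiring: the real `CrewAgentExecutor` takes many more parameters, and forwarding the agent into the LLM call like this is an assumption.

```python
# Hedged sketch only: the real CrewAgentExecutor takes many more arguments, and
# passing `from_agent` straight into the LLM call is an assumption here.
from dataclasses import dataclass
from typing import Any


@dataclass
class ExecutorSketch:
    llm: Any
    from_agent: Any = None  # the agent this executor is running on behalf of

    def invoke_llm(self, messages: list[dict[str, str]]) -> str:
        # Forward the originating agent so LLM events and usage metrics can be
        # attributed to it.
        return self.llm.call(messages, from_agent=self.from_agent)
```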

* fix test_converter

* fixed tests/agents/test_agent.py

* Refactor LLM context length exception handling and improve provider integration

- Renamed `LLMContextLengthExceededException` to `LLMContextLengthExceededExceptionError` for clarity and consistency.
- Updated LLM class to pass the provider parameter correctly during initialization.
- Enhanced error handling in various LLM provider implementations to raise the new exception type.
- Adjusted tests to reflect the updated exception name and ensure proper error handling in context length scenarios.
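For reference, a minimal sketch of catching the renamed exception; the import path is assumed, only the class name comes from the notes above.

```python
# Sketch of the renamed exception in use; the import path is assumed, only the
# class name comes from the notes above.
from crewai.utilities.exceptions.context_window_exceeding_exception import (  # path assumed
    LLMContextLengthExceededExceptionError,
)


def call_with_context_guard(llm, messages):
    try:
        return llm.call(messages)
    except LLMContextLengthExceededExceptionError:
        # Callers can trim or summarize the conversation history and retry,
        # or surface the error to the user.
        raise
```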

* Enhance LLM context window handling across providers

- Introduced CONTEXT_WINDOW_USAGE_RATIO to adjust context window sizes dynamically for Anthropic, Azure, Gemini, and OpenAI LLMs.
- Added validation for context window sizes in Azure and Gemini providers to ensure they fall within acceptable limits.
- Updated context window size calculations to use the new ratio, improving consistency and adaptability across different models.
- Removed hardcoded context window sizes in favor of ratio-based calculations for better flexibility.
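A hedged sketch of ratio-based context-window sizing; only the constant name comes from the changelog above, while the ratio value, validation bounds, and helper are assumptions.

```python
# Hedged sketch of ratio-based sizing; the ratio value and validation bounds
# below are assumptions, only the constant name comes from the changelog.
CONTEXT_WINDOW_USAGE_RATIO = 0.85  # leave headroom for the model's response

MIN_CONTEXT_WINDOW = 1_024       # assumed lower bound for validation
MAX_CONTEXT_WINDOW = 2_000_000   # assumed upper bound for validation


def usable_context_window(model_context_window: int) -> int:
    """Return the share of a model's context window that should actually be used."""
    if not (MIN_CONTEXT_WINDOW <= model_context_window <= MAX_CONTEXT_WINDOW):
        raise ValueError(
            f"Context window {model_context_window} is outside the supported range "
            f"[{MIN_CONTEXT_WINDOW}, {MAX_CONTEXT_WINDOW}]"
        )
    return int(model_context_window * CONTEXT_WINDOW_USAGE_RATIO)


# e.g. a 200k-token model is treated as ~170k usable tokens at a 0.85 ratio
assert usable_context_window(200_000) == 170_000
```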

* fix test agent again

* fix test agent

* feat: add native LLM providers for Anthropic, Azure, and Gemini

- Introduced new completion implementations for Anthropic, Azure, and Gemini, integrating their respective SDKs.
- Added utility functions for tool validation and extraction to support function calling across LLM providers.
- Enhanced context window management and token usage extraction for each provider.
- Created a common utility module for shared functionality among LLM providers.
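As a rough illustration of the shared tool utilities mentioned above, a hedged sketch of provider-agnostic tool validation and tool-call extraction; the function names and dict shapes are assumptions, not crewai's actual common module.

```python
# Hedged sketch of shared tool utilities for function calling; the function
# names and dict shapes are assumptions, not crewai's actual common module.
import json
from typing import Any


def validate_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Check that a tool definition carries what function calling requires."""
    function = tool.get("function", tool)
    name = function.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError(f"Tool is missing a valid 'name': {tool!r}")
    return {
        "name": name,
        "description": function.get("description", ""),
        "parameters": function.get("parameters", {"type": "object", "properties": {}}),
    }


def extract_tool_call(message: dict[str, Any]) -> tuple[str, dict[str, Any]] | None:
    """Pull (tool_name, arguments) out of a provider response message, if any."""
    calls = message.get("tool_calls") or []
    if not calls:
        return None
    call = calls[0]["function"]
    return call["name"], json.loads(call.get("arguments") or "{}")
```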

* chore: update dependencies and improve context management

- Removed direct dependency on `litellm` from the main dependencies and added it under extras for better modularity.
- Updated the `litellm` dependency specification to allow for greater flexibility in versioning.
- Refactored context length exception handling across various LLM providers to use a consistent error class.
- Enhanced platform-specific dependency markers for NVIDIA packages to ensure compatibility across different systems.
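Since `litellm` is now optional, a small sketch of the lazy-import pattern this implies; the extra name in the install hint is an assumption.

```python
# Sketch of the lazy-import pattern an optional dependency implies; the extra
# name "litellm" in the install hint is an assumption.
def _import_litellm():
    try:
        import litellm
    except ImportError as exc:  # litellm is no longer a hard dependency
        raise ImportError(
            "litellm is required for the LiteLLM fallback. "
            "Install it with: pip install 'crewai[litellm]'"
        ) from exc
    return litellm
```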

* refactor(tests): update LLM instantiation to include is_litellm flag in test cases

- Modified multiple test cases in test_llm.py to set the is_litellm parameter to True when instantiating the LLM class.
- This change ensures that the tests are aligned with the latest LLM configuration requirements and improves consistency across test scenarios.
- Adjusted relevant assertions and comments to reflect the updated LLM behavior.
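A minimal pytest example mirroring this change (the same pattern appears in the diff further below):

```python
# Minimal pytest example mirroring the change (see the diff further below).
from crewai.llm import LLM


def test_unknown_model_does_not_support_function_calling():
    llm = LLM(model="non-existent-model", is_litellm=True)
    assert llm.supports_function_calling() is False
```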

* linter

* linted

* revert constants

* fix(tests): correct type hint in expected model description

- Updated the expected description in the test_generate_model_description_dict_field function to use 'Dict' instead of 'dict' for consistency with type hinting conventions.
- This change ensures that the test accurately reflects the expected output format for model descriptions.

* refactor(llm): enhance LLM instantiation and error handling

- Updated the LLM class to include validation for the model parameter, ensuring it is a non-empty string.
- Improved error handling by logging warnings when the native SDK fails, allowing for a fallback to LiteLLM.
- Adjusted the instantiation of LLM in test cases to consistently include the is_litellm flag, aligning with recent changes in LLM configuration.
- Modified relevant tests to reflect these updates, ensuring better coverage and accuracy in testing scenarios.
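A hedged sketch of the validation and native-SDK-to-LiteLLM fallback described above; the actual logic lives inside the `LLM` class and differs in detail.

```python
# Hedged sketch of the validation and fallback behaviour; the real logic lives
# inside crewai's LLM class and differs in detail.
import logging
from typing import Any

logger = logging.getLogger(__name__)


def validate_model(model: Any) -> str:
    """Reject anything that is not a non-empty string."""
    if not isinstance(model, str) or not model.strip():
        raise ValueError("`model` must be a non-empty string")
    return model


def complete_with_fallback(native_client: Any, litellm_client: Any, messages) -> str:
    """Try the provider's native SDK first; on failure, warn and fall back to LiteLLM."""
    try:
        return native_client.call(messages)
    except Exception as exc:  # any native-SDK failure triggers the fallback
        logger.warning("Native SDK call failed (%s); falling back to LiteLLM", exc)
        return litellm_client.call(messages)
```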

* fixed test

* refactor(llm): enhance token usage tracking and add copy methods

- Updated the LLM class to track token usage and log callbacks in streaming mode, improving monitoring capabilities.
- Introduced shallow and deep copy methods for the LLM instance, allowing for better management of LLM configurations and parameters.
- Adjusted test cases to instantiate LLM with the is_litellm flag, ensuring alignment with recent changes in LLM configuration.
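An illustrative sketch of the copy support described above (the streaming token-usage bookkeeping is only hinted at by the counter); the class and attribute names are stand-ins, not crewai's actual `LLM`.

```python
# Illustrative sketch of copy support; attribute names and the class itself are
# stand-ins, not crewai's actual LLM.
import copy


class LLMSketch:
    def __init__(self, model: str, **params):
        self.model = model
        self.params = params
        # counters that streaming mode would update chunk by chunk
        self.token_usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}

    def __copy__(self):
        # Shallow copy: the params dict and usage counters are shared with the original.
        cls = self.__class__
        new = cls.__new__(cls)
        new.__dict__.update(self.__dict__)
        return new

    def __deepcopy__(self, memo):
        # Deep copy: rebuild a fresh, fully independent instance from deep-copied
        # constructor parameters (usage counters start from zero).
        return self.__class__(self.model, **copy.deepcopy(self.params, memo))
```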

* refactor(tests): reorganize imports and enhance error messages in test cases

- Cleaned up import statements in test_crew.py for better organization and readability.
- Wrapped expected error messages in `re.escape` so that `pytest.raises(match=...)` assertions match them literally and reliably.
- Adjusted comments for clarity and consistency across test scenarios.
- Ensured that all necessary modules are imported correctly to avoid potential runtime issues.
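A small self-contained example of the `re.escape` pattern: `pytest.raises(match=...)` treats its argument as a regular expression, so literal messages containing brackets or parentheses need escaping. The error text here is made up.

```python
# Self-contained example of the re.escape pattern: pytest.raises(match=...) treats
# its argument as a regex, so literal messages with brackets or parentheses must be
# escaped. The error text here is made up.
import re

import pytest


def test_error_message_is_matched_literally():
    expected = "Invalid config: agents[0] (missing role)"

    with pytest.raises(ValueError, match=re.escape(expected)):
        raise ValueError(expected)
```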
Authored by Lorenze Jay on 2025-10-03 14:32:35 -07:00, committed by GitHub
parent 428810bd6f
commit 126b91eab3
77 changed files with 25026 additions and 493 deletions


@@ -1,12 +1,9 @@
 import json
 import os
-# Tests for enums
 from enum import Enum
-from typing import Dict, List, Optional
-import json
-import os
 from unittest.mock import MagicMock, Mock, patch
-import pytest
 from crewai.llm import LLM
 from crewai.utilities.converter import (
     Converter,
@@ -21,6 +18,7 @@ from crewai.utilities.converter import (
 )
 from crewai.utilities.pydantic_schema_parser import PydanticSchemaParser
 from pydantic import BaseModel
+import pytest
@pytest.fixture(scope="module")
@@ -254,7 +252,7 @@ def test_supports_function_calling_true():
 def test_supports_function_calling_false():
-    llm = LLM(model="non-existent-model")
+    llm = LLM(model="non-existent-model", is_litellm=True)
     assert llm.supports_function_calling() is False
@@ -311,17 +309,17 @@ def test_generate_model_description_nested_model():
 def test_generate_model_description_optional_field():
     class ModelWithOptionalField(BaseModel):
-        name: Optional[str]
-        age: int
+        name: str
+        age: int | None

     description = generate_model_description(ModelWithOptionalField)
-    expected_description = '{\n "name": Optional[str],\n "age": int\n}'
+    expected_description = '{\n "name": str,\n "age": int | None\n}'
     assert description == expected_description


 def test_generate_model_description_list_field():
     class ModelWithListField(BaseModel):
-        items: List[int]
+        items: list[int]

     description = generate_model_description(ModelWithListField)
     expected_description = '{\n "items": List[int]\n}'
@@ -330,7 +328,7 @@ def test_generate_model_description_list_field():
 def test_generate_model_description_dict_field():
     class ModelWithDictField(BaseModel):
-        attributes: Dict[str, int]
+        attributes: dict[str, int]

     description = generate_model_description(ModelWithDictField)
     expected_description = '{\n "attributes": Dict[str, int]\n}'
@@ -470,7 +468,7 @@ def test_converter_retry_logic():
 def test_converter_with_optional_fields():
     class OptionalModel(BaseModel):
         name: str
-        age: Optional[int]
+        age: int | None

     llm = Mock(spec=LLM)
     llm.supports_function_calling.return_value = False
@@ -496,7 +494,7 @@ def test_converter_with_optional_fields():
 # Tests for list fields
 def test_converter_with_list_field():
     class ListModel(BaseModel):
-        items: List[int]
+        items: list[int]

     llm = Mock(spec=LLM)
     llm.supports_function_calling.return_value = False