Mirror of https://github.com/crewAIInc/crewAI.git, synced 2026-05-03 08:12:39 +00:00
Lorenze/native inference sdks (#3619)
* ruff linted
* using native sdks with litellm fallback
* drop exa
* drop print on completion
* Refactor LLM and utility functions for type consistency
  - Updated `max_tokens` parameter in `LLM` class to accept `float` in addition to `int`.
  - Modified `create_llm` function to ensure consistent type hints and return types, now returning `LLM | BaseLLM | None`.
  - Adjusted type hints for various parameters in `create_llm` and `_llm_via_environment_or_fallback` functions for improved clarity and type safety.
  - Enhanced test cases to reflect changes in type handling and ensure proper instantiation of LLM instances.
* fix agent_tests
* fix litellm tests and usagemetrics fix
* drop print
* Refactor LLM event handling and improve test coverage
  - Removed commented-out event emission for LLM call failures in `llm.py`.
  - Added `from_agent` parameter to `CrewAgentExecutor` for better context in LLM responses.
  - Enhanced test for LLM call failure to simulate OpenAI API failure and updated assertions for clarity.
  - Updated agent and task ID assertions in tests to ensure they are consistently treated as strings.
* fix test_converter
* fixed tests/agents/test_agent.py
* Refactor LLM context length exception handling and improve provider integration
  - Renamed `LLMContextLengthExceededException` to `LLMContextLengthExceededExceptionError` for clarity and consistency.
  - Updated LLM class to pass the provider parameter correctly during initialization.
  - Enhanced error handling in various LLM provider implementations to raise the new exception type.
  - Adjusted tests to reflect the updated exception name and ensure proper error handling in context length scenarios.
* Enhance LLM context window handling across providers
  - Introduced CONTEXT_WINDOW_USAGE_RATIO to adjust context window sizes dynamically for Anthropic, Azure, Gemini, and OpenAI LLMs.
  - Added validation for context window sizes in Azure and Gemini providers to ensure they fall within acceptable limits.
  - Updated context window size calculations to use the new ratio, improving consistency and adaptability across different models.
  - Removed hardcoded context window sizes in favor of ratio-based calculations for better flexibility.
* fix test agent again
* fix test agent
* feat: add native LLM providers for Anthropic, Azure, and Gemini
  - Introduced new completion implementations for Anthropic, Azure, and Gemini, integrating their respective SDKs.
  - Added utility functions for tool validation and extraction to support function calling across LLM providers.
  - Enhanced context window management and token usage extraction for each provider.
  - Created a common utility module for shared functionality among LLM providers.
* chore: update dependencies and improve context management
  - Removed direct dependency on `litellm` from the main dependencies and added it under extras for better modularity.
  - Updated the `litellm` dependency specification to allow for greater flexibility in versioning.
  - Refactored context length exception handling across various LLM providers to use a consistent error class.
  - Enhanced platform-specific dependency markers for NVIDIA packages to ensure compatibility across different systems.
* refactor(tests): update LLM instantiation to include is_litellm flag in test cases
  - Modified multiple test cases in test_llm.py to set the is_litellm parameter to True when instantiating the LLM class.
  - This change ensures that the tests are aligned with the latest LLM configuration requirements and improves consistency across test scenarios.
  - Adjusted relevant assertions and comments to reflect the updated LLM behavior.
* linter
* linted
* revert constants
* fix(tests): correct type hint in expected model description
  - Updated the expected description in the test_generate_model_description_dict_field function to use 'Dict' instead of 'dict' for consistency with type hinting conventions.
  - This change ensures that the test accurately reflects the expected output format for model descriptions.
* refactor(llm): enhance LLM instantiation and error handling
  - Updated the LLM class to include validation for the model parameter, ensuring it is a non-empty string.
  - Improved error handling by logging warnings when the native SDK fails, allowing for a fallback to LiteLLM.
  - Adjusted the instantiation of LLM in test cases to consistently include the is_litellm flag, aligning with recent changes in LLM configuration.
  - Modified relevant tests to reflect these updates, ensuring better coverage and accuracy in testing scenarios.
* fixed test
* refactor(llm): enhance token usage tracking and add copy methods
  - Updated the LLM class to track token usage and log callbacks in streaming mode, improving monitoring capabilities.
  - Introduced shallow and deep copy methods for the LLM instance, allowing for better management of LLM configurations and parameters.
  - Adjusted test cases to instantiate LLM with the is_litellm flag, ensuring alignment with recent changes in LLM configuration.
* refactor(tests): reorganize imports and enhance error messages in test cases
  - Cleaned up import statements in test_crew.py for better organization and readability.
  - Enhanced error messages in test cases to use `re.escape` for improved regex matching, ensuring more robust error handling.
  - Adjusted comments for clarity and consistency across test scenarios.
  - Ensured that all necessary modules are imported correctly to avoid potential runtime issues.
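The headline change is the completion path: the LLM class now tries a provider's native SDK first and falls back to LiteLLM, logging a warning when the native path fails, with is_litellm forcing the LiteLLM path outright (as the updated tests do). A minimal sketch of that pattern; the names FallbackLLM, _call_native_sdk, and _call_litellm are illustrative stand-ins, not crewAI's actual internals:

import logging

logger = logging.getLogger(__name__)


class FallbackLLM:
    """Illustrative only: native-SDK-first completion with LiteLLM fallback."""

    def __init__(self, model: str, is_litellm: bool = False):
        # Mirrors the commit's model validation: a non-empty string is required.
        if not isinstance(model, str) or not model.strip():
            raise ValueError("model must be a non-empty string")
        self.model = model
        self.is_litellm = is_litellm  # force the LiteLLM path, as the tests do

    def call(self, messages: list[dict]) -> str:
        if not self.is_litellm:
            try:
                return self._call_native_sdk(messages)
            except Exception as exc:  # noqa: BLE001
                logger.warning("Native SDK failed (%s); falling back to LiteLLM", exc)
        return self._call_litellm(messages)

    def _call_native_sdk(self, messages: list[dict]) -> str:
        # Stand-in for a provider SDK call (openai, anthropic, google-genai, ...).
        raise RuntimeError("native SDK unavailable in this sketch")

    def _call_litellm(self, messages: list[dict]) -> str:
        # Stand-in for litellm.completion(model=self.model, messages=messages).
        return "litellm response"


print(FallbackLLM("gpt-4o").call([{"role": "user", "content": "hi"}]))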
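The context-window bullets replace hardcoded per-model limits with a ratio applied to each model's advertised window, plus bounds validation for Azure and Gemini. A sketch of that calculation; the 0.75 ratio, the window table, and the validation bounds are assumptions for illustration, not the constants the PR ships:

CONTEXT_WINDOW_USAGE_RATIO = 0.75  # assumed value; leaves headroom for the response

# Advertised context windows (illustrative subset, not the PR's table).
MODEL_CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "claude-3-5-sonnet-20240620": 200_000,
    "gemini-1.5-pro": 2_097_152,
}

MIN_WINDOW, MAX_WINDOW = 1_024, 2_097_152  # assumed validation bounds


def usable_context_window(model: str, default: int = 8_192) -> int:
    """Return the advertised window scaled by the usage ratio."""
    window = MODEL_CONTEXT_WINDOWS.get(model, default)
    if not MIN_WINDOW <= window <= MAX_WINDOW:
        raise ValueError(f"context window {window} outside [{MIN_WINDOW}, {MAX_WINDOW}]")
    return int(window * CONTEXT_WINDOW_USAGE_RATIO)


assert usable_context_window("gpt-4o") == 96_000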
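The copy-method bullet adds shallow and deep copy support so an agent can be duplicated without the copies unexpectedly sharing mutable LLM state such as callbacks or token counters. One way to sketch that contract with Python's copy protocol; ConfiguredLLM is a hypothetical stand-in, not the actual crewAI implementation:

import copy


class ConfiguredLLM:
    """Illustrative only: an LLM wrapper whose config survives copying."""

    def __init__(self, model: str, callbacks: list | None = None):
        self.model = model
        self.callbacks = callbacks or []
        self._token_usage = {"prompt_tokens": 0, "completion_tokens": 0}

    def __copy__(self) -> "ConfiguredLLM":
        # Shallow copy: the callbacks list and usage counters stay shared.
        new = type(self)(self.model, self.callbacks)
        new._token_usage = self._token_usage
        return new

    def __deepcopy__(self, memo: dict) -> "ConfiguredLLM":
        # Deep copy: fully independent callbacks and usage counters.
        new = type(self)(self.model, copy.deepcopy(self.callbacks, memo))
        new._token_usage = copy.deepcopy(self._token_usage, memo)
        return new


llm = ConfiguredLLM("gpt-4o", callbacks=[print])
assert copy.copy(llm).callbacks is llm.callbacks
assert copy.deepcopy(llm).callbacks is not llm.callbacks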
@@ -1,14 +1,9 @@
 """Test Agent creation and execution basic functionality."""

 # ruff: noqa: S106
 import os
 from unittest import mock
 from unittest.mock import MagicMock, patch
-
-import pytest
-
-from crewai import Agent, Crew, Task
-from crewai.agents.cache import CacheHandler
 from crewai.agents.crew_agent_executor import AgentFinish, CrewAgentExecutor
 from crewai.events.event_bus import crewai_event_bus
 from crewai.events.types.tool_usage_events import ToolUsageFinishedEvent
@@ -17,12 +12,17 @@ from crewai.knowledge.knowledge_config import KnowledgeConfig
 from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
 from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
 from crewai.llm import LLM
+from crewai.llms.base_llm import BaseLLM
 from crewai.process import Process
-from crewai.tools import tool
 from crewai.tools.tool_calling import InstructorToolCalling
 from crewai.tools.tool_usage import ToolUsage
-from crewai.utilities import RPMController
 from crewai.utilities.errors import AgentRepositoryError
+import pytest
+
+from crewai import Agent, Crew, Task
+from crewai.agents.cache import CacheHandler
+from crewai.tools import tool
+from crewai.utilities import RPMController


 def test_agent_llm_creation_with_env_vars():
@@ -40,7 +40,7 @@ def test_agent_llm_creation_with_env_vars():
     agent = Agent(role="test role", goal="test goal", backstory="test backstory")

     # Check if LLM is created correctly
-    assert isinstance(agent.llm, LLM)
+    assert isinstance(agent.llm, BaseLLM)
     assert agent.llm.model == "gpt-4-turbo"
     assert agent.llm.api_key == "test_api_key"
     assert agent.llm.base_url == "https://test-api-base.com"
@@ -50,11 +50,18 @@ def test_agent_llm_creation_with_env_vars():
     del os.environ["OPENAI_API_BASE"]
     del os.environ["OPENAI_MODEL_NAME"]

+    if original_api_key:
+        os.environ["OPENAI_API_KEY"] = original_api_key
+    if original_api_base:
+        os.environ["OPENAI_API_BASE"] = original_api_base
+    if original_model_name:
+        os.environ["OPENAI_MODEL_NAME"] = original_model_name
+
     # Create an agent without specifying LLM
     agent = Agent(role="test role", goal="test goal", backstory="test backstory")

     # Check if LLM is created correctly
-    assert isinstance(agent.llm, LLM)
+    assert isinstance(agent.llm, BaseLLM)
     assert agent.llm.model != "gpt-4-turbo"
     assert agent.llm.api_key != "test_api_key"
     assert agent.llm.base_url != "https://test-api-base.com"
@@ -456,18 +463,30 @@ def test_agent_custom_max_iterations():
         allow_delegation=False,
     )

-    with patch.object(
-        LLM, "call", wraps=LLM("gpt-4o", stop=["\nObservation:"]).call
-    ) as private_mock:
-        task = Task(
-            description="The final answer is 42. But don't give it yet, instead keep using the `get_final_answer` tool.",
-            expected_output="The final answer",
-        )
-        agent.execute_task(
-            task=task,
-            tools=[get_final_answer],
-        )
-        assert private_mock.call_count == 3
+    original_call = agent.llm.call
+    call_count = 0
+
+    def counting_call(*args, **kwargs):
+        nonlocal call_count
+        call_count += 1
+        return original_call(*args, **kwargs)
+
+    agent.llm.call = counting_call
+
+    task = Task(
+        description="The final answer is 42. But don't give it yet, instead keep using the `get_final_answer` tool.",
+        expected_output="The final answer",
+    )
+    result = agent.execute_task(
+        task=task,
+        tools=[get_final_answer],
+    )
+
+    assert result is not None
+    assert isinstance(result, str)
+    assert len(result) > 0
+    assert call_count > 0
+    assert call_count == 3


 @pytest.mark.vcr(filter_headers=["authorization"])
@@ -888,9 +907,8 @@ def test_agent_function_calling_llm():
     crew = Crew(agents=[agent1], tasks=tasks)
-    from unittest.mock import patch

-    import instructor

     from crewai.tools.tool_usage import ToolUsage
+    import instructor

     with (
         patch.object(
@@ -1413,7 +1431,7 @@ def test_agent_with_llm():
         llm=LLM(model="gpt-3.5-turbo", temperature=0.7),
     )

-    assert isinstance(agent.llm, LLM)
+    assert isinstance(agent.llm, BaseLLM)
     assert agent.llm.model == "gpt-3.5-turbo"
     assert agent.llm.temperature == 0.7

@@ -1427,7 +1445,7 @@ def test_agent_with_custom_stop_words():
         llm=LLM(model="gpt-3.5-turbo", stop=stop_words),
     )

-    assert isinstance(agent.llm, LLM)
+    assert isinstance(agent.llm, BaseLLM)
     assert set(agent.llm.stop) == set([*stop_words, "\nObservation:"])
     assert all(word in agent.llm.stop for word in stop_words)
     assert "\nObservation:" in agent.llm.stop
@@ -1441,10 +1459,12 @@ def test_agent_with_callbacks():
         role="test role",
         goal="test goal",
         backstory="test backstory",
-        llm=LLM(model="gpt-3.5-turbo", callbacks=[dummy_callback]),
+        llm=LLM(model="gpt-3.5-turbo", callbacks=[dummy_callback], is_litellm=True),
     )

-    assert isinstance(agent.llm, LLM)
+    assert isinstance(agent.llm, BaseLLM)
+    # All LLM implementations now support callbacks consistently
+    assert hasattr(agent.llm, "callbacks")
     assert len(agent.llm.callbacks) == 1
     assert agent.llm.callbacks[0] == dummy_callback

@@ -1463,7 +1483,7 @@ def test_agent_with_additional_kwargs():
         ),
     )

-    assert isinstance(agent.llm, LLM)
+    assert isinstance(agent.llm, BaseLLM)
     assert agent.llm.model == "gpt-3.5-turbo"
     assert agent.llm.temperature == 0.8
     assert agent.llm.top_p == 0.9
@@ -1580,40 +1600,40 @@ def test_agent_with_all_llm_attributes():
             timeout=10,
             temperature=0.7,
             top_p=0.9,
-            n=1,
+            # n=1,
             stop=["STOP", "END"],
             max_tokens=100,
             presence_penalty=0.1,
             frequency_penalty=0.1,
-            logit_bias={50256: -100},  # Example: bias against the EOT token
+            # logit_bias={50256: -100},  # Example: bias against the EOT token
             response_format={"type": "json_object"},
             seed=42,
             logprobs=True,
             top_logprobs=5,
             base_url="https://api.openai.com/v1",
-            api_version="2023-05-15",
+            # api_version="2023-05-15",
             api_key="sk-your-api-key-here",
         ),
     )

-    assert isinstance(agent.llm, LLM)
+    assert isinstance(agent.llm, BaseLLM)
     assert agent.llm.model == "gpt-3.5-turbo"
     assert agent.llm.timeout == 10
     assert agent.llm.temperature == 0.7
     assert agent.llm.top_p == 0.9
-    assert agent.llm.n == 1
+    # assert agent.llm.n == 1
     assert set(agent.llm.stop) == set(["STOP", "END", "\nObservation:"])
     assert all(word in agent.llm.stop for word in ["STOP", "END", "\nObservation:"])
     assert agent.llm.max_tokens == 100
     assert agent.llm.presence_penalty == 0.1
     assert agent.llm.frequency_penalty == 0.1
-    assert agent.llm.logit_bias == {50256: -100}
+    # assert agent.llm.logit_bias == {50256: -100}
     assert agent.llm.response_format == {"type": "json_object"}
     assert agent.llm.seed == 42
     assert agent.llm.logprobs
     assert agent.llm.top_logprobs == 5
     assert agent.llm.base_url == "https://api.openai.com/v1"
-    assert agent.llm.api_version == "2023-05-15"
+    # assert agent.llm.api_version == "2023-05-15"
     assert agent.llm.api_key == "sk-your-api-key-here"

@@ -1982,7 +2002,7 @@ def test_agent_with_knowledge_sources_works_with_copy():
     assert len(agent_copy.knowledge_sources) == 1
     assert isinstance(agent_copy.knowledge_sources[0], StringKnowledgeSource)
     assert agent_copy.knowledge_sources[0].content == content
-    assert isinstance(agent_copy.llm, LLM)
+    assert isinstance(agent_copy.llm, BaseLLM)


 @pytest.mark.vcr(filter_headers=["authorization"])
@@ -2130,7 +2150,7 @@ def test_litellm_auth_error_handling():
         role="test role",
         goal="test goal",
         backstory="test backstory",
-        llm=LLM(model="gpt-4"),
+        llm=LLM(model="gpt-4", is_litellm=True),
         max_retry_limit=0,  # Disable retries for authentication errors
     )

@@ -2157,16 +2177,15 @@

 def test_crew_agent_executor_litellm_auth_error():
     """Test that CrewAgentExecutor handles LiteLLM authentication errors by raising them."""
-    from litellm.exceptions import AuthenticationError
-
     from crewai.agents.tools_handler import ToolsHandler
+    from litellm.exceptions import AuthenticationError

     # Create an agent and executor
     agent = Agent(
         role="test role",
         goal="test goal",
         backstory="test backstory",
-        llm=LLM(model="gpt-4", api_key="invalid_api_key"),
+        llm=LLM(model="gpt-4", api_key="invalid_api_key", is_litellm=True),
     )
     task = Task(
         description="Test task",
@@ -2224,7 +2243,7 @@ def test_litellm_anthropic_error_handling():
         role="test role",
         goal="test goal",
         backstory="test backstory",
-        llm=LLM(model="claude-3.5-sonnet-20240620"),
+        llm=LLM(model="claude-3.5-sonnet-20240620", is_litellm=True),
         max_retry_limit=0,
     )

@@ -3,16 +3,17 @@ from collections import defaultdict
 from typing import cast
 from unittest.mock import Mock, patch

+import pytest
+from crewai import LLM, Agent
 from crewai.events.event_bus import crewai_event_bus
 from crewai.events.types.agent_events import LiteAgentExecutionStartedEvent
 from crewai.events.types.tool_usage_events import ToolUsageStartedEvent
+from crewai.flow import Flow, start
 from crewai.lite_agent import LiteAgent, LiteAgentOutput
+from crewai.llms.base_llm import BaseLLM
+from crewai.tools import BaseTool
 from pydantic import BaseModel, Field
-import pytest

-from crewai import LLM, Agent
-from crewai.flow import Flow, start
-from crewai.tools import BaseTool


 # A simple test tool
@@ -197,10 +198,6 @@ def test_lite_agent_structured_output():
         response_format=SimpleOutput,
     )

-    print(f"\n=== Agent Result Type: {type(result)}")
-    print(f"=== Agent Result: {result}")
-    print(f"=== Pydantic: {result.pydantic}")
-
     assert result.pydantic is not None, "Should return a Pydantic model"

     output = cast(SimpleOutput, result.pydantic)
@@ -295,6 +292,17 @@ def test_sets_parent_flow_when_inside_flow():
     mock_llm.call.return_value = "Test response"
     mock_llm.stop = []
+
+    from crewai.types.usage_metrics import UsageMetrics
+
+    mock_usage_metrics = UsageMetrics(
+        total_tokens=100,
+        prompt_tokens=50,
+        completion_tokens=50,
+        cached_prompt_tokens=0,
+        successful_requests=1,
+    )
+    mock_llm.get_token_usage_summary.return_value = mock_usage_metrics

     class MyFlow(Flow):
         @start()
         def start(self):
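For reference, the counting-wrapper pattern that replaces patch.object(..., wraps=...) in test_agent_custom_max_iterations above works standalone; a minimal sketch against a hypothetical stand-in object (Echo is not part of crewAI):

class Echo:
    def call(self, text: str) -> str:
        return text


llm = Echo()
original_call = llm.call
call_count = 0


def counting_call(*args, **kwargs):
    # The test nests this function and uses `nonlocal`; at module level the
    # counter needs `global` (or a mutable container) instead.
    global call_count
    call_count += 1
    return original_call(*args, **kwargs)


llm.call = counting_call
assert llm.call("hi") == "hi"
assert call_count == 1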