Mirror of https://github.com/crewAIInc/crewAI.git (synced 2026-05-03 08:12:39 +00:00)
Lorenze/native inference sdks (#3619)
* ruff linted
* using native sdks with litellm fallback (sketch below)
* drop exa
* drop print on completion
* Refactor LLM and utility functions for type consistency
  - Updated `max_tokens` parameter in `LLM` class to accept `float` in addition to `int`.
  - Modified `create_llm` function to ensure consistent type hints and return types, now returning `LLM | BaseLLM | None`.
  - Adjusted type hints for various parameters in `create_llm` and `_llm_via_environment_or_fallback` functions for improved clarity and type safety.
  - Enhanced test cases to reflect changes in type handling and ensure proper instantiation of LLM instances.
* fix agent_tests
* fix litellm tests and usagemetrics fix
* drop print
* Refactor LLM event handling and improve test coverage
  - Removed commented-out event emission for LLM call failures in `llm.py`.
  - Added `from_agent` parameter to `CrewAgentExecutor` for better context in LLM responses.
  - Enhanced test for LLM call failure to simulate OpenAI API failure and updated assertions for clarity.
  - Updated agent and task ID assertions in tests to ensure they are consistently treated as strings.
* fix test_converter
* fixed tests/agents/test_agent.py
* Refactor LLM context length exception handling and improve provider integration
  - Renamed `LLMContextLengthExceededException` to `LLMContextLengthExceededExceptionError` for clarity and consistency.
  - Updated LLM class to pass the provider parameter correctly during initialization.
  - Enhanced error handling in various LLM provider implementations to raise the new exception type.
  - Adjusted tests to reflect the updated exception name and ensure proper error handling in context length scenarios.
* Enhance LLM context window handling across providers (sketch below)
  - Introduced CONTEXT_WINDOW_USAGE_RATIO to adjust context window sizes dynamically for Anthropic, Azure, Gemini, and OpenAI LLMs.
  - Added validation for context window sizes in Azure and Gemini providers to ensure they fall within acceptable limits.
  - Updated context window size calculations to use the new ratio, improving consistency and adaptability across different models.
  - Removed hardcoded context window sizes in favor of ratio-based calculations for better flexibility.
* fix test agent again
* fix test agent
* feat: add native LLM providers for Anthropic, Azure, and Gemini
  - Introduced new completion implementations for Anthropic, Azure, and Gemini, integrating their respective SDKs.
  - Added utility functions for tool validation and extraction to support function calling across LLM providers (sketch below).
  - Enhanced context window management and token usage extraction for each provider.
  - Created a common utility module for shared functionality among LLM providers.
* chore: update dependencies and improve context management
  - Removed direct dependency on `litellm` from the main dependencies and added it under extras for better modularity.
  - Updated the `litellm` dependency specification to allow for greater flexibility in versioning.
  - Refactored context length exception handling across various LLM providers to use a consistent error class.
  - Enhanced platform-specific dependency markers for NVIDIA packages to ensure compatibility across different systems.
* refactor(tests): update LLM instantiation to include is_litellm flag in test cases
  - Modified multiple test cases in test_llm.py to set the is_litellm parameter to True when instantiating the LLM class.
  - This change ensures that the tests are aligned with the latest LLM configuration requirements and improves consistency across test scenarios.
  - Adjusted relevant assertions and comments to reflect the updated LLM behavior.
* linter
* linted
* revert constants
* fix(tests): correct type hint in expected model description
  - Updated the expected description in the test_generate_model_description_dict_field function to use 'Dict' instead of 'dict' for consistency with type hinting conventions.
  - This change ensures that the test accurately reflects the expected output format for model descriptions.
* refactor(llm): enhance LLM instantiation and error handling
  - Updated the LLM class to include validation for the model parameter, ensuring it is a non-empty string.
  - Improved error handling by logging warnings when the native SDK fails, allowing for a fallback to LiteLLM.
  - Adjusted the instantiation of LLM in test cases to consistently include the is_litellm flag, aligning with recent changes in LLM configuration.
  - Modified relevant tests to reflect these updates, ensuring better coverage and accuracy in testing scenarios.
* fixed test
* refactor(llm): enhance token usage tracking and add copy methods (sketch below)
  - Updated the LLM class to track token usage and log callbacks in streaming mode, improving monitoring capabilities.
  - Introduced shallow and deep copy methods for the LLM instance, allowing for better management of LLM configurations and parameters.
  - Adjusted test cases to instantiate LLM with the is_litellm flag, ensuring alignment with recent changes in LLM configuration.
* refactor(tests): reorganize imports and enhance error messages in test cases
  - Cleaned up import statements in test_crew.py for better organization and readability.
  - Enhanced error messages in test cases to use `re.escape` for improved regex matching, ensuring more robust error handling (sketch below).
  - Adjusted comments for clarity and consistency across test scenarios.
  - Ensured that all necessary modules are imported correctly to avoid potential runtime issues.
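
The "native sdks with litellm fallback" flow described above can be pictured roughly as follows. This is a minimal sketch, not the PR's actual code: `_call_native_sdk` is a hypothetical stand-in for the provider-specific completion implementations, and only `litellm.completion` is a real API.

    import logging

    import litellm  # assumed installed; the PR moves it to an optional extra

    logger = logging.getLogger(__name__)

    def complete(llm, messages: list[dict]) -> str:
        """Try the provider's native SDK first, then fall back to LiteLLM."""
        if not llm.is_litellm:
            try:
                # Hypothetical hook for the Anthropic/Azure/Gemini/OpenAI paths.
                return llm._call_native_sdk(messages)
            except Exception as exc:
                # The PR logs a warning when the native SDK fails, then falls back.
                logger.warning("Native SDK call failed (%s); falling back to LiteLLM", exc)
        response = litellm.completion(model=llm.model, messages=messages)
        return response.choices[0].message.content

Passing is_litellm=True when constructing an LLM, as many tests in this commit now do, skips the native path entirely.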
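
The CONTEXT_WINDOW_USAGE_RATIO change can be illustrated with a small sketch; the ratio value and per-model window sizes below are assumptions for illustration, not the constants used in the PR.

    CONTEXT_WINDOW_USAGE_RATIO = 0.75  # assumed value; leaves headroom under the hard limit

    # Assumed advertised context windows, keyed by model name.
    MODEL_CONTEXT_WINDOWS: dict[str, int] = {
        "gpt-4o": 128_000,
        "claude-3-5-sonnet": 200_000,
    }

    def usable_context_window(model: str, default: int = 8_192) -> int:
        # Ratio-based sizing replaces hardcoded per-model limits.
        return int(MODEL_CONTEXT_WINDOWS.get(model, default) * CONTEXT_WINDOW_USAGE_RATIO)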
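
The tool validation and extraction utilities mentioned for function calling would translate a tool's name, description, and argument schema into a provider-neutral format. This sketch assumes an OpenAI-style function schema and invents the helper's name:

    from typing import Any

    def tool_to_function_schema(
        name: str, description: str, properties: dict[str, Any], required: list[str]
    ) -> dict[str, Any]:
        """Hypothetical helper: render a tool as an OpenAI-style function schema."""
        return {
            "type": "function",
            "function": {
                "name": name,
                "description": description,
                "parameters": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            },
        }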
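
The shallow and deep copy methods added to the LLM instance plausibly follow Python's standard copy protocol; the class and attribute names below are illustrative stand-ins, not the PR's actual implementation.

    import copy

    class LLMConfigHolder:
        """Stand-in for the LLM class; only the copy protocol is shown."""

        def __init__(self, model: str, additional_params: dict | None = None):
            self.model = model
            self.additional_params = additional_params or {}

        def __copy__(self) -> "LLMConfigHolder":
            # Shallow copy: the params dict is shared with the original.
            clone = type(self)(self.model)
            clone.additional_params = self.additional_params
            return clone

        def __deepcopy__(self, memo: dict) -> "LLMConfigHolder":
            # Deep copy: nested params are copied recursively as well.
            return type(self)(self.model, copy.deepcopy(self.additional_params, memo))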
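
The re.escape change is grounded directly in the test diff below: pytest.raises(match=...) interprets its argument as a regular expression, so literal error messages containing regex metacharacters (periods, quotes, brackets) must be escaped to match reliably. A minimal self-contained example:

    import re

    import pytest

    def test_literal_message_matching() -> None:
        message = "Task 'Task 2' is asynchronous and cannot include other tasks."
        # Without re.escape, the trailing '.' would match any character; with
        # it, the pattern matches the message literally.
        with pytest.raises(ValueError, match=re.escape(message)):
            raise ValueError(message)

The tests/test_crew.py diff that follows applies exactly this pattern.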
@@ -1,16 +1,14 @@
 """Test Agent creation and execution basic functionality."""
 
-import json
 from collections import defaultdict
 from concurrent.futures import Future
 from hashlib import md5
+import json
+import re
 from unittest import mock
 from unittest.mock import ANY, MagicMock, patch
 
-import pydantic_core
-import pytest
 from crewai.agent import Agent
-from crewai.agents import CacheHandler
 from crewai.crew import Crew
 from crewai.crews.crew_output import CrewOutput
 from crewai.events.event_bus import crewai_event_bus
@@ -30,7 +28,6 @@ from crewai.events.types.memory_events import (
     MemorySaveFailedEvent,
     MemorySaveStartedEvent,
 )
-from crewai.flow import Flow, start
 from crewai.knowledge.knowledge import Knowledge
 from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
 from crewai.llm import LLM
@@ -46,6 +43,11 @@ from crewai.tasks.task_output import TaskOutput
 from crewai.types.usage_metrics import UsageMetrics
 from crewai.utilities.rpm_controller import RPMController
 from crewai.utilities.task_output_storage_handler import TaskOutputStorageHandler
+import pydantic_core
+import pytest
+
+from crewai.agents import CacheHandler
+from crewai.flow import Flow, start
 
 
 @pytest.fixture
@@ -199,7 +201,9 @@ def test_async_task_cannot_include_sequential_async_tasks_in_context(
     # This should raise an error because task2 is async and has task1 in its context without a sync task in between
     with pytest.raises(
         ValueError,
-        match="Task 'Task 2' is asynchronous and cannot include other sequential asynchronous tasks in its context.",
+        match=re.escape(
+            "Task 'Task 2' is asynchronous and cannot include other sequential asynchronous tasks in its context."
+        ),
     ):
         Crew(tasks=[task1, task2, task3, task4, task5], agents=[researcher, writer])
 
@@ -237,7 +241,9 @@ def test_context_no_future_tasks(researcher, writer):
     # This should raise an error because task1 has a context dependency on a future task (task4)
     with pytest.raises(
         ValueError,
-        match="Task 'Task 1' has a context dependency on a future task 'Task 4', which is not allowed.",
+        match=re.escape(
+            "Task 'Task 1' has a context dependency on a future task 'Task 4', which is not allowed."
+        ),
     ):
         Crew(tasks=[task1, task2, task3, task4], agents=[researcher, writer])
 
@@ -568,9 +574,10 @@ def test_crew_with_delegating_agents(ceo, writer):
 
 @pytest.mark.vcr(filter_headers=["authorization"])
 def test_crew_with_delegating_agents_should_not_override_task_tools(ceo, writer):
-    from crewai.tools import BaseTool
     from pydantic import BaseModel, Field
+
+    from crewai.tools import BaseTool
 
     class TestToolInput(BaseModel):
         """Input schema for TestTool."""
 
@@ -627,9 +634,10 @@ def test_crew_with_delegating_agents_should_not_override_task_tools(ceo, writer)
 
 @pytest.mark.vcr(filter_headers=["authorization"])
 def test_crew_with_delegating_agents_should_not_override_agent_tools(ceo, writer):
-    from crewai.tools import BaseTool
     from pydantic import BaseModel, Field
+
+    from crewai.tools import BaseTool
 
     class TestToolInput(BaseModel):
         """Input schema for TestTool."""
 
@@ -688,9 +696,10 @@ def test_crew_with_delegating_agents_should_not_override_agent_tools(ceo, writer
 
 @pytest.mark.vcr(filter_headers=["authorization"])
 def test_task_tools_override_agent_tools(researcher):
-    from crewai.tools import BaseTool
     from pydantic import BaseModel, Field
+
+    from crewai.tools import BaseTool
 
     class TestToolInput(BaseModel):
         """Input schema for TestTool."""
 
@@ -744,9 +753,10 @@ def test_task_tools_override_agent_tools_with_allow_delegation(researcher, write
     Test that task tools override agent tools while preserving delegation tools when allow_delegation=True
     """
 
-    from crewai.tools import BaseTool
     from pydantic import BaseModel, Field
+
+    from crewai.tools import BaseTool
 
     class TestToolInput(BaseModel):
         query: str = Field(..., description="Query to process")
 
@@ -1005,7 +1015,7 @@ def test_crew_kickoff_streaming_usage_metrics():
         role="{topic} Researcher",
         goal="Express hot takes on {topic}.",
         backstory="You have a lot of experience with {topic}.",
-        llm=LLM(model="gpt-4o", stream=True),
+        llm=LLM(model="gpt-4o", stream=True, is_litellm=True),
         max_iter=3,
     )
 
@@ -1773,7 +1783,7 @@ def test_hierarchical_kickoff_usage_metrics_include_manager(researcher):
         agent=researcher,  # *regular* agent
     )
 
-    # ── 2. Stub out each agent's _token_process.get_summary() ───────────────────
+    # ── 2. Stub out each agent's token usage methods ───────────────────
     researcher_metrics = UsageMetrics(
         total_tokens=120, prompt_tokens=80, completion_tokens=40, successful_requests=2
     )
@@ -1781,10 +1791,10 @@
         total_tokens=30, prompt_tokens=20, completion_tokens=10, successful_requests=1
     )
 
-    # Replace the internal _token_process objects with simple mocks
-    researcher._token_process = MagicMock(
-        get_summary=MagicMock(return_value=researcher_metrics)
-    )
+    # Mock the LLM's get_token_usage_summary method for the researcher
+    researcher.llm.get_token_usage_summary = MagicMock(return_value=researcher_metrics)
+
+    # Mock the manager's _token_process since it uses the fallback path
     manager._token_process = MagicMock(
         get_summary=MagicMock(return_value=manager_metrics)
     )
@@ -3334,7 +3344,9 @@ def test_replay_with_invalid_task_id():
     ):
         with pytest.raises(
             ValueError,
-            match="Task with id bf5b09c9-69bd-4eb8-be12-f9e5bae31c2d not found in the crew's tasks.",
+            match=re.escape(
+                "Task with id bf5b09c9-69bd-4eb8-be12-f9e5bae31c2d not found in the crew's tasks."
+            ),
         ):
             crew.replay("bf5b09c9-69bd-4eb8-be12-f9e5bae31c2d")
 
@@ -3813,10 +3825,11 @@ def test_task_tools_preserve_code_execution_tools():
     """
    Test that task tools don't override code execution tools when allow_code_execution=True
     """
-    from crewai.tools import BaseTool
     from crewai_tools import CodeInterpreterTool
     from pydantic import BaseModel, Field
+
+    from crewai.tools import BaseTool
 
     class TestToolInput(BaseModel):
         """Input schema for TestTool."""
 