Fix CI failures: correct context window ratio and remove unused imports

- Fix test expectations to use 0.85 ratio instead of 0.75 (matches CONTEXT_WINDOW_USAGE_RATIO)
- Remove unused imports (pytest, Mock) from test file
- Add context window size warning for large models (>100K tokens)
- Update documentation with performance considerations and rate limiting best practices
- Address code review feedback from João regarding validation and error handling

Co-Authored-By: João <joao@crewai.com>
Add AI/ML API provider integration
2026-01-04 05:38:33 +00:00 · 2025-06-04 10:15:30 +00:00 · 2025-06-04 10:08:32 +00:00
4 changed files with 401 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -546,7 +546,7 @@ This example demonstrates how to:

 ## Connecting Your Crew to a Model

-CrewAI supports using various LLMs through a variety of connection options. By default your agents will use the OpenAI API when querying the model. However, there are several other ways to allow your agents to connect to models. For example, you can configure your agents to use a local model via the Ollama tool.
+CrewAI supports using various LLMs through a variety of connection options. By default your agents will use the OpenAI API when querying the model. However, there are several other ways to allow your agents to connect to models. For example, you can configure your agents to use a local model via the Ollama tool, or access 300+ AI models through AI/ML API.

 Please refer to the [Connect CrewAI to LLMs](https://docs.crewai.com/how-to/LLM-Connections/) page for details on configuring your agents' connections to models.

@@ -706,7 +706,7 @@ A: Yes. CrewAI excels at both simple and highly complex real-world scenarios, of

 ### Q: Can I use CrewAI with local AI models?

-A: Absolutely! CrewAI supports various language models, including local ones. Tools like Ollama and LM Studio allow seamless integration. Check the [LLM Connections documentation](https://docs.crewai.com/how-to/LLM-Connections/) for more details.
+A: Absolutely! CrewAI supports various language models, including local ones. Tools like Ollama and LM Studio allow seamless integration, and AI/ML API provides access to 300+ models from various providers. Check the [LLM Connections documentation](https://docs.crewai.com/how-to/LLM-Connections/) for more details.

 ### Q: What makes Crews different from Flows?

--- a/docs/aiml_api_integration.md
+++ b/docs/aiml_api_integration.md
@@ -0,0 +1,207 @@
+# AI/ML API Integration with CrewAI
+
+CrewAI now supports AI/ML API as a provider, giving you access to 300+ AI models through their platform. AI/ML API provides a unified interface to models from various providers including Meta (Llama), Anthropic (Claude), Mistral, Qwen, and more.
+
+## Setup
+
+1. Get your API key from [AI/ML API](https://aimlapi.com)
+2. Set your API key as an environment variable:
+
+```bash
+export AIML_API_KEY="your-api-key-here"
+```
+
+## Usage
+
+AI/ML API models use the `openai/` prefix for compatibility with LiteLLM. Here are some examples:
+
+### Basic Usage
+
+```python
+from crewai import Agent, LLM
+
+# Use Llama 3.1 70B through AI/ML API
+llm = LLM(
+    model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
+    api_key="your-aiml-api-key"  # or set AIML_API_KEY env var
+)
+
+agent = Agent(
+    role="Research Assistant",
+    goal="Help with research tasks",
+    backstory="You are an expert researcher with access to advanced AI capabilities",
+    llm=llm
+)
+```
+
+### Available Models
+
+Popular models available through AI/ML API:
+
+#### Llama Models
+- `openai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo` - Largest Llama model
+- `openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo` - High performance
+- `openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo` - Fast and efficient
+- `openai/meta-llama/Meta-Llama-3.2-90B-Vision-Instruct-Turbo` - Vision capabilities
+
+#### Claude Models
+- `openai/anthropic/claude-3-5-sonnet-20241022` - Latest Claude Sonnet
+- `openai/anthropic/claude-3-5-haiku-20241022` - Fast Claude model
+- `openai/anthropic/claude-3-opus-20240229` - Most capable Claude
+
+#### Other Models
+- `openai/mistralai/Mixtral-8x7B-Instruct-v0.1` - Mistral's mixture of experts
+- `openai/Qwen/Qwen2.5-72B-Instruct-Turbo` - Qwen's large model
+- `openai/deepseek-ai/DeepSeek-V2.5` - DeepSeek's latest model
+
+### Complete Example
+
+```python
+from crewai import Agent, Task, Crew, LLM
+
+# Configure AI/ML API LLM
+llm = LLM(
+    model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
+    api_key="your-aiml-api-key"
+)
+
+# Create an agent with AI/ML API model
+researcher = Agent(
+    role="AI Research Specialist",
+    goal="Analyze AI trends and provide insights",
+    backstory="You are an expert in artificial intelligence with deep knowledge of current trends and developments",
+    llm=llm
+)
+
+# Create a task
+research_task = Task(
+    description="Research the latest developments in large language models and summarize key findings",
+    expected_output="A comprehensive summary of recent LLM developments with key insights",
+    agent=researcher
+)
+
+# Create and run the crew
+crew = Crew(
+    agents=[researcher],
+    tasks=[research_task]
+)
+
+result = crew.kickoff()
+print(result)
+```
+
+### Environment Configuration
+
+You can configure AI/ML API in several ways:
+
+```python
+# Method 1: Direct API key
+llm = LLM(
+    model="openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
+    api_key="your-aiml-api-key"
+)
+
+# Method 2: Environment variable (recommended)
+# Set AIML_API_KEY in your environment
+llm = LLM(model="openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo")
+
+# Method 3: Base URL configuration (if needed)
+llm = LLM(
+    model="openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
+    base_url="https://api.aimlapi.com/v1",
+    api_key="your-aiml-api-key"
+)
+```
+
+## Features
+
+AI/ML API models through CrewAI support:
+
+- **Function Calling**: Most models support tool usage and function calling
+- **Streaming**: Real-time response streaming for better user experience
+- **Context Windows**: Optimized context window management for each model
+- **Vision Models**: Some models support image understanding capabilities
+- **Structured Output**: JSON and Pydantic model output formatting
+
+## Model Selection Guide
+
+Choose the right model for your use case:
+
+- **For complex reasoning**: Use Llama 3.1 405B or Claude 3.5 Sonnet
+- **For balanced performance**: Use Llama 3.1 70B or Claude 3.5 Haiku
+- **For speed and efficiency**: Use Llama 3.1 8B or smaller models
+- **For vision tasks**: Use Llama 3.2 Vision models
+- **For coding**: Consider DeepSeek or specialized coding models
+
+## Performance Considerations
+
+### Context Window Management
+
+AI/ML API models support large context windows, but be mindful of:
+
+- **Memory Usage**: Large context windows (>100K tokens) may require significant memory
+- **Processing Time**: Larger contexts take longer to process
+- **Cost Impact**: Most providers charge based on token usage
+
+### Rate Limiting Best Practices
+
+AI/ML API implements rate limiting to ensure fair usage:
+
+- **Implement Retry Logic**: Use exponential backoff for rate limit errors
+- **Monitor Usage**: Track your API usage through the AI/ML API dashboard
+- **Batch Requests**: Group multiple requests when possible to optimize throughput
+- **Cache Results**: Store frequently used responses to reduce API calls
+
+```python
+import time
+from crewai import LLM
+
+def call_with_retry(llm: LLM, prompt: str, max_retries: int = 3):
+    for attempt in range(max_retries):
+        try:
+            return llm.call(prompt)  # the API request is what can be rate limited
+        except Exception as e:
+            if "rate limit" in str(e).lower() and attempt < max_retries - 1:
+                wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
+                time.sleep(wait_time)
+                continue
+            raise  # re-raise with the original traceback
+```
+
+### Cost Optimization
+
+- **Model Selection**: Choose appropriate model size for your use case
+- **Context Management**: Trim unnecessary context to reduce token usage
+- **Streaming**: Use streaming for real-time applications to improve perceived performance
+
+## Troubleshooting
+
+### Common Issues
+
+1. **Authentication Error**: Ensure your AIML_API_KEY is set correctly
+2. **Model Not Found**: Verify the model name uses the correct `openai/` prefix
+3. **Rate Limits**: AI/ML API has rate limits; implement appropriate retry logic
+4. **Context Length**: Monitor context window usage for optimal performance
+5. **Memory Issues**: Large context windows may cause memory problems; monitor usage
+
+### Getting Help
+
+- Check the [AI/ML API Documentation](https://docs.aimlapi.com)
+- Review model-specific capabilities and limitations
+- Monitor usage and costs through the AI/ML API dashboard
+
+## Migration from Other Providers
+
+If you're migrating from other providers:
+
+```python
+# From OpenAI
+# OLD: llm = LLM(model="gpt-4")
+# NEW: llm = LLM(model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo")
+
+# From Anthropic
+# OLD: llm = LLM(model="claude-3-sonnet")
+# NEW: llm = LLM(model="openai/anthropic/claude-3-5-sonnet-20241022")
+```
+
+The integration maintains full compatibility with CrewAI's existing features while providing access to AI/ML API's extensive model catalog.
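
Review note on the "Environment Configuration" section of the new docs: the diff does not show where `AIML_API_KEY` is read, so a conservative way to wire things up is to read the variable yourself and pass the key and base URL explicitly, as in "Method 3". The sketch below only builds the keyword arguments. `aiml_llm_kwargs` and `AIML_BASE_URL` are hypothetical names for illustration, not part of CrewAI.

```python
import os

# Endpoint taken from "Method 3" in the docs added by this change.
AIML_BASE_URL = "https://api.aimlapi.com/v1"


def aiml_llm_kwargs(model: str, env=os.environ) -> dict:
    """Build keyword arguments for crewai.LLM (hypothetical helper)."""
    api_key = env.get("AIML_API_KEY")
    if not api_key:
        raise RuntimeError("AIML_API_KEY is not set")
    # Add the LiteLLM-compatible prefix only when it is missing.
    if not model.startswith("openai/"):
        model = f"openai/{model}"
    return {"model": model, "base_url": AIML_BASE_URL, "api_key": api_key}


# Usage (requires crewai to be installed):
#   from crewai import LLM
#   llm = LLM(**aiml_llm_kwargs("meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"))
```

Passing the values explicitly keeps the configuration independent of which environment variables the underlying client happens to consult.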
--- a/src/crewai/llm.py
+++ b/src/crewai/llm.py
@@ -249,6 +249,30 @@ LLM_CONTEXT_WINDOW_SIZES = {
    "mistral/mistral-large-latest": 32768,
    "mistral/mistral-large-2407": 32768,
    "mistral/mistral-large-2402": 32768,
+    "openai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo": 131072,
+    "openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo": 131072,
+    "openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo": 131072,
+    "openai/meta-llama/Meta-Llama-3.2-90B-Vision-Instruct-Turbo": 131072,
+    "openai/meta-llama/Meta-Llama-3.2-11B-Vision-Instruct-Turbo": 131072,
+    "openai/meta-llama/Meta-Llama-3.2-3B-Instruct-Turbo": 131072,
+    "openai/meta-llama/Meta-Llama-3.2-1B-Instruct-Turbo": 131072,
+    "openai/anthropic/claude-3-5-sonnet-20241022": 200000,
+    "openai/anthropic/claude-3-5-haiku-20241022": 200000,
+    "openai/anthropic/claude-3-opus-20240229": 200000,
+    "openai/anthropic/claude-3-sonnet-20240229": 200000,
+    "openai/anthropic/claude-3-haiku-20240307": 200000,
+    "openai/mistralai/Mistral-7B-Instruct-v0.3": 32768,
+    "openai/mistralai/Mixtral-8x7B-Instruct-v0.1": 32768,
+    "openai/mistralai/Mixtral-8x22B-Instruct-v0.1": 65536,
+    "openai/google/gemma-2-9b-it": 8192,
+    "openai/google/gemma-2-27b-it": 8192,
+    "openai/Qwen/Qwen2.5-72B-Instruct-Turbo": 131072,
+    "openai/Qwen/Qwen2.5-32B-Instruct-Turbo": 131072,
+    "openai/Qwen/Qwen2.5-14B-Instruct-Turbo": 131072,
+    "openai/Qwen/Qwen2.5-7B-Instruct-Turbo": 131072,
+    "openai/deepseek-ai/DeepSeek-V2.5": 131072,
+    "openai/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF": 131072,
+    "openai/microsoft/WizardLM-2-8x22B": 65536,
 }

 DEFAULT_CONTEXT_WINDOW_SIZE = 8192
@@ -1095,7 +1119,7 @@ class LLM(BaseLLM):

    def get_context_window_size(self) -> int:
        """
-        Returns the context window size, using 75% of the maximum to avoid
+        Returns the context window size, using 85% of the maximum to avoid
        cutting off messages mid-thread.

        Raises:
@@ -1106,6 +1130,7 @@ class LLM(BaseLLM):

        MIN_CONTEXT = 1024
        MAX_CONTEXT = 2097152  # Current max from gemini-1.5-pro
+        MAX_SAFE_CONTEXT = 100000  # Warn for very large context windows

        # Validate all context window sizes
        for key, value in LLM_CONTEXT_WINDOW_SIZES.items():
@@ -1120,6 +1145,9 @@ class LLM(BaseLLM):
        for key, value in LLM_CONTEXT_WINDOW_SIZES.items():
            if self.model.startswith(key):
                self.context_window_size = int(value * CONTEXT_WINDOW_USAGE_RATIO)
+                if value > MAX_SAFE_CONTEXT:
+                    import warnings
+                    warnings.warn(f"Model {self.model} uses large context window ({value}). Monitor memory usage.")
        return self.context_window_size

    def set_callbacks(self, callbacks: List[Any]):
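
Review note on the `get_context_window_size` hunk: the usable window is the table value scaled by the usage ratio and truncated to an integer, with a warning for table values above 100,000. The standalone sketch below mirrors that logic so the numbers are easy to check. `usable_context_window` and the trimmed `SIZES` table are illustrative names, and the fallback to the default size for unknown models is an assumption, since that part of the method is not shown in the diff.

```python
import warnings

USAGE_RATIO = 0.85          # CONTEXT_WINDOW_USAGE_RATIO, per the commit message
MAX_SAFE_CONTEXT = 100_000  # warning threshold added in this change
DEFAULT_SIZE = 8192         # DEFAULT_CONTEXT_WINDOW_SIZE

# A small subset of the entries added to LLM_CONTEXT_WINDOW_SIZES.
SIZES = {
    "openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo": 131072,
    "openai/anthropic/claude-3-5-sonnet-20241022": 200000,
    "openai/mistralai/Mistral-7B-Instruct-v0.3": 32768,
}


def usable_context_window(model: str, sizes=SIZES) -> int:
    # Assumed fallback for models that match no table key.
    usable = int(DEFAULT_SIZE * USAGE_RATIO)
    for key, value in sizes.items():
        if model.startswith(key):  # prefix match, as in the diff
            usable = int(value * USAGE_RATIO)
            if value > MAX_SAFE_CONTEXT:
                warnings.warn(
                    f"Model {model} uses large context window ({value}). "
                    "Monitor memory usage."
                )
    return usable


print(usable_context_window("openai/mistralai/Mistral-7B-Instruct-v0.3"))  # 27852
```

With these values a 131072-token model yields 111411 usable tokens and a 200000-token model yields 170000, and both trigger the warning because the threshold is compared against the unscaled table value.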
--- a/tests/test_aiml_api_integration.py
+++ b/tests/test_aiml_api_integration.py
@@ -0,0 +1,163 @@
+"""Tests for AI/ML API integration with CrewAI."""
+
+from unittest.mock import patch
+
+from crewai.llm import LLM
+from crewai.utilities.llm_utils import create_llm
+
+
+class TestAIMLAPIIntegration:
+    """Test suite for AI/ML API provider integration."""
+
+    def test_aiml_api_model_context_windows(self):
+        """Test that AI/ML API models have correct context window sizes."""
+        test_cases = [
+            ("openai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo", 131072),
+            ("openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo", 131072),
+            ("openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", 131072),
+            ("openai/anthropic/claude-3-5-sonnet-20241022", 200000),
+            ("openai/anthropic/claude-3-5-haiku-20241022", 200000),
+            ("openai/mistralai/Mistral-7B-Instruct-v0.3", 32768),
+            ("openai/Qwen/Qwen2.5-72B-Instruct-Turbo", 131072),
+            ("openai/deepseek-ai/DeepSeek-V2.5", 131072),
+        ]
+        
+        for model_name, expected_context_size in test_cases:
+            llm = LLM(model=model_name)
+            expected_usable_size = int(expected_context_size * 0.85)
+            actual_context_size = llm.get_context_window_size()
+            assert actual_context_size == expected_usable_size, (
+                f"Model {model_name} should have context window size {expected_usable_size}, "
+                f"but got {actual_context_size}"
+            )
+
+    def test_aiml_api_provider_detection(self):
+        """Test that AI/ML API models are correctly identified as openai provider."""
+        llm = LLM(model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo")
+        provider = llm._get_custom_llm_provider()
+        assert provider == "openai", f"Expected provider 'openai', but got '{provider}'"
+
+    def test_aiml_api_model_instantiation(self):
+        """Test that AI/ML API models can be instantiated correctly."""
+        model_names = [
+            "openai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",
+            "openai/anthropic/claude-3-5-sonnet-20241022",
+            "openai/mistralai/Mixtral-8x7B-Instruct-v0.1",
+            "openai/Qwen/Qwen2.5-72B-Instruct-Turbo",
+        ]
+        
+        for model_name in model_names:
+            llm = LLM(model=model_name)
+            assert llm.model == model_name
+            assert llm._get_custom_llm_provider() == "openai"
+            assert llm.get_context_window_size() > 0
+
+    @patch('crewai.llm.litellm.utils.supports_function_calling')
+    def test_aiml_api_function_calling_support(self, mock_supports_function_calling):
+        """Test function calling support detection for AI/ML API models."""
+        mock_supports_function_calling.return_value = True
+        
+        llm = LLM(model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo")
+        supports_fc = llm.supports_function_calling()
+        
+        assert supports_fc is True
+        mock_supports_function_calling.assert_called_once_with(
+            "openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
+            custom_llm_provider="openai"
+        )
+
+    def test_aiml_api_with_create_llm(self):
+        """Test that AI/ML API models work with create_llm utility."""
+        model_name = "openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"
+        llm = create_llm(model_name)
+        
+        assert isinstance(llm, LLM)
+        assert llm.model == model_name
+        assert llm._get_custom_llm_provider() == "openai"
+
+    def test_aiml_api_model_validation(self):
+        """Test that AI/ML API models pass validation checks."""
+        llm = LLM(model="openai/anthropic/claude-3-5-sonnet-20241022")
+        
+        llm._validate_call_params()
+        
+        llm_with_format = LLM(
+            model="openai/anthropic/claude-3-5-sonnet-20241022",
+            response_format={"type": "json_object"}
+        )
+        try:
+            llm_with_format._validate_call_params()
+        except ValueError as e:
+            assert "does not support response_format" in str(e)
+
+    def test_aiml_api_context_window_bounds(self):
+        """Test that AI/ML API model context windows are within valid bounds."""
+        from crewai.llm import LLM_CONTEXT_WINDOW_SIZES
+        
+        aiml_models = {k: v for k, v in LLM_CONTEXT_WINDOW_SIZES.items() 
+                      if k.startswith("openai/")}
+        
+        MIN_CONTEXT = 1024
+        MAX_CONTEXT = 2097152
+        
+        for model_name, context_size in aiml_models.items():
+            assert MIN_CONTEXT <= context_size <= MAX_CONTEXT, (
+                f"Model {model_name} context window {context_size} is outside "
+                f"valid bounds [{MIN_CONTEXT}, {MAX_CONTEXT}]"
+            )
+
+    def test_aiml_api_model_prefixes(self):
+        """Test that all AI/ML API models use the correct openai/ prefix."""
+        from crewai.llm import LLM_CONTEXT_WINDOW_SIZES
+        
+        aiml_models = [k for k in LLM_CONTEXT_WINDOW_SIZES.keys() 
+                      if k.startswith("openai/")]
+        
+        assert len(aiml_models) > 0, "No AI/ML API models found in context window sizes"
+        
+        for model_name in aiml_models:
+            assert model_name.startswith("openai/"), (
+                f"AI/ML API model {model_name} should start with 'openai/' prefix"
+            )
+            parts = model_name.split("/")
+            assert len(parts) >= 3, (
+                f"AI/ML API model {model_name} should have format 'openai/provider/model'"
+            )
+
+
+class TestAIMLAPIExamples:
+    """Test examples of using AI/ML API with CrewAI components."""
+
+    def test_aiml_api_with_agent_example(self):
+        """Test example usage of AI/ML API with CrewAI Agent."""
+        from crewai import Agent
+        
+        llm = LLM(model="openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo")
+        
+        agent = Agent(
+            role="AI Assistant",
+            goal="Help users with their questions",
+            backstory="You are a helpful AI assistant powered by Llama 3.1",
+            llm=llm,
+        )
+        
+        assert agent.llm.model == "openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"
+        assert agent.llm._get_custom_llm_provider() == "openai"
+
+    def test_aiml_api_different_model_types(self):
+        """Test different types of models available through AI/ML API."""
+        model_types = {
+            "llama": "openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
+            "claude": "openai/anthropic/claude-3-5-sonnet-20241022",
+            "mistral": "openai/mistralai/Mixtral-8x7B-Instruct-v0.1",
+            "qwen": "openai/Qwen/Qwen2.5-72B-Instruct-Turbo",
+            "deepseek": "openai/deepseek-ai/DeepSeek-V2.5",
+        }
+        
+        for model_type, model_name in model_types.items():
+            llm = LLM(model=model_name)
+            assert llm.model == model_name
+            assert llm._get_custom_llm_provider() == "openai"
+            assert llm.get_context_window_size() > 0, (
+                f"{model_type} model should have positive context window"
+            )
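
Review note on `test_aiml_api_model_prefixes`: the test expects names of the form `openai/provider/model`. A minimal sketch of that check as a reusable parser follows. `AIMLModelName` and `parse_aiml_model_name` are hypothetical names introduced here for illustration and do not exist in CrewAI.

```python
from typing import NamedTuple


class AIMLModelName(NamedTuple):
    prefix: str    # always "openai" for AI/ML API models in this change
    provider: str  # e.g. "meta-llama", "anthropic", "Qwen"
    model: str     # remainder of the name


def parse_aiml_model_name(name: str) -> AIMLModelName:
    """Split 'openai/provider/model' and reject anything else."""
    parts = name.split("/", 2)  # at most three parts; the model keeps any extra slashes
    if len(parts) != 3 or parts[0] != "openai" or not all(parts):
        raise ValueError(
            f"AI/ML API model {name!r} should have format 'openai/provider/model'"
        )
    return AIMLModelName(*parts)


parsed = parse_aiml_model_name("openai/Qwen/Qwen2.5-72B-Instruct-Turbo")
print(parsed.provider, parsed.model)  # Qwen Qwen2.5-72B-Instruct-Turbo
```

A parser like this rejects two-part names such as `openai/gpt-4`, which a plain `startswith("openai/")` filter would accept.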