- Fix test expectations to use 0.85 ratio instead of 0.75 (matches CONTEXT_WINDOW_USAGE_RATIO)
- Remove unused imports (pytest, Mock) from test file
- Add context window size warning for large models (>100K tokens)
- Update documentation with performance considerations and rate limiting best practices
- Address code review feedback from João regarding validation and error handling

Co-Authored-By: João <joao@crewai.com>
AI/ML API Integration with CrewAI
CrewAI now supports AI/ML API as a provider, giving you access to 300+ AI models through a single platform. AI/ML API provides a unified interface to models from various providers, including Meta (Llama), Anthropic (Claude), Mistral, Qwen, and more.
Setup
- Get your API key from AI/ML API
- Set your API key as an environment variable:
export AIML_API_KEY="your-api-key-here"
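To confirm the key is visible to Python before building any agents, a fail-fast check along these lines can help. This is a minimal sketch: `require_aiml_api_key` is a hypothetical helper name, not part of CrewAI, and it only reads the AIML_API_KEY variable set above.

```python
import os


def require_aiml_api_key(env=None):
    """Return the AI/ML API key from the environment or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    This helper is illustrative and not part of CrewAI.
    """
    env = os.environ if env is None else env
    key = env.get("AIML_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "AIML_API_KEY is not set; export it before creating an LLM."
        )
    return key
```

Calling `require_aiml_api_key()` at startup turns a missing key into an immediate, readable error instead of an authentication failure on the first request.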
Usage
AI/ML API models use the openai/ prefix for compatibility with LiteLLM. Here are some examples:
Basic Usage
from crewai import Agent, LLM

# Use Llama 3.1 70B through AI/ML API
llm = LLM(
    model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    api_key="your-aiml-api-key"  # or set AIML_API_KEY env var
)

agent = Agent(
    role="Research Assistant",
    goal="Help with research tasks",
    backstory="You are an expert researcher with access to advanced AI capabilities",
    llm=llm
)
Available Models
Popular models available through AI/ML API:
Llama Models
- openai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo - Largest Llama model
- openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo - High performance
- openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo - Fast and efficient
- openai/meta-llama/Meta-Llama-3.2-90B-Vision-Instruct-Turbo - Vision capabilities
Claude Models
- openai/anthropic/claude-3-5-sonnet-20241022 - Latest Claude Sonnet
- openai/anthropic/claude-3-5-haiku-20241022 - Fast Claude model
- openai/anthropic/claude-3-opus-20240229 - Most capable Claude
Other Models
- openai/mistralai/Mixtral-8x7B-Instruct-v0.1 - Mistral's mixture of experts
- openai/Qwen/Qwen2.5-72B-Instruct-Turbo - Qwen's large model
- openai/deepseek-ai/DeepSeek-V2.5 - DeepSeek's latest model
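Every ID listed above is the provider's model name with openai/ in front. If model names come from configuration without the prefix, a small normalizer can add it. This is a sketch only: `to_crewai_model_id` is a hypothetical helper, not a CrewAI function, and it assumes a name that already starts with openai/ has been prefixed.

```python
AIML_PREFIX = "openai/"


def to_crewai_model_id(provider_model_id):
    """Prefix an AI/ML API model name with "openai/" unless already prefixed.

    Illustrative helper; not part of CrewAI or LiteLLM.
    """
    name = provider_model_id.strip()
    if not name:
        raise ValueError("model name must not be empty")
    if name.startswith(AIML_PREFIX):
        return name
    return AIML_PREFIX + name
```

For example, `to_crewai_model_id("Qwen/Qwen2.5-72B-Instruct-Turbo")` yields the Qwen ID from the list above, and applying it twice leaves the result unchanged.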
Complete Example
from crewai import Agent, Task, Crew, LLM

# Configure AI/ML API LLM
llm = LLM(
    model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    api_key="your-aiml-api-key"
)

# Create an agent with AI/ML API model
researcher = Agent(
    role="AI Research Specialist",
    goal="Analyze AI trends and provide insights",
    backstory="You are an expert in artificial intelligence with deep knowledge of current trends and developments",
    llm=llm
)

# Create a task
research_task = Task(
    description="Research the latest developments in large language models and summarize key findings",
    expected_output="A comprehensive summary of recent LLM developments with key insights",
    agent=researcher
)

# Create and run the crew
crew = Crew(
    agents=[researcher],
    tasks=[research_task]
)

result = crew.kickoff()
print(result)
Environment Configuration
You can configure AI/ML API in several ways:
# Method 1: Direct API key
llm = LLM(
    model="openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    api_key="your-aiml-api-key"
)

# Method 2: Environment variable (recommended)
# Set AIML_API_KEY in your environment
llm = LLM(model="openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo")

# Method 3: Base URL configuration (if needed)
llm = LLM(
    model="openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    base_url="https://api.aimlapi.com/v1",
    api_key="your-aiml-api-key"
)
Features
AI/ML API models through CrewAI support:
- Function Calling: Most models support tool usage and function calling
- Streaming: Real-time response streaming for better user experience
- Context Windows: Optimized context window management for each model
- Vision Models: Some models support image understanding capabilities
- Structured Output: JSON and Pydantic model output formatting
Model Selection Guide
Choose the right model for your use case:
- For complex reasoning: Use Llama 3.1 405B or Claude 3.5 Sonnet
- For balanced performance: Use Llama 3.1 70B or Claude 3.5 Haiku
- For speed and efficiency: Use Llama 3.1 8B or smaller models
- For vision tasks: Use Llama 3.2 Vision models
- For coding: Consider DeepSeek or specialized coding models
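The guide above can be captured as a small lookup table so the model choice lives in one place. This is a sketch: the use-case labels and the `pick_model` helper are made up for illustration, while the model IDs are the ones listed earlier in this document.

```python
# Use-case labels are illustrative; model IDs are the ones listed in this guide.
RECOMMENDED_MODELS = {
    "complex_reasoning": "openai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",
    "balanced": "openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    "fast": "openai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    "vision": "openai/meta-llama/Meta-Llama-3.2-90B-Vision-Instruct-Turbo",
    "coding": "openai/deepseek-ai/DeepSeek-V2.5",
}


def pick_model(use_case, default="balanced"):
    """Return the model ID for a use case, falling back to the default entry."""
    return RECOMMENDED_MODELS.get(use_case, RECOMMENDED_MODELS[default])
```

The result can be passed straight to the constructor, for example `LLM(model=pick_model("vision"))`.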
Performance Considerations
Context Window Management
AI/ML API models support large context windows, but be mindful of:
- Memory Usage: Large context windows (>100K tokens) may require significant memory
- Processing Time: Larger contexts take longer to process
- Cost Impact: Most providers charge based on token usage
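To make these trade-offs concrete, the sketch below computes a working token budget from a model's context window and warns when the window exceeds 100K tokens. The 85% ratio is an assumption borrowed from CrewAI's CONTEXT_WINDOW_USAGE_RATIO constant; `usable_context_tokens` is a hypothetical helper and does not describe how CrewAI applies that ratio internally.

```python
import warnings

USAGE_RATIO_PERCENT = 85          # assumed: mirrors a 0.85 usage ratio
LARGE_WINDOW_THRESHOLD = 100_000  # tokens; matches the >100K figure above


def usable_context_tokens(context_window):
    """Return the token budget after holding back part of the window.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    if context_window > LARGE_WINDOW_THRESHOLD:
        warnings.warn(
            f"Context window of {context_window} tokens is large; "
            "expect higher memory use, latency, and cost.",
            stacklevel=2,
        )
    return context_window * USAGE_RATIO_PERCENT // 100
```

With these assumptions, an 8,192-token window leaves a budget of 6,963 tokens, and a 128,000-token window leaves 108,800 and triggers the warning.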
Rate Limiting Best Practices
AI/ML API implements rate limiting to ensure fair usage:
- Implement Retry Logic: Use exponential backoff for rate limit errors
- Monitor Usage: Track your API usage through the AI/ML API dashboard
- Batch Requests: Group multiple requests when possible to optimize throughput
- Cache Results: Store frequently used responses to reduce API calls
import time

from crewai import LLM


def create_llm_with_retry(model_name, max_retries=3):
    """Create an LLM, retrying with exponential backoff on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return LLM(model=model_name)
        except Exception as e:
            # Retry only on rate-limit errors, and only while attempts remain
            if "rate limit" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
                time.sleep(wait_time)
                continue
            raise
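The same backoff idea can wrap the request itself, assuming rate-limit errors surface when a call is sent rather than when the LLM object is constructed. The sketch below is generic: `call_with_backoff` is a hypothetical helper, the substring match on the error message is a heuristic, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import random
import time


def call_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on rate-limit errors.

    Rate limits are detected by a substring match on the error message, which
    is a heuristic; swap in the client's specific exception type if known.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            is_rate_limit = "rate limit" in str(exc).lower()
            # Re-raise anything that is not a rate limit, or the final failure.
            if not is_rate_limit or attempt == max_retries - 1:
                raise
            # Delays grow 1x, 2x, 4x ... with up to 25% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
```

A crew run could then be wrapped as `result = call_with_backoff(crew.kickoff)`. The jitter spreads out retries from concurrent workers so they do not all hit the API at the same moment.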
Cost Optimization
- Model Selection: Choose appropriate model size for your use case
- Context Management: Trim unnecessary context to reduce token usage
- Streaming: Use streaming for real-time applications to improve perceived performance
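As one way to trim context, the sketch below keeps the first message plus as many of the most recent messages as fit a size budget. It is illustrative only: `trim_messages` is a hypothetical helper, messages are assumed to be dicts with a "content" string, and character count stands in for tokens, so a real tokenizer would give more accurate limits.

```python
def trim_messages(messages, max_chars):
    """Keep the first message plus the newest messages that fit max_chars.

    The first message (often the system prompt) is always kept, even if it
    alone exceeds the budget. Length is measured in characters, not tokens.
    """
    if not messages:
        return []
    head, tail = messages[0], messages[1:]
    budget = max_chars - len(head["content"])
    kept = []
    # Walk backwards from the newest message and stop at the first one
    # that no longer fits, so the kept history stays contiguous.
    for message in reversed(tail):
        size = len(message["content"])
        if size > budget:
            break
        kept.append(message)
        budget -= size
    return [head] + kept[::-1]
```

Dropping the oldest turns first preserves the most recent exchange, which usually matters most for the next response.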
Troubleshooting
Common Issues
- Authentication Error: Ensure your AIML_API_KEY is set correctly
- Model Not Found: Verify the model name uses the correct openai/ prefix
- Rate Limits: AI/ML API has rate limits; implement appropriate retry logic
- Context Length: Monitor context window usage for optimal performance
- Memory Issues: Large context windows may cause memory problems; monitor usage
Getting Help
- Check the AI/ML API Documentation
- Review model-specific capabilities and limitations
- Monitor usage and costs through the AI/ML API dashboard
Migration from Other Providers
If you're migrating from other providers:
# From OpenAI
# OLD: llm = LLM(model="gpt-4")
# NEW: llm = LLM(model="openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo")
# From Anthropic
# OLD: llm = LLM(model="claude-3-sonnet")
# NEW: llm = LLM(model="openai/anthropic/claude-3-5-sonnet-20241022")
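If model names are stored in configuration, the two replacements above can be applied in one place. This is a sketch: `MIGRATION_MAP` and `migrate_model_name` are hypothetical names, and the two mappings are simply the examples from the snippet above, not official equivalents.

```python
# Example replacements taken from the migration snippet above; extend as needed.
MIGRATION_MAP = {
    "gpt-4": "openai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    "claude-3-sonnet": "openai/anthropic/claude-3-5-sonnet-20241022",
}


def migrate_model_name(old_name):
    """Return the AI/ML API model ID for a legacy name, or the name unchanged."""
    return MIGRATION_MAP.get(old_name, old_name)
```

Names that are not in the table pass through untouched, so already-migrated configuration keeps working.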
The integration maintains full compatibility with CrewAI's existing features while providing access to AI/ML API's extensive model catalog.