mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-09 08:08:32 +00:00
Documentation Improvements: LLM Configuration and Usage (#1684)
* docs: improve tasks documentation clarity and structure - Add Task Execution Flow section - Add variable interpolation explanation - Add Task Dependencies section with examples - Improve overall document structure and readability - Update code examples with proper syntax highlighting * docs: update agent documentation with improved examples and formatting - Replace DuckDuckGoSearchRun with SerperDevTool - Update code block formatting to be consistent - Improve template examples with actual syntax - Update LLM examples to use current models - Clean up formatting and remove redundant comments * docs: enhance LLM documentation with Cerebras provider and formatting improvements * docs: simplify LLMs documentation title * docs: improve installation guide clarity and structure - Add clear Python version requirements with check command - Simplify installation options to recommended method - Improve upgrade section clarity for existing users - Add better visual structure with Notes and Tips - Update description and formatting * docs: improve introduction page organization and clarity - Update organizational analogy in Note section - Improve table formatting and alignment - Remove emojis from component table for cleaner look - Add 'helps you' to make the note more action-oriented * docs: add enterprise and community cards - Add Enterprise deployment card in quickstart - Add community card focused on open source discussions - Remove deployment reference from community description - Clean up introduction page cards - Remove link from Enterprise description text
This commit is contained in:
@@ -1,205 +1,323 @@
|
||||
---
|
||||
title: LLMs
|
||||
description: Learn how to configure and optimize LLMs for your CrewAI projects.
|
||||
icon: microchip-ai
|
||||
title: 'LLMs'
|
||||
description: 'A comprehensive guide to configuring and using Large Language Models (LLMs) in your CrewAI projects'
|
||||
icon: 'microchip-ai'
|
||||
---
|
||||
|
||||
# Large Language Models (LLMs) in CrewAI
|
||||
<Note>
|
||||
CrewAI integrates with multiple LLM providers through LiteLLM, giving you the flexibility to choose the right model for your specific use case. This guide will help you understand how to configure and use different LLM providers in your CrewAI projects.
|
||||
</Note>
|
||||
|
||||
Large Language Models (LLMs) are the backbone of intelligent agents in the CrewAI framework. This guide will help you understand, configure, and optimize LLM usage for your CrewAI projects.
|
||||
## What are LLMs?
|
||||
|
||||
## Key Concepts
|
||||
Large Language Models (LLMs) are the core intelligence behind CrewAI agents. They enable agents to understand context, make decisions, and generate human-like responses. Here's what you need to know:
|
||||
|
||||
- **LLM**: Large Language Model, the AI powering agent intelligence
|
||||
- **Agent**: A CrewAI entity that uses an LLM to perform tasks
|
||||
- **Provider**: A service that offers LLM capabilities (e.g., OpenAI, Anthropic, Ollama, [more providers](https://docs.litellm.ai/docs/providers))
|
||||
<CardGroup cols={2}>
|
||||
<Card title="LLM Basics" icon="brain">
|
||||
Large Language Models are AI systems trained on vast amounts of text data. They power the intelligence of your CrewAI agents, enabling them to understand and generate human-like text.
|
||||
</Card>
|
||||
<Card title="Context Window" icon="window">
|
||||
The context window determines how much text an LLM can process at once. Larger windows (e.g., 128K tokens) allow for more context but may be more expensive and slower.
|
||||
</Card>
|
||||
<Card title="Temperature" icon="temperature-three-quarters">
|
||||
Temperature (0.0 to 1.0) controls response randomness. Lower values (e.g., 0.2) produce more focused, deterministic outputs, while higher values (e.g., 0.8) increase creativity and variability.
|
||||
</Card>
|
||||
<Card title="Provider Selection" icon="server">
|
||||
Each LLM provider (e.g., OpenAI, Anthropic, Google) offers different models with varying capabilities, pricing, and features. Choose based on your needs for accuracy, speed, and cost.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
## Configuring LLMs for Agents
|
||||
## Available Models and Their Capabilities
|
||||
|
||||
CrewAI offers flexible options for setting up LLMs:
|
||||
|
||||
### 1. Default Configuration
|
||||
|
||||
By default, CrewAI uses the `gpt-4o-mini` model. It uses environment variables if no LLM is specified:
|
||||
- `OPENAI_MODEL_NAME` (defaults to "gpt-4o-mini" if not set)
|
||||
- `OPENAI_API_BASE`
|
||||
- `OPENAI_API_KEY`
|
||||
|
||||
### 2. Updating YAML files
|
||||
|
||||
You can update the `agents.yml` file to refer to the LLM you want to use:
|
||||
|
||||
```yaml Code
|
||||
researcher:
|
||||
role: Research Specialist
|
||||
goal: Conduct comprehensive research and analysis to gather relevant information,
|
||||
synthesize findings, and produce well-documented insights.
|
||||
backstory: A dedicated research professional with years of experience in academic
|
||||
investigation, literature review, and data analysis, known for thorough and
|
||||
methodical approaches to complex research questions.
|
||||
verbose: true
|
||||
llm: openai/gpt-4o
|
||||
# llm: azure/gpt-4o-mini
|
||||
# llm: gemini/gemini-pro
|
||||
# llm: anthropic/claude-3-5-sonnet-20240620
|
||||
# llm: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
|
||||
# llm: mistral/mistral-large-latest
|
||||
# llm: ollama/llama3:70b
|
||||
# llm: groq/llama-3.2-90b-vision-preview
|
||||
# llm: watsonx/meta-llama/llama-3-1-70b-instruct
|
||||
# llm: nvidia_nim/meta/llama3-70b-instruct
|
||||
# llm: sambanova/Meta-Llama-3.1-8B-Instruct
|
||||
# ...
|
||||
```
|
||||
|
||||
Keep in mind that you will need to set certain ENV vars depending on the model you are
|
||||
using to account for the credentials or set a custom LLM object like described below.
|
||||
Here are some of the required ENV vars for some of the LLM integrations:
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="OpenAI">
|
||||
```python Code
|
||||
OPENAI_API_KEY=<your-api-key>
|
||||
OPENAI_API_BASE=<optional-custom-base-url>
|
||||
OPENAI_MODEL_NAME=<openai-model-name>
|
||||
OPENAI_ORGANIZATION=<your-org-id> # OPTIONAL
|
||||
OPENAI_API_BASE=<openaiai-api-base> # OPTIONAL
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Anthropic">
|
||||
```python Code
|
||||
ANTHROPIC_API_KEY=<your-api-key>
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Google">
|
||||
```python Code
|
||||
GEMINI_API_KEY=<your-api-key>
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Azure">
|
||||
```python Code
|
||||
AZURE_API_KEY=<your-api-key> # "my-azure-api-key"
|
||||
AZURE_API_BASE=<your-resource-url> # "https://example-endpoint.openai.azure.com"
|
||||
AZURE_API_VERSION=<api-version> # "2023-05-15"
|
||||
AZURE_AD_TOKEN=<your-azure-ad-token> # Optional
|
||||
AZURE_API_TYPE=<your-azure-api-type> # Optional
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="AWS Bedrock">
|
||||
```python Code
|
||||
AWS_ACCESS_KEY_ID=<your-access-key>
|
||||
AWS_SECRET_ACCESS_KEY=<your-secret-key>
|
||||
AWS_DEFAULT_REGION=<your-region>
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Mistral">
|
||||
```python Code
|
||||
MISTRAL_API_KEY=<your-api-key>
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Groq">
|
||||
```python Code
|
||||
GROQ_API_KEY=<your-api-key>
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="IBM watsonx.ai">
|
||||
```python Code
|
||||
WATSONX_URL=<your-url> # (required) Base URL of your WatsonX instance
|
||||
WATSONX_APIKEY=<your-apikey> # (required) IBM cloud API key
|
||||
WATSONX_TOKEN=<your-token> # (required) IAM auth token (alternative to APIKEY)
|
||||
WATSONX_PROJECT_ID=<your-project-id> # (optional) Project ID of your WatsonX instance
|
||||
WATSONX_DEPLOYMENT_SPACE_ID=<your-space-id> # (optional) ID of deployment space for deployed models
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### 3. Custom LLM Objects
|
||||
|
||||
Pass a custom LLM implementation or object from another library.
|
||||
|
||||
See below for examples.
|
||||
Here's a detailed breakdown of supported models and their capabilities:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="String Identifier">
|
||||
```python Code
|
||||
agent = Agent(llm="gpt-4o", ...)
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="OpenAI">
|
||||
| Model | Context Window | Best For |
|
||||
|-------|---------------|-----------|
|
||||
| GPT-4 | 8,192 tokens | High-accuracy tasks, complex reasoning |
|
||||
| GPT-4 Turbo | 128,000 tokens | Long-form content, document analysis |
|
||||
| GPT-4o & GPT-4o-mini | 128,000 tokens | Cost-effective large context processing |
|
||||
|
||||
<Tab title="LLM Instance">
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
<Note>
|
||||
1 token ≈ 4 characters in English. For example, 8,192 tokens ≈ 32,768 characters or about 6,000 words.
|
||||
</Note>
|
||||
</Tab>
|
||||
<Tab title="Groq">
|
||||
| Model | Context Window | Best For |
|
||||
|-------|---------------|-----------|
|
||||
| Llama 3.1 70B/8B | 131,072 tokens | High-performance, large context tasks |
|
||||
| Llama 3.2 Series | 8,192 tokens | General-purpose tasks |
|
||||
| Mixtral 8x7B | 32,768 tokens | Balanced performance and context |
|
||||
| Gemma Series | 8,192 tokens | Efficient, smaller-scale tasks |
|
||||
|
||||
llm = LLM(model="gpt-4", temperature=0.7)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Tab>
|
||||
<Tip>
|
||||
Groq is known for its fast inference speeds, making it suitable for real-time applications.
|
||||
</Tip>
|
||||
</Tab>
|
||||
<Tab title="Others">
|
||||
| Provider | Context Window | Key Features |
|
||||
|----------|---------------|--------------|
|
||||
| Deepseek Chat | 128,000 tokens | Specialized in technical discussions |
|
||||
| Claude 3 | Up to 200K tokens | Strong reasoning, code understanding |
|
||||
| Gemini | Varies by model | Multimodal capabilities |
|
||||
|
||||
<Info>
|
||||
Provider selection should consider factors like:
|
||||
- API availability in your region
|
||||
- Pricing structure
|
||||
- Required features (e.g., streaming, function calling)
|
||||
- Performance requirements
|
||||
</Info>
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
## Connecting to OpenAI-Compatible LLMs
|
||||
## Setting Up Your LLM
|
||||
|
||||
You can connect to OpenAI-compatible LLMs using either environment variables or by setting specific attributes on the LLM class:
|
||||
There are three ways to configure LLMs in CrewAI. Choose the method that best fits your workflow:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Using Environment Variables">
|
||||
```python Code
|
||||
import os
|
||||
<Tab title="1. Environment Variables">
|
||||
The simplest way to get started. Set these variables in your environment:
|
||||
|
||||
os.environ["OPENAI_API_KEY"] = "your-api-key"
|
||||
os.environ["OPENAI_API_BASE"] = "https://api.your-provider.com/v1"
|
||||
```bash
|
||||
# Required: Your API key for authentication
|
||||
OPENAI_API_KEY=<your-api-key>
|
||||
|
||||
# Optional: Default model selection
|
||||
OPENAI_MODEL_NAME=gpt-4o-mini # Default if not set
|
||||
|
||||
# Optional: Organization ID (if applicable)
|
||||
OPENAI_ORGANIZATION_ID=<your-org-id>
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Using LLM Class Attributes">
|
||||
```python Code
|
||||
|
||||
<Warning>
|
||||
Never commit API keys to version control. Use environment files (.env) or your system's secret management.
|
||||
</Warning>
|
||||
</Tab>
|
||||
<Tab title="2. YAML Configuration">
|
||||
Create a YAML file to define your agent configurations. This method is great for version control and team collaboration:
|
||||
|
||||
```yaml
|
||||
researcher:
|
||||
# Agent Definition
|
||||
role: Research Specialist
|
||||
goal: Conduct comprehensive research and analysis
|
||||
backstory: A dedicated research professional with years of experience
|
||||
verbose: true
|
||||
|
||||
# Model Selection (uncomment your choice)
|
||||
|
||||
# OpenAI Models - Known for reliability and performance
|
||||
llm: openai/gpt-4o-mini
|
||||
# llm: openai/gpt-4 # More accurate but expensive
|
||||
# llm: openai/gpt-4-turbo # Fast with large context
|
||||
# llm: openai/gpt-4o # Optimized for longer texts
|
||||
# llm: openai/o1-preview # Latest features
|
||||
# llm: openai/o1-mini # Cost-effective
|
||||
|
||||
# Azure Models - For enterprise deployments
|
||||
# llm: azure/gpt-4o-mini
|
||||
# llm: azure/gpt-4
|
||||
# llm: azure/gpt-35-turbo
|
||||
|
||||
# Anthropic Models - Strong reasoning capabilities
|
||||
# llm: anthropic/claude-3-opus-20240229-v1:0
|
||||
# llm: anthropic/claude-3-sonnet-20240229-v1:0
|
||||
# llm: anthropic/claude-3-haiku-20240307-v1:0
|
||||
# llm: anthropic/claude-2.1
|
||||
# llm: anthropic/claude-2.0
|
||||
|
||||
# Google Models - Good for general tasks
|
||||
# llm: gemini/gemini-pro
|
||||
# llm: gemini/gemini-1.5-pro-latest
|
||||
# llm: gemini/gemini-1.0-pro-latest
|
||||
|
||||
# AWS Bedrock Models - Enterprise-grade
|
||||
# llm: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
|
||||
# llm: bedrock/anthropic.claude-v2:1
|
||||
# llm: bedrock/amazon.titan-text-express-v1
|
||||
# llm: bedrock/meta.llama2-70b-chat-v1
|
||||
|
||||
# Mistral Models - Open source alternative
|
||||
# llm: mistral/mistral-large-latest
|
||||
# llm: mistral/mistral-medium-latest
|
||||
# llm: mistral/mistral-small-latest
|
||||
|
||||
# Groq Models - Fast inference
|
||||
# llm: groq/mixtral-8x7b-32768
|
||||
# llm: groq/llama-3.1-70b-versatile
|
||||
# llm: groq/llama-3.2-90b-text-preview
|
||||
# llm: groq/gemma2-9b-it
|
||||
# llm: groq/gemma-7b-it
|
||||
|
||||
# IBM watsonx.ai Models - Enterprise features
|
||||
# llm: watsonx/ibm/granite-13b-chat-v2
|
||||
# llm: watsonx/meta-llama/llama-3-1-70b-instruct
|
||||
# llm: watsonx/bigcode/starcoder2-15b
|
||||
|
||||
# Ollama Models - Local deployment
|
||||
# llm: ollama/llama3:70b
|
||||
# llm: ollama/codellama
|
||||
# llm: ollama/mistral
|
||||
# llm: ollama/mixtral
|
||||
# llm: ollama/phi
|
||||
|
||||
# Fireworks AI Models - Specialized tasks
|
||||
# llm: fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct
|
||||
# llm: fireworks_ai/accounts/fireworks/models/mixtral-8x7b
|
||||
# llm: fireworks_ai/accounts/fireworks/models/zephyr-7b-beta
|
||||
|
||||
# Perplexity AI Models - Research focused
|
||||
# llm: pplx/llama-3.1-sonar-large-128k-online
|
||||
# llm: pplx/mistral-7b-instruct
|
||||
# llm: pplx/codellama-34b-instruct
|
||||
# llm: pplx/mixtral-8x7b-instruct
|
||||
|
||||
# Hugging Face Models - Community models
|
||||
# llm: huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct
|
||||
# llm: huggingface/mistralai/Mixtral-8x7B-Instruct-v0.1
|
||||
# llm: huggingface/tiiuae/falcon-180B-chat
|
||||
# llm: huggingface/google/gemma-7b-it
|
||||
|
||||
# Nvidia NIM Models - GPU-optimized
|
||||
# llm: nvidia_nim/meta/llama3-70b-instruct
|
||||
# llm: nvidia_nim/mistral/mixtral-8x7b
|
||||
# llm: nvidia_nim/google/gemma-7b
|
||||
|
||||
# SambaNova Models - Enterprise AI
|
||||
# llm: sambanova/Meta-Llama-3.1-8B-Instruct
|
||||
# llm: sambanova/BioMistral-7B
|
||||
# llm: sambanova/Falcon-180B
|
||||
```
|
||||
|
||||
<Info>
|
||||
The YAML configuration allows you to:
|
||||
- Version control your agent settings
|
||||
- Easily switch between different models
|
||||
- Share configurations across team members
|
||||
- Document model choices and their purposes
|
||||
</Info>
|
||||
</Tab>
|
||||
<Tab title="3. Direct Code">
|
||||
For maximum flexibility, configure LLMs directly in your Python code:
|
||||
|
||||
```python
|
||||
from crewai import LLM
|
||||
|
||||
# Basic configuration
|
||||
llm = LLM(model="gpt-4")
|
||||
|
||||
# Advanced configuration with detailed parameters
|
||||
llm = LLM(
|
||||
model="gpt-4o-mini",
|
||||
temperature=0.7, # Higher for more creative outputs
|
||||
timeout=120, # Seconds to wait for response
|
||||
max_tokens=4000, # Maximum length of response
|
||||
top_p=0.9, # Nucleus sampling parameter
|
||||
frequency_penalty=0.1, # Reduce repetition
|
||||
presence_penalty=0.1, # Encourage topic diversity
|
||||
response_format={"type": "json"}, # For structured outputs
|
||||
seed=42 # For reproducible results
|
||||
)
|
||||
```
|
||||
|
||||
<Info>
|
||||
Parameter explanations:
|
||||
- `temperature`: Controls randomness (0.0-1.0)
|
||||
- `timeout`: Maximum wait time for response
|
||||
- `max_tokens`: Limits response length
|
||||
- `top_p`: Alternative to temperature for sampling
|
||||
- `frequency_penalty`: Reduces word repetition
|
||||
- `presence_penalty`: Encourages new topics
|
||||
- `response_format`: Specifies output structure
|
||||
- `seed`: Ensures consistent outputs
|
||||
</Info>
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
## Advanced Features and Optimization
|
||||
|
||||
Learn how to get the most out of your LLM configuration:
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Context Window Management">
|
||||
CrewAI includes smart context management features:
|
||||
|
||||
```python
|
||||
from crewai import LLM
|
||||
|
||||
# CrewAI automatically handles:
|
||||
# 1. Token counting and tracking
|
||||
# 2. Content summarization when needed
|
||||
# 3. Task splitting for large contexts
|
||||
|
||||
llm = LLM(
|
||||
model="custom-model-name",
|
||||
api_key="your-api-key",
|
||||
base_url="https://api.your-provider.com/v1"
|
||||
model="gpt-4",
|
||||
max_tokens=4000, # Limit response length
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
## LLM Configuration Options
|
||||
<Info>
|
||||
Best practices for context management:
|
||||
1. Choose models with appropriate context windows
|
||||
2. Pre-process long inputs when possible
|
||||
3. Use chunking for large documents
|
||||
4. Monitor token usage to optimize costs
|
||||
</Info>
|
||||
</Accordion>
|
||||
|
||||
When configuring an LLM for your agent, you have access to a wide range of parameters:
|
||||
<Accordion title="Performance Optimization">
|
||||
<Steps>
|
||||
<Step title="Token Usage Optimization">
|
||||
Choose the right context window for your task:
|
||||
- Small tasks (up to 4K tokens): Standard models
|
||||
- Medium tasks (between 4K-32K): Enhanced models
|
||||
- Large tasks (over 32K): Large context models
|
||||
|
||||
```python
|
||||
# Configure model with appropriate settings
|
||||
llm = LLM(
|
||||
model="openai/gpt-4-turbo-preview",
|
||||
temperature=0.7, # Adjust based on task
|
||||
max_tokens=4096, # Set based on output needs
|
||||
timeout=300 # Longer timeout for complex tasks
|
||||
)
|
||||
```
|
||||
<Tip>
|
||||
- Lower temperature (0.1 to 0.3) for factual responses
|
||||
- Higher temperature (0.7 to 0.9) for creative tasks
|
||||
</Tip>
|
||||
</Step>
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|:------------------|:---------------:|:-------------------------------------------------------------------------------------------------|
|
||||
| **model** | `str` | Name of the model to use (e.g., "gpt-4", "gpt-3.5-turbo", "ollama/llama3.1"). For more options, visit the providers documentation. |
|
||||
| **timeout** | `float, int` | Maximum time (in seconds) to wait for a response. |
|
||||
| **temperature** | `float` | Controls randomness in output (0.0 to 1.0). |
|
||||
| **top_p** | `float` | Controls diversity of output (0.0 to 1.0). |
|
||||
| **n** | `int` | Number of completions to generate. |
|
||||
| **stop** | `str, List[str]` | Sequence(s) where generation should stop. |
|
||||
| **max_tokens** | `int` | Maximum number of tokens to generate. |
|
||||
| **presence_penalty** | `float` | Penalizes new tokens based on their presence in prior text. |
|
||||
| **frequency_penalty**| `float` | Penalizes new tokens based on their frequency in prior text. |
|
||||
| **logit_bias** | `Dict[int, float]`| Modifies likelihood of specified tokens appearing. |
|
||||
| **response_format** | `Dict[str, Any]` | Specifies the format of the response (e.g., JSON object). |
|
||||
| **seed** | `int` | Sets a random seed for deterministic results. |
|
||||
| **logprobs** | `bool` | Returns log probabilities of output tokens if enabled. |
|
||||
| **top_logprobs** | `int` | Number of most likely tokens for which to return log probabilities. |
|
||||
| **base_url** | `str` | The base URL for the API endpoint. |
|
||||
| **api_version** | `str` | Version of the API to use. |
|
||||
| **api_key** | `str` | Your API key for authentication. |
|
||||
<Step title="Best Practices">
|
||||
1. Monitor token usage
|
||||
2. Implement rate limiting
|
||||
3. Use caching when possible
|
||||
4. Set appropriate max_tokens limits
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
<Info>
|
||||
Remember to regularly monitor your token usage and adjust your configuration as needed to optimize costs and performance.
|
||||
</Info>
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
These are examples of how to configure LLMs for your agent.
|
||||
## Provider Configuration Examples
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="OpenAI">
|
||||
<Accordion title="OpenAI">
|
||||
```python Code
|
||||
# Required
|
||||
OPENAI_API_KEY=sk-...
|
||||
|
||||
# Optional
|
||||
OPENAI_API_BASE=<custom-base-url>
|
||||
OPENAI_ORGANIZATION=<your-org-id>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
|
||||
@@ -211,193 +329,306 @@ These are examples of how to configure LLMs for your agent.
|
||||
frequency_penalty=0.1,
|
||||
presence_penalty=0.1,
|
||||
stop=["END"],
|
||||
seed=42,
|
||||
base_url="https://api.openai.com/v1",
|
||||
api_key="your-api-key-here"
|
||||
seed=42
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Cerebras">
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Anthropic">
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
ANTHROPIC_API_KEY=sk-ant-...
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="cerebras/llama-3.1-70b",
|
||||
api_key="your-api-key-here"
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Ollama (Local LLMs)">
|
||||
|
||||
CrewAI supports using Ollama for running open-source models locally:
|
||||
|
||||
1. Install Ollama: [ollama.ai](https://ollama.ai/)
|
||||
2. Run a model: `ollama run llama2`
|
||||
3. Configure agent:
|
||||
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
|
||||
agent = Agent(
|
||||
llm=LLM(
|
||||
model="ollama/llama3.1",
|
||||
base_url="http://localhost:11434"
|
||||
),
|
||||
...
|
||||
model="anthropic/claude-3-sonnet-20240229-v1:0",
|
||||
temperature=0.7
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Groq">
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Google">
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
GEMINI_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="groq/llama3-8b-8192",
|
||||
api_key="your-api-key-here"
|
||||
model="gemini/gemini-pro",
|
||||
temperature=0.7
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Anthropic">
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Azure">
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
# Required
|
||||
AZURE_API_KEY=<your-api-key>
|
||||
AZURE_API_BASE=<your-resource-url>
|
||||
AZURE_API_VERSION=<api-version>
|
||||
|
||||
# Optional
|
||||
AZURE_AD_TOKEN=<your-azure-ad-token>
|
||||
AZURE_API_TYPE=<your-azure-api-type>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="anthropic/claude-3-5-sonnet-20241022",
|
||||
api_key="your-api-key-here"
|
||||
model="azure/gpt-4",
|
||||
api_version="2023-05-15"
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Fireworks AI">
|
||||
<Accordion title="AWS Bedrock">
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
AWS_ACCESS_KEY_ID=<your-access-key>
|
||||
AWS_SECRET_ACCESS_KEY=<your-secret-key>
|
||||
AWS_DEFAULT_REGION=<your-region>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct",
|
||||
api_key="your-api-key-here"
|
||||
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0"
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Gemini">
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Mistral">
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
MISTRAL_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="gemini/gemini-1.5-pro-002",
|
||||
api_key="your-api-key-here"
|
||||
model="mistral/mistral-large-latest",
|
||||
temperature=0.7
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Perplexity AI (pplx-api)">
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Groq">
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
GROQ_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="llama-3.1-sonar-large-128k-online",
|
||||
base_url="https://api.perplexity.ai/",
|
||||
api_key="your-api-key-here"
|
||||
model="groq/llama-3.2-90b-text-preview",
|
||||
temperature=0.7
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="IBM watsonx.ai">
|
||||
You can use IBM Watson by seeting the following ENV vars:
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="IBM watsonx.ai">
|
||||
```python Code
|
||||
# Required
|
||||
WATSONX_URL=<your-url>
|
||||
WATSONX_APIKEY=<your-apikey>
|
||||
WATSONX_PROJECT_ID=<your-project-id>
|
||||
|
||||
# Optional
|
||||
WATSONX_TOKEN=<your-token>
|
||||
WATSONX_DEPLOYMENT_SPACE_ID=<your-space-id>
|
||||
```
|
||||
|
||||
You can then define your agents llms by updating the `agents.yml`
|
||||
|
||||
```yaml Code
|
||||
researcher:
|
||||
role: Research Specialist
|
||||
goal: Conduct comprehensive research and analysis to gather relevant information,
|
||||
synthesize findings, and produce well-documented insights.
|
||||
backstory: A dedicated research professional with years of experience in academic
|
||||
investigation, literature review, and data analysis, known for thorough and
|
||||
methodical approaches to complex research questions.
|
||||
verbose: true
|
||||
llm: watsonx/meta-llama/llama-3-1-70b-instruct
|
||||
```
|
||||
|
||||
You can also set up agents more dynamically as a base level LLM instance, like bellow:
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
|
||||
llm = LLM(
|
||||
model="watsonx/ibm/granite-13b-chat-v2",
|
||||
base_url="https://api.watsonx.ai/v1",
|
||||
api_key="your-api-key-here"
|
||||
model="watsonx/meta-llama/llama-3-1-70b-instruct",
|
||||
base_url="https://api.watsonx.ai/v1"
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Hugging Face">
|
||||
<Accordion title="Ollama (Local LLMs)">
|
||||
1. Install Ollama: [ollama.ai](https://ollama.ai/)
|
||||
2. Run a model: `ollama run llama2`
|
||||
3. Configure:
|
||||
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
llm = LLM(
|
||||
model="ollama/llama3:70b",
|
||||
base_url="http://localhost:11434"
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Fireworks AI">
|
||||
```python Code
|
||||
FIREWORKS_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct",
|
||||
temperature=0.7
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Perplexity AI">
|
||||
```python Code
|
||||
PERPLEXITY_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="llama-3.1-sonar-large-128k-online",
|
||||
base_url="https://api.perplexity.ai/"
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Hugging Face">
|
||||
```python Code
|
||||
HUGGINGFACE_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct",
|
||||
api_key="your-api-key-here",
|
||||
base_url="your_api_endpoint"
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
</Accordion>
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Nvidia NIM">
|
||||
```python Code
|
||||
NVIDIA_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="nvidia_nim/meta/llama3-70b-instruct",
|
||||
temperature=0.7
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="SambaNova">
|
||||
```python Code
|
||||
SAMBANOVA_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="sambanova/Meta-Llama-3.1-8B-Instruct",
|
||||
temperature=0.7
|
||||
)
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Cerebras">
|
||||
```python Code
|
||||
# Required
|
||||
CEREBRAS_API_KEY=<your-api-key>
|
||||
```
|
||||
|
||||
Example usage:
|
||||
```python Code
|
||||
llm = LLM(
|
||||
model="cerebras/llama3.1-70b",
|
||||
temperature=0.7,
|
||||
max_tokens=8192
|
||||
)
|
||||
```
|
||||
|
||||
<Info>
|
||||
Cerebras features:
|
||||
- Fast inference speeds
|
||||
- Competitive pricing
|
||||
- Good balance of speed and quality
|
||||
- Support for long context windows
|
||||
</Info>
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Changing the Base API URL
|
||||
## Common Issues and Solutions
|
||||
|
||||
You can change the base API URL for any LLM provider by setting the `base_url` parameter:
|
||||
<Tabs>
|
||||
<Tab title="Authentication">
|
||||
<Warning>
|
||||
Most authentication issues can be resolved by checking API key format and environment variable names.
|
||||
</Warning>
|
||||
|
||||
```bash
|
||||
# OpenAI
|
||||
OPENAI_API_KEY=sk-...
|
||||
|
||||
# Anthropic
|
||||
ANTHROPIC_API_KEY=sk-ant-...
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Model Names">
|
||||
<Check>
|
||||
Always include the provider prefix in model names
|
||||
</Check>
|
||||
|
||||
```python
|
||||
# Correct
|
||||
llm = LLM(model="openai/gpt-4")
|
||||
|
||||
# Incorrect
|
||||
llm = LLM(model="gpt-4")
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Context Length">
|
||||
<Tip>
|
||||
Use larger context models for extensive tasks
|
||||
</Tip>
|
||||
|
||||
```python
|
||||
# Large context model
|
||||
llm = LLM(model="openai/gpt-4o") # 128K tokens
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
```python Code
|
||||
from crewai import LLM
|
||||
## Getting Help
|
||||
|
||||
llm = LLM(
|
||||
model="custom-model-name",
|
||||
base_url="https://api.your-provider.com/v1",
|
||||
api_key="your-api-key"
|
||||
)
|
||||
agent = Agent(llm=llm, ...)
|
||||
```
|
||||
If you need assistance, these resources are available:
|
||||
|
||||
This is particularly useful when working with OpenAI-compatible APIs or when you need to specify a different endpoint for your chosen provider.
|
||||
<CardGroup cols={3}>
|
||||
<Card
|
||||
title="LiteLLM Documentation"
|
||||
href="https://docs.litellm.ai/docs/"
|
||||
icon="book"
|
||||
>
|
||||
Comprehensive documentation for LiteLLM integration and troubleshooting common issues.
|
||||
</Card>
|
||||
<Card
|
||||
title="GitHub Issues"
|
||||
href="https://github.com/joaomdmoura/crewAI/issues"
|
||||
icon="bug"
|
||||
>
|
||||
Report bugs, request features, or browse existing issues for solutions.
|
||||
</Card>
|
||||
<Card
|
||||
title="Community Forum"
|
||||
href="https://community.crewai.com"
|
||||
icon="comment-question"
|
||||
>
|
||||
Connect with other CrewAI users, share experiences, and get help from the community.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Choose the right model**: Balance capability and cost.
|
||||
2. **Optimize prompts**: Clear, concise instructions improve output.
|
||||
3. **Manage tokens**: Monitor and limit token usage for efficiency.
|
||||
4. **Use appropriate temperature**: Lower for factual tasks, higher for creative ones.
|
||||
5. **Implement error handling**: Gracefully manage API errors and rate limits.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **API Errors**: Check your API key, network connection, and rate limits.
|
||||
- **Unexpected Outputs**: Refine your prompts and adjust temperature or top_p.
|
||||
- **Performance Issues**: Consider using a more powerful model or optimizing your queries.
|
||||
- **Timeout Errors**: Increase the `timeout` parameter or optimize your input.
|
||||
<Note>
|
||||
Best Practices for API Key Security:
|
||||
- Use environment variables or secure vaults
|
||||
- Never commit keys to version control
|
||||
- Rotate keys regularly
|
||||
- Use separate keys for development and production
|
||||
- Monitor key usage for unusual patterns
|
||||
</Note>
|
||||
|
||||
Reference in New Issue
Block a user