Files
crewAI/docs/concepts/llms.mdx
2024-11-10 11:36:03 -03:00

401 lines
13 KiB
Plaintext

---
title: LLMs
description: Learn how to configure and optimize LLMs for your CrewAI projects.
icon: microchip-ai
---
# Large Language Models (LLMs) in CrewAI
Large Language Models (LLMs) are the backbone of intelligent agents in the CrewAI framework. This guide will help you understand, configure, and optimize LLM usage for your CrewAI projects.
## Key Concepts
- **LLM**: Large Language Model, the AI powering agent intelligence
- **Agent**: A CrewAI entity that uses an LLM to perform tasks
- **Provider**: A service that offers LLM capabilities (e.g., OpenAI, Anthropic, Ollama, [more providers](https://docs.litellm.ai/docs/providers))
## Configuring LLMs for Agents
CrewAI offers flexible options for setting up LLMs:
### 1. Default Configuration
By default, CrewAI uses the `gpt-4o-mini` model. It uses environment variables if no LLM is specified:
- `OPENAI_MODEL_NAME` (defaults to "gpt-4o-mini" if not set)
- `OPENAI_API_BASE`
- `OPENAI_API_KEY`
### 2. Updating YAML files
You can update the `agents.yml` file to refer to the LLM you want to use:
```yaml Code
researcher:
role: Research Specialist
goal: Conduct comprehensive research and analysis to gather relevant information,
synthesize findings, and produce well-documented insights.
backstory: A dedicated research professional with years of experience in academic
investigation, literature review, and data analysis, known for thorough and
methodical approaches to complex research questions.
verbose: true
llm: openai/gpt-4o
# llm: azure/gpt-4o-mini
# llm: gemini/gemini-pro
# llm: anthropic/claude-3-5-sonnet-20240620
# llm: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
# llm: mistral/mistral-large-latest
# llm: ollama/llama3:70b
# llm: groq/llama-3.2-90b-vision-preview
# llm: watsonx/meta-llama/llama-3-1-70b-instruct
# ...
```
Keep in mind that you will need to set certain ENV vars depending on the model you are
using to account for the credentials or set a custom LLM object like described below.
Here are some of the required ENV vars for some of the LLM integrations:
<AccordionGroup>
<Accordion title="OpenAI">
```python Code
OPENAI_API_KEY=<your-api-key>
OPENAI_API_BASE=<optional-custom-base-url>
OPENAI_MODEL_NAME=<openai-model-name>
OPENAI_ORGANIZATION=<your-org-id> # OPTIONAL
OPENAI_API_BASE=<openaiai-api-base> # OPTIONAL
```
</Accordion>
<Accordion title="Anthropic">
```python Code
ANTHROPIC_API_KEY=<your-api-key>
```
</Accordion>
<Accordion title="Google">
```python Code
GEMINI_API_KEY=<your-api-key>
```
</Accordion>
<Accordion title="Azure">
```python Code
AZURE_API_KEY=<your-api-key> # "my-azure-api-key"
AZURE_API_BASE=<your-resource-url> # "https://example-endpoint.openai.azure.com"
AZURE_API_VERSION=<api-version> # "2023-05-15"
AZURE_AD_TOKEN=<your-azure-ad-token> # Optional
AZURE_API_TYPE=<your-azure-api-type> # Optional
```
</Accordion>
<Accordion title="AWS Bedrock">
```python Code
AWS_ACCESS_KEY_ID=<your-access-key>
AWS_SECRET_ACCESS_KEY=<your-secret-key>
AWS_DEFAULT_REGION=<your-region>
```
</Accordion>
<Accordion title="Mistral">
```python Code
MISTRAL_API_KEY=<your-api-key>
```
</Accordion>
<Accordion title="Groq">
```python Code
GROQ_API_KEY=<your-api-key>
```
</Accordion>
<Accordion title="IBM watsonx.ai">
```python Code
WATSONX_URL=<your-url> # (required) Base URL of your WatsonX instance
WATSONX_APIKEY=<your-apikey> # (required) IBM cloud API key
WATSONX_TOKEN=<your-token> # (required) IAM auth token (alternative to APIKEY)
WATSONX_PROJECT_ID=<your-project-id> # (optional) Project ID of your WatsonX instance
WATSONX_DEPLOYMENT_SPACE_ID=<your-space-id> # (optional) ID of deployment space for deployed models
```
</Accordion>
</AccordionGroup>
### 3. Custom LLM Objects
Pass a custom LLM implementation or object from another library.
See below for examples.
<Tabs>
<Tab title="String Identifier">
```python Code
agent = Agent(llm="gpt-4o", ...)
```
</Tab>
<Tab title="LLM Instance">
```python Code
from crewai import LLM
llm = LLM(model="gpt-4", temperature=0.7)
agent = Agent(llm=llm, ...)
```
</Tab>
</Tabs>
## Connecting to OpenAI-Compatible LLMs
You can connect to OpenAI-compatible LLMs using either environment variables or by setting specific attributes on the LLM class:
<Tabs>
<Tab title="Using Environment Variables">
```python Code
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
os.environ["OPENAI_API_BASE"] = "https://api.your-provider.com/v1"
```
</Tab>
<Tab title="Using LLM Class Attributes">
```python Code
from crewai import LLM
llm = LLM(
model="custom-model-name",
api_key="your-api-key",
base_url="https://api.your-provider.com/v1"
)
agent = Agent(llm=llm, ...)
```
</Tab>
</Tabs>
## LLM Configuration Options
When configuring an LLM for your agent, you have access to a wide range of parameters:
| Parameter | Type | Description |
|:------------------|:---------------:|:-------------------------------------------------------------------------------------------------|
| **model** | `str` | Name of the model to use (e.g., "gpt-4", "gpt-3.5-turbo", "ollama/llama3.1"). For more options, visit the providers documentation. |
| **timeout** | `float, int` | Maximum time (in seconds) to wait for a response. |
| **temperature** | `float` | Controls randomness in output (0.0 to 1.0). |
| **top_p** | `float` | Controls diversity of output (0.0 to 1.0). |
| **n** | `int` | Number of completions to generate. |
| **stop** | `str, List[str]` | Sequence(s) where generation should stop. |
| **max_tokens** | `int` | Maximum number of tokens to generate. |
| **presence_penalty** | `float` | Penalizes new tokens based on their presence in prior text. |
| **frequency_penalty**| `float` | Penalizes new tokens based on their frequency in prior text. |
| **logit_bias** | `Dict[int, float]`| Modifies likelihood of specified tokens appearing. |
| **response_format** | `Dict[str, Any]` | Specifies the format of the response (e.g., JSON object). |
| **seed** | `int` | Sets a random seed for deterministic results. |
| **logprobs** | `bool` | Returns log probabilities of output tokens if enabled. |
| **top_logprobs** | `int` | Number of most likely tokens for which to return log probabilities. |
| **base_url** | `str` | The base URL for the API endpoint. |
| **api_version** | `str` | Version of the API to use. |
| **api_key** | `str` | Your API key for authentication. |
These are examples of how to configure LLMs for your agent.
<AccordionGroup>
<Accordion title="OpenAI">
```python Code
from crewai import LLM
llm = LLM(
model="gpt-4",
temperature=0.8,
max_tokens=150,
top_p=0.9,
frequency_penalty=0.1,
presence_penalty=0.1,
stop=["END"],
seed=42,
base_url="https://api.openai.com/v1",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="Cerebras">
```python Code
from crewai import LLM
llm = LLM(
model="cerebras/llama-3.1-70b",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="Ollama (Local LLMs)">
CrewAI supports using Ollama for running open-source models locally:
1. Install Ollama: [ollama.ai](https://ollama.ai/)
2. Run a model: `ollama run llama2`
3. Configure agent:
```python Code
from crewai import LLM
agent = Agent(
llm=LLM(
model="ollama/llama3.1",
base_url="http://localhost:11434"
),
...
)
```
</Accordion>
<Accordion title="Groq">
```python Code
from crewai import LLM
llm = LLM(
model="groq/llama3-8b-8192",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="Anthropic">
```python Code
from crewai import LLM
llm = LLM(
model="anthropic/claude-3-5-sonnet-20241022",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="Fireworks AI">
```python Code
from crewai import LLM
llm = LLM(
model="fireworks_ai/accounts/fireworks/models/llama-v3-70b-instruct",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="Gemini">
```python Code
from crewai import LLM
llm = LLM(
model="gemini/gemini-1.5-pro-002",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="Perplexity AI (pplx-api)">
```python Code
from crewai import LLM
llm = LLM(
model="perplexity/mistral-7b-instruct",
base_url="https://api.perplexity.ai/v1",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="IBM watsonx.ai">
You can use IBM Watson by seeting the following ENV vars:
```python Code
WATSONX_URL=<your-url>
WATSONX_APIKEY=<your-apikey>
WATSONX_PROJECT_ID=<your-project-id>
```
You can then define your agents llms by updating the `agents.yml`
```yaml Code
researcher:
role: Research Specialist
goal: Conduct comprehensive research and analysis to gather relevant information,
synthesize findings, and produce well-documented insights.
backstory: A dedicated research professional with years of experience in academic
investigation, literature review, and data analysis, known for thorough and
methodical approaches to complex research questions.
verbose: true
llm: watsonx/meta-llama/llama-3-1-70b-instruct
```
You can also set up agents more dynamically as a base level LLM instance, like bellow:
```python Code
from crewai import LLM
llm = LLM(
model="watsonx/ibm/granite-13b-chat-v2",
base_url="https://api.watsonx.ai/v1",
api_key="your-api-key-here"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
<Accordion title="Hugging Face">
```python Code
from crewai import LLM
llm = LLM(
model="huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct",
api_key="your-api-key-here",
base_url="your_api_endpoint"
)
agent = Agent(llm=llm, ...)
```
</Accordion>
</AccordionGroup>
## Changing the Base API URL
You can change the base API URL for any LLM provider by setting the `base_url` parameter:
```python Code
from crewai import LLM
llm = LLM(
model="custom-model-name",
base_url="https://api.your-provider.com/v1",
api_key="your-api-key"
)
agent = Agent(llm=llm, ...)
```
This is particularly useful when working with OpenAI-compatible APIs or when you need to specify a different endpoint for your chosen provider.
## Best Practices
1. **Choose the right model**: Balance capability and cost.
2. **Optimize prompts**: Clear, concise instructions improve output.
3. **Manage tokens**: Monitor and limit token usage for efficiency.
4. **Use appropriate temperature**: Lower for factual tasks, higher for creative ones.
5. **Implement error handling**: Gracefully manage API errors and rate limits.
## Troubleshooting
- **API Errors**: Check your API key, network connection, and rate limits.
- **Unexpected Outputs**: Refine your prompts and adjust temperature or top_p.
- **Performance Issues**: Consider using a more powerful model or optimizing your queries.
- **Timeout Errors**: Increase the `timeout` parameter or optimize your input.