Removes model provider defaults from LLM Setup (#2766)

This removes any specific default model from the "Setting up your LLM" guide, but provides examples for the top three providers.

The section also conflated "model selection" with "model configuration" (configuration being provider-specific), so I've focused this first section on model selection alone, deferring the configuration to the "provider" section that follows.

Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>
commit 836e9fc545 (parent c3726092fd)
Author: Mark McDonald
Date: 2025-05-06 21:27:14 +08:00 (committed by GitHub)


@@ -27,23 +27,19 @@ Large Language Models (LLMs) are the core intelligence behind CrewAI agents. The
</Card>
</CardGroup>
-## Setting Up Your LLM
+## Setting up your LLM
-There are three ways to configure LLMs in CrewAI. Choose the method that best fits your workflow:
+There are different places in CrewAI code where you can specify the model to use. Once you specify the model you are using, you will need to provide the configuration (like an API key) for each of the model providers you use. See the [provider configuration examples](#provider-configuration-examples) section for your provider.
<Tabs>
<Tab title="1. Environment Variables">
-The simplest way to get started. Set these variables in your environment:
+The simplest way to get started. Set the model in your environment directly, through an `.env` file or in your app code. If you used `crewai create` to bootstrap your project, it will be set already.
-```bash
+```bash .env
-# Required: Your API key for authentication
-OPENAI_API_KEY=<your-api-key>
-# Optional: Default model selection
-OPENAI_MODEL_NAME=gpt-4o-mini # Default if not set
-# Optional: Organization ID (if applicable)
-OPENAI_ORGANIZATION_ID=<your-org-id>
+MODEL=model-id # e.g. gpt-4o, gemini-2.0-flash, claude-3-sonnet-...
+# Be sure to set your API keys here too. See the Provider
+# section below.
```
<Warning>
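The new `.env` example above deliberately leaves API keys to the provider section. For orientation, that configuration usually amounts to one key variable per provider sitting next to `MODEL` in the same file. The variable names below are the conventional ones for three common providers and are shown purely as an illustration; check the provider configuration examples for the exact names your provider expects.

```bash
MODEL=model-id  # model selection (placeholder, as in the example above)

# Model configuration is provider-specific: set only the key(s) you actually use.
OPENAI_API_KEY=<your-openai-key>
ANTHROPIC_API_KEY=<your-anthropic-key>
GEMINI_API_KEY=<your-gemini-key>
```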
@@ -53,13 +49,13 @@ There are three ways to configure LLMs in CrewAI. Choose the method that best fi
<Tab title="2. YAML Configuration">
Create a YAML file to define your agent configurations. This method is great for version control and team collaboration:
-```yaml
+```yaml agents.yaml {6}
researcher:
  role: Research Specialist
  goal: Conduct comprehensive research and analysis
  backstory: A dedicated research professional with years of experience
  verbose: true
-  llm: openai/gpt-4o-mini # your model here
+  llm: provider/model-id # e.g. openai/gpt-4o, google/gemini-2.0-flash, anthropic/claude...
  # (see provider configuration examples below for more)
```
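For readers wondering where that `llm:` value is consumed: in a project scaffolded with `crewai create`, the YAML is loaded through the `@CrewBase` wiring, roughly as sketched below (class and file names are illustrative, not part of this change).

```python
from crewai import Agent
from crewai.project import CrewBase, agent

@CrewBase
class ResearchCrew:
    # Points at the YAML above; the llm: field there selects this agent's model.
    agents_config = "config/agents.yaml"

    @agent
    def researcher(self) -> Agent:
        return Agent(config=self.agents_config["researcher"])  # type: ignore[index]
```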
@@ -74,23 +70,23 @@ There are three ways to configure LLMs in CrewAI. Choose the method that best fi
<Tab title="3. Direct Code">
For maximum flexibility, configure LLMs directly in your Python code:
-```python
+```python {4,8}
from crewai import LLM
# Basic configuration
-llm = LLM(model="gpt-4")
+llm = LLM(model="model-id-here") # gpt-4o, gemini-2.0-flash, anthropic/claude...
# Advanced configuration with detailed parameters
llm = LLM(
-    model="gpt-4o-mini",
+    model="model-id-here", # gpt-4o, gemini-2.0-flash, anthropic/claude...
    temperature=0.7,        # Higher for more creative outputs
    timeout=120,            # Seconds to wait for response
    max_tokens=4000,        # Maximum length of response
    top_p=0.9,              # Nucleus sampling parameter
    frequency_penalty=0.1,  # Reduce repetition
    presence_penalty=0.1,   # Encourage topic diversity
    response_format={"type": "json"},  # For structured outputs
    seed=42                 # For reproducible results
)
```
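Whichever form is used, the resulting `LLM` object is then attached to an agent (or set wherever an agent's `llm` is expected); a minimal sketch with placeholder text:

```python
from crewai import Agent, LLM

llm = LLM(model="model-id-here")  # same placeholder convention as above

researcher = Agent(
    role="Research Specialist",
    goal="Conduct comprehensive research and analysis",
    backstory="A dedicated research professional with years of experience",
    llm=llm,  # this agent now uses the model selected above
)
```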
@@ -110,7 +106,6 @@ There are three ways to configure LLMs in CrewAI. Choose the method that best fi
## Provider Configuration Examples
CrewAI supports a multitude of LLM providers, each offering unique features, authentication methods, and model capabilities.
In this section, you'll find detailed examples that help you select, configure, and optimize the LLM that best fits your project's needs.
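As a taste of the shape those examples take, a provider's key is read from its conventional environment variable and the model id carries the provider prefix. The Anthropic names below are an assumption used purely for illustration; the section itself covers each provider's details.

```python
import os
from crewai import LLM

# Provider-specific configuration: the key is read from the environment.
os.environ["ANTHROPIC_API_KEY"] = "<your-anthropic-key>"

# Model selection: a provider-prefixed model id (placeholder, not a default).
claude_llm = LLM(model="anthropic/claude-3-5-sonnet-20241022", temperature=0.7)
```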
@@ -407,19 +402,19 @@ In this section, you'll find detailed examples that help you select, configure,
</Accordion>
<Accordion title="Local NVIDIA NIM Deployed using WSL2">
NVIDIA NIM enables you to run powerful LLMs locally on your Windows machine using WSL2 (Windows Subsystem for Linux).
This approach allows you to leverage your NVIDIA GPU for private, secure, and cost-effective AI inference without relying on cloud services.
Perfect for development, testing, or production scenarios where data privacy or offline capabilities are required.
Here is a step-by-step guide to setting up a local NVIDIA NIM model:
1. Follow installation instructions from [NVIDIA Website](https://docs.nvidia.com/nim/wsl2/latest/getting-started.html)
2. Install the local model. For Llama 3.1-8b follow [instructions](https://build.nvidia.com/meta/llama-3_1-8b-instruct/deploy)
3. Configure your crewai local models:
```python Code
from crewai.llm import LLM
@@ -441,7 +436,7 @@ In this section, you'll find detailed examples that help you select, configure,
        config=self.agents_config['researcher'], # type: ignore[index]
        llm=local_nvidia_nim_llm
    )
    # ...
```
</Accordion>
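The construction of `local_nvidia_nim_llm` itself falls between the two hunks shown above. As a rough sketch of what step 3 typically looks like, under the assumption of a default local deployment (endpoint URL, model id, and key handling all depend on your setup; follow the NVIDIA instructions linked earlier): a local NIM exposes an OpenAI-compatible endpoint, so it is addressed through the `openai/` prefix with a `base_url` override.

```python
from crewai.llm import LLM

# Assumed values for a default local NIM deployment of Llama 3.1-8b.
local_nvidia_nim_llm = LLM(
    model="openai/meta/llama-3.1-8b-instruct",  # served via the OpenAI-compatible API
    base_url="http://localhost:8000/v1",        # assumed default local NIM endpoint
    api_key="not-needed-for-local-use",         # local deployments may not check the key
)
```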
@@ -637,19 +632,19 @@ CrewAI supports streaming responses from LLMs, allowing your application to rece
When streaming is enabled, responses are delivered in chunks as they're generated, creating a more responsive user experience.
</Tab>
<Tab title="Event Handling">
CrewAI emits events for each chunk received during streaming:
```python
from crewai import LLM
from crewai.utilities.events import EventHandler, LLMStreamChunkEvent

class MyEventHandler(EventHandler):
    def on_llm_stream_chunk(self, event: LLMStreamChunkEvent):
        # Process each chunk as it arrives
        print(f"Received chunk: {event.chunk}")

# Register the event handler
from crewai.utilities.events import crewai_event_bus
crewai_event_bus.register_handler(MyEventHandler())
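The handler above only fires once streaming is switched on for the LLM in use; a minimal sketch, assuming the current `LLM(stream=True)` flag and the same placeholder model id as elsewhere:

```python
from crewai import LLM

# Enable streaming so responses arrive as chunks and the events above are emitted.
streaming_llm = LLM(
    model="model-id-here",
    stream=True,
)
```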
@@ -785,7 +780,7 @@ Learn how to get the most out of your LLM configuration:
<Tip>
Use larger context models for extensive tasks
</Tip>
```python
# Large context model
llm = LLM(model="openai/gpt-4o")  # 128K tokens