From 2c011631f9baf4f3c3b39210ad2522ac660d55c9 Mon Sep 17 00:00:00 2001
From: Mark McDonald
Date: Fri, 9 May 2025 00:24:38 +0800
Subject: [PATCH] Clean up the Google setup section (#2785)

The Gemini & Vertex sections were conflated and a little hard to
distinguish, so I have put them in separate sections.

Also added the latest 2.5 and 2.0 flash-lite models, and added a note
that Gemma models work too.

Co-authored-by: Tony Kipkemboi
---
 docs/concepts/llms.mdx | 68 +++++++++++++++++++++++++++++++++---------
 1 file changed, 54 insertions(+), 14 deletions(-)

diff --git a/docs/concepts/llms.mdx b/docs/concepts/llms.mdx
index 643ebfe16..249a2c7e5 100644
--- a/docs/concepts/llms.mdx
+++ b/docs/concepts/llms.mdx
@@ -169,19 +169,55 @@ In this section, you'll find detailed examples that help you select, configure,
     ```
-
-    Set the following environment variables in your `.env` file:
+
+    Set your API key in your `.env` file. If you need a key, or need to find an
+    existing key, check [AI Studio](https://aistudio.google.com/apikey).
-    ```toml Code
-    # Option 1: Gemini accessed with an API key.
+    ```toml .env
     # https://ai.google.dev/gemini-api/docs/api-key
     GEMINI_API_KEY=
-
-    # Option 2: Vertex AI IAM credentials for Gemini, Anthropic, and Model Garden.
-    # https://cloud.google.com/vertex-ai/generative-ai/docs/overview
     ```
-    Get credentials from your Google Cloud Console and save it to a JSON file with the following code:
+    Example usage in your CrewAI project:
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="gemini/gemini-2.0-flash",
+        temperature=0.7,
+    )
+    ```
+
+    ### Gemini models
+
+    Google offers a range of powerful models optimized for different use cases.
+
+    | Model                          | Context Window | Best For                                                          |
+    |--------------------------------|----------------|-------------------------------------------------------------------|
+    | gemini-2.5-flash-preview-04-17 | 1M tokens      | Adaptive thinking, cost efficiency                                |
+    | gemini-2.5-pro-preview-05-06   | 1M tokens      | Enhanced thinking and reasoning, multimodal understanding, advanced coding, and more |
+    | gemini-2.0-flash               | 1M tokens      | Next generation features, speed, thinking, and realtime streaming |
+    | gemini-2.0-flash-lite          | 1M tokens      | Cost efficiency and low latency                                   |
+    | gemini-1.5-flash               | 1M tokens      | Balanced multimodal model, good for most tasks                    |
+    | gemini-1.5-flash-8B            | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
+    | gemini-1.5-pro                 | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
+
+    The full list of models is available in the [Gemini model docs](https://ai.google.dev/gemini-api/docs/models).
+
+    ### Gemma
+
+    The Gemini API also allows you to use your API key to access [Gemma models](https://ai.google.dev/gemma/docs) hosted on Google infrastructure.
+
+    | Model          | Context Window |
+    |----------------|----------------|
+    | gemma-3-1b-it  | 32k tokens     |
+    | gemma-3-4b-it  | 32k tokens     |
+    | gemma-3-12b-it | 32k tokens     |
+    | gemma-3-27b-it | 128k tokens    |
+
+
+
+    Get credentials from your Google Cloud Console and save them to a JSON file, then load it with the following code:
     ```python Code
     import json

     file_path = 'path/to/vertex_ai_service_account.json'

     # Load the JSON file
     with open(file_path, 'r') as file:
         vertex_credentials = json.load(file)

     # Convert the credentials to a JSON string
     vertex_credentials_json = json.dumps(vertex_credentials)
     ```

     Example usage in a CrewAI agent:
     ```python Code
     from crewai import LLM

     llm = LLM(
         model="gemini-1.5-pro-latest",  # or vertex_ai/gemini-1.5-pro-latest
         temperature=0.7,
         vertex_credentials=vertex_credentials_json
     )
     ```
+
     Google offers a range of powerful models optimized for different use cases:

-    | Model                 | Context Window | Best For                                                          |
-    |-----------------------|----------------|-------------------------------------------------------------------|
-    | gemini-2.0-flash-exp  | 1M tokens      | Higher quality at faster speed, multimodal model, good for most tasks |
-    | gemini-1.5-flash      | 1M tokens      | Balanced multimodal model, good for most tasks                    |
-    | gemini-1.5-flash-8B   | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
-    | gemini-1.5-pro        | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
+    | Model                          | Context Window | Best For                                                          |
+    |--------------------------------|----------------|-------------------------------------------------------------------|
+    | gemini-2.5-flash-preview-04-17 | 1M tokens      | Adaptive thinking, cost efficiency                                |
+    | gemini-2.5-pro-preview-05-06   | 1M tokens      | Enhanced thinking and reasoning, multimodal understanding, advanced coding, and more |
+    | gemini-2.0-flash               | 1M tokens      | Next generation features, speed, thinking, and realtime streaming |
+    | gemini-2.0-flash-lite          | 1M tokens      | Cost efficiency and low latency                                   |
+    | gemini-1.5-flash               | 1M tokens      | Balanced multimodal model, good for most tasks                    |
+    | gemini-1.5-flash-8B            | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
+    | gemini-1.5-pro                 | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
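
As a usage sketch only, not part of the patch above: the Gemma models listed in the new table could plausibly be selected through the same `LLM` class, assuming the `gemini/` provider prefix shown for the Gemini models also routes Gemma model IDs served by the Gemini API.

```python Code
from crewai import LLM

# Sketch under assumptions: GEMINI_API_KEY is set in the environment, and the
# gemini/ prefix accepts Gemma model IDs (e.g. gemma-3-27b-it) the same way it
# does Gemini ones. Verify the exact model string against the Gemini API docs.
llm = LLM(
    model="gemini/gemma-3-27b-it",
    temperature=0.7,
)
```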