Clean up the Google setup section (#2785)

The Gemini & Vertex sections were conflated and a little hard to distinguish, so I have put them in separate sections. Also added the latest 2.5 and 2.0 flash-lite models, and added a note that Gemma models work too.

Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>
2025-12-16 04:18:35 +00:00 · 2025-05-09 00:24:38 +08:00
parent d3fc2b4477
commit 2c011631f9
1 changed file with 54 additions and 14 deletions
--- a/docs/concepts/llms.mdx
+++ b/docs/concepts/llms.mdx
@@ -169,19 +169,55 @@ In this section, you'll find detailed examples that help you select, configure,
    ```
  </Accordion>

-  <Accordion title="Google">
-    Set the following environment variables in your `.env` file:
+  <Accordion title="Google (Gemini API)">
+    Set your API key in your `.env` file. If you need a key, or need to find an
+    existing key, check [AI Studio](https://aistudio.google.com/apikey).

-    ```toml Code
-    # Option 1: Gemini accessed with an API key.
+    ```toml .env
    # https://ai.google.dev/gemini-api/docs/api-key
    GEMINI_API_KEY=<your-api-key>
-
-    # Option 2: Vertex AI IAM credentials for Gemini, Anthropic, and Model Garden.
-    # https://cloud.google.com/vertex-ai/generative-ai/docs/overview
    ```

-    Get credentials from your Google Cloud Console and save it to a JSON file with the following code:
+    Example usage in your CrewAI project:
+    ```python Code
+    from crewai import LLM
+
+    llm = LLM(
+        model="gemini/gemini-2.0-flash",
+        temperature=0.7,
+    )
+    ```
+
+    ### Gemini models
+
+    Google offers a range of powerful models optimized for different use cases.
+
+    | Model                          | Context Window | Best For                                                          |
+    |--------------------------------|----------------|-------------------------------------------------------------------|
+    | gemini-2.5-flash-preview-04-17 | 1M tokens      | Adaptive thinking, cost efficiency                                |
+    | gemini-2.5-pro-preview-05-06   | 1M tokens      | Enhanced thinking and reasoning, multimodal understanding, advanced coding, and more |
+    | gemini-2.0-flash               | 1M tokens      | Next generation features, speed, thinking, and realtime streaming |
+    | gemini-2.0-flash-lite          | 1M tokens      | Cost efficiency and low latency                                   |
+    | gemini-1.5-flash               | 1M tokens      | Balanced multimodal model, good for most tasks                    |
+    | gemini-1.5-flash-8B            | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
+    | gemini-1.5-pro                 | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
+
+    The full list of models is available in the [Gemini model docs](https://ai.google.dev/gemini-api/docs/models).
+
+    ### Gemma
+
+    The Gemini API also allows you to use your API key to access [Gemma models](https://ai.google.dev/gemma/docs) hosted on Google infrastructure.
+
+    | Model          | Context Window |
+    |----------------|----------------|
+    | gemma-3-1b-it  | 32k tokens     |
+    | gemma-3-4b-it  | 32k tokens     |
+    | gemma-3-12b-it | 32k tokens     |
+    | gemma-3-27b-it | 128k tokens    |
+
+  </Accordion>
+  <Accordion title="Google (Vertex AI)">
+    Get credentials from your Google Cloud Console and save them to a JSON file, then load them with the following code:
    ```python Code
    import json

@@ -205,14 +241,18 @@ In this section, you'll find detailed examples that help you select, configure,
        vertex_credentials=vertex_credentials_json
    )
    ```
+
    Google offers a range of powerful models optimized for different use cases:

-    | Model                  | Context Window | Best For                                                          |
-    |-----------------------|----------------|------------------------------------------------------------------|
-    | gemini-2.0-flash-exp  | 1M tokens      | Higher quality at faster speed, multimodal model, good for most tasks |
-    | gemini-1.5-flash      | 1M tokens      | Balanced multimodal model, good for most tasks                    |
-    | gemini-1.5-flash-8B   | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
-    | gemini-1.5-pro        | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
+    | Model                          | Context Window | Best For                                                          |
+    |--------------------------------|----------------|-------------------------------------------------------------------|
+    | gemini-2.5-flash-preview-04-17 | 1M tokens      | Adaptive thinking, cost efficiency                                |
+    | gemini-2.5-pro-preview-05-06   | 1M tokens      | Enhanced thinking and reasoning, multimodal understanding, advanced coding, and more |
+    | gemini-2.0-flash               | 1M tokens      | Next generation features, speed, thinking, and realtime streaming |
+    | gemini-2.0-flash-lite          | 1M tokens      | Cost efficiency and low latency                                   |
+    | gemini-1.5-flash               | 1M tokens      | Balanced multimodal model, good for most tasks                    |
+    | gemini-1.5-flash-8B            | 1M tokens      | Fastest, most cost-efficient, good for high-frequency tasks       |
+    | gemini-1.5-pro                 | 2M tokens      | Best performing, wide variety of reasoning tasks including logical reasoning, coding, and creative collaboration |
  </Accordion>

  <Accordion title="Azure">
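
Note on the Gemma section: the change lists the Gemma models but gives no usage snippet. A minimal sketch follows, assuming Gemma models take the same `gemini/` provider prefix as the `gemini/gemini-2.0-flash` example in the diff (this commit does not confirm that) and that `GEMINI_API_KEY` from `.env` has already been loaded into the process environment. The helper names `gemini_model_id` and `build_gemma_llm` are hypothetical and are not part of CrewAI.

```python
import os

# Provider prefix taken from the Gemini example in the diff ("gemini/gemini-2.0-flash").
# Assumption: Gemma models served by the Gemini API use the same prefix.
GEMINI_PREFIX = "gemini/"


def gemini_model_id(name: str) -> str:
    """Return `name` with the `gemini/` prefix, adding it only if it is missing."""
    return name if name.startswith(GEMINI_PREFIX) else GEMINI_PREFIX + name


def build_gemma_llm(name: str = "gemma-3-27b-it", temperature: float = 0.7):
    """Hypothetical helper: build a CrewAI LLM for a Gemma model via the Gemini API."""
    # Fail early with a clear message if the key from .env is not in the environment.
    if not os.environ.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is not set; add it to your .env file")
    # Imported lazily so gemini_model_id() can be used without crewai installed.
    from crewai import LLM

    return LLM(model=gemini_model_id(name), temperature=temperature)
```

With the key set, `build_gemma_llm("gemma-3-4b-it")` would request the smaller model from the Gemma table. Keeping the prefix logic in its own function means an already-prefixed name is passed through unchanged.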