mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-09 08:08:32 +00:00
Add documentation for Local NVIDIA NIM with WSL2
@@ -270,7 +270,7 @@ In this section, you'll find detailed examples that help you select, configure,
| Claude 3.5 Haiku | Up to 200k tokens | Fast, compact multimodal model optimized for quick responses and seamless human-like interactions |
| Claude 3 Sonnet | Up to 200k tokens | Multimodal model balancing intelligence and speed for high-volume deployments. |
| Claude 3 Haiku | Up to 200k tokens | Compact, high-speed multimodal model optimized for quick responses and natural conversational interactions |
| Claude 3 Opus | Up to 200k tokens | Most advanced multimodal model excelling at complex tasks with human-like reasoning and superior contextual understanding. |
| Claude 2.1 | Up to 200k tokens | Enhanced version with expanded context window, improved reliability, and reduced hallucinations for long-form and RAG applications |
| Claude | Up to 100k tokens | Versatile model excelling in sophisticated dialogue, creative content, and precise instruction following. |
| Claude Instant | Up to 100k tokens | Fast, cost-effective model for everyday tasks like dialogue, analysis, summarization, and document Q&A |
@@ -406,6 +406,46 @@ In this section, you'll find detailed examples that help you select, configure,
| baichuan-inc/baichuan2-13b-chat | 4,096 tokens | Supports Chinese and English chat, coding, math, instruction following, and solving quizzes |
</Accordion>

<Accordion title="Local NVIDIA NIM Deployed using WSL2">
NVIDIA NIM enables you to run powerful LLMs locally on your Windows machine using WSL2 (Windows Subsystem for Linux). This approach allows you to leverage your NVIDIA GPU for private, secure, and cost-effective AI inference without relying on cloud services. It is well suited to development, testing, or production scenarios where data privacy or offline capability is required.
Here is a step-by-step guide to setting up a local NVIDIA NIM model:

1. Follow the installation instructions on the [NVIDIA website](https://docs.nvidia.com/nim/wsl2/latest/getting-started.html).
2. Install the local model. For Llama 3.1-8b, follow these [instructions](https://build.nvidia.com/meta/llama-3_1-8b-instruct/deploy).
3. Configure your CrewAI local models:

```python Code
from crewai import Agent
from crewai.llm import LLM
from crewai.project import CrewBase, agent

local_nvidia_nim_llm = LLM(
    model="openai/meta/llama-3.1-8b-instruct",  # it's an OpenAI-API-compatible model
    base_url="http://localhost:8000/v1",
    api_key="<your_api_key|any text if you have not configured it>",  # api_key is required, but you can use any text
)

# Then you can use it in your crew:
@CrewBase
class MyCrew():
    # ...

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['researcher'],
            llm=local_nvidia_nim_llm
        )

    # ...
```
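Since the NIM server exposes an OpenAI-compatible API, you can sanity-check the endpoint before wiring it into a crew. Below is a minimal sketch using only the Python standard library; the helper name, prompt, and model are illustrative, and the base URL assumes the default deployment from the steps above:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # adjust to match your NIM deployment

def chat_completion_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the local NIM server."""
    payload = {
        "model": "meta/llama-3.1-8b-instruct",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_completion_request("Say hello in one word.")
# Send the request only once the NIM container is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```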
</Accordion>

<Accordion title="Groq">
Set the following environment variables in your `.env` file:
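For example, a minimal `.env` sketch (`GROQ_API_KEY` is the variable the Groq integration reads; the model name here is only an example, so substitute any model Groq currently serves):

```sh
MODEL=groq/llama-3.1-70b-versatile
GROQ_API_KEY=<your-groq-api-key>
```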