Add documentation for Local NVIDIA NIM with WSL2

Tony Kipkemboi
2025-03-20 12:39:37 -07:00
parent df266bda01
commit b2c8779f4c


@@ -270,7 +270,7 @@ In this section, you'll find detailed examples that help you select, configure,
| Claude 3.5 Haiku | Up to 200k tokens | Fast, compact multimodal model optimized for quick responses and seamless human-like interactions |
| Claude 3 Sonnet | Up to 200k tokens | Multimodal model balancing intelligence and speed for high-volume deployments. |
| Claude 3 Haiku | Up to 200k tokens | Compact, high-speed multimodal model optimized for quick responses and natural conversational interactions |
| Claude 3 Opus | Up to 200k tokens | Most advanced multimodal model excelling at complex tasks with human-like reasoning and superior contextual understanding. |
| Claude 2.1 | Up to 200k tokens | Enhanced version with expanded context window, improved reliability, and reduced hallucinations for long-form and RAG applications |
| Claude | Up to 100k tokens | Versatile model excelling in sophisticated dialogue, creative content, and precise instruction following. |
| Claude Instant | Up to 100k tokens | Fast, cost-effective model for everyday tasks like dialogue, analysis, summarization, and document Q&A |
@@ -406,6 +406,46 @@ In this section, you'll find detailed examples that help you select, configure,
| baichuan-inc/baichuan2-13b-chat | 4,096 tokens | Support Chinese and English chat, coding, math, instruction following, solving quizzes |
</Accordion>
<Accordion title="Local NVIDIA NIM Deployed using WSL2">
NVIDIA NIM enables you to run powerful LLMs locally on your Windows machine using WSL2 (Windows Subsystem for Linux).
This approach allows you to leverage your NVIDIA GPU for private, secure, and cost-effective AI inference without relying on cloud services.
Perfect for development, testing, or production scenarios where data privacy or offline capabilities are required.
Here is a step-by-step guide to setting up a local NVIDIA NIM model:
1. Follow the installation instructions on the [NVIDIA Website](https://docs.nvidia.com/nim/wsl2/latest/getting-started.html)
2. Install the local model. For Llama 3.1-8b, follow these [instructions](https://build.nvidia.com/meta/llama-3_1-8b-instruct/deploy)
3. Configure your CrewAI local model:
```python Code
from crewai import Agent, LLM
from crewai.project import CrewBase, agent

# NIM serves an OpenAI-compatible API, so route it through the openai/ provider prefix
local_nvidia_nim_llm = LLM(
    model="openai/meta/llama-3.1-8b-instruct",
    base_url="http://localhost:8000/v1",
    api_key="<your_api_key|any text if you have not configured it>",  # api_key is required, but any text works if you have not configured one
)

# Then you can use it in your crew:
@CrewBase
class MyCrew:
    # ...
    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['researcher'],
            llm=local_nvidia_nim_llm
        )
    # ...
```
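Before wiring the model into a crew, you may want to confirm that the NIM endpoint is actually serving. The snippet below is a minimal sanity check, not part of the official setup; it assumes the default NIM port (8000) and the Llama 3.1-8b deployment from step 2. Note that the raw endpoint expects the bare model name (`meta/llama-3.1-8b-instruct`); the `openai/` prefix in the configuration above is only a provider-routing hint and is stripped before the request reaches the server.
```python Code
# Minimal sanity check for the local NIM endpoint (assumes default port 8000)
import requests

response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={"Authorization": "Bearer any-text"},  # any token works if no API key was configured
    json={
        "model": "meta/llama-3.1-8b-instruct",  # bare model name, no openai/ prefix here
        "messages": [{"role": "user", "content": "Reply with OK if you can read this."}],
        "max_tokens": 16,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```
If this prints a completion, the deployment is healthy and the `LLM` configuration above should work as-is.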
</Accordion>
<Accordion title="Groq">
Set the following environment variables in your `.env` file: