Update docs (#1842)

* Update portkey docs
* Add more examples to Knowledge docs + clarify issue with `embedder`
* fix knowledge params and usage instructions
2026-01-07 07:08:31 +00:00 · 2025-01-02 16:10:31 -05:00
parent 4bcc3b532d
commit c1172a685a
4 changed files with 244 additions and 87 deletions
--- a/docs/concepts/flows.mdx
+++ b/docs/concepts/flows.mdx
@@ -138,7 +138,7 @@ print("---- Final Output ----")
 print(final_output)
 ````

-``` text Output
+```text Output
 ---- Final Output ----
 Second method received: Output from first_method
 ````
--- a/docs/concepts/knowledge.mdx
+++ b/docs/concepts/knowledge.mdx
@@ -4,8 +4,6 @@ description: What is knowledge in CrewAI and how to use it.
 icon: book
 ---

-# Using Knowledge in CrewAI
-
 ## What is Knowledge?

 Knowledge in CrewAI is a powerful system that allows AI agents to access and utilize external information sources during their tasks.
@@ -36,7 +34,20 @@ CrewAI supports various types of knowledge sources out of the box:
  </Card>
 </CardGroup>

-## Quick Start
+## Supported Knowledge Parameters
+
+| Parameter                    | Type                                | Required | Description                                                                                                                                           |
+| :--------------------------- | :---------------------------------- | :------- | :---------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `sources`                  | **List[BaseKnowledgeSource]**        | Yes      | List of knowledge sources that provide content to be stored and queried. Can include PDF, CSV, Excel, JSON, text files, or string content.           |
+| `collection_name`          | **str**                              | No       | Name of the collection where the knowledge will be stored. Used to identify different sets of knowledge. Defaults to "knowledge" if not provided.     |
+| `storage`                  | **Optional[KnowledgeStorage]**       | No       | Custom storage configuration for managing how the knowledge is stored and retrieved. If not provided, a default storage will be created.              |
+
+## Quickstart Example
+
+<Tip>
+For file-based knowledge sources, place your files in a `knowledge` directory at the root of your project.
+When creating the source, use paths relative to the `knowledge` directory.
+</Tip>

 Here's an example using string-based knowledge:

@@ -80,7 +91,8 @@ result = crew.kickoff(inputs={"question": "What city does John live in and how o
 ```


-Here's another example with the `CrewDoclingSource`
+Here's another example using `CrewDoclingSource`, which handles multiple file formats, including TXT, PDF, DOCX, and HTML.
+
 ```python Code
 from crewai import LLM, Agent, Crew, Process, Task
 from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
@@ -128,39 +140,192 @@ result = crew.kickoff(
 )
 ```

+## More Examples
+
+Here are examples of how to use different types of knowledge sources:
+
+### Text File Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
+# Create a text file knowledge source
+text_source = CrewDoclingSource(
+    file_paths=["document.txt", "another.txt"]
+)
+
+# Create knowledge with text file source
+knowledge = Knowledge(
+    collection_name="text_knowledge",
+    sources=[text_source]
+)
+```
+
+### PDF Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.pdf_knowledge_source import PDFKnowledgeSource
+# Create a PDF knowledge source
+pdf_source = PDFKnowledgeSource(
+    file_paths=["document.pdf", "another.pdf"]
+)
+
+# Create knowledge with PDF source
+knowledge = Knowledge(
+    collection_name="pdf_knowledge",
+    sources=[pdf_source]
+)
+```
+
+### CSV Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.csv_knowledge_source import CSVKnowledgeSource
+# Create a CSV knowledge source
+csv_source = CSVKnowledgeSource(
+    file_paths=["data.csv"]
+)
+
+# Create knowledge with CSV source
+knowledge = Knowledge(
+    collection_name="csv_knowledge",
+    sources=[csv_source]
+)
+```
+
+### Excel Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.excel_knowledge_source import ExcelKnowledgeSource
+# Create an Excel knowledge source
+excel_source = ExcelKnowledgeSource(
+    file_paths=["spreadsheet.xlsx"]
+)
+
+# Create knowledge with Excel source
+knowledge = Knowledge(
+    collection_name="excel_knowledge",
+    sources=[excel_source]
+)
+```
+
+### JSON Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.json_knowledge_source import JSONKnowledgeSource
+# Create a JSON knowledge source
+json_source = JSONKnowledgeSource(
+    file_paths=["data.json"]
+)
+
+# Create knowledge with JSON source
+knowledge = Knowledge(
+    collection_name="json_knowledge",
+    sources=[json_source]
+)
+```
+
 ## Knowledge Configuration

 ### Chunking Configuration

-Control how content is split for processing by setting the chunk size and overlap.
+Knowledge sources automatically chunk content for better processing. 
+You can configure chunking behavior in your knowledge sources:

-```python Code
-knowledge_source = StringKnowledgeSource(
-    content="Long content...",
-    chunk_size=4000,     # Characters per chunk (default)
-    chunk_overlap=200    # Overlap between chunks (default)
+```python
+from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
+
+source = StringKnowledgeSource(
+    content="Your content here",
+    chunk_size=4000,      # Maximum size of each chunk (default: 4000)
+    chunk_overlap=200     # Overlap between chunks (default: 200)
 )
 ```

-## Embedder Configuration
+The chunking configuration helps in:
+- Breaking down large documents into manageable pieces
+- Maintaining context through chunk overlap
+- Optimizing retrieval accuracy

-You can also configure the embedder for the knowledge store. This is useful if you want to use a different embedder for the knowledge store than the one used for the agents.
+### Embeddings Configuration

-```python Code
-...
+You can also configure the embedder for the knowledge store. 
+This is useful if you want to use a different embedder for the knowledge store than the one used for the agents.
+The `embedder` parameter supports several embedding providers, including:
+- `openai`: OpenAI's embedding models
+- `google`: Google's text embedding models
+- `azure`: Azure OpenAI embeddings
+- `ollama`: Local embeddings with Ollama
+- `vertexai`: Google Cloud VertexAI embeddings
+- `cohere`: Cohere's embedding models
+- `bedrock`: AWS Bedrock embeddings
+- `huggingface`: Hugging Face models
+- `watson`: IBM Watson embeddings
+
+Here's an example of how to configure the embedder for the knowledge store using Google's `text-embedding-004` model:
+<CodeGroup>
+```python Example
+from crewai import Agent, Task, Crew, Process, LLM
+from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
+import os
+
+# Get the GEMINI API key
+GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
+
+# Create a knowledge source
+content = "The user's name is John. He is 30 years old and lives in San Francisco."
 string_source = StringKnowledgeSource(
-    content="Users name is John. He is 30 years old and lives in San Francisco.",
+    content=content,
 )
+
+# Create an LLM with a temperature of 0 to make outputs as repeatable as possible
+gemini_llm = LLM(
+    model="gemini/gemini-1.5-pro-002",
+    api_key=GEMINI_API_KEY,
+    temperature=0,
+)
+
+# Create an agent with the knowledge store
+agent = Agent(
+    role="About User",
+    goal="You know everything about the user.",
+    backstory="""You are a master at understanding people and their preferences.""",
+    verbose=True,
+    allow_delegation=False,
+    llm=gemini_llm,
+)
+
+task = Task(
+    description="Answer the following questions about the user: {question}",
+    expected_output="An answer to the question.",
+    agent=agent,
+)
+
 crew = Crew(
-    ...
+    agents=[agent],
+    tasks=[task],
+    verbose=True,
+    process=Process.sequential,
    knowledge_sources=[string_source],
    embedder={
-        "provider": "openai",
-        "config": {"model": "text-embedding-3-small"},
-    },
+        "provider": "google",
+        "config": {
+            "model": "models/text-embedding-004",
+            "api_key": GEMINI_API_KEY,
+        }
+    }
 )
-```

+result = crew.kickoff(inputs={"question": "What city does John live in and how old is he?"})
+```
+```text Output
+# Agent: About User
+## Task: Answer the following questions about the user: What city does John live in and how old is he?
+
+# Agent: About User
+## Final Answer: 
+John is 30 years old and lives in San Francisco.
+```
+</CodeGroup>
 ## Clearing Knowledge

 If you need to clear the knowledge stored in CrewAI, you can use the `crewai reset-memories` command with the `--knowledge` option.
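
The chunking section in the diff above documents `chunk_size` and `chunk_overlap` without showing how the two interact. The following is a minimal sketch of fixed-size character chunking with overlap, assuming chunks are cut as sliding windows that advance by `chunk_size - chunk_overlap` characters. The `chunk_text` helper is hypothetical and written only for illustration; it is not taken from the CrewAI source and may differ from what the library does internally.

```python
def chunk_text(text: str, chunk_size: int = 4000, chunk_overlap: int = 200) -> list[str]:
    """Split text into character windows of at most chunk_size,
    where each window starts chunk_size - chunk_overlap after the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# Small values make the overlap visible: each chunk repeats the last
# character of the previous one.
chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij', 'j']
```

With the documented defaults (4000 and 200), each window would advance by 3800 characters under this assumption, so a larger overlap preserves more context across chunk boundaries at the cost of storing more duplicated text.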
--- a/docs/how-to/Portkey-Observability-and-Guardrails.md
+++ b/docs/how-to/Portkey-Observability-and-Guardrails.md
@@ -1,4 +1,9 @@
-# Portkey Integration with CrewAI
+---
+title: Portkey Observability and Guardrails
+description: How to use Portkey with CrewAI
+icon: key
+---
+
 <img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-CrewAI.png" alt="Portkey CrewAI Header Image" width="70%" />


@@ -10,74 +15,69 @@ Portkey adds 4 core production capabilities to any CrewAI agent:
 3. Full-stack tracing & cost, performance analytics
 4. Real-time guardrails to enforce behavior

-
-
-
-
 ## Getting Started

-1. **Install Required Packages:**
+<Steps>
+    <Step title="Install CrewAI and Portkey">
+    ```bash
+    pip install -qU crewai portkey-ai
+    ```
+    </Step>
+    <Step title="Configure the LLM Client">
+    To build CrewAI Agents with Portkey, you'll need two keys:
+    - **Portkey API Key**: Sign up on the [Portkey app](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) and copy your API key
+    - **Virtual Key**: Virtual Keys securely manage your LLM API keys in one place. Store your LLM provider API keys securely in Portkey's vault

-```bash
-pip install -qU crewai portkey-ai
-```
+    ```python
+    from crewai import LLM
+    from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

-2. **Configure the LLM Client:**
-
-To build CrewAI Agents with Portkey, you'll need two keys:
- **Portkey API Key**: Sign up on the [Portkey app](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) and copy your API key
- **Virtual Key**: Virtual Keys securely manage your LLM API keys in one place. Store your LLM provider API keys securely in Portkey's vault
-
-```python
-from crewai import LLM
-from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
-
-gpt_llm = LLM(
-    model="gpt-4",
-    base_url=PORTKEY_GATEWAY_URL,
-    api_key="dummy", # We are using Virtual key
-    extra_headers=createHeaders(
-        api_key="YOUR_PORTKEY_API_KEY",
-        virtual_key="YOUR_VIRTUAL_KEY", # Enter your Virtual key from Portkey
+    gpt_llm = LLM(
+        model="gpt-4",
+        base_url=PORTKEY_GATEWAY_URL,
+        api_key="dummy", # We are using Virtual key
+        extra_headers=createHeaders(
+            api_key="YOUR_PORTKEY_API_KEY",
+            virtual_key="YOUR_VIRTUAL_KEY", # Enter your Virtual key from Portkey
+        )
    )
-)
-```
+    ```
+    </Step>
+    <Step title="Create and Run Your First Agent">
+    ```python
+    from crewai import Agent, Task, Crew

-3. **Create and Run Your First Agent:**
+    # Define your agents with roles and goals
+    coder = Agent(
+        role='Software developer',
+        goal='Write clear, concise code on demand',
+        backstory='An expert coder with a keen eye for software trends.',
+        llm=gpt_llm
+    )

-```python
-from crewai import Agent, Task, Crew
+    # Create tasks for your agents
+    task1 = Task(
+        description="Define the HTML for making a simple website with heading- Hello World! Portkey is working!",
+        expected_output="A clear and concise HTML code",
+        agent=coder
+    )

-# Define your agents with roles and goals
-coder = Agent(
-    role='Software developer',
-    goal='Write clear, concise code on demand',
-    backstory='An expert coder with a keen eye for software trends.',
-    llm=gpt_llm
-)
-
-# Create tasks for your agents
-task1 = Task(
-    description="Define the HTML for making a simple website with heading- Hello World! Portkey is working!",
-    expected_output="A clear and concise HTML code",
-    agent=coder
-)
-
-# Instantiate your crew
-crew = Crew(
-    agents=[coder],
-    tasks=[task1],
-)
-
-result = crew.kickoff()
-print(result)
-```
+    # Instantiate your crew
+    crew = Crew(
+        agents=[coder],
+        tasks=[task1],
+    )

+    result = crew.kickoff()
+    print(result)
+    ```
+    </Step>
+</Steps>

 ## Key Features

 | Feature | Description |
-|---------|-------------|
+|:--------|:------------|
 | 🌐 Multi-LLM Support | Access OpenAI, Anthropic, Gemini, Azure, and 250+ providers through a unified interface |
 | 🛡️ Production Reliability | Implement retries, timeouts, load balancing, and fallbacks |
 | 📊 Advanced Observability | Track 40+ metrics including costs, tokens, latency, and custom metadata |
@@ -200,12 +200,3 @@ For detailed information on creating and managing Configs, visit the [Portkey do
 - [📊 Portkey Dashboard](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai)
 - [🐦 Twitter](https://twitter.com/portkeyai)
 - [💬 Discord Community](https://discord.gg/DD7vgKK299)
-
-
-
-
-
-
-
-
-
--- a/docs/mint.json
+++ b/docs/mint.json
@@ -100,7 +100,8 @@
        "how-to/conditional-tasks",
        "how-to/agentops-observability",
        "how-to/langtrace-observability",
-        "how-to/openlit-observability"
+        "how-to/openlit-observability",
+        "how-to/portkey-observability"
      ]
    },
    {