Merge branch 'main' into add-new-gradio-guide

2026-07-05 23:19:22 +00:00 · 2025-01-02 20:36:14 -03:00
parent b6d2cfea17 845951a0db
commit 21c3f948d7
36 changed files with 2081 additions and 837 deletions
--- a/docs/concepts/flows.mdx
+++ b/docs/concepts/flows.mdx
@@ -138,7 +138,7 @@ print("---- Final Output ----")
 print(final_output)
 ````

-``` text Output
+```text Output
 ---- Final Output ----
 Second method received: Output from first_method
 ````
--- a/docs/concepts/knowledge.mdx
+++ b/docs/concepts/knowledge.mdx
@@ -4,8 +4,6 @@ description: What is knowledge in CrewAI and how to use it.
 icon: book
 ---

-# Using Knowledge in CrewAI
-
 ## What is Knowledge?

 Knowledge in CrewAI is a powerful system that allows AI agents to access and utilize external information sources during their tasks.
@@ -36,7 +34,20 @@ CrewAI supports various types of knowledge sources out of the box:
  </Card>
 </CardGroup>

-## Quick Start
+## Supported Knowledge Parameters
+
+| Parameter                    | Type                                | Required | Description                                                                                                                                           |
+| :--------------------------- | :---------------------------------- | :------- | :---------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `sources`                  | **List[BaseKnowledgeSource]**        | Yes      | List of knowledge sources that provide content to be stored and queried. Can include PDF, CSV, Excel, JSON, text files, or string content.           |
+| `collection_name`          | **str**                              | No       | Name of the collection where the knowledge will be stored. Used to identify different sets of knowledge. Defaults to "knowledge" if not provided.     |
+| `storage`                  | **Optional[KnowledgeStorage]**       | No       | Custom storage configuration for managing how the knowledge is stored and retrieved. If not provided, a default storage will be created.              |
+
+## Quickstart Example
+
+<Tip>
+For file-based knowledge sources, place your files in a `knowledge` directory at the root of your project.
+Also, use paths relative to the `knowledge` directory when creating the source.
+</Tip>

 Here's an example using string-based knowledge:

@@ -80,7 +91,8 @@ result = crew.kickoff(inputs={"question": "What city does John live in and how o
 ```


-Here's another example with the `CrewDoclingSource`
+Here's another example with the `CrewDoclingSource`, which handles multiple file formats, including TXT, PDF, DOCX, and HTML, among others.
+
 ```python Code
 from crewai import LLM, Agent, Crew, Process, Task
 from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
@@ -128,39 +140,192 @@ result = crew.kickoff(
 )
 ```

+## More Examples
+
+Here are examples of how to use different types of knowledge sources:
+
+### Text File Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
+
+# Create a text file knowledge source
+text_source = CrewDoclingSource(
+    file_paths=["document.txt", "another.txt"]
+)
+
+# Create knowledge with text file source
+knowledge = Knowledge(
+    collection_name="text_knowledge",
+    sources=[text_source]
+)
+```
+
+### PDF Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.pdf_knowledge_source import PDFKnowledgeSource
+
+# Create a PDF knowledge source
+pdf_source = PDFKnowledgeSource(
+    file_paths=["document.pdf", "another.pdf"]
+)
+
+# Create knowledge with PDF source
+knowledge = Knowledge(
+    collection_name="pdf_knowledge",
+    sources=[pdf_source]
+)
+```
+
+### CSV Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.csv_knowledge_source import CSVKnowledgeSource
+
+# Create a CSV knowledge source
+csv_source = CSVKnowledgeSource(
+    file_paths=["data.csv"]
+)
+
+# Create knowledge with CSV source
+knowledge = Knowledge(
+    collection_name="csv_knowledge",
+    sources=[csv_source]
+)
+```
+
+### Excel Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.excel_knowledge_source import ExcelKnowledgeSource
+
+# Create an Excel knowledge source
+excel_source = ExcelKnowledgeSource(
+    file_paths=["spreadsheet.xlsx"]
+)
+
+# Create knowledge with Excel source
+knowledge = Knowledge(
+    collection_name="excel_knowledge",
+    sources=[excel_source]
+)
+```
+
+### JSON Knowledge Source
+```python
+from crewai.knowledge.knowledge import Knowledge
+from crewai.knowledge.source.json_knowledge_source import JSONKnowledgeSource
+
+# Create a JSON knowledge source
+json_source = JSONKnowledgeSource(
+    file_paths=["data.json"]
+)
+
+# Create knowledge with JSON source
+knowledge = Knowledge(
+    collection_name="json_knowledge",
+    sources=[json_source]
+)
+```
+
 ## Knowledge Configuration

 ### Chunking Configuration

-Control how content is split for processing by setting the chunk size and overlap.
+Knowledge sources automatically chunk content for better processing. 
+You can configure chunking behavior in your knowledge sources:

-```python Code
-knowledge_source = StringKnowledgeSource(
-    content="Long content...",
-    chunk_size=4000,     # Characters per chunk (default)
-    chunk_overlap=200    # Overlap between chunks (default)
+```python
+from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
+
+source = StringKnowledgeSource(
+    content="Your content here",
+    chunk_size=4000,      # Maximum characters per chunk (default: 4000)
+    chunk_overlap=200     # Overlap between chunks (default: 200)
 )
 ```

-## Embedder Configuration
+The chunking configuration helps in:
+- Breaking down large documents into manageable pieces
+- Maintaining context through chunk overlap
+- Optimizing retrieval accuracy

-You can also configure the embedder for the knowledge store. This is useful if you want to use a different embedder for the knowledge store than the one used for the agents.
+### Embeddings Configuration

-```python Code
-...
+You can also configure the embedder for the knowledge store.
+This is useful if you want the knowledge store to use a different embedder than the one used for the agents.
+The `embedder` parameter supports several embedding model providers, including:
+- `openai`: OpenAI's embedding models
+- `google`: Google's text embedding models
+- `azure`: Azure OpenAI embeddings
+- `ollama`: Local embeddings with Ollama
+- `vertexai`: Google Cloud VertexAI embeddings
+- `cohere`: Cohere's embedding models
+- `bedrock`: AWS Bedrock embeddings
+- `huggingface`: Hugging Face models
+- `watson`: IBM Watson embeddings
+
+Here's an example of how to configure the embedder for the knowledge store using Google's `text-embedding-004` model:
+<CodeGroup>
+```python Example
+from crewai import Agent, Task, Crew, Process, LLM
+from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
+import os
+
+# Read the Gemini API key from the environment
+GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
+
+# Create a knowledge source
+content = "The user's name is John. He is 30 years old and lives in San Francisco."
 string_source = StringKnowledgeSource(
-    content="Users name is John. He is 30 years old and lives in San Francisco.",
+    content=content,
 )
+
+# Create an LLM with a temperature of 0 for more deterministic outputs
+gemini_llm = LLM(
+    model="gemini/gemini-1.5-pro-002",
+    api_key=GEMINI_API_KEY,
+    temperature=0,
+)
+
+# Create an agent with the knowledge store
+agent = Agent(
+    role="About User",
+    goal="You know everything about the user.",
+    backstory="""You are a master at understanding people and their preferences.""",
+    verbose=True,
+    allow_delegation=False,
+    llm=gemini_llm,
+)
+
+task = Task(
+    description="Answer the following questions about the user: {question}",
+    expected_output="An answer to the question.",
+    agent=agent,
+)
+
 crew = Crew(
-    ...
+    agents=[agent],
+    tasks=[task],
+    verbose=True,
+    process=Process.sequential,
    knowledge_sources=[string_source],
    embedder={
-        "provider": "openai",
-        "config": {"model": "text-embedding-3-small"},
-    },
+        "provider": "google",
+        "config": {
+            "model": "models/text-embedding-004",
+            "api_key": GEMINI_API_KEY,
+        }
+    }
 )
-```

+result = crew.kickoff(inputs={"question": "What city does John live in and how old is he?"})
+```
+```text Output
+# Agent: About User
+## Task: Answer the following questions about the user: What city does John live in and how old is he?
+
+# Agent: About User
+## Final Answer: 
+John is 30 years old and lives in San Francisco.
+```
+</CodeGroup>
 ## Clearing Knowledge

 If you need to clear the knowledge stored in CrewAI, you can use the `crewai reset-memories` command with the `--knowledge` option.
--- a/docs/how-to/portkey-observability-and-guardrails.mdx
+++ b/docs/how-to/portkey-observability-and-guardrails.mdx
--- a/docs/how-to/portkey-observability.mdx
+++ b/docs/how-to/portkey-observability.mdx
@@ -0,0 +1,202 @@
+---
+title: Portkey Observability and Guardrails
+description: How to use Portkey with CrewAI
+icon: key
+---
+
+<img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-CrewAI.png" alt="Portkey CrewAI Header Image" width="70%" />
+
+
+[Portkey](https://portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) is a 2-line upgrade to make your CrewAI agents reliable, cost-efficient, and fast.
+
+Portkey adds 4 core production capabilities to any CrewAI agent:
+1. Routing to **250+ LLMs**
+2. Making each LLM call more robust
+3. Full-stack tracing plus cost and performance analytics
+4. Real-time guardrails to enforce behavior
+
+## Getting Started
+
+<Steps>
+    <Step title="Install CrewAI and Portkey">
+    ```bash
+    pip install -qU crewai portkey-ai
+    ```
+    </Step>
+    <Step title="Configure the LLM Client">
+    To build CrewAI Agents with Portkey, you'll need two keys:
+    - **Portkey API Key**: Sign up on the [Portkey app](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai) and copy your API key
+    - **Virtual Key**: Virtual Keys let you store your LLM provider API keys securely in Portkey's vault and manage them in one place
+
+    ```python
+    from crewai import LLM
+    from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
+
+    gpt_llm = LLM(
+        model="gpt-4",
+        base_url=PORTKEY_GATEWAY_URL,
+        api_key="dummy",  # Placeholder; the Virtual Key supplies the provider credentials
+        extra_headers=createHeaders(
+            api_key="YOUR_PORTKEY_API_KEY",
+            virtual_key="YOUR_VIRTUAL_KEY", # Enter your Virtual key from Portkey
+        )
+    )
+    ```
+    </Step>
+    <Step title="Create and Run Your First Agent">
+    ```python
+    from crewai import Agent, Task, Crew
+
+    # Define your agents with roles and goals
+    coder = Agent(
+        role='Software developer',
+        goal='Write clear, concise code on demand',
+        backstory='An expert coder with a keen eye for software trends.',
+        llm=gpt_llm
+    )
+
+    # Create tasks for your agents
+    task1 = Task(
+        description="Write the HTML for a simple website with the heading 'Hello World! Portkey is working!'",
+        expected_output="Clear and concise HTML code",
+        agent=coder
+    )
+
+    # Instantiate your crew
+    crew = Crew(
+        agents=[coder],
+        tasks=[task1],
+    )
+
+    result = crew.kickoff()
+    print(result)
+    ```
+    </Step>
+</Steps>
+
+## Key Features
+
+| Feature | Description |
+|:--------|:------------|
+| 🌐 Multi-LLM Support | Access OpenAI, Anthropic, Gemini, Azure, and 250+ providers through a unified interface |
+| 🛡️ Production Reliability | Implement retries, timeouts, load balancing, and fallbacks |
+| 📊 Advanced Observability | Track 40+ metrics including costs, tokens, latency, and custom metadata |
+| 🔍 Comprehensive Logging | Debug with detailed execution traces and function call logs |
+| 🚧 Security Controls | Set budget limits and implement role-based access control |
+| 🔄 Performance Analytics | Capture and analyze feedback for continuous improvement |
+| 💾 Intelligent Caching | Reduce costs and latency with semantic or simple caching |
+
+
+## Production Features with Portkey Configs
+
+All features described below are configured through Portkey's Config system, which lets you define routing strategies as simple JSON objects in your LLM API calls. You can create and manage Configs directly in your code or through the Portkey Dashboard. Each Config has a unique ID for easy reference.
+
+<Frame>
+    <img src="https://raw.githubusercontent.com/Portkey-AI/docs-core/refs/heads/main/images/libraries/libraries-3.avif"/>
+</Frame>
+
+
+### 1. Use 250+ LLMs
+Access various LLMs like Anthropic, Gemini, Mistral, Azure OpenAI, and more with minimal code changes. Switch between providers or use them together seamlessly. [Learn more about Universal API](https://portkey.ai/docs/product/ai-gateway/universal-api)
+
+
+Easily switch between different LLM providers:
+
+```python
+# Anthropic Configuration
+anthropic_llm = LLM(
+    model="claude-3-5-sonnet-latest",
+    base_url=PORTKEY_GATEWAY_URL,
+    api_key="dummy",
+    extra_headers=createHeaders(
+        api_key="YOUR_PORTKEY_API_KEY",
+        virtual_key="YOUR_ANTHROPIC_VIRTUAL_KEY",  # No provider needed when using Virtual Keys
+        trace_id="anthropic_agent"
+    )
+)
+
+# Azure OpenAI Configuration
+azure_llm = LLM(
+    model="gpt-4",
+    base_url=PORTKEY_GATEWAY_URL,
+    api_key="dummy",
+    extra_headers=createHeaders(
+        api_key="YOUR_PORTKEY_API_KEY",
+        virtual_key="YOUR_AZURE_VIRTUAL_KEY",  # No provider needed when using Virtual Keys
+        trace_id="azure_agent"
+    )
+)
+```
+
+
+### 2. Caching
+Improve response times and reduce costs with two powerful caching modes:
+- **Simple Cache**: Perfect for exact matches
+- **Semantic Cache**: Matches responses for requests that are semantically similar
+
+[Learn more about Caching](https://portkey.ai/docs/product/ai-gateway/cache-simple-and-semantic)
+
+```python
+config = {
+    "cache": {
+        "mode": "semantic",  # or "simple" for exact matching
+    }
+}
+```
+
+### 3. Production Reliability
+Portkey provides comprehensive reliability features:
+- **Automatic Retries**: Handle temporary failures gracefully
+- **Request Timeouts**: Prevent hanging operations
+- **Conditional Routing**: Route requests based on specific conditions
+- **Fallbacks**: Set up automatic provider failovers
+- **Load Balancing**: Distribute requests efficiently
+
+[Learn more about Reliability Features](https://portkey.ai/docs/product/ai-gateway/)
+
+
+
+### 4. Metrics
+
+Portkey automatically logs **40+ metrics** for your AI agents. Its customizable filters give you anything from a broad overview to granular insights into individual agent runs. Tracked metrics include:
+
+
+- Cost per agent interaction
+- Response times and latency
+- Token usage and efficiency
+- Success/failure rates
+- Cache hit rates
+
+<img src="https://github.com/siddharthsambharia-portkey/Portkey-Product-Images/blob/main/Portkey-Dashboard.png?raw=true" width="70%" alt="Portkey Dashboard" />
+
+### 5. Detailed Logging
+Logs are essential for understanding agent behavior, diagnosing issues, and improving performance. They provide a detailed record of agent activities and tool use, which is crucial for debugging and optimizing processes.
+
+
+Access a dedicated section to view records of agent executions, including parameters, outcomes, function calls, and errors. Filter logs based on multiple parameters such as trace ID, model, tokens used, and metadata.
+
+<details>
+  <summary><b>Traces</b></summary>
+  <img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-Traces.png" alt="Portkey Traces" width="70%" />
+</details>
+
+<details>
+  <summary><b>Logs</b></summary>
+  <img src="https://raw.githubusercontent.com/siddharthsambharia-portkey/Portkey-Product-Images/main/Portkey-Logs.png" alt="Portkey Logs" width="70%" />
+</details>
+
+### 6. Enterprise Security Features
+- Set budget limits and rate limits per Virtual Key (disposable API keys)
+- Implement role-based access control
+- Track system changes with audit logs
+- Configure data retention policies
+
+
+
+For detailed information on creating and managing Configs, visit the [Portkey documentation](https://docs.portkey.ai/product/ai-gateway/configs).
+
+## Resources
+
+- [📘 Portkey Documentation](https://docs.portkey.ai)
+- [📊 Portkey Dashboard](https://app.portkey.ai/?utm_source=crewai&utm_medium=crewai&utm_campaign=crewai)
+- [🐦 Twitter](https://twitter.com/portkeyai)
+- [💬 Discord Community](https://discord.gg/DD7vgKK299)
--- a/docs/mint.json
+++ b/docs/mint.json
@@ -100,7 +100,8 @@
        "how-to/conditional-tasks",
        "how-to/agentops-observability",
        "how-to/langtrace-observability",
-        "how-to/openlit-observability"
+        "how-to/openlit-observability",
+        "how-to/portkey-observability"
      ]
    },
    {