Merge branch 'main' into feat/doc_structured

Brandon Hancock (bhancock_ai)
2024-12-09 11:31:31 -05:00
committed by GitHub
8 changed files with 82 additions and 42 deletions


@@ -8,15 +8,13 @@ icon: book
## What is Knowledge?
Knowledge in CrewAI is a powerful system that allows AI agents to access and utilize external information sources during their tasks.
Think of it as giving your agents a reference library they can consult while working.
Knowledge in CrewAI is a powerful system that allows AI agents to access and utilize external information sources during their tasks.
Think of it as giving your agents a reference library they can consult while working.
<Info>
Key benefits of using Knowledge:
- Enhance agents with domain-specific information
- Support decisions with real-world data
- Maintain context across conversations
- Ground responses in factual information
Key benefits of using Knowledge: - Enhance agents with domain-specific
information - Support decisions with real-world data - Maintain context across
conversations - Ground responses in factual information
</Info>
## Supported Knowledge Sources
@@ -25,14 +23,10 @@ CrewAI supports various types of knowledge sources out of the box:
<CardGroup cols={2}>
<Card title="Text Sources" icon="text">
- Raw strings
- Text files (.txt)
- PDF documents
- Raw strings - Text files (.txt) - PDF documents
</Card>
<Card title="Structured Data" icon="table">
- CSV files
- Excel spreadsheets
- JSON documents
- CSV files - Excel spreadsheets - JSON documents
</Card>
</CardGroup>
@@ -47,7 +41,7 @@ from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSourc
# Create a knowledge source
content = "Users name is John. He is 30 years old and lives in San Francisco."
string_source = StringKnowledgeSource(
content=content,
content=content,
)
# Create an LLM with a temperature of 0 to ensure deterministic outputs
@@ -122,7 +116,6 @@ crewai reset-memories --knowledge
This is useful when you've updated your knowledge sources and want to ensure that the agents are using the most recent information.
## Custom Knowledge Sources
CrewAI allows you to create custom knowledge sources for any type of data by extending the `BaseKnowledgeSource` class. Let's create a practical example that fetches and processes space news articles.
@@ -141,10 +134,10 @@ from pydantic import BaseModel, Field
class SpaceNewsKnowledgeSource(BaseKnowledgeSource):
    """Knowledge source that fetches data from Space News API."""
    api_endpoint: str = Field(description="API endpoint URL")
    limit: int = Field(default=10, description="Number of articles to fetch")
    def load_content(self) -> Dict[Any, str]:
        """Fetch and format space news articles."""
        try:
@@ -152,15 +145,15 @@ class SpaceNewsKnowledgeSource(BaseKnowledgeSource):
                f"{self.api_endpoint}?limit={self.limit}"
            )
            response.raise_for_status()
            data = response.json()
            articles = data.get('results', [])
            formatted_data = self._format_articles(articles)
            return {self.api_endpoint: formatted_data}
        except Exception as e:
            raise ValueError(f"Failed to fetch space news: {str(e)}")
    def _format_articles(self, articles: list) -> str:
        """Format articles into readable text."""
        formatted = "Space News Articles:\n\n"
@@ -180,7 +173,7 @@ class SpaceNewsKnowledgeSource(BaseKnowledgeSource):
        for _, text in content.items():
            chunks = self._chunk_text(text)
            self.chunks.extend(chunks)
        self._save_documents()
# Create knowledge source
@@ -193,7 +186,7 @@ recent_news = SpaceNewsKnowledgeSource(
space_analyst = Agent(
role="Space News Analyst",
goal="Answer questions about space news accurately and comprehensively",
backstory="""You are a space industry analyst with expertise in space exploration,
backstory="""You are a space industry analyst with expertise in space exploration,
satellite technology, and space industry trends. You excel at answering questions
about space news and providing detailed, accurate information.""",
knowledge_sources=[recent_news],
@@ -220,13 +213,14 @@ result = crew.kickoff(
inputs={"user_question": "What are the latest developments in space exploration?"}
)
```
```output Output
# Agent: Space News Analyst
## Task: Answer this question about space news: What are the latest developments in space exploration?
# Agent: Space News Analyst
## Final Answer:
## Final Answer:
The latest developments in space exploration, based on recent space news articles, include the following:
1. SpaceX has received the final regulatory approvals to proceed with the second integrated Starship/Super Heavy launch, scheduled for as soon as the morning of Nov. 17, 2023. This is a significant step in SpaceX's ambitious plans for space exploration and colonization. [Source: SpaceNews](https://spacenews.com/starship-cleared-for-nov-17-launch/)
@@ -242,11 +236,13 @@ The latest developments in space exploration, based on recent space news article
6. The National Natural Science Foundation of China has outlined a five-year project for researchers to study the assembly of ultra-large spacecraft. This could lead to significant advancements in spacecraft technology and space exploration capabilities. [Source: SpaceNews](https://spacenews.com/china-researching-challenges-of-kilometer-scale-ultra-large-spacecraft/)
7. The Center for AEroSpace Autonomy Research (CAESAR) at Stanford University is focusing on spacecraft autonomy. The center held a kickoff event on May 22, 2024, to highlight the industry, academia, and government collaboration it seeks to foster. This could lead to significant advancements in autonomous spacecraft technology. [Source: SpaceNews](https://spacenews.com/stanford-center-focuses-on-spacecraft-autonomy/)
```
```
</CodeGroup>
#### Key Components Explained
1. **Custom Knowledge Source (`SpaceNewsKnowledgeSource`)**:
- Extends `BaseKnowledgeSource` for integration with CrewAI
- Configurable API endpoint and article limit
- Implements three key methods:
@@ -255,10 +251,12 @@ The latest developments in space exploration, based on recent space news article
- `add()`: Processes and stores the content
2. **Agent Configuration**:
- Specialized role as a Space News Analyst
- Uses the knowledge source to access space news
3. **Task Setup**:
- Takes a user question as input through `{user_question}`
- Designed to provide detailed answers based on the knowledge source
@@ -267,6 +265,7 @@ The latest developments in space exploration, based on recent space news article
- Handles input/output through the kickoff method
This example demonstrates how to:
- Create a custom knowledge source that fetches real-time data
- Process and format external data for AI consumption
- Use the knowledge source to answer specific user questions
@@ -274,13 +273,15 @@ This example demonstrates how to:
#### About the Spaceflight News API
The example uses the [Spaceflight News API](https://api.spaceflightnewsapi.net/v4/documentation), which:
The example uses the [Spaceflight News API](https://api.spaceflightnewsapi.net/v4/docs/), which:
- Provides free access to space-related news articles
- Requires no authentication
- Returns structured data about space news
- Supports pagination and filtering
You can customize the API query by modifying the endpoint URL:
```python
# Fetch more articles
recent_news = SpaceNewsKnowledgeSource(
@@ -299,14 +300,14 @@ recent_news = SpaceNewsKnowledgeSource(
<AccordionGroup>
<Accordion title="Content Organization">
- Keep chunk sizes appropriate for your content type
- Consider content overlap for context preservation
- Organize related information into separate knowledge sources
- Keep chunk sizes appropriate for your content type - Consider content
overlap for context preservation - Organize related information into
separate knowledge sources
</Accordion>
<Accordion title="Performance Tips">
- Adjust chunk sizes based on content complexity
- Configure appropriate embedding models
- Consider using local embedding providers for faster processing
- Adjust chunk sizes based on content complexity - Configure appropriate
embedding models - Consider using local embedding providers for faster
processing
</Accordion>
</AccordionGroup>
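
The chunk-size and overlap advice in the best-practices accordions can be made concrete with a minimal sketch in plain Python. The function name `chunk_text` and its `chunk_size` and `overlap` parameters are assumptions made for illustration; they are not confirmed to match CrewAI's internal `_chunk_text` implementation.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 20, overlap: int = 5) -> List[str]:
    """Split text into character chunks where neighbours share `overlap` characters.

    Hypothetical helper for illustration only; not CrewAI's own chunker.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not text:
        return []
    # Each chunk starts `step` characters after the previous one.
    step = chunk_size - overlap
    # Stop early so the final chunk is never just a repeat of the overlap tail.
    last_start = max(len(text) - overlap, 1)
    return [text[i : i + chunk_size] for i in range(0, last_start, step)]


for chunk in chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=3):
    print(chunk)
```

A larger overlap preserves more context across chunk boundaries at the cost of storing and embedding more duplicated text, which is the trade-off the tips above point at.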


@@ -57,7 +57,7 @@ This feature is useful for debugging and understanding how agents interact with
<Step title="Install AgentOps">
Install AgentOps with:
```bash
pip install crewai[agentops]
pip install 'crewai[agentops]'
```
or
```bash


@@ -299,7 +299,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
self._i18n.slice("summarizer_system_message"), role="system"
),
self._format_msg(
self._i18n.slice("sumamrize_instruction").format(group=group),
self._i18n.slice("summarize_instruction").format(group=group),
),
],
callbacks=self.callbacks,


@@ -12,6 +12,6 @@ reporting_task:
Review the context you got and expand each topic into a full section for a report.
Make sure the report is detailed and contains any and all relevant information.
expected_output: >
A fully fledge reports with the mains topics, each with a full section of information.
A fully fledged report with the main topics, each with a full section of information.
Formatted as markdown without '```'
agent: reporting_analyst


@@ -1,4 +1,5 @@
import logging
import os
import sys
import threading
import warnings
@@ -128,6 +129,7 @@ class LLM:
litellm.drop_params = True
litellm.set_verbose = False
self.set_callbacks(callbacks)
self.set_env_callbacks()
def call(self, messages: List[Dict[str, str]], callbacks: List[Any] = []) -> str:
with suppress_warnings():
@@ -202,3 +204,39 @@ class LLM:
litellm._async_success_callback.remove(callback)
litellm.callbacks = callbacks
    def set_env_callbacks(self):
        """
        Sets the success and failure callbacks for the LiteLLM library from environment variables.

        This method reads the `LITELLM_SUCCESS_CALLBACKS` and `LITELLM_FAILURE_CALLBACKS`
        environment variables, which should contain comma-separated lists of callback names.
        It then assigns these lists to `litellm.success_callback` and `litellm.failure_callback`,
        respectively.

        If the environment variables are not set or are empty, the corresponding callback lists
        will be set to empty lists.

        Example:
            LITELLM_SUCCESS_CALLBACKS="langfuse,langsmith"
            LITELLM_FAILURE_CALLBACKS="langfuse"

            This will set `litellm.success_callback` to ["langfuse", "langsmith"] and
            `litellm.failure_callback` to ["langfuse"].
        """
        success_callbacks_str = os.environ.get("LITELLM_SUCCESS_CALLBACKS", "")
        success_callbacks = []
        if success_callbacks_str:
            success_callbacks = [
                callback.strip() for callback in success_callbacks_str.split(",")
            ]
        failure_callbacks_str = os.environ.get("LITELLM_FAILURE_CALLBACKS", "")
        failure_callbacks = []
        if failure_callbacks_str:
            failure_callbacks = [
                callback.strip() for callback in failure_callbacks_str.split(",")
            ]
        litellm.success_callback = success_callbacks
        litellm.failure_callback = failure_callbacks
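
The parsing step that `set_env_callbacks` describes can be tried in isolation, without LiteLLM installed. The sketch below mirrors the same read, split, and strip logic; the helper name `parse_callback_names` is hypothetical and is not part of this change.

```python
import os
from typing import List


def parse_callback_names(env_var: str) -> List[str]:
    """Return the comma-separated names stored in `env_var`, whitespace-stripped.

    Hypothetical stand-in for the parsing done inside `set_env_callbacks`.
    """
    raw = os.environ.get(env_var, "")
    if not raw:
        # Unset or empty variable: same result as the method, an empty list.
        return []
    return [name.strip() for name in raw.split(",")]


# Simulate the environment described in the docstring.
os.environ["LITELLM_SUCCESS_CALLBACKS"] = "langfuse, langsmith"
os.environ.pop("LITELLM_FAILURE_CALLBACKS", None)

print(parse_callback_names("LITELLM_SUCCESS_CALLBACKS"))  # ['langfuse', 'langsmith']
print(parse_callback_names("LITELLM_FAILURE_CALLBACKS"))  # []
```

As in the method above, entries are stripped but not filtered, so a trailing comma such as `"langfuse,"` would yield an empty string as the last element.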


@@ -37,7 +37,7 @@ class UserMemory(Memory):
limit: int = 3,
score_threshold: float = 0.35,
):
results = super().search(
results = self.storage.search(
query=query,
limit=limit,
score_threshold=score_threshold,


@@ -1,6 +1,6 @@
import datetime
import json
import os
from pathlib import Path
import threading
import uuid
from concurrent.futures import Future
@@ -393,12 +393,13 @@ class Task(BaseModel):
if self.output_file is None:
raise ValueError("output_file is not set.")
directory = os.path.dirname(self.output_file) # type: ignore # Value of type variable "AnyOrLiteralStr" of "dirname" cannot be "str | None"
resolved_path = Path(self.output_file).expanduser().resolve()
directory = resolved_path.parent
if directory and not os.path.exists(directory):
os.makedirs(directory)
if not directory.exists():
directory.mkdir(parents=True, exist_ok=True)
with open(self.output_file, "w", encoding="utf-8") as file:
with resolved_path.open("w", encoding="utf-8") as file:
if isinstance(result, dict):
import json
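
The effect of switching from `os.path` to `pathlib` in this hunk can be illustrated with a small standalone sketch. The function `write_output` is a hypothetical name used only here; it repeats the same sequence of `expanduser()`, `resolve()`, `mkdir(parents=True, exist_ok=True)`, and `Path.open()` calls outside of the `Task` class.

```python
import tempfile
from pathlib import Path


def write_output(output_file: str, text: str) -> Path:
    """Write `text` to `output_file`, creating missing parent directories.

    Hypothetical helper that mirrors the path handling shown in the hunk above.
    """
    # Expand a leading "~" and turn the path into an absolute one.
    resolved_path = Path(output_file).expanduser().resolve()
    # Create every missing parent; exist_ok avoids an error if it already exists.
    resolved_path.parent.mkdir(parents=True, exist_ok=True)
    with resolved_path.open("w", encoding="utf-8") as file:
        file.write(text)
    return resolved_path


with tempfile.TemporaryDirectory() as tmp:
    target = write_output(f"{tmp}/reports/2024/summary.md", "# Report\n")
    print(target.is_absolute())
    print(target.read_text(encoding="utf-8"))
```

Because the parent of a resolved path is never an empty string, the old `if directory and ...` guard is no longer needed, and `exist_ok=True` makes the call safe even if the directory appears between the check and the creation.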


@@ -19,7 +19,7 @@
"human_feedback": "You got human feedback on your work, re-evaluate it and give a new Final Answer when ready.\n {human_feedback}",
"getting_input": "This is the agent's final answer: {final_answer}\n\n",
"summarizer_system_message": "You are a helpful assistant that summarizes text.",
"sumamrize_instruction": "Summarize the following text, make sure to include all the important information: {group}",
"summarize_instruction": "Summarize the following text, make sure to include all the important information: {group}",
"summary": "This is a summary of our conversation so far:\n{merged_summary}",
"manager_request": "Your best answer to your coworker asking you this, accounting for the context shared.",
"formatted_task_instructions": "Ensure your final answer contains only the content in the following format: {output_format}\n\nEnsure the final output does not include any code block markers like ```json or ```python.",