Merge branch 'main' into feat/doc_structured

2026-07-03 06:08:15 +00:00 · 2024-12-09 11:31:31 -05:00
parent e576354ff8 b07c51532c
commit 919ff36357
8 changed files with 82 additions and 42 deletions
--- a/docs/concepts/knowledge.mdx
+++ b/docs/concepts/knowledge.mdx
@@ -8,15 +8,13 @@ icon: book

 ## What is Knowledge?

-Knowledge in CrewAI is a powerful system that allows AI agents to access and utilize external information sources during their tasks. 
-Think of it as giving your agents a reference library they can consult while working. 
+Knowledge in CrewAI is a powerful system that allows AI agents to access and utilize external information sources during their tasks.
+Think of it as giving your agents a reference library they can consult while working.

 <Info>
-  Key benefits of using Knowledge:
-  - Enhance agents with domain-specific information
-  - Support decisions with real-world data
-  - Maintain context across conversations
-  - Ground responses in factual information
+  Key benefits of using Knowledge: - Enhance agents with domain-specific
+  information - Support decisions with real-world data - Maintain context across
+  conversations - Ground responses in factual information
 </Info>

 ## Supported Knowledge Sources
@@ -25,14 +23,10 @@ CrewAI supports various types of knowledge sources out of the box:

 <CardGroup cols={2}>
  <Card title="Text Sources" icon="text">
-    - Raw strings
-    - Text files (.txt)
-    - PDF documents
+    - Raw strings - Text files (.txt) - PDF documents
  </Card>
  <Card title="Structured Data" icon="table">
-    - CSV files
-    - Excel spreadsheets
-    - JSON documents
+    - CSV files - Excel spreadsheets - JSON documents
  </Card>
 </CardGroup>

@@ -47,7 +41,7 @@ from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSourc
 # Create a knowledge source
 content = "Users name is John. He is 30 years old and lives in San Francisco."
 string_source = StringKnowledgeSource(
-    content=content, 
+    content=content,
 )

 # Create an LLM with a temperature of 0 to ensure deterministic outputs
@@ -122,7 +116,6 @@ crewai reset-memories --knowledge

 This is useful when you've updated your knowledge sources and want to ensure that the agents are using the most recent information.

-
 ## Custom Knowledge Sources

 CrewAI allows you to create custom knowledge sources for any type of data by extending the `BaseKnowledgeSource` class. Let's create a practical example that fetches and processes space news articles.
@@ -141,10 +134,10 @@ from pydantic import BaseModel, Field

 class SpaceNewsKnowledgeSource(BaseKnowledgeSource):
    """Knowledge source that fetches data from Space News API."""
-    
+
    api_endpoint: str = Field(description="API endpoint URL")
    limit: int = Field(default=10, description="Number of articles to fetch")
-    
+
    def load_content(self) -> Dict[Any, str]:
        """Fetch and format space news articles."""
        try:
@@ -152,15 +145,15 @@ class SpaceNewsKnowledgeSource(BaseKnowledgeSource):
                f"{self.api_endpoint}?limit={self.limit}"
            )
            response.raise_for_status()
-            
+
            data = response.json()
            articles = data.get('results', [])
-            
+
            formatted_data = self._format_articles(articles)
            return {self.api_endpoint: formatted_data}
        except Exception as e:
            raise ValueError(f"Failed to fetch space news: {str(e)}")
-    
+
    def _format_articles(self, articles: list) -> str:
        """Format articles into readable text."""
        formatted = "Space News Articles:\n\n"
@@ -180,7 +173,7 @@ class SpaceNewsKnowledgeSource(BaseKnowledgeSource):
        for _, text in content.items():
            chunks = self._chunk_text(text)
            self.chunks.extend(chunks)
-        
+
        self._save_documents()

 # Create knowledge source
@@ -193,7 +186,7 @@ recent_news = SpaceNewsKnowledgeSource(
 space_analyst = Agent(
    role="Space News Analyst",
    goal="Answer questions about space news accurately and comprehensively",
-    backstory="""You are a space industry analyst with expertise in space exploration, 
+    backstory="""You are a space industry analyst with expertise in space exploration,
    satellite technology, and space industry trends. You excel at answering questions
    about space news and providing detailed, accurate information.""",
    knowledge_sources=[recent_news],
@@ -220,13 +213,14 @@ result = crew.kickoff(
    inputs={"user_question": "What are the latest developments in space exploration?"}
 )
 ```
+
 ```output Output
 # Agent: Space News Analyst
 ## Task: Answer this question about space news: What are the latest developments in space exploration?


 # Agent: Space News Analyst
-## Final Answer: 
+## Final Answer:
 The latest developments in space exploration, based on recent space news articles, include the following:

 1. SpaceX has received the final regulatory approvals to proceed with the second integrated Starship/Super Heavy launch, scheduled for as soon as the morning of Nov. 17, 2023. This is a significant step in SpaceX's ambitious plans for space exploration and colonization. [Source: SpaceNews](https://spacenews.com/starship-cleared-for-nov-17-launch/)
@@ -242,11 +236,13 @@ The latest developments in space exploration, based on recent space news article
 6. The National Natural Science Foundation of China has outlined a five-year project for researchers to study the assembly of ultra-large spacecraft. This could lead to significant advancements in spacecraft technology and space exploration capabilities. [Source: SpaceNews](https://spacenews.com/china-researching-challenges-of-kilometer-scale-ultra-large-spacecraft/)

 7. The Center for AEroSpace Autonomy Research (CAESAR) at Stanford University is focusing on spacecraft autonomy. The center held a kickoff event on May 22, 2024, to highlight the industry, academia, and government collaboration it seeks to foster. This could lead to significant advancements in autonomous spacecraft technology. [Source: SpaceNews](https://spacenews.com/stanford-center-focuses-on-spacecraft-autonomy/)
-``` 
+```
+
 </CodeGroup>
 #### Key Components Explained

 1. **Custom Knowledge Source (`SpaceNewsKnowledgeSource`)**:
+
   - Extends `BaseKnowledgeSource` for integration with CrewAI
   - Configurable API endpoint and article limit
   - Implements three key methods:
@@ -255,10 +251,12 @@ The latest developments in space exploration, based on recent space news article
     - `add()`: Processes and stores the content

 2. **Agent Configuration**:
+
   - Specialized role as a Space News Analyst
   - Uses the knowledge source to access space news

 3. **Task Setup**:
+
   - Takes a user question as input through `{user_question}`
   - Designed to provide detailed answers based on the knowledge source

@@ -267,6 +265,7 @@ The latest developments in space exploration, based on recent space news article
   - Handles input/output through the kickoff method

 This example demonstrates how to:
+
 - Create a custom knowledge source that fetches real-time data
 - Process and format external data for AI consumption
 - Use the knowledge source to answer specific user questions
@@ -274,13 +273,15 @@ This example demonstrates how to:

 #### About the Spaceflight News API

-The example uses the [Spaceflight News API](https://api.spaceflightnewsapi.net/v4/documentation), which:
+The example uses the [Spaceflight News API](https://api.spaceflightnewsapi.net/v4/docs/), which:
+
 - Provides free access to space-related news articles
 - Requires no authentication
 - Returns structured data about space news
 - Supports pagination and filtering

 You can customize the API query by modifying the endpoint URL:
+
 ```python
 # Fetch more articles
 recent_news = SpaceNewsKnowledgeSource(
@@ -299,14 +300,14 @@ recent_news = SpaceNewsKnowledgeSource(

 <AccordionGroup>
  <Accordion title="Content Organization">
-    - Keep chunk sizes appropriate for your content type
-    - Consider content overlap for context preservation
-    - Organize related information into separate knowledge sources
+    - Keep chunk sizes appropriate for your content type - Consider content
+    overlap for context preservation - Organize related information into
+    separate knowledge sources
  </Accordion>
-  
+
  <Accordion title="Performance Tips">
-    - Adjust chunk sizes based on content complexity
-    - Configure appropriate embedding models
-    - Consider using local embedding providers for faster processing
+    - Adjust chunk sizes based on content complexity - Configure appropriate
+    embedding models - Consider using local embedding providers for faster
+    processing
  </Accordion>
 </AccordionGroup>
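
The chunk size and overlap advice in the accordion above can be made concrete with a small sketch. This is not CrewAI's `_chunk_text` implementation; the function name `chunk_text` and its `chunk_size` and `overlap` parameters are assumptions chosen for illustration. It only shows how overlapping windows keep some shared context between neighbouring chunks.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 4000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows that share `overlap` characters.

    Hypothetical helper for illustration only; not part of CrewAI.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so no
        # trailing chunk made only of already-covered characters is added.
        if start + chunk_size >= len(text):
            break
    return chunks


example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
```

With these toy values each chunk repeats the last character of the previous one. Larger overlaps preserve more context at the cost of storing and embedding more duplicated text.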
--- a/docs/how-to/agentops-observability.mdx
+++ b/docs/how-to/agentops-observability.mdx
@@ -57,7 +57,7 @@ This feature is useful for debugging and understanding how agents interact with
   <Step title="Install AgentOps">
      Install AgentOps with:
      ```bash
-      pip install crewai[agentops]
+      pip install 'crewai[agentops]'
      ```
      or
      ```bash
--- a/src/crewai/agents/crew_agent_executor.py
+++ b/src/crewai/agents/crew_agent_executor.py
@@ -299,7 +299,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
                        self._i18n.slice("summarizer_system_message"), role="system"
                    ),
                    self._format_msg(
-                        self._i18n.slice("sumamrize_instruction").format(group=group),
+                        self._i18n.slice("summarize_instruction").format(group=group),
                    ),
                ],
                callbacks=self.callbacks,
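
The key renamed in this hunk (`sumamrize_instruction` to `summarize_instruction`) has to match the key in the translations file exactly, otherwise the lookup fails at runtime. A minimal sketch of a check that could catch such a mismatch early is shown here. The `missing_keys` helper and the flat dictionary layout are assumptions for illustration; the real `en.json` may nest these entries, and this is not CrewAI's `I18N` class.

```python
import json
from typing import Iterable, List, Mapping

# Stand-in for the relevant translation entries (flat layout assumed).
SAMPLE_TRANSLATIONS = json.loads("""
{
    "summarizer_system_message": "You are a helpful assistant that summarizes text.",
    "summarize_instruction": "Summarize the following text, make sure to include all the important information: {group}"
}
""")


def missing_keys(translations: Mapping[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent from the translations mapping."""
    return [key for key in required if key not in translations]


# Keys the executor code asks for after the rename.
keys_used_in_code = ["summarizer_system_message", "summarize_instruction"]
absent = missing_keys(SAMPLE_TRANSLATIONS, keys_used_in_code)
```

A check of this kind would flag the old misspelled key if only one of the two files had been updated.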
--- a/src/crewai/cli/templates/crew/config/tasks.yaml
+++ b/src/crewai/cli/templates/crew/config/tasks.yaml
@@ -12,6 +12,6 @@ reporting_task:
    Review the context you got and expand each topic into a full section for a report.
    Make sure the report is detailed and contains any and all relevant information.
  expected_output: >
-    A fully fledge reports with the mains topics, each with a full section of information.
+    A fully fledged report with the main topics, each with a full section of information.
    Formatted as markdown without '```'
  agent: reporting_analyst
--- a/src/crewai/llm.py
+++ b/src/crewai/llm.py
@@ -1,4 +1,5 @@
 import logging
+import os
 import sys
 import threading
 import warnings
@@ -128,6 +129,7 @@ class LLM:
        litellm.drop_params = True
        litellm.set_verbose = False
        self.set_callbacks(callbacks)
+        self.set_env_callbacks()

    def call(self, messages: List[Dict[str, str]], callbacks: List[Any] = []) -> str:
        with suppress_warnings():
@@ -202,3 +204,39 @@ class LLM:
                litellm._async_success_callback.remove(callback)

        litellm.callbacks = callbacks
+
+    def set_env_callbacks(self):
+        """
+        Sets the success and failure callbacks for the LiteLLM library from environment variables.
+
+        This method reads the `LITELLM_SUCCESS_CALLBACKS` and `LITELLM_FAILURE_CALLBACKS`
+        environment variables, which should contain comma-separated lists of callback names.
+        It then assigns these lists to `litellm.success_callback` and `litellm.failure_callback`,
+        respectively.
+
+        If the environment variables are not set or are empty, the corresponding callback lists
+        will be set to empty lists.
+
+        Example:
+            LITELLM_SUCCESS_CALLBACKS="langfuse,langsmith"
+            LITELLM_FAILURE_CALLBACKS="langfuse"
+
+        This will set `litellm.success_callback` to ["langfuse", "langsmith"] and
+        `litellm.failure_callback` to ["langfuse"].
+        """
+        success_callbacks_str = os.environ.get("LITELLM_SUCCESS_CALLBACKS", "")
+        success_callbacks = []
+        if success_callbacks_str:
+            success_callbacks = [
+                callback.strip() for callback in success_callbacks_str.split(",")
+            ]
+
+        failure_callbacks_str = os.environ.get("LITELLM_FAILURE_CALLBACKS", "")
+        failure_callbacks = []
+        if failure_callbacks_str:
+            failure_callbacks = [
+                callback.strip() for callback in failure_callbacks_str.split(",")
+            ]
+
+        litellm.success_callback = success_callbacks
+        litellm.failure_callback = failure_callbacks
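
The parsing step in `set_env_callbacks` can be exercised on its own, without importing `litellm`. The sketch below is a standalone illustration, not the method from the diff: the helper names `parse_callback_names` and `callbacks_from_env` are hypothetical, and unlike the code above it also drops empty entries (for example from a trailing comma), which is an assumption about the desired behaviour.

```python
import os
from typing import List, Mapping, Optional


def parse_callback_names(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def callbacks_from_env(
    var_name: str, env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Read one variable from `env` (defaults to os.environ) and parse it."""
    source = os.environ if env is None else env
    return parse_callback_names(source.get(var_name, ""))


# A plain dict stands in for the process environment in this example.
example_env = {
    "LITELLM_SUCCESS_CALLBACKS": "langfuse, langsmith,",
    "LITELLM_FAILURE_CALLBACKS": "langfuse",
}
success = callbacks_from_env("LITELLM_SUCCESS_CALLBACKS", example_env)
failure = callbacks_from_env("LITELLM_FAILURE_CALLBACKS", example_env)
```

Passing the environment as a mapping keeps the helper easy to test. The method in the diff reads `os.environ` directly and assigns the resulting lists to `litellm.success_callback` and `litellm.failure_callback`.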
--- a/src/crewai/memory/user/user_memory.py
+++ b/src/crewai/memory/user/user_memory.py
@@ -37,7 +37,7 @@ class UserMemory(Memory):
        limit: int = 3,
        score_threshold: float = 0.35,
    ):
-        results = super().search(
+        results = self.storage.search(
            query=query,
            limit=limit,
            score_threshold=score_threshold,
--- a/src/crewai/task.py
+++ b/src/crewai/task.py
@@ -1,6 +1,6 @@
 import datetime
 import json
-import os
+from pathlib import Path
 import threading
 import uuid
 from concurrent.futures import Future
@@ -393,12 +393,13 @@ class Task(BaseModel):
        if self.output_file is None:
            raise ValueError("output_file is not set.")

-        directory = os.path.dirname(self.output_file)  # type: ignore # Value of type variable "AnyOrLiteralStr" of "dirname" cannot be "str | None"
+        resolved_path = Path(self.output_file).expanduser().resolve()
+        directory = resolved_path.parent

-        if directory and not os.path.exists(directory):
-            os.makedirs(directory)
+        if not directory.exists():
+            directory.mkdir(parents=True, exist_ok=True)

-        with open(self.output_file, "w", encoding="utf-8") as file:
+        with resolved_path.open("w", encoding="utf-8") as file:
            if isinstance(result, dict):
                import json
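
The `pathlib` pattern introduced in this hunk can be sketched in isolation. The function name `save_text` is hypothetical and the example writes into a temporary directory. It shows the same sequence of calls: expand `~`, resolve to an absolute path, create missing parent directories, then write with UTF-8 encoding.

```python
import tempfile
from pathlib import Path


def save_text(output_file: str, text: str) -> Path:
    """Write text to output_file, creating parent directories as needed."""
    resolved = Path(output_file).expanduser().resolve()
    # exist_ok=True makes this call safe when the directory already exists,
    # so no separate exists() check is required.
    resolved.parent.mkdir(parents=True, exist_ok=True)
    resolved.write_text(text, encoding="utf-8")
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    # The nested "reports/2024" directories do not exist yet.
    written = save_text(f"{tmp}/reports/2024/summary.md", "# Summary\n")
    content = written.read_text(encoding="utf-8")
    was_absolute = written.is_absolute()
```

Compared with the removed `os.path` version, a bare filename now resolves against the current working directory and a leading `~` is expanded to the user's home directory.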

--- a/src/crewai/translations/en.json
+++ b/src/crewai/translations/en.json
@@ -19,7 +19,7 @@
    "human_feedback": "You got human feedback on your work, re-evaluate it and give a new Final Answer when ready.\n {human_feedback}",
    "getting_input": "This is the agent's final answer: {final_answer}\n\n",
    "summarizer_system_message": "You are a helpful assistant that summarizes text.",
-    "sumamrize_instruction": "Summarize the following text, make sure to include all the important information: {group}",
+    "summarize_instruction": "Summarize the following text, make sure to include all the important information: {group}",
    "summary": "This is a summary of our conversation so far:\n{merged_summary}",
    "manager_request": "Your best answer to your coworker asking you this, accounting for the context shared.",
    "formatted_task_instructions": "Ensure your final answer contains only the content in the following format: {output_format}\n\nEnsure the final output does not include any code block markers like ```json or ```python.",