Enhance knowledge management in CrewAI (#2637)

* Enhance knowledge management in CrewAI
  - Added `KnowledgeConfig` class to configure knowledge retrieval parameters such as `limit` and `score_threshold`.
  - Updated `Agent` and `Crew` classes to utilize the new knowledge configuration for querying knowledge sources.
  - Enhanced documentation to clarify the addition of knowledge sources at both agent and crew levels.
  - Introduced new tips in documentation to guide users on knowledge source management and configuration.

* Refactor knowledge configuration parameters in CrewAI
  - Renamed `limit` to `results_limit` in `KnowledgeConfig`, `query_knowledge`, and `query` methods for consistency and clarity.
  - Updated related documentation to reflect the new parameter name, ensuring users understand the configuration options for knowledge retrieval.

* Refactor agent tests to utilize mock knowledge storage
  - Updated test cases in `agent_test.py` to use `KnowledgeStorage` for mocking knowledge sources, enhancing test reliability and clarity.
  - Renamed `limit` to `results_limit` in `KnowledgeConfig` for consistency with recent changes.
  - Ensured that knowledge queries are properly mocked to return expected results during tests.

* Add VCR support for agent tests with query limits and score thresholds
  - Introduced `@pytest.mark.vcr` decorator in `agent_test.py` for tests involving knowledge sources, ensuring consistent recording of HTTP interactions.
  - Added new YAML cassette files for `test_agent_with_knowledge_sources_with_query_limit_and_score_threshold` and `test_agent_with_knowledge_sources_with_query_limit_and_score_threshold_default`, capturing the expected API responses for these tests.
  - Enhanced test reliability by utilizing VCR to manage external API calls during testing.

* Update documentation to format parameter names in code style
  - Changed the formatting of `results_limit` and `score_threshold` in the documentation to use code style for better clarity and emphasis.
  - Ensured consistency in documentation presentation to enhance user understanding of configuration options.

* Enhance KnowledgeConfig with field descriptions
  - Updated `results_limit` and `score_threshold` in `KnowledgeConfig` to use Pydantic's `Field` for improved documentation and clarity.
  - Added descriptions to both parameters to provide better context for their usage in knowledge retrieval configuration.

* docstrings added
2026-01-09 16:18:30 +00:00 · 2025-04-18 18:33:04 -07:00
parent 371f19f3cd
commit 311a078ca6
10 changed files with 836 additions and 22 deletions
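
For reviewers, here is a minimal sketch of how `results_limit` and `score_threshold` might shape a query result. It is not the code in this commit. `KnowledgeConfigSketch`, `FakeKnowledge`, and `ROWS` are hypothetical names, and a dataclass stands in for the Pydantic model so the sketch needs only the standard library. Only the keyword names and the `3` / `0.35` defaults come from the `query_knowledge` signature in the diff. The filtering rule (keep rows at or above the threshold, highest score first) is an assumption about the storage backend.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class KnowledgeConfigSketch:
    """Hypothetical stand-in for KnowledgeConfig (the real class is a Pydantic model)."""

    results_limit: int = 3          # assumed default, mirrors query_knowledge's signature
    score_threshold: float = 0.35   # assumed default, mirrors query_knowledge's signature


class FakeKnowledge:
    """Hypothetical in-memory store; only query()'s keyword names come from the diff."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self._rows = rows

    def query(
        self, query: List[str], results_limit: int = 3, score_threshold: float = 0.35
    ) -> List[Dict[str, Any]]:
        # The query text is ignored here; a real store would embed and search it.
        # Assumption: a higher score means more relevant, and rows scoring
        # below the threshold are dropped before the limit is applied.
        kept = [row for row in self._rows if row["score"] >= score_threshold]
        kept.sort(key=lambda row: row["score"], reverse=True)
        return kept[:results_limit]


def query_knowledge(
    knowledge: Optional[FakeKnowledge],
    query: List[str],
    config: Optional[KnowledgeConfigSketch] = None,
) -> Optional[List[Dict[str, Any]]]:
    """Same delegation shape as Crew.query_knowledge: forward both knobs, or return None."""
    if knowledge is None:
        return None
    config = config or KnowledgeConfigSketch()
    return knowledge.query(
        query,
        results_limit=config.results_limit,
        score_threshold=config.score_threshold,
    )


# Made-up rows used only to exercise the sketch.
ROWS = [
    {"context": "a", "score": 0.90},
    {"context": "b", "score": 0.20},
    {"context": "c", "score": 0.50},
    {"context": "d", "score": 0.40},
    {"context": "e", "score": 0.36},
]

default_hits = query_knowledge(FakeKnowledge(ROWS), ["any question"])
strict_hits = query_knowledge(
    FakeKnowledge(ROWS),
    ["any question"],
    KnowledgeConfigSketch(results_limit=1, score_threshold=0.6),
)
```

With the defaults, row `b` falls below the threshold and row `e` is cut by the limit, so three rows remain. The stricter config keeps only row `a`.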
--- a/src/crewai/crew.py
+++ b/src/crewai/crew.py
@@ -304,9 +304,7 @@ class Crew(BaseModel):
        """Initialize private memory attributes."""
        self._external_memory = (
            # External memory doesn’t support a default value since it was designed to be managed entirely externally
-            self.external_memory.set_crew(self)
-            if self.external_memory
-            else None
+            self.external_memory.set_crew(self) if self.external_memory else None
        )

        self._long_term_memory = self.long_term_memory
@@ -1136,9 +1134,13 @@ class Crew(BaseModel):
        result = self._execute_tasks(self.tasks, start_index, True)
        return result

-    def query_knowledge(self, query: List[str]) -> Union[List[Dict[str, Any]], None]:
+    def query_knowledge(
+        self, query: List[str], results_limit: int = 3, score_threshold: float = 0.35
+    ) -> Union[List[Dict[str, Any]], None]:
        if self.knowledge:
-            return self.knowledge.query(query)
+            return self.knowledge.query(
+                query, results_limit=results_limit, score_threshold=score_threshold
+            )
        return None

    def fetch_inputs(self) -> Set[str]:
@@ -1220,9 +1222,13 @@ class Crew(BaseModel):
        copied_data = self.model_dump(exclude=exclude)
        copied_data = {k: v for k, v in copied_data.items() if v is not None}
        if self.short_term_memory:
-            copied_data["short_term_memory"] = self.short_term_memory.model_copy(deep=True)
+            copied_data["short_term_memory"] = self.short_term_memory.model_copy(
+                deep=True
+            )
        if self.long_term_memory:
-            copied_data["long_term_memory"] = self.long_term_memory.model_copy(deep=True)
+            copied_data["long_term_memory"] = self.long_term_memory.model_copy(
+                deep=True
+            )
        if self.entity_memory:
            copied_data["entity_memory"] = self.entity_memory.model_copy(deep=True)
        if self.external_memory:
@@ -1230,7 +1236,6 @@ class Crew(BaseModel):
        if self.user_memory:
            copied_data["user_memory"] = self.user_memory.model_copy(deep=True)

-
        copied_data.pop("agents", None)
        copied_data.pop("tasks", None)

@@ -1403,7 +1408,10 @@ class Crew(BaseModel):
            "short": (getattr(self, "_short_term_memory", None), "short term"),
            "entity": (getattr(self, "_entity_memory", None), "entity"),
            "knowledge": (getattr(self, "knowledge", None), "knowledge"),
-            "kickoff_outputs": (getattr(self, "_task_output_handler", None), "task output"),
+            "kickoff_outputs": (
+                getattr(self, "_task_output_handler", None),
+                "task output",
+            ),
            "external": (getattr(self, "_external_memory", None), "external"),
        }