feat: implement knowledge retrieval events in Agent (#2727)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled

* feat: implement knowledge retrieval events in Agent

This commit introduces a series of knowledge retrieval events in the Agent class, enhancing its ability to handle knowledge queries. New events include KnowledgeRetrievalStartedEvent, KnowledgeRetrievalCompletedEvent, KnowledgeQueryGeneratedEvent, KnowledgeQueryFailedEvent, and KnowledgeSearchQueryCompletedEvent. The Agent now emits these events during knowledge retrieval processes, allowing for better tracking and handling of knowledge queries. Additionally, the console formatter has been updated to handle these new events, providing visual feedback during knowledge retrieval operations.

* refactor: update knowledge query handling in Agent

This commit refines the knowledge query processing in the Agent class by renaming variables for clarity and optimizing the query rewriting logic. The system prompt has been updated in the translation file to enhance clarity and context for the query rewriting process. These changes aim to improve the overall readability and maintainability of the code.

* fix: add missing newline at end of en.json file

* fix broken tests

* refactor: rename knowledge query events and enhance retrieval handling

This commit renames the KnowledgeQueryGeneratedEvent to KnowledgeQueryStartedEvent to better reflect its purpose. It also updates the event handling in the EventListener and ConsoleFormatter classes to accommodate the new event structure. Additionally, the retrieval knowledge is now included in the KnowledgeRetrievalCompletedEvent, improving the overall knowledge retrieval process.

* docs for transparancy

* refactor: improve error handling in knowledge query processing

This commit refactors the knowledge query handling in the Agent class by changing the order of checks for LLM compatibility. It now logs a warning and emits a failure event if the LLM is not an instance of BaseLLM before attempting to call the LLM. Additionally, the task_prompt attribute has been removed from the KnowledgeQueryFailedEvent, simplifying the event structure.

* test: add unit test for knowledge search query and VCR cassette

This commit introduces a new test, `test_get_knowledge_search_query`, to verify that the `_get_knowledge_search_query` method in the Agent class correctly interacts with the LLM using the appropriate prompts. Additionally, a VCR cassette is added to record the interactions with the OpenAI API for this test, ensuring consistent and reliable test results.
This commit is contained in:
Lorenze Jay
2025-05-07 11:55:42 -07:00
committed by GitHub
parent e3887ae36e
commit 7ad51d9d05
13 changed files with 2776 additions and 465 deletions

View File

@@ -397,6 +397,53 @@ result = crew.kickoff(inputs={"question": "What city does John live in and how o
John is 30 years old and lives in San Francisco.
```
</CodeGroup>
## Query Rewriting
CrewAI implements an intelligent query rewriting mechanism to optimize knowledge retrieval. When an agent needs to search through knowledge sources, the raw task prompt is automatically transformed into a more effective search query.
### How Query Rewriting Works
1. When an agent executes a task with knowledge sources available, the `_get_knowledge_search_query` method is triggered
2. The agent's LLM is used to transform the original task prompt into an optimized search query
3. This optimized query is then used to retrieve relevant information from knowledge sources
### Benefits of Query Rewriting
<CardGroup cols={2}>
<Card title="Improved Retrieval Accuracy" icon="bullseye-arrow">
By focusing on key concepts and removing irrelevant content, query rewriting helps retrieve more relevant information.
</Card>
<Card title="Context Awareness" icon="brain">
The rewritten queries are designed to be more specific and context-aware for vector database retrieval.
</Card>
</CardGroup>
### Implementation Details
Query rewriting happens transparently using a system prompt that instructs the LLM to:
- Focus on key words of the intended task
- Make the query more specific and context-aware
- Remove irrelevant content like output format instructions
- Generate only the rewritten query without preamble or postamble
<Tip>
This mechanism is fully automatic and requires no configuration from users. The agent's LLM is used to perform the query rewriting, so using a more capable LLM can improve the quality of rewritten queries.
</Tip>
### Example
```python
# Original task prompt
task_prompt = "Answer the following questions about the user's favorite movies: What movie did John watch last week? Format your answer in JSON."
# Behind the scenes, this might be rewritten as:
rewritten_query = "What movies did John watch last week?"
```
The rewritten query is more focused on the core information need and removes irrelevant instructions about output formatting.
## Clearing Knowledge
If you need to clear the knowledge stored in CrewAI, you can use the `crewai reset-memories` command with the `--knowledge` option.