mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-08 15:48:29 +00:00
* docs(cli): document device-code login and config reset guidance; renumber sections * docs(cli): fix duplicate numbering (renumber Login/API Keys/Configuration sections) * docs: Fix webhook documentation to include meta dict in all webhook payloads - Add note explaining that meta objects from kickoff requests are included in all webhook payloads - Update webhook examples to show proper payload structure including meta field - Fix webhook examples to match actual API implementation - Apply changes to English, Korean, and Portuguese documentation Resolves the documentation gap where meta dict passing to webhooks was not documented despite being implemented in the API. * WIP: CrewAI docs theme, changelog, GEO, localization * docs(cli): fix merge markers; ensure mode: "wide"; convert ASCII tables to Markdown (en/pt-BR/ko) * docs: add group icons across locales; split Automation/Integrations; update tools overviews and links
72 lines
2.2 KiB
Plaintext
72 lines
2.2 KiB
Plaintext
---
|
|
title: PDF RAG Search
|
|
description: The `PDFSearchTool` is designed to search PDF files and return the most relevant results.
|
|
icon: file-pdf
|
|
mode: "wide"
|
|
---
|
|
|
|
# `PDFSearchTool`
|
|
|
|
<Note>
|
|
We are still working on improving tools, so there might be unexpected behavior or changes in the future.
|
|
</Note>
|
|
|
|
## Description
|
|
|
|
The PDFSearchTool is a RAG tool designed for semantic searches within PDF content. It allows for inputting a search query and a PDF document, leveraging advanced search techniques to find relevant content efficiently.
|
|
This capability makes it especially useful for extracting specific information from large PDF files quickly.
|
|
|
|
## Installation
|
|
|
|
To get started with the PDFSearchTool, first, ensure the crewai_tools package is installed with the following command:
|
|
|
|
```shell
|
|
pip install 'crewai[tools]'
|
|
```
|
|
|
|
## Example
|
|
Here's how to use the PDFSearchTool to search within a PDF document:
|
|
|
|
```python Code
|
|
from crewai_tools import PDFSearchTool
|
|
|
|
# Initialize the tool allowing for any PDF content search if the path is provided during execution
|
|
tool = PDFSearchTool()
|
|
|
|
# OR
|
|
|
|
# Initialize the tool with a specific PDF path for exclusive search within that document
|
|
tool = PDFSearchTool(pdf='path/to/your/document.pdf')
|
|
```
|
|
|
|
## Arguments
|
|
|
|
- `pdf`: **Optional** The PDF path for the search. Can be provided at initialization or within the `run` method's arguments. If provided at initialization, the tool confines its search to the specified document.
|
|
|
|
## Custom model and embeddings
|
|
|
|
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows:
|
|
|
|
```python Code
|
|
tool = PDFSearchTool(
|
|
config=dict(
|
|
llm=dict(
|
|
provider="ollama", # or google, openai, anthropic, llama2, ...
|
|
config=dict(
|
|
model="llama2",
|
|
# temperature=0.5,
|
|
# top_p=1,
|
|
# stream=true,
|
|
),
|
|
),
|
|
embedder=dict(
|
|
provider="google", # or openai, ollama, ...
|
|
config=dict(
|
|
model="models/embedding-001",
|
|
task_type="retrieval_document",
|
|
# title="Embeddings",
|
|
),
|
|
),
|
|
)
|
|
)
|
|
``` |