Files
crewAI/docs/tools/DOCXSearchTool.md
João Moura 7fd8850ddb Small RC Fixes (#608)
* mentioning ollama on the docs as embedder

* lowering barrier to match tool with simialr name

* Fixing agent tools to support co_worker

* Adding new tests

* Fixing type"

* updating tests

* fixing conflict
2024-05-13 02:29:04 -03:00

61 lines
2.1 KiB
Markdown

# DOCXSearchTool
!!! note "Experimental"
We are still working on improving tools, so there might be unexpected behavior or changes in the future.
## Description
The DOCXSearchTool is a RAG tool designed for semantic searching within DOCX documents. It enables users to effectively search and extract relevant information from DOCX files using query-based searches. This tool is invaluable for data analysis, information management, and research tasks, streamlining the process of finding specific information within large document collections.
## Installation
Install the crewai_tools package by running the following command in your terminal:
```shell
pip install 'crewai[tools]'
```
## Example
The following example demonstrates initializing the DOCXSearchTool to search within any DOCX file's content or with a specific DOCX file path.
```python
from crewai_tools import DOCXSearchTool
# Initialize the tool to search within any DOCX file's content
tool = DOCXSearchTool()
# OR
# Initialize the tool with a specific DOCX file, so the agent can only search the content of the specified DOCX file
tool = DOCXSearchTool(docx='path/to/your/document.docx')
```
## Arguments
- `docx`: An optional file path to a specific DOCX document you wish to search. If not provided during initialization, the tool allows for later specification of any DOCX file's content path for searching.
## Custom model and embeddings
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows:
```python
tool = DOCXSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
)
```