mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-10 16:48:30 +00:00
Fixes #4028 - WebsiteSearchTool always requires OpenAI API key even when Ollama or other providers are specified.

The issue was that the documentation showed the old config format with 'llm' and 'embedder' keys, but the actual RagToolConfig type expects 'embedding_model' and 'vectordb' keys. When the old format was passed, the embedder config was not recognized, causing the tool to fall back to the default OpenAI embedding function which requires OPENAI_API_KEY.

Changes:
- Add _normalize_legacy_config method to RagTool that maps legacy 'embedder' key to 'embedding_model'
- Emit deprecation warnings for legacy config keys
- Ignore 'llm' key with warning (not used in RAG tools)
- Add tests for backward compatibility
- Update documentation to show new config format with examples

Co-Authored-By: João <joao@crewai.com>
122 lines
3.9 KiB
Plaintext
---
title: Website RAG Search
description: The `WebsiteSearchTool` is designed to perform a RAG (Retrieval-Augmented Generation) search within the content of a website.
icon: globe-stand
mode: "wide"
---

# `WebsiteSearchTool`

<Note>
The WebsiteSearchTool is currently in an experimental phase. We are actively working on incorporating this tool into our suite of offerings and will update the documentation accordingly.
</Note>

## Description

The WebsiteSearchTool is designed as a concept for conducting semantic searches within the content of websites.
It aims to use Retrieval-Augmented Generation (RAG) to navigate and extract information from specified URLs efficiently.
The tool is intended to be flexible: users can search across any website or restrict searches to specific websites of interest.
Please note that the current implementation is still under development, and the functionality described here may not yet be available.

## Installation

To prepare your environment for the WebsiteSearchTool, install the foundational package:

```shell
pip install 'crewai[tools]'
```

This command installs the dependencies needed to start using the tool as soon as it is fully integrated.

## Example Usage

The examples below show how the WebsiteSearchTool could be used in different scenarios. They are illustrative and represent planned functionality:

```python Code
from crewai_tools import WebsiteSearchTool

# Initialize the tool so that agents can search
# across any websites they discover
tool = WebsiteSearchTool()

# Limit the search to the content of a specific website,
# so agents can only search within that website
tool = WebsiteSearchTool(website='https://example.com')
```

## Arguments

- `website`: Optional. The URL of a website to restrict searches to. When omitted, agents can search across any website they discover.

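
This page does not say how the tool reacts to a malformed `website` value, so a caller may want to check the URL before constructing the tool. The following is a minimal sketch using only the Python standard library; the helper name `is_http_url` is hypothetical and is not part of `crewai_tools`.

```python
from urllib.parse import urlparse


def is_http_url(value: str) -> bool:
    """Return True if value looks like an absolute http(s) URL."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Hypothetical usage: validate the URL before handing it to the tool.
website = "https://example.com"
if not is_http_url(website):
    raise ValueError(f"Expected an absolute http(s) URL, got {website!r}")
# tool = WebsiteSearchTool(website=website)
```

The check is deliberately shallow: it confirms only that a scheme and a host are present, not that the site is reachable.
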
## Customization Options

By default, the tool uses OpenAI for embeddings, which requires `OPENAI_API_KEY` to be set. To use a different embedding model, pass a config dictionary with an `embedding_model` key:

```python Code
from crewai_tools import WebsiteSearchTool

tool = WebsiteSearchTool(
    config=dict(
        embedding_model=dict(
            provider="ollama",  # or openai, google-generativeai, azure, etc.
            config=dict(
                model_name="nomic-embed-text",
                url="http://localhost:11434/api/embeddings",
            ),
        ),
    )
)
```

### Available Embedding Providers

The following embedding providers are supported:

- `openai` - OpenAI embeddings (default)
- `ollama` - Ollama local embeddings
- `google-generativeai` - Google Generative AI embeddings
- `azure` - Azure OpenAI embeddings
- `huggingface` - HuggingFace embeddings
- `cohere` - Cohere embeddings
- `voyageai` - Voyage AI embeddings
- And more...

### Example with Google Generative AI

```python Code
from crewai_tools import WebsiteSearchTool

tool = WebsiteSearchTool(
    config=dict(
        embedding_model=dict(
            provider="google-generativeai",
            config=dict(
                model_name="models/embedding-001",
                task_type="RETRIEVAL_DOCUMENT",
            ),
        ),
    )
)
```

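
The Ollama and Google examples differ only in the `provider` value and the nested `config` options, so a project that switches between providers could keep those options in a lookup table. The sketch below assumes nothing beyond the dictionary shape shown in the examples; `EMBEDDING_PRESETS` and `make_rag_config` are hypothetical names, not part of `crewai_tools`.

```python
# Illustrative presets copied from the Ollama and Google examples on this page.
EMBEDDING_PRESETS = {
    "ollama": {
        "model_name": "nomic-embed-text",
        "url": "http://localhost:11434/api/embeddings",
    },
    "google-generativeai": {
        "model_name": "models/embedding-001",
        "task_type": "RETRIEVAL_DOCUMENT",
    },
}


def make_rag_config(provider: str) -> dict:
    """Build a config dict with the `embedding_model` key for one preset."""
    if provider not in EMBEDDING_PRESETS:
        raise KeyError(f"No preset defined for provider {provider!r}")
    return {
        "embedding_model": {
            "provider": provider,
            # Copy so callers can adjust the result without mutating the preset.
            "config": dict(EMBEDDING_PRESETS[provider]),
        }
    }


config = make_rag_config("ollama")
# tool = WebsiteSearchTool(config=config)
```

Keeping the presets in one place means a provider change touches a single string rather than a nested literal.
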
### Example with Azure OpenAI

```python Code
from crewai_tools import WebsiteSearchTool

tool = WebsiteSearchTool(
    config=dict(
        embedding_model=dict(
            provider="azure",
            config=dict(
                model="text-embedding-3-small",
                api_key="your-api-key",
                api_base="https://your-resource.openai.azure.com/",
                api_version="2024-02-01",
                deployment_id="your-deployment-id",
            ),
        ),
    )
)
```

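
The Azure example uses placeholder credentials. In practice it is usually safer to read them from the environment than to hard-code them. The sketch below builds the same dictionary from environment variables; the variable names (`AZURE_OPENAI_API_KEY` and so on) and the helper `azure_embedding_config` are assumptions chosen for illustration, not names that crewAI reads on its own.

```python
import os


def azure_embedding_config(env=os.environ) -> dict:
    """Build the Azure config from environment variables.

    The variable names are hypothetical; a missing required
    variable raises KeyError instead of passing an empty value.
    """
    return {
        "embedding_model": {
            "provider": "azure",
            "config": {
                "model": "text-embedding-3-small",
                "api_key": env["AZURE_OPENAI_API_KEY"],
                "api_base": env["AZURE_OPENAI_ENDPOINT"],
                "api_version": env.get("AZURE_OPENAI_API_VERSION", "2024-02-01"),
                "deployment_id": env["AZURE_OPENAI_DEPLOYMENT"],
            },
        }
    }


# Hypothetical usage, once the variables above are exported:
# tool = WebsiteSearchTool(config=azure_embedding_config())
```

Accepting the mapping as a parameter keeps the helper easy to test with a plain dictionary.
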
<Note>
The `llm` and `embedder` config keys from older documentation are deprecated.
Please use `embedding_model` instead of `embedder`. The `llm` key is not used by RAG tools;
the LLM for generation is controlled by the agent's LLM configuration.
</Note>
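
For code that still carries a config written against the older documentation, the key mapping described in the note can be sketched as a small standalone function. This is an illustration only: it is not crewAI's `_normalize_legacy_config` implementation, the name `migrate_legacy_config` is hypothetical, and the values in `old_config` are placeholders. It renames only the top-level keys and does not translate any options nested inside them.

```python
import warnings


def migrate_legacy_config(config: dict) -> dict:
    """Return a copy of config with legacy top-level keys mapped to the new format."""
    migrated = dict(config)
    if "embedder" in migrated:
        warnings.warn(
            "'embedder' is deprecated; use 'embedding_model'.",
            DeprecationWarning,
            stacklevel=2,
        )
        legacy = migrated.pop("embedder")
        # Prefer an explicit new-style key if both are present.
        migrated.setdefault("embedding_model", legacy)
    if "llm" in migrated:
        warnings.warn(
            "'llm' is ignored by RAG tools; configure the LLM on the agent.",
            DeprecationWarning,
            stacklevel=2,
        )
        migrated.pop("llm")
    return migrated


# Placeholder legacy config in the old 'llm' / 'embedder' layout.
old_config = {
    "llm": {"provider": "ollama", "config": {"model": "llama2"}},
    "embedder": {"provider": "ollama", "config": {"model_name": "nomic-embed-text"}},
}

# Capture the warnings so the example can show them without printing to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_config = migrate_legacy_config(old_config)
```

After the call, `new_config` holds a single `embedding_model` key, `caught` holds one warning per legacy key, and `old_config` is left unchanged.
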