mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-10 00:28:31 +00:00
Added Firecrawl tools to docs (#628)
This commit is contained in:
42
docs/tools/FirecrawlCrawlWebsiteTool.md
Normal file
42
docs/tools/FirecrawlCrawlWebsiteTool.md
Normal file
@@ -0,0 +1,42 @@
|
|||||||
|
# FirecrawlCrawlWebsiteTool
|
||||||
|
|
||||||
|
## Description
|
||||||
|
|
||||||
|
[Firecrawl](https://firecrawl.dev) is a platform for crawling and convert any website into clean markdown or structured data.
|
||||||
|
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
- Get an API key from [firecrawl.dev](https://firecrawl.dev) and set it in environment variables (`FIRECRAWL_API_KEY`).
|
||||||
|
- Install the [Firecrawl SDK](https://github.com/mendableai/firecrawl) along with `crewai[tools]` package:
|
||||||
|
|
||||||
|
```
|
||||||
|
pip install firecrawl-py 'crewai[tools]'
|
||||||
|
```
|
||||||
|
|
||||||
|
## Example
|
||||||
|
|
||||||
|
Utilize the FirecrawlScrapeFromWebsiteTool as follows to allow your agent to load websites:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from crewai_tools import FirecrawlCrawlWebsiteTool
|
||||||
|
|
||||||
|
tool = FirecrawlCrawlWebsiteTool(url='firecrawl.dev')
|
||||||
|
```
|
||||||
|
|
||||||
|
## Arguments
|
||||||
|
|
||||||
|
- `api_key`: Optional. Specifies Firecrawl API key. Defaults is the `FIRECRAWL_API_KEY` environment variable.
|
||||||
|
- `url`: The base URL to start crawling from.
|
||||||
|
- `page_options`: Optional.
|
||||||
|
- `onlyMainContent`: Optional. Only return the main content of the page excluding headers, navs, footers, etc.
|
||||||
|
- `includeHtml`: Optional. Include the raw HTML content of the page. Will output a html key in the response.
|
||||||
|
- `crawler_options`: Optional. Options for controlling the crawling behavior.
|
||||||
|
- `includes`: Optional. URL patterns to include in the crawl.
|
||||||
|
- `exclude`: Optional. URL patterns to exclude from the crawl.
|
||||||
|
- `generateImgAltText`: Optional. Generate alt text for images using LLMs (requires a paid plan).
|
||||||
|
- `returnOnlyUrls`: Optional. If true, returns only the URLs as a list in the crawl status. Note: the response will be a list of URLs inside the data, not a list of documents.
|
||||||
|
- `maxDepth`: Optional. Maximum depth to crawl. Depth 1 is the base URL, depth 2 includes the base URL and its direct children, and so on.
|
||||||
|
- `mode`: Optional. The crawling mode to use. Fast mode crawls 4x faster on websites without a sitemap but may not be as accurate and shouldn't be used on heavily JavaScript-rendered websites.
|
||||||
|
- `limit`: Optional. Maximum number of pages to crawl.
|
||||||
|
- `timeout`: Optional. Timeout in milliseconds for the crawling operation.
|
||||||
|
|
||||||
38
docs/tools/FirecrawlScrapeWebsiteTool.md
Normal file
38
docs/tools/FirecrawlScrapeWebsiteTool.md
Normal file
@@ -0,0 +1,38 @@
|
|||||||
|
# FirecrawlScrapeWebsiteTool
|
||||||
|
|
||||||
|
## Description
|
||||||
|
|
||||||
|
[Firecrawl](https://firecrawl.dev) is a platform for crawling and convert any website into clean markdown or structured data.
|
||||||
|
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
- Get an API key from [firecrawl.dev](https://firecrawl.dev) and set it in environment variables (`FIRECRAWL_API_KEY`).
|
||||||
|
- Install the [Firecrawl SDK](https://github.com/mendableai/firecrawl) along with `crewai[tools]` package:
|
||||||
|
|
||||||
|
```
|
||||||
|
pip install firecrawl-py 'crewai[tools]'
|
||||||
|
```
|
||||||
|
|
||||||
|
## Example
|
||||||
|
|
||||||
|
Utilize the FirecrawlScrapeWebsiteTool as follows to allow your agent to load websites:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from crewai_tools import FirecrawlScrapeWebsiteTool
|
||||||
|
|
||||||
|
tool = FirecrawlScrapeWebsiteTool(url='firecrawl.dev')
|
||||||
|
```
|
||||||
|
|
||||||
|
## Arguments
|
||||||
|
|
||||||
|
- `api_key`: Optional. Specifies Firecrawl API key. Defaults is the `FIRECRAWL_API_KEY` environment variable.
|
||||||
|
- `url`: The URL to scrape.
|
||||||
|
- `page_options`: Optional.
|
||||||
|
- `onlyMainContent`: Optional. Only return the main content of the page excluding headers, navs, footers, etc.
|
||||||
|
- `includeHtml`: Optional. Include the raw HTML content of the page. Will output a html key in the response.
|
||||||
|
- `extractor_options`: Optional. Options for LLM-based extraction of structured information from the page content
|
||||||
|
- `mode`: The extraction mode to use, currently supports 'llm-extraction'
|
||||||
|
- `extractionPrompt`: Optional. A prompt describing what information to extract from the page
|
||||||
|
- `extractionSchema`: Optional. The schema for the data to be extracted
|
||||||
|
- `timeout`: Optional. Timeout in milliseconds for the request
|
||||||
|
|
||||||
35
docs/tools/FirecrawlSearchTool.md
Normal file
35
docs/tools/FirecrawlSearchTool.md
Normal file
@@ -0,0 +1,35 @@
|
|||||||
|
# FirecrawlSearchTool
|
||||||
|
|
||||||
|
## Description
|
||||||
|
|
||||||
|
[Firecrawl](https://firecrawl.dev) is a platform for crawling and convert any website into clean markdown or structured data.
|
||||||
|
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
- Get an API key from [firecrawl.dev](https://firecrawl.dev) and set it in environment variables (`FIRECRAWL_API_KEY`).
|
||||||
|
- Install the [Firecrawl SDK](https://github.com/mendableai/firecrawl) along with `crewai[tools]` package:
|
||||||
|
|
||||||
|
```
|
||||||
|
pip install firecrawl-py 'crewai[tools]'
|
||||||
|
```
|
||||||
|
|
||||||
|
## Example
|
||||||
|
|
||||||
|
Utilize the FirecrawlSearchTool as follows to allow your agent to load websites:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from crewai_tools import FirecrawlSearchTool
|
||||||
|
|
||||||
|
tool = FirecrawlSearchTool(query='what is firecrawl?')
|
||||||
|
```
|
||||||
|
|
||||||
|
## Arguments
|
||||||
|
|
||||||
|
- `api_key`: Optional. Specifies Firecrawl API key. Defaults is the `FIRECRAWL_API_KEY` environment variable.
|
||||||
|
- `query`: The search query string to be used for searching.
|
||||||
|
- `page_options`: Optional. Options for result formatting.
|
||||||
|
- `onlyMainContent`: Optional. Only return the main content of the page excluding headers, navs, footers, etc.
|
||||||
|
- `includeHtml`: Optional. Include the raw HTML content of the page. Will output a html key in the response.
|
||||||
|
- `fetchPageContent`: Optional. Fetch the full content of the page.
|
||||||
|
- `search_options`: Optional. Options for controlling the crawling behavior.
|
||||||
|
- `limit`: Optional. Maximum number of pages to crawl.
|
||||||
@@ -152,6 +152,9 @@ nav:
|
|||||||
- Agent Monitoring with AgentOps: 'how-to/AgentOps-Observability.md'
|
- Agent Monitoring with AgentOps: 'how-to/AgentOps-Observability.md'
|
||||||
- Agent Monitoring with LangTrace: 'how-to/Langtrace-Observability.md'
|
- Agent Monitoring with LangTrace: 'how-to/Langtrace-Observability.md'
|
||||||
- Tools Docs:
|
- Tools Docs:
|
||||||
|
- Firecrawl Scrape Website Tool: 'tools/FirecrawlScrapeWebsiteTool.md'
|
||||||
|
- Firecrawl Crawl Website Tool: 'tools/FirecrawlCrawlWebsiteTool.md'
|
||||||
|
- Firecrawl Search Tool: 'tools/FirecrawlSearchTool.md'
|
||||||
- Google Serper Search: 'tools/SerperDevTool.md'
|
- Google Serper Search: 'tools/SerperDevTool.md'
|
||||||
- Browserbase Web Loader: 'tools/BrowserbaseLoadTool.md'
|
- Browserbase Web Loader: 'tools/BrowserbaseLoadTool.md'
|
||||||
- Composio Tools: 'tools/ComposioTool.md'
|
- Composio Tools: 'tools/ComposioTool.md'
|
||||||
|
|||||||
Reference in New Issue
Block a user