mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-09 16:18:30 +00:00
Migrate docs from MkDocs to Mintlify (#1423)
* add new mintlify docs * add favicon.svg * minor edits * add github stats
This commit is contained in:
43
docs/tools/firecrawlscrapewebsitetool.mdx
Normal file
43
docs/tools/firecrawlscrapewebsitetool.mdx
Normal file
@@ -0,0 +1,43 @@
|
||||
---
|
||||
title: Firecrawl Scrape Website
|
||||
description: The `FirecrawlScrapeWebsiteTool` is designed to scrape websites and convert them into clean markdown or structured data.
|
||||
icon: fire-flame
|
||||
---
|
||||
|
||||
# `FirecrawlScrapeWebsiteTool`
|
||||
|
||||
## Description
|
||||
|
||||
[Firecrawl](https://firecrawl.dev) is a platform for crawling and convert any website into clean markdown or structured data.
|
||||
|
||||
## Installation
|
||||
|
||||
- Get an API key from [firecrawl.dev](https://firecrawl.dev) and set it in environment variables (`FIRECRAWL_API_KEY`).
|
||||
- Install the [Firecrawl SDK](https://github.com/mendableai/firecrawl) along with `crewai[tools]` package:
|
||||
|
||||
```shell
|
||||
pip install firecrawl-py 'crewai[tools]'
|
||||
```
|
||||
|
||||
## Example
|
||||
|
||||
Utilize the FirecrawlScrapeWebsiteTool as follows to allow your agent to load websites:
|
||||
|
||||
```python Code
|
||||
from crewai_tools import FirecrawlScrapeWebsiteTool
|
||||
|
||||
tool = FirecrawlScrapeWebsiteTool(url='firecrawl.dev')
|
||||
```
|
||||
|
||||
## Arguments
|
||||
|
||||
- `api_key`: Optional. Specifies Firecrawl API key. Defaults is the `FIRECRAWL_API_KEY` environment variable.
|
||||
- `url`: The URL to scrape.
|
||||
- `page_options`: Optional.
|
||||
- `onlyMainContent`: Optional. Only return the main content of the page excluding headers, navs, footers, etc.
|
||||
- `includeHtml`: Optional. Include the raw HTML content of the page. Will output a html key in the response.
|
||||
- `extractor_options`: Optional. Options for LLM-based extraction of structured information from the page content
|
||||
- `mode`: The extraction mode to use, currently supports 'llm-extraction'
|
||||
- `extractionPrompt`: Optional. A prompt describing what information to extract from the page
|
||||
- `extractionSchema`: Optional. The schema for the data to be extracted
|
||||
- `timeout`: Optional. Timeout in milliseconds for the request
|
||||
Reference in New Issue
Block a user