Files
crewAI/docs/en/tools/web-scraping/firecrawlscrapewebsitetool.mdx
Tony Kipkemboi 1a1bb0ca3d docs: Docs updates (#3459)
* docs(cli): document device-code login and config reset guidance; renumber sections

* docs(cli): fix duplicate numbering (renumber Login/API Keys/Configuration sections)

* docs: Fix webhook documentation to include meta dict in all webhook payloads

- Add note explaining that meta objects from kickoff requests are included in all webhook payloads
- Update webhook examples to show proper payload structure including meta field
- Fix webhook examples to match actual API implementation
- Apply changes to English, Korean, and Portuguese documentation

Resolves the documentation gap where meta dict passing to webhooks was not documented despite being implemented in the API.

* WIP: CrewAI docs theme, changelog, GEO, localization

* docs(cli): fix merge markers; ensure mode: "wide"; convert ASCII tables to Markdown (en/pt-BR/ko)

* docs: add group icons across locales; split Automation/Integrations; update tools overviews and links
2025-09-05 17:40:11 -04:00

45 lines
1.7 KiB
Plaintext

---
title: Firecrawl Scrape Website
description: The `FirecrawlScrapeWebsiteTool` is designed to scrape websites and convert them into clean markdown or structured data.
icon: fire-flame
mode: "wide"
---
# `FirecrawlScrapeWebsiteTool`
## Description
[Firecrawl](https://firecrawl.dev) is a platform for crawling and convert any website into clean markdown or structured data.
## Installation
- Get an API key from [firecrawl.dev](https://firecrawl.dev) and set it in environment variables (`FIRECRAWL_API_KEY`).
- Install the [Firecrawl SDK](https://github.com/mendableai/firecrawl) along with `crewai[tools]` package:
```shell
pip install firecrawl-py 'crewai[tools]'
```
## Example
Utilize the FirecrawlScrapeWebsiteTool as follows to allow your agent to load websites:
```python Code
from crewai_tools import FirecrawlScrapeWebsiteTool
tool = FirecrawlScrapeWebsiteTool(url='firecrawl.dev')
```
## Arguments
- `api_key`: Optional. Specifies Firecrawl API key. Defaults is the `FIRECRAWL_API_KEY` environment variable.
- `url`: The URL to scrape.
- `page_options`: Optional.
- `onlyMainContent`: Optional. Only return the main content of the page excluding headers, navs, footers, etc.
- `includeHtml`: Optional. Include the raw HTML content of the page. Will output a html key in the response.
- `extractor_options`: Optional. Options for LLM-based extraction of structured information from the page content
- `mode`: The extraction mode to use, currently supports 'llm-extraction'
- `extractionPrompt`: Optional. A prompt describing what information to extract from the page
- `extractionSchema`: Optional. The schema for the data to be extracted
- `timeout`: Optional. Timeout in milliseconds for the request