ScrapeWebsiteTool
Description
A tool designed to extract and read the content of a specified website. It is capable of handling various types of web pages by making HTTP requests and parsing the received HTML content. This tool can be particularly useful for web scraping tasks, data collection, or extracting specific information from websites.
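To make the "HTTP request plus HTML parsing" idea concrete, here is a minimal standard-library sketch of that general technique. It is not the ScrapeWebsiteTool implementation; the names `extract_text` and `fetch_text`, the User-Agent string, and the timeout are assumptions chosen for illustration only.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Hypothetical helper: download a page and return its visible text."""
    # The User-Agent value is an illustrative assumption.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_text(html)
```

Calling `fetch_text('https://www.example.com')` would perform the network request; `extract_text` can be tried offline on any HTML string.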
Installation
Install the crewai_tools package
pip install 'crewai[tools]'
Example
from crewai_tools import ScrapeWebsiteTool
# Without a URL, the agent can scrape any website it finds during its execution
tool = ScrapeWebsiteTool()
# Initialize the tool with a website URL so the agent can only scrape the content of that website
tool = ScrapeWebsiteTool(website_url='https://www.example.com')
Arguments
website_url: The URL of the website to scrape and read. This is the tool's primary input. As the example shows, it can be fixed when the tool is initialized or left unset, in which case the agent supplies the URL during execution.