mirror of https://github.com/crewAIInc/crewAI.git (synced 2026-01-26 16:48:13 +00:00)

docs: parity for all translations

This commit is contained in:

112 docs/pt-BR/tools/web-scraping/brightdata-tools.mdx (Normal file)
@@ -0,0 +1,112 @@

---
title: Bright Data Tools
description: Bright Data integrations for SERP search, Web Unlocker scraping, and Dataset API.
icon: spider
mode: "wide"
---

# Bright Data Tools

This set of tools integrates Bright Data services (SERP search, Web Unlocker scraping, and the Dataset API) into CrewAI for web data extraction.

## Installation

```shell
uv add crewai-tools requests aiohttp
```

## Environment Variables

- `BRIGHT_DATA_API_KEY` (required)
- `BRIGHT_DATA_ZONE` (required for SERP search and Web Unlocker)

Create credentials at https://brightdata.com/: sign up, then create an API token and a zone. See the Bright Data docs for details: https://developers.brightdata.com/

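If you prefer to set these from Python rather than your shell, here is a minimal sketch; the values are placeholders, not real credentials:

```python Code
import os

# Placeholder values; substitute your real Bright Data credentials.
os.environ["BRIGHT_DATA_API_KEY"] = "your-api-token"  # required
os.environ["BRIGHT_DATA_ZONE"] = "your-zone-name"     # SERP search / Web Unlocker
```
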
## Included Tools

- `BrightDataSearchTool`: SERP search (Google/Bing/Yandex) with geo/language/device options.
- `BrightDataWebUnlockerTool`: Scrape pages with anti-bot bypass and rendering.
- `BrightDataDatasetTool`: Run Dataset API jobs and fetch results.

## Examples

### SERP Search

Run a SERP search for a query, scoped to a country:

```python Code
from crewai_tools import BrightDataSearchTool

# Search parameters can be fixed at construction time.
tool = BrightDataSearchTool(
    query="CrewAI",
    country="us",
)

print(tool.run())
```

### Web Unlocker

Fetch a page through Web Unlocker and return it as Markdown:

```python Code
from crewai_tools import BrightDataWebUnlockerTool

tool = BrightDataWebUnlockerTool(
    url="https://example.com",
    format="markdown",
)

# The target URL can also be passed at run time.
print(tool.run(url="https://example.com"))
```

### Dataset API

Run a Dataset API job against a product page and print the results:

```python Code
from crewai_tools import BrightDataDatasetTool

tool = BrightDataDatasetTool(
    dataset_type="ecommerce",
    url="https://example.com/product",
)

print(tool.run())
```

## Troubleshooting

- 401/403 responses: verify that `BRIGHT_DATA_API_KEY` and `BRIGHT_DATA_ZONE` are set and valid.
- Empty or blocked content: enable rendering or try a different zone.

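A minimal defensive sketch of both checks, assuming only the documented environment variables and the `run()` call shown above, and assuming the tool reads its zone from the environment when constructed; the fallback zone name is a placeholder:

```python Code
import os

from crewai_tools import BrightDataWebUnlockerTool


def fetch(url: str) -> str:
    # Construct the tool per attempt so a changed BRIGHT_DATA_ZONE takes effect.
    tool = BrightDataWebUnlockerTool(url=url, format="markdown")
    return tool.run(url=url)


try:
    content = fetch("https://example.com")
    if not content:
        # Empty or blocked content: retry against another zone (placeholder name).
        os.environ["BRIGHT_DATA_ZONE"] = "my-fallback-zone"
        content = fetch("https://example.com")
except Exception as exc:
    # 401/403 usually means a missing or invalid key or zone.
    raise RuntimeError(
        "Bright Data request failed; check BRIGHT_DATA_API_KEY and BRIGHT_DATA_ZONE"
    ) from exc
```
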
## Example

Using the search tool inside a full crew:

```python Code
from crewai import Agent, Task, Crew
from crewai_tools import BrightDataSearchTool

tool = BrightDataSearchTool(
    query="CrewAI",
    country="us",
)

agent = Agent(
    role="Web Researcher",
    goal="Search with Bright Data",
    backstory="Finds reliable results",
    tools=[tool],
    verbose=True,
)

task = Task(
    description="Search for CrewAI and summarize top results",
    expected_output="Short summary with links",
    agent=agent,
)

crew = Crew(
    agents=[agent],
    tasks=[task],
    verbose=True,
)

result = crew.kickoff()
```

101 docs/pt-BR/tools/web-scraping/serperscrapewebsitetool.mdx (Normal file)
@@ -0,0 +1,101 @@

---
title: Serper Scrape Website
description: The `SerperScrapeWebsiteTool` is designed to scrape websites and extract clean, readable content using Serper's scraping API.
icon: globe
mode: "wide"
---

# `SerperScrapeWebsiteTool`

## Description

This tool scrapes website content and extracts clean, readable text from any URL. It uses the [serper.dev](https://serper.dev) scraping API to fetch and process web pages, optionally applying markdown formatting for better structure and readability.

## Installation

To use the `SerperScrapeWebsiteTool`, follow these steps:

1. **Package Installation**: Confirm that the `crewai[tools]` package is installed in your Python environment.
2. **API Key Acquisition**: Acquire a `serper.dev` API key by registering for an account at `serper.dev`.
3. **Environment Configuration**: Store your API key in an environment variable named `SERPER_API_KEY` so the tool can pick it up (see the sketch after the installation command).

To incorporate this tool into your project, run:

```shell
pip install 'crewai[tools]'
```

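A minimal sketch of checking that configuration up front; the error message here is illustrative, not the tool's actual wording:

```python Code
import os

# The tool reads SERPER_API_KEY from the environment; fail early if it is missing.
if not os.getenv("SERPER_API_KEY"):
    raise RuntimeError(
        "SERPER_API_KEY is not set; export it before using SerperScrapeWebsiteTool."
    )
```
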
## Example

The following example demonstrates how to initialize the tool and scrape a website:

```python Code
from crewai_tools import SerperScrapeWebsiteTool

# Initialize the tool for website scraping capabilities
tool = SerperScrapeWebsiteTool()

# Scrape a website with markdown formatting
result = tool.run(url="https://example.com", include_markdown=True)
```

## Arguments

The `SerperScrapeWebsiteTool` accepts the following arguments:

- **url**: Required. The URL of the website to scrape.
- **include_markdown**: Optional. Whether to include markdown formatting in the scraped content. Defaults to `True`.

## Example with Parameters

Here is an example demonstrating how to use the tool with different parameters:

```python Code
from crewai_tools import SerperScrapeWebsiteTool

tool = SerperScrapeWebsiteTool()

# Scrape with markdown formatting (default)
markdown_result = tool.run(
    url="https://docs.crewai.com",
    include_markdown=True
)

# Scrape without markdown formatting for plain text
plain_result = tool.run(
    url="https://docs.crewai.com",
    include_markdown=False
)

print("Markdown formatted content:")
print(markdown_result)

print("\nPlain text content:")
print(plain_result)
```

## Use Cases

The `SerperScrapeWebsiteTool` is particularly useful for:

- **Content Analysis**: Extract and analyze website content for research purposes
- **Data Collection**: Gather structured information from web pages
- **Documentation Processing**: Convert web-based documentation into readable formats
- **Competitive Analysis**: Scrape competitor websites for market research
- **Content Migration**: Extract content from existing websites for migration purposes

## Error Handling

The tool includes error handling for:

- **Network Issues**: Handles connection timeouts and network errors gracefully
- **API Errors**: Provides detailed error messages for API-related issues
- **Invalid URLs**: Validates and reports issues with malformed URLs
- **Authentication**: Clear error messages for missing or invalid API keys

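A minimal sketch of defensive usage on the caller's side, assuming nothing beyond the `run(url=..., include_markdown=...)` signature shown above; whether a given failure surfaces as an exception or as an error string in the result is tool-dependent:

```python Code
from crewai_tools import SerperScrapeWebsiteTool

tool = SerperScrapeWebsiteTool()

try:
    result = tool.run(url="https://docs.crewai.com", include_markdown=True)
except Exception as exc:
    # Network failures or a missing/invalid SERPER_API_KEY may raise here.
    print(f"Scrape failed: {exc}")
else:
    print(result)
```
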
## Security Considerations

- Always store your `SERPER_API_KEY` in environment variables; never hardcode it in your source code
- Be mindful of rate limits imposed by the Serper API
- Respect robots.txt and website terms of service when scraping content
- Consider implementing delays between requests for large-scale scraping operations, as in the sketch below

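A minimal sketch of spacing out requests when scraping several pages; the one-second delay and the URL list are illustrative values, not API requirements:

```python Code
import time

from crewai_tools import SerperScrapeWebsiteTool

tool = SerperScrapeWebsiteTool()

urls = [
    "https://example.com/page-1",
    "https://example.com/page-2",
]

for url in urls:
    print(tool.run(url=url, include_markdown=True))
    time.sleep(1.0)  # Illustrative pause to stay well under rate limits.
```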