mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-01-10 16:48:30 +00:00
docs: add StagehandTool documentation and improve MDX structure (#2842)
This commit is contained in:
244
docs/tools/stagehandtool.mdx
Normal file
244
docs/tools/stagehandtool.mdx
Normal file
@@ -0,0 +1,244 @@
|
||||
---
|
||||
title: Stagehand Tool
|
||||
description: Web automation tool that integrates Stagehand with CrewAI for browser interaction and automation
|
||||
icon: hand
|
||||
---
|
||||
|
||||
|
||||
# Overview
|
||||
|
||||
The `StagehandTool` integrates the [Stagehand](https://docs.stagehand.dev/get_started/introduction) framework with CrewAI, enabling agents to interact with websites and automate browser tasks using natural language instructions.
|
||||
|
||||
## Overview
|
||||
|
||||
Stagehand is a powerful browser automation framework built by Browserbase that allows AI agents to:
|
||||
|
||||
- Navigate to websites
|
||||
- Click buttons, links, and other elements
|
||||
- Fill in forms
|
||||
- Extract data from web pages
|
||||
- Observe and identify elements
|
||||
- Perform complex workflows
|
||||
|
||||
The StagehandTool wraps the Stagehand Python SDK to provide CrewAI agents with browser control capabilities through three core primitives:
|
||||
|
||||
1. **Act**: Perform actions like clicking, typing, or navigating
|
||||
2. **Extract**: Extract structured data from web pages
|
||||
3. **Observe**: Identify and analyze elements on the page
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before using this tool, ensure you have:
|
||||
|
||||
1. A [Browserbase](https://www.browserbase.com/) account with API key and project ID
|
||||
2. An API key for an LLM (OpenAI or Anthropic Claude)
|
||||
3. The Stagehand Python SDK installed
|
||||
|
||||
Install the required dependency:
|
||||
|
||||
```bash
|
||||
pip install stagehand-py
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
### Basic Implementation
|
||||
|
||||
The StagehandTool can be implemented in two ways:
|
||||
|
||||
#### 1. Using Context Manager (Recommended)
|
||||
<Tip>
|
||||
The context manager approach is recommended as it ensures proper cleanup of resources even if exceptions occur.
|
||||
</Tip>
|
||||
|
||||
```python
|
||||
from crewai import Agent, Task, Crew
|
||||
from crewai_tools import StagehandTool
|
||||
from stagehand.schemas import AvailableModel
|
||||
|
||||
# Initialize the tool with your API keys using a context manager
|
||||
with StagehandTool(
|
||||
api_key="your-browserbase-api-key",
|
||||
project_id="your-browserbase-project-id",
|
||||
model_api_key="your-llm-api-key", # OpenAI or Anthropic API key
|
||||
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST, # Optional: specify which model to use
|
||||
) as stagehand_tool:
|
||||
# Create an agent with the tool
|
||||
researcher = Agent(
|
||||
role="Web Researcher",
|
||||
goal="Find and summarize information from websites",
|
||||
backstory="I'm an expert at finding information online.",
|
||||
verbose=True,
|
||||
tools=[stagehand_tool],
|
||||
)
|
||||
|
||||
# Create a task that uses the tool
|
||||
research_task = Task(
|
||||
description="Go to https://www.example.com and tell me what you see on the homepage.",
|
||||
agent=researcher,
|
||||
)
|
||||
|
||||
# Run the crew
|
||||
crew = Crew(
|
||||
agents=[researcher],
|
||||
tasks=[research_task],
|
||||
verbose=True,
|
||||
)
|
||||
|
||||
result = crew.kickoff()
|
||||
print(result)
|
||||
```
|
||||
|
||||
#### 2. Manual Resource Management
|
||||
|
||||
```python
|
||||
from crewai import Agent, Task, Crew
|
||||
from crewai_tools import StagehandTool
|
||||
from stagehand.schemas import AvailableModel
|
||||
|
||||
# Initialize the tool with your API keys
|
||||
stagehand_tool = StagehandTool(
|
||||
api_key="your-browserbase-api-key",
|
||||
project_id="your-browserbase-project-id",
|
||||
model_api_key="your-llm-api-key",
|
||||
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
|
||||
)
|
||||
|
||||
try:
|
||||
# Create an agent with the tool
|
||||
researcher = Agent(
|
||||
role="Web Researcher",
|
||||
goal="Find and summarize information from websites",
|
||||
backstory="I'm an expert at finding information online.",
|
||||
verbose=True,
|
||||
tools=[stagehand_tool],
|
||||
)
|
||||
|
||||
# Create a task that uses the tool
|
||||
research_task = Task(
|
||||
description="Go to https://www.example.com and tell me what you see on the homepage.",
|
||||
agent=researcher,
|
||||
)
|
||||
|
||||
# Run the crew
|
||||
crew = Crew(
|
||||
agents=[researcher],
|
||||
tasks=[research_task],
|
||||
verbose=True,
|
||||
)
|
||||
|
||||
result = crew.kickoff()
|
||||
print(result)
|
||||
finally:
|
||||
# Explicitly clean up resources
|
||||
stagehand_tool.close()
|
||||
```
|
||||
|
||||
## Command Types
|
||||
|
||||
The StagehandTool supports three different command types for specific web automation tasks:
|
||||
|
||||
### 1. Act Command
|
||||
|
||||
The `act` command type (default) enables webpage interactions like clicking buttons, filling forms, and navigation.
|
||||
|
||||
```python
|
||||
# Perform an action (default behavior)
|
||||
result = stagehand_tool.run(
|
||||
instruction="Click the login button",
|
||||
url="https://example.com",
|
||||
command_type="act" # Default, so can be omitted
|
||||
)
|
||||
|
||||
# Fill out a form
|
||||
result = stagehand_tool.run(
|
||||
instruction="Fill the contact form with name 'John Doe', email 'john@example.com', and message 'Hello world'",
|
||||
url="https://example.com/contact"
|
||||
)
|
||||
```
|
||||
|
||||
### 2. Extract Command
|
||||
|
||||
The `extract` command type retrieves structured data from webpages.
|
||||
|
||||
```python
|
||||
# Extract all product information
|
||||
result = stagehand_tool.run(
|
||||
instruction="Extract all product names, prices, and descriptions",
|
||||
url="https://example.com/products",
|
||||
command_type="extract"
|
||||
)
|
||||
|
||||
# Extract specific information with a selector
|
||||
result = stagehand_tool.run(
|
||||
instruction="Extract the main article title and content",
|
||||
url="https://example.com/blog/article",
|
||||
command_type="extract",
|
||||
selector=".article-container" # Optional CSS selector
|
||||
)
|
||||
```
|
||||
|
||||
### 3. Observe Command
|
||||
|
||||
The `observe` command type identifies and analyzes webpage elements.
|
||||
|
||||
```python
|
||||
# Find interactive elements
|
||||
result = stagehand_tool.run(
|
||||
instruction="Find all interactive elements in the navigation menu",
|
||||
url="https://example.com",
|
||||
command_type="observe"
|
||||
)
|
||||
|
||||
# Identify form fields
|
||||
result = stagehand_tool.run(
|
||||
instruction="Identify all the input fields in the registration form",
|
||||
url="https://example.com/register",
|
||||
command_type="observe",
|
||||
selector="#registration-form"
|
||||
)
|
||||
```
|
||||
|
||||
## Configuration Options
|
||||
|
||||
Customize the StagehandTool behavior with these parameters:
|
||||
|
||||
```python
|
||||
stagehand_tool = StagehandTool(
|
||||
api_key="your-browserbase-api-key",
|
||||
project_id="your-browserbase-project-id",
|
||||
model_api_key="your-llm-api-key",
|
||||
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
|
||||
dom_settle_timeout_ms=5000, # Wait longer for DOM to settle
|
||||
headless=True, # Run browser in headless mode
|
||||
self_heal=True, # Attempt to recover from errors
|
||||
wait_for_captcha_solves=True, # Wait for CAPTCHA solving
|
||||
verbose=1, # Control logging verbosity (0-3)
|
||||
)
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Be Specific**: Provide detailed instructions for better results
|
||||
2. **Choose Appropriate Command Type**: Select the right command type for your task
|
||||
3. **Use Selectors**: Leverage CSS selectors to improve accuracy
|
||||
4. **Break Down Complex Tasks**: Split complex workflows into multiple tool calls
|
||||
5. **Implement Error Handling**: Add error handling for potential issues
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
|
||||
Common issues and solutions:
|
||||
|
||||
- **Session Issues**: Verify API keys for both Browserbase and LLM provider
|
||||
- **Element Not Found**: Increase `dom_settle_timeout_ms` for slower pages
|
||||
- **Action Failures**: Use `observe` to identify correct elements first
|
||||
- **Incomplete Data**: Refine instructions or provide specific selectors
|
||||
|
||||
|
||||
## Additional Resources
|
||||
|
||||
For questions about the CrewAI integration:
|
||||
- Join Stagehand's [Slack community](https://stagehand.dev/slack)
|
||||
- Open an issue in the [Stagehand repository](https://github.com/browserbase/stagehand)
|
||||
- Visit [Stagehand documentation](https://docs.stagehand.dev/)
|
||||
Reference in New Issue
Block a user