mirror of
https://github.com/crewAIInc/crewAI.git
synced 2025-12-16 04:18:35 +00:00
245 lines
7.1 KiB
Plaintext
245 lines
7.1 KiB
Plaintext
---
|
|
title: Stagehand Tool
|
|
description: Web automation tool that integrates Stagehand with CrewAI for browser interaction and automation
|
|
icon: hand
|
|
---
|
|
|
|
|
|
# Overview
|
|
|
|
The `StagehandTool` integrates the [Stagehand](https://docs.stagehand.dev/get_started/introduction) framework with CrewAI, enabling agents to interact with websites and automate browser tasks using natural language instructions.
|
|
|
|
## Overview
|
|
|
|
Stagehand is a powerful browser automation framework built by Browserbase that allows AI agents to:
|
|
|
|
- Navigate to websites
|
|
- Click buttons, links, and other elements
|
|
- Fill in forms
|
|
- Extract data from web pages
|
|
- Observe and identify elements
|
|
- Perform complex workflows
|
|
|
|
The StagehandTool wraps the Stagehand Python SDK to provide CrewAI agents with browser control capabilities through three core primitives:
|
|
|
|
1. **Act**: Perform actions like clicking, typing, or navigating
|
|
2. **Extract**: Extract structured data from web pages
|
|
3. **Observe**: Identify and analyze elements on the page
|
|
|
|
## Prerequisites
|
|
|
|
Before using this tool, ensure you have:
|
|
|
|
1. A [Browserbase](https://www.browserbase.com/) account with API key and project ID
|
|
2. An API key for an LLM (OpenAI or Anthropic Claude)
|
|
3. The Stagehand Python SDK installed
|
|
|
|
Install the required dependency:
|
|
|
|
```bash
|
|
pip install stagehand-py
|
|
```
|
|
|
|
## Usage
|
|
|
|
### Basic Implementation
|
|
|
|
The StagehandTool can be implemented in two ways:
|
|
|
|
#### 1. Using Context Manager (Recommended)
|
|
<Tip>
|
|
The context manager approach is recommended as it ensures proper cleanup of resources even if exceptions occur.
|
|
</Tip>
|
|
|
|
```python
|
|
from crewai import Agent, Task, Crew
|
|
from crewai_tools import StagehandTool
|
|
from stagehand.schemas import AvailableModel
|
|
|
|
# Initialize the tool with your API keys using a context manager
|
|
with StagehandTool(
|
|
api_key="your-browserbase-api-key",
|
|
project_id="your-browserbase-project-id",
|
|
model_api_key="your-llm-api-key", # OpenAI or Anthropic API key
|
|
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST, # Optional: specify which model to use
|
|
) as stagehand_tool:
|
|
# Create an agent with the tool
|
|
researcher = Agent(
|
|
role="Web Researcher",
|
|
goal="Find and summarize information from websites",
|
|
backstory="I'm an expert at finding information online.",
|
|
verbose=True,
|
|
tools=[stagehand_tool],
|
|
)
|
|
|
|
# Create a task that uses the tool
|
|
research_task = Task(
|
|
description="Go to https://www.example.com and tell me what you see on the homepage.",
|
|
agent=researcher,
|
|
)
|
|
|
|
# Run the crew
|
|
crew = Crew(
|
|
agents=[researcher],
|
|
tasks=[research_task],
|
|
verbose=True,
|
|
)
|
|
|
|
result = crew.kickoff()
|
|
print(result)
|
|
```
|
|
|
|
#### 2. Manual Resource Management
|
|
|
|
```python
|
|
from crewai import Agent, Task, Crew
|
|
from crewai_tools import StagehandTool
|
|
from stagehand.schemas import AvailableModel
|
|
|
|
# Initialize the tool with your API keys
|
|
stagehand_tool = StagehandTool(
|
|
api_key="your-browserbase-api-key",
|
|
project_id="your-browserbase-project-id",
|
|
model_api_key="your-llm-api-key",
|
|
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
|
|
)
|
|
|
|
try:
|
|
# Create an agent with the tool
|
|
researcher = Agent(
|
|
role="Web Researcher",
|
|
goal="Find and summarize information from websites",
|
|
backstory="I'm an expert at finding information online.",
|
|
verbose=True,
|
|
tools=[stagehand_tool],
|
|
)
|
|
|
|
# Create a task that uses the tool
|
|
research_task = Task(
|
|
description="Go to https://www.example.com and tell me what you see on the homepage.",
|
|
agent=researcher,
|
|
)
|
|
|
|
# Run the crew
|
|
crew = Crew(
|
|
agents=[researcher],
|
|
tasks=[research_task],
|
|
verbose=True,
|
|
)
|
|
|
|
result = crew.kickoff()
|
|
print(result)
|
|
finally:
|
|
# Explicitly clean up resources
|
|
stagehand_tool.close()
|
|
```
|
|
|
|
## Command Types
|
|
|
|
The StagehandTool supports three different command types for specific web automation tasks:
|
|
|
|
### 1. Act Command
|
|
|
|
The `act` command type (default) enables webpage interactions like clicking buttons, filling forms, and navigation.
|
|
|
|
```python
|
|
# Perform an action (default behavior)
|
|
result = stagehand_tool.run(
|
|
instruction="Click the login button",
|
|
url="https://example.com",
|
|
command_type="act" # Default, so can be omitted
|
|
)
|
|
|
|
# Fill out a form
|
|
result = stagehand_tool.run(
|
|
instruction="Fill the contact form with name 'John Doe', email 'john@example.com', and message 'Hello world'",
|
|
url="https://example.com/contact"
|
|
)
|
|
```
|
|
|
|
### 2. Extract Command
|
|
|
|
The `extract` command type retrieves structured data from webpages.
|
|
|
|
```python
|
|
# Extract all product information
|
|
result = stagehand_tool.run(
|
|
instruction="Extract all product names, prices, and descriptions",
|
|
url="https://example.com/products",
|
|
command_type="extract"
|
|
)
|
|
|
|
# Extract specific information with a selector
|
|
result = stagehand_tool.run(
|
|
instruction="Extract the main article title and content",
|
|
url="https://example.com/blog/article",
|
|
command_type="extract",
|
|
selector=".article-container" # Optional CSS selector
|
|
)
|
|
```
|
|
|
|
### 3. Observe Command
|
|
|
|
The `observe` command type identifies and analyzes webpage elements.
|
|
|
|
```python
|
|
# Find interactive elements
|
|
result = stagehand_tool.run(
|
|
instruction="Find all interactive elements in the navigation menu",
|
|
url="https://example.com",
|
|
command_type="observe"
|
|
)
|
|
|
|
# Identify form fields
|
|
result = stagehand_tool.run(
|
|
instruction="Identify all the input fields in the registration form",
|
|
url="https://example.com/register",
|
|
command_type="observe",
|
|
selector="#registration-form"
|
|
)
|
|
```
|
|
|
|
## Configuration Options
|
|
|
|
Customize the StagehandTool behavior with these parameters:
|
|
|
|
```python
|
|
stagehand_tool = StagehandTool(
|
|
api_key="your-browserbase-api-key",
|
|
project_id="your-browserbase-project-id",
|
|
model_api_key="your-llm-api-key",
|
|
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
|
|
dom_settle_timeout_ms=5000, # Wait longer for DOM to settle
|
|
headless=True, # Run browser in headless mode
|
|
self_heal=True, # Attempt to recover from errors
|
|
wait_for_captcha_solves=True, # Wait for CAPTCHA solving
|
|
verbose=1, # Control logging verbosity (0-3)
|
|
)
|
|
```
|
|
|
|
## Best Practices
|
|
|
|
1. **Be Specific**: Provide detailed instructions for better results
|
|
2. **Choose Appropriate Command Type**: Select the right command type for your task
|
|
3. **Use Selectors**: Leverage CSS selectors to improve accuracy
|
|
4. **Break Down Complex Tasks**: Split complex workflows into multiple tool calls
|
|
5. **Implement Error Handling**: Add error handling for potential issues
|
|
|
|
## Troubleshooting
|
|
|
|
|
|
Common issues and solutions:
|
|
|
|
- **Session Issues**: Verify API keys for both Browserbase and LLM provider
|
|
- **Element Not Found**: Increase `dom_settle_timeout_ms` for slower pages
|
|
- **Action Failures**: Use `observe` to identify correct elements first
|
|
- **Incomplete Data**: Refine instructions or provide specific selectors
|
|
|
|
|
|
## Additional Resources
|
|
|
|
For questions about the CrewAI integration:
|
|
- Join Stagehand's [Slack community](https://stagehand.dev/slack)
|
|
- Open an issue in the [Stagehand repository](https://github.com/browserbase/stagehand)
|
|
- Visit [Stagehand documentation](https://docs.stagehand.dev/)
|