docs: add StagehandTool documentation and improve MDX structure (#2842)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled

This commit is contained in:
Tony Kipkemboi
2025-05-15 12:24:25 -04:00
committed by GitHub
parent 49bbf3f234
commit 0b35e40a24
9 changed files with 245 additions and 14 deletions

View File

@@ -129,6 +129,7 @@
"tools/seleniumscrapingtool",
"tools/snowflakesearchtool",
"tools/spidertool",
"tools/stagehandtool",
"tools/txtsearchtool",
"tools/visiontool",
"tools/weaviatevectorsearchtool",

View File

@@ -4,8 +4,6 @@ description: Dive deeper into low-level prompt customization for CrewAI, enablin
icon: message-pen
---
# Customizing Prompts at a Low Level
## Why Customize Prompts?
Although CrewAI's default prompts work well for many scenarios, low-level customization opens the door to significantly more flexible and powerful agent behavior. Heres why you might want to take advantage of this deeper control:

View File

@@ -4,8 +4,6 @@ description: Learn how to use CrewAI's fingerprinting system to uniquely identif
icon: fingerprint
---
# Fingerprinting in CrewAI
## Overview
Fingerprints in CrewAI provide a way to uniquely identify and track components throughout their lifecycle. Each `Agent`, `Crew`, and `Task` automatically receives a unique fingerprint when created, which cannot be manually overridden.

View File

@@ -4,8 +4,6 @@ description: Learn best practices for designing powerful, specialized AI agents
icon: robot
---
# Crafting Effective Agents
## The Art and Science of Agent Design
At the heart of CrewAI lies the agent - a specialized AI entity designed to perform specific roles within a collaborative framework. While creating basic agents is simple, crafting truly effective agents that produce exceptional results requires understanding key design principles and best practices.

View File

@@ -4,8 +4,6 @@ description: Learn how to assess your AI application needs and choose the right
icon: scale-balanced
---
# Evaluating Use Cases for CrewAI
## Understanding the Decision Framework
When building AI applications with CrewAI, one of the most important decisions you'll make is choosing the right approach for your specific use case. Should you use a Crew? A Flow? A combination of both? This guide will help you evaluate your requirements and make informed architectural decisions.

View File

@@ -4,8 +4,6 @@ description: Step-by-step tutorial to create a collaborative AI team that works
icon: users-gear
---
# Build Your First Crew
## Unleashing the Power of Collaborative AI
Imagine having a team of specialized AI agents working together seamlessly to solve complex problems, each contributing their unique skills to achieve a common goal. This is the power of CrewAI - a framework that enables you to create collaborative AI systems that can accomplish tasks far beyond what a single AI could achieve alone.

View File

@@ -4,8 +4,6 @@ description: Learn how to create structured, event-driven workflows with precise
icon: diagram-project
---
# Build Your First Flow
## Taking Control of AI Workflows with Flows
CrewAI Flows represent the next level in AI orchestration - combining the collaborative power of AI agent crews with the precision and flexibility of procedural programming. While crews excel at agent collaboration, flows give you fine-grained control over exactly how and when different components of your AI system interact.

View File

@@ -4,8 +4,6 @@ description: A comprehensive guide to managing, persisting, and leveraging state
icon: diagram-project
---
# Mastering Flow State Management
## Understanding the Power of State in Flows
State management is the backbone of any sophisticated AI workflow. In CrewAI Flows, the state system allows you to maintain context, share data between steps, and build complex application logic. Mastering state management is essential for creating reliable, maintainable, and powerful AI applications.

View File

@@ -0,0 +1,244 @@
---
title: Stagehand Tool
description: Web automation tool that integrates Stagehand with CrewAI for browser interaction and automation
icon: hand
---
# Overview
The `StagehandTool` integrates the [Stagehand](https://docs.stagehand.dev/get_started/introduction) framework with CrewAI, enabling agents to interact with websites and automate browser tasks using natural language instructions.
## Overview
Stagehand is a powerful browser automation framework built by Browserbase that allows AI agents to:
- Navigate to websites
- Click buttons, links, and other elements
- Fill in forms
- Extract data from web pages
- Observe and identify elements
- Perform complex workflows
The StagehandTool wraps the Stagehand Python SDK to provide CrewAI agents with browser control capabilities through three core primitives:
1. **Act**: Perform actions like clicking, typing, or navigating
2. **Extract**: Extract structured data from web pages
3. **Observe**: Identify and analyze elements on the page
## Prerequisites
Before using this tool, ensure you have:
1. A [Browserbase](https://www.browserbase.com/) account with API key and project ID
2. An API key for an LLM (OpenAI or Anthropic Claude)
3. The Stagehand Python SDK installed
Install the required dependency:
```bash
pip install stagehand-py
```
## Usage
### Basic Implementation
The StagehandTool can be implemented in two ways:
#### 1. Using Context Manager (Recommended)
<Tip>
The context manager approach is recommended as it ensures proper cleanup of resources even if exceptions occur.
</Tip>
```python
from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel
# Initialize the tool with your API keys using a context manager
with StagehandTool(
api_key="your-browserbase-api-key",
project_id="your-browserbase-project-id",
model_api_key="your-llm-api-key", # OpenAI or Anthropic API key
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST, # Optional: specify which model to use
) as stagehand_tool:
# Create an agent with the tool
researcher = Agent(
role="Web Researcher",
goal="Find and summarize information from websites",
backstory="I'm an expert at finding information online.",
verbose=True,
tools=[stagehand_tool],
)
# Create a task that uses the tool
research_task = Task(
description="Go to https://www.example.com and tell me what you see on the homepage.",
agent=researcher,
)
# Run the crew
crew = Crew(
agents=[researcher],
tasks=[research_task],
verbose=True,
)
result = crew.kickoff()
print(result)
```
#### 2. Manual Resource Management
```python
from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel
# Initialize the tool with your API keys
stagehand_tool = StagehandTool(
api_key="your-browserbase-api-key",
project_id="your-browserbase-project-id",
model_api_key="your-llm-api-key",
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
)
try:
# Create an agent with the tool
researcher = Agent(
role="Web Researcher",
goal="Find and summarize information from websites",
backstory="I'm an expert at finding information online.",
verbose=True,
tools=[stagehand_tool],
)
# Create a task that uses the tool
research_task = Task(
description="Go to https://www.example.com and tell me what you see on the homepage.",
agent=researcher,
)
# Run the crew
crew = Crew(
agents=[researcher],
tasks=[research_task],
verbose=True,
)
result = crew.kickoff()
print(result)
finally:
# Explicitly clean up resources
stagehand_tool.close()
```
## Command Types
The StagehandTool supports three different command types for specific web automation tasks:
### 1. Act Command
The `act` command type (default) enables webpage interactions like clicking buttons, filling forms, and navigation.
```python
# Perform an action (default behavior)
result = stagehand_tool.run(
instruction="Click the login button",
url="https://example.com",
command_type="act" # Default, so can be omitted
)
# Fill out a form
result = stagehand_tool.run(
instruction="Fill the contact form with name 'John Doe', email 'john@example.com', and message 'Hello world'",
url="https://example.com/contact"
)
```
### 2. Extract Command
The `extract` command type retrieves structured data from webpages.
```python
# Extract all product information
result = stagehand_tool.run(
instruction="Extract all product names, prices, and descriptions",
url="https://example.com/products",
command_type="extract"
)
# Extract specific information with a selector
result = stagehand_tool.run(
instruction="Extract the main article title and content",
url="https://example.com/blog/article",
command_type="extract",
selector=".article-container" # Optional CSS selector
)
```
### 3. Observe Command
The `observe` command type identifies and analyzes webpage elements.
```python
# Find interactive elements
result = stagehand_tool.run(
instruction="Find all interactive elements in the navigation menu",
url="https://example.com",
command_type="observe"
)
# Identify form fields
result = stagehand_tool.run(
instruction="Identify all the input fields in the registration form",
url="https://example.com/register",
command_type="observe",
selector="#registration-form"
)
```
## Configuration Options
Customize the StagehandTool behavior with these parameters:
```python
stagehand_tool = StagehandTool(
api_key="your-browserbase-api-key",
project_id="your-browserbase-project-id",
model_api_key="your-llm-api-key",
model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
dom_settle_timeout_ms=5000, # Wait longer for DOM to settle
headless=True, # Run browser in headless mode
self_heal=True, # Attempt to recover from errors
wait_for_captcha_solves=True, # Wait for CAPTCHA solving
verbose=1, # Control logging verbosity (0-3)
)
```
## Best Practices
1. **Be Specific**: Provide detailed instructions for better results
2. **Choose Appropriate Command Type**: Select the right command type for your task
3. **Use Selectors**: Leverage CSS selectors to improve accuracy
4. **Break Down Complex Tasks**: Split complex workflows into multiple tool calls
5. **Implement Error Handling**: Add error handling for potential issues
## Troubleshooting
Common issues and solutions:
- **Session Issues**: Verify API keys for both Browserbase and LLM provider
- **Element Not Found**: Increase `dom_settle_timeout_ms` for slower pages
- **Action Failures**: Use `observe` to identify correct elements first
- **Incomplete Data**: Refine instructions or provide specific selectors
## Additional Resources
For questions about the CrewAI integration:
- Join Stagehand's [Slack community](https://stagehand.dev/slack)
- Open an issue in the [Stagehand repository](https://github.com/browserbase/stagehand)
- Visit [Stagehand documentation](https://docs.stagehand.dev/)