Compare commits

2 Commits

Author SHA1 Message Date
Devin AI
60a2b9842d docs: add required packages to SeleniumScrapingTool documentation
- Add selenium and webdriver-manager to installation instructions
- Add prerequisites and system requirements
- Add troubleshooting guidelines
- Add basic usage example with error handling
- Fixes #2153

Co-Authored-By: Joe Moura <joao@crewai.com>
2025-02-17 14:08:10 +00:00
Devin AI
b1860cbb12 docs: add required packages to SeleniumScrapingTool documentation
- Add selenium and webdriver-manager to installation instructions
- Fixes #2153

Co-Authored-By: Joe Moura <joao@crewai.com>
2025-02-17 14:03:55 +00:00
4 changed files with 42 additions and 59 deletions

View File

@@ -14,7 +14,6 @@ icon: bars-staggered
- **Sequential**: Executes tasks sequentially, ensuring tasks are completed in an orderly progression.
- **Hierarchical**: Organizes tasks in a managerial hierarchy, where tasks are delegated and executed based on a structured chain of command. A manager language model (`manager_llm`) or a custom manager agent (`manager_agent`) must be specified in the crew to enable the hierarchical process, facilitating the creation and management of tasks by the manager.
- **Parallel**: Enables concurrent execution of multiple flows, allowing transitions from one flow to multiple parallel flows for improved task parallelization. Parallel execution is automatically handled using asyncio for optimal performance.
- **Consensual Process (Planned)**: Aiming for collaborative decision-making among agents on task execution, this process type introduces a democratic approach to task management within CrewAI. It is planned for future development and is not currently implemented in the codebase.
## The Role of Processes in Teamwork
@@ -58,30 +57,9 @@ Emulates a corporate hierarchy, CrewAI allows specifying a custom manager agent
## Process Class: Detailed Overview
The `Process` class is implemented as an enumeration (`Enum`), ensuring type safety and restricting process values to the defined types (`sequential`, `hierarchical`, `parallel`). The consensual process is planned for future inclusion, emphasizing our commitment to continuous development and innovation.
## Parallel Process
The parallel process type enables concurrent execution of multiple flows, leveraging Python's asyncio for efficient task parallelization. When using parallel execution:
- Multiple start methods are executed concurrently
- Listeners can run in parallel when triggered by the same method
- State consistency is maintained through thread-safe operations
- Execution timing and order are preserved where necessary
Example of parallel flow execution:
```python
from crewai import Crew, Process

# Create a crew with parallel process
crew = Crew(
    agents=my_agents,
    tasks=my_tasks,
    process=Process.parallel
)
```
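The concurrency described in the list above can be illustrated with plain `asyncio`, independent of CrewAI. This is only a sketch: the `listener` coroutine and the `events` list are hypothetical names used for illustration, and nothing here shows how CrewAI schedules listeners internally.

```python
import asyncio

events = []  # records when each coroutine starts and finishes


async def listener(name: str) -> str:
    events.append(f"{name}:start")
    await asyncio.sleep(0.01)  # yields control so the other listener can start
    events.append(f"{name}:end")
    return name


async def main() -> list:
    # gather() runs both coroutines on the same event loop and
    # returns their results in the order the coroutines were passed in
    return await asyncio.gather(listener("parallel_1"), listener("parallel_2"))


results = asyncio.run(main())
print(results)
print(events)
```

Both listeners start before either one finishes, which is the interleaving the removed `test_parallel_flow` test checked through timestamps.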
The `Process` class is implemented as an enumeration (`Enum`), ensuring type safety and restricting process values to the defined types (`sequential`, `hierarchical`). The consensual process is planned for future inclusion, emphasizing our commitment to continuous development and innovation.
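To show what restricting values through a `str`-based `Enum` looks like in practice, here is a minimal stand-in that mirrors the two members named above. It is defined locally rather than imported from `crewai`, and `parse_process` is a hypothetical helper added only for this illustration.

```python
from enum import Enum


class Process(str, Enum):
    """Local stand-in mirroring the enum described above; not imported from crewai."""

    sequential = "sequential"
    hierarchical = "hierarchical"


def parse_process(value: str) -> Process:
    """Hypothetical helper: turn a raw string into a Process member or fail clearly."""
    try:
        return Process(value)
    except ValueError:
        allowed = ", ".join(p.value for p in Process)
        raise ValueError(
            f"Unknown process {value!r}; expected one of: {allowed}"
        ) from None


print(parse_process("sequential") is Process.sequential)
# Because the enum subclasses str, members compare equal to plain strings
print(Process.hierarchical == "hierarchical")
```

Any value outside the defined members, such as `"consensual"`, is rejected with a `ValueError` instead of passing through silently.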
## Conclusion
The structured collaboration facilitated by processes within CrewAI is crucial for enabling systematic teamwork among agents.
This documentation has been updated to reflect the latest features, enhancements, and the planned integration of the Consensual Process, ensuring users have access to the most current and comprehensive information.
This documentation has been updated to reflect the latest features, enhancements, and the planned integration of the Consensual Process, ensuring users have access to the most current and comprehensive information.

View File

@@ -17,12 +17,51 @@ The SeleniumScrapingTool is crafted for high-efficiency web scraping tasks.
It allows for precise extraction of content from web pages by using CSS selectors to target specific elements.
Its design caters to a wide range of scraping needs, offering flexibility to work with any provided website URL.
## Prerequisites
- Python 3.7 or higher
- Chrome browser installed (for ChromeDriver)
## Installation
To get started with the SeleniumScrapingTool, install the crewai_tools package using pip:
### Option 1: All-in-one installation
```shell
pip install 'crewai[tools]' 'selenium>=4.0.0' 'webdriver-manager>=3.8.0'
```
### Option 2: Step-by-step installation
```shell
pip install 'crewai[tools]'
pip install 'selenium>=4.0.0'
pip install 'webdriver-manager>=3.8.0'
```
### Common Installation Issues
1. If you encounter WebDriver issues, ensure your Chrome browser is up-to-date
2. For Linux users, you might need to install additional system packages:
```shell
sudo apt-get install chromium-chromedriver
```
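When troubleshooting, it can help to confirm which versions of the two extra packages are actually installed. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); `installed_version` is a hypothetical helper name, not part of crewai or crewai_tools.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as used in the pip commands above
for name in ("selenium", "webdriver-manager"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed'}")
```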
## Basic Usage
Here's a simple example to get you started with error handling:
```python
from crewai_tools import SeleniumScrapingTool

tool = None
try:
    # Initialize the tool with a specific website
    tool = SeleniumScrapingTool(website_url='https://example.com')
    # Extract content
    content = tool.run()
    print(content)
except Exception as e:
    print(f"Error during scraping: {e}")
    # Clean up only if the tool was created; otherwise `tool` is still None
    if tool is not None:
        tool.cleanup()
```
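If cleanup should run whether or not scraping succeeds, one option is a `try`/`finally` block. The sketch below uses a hypothetical `FakeScraper` stand-in instead of the real tool, so it runs without a browser; its `cleanup()` method simply mirrors the call in the example above and is an assumption, not a confirmed part of the tool's API.

```python
class FakeScraper:
    """Hypothetical stand-in for a browser-backed tool; not part of crewai_tools."""

    def __init__(self, website_url: str) -> None:
        self.website_url = website_url
        self.closed = False

    def run(self) -> str:
        # Simulate a scraping failure
        raise RuntimeError(f"could not load {self.website_url}")

    def cleanup(self) -> None:
        self.closed = True


def scrape(url: str):
    tool = None
    try:
        tool = FakeScraper(website_url=url)
        return tool, tool.run()
    except Exception as e:
        print(f"Error during scraping: {e}")
        return tool, None
    finally:
        # Runs on success and on failure, but only if the tool was created
        if tool is not None:
            tool.cleanup()


tool, content = scrape("https://example.com")
print(content, tool.closed)
```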
## Usage Examples

View File

@@ -8,5 +8,4 @@ class Process(str, Enum):
    sequential = "sequential"
    hierarchical = "hierarchical"
    parallel = "parallel"
    # TODO: consensual = 'consensual'

View File

@@ -1,7 +1,6 @@
"""Test Flow creation and execution basic functionality."""
import asyncio
import time
from datetime import datetime
import pytest
@@ -621,35 +620,3 @@ def test_stateless_flow_event_emission():
== "Deeds will not be less valiant because they are unpraised."
)
assert isinstance(event_log[5].timestamp, datetime)
def test_parallel_flow():
    """Test a flow where multiple listeners execute in parallel."""
    execution_order = []
    execution_times = {}

    class ParallelFlow(Flow):
        @start()
        def start_method(self):
            execution_order.append("start")
            return "start"

        @listen(start_method)
        async def parallel_1(self):
            await asyncio.sleep(0.1)
            execution_times["parallel_1"] = time.time()
            execution_order.append("parallel_1")

        @listen(start_method)
        async def parallel_2(self):
            await asyncio.sleep(0.1)
            execution_times["parallel_2"] = time.time()
            execution_order.append("parallel_2")

    flow = ParallelFlow()
    flow.kickoff()

    assert "start" in execution_order
    assert "parallel_1" in execution_order
    assert "parallel_2" in execution_order
    assert abs(execution_times["parallel_1"] - execution_times["parallel_2"]) < 0.05