Fix lint error: remove unused 're' import from test file

Co-Authored-By: João <joao@crewai.com>
Fix CLI documentation to reflect actual provider count and two-step process
2026-03-15 16:28:14 +00:00 · 2025-06-24 06:21:34 +00:00 · 2025-06-24 06:15:05 +00:00 · 2025-06-23 13:58:16 -04:00 · 2025-06-20 13:35:26 -04:00
5 changed files with 299 additions and 9 deletions
--- a/docs/concepts/cli.mdx
+++ b/docs/concepts/cli.mdx
@@ -285,25 +285,32 @@ Watch this video tutorial for a step-by-step demonstration of deploying your cre

 ### 11. API Keys

-When running ```crewai create crew``` command, the CLI will first show you the top 5 most common LLM providers and ask you to select one.
+When you run the ```crewai create crew``` command, the CLI shows a list of available LLM providers to choose from, followed by model selection for your chosen provider.

-Once you've selected an LLM provider, you will be prompted for API keys.
+Once you've selected an LLM provider and model, you will be prompted for API keys.

-#### Initial API key providers
+#### Available LLM Providers

-The CLI will initially prompt for API keys for the following services:
+The CLI will show you the following LLM providers to choose from:

 * OpenAI
-* Groq
 * Anthropic
 * Google Gemini
+* NVIDIA NIM
+* Groq
+* Hugging Face
+* Ollama
+* Watson
+* AWS Bedrock
+* Azure
+* Cerebras
 * SambaNova

-When you select a provider, the CLI will prompt you to enter your API key.
+When you select a provider, the CLI will then show you available models for that provider and prompt you to enter your API key.

 #### Other Options

-If you select option 6, you will be able to select from a list of LiteLLM supported providers.
+If you select "other", you can choose from a list of LiteLLM-supported providers.

 When you select a provider, the CLI will prompt you to enter the Key name and the API key.

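Note on the hunk above: the two-step flow it documents (pick a provider, pick a model for that provider, then enter an API key) can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual implementation; `MODELS_BY_PROVIDER`, `choose`, and `select_provider_and_model` are hypothetical names, and the model names are placeholders.

```python
from typing import Callable, Dict, List

# Hypothetical catalogue; the real CLI's provider and model lists may differ.
MODELS_BY_PROVIDER: Dict[str, List[str]] = {
    "openai": ["model-a", "model-b"],
    "anthropic": ["model-c"],
}


def choose(options: List[str], ask: Callable[[str], str], label: str) -> str:
    """Show a numbered menu and return the option the user picked."""
    menu = "\n".join(f"{i}. {name}" for i, name in enumerate(options, start=1))
    answer = ask(f"Select a {label}:\n{menu}\n> ")
    index = int(answer) - 1  # raises ValueError on non-numeric input
    if not 0 <= index < len(options):
        raise ValueError(f"{answer!r} is not a valid {label} choice")
    return options[index]


def select_provider_and_model(ask: Callable[[str], str] = input) -> Dict[str, str]:
    """Step 1: pick a provider. Step 2: pick a model. Then ask for the key."""
    provider = choose(list(MODELS_BY_PROVIDER), ask, "provider")
    model = choose(MODELS_BY_PROVIDER[provider], ask, "model")
    api_key = ask(f"Enter your API key for {provider}: ")
    return {"provider": provider, "model": model, "api_key": api_key}
```

Passing `ask` as a parameter keeps the sketch testable without a terminal; by default it falls back to the built-in `input`.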
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -134,7 +134,7 @@
                  "tools/web-scraping/stagehandtool",
                  "tools/web-scraping/firecrawlcrawlwebsitetool",
                  "tools/web-scraping/firecrawlscrapewebsitetool",
-                  "tools/web-scraping/firecrawlsearchtool"
+                  "tools/web-scraping/oxylabsscraperstool"
                ]
              },
              {
--- a/docs/tools/web-scraping/overview.mdx
+++ b/docs/tools/web-scraping/overview.mdx
@@ -56,6 +56,10 @@ These tools enable your agents to interact with the web, extract data from websi
  <Card title="Stagehand Tool" icon="hand" href="/tools/web-scraping/stagehandtool">
    Intelligent browser automation with natural language commands.
  </Card>
+
+  <Card title="Oxylabs Scraper Tool" icon="globe" href="/tools/web-scraping/oxylabsscraperstool">
+    Access web data at scale with Oxylabs.
+  </Card>
 </CardGroup>

 ## **Common Use Cases**
@@ -100,4 +104,4 @@ agent = Agent(
 - **JavaScript-Heavy Sites**: Use `SeleniumScrapingTool` for dynamic content
 - **Scale & Performance**: Use `FirecrawlScrapeWebsiteTool` for high-volume scraping
 - **Cloud Infrastructure**: Use `BrowserBaseLoadTool` for scalable browser automation
- **Complex Workflows**: Use `StagehandTool` for intelligent browser interactions 
+- **Complex Workflows**: Use `StagehandTool` for intelligent browser interactions
--- a/docs/tools/web-scraping/oxylabsscraperstool.mdx
+++ b/docs/tools/web-scraping/oxylabsscraperstool.mdx
@@ -0,0 +1,236 @@
+---
+title: Oxylabs Scrapers
+description: >
+  Oxylabs Scrapers make it easy to access information from the respective sources. See the list of available sources below:
+    - `Amazon Product`
+    - `Amazon Search`
+    - `Google Search`
+    - `Universal`
+icon: globe
+---
+
+## Installation
+
+Get the credentials by creating an Oxylabs Account [here](https://oxylabs.io).
+```shell
+pip install 'crewai[tools]' oxylabs
+```
+Check the [Oxylabs documentation](https://developers.oxylabs.io/scraping-solutions/web-scraper-api/targets) for more information about API parameters.
+
+# `OxylabsAmazonProductScraperTool`
+
+### Example
+
+```python
+from crewai_tools import OxylabsAmazonProductScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsAmazonProductScraperTool()
+
+result = tool.run(query="AAAAABBBBC")
+
+print(result)
+```
+
+### Parameters
+
+- `query` - 10-symbol ASIN code.
+- `domain` - domain localization for Amazon.
+- `geo_location` - the _Deliver to_ location.
+- `user_agent_type` - device type and browser.
+- `render` - enables JavaScript rendering when set to `html`.
+- `callback_url` - URL to your callback endpoint.
+- `context` - Additional advanced settings and controls for specialized requirements.
+- `parse` - returns parsed data when set to true.
+- `parsing_instructions` - define your own parsing and data transformation logic that will be executed on an HTML scraping result.
+
+### Advanced example
+
+```python
+from crewai_tools import OxylabsAmazonProductScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsAmazonProductScraperTool(
+    config={
+        "domain": "com",
+        "parse": True,
+        "context": [
+            {
+                "key": "autoselect_variant",
+                "value": True
+            }
+        ]
+    }
+)
+
+result = tool.run(query="AAAAABBBBC")
+
+print(result)
+```
+
+# `OxylabsAmazonSearchScraperTool`
+
+### Example
+
+```python
+from crewai_tools import OxylabsAmazonSearchScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsAmazonSearchScraperTool()
+
+result = tool.run(query="headsets")
+
+print(result)
+```
+
+### Parameters
+
+- `query` - Amazon search term.
+- `domain` - domain localization for Amazon.
+- `start_page` - starting page number.
+- `pages` - number of pages to retrieve.
+- `geo_location` - the _Deliver to_ location.
+- `user_agent_type` - device type and browser.
+- `render` - enables JavaScript rendering when set to `html`.
+- `callback_url` - URL to your callback endpoint.
+- `context` - Additional advanced settings and controls for specialized requirements.
+- `parse` - returns parsed data when set to true.
+- `parsing_instructions` - define your own parsing and data transformation logic that will be executed on an HTML scraping result.
+
+### Advanced example
+
+```python
+from crewai_tools import OxylabsAmazonSearchScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsAmazonSearchScraperTool(
+    config={
+        "domain": 'nl',
+        "start_page": 2,
+        "pages": 2,
+        "parse": True,
+        "context": [
+            {'key': 'category_id', 'value': 16391693031}
+        ],
+    }
+)
+
+result = tool.run(query='nirvana tshirt')
+
+print(result)
+```
+
+# `OxylabsGoogleSearchScraperTool`
+
+### Example
+
+```python
+from crewai_tools import OxylabsGoogleSearchScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsGoogleSearchScraperTool()
+
+result = tool.run(query="iPhone 16")
+
+print(result)
+```
+
+### Parameters
+
+- `query` - search keyword.
+- `domain` - domain localization for Google.
+- `start_page` - starting page number.
+- `pages` - number of pages to retrieve.
+- `limit` - number of results to retrieve in each page.
+- `locale` - `Accept-Language` header value which changes your Google search page web interface language.
+- `geo_location` - the geographical location that the result should be adapted for. Using this parameter correctly is extremely important to get the right data.
+- `user_agent_type` - device type and browser.
+- `render` - enables JavaScript rendering when set to `html`.
+- `callback_url` - URL to your callback endpoint.
+- `context` - Additional advanced settings and controls for specialized requirements.
+- `parse` - returns parsed data when set to true.
+- `parsing_instructions` - define your own parsing and data transformation logic that will be executed on an HTML scraping result.
+
+### Advanced example
+
+```python
+from crewai_tools import OxylabsGoogleSearchScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsGoogleSearchScraperTool(
+    config={
+        "parse": True,
+        "geo_location": "Paris, France",
+        "user_agent_type": "tablet",
+    }
+)
+
+result = tool.run(query="iPhone 16")
+
+print(result)
+```
+
+# `OxylabsUniversalScraperTool`
+
+### Example
+
+```python
+from crewai_tools import OxylabsUniversalScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsUniversalScraperTool()
+
+result = tool.run(url="https://ip.oxylabs.io")
+
+print(result)
+```
+
+### Parameters
+
+- `url` - website url to scrape.
+- `user_agent_type` - device type and browser.
+- `geo_location` - sets the proxy's geolocation to retrieve data.
+- `render` - enables JavaScript rendering when set to `html`.
+- `callback_url` - URL to your callback endpoint.
+- `context` - Additional advanced settings and controls for specialized requirements.
+- `parse` - returns parsed data when set to `true`, as long as a dedicated parser exists for the submitted URL's page type.
+- `parsing_instructions` - define your own parsing and data transformation logic that will be executed on an HTML scraping result.
+
+
+### Advanced example
+
+```python
+from crewai_tools import OxylabsUniversalScraperTool
+
+# make sure OXYLABS_USERNAME and OXYLABS_PASSWORD variables are set
+tool = OxylabsUniversalScraperTool(
+    config={
+        "render": "html",
+        "user_agent_type": "mobile",
+        "context": [
+            {"key": "force_headers", "value": True},
+            {"key": "force_cookies", "value": True},
+            {
+                "key": "headers",
+                "value": {
+                    "Custom-Header-Name": "custom header content",
+                },
+            },
+            {
+                "key": "cookies",
+                "value": [
+                    {"key": "NID", "value": "1234567890"},
+                    {"key": "1P JAR", "value": "0987654321"},
+                ],
+            },
+            {"key": "http_method", "value": "get"},
+            {"key": "follow_redirects", "value": True},
+            {"key": "successful_status_codes", "value": [808, 909]},
+        ],
+    }
+)
+
+result = tool.run(url="https://ip.oxylabs.io")
+
+print(result)
+```
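Note on the new file above: every advanced example passes `context` as a list of `{"key": ..., "value": ...}` entries inside `config`. A small helper along these lines could build that shape from a plain dict. This is only a sketch; `to_context` and `build_config` are hypothetical helpers that are not part of `crewai_tools` or the Oxylabs SDK, and the code does not call either library.

```python
from typing import Any, Dict, List


def to_context(settings: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn {"name": value} pairs into a [{"key": ..., "value": ...}] list."""
    return [{"key": key, "value": value} for key, value in settings.items()]


def build_config(context: Dict[str, Any], **params: Any) -> Dict[str, Any]:
    """Assemble a config dict: top-level parameters plus the context list."""
    config = dict(params)
    if context:  # omit the "context" entry entirely when nothing is set
        config["context"] = to_context(context)
    return config


# Reproduces the config from the Amazon product advanced example.
config = build_config({"autoselect_variant": True}, domain="com", parse=True)
print(config)
```

The resulting dict could then be passed as `config=` when constructing one of the tools, as the advanced examples do.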
--- a/tests/test_cli_documentation_sync.py
+++ b/tests/test_cli_documentation_sync.py
@@ -0,0 +1,43 @@
+from pathlib import Path
+
+from crewai.cli.constants import PROVIDERS
+
+
+def test_cli_documentation_matches_providers():
+    """Test that CLI documentation accurately reflects the available providers."""
+    docs_path = Path(__file__).parent.parent / "docs" / "concepts" / "cli.mdx"
+    with open(docs_path, 'r', encoding='utf-8') as f:
+        docs_content = f.read()
+    
+    assert "top 5" not in docs_content.lower(), "Documentation should not mention 'top 5' providers"
+    assert "5 most common" not in docs_content.lower(), "Documentation should not mention '5 most common' providers"
+    
+    assert "list of available LLM providers" in docs_content or "following LLM providers" in docs_content, \
+        "Documentation should mention the availability of multiple LLM providers"
+    
+    assert len(PROVIDERS) > 5, f"Expected more than 5 providers, but found {len(PROVIDERS)}"
+    
+    key_providers = ["OpenAI", "Anthropic", "Gemini"]
+    for provider in key_providers:
+        assert provider in docs_content, f"Key provider {provider} should be mentioned in documentation"
+
+
+def test_providers_list_matches_constants():
+    """Test that the actual PROVIDERS list has the expected providers."""
+    expected_providers = [
+        "openai",
+        "anthropic", 
+        "gemini",
+        "nvidia_nim",
+        "groq",
+        "huggingface",
+        "ollama",
+        "watson",
+        "bedrock",
+        "azure",
+        "cerebras",
+        "sambanova",
+    ]
+    
+    assert PROVIDERS == expected_providers, f"PROVIDERS list has changed. Expected {expected_providers}, got {PROVIDERS}"
+    assert len(PROVIDERS) == 12, f"Expected 12 providers, but found {len(PROVIDERS)}"
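Note on the test file above: `test_cli_documentation_matches_providers` only checks three key providers by name. If fuller coverage is wanted, one possible extension is to map each provider identifier to the name used in `cli.mdx` and report any that are missing. This is a sketch under assumptions: `DISPLAY_NAMES` is inferred from the documented list and is not defined in `crewai.cli.constants`, and `missing_providers` is a hypothetical helper.

```python
from typing import Dict, List

# Assumed mapping from provider identifiers to the names listed in cli.mdx.
DISPLAY_NAMES: Dict[str, str] = {
    "openai": "OpenAI",
    "anthropic": "Anthropic",
    "gemini": "Google Gemini",
    "nvidia_nim": "NVIDIA NIM",
    "groq": "Groq",
    "huggingface": "Hugging Face",
    "ollama": "Ollama",
    "watson": "Watson",
    "bedrock": "AWS Bedrock",
    "azure": "Azure",
    "cerebras": "Cerebras",
    "sambanova": "SambaNova",
}


def missing_providers(providers: List[str], docs_content: str) -> List[str]:
    """Return provider ids whose display name does not appear in the docs."""
    return [
        provider
        for provider in providers
        # Fall back to the raw identifier for providers without a mapping.
        if DISPLAY_NAMES.get(provider, provider) not in docs_content
    ]
```

A test could then assert `missing_providers(PROVIDERS, docs_content) == []`, so the failure message names exactly which providers are undocumented.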
Author	SHA1	Message	Date
Devin AI	f28e3e0be8	Fix lint error: remove unused 're' import from test file Co-Authored-By: João <joao@crewai.com>	2025-06-24 06:21:34 +00:00
Devin AI	10a55bd210	Fix CLI documentation to reflect actual provider count and two-step process - Update docs/concepts/cli.mdx to remove outdated 'top 5 most common LLM providers' reference - Replace with accurate description of 12 available providers plus 'other' option - Document the two-step process: select provider, then select model - Add comprehensive test to prevent documentation drift in the future - Test validates that docs stay in sync with actual CLI implementation Fixes #3054 Co-Authored-By: João <joao@crewai.com>	2025-06-24 06:15:05 +00:00
Rostyslav Borovyk	c96d4a6823	Add Oxylabs Web Scraping tools (#2905) * Add Oxylabs tools * Review updates * Review updates --------- Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>	2025-06-23 13:58:16 -04:00
Lucas Gomide	59032817c7	docs: update recommendation filters for MCP and Enterprise tools (#3041)	2025-06-20 13:35:26 -04:00