Compare commits

...

33 Commits

Author SHA1 Message Date
Devin AI
0c862fd5e1 Trigger CI re-run
Co-Authored-By: João <joao@crewai.com>
2025-11-13 12:53:07 +00:00
Devin AI
6442a88856 Fix Azure structured outputs to use json_object instead of json_schema
- Azure AI Inference SDK doesn't support json_schema response_format
- Changed to use json_object format with client-side Pydantic validation
- Added comprehensive tests for structured outputs with Azure models
- Fixes issue #3906
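A minimal sketch of the client-side validation pattern this change describes, assuming Pydantic v2; the `ResearchSummary` model and the sample JSON are illustrative, not part of the change:

```python
from pydantic import BaseModel, ValidationError

class ResearchSummary(BaseModel):
    title: str
    key_points: list[str]

# Stands in for the text returned by an Azure model constrained to
# response_format={"type": "json_object"}.
raw_json = '{"title": "Qubit scaling", "key_points": ["error correction", "hardware roadmaps"]}'

try:
    summary = ResearchSummary.model_validate_json(raw_json)
except ValidationError as exc:
    print(f"Structured output failed validation: {exc}")
```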

Co-Authored-By: João <joao@crewai.com>
2025-11-13 12:45:55 +00:00
Greyson LaLonde
ffd717c51a fix: custom tool docs links, add mintlify broken links action (#3903)
* fix: update docs links to point to correct endpoints

* fix: update all broken doc links
2025-11-12 22:55:10 -08:00
Heitor Carvalho
fbe4aa4bd1 feat: fetch and store more data about okta authorization server (#3894)
2025-11-12 15:28:00 -03:00
Lorenze Jay
c205d2e8de feat: implement before and after LLM call hooks in CrewAgentExecutor (#3893)
- Added support for before and after LLM call hooks to allow modification of messages and responses during LLM interactions.
- Introduced LLMCallHookContext to provide hooks with access to the executor state, enabling in-place modifications of messages.
- Updated get_llm_response function to utilize the new hooks, ensuring that modifications persist across iterations.
- Enhanced tests to verify the functionality of the hooks and their error handling capabilities, ensuring robust execution flow.
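A hypothetical sketch of the hook pattern described above; only `LLMCallHookContext` is named by this change, and the field and hook names below are illustrative assumptions rather than the actual crewai API:

```python
from dataclasses import dataclass, field

@dataclass
class LLMCallHookContext:
    # Assumption for illustration: the context exposes the executor's mutable message list.
    messages: list[dict] = field(default_factory=list)

def before_llm_call(context: LLMCallHookContext) -> None:
    # Mutate the messages in place so the change persists across iterations.
    context.messages.append({"role": "system", "content": "Keep answers under 100 words."})
```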
2025-11-12 08:38:13 -08:00
Daniel Barreto
fcb5b19b2e Enhance schema description of QdrantVectorSearchTool (#3891)
2025-11-11 14:33:33 -08:00
Rip&Tear
01f0111d52 dependabot.yml creation (#3868)
* dependabot.yml creation

* Configure dependabot for pip package updates

Co-authored-by: matt <matt@crewai.com>

* Fix Dependabot package ecosystem

* Refactor: Use uv package-ecosystem in dependabot

Co-authored-by: matt <matt@crewai.com>

* fix: ensure dependabot uses uv ecosystem

---------

Co-authored-by: Greyson LaLonde <greyson.r.lalonde@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: matt <matt@crewai.com>
2025-11-11 12:14:16 +08:00
Lorenze Jay
6b52587c67 feat: expose messages to TaskOutput and LiteAgentOutputs (#3880)
* feat: add messages to task and agent outputs

- Introduced a new `messages` field in `TaskOutput` and `LiteAgentOutput` to capture messages from the last task execution.
- Updated the  class to store the last messages and provide a property for easy access.
- Enhanced the  and  classes to include messages in their outputs.
- Added tests to ensure that messages are correctly included in task outputs and agent outputs during execution.

* using typing_extensions for 3.10 compatibility

* feat: add last_messages attribute to agent for improved task tracking

- Introduced a new `last_messages` attribute in the agent class to store messages from the last task execution.
- Updated the `Crew` class to handle the new messages attribute in task outputs.
- Enhanced existing tests to ensure that the `last_messages` attribute is correctly initialized and utilized across various guardrail scenarios.

* fix: add messages field to TaskOutput in tests for consistency

- Updated multiple test cases to include the new `messages` field in the `TaskOutput` instances.
- Ensured that all relevant tests reflect the latest changes in the TaskOutput structure, maintaining consistency across the test suite.
- This change aligns with the recent addition of the `last_messages` attribute in the agent class for improved task tracking.

* feat: preserve messages in task outputs during replay

- Added functionality to the Crew class to store and retrieve messages in task outputs.
- Enhanced the replay mechanism to ensure that messages from stored task outputs are preserved and accessible.
- Introduced a new test case to verify that messages are correctly stored and replayed, ensuring consistency in task execution and output handling.
- This change improves the overall tracking and context retention of task interactions within the CrewAI framework.

* fix original test, prev was debugging
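A brief sketch of reading the exposed messages after a run; the agent and task definitions are illustrative, and access via `tasks_output` assumes the usual `CrewOutput` shape:

```python
from crewai import Agent, Crew, Task

writer = Agent(
    role="Writer",
    goal="Write short summaries",
    backstory="A concise technical writer",
)
summary_task = Task(
    description="Summarize the release notes",
    expected_output="A two-sentence summary",
    agent=writer,
)
crew = Crew(agents=[writer], tasks=[summary_task])
result = crew.kickoff()

# Each TaskOutput now carries the messages from its last execution.
for message in result.tasks_output[0].messages:
    print(message)
```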
2025-11-10 17:38:30 -08:00
Lorenze Jay
629f7f34ce docs: enhance task guardrail documentation with LLM-based validation support (#3879)
- Added section on LLM-based guardrails, explaining their usage and requirements.
- Updated examples to demonstrate the implementation of multiple guardrails, including both function-based and LLM-based approaches.
- Clarified the distinction between single and multiple guardrails in task configurations.
- Improved explanations of guardrail functionality to ensure better understanding of validation processes.
2025-11-10 15:35:42 -08:00
Lorenze Jay
0f1c173d02 feat: bump versions to 1.4.1 (#3862)
* feat: bump versions to 1.4.1

* chore: update crewAI tools dependency to version 1.4.1 in project templates
2025-11-07 11:19:07 -08:00
Greyson LaLonde
19c5b9a35e fix: properly handle agent max iterations
fixes #3847
2025-11-07 13:54:11 -05:00
Greyson LaLonde
1ed307b58c fix: route llm model syntax to litellm
* fix: route llm model syntax to litellm

* wip: add list of supported models
2025-11-07 13:34:15 -05:00
Lorenze Jay
d29867bbb6 chore: update version numbers to 1.4.0
2025-11-06 23:04:44 -05:00
Lorenze Jay
b2c278ed22 refactor: improve MCP tool execution handling with concurrent futures (#3854)
- Enhanced the MCP tool execution in both synchronous and asynchronous contexts by utilizing `concurrent.futures` for better event loop management.
- Updated error handling to provide clearer messages for connection issues and task cancellations.
- Added tests to validate MCP tool execution in both sync and async scenarios, ensuring robust functionality across different contexts.
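A general-purpose sketch of this pattern using only the standard library (not the crewai implementation): run an async tool call from synchronous code even when an event loop is already running:

```python
import asyncio
import concurrent.futures

async def call_tool() -> str:
    # Placeholder for an async MCP tool invocation.
    await asyncio.sleep(0.1)
    return "tool result"

def run_tool_sync() -> str:
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop running in this thread: drive the coroutine directly.
        return asyncio.run(call_tool())
    # A loop is already running: hand the coroutine to a worker thread's own loop.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, call_tool()).result()
```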
2025-11-06 19:28:08 -08:00
Greyson LaLonde
f6aed9798b feat: allow non-ast plot routes 2025-11-06 21:17:29 -05:00
Greyson LaLonde
40a2d387a1 fix: keep stopwords updated 2025-11-06 21:10:25 -05:00
Lorenze Jay
6f36d7003b Lorenze/feat mcp first class support (#3850)
* WIP transport support mcp

* refactor: streamline MCP tool loading and error handling

* linted

* Self type from typing with typing_extensions in MCP transport modules

* added tests for mcp setup

* added tests for mcp setup

* docs: enhance MCP overview with detailed integration examples and structured configurations

* feat: implement MCP event handling and logging in event listener and client

- Added MCP event types and handlers for connection and tool execution events.
- Enhanced MCPClient to emit events on connection status and tool execution.
- Updated ConsoleFormatter to handle MCP event logging.
- Introduced new MCP event types for better integration and monitoring.
2025-11-06 17:45:16 -08:00
Greyson LaLonde
9e5906c52f feat: add pydantic validation dunder to BaseInterceptor
2025-11-06 15:27:07 -05:00
Lorenze Jay
fc521839e4 Lorenze/fix duplicating doc ids for knowledge (#3840)
* fix: update document ID handling in ChromaDB utility functions to use SHA-256 hashing and include index for uniqueness

* test: add tests for hash-based ID generation in ChromaDB utility functions

* drop idx for preventing dups, upsert should handle dups

* fix: update document ID extraction logic in ChromaDB utility functions to check for doc_id at the top level of the document

* fix: enhance document ID generation in ChromaDB utility functions to deduplicate documents and ensure unique hash-based IDs without suffixes

* fix: improve error handling and document ID generation in ChromaDB utility functions to ensure robust processing and uniqueness
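A minimal illustration of the hash-based ID scheme described above, standard library only (not the crewai ChromaDB utilities): derive a stable SHA-256 ID from each document's content and collapse exact duplicates before upserting:

```python
import hashlib

def unique_ids(documents: list[str]) -> dict[str, str]:
    """Map a stable content hash to each distinct document."""
    ids: dict[str, str] = {}
    for doc in documents:
        doc_id = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        ids.setdefault(doc_id, doc)  # exact duplicates collapse to the same ID
    return ids
```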
2025-11-06 10:59:52 -08:00
Greyson LaLonde
e4cc9a664c fix: handle unpickleable values in flow state
2025-11-06 01:29:21 -05:00
Greyson LaLonde
7e6171d5bc fix: ensure lite agents course-correct on validation errors
* fix: ensure lite agents course-correct on validation errors

* chore: update cassettes and test expectations

* fix: ensure multiple guardrails propagate
2025-11-05 19:02:11 -05:00
Greyson LaLonde
61ad1fb112 feat: add support for llm message interceptor hooks 2025-11-05 11:38:44 -05:00
Greyson LaLonde
54710a8711 fix: hash callback args correctly to ensure caching works
2025-11-05 07:19:09 -05:00
Lucas Gomide
5abf976373 fix: allow adding RAG source content from valid URLs (#3831)
2025-11-04 07:58:40 -05:00
Greyson LaLonde
329567153b fix: make plot node selection smoother
2025-11-03 07:49:31 -05:00
Greyson LaLonde
60332e0b19 feat: cache i18n prompts for efficient use 2025-11-03 07:39:05 -05:00
Lorenze Jay
40932af3fa feat: bump versions to 1.3.0 (#3820)
* feat: bump versions to 1.3.0

* chore: update crew and flow templates to use crewai[tools] version 1.3.0
2025-10-31 18:54:02 -07:00
Greyson LaLonde
e134e5305b Gl/feat/a2a refactor (#3793)
* feat: agent metaclass, refactor a2a to wrappers

* feat: a2a schemas and utils

* chore: move agent class, update imports

* refactor: organize imports to avoid circularity, add a2a to console

* feat: pass response_model through call chain

* feat: add standard openapi spec serialization to tools and structured output

* feat: a2a events

* chore: add a2a to pyproject

* docs: minimal base for learn docs

* fix: adjust a2a conversation flow, allow llm to decide exit until max_retries

* fix: inject agent skills into initial prompt

* fix: format agent card as json in prompt

* refactor: simplify A2A agent prompt formatting and improve skill display

* chore: wide cleanup

* chore: cleanup logic, add auth cache, use json for messages in prompt

* chore: update docs

* fix: doc snippets formatting

* feat: optimize A2A agent card fetching and improve error reporting

* chore: move imports to top of file

* chore: refactor hasattr check

* chore: add httpx-auth, update lockfile

* feat: create base public api

* chore: cleanup modules, add docstrings, types

* fix: exclude extra fields in prompt

* chore: update docs

* tests: update to correct import

* chore: lint for ruff, add missing import

* fix: tweak openai streaming logic for response model

* tests: add reimport for test

* tests: add reimport for test

* fix: don't set a2a attr if not set

* fix: don't set a2a attr if not set

* chore: update cassettes

* tests: fix tests

* fix: use instructor and dont pass response_format for litellm

* chore: consolidate event listeners, add typing

* fix: address race condition in test, update cassettes

* tests: add correct mocks, rerun cassette for json

* tests: update cassette

* chore: regenerate cassette after new run

* fix: make token manager access-safe

* fix: make token manager access-safe

* merge

* chore: update test and cassette for output pydantic

* fix: tweak to disallow deadlock

* chore: linter

* fix: adjust event ordering for threading

* fix: use conditional for batch check

* tests: tweak for emission

* tests: simplify api + event check

* fix: ensure non-function calling llms see json formatted string

* tests: tweak message comparison

* fix: use internal instructor for litellm structure responses

---------

Co-authored-by: Mike Plachta <mike@crewai.com>
2025-10-31 18:42:03 -07:00
Greyson LaLonde
e229ef4e19 refactor: improve flow handling, typing, and logging; update UI and tests
fix: refine nested flow conditionals and ensure router methods and routes are fully parsed
fix: improve docstrings, typing, and logging coverage across all events
feat: update flow.plot feature with new UI enhancements
chore: apply Ruff linting, reorganize imports, and remove deprecated utilities/files
chore: split constants and utils, clean JS comments, and add typing for linters
tests: strengthen test coverage for flow execution paths and router logic
2025-10-31 21:15:06 -04:00
Greyson LaLonde
2e9eb8c32d fix: refactor use_stop_words to property, add check for stop words
2025-10-29 19:14:01 +01:00
Lucas Gomide
4ebb5114ed Fix Firecrawl tools & adding tests (#3810)
* fix: fix Firecrawl Scrape tool

* fix: fix Firecrawl Search tool

* fix: fix Firecrawl Website tool

* tests: adding tests for Firecrawl
2025-10-29 13:37:57 -04:00
Daniel Barreto
70b083945f Enhance QdrantVectorSearchTool (#3806)
2025-10-28 13:42:40 -04:00
Tony Kipkemboi
410db1ff39 docs: migrate embedder→embedding_model and require vectordb across tool docs; add provider examples (en/ko/pt-BR) (#3804)
* docs(tools): migrate embedder->embedding_model, require vectordb; add Chroma/Qdrant examples across en/ko/pt-BR PDF/TXT/XML/MDX/DOCX/CSV/Directory docs

* docs(observability): apply latest Datadog tweaks in ko and pt-BR
2025-10-27 13:29:21 -04:00
272 changed files with 39768 additions and 18949 deletions

.github/dependabot.yml (new file)

@@ -0,0 +1,11 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
version: 2
updates:
  - package-ecosystem: uv # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"

.github/workflows/docs-broken-links.yml (new file)

@@ -0,0 +1,35 @@
name: Check Documentation Broken Links
on:
  pull_request:
    paths:
      - "docs/**"
      - "docs.json"
  push:
    branches:
      - main
    paths:
      - "docs/**"
      - "docs.json"
  workflow_dispatch:
jobs:
  check-links:
    name: Check broken links
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: "latest"
      - name: Install Mintlify CLI
        run: npm i -g mintlify
      - name: Run broken link checker
        run: |
          # Auto-answer the prompt with yes command
          yes "" | mintlify broken-links || test $? -eq 141
        working-directory: ./docs


@@ -19,6 +19,7 @@ repos:
language: system
pass_filenames: true
types: [python]
exclude: ^(lib/crewai/src/crewai/cli/templates/|lib/crewai/tests/|lib/crewai-tools/tests/)
- repo: https://github.com/astral-sh/uv-pre-commit
rev: 0.9.3
hooks:


@@ -739,7 +739,7 @@ class KnowledgeMonitorListener(BaseEventListener):
knowledge_monitor = KnowledgeMonitorListener()
```
For more information on using events, see the [Event Listeners](https://docs.crewai.com/concepts/event-listener) documentation.
For more information on using events, see the [Event Listeners](/en/concepts/event-listener) documentation.
### Custom Knowledge Sources


@@ -1035,7 +1035,7 @@ CrewAI supports streaming responses from LLMs, allowing your application to rece
```
<Tip>
[Click here](https://docs.crewai.com/concepts/event-listener#event-listeners) for more details
[Click here](/en/concepts/event-listener#event-listeners) for more details
</Tip>
</Tab>
@@ -1200,6 +1200,52 @@ Learn how to get the most out of your LLM configuration:
)
```
</Accordion>
<Accordion title="Transport Interceptors">
CrewAI provides message interceptors for several providers, allowing you to hook into request/response cycles at the transport layer.
**Supported Providers:**
- ✅ OpenAI
- ✅ Anthropic
**Basic Usage:**
```python
import httpx
from crewai import LLM
from crewai.llms.hooks import BaseInterceptor
class CustomInterceptor(BaseInterceptor[httpx.Request, httpx.Response]):
"""Custom interceptor to modify requests and responses."""
def on_outbound(self, request: httpx.Request) -> httpx.Request:
"""Print request before sending to the LLM provider."""
print(request)
return request
def on_inbound(self, response: httpx.Response) -> httpx.Response:
"""Process response after receiving from the LLM provider."""
print(f"Status: {response.status_code}")
print(f"Response time: {response.elapsed}")
return response
# Use the interceptor with an LLM
llm = LLM(
model="openai/gpt-4o",
interceptor=CustomInterceptor()
)
```
**Important Notes:**
- Both methods must return the received object or another object of the same type.
- Modifying received objects may result in unexpected behavior or application crashes.
- Not all providers support interceptors; check the supported providers list above.
<Info>
Interceptors operate at the transport layer. This is particularly useful for:
- Message transformation and filtering
- Debugging API interactions
</Info>
</Accordion>
</AccordionGroup>
## Common Issues and Solutions


@@ -60,6 +60,7 @@ crew = Crew(
| **Output Pydantic** _(optional)_ | `output_pydantic` | `Optional[Type[BaseModel]]` | A Pydantic model for task output. |
| **Callback** _(optional)_ | `callback` | `Optional[Any]` | Function/object to be executed after task completion. |
| **Guardrail** _(optional)_ | `guardrail` | `Optional[Callable]` | Function to validate task output before proceeding to next task. |
| **Guardrails** _(optional)_ | `guardrails` | `Optional[List[Callable] | List[str]]` | List of guardrails to validate task output before proceeding to next task. |
| **Guardrail Max Retries** _(optional)_ | `guardrail_max_retries` | `Optional[int]` | Maximum number of retries when guardrail validation fails. Defaults to 3. |
<Note type="warning" title="Deprecated: max_retries">
@@ -223,6 +224,7 @@ By default, the `TaskOutput` will only include the `raw` output. A `TaskOutput`
| **JSON Dict** | `json_dict` | `Optional[Dict[str, Any]]` | A dictionary representing the JSON output of the task. |
| **Agent** | `agent` | `str` | The agent that executed the task. |
| **Output Format** | `output_format` | `OutputFormat` | The format of the task output, with options including RAW, JSON, and Pydantic. The default is RAW. |
| **Messages** | `messages` | `list[LLMMessage]` | The messages from the last task execution. |
### Task Methods and Properties
@@ -341,7 +343,11 @@ Task guardrails provide a way to validate and transform task outputs before they
are passed to the next task. This feature helps ensure data quality and provides
feedback to agents when their output doesn't meet specific criteria.
Guardrails are implemented as Python functions that contain custom validation logic, giving you complete control over the validation process and ensuring reliable, deterministic results.
CrewAI supports two types of guardrails:
1. **Function-based guardrails**: Python functions with custom validation logic, giving you complete control over the validation process and ensuring reliable, deterministic results.
2. **LLM-based guardrails**: String descriptions that use the agent's LLM to validate outputs based on natural language criteria. These are ideal for complex or subjective validation requirements.
### Function-Based Guardrails
@@ -355,12 +361,12 @@ def validate_blog_content(result: TaskOutput) -> Tuple[bool, Any]:
"""Validate blog content meets requirements."""
try:
# Check word count
word_count = len(result.split())
word_count = len(result.raw.split())
if word_count > 200:
return (False, "Blog content exceeds 200 words")
# Additional validation logic here
return (True, result.strip())
return (True, result.raw.strip())
except Exception as e:
return (False, "Unexpected error during validation")
@@ -372,6 +378,147 @@ blog_task = Task(
)
```
### LLM-Based Guardrails (String Descriptions)
Instead of writing custom validation functions, you can use string descriptions that leverage LLM-based validation. When you provide a string to the `guardrail` or `guardrails` parameter, CrewAI automatically creates an `LLMGuardrail` that uses the agent's LLM to validate the output based on your description.
**Requirements**:
- The task must have an `agent` assigned (the guardrail uses the agent's LLM)
- Provide a clear, descriptive string explaining the validation criteria
```python Code
from crewai import Task
# Single LLM-based guardrail
blog_task = Task(
description="Write a blog post about AI",
expected_output="A blog post under 200 words",
agent=blog_agent,
guardrail="The blog post must be under 200 words and contain no technical jargon"
)
```
LLM-based guardrails are particularly useful for:
- **Complex validation logic** that's difficult to express programmatically
- **Subjective criteria** like tone, style, or quality assessments
- **Natural language requirements** that are easier to describe than code
The LLM guardrail will:
1. Analyze the task output against your description
2. Return `(True, output)` if the output complies with the criteria
3. Return `(False, feedback)` with specific feedback if validation fails
**Example with detailed validation criteria**:
```python Code
research_task = Task(
description="Research the latest developments in quantum computing",
expected_output="A comprehensive research report",
agent=researcher_agent,
guardrail="""
The research report must:
- Be at least 1000 words long
- Include at least 5 credible sources
- Cover both technical and practical applications
- Be written in a professional, academic tone
- Avoid speculation or unverified claims
"""
)
```
### Multiple Guardrails
You can apply multiple guardrails to a task using the `guardrails` parameter. Multiple guardrails are executed sequentially, with each guardrail receiving the output from the previous one. This allows you to chain validation and transformation steps.
The `guardrails` parameter accepts:
- A list of guardrail functions or string descriptions
- A single guardrail function or string (same as `guardrail`)
**Note**: If `guardrails` is provided, it takes precedence over `guardrail`. The `guardrail` parameter will be ignored when `guardrails` is set.
```python Code
from typing import Tuple, Any
from crewai import TaskOutput, Task
def validate_word_count(result: TaskOutput) -> Tuple[bool, Any]:
"""Validate word count is within limits."""
word_count = len(result.raw.split())
if word_count < 100:
return (False, f"Content too short: {word_count} words. Need at least 100 words.")
if word_count > 500:
return (False, f"Content too long: {word_count} words. Maximum is 500 words.")
return (True, result.raw)
def validate_no_profanity(result: TaskOutput) -> Tuple[bool, Any]:
"""Check for inappropriate language."""
profanity_words = ["badword1", "badword2"] # Example list
content_lower = result.raw.lower()
for word in profanity_words:
if word in content_lower:
return (False, f"Inappropriate language detected: {word}")
return (True, result.raw)
def format_output(result: TaskOutput) -> Tuple[bool, Any]:
"""Format and clean the output."""
formatted = result.raw.strip()
# Capitalize first letter
formatted = formatted[0].upper() + formatted[1:] if formatted else formatted
return (True, formatted)
# Apply multiple guardrails sequentially
blog_task = Task(
description="Write a blog post about AI",
expected_output="A well-formatted blog post between 100-500 words",
agent=blog_agent,
guardrails=[
validate_word_count, # First: validate length
validate_no_profanity, # Second: check content
format_output # Third: format the result
],
guardrail_max_retries=3
)
```
In this example, the guardrails execute in order:
1. `validate_word_count` checks the word count
2. `validate_no_profanity` checks for inappropriate language (using the output from step 1)
3. `format_output` formats the final result (using the output from step 2)
If any guardrail fails, the error is sent back to the agent, and the task is retried up to `guardrail_max_retries` times.
**Mixing function-based and LLM-based guardrails**:
You can combine both function-based and string-based guardrails in the same list:
```python Code
from typing import Tuple, Any
from crewai import TaskOutput, Task
def validate_word_count(result: TaskOutput) -> Tuple[bool, Any]:
"""Validate word count is within limits."""
word_count = len(result.raw.split())
if word_count < 100:
return (False, f"Content too short: {word_count} words. Need at least 100 words.")
if word_count > 500:
return (False, f"Content too long: {word_count} words. Maximum is 500 words.")
return (True, result.raw)
# Mix function-based and LLM-based guardrails
blog_task = Task(
description="Write a blog post about AI",
expected_output="A well-formatted blog post between 100-500 words",
agent=blog_agent,
guardrails=[
validate_word_count, # Function-based: precise word count check
"The content must be engaging and suitable for a general audience", # LLM-based: subjective quality check
"The writing style should be clear, concise, and free of technical jargon" # LLM-based: style validation
],
guardrail_max_retries=3
)
```
This approach combines the precision of programmatic validation with the flexibility of LLM-based assessment for subjective criteria.
### Guardrail Function Requirements
1. **Function Signature**:
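As in the examples above, a guardrail is a callable that accepts the `TaskOutput` and returns a `(success, data_or_feedback)` tuple; a minimal sketch:

```python Code
from typing import Any, Tuple
from crewai import TaskOutput

def my_guardrail(result: TaskOutput) -> Tuple[bool, Any]:
    """Reject empty output, otherwise pass the raw text through."""
    if not result.raw.strip():
        return (False, "Output is empty")
    return (True, result.raw)
```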


@@ -37,7 +37,7 @@ you can use them locally or refine them to your needs.
<Card title="Tools & Integrations" href="/en/enterprise/features/tools-and-integrations" icon="wrench">
Connect external apps and manage internal tools your agents can use.
</Card>
<Card title="Tool Repository" href="/en/enterprise/features/tool-repository" icon="toolbox">
<Card title="Tool Repository" href="/en/enterprise/guides/tool-repository#tool-repository" icon="toolbox">
Publish and install tools to enhance your crews' capabilities.
</Card>
<Card title="Agents Repository" href="/en/enterprise/features/agent-repositories" icon="people-group">


@@ -241,7 +241,7 @@ Tools & Integrations is the central hub for connecting thirdparty apps and ma
## Related
<CardGroup cols={2}>
<Card title="Tool Repository" href="/en/enterprise/features/tool-repository" icon="toolbox">
<Card title="Tool Repository" href="/en/enterprise/guides/tool-repository#tool-repository" icon="toolbox">
Create, publish, and version custom tools for your organization.
</Card>
<Card title="Webhook Automation" href="/en/enterprise/guides/webhook-automation" icon="bolt">


@@ -21,7 +21,7 @@ The repository is not a version control system. Use Git to track code changes an
Before using the Tool Repository, ensure you have:
- A [CrewAI AMP](https://app.crewai.com) account
- [CrewAI CLI](https://docs.crewai.com/concepts/cli#cli) installed
- [CrewAI CLI](/en/concepts/cli#cli) installed
- uv>=0.5.0 installed. Check out [how to upgrade](https://docs.astral.sh/uv/getting-started/installation/#upgrading-uv)
- [Git](https://git-scm.com) installed and configured
- Access permissions to publish or install tools in your CrewAI AMP organization
@@ -112,7 +112,7 @@ By default, tools are published as private. To make a tool public:
crewai tool publish --public
```
For more details on how to build tools, see [Creating your own tools](https://docs.crewai.com/concepts/tools#creating-your-own-tools).
For more details on how to build tools, see [Creating your own tools](/en/concepts/tools#creating-your-own-tools).
## Updating Tools


@@ -49,7 +49,7 @@ mode: "wide"
To integrate human input into agent execution, set the `human_input` flag in the task definition. When enabled, the agent prompts the user for input before delivering its final answer. This input can provide extra context, clarify ambiguities, or validate the agent's output.
For detailed implementation guidance, see our [Human-in-the-Loop guide](/en/how-to/human-in-the-loop).
For detailed implementation guidance, see our [Human-in-the-Loop guide](/en/enterprise/guides/human-in-the-loop).
</Accordion>
<Accordion title="What advanced customization options are available for tailoring and enhancing agent behavior and capabilities in CrewAI?">
@@ -142,7 +142,7 @@ mode: "wide"
<Accordion title="How can I create custom tools for my CrewAI agents?">
You can create custom tools by subclassing the `BaseTool` class provided by CrewAI or by using the tool decorator. Subclassing involves defining a new class that inherits from `BaseTool`, specifying the name, description, and the `_run` method for operational logic. The tool decorator allows you to create a `Tool` object directly with the required attributes and a functional logic.
<Card href="https://docs.crewai.com/how-to/create-custom-tools" icon="code">CrewAI Tools Guide</Card>
<Card href="/en/learn/create-custom-tools" icon="code">CrewAI Tools Guide</Card>
</Accordion>
<Accordion title="How can you control the maximum number of requests per minute that the entire crew can perform?">


@@ -0,0 +1,291 @@
---
title: Agent-to-Agent (A2A) Protocol
description: Enable CrewAI agents to delegate tasks to remote A2A-compliant agents for specialized handling
icon: network-wired
mode: "wide"
---
## A2A Agent Delegation
CrewAI supports the Agent-to-Agent (A2A) protocol, allowing agents to delegate tasks to remote specialized agents. The agent's LLM automatically decides whether to handle a task directly or delegate to an A2A agent based on the task requirements.
<Note>
A2A delegation requires the `a2a-sdk` package. Install with: `uv add 'crewai[a2a]'` or `pip install 'crewai[a2a]'`
</Note>
## How It Works
When an agent is configured with A2A capabilities:
1. The LLM analyzes each task
2. It decides to either:
- Handle the task directly using its own capabilities
- Delegate to a remote A2A agent for specialized handling
3. If delegating, the agent communicates with the remote A2A agent through the protocol
4. Results are returned to the CrewAI workflow
## Basic Configuration
Configure an agent for A2A delegation by setting the `a2a` parameter:
```python Code
from crewai import Agent, Crew, Task
from crewai.a2a import A2AConfig
agent = Agent(
role="Research Coordinator",
goal="Coordinate research tasks efficiently",
backstory="Expert at delegating to specialized research agents",
llm="gpt-4o",
a2a=A2AConfig(
endpoint="https://example.com/.well-known/agent-card.json",
timeout=120,
max_turns=10
)
)
task = Task(
description="Research the latest developments in quantum computing",
expected_output="A comprehensive research report",
agent=agent
)
crew = Crew(agents=[agent], tasks=[task], verbose=True)
result = crew.kickoff()
```
## Configuration Options
The `A2AConfig` class accepts the following parameters:
<ParamField path="endpoint" type="str" required>
The A2A agent endpoint URL (typically points to `.well-known/agent-card.json`)
</ParamField>
<ParamField path="auth" type="AuthScheme" default="None">
Authentication scheme for the A2A agent. Supports Bearer tokens, OAuth2, API keys, and HTTP authentication.
</ParamField>
<ParamField path="timeout" type="int" default="120">
Request timeout in seconds
</ParamField>
<ParamField path="max_turns" type="int" default="10">
Maximum number of conversation turns with the A2A agent
</ParamField>
<ParamField path="response_model" type="type[BaseModel]" default="None">
Optional Pydantic model for requesting structured output from an A2A agent. A2A protocol does not
enforce this, so an A2A agent does not need to honor this request.
</ParamField>
<ParamField path="fail_fast" type="bool" default="True">
Whether to raise an error immediately if agent connection fails. When `False`, the agent continues with available agents and informs the LLM about unavailable ones.
</ParamField>
## Authentication
For A2A agents that require authentication, use one of the provided auth schemes:
<Tabs>
<Tab title="Bearer Token">
```python Code
from crewai.a2a import A2AConfig
from crewai.a2a.auth import BearerTokenAuth
agent = Agent(
role="Secure Coordinator",
goal="Coordinate tasks with secured agents",
backstory="Manages secure agent communications",
llm="gpt-4o",
a2a=A2AConfig(
endpoint="https://secure-agent.example.com/.well-known/agent-card.json",
auth=BearerTokenAuth(token="your-bearer-token"),
timeout=120
)
)
```
</Tab>
<Tab title="API Key">
```python Code
from crewai.a2a import A2AConfig
from crewai.a2a.auth import APIKeyAuth
agent = Agent(
role="API Coordinator",
goal="Coordinate with API-based agents",
backstory="Manages API-authenticated communications",
llm="gpt-4o",
a2a=A2AConfig(
endpoint="https://api-agent.example.com/.well-known/agent-card.json",
auth=APIKeyAuth(
api_key="your-api-key",
location="header", # or "query" or "cookie"
name="X-API-Key"
),
timeout=120
)
)
```
</Tab>
<Tab title="OAuth2">
```python Code
from crewai.a2a import A2AConfig
from crewai.a2a.auth import OAuth2ClientCredentials
agent = Agent(
role="OAuth Coordinator",
goal="Coordinate with OAuth-secured agents",
backstory="Manages OAuth-authenticated communications",
llm="gpt-4o",
a2a=A2AConfig(
endpoint="https://oauth-agent.example.com/.well-known/agent-card.json",
auth=OAuth2ClientCredentials(
token_url="https://auth.example.com/oauth/token",
client_id="your-client-id",
client_secret="your-client-secret",
scopes=["read", "write"]
),
timeout=120
)
)
```
</Tab>
<Tab title="HTTP Basic">
```python Code
from crewai.a2a import A2AConfig
from crewai.a2a.auth import HTTPBasicAuth
agent = Agent(
role="Basic Auth Coordinator",
goal="Coordinate with basic auth agents",
backstory="Manages basic authentication communications",
llm="gpt-4o",
a2a=A2AConfig(
endpoint="https://basic-agent.example.com/.well-known/agent-card.json",
auth=HTTPBasicAuth(
username="your-username",
password="your-password"
),
timeout=120
)
)
```
</Tab>
</Tabs>
## Multiple A2A Agents
Configure multiple A2A agents for delegation by passing a list:
```python Code
from crewai.a2a import A2AConfig
from crewai.a2a.auth import BearerTokenAuth
agent = Agent(
role="Multi-Agent Coordinator",
goal="Coordinate with multiple specialized agents",
backstory="Expert at delegating to the right specialist",
llm="gpt-4o",
a2a=[
A2AConfig(
endpoint="https://research.example.com/.well-known/agent-card.json",
timeout=120
),
A2AConfig(
endpoint="https://data.example.com/.well-known/agent-card.json",
auth=BearerTokenAuth(token="data-token"),
timeout=90
)
]
)
```
The LLM will automatically choose which A2A agent to delegate to based on the task requirements.
## Error Handling
Control how agent connection failures are handled using the `fail_fast` parameter:
```python Code
from crewai.a2a import A2AConfig
# Fail immediately on connection errors (default)
agent = Agent(
role="Research Coordinator",
goal="Coordinate research tasks",
backstory="Expert at delegation",
llm="gpt-4o",
a2a=A2AConfig(
endpoint="https://research.example.com/.well-known/agent-card.json",
fail_fast=True
)
)
# Continue with available agents
agent = Agent(
role="Multi-Agent Coordinator",
goal="Coordinate with multiple agents",
backstory="Expert at working with available resources",
llm="gpt-4o",
a2a=[
A2AConfig(
endpoint="https://primary.example.com/.well-known/agent-card.json",
fail_fast=False
),
A2AConfig(
endpoint="https://backup.example.com/.well-known/agent-card.json",
fail_fast=False
)
]
)
```
When `fail_fast=False`:
- If some agents fail, the LLM is informed which agents are unavailable and can delegate to working agents
- If all agents fail, the LLM receives a notice about unavailable agents and handles the task directly
- Connection errors are captured and included in the context for better decision-making
## Best Practices
<CardGroup cols={2}>
<Card title="Set Appropriate Timeouts" icon="clock">
Configure timeouts based on expected A2A agent response times. Longer-running tasks may need higher timeout values.
</Card>
<Card title="Limit Conversation Turns" icon="comments">
Use `max_turns` to prevent excessive back-and-forth. The agent will automatically conclude conversations before hitting the limit.
</Card>
<Card title="Use Resilient Error Handling" icon="shield-check">
Set `fail_fast=False` for production environments with multiple agents to gracefully handle connection failures and maintain workflow continuity.
</Card>
<Card title="Secure Your Credentials" icon="lock">
Store authentication tokens and credentials as environment variables, not in code.
</Card>
<Card title="Monitor Delegation Decisions" icon="eye">
Use verbose mode to observe when the LLM chooses to delegate versus handle tasks directly.
</Card>
</CardGroup>
## Supported Authentication Methods
- **Bearer Token** - Simple token-based authentication
- **OAuth2 Client Credentials** - OAuth2 flow for machine-to-machine communication
- **OAuth2 Authorization Code** - OAuth2 flow requiring user authorization
- **API Key** - Key-based authentication (header, query param, or cookie)
- **HTTP Basic** - Username/password authentication
- **HTTP Digest** - Digest authentication (requires `httpx-auth` package)
## Learn More
For more information about the A2A protocol and reference implementations:
- [A2A Protocol Documentation](https://a2a-protocol.org)
- [A2A Sample Implementations](https://github.com/a2aproject/a2a-samples)
- [A2A Python SDK](https://github.com/a2aproject/a2a-python)


@@ -97,7 +97,7 @@ project_crew = Crew(
```
<Tip>
For more details on creating and customizing a manager agent, check out the [Custom Manager Agent documentation](https://docs.crewai.com/how-to/custom-manager-agent#custom-manager-agent).
For more details on creating and customizing a manager agent, check out the [Custom Manager Agent documentation](/en/learn/custom-manager-agent).
</Tip>


@@ -11,9 +11,13 @@ The [Model Context Protocol](https://modelcontextprotocol.io/introduction) (MCP)
CrewAI offers **two approaches** for MCP integration:
### Simple DSL Integration** (Recommended)
### 🚀 **Simple DSL Integration** (Recommended)
Use the `mcps` field directly on agents for seamless MCP tool integration:
Use the `mcps` field directly on agents for seamless MCP tool integration. The DSL supports both **string references** (for quick setup) and **structured configurations** (for full control).
#### String-Based References (Quick Setup)
Perfect for remote HTTPS servers and CrewAI AMP marketplace:
```python
from crewai import Agent
@@ -32,6 +36,46 @@ agent = Agent(
# MCP tools are now automatically available to your agent!
```
#### Structured Configurations (Full Control)
For complete control over connection settings, tool filtering, and all transport types:
```python
from crewai import Agent
from crewai.mcp import MCPServerStdio, MCPServerHTTP, MCPServerSSE
from crewai.mcp.filters import create_static_tool_filter
agent = Agent(
role="Advanced Research Analyst",
goal="Research with full control over MCP connections",
backstory="Expert researcher with advanced tool access",
mcps=[
# Stdio transport for local servers
MCPServerStdio(
command="npx",
args=["-y", "@modelcontextprotocol/server-filesystem"],
env={"API_KEY": "your_key"},
tool_filter=create_static_tool_filter(
allowed_tool_names=["read_file", "list_directory"]
),
cache_tools_list=True,
),
# HTTP/Streamable HTTP transport for remote servers
MCPServerHTTP(
url="https://api.example.com/mcp",
headers={"Authorization": "Bearer your_token"},
streamable=True,
cache_tools_list=True,
),
# SSE transport for real-time streaming
MCPServerSSE(
url="https://stream.example.com/mcp/sse",
headers={"Authorization": "Bearer your_token"},
),
]
)
```
### 🔧 **Advanced: MCPServerAdapter** (For Complex Scenarios)
For advanced use cases requiring manual connection management, the `crewai-tools` library provides the `MCPServerAdapter` class.
@@ -68,12 +112,14 @@ uv pip install 'crewai-tools[mcp]'
## Quick Start: Simple DSL Integration
The easiest way to integrate MCP servers is using the `mcps` field on your agents:
The easiest way to integrate MCP servers is using the `mcps` field on your agents. You can use either string references or structured configurations.
### Quick Start with String References
```python
from crewai import Agent, Task, Crew
# Create agent with MCP tools
# Create agent with MCP tools using string references
research_agent = Agent(
role="Research Analyst",
goal="Find and analyze information using advanced search tools",
@@ -96,13 +142,53 @@ crew = Crew(agents=[research_agent], tasks=[research_task])
result = crew.kickoff()
```
### Quick Start with Structured Configurations
```python
from crewai import Agent, Task, Crew
from crewai.mcp import MCPServerStdio, MCPServerHTTP, MCPServerSSE
# Create agent with structured MCP configurations
research_agent = Agent(
role="Research Analyst",
goal="Find and analyze information using advanced search tools",
backstory="Expert researcher with access to multiple data sources",
mcps=[
# Local stdio server
MCPServerStdio(
command="python",
args=["local_server.py"],
env={"API_KEY": "your_key"},
),
# Remote HTTP server
MCPServerHTTP(
url="https://api.research.com/mcp",
headers={"Authorization": "Bearer your_token"},
),
]
)
# Create task
research_task = Task(
description="Research the latest developments in AI agent frameworks",
expected_output="Comprehensive research report with citations",
agent=research_agent
)
# Create and run crew
crew = Crew(agents=[research_agent], tasks=[research_task])
result = crew.kickoff()
```
That's it! The MCP tools are automatically discovered and available to your agent.
## MCP Reference Formats
The `mcps` field supports various reference formats for maximum flexibility:
The `mcps` field supports both **string references** (for quick setup) and **structured configurations** (for full control). You can mix both formats in the same list.
### External MCP Servers
### String-Based References
#### External MCP Servers
```python
mcps=[
@@ -117,7 +203,7 @@ mcps=[
]
```
### CrewAI AMP Marketplace
#### CrewAI AMP Marketplace
```python
mcps=[
@@ -133,17 +219,166 @@ mcps=[
]
```
### Mixed References
### Structured Configurations
#### Stdio Transport (Local Servers)
Perfect for local MCP servers that run as processes:
```python
from crewai.mcp import MCPServerStdio
from crewai.mcp.filters import create_static_tool_filter
mcps=[
"https://external-api.com/mcp", # External server
"https://weather.service.com/mcp#forecast", # Specific external tool
"crewai-amp:financial-insights", # AMP service
"crewai-amp:data-analysis#sentiment_tool" # Specific AMP tool
MCPServerStdio(
command="npx",
args=["-y", "@modelcontextprotocol/server-filesystem"],
env={"API_KEY": "your_key"},
tool_filter=create_static_tool_filter(
allowed_tool_names=["read_file", "write_file"]
),
cache_tools_list=True,
),
# Python-based server
MCPServerStdio(
command="python",
args=["path/to/server.py"],
env={"UV_PYTHON": "3.12", "API_KEY": "your_key"},
),
]
```
#### HTTP/Streamable HTTP Transport (Remote Servers)
For remote MCP servers over HTTP/HTTPS:
```python
from crewai.mcp import MCPServerHTTP
mcps=[
# Streamable HTTP (default)
MCPServerHTTP(
url="https://api.example.com/mcp",
headers={"Authorization": "Bearer your_token"},
streamable=True,
cache_tools_list=True,
),
# Standard HTTP
MCPServerHTTP(
url="https://api.example.com/mcp",
headers={"Authorization": "Bearer your_token"},
streamable=False,
),
]
```
#### SSE Transport (Real-Time Streaming)
For remote servers using Server-Sent Events:
```python
from crewai.mcp import MCPServerSSE
mcps=[
MCPServerSSE(
url="https://stream.example.com/mcp/sse",
headers={"Authorization": "Bearer your_token"},
cache_tools_list=True,
),
]
```
### Mixed References
You can combine string references and structured configurations:
```python
from crewai.mcp import MCPServerStdio, MCPServerHTTP
mcps=[
# String references
"https://external-api.com/mcp", # External server
"crewai-amp:financial-insights", # AMP service
# Structured configurations
MCPServerStdio(
command="npx",
args=["-y", "@modelcontextprotocol/server-filesystem"],
),
MCPServerHTTP(
url="https://api.example.com/mcp",
headers={"Authorization": "Bearer token"},
),
]
```
### Tool Filtering
Structured configurations support advanced tool filtering:
```python
from crewai.mcp import MCPServerStdio
from crewai.mcp.filters import create_static_tool_filter, create_dynamic_tool_filter, ToolFilterContext
# Static filtering (allow/block lists)
static_filter = create_static_tool_filter(
allowed_tool_names=["read_file", "write_file"],
blocked_tool_names=["delete_file"],
)
# Dynamic filtering (context-aware)
def dynamic_filter(context: ToolFilterContext, tool: dict) -> bool:
# Block dangerous tools for certain agent roles
if context.agent.role == "Code Reviewer":
if "delete" in tool.get("name", "").lower():
return False
return True
mcps=[
MCPServerStdio(
command="npx",
args=["-y", "@modelcontextprotocol/server-filesystem"],
tool_filter=static_filter, # or dynamic_filter
),
]
```
## Configuration Parameters
Each transport type supports specific configuration options:
### MCPServerStdio Parameters
- **`command`** (required): Command to execute (e.g., `"python"`, `"node"`, `"npx"`, `"uvx"`)
- **`args`** (optional): List of command arguments (e.g., `["server.py"]` or `["-y", "@mcp/server"]`)
- **`env`** (optional): Dictionary of environment variables to pass to the process
- **`tool_filter`** (optional): Tool filter function for filtering available tools
- **`cache_tools_list`** (optional): Whether to cache the tool list for faster subsequent access (default: `False`)
### MCPServerHTTP Parameters
- **`url`** (required): Server URL (e.g., `"https://api.example.com/mcp"`)
- **`headers`** (optional): Dictionary of HTTP headers for authentication or other purposes
- **`streamable`** (optional): Whether to use streamable HTTP transport (default: `True`)
- **`tool_filter`** (optional): Tool filter function for filtering available tools
- **`cache_tools_list`** (optional): Whether to cache the tool list for faster subsequent access (default: `False`)
### MCPServerSSE Parameters
- **`url`** (required): Server URL (e.g., `"https://api.example.com/mcp/sse"`)
- **`headers`** (optional): Dictionary of HTTP headers for authentication or other purposes
- **`tool_filter`** (optional): Tool filter function for filtering available tools
- **`cache_tools_list`** (optional): Whether to cache the tool list for faster subsequent access (default: `False`)
### Common Parameters
All transport types support:
- **`tool_filter`**: Filter function to control which tools are available. Can be:
- `None` (default): All tools are available
- Static filter: Created with `create_static_tool_filter()` for allow/block lists
- Dynamic filter: Created with `create_dynamic_tool_filter()` for context-aware filtering
- **`cache_tools_list`**: When `True`, caches the tool list after first discovery to improve performance on subsequent connections
## Key Features
- 🔄 **Automatic Tool Discovery**: Tools are automatically discovered and integrated
@@ -152,26 +387,47 @@ mcps=[
- 🛡️ **Error Resilience**: Graceful handling of unavailable servers
- ⏱️ **Timeout Protection**: Built-in timeouts prevent hanging connections
- 📊 **Transparent Integration**: Works seamlessly with existing CrewAI features
- 🔧 **Full Transport Support**: Stdio, HTTP/Streamable HTTP, and SSE transports
- 🎯 **Advanced Filtering**: Static and dynamic tool filtering capabilities
- 🔐 **Flexible Authentication**: Support for headers, environment variables, and query parameters
## Error Handling
The MCP DSL integration is designed to be resilient:
The MCP DSL integration is designed to be resilient and handles failures gracefully:
```python
from crewai import Agent
from crewai.mcp import MCPServerStdio, MCPServerHTTP
agent = Agent(
role="Resilient Agent",
goal="Continue working despite server issues",
backstory="Agent that handles failures gracefully",
mcps=[
# String references
"https://reliable-server.com/mcp", # Will work
"https://unreachable-server.com/mcp", # Will be skipped gracefully
"https://slow-server.com/mcp", # Will timeout gracefully
"crewai-amp:working-service" # Will work
"crewai-amp:working-service", # Will work
# Structured configs
MCPServerStdio(
command="python",
args=["reliable_server.py"], # Will work
),
MCPServerHTTP(
url="https://slow-server.com/mcp", # Will timeout gracefully
),
]
)
# Agent will use tools from working servers and log warnings for failing ones
```
All connection errors are handled gracefully:
- **Connection failures**: Logged as warnings, agent continues with available tools
- **Timeout errors**: Connections timeout after 30 seconds (configurable)
- **Authentication errors**: Logged clearly for debugging
- **Invalid configurations**: Validation errors are raised at agent creation time
## Advanced: MCPServerAdapter
For complex scenarios requiring manual connection management, use the `MCPServerAdapter` class from `crewai-tools`. Using a Python context manager (`with` statement) is the recommended approach as it automatically handles starting and stopping the connection to the MCP server.


@@ -93,11 +93,15 @@ After running the application, you can view the traces in [Datadog LLM Observabi
Clicking on a trace will show you the details of the trace, including total tokens used, number of LLM calls, models used, and estimated cost. Clicking into a specific span will narrow down these details, and show related input, output, and metadata.
![Datadog LLM Observability Trace View](/images/datadog-llm-observability-1.png)
<Frame>
<img src="/images/datadog-llm-observability-1.png" alt="Datadog LLM Observability Trace View" />
</Frame>
Additionally, you can view the execution graph view of the trace, which shows the control and data flow of the trace, which will scale with larger agents to show handoffs and relationships between LLM calls, tool calls, and agent interactions.
![Datadog LLM Observability Agent Execution Flow View](/images/datadog-llm-observability-2.png)
<Frame>
<img src="/images/datadog-llm-observability-2.png" alt="Datadog LLM Observability Agent Execution Flow View" />
</Frame>
## References


@@ -733,9 +733,7 @@ Here's a basic configuration to route requests to OpenAI, specifically using GPT
- Collect relevant metadata to filter logs
- Enforce access permissions
Create API keys through:
- [Portkey App](https://app.portkey.ai/)
- [API Key Management API](/en/api-reference/admin-api/control-plane/api-keys/create-api-key)
Create API keys through the [Portkey App](https://app.portkey.ai/)
Example using Python SDK:
```python
@@ -758,7 +756,7 @@ Here's a basic configuration to route requests to OpenAI, specifically using GPT
)
```
For detailed key management instructions, see our [API Keys documentation](/en/api-reference/admin-api/control-plane/api-keys/create-api-key).
For detailed key management instructions, see the [Portkey documentation](https://portkey.ai/docs).
</Accordion>
<Accordion title="Step 4: Deploy & Monitor">


@@ -18,7 +18,7 @@ These tools enable your agents to interact with cloud services, access cloud sto
Write and upload files to Amazon S3 storage.
</Card>
<Card title="Bedrock Invoke Agent" icon="aws" href="/en/tools/cloud-storage/bedrockinvokeagenttool">
<Card title="Bedrock Invoke Agent" icon="aws" href="/en/tools/integration/bedrockinvokeagenttool">
Invoke Amazon Bedrock agents for AI-powered tasks.
</Card>


@@ -23,13 +23,15 @@ Here's a minimal example of how to use the tool:
```python
from crewai import Agent
from crewai_tools import QdrantVectorSearchTool
from crewai_tools import QdrantVectorSearchTool, QdrantConfig
# Initialize the tool
# Initialize the tool with QdrantConfig
qdrant_tool = QdrantVectorSearchTool(
qdrant_url="your_qdrant_url",
qdrant_api_key="your_qdrant_api_key",
collection_name="your_collection"
qdrant_config=QdrantConfig(
qdrant_url="your_qdrant_url",
qdrant_api_key="your_qdrant_api_key",
collection_name="your_collection"
)
)
# Create an agent that uses the tool
@@ -82,7 +84,7 @@ def extract_text_from_pdf(pdf_path):
def get_openai_embedding(text):
response = client.embeddings.create(
input=text,
model="text-embedding-3-small"
model="text-embedding-3-large"
)
return response.data[0].embedding
@@ -90,13 +92,13 @@ def get_openai_embedding(text):
def load_pdf_to_qdrant(pdf_path, qdrant, collection_name):
# Extract text from PDF
text_chunks = extract_text_from_pdf(pdf_path)
# Create Qdrant collection
if qdrant.collection_exists(collection_name):
qdrant.delete_collection(collection_name)
qdrant.create_collection(
collection_name=collection_name,
vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
vectors_config=VectorParams(size=3072, distance=Distance.COSINE)
)
# Store embeddings
@@ -120,19 +122,23 @@ pdf_path = "path/to/your/document.pdf"
load_pdf_to_qdrant(pdf_path, qdrant, collection_name)
# Initialize Qdrant search tool
from crewai_tools import QdrantConfig
qdrant_tool = QdrantVectorSearchTool(
qdrant_url=os.getenv("QDRANT_URL"),
qdrant_api_key=os.getenv("QDRANT_API_KEY"),
collection_name=collection_name,
limit=3,
score_threshold=0.35
qdrant_config=QdrantConfig(
qdrant_url=os.getenv("QDRANT_URL"),
qdrant_api_key=os.getenv("QDRANT_API_KEY"),
collection_name=collection_name,
limit=3,
score_threshold=0.35
)
)
# Create CrewAI agents
search_agent = Agent(
role="Senior Semantic Search Agent",
goal="Find and analyze documents based on semantic search",
backstory="""You are an expert research assistant who can find relevant
backstory="""You are an expert research assistant who can find relevant
information using semantic search in a Qdrant database.""",
tools=[qdrant_tool],
verbose=True
@@ -141,7 +147,7 @@ search_agent = Agent(
answer_agent = Agent(
role="Senior Answer Assistant",
goal="Generate answers to questions based on the context provided",
backstory="""You are an expert answer assistant who can generate
backstory="""You are an expert answer assistant who can generate
answers to questions based on the context provided.""",
tools=[qdrant_tool],
verbose=True
@@ -180,21 +186,82 @@ print(result)
## Tool Parameters
### Required Parameters
- `qdrant_url` (str): The URL of your Qdrant server
- `qdrant_api_key` (str): API key for authentication with Qdrant
- `collection_name` (str): Name of the Qdrant collection to search
- `qdrant_config` (QdrantConfig): Configuration object containing all Qdrant settings
### Optional Parameters
### QdrantConfig Parameters
- `qdrant_url` (str): The URL of your Qdrant server
- `qdrant_api_key` (str, optional): API key for authentication with Qdrant
- `collection_name` (str): Name of the Qdrant collection to search
- `limit` (int): Maximum number of results to return (default: 3)
- `score_threshold` (float): Minimum similarity score threshold (default: 0.35)
- `filter` (Any, optional): Qdrant Filter instance for advanced filtering (default: None)
### Optional Tool Parameters
- `custom_embedding_fn` (Callable[[str], list[float]]): Custom function for text vectorization
- `qdrant_package` (str): Base package path for Qdrant (default: "qdrant_client")
- `client` (Any): Pre-initialized Qdrant client (optional); see the sketch after this list
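As a rough sketch of how the optional `client` parameter can be combined with `QdrantConfig` (assuming a standard `QdrantClient` from `qdrant_client`; field names follow the lists above):
```python
from qdrant_client import QdrantClient
from crewai_tools import QdrantVectorSearchTool, QdrantConfig

# Reuse a pre-initialized Qdrant client instead of letting the tool create one.
client = QdrantClient(url="your_qdrant_url", api_key="your_qdrant_api_key")

qdrant_tool = QdrantVectorSearchTool(
    qdrant_config=QdrantConfig(
        qdrant_url="your_qdrant_url",
        collection_name="your_collection",
        limit=5,
        score_threshold=0.5,
    ),
    client=client,  # optional pre-initialized client
)
```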
## Advanced Filtering
The QdrantVectorSearchTool supports powerful filtering capabilities to refine your search results:
### Dynamic Filtering
Use `filter_by` and `filter_value` parameters in your search to filter results on-the-fly:
```python
# Agent will use these parameters when calling the tool
# The tool schema accepts filter_by and filter_value
# Example: search with category filter
# Results will be filtered where category == "technology"
```
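For illustration, a direct invocation with these parameters might look like the following sketch (assuming the tool's `run` method accepts the schema fields listed under Search Parameters; agents normally supply these arguments themselves):
```python
# Hypothetical direct call with a dynamic filter.
results = qdrant_tool.run(
    query="latest advances in vector search",
    filter_by="category",       # metadata field to filter on
    filter_value="technology",  # only return points where category == "technology"
)
print(results)
```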
### Preset Filters with QdrantConfig
For complex filtering, use Qdrant Filter instances in your configuration:
```python
from qdrant_client.http import models as qmodels
from crewai_tools import QdrantVectorSearchTool, QdrantConfig
# Create a filter for specific conditions
preset_filter = qmodels.Filter(
must=[
qmodels.FieldCondition(
key="category",
match=qmodels.MatchValue(value="research")
),
qmodels.FieldCondition(
key="year",
match=qmodels.MatchValue(value=2024)
)
]
)
# Initialize tool with preset filter
qdrant_tool = QdrantVectorSearchTool(
qdrant_config=QdrantConfig(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection",
filter=preset_filter # Preset filter applied to all searches
)
)
```
### Combining Filters
The tool automatically combines preset filters from `QdrantConfig` with dynamic filters from `filter_by` and `filter_value`:
```python
# If QdrantConfig has a preset filter for category="research"
# And the search uses filter_by="year", filter_value=2024
# Both filters will be combined (AND logic)
```
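To make the AND combination concrete, here is an illustrative sketch of the Qdrant filter that would effectively apply when a preset `category == "research"` filter is combined with a dynamic `filter_by="year"`, `filter_value=2024` search (the tool builds this internally; shown only for clarity):
```python
from qdrant_client.http import models as qmodels

# Equivalent combined filter: preset condition AND dynamic condition.
combined_filter = qmodels.Filter(
    must=[
        qmodels.FieldCondition(key="category", match=qmodels.MatchValue(value="research")),  # preset
        qmodels.FieldCondition(key="year", match=qmodels.MatchValue(value=2024)),            # dynamic
    ]
)
```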
## Search Parameters
The tool accepts these parameters in its schema:
- `query` (str): The search query to find similar documents
- `filter_by` (str, optional): Metadata field to filter on
- `filter_value` (str, optional): Value to filter by
- `filter_value` (Any, optional): Value to filter by
## Return Format
@@ -214,7 +281,7 @@ The tool returns results in JSON format:
## Default Embedding
By default, the tool uses OpenAI's `text-embedding-3-small` model for vectorization. This requires:
By default, the tool uses OpenAI's `text-embedding-3-large` model for vectorization. This requires:
- OpenAI API key set in environment: `OPENAI_API_KEY`
## Custom Embeddings
@@ -240,18 +307,22 @@ def custom_embeddings(text: str) -> list[float]:
# Tokenize and get model outputs
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs)
# Use mean pooling to get text embedding
embeddings = outputs.last_hidden_state.mean(dim=1)
# Convert to list of floats and return
return embeddings[0].tolist()
# Use custom embeddings with the tool
from crewai_tools import QdrantConfig
tool = QdrantVectorSearchTool(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection",
qdrant_config=QdrantConfig(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection"
),
custom_embedding_fn=custom_embeddings # Pass your custom function
)
```
@@ -269,4 +340,4 @@ Required environment variables:
```bash
export QDRANT_URL="your_qdrant_url" # If not provided in constructor
export QDRANT_API_KEY="your_api_key" # If not provided in constructor
export OPENAI_API_KEY="your_openai_key" # If using default embeddings
export OPENAI_API_KEY="your_openai_key" # If using default embeddings

View File

@@ -54,25 +54,25 @@ The following parameters can be used to customize the `CSVSearchTool`'s behavior
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows:
```python Code
from chromadb.config import Settings
tool = CSVSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # or "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -46,23 +46,25 @@ tool = DirectorySearchTool(directory='/path/to/directory')
The DirectorySearchTool uses OpenAI for embeddings and summarization by default. Customization options for these settings include changing the model provider and configuration, enhancing flexibility for advanced users.
```python Code
from chromadb.config import Settings
tool = DirectorySearchTool(
config=dict(
llm=dict(
provider="ollama", # Options include ollama, google, anthropic, llama2, and more
config=dict(
model="llama2",
# Additional configurations here
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # or "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -56,25 +56,25 @@ The following parameters can be used to customize the `DOCXSearchTool`'s behavio
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows:
```python Code
from chromadb.config import Settings
tool = DOCXSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # or "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -48,27 +48,25 @@ tool = MDXSearchTool(mdx='path/to/your/document.mdx')
The tool defaults to using OpenAI for embeddings and summarization. For customization, utilize a configuration dictionary as shown below:
```python Code
from chromadb.config import Settings
tool = MDXSearchTool(
config=dict(
llm=dict(
provider="ollama", # Options include google, openai, anthropic, llama2, etc.
config=dict(
model="llama2",
# Optional parameters can be included here.
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# Optional title for the embeddings can be added here.
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # or "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -45,28 +45,64 @@ tool = PDFSearchTool(pdf='path/to/your/document.pdf')
## Custom model and embeddings
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows:
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows. Note: a vector database configuration is required because the generated embeddings must be stored in and queried from a vector database.
```python Code
from crewai_tools import PDFSearchTool
# - embedding_model (required): choose provider + provider-specific config
# - vectordb (required): choose vector DB and pass its config
tool = PDFSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
# Supported providers: "openai", "azure", "google-generativeai", "google-vertex",
# "voyageai", "cohere", "huggingface", "jina", "sentence-transformer",
# "text2vec", "ollama", "openclip", "instructor", "onnx", "roboflow", "watsonx", "custom"
"provider": "openai", # or: "google-generativeai", "cohere", "ollama", ...
"config": {
# Model identifier for the chosen provider. "model" will be auto-mapped to "model_name" internally.
"model": "text-embedding-3-small",
# Optional: API key. If omitted, the tool will use provider-specific env vars when available
# (e.g., OPENAI_API_KEY for provider="openai").
# "api_key": "sk-...",
# Provider-specific examples:
# --- Google Generative AI ---
# (Set provider="google-generativeai" above)
# "model": "models/embedding-001",
# "task_type": "retrieval_document",
# "title": "Embeddings",
# --- Cohere ---
# (Set provider="cohere" above)
# "model": "embed-english-v3.0",
# --- Ollama (local) ---
# (Set provider="ollama" above)
# "model": "nomic-embed-text",
},
},
"vectordb": {
"provider": "chromadb", # or "qdrant"
"config": {
# For ChromaDB: pass "settings" (chromadb.config.Settings) or rely on defaults.
# Example (uncomment and import):
# from chromadb.config import Settings
# "settings": Settings(
# persist_directory="/content/chroma",
# allow_reset=True,
# is_persistent=True,
# ),
# For Qdrant: pass "vectors_config" (qdrant_client.models.VectorParams).
# Example (uncomment and import):
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
# Note: collection name is controlled by the tool (default: "rag_tool_collection"), not set here.
}
},
}
)
```
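As a brief usage sketch under the configuration above (the agent, task, and crew names are illustrative), the configured tool can be attached to an agent like any other CrewAI tool:
```python Code
from crewai import Agent, Task, Crew

# Hypothetical agent that answers questions grounded in the indexed PDF.
pdf_agent = Agent(
    role="PDF Research Assistant",
    goal="Answer questions using content retrieved from the PDF",
    backstory="You ground every answer in passages found via semantic search over the PDF.",
    tools=[tool],
)

task = Task(
    description="Summarize the key obligations described in the document.",
    expected_output="A short, well-grounded summary.",
    agent=pdf_agent,
)

crew = Crew(agents=[pdf_agent], tasks=[task])
print(crew.kickoff())
```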

View File

@@ -57,25 +57,41 @@ By default, the tool uses OpenAI for both embeddings and summarization.
To customize the model, you can use a config dictionary as follows:
```python Code
from chromadb.config import Settings
tool = TXTSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
# Required: embeddings provider + config
"embedding_model": {
"provider": "openai", # or google-generativeai, cohere, ollama, ...
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...", # optional if env var is set
# Provider examples:
# Google → model: "models/embedding-001", task_type: "retrieval_document"
# Cohere → model: "embed-english-v3.0"
# Ollama → model: "nomic-embed-text"
},
},
# Required: vector database config
"vectordb": {
"provider": "chromadb", # or "qdrant"
"config": {
# Chroma settings (optional persistence)
# "settings": Settings(
# persist_directory="/content/chroma",
# allow_reset=True,
# is_persistent=True,
# ),
# Qdrant vector params example:
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
# Note: collection name is controlled by the tool (default: "rag_tool_collection").
}
},
}
)
```

View File

@@ -54,25 +54,25 @@ It is an optional parameter during the tool's initialization but must be provide
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows:
```python Code
from chromadb.config import Settings
tool = XMLSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # or "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -632,11 +632,11 @@ mode: "wide"
## 기여
기여를 원하시면, [기여 가이드](CONTRIBUTING.md)를 참조하세요.
기여를 원하시면, [기여 가이드](https://github.com/crewAIInc/crewAI/blob/main/CONTRIBUTING.md)를 참조하세요.
## 라이센스
이 프로젝트는 MIT 라이센스 하에 배포됩니다. 자세한 내용은 [LICENSE](LICENSE) 파일을 확인하세요.
이 프로젝트는 MIT 라이센스 하에 배포됩니다. 자세한 내용은 [LICENSE](https://github.com/crewAIInc/crewAI/blob/main/LICENSE) 파일을 확인하세요.
</Update>
<Update label="2025년 5월 22일">

View File

@@ -706,7 +706,7 @@ class KnowledgeMonitorListener(BaseEventListener):
knowledge_monitor = KnowledgeMonitorListener()
```
이벤트 사용에 대한 자세한 내용은 [이벤트 리스너](https://docs.crewai.com/concepts/event-listener) 문서를 참고하세요.
이벤트 사용에 대한 자세한 내용은 [이벤트 리스너](/ko/concepts/event-listener) 문서를 참고하세요.
### 맞춤형 지식 소스

View File

@@ -748,7 +748,7 @@ CrewAI는 LLM의 스트리밍 응답을 지원하여, 애플리케이션이 출
```
<Tip>
[자세한 내용은 여기를 클릭하세요](https://docs.crewai.com/concepts/event-listener#event-listeners)
[자세한 내용은 여기를 클릭하세요](/ko/concepts/event-listener#event-listeners)
</Tip>
</Tab>

View File

@@ -36,7 +36,7 @@ mode: "wide"
<Card title="도구 & 통합" href="/ko/enterprise/features/tools-and-integrations" icon="wrench">
에이전트가 사용할 외부 앱 연결 및 내부 도구 관리.
</Card>
<Card title="도구 저장소" href="/ko/enterprise/features/tool-repository" icon="toolbox">
<Card title="도구 저장소" href="/ko/enterprise/guides/tool-repository" icon="toolbox">
크루 기능을 확장할 수 있도록 도구를 게시하고 설치.
</Card>
<Card title="에이전트 저장소" href="/ko/enterprise/features/agent-repositories" icon="people-group">

View File

@@ -231,7 +231,7 @@ mode: "wide"
## 관련 문서
<CardGroup cols={2}>
<Card title="도구 저장소" href="/ko/enterprise/features/tool-repository" icon="toolbox">
<Card title="도구 저장소" href="/ko/enterprise/guides/tool-repository" icon="toolbox">
크루 기능을 확장할 수 있도록 도구를 게시하고 설치하세요.
</Card>
<Card title="Webhook 자동화" href="/ko/enterprise/guides/webhook-automation" icon="bolt">

View File

@@ -21,7 +21,7 @@ Tool Repository는 CrewAI 도구를 위한 패키지 관리자입니다. 사용
Tool Repository를 사용하기 전에 다음이 준비되어 있어야 합니다:
- [CrewAI AMP](https://app.crewai.com) 계정
- [CrewAI CLI](https://docs.crewai.com/concepts/cli#cli) 설치됨
- [CrewAI CLI](/ko/concepts/cli#cli) 설치됨
- uv>=0.5.0 이 설치되어 있어야 합니다. [업그레이드 방법](https://docs.astral.sh/uv/getting-started/installation/#upgrading-uv)을 참고하세요.
- [Git](https://git-scm.com) 설치 및 구성 완료
- CrewAI AMP 조직에서 도구를 게시하거나 설치할 수 있는 액세스 권한
@@ -66,7 +66,7 @@ crewai tool publish
crewai tool publish --public
```
도구 빌드에 대한 자세한 내용은 [나만의 도구 만들기](https://docs.crewai.com/concepts/tools#creating-your-own-tools)를 참고하세요.
도구 빌드에 대한 자세한 내용은 [나만의 도구 만들기](/ko/concepts/tools#creating-your-own-tools)를 참고하세요.
## 도구 업데이트

View File

@@ -49,7 +49,7 @@ mode: "wide"
에이전트 실행에 인간 입력을 통합하려면 작업 정의에서 `human_input` 플래그를 설정하세요. 활성화하면, 에이전트가 최종 답변을 제공하기 전에 사용자에게 입력을 요청합니다. 이 입력은 추가 맥락을 제공하거나, 애매함을 해소하거나, 에이전트의 출력을 검증해야 할 때 활용될 수 있습니다.
자세한 구현 방법은 [Human-in-the-Loop 가이드](/ko/how-to/human-in-the-loop)를 참고해 주세요.
자세한 구현 방법은 [Human-in-the-Loop 가이드](/ko/enterprise/guides/human-in-the-loop)를 참고해 주세요.
</Accordion>
<Accordion title="CrewAI에서 에이전트의 행동과 역량을 맞춤화하고 향상시키기 위한 고급 커스터마이징 옵션에는 어떤 것이 있나요?">
@@ -142,7 +142,7 @@ mode: "wide"
<Accordion title="CrewAI 에이전트를 위한 커스텀 도구는 어떻게 만들 수 있습니까?">
CrewAI에서 제공하는 `BaseTool` 클래스를 상속받아 커스텀 도구를 직접 만들거나, tool 데코레이터를 활용할 수 있습니다. 상속 방식은 `BaseTool`을 상속하는 새로운 클래스를 정의해 이름, 설명, 그리고 실제 논리를 처리하는 `_run` 메서드를 작성합니다. tool 데코레이터를 사용하면 필수 속성과 운영 로직만 정의해 바로 `Tool` 객체를 만들 수 있습니다.
<Card href="https://docs.crewai.com/how-to/create-custom-tools" icon="code">CrewAI 도구 가이드</Card>
<Card href="/ko/learn/create-custom-tools" icon="code">CrewAI 도구 가이드</Card>
</Accordion>
<Accordion title="전체 crew가 수행할 수 있는 분당 최대 요청 수는 어떻게 제한할 수 있나요?">

View File

@@ -95,7 +95,7 @@ project_crew = Crew(
```
<Tip>
매니저 에이전트 생성 및 맞춤화에 대한 자세한 내용은 [커스텀 매니저 에이전트 문서](https://docs.crewai.com/how-to/custom-manager-agent#custom-manager-agent)를 참고하세요.
매니저 에이전트 생성 및 맞춤화에 대한 자세한 내용은 [커스텀 매니저 에이전트 문서](/ko/learn/custom-manager-agent)를 참고하세요.
</Tip>
### 워크플로우 실행

View File

@@ -93,11 +93,15 @@ ddtrace-run python crewai_agent.py
트레이스를 클릭하면 사용된 총 토큰, LLM 호출 수, 사용된 모델, 예상 비용 등 트레이스에 대한 세부 정보가 표시됩니다. 특정 스팬(span)을 클릭하면 이러한 세부 정보의 범위가 좁혀지고 관련 입력, 출력 및 메타데이터가 표시됩니다.
![Datadog LLM 옵저버빌리티 추적 보기](/images/datadog-llm-observability-1.png)
<Frame>
<img src="/images/datadog-llm-observability-1.png" alt="Datadog LLM 옵저버빌리티 추적 보기" />
</Frame>
또한, 트레이스의 제어 및 데이터 흐름을 보여주는 트레이스의 실행 그래프 보기를 볼 수 있으며, 이는 더 큰 에이전트로 확장하여 LLM 호출, 도구 호출 및 에이전트 상호 작용 간의 핸드오프와 관계를 보여줍니다.
![Datadog LLM Observability 에이전트 실행 흐름 보기](/images/datadog-llm-observability-2.png)
<Frame>
<img src="/images/datadog-llm-observability-2.png" alt="Datadog LLM Observability 에이전트 실행 흐름 보기" />
</Frame>
## 참조

View File

@@ -730,9 +730,7 @@ Portkey 대시보드에서 [구성 페이지](https://app.portkey.ai/configs)에
- 로그를 필터링하기 위한 관련 메타데이터 수집
- 액세스 권한 적용
API 키 생성 방법:
- [Portkey App](https://app.portkey.ai/)
- [API Key Management API](/ko/api-reference/admin-api/control-plane/api-keys/create-api-key)
[Portkey App](https://app.portkey.ai/)를 통해 API 키 생성하세요
Python SDK를 사용한 예시:
```python
@@ -755,7 +753,7 @@ api_key = portkey.api_keys.create(
)
```
자세한 키 관리 방법은 [API 키 문서](/ko/api-reference/admin-api/control-plane/api-keys/create-api-key)를 참조하세요.
자세한 키 관리 방법은 [Portkey 문서](https://portkey.ai/docs)를 참조하세요.
</Accordion>
<Accordion title="4단계: 배포 및 모니터링">

View File

@@ -18,7 +18,7 @@ mode: "wide"
파일을 Amazon S3 스토리지에 작성하고 업로드합니다.
</Card>
<Card title="Bedrock Invoke Agent" icon="aws" href="/ko/tools/cloud-storage/bedrockinvokeagenttool">
<Card title="Bedrock Invoke Agent" icon="aws" href="/ko/tools/integration/bedrockinvokeagenttool">
AI 기반 작업을 위해 Amazon Bedrock 에이전트를 호출합니다.
</Card>

View File

@@ -23,13 +23,15 @@ uv add qdrant-client
```python
from crewai import Agent
from crewai_tools import QdrantVectorSearchTool
from crewai_tools import QdrantVectorSearchTool, QdrantConfig
# Initialize the tool
# QdrantConfig로 도구 초기화
qdrant_tool = QdrantVectorSearchTool(
qdrant_url="your_qdrant_url",
qdrant_api_key="your_qdrant_api_key",
collection_name="your_collection"
qdrant_config=QdrantConfig(
qdrant_url="your_qdrant_url",
qdrant_api_key="your_qdrant_api_key",
collection_name="your_collection"
)
)
# Create an agent that uses the tool
@@ -82,7 +84,7 @@ def extract_text_from_pdf(pdf_path):
def get_openai_embedding(text):
response = client.embeddings.create(
input=text,
model="text-embedding-3-small"
model="text-embedding-3-large"
)
return response.data[0].embedding
@@ -90,13 +92,13 @@ def get_openai_embedding(text):
def load_pdf_to_qdrant(pdf_path, qdrant, collection_name):
# Extract text from PDF
text_chunks = extract_text_from_pdf(pdf_path)
# Create Qdrant collection
if qdrant.collection_exists(collection_name):
qdrant.delete_collection(collection_name)
qdrant.create_collection(
collection_name=collection_name,
vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
vectors_config=VectorParams(size=3072, distance=Distance.COSINE)
)
# Store embeddings
@@ -120,19 +122,23 @@ pdf_path = "path/to/your/document.pdf"
load_pdf_to_qdrant(pdf_path, qdrant, collection_name)
# Initialize Qdrant search tool
from crewai_tools import QdrantConfig
qdrant_tool = QdrantVectorSearchTool(
qdrant_url=os.getenv("QDRANT_URL"),
qdrant_api_key=os.getenv("QDRANT_API_KEY"),
collection_name=collection_name,
limit=3,
score_threshold=0.35
qdrant_config=QdrantConfig(
qdrant_url=os.getenv("QDRANT_URL"),
qdrant_api_key=os.getenv("QDRANT_API_KEY"),
collection_name=collection_name,
limit=3,
score_threshold=0.35
)
)
# Create CrewAI agents
search_agent = Agent(
role="Senior Semantic Search Agent",
goal="Find and analyze documents based on semantic search",
backstory="""You are an expert research assistant who can find relevant
backstory="""You are an expert research assistant who can find relevant
information using semantic search in a Qdrant database.""",
tools=[qdrant_tool],
verbose=True
@@ -141,7 +147,7 @@ search_agent = Agent(
answer_agent = Agent(
role="Senior Answer Assistant",
goal="Generate answers to questions based on the context provided",
backstory="""You are an expert answer assistant who can generate
backstory="""You are an expert answer assistant who can generate
answers to questions based on the context provided.""",
tools=[qdrant_tool],
verbose=True
@@ -180,21 +186,82 @@ print(result)
## 도구 매개변수
### 필수 파라미터
- `qdrant_url` (str): Qdrant 서버의 URL
- `qdrant_api_key` (str): Qdrant 인증을 위한 API 키
- `collection_name` (str): 검색할 Qdrant 컬렉션의 이름
- `qdrant_config` (QdrantConfig): 모든 Qdrant 설정을 포함하는 구성 객체
### 선택적 매개변수
### QdrantConfig 매개변수
- `qdrant_url` (str): Qdrant 서버의 URL
- `qdrant_api_key` (str, 선택 사항): Qdrant 인증을 위한 API 키
- `collection_name` (str): 검색할 Qdrant 컬렉션의 이름
- `limit` (int): 반환할 최대 결과 수 (기본값: 3)
- `score_threshold` (float): 최소 유사도 점수 임계값 (기본값: 0.35)
- `filter` (Any, 선택 사항): 고급 필터링을 위한 Qdrant Filter 인스턴스 (기본값: None)
### 선택적 도구 매개변수
- `custom_embedding_fn` (Callable[[str], list[float]]): 텍스트 벡터화를 위한 사용자 지정 함수
- `qdrant_package` (str): Qdrant의 기본 패키지 경로 (기본값: "qdrant_client")
- `client` (Any): 사전 초기화된 Qdrant 클라이언트 (선택 사항)
## 고급 필터링
QdrantVectorSearchTool은 검색 결과를 세밀하게 조정할 수 있는 강력한 필터링 기능을 지원합니다:
### 동적 필터링
검색 시 `filter_by` 및 `filter_value` 매개변수를 사용하여 즉석에서 결과를 필터링할 수 있습니다:
```python
# 에이전트는 도구를 호출할 때 이러한 매개변수를 사용합니다
# 도구 스키마는 filter_by 및 filter_value를 허용합니다
# 예시: 카테고리 필터를 사용한 검색
# 결과는 category == "기술"인 항목으로 필터링됩니다
```
### QdrantConfig를 사용한 사전 설정 필터
복잡한 필터링의 경우 구성에서 Qdrant Filter 인스턴스를 사용하세요:
```python
from qdrant_client.http import models as qmodels
from crewai_tools import QdrantVectorSearchTool, QdrantConfig
# 특정 조건에 대한 필터 생성
preset_filter = qmodels.Filter(
must=[
qmodels.FieldCondition(
key="category",
match=qmodels.MatchValue(value="research")
),
qmodels.FieldCondition(
key="year",
match=qmodels.MatchValue(value=2024)
)
]
)
# 사전 설정 필터로 도구 초기화
qdrant_tool = QdrantVectorSearchTool(
qdrant_config=QdrantConfig(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection",
filter=preset_filter # 모든 검색에 적용되는 사전 설정 필터
)
)
```
### 필터 결합
도구는 `QdrantConfig`의 사전 설정 필터와 `filter_by` 및 `filter_value`의 동적 필터를 자동으로 결합합니다:
```python
# QdrantConfig에 category="research"에 대한 사전 설정 필터가 있고
# 검색에서 filter_by="year", filter_value=2024를 사용하는 경우
# 두 필터가 모두 결합됩니다 (AND 논리)
```
## 검색 매개변수
이 도구는 스키마에서 다음과 같은 매개변수를 허용합니다:
- `query` (str): 유사한 문서를 찾기 위한 검색 쿼리
- `filter_by` (str, 선택 사항): 필터링할 메타데이터 필드
- `filter_value` (str, 선택 사항): 필터 기준 값
- `filter_value` (Any, 선택 사항): 필터 기준 값
## 반환 형식
@@ -214,7 +281,7 @@ print(result)
## 기본 임베딩
기본적으로, 이 도구는 벡터화를 위해 OpenAI의 `text-embedding-3-small` 모델을 사용합니다. 이를 위해서는 다음이 필요합니다:
기본적으로, 이 도구는 벡터화를 위해 OpenAI의 `text-embedding-3-large` 모델을 사용합니다. 이를 위해서는 다음이 필요합니다:
- 환경변수에 설정된 OpenAI API 키: `OPENAI_API_KEY`
## 커스텀 임베딩
@@ -240,18 +307,22 @@ def custom_embeddings(text: str) -> list[float]:
# Tokenize and get model outputs
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs)
# Use mean pooling to get text embedding
embeddings = outputs.last_hidden_state.mean(dim=1)
# Convert to list of floats and return
return embeddings[0].tolist()
# Use custom embeddings with the tool
from crewai_tools import QdrantConfig
tool = QdrantVectorSearchTool(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection",
qdrant_config=QdrantConfig(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection"
),
custom_embedding_fn=custom_embeddings # Pass your custom function
)
```
@@ -270,4 +341,4 @@ tool = QdrantVectorSearchTool(
export QDRANT_URL="your_qdrant_url" # If not provided in constructor
export QDRANT_API_KEY="your_api_key" # If not provided in constructor
export OPENAI_API_KEY="your_openai_key" # If using default embeddings
```
```

View File

@@ -54,25 +54,25 @@ tool = CSVSearchTool()
기본적으로 이 도구는 임베딩과 요약 모두에 OpenAI를 사용합니다. 모델을 사용자 지정하려면 다음과 같이 config 딕셔너리를 사용할 수 있습니다:
```python Code
from chromadb.config import Settings
tool = CSVSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # 또는 "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -46,23 +46,25 @@ tool = DirectorySearchTool(directory='/path/to/directory')
DirectorySearchTool은 기본적으로 OpenAI를 사용하여 임베딩 및 요약을 수행합니다. 이 설정의 커스터마이즈 옵션에는 모델 공급자 및 구성을 변경하는 것이 포함되어 있어, 고급 사용자를 위한 유연성을 향상시킵니다.
```python Code
from chromadb.config import Settings
tool = DirectorySearchTool(
config=dict(
llm=dict(
provider="ollama", # Options include ollama, google, anthropic, llama2, and more
config=dict(
model="llama2",
# Additional configurations here
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # 또는 "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -56,25 +56,25 @@ tool = DOCXSearchTool(docx='path/to/your/document.docx')
기본적으로 이 도구는 임베딩과 요약 모두에 OpenAI를 사용합니다. 모델을 커스터마이즈하려면 다음과 같이 config 딕셔너리를 사용할 수 있습니다:
```python Code
from chromadb.config import Settings
tool = DOCXSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # 또는 "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -48,27 +48,25 @@ tool = MDXSearchTool(mdx='path/to/your/document.mdx')
이 도구는 기본적으로 임베딩과 요약을 위해 OpenAI를 사용합니다. 커스터마이징을 위해 아래와 같이 설정 딕셔너리를 사용할 수 있습니다.
```python Code
from chromadb.config import Settings
tool = MDXSearchTool(
config=dict(
llm=dict(
provider="ollama", # 옵션에는 google, openai, anthropic, llama2 등이 있습니다.
config=dict(
model="llama2",
# 선택적 파라미터를 여기에 포함할 수 있습니다.
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # 또는 openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# 임베딩에 대한 선택적 제목을 여기에 추가할 수 있습니다.
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # 또는 "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -45,28 +45,60 @@ tool = PDFSearchTool(pdf='path/to/your/document.pdf')
## 커스텀 모델 및 임베딩
기본적으로 이 도구는 임베딩과 요약 모두에 OpenAI를 사용합니다. 모델을 커스터마이즈하려면 다음과 같이 config 딕셔너리를 사용할 수 있습니다:
기본적으로 이 도구는 임베딩과 요약 모두에 OpenAI를 사용합니다. 모델을 커스터마이즈하려면 다음과 같이 config 딕셔너리를 사용할 수 있습니다. 참고: 임베딩은 벡터DB에 저장되어야 하므로 vectordb 설정이 필요합니다.
```python Code
from crewai_tools import PDFSearchTool
from chromadb.config import Settings # Chroma 영속성 설정
tool = PDFSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
# 필수: 임베딩 제공자와 설정
"embedding_model": {
# 사용 가능 공급자: "openai", "azure", "google-generativeai", "google-vertex",
# "voyageai", "cohere", "huggingface", "jina", "sentence-transformer",
# "text2vec", "ollama", "openclip", "instructor", "onnx", "roboflow", "watsonx", "custom"
"provider": "openai",
"config": {
# "model" 키는 내부적으로 "model_name"으로 매핑됩니다.
"model": "text-embedding-3-small",
# 선택: API 키 (미설정 시 환경변수 사용)
# "api_key": "sk-...",
# 공급자별 예시
# --- Google ---
# (provider를 "google-generativeai"로 설정)
# "model": "models/embedding-001",
# "task_type": "retrieval_document",
# --- Cohere ---
# (provider를 "cohere"로 설정)
# "model": "embed-english-v3.0",
# --- Ollama(로컬) ---
# (provider를 "ollama"로 설정)
# "model": "nomic-embed-text",
},
},
# 필수: 벡터DB 설정
"vectordb": {
"provider": "chromadb", # 또는 "qdrant"
"config": {
# Chroma 설정 예시
# "settings": Settings(
# persist_directory="/content/chroma",
# allow_reset=True,
# is_persistent=True,
# ),
# Qdrant 설정 예시
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
# 참고: 컬렉션 이름은 도구에서 관리합니다(기본값: "rag_tool_collection").
}
},
}
)
```

View File

@@ -57,25 +57,34 @@ tool = TXTSearchTool(txt='path/to/text/file.txt')
모델을 커스터마이징하려면 다음과 같이 config 딕셔너리를 사용할 수 있습니다:
```python Code
from chromadb.config import Settings
tool = TXTSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
# 필수: 임베딩 제공자 + 설정
"embedding_model": {
"provider": "openai", # 또는 google-generativeai, cohere, ollama 등
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...", # 환경변수 사용 시 생략 가능
# 공급자별 예시: Google → model: "models/embedding-001", task_type: "retrieval_document"
},
},
# 필수: 벡터DB 설정
"vectordb": {
"provider": "chromadb", # 또는 "qdrant"
"config": {
# Chroma 설정(영속성 예시)
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# Qdrant 벡터 파라미터 예시:
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
# 참고: 컬렉션 이름은 도구에서 관리합니다(기본값: "rag_tool_collection").
}
},
}
)
```

View File

@@ -54,25 +54,25 @@ tool = XMLSearchTool(xml='path/to/your/xmlfile.xml')
기본적으로 이 도구는 임베딩과 요약 모두에 OpenAI를 사용합니다. 모델을 커스터마이징하려면 다음과 같이 config 딕셔너리를 사용할 수 있습니다.
```python Code
from chromadb.config import Settings
tool = XMLSearchTool(
config=dict(
llm=dict(
provider="ollama", # or google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # or openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # 또는 "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -11,7 +11,7 @@ mode: "wide"
<Card
title="Bedrock Invoke Agent Tool"
icon="cloud"
href="/en/tools/tool-integrations/bedrockinvokeagenttool"
href="/ko/tools/integration/bedrockinvokeagenttool"
color="#0891B2"
>
Invoke Amazon Bedrock Agents from CrewAI to orchestrate actions across AWS services.
@@ -20,7 +20,7 @@ mode: "wide"
<Card
title="CrewAI Automation Tool"
icon="bolt"
href="/en/tools/tool-integrations/crewaiautomationtool"
href="/ko/tools/integration/crewaiautomationtool"
color="#7C3AED"
>
Automate deployment and operations by integrating CrewAI with external platforms and workflows.

View File

@@ -704,7 +704,7 @@ class KnowledgeMonitorListener(BaseEventListener):
knowledge_monitor = KnowledgeMonitorListener()
```
Para mais informações sobre como usar eventos, consulte a documentação [Event Listeners](https://docs.crewai.com/concepts/event-listener).
Para mais informações sobre como usar eventos, consulte a documentação [Event Listeners](/pt-BR/concepts/event-listener).
### Fontes de Knowledge Personalizadas

View File

@@ -725,7 +725,7 @@ O CrewAI suporta respostas em streaming de LLMs, permitindo que sua aplicação
```
<Tip>
[Clique aqui](https://docs.crewai.com/concepts/event-listener#event-listeners) para mais detalhes
[Clique aqui](/pt-BR/concepts/event-listener#event-listeners) para mais detalhes
</Tip>
</Tab>
</Tabs>

View File

@@ -36,7 +36,7 @@ Você também pode baixar templates diretamente do marketplace clicando em `Down
<Card title="Ferramentas & Integrações" href="/pt-BR/enterprise/features/tools-and-integrations" icon="wrench">
Conecte apps externos e gerencie ferramentas internas que seus agentes podem usar.
</Card>
<Card title="Repositório de Ferramentas" href="/pt-BR/enterprise/features/tool-repository" icon="toolbox">
<Card title="Repositório de Ferramentas" href="/pt-BR/enterprise/guides/tool-repository" icon="toolbox">
Publique e instale ferramentas para ampliar as capacidades dos seus crews.
</Card>
<Card title="Repositório de Agentes" href="/pt-BR/enterprise/features/agent-repositories" icon="people-group">

View File

@@ -231,7 +231,7 @@ Ferramentas & Integrações é o hub central para conectar aplicações de terce
## Relacionados
<CardGroup cols={2}>
<Card title="Repositório de Ferramentas" href="/pt-BR/enterprise/features/tool-repository" icon="toolbox">
<Card title="Repositório de Ferramentas" href="/pt-BR/enterprise/guides/tool-repository" icon="toolbox">
Publique e instale ferramentas para ampliar as capacidades dos seus crews.
</Card>
<Card title="Automação com Webhook" href="/pt-BR/enterprise/guides/webhook-automation" icon="bolt">

View File

@@ -21,7 +21,7 @@ O repositório não é um sistema de controle de versões. Use Git para rastrear
Antes de usar o Repositório de Ferramentas, certifique-se de que você possui:
- Uma conta [CrewAI AMP](https://app.crewai.com)
- [CrewAI CLI](https://docs.crewai.com/concepts/cli#cli) instalada
- [CrewAI CLI](/pt-BR/concepts/cli#cli) instalada
- uv>=0.5.0 instalado. Veja [como atualizar](https://docs.astral.sh/uv/getting-started/installation/#upgrading-uv)
- [Git](https://git-scm.com) instalado e configurado
- Permissões de acesso para publicar ou instalar ferramentas em sua organização CrewAI AMP
@@ -66,7 +66,7 @@ Por padrão, as ferramentas são publicadas como privadas. Para tornar uma ferra
crewai tool publish --public
```
Para mais detalhes sobre como construir ferramentas, acesse [Criando suas próprias ferramentas](https://docs.crewai.com/concepts/tools#creating-your-own-tools).
Para mais detalhes sobre como construir ferramentas, acesse [Criando suas próprias ferramentas](/pt-BR/concepts/tools#creating-your-own-tools).
## Atualizando ferramentas

View File

@@ -49,7 +49,7 @@ mode: "wide"
Para integrar a entrada humana na execução do agente, defina a flag `human_input` na definição da tarefa. Quando habilitada, o agente solicitará a entrada do usuário antes de entregar sua resposta final. Essa entrada pode fornecer contexto extra, esclarecer ambiguidades ou validar a saída do agente.
Para orientações detalhadas de implementação, veja nosso [guia Human-in-the-Loop](/pt-BR/how-to/human-in-the-loop).
Para orientações detalhadas de implementação, veja nosso [guia Human-in-the-Loop](/pt-BR/enterprise/guides/human-in-the-loop).
</Accordion>
<Accordion title="Quais opções avançadas de customização estão disponíveis para aprimorar e personalizar o comportamento e as capacidades dos agentes na CrewAI?">
@@ -142,7 +142,7 @@ mode: "wide"
<Accordion title="Como posso criar ferramentas personalizadas para meus agentes CrewAI?">
Você pode criar ferramentas personalizadas herdando da classe `BaseTool` fornecida pela CrewAI ou usando o decorador de ferramenta. Herdar envolve definir uma nova classe que herda de `BaseTool`, especificando o nome, a descrição e o método `_run` para a lógica operacional. O decorador de ferramenta permite criar um objeto `Tool` diretamente com os atributos necessários e uma lógica funcional.
<Card href="https://docs.crewai.com/how-to/create-custom-tools" icon="code">CrewAI Tools Guide</Card>
<Card href="/pt-BR/learn/create-custom-tools" icon="code">CrewAI Tools Guide</Card>
</Accordion>
<Accordion title="Como controlar o número máximo de solicitações por minuto que toda a crew pode realizar?">

View File

@@ -96,7 +96,7 @@ project_crew = Crew(
```
<Tip>
Para mais detalhes sobre a criação e personalização de um agente gerente, confira a [documentação do Custom Manager Agent](https://docs.crewai.com/how-to/custom-manager-agent#custom-manager-agent).
Para mais detalhes sobre a criação e personalização de um agente gerente, confira a [documentação do Custom Manager Agent](/pt-BR/learn/custom-manager-agent).
</Tip>

View File

@@ -93,11 +93,14 @@ Depois de executar o aplicativo, você pode visualizar os traços na [Datadog LL
Ao clicar em um rastreamento, você verá os detalhes do rastreamento, incluindo o total de tokens usados, o número de chamadas LLM, os modelos usados e o custo estimado. Clicar em um intervalo específico reduzirá esses detalhes e mostrará a entrada, a saída e os metadados relacionados.
![Visualização do rastreamento de observabilidade do Datadog LLM](/images/datadog-llm-observability-1.png)
<Frame>
<img src="/images/datadog-llm-observability-1.png" alt="Visualização do rastreamento de observabilidade do Datadog LLM" />
</Frame>
Além disso, você pode visualizar a visualização do gráfico de execução do rastreamento, que mostra o controle e o fluxo de dados do rastreamento, que será dimensionado com agentes maiores para mostrar transferências e relacionamentos entre chamadas LLM, chamadas de ferramentas e interações de agentes.
![Visualização do fluxo de execução do agente de observabilidade do Datadog LLM](/images/datadog-llm-observability-2.png)
<Frame>
<img src="/images/datadog-llm-observability-2.png" alt="Visualização do fluxo de execução do agente de observabilidade do Datadog LLM" />
</Frame>
## Referências

View File

@@ -733,9 +733,7 @@ Aqui está um exemplo básico para rotear requisições ao OpenAI, usando especi
- Coletam metadados relevantes para filtragem de logs
- Impõem permissões de acesso
Crie chaves de API através de:
- [Portkey App](https://app.portkey.ai/)
- [API Key Management API](/pt-BR/api-reference/admin-api/control-plane/api-keys/create-api-key)
Crie chaves de API através do [Portkey App](https://app.portkey.ai/)
Exemplo usando Python SDK:
```python
@@ -758,7 +756,7 @@ Aqui está um exemplo básico para rotear requisições ao OpenAI, usando especi
)
```
Para instruções detalhadas de gerenciamento de chaves, veja nossa [documentação de API Keys](/pt-BR/api-reference/admin-api/control-plane/api-keys/create-api-key).
Para instruções detalhadas de gerenciamento de chaves, veja a [documentação Portkey](https://portkey.ai/docs).
</Accordion>
<Accordion title="Etapa 4: Implante & Monitore">

View File

@@ -18,7 +18,7 @@ Essas ferramentas permitem que seus agentes interajam com serviços em nuvem, ac
Escreva e faça upload de arquivos para o armazenamento Amazon S3.
</Card>
<Card title="Bedrock Invoke Agent" icon="aws" href="/pt-BR/tools/cloud-storage/bedrockinvokeagenttool">
<Card title="Bedrock Invoke Agent" icon="aws" href="/pt-BR/tools/integration/bedrockinvokeagenttool">
Acione agentes Amazon Bedrock para tarefas orientadas por IA.
</Card>

View File

@@ -23,13 +23,15 @@ Veja um exemplo mínimo de como utilizar a ferramenta:
```python
from crewai import Agent
from crewai_tools import QdrantVectorSearchTool
from crewai_tools import QdrantVectorSearchTool, QdrantConfig
# Inicialize a ferramenta
# Inicialize a ferramenta com QdrantConfig
qdrant_tool = QdrantVectorSearchTool(
qdrant_url="your_qdrant_url",
qdrant_api_key="your_qdrant_api_key",
collection_name="your_collection"
qdrant_config=QdrantConfig(
qdrant_url="your_qdrant_url",
qdrant_api_key="your_qdrant_api_key",
collection_name="your_collection"
)
)
# Crie um agente que utiliza a ferramenta
@@ -82,7 +84,7 @@ def extract_text_from_pdf(pdf_path):
def get_openai_embedding(text):
response = client.embeddings.create(
input=text,
model="text-embedding-3-small"
model="text-embedding-3-large"
)
return response.data[0].embedding
@@ -90,13 +92,13 @@ def get_openai_embedding(text):
def load_pdf_to_qdrant(pdf_path, qdrant, collection_name):
# Extrair texto do PDF
text_chunks = extract_text_from_pdf(pdf_path)
# Criar coleção no Qdrant
if qdrant.collection_exists(collection_name):
qdrant.delete_collection(collection_name)
qdrant.create_collection(
collection_name=collection_name,
vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
vectors_config=VectorParams(size=3072, distance=Distance.COSINE)
)
# Armazenar embeddings
@@ -120,19 +122,23 @@ pdf_path = "path/to/your/document.pdf"
load_pdf_to_qdrant(pdf_path, qdrant, collection_name)
# Inicializar ferramenta de busca Qdrant
from crewai_tools import QdrantConfig
qdrant_tool = QdrantVectorSearchTool(
qdrant_url=os.getenv("QDRANT_URL"),
qdrant_api_key=os.getenv("QDRANT_API_KEY"),
collection_name=collection_name,
limit=3,
score_threshold=0.35
qdrant_config=QdrantConfig(
qdrant_url=os.getenv("QDRANT_URL"),
qdrant_api_key=os.getenv("QDRANT_API_KEY"),
collection_name=collection_name,
limit=3,
score_threshold=0.35
)
)
# Criar agentes CrewAI
search_agent = Agent(
role="Senior Semantic Search Agent",
goal="Find and analyze documents based on semantic search",
backstory="""You are an expert research assistant who can find relevant
backstory="""You are an expert research assistant who can find relevant
information using semantic search in a Qdrant database.""",
tools=[qdrant_tool],
verbose=True
@@ -141,7 +147,7 @@ search_agent = Agent(
answer_agent = Agent(
role="Senior Answer Assistant",
goal="Generate answers to questions based on the context provided",
backstory="""You are an expert answer assistant who can generate
backstory="""You are an expert answer assistant who can generate
answers to questions based on the context provided.""",
tools=[qdrant_tool],
verbose=True
@@ -180,21 +186,82 @@ print(result)
## Parâmetros da Ferramenta
### Parâmetros Obrigatórios
- `qdrant_url` (str): URL do seu servidor Qdrant
- `qdrant_api_key` (str): Chave de API para autenticação com o Qdrant
- `collection_name` (str): Nome da coleção Qdrant a ser pesquisada
- `qdrant_config` (QdrantConfig): Objeto de configuração contendo todas as configurações do Qdrant
### Parâmetros Opcionais
### Parâmetros do QdrantConfig
- `qdrant_url` (str): URL do seu servidor Qdrant
- `qdrant_api_key` (str, opcional): Chave de API para autenticação com o Qdrant
- `collection_name` (str): Nome da coleção Qdrant a ser pesquisada
- `limit` (int): Número máximo de resultados a serem retornados (padrão: 3)
- `score_threshold` (float): Limite mínimo de similaridade (padrão: 0.35)
- `filter` (Any, opcional): Instância de Filter do Qdrant para filtragem avançada (padrão: None)
### Parâmetros Opcionais da Ferramenta
- `custom_embedding_fn` (Callable[[str], list[float]]): Função personalizada para vetorização de textos
- `qdrant_package` (str): Caminho base do pacote Qdrant (padrão: "qdrant_client")
- `client` (Any): Cliente Qdrant pré-inicializado (opcional)
## Filtragem Avançada
A ferramenta QdrantVectorSearchTool oferece recursos poderosos de filtragem para refinar os resultados da busca:
### Filtragem Dinâmica
Use os parâmetros `filter_by` e `filter_value` na sua busca para filtrar resultados dinamicamente:
```python
# O agente usará esses parâmetros ao chamar a ferramenta
# O schema da ferramenta aceita filter_by e filter_value
# Exemplo: busca com filtro de categoria
# Os resultados serão filtrados onde categoria == "tecnologia"
```
### Filtros Pré-definidos com QdrantConfig
Para filtragens complexas, use instâncias de Filter do Qdrant na sua configuração:
```python
from qdrant_client.http import models as qmodels
from crewai_tools import QdrantVectorSearchTool, QdrantConfig
# Criar um filtro para condições específicas
preset_filter = qmodels.Filter(
must=[
qmodels.FieldCondition(
key="categoria",
match=qmodels.MatchValue(value="pesquisa")
),
qmodels.FieldCondition(
key="ano",
match=qmodels.MatchValue(value=2024)
)
]
)
# Inicializar ferramenta com filtro pré-definido
qdrant_tool = QdrantVectorSearchTool(
qdrant_config=QdrantConfig(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection",
filter=preset_filter # Filtro pré-definido aplicado a todas as buscas
)
)
```
### Combinando Filtros
A ferramenta combina automaticamente os filtros pré-definidos do `QdrantConfig` com os filtros dinâmicos de `filter_by` e `filter_value`:
```python
# Se QdrantConfig tem um filtro pré-definido para categoria="pesquisa"
# E a busca usa filter_by="ano", filter_value=2024
# Ambos os filtros serão combinados (lógica AND)
```
## Parâmetros de Busca
A ferramenta aceita estes parâmetros em seu schema:
- `query` (str): Consulta de busca para encontrar documentos similares
- `filter_by` (str, opcional): Campo de metadado para filtrar
- `filter_value` (str, opcional): Valor para filtrar
- `filter_value` (Any, opcional): Valor para filtrar
## Formato de Retorno
@@ -214,7 +281,7 @@ A ferramenta retorna resultados no formato JSON:
## Embedding Padrão
Por padrão, a ferramenta utiliza o modelo `text-embedding-3-small` da OpenAI para vetorização. Isso requer:
Por padrão, a ferramenta utiliza o modelo `text-embedding-3-large` da OpenAI para vetorização. Isso requer:
- Chave de API da OpenAI definida na variável de ambiente: `OPENAI_API_KEY`
## Embeddings Personalizados
@@ -240,18 +307,22 @@ def custom_embeddings(text: str) -> list[float]:
# Tokenizar e obter saídas do modelo
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs)
# Usar mean pooling para obter o embedding do texto
embeddings = outputs.last_hidden_state.mean(dim=1)
# Converter para lista de floats e retornar
return embeddings[0].tolist()
# Usar embeddings personalizados com a ferramenta
from crewai_tools import QdrantConfig
tool = QdrantVectorSearchTool(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection",
qdrant_config=QdrantConfig(
qdrant_url="your_url",
qdrant_api_key="your_key",
collection_name="your_collection"
),
custom_embedding_fn=custom_embeddings # Passe sua função personalizada
)
```
@@ -270,4 +341,4 @@ Variáveis de ambiente obrigatórias:
export QDRANT_URL="your_qdrant_url" # Se não for informado no construtor
export QDRANT_API_KEY="your_api_key" # Se não for informado no construtor
export OPENAI_API_KEY="your_openai_key" # Se estiver usando embeddings padrão
```
```

View File

@@ -46,23 +46,25 @@ tool = DirectorySearchTool(directory='/path/to/directory')
O DirectorySearchTool utiliza OpenAI para embeddings e sumarização por padrão. As opções de personalização dessas configurações incluem a alteração do provedor de modelo e configurações, ampliando a flexibilidade para usuários avançados.
```python Code
from chromadb.config import Settings
tool = DirectorySearchTool(
config=dict(
llm=dict(
provider="ollama", # As opções incluem ollama, google, anthropic, llama2 e mais
config=dict(
model="llama2",
# Configurações adicionais aqui
),
),
embedder=dict(
provider="google", # ou openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # ou "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -56,25 +56,25 @@ Os seguintes parâmetros podem ser usados para customizar o comportamento da `DO
Por padrão, a ferramenta utiliza o OpenAI tanto para embeddings quanto para sumarização. Para customizar o modelo, você pode usar um dicionário de configuração como no exemplo:
```python Code
from chromadb.config import Settings
tool = DOCXSearchTool(
config=dict(
llm=dict(
provider="ollama", # ou google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # ou openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # ou "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -48,27 +48,25 @@ tool = MDXSearchTool(mdx='path/to/your/document.mdx')
A ferramenta utiliza, por padrão, o OpenAI para embeddings e sumarização. Para personalizar, utilize um dicionário de configuração conforme exemplo abaixo:
```python Code
from chromadb.config import Settings
tool = MDXSearchTool(
config=dict(
llm=dict(
provider="ollama", # As opções incluem google, openai, anthropic, llama2, etc.
config=dict(
model="llama2",
# Parâmetros opcionais podem ser incluídos aqui.
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # ou openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# Um título opcional para os embeddings pode ser adicionado aqui.
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # ou "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```

View File

@@ -45,28 +45,60 @@ tool = PDFSearchTool(pdf='path/to/your/document.pdf')
## Modelo e embeddings personalizados
Por padrão, a ferramenta utiliza OpenAI tanto para embeddings quanto para sumarização. Para personalizar o modelo, você pode usar um dicionário de configuração como no exemplo abaixo:
Por padrão, a ferramenta utiliza OpenAI para embeddings e sumarização. Para personalizar, use um dicionário de configuração conforme abaixo. Observação: um banco vetorial (vectordb) é necessário, pois os embeddings gerados precisam ser armazenados e consultados.
```python Code
from crewai_tools import PDFSearchTool
from chromadb.config import Settings # Persistência no Chroma
tool = PDFSearchTool(
config=dict(
llm=dict(
provider="ollama", # ou google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # ou openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
# Obrigatório: provedor de embeddings + configuração
"embedding_model": {
# Provedores suportados: "openai", "azure", "google-generativeai", "google-vertex",
# "voyageai", "cohere", "huggingface", "jina", "sentence-transformer",
# "text2vec", "ollama", "openclip", "instructor", "onnx", "roboflow", "watsonx", "custom"
"provider": "openai",
"config": {
# "model" é mapeado internamente para "model_name".
"model": "text-embedding-3-small",
# Opcional: chave da API (se ausente, usa variáveis de ambiente do provedor)
# "api_key": "sk-...",
# Exemplos específicos por provedor
# --- Google ---
# (defina provider="google-generativeai")
# "model": "models/embedding-001",
# "task_type": "retrieval_document",
# --- Cohere ---
# (defina provider="cohere")
# "model": "embed-english-v3.0",
# --- Ollama (local) ---
# (defina provider="ollama")
# "model": "nomic-embed-text",
},
},
# Required: vector database configuration
"vectordb": {
"provider": "chromadb", # ou "qdrant"
"config": {
# Chroma example:
# "settings": Settings(
# persist_directory="/content/chroma",
# allow_reset=True,
# is_persistent=True,
# ),
# Qdrant example:
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
# Note: the collection name is controlled by the tool (default: "rag_tool_collection").
}
},
}
)
```
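
As a concrete variant of the commented Google example above, a hedged sketch using the Google Generative AI embeddings provider. It assumes the provider's API key is available through environment variables; the parameter names come from the comments in the config above:
```python Code
from crewai_tools import PDFSearchTool

# Sketch of the google-generativeai variant described above; assumes the
# Google API key is set via the provider's environment variables.
tool = PDFSearchTool(
    pdf="path/to/your/document.pdf",
    config={
        "embedding_model": {
            "provider": "google-generativeai",
            "config": {
                "model": "models/embedding-001",
                "task_type": "retrieval_document",
            },
        },
        "vectordb": {"provider": "chromadb", "config": {}},
    },
)
```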

View File

@@ -57,25 +57,39 @@ Por padrão, a ferramenta utiliza o OpenAI tanto para embeddings quanto para sum
To customize the model, you can use a configuration dictionary like the following example:
```python Code
from chromadb.config import Settings
tool = TXTSearchTool(
config=dict(
llm=dict(
provider="ollama", # ou google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # ou openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
# Required: embeddings provider + configuration
"embedding_model": {
"provider": "openai", # ou google-generativeai, cohere, ollama, ...
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...", # opcional se variável de ambiente estiver definida
# Exemplos por provedor:
# Google → model: "models/embedding-001", task_type: "retrieval_document"
},
},
# Required: vector database configuration
"vectordb": {
"provider": "chromadb", # ou "qdrant"
"config": {
# Chroma settings (optional persistence)
# "settings": Settings(
# persist_directory="/content/chroma",
# allow_reset=True,
# is_persistent=True,
# ),
# Example Qdrant vector parameters:
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
# Note: the collection name is controlled by the tool (default: "rag_tool_collection").
}
},
}
)
```
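
For a fully local setup, a hedged sketch of the Ollama provider mentioned above. It assumes an Ollama server is running locally and that the `nomic-embed-text` embedding model has already been pulled:
```python Code
from crewai_tools import TXTSearchTool

# Local-embeddings sketch: assumes a running Ollama instance and that
# `ollama pull nomic-embed-text` has been executed beforehand.
tool = TXTSearchTool(
    txt="path/to/your/file.txt",
    config={
        "embedding_model": {
            "provider": "ollama",
            "config": {"model": "nomic-embed-text"},
        },
        "vectordb": {"provider": "chromadb", "config": {}},
    },
)
```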

View File

@@ -54,25 +54,25 @@ Este parâmetro é opcional durante a inicialização da ferramenta, mas deve se
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a configuration dictionary as in the following example:
```python Code
from chromadb.config import Settings
tool = XMLSearchTool(
config=dict(
llm=dict(
provider="ollama", # ou google, openai, anthropic, llama2, ...
config=dict(
model="llama2",
# temperature=0.5,
# top_p=1,
# stream=true,
),
),
embedder=dict(
provider="google", # ou openai, ollama, ...
config=dict(
model="models/embedding-001",
task_type="retrieval_document",
# title="Embeddings",
),
),
)
config={
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-3-small",
# "api_key": "sk-...",
},
},
"vectordb": {
"provider": "chromadb", # ou "qdrant"
"config": {
# "settings": Settings(persist_directory="/content/chroma", allow_reset=True, is_persistent=True),
# from qdrant_client.models import VectorParams, Distance
# "vectors_config": VectorParams(size=384, distance=Distance.COSINE),
}
},
}
)
```
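
To persist the Chroma index between runs, a hedged sketch enabling the commented `Settings` object above. The `/content/chroma` directory is just the placeholder used in the comments:
```python Code
from chromadb.config import Settings
from crewai_tools import XMLSearchTool

# Persistence sketch: stores the Chroma collection on disk so embeddings are
# reused across runs instead of being rebuilt each time.
tool = XMLSearchTool(
    xml="path/to/your/document.xml",
    config={
        "embedding_model": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small"},
        },
        "vectordb": {
            "provider": "chromadb",
            "config": {
                "settings": Settings(
                    persist_directory="/content/chroma",
                    allow_reset=True,
                    is_persistent=True,
                ),
            },
        },
    },
)
```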

View File

@@ -11,7 +11,7 @@ mode: "wide"
<Card
title="Bedrock Invoke Agent Tool"
icon="cloud"
href="/en/tools/tool-integrations/bedrockinvokeagenttool"
href="/pt-BR/tools/integration/bedrockinvokeagenttool"
color="#0891B2"
>
Invoke Amazon Bedrock Agents from CrewAI to orchestrate actions across AWS services.
@@ -20,7 +20,7 @@ mode: "wide"
<Card
title="CrewAI Automation Tool"
icon="bolt"
href="/en/tools/tool-integrations/crewaiautomationtool"
href="/pt-BR/tools/integration/crewaiautomationtool"
color="#7C3AED"
>
Automate deployment and operations by integrating CrewAI with external platforms and workflows.

View File

@@ -12,7 +12,7 @@ dependencies = [
"pytube>=15.0.0",
"requests>=2.32.5",
"docker>=7.1.0",
"crewai==1.2.1",
"crewai==1.4.1",
"lancedb>=0.5.4",
"tiktoken>=0.8.0",
"beautifulsoup4>=4.13.4",

View File

@@ -287,4 +287,4 @@ __all__ = [
"ZapierActionTools",
]
__version__ = "1.2.1"
__version__ = "1.4.1"

View File

@@ -229,6 +229,7 @@ class CrewAIRagAdapter(Adapter):
continue
else:
metadata: dict[str, Any] = base_metadata.copy()
source_content = SourceContent(source_ref)
if data_type in [
DataType.PDF_FILE,
@@ -239,13 +240,12 @@ class CrewAIRagAdapter(Adapter):
DataType.XML,
DataType.MDX,
]:
if not os.path.isfile(source_ref):
if not source_content.is_url() and not source_content.path_exists():
raise FileNotFoundError(f"File does not exist: {source_ref}")
loader = data_type.get_loader()
chunker = data_type.get_chunker()
source_content = SourceContent(source_ref)
loader_result: LoaderResult = loader.load(source_content)
chunks = chunker.chunk(loader_result.content)

View File
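
The change above lets URL sources pass the existence check that previously only accepted local files. An illustrative stand-alone equivalent of the new guard (not the adapter's actual `SourceContent` API, whose `is_url()` and `path_exists()` methods are used in the diff):

```python
import os
from urllib.parse import urlparse


def _is_url(ref: str) -> bool:
    """Rough URL check, standing in for SourceContent.is_url()."""
    return urlparse(ref).scheme in ("http", "https")


def validate_source(ref: str) -> None:
    """Mirror of the new guard: accept URLs or existing paths, otherwise raise."""
    if not _is_url(ref) and not os.path.exists(ref):
        raise FileNotFoundError(f"File does not exist: {ref}")
```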

@@ -22,22 +22,23 @@ class FirecrawlCrawlWebsiteToolSchema(BaseModel):
class FirecrawlCrawlWebsiteTool(BaseTool):
"""Tool for crawling websites using Firecrawl. To run this tool, you need to have a Firecrawl API key.
"""Tool for crawling websites using Firecrawl v2 API. To run this tool, you need to have a Firecrawl API key.
Args:
api_key (str): Your Firecrawl API key.
config (dict): Optional. It contains Firecrawl API parameters.
config (dict): Optional. It contains Firecrawl v2 API parameters.
Default configuration options:
max_depth (int): Maximum depth to crawl. Default: 2
Default configuration options (Firecrawl v2 API):
max_discovery_depth (int): Maximum depth for discovering pages. Default: 2
ignore_sitemap (bool): Whether to ignore sitemap. Default: True
limit (int): Maximum number of pages to crawl. Default: 100
allow_backward_links (bool): Allow crawling backward links. Default: False
limit (int): Maximum number of pages to crawl. Default: 10
allow_external_links (bool): Allow crawling external links. Default: False
scrape_options (ScrapeOptions): Options for scraping content
- formats (list[str]): Content formats to return. Default: ["markdown", "screenshot", "links"]
allow_subdomains (bool): Allow crawling subdomains. Default: False
delay (int): Delay between requests in milliseconds. Default: None
scrape_options (dict): Options for scraping content
- formats (list[str]): Content formats to return. Default: ["markdown"]
- only_main_content (bool): Only return main content. Default: True
- timeout (int): Timeout in milliseconds. Default: 30000
- timeout (int): Timeout in milliseconds. Default: 10000
"""
model_config = ConfigDict(
@@ -49,14 +50,15 @@ class FirecrawlCrawlWebsiteTool(BaseTool):
api_key: str | None = None
config: dict[str, Any] | None = Field(
default_factory=lambda: {
"maxDepth": 2,
"ignoreSitemap": True,
"max_discovery_depth": 2,
"ignore_sitemap": True,
"limit": 10,
"allowBackwardLinks": False,
"allowExternalLinks": False,
"scrapeOptions": {
"formats": ["markdown", "screenshot", "links"],
"onlyMainContent": True,
"allow_external_links": False,
"allow_subdomains": False,
"delay": None,
"scrape_options": {
"formats": ["markdown"],
"only_main_content": True,
"timeout": 10000,
},
}
@@ -107,7 +109,7 @@ class FirecrawlCrawlWebsiteTool(BaseTool):
if not self._firecrawl:
raise RuntimeError("FirecrawlApp not properly initialized")
return self._firecrawl.crawl_url(url, poll_interval=2, params=self.config)
return self._firecrawl.crawl(url=url, poll_interval=2, **self.config)
try:

View File
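
A hedged usage sketch for the updated crawl tool. It assumes a valid Firecrawl API key; the `url` keyword mirrors the internal `crawl(url=...)` call shown above, and passing `config` replaces the v2 defaults rather than merging with them:

```python
from crewai_tools import FirecrawlCrawlWebsiteTool

# Sketch: overrides a few of the v2 defaults listed in the docstring above.
tool = FirecrawlCrawlWebsiteTool(
    api_key="fc-YOUR_API_KEY",
    config={
        "max_discovery_depth": 1,
        "limit": 5,
        "scrape_options": {"formats": ["markdown"], "only_main_content": True},
    },
)

result = tool.run(url="https://firecrawl.dev")
```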

@@ -22,20 +22,27 @@ class FirecrawlScrapeWebsiteToolSchema(BaseModel):
class FirecrawlScrapeWebsiteTool(BaseTool):
"""Tool for scraping webpages using Firecrawl. To run this tool, you need to have a Firecrawl API key.
"""Tool for scraping webpages using Firecrawl v2 API. To run this tool, you need to have a Firecrawl API key.
Args:
api_key (str): Your Firecrawl API key.
config (dict): Optional. It contains Firecrawl API parameters.
config (dict): Optional. It contains Firecrawl v2 API parameters.
Default configuration options:
Default configuration options (Firecrawl v2 API):
formats (list[str]): Content formats to return. Default: ["markdown"]
onlyMainContent (bool): Only return main content. Default: True
includeTags (list[str]): Tags to include. Default: []
excludeTags (list[str]): Tags to exclude. Default: []
headers (dict): Headers to include. Default: {}
waitFor (int): Time to wait for page to load in ms. Default: 0
json_options (dict): Options for JSON extraction. Default: None
only_main_content (bool): Only return main content excluding headers, navs, footers, etc. Default: True
include_tags (list[str]): Tags to include in the output. Default: []
exclude_tags (list[str]): Tags to exclude from the output. Default: []
max_age (int): Returns cached version if younger than this age in milliseconds. Default: 172800000 (2 days)
headers (dict): Headers to send with the request (e.g., cookies, user-agent). Default: {}
wait_for (int): Delay in milliseconds before fetching content. Default: 0
mobile (bool): Emulate scraping from a mobile device. Default: False
skip_tls_verification (bool): Skip TLS certificate verification. Default: True
timeout (int): Request timeout in milliseconds. Default: None
remove_base64_images (bool): Remove base64 images from output. Default: True
block_ads (bool): Enable ad-blocking and cookie popup blocking. Default: True
proxy (str): Proxy type ("basic", "stealth", "auto"). Default: "auto"
store_in_cache (bool): Store page in Firecrawl index and cache. Default: True
"""
model_config = ConfigDict(
@@ -48,11 +55,18 @@ class FirecrawlScrapeWebsiteTool(BaseTool):
config: dict[str, Any] = Field(
default_factory=lambda: {
"formats": ["markdown"],
"onlyMainContent": True,
"includeTags": [],
"excludeTags": [],
"only_main_content": True,
"include_tags": [],
"exclude_tags": [],
"max_age": 172800000, # 2 days cache
"headers": {},
"waitFor": 0,
"wait_for": 0,
"mobile": False,
"skip_tls_verification": True,
"remove_base64_images": True,
"block_ads": True,
"proxy": "auto",
"store_in_cache": True,
}
)
@@ -95,7 +109,7 @@ class FirecrawlScrapeWebsiteTool(BaseTool):
if not self._firecrawl:
raise RuntimeError("FirecrawlApp not properly initialized")
return self._firecrawl.scrape_url(url, params=self.config)
return self._firecrawl.scrape(url=url, **self.config)
try:

View File
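
A hedged sketch for the scrape tool with the snake_case v2 options listed above. It assumes a valid Firecrawl API key; the `url` keyword mirrors the internal `scrape(url=...)` call:

```python
from crewai_tools import FirecrawlScrapeWebsiteTool

# Sketch: scrape a single page as markdown, keeping only the main content.
tool = FirecrawlScrapeWebsiteTool(
    api_key="fc-YOUR_API_KEY",
    config={
        "formats": ["markdown"],
        "only_main_content": True,
        "wait_for": 0,
    },
)

page_markdown = tool.run(url="https://firecrawl.dev")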

@@ -23,19 +23,24 @@ class FirecrawlSearchToolSchema(BaseModel):
class FirecrawlSearchTool(BaseTool):
"""Tool for searching webpages using Firecrawl. To run this tool, you need to have a Firecrawl API key.
"""Tool for searching webpages using Firecrawl v2 API. To run this tool, you need to have a Firecrawl API key.
Args:
api_key (str): Your Firecrawl API key.
config (dict): Optional. It contains Firecrawl API parameters.
config (dict): Optional. It contains Firecrawl v2 API parameters.
Default configuration options:
limit (int): Maximum number of pages to crawl. Default: 5
tbs (str): Time before search. Default: None
lang (str): Language. Default: "en"
country (str): Country. Default: "us"
location (str): Location. Default: None
timeout (int): Timeout in milliseconds. Default: 60000
Default configuration options (Firecrawl v2 API):
limit (int): Maximum number of search results to return. Default: 5
tbs (str): Time-based search filter (e.g., "qdr:d" for past day). Default: None
location (str): Location for search results. Default: None
timeout (int): Request timeout in milliseconds. Default: None
scrape_options (dict): Options for scraping the search results. Default: {"formats": ["markdown"]}
- formats (list[str]): Content formats to return. Default: ["markdown"]
- only_main_content (bool): Only return main content. Default: True
- include_tags (list[str]): Tags to include. Default: []
- exclude_tags (list[str]): Tags to exclude. Default: []
- wait_for (int): Delay before fetching content in ms. Default: 0
- timeout (int): Request timeout in milliseconds. Default: None
"""
model_config = ConfigDict(
@@ -49,10 +54,15 @@ class FirecrawlSearchTool(BaseTool):
default_factory=lambda: {
"limit": 5,
"tbs": None,
"lang": "en",
"country": "us",
"location": None,
"timeout": 60000,
"timeout": None,
"scrape_options": {
"formats": ["markdown"],
"only_main_content": True,
"include_tags": [],
"exclude_tags": [],
"wait_for": 0,
},
}
)
_firecrawl: FirecrawlApp | None = PrivateAttr(None)
@@ -106,7 +116,7 @@ class FirecrawlSearchTool(BaseTool):
return self._firecrawl.search(
query=query,
params=self.config,
**self.config,
)

View File
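
A hedged sketch for the search tool. It assumes a valid Firecrawl API key; the `query` keyword mirrors the internal `search(query=..., **config)` call shown above:

```python
from crewai_tools import FirecrawlSearchTool

# Sketch: search the web and scrape each result as markdown, per the v2
# defaults documented above.
tool = FirecrawlSearchTool(
    api_key="fc-YOUR_API_KEY",
    config={
        "limit": 3,
        "scrape_options": {"formats": ["markdown"], "only_main_content": True},
    },
)

results = tool.run(query="firecrawl v2 api")
```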

@@ -1,9 +1,9 @@
from __future__ import annotations
from collections.abc import Callable
import importlib
import json
import os
from collections.abc import Callable
from typing import Any
from crewai.tools import BaseTool, EnvVar
@@ -12,9 +12,17 @@ from pydantic.types import ImportString
class QdrantToolSchema(BaseModel):
query: str = Field(..., description="Query to search in Qdrant DB.")
filter_by: str | None = None
filter_value: str | None = None
query: str = Field(
..., description="Query to search in Qdrant DB - always required."
)
filter_by: str | None = Field(
default=None,
description="Parameter to filter the search by. When filtering, needs to be used in conjunction with filter_value.",
)
filter_value: Any | None = Field(
default=None,
description="Value to filter the search by. When filtering, needs to be used in conjunction with filter_by.",
)
class QdrantConfig(BaseModel):
@@ -25,7 +33,9 @@ class QdrantConfig(BaseModel):
collection_name: str
limit: int = 3
score_threshold: float = 0.35
filter_conditions: list[tuple[str, Any]] = Field(default_factory=list)
filter: Any | None = Field(
default=None, description="Qdrant Filter instance for advanced filtering."
)
class QdrantVectorSearchTool(BaseTool):
@@ -76,23 +86,26 @@ class QdrantVectorSearchTool(BaseTool):
filter_value: Any | None = None,
) -> str:
"""Perform vector similarity search."""
filter_ = self.qdrant_package.http.models.Filter
field_condition = self.qdrant_package.http.models.FieldCondition
match_value = self.qdrant_package.http.models.MatchValue
conditions = self.qdrant_config.filter_conditions.copy()
if filter_by and filter_value is not None:
conditions.append((filter_by, filter_value))
search_filter = (
filter_(
must=[
field_condition(key=k, match=match_value(value=v))
for k, v in conditions
]
)
if conditions
else None
self.qdrant_config.filter.model_copy()
if self.qdrant_config.filter is not None
else self.qdrant_package.http.models.Filter(must=[])
)
if filter_by and filter_value is not None:
if not hasattr(search_filter, "must") or not isinstance(
search_filter.must, list
):
search_filter.must = []
search_filter.must.append(
self.qdrant_package.http.models.FieldCondition(
key=filter_by,
match=self.qdrant_package.http.models.MatchValue(
value=filter_value
),
)
)
query_vector = (
self.custom_embedding_fn(query)
if self.custom_embedding_fn

View File
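
With `filter_conditions` replaced by a `filter` field, callers can now supply a full Qdrant `Filter` for advanced filtering; any runtime `filter_by`/`filter_value` pair is appended to its `must` list. A hedged sketch of building such a filter (the remaining `QdrantConfig` fields, including connection details, are omitted here):

```python
from qdrant_client.http.models import FieldCondition, Filter, MatchValue

# Pre-built filter that every search must satisfy; the tool copies it and
# appends any runtime filter_by/filter_value condition to `must`.
base_filter = Filter(
    must=[FieldCondition(key="category", match=MatchValue(value="docs"))]
)

# Passed as QdrantConfig(..., filter=base_filter); collection_name, limit,
# score_threshold, and connection settings are left out of this sketch.
```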

@@ -0,0 +1,289 @@
interactions:
- request:
body: '{"url": "https://firecrawl.dev", "includeTags": [], "excludeTags": [],
"onlyMainContent": true, "waitFor": 0, "skipTlsVerification": true, "removeBase64Images":
true, "fastMode": false, "blockAds": true, "storeInCache": true, "maxAge": 172800000,
"formats": ["markdown"], "headers": {}, "mobile": false, "proxy": "auto", "origin":
"python-sdk@4.5.0"}'
headers:
Accept:
- '*/*'
Accept-Encoding:
- gzip, deflate, zstd
Connection:
- keep-alive
Content-Length:
- '350'
Content-Type:
- application/json
User-Agent:
- python-requests/2.32.5
method: POST
uri: https://api.firecrawl.dev/v2/scrape
response:
body:
string: "{\"success\":true,\"data\":{\"markdown\":\"We just raised our Series
A and shipped Firecrawl /v2 \U0001F389. [Read the blog.](https://www.firecrawl.dev/blog/firecrawl-v2-series-a-announcement)\\n\\n[2
Months Free \u2014 Annually](https://www.firecrawl.dev/pricing)\\n\\n# Turn
websites into LLM-ready data\\n\\nPower your AI apps with clean web data\\n\\nfrom
any website. [It's also open source.](https://github.com/firecrawl/firecrawl)\\n\\nScrape\\n\\nSearch\\nNew\\n\\nMap\\n\\nCrawl\\n\\nScrape\\n\\nLogo\\n\\nNavigation\\n\\nButton\\n\\nH1
Title\\n\\nDescription\\n\\nCTA Button\\n\\n\\\\[ .JSON \\\\]\\n\\n```json\\n1[\\\\\\n2
\ {\\\\\\n3 \\\"url\\\": \\\"https://example.com\\\",\\\\\\n4 \\\"markdown\\\":
\\\"# Getting Started...\\\",\\\\\\n5 \\\"json\\\": { \\\"title\\\": \\\"Guide\\\",
\\\"docs\\\": \\\"...\\\" },\\\\\\n6 \\\"screenshot\\\": \\\"https://example.com/hero.png\\\"\\\\\\n7
\ }\\\\\\n8]\\n```\\n\\nScrape Completed\\n\\nTrusted by5000+\\n\\ncompaniesof
all sizes\\n\\n![Logo 17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
1](https://www.firecrawl.dev/assets-original/logocloud/1.png)\\n\\n![Logo
2](https://www.firecrawl.dev/assets-original/logocloud/2.png)\\n\\n![Logo
3](https://www.firecrawl.dev/assets-original/logocloud/3.png)\\n\\n![Logo
4](https://www.firecrawl.dev/assets-original/logocloud/4.png)\\n\\n![Logo
5](https://www.firecrawl.dev/assets-original/logocloud/5.png)\\n\\n![Logo
6](https://www.firecrawl.dev/assets-original/logocloud/6.png)\\n\\n![Logo
7](https://www.firecrawl.dev/assets-original/logocloud/7.png)\\n\\n![Logo
8](https://www.firecrawl.dev/assets-original/logocloud/8.png)\\n\\n![Logo
9](https://www.firecrawl.dev/assets-original/logocloud/9.png)\\n\\n![Logo
10](https://www.firecrawl.dev/assets-original/logocloud/10.png)\\n\\n![Logo
11](https://www.firecrawl.dev/assets-original/logocloud/11.png)\\n\\n![Logo
12](https://www.firecrawl.dev/assets-original/logocloud/12.png)\\n\\n![Logo
13](https://www.firecrawl.dev/assets-original/logocloud/13.png)\\n\\n![Logo
14](https://www.firecrawl.dev/assets-original/logocloud/14.png)\\n\\n![Logo
15](https://www.firecrawl.dev/assets-original/logocloud/15.png)\\n\\n![Logo
16](https://www.firecrawl.dev/assets-original/logocloud/16.png)\\n\\n![Logo
17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
19](https://www.firecrawl.dev/assets-original/logocloud/19.png)\\n\\n![Logo
20](https://www.firecrawl.dev/assets-original/logocloud/20.png)\\n\\n![Logo
21](https://www.firecrawl.dev/assets-original/logocloud/21.png)\\n\\n![Logo
17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
1](https://www.firecrawl.dev/assets-original/logocloud/1.png)\\n\\n![Logo
2](https://www.firecrawl.dev/assets-original/logocloud/2.png)\\n\\n![Logo
3](https://www.firecrawl.dev/assets-original/logocloud/3.png)\\n\\n![Logo
4](https://www.firecrawl.dev/assets-original/logocloud/4.png)\\n\\n![Logo
5](https://www.firecrawl.dev/assets-original/logocloud/5.png)\\n\\n![Logo
6](https://www.firecrawl.dev/assets-original/logocloud/6.png)\\n\\n![Logo
7](https://www.firecrawl.dev/assets-original/logocloud/7.png)\\n\\n![Logo
8](https://www.firecrawl.dev/assets-original/logocloud/8.png)\\n\\n![Logo
9](https://www.firecrawl.dev/assets-original/logocloud/9.png)\\n\\n![Logo
10](https://www.firecrawl.dev/assets-original/logocloud/10.png)\\n\\n![Logo
11](https://www.firecrawl.dev/assets-original/logocloud/11.png)\\n\\n![Logo
12](https://www.firecrawl.dev/assets-original/logocloud/12.png)\\n\\n![Logo
13](https://www.firecrawl.dev/assets-original/logocloud/13.png)\\n\\n![Logo
14](https://www.firecrawl.dev/assets-original/logocloud/14.png)\\n\\n![Logo
15](https://www.firecrawl.dev/assets-original/logocloud/15.png)\\n\\n![Logo
16](https://www.firecrawl.dev/assets-original/logocloud/16.png)\\n\\n![Logo
17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
19](https://www.firecrawl.dev/assets-original/logocloud/19.png)\\n\\n![Logo
20](https://www.firecrawl.dev/assets-original/logocloud/20.png)\\n\\n![Logo
21](https://www.firecrawl.dev/assets-original/logocloud/21.png)\\n\\n\\\\[01/
07 \\\\]\\n\\n\xB7\\n\\nMain Features\\n\\n//\\n\\nDeveloper First\\n\\n//\\n\\n##
Startscraping today\\n\\nEnhance your apps with industry leading web scraping
and crawling capabilities.\\n\\nScrape\\n\\nGet llm-ready data from websites.
Markdown, JSON, screenshot, etc.\\n\\nSearch\\n\\nNew\\n\\nSearch the web
and get full content from results.\\n\\nCrawl\\n\\nCrawl all the pages on
a website and get data for each page.\\n\\nPython\\n\\nNode.js\\n\\nCurl\\n\\nCopy
code\\n\\n```python\\n1# pip install firecrawl-py\\n2from firecrawl import
Firecrawl\\n3\\n4app = Firecrawl(api_key=\\\"fc-YOUR_API_KEY\\\")\\n5\\n6#
Scrape a website:\\n7app.scrape('firecrawl.dev')\\n8\\n9\\n10\\n```\\n\\n\\\\[
.MD \\\\]\\n\\n```markdown\\n1# Firecrawl\\n2\\n3Firecrawl is a powerful web
scraping\\n4library that makes it easy to extract\\n5data from websites.\\n6\\n7##
Installation\\n8\\n9To install Firecrawl, run:\\n10\\n11\\n```\\n\\n![developer-1](https://www.firecrawl.dev/assets/developer/1.png)\\n\\n![developer-2](https://www.firecrawl.dev/assets/developer/2.png)\\n\\n![developer-3](https://www.firecrawl.dev/assets/developer/3.png)\\n\\n![developer-4](https://www.firecrawl.dev/assets/developer/4.png)\\n\\n![developer-5](https://www.firecrawl.dev/assets/developer/5.png)\\n\\n![developer-6](https://www.firecrawl.dev/assets/developer/6.png)\\n\\n![developer-7](https://www.firecrawl.dev/assets/developer/7.png)\\n\\n![developer-8](https://www.firecrawl.dev/assets/developer/8.png)\\n\\n![developer-9](https://www.firecrawl.dev/assets/developer/1.png)\\n\\n![developer-10](https://www.firecrawl.dev/assets/developer/2.png)\\n\\n![developer-11](https://www.firecrawl.dev/assets/developer/3.png)\\n\\n![developer-12](https://www.firecrawl.dev/assets/developer/4.png)\\n\\n![developer-13](https://www.firecrawl.dev/assets/developer/5.png)\\n\\n![developer-14](https://www.firecrawl.dev/assets/developer/6.png)\\n\\n![developer-15](https://www.firecrawl.dev/assets/developer/7.png)\\n\\n![developer-16](https://www.firecrawl.dev/assets/developer/8.png)\\n\\n![developer-17](https://www.firecrawl.dev/assets/developer/1.png)\\n\\n![developer-18](https://www.firecrawl.dev/assets/developer/2.png)\\n\\n![developer-19](https://www.firecrawl.dev/assets/developer/3.png)\\n\\n![developer-20](https://www.firecrawl.dev/assets/developer/4.png)\\n\\n![developer-21](https://www.firecrawl.dev/assets/developer/5.png)\\n\\n![developer-22](https://www.firecrawl.dev/assets/developer/6.png)\\n\\n![developer-23](https://www.firecrawl.dev/assets/developer/7.png)\\n\\n![developer-24](https://www.firecrawl.dev/assets/developer/8.png)\\n\\nIntegrations\\n\\n###
Use well-known tools\\n\\nAlready fully integrated with the greatest existing
tools and workflows.\\n\\n[See all integrations](https://www.firecrawl.dev/app)\\n\\n![Firecrawl
icon (blueprint)](https://www.firecrawl.dev/assets-original/developer-os-icon.png)\\n\\nmendableai/firecrawl\\n\\nPublic\\n\\nStar\\n\\n65.3K\\n\\n\\\\[python-SDK\\\\]
improvs/async\\n\\n#1337\\n\\n\xB7\\n\\nApr 18, 2025\\n\\n\xB7\\n\\n![rafaelsideguide](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F150964962%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nrafaelsideguide\\n\\nfeat(extract):
cost limit\\n\\n#1473\\n\\n\xB7\\n\\nApr 17, 2025\\n\\n\xB7\\n\\n![mogery](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F66118807%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nmogery\\n\\nfeat(scrape):
get job result from GCS, avoid Redis\\n\\n#1461\\n\\n\xB7\\n\\nApr 15, 2025\\n\\n\xB7\\n\\n![mogery](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F66118807%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nmogery\\n\\nExtract
v2/rerank improvs\\n\\n#1437\\n\\n\xB7\\n\\nApr 11, 2025\\n\\n\xB7\\n\\n![rafaelsideguide](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F150964962%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nrafaelsideguide\\n\\n![https://avatars.githubusercontent.com/u/150964962?v=4](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F150964962%3Fv%3D4&w=96&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\n![https://avatars.githubusercontent.com/u/66118807?v=4](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F66118807%3Fv%3D4&w=96&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\n+90\\n\\nOpen
Source\\n\\n### Code you can trust\\n\\nDeveloped transparently and collaboratively.
Join our community of contributors.\\n\\n[Check out our repo](https://github.com/firecrawl/firecrawl)\\n\\n\\\\[02/
07 \\\\]\\n\\n\xB7\\n\\nCore\\n\\n//\\n\\nBuilt to outperform\\n\\n//\\n\\n##
Core principles, provenperformance\\n\\nBuilt from the ground up to outperform
traditional scrapers.\\n\\nNo proxy headaches\\n\\nReliable.Covers 96% of
the web,\\n\\nincluding JS-heavy and protected pages. No proxies, no puppets,
just clean data.\\n\\nFirecrawl\\n\\n96%\\n\\n![Puppeteer icon](https://www.firecrawl.dev/assets/puppeteer.png)\\n\\nPuppeteer\\n\\n79%\\n\\ncURL\\n\\n75%\\n\\nSpeed
that feels invisible\\n\\nBlazingly fast.Delivers results in less than 1 second,
fast for real-time agents\\n\\nand dynamic apps.\\n\\nURL\\n\\nCrawl\\n\\nScrape\\n\\nfirecrawl.dev/docs\\n\\n50ms\\n\\n51ms\\n\\nfirecrawl.dev/templates\\n\\n52ms\\n\\n50ms\\n\\nfirecrawl.dev/changelog\\n\\n49ms\\n\\n52ms\\n\\nfirecrawl.dev/about\\n\\n52ms\\n\\n50ms\\n\\nfirecrawl.dev/changelog\\n\\n50ms\\n\\n52ms\\n\\nfirecrawl.dev/playground\\n\\n51ms\\n\\n49ms\\n\\n\\\\[
CTA \\\\]\\n\\n\\\\[ CRAWL \\\\]\\n\\n\\\\[ SCRAPE \\\\]\\n\\n\\\\[ CTA \\\\]\\n\\n//\\n\\nGet
started\\n\\n//\\n\\nReady to build?\\n\\nStart getting Web Data for free
and scale seamlessly as your project expands. No credit card needed.\\n\\n[Start
for free](https://www.firecrawl.dev/signin) [See our plans](https://www.firecrawl.dev/pricing)\\n\\n\\\\[03/
07 \\\\]\\n\\n\xB7\\n\\nFeatures\\n\\n//\\n\\nZero configuration\\n\\n//\\n\\n##
We handle the hard stuff\\n\\nRotating proxies, orchestration, rate limits,
js-blocked content and more.\\n\\nDocs to data\\n\\nMedia parsing.Firecrawl
can parse and output content from web hosted pdfs, docx, and more.\\n\\nhttps://example.com/docs/report.pdf\\n\\nhttps://example.com/files/brief.docx\\n\\nhttps://example.com/docs/guide.html\\n\\ndocx\\n\\nParsing...\\n\\nKnows
the moment\\n\\nSmart wait.Firecrawl intelligently waits for content to load,
making scraping faster and more reliable.\\n\\nhttps://example-spa.com\\n\\nRequest
Sent\\n\\nScrapes the real thing\\n\\nCached, when you need it.Selective caching,
you choose your caching patterns, growing web index.\\n\\n![User](https://www.firecrawl.dev/_next/image?url=%2Fassets-original%2Ffeatures%2Fcached-user.png&w=256&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nUser\\n\\nFirecrawl\\n\\nCache\\n\\nInvisible
access\\n\\nStealth mode.Crawls the web without\\n\\nbeing blocked, mimics
real users to access protected or dynamic content.\\n\\nInteractive scraping\\n\\nActions.Click,
scroll, write, wait, press and more before extracting content.\\n\\nhttps://example.com\\n\\nNavigate\\n\\nClick\\n\\nType\\n\\nWait\\n\\nScroll\\n\\nPress\\n\\nScreenshot\\n\\nScrape\\n\\n\\\\[04/
07 \\\\]\\n\\n\xB7\\n\\nPricing\\n\\n//\\n\\nTransparent\\n\\n//\\n\\n## Flexible
pricing\\n\\nExplore transparent pricing built for real-world scraping. Start
for free, then scale as you grow.\\n\\n\U0001F1FA\U0001F1F8USD\\n\\nFree Plan\\n\\nA
lightweight way to try scraping.\\n\\nNo cost, no card, no hassle.\\n\\n500
credits\\n\\n$0123456789\\n\\none-time\\n\\nGet started\\n\\nScrape 500 pages\\n\\n2
concurrent requests\\n\\nLow rate limits\\n\\nHobby\\n\\nGreat for side projects
and small tools.\\n\\nFast, simple, no overkill.\\n\\n3,000 credits\\n\\n$01234567890123456789\\n\\n/monthly\\n\\nBilled
yearly\\n\\n2 months free\\n\\nSubscribe\\n\\nScrape 3,000 pages\\n\\n5 concurrent
requests\\n\\nBasic support\\n\\n$9 per extra 1k credits\\n\\nStandard\\n\\nMost
popular\\n\\nPerfect for scaling with less effort.\\n\\nSimple, solid, dependable.\\n\\n100,000
credits\\n\\n$01234567890123456789\\n\\n/monthly\\n\\nBilled yearly\\n\\n2
months free\\n\\nSubscribe\\n\\nScrape 100,000 pages\\n\\n50 concurrent requests\\n\\nStandard
support\\n\\n$47 per extra 35k credits\\n\\nGrowth\\n\\nBuilt for high volume
and speed.\\n\\nFirecrawl at full force.\\n\\n500,000 credits\\n\\n$012345678901234567890123456789\\n\\n/monthly\\n\\nBilled
yearly\\n\\n2 months free\\n\\nSubscribe\\n\\nScrape 500,000 pages\\n\\n100
concurrent requests\\n\\nPriority support\\n\\n$177 per extra 175k credits\\n\\nExtra
credits are available via auto-recharge packs. [Enable](https://www.firecrawl.dev/signin/signup)\\n\\nEnterprise\\n\\nPower
at your pace\\n\\nUnlimited credits. Custom RPMs.\\n\\n[Contact sales](https://fk4bvu0n5qp.typeform.com/to/Ej6oydlg)
[More details](https://www.firecrawl.dev/enterprise)\\n\\nBulk discounts\\n\\nTop
priority support\\n\\nCustom concurrency limits\\n\\nImproved stealth proxies\\n\\nSLAs\\n\\nAdvanced
security & controls\\n\\n\\\\[05/ 07 \\\\]\\n\\n\xB7\\n\\nTestimonials\\n\\n//\\n\\nCommunity\\n\\n//\\n\\n##
People love building withFirecrawl\\n\\nDiscover why developers choose
Firecrawl every day.\\n\\n[![Morgan Linton](https://www.firecrawl.dev/assets/testimonials/morgan-linton.png)Morgan
Linton@morganlinton\\\"If you're coding with AI, and haven't discovered @firecrawl\\\\_dev
yet, prepare to have your mind blown \U0001F92F\\\"](https://x.com/morganlinton/status/1839454165703204955)
[![Chris DeWeese](https://www.firecrawl.dev/assets/testimonials/chris-deweese.png)Chris
DeWeese@chrisdeweese\\\\_\\\"Started using @firecrawl\\\\_dev for a project,
I wish I used this sooner.\\\"](https://x.com/chrisdeweese_/status/1853587120406876601)
[![Alex Reibman](https://www.firecrawl.dev/assets/testimonials/alex-reibman.png)Alex
Reibman@AlexReibman\\\"Moved our internal agent's web scraping tool from Apify
to Firecrawl because it benchmarked 50x faster with AgentOps.\\\"](https://x.com/AlexReibman/status/1780299595484131836)
[![Tom - Morpho](https://www.firecrawl.dev/assets/testimonials/tom-morpho.png)Tom
- Morpho@TomReppelin\\\"I found gold today. Thank you @firecrawl\\\\_dev\\\"](https://x.com/TomReppelin/status/1844382491014201613)\\n\\n[![Morgan
Linton](https://www.firecrawl.dev/assets/testimonials/morgan-linton.png)Morgan
Linton@morganlinton\\\"If you're coding with AI, and haven't discovered @firecrawl\\\\_dev
yet, prepare to have your mind blown \U0001F92F\\\"](https://x.com/morganlinton/status/1839454165703204955)
[![Chris DeWeese](https://www.firecrawl.dev/assets/testimonials/chris-deweese.png)Chris
DeWeese@chrisdeweese\\\\_\\\"Started using @firecrawl\\\\_dev for a project,
I wish I used this sooner.\\\"](https://x.com/chrisdeweese_/status/1853587120406876601)
[![Alex Reibman](https://www.firecrawl.dev/assets/testimonials/alex-reibman.png)Alex
Reibman@AlexReibman\\\"Moved our internal agent's web scraping tool from Apify
to Firecrawl because it benchmarked 50x faster with AgentOps.\\\"](https://x.com/AlexReibman/status/1780299595484131836)
[![Tom - Morpho](https://www.firecrawl.dev/assets/testimonials/tom-morpho.png)Tom
- Morpho@TomReppelin\\\"I found gold today. Thank you @firecrawl\\\\_dev\\\"](https://x.com/TomReppelin/status/1844382491014201613)\\n\\n[![Bardia](https://www.firecrawl.dev/assets/testimonials/bardia.png)Bardia@thepericulum\\\"The
Firecrawl team ships. I wanted types for their node SDK, and less than an
hour later, I got them.\\\"](https://x.com/thepericulum/status/1781397799487078874)
[![Matt Busigin](https://www.firecrawl.dev/assets/testimonials/matt-busigin.png)Matt
Busigin@mbusigin\\\"Firecrawl is dope. Congrats guys \U0001F44F\\\"](https://x.com/mbusigin/status/1836065372010656069)
[![Sumanth](https://www.firecrawl.dev/assets/testimonials/sumanth.png)Sumanth@Sumanth\\\\_077\\\"Web
scraping will never be the same!\\\\\\\\\\n\\\\\\\\\\nFirecrawl is an open-source
framework that takes a URL, crawls it, and conver...\\\"](https://x.com/Sumanth_077/status/1940049003074478511)
[![Steven Tey](https://www.firecrawl.dev/assets/testimonials/steven-tey.png)Steven
Tey@steventey\\\"Open-source Clay alternative just dropped\\\\\\\\\\n\\\\\\\\\\nUpload
a CSV of emails and...\\\"](https://x.com/steventey/status/1932945651761098889)\\n\\n[![Bardia](https://www.firecrawl.dev/assets/testimonials/bardia.png)Bardia@thepericulum\\\"The
Firecrawl team ships. I wanted types for their node SDK, and less than an
hour later, I got them.\\\"](https://x.com/thepericulum/status/1781397799487078874)
[![Matt Busigin](https://www.firecrawl.dev/assets/testimonials/matt-busigin.png)Matt
Busigin@mbusigin\\\"Firecrawl is dope. Congrats guys \U0001F44F\\\"](https://x.com/mbusigin/status/1836065372010656069)
[![Sumanth](https://www.firecrawl.dev/assets/testimonials/sumanth.png)Sumanth@Sumanth\\\\_077\\\"Web
scraping will never be the same!\\\\\\\\\\n\\\\\\\\\\nFirecrawl is an open-source
framework that takes a URL, crawls it, and conver...\\\"](https://x.com/Sumanth_077/status/1940049003074478511)
[![Steven Tey](https://www.firecrawl.dev/assets/testimonials/steven-tey.png)Steven
Tey@steventey\\\"Open-source Clay alternative just dropped\\\\\\\\\\n\\\\\\\\\\nUpload
a CSV of emails and...\\\"](https://x.com/steventey/status/1932945651761098889)\\n\\n\\\\[06/
07 \\\\]\\n\\n\xB7\\n\\nUse Cases\\n\\n//\\n\\nUse cases\\n\\n//\\n\\n## Transform
\ web data into AI-powered solutions\\n\\nDiscover how Firecrawl customers
are getting the most out of our API.\\n\\n[View all use cases](https://docs.firecrawl.dev/use-cases/overview)\\n\\nChat
with context\\n\\nSmarter AI chats\\n\\nPower your AI assistants with real-time,
accurate web content.\\n\\n[View docs](https://docs.firecrawl.dev/introduction)\\n\\n![AI
Assistant](https://www.firecrawl.dev/assets/ai/bot.png)\\n\\nAI Assistant\\n\\nwithFirecrawl\\n\\nReal-time\xB7Updated
2 min ago\\n\\nAsk anything...\\n\\nKnow your leads\\n\\nLead enrichment\\n\\nEnhance
your sales data with\\n\\nweb information.\\n\\n[Check out Extract](https://www.firecrawl.dev/extract)\\n\\nExtracting
leads from directory...\\n\\nTech startups\\n\\nWith contact info\\n\\nDecision
makers\\n\\nFunding stage\\n\\nReady to engage\\n\\n![Emily Tran](https://www.firecrawl.dev/assets/ai/leads-1.png)\\n\\n![James
Carter](https://www.firecrawl.dev/assets/ai/leads-2.png)\\n\\n![Sophia Kim](https://www.firecrawl.dev/assets/ai/leads-3.png)\\n\\n![Michael
Rivera](https://www.firecrawl.dev/assets/ai/leads-4.png)\\n\\nKnow your leads\\n\\nMCPs\\n\\nAdd
powerful scraping to your\\n\\ncode editors.\\n\\n[Get started](https://docs.firecrawl.dev/mcp-server)\\n\\n![Claude
Code](https://www.firecrawl.dev/assets/ai/mcps-claude.png)\\n\\nClaude Code\\n\\n![Cursor](https://www.firecrawl.dev/assets/ai/mcps-cursor.png)\\n\\nCursor\\n\\n![Windsurf](https://www.firecrawl.dev/assets/ai/mcps-windsurf.png)\\n\\nWindsurf\\n\\n\u273B\\n\\nWelcome
to Claude Code!\\n\\n/help for help, /status for your current setup\\n\\n>Try
\\\"how do I log an error?\\\"\\n\\nBuild with context\\n\\nAI platforms\\n\\nLet
your customers build AI apps\\n\\nwith web data.\\n\\n[Check out Map](https://docs.firecrawl.dev/features/map)\\n\\n![Logo
1](https://www.firecrawl.dev/assets/ai/platforms-1.png)\\n\\n![Logo 2](https://www.firecrawl.dev/assets/ai/platforms-2.png)\\n\\n![Logo
4](https://www.firecrawl.dev/assets/ai/platforms-4.png)\\n\\n![Logo 3](https://www.firecrawl.dev/assets/ai/platforms-3.png)\\n\\nExtracting
text...\\n\\nNo insight missed\\n\\nDeep research\\n\\nExtract comprehensive
information for\\n\\nin-depth research.\\n\\n[Build your own with Search](https://docs.firecrawl.dev/features/search)\\n\\nDeep
research in progress...\\n\\nAcademic papers\\n\\n0 found\\n\\nNews articles\\n\\n0
found\\n\\nExpert opinions\\n\\n0 found\\n\\nResearch reports\\n\\n0 found\\n\\nIndustry
data\\n\\n0 found\\n\\nAsk anything...\\n\\n\\\\[ CTA \\\\]\\n\\n\\\\[ CRAWL
\\\\]\\n\\n\\\\[ SCRAPE \\\\]\\n\\n\\\\[ CTA \\\\]\\n\\n//\\n\\nGet started\\n\\n//\\n\\nReady
to build?\\n\\nStart getting Web Data for free and scale seamlessly as your
project expands. No credit card needed.\\n\\n[Start for free](https://www.firecrawl.dev/signin)
[See our plans](https://www.firecrawl.dev/pricing)\\n\\n\\\\[07/ 07 \\\\]\\n\\n\xB7\\n\\nFAQ\\n\\n//\\n\\nFAQ\\n\\n//\\n\\n##
Frequently askedquestions\\n\\nEverything you need to know about Firecrawl.\\n\\nGeneral\\n\\nWhat
is Firecrawl?\\n\\nWhat sites work?\\n\\nWho can benefit from using Firecrawl?\\n\\nIs
Firecrawl open-source?\\n\\nWhat is the difference between Firecrawl and other
web scrapers?\\n\\nWhat is the difference between the open-source version
and the hosted version?\\n\\nScraping & Crawling\\n\\nHow does Firecrawl handle
dynamic content on websites?\\n\\nWhy is it not crawling all the pages?\\n\\nCan
Firecrawl crawl websites without a sitemap?\\n\\nWhat formats can Firecrawl
convert web data into?\\n\\nHow does Firecrawl ensure the cleanliness of the
data?\\n\\nIs Firecrawl suitable for large-scale data scraping projects?\\n\\nDoes
it respect robots.txt?\\n\\nWhat measures does Firecrawl take to handle web
scraping challenges like rate limits and caching?\\n\\nDoes Firecrawl handle
captcha or authentication?\\n\\nAPI Related\\n\\nWhere can I find my API key?\\n\\nBilling\\n\\nIs
Firecrawl free?\\n\\nIs there a pay-per-use plan instead of monthly?\\n\\nDo
credits roll over to the next month?\\n\\nHow many credits do scraping and
crawling cost?\\n\\nDo you charge for failed requests?\\n\\nWhat payment methods
do you accept?\\n\\nFOOTER\\n\\nThe easiest way to extract\\n\\ndata from
the web\\n\\nBacked by\\n\\nY Combinator\\n\\n[Linkedin](https://www.linkedin.com/company/firecrawl)
[Github](https://github.com/firecrawl/firecrawl)\\n\\nSOC II \xB7 Type 2\\n\\nAICPA\\n\\nSOC
2\\n\\n[X (Twitter)](https://x.com/firecrawl_dev) [Discord](https://discord.gg/gSmWdAkdwd)\\n\\nProducts\\n\\n[Playground](https://www.firecrawl.dev/playground)
[Extract](https://www.firecrawl.dev/extract) [Pricing](https://www.firecrawl.dev/pricing)
[Templates](https://www.firecrawl.dev/templates) [Changelog](https://www.firecrawl.dev/changelog)\\n\\nUse
Cases\\n\\n[AI Platforms](https://docs.firecrawl.dev/use-cases/ai-platforms)
[Lead Enrichment](https://docs.firecrawl.dev/use-cases/lead-enrichment) [SEO
Platforms](https://docs.firecrawl.dev/use-cases/seo-platforms) [Deep Research](https://docs.firecrawl.dev/use-cases/deep-research)\\n\\nDocumentation\\n\\n[Getting
started](https://docs.firecrawl.dev/introduction) [API Reference](https://docs.firecrawl.dev/api-reference/introduction)
[Integrations](https://www.firecrawl.dev/app) [Examples](https://docs.firecrawl.dev/use-cases/overview)
[SDKs](https://docs.firecrawl.dev/sdks/overview)\\n\\nCompany\\n\\n[Blog](https://www.firecrawl.dev/blog)
[Careers](https://www.firecrawl.dev/careers) [Creator & OSS program](https://www.firecrawl.dev/creator-oss-program)
[Student program](https://www.firecrawl.dev/student-program)\\n\\n\xA9 2025
Firecrawl\\n\\n[Terms of Service](https://www.firecrawl.dev/terms-of-service)
[Privacy Policy](https://www.firecrawl.dev/privacy-policy) [Report Abuse](mailto:help@firecrawl.com?subject=Issue:)\\n\\n[All
systems normal](https://status.firecrawl.dev/)\\n\\nStripeM-Inner\",\"metadata\":{\"twitter:title\":\"Firecrawl
- The Web Data API for AI\",\"publisher\":\"Firecrawl\",\"ogUrl\":\"https://www.firecrawl.dev\",\"robots\":\"follow,
index\",\"title\":\"Firecrawl - The Web Data API for AI\",\"ogDescription\":\"The
web crawling, scraping, and search API for AI. Built for scale. Firecrawl
delivers the entire internet to AI agents and builders. Clean, structured,
and ready to reason with.\",\"ogImage\":\"https://www.firecrawl.dev/og.png\",\"viewport\":\"width=device-width,
initial-scale=1, maximum-scale=1, user-scalable=no\",\"og:url\":\"https://www.firecrawl.dev\",\"og:site_name\":\"Firecrawl
- The Web Data API for AI\",\"og:type\":\"website\",\"twitter:image\":\"https://www.firecrawl.dev/og.png\",\"author\":\"Firecrawl\",\"og:title\":\"Firecrawl
- The Web Data API for AI\",\"favicon\":\"https://www.firecrawl.dev/favicon.png\",\"description\":\"The
web crawling, scraping, and search API for AI. Built for scale. Firecrawl
delivers the entire internet to AI agents and builders. Clean, structured,
and ready to reason with.\",\"referrer\":\"origin-when-cross-origin\",\"twitter:site\":\"@Vercel\",\"ogSiteName\":\"Firecrawl
- The Web Data API for AI\",\"og:image\":\"https://www.firecrawl.dev/og.png\",\"twitter:card\":\"summary_large_image\",\"twitter:creator\":\"@Vercel\",\"twitter:description\":\"The
web crawling, scraping, and search API for AI. Built for scale. Firecrawl
delivers the entire internet to AI agents and builders. Clean, structured,
and ready to reason with.\",\"language\":\"en\",\"keywords\":\"Firecrawl,Markdown,Data,Mendable,Langchain\",\"creator\":\"Firecrawl\",\"ogTitle\":\"Firecrawl
- The Web Data API for AI\",\"og:description\":\"The web crawling, scraping,
and search API for AI. Built for scale. Firecrawl delivers the entire internet
to AI agents and builders. Clean, structured, and ready to reason with.\",\"scrapeId\":\"e78d8060-d581-4e5e-b25a-90cfdad48530\",\"sourceURL\":\"https://firecrawl.dev\",\"url\":\"https://www.firecrawl.dev/\",\"statusCode\":200,\"contentType\":\"text/html;
charset=utf-8\",\"proxyUsed\":\"basic\",\"cacheState\":\"hit\",\"cachedAt\":\"2025-10-29T13:09:07.713Z\",\"creditsUsed\":1}}}"
headers:
Access-Control-Allow-Origin:
- '*'
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Length:
- '24693'
Content-Type:
- application/json; charset=utf-8
Date:
- Wed, 29 Oct 2025 14:34:03 GMT
ETag:
- W/"6075-Q1W6uMv95JKEZARbtaiPYYMojlU"
Via:
- 1.1 google
X-Powered-By:
- Express
X-Response-Time:
- 4719.998ms
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,937 @@
interactions:
- request:
body: '{"query": "firecrawl", "limit": 5, "scrapeOptions": {"includeTags": [],
"excludeTags": [], "onlyMainContent": true, "waitFor": 0, "skipTlsVerification":
true, "removeBase64Images": true, "fastMode": false, "blockAds": true, "storeInCache":
true, "maxAge": 14400000, "formats": ["markdown"], "mobile": false}, "origin":
"python-sdk@4.5.0"}'
headers:
Accept:
- '*/*'
Accept-Encoding:
- gzip, deflate, zstd
Connection:
- keep-alive
Content-Length:
- '338'
Content-Type:
- application/json
User-Agent:
- python-requests/2.32.5
method: POST
uri: https://api.firecrawl.dev/v2/search
response:
body:
string: "{\"success\":true,\"data\":{\"web\":[{\"url\":\"https://www.firecrawl.dev/\",\"title\":\"Firecrawl
- The Web Data API for AI\",\"description\":\"The web crawling, scraping,
and search API for AI. Built for scale. Firecrawl delivers the entire internet
to AI agents and builders.\",\"position\":1,\"markdown\":\"We just raised
our Series A and shipped Firecrawl /v2 \U0001F389. [Read the blog.](https://www.firecrawl.dev/blog/firecrawl-v2-series-a-announcement)\\n\\n[2
Months Free \u2014 Annually](https://www.firecrawl.dev/pricing)\\n\\n# Turn
websites into LLM-ready data\\n\\nPower your AI apps with clean web data\\n\\nfrom
any website. [It's also open source.](https://github.com/firecrawl/firecrawl)\\n\\nScrape\\n\\nSearch\\nNew\\n\\nMap\\n\\nCrawl\\n\\nScrape\\n\\nLogo\\n\\nNavigation\\n\\nButton\\n\\nH1
Title\\n\\nDescription\\n\\nCTA Button\\n\\n\\\\[ .JSON \\\\]\\n\\n```json\\n1[\\\\\\n2
\ {\\\\\\n3 \\\"url\\\": \\\"https://example.com\\\",\\\\\\n4 \\\"markdown\\\":
\\\"# Getting Started...\\\",\\\\\\n5 \\\"json\\\": { \\\"title\\\": \\\"Guide\\\",
\\\"docs\\\": \\\"...\\\" },\\\\\\n6 \\\"screenshot\\\": \\\"https://example.com/hero.png\\\"\\\\\\n7
\ }\\\\\\n8]\\n```\\n\\nScrape Completed\\n\\nTrusted by5000+\\n\\ncompaniesof
all sizes\\n\\n![Logo 17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
1](https://www.firecrawl.dev/assets-original/logocloud/1.png)\\n\\n![Logo
2](https://www.firecrawl.dev/assets-original/logocloud/2.png)\\n\\n![Logo
3](https://www.firecrawl.dev/assets-original/logocloud/3.png)\\n\\n![Logo
4](https://www.firecrawl.dev/assets-original/logocloud/4.png)\\n\\n![Logo
5](https://www.firecrawl.dev/assets-original/logocloud/5.png)\\n\\n![Logo
6](https://www.firecrawl.dev/assets-original/logocloud/6.png)\\n\\n![Logo
7](https://www.firecrawl.dev/assets-original/logocloud/7.png)\\n\\n![Logo
8](https://www.firecrawl.dev/assets-original/logocloud/8.png)\\n\\n![Logo
9](https://www.firecrawl.dev/assets-original/logocloud/9.png)\\n\\n![Logo
10](https://www.firecrawl.dev/assets-original/logocloud/10.png)\\n\\n![Logo
11](https://www.firecrawl.dev/assets-original/logocloud/11.png)\\n\\n![Logo
12](https://www.firecrawl.dev/assets-original/logocloud/12.png)\\n\\n![Logo
13](https://www.firecrawl.dev/assets-original/logocloud/13.png)\\n\\n![Logo
14](https://www.firecrawl.dev/assets-original/logocloud/14.png)\\n\\n![Logo
15](https://www.firecrawl.dev/assets-original/logocloud/15.png)\\n\\n![Logo
16](https://www.firecrawl.dev/assets-original/logocloud/16.png)\\n\\n![Logo
17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
19](https://www.firecrawl.dev/assets-original/logocloud/19.png)\\n\\n![Logo
20](https://www.firecrawl.dev/assets-original/logocloud/20.png)\\n\\n![Logo
21](https://www.firecrawl.dev/assets-original/logocloud/21.png)\\n\\n![Logo
17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
1](https://www.firecrawl.dev/assets-original/logocloud/1.png)\\n\\n![Logo
2](https://www.firecrawl.dev/assets-original/logocloud/2.png)\\n\\n![Logo
3](https://www.firecrawl.dev/assets-original/logocloud/3.png)\\n\\n![Logo
4](https://www.firecrawl.dev/assets-original/logocloud/4.png)\\n\\n![Logo
5](https://www.firecrawl.dev/assets-original/logocloud/5.png)\\n\\n![Logo
6](https://www.firecrawl.dev/assets-original/logocloud/6.png)\\n\\n![Logo
7](https://www.firecrawl.dev/assets-original/logocloud/7.png)\\n\\n![Logo
8](https://www.firecrawl.dev/assets-original/logocloud/8.png)\\n\\n![Logo
9](https://www.firecrawl.dev/assets-original/logocloud/9.png)\\n\\n![Logo
10](https://www.firecrawl.dev/assets-original/logocloud/10.png)\\n\\n![Logo
11](https://www.firecrawl.dev/assets-original/logocloud/11.png)\\n\\n![Logo
12](https://www.firecrawl.dev/assets-original/logocloud/12.png)\\n\\n![Logo
13](https://www.firecrawl.dev/assets-original/logocloud/13.png)\\n\\n![Logo
14](https://www.firecrawl.dev/assets-original/logocloud/14.png)\\n\\n![Logo
15](https://www.firecrawl.dev/assets-original/logocloud/15.png)\\n\\n![Logo
16](https://www.firecrawl.dev/assets-original/logocloud/16.png)\\n\\n![Logo
17](https://www.firecrawl.dev/assets-original/logocloud/17.png)\\n\\n![Logo
18](https://www.firecrawl.dev/assets-original/logocloud/18.png)\\n\\n![Logo
19](https://www.firecrawl.dev/assets-original/logocloud/19.png)\\n\\n![Logo
20](https://www.firecrawl.dev/assets-original/logocloud/20.png)\\n\\n![Logo
21](https://www.firecrawl.dev/assets-original/logocloud/21.png)\\n\\n\\\\[01/
07 \\\\]\\n\\n\xB7\\n\\nMain Features\\n\\n//\\n\\nDeveloper First\\n\\n//\\n\\n##
Startscraping today\\n\\nEnhance your apps with industry leading web scraping
and crawling capabilities.\\n\\nScrape\\n\\nGet llm-ready data from websites.
Markdown, JSON, screenshot, etc.\\n\\nSearch\\n\\nNew\\n\\nSearch the web
and get full content from results.\\n\\nCrawl\\n\\nCrawl all the pages on
a website and get data for each page.\\n\\nPython\\n\\nNode.js\\n\\nCurl\\n\\nCopy
code\\n\\n```python\\n1# pip install firecrawl-py\\n2from firecrawl import
Firecrawl\\n3\\n4app = Firecrawl(api_key=\\\"fc-YOUR_API_KEY\\\")\\n5\\n6#
Scrape a website:\\n7app.scrape('firecrawl.dev')\\n8\\n9\\n10\\n```\\n\\n\\\\[
.MD \\\\]\\n\\n```markdown\\n1# Firecrawl\\n2\\n3Firecrawl is a powerful web
scraping\\n4library that makes it easy to extract\\n5data from websites.\\n6\\n7##
Installation\\n8\\n9To install Firecrawl, run:\\n10\\n11\\n```\\n\\n![developer-1](https://www.firecrawl.dev/assets/developer/1.png)\\n\\n![developer-2](https://www.firecrawl.dev/assets/developer/2.png)\\n\\n![developer-3](https://www.firecrawl.dev/assets/developer/3.png)\\n\\n![developer-4](https://www.firecrawl.dev/assets/developer/4.png)\\n\\n![developer-5](https://www.firecrawl.dev/assets/developer/5.png)\\n\\n![developer-6](https://www.firecrawl.dev/assets/developer/6.png)\\n\\n![developer-7](https://www.firecrawl.dev/assets/developer/7.png)\\n\\n![developer-8](https://www.firecrawl.dev/assets/developer/8.png)\\n\\n![developer-9](https://www.firecrawl.dev/assets/developer/1.png)\\n\\n![developer-10](https://www.firecrawl.dev/assets/developer/2.png)\\n\\n![developer-11](https://www.firecrawl.dev/assets/developer/3.png)\\n\\n![developer-12](https://www.firecrawl.dev/assets/developer/4.png)\\n\\n![developer-13](https://www.firecrawl.dev/assets/developer/5.png)\\n\\n![developer-14](https://www.firecrawl.dev/assets/developer/6.png)\\n\\n![developer-15](https://www.firecrawl.dev/assets/developer/7.png)\\n\\n![developer-16](https://www.firecrawl.dev/assets/developer/8.png)\\n\\n![developer-17](https://www.firecrawl.dev/assets/developer/1.png)\\n\\n![developer-18](https://www.firecrawl.dev/assets/developer/2.png)\\n\\n![developer-19](https://www.firecrawl.dev/assets/developer/3.png)\\n\\n![developer-20](https://www.firecrawl.dev/assets/developer/4.png)\\n\\n![developer-21](https://www.firecrawl.dev/assets/developer/5.png)\\n\\n![developer-22](https://www.firecrawl.dev/assets/developer/6.png)\\n\\n![developer-23](https://www.firecrawl.dev/assets/developer/7.png)\\n\\n![developer-24](https://www.firecrawl.dev/assets/developer/8.png)\\n\\nIntegrations\\n\\n###
Use well-known tools\\n\\nAlready fully integrated with the greatest existing
tools and workflows.\\n\\n[See all integrations](https://www.firecrawl.dev/app)\\n\\n![Firecrawl
icon (blueprint)](https://www.firecrawl.dev/assets-original/developer-os-icon.png)\\n\\nmendableai/firecrawl\\n\\nPublic\\n\\nStar\\n\\n65.3K\\n\\n\\\\[python-SDK\\\\]
improvs/async\\n\\n#1337\\n\\n\xB7\\n\\nApr 18, 2025\\n\\n\xB7\\n\\n![rafaelsideguide](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F150964962%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nrafaelsideguide\\n\\nfeat(extract):
cost limit\\n\\n#1473\\n\\n\xB7\\n\\nApr 17, 2025\\n\\n\xB7\\n\\n![mogery](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F66118807%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nmogery\\n\\nfeat(scrape):
get job result from GCS, avoid Redis\\n\\n#1461\\n\\n\xB7\\n\\nApr 15, 2025\\n\\n\xB7\\n\\n![mogery](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F66118807%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nmogery\\n\\nExtract
v2/rerank improvs\\n\\n#1437\\n\\n\xB7\\n\\nApr 11, 2025\\n\\n\xB7\\n\\n![rafaelsideguide](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F150964962%3Fv%3D4&w=48&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nrafaelsideguide\\n\\n![https://avatars.githubusercontent.com/u/150964962?v=4](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F150964962%3Fv%3D4&w=96&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\n![https://avatars.githubusercontent.com/u/66118807?v=4](https://www.firecrawl.dev/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F66118807%3Fv%3D4&w=96&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\n+90\\n\\nOpen
Source\\n\\n### Code you can trust\\n\\nDeveloped transparently and collaboratively.
Join our community of contributors.\\n\\n[Check out our repo](https://github.com/firecrawl/firecrawl)\\n\\n\\\\[02/
07 \\\\]\\n\\n\xB7\\n\\nCore\\n\\n//\\n\\nBuilt to outperform\\n\\n//\\n\\n##
Core principles, provenperformance\\n\\nBuilt from the ground up to outperform
traditional scrapers.\\n\\nNo proxy headaches\\n\\nReliable.Covers 96% of
the web,\\n\\nincluding JS-heavy and protected pages. No proxies, no puppets,
just clean data.\\n\\nFirecrawl\\n\\n96%\\n\\n![Puppeteer icon](https://www.firecrawl.dev/assets/puppeteer.png)\\n\\nPuppeteer\\n\\n79%\\n\\ncURL\\n\\n75%\\n\\nSpeed
that feels invisible\\n\\nBlazingly fast.Delivers results in less than 1 second,
fast for real-time agents\\n\\nand dynamic apps.\\n\\nURL\\n\\nCrawl\\n\\nScrape\\n\\nfirecrawl.dev/docs\\n\\n50ms\\n\\n51ms\\n\\nfirecrawl.dev/templates\\n\\n52ms\\n\\n50ms\\n\\nfirecrawl.dev/changelog\\n\\n49ms\\n\\n52ms\\n\\nfirecrawl.dev/about\\n\\n52ms\\n\\n50ms\\n\\nfirecrawl.dev/changelog\\n\\n50ms\\n\\n52ms\\n\\nfirecrawl.dev/playground\\n\\n51ms\\n\\n49ms\\n\\n\\\\[
CTA \\\\]\\n\\n\\\\[ CRAWL \\\\]\\n\\n\\\\[ SCRAPE \\\\]\\n\\n\\\\[ CTA \\\\]\\n\\n//\\n\\nGet
started\\n\\n//\\n\\nReady to build?\\n\\nStart getting Web Data for free
and scale seamlessly as your project expands. No credit card needed.\\n\\n[Start
for free](https://www.firecrawl.dev/signin) [See our plans](https://www.firecrawl.dev/pricing)\\n\\n\\\\[03/
07 \\\\]\\n\\n\xB7\\n\\nFeatures\\n\\n//\\n\\nZero configuration\\n\\n//\\n\\n##
We handle the hard stuff\\n\\nRotating proxies, orchestration, rate limits,
js-blocked content and more.\\n\\nDocs to data\\n\\nMedia parsing.Firecrawl
can parse and output content from web hosted pdfs, docx, and more.\\n\\nhttps://example.com/docs/report.pdf\\n\\nhttps://example.com/files/brief.docx\\n\\nhttps://example.com/docs/guide.html\\n\\ndocx\\n\\nParsing...\\n\\nKnows
the moment\\n\\nSmart wait.Firecrawl intelligently waits for content to load,
making scraping faster and more reliable.\\n\\nhttps://example-spa.com\\n\\nRequest
Sent\\n\\nScrapes the real thing\\n\\nCached, when you need it.Selective caching,
you choose your caching patterns, growing web index.\\n\\n![User](https://www.firecrawl.dev/_next/image?url=%2Fassets-original%2Ffeatures%2Fcached-user.png&w=256&q=75&dpl=dpl_7RqvseQXNVYetFdhTKj6RntohhL1)\\n\\nUser\\n\\nFirecrawl\\n\\nCache\\n\\nInvisible
access\\n\\nStealth mode.Crawls the web without\\n\\nbeing blocked, mimics
real users to access protected or dynamic content.\\n\\nInteractive scraping\\n\\nActions.Click,
scroll, write, wait, press and more before extracting content.\\n\\nhttps://example.com\\n\\nNavigate\\n\\nClick\\n\\nType\\n\\nWait\\n\\nScroll\\n\\nPress\\n\\nScreenshot\\n\\nScrape\\n\\n\\\\[04/
07 \\\\]\\n\\n\xB7\\n\\nPricing\\n\\n//\\n\\nTransparent\\n\\n//\\n\\n## Flexible
pricing\\n\\nExplore transparent pricing built for real-world scraping. Start
for free, then scale as you grow.\\n\\n\U0001F1FA\U0001F1F8USD\\n\\nFree Plan\\n\\nA
lightweight way to try scraping.\\n\\nNo cost, no card, no hassle.\\n\\n500
credits\\n\\n$0123456789\\n\\none-time\\n\\nGet started\\n\\nScrape 500 pages\\n\\n2
concurrent requests\\n\\nLow rate limits\\n\\nHobby\\n\\nGreat for side projects
and small tools.\\n\\nFast, simple, no overkill.\\n\\n3,000 credits\\n\\n$01234567890123456789\\n\\n/monthly\\n\\nBilled
yearly\\n\\n2 months free\\n\\nSubscribe\\n\\nScrape 3,000 pages\\n\\n5 concurrent
requests\\n\\nBasic support\\n\\n$9 per extra 1k credits\\n\\nStandard\\n\\nMost
popular\\n\\nPerfect for scaling with less effort.\\n\\nSimple, solid, dependable.\\n\\n100,000
credits\\n\\n$01234567890123456789\\n\\n/monthly\\n\\nBilled yearly\\n\\n2
months free\\n\\nSubscribe\\n\\nScrape 100,000 pages\\n\\n50 concurrent requests\\n\\nStandard
support\\n\\n$47 per extra 35k credits\\n\\nGrowth\\n\\nBuilt for high volume
and speed.\\n\\nFirecrawl at full force.\\n\\n500,000 credits\\n\\n$012345678901234567890123456789\\n\\n/monthly\\n\\nBilled
yearly\\n\\n2 months free\\n\\nSubscribe\\n\\nScrape 500,000 pages\\n\\n100
concurrent requests\\n\\nPriority support\\n\\n$177 per extra 175k credits\\n\\nExtra
credits are available via auto-recharge packs. [Enable](https://www.firecrawl.dev/signin/signup)\\n\\nEnterprise\\n\\nPower
at your pace\\n\\nUnlimited credits. Custom RPMs.\\n\\n[Contact sales](https://fk4bvu0n5qp.typeform.com/to/Ej6oydlg)
[More details](https://www.firecrawl.dev/enterprise)\\n\\nBulk discounts\\n\\nTop
priority support\\n\\nCustom concurrency limits\\n\\nImproved stealth proxies\\n\\nSLAs\\n\\nAdvanced
security & controls\\n\\n\\\\[05/ 07 \\\\]\\n\\n\xB7\\n\\nTestimonials\\n\\n//\\n\\nCommunity\\n\\n//\\n\\n##
People love building withFirecrawl\\n\\nDiscover why developers choose
Firecrawl every day.\\n\\n[![Morgan Linton](https://www.firecrawl.dev/assets/testimonials/morgan-linton.png)Morgan
Linton@morganlinton\\\"If you're coding with AI, and haven't discovered @firecrawl\\\\_dev
yet, prepare to have your mind blown \U0001F92F\\\"](https://x.com/morganlinton/status/1839454165703204955)
[![Chris DeWeese](https://www.firecrawl.dev/assets/testimonials/chris-deweese.png)Chris
DeWeese@chrisdeweese\\\\_\\\"Started using @firecrawl\\\\_dev for a project,
I wish I used this sooner.\\\"](https://x.com/chrisdeweese_/status/1853587120406876601)
[![Alex Reibman](https://www.firecrawl.dev/assets/testimonials/alex-reibman.png)Alex
Reibman@AlexReibman\\\"Moved our internal agent's web scraping tool from Apify
to Firecrawl because it benchmarked 50x faster with AgentOps.\\\"](https://x.com/AlexReibman/status/1780299595484131836)
[![Tom - Morpho](https://www.firecrawl.dev/assets/testimonials/tom-morpho.png)Tom
- Morpho@TomReppelin\\\"I found gold today. Thank you @firecrawl\\\\_dev\\\"](https://x.com/TomReppelin/status/1844382491014201613)\\n\\n[![Bardia](https://www.firecrawl.dev/assets/testimonials/bardia.png)Bardia@thepericulum\\\"The
Firecrawl team ships. I wanted types for their node SDK, and less than an
hour later, I got them.\\\"](https://x.com/thepericulum/status/1781397799487078874)
[![Matt Busigin](https://www.firecrawl.dev/assets/testimonials/matt-busigin.png)Matt
Busigin@mbusigin\\\"Firecrawl is dope. Congrats guys \U0001F44F\\\"](https://x.com/mbusigin/status/1836065372010656069)
[![Sumanth](https://www.firecrawl.dev/assets/testimonials/sumanth.png)Sumanth@Sumanth\\\\_077\\\"Web
scraping will never be the same!\\\\\\\\\\n\\\\\\\\\\nFirecrawl is an open-source
framework that takes a URL, crawls it, and conver...\\\"](https://x.com/Sumanth_077/status/1940049003074478511)
[![Steven Tey](https://www.firecrawl.dev/assets/testimonials/steven-tey.png)Steven
Tey@steventey\\\"Open-source Clay alternative just dropped\\\\\\\\\\n\\\\\\\\\\nUpload
a CSV of emails and...\\\"](https://x.com/steventey/status/1932945651761098889)\\n\\n\\\\[06/
07 \\\\]\\n\\n\xB7\\n\\nUse Cases\\n\\n//\\n\\nUse cases\\n\\n//\\n\\n## Transform
\ web data into AI-powered solutions\\n\\nDiscover how Firecrawl customers
are getting the most out of our API.\\n\\n[View all use cases](https://docs.firecrawl.dev/use-cases/overview)\\n\\nChat
with context\\n\\nSmarter AI chats\\n\\nPower your AI assistants with real-time,
accurate web content.\\n\\n[View docs](https://docs.firecrawl.dev/introduction)\\n\\n![AI
Assistant](https://www.firecrawl.dev/assets/ai/bot.png)\\n\\nAI Assistant\\n\\nwith Firecrawl\\n\\nReal-time\xB7Updated
2 min ago\\n\\nAsk anything...\\n\\nKnow your leads\\n\\nLead enrichment\\n\\nEnhance
your sales data with\\n\\nweb information.\\n\\n[Check out Extract](https://www.firecrawl.dev/extract)\\n\\nExtracting
leads from directory...\\n\\nTech startups\\n\\nWith contact info\\n\\nDecision
makers\\n\\nFunding stage\\n\\nReady to engage\\n\\n![Emily Tran](https://www.firecrawl.dev/assets/ai/leads-1.png)\\n\\n![James
Carter](https://www.firecrawl.dev/assets/ai/leads-2.png)\\n\\n![Sophia Kim](https://www.firecrawl.dev/assets/ai/leads-3.png)\\n\\n![Michael
Rivera](https://www.firecrawl.dev/assets/ai/leads-4.png)\\n\\nKnow your leads\\n\\nMCPs\\n\\nAdd
powerful scraping to your\\n\\ncode editors.\\n\\n[Get started](https://docs.firecrawl.dev/mcp-server)\\n\\n![Claude
Code](https://www.firecrawl.dev/assets/ai/mcps-claude.png)\\n\\nClaude Code\\n\\n![Cursor](https://www.firecrawl.dev/assets/ai/mcps-cursor.png)\\n\\nCursor\\n\\n![Windsurf](https://www.firecrawl.dev/assets/ai/mcps-windsurf.png)\\n\\nWindsurf\\n\\n\u273B\\n\\nWelcome
to Claude Code!\\n\\n/help for help, /status for your current setup\\n\\n>Try
\\\"how do I log an error?\\\"\\n\\nBuild with context\\n\\nAI platforms\\n\\nLet
your customers build AI apps\\n\\nwith web data.\\n\\n[Check out Map](https://docs.firecrawl.dev/features/map)\\n\\n![Logo
1](https://www.firecrawl.dev/assets/ai/platforms-1.png)\\n\\n![Logo 2](https://www.firecrawl.dev/assets/ai/platforms-2.png)\\n\\n![Logo
4](https://www.firecrawl.dev/assets/ai/platforms-4.png)\\n\\n![Logo 3](https://www.firecrawl.dev/assets/ai/platforms-3.png)\\n\\nExtracting
text...\\n\\nNo insight missed\\n\\nDeep research\\n\\nExtract comprehensive
information for\\n\\nin-depth research.\\n\\n[Build your own with Search](https://docs.firecrawl.dev/features/search)\\n\\nDeep
research in progress...\\n\\nAcademic papers\\n\\n0 found\\n\\nNews articles\\n\\n0
found\\n\\nExpert opinions\\n\\n0 found\\n\\nResearch reports\\n\\n0 found\\n\\nIndustry
data\\n\\n0 found\\n\\nAsk anything...\\n\\n\\\\[ CTA \\\\]\\n\\n\\\\[ CRAWL
\\\\]\\n\\n\\\\[ SCRAPE \\\\]\\n\\n\\\\[ CTA \\\\]\\n\\n//\\n\\nGet started\\n\\n//\\n\\nReady
to build?\\n\\nStart getting Web Data for free and scale seamlessly as your
project expands. No credit card needed.\\n\\n[Start for free](https://www.firecrawl.dev/signin)
[See our plans](https://www.firecrawl.dev/pricing)\\n\\n\\\\[07/ 07 \\\\]\\n\\n\xB7\\n\\nFAQ\\n\\n//\\n\\nFAQ\\n\\n//\\n\\n##
Frequently asked questions\\n\\nEverything you need to know about Firecrawl.\\n\\nGeneral\\n\\nWhat
is Firecrawl?\\n\\nWhat sites work?\\n\\nWho can benefit from using Firecrawl?\\n\\nIs
Firecrawl open-source?\\n\\nWhat is the difference between Firecrawl and other
web scrapers?\\n\\nWhat is the difference between the open-source version
and the hosted version?\\n\\nScraping & Crawling\\n\\nHow does Firecrawl handle
dynamic content on websites?\\n\\nWhy is it not crawling all the pages?\\n\\nCan
Firecrawl crawl websites without a sitemap?\\n\\nWhat formats can Firecrawl
convert web data into?\\n\\nHow does Firecrawl ensure the cleanliness of the
data?\\n\\nIs Firecrawl suitable for large-scale data scraping projects?\\n\\nDoes
it respect robots.txt?\\n\\nWhat measures does Firecrawl take to handle web
scraping challenges like rate limits and caching?\\n\\nDoes Firecrawl handle
captcha or authentication?\\n\\nAPI Related\\n\\nWhere can I find my API key?\\n\\nBilling\\n\\nIs
Firecrawl free?\\n\\nIs there a pay-per-use plan instead of monthly?\\n\\nDo
credits roll over to the next month?\\n\\nHow many credits do scraping and
crawling cost?\\n\\nDo you charge for failed requests?\\n\\nWhat payment methods
do you accept?\\n\\nFOOTER\\n\\nThe easiest way to extract\\n\\ndata from
the web\\n\\nBacked by\\n\\nY Combinator\\n\\n[Linkedin](https://www.linkedin.com/company/firecrawl)
[Github](https://github.com/firecrawl/firecrawl)\\n\\nSOC II \xB7 Type 2\\n\\nAICPA\\n\\nSOC
2\\n\\n[X (Twitter)](https://x.com/firecrawl_dev) [Discord](https://discord.gg/gSmWdAkdwd)\\n\\nProducts\\n\\n[Playground](https://www.firecrawl.dev/playground)
[Extract](https://www.firecrawl.dev/extract) [Pricing](https://www.firecrawl.dev/pricing)
[Templates](https://www.firecrawl.dev/templates) [Changelog](https://www.firecrawl.dev/changelog)\\n\\nUse
Cases\\n\\n[AI Platforms](https://docs.firecrawl.dev/use-cases/ai-platforms)
[Lead Enrichment](https://docs.firecrawl.dev/use-cases/lead-enrichment) [SEO
Platforms](https://docs.firecrawl.dev/use-cases/seo-platforms) [Deep Research](https://docs.firecrawl.dev/use-cases/deep-research)\\n\\nDocumentation\\n\\n[Getting
started](https://docs.firecrawl.dev/introduction) [API Reference](https://docs.firecrawl.dev/api-reference/introduction)
[Integrations](https://www.firecrawl.dev/app) [Examples](https://docs.firecrawl.dev/use-cases/overview)
[SDKs](https://docs.firecrawl.dev/sdks/overview)\\n\\nCompany\\n\\n[Blog](https://www.firecrawl.dev/blog)
[Careers](https://www.firecrawl.dev/careers) [Creator & OSS program](https://www.firecrawl.dev/creator-oss-program)
[Student program](https://www.firecrawl.dev/student-program)\\n\\n\xA9 2025
Firecrawl\\n\\n[Terms of Service](https://www.firecrawl.dev/terms-of-service)
[Privacy Policy](https://www.firecrawl.dev/privacy-policy) [Report Abuse](mailto:help@firecrawl.com?subject=Issue:)\\n\\n[All
systems normal](https://status.firecrawl.dev/)\\n\\nStripeM-Inner\",\"metadata\":{\"favicon\":\"https://www.firecrawl.dev/favicon.png\",\"ogUrl\":\"https://www.firecrawl.dev\",\"ogImage\":\"https://www.firecrawl.dev/og.png\",\"referrer\":\"origin-when-cross-origin\",\"ogDescription\":\"The
web crawling, scraping, and search API for AI. Built for scale. Firecrawl
delivers the entire internet to AI agents and builders. Clean, structured,
and ready to reason with.\",\"robots\":\"follow, index\",\"twitter:card\":\"summary_large_image\",\"og:site_name\":\"Firecrawl
- The Web Data API for AI\",\"twitter:title\":\"Firecrawl - The Web Data API
for AI\",\"og:image\":\"https://www.firecrawl.dev/og.png\",\"title\":\"Firecrawl
- The Web Data API for AI\",\"og:description\":\"The web crawling, scraping,
and search API for AI. Built for scale. Firecrawl delivers the entire internet
to AI agents and builders. Clean, structured, and ready to reason with.\",\"twitter:image\":\"https://www.firecrawl.dev/og.png\",\"viewport\":\"width=device-width,
initial-scale=1, maximum-scale=1, user-scalable=no\",\"ogSiteName\":\"Firecrawl
- The Web Data API for AI\",\"keywords\":\"Firecrawl,Markdown,Data,Mendable,Langchain\",\"author\":\"Firecrawl\",\"og:title\":\"Firecrawl
- The Web Data API for AI\",\"twitter:description\":\"The web crawling, scraping,
and search API for AI. Built for scale. Firecrawl delivers the entire internet
to AI agents and builders. Clean, structured, and ready to reason with.\",\"description\":\"The
web crawling, scraping, and search API for AI. Built for scale. Firecrawl
delivers the entire internet to AI agents and builders. Clean, structured,
and ready to reason with.\",\"twitter:site\":\"@Vercel\",\"og:url\":\"https://www.firecrawl.dev\",\"og:type\":\"website\",\"ogTitle\":\"Firecrawl
- The Web Data API for AI\",\"language\":\"en\",\"creator\":\"Firecrawl\",\"publisher\":\"Firecrawl\",\"twitter:creator\":\"@Vercel\",\"scrapeId\":\"57b0586f-36e8-4923-aaa2-88ff58c03999\",\"sourceURL\":\"https://www.firecrawl.dev/\",\"url\":\"https://www.firecrawl.dev/\",\"statusCode\":200,\"contentType\":\"text/html;
charset=utf-8\",\"proxyUsed\":\"basic\",\"cacheState\":\"hit\",\"cachedAt\":\"2025-10-29T13:09:07.713Z\"}},{\"url\":\"https://github.com/firecrawl/firecrawl\",\"title\":\"firecrawl/firecrawl:
The Web Data API for AI - Turn entire ... - GitHub\",\"description\":\"Firecrawl
is an API service that takes a URL, crawls it, and converts it into clean
markdown or structured data. We crawl all accessible subpages and give you
...\",\"position\":2,\"category\":\"github\",\"markdown\":\"[Skip to content](https://github.com/firecrawl/firecrawl#start-of-content)\\n\\nYou
signed in with another tab or window. [Reload](https://github.com/firecrawl/firecrawl)
to refresh your session.You signed out in another tab or window. [Reload](https://github.com/firecrawl/firecrawl)
to refresh your session.You switched accounts on another tab or window. [Reload](https://github.com/firecrawl/firecrawl)
to refresh your session.Dismiss alert\\n\\n{{ message }}\\n\\n[firecrawl](https://github.com/firecrawl)/
**[firecrawl](https://github.com/firecrawl/firecrawl)** Public\\n\\n- Couldn't
load subscription status.\\nRetry\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n### Uh
oh!\\n\\n\\n\\n\\n\\n\\n\\nThere was an error while loading. [Please reload
this page](https://github.com/firecrawl/firecrawl).\\n\\n- [Fork\\\\\\\\\\n5.1k](https://github.com/login?return_to=%2Ffirecrawl%2Ffirecrawl)\\n-
[Star\\\\\\\\\\n65.2k](https://github.com/login?return_to=%2Ffirecrawl%2Ffirecrawl)\\n\\n\\n\U0001F525
The Web Data API for AI - Turn entire websites into LLM-ready markdown or
structured data\\n\\n\\n[firecrawl.dev](https://firecrawl.dev/ \\\"https://firecrawl.dev\\\")\\n\\n###
License\\n\\n[AGPL-3.0 license](https://github.com/firecrawl/firecrawl/blob/main/LICENSE)\\n\\n[65.2k\\\\\\\\\\nstars](https://github.com/firecrawl/firecrawl/stargazers)
[5.1k\\\\\\\\\\nforks](https://github.com/firecrawl/firecrawl/forks) [Branches](https://github.com/firecrawl/firecrawl/branches)
[Tags](https://github.com/firecrawl/firecrawl/tags) [Activity](https://github.com/firecrawl/firecrawl/activity)\\n\\n[Star](https://github.com/login?return_to=%2Ffirecrawl%2Ffirecrawl)\\n\\n#
firecrawl/firecrawl\\n\\nmain\\n\\n[**887** Branches](https://github.com/firecrawl/firecrawl/branches)
[**28** Tags](https://github.com/firecrawl/firecrawl/tags)\\n\\n## Folders and files\\n\\n|
Name | Name | Last commit message | Last commit date |\\n| --- | --- | ---
| --- |\\n| ## Latest commit<br>[![amplitudesxd](https://avatars.githubusercontent.com/u/62763456?v=4&size=40)](https://github.com/amplitudesxd)[amplitudesxd](https://github.com/firecrawl/firecrawl/commits?author=amplitudesxd)<br>[chore:
update last scrape rpc (](https://github.com/firecrawl/firecrawl/commit/37de2877fab4bae2de297e37bad3c9bcd49a64bc)
[#2339](https://github.com/firecrawl/firecrawl/pull/2339) [)](https://github.com/firecrawl/firecrawl/commit/37de2877fab4bae2de297e37bad3c9bcd49a64bc)<br>success<br>20
hours agoOct 27, 2025<br>[37de287](https://github.com/firecrawl/firecrawl/commit/37de2877fab4bae2de297e37bad3c9bcd49a64bc)\_\xB7\_20
hours agoOct 27, 2025<br>## History<br>[4,487 Commits](https://github.com/firecrawl/firecrawl/commits/main/)
<br>Open commit details<br>[View commit history for this file.](https://github.com/firecrawl/firecrawl/commits/main/)
|\\n| [.github](https://github.com/firecrawl/firecrawl/tree/main/.github \\\".github\\\")
| [.github](https://github.com/firecrawl/firecrawl/tree/main/.github \\\".github\\\")
| [fix(ci): temp disabled prod env tests](https://github.com/firecrawl/firecrawl/commit/42fc149c1ab738da0e15e772817774aa35273f8e
\\\"fix(ci): temp disabled prod env tests\\\") | 5 days agoOct 23, 2025 |\\n|
[apps](https://github.com/firecrawl/firecrawl/tree/main/apps \\\"apps\\\")
| [apps](https://github.com/firecrawl/firecrawl/tree/main/apps \\\"apps\\\")
| [chore: update last scrape rpc (](https://github.com/firecrawl/firecrawl/commit/37de2877fab4bae2de297e37bad3c9bcd49a64bc
\\\"chore: update last scrape rpc (#2339)\\\") [#2339](https://github.com/firecrawl/firecrawl/pull/2339)
[)](https://github.com/firecrawl/firecrawl/commit/37de2877fab4bae2de297e37bad3c9bcd49a64bc
\\\"chore: update last scrape rpc (#2339)\\\") | 20 hours agoOct 27, 2025
|\\n| [examples](https://github.com/firecrawl/firecrawl/tree/main/examples
\\\"examples\\\") | [examples](https://github.com/firecrawl/firecrawl/tree/main/examples
\\\"examples\\\") | [Merge pull request](https://github.com/firecrawl/firecrawl/commit/7ad57003b4ad8b230ba8252129e52bafa62dfae9
\\\"Merge pull request #2172 from MAVRICK-1/firecrawl-gemini-screenshot-editor
\ feat: Add Firecrawl + Gemini 2.5 Flash Image CLI Editor\\\") [#2172](https://github.com/firecrawl/firecrawl/pull/2172)
[from MAVRICK-1/firecrawl-gemini-screenshot-e\u2026](https://github.com/firecrawl/firecrawl/commit/7ad57003b4ad8b230ba8252129e52bafa62dfae9
\\\"Merge pull request #2172 from MAVRICK-1/firecrawl-gemini-screenshot-editor
\ feat: Add Firecrawl + Gemini 2.5 Flash Image CLI Editor\\\") | last monthSep
23, 2025 |\\n| [img](https://github.com/firecrawl/firecrawl/tree/main/img
\\\"img\\\") | [img](https://github.com/firecrawl/firecrawl/tree/main/img
\\\"img\\\") | [updated readme](https://github.com/firecrawl/firecrawl/commit/4f904e774831dc598681d3e998d0e5e15abcec27
\\\"updated readme\\\") | 2 months agoAug 18, 2025 |\\n| [.gitattributes](https://github.com/firecrawl/firecrawl/blob/main/.gitattributes
\\\".gitattributes\\\") | [.gitattributes](https://github.com/firecrawl/firecrawl/blob/main/.gitattributes
\\\".gitattributes\\\") | [Initial commit](https://github.com/firecrawl/firecrawl/commit/a6c2a878119321a196f720cce4195e086f1c6b46
\\\"Initial commit\\\") | last yearApr 15, 2024 |\\n| [.gitignore](https://github.com/firecrawl/firecrawl/blob/main/.gitignore
\\\".gitignore\\\") | [.gitignore](https://github.com/firecrawl/firecrawl/blob/main/.gitignore
\\\".gitignore\\\") | [Nick: init](https://github.com/firecrawl/firecrawl/commit/ab3fa4838458c8303a67dd30fdd75a16b89cc20b
\\\"Nick: init\\\") | 3 weeks agoOct 10, 2025 |\\n| [.gitmodules](https://github.com/firecrawl/firecrawl/blob/main/.gitmodules
\\\".gitmodules\\\") | [.gitmodules](https://github.com/firecrawl/firecrawl/blob/main/.gitmodules
\\\".gitmodules\\\") | [mendableai -> firecrawl](https://github.com/firecrawl/firecrawl/commit/2f3bc4e7a7b1a67a29c06df629f79402ee1aad1b
\\\"mendableai -> firecrawl\\\") | 2 months agoAug 18, 2025 |\\n| [CLAUDE.md](https://github.com/firecrawl/firecrawl/blob/main/CLAUDE.md
\\\"CLAUDE.md\\\") | [CLAUDE.md](https://github.com/firecrawl/firecrawl/blob/main/CLAUDE.md
\\\"CLAUDE.md\\\") | [add claude file](https://github.com/firecrawl/firecrawl/commit/3f0873c788823258a7d9f55d1c8772aed4e1a8de
\\\"add claude file\\\") | 2 months agoAug 6, 2025 |\\n| [CONTRIBUTING.md](https://github.com/firecrawl/firecrawl/blob/main/CONTRIBUTING.md
\\\"CONTRIBUTING.md\\\") | [CONTRIBUTING.md](https://github.com/firecrawl/firecrawl/blob/main/CONTRIBUTING.md
\\\"CONTRIBUTING.md\\\") | [Add Rust to CONTRIBUTING (](https://github.com/firecrawl/firecrawl/commit/f396cb20b54c3c2d7e64882642c5df6310a01002
\\\"Add Rust to CONTRIBUTING (#2180)\\\") [#2180](https://github.com/firecrawl/firecrawl/pull/2180)
[)](https://github.com/firecrawl/firecrawl/commit/f396cb20b54c3c2d7e64882642c5df6310a01002
\\\"Add Rust to CONTRIBUTING (#2180)\\\") | last monthSep 18, 2025 |\\n| [LICENSE](https://github.com/firecrawl/firecrawl/blob/main/LICENSE
\\\"LICENSE\\\") | [LICENSE](https://github.com/firecrawl/firecrawl/blob/main/LICENSE
\\\"LICENSE\\\") | [Update SDKs to MIT license](https://github.com/firecrawl/firecrawl/commit/afb49e21e7cff595ebad9ce0b7aba13b88f39cf8
\\\"Update SDKs to MIT license\\\") | last yearJul 8, 2024 |\\n| [README.md](https://github.com/firecrawl/firecrawl/blob/main/README.md
\\\"README.md\\\") | [README.md](https://github.com/firecrawl/firecrawl/blob/main/README.md
\\\"README.md\\\") | [Update README.md](https://github.com/firecrawl/firecrawl/commit/a21430e97818d95099bb365be711d9227bd75590
\\\"Update README.md\\\") | 3 weeks agoOct 6, 2025 |\\n| [SELF\\\\_HOST.md](https://github.com/firecrawl/firecrawl/blob/main/SELF_HOST.md
\\\"SELF_HOST.md\\\") | [SELF\\\\_HOST.md](https://github.com/firecrawl/firecrawl/blob/main/SELF_HOST.md
\\\"SELF_HOST.md\\\") | [Allow self-hosted webhook delivery to private IP
addresses (](https://github.com/firecrawl/firecrawl/commit/5756b834884d481382ce1f5674836a56b7fee33d
\\\"Allow self-hosted webhook delivery to private IP addresses (#2232)\\\")
[#2232](https://github.com/firecrawl/firecrawl/pull/2232) [)](https://github.com/firecrawl/firecrawl/commit/5756b834884d481382ce1f5674836a56b7fee33d
\\\"Allow self-hosted webhook delivery to private IP addresses (#2232)\\\")
| 27 days agoOct 1, 2025 |\\n| [docker-compose.yaml](https://github.com/firecrawl/firecrawl/blob/main/docker-compose.yaml
\\\"docker-compose.yaml\\\") | [docker-compose.yaml](https://github.com/firecrawl/firecrawl/blob/main/docker-compose.yaml
\\\"docker-compose.yaml\\\") | [Fix a self-hosted docker-compose.yaml bug
caused by a recent firecraw\u2026](https://github.com/firecrawl/firecrawl/commit/7d4100b274889977fa1ba26344532d9d8747494c
\\\"Fix a self-hosted docker-compose.yaml bug caused by a recent firecrawl
change (#2252) Add EXTRACT_WORKER_PORT to docker-compose environment\\\")
| 3 weeks agoOct 4, 2025 |\\n| View all files |\\n\\n## Repository files navigation\\n\\n###
[![](https://raw.githubusercontent.com/firecrawl/firecrawl/main/img/firecrawl_logo.png)](https://raw.githubusercontent.com/firecrawl/firecrawl/main/img/firecrawl_logo.png)\\n\\n[Permalink:
](https://github.com/firecrawl/firecrawl#----)\\n\\n[![License](https://camo.githubusercontent.com/d8ec6c81115d21c81bc26f2c80f8987a4d2a72e538b88afaa738fad5cd6289ff/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6963656e73652f66697265637261776c2f66697265637261776c)](https://github.com/firecrawl/firecrawl/blob/main/LICENSE)[![Downloads](https://camo.githubusercontent.com/9d76afe428b4085c8b7103f2f4e31da110ee154ad7320bace4348d92ac0c2450/68747470733a2f2f7374617469632e706570792e746563682f62616467652f66697265637261776c2d7079)](https://pepy.tech/project/firecrawl-py)[![GitHub
Contributors](https://camo.githubusercontent.com/a9eabcb95ba00300afa51ce546660540c1f65764492cb2ba8fb67fe541c7e97f/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f636f6e7472696275746f72732f66697265637261776c2f66697265637261776c2e737667)](https://github.com/firecrawl/firecrawl/graphs/contributors)[![Visit
firecrawl.dev](https://camo.githubusercontent.com/3576b8cb0e77344c001cc8456d28c830691cb96480d4b65be90f8a4c99dead56/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f56697369742d66697265637261776c2e6465762d6f72616e6765)](https://firecrawl.dev/)\\n\\n[![Follow
on X](https://camo.githubusercontent.com/610127222e603752676f0275682f12398f8e434706861d577c1f6688d999191c/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f466f6c6c6f772532306f6e253230582d3030303030303f7374796c653d666f722d7468652d6261646765266c6f676f3d78266c6f676f436f6c6f723d7768697465)](https://twitter.com/firecrawl_dev)[![Follow
on LinkedIn](https://camo.githubusercontent.com/8741d51bb8e1c8ae576ac05e875f826bcf80e8711dcf9225935bb78d5bb03802/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f466f6c6c6f772532306f6e2532304c696e6b6564496e2d3030373742353f7374796c653d666f722d7468652d6261646765266c6f676f3d6c696e6b6564696e266c6f676f436f6c6f723d7768697465)](https://www.linkedin.com/company/104100957)[![Join
our Discord](https://camo.githubusercontent.com/886138c89a84dc2ad74d06900f364d736ccf753b2732d59fbd4106f6310f3616/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4a6f696e2532306f7572253230446973636f72642d3538363546323f7374796c653d666f722d7468652d6261646765266c6f676f3d646973636f7264266c6f676f436f6c6f723d7768697465)](https://discord.com/invite/gSmWdAkdwd)\\n\\n#
\U0001F525 Firecrawl\\n\\n[Permalink: \U0001F525 Firecrawl](https://github.com/firecrawl/firecrawl#-firecrawl)\\n\\nEmpower
your AI apps with clean data from any website. Featuring advanced scraping,
crawling, and data extraction capabilities.\\n\\n_This repository is in development,
and we\u2019re still integrating custom modules into the mono repo. It's not
fully ready for self-hosted deployment yet, but you can run it locally._\\n\\n##
What is Firecrawl?\\n\\n[Permalink: What is Firecrawl?](https://github.com/firecrawl/firecrawl#what-is-firecrawl)\\n\\n[Firecrawl](https://firecrawl.dev/?ref=github)
is an API service that takes a URL, crawls it, and converts it into clean
markdown or structured data. We crawl all accessible subpages and give you
clean data for each. No sitemap required. Check out our [documentation](https://docs.firecrawl.dev/).\\n\\nLooking
for our MCP? Check out the [repo here](https://github.com/firecrawl/firecrawl-mcp-server).\\n\\n_Pst.
hey, you, join our stargazers :)_\\n\\n[![GitHub stars](https://camo.githubusercontent.com/11f7ce76e9f1608b3470b3a23a1db3a7d9ec083ee18f2d826862f420bac800dc/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f66697265637261776c2f66697265637261776c2e7376673f7374796c653d736f6369616c266c6162656c3d53746172266d61784167653d32353932303030)](https://github.com/firecrawl/firecrawl)\\n\\n##
How to use it?\\n\\n[Permalink: How to use it?](https://github.com/firecrawl/firecrawl#how-to-use-it)\\n\\nWe
provide an easy to use API with our hosted version. You can find the playground
and documentation [here](https://firecrawl.dev/playground). You can also self
host the backend if you'd like.\\n\\nCheck out the following resources to
get started:\\n\\n- [x] **API**: [Documentation](https://docs.firecrawl.dev/api-reference/introduction)\\n-
[x] **SDKs**: [Python](https://docs.firecrawl.dev/sdks/python), [Node](https://docs.firecrawl.dev/sdks/node)\\n-
[x] **LLM Frameworks**: [Langchain (python)](https://python.langchain.com/docs/integrations/document_loaders/firecrawl/),
[Langchain (js)](https://js.langchain.com/docs/integrations/document_loaders/web_loaders/firecrawl),
[Llama Index](https://docs.llamaindex.ai/en/latest/examples/data_connectors/WebPageDemo/#using-firecrawl-reader),
[Crew.ai](https://docs.crewai.com/), [Composio](https://composio.dev/tools/firecrawl/all),
[PraisonAI](https://docs.praison.ai/firecrawl/), [Superinterface](https://superinterface.ai/docs/assistants/functions/firecrawl),
[Vectorize](https://docs.vectorize.io/integrations/source-connectors/firecrawl)\\n-
[x] **Low-code Frameworks**: [Dify](https://dify.ai/blog/dify-ai-blog-integrated-with-firecrawl),
[Langflow](https://docs.langflow.org/), [Flowise AI](https://docs.flowiseai.com/integrations/langchain/document-loaders/firecrawl),
[Cargo](https://docs.getcargo.io/integration/firecrawl), [Pipedream](https://pipedream.com/apps/firecrawl/)\\n-
[x] **Community SDKs**: [Go](https://docs.firecrawl.dev/sdks/go), [Rust](https://docs.firecrawl.dev/sdks/rust)\\n-
[x] **Others**: [Zapier](https://zapier.com/apps/firecrawl/integrations),
[Pabbly Connect](https://www.pabbly.com/connect/integrations/firecrawl/)\\n-
[ ] Want an SDK or Integration? Let us know by opening an issue.\\n\\nTo
run locally, refer to guide [here](https://github.com/firecrawl/firecrawl/blob/main/CONTRIBUTING.md).\\n\\n###
API Key\\n\\n[Permalink: API Key](https://github.com/firecrawl/firecrawl#api-key)\\n\\nTo
use the API, you need to sign up on [Firecrawl](https://firecrawl.dev/) and
get an API key.\\n\\n### Features\\n\\n[Permalink: Features](https://github.com/firecrawl/firecrawl#features)\\n\\n-
[**Scrape**](https://github.com/firecrawl/firecrawl#scraping): scrapes a URL
and gets its content in LLM-ready format (markdown, structured data via [LLM
Extract](https://github.com/firecrawl/firecrawl#llm-extraction-beta), screenshot,
html)\\n- [**Crawl**](https://github.com/firecrawl/firecrawl#crawling): scrapes
all the URLs of a web page and returns content in LLM-ready format\\n- [**Map**](https://github.com/firecrawl/firecrawl#map):
input a website and get all of its URLs - extremely fast\\n- [**Search**](https://github.com/firecrawl/firecrawl#search):
search the web and get full content from results\\n- [**Extract**](https://github.com/firecrawl/firecrawl#extract):
get structured data from single page, multiple pages or entire websites with
AI.\\n\\n### Powerful Capabilities\\n\\n[Permalink: Powerful Capabilities](https://github.com/firecrawl/firecrawl#powerful-capabilities)\\n\\n-
**LLM-ready formats**: markdown, structured data, screenshot, HTML, links,
metadata\\n- **The hard stuff**: proxies, anti-bot mechanisms, dynamic content
(js-rendered), output parsing, orchestration\\n- **Customizability**: exclude
tags, crawl behind auth walls with custom headers, max crawl depth, etc...\\n-
**Media parsing**: pdfs, docx, images\\n- **Reliability first**: designed
to get the data you need - no matter how hard it is\\n- **Actions**: click,
scroll, input, wait and more before extracting data\\n- **Batching**: scrape
thousands of URLs at the same time with a new async endpoint\\n- **Change
Tracking**: monitor and detect changes in website content over time\\n\\nYou
can find all of Firecrawl's capabilities and how to use them in our [documentation](https://docs.firecrawl.dev/)\\n\\n###
Crawling\\n\\n[Permalink: Crawling](https://github.com/firecrawl/firecrawl#crawling)\\n\\nUsed
to crawl a URL and all accessible subpages. This submits a crawl job and returns
a job ID to check the status of the crawl.\\n\\n```\\ncurl -X POST https://api.firecrawl.dev/v2/crawl
\\\\\\n -H 'Content-Type: application/json' \\\\\\n -H 'Authorization:
Bearer fc-YOUR_API_KEY' \\\\\\n -d '{\\n \\\"url\\\": \\\"https://docs.firecrawl.dev\\\",\\n
\ \\\"limit\\\": 10,\\n \\\"scrapeOptions\\\": {\\n \\\"formats\\\":
[\\\"markdown\\\", \\\"html\\\"]\\n }\\n }'\\n```\\n\\nReturns a crawl
job id and the url to check the status of the crawl.\\n\\n```\\n{\\n \\\"success\\\":
true,\\n \\\"id\\\": \\\"123-456-789\\\",\\n \\\"url\\\": \\\"https://api.firecrawl.dev/v2/crawl/123-456-789\\\"\\n}\\n```\\n\\n###
Check Crawl Job\\n\\n[Permalink: Check Crawl Job](https://github.com/firecrawl/firecrawl#check-crawl-job)\\n\\nUsed
to check the status of a crawl job and get its result.\\n\\n```\\ncurl -X
GET https://api.firecrawl.dev/v2/crawl/123-456-789 \\\\\\n -H 'Content-Type:
application/json' \\\\\\n -H 'Authorization: Bearer YOUR_API_KEY'\\n```\\n\\n```\\n{\\n
\ \\\"status\\\": \\\"completed\\\",\\n \\\"total\\\": 36,\\n \\\"creditsUsed\\\":
36,\\n \\\"expiresAt\\\": \\\"2024-00-00T00:00:00.000Z\\\",\\n \\\"data\\\":
[\\\\\\n {\\\\\\n \\\"markdown\\\": \\\"[Firecrawl Docs home page![light
logo](https://mintlify.s3-us-west-1.amazonaws.com/firecrawl/logo/light.svg)!...\\\",\\\\\\n
\ \\\"html\\\": \\\"<!DOCTYPE html><html lang=\\\\\\\"en\\\\\\\" class=\\\\\\\"js-focus-visible
lg:[--scroll-mt:9.5rem]\\\\\\\" data-js-focus-visible=\\\\\\\"\\\\\\\">...\\\",\\\\\\n
\ \\\"metadata\\\": {\\\\\\n \\\"title\\\": \\\"Build a 'Chat with
website' using Groq Llama 3 | Firecrawl\\\",\\\\\\n \\\"language\\\":
\\\"en\\\",\\\\\\n \\\"sourceURL\\\": \\\"https://docs.firecrawl.dev/learn/rag-llama3\\\",\\\\\\n
\ \\\"description\\\": \\\"Learn how to use Firecrawl, Groq Llama 3,
and Langchain to build a 'Chat with your website' bot.\\\",\\\\\\n \\\"ogLocaleAlternate\\\":
[],\\\\\\n \\\"statusCode\\\": 200\\\\\\n }\\\\\\n }\\\\\\n
\ ]\\\\\\n}\\\\\\n```\\\\\\n\\\\\\n
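Since `/v2/crawl` is asynchronous, a client normally submits the job and then polls the returned status URL until it reports `completed`. The following is a minimal, hypothetical Python sketch of that flow using the `requests` library against the endpoints shown in the curl examples above; the API key and the poll interval are placeholders, not values taken from this repository.

```python
import time
import requests

API_KEY = "fc-YOUR_API_KEY"  # placeholder key
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {API_KEY}"}

# Submit the crawl job (mirrors the curl example above)
submit = requests.post(
    "https://api.firecrawl.dev/v2/crawl",
    headers=headers,
    json={
        "url": "https://docs.firecrawl.dev",
        "limit": 10,
        "scrapeOptions": {"formats": ["markdown", "html"]},
    },
)
job = submit.json()  # contains an "id" and a status "url"

# Poll the returned status URL until the crawl completes
while True:
    status = requests.get(job["url"], headers=headers).json()
    if status.get("status") == "completed":
        break
    time.sleep(5)  # arbitrary poll interval

for page in status.get("data", []):
    print(page["metadata"]["sourceURL"])
```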
### Scraping\\\\\\n\\\\\\n[Permalink: Scraping](https://github.com/firecrawl/firecrawl#scraping)\\\\\\n\\\\\\nUsed to scrape a URL and get its content in the specified formats.\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/scrape \\\\\\\\\\n -H 'Content-Type:
application/json' \\\\\\\\\\n -H 'Authorization: Bearer YOUR_API_KEY' \\\\\\\\\\n
\ -d '{\\\\\\n \\\"url\\\": \\\"https://docs.firecrawl.dev\\\",\\\\\\n
\ \\\"formats\\\" : [\\\"markdown\\\", \\\"html\\\"]\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\nResponse:\\\\\\n\\\\\\n```\\\\\\n{\\\\\\n
\ \\\"success\\\": true,\\\\\\n \\\"data\\\": {\\\\\\n \\\"markdown\\\":
\\\"Launch Week I is here! [See our Day 2 Release \U0001F680](https://www.firecrawl.dev/blog/launch-week-i-day-2-doubled-rate-limits)[\U0001F4A5
Get 2 months free...\\\",\\\\\\n \\\"html\\\": \\\"<!DOCTYPE html><html
lang=\\\\\\\"en\\\\\\\" class=\\\\\\\"light\\\\\\\" style=\\\\\\\"color-scheme:
light;\\\\\\\"><body class=\\\\\\\"__variable_36bd41 __variable_d7dc5d font-inter
...\\\",\\\\\\n \\\"metadata\\\": {\\\\\\n \\\"title\\\": \\\"Home
- Firecrawl\\\",\\\\\\n \\\"description\\\": \\\"Firecrawl crawls and
converts any website into clean markdown.\\\",\\\\\\n \\\"language\\\":
\\\"en\\\",\\\\\\n \\\"keywords\\\": \\\"Firecrawl,Markdown,Data,Mendable,Langchain\\\",\\\\\\n
\ \\\"robots\\\": \\\"follow, index\\\",\\\\\\n \\\"ogTitle\\\":
\\\"Firecrawl\\\",\\\\\\n \\\"ogDescription\\\": \\\"Turn any website
into LLM-ready data.\\\",\\\\\\n \\\"ogUrl\\\": \\\"https://www.firecrawl.dev/\\\",\\\\\\n
\ \\\"ogImage\\\": \\\"https://www.firecrawl.dev/og.png?123\\\",\\\\\\n
\ \\\"ogLocaleAlternate\\\": [],\\\\\\n \\\"ogSiteName\\\": \\\"Firecrawl\\\",\\\\\\n
\ \\\"sourceURL\\\": \\\"https://firecrawl.dev\\\",\\\\\\n \\\"statusCode\\\":
200\\\\\\n }\\\\\\n }\\\\\\n}\\\\\\n```\\\\\\n\\\\\\n### Map\\\\\\n\\\\\\n[Permalink:
Map](https://github.com/firecrawl/firecrawl#map)\\\\\\n\\\\\\nUsed to map
a URL and get urls of the website. This returns most links present on the
website.\\\\\\n\\\\\\n```\\\\\\ncurl -X POST https://api.firecrawl.dev/v2/map
\\\\\\\\\\n -H 'Content-Type: application/json' \\\\\\\\\\n -H 'Authorization:
Bearer YOUR_API_KEY' \\\\\\\\\\n -d '{\\\\\\n \\\"url\\\": \\\"https://firecrawl.dev\\\"\\\\\\n
\ }'\\\\\\n```\\\\\\n\\\\\\nResponse:\\\\\\n\\\\\\n```\\\\\\n{\\\\\\n \\\"success\\\":
true,\\\\\\n \\\"links\\\": [\\\\\\n { \\\"url\\\": \\\"https://firecrawl.dev\\\",
\\\"title\\\": \\\"Firecrawl\\\", \\\"description\\\": \\\"Firecrawl is a
tool that allows you to crawl a website and get the data you need.\\\" },\\\\\\n
\ { \\\"url\\\": \\\"https://www.firecrawl.dev/pricing\\\", \\\"title\\\":
\\\"Firecrawl Pricing\\\", \\\"description\\\": \\\"Firecrawl Pricing\\\"
},\\\\\\n { \\\"url\\\": \\\"https://www.firecrawl.dev/blog\\\", \\\"title\\\":
\\\"Firecrawl Blog\\\", \\\"description\\\": \\\"Firecrawl Blog\\\" },\\\\\\n
\ { \\\"url\\\": \\\"https://www.firecrawl.dev/playground\\\", \\\"title\\\":
\\\"Firecrawl Playground\\\", \\\"description\\\": \\\"Firecrawl Playground\\\"
},\\\\\\n { \\\"url\\\": \\\"https://www.firecrawl.dev/smart-crawl\\\",
\\\"title\\\": \\\"Firecrawl Smart Crawl\\\", \\\"description\\\": \\\"Firecrawl
Smart Crawl\\\" }\\\\\\n ]\\\\\\n}\\\\\\n```\\\\\\n\\\\\\n#### Map with search\\\\\\n\\\\\\n[Permalink:
Map with search](https://github.com/firecrawl/firecrawl#map-with-search)\\\\\\n\\\\\\nMap
with `search` param allows you to search for specific urls inside a website.\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/map \\\\\\\\\\n -H 'Content-Type:
application/json' \\\\\\\\\\n -H 'Authorization: Bearer YOUR_API_KEY' \\\\\\\\\\n
\ -d '{\\\\\\n \\\"url\\\": \\\"https://firecrawl.dev\\\",\\\\\\n \\\"search\\\":
\\\"docs\\\"\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\nResponse will be an ordered
list from the most relevant to the least relevant.\\\\\\n\\\\\\n```\\\\\\n{\\\\\\n
\ \\\"success\\\": true,\\\\\\n \\\"links\\\": [\\\\\\n { \\\"url\\\":
\\\"https://docs.firecrawl.dev\\\", \\\"title\\\": \\\"Firecrawl Docs\\\",
\\\"description\\\": \\\"Firecrawl Docs\\\" },\\\\\\n { \\\"url\\\": \\\"https://docs.firecrawl.dev/sdks/python\\\",
\\\"title\\\": \\\"Firecrawl Python SDK\\\", \\\"description\\\": \\\"Firecrawl
Python SDK\\\" },\\\\\\n { \\\"url\\\": \\\"https://docs.firecrawl.dev/learn/rag-llama3\\\",
\\\"title\\\": \\\"Firecrawl RAG Llama 3\\\", \\\"description\\\": \\\"Firecrawl
RAG Llama 3\\\" }\\\\\\n ]\\\\\\n}\\\\\\n```\\\\\\n\\\\\\n### Search\\\\\\n\\\\\\n[Permalink:
Search](https://github.com/firecrawl/firecrawl#search)\\\\\\n\\\\\\nSearch
the web and get full content from results\\\\\\n\\\\\\nFirecrawl\u2019s search
API allows you to perform web searches and optionally scrape the search results
in one operation.\\\\\\n\\\\\\n- Choose specific output formats (markdown,
HTML, links, screenshots)\\\\\\n- Search the web with customizable parameters
(language, country, etc.)\\\\\\n- Optionally retrieve content from search
results in various formats\\\\\\n- Control the number of results and set timeouts\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/search \\\\\\\\\\n -H \\\"Content-Type:
application/json\\\" \\\\\\\\\\n -H \\\"Authorization: Bearer fc-YOUR_API_KEY\\\"
\\\\\\\\\\n -d '{\\\\\\n \\\"query\\\": \\\"what is firecrawl?\\\",\\\\\\n
\ \\\"limit\\\": 5\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\n#### Response\\\\\\n\\\\\\n[Permalink:
Response](https://github.com/firecrawl/firecrawl#response)\\\\\\n\\\\\\n```\\\\\\n{\\\\\\n
\ \\\"success\\\": true,\\\\\\n \\\"data\\\": [\\\\\\n {\\\\\\n \\\"url\\\":
\\\"https://firecrawl.dev\\\",\\\\\\n \\\"title\\\": \\\"Firecrawl |
Home Page\\\",\\\\\\n \\\"description\\\": \\\"Turn websites into LLM-ready
data with Firecrawl\\\"\\\\\\n },\\\\\\n {\\\\\\n \\\"url\\\":
\\\"https://docs.firecrawl.dev\\\",\\\\\\n \\\"title\\\": \\\"Documentation
| Firecrawl\\\",\\\\\\n \\\"description\\\": \\\"Learn how to use Firecrawl
in your own applications\\\"\\\\\\n }\\\\\\n ]\\\\\\n}\\\\\\n```\\\\\\n\\\\\\n####
With content scraping\\\\\\n\\\\\\n[Permalink: With content scraping](https://github.com/firecrawl/firecrawl#with-content-scraping)\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/search \\\\\\\\\\n -H \\\"Content-Type:
application/json\\\" \\\\\\\\\\n -H \\\"Authorization: Bearer fc-YOUR_API_KEY\\\"
\\\\\\\\\\n -d '{\\\\\\n \\\"query\\\": \\\"what is firecrawl?\\\",\\\\\\n
\ \\\"limit\\\": 5,\\\\\\n \\\"scrapeOptions\\\": {\\\\\\n \\\"formats\\\":
[\\\"markdown\\\", \\\"links\\\"]\\\\\\n }\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\n###
Extract (Beta)\\\\\\n\\\\\\n[Permalink: Extract (Beta)](https://github.com/firecrawl/firecrawl#extract-beta)\\\\\\n\\\\\\nGet
structured data from entire websites with a prompt and/or a schema.\\\\\\n\\\\\\nYou
can extract structured data from one or multiple URLs, including wildcards:\\\\\\n\\\\\\nSingle
Page:\\\\\\nExample: [https://firecrawl.dev/some-page](https://firecrawl.dev/some-page)\\\\\\n\\\\\\nMultiple
Pages / Full Domain\\\\\\nExample: [https://firecrawl.dev/](https://firecrawl.dev/)\\\\*\\\\\\n\\\\\\nWhen
you use /\\\\*, Firecrawl will automatically crawl and parse all URLs it can
discover in that domain, then extract the requested data.\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/extract \\\\\\\\\\n -H 'Content-Type:
application/json' \\\\\\\\\\n -H 'Authorization: Bearer YOUR_API_KEY' \\\\\\\\\\n
\ -d '{\\\\\\n \\\"urls\\\": [\\\\\\n \\\"https://firecrawl.dev/*\\\",\\\\\\n
\ \\\"https://docs.firecrawl.dev/\\\",\\\\\\n \\\"https://www.ycombinator.com/companies\\\"\\\\\\n
\ ],\\\\\\n \\\"prompt\\\": \\\"Extract the company mission, whether
it is open source, and whether it is in Y Combinator from the page.\\\",\\\\\\n
\ \\\"schema\\\": {\\\\\\n \\\"type\\\": \\\"object\\\",\\\\\\n
\ \\\"properties\\\": {\\\\\\n \\\"company_mission\\\": {\\\\\\n
\ \\\"type\\\": \\\"string\\\"\\\\\\n },\\\\\\n \\\"is_open_source\\\":
{\\\\\\n \\\"type\\\": \\\"boolean\\\"\\\\\\n },\\\\\\n
\ \\\"is_in_yc\\\": {\\\\\\n \\\"type\\\": \\\"boolean\\\"\\\\\\n
\ }\\\\\\n },\\\\\\n \\\"required\\\": [\\\\\\n \\\"company_mission\\\",\\\\\\n
\ \\\"is_open_source\\\",\\\\\\n \\\"is_in_yc\\\"\\\\\\n
\ ]\\\\\\n }\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\n```\\\\\\n{\\\\\\n
\ \\\"success\\\": true,\\\\\\n \\\"id\\\": \\\"44aa536d-f1cb-4706-ab87-ed0386685740\\\",\\\\\\n
\ \\\"urlTrace\\\": []\\\\\\n}\\\\\\n```\\\\\\n\\\\\\nIf you are using the
sdks, it will auto pull the response for you:\\\\\\n\\\\\\n```\\\\\\n{\\\\\\n
\ \\\"success\\\": true,\\\\\\n \\\"data\\\": {\\\\\\n \\\"company_mission\\\":
\\\"Firecrawl is the easiest way to extract data from the web. Developers
use us to reliably convert URLs into LLM-ready markdown or structured data
with a single API call.\\\",\\\\\\n \\\"supports_sso\\\": false,\\\\\\n
\ \\\"is_open_source\\\": true,\\\\\\n \\\"is_in_yc\\\": true\\\\\\n
\ }\\\\\\n}\\\\\\n```\\\\\\n\\\\\\n### LLM Extraction (Beta)\\\\\\n\\\\\\n[Permalink:
LLM Extraction (Beta)](https://github.com/firecrawl/firecrawl#llm-extraction-beta)\\\\\\n\\\\\\nUsed
to extract structured data from scraped pages.\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/scrape \\\\\\\\\\n -H 'Content-Type:
application/json' \\\\\\\\\\n -H 'Authorization: Bearer YOUR_API_KEY' \\\\\\\\\\n
\ -d '{\\\\\\n \\\"url\\\": \\\"https://www.mendable.ai/\\\",\\\\\\n \\\"formats\\\":
[\\\\\\n {\\\\\\n \\\"type\\\": \\\"json\\\",\\\\\\n \\\"schema\\\":
{\\\\\\n \\\"type\\\": \\\"object\\\",\\\\\\n \\\"properties\\\":
{\\\\\\n \\\"company_mission\\\": { \\\"type\\\": \\\"string\\\"
},\\\\\\n \\\"supports_sso\\\": { \\\"type\\\": \\\"boolean\\\"
},\\\\\\n \\\"is_open_source\\\": { \\\"type\\\": \\\"boolean\\\"
},\\\\\\n \\\"is_in_yc\\\": { \\\"type\\\": \\\"boolean\\\" }\\\\\\n
\ }\\\\\\n }\\\\\\n }\\\\\\n ]\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\n```\\\\\\n{\\\\\\n
\ \\\"success\\\": true,\\\\\\n \\\"data\\\": {\\\\\\n \\\"content\\\":
\\\"Raw Content\\\",\\\\\\n \\\"metadata\\\": {\\\\\\n \\\"title\\\":
\\\"Mendable\\\",\\\\\\n \\\"description\\\": \\\"Mendable allows you
to easily build AI chat applications. Ingest, customize, then deploy with
one line of code anywhere you want. Brought to you by SideGuide\\\",\\\\\\n
\ \\\"robots\\\": \\\"follow, index\\\",\\\\\\n \\\"ogTitle\\\":
\\\"Mendable\\\",\\\\\\n \\\"ogDescription\\\": \\\"Mendable allows you
to easily build AI chat applications. Ingest, customize, then deploy with
one line of code anywhere you want. Brought to you by SideGuide\\\",\\\\\\n
\ \\\"ogUrl\\\": \\\"https://mendable.ai/\\\",\\\\\\n \\\"ogImage\\\":
\\\"https://mendable.ai/mendable_new_og1.png\\\",\\\\\\n \\\"ogLocaleAlternate\\\":
[],\\\\\\n \\\"ogSiteName\\\": \\\"Mendable\\\",\\\\\\n \\\"sourceURL\\\":
\\\"https://mendable.ai/\\\"\\\\\\n },\\\\\\n \\\"json\\\": {\\\\\\n
\ \\\"company_mission\\\": \\\"Train a secure AI on your technical resources
that answers customer and employee questions so your team doesn't have to\\\",\\\\\\n
\ \\\"supports_sso\\\": true,\\\\\\n \\\"is_open_source\\\": false,\\\\\\n
\ \\\"is_in_yc\\\": true\\\\\\n }\\\\\\n }\\\\\\n}\\\\\\n```\\\\\\n\\\\\\n###
Extracting without a schema (New)\\\\\\n\\\\\\n[Permalink: Extracting without
a schema (New)](https://github.com/firecrawl/firecrawl#extracting-without-a-schema-new)\\\\\\n\\\\\\nYou
can now extract without a schema by just passing a `prompt` to the endpoint.
The LLM chooses the structure of the data.\\\\\\n\\\\\\n```\\\\\\ncurl -X
POST https://api.firecrawl.dev/v2/scrape \\\\\\\\\\n -H 'Content-Type:
application/json' \\\\\\\\\\n -H 'Authorization: Bearer YOUR_API_KEY' \\\\\\\\\\n
\ -d '{\\\\\\n \\\"url\\\": \\\"https://docs.firecrawl.dev/\\\",\\\\\\n
\ \\\"formats\\\": [\\\\\\n {\\\\\\n \\\"type\\\": \\\"json\\\",\\\\\\n
\ \\\"prompt\\\": \\\"Extract the company mission from the page.\\\"\\\\\\n
\ }\\\\\\n ]\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\n### Interacting
with the page with Actions (Cloud-only)\\\\\\n\\\\\\n[Permalink: Interacting
with the page with Actions (Cloud-only)](https://github.com/firecrawl/firecrawl#interacting-with-the-page-with-actions-cloud-only)\\\\\\n\\\\\\nFirecrawl
allows you to perform various actions on a web page before scraping its content.
This is particularly useful for interacting with dynamic content, navigating
through pages, or accessing content that requires user interaction.\\\\\\n\\\\\\nHere
is an example of how to use actions to navigate to google.com, search for
Firecrawl, click on the first result, and take a screenshot.\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/scrape \\\\\\\\\\n -H 'Content-Type:
application/json' \\\\\\\\\\n -H 'Authorization: Bearer YOUR_API_KEY' \\\\\\\\\\n
\ -d '{\\\\\\n \\\"url\\\": \\\"google.com\\\",\\\\\\n \\\"formats\\\":
[\\\"markdown\\\"],\\\\\\n \\\"actions\\\": [\\\\\\n {\\\"type\\\":
\\\"wait\\\", \\\"milliseconds\\\": 2000},\\\\\\n {\\\"type\\\":
\\\"click\\\", \\\"selector\\\": \\\"textarea[title=\\\\\\\"Search\\\\\\\"]\\\"},\\\\\\n
\ {\\\"type\\\": \\\"wait\\\", \\\"milliseconds\\\": 2000},\\\\\\n
\ {\\\"type\\\": \\\"write\\\", \\\"text\\\": \\\"firecrawl\\\"},\\\\\\n
\ {\\\"type\\\": \\\"wait\\\", \\\"milliseconds\\\": 2000},\\\\\\n
\ {\\\"type\\\": \\\"press\\\", \\\"key\\\": \\\"ENTER\\\"},\\\\\\n
\ {\\\"type\\\": \\\"wait\\\", \\\"milliseconds\\\": 3000},\\\\\\n
\ {\\\"type\\\": \\\"click\\\", \\\"selector\\\": \\\"h3\\\"},\\\\\\n
\ {\\\"type\\\": \\\"wait\\\", \\\"milliseconds\\\": 3000},\\\\\\n
\ {\\\"type\\\": \\\"screenshot\\\"}\\\\\\n ]\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\n###
Batch Scraping Multiple URLs (New)\\\\\\n\\\\\\n[Permalink: Batch Scraping
Multiple URLs (New)](https://github.com/firecrawl/firecrawl#batch-scraping-multiple-urls-new)\\\\\\n\\\\\\nYou
can now batch scrape multiple URLs at the same time. It is very similar to
how the /crawl endpoint works. It submits a batch scrape job and returns a
job ID to check the status of the batch scrape.\\\\\\n\\\\\\n```\\\\\\ncurl
-X POST https://api.firecrawl.dev/v2/batch/scrape \\\\\\\\\\n -H 'Content-Type:
application/json' \\\\\\\\\\n -H 'Authorization: Bearer YOUR_API_KEY' \\\\\\\\\\n
\ -d '{\\\\\\n \\\"urls\\\": [\\\"https://docs.firecrawl.dev\\\", \\\"https://docs.firecrawl.dev/sdks/overview\\\"],\\\\\\n
\ \\\"formats\\\" : [\\\"markdown\\\", \\\"html\\\"]\\\\\\n }'\\\\\\n```\\\\\\n\\\\\\n##
## Using Python SDK\\\\\\n\\\\\\n[Permalink: Using Python SDK](https://github.com/firecrawl/firecrawl#using-python-sdk)\\\\\\n\\\\\\n###
Installing Python SDK\\\\\\n\\\\\\n[Permalink: Installing Python SDK](https://github.com/firecrawl/firecrawl#installing-python-sdk)\\\\\\n\\\\\\n```\\\\\\npip
install firecrawl-py\\\\\\n```\\\\\\n\\\\\\n### Crawl a website\\\\\\n\\\\\\n[Permalink:
Crawl a website](https://github.com/firecrawl/firecrawl#crawl-a-website)\\\\\\n\\\\\\n```\\\\\\nfrom
firecrawl import Firecrawl\\\\\\n\\\\\\nfirecrawl = Firecrawl(api_key=\\\"fc-YOUR_API_KEY\\\")\\\\\\n\\\\\\n#
Scrape a website (returns a Document)\\\\\\ndoc = firecrawl.scrape(\\\\\\n
\ \\\"https://firecrawl.dev\\\",\\\\\\n formats=[\\\"markdown\\\", \\\"html\\\"],\\\\\\n)\\\\\\nprint(doc.markdown)\\\\\\n\\\\\\n#
Crawl a website\\\\\\nresponse = firecrawl.crawl(\\\\\\n \\\"https://firecrawl.dev\\\",\\\\\\n
\ limit=100,\\\\\\n scrape_options={\\\"formats\\\": [\\\"markdown\\\",
\\\"html\\\"]},\\\\\\n poll_interval=30,\\\\\\n)\\\\\\nprint(response)\\\\\\n```\\\\\\n\\\\\\n###
Extracting structured data from a URL\\\\\\n\\\\\\n[Permalink: Extracting
structured data from a URL](https://github.com/firecrawl/firecrawl#extracting-structured-data-from-a-url)\\\\\\n\\\\\\nWith
LLM extraction, you can easily extract structured data from any URL. We support
pydantic schemas to make it easier for you too. Here is how to use it:\\\\\\n\\\\\\n```\\\\\\nfrom
pydantic import BaseModel, Field\\\\\\nfrom typing import List\\\\\\n\\\\\\nclass
Article(BaseModel):\\\\\\n title: str\\\\\\n points: int\\\\\\n by:
str\\\\\\n commentsURL: str\\\\\\n\\\\\\nclass TopArticles(BaseModel):\\\\\\n
\ top: List[Article] = Field(..., description=\\\"Top 5 stories\\\")\\\\\\n\\\\\\n#
Use JSON format with a Pydantic schema\\\\\\ndoc = firecrawl.scrape(\\\\\\n
\ \\\"https://news.ycombinator.com\\\",\\\\\\n formats=[{\\\"type\\\":
\\\"json\\\", \\\"schema\\\": TopArticles}],\\\\\\n)\\\\\\nprint(doc.json)\\\\\\n```\\\\\\n\\\\\\n##
Using the Node SDK\\\\\\n\\\\\\n[Permalink: Using the Node SDK](https://github.com/firecrawl/firecrawl#using-the-node-sdk)\\\\\\n\\\\\\n###
Installation\\\\\\n\\\\\\n[Permalink: Installation](https://github.com/firecrawl/firecrawl#installation)\\\\\\n\\\\\\nTo
install the Firecrawl Node SDK, you can use npm:\\\\\\n\\\\\\n```\\\\\\nnpm
install @mendable/firecrawl-js\\\\\\n```\\\\\\n\\\\\\n### Usage\\\\\\n\\\\\\n[Permalink:
Usage](https://github.com/firecrawl/firecrawl#usage)\\\\\\n\\\\\\n1. Get an
API key from [firecrawl.dev](https://firecrawl.dev/)\\\\\\n2. Set the API
key as an environment variable named `FIRECRAWL_API_KEY` or pass it as a parameter
to the `Firecrawl` class.\\\\\\n\\\\\\n```\\\\\\nimport Firecrawl from '@mendable/firecrawl-js';\\\\\\n\\\\\\nconst
firecrawl = new Firecrawl({ apiKey: 'fc-YOUR_API_KEY' });\\\\\\n\\\\\\n//
Scrape a website\\\\\\nconst doc = await firecrawl.scrape('https://firecrawl.dev',
{\\\\\\n formats: ['markdown', 'html'],\\\\\\n});\\\\\\nconsole.log(doc);\\\\\\n\\\\\\n//
Crawl a website\\\\\\nconst response = await firecrawl.crawl('https://firecrawl.dev',
{\\\\\\n limit: 100,\\\\\\n scrapeOptions: { formats: ['markdown', 'html']
},\\\\\\n});\\\\\\nconsole.log(response);\\\\\\n```\\\\\\n\\\\\\n### Extracting
structured data from a URL\\\\\\n\\\\\\n[Permalink: Extracting structured
data from a URL](https://github.com/firecrawl/firecrawl#extracting-structured-data-from-a-url-1)\\\\\\n\\\\\\nWith
LLM extraction, you can easily extract structured data from any URL. We support
zod schema to make it easier for you too. Here is how to use it:\\\\\\n\\\\\\n```\\\\\\nimport
Firecrawl from '@mendable/firecrawl-js';\\\\\\nimport { z } from 'zod';\\\\\\n\\\\\\nconst
firecrawl = new Firecrawl({ apiKey: 'fc-YOUR_API_KEY' });\\\\\\n\\\\\\n//
Define schema to extract contents into\\\\\\nconst schema = z.object({\\\\\\n
\ top: z\\\\\\n .array(\\\\\\n z.object({\\\\\\n title: z.string(),\\\\\\n
\ points: z.number(),\\\\\\n by: z.string(),\\\\\\n commentsURL:
z.string(),\\\\\\n })\\\\\\n )\\\\\\n .length(5)\\\\\\n .describe('Top
5 stories on Hacker News'),\\\\\\n});\\\\\\n\\\\\\n// Use the v2 extract API
with direct Zod schema support\\\\\\nconst extractRes = await firecrawl.extract({\\\\\\n
\ urls: ['https://news.ycombinator.com'],\\\\\\n schema,\\\\\\n prompt:
'Extract the top 5 stories',\\\\\\n});\\\\\\n\\\\\\nconsole.log(extractRes);\\\\\\n```\\\\\\n\\\\\\n##
Open Source vs Cloud Offering\\\\\\n\\\\\\n[Permalink: Open Source vs Cloud
Offering](https://github.com/firecrawl/firecrawl#open-source-vs-cloud-offering)\\\\\\n\\\\\\nFirecrawl
is open source available under the AGPL-3.0 license.\\\\\\n\\\\\\nTo deliver
the best possible product, we offer a hosted version of Firecrawl alongside
our open-source offering. The cloud solution allows us to continuously innovate
and maintain a high-quality, sustainable service for all users.\\\\\\n\\\\\\nFirecrawl
Cloud is available at [firecrawl.dev](https://firecrawl.dev/) and offers a
range of features that are not available in the open source version:\\\\\\n\\\\\\n[![Open
Source vs Cloud Offering](https://raw.githubusercontent.com/firecrawl/firecrawl/main/img/open-source-cloud.png)](https://raw.githubusercontent.com/firecrawl/firecrawl/main/img/open-source-cloud.png)\\\\\\n\\\\\\n##
Contributing\\\\\\n\\\\\\n[Permalink: Contributing](https://github.com/firecrawl/firecrawl#contributing)\\\\\\n\\\\\\nWe
love contributions! Please read our [contributing guide](https://github.com/firecrawl/firecrawl/blob/main/CONTRIBUTING.md)
before submitting a pull request. If you'd like to self-host, refer to the
[self-hosting guide](https://github.com/firecrawl/firecrawl/blob/main/SELF_HOST.md).\\\\\\n\\\\\\n_It
is the sole responsibility of the end users to respect websites' policies
when scraping, searching and crawling with Firecrawl. Users are advised to
adhere to the applicable privacy policies and terms of use of the websites
prior to initiating any scraping activities. By default, Firecrawl respects
the directives specified in the websites' robots.txt files when crawling.
By utilizing Firecrawl, you expressly agree to comply with these conditions._\\\\\\n\\\\\\n##
Contributors\\\\\\n\\\\\\n[Permalink: Contributors](https://github.com/firecrawl/firecrawl#contributors)\\\\\\n\\\\\\n[![contributors](https://camo.githubusercontent.com/e3b2e7ad4c1f76e68fc11ede158c87a0f039052f649002a1ff855d13fb9294fb/68747470733a2f2f636f6e747269622e726f636b732f696d6167653f7265706f3d66697265637261776c2f66697265637261776c)](https://github.com/firecrawl/firecrawl/graphs/contributors)\\\\\\n\\\\\\n##
License Disclaimer\\\\\\n\\\\\\n[Permalink: License Disclaimer](https://github.com/firecrawl/firecrawl#license-disclaimer)\\\\\\n\\\\\\nThis
project is primarily licensed under the GNU Affero General Public License
v3.0 (AGPL-3.0), as specified in the LICENSE file in the root directory of
this repository. However, certain components of this project are licensed
under the MIT License. Refer to the LICENSE files in these specific directories
for details.\\\\\\n\\\\\\nPlease note:\\\\\\n\\\\\\n- The AGPL-3.0 license
applies to all parts of the project unless otherwise specified.\\\\\\n- The
SDKs and some UI components are licensed under the MIT License. Refer to the
LICENSE files in these specific directories for details.\\\\\\n- When using
or contributing to this project, ensure you comply with the appropriate license
terms for the specific component you are working with.\\\\\\n\\\\\\nFor more
details on the licensing of specific components, please refer to the LICENSE
files in the respective directories or contact the project maintainers.\\\\\\n\\\\\\n[\u2191
Back to Top \u2191](https://github.com/firecrawl/firecrawl#readme-top)\\\\\\n\\\\\\n##
About\\\\\\n\\\\\\n\U0001F525 The Web Data API for AI - Turn entire websites
into LLM-ready markdown or structured data\\\\\\n\\\\\\n\\\\\\n[firecrawl.dev](https://firecrawl.dev/
\\\"https://firecrawl.dev\\\")\\\\\\n\\\\\\n### Topics\\\\\\n\\\\\\n[markdown](https://github.com/topics/markdown
\\\"Topic: markdown\\\") [crawler](https://github.com/topics/crawler \\\"Topic:
crawler\\\") [scraper](https://github.com/topics/scraper \\\"Topic: scraper\\\")
[ai](https://github.com/topics/ai \\\"Topic: ai\\\") [html-to-markdown](https://github.com/topics/html-to-markdown
\\\"Topic: html-to-markdown\\\") [web-crawler](https://github.com/topics/web-crawler
\\\"Topic: web-crawler\\\") [scraping](https://github.com/topics/scraping
\\\"Topic: scraping\\\") [web-scraper](https://github.com/topics/web-scraper
\\\"Topic: web-scraper\\\") [web-scraping](https://github.com/topics/web-scraping
\\\"Topic: web-scraping\\\") [data-extraction](https://github.com/topics/data-extraction
\\\"Topic: data-extraction\\\") [webscraping](https://github.com/topics/webscraping
\\\"Topic: webscraping\\\") [web-data-extraction](https://github.com/topics/web-data-extraction
\\\"Topic: web-data-extraction\\\") [ai-agents](https://github.com/topics/ai-agents
\\\"Topic: ai-agents\\\") [web-search](https://github.com/topics/web-search
\\\"Topic: web-search\\\") [ai-search](https://github.com/topics/ai-search
\\\"Topic: ai-search\\\") [web-data](https://github.com/topics/web-data \\\"Topic:
web-data\\\") [llm](https://github.com/topics/llm \\\"Topic: llm\\\") [ai-crawler](https://github.com/topics/ai-crawler
\\\"Topic: ai-crawler\\\") [ai-scraping](https://github.com/topics/ai-scraping
\\\"Topic: ai-scraping\\\")\\\\\\n\\\\\\n### Resources\\\\\\n\\\\\\n[Readme](https://github.com/firecrawl/firecrawl#readme-ov-file)\\\\\\n\\\\\\n###
License\\\\\\n\\\\\\n[AGPL-3.0 license](https://github.com/firecrawl/firecrawl#AGPL-3.0-1-ov-file)\\\\\\n\\\\\\n###
Contributing\\\\\\n\\\\\\n[Contributing](https://github.com/firecrawl/firecrawl#contributing-ov-file)\\\\\\n\\\\\\n[Activity](https://github.com/firecrawl/firecrawl/activity)\\\\\\n\\\\\\n[Custom
properties](https://github.com/firecrawl/firecrawl/custom-properties)\\\\\\n\\\\\\n###
Stars\\\\\\n\\\\\\n[**65.2k**\\\\\\\\\\nstars](https://github.com/firecrawl/firecrawl/stargazers)\\\\\\n\\\\\\n###
Watchers\\\\\\n\\\\\\n[**256**\\\\\\\\\\nwatching](https://github.com/firecrawl/firecrawl/watchers)\\\\\\n\\\\\\n###
Forks\\\\\\n\\\\\\n[**5.1k**\\\\\\\\\\nforks](https://github.com/firecrawl/firecrawl/forks)\\\\\\n\\\\\\n[Report
repository](https://github.com/contact/report-content?content_url=https%3A%2F%2Fgithub.com%2Ffirecrawl%2Ffirecrawl&report=firecrawl+%28user%29)\\\\\\n\\\\\\n##
[Releases\\\\ 28](https://github.com/firecrawl/firecrawl/releases)\\\\\\n\\\\\\n[v2.4.0\\\\\\\\\\nLatest\\\\\\\\\\n\\\\\\\\\\n2
weeks agoOct 13, 2025](https://github.com/firecrawl/firecrawl/releases/tag/v2.4.0)\\\\\\n\\\\\\n[\\\\+
27 releases](https://github.com/firecrawl/firecrawl/releases)\\\\\\n\\\\\\n##
[Packages\\\\ 3](https://github.com/orgs/firecrawl/packages?repo_name=firecrawl)\\\\\\n\\\\\\n-
[firecrawl](https://github.com/orgs/firecrawl/packages/container/package/firecrawl)\\\\\\n-
[playwright-service](https://github.com/orgs/firecrawl/packages/container/package/playwright-service)\\\\\\n-
[nuq-postgres](https://github.com/orgs/firecrawl/packages/container/package/nuq-postgres)\\\\\\n\\\\\\n##
[Contributors\\\\ 121](https://github.com/firecrawl/firecrawl/graphs/contributors)\\\\\\n\\\\\\n[\\\\+
107 contributors](https://github.com/firecrawl/firecrawl/graphs/contributors)\\\\\\n\\\\\\n##
Languages\\\\\\n\\\\\\n- [TypeScript73.5%](https://github.com/firecrawl/firecrawl/search?l=typescript)\\\\\\n-
[Python18.9%](https://github.com/firecrawl/firecrawl/search?l=python)\\\\\\n-
[Rust6.0%](https://github.com/firecrawl/firecrawl/search?l=rust)\\\\\\n- [Astro0.6%](https://github.com/firecrawl/firecrawl/search?l=astro)\\\\\\n-
[JavaScript0.3%](https://github.com/firecrawl/firecrawl/search?l=javascript)\\\\\\n-
[Jupyter Notebook0.2%](https://github.com/firecrawl/firecrawl/search?l=jupyter-notebook)\\\\\\n-
Other0.5%\",\"metadata\":{\"octolytics-dimension-repository_network_root_id\":\"787076358\",\"visitor-hmac\":\"163b2538b2335f7d4000a770785477e15881e1101708ca5ff1e8c021095f6ca1\",\"og:type\":\"object\",\"language\":\"en\",\"route-action\":\"disambiguate\",\"og:title\":\"GitHub
- firecrawl/firecrawl: \U0001F525 The Web Data API for AI - Turn entire websites
into LLM-ready markdown or structured data\",\"octolytics-dimension-repository_public\":\"true\",\"octolytics-dimension-repository_network_root_nwo\":\"firecrawl/firecrawl\",\"browser-errors-url\":\"https://api.github.com/_private/browser/errors\",\"browser-stats-url\":\"https://api.github.com/_private/browser/stats\",\"twitter:title\":\"GitHub
- firecrawl/firecrawl: \U0001F525 The Web Data API for AI - Turn entire websites
into LLM-ready markdown or structured data\",\"ui-target\":\"full\",\"og:image\":\"https://repository-images.githubusercontent.com/787076358/f9616c09-3701-41ef-b5a6-fdf912ffb15b\",\"google-site-verification\":\"Apib7-x98H0j5cPqHWwSMm6dNU4GmODRoqxLiDzdx9I\",\"ogSiteName\":\"GitHub\",\"route-pattern\":\"/:user_id/:repository\",\"visitor-payload\":\"eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJBMTVGOjE3OUI0RDo2MzdFQUREOjg3MzEwOTM6NjkwMTE4MjUiLCJ2aXNpdG9yX2lkIjoiNDkyNzk2MzExNjg5OTIxMTMwMSIsInJlZ2lvbl9lZGdlIjoiaWFkIiwicmVnaW9uX3JlbmRlciI6ImlhZCJ9\",\"og:description\":\"\U0001F525
The Web Data API for AI - Turn entire websites into LLM-ready markdown or
structured data - firecrawl/firecrawl\",\"expected-hostname\":\"github.com\",\"release\":\"66136a30a16cc69206f1249b6ba072daa2174535\",\"title\":\"GitHub
- firecrawl/firecrawl: \U0001F525 The Web Data API for AI - Turn entire websites
into LLM-ready markdown or structured data\",\"ogDescription\":\"\U0001F525
The Web Data API for AI - Turn entire websites into LLM-ready markdown or
structured data - firecrawl/firecrawl\",\"twitter:card\":\"summary_large_image\",\"fb:app_id\":\"1401488693436528\",\"color-scheme\":\"light
dark\",\"twitter:description\":\"\U0001F525 The Web Data API for AI - Turn
entire websites into LLM-ready markdown or structured data - firecrawl/firecrawl\",\"favicon\":\"https://github.githubassets.com/favicons/favicon.svg\",\"viewport\":\"width=device-width\",\"twitter:image\":\"https://repository-images.githubusercontent.com/787076358/f9616c09-3701-41ef-b5a6-fdf912ffb15b\",\"user-login\":\"\",\"description\":\"\U0001F525
The Web Data API for AI - Turn entire websites into LLM-ready markdown or
structured data - firecrawl/firecrawl\",\"octolytics-dimension-repository_nwo\":\"firecrawl/firecrawl\",\"octolytics-dimension-user_id\":\"135057108\",\"twitter:site\":\"@github\",\"og:url\":\"https://github.com/firecrawl/firecrawl\",\"octolytics-dimension-user_login\":\"firecrawl\",\"hostname\":\"github.com\",\"current-catalog-service-hash\":\"f3abb0cc802f3d7b95fc8762b94bdcb13bf39634c40c357301c4aa1d67a256fb\",\"html-safe-nonce\":\"e5653da3800db7de2c8b4a64ff6367242043a9e452608e9a6941a1c9e8346cfc\",\"apple-itunes-app\":\"app-id=1477376905,
app-argument=https://github.com/firecrawl/firecrawl\",\"turbo-cache-control\":\"no-preview\",\"og:site_name\":\"GitHub\",\"request-id\":\"A15F:179B4D:637EADD:8731093:69011825\",\"octolytics-url\":\"https://collector.github.com/github/collect\",\"octolytics-dimension-repository_is_fork\":\"false\",\"fetch-nonce\":\"v2:bc80e85b-edaf-e746-9f2c-ffdc75854b07\",\"og:image:alt\":\"\U0001F525
The Web Data API for AI - Turn entire websites into LLM-ready markdown or
structured data - firecrawl/firecrawl\",\"github-keyboard-shortcuts\":\"repository,copilot\",\"ogTitle\":\"GitHub
- firecrawl/firecrawl: \U0001F525 The Web Data API for AI - Turn entire websites
into LLM-ready markdown or structured data\",\"ogImage\":\"https://repository-images.githubusercontent.com/787076358/f9616c09-3701-41ef-b5a6-fdf912ffb15b\",\"analytics-location\":\"/<user-name>/<repo-name>\",\"route-controller\":\"files\",\"octolytics-dimension-repository_id\":\"787076358\",\"ogUrl\":\"https://github.com/firecrawl/firecrawl\",\"go-import\":\"github.com/firecrawl/firecrawl
git https://github.com/firecrawl/firecrawl.git\",\"hovercard-subject-tag\":\"repository:787076358\",\"theme-color\":\"#1e2327\",\"turbo-body-classes\":\"logged-out
env-production page-responsive\",\"scrapeId\":\"ec4d99a0-4c4f-4d1a-9fd2-08b8f891f883\",\"sourceURL\":\"https://github.com/firecrawl/firecrawl\",\"url\":\"https://github.com/firecrawl/firecrawl\",\"statusCode\":200,\"contentType\":\"text/html;
charset=utf-8\",\"proxyUsed\":\"basic\",\"cacheState\":\"hit\",\"cachedAt\":\"2025-10-28T19:23:20.106Z\"}},{\"url\":\"https://x.com/firecrawl_dev?lang=en\",\"title\":\"Firecrawl
(@firecrawl_dev) / Posts / X\",\"description\":\"Firecrawl (@firecrawl_dev)
- Posts - Turn websites into LLM-ready data. Built by @mendableai team Open
source: | X (formerly Twitter)\",\"position\":3},{\"url\":\"https://github.com/firecrawl\",\"title\":\"Firecrawl
- GitHub\",\"description\":\"Building AI applications? You need clean, structured
data from the web. Firecrawl handles the complexity of modern web scraping
so you can focus on building ...\",\"position\":4,\"category\":\"github\",\"markdown\":\"[Skip
to content](https://github.com/firecrawl#start-of-content)\\n\\nYou signed
in with another tab or window. [Reload](https://github.com/firecrawl) to refresh
your session.You signed out in another tab or window. [Reload](https://github.com/firecrawl)
to refresh your session.You switched accounts on another tab or window. [Reload](https://github.com/firecrawl)
to refresh your session.Dismiss alert\\n\\n{{ message }}\\n\\n[README.md](https://github.com/firecrawl/.github/tree/main/profile/README.md)\\n\\n#
\U0001F525 Firecrawl\\n\\n[Permalink: \U0001F525 Firecrawl](https://github.com/firecrawl#-firecrawl)\\n\\n[![Firecrawl
Logo](https://raw.githubusercontent.com/mendableai/firecrawl/main/img/firecrawl_logo.png)](https://raw.githubusercontent.com/mendableai/firecrawl/main/img/firecrawl_logo.png)\\n\\n###
Transform any website into LLM-ready data\\n\\n[Permalink: Transform any website
into LLM-ready data](https://github.com/firecrawl#transform-any-website-into-llm-ready-data)\\n\\nAdvanced
web scraping, crawling, and data extraction infrastructure for AI applications\\n\\n[![Get
Started](https://camo.githubusercontent.com/85b729c7fb201b60a98279ddc4e70268281fc6df999e110276f798cf5c050126/68747470733a2f2f696d672e736869656c64732e696f2f62616467652ff09f9a805f4765745f537461727465642d4646364233353f7374796c653d666f722d7468652d6261646765)](https://firecrawl.dev/)
[![Documentation](https://camo.githubusercontent.com/f7531cb91d3d3dcac76ac7c2ba9d36c1a74728016c64d942cfaf954f5ae4a238/68747470733a2f2f696d672e736869656c64732e696f2f62616467652ff09f939a5f446f63756d656e746174696f6e2d3441393045323f7374796c653d666f722d7468652d6261646765)](https://docs.firecrawl.dev/)
[![Discord](https://camo.githubusercontent.com/8c2d9f948c1d79b69e26d25add89a17a734b48cc6fd6fc0040f34f31c6a11774/68747470733a2f2f696d672e736869656c64732e696f2f62616467652ff09f92ac5f4a6f696e5f446973636f72642d3538363546323f7374796c653d666f722d7468652d6261646765)](https://discord.com/invite/gSmWdAkdwd)\\n\\n[![License](https://camo.githubusercontent.com/a6f4431b80529dbeaa43c3c5fbcf4649f6b4ebbeb82d5a58abeb39ca3eeca8be/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6963656e73652f6d656e6461626c6561692f66697265637261776c)](https://github.com/mendableai/firecrawl/blob/main/LICENSE)[![GitHub
Stars](https://camo.githubusercontent.com/f42ce9a4d46d07baa67b74b49277e72ed33877743358deebfa774e17532eb4ff/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f6d656e6461626c6561692f66697265637261776c3f7374796c653d736f6369616c)](https://github.com/mendableai/firecrawl/stargazers)[![Python
Downloads](https://camo.githubusercontent.com/9d76afe428b4085c8b7103f2f4e31da110ee154ad7320bace4348d92ac0c2450/68747470733a2f2f7374617469632e706570792e746563682f62616467652f66697265637261776c2d7079)](https://pepy.tech/project/firecrawl-py)[![Follow
on X](https://camo.githubusercontent.com/e32f3aece18eaab32ee100cadb592843d985aa1171a8da08ae047626363d65ae/68747470733a2f2f696d672e736869656c64732e696f2f747769747465722f666f6c6c6f772f66697265637261776c5f6465763f7374796c653d736f6369616c)](https://x.com/firecrawl_dev)\\n\\n*
* *\\n\\n## Why Firecrawl?\\n\\n[Permalink: Why Firecrawl?](https://github.com/firecrawl#why-firecrawl)\\n\\n**Building
AI applications?** You need clean, structured data from the web. Firecrawl
handles the complexity of modern web scraping so you can focus on building
great products.\\n\\n## Our Core Ecosystem\\n\\n[Permalink: Our Core Ecosystem](https://github.com/firecrawl#our-core-ecosystem)\\n\\n###
Main Repository\\n\\n[Permalink: Main Repository](https://github.com/firecrawl#main-repository)\\n\\n[![](https://camo.githubusercontent.com/97aa9741f2773cb2d192c516c7689f4e2bbab89403aa868d842487551743626a/68747470733a2f2f6769746875622d726561646d652d73746174732e76657263656c2e6170702f6170692f70696e2f3f757365726e616d653d6d656e6461626c656169267265706f3d66697265637261776c267468656d653d6c69676874)](https://github.com/mendableai/firecrawl)\\n\\n**[firecrawl](https://github.com/mendableai/firecrawl)**
\\\\- Core API & SDK\\n\\nTurn entire websites into LLM-ready markdown or
structured data. Our flagship product with 40k+ stars.\\n\\n### Cloud API\\n\\n[Permalink:
Cloud API](https://github.com/firecrawl#cloud-api)\\n\\n[![Cloud API](https://camo.githubusercontent.com/6ad9773ed98c84d54b1546ffbf8b0fbb085be0c60580ae5a1ead2d23ddf1b121/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436c6f75645f4150492d4646364233353f7374796c653d666f722d7468652d6261646765266c6f676f3d636c6f7564266c6f676f436f6c6f723d7768697465)](https://firecrawl.dev/)\\n\\n**[Firecrawl](https://firecrawl.dev/)**
\\\\- Hosted API Service\\n\\nProduction-ready web scraping without infrastructure
management. Get your API key and start scraping in minutes with our reliable,
scalable cloud service.\\n\\n### MCP Integration\\n\\n[Permalink: MCP Integration](https://github.com/firecrawl#mcp-integration)\\n\\n[![MCP
Server](https://camo.githubusercontent.com/ea9cbd6a754e0932d17ae393ff53566bfb681492abddd08704dbe04b122dfaa1/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4d43505f5365727665722d3441393045323f7374796c653d666f722d7468652d6261646765266c6f676f3d736572766572266c6f676f436f6c6f723d7768697465)](https://github.com/mendableai/firecrawl-mcp-server)\\n\\n**[firecrawl-mcp-server](https://github.com/mendableai/firecrawl-mcp-server)**
\\\\- Model Context Protocol Server\\n\\nAdd powerful web scraping capabilities
to Claude, Cursor, and any MCP-compatible LLM client.\\n\\n## Community &
Support\\n\\n[Permalink: Community & Support](https://github.com/firecrawl#community--support)\\n\\n[![Discord](https://camo.githubusercontent.com/62d3d35241760cf174631c4e6b5f4503c0a6b34640fd306e36a829ab5ec47b14/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f446973636f72642d3538363546323f7374796c653d666f722d7468652d6261646765266c6f676f3d646973636f7264266c6f676f436f6c6f723d7768697465)](https://discord.com/invite/gSmWdAkdwd)[![X](https://camo.githubusercontent.com/8c709aaebc7feee6050eba44984b294d9da3ace3353bd5eed8b499dd04af3c06/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f582d3030303030303f7374796c653d666f722d7468652d6261646765266c6f676f3d78266c6f676f436f6c6f723d7768697465)](https://x.com/firecrawl_dev)[![LinkedIn](https://camo.githubusercontent.com/8c0692475a5bfc1d9e7361074bdb648e567cae7b5b40ffd32adae31180b0d7b6/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c696e6b6564496e2d3030373742353f7374796c653d666f722d7468652d6261646765266c6f676f3d6c696e6b6564696e266c6f676f436f6c6f723d7768697465)](https://www.linkedin.com/company/104100957/)[![Discussions](https://camo.githubusercontent.com/9403fd9d6d54f5a23a79f9a8a6a256ae82159fb626710fd56c2495fff1257d62/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875625f44697363757373696f6e732d3138313731373f7374796c653d666f722d7468652d6261646765266c6f676f3d676974687562266c6f676f436f6c6f723d7768697465)](https://github.com/mendableai/firecrawl/discussions)[![Documentation](https://camo.githubusercontent.com/9d518c9da8018ae3524a2580522bd1ef591f343cc4df7983b4476e117fa70bba/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f446f63756d656e746174696f6e2d3441393045323f7374796c653d666f722d7468652d6261646765266c6f676f3d626f6f6b266c6f676f436f6c6f723d7768697465)](https://docs.firecrawl.dev/)\\n\\n##
Built By Mendable\\n\\n[Permalink: Built By Mendable](https://github.com/firecrawl#built-by-mendable)\\n\\nWe're
the team behind [Mendable.ai](https://mendable.ai/), passionate about making
web data accessible for AI applications. Firecrawl powers thousands of AI
products worldwide.\\n\\n* * *\\n\\n**Ready to build something amazing?**\\n\\n[Get
your API key](https://firecrawl.dev/) and start scraping in minutes\\n\\n\\n[Star
our main repo](https://github.com/mendableai/firecrawl) \u2022\\n[Try the
playground](https://firecrawl.dev/playground) \u2022\\n[Read the docs](https://docs.firecrawl.dev/)\\n\\n##
Pinned Loading\\n\\n1. [firecrawl](https://github.com/firecrawl/firecrawl)
firecrawlPublic\\n\\n\\n\\n\\n\\n\\n\U0001F525 The Web Data API for AI - Turn
entire websites into LLM-ready markdown or structured data\\n\\n\\n\\n\\nTypeScript[64.9k](https://github.com/firecrawl/firecrawl/stargazers)
[5.1k](https://github.com/firecrawl/firecrawl/forks)\\n\\n2. [mendable-nextjs-chatbot](https://github.com/firecrawl/mendable-nextjs-chatbot)
mendable-nextjs-chatbotPublic template\\n\\n\\n\\n\\n\\n\\nNext.js Starter
Template for building chatbots with Mendable\\n\\n\\n\\n\\nTypeScript[256](https://github.com/firecrawl/mendable-nextjs-chatbot/stargazers)
[52](https://github.com/firecrawl/mendable-nextjs-chatbot/forks)\\n\\n3. [rag-arena](https://github.com/firecrawl/rag-arena)
rag-arenaPublic\\n\\n\\n\\n\\n\\n\\nOpen-source RAG evaluation through users'
feedback\\n\\n\\n\\n\\nTypeScript[206](https://github.com/firecrawl/rag-arena/stargazers)
[32](https://github.com/firecrawl/rag-arena/forks)\\n\\n4. [QA\\\\_clustering](https://github.com/firecrawl/QA_clustering)
QA\\\\_clusteringPublic\\n\\n\\n\\n\\n\\n\\nAnalyzing chat interactions w/
LLMs to improve \U0001F99C\U0001F517 Langchain docs\\n\\n\\n\\n\\nJupyter
Notebook[80](https://github.com/firecrawl/QA_clustering/stargazers) [12](https://github.com/firecrawl/QA_clustering/forks)\\n\\n5.
[data-connectors](https://github.com/firecrawl/data-connectors) data-connectorsPublic\\n\\n\\n\\n\\n\\n\\nLLM-ready
data connectors\\n\\n\\n\\n\\nTypeScript[95](https://github.com/firecrawl/data-connectors/stargazers)
[23](https://github.com/firecrawl/data-connectors/forks)\\n\\n6. [mendable-py](https://github.com/firecrawl/mendable-py)
mendable-pyPublic\\n\\n\\n\\n\\n\\n\\nBuild Production Ready LLM Chat Apps
in Minutes\\n\\n\\n\\n\\nPython[33](https://github.com/firecrawl/mendable-py/stargazers)
[7](https://github.com/firecrawl/mendable-py/forks)\\n\\n\\n### Repositories\\n\\nLoading\\n\\nType\\n\\nAllPublicSourcesForksArchivedMirrorsTemplates\\n\\nLanguage\\n\\nAllCSSGoJavaJavaScriptJupyter
NotebookMDXPythonRustTypeScript\\n\\nSort\\n\\nLast updatedNameStars\\n\\nShowing
10 of 61 repositories\\n\\n- [firecrawl](https://github.com/firecrawl/firecrawl)\\nPublic\\n\\n\\n\\n\U0001F525
The Web Data API for AI - Turn entire websites into LLM-ready markdown or
structured data\\n\\n\\n\\n\\n\\n\\nfirecrawl/firecrawl\u2019s past year of
commit activity\\n\\n\\n\\nTypeScript[64,949](https://github.com/firecrawl/firecrawl/stargazers)AGPL-3.0\\n[5,132](https://github.com/firecrawl/firecrawl/forks)
[27](https://github.com/firecrawl/firecrawl/issues) [(2 issues need help)](https://github.com/firecrawl/firecrawl/issues?q=label%3A%22good+first+issue%22+is%3Aissue+is%3Aopen)
[85](https://github.com/firecrawl/firecrawl/pulls)\\nUpdated 2 hours agoOct
27, 2025\\n\\n- [firecrawl-docs](https://github.com/firecrawl/firecrawl-docs)\\nPublic\\n\\n\\n\\nDocumentation
for Firecrawl.\\n\\n\\n\\n\\n\\n\\nfirecrawl/firecrawl-docs\u2019s past year
of commit activity\\n\\n\\n\\nMDX[17](https://github.com/firecrawl/firecrawl-docs/stargazers)
[35](https://github.com/firecrawl/firecrawl-docs/forks) [10](https://github.com/firecrawl/firecrawl-docs/issues)
[5](https://github.com/firecrawl/firecrawl-docs/pulls)\\nUpdated 20 hours
agoOct 26, 2025\\n\\n- [open-agent-builder](https://github.com/firecrawl/open-agent-builder)\\nPublic\\n\\n\\n\\n\U0001F525
Visual workflow builder for AI agents powered by Firecrawl - drag-and-drop
web scraping pipelines with real-time execution\\n\\n\\n\\n\\n\\n\\nfirecrawl/open-agent-builder\u2019s
past year of commit activity\\n\\n\\n\\nTypeScript[1,673](https://github.com/firecrawl/open-agent-builder/stargazers)
[274](https://github.com/firecrawl/open-agent-builder/forks) [4](https://github.com/firecrawl/open-agent-builder/issues)
[2](https://github.com/firecrawl/open-agent-builder/pulls)\\nUpdated last
weekOct 20, 2025\\n\\n- [firecrawl-mcp-server](https://github.com/firecrawl/firecrawl-mcp-server)\\nPublic\\n\\n\\n\\n\U0001F525
Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor,
Claude and any other LLM clients.\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n[**Uh
oh!**](https://github.com/firecrawl/firecrawl-mcp-server/graphs/commit-activity)\\n\\n[There
was an error while loading.](https://github.com/firecrawl/firecrawl-mcp-server/graphs/commit-activity)
[Please reload this page](https://github.com/firecrawl).\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\nfirecrawl/firecrawl-mcp-server\u2019s
past year of commit activity\\n\\n\\n\\nJavaScript[4,794](https://github.com/firecrawl/firecrawl-mcp-server/stargazers)MIT\\n[519](https://github.com/firecrawl/firecrawl-mcp-server/forks)
[44](https://github.com/firecrawl/firecrawl-mcp-server/issues) [17](https://github.com/firecrawl/firecrawl-mcp-server/pulls)\\nUpdated
last weekOct 19, 2025\\n\\n- [n8n-nodes-firecrawl](https://github.com/firecrawl/n8n-nodes-firecrawl)\\nPublic\\n\\n\\n\\nn8n
node to interact with Firecrawl\\n\\n\\n\\n\\n\\n\\nfirecrawl/n8n-nodes-firecrawl\u2019s
past year of commit activity\\n\\n\\n\\nTypeScript[21](https://github.com/firecrawl/n8n-nodes-firecrawl/stargazers)MIT\\n[13](https://github.com/firecrawl/n8n-nodes-firecrawl/forks)
[3](https://github.com/firecrawl/n8n-nodes-firecrawl/issues) [0](https://github.com/firecrawl/n8n-nodes-firecrawl/pulls)\\nUpdated
2 weeks agoOct 17, 2025\\n\\n- [.github](https://github.com/firecrawl/.github)\\nPublic\\n\\n\\n\\n\\nfirecrawl/.github\u2019s
past year of commit activity\\n\\n\\n\\n0\\n[1](https://github.com/firecrawl/.github/forks)
[0](https://github.com/firecrawl/.github/issues) [0](https://github.com/firecrawl/.github/pulls)\\nUpdated
2 weeks agoOct 12, 2025\\n\\n- [fire-enrich](https://github.com/firecrawl/fire-enrich)\\nPublic\\n\\n\\n\\n\U0001F525
AI-powered data enrichment tool that transforms emails into rich datasets
with company profiles, funding data, tech stacks, and more using Firecrawl
and multi-agent AI\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n[**Uh oh!**](https://github.com/firecrawl/fire-enrich/graphs/commit-activity)\\n\\n[There
was an error while loading.](https://github.com/firecrawl/fire-enrich/graphs/commit-activity)
[Please reload this page](https://github.com/firecrawl).\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\nfirecrawl/fire-enrich\u2019s
past year of commit activity\\n\\n\\n\\nTypeScript[953](https://github.com/firecrawl/fire-enrich/stargazers)MIT\\n[239](https://github.com/firecrawl/fire-enrich/forks)
[12](https://github.com/firecrawl/fire-enrich/issues) [3](https://github.com/firecrawl/fire-enrich/pulls)\\nUpdated
3 weeks agoOct 8, 2025\\n\\n- [firecrawl-java-sdk](https://github.com/firecrawl/firecrawl-java-sdk)\\nPublic\\n\\n\\n\\n\\nfirecrawl/firecrawl-java-sdk\u2019s
past year of commit activity\\n\\n\\n\\nJava[11](https://github.com/firecrawl/firecrawl-java-sdk/stargazers)MIT\\n[4](https://github.com/firecrawl/firecrawl-java-sdk/forks)
[0](https://github.com/firecrawl/firecrawl-java-sdk/issues) [0](https://github.com/firecrawl/firecrawl-java-sdk/pulls)\\nUpdated
last monthSep 28, 2025\\n\\n- [open-lovable](https://github.com/firecrawl/open-lovable)\\nPublic\\n\\n\\n\\n\U0001F525
Clone and recreate any website as a modern React app in seconds\\n\\n\\n\\n\\n\\n\\nfirecrawl/open-lovable\u2019s
past year of commit activity\\n\\n\\n\\nTypeScript[21,320](https://github.com/firecrawl/open-lovable/stargazers)MIT\\n[3,986](https://github.com/firecrawl/open-lovable/forks)
[70](https://github.com/firecrawl/open-lovable/issues) [33](https://github.com/firecrawl/open-lovable/pulls)\\nUpdated
last monthSep 27, 2025\\n\\n- [mineru-api](https://github.com/firecrawl/mineru-api)\\nPublic\\n\\n\\n\\n\\nfirecrawl/mineru-api\u2019s
past year of commit activity\\n\\n\\n\\nPython[12](https://github.com/firecrawl/mineru-api/stargazers)AGPL-3.0\\n[2](https://github.com/firecrawl/mineru-api/forks)
[1](https://github.com/firecrawl/mineru-api/issues) [1](https://github.com/firecrawl/mineru-api/pulls)\\nUpdated
on Sep 26Sep 26, 2025\\n\\n\\n[View all repositories](https://github.com/orgs/firecrawl/repositories?type=all)\\n\\n[**People**](https://github.com/orgs/firecrawl/people)\\n\\n[![@alexnucci](https://avatars.githubusercontent.com/u/1919849?s=70&v=4)](https://github.com/alexnucci)[![@micahstairs](https://avatars.githubusercontent.com/u/7231485?s=70&v=4)](https://github.com/micahstairs)[![@nickscamara](https://avatars.githubusercontent.com/u/20311743?s=70&v=4)](https://github.com/nickscamara)[![@mogery](https://avatars.githubusercontent.com/u/66118807?s=70&v=4)](https://github.com/mogery)[![@developersdigest](https://avatars.githubusercontent.com/u/124798203?s=70&v=4)](https://github.com/developersdigest)\\n\\n####
Top languages\\n\\n[TypeScript](https://github.com/orgs/firecrawl/repositories?language=typescript&type=all)
[Python](https://github.com/orgs/firecrawl/repositories?language=python&type=all)
[JavaScript](https://github.com/orgs/firecrawl/repositories?language=javascript&type=all)
[Go](https://github.com/orgs/firecrawl/repositories?language=go&type=all)
[MDX](https://github.com/orgs/firecrawl/repositories?language=mdx&type=all)\\n\\n####
Most used topics\\n\\n[ai](https://github.com/search?q=topic%3Aai+org%3Afirecrawl+fork%3Atrue&type=repositories
\\\"Topic: ai\\\") [firecrawl](https://github.com/search?q=topic%3Afirecrawl+org%3Afirecrawl+fork%3Atrue&type=repositories
\\\"Topic: firecrawl\\\") [llm](https://github.com/search?q=topic%3Allm+org%3Afirecrawl+fork%3Atrue&type=repositories
\\\"Topic: llm\\\") [web-crawler](https://github.com/search?q=topic%3Aweb-crawler+org%3Afirecrawl+fork%3Atrue&type=repositories
\\\"Topic: web-crawler\\\") [web-scraping](https://github.com/search?q=topic%3Aweb-scraping+org%3Afirecrawl+fork%3Atrue&type=repositories
\\\"Topic: web-scraping\\\")\\n\\nYou can\u2019t perform that action at this
time.\",\"metadata\":{\"analytics-location\":\"/<org-login>\",\"apple-itunes-app\":\"app-id=1477376905,
app-argument=https://github.com/firecrawl\",\"twitter:card\":\"summary_large_image\",\"google-site-verification\":\"Apib7-x98H0j5cPqHWwSMm6dNU4GmODRoqxLiDzdx9I\",\"description\":\"Web
data API for AI. Firecrawl has 61 repositories available. Follow their code
on GitHub.\",\"og:image\":\"https://avatars.githubusercontent.com/u/135057108?s=280&v=4\",\"og:type\":\"profile\",\"visitor-payload\":\"eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiI5REFEOjIzM0ExMzo4RDgzNEI6QzQ5OUNCOjY4RkYzODlGIiwidmlzaXRvcl9pZCI6IjQwMDQ0MTI5MTkxMDA5NDY1OTEiLCJyZWdpb25fZWRnZSI6ImlhZCIsInJlZ2lvbl9yZW5kZXIiOiJpYWQifQ==\",\"github-keyboard-shortcuts\":\"copilot\",\"user-login\":\"\",\"viewport\":\"width=device-width\",\"og:description\":\"Web
data API for AI. Firecrawl has 61 repositories available. Follow their code
on GitHub.\",\"turbo-cache-control\":\"no-preview\",\"fetch-nonce\":\"v2:35a2e032-0081-3f6b-595e-967e32025c6f\",\"og:url\":\"https://github.com/firecrawl\",\"title\":\"Firecrawl
\xB7 GitHub\",\"route-pattern\":\"/:user_id(.:format)\",\"route-action\":\"show\",\"octolytics-url\":\"https://collector.github.com/github/collect\",\"og:site_name\":\"GitHub\",\"twitter:title\":\"Firecrawl\",\"request-id\":\"9DAD:233A13:8D834B:C499CB:68FF389F\",\"ogSiteName\":\"GitHub\",\"fb:app_id\":\"1401488693436528\",\"language\":\"en\",\"twitter:image\":\"https://avatars.githubusercontent.com/u/135057108?s=280&v=4\",\"ogImage\":\"https://avatars.githubusercontent.com/u/135057108?s=280&v=4\",\"release\":\"c44b7f7aa5c70f3296484971978c9f4b1b473352\",\"theme-color\":\"#1e2327\",\"color-scheme\":\"light
dark\",\"html-safe-nonce\":\"2d932295da6aa360d861f16279839ab109dc4e977a51bc99204477969d7d12c6\",\"ogTitle\":\"Firecrawl\",\"hovercard-subject-tag\":\"organization:135057108\",\"twitter:description\":\"Web
data API for AI. Firecrawl has 61 repositories available. Follow their code
on GitHub.\",\"hostname\":\"github.com\",\"ogUrl\":\"https://github.com/firecrawl\",\"ogDescription\":\"Web
data API for AI. Firecrawl has 61 repositories available. Follow their code
on GitHub.\",\"route-controller\":\"profiles\",\"favicon\":\"https://github.githubassets.com/favicons/favicon.svg\",\"visitor-hmac\":\"43cf3bbb9c57a3bcf0b1857fbcabcc78c274e7082e0d28c76cd6d4651bc2a920\",\"twitter:site\":\"@github\",\"ui-target\":\"full\",\"og:title\":\"Firecrawl\",\"expected-hostname\":\"github.com\",\"og:image:alt\":\"Web
data API for AI. Firecrawl has 61 repositories available. Follow their code
on GitHub.\",\"profile:username\":\"firecrawl\",\"current-catalog-service-hash\":\"4a1c50a83cf6cc4b55b6b9c53e553e3f847c876b87fb333f71f5d05db8f1a7db\",\"turbo-body-classes\":\"logged-out
env-production page-responsive\",\"browser-stats-url\":\"https://api.github.com/_private/browser/stats\",\"browser-errors-url\":\"https://api.github.com/_private/browser/errors\",\"scrapeId\":\"65cbc300-be11-4a1a-9d20-c114fb8473a7\",\"sourceURL\":\"https://github.com/firecrawl\",\"url\":\"https://github.com/firecrawl\",\"statusCode\":200,\"contentType\":\"text/html;
charset=utf-8\",\"proxyUsed\":\"basic\",\"cacheState\":\"hit\",\"cachedAt\":\"2025-10-27T09:17:20.756Z\"}},{\"url\":\"https://www.linkedin.com/company/firecrawl\",\"title\":\"Firecrawl
| LinkedIn\",\"description\":\"Our Dify integration now uses Firecrawl /v2
endpoints Scraping is 10x faster thanks to intelligent caching, plus we've
added semantic ...\",\"position\":5}]},\"creditsUsed\":3}"
headers:
Access-Control-Allow-Origin:
- '*'
Alt-Svc:
- h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Content-Length:
- '93428'
Content-Type:
- application/json; charset=utf-8
Date:
- Wed, 29 Oct 2025 14:37:39 GMT
ETag:
- W/"16cf4-kHwVbMu4CCVG2UIt6p1g/gz5M4M"
Via:
- 1.1 google
X-Powered-By:
- Express
X-Response-Time:
- 13172.495ms
status:
code: 200
message: OK
version: 1

View File

@@ -0,0 +1,18 @@
import pytest
from crewai_tools.tools.firecrawl_crawl_website_tool.firecrawl_crawl_website_tool import (
FirecrawlCrawlWebsiteTool,
)
@pytest.mark.vcr(filter_headers=["authorization"])
def test_firecrawl_crawl_tool_integration():
tool = FirecrawlCrawlWebsiteTool(config={
"limit": 2,
"max_discovery_depth": 1,
"scrape_options": {"formats": ["markdown"]}
})
result = tool.run(url="https://firecrawl.dev")
assert result is not None
assert hasattr(result, 'status')
assert result.status in ["completed", "scraping"]

View File

@@ -0,0 +1,15 @@
import pytest
from crewai_tools.tools.firecrawl_scrape_website_tool.firecrawl_scrape_website_tool import (
FirecrawlScrapeWebsiteTool,
)
@pytest.mark.vcr(filter_headers=["authorization"])
def test_firecrawl_scrape_tool_integration():
tool = FirecrawlScrapeWebsiteTool()
result = tool.run(url="https://firecrawl.dev")
assert result is not None
assert hasattr(result, 'markdown')
assert len(result.markdown) > 0
assert "Firecrawl" in result.markdown or "firecrawl" in result.markdown.lower()

View File

@@ -0,0 +1,12 @@
import pytest
from crewai_tools.tools.firecrawl_search_tool.firecrawl_search_tool import FirecrawlSearchTool
@pytest.mark.vcr(filter_headers=["authorization"])
def test_firecrawl_search_tool_integration():
tool = FirecrawlSearchTool()
result = tool.run(query="firecrawl")
assert result is not None
assert hasattr(result, 'web') or hasattr(result, 'news') or hasattr(result, 'images')
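The three integration tests above call the Firecrawl tools directly. As a hedged sketch only (assuming the standard crewai Agent/Task/Crew API and a FIRECRAWL_API_KEY in the environment; the role, goal, and task strings are illustrative), the same tools can also be attached to an agent:

from crewai import Agent, Crew, Task
from crewai_tools.tools.firecrawl_scrape_website_tool.firecrawl_scrape_website_tool import (
    FirecrawlScrapeWebsiteTool,
)
from crewai_tools.tools.firecrawl_search_tool.firecrawl_search_tool import FirecrawlSearchTool

# Both tools pick up FIRECRAWL_API_KEY from the environment (assumption).
researcher = Agent(
    role="Web researcher",
    goal="Summarize what firecrawl.dev offers",
    backstory="You collect clean, LLM-ready web data for downstream analysis.",
    tools=[FirecrawlScrapeWebsiteTool(), FirecrawlSearchTool()],
)
summary_task = Task(
    description="Scrape https://firecrawl.dev and summarize its main features.",
    expected_output="A short bullet-point summary of Firecrawl's capabilities.",
    agent=researcher,
)
result = Crew(agents=[researcher], tasks=[summary_task]).kickoff()
print(result)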

View File

@@ -23,7 +23,6 @@ dependencies = [
"chromadb~=1.1.0",
"tokenizers>=0.20.3",
"openpyxl>=3.1.5",
"pyvis>=0.3.2",
# Authentication and Security
"python-dotenv>=1.1.1",
"pyjwt>=2.9.0",
@@ -49,7 +48,7 @@ Repository = "https://github.com/crewAIInc/crewAI"
[project.optional-dependencies]
tools = [
"crewai-tools==1.2.1",
"crewai-tools==1.4.1",
]
embeddings = [
"tiktoken~=0.8.0"
@@ -94,10 +93,11 @@ azure-ai-inference = [
anthropic = [
"anthropic>=0.69.0",
]
# a2a = [
# "a2a-sdk~=0.3.9",
# "httpx-sse>=0.4.0",
# ]
a2a = [
"a2a-sdk~=0.3.10",
"httpx-auth>=0.23.1",
"httpx-sse>=0.4.0",
]
[project.scripts]

View File

@@ -3,7 +3,7 @@ from typing import Any
import urllib.request
import warnings
from crewai.agent import Agent
from crewai.agent.core import Agent
from crewai.crew import Crew
from crewai.crews.crew_output import CrewOutput
from crewai.flow.flow import Flow
@@ -40,7 +40,7 @@ def _suppress_pydantic_deprecation_warnings() -> None:
_suppress_pydantic_deprecation_warnings()
__version__ = "1.2.1"
__version__ = "1.4.1"
_telemetry_submitted = False

View File

@@ -0,0 +1,6 @@
"""Agent-to-Agent (A2A) protocol communication module for CrewAI."""
from crewai.a2a.config import A2AConfig
__all__ = ["A2AConfig"]

View File

@@ -0,0 +1,20 @@
"""A2A authentication schemas."""
from crewai.a2a.auth.schemas import (
APIKeyAuth,
BearerTokenAuth,
HTTPBasicAuth,
HTTPDigestAuth,
OAuth2AuthorizationCode,
OAuth2ClientCredentials,
)
__all__ = [
"APIKeyAuth",
"BearerTokenAuth",
"HTTPBasicAuth",
"HTTPDigestAuth",
"OAuth2AuthorizationCode",
"OAuth2ClientCredentials",
]

View File

@@ -0,0 +1,392 @@
"""Authentication schemes for A2A protocol agents.
Supported authentication methods:
- Bearer tokens
- OAuth2 (Client Credentials, Authorization Code)
- API Keys (header, query, cookie)
- HTTP Basic authentication
- HTTP Digest authentication
"""
from __future__ import annotations
from abc import ABC, abstractmethod
import base64
from collections.abc import Awaitable, Callable, MutableMapping
import time
from typing import Literal
import urllib.parse
import httpx
from httpx import DigestAuth
from pydantic import BaseModel, Field, PrivateAttr
class AuthScheme(ABC, BaseModel):
"""Base class for authentication schemes."""
@abstractmethod
async def apply_auth(
self, client: httpx.AsyncClient, headers: MutableMapping[str, str]
) -> MutableMapping[str, str]:
"""Apply authentication to request headers.
Args:
client: HTTP client for making auth requests.
headers: Current request headers.
Returns:
Updated headers with authentication applied.
"""
...
class BearerTokenAuth(AuthScheme):
"""Bearer token authentication (Authorization: Bearer <token>).
Attributes:
token: Bearer token for authentication.
"""
token: str = Field(description="Bearer token")
async def apply_auth(
self, client: httpx.AsyncClient, headers: MutableMapping[str, str]
) -> MutableMapping[str, str]:
"""Apply Bearer token to Authorization header.
Args:
client: HTTP client for making auth requests.
headers: Current request headers.
Returns:
Updated headers with Bearer token in Authorization header.
"""
headers["Authorization"] = f"Bearer {self.token}"
return headers
class HTTPBasicAuth(AuthScheme):
"""HTTP Basic authentication.
Attributes:
username: Username for Basic authentication.
password: Password for Basic authentication.
"""
username: str = Field(description="Username")
password: str = Field(description="Password")
async def apply_auth(
self, client: httpx.AsyncClient, headers: MutableMapping[str, str]
) -> MutableMapping[str, str]:
"""Apply HTTP Basic authentication.
Args:
client: HTTP client for making auth requests.
headers: Current request headers.
Returns:
Updated headers with Basic auth in Authorization header.
"""
credentials = f"{self.username}:{self.password}"
encoded = base64.b64encode(credentials.encode()).decode()
headers["Authorization"] = f"Basic {encoded}"
return headers
class HTTPDigestAuth(AuthScheme):
"""HTTP Digest authentication.
Note: Uses httpx's built-in DigestAuth for the digest implementation.
Attributes:
username: Username for Digest authentication.
password: Password for Digest authentication.
"""
username: str = Field(description="Username")
password: str = Field(description="Password")
async def apply_auth(
self, client: httpx.AsyncClient, headers: MutableMapping[str, str]
) -> MutableMapping[str, str]:
"""Digest auth is handled by httpx auth flow, not headers.
Args:
client: HTTP client for making auth requests.
headers: Current request headers.
Returns:
Unchanged headers (Digest auth handled by httpx auth flow).
"""
return headers
def configure_client(self, client: httpx.AsyncClient) -> None:
"""Configure client with Digest auth.
Args:
client: HTTP client to configure with Digest authentication.
"""
client.auth = DigestAuth(self.username, self.password)
class APIKeyAuth(AuthScheme):
"""API Key authentication (header, query, or cookie).
Attributes:
api_key: API key value for authentication.
location: Where to send the API key (header, query, or cookie).
name: Parameter name for the API key (default: X-API-Key).
"""
api_key: str = Field(description="API key value")
location: Literal["header", "query", "cookie"] = Field(
default="header", description="Where to send the API key"
)
name: str = Field(default="X-API-Key", description="Parameter name for the API key")
async def apply_auth(
self, client: httpx.AsyncClient, headers: MutableMapping[str, str]
) -> MutableMapping[str, str]:
"""Apply API key authentication.
Args:
client: HTTP client for making auth requests.
headers: Current request headers.
Returns:
Updated headers with API key (for header/cookie locations).
"""
if self.location == "header":
headers[self.name] = self.api_key
elif self.location == "cookie":
headers["Cookie"] = f"{self.name}={self.api_key}"
return headers
def configure_client(self, client: httpx.AsyncClient) -> None:
"""Configure client for query param API keys.
Args:
client: HTTP client to configure with query param API key hook.
"""
if self.location == "query":
async def _add_api_key_param(request: httpx.Request) -> None:
url = httpx.URL(request.url)
request.url = url.copy_add_param(self.name, self.api_key)
client.event_hooks["request"].append(_add_api_key_param)
class OAuth2ClientCredentials(AuthScheme):
"""OAuth2 Client Credentials flow authentication.
Attributes:
token_url: OAuth2 token endpoint URL.
client_id: OAuth2 client identifier.
client_secret: OAuth2 client secret.
scopes: List of required OAuth2 scopes.
"""
token_url: str = Field(description="OAuth2 token endpoint")
client_id: str = Field(description="OAuth2 client ID")
client_secret: str = Field(description="OAuth2 client secret")
scopes: list[str] = Field(
default_factory=list, description="Required OAuth2 scopes"
)
_access_token: str | None = PrivateAttr(default=None)
_token_expires_at: float | None = PrivateAttr(default=None)
async def apply_auth(
self, client: httpx.AsyncClient, headers: MutableMapping[str, str]
) -> MutableMapping[str, str]:
"""Apply OAuth2 access token to Authorization header.
Args:
client: HTTP client for making token requests.
headers: Current request headers.
Returns:
Updated headers with OAuth2 access token in Authorization header.
"""
if (
self._access_token is None
or self._token_expires_at is None
or time.time() >= self._token_expires_at
):
await self._fetch_token(client)
if self._access_token:
headers["Authorization"] = f"Bearer {self._access_token}"
return headers
async def _fetch_token(self, client: httpx.AsyncClient) -> None:
"""Fetch OAuth2 access token using client credentials flow.
Args:
client: HTTP client for making token request.
Raises:
httpx.HTTPStatusError: If token request fails.
"""
data = {
"grant_type": "client_credentials",
"client_id": self.client_id,
"client_secret": self.client_secret,
}
if self.scopes:
data["scope"] = " ".join(self.scopes)
response = await client.post(self.token_url, data=data)
response.raise_for_status()
token_data = response.json()
self._access_token = token_data["access_token"]
expires_in = token_data.get("expires_in", 3600)
self._token_expires_at = time.time() + expires_in - 60
class OAuth2AuthorizationCode(AuthScheme):
"""OAuth2 Authorization Code flow authentication.
Note: Requires interactive authorization.
Attributes:
authorization_url: OAuth2 authorization endpoint URL.
token_url: OAuth2 token endpoint URL.
client_id: OAuth2 client identifier.
client_secret: OAuth2 client secret.
redirect_uri: OAuth2 redirect URI for callback.
scopes: List of required OAuth2 scopes.
"""
authorization_url: str = Field(description="OAuth2 authorization endpoint")
token_url: str = Field(description="OAuth2 token endpoint")
client_id: str = Field(description="OAuth2 client ID")
client_secret: str = Field(description="OAuth2 client secret")
redirect_uri: str = Field(description="OAuth2 redirect URI")
scopes: list[str] = Field(
default_factory=list, description="Required OAuth2 scopes"
)
_access_token: str | None = PrivateAttr(default=None)
_refresh_token: str | None = PrivateAttr(default=None)
_token_expires_at: float | None = PrivateAttr(default=None)
_authorization_callback: Callable[[str], Awaitable[str]] | None = PrivateAttr(
default=None
)
def set_authorization_callback(
self, callback: Callable[[str], Awaitable[str]] | None
) -> None:
"""Set callback to handle authorization URL.
Args:
callback: Async function that receives authorization URL and returns auth code.
"""
self._authorization_callback = callback
async def apply_auth(
self, client: httpx.AsyncClient, headers: MutableMapping[str, str]
) -> MutableMapping[str, str]:
"""Apply OAuth2 access token to Authorization header.
Args:
client: HTTP client for making token requests.
headers: Current request headers.
Returns:
Updated headers with OAuth2 access token in Authorization header.
Raises:
ValueError: If authorization callback is not set.
"""
if self._access_token is None:
if self._authorization_callback is None:
msg = "Authorization callback not set. Use set_authorization_callback()"
raise ValueError(msg)
await self._fetch_initial_token(client)
elif self._token_expires_at and time.time() >= self._token_expires_at:
await self._refresh_access_token(client)
if self._access_token:
headers["Authorization"] = f"Bearer {self._access_token}"
return headers
async def _fetch_initial_token(self, client: httpx.AsyncClient) -> None:
"""Fetch initial access token using authorization code flow.
Args:
client: HTTP client for making token request.
Raises:
ValueError: If authorization callback is not set.
httpx.HTTPStatusError: If token request fails.
"""
params = {
"response_type": "code",
"client_id": self.client_id,
"redirect_uri": self.redirect_uri,
"scope": " ".join(self.scopes),
}
auth_url = f"{self.authorization_url}?{urllib.parse.urlencode(params)}"
if self._authorization_callback is None:
msg = "Authorization callback not set"
raise ValueError(msg)
auth_code = await self._authorization_callback(auth_url)
data = {
"grant_type": "authorization_code",
"code": auth_code,
"client_id": self.client_id,
"client_secret": self.client_secret,
"redirect_uri": self.redirect_uri,
}
response = await client.post(self.token_url, data=data)
response.raise_for_status()
token_data = response.json()
self._access_token = token_data["access_token"]
self._refresh_token = token_data.get("refresh_token")
expires_in = token_data.get("expires_in", 3600)
self._token_expires_at = time.time() + expires_in - 60
async def _refresh_access_token(self, client: httpx.AsyncClient) -> None:
"""Refresh the access token using refresh token.
Args:
client: HTTP client for making token request.
Raises:
httpx.HTTPStatusError: If token refresh request fails.
"""
if not self._refresh_token:
await self._fetch_initial_token(client)
return
data = {
"grant_type": "refresh_token",
"refresh_token": self._refresh_token,
"client_id": self.client_id,
"client_secret": self.client_secret,
}
response = await client.post(self.token_url, data=data)
response.raise_for_status()
token_data = response.json()
self._access_token = token_data["access_token"]
if "refresh_token" in token_data:
self._refresh_token = token_data["refresh_token"]
expires_in = token_data.get("expires_in", 3600)
self._token_expires_at = time.time() + expires_in - 60
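All of the schemes above implement the same apply_auth contract, so a minimal sketch of exercising two of them outside the A2A client (the token, client ID/secret, and token URL are placeholders; the OAuth2 call needs a reachable token endpoint to actually succeed):

import asyncio
import httpx
from crewai.a2a.auth.schemas import BearerTokenAuth, OAuth2ClientCredentials

async def main() -> None:
    async with httpx.AsyncClient() as client:
        # Static bearer token: apply_auth only mutates the headers mapping.
        bearer = BearerTokenAuth(token="example-token")
        headers = await bearer.apply_auth(client, {})
        print(headers["Authorization"])  # Bearer example-token

        # Client-credentials flow: the first apply_auth fetches a token and caches it,
        # refreshing 60 seconds before expiry (see _fetch_token above).
        oauth = OAuth2ClientCredentials(
            token_url="https://auth.example.com/oauth/token",  # placeholder endpoint
            client_id="example-client-id",
            client_secret="example-client-secret",
            scopes=["agent:read"],
        )
        headers = await oauth.apply_auth(client, {"Accept": "application/json"})
        print(headers.get("Authorization"))

asyncio.run(main())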

View File

@@ -0,0 +1,236 @@
"""Authentication utilities for A2A protocol agent communication.
Provides validation and retry logic for various authentication schemes including
OAuth2, API keys, and HTTP authentication methods.
"""
import asyncio
from collections.abc import Awaitable, Callable, MutableMapping
import re
from typing import Final
from a2a.client.errors import A2AClientHTTPError
from a2a.types import (
APIKeySecurityScheme,
AgentCard,
HTTPAuthSecurityScheme,
OAuth2SecurityScheme,
)
from httpx import AsyncClient, Response
from crewai.a2a.auth.schemas import (
APIKeyAuth,
AuthScheme,
BearerTokenAuth,
HTTPBasicAuth,
HTTPDigestAuth,
OAuth2AuthorizationCode,
OAuth2ClientCredentials,
)
_auth_store: dict[int, AuthScheme | None] = {}
_SCHEME_PATTERN: Final[re.Pattern[str]] = re.compile(r"(\w+)\s+(.+?)(?=,\s*\w+\s+|$)")
_PARAM_PATTERN: Final[re.Pattern[str]] = re.compile(r'(\w+)=(?:"([^"]*)"|([^\s,]+))')
_SCHEME_AUTH_MAPPING: Final[dict[type, tuple[type[AuthScheme], ...]]] = {
OAuth2SecurityScheme: (
OAuth2ClientCredentials,
OAuth2AuthorizationCode,
BearerTokenAuth,
),
APIKeySecurityScheme: (APIKeyAuth,),
}
_HTTP_SCHEME_MAPPING: Final[dict[str, type[AuthScheme]]] = {
"basic": HTTPBasicAuth,
"digest": HTTPDigestAuth,
"bearer": BearerTokenAuth,
}
def _raise_auth_mismatch(
expected_classes: type[AuthScheme] | tuple[type[AuthScheme], ...],
provided_auth: AuthScheme,
) -> None:
"""Raise authentication mismatch error.
Args:
expected_classes: Expected authentication class or tuple of classes.
provided_auth: Actually provided authentication instance.
Raises:
A2AClientHTTPError: Always raises with 401 status code.
"""
if isinstance(expected_classes, tuple):
if len(expected_classes) == 1:
required = expected_classes[0].__name__
else:
names = [cls.__name__ for cls in expected_classes]
required = f"one of ({', '.join(names)})"
else:
required = expected_classes.__name__
msg = (
f"AgentCard requires {required} authentication, "
f"but {type(provided_auth).__name__} was provided"
)
raise A2AClientHTTPError(401, msg)
def parse_www_authenticate(header_value: str) -> dict[str, dict[str, str]]:
"""Parse WWW-Authenticate header into auth challenges.
Args:
header_value: The WWW-Authenticate header value.
Returns:
Dictionary mapping auth scheme to its parameters.
Example: {"Bearer": {"realm": "api", "scope": "read write"}}
"""
if not header_value:
return {}
challenges: dict[str, dict[str, str]] = {}
for match in _SCHEME_PATTERN.finditer(header_value):
scheme = match.group(1)
params_str = match.group(2)
params: dict[str, str] = {}
for param_match in _PARAM_PATTERN.finditer(params_str):
key = param_match.group(1)
value = param_match.group(2) or param_match.group(3)
params[key] = value
challenges[scheme] = params
return challenges
def validate_auth_against_agent_card(
agent_card: AgentCard, auth: AuthScheme | None
) -> None:
"""Validate that provided auth matches AgentCard security requirements.
Args:
agent_card: The A2A AgentCard containing security requirements.
auth: User-provided authentication scheme (or None).
Raises:
A2AClientHTTPError: If auth doesn't match AgentCard requirements (status_code=401).
"""
if not agent_card.security or not agent_card.security_schemes:
return
if not auth:
msg = "AgentCard requires authentication but no auth scheme provided"
raise A2AClientHTTPError(401, msg)
first_security_req = agent_card.security[0] if agent_card.security else {}
for scheme_name in first_security_req.keys():
security_scheme_wrapper = agent_card.security_schemes.get(scheme_name)
if not security_scheme_wrapper:
continue
scheme = security_scheme_wrapper.root
if allowed_classes := _SCHEME_AUTH_MAPPING.get(type(scheme)):
if not isinstance(auth, allowed_classes):
_raise_auth_mismatch(allowed_classes, auth)
return
if isinstance(scheme, HTTPAuthSecurityScheme):
if required_class := _HTTP_SCHEME_MAPPING.get(scheme.scheme.lower()):
if not isinstance(auth, required_class):
_raise_auth_mismatch(required_class, auth)
return
msg = "Could not validate auth against AgentCard security requirements"
raise A2AClientHTTPError(401, msg)
async def retry_on_401(
request_func: Callable[[], Awaitable[Response]],
auth_scheme: AuthScheme | None,
client: AsyncClient,
headers: MutableMapping[str, str],
max_retries: int = 3,
) -> Response:
"""Retry a request on 401 authentication error.
Handles 401 errors by:
1. Parsing WWW-Authenticate header
2. Re-acquiring credentials
3. Retrying the request
Args:
request_func: Async function that makes the HTTP request.
auth_scheme: Authentication scheme to refresh credentials with.
client: HTTP client for making requests.
headers: Request headers to update with new auth.
max_retries: Maximum number of retry attempts (default: 3).
Returns:
HTTP response from the request.
Raises:
httpx.HTTPStatusError: If retries are exhausted or auth scheme is None.
"""
last_response: Response | None = None
last_challenges: dict[str, dict[str, str]] = {}
for attempt in range(max_retries):
response = await request_func()
if response.status_code != 401:
return response
last_response = response
if auth_scheme is None:
response.raise_for_status()
return response
www_authenticate = response.headers.get("WWW-Authenticate", "")
challenges = parse_www_authenticate(www_authenticate)
last_challenges = challenges
if attempt >= max_retries - 1:
break
backoff_time = 2**attempt
await asyncio.sleep(backoff_time)
await auth_scheme.apply_auth(client, headers)
if last_response:
last_response.raise_for_status()
return last_response
msg = "retry_on_401 failed without making any requests"
if last_challenges:
challenge_info = ", ".join(
f"{scheme} (realm={params.get('realm', 'N/A')})"
for scheme, params in last_challenges.items()
)
msg = f"{msg}. Server challenges: {challenge_info}"
raise RuntimeError(msg)
def configure_auth_client(
auth: HTTPDigestAuth | APIKeyAuth, client: AsyncClient
) -> None:
"""Configure HTTP client with auth-specific settings.
Only HTTPDigestAuth and APIKeyAuth need client configuration.
Args:
auth: Authentication scheme that requires client configuration.
client: HTTP client to configure.
"""
auth.configure_client(client)
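parse_www_authenticate is a pure helper, so a quick sketch of the challenge structure it produces (the header value below is made up, mirroring the docstring example):

from crewai.a2a.auth.utils import parse_www_authenticate

header = 'Bearer realm="api", scope="read write", Basic realm="fallback"'
challenges = parse_www_authenticate(header)
print(challenges)
# {'Bearer': {'realm': 'api', 'scope': 'read write'}, 'Basic': {'realm': 'fallback'}}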

View File

@@ -0,0 +1,59 @@
"""A2A configuration types.
This module is separate from experimental.a2a to avoid circular imports.
"""
from __future__ import annotations
from typing import Annotated
from pydantic import (
BaseModel,
BeforeValidator,
Field,
HttpUrl,
TypeAdapter,
)
from crewai.a2a.auth.schemas import AuthScheme
http_url_adapter = TypeAdapter(HttpUrl)
Url = Annotated[
str,
BeforeValidator(
lambda value: str(http_url_adapter.validate_python(value, strict=True))
),
]
class A2AConfig(BaseModel):
"""Configuration for A2A protocol integration.
Attributes:
endpoint: A2A agent endpoint URL.
auth: Authentication scheme (Bearer, OAuth2, API Key, HTTP Basic/Digest).
timeout: Request timeout in seconds (default: 120).
max_turns: Maximum conversation turns with A2A agent (default: 10).
response_model: Optional Pydantic model for structured A2A agent responses.
fail_fast: If True, raise error when agent unreachable; if False, skip and continue (default: True).
"""
endpoint: Url = Field(description="A2A agent endpoint URL")
auth: AuthScheme | None = Field(
default=None,
description="Authentication scheme (Bearer, OAuth2, API Key, HTTP Basic/Digest)",
)
timeout: int = Field(default=120, description="Request timeout in seconds")
max_turns: int = Field(
default=10, description="Maximum conversation turns with A2A agent"
)
response_model: type[BaseModel] | None = Field(
default=None,
description="Optional Pydantic model for structured A2A agent responses. When specified, the A2A agent is expected to return JSON matching this schema.",
)
fail_fast: bool = Field(
default=True,
description="If True, raise an error immediately when the A2A agent is unreachable. If False, skip the A2A agent and continue execution.",
)
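A minimal sketch of building this config (the endpoint URL and token are placeholders, and CityWeather is a hypothetical response schema, not part of this PR):

from pydantic import BaseModel
from crewai.a2a.auth.schemas import BearerTokenAuth
from crewai.a2a.config import A2AConfig

class CityWeather(BaseModel):
    # Hypothetical structured response expected from the remote agent.
    city: str
    forecast: str

config = A2AConfig(
    endpoint="https://agent.example.com/.well-known/agent-card.json",  # placeholder
    auth=BearerTokenAuth(token="example-token"),
    timeout=60,
    max_turns=5,
    response_model=CityWeather,
    fail_fast=False,
)
print(config.endpoint, config.timeout)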

View File

@@ -0,0 +1,29 @@
"""String templates for A2A (Agent-to-Agent) protocol messaging and status."""
from string import Template
from typing import Final
AVAILABLE_AGENTS_TEMPLATE: Final[Template] = Template(
"\n<AVAILABLE_A2A_AGENTS>\n $available_a2a_agents\n</AVAILABLE_A2A_AGENTS>\n"
)
PREVIOUS_A2A_CONVERSATION_TEMPLATE: Final[Template] = Template(
"\n<PREVIOUS_A2A_CONVERSATION>\n"
" $previous_a2a_conversation"
"\n</PREVIOUS_A2A_CONVERSATION>\n"
)
CONVERSATION_TURN_INFO_TEMPLATE: Final[Template] = Template(
"\n<CONVERSATION_PROGRESS>\n"
' turn="$turn_count"\n'
' max_turns="$max_turns"\n'
" $warning"
"\n</CONVERSATION_PROGRESS>\n"
)
UNAVAILABLE_AGENTS_NOTICE_TEMPLATE: Final[Template] = Template(
"\n<A2A_AGENTS_STATUS>\n"
" NOTE: A2A agents were configured but are currently unavailable.\n"
" You cannot delegate to remote agents for this task.\n\n"
" Unavailable Agents:\n"
" $unavailable_agents"
"\n</A2A_AGENTS_STATUS>\n"
)
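These are plain string.Template objects, so filling one in is a single substitute call (the values are illustrative, and the import path assumes this file lives at crewai/a2a/templates.py, which the diff itself does not show):

from crewai.a2a.templates import CONVERSATION_TURN_INFO_TEMPLATE

progress = CONVERSATION_TURN_INFO_TEMPLATE.substitute(
    turn_count=3,
    max_turns=10,
    warning="Only 7 turns remain before the conversation is cut off.",  # illustrative
)
print(progress)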

View File

@@ -0,0 +1,38 @@
"""Type definitions for A2A protocol message parts."""
from typing import Any, Literal, Protocol, TypedDict, runtime_checkable
from typing_extensions import NotRequired
@runtime_checkable
class AgentResponseProtocol(Protocol):
"""Protocol for the dynamically created AgentResponse model."""
a2a_ids: tuple[str, ...]
message: str
is_a2a: bool
class PartsMetadataDict(TypedDict, total=False):
"""Metadata for A2A message parts.
Attributes:
mimeType: MIME type for the part content.
schema: JSON schema for the part content.
"""
mimeType: Literal["application/json"]
schema: dict[str, Any]
class PartsDict(TypedDict):
"""A2A message part containing text and optional metadata.
Attributes:
text: The text content of the message part.
metadata: Optional metadata describing the part content.
"""
text: str
metadata: NotRequired[PartsMetadataDict]
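Since PartsDict and PartsMetadataDict are plain TypedDicts, shaping a structured-output message part is just building dicts (the JSON schema below is illustrative):

from crewai.a2a.types import PartsDict, PartsMetadataDict

metadata: PartsMetadataDict = {
    "mimeType": "application/json",
    "schema": {"type": "object", "properties": {"answer": {"type": "string"}}},
}
part: PartsDict = {
    "text": "Respond with JSON matching the attached schema.",
    "metadata": metadata,
}
print(part["text"], part["metadata"]["mimeType"])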

View File

@@ -0,0 +1,755 @@
"""Utility functions for A2A (Agent-to-Agent) protocol delegation."""
from __future__ import annotations
import asyncio
from collections.abc import AsyncIterator, MutableMapping
from contextlib import asynccontextmanager
from functools import lru_cache
import time
from typing import TYPE_CHECKING, Any
import uuid
from a2a.client import Client, ClientConfig, ClientFactory
from a2a.client.errors import A2AClientHTTPError
from a2a.types import (
AgentCard,
Message,
Part,
Role,
TaskArtifactUpdateEvent,
TaskState,
TaskStatusUpdateEvent,
TextPart,
TransportProtocol,
)
import httpx
from pydantic import BaseModel, Field, create_model
from crewai.a2a.auth.schemas import APIKeyAuth, HTTPDigestAuth
from crewai.a2a.auth.utils import (
_auth_store,
configure_auth_client,
retry_on_401,
validate_auth_against_agent_card,
)
from crewai.a2a.config import A2AConfig
from crewai.a2a.types import PartsDict, PartsMetadataDict
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import (
A2AConversationStartedEvent,
A2ADelegationCompletedEvent,
A2ADelegationStartedEvent,
A2AMessageSentEvent,
A2AResponseReceivedEvent,
)
from crewai.types.utils import create_literals_from_strings
if TYPE_CHECKING:
from a2a.types import Message, Task as A2ATask
from crewai.a2a.auth.schemas import AuthScheme
@lru_cache()
def _fetch_agent_card_cached(
endpoint: str,
auth_hash: int,
timeout: int,
_ttl_hash: int,
) -> AgentCard:
"""Cached version of fetch_agent_card with auth support.
Args:
endpoint: A2A agent endpoint URL
auth_hash: Hash of the auth object
timeout: Request timeout
_ttl_hash: Time-based hash for cache invalidation (unused in body)
Returns:
Cached AgentCard
"""
auth = _auth_store.get(auth_hash)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
return loop.run_until_complete(
_fetch_agent_card_async(endpoint=endpoint, auth=auth, timeout=timeout)
)
finally:
loop.close()
def fetch_agent_card(
endpoint: str,
auth: AuthScheme | None = None,
timeout: int = 30,
use_cache: bool = True,
cache_ttl: int = 300,
) -> AgentCard:
"""Fetch AgentCard from an A2A endpoint with optional caching.
Args:
endpoint: A2A agent endpoint URL (AgentCard URL)
auth: Optional AuthScheme for authentication
timeout: Request timeout in seconds
use_cache: Whether to use caching (default True)
cache_ttl: Cache TTL in seconds (default 300 = 5 minutes)
Returns:
AgentCard object with agent capabilities and skills
Raises:
httpx.HTTPStatusError: If the request fails
A2AClientHTTPError: If authentication fails
"""
if use_cache:
auth_hash = hash((type(auth).__name__, id(auth))) if auth else 0
_auth_store[auth_hash] = auth
ttl_hash = int(time.time() // cache_ttl)
return _fetch_agent_card_cached(endpoint, auth_hash, timeout, ttl_hash)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
return loop.run_until_complete(
_fetch_agent_card_async(endpoint=endpoint, auth=auth, timeout=timeout)
)
finally:
loop.close()
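# Hedged usage sketch (not part of this module): fetch_agent_card is the synchronous
# entry point, so a caller outside any event loop could do the following, assuming the
# module is importable as crewai.a2a.utils and that the endpoint URL and token below
# are placeholders:
#
#     from crewai.a2a.auth.schemas import BearerTokenAuth
#     from crewai.a2a.utils import fetch_agent_card
#
#     card = fetch_agent_card(
#         "https://agent.example.com/.well-known/agent-card.json",
#         auth=BearerTokenAuth(token="example-token"),
#         timeout=30,
#         use_cache=True,
#     )
#     print(card.name)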
async def _fetch_agent_card_async(
endpoint: str,
auth: AuthScheme | None,
timeout: int,
) -> AgentCard:
"""Async implementation of AgentCard fetching.
Args:
endpoint: A2A agent endpoint URL
auth: Optional AuthScheme for authentication
timeout: Request timeout in seconds
Returns:
AgentCard object
"""
if "/.well-known/agent-card.json" in endpoint:
base_url = endpoint.replace("/.well-known/agent-card.json", "")
agent_card_path = "/.well-known/agent-card.json"
else:
url_parts = endpoint.split("/", 3)
base_url = f"{url_parts[0]}//{url_parts[2]}"
agent_card_path = f"/{url_parts[3]}" if len(url_parts) > 3 else "/"
headers: MutableMapping[str, str] = {}
if auth:
async with httpx.AsyncClient(timeout=timeout) as temp_auth_client:
if isinstance(auth, (HTTPDigestAuth, APIKeyAuth)):
configure_auth_client(auth, temp_auth_client)
headers = await auth.apply_auth(temp_auth_client, {})
async with httpx.AsyncClient(timeout=timeout, headers=headers) as temp_client:
if auth and isinstance(auth, (HTTPDigestAuth, APIKeyAuth)):
configure_auth_client(auth, temp_client)
agent_card_url = f"{base_url}{agent_card_path}"
async def _fetch_agent_card_request() -> httpx.Response:
return await temp_client.get(agent_card_url)
try:
response = await retry_on_401(
request_func=_fetch_agent_card_request,
auth_scheme=auth,
client=temp_client,
headers=temp_client.headers,
max_retries=2,
)
response.raise_for_status()
return AgentCard.model_validate(response.json())
except httpx.HTTPStatusError as e:
if e.response.status_code == 401:
error_details = ["Authentication failed"]
www_auth = e.response.headers.get("WWW-Authenticate")
if www_auth:
error_details.append(f"WWW-Authenticate: {www_auth}")
if not auth:
error_details.append("No auth scheme provided")
msg = " | ".join(error_details)
raise A2AClientHTTPError(401, msg) from e
raise
def execute_a2a_delegation(
endpoint: str,
auth: AuthScheme | None,
timeout: int,
task_description: str,
context: str | None = None,
context_id: str | None = None,
task_id: str | None = None,
reference_task_ids: list[str] | None = None,
metadata: dict[str, Any] | None = None,
extensions: dict[str, Any] | None = None,
conversation_history: list[Message] | None = None,
agent_id: str | None = None,
agent_role: Role | None = None,
agent_branch: Any | None = None,
response_model: type[BaseModel] | None = None,
turn_number: int | None = None,
) -> dict[str, Any]:
"""Execute a task delegation to a remote A2A agent with multi-turn support.
Handles:
- AgentCard discovery
- Authentication setup
- Message creation and sending
- Response parsing
- Multi-turn conversations
Args:
endpoint: A2A agent endpoint URL (AgentCard URL)
auth: Optional AuthScheme for authentication (Bearer, OAuth2, API Key, HTTP Basic/Digest)
timeout: Request timeout in seconds
task_description: The task to delegate
context: Optional context information
context_id: Context ID for correlating messages/tasks
task_id: Specific task identifier
reference_task_ids: List of related task IDs
metadata: Additional metadata (external_id, request_id, etc.)
extensions: Protocol extensions for custom fields
conversation_history: Previous Message objects from conversation
agent_id: Agent identifier for logging
agent_role: Role of the CrewAI agent delegating the task
agent_branch: Optional agent tree branch for logging
response_model: Optional Pydantic model for structured outputs
turn_number: Optional turn number for multi-turn conversations
Returns:
Dictionary with:
- status: "completed", "input_required", "failed", etc.
- result: Result string (if completed)
- error: Error message (if failed)
- history: List of new Message objects from this exchange
Raises:
ImportError: If a2a-sdk is not installed
"""
is_multiturn = bool(conversation_history and len(conversation_history) > 0)
if turn_number is None:
turn_number = (
len([m for m in (conversation_history or []) if m.role == Role.user]) + 1
)
crewai_event_bus.emit(
agent_branch,
A2ADelegationStartedEvent(
endpoint=endpoint,
task_description=task_description,
agent_id=agent_id,
is_multiturn=is_multiturn,
turn_number=turn_number,
),
)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
result = loop.run_until_complete(
_execute_a2a_delegation_async(
endpoint=endpoint,
auth=auth,
timeout=timeout,
task_description=task_description,
context=context,
context_id=context_id,
task_id=task_id,
reference_task_ids=reference_task_ids,
metadata=metadata,
extensions=extensions,
conversation_history=conversation_history or [],
is_multiturn=is_multiturn,
turn_number=turn_number,
agent_branch=agent_branch,
agent_id=agent_id,
agent_role=agent_role,
response_model=response_model,
)
)
crewai_event_bus.emit(
agent_branch,
A2ADelegationCompletedEvent(
status=result["status"],
result=result.get("result"),
error=result.get("error"),
is_multiturn=is_multiturn,
),
)
return result
finally:
loop.close()
async def _execute_a2a_delegation_async(
endpoint: str,
auth: AuthScheme | None,
timeout: int,
task_description: str,
context: str | None,
context_id: str | None,
task_id: str | None,
reference_task_ids: list[str] | None,
metadata: dict[str, Any] | None,
extensions: dict[str, Any] | None,
conversation_history: list[Message],
is_multiturn: bool = False,
turn_number: int = 1,
agent_branch: Any | None = None,
agent_id: str | None = None,
agent_role: str | None = None,
response_model: type[BaseModel] | None = None,
) -> dict[str, Any]:
"""Async implementation of A2A delegation with multi-turn support.
Args:
endpoint: A2A agent endpoint URL
auth: Optional AuthScheme for authentication
timeout: Request timeout in seconds
task_description: Task to delegate
context: Optional context
context_id: Context ID for correlation
task_id: Specific task identifier
reference_task_ids: Related task IDs
metadata: Additional metadata
extensions: Protocol extensions
conversation_history: Previous Message objects
is_multiturn: Whether this is a multi-turn conversation
turn_number: Current turn number
agent_branch: Agent tree branch for logging
agent_id: Agent identifier for logging
agent_role: Agent role for logging
response_model: Optional Pydantic model for structured outputs
Returns:
Dictionary with status, result/error, and new history
"""
agent_card = await _fetch_agent_card_async(endpoint, auth, timeout)
validate_auth_against_agent_card(agent_card, auth)
headers: MutableMapping[str, str] = {}
if auth:
async with httpx.AsyncClient(timeout=timeout) as temp_auth_client:
if isinstance(auth, (HTTPDigestAuth, APIKeyAuth)):
configure_auth_client(auth, temp_auth_client)
headers = await auth.apply_auth(temp_auth_client, {})
a2a_agent_name = None
if agent_card.name:
a2a_agent_name = agent_card.name
if turn_number == 1:
agent_id_for_event = agent_id or endpoint
crewai_event_bus.emit(
agent_branch,
A2AConversationStartedEvent(
agent_id=agent_id_for_event,
endpoint=endpoint,
a2a_agent_name=a2a_agent_name,
),
)
message_parts = []
if context:
message_parts.append(f"Context:\n{context}\n\n")
message_parts.append(f"{task_description}")
message_text = "".join(message_parts)
if is_multiturn and conversation_history and not task_id:
if first_task_id := conversation_history[0].task_id:
task_id = first_task_id
parts: PartsDict = {"text": message_text}
if response_model:
parts.update(
{
"metadata": PartsMetadataDict(
mimeType="application/json",
schema=response_model.model_json_schema(),
)
}
)
message = Message(
role=Role.user,
message_id=str(uuid.uuid4()),
parts=[Part(root=TextPart(**parts))],
context_id=context_id,
task_id=task_id,
reference_task_ids=reference_task_ids,
metadata=metadata,
extensions=extensions,
)
transport_protocol = TransportProtocol("JSONRPC")
new_messages: list[Message] = [*conversation_history, message]
crewai_event_bus.emit(
None,
A2AMessageSentEvent(
message=message_text,
turn_number=turn_number,
is_multiturn=is_multiturn,
agent_role=agent_role,
),
)
async with _create_a2a_client(
agent_card=agent_card,
transport_protocol=transport_protocol,
timeout=timeout,
headers=headers,
streaming=True,
auth=auth,
) as client:
result_parts: list[str] = []
final_result: dict[str, Any] | None = None
event_stream = client.send_message(message)
try:
async for event in event_stream:
if isinstance(event, Message):
new_messages.append(event)
for part in event.parts:
if part.root.kind == "text":
text = part.root.text
result_parts.append(text)
elif isinstance(event, tuple):
a2a_task, update = event
if isinstance(update, TaskArtifactUpdateEvent):
artifact = update.artifact
result_parts.extend(
part.root.text
for part in artifact.parts
if part.root.kind == "text"
)
is_final_update = False
if isinstance(update, TaskStatusUpdateEvent):
is_final_update = update.final
if not is_final_update and a2a_task.status.state not in [
TaskState.completed,
TaskState.input_required,
TaskState.failed,
TaskState.rejected,
TaskState.auth_required,
TaskState.canceled,
]:
continue
if a2a_task.status.state == TaskState.completed:
extracted_parts = _extract_task_result_parts(a2a_task)
result_parts.extend(extracted_parts)
if a2a_task.history:
new_messages.extend(a2a_task.history)
response_text = " ".join(result_parts) if result_parts else ""
crewai_event_bus.emit(
None,
A2AResponseReceivedEvent(
response=response_text,
turn_number=turn_number,
is_multiturn=is_multiturn,
status="completed",
agent_role=agent_role,
),
)
final_result = {
"status": "completed",
"result": response_text,
"history": new_messages,
"agent_card": agent_card,
}
break
if a2a_task.status.state == TaskState.input_required:
if a2a_task.history:
new_messages.extend(a2a_task.history)
response_text = _extract_error_message(
a2a_task, "Additional input required"
)
if response_text and not a2a_task.history:
agent_message = Message(
role=Role.agent,
message_id=str(uuid.uuid4()),
parts=[Part(root=TextPart(text=response_text))],
context_id=a2a_task.context_id
if hasattr(a2a_task, "context_id")
else None,
task_id=a2a_task.task_id
if hasattr(a2a_task, "task_id")
else None,
)
new_messages.append(agent_message)
crewai_event_bus.emit(
None,
A2AResponseReceivedEvent(
response=response_text,
turn_number=turn_number,
is_multiturn=is_multiturn,
status="input_required",
agent_role=agent_role,
),
)
final_result = {
"status": "input_required",
"error": response_text,
"history": new_messages,
"agent_card": agent_card,
}
break
if a2a_task.status.state in [TaskState.failed, TaskState.rejected]:
error_msg = _extract_error_message(
a2a_task, "Task failed without error message"
)
if a2a_task.history:
new_messages.extend(a2a_task.history)
final_result = {
"status": "failed",
"error": error_msg,
"history": new_messages,
}
break
if a2a_task.status.state == TaskState.auth_required:
error_msg = _extract_error_message(
a2a_task, "Authentication required"
)
final_result = {
"status": "auth_required",
"error": error_msg,
"history": new_messages,
}
break
if a2a_task.status.state == TaskState.canceled:
error_msg = _extract_error_message(
a2a_task, "Task was canceled"
)
final_result = {
"status": "canceled",
"error": error_msg,
"history": new_messages,
}
break
except Exception as e:
current_exception: Exception | BaseException | None = e
while current_exception:
if hasattr(current_exception, "response"):
response = current_exception.response
if hasattr(response, "text"):
break
if current_exception and hasattr(current_exception, "__cause__"):
current_exception = current_exception.__cause__
raise
finally:
if hasattr(event_stream, "aclose"):
await event_stream.aclose()
if final_result:
return final_result
return {
"status": "completed",
"result": " ".join(result_parts) if result_parts else "",
"history": new_messages,
}
@asynccontextmanager
async def _create_a2a_client(
agent_card: AgentCard,
transport_protocol: TransportProtocol,
timeout: int,
headers: MutableMapping[str, str],
streaming: bool,
auth: AuthScheme | None = None,
) -> AsyncIterator[Client]:
"""Create and configure an A2A client.
Args:
agent_card: The A2A agent card
transport_protocol: Transport protocol to use
timeout: Request timeout in seconds
headers: HTTP headers (already with auth applied)
streaming: Enable streaming responses
auth: Optional AuthScheme for client configuration
Yields:
Configured A2A client instance
"""
async with httpx.AsyncClient(
timeout=timeout,
headers=headers,
) as httpx_client:
if auth and isinstance(auth, (HTTPDigestAuth, APIKeyAuth)):
configure_auth_client(auth, httpx_client)
config = ClientConfig(
httpx_client=httpx_client,
supported_transports=[str(transport_protocol.value)],
streaming=streaming,
accepted_output_modes=["application/json"],
)
factory = ClientFactory(config)
client = factory.create(agent_card)
yield client
def _extract_task_result_parts(a2a_task: A2ATask) -> list[str]:
"""Extract result parts from A2A task history and artifacts.
Args:
a2a_task: A2A Task object with history and artifacts
Returns:
List of result text parts
"""
result_parts: list[str] = []
if a2a_task.history:
for history_msg in reversed(a2a_task.history):
if history_msg.role == Role.agent:
result_parts.extend(
part.root.text
for part in history_msg.parts
if part.root.kind == "text"
)
break
if a2a_task.artifacts:
result_parts.extend(
part.root.text
for artifact in a2a_task.artifacts
for part in artifact.parts
if part.root.kind == "text"
)
return result_parts
def _extract_error_message(a2a_task: A2ATask, default: str) -> str:
"""Extract error message from A2A task.
Args:
a2a_task: A2A Task object
default: Default message if no error found
Returns:
Error message string
"""
if a2a_task.status and a2a_task.status.message:
msg = a2a_task.status.message
if msg:
for part in msg.parts:
if part.root.kind == "text":
return str(part.root.text)
return str(msg)
if a2a_task.history:
for history_msg in reversed(a2a_task.history):
for part in history_msg.parts:
if part.root.kind == "text":
return str(part.root.text)
return default
def create_agent_response_model(agent_ids: tuple[str, ...]) -> type[BaseModel]:
"""Create a dynamic AgentResponse model with Literal types for agent IDs.
Args:
agent_ids: Tuple of available A2A agent IDs (endpoints)
Returns:
Dynamically created Pydantic model with Literal-constrained a2a_ids field
"""
DynamicLiteral = create_literals_from_strings(agent_ids) # noqa: N806
return create_model(
"AgentResponse",
a2a_ids=(
tuple[DynamicLiteral, ...], # type: ignore[valid-type]
Field(
default_factory=tuple,
max_length=len(agent_ids),
description="A2A agent IDs to delegate to.",
),
),
message=(
str,
Field(
description="The message content. If is_a2a=true, this is sent to the A2A agent. If is_a2a=false, this is your final answer ending the conversation."
),
),
is_a2a=(
bool,
Field(
description="Set to true to continue the conversation by sending this message to the A2A agent and awaiting their response. Set to false ONLY when you are completely done and providing your final answer (not when asking questions)."
),
),
__base__=BaseModel,
)
def extract_a2a_agent_ids_from_config(
a2a_config: list[A2AConfig] | A2AConfig | None,
) -> tuple[list[A2AConfig], tuple[str, ...]]:
"""Extract A2A agent IDs from A2A configuration.
Args:
a2a_config: A2A configuration
Returns:
Tuple of (list of A2AConfig objects, tuple of agent endpoint IDs)
"""
if a2a_config is None:
return [], ()
if isinstance(a2a_config, A2AConfig):
a2a_agents = [a2a_config]
else:
a2a_agents = a2a_config
return a2a_agents, tuple(config.endpoint for config in a2a_agents)
def get_a2a_agents_and_response_model(
a2a_config: list[A2AConfig] | A2AConfig | None,
) -> tuple[list[A2AConfig], type[BaseModel]]:
"""Get A2A agent IDs and response model.
Args:
a2a_config: A2A configuration
Returns:
Tuple of (list of A2AConfig objects, dynamically created agent response model)
"""
a2a_agents, agent_ids = extract_a2a_agent_ids_from_config(a2a_config=a2a_config)
return a2a_agents, create_agent_response_model(agent_ids)
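A minimal usage sketch of the helpers in this module, assuming A2AConfig accepts these fields as keyword arguments and using an illustrative endpoint URL (only fetch_agent_card, execute_a2a_delegation, and get_a2a_agents_and_response_model are taken from the file above):
from crewai.a2a.config import A2AConfig
from crewai.a2a.utils import (
    execute_a2a_delegation,
    fetch_agent_card,
    get_a2a_agents_and_response_model,
)

# Illustrative AgentCard URL; any A2A endpoint works the same way.
config = A2AConfig(endpoint="https://remote-agent.example.com/.well-known/agent-card.json")

# Discover the remote agent's capabilities. Results are cached per endpoint;
# the ttl_hash bucket (time.time() // cache_ttl) expires entries every 5 minutes.
card = fetch_agent_card(endpoint=config.endpoint, auth=config.auth, timeout=config.timeout)
print(card.name)

# The dynamically built response model constrains a2a_ids to known endpoints.
_, AgentResponse = get_a2a_agents_and_response_model(config)
decision = AgentResponse(
    a2a_ids=(config.endpoint,),
    message="Summarize the latest release notes",
    is_a2a=True,
)

# Delegate one task and inspect the normalized result dict.
result = execute_a2a_delegation(
    endpoint=config.endpoint,
    auth=config.auth,
    timeout=config.timeout,
    task_description=decision.message,
)
if result["status"] == "completed":
    print(result["result"])
else:
    print(result.get("error"))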

View File

@@ -0,0 +1,570 @@
"""A2A agent wrapping logic for metaclass integration.
Wraps agent classes with A2A delegation capabilities.
"""
from __future__ import annotations
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import wraps
from types import MethodType
from typing import TYPE_CHECKING, Any, cast
from a2a.types import Role
from pydantic import BaseModel, ValidationError
from crewai.a2a.config import A2AConfig
from crewai.a2a.templates import (
AVAILABLE_AGENTS_TEMPLATE,
CONVERSATION_TURN_INFO_TEMPLATE,
PREVIOUS_A2A_CONVERSATION_TEMPLATE,
UNAVAILABLE_AGENTS_NOTICE_TEMPLATE,
)
from crewai.a2a.types import AgentResponseProtocol
from crewai.a2a.utils import (
execute_a2a_delegation,
fetch_agent_card,
get_a2a_agents_and_response_model,
)
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.a2a_events import (
A2AConversationCompletedEvent,
A2AMessageSentEvent,
)
if TYPE_CHECKING:
from a2a.types import AgentCard, Message
from crewai.agent.core import Agent
from crewai.task import Task
from crewai.tools.base_tool import BaseTool
def wrap_agent_with_a2a_instance(agent: Agent) -> None:
"""Wrap an agent instance's execute_task method with A2A support.
This function modifies the agent instance by wrapping its execute_task
method to add A2A delegation capabilities. Should only be called when
the agent has a2a configuration set.
Args:
agent: The agent instance to wrap
"""
original_execute_task = agent.execute_task.__func__
@wraps(original_execute_task)
def execute_task_with_a2a(
self: Agent,
task: Task,
context: str | None = None,
tools: list[BaseTool] | None = None,
) -> str:
"""Execute task with A2A delegation support.
Args:
self: The agent instance
task: The task to execute
context: Optional context for task execution
tools: Optional tools available to the agent
Returns:
Task execution result
"""
if not self.a2a:
return original_execute_task(self, task, context, tools)
a2a_agents, agent_response_model = get_a2a_agents_and_response_model(self.a2a)
return _execute_task_with_a2a(
self=self,
a2a_agents=a2a_agents,
original_fn=original_execute_task,
task=task,
agent_response_model=agent_response_model,
context=context,
tools=tools,
)
object.__setattr__(agent, "execute_task", MethodType(execute_task_with_a2a, agent))
def _fetch_card_from_config(
config: A2AConfig,
) -> tuple[A2AConfig, AgentCard | Exception]:
"""Fetch agent card from A2A config.
Args:
config: A2A configuration
Returns:
Tuple of (config, card or exception)
"""
try:
card = fetch_agent_card(
endpoint=config.endpoint,
auth=config.auth,
timeout=config.timeout,
)
return config, card
except Exception as e:
return config, e
def _fetch_agent_cards_concurrently(
a2a_agents: list[A2AConfig],
) -> tuple[dict[str, AgentCard], dict[str, str]]:
"""Fetch agent cards concurrently for multiple A2A agents.
Args:
a2a_agents: List of A2A agent configurations
Returns:
Tuple of (agent_cards dict, failed_agents dict mapping endpoint to error message)
"""
agent_cards: dict[str, AgentCard] = {}
failed_agents: dict[str, str] = {}
with ThreadPoolExecutor(max_workers=len(a2a_agents)) as executor:
futures = {
executor.submit(_fetch_card_from_config, config): config
for config in a2a_agents
}
for future in as_completed(futures):
config, result = future.result()
if isinstance(result, Exception):
if config.fail_fast:
raise RuntimeError(
f"Failed to fetch agent card from {config.endpoint}. "
f"Ensure the A2A agent is running and accessible. Error: {result}"
) from result
failed_agents[config.endpoint] = str(result)
else:
agent_cards[config.endpoint] = result
return agent_cards, failed_agents
def _execute_task_with_a2a(
self: Agent,
a2a_agents: list[A2AConfig],
original_fn: Callable[..., str],
task: Task,
agent_response_model: type[BaseModel],
context: str | None,
tools: list[BaseTool] | None,
) -> str:
"""Wrap execute_task with A2A delegation logic.
Args:
self: The agent instance
a2a_agents: List of A2A agent configurations
original_fn: The original execute_task method
task: The task to execute
context: Optional context for task execution
tools: Optional tools available to the agent
agent_response_model: Pydantic model used to parse the LLM's delegation decision
Returns:
Task execution result (either from LLM or A2A agent)
"""
original_description: str = task.description
original_output_pydantic = task.output_pydantic
original_response_model = task.response_model
agent_cards, failed_agents = _fetch_agent_cards_concurrently(a2a_agents)
if not agent_cards and a2a_agents and failed_agents:
unavailable_agents_text = ""
for endpoint, error in failed_agents.items():
unavailable_agents_text += f" - {endpoint}: {error}\n"
notice = UNAVAILABLE_AGENTS_NOTICE_TEMPLATE.substitute(
unavailable_agents=unavailable_agents_text
)
task.description = f"{original_description}{notice}"
try:
return original_fn(self, task, context, tools)
finally:
task.description = original_description
task.description = _augment_prompt_with_a2a(
a2a_agents=a2a_agents,
task_description=original_description,
agent_cards=agent_cards,
failed_agents=failed_agents,
)
task.response_model = agent_response_model
try:
raw_result = original_fn(self, task, context, tools)
agent_response = _parse_agent_response(
raw_result=raw_result, agent_response_model=agent_response_model
)
if isinstance(agent_response, BaseModel) and isinstance(
agent_response, AgentResponseProtocol
):
if agent_response.is_a2a:
return _delegate_to_a2a(
self,
agent_response=agent_response,
task=task,
original_fn=original_fn,
context=context,
tools=tools,
agent_cards=agent_cards,
original_task_description=original_description,
)
return str(agent_response.message)
return raw_result
finally:
task.description = original_description
task.output_pydantic = original_output_pydantic
task.response_model = original_response_model
def _augment_prompt_with_a2a(
a2a_agents: list[A2AConfig],
task_description: str,
agent_cards: dict[str, AgentCard],
conversation_history: list[Message] | None = None,
turn_num: int = 0,
max_turns: int | None = None,
failed_agents: dict[str, str] | None = None,
) -> str:
"""Add A2A delegation instructions to prompt.
Args:
a2a_agents: List of A2A agent configurations
task_description: Original task description
agent_cards: Dictionary mapping agent endpoints to AgentCards
conversation_history: Previous A2A Messages from conversation
turn_num: Current turn number (0-indexed)
max_turns: Maximum allowed turns (from config)
failed_agents: Dictionary mapping failed agent endpoints to error messages
Returns:
Augmented task description with A2A instructions
"""
if not agent_cards:
return task_description
agents_text = ""
for config in a2a_agents:
if config.endpoint in agent_cards:
card = agent_cards[config.endpoint]
agents_text += f"\n{card.model_dump_json(indent=2, exclude_none=True, include={'description', 'url', 'skills'})}\n"
failed_agents = failed_agents or {}
if failed_agents:
agents_text += "\n<!-- Unavailable Agents -->\n"
for endpoint, error in failed_agents.items():
agents_text += f"\n<!-- Agent: {endpoint}\n Status: Unavailable\n Error: {error} -->\n"
agents_text = AVAILABLE_AGENTS_TEMPLATE.substitute(available_a2a_agents=agents_text)
history_text = ""
if conversation_history:
for msg in conversation_history:
history_text += f"\n{msg.model_dump_json(indent=2, exclude_none=True, exclude={'message_id'})}\n"
history_text = PREVIOUS_A2A_CONVERSATION_TEMPLATE.substitute(
previous_a2a_conversation=history_text
)
turn_info = ""
if max_turns is not None and conversation_history:
turn_count = turn_num + 1
warning = ""
if turn_count >= max_turns:
warning = (
"CRITICAL: This is the FINAL turn. You MUST conclude the conversation now.\n"
"Set is_a2a=false and provide your final response to complete the task."
)
elif turn_count == max_turns - 1:
warning = "WARNING: Next turn will be the last. Consider wrapping up the conversation."
turn_info = CONVERSATION_TURN_INFO_TEMPLATE.substitute(
turn_count=turn_count,
max_turns=max_turns,
warning=warning,
)
return f"""{task_description}
IMPORTANT: You have the ability to delegate this task to remote A2A agents.
{agents_text}
{history_text}{turn_info}
"""
def _parse_agent_response(
raw_result: str | dict[str, Any], agent_response_model: type[BaseModel]
) -> BaseModel | str:
"""Parse LLM output as AgentResponse or return raw agent response.
Args:
raw_result: Raw output from LLM
agent_response_model: The agent response model
Returns:
Parsed AgentResponse or string
"""
if agent_response_model:
try:
if isinstance(raw_result, str):
return agent_response_model.model_validate_json(raw_result)
if isinstance(raw_result, dict):
return agent_response_model.model_validate(raw_result)
except ValidationError:
return cast(str, raw_result)
return cast(str, raw_result)
def _handle_agent_response_and_continue(
self: Agent,
a2a_result: dict[str, Any],
agent_id: str,
agent_cards: dict[str, AgentCard] | None,
a2a_agents: list[A2AConfig],
original_task_description: str,
conversation_history: list[Message],
turn_num: int,
max_turns: int,
task: Task,
original_fn: Callable[..., str],
context: str | None,
tools: list[BaseTool] | None,
agent_response_model: type[BaseModel],
) -> tuple[str | None, str | None]:
"""Handle A2A result and get CrewAI agent's response.
Args:
self: The agent instance
a2a_result: Result from A2A delegation
agent_id: ID of the A2A agent
agent_cards: Pre-fetched agent cards
a2a_agents: List of A2A configurations
original_task_description: Original task description
conversation_history: Conversation history
turn_num: Current turn number
max_turns: Maximum turns allowed
task: The task being executed
original_fn: Original execute_task method
context: Optional context
tools: Optional tools
agent_response_model: Response model for parsing
Returns:
Tuple of (final_result, current_request) where:
- final_result is not None if conversation should end
- current_request is the next message to send if continuing
"""
agent_cards_dict = agent_cards or {}
if "agent_card" in a2a_result and agent_id not in agent_cards_dict:
agent_cards_dict[agent_id] = a2a_result["agent_card"]
task.description = _augment_prompt_with_a2a(
a2a_agents=a2a_agents,
task_description=original_task_description,
conversation_history=conversation_history,
turn_num=turn_num,
max_turns=max_turns,
agent_cards=agent_cards_dict,
)
raw_result = original_fn(self, task, context, tools)
llm_response = _parse_agent_response(
raw_result=raw_result, agent_response_model=agent_response_model
)
if isinstance(llm_response, BaseModel) and isinstance(
llm_response, AgentResponseProtocol
):
if not llm_response.is_a2a:
final_turn_number = turn_num + 1
crewai_event_bus.emit(
None,
A2AMessageSentEvent(
message=str(llm_response.message),
turn_number=final_turn_number,
is_multiturn=True,
agent_role=self.role,
),
)
crewai_event_bus.emit(
None,
A2AConversationCompletedEvent(
status="completed",
final_result=str(llm_response.message),
error=None,
total_turns=final_turn_number,
),
)
return str(llm_response.message), None
return None, str(llm_response.message)
return str(raw_result), None
def _delegate_to_a2a(
self: Agent,
agent_response: AgentResponseProtocol,
task: Task,
original_fn: Callable[..., str],
context: str | None,
tools: list[BaseTool] | None,
agent_cards: dict[str, AgentCard] | None = None,
original_task_description: str | None = None,
) -> str:
"""Delegate to A2A agent with multi-turn conversation support.
Args:
self: The agent instance
agent_response: The AgentResponse indicating delegation
task: The task being executed (for extracting A2A fields)
original_fn: The original execute_task method for follow-ups
context: Optional context for task execution
tools: Optional tools available to the agent
agent_cards: Pre-fetched agent cards from _execute_task_with_a2a
original_task_description: The original task description before A2A augmentation
Returns:
Result from A2A agent
Raises:
ImportError: If a2a-sdk is not installed
"""
a2a_agents, agent_response_model = get_a2a_agents_and_response_model(self.a2a)
agent_ids = tuple(config.endpoint for config in a2a_agents)
current_request = str(agent_response.message)
agent_id = agent_response.a2a_ids[0]
if agent_id not in agent_ids:
raise ValueError(
f"Unknown A2A agent ID(s): {agent_response.a2a_ids} not in {agent_ids}"
)
agent_config = next(filter(lambda x: x.endpoint == agent_id, a2a_agents))
task_config = task.config or {}
context_id = task_config.get("context_id")
task_id_config = task_config.get("task_id")
reference_task_ids = task_config.get("reference_task_ids")
metadata = task_config.get("metadata")
extensions = task_config.get("extensions")
if original_task_description is None:
original_task_description = task.description
conversation_history: list[Message] = []
max_turns = agent_config.max_turns
try:
for turn_num in range(max_turns):
console_formatter = getattr(crewai_event_bus, "_console", None)
agent_branch = None
if console_formatter:
agent_branch = getattr(
console_formatter, "current_agent_branch", None
) or getattr(console_formatter, "current_task_branch", None)
a2a_result = execute_a2a_delegation(
endpoint=agent_config.endpoint,
auth=agent_config.auth,
timeout=agent_config.timeout,
task_description=current_request,
context_id=context_id,
task_id=task_id_config,
reference_task_ids=reference_task_ids,
metadata=metadata,
extensions=extensions,
conversation_history=conversation_history,
agent_id=agent_id,
agent_role=Role.user,
agent_branch=agent_branch,
response_model=agent_config.response_model,
turn_number=turn_num + 1,
)
conversation_history = a2a_result.get("history", [])
if a2a_result["status"] in ["completed", "input_required"]:
final_result, next_request = _handle_agent_response_and_continue(
self=self,
a2a_result=a2a_result,
agent_id=agent_id,
agent_cards=agent_cards,
a2a_agents=a2a_agents,
original_task_description=original_task_description,
conversation_history=conversation_history,
turn_num=turn_num,
max_turns=max_turns,
task=task,
original_fn=original_fn,
context=context,
tools=tools,
agent_response_model=agent_response_model,
)
if final_result is not None:
return final_result
if next_request is not None:
current_request = next_request
continue
error_msg = a2a_result.get("error", "Unknown error")
crewai_event_bus.emit(
None,
A2AConversationCompletedEvent(
status="failed",
final_result=None,
error=error_msg,
total_turns=turn_num + 1,
),
)
raise Exception(f"A2A delegation failed: {error_msg}")
if conversation_history:
for msg in reversed(conversation_history):
if msg.role == Role.agent:
text_parts = [
part.root.text for part in msg.parts if part.root.kind == "text"
]
final_message = (
" ".join(text_parts) if text_parts else "Conversation completed"
)
crewai_event_bus.emit(
None,
A2AConversationCompletedEvent(
status="completed",
final_result=final_message,
error=None,
total_turns=max_turns,
),
)
return final_message
crewai_event_bus.emit(
None,
A2AConversationCompletedEvent(
status="failed",
final_result=None,
error=f"Conversation exceeded maximum turns ({max_turns})",
total_turns=max_turns,
),
)
raise Exception(f"A2A conversation exceeded maximum turns ({max_turns})")
finally:
task.description = original_task_description
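The wrapper above is activated whenever an agent is constructed with the a2a field set; a hedged sketch, assuming A2AConfig accepts these keyword arguments (the endpoint, role, goal, and backstory values are illustrative):
from crewai.a2a.config import A2AConfig
from crewai.agent.core import Agent

researcher = Agent(
    role="Research coordinator",
    goal="Answer questions, delegating to the remote A2A agent when useful",
    backstory="Works with remote specialist agents over the A2A protocol.",
    a2a=A2AConfig(
        endpoint="https://remote-agent.example.com/.well-known/agent-card.json",
        timeout=30,
        max_turns=5,  # read by _delegate_to_a2a to cap the multi-turn loop
    ),
)
# AgentMeta's post_init_setup hook calls wrap_agent_with_a2a_instance(researcher),
# so researcher.execute_task now routes through _execute_task_with_a2a.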

View File

@@ -0,0 +1,5 @@
from crewai.agent.core import Agent
from crewai.utilities.training_handler import CrewTrainingHandler
__all__ = ["Agent", "CrewTrainingHandler"]

View File

@@ -2,27 +2,27 @@ from __future__ import annotations
import asyncio
from collections.abc import Sequence
import json
import shutil
import subprocess
import time
from typing import (
TYPE_CHECKING,
Any,
Final,
Literal,
cast,
)
from urllib.parse import urlparse
from pydantic import BaseModel, Field, InstanceOf, PrivateAttr, model_validator
from typing_extensions import Self
from crewai.a2a.config import A2AConfig
from crewai.agents.agent_builder.base_agent import BaseAgent
from crewai.agents.cache.cache_handler import CacheHandler
from crewai.agents.crew_agent_executor import CrewAgentExecutor
from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.agent_events import (
AgentExecutionCompletedEvent,
AgentExecutionErrorEvent,
AgentExecutionStartedEvent,
)
from crewai.events.types.knowledge_events import (
KnowledgeQueryCompletedEvent,
KnowledgeQueryFailedEvent,
@@ -40,6 +40,16 @@ from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
from crewai.knowledge.utils.knowledge_utils import extract_knowledge_context
from crewai.lite_agent import LiteAgent
from crewai.llms.base_llm import BaseLLM
from crewai.mcp import (
MCPClient,
MCPServerConfig,
MCPServerHTTP,
MCPServerSSE,
MCPServerStdio,
)
from crewai.mcp.transports.http import HTTPTransport
from crewai.mcp.transports.sse import SSETransport
from crewai.mcp.transports.stdio import StdioTransport
from crewai.memory.contextual.contextual_memory import ContextualMemory
from crewai.rag.embeddings.types import EmbedderConfig
from crewai.security.fingerprint import Fingerprint
@@ -70,14 +80,14 @@ if TYPE_CHECKING:
# MCP Connection timeout constants (in seconds)
MCP_CONNECTION_TIMEOUT = 10
MCP_TOOL_EXECUTION_TIMEOUT = 30
MCP_DISCOVERY_TIMEOUT = 15
MCP_MAX_RETRIES = 3
MCP_CONNECTION_TIMEOUT: Final[int] = 10
MCP_TOOL_EXECUTION_TIMEOUT: Final[int] = 30
MCP_DISCOVERY_TIMEOUT: Final[int] = 15
MCP_MAX_RETRIES: Final[int] = 3
# Simple in-memory cache for MCP tool schemas (duration: 5 minutes)
_mcp_schema_cache = {}
_cache_ttl = 300 # 5 minutes
_mcp_schema_cache: dict[str, Any] = {}
_cache_ttl: Final[int] = 300 # 5 minutes
class Agent(BaseAgent):
@@ -108,6 +118,8 @@ class Agent(BaseAgent):
"""
_times_executed: int = PrivateAttr(default=0)
_mcp_clients: list[Any] = PrivateAttr(default_factory=list)
_last_messages: list[LLMMessage] = PrivateAttr(default_factory=list)
max_execution_time: int | None = Field(
default=None,
description="Maximum execution time for an agent to execute a task",
@@ -197,6 +209,10 @@ class Agent(BaseAgent):
guardrail_max_retries: int = Field(
default=3, description="Maximum number of retries when guardrail fails"
)
a2a: list[A2AConfig] | A2AConfig | None = Field(
default=None,
description="A2A (Agent-to-Agent) configuration for delegating tasks to remote agents. Can be a single A2AConfig or a dict mapping agent IDs to configs.",
)
@model_validator(mode="before")
def validate_from_repository(cls, v: Any) -> dict[str, Any] | None | Any: # noqa: N805
@@ -305,17 +321,19 @@ class Agent(BaseAgent):
# If the task requires output in JSON or Pydantic format,
# append specific instructions to the task prompt to ensure
# that the final answer does not include any code block markers
if task.output_json or task.output_pydantic:
# Skip this if task.response_model is set, as native structured outputs handle schema automatically
if (task.output_json or task.output_pydantic) and not task.response_model:
# Generate the schema based on the output format
if task.output_json:
# schema = json.dumps(task.output_json, indent=2)
schema = generate_model_description(task.output_json)
schema_dict = generate_model_description(task.output_json)
schema = json.dumps(schema_dict["json_schema"]["schema"], indent=2)
task_prompt += "\n" + self.i18n.slice(
"formatted_task_instructions"
).format(output_format=schema)
elif task.output_pydantic:
schema = generate_model_description(task.output_pydantic)
schema_dict = generate_model_description(task.output_pydantic)
schema = json.dumps(schema_dict["json_schema"]["schema"], indent=2)
task_prompt += "\n" + self.i18n.slice(
"formatted_task_instructions"
).format(output_format=schema)
@@ -438,6 +456,13 @@ class Agent(BaseAgent):
else:
task_prompt = self._use_trained_data(task_prompt=task_prompt)
# Import agent events locally to avoid circular imports
from crewai.events.types.agent_events import (
AgentExecutionCompletedEvent,
AgentExecutionErrorEvent,
AgentExecutionStartedEvent,
)
try:
crewai_event_bus.emit(
self,
@@ -513,6 +538,15 @@ class Agent(BaseAgent):
self,
event=AgentExecutionCompletedEvent(agent=self, task=task, output=result),
)
self._last_messages = (
self.agent_executor.messages.copy()
if self.agent_executor and hasattr(self.agent_executor, "messages")
else []
)
self._cleanup_mcp_clients()
return result
def _execute_with_timeout(self, task_prompt: str, task: Task, timeout: int) -> Any:
@@ -618,6 +652,7 @@ class Agent(BaseAgent):
self._rpm_controller.check_or_wait if self._rpm_controller else None
),
callbacks=[TokenCalcHandler(self._token_process)],
response_model=task.response_model if task else None,
)
def get_delegation_tools(self, agents: list[BaseAgent]) -> list[BaseTool]:
@@ -635,30 +670,70 @@ class Agent(BaseAgent):
self._logger.log("error", f"Error getting platform tools: {e!s}")
return []
def get_mcp_tools(self, mcps: list[str]) -> list[BaseTool]:
"""Convert MCP server references to CrewAI tools."""
def get_mcp_tools(self, mcps: list[str | MCPServerConfig]) -> list[BaseTool]:
"""Convert MCP server references/configs to CrewAI tools.
Supports both string references (backwards compatible) and structured
configuration objects (MCPServerStdio, MCPServerHTTP, MCPServerSSE).
Args:
mcps: List of MCP server references (strings) or configurations.
Returns:
List of BaseTool instances from MCP servers.
"""
all_tools = []
clients = []
for mcp_ref in mcps:
try:
if mcp_ref.startswith("crewai-amp:"):
tools = self._get_amp_mcp_tools(mcp_ref)
elif mcp_ref.startswith("https://"):
tools = self._get_external_mcp_tools(mcp_ref)
else:
continue
for mcp_config in mcps:
if isinstance(mcp_config, str):
tools = self._get_mcp_tools_from_string(mcp_config)
else:
tools, client = self._get_native_mcp_tools(mcp_config)
if client:
clients.append(client)
all_tools.extend(tools)
self._logger.log(
"info", f"Successfully loaded {len(tools)} tools from {mcp_ref}"
)
except Exception as e:
self._logger.log("warning", f"Skipping MCP {mcp_ref} due to error: {e}")
continue
all_tools.extend(tools)
# Store clients for cleanup
self._mcp_clients.extend(clients)
return all_tools
def _cleanup_mcp_clients(self) -> None:
"""Cleanup MCP client connections after task execution."""
if not self._mcp_clients:
return
async def _disconnect_all() -> None:
for client in self._mcp_clients:
if client and hasattr(client, "connected") and client.connected:
await client.disconnect()
try:
asyncio.run(_disconnect_all())
except Exception as e:
self._logger.log("error", f"Error during MCP client cleanup: {e}")
finally:
self._mcp_clients.clear()
def _get_mcp_tools_from_string(self, mcp_ref: str) -> list[BaseTool]:
"""Get tools from legacy string-based MCP references.
This method maintains backwards compatibility with string-based
MCP references (https://... and crewai-amp:...).
Args:
mcp_ref: String reference to MCP server.
Returns:
List of BaseTool instances.
"""
if mcp_ref.startswith("crewai-amp:"):
return self._get_amp_mcp_tools(mcp_ref)
if mcp_ref.startswith("https://"):
return self._get_external_mcp_tools(mcp_ref)
return []
def _get_external_mcp_tools(self, mcp_ref: str) -> list[BaseTool]:
"""Get tools from external HTTPS MCP server with graceful error handling."""
from crewai.tools.mcp_tool_wrapper import MCPToolWrapper
@@ -709,7 +784,7 @@ class Agent(BaseAgent):
f"Specific tool '{specific_tool}' not found on MCP server: {server_url}",
)
return tools
return cast(list[BaseTool], tools)
except Exception as e:
self._logger.log(
@@ -717,6 +792,164 @@ class Agent(BaseAgent):
)
return []
def _get_native_mcp_tools(
self, mcp_config: MCPServerConfig
) -> tuple[list[BaseTool], Any | None]:
"""Get tools from MCP server using structured configuration.
This method creates an MCP client based on the configuration type,
connects to the server, discovers tools, applies filtering, and
returns wrapped tools along with the client instance for cleanup.
Args:
mcp_config: MCP server configuration (MCPServerStdio, MCPServerHTTP, or MCPServerSSE).
Returns:
Tuple of (list of BaseTool instances, MCPClient instance for cleanup).
"""
from crewai.tools.base_tool import BaseTool
from crewai.tools.mcp_native_tool import MCPNativeTool
if isinstance(mcp_config, MCPServerStdio):
transport = StdioTransport(
command=mcp_config.command,
args=mcp_config.args,
env=mcp_config.env,
)
server_name = f"{mcp_config.command}_{'_'.join(mcp_config.args)}"
elif isinstance(mcp_config, MCPServerHTTP):
transport = HTTPTransport(
url=mcp_config.url,
headers=mcp_config.headers,
streamable=mcp_config.streamable,
)
server_name = self._extract_server_name(mcp_config.url)
elif isinstance(mcp_config, MCPServerSSE):
transport = SSETransport(
url=mcp_config.url,
headers=mcp_config.headers,
)
server_name = self._extract_server_name(mcp_config.url)
else:
raise ValueError(f"Unsupported MCP server config type: {type(mcp_config)}")
client = MCPClient(
transport=transport,
cache_tools_list=mcp_config.cache_tools_list,
)
async def _setup_client_and_list_tools() -> list[dict[str, Any]]:
"""Async helper to connect and list tools in same event loop."""
try:
if not client.connected:
await client.connect()
tools_list = await client.list_tools()
try:
await client.disconnect()
# Small delay to allow background tasks to finish cleanup
# This helps prevent "cancel scope in different task" errors
# when asyncio.run() closes the event loop
await asyncio.sleep(0.1)
except Exception as e:
self._logger.log("error", f"Error during disconnect: {e}")
return tools_list
except Exception as e:
if client.connected:
await client.disconnect()
await asyncio.sleep(0.1)
raise RuntimeError(
f"Error during setup client and list tools: {e}"
) from e
try:
try:
asyncio.get_running_loop()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit(
asyncio.run, _setup_client_and_list_tools()
)
tools_list = future.result()
except RuntimeError:
try:
tools_list = asyncio.run(_setup_client_and_list_tools())
except RuntimeError as e:
error_msg = str(e).lower()
if "cancel scope" in error_msg or "task" in error_msg:
raise ConnectionError(
"MCP connection failed due to event loop cleanup issues. "
"This may be due to authentication errors or server unavailability."
) from e
raise
except asyncio.CancelledError as e:
raise ConnectionError(
"MCP connection was cancelled. This may indicate an authentication "
"error or server unavailability."
) from e
if mcp_config.tool_filter:
filtered_tools = []
for tool in tools_list:
if callable(mcp_config.tool_filter):
try:
from crewai.mcp.filters import ToolFilterContext
context = ToolFilterContext(
agent=self,
server_name=server_name,
run_context=None,
)
if mcp_config.tool_filter(context, tool):
filtered_tools.append(tool)
except (TypeError, AttributeError):
if mcp_config.tool_filter(tool):
filtered_tools.append(tool)
else:
# Not callable - include tool
filtered_tools.append(tool)
tools_list = filtered_tools
tools = []
for tool_def in tools_list:
tool_name = tool_def.get("name", "")
if not tool_name:
continue
# Convert inputSchema to Pydantic model if present
args_schema = None
if tool_def.get("inputSchema"):
args_schema = self._json_schema_to_pydantic(
tool_name, tool_def["inputSchema"]
)
tool_schema = {
"description": tool_def.get("description", ""),
"args_schema": args_schema,
}
try:
native_tool = MCPNativeTool(
mcp_client=client,
tool_name=tool_name,
tool_schema=tool_schema,
server_name=server_name,
)
tools.append(native_tool)
except Exception as e:
self._logger.log("error", f"Failed to create native MCP tool: {e}")
continue
return cast(list[BaseTool], tools), client
except Exception as e:
if client.connected:
asyncio.run(client.disconnect())
raise RuntimeError(f"Failed to get native MCP tools: {e}") from e
def _get_amp_mcp_tools(self, amp_ref: str) -> list[BaseTool]:
"""Get tools from CrewAI AMP MCP marketplace."""
# Parse: "crewai-amp:mcp-name" or "crewai-amp:mcp-name#tool_name"
@@ -739,9 +972,9 @@ class Agent(BaseAgent):
return tools
def _extract_server_name(self, server_url: str) -> str:
@staticmethod
def _extract_server_name(server_url: str) -> str:
"""Extract clean server name from URL for tool prefixing."""
from urllib.parse import urlparse
parsed = urlparse(server_url)
domain = parsed.netloc.replace(".", "_")
@@ -778,7 +1011,9 @@ class Agent(BaseAgent):
)
return {}
async def _get_mcp_tool_schemas_async(self, server_params: dict) -> dict[str, dict]:
async def _get_mcp_tool_schemas_async(
self, server_params: dict[str, Any]
) -> dict[str, dict]:
"""Async implementation of MCP tool schema retrieval with timeouts and retries."""
server_url = server_params["url"]
return await self._retry_mcp_discovery(
@@ -787,7 +1022,7 @@ class Agent(BaseAgent):
async def _retry_mcp_discovery(
self, operation_func, server_url: str
) -> dict[str, dict]:
) -> dict[str, dict[str, Any]]:
"""Retry MCP discovery operation with exponential backoff, avoiding try-except in loop."""
last_error = None
@@ -815,9 +1050,10 @@ class Agent(BaseAgent):
f"Failed to discover MCP tools after {MCP_MAX_RETRIES} attempts: {last_error}"
)
@staticmethod
async def _attempt_mcp_discovery(
self, operation_func, server_url: str
) -> tuple[dict[str, dict] | None, str, bool]:
operation_func, server_url: str
) -> tuple[dict[str, dict[str, Any]] | None, str, bool]:
"""Attempt single MCP discovery operation and return (result, error_message, should_retry)."""
try:
result = await operation_func(server_url)
@@ -851,13 +1087,13 @@ class Agent(BaseAgent):
async def _discover_mcp_tools_with_timeout(
self, server_url: str
) -> dict[str, dict]:
) -> dict[str, dict[str, Any]]:
"""Discover MCP tools with timeout wrapper."""
return await asyncio.wait_for(
self._discover_mcp_tools(server_url), timeout=MCP_DISCOVERY_TIMEOUT
)
async def _discover_mcp_tools(self, server_url: str) -> dict[str, dict]:
async def _discover_mcp_tools(self, server_url: str) -> dict[str, dict[str, Any]]:
"""Discover tools from MCP server with proper timeout handling."""
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client
@@ -889,7 +1125,9 @@ class Agent(BaseAgent):
}
return schemas
def _json_schema_to_pydantic(self, tool_name: str, json_schema: dict) -> type:
def _json_schema_to_pydantic(
self, tool_name: str, json_schema: dict[str, Any]
) -> type:
"""Convert JSON Schema to Pydantic model for tool arguments.
Args:
@@ -926,7 +1164,7 @@ class Agent(BaseAgent):
model_name = f"{tool_name.replace('-', '_').replace(' ', '_')}Schema"
return create_model(model_name, **field_definitions)
def _json_type_to_python(self, field_schema: dict) -> type:
def _json_type_to_python(self, field_schema: dict[str, Any]) -> type:
"""Convert JSON Schema type to Python type.
Args:
@@ -935,7 +1173,6 @@ class Agent(BaseAgent):
Returns:
Python type
"""
from typing import Any
json_type = field_schema.get("type")
@@ -965,13 +1202,15 @@ class Agent(BaseAgent):
return type_mapping.get(json_type, Any)
def _fetch_amp_mcp_servers(self, mcp_name: str) -> list[dict]:
@staticmethod
def _fetch_amp_mcp_servers(mcp_name: str) -> list[dict]:
"""Fetch MCP server configurations from CrewAI AMP API."""
# TODO: Implement AMP API call to "integrations/mcps" endpoint
# Should return list of server configs with URLs
return []
def get_multimodal_tools(self) -> Sequence[BaseTool]:
@staticmethod
def get_multimodal_tools() -> Sequence[BaseTool]:
from crewai.tools.agent_tools.add_image_tool import AddImageTool
return [AddImageTool()]
@@ -991,8 +1230,9 @@ class Agent(BaseAgent):
)
return []
@staticmethod
def get_output_converter(
self, llm: BaseLLM, text: str, model: type[BaseModel], instructions: str
llm: BaseLLM, text: str, model: type[BaseModel], instructions: str
) -> Converter:
return Converter(llm=llm, text=text, model=model, instructions=instructions)
@@ -1022,7 +1262,8 @@ class Agent(BaseAgent):
)
return task_prompt
def _render_text_description(self, tools: list[Any]) -> str:
@staticmethod
def _render_text_description(tools: list[Any]) -> str:
"""Render the tool name and description in plain text.
Output will be in the format of:
@@ -1107,6 +1348,15 @@ class Agent(BaseAgent):
def set_fingerprint(self, fingerprint: Fingerprint) -> None:
self.security_config.fingerprint = fingerprint
@property
def last_messages(self) -> list[LLMMessage]:
"""Get messages from the last task execution.
Returns:
List of LLM messages from the most recent task execution.
"""
return self._last_messages
def _get_knowledge_search_query(self, task_prompt: str, task: Task) -> str | None:
"""Generate a search query for the knowledge base based on the task description."""
crewai_event_bus.emit(

View File

@@ -0,0 +1,76 @@
"""Generic metaclass for agent extensions.
This metaclass enables extension capabilities for agents by detecting
extension fields in class annotations and applying appropriate wrappers.
"""
import warnings
from functools import wraps
from typing import Any
from pydantic import model_validator
from pydantic._internal._model_construction import ModelMetaclass
class AgentMeta(ModelMetaclass):
"""Generic metaclass for agent extensions.
Detects extension fields (like 'a2a') in class annotations and applies
the appropriate wrapper logic to enable extension functionality.
"""
def __new__(
mcs,
name: str,
bases: tuple[type, ...],
namespace: dict[str, Any],
**kwargs: Any,
) -> type:
"""Create a new class with extension support.
Args:
name: The name of the class being created
bases: Base classes
namespace: Class namespace dictionary
**kwargs: Additional keyword arguments
Returns:
The newly created class with extension support if applicable
"""
orig_post_init_setup = namespace.get("post_init_setup")
if orig_post_init_setup is not None:
original_func = (
orig_post_init_setup.wrapped
if hasattr(orig_post_init_setup, "wrapped")
else orig_post_init_setup
)
def post_init_setup_with_extensions(self: Any) -> Any:
"""Wrap post_init_setup to apply extensions after initialization.
Args:
self: The agent instance
Returns:
The agent instance
"""
result = original_func(self)
a2a_value = getattr(self, "a2a", None)
if a2a_value is not None:
from crewai.a2a.wrapper import wrap_agent_with_a2a_instance
wrap_agent_with_a2a_instance(self)
return result
with warnings.catch_warnings():
warnings.filterwarnings(
"ignore", message=".*overrides an existing Pydantic.*"
)
namespace["post_init_setup"] = model_validator(mode="after")(
post_init_setup_with_extensions
)
return super().__new__(mcs, name, bases, namespace, **kwargs)

View File

@@ -7,7 +7,7 @@ output conversion for OpenAI agents, supporting JSON and Pydantic model formats.
from typing import Any
from crewai.agents.agent_adapters.base_converter_adapter import BaseConverterAdapter
from crewai.utilities.i18n import I18N
from crewai.utilities.i18n import get_i18n
class OpenAIConverterAdapter(BaseConverterAdapter):
@@ -59,7 +59,7 @@ class OpenAIConverterAdapter(BaseConverterAdapter):
return base_prompt
output_schema: str = (
I18N()
get_i18n()
.slice("formatted_task_instructions")
.format(output_format=self._schema)
)

View File

@@ -18,17 +18,19 @@ from pydantic import (
from pydantic_core import PydanticCustomError
from typing_extensions import Self
from crewai.agent.internal.meta import AgentMeta
from crewai.agents.agent_builder.utilities.base_token_process import TokenProcess
from crewai.agents.cache.cache_handler import CacheHandler
from crewai.agents.tools_handler import ToolsHandler
from crewai.knowledge.knowledge import Knowledge
from crewai.knowledge.knowledge_config import KnowledgeConfig
from crewai.knowledge.source.base_knowledge_source import BaseKnowledgeSource
from crewai.mcp.config import MCPServerConfig
from crewai.rag.embeddings.types import EmbedderConfig
from crewai.security.security_config import SecurityConfig
from crewai.tools.base_tool import BaseTool, Tool
from crewai.utilities.config import process_config
from crewai.utilities.i18n import I18N
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.logger import Logger
from crewai.utilities.rpm_controller import RPMController
from crewai.utilities.string_utils import interpolate_only
@@ -56,7 +58,7 @@ PlatformApp = Literal[
PlatformAppOrAction = PlatformApp | str
class BaseAgent(BaseModel, ABC):
class BaseAgent(BaseModel, ABC, metaclass=AgentMeta):
"""Abstract Base Class for all third party agents compatible with CrewAI.
Attributes:
@@ -106,7 +108,7 @@ class BaseAgent(BaseModel, ABC):
Set private attributes.
"""
__hash__ = object.__hash__ # type: ignore
__hash__ = object.__hash__
_logger: Logger = PrivateAttr(default_factory=lambda: Logger(verbose=False))
_rpm_controller: RPMController | None = PrivateAttr(default=None)
_request_within_rpm_limit: Any = PrivateAttr(default=None)
@@ -149,7 +151,7 @@ class BaseAgent(BaseModel, ABC):
)
crew: Any = Field(default=None, description="Crew to which the agent belongs.")
i18n: I18N = Field(
default_factory=I18N, description="Internationalization settings."
default_factory=get_i18n, description="Internationalization settings."
)
cache_handler: CacheHandler | None = Field(
default=None, description="An instance of the CacheHandler class."
@@ -179,8 +181,8 @@ class BaseAgent(BaseModel, ABC):
default_factory=SecurityConfig,
description="Security configuration for the agent, including fingerprinting.",
)
callbacks: list[Callable] = Field(
default=[], description="Callbacks to be used for the agent"
callbacks: list[Callable[[Any], Any]] = Field(
default_factory=list, description="Callbacks to be used for the agent"
)
adapted_agent: bool = Field(
default=False, description="Whether the agent is adapted"
@@ -193,14 +195,14 @@ class BaseAgent(BaseModel, ABC):
default=None,
description="List of applications or application/action combinations that the agent can access through CrewAI Platform. Can contain app names (e.g., 'gmail') or specific actions (e.g., 'gmail/send_email')",
)
mcps: list[str] | None = Field(
mcps: list[str | MCPServerConfig] | None = Field(
default=None,
description="List of MCP server references. Supports 'https://server.com/path' for external servers and 'crewai-amp:mcp-name' for AMP marketplace. Use '#tool_name' suffix for specific tools.",
)
@model_validator(mode="before")
@classmethod
def process_model_config(cls, values):
def process_model_config(cls, values: Any) -> dict[str, Any]:
return process_config(values, cls)
@field_validator("tools")
@@ -252,23 +254,39 @@ class BaseAgent(BaseModel, ABC):
@field_validator("mcps")
@classmethod
def validate_mcps(cls, mcps: list[str] | None) -> list[str] | None:
def validate_mcps(
cls, mcps: list[str | MCPServerConfig] | None
) -> list[str | MCPServerConfig] | None:
"""Validate MCP server references and configurations.
Supports both string references (for backwards compatibility) and
structured configuration objects (MCPServerStdio, MCPServerHTTP, MCPServerSSE).
"""
if not mcps:
return mcps
validated_mcps = []
for mcp in mcps:
if mcp.startswith(("https://", "crewai-amp:")):
if isinstance(mcp, str):
if mcp.startswith(("https://", "crewai-amp:")):
validated_mcps.append(mcp)
else:
raise ValueError(
f"Invalid MCP reference: {mcp}. "
"String references must start with 'https://' or 'crewai-amp:'"
)
elif isinstance(mcp, (MCPServerConfig)):
validated_mcps.append(mcp)
else:
raise ValueError(
f"Invalid MCP reference: {mcp}. Must start with 'https://' or 'crewai-amp:'"
f"Invalid MCP configuration: {type(mcp)}. "
"Must be a string reference or MCPServerConfig instance."
)
return list(set(validated_mcps))
return validated_mcps
@model_validator(mode="after")
def validate_and_set_attributes(self):
def validate_and_set_attributes(self) -> Self:
# Validate required fields
for field in ["role", "goal", "backstory"]:
if getattr(self, field) is None:
@@ -300,7 +318,7 @@ class BaseAgent(BaseModel, ABC):
)
@model_validator(mode="after")
def set_private_attrs(self):
def set_private_attrs(self) -> Self:
"""Set private attributes."""
self._logger = Logger(verbose=self.verbose)
if self.max_rpm and not self._rpm_controller:
@@ -312,7 +330,7 @@ class BaseAgent(BaseModel, ABC):
return self
@property
def key(self):
def key(self) -> str:
source = [
self._original_role or self.role,
self._original_goal or self.goal,
@@ -330,7 +348,7 @@ class BaseAgent(BaseModel, ABC):
pass
@abstractmethod
def create_agent_executor(self, tools=None) -> None:
def create_agent_executor(self, tools: list[BaseTool] | None = None) -> None:
pass
@abstractmethod
@@ -342,7 +360,7 @@ class BaseAgent(BaseModel, ABC):
"""Get platform tools for the specified list of applications and/or application/action combinations."""
@abstractmethod
def get_mcp_tools(self, mcps: list[str]) -> list[BaseTool]:
def get_mcp_tools(self, mcps: list[str | MCPServerConfig]) -> list[BaseTool]:
"""Get MCP tools for the specified list of MCP server references."""
def copy(self) -> Self: # type: ignore # Signature of "copy" incompatible with supertype "BaseModel"
@@ -442,5 +460,5 @@ class BaseAgent(BaseModel, ABC):
self._rpm_controller = rpm_controller
self.create_agent_executor()
def set_knowledge(self, crew_embedder: EmbedderConfig | None = None):
def set_knowledge(self, crew_embedder: EmbedderConfig | None = None) -> None:
pass
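With the widened mcps field and validate_mcps above, legacy string references and structured configs can be mixed on one agent; a minimal sketch, assuming the config classes take these keyword arguments (URLs, command, and agent fields are illustrative):
from crewai.agent.core import Agent
from crewai.mcp import MCPServerHTTP, MCPServerStdio

agent = Agent(
    role="Data analyst",
    goal="Use MCP-provided tools to answer questions",
    backstory="Relies on external MCP servers for data access.",
    mcps=[
        "https://mcp.example.com/tools#search",  # legacy string reference; '#tool_name' selects one tool
        "crewai-amp:my-mcp",                     # AMP marketplace reference
        MCPServerHTTP(url="https://mcp.example.com/mcp", streamable=True),
        MCPServerStdio(command="uvx", args=["my-mcp-server"]),
    ],
)
# validate_mcps accepts both forms; get_mcp_tools later dispatches strings to
# _get_mcp_tools_from_string and config objects to _get_native_mcp_tools.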

View File

@@ -9,7 +9,7 @@ from __future__ import annotations
from collections.abc import Callable
from typing import TYPE_CHECKING, Any, Literal, cast
from pydantic import GetCoreSchemaHandler
from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import CoreSchema, core_schema
from crewai.agents.agent_builder.base_agent_executor_mixin import CrewAgentExecutorMixin
@@ -37,7 +37,11 @@ from crewai.utilities.agent_utils import (
process_llm_response,
)
from crewai.utilities.constants import TRAINING_DATA_FILE
from crewai.utilities.i18n import I18N
from crewai.utilities.i18n import I18N, get_i18n
from crewai.utilities.llm_call_hooks import (
get_after_llm_call_hooks,
get_before_llm_call_hooks,
)
from crewai.utilities.printer import Printer
from crewai.utilities.tool_utils import execute_tool_and_check_finality
from crewai.utilities.training_handler import CrewTrainingHandler
@@ -65,7 +69,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
def __init__(
self,
llm: BaseLLM | Any,
llm: BaseLLM,
task: Task,
crew: Crew,
agent: Agent,
@@ -82,6 +86,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
respect_context_window: bool = False,
request_within_rpm_limit: Callable[[], bool] | None = None,
callbacks: list[Any] | None = None,
response_model: type[BaseModel] | None = None,
) -> None:
"""Initialize executor.
@@ -103,8 +108,9 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
respect_context_window: Respect context limits.
request_within_rpm_limit: RPM limit check function.
callbacks: Optional callbacks list.
response_model: Optional Pydantic model for structured outputs.
"""
self._i18n: I18N = I18N()
self._i18n: I18N = get_i18n()
self.llm = llm
self.task = task
self.agent = agent
@@ -119,23 +125,38 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
self.tools_handler = tools_handler
self.original_tools = original_tools or []
self.step_callback = step_callback
self.use_stop_words = self.llm.supports_stop_words()
self.tools_description = tools_description
self.function_calling_llm = function_calling_llm
self.respect_context_window = respect_context_window
self.request_within_rpm_limit = request_within_rpm_limit
self.response_model = response_model
self.ask_for_human_input = False
self.messages: list[LLMMessage] = []
self.iterations = 0
self.log_error_after = 3
existing_stop = getattr(self.llm, "stop", [])
self.llm.stop = list(
set(
existing_stop + self.stop
if isinstance(existing_stop, list)
else self.stop
self.before_llm_call_hooks: list[Callable] = []
self.after_llm_call_hooks: list[Callable] = []
self.before_llm_call_hooks.extend(get_before_llm_call_hooks())
self.after_llm_call_hooks.extend(get_after_llm_call_hooks())
if self.llm:
# This may be mutating the shared llm object and needs further evaluation
existing_stop = getattr(self.llm, "stop", [])
self.llm.stop = list(
set(
existing_stop + self.stop
if isinstance(existing_stop, list)
else self.stop
)
)
)
@property
def use_stop_words(self) -> bool:
"""Check to determine if stop words are being used.
Returns:
bool: True if tool should be used or not.
"""
return self.llm.supports_stop_words() if self.llm else False
def invoke(self, inputs: dict[str, Any]) -> dict[str, Any]:
"""Execute the agent with given inputs.
@@ -201,6 +222,7 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
llm=self.llm,
callbacks=self.callbacks,
)
break
enforce_rpm_limit(self.request_within_rpm_limit)
@@ -211,8 +233,10 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
printer=self._printer,
from_task=self.task,
from_agent=self.agent,
response_model=self.response_model,
executor_context=self,
)
formatted_answer = process_llm_response(answer, self.use_stop_words)
formatted_answer = process_llm_response(answer, self.use_stop_words) # type: ignore[assignment]
if isinstance(formatted_answer, AgentAction):
# Extract agent fingerprint if available
@@ -244,11 +268,11 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
formatted_answer, tool_result
)
self._invoke_step_callback(formatted_answer)
self._append_message(formatted_answer.text)
self._invoke_step_callback(formatted_answer) # type: ignore[arg-type]
self._append_message(formatted_answer.text) # type: ignore[union-attr,attr-defined]
except OutputParserError as e: # noqa: PERF203
formatted_answer = handle_output_parser_exception(
except OutputParserError as e:
formatted_answer = handle_output_parser_exception( # type: ignore[assignment]
e=e,
messages=self.messages,
iterations=self.iterations,
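
The hunks above seed before_llm_call_hooks / after_llm_call_hooks from crewai.utilities.llm_call_hooks and thread an optional response_model through to get_llm_response. A minimal sketch of how calling code might plug into this, assuming hooks are plain callables that receive the executor context (the exact hook signature and any registration helpers are not shown in this diff, and ReportOutline / log_llm_call are hypothetical names):

from typing import Any

from pydantic import BaseModel

from crewai.utilities.llm_call_hooks import (
    get_after_llm_call_hooks,
    get_before_llm_call_hooks,
)


class ReportOutline(BaseModel):
    """Hypothetical model for the executor's new response_model parameter."""

    title: str
    sections: list[str]


def log_llm_call(context: Any) -> None:
    """Assumed hook shape: inspect or mutate executor state in place before the call."""
    print(f"LLM call #{getattr(context, 'iterations', '?')}")


# Hooks returned by the module-level getters are copied into every executor at
# __init__ time; per this diff an instance also exposes plain lists, so a
# one-off hook could be appended via executor.before_llm_call_hooks.append(log_llm_call).
print(len(get_before_llm_call_hooks()), len(get_after_llm_call_hooks()))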

View File

@@ -18,10 +18,10 @@ from crewai.agents.constants import (
MISSING_ACTION_INPUT_AFTER_ACTION_ERROR_MESSAGE,
UNABLE_TO_REPAIR_JSON_RESULTS,
)
from crewai.utilities.i18n import I18N
from crewai.utilities.i18n import get_i18n
_I18N = I18N()
_I18N = get_i18n()
@dataclass

View File

@@ -1,5 +1,5 @@
import time
from typing import Any
from typing import TYPE_CHECKING, Any, TypeVar, cast
import webbrowser
from pydantic import BaseModel, Field
@@ -13,6 +13,8 @@ from crewai.cli.shared.token_manager import TokenManager
console = Console()
TOauth2Settings = TypeVar("TOauth2Settings", bound="Oauth2Settings")
class Oauth2Settings(BaseModel):
provider: str = Field(
@@ -28,9 +30,15 @@ class Oauth2Settings(BaseModel):
description="OAuth2 audience value, typically used to identify the target API or resource.",
default=None,
)
extra: dict[str, Any] = Field(
description="Extra configuration for the OAuth2 provider.",
default={},
)
@classmethod
def from_settings(cls):
def from_settings(cls: type[TOauth2Settings]) -> TOauth2Settings:
"""Create an Oauth2Settings instance from the CLI settings."""
settings = Settings()
return cls(
@@ -38,12 +46,20 @@ class Oauth2Settings(BaseModel):
domain=settings.oauth2_domain,
client_id=settings.oauth2_client_id,
audience=settings.oauth2_audience,
extra=settings.oauth2_extra,
)
if TYPE_CHECKING:
from crewai.cli.authentication.providers.base_provider import BaseProvider
class ProviderFactory:
@classmethod
def from_settings(cls, settings: Oauth2Settings | None = None):
def from_settings(
cls: type["ProviderFactory"], # noqa: UP037
settings: Oauth2Settings | None = None,
) -> "BaseProvider": # noqa: UP037
settings = settings or Oauth2Settings.from_settings()
import importlib
@@ -53,11 +69,11 @@ class ProviderFactory:
)
provider = getattr(module, f"{settings.provider.capitalize()}Provider")
return provider(settings)
return cast("BaseProvider", provider(settings))
class AuthenticationCommand:
def __init__(self):
def __init__(self) -> None:
self.token_manager = TokenManager()
self.oauth2_provider = ProviderFactory.from_settings()
@@ -84,7 +100,7 @@ class AuthenticationCommand:
timeout=20,
)
response.raise_for_status()
return response.json()
return cast(dict[str, Any], response.json())
def _display_auth_instructions(self, device_code_data: dict[str, str]) -> None:
"""Display the authentication instructions to the user."""

View File

@@ -24,3 +24,7 @@ class BaseProvider(ABC):
@abstractmethod
def get_client_id(self) -> str: ...
def get_required_fields(self) -> list[str]:
"""Returns which provider-specific fields inside the "extra" dict will be required"""
return []

View File

@@ -3,16 +3,16 @@ from crewai.cli.authentication.providers.base_provider import BaseProvider
class OktaProvider(BaseProvider):
def get_authorize_url(self) -> str:
return f"https://{self.settings.domain}/oauth2/default/v1/device/authorize"
return f"{self._oauth2_base_url()}/v1/device/authorize"
def get_token_url(self) -> str:
return f"https://{self.settings.domain}/oauth2/default/v1/token"
return f"{self._oauth2_base_url()}/v1/token"
def get_jwks_url(self) -> str:
return f"https://{self.settings.domain}/oauth2/default/v1/keys"
return f"{self._oauth2_base_url()}/v1/keys"
def get_issuer(self) -> str:
return f"https://{self.settings.domain}/oauth2/default"
return self._oauth2_base_url().removesuffix("/oauth2")
def get_audience(self) -> str:
if self.settings.audience is None:
@@ -27,3 +27,16 @@ class OktaProvider(BaseProvider):
"Client ID is required. Please set it in the configuration."
)
return self.settings.client_id
def get_required_fields(self) -> list[str]:
return ["authorization_server_name", "using_org_auth_server"]
def _oauth2_base_url(self) -> str:
using_org_auth_server = self.settings.extra.get("using_org_auth_server", False)
if using_org_auth_server:
base_url = f"https://{self.settings.domain}/oauth2"
else:
base_url = f"https://{self.settings.domain}/oauth2/{self.settings.extra.get('authorization_server_name', 'default')}"
return f"{base_url}"

View File

@@ -11,18 +11,18 @@ console = Console()
class BaseCommand:
def __init__(self):
def __init__(self) -> None:
self._telemetry = Telemetry()
self._telemetry.set_tracer()
class PlusAPIMixin:
def __init__(self, telemetry):
def __init__(self, telemetry: Telemetry) -> None:
try:
telemetry.set_tracer()
self.plus_api_client = PlusAPI(api_key=get_auth_token())
except Exception:
self._deploy_signup_error_span = telemetry.deploy_signup_error_span()
telemetry.deploy_signup_error_span()
console.print(
"Please sign up/login to CrewAI+ before using the CLI.",
style="bold red",

View File

@@ -2,6 +2,7 @@ import json
from logging import getLogger
from pathlib import Path
import tempfile
from typing import Any
from pydantic import BaseModel, Field
@@ -136,7 +137,12 @@ class Settings(BaseModel):
default=DEFAULT_CLI_SETTINGS["oauth2_domain"],
)
def __init__(self, config_path: Path | None = None, **data):
oauth2_extra: dict[str, Any] = Field(
description="Extra configuration for the OAuth2 provider.",
default={},
)
def __init__(self, config_path: Path | None = None, **data: dict[str, Any]) -> None:
"""Load Settings from config path with fallback support"""
if config_path is None:
config_path = get_writable_config_path()
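
The new oauth2_extra field works like the other OAuth2 settings: it defaults to an empty dict and can be supplied as keyword data. A hedged sketch (the crewai.cli.config import path for Settings is assumed, and constructing Settings reads the CLI config file from its default writable path):

from crewai.cli.config import Settings  # assumed module path for the CLI Settings model

settings = Settings(
    oauth2_extra={
        "authorization_server_name": "default",
        "using_org_auth_server": False,
    }
)
print(settings.oauth2_extra)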

View File

@@ -1,9 +1,10 @@
from typing import Any
from typing import Any, cast
import requests
from requests.exceptions import JSONDecodeError, RequestException
from rich.console import Console
from crewai.cli.authentication.main import Oauth2Settings, ProviderFactory
from crewai.cli.command import BaseCommand
from crewai.cli.settings.main import SettingsCommand
from crewai.cli.version import get_crewai_version
@@ -13,7 +14,7 @@ console = Console()
class EnterpriseConfigureCommand(BaseCommand):
def __init__(self):
def __init__(self) -> None:
super().__init__()
self.settings_command = SettingsCommand()
@@ -54,25 +55,12 @@ class EnterpriseConfigureCommand(BaseCommand):
except JSONDecodeError as e:
raise ValueError(f"Invalid JSON response from {oauth_endpoint}") from e
required_fields = [
"audience",
"domain",
"device_authorization_client_id",
"provider",
]
missing_fields = [
field for field in required_fields if field not in oauth_config
]
if missing_fields:
raise ValueError(
f"Missing required fields in OAuth2 configuration: {', '.join(missing_fields)}"
)
self._validate_oauth_config(oauth_config)
console.print(
"✅ Successfully retrieved OAuth2 configuration", style="green"
)
return oauth_config
return cast(dict[str, Any], oauth_config)
except RequestException as e:
raise ValueError(f"Failed to connect to enterprise URL: {e!s}") from e
@@ -89,6 +77,7 @@ class EnterpriseConfigureCommand(BaseCommand):
"oauth2_audience": oauth_config["audience"],
"oauth2_client_id": oauth_config["device_authorization_client_id"],
"oauth2_domain": oauth_config["domain"],
"oauth2_extra": oauth_config["extra"],
}
console.print("🔄 Updating local OAuth2 configuration...")
@@ -99,3 +88,38 @@ class EnterpriseConfigureCommand(BaseCommand):
except Exception as e:
raise ValueError(f"Failed to update OAuth2 settings: {e!s}") from e
def _validate_oauth_config(self, oauth_config: dict[str, Any]) -> None:
required_fields = [
"audience",
"domain",
"device_authorization_client_id",
"provider",
"extra",
]
missing_basic_fields = [
field for field in required_fields if field not in oauth_config
]
missing_provider_specific_fields = [
field
for field in self._get_provider_specific_fields(oauth_config["provider"])
if field not in oauth_config.get("extra", {})
]
if missing_basic_fields:
raise ValueError(
f"Missing required fields in OAuth2 configuration: [{', '.join(missing_basic_fields)}]"
)
if missing_provider_specific_fields:
raise ValueError(
f"Missing authentication provider required fields in OAuth2 configuration: [{', '.join(missing_provider_specific_fields)}] (Configured provider: '{oauth_config['provider']}')"
)
def _get_provider_specific_fields(self, provider_name: str) -> list[str]:
provider = ProviderFactory.from_settings(
Oauth2Settings(provider=provider_name, client_id="dummy", domain="dummy")
)
return provider.get_required_fields()
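
Put together, a response from the enterprise OAuth2 endpoint now has to carry the extra block as well. A payload shaped like the following (placeholder values) passes _validate_oauth_config for the Okta provider; omitting extra, or either Okta key inside it, raises the corresponding ValueError shown above:

# Placeholder values; illustrates the shape _validate_oauth_config expects
# after this change, not a real enterprise response.
oauth_config = {
    "provider": "okta",
    "audience": "https://example-api",
    "domain": "example.okta.com",
    "device_authorization_client_id": "dummy-device-client-id",
    "extra": {
        "authorization_server_name": "default",
        "using_org_auth_server": False,
    },
}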

Some files were not shown because too many files have changed in this diff.