mirror of https://github.com/crewAIInc/crewAI.git synced 2026-04-30 14:52:36 +00:00

Files

alex-clawd 9325e2f6a4 fix: add path and URL validation to RAG tools (#5310)

* fix: add path and URL validation to RAG tools

Add validation utilities to prevent unauthorized file reads and SSRF
when RAG tools accept LLM-controlled paths/URLs at runtime.

Changes:
- New crewai_tools.utilities.safe_path module with validate_file_path(),
  validate_directory_path(), and validate_url()
- File paths validated against base directory (defaults to cwd).
  Resolves symlinks and ../ traversal. Rejects escape attempts.
- URLs validated: file:// blocked entirely. HTTP/HTTPS resolves DNS
  and blocks private/reserved IPs (10.x, 172.16-31.x, 192.168.x,
  127.x, 169.254.x, 0.0.0.0, ::1, fc00::/7).
- Validation applied in RagTool.add() — catches all RAG search tools
  (JSON, CSV, PDF, TXT, DOCX, MDX, Directory, etc.)
- Removed file:// scheme support from DataTypes.from_content()
- CREWAI_TOOLS_ALLOW_UNSAFE_PATHS=true env var for backward compat
- 27 tests covering traversal, symlinks, private IPs, cloud metadata,
  IPv6, escape hatch, and valid paths/URLs
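
As a rough illustration of the base-directory check described in the list above, here is a minimal sketch. It is not the code in crewai_tools.utilities.safe_path: the helper name `is_within_base` and its signature are hypothetical, and it covers only the containment test after symlinks and `../` segments are resolved.

```python
import os
import tempfile
from typing import Optional


def is_within_base(path: str, base_dir: Optional[str] = None) -> bool:
    """Return True if `path` resolves to a location inside `base_dir`.

    Hypothetical helper for illustration; not the crewai_tools API.
    """
    # realpath collapses ".." segments and follows symlinks.
    base = os.path.realpath(base_dir or os.getcwd())
    # Relative paths are interpreted against the base directory;
    # os.path.join keeps absolute paths as they are.
    candidate = os.path.realpath(os.path.join(base, path))
    # commonpath compares whole path components, so "/data2" is not
    # mistaken for a child of "/data".
    return os.path.commonpath([base, candidate]) == base


with tempfile.TemporaryDirectory() as base:
    inside = is_within_base("docs/report.txt", base)
    escaped = is_within_base("../outside.txt", base)
    absolute = is_within_base(os.path.join(os.sep, "etc", "passwd"), base)
```

Comparing path components with `os.path.commonpath` rather than using a plain string prefix test avoids treating a sibling directory with a similar name as being inside the base.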

* fix: validate path/URL keyword args in RagTool.add()

The original patch validated positional *args but left all keyword
arguments (path=, file_path=, directory_path=, url=, website=,
github_url=, youtube_url=) unvalidated, providing a trivial bypass
for both path-traversal and SSRF checks.

Applies validate_file_path() to path/file_path/directory_path kwargs
and validate_url() to url/website/github_url/youtube_url kwargs before
they reach the adapter. Adds a regression-test file covering all eight
kwarg vectors plus the two existing positional-arg checks.
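
The keyword-argument handling described above could look roughly like the sketch below. The keyword names come from this commit message; `validate_kwargs` and the two stand-in validators are hypothetical and much simpler than the real validate_file_path() and validate_url().

```python
from typing import Any, Callable, Dict

# Keyword names taken from the commit message above.
PATH_KWARGS = ("path", "file_path", "directory_path")
URL_KWARGS = ("url", "website", "github_url", "youtube_url")


def validate_kwargs(
    kwargs: Dict[str, Any],
    check_path: Callable[[str], str],
    check_url: Callable[[str], str],
) -> Dict[str, Any]:
    """Return a copy of kwargs with path/URL values passed through validators.

    Each validator returns the (possibly normalized) value or raises
    ValueError. Hypothetical helper; not the RagTool.add() implementation.
    """
    checked = dict(kwargs)
    for name in PATH_KWARGS:
        if isinstance(checked.get(name), str):
            checked[name] = check_path(checked[name])
    for name in URL_KWARGS:
        if isinstance(checked.get(name), str):
            checked[name] = check_url(checked[name])
    return checked


def reject_parent_refs(path: str) -> str:
    # Stand-in validator: refuses any ".." path component.
    if ".." in path.replace("\\", "/").split("/"):
        raise ValueError(f"path escapes base directory: {path}")
    return path


def reject_file_scheme(url: str) -> str:
    # Stand-in validator: refuses file: URLs.
    if url.lower().startswith("file:"):
        raise ValueError(f"blocked URL scheme: {url}")
    return url
```

Errors from the validators are allowed to propagate, so a rejected value stops the call before anything reaches the adapter.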

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address CodeQL and review comments on RAG path/URL validation

- Replace insecure tempfile.mktemp() with inline symlink target in test
- Remove unused 'target' variable and unused tempfile import
- Narrow broad except Exception: pass to only catch urlparse errors;
  validate_url ValueError now propagates instead of being silently swallowed
- Fix ruff B904 (raise-without-from-inside-except) in safe_path.py
- Fix ruff B007 (unused loop variable 'family') in safe_path.py
- Use validate_directory_path in DirectorySearchTool.add() so the
  public utility is exercised in production code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix ruff format + remaining lint issues

* fix: resolve mypy type errors in RAG path/URL validation

- Cast sockaddr[0] to str() to satisfy mypy (socket.getaddrinfo returns
  sockaddr where [0] is str but typed as str | int)
- Remove now-unnecessary `type: ignore[assignment]` and
  `type: ignore[literal-required]` comments in rag_tool.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: unroll dynamic TypedDict key loops to satisfy mypy literal-required

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: allow tmp paths in RAG data-type tests via CREWAI_TOOLS_ALLOW_UNSAFE_PATHS

TemporaryDirectory creates files under /tmp/ which is outside CWD and is
correctly blocked by the new path validation.  These tests exercise
data-type handling, not security, so add an autouse fixture that sets
CREWAI_TOOLS_ALLOW_UNSAFE_PATHS=true for the whole file.  Path/URL
security is covered by test_rag_tool_path_validation.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: allow tmp paths in search-tool and rag_tool tests via CREWAI_TOOLS_ALLOW_UNSAFE_PATHS

test_search_tools.py has tests for TXTSearchTool, CSVSearchTool,
MDXSearchTool, JSONSearchTool, and DirectorySearchTool that create
files under /tmp/ via tempfile, which is outside CWD and correctly
blocked by the new path validation.  rag_tool_test.py has one test
that calls tool.add() with a TemporaryDirectory path.

Add the same autouse allow_tmp_paths fixture used in
test_rag_tool_add_data_type.py.  Security is covered separately by
test_rag_tool_path_validation.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: update tool specifications

* docs: document CodeInterpreterTool removal and RAG path/URL validation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address three review comments on path/URL validation

- safe_path._is_private_or_reserved: after unwrapping IPv4-mapped IPv6
  to IPv4, only check against IPv4 networks to avoid TypeError when
  comparing an IPv4Address against IPv6Network objects.
- safe_path.validate_file_path: handle filesystem-root base_dir ('/')
  by not appending os.sep when the base already ends with a separator,
  preventing the '//'-prefix bug.
- rag_tool.add: path-detection heuristic now checks for both '/' and
  os.sep so forward-slash paths are caught on Windows as well as Unix.
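
To make the first point above concrete, here is a minimal sketch of a private/reserved address check that unwraps IPv4-mapped IPv6 addresses and then consults only the network list for the matching address family. The function name `is_blocked_ip` and the two network lists are assumptions built from the ranges named earlier in this commit message; the real safe_path module may block more.

```python
import ipaddress

# Ranges named in the commit message; kept in separate per-family lists.
BLOCKED_V4 = [
    ipaddress.ip_network(net)
    for net in (
        "0.0.0.0/32",
        "10.0.0.0/8",
        "127.0.0.0/8",
        "169.254.0.0/16",
        "172.16.0.0/12",
        "192.168.0.0/16",
    )
]
BLOCKED_V6 = [ipaddress.ip_network(net) for net in ("::1/128", "fc00::/7")]


def is_blocked_ip(address: str) -> bool:
    """Return True if `address` falls in one of the blocked ranges.

    Hypothetical helper for illustration; not the crewai_tools API.
    """
    ip = ipaddress.ip_address(address)
    # "::ffff:127.0.0.1" is loopback in disguise: unwrap it to IPv4 first.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    # Only compare against networks of the same address family.
    networks = BLOCKED_V4 if ip.version == 4 else BLOCKED_V6
    return any(ip in net for net in networks)
```

In a full validator this check would run on every address the hostname resolves to, not only on literal IPs in the URL.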

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove unused _BLOCKED_NETWORKS variable after IPv4/IPv6 split

* chore: update tool specifications

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

2026-04-07 13:29:45 -03:00

src/crewai_tools

fix: add path and URL validation to RAG tools (#5310)

2026-04-07 13:29:45 -03:00

tests

fix: add path and URL validation to RAG tools (#5310)

2026-04-07 13:29:45 -03:00

pyproject.toml

feat: bump versions to 1.14.0a4

2026-04-07 23:22:58 +08:00

README.md

Release/v1.0.0 (#3618)

2025-10-20 14:10:19 -07:00

tool.specs.json

refactor: remove CodeInterpreterTool and deprecate code execution params (#5309)

2026-04-07 03:59:40 -03:00

README.md

CrewAI Tools

Empower your CrewAI agents with powerful, customizable tools to elevate their capabilities and tackle sophisticated, real-world tasks.

CrewAI Tools provide the essential functionality to extend your agents, helping you rapidly enhance your automations with reliable, ready-to-use tools or custom-built solutions tailored precisely to your needs.

Quick Links

Homepage | Documentation | Examples | Community

Available Tools

CrewAI provides an extensive collection of powerful tools ready to enhance your agents:

File Management: FileReadTool, FileWriteTool
Web Scraping: ScrapeWebsiteTool, SeleniumScrapingTool
Database Integrations: MySQLSearchTool
Vector Database Integrations: MongoDBVectorSearchTool, QdrantVectorSearchTool, WeaviateVectorSearchTool
API Integrations: SerperApiTool, EXASearchTool
AI-powered Tools: DallETool, VisionTool, StagehandTool

And many more robust tools to simplify your agent integrations.

Creating Custom Tools

CrewAI offers two straightforward approaches to creating custom tools:

Subclassing `BaseTool`

Define your tool by subclassing:

from crewai.tools import BaseTool


class MyCustomTool(BaseTool):
    name: str = "Tool Name"
    description: str = "Detailed description here."

    def _run(self, *args, **kwargs) -> str:
        # Your tool logic here; return the result the agent should see.
        return "Tool result"

Using the `tool` Decorator

Quickly create lightweight tools using decorators:

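from crewai.tools import tool


@tool("Tool Name")
def my_custom_function(query: str) -> str:
    """Describe what the tool does; this docstring serves as the tool description."""
    # Tool logic here (placeholder: echo the query in upper case)
    output = query.upper()
    return output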

CrewAI Tools and MCP

CrewAI Tools supports the Model Context Protocol (MCP). It gives you access to thousands of tools from the hundreds of MCP servers out there built by the community.

Before you start using MCP with CrewAI tools, you need to install the mcp extra dependencies:

pip install crewai-tools[mcp]
# or
uv add crewai-tools --extra mcp

To quickly get started with MCP in CrewAI you have 2 options:

Option 1: Fully managed connection

In this scenario we use a context manager (a with statement) to start and stop the connection with the MCP server. This happens in the background, and you only interact with the CrewAI tools that correspond to the MCP server's tools.

For an STDIO based MCP server:

import os

from crewai import Agent, Crew, Task
from crewai_tools import MCPServerAdapter
from mcp import StdioServerParameters

serverparams = StdioServerParameters(
    command="uvx",
    args=["--quiet", "pubmedmcp@0.1.3"],
    env={"UV_PYTHON": "3.12", **os.environ},
)

with MCPServerAdapter(serverparams) as tools:
    # tools is now a list of CrewAI Tools matching 1:1 with the MCP server's tools
    agent = Agent(..., tools=tools)
    task = Task(...)
    crew = Crew(..., agents=[agent], tasks=[task])
    crew.kickoff(...)

For an SSE based MCP server:

serverparams = {"url": "http://localhost:8000/sse"}
with MCPServerAdapter(serverparams) as tools:
    # tools is now a list of CrewAI Tools matching 1:1 with the MCP server's tools
    agent = Agent(..., tools=tools)
    task = Task(...)
    crew = Crew(..., agents=[agent], tasks=[task])
    crew.kickoff(...)

Option 2: More control over the MCP connection

If you need more control over the MCP connection, you can instantiate MCPServerAdapter as an mcp_server_adapter object, which you can use to manage the connection with the MCP server and access the available tools.

Important: in this case you must call mcp_server_adapter.stop() to make sure the connection is closed correctly. We recommend using a try ... finally block so that .stop() is called even if an error occurs.

Here is the same example for an STDIO MCP Server:

import os

from crewai import Agent, Crew, Task
from crewai_tools import MCPServerAdapter
from mcp import StdioServerParameters

serverparams = StdioServerParameters(
    command="uvx",
    args=["--quiet", "pubmedmcp@0.1.3"],
    env={"UV_PYTHON": "3.12", **os.environ},
)

# Create the adapter before the try block: if construction fails there is
# nothing to stop, and `finally` would otherwise hit an undefined name.
mcp_server_adapter = MCPServerAdapter(serverparams)
try:
    tools = mcp_server_adapter.tools
    # tools is now a list of CrewAI Tools matching 1:1 with the MCP server's tools
    agent = Agent(..., tools=tools)
    task = Task(...)
    crew = Crew(..., agents=[agent], tasks=[task])
    crew.kickoff(...)
finally:
    # ** important ** don't forget to stop the connection
    mcp_server_adapter.stop()

And finally the same thing but for an SSE MCP Server:

from crewai import Agent, Crew, Task
from crewai_tools import MCPServerAdapter

serverparams = {"url": "http://localhost:8000/sse"}

# Create the adapter before the try block: if construction fails there is
# nothing to stop, and `finally` would otherwise hit an undefined name.
mcp_server_adapter = MCPServerAdapter(serverparams)
try:
    tools = mcp_server_adapter.tools
    # tools is now a list of CrewAI Tools matching 1:1 with the MCP server's tools
    agent = Agent(..., tools=tools)
    task = Task(...)
    crew = Crew(..., agents=[agent], tasks=[task])
    crew.kickoff(...)
finally:
    # ** important ** don't forget to stop the connection
    mcp_server_adapter.stop()

Considerations & Limitations

Staying Safe with MCP

Always make sure that you trust an MCP server before using it. Using an STDIO server will execute code on your machine. Using SSE is not a silver bullet either: a malicious MCP server can still inject content into your application in many ways.

Limitations

At this time we only support tools from MCP servers, not other types of primitives such as prompts or resources.
We only return the first text output returned by the MCP server tool, using .content[0].text
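
To show what the second limitation means in practice, here is a small sketch using stand-in types. `TextContent`, `ToolResult`, and `first_text` are hypothetical and only mimic the shape implied by `.content[0].text`; they are not the classes from the mcp package.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextContent:
    # Stand-in for one text item in a tool result.
    text: str


@dataclass
class ToolResult:
    # Stand-in for a tool result that carries a list of content items.
    content: List[TextContent] = field(default_factory=list)


def first_text(result: ToolResult) -> str:
    # Mirrors the `.content[0].text` access: later items are ignored.
    return result.content[0].text


result = ToolResult(content=[TextContent("first"), TextContent("second")])
returned = first_text(result)
```

Here `returned` holds only the first item's text, so an MCP tool that spreads its answer over several content items reaches the agent only in part.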

Why Use CrewAI Tools?

Simplicity & Flexibility: Easy-to-use yet powerful enough for complex workflows.
Rapid Integration: Seamlessly incorporate external services, APIs, and databases.
Enterprise Ready: Built for stability, performance, and consistent results.

Contribution Guidelines

We welcome contributions from the community!

Fork and clone the repository.
Create a new branch (git checkout -b feature/my-feature).
Commit your changes (git commit -m 'Add my feature').
Push your branch (git push origin feature/my-feature).
Open a pull request.

Developer Quickstart

pip install crewai[tools]

Development Setup

Install dependencies: uv sync
Run tests: uv run pytest
Run static type checking: uv run pyright
Set up pre-commit hooks: pre-commit install

Support and Community

Join our rapidly growing community and receive real-time support:

Build smarter, faster, and more powerful AI solutions—powered by CrewAI Tools.
