mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-04-11 13:32:34 +00:00
* fix: harden NL2SQLTool — read-only by default, parameterized queries, query validation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address CI lint failures and remove unused import - Remove unused `sessionmaker` import from test_nl2sql_security.py - Use `Self` return type on `_apply_env_override` (fixes UP037/F821) - Fix ruff errors auto-fixed in lib/crewai (UP007, etc.) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: expand _WRITE_COMMANDS and block multi-statement semicolon injection - Add missing write commands: UPSERT, LOAD, COPY, VACUUM, ANALYZE, ANALYSE, REINDEX, CLUSTER, REFRESH, COMMENT, SET, RESET - _validate_query() now splits on ';' and validates each statement independently; multi-statement queries are rejected outright in read-only mode to prevent 'SELECT 1; DROP TABLE users' bypass - Extract single-statement logic into _validate_statement() helper - Add TestSemicolonInjection and TestExtendedWriteCommands test classes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: retrigger * fix: use typing_extensions.Self for Python 3.10 compat * chore: update tool specifications * docs: document NL2SQLTool read-only default and DML configuration Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: close three NL2SQLTool security gaps (writable CTEs, EXPLAIN ANALYZE, multi-stmt commit) - Remove WITH from _READ_ONLY_COMMANDS; scan CTE body for write keywords so writable CTEs like `WITH d AS (DELETE …) SELECT …` are blocked in read-only mode. - EXPLAIN ANALYZE/ANALYSE now resolves the underlying command; EXPLAIN ANALYZE DELETE is treated as a write and blocked in read-only mode. - execute_sql commit decision now checks ALL semicolon-separated statements so a SELECT-first batch like `SELECT 1; DROP TABLE t` still triggers a commit when allow_dml=True. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: handle parenthesized EXPLAIN options syntax; remove unused _seed_db _validate_statement now strips parenthesized options from EXPLAIN (e.g. EXPLAIN (ANALYZE) DELETE, EXPLAIN (ANALYZE, VERBOSE) DELETE) before checking whether ANALYZE/ANALYSE is present — closing the bypass where the options-list form was silently allowed in read-only mode. Adds three new tests: - EXPLAIN (ANALYZE) DELETE → blocked - EXPLAIN (ANALYZE, VERBOSE) DELETE → blocked - EXPLAIN (VERBOSE) SELECT → allowed Also removes the unused _seed_db helper from test_nl2sql_security.py. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: update tool specifications * fix: smarter CTE write detection, fix commit logic for writable CTEs - Replace naive token-set matching with positional AS() body inspection to avoid false positives on column names like 'comment', 'set', 'reset' - Fix execute_sql commit logic to detect writable CTEs (WITH + DELETE/INSERT) not just top-level write commands - Add tests for false positive cases and writable CTE commit behavior - Format nl2sql_tool.py to pass ruff format check * fix: catch write commands in CTE main query + handle whitespace in AS() - WITH cte AS (SELECT 1) DELETE FROM users now correctly blocked - AS followed by newline/tab/multi-space before ( now detected - execute_sql commit logic updated for both cases - 4 new tests * fix: EXPLAIN ANALYZE VERBOSE handling, string literal paren bypass, commit logic for EXPLAIN ANALYZE - EXPLAIN handler now consumes all known options (ANALYZE, ANALYSE, VERBOSE) before extracting the real command, fixing 'EXPLAIN ANALYZE VERBOSE SELECT' being blocked - Paren walker in _extract_main_query_after_cte now skips string literals, preventing 'WITH cte AS (SELECT '\''('\'' FROM t) DELETE FROM users' from bypassing detection - _is_write_stmt in execute_sql now resolves EXPLAIN ANALYZE to underlying command via _resolve_explain_command, ensuring 
session.commit() fires for write operations - 10 new tests covering all three fixes * fix: deduplicate EXPLAIN parsing, fix AS( regex in strings, block unknown CTE commands, bump langchain-core - Refactor _validate_statement to use _resolve_explain_command (single source of truth) - _iter_as_paren_matches skips string literals so 'AS (' in data doesn't confuse CTE detection - Unknown commands after CTE definitions now blocked in read-only mode - Bump langchain-core override to >=1.2.28 (GHSA-926x-3r5x-gfhw) * fix: add return type annotation to _iter_as_paren_matches --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
---
title: NL2SQL Tool
description: The `NL2SQLTool` is designed to convert natural language to SQL queries.
icon: language
mode: "wide"
---

## Overview

This tool converts natural language into SQL queries. When passed to an agent, it generates queries and then uses them to interact with the database.

This enables workflows such as having an Agent access the database, fetch information based on its goal, and then use that information to generate a response, report, or any other output.
Once write access is enabled, the Agent can also update the database based on its goal.

**Attention**: By default the tool is read-only (SELECT/SHOW/DESCRIBE/EXPLAIN only). Write operations require `allow_dml=True` or the `CREWAI_NL2SQL_ALLOW_DML=true` environment variable. When write access is enabled, make sure the Agent uses a scoped database user or a read replica where possible.

## Security Model

`NL2SQLTool` is an execution-capable tool. It runs model-generated SQL directly against the configured database connection.

This means risk depends on your deployment choices:

- Which credentials you provide in `db_uri`
- Whether untrusted input can influence prompts
- Whether you add tool-call guardrails before execution

If you route untrusted input to agents using this tool, treat it as a high-risk integration.

## Hardening Recommendations

Use all of the following in production:

- Use a read-only database user whenever possible
- Prefer a read replica for analytics/retrieval workloads
- Grant least privilege (no superuser/admin roles, no file/system-level capabilities)
- Apply database-side resource limits (statement timeout, lock timeout, cost/row limits)
- Add `before_tool_call` hooks to enforce allowed query patterns
- Enable query logging and alerting for destructive statements

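As an illustration of the "allowed query patterns" item above, the following is a minimal sketch of a check that a `before_tool_call` hook could run on the generated SQL. The function name `is_query_allowed` and the `ALLOWED_TABLES` set are hypothetical, and the wiring into a hook is deliberately not shown. A regular expression is not a SQL parser (for example, it does not see comma-separated tables in a `FROM` clause), so treat this as one extra layer and not a replacement for database-side permissions.

```python
import re

# Hypothetical allow-list: the tables this agent is permitted to read.
ALLOWED_TABLES = {"users", "orders"}

# Matches the identifier that follows FROM or JOIN (optionally schema-qualified).
_TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def is_query_allowed(sql: str) -> bool:
    """Return True only if every table found after FROM/JOIN is allow-listed."""
    tables = {name.split(".")[-1].lower() for name in _TABLE_REF.findall(sql)}
    # Reject queries with no recognizable table reference instead of guessing.
    return bool(tables) and tables <= ALLOWED_TABLES
```

A hook would call this function with the SQL string the agent is about to execute and block the tool call when it returns `False`.
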
## Read-Only Mode & DML Configuration

`NL2SQLTool` operates in **read-only mode by default**. Only the following statement types are permitted without additional configuration:

- `SELECT`
- `SHOW`
- `DESCRIBE`
- `EXPLAIN`

Any attempt to execute a write operation (`INSERT`, `UPDATE`, `DELETE`, `DROP`, `CREATE`, `ALTER`, `TRUNCATE`, etc.) will raise an error unless DML is explicitly enabled.

Multi-statement queries containing semicolons (e.g. `SELECT 1; DROP TABLE users`) are also blocked in read-only mode to prevent injection attacks.

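To make the two rules above concrete, here is a simplified sketch of a read-only check: it rejects multi-statement input and then inspects the leading keyword. This is an illustration only, not the tool's actual validation code, and the names `READ_ONLY_COMMANDS` and `check_read_only` are made up for this example. It is intentionally conservative, so a semicolon inside a string literal is also rejected.

```python
READ_ONLY_COMMANDS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


def check_read_only(sql: str) -> None:
    """Raise ValueError unless `sql` is a single read-only statement."""
    # Drop trailing semicolons, then treat any remaining ';' as a separator.
    statements = [part.strip() for part in sql.strip().rstrip(";").split(";")]
    if len(statements) != 1 or not statements[0]:
        raise ValueError("multi-statement or empty queries are not allowed")
    # The first whitespace-delimited token decides the statement type.
    command = statements[0].split(None, 1)[0].upper()
    if command not in READ_ONLY_COMMANDS:
        raise ValueError(f"{command} is not permitted in read-only mode")
```

With this sketch, `check_read_only("SELECT * FROM users;")` returns without error, while `check_read_only("SELECT 1; DROP TABLE users")` raises `ValueError`.
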
### Enabling Write Operations

You can enable write operations in two ways. Although the setting is named after DML (Data Manipulation Language), it also permits statements such as `DROP`.

**Option 1: constructor parameter**

```python
from crewai_tools import NL2SQLTool

nl2sql = NL2SQLTool(
    db_uri="postgresql://example@localhost:5432/test_db",
    allow_dml=True,
)
```

**Option 2: environment variable**

```bash
export CREWAI_NL2SQL_ALLOW_DML=true
```

```python
from crewai_tools import NL2SQLTool

# DML enabled via environment variable
nl2sql = NL2SQLTool(db_uri="postgresql://example@localhost:5432/test_db")
```

### Usage Examples

**Read-only (default), safe for analytics and reporting:**

```python
from crewai_tools import NL2SQLTool

# Only SELECT/SHOW/DESCRIBE/EXPLAIN are permitted
nl2sql = NL2SQLTool(db_uri="postgresql://example@localhost:5432/test_db")
```

**DML enabled, required for write workloads:**

```python
from crewai_tools import NL2SQLTool

# INSERT, UPDATE, DELETE, DROP, etc. are permitted
nl2sql = NL2SQLTool(
    db_uri="postgresql://example@localhost:5432/test_db",
    allow_dml=True,
)
```

<Warning>
Enabling DML gives the agent the ability to modify or destroy data. Only enable this when your use case explicitly requires write access, and ensure the database credentials are scoped to the minimum required privileges.
</Warning>

## Requirements

- SQLAlchemy
- A driver library compatible with your database (e.g. psycopg2, mysql-connector-python)

## Installation

Install the `crewai_tools` package:

```shell
pip install 'crewai[tools]'
```

## Usage

To use the `NL2SQLTool`, pass the database URI to the tool. The URI should be in the format `dialect+driver://username:password@host:port/database`.

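Credentials that contain characters such as `@`, `:` or `/` break this format unless they are percent-encoded. The helper below is a small sketch of assembling the URI safely with the standard library; `build_db_uri` is a hypothetical name and is not part of `crewai_tools`, and the password is a placeholder.

```python
from urllib.parse import quote_plus


def build_db_uri(dialect, driver, username, password, host, port, database):
    """Assemble a database URI, percent-encoding the credentials."""
    scheme = f"{dialect}+{driver}" if driver else dialect
    # '@', ':' and '/' in credentials would otherwise be read as URI delimiters.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    return f"{scheme}://{credentials}@{host}:{port}/{database}"


uri = build_db_uri(
    "postgresql", "psycopg2", "example", "p@ss/word", "localhost", 5432, "test_db"
)
print(uri)  # postgresql+psycopg2://example:p%40ss%2Fword@localhost:5432/test_db
```

The resulting string can be passed as `db_uri`. Loading the password from an environment variable or secret store keeps it out of source control.
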
```python Code
from crewai import Agent
from crewai.project import agent
from crewai_tools import NL2SQLTool

# psycopg2 was installed to run this example with PostgreSQL
nl2sql = NL2SQLTool(db_uri="postgresql://example@localhost:5432/test_db")

# Defined as a method inside your crew class
@agent
def researcher(self) -> Agent:
    return Agent(
        config=self.agents_config["researcher"],
        allow_delegation=False,
        tools=[nl2sql],
    )
```

## Example

The primary task goal was:

"Retrieve the average, maximum, and minimum monthly revenue for each city, but only include cities that have more than one user. Also, count the number of user in each city and sort the results by the average monthly revenue in descending order"

The Agent queried the database to get this information. Its first query was wrong, so it tried again, retrieved the correct information, and passed it to the next agent.

![alt text](https://github.com/crewAIInc/crewAI-tools/blob/main/crewai_tools/tools/nl2sql/images/image-2.png?raw=true)
![alt text](https://github.com/crewAIInc/crewAI-tools/blob/main/crewai_tools/tools/nl2sql/images/image-3.png?raw=true)

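For reference, the first task goal maps onto a grouped aggregate with a `HAVING` filter. The sketch below runs such a query against an in-memory SQLite database; the `users` table, its columns, and the sample rows are assumptions made for this illustration and are not the schema or the SQL from the run shown in the screenshots.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT, monthly_revenue REAL)"
)
# Hypothetical sample data: two cities with two users, one city with a single user.
conn.executemany(
    "INSERT INTO users (city, monthly_revenue) VALUES (?, ?)",
    [("Lisbon", 100.0), ("Lisbon", 300.0), ("Porto", 50.0), ("Porto", 70.0), ("Faro", 999.0)],
)

query = """
SELECT city,
       AVG(monthly_revenue) AS avg_revenue,
       MAX(monthly_revenue) AS max_revenue,
       MIN(monthly_revenue) AS min_revenue,
       COUNT(*) AS user_count
FROM users
GROUP BY city
HAVING COUNT(*) > 1          -- only cities with more than one user
ORDER BY avg_revenue DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('Lisbon', 200.0, 300.0, 100.0, 2)
# ('Porto', 60.0, 70.0, 50.0, 2)
```

The single-user city is dropped by the `HAVING` clause, which is the part of the goal an agent is most likely to get wrong on a first attempt.
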
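The second task goal was:

"Review the data and create a detailed report, and then create the table on the database with the fields based on the data provided.
Include information on the average, maximum, and minimum monthly revenue for each city, but only include cities that have more than one user. Also, count the number of users in each city and sort the results by the average monthly revenue in descending order."

This is where things get interesting: the Agent generates SQL that not only creates the table but also inserts the data into it. At the end, the Agent still returns the final report, which matches exactly what was written to the database.

![alt text](https://github.com/crewAIInc/crewAI-tools/blob/main/crewai_tools/tools/nl2sql/images/image-4.png?raw=true)
![alt text](https://github.com/crewAIInc/crewAI-tools/blob/main/crewai_tools/tools/nl2sql/images/image-5.png?raw=true)

![alt text](https://github.com/crewAIInc/crewAI-tools/blob/main/crewai_tools/tools/nl2sql/images/image-9.png?raw=true)
![alt text](https://github.com/crewAIInc/crewAI-tools/blob/main/crewai_tools/tools/nl2sql/images/image-7.png?raw=true)
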
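The write step of the second task amounts to a `CREATE TABLE` followed by inserts and a commit. As a rough sketch of that pattern, again using in-memory SQLite, the code below stores a report table using bound parameters; the table name `city_revenue_report`, its columns, and the rows are hypothetical. Against a real database through `NL2SQLTool`, statements like these would only run with DML enabled.

```python
import sqlite3

# Hypothetical aggregated rows, shaped like a per-city revenue report.
report_rows = [("Lisbon", 200.0, 300.0, 100.0, 2), ("Porto", 60.0, 70.0, 50.0, 2)]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE city_revenue_report (
        city TEXT PRIMARY KEY,
        avg_revenue REAL,
        max_revenue REAL,
        min_revenue REAL,
        user_count INTEGER
    )
    """
)
# Placeholders keep the values out of the SQL text itself.
with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany(
        "INSERT INTO city_revenue_report VALUES (?, ?, ?, ?, ?)", report_rows
    )

stored = conn.execute(
    "SELECT * FROM city_revenue_report ORDER BY avg_revenue DESC"
).fetchall()
```

Reading the rows back, as in the last statement, is a cheap way to confirm that the stored data matches the report the agent returned.
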
This is a simple example of how the `NL2SQLTool` can be used to interact with a database and generate reports from its data.

Because the tool can both read from and write to the database, agents can use it at either end of a pipeline, with any number of agents in between:

```md
DB -> Agent -> ... -> Agent -> DB
```