Two sites that were mechanically rewritten by the lazy-getter
regex shouldn't actually go through the lazy getter:
- `BedrockCompletion._ensure_async_client` manages its own client
lifecycle through `aiobotocore` inside an exit stack. Its trailing
`return self._get_async_client()` was a redundant indirection
through a stub method that doesn't even attempt to build a client.
Return the cached attribute directly.
- `GeminiCompletion._get_client_params` is a lightweight config
accessor used at `to_config_dict()` time. Calling `_get_sync_client()`
here forced client construction (and would raise `ValueError` when
credentials aren't set) just to check the `vertexai` attribute. Read
`self._client` directly and null-guard before the `hasattr` check.
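A minimal sketch of the null-guarded accessor described above; the class and field names mirror the description, not the actual crewAI source:

```python
class GeminiCompletionSketch:
    def __init__(self) -> None:
        self._client = None  # built lazily elsewhere, may still be unset

    def _get_client_params(self) -> dict:
        params: dict = {}
        # Null-guard: read the cached attribute, never the lazy getter,
        # so a credentials-less instance can still serialize its config
        # without forcing client construction (which would raise ValueError).
        if self._client is not None and hasattr(self._client, "vertexai"):
            params["vertexai"] = self._client.vertexai
        return params
```

With no client built, the accessor simply returns an empty config instead of raising.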
The lazy-init refactor rewrote `aclose` to access the async client via
`_get_async_client()`, which forces lazy construction. When an
`AzureCompletion` is instantiated without credentials (the whole point
of deferred init), that call raises `ValueError: "Azure API key is
required"` during cleanup — including via `async with` / `__aexit__`.
Access the cached `_async_client` attribute directly so cleanup on an
uninitialized LLM is a harmless no-op. Add a regression test that
enters and exits an `async with` block against a credentials-less
`AzureCompletion`.
Adds a new `crewai deploy validate` command that checks a project
locally against the most common categories of deploy-time failures,
so users don't burn attempts on fixable project-structure problems.
`crewai deploy create` and `crewai deploy push` now run the same
checks automatically and abort on errors; `--skip-validate` opts out.
Checks (errors block, warnings print only):
1. pyproject.toml present with `[project].name`
2. lockfile (uv.lock or poetry.lock) present and not stale
3. src/<package>/ resolves, rejecting empty names and .egg-info dirs
4. crew.py, config/agents.yaml, config/tasks.yaml for standard crews
5. main.py for flow projects
6. hatchling wheel target resolves
7. crew/flow module imports cleanly in a `uv run` subprocess, with
classification of common failures (missing provider extras,
missing API keys at import, stale crewai pins, pydantic errors)
8. env vars referenced in source vs .env (warning only)
9. crewai lockfile pin vs a known-bad cutoff (warning only)
Each finding has a stable code and a structured title/detail/hint so
downstream tooling and tests can pin behavior. 33 tests cover the
checks 1:1 against the failure patterns observed in practice.
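The stable-code/title/detail/hint shape might look like the following sketch (field and code names are illustrative, not the actual crewAI types):

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Finding:
    code: str                              # stable identifier, e.g. "V001"
    severity: Literal["error", "warning"]
    title: str
    detail: str
    hint: str


def should_abort(findings: list) -> bool:
    # Errors block `deploy create` / `deploy push`; warnings print only.
    return any(f.severity == "error" for f in findings)
```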
All native LLM providers built their SDK clients inside
`@model_validator(mode="after")`, which required the API key at
`LLM(...)` construction time. Instantiating an LLM at module scope
(e.g. `chat_llm=LLM(model="openai/gpt-4o-mini")` on a `@crew` method)
crashed during downstream crew-metadata extraction with a confusing
`ImportError: Error importing native provider: 1 validation error...`
before the process env vars were ever consulted.
Wrap eager client construction in a try/except in each provider and
add `_get_sync_client` / `_get_async_client` methods that build on
first use. OpenAI call sites are routed through the lazy getters so
calls made without eager construction still work. The descriptive
"X_API_KEY is required" errors are re-raised from the lazy path at
first real call.
Update two Azure tests that asserted the old eager-error contract to
assert the new lazy-error contract.
Pydantic schemas intermittently fail strict tool-use on OpenAI, Anthropic,
and Bedrock. All three reject nested objects missing `additionalProperties:
false`, and Anthropic also rejects keywords like `minLength` and top-level
`anyOf`. Adds per-provider sanitizers that inline refs, close objects, mark
every property required, preserve nullable unions, and strip the keywords
each grammar compiler rejects. Verified against real Bedrock, Anthropic, and
OpenAI.
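The core sanitizer walk can be sketched like this (a simplified, hypothetical version: only `minLength` is stripped here, and the real per-provider sanitizers also inline refs and preserve nullable unions):

```python
def sanitize_schema(schema: dict) -> dict:
    """Recursively close objects, mark every property required, and strip
    a keyword the grammar compiler rejects (minLength, as an example)."""
    out = {k: v for k, v in schema.items() if k != "minLength"}
    if out.get("type") == "object":
        props = {k: sanitize_schema(v) for k, v in out.get("properties", {}).items()}
        out["properties"] = props
        out["additionalProperties"] = False  # close the object
        out["required"] = sorted(props)      # every property required
    elif out.get("type") == "array" and isinstance(out.get("items"), dict):
        out["items"] = sanitize_schema(out["items"])
    return out
```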
Substring checks like `'0.1' not in json_str` collided with timestamps
such as `2026-04-10T13:00:50.140557` on CI. Round-trip through
`model_validate_json` to verify structurally that the embedding field
is absent from the serialized output.
- Rewrite TUI with Tree widget showing branch/fork lineage
- Add Resume and Fork buttons in detail panel with Collapsible entities
- Show branch and parent_id in detail panel and CLI info output
- Auto-detect .checkpoints.db when default dir missing
- Append .db to location for SqliteProvider when no extension set
- Fix RuntimeState.from_checkpoint not setting provider/location
- Fork now writes initial checkpoint on new branch
- Add from_checkpoint, fork, and CLI docs to checkpointing.mdx
The OpenAI-format tool schema sets strict: true but this was dropped
during conversion to Anthropic/Bedrock formats, so neither provider
used constrained decoding. Without it, the model can return string
"None" instead of JSON null for nullable fields, causing Pydantic
validation failures.
Accept CheckpointConfig on Crew and Flow kickoff/kickoff_async/akickoff.
When restore_from is set, the entity resumes from that checkpoint.
When only config fields are set, checkpointing is enabled for the run.
Adds restore_from field (Path | str | None) to CheckpointConfig.
Write the crewAI package version into every checkpoint blob. On restore,
run version-based migrations so older checkpoints can be transformed
forward to the current format. Adds crewai.utilities.version module.
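One plausible shape for the version-gated migration chain (version numbers, key names, and the example transform below are all hypothetical):

```python
CURRENT_VERSION = "1.3.0"  # hypothetical


def _rename_data_to_state(blob: dict) -> dict:
    # Example migration: pretend older blobs stored state under "data".
    if "data" in blob:
        blob["state"] = blob.pop("data")
    return blob


# Each entry: checkpoints written *below* this version get the transform.
MIGRATIONS = [("1.2.0", _rename_data_to_state)]


def _version_tuple(v: str) -> tuple:
    return tuple(int(p) for p in v.split("."))


def migrate_checkpoint(blob: dict) -> dict:
    written = blob.get("crewai_version", "0.0.0")
    for cutoff, transform in MIGRATIONS:
        if _version_tuple(written) < _version_tuple(cutoff):
            blob = transform(blob)
    blob["crewai_version"] = CURRENT_VERSION
    return blob
```

Migrations run in order, each gated on the version stamped into the blob at write time, so a checkpoint from any older release is transformed forward step by step.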
* fix: harden NL2SQLTool — read-only by default, parameterized queries, query validation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address CI lint failures and remove unused import
- Remove unused `sessionmaker` import from test_nl2sql_security.py
- Use `Self` return type on `_apply_env_override` (fixes UP037/F821)
- Fix ruff errors auto-fixed in lib/crewai (UP007, etc.)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: expand _WRITE_COMMANDS and block multi-statement semicolon injection
- Add missing write commands: UPSERT, LOAD, COPY, VACUUM, ANALYZE,
ANALYSE, REINDEX, CLUSTER, REFRESH, COMMENT, SET, RESET
- _validate_query() now splits on ';' and validates each statement
independently; multi-statement queries are rejected outright in
read-only mode to prevent 'SELECT 1; DROP TABLE users' bypass
- Extract single-statement logic into _validate_statement() helper
- Add TestSemicolonInjection and TestExtendedWriteCommands test classes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
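The split-and-validate-each logic described above can be sketched as (a simplified stand-in for the real `_validate_query` / `_validate_statement` pair; a naive `split(";")` also splits semicolons inside string literals, which the sketch ignores):

```python
_WRITE_COMMANDS = {
    "INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE", "TRUNCATE",
    "UPSERT", "LOAD", "COPY", "VACUUM", "ANALYZE", "ANALYSE", "REINDEX",
    "CLUSTER", "REFRESH", "COMMENT", "SET", "RESET",
}


def _validate_statement(stmt: str, read_only: bool) -> None:
    command = stmt.split(None, 1)[0].upper()
    if read_only and command in _WRITE_COMMANDS:
        raise ValueError(f"{command} not allowed in read-only mode")


def validate_query(query: str, read_only: bool = True) -> None:
    statements = [s.strip() for s in query.split(";") if s.strip()]
    if read_only and len(statements) > 1:
        # 'SELECT 1; DROP TABLE users' must be rejected outright.
        raise ValueError("multi-statement queries not allowed in read-only mode")
    for stmt in statements:
        _validate_statement(stmt, read_only)
```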
* ci: retrigger
* fix: use typing_extensions.Self for Python 3.10 compat
* chore: update tool specifications
* docs: document NL2SQLTool read-only default and DML configuration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: close three NL2SQLTool security gaps (writable CTEs, EXPLAIN ANALYZE, multi-stmt commit)
- Remove WITH from _READ_ONLY_COMMANDS; scan CTE body for write keywords so
writable CTEs like `WITH d AS (DELETE …) SELECT …` are blocked in read-only mode.
- EXPLAIN ANALYZE/ANALYSE now resolves the underlying command; EXPLAIN ANALYZE DELETE
is treated as a write and blocked in read-only mode.
- execute_sql commit decision now checks ALL semicolon-separated statements so
a SELECT-first batch like `SELECT 1; DROP TABLE t` still triggers a commit
when allow_dml=True.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: handle parenthesized EXPLAIN options syntax; remove unused _seed_db
_validate_statement now strips parenthesized options from EXPLAIN (e.g.
EXPLAIN (ANALYZE) DELETE, EXPLAIN (ANALYZE, VERBOSE) DELETE) before
checking whether ANALYZE/ANALYSE is present — closing the bypass where
the options-list form was silently allowed in read-only mode.
Adds three new tests:
- EXPLAIN (ANALYZE) DELETE → blocked
- EXPLAIN (ANALYZE, VERBOSE) DELETE → blocked
- EXPLAIN (VERBOSE) SELECT → allowed
Also removes the unused _seed_db helper from test_nl2sql_security.py.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
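A sketch of resolving EXPLAIN to its underlying command, covering both the bare-keyword and parenthesized-options forms (function and option names are taken from the description; the real helper may differ):

```python
import re

_EXPLAIN_OPTIONS = ("ANALYZE", "ANALYSE", "VERBOSE")


def resolve_explain_command(stmt: str) -> str:
    """Return the command EXPLAIN ultimately executes, handling both
    'EXPLAIN ANALYZE DELETE ...' and 'EXPLAIN (ANALYZE, VERBOSE) DELETE ...'."""
    rest = re.sub(r"^\s*EXPLAIN\s*", "", stmt, count=1, flags=re.IGNORECASE)
    # Strip a parenthesized options list: (ANALYZE), (ANALYZE, VERBOSE), ...
    rest = re.sub(r"^\(\s*[^)]*\)\s*", "", rest, count=1)
    # Consume bare option keywords until the real command is reached.
    pattern = r"^(%s)\s+" % "|".join(_EXPLAIN_OPTIONS)
    while True:
        m = re.match(pattern, rest, flags=re.IGNORECASE)
        if m is None:
            break
        rest = rest[m.end():]
    return rest.split(None, 1)[0].upper() if rest.strip() else ""
```

Read-only validation can then block when ANALYZE was present and the resolved command is a write, while still allowing `EXPLAIN (VERBOSE) SELECT`.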
* chore: update tool specifications
* fix: smarter CTE write detection, fix commit logic for writable CTEs
- Replace naive token-set matching with positional AS() body inspection
to avoid false positives on column names like 'comment', 'set', 'reset'
- Fix execute_sql commit logic to detect writable CTEs (WITH + DELETE/INSERT)
not just top-level write commands
- Add tests for false positive cases and writable CTE commit behavior
- Format nl2sql_tool.py to pass ruff format check
* fix: catch write commands in CTE main query + handle whitespace in AS()
- WITH cte AS (SELECT 1) DELETE FROM users now correctly blocked
- AS followed by newline/tab/multi-space before ( now detected
- execute_sql commit logic updated for both cases
- 4 new tests
* fix: EXPLAIN ANALYZE VERBOSE handling, string literal paren bypass, commit logic for EXPLAIN ANALYZE
- EXPLAIN handler now consumes all known options (ANALYZE, ANALYSE, VERBOSE) before
extracting the real command, fixing 'EXPLAIN ANALYZE VERBOSE SELECT' being blocked
- Paren walker in _extract_main_query_after_cte now skips string literals, preventing
  `WITH cte AS (SELECT '(' FROM t) DELETE FROM users` from bypassing detection
- _is_write_stmt in execute_sql now resolves EXPLAIN ANALYZE to underlying command
via _resolve_explain_command, ensuring session.commit() fires for write operations
- 10 new tests covering all three fixes
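The string-literal-aware paren walk can be sketched as follows (simplified to a single CTE body; the real `_extract_main_query_after_cte` also iterates multiple `AS (...)` bodies):

```python
def extract_main_query_after_cte(sql: str) -> str:
    """Walk past one CTE body, ignoring parentheses inside SQL string
    literals, and return the main statement that follows it."""
    start = sql.find("(")
    if start == -1:
        return sql
    depth, in_string, i = 0, False, start
    while i < len(sql):
        ch = sql[i]
        if in_string:
            if ch == "'":
                if i + 1 < len(sql) and sql[i + 1] == "'":
                    i += 1              # '' escapes a quote inside a literal
                else:
                    in_string = False
        elif ch == "'":
            in_string = True            # parens in here don't count
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return sql[i + 1:].strip()
        i += 1
    return ""
```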
* fix: deduplicate EXPLAIN parsing, fix AS( regex in strings, block unknown CTE commands, bump langchain-core
- Refactor _validate_statement to use _resolve_explain_command (single source of truth)
- _iter_as_paren_matches skips string literals so 'AS (' in data doesn't confuse CTE detection
- Unknown commands after CTE definitions now blocked in read-only mode
- Bump langchain-core override to >=1.2.28 (GHSA-926x-3r5x-gfhw)
* fix: add return type annotation to _iter_as_paren_matches
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
resume_async() was missing trace infrastructure that kickoff_async()
sets up, causing flow_finished to never reach the platform after HITL
feedback. Add FlowStartedEvent emission to initialize the trace batch,
await event futures, finalize the trace batch, and guard with
suppress_flow_events.
Launch a Textual TUI via `crewai checkpoint` to browse and resume
from checkpoints. Uses run_async/akickoff for fully async execution.
Adds provider auto-detection from file magic bytes.
The spec generator previously used a hardcoded list of field names to
exclude from init_params_schema. Any new field or computed_field added
to BaseTool (like tool_type from 86ce54f) would silently leak into
tool.specs.json unless someone remembered to update that list.
Now _extract_init_params() dynamically computes BaseTool's fields at
import time via model_fields + model_computed_fields, so any future
additions to BaseTool are automatically excluded.
Fields from intermediate base classes (RagTool, BraveSearchToolBase,
SerpApiBaseTool) are correctly preserved since they're not on BaseTool.
TDD:
- RED: 3 new tests confirming BaseTool field leak, intermediate base
preservation, and future-proofing — all failed before the fix
- GREEN: Dynamic allowlist applied — all 10 tests pass
- Regenerated tool.specs.json (tool_type removed from all tools)
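A stdlib analogue of the dynamic-allowlist idea (the real code computes the set from pydantic's `model_fields` + `model_computed_fields`; `dataclasses.fields` plays that role here, and the class names are stand-ins):

```python
from dataclasses import dataclass, fields


@dataclass
class BaseToolSketch:
    name: str = ""
    description: str = ""
    tool_type: str = "builtin"   # a later addition: excluded automatically


@dataclass
class RagToolSketch(BaseToolSketch):
    embedder: str = "default"    # intermediate-base field: preserved


# Computed from the base class at import time, not hardcoded, so any
# future BaseTool field never leaks into the generated specs.
_BASE_TOOL_FIELDS = {f.name for f in fields(BaseToolSketch)}


def extract_init_params(tool_cls) -> list:
    return [f.name for f in fields(tool_cls) if f.name not in _BASE_TOOL_FIELDS]
```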
Wrapping sys.stdout and sys.stderr at import time with a
threading.Lock is not fork-safe and adds overhead to every
print call. litellm.suppress_debug_info already silences the
noisy output this was designed to filter.
Replace the Protocol with a BaseModel + ABC so providers serialize and
deserialize natively via pydantic. Each provider gets a Literal
provider_type field. CheckpointConfig.provider uses a discriminated
union so the correct provider class is reconstructed from checkpoint JSON.
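The dispatch a discriminated union buys can be illustrated with a stdlib stand-in (pydantic does this reconstruction natively via `Field(discriminator=...)`; the classes and keys below are hypothetical):

```python
import json
from dataclasses import dataclass


@dataclass
class SqliteProviderSketch:
    location: str = "checkpoints.db"
    provider_type: str = "sqlite"    # Literal discriminator in the real model


@dataclass
class MemoryProviderSketch:
    provider_type: str = "memory"


_BY_TYPE = {"sqlite": SqliteProviderSketch, "memory": MemoryProviderSketch}


def provider_from_checkpoint_json(raw: str):
    # The discriminator value picks the concrete class to reconstruct,
    # which is exactly what the discriminated union automates.
    data = json.loads(raw)
    return _BY_TYPE[data.pop("provider_type")](**data)
```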
* fix: add SSRF and path traversal protections
CVE-2026-2286: validate_url blocks non-http/https schemes, private
IPs, loopback, link-local, reserved addresses. Applied to 11 web tools.
CVE-2026-2285: validate_path confines file access to the working
directory. Applied to 7 file and directory tools.
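The IP-class checks can be sketched with the stdlib `ipaddress` module (a simplified stand-in: the real `validate_url` also resolves hostnames via DNS before checking, per the rebinding fix below):

```python
import ipaddress
from urllib.parse import urlparse


def validate_url(url: str, allow_private: bool = False) -> str:
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    host = parsed.hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return url   # hostname, not an IP; real check resolves DNS first
    if not allow_private and (
        ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
    ):
        raise ValueError(f"blocked non-public address: {host}")
    return url
```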
* fix: drop unused assignment from validate_url call
* fix: DNS rebinding protection and allow_private flag
Rewrite validated URLs to use the resolved IP, preventing DNS rebinding
between validation and request time. SDK-based tools use pin_ip=False
since they manage their own HTTP clients. Add allow_private flag for
deployments that need internal network access.
* fix: unify security utilities and restore RAG chokepoint validation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor: move validation to security/ package + address review comments
- Move safe_path.py to crewai_tools/security/; add safe_url.py re-export
- Keep utilities/safe_path.py as a backwards-compat shim
- Update all 21 import sites to use crewai_tools.security.safe_path
- files_compressor_tool: validate output_path (user-controlled)
- serper_scrape_website_tool: call validate_url() before building payload
- brightdata_unlocker: validate_url() already called without assignment (no-op fix)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor: move validation to security/ package, keep utilities/ as compat shim
- security/safe_path.py is the canonical location for all validation
- utilities/safe_path.py re-exports for backward compatibility
- All tool imports already point to security.safe_path
- All review comments already addressed in prior commits
* fix: move validation outside try/except blocks, use correct directory validator
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: use resolved paths from validation to prevent symlink TOCTOU, remove unused safe_url.py
---------
Co-authored-by: Alex <alex@crewai.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>