Security Audit Report: crewaiinc/crewai
Date: March 9, 2026
Auditor: Cursor Cloud Agent
Repository: https://github.com/crewaiinc/crewai
Scope: Quick security check of the crewai Python framework
Executive Summary
This report presents findings from a security assessment of the CrewAI framework. The codebase demonstrates good overall security practices with several security controls in place. However, there are some areas that warrant attention, particularly around code execution capabilities and input validation.
Risk Level: MEDIUM
Key Findings Summary
- ✅ Good: No hardcoded secrets in production code
- ✅ Good: JWT authentication properly implemented with validation
- ✅ Good: Security tooling in place (Bandit, Ruff with security rules)
- ✅ Good: Dependency version pinning and override policies
- ⚠️ Concern: Code interpreter tool allows arbitrary code execution
- ⚠️ Concern: SQL injection risk in NL2SQL tool
- ⚠️ Concern: Pickle deserialization without integrity checks
- ⚠️ Info: Command injection protections needed in some areas
1. Secrets and Credential Management
✅ PASS - No Production Secrets Found
Finding: All hardcoded API keys and tokens found are in test files only.
Evidence:
- All hardcoded credentials are in test files with fake/example values
- Test environment file (.env.test) properly uses fake credentials
- Production code retrieves credentials from environment variables
Examples:
# Test files use fake credentials - ACCEPTABLE
OPENAI_API_KEY=fake-api-key
ANTHROPIC_API_KEY=fake-anthropic-key
Recommendation: ✅ Current approach is secure. Continue this pattern.
2. Dependency Vulnerabilities
✅ GOOD - Proactive Dependency Management
Finding: The project has security-conscious dependency management.
Security Controls:
- Bandit (v1.9.2) - Security linter for Python code
- Ruff with security rules enabled (S - Bandit rules)
- Dependency overrides for known vulnerabilities in pyproject.toml:
[tool.uv]
override-dependencies = [
    "langchain-core>=0.3.80,<1",  # GHSA template-injection vuln fixed
    "urllib3>=2.6.3",  # Security updates
    "pillow>=12.1.1",  # Security updates
]
Recommendation: ✅ Excellent practices. Maintain regular dependency audits.
3. Code Execution Vulnerabilities
⚠️ HIGH RISK - Code Interpreter Tool
File: lib/crewai-tools/src/crewai_tools/tools/code_interpreter_tool/code_interpreter_tool.py
Finding: The CodeInterpreterTool allows arbitrary code execution with three modes:
- Docker mode (default, safest)
- Restricted sandbox (fallback when Docker unavailable)
- Unsafe mode (runs code directly on host)
Critical Issues:
Issue 1: Unsafe Mode Command Injection
Lines 382-383:
for library in libraries_used:
    os.system(f"pip install {library}")  # noqa: S605
Risk: If library contains shell metacharacters, this could lead to command injection.
Attack Example:
libraries_used = ["numpy; rm -rf /"]
Severity: HIGH (but requires unsafe_mode=True)
Recommendation:
# Use subprocess with list arguments instead
subprocess.run(["pip", "install", library], check=True)
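Passing a list to subprocess removes the shell, but it may also be worth rejecting names that do not look like package names before pip sees them. The following is a minimal sketch under that assumption; build_pip_install_argv and the name pattern are hypothetical and not part of CrewAI.

```python
import re
import sys

# Conservative distribution-name pattern: letters, digits, '.', '_' and '-',
# starting and ending with an alphanumeric character. Version specifiers and
# extras are deliberately NOT accepted in this sketch.
_PACKAGE_NAME = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?")


def build_pip_install_argv(library: str) -> list[str]:
    """Return an argv list for installing `library`, or raise ValueError."""
    if not _PACKAGE_NAME.fullmatch(library):
        raise ValueError(f"Refusing suspicious package name: {library!r}")
    # Using the current interpreter's pip avoids relying on PATH lookup.
    return [sys.executable, "-m", "pip", "install", library]
```

The result can be handed to subprocess.run(argv, check=True). This only addresses shell injection; it does not stop an agent from installing a malicious but validly named package.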
Issue 2: Sandbox Can Be Bypassed
Lines 60-83: The restricted sandbox blocks certain modules, but:
- Blocks are incomplete (e.g., pathlib not blocked, could access filesystem)
- Determined attackers may find bypass techniques
- No resource limits (CPU, memory, time)
Recommendation:
- Add resource limits to sandbox execution
- Consider using more robust sandboxing like RestrictedPython
- Document that sandbox is defense-in-depth, not primary security
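The resource-limit recommendation could be approached by running untrusted code in a child process with capped CPU time, address space, and wall-clock time. This is a rough POSIX-only sketch; run_limited and its default limits are assumptions for illustration, not the tool's actual design.

```python
import resource
import subprocess
import sys


def _cap(limit: int, value: int) -> None:
    """Lower both soft and hard limits to `value` (never above the current hard limit)."""
    _soft, hard = resource.getrlimit(limit)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(limit, (value, value))


def run_limited(
    code: str,
    cpu_seconds: int = 2,
    memory_bytes: int = 1024**3,
    wall_timeout: float = 5.0,
) -> subprocess.CompletedProcess:
    """Run `code` in a separate interpreter with CPU, memory and time caps."""

    def apply_limits() -> None:
        # Runs in the child just before exec, so the parent is unaffected.
        _cap(resource.RLIMIT_CPU, cpu_seconds)
        _cap(resource.RLIMIT_AS, memory_bytes)

    # subprocess.run kills the child and raises TimeoutExpired if
    # wall_timeout elapses first.
    return subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=wall_timeout,
        preexec_fn=apply_limits,
        check=False,
    )
```

Limits like these bound resource exhaustion only; they do not restrict filesystem or network access, so they complement rather than replace Docker isolation.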
Issue 3: Docker Volume Mounting
Lines 260-267:
volumes={current_path: {"bind": "/workspace", "mode": "rw"}}
Risk: Mounts entire current working directory with read-write access.
Recommendation:
- Mount as read-only by default
- Allow write access to specific temporary directory only
- Add option to restrict mounted paths
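One way to express these three recommendations is a small helper that builds the volumes mapping in the same dictionary format shown above. The helper name build_volumes and the /scratch mount point are hypothetical; this is a sketch, not the tool's current behavior.

```python
import tempfile
from pathlib import Path


def build_volumes(workspace: str, writable_scratch: bool = True) -> dict[str, dict[str, str]]:
    """Mount the workspace read-only, plus an optional writable temp directory."""
    volumes = {
        str(Path(workspace).resolve()): {"bind": "/workspace", "mode": "ro"},
    }
    if writable_scratch:
        # A fresh host directory; the caller is responsible for removing it.
        scratch = tempfile.mkdtemp(prefix="code-interpreter-")
        volumes[scratch] = {"bind": "/scratch", "mode": "rw"}
    return volumes
```

Code running in the container can then read project files but write only under /scratch, which limits the damage a generated script can do to the host working directory.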
4. SQL Injection Vulnerabilities
⚠️ HIGH RISK - NL2SQL Tool
File: lib/crewai-tools/src/crewai_tools/tools/nl2sql/nl2sql_tool.py
Finding: SQL injection vulnerability in schema introspection.
Lines 56-58:
def _fetch_all_available_columns(self, table_name: str):
    return self.execute_sql(
        f"SELECT column_name, data_type FROM information_schema.columns WHERE table_name = '{table_name}';"  # noqa: S608
    )
Risk: If table_name contains malicious SQL, it will be executed.
Attack Example:
table_name = "'; DROP TABLE users; --"
Severity: HIGH
Recommendation:
def _fetch_all_available_columns(self, table_name: str):
    return self.execute_sql(
        "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = :table_name",
        params={"table_name": table_name}
    )
Note: The tool does use parameterized queries via SQLAlchemy's text() for user queries (line 82), which is good. Only the internal method is vulnerable.
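To illustrate why bound parameters neutralize the attack string above, here is a self-contained sketch using the standard-library sqlite3 module as a stand-in for the real database. The column_catalog table is a hypothetical substitute for information_schema.columns, and fetch_columns is not the tool's method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE column_catalog (table_name TEXT, column_name TEXT, data_type TEXT);
    INSERT INTO column_catalog VALUES
        ('users', 'id', 'INTEGER'),
        ('users', 'name', 'TEXT');
    """
)


def fetch_columns(conn: sqlite3.Connection, table_name: str) -> list[tuple[str, str]]:
    """Look up columns with a bound parameter instead of string formatting."""
    cursor = conn.execute(
        "SELECT column_name, data_type FROM column_catalog "
        "WHERE table_name = :table_name ORDER BY column_name",
        {"table_name": table_name},  # value is sent as data, never parsed as SQL
    )
    return cursor.fetchall()
```

Calling fetch_columns(conn, "'; DROP TABLE users; --") simply finds no matching rows and leaves the users table intact. SQLAlchemy's text() accepts the same :name placeholder style, with the values passed as a dictionary to execute().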
5. Insecure Deserialization
⚠️ MEDIUM RISK - Pickle Usage
File: lib/crewai/src/crewai/utilities/file_handler.py
Finding: Pickle is used for persistence without integrity verification.
Lines 168-170:
with open(self.file_path, "rb") as file:
    try:
        return pickle.load(file)  # noqa: S301
Risk: Pickle can execute arbitrary code during deserialization. If an attacker can modify pickle files, they can execute arbitrary code in any process that later loads them.
Severity: MEDIUM (requires write access to pickle files)
Context: Used by PickleHandler class for storing training data and agent state.
Recommendations:
- Immediate: Add file integrity checks (HMAC signatures)
- Short-term: Switch to JSON for non-object data
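- Long-term: Move to a data-only serialization format. Note that jsonpickle can also reconstruct arbitrary objects on load, so it is not a safe choice for files an attacker could modify
- Defense: Document that pickle files must be stored securely with proper access controls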
Example Mitigation:
import hashlib
import hmac
import pickle
from typing import Any

def save(self, data: Any, secret_key: str) -> None:
    pickle_data = pickle.dumps(data)
    signature = hmac.new(secret_key.encode(), pickle_data, hashlib.sha256).digest()
    with open(self.file_path, "wb") as f:
        f.write(signature + pickle_data)

def load(self, secret_key: str) -> Any:
    with open(self.file_path, "rb") as f:
        signature = f.read(32)  # SHA-256 digest length in bytes
        pickle_data = f.read()
    expected_sig = hmac.new(secret_key.encode(), pickle_data, hashlib.sha256).digest()
    # Verify before unpickling; compare_digest avoids timing leaks
    if not hmac.compare_digest(signature, expected_sig):
        raise ValueError("Pickle file integrity check failed")
    return pickle.loads(pickle_data)
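For the short-term recommendation (JSON for non-object data), a minimal sketch of JSON-based persistence might look like the following. The function names save_json and load_json are hypothetical, and the sketch assumes the stored data consists only of dicts, lists, strings, numbers, booleans, and None.

```python
import json
import os
import tempfile
from typing import Any


def save_json(path: str, data: Any) -> None:
    """Write `data` as JSON, replacing the target file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        # Same-directory rename, so readers never see a half-written file.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # discard the partial temp file
        raise


def load_json(path: str) -> Any:
    """Parse JSON; unlike pickle, loading cannot run code from the file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Values that are not JSON-serializable raise TypeError at save time, which makes it visible early which parts of the stored state would still need a different representation.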
6. File Handling and Path Traversal
✅ GOOD - Path Validation Present
File: lib/crewai/src/crewai/knowledge/source/base_file_knowledge_source.py
Finding: File paths are validated and restricted to knowledge directory.
Lines 86-88:
def convert_to_path(self, path: Path | str) -> Path:
    return Path(KNOWLEDGE_DIRECTORY + "/" + path) if isinstance(path, str) else path
Lines 56-64:
def validate_content(self) -> None:
    for path in self.safe_file_paths:
        if not path.exists():
            raise FileNotFoundError(f"File not found: {path}")
        if not path.is_file():
            # Log error
Security Strength:
- ✅ String paths are prefixed with the knowledge directory (Path objects are returned unchanged)
- ✅ Existence and type validation
- ⚠️ Could add explicit check for path traversal attempts (..)
Recommendation:
def convert_to_path(self, path: Path | str) -> Path:
    base_path = Path(KNOWLEDGE_DIRECTORY).resolve()
    if isinstance(path, str):
        full_path = (base_path / path).resolve()
    else:
        full_path = path.resolve()
    # Ensure resolved path is still within knowledge directory
    if not full_path.is_relative_to(base_path):
        raise ValueError(f"Path traversal detected: {path}")
    return full_path
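To make the gap concrete, the sketch below contrasts plain string concatenation with a resolved containment check. The names naive_join, is_contained, and the BASE value are hypothetical stand-ins (BASE plays the role of KNOWLEDGE_DIRECTORY); Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def naive_join(base: str, path: str) -> Path:
    """Mirrors the string concatenation used for str inputs."""
    return Path(base + "/" + path)


def is_contained(base: str, path: str) -> bool:
    """True only if the resolved path stays under the resolved base."""
    base_path = Path(base).resolve()
    return (base_path / path).resolve().is_relative_to(base_path)


BASE = "knowledge"  # hypothetical stand-in for KNOWLEDGE_DIRECTORY

# Concatenation alone accepts "..": the resolved result lands outside BASE.
escaped = naive_join(BASE, "../outside.txt").resolve()
escaped_outside_base = not escaped.is_relative_to(Path(BASE).resolve())
```

Here escaped_outside_base is True, while is_contained(BASE, "../outside.txt") and is_contained(BASE, "/etc/passwd") both return False, which is the behavior the recommended check is meant to enforce.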
7. Authentication and Authorization
✅ EXCELLENT - JWT Implementation
File: lib/crewai/src/crewai/cli/authentication/utils.py
Finding: JWT validation is properly implemented with all security best practices.
Strengths:
- ✅ Signature verification using JWKS
- ✅ Expiration check (verify_exp)
- ✅ Issuer validation
- ✅ Audience validation
- ✅ Required claims enforcement
- ✅ Proper exception handling
- ✅ 10-second leeway for clock skew
Lines 30-44:
return jwt.decode(
    jwt_token,
    signing_key.key,
    algorithms=["RS256"],
    audience=audience,
    issuer=issuer,
    leeway=10.0,
    options={
        "verify_signature": True,
        "verify_exp": True,
        "verify_nbf": True,
        "verify_iat": True,
        "require": ["exp", "iat", "iss", "aud", "sub"],
    },
)
Recommendation: ✅ No changes needed. This is exemplary JWT validation.
8. Security Features
✅ GOOD - Built-in Security Module
Files:
- lib/crewai/src/crewai/security/security_config.py
- lib/crewai/src/crewai/security/fingerprint.py
Finding: CrewAI includes a security module with:
- Fingerprinting - Unique agent identifiers for tracking and auditing
- Metadata validation - Prevents DoS via oversized metadata
- Type validation - Strong typing with Pydantic
Security Controls in Fingerprint:
Lines 38-40 (DoS prevention):
if len(str(v)) > 10_000:  # Limit metadata size to 10KB
    raise ValueError("Metadata size exceeds maximum allowed (10KB)")
Lines 28-36 (Nested data protection):
if isinstance(nested_value, dict):
    raise ValueError("Metadata can only be nested one level deep")
Recommendation: ✅ Good defensive programming. Consider adding rate limiting to fingerprint generation if exposed via API.
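The two fragments above can be read together as a single validator. The following standalone sketch combines them outside Pydantic; validate_metadata and MAX_METADATA_CHARS are hypothetical names, and the surrounding checks (dict type, string keys) are assumptions about how the real validator is structured.

```python
from typing import Any

MAX_METADATA_CHARS = 10_000  # same threshold as the excerpt above


def validate_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    """Reject metadata that is too deeply nested or too large."""
    if not isinstance(metadata, dict):
        raise ValueError("Metadata must be a dictionary")
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise ValueError("Metadata keys must be strings")
        if isinstance(value, dict):
            # One level of nesting is allowed; a dict inside it is not.
            for nested_value in value.values():
                if isinstance(nested_value, dict):
                    raise ValueError("Metadata can only be nested one level deep")
    # len(str(...)) counts characters of the repr, so "10KB" is approximate.
    if len(str(metadata)) > MAX_METADATA_CHARS:
        raise ValueError("Metadata size exceeds maximum allowed (10KB)")
    return metadata
```

Checking depth before size keeps the error message specific when both limits are violated.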
9. Command Injection Risks
✅ MOSTLY GOOD - Limited Use of Shell Commands
Finding: No instances of shell=True found in the codebase.
Subprocess Usage:
- Most subprocess calls use list arguments (safe)
- Docker commands use proper API (no shell)
- File operations use Path/open (no shell)
Exception:
# code_interpreter_tool.py line 383 (already covered in Section 3)
os.system(f"pip install {library}") # Only in unsafe mode
Recommendation: ✅ Continue avoiding shell=True. Fix the one instance noted above.
10. SSL/TLS Configuration
✅ PASS - No SSL Verification Bypasses
Finding: No instances of verify=False or SSL certificate bypass found.
Evidence:
- HTTP requests use default SSL verification
- No override of certificate validation
Recommendation: ✅ Maintain current practices.
Security Tooling Assessment
✅ EXCELLENT - Multiple Security Tools Configured
From pyproject.toml:
- Bandit (v1.9.2) - Security-focused static analysis
- Ruff with security rules:
extend-select = [
    "S",  # bandit (security issues)
    "B",  # flake8-bugbear (bug prevention)
]
- MyPy (v1.19.1) - Type checking prevents many bugs
- Pre-commit hooks - Automated checks
Test Security:
- Bandit checks disabled in tests (lines 106-108) - reasonable for test code
- Fake credentials in tests - correct approach
Recommendation: ✅ Excellent security tooling. Consider adding:
- safety or pip-audit for dependency vulnerability scanning
- SAST scanning in CI/CD (GitHub CodeQL, Semgrep)
Summary of Vulnerabilities
| ID | Severity | Component | Issue | Status |
|---|---|---|---|---|
| 1 | HIGH | CodeInterpreterTool | Command injection in unsafe mode | ⚠️ Fix Recommended |
| 2 | HIGH | NL2SQLTool | SQL injection in table introspection | ⚠️ Fix Recommended |
| 3 | MEDIUM | PickleHandler | Insecure deserialization | ⚠️ Mitigation Recommended |
| 4 | MEDIUM | CodeInterpreterTool | Docker volume permissions too broad | ⚠️ Hardening Recommended |
| 5 | LOW | BaseFileKnowledgeSource | Path traversal check could be stronger | ℹ️ Enhancement Suggested |
| 6 | LOW | CodeInterpreterTool | Sandbox bypass potential | ℹ️ Document Limitations |
Recommendations
Immediate Actions (High Priority)
- Fix SQL injection in nl2sql_tool.py line 57 - use parameterized queries
- Fix command injection in code_interpreter_tool.py line 383 - use subprocess.run with list
- Document security model - Especially for CodeInterpreterTool unsafe mode
Short-term Actions (Medium Priority)
- Add pickle integrity checks - HMAC signing for pickle files
- Restrict Docker volume mounts - Read-only by default
- Enhance path traversal protection - Explicit is_relative_to() check
- Add dependency scanning - Integrate pip-audit or safety in CI
Long-term Actions (Low Priority)
- Evaluate pickle alternatives - Consider JSON or safer serialization
- Resource limits in sandbox - CPU/memory/time limits for code execution
- Rate limiting - Add to fingerprint generation if exposed via API
- Security documentation - Create SECURITY.md with security best practices
Positive Security Practices Observed
- ✅ No hardcoded production secrets
- ✅ Excellent JWT implementation
- ✅ Strong security tooling (Bandit, Ruff, MyPy)
- ✅ Proactive dependency management with security overrides
- ✅ Type safety with Pydantic and MyPy
- ✅ No shell=True usage (the one shell invocation is the os.system call in unsafe mode)
- ✅ SSL verification enabled throughout
- ✅ Input validation in multiple layers
- ✅ Security module with fingerprinting and metadata limits
- ✅ Test isolation with fake credentials
Conclusion
The CrewAI framework demonstrates mature security practices overall. The development team clearly prioritizes security with multiple layers of protection, security tooling, and careful dependency management.
The main security concerns are inherent to the framework's purpose (AI agent orchestration with code execution capabilities) rather than security oversights. The identified vulnerabilities are in optional/specialized tools and should be addressed to prevent misuse.
Overall Security Posture: GOOD with room for targeted improvements.
Risk Assessment: MEDIUM (acceptable for current stage with recommended fixes)
Recommendation: Address high-priority SQL and command injection issues, then proceed with medium-priority hardening tasks.
Report Generated: 2026-03-09
Audit Tool: Manual review + automated pattern analysis
Scope: Quick security check (not comprehensive penetration test)