Files
crewAI/lib/crewai/tests/skills/test_cache.py
alex-clawd 418afd29e7
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
feat: Skills Repository — registry, cache, CLI, and SDK integration (#5867)
* feat: add Skills Repository — registry, cache, CLI, and SDK integration

Adds a Skills Repository feature allowing users to publish, install,
and use skills from the CrewAI registry with @org/skill-name refs.

## What's New

### SDK (lib/crewai/)
- SkillFrontmatter: added optional 'version' field (backward compatible)
- SkillCacheManager: manages ~/.crewai/skills/{org}/{name}/ with
  .crewai_meta.json tracking, path-traversal-safe tar extraction
- SkillRegistry: parse @org/skill-name refs, local-first resolution
  (./skills/ > cache > download), interactive prompt on first use,
  CI-mode guard (CREWAI_NONINTERACTIVE/CI env vars)
- Agent.skills and Crew.skills widened to accept str refs (@org/name)
- set_skills() resolves registry refs with org-prefixed dedup keys
- New events: SkillDownloadStartedEvent, SkillDownloadCompletedEvent

### CLI (lib/cli/)
- crewai skill create <name> — context-aware (project vs standalone)
- crewai skill install @org/name — downloads to ./skills/ or cache
- crewai skill publish — ZIP + upload to org registry
- crewai skill list — show installed skills

### PlusAPI (lib/crewai-core/)
- Added SKILLS_RESOURCE, get_skill(), publish_skill(), list_skills()

### Scaffolding
- crew and flow templates now include skills/ directory

### Tests
- 91 SDK skill tests + 15 CLI skill tests, all passing

* fix: address all CI failures and CodeRabbit review comments

Lint:
- Remove unused imports (click, pytest, json)
- Replace try-except-pass with logging (S110)
- Fix unprotected zipfile.extractall (S202)

Security:
- Path traversal: startswith → is_relative_to for tar extraction
- Add path traversal protection to ZIP extraction via _safe_extract_zip
- Both cache.py and CLI main.py hardened

Type checker:
- Fix import path: crewai.events.event_bus (not crewai_event_bus)
- Remove unused type: ignore comments
- Fix type mismatches in set_skills() variable types

Code quality:
- Fix f-string interpolation in SkillNotCachedError
- Use ValidationError instead of Exception in test

* style: ruff format + autofix remaining lint errors

* refactor: reuse SDK parser and SkillCacheManager in CLI

- _parse_frontmatter() now delegates to crewai.skills.parser.parse_frontmatter
  when available, with a minimal fallback for CLI-only installs
- install() global cache path now reuses SkillCacheManager.store() instead
  of duplicating metadata writing logic

* refactor: add _print_current_organization to SkillCommand (matches ToolCommand pattern)

* fix: write .crewai_meta.json in fallback install path

CodeRabbit caught that the ImportError fallback in install() didn't write
cache metadata, making skills invisible to 'crewai skill list'.

* fix: tighten @org/name ref validation to prevent path traversal

Reject refs with multiple slashes (@org/a/b), dot segments (@../skill),
or leading dots in org/name. Applied to both CLI install() and SDK
parse_registry_ref() so the contract is enforced consistently.

* fix: update test assertions to match tightened error messages

* fix: align OSS client with AMP API contract

- download_skill(): fetch download_url (presigned URL) instead of
  expecting inline base64. Falls back to 'file' field for compat.
- Read 'latest_version' field, fall back to 'version'
- Same fixes applied to CLI install() command

* fix: publish as tar.gz (matches AMP content_type validation) + add zip fallback to SDK cache

CLI publish:
- _build_skill_zip → _build_skill_tarball (tar.gz format)
- Content type: application/x-gzip (matches SkillVersion validation)

SDK cache:
- store() now tries tar.gz first, falls back to zip extraction
- Added _safe_extract_zip for path-traversal-safe zip handling
- Both formats work for download/install regardless of server format

---------

Co-authored-by: João Moura <joaomdmoura@gmail.com>
2026-05-20 14:38:25 -03:00

117 lines
4.4 KiB
Python

"""Tests for SkillCacheManager."""
from __future__ import annotations
import gzip
import io
import json
import tarfile
from pathlib import Path
from crewai.skills.cache import SkillCacheManager
def _make_tar_gz(files: dict[str, str]) -> bytes:
"""Build an in-memory .tar.gz containing the given filename → content mapping."""
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
gz_buf = io.BytesIO()
with tarfile.open(fileobj=gz_buf, mode="w") as tf:
for name, content in files.items():
data = content.encode()
info = tarfile.TarInfo(name=name)
info.size = len(data)
tf.addfile(info, io.BytesIO(data))
gz.write(gz_buf.getvalue())
buf.seek(0)
# Re-create properly: gzip wrapping a tar stream
out = io.BytesIO()
with tarfile.open(fileobj=out, mode="w:gz") as tf:
for name, content in files.items():
data = content.encode()
info = tarfile.TarInfo(name=name)
info.size = len(data)
tf.addfile(info, io.BytesIO(data))
return out.getvalue()
class TestSkillCacheManager:
def test_get_cached_path_missing(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
assert cache.get_cached_path("acme", "my-skill") is None
def test_store_and_retrieve(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
archive = _make_tar_gz({"SKILL.md": "---\nname: my-skill\n---\nHello"})
dest = cache.store("acme", "my-skill", "1.0.0", archive)
assert dest.is_dir()
assert (dest / "SKILL.md").exists()
retrieved = cache.get_cached_path("acme", "my-skill")
assert retrieved == dest
def test_store_writes_metadata(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
archive = _make_tar_gz({"SKILL.md": "content"})
dest = cache.store("acme", "my-skill", "2.3.4", archive)
meta_file = dest / ".crewai_meta.json"
assert meta_file.exists()
meta = json.loads(meta_file.read_text())
assert meta["org"] == "acme"
assert meta["name"] == "my-skill"
assert meta["version"] == "2.3.4"
assert "installed_at" in meta
def test_store_overwrites_previous_version(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
archive_v1 = _make_tar_gz({"SKILL.md": "v1", "extra.txt": "old"})
cache.store("acme", "my-skill", "1.0.0", archive_v1)
archive_v2 = _make_tar_gz({"SKILL.md": "v2"})
dest = cache.store("acme", "my-skill", "2.0.0", archive_v2)
# Old file should be gone
assert not (dest / "extra.txt").exists()
assert (dest / "SKILL.md").read_text() == "v2"
meta = json.loads((dest / ".crewai_meta.json").read_text())
assert meta["version"] == "2.0.0"
def test_list_cached_empty(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
assert cache.list_cached() == []
def test_list_cached(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
archive = _make_tar_gz({"SKILL.md": "x"})
cache.store("acme", "skill-a", "1.0.0", archive)
cache.store("acme", "skill-b", "0.1.0", archive)
cache.store("other-org", "skill-c", None, archive)
entries = cache.list_cached()
names = {e["name"] for e in entries}
assert names == {"skill-a", "skill-b", "skill-c"}
def test_invalidate_existing(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
archive = _make_tar_gz({"SKILL.md": "x"})
cache.store("acme", "my-skill", "1.0.0", archive)
removed = cache.invalidate("acme", "my-skill")
assert removed is True
assert cache.get_cached_path("acme", "my-skill") is None
def test_invalidate_missing(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
removed = cache.invalidate("acme", "ghost-skill")
assert removed is False
def test_store_version_none(self, tmp_path: Path) -> None:
cache = SkillCacheManager(cache_root=tmp_path)
archive = _make_tar_gz({"SKILL.md": "x"})
dest = cache.store("acme", "my-skill", None, archive)
meta = json.loads((dest / ".crewai_meta.json").read_text())
assert meta["version"] is None