feat: Skills Repository — registry, cache, CLI, and SDK integration (#5867)
Some checks failed
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled

* feat: add Skills Repository — registry, cache, CLI, and SDK integration

Adds a Skills Repository feature allowing users to publish, install,
and use skills from the CrewAI registry with @org/skill-name refs.

## What's New

### SDK (lib/crewai/)
- SkillFrontmatter: added optional 'version' field (backward compatible)
- SkillCacheManager: manages ~/.crewai/skills/{org}/{name}/ with
  .crewai_meta.json tracking, path-traversal-safe tar extraction
- SkillRegistry: parse @org/skill-name refs, local-first resolution
  (./skills/ > cache > download), interactive prompt on first use,
  CI-mode guard (CREWAI_NONINTERACTIVE/CI env vars)
- Agent.skills and Crew.skills widened to accept str refs (@org/name)
- set_skills() resolves registry refs with org-prefixed dedup keys
- New events: SkillDownloadStartedEvent, SkillDownloadCompletedEvent

### CLI (lib/cli/)
- crewai skill create <name> — context-aware (project vs standalone)
- crewai skill install @org/name — downloads to ./skills/ or cache
- crewai skill publish — ZIP + upload to org registry
- crewai skill list — show installed skills

### PlusAPI (lib/crewai-core/)
- Added SKILLS_RESOURCE, get_skill(), publish_skill(), list_skills()

### Scaffolding
- crew and flow templates now include skills/ directory

### Tests
- 91 SDK skill tests + 15 CLI skill tests, all passing

* fix: address all CI failures and CodeRabbit review comments

Lint:
- Remove unused imports (click, pytest, json)
- Replace try-except-pass with logging (S110)
- Fix unprotected zipfile.extractall (S202)

Security:
- Path traversal: startswith → is_relative_to for tar extraction
- Add path traversal protection to ZIP extraction via _safe_extract_zip
- Both cache.py and CLI main.py hardened

Type checker:
- Fix import path: crewai.events.event_bus (not crewai_event_bus)
- Remove unused type: ignore comments
- Fix type mismatches in set_skills() variable types

Code quality:
- Fix f-string interpolation in SkillNotCachedError
- Use ValidationError instead of Exception in test

* style: ruff format + autofix remaining lint errors

* refactor: reuse SDK parser and SkillCacheManager in CLI

- _parse_frontmatter() now delegates to crewai.skills.parser.parse_frontmatter
  when available, with a minimal fallback for CLI-only installs
- install() global cache path now reuses SkillCacheManager.store() instead
  of duplicating metadata writing logic

* refactor: add _print_current_organization to SkillCommand (matches ToolCommand pattern)

* fix: write .crewai_meta.json in fallback install path

CodeRabbit caught that the ImportError fallback in install() didn't write
cache metadata, making skills invisible to 'crewai skill list'.

* fix: tighten @org/name ref validation to prevent path traversal

Reject refs with multiple slashes (@org/a/b), dot segments (@../skill),
or leading dots in org/name. Applied to both CLI install() and SDK
parse_registry_ref() so the contract is enforced consistently.

* fix: update test assertions to match tightened error messages

* fix: align OSS client with AMP API contract

- download_skill(): fetch download_url (presigned URL) instead of
  expecting inline base64. Falls back to 'file' field for compat.
- Read 'latest_version' field, fall back to 'version'
- Same fixes applied to CLI install() command

* fix: publish as tar.gz (matches AMP content_type validation) + add zip fallback to SDK cache

CLI publish:
- _build_skill_zip → _build_skill_tarball (tar.gz format)
- Content type: application/x-gzip (matches SkillVersion validation)

SDK cache:
- store() now tries tar.gz first, falls back to zip extraction
- Added _safe_extract_zip for path-traversal-safe zip handling
- Both formats work for download/install regardless of server format

---------

Co-authored-by: João Moura <joaomdmoura@gmail.com>
This commit is contained in:
alex-clawd
2026-05-20 10:38:25 -07:00
committed by GitHub
parent 7cc1a7bb41
commit 418afd29e7
20 changed files with 1439 additions and 10 deletions

View File

@@ -26,6 +26,7 @@ from crewai_cli.replay_from_task import replay_task_command
from crewai_cli.reset_memories_command import reset_memories_command
from crewai_cli.run_crew import run_crew
from crewai_cli.settings.main import SettingsCommand
from crewai_cli.skills.main import SkillCommand
from crewai_cli.task_outputs import load_task_outputs
from crewai_cli.tools.main import ToolCommand
from crewai_cli.train_crew import train_crew
@@ -546,6 +547,56 @@ def tool_publish(is_public: bool, force: bool) -> None:
tool_cmd.publish(is_public, force)
@crewai.group()
def skill() -> None:
"""Skill Repository related commands."""
@skill.command(name="create")
@click.argument("name")
@click.option(
"--no-project",
"in_project",
is_flag=True,
default=True,
flag_value=False,
help="Create skill in current dir instead of ./skills/",
)
def skill_create(name: str, in_project: bool) -> None:
skill_cmd = SkillCommand()
skill_cmd.create(name, in_project=in_project)
@skill.command(name="install")
@click.argument("ref")
def skill_install(ref: str) -> None:
skill_cmd = SkillCommand()
skill_cmd.install(ref)
@skill.command(name="publish")
@click.option(
"--force",
is_flag=True,
default=False,
show_default=True,
help="Skip git-state validation.",
)
@click.option("--public", "is_public", flag_value=True, default=False)
@click.option("--private", "is_public", flag_value=False)
@click.option("--org", default=None, help="Organisation slug (overrides settings).")
def skill_publish(is_public: bool, org: str | None, force: bool) -> None:
skill_cmd = SkillCommand()
skill_cmd.publish(is_public, org=org, force=force)
@skill.command(name="list")
def skill_list() -> None:
"""List locally installed skills."""
skill_cmd = SkillCommand()
skill_cmd.list_cached()
@crewai.group()
def template() -> None:
"""Browse and install project templates."""

View File

@@ -40,7 +40,7 @@ class Repository:
encoding="utf-8",
).strip()
@cached_property # noqa: B019
@cached_property
def is_git_repo(self) -> bool:
"""Check if the current directory is a git repository."""
try:

View File

@@ -0,0 +1,415 @@
"""Skill Repository CLI commands for CrewAI."""
from __future__ import annotations
import base64
import io
import json
import os
from pathlib import Path
import tarfile
import zipfile
from rich.console import Console
from rich.table import Table
from crewai_cli.command import BaseCommand, PlusAPIMixin
from crewai_cli.config import Settings
from crewai_cli.constants import DEFAULT_CREWAI_ENTERPRISE_URL
console = Console()
_SKILL_MD_TEMPLATE = """\
---
name: {name}
version: 0.1.0
description: |
A short description of what this skill does.
---
## Instructions
Describe the skill behaviour here. This section is shown to the agent at activation time.
"""
class SkillCommand(BaseCommand, PlusAPIMixin):
"""Skill Repository related operations for CrewAI projects."""
def __init__(self) -> None:
BaseCommand.__init__(self)
PlusAPIMixin.__init__(self, telemetry=self._telemetry)
# ------------------------------------------------------------------
# create
# ------------------------------------------------------------------
def create(self, name: str, in_project: bool = True) -> None:
"""Scaffold a new skill directory.
If pyproject.toml is present (crew project), creates ./skills/{name}/.
Otherwise creates ./{name}/.
"""
if in_project and os.path.isfile("pyproject.toml"):
skill_dir = Path("skills") / name
else:
skill_dir = Path(name)
if skill_dir.exists():
console.print(f"[red]Directory {skill_dir} already exists.[/red]")
raise SystemExit(1)
skill_dir.mkdir(parents=True)
(skill_dir / "scripts").mkdir()
(skill_dir / "references").mkdir()
(skill_dir / "assets").mkdir()
skill_md = skill_dir / "SKILL.md"
skill_md.write_text(_SKILL_MD_TEMPLATE.format(name=name))
console.print(
f"[green]Created skill [bold]{name}[/bold] at [bold]{skill_dir}[/bold].[/green]"
)
console.print(f"Edit [bold]{skill_md}[/bold] to define the skill instructions.")
# ------------------------------------------------------------------
# install
# ------------------------------------------------------------------
def install(self, ref: str) -> None:
"""Download and install a registry skill.
Format: @org/name
Inside a crew project (pyproject.toml present): installs to ./skills/{name}/
Outside a project: installs to ~/.crewai/skills/{org}/{name}/
"""
if not ref.startswith("@"):
console.print(
"[red]Invalid skill reference. Use the format @org/name.[/red]"
)
raise SystemExit(1)
without_at = ref[1:]
if without_at.count("/") != 1:
console.print(
"[red]Invalid skill reference. Use the format @org/name.[/red]"
)
raise SystemExit(1)
org, name = without_at.split("/", 1)
if (
not org
or not name
or org.startswith(".")
or name.startswith(".")
or len(Path(org).parts) != 1
or len(Path(name).parts) != 1
):
console.print(
"[red]Invalid skill reference: org and name must be single, "
"non-empty path segments (no slashes, no '..').[/red]"
)
raise SystemExit(1)
self._print_current_organization()
console.print(f"[bold blue]Downloading skill {ref}...[/bold blue]")
get_response = self.plus_api_client.get_skill(org, name)
if get_response.status_code == 404:
console.print(
f"[red]Skill {ref} not found. Ensure it has been published and you have access.[/red]"
)
raise SystemExit(1)
if get_response.status_code != 200:
console.print(
f"[red]Failed to download skill {ref}: {get_response.status_code}[/red]"
)
raise SystemExit(1)
data = get_response.json()
version = data.get("latest_version") or data.get("version")
download_url = data.get("download_url")
if download_url:
import httpx
dl_response = httpx.get(download_url, follow_redirects=True)
dl_response.raise_for_status()
archive_bytes = dl_response.content
else:
encoded = data.get("file", "")
if "," in encoded:
encoded = encoded.split(",", 1)[1]
archive_bytes = base64.b64decode(encoded)
in_project = os.path.isfile("pyproject.toml")
if in_project:
dest = Path("skills") / name
dest.mkdir(parents=True, exist_ok=True)
self._unpack_archive(archive_bytes, dest)
console.print(
f"[green]Installed [bold]{ref}[/bold]{' (' + version + ')' if version else ''} to [bold]{dest}[/bold].[/green]"
)
else:
try:
from crewai.skills.cache import SkillCacheManager
cache = SkillCacheManager()
cache.store(org, name, version, archive_bytes)
except ImportError:
# Fallback if SDK not installed — write directly
cache_dir = Path.home() / ".crewai" / "skills" / org / name
if cache_dir.exists():
import shutil
shutil.rmtree(cache_dir)
cache_dir.mkdir(parents=True, exist_ok=True)
self._unpack_archive(archive_bytes, cache_dir)
# Write metadata so `crewai skill list` can discover it
from datetime import datetime, timezone
meta = {
"org": org,
"name": name,
"version": version,
"installed_at": datetime.now(tz=timezone.utc).isoformat(),
}
(cache_dir / ".crewai_meta.json").write_text(json.dumps(meta, indent=2))
console.print(
f"[green]Installed [bold]{ref}[/bold]{' (' + version + ')' if version else ''} to global cache.[/green]"
)
# ------------------------------------------------------------------
# publish
# ------------------------------------------------------------------
def publish(self, is_public: bool, org: str | None, force: bool = False) -> None:
"""Publish the skill in the current directory to the registry."""
skill_md = Path("SKILL.md")
if not skill_md.exists():
console.print(
"[red]No SKILL.md found in current directory. "
"Run this command from inside a skill directory.[/red]"
)
raise SystemExit(1)
# Parse frontmatter to extract name + version
try:
frontmatter = self._parse_frontmatter(skill_md.read_text())
except ValueError as exc:
console.print(f"[red]Failed to parse SKILL.md frontmatter: {exc}[/red]")
raise SystemExit(1) from exc
name = frontmatter.get("name")
version = frontmatter.get("version")
description = frontmatter.get("description")
if not name:
console.print(
"[red]SKILL.md frontmatter must include a 'name' field.[/red]"
)
raise SystemExit(1)
if not version:
console.print(
"[red]SKILL.md frontmatter must include a 'version' field before publishing.[/red]"
)
raise SystemExit(1)
settings = Settings()
effective_org = org or settings.org_name
if not effective_org:
console.print(
"[red]No organisation set. Run `crewai org switch <org_id>` first, "
"or pass --org.[/red]"
)
raise SystemExit(1)
self._print_current_organization()
console.print(
f"[bold blue]Publishing skill [bold]{name}[/bold] v{version} to {effective_org}...[/bold blue]"
)
archive_bytes = self._build_skill_tarball()
encoded_file = "data:application/x-gzip;base64," + base64.b64encode(
archive_bytes
).decode("utf-8")
response = self.plus_api_client.publish_skill(
org=effective_org,
name=name,
version=version,
is_public=is_public,
description=description,
encoded_file=encoded_file,
)
self._validate_response(response)
base_url = settings.enterprise_base_url or DEFAULT_CREWAI_ENTERPRISE_URL
console.print(
f"[green]Published [bold]{effective_org}/{name}[/bold] v{version}.\n\n"
"Security checks are running in the background. "
"Your skill will be available once checks complete.\n"
f"Monitor status at: {base_url}/crewai_plus/skills/{effective_org}/{name}[/green]"
)
# ------------------------------------------------------------------
# list_cached
# ------------------------------------------------------------------
def list_cached(self) -> None:
"""Show locally installed skills."""
table = Table(title="Installed Skills", show_lines=True)
table.add_column("Source", style="dim")
table.add_column("Ref")
table.add_column("Version")
table.add_column("Path")
# Project-local ./skills/
local_skills_dir = Path("skills")
if local_skills_dir.is_dir():
for skill_dir in sorted(local_skills_dir.iterdir()):
if skill_dir.is_dir() and (skill_dir / "SKILL.md").exists():
version = self._read_version(skill_dir / "SKILL.md")
table.add_row(
"project",
skill_dir.name,
version or "-",
str(skill_dir),
)
# Global cache
cache_root = Path.home() / ".crewai" / "skills"
if cache_root.exists():
for org_dir in sorted(cache_root.iterdir()):
if not org_dir.is_dir():
continue
for skill_dir in sorted(org_dir.iterdir()):
meta_file = skill_dir / ".crewai_meta.json"
if meta_file.exists():
try:
meta = json.loads(meta_file.read_text())
table.add_row(
"cache",
f"@{meta['org']}/{meta['name']}",
meta.get("version") or "-",
str(skill_dir),
)
except (json.JSONDecodeError, KeyError):
console.print(
f"[yellow]Warning: skipping malformed cache entry at {meta_file}[/yellow]"
)
console.print(table)
# ------------------------------------------------------------------
# internal helpers
# ------------------------------------------------------------------
def _print_current_organization(self) -> None:
settings = Settings()
if settings.org_uuid:
console.print(
f"Current organization: {settings.org_name} ({settings.org_uuid})",
style="bold blue",
)
else:
console.print(
"No organization currently set. We recommend setting one before using: "
"`crewai org switch <org_id>` command.",
style="yellow",
)
def _unpack_archive(self, archive_bytes: bytes, dest: Path) -> None:
"""Unpack a .tar.gz or .zip archive into dest."""
# Try tar first, then zip
try:
with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tf:
try:
tf.extractall(dest, filter="data")
except TypeError:
_safe_extractall(tf, dest)
return
except tarfile.TarError:
pass
# Fallback: zip
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
_safe_extract_zip(zf, dest)
def _build_skill_tarball(self) -> bytes:
"""Build an in-memory .tar.gz of SKILL.md + scripts/ + references/ + assets/."""
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tf:
tf.add("SKILL.md")
for folder in ("scripts", "references", "assets"):
folder_path = Path(folder)
if folder_path.is_dir():
for fpath in sorted(folder_path.rglob("*")):
if fpath.is_file():
tf.add(str(fpath))
return buf.getvalue()
def _parse_frontmatter(self, content: str) -> dict[str, str]:
"""Extract YAML frontmatter fields from a SKILL.md string.
Reuses crewai.skills.parser when available, with a minimal
fallback for environments where the full SDK isn't installed.
"""
try:
from crewai.skills.parser import parse_frontmatter
fm_dict, _ = parse_frontmatter(content)
return fm_dict
except ImportError:
pass
# Fallback: minimal YAML parsing without SDK dependency
import re
match = re.match(r"^---\n(.*?)\n---", content, re.DOTALL)
if not match:
raise ValueError("No YAML frontmatter block found")
try:
import yaml
return yaml.safe_load(match.group(1)) or {}
except ImportError:
result: dict[str, str] = {}
for line in match.group(1).splitlines():
if ":" in line:
key, _, value = line.partition(":")
result[key.strip()] = value.strip()
return result
def _read_version(self, skill_md: Path) -> str | None:
"""Read the version field from a SKILL.md file, or None."""
try:
fm = self._parse_frontmatter(skill_md.read_text())
return fm.get("version")
except Exception:
return None
def _safe_extractall(tf: tarfile.TarFile, dest: Path) -> None:
"""Path-traversal-safe extraction for Python < 3.12."""
dest_resolved = dest.resolve()
for member in tf.getmembers():
member_path = (dest / member.name).resolve()
if not member_path.is_relative_to(dest_resolved):
raise ValueError(f"Blocked path traversal attempt: {member.name!r}")
tf.extractall(dest) # noqa: S202
def _safe_extract_zip(zf: zipfile.ZipFile, dest: Path) -> None:
"""Path-traversal-safe ZIP extraction."""
dest_resolved = dest.resolve()
for member in zf.namelist():
member_path = (dest / member).resolve()
if not member_path.is_relative_to(dest_resolved):
raise ValueError(f"Blocked path traversal attempt: {member!r}")
zf.extractall(dest) # noqa: S202