JSON first crews (#6131)

* feat(cli): introduce JSON crew project support and TUI enhancements

- Added support for creating and running JSON-defined crew projects, allowing users to scaffold projects with a new `create_json_crew.py` file.
- Implemented a full-screen Textual TUI for crew execution in `crew_run_tui.py`, enhancing user interaction with a two-column layout.
- Updated `run_crew.py` to prioritize JSON crew projects and added daemon mode for running without TUI.
- Introduced interactive pickers in `tui_picker.py` for improved CLI prompts.
- Enhanced validation for JSON crew files in `validate.py` to ensure proper structure and agent definitions.
- Updated `.gitignore` to exclude demo and crewai directories.

* feat: update LLM model references to gpt-5.4-mini

- Changed default LLM model from gpt-4o-mini to gpt-5.4-mini across various files, including CLI options, JSON crew configurations, and agent definitions.
- Enhanced benchmark and human feedback functionalities to utilize the new model.
- Improved user interface elements in the TUI for better interaction and feedback during execution.
- Added support for new skills directory in JSON crew project creation.

* feat(benchmark): add crew-level benchmarking functionality

- Introduced a new `benchmark` command in the CLI for crew-level benchmarking, allowing users to specify agents, models, and timeout settings.
- Implemented `CrewBenchmarkCase` to handle crew-level benchmark cases with inputs and criteria.
- Enhanced the benchmark runner to support progress tracking and detailed reporting of results for multiple models.
- Added tests for loading crew benchmark cases and validating their structure.
- Updated existing benchmark functions to accommodate the new crew-level execution model.

* feat(cli): enhance JSON crew project functionality and TUI improvements

- Added optional agent-level guardrails and advanced options in JSON crew configurations to improve output validation and flexibility.
- Updated the TUI to better handle plan step statuses, including visual indicators for task completion and failure.
- Introduced methods for parsing and managing step observation events, ensuring accurate updates to task statuses during execution.
- Enhanced validation for JSON crew projects, ensuring proper structure and error handling for agent and task definitions.
- Added comprehensive tests for new features and validation logic, ensuring robustness in JSON crew project handling.

* refactor(cli): streamline JSON crew project handling and improve validation

- Refactored JSON crew project loading and validation logic to enhance clarity and maintainability.
- Introduced utility functions for finding JSON crew files, improving code reuse across modules.
- Removed deprecated benchmark functionality and associated tests to simplify the codebase.
- Updated CLI commands to utilize the new JSON project structure, ensuring compatibility with recent changes.
- Enhanced test coverage for JSON crew project features, ensuring robust validation and error handling.

* feat(cli): enhance activity log navigation and focus management

- Added functionality to focus on the activity log when navigating through log entries.
- Implemented refresh logic for the log panel to ensure updates are displayed correctly during navigation.
- Improved keyboard navigation for log entries, allowing users to expand and scroll through logs seamlessly.
- Added tests to verify the correct behavior of log navigation and focus management in the TUI.

* feat(cli): enhance JSON crew project interaction and input handling

- Introduced a new function to enable prompt line editing for better user experience during input prompts.
- Updated the JSON crew project wizards to show interpolation hints for dynamic values, improving user guidance.
- Enhanced the handling of missing input placeholders by prompting users for required values during crew setup.
- Refactored the crew run logic to ensure proper loading and preparation of JSON-defined crews, including runtime input management.
- Added tests to verify the correct behavior of new input handling features and JSON crew project interactions.

* feat(cli): improve crew project input prompts and event handling

- Enhanced the `_prompt_text` function to allow for configurable spacing before prompts, improving user experience during input collection.
- Updated the wizards for agent and task creation to utilize the new prompt configuration, ensuring a more compact and streamlined interaction.
- Introduced new plan step lifecycle events (`PlanStepStartedEvent`, `PlanStepCompletedEvent`) to better track the execution status of plan steps.
- Refactored the step executor to emit these events during the execution of tasks, improving observability and debugging capabilities.
- Added tests to verify the correct behavior of new prompt handling and event emissions during crew project execution.

* fix: refine json-first crew interactions

* fix: prioritize common json crew tools

* fix: make json crew more tools expandable

* fix: show json crew tools by category

* feat(memory): update default embedder to OpenAI text-embedding-3-large and enhance memory compatibility

- Changed the default embedding model for Memory to OpenAI text-embedding-3-large, which uses 3072-dimensional vectors.
- Added warnings regarding compatibility issues with existing local memory stores created with 1536-dimensional embeddings.
- Updated documentation to reflect the new default embedder and its configuration options.
- Enhanced the CLI and codebase to support the new embedding model across various components, ensuring a seamless transition for users.

* fix: address PR review feedback for JSON-first crews

Review blockers:
- Forward trained_agents_file to JSON crews: crewai run -f now exports CREWAI_TRAINED_AGENTS_FILE for the in-process JSON crew path
- Wizard agent picker: Esc/cancel now reprompts instead of silently assigning the first agent
- JSON tool resolution hard-fails: unknown tool names, missing custom tool files, and invalid custom tool modules raise JSONProjectError with actionable messages instead of warn-and-continue
- Embedding dimension mismatch: LanceDB and Qdrant Edge storages raise EmbeddingDimensionMismatchError with reset/pin guidance instead of silently zero-filling vectors or returning empty search results
- Custom tool code execution documented in loader docstring and the scaffolded project README

CI fixes:
- ruff format across lib/
- All 133 PR-introduced mypy errors fixed (llm.py lazy-litellm and cli.py lazy command shims now use TYPE_CHECKING imports; textual is_mounted misuse fixed; pick_many overloads; misc annotations)

Bot review comments:
- Empty except blocks now have explanatory comments or debug logging
- Removed unused _C_BG/_C_PANEL/_C_BORDER globals and redundant import re; tests use a single import style for create_json_crew

Tests: trained-agents propagation, wizard cancel, tool resolution failures, and dimension mismatch guidance.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: address second round of PR review comments

Cursor Bugbot:
- Wizard agent slugs: strip to [a-z0-9_] and fall back to agent_<n> so symbol-only roles can't produce an empty agents/.jsonc filename
- Wizard task names: dedupe against prior task names and fall back to task_<n> for symbol-only descriptions

CodeRabbit:
- Agent.message(): import Task explicitly at runtime instead of relying on the namespace injection done by crewai/__init__
- Async executor: move the native-tools-unsupported fallback from _ainvoke_loop_react (self-recursion) to _ainvoke_loop_native_tools, mirroring the sync implementation
- StepExecutor downgrade: keep the in-step conversation and append the text-tooling instructions instead of rebuilding messages, so completed native tool calls are not re-executed
- crewai-files: extension-based MIME lookup now runs before byte sniffing so csv/xml types are not degraded to text/plain
- Memory storages: validate every record in a save() batch against a consistent embedding dimension (LanceDB previously checked only the first record); added mixed-batch tests
- _print_post_tui_summary now typed against CrewRunApp
- Docs: Azure OpenAI default embedder change called out in the memory migration warning and provider table

Code quality bots:
- Removed unused _C_YELLOW/_C_CYAN (crew_run_tui) and _GREEN (tui_picker)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(cli): accordion tool picker in JSON crew wizard

The flat tool list had grown to ~90 rows. The picker now shows:
- Common tools always visible at the top
- Every other category as a single expandable row with tool and selection counts (e.g. "Search & Research (27 tools, 2 selected)")
- Expanding a category collapses the previously expanded one
- Selections persist across expand/collapse via new preselected support in pick_many; cursor follows the toggled category row

tui_picker gains preselected + initial_cursor options on pick_many, and Esc in multi-select now confirms the current selection instead of discarding it (required so collapsing can't silently drop choices).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* refactor(cli): remove --daemon flag from crewai run

The flag only affected JSON crew projects; classic and flow projects ignored it entirely, which made the behavior inconsistent. Removed the option, the daemon code path (_run_json_crew_daemon), and its helper (_load_json_crew_with_inputs).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test: update run command tests after --daemon removal

lib/crewai/tests/cli/test_run_crew.py still asserted the old run_crew(trained_agents_file=..., daemon=False) call signature.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): exit codes, mid-run quit, async statuses, hyphen placeholders

Addresses the latest Bugbot review round:
- Failed JSON crew runs now exit non-zero (SystemExit(1)) so scripts and CI don't treat failures as success, mirroring the classic path
- Quitting the TUI mid-run now ends the process (os._exit(130)); kickoff runs in a thread worker that cannot be force-cancelled, so letting the CLI return would leave LLM/tool work burning tokens in the background
- Sidebar task statuses are now async-safe: completion/failure events resolve the task's own row via identity instead of assuming the most recently started task, and starting a task no longer blanket-marks earlier active rows as done
- The runtime-input prompt regex now accepts hyphenated placeholder names ({my-topic}), matching kickoff's interpolation pattern

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: validation safety, custom tool sandboxing, TUI log integrity, memory error surfacing

- Deploy validation no longer executes project code: validation mode checks tool declarations structurally (well-formed entries, custom tool file exists) without importing or instantiating anything. custom:<name> resolution only happens on the actual run path.
- custom:<name> is constrained to [A-Za-z_][A-Za-z0-9_]* and the resolved path must stay inside the project's tools/ directory, so custom:../foo or absolute-path names cannot execute code outside it. Tool paths resolve relative to the crew project root, not cwd.
- TUI task logs are built from per-task state captured at task start (idx, description, agent, start time); an out-of-order completion takes its output from the event and no longer steals or resets the current task's streamed steps/output.
- EmbeddingDimensionMismatchError now inherits ValueError instead of RuntimeError so background saves surface it through MemorySaveFailedEvent instead of silently dropping the save; the shutdown catch in _background_encode_batch is narrowed to the "cannot schedule new futures" case.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): declared project type wins over crew.json presence

A flow project that also contains a crew.json(c) file now runs and validates as the flow it declares in pyproject.toml instead of being hijacked by the JSON crew path. Both crewai run (_has_json_crew) and deploy validation (_is_json_crew) check tool.crewai.type; a missing or unreadable pyproject still means a bare JSON crew project.

Also documents why StepObservationFailedEvent intentionally marks the plan step "done": the event signals an observer failure, not a step failure, and the executor continues past it.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): type the declared_type locals so mypy stays clean

Comparing an Any-typed .get() chain returns Any, which tripped no-any-return on the previous commit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
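
Reviewer note: the `custom:<name>` sandboxing rule described in the commit message (name restricted to `[A-Za-z_][A-Za-z0-9_]*`, resolved path kept inside the project's `tools/` directory) could look roughly like the sketch below. This is an illustration only, not the code in this PR: the helper name `resolve_custom_tool_path`, the `<name>.py` file layout, and the use of `ValueError` (rather than the PR's `JSONProjectError`) are all assumptions.

```python
import re
from pathlib import Path

# Name pattern quoted in the commit message for custom:<name> references.
_CUSTOM_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_custom_tool_path(project_root: Path, ref: str) -> Path:
    """Hypothetical helper: map 'custom:<name>' to a file under tools/.

    Assumes custom tools live at <project_root>/tools/<name>.py, which is
    an illustrative layout rather than one confirmed by the diff.
    """
    prefix = "custom:"
    if not ref.startswith(prefix):
        raise ValueError(f"not a custom tool reference: {ref!r}")

    name = ref[len(prefix):]
    # Reject anything that is not a plain identifier, which rules out
    # path separators, '..', and absolute paths before touching the disk.
    if not _CUSTOM_NAME.fullmatch(name):
        raise ValueError(f"invalid custom tool name: {name!r}")

    # Resolve relative to the crew project root, not the current directory.
    tools_dir = (project_root / "tools").resolve()
    candidate = (tools_dir / f"{name}.py").resolve()

    # Second line of defense: a symlink could still point outside tools/.
    if not candidate.is_relative_to(tools_dir):
        raise ValueError(f"custom tool path escapes tools/: {candidate}")
    return candidate
```

The identifier check alone already blocks `custom:../foo`; the containment check after `resolve()` is kept as a separate guard because it also catches symlinked files. `Path.is_relative_to` needs Python 3.9 or newer.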
2026-07-03 14:09:24 +00:00 · 2026-06-14 04:19:48 -03:00
parent d80719df81
commit bb477f8a91
81 changed files with 9088 additions and 235 deletions
--- a/lib/cli/tests/test_create_crew.py
+++ b/lib/cli/tests/test_create_crew.py
@@ -6,6 +6,8 @@ from unittest import mock

 import pytest
 from click.testing import CliRunner
+import crewai_cli.create_json_crew as json_crew
+import crewai_cli.tui_picker as tui_picker
 from crewai_cli.create_crew import create_crew, create_folder_structure


@@ -345,3 +347,441 @@ def test_env_vars_are_uppercased_in_env_file(
    env_file_path = crew_path / ".env"
    content = env_file_path.read_text()
    assert "MODEL=" in content
+
+
+def test_json_wizard_defaults_to_sequential_and_memory_enabled(monkeypatch):
+    monkeypatch.setattr(
+        json_crew,
+        "_wizard_agent",
+        lambda **_: {
+            "name": "researcher",
+            "role": "Researcher",
+            "goal": "Research",
+            "backstory": "Researcher",
+            "llm": "openai/gpt-5.5",
+            "tools": [],
+            "planning": False,
+            "allow_delegation": False,
+        },
+    )
+    monkeypatch.setattr(
+        json_crew,
+        "_wizard_task",
+        lambda **_: {
+            "name": "research_task",
+            "description": "Research",
+            "expected_output": "Findings",
+            "agent": "researcher",
+            "context": [],
+        },
+    )
+
+    def confirm(label: str, default: bool = False) -> bool:
+        if label == "Enable crew memory?":
+            return default
+        return False
+
+    monkeypatch.setattr(json_crew, "_confirm", confirm)
+    monkeypatch.setattr(json_crew.click, "prompt", lambda *_, **__: "")
+    monkeypatch.setattr(
+        json_crew,
+        "pick_one",
+        lambda *_args, **_kwargs: pytest.fail("process should not be prompted"),
+    )
+
+    _agents, _tasks, settings = json_crew._wizard_agents_and_tasks(
+        skip_provider=True,
+        default_llm="openai/gpt-5.5",
+    )
+
+    assert settings == {"process": "sequential", "memory": True, "inputs": {}}
+
+
+def test_json_wizard_shows_interpolation_hint(capsys):
+    json_crew._show_interpolation_hint("tasks")
+
+    output = capsys.readouterr().out
+    assert "{placeholder}" in output
+    assert "dynamic values" in output
+    assert "{topic}" not in output
+    assert "Description >" not in output
+    assert '"description"' not in output
+
+
+def test_json_wizard_text_prompt_uses_full_prompt_for_readline(monkeypatch):
+    prompts: list[str] = []
+
+    monkeypatch.setattr(
+        json_crew, "_readline_safe_prompt", lambda prompt: f"safe:{prompt}"
+    )
+    monkeypatch.setattr(
+        "builtins.input", lambda prompt: prompts.append(prompt) or "Draft content"
+    )
+
+    assert json_crew._prompt_text("Goal", spacing_before=False) == "Draft content"
+    assert len(prompts) == 1
+    assert prompts[0].startswith("safe:")
+    assert "Goal" in prompts[0]
+    assert " > " in prompts[0]
+
+
+def test_json_wizard_tool_picker_prioritizes_common_tools(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        return [1, 3], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    tools = json_crew._select_tools()
+
+    assert tools == ["SerperDevTool", "DirectoryReadTool"]
+    assert len(picker_calls) == 1
+    labels = picker_calls[0][1]
+    assert 0 in picker_calls[0][2]["separator_indices"]
+    assert labels[0] == "── Common tools ──"
+    assert labels[1].strip().endswith("SerperDevTool")
+    assert labels[2].strip().endswith("ScrapeWebsiteTool")
+    assert labels[3].strip().endswith("DirectoryReadTool")
+    assert labels[4].strip().endswith("FileReadTool")
+    assert labels[5].strip().endswith("FileWriterTool")
+    assert labels[1].index("Google search") < labels[1].index("SerperDevTool")
+    assert "More tools" not in labels
+
+
+def test_json_wizard_tool_picker_collapses_categories_by_default(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        return [], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    json_crew._select_tools()
+
+    labels = picker_calls[0][1]
+    action_indices = picker_calls[0][2]["action_indices"]
+    # Categories show as collapsed action rows, not separators with tools
+    assert any(label.startswith("▸ Search & Research") for label in labels)
+    assert any(label.startswith("▸ Web Scraping") for label in labels)
+    assert not any(label.strip().endswith("BraveSearchTool") for label in labels)
+    assert len(action_indices) >= 4
+    # Only the common tools section is visible beyond the category rows
+    assert len(labels) == 1 + 5 + len(action_indices)
+
+
+def test_json_wizard_tool_picker_expands_one_category_at_a_time(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def find_category_row(labels: list[str], category: str) -> int:
+        return next(
+            idx for idx, label in enumerate(labels) if category in label
+        )
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        call_num = len(picker_calls)
+        if call_num == 1:
+            return [], find_category_row(labels, "Search & Research")
+        if call_num == 2:
+            # Search & Research is expanded; select BraveSearchTool and
+            # expand Web Scraping instead
+            brave = next(
+                idx
+                for idx, label in enumerate(labels)
+                if label.strip().endswith("BraveSearchTool")
+            )
+            return [brave], find_category_row(labels, "Web Scraping")
+        return [], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    tools = json_crew._select_tools()
+
+    assert tools == ["BraveSearchTool"]
+    assert len(picker_calls) == 3
+    # Second render: Search & Research expanded, others collapsed
+    labels2 = picker_calls[1][1]
+    assert any(label.startswith("▾ Search & Research") for label in labels2)
+    assert any(label.strip().endswith("BraveSearchTool") for label in labels2)
+    assert any(label.startswith("▸ Web Scraping") for label in labels2)
+    # Third render: Web Scraping expanded, Search & Research collapsed again
+    labels3 = picker_calls[2][1]
+    assert any(label.startswith("▸ Search & Research") for label in labels3)
+    assert any(label.startswith("▾ Web Scraping") for label in labels3)
+    assert not any(label.strip().endswith("BraveSearchTool") for label in labels3)
+    # The collapsed Search & Research row reports its selection count
+    assert any(
+        "Search & Research" in label and "1 selected" in label for label in labels3
+    )
+    # Cursor returns to the toggled category row
+    assert picker_calls[2][2]["initial_cursor"] == next(
+        idx for idx, label in enumerate(labels3) if "Web Scraping" in label
+    )
+
+
+def test_json_wizard_tool_picker_preserves_selection_across_renders(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        call_num = len(picker_calls)
+        if call_num == 1:
+            # Select a common tool, then expand a category
+            category_row = next(
+                idx for idx, label in enumerate(labels) if "Web Scraping" in label
+            )
+            return [1], category_row
+        # Confirm without touching anything else
+        return sorted(kwargs["preselected"]), None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    tools = json_crew._select_tools()
+
+    # The common-tool selection survived the expand re-render via preselected
+    assert tools == ["SerperDevTool"]
+    assert 1 in picker_calls[1][2]["preselected"]
+
+
+def test_json_wizard_tool_picker_lists_builtin_tools_across_categories(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+    expanded_labels: list[str] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        expanded_labels.extend(labels)
+        action_indices = sorted(kwargs["action_indices"])
+        call_num = len(picker_calls)
+        if call_num <= len(action_indices):
+            # Expand the n-th category (indices shift between renders, so
+            # recompute from this render's action rows)
+            return [], action_indices[call_num - 1]
+        return [], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    json_crew._select_tools()
+
+    tool_names = {
+        label.rsplit(maxsplit=1)[-1]
+        for label in expanded_labels
+        if not label.startswith(("▸", "▾", "──"))
+    }
+
+    assert {
+        "DirectorySearchTool",
+        "MDXSearchTool",
+        "XMLSearchTool",
+        "YoutubeVideoSearchTool",
+        "S3ReaderTool",
+        "E2BExecTool",
+        "TavilyResearchTool",
+        "SerplyNewsSearchTool",
+        "BrowserbaseLoadTool",
+        "PatronusEvalTool",
+    }.issubset(tool_names)
+    assert {
+        "MCPServerAdapter",
+        "MongoDBVectorSearchConfig",
+        "ScrapegraphScrapeToolSchema",
+        "SnowflakeConfig",
+    }.isdisjoint(tool_names)
+
+
+def test_multi_picker_skips_separator_on_initial_cursor(monkeypatch):
+    cursors: list[int] = []
+
+    monkeypatch.setattr(tui_picker, "_read_key", lambda: "enter")
+    monkeypatch.setattr(
+        tui_picker,
+        "_draw_multi",
+        lambda _labels, cursor, *_args, **_kwargs: cursors.append(cursor),
+    )
+    monkeypatch.setattr(tui_picker, "_clear_lines", lambda *_args, **_kwargs: None)
+
+    assert tui_picker._arrow_select_multi(
+        ["── Common tools ──", "Google search via Serper API SerperDevTool"],
+        separator_indices={0},
+    ) == ([], None)
+    assert cursors == [1]
+
+
+def test_json_wizard_agent_attribute_prompts_are_compact(monkeypatch):
+    prompt_calls: list[tuple[str, bool]] = []
+    prompt_values = {
+        "Role": "Senior Dev Rel",
+        "Goal": "Draft content",
+        "Backstory": "Knows developer communities",
+    }
+
+    def prompt_text(
+        label: str,
+        default: str = "",
+        *,
+        spacing_before: bool = True,
+    ) -> str:
+        prompt_calls.append((label, spacing_before))
+        return prompt_values[label]
+
+    monkeypatch.setattr(json_crew, "_prompt_text", prompt_text)
+    monkeypatch.setattr(json_crew, "_select_model", lambda: "openai/gpt-5.5")
+    monkeypatch.setattr(json_crew, "pick_many", lambda *_args, **_kwargs: ([], None))
+    monkeypatch.setattr(json_crew, "_confirm", lambda *_args, **_kwargs: False)
+
+    agent = json_crew._wizard_agent(agent_num=1, existing_names=[])
+
+    assert agent is not None
+    assert prompt_calls == [
+        ("Role", False),
+        ("Goal", False),
+        ("Backstory", False),
+    ]
+
+
+def test_json_wizard_task_attribute_prompts_are_compact(monkeypatch):
+    prompt_calls: list[tuple[str, bool]] = []
+    prompt_values = {
+        "Description": "Research latest release",
+        "Expected output": "Release summary",
+    }
+
+    def prompt_text(
+        label: str,
+        default: str = "",
+        *,
+        spacing_before: bool = True,
+    ) -> str:
+        prompt_calls.append((label, spacing_before))
+        return prompt_values[label]
+
+    monkeypatch.setattr(json_crew, "_prompt_text", prompt_text)
+
+    task = json_crew._wizard_task(
+        task_num=1,
+        agent_names=["senior_dev_rel"],
+        prior_task_names=[],
+    )
+
+    assert task is not None
+    assert prompt_calls == [
+        ("Description", False),
+        ("Expected output", False),
+    ]
+
+
+def test_json_create_provider_preselects_default_model(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    with mock.patch(
+        "crewai_cli.create_json_crew._wizard_agents_and_tasks"
+    ) as mock_wizard:
+        mock_wizard.return_value = (
+            [
+                {
+                    "name": "researcher",
+                    "role": "Researcher",
+                    "goal": "Research",
+                    "backstory": "Researcher",
+                    "llm": "openai/gpt-5.5",
+                    "tools": [],
+                    "planning": False,
+                    "allow_delegation": False,
+                }
+            ],
+            [
+                {
+                    "name": "research_task",
+                    "description": "Research",
+                    "expected_output": "Findings",
+                    "agent": "researcher",
+                    "context": [],
+                }
+            ],
+            {"process": "sequential", "memory": False, "inputs": {}},
+        )
+
+        json_crew.create_json_crew("JSON Crew", provider="openai", skip_provider=True)
+
+    mock_wizard.assert_called_once_with(
+        skip_provider=True,
+        default_llm="openai/gpt-5.5",
+    )
+    assert (tmp_path / "json_crew" / "crew.jsonc").exists()
+    assert not (tmp_path / "json_crew" / "tests").exists()
+    assert not (tmp_path / "json_crew" / "config.jsonc").exists()
+
+    crew_template = (tmp_path / "json_crew" / "crew.jsonc").read_text()
+    assert (
+        '"guardrail": "Every factual claim needs context support."'
+        in crew_template
+    )
+    assert '"guardrails": [' in crew_template
+    assert '"guardrail_max_retries": 2' in crew_template
+    assert "Docs: https://docs.crewai.com/concepts/tasks" in crew_template
+    assert '"output_pydantic": null' in crew_template
+    assert '"markdown": false' in crew_template
+    assert "Docs: https://docs.crewai.com/concepts/crews" in crew_template
+    assert '"manager_agent": "researcher"' in crew_template
+    assert '"output_log_file": "crew.log"' in crew_template
+    assert "Crew-level LLM fields also accept object form" in crew_template
+    assert '"chat_llm": {"model": "llama3", "provider": "ollama"' in (
+        crew_template
+    )
+    assert "Use {placeholder} in agent or task text" in crew_template
+    assert "`crewai run` prompts for any placeholders" in crew_template
+    assert "Use {placeholder} inputs here" in crew_template
+
+    agent_template = (
+        tmp_path / "json_crew" / "agents" / "researcher.jsonc"
+    ).read_text()
+    assert "You can use {placeholder} inputs in role, goal, or backstory" in (
+        agent_template
+    )
+    assert '"role": "Senior {industry} Researcher"' in agent_template
+    assert "Optional agent-level guardrail" in agent_template
+    assert '"guardrail_max_retries": 2' in agent_template
+    assert "Docs: https://docs.crewai.com/concepts/agents" in agent_template
+    assert '"reasoning": true' in agent_template
+    assert "For custom endpoints or deployment-based providers" in agent_template
+    assert '"deployment_name": "my-deployment", "provider": "azure"' in (
+        agent_template
+    )
+    assert '"planning_config": {' in agent_template
+    assert '"llm": {"model": "deepseek-chat", "provider": "deepseek"}' in (
+        agent_template
+    )
+    assert '"knowledge_sources": []' in agent_template
+
+
+def test_json_provider_default_model_helper():
+    assert json_crew._default_model_for_provider("openai") == "openai/gpt-5.5"
+    assert json_crew._default_model_for_provider("anthropic/claude-custom") == (
+        "anthropic/claude-custom"
+    )
+    assert json_crew._default_model_for_provider("unknown") is None
+
+
+def test_json_wizard_task_reprompts_on_cancelled_agent_pick(monkeypatch):
+    """Esc on the agent picker must reprompt, not silently assign agent 0."""
+    prompts = iter(["Do the research", "A report"])
+    monkeypatch.setattr(json_crew, "_prompt_text", lambda *a, **k: next(prompts))
+
+    pick_calls: list[str] = []
+    picks = iter([-1, 1])
+
+    def fake_pick_one(title: str, labels: list[str]) -> int:
+        pick_calls.append(title)
+        return next(picks)
+
+    monkeypatch.setattr(json_crew, "pick_one", fake_pick_one)
+
+    task = json_crew._wizard_task(
+        task_num=1,
+        agent_names=["first_agent", "second_agent"],
+        prior_task_names=[],
+    )
+
+    assert len(pick_calls) == 2
+    assert task["agent"] == "second_agent"