JSON first crews (#6131)

* feat(cli): introduce JSON crew project support and TUI enhancements

- Added support for creating and running JSON-defined crew projects, allowing users to scaffold projects with a new `create_json_crew.py` file.
- Implemented a full-screen Textual TUI for crew execution in `crew_run_tui.py`, enhancing user interaction with a two-column layout.
- Updated `run_crew.py` to prioritize JSON crew projects and added daemon mode for running without TUI.
- Introduced interactive pickers in `tui_picker.py` for improved CLI prompts.
- Enhanced validation for JSON crew files in `validate.py` to ensure proper structure and agent definitions.
- Updated `.gitignore` to exclude demo and crewai directories.

* feat: update LLM model references to gpt-5.4-mini

- Changed default LLM model from gpt-4o-mini to gpt-5.4-mini across various files, including CLI options, JSON crew configurations, and agent definitions.
- Enhanced benchmark and human feedback functionalities to utilize the new model.
- Improved user interface elements in the TUI for better interaction and feedback during execution.
- Added support for new skills directory in JSON crew project creation.

* feat(benchmark): add crew-level benchmarking functionality

- Introduced a new `benchmark` command in the CLI for crew-level benchmarking, allowing users to specify agents, models, and timeout settings.
- Implemented `CrewBenchmarkCase` to handle crew-level benchmark cases with inputs and criteria.
- Enhanced the benchmark runner to support progress tracking and detailed reporting of results for multiple models.
- Added tests for loading crew benchmark cases and validating their structure.
- Updated existing benchmark functions to accommodate the new crew-level execution model.

* feat(cli): enhance JSON crew project functionality and TUI improvements

- Added optional agent-level guardrails and advanced options in JSON crew configurations to improve output validation and flexibility.
- Updated the TUI to better handle plan step statuses, including visual indicators for task completion and failure.
- Introduced methods for parsing and managing step observation events, ensuring accurate updates to task statuses during execution.
- Enhanced validation for JSON crew projects, ensuring proper structure and error handling for agent and task definitions.
- Added comprehensive tests for new features and validation logic, ensuring robustness in JSON crew project handling.

* refactor(cli): streamline JSON crew project handling and improve validation

- Refactored JSON crew project loading and validation logic to enhance clarity and maintainability.
- Introduced utility functions for finding JSON crew files, improving code reuse across modules.
- Removed deprecated benchmark functionality and associated tests to simplify the codebase.
- Updated CLI commands to utilize the new JSON project structure, ensuring compatibility with recent changes.
- Enhanced test coverage for JSON crew project features, ensuring robust validation and error handling.

* feat(cli): enhance activity log navigation and focus management

- Added functionality to focus on the activity log when navigating through log entries.
- Implemented refresh logic for the log panel to ensure updates are displayed correctly during navigation.
- Improved keyboard navigation for log entries, allowing users to expand and scroll through logs seamlessly.
- Added tests to verify the correct behavior of log navigation and focus management in the TUI.

* feat(cli): enhance JSON crew project interaction and input handling

- Introduced a new function to enable prompt line editing for better user experience during input prompts.
- Updated the JSON crew project wizards to show interpolation hints for dynamic values, improving user guidance.
- Enhanced the handling of missing input placeholders by prompting users for required values during crew setup.
- Refactored the crew run logic to ensure proper loading and preparation of JSON-defined crews, including runtime input management.
- Added tests to verify the correct behavior of new input handling features and JSON crew project interactions.

* feat(cli): improve crew project input prompts and event handling

- Enhanced the `_prompt_text` function to allow for configurable spacing before prompts, improving user experience during input collection.
- Updated the wizards for agent and task creation to utilize the new prompt configuration, ensuring a more compact and streamlined interaction.
- Introduced new plan step lifecycle events (`PlanStepStartedEvent`, `PlanStepCompletedEvent`) to better track the execution status of plan steps.
- Refactored the step executor to emit these events during the execution of tasks, improving observability and debugging capabilities.
- Added tests to verify the correct behavior of new prompt handling and event emissions during crew project execution.

* fix: refine json-first crew interactions

* fix: prioritize common json crew tools

* fix: make json crew more tools expandable

* fix: show json crew tools by category

* feat(memory): update default embedder to OpenAI text-embedding-3-large and enhance memory compatibility

- Changed the default embedding model for Memory to OpenAI text-embedding-3-large, which uses 3072-dimensional vectors.
- Added warnings regarding compatibility issues with existing local memory stores created with 1536-dimensional embeddings.
- Updated documentation to reflect the new default embedder and its configuration options.
- Enhanced the CLI and codebase to support the new embedding model across various components, ensuring a seamless transition for users.
* fix: address PR review feedback for JSON-first crews

Review blockers:
- Forward trained_agents_file to JSON crews: crewai run -f now exports CREWAI_TRAINED_AGENTS_FILE for the in-process JSON crew path
- Wizard agent picker: Esc/cancel now reprompts instead of silently assigning the first agent
- JSON tool resolution hard-fails: unknown tool names, missing custom tool files, and invalid custom tool modules raise JSONProjectError with actionable messages instead of warn-and-continue
- Embedding dimension mismatch: LanceDB and Qdrant Edge storages raise EmbeddingDimensionMismatchError with reset/pin guidance instead of silently zero-filling vectors or returning empty search results
- Custom tool code execution documented in loader docstring and the scaffolded project README

CI fixes:
- ruff format across lib/
- All 133 PR-introduced mypy errors fixed (llm.py lazy-litellm and cli.py lazy command shims now use TYPE_CHECKING imports; textual is_mounted misuse fixed; pick_many overloads; misc annotations)

Bot review comments:
- Empty except blocks now have explanatory comments or debug logging
- Removed unused _C_BG/_C_PANEL/_C_BORDER globals and redundant import re; tests use a single import style for create_json_crew

Tests: trained-agents propagation, wizard cancel, tool resolution failures, and dimension mismatch guidance.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: address second round of PR review comments

Cursor Bugbot:
- Wizard agent slugs: strip to [a-z0-9_] and fall back to agent_<n> so symbol-only roles can't produce an empty agents/.jsonc filename
- Wizard task names: dedupe against prior task names and fall back to task_<n> for symbol-only descriptions

CodeRabbit:
- Agent.message(): import Task explicitly at runtime instead of relying on the namespace injection done by crewai/__init__
- Async executor: move the native-tools-unsupported fallback from _ainvoke_loop_react (self-recursion) to _ainvoke_loop_native_tools, mirroring the sync implementation
- StepExecutor downgrade: keep the in-step conversation and append the text-tooling instructions instead of rebuilding messages, so completed native tool calls are not re-executed
- crewai-files: extension-based MIME lookup now runs before byte sniffing so csv/xml types are not degraded to text/plain
- Memory storages: validate every record in a save() batch against a consistent embedding dimension (LanceDB previously checked only the first record); added mixed-batch tests
- _print_post_tui_summary now typed against CrewRunApp
- Docs: Azure OpenAI default embedder change called out in the memory migration warning and provider table

Code quality bots:
- Removed unused _C_YELLOW/_C_CYAN (crew_run_tui) and _GREEN (tui_picker)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(cli): accordion tool picker in JSON crew wizard

The flat tool list had grown to ~90 rows. The picker now shows:
- Common tools always visible at the top
- Every other category as a single expandable row with tool and selection counts (e.g. "Search & Research (27 tools, 2 selected)")
- Expanding a category collapses the previously expanded one
- Selections persist across expand/collapse via new preselected support in pick_many; cursor follows the toggled category row

tui_picker gains preselected + initial_cursor options on pick_many, and Esc in multi-select now confirms the current selection instead of discarding it (required so collapsing can't silently drop choices).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* refactor(cli): remove --daemon flag from crewai run

The flag only affected JSON crew projects; classic and flow projects ignored it entirely, which made the behavior inconsistent. Removed the option, the daemon code path (_run_json_crew_daemon), and its helper (_load_json_crew_with_inputs).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test: update run command tests after --daemon removal

lib/crewai/tests/cli/test_run_crew.py still asserted the old run_crew(trained_agents_file=..., daemon=False) call signature.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): exit codes, mid-run quit, async statuses, hyphen placeholders

Addresses the latest Bugbot review round:
- Failed JSON crew runs now exit non-zero (SystemExit(1)) so scripts and CI don't treat failures as success, mirroring the classic path
- Quitting the TUI mid-run now ends the process (os._exit(130)); kickoff runs in a thread worker that cannot be force-cancelled, so letting the CLI return would leave LLM/tool work burning tokens in the background
- Sidebar task statuses are now async-safe: completion/failure events resolve the task's own row via identity instead of assuming the most recently started task, and starting a task no longer blanket-marks earlier active rows as done
- The runtime-input prompt regex now accepts hyphenated placeholder names ({my-topic}), matching kickoff's interpolation pattern

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: validation safety, custom tool sandboxing, TUI log integrity, memory error surfacing

- Deploy validation no longer executes project code: validation mode checks tool declarations structurally (well-formed entries, custom tool file exists) without importing or instantiating anything. custom:<name> resolution only happens on the actual run path.
- custom:<name> is constrained to [A-Za-z_][A-Za-z0-9_]* and the resolved path must stay inside the project's tools/ directory, so custom:../foo or absolute-path names cannot execute code outside it. Tool paths resolve relative to the crew project root, not cwd.
- TUI task logs are built from per-task state captured at task start (idx, description, agent, start time); an out-of-order completion takes its output from the event and no longer steals or resets the current task's streamed steps/output.
- EmbeddingDimensionMismatchError now inherits ValueError instead of RuntimeError so background saves surface it through MemorySaveFailedEvent instead of silently dropping the save; the shutdown catch in _background_encode_batch is narrowed to the "cannot schedule new futures" case.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): declared project type wins over crew.json presence

A flow project that also contains a crew.json(c) file now runs and validates as the flow it declares in pyproject.toml instead of being hijacked by the JSON crew path. Both crewai run (_has_json_crew) and deploy validation (_is_json_crew) check tool.crewai.type; a missing or unreadable pyproject still means a bare JSON crew project.

Also documents why StepObservationFailedEvent intentionally marks the plan step "done": the event signals an observer failure, not a step failure, and the executor continues past it.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(cli): type the declared_type locals so mypy stays clean

Comparing an Any-typed .get() chain returns Any, which tripped no-any-return on the previous commit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
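
Reviewer note: the `custom:<name>` sandboxing rule described in the commit message (identifier-only names, resolved path confined to the project's `tools/` directory, resolution relative to the project root) can be sketched roughly as below. This is an illustration under assumptions, not the code from this PR: the helper name `resolve_custom_tool_path`, the `tools/<name>.py` file layout, and the use of a plain `ValueError` are all hypothetical (the commit message says real failures raise `JSONProjectError`).

```python
import re
from pathlib import Path

# Pattern quoted from the commit message: [A-Za-z_][A-Za-z0-9_]*
_CUSTOM_TOOL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_custom_tool_path(project_root: Path, name: str) -> Path:
    """Hypothetical helper: map custom:<name> to a file under <project_root>/tools/."""
    # Reject anything that is not a plain identifier ("../foo", "/etc/x", "a-b").
    if not _CUSTOM_TOOL_NAME.fullmatch(name):
        raise ValueError(f"invalid custom tool name: {name!r}")

    # Resolve relative to the project root, not the current working directory.
    tools_dir = (project_root / "tools").resolve()
    candidate = (tools_dir / f"{name}.py").resolve()  # file layout is an assumption

    # Defense in depth: the resolved path must stay inside tools/.
    if not candidate.is_relative_to(tools_dir):  # Python 3.9+
        raise ValueError(f"custom tool {name!r} escapes {tools_dir}")
    return candidate
```

The regex alone already rules out path separators; the containment check is kept as a second guard in case the naming rule is ever loosened.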
2026-06-15 13:18:09 +00:00 · 2026-06-14 04:19:48 -03:00 · 2026-06-12 22:31:05 -07:00 · 2026-06-12 21:56:02 -07:00 · 2026-06-12 19:47:58 -07:00 · 2026-06-12 14:48:43 -07:00
133 changed files with 17955 additions and 3187 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -31,3 +31,5 @@ chromadb-*.lock
 blogs/*
 secrets/*
 UNKNOWN.egg-info/
+demos/*
+.crewai/*
--- a/docs/ar/changelog.mdx
+++ b/docs/ar/changelog.mdx
@@ -4,6 +4,55 @@ description: "تحديثات المنتج والتحسينات وإصلاحات
 icon: "clock"
 mode: "wide"
 ---
+<Update label="11 يونيو 2026">
+  ## v1.14.7
+
+  [عرض الإصدار على GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.14.7)
+
+  ## ما الذي تغير
+
+  ### الميزات
+  - إضافة واجهات خلفية افتراضية قابلة للتوصيل للذاكرة، والمعرفة، وrag، وflow.
+  - عرض السبب الحقيقي للإنهاء، ومعلمات العينة، وresponse.id في أحداث LLM.
+  - تصنيف مشغلات DSL كزخارف واعية للمسار.
+  - إضافة واجهة برمجة تطبيقات الدردشة لتدفقات المحادثة.
+  - جعل واجهة القفل قابلة للتجاوز.
+  - بناء FlowDefinition من بيانات التعريف الخاصة بـ Flow DSL.
+  - إضافة مزود LLM من Snowflake Cortex الأصلي.
+  - إضافة دعم لملفات الوكلاء المدربين من crew.
+
+  ### إصلاحات الأخطاء
+  - إصلاح نقطة التحقق لإعادة بناء BaseLLM مخصص كـ LLM ملموس عند الاستعادة.
+  - تقييد الاستعادة على علامة لمنع اللقطات الحية من إعادة التشغيل كاستئناف.
+  - تحديد حالة وقت التشغيل لكل تشغيل للحد من النمو وعزل التشغيل المتزامن.
+  - إصلاح إعدادات التتبع على crewai-login.
+  - احترام suppress_flow_events لأحداث تنفيذ الطريقة.
+  - استعادة [project.scripts] في حزمة crewai لتثبيت أداة uv.
+  - حل مشكلات CVE الخاصة بـ pip-audit لـ aiohttp وdocling وdocling-core.
+  - إصلاح إدخال الملفات الذي لا يعمل بشكل موثوق.
+  - إصلاح تاريخ نتائج أدوات Snowflake Claude غير المكتملة.
+
+  ### الوثائق
+  - تحديث سجل التغييرات والإصدار لـ v1.14.7.
+  - تحديث وثائق جامع OpenTelemetry.
+  - تحديث دليل NVIDIA Nemotron LLM.
+  - إضافة دليل تكامل Databricks.
+  - إضافة دليل تكامل Snowflake.
+
+  ### الأداء
+  - تحسين سرعة استيراد crewai من خلال تحميل مستندات docling بشكل كسول.
+
+  ### إعادة الهيكلة
+  - تبسيط تقييم شروط التدفق ليكون بلا حالة لكل حدث.
+  - فصل منطق المحادثة عن وقت التشغيل وإضافة تعريف المحادثة.
+  - تقسيم `flow.py` إلى DSL، وتعريف، ووقت تشغيل.
+
+  ## المساهمون
+
+  @Luzk, @alex-clawd, @devin-ai-integration[bot], @greysonlalonde, @gvieira, @jessemiller, @lorenzejay, @lucasgomide, @mattatcha, @vinibrsl
+
+</Update>
+
 <Update label="10 يونيو 2026">
  ## v1.14.7rc2

--- a/docs/ar/concepts/flows.mdx
+++ b/docs/ar/concepts/flows.mdx
@@ -226,6 +226,48 @@ counter=2 message='Hello from first_method - updated by second_method'
 من خلال ضمان إعادة مخرجات الدالة الأخيرة وتوفير الوصول إلى الحالة، تجعل تدفقات CrewAI من السهل دمج نتائج سير عمل الذكاء الاصطناعي في التطبيقات أو الأنظمة الأكبر،
 مع الحفاظ على الوصول إلى الحالة طوال تنفيذ التدفق.

+## مقاييس استخدام التدفق
+
+بعد اكتمال تنفيذ التدفق، يمكنك الوصول إلى الخاصية `usage_metrics` لعرض إجمالي استخدام التوكنات عبر **كل استدعاء لنموذج اللغة** يتم خلال التشغيل — بما في ذلك الاستدعاءات من كل فريق (Crew) ينظمه التدفق، والاستدعاءات داخل أدوات الـ Agents، والاستدعاءات المباشرة لـ `LLM.call(...)` من دوال التدفق. هذا هو المكافئ على جانب الـ SDK للإجماليات المعروضة في واجهة CrewAI Enterprise.
+
+```python Code
+from crewai import LLM
+from crewai.flow.flow import Flow, listen, start
+
+class UsageMetricsFlow(Flow):
+    @start()
+    def run_first_crew(self):
+        self.state.first_result = FirstCrew().crew().kickoff()
+
+    @listen(run_first_crew)
+    def call_llm_directly(self):
+        # استدعاء مباشر لنموذج اللغة — يُحسب أيضًا ضمن flow.usage_metrics
+        llm = LLM(model="openai/gpt-4o-mini")
+        self.state.summary = llm.call("لخّص النقاط الرئيسية.")
+
+    @listen(call_llm_directly)
+    def run_second_crew(self):
+        self.state.second_result = SecondCrew().crew().kickoff()
+
+flow = UsageMetricsFlow()
+flow.kickoff()
+
+print(flow.usage_metrics)
+# UsageMetrics(total_tokens=8579, prompt_tokens=6210, completion_tokens=2369,
+#              cached_prompt_tokens=0, reasoning_tokens=0,
+#              cache_creation_tokens=0, successful_requests=5)
+```
+
+<Note>
+  `flow.usage_metrics` **ليست** نفس `flow.kickoff().token_usage`. هذه الأخيرة
+  ترجع فقط `CrewOutput.token_usage` لـ **آخر** دالة `@listen` أعادت
+  `CrewOutput`، مما يعني أنها تعكس فقط الفريق الأخير وتتجاهل الفرق السابقة
+  وكذلك أي استدعاءات مباشرة لـ `LLM.call(...)`. استخدم `flow.usage_metrics`
+  كلما احتجت إلى الإجمالي **الكامل** للتوكنات لتنفيذ التدفق.
+</Note>
+
+كل حقل في [`UsageMetrics`](https://github.com/crewAIInc/crewAI/blob/main/lib/crewai/src/crewai/types/usage_metrics.py) المُعاد هو مجموع جميع استدعاءات نموذج اللغة التي حدثت خلال استدعاء واحد لـ `flow.kickoff()`. تتم إعادة تعيين العدادات عند الاستدعاء التالي لـ `kickoff()` (وفي كل تكرار من `kickoff_for_each`)، لذلك لن تتكرر العدّات عبر التشغيلات المتتالية. يمكن قراءة هذه الخاصية بأمان في أي وقت بعد اكتمال `kickoff()`؛ قراءتها أثناء التنفيذ تُرجع المجموع الجزئي المتراكم حتى تلك اللحظة.
+
 ## إدارة حالة التدفق

 إدارة الحالة بفعالية أمر بالغ الأهمية لبناء سير عمل ذكاء اصطناعي موثوق وقابل للصيانة. توفر تدفقات CrewAI آليات قوية لإدارة الحالة غير المهيكلة والمهيكلة،
--- a/docs/docs.json
+++ b/docs/docs.json
--- a/docs/en/changelog.mdx
+++ b/docs/en/changelog.mdx
@@ -4,6 +4,55 @@ description: "Product updates, improvements, and bug fixes for CrewAI"
 icon: "clock"
 mode: "wide"
 ---
+<Update label="Jun 11, 2026">
+  ## v1.14.7
+
+  [View release on GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.14.7)
+
+  ## What's Changed
+
+  ### Features
+  - Add pluggable default backends for memory, knowledge, rag, and flow.
+  - Surface real finish_reason, sampling params, and response.id on LLM events.
+  - Type DSL triggers as route-aware decorators.
+  - Add chat API for conversational flows.
+  - Make locking backend overridable.
+  - Build FlowDefinition from Flow DSL metadata.
+  - Add native Snowflake Cortex LLM provider.
+  - Add crew trained agents file support.
+
+  ### Bug Fixes
+  - Fix checkpoint to rebuild custom BaseLLM as concrete LLM on restore.
+  - Gate restore on a flag to prevent live snapshots from replaying as resume.
+  - Scope runtime state per run to bound growth and isolate concurrent runs.
+  - Fix telemetry setup on crewai-login.
+  - Respect suppress_flow_events for method-execution events.
+  - Restore [project.scripts] in crewai package for uv tool install.
+  - Resolve pip-audit CVEs for aiohttp, docling, and docling-core.
+  - Fix file input not working reliably.
+  - Fix Snowflake Claude incomplete tool result histories.
+
+  ### Documentation
+  - Update changelog and version for v1.14.7.
+  - Update OpenTelemetry collector documentation.
+  - Update NVIDIA Nemotron LLM guide.
+  - Add Databricks integration guide.
+  - Add Snowflake integration guide.
+
+  ### Performance
+  - Improve crewai import speed by lazy-loading docling imports.
+
+  ### Refactoring
+  - Simplify flow condition evaluation to be stateless per event.
+  - Decouple convo logic from runtime and add a conversational_definition.
+  - Split `flow.py` into DSL, definition, and runtime.
+
+  ## Contributors
+
+  @Luzk, @alex-clawd, @devin-ai-integration[bot], @greysonlalonde, @gvieira, @jessemiller, @lorenzejay, @lucasgomide, @mattatcha, @vinibrsl
+
+</Update>
+
 <Update label="Jun 10, 2026">
  ## v1.14.7rc2

--- a/docs/en/concepts/flows.mdx
+++ b/docs/en/concepts/flows.mdx
@@ -226,6 +226,49 @@ After the Flow has run, you can access the final state to see the updates made b
 By ensuring that the final method's output is returned and providing access to the state, CrewAI Flows make it easy to integrate the results of your AI workflows into larger applications or systems,
 while also maintaining and accessing the state throughout the Flow's execution.

+## Flow Usage Metrics
+
+After a Flow execution completes, you can access the `usage_metrics` property to view aggregated token usage across **every LLM call** made during the run — including calls from every Crew the Flow orchestrated, calls inside Agent tools, and bare `LLM.call(...)` invocations from Flow methods. This is the SDK-side equivalent of the totals shown in the CrewAI Enterprise UI.
+
+```python Code
+from crewai import LLM
+from crewai.flow.flow import Flow, listen, start
+
+class UsageMetricsFlow(Flow):
+    @start()
+    def run_first_crew(self):
+        self.state.first_result = FirstCrew().crew().kickoff()
+
+    @listen(run_first_crew)
+    def call_llm_directly(self):
+        # Bare LLM call — still counted by flow.usage_metrics
+        llm = LLM(model="openai/gpt-4o-mini")
+        self.state.summary = llm.call("Summarize the key takeaways.")
+
+    @listen(call_llm_directly)
+    def run_second_crew(self):
+        self.state.second_result = SecondCrew().crew().kickoff()
+
+flow = UsageMetricsFlow()
+flow.kickoff()
+
+print(flow.usage_metrics)
+# UsageMetrics(total_tokens=8579, prompt_tokens=6210, completion_tokens=2369,
+#              cached_prompt_tokens=0, reasoning_tokens=0,
+#              cache_creation_tokens=0, successful_requests=5)
+```
+
+<Note>
+  `flow.usage_metrics` is **not** the same as `flow.kickoff().token_usage`. The
+  latter returns the `CrewOutput.token_usage` of the **last** `@listen` method
+  that returned a `CrewOutput`, which means it only reflects the final Crew and
+  ignores prior Crews and bare `LLM.call(...)` invocations entirely. Use
+  `flow.usage_metrics` whenever you need the **full** token rollup for the Flow
+  execution.
+</Note>
+
+Each entry in the returned [`UsageMetrics`](https://github.com/crewAIInc/crewAI/blob/main/lib/crewai/src/crewai/types/usage_metrics.py) is the sum across all LLM calls made within a single `flow.kickoff()` invocation. Counters reset on the next `kickoff()` call (or on each iteration of `kickoff_for_each`), so successive runs don't double-count. The property is safe to read at any point after `kickoff()` completes; reading it during execution returns the partial total accumulated so far.
+
 ## Flow State Management

 Managing state effectively is crucial for building reliable and maintainable AI workflows. CrewAI Flows provides robust mechanisms for both unstructured and structured state management,
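
Reviewer note: the difference the `flows.mdx` hunk above draws between `flow.usage_metrics` (sum over every LLM call in the run) and `flow.kickoff().token_usage` (last Crew only) can be illustrated with a toy model. The sketch below does not use CrewAI; the `Usage` class, the `rollup` helper, and the per-source numbers are made up for illustration, chosen so the totals match the example output in the docs.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Toy stand-in for UsageMetrics with just two of its fields."""
    total_tokens: int = 0
    successful_requests: int = 0


def rollup(calls: list[Usage]) -> Usage:
    """Sum usage across every recorded LLM call source in one run."""
    total = Usage()
    for call in calls:
        total.total_tokens += call.total_tokens
        total.successful_requests += call.successful_requests
    return total


# Illustrative numbers only: two crews plus one direct LLM call.
first_crew = Usage(total_tokens=3000, successful_requests=2)
direct_call = Usage(total_tokens=500, successful_requests=1)
second_crew = Usage(total_tokens=5079, successful_requests=2)

flow_total = rollup([first_crew, direct_call, second_crew])
last_crew_only = second_crew  # what a last-Crew-only view would report

print(flow_total)      # Usage(total_tokens=8579, successful_requests=5)
print(last_crew_only)  # Usage(total_tokens=5079, successful_requests=2)
```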
--- a/docs/en/concepts/memory.mdx
+++ b/docs/en/concepts/memory.mdx
@@ -101,7 +101,7 @@ crew = Crew(
 )
 ```

-When `memory=True`, the crew creates a default `Memory()` and passes the crew's `embedder` configuration through automatically. All agents in the crew share the crew's memory unless an agent has its own.
+When `memory=True`, the crew creates a default `Memory()` and passes the crew's `embedder` configuration through automatically. All agents in the crew share the crew's memory unless an agent has its own. Without a custom `embedder`, memory uses OpenAI `text-embedding-3-large` embeddings.

 After each task, the crew automatically extracts discrete facts from the task output and stores them. Before each task, the agent recalls relevant context from memory and injects it into the task prompt.

@@ -515,7 +515,11 @@ memory = Memory(

 ## Embedder Configuration

-Memory needs an embedding model to convert text into vectors for semantic search. You can configure this in three ways.
+Memory needs an embedding model to convert text into vectors for semantic search. By default, `Memory()` uses OpenAI `text-embedding-3-large` embeddings, which produce 3072-dimensional vectors. Set `OPENAI_API_KEY` for the default path, or configure a custom embedder in one of three ways.
+
+<Warning>
+Existing local memory stores created with 1536-dimensional embeddings, such as `text-embedding-3-small` or `text-embedding-ada-002`, may not be compatible with the `text-embedding-3-large` default. This applies to both the OpenAI and Azure OpenAI providers — Azure's default embedding model also changed from `text-embedding-ada-002` to `text-embedding-3-large`. If local testing fails with an embedding dimension mismatch, reset memory with `crewai reset-memories -m`, delete the local memory storage directory, or explicitly configure the older embedder model until you migrate.
+</Warning>

 ### Passing to Memory Directly

@@ -523,7 +527,7 @@ Memory needs an embedding model to convert text into vectors for semantic search
 from crewai import Memory

 # As a config dict
-memory = Memory(embedder={"provider": "openai", "config": {"model_name": "text-embedding-3-small"}})
+memory = Memory(embedder={"provider": "openai", "config": {"model_name": "text-embedding-3-large"}})

 # As a pre-built callable
 from crewai.rag.embeddings.factory import build_embedder
@@ -542,7 +546,7 @@ crew = Crew(
    agents=[...],
    tasks=[...],
    memory=True,
-    embedder={"provider": "openai", "config": {"model_name": "text-embedding-3-small"}},
+    embedder={"provider": "openai", "config": {"model_name": "text-embedding-3-large"}},
 )
 ```

@@ -554,7 +558,7 @@ crew = Crew(
 memory = Memory(embedder={
    "provider": "openai",
    "config": {
-        "model_name": "text-embedding-3-small",
+        "model_name": "text-embedding-3-large",
        # "api_key": "sk-...",  # or set OPENAI_API_KEY env var
    },
 })
@@ -701,9 +705,9 @@ memory = Memory(embedder=my_embedder)

 | Provider | Key | Typical Model | Notes |
 | :--- | :--- | :--- | :--- |
-| OpenAI | `openai` | `text-embedding-3-small` | Default. Set `OPENAI_API_KEY`. |
+| OpenAI | `openai` | `text-embedding-3-large` | Default. Set `OPENAI_API_KEY`. |
 | Ollama | `ollama` | `mxbai-embed-large` | Local, no API key needed. |
-| Azure OpenAI | `azure` | `text-embedding-ada-002` | Requires `deployment_id`. |
+| Azure OpenAI | `azure` | `text-embedding-3-large` | Default model. Requires `deployment_id`. |
 | Google AI | `google-generativeai` | `gemini-embedding-001` | Set `GOOGLE_API_KEY`. |
 | Google Vertex | `google-vertex` | `gemini-embedding-001` | Requires `project_id`. |
 | Cohere | `cohere` | `embed-english-v3.0` | Strong multilingual support. |
@@ -836,6 +840,9 @@ class MemoryMonitor(BaseEventListener):
 **Background save errors in logs?**
 - Memory saves run in a background thread. Errors are emitted as `MemorySaveFailedEvent` but don't crash the agent. Check logs for the root cause (usually LLM or embedder connection issues).

+**Embedding dimension mismatch?**
+- Existing local memory stores may have been created with a different embedding model. The default OpenAI memory embedder is now `text-embedding-3-large` (3072 dimensions), while older stores commonly used 1536-dimensional embeddings. For local testing, run `crewai reset-memories -m`, delete the local memory storage directory, or configure the previous embedder model explicitly.
+
 **Concurrent write conflicts?**
 - LanceDB operations are serialized with a shared lock and retried automatically on conflict. This handles multiple `Memory` instances pointing at the same database (e.g. agent memory + crew memory). No action needed.

@@ -862,7 +869,7 @@ All configuration is passed as keyword arguments to `Memory(...)`. Every paramet
 | :--- | :--- | :--- |
 | `llm` | `"gpt-4o-mini"` | LLM for analysis (model name or `BaseLLM` instance). |
 | `storage` | `"lancedb"` | Storage backend (`"lancedb"`, a path string, or a `StorageBackend` instance). |
-| `embedder` | `None` (OpenAI default) | Embedder (config dict, callable, or `None` for default OpenAI). |
+| `embedder` | `None` (OpenAI `text-embedding-3-large`) | Embedder (config dict, callable, or `None` for default OpenAI). |
 | `recency_weight` | `0.3` | Weight for recency in composite score. |
 | `semantic_weight` | `0.5` | Weight for semantic similarity in composite score. |
 | `importance_weight` | `0.2` | Weight for importance in composite score. |
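
Reviewer note: the dimension-mismatch behavior documented in the `memory.mdx` hunks above, together with the commit message's "validate every record in a save() batch", can be sketched as follows. This is a minimal stand-in, not the storage code from this PR: the class here only mirrors the name and the `ValueError` base mentioned in the commit message, and `check_batch_dimensions` is a hypothetical helper.

```python
from typing import Sequence


class EmbeddingDimensionMismatchError(ValueError):
    """Local stand-in; the commit message says the real error inherits ValueError."""


def check_batch_dimensions(
    embeddings: Sequence[Sequence[float]], expected_dim: int
) -> None:
    """Hypothetical check: every record in the batch must match the store's dimension."""
    for index, vector in enumerate(embeddings):
        if len(vector) != expected_dim:
            raise EmbeddingDimensionMismatchError(
                f"record {index} has {len(vector)} dimensions but the store expects "
                f"{expected_dim}; reset the store or pin the previous embedder model"
            )


# A store created with 1536-dimensional vectors rejects a 3072-dimensional batch.
try:
    check_batch_dimensions([[0.0] * 3072], expected_dim=1536)
except EmbeddingDimensionMismatchError as exc:
    print(f"save rejected: {exc}")
```

Checking every record rather than only the first is what catches a mixed batch, which is the case the commit message says LanceDB previously missed.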
--- a/docs/en/guides/migration/upgrading-crewai.mdx
+++ b/docs/en/guides/migration/upgrading-crewai.mdx
@@ -141,7 +141,7 @@ crew = Crew(
    process=Process.sequential,   # or Process.hierarchical
    memory=True,
    cache=True,
-    embedder={"provider": "openai", "config": {"model": "text-embedding-3-small"}},
+    embedder={"provider": "openai", "config": {"model": "text-embedding-3-large"}},
 )
 ```

@@ -173,7 +173,7 @@ write = Task(

 ### Memory & embedder config {#memory-embedder-config}

-If `memory=True` and you're not using the default OpenAI embeddings, you must pass an `embedder`:
+If `memory=True` and you're not using the default OpenAI `text-embedding-3-large` embeddings, you must pass an `embedder`:

 ```python
 crew = Crew(
@@ -187,4 +187,4 @@ crew = Crew(
 )
 ```

-Set the relevant provider credentials (`OPENAI_API_KEY`, `OLLAMA_HOST`, etc.) in your `.env` file. Memory storage paths are project-local by default — delete the project's memory directory if you change embedders, since dimensions don't mix.
+Set the relevant provider credentials (`OPENAI_API_KEY`, `OLLAMA_HOST`, etc.) in your `.env` file. Memory storage paths are project-local by default. Existing local memory stores created with 1536-dimensional embeddings may not be compatible with the default OpenAI `text-embedding-3-large` embedder, which uses 3072 dimensions. If you hit a dimension mismatch, delete the project's memory directory, run `crewai reset-memories -m`, or explicitly configure the older embedder model until you migrate.
--- a/docs/ko/changelog.mdx
+++ b/docs/ko/changelog.mdx
@@ -4,6 +4,55 @@ description: "CrewAI의 제품 업데이트, 개선 사항 및 버그 수정"
 icon: "clock"
 mode: "wide"
 ---
+<Update label="2026년 6월 11일">
+  ## v1.14.7
+
+  [GitHub 릴리스 보기](https://github.com/crewAIInc/crewAI/releases/tag/1.14.7)
+
+  ## 변경 사항
+
+  ### 기능
+  - 메모리, 지식, RAG 및 흐름에 대한 플러그 가능한 기본 백엔드를 추가했습니다.
+  - LLM 이벤트에서 실제 finish_reason, 샘플링 매개변수 및 response.id를 표시합니다.
+  - 경로 인식 장식자로서의 타입 DSL 트리거를 설정합니다.
+  - 대화 흐름을 위한 채팅 API를 추가했습니다.
+  - 잠금 백엔드를 재정의 가능하도록 만듭니다.
+  - Flow DSL 메타데이터에서 FlowDefinition을 빌드합니다.
+  - 네이티브 Snowflake Cortex LLM 공급자를 추가했습니다.
+  - 훈련된 에이전트 파일 지원을 추가했습니다.
+
+  ### 버그 수정
+  - 복원 시 사용자 정의 BaseLLM을 구체적인 LLM으로 재구성하도록 체크포인트를 수정했습니다.
+  - 라이브 스냅샷이 재개로 재생되지 않도록 플래그를 사용하여 복원을 제한합니다.
+  - 실행마다 런타임 상태의 범위를 설정하여 성장을 제한하고 동시 실행을 격리합니다.
+  - crewai-login에서 텔레메트리 설정을 수정했습니다.
+  - 메서드 실행 이벤트에 대해 suppress_flow_events를 존중합니다.
+  - uv 도구 설치를 위해 crewai 패키지에서 [project.scripts]를 복원합니다.
+  - aiohttp, docling 및 docling-core에 대한 pip-audit CVE를 해결합니다.
+  - 파일 입력이 신뢰할 수 없게 작동하는 문제를 수정했습니다.
+  - Snowflake Claude의 불완전한 도구 결과 기록을 수정했습니다.
+
+  ### 문서
+  - v1.14.7에 대한 변경 로그 및 버전을 업데이트했습니다.
+  - OpenTelemetry 수집기 문서를 업데이트했습니다.
+  - NVIDIA Nemotron LLM 가이드를 업데이트했습니다.
+  - Databricks 통합 가이드를 추가했습니다.
+  - Snowflake 통합 가이드를 추가했습니다.
+
+  ### 성능
+  - docling 가져오기를 지연 로딩하여 crewai 가져오기 속도를 개선했습니다.
+
+  ### 리팩토링
+  - 흐름 조건 평가를 이벤트별로 상태 비저장으로 단순화했습니다.
+  - 대화 논리를 런타임에서 분리하고 conversational_definition을 추가했습니다.
+  - `flow.py`를 DSL, 정의 및 런타임으로 분리했습니다.
+
+  ## 기여자
+
+  @Luzk, @alex-clawd, @devin-ai-integration[bot], @greysonlalonde, @gvieira, @jessemiller, @lorenzejay, @lucasgomide, @mattatcha, @vinibrsl
+
+</Update>
+
 <Update label="2026년 6월 10일">
  ## v1.14.7rc2

--- a/docs/ko/concepts/flows.mdx
+++ b/docs/ko/concepts/flows.mdx
@@ -221,6 +221,48 @@ Flow가 실행된 후, 이러한 메소드들에 의해 수행된 업데이트
 최종 메소드의 출력이 반환되고 상태에 접근할 수 있도록 함으로써, CrewAI Flow는 AI 워크플로우의 결과를 더 큰 애플리케이션이나 시스템에 쉽게 통합할 수 있게 하며,
 Flow 실행 과정 전반에 걸쳐 상태를 유지하고 접근하면서도 이를 용이하게 만듭니다.

+## 플로우 사용 메트릭
+
+Flow 실행이 완료된 후, `usage_metrics` 속성에 접근하여 실행 동안 발생한 **모든 LLM 호출**의 토큰 사용량 집계를 확인할 수 있습니다. 여기에는 Flow가 오케스트레이션한 모든 Crew의 호출, Agent의 도구 내부에서 발생한 호출, 그리고 Flow 메서드에서 직접 호출한 `LLM.call(...)`이 모두 포함됩니다. 이는 CrewAI Enterprise UI에 표시되는 총량과 동등한 SDK 측 값입니다.
+
+```python Code
+from crewai import LLM
+from crewai.flow.flow import Flow, listen, start
+
+class UsageMetricsFlow(Flow):
+    @start()
+    def run_first_crew(self):
+        self.state.first_result = FirstCrew().crew().kickoff()
+
+    @listen(run_first_crew)
+    def call_llm_directly(self):
+        # 직접 LLM 호출 — flow.usage_metrics에서도 집계됩니다
+        llm = LLM(model="openai/gpt-4o-mini")
+        self.state.summary = llm.call("핵심 내용을 요약해 주세요.")
+
+    @listen(call_llm_directly)
+    def run_second_crew(self):
+        self.state.second_result = SecondCrew().crew().kickoff()
+
+flow = UsageMetricsFlow()
+flow.kickoff()
+
+print(flow.usage_metrics)
+# UsageMetrics(total_tokens=8579, prompt_tokens=6210, completion_tokens=2369,
+#              cached_prompt_tokens=0, reasoning_tokens=0,
+#              cache_creation_tokens=0, successful_requests=5)
+```
+
+<Note>
+  `flow.usage_metrics`는 `flow.kickoff().token_usage`와 **동일하지 않습니다**.
+  후자는 `CrewOutput`을 반환한 **마지막** `@listen` 메서드의
+  `CrewOutput.token_usage`만 반환하므로, 이전에 실행된 Crew들과 Flow 메서드에서
+  직접 호출한 `LLM.call(...)`은 전혀 포함되지 않습니다. Flow 실행에 대한
+  **전체** 토큰 집계가 필요할 때는 항상 `flow.usage_metrics`를 사용하십시오.
+</Note>
+
+반환되는 [`UsageMetrics`](https://github.com/crewAIInc/crewAI/blob/main/lib/crewai/src/crewai/types/usage_metrics.py)의 각 항목은 단일 `flow.kickoff()` 실행 동안 발생한 모든 LLM 호출의 합계입니다. 다음 `kickoff()` 호출(및 `kickoff_for_each`의 각 반복)에서 카운터가 초기화되므로 연속 실행이 이중으로 집계되지 않습니다. 이 속성은 `kickoff()` 완료 후 언제든지 안전하게 읽을 수 있으며, 실행 중에 읽으면 그 시점까지 누적된 부분 합계를 반환합니다.
+
 ## 플로우 상태 관리

 상태를 효과적으로 관리하는 것은 신뢰할 수 있고 유지 보수가 용이한 AI 워크플로를 구축하는 데 매우 중요합니다. CrewAI 플로우는 비정형 및 정형 상태 관리를 위한 강력한 메커니즘을 제공하여, 개발자가 자신의 애플리케이션에 가장 적합한 접근 방식을 선택할 수 있도록 합니다.
--- a/docs/pt-BR/changelog.mdx
+++ b/docs/pt-BR/changelog.mdx
@@ -4,6 +4,55 @@ description: "Atualizações de produto, melhorias e correções do CrewAI"
 icon: "clock"
 mode: "wide"
 ---
+<Update label="11 jun 2026">
+  ## v1.14.7
+
+  [Ver release no GitHub](https://github.com/crewAIInc/crewAI/releases/tag/1.14.7)
+
+  ## O que Mudou
+
+  ### Recursos
+  - Adicionar backends padrão plugáveis para memória, conhecimento, RAG e fluxo.
+  - Exibir o verdadeiro finish_reason, parâmetros de amostragem e response.id em eventos LLM.
+  - Tipar os gatilhos DSL como decoradores cientes de rotas.
+  - Adicionar API de chat para fluxos de conversa.
+  - Tornar o backend de bloqueio substituível.
+  - Construir FlowDefinition a partir de metadados Flow DSL.
+  - Adicionar provedor nativo Snowflake Cortex LLM.
+  - Adicionar suporte a arquivos de agentes treinados pela equipe.
+
+  ### Correções de Bugs
+  - Corrigir checkpoint para reconstruir BaseLLM personalizado como LLM concreto na restauração.
+  - Controlar a restauração com uma flag para evitar que snapshots ao vivo sejam reproduzidos como retomada.
+  - Escopar o estado de execução por execução para limitar o crescimento e isolar execuções concorrentes.
+  - Corrigir configuração de telemetria no crewai-login.
+  - Respeitar suppress_flow_events para eventos de execução de método.
+  - Restaurar [project.scripts] no pacote crewai para instalação da ferramenta uv.
+  - Resolver CVEs de pip-audit para aiohttp, docling e docling-core.
+  - Corrigir entrada de arquivo que não estava funcionando de forma confiável.
+  - Corrigir históricos incompletos de resultados de ferramentas do Snowflake Claude.
+
+  ### Documentação
+  - Atualizar changelog e versão para v1.14.7.
+  - Atualizar documentação do coletor OpenTelemetry.
+  - Atualizar guia do LLM NVIDIA Nemotron.
+  - Adicionar guia de integração do Databricks.
+  - Adicionar guia de integração do Snowflake.
+
+  ### Desempenho
+  - Melhorar a velocidade de importação do crewai através do carregamento preguiçoso de imports do docling.
+
+  ### Refatoração
+  - Simplificar a avaliação de condições de fluxo para ser sem estado por evento.
+  - Desacoplar a lógica de conversa da execução e adicionar uma conversational_definition.
+  - Dividir `flow.py` em DSL, definição e execução.
+
+  ## Contribuidores
+
+  @Luzk, @alex-clawd, @devin-ai-integration[bot], @greysonlalonde, @gvieira, @jessemiller, @lorenzejay, @lucasgomide, @mattatcha, @vinibrsl
+
+</Update>
+
 <Update label="10 jun 2026">
  ## v1.14.7rc2

--- a/docs/pt-BR/concepts/flows.mdx
+++ b/docs/pt-BR/concepts/flows.mdx
@@ -219,6 +219,49 @@ Após o término da execução, é possível acessar o estado final e observar a
 Ao garantir que a saída do método final seja retornada e oferecer acesso ao estado, o CrewAI Flows facilita a integração dos resultados dos seus workflows de IA em aplicações maiores,
 além de permitir o gerenciamento e o acesso ao estado durante toda a execução do Flow.

+## Métricas de Uso do Flow
+
+Após a execução de um Flow, você pode acessar a propriedade `usage_metrics` para visualizar o consumo agregado de tokens em **todas as chamadas de LLM** realizadas durante a execução — incluindo chamadas das Crews orquestradas pelo Flow, chamadas dentro de tools de Agents, e invocações diretas de `LLM.call(...)` feitas a partir de métodos do Flow. Esse é o equivalente, do lado do SDK, ao total exibido na interface do CrewAI Enterprise.
+
+```python Code
+from crewai import LLM
+from crewai.flow.flow import Flow, listen, start
+
+class UsageMetricsFlow(Flow):
+    @start()
+    def run_first_crew(self):
+        self.state.first_result = FirstCrew().crew().kickoff()
+
+    @listen(run_first_crew)
+    def call_llm_directly(self):
+        # Chamada direta de LLM — também contabilizada por flow.usage_metrics
+        llm = LLM(model="openai/gpt-4o-mini")
+        self.state.summary = llm.call("Resuma os principais pontos.")
+
+    @listen(call_llm_directly)
+    def run_second_crew(self):
+        self.state.second_result = SecondCrew().crew().kickoff()
+
+flow = UsageMetricsFlow()
+flow.kickoff()
+
+print(flow.usage_metrics)
+# UsageMetrics(total_tokens=8579, prompt_tokens=6210, completion_tokens=2369,
+#              cached_prompt_tokens=0, reasoning_tokens=0,
+#              cache_creation_tokens=0, successful_requests=5)
+```
+
+<Note>
+  `flow.usage_metrics` **não** é o mesmo que `flow.kickoff().token_usage`. Este
+  último retorna apenas o `CrewOutput.token_usage` do **último** método
+  `@listen` que retornou um `CrewOutput`, ou seja, reflete somente a Crew
+  final e ignora completamente as Crews anteriores e quaisquer chamadas
+  diretas de `LLM.call(...)`. Use `flow.usage_metrics` sempre que precisar do
+  rollup **completo** de tokens da execução do Flow.
+</Note>
+
+Cada campo do [`UsageMetrics`](https://github.com/crewAIInc/crewAI/blob/main/lib/crewai/src/crewai/types/usage_metrics.py) retornado representa a soma de todas as chamadas de LLM feitas em uma única invocação de `flow.kickoff()`. Os contadores são resetados a cada novo `kickoff()` (e em cada iteração de `kickoff_for_each`), de modo que execuções sucessivas não duplicam o total. A propriedade é segura para ser lida em qualquer momento após o `kickoff()`; lê-la durante a execução retorna o total parcial acumulado até aquele instante.
+
 ## Gerenciamento de Estado em Flows

 Gerenciar o estado de forma eficaz é fundamental para construir fluxos de trabalho de IA confiáveis e de fácil manutenção. O CrewAI Flows oferece mecanismos robustos para o gerenciamento de estado tanto não estruturado quanto estruturado,
--- a/lib/cli/pyproject.toml
+++ b/lib/cli/pyproject.toml
@@ -8,7 +8,7 @@ authors = [
 ]
 requires-python = ">=3.10, <3.14"
 dependencies = [
-    "crewai-core==1.14.7rc2",
+    "crewai-core==1.14.7",
    "click>=8.1.7,<9",
    "pydantic>=2.11.9,<2.13",
    "pydantic-settings~=2.10.1",
--- a/lib/cli/src/crewai_cli/init.py
+++ b/lib/cli/src/crewai_cli/init.py
@@ -1 +1 @@
-__version__ = "1.14.7rc2"
+__version__ = "1.14.7"
--- a/lib/cli/src/crewai_cli/cli.py
+++ b/lib/cli/src/crewai_cli/cli.py
@@ -3,41 +3,94 @@ from __future__ import annotations
 from importlib.metadata import version as get_version
 import os
 import subprocess
-from typing import Any
+from typing import TYPE_CHECKING, Any

 import click
 from crewai_core.token_manager import TokenManager

-from crewai_cli.add_crew_to_flow import add_crew_to_flow
-from crewai_cli.authentication.main import AuthenticationCommand
 from crewai_cli.config import Settings
-from crewai_cli.create_crew import create_crew
-from crewai_cli.create_flow import create_flow
-from crewai_cli.crew_chat import run_chat
-from crewai_cli.deploy.main import DeployCommand
-from crewai_cli.enterprise.main import EnterpriseConfigureCommand
-from crewai_cli.evaluate_crew import evaluate_crew
-from crewai_cli.experimental.skills.main import SkillCommand
-from crewai_cli.install_crew import install_crew
-from crewai_cli.kickoff_flow import kickoff_flow
-from crewai_cli.organization.main import OrganizationCommand
-from crewai_cli.plot_flow import plot_flow
-from crewai_cli.remote_template.main import TemplateCommand
-from crewai_cli.replay_from_task import replay_task_command
-from crewai_cli.reset_memories_command import reset_memories_command
-from crewai_cli.run_crew import run_crew
-from crewai_cli.settings.main import SettingsCommand
-from crewai_cli.task_outputs import load_task_outputs
-from crewai_cli.tools.main import ToolCommand
-from crewai_cli.train_crew import train_crew
-from crewai_cli.triggers.main import TriggersCommand
-from crewai_cli.update_crew import update_crew
 from crewai_cli.user_data import (
    _load_user_data,
    is_tracing_enabled,
    update_user_data,
 )
-from crewai_cli.utils import build_env_with_all_tool_credentials, read_toml
+from crewai_cli.utils import (
+    build_env_with_all_tool_credentials,
+    enable_prompt_line_editing,
+    read_toml,
+)
+
+
+def train_crew(*args: Any, **kwargs: Any) -> Any:
+    from crewai_cli.train_crew import train_crew as _train_crew
+
+    return _train_crew(*args, **kwargs)
+
+
+def evaluate_crew(*args: Any, **kwargs: Any) -> Any:
+    from crewai_cli.evaluate_crew import evaluate_crew as _evaluate_crew
+
+    return _evaluate_crew(*args, **kwargs)
+
+
+def replay_task_command(*args: Any, **kwargs: Any) -> Any:
+    from crewai_cli.replay_from_task import replay_task_command as _replay_task_command
+
+    return _replay_task_command(*args, **kwargs)
+
+
+def run_flow_definition(*args: Any, **kwargs: Any) -> Any:
+    from crewai_cli.run_flow_definition import (
+        run_flow_definition as _run_flow_definition,
+    )
+
+    return _run_flow_definition(*args, **kwargs)
+
+
+def run_crew(*args: Any, **kwargs: Any) -> Any:
+    from crewai_cli.run_crew import run_crew as _run_crew
+
+    return _run_crew(*args, **kwargs)
+
+
+if TYPE_CHECKING:
+    # mypy sees the real classes; at runtime the shims below defer the
+    # heavy imports until a command actually instantiates them.
+    from crewai_cli.authentication.main import AuthenticationCommand
+    from crewai_cli.deploy.main import DeployCommand
+    from crewai_cli.organization.main import OrganizationCommand
+    from crewai_cli.remote_template.main import TemplateCommand
+else:
+
+    class AuthenticationCommand:
+        def __new__(cls, *args: Any, **kwargs: Any) -> Any:
+            from crewai_cli.authentication.main import (
+                AuthenticationCommand as _AuthenticationCommand,
+            )
+
+            return _AuthenticationCommand(*args, **kwargs)
+
+    class DeployCommand:
+        def __new__(cls, *args: Any, **kwargs: Any) -> Any:
+            from crewai_cli.deploy.main import DeployCommand as _DeployCommand
+
+            return _DeployCommand(*args, **kwargs)
+
+    class TemplateCommand:
+        def __new__(cls, *args: Any, **kwargs: Any) -> Any:
+            from crewai_cli.remote_template.main import (
+                TemplateCommand as _TemplateCommand,
+            )
+
+            return _TemplateCommand(*args, **kwargs)
+
+    class OrganizationCommand:
+        def __new__(cls, *args: Any, **kwargs: Any) -> Any:
+            from crewai_cli.organization.main import (
+                OrganizationCommand as _OrganizationCommand,
+            )
+
+            return _OrganizationCommand(*args, **kwargs)


 def _get_cli_version() -> str:
@@ -90,17 +143,57 @@ def uv(uv_args: tuple[str, ...]) -> None:


@crewai.command()
-@click.argument("type", type=click.Choice(["crew", "flow"]))
-@click.argument("name")
+@click.argument(
+    "type", required=False, default=None, type=click.Choice(["crew", "flow"])
+)
+@click.argument("name", required=False, default=None)
@click.option("--provider", type=str, help="The provider to use for the crew")
@click.option("--skip_provider", is_flag=True, help="Skip provider validation")
+@click.option(
+    "--classic",
+    is_flag=True,
+    help="Use classic Python/YAML project structure instead of JSON",
+)
 def create(
-    type: str, name: str, provider: str | None, skip_provider: bool = False
+    type: str | None,
+    name: str | None,
+    provider: str | None,
+    skip_provider: bool = False,
+    classic: bool = False,
 ) -> None:
    """Create a new crew, or flow."""
+    if not type:
+        from crewai_cli.tui_picker import pick
+
+        options = [
+            ("crew", "A team of AI agents working together"),
+            (
+                "flow",
+                "A deterministic workflow with full control over agents and crews",
+            ),
+        ]
+        type = pick("What would you like to create?", options)
+        if type is None:
+            raise SystemExit(0)
+        click.echo()
+    if not name:
+        enable_prompt_line_editing()
+        name = click.prompt(
+            click.style(f"  Name of your {type}", fg="cyan", bold=True),
+            prompt_suffix=click.style(" › ", fg="bright_white"),  # noqa: RUF001
+        )
    if type == "crew":
-        create_crew(name, provider, skip_provider)
+        if classic:
+            from crewai_cli.create_crew import create_crew
+
+            create_crew(name, provider, skip_provider)
+        else:
+            from crewai_cli.create_json_crew import create_json_crew
+
+            create_json_crew(name, provider, skip_provider)
    elif type == "flow":
+        from crewai_cli.create_flow import create_flow
+
        create_flow(name)
    else:
        click.secho("Error: Invalid type. Must be 'crew' or 'flow'.", fg="red")
@@ -185,6 +278,8 @@ def replay(task_id: str, trained_agents_file: str | None) -> None:
 def log_tasks_outputs() -> None:
    """Retrieve your latest crew.kickoff() task outputs."""
    try:
+        from crewai_cli.task_outputs import load_task_outputs
+
        tasks = load_task_outputs()

        if not tasks:
@@ -273,6 +368,8 @@ def reset_memories(
                "Please specify at least one memory type to reset using the appropriate flags."
            )
            return
+        from crewai_cli.reset_memories_command import reset_memories_command
+
        reset_memories_command(memory, knowledge, agent_knowledge, kickoff_outputs, all)
    except Exception as e:
        click.echo(f"An error occurred while resetting memories: {e}", err=True)
@@ -295,7 +392,7 @@ def reset_memories(
    "--embedder-model",
    type=str,
    default=None,
-    help="Embedder model name (e.g. text-embedding-3-small, gemini-embedding-001).",
+    help="Embedder model name (e.g. text-embedding-3-large, gemini-embedding-001).",
 )
@click.option(
    "--embedder-config",
@@ -350,7 +447,7 @@ def memory(
    "-m",
    "--model",
    type=str,
-    default="gpt-4o-mini",
+    default="gpt-5.4-mini",
    help="LLM Model to run the tests on the Crew. For now only accepting only OpenAI models.",
 )
@click.option(
@@ -381,6 +478,8 @@ def test(n_iterations: int, model: str, trained_agents_file: str | None) -> None
@click.pass_context
 def install(context: click.Context) -> None:
    """Install the Crew."""
+    from crewai_cli.install_crew import install_crew
+
    install_crew(context.args)


@@ -398,14 +497,46 @@ def install(context: click.Context) -> None:
        "CREWAI_TRAINED_AGENTS_FILE."
    ),
 )
-def run(trained_agents_file: str | None) -> None:
-    """Run the Crew."""
+@click.option(
+    "--definition",
+    type=str,
+    default=None,
+    help=(
+        "Experimental: path to a Flow Definition YAML/JSON file, "
+        "or an inline YAML/JSON string."
+    ),
+)
+@click.option(
+    "--inputs",
+    type=str,
+    default=None,
+    help='Experimental: JSON object passed to flow.kickoff(), e.g. \'{"topic":"AI"}\'.',
+)
+def run(
+    trained_agents_file: str | None,
+    definition: str | None,
+    inputs: str | None,
+) -> None:
+    """Run the Crew or Flow."""
+    if inputs is not None and definition is None:
+        raise click.UsageError("--inputs requires --definition")
+
+    if definition is not None:
+        click.secho(
+            "Warning: `crewai run --definition` is experimental and may change without notice.",
+            fg="yellow",
+        )
+        run_flow_definition(definition=definition, inputs=inputs)
+        return
+
    run_crew(trained_agents_file=trained_agents_file)


@crewai.command()
 def update() -> None:
    """Update the pyproject.toml of the Crew project to use uv."""
+    from crewai_cli.update_crew import update_crew
+
    update_crew()


@@ -515,6 +646,8 @@ def tool() -> None:
@tool.command(name="create")
@click.argument("handle")
 def tool_create(handle: str) -> None:
+    from crewai_cli.tools.main import ToolCommand
+
    tool_cmd = ToolCommand()
    tool_cmd.create(handle)

@@ -522,6 +655,8 @@ def tool_create(handle: str) -> None:
@tool.command(name="install")
@click.argument("handle")
 def tool_install(handle: str) -> None:
+    from crewai_cli.tools.main import ToolCommand
+
    tool_cmd = ToolCommand()
    tool_cmd.login()
    tool_cmd.install(handle)
@@ -538,6 +673,8 @@ def tool_install(handle: str) -> None:
@click.option("--public", "is_public", flag_value=True, default=False)
@click.option("--private", "is_public", flag_value=False)
 def tool_publish(is_public: bool, force: bool) -> None:
+    from crewai_cli.tools.main import ToolCommand
+
    tool_cmd = ToolCommand()
    tool_cmd.login()
    tool_cmd.publish(is_public, force)
@@ -570,6 +707,8 @@ def skill() -> None:
    help="Create skill in current dir instead of ./skills/",
 )
 def skill_create(name: str, in_project: bool) -> None:
+    from crewai_cli.experimental.skills.main import SkillCommand
+
    skill_cmd = SkillCommand()
    skill_cmd.create(name, in_project=in_project)

@@ -577,6 +716,8 @@ def skill_create(name: str, in_project: bool) -> None:
@skill.command(name="install")
@click.argument("ref")
 def skill_install(ref: str) -> None:
+    from crewai_cli.experimental.skills.main import SkillCommand
+
    skill_cmd = SkillCommand()
    skill_cmd.install(ref)

@@ -593,6 +734,8 @@ def skill_install(ref: str) -> None:
@click.option("--private", "is_public", flag_value=False)
@click.option("--org", default=None, help="Organisation slug (overrides settings).")
 def skill_publish(is_public: bool, org: str | None, force: bool) -> None:
+    from crewai_cli.experimental.skills.main import SkillCommand
+
    skill_cmd = SkillCommand()
    skill_cmd.publish(is_public, org=org, force=force)

@@ -600,6 +743,8 @@ def skill_publish(is_public: bool, org: str | None, force: bool) -> None:
@skill.command(name="list")
 def skill_list() -> None:
    """List locally installed skills."""
+    from crewai_cli.experimental.skills.main import SkillCommand
+
    skill_cmd = SkillCommand()
    skill_cmd.list_cached()

@@ -639,6 +784,8 @@ def flow() -> None:
@flow.command(name="kickoff")
 def flow_run() -> None:
    """Kickoff the Flow."""
+    from crewai_cli.kickoff_flow import kickoff_flow
+
    click.echo("Running the Flow")
    kickoff_flow()

@@ -646,6 +793,8 @@ def flow_run() -> None:
@flow.command(name="plot")
 def flow_plot() -> None:
    """Plot the Flow."""
+    from crewai_cli.plot_flow import plot_flow
+
    click.echo("Plotting the Flow")
    plot_flow()

@@ -654,6 +803,8 @@ def flow_plot() -> None:
@click.argument("crew_name")
 def flow_add_crew(crew_name: str) -> None:
    """Add a crew to an existing flow."""
+    from crewai_cli.add_crew_to_flow import add_crew_to_flow
+
    click.echo(f"Adding crew {crew_name} to the flow")
    add_crew_to_flow(crew_name)

@@ -666,6 +817,8 @@ def triggers() -> None:
@triggers.command(name="list")
 def triggers_list() -> None:
    """List all available triggers from integrations."""
+    from crewai_cli.triggers.main import TriggersCommand
+
    triggers_cmd = TriggersCommand()
    triggers_cmd.list_triggers()

@@ -674,6 +827,8 @@ def triggers_list() -> None:
@click.argument("trigger_path")
 def triggers_run(trigger_path: str) -> None:
    """Execute crew with trigger payload. Format: app_slug/trigger_slug"""
+    from crewai_cli.triggers.main import TriggersCommand
+
    triggers_cmd = TriggersCommand()
    triggers_cmd.execute_with_trigger(trigger_path)

@@ -686,6 +841,8 @@ def chat() -> None:
    click.secho(
        "\nStarting a conversation with the Crew\nType 'exit' or Ctrl+C to quit.\n",
    )
+    from crewai_cli.crew_chat import run_chat
+
    run_chat()


@@ -725,6 +882,8 @@ def enterprise() -> None:
@click.argument("enterprise_url")
 def enterprise_configure(enterprise_url: str) -> None:
    """Configure CrewAI AMP OAuth2 settings from the provided Enterprise URL."""
+    from crewai_cli.enterprise.main import EnterpriseConfigureCommand
+
    enterprise_command = EnterpriseConfigureCommand()
    enterprise_command.configure(enterprise_url)

@@ -737,6 +896,8 @@ def config() -> None:
@config.command("list")
 def config_list() -> None:
    """List all CLI configuration parameters."""
+    from crewai_cli.settings.main import SettingsCommand
+
    config_command = SettingsCommand()
    config_command.list()

@@ -746,6 +907,8 @@ def config_list() -> None:
@click.argument("value")
 def config_set(key: str, value: str) -> None:
    """Set a CLI configuration parameter."""
+    from crewai_cli.settings.main import SettingsCommand
+
    config_command = SettingsCommand()
    config_command.set(key, value)

@@ -753,6 +916,8 @@ def config_set(key: str, value: str) -> None:
@config.command("reset")
 def config_reset() -> None:
    """Reset all CLI configuration parameters to default values."""
+    from crewai_cli.settings.main import SettingsCommand
+
    config_command = SettingsCommand()
    config_command.reset_all_settings()

--- a/lib/cli/src/crewai_cli/create_json_crew.py
+++ b/lib/cli/src/crewai_cli/create_json_crew.py
--- a/lib/cli/src/crewai_cli/crew_run_tui.py
+++ b/lib/cli/src/crewai_cli/crew_run_tui.py
--- a/lib/cli/src/crewai_cli/deploy/main.py
+++ b/lib/cli/src/crewai_cli/deploy/main.py
@@ -34,6 +34,39 @@ def _run_predeploy_validation(skip_validate: bool) -> bool:
    return True


+def _display_git_repository_help() -> None:
+    """Explain how to prepare a new project for deployment."""
+    console.print(
+        "Deployment requires a Git repository with an origin remote.",
+        style="bold red",
+    )
+    console.print(
+        "CrewAI AMP deploys from the remote repository URL, so commit and push "
+        "this project first, then run deploy again.",
+        style="yellow",
+    )
+    console.print("\nSuggested setup:")
+    console.print("  git init")
+    console.print("  git add .")
+    console.print('  git commit -m "Initial crew"')
+    console.print("  git branch -M main")
+    console.print("  git remote add origin <your-repo-url>")
+    console.print("  git push -u origin main")
+
+
+def _display_git_remote_help() -> None:
+    """Explain how to add a remote to an existing Git repository."""
+    console.print("No remote repository URL found.", style="bold red")
+    console.print(
+        "CrewAI AMP deploys from the origin remote. Add a remote, push your "
+        "latest commit, then run deploy again.",
+        style="yellow",
+    )
+    console.print("\nSuggested setup:")
+    console.print("  git remote add origin <your-repo-url>")
+    console.print("  git push -u origin HEAD")
+
+
 class DeployCommand(BaseCommand, PlusAPIMixin):
    """
    A class to handle deployment-related operations for CrewAI projects.
@@ -124,14 +157,11 @@ class DeployCommand(BaseCommand, PlusAPIMixin):
        try:
            remote_repo_url = git.Repository().origin_url()
        except ValueError:
-            remote_repo_url = None
+            _display_git_repository_help()
+            return

        if remote_repo_url is None:
-            console.print("No remote repository URL found.", style="bold red")
-            console.print(
-                "Please ensure your project has a valid remote repository.",
-                style="yellow",
-            )
+            _display_git_remote_help()
            return

        self._confirm_input(env_vars, remote_repo_url, confirm)
--- a/lib/cli/src/crewai_cli/deploy/validate.py
+++ b/lib/cli/src/crewai_cli/deploy/validate.py
@@ -38,6 +38,12 @@ import subprocess
 import sys
 from typing import Any

+from crewai.project.json_loader import (
+    JSONProjectValidationError,
+    find_crew_json_file,
+    find_json_project_file,
+    validate_crew_project,
+)
 from rich.console import Console

 from crewai_cli.utils import parse_toml
@@ -151,9 +157,33 @@ class DeployValidator:
    def ok(self) -> bool:
        return not self.errors

+    @property
+    def _is_json_crew(self) -> bool:
+        """True for JSON crew projects, deferring to the declared type.
+
+        A flow project that also contains a crew.json(c) file validates as
+        the flow it declares in pyproject.toml, not as a JSON crew.
+        """
+        if find_crew_json_file(self.project_root) is None:
+            return False
+        pyproject_path = self.project_root / "pyproject.toml"
+        if not pyproject_path.exists():
+            return True
+        try:
+            data = parse_toml(pyproject_path.read_text())
+        except Exception:
+            return True
+        declared_type: str | None = (
+            (data.get("tool") or {}).get("crewai", {}).get("type")
+        )
+        return declared_type != "flow"
+
    def run(self) -> list[ValidationResult]:
        """Run all checks. Later checks are skipped when earlier ones make
        them impossible (e.g. no pyproject.toml → no lockfile check)."""
+        if self._is_json_crew:
+            return self._run_json_checks()
+
        if not self._check_pyproject():
            return self.results

@@ -176,6 +206,110 @@ class DeployValidator:

        return self.results

+    def _run_json_checks(self) -> list[ValidationResult]:
+        """Validation suite for JSON-defined crew projects."""
+        crew_path = find_crew_json_file(self.project_root)
+        if crew_path is None:
+            return self.results
+
+        try:
+            project = validate_crew_project(crew_path, self.project_root / "agents")
+        except JSONProjectValidationError as e:
+            self._add(
+                Severity.ERROR,
+                "invalid_crew_json",
+                f"{crew_path.name} has invalid JSON crew configuration",
+                detail="\n".join(e.errors),
+                hint="Fix the JSON crew, agent, and task references before deploying.",
+            )
+            return self.results
+        except Exception as e:
+            self._add(
+                Severity.ERROR,
+                "invalid_crew_json",
+                f"Cannot parse {crew_path.name}",
+                detail=str(e),
+            )
+            return self.results
+
+        agents_dir = self.project_root / "agents"
+
+        self._check_pyproject()
+        self._check_lockfile()
+        self._check_env_vars_json(crew_path, agents_dir, project.agent_names)
+        self._check_version_vs_lockfile()
+
+        return self.results
+
+    def _check_env_vars_json(
+        self, crew_path: Path, agents_dir: Path, agent_names: list[str]
+    ) -> None:
+        """Check for env var references in JSON crew files."""
+        referenced: set[str] = set()
+        pattern = re.compile(r"\$\{?([A-Z][A-Z0-9_]+)\}?")
+
+        try:
+            referenced.update(pattern.findall(crew_path.read_text(errors="ignore")))
+        except OSError as exc:
+            logger.debug("Skipping unreadable crew file %s: %s", crew_path, exc)
+
+        for name in agent_names:
+            agent_path = find_json_project_file(agents_dir, name)
+            if agent_path is None:
+                continue
+            try:
+                referenced.update(
+                    pattern.findall(agent_path.read_text(errors="ignore"))
+                )
+            except OSError as exc:
+                logger.debug("Skipping unreadable agent file %s: %s", agent_path, exc)
+
+        env_pattern = re.compile(
+            r"""(?x)
+            (?:os\.environ\s*(?:\[\s*|\.get\s*\(\s*)
+              |os\.getenv\s*\(\s*
+              |getenv\s*\(\s*)
+            ['"]([A-Z][A-Z0-9_]*)['"]
+            """
+        )
+        for py_path in self.project_root.rglob("*.py"):
+            if ".venv" in py_path.parts:
+                continue
+            try:
+                text = py_path.read_text(encoding="utf-8", errors="ignore")
+            except OSError:
+                continue
+            referenced.update(env_pattern.findall(text))
+
+        env_file = self.project_root / ".env"
+        env_keys: set[str] = set()
+        if env_file.exists():
+            for line in env_file.read_text(errors="ignore").splitlines():
+                line = line.strip()
+                if not line or line.startswith("#") or "=" not in line:
+                    continue
+                env_keys.add(line.split("=", 1)[0].strip())
+
+        missing_known = sorted(
+            var
+            for var in referenced
+            if var in _KNOWN_API_KEY_HINTS
+            and var not in env_keys
+            and var not in os.environ
+        )
+        if missing_known:
+            self._add(
+                Severity.WARNING,
+                "env_vars_not_in_dotenv",
+                f"{len(missing_known)} referenced API key(s) not in .env",
+                detail=(
+                    "These env vars are referenced in your project but not set "
+                    f"locally: {', '.join(missing_known)}. Deploys will fail "
+                    "unless they are added to the deployment's Environment "
+                    "Variables in the CrewAI dashboard."
+                ),
+            )
+
    def _check_pyproject(self) -> bool:
        pyproject_path = self.project_root / "pyproject.toml"
        if not pyproject_path.exists():
--- a/lib/cli/src/crewai_cli/git.py
+++ b/lib/cli/src/crewai_cli/git.py
@@ -48,6 +48,7 @@ class Repository:
                ["git", "rev-parse", "--is-inside-work-tree"],  # noqa: S607
                cwd=self.path,
                encoding="utf-8",
+                stderr=subprocess.DEVNULL,
            )
            return True
        except subprocess.CalledProcessError:
--- a/lib/cli/src/crewai_cli/run_crew.py
+++ b/lib/cli/src/crewai_cli/run_crew.py
@@ -1,25 +1,311 @@
+from __future__ import annotations
+
+from contextlib import AbstractContextManager, nullcontext
 from enum import Enum
+import os
+from pathlib import Path
+import re
 import subprocess
+import sys
+from typing import TYPE_CHECKING, Any

 import click
+from crewai.project.json_loader import find_crew_json_file
 from crewai_core.constants import CREWAI_TRAINED_AGENTS_FILE_ENV
 from packaging import version

-from crewai_cli.utils import build_env_with_all_tool_credentials, read_toml
+from crewai_cli.utils import (
+    build_env_with_all_tool_credentials,
+    enable_prompt_line_editing,
+    read_toml,
+)
 from crewai_cli.version import get_crewai_version


+if TYPE_CHECKING:
+    from crewai_cli.crew_run_tui import CrewRunApp
+
+
 class CrewType(Enum):
    STANDARD = "standard"
    FLOW = "flow"


-def run_crew(trained_agents_file: str | None = None) -> None:
-    """Run the crew or flow by running a command in the UV environment.
+# Must accept the same names as the kickoff interpolation pattern in
+# crewai.utilities.string_utils (_VARIABLE_PATTERN), including hyphens —
+# otherwise placeholders are interpolated at runtime but never prompted for.
+_INPUT_PLACEHOLDER_RE = re.compile(r"(?<!{){([A-Za-z_][A-Za-z0-9_\-]*)}(?!})")

-    Starting from version 0.103.0, this command can be used to run both
-    standard crews and flows. For flows, it detects the type from pyproject.toml
-    and automatically runs the appropriate command.
+
+def _has_json_crew() -> bool:
+    """Check if this is a JSON-defined crew project.
+
+    The project type declared in pyproject.toml wins: a flow project that
+    happens to contain a crew.json(c) file still runs as a flow. A missing
+    or unreadable pyproject means a bare JSON crew project.
+    """
+    if find_crew_json_file() is None:
+        return False
+    try:
+        pyproject_data = read_toml()
+    except Exception:
+        return True
+    declared_type: str | None = (
+        pyproject_data.get("tool", {}).get("crewai", {}).get("type")
+    )
+    return declared_type != "flow"
+
+
+def _extract_input_placeholders(text: str | None) -> set[str]:
+    if not text:
+        return set()
+    return set(_INPUT_PLACEHOLDER_RE.findall(text))
+
+
+def _missing_input_names(crew: Any, inputs: dict[str, Any]) -> list[str]:
+    """Return input placeholders used by a crew but not provided as defaults."""
+    placeholders: set[str] = set()
+
+    for agent in getattr(crew, "agents", []) or []:
+        placeholders.update(_extract_input_placeholders(getattr(agent, "role", None)))
+        placeholders.update(_extract_input_placeholders(getattr(agent, "goal", None)))
+        placeholders.update(
+            _extract_input_placeholders(getattr(agent, "backstory", None))
+        )
+
+    for task in getattr(crew, "tasks", []) or []:
+        placeholders.update(
+            _extract_input_placeholders(getattr(task, "description", None))
+        )
+        placeholders.update(
+            _extract_input_placeholders(getattr(task, "expected_output", None))
+        )
+        placeholders.update(
+            _extract_input_placeholders(getattr(task, "output_file", None))
+        )
+
+    return sorted(name for name in placeholders if name not in inputs)
+
+
+def _prompt_for_missing_inputs(
+    crew: Any, default_inputs: dict[str, Any]
+) -> dict[str, Any]:
+    """Ask for runtime values for placeholders that lack default inputs."""
+    inputs = dict(default_inputs or {})
+    missing = _missing_input_names(crew, inputs)
+    if not missing:
+        return inputs
+
+    enable_prompt_line_editing()
+
+    click.echo()
+    click.secho("  Runtime inputs", fg="cyan", bold=True)
+    click.secho(
+        "  Values for {placeholder} references in your agents and tasks.",
+        dim=True,
+    )
+
+    for name in missing:
+        inputs[name] = click.prompt(
+            click.style(f"  {name}", fg="cyan"),
+            prompt_suffix=click.style(" > ", fg="bright_white"),
+        )
+
+    return inputs
+
+
+def _json_loading_status(message: str) -> AbstractContextManager[Any]:
+    from rich.console import Console
+    from rich.text import Text
+
+    console = Console()
+    if not console.is_terminal:
+        return nullcontext()
+    return console.status(
+        Text(f"  {message}", style="bold #1F7982"),
+        spinner="dots",
+    )
+
+
+def _load_json_crew(crew_path: Path) -> tuple[Any, dict[str, Any]]:
+    from crewai.project.crew_loader import load_crew
+
+    return load_crew(crew_path)
+
+
+def _load_json_crew_for_tui(
+    crew_path: Path,
+) -> tuple[type[Any], Any, dict[str, Any], list[str], list[str]]:
+    with _json_loading_status("Preparing crew..."):
+        from crewai_cli.crew_run_tui import CrewRunApp
+
+        crew, default_inputs = _load_json_crew(crew_path)
+        _prepare_json_crew_for_tui(crew)
+        task_names = [
+            getattr(task, "name", "") or getattr(task, "description", "")[:40] or "Task"
+            for task in crew.tasks
+        ]
+        agent_names = [
+            getattr(agent, "role", "") or getattr(agent, "name", "") or "Agent"
+            for agent in crew.agents
+        ]
+
+    return CrewRunApp, crew, default_inputs, task_names, agent_names
+
+
+def _prepare_json_crew_for_tui(crew: Any) -> None:
+    """Apply the same quiet/streaming setup used by the TUI JSON loader."""
+    crew.verbose = False
+    for agent in crew.agents:
+        agent.verbose = False
+        if hasattr(agent, "llm") and hasattr(agent.llm, "stream"):
+            agent.llm.stream = True
+
+
+def _run_json_crew(trained_agents_file: str | None = None) -> Any:
+    """Load and run a JSON-defined crew."""
+    from dotenv import load_dotenv
+
+    env_file = Path.cwd() / ".env"
+    if env_file.exists():
+        load_dotenv(env_file, override=True)
+
+    # JSON crews run in-process, so export the trained-agents file directly
+    # instead of forwarding it to a subprocess like classic crews do.
+    if trained_agents_file:
+        os.environ[CREWAI_TRAINED_AGENTS_FILE_ENV] = trained_agents_file
+
+    crew_path = find_crew_json_file()
+    if crew_path is None:
+        raise FileNotFoundError("No crew.jsonc or crew.json found")
+
+    crew_run_app_cls, crew, default_inputs, task_names, agent_names = (
+        _load_json_crew_for_tui(crew_path)
+    )
+    runtime_inputs = _prompt_for_missing_inputs(crew, default_inputs)
+
+    app = crew_run_app_cls(
+        crew_name=crew.name or "Crew",
+        total_tasks=len(crew.tasks),
+        agent_names=agent_names,
+        task_names=task_names,
+    )
+    app._crew = crew
+    app._default_inputs = runtime_inputs
+
+    app.run()
+
+    _print_post_tui_summary(app)
+
+    if app._status == "failed":
+        # Mirror the classic subprocess path: a failed crew must produce a
+        # non-zero exit code so scripts and CI don't treat it as success.
+        raise SystemExit(1)
+
+    if app._status not in ("completed", "failed"):
+        # User quit mid-run. kickoff runs in a thread worker that cannot be
+        # force-cancelled, so end the process to stop in-flight LLM and tool
+        # work instead of letting it burn tokens in the background.
+        click.secho("\n  Run cancelled.", fg="yellow")
+        sys.stdout.flush()
+        os._exit(130)
+
+    if getattr(app, "_want_deploy", False):
+        _chain_deploy()
+
+    return app._crew_result
+
+
+def _chain_deploy() -> None:
+    from rich.console import Console
+
+    console = Console()
+    try:
+        from crewai_cli.deploy.main import DeployCommand
+
+        console.print("\nStarting deployment…\n", style="bold #FF5A50")
+        DeployCommand().create_crew(confirm=False, skip_validate=True)
+    except SystemExit:
+        from crewai_cli.authentication.main import AuthenticationCommand
+
+        console.print()
+        AuthenticationCommand().login()
+        try:
+            DeployCommand().create_crew(confirm=False, skip_validate=True)
+        except Exception as e:
+            console.print(f"\nDeploy failed: {e}\n", style="bold red")
+    except Exception as e:
+        console.print(f"\nDeploy failed: {e}\n", style="bold red")
+
+
+def _print_post_tui_summary(app: CrewRunApp) -> None:
+    """Print a summary to the terminal after the Textual TUI exits."""
+    import time
+
+    from rich.console import Console
+    from rich.markdown import Markdown
+    from rich.padding import Padding
+    from rich.panel import Panel
+    from rich.text import Text
+
+    console = Console()
+    elapsed = time.time() - app._start_time
+
+    out_tokens = app._output_tokens + app._live_out_tokens
+    token_parts = []
+    if app._input_tokens:
+        token_parts.append(f"↑{app._input_tokens:,}")
+    if out_tokens:
+        token_parts.append(f"↓{out_tokens:,}")
+    token_str = "  ".join(token_parts)
+    if token_str:
+        token_str += " tokens"
+
+    crewai_red = "#FF5A50"
+    crewai_teal = "#1F7982"
+
+    if app._status == "completed":
+        summary = Text()
+        summary.append(
+            f"  ✔ Completed {app._total_tasks} tasks",
+            style=f"bold {crewai_teal}",
+        )
+        summary.append(f" in {elapsed:.1f}s", style="dim")
+        if token_str:
+            summary.append(f"  {token_str}", style="dim")
+        console.print(
+            Panel(
+                summary,
+                title=f" {app._crew_name} ",
+                title_align="left",
+                border_style=crewai_teal,
+                padding=(0, 1),
+            )
+        )
+        if app._final_output:
+            console.print()
+            console.print(Text("  Final Result", style=f"bold {crewai_teal}"))
+            console.print()
+            console.print(Padding(Markdown(app._final_output), (0, 2)))
+    elif app._status == "failed":
+        content = Text()
+        content.append("  ✘ Failed", style=f"bold {crewai_red}")
+        content.append(f" after {elapsed:.1f}s\n", style="dim")
+        if app._error:
+            content.append(f"\n  {app._error}\n", style=crewai_red)
+        console.print(
+            Panel(
+                content,
+                title=f" {app._crew_name} ",
+                title_align="left",
+                border_style=crewai_red,
+                padding=(0, 1),
+            )
+        )
+
+
+def run_crew(trained_agents_file: str | None = None) -> None:
+    """Run the crew or flow.

    Args:
        trained_agents_file: Optional path to a trained-agents pickle produced
@@ -27,6 +313,11 @@ def run_crew(trained_agents_file: str | None = None) -> None:
            ``CREWAI_TRAINED_AGENTS_FILE`` so agents load suggestions from this
            file instead of the default ``trained_agents_data.pkl``.
    """
+    # JSON crews take precedence unless pyproject.toml declares a flow.
+    if _has_json_crew():
+        _run_json_crew(trained_agents_file=trained_agents_file)
+        return
+
    crewai_version = get_crewai_version()
    min_required_version = "0.71.0"
    pyproject_data = read_toml()
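
Review note on the runtime-input prompt in run_crew.py: the placeholder regex accepts `{name}` references, including hyphenated names, and skips doubled braces. The sketch below copies the pattern from `_INPUT_PLACEHOLDER_RE` and mimics what `_missing_input_names` computes. `missing_inputs` and the sample strings are hypothetical and operate on plain strings rather than crew objects.

```python
import re

# Same pattern as _INPUT_PLACEHOLDER_RE in the diff above.
PLACEHOLDER_RE = re.compile(r"(?<!{){([A-Za-z_][A-Za-z0-9_\-]*)}(?!})")


def missing_inputs(texts: list[str], provided: dict[str, str]) -> list[str]:
    """Return placeholder names found in texts that have no provided value."""
    found: set[str] = set()
    for text in texts:
        found.update(PLACEHOLDER_RE.findall(text))
    return sorted(name for name in found if name not in provided)


# Hypothetical agent/task strings for illustration only.
texts = [
    "Research {topic} for {user-name}",
    "Literal {{escaped}} braces and {1bad} are ignored",
]

if __name__ == "__main__":
    print(missing_inputs(texts, {"topic": "AI"}))
```

Only names without a default are returned, which is why the CLI prompts for `user-name` here but not for `topic`.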
--- a/lib/cli/src/crewai_cli/run_flow_definition.py
+++ b/lib/cli/src/crewai_cli/run_flow_definition.py
@@ -0,0 +1,113 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Any
+
+import click
+
+
+def run_flow_definition(definition: str, inputs: str | None = None) -> None:
+    """Run a flow from a Flow Definition YAML/JSON string or file path."""
+    try:
+        from crewai.flow.flow import Flow
+        from crewai.flow.flow_definition import FlowDefinition
+    except ImportError as exc:
+        click.echo(
+            "Running flows from definitions requires the full crewai package.",
+            err=True,
+        )
+        raise SystemExit(1) from exc
+
+    parsed_inputs = _parse_inputs(inputs)
+    definition_source = _read_definition_source(definition)
+
+    try:
+        flow_definition = _parse_flow_definition(FlowDefinition, definition_source)
+        flow = Flow.from_definition(flow_definition)
+        result = flow.kickoff(inputs=parsed_inputs)
+    except Exception as exc:
+        click.echo(
+            f"An error occurred while running the flow definition: {exc}", err=True
+        )
+        raise SystemExit(1) from exc
+
+    click.echo(_format_result(result))
+
+
+def _parse_inputs(inputs: str | None) -> dict[str, Any] | None:
+    if inputs is None:
+        return None
+
+    try:
+        parsed = json.loads(inputs)
+    except json.JSONDecodeError as exc:
+        click.echo(f"Invalid --inputs JSON: {exc}", err=True)
+        raise SystemExit(1) from exc
+
+    if not isinstance(parsed, dict):
+        click.echo("Invalid --inputs JSON: expected an object.", err=True)
+        raise SystemExit(1)
+
+    return parsed
+
+
+def _read_definition_source(definition: str) -> str:
+    path = Path(definition).expanduser()
+    try:
+        is_file = path.is_file()
+    except OSError as exc:
+        if _looks_like_inline_definition(definition):
+            return definition
+        click.echo(f"Invalid --definition path: {definition} ({exc})", err=True)
+        raise SystemExit(1) from exc
+
+    if is_file:
+        try:
+            return path.read_text(encoding="utf-8")
+        except (OSError, UnicodeError) as exc:
+            click.echo(
+                f"Unable to read --definition path {path}: {exc}",
+                err=True,
+            )
+            raise SystemExit(1) from exc
+
+    try:
+        if path.exists():
+            click.echo(
+                f"Invalid --definition path: {definition} is not a file.", err=True
+            )
+            raise SystemExit(1)
+    except OSError as exc:
+        click.echo(f"Invalid --definition path: {definition} ({exc})", err=True)
+        raise SystemExit(1) from exc
+
+    return definition
+
+
+def _looks_like_inline_definition(definition: str) -> bool:
+    stripped = definition.lstrip()
+    return "\n" in definition or stripped.startswith(("{", "---")) or ":" in stripped
+
+
+def _parse_flow_definition(flow_definition_cls: type[Any], source: str) -> Any:
+    if _looks_like_json(source):
+        return flow_definition_cls.from_json(source)
+
+    return flow_definition_cls.from_yaml(source)
+
+
+def _looks_like_json(source: str) -> bool:
+    stripped = source.lstrip()
+    return stripped.startswith("{")
+
+
+def _format_result(result: Any) -> str:
+    raw_result = getattr(result, "raw", result)
+    if isinstance(raw_result, str):
+        return raw_result
+
+    try:
+        return json.dumps(raw_result, default=str)
+    except (TypeError, ValueError):
+        return str(raw_result)
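
Review note on result formatting in run_flow_definition.py: the output path prefers a `.raw` attribute, passes strings through unchanged, and falls back to `str()` when JSON serialization fails. The sketch below is a standalone approximation of `_format_result`. `format_result` is a hypothetical name, `SimpleNamespace` stands in for a real flow result, and catching `ValueError` alongside `TypeError` is an assumption made here to cover circular references.

```python
import json
from types import SimpleNamespace
from typing import Any


def format_result(result: Any) -> str:
    """Render a flow result as text, preferring its .raw attribute."""
    raw = getattr(result, "raw", result)
    if isinstance(raw, str):
        return raw
    try:
        # default=str covers non-serializable values such as datetimes.
        return json.dumps(raw, default=str)
    except (TypeError, ValueError):
        # Non-string-like dict keys or circular references end up here.
        return str(raw)


if __name__ == "__main__":
    print(format_result(SimpleNamespace(raw="done")))
    print(format_result({"count": 1}))
    print(format_result({(1, 2): "tuple key"}))
```

The last call shows the fallback: `default=str` applies to values only, so a tuple key still makes `json.dumps` raise.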
--- a/lib/cli/src/crewai_cli/templates/crew/pyproject.toml
+++ b/lib/cli/src/crewai_cli/templates/crew/pyproject.toml
@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.14.7rc2"
+    "crewai[tools]==1.14.7"
 ]

 [project.scripts]
--- a/lib/cli/src/crewai_cli/templates/flow/pyproject.toml
+++ b/lib/cli/src/crewai_cli/templates/flow/pyproject.toml
@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
 authors = [{ name = "Your Name", email = "you@example.com" }]
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.14.7rc2"
+    "crewai[tools]==1.14.7"
 ]

 [project.scripts]
--- a/lib/cli/src/crewai_cli/templates/tool/pyproject.toml
+++ b/lib/cli/src/crewai_cli/templates/tool/pyproject.toml
@@ -5,7 +5,7 @@ description = "Power up your crews with {{folder_name}}"
 readme = "README.md"
 requires-python = ">=3.10,<3.14"
 dependencies = [
-    "crewai[tools]==1.14.7rc2"
+    "crewai[tools]==1.14.7"
 ]

 [tool.crewai]
--- a/lib/cli/src/crewai_cli/tui_picker.py
+++ b/lib/cli/src/crewai_cli/tui_picker.py
@@ -0,0 +1,419 @@
+"""Arrow-key interactive pickers for CLI prompts."""
+
+from __future__ import annotations
+
+from contextlib import suppress
+import sys
+from typing import overload
+
+import click
+
+
+# CrewAI brand: primary=#FF5A50 (coral), teal=#1F7982
+_CORAL = "\033[38;2;255;90;80m"  # #FF5A50
+_TEAL = "\033[38;2;31;121;130m"  # #1F7982
+_BOLD = "\033[1m"
+_DIM = "\033[2m"
+_RESET = "\033[0m"
+_HIDE_CURSOR = "\033[?25l"
+_SHOW_CURSOR = "\033[?25h"
+
+
+def _is_interactive() -> bool:
+    try:
+        return sys.stdin.isatty() and sys.stdout.isatty()
+    except Exception:
+        return False
+
+
+def _read_key() -> str:
+    if sys.platform == "win32":
+        import msvcrt
+
+        ch = msvcrt.getwch()
+        if ch in ("\x00", "\xe0"):
+            ch2 = msvcrt.getwch()
+            return {"H": "up", "P": "down"}.get(ch2, "")
+        if ch == "\r":
+            return "enter"
+        if ch == " ":
+            return "space"
+        if ch == "\x03":
+            raise KeyboardInterrupt
+        return ch
+
+    import termios
+    import tty
+
+    fd = sys.stdin.fileno()
+    old = termios.tcgetattr(fd)
+    try:
+        tty.setcbreak(fd)
+        ch = sys.stdin.read(1)
+        if ch == "\x1b":
+            seq = sys.stdin.read(2)
+            if seq == "[A":
+                return "up"
+            if seq == "[B":
+                return "down"
+            return "esc"
+        if ch in ("\r", "\n"):
+            return "enter"
+        if ch == " ":
+            return "space"
+        if ch == "\x03":
+            raise KeyboardInterrupt
+        return ch
+    finally:
+        termios.tcsetattr(fd, termios.TCSADRAIN, old)
+
+
+def _clear_lines(n: int) -> None:
+    sys.stdout.write(f"\033[{n}A")
+    for _ in range(n):
+        sys.stdout.write("\033[2K\n")
+    sys.stdout.write(f"\033[{n}A")
+    sys.stdout.flush()
+
+
+def _draw_single(labels: list[str], cursor: int, *, clear: bool = False) -> None:
+    total = len(labels)
+    if clear:
+        sys.stdout.write(f"\033[{total}A")
+    for i, label in enumerate(labels):
+        if i == cursor:
+            sys.stdout.write(f"\033[2K  {_CORAL}→{_RESET} {_BOLD}{label}{_RESET}\n")
+        else:
+            sys.stdout.write(f"\033[2K    {label}\n")
+    sys.stdout.flush()
+
+
+def _draw_multi(
+    labels: list[str],
+    cursor: int,
+    selected: set[int],
+    *,
+    action_indices: set[int] | None = None,
+    separator_indices: set[int] | None = None,
+    clear: bool = False,
+) -> None:
+    action_indices = action_indices or set()
+    separator_indices = separator_indices or set()
+    hint_text = "↑↓ navigate, space toggle, enter confirm"
+    if action_indices:
+        hint_text = "↑↓ navigate, space toggle, enter confirm, ▸ rows expand/collapse"
+    hint = f"  {_DIM}{hint_text}{_RESET}"
+    total = len(labels) + 1
+    if clear:
+        sys.stdout.write(f"\033[{total}A")
+    sys.stdout.write(f"\033[2K{hint}\n")
+    for i, label in enumerate(labels):
+        if i in separator_indices:
+            sys.stdout.write(f"\033[2K      {_TEAL}{label}{_RESET}\n")
+            continue
+        if i in action_indices:
+            check = "  "
+        elif i in selected:
+            check = f"{_CORAL}[x]{_RESET}"
+        else:
+            check = "[ ]"
+        arrow = f"{_CORAL}→{_RESET} " if i == cursor else "  "
+        bold = f"{_BOLD}{label}{_RESET}" if i == cursor else label
+        sys.stdout.write(f"\033[2K    {arrow}{check} {bold}\n")
+    sys.stdout.flush()
+
+
+def _arrow_select_one(labels: list[str]) -> int:
+    cursor = 0
+    total = len(labels)
+    sys.stdout.write(_HIDE_CURSOR)
+    sys.stdout.flush()
+    try:
+        _draw_single(labels, cursor)
+        while True:
+            key = _read_key()
+            if key == "up" and cursor > 0:
+                cursor -= 1
+                _draw_single(labels, cursor, clear=True)
+            elif key == "down" and cursor < total - 1:
+                cursor += 1
+                _draw_single(labels, cursor, clear=True)
+            elif key == "enter":
+                _clear_lines(total)
+                return cursor
+            elif key in ("esc", "q"):
+                _clear_lines(total)
+                return -1
+    finally:
+        sys.stdout.write(_SHOW_CURSOR)
+        sys.stdout.flush()
+
+
+def _arrow_select_multi(
+    labels: list[str],
+    *,
+    action_indices: set[int] | None = None,
+    separator_indices: set[int] | None = None,
+    preselected: set[int] | None = None,
+    initial_cursor: int | None = None,
+) -> tuple[list[int], int | None]:
+    total = len(labels)
+    selected: set[int] = set(preselected or ())
+    action_indices = action_indices or set()
+    separator_indices = separator_indices or set()
+    if initial_cursor is not None and 0 <= initial_cursor < total:
+        cursor = initial_cursor
+    else:
+        cursor = _first_selectable_index(total, separator_indices)
+    sys.stdout.write(_HIDE_CURSOR)
+    sys.stdout.flush()
+    try:
+        _draw_multi(
+            labels,
+            cursor,
+            selected,
+            action_indices=action_indices,
+            separator_indices=separator_indices,
+        )
+        while True:
+            key = _read_key()
+            if key == "up":
+                cursor = _next_selectable_index(cursor, -1, total, separator_indices)
+                _draw_multi(
+                    labels,
+                    cursor,
+                    selected,
+                    action_indices=action_indices,
+                    separator_indices=separator_indices,
+                    clear=True,
+                )
+            elif key == "down":
+                cursor = _next_selectable_index(cursor, 1, total, separator_indices)
+                _draw_multi(
+                    labels,
+                    cursor,
+                    selected,
+                    action_indices=action_indices,
+                    separator_indices=separator_indices,
+                    clear=True,
+                )
+            elif key == "space":
+                if cursor in action_indices:
+                    _clear_lines(total + 1)
+                    return sorted(selected), cursor
+                selected ^= {cursor}
+                _draw_multi(
+                    labels,
+                    cursor,
+                    selected,
+                    action_indices=action_indices,
+                    separator_indices=separator_indices,
+                    clear=True,
+                )
+            elif key == "enter":
+                _clear_lines(total + 1)
+                if cursor in action_indices:
+                    return sorted(selected), cursor
+                return sorted(selected), None
+            elif key in ("esc", "q"):
+                _clear_lines(total + 1)
+                return sorted(selected), None
+    finally:
+        sys.stdout.write(_SHOW_CURSOR)
+        sys.stdout.flush()
+
+
+def _numbered_select(labels: list[str]) -> int:
+    for idx, label in enumerate(labels, 1):
+        click.echo(f"    {idx}. {label}")
+    click.echo()
+    while True:
+        choice = click.prompt("  Select", type=str, default="1")
+        if choice.lower() == "q":
+            return -1
+        try:
+            num = int(choice)
+            if 1 <= num <= len(labels):
+                return num - 1
+        except ValueError:
+            # Non-numeric input falls through to the shared error message.
+            pass
+        click.secho(f"  Invalid choice. Enter 1-{len(labels)}.", fg="red")
+
+
+def _numbered_select_multi(
+    labels: list[str],
+    *,
+    action_indices: set[int] | None = None,
+    separator_indices: set[int] | None = None,
+    preselected: set[int] | None = None,
+) -> tuple[list[int], int | None]:
+    action_indices = action_indices or set()
+    separator_indices = separator_indices or set()
+    numbered_indices: list[int] = []
+    for idx, label in enumerate(labels):
+        if idx in separator_indices:
+            click.secho(f"    {label}", fg="cyan")
+            continue
+        numbered_indices.append(idx)
+        click.echo(f"    {len(numbered_indices)}. {label}")
+    click.echo()
+    raw = click.prompt(
+        "  Select (comma-separated numbers, or empty to skip)",
+        default="",
+        show_default=False,
+    )
+    if not raw.strip():
+        return sorted(preselected or ()), None
+    indices: list[int] = list(preselected or ())
+    for part in raw.split(","):
+        with suppress(ValueError):
+            num = int(part.strip())
+            if 1 <= num <= len(numbered_indices):
+                idx = numbered_indices[num - 1]
+                if idx in action_indices:
+                    return sorted(set(indices)), idx
+                indices.append(idx)
+    return sorted(set(indices)), None
+
+
+def _first_selectable_index(total: int, separator_indices: set[int]) -> int:
+    for idx in range(total):
+        if idx not in separator_indices:
+            return idx
+    return 0
+
+
+def _next_selectable_index(
+    cursor: int,
+    direction: int,
+    total: int,
+    separator_indices: set[int],
+) -> int:
+    next_cursor = cursor + direction
+    while 0 <= next_cursor < total:
+        if next_cursor not in separator_indices:
+            return next_cursor
+        next_cursor += direction
+    return cursor
+
+
+# ── Public API ──────────────────────────────────────────────────
+
+
+def pick(title: str, options: list[tuple[str, str]]) -> str | None:
+    """Arrow-key single-select picker.
+
+    Args:
+        title: Header text.
+        options: List of ``(value, description)`` tuples.
+
+    Returns:
+        The *value* of the selected option, or ``None`` if cancelled.
+    """
+    labels = [f"{value:<12s} {desc}" for value, desc in options]
+
+    click.echo()
+    click.secho(f"  {title}", fg="cyan", bold=True)
+    click.echo()
+
+    if _is_interactive():
+        try:
+            idx = _arrow_select_one(labels)
+        except Exception:
+            idx = _numbered_select(labels)
+    else:
+        idx = _numbered_select(labels)
+
+    if idx < 0:
+        return None
+
+    value, _desc = options[idx]
+    click.secho(f"  ✔ {value}", fg="green")
+    return value
+
+
+def pick_one(title: str, labels: list[str]) -> int:
+    """Arrow-key single-select from plain labels.
+
+    Returns:
+        Selected index, or ``-1`` if cancelled.
+    """
+    click.echo()
+    click.secho(f"  {title}", fg="cyan")
+
+    if _is_interactive():
+        try:
+            return _arrow_select_one(labels)
+        except Exception:
+            return _numbered_select(labels)
+    return _numbered_select(labels)
+
+
+@overload
+def pick_many(
+    title: str,
+    labels: list[str],
+    *,
+    separator_indices: set[int] | None = None,
+    preselected: set[int] | None = None,
+    initial_cursor: int | None = None,
+) -> list[int]: ...
+
+
+@overload
+def pick_many(
+    title: str,
+    labels: list[str],
+    *,
+    action_indices: set[int],
+    separator_indices: set[int] | None = None,
+    preselected: set[int] | None = None,
+    initial_cursor: int | None = None,
+) -> tuple[list[int], int | None]: ...
+
+
+def pick_many(
+    title: str,
+    labels: list[str],
+    *,
+    action_indices: set[int] | None = None,
+    separator_indices: set[int] | None = None,
+    preselected: set[int] | None = None,
+    initial_cursor: int | None = None,
+) -> list[int] | tuple[list[int], int | None]:
+    """Arrow-key multi-select with checkboxes.
+
+    Returns:
+        Sorted list of selected indices, or ``(indices, action_index)`` when
+        ``action_indices`` is provided.
+    """
+    click.echo()
+    click.secho(f"  {title}", fg="cyan")
+
+    if _is_interactive():
+        try:
+            selected, action = _arrow_select_multi(
+                labels,
+                action_indices=action_indices,
+                separator_indices=separator_indices,
+                preselected=preselected,
+                initial_cursor=initial_cursor,
+            )
+        except Exception:
+            selected, action = _numbered_select_multi(
+                labels,
+                action_indices=action_indices,
+                separator_indices=separator_indices,
+                preselected=preselected,
+            )
+    else:
+        selected, action = _numbered_select_multi(
+            labels,
+            action_indices=action_indices,
+            separator_indices=separator_indices,
+            preselected=preselected,
+        )
+    if action_indices is None:
+        return selected
+    return selected, action
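
Review note on cursor movement in tui_picker.py: separator rows are skipped in both directions, and the cursor stays put when no selectable row remains in that direction. The sketch below restates `_next_selectable_index` under a hypothetical name with made-up labels so the behaviour can be checked without a terminal.

```python
def next_selectable(
    cursor: int, direction: int, total: int, separators: set[int]
) -> int:
    """Move the cursor by direction, skipping separator rows."""
    candidate = cursor + direction
    while 0 <= candidate < total:
        if candidate not in separators:
            return candidate
        candidate += direction
    # Ran off either end: keep the cursor where it was.
    return cursor


# Hypothetical picker rows; index 1 is a non-selectable heading.
labels = ["Alpha", "[tools]", "Beta", "Gamma"]
separators = {1}

if __name__ == "__main__":
    down = next_selectable(0, 1, len(labels), separators)
    print(labels[down])
```

Returning the unchanged cursor at the edges keeps the redraw logic simple, since the caller can redraw unconditionally.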
--- a/lib/cli/src/crewai_cli/utils.py
+++ b/lib/cli/src/crewai_cli/utils.py
@@ -24,6 +24,7 @@ __all__ = [
    "build_env_with_all_tool_credentials",
    "build_env_with_tool_repository_credentials",
    "copy_template",
+    "enable_prompt_line_editing",
    "fetch_and_json_env_file",
    "get_project_description",
    "get_project_name",
@@ -40,6 +41,19 @@ __all__ = [
 console = Console()


+def enable_prompt_line_editing() -> None:
+    """Enable cursor movement/history editing for Click text prompts when available."""
+    try:
+        import readline
+    except ImportError:
+        return
+
+    try:
+        readline.parse_and_bind("set editing-mode emacs")
+    except Exception:  # pragma: no cover - readline backends vary by platform
+        return
+
+
 def copy_template(
    src: Path, dst: Path, name: str, class_name: str, folder_name: str
 ) -> None:
--- a/lib/cli/tests/deploy/test_deploy_main.py
+++ b/lib/cli/tests/deploy/test_deploy_main.py
@@ -150,6 +150,7 @@ class TestDeployCommand(unittest.TestCase):
    @patch("crewai_cli.deploy.main.fetch_and_json_env_file")
    @patch("crewai_cli.deploy.main.git.Repository.origin_url")
    @patch("builtins.input")
+    @pytest.mark.timeout(180)
    def test_create_crew(self, mock_input, mock_git_origin_url, mock_fetch_env):
        mock_fetch_env.return_value = {"ENV_VAR": "value"}
        mock_git_origin_url.return_value = "https://github.com/test/repo.git"
@@ -165,6 +166,40 @@ class TestDeployCommand(unittest.TestCase):
            self.assertIn("Deployment created successfully!", fake_out.getvalue())
            self.assertIn("new-uuid", fake_out.getvalue())

+    @patch("crewai_cli.deploy.main.fetch_and_json_env_file")
+    @patch("crewai_cli.deploy.main.git.Repository")
+    def test_create_crew_without_git_repo_shows_setup_help(
+        self, mock_repository, mock_fetch_env
+    ):
+        mock_fetch_env.return_value = {"ENV_VAR": "value"}
+        mock_repository.side_effect = ValueError("not a Git repository")
+
+        with patch("sys.stdout", new=StringIO()) as fake_out:
+            self.deploy_command.create_crew(skip_validate=True)
+            output = fake_out.getvalue()
+
+        self.assertIn("Deployment requires a Git repository", output)
+        self.assertIn("git init", output)
+        self.assertIn("git remote add origin <your-repo-url>", output)
+        self.mock_client.create_crew.assert_not_called()
+
+    @patch("crewai_cli.deploy.main.fetch_and_json_env_file")
+    @patch("crewai_cli.deploy.main.git.Repository")
+    def test_create_crew_without_remote_shows_remote_help(
+        self, mock_repository, mock_fetch_env
+    ):
+        mock_fetch_env.return_value = {"ENV_VAR": "value"}
+        mock_repository.return_value.origin_url.return_value = None
+
+        with patch("sys.stdout", new=StringIO()) as fake_out:
+            self.deploy_command.create_crew(skip_validate=True)
+            output = fake_out.getvalue()
+
+        self.assertIn("No remote repository URL found.", output)
+        self.assertIn("git remote add origin <your-repo-url>", output)
+        self.assertIn("git push -u origin HEAD", output)
+        self.mock_client.create_crew.assert_not_called()
+
    def test_list_crews(self):
        mock_response = MagicMock()
        mock_response.status_code = 200
--- a/lib/cli/tests/deploy/test_validate.py
+++ b/lib/cli/tests/deploy/test_validate.py
@@ -110,6 +110,45 @@ def _run_without_import_check(root: Path) -> DeployValidator:
    return v


+def _scaffold_json_crew(root: Path, *, task_agent: str = "researcher") -> None:
+    (root / "pyproject.toml").write_text(_make_pyproject(name="json_crew"))
+    (root / "uv.lock").write_text("# dummy uv lockfile\n")
+    agents_dir = root / "agents"
+    agents_dir.mkdir()
+    (agents_dir / "researcher.jsonc").write_text(
+        dedent(
+            """
+            {
+              "role": "Researcher",
+              "goal": "Research things",
+              "backstory": "Experienced researcher",
+              "llm": "openai/gpt-4o-mini"
+            }
+            """
+        ).strip()
+        + "\n"
+    )
+    (root / "crew.jsonc").write_text(
+        dedent(
+            f"""
+            {{
+              "name": "json_crew",
+              "agents": ["researcher"],
+              "tasks": [
+                {{
+                  "name": "research",
+                  "description": "Research https://example.com/a//b",
+                  "expected_output": "Findings",
+                  "agent": "{task_agent}"
+                }}
+              ]
+            }}
+            """
+        ).strip()
+        + "\n"
+    )
+
+
 @pytest.mark.parametrize(
     "project_name, expected",
     [
@@ -129,6 +168,38 @@ def test_valid_standard_crew_project_passes(tmp_path: Path) -> None:
     assert v.ok, f"expected clean run, got {v.results}"


+def test_valid_json_crew_project_passes(tmp_path: Path) -> None:
+    _scaffold_json_crew(tmp_path)
+    v = DeployValidator(project_root=tmp_path)
+    v.run()
+    assert "invalid_crew_json" not in _codes(v)
+
+
+def test_json_task_agent_mismatch_is_error(tmp_path: Path) -> None:
+    _scaffold_json_crew(tmp_path, task_agent="missing_agent")
+    v = DeployValidator(project_root=tmp_path)
+    v.run()
+    finding = next(r for r in v.results if r.code == "invalid_crew_json")
+    assert finding.severity is Severity.ERROR
+    assert "missing_agent" in finding.detail
+
+
+def test_json_runtime_fields_are_deploy_errors(tmp_path: Path) -> None:
+    _scaffold_json_crew(tmp_path)
+    crew_path = tmp_path / "crew.jsonc"
+    crew_path.write_text(
+        crew_path.read_text().replace(
+            '"name": "json_crew",',
+            '"name": "json_crew",\n  "id": "00000000-0000-4000-8000-000000000000",',
+        )
+    )
+    v = DeployValidator(project_root=tmp_path)
+    v.run()
+    finding = next(r for r in v.results if r.code == "invalid_crew_json")
+    assert finding.severity is Severity.ERROR
+    assert "runtime-only" in finding.detail
+
+
 def test_missing_pyproject_errors(tmp_path: Path) -> None:
     v = _run_without_import_check(tmp_path)
     assert "missing_pyproject" in _codes(v)
@@ -426,4 +497,31 @@ def test_create_crew_aborts_on_validation_error(tmp_path: Path) -> None:
         cmd = DeployCommand()
         cmd.create_crew()
         assert not cmd.plus_api_client.create_crew.called
-        del mock_api  # silence unused-var lint
+        del mock_api  # silence unused-var lint
+
+
+def test_is_json_crew_defers_to_declared_flow_type(tmp_path: Path) -> None:
+    """A flow project with a stray crew.jsonc must validate as a flow."""
+    (tmp_path / "crew.jsonc").write_text("{}")
+    (tmp_path / "pyproject.toml").write_text(
+        '[project]\nname = "demo"\nversion = "0.1.0"\n\n'
+        '[tool.crewai]\ntype = "flow"\n'
+    )
+
+    assert DeployValidator(project_root=tmp_path)._is_json_crew is False
+
+
+def test_is_json_crew_true_for_declared_crew_type(tmp_path: Path) -> None:
+    (tmp_path / "crew.jsonc").write_text("{}")
+    (tmp_path / "pyproject.toml").write_text(
+        '[project]\nname = "demo"\nversion = "0.1.0"\n\n'
+        '[tool.crewai]\ntype = "crew"\n'
+    )
+
+    assert DeployValidator(project_root=tmp_path)._is_json_crew is True
+
+
+def test_is_json_crew_true_without_pyproject(tmp_path: Path) -> None:
+    (tmp_path / "crew.jsonc").write_text("{}")
+
+    assert DeployValidator(project_root=tmp_path)._is_json_crew is True
--- a/lib/cli/tests/test_cli.py
+++ b/lib/cli/tests/test_cli.py
@@ -13,6 +13,7 @@ from crewai_cli.cli import (
     flow_add_crew,
     login,
     reset_memories,
+    run,
     test,
     train,
     version,
@@ -93,9 +94,9 @@ def test_version_command_with_tools(runner):
 def test_test_default_iterations(evaluate_crew, runner):
     result = runner.invoke(test)

-    evaluate_crew.assert_called_once_with(3, "gpt-4o-mini", trained_agents_file=None)
+    evaluate_crew.assert_called_once_with(3, "gpt-5.4-mini", trained_agents_file=None)
     assert result.exit_code == 0
-    assert "Testing the crew for 3 iterations with model gpt-4o-mini" in result.output
+    assert "Testing the crew for 3 iterations with model gpt-5.4-mini" in result.output


 @mock.patch("crewai_cli.cli.evaluate_crew")
@@ -119,6 +120,43 @@ def test_test_invalid_string_iterations(evaluate_crew, runner):
     )


+@mock.patch("crewai_cli.cli.run_crew")
+def test_run_uses_project_runner_by_default(run_crew, runner):
+    result = runner.invoke(run)
+
+    assert result.exit_code == 0
+    run_crew.assert_called_once_with(trained_agents_file=None)
+    assert "experimental" not in result.output.lower()
+
+
+@mock.patch("crewai_cli.cli.run_flow_definition")
+def test_run_with_definition_uses_definition_runner(run_flow_definition, runner):
+    result = runner.invoke(
+        run,
+        ["--definition", "flow.yaml", "--inputs", '{"topic":"AI"}'],
+    )
+
+    assert result.exit_code == 0
+    assert (
+        "Warning: `crewai run --definition` is experimental and may change without notice."
+        in result.output
+    )
+    run_flow_definition.assert_called_once_with(
+        definition="flow.yaml", inputs='{"topic":"AI"}'
+    )
+
+
+@mock.patch("crewai_cli.cli.run_crew")
+@mock.patch("crewai_cli.cli.run_flow_definition")
+def test_run_rejects_inputs_without_definition(run_flow_definition, run_crew, runner):
+    result = runner.invoke(run, ["--inputs", '{"topic":"AI"}'])
+
+    assert result.exit_code == 2
+    assert "Error: --inputs requires --definition" in result.output
+    run_flow_definition.assert_not_called()
+    run_crew.assert_not_called()
+
+
 @mock.patch("crewai_cli.cli.AuthenticationCommand")
 def test_login(command, runner):
     mock_auth = command.return_value
--- a/lib/cli/tests/test_create_crew.py
+++ b/lib/cli/tests/test_create_crew.py
@@ -6,6 +6,8 @@ from unittest import mock

 import pytest
 from click.testing import CliRunner
+import crewai_cli.create_json_crew as json_crew
+import crewai_cli.tui_picker as tui_picker
 from crewai_cli.create_crew import create_crew, create_folder_structure


@@ -345,3 +347,441 @@ def test_env_vars_are_uppercased_in_env_file(
     env_file_path = crew_path / ".env"
     content = env_file_path.read_text()
     assert "MODEL=" in content
+
+
+def test_json_wizard_defaults_to_sequential_and_memory_enabled(monkeypatch):
+    monkeypatch.setattr(
+        json_crew,
+        "_wizard_agent",
+        lambda **_: {
+            "name": "researcher",
+            "role": "Researcher",
+            "goal": "Research",
+            "backstory": "Researcher",
+            "llm": "openai/gpt-5.5",
+            "tools": [],
+            "planning": False,
+            "allow_delegation": False,
+        },
+    )
+    monkeypatch.setattr(
+        json_crew,
+        "_wizard_task",
+        lambda **_: {
+            "name": "research_task",
+            "description": "Research",
+            "expected_output": "Findings",
+            "agent": "researcher",
+            "context": [],
+        },
+    )
+
+    def confirm(label: str, default: bool = False) -> bool:
+        if label == "Enable crew memory?":
+            return default
+        return False
+
+    monkeypatch.setattr(json_crew, "_confirm", confirm)
+    monkeypatch.setattr(json_crew.click, "prompt", lambda *_, **__: "")
+    monkeypatch.setattr(
+        json_crew,
+        "pick_one",
+        lambda *_args, **_kwargs: pytest.fail("process should not be prompted"),
+    )
+
+    _agents, _tasks, settings = json_crew._wizard_agents_and_tasks(
+        skip_provider=True,
+        default_llm="openai/gpt-5.5",
+    )
+
+    assert settings == {"process": "sequential", "memory": True, "inputs": {}}
+
+
+def test_json_wizard_shows_interpolation_hint(capsys):
+    json_crew._show_interpolation_hint("tasks")
+
+    output = capsys.readouterr().out
+    assert "{placeholder}" in output
+    assert "dynamic values" in output
+    assert "{topic}" not in output
+    assert "Description >" not in output
+    assert '"description"' not in output
+
+
+def test_json_wizard_text_prompt_uses_full_prompt_for_readline(monkeypatch):
+    prompts: list[str] = []
+
+    monkeypatch.setattr(
+        json_crew, "_readline_safe_prompt", lambda prompt: f"safe:{prompt}"
+    )
+    monkeypatch.setattr(
+        "builtins.input", lambda prompt: prompts.append(prompt) or "Draft content"
+    )
+
+    assert json_crew._prompt_text("Goal", spacing_before=False) == "Draft content"
+    assert len(prompts) == 1
+    assert prompts[0].startswith("safe:")
+    assert "Goal" in prompts[0]
+    assert " > " in prompts[0]
+
+
+def test_json_wizard_tool_picker_prioritizes_common_tools(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        return [1, 3], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    tools = json_crew._select_tools()
+
+    assert tools == ["SerperDevTool", "DirectoryReadTool"]
+    assert len(picker_calls) == 1
+    labels = picker_calls[0][1]
+    assert 0 in picker_calls[0][2]["separator_indices"]
+    assert labels[0] == "── Common tools ──"
+    assert labels[1].strip().endswith("SerperDevTool")
+    assert labels[2].strip().endswith("ScrapeWebsiteTool")
+    assert labels[3].strip().endswith("DirectoryReadTool")
+    assert labels[4].strip().endswith("FileReadTool")
+    assert labels[5].strip().endswith("FileWriterTool")
+    assert labels[1].index("Google search") < labels[1].index("SerperDevTool")
+    assert "More tools" not in labels
+
+
+def test_json_wizard_tool_picker_collapses_categories_by_default(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        return [], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    json_crew._select_tools()
+
+    labels = picker_calls[0][1]
+    action_indices = picker_calls[0][2]["action_indices"]
+    # Categories show as collapsed action rows, not separators with tools
+    assert any(label.startswith("▸ Search & Research") for label in labels)
+    assert any(label.startswith("▸ Web Scraping") for label in labels)
+    assert not any(label.strip().endswith("BraveSearchTool") for label in labels)
+    assert len(action_indices) >= 4
+    # Only the common tools section is visible beyond the category rows
+    assert len(labels) == 1 + 5 + len(action_indices)
+
+
+def test_json_wizard_tool_picker_expands_one_category_at_a_time(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def find_category_row(labels: list[str], category: str) -> int:
+        return next(
+            idx for idx, label in enumerate(labels) if category in label
+        )
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        call_num = len(picker_calls)
+        if call_num == 1:
+            return [], find_category_row(labels, "Search & Research")
+        if call_num == 2:
+            # Search & Research is expanded; select BraveSearchTool and
+            # expand Web Scraping instead
+            brave = next(
+                idx
+                for idx, label in enumerate(labels)
+                if label.strip().endswith("BraveSearchTool")
+            )
+            return [brave], find_category_row(labels, "Web Scraping")
+        return [], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    tools = json_crew._select_tools()
+
+    assert tools == ["BraveSearchTool"]
+    assert len(picker_calls) == 3
+    # Second render: Search & Research expanded, others collapsed
+    labels2 = picker_calls[1][1]
+    assert any(label.startswith("▾ Search & Research") for label in labels2)
+    assert any(label.strip().endswith("BraveSearchTool") for label in labels2)
+    assert any(label.startswith("▸ Web Scraping") for label in labels2)
+    # Third render: Web Scraping expanded, Search & Research collapsed again
+    labels3 = picker_calls[2][1]
+    assert any(label.startswith("▸ Search & Research") for label in labels3)
+    assert any(label.startswith("▾ Web Scraping") for label in labels3)
+    assert not any(label.strip().endswith("BraveSearchTool") for label in labels3)
+    # The collapsed Search & Research row reports its selection count
+    assert any(
+        "Search & Research" in label and "1 selected" in label for label in labels3
+    )
+    # Cursor returns to the toggled category row
+    assert picker_calls[2][2]["initial_cursor"] == next(
+        idx for idx, label in enumerate(labels3) if "Web Scraping" in label
+    )
+
+
+def test_json_wizard_tool_picker_preserves_selection_across_renders(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        call_num = len(picker_calls)
+        if call_num == 1:
+            # Select a common tool, then expand a category
+            category_row = next(
+                idx for idx, label in enumerate(labels) if "Web Scraping" in label
+            )
+            return [1], category_row
+        # Confirm without touching anything else
+        return sorted(kwargs["preselected"]), None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    tools = json_crew._select_tools()
+
+    # The common-tool selection survived the expand re-render via preselected
+    assert tools == ["SerperDevTool"]
+    assert 1 in picker_calls[1][2]["preselected"]
+
+
+def test_json_wizard_tool_picker_lists_builtin_tools_across_categories(monkeypatch):
+    picker_calls: list[tuple[str, list[str], dict[str, object]]] = []
+    expanded_labels: list[str] = []
+
+    def pick_many(title: str, labels: list[str], **kwargs):
+        picker_calls.append((title, labels, kwargs))
+        expanded_labels.extend(labels)
+        action_indices = sorted(kwargs["action_indices"])
+        call_num = len(picker_calls)
+        if call_num <= len(action_indices):
+            # Expand the n-th category (indices shift between renders, so
+            # recompute from this render's action rows)
+            return [], action_indices[call_num - 1]
+        return [], None
+
+    monkeypatch.setattr(json_crew, "pick_many", pick_many)
+
+    json_crew._select_tools()
+
+    tool_names = {
+        label.rsplit(maxsplit=1)[-1]
+        for label in expanded_labels
+        if not label.startswith(("▸", "▾", "──"))
+    }
+
+    assert {
+        "DirectorySearchTool",
+        "MDXSearchTool",
+        "XMLSearchTool",
+        "YoutubeVideoSearchTool",
+        "S3ReaderTool",
+        "E2BExecTool",
+        "TavilyResearchTool",
+        "SerplyNewsSearchTool",
+        "BrowserbaseLoadTool",
+        "PatronusEvalTool",
+    }.issubset(tool_names)
+    assert {
+        "MCPServerAdapter",
+        "MongoDBVectorSearchConfig",
+        "ScrapegraphScrapeToolSchema",
+        "SnowflakeConfig",
+    }.isdisjoint(tool_names)
+
+
+def test_multi_picker_skips_separator_on_initial_cursor(monkeypatch):
+    cursors: list[int] = []
+
+    monkeypatch.setattr(tui_picker, "_read_key", lambda: "enter")
+    monkeypatch.setattr(
+        tui_picker,
+        "_draw_multi",
+        lambda _labels, cursor, *_args, **_kwargs: cursors.append(cursor),
+    )
+    monkeypatch.setattr(tui_picker, "_clear_lines", lambda *_args, **_kwargs: None)
+
+    assert tui_picker._arrow_select_multi(
+        ["── Common tools ──", "Google search via Serper API SerperDevTool"],
+        separator_indices={0},
+    ) == ([], None)
+    assert cursors == [1]
+
+
+def test_json_wizard_agent_attribute_prompts_are_compact(monkeypatch):
+    prompt_calls: list[tuple[str, bool]] = []
+    prompt_values = {
+        "Role": "Senior Dev Rel",
+        "Goal": "Draft content",
+        "Backstory": "Knows developer communities",
+    }
+
+    def prompt_text(
+        label: str,
+        default: str = "",
+        *,
+        spacing_before: bool = True,
+    ) -> str:
+        prompt_calls.append((label, spacing_before))
+        return prompt_values[label]
+
+    monkeypatch.setattr(json_crew, "_prompt_text", prompt_text)
+    monkeypatch.setattr(json_crew, "_select_model", lambda: "openai/gpt-5.5")
+    monkeypatch.setattr(json_crew, "pick_many", lambda *_args, **_kwargs: ([], None))
+    monkeypatch.setattr(json_crew, "_confirm", lambda *_args, **_kwargs: False)
+
+    agent = json_crew._wizard_agent(agent_num=1, existing_names=[])
+
+    assert agent is not None
+    assert prompt_calls == [
+        ("Role", False),
+        ("Goal", False),
+        ("Backstory", False),
+    ]
+
+
+def test_json_wizard_task_attribute_prompts_are_compact(monkeypatch):
+    prompt_calls: list[tuple[str, bool]] = []
+    prompt_values = {
+        "Description": "Research latest release",
+        "Expected output": "Release summary",
+    }
+
+    def prompt_text(
+        label: str,
+        default: str = "",
+        *,
+        spacing_before: bool = True,
+    ) -> str:
+        prompt_calls.append((label, spacing_before))
+        return prompt_values[label]
+
+    monkeypatch.setattr(json_crew, "_prompt_text", prompt_text)
+
+    task = json_crew._wizard_task(
+        task_num=1,
+        agent_names=["senior_dev_rel"],
+        prior_task_names=[],
+    )
+
+    assert task is not None
+    assert prompt_calls == [
+        ("Description", False),
+        ("Expected output", False),
+    ]
+
+
+def test_json_create_provider_preselects_default_model(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    with mock.patch(
+        "crewai_cli.create_json_crew._wizard_agents_and_tasks"
+    ) as mock_wizard:
+        mock_wizard.return_value = (
+            [
+                {
+                    "name": "researcher",
+                    "role": "Researcher",
+                    "goal": "Research",
+                    "backstory": "Researcher",
+                    "llm": "openai/gpt-5.5",
+                    "tools": [],
+                    "planning": False,
+                    "allow_delegation": False,
+                }
+            ],
+            [
+                {
+                    "name": "research_task",
+                    "description": "Research",
+                    "expected_output": "Findings",
+                    "agent": "researcher",
+                    "context": [],
+                }
+            ],
+            {"process": "sequential", "memory": False, "inputs": {}},
+        )
+
+        json_crew.create_json_crew("JSON Crew", provider="openai", skip_provider=True)
+
+    mock_wizard.assert_called_once_with(
+        skip_provider=True,
+        default_llm="openai/gpt-5.5",
+    )
+    assert (tmp_path / "json_crew" / "crew.jsonc").exists()
+    assert not (tmp_path / "json_crew" / "tests").exists()
+    assert not (tmp_path / "json_crew" / "config.jsonc").exists()
+
+    crew_template = (tmp_path / "json_crew" / "crew.jsonc").read_text()
+    assert (
+        '"guardrail": "Every factual claim needs context support."'
+        in crew_template
+    )
+    assert '"guardrails": [' in crew_template
+    assert '"guardrail_max_retries": 2' in crew_template
+    assert "Docs: https://docs.crewai.com/concepts/tasks" in crew_template
+    assert '"output_pydantic": null' in crew_template
+    assert '"markdown": false' in crew_template
+    assert "Docs: https://docs.crewai.com/concepts/crews" in crew_template
+    assert '"manager_agent": "researcher"' in crew_template
+    assert '"output_log_file": "crew.log"' in crew_template
+    assert "Crew-level LLM fields also accept object form" in crew_template
+    assert '"chat_llm": {"model": "llama3", "provider": "ollama"' in (
+        crew_template
+    )
+    assert "Use {placeholder} in agent or task text" in crew_template
+    assert "`crewai run` prompts for any placeholders" in crew_template
+    assert "Use {placeholder} inputs here" in crew_template
+
+    agent_template = (
+        tmp_path / "json_crew" / "agents" / "researcher.jsonc"
+    ).read_text()
+    assert "You can use {placeholder} inputs in role, goal, or backstory" in (
+        agent_template
+    )
+    assert '"role": "Senior {industry} Researcher"' in agent_template
+    assert "Optional agent-level guardrail" in agent_template
+    assert '"guardrail_max_retries": 2' in agent_template
+    assert "Docs: https://docs.crewai.com/concepts/agents" in agent_template
+    assert '"reasoning": true' in agent_template
+    assert "For custom endpoints or deployment-based providers" in agent_template
+    assert '"deployment_name": "my-deployment", "provider": "azure"' in (
+        agent_template
+    )
+    assert '"planning_config": {' in agent_template
+    assert '"llm": {"model": "deepseek-chat", "provider": "deepseek"}' in (
+        agent_template
+    )
+    assert '"knowledge_sources": []' in agent_template
+
+
+def test_json_provider_default_model_helper():
+    assert json_crew._default_model_for_provider("openai") == "openai/gpt-5.5"
+    assert json_crew._default_model_for_provider("anthropic/claude-custom") == (
+        "anthropic/claude-custom"
+    )
+    assert json_crew._default_model_for_provider("unknown") is None
+
+
+def test_json_wizard_task_reprompts_on_cancelled_agent_pick(monkeypatch):
+    """Esc on the agent picker must reprompt, not silently assign agent 0."""
+    prompts = iter(["Do the research", "A report"])
+    monkeypatch.setattr(json_crew, "_prompt_text", lambda *a, **k: next(prompts))
+
+    pick_calls: list[str] = []
+    picks = iter([-1, 1])
+
+    def fake_pick_one(title: str, labels: list[str]) -> int:
+        pick_calls.append(title)
+        return next(picks)
+
+    monkeypatch.setattr(json_crew, "pick_one", fake_pick_one)
+
+    task = json_crew._wizard_task(
+        task_num=1,
+        agent_names=["first_agent", "second_agent"],
+        prior_task_names=[],
+    )
+
+    assert len(pick_calls) == 2
+    assert task["agent"] == "second_agent"
--- a/lib/cli/tests/test_crew_run_tui.py
+++ b/lib/cli/tests/test_crew_run_tui.py
@@ -0,0 +1,796 @@
+from datetime import datetime
+import time
+
+import pytest
+
+from crewai.events.event_bus import crewai_event_bus
+from crewai.events.types.observation_events import (
+    GoalAchievedEarlyEvent,
+    PlanRefinementEvent,
+    PlanReplanTriggeredEvent,
+    PlanStepCompletedEvent,
+    PlanStepStartedEvent,
+    StepObservationCompletedEvent,
+    StepObservationFailedEvent,
+    StepObservationStartedEvent,
+)
+from crewai.events.types.tool_usage_events import (
+    ToolUsageErrorEvent,
+    ToolUsageFinishedEvent,
+    ToolUsageStartedEvent,
+)
+from crewai_cli import run_crew
+from crewai_cli.crew_run_tui import CrewRunApp
+
+
+def _app_with_plan() -> CrewRunApp:
+    app = CrewRunApp()
+    app._plan = {
+        "plan": "Demo plan",
+        "steps": [
+            {"step_number": 1, "description": "First"},
+            {"step_number": 2, "description": "Second"},
+            {"step_number": 3, "description": "Third"},
+        ],
+    }
+    app._plan_step_status = {1: "pending", 2: "pending", 3: "pending"}
+    return app
+
+
+def _log_entry(name: str) -> dict:
+    now = time.time()
+    return {
+        "tool_name": name,
+        "status": "success",
+        "args": None,
+        "result": f"{name} result",
+        "error": None,
+        "start_time": now,
+        "duration": 1.0,
+        "task_idx": 1,
+    }
+
+
+def _emit_event(event: object) -> None:
+    future = crewai_event_bus.emit(None, event)
+    if future:
+        future.result(timeout=5)
+
+
+def test_chain_deploy_skips_validation_after_auth_retry(monkeypatch) -> None:
+    create_calls: list[dict[str, object]] = []
+    login_calls: list[bool] = []
+
+    class FakeDeployCommand:
+        attempts = 0
+
+        def create_crew(self, **kwargs) -> None:
+            create_calls.append(kwargs)
+            FakeDeployCommand.attempts += 1
+            if FakeDeployCommand.attempts == 1:
+                raise SystemExit(1)
+
+    class FakeAuthenticationCommand:
+        def login(self) -> None:
+            login_calls.append(True)
+
+    monkeypatch.setattr("crewai_cli.deploy.main.DeployCommand", FakeDeployCommand)
+    monkeypatch.setattr(
+        "crewai_cli.authentication.main.AuthenticationCommand",
+        FakeAuthenticationCommand,
+    )
+
+    run_crew._chain_deploy()
+
+    assert create_calls == [
+        {"confirm": False, "skip_validate": True},
+        {"confirm": False, "skip_validate": True},
+    ]
+    assert login_calls == [True]
+
+
+def test_plan_step_status_updates_only_the_explicit_step() -> None:
+    app = _app_with_plan()
+
+    app._set_plan_step_status(2, "done")
+
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "done",
+        3: "pending",
+    }
+
+
+def test_step_observation_events_update_the_explicit_step() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        future = crewai_event_bus.emit(
+            None,
+            StepObservationStartedEvent(
+                agent_role="Agent",
+                step_number=2,
+                step_description="Second",
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+
+        assert app._plan_step_status == {
+            1: "pending",
+            2: "active",
+            3: "pending",
+        }
+
+        future = crewai_event_bus.emit(
+            None,
+            StepObservationCompletedEvent(
+                agent_role="Agent",
+                step_number=2,
+                step_description="Second",
+                step_completed_successfully=True,
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "done",
+        3: "pending",
+    }
+
+
+def test_plan_step_lifecycle_events_update_the_explicit_step() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        _emit_event(
+            PlanStepStartedEvent(
+                agent_role="Agent",
+                step_number=2,
+                step_description="Second",
+            )
+        )
+
+        assert app._plan_step_status == {
+            1: "pending",
+            2: "active",
+            3: "pending",
+        }
+
+        _emit_event(
+            PlanStepCompletedEvent(
+                agent_role="Agent",
+                step_number=2,
+                step_description="Second",
+                success=True,
+                result="done",
+            )
+        )
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "done",
+        3: "pending",
+    }
+
+
+def test_failed_plan_step_lifecycle_event_marks_exact_step_failed() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        _emit_event(
+            PlanStepCompletedEvent(
+                agent_role="Agent",
+                step_number=2,
+                step_description="Second",
+                success=False,
+                error="Step failed",
+            )
+        )
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "failed",
+        3: "pending",
+    }
+
+
+def test_tool_usage_events_do_not_advance_plan_steps() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        future = crewai_event_bus.emit(
+            None,
+            ToolUsageStartedEvent(tool_name="search", tool_args={"query": "CrewAI"}),
+        )
+        if future:
+            future.result(timeout=5)
+
+        now = datetime.now()
+        future = crewai_event_bus.emit(
+            None,
+            ToolUsageFinishedEvent(
+                tool_name="search",
+                tool_args={"query": "CrewAI"},
+                started_at=now,
+                finished_at=now,
+                output="result",
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "pending",
+        3: "pending",
+    }
+
+
+def test_next_tool_does_not_mark_unfinished_tool_successful() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        _emit_event(
+            ToolUsageStartedEvent(tool_name="search", tool_args={"query": "CrewAI"}),
+        )
+        _emit_event(
+            ToolUsageStartedEvent(tool_name="scrape", tool_args={"url": "https://x"}),
+        )
+    finally:
+        app._unsubscribe()
+
+    assert app._log_entries[0]["status"] == "timeout"
+    assert app._log_entries[0]["result"] is None
+    assert app._log_entries[0]["error"] == (
+        "No result received before the next tool started"
+    )
+    assert app._log_entries[1]["status"] == "running"
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "pending",
+        3: "pending",
+    }
+
+
+def test_internal_reasoning_function_call_is_hidden_from_activity_log() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        future = crewai_event_bus.emit(
+            None,
+            ToolUsageStartedEvent(
+                tool_name="create_reasoning_plan",
+                tool_args={"plan": "Plan", "steps": [], "ready": True},
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+
+        now = datetime.now()
+        future = crewai_event_bus.emit(
+            None,
+            ToolUsageFinishedEvent(
+                tool_name="create_reasoning_plan",
+                tool_args={"plan": "Plan", "steps": [], "ready": True},
+                started_at=now,
+                finished_at=now,
+                output='{"plan":"Plan","steps":[],"ready":true}',
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+
+        future = crewai_event_bus.emit(
+            None,
+            ToolUsageErrorEvent(
+                tool_name="create_reasoning_plan",
+                tool_args={"plan": "Plan", "steps": [], "ready": True},
+                error="internal planning fallback",
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+    finally:
+        app._unsubscribe()
+
+    assert app._log_entries == []
+    assert app._current_task_steps == []
+
+
+def test_tool_failure_does_not_override_successful_plan_step_completion() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        _emit_event(
+            PlanStepStartedEvent(
+                agent_role="Agent",
+                step_number=1,
+                step_description="First",
+            )
+        )
+        _emit_event(
+            ToolUsageStartedEvent(
+                tool_name="search_the_internet_with_serper",
+                tool_args={"search_query": "CrewAI release"},
+                plan_step_number=1,
+                plan_step_description="First",
+            )
+        )
+        _emit_event(
+            ToolUsageErrorEvent(
+                tool_name="search_the_internet_with_serper",
+                tool_args={"search_query": "CrewAI release"},
+                plan_step_number=1,
+                plan_step_description="First",
+                error="No results",
+            )
+        )
+        _emit_event(
+            PlanStepCompletedEvent(
+                agent_role="Agent",
+                step_number=1,
+                step_description="First",
+                success=True,
+                result="Recovered with another source",
+            )
+        )
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "done",
+        2: "pending",
+        3: "pending",
+    }
+
+
+def test_tool_event_step_metadata_is_stored_in_activity_log() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        _emit_event(
+            ToolUsageStartedEvent(
+                tool_name="search_the_internet_with_serper",
+                tool_args={"search_query": "CrewAI release"},
+                plan_step_number=2,
+                plan_step_description="Second",
+            )
+        )
+        now = datetime.now()
+        _emit_event(
+            ToolUsageFinishedEvent(
+                tool_name="search_the_internet_with_serper",
+                tool_args={"search_query": "CrewAI release"},
+                plan_step_number=2,
+                plan_step_description="Second",
+                started_at=now,
+                finished_at=now,
+                output="Found official source",
+            )
+        )
+    finally:
+        app._unsubscribe()
+
+    assert app._log_entries[0]["plan_step_number"] == 2
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "pending",
+        3: "pending",
+    }
+
+
+def test_starting_next_tool_does_not_infer_plan_step_progress() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        _emit_event(
+            ToolUsageStartedEvent(
+                tool_name="search_the_internet_with_serper",
+                tool_args={"search_query": "CrewAI release"},
+            )
+        )
+        _emit_event(
+            ToolUsageErrorEvent(
+                tool_name="search_the_internet_with_serper",
+                tool_args={"search_query": "CrewAI release"},
+                error="No results",
+            )
+        )
+        _emit_event(
+            ToolUsageStartedEvent(
+                tool_name="read_website_content",
+                tool_args={"url": "https://example.com"},
+            )
+        )
+    finally:
+        app._unsubscribe()
+
+    assert app._log_entries[0]["status"] == "error"
+    assert app._log_entries[1]["status"] == "running"
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "pending",
+        3: "pending",
+    }
+
+
+@pytest.mark.asyncio
+async def test_crew_done_does_not_mark_unfinished_tool_successful() -> None:
+    app = _app_with_plan()
+
+    async with app.run_test(size=(100, 40)) as pilot:
+        app._plan_step_status = {1: "failed", 2: "done", 3: "pending"}
+        app._log_entries = [
+            {
+                "tool_name": "search",
+                "status": "running",
+                "args": '{"query": "CrewAI"}',
+                "result": None,
+                "error": None,
+                "start_time": time.time() - 2,
+                "duration": None,
+                "task_idx": 1,
+            }
+        ]
+
+        app._on_crew_done("final output")
+        await pilot.pause()
+
+    assert app._log_entries[0]["status"] == "timeout"
+    assert app._log_entries[0]["result"] is None
+    assert app._log_entries[0]["error"] == "No result received before crew completed"
+    assert app._plan_step_status == {1: "failed", 2: "done", 3: "done"}
+
+
+def test_streamed_step_observation_updates_named_step_only() -> None:
+    app = _app_with_plan()
+
+    updated = app._try_parse_step_observation(
+        '{"step_completed_successfully":true,'
+        '"key_information_learned":"Step 2 succeeded with the official source."}'
+    )
+
+    assert updated is True
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "done",
+        3: "pending",
+    }
+
+
+def test_failed_streamed_step_observation_marks_named_step_failed() -> None:
+    app = _app_with_plan()
+
+    updated = app._try_parse_step_observation(
+        '{"step_completed_successfully":false,'
+        '"key_information_learned":"Step 2 failed because the tool failed."}'
+    )
+
+    assert updated is True
+    assert app._plan_step_status == {
+        1: "pending",
+        2: "failed",
+        3: "pending",
+    }
+
+
+def test_streamed_goal_achieved_observation_collapses_remaining_steps_done() -> None:
+    app = _app_with_plan()
+
+    updated = app._try_parse_step_observation(
+        '{"step_number":2,'
+        '"step_completed_successfully":true,'
+        '"key_information_learned":"Goal is already satisfied.",'
+        '"goal_already_achieved":true}'
+    )
+
+    assert updated is True
+    assert app._plan_step_status == {
+        1: "done",
+        2: "done",
+        3: "done",
+    }
+
+
+def test_task_completion_collapses_pending_plan_steps_but_preserves_failed() -> None:
+    app = _app_with_plan()
+    app._plan_step_status = {1: "failed", 2: "done", 3: "pending"}
+
+    app._collapse_plan_on_task_done()
+
+    assert app._plan_step_status == {1: "failed", 2: "done", 3: "done"}
+
+
+def test_observation_failure_collapses_to_done_because_executor_continues() -> None:
+    app = _app_with_plan()
+    app._plan_step_status = {1: "done", 2: "active", 3: "pending"}
+    app._subscribe()
+    try:
+        future = crewai_event_bus.emit(
+            None,
+            StepObservationFailedEvent(
+                agent_role="Agent",
+                step_number=2,
+                step_description="Second",
+                error="observer timeout",
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "done",
+        2: "done",
+        3: "pending",
+    }
+
+
+def test_goal_achieved_event_collapses_remaining_steps_done() -> None:
+    app = _app_with_plan()
+    app._plan_step_status = {1: "done", 2: "active", 3: "pending"}
+    app._subscribe()
+    try:
+        future = crewai_event_bus.emit(
+            None,
+            GoalAchievedEarlyEvent(
+                agent_role="Agent",
+                step_number=2,
+                steps_completed=2,
+                steps_remaining=1,
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "done",
+        2: "done",
+        3: "done",
+    }
+
+
+def test_replan_event_keeps_old_plan_until_next_streamed_plan_replaces_it() -> None:
+    app = _app_with_plan()
+    app._subscribe()
+    try:
+        future = crewai_event_bus.emit(
+            None,
+            PlanReplanTriggeredEvent(
+                agent_role="Agent",
+                step_number=2,
+                replan_reason="Need updated sources",
+                replan_count=1,
+                completed_steps_preserved=1,
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+    finally:
+        app._unsubscribe()
+
+    assert app._plan is not None
+    assert app._plan_step_status == {1: "pending", 2: "pending", 3: "pending"}
+    assert app._awaiting_replan is True
+
+    app._try_parse_plan(
+        '{"plan":"Updated plan","steps":['
+        '{"step_number":1,"description":"Updated first"},'
+        '{"step_number":2,"description":"Updated second"}]}'
+    )
+
+    assert app._plan == {
+        "plan": "Updated plan",
+        "steps": [
+            {"step_number": 1, "description": "Updated first"},
+            {"step_number": 2, "description": "Updated second"},
+        ],
+    }
+    assert app._plan_step_status == {1: "pending", 2: "pending"}
+    assert app._awaiting_replan is False
+
+
+def test_plan_refinement_updates_descriptions_without_new_statuses() -> None:
+    app = _app_with_plan()
+    app._plan_step_status = {1: "done", 2: "active", 3: "pending"}
+    app._subscribe()
+    try:
+        future = crewai_event_bus.emit(
+            None,
+            PlanRefinementEvent(
+                agent_role="Agent",
+                step_number=2,
+                refined_step_count=1,
+                refinements=["Step 3: Write the final answer from verified facts"],
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+    finally:
+        app._unsubscribe()
+
+    assert app._plan_step_status == {
+        1: "done",
+        2: "done",
+        3: "pending",
+    }
+    assert app._plan["steps"][2]["description"] == (
+        "Write the final answer from verified facts"
+    )
+
+
+def test_step_observation_json_is_hidden_from_streaming_text() -> None:
+    app = _app_with_plan()
+
+    assert (
+        app._strip_step_observation_json(
+            'Visible before {"step_completed_successfully":true,'
+            '"key_information_learned":"Step 2 succeeded."} visible after'
+        )
+        == "Visible before  visible after"
+    )
+
+
+@pytest.mark.asyncio
+async def test_completed_run_keeps_activity_log_keyboard_navigation_active() -> None:
+    app = CrewRunApp()
+
+    async with app.run_test(size=(100, 40)) as pilot:
+        app._log_entries = [_log_entry("search"), _log_entry("scrape")]
+
+        app._on_crew_done("final output")
+        await pilot.pause()
+
+        assert app.focused is app.query_one("#log-panel")
+
+        await pilot.press("down", "enter")
+        await pilot.pause()
+
+        assert app._log_cursor == 1
+        assert app._log_expanded == {1}
+
+        await pilot.press("up")
+        await pilot.pause()
+
+        assert app._log_cursor == 0
+
+
+class _FakeTask:
+    fingerprint = None
+
+    def __init__(self, task_id: str, name: str) -> None:
+        self.id = task_id
+        self.name = name
+        self.description = name
+
+
+def test_async_task_completion_marks_the_right_sidebar_row() -> None:
+    """Overlapping tasks: completing task 1 while task 2 runs must not
+    mark task 2 done, and starting task 2 must not mark task 1 done."""
+    from crewai.events.types.task_events import TaskCompletedEvent, TaskStartedEvent
+    from crewai.tasks.task_output import TaskOutput
+
+    app = CrewRunApp(total_tasks=2, task_names=["first", "second"])
+    app._subscribe()
+    try:
+        task1 = _FakeTask("id-1", "first")
+        task2 = _FakeTask("id-2", "second")
+
+        for task in (task1, task2):
+            future = crewai_event_bus.emit(
+                None, TaskStartedEvent(context=None, task=task)
+            )
+            if future:
+                future.result(timeout=5)
+
+        # Both started: neither prematurely done
+        assert app._task_statuses == {1: "active", 2: "active"}
+
+        future = crewai_event_bus.emit(
+            None,
+            TaskCompletedEvent(
+                output=TaskOutput(description="first", raw="done", agent="a"),
+                task=task1,
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+
+        assert app._task_statuses == {1: "done", 2: "active"}
+    finally:
+        app._unsubscribe()
+
+
+def test_pop_task_state_falls_back_to_current_task() -> None:
+    app = CrewRunApp(total_tasks=2, task_names=["first", "second"])
+    app._current_task_idx = 2
+    app._current_task_desc = "second"
+
+    class _Evt:
+        task = None
+        task_name = "unknown"
+
+    state = app._pop_task_state(_Evt())
+    assert state["idx"] == 2
+    assert state["desc"] == "second"
+
+
+def test_overlapping_task_logs_keep_their_own_state() -> None:
+    """Task 1 completing after task 2 started must log its own description,
+    agent, and output — and must not steal or reset task 2's stream state."""
+    from crewai.events.types.task_events import TaskCompletedEvent, TaskStartedEvent
+    from crewai.tasks.task_output import TaskOutput
+
+    app = CrewRunApp(total_tasks=2, task_names=["first", "second"])
+    app._subscribe()
+    try:
+        task1 = _FakeTask("id-1", "first")
+        task2 = _FakeTask("id-2", "second")
+
+        for task in (task1, task2):
+            future = crewai_event_bus.emit(
+                None, TaskStartedEvent(context=None, task=task)
+            )
+            if future:
+                future.result(timeout=5)
+
+        # Task 2 is current and has streamed state in flight
+        app._task_full_output = "task two streaming output"
+        app._current_task_steps = [{"type": "llm", "summary": "thinking"}]
+
+        future = crewai_event_bus.emit(
+            None,
+            TaskCompletedEvent(
+                output=TaskOutput(
+                    description="first", raw="task one result", agent="a1"
+                ),
+                task=task1,
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+
+        # Task 1's entry carries its own identity and output
+        entry1 = app._task_logs[-1]
+        assert entry1["idx"] == 1
+        assert entry1["desc"] == "first"
+        assert entry1["output"] == "task one result"
+        assert entry1["steps"] == []
+
+        # Task 2's in-flight stream state was not consumed or reset
+        assert app._task_full_output == "task two streaming output"
+        assert app._current_task_steps == [{"type": "llm", "summary": "thinking"}]
+
+        future = crewai_event_bus.emit(
+            None,
+            TaskCompletedEvent(
+                output=TaskOutput(
+                    description="second", raw="task two result", agent="a2"
+                ),
+                task=task2,
+            ),
+        )
+        if future:
+            future.result(timeout=5)
+
+        entry2 = app._task_logs[-1]
+        assert entry2["idx"] == 2
+        assert entry2["desc"] == "second"
+        assert entry2["output"] == "task two streaming output"
+        assert any(step.get("summary") == "thinking" for step in entry2["steps"])
+    finally:
+        app._unsubscribe()
--- a/lib/cli/tests/test_run_crew.py
+++ b/lib/cli/tests/test_run_crew.py
@@ -0,0 +1,144 @@
+"""Tests for crewai_cli.run_crew JSON crew handling."""
+
+import os
+from pathlib import Path
+
+import pytest
+from crewai_core.constants import CREWAI_TRAINED_AGENTS_FILE_ENV
+
+import crewai_cli.run_crew as run_crew_module
+
+
+def test_run_crew_forwards_trained_agents_file_to_json_crews(monkeypatch):
+    """crewai run -f must reach JSON crews, not only classic subprocess crews."""
+    monkeypatch.setattr(run_crew_module, "_has_json_crew", lambda: True)
+    called: dict = {}
+
+    def fake_run_json_crew(trained_agents_file=None):
+        called["trained_agents_file"] = trained_agents_file
+
+    monkeypatch.setattr(run_crew_module, "_run_json_crew", fake_run_json_crew)
+
+    run_crew_module.run_crew(trained_agents_file="some.pkl")
+
+    assert called == {"trained_agents_file": "some.pkl"}
+
+
+def test_run_json_crew_exports_trained_agents_env(monkeypatch, tmp_path: Path):
+    """JSON crews run in-process, so the pickle path must land in the env var."""
+    monkeypatch.chdir(tmp_path)
+    monkeypatch.delenv(CREWAI_TRAINED_AGENTS_FILE_ENV, raising=False)
+
+    try:
+        # No crew.json(c) in tmp_path: the loader fails *after* the env var
+        # export, which is the part under test.
+        with pytest.raises(FileNotFoundError):
+            run_crew_module._run_json_crew(trained_agents_file="some.pkl")
+        assert os.environ[CREWAI_TRAINED_AGENTS_FILE_ENV] == "some.pkl"
+    finally:
+        os.environ.pop(CREWAI_TRAINED_AGENTS_FILE_ENV, None)
+
+
+def test_run_json_crew_leaves_env_untouched_without_flag(monkeypatch, tmp_path: Path):
+    monkeypatch.chdir(tmp_path)
+    monkeypatch.delenv(CREWAI_TRAINED_AGENTS_FILE_ENV, raising=False)
+
+    with pytest.raises(FileNotFoundError):
+        run_crew_module._run_json_crew()
+
+    assert CREWAI_TRAINED_AGENTS_FILE_ENV not in os.environ
+
+
+def test_missing_input_names_accepts_hyphenated_placeholders():
+    """The prompt regex must accept the same names kickoff interpolation does."""
+    from types import SimpleNamespace
+
+    crew = SimpleNamespace(
+        agents=[
+            SimpleNamespace(
+                role="Researcher", goal="Cover {my-topic}", backstory=""
+            )
+        ],
+        tasks=[
+            SimpleNamespace(
+                description="Write about {my-topic} for {target-audience}",
+                expected_output="Post",
+                output_file=None,
+            )
+        ],
+    )
+
+    assert run_crew_module._missing_input_names(crew, {}) == [
+        "my-topic",
+        "target-audience",
+    ]
+
+
+def _patch_tui_run(monkeypatch, status: str):
+    """Stub the TUI pieces of _run_json_crew so only exit handling runs."""
+
+    class FakeApp:
+        def __init__(self, **kwargs):
+            self._status = status
+            self._crew_result = "result" if status == "completed" else None
+            self._want_deploy = False
+
+        def run(self):
+            pass
+
+    from types import SimpleNamespace
+
+    crew = SimpleNamespace(name="Demo", tasks=[], agents=[])
+    monkeypatch.setattr(
+        run_crew_module, "find_crew_json_file", lambda: Path("crew.jsonc")
+    )
+    monkeypatch.setattr(
+        run_crew_module,
+        "_load_json_crew_for_tui",
+        lambda _path: (FakeApp, crew, {}, [], []),
+    )
+    monkeypatch.setattr(
+        run_crew_module, "_prompt_for_missing_inputs", lambda _crew, inputs: inputs
+    )
+    monkeypatch.setattr(run_crew_module, "_print_post_tui_summary", lambda _app: None)
+
+
+def test_run_json_crew_failed_status_exits_nonzero(monkeypatch, tmp_path: Path):
+    monkeypatch.chdir(tmp_path)
+    _patch_tui_run(monkeypatch, status="failed")
+
+    with pytest.raises(SystemExit) as exc_info:
+        run_crew_module._run_json_crew()
+
+    assert exc_info.value.code == 1
+
+
+def test_run_json_crew_completed_status_returns_result(monkeypatch, tmp_path: Path):
+    monkeypatch.chdir(tmp_path)
+    _patch_tui_run(monkeypatch, status="completed")
+
+    assert run_crew_module._run_json_crew() == "result"
+
+
+def test_has_json_crew_defers_to_declared_flow_type(monkeypatch, tmp_path: Path):
+    """A flow project containing a stray crew.jsonc must still run as a flow."""
+    monkeypatch.chdir(tmp_path)
+    (tmp_path / "crew.jsonc").write_text("{}")
+    (tmp_path / "pyproject.toml").write_text('[tool.crewai]\ntype = "flow"\n')
+
+    assert run_crew_module._has_json_crew() is False
+
+
+def test_has_json_crew_true_for_declared_crew_type(monkeypatch, tmp_path: Path):
+    monkeypatch.chdir(tmp_path)
+    (tmp_path / "crew.jsonc").write_text("{}")
+    (tmp_path / "pyproject.toml").write_text('[tool.crewai]\ntype = "crew"\n')
+
+    assert run_crew_module._has_json_crew() is True
+
+
+def test_has_json_crew_true_without_pyproject(monkeypatch, tmp_path: Path):
+    monkeypatch.chdir(tmp_path)
+    (tmp_path / "crew.jsonc").write_text("{}")
+
+    assert run_crew_module._has_json_crew() is True
--- a/lib/cli/tests/test_run_flow_definition.py
+++ b/lib/cli/tests/test_run_flow_definition.py
@@ -0,0 +1,156 @@
+from __future__ import annotations
+
+import json
+import sys
+import types
+
+import pytest
+import yaml
+
+from crewai_cli.run_flow_definition import run_flow_definition
+
+
+class _FakeFlow:
+    def __init__(self, definition):
+        self.definition = definition
+
+    def kickoff(self, inputs=None):
+        return {
+            "flow": self.definition["name"],
+            "inputs": inputs or {},
+        }
+
+
+class _FakeFlowFactory:
+    @classmethod
+    def from_definition(cls, definition):
+        return _FakeFlow(definition)
+
+
+class _FakeFlowDefinition:
+    @classmethod
+    def from_yaml(cls, source):
+        return yaml.safe_load(source)
+
+    @classmethod
+    def from_json(cls, source):
+        return json.loads(source)
+
+
+@pytest.fixture
+def fake_flow_runtime(monkeypatch):
+    crewai_module = types.ModuleType("crewai")
+    flow_package = types.ModuleType("crewai.flow")
+    flow_module = types.ModuleType("crewai.flow.flow")
+    flow_definition_module = types.ModuleType("crewai.flow.flow_definition")
+
+    flow_module.Flow = _FakeFlowFactory
+    flow_definition_module.FlowDefinition = _FakeFlowDefinition
+
+    monkeypatch.setitem(sys.modules, "crewai", crewai_module)
+    monkeypatch.setitem(sys.modules, "crewai.flow", flow_package)
+    monkeypatch.setitem(sys.modules, "crewai.flow.flow", flow_module)
+    monkeypatch.setitem(
+        sys.modules, "crewai.flow.flow_definition", flow_definition_module
+    )
+
+
+def _captured_json(capsys):
+    return json.loads(capsys.readouterr().out)
+
+
+def test_run_flow_definition_reads_definition_file(
+    tmp_path, capsys, fake_flow_runtime
+):
+    definition_path = tmp_path / "flow.yaml"
+    definition_path.write_text("schema: crewai.flow/v1\nname: TestFlow\n")
+
+    run_flow_definition(str(definition_path), '{"topic":"AI"}')
+
+    assert _captured_json(capsys) == {
+        "flow": "TestFlow",
+        "inputs": {"topic": "AI"},
+    }
+
+
+@pytest.mark.parametrize(
+    ("definition_source", "expected_flow_name"),
+    [
+        pytest.param(
+            "schema: crewai.flow/v1\nname: InlineFlow\n",
+            "InlineFlow",
+            id="inline-yaml",
+        ),
+        pytest.param(
+            '{"schema":"crewai.flow/v1","name":"InlineJsonFlow"}',
+            "InlineJsonFlow",
+            id="inline-json",
+        ),
+        pytest.param(
+            '{"schema":"crewai.flow/v1","name":"' + ("JsonFlow" * 500) + '"}',
+            "JsonFlow" * 500,
+            id="large-inline-json",
+        ),
+    ],
+)
+def test_run_flow_definition_accepts_inline_definitions(
+    definition_source, expected_flow_name, capsys, fake_flow_runtime
+):
+    run_flow_definition(definition_source)
+
+    assert _captured_json(capsys) == {"flow": expected_flow_name, "inputs": {}}
+
+
+@pytest.mark.parametrize(
+    ("filename", "definition_source", "expected_flow_name"),
+    [
+        pytest.param(
+            "flow.yaml",
+            "schema: crewai.flow/v1\nname: YamlFileFlow\n",
+            "YamlFileFlow",
+            id="yaml-file",
+        ),
+        pytest.param(
+            "flow.json",
+            '{"schema":"crewai.flow/v1","name":"JsonFlow"}',
+            "JsonFlow",
+            id="json-file",
+        ),
+    ],
+)
+def test_run_flow_definition_accepts_definition_files(
+    filename, definition_source, expected_flow_name, tmp_path, capsys, fake_flow_runtime
+):
+    definition_path = tmp_path / filename
+    definition_path.write_text(definition_source)
+
+    run_flow_definition(str(definition_path))
+
+    assert _captured_json(capsys) == {"flow": expected_flow_name, "inputs": {}}
+
+
+def test_run_flow_definition_rejects_non_object_inputs(fake_flow_runtime, capsys):
+    with pytest.raises(SystemExit):
+        run_flow_definition("name: TestFlow", '["not", "an", "object"]')
+
+    assert "Invalid --inputs JSON: expected an object." in capsys.readouterr().err
+
+
+def test_run_flow_definition_reports_unreadable_file(
+    monkeypatch, tmp_path, capsys, fake_flow_runtime
+):
+    definition_path = tmp_path / "flow.yaml"
+    definition_path.write_text("schema: crewai.flow/v1\nname: TestFlow\n")
+
+    def raise_permission_error(self, *args, **kwargs):
+        raise PermissionError("no access")
+
+    monkeypatch.setattr("pathlib.Path.read_text", raise_permission_error)
+
+    with pytest.raises(SystemExit):
+        run_flow_definition(str(definition_path))
+
+    err = capsys.readouterr().err
+    assert "Unable to read --definition path" in err
+    assert str(definition_path) in err
+    assert "no access" in err
--- a/lib/cli/tests/tools/test_main.py
+++ b/lib/cli/tests/tools/test_main.py
@@ -157,14 +157,16 @@ def test_install_api_error(mock_get, capsys, tool_command):
    mock_get.assert_called_once_with("error-tool")


-@patch("crewai_cli.tools.main.git.Repository.fetch")
-@patch("crewai_cli.tools.main.git.Repository.is_synced", return_value=False)
-def test_publish_when_not_in_sync(mock_is_synced, mock_fetch, capsys, tool_command):
+@patch("crewai_cli.tools.main.git.Repository")
+def test_publish_when_not_in_sync(mock_repository, capsys, tool_command):
+    mock_repository.return_value.is_synced.return_value = False
+
    with raises(SystemExit):
        tool_command.publish(is_public=True)

    output = capsys.readouterr().out
    assert "Local changes need to be resolved before publishing" in output
+    mock_repository.return_value.is_synced.assert_called_once_with()


@patch("crewai_cli.tools.main.get_project_name", return_value="sample-tool")
--- a/lib/crewai-core/src/crewai_core/__init__.py
+++ b/lib/crewai-core/src/crewai_core/__init__.py
@@ -1 +1 @@
-__version__ = "1.14.7rc2"
+__version__ = "1.14.7"
--- a/lib/crewai-core/tests/test_smoke.py
+++ b/lib/crewai-core/tests/test_smoke.py
@@ -13,8 +13,8 @@ from crewai_core import (
    user_data,
    version,
 )
-import pytest
 from opentelemetry.sdk.trace import TracerProvider
+import pytest


 def test_version_returns_string() -> None:
--- a/lib/crewai-files/src/crewai_files/__init__.py
+++ b/lib/crewai-files/src/crewai_files/__init__.py
@@ -152,4 +152,4 @@ __all__ = [
    "wrap_file_source",
 ]

-__version__ = "1.14.7rc2"
+__version__ = "1.14.7"
--- a/lib/crewai-files/src/crewai_files/core/sources.py
+++ b/lib/crewai-files/src/crewai_files/core/sources.py
@@ -4,6 +4,7 @@ from __future__ import annotations

 from collections.abc import AsyncIterator, Iterator
 import inspect
+import json
 import mimetypes
 from pathlib import Path
 from typing import Annotated, Any, BinaryIO, Protocol, cast, runtime_checkable
@@ -23,6 +24,9 @@ from typing_extensions import TypeIs
 from crewai_files.core.constants import DEFAULT_MAX_FILE_SIZE_BYTES, MAGIC_BUFFER_SIZE


+OCTET_STREAM = "application/octet-stream"
+
+
 @runtime_checkable
 class AsyncReadable(Protocol):
    """Protocol for async readable streams."""
@@ -56,13 +60,51 @@ class _AsyncReadableValidator:
 ValidatedAsyncReadable = Annotated[AsyncReadable, _AsyncReadableValidator()]


-def _fallback_content_type(filename: str | None) -> str:
-    """Get content type from filename extension or return default."""
+def _detect_content_type_from_bytes(data: bytes) -> str | None:
+    if data.startswith(b"\x89PNG\r\n\x1a\n"):
+        return "image/png"
+    if data.startswith(b"\xff\xd8\xff"):
+        return "image/jpeg"
+    if data.startswith(b"%PDF-"):
+        return "application/pdf"
+
+    try:
+        decoded = data.decode("utf-8")
+    except UnicodeDecodeError:
+        return None
+
+    stripped = decoded.lstrip()
+    if stripped.startswith(("{", "[")):
+        try:
+            json.loads(decoded)
+            return "application/json"
+        except json.JSONDecodeError:
+            pass
+
+    if "\x00" not in decoded:
+        return "text/plain"
+
+    return None
+
+
+def _fallback_content_type(filename: str | None, data: bytes | None = None) -> str:
+    """Get content type from filename extension, then content sniffing.
+
+    The extension lookup runs first so specific types like ``text/csv`` or
+    ``application/xml`` are not degraded to generic sniffed types such as
+    ``text/plain``; byte sniffing only covers extensionless/unknown names.
+    """
    if filename:
        mime_type, _ = mimetypes.guess_type(filename)
        if mime_type:
            return mime_type
-    return "application/octet-stream"
+
+    if data:
+        content_type = _detect_content_type_from_bytes(data)
+        if content_type:
+            return content_type
+
+    return OCTET_STREAM


 def generate_filename(content_type: str) -> str:
@@ -97,9 +139,19 @@ def detect_content_type(data: bytes, filename: str | None = None) -> str:
        import magic

        result: str = magic.from_buffer(data[:MAGIC_BUFFER_SIZE], mime=True)
-        return result
+        if result != OCTET_STREAM:
+            return result
+        return _fallback_content_type(filename, data)
    except ImportError:
-        return _fallback_content_type(filename)
+        return _fallback_content_type(filename, data)
+
+
+def _read_magic_header(path: Path) -> bytes | None:
+    try:
+        with path.open("rb") as file:
+            return file.read(MAGIC_BUFFER_SIZE)
+    except OSError:
+        return None


 def detect_content_type_from_path(path: Path, filename: str | None = None) -> str:
@@ -115,13 +167,16 @@ def detect_content_type_from_path(path: Path, filename: str | None = None) -> st
    Returns:
        The detected MIME type.
    """
+    fallback_filename = filename or path.name
    try:
        import magic

        result: str = magic.from_file(str(path), mime=True)
-        return result
+        if result != OCTET_STREAM:
+            return result
+        return _fallback_content_type(fallback_filename, _read_magic_header(path))
    except ImportError:
-        return _fallback_content_type(filename or path.name)
+        return _fallback_content_type(fallback_filename, _read_magic_header(path))


 class _BinaryIOValidator:
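Review note: the lookup order introduced in `sources.py` above (filename extension first, byte sniffing second, `application/octet-stream` last) can be tried in isolation with the sketch below. It is a standalone approximation that assumes `python-magic` is unavailable; `sniff_content_type` and `guess_content_type` are hypothetical names used only for illustration and are not part of `crewai_files`.

```python
import json
import mimetypes
from typing import Optional

OCTET_STREAM = "application/octet-stream"


def sniff_content_type(data: bytes) -> Optional[str]:
    """Guess a MIME type from leading bytes (illustrative, not the library API)."""
    # Well-known binary signatures are checked first.
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"%PDF-"):
        return "application/pdf"

    # Anything that is not valid UTF-8 is treated as unknown binary data.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return None

    # Only call it JSON if the whole payload parses.
    if text.lstrip().startswith(("{", "[")):
        try:
            json.loads(text)
            return "application/json"
        except json.JSONDecodeError:
            pass

    # Decodable text without NUL bytes is reported as plain text.
    return "text/plain" if "\x00" not in text else None


def guess_content_type(filename: Optional[str], data: Optional[bytes] = None) -> str:
    """Extension lookup first, then byte sniffing, then the generic default."""
    if filename:
        mime_type, _ = mimetypes.guess_type(filename)
        if mime_type:
            return mime_type
    if data:
        return sniff_content_type(data) or OCTET_STREAM
    return OCTET_STREAM


# The extension wins even when the bytes look like JSON.
by_extension = guess_content_type("notes.txt", b'{"a": 1}')
# With no usable name, the bytes decide.
by_sniffing = guess_content_type(None, b'{"a": 1}')
```

Running the extension lookup first is what keeps a specific type from being downgraded to whatever the sniffer would report, which matches the reasoning given in the `_fallback_content_type` docstring.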
--- a/lib/crewai-files/src/crewai_files/resolution/resolver.py
+++ b/lib/crewai-files/src/crewai_files/resolution/resolver.py
@@ -129,6 +129,20 @@ class FileResolver:
        """
        return constraints is not None and constraints.supports_url_references

+    @classmethod
+    def _should_resolve_as_url_reference(
+        cls,
+        file: FileInput,
+        provider: ProviderType,
+        constraints: ProviderConstraints | None,
+    ) -> bool:
+        """Check if the provider can accept the current URL source directly."""
+        if not cls._is_url_source(file) or not cls._supports_url(constraints):
+            return False
+
+        provider_lower = provider.lower()
+        return "bedrock" not in provider_lower and "aws" not in provider_lower
+
    @staticmethod
    def _resolve_as_url(file: FileInput) -> UrlReference:
        """Resolve a URL source as UrlReference.
@@ -159,7 +173,7 @@ class FileResolver:
        """
        constraints = get_constraints_for_provider(provider)

-        if self._is_url_source(file) and self._supports_url(constraints):
+        if self._should_resolve_as_url_reference(file, provider, constraints):
            return self._resolve_as_url(file)

        context = self._build_file_context(file)
@@ -424,7 +438,7 @@ class FileResolver:
        """
        constraints = get_constraints_for_provider(provider)

-        if self._is_url_source(file) and self._supports_url(constraints):
+        if self._should_resolve_as_url_reference(file, provider, constraints):
            return self._resolve_as_url(file)

        context = self._build_file_context(file)
--- a/lib/crewai-tools/pyproject.toml
+++ b/lib/crewai-tools/pyproject.toml
@@ -10,7 +10,7 @@ requires-python = ">=3.10, <3.14"
 dependencies = [
    "pytube~=15.0.0",
    "requests>=2.33.0,<3",
-    "crewai==1.14.7rc2",
+    "crewai==1.14.7",
    "tiktoken>=0.8.0,<0.13",
    "beautifulsoup4~=4.13.4",
    "python-docx~=1.2.0",
@@ -63,7 +63,7 @@ spider-client = [
    "spider-client>=0.1.25",
 ]
 scrapegraph-py = [
-    "scrapegraph-py>=1.9.0",
+    "scrapegraph-py>=1.9.0,<2",
 ]
 linkup-sdk = [
    "linkup-sdk>=0.2.2",
--- a/lib/crewai-tools/src/crewai_tools/__init__.py
+++ b/lib/crewai-tools/src/crewai_tools/__init__.py
@@ -330,4 +330,4 @@ __all__ = [
    "ZapierActionTools",
 ]

-__version__ = "1.14.7rc2"
+__version__ = "1.14.7"
--- a/lib/crewai-tools/src/crewai_tools/security/safe_path.py
+++ b/lib/crewai-tools/src/crewai_tools/security/safe_path.py
@@ -22,6 +22,31 @@ logger = logging.getLogger(__name__)
 _UNSAFE_PATHS_ENV = "CREWAI_TOOLS_ALLOW_UNSAFE_PATHS"


+def format_path_for_display(path: str, base_dir: str | None = None) -> str:
+    """Return a path label that does not expose absolute directory prefixes."""
+    if base_dir is None:
+        base_dir = os.getcwd()
+
+    try:
+        resolved_base = os.path.realpath(base_dir)
+        resolved_path = os.path.realpath(
+            os.path.join(resolved_base, path) if not os.path.isabs(path) else path
+        )
+        if os.path.commonpath([resolved_base, resolved_path]) == resolved_base:
+            return os.path.relpath(resolved_path, resolved_base)
+    except (OSError, ValueError) as exc:
+        logger.debug("Falling back to basename for display path formatting: %s", exc)
+
+    return os.path.basename(os.path.realpath(path)) or "[redacted path]"
+
+
+def format_error_for_display(error: Exception) -> str:
+    """Return exception details without OS-added absolute path context."""
+    if isinstance(error, OSError):
+        return error.strerror or error.__class__.__name__
+    return str(error)
+
+
 def _is_escape_hatch_enabled() -> bool:
    """Check if the unsafe paths escape hatch is enabled."""
    return os.environ.get(_UNSAFE_PATHS_ENV, "").lower() in ("true", "1", "yes")
@@ -66,8 +91,8 @@ def validate_file_path(path: str, base_dir: str | None = None) -> str:
    prefix = resolved_base if resolved_base.endswith(os.sep) else resolved_base + os.sep
    if not resolved_path.startswith(prefix) and resolved_path != resolved_base:
        raise ValueError(
-            f"Path '{path}' resolves to '{resolved_path}' which is outside "
-            f"the allowed directory '{resolved_base}'. "
+            f"Path '{format_path_for_display(resolved_path, resolved_base)}' is "
+            f"outside the allowed directory. "
            f"Set {_UNSAFE_PATHS_ENV}=true to bypass this check."
        )

--- a/lib/crewai-tools/src/crewai_tools/tools/file_read_tool/file_read_tool.py
+++ b/lib/crewai-tools/src/crewai_tools/tools/file_read_tool/file_read_tool.py
@@ -3,7 +3,11 @@ from typing import Any
 from crewai.tools import BaseTool
 from pydantic import BaseModel, Field

-from crewai_tools.security.safe_path import validate_file_path
+from crewai_tools.security.safe_path import (
+    format_error_for_display,
+    format_path_for_display,
+    validate_file_path,
+)


 class FileReadToolSchema(BaseModel):
@@ -58,8 +62,9 @@ class FileReadTool(BaseTool):
            **kwargs: Additional keyword arguments passed to BaseTool.
        """
        if file_path is not None:
+            display_path = format_path_for_display(file_path)
            kwargs["description"] = (
-                f"A tool that reads file content. The default file is {file_path}, but you can provide a different 'file_path' parameter to read another file. You can also specify 'start_line' and 'line_count' to read specific parts of the file."
+                f"A tool that reads file content. The default file is {display_path}, but you can provide a different 'file_path' parameter to read another file. You can also specify 'start_line' and 'line_count' to read specific parts of the file."
            )

        super().__init__(**kwargs)
@@ -78,7 +83,12 @@ class FileReadTool(BaseTool):
        if file_path is None:
            return "Error: No file path provided. Please provide a file path either in the constructor or as an argument."

-        file_path = validate_file_path(file_path)
+        try:
+            file_path = validate_file_path(file_path)
+        except ValueError as e:
+            return f"Error: Invalid file path: {e!s}"
+
+        display_path = format_path_for_display(file_path)
        try:
            with open(file_path, "r") as file:
                if start_line == 1 and line_count is None:
@@ -98,8 +108,11 @@ class FileReadTool(BaseTool):

                return "".join(selected_lines)
        except FileNotFoundError:
-            return f"Error: File not found at path: {file_path}"
+            return f"Error: File not found at path: {display_path}"
        except PermissionError:
-            return f"Error: Permission denied when trying to read file: {file_path}"
+            return f"Error: Permission denied when trying to read file: {display_path}"
        except Exception as e:
-            return f"Error: Failed to read file {file_path}. {e!s}"
+            return (
+                f"Error: Failed to read file {display_path}. "
+                f"{format_error_for_display(e)}"
+            )
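The path-redaction behavior these hunks rely on can be checked in isolation. The sketch below copies the two helpers from the `safe_path.py` diff above (without the debug logging, and using `Optional` for the annotation) and runs them against a temporary directory. The directory layout, file names, and the `/srv/app` path are made up for illustration.

```python
import os
import tempfile
from typing import Optional


def format_path_for_display(path: str, base_dir: Optional[str] = None) -> str:
    """Copy of the helper from the diff, minus the debug log call."""
    if base_dir is None:
        base_dir = os.getcwd()
    try:
        resolved_base = os.path.realpath(base_dir)
        resolved_path = os.path.realpath(
            os.path.join(resolved_base, path) if not os.path.isabs(path) else path
        )
        # Inside the base directory: show the path relative to it.
        if os.path.commonpath([resolved_base, resolved_path]) == resolved_base:
            return os.path.relpath(resolved_path, resolved_base)
    except (OSError, ValueError):
        pass
    # Outside the base directory: keep only the final component.
    return os.path.basename(os.path.realpath(path)) or "[redacted path]"


def format_error_for_display(error: Exception) -> str:
    """Copy of the helper from the diff: drop the filename an OSError carries."""
    if isinstance(error, OSError):
        return error.strerror or error.__class__.__name__
    return str(error)


with tempfile.TemporaryDirectory() as workspace:
    parent = os.path.dirname(os.path.realpath(workspace))
    inside = os.path.join(workspace, "nested", "file.txt")
    outside = os.path.join(parent, "outside.txt")
    inside_label = format_path_for_display(inside, workspace)
    outside_label = format_path_for_display(outside, workspace)

# str(io_error) would include the absolute filename; strerror does not.
io_error = OSError(5, "Input/output error", "/srv/app/secret.txt")
error_label = format_error_for_display(io_error)
```

Paths inside the base directory come back relative to it, and anything outside collapses to its basename, so neither label carries the workspace prefix.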
--- a/lib/crewai-tools/src/crewai_tools/tools/file_writer_tool/file_writer_tool.py
+++ b/lib/crewai-tools/src/crewai_tools/tools/file_writer_tool/file_writer_tool.py
@@ -5,6 +5,11 @@ from typing import Any
 from crewai.tools import BaseTool
 from pydantic import BaseModel

+from crewai_tools.security.safe_path import (
+    format_error_for_display,
+    format_path_for_display,
+)
+

 def strtobool(val: str | bool) -> bool:
    if isinstance(val, bool):
@@ -44,6 +49,9 @@ class FileWriterTool(BaseTool):
            # itself, since that is not a valid file target.
            real_directory = Path(directory).resolve()
            real_filepath = Path(filepath).resolve()
+            display_filepath = format_path_for_display(
+                str(real_filepath), str(real_directory)
+            )
            if (
                not real_filepath.is_relative_to(real_directory)
                or real_filepath == real_directory
@@ -56,15 +64,18 @@ class FileWriterTool(BaseTool):
            kwargs["overwrite"] = strtobool(kwargs["overwrite"])

            if os.path.exists(real_filepath) and not kwargs["overwrite"]:
-                return f"File {real_filepath} already exists and overwrite option was not passed."
+                return f"File {display_filepath} already exists and overwrite option was not passed."

            mode = "w" if kwargs["overwrite"] else "x"
            with open(real_filepath, mode) as file:
                file.write(kwargs["content"])
-            return f"Content successfully written to {real_filepath}"
+            return f"Content successfully written to {display_filepath}"
        except FileExistsError:
-            return f"File {real_filepath} already exists and overwrite option was not passed."
+            return f"File {display_filepath} already exists and overwrite option was not passed."
        except KeyError as e:
            return f"An error occurred while accessing key: {e!s}"
        except Exception as e:
-            return f"An error occurred while writing to the file: {e!s}"
+            return (
+                "An error occurred while writing to the file: "
+                f"{format_error_for_display(e)}"
+            )
--- a/lib/crewai-tools/tests/file_read_tool_test.py
+++ b/lib/crewai-tools/tests/file_read_tool_test.py
@@ -1,4 +1,3 @@
-import os
 from unittest.mock import mock_open, patch

 from crewai_tools import FileReadTool
@@ -6,21 +5,16 @@ from crewai_tools import FileReadTool

 def test_file_read_tool_constructor():
    """Test FileReadTool initialization with file_path."""
-    test_file = "/tmp/test_file.txt"
-    test_content = "Hello, World!"
-    with open(test_file, "w") as f:
-        f.write(test_content)
+    test_file = "test_file.txt"

    tool = FileReadTool(file_path=test_file)
    assert tool.file_path == test_file
    assert "test_file.txt" in tool.description

-    os.remove(test_file)
-

 def test_file_read_tool_run():
    """Test FileReadTool _run method with file_path at runtime."""
-    test_file = "/tmp/test_file.txt"
+    test_file = "test_file.txt"
    test_content = "Hello, World!"

    # Use mock_open to mock file operations
@@ -36,18 +30,18 @@ def test_file_read_tool_error_handling():
    result = tool._run()
    assert "Error: No file path provided" in result

-    result = tool._run(file_path="/nonexistent/file.txt")
+    result = tool._run(file_path="nonexistent/file.txt")
    assert "Error: File not found at path:" in result

    with patch("builtins.open", side_effect=PermissionError()):
-        result = tool._run(file_path="/tmp/no_permission.txt")
+        result = tool._run(file_path="no_permission.txt")
        assert "Error: Permission denied" in result


 def test_file_read_tool_constructor_and_run():
    """Test FileReadTool using both constructor and runtime file paths."""
-    test_file1 = "/tmp/test1.txt"
-    test_file2 = "/tmp/test2.txt"
+    test_file1 = "test1.txt"
+    test_file2 = "test2.txt"
    content1 = "File 1 content"
    content2 = "File 2 content"

@@ -64,7 +58,7 @@ def test_file_read_tool_constructor_and_run():

 def test_file_read_tool_chunk_reading():
    """Test FileReadTool reading specific chunks of a file."""
-    test_file = "/tmp/multiline_test.txt"
+    test_file = "multiline_test.txt"
    lines = [
        "Line 1\n",
        "Line 2\n",
@@ -104,7 +98,7 @@ def test_file_read_tool_chunk_reading():

 def test_file_read_tool_chunk_error_handling():
    """Test error handling for chunk reading."""
-    test_file = "/tmp/short_test.txt"
+    test_file = "short_test.txt"
    lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
    file_content = "".join(lines)

@@ -122,7 +116,7 @@ def test_file_read_tool_chunk_error_handling():

 def test_file_read_tool_zero_or_negative_start_line():
    """Test that start_line values of 0 or negative read from the start of the file."""
-    test_file = "/tmp/negative_test.txt"
+    test_file = "negative_test.txt"
    lines = ["Line 1\n", "Line 2\n", "Line 3\n", "Line 4\n", "Line 5\n"]
    file_content = "".join(lines)

@@ -150,3 +144,45 @@ def test_file_read_tool_zero_or_negative_start_line():
        result = tool._run(file_path=test_file, start_line=-10, line_count=2)
        expected = "".join(lines[0:2])  # Should read first 2 lines
        assert result == expected
+
+
+def test_file_read_tool_error_messages_do_not_disclose_absolute_paths(
+    tmp_path, monkeypatch
+):
+    """FileReadTool should redact absolute prefixes from user-visible errors."""
+    monkeypatch.chdir(tmp_path)
+    tool = FileReadTool()
+    target = tmp_path / "secret.txt"
+
+    result = tool._run(file_path=str(target))
+    assert "secret.txt" in result
+    assert str(tmp_path) not in result
+
+    target.touch()
+    with patch("builtins.open", side_effect=PermissionError()):
+        result = tool._run(file_path=str(target))
+    assert "secret.txt" in result
+    assert str(tmp_path) not in result
+
+    with patch(
+        "builtins.open",
+        side_effect=OSError(5, "Input/output error", str(target)),
+    ):
+        result = tool._run(file_path=str(target))
+    assert "secret.txt" in result
+    assert str(tmp_path) not in result
+
+
+def test_file_read_tool_invalid_path_error_does_not_disclose_workspace(
+    tmp_path, monkeypatch
+):
+    """Validation errors should not echo the resolved workspace path."""
+    monkeypatch.chdir(tmp_path)
+    outside = tmp_path.parent / "outside.txt"
+
+    result = FileReadTool()._run(file_path=str(outside))
+
+    assert "Invalid file path" in result
+    assert "outside.txt" in result
+    assert str(tmp_path) not in result
+    assert str(tmp_path.parent) not in result
--- a/lib/crewai-tools/tests/tools/test_file_writer_tool.py
+++ b/lib/crewai-tools/tests/tools/test_file_writer_tool.py
@@ -47,6 +47,8 @@ def test_basic_file_write(tool, temp_env):
    assert os.path.exists(path)
    assert read_file(path) == temp_env["test_content"]
    assert "successfully written" in result
+    assert temp_env["test_file"] in result
+    assert temp_env["temp_dir"] not in result


 def test_directory_creation(tool, temp_env):
@@ -62,6 +64,8 @@ def test_directory_creation(tool, temp_env):
    assert os.path.exists(new_dir)
    assert os.path.exists(path)
    assert "successfully written" in result
+    assert temp_env["test_file"] in result
+    assert new_dir not in result


 @pytest.mark.parametrize(
@@ -134,6 +138,8 @@ def test_file_exists_error_handling(tool, temp_env, overwrite):
    )

    assert "already exists and overwrite option was not passed" in result
+    assert temp_env["test_file"] in result
+    assert temp_env["temp_dir"] not in result
    assert read_file(path) == "Pre-existing content"


--- a/lib/crewai-tools/tests/utilities/test_safe_path.py
+++ b/lib/crewai-tools/tests/utilities/test_safe_path.py
@@ -7,6 +7,7 @@ import os
 import pytest

 from crewai_tools.security.safe_path import (
+    format_path_for_display,
    validate_directory_path,
    validate_file_path,
    validate_url,
@@ -66,6 +67,37 @@ class TestValidateFilePath:
        result = validate_file_path("/etc/passwd", str(tmp_path))
        assert result == os.path.realpath("/etc/passwd")

+    def test_rejection_message_redacts_absolute_prefixes(self, tmp_path):
+        outside = tmp_path.parent / "outside.txt"
+
+        with pytest.raises(ValueError) as exc_info:
+            validate_file_path(str(outside), str(tmp_path))
+
+        message = str(exc_info.value)
+        assert "outside.txt" in message
+        assert str(tmp_path) not in message
+        assert str(tmp_path.parent) not in message
+
+
+class TestFormatPathForDisplay:
+    """Tests for user-visible path labels."""
+
+    def test_returns_relative_path_inside_base(self, tmp_path):
+        nested_file = tmp_path / "nested" / "file.txt"
+        nested_file.parent.mkdir()
+        nested_file.touch()
+
+        result = format_path_for_display(str(nested_file), str(tmp_path))
+
+        assert result == os.path.join("nested", "file.txt")
+
+    def test_redacts_absolute_prefix_outside_base(self, tmp_path):
+        outside_file = tmp_path.parent / "outside.txt"
+
+        result = format_path_for_display(str(outside_file), str(tmp_path))
+
+        assert result == "outside.txt"
+

 class TestValidateDirectoryPath:
    """Tests for validate_directory_path."""
--- a/lib/crewai/pyproject.toml
+++ b/lib/crewai/pyproject.toml
@@ -8,8 +8,8 @@ authors = [
 ]
 requires-python = ">=3.10, <3.14"
 dependencies = [
-    "crewai-core==1.14.7rc2",
-    "crewai-cli==1.14.7rc2",
+    "crewai-core==1.14.7",
+    "crewai-cli==1.14.7",
    # Core Dependencies
    "pydantic>=2.11.9,<2.13",
    "openai>=2.30.0,<3",
@@ -33,6 +33,7 @@ dependencies = [
    "appdirs~=1.4.4",
    "jsonref~=1.1.0",
    "json-repair~=0.25.2",
+    "cel-python>=0.5.0,<0.6",
    "tomli-w~=1.1.0",
    "tomli~=2.0.2",
    "json5~=0.10.0",
@@ -54,7 +55,7 @@ Repository = "https://github.com/crewAIInc/crewAI"

 [project.optional-dependencies]
 tools = [
-    "crewai-tools==1.14.7rc2",
+    "crewai-tools==1.14.7",
 ]
 embeddings = [
    "tiktoken>=0.8.0,<0.13"
--- a/lib/crewai/src/crewai/__init__.py
+++ b/lib/crewai/src/crewai/__init__.py
@@ -48,7 +48,7 @@ def _suppress_pydantic_deprecation_warnings() -> None:

 _suppress_pydantic_deprecation_warnings()

-__version__ = "1.14.7rc2"
+__version__ = "1.14.7"

 _LAZY_IMPORTS: dict[str, tuple[str, str]] = {
    "Memory": ("crewai.memory.unified_memory", "Memory"),
--- a/lib/crewai/src/crewai/agent/core.py
+++ b/lib/crewai/src/crewai/agent/core.py
@@ -758,6 +758,31 @@ class Agent(BaseAgent):
        self._check_execution_error(e, task)
        return await self.aexecute_task(task, context, tools)

+    def message(self, content: str, **kwargs: Any) -> str:
+        """Send a single message and get a response.
+
+        Creates a temporary Task + Crew, executes, and returns the raw output.
+        """
+        from crewai.crew import Crew
+        from crewai.task import Task
+        from crewai.types.streaming import CrewStreamingOutput
+
+        task = Task(
+            description=content,
+            expected_output="Respond to the user's message appropriately.",
+            agent=self,
+        )
+        crew = Crew(
+            agents=[self],
+            tasks=[task],
+            verbose=self.verbose,
+            memory=self.memory or False,
+        )
+        result = crew.kickoff()
+        if isinstance(result, CrewStreamingOutput):
+            return result.result.raw
+        return result.raw
+
    def execute_task(
        self,
        task: Task,
--- a/lib/crewai/src/crewai/agent/planning_config.py
+++ b/lib/crewai/src/crewai/agent/planning_config.py
@@ -1,9 +1,10 @@
 from __future__ import annotations

-from typing import Literal
+from typing import Annotated, Literal

-from pydantic import BaseModel, Field
+from pydantic import BaseModel, BeforeValidator, Field

+from crewai.agents.agent_builder.base_agent import _validate_llm_ref
 from crewai.llms.base_llm import BaseLLM


@@ -69,7 +70,7 @@ class PlanningConfig(BaseModel):
                max_attempts=3,
                max_steps=10,
                plan_prompt="Create a focused plan for: {description}",
-                llm="gpt-4o-mini",
+                llm="gpt-5.4-mini",
            ),
        )
        ```
@@ -139,7 +140,10 @@ class PlanningConfig(BaseModel):
            "whether to continue or replan. None means no per-step timeout."
        ),
    )
-    llm: str | BaseLLM | None = Field(
+    llm: Annotated[
+        str | BaseLLM | None,
+        BeforeValidator(_validate_llm_ref),
+    ] = Field(
        default=None,
        description="LLM to use for planning. Uses agent's LLM if None.",
    )
--- a/lib/crewai/src/crewai/agents/agent_adapters/openai_agents/openai_adapter.py
+++ b/lib/crewai/src/crewai/agents/agent_adapters/openai_agents/openai_adapter.py
@@ -81,7 +81,7 @@ class OpenAIAgentAdapter(BaseAgentAdapter):
        Raises:
            ImportError: If OpenAI agent dependencies are not installed.
        """
-        self.llm = kwargs.pop("model", "gpt-4o-mini")
+        self.llm = kwargs.pop("model", "gpt-5.4-mini")
        super().__init__(**kwargs)
        self._tool_adapter = OpenAIAgentToolAdapter(tools=kwargs.get("tools"))
        self._converter_adapter = OpenAIConverterAdapter(agent_adapter=self)
--- a/lib/crewai/src/crewai/agents/agent_builder/base_agent.py
+++ b/lib/crewai/src/crewai/agents/agent_builder/base_agent.py
@@ -82,16 +82,42 @@ _LLM_TYPE_REGISTRY: dict[str, str] = {
 def _validate_llm_ref(value: Any) -> Any:
    if isinstance(value, dict):
        import importlib
+        import inspect

        llm_type = value.get("llm_type")
-        if not llm_type or llm_type not in _LLM_TYPE_REGISTRY:
+        if not llm_type:
+            model = (
+                value.get("model")
+                or value.get("model_name")
+                or value.get("deployment_name")
+            )
+            if not model:
+                raise ValueError(
+                    "LLM config objects must include 'model', 'model_name', "
+                    "or 'deployment_name', or a serialized 'llm_type'. "
+                    f"Got keys: {list(value)}"
+                )
+            from crewai.llm import LLM
+
+            llm_kwargs = {**value, "model": model}
+            llm_kwargs.pop("model_name", None)
+            llm_kwargs.pop("deployment_name", None)
+            return LLM(**llm_kwargs)
+
+        if llm_type not in _LLM_TYPE_REGISTRY:
            raise ValueError(
-                f"Unknown or missing llm_type: {llm_type!r}. "
+                f"Unknown llm_type: {llm_type!r}. "
                f"Expected one of {list(_LLM_TYPE_REGISTRY)}"
            )
        dotted = _LLM_TYPE_REGISTRY[llm_type]
        mod_path, cls_name = dotted.rsplit(".", 1)
        cls = getattr(importlib.import_module(mod_path), cls_name)
+        if inspect.isabstract(cls):
+            from crewai.llm import LLM
+
+            return LLM(
+                **{k: v for k, v in value.items() if v is not None and k != "llm_type"}
+            )
        return cls(**value)
    return value

@@ -611,7 +637,10 @@ class BaseAgent(BaseModel, ABC, metaclass=AgentMeta):
        if self.memory is True:
            from crewai.memory.unified_memory import Memory

-            self.memory = Memory()
+            memory_kwargs: dict[str, Any] = {}
+            if self.llm is not None:
+                memory_kwargs["llm"] = self.llm
+            self.memory = Memory(**memory_kwargs)
        elif self.memory is False:
            self.memory = None
        return self
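The new branch of `_validate_llm_ref` for dicts without an `llm_type` amounts to a key normalization step before `LLM(**llm_kwargs)` is called. The sketch below isolates that step so it runs without crewai installed. `normalize_llm_config` is a hypothetical name, and it returns the keyword arguments instead of constructing an `LLM`.

```python
from typing import Any


def normalize_llm_config(value: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in: build the kwargs the diff passes to LLM(...)."""
    # Accept any of the three spellings, in the same order as the diff.
    model = (
        value.get("model")
        or value.get("model_name")
        or value.get("deployment_name")
    )
    if not model:
        raise ValueError(
            "LLM config objects must include 'model', 'model_name', "
            "or 'deployment_name', or a serialized 'llm_type'. "
            f"Got keys: {list(value)}"
        )
    # Collapse the aliases into a single 'model' key.
    llm_kwargs = {**value, "model": model}
    llm_kwargs.pop("model_name", None)
    llm_kwargs.pop("deployment_name", None)
    return llm_kwargs


from_alias = normalize_llm_config({"model_name": "gpt-5.4-mini", "temperature": 0.2})

try:
    normalize_llm_config({"temperature": 0.2})
    missing_model_rejected = False
except ValueError:
    missing_model_rejected = True
```

This covers only the path where `llm_type` is absent. The registry lookup and the abstract-class fallback in the diff are not modeled here.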
--- a/lib/crewai/src/crewai/agents/crew_agent_executor.py
+++ b/lib/crewai/src/crewai/agents/crew_agent_executor.py
@@ -53,6 +53,7 @@ from crewai.types.callback import SerializableCallable
 from crewai.utilities.agent_utils import (
    _llm_stop_words_applied,
    aget_llm_response,
+    build_text_tool_calling_fallback_message,
    convert_tools_to_openai_schema,
    enforce_rpm_limit,
    format_message_for_llm,
@@ -64,6 +65,7 @@ from crewai.utilities.agent_utils import (
    handle_unknown_error,
    has_reached_max_iterations,
    is_context_length_exceeded,
+    is_native_tool_calling_unsupported_error,
    parse_tool_call_args,
    process_llm_response,
    track_delegation_if_needed,
@@ -464,6 +466,20 @@ class CrewAgentExecutor(BaseAgentExecutor):
        self._show_logs(formatted_answer)
        return formatted_answer

+    def _append_text_tool_calling_fallback_message(self) -> None:
+        """Add text tool-calling instructions after native tools are rejected."""
+        if not self.tools:
+            return
+        self.messages.append(
+            format_message_for_llm(
+                build_text_tool_calling_fallback_message(
+                    self.tools_description,
+                    self.tools_names,
+                ),
+                role="user",
+            )
+        )
+
    def _invoke_loop_native_tools(self) -> AgentFinish:
        """Execute agent loop using native function calling.

@@ -557,6 +573,9 @@ class CrewAgentExecutor(BaseAgentExecutor):
                return formatted_answer

            except Exception as e:
+                if is_native_tool_calling_unsupported_error(e):
+                    self._append_text_tool_calling_fallback_message()
+                    return self._invoke_loop_react()
                if e.__class__.__module__.startswith("litellm"):
                    raise e
                if is_context_length_exceeded(e):
@@ -1369,6 +1388,9 @@ class CrewAgentExecutor(BaseAgentExecutor):
                return formatted_answer

            except Exception as e:
+                if is_native_tool_calling_unsupported_error(e):
+                    self._append_text_tool_calling_fallback_message()
+                    return await self._ainvoke_loop_react()
                if e.__class__.__module__.startswith("litellm"):
                    raise e
                if is_context_length_exceeded(e):
--- a/lib/crewai/src/crewai/agents/step_executor.py
+++ b/lib/crewai/src/crewai/agents/step_executor.py
@@ -29,14 +29,17 @@ from crewai.events.types.tool_usage_events import (
    ToolUsageStartedEvent,
 )
 from crewai.utilities.agent_utils import (
+    build_text_tool_calling_fallback_message,
    build_tool_calls_assistant_message,
    check_native_tool_support,
    enforce_rpm_limit,
    execute_single_native_tool_call,
    extract_task_section,
    format_message_for_llm,
+    is_native_tool_calling_unsupported_error,
    is_tool_call_list,
    process_llm_response,
+    render_text_description_and_args,
    setup_native_tools,
 )
 from crewai.utilities.i18n import I18N_DEFAULT
@@ -153,6 +156,7 @@ class StepExecutor:
            if self._use_native_tools:
                result_text = self._execute_native(
                    messages,
+                    todo,
                    tool_calls_made,
                    max_step_iterations=max_step_iterations,
                    step_timeout=step_timeout,
@@ -161,6 +165,7 @@ class StepExecutor:
            else:
                result_text = self._execute_text_parsed(
                    messages,
+                    todo,
                    tool_calls_made,
                    max_step_iterations=max_step_iterations,
                    step_timeout=step_timeout,
@@ -176,6 +181,46 @@ class StepExecutor:
                execution_time=elapsed,
            )
        except Exception as e:
+            if self._use_native_tools and is_native_tool_calling_unsupported_error(e):
+                try:
+                    self._use_native_tools = False
+                    self._openai_tools = []
+                    self._available_functions = {}
+                    # Keep the conversation built so far (including any native
+                    # tool round-trips already appended to ``messages``) and
+                    # append the text-tooling instructions instead of
+                    # restarting the step, so completed tool calls are not
+                    # re-executed against a fresh context.
+                    messages.append(
+                        format_message_for_llm(
+                            build_text_tool_calling_fallback_message(
+                                render_text_description_and_args(self.tools),
+                                ", ".join(
+                                    sanitize_tool_name(t.name) for t in self.tools
+                                ),
+                            ),
+                            role="user",
+                        )
+                    )
+                    result_text = self._execute_text_parsed(
+                        messages,
+                        todo,
+                        tool_calls_made,
+                        max_step_iterations=max_step_iterations,
+                        step_timeout=step_timeout,
+                        start_time=start_time,
+                    )
+                    self._validate_expected_tool_usage(todo, tool_calls_made)
+                    elapsed = time.monotonic() - start_time
+                    return StepResult(
+                        success=True,
+                        result=result_text,
+                        tool_calls_made=tool_calls_made,
+                        execution_time=elapsed,
+                    )
+                except Exception as fallback_error:
+                    e = fallback_error
+
            elapsed = time.monotonic() - start_time
            return StepResult(
                success=False,
@@ -272,6 +317,7 @@ class StepExecutor:
    def _execute_text_parsed(
        self,
        messages: list[LLMMessage],
+        todo: TodoItem,
        tool_calls_made: list[str],
        max_step_iterations: int = 15,
        step_timeout: int | None = None,
@@ -310,7 +356,7 @@ class StepExecutor:

            if isinstance(formatted, AgentAction):
                tool_calls_made.append(formatted.tool)
-                tool_result = self._execute_text_tool_with_events(formatted)
+                tool_result = self._execute_text_tool_with_events(formatted, todo)
                last_tool_result = tool_result
                messages.append({"role": "assistant", "content": answer_str})
                messages.append(self._build_observation_message(tool_result))
@@ -320,7 +366,9 @@ class StepExecutor:

        return last_tool_result

-    def _execute_text_tool_with_events(self, formatted: AgentAction) -> str:
+    def _execute_text_tool_with_events(
+        self, formatted: AgentAction, todo: TodoItem
+    ) -> str:
        """Execute text-parsed tool calls with tool usage events."""
        args_dict = self._parse_tool_args(formatted.tool_input)
        agent_key = getattr(self.agent, "key", "unknown") if self.agent else "unknown"
@@ -333,6 +381,8 @@ class StepExecutor:
                from_agent=self.agent,
                from_task=self.task,
                agent_key=agent_key,
+                plan_step_number=todo.step_number,
+                plan_step_description=todo.description,
            ),
        )

@@ -368,6 +418,8 @@ class StepExecutor:
                    from_agent=self.agent,
                    from_task=self.task,
                    agent_key=agent_key,
+                    plan_step_number=todo.step_number,
+                    plan_step_description=todo.description,
                    error=e,
                ),
            )
@@ -382,6 +434,8 @@ class StepExecutor:
                from_agent=self.agent,
                from_task=self.task,
                agent_key=agent_key,
+                plan_step_number=todo.step_number,
+                plan_step_description=todo.description,
                started_at=started_at,
                finished_at=datetime.now(),
            ),
@@ -474,6 +528,7 @@ class StepExecutor:
    def _execute_native(
        self,
        messages: list[LLMMessage],
+        todo: TodoItem,
        tool_calls_made: list[str],
        max_step_iterations: int = 15,
        step_timeout: int | None = None,
@@ -513,7 +568,7 @@ class StepExecutor:

            if isinstance(answer, list) and answer and is_tool_call_list(answer):
                result = self._execute_native_tool_calls(
-                    answer, messages, tool_calls_made
+                    answer, messages, todo, tool_calls_made
                )
                accumulated_results.append(result)
                continue
@@ -526,6 +581,7 @@ class StepExecutor:
        self,
        tool_calls: list[Any],
        messages: list[LLMMessage],
+        todo: TodoItem,
        tool_calls_made: list[str],
    ) -> str:
        """Execute a batch of native tool calls and return their results.
@@ -551,6 +607,8 @@ class StepExecutor:
                event_source=self,
                printer=PRINTER,
                verbose=bool(self.agent and self.agent.verbose),
+                plan_step_number=todo.step_number,
+                plan_step_description=todo.description,
            )

            if call_result.func_name:
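The fallback added to `StepExecutor` keeps the message history built during the native attempt and appends text tool-calling instructions instead of restarting the step. The sketch below shows only that control flow, under stated assumptions: `NativeToolsUnsupported`, `call_native`, `call_text_parsed`, and `run_step` are hypothetical stand-ins. The real detection goes through `is_native_tool_calling_unsupported_error`, whose implementation is not part of this section.

```python
class NativeToolsUnsupported(Exception):
    """Hypothetical error raised when a provider rejects native tool schemas."""


def call_native(messages: list[dict[str, str]]) -> str:
    # Simulate one completed tool round-trip, then a provider rejection.
    messages.append({"role": "tool", "content": "search -> 3 results"})
    raise NativeToolsUnsupported("this model does not support tools")


def call_text_parsed(messages: list[dict[str, str]]) -> str:
    # Stand-in for the text-parsed loop; it sees the full history.
    return f"answered with {len(messages)} messages of context"


def run_step(messages: list[dict[str, str]]) -> str:
    try:
        return call_native(messages)
    except NativeToolsUnsupported:
        # Keep the history (including the finished tool call) and append the
        # text tool-calling instructions rather than starting over.
        messages.append(
            {"role": "user", "content": "Native tools unavailable; use the text format."}
        )
        return call_text_parsed(messages)


history = [{"role": "user", "content": "Find the release notes"}]
result = run_step(history)
```

Because the tool message from the native attempt stays in `history`, the text-parsed loop can build on it instead of re-running the same tool call.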
--- a/lib/crewai/src/crewai/crew.py
+++ b/lib/crewai/src/crewai/crew.py
@@ -658,7 +658,14 @@ class Crew(FlowTrackable, BaseModel):
                from crewai.rag.embeddings.factory import build_embedder

                embedder = build_embedder(cast(dict[str, Any], self.embedder))
-            self._memory = Memory(embedder=embedder, root_scope=crew_root_scope)
+            memory_kwargs: dict[str, Any] = {
+                "embedder": embedder,
+                "root_scope": crew_root_scope,
+            }
+            memory_llm = self._memory_llm()
+            if memory_llm is not None:
+                memory_kwargs["llm"] = memory_llm
+            self._memory = Memory(**memory_kwargs)
        elif self.memory:
            # User passed a Memory / MemoryScope / MemorySlice instance
            # Respect user's configuration — don't auto-set root_scope
@@ -668,6 +675,16 @@ class Crew(FlowTrackable, BaseModel):

        return self

+    def _memory_llm(self) -> str | BaseLLM | None:
+        """Return the LLM auto-created memory should use for analysis."""
+        if self.chat_llm is not None:
+            return self.chat_llm
+        for agent in self.agents:
+            agent_llm: str | BaseLLM | None = getattr(agent, "llm", None)
+            if agent_llm is not None:
+                return agent_llm
+        return None
+
    @model_validator(mode="after")
    def create_crew_knowledge(self) -> Crew:
        """Create the knowledge for the crew."""
--- a/lib/crewai/src/crewai/events/__init__.py
+++ b/lib/crewai/src/crewai/events/__init__.py
@@ -116,6 +116,11 @@ if TYPE_CHECKING:
        MemorySaveFailedEvent,
        MemorySaveStartedEvent,
    )
+    from crewai.events.types.observation_events import (
+        PlanStepCompletedEvent,
+        PlanStepEvent,
+        PlanStepStartedEvent,
+    )
    from crewai.events.types.reasoning_events import (
        AgentReasoningCompletedEvent,
        AgentReasoningFailedEvent,
@@ -220,6 +225,9 @@ _LAZY_EVENT_MAPPING: dict[str, str] = {
    "MemorySaveCompletedEvent": "crewai.events.types.memory_events",
    "MemorySaveFailedEvent": "crewai.events.types.memory_events",
    "MemorySaveStartedEvent": "crewai.events.types.memory_events",
+    "PlanStepCompletedEvent": "crewai.events.types.observation_events",
+    "PlanStepEvent": "crewai.events.types.observation_events",
+    "PlanStepStartedEvent": "crewai.events.types.observation_events",
    "AgentReasoningCompletedEvent": "crewai.events.types.reasoning_events",
    "AgentReasoningFailedEvent": "crewai.events.types.reasoning_events",
    "AgentReasoningStartedEvent": "crewai.events.types.reasoning_events",
@@ -349,6 +357,9 @@ __all__ = [
    "MethodExecutionFailedEvent",
    "MethodExecutionFinishedEvent",
    "MethodExecutionStartedEvent",
+    "PlanStepCompletedEvent",
+    "PlanStepEvent",
+    "PlanStepStartedEvent",
    "ReasoningEvent",
    "SkillActivatedEvent",
    "SkillDiscoveryCompletedEvent",
--- a/lib/crewai/src/crewai/events/event_listener.py
+++ b/lib/crewai/src/crewai/events/event_listener.py
@@ -158,7 +158,6 @@ class EventListener(BaseEventListener):
            trace_listener.formatter = self.formatter

    def setup_listeners(self, crewai_event_bus: CrewAIEventsBus) -> None:
-
        @crewai_event_bus.on(CCEnvEvent)
        def on_cc_env(_: Any, event: CCEnvEvent) -> None:
            self._telemetry.env_context_span(event.type)
--- a/lib/crewai/src/crewai/events/event_types.py
+++ b/lib/crewai/src/crewai/events/event_types.py
@@ -99,6 +99,10 @@ from crewai.events.types.memory_events import (
    MemorySaveFailedEvent,
    MemorySaveStartedEvent,
 )
+from crewai.events.types.observation_events import (
+    PlanStepCompletedEvent,
+    PlanStepStartedEvent,
+)
 from crewai.events.types.reasoning_events import (
    AgentReasoningCompletedEvent,
    AgentReasoningFailedEvent,
@@ -191,6 +195,8 @@ EventTypes = (
    | MemoryRetrievalStartedEvent
    | MemoryRetrievalCompletedEvent
    | MemoryRetrievalFailedEvent
+    | PlanStepStartedEvent
+    | PlanStepCompletedEvent
    | MCPConnectionStartedEvent
    | MCPConnectionCompletedEvent
    | MCPConnectionFailedEvent
--- a/lib/crewai/src/crewai/events/listeners/tracing/trace_batch_manager.py
+++ b/lib/crewai/src/crewai/events/listeners/tracing/trace_batch_manager.py
@@ -24,6 +24,7 @@ from crewai.events.listeners.tracing.types import TraceEvent
 from crewai.events.listeners.tracing.utils import (
    get_user_id,
    is_tracing_enabled_in_context,
+    is_tui_mode,
    should_auto_collect_first_time_traces,
 )
 from crewai.plus_api import PlusAPI
@@ -74,6 +75,7 @@ class TraceBatchManager:
        self.defer_session_finalization: bool = False
        self._batch_finalized: bool = False
        self.backend_initialized: bool = False
+        self.trace_url: str | None = None
        self.ephemeral_trace_url: str | None = None
        try:
            self.plus_api = PlusAPI(
@@ -108,7 +110,9 @@ class TraceBatchManager:

            self.record_start_time("execution")

-            if should_auto_collect_first_time_traces():
+            if should_auto_collect_first_time_traces() or (
+                is_tui_mode() and not is_tracing_enabled_in_context()
+            ):
                self.trace_batch_id = self.current_batch.batch_id
            else:
                self._initialize_backend_batch(
@@ -411,6 +415,7 @@ class TraceBatchManager:
                        else f"{base_url}/crewai_plus/ephemeral_trace_batches/{batch_id}?access_code={access_code}"
                    )

+                    self.trace_url = return_link
                    if is_ephemeral:
                        self.ephemeral_trace_url = return_link

@@ -428,7 +433,10 @@ class TraceBatchManager:
                        title="Trace Batch Finalization",
                        border_style="green",
                    )
-                    if not should_auto_collect_first_time_traces():
+                    if (
+                        not should_auto_collect_first_time_traces()
+                        and not is_tui_mode()
+                    ):
                        console.print(panel)
                    return True

--- a/lib/crewai/src/crewai/events/listeners/tracing/trace_listener.py
+++ b/lib/crewai/src/crewai/events/listeners/tracing/trace_listener.py
@@ -18,6 +18,7 @@ from crewai.events.listeners.tracing.trace_batch_manager import TraceBatchManage
 from crewai.events.listeners.tracing.types import TraceEvent
 from crewai.events.listeners.tracing.utils import (
    is_tracing_enabled_in_context,
+    is_tui_mode,
    safe_serialize_to_dict,
    should_auto_collect_first_time_traces,
    should_enable_tracing,
@@ -212,8 +213,8 @@ class TraceCollectionListener(BaseEventListener):
            not should_enable_tracing()
            and not is_tracing_enabled_in_context()
            and not should_auto_collect_first_time_traces()
+            and not is_tui_mode()
        ):
-            self._listeners_setup = True
            return

        self._register_flow_event_handlers(crewai_event_bus)
@@ -297,6 +298,12 @@ class TraceCollectionListener(BaseEventListener):
            if self._nested_in_flow_execution():
                return
            if self.batch_manager.batch_owner_type == "crew":
+                if is_tui_mode():
+                    if self.first_time_handler.is_first_time:
+                        self.first_time_handler.mark_events_collected()
+                    elif is_tracing_enabled_in_context() or should_enable_tracing():
+                        self.batch_manager.finalize_batch()
+                    return
                if self.first_time_handler.is_first_time:
                    self.first_time_handler.mark_events_collected()
                    self.first_time_handler.handle_execution_completion()
@@ -310,6 +317,12 @@ class TraceCollectionListener(BaseEventListener):
                return
            if self._nested_in_flow_execution():
                return
+            if is_tui_mode():
+                if self.first_time_handler.is_first_time:
+                    self.first_time_handler.mark_events_collected()
+                elif is_tracing_enabled_in_context() or should_enable_tracing():
+                    self.batch_manager.finalize_batch()
+                return
            if self.first_time_handler.is_first_time:
                self.first_time_handler.mark_events_collected()
                self.first_time_handler.handle_execution_completion()
--- a/lib/crewai/src/crewai/events/listeners/tracing/utils.py
+++ b/lib/crewai/src/crewai/events/listeners/tracing/utils.py
@@ -42,6 +42,7 @@ __all__ = [
    "is_first_execution",
    "is_tracing_enabled",
    "is_tracing_enabled_in_context",
+    "is_tui_mode",
    "mark_first_execution_completed",
    "mark_first_execution_done",
    "on_first_execution_tracing_confirmation",
@@ -50,6 +51,7 @@ __all__ = [
    "safe_serialize_to_dict",
    "set_suppress_tracing_messages",
    "set_tracing_enabled",
+    "set_tui_mode",
    "should_auto_collect_first_time_traces",
    "should_enable_tracing",
    "should_suppress_tracing_messages",
@@ -71,6 +73,16 @@ _suppress_tracing_messages: ContextVar[bool] = ContextVar(
    "_suppress_tracing_messages", default=False
 )

+_tui_mode: ContextVar[bool] = ContextVar("_tui_mode", default=False)
+
+
+def set_tui_mode(enabled: bool) -> object:
+    return _tui_mode.set(enabled)
+
+
+def is_tui_mode() -> bool:
+    return _tui_mode.get()
+

 def set_suppress_tracing_messages(suppress: bool) -> object:
    """Set whether to suppress tracing-related console messages.
--- a/lib/crewai/src/crewai/events/types/observation_events.py
+++ b/lib/crewai/src/crewai/events/types/observation_events.py
@@ -26,6 +26,38 @@ class ObservationEvent(BaseEvent):
        self._set_agent_params(data)


+class PlanStepEvent(BaseEvent):
+    """Base event for authoritative plan step lifecycle updates."""
+
+    type: str
+    agent_role: str
+    step_number: int
+    step_description: str = ""
+    tool_to_use: str | None = None
+    from_task: Any | None = None
+    from_agent: Any | None = None
+
+    def __init__(self, **data: Any) -> None:
+        super().__init__(**data)
+        self._set_task_params(data)
+        self._set_agent_params(data)
+
+
+class PlanStepStartedEvent(PlanStepEvent):
+    """Emitted when a concrete plan step starts executing."""
+
+    type: Literal["plan_step_started"] = "plan_step_started"
+
+
+class PlanStepCompletedEvent(PlanStepEvent):
+    """Emitted when a concrete plan step reaches a terminal state."""
+
+    type: Literal["plan_step_completed"] = "plan_step_completed"
+    success: bool = True
+    result: str | None = None
+    error: str | None = None
+
+
 class StepObservationStartedEvent(ObservationEvent):
    """Emitted when the Planner begins observing a step's result.

--- a/lib/crewai/src/crewai/events/types/tool_usage_events.py
+++ b/lib/crewai/src/crewai/events/types/tool_usage_events.py
@@ -21,6 +21,8 @@ class ToolUsageEvent(BaseEvent):
    agent: Any | None = None
    task_name: str | None = None
    task_id: str | None = None
+    plan_step_number: int | None = None
+    plan_step_description: str | None = None
    from_task: Any | None = None
    from_agent: Any | None = None

--- a/lib/crewai/src/crewai/experimental/agent_executor.py
+++ b/lib/crewai/src/crewai/experimental/agent_executor.py
@@ -46,6 +46,8 @@ from crewai.events.types.observation_events import (
    GoalAchievedEarlyEvent,
    PlanRefinementEvent,
    PlanReplanTriggeredEvent,
+    PlanStepCompletedEvent,
+    PlanStepStartedEvent,
 )
 from crewai.events.types.tool_usage_events import (
    ToolUsageErrorEvent,
@@ -73,6 +75,7 @@ from crewai.tools.base_tool import BaseTool
 from crewai.tools.structured_tool import CrewStructuredTool
 from crewai.utilities.agent_utils import (
    _llm_stop_words_applied,
+    build_text_tool_calling_fallback_message,
    check_native_tool_support,
    enforce_rpm_limit,
    extract_tool_call_info,
@@ -86,6 +89,7 @@ from crewai.utilities.agent_utils import (
    has_reached_max_iterations,
    is_context_length_exceeded,
    is_inside_event_loop,
+    is_native_tool_calling_unsupported_error,
    is_tool_call_list,
    parse_tool_call_args,
    process_llm_response,
@@ -241,6 +245,23 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
                self._tool_name_mapping,
            ) = setup_native_tools(self.original_tools)

+    def _downgrade_to_text_tool_calling(self) -> None:
+        """Switch a running execution from native tools to text tool calls."""
+        self.state.use_native_tools = False
+        self.state.pending_tool_calls.clear()
+        self._openai_tools = []
+        self._available_functions = {}
+        if self.tools:
+            self.state.messages.append(
+                format_message_for_llm(
+                    build_text_tool_calling_fallback_message(
+                        self.tools_description,
+                        self.tools_names,
+                    ),
+                    role="user",
+                )
+            )
+
    def _is_tool_call_list(self, response: list[Any]) -> bool:
        """Check if a response is a list of tool calls."""
        return is_tool_call_list(response)
@@ -349,6 +370,84 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):

        self.state.todos = TodoList(items=todos)

+    def _emit_plan_step_started(self, todo: TodoItem) -> None:
+        try:
+            crewai_event_bus.emit(
+                self.agent,
+                event=PlanStepStartedEvent(
+                    agent_role=self.agent.role,
+                    step_number=todo.step_number,
+                    step_description=todo.description,
+                    tool_to_use=todo.tool_to_use,
+                    from_task=self.task,
+                    from_agent=self.agent,
+                ),
+            )
+        except Exception:  # noqa: S110
+            pass
+
+    def _emit_plan_step_completed(
+        self,
+        todo: TodoItem,
+        *,
+        success: bool,
+        result: str | None = None,
+        error: str | None = None,
+    ) -> None:
+        try:
+            crewai_event_bus.emit(
+                self.agent,
+                event=PlanStepCompletedEvent(
+                    agent_role=self.agent.role,
+                    step_number=todo.step_number,
+                    step_description=todo.description,
+                    tool_to_use=todo.tool_to_use,
+                    success=success,
+                    result=result,
+                    error=error,
+                    from_task=self.task,
+                    from_agent=self.agent,
+                ),
+            )
+        except Exception:  # noqa: S110
+            pass
+
+    def _mark_todo_running(self, todo: TodoItem) -> None:
+        previous_status = todo.status
+        self.state.todos.mark_running(todo.step_number)
+        if previous_status != "running":
+            self._emit_plan_step_started(todo)
+
+    def _mark_todo_completed(
+        self,
+        step_number: int,
+        result: str | None = None,
+    ) -> None:
+        todo = self.state.todos.get_by_step_number(step_number)
+        previous_status = todo.status if todo else None
+        self.state.todos.mark_completed(step_number, result=result)
+        todo = self.state.todos.get_by_step_number(step_number)
+        if todo and previous_status != "completed":
+            self._emit_plan_step_completed(todo, success=True, result=result)
+
+    def _mark_todo_failed(
+        self,
+        step_number: int,
+        result: str | None = None,
+        error: str | None = None,
+    ) -> None:
+        todo = self.state.todos.get_by_step_number(step_number)
+        previous_status = todo.status if todo else None
+        self.state.todos.mark_failed(step_number, result=result)
+        todo = self.state.todos.get_by_step_number(step_number)
+        if todo and previous_status != "failed":
+            self._emit_plan_step_completed(
+                todo,
+                success=False,
+                result=result,
+                error=error,
+            )
+
    def _ensure_step_executor(self) -> Any:
        """Lazily create the StepExecutor (avoids circular imports)."""
        if self._step_executor is None:
@@ -597,8 +696,10 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
            and not observation.step_completed_successfully
            and observation.needs_full_replan
        ):
-            self.state.todos.mark_failed(
-                current_todo.step_number, result=current_todo.result
+            self._mark_todo_failed(
+                current_todo.step_number,
+                result=current_todo.result,
+                error=observation.replan_reason,
            )
            if self.agent.verbose:
                PRINTER.print(
@@ -614,8 +715,9 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
            return "replan_now"

        if observation and not observation.step_completed_successfully:
-            self.state.todos.mark_failed(
-                current_todo.step_number, result=current_todo.result
+            self._mark_todo_failed(
+                current_todo.step_number,
+                result=current_todo.result,
            )
            if self.agent.verbose:
                failed = len(self.state.todos.get_failed_todos())
@@ -629,9 +731,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
                )
            return "continue_plan"

-        self.state.todos.mark_completed(
-            current_todo.step_number, result=current_todo.result
-        )
+        self._mark_todo_completed(current_todo.step_number, result=current_todo.result)

        if self.agent.verbose:
            completed = self.state.todos.completed_count
@@ -661,7 +761,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):

        # If observation is missing or step succeeded — continue
        if not observation or observation.step_completed_successfully:
-            self.state.todos.mark_completed(
+            self._mark_todo_completed(
                current_todo.step_number, result=current_todo.result
            )
            if self.agent.verbose:
@@ -676,8 +776,10 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
        # Step failed — only replan if observer explicitly requires it,
        # otherwise mark done and continue (same gate as low-effort).
        if observation.needs_full_replan:
-            self.state.todos.mark_failed(
-                current_todo.step_number, result=current_todo.result
+            self._mark_todo_failed(
+                current_todo.step_number,
+                result=current_todo.result,
+                error=observation.replan_reason,
            )
            if self.agent.verbose:
                PRINTER.print(
@@ -694,9 +796,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):

        # Step failed but observer does not require a full replan — mark as
        # failed (not completed) so get_failed_todos() tracks it correctly.
-        self.state.todos.mark_failed(
-            current_todo.step_number, result=current_todo.result
-        )
+        self._mark_todo_failed(current_todo.step_number, result=current_todo.result)
        if self.agent.verbose:
            failed = len(self.state.todos.get_failed_todos())
            total = len(self.state.todos.items)
@@ -731,12 +831,12 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
        observation = self.state.observations.get(current_todo.step_number)
        if not observation:
            # No observation available — default to continue
-            self.state.todos.mark_completed(current_todo.step_number)
+            self._mark_todo_completed(current_todo.step_number)
            return "continue_plan"

        # Goal already achieved — early termination
        if observation.goal_already_achieved:
-            self.state.todos.mark_completed(
+            self._mark_todo_completed(
                current_todo.step_number, result=current_todo.result
            )
            if self.agent.verbose:
@@ -748,8 +848,10 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):

        # Full replan needed
        if observation.needs_full_replan:
-            self.state.todos.mark_failed(
-                current_todo.step_number, result=current_todo.result
+            self._mark_todo_failed(
+                current_todo.step_number,
+                result=current_todo.result,
+                error=observation.replan_reason,
            )
            if self.agent.verbose:
                PRINTER.print(
@@ -761,9 +863,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):

        # Step failed — also trigger replan
        if not observation.step_completed_successfully:
-            self.state.todos.mark_failed(
-                current_todo.step_number, result=current_todo.result
-            )
+            self._mark_todo_failed(current_todo.step_number, result=current_todo.result)
            if self.agent.verbose:
                PRINTER.print(
                    content="[Decide] Step failed — triggering replan",
@@ -773,7 +873,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
            return "replan_now"

        if observation.remaining_plan_still_valid and observation.suggested_refinements:
-            self.state.todos.mark_completed(
+            self._mark_todo_completed(
                current_todo.step_number, result=current_todo.result
            )
            if self.agent.verbose:
@@ -783,9 +883,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
                )
            return "refine_and_continue"

-        self.state.todos.mark_completed(
-            current_todo.step_number, result=current_todo.result
-        )
+        self._mark_todo_completed(current_todo.step_number, result=current_todo.result)
        if self.agent.verbose:
            completed = self.state.todos.completed_count
            total = len(self.state.todos.items)
@@ -961,7 +1059,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
            return "needs_replan"

        if len(ready) == 1:
-            self.state.todos.mark_running(ready[0].step_number)
+            self._mark_todo_running(ready[0])
            return "single_todo_ready"

        return "multiple_todos_ready"
@@ -1099,7 +1197,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):

        # Mark all ready todos as running
        for todo in ready:
-            self.state.todos.mark_running(todo.step_number)
+            self._mark_todo_running(todo)

        # Build context and executor for each todo, then run in parallel
        async def _run_step(todo: TodoItem) -> tuple[TodoItem, object]:
@@ -1127,7 +1225,11 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
            if isinstance(item, BaseException):
                error_msg = f"Error: {item!s}"
                todo.result = error_msg
-                self.state.todos.mark_failed(todo.step_number, result=error_msg)
+                self._mark_todo_failed(
+                    todo.step_number,
+                    result=error_msg,
+                    error=error_msg,
+                )
                if self.agent.verbose:
                    PRINTER.print(
                        content=f"Todo {todo.step_number} failed: {error_msg}",
@@ -1197,9 +1299,9 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):

            # Mark based on observation result
            if observation.step_completed_successfully:
-                self.state.todos.mark_completed(todo.step_number, result=todo.result)
+                self._mark_todo_completed(todo.step_number, result=todo.result)
            else:
-                self.state.todos.mark_failed(todo.step_number, result=todo.result)
+                self._mark_todo_failed(todo.step_number, result=todo.result)

            if self.agent.verbose:
                PRINTER.print(
@@ -1349,7 +1451,11 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
    def call_llm_native_tools(
        self,
    ) -> Literal[
-        "native_tool_calls", "native_finished", "context_error", "todo_satisfied"
+        "native_tool_calls",
+        "native_finished",
+        "context_error",
+        "todo_satisfied",
+        "continue_reasoning",
    ]:
        """Execute LLM call with native function calling.

@@ -1428,6 +1534,9 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
            return self._route_finish_with_todos("native_finished")

        except Exception as e:
+            if is_native_tool_calling_unsupported_error(e):
+                self._downgrade_to_text_tool_calling()
+                return "continue_reasoning"
            if is_context_length_exceeded(e):
                self._last_context_error = e
                return "context_error"
@@ -2085,7 +2194,7 @@ class AgentExecutor(Flow[AgentExecutorState], BaseAgentExecutor):
            step_number: The step number to mark.
            result: The result of the todo.
        """
-        self.state.todos.mark_completed(step_number, result=result)
+        self._mark_todo_completed(step_number, result=result)

        if self.agent.verbose:
            completed = self.state.todos.completed_count
--- a/lib/crewai/src/crewai/experimental/conversational_mixin.py
+++ b/lib/crewai/src/crewai/experimental/conversational_mixin.py
@@ -47,7 +47,7 @@ from crewai.flow.conversation import (
    receive_user_message as _receive_user_message,
 )
 from crewai.flow.dsl import listen, start
-from crewai.flow.dsl._utils import _set_flow_method_definition
+from crewai.flow.dsl._utils import _method_action, _set_flow_method_definition
 from crewai.flow.flow_definition import FlowMethodDefinition
 from crewai.utilities.types import LLMMessage

@@ -78,7 +78,7 @@ def _conversation_start_router(func: Callable[..., Any]) -> Any:
    wrapper = start()(func)
    _set_flow_method_definition(
        cast(Any, wrapper),
-        FlowMethodDefinition(start=True, router=True),
+        FlowMethodDefinition(do=_method_action(func), start=True, router=True),
    )
    return wrapper

@@ -146,6 +146,10 @@ class _ConversationalMixin:
        def kickoff(self, *args: Any, **kwargs: Any) -> Any:
            pass

+        @property
+        def method_outputs(self) -> list[Any]:
+            pass
+
    def conversation_start(self) -> str | None:
        """Return the current user message for conversational route selection.

@@ -1033,7 +1037,8 @@ class _ConversationalMixin:
        # of warning about an empty scope stack.
        started_id = getattr(self, "_deferred_flow_started_event_id", None)
        if started_id:
-            last_output = self._method_outputs[-1] if self._method_outputs else None
+            method_outputs = self.method_outputs
+            last_output = method_outputs[-1] if method_outputs else None
            restore_event_scope(((started_id, "flow_started"),))
            try:
                crewai_event_bus.emit(
--- a/lib/crewai/src/crewai/flow/async_feedback/__init__.py
+++ b/lib/crewai/src/crewai/flow/async_feedback/__init__.py
@@ -20,7 +20,7 @@ Example:
        @human_feedback(
            message="Review this:",
            emit=["approved", "rejected"],
-            llm="gpt-4o-mini",
+            llm="gpt-5.4-mini",
            provider=SlackProvider(),
        )
        def review(self):
--- a/lib/crewai/src/crewai/flow/async_feedback/types.py
+++ b/lib/crewai/src/crewai/flow/async_feedback/types.py
@@ -47,7 +47,7 @@ class PendingFeedbackContext:
            method_output={"title": "Draft", "body": "..."},
            message="Please review and approve or reject:",
            emit=["approved", "rejected"],
-            llm="gpt-4o-mini",
+            llm="gpt-5.4-mini",
        )
        ```
    """
--- a/lib/crewai/src/crewai/flow/dsl/_human_feedback.py
+++ b/lib/crewai/src/crewai/flow/dsl/_human_feedback.py
@@ -3,11 +3,10 @@ from __future__ import annotations
 from collections.abc import Callable, Sequence
 from typing import TYPE_CHECKING, Any, TypeVar

-from crewai.flow.flow_definition import FlowMethodDefinition
 from crewai.flow.human_feedback import (
    HumanFeedbackConfig,
    HumanFeedbackResult,
-    _build_human_feedback_runtime_decorator,
+    _validate_human_feedback_options,
 )


@@ -21,36 +20,10 @@ F = TypeVar("F", bound=Callable[..., Any])
 __all__ = ["HumanFeedbackResult", "human_feedback"]


-def _stamp_human_feedback_metadata(
-    wrapper: Any,
-    func: Callable[..., Any],
-    config: HumanFeedbackConfig,
-) -> None:
-    for attr in [
-        "__is_flow_method__",
-        "__flow_persistence_config__",
-        "__flow_method_definition__",
-    ]:
-        if hasattr(func, attr):
-            setattr(wrapper, attr, getattr(func, attr))
-
-    wrapper.__human_feedback_config__ = config
-    wrapper.__is_flow_method__ = True
-
-    if config.emit:
-        fragment = getattr(wrapper, "__flow_method_definition__", None)
-        if isinstance(fragment, FlowMethodDefinition):
-            wrapper.__flow_method_definition__ = fragment.model_copy(
-                update={"router": True, "emit": list(config.emit)}
-            )
-
-    wrapper._human_feedback_llm = config.llm
-
-
 def human_feedback(
    message: str,
    emit: Sequence[str] | None = None,
-    llm: str | BaseLLM | None = "gpt-4o-mini",
+    llm: str | BaseLLM | None = "gpt-5.4-mini",
    default_outcome: str | None = None,
    metadata: dict[str, Any] | None = None,
    provider: HumanFeedbackProvider | None = None,
@@ -58,21 +31,18 @@ def human_feedback(
    learn_source: str = "hitl",
    learn_strict: bool = False,
 ) -> Callable[[F], F]:
-    """Decorator for Flow methods that require human feedback."""
-    runtime_decorator = _build_human_feedback_runtime_decorator(
-        message=message,
-        emit=emit,
-        llm=llm,
-        default_outcome=default_outcome,
-        metadata=metadata,
-        provider=provider,
-        learn=learn,
-        learn_source=learn_source,
-        learn_strict=learn_strict,
+    """Decorator for Flow methods that require human feedback.
+
+    The decorator is a pure metadata stamper: it records the feedback
+    configuration on the method, and the Flow engine collects and routes
+    feedback after the method completes, driven by the flow's definition.
+    """
+    _validate_human_feedback_options(
+        emit=emit, llm=llm, default_outcome=default_outcome
    )
    config = HumanFeedbackConfig(
        message=message,
-        emit=emit,
+        emit=list(emit) if emit is not None else None,
        llm=llm,
        default_outcome=default_outcome,
        metadata=metadata,
@@ -83,8 +53,7 @@ def human_feedback(
    )

    def decorator(func: F) -> F:
-        wrapper = runtime_decorator(func)
-        _stamp_human_feedback_metadata(wrapper, func, config)
-        return wrapper
+        func.__human_feedback_config__ = config  # type: ignore[attr-defined]
+        return func

    return decorator
--- a/lib/crewai/src/crewai/flow/dsl/_listen.py
+++ b/lib/crewai/src/crewai/flow/dsl/_listen.py
@@ -8,6 +8,7 @@ from crewai.flow.dsl._types import FlowMethodDecorator, FlowTrigger
 from crewai.flow.dsl._utils import (
    P,
    R,
+    _method_action,
    _set_flow_method_definition,
 )
 from crewai.flow.flow_definition import FlowMethodDefinition
@@ -45,7 +46,11 @@ def listen(condition: FlowTrigger) -> FlowMethodDecorator:
        wrapper = ListenMethod(func)

        _set_flow_method_definition(
-            wrapper, FlowMethodDefinition(listen=_to_definition_condition(condition))
+            wrapper,
+            FlowMethodDefinition(
+                do=_method_action(func),
+                listen=_to_definition_condition(condition),
+            ),
        )
        return wrapper

--- a/lib/crewai/src/crewai/flow/dsl/_router.py
+++ b/lib/crewai/src/crewai/flow/dsl/_router.py
@@ -19,6 +19,7 @@ from crewai.flow.dsl._types import FlowMethodDecorator, FlowTrigger
 from crewai.flow.dsl._utils import (
    P,
    R,
+    _method_action,
    _set_flow_method_definition,
 )
 from crewai.flow.flow_definition import FlowMethodDefinition
@@ -148,6 +149,7 @@ def router(
        _set_flow_method_definition(
            wrapper,
            FlowMethodDefinition(
+                do=_method_action(func),
                listen=_to_definition_condition(condition),
                router=True,
                emit=router_events or None,
--- a/lib/crewai/src/crewai/flow/dsl/_start.py
+++ b/lib/crewai/src/crewai/flow/dsl/_start.py
@@ -8,6 +8,7 @@ from crewai.flow.dsl._types import FlowMethodDecorator, FlowTrigger
 from crewai.flow.dsl._utils import (
    P,
    R,
+    _method_action,
    _set_flow_method_definition,
 )
 from crewai.flow.flow_definition import FlowMethodDefinition
@@ -53,13 +54,17 @@ def start(
    def decorator(func: Callable[P, R]) -> StartMethod[P, R]:
        wrapper = StartMethod(func)

-        if condition is not None:
-            _set_flow_method_definition(
-                wrapper,
-                FlowMethodDefinition(start=_to_definition_condition(condition)),
-            )
-        else:
-            _set_flow_method_definition(wrapper, FlowMethodDefinition(start=True))
+        _set_flow_method_definition(
+            wrapper,
+            FlowMethodDefinition(
+                do=_method_action(func),
+                start=(
+                    _to_definition_condition(condition)
+                    if condition is not None
+                    else True
+                ),
+            ),
+        )
        return wrapper

    return cast(FlowMethodDecorator, decorator)
--- a/lib/crewai/src/crewai/flow/dsl/_utils.py
+++ b/lib/crewai/src/crewai/flow/dsl/_utils.py
@@ -8,6 +8,8 @@ from pydantic import BaseModel
 from typing_extensions import TypeIs

 from crewai.flow.flow_definition import (
+    FlowActionDefinition,
+    FlowCodeActionDefinition,
    FlowConfigDefinition,
    FlowConversationalDefinition,
    FlowConversationalRouterDefinition,
@@ -17,6 +19,7 @@ from crewai.flow.flow_definition import (
    FlowMethodDefinition,
    FlowPersistenceDefinition,
    FlowStateDefinition,
+    _object_ref,
 )
 from crewai.flow.flow_wrappers import (
    FlowMethod,
@@ -34,15 +37,12 @@ _FLOW_METHOD_METADATA_ATTRS = [
    "__flow_method_definition__",
    "__flow_persistence_config__",
    "__human_feedback_config__",
-    "_human_feedback_llm",
 ]


 def is_flow_method(obj: Any) -> TypeIs[FlowMethod[Any, Any]]:
    """Check if the object carries Flow method wrapper metadata."""
-    return hasattr(obj, "__is_flow_method__") or hasattr(
-        obj, _FLOW_METHOD_DEFINITION_ATTR
-    )
+    return hasattr(obj, _FLOW_METHOD_DEFINITION_ATTR)


 def _should_include_flow_method(flow_class: type, method: Any) -> bool:
@@ -80,10 +80,13 @@ def _stamp_inherited_conversational_metadata(
    for attr in _FLOW_METHOD_METADATA_ATTRS:
        if hasattr(inherited, attr):
            setattr(method, attr, getattr(inherited, attr))
-    method.__is_flow_method__ = True
    return method


+def _method_action(method: Any) -> FlowActionDefinition:
+    return FlowCodeActionDefinition(ref=f"{method.__module__}:{method.__qualname__}")
+
+
 def _set_flow_method_definition(
    wrapper: FlowMethod[P, R],
    definition: FlowMethodDefinition,
@@ -100,13 +103,6 @@ def _get_flow_method_definition(method: Any) -> FlowMethodDefinition | None:
    return None


-def _object_ref(value: Any) -> str:
-    target = value if isinstance(value, type) else type(value)
-    module = getattr(target, "__module__", "")
-    qualname = getattr(target, "__qualname__", getattr(target, "__name__", ""))
-    return f"{module}:{qualname}" if module and qualname else repr(value)
-
-
 def _is_json_serializable(value: Any) -> bool:
    try:
        json.dumps(value)
@@ -214,16 +210,22 @@ def _build_config_definition(
 ) -> FlowConfigDefinition:
    config_field_names = set(FlowConfigDefinition.model_fields)
    field_defaults = {
-        name: field.default
+        name: field.get_default(call_default_factory=True)
        for name, field in getattr(flow_class, "model_fields", {}).items()
        if name in config_field_names
    }
    values: dict[str, Any] = {}
    for field_name, default in field_defaults.items():
        value = getattr(flow_class, field_name, default)
-        values[field_name] = _serialize_static_value(
-            value, diagnostics, f"config.{field_name}"
-        )
+        if field_name == "input_provider":
+            # A string value is already a ref; only live objects degrade.
+            values[field_name] = (
+                value if value is None or isinstance(value, str) else _object_ref(value)
+            )
+        else:
+            values[field_name] = _serialize_static_value(
+                value, diagnostics, f"config.{field_name}"
+            )
    return FlowConfigDefinition(**values)


@@ -239,38 +241,31 @@ def _build_human_feedback_definition(
    return FlowHumanFeedbackDefinition(
        message=str(config.message),
        emit=[str(value) for value in emit] if emit is not None else None,
-        llm=_serialize_static_value(
-            getattr(config, "llm", None), diagnostics, f"{path}.llm"
-        ),
+        # llm and provider stay live: the engine consumes them in-process and
+        # the contract degrades them to serializable forms at JSON dump time.
+        llm=getattr(config, "llm", None),
        default_outcome=getattr(config, "default_outcome", None),
        metadata=_serialize_static_value(
            getattr(config, "metadata", None), diagnostics, f"{path}.metadata"
        ),
-        provider=_serialize_static_value(
-            getattr(config, "provider", None), diagnostics, f"{path}.provider"
-        ),
+        provider=getattr(config, "provider", None),
        learn=bool(getattr(config, "learn", False)),
        learn_source=str(getattr(config, "learn_source", "hitl")),
        learn_strict=bool(getattr(config, "learn_strict", False)),
    )


-def _build_persistence_definition(
-    value: Any,
-    diagnostics: list[FlowDefinitionDiagnostic],
-    path: str,
-) -> FlowPersistenceDefinition | None:
+def _build_persistence_definition(value: Any) -> FlowPersistenceDefinition | None:
    config = getattr(value, "__flow_persistence_config__", None)
    if config is None:
        return None
-    persistence = getattr(config, "persistence", None)
-    verbose = bool(getattr(config, "verbose", False))
    return FlowPersistenceDefinition(
        enabled=True,
-        verbose=verbose,
-        persistence=_serialize_static_value(
-            persistence, diagnostics, f"{path}.persistence"
-        ),
+        verbose=bool(getattr(config, "verbose", False)),
+        # The backend stays live: the engine persists through the exact
+        # instance the user configured; the contract degrades it to a
+        # serialized config at JSON dump time.
+        persistence=getattr(config, "persistence", None),
    )


@@ -373,9 +368,11 @@ def _build_method_definition(
 ) -> FlowMethodDefinition:
    fragment = _get_flow_method_definition(method)
    if fragment is None:
-        method_definition = FlowMethodDefinition()
+        method_definition = FlowMethodDefinition(do=_method_action(method))
    else:
-        method_definition = fragment.model_copy(deep=True)
+        method_definition = fragment.model_copy(
+            deep=True, update={"do": _method_action(method)}
+        )

    human_feedback = _build_human_feedback_definition(
        method, diagnostics, f"{path}.human_feedback"
@@ -386,9 +383,7 @@ def _build_method_definition(
            method_definition.router = True
            method_definition.emit = None

-    method_definition.persist = _build_persistence_definition(
-        method, diagnostics, f"{path}.persist"
-    )
+    method_definition.persist = _build_persistence_definition(method)

    return method_definition

@@ -472,7 +467,7 @@ def _build_flow_definition_from_class(
        description=description,
        state=_build_state_definition(flow_class, diagnostics),
        config=_build_config_definition(flow_class, diagnostics),
-        persist=_build_persistence_definition(flow_class, diagnostics, "persist"),
+        persist=_build_persistence_definition(flow_class),
        conversational=_build_conversational_definition(flow_class, diagnostics),
        methods=methods,
        diagnostics=diagnostics,
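
Reviewer note on the `_utils.py` changes above: `_method_action` stamps every method with a `FlowCodeActionDefinition` whose `ref` is a `module:qualname` string. The diff does not show how such a ref is turned back into a callable, so the resolver below is only a sketch under that assumption. `make_ref` and `resolve_ref` are hypothetical names, not part of crewai.

```python
import importlib
import json
from typing import Any


def make_ref(obj: Any) -> str:
    """Format a function or class as ``module:qualname`` (same f-string as the diff)."""
    return f"{obj.__module__}:{obj.__qualname__}"


def resolve_ref(ref: str) -> Any:
    """Hypothetical resolver: import the module, then walk the dotted qualname."""
    module_name, _, qualname = ref.partition(":")
    if not module_name or not qualname:
        raise ValueError(f"expected 'module:qualname', got {ref!r}")
    target: Any = importlib.import_module(module_name)
    for part in qualname.split("."):
        # Nested attributes (Class.method) are resolved one segment at a time.
        target = getattr(target, part)
    return target


# Round trip using a stdlib method as a stand-in for a Flow method.
ref = make_ref(json.JSONDecoder.decode)
resolved = resolve_ref(ref)
```

A ref built this way only round-trips for module-level functions and class attributes. Functions defined inside another function carry `<locals>` in their `__qualname__` and cannot be re-imported by this approach.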
--- a/lib/crewai/src/crewai/flow/flow_definition.py
+++ b/lib/crewai/src/crewai/flow/flow_definition.py
@@ -13,7 +13,7 @@ import json
 import logging
 from typing import Any, Literal as TypingLiteral

-from pydantic import BaseModel, ConfigDict, Field
+from pydantic import BaseModel, ConfigDict, Field, field_serializer, model_validator
 import yaml

 from crewai.flow.conversational_definition import (
@@ -27,19 +27,31 @@ logger = logging.getLogger(__name__)
 FlowDefinitionCondition = str | dict[str, Any]

 __all__ = [
+    "FlowActionDefinition",
+    "FlowCodeActionDefinition",
    "FlowConfigDefinition",
    "FlowConversationalDefinition",
    "FlowConversationalRouterDefinition",
    "FlowDefinition",
    "FlowDefinitionCondition",
    "FlowDefinitionDiagnostic",
+    "FlowExpressionActionDefinition",
    "FlowHumanFeedbackDefinition",
    "FlowMethodDefinition",
    "FlowPersistenceDefinition",
    "FlowStateDefinition",
+    "FlowToolActionDefinition",
 ]


+def _object_ref(value: Any) -> str:
+    """Format a class or instance as the canonical ``module:qualname`` ref."""
+    target = value if isinstance(value, type) else type(value)
+    module = getattr(target, "__module__", "")
+    qualname = getattr(target, "__qualname__", getattr(target, "__name__", ""))
+    return f"{module}:{qualname}" if module and qualname else repr(value)
+
+
 class FlowDefinitionDiagnostic(BaseModel):
    """A non-fatal Flow Definition build or validation diagnostic."""

@@ -52,9 +64,10 @@ class FlowDefinitionDiagnostic(BaseModel):
 class FlowStateDefinition(BaseModel):
    """Static description of a Flow state contract."""

-    type: TypingLiteral["dict", "pydantic", "unknown"] = "dict"
+    type: TypingLiteral["dict", "pydantic", "json_schema", "unknown"] = "dict"
    ref: str | None = None
-    default: Any = None
+    json_schema: dict[str, Any] | None = None
+    default: dict[str, Any] | None = None


 class FlowConfigDefinition(BaseModel):
@@ -62,22 +75,50 @@ class FlowConfigDefinition(BaseModel):

    tracing: bool | None = None
    stream: bool = False
-    memory: Any = None
-    input_provider: Any = None
+    memory: dict[str, Any] | None = None
+    input_provider: str | None = None
    suppress_flow_events: bool = False
    max_method_calls: int = 100
+    defer_trace_finalization: bool = False
+    checkpoint: bool | dict[str, Any] | None = None


 class FlowPersistenceDefinition(BaseModel):
-    """Static persistence configuration."""
+    """Static persistence configuration.
+
+    ``persistence`` may hold a live backend when the definition is built from
+    a decorated class — the engine then persists through the exact instance
+    the user configured; the JSON/YAML projection degrades it to its
+    serialized config.
+    """

    enabled: bool = False
    verbose: bool = False
    persistence: Any = None

+    @field_serializer("persistence", when_used="json")
+    def _serialize_persistence(self, value: Any) -> Any:
+        if value is None or isinstance(value, dict):
+            return value
+        if isinstance(value, BaseModel):
+            try:
+                return value.model_dump(mode="json")
+            except Exception:
+                logger.warning(
+                    "Persistence backend %s is not fully serializable; "
+                    "preserved import reference only.",
+                    _object_ref(value),
+                )
+        return {"ref": _object_ref(value)}
+

 class FlowHumanFeedbackDefinition(BaseModel):
-    """Static human feedback configuration."""
+    """Static human feedback configuration.
+
+    ``llm`` and ``provider`` may hold live Python objects when the definition
+    is built from a decorated class; the JSON/YAML projection degrades them to
+    a serialized config (``llm``) or a ``module:qualname`` ref (``provider``).
+    """

    message: str
    emit: list[str] | None = None
@@ -89,10 +130,58 @@ class FlowHumanFeedbackDefinition(BaseModel):
    learn_source: str = "hitl"
    learn_strict: bool = False

+    @field_serializer("llm", when_used="json")
+    def _serialize_llm(self, value: Any) -> dict[str, Any] | str | None:
+        if value is None or isinstance(value, (str, dict)):
+            return value
+        from crewai.flow.human_feedback import _serialize_llm_for_context
+
+        return _serialize_llm_for_context(value)
+
+    @field_serializer("provider", when_used="json")
+    def _serialize_provider(self, value: Any) -> str | None:
+        if value is None or isinstance(value, str):
+            return value
+        return _object_ref(value)
+
+
+class FlowCodeActionDefinition(BaseModel):
+    """A Flow method action that executes importable Python code."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    call: TypingLiteral["code"] = "code"
+    ref: str
+
+
+class FlowToolActionDefinition(BaseModel):
+    """A Flow method action that invokes a CrewAI tool."""
+
+    model_config = ConfigDict(populate_by_name=True, extra="forbid")
+
+    call: TypingLiteral["tool"]
+    ref: str
+    with_: dict[str, Any] | None = Field(default=None, alias="with")
+
+
+class FlowExpressionActionDefinition(BaseModel):
+    """A Flow method action that evaluates a CEL expression."""
+
+    model_config = ConfigDict(extra="forbid")
+
+    call: TypingLiteral["expression"]
+    expr: str
+
+
+FlowActionDefinition = (
+    FlowCodeActionDefinition | FlowToolActionDefinition | FlowExpressionActionDefinition
+)
+

 class FlowMethodDefinition(BaseModel):
    """Static definition of one Flow method and its execution roles."""

+    do: FlowActionDefinition
    start: bool | FlowDefinitionCondition | None = None
    listen: FlowDefinitionCondition | None = None
    router: bool = False
@@ -100,6 +189,16 @@ class FlowMethodDefinition(BaseModel):
    human_feedback: FlowHumanFeedbackDefinition | None = None
    persist: FlowPersistenceDefinition | None = None

+    @model_validator(mode="after")
+    def _canonicalize_human_feedback_routing(self) -> FlowMethodDefinition:
+        # Canonical shape: a method whose human_feedback declares emit
+        # outcomes routes like a router, regardless of how the definition
+        # was authored.
+        if self.human_feedback is not None and self.human_feedback.emit:
+            self.router = True
+            self.emit = None
+        return self
+
    @property
    def is_start(self) -> bool:
        """Whether this method is a start method.
@@ -116,7 +215,9 @@ class FlowDefinition(BaseModel):

    model_config = ConfigDict(populate_by_name=True, arbitrary_types_allowed=True)

-    schema_: str = Field(default="crewai.flow/v1", alias="schema")
+    schema_: TypingLiteral["crewai.flow/v1"] = Field(
+        default="crewai.flow/v1", alias="schema"
+    )
    name: str
    description: str | None = None
    state: FlowStateDefinition | None = None
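
Reviewer note on the `flow_definition.py` changes above: the new `field_serializer` hooks keep live objects on the model and degrade them only when dumping to JSON. The sketch below reproduces that degradation for `persistence` with the standard library only. It copies `_object_ref` from the diff and deliberately omits the `BaseModel.model_dump` branch. `degrade_persistence` and `InMemoryBackend` are hypothetical names used for illustration.

```python
import json
from typing import Any


def object_ref(value: Any) -> str:
    """Format a class or instance as ``module:qualname`` (logic copied from the diff)."""
    target = value if isinstance(value, type) else type(value)
    module = getattr(target, "__module__", "")
    qualname = getattr(target, "__qualname__", getattr(target, "__name__", ""))
    return f"{module}:{qualname}" if module and qualname else repr(value)


def degrade_persistence(value: Any) -> Any:
    """Pass None and dicts through; reduce any other object to an import ref."""
    if value is None or isinstance(value, dict):
        return value
    return {"ref": object_ref(value)}


class InMemoryBackend:
    """Hypothetical live persistence backend that is not JSON serializable."""


live = InMemoryBackend()
# The live instance stays usable in-process; only the dumped form is degraded.
dumped = json.dumps({"enabled": True, "persistence": degrade_persistence(live)})
```

Because `object_ref` falls back to `type(value)` for instances, a class and one of its instances produce the same ref, so the dumped form identifies the backend type but not its configuration.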
--- a/lib/crewai/src/crewai/flow/flow_wrappers.py
+++ b/lib/crewai/src/crewai/flow/flow_wrappers.py
@@ -83,7 +83,6 @@ class FlowMethod(Generic[P, R]):
            "__conversational_only__",  # gates registration on Flow.conversational
            "__flow_persistence_config__",
            "__flow_method_definition__",
-            "_human_feedback_llm",  # Live LLM object for HITL resume
        ]:
            if hasattr(meth, attr):
                setattr(self, attr, getattr(meth, attr))
--- a/lib/crewai/src/crewai/flow/human_feedback.py
+++ b/lib/crewai/src/crewai/flow/human_feedback.py
@@ -1,8 +1,11 @@
-"""Human feedback decorator for Flow methods.
+"""Human feedback support for Flow methods.

-This module provides the @human_feedback decorator that enables human-in-the-loop
-workflows within CrewAI Flows. It allows collecting human feedback on method outputs
-and optionally routing to different listeners based on the feedback.
+This module backs the @human_feedback decorator that enables human-in-the-loop
+workflows within CrewAI Flows. The decorator is a pure metadata stamper: it
+records a :class:`HumanFeedbackConfig` on the method, the Flow definition
+builder lifts it into ``FlowHumanFeedbackDefinition``, and the Flow engine
+collects feedback after each decorated method completes, driven by the flow's
+definition.

 Supports both synchronous (blocking) and asynchronous (non-blocking) feedback
 collection through the provider parameter.
@@ -17,7 +20,7 @@ Example (synchronous, default):
        @human_feedback(
            message="Please review this content:",
            emit=["approved", "rejected"],
-            llm="gpt-4o-mini",
+            llm="gpt-5.4-mini",
        )
        def generate_content(self):
            return {"title": "Article", "body": "Content..."}
@@ -45,7 +48,7 @@ Example (asynchronous with custom provider):
        @human_feedback(
            message="Review this:",
            emit=["approved", "rejected"],
-            llm="gpt-4o-mini",
+            llm="gpt-5.4-mini",
            provider=SlackProvider(),
        )
        def generate_content(self):
@@ -55,22 +58,18 @@ Example (asynchronous with custom provider):

 from __future__ import annotations

-import asyncio
 from collections.abc import Callable, Sequence
 from dataclasses import dataclass, field
 from datetime import datetime
-from functools import wraps
 import logging
 from typing import TYPE_CHECKING, Any, TypeVar

 from pydantic import BaseModel, Field

-from crewai.flow.flow_wrappers import FlowMethod
-

 if TYPE_CHECKING:
    from crewai.flow.async_feedback.types import HumanFeedbackProvider
-    from crewai.flow.flow import Flow
+    from crewai.flow.runtime import Flow
    from crewai.llms.base_llm import BaseLLM


@@ -160,8 +159,8 @@ class HumanFeedbackResult:
 class HumanFeedbackConfig:
    """Configuration for the @human_feedback decorator.

-    Stores the parameters passed to the decorator for later use during
-    method execution and for introspection by visualization tools.
+    Stores the parameters passed to the decorator for later use by the
+    Flow definition builder and for introspection by visualization tools.

    Attributes:
        message: The message shown to the human when requesting feedback.
@@ -174,7 +173,7 @@ class HumanFeedbackConfig:

    message: str
    emit: Sequence[str] | None = None
-    llm: str | BaseLLM | None = "gpt-4o-mini"
+    llm: str | BaseLLM | None = "gpt-5.4-mini"
    default_outcome: str | None = None
    metadata: dict[str, Any] | None = None
    provider: HumanFeedbackProvider | None = None
@@ -183,19 +182,6 @@ class HumanFeedbackConfig:
    learn_strict: bool = False


-class HumanFeedbackMethod(FlowMethod[Any, Any]):
-    """Wrapper for methods decorated with @human_feedback.
-
-    This wrapper extends FlowMethod to add human feedback specific attributes
-    used by the FlowDefinition builder and runtime feedback handling.
-
-    Attributes:
-        __human_feedback_config__: The HumanFeedbackConfig for this method.
-    """
-
-    __human_feedback_config__: HumanFeedbackConfig | None = None
-
-
 class PreReviewResult(BaseModel):
    """Structured output from the HITL pre-review LLM call."""

@@ -217,22 +203,16 @@ class DistilledLessons(BaseModel):
    )


-def _build_human_feedback_runtime_decorator(
-    message: str,
-    emit: Sequence[str] | None = None,
-    llm: str | BaseLLM | None = "gpt-4o-mini",
-    default_outcome: str | None = None,
-    metadata: dict[str, Any] | None = None,
-    provider: HumanFeedbackProvider | None = None,
-    learn: bool = False,
-    learn_source: str = "hitl",
-    learn_strict: bool = False,
-) -> Callable[[F], F]:
+def _validate_human_feedback_options(
+    emit: Sequence[str] | None,
+    llm: Any,
+    default_outcome: str | None,
+) -> None:
    if emit is not None:
        if not llm:
            raise ValueError(
                "llm is required when emit is specified. "
-                "Provide an LLM model string (e.g., 'gpt-4o-mini') or a BaseLLM instance. "
+                "Provide an LLM model string (e.g., 'gpt-5.4-mini') or a BaseLLM instance. "
                "See the CrewAI Human-in-the-Loop (HITL) documentation for more information: "
                "https://docs.crewai.com/en/learn/human-feedback-in-flows"
            )
@@ -244,301 +224,145 @@ def _build_human_feedback_runtime_decorator(
    elif default_outcome is not None:
        raise ValueError("default_outcome requires emit to be specified.")

-    def decorator(func: F) -> F:
-        def _get_hitl_prompt(key: str) -> str:
-            from crewai.utilities.i18n import I18N_DEFAULT

-            return I18N_DEFAULT.slice(key)
+def _get_hitl_prompt(key: str) -> str:
+    from crewai.utilities.i18n import I18N_DEFAULT

-        def _resolve_llm_instance() -> Any:
-            if llm is None:
-                from crewai.llm import LLM
+    return I18N_DEFAULT.slice(key)

-                return LLM(model="gpt-4o-mini")
-            if isinstance(llm, str):
-                from crewai.llm import LLM

-                return LLM(model=llm)
-            return llm  # already a BaseLLM instance
+def _resolve_llm_instance(llm: Any) -> Any:
+    from crewai.llm import LLM

-        def _pre_review_with_lessons(
-            flow_instance: Flow[Any], method_output: Any
-        ) -> Any:
-            try:
-                mem = flow_instance.memory
-                if mem is None:
-                    return method_output
-                query = f"human feedback lessons for {func.__name__}: {method_output!s}"
-                matches = mem.recall(query, source=learn_source)
-                if not matches:
-                    return method_output
+    if llm is None:
+        return LLM(model="gpt-5.4-mini")
+    if isinstance(llm, str):
+        return LLM(model=llm)
+    if isinstance(llm, dict):
+        deserialized = _deserialize_llm_from_context(llm)
+        return deserialized if deserialized is not None else LLM(model="gpt-5.4-mini")
+    return llm  # already a BaseLLM instance

-                lessons = "\n".join(f"- {m.record.content}" for m in matches)
-                llm_inst = _resolve_llm_instance()
-                prompt = _get_hitl_prompt("hitl_pre_review_user").format(
-                    output=str(method_output),
-                    lessons=lessons,
-                )
-                messages = [
-                    {
-                        "role": "system",
-                        "content": _get_hitl_prompt("hitl_pre_review_system"),
-                    },
-                    {"role": "user", "content": prompt},
-                ]
-                if getattr(llm_inst, "supports_function_calling", lambda: False)():
-                    response = llm_inst.call(messages, response_model=PreReviewResult)
-                    if isinstance(response, PreReviewResult):
-                        return response.improved_output
-                    return PreReviewResult.model_validate(response).improved_output
-                reviewed = llm_inst.call(messages)
-                return reviewed if isinstance(reviewed, str) else str(reviewed)
-            except Exception:
-                if learn_strict:
-                    logger.warning(
-                        "HITL pre-review failed for %s; re-raising (learn_strict=True)",
-                        func.__name__,
-                        exc_info=True,
-                    )
-                    raise
-                logger.warning(
-                    "HITL pre-review failed for %s; falling back to raw output",
-                    func.__name__,
-                    exc_info=True,
-                )
-                return method_output

-        def _distill_and_store_lessons(
-            flow_instance: Flow[Any], method_output: Any, raw_feedback: str
-        ) -> None:
-            try:
-                mem = flow_instance.memory
-                if mem is None:
-                    return
-                llm_inst = _resolve_llm_instance()
-                prompt = _get_hitl_prompt("hitl_distill_user").format(
-                    method_name=func.__name__,
-                    output=str(method_output),
-                    feedback=raw_feedback,
-                )
-                messages = [
-                    {
-                        "role": "system",
-                        "content": _get_hitl_prompt("hitl_distill_system"),
-                    },
-                    {"role": "user", "content": prompt},
-                ]
+def _pre_review_with_lessons(
+    flow_instance: Flow[Any],
+    method_name: str,
+    method_output: Any,
+    *,
+    llm: Any,
+    learn_source: str,
+    learn_strict: bool,
+) -> Any:
+    try:
+        mem = flow_instance.memory
+        if mem is None:
+            return method_output
+        query = f"human feedback lessons for {method_name}: {method_output!s}"
+        matches = mem.recall(query, source=learn_source)
+        if not matches:
+            return method_output

-                lessons: list[str] = []
-                if getattr(llm_inst, "supports_function_calling", lambda: False)():
-                    response = llm_inst.call(messages, response_model=DistilledLessons)
-                    if isinstance(response, DistilledLessons):
-                        lessons = response.lessons
-                    else:
-                        lessons = DistilledLessons.model_validate(response).lessons
-                else:
-                    response = llm_inst.call(messages)
-                    if isinstance(response, str):
-                        lessons = [
-                            line.strip("- ").strip()
-                            for line in response.strip().split("\n")
-                            if line.strip() and line.strip() != "NONE"
-                        ]
-
-                if lessons:
-                    mem.remember_many(lessons, source=learn_source)  # type: ignore[union-attr]
-            except Exception:
-                if learn_strict:
-                    logger.warning(
-                        "HITL lesson distillation failed for %s; re-raising (learn_strict=True)",
-                        func.__name__,
-                        exc_info=True,
-                    )
-                    raise
-                logger.warning(
-                    "HITL lesson distillation failed for %s; no lessons stored",
-                    func.__name__,
-                    exc_info=True,
-                )
-
-        def _build_feedback_context(
-            flow_instance: Flow[Any], method_output: Any
-        ) -> tuple[Any, Any]:
-            from crewai.flow.async_feedback.types import PendingFeedbackContext
-
-            context = PendingFeedbackContext(
-                flow_id=flow_instance.flow_id or "unknown",
-                flow_class=f"{flow_instance.__class__.__module__}.{flow_instance.__class__.__name__}",
-                method_name=func.__name__,
-                method_output=method_output,
-                message=message,
-                emit=list(emit) if emit else None,
-                default_outcome=default_outcome,
-                metadata=metadata or {},
-                llm=llm if isinstance(llm, str) else _serialize_llm_for_context(llm),
+        lessons = "\n".join(f"- {m.record.content}" for m in matches)
+        llm_inst = _resolve_llm_instance(llm)
+        prompt = _get_hitl_prompt("hitl_pre_review_user").format(
+            output=str(method_output),
+            lessons=lessons,
+        )
+        messages = [
+            {
+                "role": "system",
+                "content": _get_hitl_prompt("hitl_pre_review_system"),
+            },
+            {"role": "user", "content": prompt},
+        ]
+        if getattr(llm_inst, "supports_function_calling", lambda: False)():
+            response = llm_inst.call(messages, response_model=PreReviewResult)
+            if isinstance(response, PreReviewResult):
+                return response.improved_output
+            return PreReviewResult.model_validate(response).improved_output
+        reviewed = llm_inst.call(messages)
+        return reviewed if isinstance(reviewed, str) else str(reviewed)
+    except Exception:
+        if learn_strict:
+            logger.warning(
+                "HITL pre-review failed for %s; re-raising (learn_strict=True)",
+                method_name,
+                exc_info=True,
            )
+            raise
+        logger.warning(
+            "HITL pre-review failed for %s; falling back to raw output",
+            method_name,
+            exc_info=True,
+        )
+        return method_output

-            effective_provider = provider
-            if effective_provider is None:
-                from crewai.flow.flow_config import flow_config

-                effective_provider = flow_config.hitl_provider
+def _distill_and_store_lessons(
+    flow_instance: Flow[Any],
+    method_name: str,
+    method_output: Any,
+    raw_feedback: str,
+    *,
+    llm: Any,
+    learn_source: str,
+    learn_strict: bool,
+) -> None:
+    try:
+        mem = flow_instance.memory
+        if mem is None:
+            return
+        llm_inst = _resolve_llm_instance(llm)
+        prompt = _get_hitl_prompt("hitl_distill_user").format(
+            method_name=method_name,
+            output=str(method_output),
+            feedback=raw_feedback,
+        )
+        messages = [
+            {
+                "role": "system",
+                "content": _get_hitl_prompt("hitl_distill_system"),
+            },
+            {"role": "user", "content": prompt},
+        ]

-            return context, effective_provider
-
-        def _request_feedback(flow_instance: Flow[Any], method_output: Any) -> str:
-            context, effective_provider = _build_feedback_context(
-                flow_instance, method_output
-            )
-
-            if effective_provider is not None:
-                feedback_result = effective_provider.request_feedback(
-                    context, flow_instance
-                )
-                if asyncio.iscoroutine(feedback_result):
-                    raise TypeError(
-                        f"Provider {type(effective_provider).__name__}.request_feedback() "
-                        "returned a coroutine in a sync flow method. Use an async flow "
-                        "method or a synchronous provider."
-                    )
-                return str(feedback_result)
-            return flow_instance._request_human_feedback(
-                message=message,
-                output=method_output,
-                metadata=metadata,
-                emit=emit,
-            )
-
-        async def _request_feedback_async(
-            flow_instance: Flow[Any], method_output: Any
-        ) -> str:
-            context, effective_provider = _build_feedback_context(
-                flow_instance, method_output
-            )
-
-            if effective_provider is not None:
-                feedback_result = effective_provider.request_feedback(
-                    context, flow_instance
-                )
-                if asyncio.iscoroutine(feedback_result):
-                    return str(await feedback_result)
-                return str(feedback_result)
-            return flow_instance._request_human_feedback(
-                message=message,
-                output=method_output,
-                metadata=metadata,
-                emit=emit,
-            )
-
-        def _process_feedback(
-            flow_instance: Flow[Any],
-            method_output: Any,
-            raw_feedback: str,
-        ) -> HumanFeedbackResult | str:
-            collapsed_outcome: str | None = None
-
-            if not raw_feedback.strip():
-                if default_outcome:
-                    collapsed_outcome = default_outcome
-                elif emit:
-                    collapsed_outcome = emit[0]
-            elif emit:
-                if llm is not None:
-                    collapsed_outcome = flow_instance._collapse_to_outcome(
-                        feedback=raw_feedback,
-                        outcomes=emit,
-                        llm=llm,
-                    )
-                else:
-                    collapsed_outcome = emit[0]
-
-            result = HumanFeedbackResult(
-                output=method_output,
-                feedback=raw_feedback,
-                outcome=collapsed_outcome,
-                timestamp=datetime.now(),
-                method_name=func.__name__,
-                metadata=metadata or {},
-            )
-
-            flow_instance.human_feedback_history.append(result)
-            flow_instance.last_human_feedback = result
-
-            if emit:
-                if collapsed_outcome is None:
-                    collapsed_outcome = default_outcome or emit[0]
-                    result.outcome = collapsed_outcome
-                return collapsed_outcome
-            return result
-
-        if asyncio.iscoroutinefunction(func):
-
-            @wraps(func)
-            async def async_wrapper(self: Flow[Any], *args: Any, **kwargs: Any) -> Any:
-                method_output = await func(self, *args, **kwargs)
-
-                if learn and getattr(self, "memory", None) is not None:
-                    method_output = _pre_review_with_lessons(self, method_output)
-
-                raw_feedback = await _request_feedback_async(self, method_output)
-                result = _process_feedback(self, method_output, raw_feedback)
-
-                if (
-                    learn
-                    and getattr(self, "memory", None) is not None
-                    and raw_feedback.strip()
-                ):
-                    _distill_and_store_lessons(self, method_output, raw_feedback)
-
-                # Stash the real method output for final flow result when emit is set:
-                # result is the collapsed outcome string for routing, but we preserve the
-                # actual method output as the flow's final result. Uses per-method dict for
-                # concurrency safety and to handle None returns.
-                if emit:
-                    self._human_feedback_method_outputs[func.__name__] = method_output
-
-                return result
-
-            wrapper: Any = async_wrapper
+        lessons: list[str] = []
+        if getattr(llm_inst, "supports_function_calling", lambda: False)():
+            response = llm_inst.call(messages, response_model=DistilledLessons)
+            if isinstance(response, DistilledLessons):
+                lessons = response.lessons
+            else:
+                lessons = DistilledLessons.model_validate(response).lessons
        else:
+            response = llm_inst.call(messages)
+            if isinstance(response, str):
+                lessons = [
+                    line.strip("- ").strip()
+                    for line in response.strip().split("\n")
+                    if line.strip() and line.strip() != "NONE"
+                ]

-            @wraps(func)
-            def sync_wrapper(self: Flow[Any], *args: Any, **kwargs: Any) -> Any:
-                method_output = func(self, *args, **kwargs)
-
-                if learn and getattr(self, "memory", None) is not None:
-                    method_output = _pre_review_with_lessons(self, method_output)
-
-                raw_feedback = _request_feedback(self, method_output)
-                result = _process_feedback(self, method_output, raw_feedback)
-
-                if (
-                    learn
-                    and getattr(self, "memory", None) is not None
-                    and raw_feedback.strip()
-                ):
-                    _distill_and_store_lessons(self, method_output, raw_feedback)
-
-                # Stash the real method output for final flow result when emit is set:
-                # result is the collapsed outcome string for routing, but we preserve the
-                # actual method output as the flow's final result. Uses per-method dict for
-                # concurrency safety and to handle None returns.
-                if emit:
-                    self._human_feedback_method_outputs[func.__name__] = method_output
-
-                return result
-
-            wrapper = sync_wrapper
-
-        return wrapper  # type: ignore[no-any-return]
-
-    return decorator
+        if lessons:
+            mem.remember_many(lessons, source=learn_source)  # type: ignore[union-attr]
+    except Exception:
+        if learn_strict:
+            logger.warning(
+                "HITL lesson distillation failed for %s; re-raising (learn_strict=True)",
+                method_name,
+                exc_info=True,
+            )
+            raise
+        logger.warning(
+            "HITL lesson distillation failed for %s; no lessons stored",
+            method_name,
+            exc_info=True,
+        )


 def human_feedback(
    message: str,
    emit: Sequence[str] | None = None,
-    llm: str | BaseLLM | None = "gpt-4o-mini",
+    llm: str | BaseLLM | None = "gpt-5.4-mini",
    default_outcome: str | None = None,
    metadata: dict[str, Any] | None = None,
    provider: HumanFeedbackProvider | None = None,
--- a/lib/crewai/src/crewai/flow/persistence/decorators.py
+++ b/lib/crewai/src/crewai/flow/persistence/decorators.py
@@ -24,12 +24,10 @@ Example:

 from __future__ import annotations

-import asyncio
 from collections.abc import Callable
-import functools
 import logging
 from types import SimpleNamespace
-from typing import TYPE_CHECKING, Any, Final, TypeVar, cast
+from typing import TYPE_CHECKING, Any, Final, TypeVar

 from crewai_core.printer import PRINTER
 from pydantic import BaseModel
@@ -39,7 +37,7 @@ from crewai.flow.persistence.factory import default_flow_persistence


 if TYPE_CHECKING:
-    from crewai.flow.flow import Flow
+    from crewai.flow.runtime import Flow


 logger = logging.getLogger(__name__)
@@ -66,14 +64,6 @@ def _stamp_persistence_metadata(
    )


-_PRESERVED_FLOW_ATTRS: Final[tuple[str, ...]] = (
-    "__human_feedback_config__",
-    "__flow_persistence_config__",
-    "__flow_method_definition__",
-    "_human_feedback_llm",
-)
-
-
 class PersistenceDecorator:
    """Class to handle flow state persistence with consistent logging."""

@@ -164,6 +154,10 @@ def persist(
    states. When applied at the method level, it persists only that method's
    state.

+    The decorator is a pure metadata stamper: it records the persistence
+    configuration on the class or method, and the Flow engine saves state
+    after each persisted method completes, driven by the flow's definition.
+
    Args:
        persistence: Optional FlowPersistence implementation to use.
                    If not provided, uses ``default_flow_persistence()`` (the
@@ -191,122 +185,7 @@ def persist(
            persistence if persistence is not None else default_flow_persistence()
        )

-        if isinstance(target, type):
-            _stamp_persistence_metadata(target, actual_persistence, verbose)
-            original_init = target.__init__  # type: ignore[misc]
-
-            @functools.wraps(original_init)
-            def new_init(self: Any, *args: Any, **kwargs: Any) -> None:
-                if "persistence" not in kwargs:
-                    kwargs["persistence"] = actual_persistence
-                original_init(self, *args, **kwargs)
-
-            target.__init__ = new_init  # type: ignore[misc]
-
-            # Preserve original methods' decorators
-            original_methods = {
-                name: method
-                for name, method in target.__dict__.items()
-                if callable(method)
-                and (
-                    hasattr(method, "__is_flow_method__")
-                    or hasattr(method, "__flow_method_definition__")
-                )
-            }
-
-            for name, method in original_methods.items():
-                if asyncio.iscoroutinefunction(method):
-                    # Closure captures the current name and method
-                    def create_async_wrapper(
-                        method_name: str, original_method: Callable[..., Any]
-                    ) -> Callable[..., Any]:
-                        @functools.wraps(original_method)
-                        async def method_wrapper(
-                            self: Any, *args: Any, **kwargs: Any
-                        ) -> Any:
-                            result = await original_method(self, *args, **kwargs)
-                            PersistenceDecorator.persist_state(
-                                self, method_name, actual_persistence, verbose
-                            )
-                            return result
-
-                        return method_wrapper
-
-                    wrapped = create_async_wrapper(name, method)
-
-                    for attr in _PRESERVED_FLOW_ATTRS:
-                        if hasattr(method, attr):
-                            setattr(wrapped, attr, getattr(method, attr))
-                    wrapped.__is_flow_method__ = True  # type: ignore[attr-defined]
-
-                    setattr(target, name, wrapped)
-                else:
-
-                    def create_sync_wrapper(
-                        method_name: str, original_method: Callable[..., Any]
-                    ) -> Callable[..., Any]:
-                        @functools.wraps(original_method)
-                        def method_wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
-                            result = original_method(self, *args, **kwargs)
-                            PersistenceDecorator.persist_state(
-                                self, method_name, actual_persistence, verbose
-                            )
-                            return result
-
-                        return method_wrapper
-
-                    wrapped = create_sync_wrapper(name, method)
-
-                    for attr in _PRESERVED_FLOW_ATTRS:
-                        if hasattr(method, attr):
-                            setattr(wrapped, attr, getattr(method, attr))
-                    wrapped.__is_flow_method__ = True  # type: ignore[attr-defined]
-
-                    setattr(target, name, wrapped)
-
-            return target
-        method = target
-        method.__is_flow_method__ = True  # type: ignore[attr-defined]
-        _stamp_persistence_metadata(method, actual_persistence, verbose)
-
-        if asyncio.iscoroutinefunction(method):
-
-            @functools.wraps(method)
-            async def method_async_wrapper(
-                flow_instance: Any, *args: Any, **kwargs: Any
-            ) -> T:
-                method_coro = method(flow_instance, *args, **kwargs)
-                if asyncio.iscoroutine(method_coro):
-                    result = await method_coro
-                else:
-                    result = method_coro
-                PersistenceDecorator.persist_state(
-                    flow_instance, method.__name__, actual_persistence, verbose
-                )
-                return cast(T, result)
-
-            for attr in _PRESERVED_FLOW_ATTRS:
-                if hasattr(method, attr):
-                    setattr(method_async_wrapper, attr, getattr(method, attr))
-            method_async_wrapper.__is_flow_method__ = True  # type: ignore[attr-defined]
-            _stamp_persistence_metadata(
-                method_async_wrapper, actual_persistence, verbose
-            )
-            return cast(Callable[..., T], method_async_wrapper)
-
-        @functools.wraps(method)
-        def method_sync_wrapper(flow_instance: Any, *args: Any, **kwargs: Any) -> T:
-            result = method(flow_instance, *args, **kwargs)
-            PersistenceDecorator.persist_state(
-                flow_instance, method.__name__, actual_persistence, verbose
-            )
-            return result
-
-        for attr in _PRESERVED_FLOW_ATTRS:
-            if hasattr(method, attr):
-                setattr(method_sync_wrapper, attr, getattr(method, attr))
-        method_sync_wrapper.__is_flow_method__ = True  # type: ignore[attr-defined]
-        _stamp_persistence_metadata(method_sync_wrapper, actual_persistence, verbose)
-        return cast(Callable[..., T], method_sync_wrapper)
+        _stamp_persistence_metadata(target, actual_persistence, verbose)
+        return target

    return decorator
--- a/lib/crewai/src/crewai/flow/runtime/__init__.py
+++ b/lib/crewai/src/crewai/flow/runtime/__init__.py
--- a/lib/crewai/src/crewai/flow/runtime/_expressions.py
+++ b/lib/crewai/src/crewai/flow/runtime/_expressions.py
@@ -0,0 +1,144 @@
+"""Runtime expression support for FlowDefinition CEL expressions."""
+
+from __future__ import annotations
+
+import copy
+import dataclasses
+from itertools import pairwise
+import json
+import re
+from typing import TYPE_CHECKING, Any, cast
+
+from pydantic import BaseModel
+
+
+if TYPE_CHECKING:
+    from crewai.flow.runtime import Flow
+
+
+_EXPRESSION_PATTERN = re.compile(r"\$\{([^{}]*)\}")
+
+__all__ = ["FlowExpressionError", "evaluate_expression", "render_with_block"]
+
+
+class FlowExpressionError(ValueError):
+    """A FlowDefinition expression failed to parse or evaluate."""
+
+
+def render_with_block(flow: Flow[Any], value: Any) -> Any:
+    """Render CEL expressions inside a FlowDefinition ``with:`` payload."""
+    context = _expression_context(flow)
+    return _render_value(value, context)
+
+
+def evaluate_expression(flow: Flow[Any], expression: str) -> Any:
+    """Evaluate a FlowDefinition CEL expression against runtime context."""
+    expression = expression.strip()
+    if not expression:
+        raise FlowExpressionError("empty CEL expression")
+    return _eval_cel(expression, _expression_context(flow))
+
+
+def _expression_context(flow: Flow[Any]) -> dict[str, Any]:
+    return {
+        "state": flow._copy_and_serialize_state(),
+        "outputs": _outputs_by_name(flow._method_outputs),
+    }
+
+
+def _outputs_by_name(method_outputs: list[Any]) -> dict[str, Any]:
+    outputs: dict[str, Any] = {}
+    for entry in method_outputs:
+        method = ""
+        output = entry
+        if isinstance(entry, dict) and "output" in entry:
+            method = str(entry.get("method", ""))
+            output = entry["output"]
+        output = copy.deepcopy(output)
+        if isinstance(output, BaseModel):
+            output = output.model_dump(mode="json")
+        elif dataclasses.is_dataclass(output) and not isinstance(output, type):
+            output = dataclasses.asdict(output)
+        outputs[method] = output
+    return outputs
+
+
+def _render_value(value: Any, context: dict[str, Any]) -> Any:
+    if isinstance(value, str):
+        return _render_string(value, context)
+    if isinstance(value, dict):
+        return {key: _render_value(item, context) for key, item in value.items()}
+    if isinstance(value, list):
+        return [_render_value(item, context) for item in value]
+    return value
+
+
+def _render_string(value: str, context: dict[str, Any]) -> Any:
+    matches = list(_EXPRESSION_PATTERN.finditer(value))
+    if not matches:
+        _raise_for_invalid_interpolation(value)
+        return value
+
+    _raise_for_literal_braces(value[: matches[0].start()])
+    for previous, current in pairwise(matches):
+        _raise_for_literal_braces(value[previous.end() : current.start()])
+    _raise_for_literal_braces(value[matches[-1].end() :])
+
+    if len(matches) == 1 and matches[0].span() == (0, len(value)):
+        expression = matches[0].group(1).strip()
+        if not expression:
+            raise FlowExpressionError("empty CEL expression in with block")
+        return _eval_cel(expression, context)
+
+    rendered: list[str] = []
+    position = 0
+    for match in matches:
+        start, end = match.span()
+        literal = value[position:start]
+        rendered.append(literal)
+
+        expression = match.group(1).strip()
+        if not expression:
+            raise FlowExpressionError("empty CEL expression in with block")
+        result = _eval_cel(expression, context)
+        rendered.append(result if isinstance(result, str) else json.dumps(result))
+        position = end
+
+    literal = value[position:]
+    rendered.append(literal)
+
+    return "".join(rendered)
+
+
+def _raise_for_invalid_interpolation(value: str) -> None:
+    if "${" not in value:
+        return
+    raise FlowExpressionError(
+        "invalid CEL interpolation in with block: expressions must be enclosed "
+        "as ${...} and cannot contain braces"
+    )
+
+
+def _raise_for_literal_braces(value: str) -> None:
+    if "{" not in value and "}" not in value:
+        return
+    raise FlowExpressionError(
+        "invalid CEL interpolation in with block: expressions must be enclosed "
+        "as ${...} and cannot contain braces"
+    )
+
+
+def _eval_cel(expression: str, context: dict[str, Any]) -> Any:
+    try:
+        from celpy import Environment
+        from celpy.adapter import CELJSONEncoder, json_to_cel
+        from celpy.evaluation import Context
+
+        environment = Environment()
+        program = environment.program(environment.compile(expression))
+        result = program.evaluate(cast(Context, json_to_cel(context)))
+        return json.loads(json.dumps(result, cls=CELJSONEncoder))
+    except Exception as e:
+        raise FlowExpressionError(
+            f"failed to evaluate CEL expression {expression!r}: {e}"
+        ) from e
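
Reviewer note: the rendering rule in `_render_string` is easy to miss. A string that is exactly one `${...}` keeps the evaluated value's type, while any mixed string is spliced back into text, with non-string results passed through `json.dumps`. The sketch below illustrates only that rule. `lookup` is a hypothetical dotted-path stand-in for the real `celpy` evaluation, and the stray-brace validation from the diff is left out.

```python
import json
import re
from typing import Any

# Same pattern as the diff: ${...} with no nested braces.
_EXPRESSION_PATTERN = re.compile(r"\$\{([^{}]*)\}")


def lookup(expression: str, context: dict[str, Any]) -> Any:
    """Hypothetical stand-in for CEL: resolve a dotted path like 'state.count'."""
    value: Any = context
    for part in expression.split("."):
        value = value[part]
    return value


def render_string(value: str, context: dict[str, Any]) -> Any:
    matches = list(_EXPRESSION_PATTERN.finditer(value))
    if not matches:
        return value

    # Whole-string expression: return the evaluated value with its type intact.
    if len(matches) == 1 and matches[0].span() == (0, len(value)):
        return lookup(matches[0].group(1).strip(), context)

    # Mixed string: splice each result into the text, JSON-encoding non-strings.
    def splice(match: re.Match[str]) -> str:
        result = lookup(match.group(1).strip(), context)
        return result if isinstance(result, str) else json.dumps(result)

    return _EXPRESSION_PATTERN.sub(splice, value)


context = {
    "state": {"count": 3, "name": "Ada"},
    "outputs": {"fetch": {"ids": [1, 2]}},
}
print(render_string("${state.count}", context))
print(render_string("n=${state.count} for ${state.name}", context))
print(render_string("ids: ${outputs.fetch.ids}", context))
```

The practical consequence for `with:` payloads is that `"${state.count}"` reaches a tool as an integer, whereas `"n=${state.count}"` always reaches it as a string.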
--- a/lib/crewai/src/crewai/flow/runtime/_resolvers.py
+++ b/lib/crewai/src/crewai/flow/runtime/_resolvers.py
@@ -0,0 +1,116 @@
+"""Resolution of FlowDefinition refs (``module:qualname``) into live objects.
+
+Every ref-shaped value in a definition — ``do`` actions, ``state.ref``,
+``config.input_provider``, ``human_feedback.provider`` — resolves through
+:func:`resolve_ref`. Failures are loud and name the field and the ref.
+"""
+
+from __future__ import annotations
+
+from collections.abc import Callable
+import importlib
+import inspect
+from operator import attrgetter
+from typing import TYPE_CHECKING, Any, cast
+
+from crewai.flow.flow_definition import (
+    FlowActionDefinition,
+    FlowCodeActionDefinition,
+    FlowExpressionActionDefinition,
+    FlowToolActionDefinition,
+)
+from crewai.flow.runtime._expressions import evaluate_expression, render_with_block
+
+
+if TYPE_CHECKING:
+    from crewai.flow.runtime import Flow
+
+
+class InvalidRefError(ValueError):
+    """A definition ref that cannot be resolved to a live object."""
+
+
+def resolve_ref(ref: str, *, field: str) -> Any:
+    """Import the object a definition's `module:qualname` ref points to."""
+    module_name, _, qualname = ref.partition(":")
+    if "<" in ref or not module_name or not qualname:
+        raise InvalidRefError(
+            f"invalid {field} ref {ref!r}; expected 'module:qualname'"
+        )
+    try:
+        return attrgetter(qualname)(importlib.import_module(module_name))
+    except (ImportError, AttributeError) as e:
+        raise InvalidRefError(f"unresolvable {field} ref {ref!r}") from e
+
+
+def resolve_instance_ref(ref: str, *, field: str) -> Any:
+    """Resolve a ref, auto-instantiating a no-arg class into an instance."""
+    target = resolve_ref(ref, field=field)
+    if not inspect.isclass(target):
+        return target
+    try:
+        return target()
+    except Exception as e:
+        raise InvalidRefError(
+            f"cannot instantiate {field} ref {ref!r} without arguments: {e}"
+        ) from e
+
+
+def _resolve_code_action(
+    flow: Flow[Any], action: FlowCodeActionDefinition
+) -> Callable[..., Any]:
+    ref = action.ref
+    target = resolve_ref(ref, field="do")
+    if not callable(target):
+        raise InvalidRefError(f"invalid do ref {ref!r}; object is not callable")
+    handler = cast(Callable[..., Any], target)
+    if getattr(handler, "__self__", None) is None:
+        handler = handler.__get__(flow, type(flow))
+    return handler
+
+
+def _resolve_tool_action(
+    flow: Flow[Any], action: FlowToolActionDefinition
+) -> Callable[..., Any]:
+    target = resolve_ref(action.ref, field="do")
+    from crewai.tools import BaseTool
+
+    if not (inspect.isclass(target) and issubclass(target, BaseTool)):
+        raise InvalidRefError(
+            f"invalid tool ref {action.ref!r}; expected a BaseTool class"
+        )
+
+    try:
+        tool_cls = cast(Callable[[], BaseTool], target)
+        tool = tool_cls()
+    except Exception as e:
+        raise InvalidRefError(
+            f"cannot instantiate tool ref {action.ref!r} without arguments: {e}"
+        ) from e
+
+    tool_kwargs = action.with_ or {}
+
+    def run_tool(*_args: Any, **_kwargs: Any) -> Any:
+        return tool.run(**render_with_block(flow, tool_kwargs))
+
+    return run_tool
+
+
+def _resolve_expression_action(
+    flow: Flow[Any], action: FlowExpressionActionDefinition
+) -> Callable[..., Any]:
+    def run_expression(*_args: Any, **_kwargs: Any) -> Any:
+        return evaluate_expression(flow, action.expr)
+
+    return run_expression
+
+
+def resolve_action(flow: Flow[Any], action: FlowActionDefinition) -> Callable[..., Any]:
+    """Turn one `do:` action into the callable the flow runs for that node."""
+    if action.call == "code":
+        return _resolve_code_action(flow, action)
+    if action.call == "tool":
+        return _resolve_tool_action(flow, action)
+    if action.call == "expression":
+        return _resolve_expression_action(flow, action)
+    raise ValueError(f"unknown call type {action.call!r}")
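
Reviewer note: a quick way to see what `resolve_ref` accepts and rejects. The function body below is copied from the hunk above so the snippet runs on its own. The refs used (`json:dumps`, `collections:OrderedDict.fromkeys`) are illustrative stdlib targets chosen for this sketch, not refs taken from any real flow definition.

```python
import importlib
from operator import attrgetter
from typing import Any


class InvalidRefError(ValueError):
    """A definition ref that cannot be resolved to a live object."""


def resolve_ref(ref: str, *, field: str) -> Any:
    """Import the object a 'module:qualname' ref points to (as in the diff)."""
    module_name, _, qualname = ref.partition(":")
    if "<" in ref or not module_name or not qualname:
        raise InvalidRefError(
            f"invalid {field} ref {ref!r}; expected 'module:qualname'"
        )
    try:
        # attrgetter follows dotted qualnames such as 'OrderedDict.fromkeys'.
        return attrgetter(qualname)(importlib.import_module(module_name))
    except (ImportError, AttributeError) as e:
        raise InvalidRefError(f"unresolvable {field} ref {ref!r}") from e


# Plain function and nested attribute both resolve.
dumps = resolve_ref("json:dumps", field="do")
fromkeys = resolve_ref("collections:OrderedDict.fromkeys", field="do")
print(dumps({"ok": True}), list(fromkeys("ab")))

# Missing separator, missing attribute, and '<locals>'-style refs all fail loudly.
for bad in ("json", "json:nope", "pkg:outer.<locals>.inner"):
    try:
        resolve_ref(bad, field="do")
    except InvalidRefError as err:
        print(err)
```

Rejecting any ref containing `<` rules out qualnames such as `outer.<locals>.inner`, which cannot be re-imported from a serialized definition.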
--- a/lib/crewai/src/crewai/lite_agent.py
+++ b/lib/crewai/src/crewai/lite_agent.py
@@ -390,7 +390,10 @@ class LiteAgent(FlowTrackable, BaseModel):
        if self.memory is True:
            from crewai.memory.unified_memory import Memory

-            object.__setattr__(self, "_memory", Memory())
+            memory_kwargs: dict[str, Any] = {}
+            if self.llm is not None:
+                memory_kwargs["llm"] = self.llm
+            object.__setattr__(self, "_memory", Memory(**memory_kwargs))
        elif self.memory is not None and self.memory is not False:
            object.__setattr__(self, "_memory", self.memory)
        else:
--- a/lib/crewai/src/crewai/llm.py
+++ b/lib/crewai/src/crewai/llm.py
@@ -68,7 +68,17 @@ if TYPE_CHECKING:
    from crewai.tools.base_tool import BaseTool
    from crewai.utilities.types import LLMMessage

-try:
+load_dotenv()
+logger = logging.getLogger(__name__)
+
+# litellm is lazy-loaded to avoid its module-level dotenv.load_dotenv()
+# from polluting env vars (e.g. MODEL= overriding embedder model_name).
+# The TYPE_CHECKING imports give mypy the real types; at runtime the names
+# stay None until _ensure_litellm() rebinds them.
+_litellm_loaded = False
+LITELLM_AVAILABLE = False
+
+if TYPE_CHECKING:
    import litellm
    from litellm.litellm_core_utils.get_supported_openai_params import (
        get_supported_openai_params,
@@ -85,28 +95,70 @@ try:
        StreamingChoices as LiteLLMStreamingChoices,
    )
    from litellm.utils import supports_response_schema
-
-    LITELLM_AVAILABLE = True
-except ImportError:
-    LITELLM_AVAILABLE = False
-    litellm = None  # type: ignore[assignment]
-    Choices = None  # type: ignore[assignment, misc]
-    LiteLLMDelta = None  # type: ignore[assignment, misc]
-    Message = None  # type: ignore[assignment, misc]
-    ModelResponseBase = None  # type: ignore[assignment, misc]
-    ModelResponseStream = None  # type: ignore[assignment, misc]
-    LiteLLMStreamingChoices = None  # type: ignore[assignment, misc]
-    get_supported_openai_params = None  # type: ignore[assignment]
-    ChatCompletionDeltaToolCall = None  # type: ignore[assignment, misc]
-    Function = None  # type: ignore[assignment, misc]
-    ModelResponse = None  # type: ignore[assignment, misc]
-    supports_response_schema = None  # type: ignore[assignment]
+else:
+    litellm = None
+    Choices = None
+    LiteLLMDelta = None
+    Message = None
+    ModelResponseBase = None
+    ModelResponseStream = None
+    LiteLLMStreamingChoices = None
+    get_supported_openai_params = None
+    ChatCompletionDeltaToolCall = None
+    Function = None
+    ModelResponse = None
+    supports_response_schema = None


-load_dotenv()
-logger = logging.getLogger(__name__)
-if LITELLM_AVAILABLE:
-    litellm.suppress_debug_info = True
+def _ensure_litellm() -> bool:
+    """Lazy-load litellm on first use. Returns True if available."""
+    global _litellm_loaded, LITELLM_AVAILABLE
+    global litellm, Choices, LiteLLMDelta, Message, ModelResponseBase
+    global ModelResponseStream, LiteLLMStreamingChoices, get_supported_openai_params
+    global ChatCompletionDeltaToolCall, Function
+    global ModelResponse, supports_response_schema
+
+    if _litellm_loaded:
+        return LITELLM_AVAILABLE
+    _litellm_loaded = True
+
+    try:
+        import litellm as _litellm
+        from litellm.litellm_core_utils.get_supported_openai_params import (
+            get_supported_openai_params as _get_supported_openai_params,
+        )
+        from litellm.types.utils import (
+            ChatCompletionDeltaToolCall as _ChatCompletionDeltaToolCall,
+            Choices as _Choices,
+            Delta as _LiteLLMDelta,
+            Function as _Function,
+            Message as _Message,
+            ModelResponse as _ModelResponse,
+            ModelResponseBase as _ModelResponseBase,
+            ModelResponseStream as _ModelResponseStream,
+            StreamingChoices as _LiteLLMStreamingChoices,
+        )
+        from litellm.utils import supports_response_schema as _supports_response_schema
+
+        litellm = _litellm
+        Choices = _Choices  # type: ignore[misc]
+        LiteLLMDelta = _LiteLLMDelta  # type: ignore[misc]
+        Message = _Message  # type: ignore[misc]
+        ModelResponseBase = _ModelResponseBase  # type: ignore[misc]
+        ModelResponseStream = _ModelResponseStream  # type: ignore[misc]
+        LiteLLMStreamingChoices = _LiteLLMStreamingChoices  # type: ignore[misc]
+        get_supported_openai_params = _get_supported_openai_params
+        ChatCompletionDeltaToolCall = _ChatCompletionDeltaToolCall  # type: ignore[misc]
+        Function = _Function  # type: ignore[misc]
+        ModelResponse = _ModelResponse  # type: ignore[misc]
+        supports_response_schema = _supports_response_schema
+
+        _litellm.suppress_debug_info = True
+        LITELLM_AVAILABLE = True
+    except ImportError:
+        LITELLM_AVAILABLE = False
+
+    return LITELLM_AVAILABLE


 MIN_CONTEXT: Final[int] = 1024
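
Reviewer note: the lazy-load pattern in `_ensure_litellm()` boils down to "attempt the import once, cache the outcome, and let every call site ask the same question". The sketch below shows that shape in isolation. `LazyImport` is a hypothetical helper written for this note and is not part of the PR, which uses module-level globals instead.

```python
import importlib
from types import ModuleType


class LazyImport:
    """Hypothetical helper: import a module on first use and cache the outcome."""

    def __init__(self, module_name: str) -> None:
        self.module_name = module_name
        self.module: ModuleType | None = None
        self.available = False
        self.attempts = 0
        self._loaded = False

    def ensure(self) -> bool:
        """Return True if the module is importable; only the first call imports."""
        if self._loaded:
            return self.available
        # Mark as loaded before importing so a failure is not retried.
        self._loaded = True
        self.attempts += 1
        try:
            self.module = importlib.import_module(self.module_name)
            self.available = True
        except ImportError:
            self.available = False
        return self.available


present = LazyImport("json")
missing = LazyImport("definitely_not_a_real_module_xyz")

print(present.ensure(), present.ensure(), present.attempts)
print(missing.ensure(), missing.ensure(), missing.attempts)
```

As in the diff, a failed import is remembered rather than retried, so installing the optional dependency mid-process has no effect until the interpreter restarts.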
@@ -117,6 +169,7 @@ LLM_CONTEXT_WINDOW_SIZES: Final[dict[str, int]] = {
    "gpt-4": 8192,
    "gpt-4o": 128000,
    "gpt-4o-mini": 200000,
+    "gpt-5.4-mini": 200000,
    "gpt-4-turbo": 128000,
    "gpt-4.1": 1047576,  # Based on official docs
    "gpt-4.1-mini-2025-04-14": 1047576,
@@ -411,7 +464,8 @@ class LLM(BaseLLM):
            except Exception as e:
                raise ImportError(f"Error importing native provider: {e}") from e

-        if not LITELLM_AVAILABLE:
+        # FALLBACK to LiteLLM — lazy-load on first use
+        if not _ensure_litellm():
            native_list = ", ".join(SUPPORTED_NATIVE_PROVIDERS)
            error_msg = (
                f"Unable to initialize LLM with model '{model}'. "
@@ -632,7 +686,7 @@ class LLM(BaseLLM):
    @model_validator(mode="after")
    def _init_litellm(self) -> LLM:
        self.is_litellm = True
-        if LITELLM_AVAILABLE:
+        if _ensure_litellm():
            litellm.drop_params = True
            self.set_callbacks(self.callbacks or [])
            self.set_env_callbacks()
@@ -2290,7 +2344,8 @@ class LLM(BaseLLM):
        Note: This validation only applies to the litellm fallback path.
        Native providers have their own validation.
        """
-        if not LITELLM_AVAILABLE or supports_response_schema is None:
+        if not _ensure_litellm() or supports_response_schema is None:
+            # When litellm is not available, skip validation
            # (this path should only be reached for litellm fallback models)
            return

@@ -2310,7 +2365,7 @@ class LLM(BaseLLM):
        Note: This method is only used by the litellm fallback path.
        Native providers override this method with their own implementation.
        """
-        if not LITELLM_AVAILABLE:
+        if not _ensure_litellm():
            # When litellm is not available, assume function calling is supported
            # (all modern models support it)
            return True
@@ -2334,7 +2389,7 @@ class LLM(BaseLLM):
        if "gpt-5" in model_lower:
            return False

-        if not LITELLM_AVAILABLE or get_supported_openai_params is None:
+        if not _ensure_litellm() or get_supported_openai_params is None:
            # When litellm is not available, assume stop words are supported
            return True

@@ -2382,7 +2437,8 @@ class LLM(BaseLLM):
        Note: This only affects the litellm fallback path. Native providers
        don't use litellm callbacks - they emit events via base_llm.py.
        """
-        if not LITELLM_AVAILABLE:
+        if not _ensure_litellm():
+            # When litellm is not available, callbacks are still stored
            # but not registered with litellm globals
            return

@@ -2420,7 +2476,8 @@ class LLM(BaseLLM):
        This will set `litellm.success_callback` to ["langfuse", "langsmith"] and
        `litellm.failure_callback` to ["langfuse"].
        """
-        if not LITELLM_AVAILABLE:
+        if not _ensure_litellm():
+            # When litellm is not available, env callbacks have no effect
            return

        with suppress_warnings():
--- a/lib/crewai/src/crewai/llms/base_llm.py
+++ b/lib/crewai/src/crewai/llms/base_llm.py
@@ -890,41 +890,17 @@ class BaseLLM(BaseModel, ABC):
        Args:
            usage_data: Token usage data from the API response
        """
-        prompt_tokens = (
-            usage_data.get("prompt_tokens")
-            or usage_data.get("prompt_token_count")
-            or usage_data.get("input_tokens")
-            or 0
-        )
+        metrics = UsageMetrics.from_provider_dict(usage_data)
+        if metrics is None:
+            return

-        completion_tokens = (
-            usage_data.get("completion_tokens")
-            or usage_data.get("candidates_token_count")
-            or usage_data.get("output_tokens")
-            or 0
-        )
-
-        cached_tokens = (
-            usage_data.get("cached_tokens")
-            or usage_data.get("cached_prompt_tokens")
-            or usage_data.get("cache_read_input_tokens")
-            or 0
-        )
-        if not cached_tokens:
-            prompt_details = usage_data.get("prompt_tokens_details")
-            if isinstance(prompt_details, dict):
-                cached_tokens = prompt_details.get("cached_tokens", 0) or 0
-
-        reasoning_tokens = usage_data.get("reasoning_tokens", 0) or 0
-        cache_creation_tokens = usage_data.get("cache_creation_tokens", 0) or 0
-
-        self._token_usage["prompt_tokens"] += prompt_tokens
-        self._token_usage["completion_tokens"] += completion_tokens
-        self._token_usage["total_tokens"] += prompt_tokens + completion_tokens
-        self._token_usage["successful_requests"] += 1
-        self._token_usage["cached_prompt_tokens"] += cached_tokens
-        self._token_usage["reasoning_tokens"] += reasoning_tokens
-        self._token_usage["cache_creation_tokens"] += cache_creation_tokens
+        self._token_usage["prompt_tokens"] += metrics.prompt_tokens
+        self._token_usage["completion_tokens"] += metrics.completion_tokens
+        self._token_usage["total_tokens"] += metrics.total_tokens
+        self._token_usage["successful_requests"] += metrics.successful_requests
+        self._token_usage["cached_prompt_tokens"] += metrics.cached_prompt_tokens
+        self._token_usage["reasoning_tokens"] += metrics.reasoning_tokens
+        self._token_usage["cache_creation_tokens"] += metrics.cache_creation_tokens

    def get_token_usage_summary(self) -> UsageMetrics:
        """Get summary of token usage for this LLM instance.
--- a/lib/crewai/src/crewai/llms/providers/azure/completion.py
+++ b/lib/crewai/src/crewai/llms/providers/azure/completion.py
@@ -1300,6 +1300,7 @@ class AzureCompletion(BaseLLM):
            "gpt-4": 8192,
            "gpt-4o": 128000,
            "gpt-4o-mini": 200000,
+            "gpt-5.4-mini": 200000,
            "gpt-4-turbo": 128000,
            "gpt-35-turbo": 16385,
            "gpt-3.5-turbo": 16385,
--- a/lib/crewai/src/crewai/llms/providers/openai/completion.py
+++ b/lib/crewai/src/crewai/llms/providers/openai/completion.py
@@ -2406,6 +2406,7 @@ class OpenAICompletion(BaseLLM):
            "gpt-4": 8192,
            "gpt-4o": 128000,
            "gpt-4o-mini": 200000,
+            "gpt-5.4-mini": 200000,
            "gpt-4-turbo": 128000,
            "gpt-4.1": 1047576,
            "gpt-4.1-mini-2025-04-14": 1047576,
--- a/lib/crewai/src/crewai/memory/storage/backend.py
+++ b/lib/crewai/src/crewai/memory/storage/backend.py
@@ -8,6 +8,39 @@ from typing import Any, Protocol, runtime_checkable
 from crewai.memory.types import MemoryRecord, ScopeInfo


+class EmbeddingDimensionMismatchError(ValueError):
+    """Raised when an embedding's dimensionality doesn't match the existing store.
+
+    The most common cause is upgrading CrewAI across the default-embedder
+    change (text-embedding-3-small, 1536 dims → text-embedding-3-large,
+    3072 dims) while keeping a local memory store created before the upgrade.
+
+    Deliberately not a ``RuntimeError``: background-save plumbing treats
+    ``RuntimeError`` as interpreter/executor shutdown and silently drops the
+    save, which would swallow this actionable migration error.
+    """
+
+    def __init__(self, stored_dim: int, new_dim: int) -> None:
+        self.stored_dim = stored_dim
+        self.new_dim = new_dim
+        super().__init__(
+            f"Embedding dimension mismatch: this memory store contains "
+            f"{stored_dim}-dimensional vectors, but the current embedder produced "
+            f"a {new_dim}-dimensional vector.\n\n"
+            "This usually means the store was created with a different embedding "
+            "model. CrewAI's default embedder changed from "
+            "text-embedding-3-small (1536 dims) to text-embedding-3-large "
+            "(3072 dims), so memory stores created before the upgrade are "
+            "incompatible with the new default.\n\n"
+            "To fix, do one of the following:\n"
+            "  - Reset local memory so it is rebuilt with the new embedder:\n"
+            "      crewai reset-memories --memory   (or crew.reset_memories())\n"
+            "  - Keep existing memories by pinning the previous embedder:\n"
+            '      embedder={"provider": "openai", '
+            '"config": {"model": "text-embedding-3-small"}}'
+        )
+
+
 @runtime_checkable
 class StorageBackend(Protocol):
    """Protocol for pluggable memory storage backends."""
--- a/lib/crewai/src/crewai/memory/storage/lancedb_storage.py
+++ b/lib/crewai/src/crewai/memory/storage/lancedb_storage.py
@@ -15,15 +15,16 @@ from typing import Any
 from crewai_core.lock_store import lock as store_lock
 import lancedb  # type: ignore[import-untyped]

+from crewai.memory.storage.backend import EmbeddingDimensionMismatchError
 from crewai.memory.types import MemoryRecord, ScopeInfo


 _logger = logging.getLogger(__name__)

-# Default embedding vector dimensionality (matches OpenAI text-embedding-3-small).
+# Default embedding vector dimensionality (matches OpenAI text-embedding-3-large).
 # Used when creating new tables and for zero-vector placeholder scans.
 # Callers can override via the ``vector_dim`` constructor parameter.
-DEFAULT_VECTOR_DIM = 1536
+DEFAULT_VECTOR_DIM = 3072

 # Safety cap on the number of rows returned by a single scan query.
 # Prevents unbounded memory use when scanning large tables for scope info,
@@ -288,13 +289,19 @@ class LanceDBStorage:
    def save(self, records: list[MemoryRecord]) -> None:
        if not records:
            return
-        # Auto-detect dimension from the first real embedding.
+        # Auto-detect dimension from the first real embedding and validate
+        # the whole batch against it — a silent mismatch would otherwise be
+        # zero-filled below and corrupt search results.
        dim = None
        for r in records:
            if r.embedding and len(r.embedding) > 0:
-                dim = len(r.embedding)
-                break
+                if dim is None:
+                    dim = len(r.embedding)
+                elif len(r.embedding) != dim:
+                    raise EmbeddingDimensionMismatchError(dim, len(r.embedding))
        is_new_table = self._table is None
+        if not is_new_table and dim and self._vector_dim and dim != self._vector_dim:
+            raise EmbeddingDimensionMismatchError(self._vector_dim, dim)
        with store_lock(self._lock_name):
            self._ensure_table(vector_dim=dim)
            rows = [self._record_to_row(rec) for rec in records]
@@ -311,6 +318,15 @@ class LanceDBStorage:

    def update(self, record: MemoryRecord) -> None:
        """Update a record by ID. Preserves created_at, updates last_accessed."""
+        if (
+            self._table is not None
+            and record.embedding
+            and self._vector_dim
+            and len(record.embedding) != self._vector_dim
+        ):
+            raise EmbeddingDimensionMismatchError(
+                self._vector_dim, len(record.embedding)
+            )
        with store_lock(self._lock_name):
            self._ensure_table()
            safe_id = str(record.id).replace("'", "''")
@@ -363,6 +379,10 @@ class LanceDBStorage:
    ) -> list[tuple[MemoryRecord, float]]:
        if self._table is None:
            return []
+        if self._vector_dim and len(query_embedding) != self._vector_dim:
+            raise EmbeddingDimensionMismatchError(
+                self._vector_dim, len(query_embedding)
+            )
        query = self._table.search(query_embedding)
        if scope_prefix is not None and scope_prefix.strip("/"):
            prefix = scope_prefix.rstrip("/")
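The guards added to `save`, `update`, and `search` in this file raise `EmbeddingDimensionMismatchError`, which backend.py defines as a `ValueError` subclass so that `RuntimeError` handlers do not swallow it. A minimal sketch of that interaction follows. The class here is a shortened stand-in for the one in the diff, and `check_query_dim` and `describe` are hypothetical helpers used only for illustration.

```python
from __future__ import annotations


class EmbeddingDimensionMismatchError(ValueError):
    """Stand-in for the class added in backend.py (message shortened)."""

    def __init__(self, stored_dim: int, new_dim: int) -> None:
        self.stored_dim = stored_dim
        self.new_dim = new_dim
        super().__init__(
            f"store holds {stored_dim}-dim vectors, embedder produced {new_dim}"
        )


def check_query_dim(store_dim: int, query: list[float]) -> None:
    """Hypothetical guard: raise when a known store dimension disagrees."""
    if store_dim and len(query) != store_dim:
        raise EmbeddingDimensionMismatchError(store_dim, len(query))


def describe(store_dim: int, query: list[float]) -> str:
    """Show that a RuntimeError handler does not intercept the mismatch."""
    try:
        check_query_dim(store_dim, query)
    except RuntimeError:
        # Shutdown-style errors would land here; the mismatch does not.
        return "dropped"
    except EmbeddingDimensionMismatchError as exc:
        return f"mismatch: {exc.stored_dim} -> {exc.new_dim}"
    return "ok"


print(describe(1536, [0.0] * 3072))
print(describe(1536, [0.0] * 1536))
```

Because the error carries `stored_dim` and `new_dim` as attributes, a caller can decide between resetting the store and pinning the previous embedder without parsing the message text.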
--- a/lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py
+++ b/lib/crewai/src/crewai/memory/storage/qdrant_edge_storage.py
@@ -36,6 +36,7 @@ from qdrant_edge import (
    UpdateOperation,
 )

+from crewai.memory.storage.backend import EmbeddingDimensionMismatchError
 from crewai.memory.types import MemoryRecord, ScopeInfo


@@ -43,7 +44,7 @@ _logger = logging.getLogger(__name__)

 VECTOR_NAME: Final[str] = "memory"

-DEFAULT_VECTOR_DIM: Final[int] = 1536
+DEFAULT_VECTOR_DIM: Final[int] = 3072

 _SCROLL_BATCH: Final[int] = 256

@@ -183,6 +184,10 @@ class QdrantEdgeStorage:
        except Exception:
            _logger.debug("Index creation failed (may already exist)", exc_info=True)

+    def _has_existing_data(self) -> bool:
+        """True when either shard already holds persisted records."""
+        return self._local_has_data or self._central_path.exists()
+
    def _record_to_point(self, record: MemoryRecord) -> Point:
        """Convert a MemoryRecord to a Qdrant Point."""
        return Point(
@@ -277,11 +282,19 @@ class QdrantEdgeStorage:
        if not records:
            return

+        # Validate the batch is internally consistent before touching the
+        # store-level dimension.
+        batch_dim = 0
+        for r in records:
+            if r.embedding and len(r.embedding) > 0:
+                if batch_dim == 0:
+                    batch_dim = len(r.embedding)
+                elif len(r.embedding) != batch_dim:
+                    raise EmbeddingDimensionMismatchError(batch_dim, len(r.embedding))
        if self._vector_dim == 0:
-            for r in records:
-                if r.embedding and len(r.embedding) > 0:
-                    self._vector_dim = len(r.embedding)
-                    break
+            self._vector_dim = batch_dim
+        elif batch_dim and batch_dim != self._vector_dim and self._has_existing_data():
+            raise EmbeddingDimensionMismatchError(self._vector_dim, batch_dim)
        if self._config is None and self._vector_dim > 0:
            self._config = self._build_config(self._vector_dim)
        if self._config is None:
@@ -308,6 +321,14 @@ class QdrantEdgeStorage:
        min_score: float = 0.0,
    ) -> list[tuple[MemoryRecord, float]]:
        """Search both central and local shards, merge results."""
+        if (
+            self._vector_dim
+            and len(query_embedding) != self._vector_dim
+            and self._has_existing_data()
+        ):
+            raise EmbeddingDimensionMismatchError(
+                self._vector_dim, len(query_embedding)
+            )
        filt = self._build_scope_filter(scope_prefix)
        fetch_limit = limit * 3 if (categories or metadata_filter) else limit
        all_scored: list[tuple[dict[str, Any], float, bool]] = []
@@ -466,6 +487,16 @@ class QdrantEdgeStorage:

    def update(self, record: MemoryRecord) -> None:
        """Update a record by upserting with the same point ID."""
+        if (
+            self._config is not None
+            and record.embedding
+            and self._vector_dim
+            and len(record.embedding) != self._vector_dim
+            and self._has_existing_data()
+        ):
+            raise EmbeddingDimensionMismatchError(
+                self._vector_dim, len(record.embedding)
+            )
        if self._config is None:
            if record.embedding and len(record.embedding) > 0:
                self._vector_dim = len(record.embedding)
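Both storage backends now check that a batch is internally consistent before comparing it with the store-level dimension. The scan can be sketched in isolation as below. `batch_dimension` is a hypothetical name, and it raises a plain `ValueError` where the diff raises `EmbeddingDimensionMismatchError`.

```python
from __future__ import annotations

from collections.abc import Sequence


def batch_dimension(embeddings: Sequence[Sequence[float] | None]) -> int | None:
    """Return the shared length of all non-empty embeddings, or None if there are none.

    Raises ValueError at the first embedding whose length differs from the
    first non-empty one.
    """
    dim: int | None = None
    for emb in embeddings:
        if not emb:
            continue  # records without an embedding are skipped, as in the diff
        if dim is None:
            dim = len(emb)  # the first real embedding fixes the expected size
        elif len(emb) != dim:
            raise ValueError(f"expected {dim} dims, got {len(emb)}")
    return dim


print(batch_dimension([[0.1, 0.2], None, [0.3, 0.4]]))
```

Scanning the whole batch, rather than stopping at the first embedding, is what catches a mixed batch before a shorter vector can be zero-filled and written.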
--- a/lib/crewai/src/crewai/memory/unified_memory.py
+++ b/lib/crewai/src/crewai/memory/unified_memory.py
@@ -66,7 +66,7 @@ class Memory(BaseModel):
    memory_kind: Literal["memory"] = "memory"

    llm: Annotated[BaseLLM | str, PlainValidator(_passthrough)] = Field(
-        default="gpt-4o-mini",
+        default="gpt-5.4-mini",
        description="LLM for analysis (model name or BaseLLM instance).",
    )
    storage: Annotated[StorageBackend | str, PlainValidator(_passthrough)] = Field(
@@ -239,7 +239,7 @@ class Memory(BaseModel):
                raise RuntimeError(
                    f"Memory requires an LLM for analysis but initialization failed: {e}\n\n"
                    "To fix this, do one of the following:\n"
-                    "  - Set OPENAI_API_KEY for the default model (gpt-4o-mini)\n"
+                    "  - Set OPENAI_API_KEY for the default model (gpt-5.4-mini)\n"
                    '  - Pass a different model: Memory(llm="anthropic/claude-3-haiku-20240307")\n'
                    '  - Pass any LLM instance: Memory(llm=LLM(model="your-model"))\n'
                    "  - To skip LLM analysis, pass all fields explicitly to remember()\n"
@@ -261,7 +261,7 @@ class Memory(BaseModel):
                raise RuntimeError(
                    f"Memory requires an embedder for vector search but initialization failed: {e}\n\n"
                    "To fix this, do one of the following:\n"
-                    "  - Set OPENAI_API_KEY for the default embedder (text-embedding-3-small)\n"
+                    "  - Set OPENAI_API_KEY for the default embedder (text-embedding-3-large)\n"
                    '  - Pass a different embedder: Memory(embedder={{"provider": "google", "config": {{...}}}})\n'
                    "  - Pass a callable: Memory(embedder=my_embedding_function)\n\n"
                    f"Docs: {self._MEMORY_DOCS_URL}"
@@ -322,12 +322,16 @@ class Memory(BaseModel):
        """Block until all pending background saves have completed.

        Called automatically by ``recall()`` and should be called by the
-        crew at shutdown to ensure no saves are lost.
+        crew at shutdown to ensure no saves are lost. Background save failures
+        are already reported through ``MemorySaveFailedEvent`` and should not
+        fail the task, crew, or flow that produced the output.
        """
        with self._pending_lock:
            pending = list(self._pending_saves)
        for future in pending:
-            future.result()  # blocks until done; re-raises exceptions
+            if future.cancelled():
+                continue
+            future.exception()  # blocks until done without re-raising failures

    def close(self) -> None:
        """Drain pending saves, flush storage, and shut down the background thread pool."""
@@ -605,12 +609,16 @@ class Memory(BaseModel):
                root_scope,
            )
            elapsed_ms = (time.perf_counter() - start) * 1000
-        except RuntimeError:
+        except RuntimeError as e:
            # The encoding pipeline uses asyncio.run() -> to_thread() internally.
            # If the process is shutting down, the default executor is closed and
            # to_thread raises "cannot schedule new futures after shutdown".
            # Silently abandon the save -- the process is exiting anyway.
-            return []
+            # Any other RuntimeError must propagate so the save future's
+            # done-callback reports it via MemorySaveFailedEvent.
+            if "cannot schedule new futures" in str(e):
+                return []
+            raise

        try:
            crewai_event_bus.emit(
--- a/lib/crewai/src/crewai/project/__init__.py
+++ b/lib/crewai/src/crewai/project/__init__.py
@@ -14,6 +14,8 @@ from crewai.project.annotations import (
    tool,
 )
 from crewai.project.crew_base import CrewBase
+from crewai.project.crew_loader import load_crew, load_crew_and_kickoff
+from crewai.project.json_loader import load_agent, strip_jsonc_comments


 __all__ = [
@@ -25,8 +27,12 @@ __all__ = [
    "callback",
    "crew",
    "llm",
+    "load_agent",
+    "load_crew",
+    "load_crew_and_kickoff",
    "output_json",
    "output_pydantic",
+    "strip_jsonc_comments",
    "task",
    "tool",
 ]
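`strip_jsonc_comments`, newly exported here, is documented as removing JSONC comments and trailing commas while preserving string values. The sketch below shows that behavior with a standalone single-pass scanner. `strip_jsonc` is a hypothetical name and is not the PR's implementation, which uses two separate passes in json_loader.py.

```python
import json


def strip_jsonc(text: str) -> str:
    """Drop // and /* */ comments and trailing commas, leaving strings intact."""
    out: list[str] = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip up to, but not including, the line break.
            while i < n and text[i] not in "\r\n":
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            # Keep newlines so JSON error line numbers stay meaningful.
            out.extend("\n" * text.count("\n", i, end))
            i = end + 2
        elif ch in "}]":
            # Remove a structural comma that precedes this closer.
            k = len(out) - 1
            while k >= 0 and out[k].isspace():
                k -= 1
            if k >= 0 and out[k] == ",":
                del out[k]
            out.append(ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


source = """{
  // agent references
  "agents": ["researcher",], /* block
  comment */
  "url": "http://example.com/a,]", // slashes inside strings survive
  "n": 1,
}"""
print(json.loads(strip_jsonc(source)))
```

Removing the comma when the closing bracket is reached, rather than by looking ahead from the comma, lets one pass handle a trailing comma that is followed by a comment.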
--- a/lib/crewai/src/crewai/project/crew_loader.py
+++ b/lib/crewai/src/crewai/project/crew_loader.py
@@ -0,0 +1,101 @@
+"""Load crew definitions from JSON/JSONC files and produce Crew instances."""
+
+from __future__ import annotations
+
+from pathlib import Path
+from typing import Any
+
+from pydantic import ValidationError
+
+from crewai.project.json_loader import (
+    JSONProjectError,
+    JSONProjectValidationError,
+    _crew_kwargs_from_definition,
+    _task_kwargs_from_definition,
+    load_json_crew_project,
+)
+
+
+def load_crew(
+    source: Path | str,
+    agents_dir: Path | None = None,
+) -> tuple[Any, dict[str, Any]]:
+    """Load a ``Crew`` from a JSON/JSONC definition file.
+
+    The definition file describes the crew's agents, tasks, process type, and
+    default inputs.  Agent definitions are resolved from individual
+    ``<name>.jsonc`` / ``<name>.json`` files inside an ``agents/`` directory.
+    """
+    from crewai import Agent, Crew, Task
+
+    crew_path = Path(source)
+    project = load_json_crew_project(crew_path, agents_dir=agents_dir)
+
+    agents_map: dict[str, Any] = {}
+    for name in project.agent_names:
+        agent_def = project.agents[name]
+        try:
+            agents_map[name] = Agent(**agent_def.kwargs)
+        except ValidationError as exc:
+            raise JSONProjectError(
+                f"{agent_def.path}: validation failed: {exc}"
+            ) from exc
+        except Exception as exc:
+            raise JSONProjectError(
+                f"{agent_def.path}: failed to load agent: {exc}"
+            ) from exc
+
+    tasks_list: list[Task] = []
+    task_name_map: dict[str, Task] = {}
+
+    for index, task_defn in enumerate(project.task_definitions):
+        source_label = f"{crew_path}: tasks[{index}]"
+        task_kwargs = _task_kwargs_from_definition(
+            task_defn,
+            agents_map=agents_map,
+            task_name_map=task_name_map,
+            source=source_label,
+            project_root=crew_path.parent,
+        )
+        try:
+            task = Task(**task_kwargs)
+        except ValidationError as exc:
+            raise JSONProjectError(f"{source_label}: validation failed: {exc}") from exc
+
+        tasks_list.append(task)
+        task_name = task_defn.get("name")
+        if isinstance(task_name, str) and task_name:
+            task_name_map[task_name] = task
+
+    crew_kwargs = _crew_kwargs_from_definition(
+        project.definition,
+        agents=list(agents_map.values()),
+        tasks=tasks_list,
+        agents_map=agents_map,
+        source=crew_path,
+    )
+
+    try:
+        crew = Crew(**crew_kwargs)
+    except ValidationError as exc:
+        raise JSONProjectError(f"{crew_path}: validation failed: {exc}") from exc
+    except JSONProjectValidationError:
+        raise
+    except Exception as exc:
+        raise JSONProjectError(f"{crew_path}: failed to load crew: {exc}") from exc
+
+    return crew, project.definition.get("inputs", {})
+
+
+def load_crew_and_kickoff(
+    crew_path: Path | str,
+    input_overrides: dict[str, Any] | None = None,
+) -> Any:
+    """Convenience function: load a crew and immediately kick it off."""
+    crew, default_inputs = load_crew(crew_path)
+
+    merged_inputs = {**default_inputs}
+    if input_overrides:
+        merged_inputs.update(input_overrides)
+
+    return crew.kickoff(inputs=merged_inputs)
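`load_crew_and_kickoff` layers caller-supplied overrides on top of the `inputs` defaults from the crew definition. The precedence can be sketched without CrewAI installed. `merge_inputs` is a hypothetical helper that mirrors the merge in the function above, and the input names are made up for illustration.

```python
from __future__ import annotations

from typing import Any


def merge_inputs(
    defaults: dict[str, Any],
    overrides: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Return defaults updated with overrides, without mutating either dict."""
    merged = {**defaults}  # shallow copy so the crew's defaults stay untouched
    if overrides:
        merged.update(overrides)  # caller-supplied values win on key collisions
    return merged


defaults = {"topic": "AI agents", "audience": "engineers"}
print(merge_inputs(defaults, {"topic": "vector stores"}))
```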
--- a/lib/crewai/src/crewai/project/json_loader.py
+++ b/lib/crewai/src/crewai/project/json_loader.py
@@ -0,0 +1,837 @@
+"""Loader utilities for JSON/JSONC agent, crew, task, and tool definitions."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+import json
+import logging
+from pathlib import Path
+import re
+from typing import Any
+
+from pydantic import ValidationError
+
+
+logger = logging.getLogger(__name__)
+
+
+class JSONProjectError(ValueError):
+    """User-facing error raised while loading JSON-first crew projects."""
+
+
+class JSONProjectValidationError(JSONProjectError):
+    """Aggregates validation errors found without executing a JSON project."""
+
+    def __init__(self, errors: list[str]) -> None:
+        self.errors = errors
+        super().__init__("\n".join(errors))
+
+
+_AGENT_RUNTIME_FIELDS = {
+    "id",
+    "crew",
+    "cache_handler",
+    "tools_handler",
+    "tools_results",
+    "knowledge",
+    "knowledge_storage",
+    "adapted_agent",
+    "agent_knowledge_context",
+    "crew_knowledge_context",
+    "knowledge_search_query",
+    "execution_context",
+    "checkpoint_kickoff_event_id",
+}
+
+_TASK_RUNTIME_FIELDS = {
+    "id",
+    "used_tools",
+    "tools_errors",
+    "delegations",
+    "output",
+    "processed_by_agents",
+    "retry_count",
+    "start_time",
+    "end_time",
+    "checkpoint_original_description",
+    "checkpoint_original_expected_output",
+}
+
+_CREW_RUNTIME_FIELDS = {
+    "id",
+    "usage_metrics",
+    "task_execution_output_json_files",
+    "execution_logs",
+    "token_usage",
+    "execution_context",
+    "checkpoint_inputs",
+    "checkpoint_train",
+    "checkpoint_kickoff_event_id",
+}
+
+
+JSON_PROJECT_EXTENSIONS = (".jsonc", ".json")
+
+
+@dataclass(frozen=True)
+class JSONAgentDefinition:
+    """Parsed JSON agent definition and constructor kwargs."""
+
+    name: str
+    path: Path
+    definition: dict[str, Any]
+    kwargs: dict[str, Any]
+
+
+@dataclass(frozen=True)
+class JSONCrewProject:
+    """Parsed JSON crew project used by runtime loading and validation."""
+
+    crew_path: Path
+    agents_dir: Path
+    definition: dict[str, Any]
+    agent_names: list[str]
+    agents: dict[str, JSONAgentDefinition]
+    task_definitions: list[dict[str, Any]]
+
+
+def find_json_project_file(directory: str | Path, stem: str) -> Path | None:
+    """Return ``stem.jsonc`` or ``stem.json``, preferring JSONC."""
+    root = Path(directory)
+    for ext in JSON_PROJECT_EXTENSIONS:
+        candidate = root / f"{stem}{ext}"
+        if candidate.exists():
+            return candidate
+    return None
+
+
+def find_crew_json_file(project_root: str | Path = ".") -> Path | None:
+    """Find the JSON crew definition in a project root."""
+    return find_json_project_file(project_root, "crew")
+
+
+def strip_jsonc_comments(text: str) -> str:
+    """Strip JSONC comments and trailing commas while preserving string values."""
+    without_comments = _strip_jsonc_comments(text)
+    return _strip_trailing_commas(without_comments)
+
+
+def parse_jsonc(text: str, source: str | Path = "<string>") -> Any:
+    """Parse JSON/JSONC text into Python data with path-aware error messages."""
+    source_label = str(source)
+    try:
+        return json.loads(strip_jsonc_comments(text))
+    except json.JSONDecodeError as exc:
+        raise JSONProjectError(
+            f"{source_label}: invalid JSON at line {exc.lineno}, "
+            f"column {exc.colno}: {exc.msg}"
+        ) from exc
+
+
+def load_jsonc_file(source: str | Path) -> Any:
+    """Load a JSON or JSONC file."""
+    path = Path(source)
+    return parse_jsonc(path.read_text(encoding="utf-8"), source=path)
+
+
+def load_agent(source: str | Path) -> Any:
+    """Load an existing ``Agent`` from a ``.json`` / ``.jsonc`` definition file."""
+    from crewai import Agent
+
+    path = Path(source)
+    defn = _expect_object(load_jsonc_file(path), path)
+    root = path.parent.parent if path.parent.name == "agents" else Path.cwd()
+    agent_kwargs = _agent_kwargs_from_definition(defn, path, project_root=root)
+
+    try:
+        return Agent(**agent_kwargs)
+    except ValidationError as exc:
+        raise JSONProjectError(_format_validation_error(path, exc)) from exc
+    except Exception as exc:
+        raise JSONProjectError(f"{path}: failed to load agent: {exc}") from exc
+
+
+def validate_crew_project(
+    source: str | Path,
+    agents_dir: Path | None = None,
+) -> JSONCrewProject:
+    """Validate JSON crew structure without kicking off the crew."""
+    return load_json_crew_project(source, agents_dir=agents_dir, collect_errors=True)
+
+
+def load_json_crew_project(
+    source: str | Path,
+    agents_dir: Path | None = None,
+    *,
+    collect_errors: bool = False,
+) -> JSONCrewProject:
+    """Parse and structurally validate a JSON crew project.
+
+    When ``collect_errors`` is true, all discoverable structural errors are
+    returned as a single ``JSONProjectValidationError`` for deploy validation.
+    Runtime loading keeps the previous fail-fast behavior where possible.
+    """
+    crew_path = Path(source)
+    if agents_dir is None:
+        agents_dir = crew_path.parent / "agents"
+
+    errors: list[str] = []
+
+    def fail(message: str, exc_type: type[Exception] = JSONProjectError) -> None:
+        if collect_errors:
+            errors.append(message)
+            return
+        raise exc_type(message)
+
+    def fail_many(messages: list[str]) -> None:
+        if not messages:
+            return
+        if collect_errors:
+            errors.extend(messages)
+            return
+        raise JSONProjectValidationError(messages)
+
+    try:
+        defn = _expect_object(load_jsonc_file(crew_path), crew_path)
+    except Exception as exc:
+        if collect_errors:
+            raise JSONProjectValidationError([str(exc)]) from exc
+        raise
+
+    fail_many(
+        _field_errors(
+            defn,
+            _crew_allowed_fields(),
+            _CREW_RUNTIME_FIELDS,
+            crew_path,
+            {"inputs"},
+        )
+    )
+
+    agent_names = defn.get("agents", [])
+    if not isinstance(agent_names, list) or not agent_names:
+        fail(f"{crew_path}: 'agents' must be a non-empty list")
+        agent_names = []
+
+    agents_dir = Path(agents_dir)
+    agent_definitions: dict[str, JSONAgentDefinition] = {}
+    for agent_name in agent_names:
+        if not isinstance(agent_name, str) or not agent_name:
+            fail(f"{crew_path}: each agent reference must be a non-empty string")
+            continue
+        agent_file = find_json_project_file(agents_dir, agent_name)
+        if agent_file is None:
+            message = (
+                f"Agent definition for '{agent_name}' not found in {agents_dir} "
+                f"(tried {agent_name}.jsonc and {agent_name}.json)"
+            )
+            if collect_errors:
+                errors.append(
+                    f"{crew_path}: agent '{agent_name}' not found in {agents_dir} "
+                    f"(tried {agent_name}.jsonc and {agent_name}.json)"
+                )
+            else:
+                raise FileNotFoundError(message)
+            continue
+        try:
+            agent_defn = _expect_object(load_jsonc_file(agent_file), agent_file)
+            agent_kwargs = _agent_kwargs_from_definition(
+                agent_defn,
+                agent_file,
+                # Validation must never execute project code (custom tools).
+                resolve_tools=not collect_errors,
+                project_root=crew_path.parent,
+            )
+        except Exception as exc:
+            if collect_errors:
+                errors.append(str(exc))
+                continue
+            raise
+        agent_definitions[agent_name] = JSONAgentDefinition(
+            name=agent_name,
+            path=agent_file,
+            definition=agent_defn,
+            kwargs=agent_kwargs,
+        )
+
+    task_defs = defn.get("tasks", [])
+    if not isinstance(task_defs, list) or not task_defs:
+        fail(f"{crew_path}: 'tasks' must be a non-empty list")
+        task_defs = []
+
+    known_tasks: set[str] = set()
+    known_agents = {name for name in agent_names if isinstance(name, str)}
+    for index, task_defn in enumerate(task_defs):
+        task_path = f"{crew_path}: tasks[{index}]"
+        if not isinstance(task_defn, dict):
+            fail(f"{task_path} must be an object")
+            continue
+        fail_many(
+            _field_errors(
+                task_defn,
+                _task_allowed_fields(),
+                _TASK_RUNTIME_FIELDS,
+                task_path,
+            )
+        )
+        missing_required = [
+            f"{task_path} missing required field '{required}'"
+            for required in ("description", "expected_output")
+            if required not in task_defn
+        ]
+        fail_many(missing_required)
+
+        agent_ref = task_defn.get("agent")
+        if agent_ref is not None and agent_ref not in known_agents:
+            fail(
+                f"{task_path} references agent '{agent_ref}' which is not in the crew agents list"
+            )
+
+        fail_many(
+            _tool_definition_errors(task_defn.get("tools"), task_path, crew_path.parent)
+        )
+
+        context_names = task_defn.get("context")
+        if context_names is not None:
+            if not isinstance(context_names, list):
+                fail(f"{task_path} field 'context' must be a list of task names")
+            else:
+                fail_many(
+                    [
+                        f"{task_path} has context reference '{ctx_name}' but that task "
+                        "has not been defined yet"
+                        for ctx_name in context_names
+                        if ctx_name not in known_tasks
+                    ]
+                )
+
+        task_name = task_defn.get("name")
+        if isinstance(task_name, str) and task_name:
+            known_tasks.add(task_name)
+
+    if errors:
+        raise JSONProjectValidationError(errors)
+
+    return JSONCrewProject(
+        crew_path=crew_path,
+        agents_dir=agents_dir,
+        definition=defn,
+        agent_names=list(agent_names),
+        agents=agent_definitions,
+        task_definitions=task_defs,
+    )
+
+
+def _strip_jsonc_comments(text: str) -> str:
+    result: list[str] = []
+    i = 0
+    in_string = False
+    escape = False
+
+    while i < len(text):
+        char = text[i]
+
+        if in_string:
+            result.append(char)
+            if escape:
+                escape = False
+            elif char == "\\":
+                escape = True
+            elif char == '"':
+                in_string = False
+            i += 1
+            continue
+
+        if char == '"':
+            in_string = True
+            result.append(char)
+            i += 1
+            continue
+
+        next_char = text[i + 1] if i + 1 < len(text) else ""
+        if char == "/" and next_char == "/":
+            i += 2
+            while i < len(text) and text[i] not in "\r\n":
+                i += 1
+            continue
+
+        if char == "/" and next_char == "*":
+            i += 2
+            closed = False
+            while i < len(text) - 1:
+                if text[i] == "\n":
+                    result.append("\n")
+                if text[i] == "*" and text[i + 1] == "/":
+                    i += 2
+                    closed = True
+                    break
+                i += 1
+            if not closed:
+                raise JSONProjectError("unterminated block comment in JSONC input")
+            continue
+
+        result.append(char)
+        i += 1
+
+    return "".join(result)
+
+
+def _strip_trailing_commas(text: str) -> str:
+    result: list[str] = []
+    i = 0
+    in_string = False
+    escape = False
+
+    while i < len(text):
+        char = text[i]
+
+        if in_string:
+            result.append(char)
+            if escape:
+                escape = False
+            elif char == "\\":
+                escape = True
+            elif char == '"':
+                in_string = False
+            i += 1
+            continue
+
+        if char == '"':
+            in_string = True
+            result.append(char)
+            i += 1
+            continue
+
+        if char == ",":
+            j = i + 1
+            while j < len(text) and text[j].isspace():
+                j += 1
+            if j < len(text) and text[j] in "}]":
+                i += 1
+                continue
+
+        result.append(char)
+        i += 1
+
+    return "".join(result)
+
+
+def _expect_object(value: Any, source: str | Path) -> dict[str, Any]:
+    if not isinstance(value, dict):
+        raise JSONProjectError(f"{source}: expected a JSON object")
+    return value
+
+
+def _agent_kwargs_from_definition(
+    defn: dict[str, Any],
+    path: Path | str,
+    *,
+    resolve_tools: bool = True,
+    project_root: Path | None = None,
+) -> dict[str, Any]:
+    errors = _field_errors(
+        defn,
+        _agent_allowed_fields(),
+        _AGENT_RUNTIME_FIELDS,
+        path,
+        {"settings"},
+    )
+    for required in ("role", "goal", "backstory"):
+        if required not in defn:
+            errors.append(f"{path}: missing required field '{required}'")
+
+    settings = defn.get("settings", {})
+    if settings is None:
+        settings = {}
+    if not isinstance(settings, dict):
+        errors.append(f"{path}: 'settings' must be an object when provided")
+        settings = {}
+    else:
+        errors.extend(
+            _field_errors(
+                settings,
+                _agent_allowed_fields(),
+                _AGENT_RUNTIME_FIELDS,
+                f"{path}: settings",
+            )
+        )
+
+    if errors:
+        raise JSONProjectValidationError(errors)
+
+    agent_kwargs = {
+        key: value for key, value in defn.items() if key in _agent_allowed_fields()
+    }
+    agent_kwargs.update(settings)
+    if resolve_tools:
+        _resolve_tool_fields(agent_kwargs, project_root=project_root)
+    else:
+        # Validation/deploy mode: check tool declarations structurally without
+        # importing or instantiating anything — custom:<name> tools execute
+        # project Python on resolution, which must not happen here.
+        tool_errors = _tool_definition_errors(
+            agent_kwargs.get("tools"), path, project_root
+        )
+        if tool_errors:
+            raise JSONProjectValidationError(tool_errors)
+    return agent_kwargs
+
+
+def _task_kwargs_from_definition(
+    task_defn: dict[str, Any],
+    agents_map: dict[str, Any],
+    task_name_map: dict[str, Any],
+    source: str,
+    project_root: Path | None = None,
+) -> dict[str, Any]:
+    errors = _field_errors(
+        task_defn,
+        _task_allowed_fields(),
+        _TASK_RUNTIME_FIELDS,
+        source,
+    )
+    if errors:
+        raise JSONProjectValidationError(errors)
+
+    task_kwargs = {
+        key: value for key, value in task_defn.items() if key in _task_allowed_fields()
+    }
+
+    agent_ref = task_kwargs.get("agent")
+    if agent_ref is not None and isinstance(agent_ref, str):
+        if agent_ref not in agents_map:
+            raise JSONProjectError(
+                f"{source} references agent '{agent_ref}' which is not in the crew agents list"
+            )
+        task_kwargs["agent"] = agents_map[agent_ref]
+
+    context_names = task_kwargs.get("context")
+    if context_names:
+        context_tasks: list[Any] = []
+        for ctx_name in context_names:
+            if ctx_name not in task_name_map:
+                raise JSONProjectError(
+                    f"{source} has context reference '{ctx_name}' but that task "
+                    "has not been defined yet"
+                )
+            context_tasks.append(task_name_map[ctx_name])
+        task_kwargs["context"] = context_tasks
+
+    _resolve_tool_fields(task_kwargs, project_root=project_root)
+    return task_kwargs
+
+
+def _crew_kwargs_from_definition(
+    defn: dict[str, Any],
+    agents: list[Any],
+    tasks: list[Any],
+    agents_map: dict[str, Any],
+    source: Path | str,
+) -> dict[str, Any]:
+    errors = _field_errors(
+        defn,
+        _crew_allowed_fields(),
+        _CREW_RUNTIME_FIELDS,
+        source,
+        {"inputs"},
+    )
+    if errors:
+        raise JSONProjectValidationError(errors)
+
+    crew_kwargs = {
+        key: value for key, value in defn.items() if key in _crew_allowed_fields()
+    }
+    crew_kwargs["agents"] = agents
+    crew_kwargs["tasks"] = tasks
+
+    manager_agent = crew_kwargs.get("manager_agent")
+    if isinstance(manager_agent, str):
+        if manager_agent not in agents_map:
+            raise JSONProjectError(
+                f"{source}: manager_agent '{manager_agent}' is not in the crew agents list"
+            )
+        crew_kwargs["manager_agent"] = agents_map[manager_agent]
+
+    return crew_kwargs
+
+
+def _resolve_tool_fields(
+    kwargs: dict[str, Any], project_root: Path | None = None
+) -> None:
+    tools = kwargs.get("tools")
+    if tools is not None:
+        kwargs["tools"] = _resolve_tools(tools, project_root=project_root)
+
+
+def _field_errors(
+    data: dict[str, Any],
+    allowed_fields: set[str],
+    runtime_fields: set[str],
+    source: str | Path,
+    extra_allowed: set[str] | None = None,
+) -> list[str]:
+    extra_allowed = extra_allowed or set()
+    keys = set(data)
+    runtime = sorted(keys & runtime_fields)
+    unknown = sorted(keys - allowed_fields - runtime_fields - extra_allowed)
+
+    errors: list[str] = []
+    if runtime:
+        errors.append(
+            f"{source}: runtime-only field(s) are not supported in JSON config: "
+            + ", ".join(runtime)
+        )
+    if unknown:
+        errors.append(f"{source}: unsupported field(s): " + ", ".join(unknown))
+    return errors
+
+
+def _agent_allowed_fields() -> set[str]:
+    from crewai import Agent
+
+    return set(Agent.model_fields) - _AGENT_RUNTIME_FIELDS
+
+
+def _task_allowed_fields() -> set[str]:
+    from crewai import Task
+
+    return set(Task.model_fields) - _TASK_RUNTIME_FIELDS
+
+
+def _crew_allowed_fields() -> set[str]:
+    from crewai import Crew
+
+    return set(Crew.model_fields) - _CREW_RUNTIME_FIELDS
+
+
+def _format_validation_error(path: str | Path, exc: ValidationError) -> str:
+    return f"{path}: validation failed: {exc}"
+
+
+def _resolve_tools(tool_defs: list[Any], project_root: Path | None = None) -> list[Any]:
+    """Resolve tool specs into tool instances or serialized BaseTool dicts.
+
+    Strings keep the existing shorthand behavior. Dicts are passed through so
+    ``BaseTool``'s Pydantic validator can hydrate serialized ``tool_type`` data.
+    """
+    if not isinstance(tool_defs, list):
+        raise JSONProjectError("'tools' must be a list")
+
+    tools: list[Any] = []
+    for tool_def in tool_defs:
+        if isinstance(tool_def, dict):
+            tools.append(tool_def)
+            continue
+        if not isinstance(tool_def, str):
+            raise JSONProjectError(
+                f"Tool definitions must be strings or objects, got {type(tool_def).__name__}"
+            )
+        if not tool_def:
+            continue
+        if tool_def.startswith("custom:"):
+            tools.append(_resolve_custom_tool(tool_def[7:], project_root=project_root))
+            continue
+        try:
+            tool_cls = _find_tool_class(tool_def)
+        except Exception as e:
+            raise JSONProjectError(f"Failed to resolve tool '{tool_def}': {e}") from e
+        if tool_cls is None:
+            raise JSONProjectError(
+                f"Unknown tool '{tool_def}'. Tool names must match a class from "
+                f"the 'crewai_tools' package (e.g. 'SerperDevTool') or use the "
+                f"'custom:<name>' prefix for a tool defined in tools/<name>.py."
+            )
+        try:
+            tools.append(tool_cls())
+        except Exception as e:
+            raise JSONProjectError(
+                f"Failed to initialize tool '{tool_def}': {e}"
+            ) from e
+    return tools
+
+
+_tool_class_cache: dict[str, type | None] = {}
+
+
+def _find_tool_class(name: str) -> type | None:
+    """Look up a tool class by name from the ``crewai_tools`` package."""
+    if name in _tool_class_cache:
+        return _tool_class_cache[name]
+
+    candidates = [name]
+    if not name.endswith("Tool"):
+        candidates.append(name + "Tool")
+    snake_pascal = "".join(word.capitalize() for word in name.split("_")) + "Tool"
+    if snake_pascal not in candidates:
+        candidates.append(snake_pascal)
+
+    for class_name in candidates:
+        cls = _try_import_tool(class_name)
+        if cls is not None:
+            _tool_class_cache[name] = cls
+            return cls
+
+    _tool_class_cache[name] = None
+    return None
+
+
+def _try_import_tool(class_name: str) -> type | None:
+    """Attempt to import a single tool class without loading all of crewai_tools."""
+    import re as _re
+
+    base = (
+        class_name.removesuffix("Tool") if class_name.endswith("Tool") else class_name
+    )
+    snake = _re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", base).lower()
+    tool_snake = snake + "_tool" if not snake.endswith("_tool") else snake
+
+    module_paths = [
+        f"crewai_tools.tools.{tool_snake}.{tool_snake}",
+        f"crewai_tools.tools.{tool_snake}",
+    ]
+
+    for mod_path in module_paths:
+        cls = _import_tool_class(mod_path, class_name)
+        if cls is not None:
+            return cls
+
+    try:
+        import crewai_tools
+
+        return getattr(crewai_tools, class_name, None)
+    except ImportError:
+        return None
+
+
+def _import_tool_class(mod_path: str, class_name: str) -> type | None:
+    try:
+        import importlib
+
+        mod = importlib.import_module(mod_path)
+    except ImportError:
+        return None
+    return getattr(mod, class_name, None)
+
+
+_CUSTOM_TOOL_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
+
+
+def _custom_tool_file(tool_name: str, project_root: Path | None) -> Path:
+    """Return the validated path of a custom tool inside ``tools/``.
+
+    Rejects names that aren't plain identifiers and (belt-and-suspenders)
+    any resolved path that escapes the project's ``tools/`` directory, so
+    ``custom:../evil`` or absolute-path style names cannot execute code
+    outside the project.
+    """
+    if not _CUSTOM_TOOL_NAME_RE.fullmatch(tool_name):
+        raise JSONProjectError(
+            f"Invalid custom tool name 'custom:{tool_name}': names must match "
+            f"[A-Za-z_][A-Za-z0-9_]* and resolve to tools/<name>.py inside "
+            f"the project."
+        )
+    tools_dir = ((project_root or Path.cwd()) / "tools").resolve()
+    tool_file = (tools_dir / f"{tool_name}.py").resolve()
+    try:
+        tool_file.relative_to(tools_dir)
+    except ValueError:
+        raise JSONProjectError(
+            f"Custom tool 'custom:{tool_name}' resolves outside the project's "
+            f"tools/ directory."
+        ) from None
+    return tool_file
+
+
+def _tool_definition_errors(
+    tool_defs: Any, source: Path | str, project_root: Path | None
+) -> list[str]:
+    """Structurally validate tool declarations WITHOUT importing anything.
+
+    Used by validation/deploy paths where executing project code (which
+    ``custom:`` resolution does) would be unsafe. Library tool names are not
+    resolved here either — that requires importing crewai_tools modules and
+    would falsely fail when optional dependencies are absent in the
+    validation environment.
+    """
+    if tool_defs is None:
+        return []
+    if not isinstance(tool_defs, list):
+        return [f"{source}: 'tools' must be a list"]
+    errors: list[str] = []
+    for tool_def in tool_defs:
+        if isinstance(tool_def, dict):
+            continue
+        if not isinstance(tool_def, str):
+            errors.append(
+                f"{source}: tool definitions must be strings or objects, "
+                f"got {type(tool_def).__name__}"
+            )
+            continue
+        if not tool_def.startswith("custom:"):
+            continue
+        try:
+            tool_file = _custom_tool_file(tool_def[7:], project_root)
+        except JSONProjectError as exc:
+            errors.append(f"{source}: {exc}")
+            continue
+        if not tool_file.exists():
+            errors.append(
+                f"{source}: custom tool '{tool_def}' not found: expected "
+                f"{tool_file}. Create the file with a BaseTool subclass, or "
+                f"remove the tool from your crew JSON."
+            )
+    return errors
+
+
+def _resolve_custom_tool(tool_name: str, project_root: Path | None = None) -> Any:
+    """Resolve a custom tool from the project's ``tools/`` directory.
+
+    Note: ``custom:<name>`` tools execute ``tools/<name>.py`` as local Python
+    code at load time — JSON configs referencing them are no longer pure data.
+    Only run JSON crew projects from sources you trust. Validation paths must
+    use ``_tool_definition_errors`` instead, which never executes anything.
+    """
+    tool_file = _custom_tool_file(tool_name, project_root)
+    if not tool_file.exists():
+        raise JSONProjectError(
+            f"Custom tool 'custom:{tool_name}' not found: expected {tool_file}. "
+            f"Create the file with a BaseTool subclass, or remove the tool from "
+            f"your crew JSON."
+        )
+    try:
+        import importlib.util
+
+        spec = importlib.util.spec_from_file_location(
+            f"custom_tools.{tool_name}", tool_file
+        )
+        if spec is None or spec.loader is None:
+            raise JSONProjectError(
+                f"Could not load custom tool 'custom:{tool_name}' from {tool_file}"
+            )
+        logger.debug("Executing custom tool module: %s", tool_file)
+        module = importlib.util.module_from_spec(spec)
+        spec.loader.exec_module(module)
+
+        from crewai.tools.base_tool import BaseTool
+
+        for attr_name in dir(module):
+            attr = getattr(module, attr_name)
+            if (
+                isinstance(attr, type)
+                and issubclass(attr, BaseTool)
+                and attr is not BaseTool
+            ):
+                # Concrete subclasses supply name/description defaults that
+                # BaseTool's signature requires.
+                tool_cls: type[Any] = attr
+                return tool_cls()
+        raise JSONProjectError(
+            f"No BaseTool subclass found in {tool_file}. Custom tools must "
+            f"define a class inheriting from crewai.tools.BaseTool."
+        )
+    except JSONProjectError:
+        raise
+    except Exception as e:
+        raise JSONProjectError(
+            f"Failed to load custom tool 'custom:{tool_name}' from {tool_file}: {e}"
+        ) from e
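Review note: the two-gate check in `_custom_tool_file` (identifier-only names, then a resolved-path containment test) can be tried in isolation. The following is a minimal sketch, not the loader's code: `safe_tool_path`, `check`, and `NAME_RE` are hypothetical names, and it raises a plain `ValueError` where the diff raises `JSONProjectError`.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical names for illustration; not part of the loader's API.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_tool_path(tool_name: str, project_root: Path) -> Path:
    """Return tools/<name>.py under project_root, or raise ValueError."""
    # Gate 1: only plain identifiers pass, so "../evil" or "/etc/x" fail here.
    if not NAME_RE.fullmatch(tool_name):
        raise ValueError(f"invalid custom tool name: {tool_name!r}")
    tools_dir = (project_root / "tools").resolve()
    tool_file = (tools_dir / f"{tool_name}.py").resolve()
    # Gate 2: relative_to raises ValueError when the resolved path
    # lies outside tools/ (for example through a symlink).
    tool_file.relative_to(tools_dir)
    return tool_file


def check(name: str, root: Path) -> str:
    """Return the file name for an accepted tool name, else 'rejected'."""
    try:
        return safe_tool_path(name, root).name
    except ValueError:
        return "rejected"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    results = {n: check(n, root) for n in ("search", "../evil", "/etc/passwd", "")}
```

Neither gate imports or executes anything, which is the property the validation path relies on.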
--- a/lib/crewai/src/crewai/rag/embeddings/providers/microsoft/azure.py
+++ b/lib/crewai/src/crewai/rag/embeddings/providers/microsoft/azure.py
@@ -5,7 +5,7 @@ from typing import Any
 from chromadb.utils.embedding_functions.openai_embedding_function import (
    OpenAIEmbeddingFunction,
 )
-from pydantic import AliasChoices, Field
+from pydantic import AliasChoices, Field, model_validator

 from crewai.rag.core.base_embeddings_provider import BaseEmbeddingsProvider

@@ -13,6 +13,14 @@ from crewai.rag.core.base_embeddings_provider import BaseEmbeddingsProvider
 class AzureProvider(BaseEmbeddingsProvider[OpenAIEmbeddingFunction]):
    """Azure OpenAI embeddings provider."""

+    @model_validator(mode="before")
+    @classmethod
+    def _normalize_model_alias(cls, data: Any) -> Any:
+        if isinstance(data, dict) and "model" in data and "model_name" not in data:
+            data = data.copy()
+            data["model_name"] = data["model"]
+        return data
+
    embedding_callable: type[OpenAIEmbeddingFunction] = Field(
        default=OpenAIEmbeddingFunction,
        description="Azure OpenAI embedding function class",
@@ -43,13 +51,11 @@ class AzureProvider(BaseEmbeddingsProvider[OpenAIEmbeddingFunction]):
        ),
    )
    model_name: str = Field(
-        default="text-embedding-ada-002",
+        default="text-embedding-3-large",
        description="Model name to use for embeddings",
        validation_alias=AliasChoices(
            "EMBEDDINGS_OPENAI_MODEL_NAME",
-            "OPENAI_MODEL_NAME",
            "AZURE_OPENAI_MODEL_NAME",
-            "model",
        ),
    )
    default_headers: dict[str, Any] | None = Field(
--- a/lib/crewai/src/crewai/rag/embeddings/providers/microsoft/types.py
+++ b/lib/crewai/src/crewai/rag/embeddings/providers/microsoft/types.py
@@ -12,7 +12,7 @@ class AzureProviderConfig(TypedDict, total=False):
    api_base: str
    api_type: Annotated[str, "azure"]
    api_version: str
-    model_name: Annotated[str, "text-embedding-ada-002"]
+    model_name: Annotated[str, "text-embedding-3-large"]
    default_headers: dict[str, Any]
    dimensions: int
    deployment_id: Required[str]
--- a/lib/crewai/src/crewai/rag/embeddings/providers/openai/openai_provider.py
+++ b/lib/crewai/src/crewai/rag/embeddings/providers/openai/openai_provider.py
@@ -5,7 +5,7 @@ from typing import Any
 from chromadb.utils.embedding_functions.openai_embedding_function import (
    OpenAIEmbeddingFunction,
 )
-from pydantic import AliasChoices, Field
+from pydantic import AliasChoices, Field, model_validator

 from crewai.rag.core.base_embeddings_provider import BaseEmbeddingsProvider

@@ -13,6 +13,14 @@ from crewai.rag.core.base_embeddings_provider import BaseEmbeddingsProvider
 class OpenAIProvider(BaseEmbeddingsProvider[OpenAIEmbeddingFunction]):
    """OpenAI embeddings provider."""

+    @model_validator(mode="before")
+    @classmethod
+    def _normalize_model_alias(cls, data: Any) -> Any:
+        if isinstance(data, dict) and "model" in data and "model_name" not in data:
+            data = data.copy()
+            data["model_name"] = data["model"]
+        return data
+
    embedding_callable: type[OpenAIEmbeddingFunction] = Field(
        default=OpenAIEmbeddingFunction,
        description="OpenAI embedding function class",
@@ -23,12 +31,11 @@ class OpenAIProvider(BaseEmbeddingsProvider[OpenAIEmbeddingFunction]):
        validation_alias=AliasChoices("EMBEDDINGS_OPENAI_API_KEY", "OPENAI_API_KEY"),
    )
    model_name: str = Field(
-        default="text-embedding-ada-002",
+        default="text-embedding-3-large",
        description="Model name to use for embeddings",
        validation_alias=AliasChoices(
            "EMBEDDINGS_OPENAI_MODEL_NAME",
-            "OPENAI_MODEL_NAME",
-            "model",
+            "model_name",
        ),
    )
    api_base: str | None = Field(
--- a/lib/crewai/src/crewai/rag/embeddings/providers/openai/types.py
+++ b/lib/crewai/src/crewai/rag/embeddings/providers/openai/types.py
@@ -9,7 +9,7 @@ class OpenAIProviderConfig(TypedDict, total=False):
    """Configuration for OpenAI provider."""

    api_key: str
-    model_name: Annotated[str, "text-embedding-ada-002"]
+    model_name: Annotated[str, "text-embedding-3-large"]
    api_base: str
    api_type: str
    api_version: str
--- a/lib/crewai/src/crewai/telemetry/telemetry.py
+++ b/lib/crewai/src/crewai/telemetry/telemetry.py
@@ -931,7 +931,7 @@ class Telemetry:
            value: The attribute value.
        """

-        if span is None:
+        if span is None or value is None:
            return

        def _operation() -> None:
--- a/lib/crewai/src/crewai/types/usage_metrics.py
+++ b/lib/crewai/src/crewai/types/usage_metrics.py
@@ -4,10 +4,31 @@ This module provides models for tracking token usage and request metrics
 during crew and agent execution.
 """

+from typing import Any
+
 from pydantic import BaseModel, Field
 from typing_extensions import Self


+def _coerce_int(value: Any) -> int:
+    if value is None:
+        return 0
+    try:
+        return int(value)
+    except (TypeError, ValueError):
+        return 0
+
+
+def _first_int(usage_data: dict[str, Any], *keys: str) -> int:
+    """Return the first non-zero integer-coercible value from ``usage_data``
+    under any of ``keys``. Falls back to ``0`` when nothing matches."""
+    for key in keys:
+        coerced = _coerce_int(usage_data.get(key))
+        if coerced:
+            return coerced
+    return 0
+
+
 class UsageMetrics(BaseModel):
    """Track usage metrics for crew execution.

@@ -54,3 +75,50 @@ class UsageMetrics(BaseModel):
        self.reasoning_tokens += usage_metrics.reasoning_tokens
        self.cache_creation_tokens += usage_metrics.cache_creation_tokens
        self.successful_requests += usage_metrics.successful_requests
+
+    @classmethod
+    def from_provider_dict(cls, usage_data: dict[str, Any] | None) -> Self | None:
+        """Normalize a provider's raw usage dict into a ``UsageMetrics``.
+
+        Accepts the full set of key aliases CrewAI providers emit:
+        ``prompt_tokens`` / ``prompt_token_count`` (Gemini) / ``input_tokens``
+        (Anthropic), and the equivalent completion / cached-prompt aliases.
+        Mirrors ``BaseLLM._track_token_usage_internal`` so per-LLM totals,
+        flow-level aggregation, and OTel spans agree on every provider.
+
+        Returns ``None`` for missing/empty input so callers can decide
+        whether to skip the event entirely or treat it as a zero-token
+        successful request.
+        """
+        if not usage_data:
+            return None
+
+        prompt_tokens = _first_int(
+            usage_data, "prompt_tokens", "prompt_token_count", "input_tokens"
+        )
+        completion_tokens = _first_int(
+            usage_data,
+            "completion_tokens",
+            "candidates_token_count",
+            "output_tokens",
+        )
+        cached_prompt_tokens = _first_int(
+            usage_data,
+            "cached_tokens",
+            "cached_prompt_tokens",
+            "cache_read_input_tokens",
+        )
+        if not cached_prompt_tokens:
+            details = usage_data.get("prompt_tokens_details")
+            if isinstance(details, dict):
+                cached_prompt_tokens = _coerce_int(details.get("cached_tokens"))
+
+        return cls(
+            total_tokens=prompt_tokens + completion_tokens,
+            prompt_tokens=prompt_tokens,
+            completion_tokens=completion_tokens,
+            cached_prompt_tokens=cached_prompt_tokens,
+            reasoning_tokens=_coerce_int(usage_data.get("reasoning_tokens")),
+            cache_creation_tokens=_coerce_int(usage_data.get("cache_creation_tokens")),
+            successful_requests=1,
+        )
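Review note: the alias handling in `from_provider_dict` can be illustrated without pydantic. This is a rough sketch under assumptions: `coerce_int`, `first_nonzero`, and `normalize` are hypothetical stand-ins that return a plain dict, and only the prompt and completion aliases from the diff are covered.

```python
from typing import Any


def coerce_int(value: Any) -> int:
    """Best-effort int conversion; anything unusable becomes 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def first_nonzero(usage: dict[str, Any], *keys: str) -> int:
    """Return the first non-zero coerced value found under any of keys."""
    for key in keys:
        number = coerce_int(usage.get(key))
        if number:
            return number
    return 0


def normalize(usage: dict[str, Any]) -> dict[str, int]:
    """Map provider-specific usage keys onto one shape (illustrative only)."""
    prompt = first_nonzero(
        usage, "prompt_tokens", "prompt_token_count", "input_tokens"
    )
    completion = first_nonzero(
        usage, "completion_tokens", "candidates_token_count", "output_tokens"
    )
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }


openai_style = normalize({"prompt_tokens": 12, "completion_tokens": 3})
anthropic_style = normalize({"input_tokens": "12", "output_tokens": 3})
zero_then_alias = normalize({"prompt_tokens": 0, "input_tokens": 7})
```

One consequence worth noting: a zero under an earlier alias does not stop the search, so a later alias with a non-zero value wins.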
--- a/lib/crewai/src/crewai/utilities/agent_utils.py
+++ b/lib/crewai/src/crewai/utilities/agent_utils.py
@@ -65,6 +65,15 @@ class SummaryContent(TypedDict):
 console = Console()

 _MULTIPLE_NEWLINES: Final[re.Pattern[str]] = re.compile(r"\n+")
+_NATIVE_TOOL_UNSUPPORTED_PATTERNS: Final[tuple[str, ...]] = (
+    "does not support tools",
+    "doesn't support tools",
+    "tools are not supported",
+    "tool calling is not supported",
+    "tool calls are not supported",
+    "function calling is not supported",
+    "does not support function calling",
+)


 def is_inside_event_loop() -> bool:
@@ -1273,6 +1282,28 @@ def check_native_tool_support(llm: Any, original_tools: list[BaseTool] | None) -
    )


+def is_native_tool_calling_unsupported_error(error: BaseException) -> bool:
+    """Return whether an error means native tool calling is unavailable."""
+    message = str(error).lower()
+    return any(pattern in message for pattern in _NATIVE_TOOL_UNSUPPORTED_PATTERNS)
+
+
+def build_text_tool_calling_fallback_message(
+    tools_description: str,
+    tools_names: str,
+) -> str:
+    """Build instructions for downgrading native tools to text tool calls."""
+    text_tooling_prompt = I18N_DEFAULT.slice("tools").format(
+        tools=tools_description,
+        tool_names=tools_names,
+    )
+    return (
+        "Native tool calling is unavailable for this model/provider. "
+        "Continue using CrewAI text tool calling instead.\n"
+        f"{text_tooling_prompt}"
+    )
+
+
 def setup_native_tools(
    original_tools: list[BaseTool],
 ) -> tuple[
@@ -1365,6 +1396,8 @@ def execute_single_native_tool_call(
    event_source: Any,
    printer: Printer | None = None,
    verbose: bool = False,
+    plan_step_number: int | None = None,
+    plan_step_description: str | None = None,
 ) -> NativeToolCallResult:
    """Execute a single native tool call with full lifecycle management.

@@ -1446,6 +1479,8 @@ def execute_single_native_tool_call(
            from_agent=agent,
            from_task=task,
            agent_key=agent_key,
+            plan_step_number=plan_step_number,
+            plan_step_description=plan_step_description,
        ),
    )

@@ -1509,6 +1544,8 @@ def execute_single_native_tool_call(
                        from_agent=agent,
                        from_task=task,
                        agent_key=agent_key,
+                        plan_step_number=plan_step_number,
+                        plan_step_description=plan_step_description,
                        error=e,
                    ),
                )
@@ -1542,6 +1579,8 @@ def execute_single_native_tool_call(
                from_agent=agent,
                from_task=task,
                agent_key=agent_key,
+                plan_step_number=plan_step_number,
+                plan_step_description=plan_step_description,
                started_at=started_at,
                finished_at=datetime.now(),
            ),
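Review note: `is_native_tool_calling_unsupported_error` classifies an error by substring match on its lowercased message. The sketch below shows one way a caller might use such a check to downgrade to text tool calling. The diff does not show the call site, so `call_with_fallback`, `looks_unsupported`, and the shortened pattern tuple are assumptions for illustration.

```python
from typing import Callable

# Subset of the patterns listed in the diff, kept short for the example.
PATTERNS: tuple[str, ...] = (
    "does not support tools",
    "tool calling is not supported",
    "function calling is not supported",
)


def looks_unsupported(error: BaseException) -> bool:
    """Case-insensitive substring match against the known messages."""
    message = str(error).lower()
    return any(pattern in message for pattern in PATTERNS)


def call_with_fallback(
    native_call: Callable[[], str], text_call: Callable[[], str]
) -> str:
    """Try the native path; fall back only for 'unsupported' errors."""
    try:
        return native_call()
    except Exception as exc:
        if not looks_unsupported(exc):
            raise  # unrelated failures must still surface
        return text_call()


def failing_native() -> str:
    raise RuntimeError("Model foo Does Not Support Tools")


result = call_with_fallback(failing_native, lambda: "text tool calling")
```

Matching on message text is inherently brittle: a provider that rewords its error would bypass the fallback, so the pattern list needs to track provider messages.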