mirror of
https://github.com/crewAIInc/crewAI.git
synced 2026-05-08 02:29:00 +00:00
fix(hitl): log pre-review failures and add learn_strict mode
Closes #5725.

In `_pre_review_with_lessons` (`lib/crewai/src/crewai/flow/human_feedback.py`), a broad `except Exception: return method_output` silently swallowed any LLM, network, auth, or structured-output failure during the pre-review step. Callers could not distinguish pre-reviewed output from raw output, and no log or event was surfaced.

Changes:
- Add a module logger.
- Narrow the try/except in `_pre_review_with_lessons` so the `mem is None` and `not matches` short-circuits stay outside the failure path (those are not error cases).
- On recall or LLM pre-review failure, log a WARNING with `exc_info=True` so the silent fallback is observable. Continue to fall back to the raw `method_output` so the flow does not break.
- Add an opt-in `learn_strict=True` parameter on the `human_feedback` decorator and `HumanFeedbackConfig` that re-raises pre-review failures instead of falling back, for callers that need fail-closed behavior.
- Update the "Graceful degradation" docs section to reflect the new observable-by-default behavior and document `learn_strict`.

Tests (`lib/crewai/tests/test_human_feedback_decorator.py`):
- New `TestHumanFeedbackPreReviewFailure` class with 7 tests covering WARNING logging on LLM and recall failures, `learn_strict` propagation in both sync and async paths, and config introspection of the new `learn_strict` flag.

Co-Authored-By: João <joao@crewai.com>
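The narrowed failure path described above can be sketched as follows. This is a minimal illustration of the fail-open/fail-closed shape, not the actual crewAI code: `strict` stands in for the `learn_strict` flag, and `mem`, `llm_pre_review`, and the function signature are hypothetical stand-ins.

```python
import logging

# Module logger, as added by the commit (name assumed from the file path).
logger = logging.getLogger("crewai.flow.human_feedback")


def pre_review_with_lessons(method_output, mem, strict=False, llm_pre_review=None):
    """Sketch of the pre-review fallback: fail-open by default, observable,
    fail-closed when strict=True. Names here are illustrative stand-ins."""
    # Short-circuits stay OUTSIDE the failure handling: these are not errors.
    if mem is None:
        return method_output
    try:
        matches = mem.recall(method_output)  # may raise (network, auth, ...)
        if not matches:
            return method_output
        return llm_pre_review(method_output, matches)  # may raise (LLM, parse)
    except Exception:
        if strict:
            # learn_strict=True: fail closed, propagate to the caller.
            raise
        # Default: fall back to raw output, but make the fallback observable.
        logger.warning("Pre-review failed; returning raw output", exc_info=True)
        return method_output
```

The key point is that the recall and LLM calls sit inside the `try` while the early returns sit outside it, so only genuine failures reach the WARNING log or, under strict mode, the caller.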
@@ -682,6 +682,7 @@ class ArticleReviewFlow(Flow):
 | Parameter | Default | Description |
 |-----------|---------|-------------|
 | `learn` | `False` | Enable HITL learning |
+| `learn_strict` | `False` | If `True`, pre-review failures are re-raised instead of falling back to raw output |
 | `learn_limit` | `5` | Max past lessons to recall for pre-review |
 
 ### Key Design Decisions
@@ -689,7 +690,8 @@ class ArticleReviewFlow(Flow):
 - **Same LLM for everything**: The `llm` parameter on the decorator is shared by outcome collapsing, lesson distillation, and pre-review. No need to configure multiple models.
 - **Structured output**: Both distillation and pre-review use function calling with Pydantic models when the LLM supports it, falling back to text parsing otherwise.
 - **Non-blocking storage**: Lessons are stored via `remember_many()` which runs in a background thread -- the flow continues immediately.
-- **Graceful degradation**: If the LLM fails during distillation, nothing is stored. If it fails during pre-review, the raw output is shown. Neither failure blocks the flow.
+- **Observable graceful degradation**: If the LLM fails during distillation, nothing is stored. If it fails during pre-review, the raw output is shown to the human and the failure is logged at `WARNING` level with the full traceback (`exc_info=True`) under the `crewai.flow.human_feedback` logger -- so the silent fallback is detectable. Neither failure blocks the flow.
+- **Strict mode**: Pass `learn_strict=True` to make pre-review fail closed -- failures (LLM error, network/auth error, structured-output parse error, memory `recall` error) propagate out of the flow method instead of being swallowed. Use this when downstream code must be able to assume that pre-review actually executed.
 - **No scope/categories needed**: When storing lessons, only `source` is passed. The encoding pipeline infers scope, categories, and importance automatically.
 
 <Note>
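The "Non-blocking storage" bullet in the diff above can be sketched as a fire-and-forget write on a daemon thread. This is a minimal illustration under assumptions: `remember_many` is the method name the docs mention, but the wrapper function and the `store` object here are hypothetical, not the crewAI API.

```python
import threading


def remember_many_nonblocking(store, lessons):
    """Store lessons on a background daemon thread so the caller (the flow)
    continues immediately instead of waiting on the write."""
    t = threading.Thread(target=store.remember_many, args=(lessons,), daemon=True)
    t.start()
    # Returned only so callers or tests can join() if they need a barrier;
    # the flow itself never waits on it.
    return t
```

Because the thread is a daemon, an exiting process will not block on an in-flight write, which matches the fail-open spirit of the default (non-strict) pre-review path.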
||||