crewAI

andre/crewAI

Fork 0

mirror of https://github.com/crewAIInc/crewAI.git synced 2026-01-22 22:58:13 +00:00

Commit Graph

Author	SHA1	Message	Date
Devin AI	3619d4dc50	fix: Replace deprecated typing imports with built-in types - Replace Dict, List, Set, Tuple with dict, list, set, tuple throughout codebase - Add missing type annotations to crew_events.py methods - Add proper type annotations to test_crew_cancellation.py - Use type: ignore[method-assign] comments for mock assignments - Maintain backward compatibility while modernizing type hints This resolves lint and type-checker failures in CI while preserving the cancellation functionality. Co-Authored-By: João <joao@crewai.com>	2025-09-04 02:07:02 +00:00
Lucas Gomide	6ebb6c9b63	Supporting eval single Agent/LiteAgent (#3167 ) Some checks failed Notify Downstream / notify-downstream (push) Has been cancelled Details Mark stale issues and pull requests / stale (push) Has been cancelled Details * refactor: rely on task completion event to evaluate agents * feat: remove Crew dependency to evaluate agent * feat: drop execution_context in AgentEvaluator * chore: drop experimental Agent Eval feature from stable crew.test * feat: support eval LiteAgent * resolve linter issues	2025-07-15 09:22:41 -04:00
Lucas Gomide	1b6b2b36d9	Introduce Evaluator Experiment (#3133 ) * feat: add exchanged messages in LLMCallCompletedEvent * feat: add GoalAlignment metric for Agent evaluation * feat: add SemanticQuality metric for Agent evaluation * feat: add Tool Metrics for Agent evaluation * feat: add Reasoning Metrics for Agent evaluation, still in progress * feat: add AgentEvaluator class This class will evaluate Agent' results and report to user * fix: do not evaluate Agent by default This is a experimental feature we still need refine it further * test: add Agent eval tests * fix: render all feedback per iteration * style: resolve linter issues * style: fix mypy issues * fix: allow messages be empty on LLMCallCompletedEvent * feat: add Experiment evaluation framework with baseline comparison * fix: reset evaluator for each experiement iteraction * fix: fix track of new test cases * chore: split Experimental evaluation classes * refactor: remove unused method * refactor: isolate Console print in a dedicated class * fix: make crew required to run an experiment * fix: use time-aware to define experiment result * test: add tests for Evaluator Experiment * style: fix linter issues * fix: encode string before hashing * style: resolve linter issues * feat: add experimental folder for beta features (#3141) * test: move tests to experimental folder	2025-07-14 09:06:45 -04:00

Author

SHA1

Message

Date

Devin AI

3619d4dc50

fix: Replace deprecated typing imports with built-in types

- Replace Dict, List, Set, Tuple with dict, list, set, tuple throughout codebase
- Add missing type annotations to crew_events.py methods
- Add proper type annotations to test_crew_cancellation.py
- Use type: ignore[method-assign] comments for mock assignments
- Maintain backward compatibility while modernizing type hints

This resolves lint and type-checker failures in CI while preserving
the cancellation functionality.

Co-Authored-By: João <joao@crewai.com>

2025-09-04 02:07:02 +00:00

Lucas Gomide

6ebb6c9b63

Supporting eval single Agent/LiteAgent (#3167 )

Notify Downstream / notify-downstream (push) Has been cancelled

Details

Mark stale issues and pull requests / stale (push) Has been cancelled

Details

* refactor: rely on task completion event to evaluate agents

* feat: remove Crew dependency to evaluate agent

* feat: drop execution_context in AgentEvaluator

* chore: drop experimental Agent Eval feature from stable crew.test

* feat: support eval LiteAgent

* resolve linter issues

2025-07-15 09:22:41 -04:00

Lucas Gomide

1b6b2b36d9

Introduce Evaluator Experiment (#3133 )

* feat: add exchanged messages in LLMCallCompletedEvent

* feat: add GoalAlignment metric for Agent evaluation

* feat: add SemanticQuality metric for Agent evaluation

* feat: add Tool Metrics for Agent evaluation

* feat: add Reasoning Metrics for Agent evaluation, still in progress

* feat: add AgentEvaluator class

This class will evaluate Agent' results and report to user

* fix: do not evaluate Agent by default

This is a experimental feature we still need refine it further

* test: add Agent eval tests

* fix: render all feedback per iteration

* style: resolve linter issues

* style: fix mypy issues

* fix: allow messages be empty on LLMCallCompletedEvent

* feat: add Experiment evaluation framework with baseline comparison

* fix: reset evaluator for each experiement iteraction

* fix: fix track of new test cases

* chore: split Experimental evaluation classes

* refactor: remove unused method

* refactor: isolate Console print in a dedicated class

* fix: make crew required to run an experiment

* fix: use time-aware to define experiment result

* test: add tests for Evaluator Experiment

* style: fix linter issues

* fix: encode string before hashing

* style: resolve linter issues

* feat: add experimental folder for beta features (#3141)

* test: move tests to experimental folder

2025-07-14 09:06:45 -04:00

3 Commits