refactor(events): relocate events module & update imports
- Move events from utilities/ to top-level events/ with types/, listeners/, utils/ structure
- Update all source/tests/docs to new import paths
- Add backwards compatibility stubs in crewai.utilities.events with deprecation warnings
- Restore test mocks and fix related test imports
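
For reference, a backwards-compatibility stub of the kind described above can be sketched as follows. This is a minimal illustration, not the code added in this change: `deprecated_alias`, `emit`, and `legacy_emit` are hypothetical names, and the actual stubs in `crewai.utilities.events` may be structured differently.

```python
import warnings


def deprecated_alias(old_path: str, new_path: str, target):
    """Return a wrapper around `target` that warns about the old import path.

    Hypothetical helper for illustration only.
    """

    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_path} is deprecated; import from {new_path} instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not the wrapper
        )
        return target(*args, **kwargs)

    return wrapper


def emit(name: str) -> str:
    """Stand-in for a function that now lives at the new location."""
    return f"emitted:{name}"


# The old module keeps exporting a name that forwards to the new one.
legacy_emit = deprecated_alias("old.events.emit", "new.events.emit", emit)
```

Callers using the old path keep working but see a `DeprecationWarning` pointing at their own call site.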
* feat: add exchanged messages in LLMCallCompletedEvent
* feat: add GoalAlignment metric for Agent evaluation
* feat: add SemanticQuality metric for Agent evaluation
* feat: add Tool Metrics for Agent evaluation
* feat: add Reasoning Metrics for Agent evaluation, still in progress
* feat: add AgentEvaluator class
This class evaluates an Agent's results and reports them to the user
* fix: do not evaluate Agent by default
This is an experimental feature that we still need to refine further
* test: add Agent eval tests
* fix: render all feedback per iteration
* style: resolve linter issues
* style: fix mypy issues
* fix: allow messages to be empty on LLMCallCompletedEvent
* feat: add Experiment evaluation framework with baseline comparison
* fix: reset evaluator for each experiment iteration
* fix: fix tracking of new test cases
* chore: split Experimental evaluation classes
* refactor: remove unused method
* refactor: isolate Console print in a dedicated class
* fix: make crew required to run an experiment
* fix: use time-aware to define experiment result
* test: add tests for Evaluator Experiment
* style: fix linter issues
* fix: encode string before hashing
* style: resolve linter issues
* feat: add experimental folder for beta features (#3141)
* test: move tests to experimental folder
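
As background for the "encode string before hashing" fix above: Python's `hashlib` functions accept bytes, not `str`, so a string has to be encoded first. The sketch below assumes SHA-256 and UTF-8; the commit does not say which hash or encoding is used, and `stable_digest` is a hypothetical name.

```python
import hashlib


def stable_digest(text: str) -> str:
    """Return a hex digest for `text` (hypothetical helper).

    hashlib.sha256 raises TypeError when given a str, so the text is
    encoded to UTF-8 bytes before hashing.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```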