Compare commits

...

49 Commits

Author SHA1 Message Date
Lorenze Jay
9a347ad458 chore: update crewai-tools dependency to version 0.59.0 and bump CrewAI version to 0.152.0 (#3244)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
- Updated `crewai-tools` dependency from `0.58.0` to `0.59.0` in `pyproject.toml` and `uv.lock`.
- Bumped the version of the CrewAI library from `0.150.0` to `0.152.0` in `__init__.py`.
- Updated dependency versions in CLI templates for crew, flow, and tool projects to reflect the new CrewAI version.
2025-07-30 14:38:24 -07:00
Lucas Gomide
34c3075fdb fix: support to add memories to Mem0 with agent_id (#3217)
* fix: support to add memories to Mem0 with agent_id

* feat: removing memory_type checkings from Mem0Storage

* feat: ensure agent_id is always present while saving memory into Mem0

* fix: use OR operator when querying Mem0 memories with both user_id and agent_id
2025-07-30 11:56:46 -04:00
Vidit Ostwal
498e8dc6e8 Changed the import error to show missing module files (#2423)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
* Fix issue #2421: Handle missing google.genai dependency gracefully

Co-Authored-By: Joe Moura <joao@crewai.com>

* Fix import sorting in test file

Co-Authored-By: Joe Moura <joao@crewai.com>

* Fix import sorting with ruff

Co-Authored-By: Joe Moura <joao@crewai.com>

* Removed unwatned test case

* Added dynamic catching for all the embedder function

* Dropped the comment

* Added test case

* Fixed Linting Issue

* Flaky test case in 3.13

* Test Case fixed

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Joe Moura <joao@crewai.com>
Co-authored-by: Lucas Gomide <lucaslg200@gmail.com>
2025-07-30 10:01:17 -04:00
Lorenze Jay
cb522cf500 Enhance Flow class to support custom flow names (#3234)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
- Added an optional `name` attribute to the Flow class for better identification.
- Updated event emissions to utilize the new `name` attribute, ensuring accurate flow naming in events.
- Added tests to verify the correct flow name is set and emitted during flow execution.
2025-07-29 15:41:30 -07:00
Vini Brasil
017acc74f5 Add timezone to event timestamps (#3231)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Events were lacking timezone information, making them naive datetimes,
which can be ambiguous.
2025-07-28 17:09:06 -03:00
Greyson LaLonde
fab86d197a Refactor: Move RAG components to dedicated top-level module (#3222)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* Move RAG components to top-level module

- Create src/crewai/rag directory structure
- Move embeddings configurator from utilities to rag module
- Update imports across codebase and documentation
- Remove deprecated embedding files

* Remove empty knowledge/embedder directory
2025-07-25 10:55:31 -04:00
Vidit Ostwal
864e9bfb76 Changed the default value in Mem0 config (#3216)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* Changed the default value in Mem0 config

* Added regression test for this

* Fixed Linting issues
2025-07-24 13:20:18 -04:00
Lucas Gomide
d3b45d197c fix: remove crewai signup references, replaced by crewai login (#3213) 2025-07-24 07:47:35 -04:00
Manuka Yasas
579153b070 docs: fix incorrect model naming in Google Vertex AI documentation (#3189)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
- Change model format from "gemini/gemini-1.5-pro-latest" to "gemini-1.5-pro-latest"
  in Vertex AI section examples
- Update both English and Portuguese documentation files
- Fixes incorrect provider prefix usage for Vertex AI models
- Ensures consistency with Vertex AI provider requirements

Files changed:
- docs/en/concepts/llms.mdx (line 272)
- docs/pt-BR/concepts/llms.mdx (line 270)

Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>
2025-07-23 16:58:57 -04:00
Lorenze Jay
b1fdcdfa6e chore: update dependencies and version in project files (#3212)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
- Updated `crewai-tools` dependency from `0.55.0` to `0.58.0` in `pyproject.toml` and `uv.lock`.
- Added new packages `anthropic`, `browserbase`, `playwright`, `pyee`, and `stagehand` with their respective versions in `uv.lock`.
- Bumped the version of the CrewAI library from `0.148.0` to `0.150.0` in `__init__.py`.
- Updated dependency versions in CLI templates for crew, flow, and tool projects to reflect the new CrewAI version.
2025-07-23 11:03:50 -07:00
Mike Plachta
18d76a270c docs: add SerperScrapeWebsiteTool documentation and reorganize SerperDevTool setup instructions (#3211) 2025-07-23 12:12:59 -04:00
Vidit Ostwal
30541239ad Changed Mem0 Storage v1.1 -> v2 (#2893)
* Changed v1.1 -> v2

* Fixed Test Cases:

* Fixed linting issues

* Changed docs

* Refractored the storage

* Fixed test cases

* Fixing run-time checks

* Fixed Test Case

* Updated docs and added test case for custom categories

* Add the TODO back

* Minor Changes

* Added output_format in search

* Minor changes

* Added output_format and version in both search and save

* Small change

* Minor bugs

* Fixed test cases

* Changed docs

---------

Co-authored-by: Lucas Gomide <lucaslg200@gmail.com>
2025-07-23 08:30:52 -04:00
Tony Kipkemboi
9a65573955 Feature/update docs (#3205)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* docs: add create_directory parameter

* docs: remove string guardrails to focus on function guardrails

* docs: remove get help from docs.json

* docs: update pt-BR docs.json changes
2025-07-22 13:55:27 -04:00
Lucas Gomide
27623a1d01 feat: remove duplicate print on LLM call error (#3183)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
By improving litellm handler error / outputs

Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>
2025-07-21 22:08:07 -04:00
João Moura
2593242234 Adding Support to adhoc tool calling using the internal LLM class (#3195)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
* Adding Support to adhoc tool calling using the internal LLM class

* fix type
2025-07-21 19:36:48 -03:00
Greyson LaLonde
2ab6c31544 chore: add deprecation notices to UserMemory (#3201)
- Mark UserMemory and UserMemoryItem for removal in v0.156.0 or 2025-08-04
- Update all references with deprecation warnings
- Users should migrate to ExternalMemory
2025-07-21 15:26:34 -04:00
Lucas Gomide
3c55c8a22a fix: append user message when last message is from assistent when using Ollama models (#3200)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Ollama doesn't supports last message to be 'assistant'
We can drop this commit after merging https://github.com/BerriAI/litellm/pull/10917
2025-07-21 13:30:40 -04:00
Ranuga Disansa
424433ff58 docs: Add Tavily Search & Extractor tools to Search-Research suite (#3146)
* docs: Add Tavily Search and Extractor tools documentation

* docs: Add Tavily Search and Extractor tools to the documentation

---------

Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>
2025-07-21 12:01:29 -04:00
Lucas Gomide
2fd99503ed build: upgrade LiteLLM to 1.74.3 (#3199) 2025-07-21 09:58:47 -04:00
Vidit Ostwal
942014962e fixed save method, changed the test cases (#3187)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* fixed save method, changed the test cases

* Linting fixed
2025-07-18 15:10:26 -04:00
Lucas Gomide
2ab79a7dd5 feat: drop unsupported stop parameter for LLM models automatically (#3184) 2025-07-18 13:54:28 -04:00
Lucas Gomide
27c449c9c4 test: remove workaround related to SQLite without FTS5 (#3179)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
For more details check out [here](actions/runner-images#12576)
2025-07-18 09:37:15 -04:00
Vini Brasil
9737333ffd Use file lock around Chroma client initialization (#3181)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
This commit fixes a bug with concurrent processess and Chroma where
`table collections already exists` (and similar) were raised.

https://cookbook.chromadb.dev/core/system_constraints/
2025-07-17 11:50:45 -03:00
Lucas Gomide
bf248d5118 docs: fix neatlogs documentation (#3171)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2025-07-16 21:18:04 -04:00
Lorenze Jay
2490e8cd46 Update CrewAI version to 0.148.0 in project templates and dependencies (#3172)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
* Update CrewAI version to 0.148.0 in project templates and dependencies

* Update crewai-tools dependency to version 0.55.0 in pyproject.toml and uv.lock for improved functionality and performance.
2025-07-16 12:36:43 -07:00
Lucas Gomide
9b67e5a15f Emit events about Agent eval (#3168)
* feat: emit events abou Agent Eval

We are triggering events when an evaluation has started/completed/failed

* style: fix type checking issues
2025-07-16 13:18:59 -04:00
Lucas Gomide
6ebb6c9b63 Supporting eval single Agent/LiteAgent (#3167)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* refactor: rely on task completion event to evaluate agents

* feat: remove Crew dependency to evaluate agent

* feat: drop execution_context in AgentEvaluator

* chore: drop experimental Agent Eval feature from stable crew.test

* feat: support eval LiteAgent

* resolve linter issues
2025-07-15 09:22:41 -04:00
Lucas Gomide
53f674be60 chore: remove evaluation folder (#3159)
This folder was moved to `experimental` folder
2025-07-15 08:30:20 -04:00
Paras Sakarwal
11717a5213 docs: added integration with neatlogs (#3138)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2025-07-14 11:08:24 -04:00
Lucas Gomide
b6d699f764 Implement thread-safe AgentEvaluator (#3157)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
* refactor: implement thread-safe AgentEvaluator with hybrid state management

* chore: remove useless comments
2025-07-14 10:05:42 -04:00
Lucas Gomide
5b15061b87 test: add test helper to assert Agent Experiments (#3156) 2025-07-14 09:24:49 -04:00
Lucas Gomide
1b6b2b36d9 Introduce Evaluator Experiment (#3133)
* feat: add exchanged messages in LLMCallCompletedEvent

* feat: add GoalAlignment metric for Agent evaluation

* feat: add SemanticQuality metric for Agent evaluation

* feat: add Tool Metrics for Agent evaluation

* feat: add Reasoning Metrics for Agent evaluation, still in progress

* feat: add AgentEvaluator class

This class will evaluate Agent' results and report to user

* fix: do not evaluate Agent by default

This is a experimental feature we still need refine it further

* test: add Agent eval tests

* fix: render all feedback per iteration

* style: resolve linter issues

* style: fix mypy issues

* fix: allow messages be empty on LLMCallCompletedEvent

* feat: add Experiment evaluation framework with baseline comparison

* fix: reset evaluator for each experiement iteraction

* fix: fix track of new test cases

* chore: split Experimental evaluation classes

* refactor: remove unused method

* refactor: isolate Console print in a dedicated class

* fix: make crew required to run an experiment

* fix: use time-aware to define experiment result

* test: add tests for Evaluator Experiment

* style: fix linter issues

* fix: encode string before hashing

* style: resolve linter issues

* feat: add experimental folder for beta features (#3141)

* test: move tests to experimental folder
2025-07-14 09:06:45 -04:00
devin-ai-integration[bot]
3ada4053bd Fix #3149: Add missing create_directory parameter to Task class (#3150)
* Fix #3149: Add missing create_directory parameter to Task class

- Add create_directory field with default value True for backward compatibility
- Update _save_file method to respect create_directory parameter
- Add comprehensive tests covering all scenarios
- Maintain existing behavior when create_directory=True (default)

The create_directory parameter was documented but missing from implementation.
Users can now control directory creation behavior:
- create_directory=True (default): Creates directories if they don't exist
- create_directory=False: Raises RuntimeError if directory doesn't exist

Fixes issue where users got TypeError when trying to use the documented
create_directory parameter.

Co-Authored-By: Jo\u00E3o <joao@crewai.com>

* Fix lint: Remove unused import os from test_create_directory_true

- Removes F401 lint error: 'os' imported but unused
- All lint checks should now pass

Co-Authored-By: Jo\u00E3o <joao@crewai.com>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Jo\u00E3o <joao@crewai.com>
2025-07-14 08:15:41 -04:00
Vidit Ostwal
e7a5747c6b Comparing BaseLLM class instead of LLM (#3120)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* Compaing BaseLLM class instead of LLM

* Fixed test cases

* Fixed Linting Issues

* removed last line

---------

Co-authored-by: Lucas Gomide <lucaslg200@gmail.com>
2025-07-11 20:50:36 -04:00
Vidit Ostwal
eec1262d4f Fix agent knowledge (#2831)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
* Added add_sources()

* Fixed the agent knowledge querying

* Added test cases

* Fixed linting issue

* Fixed logic

* Seems like a falky test case

* Minor changes

* Added knowledge attriute to the crew documentation

* Flaky test

* fixed spaces

* Flaky Test Case

* Seems like a flaky test case

---------

Co-authored-by: Lucas Gomide <lucaslg200@gmail.com>
2025-07-11 13:52:26 -04:00
Tony Kipkemboi
c6caa763d7 docs: Add guardrail attribute documentation and examples (#3139)
- Document string-based guardrails in tasks
- Add guardrail examples to YAML configuration
- Fix Python code formatting in PT-BR CLI docs
2025-07-11 13:32:59 -04:00
Lucas Gomide
08fa3797ca Introducing Agent evaluation (#3130)
* feat: add exchanged messages in LLMCallCompletedEvent

* feat: add GoalAlignment metric for Agent evaluation

* feat: add SemanticQuality metric for Agent evaluation

* feat: add Tool Metrics for Agent evaluation

* feat: add Reasoning Metrics for Agent evaluation, still in progress

* feat: add AgentEvaluator class

This class will evaluate Agent' results and report to user

* fix: do not evaluate Agent by default

This is a experimental feature we still need refine it further

* test: add Agent eval tests

* fix: render all feedback per iteration

* style: resolve linter issues

* style: fix mypy issues

* fix: allow messages be empty on LLMCallCompletedEvent
2025-07-11 13:18:03 -04:00
Greyson LaLonde
bf8fa3232b Add SQLite FTS5 support to test workflow (#3140)
* Add SQLite FTS5 support to test workflow

* Add explanatory comment for SQLite FTS5 workaround
2025-07-11 12:01:25 -04:00
Heitor Carvalho
a6e60a5d42 fix: use production workos environment id (#3129)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
2025-07-09 17:09:01 -04:00
Lorenze Jay
7b0f3aabd9 chore: update crewAI and dependencies to version 0.141.0 and 0.51.0 (#3128)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
- Bump crewAI version to 0.141.0 in __init__.py for alignment with updated dependencies.
- Update `crewai-tools` dependency version to 0.51.0 in pyproject.toml and related template files.
- Add new testing dependencies: pytest-split and pytest-xdist for improved test execution.
- Ensure compatibility with the latest package versions in uv.lock and template files.
2025-07-09 10:37:06 -07:00
Lucas Gomide
f071966951 docs: add docs about Agent.kickoff usage (#3121)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Co-authored-by: Tony Kipkemboi <iamtonykipkemboi@gmail.com>
2025-07-08 16:15:40 -04:00
Lucas Gomide
318310bb7a docs: add docs about Agent repository (#3122) 2025-07-08 15:56:08 -04:00
Greyson LaLonde
34a03f882c feat: add crew context tracking for LLM guardrail events (#3111)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
Add crew context tracking using OpenTelemetry baggage for thread-safe propagation. Context is set during kickoff and cleaned up in finally block. Added thread safety tests with mocked agent execution.
2025-07-07 16:33:07 -04:00
Greyson LaLonde
a0fcc0c8d1 Speed up GitHub Actions tests with parallelization (#3107)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
- Add pytest-xdist and pytest-split to dev dependencies for parallel test execution
- Split tests into 8 parallel groups per Python version for better distribution
- Enable CPU-level parallelization with -n auto to maximize resource usage
- Add fail-fast strategy and maxfail=3 to stop early on failures
- Add job name to match branch protection rules
- Reduce test timeout from default to 30s for faster failure detection
- Remove redundant cache configuration
2025-07-03 21:08:00 -04:00
Lorenze Jay
748c25451c Lorenze/new version 0.140.0 (#3106)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
Mark stale issues and pull requests / stale (push) Has been cancelled
* fix: clean up whitespace and update dependencies

* Removed unnecessary whitespace in multiple files for consistency.
* Updated `crewai-tools` dependency version to `0.49.0` in `pyproject.toml` and related template files.
* Bumped CrewAI version to `0.140.0` in `__init__.py` for alignment with updated dependencies.

* chore: update pyproject.toml to exclude documentation from build targets

* Added exclusions for the `docs` directory in both wheel and sdist build targets to streamline the build process and reduce unnecessary file inclusion.

* chore: update uv.lock for dependency resolution and Python version compatibility

* Incremented revision to 2.
* Updated resolution markers to include support for Python 3.13 and adjusted platform checks for better compatibility.
* Added new wheel URLs for zstandard version 0.23.0 to ensure availability across various platforms.

* chore: pin json-repair dependency version in pyproject.toml and uv.lock

* Updated json-repair dependency from a range to a specific version (0.25.2) for consistency and to avoid potential compatibility issues.
* Adjusted related entries in uv.lock to reflect the pinned version, ensuring alignment across project files.

* chore: pin agentops dependency version in pyproject.toml and uv.lock

* Updated agentops dependency from a range to a specific version (0.3.18) for consistency and to avoid potential compatibility issues.
* Adjusted related entries in uv.lock to reflect the pinned version, ensuring alignment across project files.

* test: enhance cache call assertions in crew tests

* Improved the test for cache hitting between agents by filtering mock calls to ensure they include the expected 'tool' and 'input' keywords.
* Added assertions to verify the number of cache calls and their expected arguments, enhancing the reliability of the test.
* Cleaned up whitespace and improved readability in various test cases for better maintainability.
2025-07-02 15:22:18 -07:00
Heitor Carvalho
a77dcdd419 feat: add multiple provider support (#3089)
* Remove `crewai signup` command, update docs

* Add `Settings.clear()` and clear settings before each login

* Add pyjwt

* Remove print statement from ToolCommand.login()

* Remove auth0 dependency

* Update docs
2025-07-02 16:44:47 -04:00
Greyson LaLonde
68f5bdf0d9 feat: add console logging for LLM guardrail events (#3105)
* feat: add console logging for memory events

* fix: emit guardrail events in correct order and handle exceptions

* fix: remove unreachable elif conditions in guardrail event listener

* fix: resolve mypy type errors in guardrail event handler
2025-07-02 16:19:22 -04:00
Irineu Brito
7f83947020 fix: correct code example language inconsistency in pt-BR docs (#3088)
* fix: correct code example language inconsistency in pt-BR docs

* fix: fix: fully standardize code example language and naming in pt-BR docs

* fix: fix: fully standardize code example language and naming in pt-BR docs fixed variables

* fix: fix: fully standardize code example language and naming in pt-BR docs fixed params

---------

Co-authored-by: Lucas Gomide <lucaslg200@gmail.com>
2025-07-02 12:18:32 -04:00
Lucas Gomide
ceb310bcde docs: add docs about Memory Events (#3104)
Some checks failed
Notify Downstream / notify-downstream (push) Has been cancelled
2025-07-02 11:10:45 -04:00
165 changed files with 13521 additions and 3951 deletions

View File

@@ -260,7 +260,7 @@ def handle_success(self):
# Handle success case
pass
@listen("failure_path")
@listen("failure_path")
def handle_failure(self):
# Handle failure case
pass
@@ -288,7 +288,7 @@ class SelectiveFlow(Flow):
def critical_step(self):
# Only this method's state is persisted
self.state["important_data"] = "value"
@start()
def temporary_step(self):
# This method's state is not persisted
@@ -322,20 +322,20 @@ flow.plot("workflow_diagram") # Generates HTML visualization
class CyclicFlow(Flow):
max_iterations = 5
current_iteration = 0
@start("loop")
def process_iteration(self):
if self.current_iteration >= self.max_iterations:
return
# Process current iteration
self.current_iteration += 1
@router(process_iteration)
def check_continue(self):
if self.current_iteration < self.max_iterations:
return "loop" # Continue cycling
return "complete"
@listen("complete")
def finalize(self):
# Final processing
@@ -369,7 +369,7 @@ def risky_operation(self):
self.state["success"] = False
return None
@listen(risky_operation)
@listen(risky_operation)
def handle_result(self, result):
if self.state.get("success", False):
# Handle success case
@@ -390,7 +390,7 @@ class CrewOrchestrationFlow(Flow[WorkflowState]):
result = research_crew.crew().kickoff(inputs={"topic": self.state.research_topic})
self.state.research_results = result.raw
return result
@listen(research_phase)
def analysis_phase(self, research_results):
analysis_crew = AnalysisCrew()
@@ -400,13 +400,13 @@ class CrewOrchestrationFlow(Flow[WorkflowState]):
})
self.state.analysis_results = result.raw
return result
@router(analysis_phase)
def decide_next_action(self):
if self.state.analysis_results.confidence > 0.7:
return "generate_report"
return "additional_research"
@listen("generate_report")
def final_report(self):
reporting_crew = ReportingCrew()
@@ -439,7 +439,7 @@ class CrewOrchestrationFlow(Flow[WorkflowState]):
## CrewAI Version Compatibility:
- Stay updated with CrewAI releases for new features and bug fixes
- Test crew functionality when upgrading CrewAI versions
- Use version constraints in pyproject.toml (e.g., "crewai[tools]>=0.134.0,<1.0.0")
- Use version constraints in pyproject.toml (e.g., "crewai[tools]>=0.140.0,<1.0.0")
- Monitor deprecation warnings for future compatibility
## Code Examples and Implementation Patterns
@@ -464,22 +464,22 @@ class ResearchOutput(BaseModel):
@CrewBase
class ResearchCrew():
"""Advanced research crew with structured outputs and validation"""
agents: List[BaseAgent]
tasks: List[Task]
@before_kickoff
def setup_environment(self):
"""Initialize environment before crew execution"""
print("🚀 Setting up research environment...")
# Validate API keys, create directories, etc.
@after_kickoff
def cleanup_and_report(self, output):
"""Handle post-execution tasks"""
print(f"✅ Research completed. Generated {len(output.tasks_output)} task outputs")
print(f"📊 Token usage: {output.token_usage}")
@agent
def researcher(self) -> Agent:
return Agent(
@@ -490,7 +490,7 @@ class ResearchCrew():
max_iter=15,
max_execution_time=1800
)
@agent
def analyst(self) -> Agent:
return Agent(
@@ -499,7 +499,7 @@ class ResearchCrew():
verbose=True,
memory=True
)
@task
def research_task(self) -> Task:
return Task(
@@ -507,7 +507,7 @@ class ResearchCrew():
agent=self.researcher(),
output_pydantic=ResearchOutput
)
@task
def validation_task(self) -> Task:
return Task(
@@ -517,7 +517,7 @@ class ResearchCrew():
guardrail=self.validate_research_quality,
max_retries=3
)
def validate_research_quality(self, output) -> tuple[bool, str]:
"""Custom guardrail to ensure research quality"""
content = output.raw
@@ -526,7 +526,7 @@ class ResearchCrew():
if not any(keyword in content.lower() for keyword in ['conclusion', 'finding', 'result']):
return False, "Missing key analytical elements."
return True, content
@crew
def crew(self) -> Crew:
return Crew(
@@ -557,13 +557,13 @@ class RobustSearchTool(BaseTool):
name: str = "robust_search"
description: str = "Perform web search with retry logic and error handling"
args_schema: Type[BaseModel] = SearchInput
def __init__(self, api_key: Optional[str] = None, **kwargs):
super().__init__(**kwargs)
self.api_key = api_key or os.getenv("SEARCH_API_KEY")
self.rate_limit_delay = 1.0
self.last_request_time = 0
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=4, max=10)
@@ -575,43 +575,43 @@ class RobustSearchTool(BaseTool):
time_since_last = time.time() - self.last_request_time
if time_since_last < self.rate_limit_delay:
time.sleep(self.rate_limit_delay - time_since_last)
# Input validation
if not query or len(query.strip()) == 0:
return "Error: Empty search query provided"
if len(query) > 500:
return "Error: Search query too long (max 500 characters)"
# Perform search
results = self._perform_search(query, max_results, timeout)
self.last_request_time = time.time()
return self._format_results(results)
except requests.exceptions.Timeout:
return f"Search timed out after {timeout} seconds"
except requests.exceptions.RequestException as e:
return f"Search failed due to network error: {str(e)}"
except Exception as e:
return f"Unexpected error during search: {str(e)}"
def _perform_search(self, query: str, max_results: int, timeout: int) -> List[dict]:
"""Implement actual search logic here"""
# Your search API implementation
pass
def _format_results(self, results: List[dict]) -> str:
"""Format search results for LLM consumption"""
if not results:
return "No results found for the given query."
formatted = "Search Results:\n\n"
for i, result in enumerate(results[:10], 1):
formatted += f"{i}. {result.get('title', 'No title')}\n"
formatted += f" URL: {result.get('url', 'No URL')}\n"
formatted += f" Summary: {result.get('snippet', 'No summary')}\n\n"
return formatted
```
@@ -623,20 +623,20 @@ from crewai.memory.storage.mem0_storage import Mem0Storage
class AdvancedMemoryManager:
"""Enhanced memory management for CrewAI applications"""
def __init__(self, crew, config: dict = None):
self.crew = crew
self.config = config or {}
self.setup_memory_systems()
def setup_memory_systems(self):
"""Configure multiple memory systems"""
# Short-term memory for current session
self.short_term = ShortTermMemory()
# Long-term memory for cross-session persistence
self.long_term = LongTermMemory()
# External memory with Mem0 (if configured)
if self.config.get('use_external_memory'):
self.external = ExternalMemory.create_storage(
@@ -649,8 +649,8 @@ class AdvancedMemoryManager:
}
}
)
def save_with_context(self, content: str, memory_type: str = "short_term",
def save_with_context(self, content: str, memory_type: str = "short_term",
metadata: dict = None, agent: str = None):
"""Save content with enhanced metadata"""
enhanced_metadata = {
@@ -659,14 +659,14 @@ class AdvancedMemoryManager:
"crew_type": self.crew.__class__.__name__,
**(metadata or {})
}
if memory_type == "short_term":
self.short_term.save(content, enhanced_metadata, agent)
elif memory_type == "long_term":
self.long_term.save(content, enhanced_metadata, agent)
elif memory_type == "external" and hasattr(self, 'external'):
self.external.save(content, enhanced_metadata, agent)
def search_across_memories(self, query: str, limit: int = 5) -> dict:
"""Search across all memory systems"""
results = {
@@ -674,23 +674,23 @@ class AdvancedMemoryManager:
"long_term": [],
"external": []
}
# Search short-term memory
results["short_term"] = self.short_term.search(query, limit=limit)
# Search long-term memory
results["long_term"] = self.long_term.search(query, limit=limit)
# Search external memory (if available)
if hasattr(self, 'external'):
results["external"] = self.external.search(query, limit=limit)
return results
def cleanup_old_memories(self, days_threshold: int = 30):
"""Clean up old memories based on age"""
cutoff_time = time.time() - (days_threshold * 24 * 60 * 60)
# Implement cleanup logic based on timestamps in metadata
# This would vary based on your specific storage implementation
pass
@@ -719,12 +719,12 @@ class TaskMetrics:
class CrewMonitor:
"""Comprehensive monitoring for CrewAI applications"""
def __init__(self, crew_name: str, log_level: str = "INFO"):
self.crew_name = crew_name
self.metrics: List[TaskMetrics] = []
self.session_start = time.time()
# Setup logging
logging.basicConfig(
level=getattr(logging, log_level),
@@ -735,7 +735,7 @@ class CrewMonitor:
]
)
self.logger = logging.getLogger(f"CrewAI.{crew_name}")
def start_task_monitoring(self, task_name: str, agent_name: str) -> dict:
"""Start monitoring a task execution"""
context = {
@@ -743,16 +743,16 @@ class CrewMonitor:
"agent_name": agent_name,
"start_time": time.time()
}
self.logger.info(f"Task started: {task_name} by {agent_name}")
return context
def end_task_monitoring(self, context: dict, success: bool = True,
def end_task_monitoring(self, context: dict, success: bool = True,
tokens_used: int = 0, error: str = None):
"""End monitoring and record metrics"""
end_time = time.time()
duration = end_time - context["start_time"]
# Get memory usage (if psutil is available)
memory_usage = None
try:
@@ -761,7 +761,7 @@ class CrewMonitor:
memory_usage = process.memory_info().rss / 1024 / 1024 # MB
except ImportError:
pass
metrics = TaskMetrics(
task_name=context["task_name"],
agent_name=context["agent_name"],
@@ -773,29 +773,29 @@ class CrewMonitor:
error_message=error,
memory_usage_mb=memory_usage
)
self.metrics.append(metrics)
# Log the completion
status = "SUCCESS" if success else "FAILED"
self.logger.info(f"Task {status}: {context['task_name']} "
f"(Duration: {duration:.2f}s, Tokens: {tokens_used})")
if error:
self.logger.error(f"Task error: {error}")
def get_performance_summary(self) -> Dict[str, Any]:
"""Generate comprehensive performance summary"""
if not self.metrics:
return {"message": "No metrics recorded yet"}
successful_tasks = [m for m in self.metrics if m.success]
failed_tasks = [m for m in self.metrics if not m.success]
total_duration = sum(m.duration for m in self.metrics)
total_tokens = sum(m.tokens_used for m in self.metrics)
avg_duration = total_duration / len(self.metrics)
return {
"crew_name": self.crew_name,
"session_duration": time.time() - self.session_start,
@@ -811,7 +811,7 @@ class CrewMonitor:
"most_token_intensive": max(self.metrics, key=lambda x: x.tokens_used).task_name if self.metrics else None,
"common_errors": self._get_common_errors()
}
def _get_common_errors(self) -> Dict[str, int]:
"""Get frequency of common errors"""
error_counts = {}
@@ -819,20 +819,20 @@ class CrewMonitor:
if metric.error_message:
error_counts[metric.error_message] = error_counts.get(metric.error_message, 0) + 1
return dict(sorted(error_counts.items(), key=lambda x: x[1], reverse=True))
def export_metrics(self, filename: str = None) -> str:
"""Export metrics to JSON file"""
if not filename:
filename = f"crew_metrics_{self.crew_name}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
export_data = {
"summary": self.get_performance_summary(),
"detailed_metrics": [asdict(m) for m in self.metrics]
}
with open(filename, 'w') as f:
json.dump(export_data, f, indent=2, default=str)
self.logger.info(f"Metrics exported to {filename}")
return filename
@@ -847,10 +847,10 @@ def monitored_research_task(self) -> Task:
if context:
tokens = getattr(task_output, 'token_usage', {}).get('total', 0)
monitor.end_task_monitoring(context, success=True, tokens_used=tokens)
# Start monitoring would be called before task execution
# This is a simplified example - in practice you'd integrate this into the task execution flow
return Task(
config=self.tasks_config['research_task'],
agent=self.researcher(),
@@ -872,7 +872,7 @@ class ErrorSeverity(Enum):
class CrewError(Exception):
"""Base exception for CrewAI applications"""
def __init__(self, message: str, severity: ErrorSeverity = ErrorSeverity.MEDIUM,
def __init__(self, message: str, severity: ErrorSeverity = ErrorSeverity.MEDIUM,
context: dict = None):
super().__init__(message)
self.severity = severity
@@ -893,19 +893,19 @@ class ConfigurationError(CrewError):
class ErrorHandler:
"""Centralized error handling for CrewAI applications"""
def __init__(self, crew_name: str):
self.crew_name = crew_name
self.error_log: List[CrewError] = []
self.recovery_strategies: Dict[type, Callable] = {}
def register_recovery_strategy(self, error_type: type, strategy: Callable):
"""Register a recovery strategy for specific error types"""
self.recovery_strategies[error_type] = strategy
def handle_error(self, error: Exception, context: dict = None) -> Any:
"""Handle errors with appropriate recovery strategies"""
# Convert to CrewError if needed
if not isinstance(error, CrewError):
crew_error = CrewError(
@@ -915,11 +915,11 @@ class ErrorHandler:
)
else:
crew_error = error
# Log the error
self.error_log.append(crew_error)
self._log_error(crew_error)
# Apply recovery strategy if available
error_type = type(error)
if error_type in self.recovery_strategies:
@@ -931,21 +931,21 @@ class ErrorHandler:
ErrorSeverity.HIGH,
{"original_error": str(error), "recovery_error": str(recovery_error)}
))
# If critical, re-raise
if crew_error.severity == ErrorSeverity.CRITICAL:
raise crew_error
return None
def _log_error(self, error: CrewError):
"""Log error with appropriate level based on severity"""
logger = logging.getLogger(f"CrewAI.{self.crew_name}.ErrorHandler")
error_msg = f"[{error.severity.value.upper()}] {error}"
if error.context:
error_msg += f" | Context: {error.context}"
if error.severity in [ErrorSeverity.HIGH, ErrorSeverity.CRITICAL]:
logger.error(error_msg)
logger.error(f"Stack trace: {traceback.format_exc()}")
@@ -953,16 +953,16 @@ class ErrorHandler:
logger.warning(error_msg)
else:
logger.info(error_msg)
def get_error_summary(self) -> Dict[str, Any]:
"""Get summary of errors encountered"""
if not self.error_log:
return {"total_errors": 0}
severity_counts = {}
for error in self.error_log:
severity_counts[error.severity.value] = severity_counts.get(error.severity.value, 0) + 1
return {
"total_errors": len(self.error_log),
"severity_breakdown": severity_counts,
@@ -1004,7 +1004,7 @@ def robust_task(self) -> Task:
# Use fallback response
return "Task failed, using fallback response"
return wrapper
return Task(
config=self.tasks_config['research_task'],
agent=self.researcher()
@@ -1020,60 +1020,60 @@ from pydantic import BaseSettings, Field, validator
class Environment(str, Enum):
DEVELOPMENT = "development"
TESTING = "testing"
TESTING = "testing"
STAGING = "staging"
PRODUCTION = "production"
class CrewAISettings(BaseSettings):
"""Comprehensive settings management for CrewAI applications"""
# Environment
environment: Environment = Field(default=Environment.DEVELOPMENT)
debug: bool = Field(default=True)
# API Keys (loaded from environment)
openai_api_key: Optional[str] = Field(default=None, env="OPENAI_API_KEY")
anthropic_api_key: Optional[str] = Field(default=None, env="ANTHROPIC_API_KEY")
serper_api_key: Optional[str] = Field(default=None, env="SERPER_API_KEY")
mem0_api_key: Optional[str] = Field(default=None, env="MEM0_API_KEY")
# CrewAI Configuration
crew_max_rpm: int = Field(default=100)
crew_max_execution_time: int = Field(default=3600) # 1 hour
default_llm_model: str = Field(default="gpt-4")
fallback_llm_model: str = Field(default="gpt-3.5-turbo")
# Memory and Storage
crewai_storage_dir: str = Field(default="./storage", env="CREWAI_STORAGE_DIR")
memory_enabled: bool = Field(default=True)
memory_cleanup_interval: int = Field(default=86400) # 24 hours in seconds
# Performance
enable_caching: bool = Field(default=True)
max_retries: int = Field(default=3)
retry_delay: float = Field(default=1.0)
# Monitoring
enable_monitoring: bool = Field(default=True)
log_level: str = Field(default="INFO")
metrics_export_interval: int = Field(default=3600) # 1 hour
# Security
input_sanitization: bool = Field(default=True)
max_input_length: int = Field(default=10000)
allowed_file_types: list = Field(default=["txt", "md", "pdf", "docx"])
@validator('environment', pre=True)
def set_debug_based_on_env(cls, v):
return v
@validator('debug')
def set_debug_from_env(cls, v, values):
env = values.get('environment')
if env == Environment.PRODUCTION:
return False
return v
@validator('openai_api_key')
def validate_openai_key(cls, v):
if not v:
@@ -1081,15 +1081,15 @@ class CrewAISettings(BaseSettings):
if not v.startswith('sk-'):
raise ValueError("Invalid OpenAI API key format")
return v
@property
def is_production(self) -> bool:
return self.environment == Environment.PRODUCTION
@property
def is_development(self) -> bool:
return self.environment == Environment.DEVELOPMENT
def get_llm_config(self) -> Dict[str, Any]:
"""Get LLM configuration based on environment"""
config = {
@@ -1098,12 +1098,12 @@ class CrewAISettings(BaseSettings):
"max_tokens": 4000 if self.is_production else 2000,
"timeout": 60
}
if self.is_development:
config["model"] = self.fallback_llm_model
return config
def get_memory_config(self) -> Dict[str, Any]:
"""Get memory configuration"""
return {
@@ -1112,7 +1112,7 @@ class CrewAISettings(BaseSettings):
"cleanup_interval": self.memory_cleanup_interval,
"provider": "mem0" if self.mem0_api_key and self.is_production else "local"
}
class Config:
env_file = ".env"
env_file_encoding = 'utf-8'
@@ -1125,25 +1125,25 @@ settings = CrewAISettings()
@CrewBase
class ConfigurableCrew():
"""Crew that uses centralized configuration"""
def __init__(self):
self.settings = settings
self.validate_configuration()
def validate_configuration(self):
"""Validate configuration before crew execution"""
required_keys = [self.settings.openai_api_key]
if not all(required_keys):
raise ConfigurationError("Missing required API keys")
if not os.path.exists(self.settings.crewai_storage_dir):
os.makedirs(self.settings.crewai_storage_dir, exist_ok=True)
@agent
def adaptive_agent(self) -> Agent:
"""Agent that adapts to configuration"""
llm_config = self.settings.get_llm_config()
return Agent(
config=self.agents_config['researcher'],
llm=llm_config["model"],
@@ -1163,7 +1163,7 @@ from crewai.tasks.task_output import TaskOutput
class CrewAITestFramework:
"""Comprehensive testing framework for CrewAI applications"""
@staticmethod
def create_mock_agent(role: str = "test_agent", tools: list = None) -> Mock:
"""Create a mock agent for testing"""
@@ -1175,9 +1175,9 @@ class CrewAITestFramework:
mock_agent.llm = "gpt-3.5-turbo"
mock_agent.verbose = False
return mock_agent
@staticmethod
def create_mock_task_output(content: str, success: bool = True,
def create_mock_task_output(content: str, success: bool = True,
tokens: int = 100) -> TaskOutput:
"""Create a mock task output for testing"""
return TaskOutput(
@@ -1187,13 +1187,13 @@ class CrewAITestFramework:
pydantic=None,
json_dict=None
)
@staticmethod
def create_test_crew(agents: list = None, tasks: list = None) -> Crew:
"""Create a test crew with mock components"""
test_agents = agents or [CrewAITestFramework.create_mock_agent()]
test_tasks = tasks or []
return Crew(
agents=test_agents,
tasks=test_tasks,
@@ -1203,53 +1203,53 @@ class CrewAITestFramework:
# Example test cases
class TestResearchCrew:
"""Test cases for research crew functionality"""
def setup_method(self):
"""Setup test environment"""
self.framework = CrewAITestFramework()
self.mock_serper = Mock()
@patch('crewai_tools.SerperDevTool')
def test_agent_creation(self, mock_serper_tool):
"""Test agent creation with proper configuration"""
mock_serper_tool.return_value = self.mock_serper
crew = ResearchCrew()
researcher = crew.researcher()
assert researcher.role == "Senior Research Analyst"
assert len(researcher.tools) > 0
assert researcher.verbose is True
def test_task_validation(self):
"""Test task validation logic"""
crew = ResearchCrew()
# Test valid output
valid_output = self.framework.create_mock_task_output(
"This is a comprehensive research summary with conclusions and findings."
)
is_valid, message = crew.validate_research_quality(valid_output)
assert is_valid is True
# Test invalid output (too short)
invalid_output = self.framework.create_mock_task_output("Too short")
is_valid, message = crew.validate_research_quality(invalid_output)
assert is_valid is False
assert "brief" in message.lower()
@patch('requests.get')
def test_tool_error_handling(self, mock_requests):
"""Test tool error handling and recovery"""
# Simulate network error
mock_requests.side_effect = requests.exceptions.RequestException("Network error")
tool = RobustSearchTool()
result = tool._run("test query")
assert "network error" in result.lower()
assert "failed" in result.lower()
@pytest.mark.asyncio
async def test_crew_execution_flow(self):
"""Test complete crew execution with mocked dependencies"""
@@ -1257,18 +1257,18 @@ class TestResearchCrew:
mock_execute.return_value = self.framework.create_mock_task_output(
"Research completed successfully with findings and recommendations."
)
crew = ResearchCrew()
result = crew.crew().kickoff(inputs={"topic": "AI testing"})
assert result is not None
assert "successfully" in result.raw.lower()
def test_memory_integration(self):
"""Test memory system integration"""
crew = ResearchCrew()
memory_manager = AdvancedMemoryManager(crew)
# Test saving to memory
test_content = "Important research finding about AI"
memory_manager.save_with_context(
@@ -1277,34 +1277,34 @@ class TestResearchCrew:
metadata={"importance": "high"},
agent="researcher"
)
# Test searching memory
results = memory_manager.search_across_memories("AI research")
assert "short_term" in results
def test_error_handling_workflow(self):
"""Test error handling and recovery mechanisms"""
error_handler = ErrorHandler("test_crew")
# Test error registration and handling
test_error = TaskExecutionError("Test task failed", ErrorSeverity.MEDIUM)
result = error_handler.handle_error(test_error)
assert len(error_handler.error_log) == 1
assert error_handler.error_log[0].severity == ErrorSeverity.MEDIUM
def test_configuration_validation(self):
"""Test configuration validation"""
# Test with missing API key
with patch.dict(os.environ, {}, clear=True):
with pytest.raises(ValueError):
settings = CrewAISettings()
# Test with valid configuration
with patch.dict(os.environ, {"OPENAI_API_KEY": "sk-test-key"}):
settings = CrewAISettings()
assert settings.openai_api_key == "sk-test-key"
@pytest.mark.integration
def test_end_to_end_workflow(self):
"""Integration test for complete workflow"""
@@ -1315,41 +1315,41 @@ class TestResearchCrew:
# Performance testing
class TestCrewPerformance:
"""Performance tests for CrewAI applications"""
def test_memory_usage(self):
"""Test memory usage during crew execution"""
import psutil
import gc
process = psutil.Process()
initial_memory = process.memory_info().rss
# Create and run crew multiple times
for i in range(10):
crew = ResearchCrew()
# Simulate crew execution
del crew
gc.collect()
final_memory = process.memory_info().rss
memory_increase = final_memory - initial_memory
# Assert memory increase is reasonable (less than 100MB)
assert memory_increase < 100 * 1024 * 1024
def test_concurrent_execution(self):
"""Test concurrent crew execution"""
import concurrent.futures
def run_crew(crew_id):
crew = ResearchCrew()
# Simulate execution
return f"crew_{crew_id}_completed"
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
futures = [executor.submit(run_crew, i) for i in range(5)]
results = [future.result() for future in futures]
assert len(results) == 5
assert all("completed" in result for result in results)
@@ -1400,7 +1400,7 @@ class TestCrewPerformance:
### Development:
1. Always use .env files for sensitive configuration
2. Implement comprehensive error handling and logging
2. Implement comprehensive error handling and logging
3. Use structured outputs with Pydantic for reliability
4. Test crew functionality with different input scenarios
5. Follow CrewAI patterns and conventions consistently
@@ -1426,4 +1426,4 @@ class TestCrewPerformance:
5. Use async patterns for I/O-bound operations
6. Implement proper connection pooling and resource management
7. Profile and optimize critical paths
8. Plan for horizontal scaling when needed
8. Plan for horizontal scaling when needed

View File

@@ -7,14 +7,18 @@ permissions:
env:
OPENAI_API_KEY: fake-api-key
PYTHONUNBUFFERED: 1
jobs:
tests:
name: tests (${{ matrix.python-version }})
runs-on: ubuntu-latest
timeout-minutes: 15
strategy:
fail-fast: true
matrix:
python-version: ['3.10', '3.11', '3.12', '3.13']
group: [1, 2, 3, 4, 5, 6, 7, 8]
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -23,6 +27,9 @@ jobs:
uses: astral-sh/setup-uv@v3
with:
enable-cache: true
cache-dependency-glob: |
**/pyproject.toml
**/uv.lock
- name: Set up Python ${{ matrix.python-version }}
run: uv python install ${{ matrix.python-version }}
@@ -30,5 +37,14 @@ jobs:
- name: Install the project
run: uv sync --dev --all-extras
- name: Run tests
run: uv run pytest --block-network --timeout=60 -vv
- name: Run tests (group ${{ matrix.group }} of 8)
run: |
uv run pytest \
--block-network \
--timeout=30 \
-vv \
--splits 8 \
--group ${{ matrix.group }} \
--durations=10 \
-n auto \
--maxfail=3

3
.gitignore vendored
View File

@@ -26,4 +26,5 @@ test_flow.html
crewairules.mdc
plan.md
conceptual_plan.md
build_image
build_image
chromadb-*.lock

View File

@@ -9,12 +9,7 @@
},
"favicon": "/images/favicon.svg",
"contextual": {
"options": [
"copy",
"view",
"chatgpt",
"claude"
]
"options": ["copy", "view", "chatgpt", "claude"]
},
"navigation": {
"languages": [
@@ -37,11 +32,6 @@
"href": "https://chatgpt.com/g/g-qqTuUWsBY-crewai-assistant",
"icon": "robot"
},
{
"anchor": "Get Help",
"href": "mailto:support@crewai.com",
"icon": "headset"
},
{
"anchor": "Releases",
"href": "https://github.com/crewAIInc/crewAI/releases",
@@ -55,32 +45,22 @@
"groups": [
{
"group": "Get Started",
"pages": [
"en/introduction",
"en/installation",
"en/quickstart"
]
"pages": ["en/introduction", "en/installation", "en/quickstart"]
},
{
"group": "Guides",
"pages": [
{
"group": "Strategy",
"pages": [
"en/guides/concepts/evaluating-use-cases"
]
"pages": ["en/guides/concepts/evaluating-use-cases"]
},
{
"group": "Agents",
"pages": [
"en/guides/agents/crafting-effective-agents"
]
"pages": ["en/guides/agents/crafting-effective-agents"]
},
{
"group": "Crews",
"pages": [
"en/guides/crews/first-crew"
]
"pages": ["en/guides/crews/first-crew"]
},
{
"group": "Flows",
@@ -94,7 +74,6 @@
"pages": [
"en/guides/advanced/customizing-prompts",
"en/guides/advanced/fingerprinting"
]
}
]
@@ -182,7 +161,9 @@
"en/tools/search-research/websitesearchtool",
"en/tools/search-research/codedocssearchtool",
"en/tools/search-research/youtubechannelsearchtool",
"en/tools/search-research/youtubevideosearchtool"
"en/tools/search-research/youtubevideosearchtool",
"en/tools/search-research/tavilysearchtool",
"en/tools/search-research/tavilyextractortool"
]
},
{
@@ -241,6 +222,7 @@
"en/observability/langtrace",
"en/observability/maxim",
"en/observability/mlflow",
"en/observability/neatlogs",
"en/observability/openlit",
"en/observability/opik",
"en/observability/patronus-evaluation",
@@ -274,9 +256,7 @@
},
{
"group": "Telemetry",
"pages": [
"en/telemetry"
]
"pages": ["en/telemetry"]
}
]
},
@@ -285,9 +265,7 @@
"groups": [
{
"group": "Getting Started",
"pages": [
"en/enterprise/introduction"
]
"pages": ["en/enterprise/introduction"]
},
{
"group": "Features",
@@ -296,7 +274,8 @@
"en/enterprise/features/webhook-streaming",
"en/enterprise/features/traces",
"en/enterprise/features/hallucination-guardrail",
"en/enterprise/features/integrations"
"en/enterprise/features/integrations",
"en/enterprise/features/agent-repositories"
]
},
{
@@ -341,9 +320,7 @@
},
{
"group": "Resources",
"pages": [
"en/enterprise/resources/frequently-asked-questions"
]
"pages": ["en/enterprise/resources/frequently-asked-questions"]
}
]
},
@@ -352,9 +329,7 @@
"groups": [
{
"group": "Getting Started",
"pages": [
"en/api-reference/introduction"
]
"pages": ["en/api-reference/introduction"]
},
{
"group": "Endpoints",
@@ -364,16 +339,13 @@
},
{
"tab": "Examples",
"groups": [
"groups": [
{
"group": "Examples",
"pages": [
"en/examples/example"
]
"pages": ["en/examples/example"]
}
]
}
]
},
{
@@ -395,11 +367,6 @@
"href": "https://chatgpt.com/g/g-qqTuUWsBY-crewai-assistant",
"icon": "robot"
},
{
"anchor": "Obter Ajuda",
"href": "mailto:support@crewai.com",
"icon": "headset"
},
{
"anchor": "Lançamentos",
"href": "https://github.com/crewAIInc/crewAI/releases",
@@ -424,21 +391,15 @@
"pages": [
{
"group": "Estratégia",
"pages": [
"pt-BR/guides/concepts/evaluating-use-cases"
]
"pages": ["pt-BR/guides/concepts/evaluating-use-cases"]
},
{
"group": "Agentes",
"pages": [
"pt-BR/guides/agents/crafting-effective-agents"
]
"pages": ["pt-BR/guides/agents/crafting-effective-agents"]
},
{
"group": "Crews",
"pages": [
"pt-BR/guides/crews/first-crew"
]
"pages": ["pt-BR/guides/crews/first-crew"]
},
{
"group": "Flows",
@@ -631,9 +592,7 @@
},
{
"group": "Telemetria",
"pages": [
"pt-BR/telemetry"
]
"pages": ["pt-BR/telemetry"]
}
]
},
@@ -642,9 +601,7 @@
"groups": [
{
"group": "Começando",
"pages": [
"pt-BR/enterprise/introduction"
]
"pages": ["pt-BR/enterprise/introduction"]
},
{
"group": "Funcionalidades",
@@ -709,9 +666,7 @@
"groups": [
{
"group": "Começando",
"pages": [
"pt-BR/api-reference/introduction"
]
"pages": ["pt-BR/api-reference/introduction"]
},
{
"group": "Endpoints",
@@ -721,16 +676,13 @@
},
{
"tab": "Exemplos",
"groups": [
"groups": [
{
"group": "Exemplos",
"pages": [
"pt-BR/examples/example"
]
"pages": ["pt-BR/examples/example"]
}
]
}
]
}
]
@@ -774,7 +726,7 @@
"destination": "/en/introduction"
},
{
"source": "/installation",
"source": "/installation",
"destination": "/en/installation"
},
{

View File

@@ -526,6 +526,103 @@ agent = Agent(
The context window management feature works automatically in the background. You don't need to call any special functions - just set `respect_context_window` to your preferred behavior and CrewAI handles the rest!
</Note>
## Direct Agent Interaction with `kickoff()`
Agents can be used directly without going through a task or crew workflow using the `kickoff()` method. This provides a simpler way to interact with an agent when you don't need the full crew orchestration capabilities.
### How `kickoff()` Works
The `kickoff()` method allows you to send messages directly to an agent and get a response, similar to how you would interact with an LLM but with all the agent's capabilities (tools, reasoning, etc.).
```python Code
from crewai import Agent
from crewai_tools import SerperDevTool
# Create an agent
researcher = Agent(
role="AI Technology Researcher",
goal="Research the latest AI developments",
tools=[SerperDevTool()],
verbose=True
)
# Use kickoff() to interact directly with the agent
result = researcher.kickoff("What are the latest developments in language models?")
# Access the raw response
print(result.raw)
```
### Parameters and Return Values
| Parameter | Type | Description |
| :---------------- | :---------------------------------- | :------------------------------------------------------------------------ |
| `messages` | `Union[str, List[Dict[str, str]]]` | Either a string query or a list of message dictionaries with role/content |
| `response_format` | `Optional[Type[Any]]` | Optional Pydantic model for structured output |
The method returns a `LiteAgentOutput` object with the following properties:
- `raw`: String containing the raw output text
- `pydantic`: Parsed Pydantic model (if a `response_format` was provided)
- `agent_role`: Role of the agent that produced the output
- `usage_metrics`: Token usage metrics for the execution
### Structured Output
You can get structured output by providing a Pydantic model as the `response_format`:
```python Code
from pydantic import BaseModel
from typing import List
class ResearchFindings(BaseModel):
main_points: List[str]
key_technologies: List[str]
future_predictions: str
# Get structured output
result = researcher.kickoff(
"Summarize the latest developments in AI for 2025",
response_format=ResearchFindings
)
# Access structured data
print(result.pydantic.main_points)
print(result.pydantic.future_predictions)
```
### Multiple Messages
You can also provide a conversation history as a list of message dictionaries:
```python Code
messages = [
{"role": "user", "content": "I need information about large language models"},
{"role": "assistant", "content": "I'd be happy to help with that! What specifically would you like to know?"},
{"role": "user", "content": "What are the latest developments in 2025?"}
]
result = researcher.kickoff(messages)
```
### Async Support
An asynchronous version is available via `kickoff_async()` with the same parameters:
```python Code
import asyncio
async def main():
result = await researcher.kickoff_async("What are the latest developments in AI?")
print(result.raw)
asyncio.run(main())
```
<Note>
The `kickoff()` method uses a `LiteAgent` internally, which provides a simpler execution flow while preserving all of the agent's configuration (role, goal, backstory, tools, etc.).
</Note>
## Important Considerations and Best Practices
### Security and Code Execution

View File

@@ -4,6 +4,8 @@ description: Learn how to use the CrewAI CLI to interact with CrewAI.
icon: terminal
---
<Warning>Since release 0.140.0, CrewAI Enterprise started a process of migrating their login provider. As such, the authentication flow via CLI was updated. Users that use Google to login, or that created their account after July 3rd, 2025 will be unable to log in with older versions of the `crewai` library.</Warning>
## Overview
The CrewAI CLI provides a set of commands to interact with CrewAI, allowing you to create, train, run, and manage crews & flows.
@@ -186,10 +188,7 @@ def crew(self) -> Crew:
Deploy the crew or flow to [CrewAI Enterprise](https://app.crewai.com).
- **Authentication**: You need to be authenticated to deploy to CrewAI Enterprise.
```shell Terminal
crewai signup
```
If you already have an account, you can login with:
You can login or create an account with:
```shell Terminal
crewai login
```

View File

@@ -32,6 +32,7 @@ A crew in crewAI represents a collaborative group of agents working together to
| **Prompt File** _(optional)_ | `prompt_file` | Path to the prompt JSON file to be used for the crew. |
| **Planning** *(optional)* | `planning` | Adds planning ability to the Crew. When activated before each Crew iteration, all Crew data is sent to an AgentPlanner that will plan the tasks and this plan will be added to each task description. |
| **Planning LLM** *(optional)* | `planning_llm` | The language model used by the AgentPlanner in a planning process. |
| **Knowledge Sources** _(optional)_ | `knowledge_sources` | Knowledge sources available at the crew level, accessible to all the agents. |
<Tip>
**Crew Max RPM**: The `max_rpm` attribute sets the maximum number of requests per minute the crew can perform to avoid rate limits and will override individual agents' `max_rpm` settings if you set it.

View File

@@ -255,6 +255,17 @@ CrewAI provides a wide range of events that you can listen for:
- **LLMCallFailedEvent**: Emitted when an LLM call fails
- **LLMStreamChunkEvent**: Emitted for each chunk received during streaming LLM responses
### Memory Events
- **MemoryQueryStartedEvent**: Emitted when a memory query is started. Contains the query, limit, and optional score threshold.
- **MemoryQueryCompletedEvent**: Emitted when a memory query is completed successfully. Contains the query, results, limit, score threshold, and query execution time.
- **MemoryQueryFailedEvent**: Emitted when a memory query fails. Contains the query, limit, score threshold, and error message.
- **MemorySaveStartedEvent**: Emitted when a memory save operation is started. Contains the value to be saved, metadata, and optional agent role.
- **MemorySaveCompletedEvent**: Emitted when a memory save operation is completed successfully. Contains the saved value, metadata, agent role, and save execution time.
- **MemorySaveFailedEvent**: Emitted when a memory save operation fails. Contains the value, metadata, agent role, and error message.
- **MemoryRetrievalStartedEvent**: Emitted when memory retrieval for a task prompt starts. Contains the optional task ID.
- **MemoryRetrievalCompletedEvent**: Emitted when memory retrieval for a task prompt completes successfully. Contains the task ID, memory content, and retrieval execution time.
## Event Handler Structure
Each event handler receives two parameters:

View File

@@ -270,7 +270,7 @@ In this section, you'll find detailed examples that help you select, configure,
from crewai import LLM
llm = LLM(
model="gemini/gemini-1.5-pro-latest",
model="gemini-1.5-pro-latest", # or vertex_ai/gemini-1.5-pro-latest
temperature=0.7,
vertex_credentials=vertex_credentials_json
)

View File

@@ -9,7 +9,7 @@ icon: database
The CrewAI framework provides a sophisticated memory system designed to significantly enhance AI agent capabilities. CrewAI offers **three distinct memory approaches** that serve different use cases:
1. **Basic Memory System** - Built-in short-term, long-term, and entity memory
2. **User Memory** - User-specific memory with Mem0 integration (legacy approach)
2. **User Memory** - User-specific memory with Mem0 integration (legacy approach)
3. **External Memory** - Standalone external memory providers (new approach)
## Memory System Components
@@ -62,7 +62,7 @@ By default, CrewAI uses the `appdirs` library to determine storage locations fol
```
~/Library/Application Support/CrewAI/{project_name}/
├── knowledge/ # Knowledge base ChromaDB files
├── short_term_memory/ # Short-term memory ChromaDB files
├── short_term_memory/ # Short-term memory ChromaDB files
├── long_term_memory/ # Long-term memory ChromaDB files
├── entities/ # Entity memory ChromaDB files
└── long_term_memory_storage.db # SQLite database
@@ -252,7 +252,7 @@ chroma_path = os.path.join(storage_path, "knowledge")
if os.path.exists(chroma_path):
client = chromadb.PersistentClient(path=chroma_path)
collections = client.list_collections()
print("ChromaDB Collections:")
for collection in collections:
print(f" - {collection.name}: {collection.count()} documents")
@@ -269,7 +269,7 @@ crew = Crew(agents=[...], tasks=[...], memory=True)
# Reset specific memory types
crew.reset_memories(command_type='short') # Short-term memory
crew.reset_memories(command_type='long') # Long-term memory
crew.reset_memories(command_type='long') # Long-term memory
crew.reset_memories(command_type='entity') # Entity memory
crew.reset_memories(command_type='knowledge') # Knowledge storage
```
@@ -596,7 +596,7 @@ providers_to_test = [
{
"name": "Ollama",
"config": {
"provider": "ollama",
"provider": "ollama",
"config": {"model": "mxbai-embed-large"}
}
}
@@ -604,7 +604,7 @@ providers_to_test = [
for provider in providers_to_test:
print(f"\nTesting {provider['name']} embeddings...")
# Create crew with specific embedder
crew = Crew(
agents=[...],
@@ -612,7 +612,7 @@ for provider in providers_to_test:
memory=True,
embedder=provider['config']
)
# Run your test and measure performance
result = crew.kickoff()
print(f"{provider['name']} completed successfully")
@@ -623,7 +623,7 @@ for provider in providers_to_test:
**Model not found errors:**
```python
# Verify model availability
from crewai.utilities.embedding_configurator import EmbeddingConfigurator
from crewai.rag.embeddings.configurator import EmbeddingConfigurator
configurator = EmbeddingConfigurator()
try:
@@ -655,17 +655,17 @@ import time
def test_embedding_performance(embedder_config, test_text="This is a test document"):
start_time = time.time()
crew = Crew(
agents=[...],
tasks=[...],
memory=True,
embedder=embedder_config
)
# Simulate memory operation
crew.kickoff()
end_time = time.time()
return end_time - start_time
@@ -676,7 +676,7 @@ openai_time = test_embedding_performance({
})
ollama_time = test_embedding_performance({
"provider": "ollama",
"provider": "ollama",
"config": {"model": "mxbai-embed-large"}
})
@@ -712,7 +712,7 @@ crew = Crew(
memory_config={
"provider": "mem0",
"config": {"user_id": "john"},
"user_memory": {} # Required - triggers user memory initialization
"user_memory": {} # DEPRECATED: Will be removed in version 0.156.0 or on 2025-08-04, use external_memory instead
},
process=Process.sequential,
verbose=True
@@ -720,7 +720,16 @@ crew = Crew(
```
### Advanced Mem0 Configuration
When using Mem0 Client, you can customize the memory configuration further, by using parameters like 'includes', 'excludes', 'custom_categories', 'infer' and 'run_id' (this is only for short-term memory).
You can find more details in the [Mem0 documentation](https://docs.mem0.ai/).
```python
new_categories = [
{"lifestyle_management_concerns": "Tracks daily routines, habits, hobbies and interests including cooking, time management and work-life balance"},
{"seeking_structure": "Documents goals around creating routines, schedules, and organized systems in various life areas"},
{"personal_information": "Basic information about the user including name, preferences, and personality traits"}
]
crew = Crew(
agents=[...],
tasks=[...],
@@ -732,6 +741,11 @@ crew = Crew(
"org_id": "my_org_id", # Optional
"project_id": "my_project_id", # Optional
"api_key": "custom-api-key" # Optional - overrides env var
"run_id": "my_run_id", # Optional - for short-term memory
"includes": "include1", # Optional
"excludes": "exclude1", # Optional
"infer": True # Optional defaults to True
"custom_categories": new_categories # Optional - custom categories for user memory
},
"user_memory": {}
}
@@ -761,7 +775,8 @@ crew = Crew(
"provider": "openai",
"config": {"api_key": "your-api-key", "model": "text-embedding-3-small"}
}
}
},
"infer": True # Optional defaults to True
},
"user_memory": {}
}
@@ -783,7 +798,7 @@ os.environ["MEM0_API_KEY"] = "your-api-key"
# Create external memory instance
external_memory = ExternalMemory(
embedder_config={
"provider": "mem0",
"provider": "mem0",
"config": {"user_id": "U-123"}
}
)
@@ -808,8 +823,8 @@ class CustomStorage(Storage):
def save(self, value, metadata=None, agent=None):
self.memories.append({
"value": value,
"metadata": metadata,
"value": value,
"metadata": metadata,
"agent": agent
})
@@ -986,7 +1001,201 @@ crew = Crew(
- 🫡 **Enhanced Personalization:** Memory enables agents to remember user preferences and historical interactions, leading to personalized experiences.
- 🧠 **Improved Problem Solving:** Access to a rich memory store aids agents in making more informed decisions, drawing on past learnings and contextual insights.
## Memory Events
CrewAI's event system provides powerful insights into memory operations. By leveraging memory events, you can monitor, debug, and optimize your memory system's performance and behavior.
### Available Memory Events
CrewAI emits the following memory-related events:
| Event | Description | Key Properties |
| :---- | :---------- | :------------- |
| **MemoryQueryStartedEvent** | Emitted when a memory query begins | `query`, `limit`, `score_threshold` |
| **MemoryQueryCompletedEvent** | Emitted when a memory query completes successfully | `query`, `results`, `limit`, `score_threshold`, `query_time_ms` |
| **MemoryQueryFailedEvent** | Emitted when a memory query fails | `query`, `limit`, `score_threshold`, `error` |
| **MemorySaveStartedEvent** | Emitted when a memory save operation begins | `value`, `metadata`, `agent_role` |
| **MemorySaveCompletedEvent** | Emitted when a memory save operation completes successfully | `value`, `metadata`, `agent_role`, `save_time_ms` |
| **MemorySaveFailedEvent** | Emitted when a memory save operation fails | `value`, `metadata`, `agent_role`, `error` |
| **MemoryRetrievalStartedEvent** | Emitted when memory retrieval for a task prompt starts | `task_id` |
| **MemoryRetrievalCompletedEvent** | Emitted when memory retrieval completes successfully | `task_id`, `memory_content`, `retrieval_time_ms` |
### Practical Applications
#### 1. Memory Performance Monitoring
Track memory operation timing to optimize your application:
```python
from crewai.utilities.events.base_event_listener import BaseEventListener
from crewai.utilities.events import (
MemoryQueryCompletedEvent,
MemorySaveCompletedEvent
)
import time
class MemoryPerformanceMonitor(BaseEventListener):
def __init__(self):
super().__init__()
self.query_times = []
self.save_times = []
def setup_listeners(self, crewai_event_bus):
@crewai_event_bus.on(MemoryQueryCompletedEvent)
def on_memory_query_completed(source, event: MemoryQueryCompletedEvent):
self.query_times.append(event.query_time_ms)
print(f"Memory query completed in {event.query_time_ms:.2f}ms. Query: '{event.query}'")
print(f"Average query time: {sum(self.query_times)/len(self.query_times):.2f}ms")
@crewai_event_bus.on(MemorySaveCompletedEvent)
def on_memory_save_completed(source, event: MemorySaveCompletedEvent):
self.save_times.append(event.save_time_ms)
print(f"Memory save completed in {event.save_time_ms:.2f}ms")
print(f"Average save time: {sum(self.save_times)/len(self.save_times):.2f}ms")
# Create an instance of your listener
memory_monitor = MemoryPerformanceMonitor()
```
#### 2. Memory Content Logging
Log memory operations for debugging and insights:
```python
from crewai.utilities.events.base_event_listener import BaseEventListener
from crewai.utilities.events import (
MemorySaveStartedEvent,
MemoryQueryStartedEvent,
MemoryRetrievalCompletedEvent
)
import logging
# Configure logging
logger = logging.getLogger('memory_events')
class MemoryLogger(BaseEventListener):
def setup_listeners(self, crewai_event_bus):
@crewai_event_bus.on(MemorySaveStartedEvent)
def on_memory_save_started(source, event: MemorySaveStartedEvent):
if event.agent_role:
logger.info(f"Agent '{event.agent_role}' saving memory: {event.value[:50]}...")
else:
logger.info(f"Saving memory: {event.value[:50]}...")
@crewai_event_bus.on(MemoryQueryStartedEvent)
def on_memory_query_started(source, event: MemoryQueryStartedEvent):
logger.info(f"Memory query started: '{event.query}' (limit: {event.limit})")
@crewai_event_bus.on(MemoryRetrievalCompletedEvent)
def on_memory_retrieval_completed(source, event: MemoryRetrievalCompletedEvent):
if event.task_id:
logger.info(f"Memory retrieved for task {event.task_id} in {event.retrieval_time_ms:.2f}ms")
else:
logger.info(f"Memory retrieved in {event.retrieval_time_ms:.2f}ms")
logger.debug(f"Memory content: {event.memory_content}")
# Create an instance of your listener
memory_logger = MemoryLogger()
```
#### 3. Error Tracking and Notifications
Capture and respond to memory errors:
```python
from crewai.utilities.events.base_event_listener import BaseEventListener
from crewai.utilities.events import (
MemorySaveFailedEvent,
MemoryQueryFailedEvent
)
import logging
from typing import Optional
# Configure logging
logger = logging.getLogger('memory_errors')
class MemoryErrorTracker(BaseEventListener):
def __init__(self, notify_email: Optional[str] = None):
super().__init__()
self.notify_email = notify_email
self.error_count = 0
def setup_listeners(self, crewai_event_bus):
@crewai_event_bus.on(MemorySaveFailedEvent)
def on_memory_save_failed(source, event: MemorySaveFailedEvent):
self.error_count += 1
agent_info = f"Agent '{event.agent_role}'" if event.agent_role else "Unknown agent"
error_message = f"Memory save failed: {event.error}. {agent_info}"
logger.error(error_message)
if self.notify_email and self.error_count % 5 == 0:
self._send_notification(error_message)
@crewai_event_bus.on(MemoryQueryFailedEvent)
def on_memory_query_failed(source, event: MemoryQueryFailedEvent):
self.error_count += 1
error_message = f"Memory query failed: {event.error}. Query: '{event.query}'"
logger.error(error_message)
if self.notify_email and self.error_count % 5 == 0:
self._send_notification(error_message)
def _send_notification(self, message):
# Implement your notification system (email, Slack, etc.)
print(f"[NOTIFICATION] Would send to {self.notify_email}: {message}")
# Create an instance of your listener
error_tracker = MemoryErrorTracker(notify_email="admin@example.com")
```
### Integrating with Analytics Platforms
Memory events can be forwarded to analytics and monitoring platforms to track performance metrics, detect anomalies, and visualize memory usage patterns:
```python
from crewai.utilities.events.base_event_listener import BaseEventListener
from crewai.utilities.events import (
MemoryQueryCompletedEvent,
MemorySaveCompletedEvent
)
class MemoryAnalyticsForwarder(BaseEventListener):
def __init__(self, analytics_client):
super().__init__()
self.client = analytics_client
def setup_listeners(self, crewai_event_bus):
@crewai_event_bus.on(MemoryQueryCompletedEvent)
def on_memory_query_completed(source, event: MemoryQueryCompletedEvent):
# Forward query metrics to analytics platform
self.client.track_metric({
"event_type": "memory_query",
"query": event.query,
"duration_ms": event.query_time_ms,
"result_count": len(event.results) if hasattr(event.results, "__len__") else 0,
"timestamp": event.timestamp
})
@crewai_event_bus.on(MemorySaveCompletedEvent)
def on_memory_save_completed(source, event: MemorySaveCompletedEvent):
# Forward save metrics to analytics platform
self.client.track_metric({
"event_type": "memory_save",
"agent_role": event.agent_role,
"duration_ms": event.save_time_ms,
"timestamp": event.timestamp
})
```
### Best Practices for Memory Event Listeners
1. **Keep handlers lightweight**: Avoid complex processing in event handlers to prevent performance impacts
2. **Use appropriate logging levels**: Use INFO for normal operations, DEBUG for details, ERROR for issues
3. **Batch metrics when possible**: Accumulate metrics before sending to external systems
4. **Handle exceptions gracefully**: Ensure your event handlers don't crash due to unexpected data
5. **Consider memory consumption**: Be mindful of storing large amounts of event data
## Conclusion
Integrating CrewAI's memory system into your projects is straightforward. By leveraging the provided memory components and configurations,
Integrating CrewAI's memory system into your projects is straightforward. By leveraging the provided memory components and configurations,
you can quickly empower your agents with the ability to remember, reason, and learn from their interactions, unlocking new levels of intelligence and capability.

View File

@@ -54,9 +54,11 @@ crew = Crew(
| **Markdown** _(optional)_ | `markdown` | `Optional[bool]` | Whether the task should instruct the agent to return the final answer formatted in Markdown. Defaults to False. |
| **Config** _(optional)_ | `config` | `Optional[Dict[str, Any]]` | Task-specific configuration parameters. |
| **Output File** _(optional)_ | `output_file` | `Optional[str]` | File path for storing the task output. |
| **Create Directory** _(optional)_ | `create_directory` | `Optional[bool]` | Whether to create the directory for output_file if it doesn't exist. Defaults to True. |
| **Output JSON** _(optional)_ | `output_json` | `Optional[Type[BaseModel]]` | A Pydantic model to structure the JSON output. |
| **Output Pydantic** _(optional)_ | `output_pydantic` | `Optional[Type[BaseModel]]` | A Pydantic model for task output. |
| **Callback** _(optional)_ | `callback` | `Optional[Any]` | Function/object to be executed after task completion. |
| **Guardrail** _(optional)_ | `guardrail` | `Optional[Callable]` | Function to validate task output before proceeding to next task. |
## Creating Tasks
@@ -332,9 +334,11 @@ Task guardrails provide a way to validate and transform task outputs before they
are passed to the next task. This feature helps ensure data quality and provides
feedback to agents when their output doesn't meet specific criteria.
### Using Task Guardrails
Guardrails are implemented as Python functions that contain custom validation logic, giving you complete control over the validation process and ensuring reliable, deterministic results.
To add a guardrail to a task, provide a validation function through the `guardrail` parameter:
### Function-Based Guardrails
To add a function-based guardrail to a task, provide a validation function through the `guardrail` parameter:
```python Code
from typing import Tuple, Union, Dict, Any
@@ -372,9 +376,7 @@ blog_task = Task(
- On success: it returns a tuple of `(bool, Any)`. For example: `(True, validated_result)`
- On Failure: it returns a tuple of `(bool, str)`. For example: `(False, "Error message explain the failure")`
### LLMGuardrail
The `LLMGuardrail` class offers a robust mechanism for validating task outputs.
### Error Handling Best Practices
@@ -798,184 +800,91 @@ While creating and executing tasks, certain validation mechanisms are in place t
These validations help in maintaining the consistency and reliability of task executions within the crewAI framework.
## Task Guardrails
Task guardrails provide a powerful way to validate, transform, or filter task outputs before they are passed to the next task. Guardrails are optional functions that execute before the next task starts, allowing you to ensure that task outputs meet specific requirements or formats.
### Basic Usage
#### Define your own logic to validate
```python Code
from typing import Tuple, Union
from crewai import Task
def validate_json_output(result: str) -> Tuple[bool, Union[dict, str]]:
"""Validate that the output is valid JSON."""
try:
json_data = json.loads(result)
return (True, json_data)
except json.JSONDecodeError:
return (False, "Output must be valid JSON")
task = Task(
description="Generate JSON data",
expected_output="Valid JSON object",
guardrail=validate_json_output
)
```
#### Leverage a no-code approach for validation
```python Code
from crewai import Task
task = Task(
description="Generate JSON data",
expected_output="Valid JSON object",
guardrail="Ensure the response is a valid JSON object"
)
```
#### Using YAML
```yaml
research_task:
...
guardrail: make sure each bullet contains a minimum of 100 words
...
```
```python Code
@CrewBase
class InternalCrew:
agents_config = "config/agents.yaml"
tasks_config = "config/tasks.yaml"
...
@task
def research_task(self):
return Task(config=self.tasks_config["research_task"]) # type: ignore[index]
...
```
#### Use custom models for code generation
```python Code
from crewai import Task
from crewai.llm import LLM
task = Task(
description="Generate JSON data",
expected_output="Valid JSON object",
guardrail=LLMGuardrail(
description="Ensure the response is a valid JSON object",
llm=LLM(model="gpt-4o-mini"),
)
)
```
### How Guardrails Work
1. **Optional Attribute**: Guardrails are an optional attribute at the task level, allowing you to add validation only where needed.
2. **Execution Timing**: The guardrail function is executed before the next task starts, ensuring valid data flow between tasks.
3. **Return Format**: Guardrails must return a tuple of `(success, data)`:
- If `success` is `True`, `data` is the validated/transformed result
- If `success` is `False`, `data` is the error message
4. **Result Routing**:
- On success (`True`), the result is automatically passed to the next task
- On failure (`False`), the error is sent back to the agent to generate a new answer
### Common Use Cases
#### Data Format Validation
```python Code
def validate_email_format(result: str) -> Tuple[bool, Union[str, str]]:
"""Ensure the output contains a valid email address."""
import re
email_pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$'
if re.match(email_pattern, result.strip()):
return (True, result.strip())
return (False, "Output must be a valid email address")
```
#### Content Filtering
```python Code
def filter_sensitive_info(result: str) -> Tuple[bool, Union[str, str]]:
"""Remove or validate sensitive information."""
sensitive_patterns = ['SSN:', 'password:', 'secret:']
for pattern in sensitive_patterns:
if pattern.lower() in result.lower():
return (False, f"Output contains sensitive information ({pattern})")
return (True, result)
```
#### Data Transformation
```python Code
def normalize_phone_number(result: str) -> Tuple[bool, Union[str, str]]:
"""Ensure phone numbers are in a consistent format."""
import re
digits = re.sub(r'\D', '', result)
if len(digits) == 10:
formatted = f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
return (True, formatted)
return (False, "Output must be a 10-digit phone number")
```
### Advanced Features
#### Chaining Multiple Validations
```python Code
def chain_validations(*validators):
"""Chain multiple validators together."""
def combined_validator(result):
for validator in validators:
success, data = validator(result)
if not success:
return (False, data)
result = data
return (True, result)
return combined_validator
# Usage
task = Task(
description="Get user contact info",
expected_output="Email and phone",
guardrail=chain_validations(
validate_email_format,
filter_sensitive_info
)
)
```
#### Custom Retry Logic
```python Code
task = Task(
description="Generate data",
expected_output="Valid data",
guardrail=validate_data,
max_retries=5 # Override default retry limit
)
```
## Creating Directories when Saving Files
You can now specify if a task should create directories when saving its output to a file. This is particularly useful for organizing outputs and ensuring that file paths are correctly structured.
The `create_directory` parameter controls whether CrewAI should automatically create directories when saving task outputs to files. This feature is particularly useful for organizing outputs and ensuring that file paths are correctly structured, especially when working with complex project hierarchies.
### Default Behavior
By default, `create_directory=True`, which means CrewAI will automatically create any missing directories in the output file path:
```python Code
# ...
save_output_task = Task(
description='Save the summarized AI news to a file',
expected_output='File saved successfully',
agent=research_agent,
tools=[file_save_tool],
output_file='outputs/ai_news_summary.txt',
create_directory=True
# Default behavior - directories are created automatically
report_task = Task(
description='Generate a comprehensive market analysis report',
expected_output='A detailed market analysis with charts and insights',
agent=analyst_agent,
output_file='reports/2025/market_analysis.md', # Creates 'reports/2025/' if it doesn't exist
markdown=True
)
```
#...
### Disabling Directory Creation
If you want to prevent automatic directory creation and ensure that the directory already exists, set `create_directory=False`:
```python Code
# Strict mode - directory must already exist
strict_output_task = Task(
description='Save critical data that requires existing infrastructure',
expected_output='Data saved to pre-configured location',
agent=data_agent,
output_file='secure/vault/critical_data.json',
create_directory=False # Will raise RuntimeError if 'secure/vault/' doesn't exist
)
```
### YAML Configuration
You can also configure this behavior in your YAML task definitions:
```yaml tasks.yaml
analysis_task:
description: >
Generate quarterly financial analysis
expected_output: >
A comprehensive financial report with quarterly insights
agent: financial_analyst
output_file: reports/quarterly/q4_2024_analysis.pdf
create_directory: true # Automatically create 'reports/quarterly/' directory
audit_task:
description: >
Perform compliance audit and save to existing audit directory
expected_output: >
A compliance audit report
agent: auditor
output_file: audit/compliance_report.md
create_directory: false # Directory must already exist
```
### Use Cases
**Automatic Directory Creation (`create_directory=True`):**
- Development and prototyping environments
- Dynamic report generation with date-based folders
- Automated workflows where directory structure may vary
- Multi-tenant applications with user-specific folders
**Manual Directory Management (`create_directory=False`):**
- Production environments with strict file system controls
- Security-sensitive applications where directories must be pre-configured
- Systems with specific permission requirements
- Compliance environments where directory creation is audited
### Error Handling
When `create_directory=False` and the directory doesn't exist, CrewAI will raise a `RuntimeError`:
```python Code
try:
result = crew.kickoff()
except RuntimeError as e:
# Handle missing directory error
print(f"Directory creation failed: {e}")
# Create directory manually or use fallback location
```
Check out the video below to see how to use structured outputs in CrewAI:

View File

@@ -0,0 +1,155 @@
---
title: 'Agent Repositories'
description: 'Learn how to use Agent Repositories to share and reuse your agents across teams and projects'
icon: 'database'
---
Agent Repositories allow enterprise users to store, share, and reuse agent definitions across teams and projects. This feature enables organizations to maintain a centralized library of standardized agents, promoting consistency and reducing duplication of effort.
## Benefits of Agent Repositories
- **Standardization**: Maintain consistent agent definitions across your organization
- **Reusability**: Create an agent once and use it in multiple crews and projects
- **Governance**: Implement organization-wide policies for agent configurations
- **Collaboration**: Enable teams to share and build upon each other's work
## Using Agent Repositories
### Prerequisites
1. You must have an account at CrewAI, try the [free plan](https://app.crewai.com).
2. You need to be authenticated using the CrewAI CLI.
3. If you have more than one organization, make sure you are switched to the correct organization using the CLI command:
```bash
crewai org switch <org_id>
```
### Creating and Managing Agents in Repositories
To create and manage agents in repositories,Enterprise Dashboard.
### Loading Agents from Repositories
You can load agents from repositories in your code using the `from_repository` parameter:
```python
from crewai import Agent
# Create an agent by loading it from a repository
# The agent is loaded with all its predefined configurations
researcher = Agent(
from_repository="market-research-agent"
)
```
### Overriding Repository Settings
You can override specific settings from the repository by providing them in the configuration:
```python
researcher = Agent(
from_repository="market-research-agent",
goal="Research the latest trends in AI development", # Override the repository goal
verbose=True # Add a setting not in the repository
)
```
### Example: Creating a Crew with Repository Agents
```python
from crewai import Crew, Agent, Task
# Load agents from repositories
researcher = Agent(
from_repository="market-research-agent"
)
writer = Agent(
from_repository="content-writer-agent"
)
# Create tasks
research_task = Task(
description="Research the latest trends in AI",
agent=researcher
)
writing_task = Task(
description="Write a comprehensive report based on the research",
agent=writer
)
# Create the crew
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, writing_task],
verbose=True
)
# Run the crew
result = crew.kickoff()
```
### Example: Using `kickoff()` with Repository Agents
You can also use repository agents directly with the `kickoff()` method for simpler interactions:
```python
from crewai import Agent
from pydantic import BaseModel
from typing import List
# Define a structured output format
class MarketAnalysis(BaseModel):
key_trends: List[str]
opportunities: List[str]
recommendation: str
# Load an agent from repository
analyst = Agent(
from_repository="market-analyst-agent",
verbose=True
)
# Get a free-form response
result = analyst.kickoff("Analyze the AI market in 2025")
print(result.raw) # Access the raw response
# Get structured output
structured_result = analyst.kickoff(
"Provide a structured analysis of the AI market in 2025",
response_format=MarketAnalysis
)
# Access structured data
print(f"Key Trends: {structured_result.pydantic.key_trends}")
print(f"Recommendation: {structured_result.pydantic.recommendation}")
```
## Best Practices
1. **Naming Convention**: Use clear, descriptive names for your repository agents
2. **Documentation**: Include comprehensive descriptions for each agent
3. **Tool Management**: Ensure that tools referenced by repository agents are available in your environment
4. **Access Control**: Manage permissions to ensure only authorized team members can modify repository agents
## Organization Management
To switch between organizations or see your current organization, use the CrewAI CLI:
```bash
# View current organization
crewai org current
# Switch to a different organization
crewai org switch <org_id>
# List all available organizations
crewai org list
```
<Note>
When loading agents from repositories, you must be authenticated and switched to the correct organization. If you receive errors, check your authentication status and organization settings using the CLI commands above.
</Note>

View File

@@ -41,11 +41,8 @@ The CLI provides the fastest way to deploy locally developed crews to the Enterp
First, you need to authenticate your CLI with the CrewAI Enterprise platform:
```bash
# If you already have a CrewAI Enterprise account
# If you already have a CrewAI Enterprise account, or want to create one:
crewai login
# If you're creating a new account
crewai signup
```
When you run either command, the CLI will:

View File

@@ -0,0 +1,134 @@
---
title: Neatlogs Integration
description: Understand, debug, and share your CrewAI agent runs
icon: magnifying-glass-chart
---
# Introduction
Neatlogs helps you **see what your agent did**, **why**, and **share it**.
It captures every step: thoughts, tool calls, responses, evaluations. No raw logs. Just clear, structured traces. Great for debugging and collaboration.
## Why use Neatlogs?
CrewAI agents use multiple tools and reasoning steps. When something goes wrong, you need context — not just errors.
Neatlogs lets you:
- Follow the full decision path
- Add feedback directly on steps
- Chat with the trace using AI assistant
- Share runs publicly for feedback
- Turn insights into tasks
All in one place.
Manage your traces effortlessly
![Traces](/images/neatlogs-1.png)
![Trace Response](/images/neatlogs-2.png)
The best UX to view a CrewAI trace. Post comments anywhere you want. Use AI to debug.
![Trace Details](/images/neatlogs-3.png)
![Ai Chat Bot With A Trace](/images/neatlogs-4.png)
![Comments Drawer](/images/neatlogs-5.png)
## Core Features
- **Trace Viewer**: Track thoughts, tools, and decisions in sequence
- **Inline Comments**: Tag teammates on any trace step
- **Feedback & Evaluation**: Mark outputs as correct or incorrect
- **Error Highlighting**: Automatic flagging of API/tool failures
- **Task Conversion**: Convert comments into assigned tasks
- **Ask the Trace (AI)**: Chat with your trace using Neatlogs AI bot
- **Public Sharing**: Publish trace links to your community
## Quick Setup with CrewAI
<Steps>
<Step title="Sign Up & Get API Key">
Visit [neatlogs.com](https://neatlogs.com/?utm_source=crewAI-docs), create a project, copy the API key.
</Step>
<Step title="Install SDK">
```bash
pip install neatlogs
```
(Latest version 0.8.0, Python 3.8+; MIT license)
</Step>
<Step title="Initialize Neatlogs">
Before starting Crew agents, add:
```python
import neatlogs
neatlogs.init("YOUR_PROJECT_API_KEY")
```
Agents run as usual. Neatlogs captures everything automatically.
</Step>
</Steps>
## Under the Hood
According to GitHub, Neatlogs:
- Captures thoughts, tool calls, responses, errors, and token stats
- Supports AI-powered task generation and robust evaluation workflows
All with just two lines of code.
## Watch It Work
### 🔍 Full Demo (4min)
<iframe
width="100%"
height="315"
src="https://www.youtube.com/embed/8KDme9T2I7Q?si=b8oHteaBwFNs_Duk"
title="YouTube video player"
frameBorder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
></iframe>
### ⚙️ CrewAI Integration (30s)
<iframe
className="w-full aspect-video rounded-xl"
src="https://www.loom.com/embed/9c78b552af43452bb3e4783cb8d91230?sid=e9d7d370-a91a-49b0-809e-2f375d9e801d"
title="Loom video player"
frameBorder="0"
allowFullScreen
></iframe>
## Links & Support
- 📘 [Neatlogs Docs](https://docs.neatlogs.com/)
- 🔐 [Dashboard & API Key](https://app.neatlogs.com/)
- 🐦 [Follow on Twitter](https://twitter.com/neatlogs)
- 📧 Contact: hello@neatlogs.com
- 🛠 [GitHub SDK](https://github.com/NeatLogs/neatlogs)
## TL;DR
With just:
```bash
pip install neatlogs
import neatlogs
neatlogs.init("YOUR_API_KEY")
You can now capture, understand, share, and act on your CrewAI agent runs in seconds.
No setup overhead. Full trace transparency. Full team collaboration.
```

View File

@@ -44,6 +44,14 @@ These tools enable your agents to search the web, research topics, and find info
<Card title="YouTube Video Search" icon="play" href="/en/tools/search-research/youtubevideosearchtool">
Find and analyze YouTube videos by topic, keyword, or criteria.
</Card>
<Card title="Tavily Search Tool" icon="magnifying-glass" href="/en/tools/search-research/tavilysearchtool">
Comprehensive web search using Tavily's AI-powered search API.
</Card>
<Card title="Tavily Extractor Tool" icon="file-text" href="/en/tools/search-research/tavilyextractortool">
Extract structured content from web pages using the Tavily API.
</Card>
</CardGroup>
## **Common Use Cases**
@@ -55,17 +63,19 @@ These tools enable your agents to search the web, research topics, and find info
- **Academic Research**: Find scholarly articles and technical papers
```python
from crewai_tools import SerperDevTool, GitHubSearchTool, YoutubeVideoSearchTool
from crewai_tools import SerperDevTool, GitHubSearchTool, YoutubeVideoSearchTool, TavilySearchTool, TavilyExtractorTool
# Create research tools
web_search = SerperDevTool()
code_search = GitHubSearchTool()
video_research = YoutubeVideoSearchTool()
tavily_search = TavilySearchTool()
content_extractor = TavilyExtractorTool()
# Add to your agent
agent = Agent(
role="Research Analyst",
tools=[web_search, code_search, video_research],
tools=[web_search, code_search, video_research, tavily_search, content_extractor],
goal="Gather comprehensive information on any topic"
)
```

View File

@@ -6,10 +6,6 @@ icon: google
# `SerperDevTool`
<Note>
We are still working on improving tools, so there might be unexpected behavior or changes in the future.
</Note>
## Description
This tool is designed to perform a semantic search for a specified query from a text's content across the internet. It utilizes the [serper.dev](https://serper.dev) API
@@ -17,6 +13,12 @@ to fetch and display the most relevant search results based on the query provide
## Installation
To effectively use the `SerperDevTool`, follow these steps:
1. **Package Installation**: Confirm that the `crewai[tools]` package is installed in your Python environment.
2. **API Key Acquisition**: Acquire a `serper.dev` API key by registering for a free account at `serper.dev`.
3. **Environment Configuration**: Store your obtained API key in an environment variable named `SERPER_API_KEY` to facilitate its use by the tool.
To incorporate this tool into your project, follow the installation instructions below:
```shell
@@ -34,14 +36,6 @@ from crewai_tools import SerperDevTool
tool = SerperDevTool()
```
## Steps to Get Started
To effectively use the `SerperDevTool`, follow these steps:
1. **Package Installation**: Confirm that the `crewai[tools]` package is installed in your Python environment.
2. **API Key Acquisition**: Acquire a `serper.dev` API key by registering for a free account at `serper.dev`.
3. **Environment Configuration**: Store your obtained API key in an environment variable named `SERPER_API_KEY` to facilitate its use by the tool.
## Parameters
The `SerperDevTool` comes with several parameters that will be passed to the API :

View File

@@ -0,0 +1,139 @@
---
title: "Tavily Extractor Tool"
description: "Extract structured content from web pages using the Tavily API"
icon: "file-text"
---
The `TavilyExtractorTool` allows CrewAI agents to extract structured content from web pages using the Tavily API. It can process single URLs or lists of URLs and provides options for controlling the extraction depth and including images.
## Installation
To use the `TavilyExtractorTool`, you need to install the `tavily-python` library:
```shell
pip install 'crewai[tools]' tavily-python
```
You also need to set your Tavily API key as an environment variable:
```bash
export TAVILY_API_KEY='your-tavily-api-key'
```
## Example Usage
Here's how to initialize and use the `TavilyExtractorTool` within a CrewAI agent:
```python
import os
from crewai import Agent, Task, Crew
from crewai_tools import TavilyExtractorTool
# Ensure TAVILY_API_KEY is set in your environment
# os.environ["TAVILY_API_KEY"] = "YOUR_API_KEY"
# Initialize the tool
tavily_tool = TavilyExtractorTool()
# Create an agent that uses the tool
extractor_agent = Agent(
role='Web Content Extractor',
goal='Extract key information from specified web pages',
backstory='You are an expert at extracting relevant content from websites using the Tavily API.',
tools=[tavily_tool],
verbose=True
)
# Define a task for the agent
extract_task = Task(
description='Extract the main content from the URL https://example.com using basic extraction depth.',
expected_output='A JSON string containing the extracted content from the URL.',
agent=extractor_agent
)
# Create and run the crew
crew = Crew(
agents=[extractor_agent],
tasks=[extract_task],
verbose=2
)
result = crew.kickoff()
print(result)
```
## Configuration Options
The `TavilyExtractorTool` accepts the following arguments:
- `urls` (Union[List[str], str]): **Required**. A single URL string or a list of URL strings to extract data from.
- `include_images` (Optional[bool]): Whether to include images in the extraction results. Defaults to `False`.
- `extract_depth` (Literal["basic", "advanced"]): The depth of extraction. Use `"basic"` for faster, surface-level extraction or `"advanced"` for more comprehensive extraction. Defaults to `"basic"`.
- `timeout` (int): The maximum time in seconds to wait for the extraction request to complete. Defaults to `60`.
## Advanced Usage
### Multiple URLs with Advanced Extraction
```python
# Example with multiple URLs and advanced extraction
multi_extract_task = Task(
description='Extract content from https://example.com and https://anotherexample.org using advanced extraction.',
expected_output='A JSON string containing the extracted content from both URLs.',
agent=extractor_agent
)
# Configure the tool with custom parameters
custom_extractor = TavilyExtractorTool(
extract_depth='advanced',
include_images=True,
timeout=120
)
agent_with_custom_tool = Agent(
role="Advanced Content Extractor",
goal="Extract comprehensive content with images",
tools=[custom_extractor]
)
```
### Tool Parameters
You can customize the tool's behavior by setting parameters during initialization:
```python
# Initialize with custom configuration
extractor_tool = TavilyExtractorTool(
extract_depth='advanced', # More comprehensive extraction
include_images=True, # Include image results
timeout=90 # Custom timeout
)
```
## Features
- **Single or Multiple URLs**: Extract content from one URL or process multiple URLs in a single request
- **Configurable Depth**: Choose between basic (fast) and advanced (comprehensive) extraction modes
- **Image Support**: Optionally include images in the extraction results
- **Structured Output**: Returns well-formatted JSON containing the extracted content
- **Error Handling**: Robust handling of network timeouts and extraction errors
## Response Format
The tool returns a JSON string representing the structured data extracted from the provided URL(s). The exact structure depends on the content of the pages and the `extract_depth` used.
Common response elements include:
- **Title**: The page title
- **Content**: Main text content of the page
- **Images**: Image URLs and metadata (when `include_images=True`)
- **Metadata**: Additional page information like author, description, etc.
## Use Cases
- **Content Analysis**: Extract and analyze content from competitor websites
- **Research**: Gather structured data from multiple sources for analysis
- **Content Migration**: Extract content from existing websites for migration
- **Monitoring**: Regular extraction of content for change detection
- **Data Collection**: Systematic extraction of information from web sources
Refer to the [Tavily API documentation](https://docs.tavily.com/docs/tavily-api/python-sdk#extract) for detailed information about the response structure and available options.

View File

@@ -0,0 +1,122 @@
---
title: "Tavily Search Tool"
description: "Perform comprehensive web searches using the Tavily Search API"
icon: "magnifying-glass"
---
The `TavilySearchTool` provides an interface to the Tavily Search API, enabling CrewAI agents to perform comprehensive web searches. It allows for specifying search depth, topics, time ranges, included/excluded domains, and whether to include direct answers, raw content, or images in the results.
## Installation
To use the `TavilySearchTool`, you need to install the `tavily-python` library:
```shell
pip install 'crewai[tools]' tavily-python
```
## Environment Variables
Ensure your Tavily API key is set as an environment variable:
```bash
export TAVILY_API_KEY='your_tavily_api_key'
```
## Example Usage
Here's how to initialize and use the `TavilySearchTool` within a CrewAI agent:
```python
import os
from crewai import Agent, Task, Crew
from crewai_tools import TavilySearchTool
# Ensure the TAVILY_API_KEY environment variable is set
# os.environ["TAVILY_API_KEY"] = "YOUR_TAVILY_API_KEY"
# Initialize the tool
tavily_tool = TavilySearchTool()
# Create an agent that uses the tool
researcher = Agent(
role='Market Researcher',
goal='Find information about the latest AI trends',
backstory='An expert market researcher specializing in technology.',
tools=[tavily_tool],
verbose=True
)
# Create a task for the agent
research_task = Task(
description='Search for the top 3 AI trends in 2024.',
expected_output='A JSON report summarizing the top 3 AI trends found.',
agent=researcher
)
# Form the crew and kick it off
crew = Crew(
agents=[researcher],
tasks=[research_task],
verbose=2
)
result = crew.kickoff()
print(result)
```
## Configuration Options
The `TavilySearchTool` accepts the following arguments during initialization or when calling the `run` method:
- `query` (str): **Required**. The search query string.
- `search_depth` (Literal["basic", "advanced"], optional): The depth of the search. Defaults to `"basic"`.
- `topic` (Literal["general", "news", "finance"], optional): The topic to focus the search on. Defaults to `"general"`.
- `time_range` (Literal["day", "week", "month", "year"], optional): The time range for the search. Defaults to `None`.
- `days` (int, optional): The number of days to search back. Relevant if `time_range` is not set. Defaults to `7`.
- `max_results` (int, optional): The maximum number of search results to return. Defaults to `5`.
- `include_domains` (Sequence[str], optional): A list of domains to prioritize in the search. Defaults to `None`.
- `exclude_domains` (Sequence[str], optional): A list of domains to exclude from the search. Defaults to `None`.
- `include_answer` (Union[bool, Literal["basic", "advanced"]], optional): Whether to include a direct answer synthesized from the search results. Defaults to `False`.
- `include_raw_content` (bool, optional): Whether to include the raw HTML content of the searched pages. Defaults to `False`.
- `include_images` (bool, optional): Whether to include image results. Defaults to `False`.
- `timeout` (int, optional): The request timeout in seconds. Defaults to `60`.
## Advanced Usage
You can configure the tool with custom parameters:
```python
# Example: Initialize with specific parameters
custom_tavily_tool = TavilySearchTool(
search_depth='advanced',
max_results=10,
include_answer=True
)
# The agent will use these defaults
agent_with_custom_tool = Agent(
role="Advanced Researcher",
goal="Conduct detailed research with comprehensive results",
tools=[custom_tavily_tool]
)
```
## Features
- **Comprehensive Search**: Access to Tavily's powerful search index
- **Configurable Depth**: Choose between basic and advanced search modes
- **Topic Filtering**: Focus searches on general, news, or finance topics
- **Time Range Control**: Limit results to specific time periods
- **Domain Control**: Include or exclude specific domains
- **Direct Answers**: Get synthesized answers from search results
- **Content Filtering**: Prevent context window issues with automatic content truncation
## Response Format
The tool returns search results as a JSON string containing:
- Search results with titles, URLs, and content snippets
- Optional direct answers to queries
- Optional image results
- Optional raw HTML content (when enabled)
Content for each result is automatically truncated to prevent context window issues while maintaining the most relevant information.

View File

@@ -0,0 +1,100 @@
---
title: Serper Scrape Website
description: The `SerperScrapeWebsiteTool` is designed to scrape websites and extract clean, readable content using Serper's scraping API.
icon: globe
---
# `SerperScrapeWebsiteTool`
## Description
This tool is designed to scrape website content and extract clean, readable text from any website URL. It utilizes the [serper.dev](https://serper.dev) scraping API to fetch and process web pages, optionally including markdown formatting for better structure and readability.
## Installation
To effectively use the `SerperScrapeWebsiteTool`, follow these steps:
1. **Package Installation**: Confirm that the `crewai[tools]` package is installed in your Python environment.
2. **API Key Acquisition**: Acquire a `serper.dev` API key by registering for an account at `serper.dev`.
3. **Environment Configuration**: Store your obtained API key in an environment variable named `SERPER_API_KEY` to facilitate its use by the tool.
To incorporate this tool into your project, follow the installation instructions below:
```shell
pip install 'crewai[tools]'
```
## Example
The following example demonstrates how to initialize the tool and scrape a website:
```python Code
from crewai_tools import SerperScrapeWebsiteTool
# Initialize the tool for website scraping capabilities
tool = SerperScrapeWebsiteTool()
# Scrape a website with markdown formatting
result = tool.run(url="https://example.com", include_markdown=True)
```
## Arguments
The `SerperScrapeWebsiteTool` accepts the following arguments:
- **url**: Required. The URL of the website to scrape.
- **include_markdown**: Optional. Whether to include markdown formatting in the scraped content. Defaults to `True`.
## Example with Parameters
Here is an example demonstrating how to use the tool with different parameters:
```python Code
from crewai_tools import SerperScrapeWebsiteTool
tool = SerperScrapeWebsiteTool()
# Scrape with markdown formatting (default)
markdown_result = tool.run(
url="https://docs.crewai.com",
include_markdown=True
)
# Scrape without markdown formatting for plain text
plain_result = tool.run(
url="https://docs.crewai.com",
include_markdown=False
)
print("Markdown formatted content:")
print(markdown_result)
print("\nPlain text content:")
print(plain_result)
```
## Use Cases
The `SerperScrapeWebsiteTool` is particularly useful for:
- **Content Analysis**: Extract and analyze website content for research purposes
- **Data Collection**: Gather structured information from web pages
- **Documentation Processing**: Convert web-based documentation into readable formats
- **Competitive Analysis**: Scrape competitor websites for market research
- **Content Migration**: Extract content from existing websites for migration purposes
## Error Handling
The tool includes comprehensive error handling for:
- **Network Issues**: Handles connection timeouts and network errors gracefully
- **API Errors**: Provides detailed error messages for API-related issues
- **Invalid URLs**: Validates and reports issues with malformed URLs
- **Authentication**: Clear error messages for missing or invalid API keys
## Security Considerations
- Always store your `SERPER_API_KEY` in environment variables, never hardcode it in your source code
- Be mindful of rate limits imposed by the Serper API
- Respect robots.txt and website terms of service when scraping content
- Consider implementing delays between requests for large-scale scraping operations

BIN
docs/images/neatlogs-1.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 222 KiB

BIN
docs/images/neatlogs-2.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 329 KiB

BIN
docs/images/neatlogs-3.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 590 KiB

BIN
docs/images/neatlogs-4.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 216 KiB

BIN
docs/images/neatlogs-5.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 277 KiB

View File

@@ -149,34 +149,33 @@ from crewai_tools import SerperDevTool
# Crie um agente com todos os parâmetros disponíveis
agent = Agent(
role="Senior Data Scientist",
goal="Analyze and interpret complex datasets to provide actionable insights",
backstory="With over 10 years of experience in data science and machine learning, "
"you excel at finding patterns in complex datasets.",
llm="gpt-4", # Default: OPENAI_MODEL_NAME or "gpt-4"
function_calling_llm=None, # Optional: Separate LLM for tool calling
verbose=False, # Default: False
allow_delegation=False, # Default: False
max_iter=20, # Default: 20 iterations
max_rpm=None, # Optional: Rate limit for API calls
max_execution_time=None, # Optional: Maximum execution time in seconds
max_retry_limit=2, # Default: 2 retries on error
allow_code_execution=False, # Default: False
code_execution_mode="safe", # Default: "safe" (options: "safe", "unsafe")
respect_context_window=True, # Default: True
use_system_prompt=True, # Default: True
multimodal=False, # Default: False
inject_date=False, # Default: False
date_format="%Y-%m-%d", # Default: ISO format
reasoning=False, # Default: False
max_reasoning_attempts=None, # Default: None
tools=[SerperDevTool()], # Optional: List of tools
knowledge_sources=None, # Optional: List of knowledge sources
embedder=None, # Optional: Custom embedder configuration
system_template=None, # Optional: Custom system prompt template
prompt_template=None, # Optional: Custom prompt template
response_template=None, # Optional: Custom response template
step_callback=None, # Optional: Callback function for monitoring
role="Cientista de Dados Sênior",
goal="Analisar e interpretar conjuntos de dados complexos para fornecer insights acionáveis",
backstory="Com mais de 10 anos de experiência em ciência de dados e aprendizado de máquina, você é especialista em encontrar padrões em grandes volumes de dados.",
llm="gpt-4", # Padrão: OPENAI_MODEL_NAME ou "gpt-4"
function_calling_llm=None, # Opcional: LLM separado para chamadas de ferramentas
verbose=False, # Padrão: False
allow_delegation=False, # Padrão: False
max_iter=20, # Padrão: 20 iterações
max_rpm=None, # Opcional: Limite de requisições por minuto
max_execution_time=None, # Opcional: Tempo máximo de execução em segundos
max_retry_limit=2, # Padrão: 2 tentativas em caso de erro
allow_code_execution=False, # Padrão: False
code_execution_mode="safe", # Padrão: "safe" (opções: "safe", "unsafe")
respect_context_window=True, # Padrão: True
use_system_prompt=True, # Padrão: True
multimodal=False, # Padrão: False
inject_date=False, # Padrão: False
date_format="%Y-%m-%d", # Padrão: formato ISO
reasoning=False, # Padrão: False
max_reasoning_attempts=None, # Padrão: None
tools=[SerperDevTool()], # Opcional: Lista de ferramentas
knowledge_sources=None, # Opcional: Lista de fontes de conhecimento
embedder=None, # Opcional: Configuração de embedder customizado
system_template=None, # Opcional: Template de prompt de sistema
prompt_template=None, # Opcional: Template de prompt customizado
response_template=None, # Opcional: Template de resposta customizado
step_callback=None, # Opcional: Função de callback para monitoramento
)
```
@@ -185,65 +184,62 @@ Vamos detalhar algumas combinações de parâmetros-chave para casos de uso comu
#### Agente de Pesquisa Básico
```python Code
research_agent = Agent(
role="Research Analyst",
goal="Find and summarize information about specific topics",
backstory="You are an experienced researcher with attention to detail",
role="Analista de Pesquisa",
goal="Encontrar e resumir informações sobre tópicos específicos",
backstory="Você é um pesquisador experiente com atenção aos detalhes",
tools=[SerperDevTool()],
verbose=True # Enable logging for debugging
verbose=True # Ativa logs para depuração
)
```
#### Agente de Desenvolvimento de Código
```python Code
dev_agent = Agent(
role="Senior Python Developer",
goal="Write and debug Python code",
backstory="Expert Python developer with 10 years of experience",
role="Desenvolvedor Python Sênior",
goal="Escrever e depurar códigos Python",
backstory="Desenvolvedor Python especialista com 10 anos de experiência",
allow_code_execution=True,
code_execution_mode="safe", # Uses Docker for safety
max_execution_time=300, # 5-minute timeout
max_retry_limit=3 # More retries for complex code tasks
code_execution_mode="safe", # Usa Docker para segurança
max_execution_time=300, # Limite de 5 minutos
max_retry_limit=3 # Mais tentativas para tarefas complexas
)
```
#### Agente de Análise de Longa Duração
```python Code
analysis_agent = Agent(
role="Data Analyst",
goal="Perform deep analysis of large datasets",
backstory="Specialized in big data analysis and pattern recognition",
role="Analista de Dados",
goal="Realizar análise aprofundada de grandes conjuntos de dados",
backstory="Especialista em análise de big data e reconhecimento de padrões",
memory=True,
respect_context_window=True,
max_rpm=10, # Limit API calls
function_calling_llm="gpt-4o-mini" # Cheaper model for tool calls
max_rpm=10, # Limite de requisições por minuto
function_calling_llm="gpt-4o-mini" # Modelo mais econômico para chamadas de ferramentas
)
```
#### Agente com Template Personalizado
```python Code
custom_agent = Agent(
role="Customer Service Representative",
goal="Assist customers with their inquiries",
backstory="Experienced in customer support with a focus on satisfaction",
system_template="""<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>""",
prompt_template="""<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>""",
response_template="""<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>""",
role="Atendente de Suporte ao Cliente",
goal="Auxiliar clientes com suas dúvidas e solicitações",
backstory="Experiente em atendimento ao cliente com foco em satisfação",
system_template="""<|start_header_id|>system<|end_header_id|>\n {{ .System }}<|eot_id|>""",
prompt_template="""<|start_header_id|>user<|end_header_id|>\n {{ .Prompt }}<|eot_id|>""",
response_template="""<|start_header_id|>assistant<|end_header_id|>\n {{ .Response }}<|eot_id|>""",
)
```
#### Agente Ciente de Data, com Raciocínio
```python Code
strategic_agent = Agent(
role="Market Analyst",
goal="Track market movements with precise date references and strategic planning",
backstory="Expert in time-sensitive financial analysis and strategic reporting",
inject_date=True, # Automatically inject current date into tasks
date_format="%B %d, %Y", # Format as "May 21, 2025"
reasoning=True, # Enable strategic planning
max_reasoning_attempts=2, # Limit planning iterations
role="Analista de Mercado",
goal="Acompanhar movimentos do mercado com referências de datas precisas e planejamento estratégico",
backstory="Especialista em análise financeira sensível ao tempo e relatórios estratégicos",
inject_date=True, # Injeta automaticamente a data atual nas tarefas
date_format="%d de %B de %Y", # Exemplo: "21 de maio de 2025"
reasoning=True, # Ativa planejamento estratégico
max_reasoning_attempts=2, # Limite de iterações de planejamento
verbose=True
)
```
@@ -251,12 +247,12 @@ strategic_agent = Agent(
#### Agente de Raciocínio
```python Code
reasoning_agent = Agent(
role="Strategic Planner",
goal="Analyze complex problems and create detailed execution plans",
backstory="Expert strategic planner who methodically breaks down complex challenges",
reasoning=True, # Enable reasoning and planning
max_reasoning_attempts=3, # Limit reasoning attempts
max_iter=30, # Allow more iterations for complex planning
role="Planejador Estratégico",
goal="Analisar problemas complexos e criar planos de execução detalhados",
backstory="Especialista em planejamento estratégico que desmembra desafios complexos metodicamente",
reasoning=True, # Ativa raciocínio e planejamento
max_reasoning_attempts=3, # Limite de tentativas de raciocínio
max_iter=30, # Permite mais iterações para planejamento complexo
verbose=True
)
```
@@ -264,10 +260,10 @@ reasoning_agent = Agent(
#### Agente Multimodal
```python Code
multimodal_agent = Agent(
role="Visual Content Analyst",
goal="Analyze and process both text and visual content",
backstory="Specialized in multimodal analysis combining text and image understanding",
multimodal=True, # Enable multimodal capabilities
role="Analista de Conteúdo Visual",
goal="Analisar e processar tanto conteúdo textual quanto visual",
backstory="Especialista em análise multimodal combinando compreensão de texto e imagem",
multimodal=True, # Ativa capacidades multimodais
verbose=True
)
```
@@ -336,8 +332,8 @@ wiki_tool = WikipediaTools()
# Adicionar ferramentas ao agente
researcher = Agent(
role="AI Technology Researcher",
goal="Research the latest AI developments",
role="Pesquisador de Tecnologia em IA",
goal="Pesquisar os últimos avanços em IA",
tools=[search_tool, wiki_tool],
verbose=True
)
@@ -351,9 +347,9 @@ Agentes podem manter a memória de suas interações e usar contexto de tarefas
from crewai import Agent
analyst = Agent(
role="Data Analyst",
goal="Analyze and remember complex data patterns",
memory=True, # Enable memory
role="Analista de Dados",
goal="Analisar e memorizar padrões complexos de dados",
memory=True, # Ativa memória
verbose=True
)
```
@@ -380,10 +376,10 @@ Esta é a **configuração padrão e recomendada** para a maioria dos casos. Qua
```python Code
# Agente com gerenciamento automático de contexto (padrão)
smart_agent = Agent(
role="Research Analyst",
goal="Analyze large documents and datasets",
backstory="Expert at processing extensive information",
respect_context_window=True, # 🔑 Default: auto-handle context limits
role="Analista de Pesquisa",
goal="Analisar grandes documentos e conjuntos de dados",
backstory="Especialista em processar informações extensas",
respect_context_window=True, # 🔑 Padrão: gerencia limites de contexto automaticamente
verbose=True
)
```

View File

@@ -3,6 +3,7 @@ title: CLI
description: Aprenda a usar o CLI do CrewAI para interagir com o CrewAI.
icon: terminal
---
<Warning>A partir da versão 0.140.0, a plataforma CrewAI Enterprise iniciou um processo de migração de seu provedor de login. Como resultado, o fluxo de autenticação via CLI foi atualizado. Usuários que utlizam o Google para fazer login, ou que criaram conta após 3 de julho de 2025 não poderão fazer login com versões anteriores da biblioteca `crewai`.</Warning>
## Visão Geral
@@ -75,6 +76,22 @@ Exemplo:
crewai train -n 10 -f my_training_data.pkl
```
```python
# Exemplo de uso programático do comando train
n_iterations = 2
inputs = {"topic": "Treinamento CrewAI"}
filename = "seu_modelo.pkl"
try:
SuaCrew().crew().train(
n_iterations=n_iterations,
inputs=inputs,
filename=filename
)
except Exception as e:
raise Exception(f"Ocorreu um erro ao treinar a crew: {e}")
```
### 4. Replay
Reexecute a execução do crew a partir de uma tarefa específica.
@@ -86,7 +103,7 @@ crewai replay [OPTIONS]
- `-t, --task_id TEXT`: Reexecuta o crew a partir deste task ID, incluindo todas as tarefas subsequentes
Exemplo:
```shell Terminal
```shell Terminal
crewai replay -t task_123456
```
@@ -132,7 +149,7 @@ crewai test [OPTIONS]
- `-m, --model TEXT`: Modelo LLM para executar os testes no Crew (padrão: "gpt-4o-mini")
Exemplo:
```shell Terminal
```shell Terminal
crewai test -n 5 -m gpt-3.5-turbo
```
@@ -186,10 +203,7 @@ def crew(self) -> Crew:
Implemente o crew ou flow no [CrewAI Enterprise](https://app.crewai.com).
- **Autenticação**: Você precisa estar autenticado para implementar no CrewAI Enterprise.
```shell Terminal
crewai signup
```
Caso já tenha uma conta, você pode fazer login com:
Você pode fazer login ou criar uma conta com:
```shell Terminal
crewai login
```
@@ -236,7 +250,7 @@ Você deve estar autenticado no CrewAI Enterprise para usar estes comandos de ge
- **Implantar o Crew**: Depois de autenticado, você pode implantar seu crew ou flow no CrewAI Enterprise.
```shell Terminal
crewai deploy push
```
```
- Inicia o processo de deployment na plataforma CrewAI Enterprise.
- Após a iniciação bem-sucedida, será exibida a mensagem Deployment created successfully! juntamente com o Nome do Deployment e um Deployment ID (UUID) único.
@@ -309,4 +323,4 @@ Ao escolher um provedor, o CLI solicitará que você informe o nome da chave e a
Veja o seguinte link para o nome de chave de cada provedor:
* [LiteLLM Providers](https://docs.litellm.ai/docs/providers)
* [LiteLLM Providers](https://docs.litellm.ai/docs/providers)

View File

@@ -15,18 +15,18 @@ from crewai import Agent, Crew, Task
# Enable collaboration for agents
researcher = Agent(
role="Research Specialist",
goal="Conduct thorough research on any topic",
backstory="Expert researcher with access to various sources",
allow_delegation=True, # 🔑 Key setting for collaboration
role="Especialista em Pesquisa",
goal="Realizar pesquisas aprofundadas sobre qualquer tema",
backstory="Pesquisador especialista com acesso a diversas fontes",
allow_delegation=True, # 🔑 Configuração chave para colaboração
verbose=True
)
writer = Agent(
role="Content Writer",
goal="Create engaging content based on research",
backstory="Skilled writer who transforms research into compelling content",
allow_delegation=True, # 🔑 Enables asking questions to other agents
role="Redator de Conteúdo",
goal="Criar conteúdo envolvente com base em pesquisas",
backstory="Redator habilidoso que transforma pesquisas em conteúdo atraente",
allow_delegation=True, # 🔑 Permite fazer perguntas a outros agentes
verbose=True
)
@@ -67,19 +67,17 @@ from crewai import Agent, Crew, Task, Process
# Create collaborative agents
researcher = Agent(
role="Research Specialist",
goal="Find accurate, up-to-date information on any topic",
backstory="""You're a meticulous researcher with expertise in finding
reliable sources and fact-checking information across various domains.""",
role="Especialista em Pesquisa",
goal="Realizar pesquisas aprofundadas sobre qualquer tema",
backstory="Pesquisador especialista com acesso a diversas fontes",
allow_delegation=True,
verbose=True
)
writer = Agent(
role="Content Writer",
goal="Create engaging, well-structured content",
backstory="""You're a skilled content writer who excels at transforming
research into compelling, readable content for different audiences.""",
role="Redator de Conteúdo",
goal="Criar conteúdo envolvente com base em pesquisas",
backstory="Redator habilidoso que transforma pesquisas em conteúdo atraente",
allow_delegation=True,
verbose=True
)
@@ -95,17 +93,17 @@ editor = Agent(
# Create a task that encourages collaboration
article_task = Task(
description="""Write a comprehensive 1000-word article about 'The Future of AI in Healthcare'.
The article should include:
- Current AI applications in healthcare
- Emerging trends and technologies
- Potential challenges and ethical considerations
- Expert predictions for the next 5 years
Collaborate with your teammates to ensure accuracy and quality.""",
expected_output="A well-researched, engaging 1000-word article with proper structure and citations",
agent=writer # Writer leads, but can delegate research to researcher
description="""Escreva um artigo abrangente de 1000 palavras sobre 'O Futuro da IA na Saúde'.
O artigo deve incluir:
- Aplicações atuais de IA na saúde
- Tendências e tecnologias emergentes
- Desafios potenciais e considerações éticas
- Previsões de especialistas para os próximos 5 anos
Colabore com seus colegas para garantir precisão e qualidade.""",
expected_output="Um artigo bem pesquisado, envolvente, com 1000 palavras, estrutura adequada e citações",
agent=writer # O redator lidera, mas pode delegar pesquisa ao pesquisador
)
# Create collaborative crew
@@ -124,37 +122,37 @@ result = crew.kickoff()
### Padrão 1: Pesquisa → Redação → Edição
```python
research_task = Task(
description="Research the latest developments in quantum computing",
expected_output="Comprehensive research summary with key findings and sources",
description="Pesquise os últimos avanços em computação quântica",
expected_output="Resumo abrangente da pesquisa com principais descobertas e fontes",
agent=researcher
)
writing_task = Task(
description="Write an article based on the research findings",
expected_output="Engaging 800-word article about quantum computing",
description="Escreva um artigo com base nos achados da pesquisa",
expected_output="Artigo envolvente de 800 palavras sobre computação quântica",
agent=writer,
context=[research_task] # Gets research output as context
context=[research_task] # Recebe a saída da pesquisa como contexto
)
editing_task = Task(
description="Edit and polish the article for publication",
expected_output="Publication-ready article with improved clarity and flow",
description="Edite e revise o artigo para publicação",
expected_output="Artigo pronto para publicação, com clareza e fluidez aprimoradas",
agent=editor,
context=[writing_task] # Gets article draft as context
context=[writing_task] # Recebe o rascunho do artigo como contexto
)
```
### Padrão 2: Tarefa Única Colaborativa
```python
collaborative_task = Task(
description="""Create a marketing strategy for a new AI product.
Writer: Focus on messaging and content strategy
Researcher: Provide market analysis and competitor insights
Work together to create a comprehensive strategy.""",
expected_output="Complete marketing strategy with research backing",
agent=writer # Lead agent, but can delegate to researcher
description="""Crie uma estratégia de marketing para um novo produto de IA.
Redator: Foque em mensagens e estratégia de conteúdo
Pesquisador: Forneça análise de mercado e insights de concorrentes
Trabalhem juntos para criar uma estratégia abrangente.""",
expected_output="Estratégia de marketing completa com embasamento em pesquisa",
agent=writer # Agente líder, mas pode delegar ao pesquisador
)
```
@@ -167,35 +165,35 @@ from crewai import Agent, Crew, Task, Process
# Manager agent coordinates the team
manager = Agent(
role="Project Manager",
goal="Coordinate team efforts and ensure project success",
backstory="Experienced project manager skilled at delegation and quality control",
role="Gerente de Projetos",
goal="Coordenar esforços da equipe e garantir o sucesso do projeto",
backstory="Gerente de projetos experiente, habilidoso em delegação e controle de qualidade",
allow_delegation=True,
verbose=True
)
# Specialist agents
researcher = Agent(
role="Researcher",
goal="Provide accurate research and analysis",
backstory="Expert researcher with deep analytical skills",
allow_delegation=False, # Specialists focus on their expertise
role="Pesquisador",
goal="Fornecer pesquisa e análise precisas",
backstory="Pesquisador especialista com habilidades analíticas profundas",
allow_delegation=False, # Especialistas focam em sua expertise
verbose=True
)
writer = Agent(
role="Writer",
goal="Create compelling content",
backstory="Skilled writer who creates engaging content",
role="Redator",
goal="Criar conteúdo envolvente",
backstory="Redator habilidoso que cria conteúdo atraente",
allow_delegation=False,
verbose=True
)
# Manager-led task
project_task = Task(
description="Create a comprehensive market analysis report with recommendations",
expected_output="Executive summary, detailed analysis, and strategic recommendations",
agent=manager # Manager will delegate to specialists
description="Crie um relatório de análise de mercado completo com recomendações",
expected_output="Resumo executivo, análise detalhada e recomendações estratégicas",
agent=manager # O gerente delega para especialistas
)
# Hierarchical crew

View File

@@ -153,32 +153,32 @@ from crewai_tools import YourCustomTool
class YourCrewName:
def agent_one(self) -> Agent:
return Agent(
role="Data Analyst",
goal="Analyze data trends in the market",
backstory="An experienced data analyst with a background in economics",
role="Analista de Dados",
goal="Analisar tendências de dados no mercado brasileiro",
backstory="Analista experiente com formação em economia",
verbose=True,
tools=[YourCustomTool()]
)
def agent_two(self) -> Agent:
return Agent(
role="Market Researcher",
goal="Gather information on market dynamics",
backstory="A diligent researcher with a keen eye for detail",
role="Pesquisador de Mercado",
goal="Coletar informações sobre a dinâmica do mercado nacional",
backstory="Pesquisador dedicado com olhar atento aos detalhes",
verbose=True
)
def task_one(self) -> Task:
return Task(
description="Collect recent market data and identify trends.",
expected_output="A report summarizing key trends in the market.",
description="Coletar dados recentes do mercado brasileiro e identificar tendências.",
expected_output="Um relatório resumido com as principais tendências do mercado.",
agent=self.agent_one()
)
def task_two(self) -> Task:
return Task(
description="Research factors affecting market dynamics.",
expected_output="An analysis of factors influencing the market.",
description="Pesquisar fatores que afetam a dinâmica do mercado nacional.",
expected_output="Uma análise dos fatores que influenciam o mercado.",
agent=self.agent_two()
)

View File

@@ -51,24 +51,24 @@ from crewai.utilities.events import (
)
from crewai.utilities.events.base_event_listener import BaseEventListener
class MyCustomListener(BaseEventListener):
class MeuListenerPersonalizado(BaseEventListener):
def __init__(self):
super().__init__()
def setup_listeners(self, crewai_event_bus):
@crewai_event_bus.on(CrewKickoffStartedEvent)
def on_crew_started(source, event):
print(f"Crew '{event.crew_name}' has started execution!")
def ao_iniciar_crew(source, event):
print(f"Crew '{event.crew_name}' iniciou a execução!")
@crewai_event_bus.on(CrewKickoffCompletedEvent)
def on_crew_completed(source, event):
print(f"Crew '{event.crew_name}' has completed execution!")
print(f"Output: {event.output}")
def ao_finalizar_crew(source, event):
print(f"Crew '{event.crew_name}' finalizou a execução!")
print(f"Saída: {event.output}")
@crewai_event_bus.on(AgentExecutionCompletedEvent)
def on_agent_execution_completed(source, event):
print(f"Agent '{event.agent.role}' completed task")
print(f"Output: {event.output}")
def ao_finalizar_execucao_agente(source, event):
print(f"Agente '{event.agent.role}' concluiu a tarefa")
print(f"Saída: {event.output}")
```
## Registrando Corretamente Seu Listener

View File

@@ -486,8 +486,9 @@ Existem duas formas de executar um flow:
Você pode executar um flow programaticamente criando uma instância da sua classe de flow e chamando o método `kickoff()`:
```python
flow = ExampleFlow()
result = flow.kickoff()
# Exemplo de execução de flow em português
flow = ExemploFlow()
resultado = flow.kickoff()
```
### Usando a CLI

View File

@@ -39,17 +39,17 @@ llm = LLM(model="gpt-4o-mini", temperature=0)
# Create an agent with the knowledge store
agent = Agent(
role="About User",
goal="You know everything about the user.",
backstory="You are a master at understanding people and their preferences.",
role="Sobre o Usuário",
goal="Você sabe tudo sobre o usuário.",
backstory="Você é mestre em entender pessoas e suas preferências.",
verbose=True,
allow_delegation=False,
llm=llm,
)
task = Task(
description="Answer the following questions about the user: {question}",
expected_output="An answer to the question.",
description="Responda às seguintes perguntas sobre o usuário: {question}",
expected_output="Uma resposta para a pergunta.",
agent=agent,
)
@@ -87,17 +87,17 @@ llm = LLM(model="gpt-4o-mini", temperature=0)
# Create an agent with the knowledge store
agent = Agent(
role="About papers",
goal="You know everything about the papers.",
backstory="You are a master at understanding papers and their content.",
role="Sobre artigos",
goal="Você sabe tudo sobre os artigos.",
backstory="Você é mestre em entender artigos e seus conteúdos.",
verbose=True,
allow_delegation=False,
llm=llm,
)
task = Task(
description="Answer the following questions about the papers: {question}",
expected_output="An answer to the question.",
description="Responda às seguintes perguntas sobre os artigos: {question}",
expected_output="Uma resposta para a pergunta.",
agent=agent,
)
@@ -201,16 +201,16 @@ specialist_knowledge = StringKnowledgeSource(
)
specialist_agent = Agent(
role="Technical Specialist",
goal="Provide technical expertise",
backstory="Expert in specialized technical domains",
knowledge_sources=[specialist_knowledge] # Agent-specific knowledge
role="Especialista Técnico",
goal="Fornecer expertise técnica",
backstory="Especialista em domínios técnicos especializados",
knowledge_sources=[specialist_knowledge] # Conhecimento específico do agente
)
task = Task(
description="Answer technical questions",
description="Responda perguntas técnicas",
agent=specialist_agent,
expected_output="Technical answer"
expected_output="Resposta técnica"
)
# No crew-level knowledge required
@@ -240,7 +240,7 @@ Cada nível de knowledge usa coleções de armazenamento independentes:
```python
# Agent knowledge storage
agent_collection_name = agent.role # e.g., "Technical Specialist"
agent_collection_name = agent.role # e.g., "Especialista Técnico"
# Crew knowledge storage
crew_collection_name = "crew"
@@ -248,7 +248,7 @@ crew_collection_name = "crew"
# Both stored in same ChromaDB instance but different collections
# Path: ~/.local/share/CrewAI/{project}/knowledge/
# ├── crew/ # Crew knowledge collection
# ├── Technical Specialist/ # Agent knowledge collection
# ├── Especialista Técnico/ # Agent knowledge collection
# └── Another Agent Role/ # Another agent's collection
```
@@ -265,7 +265,7 @@ agent_knowledge = StringKnowledgeSource(
)
agent = Agent(
role="Specialist",
role="Especialista",
goal="Use specialized knowledge",
backstory="Expert with specific knowledge",
knowledge_sources=[agent_knowledge],
@@ -299,10 +299,10 @@ specialist_knowledge = StringKnowledgeSource(
)
specialist = Agent(
role="Technical Specialist",
goal="Provide technical expertise",
backstory="Technical expert",
knowledge_sources=[specialist_knowledge] # Agent-specific
role="Especialista Técnico",
goal="Fornecer expertise técnica",
backstory="Especialista em domínios técnicos especializados",
knowledge_sources=[specialist_knowledge] # Conhecimento específico do agente
)
generalist = Agent(

View File

@@ -78,15 +78,15 @@ Existem diferentes locais no código do CrewAI onde você pode especificar o mod
# Configuração avançada com parâmetros detalhados
llm = LLM(
model="model-id-here", # gpt-4o, gemini-2.0-flash, anthropic/claude...
temperature=0.7, # Mais alto para saídas criativas
timeout=120, # Segundos para aguardar resposta
max_tokens=4000, # Comprimento máximo da resposta
top_p=0.9, # Parâmetro de amostragem nucleus
frequency_penalty=0.1 , # Reduz repetição
presence_penalty=0.1, # Incentiva diversidade de tópicos
response_format={"type": "json"}, # Para respostas estruturadas
seed=42 # Para resultados reproduzíveis
model="openai/gpt-4",
temperature=0.8,
max_tokens=150,
top_p=0.9,
frequency_penalty=0.1,
presence_penalty=0.1,
response_format={"type":"json"},
stop=["FIM"],
seed=42
)
```
@@ -127,13 +127,13 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
from crewai import LLM
llm = LLM(
model="openai/gpt-4", # chamar modelo por provider/model_name
model="openai/gpt-4",
temperature=0.8,
max_tokens=150,
top_p=0.9,
frequency_penalty=0.1,
presence_penalty=0.1,
stop=["END"],
stop=["FIM"],
seed=42
)
```
@@ -169,7 +169,7 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
llm = LLM(
model="meta_llama/Llama-4-Scout-17B-16E-Instruct-FP8",
temperature=0.8,
stop=["END"],
stop=["FIM"],
seed=42
)
```
@@ -268,7 +268,7 @@ Nesta seção, você encontrará exemplos detalhados que ajudam a selecionar, co
from crewai import LLM
llm = LLM(
model="gemini/gemini-1.5-pro-latest",
model="gemini-1.5-pro-latest", # or vertex_ai/gemini-1.5-pro-latest
temperature=0.7,
vertex_credentials=vertex_credentials_json
)

View File

@@ -623,7 +623,7 @@ for provider in providers_to_test:
**Erros de modelo não encontrado:**
```python
# Verifique disponibilidade do modelo
from crewai.utilities.embedding_configurator import EmbeddingConfigurator
from crewai.rag.embeddings.configurator import EmbeddingConfigurator
configurator = EmbeddingConfigurator()
try:

View File

@@ -17,7 +17,7 @@ Começar a usar o recurso de planejamento é muito simples, o único passo neces
from crewai import Crew, Agent, Task, Process
# Monte sua crew com capacidades de planejamento
my_crew = Crew(
minha_crew = Crew(
agents=self.agents,
tasks=self.tasks,
process=Process.sequential,

View File

@@ -28,23 +28,23 @@ from crewai import Crew, Process
# Exemplo: Criando uma crew com processo sequencial
crew = Crew(
agents=my_agents,
tasks=my_tasks,
agents=meus_agentes,
tasks=minhas_tarefas,
process=Process.sequential
)
# Exemplo: Criando uma crew com processo hierárquico
# Certifique-se de fornecer um manager_llm ou manager_agent
crew = Crew(
agents=my_agents,
tasks=my_tasks,
agents=meus_agentes,
tasks=minhas_tarefas,
process=Process.hierarchical,
manager_llm="gpt-4o"
# ou
# manager_agent=my_manager_agent
# manager_agent=meu_agente_gerente
)
```
**Nota:** Certifique-se de que `my_agents` e `my_tasks` estejam definidos antes de criar o objeto `Crew`, e para o processo hierárquico, é necessário também fornecer o `manager_llm` ou `manager_agent`.
**Nota:** Certifique-se de que `meus_agentes` e `minhas_tarefas` estejam definidos antes de criar o objeto `Crew`, e para o processo hierárquico, é necessário também fornecer o `manager_llm` ou `manager_agent`.
## Processo Sequencial

View File

@@ -15,12 +15,12 @@ Para habilitar o reasoning para um agente, basta definir `reasoning=True` ao cri
```python
from crewai import Agent
agent = Agent(
role="Data Analyst",
goal="Analyze complex datasets and provide insights",
backstory="You are an experienced data analyst with expertise in finding patterns in complex data.",
reasoning=True, # Enable reasoning
max_reasoning_attempts=3 # Optional: Set a maximum number of reasoning attempts
analista = Agent(
role="Analista de Dados",
goal="Analisar dados e fornecer insights",
backstory="Você é um analista de dados especialista.",
reasoning=True,
max_reasoning_attempts=3 # Opcional: Defina um limite de tentativas de reasoning
)
```
@@ -53,23 +53,23 @@ Aqui está um exemplo completo:
from crewai import Agent, Task, Crew
# Create an agent with reasoning enabled
analyst = Agent(
role="Data Analyst",
goal="Analyze data and provide insights",
backstory="You are an expert data analyst.",
analista = Agent(
role="Analista de Dados",
goal="Analisar dados e fornecer insights",
backstory="Você é um analista de dados especialista.",
reasoning=True,
max_reasoning_attempts=3 # Optional: Set a limit on reasoning attempts
max_reasoning_attempts=3 # Opcional: Defina um limite de tentativas de reasoning
)
# Create a task
analysis_task = Task(
description="Analyze the provided sales data and identify key trends.",
expected_output="A report highlighting the top 3 sales trends.",
agent=analyst
description="Analise os dados de vendas fornecidos e identifique as principais tendências.",
expected_output="Um relatório destacando as 3 principais tendências de vendas.",
agent=analista
)
# Create a crew and run the task
crew = Crew(agents=[analyst], tasks=[analysis_task])
crew = Crew(agents=[analista], tasks=[analysis_task])
result = crew.kickoff()
print(result)
@@ -90,16 +90,16 @@ logging.basicConfig(level=logging.INFO)
# Create an agent with reasoning enabled
agent = Agent(
role="Data Analyst",
goal="Analyze data and provide insights",
role="Analista de Dados",
goal="Analisar dados e fornecer insights",
reasoning=True,
max_reasoning_attempts=3
)
# Create a task
task = Task(
description="Analyze the provided sales data and identify key trends.",
expected_output="A report highlighting the top 3 sales trends.",
description="Analise os dados de vendas fornecidos e identifique as principais tendências.",
expected_output="Um relatório destacando as 3 principais tendências de vendas.",
agent=agent
)
@@ -113,7 +113,7 @@ result = agent.execute_task(task)
Veja um exemplo de como pode ser um plano de reasoning para uma tarefa de análise de dados:
```
Task: Analyze the provided sales data and identify key trends.
Task: Analise os dados de vendas fornecidos e identifique as principais tendências.
Reasoning Plan:
I'll analyze the sales data to identify the top 3 trends.

View File

@@ -54,9 +54,11 @@ crew = Crew(
| **Markdown** _(opcional)_ | `markdown` | `Optional[bool]` | Se a tarefa deve instruir o agente a retornar a resposta final formatada em Markdown. O padrão é False. |
| **Config** _(opcional)_ | `config` | `Optional[Dict[str, Any]]` | Parâmetros de configuração específicos da tarefa. |
| **Arquivo de Saída** _(opcional)_| `output_file` | `Optional[str]` | Caminho do arquivo para armazenar a saída da tarefa. |
| **Criar Diretório** _(opcional)_ | `create_directory` | `Optional[bool]` | Se deve criar o diretório para output_file caso não exista. O padrão é True. |
| **Saída JSON** _(opcional)_ | `output_json` | `Optional[Type[BaseModel]]` | Um modelo Pydantic para estruturar a saída em JSON. |
| **Output Pydantic** _(opcional)_ | `output_pydantic` | `Optional[Type[BaseModel]]` | Um modelo Pydantic para a saída da tarefa. |
| **Callback** _(opcional)_ | `callback` | `Optional[Any]` | Função/objeto a ser executado após a conclusão da tarefa. |
| **Guardrail** _(opcional)_ | `guardrail` | `Optional[Callable]` | Função para validar a saída da tarefa antes de prosseguir para a próxima tarefa. |
## Criando Tarefas
@@ -330,9 +332,11 @@ analysis_task = Task(
Guardrails (trilhas de proteção) de tarefas fornecem uma maneira de validar e transformar as saídas das tarefas antes que elas sejam passadas para a próxima tarefa. Esse recurso assegura a qualidade dos dados e oferece feedback aos agentes quando sua saída não atende a critérios específicos.
### Usando Guardrails em Tarefas
Guardrails são implementados como funções Python que contêm lógica de validação customizada, proporcionando controle total sobre o processo de validação e garantindo resultados confiáveis e determinísticos.
Para adicionar um guardrail a uma tarefa, forneça uma função de validação por meio do parâmetro `guardrail`:
### Guardrails Baseados em Função
Para adicionar um guardrail baseado em função a uma tarefa, forneça uma função de validação por meio do parâmetro `guardrail`:
```python Code
from typing import Tuple, Union, Dict, Any
@@ -370,9 +374,7 @@ blog_task = Task(
- Em caso de sucesso: retorna uma tupla `(True, resultado_validado)`
- Em caso de falha: retorna uma tupla `(False, "mensagem de erro explicando a falha")`
### LLMGuardrail
A classe `LLMGuardrail` oferece um mecanismo robusto para validação das saídas das tarefas.
### Melhores Práticas de Tratamento de Erros
@@ -386,7 +388,7 @@ def validate_with_context(result: TaskOutput) -> Tuple[bool, Any]:
validated_data = perform_validation(result)
return (True, validated_data)
except ValidationError as e:
return (False, f"VALIDATION_ERROR: {str(e)}")
return (False, f"ERRO_DE_VALIDACAO: {str(e)}")
except Exception as e:
return (False, str(e))
```
@@ -823,26 +825,7 @@ task = Task(
)
```
#### Use uma abordagem no-code para validação
```python Code
from crewai import Task
task = Task(
description="Gerar dados em JSON",
expected_output="Objeto JSON válido",
guardrail="Garanta que a resposta é um objeto JSON válido"
)
```
#### Usando YAML
```yaml
research_task:
...
guardrail: garanta que cada bullet tenha no mínimo 100 palavras
...
```
```python Code
@CrewBase
@@ -958,21 +941,87 @@ task = Task(
## Criando Diretórios ao Salvar Arquivos
Agora é possível especificar se uma tarefa deve criar diretórios ao salvar sua saída em arquivo. Isso é útil para organizar outputs e garantir que os caminhos estejam corretos.
O parâmetro `create_directory` controla se o CrewAI deve criar automaticamente diretórios ao salvar saídas de tarefas em arquivos. Este recurso é particularmente útil para organizar outputs e garantir que os caminhos de arquivos estejam estruturados corretamente, especialmente ao trabalhar com hierarquias de projetos complexas.
### Comportamento Padrão
Por padrão, `create_directory=True`, o que significa que o CrewAI criará automaticamente qualquer diretório ausente no caminho do arquivo de saída:
```python Code
# ...
save_output_task = Task(
description='Salve o resumo das notícias de IA em um arquivo',
expected_output='Arquivo salvo com sucesso',
agent=research_agent,
tools=[file_save_tool],
output_file='outputs/ai_news_summary.txt',
create_directory=True
# Comportamento padrão - diretórios são criados automaticamente
report_task = Task(
description='Gerar um relatório abrangente de análise de mercado',
expected_output='Uma análise detalhada de mercado com gráficos e insights',
agent=analyst_agent,
output_file='reports/2025/market_analysis.md', # Cria 'reports/2025/' se não existir
markdown=True
)
```
#...
### Desabilitando a Criação de Diretórios
Se você quiser evitar a criação automática de diretórios e garantir que o diretório já exista, defina `create_directory=False`:
```python Code
# Modo estrito - o diretório já deve existir
strict_output_task = Task(
description='Salvar dados críticos que requerem infraestrutura existente',
expected_output='Dados salvos em localização pré-configurada',
agent=data_agent,
output_file='secure/vault/critical_data.json',
create_directory=False # Gerará RuntimeError se 'secure/vault/' não existir
)
```
### Configuração YAML
Você também pode configurar este comportamento em suas definições de tarefas YAML:
```yaml tasks.yaml
analysis_task:
description: >
Gerar análise financeira trimestral
expected_output: >
Um relatório financeiro abrangente com insights trimestrais
agent: financial_analyst
output_file: reports/quarterly/q4_2024_analysis.pdf
create_directory: true # Criar automaticamente o diretório 'reports/quarterly/'
audit_task:
description: >
Realizar auditoria de conformidade e salvar no diretório de auditoria existente
expected_output: >
Um relatório de auditoria de conformidade
agent: auditor
output_file: audit/compliance_report.md
create_directory: false # O diretório já deve existir
```
### Casos de Uso
**Criação Automática de Diretórios (`create_directory=True`):**
- Ambientes de desenvolvimento e prototipagem
- Geração dinâmica de relatórios com pastas baseadas em datas
- Fluxos de trabalho automatizados onde a estrutura de diretórios pode variar
- Aplicações multi-tenant com pastas específicas do usuário
**Gerenciamento Manual de Diretórios (`create_directory=False`):**
- Ambientes de produção com controles rígidos do sistema de arquivos
- Aplicações sensíveis à segurança onde diretórios devem ser pré-configurados
- Sistemas com requisitos específicos de permissão
- Ambientes de conformidade onde a criação de diretórios é auditada
### Tratamento de Erros
Quando `create_directory=False` e o diretório não existe, o CrewAI gerará um `RuntimeError`:
```python Code
try:
result = crew.kickoff()
except RuntimeError as e:
# Tratar erro de diretório ausente
print(f"Falha na criação do diretório: {e}")
# Criar diretório manualmente ou usar local alternativo
```
Veja o vídeo abaixo para aprender como utilizar saídas estruturadas no CrewAI:

View File

@@ -67,17 +67,17 @@ web_rag_tool = WebsiteSearchTool()
# Criar agentes
researcher = Agent(
role='Market Research Analyst',
goal='Provide up-to-date market analysis of the AI industry',
backstory='An expert analyst with a keen eye for market trends.',
role='Analista de Mercado',
goal='Fornecer análise de mercado atualizada da indústria de IA',
backstory='Analista especialista com olhar atento para tendências de mercado.',
tools=[search_tool, web_rag_tool],
verbose=True
)
writer = Agent(
role='Content Writer',
goal='Craft engaging blog posts about the AI industry',
backstory='A skilled writer with a passion for technology.',
role='Redator de Conteúdo',
goal='Criar posts de blog envolventes sobre a indústria de IA',
backstory='Redator habilidoso com paixão por tecnologia.',
tools=[docs_tool, file_tool],
verbose=True
)

View File

@@ -36,19 +36,18 @@ Para treinar sua crew de forma programática, siga estes passos:
3. Execute o comando de treinamento dentro de um bloco try-except para tratar possíveis erros.
```python Code
n_iterations = 2
inputs = {"topic": "CrewAI Training"}
filename = "your_model.pkl"
n_iteracoes = 2
entradas = {"topic": "Treinamento CrewAI"}
nome_arquivo = "seu_modelo.pkl"
try:
YourCrewName_Crew().crew().train(
n_iterations=n_iterations,
inputs=inputs,
filename=filename
SuaCrew().crew().train(
n_iterations=n_iteracoes,
inputs=entradas,
filename=nome_arquivo
)
except Exception as e:
raise Exception(f"An error occurred while training the crew: {e}")
raise Exception(f"Ocorreu um erro ao treinar a crew: {e}")
```
### Pontos Importantes

View File

@@ -26,13 +26,13 @@ from crewai.tasks.hallucination_guardrail import HallucinationGuardrail
from crewai import LLM
# Uso básico - utiliza o expected_output da tarefa como contexto
guardrail = HallucinationGuardrail(
protecao = HallucinationGuardrail(
llm=LLM(model="gpt-4o-mini")
)
# Com contexto de referência explícito
context_guardrail = HallucinationGuardrail(
context="AI helps with various tasks including analysis and generation.",
protecao_com_contexto = HallucinationGuardrail(
context="IA ajuda em várias tarefas, incluindo análise e geração.",
llm=LLM(model="gpt-4o-mini")
)
```
@@ -43,11 +43,11 @@ context_guardrail = HallucinationGuardrail(
from crewai import Task
# Crie sua tarefa com a proteção
task = Task(
description="Write a summary about AI capabilities",
expected_output="A factual summary based on the provided context",
agent=my_agent,
guardrail=guardrail # Adiciona a proteção para validar a saída
minha_tarefa = Task(
description="Escreva um resumo sobre as capacidades da IA",
expected_output="Um resumo factual baseado no contexto fornecido",
agent=meu_agente,
guardrail=protecao # Adiciona a proteção para validar a saída
)
```
@@ -59,8 +59,8 @@ Para validação mais rigorosa, é possível definir um limiar de fidelidade per
```python
# Proteção rigorosa exigindo alta pontuação de fidelidade
strict_guardrail = HallucinationGuardrail(
context="Quantum computing uses qubits that exist in superposition states.",
protecao_rigorosa = HallucinationGuardrail(
context="Computação quântica utiliza qubits que existem em estados de superposição.",
llm=LLM(model="gpt-4o-mini"),
threshold=8.0 # Requer pontuação >= 8 para validar
)
@@ -72,10 +72,10 @@ Se sua tarefa utiliza ferramentas, você pode incluir as respostas das ferrament
```python
# Proteção com contexto de resposta da ferramenta
weather_guardrail = HallucinationGuardrail(
context="Current weather information for the requested location",
protecao_clima = HallucinationGuardrail(
context="Informações meteorológicas atuais para o local solicitado",
llm=LLM(model="gpt-4o-mini"),
tool_response="Weather API returned: Temperature 22°C, Humidity 65%, Clear skies"
tool_response="API do Clima retornou: Temperatura 22°C, Umidade 65%, Céu limpo"
)
```
@@ -123,15 +123,15 @@ Quando uma proteção é adicionada à tarefa, ela valida automaticamente a saí
```python
# Fluxo de validação de saída da tarefa
task_output = agent.execute_task(task)
validation_result = guardrail(task_output)
task_output = meu_agente.execute_task(minha_tarefa)
resultado_validacao = protecao(task_output)
if validation_result.valid:
if resultado_validacao.valid:
# Tarefa concluída com sucesso
return task_output
else:
# Tarefa falha com feedback de validação
raise ValidationError(validation_result.feedback)
raise ValidationError(resultado_validacao.feedback)
```
### Rastreamento de Eventos
@@ -151,10 +151,10 @@ A proteção se integra ao sistema de eventos do CrewAI para fornecer observabil
Inclua todas as informações factuais relevantes nas quais a IA deve basear sua saída:
```python
context = """
Company XYZ was founded in 2020 and specializes in renewable energy solutions.
They have 150 employees and generated $50M revenue in 2023.
Their main products include solar panels and wind turbines.
contexto = """
Empresa XYZ foi fundada em 2020 e é especializada em soluções de energia renovável.
Possui 150 funcionários e faturou R$ 50 milhões em 2023.
Seus principais produtos incluem painéis solares e turbinas eólicas.
"""
```
</Step>
@@ -164,10 +164,10 @@ A proteção se integra ao sistema de eventos do CrewAI para fornecer observabil
```python
# Bom: Contexto focado
context = "The current weather in New York is 18°C with light rain."
contexto = "O clima atual em Nova York é 18°C com chuva leve."
# Evite: Informações irrelevantes
context = "The weather is 18°C. The city has 8 million people. Traffic is heavy."
contexto = "The weather is 18°C. The city has 8 million people. Traffic is heavy."
```
</Step>

View File

@@ -84,31 +84,31 @@ from crewai import Agent, Task, Crew
from crewai_tools import CrewaiEnterpriseTools
# Obtenha ferramentas enterprise (a ferramenta Gmail será incluída)
enterprise_tools = CrewaiEnterpriseTools(
enterprise_token="your_enterprise_token"
ferramentas_enterprise = CrewaiEnterpriseTools(
enterprise_token="seu_token_enterprise"
)
# imprima as ferramentas
print(enterprise_tools)
printf(ferramentas_enterprise)
# Crie um agente com capacidades do Gmail
email_agent = Agent(
role="Email Manager",
goal="Manage and organize email communications",
backstory="An AI assistant specialized in email management and communication.",
tools=enterprise_tools
agente_email = Agent(
role="Gerente de E-mails",
goal="Gerenciar e organizar comunicações por e-mail",
backstory="Um assistente de IA especializado em gestão de e-mails e comunicação.",
tools=ferramentas_enterprise
)
# Tarefa para enviar um e-mail
email_task = Task(
description="Draft and send a follow-up email to john@example.com about the project update",
agent=email_agent,
expected_output="Confirmation that email was sent successfully"
tarefa_email = Task(
description="Redigir e enviar um e-mail de acompanhamento para john@example.com sobre a atualização do projeto",
agent=agente_email,
expected_output="Confirmação de que o e-mail foi enviado com sucesso"
)
# Execute a tarefa
crew = Crew(
agents=[email_agent],
tasks=[email_task]
agents=[agente_email],
tasks=[tarefa_email]
)
# Execute o crew
@@ -125,23 +125,23 @@ enterprise_tools = CrewaiEnterpriseTools(
)
gmail_tool = enterprise_tools["gmail_find_email"]
gmail_agent = Agent(
role="Gmail Manager",
goal="Manage gmail communications and notifications",
backstory="An AI assistant that helps coordinate gmail communications.",
agente_gmail = Agent(
role="Gerente do Gmail",
goal="Gerenciar comunicações e notificações do gmail",
backstory="Um assistente de IA que ajuda a coordenar comunicações no gmail.",
tools=[gmail_tool]
)
notification_task = Task(
description="Find the email from john@example.com",
agent=gmail_agent,
expected_output="Email found from john@example.com"
tarefa_notificacao = Task(
description="Encontrar o e-mail de john@example.com",
agent=agente_gmail,
expected_output="E-mail encontrado de john@example.com"
)
# Execute a tarefa
crew = Crew(
agents=[slack_agent],
tasks=[notification_task]
agents=[agente_gmail],
tasks=[tarefa_notificacao]
)
```

View File

@@ -30,7 +30,7 @@ Antes de usar o Repositório de Ferramentas, certifique-se de que você possui:
Para instalar uma ferramenta:
```bash
crewai tool install <tool-name>
crewai tool install <nome-da-ferramenta>
```
Isso instala a ferramenta e a adiciona ao `pyproject.toml`.
@@ -40,7 +40,7 @@ Isso instala a ferramenta e a adiciona ao `pyproject.toml`.
Para criar um novo projeto de ferramenta:
```bash
crewai tool create <tool-name>
crewai tool create <nome-da-ferramenta>
```
Isso gera um projeto de ferramenta estruturado localmente.
@@ -76,7 +76,7 @@ Para atualizar uma ferramenta publicada:
3. Faça o commit das alterações e publique
```bash
git commit -m "Update version to 0.1.1"
git commit -m "Atualizar versão para 0.1.1"
crewai tool publish
```

View File

@@ -12,16 +12,17 @@ O Enterprise Event Streaming permite que você receba atualizações em tempo re
Ao utilizar a API Kickoff, inclua um objeto `webhooks` em sua requisição, por exemplo:
# Exemplo de uso da API Kickoff com webhooks
```json
{
"inputs": {"foo": "bar"},
"webhooks": {
"events": ["crew_kickoff_started", "llm_call_started"],
"url": "https://your.endpoint/webhook",
"url": "https://seu.endpoint/webhook",
"realtime": false,
"authentication": {
"strategy": "bearer",
"token": "my-secret-token"
"token": "meu-token-secreto"
}
}
}
@@ -33,19 +34,20 @@ Se `realtime` estiver definido como `true`, cada evento será entregue individua
Cada webhook envia uma lista de eventos:
# Exemplo de evento enviado pelo webhook
```json
{
"events": [
{
"id": "event-id",
"execution_id": "crew-run-id",
"id": "id-do-evento",
"execution_id": "id-da-execucao-do-crew",
"timestamp": "2025-02-16T10:58:44.965Z",
"type": "llm_call_started",
"data": {
"model": "gpt-4",
"messages": [
{"role": "system", "content": "You are an assistant."},
{"role": "user", "content": "Summarize this article."}
{"role": "system", "content": "Você é um assistente."},
{"role": "user", "content": "Resuma este artigo."}
]
}
}

View File

@@ -41,11 +41,8 @@ A CLI fornece a maneira mais rápida de implantar crews desenvolvidos localmente
Primeiro, você precisa autenticar sua CLI com a plataforma CrewAI Enterprise:
```bash
# Se já possui uma conta CrewAI Enterprise
# Se já possui uma conta CrewAI Enterprise, ou deseja criar uma:
crewai login
# Se vai criar uma nova conta
crewai signup
```
Ao executar qualquer um dos comandos, a CLI irá:

View File

@@ -16,17 +16,17 @@ from crewai import CrewBase
from crewai.project import before_kickoff
@CrewBase
class MyCrew:
class MinhaEquipe:
@before_kickoff
def prepare_data(self, inputs):
# Preprocess or modify inputs
inputs['processed'] = True
return inputs
def preparar_dados(self, entradas):
# Pré-processa ou modifica as entradas
entradas['processado'] = True
return entradas
#...
```
Neste exemplo, a função prepare_data modifica as entradas adicionando um novo par chave-valor indicando que as entradas foram processadas.
Neste exemplo, a função preparar_dados modifica as entradas adicionando um novo par chave-valor indicando que as entradas foram processadas.
## Hook Depois do Kickoff
@@ -39,17 +39,17 @@ from crewai import CrewBase
from crewai.project import after_kickoff
@CrewBase
class MyCrew:
class MinhaEquipe:
@after_kickoff
def log_results(self, result):
# Log or modify the results
print("Crew execution completed with result:", result)
return result
def registrar_resultados(self, resultado):
# Registra ou modifica os resultados
print("Execução da equipe concluída com resultado:", resultado)
return resultado
# ...
```
Na função `log_results`, os resultados da execução da crew são simplesmente impressos. Você pode estender isso para realizar operações mais complexas, como enviar notificações ou integrar com outros serviços.
Na função `registrar_resultados`, os resultados da execução da crew são simplesmente impressos. Você pode estender isso para realizar operações mais complexas, como enviar notificações ou integrar com outros serviços.
## Utilizando Ambos os Hooks

View File

@@ -77,9 +77,9 @@ search_tool = SerperDevTool()
# Inicialize o agente com opções avançadas
agent = Agent(
role='Research Analyst',
goal='Provide up-to-date market analysis',
backstory='An expert analyst with a keen eye for market trends.',
role='Analista de Pesquisa',
goal='Fornecer análises de mercado atualizadas',
backstory='Um analista especialista com olhar atento para tendências de mercado.',
tools=[search_tool],
memory=True, # Ativa memória
verbose=True,
@@ -98,14 +98,9 @@ eficiência dentro do ecossistema CrewAI. Se necessário, a delegação pode ser
```python Code
agent = Agent(
role='Content Writer',
goal='Write engaging content on market trends',
backstory='A seasoned writer with expertise in market analysis.',
role='Redator de Conteúdo',
goal='Escrever conteúdo envolvente sobre tendências de mercado',
backstory='Um redator experiente com expertise em análise de mercado.',
allow_delegation=True # Habilitando delegação
)
```
## Conclusão
Personalizar agentes no CrewAI definindo seus papéis, objetivos, histórias e ferramentas, juntamente com opções avançadas como personalização de modelo de linguagem, memória, ajustes de performance e preferências de delegação,
proporciona uma equipe de IA sofisticada e preparada para enfrentar desafios complexos.
```

View File

@@ -45,17 +45,17 @@ from crewai import Crew, Agent, Task
# Create an agent with code execution enabled
coding_agent = Agent(
role="Python Data Analyst",
goal="Analyze data and provide insights using Python",
backstory="You are an experienced data analyst with strong Python skills.",
role="Analista de Dados Python",
goal="Analisar dados e fornecer insights usando Python",
backstory="Você é um analista de dados experiente com fortes habilidades em Python.",
allow_code_execution=True
)
# Create a task that requires code execution
data_analysis_task = Task(
description="Analyze the given dataset and calculate the average age of participants. Ages: {ages}",
description="Analise o conjunto de dados fornecido e calcule a idade média dos participantes. Idades: {ages}",
agent=coding_agent,
expected_output="The average age of the participants."
expected_output="A idade média dos participantes."
)
# Create a crew and add the task
@@ -83,23 +83,23 @@ from crewai import Crew, Agent, Task
# Create an agent with code execution enabled
coding_agent = Agent(
role="Python Data Analyst",
goal="Analyze data and provide insights using Python",
backstory="You are an experienced data analyst with strong Python skills.",
role="Analista de Dados Python",
goal="Analisar dados e fornecer insights usando Python",
backstory="Você é um analista de dados experiente com fortes habilidades em Python.",
allow_code_execution=True
)
# Create tasks that require code execution
task_1 = Task(
description="Analyze the first dataset and calculate the average age of participants. Ages: {ages}",
description="Analise o primeiro conjunto de dados e calcule a idade média dos participantes. Idades: {ages}",
agent=coding_agent,
expected_output="The average age of the participants."
expected_output="A idade média dos participantes."
)
task_2 = Task(
description="Analyze the second dataset and calculate the average age of participants. Ages: {ages}",
description="Analise o segundo conjunto de dados e calcule a idade média dos participantes. Idades: {ages}",
agent=coding_agent,
expected_output="The average age of the participants."
expected_output="A idade média dos participantes."
)
# Create two crews and add tasks

View File

@@ -43,11 +43,11 @@ try:
with MCPServerAdapter(server_params_list) as aggregated_tools:
print(f"Available aggregated tools: {[tool.name for tool in aggregated_tools]}")
multi_server_agent = Agent(
role="Versatile Assistant",
goal="Utilize tools from local Stdio, remote SSE, and remote HTTP MCP servers.",
backstory="An AI agent capable of leveraging a diverse set of tools from multiple sources.",
tools=aggregated_tools, # All tools are available here
agente_multiservidor = Agent(
role="Assistente Versátil",
goal="Utilizar ferramentas de servidores MCP locais Stdio, remotos SSE e remotos HTTP.",
backstory="Um agente de IA capaz de aproveitar um conjunto diversificado de ferramentas de múltiplas fontes.",
tools=aggregated_tools, # Todas as ferramentas estão disponíveis aqui
verbose=True,
)

View File

@@ -73,10 +73,10 @@ server_params = {
with MCPServerAdapter(server_params) as mcp_tools:
print(f"Available tools: {[tool.name for tool in mcp_tools]}")
my_agent = Agent(
role="MCP Tool User",
goal="Utilize tools from an MCP server.",
backstory="I can connect to MCP servers and use their tools.",
meu_agente = Agent(
role="Usuário de Ferramentas MCP",
goal="Utilizar ferramentas de um servidor MCP.",
backstory="Posso conectar a servidores MCP e usar suas ferramentas.",
tools=mcp_tools, # Passe as ferramentas carregadas para o seu agente
reasoning=True,
verbose=True
@@ -91,10 +91,10 @@ Este padrão geral mostra como integrar ferramentas. Para exemplos específicos
with MCPServerAdapter(server_params) as mcp_tools:
print(f"Available tools: {[tool.name for tool in mcp_tools]}")
my_agent = Agent(
role="MCP Tool User",
goal="Utilize tools from an MCP server.",
backstory="I can connect to MCP servers and use their tools.",
meu_agente = Agent(
role="Usuário de Ferramentas MCP",
goal="Utilizar ferramentas de um servidor MCP.",
backstory="Posso conectar a servidores MCP e usar suas ferramentas.",
tools=mcp_tools["tool_name"], # Passe as ferramentas filtradas para o seu agente
reasoning=True,
verbose=True

View File

@@ -37,24 +37,24 @@ try:
print(f"Available tools from SSE MCP server: {[tool.name for tool in tools]}")
# Example: Using a tool from the SSE MCP server
sse_agent = Agent(
role="Remote Service User",
goal="Utilize a tool provided by a remote SSE MCP server.",
backstory="An AI agent that connects to external services via SSE.",
agente_sse = Agent(
role="Usuário de Serviço Remoto",
goal="Utilizar uma ferramenta fornecida por um servidor MCP remoto via SSE.",
backstory="Um agente de IA que conecta a serviços externos via SSE.",
tools=tools,
reasoning=True,
verbose=True,
)
sse_task = Task(
description="Fetch real-time stock updates for 'AAPL' using an SSE tool.",
expected_output="The latest stock price for AAPL.",
agent=sse_agent,
description="Buscar atualizações em tempo real das ações 'AAPL' usando uma ferramenta SSE.",
expected_output="O preço mais recente da ação AAPL.",
agent=agente_sse,
markdown=True
)
sse_crew = Crew(
agents=[sse_agent],
agents=[agente_sse],
tasks=[sse_task],
verbose=True,
process=Process.sequential
@@ -101,16 +101,16 @@ try:
print(f"Available tools (manual SSE): {[tool.name for tool in tools]}")
manual_sse_agent = Agent(
role="Remote Data Analyst",
goal="Analyze data fetched from a remote SSE MCP server using manual connection management.",
backstory="An AI skilled in handling SSE connections explicitly.",
role="Analista Remoto de Dados",
goal="Analisar dados obtidos de um servidor MCP remoto SSE usando gerenciamento manual de conexão.",
backstory="Um agente de IA especializado em gerenciar conexões SSE explicitamente.",
tools=tools,
verbose=True
)
analysis_task = Task(
description="Fetch and analyze the latest user activity trends from the SSE server.",
expected_output="A summary report of user activity trends.",
description="Buscar e analisar as tendências mais recentes de atividade de usuários do servidor SSE.",
expected_output="Um relatório resumido das tendências de atividade dos usuários.",
agent=manual_sse_agent
)

View File

@@ -38,24 +38,24 @@ with MCPServerAdapter(server_params) as tools:
print(f"Available tools from Stdio MCP server: {[tool.name for tool in tools]}")
# Exemplo: Usando as ferramentas do servidor MCP Stdio em um Agente CrewAI
research_agent = Agent(
role="Local Data Processor",
goal="Process data using a local Stdio-based tool.",
backstory="An AI that leverages local scripts via MCP for specialized tasks.",
pesquisador_local = Agent(
role="Processador Local de Dados",
goal="Processar dados usando uma ferramenta local baseada em Stdio.",
backstory="Uma IA que utiliza scripts locais via MCP para tarefas especializadas.",
tools=tools,
reasoning=True,
verbose=True,
)
processing_task = Task(
description="Process the input data file 'data.txt' and summarize its contents.",
expected_output="A summary of the processed data.",
agent=research_agent,
description="Processar o arquivo de dados de entrada 'data.txt' e resumir seu conteúdo.",
expected_output="Um resumo dos dados processados.",
agent=pesquisador_local,
markdown=True
)
data_crew = Crew(
agents=[research_agent],
agents=[pesquisador_local],
tasks=[processing_task],
verbose=True,
process=Process.sequential
@@ -95,16 +95,16 @@ try:
# Exemplo: Usando as ferramentas com sua configuração de Agent, Task, Crew
manual_agent = Agent(
role="Local Task Executor",
goal="Execute a specific local task using a manually managed Stdio tool.",
backstory="An AI proficient in controlling local processes via MCP.",
role="Executor Local de Tarefas",
goal="Executar uma tarefa local específica usando uma ferramenta Stdio gerenciada manualmente.",
backstory="Uma IA proficiente em controlar processos locais via MCP.",
tools=tools,
verbose=True
)
manual_task = Task(
description="Execute the 'perform_analysis' command via the Stdio tool.",
expected_output="Results of the analysis.",
description="Executar o comando 'perform_analysis' via ferramenta Stdio.",
expected_output="Resultados da análise.",
agent=manual_agent
)

View File

@@ -35,22 +35,22 @@ try:
with MCPServerAdapter(server_params) as tools:
print(f"Available tools from Streamable HTTP MCP server: {[tool.name for tool in tools]}")
http_agent = Agent(
role="HTTP Service Integrator",
goal="Utilize tools from a remote MCP server via Streamable HTTP.",
backstory="An AI agent adept at interacting with complex web services.",
agente_http = Agent(
role="Integrador de Serviços HTTP",
goal="Utilizar ferramentas de um servidor MCP remoto via Streamable HTTP.",
backstory="Um agente de IA especializado em interagir com serviços web complexos.",
tools=tools,
verbose=True,
)
http_task = Task(
description="Perform a complex data query using a tool from the Streamable HTTP server.",
expected_output="The result of the complex data query.",
agent=http_agent,
description="Realizar uma consulta de dados complexa usando uma ferramenta do servidor Streamable HTTP.",
expected_output="O resultado da consulta de dados complexa.",
agent=agente_http,
)
http_crew = Crew(
agents=[http_agent],
agents=[agente_http],
tasks=[http_task],
verbose=True,
process=Process.sequential
@@ -91,16 +91,16 @@ try:
print(f"Available tools (manual Streamable HTTP): {[tool.name for tool in tools]}")
manual_http_agent = Agent(
role="Advanced Web Service User",
goal="Interact with an MCP server using manually managed Streamable HTTP connections.",
backstory="An AI specialist in fine-tuning HTTP-based service integrations.",
role="Usuário Avançado de Serviços Web",
goal="Interagir com um servidor MCP usando conexões HTTP Streamable gerenciadas manualmente.",
backstory="Um especialista em IA em ajustar integrações baseadas em HTTP.",
tools=tools,
verbose=True
)
data_processing_task = Task(
description="Submit data for processing and retrieve results via Streamable HTTP.",
expected_output="Processed data or confirmation.",
description="Enviar dados para processamento e recuperar resultados via Streamable HTTP.",
expected_output="Dados processados ou confirmação.",
agent=manual_http_agent
)

View File

@@ -78,47 +78,40 @@ CrewAIInstrumentor().instrument(skip_dep_check=True, tracer_provider=tracer_prov
search_tool = SerperDevTool()
# Defina seus agentes com papéis e objetivos
researcher = Agent(
role="Senior Research Analyst",
goal="Uncover cutting-edge developments in AI and data science",
backstory="""You work at a leading tech think tank.
Your expertise lies in identifying emerging trends.
You have a knack for dissecting complex data and presenting actionable insights.""",
pesquisador = Agent(
role="Analista Sênior de Pesquisa",
goal="Descobrir os avanços mais recentes em IA e ciência de dados",
backstory="""
Você trabalha em um importante think tank de tecnologia. Sua especialidade é identificar tendências emergentes. Você tem habilidade para dissecar dados complexos e apresentar insights acionáveis.
""",
verbose=True,
allow_delegation=False,
# You can pass an optional llm attribute specifying what model you wanna use.
# llm=ChatOpenAI(model_name="gpt-3.5", temperature=0.7),
tools=[search_tool],
)
writer = Agent(
role="Tech Content Strategist",
goal="Craft compelling content on tech advancements",
backstory="""You are a renowned Content Strategist, known for your insightful and engaging articles.
You transform complex concepts into compelling narratives.""",
role="Estrategista de Conteúdo Técnico",
goal="Criar conteúdo envolvente sobre avanços tecnológicos",
backstory="Você é um Estrategista de Conteúdo renomado, conhecido por seus artigos perspicazes e envolventes. Você transforma conceitos complexos em narrativas atraentes.",
verbose=True,
allow_delegation=True,
)
# Crie tarefas para seus agentes
task1 = Task(
description="""Conduct a comprehensive analysis of the latest advancements in AI in 2024.
Identify key trends, breakthrough technologies, and potential industry impacts.""",
expected_output="Full analysis report in bullet points",
agent=researcher,
description="Realize uma análise abrangente dos avanços mais recentes em IA em 2024. Identifique tendências-chave, tecnologias inovadoras e impactos potenciais na indústria.",
expected_output="Relatório analítico completo em tópicos",
agent=pesquisador,
)
task2 = Task(
description="""Using the insights provided, develop an engaging blog
post that highlights the most significant AI advancements.
Your post should be informative yet accessible, catering to a tech-savvy audience.
Make it sound cool, avoid complex words so it doesn't sound like AI.""",
expected_output="Full blog post of at least 4 paragraphs",
description="Utilizando os insights fornecidos, desenvolva um blog envolvente destacando os avanços mais significativos em IA. O post deve ser informativo e acessível, voltado para um público técnico. Dê um tom interessante, evite palavras complexas para não soar como IA.",
expected_output="Post de blog completo com pelo menos 4 parágrafos",
agent=writer,
)
# Instancie seu crew com um processo sequencial
crew = Crew(
agents=[researcher, writer], tasks=[task1, task2], verbose=1, process=Process.sequential
agents=[pesquisador, writer], tasks=[task1, task2], verbose=1, process=Process.sequential
)
# Coloque seu crew para trabalhar!

View File

@@ -76,20 +76,20 @@ from crewai_tools import (
web_rag_tool = WebsiteSearchTool()
writer = Agent(
role="Writer",
goal="Você torna a matemática envolvente e compreensível para crianças pequenas através de poesias",
backstory="Você é especialista em escrever haicais mas não sabe nada de matemática.",
tools=[web_rag_tool],
)
escritor = Agent(
role="Escritor",
goal="Você torna a matemática envolvente e compreensível para crianças pequenas através de poesias",
backstory="Você é especialista em escrever haicais mas não sabe nada de matemática.",
tools=[web_rag_tool],
)
task = Task(description=("O que é {multiplicação}?"),
expected_output=("Componha um haicai que inclua a resposta."),
agent=writer)
tarefa = Task(description=("O que é {multiplicação}?"),
expected_output=("Componha um haicai que inclua a resposta."),
agent=escritor)
crew = Crew(
agents=[writer],
tasks=[task],
equipe = Crew(
agents=[escritor],
tasks=[tarefa],
share_crew=False
)
```

View File

@@ -35,7 +35,7 @@ Essa integração permite o registro de hiperparâmetros, o monitoramento de reg
```python
from langtrace_python_sdk import langtrace
langtrace.init(api_key='<LANGTRACE_API_KEY>')
langtrace.init(api_key='<SUA_CHAVE_LANGTRACE>')
# Agora importe os módulos do CrewAI
from crewai import Agent, Task, Crew

View File

@@ -73,26 +73,24 @@ instrument_crewai(logger)
### 4. Crie e execute sua aplicação CrewAI normalmente
```python
# Crie seu agente
researcher = Agent(
role='Senior Research Analyst',
goal='Uncover cutting-edge developments in AI',
backstory="You are an expert researcher at a tech think tank...",
pesquisador = Agent(
role='Pesquisador Sênior',
goal='Descobrir os avanços mais recentes em IA',
backstory="Você é um pesquisador especialista em um think tank de tecnologia...",
verbose=True,
llm=llm
)
# Defina a tarefa
research_task = Task(
description="Research the latest AI advancements...",
description="Pesquise os avanços mais recentes em IA...",
expected_output="",
agent=researcher
agent=pesquisador
)
# Configure e execute a crew
crew = Crew(
agents=[researcher],
agents=[pesquisador],
tasks=[research_task],
verbose=True
)

View File

@@ -70,22 +70,19 @@ O tracing fornece uma forma de registrar os inputs, outputs e metadados associad
class TripAgents:
def city_selection_agent(self):
return Agent(
role="City Selection Expert",
goal="Select the best city based on weather, season, and prices",
backstory="An expert in analyzing travel data to pick ideal destinations",
tools=[
search_tool,
],
especialista_cidades = Agent(
role="Especialista em Seleção de Cidades",
goal="Selecionar a melhor cidade com base no clima, estação e preços",
backstory="Especialista em analisar dados de viagem para escolher destinos ideais",
tools=[search_tool],
verbose=True,
)
def local_expert(self):
return Agent(
role="Local Expert at this city",
goal="Provide the BEST insights about the selected city",
backstory="""A knowledgeable local guide with extensive information
about the city, it's attractions and customs""",
especialista_local = Agent(
role="Especialista Local nesta cidade",
goal="Fornecer as MELHORES informações sobre a cidade selecionada",
backstory="Um guia local experiente com amplo conhecimento sobre a cidade, suas atrações e costumes",
tools=[search_tool],
verbose=True,
)
@@ -96,53 +93,36 @@ O tracing fornece uma forma de registrar os inputs, outputs e metadados associad
return Task(
description=dedent(
f"""
Analyze and select the best city for the trip based
on specific criteria such as weather patterns, seasonal
events, and travel costs. This task involves comparing
multiple cities, considering factors like current weather
conditions, upcoming cultural or seasonal events, and
overall travel expenses.
Your final answer must be a detailed
report on the chosen city, and everything you found out
about it, including the actual flight costs, weather
forecast and attractions.
Analise e selecione a melhor cidade para a viagem com base em critérios específicos como padrões climáticos, eventos sazonais e custos de viagem. Esta tarefa envolve comparar várias cidades, considerando fatores como condições climáticas atuais, eventos culturais ou sazonais e despesas gerais de viagem.
Sua resposta final deve ser um relatório detalhado sobre a cidade escolhida e tudo o que você descobriu sobre ela, incluindo custos reais de voo, previsão do tempo e atrações.
Traveling from: {origin}
City Options: {cities}
Trip Date: {range}
Traveler Interests: {interests}
Saindo de: {origin}
Opções de cidades: {cities}
Data da viagem: {range}
Interesses do viajante: {interests}
"""
),
agent=agent,
expected_output="Detailed report on the chosen city including flight costs, weather forecast, and attractions",
expected_output="Relatório detalhado sobre a cidade escolhida incluindo custos de voo, previsão do tempo e atrações",
)
def gather_task(self, agent, origin, interests, range):
return Task(
description=dedent(
f"""
As a local expert on this city you must compile an
in-depth guide for someone traveling there and wanting
to have THE BEST trip ever!
Gather information about key attractions, local customs,
special events, and daily activity recommendations.
Find the best spots to go to, the kind of place only a
local would know.
This guide should provide a thorough overview of what
the city has to offer, including hidden gems, cultural
hotspots, must-visit landmarks, weather forecasts, and
high level costs.
The final answer must be a comprehensive city guide,
rich in cultural insights and practical tips,
tailored to enhance the travel experience.
Como especialista local nesta cidade, você deve compilar um guia aprofundado para alguém que está viajando para lá e quer ter a MELHOR viagem possível!
Reúna informações sobre principais atrações, costumes locais, eventos especiais e recomendações de atividades diárias.
Encontre os melhores lugares para ir, aqueles que só um local conhece.
Este guia deve fornecer uma visão abrangente do que a cidade tem a oferecer, incluindo joias escondidas, pontos culturais, marcos imperdíveis, previsão do tempo e custos gerais.
A resposta final deve ser um guia completo da cidade, rico em insights culturais e dicas práticas, adaptado para aprimorar a experiência de viagem.
Trip Date: {range}
Traveling from: {origin}
Traveler Interests: {interests}
Data da viagem: {range}
Saindo de: {origin}
Interesses do viajante: {interests}
"""
),
agent=agent,
expected_output="Comprehensive city guide including hidden gems, cultural hotspots, and practical travel tips",
expected_output="Guia completo da cidade incluindo joias escondidas, pontos culturais e dicas práticas",
)
@@ -189,7 +169,7 @@ O tracing fornece uma forma de registrar os inputs, outputs e metadados associad
trip_crew = TripCrew("California", "Tokyo", "Dec 12 - Dec 20", "sports")
result = trip_crew.run()
print(result)
print("Resultado da equipe:", result)
```
Consulte a [Documentação de Tracing do MLflow](https://mlflow.org/docs/latest/llms/tracing/index.html) para mais configurações e casos de uso.
</Step>

View File

@@ -69,10 +69,10 @@ Essa configuração permite acompanhar hiperparâmetros e monitorar problemas de
openlit.init(disable_metrics=True)
# Definir seus agentes
researcher = Agent(
role="Researcher",
goal="Conduct thorough research and analysis on AI and AI agents",
backstory="You're an expert researcher, specialized in technology, software engineering, AI, and startups. You work as a freelancer and are currently researching for a new client.",
pesquisador = Agent(
role="Pesquisador",
goal="Realizar pesquisas e análises aprofundadas sobre IA e agentes de IA",
backstory="Você é um pesquisador especialista em tecnologia, engenharia de software, IA e startups. Trabalha como freelancer e está atualmente pesquisando para um novo cliente.",
allow_delegation=False,
llm='command-r'
)
@@ -80,24 +80,24 @@ Essa configuração permite acompanhar hiperparâmetros e monitorar problemas de
# Definir sua task
task = Task(
description="Generate a list of 5 interesting ideas for an article, then write one captivating paragraph for each idea that showcases the potential of a full article on this topic. Return the list of ideas with their paragraphs and your notes.",
expected_output="5 bullet points, each with a paragraph and accompanying notes.",
description="Gere uma lista com 5 ideias interessantes para um artigo e escreva um parágrafo cativante para cada ideia, mostrando o potencial de um artigo completo sobre o tema. Retorne a lista de ideias com seus parágrafos e suas anotações.",
expected_output="5 tópicos, cada um com um parágrafo e notas complementares.",
)
# Definir o agente gerente
manager = Agent(
role="Project Manager",
goal="Efficiently manage the crew and ensure high-quality task completion",
backstory="You're an experienced project manager, skilled in overseeing complex projects and guiding teams to success. Your role is to coordinate the efforts of the crew members, ensuring that each task is completed on time and to the highest standard.",
gerente = Agent(
role="Gerente de Projeto",
goal="Gerenciar eficientemente a equipe e garantir a conclusão de tarefas de alta qualidade",
backstory="Você é um gerente de projetos experiente, habilidoso em supervisionar projetos complexos e guiar equipes para o sucesso. Sua função é coordenar os esforços dos membros da equipe, garantindo que cada tarefa seja concluída no prazo e com o mais alto padrão.",
allow_delegation=True,
llm='command-r'
)
# Instanciar sua crew com um manager personalizado
crew = Crew(
agents=[researcher],
agents=[pesquisador],
tasks=[task],
manager_agent=manager,
manager_agent=gerente,
process=Process.hierarchical,
)
@@ -132,18 +132,18 @@ Essa configuração permite acompanhar hiperparâmetros e monitorar problemas de
# Criar um agente com execução de código habilitada
coding_agent = Agent(
role="Python Data Analyst",
goal="Analyze data and provide insights using Python",
backstory="You are an experienced data analyst with strong Python skills.",
role="Analista de Dados Python",
goal="Analisar dados e fornecer insights usando Python",
backstory="Você é um analista de dados experiente com fortes habilidades em Python.",
allow_code_execution=True,
llm="command-r"
)
# Criar uma task que exige execução de código
data_analysis_task = Task(
description="Analyze the given dataset and calculate the average age of participants. Ages: {ages}",
description="Analise o conjunto de dados fornecido e calcule a idade média dos participantes. Idades: {ages}",
agent=coding_agent,
expected_output="5 bullet points, each with a paragraph and accompanying notes.",
expected_output="5 tópicos, cada um com um parágrafo e notas complementares.",
)
# Criar uma crew e adicionar a task

View File

@@ -58,43 +58,43 @@ Neste guia, utilizaremos o exemplo de início rápido da CrewAI.
from crewai import Agent, Crew, Task, Process
class YourCrewName:
def agent_one(self) -> Agent:
class NomeDaEquipe:
def agente_um(self) -> Agent:
return Agent(
role="Data Analyst",
goal="Analyze data trends in the market",
backstory="An experienced data analyst with a background in economics",
role="Analista de Dados",
goal="Analisar tendências de dados no mercado",
backstory="Analista de dados experiente com formação em economia",
verbose=True,
)
def agent_two(self) -> Agent:
def agente_dois(self) -> Agent:
return Agent(
role="Market Researcher",
goal="Gather information on market dynamics",
backstory="A diligent researcher with a keen eye for detail",
role="Pesquisador de Mercado",
goal="Coletar informações sobre a dinâmica do mercado",
backstory="Pesquisador dedicado com olhar atento para detalhes",
verbose=True,
)
def task_one(self) -> Task:
def tarefa_um(self) -> Task:
return Task(
name="Collect Data Task",
description="Collect recent market data and identify trends.",
expected_output="A report summarizing key trends in the market.",
agent=self.agent_one(),
name="Tarefa de Coleta de Dados",
description="Coletar dados recentes do mercado e identificar tendências.",
expected_output="Um relatório resumindo as principais tendências do mercado.",
agent=self.agente_um(),
)
def task_two(self) -> Task:
def tarefa_dois(self) -> Task:
return Task(
name="Market Research Task",
description="Research factors affecting market dynamics.",
expected_output="An analysis of factors influencing the market.",
agent=self.agent_two(),
name="Tarefa de Pesquisa de Mercado",
description="Pesquisar fatores que afetam a dinâmica do mercado.",
expected_output="Uma análise dos fatores que influenciam o mercado.",
agent=self.agente_dois(),
)
def crew(self) -> Crew:
def equipe(self) -> Crew:
return Crew(
agents=[self.agent_one(), self.agent_two()],
tasks=[self.task_one(), self.task_two()],
agents=[self.agente_um(), self.agente_dois()],
tasks=[self.tarefa_um(), self.tarefa_dois()],
process=Process.sequential,
verbose=True,
)
@@ -108,7 +108,7 @@ Neste guia, utilizaremos o exemplo de início rápido da CrewAI.
track_crewai(project_name="crewai-integration-demo")
my_crew = YourCrewName().crew()
my_crew = NomeDaEquipe().equipe()
result = my_crew.kickoff()
print(result)

View File

@@ -64,17 +64,17 @@ patronus_eval_tool = PatronusEvalTool()
# Define an agent that uses the tool
coding_agent = Agent(
role="Coding Agent",
goal="Generate high quality code and verify that the output is code",
backstory="An experienced coder who can generate high quality python code.",
role="Agente de Programação",
goal="Gerar código de alta qualidade e verificar se a saída é código",
backstory="Um programador experiente que pode gerar código Python de alta qualidade.",
tools=[patronus_eval_tool],
verbose=True,
)
# Example task to generate and evaluate code
generate_code_task = Task(
description="Create a simple program to generate the first N numbers in the Fibonacci sequence. Select the most appropriate evaluator and criteria for evaluating your output.",
expected_output="Program that generates the first N numbers in the Fibonacci sequence.",
description="Crie um programa simples para gerar os N primeiros números da sequência de Fibonacci. Selecione o avaliador e os critérios mais apropriados para avaliar sua saída.",
expected_output="Programa que gera os N primeiros números da sequência de Fibonacci.",
agent=coding_agent,
)
@@ -98,17 +98,17 @@ patronus_eval_tool = PatronusPredefinedCriteriaEvalTool(
# Define an agent that uses the tool
coding_agent = Agent(
role="Coding Agent",
goal="Generate high quality code",
backstory="An experienced coder who can generate high quality python code.",
role="Agente de Programação",
goal="Gerar código de alta qualidade",
backstory="Um programador experiente que pode gerar código Python de alta qualidade.",
tools=[patronus_eval_tool],
verbose=True,
)
# Example task to generate code
generate_code_task = Task(
description="Create a simple program to generate the first N numbers in the Fibonacci sequence.",
expected_output="Program that generates the first N numbers in the Fibonacci sequence.",
description="Crie um programa simples para gerar os N primeiros números da sequência de Fibonacci.",
expected_output="Programa que gera os N primeiros números da sequência de Fibonacci.",
agent=coding_agent,
)
@@ -149,17 +149,17 @@ patronus_eval_tool = PatronusLocalEvaluatorTool(
# Define an agent that uses the tool
coding_agent = Agent(
role="Coding Agent",
goal="Generate high quality code",
backstory="An experienced coder who can generate high quality python code.",
role="Agente de Programação",
goal="Gerar código de alta qualidade",
backstory="Um programador experiente que pode gerar código Python de alta qualidade.",
tools=[patronus_eval_tool],
verbose=True,
)
# Example task to generate code
generate_code_task = Task(
description="Create a simple program to generate the first N numbers in the Fibonacci sequence.",
expected_output="Program that generates the first N numbers in the Fibonacci sequence.",
description="Crie um programa simples para gerar os N primeiros números da sequência de Fibonacci.",
expected_output="Programa que gera os N primeiros números da sequência de Fibonacci.",
agent=coding_agent,
)

View File

@@ -50,48 +50,48 @@ O Weave captura automaticamente rastreamentos (traces) de suas aplicações Crew
llm = LLM(model="gpt-4o", temperature=0)
# Crie os agentes
researcher = Agent(
role='Research Analyst',
goal='Find and analyze the best investment opportunities',
backstory='Expert in financial analysis and market research',
pesquisador = Agent(
role='Analista de Pesquisa',
goal='Encontrar e analisar as melhores oportunidades de investimento',
backstory='Especialista em análise financeira e pesquisa de mercado',
llm=llm,
verbose=True,
allow_delegation=False,
)
writer = Agent(
role='Report Writer',
goal='Write clear and concise investment reports',
backstory='Experienced in creating detailed financial reports',
redator = Agent(
role='Redator de Relatórios',
goal='Escrever relatórios de investimento claros e concisos',
backstory='Experiente na criação de relatórios financeiros detalhados',
llm=llm,
verbose=True,
allow_delegation=False,
)
# Crie as tarefas
research_task = Task(
description='Deep research on the {topic}',
expected_output='Comprehensive market data including key players, market size, and growth trends.',
agent=researcher
pesquisa = Task(
description='Pesquisa aprofundada sobre o {tema}',
expected_output='Dados de mercado abrangentes incluindo principais players, tamanho de mercado e tendências de crescimento.',
agent=pesquisador
)
writing_task = Task(
description='Write a detailed report based on the research',
expected_output='The report should be easy to read and understand. Use bullet points where applicable.',
agent=writer
redacao = Task(
description='Escreva um relatório detalhado com base na pesquisa',
expected_output='O relatório deve ser fácil de ler e entender. Use tópicos quando aplicável.',
agent=redator
)
# Crie o crew
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, writing_task],
equipe = Crew(
agents=[pesquisador, redator],
tasks=[pesquisa, redacao],
verbose=True,
process=Process.sequential,
)
# Execute o crew
result = crew.kickoff(inputs={"topic": "AI in material science"})
print(result)
resultado = equipe.kickoff(inputs={"tema": "IA em ciência dos materiais"})
print(resultado)
```
</Step>
<Step title="Visualize rastreamentos no Weave">

View File

@@ -39,23 +39,19 @@ Siga os passos abaixo para começar a tripular! 🚣‍♂️
# src/latest_ai_development/config/agents.yaml
researcher:
role: >
{topic} Senior Data Researcher
Pesquisador Sênior de Dados em {topic}
goal: >
Uncover cutting-edge developments in {topic}
Descobrir os avanços mais recentes em {topic}
backstory: >
You're a seasoned researcher with a knack for uncovering the latest
developments in {topic}. Known for your ability to find the most relevant
information and present it in a clear and concise manner.
Você é um pesquisador experiente com talento para descobrir os últimos avanços em {topic}. Conhecido por sua habilidade em encontrar as informações mais relevantes e apresentá-las de forma clara e concisa.
reporting_analyst:
role: >
{topic} Reporting Analyst
Analista de Relatórios em {topic}
goal: >
Create detailed reports based on {topic} data analysis and research findings
Criar relatórios detalhados com base na análise de dados e descobertas de pesquisa em {topic}
backstory: >
You're a meticulous analyst with a keen eye for detail. You're known for
your ability to turn complex data into clear and concise reports, making
it easy for others to understand and act on the information you provide.
Você é um analista meticuloso com um olhar atento aos detalhes. É conhecido por sua capacidade de transformar dados complexos em relatórios claros e concisos, facilitando o entendimento e a tomada de decisão por parte dos outros.
```
</Step>
<Step title="Modifique seu arquivo `tasks.yaml`">
@@ -63,20 +59,19 @@ Siga os passos abaixo para começar a tripular! 🚣‍♂️
# src/latest_ai_development/config/tasks.yaml
research_task:
description: >
Conduct a thorough research about {topic}
Make sure you find any interesting and relevant information given
the current year is 2025.
Realize uma pesquisa aprofundada sobre {topic}.
Certifique-se de encontrar informações interessantes e relevantes considerando que o ano atual é 2025.
expected_output: >
A list with 10 bullet points of the most relevant information about {topic}
Uma lista com 10 tópicos dos dados mais relevantes sobre {topic}
agent: researcher
reporting_task:
description: >
Review the context you got and expand each topic into a full section for a report.
Make sure the report is detailed and contains any and all relevant information.
Revise o contexto obtido e expanda cada tópico em uma seção completa para um relatório.
Certifique-se de que o relatório seja detalhado e contenha todas as informações relevantes.
expected_output: >
A fully fledge reports with the mains topics, each with a full section of information.
Formatted as markdown without '```'
Um relatório completo com os principais tópicos, cada um com uma seção detalhada de informações.
Formate como markdown sem usar '```'
agent: reporting_analyst
output_file: report.md
```
@@ -122,15 +117,15 @@ Siga os passos abaixo para começar a tripular! 🚣‍♂️
def reporting_task(self) -> Task:
return Task(
config=self.tasks_config['reporting_task'], # type: ignore[index]
output_file='output/report.md' # This is the file that will be contain the final report.
output_file='output/report.md' # Este é o arquivo que conterá o relatório final.
)
@crew
def crew(self) -> Crew:
"""Creates the LatestAiDevelopment crew"""
return Crew(
agents=self.agents, # Automatically created by the @agent decorator
tasks=self.tasks, # Automatically created by the @task decorator
agents=self.agents, # Criado automaticamente pelo decorador @agent
tasks=self.tasks, # Criado automaticamente pelo decorador @task
process=Process.sequential,
verbose=True,
)
@@ -229,7 +224,7 @@ Siga os passos abaixo para começar a tripular! 🚣‍♂️
<CodeGroup>
```markdown output/report.md
# Comprehensive Report on the Rise and Impact of AI Agents in 2025
# Relatório Abrangente sobre a Ascensão e o Impacto dos Agentes de IA em 2025
## 1. Introduction to AI Agents
In 2025, Artificial Intelligence (AI) agents are at the forefront of innovation across various industries. As intelligent systems that can perform tasks typically requiring human cognition, AI agents are paving the way for significant advancements in operational efficiency, decision-making, and overall productivity within sectors like Human Resources (HR) and Finance. This report aims to detail the rise of AI agents, their frameworks, applications, and potential implications on the workforce.

View File

@@ -35,78 +35,18 @@ from crewai_tools import LinkupSearchTool
from crewai import Agent
import os
# Initialize the tool with your API key
linkup_tool = LinkupSearchTool(api_key=os.getenv("LINKUP_API_KEY"))
# Inicialize a ferramenta com sua chave de API
linkup_ferramenta = LinkupSearchTool(api_key=os.getenv("LINKUP_API_KEY"))
# Define an agent that uses the tool
# Defina um agente que usa a ferramenta
@agent
def researcher(self) -> Agent:
def pesquisador(self) -> Agent:
'''
This agent uses the LinkupSearchTool to retrieve contextual information
from the Linkup API.
Este agente usa o LinkupSearchTool para recuperar informações contextuais
da API do Linkup.
'''
return Agent(
config=self.agents_config["researcher"],
tools=[linkup_tool]
config=self.agentes_config["pesquisador"],
tools=[linkup_ferramenta]
)
```
## Parâmetros
O `LinkupSearchTool` aceita os seguintes parâmetros:
### Parâmetros do Construtor
- **api_key**: Obrigatório. Sua chave de API do Linkup.
### Parâmetros de Execução
- **query**: Obrigatório. O termo ou frase de busca.
- **depth**: Opcional. A profundidade da busca. O padrão é "standard".
- **output_type**: Opcional. O tipo de saída. O padrão é "searchResults".
## Uso Avançado
Você pode personalizar os parâmetros de busca para resultados mais específicos:
```python Code
# Perform a search with custom parameters
results = linkup_tool.run(
query="Women Nobel Prize Physics",
depth="deep",
output_type="searchResults"
)
```
## Formato de Retorno
A ferramenta retorna resultados no seguinte formato:
```json
{
"success": true,
"results": [
{
"name": "Result Title",
"url": "https://example.com/result",
"content": "Content of the result..."
},
// Additional results...
]
}
```
Se ocorrer um erro, a resposta será:
```json
{
"success": false,
"error": "Error message"
}
```
## Tratamento de Erros
A ferramenta lida com erros de API de forma amigável e fornece feedback estruturado. Se a requisição à API falhar, a ferramenta retornará um dicionário com `success: false` e uma mensagem de erro.
## Conclusão
O `LinkupSearchTool` oferece uma forma integrada de incorporar as capacidades de busca de informações contextuais do Linkup aos seus agentes CrewAI. Ao utilizar esta ferramenta, os agentes podem acessar informações relevantes e atualizadas para aprimorar sua tomada de decisão e execução de tarefas.
```

View File

@@ -11,7 +11,7 @@ dependencies = [
# Core Dependencies
"pydantic>=2.4.2",
"openai>=1.13.3",
"litellm==1.72.6",
"litellm==1.74.3",
"instructor>=1.3.3",
# Text Processing
"pdfplumber>=0.11.4",
@@ -27,18 +27,19 @@ dependencies = [
"openpyxl>=3.1.5",
"pyvis>=0.3.2",
# Authentication and Security
"auth0-python>=4.7.1",
"python-dotenv>=1.0.0",
"pyjwt>=2.9.0",
# Configuration and Utils
"click>=8.1.7",
"appdirs>=1.4.4",
"jsonref>=1.1.0",
"json-repair>=0.25.2",
"json-repair==0.25.2",
"uv>=0.4.25",
"tomli-w>=1.1.0",
"tomli>=2.0.2",
"blinker>=1.9.0",
"json5>=0.10.0",
"portalocker==2.7.0",
]
[project.urls]
@@ -47,11 +48,11 @@ Documentation = "https://docs.crewai.com"
Repository = "https://github.com/crewAIInc/crewAI"
[project.optional-dependencies]
tools = ["crewai-tools~=0.48.0"]
tools = ["crewai-tools~=0.59.0"]
embeddings = [
"tiktoken~=0.8.0"
]
agentops = ["agentops>=0.3.0"]
agentops = ["agentops==0.3.18"]
pdfplumber = [
"pdfplumber>=0.11.4",
]
@@ -83,6 +84,8 @@ dev-dependencies = [
"pytest-recording>=0.13.2",
"pytest-randomly>=3.16.0",
"pytest-timeout>=2.3.1",
"pytest-xdist>=3.6.1",
"pytest-split>=0.9.0",
]
[project.scripts]
@@ -123,3 +126,15 @@ path = "src/crewai/__init__.py"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
exclude = [
"docs/**",
"docs/",
]
[tool.hatch.build.targets.sdist]
exclude = [
"docs/**",
"docs/",
]

View File

@@ -28,19 +28,19 @@ _telemetry_submitted = False
def _track_install():
"""Track package installation/first-use via Scarf analytics."""
global _telemetry_submitted
if _telemetry_submitted or Telemetry._is_telemetry_disabled():
return
try:
pixel_url = "https://api.scarf.sh/v2/packages/CrewAI/crewai/docs/00f2dad1-8334-4a39-934e-003b2e1146db"
req = urllib.request.Request(pixel_url)
req.add_header('User-Agent', f'CrewAI-Python/{__version__}')
with urllib.request.urlopen(req, timeout=2): # nosec B310
_telemetry_submitted = True
except Exception:
pass
@@ -54,7 +54,7 @@ def _track_install_async():
_track_install_async()
__version__ = "0.134.0"
__version__ = "0.152.0"
__all__ = [
"Agent",
"Crew",

View File

@@ -210,7 +210,6 @@ class Agent(BaseAgent):
sources=self.knowledge_sources,
embedder=self.embedder,
collection_name=self.role,
storage=self.knowledge_storage or None,
)
self.knowledge.add_sources()
except (TypeError, ValueError) as e:
@@ -341,7 +340,8 @@ class Agent(BaseAgent):
self.knowledge_config.model_dump() if self.knowledge_config else {}
)
if self.knowledge:
if self.knowledge or (self.crew and self.crew.knowledge):
crewai_event_bus.emit(
self,
event=KnowledgeRetrievalStartedEvent(
@@ -353,25 +353,28 @@ class Agent(BaseAgent):
task_prompt
)
if self.knowledge_search_query:
agent_knowledge_snippets = self.knowledge.query(
[self.knowledge_search_query], **knowledge_config
)
if agent_knowledge_snippets:
self.agent_knowledge_context = extract_knowledge_context(
agent_knowledge_snippets
)
if self.agent_knowledge_context:
task_prompt += self.agent_knowledge_context
if self.crew:
knowledge_snippets = self.crew.query_knowledge(
# Quering agent specific knowledge
if self.knowledge:
agent_knowledge_snippets = self.knowledge.query(
[self.knowledge_search_query], **knowledge_config
)
if knowledge_snippets:
self.crew_knowledge_context = extract_knowledge_context(
knowledge_snippets
if agent_knowledge_snippets:
self.agent_knowledge_context = extract_knowledge_context(
agent_knowledge_snippets
)
if self.crew_knowledge_context:
task_prompt += self.crew_knowledge_context
if self.agent_knowledge_context:
task_prompt += self.agent_knowledge_context
# Quering crew specific knowledge
knowledge_snippets = self.crew.query_knowledge(
[self.knowledge_search_query], **knowledge_config
)
if knowledge_snippets:
self.crew_knowledge_context = extract_knowledge_context(
knowledge_snippets
)
if self.crew_knowledge_context:
task_prompt += self.crew_knowledge_context
crewai_event_bus.emit(
self,

View File

@@ -120,11 +120,8 @@ class CrewAgentExecutor(CrewAgentExecutorMixin):
raise
except Exception as e:
handle_unknown_error(self._printer, e)
if e.__class__.__module__.startswith("litellm"):
# Do not retry on litellm errors
raise e
else:
raise e
raise
if self.ask_for_human_input:
formatted_answer = self._handle_human_feedback(formatted_answer)

View File

@@ -2,3 +2,7 @@ ALGORITHMS = ["RS256"]
AUTH0_DOMAIN = "crewai.us.auth0.com"
AUTH0_CLIENT_ID = "DEVC5Fw6NlRoSzmDCcOhVq85EfLBjKa8"
AUTH0_AUDIENCE = "https://crewai.us.auth0.com/api/v2/"
WORKOS_DOMAIN = "login.crewai.com"
WORKOS_CLI_CONNECT_APP_ID = "client_01JYT06R59SP0NXYGD994NFXXX"
WORKOS_ENVIRONMENT_ID = "client_01JNJQWBJ4SPFN3SWJM5T7BDG8"

View File

@@ -5,37 +5,72 @@ from typing import Any, Dict
import requests
from rich.console import Console
from .constants import AUTH0_AUDIENCE, AUTH0_CLIENT_ID, AUTH0_DOMAIN
from .utils import TokenManager, validate_token
from .constants import (
AUTH0_AUDIENCE,
AUTH0_CLIENT_ID,
AUTH0_DOMAIN,
WORKOS_DOMAIN,
WORKOS_CLI_CONNECT_APP_ID,
WORKOS_ENVIRONMENT_ID,
)
from .utils import TokenManager, validate_jwt_token
from urllib.parse import quote
from crewai.cli.plus_api import PlusAPI
from crewai.cli.config import Settings
console = Console()
class AuthenticationCommand:
DEVICE_CODE_URL = f"https://{AUTH0_DOMAIN}/oauth/device/code"
TOKEN_URL = f"https://{AUTH0_DOMAIN}/oauth/token"
AUTH0_DEVICE_CODE_URL = f"https://{AUTH0_DOMAIN}/oauth/device/code"
AUTH0_TOKEN_URL = f"https://{AUTH0_DOMAIN}/oauth/token"
WORKOS_DEVICE_CODE_URL = f"https://{WORKOS_DOMAIN}/oauth2/device_authorization"
WORKOS_TOKEN_URL = f"https://{WORKOS_DOMAIN}/oauth2/token"
def __init__(self):
self.token_manager = TokenManager()
# TODO: WORKOS - This variable is temporary until migration to WorkOS is complete.
self.user_provider = "workos"
def signup(self) -> None:
def login(self) -> None:
"""Sign up to CrewAI+"""
console.print("Signing Up to CrewAI+ \n", style="bold blue")
device_code_data = self._get_device_code()
device_code_url = self.WORKOS_DEVICE_CODE_URL
token_url = self.WORKOS_TOKEN_URL
client_id = WORKOS_CLI_CONNECT_APP_ID
audience = None
console.print("Signing in to CrewAI Enterprise...\n", style="bold blue")
# TODO: WORKOS - Next line and conditional are temporary until migration to WorkOS is complete.
user_provider = self._determine_user_provider()
if user_provider == "auth0":
device_code_url = self.AUTH0_DEVICE_CODE_URL
token_url = self.AUTH0_TOKEN_URL
client_id = AUTH0_CLIENT_ID
audience = AUTH0_AUDIENCE
self.user_provider = "auth0"
# End of temporary code.
device_code_data = self._get_device_code(client_id, device_code_url, audience)
self._display_auth_instructions(device_code_data)
return self._poll_for_token(device_code_data)
return self._poll_for_token(device_code_data, client_id, token_url)
def _get_device_code(self) -> Dict[str, Any]:
def _get_device_code(
self, client_id: str, device_code_url: str, audience: str | None = None
) -> Dict[str, Any]:
"""Get the device code to authenticate the user."""
device_code_payload = {
"client_id": AUTH0_CLIENT_ID,
"client_id": client_id,
"scope": "openid",
"audience": AUTH0_AUDIENCE,
"audience": audience,
}
response = requests.post(
url=self.DEVICE_CODE_URL, data=device_code_payload, timeout=20
url=device_code_url, data=device_code_payload, timeout=20
)
response.raise_for_status()
return response.json()
@@ -46,38 +81,33 @@ class AuthenticationCommand:
console.print("2. Enter the following code: ", device_code_data["user_code"])
webbrowser.open(device_code_data["verification_uri_complete"])
def _poll_for_token(self, device_code_data: Dict[str, Any]) -> None:
"""Poll the server for the token."""
def _poll_for_token(
self, device_code_data: Dict[str, Any], client_id: str, token_poll_url: str
) -> None:
"""Polls the server for the token until it is received, or max attempts are reached."""
token_payload = {
"grant_type": "urn:ietf:params:oauth:grant-type:device_code",
"device_code": device_code_data["device_code"],
"client_id": AUTH0_CLIENT_ID,
"client_id": client_id,
}
console.print("\nWaiting for authentication... ", style="bold blue", end="")
attempts = 0
while True and attempts < 5:
response = requests.post(self.TOKEN_URL, data=token_payload, timeout=30)
while True and attempts < 10:
response = requests.post(token_poll_url, data=token_payload, timeout=30)
token_data = response.json()
if response.status_code == 200:
validate_token(token_data["id_token"])
expires_in = 360000 # Token expiration time in seconds
self.token_manager.save_tokens(token_data["access_token"], expires_in)
self._validate_and_save_token(token_data)
try:
from crewai.cli.tools.main import ToolCommand
ToolCommand().login()
except Exception:
console.print(
"\n[bold yellow]Warning:[/bold yellow] Authentication with the Tool Repository failed.",
style="yellow",
)
console.print(
"Other features will work normally, but you may experience limitations "
"with downloading and publishing tools."
"\nRun [bold]crewai login[/bold] to try logging in again.\n",
style="yellow",
)
console.print(
"Success!",
style="bold green",
)
self._login_to_tool_repository()
console.print(
"\n[bold green]Welcome to CrewAI Enterprise![/bold green]\n"
@@ -93,3 +123,88 @@ class AuthenticationCommand:
console.print(
"Timeout: Failed to get the token. Please try again.", style="bold red"
)
def _validate_and_save_token(self, token_data: Dict[str, Any]) -> None:
"""Validates the JWT token and saves the token to the token manager."""
jwt_token = token_data["access_token"]
jwt_token_data = {
"jwt_token": jwt_token,
"jwks_url": f"https://{WORKOS_DOMAIN}/oauth2/jwks",
"issuer": f"https://{WORKOS_DOMAIN}",
"audience": WORKOS_ENVIRONMENT_ID,
}
# TODO: WORKOS - The following conditional is temporary until migration to WorkOS is complete.
if self.user_provider == "auth0":
jwt_token_data["jwks_url"] = f"https://{AUTH0_DOMAIN}/.well-known/jwks.json"
jwt_token_data["issuer"] = f"https://{AUTH0_DOMAIN}/"
jwt_token_data["audience"] = AUTH0_AUDIENCE
decoded_token = validate_jwt_token(**jwt_token_data)
expires_at = decoded_token.get("exp", 0)
self.token_manager.save_tokens(jwt_token, expires_at)
def _login_to_tool_repository(self) -> None:
"""Login to the tool repository."""
from crewai.cli.tools.main import ToolCommand
try:
console.print(
"Now logging you in to the Tool Repository... ",
style="bold blue",
end="",
)
ToolCommand().login()
console.print(
"Success!\n",
style="bold green",
)
settings = Settings()
console.print(
f"You are authenticated to the tool repository as [bold cyan]'{settings.org_name}'[/bold cyan] ({settings.org_uuid})",
style="green",
)
except Exception:
console.print(
"\n[bold yellow]Warning:[/bold yellow] Authentication with the Tool Repository failed.",
style="yellow",
)
console.print(
"Other features will work normally, but you may experience limitations "
"with downloading and publishing tools."
"\nRun [bold]crewai login[/bold] to try logging in again.\n",
style="yellow",
)
# TODO: WORKOS - This method is temporary until migration to WorkOS is complete.
def _determine_user_provider(self) -> str:
"""Determine which provider to use for authentication."""
console.print(
"Enter your CrewAI Enterprise account email: ", style="bold blue", end=""
)
email = input()
email_encoded = quote(email)
# It's not correct to call this method directly, but it's temporary until migration is complete.
response = PlusAPI("")._make_request(
"GET", f"/crewai_plus/api/v1/me/provider?email={email_encoded}"
)
if response.status_code == 200:
if response.json().get("provider") == "auth0":
return "auth0"
else:
return "workos"
else:
console.print(
"Error: Failed to authenticate with crewai enterprise. Ensure that you are using the latest crewai version and please try again. If the problem persists, contact support@crewai.com.",
style="red",
)
raise SystemExit

View File

@@ -1,32 +1,72 @@
import json
import os
import sys
from datetime import datetime, timedelta
from datetime import datetime
from pathlib import Path
from typing import Optional
from auth0.authentication.token_verifier import (
AsymmetricSignatureVerifier,
TokenVerifier,
)
import jwt
from jwt import PyJWKClient
from cryptography.fernet import Fernet
from .constants import AUTH0_CLIENT_ID, AUTH0_DOMAIN
def validate_token(id_token: str) -> None:
def validate_jwt_token(
jwt_token: str, jwks_url: str, issuer: str, audience: str
) -> dict:
"""
Verify the token and its precedence
:param id_token:
Verify the token's signature and claims using PyJWT.
:param jwt_token: The JWT (JWS) string to validate.
:param jwks_url: The URL of the JWKS endpoint.
:param issuer: The expected issuer of the token.
:param audience: The expected audience of the token.
:return: The decoded token.
:raises Exception: If the token is invalid for any reason (e.g., signature mismatch,
expired, incorrect issuer/audience, JWKS fetching error,
missing required claims).
"""
jwks_url = f"https://{AUTH0_DOMAIN}/.well-known/jwks.json"
issuer = f"https://{AUTH0_DOMAIN}/"
signature_verifier = AsymmetricSignatureVerifier(jwks_url)
token_verifier = TokenVerifier(
signature_verifier=signature_verifier, issuer=issuer, audience=AUTH0_CLIENT_ID
)
token_verifier.verify(id_token)
decoded_token = None
try:
jwk_client = PyJWKClient(jwks_url)
signing_key = jwk_client.get_signing_key_from_jwt(jwt_token)
_unverified_decoded_token = jwt.decode(
jwt_token, options={"verify_signature": False}
)
decoded_token = jwt.decode(
jwt_token,
signing_key.key,
algorithms=["RS256"],
audience=audience,
issuer=issuer,
options={
"verify_signature": True,
"verify_exp": True,
"verify_nbf": True,
"verify_iat": True,
"require": ["exp", "iat", "iss", "aud", "sub"],
},
)
return decoded_token
except jwt.ExpiredSignatureError:
raise Exception("Token has expired.")
except jwt.InvalidAudienceError:
actual_audience = _unverified_decoded_token.get("aud", "[no audience found]")
raise Exception(
f"Invalid token audience. Got: '{actual_audience}'. Expected: '{audience}'"
)
except jwt.InvalidIssuerError:
actual_issuer = _unverified_decoded_token.get("iss", "[no issuer found]")
raise Exception(
f"Invalid token issuer. Got: '{actual_issuer}'. Expected: '{issuer}'"
)
except jwt.MissingRequiredClaimError as e:
raise Exception(f"Token is missing required claims: {str(e)}")
except jwt.exceptions.PyJWKClientError as e:
raise Exception(f"JWKS or key processing error: {str(e)}")
except jwt.InvalidTokenError as e:
raise Exception(f"Invalid token: {str(e)}")
class TokenManager:
@@ -56,14 +96,14 @@ class TokenManager:
self.save_secure_file(key_filename, new_key)
return new_key
def save_tokens(self, access_token: str, expires_in: int) -> None:
def save_tokens(self, access_token: str, expires_at: int) -> None:
"""
Save the access token and its expiration time.
:param access_token: The access token to save.
:param expires_in: The expiration time of the access token in seconds.
:param expires_at: The UNIX timestamp of the expiration time.
"""
expiration_time = datetime.now() + timedelta(seconds=expires_in)
expiration_time = datetime.fromtimestamp(expires_at)
data = {
"access_token": access_token,
"expiration": expiration_time.isoformat(),

View File

@@ -2,7 +2,7 @@ from importlib.metadata import version as get_version
from typing import Optional
import click
from crewai.cli.config import Settings
from crewai.cli.add_crew_to_flow import add_crew_to_flow
from crewai.cli.create_crew import create_crew
from crewai.cli.create_flow import create_flow
@@ -138,8 +138,12 @@ def log_tasks_outputs() -> None:
@click.option("-s", "--short", is_flag=True, help="Reset SHORT TERM memory")
@click.option("-e", "--entities", is_flag=True, help="Reset ENTITIES memory")
@click.option("-kn", "--knowledge", is_flag=True, help="Reset KNOWLEDGE storage")
@click.option("-akn", "--agent-knowledge", is_flag=True, help="Reset AGENT KNOWLEDGE storage")
@click.option("-k","--kickoff-outputs",is_flag=True,help="Reset LATEST KICKOFF TASK OUTPUTS")
@click.option(
"-akn", "--agent-knowledge", is_flag=True, help="Reset AGENT KNOWLEDGE storage"
)
@click.option(
"-k", "--kickoff-outputs", is_flag=True, help="Reset LATEST KICKOFF TASK OUTPUTS"
)
@click.option("-a", "--all", is_flag=True, help="Reset ALL memories")
def reset_memories(
long: bool,
@@ -154,13 +158,23 @@ def reset_memories(
Reset the crew memories (long, short, entity, latest_crew_kickoff_ouputs, knowledge, agent_knowledge). This will delete all the data saved.
"""
try:
memory_types = [long, short, entities, knowledge, agent_knowledge, kickoff_outputs, all]
memory_types = [
long,
short,
entities,
knowledge,
agent_knowledge,
kickoff_outputs,
all,
]
if not any(memory_types):
click.echo(
"Please specify at least one memory type to reset using the appropriate flags."
)
return
reset_memories_command(long, short, entities, knowledge, agent_knowledge, kickoff_outputs, all)
reset_memories_command(
long, short, entities, knowledge, agent_knowledge, kickoff_outputs, all
)
except Exception as e:
click.echo(f"An error occurred while resetting memories: {e}", err=True)
@@ -210,16 +224,11 @@ def update():
update_crew()
@crewai.command()
def signup():
"""Sign Up/Login to CrewAI+."""
AuthenticationCommand().signup()
@crewai.command()
def login():
"""Sign Up/Login to CrewAI+."""
AuthenticationCommand().signup()
"""Sign Up/Login to CrewAI Enterprise."""
Settings().clear()
AuthenticationCommand().login()
# DEPLOY CREWAI+ COMMANDS

View File

@@ -26,7 +26,7 @@ class PlusAPIMixin:
"Please sign up/login to CrewAI+ before using the CLI.",
style="bold red",
)
console.print("Run 'crewai signup' to sign up/login.", style="bold green")
console.print("Run 'crewai login' to sign up/login.", style="bold green")
raise SystemExit
def _validate_response(self, response: requests.Response) -> None:

View File

@@ -37,6 +37,10 @@ class Settings(BaseModel):
merged_data = {**file_data, **data}
super().__init__(config_path=config_path, **merged_data)
def clear(self) -> None:
"""Clear all settings"""
self.config_path.unlink(missing_ok=True)
def dump(self) -> None:
"""Save current settings to settings.json"""
if self.config_path.is_file():

View File

@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
authors = [{ name = "Your Name", email = "you@example.com" }]
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]>=0.134.0,<1.0.0"
"crewai[tools]>=0.152.0,<1.0.0"
]
[project.scripts]

View File

@@ -5,7 +5,7 @@ description = "{{name}} using crewAI"
authors = [{ name = "Your Name", email = "you@example.com" }]
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]>=0.134.0,<1.0.0",
"crewai[tools]>=0.152.0,<1.0.0",
]
[project.scripts]

View File

@@ -5,7 +5,7 @@ description = "Power up your crews with {{folder_name}}"
readme = "README.md"
requires-python = ">=3.10,<3.14"
dependencies = [
"crewai[tools]>=0.134.0"
"crewai[tools]>=0.152.0"
]
[tool.crewai]

View File

@@ -156,7 +156,7 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
console.print(f"Successfully installed {handle}", style="bold green")
def login(self):
def login(self) -> None:
login_response = self.plus_api_client.login_to_tool_repository()
if login_response.status_code != 200:
@@ -175,18 +175,10 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
settings.tool_repository_password = login_response_json["credential"][
"password"
]
settings.org_uuid = login_response_json["current_organization"][
"uuid"
]
settings.org_name = login_response_json["current_organization"][
"name"
]
settings.org_uuid = login_response_json["current_organization"]["uuid"]
settings.org_name = login_response_json["current_organization"]["name"]
settings.dump()
console.print(
f"Successfully authenticated to the tool repository as {settings.org_name} ({settings.org_uuid}).", style="bold green"
)
def _add_package(self, tool_details: dict[str, Any]):
is_from_pypi = tool_details.get("source", None) == "pypi"
tool_handle = tool_details["handle"]
@@ -243,9 +235,15 @@ class ToolCommand(BaseCommand, PlusAPIMixin):
return env
def _print_current_organization(self):
def _print_current_organization(self) -> None:
settings = Settings()
if settings.org_uuid:
console.print(f"Current organization: {settings.org_name} ({settings.org_uuid})", style="bold blue")
console.print(
f"Current organization: {settings.org_name} ({settings.org_uuid})",
style="bold blue",
)
else:
console.print("No organization currently set. We recommend setting one before using: `crewai org switch <org_id>` command.", style="yellow")
console.print(
"No organization currently set. We recommend setting one before using: `crewai org switch <org_id>` command.",
style="yellow",
)

View File

@@ -18,6 +18,11 @@ from typing import (
cast,
)
from opentelemetry import baggage
from opentelemetry.context import attach, detach
from crewai.utilities.crew.models import CrewContext
from pydantic import (
UUID4,
BaseModel,
@@ -156,7 +161,7 @@ class Crew(FlowTrackable, BaseModel):
)
user_memory: Optional[InstanceOf[UserMemory]] = Field(
default=None,
description="An instance of the UserMemory to be used by the Crew to store/fetch memories of a specific user.",
description="DEPRECATED: Will be removed in version 0.156.0 or on 2025-08-04, whichever comes first. Use external_memory instead.",
)
external_memory: Optional[InstanceOf[ExternalMemory]] = Field(
default=None,
@@ -322,7 +327,7 @@ class Crew(FlowTrackable, BaseModel):
self._short_term_memory = self.short_term_memory
self._entity_memory = self.entity_memory
# UserMemory is gonna to be deprecated in the future, but we have to initialize a default value for now
# UserMemory will be removed in version 0.156.0 or on 2025-08-04, whichever comes first
self._user_memory = None
if self.memory:
@@ -616,6 +621,11 @@ class Crew(FlowTrackable, BaseModel):
self,
inputs: Optional[Dict[str, Any]] = None,
) -> CrewOutput:
ctx = baggage.set_baggage(
"crew_context", CrewContext(id=str(self.id), key=self.key)
)
token = attach(ctx)
try:
for before_callback in self.before_kickoff_callbacks:
if inputs is None:
@@ -676,6 +686,8 @@ class Crew(FlowTrackable, BaseModel):
CrewKickoffFailedEvent(error=str(e), crew_name=self.name or "crew"),
)
raise
finally:
detach(token)
def kickoff_for_each(self, inputs: List[Dict[str, Any]]) -> List[CrewOutput]:
"""Executes the Crew's workflow for each input in the list and aggregates results."""
@@ -1243,6 +1255,7 @@ class Crew(FlowTrackable, BaseModel):
if self.external_memory:
copied_data["external_memory"] = self.external_memory.model_copy(deep=True)
if self.user_memory:
# DEPRECATED: UserMemory will be removed in version 0.156.0 or on 2025-08-04
copied_data["user_memory"] = self.user_memory.model_copy(deep=True)
copied_data.pop("agents", None)
@@ -1319,6 +1332,7 @@ class Crew(FlowTrackable, BaseModel):
),
)
test_crew = self.copy()
evaluator = CrewEvaluator(test_crew, llm_instance)
for i in range(1, n_iterations + 1):

View File

@@ -0,0 +1,40 @@
from crewai.experimental.evaluation import (
BaseEvaluator,
EvaluationScore,
MetricCategory,
AgentEvaluationResult,
SemanticQualityEvaluator,
GoalAlignmentEvaluator,
ReasoningEfficiencyEvaluator,
ToolSelectionEvaluator,
ParameterExtractionEvaluator,
ToolInvocationEvaluator,
EvaluationTraceCallback,
create_evaluation_callbacks,
AgentEvaluator,
create_default_evaluator,
ExperimentRunner,
ExperimentResults,
ExperimentResult,
)
__all__ = [
"BaseEvaluator",
"EvaluationScore",
"MetricCategory",
"AgentEvaluationResult",
"SemanticQualityEvaluator",
"GoalAlignmentEvaluator",
"ReasoningEfficiencyEvaluator",
"ToolSelectionEvaluator",
"ParameterExtractionEvaluator",
"ToolInvocationEvaluator",
"EvaluationTraceCallback",
"create_evaluation_callbacks",
"AgentEvaluator",
"create_default_evaluator",
"ExperimentRunner",
"ExperimentResults",
"ExperimentResult"
]

View File

@@ -0,0 +1,51 @@
from crewai.experimental.evaluation.base_evaluator import (
BaseEvaluator,
EvaluationScore,
MetricCategory,
AgentEvaluationResult
)
from crewai.experimental.evaluation.metrics import (
SemanticQualityEvaluator,
GoalAlignmentEvaluator,
ReasoningEfficiencyEvaluator,
ToolSelectionEvaluator,
ParameterExtractionEvaluator,
ToolInvocationEvaluator
)
from crewai.experimental.evaluation.evaluation_listener import (
EvaluationTraceCallback,
create_evaluation_callbacks
)
from crewai.experimental.evaluation.agent_evaluator import (
AgentEvaluator,
create_default_evaluator
)
from crewai.experimental.evaluation.experiment import (
ExperimentRunner,
ExperimentResults,
ExperimentResult
)
__all__ = [
"BaseEvaluator",
"EvaluationScore",
"MetricCategory",
"AgentEvaluationResult",
"SemanticQualityEvaluator",
"GoalAlignmentEvaluator",
"ReasoningEfficiencyEvaluator",
"ToolSelectionEvaluator",
"ParameterExtractionEvaluator",
"ToolInvocationEvaluator",
"EvaluationTraceCallback",
"create_evaluation_callbacks",
"AgentEvaluator",
"create_default_evaluator",
"ExperimentRunner",
"ExperimentResults",
"ExperimentResult"
]

View File

@@ -0,0 +1,245 @@
import threading
from typing import Any
from crewai.experimental.evaluation.base_evaluator import AgentEvaluationResult, AggregationStrategy
from crewai.agent import Agent
from crewai.task import Task
from crewai.experimental.evaluation.evaluation_display import EvaluationDisplayFormatter
from crewai.utilities.events.agent_events import AgentEvaluationStartedEvent, AgentEvaluationCompletedEvent, AgentEvaluationFailedEvent
from crewai.experimental.evaluation import BaseEvaluator, create_evaluation_callbacks
from collections.abc import Sequence
from crewai.utilities.events.crewai_event_bus import crewai_event_bus
from crewai.utilities.events.utils.console_formatter import ConsoleFormatter
from crewai.utilities.events.task_events import TaskCompletedEvent
from crewai.utilities.events.agent_events import LiteAgentExecutionCompletedEvent
from crewai.experimental.evaluation.base_evaluator import AgentAggregatedEvaluationResult, EvaluationScore, MetricCategory
class ExecutionState:
def __init__(self):
self.traces = {}
self.current_agent_id: str | None = None
self.current_task_id: str | None = None
self.iteration = 1
self.iterations_results = {}
self.agent_evaluators = {}
class AgentEvaluator:
def __init__(
self,
agents: list[Agent],
evaluators: Sequence[BaseEvaluator] | None = None,
):
self.agents: list[Agent] = agents
self.evaluators: Sequence[BaseEvaluator] | None = evaluators
self.callback = create_evaluation_callbacks()
self.console_formatter = ConsoleFormatter()
self.display_formatter = EvaluationDisplayFormatter()
self._thread_local: threading.local = threading.local()
for agent in self.agents:
self._execution_state.agent_evaluators[str(agent.id)] = self.evaluators
self._subscribe_to_events()
@property
def _execution_state(self) -> ExecutionState:
if not hasattr(self._thread_local, 'execution_state'):
self._thread_local.execution_state = ExecutionState()
return self._thread_local.execution_state
def _subscribe_to_events(self) -> None:
from typing import cast
crewai_event_bus.register_handler(TaskCompletedEvent, cast(Any, self._handle_task_completed))
crewai_event_bus.register_handler(LiteAgentExecutionCompletedEvent, cast(Any, self._handle_lite_agent_completed))
def _handle_task_completed(self, source: Any, event: TaskCompletedEvent) -> None:
assert event.task is not None
agent = event.task.agent
if agent and str(getattr(agent, 'id', 'unknown')) in self._execution_state.agent_evaluators:
self.emit_evaluation_started_event(agent_role=agent.role, agent_id=str(agent.id), task_id=str(event.task.id))
state = ExecutionState()
state.current_agent_id = str(agent.id)
state.current_task_id = str(event.task.id)
assert state.current_agent_id is not None and state.current_task_id is not None
trace = self.callback.get_trace(state.current_agent_id, state.current_task_id)
if not trace:
return
result = self.evaluate(
agent=agent,
task=event.task,
execution_trace=trace,
final_output=event.output,
state=state
)
current_iteration = self._execution_state.iteration
if current_iteration not in self._execution_state.iterations_results:
self._execution_state.iterations_results[current_iteration] = {}
if agent.role not in self._execution_state.iterations_results[current_iteration]:
self._execution_state.iterations_results[current_iteration][agent.role] = []
self._execution_state.iterations_results[current_iteration][agent.role].append(result)
def _handle_lite_agent_completed(self, source: object, event: LiteAgentExecutionCompletedEvent) -> None:
agent_info = event.agent_info
agent_id = str(agent_info["id"])
if agent_id in self._execution_state.agent_evaluators:
state = ExecutionState()
state.current_agent_id = agent_id
state.current_task_id = "lite_task"
target_agent = None
for agent in self.agents:
if str(agent.id) == agent_id:
target_agent = agent
break
if not target_agent:
return
assert state.current_agent_id is not None and state.current_task_id is not None
trace = self.callback.get_trace(state.current_agent_id, state.current_task_id)
if not trace:
return
result = self.evaluate(
agent=target_agent,
execution_trace=trace,
final_output=event.output,
state=state
)
current_iteration = self._execution_state.iteration
if current_iteration not in self._execution_state.iterations_results:
self._execution_state.iterations_results[current_iteration] = {}
agent_role = target_agent.role
if agent_role not in self._execution_state.iterations_results[current_iteration]:
self._execution_state.iterations_results[current_iteration][agent_role] = []
self._execution_state.iterations_results[current_iteration][agent_role].append(result)
def set_iteration(self, iteration: int) -> None:
self._execution_state.iteration = iteration
def reset_iterations_results(self) -> None:
self._execution_state.iterations_results = {}
def get_evaluation_results(self) -> dict[str, list[AgentEvaluationResult]]:
if self._execution_state.iterations_results and self._execution_state.iteration in self._execution_state.iterations_results:
return self._execution_state.iterations_results[self._execution_state.iteration]
return {}
def display_results_with_iterations(self) -> None:
self.display_formatter.display_summary_results(self._execution_state.iterations_results)
def get_agent_evaluation(self, strategy: AggregationStrategy = AggregationStrategy.SIMPLE_AVERAGE, include_evaluation_feedback: bool = True) -> dict[str, AgentAggregatedEvaluationResult]:
agent_results = {}
with crewai_event_bus.scoped_handlers():
task_results = self.get_evaluation_results()
for agent_role, results in task_results.items():
if not results:
continue
agent_id = results[0].agent_id
aggregated_result = self.display_formatter._aggregate_agent_results(
agent_id=agent_id,
agent_role=agent_role,
results=results,
strategy=strategy
)
agent_results[agent_role] = aggregated_result
if self._execution_state.iterations_results and self._execution_state.iteration == max(self._execution_state.iterations_results.keys(), default=0):
self.display_results_with_iterations()
if include_evaluation_feedback:
self.display_evaluation_with_feedback()
return agent_results
def display_evaluation_with_feedback(self) -> None:
self.display_formatter.display_evaluation_with_feedback(self._execution_state.iterations_results)
def evaluate(
self,
agent: Agent,
execution_trace: dict[str, Any],
final_output: Any,
state: ExecutionState,
task: Task | None = None,
) -> AgentEvaluationResult:
result = AgentEvaluationResult(
agent_id=state.current_agent_id or str(agent.id),
task_id=state.current_task_id or (str(task.id) if task else "unknown_task")
)
assert self.evaluators is not None
task_id = str(task.id) if task else None
for evaluator in self.evaluators:
try:
self.emit_evaluation_started_event(agent_role=agent.role, agent_id=str(agent.id), task_id=task_id)
score = evaluator.evaluate(
agent=agent,
task=task,
execution_trace=execution_trace,
final_output=final_output
)
result.metrics[evaluator.metric_category] = score
self.emit_evaluation_completed_event(agent_role=agent.role, agent_id=str(agent.id), task_id=task_id, metric_category=evaluator.metric_category, score=score)
except Exception as e:
self.emit_evaluation_failed_event(agent_role=agent.role, agent_id=str(agent.id), task_id=task_id, error=str(e))
self.console_formatter.print(f"Error in {evaluator.metric_category.value} evaluator: {str(e)}")
return result
def emit_evaluation_started_event(self, agent_role: str, agent_id: str, task_id: str | None = None):
crewai_event_bus.emit(
self,
AgentEvaluationStartedEvent(agent_role=agent_role, agent_id=agent_id, task_id=task_id, iteration=self._execution_state.iteration)
)
def emit_evaluation_completed_event(self, agent_role: str, agent_id: str, task_id: str | None = None, metric_category: MetricCategory | None = None, score: EvaluationScore | None = None):
crewai_event_bus.emit(
self,
AgentEvaluationCompletedEvent(agent_role=agent_role, agent_id=agent_id, task_id=task_id, iteration=self._execution_state.iteration, metric_category=metric_category, score=score)
)
def emit_evaluation_failed_event(self, agent_role: str, agent_id: str, error: str, task_id: str | None = None):
crewai_event_bus.emit(
self,
AgentEvaluationFailedEvent(agent_role=agent_role, agent_id=agent_id, task_id=task_id, iteration=self._execution_state.iteration, error=error)
)
def create_default_evaluator(agents: list[Agent], llm: None = None):
from crewai.experimental.evaluation import (
GoalAlignmentEvaluator,
SemanticQualityEvaluator,
ToolSelectionEvaluator,
ParameterExtractionEvaluator,
ToolInvocationEvaluator,
ReasoningEfficiencyEvaluator
)
evaluators = [
GoalAlignmentEvaluator(llm=llm),
SemanticQualityEvaluator(llm=llm),
ToolSelectionEvaluator(llm=llm),
ParameterExtractionEvaluator(llm=llm),
ToolInvocationEvaluator(llm=llm),
ReasoningEfficiencyEvaluator(llm=llm),
]
return AgentEvaluator(evaluators=evaluators, agents=agents)

View File

@@ -0,0 +1,125 @@
import abc
import enum
from enum import Enum
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Field
from crewai.agent import Agent
from crewai.task import Task
from crewai.llm import BaseLLM
from crewai.utilities.llm_utils import create_llm
class MetricCategory(enum.Enum):
GOAL_ALIGNMENT = "goal_alignment"
SEMANTIC_QUALITY = "semantic_quality"
REASONING_EFFICIENCY = "reasoning_efficiency"
TOOL_SELECTION = "tool_selection"
PARAMETER_EXTRACTION = "parameter_extraction"
TOOL_INVOCATION = "tool_invocation"
def title(self):
return self.value.replace('_', ' ').title()
class EvaluationScore(BaseModel):
score: float | None = Field(
default=5.0,
description="Numeric score from 0-10 where 0 is worst and 10 is best, None if not applicable",
ge=0.0,
le=10.0
)
feedback: str = Field(
default="",
description="Detailed feedback explaining the evaluation score"
)
raw_response: str | None = Field(
default=None,
description="Raw response from the evaluator (e.g., LLM)"
)
def __str__(self) -> str:
if self.score is None:
return f"Score: N/A - {self.feedback}"
return f"Score: {self.score:.1f}/10 - {self.feedback}"
class BaseEvaluator(abc.ABC):
def __init__(self, llm: BaseLLM | None = None):
self.llm: BaseLLM | None = create_llm(llm)
@property
@abc.abstractmethod
def metric_category(self) -> MetricCategory:
pass
@abc.abstractmethod
def evaluate(
self,
agent: Agent,
execution_trace: Dict[str, Any],
final_output: Any,
task: Task | None = None,
) -> EvaluationScore:
pass
class AgentEvaluationResult(BaseModel):
agent_id: str = Field(description="ID of the evaluated agent")
task_id: str = Field(description="ID of the task that was executed")
metrics: Dict[MetricCategory, EvaluationScore] = Field(
default_factory=dict,
description="Evaluation scores for each metric category"
)
class AggregationStrategy(Enum):
SIMPLE_AVERAGE = "simple_average" # Equal weight to all tasks
WEIGHTED_BY_COMPLEXITY = "weighted_by_complexity" # Weight by task complexity
BEST_PERFORMANCE = "best_performance" # Use best scores across tasks
WORST_PERFORMANCE = "worst_performance" # Use worst scores across tasks
class AgentAggregatedEvaluationResult(BaseModel):
agent_id: str = Field(
default="",
description="ID of the agent"
)
agent_role: str = Field(
default="",
description="Role of the agent"
)
task_count: int = Field(
default=0,
description="Number of tasks included in this aggregation"
)
aggregation_strategy: AggregationStrategy = Field(
default=AggregationStrategy.SIMPLE_AVERAGE,
description="Strategy used for aggregation"
)
metrics: Dict[MetricCategory, EvaluationScore] = Field(
default_factory=dict,
description="Aggregated metrics across all tasks"
)
task_results: List[str] = Field(
default_factory=list,
description="IDs of tasks included in this aggregation"
)
overall_score: Optional[float] = Field(
default=None,
description="Overall score for this agent"
)
def __str__(self) -> str:
result = f"Agent Evaluation: {self.agent_role}\n"
result += f"Strategy: {self.aggregation_strategy.value}\n"
result += f"Tasks evaluated: {self.task_count}\n"
for category, score in self.metrics.items():
result += f"\n\n- {category.value.upper()}: {score.score}/10\n"
if score.feedback:
detailed_feedback = "\n ".join(score.feedback.split('\n'))
result += f" {detailed_feedback}\n"
return result

View File

@@ -0,0 +1,333 @@
from collections import defaultdict
from typing import Dict, Any, List
from rich.table import Table
from rich.box import HEAVY_EDGE, ROUNDED
from collections.abc import Sequence
from crewai.experimental.evaluation.base_evaluator import AgentAggregatedEvaluationResult, AggregationStrategy, AgentEvaluationResult, MetricCategory
from crewai.experimental.evaluation import EvaluationScore
from crewai.utilities.events.utils.console_formatter import ConsoleFormatter
from crewai.utilities.llm_utils import create_llm
class EvaluationDisplayFormatter:
def __init__(self):
self.console_formatter = ConsoleFormatter()
def display_evaluation_with_feedback(self, iterations_results: Dict[int, Dict[str, List[Any]]]):
if not iterations_results:
self.console_formatter.print("[yellow]No evaluation results to display[/yellow]")
return
all_agent_roles: set[str] = set()
for iter_results in iterations_results.values():
all_agent_roles.update(iter_results.keys())
for agent_role in sorted(all_agent_roles):
self.console_formatter.print(f"\n[bold cyan]Agent: {agent_role}[/bold cyan]")
for iter_num, results in sorted(iterations_results.items()):
if agent_role not in results or not results[agent_role]:
continue
agent_results = results[agent_role]
agent_id = agent_results[0].agent_id
aggregated_result = self._aggregate_agent_results(
agent_id=agent_id,
agent_role=agent_role,
results=agent_results,
)
self.console_formatter.print(f"\n[bold]Iteration {iter_num}[/bold]")
table = Table(box=ROUNDED)
table.add_column("Metric", style="cyan")
table.add_column("Score (1-10)", justify="center")
table.add_column("Feedback", style="green")
if aggregated_result.metrics:
for metric, evaluation_score in aggregated_result.metrics.items():
score = evaluation_score.score
if isinstance(score, (int, float)):
if score >= 8.0:
score_text = f"[green]{score:.1f}[/green]"
elif score >= 6.0:
score_text = f"[cyan]{score:.1f}[/cyan]"
elif score >= 4.0:
score_text = f"[yellow]{score:.1f}[/yellow]"
else:
score_text = f"[red]{score:.1f}[/red]"
else:
score_text = "[dim]N/A[/dim]"
table.add_section()
table.add_row(
metric.title(),
score_text,
evaluation_score.feedback or ""
)
if aggregated_result.overall_score is not None:
overall_score = aggregated_result.overall_score
if overall_score >= 8.0:
overall_color = "green"
elif overall_score >= 6.0:
overall_color = "cyan"
elif overall_score >= 4.0:
overall_color = "yellow"
else:
overall_color = "red"
table.add_section()
table.add_row(
"Overall Score",
f"[{overall_color}]{overall_score:.1f}[/]",
"Overall agent evaluation score"
)
self.console_formatter.print(table)
def display_summary_results(self, iterations_results: Dict[int, Dict[str, List[AgentAggregatedEvaluationResult]]]):
if not iterations_results:
self.console_formatter.print("[yellow]No evaluation results to display[/yellow]")
return
self.console_formatter.print("\n")
table = Table(title="Agent Performance Scores \n (1-10 Higher is better)", box=HEAVY_EDGE)
table.add_column("Agent/Metric", style="cyan")
for iter_num in sorted(iterations_results.keys()):
run_label = f"Run {iter_num}"
table.add_column(run_label, justify="center")
table.add_column("Avg. Total", justify="center")
all_agent_roles: set[str] = set()
for results in iterations_results.values():
all_agent_roles.update(results.keys())
for agent_role in sorted(all_agent_roles):
agent_scores_by_iteration = {}
agent_metrics_by_iteration = {}
for iter_num, results in sorted(iterations_results.items()):
if agent_role not in results or not results[agent_role]:
continue
agent_results = results[agent_role]
agent_id = agent_results[0].agent_id
aggregated_result = self._aggregate_agent_results(
agent_id=agent_id,
agent_role=agent_role,
results=agent_results,
strategy=AggregationStrategy.SIMPLE_AVERAGE
)
valid_scores = [score.score for score in aggregated_result.metrics.values()
if score.score is not None]
if valid_scores:
avg_score = sum(valid_scores) / len(valid_scores)
agent_scores_by_iteration[iter_num] = avg_score
agent_metrics_by_iteration[iter_num] = aggregated_result.metrics
if not agent_scores_by_iteration:
continue
avg_across_iterations = sum(agent_scores_by_iteration.values()) / len(agent_scores_by_iteration)
row = [f"[bold]{agent_role}[/bold]"]
for iter_num in sorted(iterations_results.keys()):
if iter_num in agent_scores_by_iteration:
score = agent_scores_by_iteration[iter_num]
if score >= 8.0:
color = "green"
elif score >= 6.0:
color = "cyan"
elif score >= 4.0:
color = "yellow"
else:
color = "red"
row.append(f"[bold {color}]{score:.1f}[/]")
else:
row.append("-")
if avg_across_iterations >= 8.0:
color = "green"
elif avg_across_iterations >= 6.0:
color = "cyan"
elif avg_across_iterations >= 4.0:
color = "yellow"
else:
color = "red"
row.append(f"[bold {color}]{avg_across_iterations:.1f}[/]")
table.add_row(*row)
all_metrics: set[Any] = set()
for metrics in agent_metrics_by_iteration.values():
all_metrics.update(metrics.keys())
for metric in sorted(all_metrics, key=lambda x: x.value):
metric_scores = []
row = [f" - {metric.title()}"]
for iter_num in sorted(iterations_results.keys()):
if (iter_num in agent_metrics_by_iteration and
metric in agent_metrics_by_iteration[iter_num]):
metric_score = agent_metrics_by_iteration[iter_num][metric].score
if metric_score is not None:
metric_scores.append(metric_score)
if metric_score >= 8.0:
color = "green"
elif metric_score >= 6.0:
color = "cyan"
elif metric_score >= 4.0:
color = "yellow"
else:
color = "red"
row.append(f"[{color}]{metric_score:.1f}[/]")
else:
row.append("[dim]N/A[/dim]")
else:
row.append("-")
if metric_scores:
avg = sum(metric_scores) / len(metric_scores)
if avg >= 8.0:
color = "green"
elif avg >= 6.0:
color = "cyan"
elif avg >= 4.0:
color = "yellow"
else:
color = "red"
row.append(f"[{color}]{avg:.1f}[/]")
else:
row.append("-")
table.add_row(*row)
table.add_row(*[""] * (len(sorted(iterations_results.keys())) + 2))
self.console_formatter.print(table)
self.console_formatter.print("\n")
def _aggregate_agent_results(
self,
agent_id: str,
agent_role: str,
results: Sequence[AgentEvaluationResult],
strategy: AggregationStrategy = AggregationStrategy.SIMPLE_AVERAGE,
) -> AgentAggregatedEvaluationResult:
metrics_by_category: dict[MetricCategory, list[EvaluationScore]] = defaultdict(list)
for result in results:
for metric_name, evaluation_score in result.metrics.items():
metrics_by_category[metric_name].append(evaluation_score)
aggregated_metrics: dict[MetricCategory, EvaluationScore] = {}
for category, scores in metrics_by_category.items():
valid_scores = [s.score for s in scores if s.score is not None]
avg_score = sum(valid_scores) / len(valid_scores) if valid_scores else None
feedbacks = [s.feedback for s in scores if s.feedback]
feedback_summary = None
if feedbacks:
if len(feedbacks) > 1:
feedback_summary = self._summarize_feedbacks(
agent_role=agent_role,
metric=category.title(),
feedbacks=feedbacks,
scores=[s.score for s in scores],
strategy=strategy
)
else:
feedback_summary = feedbacks[0]
aggregated_metrics[category] = EvaluationScore(
score=avg_score,
feedback=feedback_summary
)
overall_score = None
if aggregated_metrics:
valid_scores = [m.score for m in aggregated_metrics.values() if m.score is not None]
if valid_scores:
overall_score = sum(valid_scores) / len(valid_scores)
return AgentAggregatedEvaluationResult(
agent_id=agent_id,
agent_role=agent_role,
metrics=aggregated_metrics,
overall_score=overall_score,
task_count=len(results),
aggregation_strategy=strategy
)
def _summarize_feedbacks(
self,
agent_role: str,
metric: str,
feedbacks: List[str],
scores: List[float | None],
strategy: AggregationStrategy
) -> str:
if len(feedbacks) <= 2 and all(len(fb) < 200 for fb in feedbacks):
return "\n\n".join([f"Feedback {i+1}: {fb}" for i, fb in enumerate(feedbacks)])
try:
llm = create_llm()
formatted_feedbacks = []
for i, (feedback, score) in enumerate(zip(feedbacks, scores)):
if len(feedback) > 500:
feedback = feedback[:500] + "..."
score_text = f"{score:.1f}" if score is not None else "N/A"
formatted_feedbacks.append(f"Feedback #{i+1} (Score: {score_text}):\n{feedback}")
all_feedbacks = "\n\n" + "\n\n---\n\n".join(formatted_feedbacks)
strategy_guidance = ""
if strategy == AggregationStrategy.BEST_PERFORMANCE:
strategy_guidance = "Focus on the highest-scoring aspects and strengths demonstrated."
elif strategy == AggregationStrategy.WORST_PERFORMANCE:
strategy_guidance = "Focus on areas that need improvement and common issues across tasks."
else:
strategy_guidance = "Provide a balanced analysis of strengths and weaknesses across all tasks."
prompt = [
{"role": "system", "content": f"""You are an expert evaluator creating a comprehensive summary of agent performance feedback.
Your job is to synthesize multiple feedback points about the same metric across different tasks.
Create a concise, insightful summary that captures the key patterns and themes from all feedback.
{strategy_guidance}
Your summary should be:
1. Specific and concrete (not vague or general)
2. Focused on actionable insights
3. Highlighting patterns across tasks
4. 150-250 words in length
The summary should be directly usable as final feedback for the agent's performance on this metric."""},
{"role": "user", "content": f"""I need a synthesized summary of the following feedback for:
Agent Role: {agent_role}
Metric: {metric.title()}
{all_feedbacks}
"""}
]
assert llm is not None
response = llm.call(prompt)
return response
except Exception:
return "Synthesized from multiple tasks: " + "\n\n".join([f"- {fb[:500]}..." for fb in feedbacks])

View File

@@ -0,0 +1,234 @@
from datetime import datetime
from typing import Any, Dict, Optional
from collections.abc import Sequence
from crewai.agent import Agent
from crewai.task import Task
from crewai.utilities.events.base_event_listener import BaseEventListener
from crewai.utilities.events.crewai_event_bus import CrewAIEventsBus
from crewai.utilities.events.agent_events import (
AgentExecutionStartedEvent,
AgentExecutionCompletedEvent,
LiteAgentExecutionStartedEvent,
LiteAgentExecutionCompletedEvent
)
from crewai.utilities.events.tool_usage_events import (
ToolUsageFinishedEvent,
ToolUsageErrorEvent,
ToolExecutionErrorEvent,
ToolSelectionErrorEvent,
ToolValidateInputErrorEvent
)
from crewai.utilities.events.llm_events import (
LLMCallStartedEvent,
LLMCallCompletedEvent
)
class EvaluationTraceCallback(BaseEventListener):
"""Event listener for collecting execution traces for evaluation.
This listener attaches to the event bus to collect detailed information
about the execution process, including agent steps, tool uses, knowledge
retrievals, and final output - all for use in agent evaluation.
"""
_instance = None
def __new__(cls):
if cls._instance is None:
cls._instance = super().__new__(cls)
cls._instance._initialized = False
return cls._instance
def __init__(self):
if not hasattr(self, "_initialized") or not self._initialized:
super().__init__()
self.traces = {}
self.current_agent_id = None
self.current_task_id = None
self._initialized = True
def setup_listeners(self, event_bus: CrewAIEventsBus):
@event_bus.on(AgentExecutionStartedEvent)
def on_agent_started(source, event: AgentExecutionStartedEvent):
self.on_agent_start(event.agent, event.task)
@event_bus.on(LiteAgentExecutionStartedEvent)
def on_lite_agent_started(source, event: LiteAgentExecutionStartedEvent):
self.on_lite_agent_start(event.agent_info)
@event_bus.on(AgentExecutionCompletedEvent)
def on_agent_completed(source, event: AgentExecutionCompletedEvent):
self.on_agent_finish(event.agent, event.task, event.output)
@event_bus.on(LiteAgentExecutionCompletedEvent)
def on_lite_agent_completed(source, event: LiteAgentExecutionCompletedEvent):
self.on_lite_agent_finish(event.output)
@event_bus.on(ToolUsageFinishedEvent)
def on_tool_completed(source, event: ToolUsageFinishedEvent):
self.on_tool_use(event.tool_name, event.tool_args, event.output, success=True)
@event_bus.on(ToolUsageErrorEvent)
def on_tool_usage_error(source, event: ToolUsageErrorEvent):
self.on_tool_use(event.tool_name, event.tool_args, event.error,
success=False, error_type="usage_error")
@event_bus.on(ToolExecutionErrorEvent)
def on_tool_execution_error(source, event: ToolExecutionErrorEvent):
self.on_tool_use(event.tool_name, event.tool_args, event.error,
success=False, error_type="execution_error")
@event_bus.on(ToolSelectionErrorEvent)
def on_tool_selection_error(source, event: ToolSelectionErrorEvent):
self.on_tool_use(event.tool_name, event.tool_args, event.error,
success=False, error_type="selection_error")
@event_bus.on(ToolValidateInputErrorEvent)
def on_tool_validate_input_error(source, event: ToolValidateInputErrorEvent):
self.on_tool_use(event.tool_name, event.tool_args, event.error,
success=False, error_type="validation_error")
@event_bus.on(LLMCallStartedEvent)
def on_llm_call_started(source, event: LLMCallStartedEvent):
self.on_llm_call_start(event.messages, event.tools)
@event_bus.on(LLMCallCompletedEvent)
def on_llm_call_completed(source, event: LLMCallCompletedEvent):
self.on_llm_call_end(event.messages, event.response)
def on_lite_agent_start(self, agent_info: dict[str, Any]):
self.current_agent_id = agent_info['id']
self.current_task_id = "lite_task"
trace_key = f"{self.current_agent_id}_{self.current_task_id}"
self._init_trace(
trace_key=trace_key,
agent_id=self.current_agent_id,
task_id=self.current_task_id,
tool_uses=[],
llm_calls=[],
start_time=datetime.now(),
final_output=None
)
def _init_trace(self, trace_key: str, **kwargs: Any):
self.traces[trace_key] = kwargs
def on_agent_start(self, agent: Agent, task: Task):
self.current_agent_id = agent.id
self.current_task_id = task.id
trace_key = f"{agent.id}_{task.id}"
self._init_trace(
trace_key=trace_key,
agent_id=agent.id,
task_id=task.id,
tool_uses=[],
llm_calls=[],
start_time=datetime.now(),
final_output=None
)
def on_agent_finish(self, agent: Agent, task: Task, output: Any):
trace_key = f"{agent.id}_{task.id}"
if trace_key in self.traces:
self.traces[trace_key]["final_output"] = output
self.traces[trace_key]["end_time"] = datetime.now()
self._reset_current()
def _reset_current(self):
self.current_agent_id = None
self.current_task_id = None
def on_lite_agent_finish(self, output: Any):
trace_key = f"{self.current_agent_id}_lite_task"
if trace_key in self.traces:
self.traces[trace_key]["final_output"] = output
self.traces[trace_key]["end_time"] = datetime.now()
self._reset_current()
def on_tool_use(self, tool_name: str, tool_args: dict[str, Any] | str, result: Any,
success: bool = True, error_type: str | None = None):
if not self.current_agent_id or not self.current_task_id:
return
trace_key = f"{self.current_agent_id}_{self.current_task_id}"
if trace_key in self.traces:
tool_use = {
"tool": tool_name,
"args": tool_args,
"result": result,
"success": success,
"timestamp": datetime.now()
}
# Add error information if applicable
if not success and error_type:
tool_use["error"] = True
tool_use["error_type"] = error_type
self.traces[trace_key]["tool_uses"].append(tool_use)
def on_llm_call_start(self, messages: str | Sequence[dict[str, Any]] | None, tools: Sequence[dict[str, Any]] | None = None):
if not self.current_agent_id or not self.current_task_id:
return
trace_key = f"{self.current_agent_id}_{self.current_task_id}"
if trace_key not in self.traces:
return
self.current_llm_call = {
"messages": messages,
"tools": tools,
"start_time": datetime.now(),
"response": None,
"end_time": None
}
def on_llm_call_end(self, messages: str | list[dict[str, Any]] | None, response: Any):
if not self.current_agent_id or not self.current_task_id:
return
trace_key = f"{self.current_agent_id}_{self.current_task_id}"
if trace_key not in self.traces:
return
total_tokens = 0
if hasattr(response, "usage") and hasattr(response.usage, "total_tokens"):
total_tokens = response.usage.total_tokens
current_time = datetime.now()
start_time = None
if hasattr(self, "current_llm_call") and self.current_llm_call:
start_time = self.current_llm_call.get("start_time")
if not start_time:
start_time = current_time
llm_call = {
"messages": messages,
"response": response,
"start_time": start_time,
"end_time": current_time,
"total_tokens": total_tokens
}
self.traces[trace_key]["llm_calls"].append(llm_call)
if hasattr(self, "current_llm_call"):
self.current_llm_call = {}
def get_trace(self, agent_id: str, task_id: str) -> Optional[Dict[str, Any]]:
trace_key = f"{agent_id}_{task_id}"
return self.traces.get(trace_key)
def create_evaluation_callbacks() -> EvaluationTraceCallback:
from crewai.utilities.events.crewai_event_bus import crewai_event_bus
callback = EvaluationTraceCallback()
callback.setup_listeners(crewai_event_bus)
return callback

View File

@@ -0,0 +1,8 @@
from crewai.experimental.evaluation.experiment.runner import ExperimentRunner
from crewai.experimental.evaluation.experiment.result import ExperimentResults, ExperimentResult
__all__ = [
"ExperimentRunner",
"ExperimentResults",
"ExperimentResult"
]

View File

@@ -0,0 +1,122 @@
import json
import os
from datetime import datetime, timezone
from typing import Any
from pydantic import BaseModel
class ExperimentResult(BaseModel):
identifier: str
inputs: dict[str, Any]
score: int | dict[str, int | float]
expected_score: int | dict[str, int | float]
passed: bool
agent_evaluations: dict[str, Any] | None = None
class ExperimentResults:
def __init__(self, results: list[ExperimentResult], metadata: dict[str, Any] | None = None):
self.results = results
self.metadata = metadata or {}
self.timestamp = datetime.now(timezone.utc)
from crewai.experimental.evaluation.experiment.result_display import ExperimentResultsDisplay
self.display = ExperimentResultsDisplay()
def to_json(self, filepath: str | None = None) -> dict[str, Any]:
data = {
"timestamp": self.timestamp.isoformat(),
"metadata": self.metadata,
"results": [r.model_dump(exclude={"agent_evaluations"}) for r in self.results]
}
if filepath:
with open(filepath, 'w') as f:
json.dump(data, f, indent=2)
self.display.console.print(f"[green]Results saved to {filepath}[/green]")
return data
def compare_with_baseline(self, baseline_filepath: str, save_current: bool = True, print_summary: bool = False) -> dict[str, Any]:
baseline_runs = []
if os.path.exists(baseline_filepath) and os.path.getsize(baseline_filepath) > 0:
try:
with open(baseline_filepath, 'r') as f:
baseline_data = json.load(f)
if isinstance(baseline_data, dict) and "timestamp" in baseline_data:
baseline_runs = [baseline_data]
elif isinstance(baseline_data, list):
baseline_runs = baseline_data
except (json.JSONDecodeError, FileNotFoundError) as e:
self.display.console.print(f"[yellow]Warning: Could not load baseline file: {str(e)}[/yellow]")
if not baseline_runs:
if save_current:
current_data = self.to_json()
with open(baseline_filepath, 'w') as f:
json.dump([current_data], f, indent=2)
self.display.console.print(f"[green]Saved current results as new baseline to {baseline_filepath}[/green]")
return {"is_baseline": True, "changes": {}}
baseline_runs.sort(key=lambda x: x.get("timestamp", ""), reverse=True)
latest_run = baseline_runs[0]
comparison = self._compare_with_run(latest_run)
if print_summary:
self.display.comparison_summary(comparison, latest_run["timestamp"])
if save_current:
current_data = self.to_json()
baseline_runs.append(current_data)
with open(baseline_filepath, 'w') as f:
json.dump(baseline_runs, f, indent=2)
self.display.console.print(f"[green]Added current results to baseline file {baseline_filepath}[/green]")
return comparison
def _compare_with_run(self, baseline_run: dict[str, Any]) -> dict[str, Any]:
baseline_results = baseline_run.get("results", [])
baseline_lookup = {}
for result in baseline_results:
test_identifier = result.get("identifier")
if test_identifier:
baseline_lookup[test_identifier] = result
improved = []
regressed = []
unchanged = []
new_tests = []
for result in self.results:
test_identifier = result.identifier
if not test_identifier or test_identifier not in baseline_lookup:
new_tests.append(test_identifier)
continue
baseline_result = baseline_lookup[test_identifier]
baseline_passed = baseline_result.get("passed", False)
if result.passed and not baseline_passed:
improved.append(test_identifier)
elif not result.passed and baseline_passed:
regressed.append(test_identifier)
else:
unchanged.append(test_identifier)
missing_tests = []
current_test_identifiers = {result.identifier for result in self.results}
for result in baseline_results:
test_identifier = result.get("identifier")
if test_identifier and test_identifier not in current_test_identifiers:
missing_tests.append(test_identifier)
return {
"improved": improved,
"regressed": regressed,
"unchanged": unchanged,
"new_tests": new_tests,
"missing_tests": missing_tests,
"total_compared": len(improved) + len(regressed) + len(unchanged),
"baseline_timestamp": baseline_run.get("timestamp", "unknown")
}

View File

@@ -0,0 +1,70 @@
from typing import Dict, Any
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from crewai.experimental.evaluation.experiment.result import ExperimentResults
class ExperimentResultsDisplay:
def __init__(self):
self.console = Console()
def summary(self, experiment_results: ExperimentResults):
total = len(experiment_results.results)
passed = sum(1 for r in experiment_results.results if r.passed)
table = Table(title="Experiment Summary")
table.add_column("Metric", style="cyan")
table.add_column("Value", style="green")
table.add_row("Total Test Cases", str(total))
table.add_row("Passed", str(passed))
table.add_row("Failed", str(total - passed))
table.add_row("Success Rate", f"{(passed / total * 100):.1f}%" if total > 0 else "N/A")
self.console.print(table)
def comparison_summary(self, comparison: Dict[str, Any], baseline_timestamp: str):
self.console.print(Panel(f"[bold]Comparison with baseline run from {baseline_timestamp}[/bold]",
expand=False))
table = Table(title="Results Comparison")
table.add_column("Metric", style="cyan")
table.add_column("Count", style="white")
table.add_column("Details", style="dim")
improved = comparison.get("improved", [])
if improved:
details = ", ".join([f"{test_identifier}" for test_identifier in improved[:3]])
if len(improved) > 3:
details += f" and {len(improved) - 3} more"
table.add_row("✅ Improved", str(len(improved)), details)
else:
table.add_row("✅ Improved", "0", "")
regressed = comparison.get("regressed", [])
if regressed:
details = ", ".join([f"{test_identifier}" for test_identifier in regressed[:3]])
if len(regressed) > 3:
details += f" and {len(regressed) - 3} more"
table.add_row("❌ Regressed", str(len(regressed)), details, style="red")
else:
table.add_row("❌ Regressed", "0", "")
unchanged = comparison.get("unchanged", [])
table.add_row("⏺ Unchanged", str(len(unchanged)), "")
new_tests = comparison.get("new_tests", [])
if new_tests:
details = ", ".join(new_tests[:3])
if len(new_tests) > 3:
details += f" and {len(new_tests) - 3} more"
table.add_row(" New Tests", str(len(new_tests)), details)
missing_tests = comparison.get("missing_tests", [])
if missing_tests:
details = ", ".join(missing_tests[:3])
if len(missing_tests) > 3:
details += f" and {len(missing_tests) - 3} more"
table.add_row(" Missing Tests", str(len(missing_tests)), details)
self.console.print(table)

View File

@@ -0,0 +1,125 @@
from collections import defaultdict
from hashlib import md5
from typing import Any
from crewai import Crew, Agent
from crewai.experimental.evaluation import AgentEvaluator, create_default_evaluator
from crewai.experimental.evaluation.experiment.result_display import ExperimentResultsDisplay
from crewai.experimental.evaluation.experiment.result import ExperimentResults, ExperimentResult
from crewai.experimental.evaluation.evaluation_display import AgentAggregatedEvaluationResult
class ExperimentRunner:
def __init__(self, dataset: list[dict[str, Any]]):
self.dataset = dataset or []
self.evaluator: AgentEvaluator | None = None
self.display = ExperimentResultsDisplay()
def run(self, crew: Crew | None = None, agents: list[Agent] | None = None, print_summary: bool = False) -> ExperimentResults:
if crew and not agents:
agents = crew.agents
assert agents is not None
self.evaluator = create_default_evaluator(agents=agents)
results = []
for test_case in self.dataset:
self.evaluator.reset_iterations_results()
result = self._run_test_case(test_case=test_case, crew=crew, agents=agents)
results.append(result)
experiment_results = ExperimentResults(results)
if print_summary:
self.display.summary(experiment_results)
return experiment_results
def _run_test_case(self, test_case: dict[str, Any], agents: list[Agent], crew: Crew | None = None) -> ExperimentResult:
inputs = test_case["inputs"]
expected_score = test_case["expected_score"]
identifier = test_case.get("identifier") or md5(str(test_case).encode(), usedforsecurity=False).hexdigest()
try:
self.display.console.print(f"[dim]Running crew with input: {str(inputs)[:50]}...[/dim]")
self.display.console.print("\n")
if crew:
crew.kickoff(inputs=inputs)
else:
for agent in agents:
agent.kickoff(**inputs)
assert self.evaluator is not None
agent_evaluations = self.evaluator.get_agent_evaluation()
actual_score = self._extract_scores(agent_evaluations)
passed = self._assert_scores(expected_score, actual_score)
return ExperimentResult(
identifier=identifier,
inputs=inputs,
score=actual_score,
expected_score=expected_score,
passed=passed,
agent_evaluations=agent_evaluations
)
except Exception as e:
self.display.console.print(f"[red]Error running test case: {str(e)}[/red]")
return ExperimentResult(
identifier=identifier,
inputs=inputs,
score=0,
expected_score=expected_score,
passed=False
)
def _extract_scores(self, agent_evaluations: dict[str, AgentAggregatedEvaluationResult]) -> float | dict[str, float]:
all_scores: dict[str, list[float]] = defaultdict(list)
for evaluation in agent_evaluations.values():
for metric_name, score in evaluation.metrics.items():
if score.score is not None:
all_scores[metric_name.value].append(score.score)
avg_scores = {m: sum(s)/len(s) for m, s in all_scores.items()}
if len(avg_scores) == 1:
return list(avg_scores.values())[0]
return avg_scores
def _assert_scores(self, expected: float | dict[str, float],
actual: float | dict[str, float]) -> bool:
"""
Compare expected and actual scores, and return whether the test case passed.
The rules for comparison are as follows:
- If both expected and actual scores are single numbers, the actual score must be >= expected.
- If expected is a single number and actual is a dict, compare against the average of actual values.
- If expected is a dict and actual is a single number, actual must be >= all expected values.
- If both are dicts, actual must have matching keys with values >= expected values.
"""
if isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
return actual >= expected
if isinstance(expected, dict) and isinstance(actual, (int, float)):
return all(actual >= exp_score for exp_score in expected.values())
if isinstance(expected, (int, float)) and isinstance(actual, dict):
if not actual:
return False
avg_score = sum(actual.values()) / len(actual)
return avg_score >= expected
if isinstance(expected, dict) and isinstance(actual, dict):
if not expected:
return True
matching_keys = set(expected.keys()) & set(actual.keys())
if not matching_keys:
return False
# All matching keys must have actual >= expected
return all(actual[key] >= expected[key] for key in matching_keys)
return False

View File

@@ -0,0 +1,30 @@
"""Robust JSON parsing utilities for evaluation responses."""
import json
import re
from typing import Any
def extract_json_from_llm_response(text: str) -> dict[str, Any]:
try:
return json.loads(text)
except json.JSONDecodeError:
pass
json_patterns = [
# Standard markdown code blocks with json
r'```json\s*([\s\S]*?)\s*```',
# Code blocks without language specifier
r'```\s*([\s\S]*?)\s*```',
# Inline code with JSON
r'`([{\\[].*[}\]])`',
]
for pattern in json_patterns:
matches = re.findall(pattern, text, re.IGNORECASE | re.DOTALL)
for match in matches:
try:
return json.loads(match.strip())
except json.JSONDecodeError:
continue
raise ValueError("No valid JSON found in the response")

View File

@@ -0,0 +1,26 @@
from crewai.experimental.evaluation.metrics.reasoning_metrics import (
ReasoningEfficiencyEvaluator
)
from crewai.experimental.evaluation.metrics.tools_metrics import (
ToolSelectionEvaluator,
ParameterExtractionEvaluator,
ToolInvocationEvaluator
)
from crewai.experimental.evaluation.metrics.goal_metrics import (
GoalAlignmentEvaluator
)
from crewai.experimental.evaluation.metrics.semantic_quality_metrics import (
SemanticQualityEvaluator
)
__all__ = [
"ReasoningEfficiencyEvaluator",
"ToolSelectionEvaluator",
"ParameterExtractionEvaluator",
"ToolInvocationEvaluator",
"GoalAlignmentEvaluator",
"SemanticQualityEvaluator"
]

View File

@@ -0,0 +1,69 @@
from typing import Any, Dict
from crewai.agent import Agent
from crewai.task import Task
from crewai.experimental.evaluation.base_evaluator import BaseEvaluator, EvaluationScore, MetricCategory
from crewai.experimental.evaluation.json_parser import extract_json_from_llm_response
class GoalAlignmentEvaluator(BaseEvaluator):
@property
def metric_category(self) -> MetricCategory:
return MetricCategory.GOAL_ALIGNMENT
def evaluate(
self,
agent: Agent,
execution_trace: Dict[str, Any],
final_output: Any,
task: Task | None = None,
) -> EvaluationScore:
task_context = ""
if task is not None:
task_context = f"Task description: {task.description}\nExpected output: {task.expected_output}\n"
prompt = [
{"role": "system", "content": """You are an expert evaluator assessing how well an AI agent's output aligns with its assigned task goal.
Score the agent's goal alignment on a scale from 0-10 where:
- 0: Complete misalignment, agent did not understand or attempt the task goal
- 5: Partial alignment, agent attempted the task but missed key requirements
- 10: Perfect alignment, agent fully satisfied all task requirements
Consider:
1. Did the agent correctly interpret the task goal?
2. Did the final output directly address the requirements?
3. Did the agent focus on relevant aspects of the task?
4. Did the agent provide all requested information or deliverables?
Return your evaluation as JSON with fields 'score' (number) and 'feedback' (string).
"""},
{"role": "user", "content": f"""
Agent role: {agent.role}
Agent goal: {agent.goal}
{task_context}
Agent's final output:
{final_output}
Evaluate how well the agent's output aligns with the assigned task goal.
"""}
]
assert self.llm is not None
response = self.llm.call(prompt)
try:
evaluation_data: dict[str, Any] = extract_json_from_llm_response(response)
assert evaluation_data is not None
return EvaluationScore(
score=evaluation_data.get("score", 0),
feedback=evaluation_data.get("feedback", response),
raw_response=response
)
except Exception:
return EvaluationScore(
score=None,
feedback=f"Failed to parse evaluation. Raw response: {response}",
raw_response=response
)

View File

@@ -0,0 +1,361 @@
"""Agent reasoning efficiency evaluators.
This module provides evaluator implementations for:
- Reasoning efficiency
- Loop detection
- Thinking-to-action ratio
"""
import logging
import re
from enum import Enum
from typing import Any, Dict, List, Tuple
import numpy as np
from collections.abc import Sequence
from crewai.agent import Agent
from crewai.task import Task
from crewai.experimental.evaluation.base_evaluator import BaseEvaluator, EvaluationScore, MetricCategory
from crewai.experimental.evaluation.json_parser import extract_json_from_llm_response
from crewai.tasks.task_output import TaskOutput
class ReasoningPatternType(Enum):
EFFICIENT = "efficient" # Good reasoning flow
LOOP = "loop" # Agent is stuck in a loop
VERBOSE = "verbose" # Agent is unnecessarily verbose
INDECISIVE = "indecisive" # Agent struggles to make decisions
SCATTERED = "scattered" # Agent jumps between topics without focus
class ReasoningEfficiencyEvaluator(BaseEvaluator):
@property
def metric_category(self) -> MetricCategory:
return MetricCategory.REASONING_EFFICIENCY
def evaluate(
self,
agent: Agent,
execution_trace: Dict[str, Any],
final_output: TaskOutput | str,
task: Task | None = None,
) -> EvaluationScore:
task_context = ""
if task is not None:
task_context = f"Task description: {task.description}\nExpected output: {task.expected_output}\n"
llm_calls = execution_trace.get("llm_calls", [])
if not llm_calls or len(llm_calls) < 2:
return EvaluationScore(
score=None,
feedback="Insufficient LLM calls to evaluate reasoning efficiency."
)
total_calls = len(llm_calls)
total_tokens = sum(call.get("total_tokens", 0) for call in llm_calls)
avg_tokens_per_call = total_tokens / total_calls if total_calls > 0 else 0
time_intervals = []
has_reliable_timing = True
for i in range(1, len(llm_calls)):
start_time = llm_calls[i-1].get("end_time")
end_time = llm_calls[i].get("start_time")
if start_time and end_time and start_time != end_time:
try:
interval = end_time - start_time
time_intervals.append(interval.total_seconds() if hasattr(interval, 'total_seconds') else 0)
except Exception:
has_reliable_timing = False
else:
has_reliable_timing = False
loop_detected, loop_details = self._detect_loops(llm_calls)
pattern_analysis = self._analyze_reasoning_patterns(llm_calls)
efficiency_metrics = {
"total_llm_calls": total_calls,
"total_tokens": total_tokens,
"avg_tokens_per_call": avg_tokens_per_call,
"reasoning_pattern": pattern_analysis["primary_pattern"].value,
"loops_detected": loop_detected,
}
if has_reliable_timing and time_intervals:
efficiency_metrics["avg_time_between_calls"] = np.mean(time_intervals)
loop_info = f"Detected {len(loop_details)} potential reasoning loops." if loop_detected else "No significant reasoning loops detected."
call_samples = self._get_call_samples(llm_calls)
final_output = final_output.raw if isinstance(final_output, TaskOutput) else final_output
prompt = [
{"role": "system", "content": """You are an expert evaluator assessing the reasoning efficiency of an AI agent's thought process.
Evaluate the agent's reasoning efficiency across these five key subcategories:
1. Focus (0-10): How well the agent stays on topic and avoids unnecessary tangents
2. Progression (0-10): How effectively the agent builds on previous thoughts rather than repeating or circling
3. Decision Quality (0-10): How decisively and appropriately the agent makes decisions
4. Conciseness (0-10): How efficiently the agent communicates without unnecessary verbosity
5. Loop Avoidance (0-10): How well the agent avoids getting stuck in repetitive thinking patterns
For each subcategory, provide a score from 0-10 where:
- 0: Completely inefficient
- 5: Moderately efficient
- 10: Highly efficient
The overall score should be a weighted average of these subcategories.
Return your evaluation as JSON with the following structure:
{
"overall_score": float,
"scores": {
"focus": float,
"progression": float,
"decision_quality": float,
"conciseness": float,
"loop_avoidance": float
},
"feedback": string (general feedback about overall reasoning efficiency),
"optimization_suggestions": string (concrete suggestions for improving reasoning efficiency),
"detected_patterns": string (describe any inefficient reasoning patterns you observe)
}"""},
{"role": "user", "content": f"""
Agent role: {agent.role}
{task_context}
Reasoning efficiency metrics:
- Total LLM calls: {efficiency_metrics["total_llm_calls"]}
- Average tokens per call: {efficiency_metrics["avg_tokens_per_call"]:.1f}
- Primary reasoning pattern: {efficiency_metrics["reasoning_pattern"]}
- {loop_info}
{"- Average time between calls: {:.2f} seconds".format(efficiency_metrics.get("avg_time_between_calls", 0)) if "avg_time_between_calls" in efficiency_metrics else ""}
Sample of agent reasoning flow (chronological sequence):
{call_samples}
Agent's final output:
{final_output[:500]}... (truncated)
Evaluate the reasoning efficiency of this agent based on these interaction patterns.
Identify any inefficient reasoning patterns and provide specific suggestions for optimization.
"""}
]
assert self.llm is not None
response = self.llm.call(prompt)
try:
evaluation_data = extract_json_from_llm_response(response)
scores = evaluation_data.get("scores", {})
focus = scores.get("focus", 5.0)
progression = scores.get("progression", 5.0)
decision_quality = scores.get("decision_quality", 5.0)
conciseness = scores.get("conciseness", 5.0)
loop_avoidance = scores.get("loop_avoidance", 5.0)
overall_score = evaluation_data.get("overall_score", evaluation_data.get("score", 5.0))
feedback = evaluation_data.get("feedback", "No detailed feedback provided.")
optimization_suggestions = evaluation_data.get("optimization_suggestions", "No specific suggestions provided.")
detailed_feedback = "Reasoning Efficiency Evaluation:\n"
detailed_feedback += f"• Focus: {focus}/10 - Staying on topic without tangents\n"
detailed_feedback += f"• Progression: {progression}/10 - Building on previous thinking\n"
detailed_feedback += f"• Decision Quality: {decision_quality}/10 - Making appropriate decisions\n"
detailed_feedback += f"• Conciseness: {conciseness}/10 - Communicating efficiently\n"
detailed_feedback += f"• Loop Avoidance: {loop_avoidance}/10 - Avoiding repetitive patterns\n\n"
detailed_feedback += f"Feedback:\n{feedback}\n\n"
detailed_feedback += f"Optimization Suggestions:\n{optimization_suggestions}"
return EvaluationScore(
score=float(overall_score),
feedback=detailed_feedback,
raw_response=response
)
except Exception as e:
logging.warning(f"Failed to parse reasoning efficiency evaluation: {e}")
return EvaluationScore(
score=None,
feedback=f"Failed to parse reasoning efficiency evaluation. Raw response: {response[:200]}...",
raw_response=response
)
def _detect_loops(self, llm_calls: List[Dict]) -> Tuple[bool, List[Dict]]:
loop_details = []
messages = []
for call in llm_calls:
content = call.get("response", "")
if isinstance(content, str):
messages.append(content)
elif isinstance(content, list) and len(content) > 0:
# Handle message list format
for msg in content:
if isinstance(msg, dict) and "content" in msg:
messages.append(msg["content"])
# Simple n-gram based similarity detection
# For a more robust implementation, consider using embedding-based similarity
for i in range(len(messages) - 2):
for j in range(i + 1, len(messages) - 1):
# Check for repeated patterns (simplistic approach)
# A more sophisticated approach would use semantic similarity
similarity = self._calculate_text_similarity(messages[i], messages[j])
if similarity > 0.7: # Arbitrary threshold
loop_details.append({
"first_occurrence": i,
"second_occurrence": j,
"similarity": similarity,
"snippet": messages[i][:100] + "..."
})
return len(loop_details) > 0, loop_details
def _calculate_text_similarity(self, text1: str, text2: str) -> float:
text1 = re.sub(r'\s+', ' ', text1.lower()).strip()
text2 = re.sub(r'\s+', ' ', text2.lower()).strip()
# Simple Jaccard similarity on word sets
words1 = set(text1.split())
words2 = set(text2.split())
intersection = len(words1.intersection(words2))
union = len(words1.union(words2))
return intersection / union if union > 0 else 0.0
def _analyze_reasoning_patterns(self, llm_calls: List[Dict]) -> Dict[str, Any]:
call_lengths = []
response_times = []
for call in llm_calls:
content = call.get("response", "")
if isinstance(content, str):
call_lengths.append(len(content))
elif isinstance(content, list) and len(content) > 0:
# Handle message list format
total_length = 0
for msg in content:
if isinstance(msg, dict) and "content" in msg:
total_length += len(msg["content"])
call_lengths.append(total_length)
start_time = call.get("start_time")
end_time = call.get("end_time")
if start_time and end_time:
try:
response_times.append(end_time - start_time)
except Exception:
pass
avg_length = np.mean(call_lengths) if call_lengths else 0
std_length = np.std(call_lengths) if call_lengths else 0
length_trend = self._calculate_trend(call_lengths)
primary_pattern = ReasoningPatternType.EFFICIENT
details = "Agent demonstrates efficient reasoning patterns."
loop_score = self._calculate_loop_likelihood(call_lengths, response_times)
if loop_score > 0.7:
primary_pattern = ReasoningPatternType.LOOP
details = "Agent appears to be stuck in repetitive thinking patterns."
elif avg_length > 1000 and std_length / avg_length < 0.3:
primary_pattern = ReasoningPatternType.VERBOSE
details = "Agent is consistently verbose across interactions."
elif len(llm_calls) > 10 and length_trend > 0.5:
primary_pattern = ReasoningPatternType.INDECISIVE
details = "Agent shows signs of indecisiveness with increasing message lengths."
elif std_length / avg_length > 0.8:
primary_pattern = ReasoningPatternType.SCATTERED
details = "Agent shows inconsistent reasoning flow with highly variable responses."
return {
"primary_pattern": primary_pattern,
"details": details,
"metrics": {
"avg_length": avg_length,
"std_length": std_length,
"length_trend": length_trend,
"loop_score": loop_score
}
}
def _calculate_trend(self, values: Sequence[float | int]) -> float:
if not values or len(values) < 2:
return 0.0
try:
x = np.arange(len(values))
y = np.array(values)
# Simple linear regression
slope = np.polyfit(x, y, 1)[0]
# Normalize slope to -1 to 1 range
max_possible_slope = max(values) - min(values)
if max_possible_slope > 0:
normalized_slope = slope / max_possible_slope
return max(min(normalized_slope, 1.0), -1.0)
return 0.0
except Exception:
return 0.0
def _calculate_loop_likelihood(self, call_lengths: Sequence[float], response_times: Sequence[float]) -> float:
if not call_lengths or len(call_lengths) < 3:
return 0.0
indicators = []
if len(call_lengths) >= 4:
repeated_lengths = 0
for i in range(len(call_lengths) - 2):
ratio = call_lengths[i] / call_lengths[i + 2] if call_lengths[i + 2] > 0 else 0
if 0.85 <= ratio <= 1.15:
repeated_lengths += 1
length_repetition_score = repeated_lengths / (len(call_lengths) - 2)
indicators.append(length_repetition_score)
if response_times and len(response_times) >= 3:
try:
std_time = np.std(response_times)
mean_time = np.mean(response_times)
if mean_time > 0:
time_consistency = 1.0 - (std_time / mean_time)
indicators.append(max(0, time_consistency - 0.3) * 1.5)
except Exception:
pass
return np.mean(indicators) if indicators else 0.0
def _get_call_samples(self, llm_calls: List[Dict]) -> str:
samples = []
if len(llm_calls) <= 6:
sample_indices = list(range(len(llm_calls)))
else:
sample_indices = [0, 1, len(llm_calls) // 2 - 1, len(llm_calls) // 2,
len(llm_calls) - 2, len(llm_calls) - 1]
for idx in sample_indices:
call = llm_calls[idx]
content = call.get("response", "")
if isinstance(content, str):
sample = content
elif isinstance(content, list) and len(content) > 0:
sample_parts = []
for msg in content:
if isinstance(msg, dict) and "content" in msg:
sample_parts.append(msg["content"])
sample = "\n".join(sample_parts)
else:
sample = str(content)
truncated = sample[:200] + "..." if len(sample) > 200 else sample
samples.append(f"Call {idx + 1}:\n{truncated}\n")
return "\n".join(samples)

View File

@@ -0,0 +1,68 @@
from typing import Any, Dict
from crewai.agent import Agent
from crewai.task import Task
from crewai.experimental.evaluation.base_evaluator import BaseEvaluator, EvaluationScore, MetricCategory
from crewai.experimental.evaluation.json_parser import extract_json_from_llm_response
class SemanticQualityEvaluator(BaseEvaluator):
@property
def metric_category(self) -> MetricCategory:
return MetricCategory.SEMANTIC_QUALITY
def evaluate(
self,
agent: Agent,
execution_trace: Dict[str, Any],
final_output: Any,
task: Task | None = None,
) -> EvaluationScore:
task_context = ""
if task is not None:
task_context = f"Task description: {task.description}"
prompt = [
{"role": "system", "content": """You are an expert evaluator assessing the semantic quality of an AI agent's output.
Score the semantic quality on a scale from 0-10 where:
- 0: Completely incoherent, confusing, or logically flawed output
- 5: Moderately clear and logical output with some issues
- 10: Exceptionally clear, coherent, and logically sound output
Consider:
1. Is the output well-structured and organized?
2. Is the reasoning logical and well-supported?
3. Is the language clear, precise, and appropriate for the task?
4. Are claims supported by evidence when appropriate?
5. Is the output free from contradictions and logical fallacies?
Return your evaluation as JSON with fields 'score' (number) and 'feedback' (string).
"""},
{"role": "user", "content": f"""
Agent role: {agent.role}
{task_context}
Agent's final output:
{final_output}
Evaluate the semantic quality and reasoning of this output.
"""}
]
assert self.llm is not None
response = self.llm.call(prompt)
try:
evaluation_data: dict[str, Any] = extract_json_from_llm_response(response)
assert evaluation_data is not None
return EvaluationScore(
score=float(evaluation_data["score"]) if evaluation_data.get("score") is not None else None,
feedback=evaluation_data.get("feedback", response),
raw_response=response
)
except Exception:
return EvaluationScore(
score=None,
feedback=f"Failed to parse evaluation. Raw response: {response}",
raw_response=response
)

View File

@@ -0,0 +1,410 @@
import json
from typing import Dict, Any
from crewai.experimental.evaluation.base_evaluator import BaseEvaluator, EvaluationScore, MetricCategory
from crewai.experimental.evaluation.json_parser import extract_json_from_llm_response
from crewai.agent import Agent
from crewai.task import Task
class ToolSelectionEvaluator(BaseEvaluator):
@property
def metric_category(self) -> MetricCategory:
return MetricCategory.TOOL_SELECTION
def evaluate(
self,
agent: Agent,
execution_trace: Dict[str, Any],
final_output: str,
task: Task | None = None,
) -> EvaluationScore:
task_context = ""
if task is not None:
task_context = f"Task description: {task.description}"
tool_uses = execution_trace.get("tool_uses", [])
tool_count = len(tool_uses)
unique_tool_types = set([tool.get("tool", "Unknown tool") for tool in tool_uses])
if tool_count == 0:
if not agent.tools:
return EvaluationScore(
score=None,
feedback="Agent had no tools available to use."
)
else:
return EvaluationScore(
score=None,
feedback="Agent had tools available but didn't use any."
)
available_tools_info = ""
if agent.tools:
for tool in agent.tools:
available_tools_info += f"- {tool.name}: {tool.description}\n"
else:
available_tools_info = "No tools available"
tool_types_summary = "Tools selected by the agent:\n"
for tool_type in sorted(unique_tool_types):
tool_types_summary += f"- {tool_type}\n"
prompt = [
{"role": "system", "content": """You are an expert evaluator assessing if an AI agent selected the most appropriate tools for a given task.
You must evaluate based on these 2 criteria:
1. Relevance (0-10): Were the tools chosen directly aligned with the task's goals?
2. Coverage (0-10): Did the agent select ALL appropriate tools from the AVAILABLE tools?
IMPORTANT:
- ONLY consider tools that are listed as available to the agent
- DO NOT suggest tools that aren't in the 'Available tools' list
- DO NOT evaluate the quality or accuracy of tool outputs/results
- DO NOT evaluate how many times each tool was used
- DO NOT evaluate how the agent used the parameters
- DO NOT evaluate whether the agent interpreted the task correctly
Focus ONLY on whether the correct CATEGORIES of tools were selected from what was available.
Return your evaluation as JSON with these fields:
- scores: {"relevance": number, "coverage": number}
- overall_score: number (average of all scores, 0-10)
- feedback: string (focused ONLY on tool selection decisions from available tools)
- improvement_suggestions: string (ONLY suggest better selection from the AVAILABLE tools list, NOT new tools)
"""},
{"role": "user", "content": f"""
Agent role: {agent.role}
{task_context}
Available tools for this agent:
{available_tools_info}
{tool_types_summary}
Based ONLY on the task description and comparing the AVAILABLE tools with those that were selected (listed above), evaluate if the agent selected the appropriate tool types for this task.
IMPORTANT:
- ONLY evaluate selection from tools listed as available
- DO NOT suggest new tools that aren't in the available tools list
- DO NOT evaluate tool usage or results
"""}
]
assert self.llm is not None
response = self.llm.call(prompt)
try:
evaluation_data = extract_json_from_llm_response(response)
assert evaluation_data is not None
scores = evaluation_data.get("scores", {})
relevance = scores.get("relevance", 5.0)
coverage = scores.get("coverage", 5.0)
overall_score = float(evaluation_data.get("overall_score", 5.0))
feedback = "Tool Selection Evaluation:\n"
feedback += f"• Relevance: {relevance}/10 - Selection of appropriate tool types for the task\n"
feedback += f"• Coverage: {coverage}/10 - Selection of all necessary tool types\n"
if "improvement_suggestions" in evaluation_data:
feedback += f"Improvement Suggestions:\n{evaluation_data['improvement_suggestions']}"
else:
feedback += evaluation_data.get("feedback", "No detailed feedback available.")
return EvaluationScore(
score=overall_score,
feedback=feedback,
raw_response=response
)
except Exception as e:
return EvaluationScore(
score=None,
feedback=f"Error evaluating tool selection: {e}",
raw_response=response
)
class ParameterExtractionEvaluator(BaseEvaluator):
@property
def metric_category(self) -> MetricCategory:
return MetricCategory.PARAMETER_EXTRACTION
def evaluate(
self,
agent: Agent,
execution_trace: Dict[str, Any],
final_output: str,
task: Task | None = None,
) -> EvaluationScore:
task_context = ""
if task is not None:
task_context = f"Task description: {task.description}"
tool_uses = execution_trace.get("tool_uses", [])
tool_count = len(tool_uses)
if tool_count == 0:
return EvaluationScore(
score=None,
feedback="No tool usage detected. Cannot evaluate parameter extraction."
)
validation_errors = []
for tool_use in tool_uses:
if not tool_use.get("success", True) and tool_use.get("error_type") == "validation_error":
validation_errors.append({
"tool": tool_use.get("tool", "Unknown tool"),
"error": tool_use.get("result"),
"args": tool_use.get("args", {})
})
validation_error_rate = len(validation_errors) / tool_count if tool_count > 0 else 0
param_samples = []
for i, tool_use in enumerate(tool_uses[:5]):
tool_name = tool_use.get("tool", "Unknown tool")
tool_args = tool_use.get("args", {})
success = tool_use.get("success", True) and not tool_use.get("error", False)
error_type = tool_use.get("error_type", "") if not success else ""
is_validation_error = error_type == "validation_error"
sample = f"Tool use #{i+1} - {tool_name}:\n"
sample += f"- Parameters: {json.dumps(tool_args, indent=2)}\n"
sample += f"- Success: {'No' if not success else 'Yes'}"
if is_validation_error:
sample += " (PARAMETER VALIDATION ERROR)\n"
sample += f"- Error: {tool_use.get('result', 'Unknown error')}"
elif not success:
sample += f" (Other error: {error_type})\n"
param_samples.append(sample)
validation_errors_info = ""
if validation_errors:
validation_errors_info = f"\nParameter validation errors detected: {len(validation_errors)} ({validation_error_rate:.1%} of tool uses)\n"
for i, err in enumerate(validation_errors[:3]):
tool_name = err.get("tool", "Unknown tool")
error_msg = err.get("error", "Unknown error")
args = err.get("args", {})
validation_errors_info += f"\nValidation Error #{i+1}:\n- Tool: {tool_name}\n- Args: {json.dumps(args, indent=2)}\n- Error: {error_msg}"
if len(validation_errors) > 3:
validation_errors_info += f"\n...and {len(validation_errors) - 3} more validation errors."
param_samples_text = "\n\n".join(param_samples)
prompt = [
{"role": "system", "content": """You are an expert evaluator assessing how well an AI agent extracts and formats PARAMETER VALUES for tool calls.
Your job is to evaluate ONLY whether the agent used the correct parameter VALUES, not whether the right tools were selected or how the tools were invoked.
Evaluate parameter extraction based on these criteria:
1. Accuracy (0-10): Are parameter values correctly identified from the context/task?
2. Formatting (0-10): Are values formatted correctly for each tool's requirements?
3. Completeness (0-10): Are all required parameter values provided, with no missing information?
IMPORTANT: DO NOT evaluate:
- Whether the right tool was chosen (that's the ToolSelectionEvaluator's job)
- How the tools were structurally invoked (that's the ToolInvocationEvaluator's job)
- The quality of results from tools
Focus ONLY on the PARAMETER VALUES - whether they were correctly extracted from the context, properly formatted, and complete.
Validation errors are important signals that parameter values weren't properly extracted or formatted.
Return your evaluation as JSON with these fields:
- scores: {"accuracy": number, "formatting": number, "completeness": number}
- overall_score: number (average of all scores, 0-10)
- feedback: string (focused ONLY on parameter value extraction quality)
- improvement_suggestions: string (concrete suggestions for better parameter VALUE extraction)
"""},
{"role": "user", "content": f"""
Agent role: {agent.role}
{task_context}
Parameter extraction examples:
{param_samples_text}
{validation_errors_info}
Evaluate the quality of the agent's parameter extraction for this task.
"""}
]
assert self.llm is not None
response = self.llm.call(prompt)
try:
evaluation_data = extract_json_from_llm_response(response)
assert evaluation_data is not None
scores = evaluation_data.get("scores", {})
accuracy = scores.get("accuracy", 5.0)
formatting = scores.get("formatting", 5.0)
completeness = scores.get("completeness", 5.0)
overall_score = float(evaluation_data.get("overall_score", 5.0))
feedback = "Parameter Extraction Evaluation:\n"
feedback += f"• Accuracy: {accuracy}/10 - Correctly identifying required parameters\n"
feedback += f"• Formatting: {formatting}/10 - Properly formatting parameters for tools\n"
feedback += f"• Completeness: {completeness}/10 - Including all necessary information\n\n"
if "improvement_suggestions" in evaluation_data:
feedback += f"Improvement Suggestions:\n{evaluation_data['improvement_suggestions']}"
else:
feedback += evaluation_data.get("feedback", "No detailed feedback available.")
return EvaluationScore(
score=overall_score,
feedback=feedback,
raw_response=response
)
except Exception as e:
return EvaluationScore(
score=None,
feedback=f"Error evaluating parameter extraction: {e}",
raw_response=response
)
class ToolInvocationEvaluator(BaseEvaluator):
@property
def metric_category(self) -> MetricCategory:
return MetricCategory.TOOL_INVOCATION
def evaluate(
self,
agent: Agent,
execution_trace: Dict[str, Any],
final_output: str,
task: Task | None = None,
) -> EvaluationScore:
task_context = ""
if task is not None:
task_context = f"Task description: {task.description}"
tool_uses = execution_trace.get("tool_uses", [])
tool_errors = []
tool_count = len(tool_uses)
if tool_count == 0:
return EvaluationScore(
score=None,
feedback="No tool usage detected. Cannot evaluate tool invocation."
)
for tool_use in tool_uses:
if not tool_use.get("success", True) or tool_use.get("error", False):
error_info = {
"tool": tool_use.get("tool", "Unknown tool"),
"error": tool_use.get("result"),
"error_type": tool_use.get("error_type", "unknown_error")
}
tool_errors.append(error_info)
error_rate = len(tool_errors) / tool_count if tool_count > 0 else 0
error_types = {}
for error in tool_errors:
error_type = error.get("error_type", "unknown_error")
if error_type not in error_types:
error_types[error_type] = 0
error_types[error_type] += 1
invocation_samples = []
for i, tool_use in enumerate(tool_uses[:5]):
tool_name = tool_use.get("tool", "Unknown tool")
tool_args = tool_use.get("args", {})
success = tool_use.get("success", True) and not tool_use.get("error", False)
error_type = tool_use.get("error_type", "") if not success else ""
error_msg = tool_use.get("result", "No error") if not success else "No error"
sample = f"Tool invocation #{i+1}:\n"
sample += f"- Tool: {tool_name}\n"
sample += f"- Parameters: {json.dumps(tool_args, indent=2)}\n"
sample += f"- Success: {'No' if not success else 'Yes'}\n"
if not success:
sample += f"- Error type: {error_type}\n"
sample += f"- Error: {error_msg}"
invocation_samples.append(sample)
error_type_summary = ""
if error_types:
error_type_summary = "Error type breakdown:\n"
for error_type, count in error_types.items():
error_type_summary += f"- {error_type}: {count} occurrences ({(count/tool_count):.1%})\n"
invocation_samples_text = "\n\n".join(invocation_samples)
prompt = [
{"role": "system", "content": """You are an expert evaluator assessing how correctly an AI agent's tool invocations are STRUCTURED.
Your job is to evaluate ONLY the structural and syntactical aspects of how the agent called tools, NOT which tools were selected or what parameter values were used.
Evaluate the agent's tool invocation based on these criteria:
1. Structure (0-10): Does the tool call follow the expected syntax and format?
2. Error Handling (0-10): Does the agent handle tool errors appropriately?
3. Invocation Patterns (0-10): Are tool calls properly sequenced, batched, or managed?
Error types that indicate invocation issues:
- execution_error: The tool was called correctly but failed during execution
- usage_error: General errors in how the tool was used structurally
IMPORTANT: DO NOT evaluate:
- Whether the right tool was chosen (that's the ToolSelectionEvaluator's job)
- Whether the parameter values are correct (that's the ParameterExtractionEvaluator's job)
- The quality of results from tools
Focus ONLY on HOW tools were invoked - the structure, format, and handling of the invocation process.
Return your evaluation as JSON with these fields:
- scores: {"structure": number, "error_handling": number, "invocation_patterns": number}
- overall_score: number (average of all scores, 0-10)
- feedback: string (focused ONLY on structural aspects of tool invocation)
- improvement_suggestions: string (concrete suggestions for better structuring of tool calls)
"""},
{"role": "user", "content": f"""
Agent role: {agent.role}
{task_context}
Tool invocation examples:
{invocation_samples_text}
Tool error rate: {error_rate:.2%} ({len(tool_errors)} errors out of {tool_count} invocations)
{error_type_summary}
Evaluate the quality of the agent's tool invocation structure during this task.
"""}
]
assert self.llm is not None
response = self.llm.call(prompt)
try:
evaluation_data = extract_json_from_llm_response(response)
assert evaluation_data is not None
scores = evaluation_data.get("scores", {})
structure = scores.get("structure", 5.0)
error_handling = scores.get("error_handling", 5.0)
invocation_patterns = scores.get("invocation_patterns", 5.0)
overall_score = float(evaluation_data.get("overall_score", 5.0))
feedback = "Tool Invocation Evaluation:\n"
feedback += f"• Structure: {structure}/10 - Following proper syntax and format\n"
feedback += f"• Error Handling: {error_handling}/10 - Appropriately handling tool errors\n"
feedback += f"• Invocation Patterns: {invocation_patterns}/10 - Proper sequencing and management of calls\n\n"
if "improvement_suggestions" in evaluation_data:
feedback += f"Improvement Suggestions:\n{evaluation_data['improvement_suggestions']}"
else:
feedback += evaluation_data.get("feedback", "No detailed feedback available.")
return EvaluationScore(
score=overall_score,
feedback=feedback,
raw_response=response
)
except Exception as e:
return EvaluationScore(
score=None,
feedback=f"Error evaluating tool invocation: {e}",
raw_response=response
)

View File

@@ -0,0 +1,52 @@
import inspect
from typing_extensions import Any
import warnings
from crewai.experimental.evaluation.experiment import ExperimentResults, ExperimentRunner
from crewai import Crew, Agent
def assert_experiment_successfully(experiment_results: ExperimentResults, baseline_filepath: str | None = None) -> None:
failed_tests = [result for result in experiment_results.results if not result.passed]
if failed_tests:
detailed_failures: list[str] = []
for result in failed_tests:
expected = result.expected_score
actual = result.score
detailed_failures.append(f"- {result.identifier}: expected {expected}, got {actual}")
failure_details = "\n".join(detailed_failures)
raise AssertionError(f"The following test cases failed:\n{failure_details}")
baseline_filepath = baseline_filepath or _get_baseline_filepath_fallback()
comparison = experiment_results.compare_with_baseline(baseline_filepath=baseline_filepath)
assert_experiment_no_regression(comparison)
def assert_experiment_no_regression(comparison_result: dict[str, list[str]]) -> None:
regressed = comparison_result.get("regressed", [])
if regressed:
raise AssertionError(f"Regression detected! The following tests that previously passed now fail: {regressed}")
missing_tests = comparison_result.get("missing_tests", [])
if missing_tests:
warnings.warn(
f"Warning: {len(missing_tests)} tests from the baseline are missing in the current run: {missing_tests}",
UserWarning
)
def run_experiment(dataset: list[dict[str, Any]], crew: Crew | None = None, agents: list[Agent] | None = None, verbose: bool = False) -> ExperimentResults:
runner = ExperimentRunner(dataset=dataset)
return runner.run(agents=agents, crew=crew, print_summary=verbose)
def _get_baseline_filepath_fallback() -> str:
test_func_name = "experiment_fallback"
try:
current_frame = inspect.currentframe()
if current_frame is not None:
test_func_name = current_frame.f_back.f_back.f_code.co_name # type: ignore[union-attr]
except Exception:
...
return f"{test_func_name}_results.json"

View File

@@ -436,6 +436,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
_routers: Set[str] = set()
_router_paths: Dict[str, List[str]] = {}
initial_state: Union[Type[T], T, None] = None
name: Optional[str] = None
def __class_getitem__(cls: Type["Flow"], item: Type[T]) -> Type["Flow"]:
class _FlowGeneric(cls): # type: ignore
@@ -473,7 +474,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
self,
FlowCreatedEvent(
type="flow_created",
flow_name=self.__class__.__name__,
flow_name=self.name or self.__class__.__name__,
),
)
@@ -769,7 +770,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
self,
FlowStartedEvent(
type="flow_started",
flow_name=self.__class__.__name__,
flow_name=self.name or self.__class__.__name__,
inputs=inputs,
),
)
@@ -792,7 +793,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
self,
FlowFinishedEvent(
type="flow_finished",
flow_name=self.__class__.__name__,
flow_name=self.name or self.__class__.__name__,
result=final_output,
),
)
@@ -834,7 +835,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
MethodExecutionStartedEvent(
type="method_execution_started",
method_name=method_name,
flow_name=self.__class__.__name__,
flow_name=self.name or self.__class__.__name__,
params=dumped_params,
state=self._copy_state(),
),
@@ -856,7 +857,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
MethodExecutionFinishedEvent(
type="method_execution_finished",
method_name=method_name,
flow_name=self.__class__.__name__,
flow_name=self.name or self.__class__.__name__,
state=self._copy_state(),
result=result,
),
@@ -869,7 +870,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
MethodExecutionFailedEvent(
type="method_execution_failed",
method_name=method_name,
flow_name=self.__class__.__name__,
flow_name=self.name or self.__class__.__name__,
error=e,
),
)
@@ -1076,7 +1077,7 @@ class Flow(Generic[T], metaclass=FlowMeta):
self,
FlowPlotEvent(
type="flow_plot",
flow_name=self.__class__.__name__,
flow_name=self.name or self.__class__.__name__,
),
)
plot_flow(self, filename)

View File

@@ -1,55 +0,0 @@
from abc import ABC, abstractmethod
from typing import List
import numpy as np
class BaseEmbedder(ABC):
"""
Abstract base class for text embedding models
"""
@abstractmethod
def embed_chunks(self, chunks: List[str]) -> np.ndarray:
"""
Generate embeddings for a list of text chunks
Args:
chunks: List of text chunks to embed
Returns:
Array of embeddings
"""
pass
@abstractmethod
def embed_texts(self, texts: List[str]) -> np.ndarray:
"""
Generate embeddings for a list of texts
Args:
texts: List of texts to embed
Returns:
Array of embeddings
"""
pass
@abstractmethod
def embed_text(self, text: str) -> np.ndarray:
"""
Generate embedding for a single text
Args:
text: Text to embed
Returns:
Embedding array
"""
pass
@property
@abstractmethod
def dimension(self) -> int:
"""Get the dimension of the embeddings"""
pass

View File

@@ -13,11 +13,12 @@ from chromadb.api.types import OneOrMany
from chromadb.config import Settings
from crewai.knowledge.storage.base_knowledge_storage import BaseKnowledgeStorage
from crewai.utilities import EmbeddingConfigurator
from crewai.rag.embeddings.configurator import EmbeddingConfigurator
from crewai.utilities.chromadb import sanitize_collection_name
from crewai.utilities.constants import KNOWLEDGE_DIRECTORY
from crewai.utilities.logger import Logger
from crewai.utilities.paths import db_storage_path
from crewai.utilities.chromadb import create_persistent_client
@contextlib.contextmanager
@@ -84,14 +85,11 @@ class KnowledgeStorage(BaseKnowledgeStorage):
raise Exception("Collection not initialized")
def initialize_knowledge_storage(self):
base_path = os.path.join(db_storage_path(), "knowledge")
chroma_client = chromadb.PersistentClient(
path=base_path,
self.app = create_persistent_client(
path=os.path.join(db_storage_path(), "knowledge"),
settings=Settings(allow_reset=True),
)
self.app = chroma_client
try:
collection_name = (
f"knowledge_{self.collection_name}"
@@ -111,9 +109,8 @@ class KnowledgeStorage(BaseKnowledgeStorage):
def reset(self):
base_path = os.path.join(db_storage_path(), KNOWLEDGE_DIRECTORY)
if not self.app:
self.app = chromadb.PersistentClient(
path=base_path,
settings=Settings(allow_reset=True),
self.app = create_persistent_client(
path=base_path, settings=Settings(allow_reset=True)
)
self.app.reset()

View File

@@ -28,7 +28,7 @@ from pydantic import (
InstanceOf,
PrivateAttr,
model_validator,
field_validator,
field_validator
)
from crewai.agents.agent_builder.base_agent import BaseAgent
@@ -40,7 +40,7 @@ from crewai.agents.parser import (
OutputParserException,
)
from crewai.flow.flow_trackable import FlowTrackable
from crewai.llm import LLM
from crewai.llm import LLM, BaseLLM
from crewai.tools.base_tool import BaseTool
from crewai.tools.structured_tool import CrewStructuredTool
from crewai.utilities import I18N
@@ -135,7 +135,7 @@ class LiteAgent(FlowTrackable, BaseModel):
role: str = Field(description="Role of the agent")
goal: str = Field(description="Goal of the agent")
backstory: str = Field(description="Backstory of the agent")
llm: Optional[Union[str, InstanceOf[LLM], Any]] = Field(
llm: Optional[Union[str, InstanceOf[BaseLLM], Any]] = Field(
default=None, description="Language model that will run the agent"
)
tools: List[BaseTool] = Field(
@@ -209,8 +209,8 @@ class LiteAgent(FlowTrackable, BaseModel):
def setup_llm(self):
"""Set up the LLM and other components after initialization."""
self.llm = create_llm(self.llm)
if not isinstance(self.llm, LLM):
raise ValueError("Unable to create LLM instance")
if not isinstance(self.llm, BaseLLM):
raise ValueError(f"Expected LLM instance of type BaseLLM, got {type(self.llm).__name__}")
# Initialize callbacks
token_callback = TokenCalcHandler(token_cost_process=self._token_process)
@@ -232,7 +232,8 @@ class LiteAgent(FlowTrackable, BaseModel):
elif isinstance(self.guardrail, str):
from crewai.tasks.llm_guardrail import LLMGuardrail
assert isinstance(self.llm, LLM)
if not isinstance(self.llm, BaseLLM):
raise TypeError(f"Guardrail requires LLM instance of type BaseLLM, got {type(self.llm).__name__}")
self._guardrail = LLMGuardrail(description=self.guardrail, llm=self.llm)
@@ -304,6 +305,7 @@ class LiteAgent(FlowTrackable, BaseModel):
"""
# Create agent info for event emission
agent_info = {
"id": self.id,
"role": self.role,
"goal": self.goal,
"backstory": self.backstory,
@@ -537,6 +539,7 @@ class LiteAgent(FlowTrackable, BaseModel):
crewai_event_bus.emit(
self,
event=LLMCallCompletedEvent(
messages=self._messages,
response=answer,
call_type=LLMCallType.LLM_CALL,
from_agent=self,
@@ -619,4 +622,4 @@ class LiteAgent(FlowTrackable, BaseModel):
def _append_message(self, text: str, role: str = "assistant") -> None:
"""Append a message to the message list with the given role."""
self._messages.append(format_message_for_llm(text, role=role))
self._messages.append(format_message_for_llm(text, role=role))

View File

@@ -59,6 +59,7 @@ from crewai.utilities.exceptions.context_window_exceeding_exception import (
load_dotenv()
litellm.suppress_debug_info = True
class FilteredStream(io.TextIOBase):
_lock = None
@@ -76,9 +77,7 @@ class FilteredStream(io.TextIOBase):
# Skip common noisy LiteLLM banners and any other lines that contain "litellm"
if (
"give feedback / get help" in lower_s
or "litellm.info:" in lower_s
or "litellm" in lower_s
"litellm.info:" in lower_s
or "Consider using a smaller input or implementing a text splitting strategy" in lower_s
):
return 0
@@ -508,7 +507,6 @@ class LLM(BaseLLM):
# Enable tool calls using streaming
if "tool_calls" in delta:
tool_calls = delta["tool_calls"]
if tool_calls:
result = self._handle_streaming_tool_calls(
tool_calls=tool_calls,
@@ -517,6 +515,7 @@ class LLM(BaseLLM):
from_task=from_task,
from_agent=from_agent,
)
if result is not None:
chunk_content = result
@@ -631,7 +630,7 @@ class LLM(BaseLLM):
# Log token usage if available in streaming mode
self._handle_streaming_callbacks(callbacks, usage_info, last_chunk)
# Emit completion event and return response
self._handle_emit_call_events(full_response, LLMCallType.LLM_CALL, from_task, from_agent)
self._handle_emit_call_events(response=full_response, call_type=LLMCallType.LLM_CALL, from_task=from_task, from_agent=from_agent, messages=params["messages"])
return full_response
# --- 9) Handle tool calls if present
@@ -643,7 +642,7 @@ class LLM(BaseLLM):
self._handle_streaming_callbacks(callbacks, usage_info, last_chunk)
# --- 11) Emit completion event and return response
self._handle_emit_call_events(full_response, LLMCallType.LLM_CALL, from_task, from_agent)
self._handle_emit_call_events(response=full_response, call_type=LLMCallType.LLM_CALL, from_task=from_task, from_agent=from_agent, messages=params["messages"])
return full_response
except ContextWindowExceededError as e:
@@ -655,7 +654,7 @@ class LLM(BaseLLM):
logging.error(f"Error in streaming response: {str(e)}")
if full_response.strip():
logging.warning(f"Returning partial response despite error: {str(e)}")
self._handle_emit_call_events(full_response, LLMCallType.LLM_CALL, from_task, from_agent)
self._handle_emit_call_events(response=full_response, call_type=LLMCallType.LLM_CALL, from_task=from_task, from_agent=from_agent, messages=params["messages"])
return full_response
# Emit failed event and re-raise the exception
@@ -760,7 +759,7 @@ class LLM(BaseLLM):
available_functions: Optional[Dict[str, Any]] = None,
from_task: Optional[Any] = None,
from_agent: Optional[Any] = None,
) -> str:
) -> str | Any:
"""Handle a non-streaming response from the LLM.
Args:
@@ -784,13 +783,11 @@ class LLM(BaseLLM):
# Convert litellm's context window error to our own exception type
# for consistent handling in the rest of the codebase
raise LLMContextLengthExceededException(str(e))
# --- 2) Extract response message and content
response_message = cast(Choices, cast(ModelResponse, response).choices)[
0
].message
text_response = response_message.content or ""
# --- 3) Handle callbacks with usage info
if callbacks and len(callbacks) > 0:
for callback in callbacks:
@@ -803,22 +800,23 @@ class LLM(BaseLLM):
start_time=0,
end_time=0,
)
# --- 4) Check for tool calls
tool_calls = getattr(response_message, "tool_calls", [])
# --- 5) If no tool calls or no available functions, return the text response directly
if not tool_calls or not available_functions:
self._handle_emit_call_events(text_response, LLMCallType.LLM_CALL, from_task, from_agent)
# --- 5) If no tool calls or no available functions, return the text response directly as long as there is a text response
if (not tool_calls or not available_functions) and text_response:
self._handle_emit_call_events(response=text_response, call_type=LLMCallType.LLM_CALL, from_task=from_task, from_agent=from_agent, messages=params["messages"])
return text_response
# --- 6) If there is no text response, no available functions, but there are tool calls, return the tool calls
elif tool_calls and not available_functions and not text_response:
return tool_calls
# --- 6) Handle tool calls if present
# --- 7) Handle tool calls if present
tool_result = self._handle_tool_call(tool_calls, available_functions)
if tool_result is not None:
return tool_result
# --- 7) If tool call handling didn't return a result, emit completion event and return text response
self._handle_emit_call_events(text_response, LLMCallType.LLM_CALL, from_task, from_agent)
# --- 8) If tool call handling didn't return a result, emit completion event and return text response
self._handle_emit_call_events(response=text_response, call_type=LLMCallType.LLM_CALL, from_task=from_task, from_agent=from_agent, messages=params["messages"])
return text_response
def _handle_tool_call(
@@ -861,6 +859,7 @@ class LLM(BaseLLM):
tool_args=function_args,
),
)
result = fn(**function_args)
crewai_event_bus.emit(
self,
@@ -874,7 +873,7 @@ class LLM(BaseLLM):
)
# --- 3.3) Emit success event
self._handle_emit_call_events(result, LLMCallType.TOOL_CALL)
self._handle_emit_call_events(response=result, call_type=LLMCallType.TOOL_CALL)
return result
except Exception as e:
# --- 3.4) Handle execution errors
@@ -951,22 +950,18 @@ class LLM(BaseLLM):
# --- 3) Convert string messages to proper format if needed
if isinstance(messages, str):
messages = [{"role": "user", "content": messages}]
# --- 4) Handle O1 model special case (system messages not supported)
if "o1" in self.model.lower():
for message in messages:
if message.get("role") == "system":
message["role"] = "assistant"
# --- 5) Set up callbacks if provided
with suppress_warnings():
if callbacks and len(callbacks) > 0:
self.set_callbacks(callbacks)
try:
# --- 6) Prepare parameters for the completion call
params = self._prepare_completion_params(messages, tools)
# --- 7) Make the completion call and handle response
if self.stream:
return self._handle_streaming_response(
@@ -983,25 +978,48 @@ class LLM(BaseLLM):
# whether to summarize the content or abort based on the respect_context_window flag
raise
except Exception as e:
unsupported_stop = "Unsupported parameter" in str(e) and "'stop'" in str(e)
if unsupported_stop:
if "additional_drop_params" in self.additional_params and isinstance(self.additional_params["additional_drop_params"], list):
self.additional_params["additional_drop_params"].append("stop")
else:
self.additional_params = {"additional_drop_params": ["stop"]}
logging.info(
"Retrying LLM call without the unsupported 'stop'"
)
return self.call(
messages,
tools=tools,
callbacks=callbacks,
available_functions=available_functions,
from_task=from_task,
from_agent=from_agent,
)
assert hasattr(crewai_event_bus, "emit")
crewai_event_bus.emit(
self,
event=LLMCallFailedEvent(error=str(e), from_task=from_task, from_agent=from_agent),
)
logging.error(f"LiteLLM call failed: {str(e)}")
raise
def _handle_emit_call_events(self, response: Any, call_type: LLMCallType, from_task: Optional[Any] = None, from_agent: Optional[Any] = None):
def _handle_emit_call_events(self, response: Any, call_type: LLMCallType, from_task: Optional[Any] = None, from_agent: Optional[Any] = None, messages: str | list[dict[str, Any]] | None = None):
"""Handle the events for the LLM call.
Args:
response (str): The response from the LLM call.
call_type (str): The type of call, either "tool_call" or "llm_call".
from_task: Optional task object
from_agent: Optional agent object
messages: Optional messages object
"""
assert hasattr(crewai_event_bus, "emit")
crewai_event_bus.emit(
self,
event=LLMCallCompletedEvent(response=response, call_type=call_type, from_task=from_task, from_agent=from_agent),
event=LLMCallCompletedEvent(messages=messages, response=response, call_type=call_type, from_task=from_task, from_agent=from_agent),
)
def _format_messages_for_provider(
@@ -1054,6 +1072,15 @@ class LLM(BaseLLM):
messages.append({"role": "user", "content": "Please continue."})
return messages
# TODO: Remove this code after merging PR https://github.com/BerriAI/litellm/pull/10917
# Ollama doesn't supports last message to be 'assistant'
if "ollama" in self.model.lower() and messages and messages[-1]["role"] == "assistant":
messages = messages.copy()
messages.append(
{"role": "user", "content": ""}
)
return messages
# Handle Anthropic models
if not self.is_anthropic:
return messages

Some files were not shown because too many files have changed in this diff Show More