Compare commits

..

13 Commits

Author SHA1 Message Date
Devin AI
9fa65f724f Fix lint errors: sort imports
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-03-21 05:15:54 +00:00
Devin AI
13e1aa96de Fix type-checker errors in embedding_configurator.py and knowledge_storage.py
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-03-21 05:14:42 +00:00
Devin AI
7fb76bb858 Fix lint error: sort imports in test_numpy_compatibility.py
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-03-21 05:12:46 +00:00
Devin AI
486cf58c3b Fix NumPy 2.x compatibility issue (#2431)
Co-Authored-By: Joe Moura <joao@crewai.com>
2025-03-21 05:10:53 +00:00
Tony Kipkemboi
03f1d57463 Merge pull request #2430 from crewAIInc/update-llm-docs
docs: add documentation for Local NVIDIA NIM with WSL2
2025-03-20 12:57:37 -07:00
Tony Kipkemboi
4725d0de0d Merge branch 'main' into update-llm-docs 2025-03-20 12:50:06 -07:00
Arthur Chien
b766af75f2 fix the _extract_thought (#2398)
* fix the _extract_thought

the regex string should be same with prompt in en.json:129
...\nThought: I now know the final answer\nFinal Answer: the...

* fix Action match

---------

Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
2025-03-20 15:44:44 -04:00
Tony Kipkemboi
b2c8779f4c Add documentation for Local NVIDIA NIM with WSL2 2025-03-20 12:39:37 -07:00
Tony Kipkemboi
df266bda01 Update documentation: Add changelog, fix formatting issues, replace mint.json with docs.json (#2400) 2025-03-20 14:44:21 -04:00
Lorenze Jay
2155acb3a3 docs: Update JSONSearchTool and RagTool configuration parameter from 'embedder' to 'embedding_model' (#2311)
Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
2025-03-20 13:11:37 -04:00
Sir Qasim
794574957e Add note to create ./knowldge folder for source file management (#2297)
This update includes a note in the documentation instructing users to create a ./knowldge folder. All source files (such as .txt, .pdf, .xlsx, .json) should be placed in this folder for centralized management. This change aims to streamline file organization and improve accessibility across projects.

Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
2025-03-20 12:54:17 -04:00
Sir Qasim
66b19311a7 Fix crewai run Command Issue for Flow Projects and Cloud Deployment (#2291)
This PR addresses an issue with the crewai run command following the creation of a flow project. Previously, the update command interfered with execution, causing it not to work as expected. With these changes, the command now runs according to the instructions in the readme.md, and it also improves deployment support when using CrewAI Cloud.

Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
2025-03-20 12:48:02 -04:00
devin-ai-integration[bot]
9fc84fc1ac Fix incorrect import statement in memory examples documentation (fixes #2395) (#2396)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Joe Moura <joao@crewai.com>
Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
2025-03-20 12:17:26 -04:00
13 changed files with 932 additions and 438 deletions

187
docs/changelog.mdx Normal file
View File

@@ -0,0 +1,187 @@
---
title: Changelog
description: View the latest updates and changes to CrewAI
icon: timeline
---
<Update label="2024-03-17" description="v0.108.0">
**Features**
- Converted tabs to spaces in `crew.py` template
- Enhanced LLM Streaming Response Handling and Event System
- Included `model_name`
- Enhanced Event Listener with rich visualization and improved logging
- Added fingerprints
**Bug Fixes**
- Fixed Mistral issues
- Fixed a bug in documentation
- Fixed type check error in fingerprint property
**Documentation Updates**
- Improved tool documentation
- Updated installation guide for the `uv` tool package
- Added instructions for upgrading crewAI with the `uv` tool
- Added documentation for `ApifyActorsTool`
</Update>
<Update label="2024-03-10" description="v0.105.0">
**Core Improvements & Fixes**
- Fixed issues with missing template variables and user memory configuration
- Improved async flow support and addressed agent response formatting
- Enhanced memory reset functionality and fixed CLI memory commands
- Fixed type issues, tool calling properties, and telemetry decoupling
**New Features & Enhancements**
- Added Flow state export and improved state utilities
- Enhanced agent knowledge setup with optional crew embedder
- Introduced event emitter for better observability and LLM call tracking
- Added support for Python 3.10 and ChatOllama from langchain_ollama
- Integrated context window size support for the o3-mini model
- Added support for multiple router calls
**Documentation & Guides**
- Improved documentation layout and hierarchical structure
- Added QdrantVectorSearchTool guide and clarified event listener usage
- Fixed typos in prompts and updated Amazon Bedrock model listings
</Update>
<Update label="2024-02-12" description="v0.102.0">
**Core Improvements & Fixes**
- Enhanced LLM Support: Improved structured LLM output, parameter handling, and formatting for Anthropic models
- Crew & Agent Stability: Fixed issues with cloning agents/crews using knowledge sources, multiple task outputs in conditional tasks, and ignored Crew task callbacks
- Memory & Storage Fixes: Fixed short-term memory handling with Bedrock, ensured correct embedder initialization, and added a reset memories function in the crew class
- Training & Execution Reliability: Fixed broken training and interpolation issues with dict and list input types
**New Features & Enhancements**
- Advanced Knowledge Management: Improved naming conventions and enhanced embedding configuration with custom embedder support
- Expanded Logging & Observability: Added JSON format support for logging and integrated MLflow tracing documentation
- Data Handling Improvements: Updated excel_knowledge_source.py to process multi-tab files
- General Performance & Codebase Clean-Up: Streamlined enterprise code alignment and resolved linting issues
- Adding new tool: `QdrantVectorSearchTool`
**Documentation & Guides**
- Updated AI & Memory Docs: Improved Bedrock, Google AI, and long-term memory documentation
- Task & Workflow Clarity: Added "Human Input" row to Task Attributes, Langfuse guide, and FileWriterTool documentation
- Fixed Various Typos & Formatting Issues
</Update>
<Update label="2024-01-28" description="v0.100.0">
**Features**
- Add Composio docs
- Add SageMaker as a LLM provider
**Fixes**
- Overall LLM connection issues
- Using safe accessors on training
- Add version check to crew_chat.py
**Documentation**
- New docs for crewai chat
- Improve formatting and clarity in CLI and Composio Tool docs
</Update>
<Update label="2024-01-20" description="v0.98.0">
**Features**
- Conversation crew v1
- Add unique ID to flow states
- Add @persist decorator with FlowPersistence interface
**Integrations**
- Add SambaNova integration
- Add NVIDIA NIM provider in cli
- Introducing VoyageAI
**Fixes**
- Fix API Key Behavior and Entity Handling in Mem0 Integration
- Fixed core invoke loop logic and relevant tests
- Make tool inputs actual objects and not strings
- Add important missing parts to creating tools
- Drop litellm version to prevent windows issue
- Before kickoff if inputs are none
- Fixed typos, nested pydantic model issue, and docling issues
</Update>
<Update label="2024-01-04" description="v0.95.0">
**New Features**
- Adding Multimodal Abilities to Crew
- Programatic Guardrails
- HITL multiple rounds
- Gemini 2.0 Support
- CrewAI Flows Improvements
- Add Workflow Permissions
- Add support for langfuse with litellm
- Portkey Integration with CrewAI
- Add interpolate_only method and improve error handling
- Docling Support
- Weviate Support
**Fixes**
- output_file not respecting system path
- disk I/O error when resetting short-term memory
- CrewJSONEncoder now accepts enums
- Python max version
- Interpolation for output_file in Task
- Handle coworker role name case/whitespace properly
- Add tiktoken as explicit dependency and document Rust requirement
- Include agent knowledge in planning process
- Change storage initialization to None for KnowledgeStorage
- Fix optional storage checks
- include event emitter in flows
- Docstring, Error Handling, and Type Hints Improvements
- Suppressed userWarnings from litellm pydantic issues
</Update>
<Update label="2023-12-05" description="v0.86.0">
**Changes**
- Remove all references to pipeline and pipeline router
- Add Nvidia NIM as provider in Custom LLM
- Add knowledge demo + improve knowledge docs
- Add HITL multiple rounds of followup
- New docs about yaml crew with decorators
- Simplify template crew
</Update>
<Update label="2023-12-04" description="v0.85.0">
**Features**
- Added knowledge to agent level
- Feat/remove langchain
- Improve typed task outputs
- Log in to Tool Repository on crewai login
**Fixes**
- Fixes issues with result as answer not properly exiting LLM loop
- Fix missing key name when running with ollama provider
- Fix spelling issue found
**Documentation**
- Update readme for running mypy
- Add knowledge to mint.json
- Update Github actions
- Update Agents docs to include two approaches for creating an agent
- Improvements to LLM Configuration and Usage
</Update>
<Update label="2023-11-25" description="v0.83.0">
**New Features**
- New before_kickoff and after_kickoff crew callbacks
- Support to pre-seed agents with Knowledge
- Add support for retrieving user preferences and memories using Mem0
**Fixes**
- Fix Async Execution
- Upgrade chroma and adjust embedder function generator
- Update CLI Watson supported models + docs
- Reduce level for Bandit
- Fixing all tests
**Documentation**
- Update Docs
</Update>
<Update label="2023-11-13" description="v0.80.0">
**Fixes**
- Fixing Tokens callback replacement bug
- Fixing Step callback issue
- Add cached prompt tokens info on usage metrics
- Fix crew_train_success test
</Update>

View File

@@ -150,6 +150,8 @@ result = crew.kickoff(
Here are examples of how to use different types of knowledge sources:
Note: Please ensure that you create the ./knowldge folder. All source files (e.g., .txt, .pdf, .xlsx, .json) should be placed in this folder for centralized management.
### Text File Knowledge Source
```python
from crewai.knowledge.source.text_file_knowledge_source import TextFileKnowledgeSource

View File

@@ -270,7 +270,7 @@ In this section, you'll find detailed examples that help you select, configure,
| Claude 3.5 Haiku | Up to 200k tokens | Fast, compact multimodal model optimized for quick responses and seamless human-like interactions |
| Claude 3 Sonnet | Up to 200k tokens | Multimodal model balancing intelligence and speed for high-volume deployments. |
| Claude 3 Haiku | Up to 200k tokens | Compact, high-speed multimodal model optimized for quick responses and natural conversational interactions |
| Claude 3 Opus | Up to 200k tokens | Most advanced multimodal model excelling at complex tasks with human-like reasoning and superior contextual understanding. |
| Claude 3 Opus | Up to 200k tokens | Most advanced multimodal model exceling at complex tasks with human-like reasoning and superior contextual understanding. |
| Claude 2.1 | Up to 200k tokens | Enhanced version with expanded context window, improved reliability, and reduced hallucinations for long-form and RAG applications |
| Claude | Up to 100k tokens | Versatile model excelling in sophisticated dialogue, creative content, and precise instruction following. |
| Claude Instant | Up to 100k tokens | Fast, cost-effective model for everyday tasks like dialogue, analysis, summarization, and document Q&A |
@@ -406,6 +406,46 @@ In this section, you'll find detailed examples that help you select, configure,
| baichuan-inc/baichuan2-13b-chat | 4,096 tokens | Support Chinese and English chat, coding, math, instruction following, solving quizzes |
</Accordion>
<Accordion title="Local NVIDIA NIM Deployed using WSL2">
NVIDIA NIM enables you to run powerful LLMs locally on your Windows machine using WSL2 (Windows Subsystem for Linux).
This approach allows you to leverage your NVIDIA GPU for private, secure, and cost-effective AI inference without relying on cloud services.
Perfect for development, testing, or production scenarios where data privacy or offline capabilities are required.
Here is a step-by-step guide to setting up a local NVIDIA NIM model:
1. Follow installation instructions from [NVIDIA Website](https://docs.nvidia.com/nim/wsl2/latest/getting-started.html)
2. Install the local model. For Llama 3.1-8b follow [instructions](https://build.nvidia.com/meta/llama-3_1-8b-instruct/deploy)
3. Configure your crewai local models:
```python Code
from crewai.llm import LLM
local_nvidia_nim_llm = LLM(
model="openai/meta/llama-3.1-8b-instruct", # it's an openai-api compatible model
base_url="http://localhost:8000/v1",
api_key="<your_api_key|any text if you have not configured it>", # api_key is required, but you can use any text
)
# Then you can use it in your crew:
@CrewBase
class MyCrew():
# ...
@agent
def researcher(self) -> Agent:
return Agent(
config=self.agents_config['researcher'],
llm=local_nvidia_nim_llm
)
# ...
```
</Accordion>
<Accordion title="Groq">
Set the following environment variables in your `.env` file:
@@ -746,5 +786,5 @@ Learn how to get the most out of your LLM configuration:
<Tip>
Use larger context models for extensive tasks
</Tip>
```
</Tab>
</Tabs>

223
docs/docs.json Normal file
View File

@@ -0,0 +1,223 @@
{
"$schema": "https://mintlify.com/docs.json",
"theme": "palm",
"name": "CrewAI",
"colors": {
"primary": "#EB6658",
"light": "#F3A78B",
"dark": "#C94C3C"
},
"favicon": "favicon.svg",
"navigation": {
"tabs": [
{
"tab": "Get Started",
"groups": [
{
"group": "Get Started",
"pages": [
"introduction",
"installation",
"quickstart",
"changelog"
]
},
{
"group": "Guides",
"pages": [
{
"group": "Concepts",
"pages": [
"guides/concepts/evaluating-use-cases"
]
},
{
"group": "Agents",
"pages": [
"guides/agents/crafting-effective-agents"
]
},
{
"group": "Crews",
"pages": [
"guides/crews/first-crew"
]
},
{
"group": "Flows",
"pages": [
"guides/flows/first-flow",
"guides/flows/mastering-flow-state"
]
},
{
"group": "Advanced",
"pages": [
"guides/advanced/customizing-prompts",
"guides/advanced/fingerprinting"
]
}
]
},
{
"group": "Core Concepts",
"pages": [
"concepts/agents",
"concepts/tasks",
"concepts/crews",
"concepts/flows",
"concepts/knowledge",
"concepts/llms",
"concepts/processes",
"concepts/collaboration",
"concepts/training",
"concepts/memory",
"concepts/planning",
"concepts/testing",
"concepts/cli",
"concepts/tools",
"concepts/event-listener",
"concepts/langchain-tools",
"concepts/llamaindex-tools"
]
},
{
"group": "How to Guides",
"pages": [
"how-to/create-custom-tools",
"how-to/sequential-process",
"how-to/hierarchical-process",
"how-to/custom-manager-agent",
"how-to/llm-connections",
"how-to/customizing-agents",
"how-to/multimodal-agents",
"how-to/coding-agents",
"how-to/force-tool-output-as-result",
"how-to/human-input-on-execution",
"how-to/kickoff-async",
"how-to/kickoff-for-each",
"how-to/replay-tasks-from-latest-crew-kickoff",
"how-to/conditional-tasks",
"how-to/agentops-observability",
"how-to/langtrace-observability",
"how-to/mlflow-observability",
"how-to/openlit-observability",
"how-to/portkey-observability",
"how-to/langfuse-observability"
]
},
{
"group": "Tools",
"pages": [
"tools/aimindtool",
"tools/apifyactorstool",
"tools/bravesearchtool",
"tools/browserbaseloadtool",
"tools/codedocssearchtool",
"tools/codeinterpretertool",
"tools/composiotool",
"tools/csvsearchtool",
"tools/dalletool",
"tools/directorysearchtool",
"tools/directoryreadtool",
"tools/docxsearchtool",
"tools/exasearchtool",
"tools/filereadtool",
"tools/filewritetool",
"tools/firecrawlcrawlwebsitetool",
"tools/firecrawlscrapewebsitetool",
"tools/firecrawlsearchtool",
"tools/githubsearchtool",
"tools/hyperbrowserloadtool",
"tools/linkupsearchtool",
"tools/llamaindextool",
"tools/serperdevtool",
"tools/s3readertool",
"tools/s3writertool",
"tools/scrapegraphscrapetool",
"tools/scrapeelementfromwebsitetool",
"tools/jsonsearchtool",
"tools/mdxsearchtool",
"tools/mysqltool",
"tools/multiontool",
"tools/nl2sqltool",
"tools/patronustools",
"tools/pdfsearchtool",
"tools/pgsearchtool",
"tools/qdrantvectorsearchtool",
"tools/ragtool",
"tools/scrapewebsitetool",
"tools/scrapflyscrapetool",
"tools/seleniumscrapingtool",
"tools/snowflakesearchtool",
"tools/spidertool",
"tools/txtsearchtool",
"tools/visiontool",
"tools/weaviatevectorsearchtool",
"tools/websitesearchtool",
"tools/xmlsearchtool",
"tools/youtubechannelsearchtool",
"tools/youtubevideosearchtool"
]
},
{
"group": "Telemetry",
"pages": [
"telemetry"
]
}
]
},
{
"tab": "Examples",
"groups": [
{
"group": "Examples",
"pages": [
"examples/example"
]
}
]
}
],
"global": {
"anchors": [
{
"anchor": "Community",
"href": "https://community.crewai.com",
"icon": "discourse"
}
]
}
},
"logo": {
"light": "crew_only_logo.png",
"dark": "crew_only_logo.png"
},
"appearance": {
"default": "dark",
"strict": false
},
"navbar": {
"primary": {
"type": "github",
"href": "https://github.com/crewAIInc/crewAI"
}
},
"search": {
"prompt": "Search CrewAI docs"
},
"seo": {
"indexing": "navigable"
},
"footer": {
"socials": {
"website": "https://crewai.com",
"x": "https://x.com/crewAIInc",
"github": "https://github.com/crewAIInc/crewAI",
"linkedin": "https://www.linkedin.com/company/crewai-inc",
"youtube": "https://youtube.com/@crewAIInc",
"reddit": "https://www.reddit.com/r/crewAIInc/"
}
}
}

View File

@@ -1,225 +0,0 @@
{
"name": "CrewAI",
"theme": "venus",
"logo": {
"dark": "crew_only_logo.png",
"light": "crew_only_logo.png"
},
"favicon": "favicon.svg",
"colors": {
"primary": "#EB6658",
"light": "#F3A78B",
"dark": "#C94C3C",
"anchors": {
"from": "#737373",
"to": "#EB6658"
}
},
"seo": {
"indexHiddenPages": false
},
"modeToggle": {
"default": "dark",
"isHidden": false
},
"feedback": {
"suggestEdit": true,
"raiseIssue": true,
"thumbsRating": true
},
"topbarCtaButton": {
"type": "github",
"url": "https://github.com/crewAIInc/crewAI"
},
"primaryTab": {
"name": "Get Started"
},
"tabs": [
{
"name": "Examples",
"url": "examples"
}
],
"anchors": [
{
"name": "Community",
"icon": "discourse",
"url": "https://community.crewai.com"
},
{
"name": "Changelog",
"icon": "timeline",
"url": "https://github.com/crewAIInc/crewAI/releases"
}
],
"navigation": [
{
"group": "Get Started",
"pages": [
"introduction",
"installation",
"quickstart"
]
},
{
"group": "Guides",
"pages": [
{
"group": "Concepts",
"pages": [
"guides/concepts/evaluating-use-cases"
]
},
{
"group": "Agents",
"pages": [
"guides/agents/crafting-effective-agents"
]
},
{
"group": "Crews",
"pages": [
"guides/crews/first-crew"
]
},
{
"group": "Flows",
"pages": [
"guides/flows/first-flow",
"guides/flows/mastering-flow-state"
]
},
{
"group": "Advanced",
"pages": [
"guides/advanced/customizing-prompts",
"guides/advanced/fingerprinting"
]
}
]
},
{
"group": "Core Concepts",
"pages": [
"concepts/agents",
"concepts/tasks",
"concepts/crews",
"concepts/flows",
"concepts/knowledge",
"concepts/llms",
"concepts/processes",
"concepts/collaboration",
"concepts/training",
"concepts/memory",
"concepts/planning",
"concepts/testing",
"concepts/cli",
"concepts/tools",
"concepts/event-listener",
"concepts/langchain-tools",
"concepts/llamaindex-tools"
]
},
{
"group": "How to Guides",
"pages": [
"how-to/create-custom-tools",
"how-to/sequential-process",
"how-to/hierarchical-process",
"how-to/custom-manager-agent",
"how-to/llm-connections",
"how-to/customizing-agents",
"how-to/multimodal-agents",
"how-to/coding-agents",
"how-to/force-tool-output-as-result",
"how-to/human-input-on-execution",
"how-to/kickoff-async",
"how-to/kickoff-for-each",
"how-to/replay-tasks-from-latest-crew-kickoff",
"how-to/conditional-tasks",
"how-to/agentops-observability",
"how-to/langtrace-observability",
"how-to/mlflow-observability",
"how-to/openlit-observability",
"how-to/portkey-observability",
"how-to/langfuse-observability"
]
},
{
"group": "Examples",
"pages": [
"examples/example"
]
},
{
"group": "Tools",
"pages": [
"tools/aimindtool",
"tools/apifyactorstool",
"tools/bravesearchtool",
"tools/browserbaseloadtool",
"tools/codedocssearchtool",
"tools/codeinterpretertool",
"tools/composiotool",
"tools/csvsearchtool",
"tools/dalletool",
"tools/directorysearchtool",
"tools/directoryreadtool",
"tools/docxsearchtool",
"tools/exasearchtool",
"tools/filereadtool",
"tools/filewritetool",
"tools/firecrawlcrawlwebsitetool",
"tools/firecrawlscrapewebsitetool",
"tools/firecrawlsearchtool",
"tools/githubsearchtool",
"tools/hyperbrowserloadtool",
"tools/linkupsearchtool",
"tools/llamaindextool",
"tools/serperdevtool",
"tools/s3readertool",
"tools/s3writertool",
"tools/scrapegraphscrapetool",
"tools/scrapeelementfromwebsitetool",
"tools/jsonsearchtool",
"tools/mdxsearchtool",
"tools/mysqltool",
"tools/multiontool",
"tools/nl2sqltool",
"tools/patronustools",
"tools/pdfsearchtool",
"tools/pgsearchtool",
"tools/qdrantvectorsearchtool",
"tools/ragtool",
"tools/scrapewebsitetool",
"tools/scrapflyscrapetool",
"tools/seleniumscrapingtool",
"tools/snowflakesearchtool",
"tools/spidertool",
"tools/txtsearchtool",
"tools/visiontool",
"tools/weaviatevectorsearchtool",
"tools/websitesearchtool",
"tools/xmlsearchtool",
"tools/youtubechannelsearchtool",
"tools/youtubevideosearchtool"
]
},
{
"group": "Telemetry",
"pages": [
"telemetry"
]
}
],
"search": {
"prompt": "Search CrewAI docs"
},
"footerSocials": {
"website": "https://crewai.com",
"x": "https://x.com/crewAIInc",
"github": "https://github.com/crewAIInc/crewAI",
"linkedin": "https://www.linkedin.com/company/crewai-inc",
"youtube": "https://youtube.com/@crewAIInc"
}
}

View File

@@ -7,8 +7,10 @@ icon: file-code
# `JSONSearchTool`
<Note>
The JSONSearchTool is currently in an experimental phase. This means the tool is under active development, and users might encounter unexpected behavior or changes.
We highly encourage feedback on any issues or suggestions for improvements.
The JSONSearchTool is currently in an experimental phase. This means the tool
is under active development, and users might encounter unexpected behavior or
changes. We highly encourage feedback on any issues or suggestions for
improvements.
</Note>
## Description
@@ -60,7 +62,7 @@ tool = JSONSearchTool(
# stream=true,
},
},
"embedder": {
"embedding_model": {
"provider": "google", # or openai, ollama, ...
"config": {
"model": "models/embedding-001",
@@ -70,4 +72,4 @@ tool = JSONSearchTool(
},
}
)
```
```

View File

@@ -8,8 +8,8 @@ icon: vector-square
## Description
The `RagTool` is designed to answer questions by leveraging the power of Retrieval-Augmented Generation (RAG) through EmbedChain.
It provides a dynamic knowledge base that can be queried to retrieve relevant information from various data sources.
The `RagTool` is designed to answer questions by leveraging the power of Retrieval-Augmented Generation (RAG) through EmbedChain.
It provides a dynamic knowledge base that can be queried to retrieve relevant information from various data sources.
This tool is particularly useful for applications that require access to a vast array of information and need to provide contextually relevant answers.
## Example
@@ -138,7 +138,7 @@ config = {
"model": "gpt-4",
}
},
"embedder": {
"embedding_model": {
"provider": "openai",
"config": {
"model": "text-embedding-ada-002"
@@ -151,4 +151,4 @@ rag_tool = RagTool(config=config, summarize=True)
## Conclusion
The `RagTool` provides a powerful way to create and query knowledge bases from various data sources. By leveraging Retrieval-Augmented Generation, it enables agents to access and retrieve relevant information efficiently, enhancing their ability to provide accurate and contextually appropriate responses.
The `RagTool` provides a powerful way to create and query knowledge bases from various data sources. By leveraging Retrieval-Augmented Generation, it enables agents to access and retrieve relevant information efficiently, enhancing their ability to provide accurate and contextually appropriate responses.

View File

@@ -124,9 +124,9 @@ class CrewAgentParser:
)
def _extract_thought(self, text: str) -> str:
thought_index = text.find("\n\nAction")
thought_index = text.find("\nAction")
if thought_index == -1:
thought_index = text.find("\n\nFinal Answer")
thought_index = text.find("\nFinal Answer")
if thought_index == -1:
return ""
thought = text[:thought_index].strip()

View File

@@ -10,6 +10,7 @@ dependencies = [
[project.scripts]
kickoff = "{{folder_name}}.main:kickoff"
run_crew = "{{folder_name}}.main:kickoff"
plot = "{{folder_name}}.main:plot"
[build-system]

View File

@@ -4,13 +4,34 @@ import io
import logging
import os
import shutil
import warnings
from typing import Any, Dict, List, Optional, Union, cast
import chromadb
import chromadb.errors
from chromadb.api import ClientAPI
from chromadb.api.types import OneOrMany
from chromadb.config import Settings
# Initialize module import status
CHROMADB_AVAILABLE = False
# Define placeholder types
class DummyClientAPI:
pass
class DummySettings:
pass
# Try to import chromadb-related modules with proper error handling
try:
import chromadb
import chromadb.errors
from chromadb.api import ClientAPI
from chromadb.api.types import OneOrMany
from chromadb.config import Settings
CHROMADB_AVAILABLE = True
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import chromadb: {str(e)}. Knowledge functionality will be limited.")
# Use dummy classes when imports fail
chromadb = None
ClientAPI = DummyClientAPI
OneOrMany = Any
Settings = DummySettings
from crewai.knowledge.storage.base_knowledge_storage import BaseKnowledgeStorage
from crewai.utilities import EmbeddingConfigurator
@@ -42,9 +63,9 @@ class KnowledgeStorage(BaseKnowledgeStorage):
search efficiency.
"""
collection: Optional[chromadb.Collection] = None
collection = None # Type annotation removed to handle case when chromadb is not available
collection_name: Optional[str] = "knowledge"
app: Optional[ClientAPI] = None
app = None # Type annotation removed to handle case when chromadb is not available
def __init__(
self,
@@ -61,37 +82,52 @@ class KnowledgeStorage(BaseKnowledgeStorage):
filter: Optional[dict] = None,
score_threshold: float = 0.35,
) -> List[Dict[str, Any]]:
if not CHROMADB_AVAILABLE:
logging.warning("Cannot search knowledge as chromadb is not available.")
return []
with suppress_logging():
if self.collection:
fetched = self.collection.query(
query_texts=query,
n_results=limit,
where=filter,
)
results = []
for i in range(len(fetched["ids"][0])): # type: ignore
result = {
"id": fetched["ids"][0][i], # type: ignore
"metadata": fetched["metadatas"][0][i], # type: ignore
"context": fetched["documents"][0][i], # type: ignore
"score": fetched["distances"][0][i], # type: ignore
}
if result["score"] >= score_threshold:
results.append(result)
return results
try:
fetched = self.collection.query(
query_texts=query,
n_results=limit,
where=filter,
)
results = []
for i in range(len(fetched["ids"][0])): # type: ignore
result = {
"id": fetched["ids"][0][i], # type: ignore
"metadata": fetched["metadatas"][0][i], # type: ignore
"context": fetched["documents"][0][i], # type: ignore
"score": fetched["distances"][0][i], # type: ignore
}
if result["score"] >= score_threshold:
results.append(result)
return results
except Exception as e:
logging.error(f"Error during knowledge search: {str(e)}")
return []
else:
raise Exception("Collection not initialized")
logging.warning("Collection not initialized")
return []
def initialize_knowledge_storage(self):
base_path = os.path.join(db_storage_path(), "knowledge")
chroma_client = chromadb.PersistentClient(
path=base_path,
settings=Settings(allow_reset=True),
)
self.app = chroma_client
if not CHROMADB_AVAILABLE:
logging.warning("Cannot initialize knowledge storage as chromadb is not available.")
self.app = None
self.collection = None
return
try:
base_path = os.path.join(db_storage_path(), "knowledge")
chroma_client = chromadb.PersistentClient(
path=base_path,
settings=Settings(allow_reset=True),
)
self.app = chroma_client
collection_name = (
f"knowledge_{self.collection_name}"
if self.collection_name
@@ -102,30 +138,46 @@ class KnowledgeStorage(BaseKnowledgeStorage):
name=collection_name, embedding_function=self.embedder
)
else:
raise Exception("Vector Database Client not initialized")
except Exception:
raise Exception("Failed to create or get collection")
logging.warning("Vector Database Client not initialized")
self.collection = None
except Exception as e:
logging.error(f"Failed to create or get collection: {str(e)}")
self.app = None
self.collection = None
def reset(self):
base_path = os.path.join(db_storage_path(), KNOWLEDGE_DIRECTORY)
if not self.app:
self.app = chromadb.PersistentClient(
path=base_path,
settings=Settings(allow_reset=True),
)
if not CHROMADB_AVAILABLE:
logging.warning("Cannot reset knowledge storage as chromadb is not available.")
return
try:
base_path = os.path.join(db_storage_path(), KNOWLEDGE_DIRECTORY)
if not self.app:
self.app = chromadb.PersistentClient(
path=base_path,
settings=Settings(allow_reset=True),
)
self.app.reset()
shutil.rmtree(base_path)
self.app = None
self.collection = None
self.app.reset()
shutil.rmtree(base_path)
except Exception as e:
logging.error(f"Error during knowledge reset: {str(e)}")
finally:
self.app = None
self.collection = None
def save(
self,
documents: List[str],
metadata: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None,
):
if not CHROMADB_AVAILABLE:
logging.warning("Cannot save to knowledge storage as chromadb is not available.")
return
if not self.collection:
raise Exception("Collection not initialized")
logging.warning("Collection not initialized")
return
try:
# Create a dictionary to store unique documents
@@ -154,38 +206,46 @@ class KnowledgeStorage(BaseKnowledgeStorage):
filtered_ids.append(doc_id)
# If we have no metadata at all, set it to None
final_metadata: Optional[OneOrMany[chromadb.Metadata]] = (
None if all(m is None for m in filtered_metadata) else filtered_metadata
)
final_metadata = None
if not all(m is None for m in filtered_metadata):
final_metadata = filtered_metadata
self.collection.upsert(
documents=filtered_docs,
metadatas=final_metadata,
ids=filtered_ids,
)
except chromadb.errors.InvalidDimensionException as e:
Logger(verbose=True).log(
"error",
"Embedding dimension mismatch. This usually happens when mixing different embedding models. Try resetting the collection using `crewai reset-memories -a`",
"red",
)
raise ValueError(
"Embedding dimension mismatch. Make sure you're using the same embedding model "
"across all operations with this collection."
"Try resetting the collection using `crewai reset-memories -a`"
) from e
except Exception as e:
Logger(verbose=True).log("error", f"Failed to upsert documents: {e}", "red")
raise
if hasattr(chromadb, 'errors') and isinstance(e, chromadb.errors.InvalidDimensionException):
Logger(verbose=True).log(
"error",
"Embedding dimension mismatch. This usually happens when mixing different embedding models. Try resetting the collection using `crewai reset-memories -a`",
"red",
)
logging.error(
"Embedding dimension mismatch. Make sure you're using the same embedding model "
"across all operations with this collection."
"Try resetting the collection using `crewai reset-memories -a`"
)
else:
Logger(verbose=True).log("error", f"Failed to upsert documents: {e}", "red")
logging.error(f"Failed to upsert documents: {e}")
def _create_default_embedding_function(self):
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return None
try:
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
return OpenAIEmbeddingFunction(
api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"
)
return OpenAIEmbeddingFunction(
api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"
)
except (ImportError, AttributeError) as e:
logging.warning(f"Failed to create default embedding function: {str(e)}")
return None
def _set_embedder_config(self, embedder: Optional[Dict[str, Any]] = None) -> None:
"""Set the embedding configuration for the knowledge storage.
@@ -194,8 +254,12 @@ class KnowledgeStorage(BaseKnowledgeStorage):
embedder_config (Optional[Dict[str, Any]]): Configuration dictionary for the embedder.
If None or empty, defaults to the default embedding function.
"""
self.embedder = (
EmbeddingConfigurator().configure_embedder(embedder)
if embedder
else self._create_default_embedding_function()
)
try:
self.embedder = (
EmbeddingConfigurator().configure_embedder(embedder)
if embedder
else self._create_default_embedding_function()
)
except Exception as e:
logging.warning(f"Failed to configure embedder: {str(e)}")
self.embedder = None

View File

@@ -60,26 +60,32 @@ class RAGStorage(BaseRAGStorage):
self.embedder_config = configurator.configure_embedder(self.embedder_config)
def _initialize_app(self):
import chromadb
from chromadb.config import Settings
self._set_embedder_config()
chroma_client = chromadb.PersistentClient(
path=self.path if self.path else self.storage_file_name,
settings=Settings(allow_reset=self.allow_reset),
)
self.app = chroma_client
try:
self.collection = self.app.get_collection(
name=self.type, embedding_function=self.embedder_config
)
except Exception:
self.collection = self.app.create_collection(
name=self.type, embedding_function=self.embedder_config
import chromadb
from chromadb.config import Settings
self._set_embedder_config()
chroma_client = chromadb.PersistentClient(
path=self.path if self.path else self.storage_file_name,
settings=Settings(allow_reset=self.allow_reset),
)
self.app = chroma_client
try:
self.collection = self.app.get_collection(
name=self.type, embedding_function=self.embedder_config
)
except Exception:
self.collection = self.app.create_collection(
name=self.type, embedding_function=self.embedder_config
)
except (ImportError, AttributeError) as e:
import logging
logging.warning(f"Failed to initialize chromadb: {str(e)}. Memory functionality will be limited.")
self.app = None
self.collection = None
def _sanitize_role(self, role: str) -> str:
"""
Sanitizes agent roles to ensure valid directory names.
@@ -103,6 +109,9 @@ class RAGStorage(BaseRAGStorage):
def save(self, value: Any, metadata: Dict[str, Any]) -> None:
if not hasattr(self, "app") or not hasattr(self, "collection"):
self._initialize_app()
if self.app is None or self.collection is None:
logging.warning("Cannot save to memory as chromadb is not available.")
return
try:
self._generate_embedding(value, metadata)
except Exception as e:
@@ -115,8 +124,12 @@ class RAGStorage(BaseRAGStorage):
filter: Optional[dict] = None,
score_threshold: float = 0.35,
) -> List[Any]:
if not hasattr(self, "app"):
if not hasattr(self, "app") or not hasattr(self, "collection"):
self._initialize_app()
if self.app is None or self.collection is None:
logging.warning("Cannot search memory as chromadb is not available.")
return []
try:
with suppress_logging():
@@ -141,6 +154,10 @@ class RAGStorage(BaseRAGStorage):
def _generate_embedding(self, text: str, metadata: Dict[str, Any]) -> None: # type: ignore
if not hasattr(self, "app") or not hasattr(self, "collection"):
self._initialize_app()
if self.app is None or self.collection is None:
logging.warning("Cannot generate embeddings as chromadb is not available.")
return
self.collection.add(
documents=[text],
@@ -160,15 +177,7 @@ class RAGStorage(BaseRAGStorage):
# Ignore this specific error
pass
else:
raise Exception(
f"An error occurred while resetting the {self.type} memory: {e}"
)
def _create_default_embedding_function(self):
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
return OpenAIEmbeddingFunction(
api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"
)
logging.error(f"An error occurred while resetting the {self.type} memory: {e}")
# Don't raise exception to prevent crashes
self.app = None
self.collection = None

View File

@@ -1,8 +1,40 @@
import os
from typing import Any, Dict, Optional, cast
import warnings
from typing import Any, Callable, Dict, List, Optional, Union, cast
from chromadb import Documents, EmbeddingFunction, Embeddings
from chromadb.api.types import validate_embedding_function
# Initialize with None to indicate module import status
CHROMADB_AVAILABLE = False
# Define placeholder types for when chromadb is not available
class EmbeddingFunction:
def __call__(self, texts):
raise NotImplementedError("Chromadb is not available")
Documents = List[str]
Embeddings = List[List[float]]
def validate_embedding_function(func):
return func
# Try to import chromadb-related modules with proper error handling
try:
from chromadb.api.types import Documents as ChromaDocuments
from chromadb.api.types import EmbeddingFunction as ChromaEmbeddingFunction
from chromadb.api.types import Embeddings as ChromaEmbeddings
from chromadb.utils import (
validate_embedding_function as chroma_validate_embedding_function,
)
# Override our placeholder types with the real ones
Documents = ChromaDocuments
EmbeddingFunction = ChromaEmbeddingFunction
Embeddings = ChromaEmbeddings
validate_embedding_function = chroma_validate_embedding_function
CHROMADB_AVAILABLE = True
except (ImportError, AttributeError) as e:
# This captures both ImportError and AttributeError (which can happen with NumPy 2.x)
warnings.warn(f"Failed to import chromadb: {str(e)}. Embedding functionality will be limited.")
class EmbeddingConfigurator:
@@ -26,6 +58,9 @@ class EmbeddingConfigurator:
embedder_config: Optional[Dict[str, Any]] = None,
) -> EmbeddingFunction:
"""Configures and returns an embedding function based on the provided config."""
if not CHROMADB_AVAILABLE:
return self._create_unavailable_embedding_function()
if embedder_config is None:
return self._create_default_embedding_function()
@@ -44,143 +79,230 @@ class EmbeddingConfigurator:
if provider == "custom"
else embedding_function(config, model_name)
)
@staticmethod
def _create_unavailable_embedding_function():
"""Creates a fallback embedding function when chromadb is not available."""
class UnavailableEmbeddingFunction(EmbeddingFunction):
def __call__(self, input):
raise ImportError(
"Chromadb is not available due to NumPy compatibility issues. "
"Either downgrade to NumPy<2 or upgrade chromadb and related dependencies."
)
return UnavailableEmbeddingFunction()
@staticmethod
def _create_default_embedding_function():
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
return OpenAIEmbeddingFunction(
api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"
)
return OpenAIEmbeddingFunction(
api_key=os.getenv("OPENAI_API_KEY"), model_name="text-embedding-3-small"
)
except (ImportError, AttributeError) as e:
import warnings
warnings.warn(f"Failed to import OpenAIEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_openai(config, model_name):
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
return OpenAIEmbeddingFunction(
api_key=config.get("api_key") or os.getenv("OPENAI_API_KEY"),
model_name=model_name,
api_base=config.get("api_base", None),
api_type=config.get("api_type", None),
api_version=config.get("api_version", None),
default_headers=config.get("default_headers", None),
dimensions=config.get("dimensions", None),
deployment_id=config.get("deployment_id", None),
organization_id=config.get("organization_id", None),
)
return OpenAIEmbeddingFunction(
api_key=config.get("api_key") or os.getenv("OPENAI_API_KEY"),
model_name=model_name,
api_base=config.get("api_base", None),
api_type=config.get("api_type", None),
api_version=config.get("api_version", None),
default_headers=config.get("default_headers", None),
dimensions=config.get("dimensions", None),
deployment_id=config.get("deployment_id", None),
organization_id=config.get("organization_id", None),
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import OpenAIEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_azure(config, model_name):
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.openai_embedding_function import (
OpenAIEmbeddingFunction,
)
return OpenAIEmbeddingFunction(
api_key=config.get("api_key"),
api_base=config.get("api_base"),
api_type=config.get("api_type", "azure"),
api_version=config.get("api_version"),
model_name=model_name,
default_headers=config.get("default_headers"),
dimensions=config.get("dimensions"),
deployment_id=config.get("deployment_id"),
organization_id=config.get("organization_id"),
)
return OpenAIEmbeddingFunction(
api_key=config.get("api_key"),
api_base=config.get("api_base"),
api_type=config.get("api_type", "azure"),
api_version=config.get("api_version"),
model_name=model_name,
default_headers=config.get("default_headers"),
dimensions=config.get("dimensions"),
deployment_id=config.get("deployment_id"),
organization_id=config.get("organization_id"),
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import OpenAIEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_ollama(config, model_name):
from chromadb.utils.embedding_functions.ollama_embedding_function import (
OllamaEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.ollama_embedding_function import (
OllamaEmbeddingFunction,
)
return OllamaEmbeddingFunction(
url=config.get("url", "http://localhost:11434/api/embeddings"),
model_name=model_name,
)
return OllamaEmbeddingFunction(
url=config.get("url", "http://localhost:11434/api/embeddings"),
model_name=model_name,
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import OllamaEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_vertexai(config, model_name):
from chromadb.utils.embedding_functions.google_embedding_function import (
GoogleVertexEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.google_embedding_function import (
GoogleVertexEmbeddingFunction,
)
return GoogleVertexEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
project_id=config.get("project_id"),
region=config.get("region"),
)
return GoogleVertexEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
project_id=config.get("project_id"),
region=config.get("region"),
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import GoogleVertexEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_google(config, model_name):
from chromadb.utils.embedding_functions.google_embedding_function import (
GoogleGenerativeAiEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.google_embedding_function import (
GoogleGenerativeAiEmbeddingFunction,
)
return GoogleGenerativeAiEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
task_type=config.get("task_type"),
)
return GoogleGenerativeAiEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
task_type=config.get("task_type"),
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import GoogleGenerativeAiEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_cohere(config, model_name):
from chromadb.utils.embedding_functions.cohere_embedding_function import (
CohereEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.cohere_embedding_function import (
CohereEmbeddingFunction,
)
return CohereEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
)
return CohereEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import CohereEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_voyageai(config, model_name):
from chromadb.utils.embedding_functions.voyageai_embedding_function import (
VoyageAIEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.voyageai_embedding_function import (
VoyageAIEmbeddingFunction,
)
return VoyageAIEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
)
return VoyageAIEmbeddingFunction(
model_name=model_name,
api_key=config.get("api_key"),
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import VoyageAIEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_bedrock(config, model_name):
from chromadb.utils.embedding_functions.amazon_bedrock_embedding_function import (
AmazonBedrockEmbeddingFunction,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.amazon_bedrock_embedding_function import (
AmazonBedrockEmbeddingFunction,
)
# Allow custom model_name override with backwards compatibility
kwargs = {"session": config.get("session")}
if model_name is not None:
kwargs["model_name"] = model_name
return AmazonBedrockEmbeddingFunction(**kwargs)
# Allow custom model_name override with backwards compatibility
kwargs = {"session": config.get("session")}
if model_name is not None:
kwargs["model_name"] = model_name
return AmazonBedrockEmbeddingFunction(**kwargs)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import AmazonBedrockEmbeddingFunction: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_huggingface(config, model_name):
from chromadb.utils.embedding_functions.huggingface_embedding_function import (
HuggingFaceEmbeddingServer,
)
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
from chromadb.utils.embedding_functions.huggingface_embedding_function import (
HuggingFaceEmbeddingServer,
)
return HuggingFaceEmbeddingServer(
url=config.get("api_url"),
)
return HuggingFaceEmbeddingServer(
url=config.get("api_url"),
)
except (ImportError, AttributeError) as e:
warnings.warn(f"Failed to import HuggingFaceEmbeddingServer: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
@staticmethod
def _configure_watson(config, model_name):
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
try:
import ibm_watsonx_ai.foundation_models as watson_models
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames as EmbedParams
except ImportError as e:
raise ImportError(
warnings.warn(
"IBM Watson dependencies are not installed. Please install them to use Watson embedding."
) from e
)
return EmbeddingConfigurator._create_unavailable_embedding_function()
class WatsonEmbeddingFunction(EmbeddingFunction):
def __call__(self, input: Documents) -> Embeddings:
@@ -212,25 +334,30 @@ class EmbeddingConfigurator:
@staticmethod
def _configure_custom(config):
if not CHROMADB_AVAILABLE:
return EmbeddingConfigurator._create_unavailable_embedding_function()
custom_embedder = config.get("embedder")
if isinstance(custom_embedder, EmbeddingFunction):
try:
validate_embedding_function(custom_embedder)
return custom_embedder
except Exception as e:
raise ValueError(f"Invalid custom embedding function: {str(e)}")
warnings.warn(f"Invalid custom embedding function: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
elif callable(custom_embedder):
try:
instance = custom_embedder()
if isinstance(instance, EmbeddingFunction):
validate_embedding_function(instance)
return instance
raise ValueError(
"Custom embedder does not create an EmbeddingFunction instance"
)
warnings.warn("Custom embedder does not create an EmbeddingFunction instance")
return EmbeddingConfigurator._create_unavailable_embedding_function()
except Exception as e:
raise ValueError(f"Error instantiating custom embedder: {str(e)}")
warnings.warn(f"Error instantiating custom embedder: {str(e)}")
return EmbeddingConfigurator._create_unavailable_embedding_function()
else:
raise ValueError(
warnings.warn(
"Custom embedder must be an instance of `EmbeddingFunction` or a callable that creates one"
)
return EmbeddingConfigurator._create_unavailable_embedding_function()

View File

@@ -0,0 +1,64 @@
import importlib
import sys
import warnings
import pytest
def test_crew_import_with_numpy():
"""Test that crewai can be imported even with NumPy compatibility issues."""
try:
# Force reload to ensure we test our fix
if "crewai" in sys.modules:
importlib.reload(sys.modules["crewai"])
# This should not raise an exception
from crewai import Crew
assert Crew is not None
except Exception as e:
pytest.fail(f"Failed to import Crew: {e}")
def test_embedding_configurator_with_numpy():
"""Test that EmbeddingConfigurator can be imported with NumPy."""
try:
# Force reload
if "crewai.utilities.embedding_configurator" in sys.modules:
importlib.reload(sys.modules["crewai.utilities.embedding_configurator"])
from crewai.utilities.embedding_configurator import EmbeddingConfigurator
configurator = EmbeddingConfigurator()
# Test that we can create an embedder (might be unavailable but shouldn't crash)
embedder = configurator.configure_embedder()
assert embedder is not None
except Exception as e:
pytest.fail(f"Failed to use EmbeddingConfigurator: {e}")
def test_rag_storage_with_numpy():
"""Test that RAGStorage can be imported and used with NumPy."""
try:
# Force reload
if "crewai.memory.storage.rag_storage" in sys.modules:
importlib.reload(sys.modules["crewai.memory.storage.rag_storage"])
from crewai.memory.storage.rag_storage import RAGStorage
# Initialize with minimal config to avoid actual DB operations
storage = RAGStorage(type="test", crew=None)
# Just verify we can create the object without errors
assert storage is not None
except Exception as e:
pytest.fail(f"Failed to use RAGStorage: {e}")
def test_knowledge_storage_with_numpy():
"""Test that KnowledgeStorage can be imported and used with NumPy."""
try:
# Force reload
if "crewai.knowledge.storage.knowledge_storage" in sys.modules:
importlib.reload(sys.modules["crewai.knowledge.storage.knowledge_storage"])
from crewai.knowledge.storage.knowledge_storage import KnowledgeStorage
# Initialize with minimal config
storage = KnowledgeStorage()
# Just verify we can create the object without errors
assert storage is not None
except Exception as e:
pytest.fail(f"Failed to use KnowledgeStorage: {e}")