---
title: Knowledge
description: Understand what knowledge is in CrewAI and how to effectively use it.
icon: book
---

# Using Knowledge in CrewAI

## What is Knowledge?

Knowledge in CrewAI lets agents consult external information sources while they work on a task. Think of it as giving your agents a reference library they can look things up in.

Key benefits of using Knowledge:

- Enhance agents with domain-specific information
- Support decisions with real-world data
- Maintain context across conversations
- Ground responses in factual information

## Supported Knowledge Sources

CrewAI supports the following knowledge sources out of the box:

- Raw strings
- Text files (.txt)
- PDF documents
- CSV files
- Excel spreadsheets
- JSON documents

## Quick Start

Here's a simple example using string-based knowledge:

```python
from crewai import Agent, Task, Crew
from crewai.knowledge import StringKnowledgeSource

# 1. Create a knowledge source
product_info = StringKnowledgeSource(
    content="""Our product X1000 has the following features:
    - 10-hour battery life
    - Water-resistant
    - Available in black and silver
    Price: $299.99""",
    metadata={"category": "product"}
)

# 2. Create an agent with knowledge
sales_agent = Agent(
    role="Sales Representative",
    goal="Accurately answer customer questions about products",
    backstory="Expert in product features and customer service",
    knowledge_sources=[product_info]  # Attach knowledge to agent
)

# 3. Create a task
answer_task = Task(
    description="Answer: What colors is the X1000 available in and how much does it cost?",
    agent=sales_agent
)

# 4. Create and run the crew
crew = Crew(
    agents=[sales_agent],
    tasks=[answer_task]
)

result = crew.kickoff()
```

## Knowledge Configuration

### Collection Names

Knowledge sources can be grouped into named collections, which keeps related content together and easier to manage:

```python
from crewai.knowledge import StringKnowledgeSource

# Create knowledge sources with specific collections
tech_specs = StringKnowledgeSource(
    content="Technical specifications...",
    collection_name="product_tech_specs"
)

pricing_info = StringKnowledgeSource(
    content="Pricing information...",
    collection_name="product_pricing"
)
```

### Metadata and Filtering

Add metadata to organize and filter knowledge:

```python
knowledge_source = StringKnowledgeSource(
    content="Product details...",
    metadata={
        "category": "electronics",
        "product_line": "premium",
        "last_updated": "2024-03"
    }
)
```

### Chunking Configuration

Control how your content is split into chunks before it is embedded:

```python
# Import path assumed to mirror StringKnowledgeSource in the Quick Start
from crewai.knowledge import PDFKnowledgeSource

knowledge_source = PDFKnowledgeSource(
    file_path="product_manual.pdf",
    chunk_size=2000,     # Characters per chunk
    chunk_overlap=200    # Characters shared between consecutive chunks
)
```

## Advanced Usage

### Custom Knowledge Sources

Create your own knowledge source by extending the base class:

```python
import requests

from crewai.knowledge.source import BaseKnowledgeSource

class APIKnowledgeSource(BaseKnowledgeSource):
    def __init__(self, api_endpoint: str, **kwargs):
        super().__init__(**kwargs)
        self.api_endpoint = api_endpoint

    def load_content(self):
        # Fetch JSON data from the API and fail loudly on HTTP errors
        response = requests.get(self.api_endpoint, timeout=30)
        response.raise_for_status()
        return response.json()

    def add(self):
        content = self.load_content()
        # Process `content` (for example, chunk it) before storing
        self.save_documents({"source": "api"})
```

### Embedder Configuration

Customize the embedding process:

```python
from crewai import Crew

# `agent`, `task`, and `source` are created as in the earlier examples
crew = Crew(
    agents=[agent],
    tasks=[task],
    knowledge_sources=[source],
    embedder_config={
        "model": "BAAI/bge-small-en-v1.5",
        "normalize": True,
        "max_length": 512
    }
)
```

## Best Practices

- Use meaningful collection names
- Add detailed metadata for filtering
- Keep chunk sizes appropriate for your content
- Consider content overlap for context preservation
- Use smaller chunk sizes for precise retrieval
- Implement metadata filtering for faster searches
- Choose appropriate embedding models for your use case
- Cache frequently accessed knowledge
- Validate knowledge source content
- Handle missing or corrupted files
- Monitor embedding generation
- Implement fallback options

## Common Issues and Solutions

If agents can't find relevant information:

- Check chunk sizes
- Verify knowledge source loading
- Review metadata filters
- Test with simpler queries first

If knowledge retrieval is slow:

- Reduce chunk sizes
- Optimize metadata filtering
- Consider using a lighter embedding model
- Cache frequently accessed content
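
When checking chunk sizes, it can help to see how `chunk_size` and `chunk_overlap` interact. The sketch below is a plain-Python illustration of character-based chunking with overlap, not CrewAI's own implementation; `chunk_text` is a hypothetical helper written for this example, and the actual splitting logic inside a knowledge source may differ.

```python
def chunk_text(text: str, chunk_size: int = 2000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; it is not part of CrewAI.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    # Each new chunk starts `step` characters after the previous one,
    # so consecutive chunks share `chunk_overlap` characters.
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


if __name__ == "__main__":
    manual = "x" * 5000  # Stand-in for the text of a document

    # Compare how many chunks different settings produce
    for size in (2000, 500):
        chunks = chunk_text(manual, chunk_size=size, chunk_overlap=size // 10)
        print(f"chunk_size={size}: {len(chunks)} chunks")
```

Smaller chunks give more, shorter passages to match against, which tends to make retrieval more precise but increases the number of pieces that have to be embedded. Printing the chunk count for a representative document is a quick way to sanity-check a configuration before attaching the source to an agent.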