---
title: 'Qdrant Vector Search Tool'
description: 'Semantic search capabilities for CrewAI agents using Qdrant vector database'
icon: vector-square
mode: "wide"
---

## Overview

The Qdrant Vector Search Tool adds semantic search to your CrewAI agents using [Qdrant](https://qdrant.tech/), a vector similarity search engine. Agents can query documents stored in a Qdrant collection by semantic similarity.

## Installation

Install the required packages:

```bash
uv add qdrant-client
```

## Basic Usage

Here's a minimal example of how to use the tool:

```python
from crewai import Agent
from crewai_tools import QdrantVectorSearchTool, QdrantConfig

# Initialize the tool with QdrantConfig
qdrant_tool = QdrantVectorSearchTool(
    qdrant_config=QdrantConfig(
        qdrant_url="your_qdrant_url",
        qdrant_api_key="your_qdrant_api_key",
        collection_name="your_collection"
    )
)

# Create an agent that uses the tool
agent = Agent(
    role="Research Assistant",
    goal="Find relevant information in documents",
    backstory="You are a research assistant who finds relevant passages using semantic search.",
    tools=[qdrant_tool]
)

# The tool will automatically use OpenAI embeddings
# and return the 3 most relevant results with scores > 0.35
```

## Complete Working Example

Here's a complete example showing how to:

1. Extract text from a PDF
2. Generate embeddings using OpenAI
3. Store in Qdrant
4. Create a CrewAI agentic RAG workflow for semantic search

In addition to `qdrant-client`, this example uses the `pdfplumber`, `openai`, and `python-dotenv` packages.

```python
import os
import uuid

import pdfplumber
from openai import OpenAI
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, Process
from crewai_tools import QdrantVectorSearchTool, QdrantConfig
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct, Distance, VectorParams

# Load environment variables
load_dotenv()

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Extract text from PDF (one chunk per page)
def extract_text_from_pdf(pdf_path):
    text = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text.append(page_text.strip())
    return text

# Generate OpenAI embeddings
def get_openai_embedding(text):
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-large"
    )
    return response.data[0].embedding

# Store text and embeddings in Qdrant
def load_pdf_to_qdrant(pdf_path, qdrant, collection_name):
    # Extract text from PDF
    text_chunks = extract_text_from_pdf(pdf_path)

    # Create Qdrant collection (an existing collection with this name is deleted first)
    if qdrant.collection_exists(collection_name):
        qdrant.delete_collection(collection_name)
    qdrant.create_collection(
        collection_name=collection_name,
        # 3072 is the vector size of text-embedding-3-large
        vectors_config=VectorParams(size=3072, distance=Distance.COSINE)
    )

    # Store embeddings
    points = []
    for chunk in text_chunks:
        embedding = get_openai_embedding(chunk)
        points.append(PointStruct(
            id=str(uuid.uuid4()),
            vector=embedding,
            payload={"text": chunk}
        ))
    qdrant.upsert(collection_name=collection_name, points=points)

# Initialize Qdrant client and load data
qdrant = QdrantClient(
    url=os.getenv("QDRANT_URL"),
    api_key=os.getenv("QDRANT_API_KEY")
)
collection_name = "example_collection"
pdf_path = "path/to/your/document.pdf"
load_pdf_to_qdrant(pdf_path, qdrant, collection_name)

# Initialize Qdrant search tool
qdrant_tool = QdrantVectorSearchTool(
    qdrant_config=QdrantConfig(
        qdrant_url=os.getenv("QDRANT_URL"),
        qdrant_api_key=os.getenv("QDRANT_API_KEY"),
        collection_name=collection_name,
        limit=3,
        score_threshold=0.35
    )
)

# Create CrewAI agents
search_agent = Agent(
    role="Senior Semantic Search Agent",
    goal="Find and analyze documents based on semantic search",
    backstory="""You are an expert research assistant who can find relevant
    information using semantic search in a Qdrant database.""",
    tools=[qdrant_tool],
    verbose=True
)

answer_agent = Agent(
    role="Senior Answer Assistant",
    goal="Generate answers to questions based on the context provided",
    backstory="""You are an expert answer assistant who can generate
    answers to questions based on the context provided.""",
    tools=[qdrant_tool],
    verbose=True
)

# Define tasks (each Task needs an expected_output)
search_task = Task(
    description="""Search for relevant documents about the {query}.
    Your final answer should include:
    - The relevant information found
    - The similarity scores of the results
    - The metadata of the relevant documents""",
    expected_output="The relevant passages with their similarity scores and metadata",
    agent=search_agent
)

answer_task = Task(
    description="""Given the context and metadata of relevant documents,
    generate a final answer based on the context.""",
    expected_output="A final answer grounded in the retrieved context",
    agent=answer_agent
)

# Run CrewAI workflow
crew = Crew(
    agents=[search_agent, answer_agent],
    tasks=[search_task, answer_task],
    process=Process.sequential,
    verbose=True
)

result = crew.kickoff(
    inputs={"query": "What is the role of X in the document?"}
)
print(result)
```

## Tool Parameters

### Required Parameters

- `qdrant_config` (QdrantConfig): Configuration object containing all Qdrant settings

### QdrantConfig Parameters

- `qdrant_url` (str): The URL of your Qdrant server
- `qdrant_api_key` (str, optional): API key for authentication with Qdrant
- `collection_name` (str): Name of the Qdrant collection to search
- `limit` (int): Maximum number of results to return (default: 3)
- `score_threshold` (float): Minimum similarity score threshold (default: 0.35)
- `filter` (Any, optional): Qdrant Filter instance for advanced filtering (default: None)

### Optional Tool Parameters

- `custom_embedding_fn` (Callable[[str], list[float]]): Custom function for text vectorization
- `qdrant_package` (str): Base package path for Qdrant (default: "qdrant_client")
- `client` (Any): Pre-initialized Qdrant client (optional)

## Advanced Filtering

The QdrantVectorSearchTool supports two kinds of filters to refine your search results: dynamic filters passed with each search, and preset filters defined in the configuration.

### Dynamic Filtering

Use the `filter_by` and `filter_value` parameters in your search to filter results on the fly:

```python
# Agent will use these parameters when calling the tool
# The tool schema accepts filter_by and filter_value
# Example: search with category filter
# Results will be filtered where category == "technology"
```

### Preset Filters with QdrantConfig

For complex filtering, use Qdrant Filter instances in your configuration:

```python
from qdrant_client.http import models as qmodels
from crewai_tools import QdrantVectorSearchTool, QdrantConfig

# Create a filter for specific conditions
preset_filter = qmodels.Filter(
    must=[
        qmodels.FieldCondition(
            key="category",
            match=qmodels.MatchValue(value="research")
        ),
        qmodels.FieldCondition(
            key="year",
            match=qmodels.MatchValue(value=2024)
        )
    ]
)

# Initialize tool with preset filter
qdrant_tool = QdrantVectorSearchTool(
    qdrant_config=QdrantConfig(
        qdrant_url="your_url",
        qdrant_api_key="your_key",
        collection_name="your_collection",
        filter=preset_filter  # Preset filter applied to all searches
    )
)
```

### Combining Filters

The tool automatically combines preset filters from `QdrantConfig` with dynamic filters from `filter_by` and `filter_value`:

```python
# If QdrantConfig has a preset filter for category="research"
# And the search uses filter_by="year", filter_value=2024
# Both filters will be combined (AND logic)
```

## Search Parameters

The tool accepts these parameters in its schema:

- `query` (str): The search query to find similar documents
- `filter_by` (str, optional): Metadata field to filter on
- `filter_value` (Any, optional): Value to filter by

## Return Format

The tool returns results in JSON format:

```json
[
  {
    "metadata": {},
    "context": "The actual text content of the document",
    "distance": 0.95
  }
]
```

Here `metadata` holds any metadata stored with the document, `context` is the document's text content, and `distance` is the similarity score.

## Default Embedding

By default, the tool uses OpenAI's `text-embedding-3-large` model for vectorization. This requires:

- OpenAI API key set in environment: `OPENAI_API_KEY`

## Custom Embeddings

Instead of using the default embedding model, you might want to use your own embedding function in cases where you:

1. Want to use a different embedding model (e.g., Cohere, HuggingFace, Ollama models)
2. Need to reduce costs by using open-source embedding models
3. Have specific requirements for vector dimensions or embedding quality
4. Want to use domain-specific embeddings (e.g., for medical or legal text)

Here's an example using a HuggingFace model. The collection you search must have been built with the same model, so that its vector size matches (384 for `all-MiniLM-L6-v2`):

```python
from transformers import AutoTokenizer, AutoModel
import torch
from crewai_tools import QdrantVectorSearchTool, QdrantConfig

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
model = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

def custom_embeddings(text: str) -> list[float]:
    # Tokenize and get model outputs (no gradients needed for inference)
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)

    # Use mean pooling to get text embedding
    embeddings = outputs.last_hidden_state.mean(dim=1)

    # Convert to list of floats and return
    return embeddings[0].tolist()

# Use custom embeddings with the tool
tool = QdrantVectorSearchTool(
    qdrant_config=QdrantConfig(
        qdrant_url="your_url",
        qdrant_api_key="your_key",
        collection_name="your_collection"
    ),
    custom_embedding_fn=custom_embeddings  # Pass your custom function
)
```

## Error Handling

The tool handles these specific errors:

- Raises ImportError if `qdrant-client` is not installed, with an option to auto-install it using `uv add qdrant-client`
- Raises ValueError if `QDRANT_URL` is not set

## Environment Variables

Required environment variables:

```bash
export QDRANT_URL="your_qdrant_url"  # If not provided in constructor
export QDRANT_API_KEY="your_api_key"  # If not provided in constructor
export OPENAI_API_KEY="your_openai_key"  # If using default embeddings
```
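
Since the tool reads these variables only when the values are not passed to the constructor, it can be useful to check them before building the tool. The following is a minimal sketch under that assumption; `missing_env_vars` and `EXPECTED_VARS` are hypothetical names for illustration and are not part of `crewai_tools`:

```python
import os
from collections.abc import Iterable, Mapping

# Variable names taken from the list above. QDRANT_API_KEY is listed as
# optional in QdrantConfig, and OPENAI_API_KEY only matters when the
# default OpenAI embeddings are used, so adjust this tuple to your setup.
EXPECTED_VARS = ("QDRANT_URL", "QDRANT_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(
    names: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names from `names` that are unset or empty in `env`.

    Hypothetical helper for illustration; not part of crewai_tools.
    """
    return [name for name in names if not env.get(name)]


# Report what is missing without aborting, so the check stays side-effect free.
missing = missing_env_vars(EXPECTED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All expected environment variables are set.")
```

Passing `env` explicitly (for example, a plain dictionary) makes the helper easy to test without touching the real process environment.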