I saw a rendered whitespace inconsistency in the Tasks docs here:
ed31860071/docs/core-concepts/Tasks.md (L173)
So I set out to patch that up to make it easier to read. I then noticed
there were a few whitespace inconsistencies:
- 2 spaces
- 4 spaces
- tabs
It appears that 4 spaces is the prevalent whitespace usage, so
I overwrote the other whitespace usages with that in this commit.
Co-authored-by: Rueben Ramirez <rramirez@ruebens-mbp.tail7c016.ts.net>
Co-authored-by: João Moura <joaomdmoura@gmail.com>
GithubSearchTool
!!! note "Experimental"
    We are still working on improving tools, so there might be unexpected behavior or changes in the future.
Description
The GithubSearchTool is a Retrieval-Augmented Generation (RAG) tool designed for semantic search within GitHub repositories. It searches code, pull requests, issues, and general repository information, which makes it useful for developers, researchers, or anyone who needs precise information from GitHub.
Installation
To use the GithubSearchTool, first ensure the crewai_tools package is installed in your Python environment:
pip install 'crewai[tools]'
This command installs the necessary package to run the GithubSearchTool along with any other tools included in the crewai_tools package.
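To confirm the install from Python, a minimal sketch using only the standard library might look like the following. It assumes the package is importable under the module name crewai_tools, which is the name used by the import in the example in this document. The helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: "crewai_tools" is the importable module name of the package.
    if is_installed("crewai_tools"):
        print("crewai_tools is available")
    else:
        print("crewai_tools is missing; run: pip install 'crewai[tools]'")
```

The check only locates the module; it does not import it, so it has no side effects on the environment.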
Example
Here’s how you can use the GithubSearchTool to perform semantic searches within a GitHub repository:
from crewai_tools import GithubSearchTool

# Initialize the tool for semantic searches within a specific GitHub repository
tool = GithubSearchTool(
    github_repo='https://github.com/example/repo',
    content_types=['code', 'issue']  # Options: code, repo, pr, issue
)

# OR

# Initialize the tool without a fixed repository, so the agent can search any repository it learns about during its execution
tool = GithubSearchTool(
    content_types=['code', 'issue']  # Options: code, repo, pr, issue
)
Arguments
- github_repo: The URL of the GitHub repository where the search will be conducted. Every search needs a target repository: set it here to fix the repository, or omit it (as in the second example above) so the agent can supply one during its execution.
- content_types: The types of content to include in your search. This field is mandatory and takes a list drawn from the following options: code for searching within the code, repo for searching within the repository's general information, pr for searching within pull requests, and issue for searching within issues. It lets you tailor the search to specific content types within the GitHub repository.
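Since content_types only accepts the four options listed above, it can help to check the list before constructing the tool. The following is a minimal sketch of such a check; the helper validate_content_types and the constant ALLOWED_CONTENT_TYPES are hypothetical names introduced here for illustration and are not part of crewai_tools.

```python
# Options taken from the Arguments description: code, repo, pr, issue.
ALLOWED_CONTENT_TYPES = ("code", "repo", "pr", "issue")


def validate_content_types(content_types):
    """Return a de-duplicated list of content types, or raise ValueError."""
    if not content_types:
        raise ValueError("content_types must list at least one option")
    unknown = [t for t in content_types if t not in ALLOWED_CONTENT_TYPES]
    if unknown:
        raise ValueError(f"unsupported content types: {unknown}")
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(content_types))


content_types = validate_content_types(["code", "issue", "code"])
print(content_types)
```

The validated list can then be passed as the content_types argument when initializing the GithubSearchTool.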
Custom model and embeddings
By default, the tool uses OpenAI for both embeddings and summarization. To customize the model, you can use a config dictionary as follows:
tool = GithubSearchTool(
    config=dict(
        llm=dict(
            provider="ollama",  # or google, openai, anthropic, llama2, ...
            config=dict(
                model="llama2",
                # temperature=0.5,
                # top_p=1,
                # stream=True,
            ),
        ),
        embedder=dict(
            provider="google",
            config=dict(
                model="models/embedding-001",
                task_type="retrieval_document",
                # title="Embeddings",
            ),
        ),
    )
)
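If several tools share the same providers, the nested dictionary can be assembled by a small helper rather than written out each time. This is a sketch only: build_tool_config is a hypothetical function, not part of crewai_tools, and it simply reproduces the llm/embedder shape shown above. The provider and model values are the ones from the example.

```python
def build_tool_config(llm_provider, llm_model, embedder_provider, embedder_model,
                      llm_options=None, embedder_options=None):
    """Assemble a nested config dict with 'llm' and 'embedder' sections."""
    llm_config = {"model": llm_model, **(llm_options or {})}
    embedder_config = {"model": embedder_model, **(embedder_options or {})}
    return {
        "llm": {"provider": llm_provider, "config": llm_config},
        "embedder": {"provider": embedder_provider, "config": embedder_config},
    }


config = build_tool_config(
    "ollama", "llama2",
    "google", "models/embedding-001",
    embedder_options={"task_type": "retrieval_document"},
)
print(config)

# Requires crewai_tools; left commented so the sketch runs standalone.
# tool = GithubSearchTool(config=config)
```

Optional settings such as temperature or top_p would go in llm_options, mirroring the commented-out keys in the example above.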