Feat/docling-support (#1763)

* added tool for docling support

* docling support installation

* use file_paths instead of file_path

* fix import

* organized imports

* run_type docs

* needs to be list

* fixed logic

* logged but file_path is backwards compatible

* use file_paths instead of file_path 2

* added test for multiple sources for file_paths

* fix run-types

* enabling local files to work and type cleanup

* linted

* fix test and types

* fixed run types

* fix types

* renamed to CrewDoclingSource

* linted

* added docs

* resolve conflicts

---------

Co-authored-by: Brandon Hancock (bhancock_ai) <109994880+bhancockio@users.noreply.github.com>
Co-authored-by: Brandon Hancock <brandon@brandonhancock.io>
This commit is contained in:
Lorenze Jay
2024-12-23 10:19:58 -08:00
committed by GitHub
parent c887ff1f47
commit b3185ad90c
8 changed files with 1166 additions and 35 deletions

View File

@@ -79,6 +79,55 @@ crew = Crew(
result = crew.kickoff(inputs={"question": "What city does John live in and how old is he?"})
```
Here's another example with the `CrewDoclingSource`
```python Code
from crewai import LLM, Agent, Crew, Process, Task
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
# Create a knowledge source
content_source = CrewDoclingSource(
file_paths=[
"https://lilianweng.github.io/posts/2024-11-28-reward-hacking",
"https://lilianweng.github.io/posts/2024-07-07-hallucination",
],
)
# Create an LLM with a temperature of 0 to ensure deterministic outputs
llm = LLM(model="gpt-4o-mini", temperature=0)
# Create an agent with the knowledge store
agent = Agent(
role="About papers",
goal="You know everything about the papers.",
backstory="""You are a master at understanding papers and their content.""",
verbose=True,
allow_delegation=False,
llm=llm,
)
task = Task(
description="Answer the following questions about the papers: {question}",
expected_output="An answer to the question.",
agent=agent,
)
crew = Crew(
agents=[agent],
tasks=[task],
verbose=True,
process=Process.sequential,
knowledge_sources=[
content_source
], # Enable knowledge by adding the sources here. You can also add more sources to the sources list.
)
result = crew.kickoff(
inputs={
"question": "What is the reward hacking paper about? Be sure to provide sources."
}
)
```
## Knowledge Configuration
### Chunking Configuration