Files
crewAI/docs/v1.15.1/en/tools/file-document/ocrtool.mdx
João Moura 6491f5a663
Some checks failed
Mark stale issues and pull requests / stale (push) Has been cancelled
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (python) (push) Has been cancelled
Check Documentation Broken Links / Check broken links (push) Has been cancelled
Vulnerability Scan / pip-audit (push) Has been cancelled
Nightly Canary Release / Check for new commits (push) Has been cancelled
Nightly Canary Release / Build nightly packages (push) Has been cancelled
Nightly Canary Release / Publish nightly to PyPI (push) Has been cancelled
[docs-freeze] docs: snapshot and changelog for v1.15.1 (#6367)
2026-06-27 03:50:32 -03:00

91 lines
1.8 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
title: OCR Tool
description: The `OCRTool` extracts text from local images or image URLs using an LLM with vision.
icon: image
mode: "wide"
---
# `OCRTool`
## Description
Extract text from images (local path or URL). Uses a visioncapable LLM via CrewAIs LLM interface.
## Installation
No extra install beyond `crewai-tools`. Ensure your selected LLM supports vision.
## Parameters
### Run Parameters
- `image_path_url` (str, required): Local image path or HTTP(S) URL.
## Examples
### Direct usage
```python Code
from crewai_tools import OCRTool
print(OCRTool().run(image_path_url="/tmp/receipt.png"))
```
### With an agent
```python Code
from crewai import Agent, Task, Crew
from crewai_tools import OCRTool
ocr = OCRTool()
agent = Agent(
role="OCR",
goal="Extract text",
tools=[ocr],
)
task = Task(
description="Extract text from https://example.com/invoice.jpg",
expected_output="All detected text in plain text",
agent=agent,
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
```
## Notes
- Ensure the selected LLM supports image inputs.
- For large images, consider downscaling to reduce token usage.
- You can pass a specific LLM instance to the tool (e.g., `LLM(model="gpt-4o")`) if needed, matching the README guidance.
## Example
```python Code
from crewai import Agent, Task, Crew
from crewai_tools import OCRTool
tool = OCRTool()
agent = Agent(
role="OCR Specialist",
goal="Extract text from images",
backstory="Visionenabled analyst",
tools=[tool],
verbose=True,
)
task = Task(
description="Extract text from https://example.com/receipt.png",
expected_output="All detected text in plain text",
agent=agent,
)
crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
```