[GH-ISSUE #13219] Request for 2 vision/document layout models #34500

Closed
opened 2026-04-22 18:07:13 -05:00 by GiteaMirror · 2 comments

Originally created by @CanadianHusky on GitHub (Nov 23, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13219

Ollama and the few models I tested worked out of the box, and brilliantly well. Awesome work by the team. Congratulations!
In the future, would it be possible to add these two vision/document layout models to the list of available models, please?

1-
https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/PaddleOCR-VL.html
Here is the Hugging Face link, with a live demo:
https://huggingface.co/PaddlePaddle/PaddleOCR-VL
It is a relatively small model (0.9B) compared to the other offerings, but it does a great job at document analysis.

2-
https://huggingface.co/rednote-hilab/dots.ocr
This is also a very good OCR/layout model; it can restore human reading order for complex multi-column layouts.

These models may not work for 'agentic' use and may not respond with human-readable text, which is totally fine; neither is required.
Getting back the model's raw data (JSON output) is perfectly OK, as the output will be parsed by code.

Thank you

@rick-github commented on GitHub (Nov 23, 2025):

https://github.com/ollama/ollama/issues/12685
https://github.com/ollama/ollama/issues/11653

@pdevine commented on GitHub (Nov 25, 2025):

Going to close this as a dupe.


Reference: github-starred/ollama#34500