[GH-ISSUE #12355] Granite Docling 258m #85897

Open
opened 2026-05-10 01:57:59 -05:00 by GiteaMirror · 21 comments
Originally created by @chrlesur on GitHub (Sep 20, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12355

https://huggingface.co/ibm-granite/granite-docling-258M
https://huggingface.co/spaces/ibm-granite/granite-docling-258m-demo
https://www.ibm.com/new/announcements/granite-docling-end-to-end-document-conversion

It will be amazing :) Thanks for the great work.
GiteaMirror added the model label 2026-05-10 01:57:59 -05:00

@rick-github commented on GitHub (Sep 22, 2025):

https://github.com/ggml-org/llama.cpp/pull/16112


@gabe-l-hart commented on GitHub (Sep 23, 2025):

Hi folks! Quick update here: Unfortunately, the preprocessing logic needs some significant changes in order for the model to perform well. I'll have a PR up in llama.cpp for this shortly, but the result will be that we'll need a llama.cpp bump here in Ollama before the model will perform well.

In the meantime, this can be worked around with the current version of the mtmd preprocessing in Ollama by using transformers to do the preprocessing, then saving the preprocessed patches as individual images, and sending each as a separate image (see https://github.com/ggml-org/llama.cpp/pull/16112#issuecomment-3313726479 for details).
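A rough sketch of the sending half of this workaround, assuming the patches have already been produced with `transformers` and saved to disk. It uses Ollama's `/api/generate` endpoint, which accepts a list of base64-encoded images. The helper names `build_payload` and `send` are hypothetical, and the sketch assumes that "sending each as a separate image" means one request with one `images` entry per patch.

```python
import base64
import json
import urllib.request
from typing import Iterable

# Default local Ollama endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str, patches: Iterable[bytes]) -> dict:
    """Build a non-streaming generate request with one image entry per patch."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects each image as a base64-encoded string.
        "images": [base64.b64encode(p).decode("ascii") for p in patches],
        "stream": False,
    }


def send(payload: dict, url: str = OLLAMA_URL) -> str:
    """POST the payload to a running Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]
```

With a server running, the patch files (hypothetical names such as `patch_0.png`) would be read as bytes, passed to `build_payload`, and the result handed to `send`.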


@gabe-l-hart commented on GitHub (Sep 23, 2025):

Updated PR: https://github.com/ggml-org/llama.cpp/pull/16206


@chrlesur commented on GitHub (Sep 23, 2025):

Thank you for your work.


@gabe-l-hart commented on GitHub (Oct 6, 2025):

The first PR is merged. Another one needs to go in to fix the tokenization so that the model stops correctly (https://github.com/ggml-org/llama.cpp/pull/16438). After that, I'll start working on another `llama.cpp` bump here in Ollama.

@gabe-l-hart commented on GitHub (Oct 9, 2025):

I've got it all working now and up in a PR to bump llama.cpp: https://github.com/ollama/ollama/pull/12552


@chrlesur commented on GitHub (Oct 10, 2025):

Thank you for your hard work.

@oliviasculley commented on GitHub (Oct 15, 2025):

Looks like it finally got merged, I'm excited for this model!!


@gabe-l-hart commented on GitHub (Oct 15, 2025):

Yes! The code is merged, so we're working through the process of getting the model up and it will be fully usable once the next Ollama release is out.


@gabe-l-hart commented on GitHub (Oct 17, 2025):

We're working on getting the official version up, but you can now play with this using `ollama pull gabegoodhart/granite-docling:258m`

@mbailey256 commented on GitHub (Oct 20, 2025):

> We're working on getting the official version up, but you can now play with this using `ollama pull gabegoodhart/granite-docling:258m`

Thanks @gabe-l-hart for publishing the pre-release of the model. What is the proper syntax for submitting a PDF to the model via the Ollama CLI? I'm using the following and it's hallucinating based on the document name.

`ollama run gabegoodhart/granite-docling:258m "Can you provide a summary of this document? /Users/michael/Downloads/Receipt-2716-4202.pdf"`

@gabe-l-hart commented on GitHub (Oct 20, 2025):

@mbailey256 The Ollama model will not work on PDFs since it is purely a VLM for image inputs. To use the full PDF pipeline, you'll need to go through the `docling` [library](https://github.com/docling-project/docling). You can see the steps to enable a "remote vision model" with Ollama [here](https://github.com/docling-project/docling/blob/a5af082d82fd71146865dba0af8d045034aeb979/docs/usage/enrichments.md?plain=1#L175).

@1pharaxh commented on GitHub (Oct 20, 2025):

Hi @gabe-l-hart, is there a way to get markdown from the image directly through Ollama, or even through the Python docling API? I tried the model you suggested, but I always seem to get wrong text or hallucinations.

Thank you so much!

<img width="844" height="141" alt="Image" src="https://github.com/user-attachments/assets/af7098bf-553e-48c2-9057-53380545bca2" />

@gabe-l-hart commented on GitHub (Oct 20, 2025):

To the best of my knowledge, you cannot get markdown directly out of the model since it renders to doctags. You can use docling directly to export to markdown.

Based on your prompt and the description, it seems like you are looking for a textual description of a photograph of some sort that may contain some text. While the model can perform this task, it's not the primary focus of this model which is aimed at visual layout parsing for scanned or photographed documents.

W.r.t. incorrect or hallucinated text, there are certainly a few things that could be at play:

  1. In our testing for the official release, we found that the FP16 model I have on gabegoodhart/granite-docling:258m is flirting dangerously with the limits of the FP16 datatype, and when run on CPU (on my M3 at least) it hits conditions producing -Inf values. The native safetensors format is BF16, so we're exploring the correct data type to use for the Ollama model. With a model this small, subtle differences can cause big problems.

  2. The model is fairly sensitive to prompting and the position of the image within the prompt. The best results should come through using the model in `docling` directly since the placement of image tokens and prompting will be standardized. The prompting style I have seen the best luck with is `</path/to/image> Convert this page to markdown.`, although interestingly the output is in `doctag` format and not markdown.
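To make point 1 concrete, here is a small sketch of how much narrower the FP16 range is than the BF16 range. It only uses the standard bit layouts (5 exponent and 10 mantissa bits for FP16, 8 and 7 for BF16); the helper names are hypothetical and nothing here is taken from the model or from Ollama itself.

```python
def max_finite(exponent_bits: int, mantissa_bits: int) -> float:
    """Largest finite value of an IEEE-754-style binary float format."""
    bias = 2 ** (exponent_bits - 1) - 1
    # The all-ones exponent is reserved for Inf/NaN, so the top usable one is 2**e - 2.
    max_exponent = (2 ** exponent_bits - 2) - bias
    return (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exponent


FP16_MAX = max_finite(5, 10)  # 65504.0
BF16_MAX = max_finite(8, 7)   # roughly 3.39e38


def exceeds_fp16_range(value: float) -> bool:
    """True if the magnitude is beyond the largest finite FP16 value."""
    return abs(value) > FP16_MAX
```

An intermediate value such as `-70000.0` (a made-up number for illustration) is already past the FP16 limit while sitting comfortably inside the BF16 range, which is consistent with a BF16-native model running into `-Inf` once its weights and activations are held in FP16.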


@1pharaxh commented on GitHub (Oct 20, 2025):

Thanks for your reply. I tried it with this Python code:

```
import logging

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    VlmPipelineOptions,
)
from docling.datamodel.pipeline_options_vlm_model import ApiVlmOptions, ResponseFormat
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline


def ollama_vlm_options(model: str, prompt: str):
    options = ApiVlmOptions(
        url="http://localhost:11434/v1/chat/completions",  # the default Ollama endpoint
        params=dict(
            model=model,
        ),
        prompt=prompt,
        timeout=90,
        scale=1.0,
        response_format=ResponseFormat.MARKDOWN,
    )
    return options


# Usage and conversion


def main():
    logging.basicConfig(level=logging.INFO)

    input_doc_path = "windows-app.png"

    pipeline_options = VlmPipelineOptions(
        enable_remote_services=True  # required when calling remote VLM endpoints
    )

    pipeline_options.vlm_options = ollama_vlm_options(
        model="gabegoodhart/granite-docling:258m",
        prompt="Convert this page to docling.",
    )
    doc_converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options,
                pipeline_cls=VlmPipeline,
            ),
            InputFormat.IMAGE: PdfFormatOption(
                pipeline_options=pipeline_options,
                pipeline_cls=VlmPipeline,
            )
        }
    )
    result = doc_converter.convert(input_doc_path)
    print(result.document.export_to_markdown())


if __name__ == "__main__":
    main()
```

<img width="677" height="143" alt="Image" src="https://github.com/user-attachments/assets/f84f7b5d-bbf8-4ec7-91c5-5c77bb508de3" />

But it still doesn't output the right text. What I am trying to do is get docling to OCR an image and convert it to markdown.

@gabe-l-hart commented on GitHub (Oct 20, 2025):

Is this screenshot the one you're running through the model, or is it a screenshot of the script running? I think it's the latter? Running that screenshot through Ollama directly, I get the following _almost_ correct output:

```sh
ollama run gabegoodhart/granite-docling:258m "/Users/ghart/Pictures/docling-repro.png Convert this page to markdown."

Added image '/Users/ghart/Pictures/docling-repro.png'
<doctag><text><loc_0><loc_2><loc_373><loc_64>.(venv) PS C:\Users\akars\OneDrive\Documents\GitHub\electron-app\backend> uv run .test.py 2025-10-20 14:49:13,661 - INFO - detected formats: [<InputFormat.IMAGE: 'image'>]</text>
<text><loc_0><loc_68><loc_322><loc_100>2025-10-20 14:49:13,804 - INFO - Going to convert document batch...</text>
<text><loc_0><loc_104><loc_499><loc_144>2025-10-20 14:49:13,805 - INFO - Initializing pipeline for VlmPipeline with options hash ab387416d9089c4f5e4771ff56548de</text>
<text><loc_0><loc_148><loc_8><loc_181>9</text>
<text><loc_0><loc_185><loc_499><loc_228>2025-10-20 14:49:13,825 - INFO - Loading plugin 'docling_defaults'</text>
<text><loc_0><loc_232><loc_498><loc_288>2025-10-20 14:49:13,828 - INFO - Registered picture descriptions: ['vlm', 'api']</text>
<text><loc_0><loc_292><loc_499><loc_350>2025-10-20 14:49:13,828 - INFO - Processing document windows-app.png 2025-10-20 14:49:17,027 - INFO - Finished converting document windows-app.png in 3.36 sec.</text>
<text><loc_0><loc_354><loc_498><loc_410>&lt;loc\_0&gt;&lt;loc\_89&gt;&lt;loc\_50&gt;&lt;loc\_409&gt; .venv) PS C:\Users\akars\OneDrive\Documents\GitHub\electron-app\backend></text>
<text><loc_0><loc_414><loc_499><loc_454>2025-10-20 14:49:13,828 - INFO - Document windows-app.png in 3.36 sec.</text>
<text><loc_0><loc_457><loc_498><loc_500>.venv) PS C:\Users\akars\OneDrive\Documents\GitHub\electron-app\backend></text>
</doctag>
```
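For a quick look at doctags output like the above without the full docling stack, the `<text>` elements can be pulled out with a regular expression. This is only a minimal inspection sketch, not a replacement for the docling parser: it assumes every `<text>` element starts with exactly four `<loc_N>` tokens, and it treats them as an opaque 4-tuple rather than guessing their meaning. The names `TextBox` and `extract_text_boxes` are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class TextBox(NamedTuple):
    box: Tuple[int, ...]  # the four <loc_N> values, in the order they appear
    text: str


# A <text> element: four location tokens followed by the recognized text.
_TEXT_RE = re.compile(r"<text>((?:<loc_\d+>){4})(.*?)</text>", re.DOTALL)
_LOC_RE = re.compile(r"<loc_(\d+)>")


def extract_text_boxes(doctags: str) -> List[TextBox]:
    """Return every <text> element as (location values, stripped text)."""
    boxes = []
    for locs, text in _TEXT_RE.findall(doctags):
        coords = tuple(int(n) for n in _LOC_RE.findall(locs))
        boxes.append(TextBox(coords, text.strip()))
    return boxes


sample = (
    "<doctag><text><loc_0><loc_68><loc_322><loc_100>"
    "Going to convert document batch...</text>\n"
    "<text><loc_0><loc_148><loc_8><loc_181>9</text>\n"
    "</doctag>"
)
for item in extract_text_boxes(sample):
    print(item.box, item.text)
```

Elements other than `<text>` (for example `<picture>`) are ignored here, so this is only useful for eyeballing the recognized text.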

@1pharaxh commented on GitHub (Oct 20, 2025):

@gabe-l-hart the latter: the screenshot is of the Python script running docling with Ollama, for which I referred to the [docs](https://docling-project.github.io/docling/examples/vlm_pipeline_api_model/).

I was also able to generate docling tags from Ollama 🙌 but I still need to convert those docling tags to either text or markdown. Do you know how I can convert them?

@1pharaxh commented on GitHub (Oct 20, 2025):

I found that this code helps with that:

```
from docling_core.types.doc.document import DocTagsDocument, DoclingDocument
from pathlib import Path

# Suppose doctags_raw is your doctags string (from Granite Docling/vLLM)
# If you have images, provide them as a list; if not, create dummy images for each page
doctags_raw = '''
<doctag><text><loc_8><loc_4><loc_115><loc_14>File Edit Selection View Go Run Terminal Help</text>

<text><loc_9><loc_490><loc_21><loc_498>y: int = Field(</text>
<text><loc_8><loc_499><loc_21><loc_508>16 main*</text>
<text><loc_10><loc_509><loc_22><loc_459>5° AC Clear</text>
<text><loc_9><loc_460><loc_21><loc_468>main.py > 0Δ0</text>
<text><loc_25><loc_461><loc_33><loc_469>⇒ Reconnect</text>
<text><loc_35><loc_459><loc_45><loc_467>to Discord</text>
<text><loc_9><loc_479><loc_21><loc_487>5° AC Clear</text>
<text><loc_8><loc_488><loc_15><loc_496>4°</text>
<text><loc_10><loc_497><loc_22><loc_499>Clear</text>
<text><loc_9><loc_479><loc_21><loc_487>5° AC Clear</text>
<text><loc_25><loc_479><loc_33><loc_488>⇒ Reconnect</text>
<text><loc_35><loc_480><loc_45><loc_489>to Discord</text>
<text><loc_9><loc_499><loc_21><loc_498>5° AC Clear</text>
<text><loc_25><loc_499><loc_33><loc_499>Clear</text>
<text><loc_35><loc_499><loc_45><loc_499>⇒ Reconnect</text>
<picture><loc_9><loc_0><loc_500><loc_498><other></picture>
<text><loc_147><loc_4><loc_152><loc_14>← →</text>
<text><loc_244><loc_3><loc_261><loc_13>Open Electron</text>
<text><loc_243><loc_24><loc_253><loc_33>▷ ▷</text>
<text><loc_258><loc_25><loc_265><loc_34>package.json</text>
<text><loc_244><loc_44><loc_251><loc_53>> spotify</text>
<picture><loc_255><loc_0><loc_500><loc_499><screenshot></picture>
</doctag>
'''
doctags_doc = DocTagsDocument.from_doctags_and_image_pairs(
    [doctags_raw], [Path("windows-app.png")])


# Correct usage: static method!
doc = DoclingDocument.load_from_doctags(doctags_doc, document_name="MyDoc")

# Export to markdown
markdown = doc.export_to_markdown()
print(markdown)
```

from [issue](https://github.com/docling-project/docling/issues/2356)

Thank you so much!

@gabe-l-hart commented on GitHub (Oct 20, 2025):

Ok, I think the issue in your above script is the `response_format` field. It's counterintuitive, but I think this is the expected response format from the model that will then be parsed into the `DoclingDocument` format, so setting that to `ResponseFormat.DOCTAGS` makes this work much better. It's still not flawless, but the exported results look like markdown.

@1pharaxh commented on GitHub (Oct 21, 2025):

@gabe-l-hart Thank you so much it works like a charm!


@gabe-l-hart commented on GitHub (Nov 10, 2025):

We now have the `ibm` official model up. You can start playing with it directly:

```sh
ollama run ibm/granite-docling "/path/to/image Convert this image to doctags"
```

Reference: github-starred/ollama#85897