[GH-ISSUE #11100] Slow generation, wrong output compared to LM Studio #7322

Closed
opened 2026-04-12 19:22:16 -05:00 by GiteaMirror · 3 comments

Originally created by @shkarupa-alex on GitHub (Jun 17, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11100

What is the issue?

I'm using the `hf.co/mradermacher/Nanonets-OCR-s-GGUF:Q8_0` model (Qwen VL) for OCR tasks.

Sample file attached:

![Image](https://github.com/user-attachments/assets/d47ce68a-b62b-4448-a93d-ec2c190b36d3)

It works perfectly with LM Studio:

Wall time (second call): 4.16 s / Memory: 4.67 GB
Output (almost the exact text from the image):

```
Там был штраф в 300 к, я его заменил только на возмещение убытков.
Было
6. За нарушение требований настоящего Соглашения Получающая сторона обязана по письменному требованию Раскрывающей стороны выплатить Раскрывающей стороне штрафную неустойку в размере 300 000 (триста тысяч) рублей за каждый факт нарушения.

Стало
6.За нарушение требований настоящего Соглашения Получающая сторона обязана возместить Раскрывающей стороне причиненные убытки в порядке, который предусмотрен законодательством Российской Федерации.
```

But with Ollama (Docker image `ollama/ollama:0.9.0`) the results are much worse:
Wall time (second call): 1.93 s / Memory: 4.44 GB
Output (nothing in common with the text in the image; the result differs on every call):

```
<n>0 之</子用00 00 \0_0_) 本狰ously_\0_.000_A0_C_0__0_).嚯_0_Sum_【10_G0_SU_PRACTING)_0_Sh:0_
```

Here is the code for my experiment:

```python
import base64
import io

from openai import OpenAI
from PIL import Image

# Load the sample image and re-encode it as a base64 data URL.
image = Image.open('05b8c8bd2589767718d630d3907b1189a2fb47735bd88863e500fe82c0984f8a.png')
image = image.convert("RGB")
img_byte_arr = io.BytesIO()
image.save(img_byte_arr, format="PNG")
image_bytes = img_byte_arr.getvalue()
base64_image = base64.b64encode(image_bytes).decode("utf-8")
image_url = f"data:image/png;base64,{base64_image}"

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Extract the text from the above document as if you were reading it naturally. Return the tables in html format. Return the equations in LaTeX representation. If there is an image in the document and image caption is not present, add a small description of the image inside the <img></img> tag; otherwise, add the image caption inside <img></img>. Watermarks should be wrapped in brackets. Ex: <watermark>OFFICIAL COPY</watermark>. Page numbers should be wrapped in brackets. Ex: <page_number>14</page_number> or <page_number>9/22</page_number>. Prefer using ☐ and ☑ for check boxes.",
            },
            {
                "type": "image_url",
                "image_url": {"url": image_url},
            },
        ],
    },
]

# Send the identical request to both servers via their OpenAI-compatible APIs.
for port, model in [
    (1234, "mradermacher/Nanonets-OCR-s-GGUF"),              # LM Studio
    (11434, "hf.co/mradermacher/Nanonets-OCR-s-GGUF:Q8_0"),  # Ollama
]:
    client = OpenAI(base_url=f"http://localhost:{port}/v1", api_key="lm-studio")
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0.0,
    )
    print(completion.choices[0].message.content.strip())
```
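The wall times quoted above are for the second call, after the model is already loaded. A minimal sketch of that measurement (the helper name and the `fn` callable are mine, not from the original report; `fn` stands for any request wrapper, e.g. around `client.chat.completions.create`):

```python
import time

def time_second_call(fn, *args, **kwargs):
    """Call fn twice and return (elapsed, result) for the second call only,
    so one-time model load / warm-up cost is excluded from the timing."""
    fn(*args, **kwargs)                   # warm-up call (loads the model)
    start = time.perf_counter()
    result = fn(*args, **kwargs)          # measured call
    elapsed = time.perf_counter() - start
    return elapsed, result
```

Usage would look like `elapsed, completion = time_second_call(client.chat.completions.create, model=model, messages=messages, temperature=0.0)`.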

Relevant log output

[docker-log.txt](https://github.com/user-attachments/files/20778646/docker-log.txt)

OS

Docker

GPU

No response

CPU

Apple

Ollama version

0.9.0

GiteaMirror added the bug label 2026-04-12 19:22:16 -05:00

@rick-github commented on GitHub (Jun 17, 2025):

`hf.co/mradermacher/Nanonets-OCR-s-GGUF:Q8_0` doesn't have a chat template, and the GGUF doesn't include a vision model.

Try the official `qwen2.5vl` model from the Ollama library:

````console
$ ollama run qwen2.5vl ./05b8c8bd2589767718d630d3907b1189a2fb47735bd88863e500fe82c0984f8a.png 'Extract the text from the above document as if you were reading it naturally. Return the tables in html format. Return the equations in LaTeX representation. If there is an image in the document and image caption is not present, add a small description of the image inside the <img></img> tag; otherwise, add the image caption inside <img></img>. Watermarks should be wrapped in brackets. Ex: <watermark>OFFICIAL COPY</watermark>. Page numbers should be wrapped in brackets. Ex: <page_number>14</page_number> or <page_number>9/22</page_number>. Prefer using ☐ and ☑ for check boxes.'
Added image './05b8c8bd2589767718d630d3907b1189a2fb47735bd88863e500fe82c0984f8a.png'
```html
<html><body>
<p>Там был штраф в 300 к, я его заменил только на возмещение убытков.</p>
<p>Было</p>
<p>6. За нарушение требований настоящего Соглашения Получающая сторона обязана по письменному требованию Раскрывающей стороны выплатить Раскрывающей стороне штрафную неустойку в размере 300 000 (триста тысяч) рублей за каждый факт нарушения.</p>
<p>Стало</p>
<p>6. За нарушение требований настоящего Соглашения Получающая сторона обязана возместить Раскрывающей стороне причиненные убытки в порядке, который предусмотрен законодательством Российской Федерации.</p>
</body></html>
```
````
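One way to verify the diagnosis above locally is to query Ollama's `POST /api/show` endpoint for the model and inspect the returned metadata. Below is a sketch of such a check, run against a hypothetical response dict rather than a live server (the helper function is mine; the `template` field and the `capabilities` list follow the Ollama API docs, with `capabilities` only present in recent versions):

```python
def diagnose_show_response(show: dict) -> list[str]:
    """Flag common problems in an Ollama /api/show response for a
    model that is expected to handle chat messages with images."""
    problems = []
    if not show.get("template", "").strip():
        problems.append("no chat template")
    if "vision" not in show.get("capabilities", []):
        problems.append("no vision capability")
    return problems

# Hypothetical /api/show payload for a text-only GGUF with no template:
broken = {"template": "", "capabilities": ["completion"]}
print(diagnose_show_response(broken))  # ['no chat template', 'no vision capability']
```

Against a live server, the response would come from something like `requests.post("http://localhost:11434/api/show", json={"model": name}).json()` (older Ollama versions use the `"name"` key instead).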


@jmorganca commented on GitHub (Jun 17, 2025):

Thanks @rick-github. @shkarupa-alex, note also that running it in Docker may perform differently than running it natively. Let us know if you're seeing more issues!


@shkarupa-alex commented on GitHub (Jun 17, 2025):

It works fine after switching to the community-converted `benhaotang/Nanonets-OCR-s:latest`.

Reference: github-starred/ollama#7322