[GH-ISSUE #14053] Ollama glm-ocr returning blank Markdown on Mac #9183

Closed
opened 2026-04-12 22:01:58 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @githycody on GitHub (Feb 3, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14053

What is the issue?

When I run the model with an image:
ollama run glm-ocr Text Recognition: &lt;imagePath&gt;

it returns this:


````text
```markdown
```markdown
```markdown
```text
```
````
I have been in contact with people from the team, who are able to make it work. Unfortunately only on Windows. Perhaps the issue can be reproduced on Mac.

```
ollama ps
NAME            ID            SIZE    PROCESSOR  CONTEXT  UNTIL
glm-ocr:latest  6effedd0dc8a  3.8 GB  100% GPU   4096     4 minutes from now
```

Relevant log output


OS

macOS

GPU

Apple

CPU

Apple

Ollama version

ollama version is 0.15.5-rc1

GiteaMirror added the bug label 2026-04-12 22:01:58 -05:00
Author
Owner

@rick-github commented on GitHub (Feb 3, 2026):

For reference: https://discord.com/channels/1128867683291627614/1468279523496956025

It works on Linux on both Nvidia and ROCm devices.

<!-- gh-comment-id:3843674439 -->
Author
Owner

@starchaser01 commented on GitHub (Feb 3, 2026):

I had the same problem. After reducing the image resolution, it worked.

<!-- gh-comment-id:3844265651 -->
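Downscaling works because it brings the pixel count under the processor's budget (a later comment in this thread pins Ollama's default at roughly 3.2 MP). A minimal sketch of the arithmetic, using no imaging library; `fit_to_budget` is an illustrative helper, not part of any SDK:

```python
import math

# Assumed budget: the ~3.2 MP default in Ollama's glmocr image processor,
# per the discussion later in this thread.
MAX_PIXELS = 3_200_000

def fit_to_budget(width: int, height: int, max_pixels: int = MAX_PIXELS):
    """Largest (w, h) preserving aspect ratio with w*h <= max_pixels."""
    pixels = width * height
    if pixels <= max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / pixels)
    return int(width * scale), int(height * scale)

# A 300 dpi A4 scan (2480 x 3508, ~8.7 MP) is well over the budget:
print(fit_to_budget(2480, 3508))
```

The resulting dimensions can then be fed to whatever resizing tool is at hand (e.g. `sips` on macOS or ImageMagick) before passing the image to `ollama run`.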
Author
Owner

@rick-github commented on GitHub (Feb 4, 2026):

Agreed, this looks like a resolution issue. An A4 page at about 150 dpi is the largest image the model can handle.

<!-- gh-comment-id:3844783818 -->
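The "A4 at ~150 dpi" ceiling is consistent with the pixel budget discussed below; a quick arithmetic check (A4 is 8.27 x 11.69 inches):

```python
# An A4 page scanned at 150 dpi comes out to roughly 2.2 MP,
# which fits under the ~3.2 MP limit cited later in the thread.
A4_INCHES = (8.27, 11.69)
dpi = 150
w, h = (int(side * dpi) for side in A4_INCHES)
mp = w * h / 1e6
print(w, h, round(mp, 2))
```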
Author
Owner

@githycody commented on GitHub (Feb 4, 2026):

@rick-github but do you think this is an Ollama problem? The specs specify 71mp.

<!-- gh-comment-id:3845913233 -->
Author
Owner

@griff12 commented on GitHub (Feb 5, 2026):

I am having the same problem on Windows as well.

<!-- gh-comment-id:3850547403 -->
Author
Owner

@rick-github commented on GitHub (Feb 5, 2026):

The specs may specify 71mp, but looking at the [SDK](https://github.com/zai-org/GLM-OCR) it looks like page OCR is [1mp](https://github.com/zai-org/GLM-OCR/blob/8e6fd70a31cfbd072746a9a2f45c178efc644048/glmocr/config.py#L110) by default and the image is scaled to [200 dpi](https://github.com/zai-org/GLM-OCR/blob/8e6fd70a31cfbd072746a9a2f45c178efc644048/glmocr/config.py#L119). There's also a [smart resizer](https://github.com/zai-org/GLM-OCR/blob/8e6fd70a31cfbd072746a9a2f45c178efc644048/glmocr/utils/image_utils.py#L13) which by default limits images to 12mp. I'm not familiar with the SDK, though; maybe there's some sort of image partitioner that stitches the results together.

In any case, the ollama glmocr [image processor](https://github.com/ollama/ollama/blob/c61023f5548f61651b7fd04393e2a93430f89a71/model/models/glmocr/imageprocessor.go#L33) has a default of 3.2mp, which is about the size I experimentally reached above. The comment indicates that this is to keep memory stable, whatever that means. It's theoretically possible to change this by adjusting KV values in the model, but that requires editing the GGUF file, which is a pain. I'll experiment and see what I can come up with. In the meantime, resizing images to less than 3mp is the best way to allow glm-ocr to return useful results.
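The gap between the two pipelines can be illustrated with the numbers the comment cites: the SDK reportedly rescales pages to 200 dpi and caps them at 12 MP, while Ollama's glmocr processor caps at ~3.2 MP. `rescale_to_dpi` below is an illustrative helper, not the SDK's actual API:

```python
def rescale_to_dpi(w: int, h: int, src_dpi: int, target_dpi: int = 200):
    """Scale pixel dimensions from src_dpi to target_dpi (illustrative)."""
    f = target_dpi / src_dpi
    return round(w * f), round(h * f)

# A 600 dpi A4 scan (4960 x 7016) rescaled to the SDK's 200 dpi default:
w, h = rescale_to_dpi(4960, 7016, src_dpi=600)
mp = w * h / 1e6
print(f"{w}x{h} ~ {mp:.1f} MP")     # under the SDK's 12 MP cap...
print(w * h <= 3_200_000)           # ...but over Ollama's ~3.2 MP default
```

So an image that the reference SDK would accept after its own rescaling can still exceed the Ollama processor's budget, which matches the blank-output reports above.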

<!-- gh-comment-id:3850727098 -->

Reference: github-starred/ollama#9183