[GH-ISSUE #13380] Enabling flash attention for vision encoders causes accuracy/stability regression for DeepSeek-OCR #34596

Closed
opened 2026-04-22 18:18:11 -05:00 by GiteaMirror · 2 comments

Originally created by @th1nhhdk on GitHub (Dec 8, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13380

What is the issue?

This commit, https://github.com/ollama/ollama/commit/1108d8b34e43e968812eded0ccda73503ccad77d, makes the DeepSeek-OCR model less stable and more prone to duplicating output text.

Downgrading to Ollama 0.13.1 fixes the issue.
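As a possible interim check (not part of the original report), Ollama exposes a documented `OLLAMA_FLASH_ATTENTION` environment variable; whether it also gates the new vision-encoder flash-attention path introduced by the commit above is an assumption:

```shell
# Hedged sketch: OLLAMA_FLASH_ATTENTION is a documented Ollama env var.
# Whether it affects the vision-encoder path after commit 1108d8b34e
# is an assumption, not something confirmed in this issue.
export OLLAMA_FLASH_ATTENTION=0
echo "OLLAMA_FLASH_ATTENTION=$OLLAMA_FLASH_ATTENTION"
# Then restart the server so the setting takes effect:
# ollama serve
```

If the duplication disappears with this set, that would help narrow the regression to the flash-attention change rather than something else in 0.13.2.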

Relevant log output


OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.13.2

GiteaMirror added the bug label 2026-04-22 18:18:11 -05:00

@jessegross commented on GitHub (Dec 8, 2025):

Can you please provide specific examples and logs?


@th1nhhdk commented on GitHub (Dec 9, 2025):

I tried to replicate it within Ollama itself but couldn't reproduce it. Yet in my own program, which uses the Ollama Python API, the problem does happen, so it's probably not Ollama.
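For anyone trying to reproduce this outside the Python client, a minimal stdlib-only sketch of the equivalent REST call to Ollama's `/api/generate` endpoint (the model tag, prompt, and image path are placeholders, and the actual network call is left commented out):

```python
import base64
import json
from urllib import request

# Build a request body per the Ollama /api/generate schema.
# "deepseek-ocr" and the prompt are assumptions for illustration.
payload = {
    "model": "deepseek-ocr",
    "prompt": "Transcribe the text in this image.",
    "stream": False,
}

# Attach an image as base64, as the API expects:
# with open("page.png", "rb") as f:
#     payload["images"] = [base64.b64encode(f.read()).decode()]

body = json.dumps(payload).encode()
print(body.decode())

# Send to a locally running Ollama server (default port 11434):
# req = request.Request(
#     "http://localhost:11434/api/generate",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(request.urlopen(req).read())["response"])
```

Comparing output from this path against the Python client on the same image might help isolate whether the duplication comes from the server or the client-side code.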


Reference: github-starred/ollama#34596