[GH-ISSUE #13198] deepseek-ocr via UI only answers if you query it twice #86414

Closed
opened 2026-05-10 03:17:19 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @gparent on GitHub (Nov 21, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13198

What is the issue?

NOTE: I am not super familiar with Ollama technicals so I'm sorry if there's not enough information in my initial report. I will add whatever is necessary as I learn.

When using the deepseek-ocr model, I have to ask the model a followup question to actually see the reply that was produced in my new chat. This only seems to affect the first prompt to the model.

The behavior that happens is that the model starts "thinking" (triple dots appear), then they suddenly disappear but no answer is shown.

Asking a followup prompt like "What was the answer" then causes the model to give the answer.

Example:

Image

Relevant log output


OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.13.0

Originally created by @gparent on GitHub (Nov 21, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/13198 ### What is the issue? NOTE: I am not super familiar with Ollama technicals so I'm sorry if there's not enough information in my initial report. I will add whatever is necessary as I learn. When using the deepseek-ocr model, I have to ask the model a followup question to actually see the reply that was produced in my new chat. This only seems to affect the first prompt to the model. The behavior that happens is that the model starts "thinking" (triple dots appear), then they suddenly disappear but no answer is shown. Asking a followup prompt like "What was the answer" then causes the model to give the answer. Example: <img width="737" height="260" alt="Image" src="https://github.com/user-attachments/assets/ee7dd4b1-dc5f-4b5b-9c93-77e1dbe2fbe7" /> ### Relevant log output ```shell ``` ### OS Windows ### GPU Nvidia ### CPU AMD ### Ollama version 0.13.0
GiteaMirror added the bug label 2026-05-10 03:17:19 -05:00
Author
Owner

@gparent commented on GitHub (Nov 21, 2025):

Apologies, I will try to get the log output tonight

<!-- gh-comment-id:3564263066 --> @gparent commented on GitHub (Nov 21, 2025): Apologies, I will try to get the log output tonight
Author
Owner

@rick-github commented on GitHub (Nov 21, 2025):

Deepseek-OCR does not perform well as a chatbot. It is very sensitive to variations in prompts, and it is recommended to use specific prompts for processing images.

<!-- gh-comment-id:3564277027 --> @rick-github commented on GitHub (Nov 21, 2025): Deepseek-OCR does not perform well as a chatbot. It is very sensitive to variations in prompts, and it is recommended to use [specific prompts](https://github.com/deepseek-ai/DeepSeek-OCR/?tab=readme-ov-file#prompts-examples) for processing images.
Author
Owner

@gparent commented on GitHub (Nov 21, 2025):

Deepseek-OCR does not perform well as a chatbot. It is very sensitive to variations in prompts, and it is recommended to use specific prompts for processing images.

Thank you! The replies did feel like I was misusing the model, however is it normal to not even get a reply in the first place?

It might not be a bug but it's not a great experience for beginners like me.

Thank you for taking time to explain this, it is appreciated.

<!-- gh-comment-id:3564283018 --> @gparent commented on GitHub (Nov 21, 2025): > Deepseek-OCR does not perform well as a chatbot. It is very sensitive to variations in prompts, and it is recommended to use [specific prompts](https://github.com/deepseek-ai/DeepSeek-OCR/?tab=readme-ov-file#prompts-examples) for processing images. Thank you! The replies did feel like I was misusing the model, however is it normal to not even get a reply in the first place? It might not be a bug but it's not a great experience for beginners like me. Thank you for taking time to explain this, it is appreciated.
Author
Owner

@rick-github commented on GitHub (Nov 21, 2025):

Notice from your screenshot that DOCR is completing the sentence started with "What was the answer". In this instance, the model is acting as a completion model rather than a chat model. This particular model is trained with a subset of the capabilities of a general purpose model - it responds only to a certain type of input. If you want to use a more chat-oriented model that understands images, I would recommend one of the qwen3-vl models.

<!-- gh-comment-id:3564304693 --> @rick-github commented on GitHub (Nov 21, 2025): Notice from your screenshot that DOCR is completing the sentence started with "What was the answer". In this instance, the model is acting as a completion model rather than a chat model. This particular model is trained with a subset of the capabilities of a general purpose model - it responds only to a certain type of input. If you want to use a more chat-oriented model that understands images, I would recommend one of the [qwen3-vl](https://ollama.com/library/qwen3-vl) models.
Author
Owner

@gparent commented on GitHub (Nov 21, 2025):

Notice from your screenshot that DOCR is completing the sentence started with "What was the answer". In this instance, the model is acting as a completion model rather than a chat model. This particular model is trained with a subset of the capabilities of a general purpose model - it responds only to a certain type of input. If you want to use a more chat-oriented model that understands images, I would recommend one of the qwen3-vl models.

Thanks! Yeah I also noticed that if I don't add a question mark, it starts the answers with one. I planned to use deepseek-ocr programmatically in the first place to handle some bills of mine. I think improving the way I use it (using correct prompts and not chatting with it) is going to be the way to go, but I'll keep qwen3-vl in mind for interactive usage.

Again I cannot overstate how much I appreciate you taking time to explain this to me. I don't think this is really a bug anymore, other than the possible UI experience quirk for people like me who do not know better.

<!-- gh-comment-id:3564311736 --> @gparent commented on GitHub (Nov 21, 2025): > Notice from your screenshot that DOCR is completing the sentence started with "What was the answer". In this instance, the model is acting as a completion model rather than a chat model. This particular model is trained with a subset of the capabilities of a general purpose model - it responds only to a certain type of input. If you want to use a more chat-oriented model that understands images, I would recommend one of the [qwen3-vl](https://ollama.com/library/qwen3-vl) models. Thanks! Yeah I also noticed that if I don't add a question mark, it starts the answers with one. I planned to use deepseek-ocr programmatically in the first place to handle some bills of mine. I think improving the way I use it (using correct prompts and not chatting with it) is going to be the way to go, but I'll keep qwen3-vl in mind for interactive usage. Again I cannot overstate how much I appreciate you taking time to explain this to me. I don't think this is really a bug anymore, other than the possible UI experience quirk for people like me who do not know better.
Author
Owner

@rick-github commented on GitHub (Nov 21, 2025):

An example of programmatic use of this model:

#!/usr/bin/env python3

import ollama
import sys

model = 'deepseek-ocr:3b-bf16'
image = sys.argv[1] if len(sys.argv) > 1 else "puppy.jpg"

response = ollama.chat(
    model=model,
    messages=[{
        'role': 'user',
        'content': '\nDescribe this image.',
        'images': [image]
    }],
    stream=True
)

for r in response:
  print(r['message']['content'], end='')
print("")
<!-- gh-comment-id:3564316439 --> @rick-github commented on GitHub (Nov 21, 2025): An example of programmatic use of this model: ```python #!/usr/bin/env python3 import ollama import sys model = 'deepseek-ocr:3b-bf16' image = sys.argv[1] if len(sys.argv) > 1 else "puppy.jpg" response = ollama.chat( model=model, messages=[{ 'role': 'user', 'content': '\nDescribe this image.', 'images': [image] }], stream=True ) for r in response: print(r['message']['content'], end='') print("") ```
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#86414