[GH-ISSUE #12547] When I use the fine tuned model to use ollama inference, there may be incorrect answers. #8324

Open
opened 2026-04-12 20:53:42 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @zhangnn520 on GitHub (Oct 9, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12547

What is the issue?

I have a question. When I use my fine-tuned model for inference with ollama, it sometimes produces incorrect answers. The same model generally runs inference correctly with both vLLM and SGLang. However, the problem still occurs even when I use ollama without quantization. How can I solve this?

[Image](https://github.com/user-attachments/assets/a1c16fc0-767f-43da-9a33-a52201bbe57a)

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug label 2026-04-12 20:53:42 -05:00
Author
Owner

@zhangnn520 commented on GitHub (Oct 9, 2025):

As shown in the figure, the first runs, using the official qwen3-0.6/1.7/4b models, produce normal output, but when I use the qwen2.5-vl-code model the output degenerates into repeated "ggg". qwen2.5-vl-code is my fine-tuned model, and it does not seem to work properly.

Author
Owner

@pdevine commented on GitHub (Oct 10, 2025):

You can compare the parameters/tensors with `ollama show -v <model>`. Try dumping both your model and the base model and checking the output for any differences.
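A minimal sketch of the kind of comparison suggested above: capture the verbose dump for both models and diff the tensor listings. The dump strings and the line format here are illustrative placeholders, not the exact `ollama show -v` output; adapt the parsing to what your version actually prints.

```python
# Sketch: compare tensor listings from two `ollama show -v` style dumps.
# The sample dumps below are illustrative placeholders, not real ollama output.

def parse_tensors(dump: str) -> dict[str, str]:
    """Map tensor name -> quantization/shape string from a verbose dump."""
    tensors = {}
    for line in dump.splitlines():
        parts = line.split()
        # Crude heuristic: tensor names contain dots (e.g. blk.0.attn_q.weight).
        if len(parts) >= 2 and "." in parts[0]:
            tensors[parts[0]] = " ".join(parts[1:])
    return tensors

def diff_tensors(base: str, tuned: str) -> list[str]:
    """Report tensors that are missing or differ between the two dumps."""
    a, b = parse_tensors(base), parse_tensors(tuned)
    report = []
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            report.append(f"{name}: base={a.get(name)} tuned={b.get(name)}")
    return report

# Hypothetical dumps: the fine-tune here kept one tensor unquantized.
base_dump = """\
blk.0.attn_q.weight Q4_K [896 896]
blk.0.attn_k.weight Q4_K [896 128]
"""
tuned_dump = """\
blk.0.attn_q.weight F16 [896 896]
blk.0.attn_k.weight Q4_K [896 128]
"""

for line in diff_tensors(base_dump, tuned_dump):
    print(line)
```

Any mismatch in quantization type or shape for the same tensor name is a good first place to look when a fine-tuned import misbehaves.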

Reference: github-starred/ollama#8324