[GH-ISSUE #11914] How to deploy the fine tuned qwen2.5-vl #54421

Open
opened 2026-04-29 05:55:28 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @liyan1997 on GitHub (Aug 15, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11914

What is the issue?

I noticed that the official qwen2.5-vl has been adapted, but I want to deploy a model fine-tuned on my own dataset. I used llama.cpp to quantize the fine-tuned weights, producing qwen2.5-vl-7B-Q4_K.gguf and qwen2.5-vl-7b-vision-gguf. Inference with llama.cpp works normally. However, after I wrote a Modelfile and deployed the model with `ollama create`, inference produces random answers.
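The issue doesn't include the Modelfile, so the exact cause can't be confirmed, but a common reason for nonsense output after importing a chat-tuned GGUF with `ollama create` is a Modelfile that lacks the model's chat template and stop tokens: without them, Ollama sends raw text instead of the ChatML-style prompt the model was trained on. A minimal sketch, assuming the filenames from this issue and Qwen2.5's ChatML-style template (verify against the official qwen2.5-vl model's `ollama show --modelfile` output):

```
# Hypothetical Modelfile sketch — paths and template are assumptions, not confirmed by the issue
FROM ./qwen2.5-vl-7B-Q4_K.gguf

# Qwen2.5 models use a ChatML-style template with <|im_start|>/<|im_end|> markers
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
```

Note also that a separately exported vision projector GGUF may not be picked up automatically by `ollama create`; whether and how the projector can be referenced from a Modelfile depends on the Ollama version, so check the import documentation for the version in use (0.11.4 here).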

Relevant log output


OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.11.4

GiteaMirror added the bug label 2026-04-29 05:55:28 -05:00

Reference: github-starred/ollama#54421