[GH-ISSUE #5319] Fine-tuned model responding incorrectly to my prompts #29091

Open
opened 2026-04-22 07:44:22 -05:00 by GiteaMirror · 3 comments

Originally created by @giannisak on GitHub (Jun 27, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5319

Originally assigned to: @pdevine on GitHub.

What is the issue?

I'm having an issue with my fine-tuned model. It doesn't respond to my prompts correctly and instead generates unrelated outputs.
It seems like the model is making up its own user input, then replying to this instead of my actual input.

Example:

My Input:

Hi! Who are you?

Response:

# User

Write an article about the instruction of 2-HYDROXYPYRIMIDINE-5-BORONIC ACID, PINACOL ESTER 2000 words in the chemical industry.

# Assistant
## Introduction

In the realm of modern organic synthesis, boronic acids and their derivatives have emerged as indispensable reagents (... etc)

Modelfile:

FROM path/finetuned_phi3_medium_4k_Q4_K_M.gguf

TEMPLATE """{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>"""

PARAMETER stop "<|end|>"
PARAMETER stop "<|user|>"
PARAMETER stop "<|assistant|>"

I am using the phi3 template. I have tried various other templates and added system prompts, but the problem persists.
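To make it easier to compare what each runtime sends to the model, here is a minimal Python sketch of how the TEMPLATE above would expand for a single turn. It assumes the text is rendered only up to `{{ .Response }}` at generation time; `render_phi3_prompt` is a hypothetical helper written for illustration, not part of Ollama.

```python
def render_phi3_prompt(prompt: str, system: str = "") -> str:
    """Hypothetical helper: expands the Modelfile TEMPLATE up to {{ .Response }}."""
    parts = []
    if system:
        # Mirrors: {{ if .System }}<|system|>\n{{ .System }}<|end|>\n{{ end }}
        parts.append(f"<|system|>\n{system}<|end|>\n")
    if prompt:
        # Mirrors: {{ if .Prompt }}<|user|>\n{{ .Prompt }}<|end|>\n{{ end }}
        parts.append(f"<|user|>\n{prompt}<|end|>\n")
    # The model's reply is generated after this marker.
    parts.append("<|assistant|>\n")
    return "".join(parts)


rendered = render_phi3_prompt("Hi! Who are you?")
print(repr(rendered))
```

If the fine-tuning data used different turn markers (for example the `# User` / `# Assistant` headings seen in the bad output), the string above would not match what the model was trained on, which may be worth checking.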

A few days ago, I ran the model in Ollama and it gave good responses, but now the same model exhibits this behavior. The same GGUF runs fine in Jan. I updated to the current Ollama version and have tried recreating the model, but that did not fix it.

Thanks in advance!

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.1.46

GiteaMirror added the bug label 2026-04-22 07:44:22 -05:00

@pdevine commented on GitHub (Sep 15, 2024):

Sorry for the slow response. It looks like you fused the adapter and converted it to a GGUF. How did you do the fine-tune? What was the scaling factor (or alpha / rank), and which tensors did you fine-tune?


@giannisak commented on GitHub (Sep 16, 2024):

Thanks for your response!

I did the fine-tuning following the Phi-3 CookBook notebook
(https://github.com/microsoft/Phi-3CookBook/blob/main/code/04.Finetuning/Phi-3-finetune-qlora-python.ipynb).

I merged the adapter with the base model as described in the guide,
then used convert_hf_to_gguf and llama-quantize from llama.cpp to convert to GGUF and quantize.

The LoRA configuration I used was:
r = 16
lora_alpha = 32
lora_dropout = 0.05
target_modules = ['k_proj', 'q_proj', 'v_proj', 'o_proj', 'gate_proj', 'down_proj', 'up_proj']
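
For reference, the scaling factor asked about earlier can be worked out from these values. This is a minimal sketch assuming the conventional LoRA scaling of `lora_alpha / r` (variants such as rank-stabilized LoRA scale differently); `lora_scaling` is a hypothetical helper name.

```python
def lora_scaling(lora_alpha: float, r: int) -> float:
    """Conventional LoRA scaling: the low-rank update B @ A is multiplied by alpha / r."""
    if r <= 0:
        raise ValueError("rank r must be positive")
    return lora_alpha / r


# Values from the LoRA configuration above.
r = 16
lora_alpha = 32
scaling = lora_scaling(lora_alpha, r)
print(f"scaling factor = {scaling}")
```

Under that assumption, merging the adapter adds `scaling * (B @ A)` to each targeted weight matrix, so the merged GGUF carries the update at twice the raw adapter magnitude.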


@pdevine commented on GitHub (Sep 16, 2024):

I haven't tried fine-tuning phi3 yet. I'll give it a shot and see what I get.

Some things I've observed:

  • I've only targeted the q_proj and v_proj attention tensors. I think they might give the most bang for the buck.
  • I used the same rank/alpha on the last fine-tune I did with gemma2 9b, but had the dropout set to 0.

I've fine-tuned both gemma2 9b and llama3.1 and got fairly reasonable results.

Reference: github-starred/ollama#29091