[GH-ISSUE #590] Lingering system prompt #27661

Closed
opened 2026-04-25 02:24:16 -05:00 by GiteaMirror · 7 comments

Originally created by @robertvazan on GitHub (Jan 27, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/590

Bug Report

Description

Bug Summary:
Models run with top_k=1, which are normally perfectly reproducible, sometimes produce different output when the system prompt in settings is set to something non-empty and then cleared. They sometimes seem to follow a system prompt that was removed long ago. This persists across WebUI and ollama restarts and over long periods.

It's hard to reproduce, because some hidden state somewhere makes it non-obvious what is actually being sent to ollama. Plus it's hard to tell whether the models are just failing to follow the prompt or are actually being fed an incorrect prompt. I also remember some weirdness where the regular prompt (not the system prompt) is ignored in favor of something apparently cached from another conversation, especially when switching between models, but I previously thought that must be some low-level issue in ollama itself. Now I am not sure whether it's related.

While trying to reproduce the issue, I also found it hard to understand how the system prompt in settings interacts with the system prompts in custom and official modelfiles. Do the various system prompts (official modelfile, custom modelfile, settings) overwrite each other, or are they appended to each other? In what order? What happens when one of them is blank?

Steps to Reproduce:
There's currently no reliable way to reproduce this. I just sometimes get responses that seem to use an old system prompt. Changing the system prompt and then clearing it tends to hide the issue for some time.

Expected Behavior:
Ollama WebUI should appear to be entirely stateless. There should be no state transfer between chats and no dependency on the order of settings changes.

@robertvazan commented on GitHub (Jan 27, 2024):

PS: To give you an idea of the steps that I took before observing this:

  1. I tried to reproduce this some time ago with a system prompt (in settings) that asked the model to write responses in ALL CAPS. That system prompt was removed several days ago, and models seemed to work fine.
  2. Today, I got a model response in ALL CAPS. Then in another chat, using another model, I again got an ALL CAPS response. That seemed unlikely, so I started experimenting.
  3. I clicked the regenerate button to verify that the ALL CAPS response is generated reproducibly (top_k=1 model).
  4. I changed the (previously blank) system prompt in settings to "refuse to answer any question" and clicked the regenerate button. The model ignored the prompt (probably in favor of the system prompt in the custom modelfile), but it had the effect of changing the response. No more ALL CAPS.
  5. I cleared the system prompt and clicked regenerate. This should have produced the same response as in step 3, because the settings were the same and this is a top_k=1 model, but I instead got a different response without ALL CAPS.
  6. I created a new chat with the same model and copied in the same prompt. The model reproducibly generated the same response as in step 5.

So steps 2+3 and 5+6 run under the same settings (blank system prompt, same model, top_k=1), but they produce different outputs. Yet the output is reproducible within each pair (3 is the same as 2, 6 is the same as 5).

The only unusual thing I did today was make some small tweaks to the system prompts in custom modelfiles. Perhaps this caused WebUI or ollama to dig the old system prompt out of somewhere?
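
One way to separate WebUI state from ollama behavior would be to replay an identical request directly against ollama twice and compare outputs. A minimal sketch using Python and the `requests` package (model name, prompt, and URL are placeholders):

```python
# Determinism check that bypasses the WebUI: send the exact same request
# to ollama twice with top_k=1 and compare the outputs. Model name,
# prompt, and URL are placeholders for this sketch.
import requests

OLLAMA_CHAT = "http://localhost:11434/api/chat"  # default ollama port

payload = {
    "model": "llama2",  # placeholder; substitute the model used above
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "options": {"top_k": 1},  # greedy decoding, so output should repeat
    "stream": False,
}

first = requests.post(OLLAMA_CHAT, json=payload).json()["message"]["content"]
second = requests.post(OLLAMA_CHAT, json=payload).json()["message"]["content"]
print("reproducible:", first == second)
```

If this prints `True` while regenerating in the WebUI still varies, the hidden state is on the WebUI side.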

@tjbck commented on GitHub (Jan 27, 2024):

Hmm, the webui should already be stateless. Could you share the payloads being sent to Ollama with us? You can find them in the Network tab of your browser's dev tools; every request to `/ollama/api/chat` is of interest. Keep us updated, thanks!

<img width="880" alt="Screenshot of browser dev tools showing requests to /ollama/api/chat" src="https://github.com/ollama-webui/ollama-webui/assets/25473318/ddf15671-6406-4091-9cd2-77c7d1f15123">
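
For reference, here is a rough sketch of the shape of such a payload (illustrative of the Ollama chat API, not the WebUI's exact schema). A lingering system prompt would show up as a `system` message at the head of the list:

```python
# Rough shape of a /ollama/api/chat request body (a sketch, not the
# WebUI's exact schema). A lingering system prompt would appear as the
# leading "system" message.
suspect_payload = {
    "model": "llama2",  # placeholder model name
    "messages": [
        {"role": "system", "content": "RESPOND IN ALL CAPS"},  # <- the suspect
        {"role": "user", "content": "What is the capital of France?"},
    ],
}
```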

@robertvazan commented on GitHub (Jan 27, 2024):

@tjbck Any hints on how to catch this information after the fact? Since the issue is not reliably reproducible, I never have dev tools open in time.

@justinh-rahb commented on GitHub (Jan 28, 2024):

@robertvazan I'd try running `ollama serve` in some way that logs its console output (`screen` works for me), or putting a proxy between it and ollama-webui that logs the requests; either can be left running to catch the smoking gun, as it were.
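
Along those lines, here is a minimal sketch of such a logging proxy (Python stdlib only; the listen port is an arbitrary choice, and you would point the WebUI's ollama base URL at the proxy instead of at ollama directly):

```python
# Logging proxy sketch: listens on one port, forwards every POST to the
# real ollama, and prints each request body so the exact payload is
# captured after the fact. Responses are buffered whole (streaming output
# arrives all at once) and only POST is handled -- a debugging aid, not a
# production proxy.
import http.server
import urllib.request

OLLAMA = "http://localhost:11434"  # real ollama (default port)
LISTEN_PORT = 11435                # proxy port (arbitrary choice)

class LoggingProxy(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(f"\n--- POST {self.path} ---\n{body.decode(errors='replace')}")
        request = urllib.request.Request(
            OLLAMA + self.path,
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            payload = response.read()
            status = response.status
            content_type = response.headers.get("Content-Type", "application/json")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

http.server.HTTPServer(("", LISTEN_PORT), LoggingProxy).serve_forever()
```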

In my experience, I've also noticed some weirdness in how system prompts are used, and would like some clarity on the subject as well. For the most part I leave the settings box alone and assume that whatever has been set in the Modelfile (if anything) is in play, but on occasion I've noticed models that are typically very liberal in their output suddenly acting *strange* when switching back and forth.

@tjbck commented on GitHub (Jan 29, 2024):

I'd greatly appreciate it if you could provide more detailed information; there isn't much I can do with what has been provided so far :/

@robertvazan I'm trying to reproduce the issue with the steps you've provided. Could you clarify a bit by writing an easy-to-follow, step-by-step guide on how to reproduce it? I'm not entirely sure I'm getting what you wrote, especially here:

> So steps 2+3 and 5+6 run under the same settings (blank system prompt, same model, top_k=1), but they produce different outputs. Yet the output is reproducible within each pair (3 is the same as 2, 6 is the same as 5).

@robertvazan commented on GitHub (Jan 29, 2024):

@tjbck Sorry, this is hard to reproduce at the moment. I have it in my chat history, so I am sure events happened as I described them, but repeating the same steps does not yield the same results. I theorize there must be hidden state somewhere. I will try reproducing this once more when I have a bit of time.

@robertvazan commented on GitHub (Feb 3, 2024):

I could not reproduce this after several trials. I am running ollama with `OLLAMA_DEBUG=1` now, so next time this happens, I will know exactly what is being sent to ollama.

While looking into this, I noticed some discussion in ollama or llama.cpp about prompt cache issues. I speculate that the cache was perhaps getting corrupted on the ollama/llama.cpp side. On the other hand, the lingering system prompt was several days old, so something must have been persisted. I use custom models a lot, so perhaps some older version of a custom model got loaded with the old system prompt in it. Any of these issues could have been fixed in ollama or llama.cpp in the meantime.

BTW, I figured out that system prompts override each other with the following priority: settings, then custom model, then base model. Just noting it here in case anyone runs into similar problems and wonders how the system prompts are combined.
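
In code terms, that would amount to a first-non-empty-wins resolution. A minimal sketch (names are illustrative, not actual WebUI internals):

```python
# Sketch of the override behavior described above: the first non-empty
# system prompt wins, in the order settings > custom model > base model.
# Names are illustrative, not actual WebUI internals.
def effective_system_prompt(settings_prompt, custom_prompt, base_prompt):
    for prompt in (settings_prompt, custom_prompt, base_prompt):
        if prompt:  # a blank prompt falls through to the next level
            return prompt
    return None

# Example: the settings prompt is blank, so the custom modelfile prompt applies.
assert effective_system_prompt("", "Answer as a pirate.", "Be helpful.") == "Answer as a pirate."
```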
