[GH-ISSUE #23538] issue: Gemma4 output is sometimes part of the "thinking" block, sometimes not. #35536

Closed
opened 2026-04-25 09:44:27 -05:00 by GiteaMirror · 5 comments

Originally created by @derritter88 on GitHub (Apr 9, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23538

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.8.12

Ollama Version (if applicable)

v0.20.3

Operating System

Ubuntu 24.04

Browser (if applicable)

Firefox

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

The output after reasoning/thinking should be displayed as a standalone part, as usual.

Actual Behavior

Sometimes the output ends up inside the reasoning/thinking block, e.g.:

```
*(Drafting the "Smart Sweets" list)*:
- Dates + Peanut Butter (Nature's candy).
- Fruit + Protein (Controlled sugar).
- Dark Chocolate (The current winner).

*End of thought process.*<thought>Erst einmal: **Herzlichen Glückwunsch!**
```

`<thought>` should indicate that thinking is done and the output begins.
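For illustration, the expected behavior described above can be sketched as a small client-side parser. This is a hypothetical helper, not Open WebUI code; the marker name follows the `<thought>` tag shown in the example, and the assumption that everything before the final marker belongs to the thinking block comes from this report:

```python
# Hypothetical sketch: split a raw model response into the reasoning part
# and the visible answer, assuming text before the last <thought> marker
# is the thinking block (as described in this issue).
THOUGHT_MARKER = "<thought>"

def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer); with no marker, the whole text is the answer."""
    marker_pos = raw.rfind(THOUGHT_MARKER)
    if marker_pos == -1:
        return "", raw
    reasoning = raw[:marker_pos]
    answer = raw[marker_pos + len(THOUGHT_MARKER):]
    return reasoning.strip(), answer.strip()
```

The bug report amounts to this split happening inconsistently: sometimes the marker is missing or misplaced, so the answer stays inside the collapsed thinking section.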

Steps to Reproduce

Use gemma4:26b and interact with it

Logs & Screenshots

(This matches the provided good response). Switched to German. Focus on "Dopamin-Ersatz" and "Regeneration". End with a motivating closing. (Self-correction: Ensure I don't sound like a doctor, but like a nutrition-savvy training partner). Highlighting magnesium in dark chocolate is a great "pro-athlete" tip.

Wait, I should also mention the "Sugar Crash" risk for training. If he eats a huge sugary snack right before a run, it's bad. If he eats it after, it's fine.

Final Polish of the German text. (Proceed to generate output).<thought>Das ist erst einmal: Herzlichen Glückwunsch! Mit dem Rauchstopp hast du die wichtigste Entscheidung für deine Lungenkapazität und deine kardiovaskuläre Gesundheit als Triathlet getroffen. Das ist ein gewaltiger Sieg für dein Training.

Dass du jetzt auf Süßes anspringst, ist biologisch absolut logisch. Nikotin und Zucker triggern beide das Belohnungssystem im Gehirn (Dopamin). Wenn das Nikotin wegfällt, sucht das Gehirn nach einem Ersatz, um den Dopaminspiegel stabil zu halten. Süßes ist der einfachste und schnellste Weg.

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 09:44:27 -05:00

@Classic298 commented on GitHub (Apr 9, 2026):

https://github.com/open-webui/open-webui/issues/23357


@TomTheWise commented on GitHub (Apr 9, 2026):

https://www.reddit.com/r/LocalLLaMA/comments/1sgl3qz/gemma_4_on_llamacpp_should_be_stable_now/

From my understanding, this MAYBE is not a bug in Open WebUI but rather a current requirement to explicitly state the chat template for llama.cpp / ik_llama.cpp / other variants to prevent such errors. `--jinja` is apparently NOT enough.
In the Reddit comments there are some indications that the interleaved template, which is specifically named for the 31B, should work for the other gemma4 variants too (including the MoE 26B-A4B), as the Google documentation on parsing does not differentiate between the model variants:
https://ai.google.dev/gemma/docs/core/prompt-formatting-gemma4?hl=de#managing-thought-context

I'll test tomorrow, and hopefully the chat parameters (reasoning enabled/disabled and so on) won't be ignored anymore, as llama.cpp will then know how to tell the LLM whether it should reason or not. But that's just my theory.
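As a hedged sketch of that theory: Ollama's chat API exposes a `think` field for reasoning-capable models, so a client can make the reasoning toggle explicit in the request instead of relying on the chat template alone. Whether gemma4 honours the field is exactly the open question here; the model tag and payload shape below simply follow this thread:

```python
import json

# Sketch (not Open WebUI code): an Ollama /api/chat request body with an
# explicit reasoning toggle. Ollama's "think" field asks the backend to
# enable/disable reasoning; whether gemma4:26b respects it is untested here.
def build_chat_payload(prompt: str, think: bool) -> str:
    payload = {
        "model": "gemma4:26b",
        "messages": [{"role": "user", "content": prompt}],
        "think": think,  # explicit on/off instead of relying on the template
        "stream": False,
    }
    return json.dumps(payload)
```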


@TomTheWise commented on GitHub (Apr 9, 2026):

OK, even after applying the chat template to llama.cpp as currently recommended on Reddit and elsewhere, I too can still reproduce it - very rarely.
It especially occurs in longer back-and-forth conversations. It's rare, but once it happens there is a high chance that the conversation is unfixable, because it will always run into that error again.

![Image](https://github.com/user-attachments/assets/40495a7a-f17d-480e-97d5-9f80878b3e54)
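The "unfixable conversation" symptom suggests one possible client-side mitigation (an untested sketch, not an existing Open WebUI feature): strip stray `<thought>` markers from earlier assistant turns before resending the history, so a single malformed reply is not fed back to the model verbatim on every following turn:

```python
import re

# Untested client-side sketch: remove stray <thought>/</thought> markers
# from earlier assistant messages so one malformed turn does not poison
# the rest of the conversation.
THOUGHT_TAG = re.compile(r"</?thought>")

def sanitize_history(messages: list[dict]) -> list[dict]:
    cleaned = []
    for msg in messages:
        if msg.get("role") == "assistant" and "content" in msg:
            msg = {**msg, "content": THOUGHT_TAG.sub("", msg["content"])}
        cleaned.append(msg)
    return cleaned
```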

@TomTheWise commented on GitHub (Apr 10, 2026):

Google finally uploaded Jinja chat templates that apparently are now of good quality.

After researching a bit, I think OWUI does indeed need a fix / change for gemma4?
So if OWUI does need updates for the gemma4 control tokens, they are now officially clear.
Here is the Google documentation: https://ai.google.dev/gemma/docs/core/prompt-formatting-gemma4?hl=de


@tjbck commented on GitHub (Apr 12, 2026):

Provider dependent.


Reference: github-starred/open-webui#35536