issue: </end_of_turn> shown in chat for both Llama and Gemma when searching Knowledgebase #5400

New Issue

GiteaMirror · 2025-11-11T16:19:58-06:00

GiteaMirror commented

2025-11-11 16:19:58 -06:00

Originally created by @digitalassassins on GitHub (May 30, 2025).

Check Existing Issues

I have searched the existing issues and discussions.
I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.13

Ollama Version (if applicable)

v0.90

Operating System

Windows

Browser (if applicable)

Edge

Confirmation

I have read and followed all instructions in README.md.
I am using the latest version of both Open WebUI and Ollama.
I have included the browser console logs.
I have included the Docker container logs.
I have provided every relevant configuration, setting, and environment variable used in my setup.
I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
Start with the initial platform/version/OS and dependencies used,
Specify exact install/launch/configure commands,
List URLs visited, user input (incl. example values/emails/passwords if needed),
Describe all options and toggles enabled or changed,
Include any files or environmental changes,
Identify the expected and actual result at each stage,
Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When searching the Knowledgebase, Llama and Gemma do not show the text: </end_of_turn> in the chat window.

When using both models without searching the Knowledgebase, everything works as expected.

Actual Behavior

Knowledgebase replies not to show the writing </end_of_turn> in the chat window.

Steps to Reproduce

Use Gemma or Llama to send a normal message in a chat window.
Everything works as expected
Use a workspace model to search a Knowledgebase, it returns the text </end_of_turn> after the response and before the references

Logs & Screenshots

Working as expected in normal chat

Shows </end of turn> markup in the response.

Additional Information

This is using Ollama as the embedding model.
Is it the embedding model that is sending the second markdown that isn't getting stripped?

I have just stripped it myself using a filter for now. But it would be nice to be stripped automatically.

Originally created by @digitalassassins on GitHub (May 30, 2025). ### Check Existing Issues - [x] I have searched the existing issues and discussions. - [x] I am using the latest version of Open WebUI. ### Installation Method Docker ### Open WebUI Version v0.6.13 ### Ollama Version (if applicable) v0.90 ### Operating System Windows ### Browser (if applicable) Edge ### Confirmation - [x] I have read and followed all instructions in `README.md`. - [x] I am using the latest version of **both** Open WebUI and Ollama. - [x] I have included the browser console logs. - [x] I have included the Docker container logs. - [x] I have **provided every relevant configuration, setting, and environment variable used in my setup.** - [x] I have clearly **listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup** (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc). - [x] I have documented **step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation**. My steps: - Start with the initial platform/version/OS and dependencies used, - Specify exact install/launch/configure commands, - List URLs visited, user input (incl. example values/emails/passwords if needed), - Describe all options and toggles enabled or changed, - Include any files or environmental changes, - Identify the expected and actual result at each stage, - Ensure any reasonably skilled user can follow and hit the same issue. ### Expected Behavior When searching the Knowledgebase, Llama and Gemma do not show the text: `</end_of_turn>` in the chat window. When using both models without searching the Knowledgebase, everything works as expected. ### Actual Behavior Knowledgebase replies not to show the writing `</end_of_turn>` in the chat window. ### Steps to Reproduce 1) Use Gemma or Llama to send a normal message in a chat window. 2) Everything works as expected 3) Use a workspace model to search a Knowledgebase, it returns the text `</end_of_turn>` after the response and before the references ### Logs & Screenshots ![Image](https://github.com/user-attachments/assets/54ec4d17-1fae-4376-ba54-dc07c2f37eaa) Working as expected in normal chat ![Image](https://github.com/user-attachments/assets/8fc4a3ac-ba35-45b9-8116-be14b8505bb5) Shows </end of turn> markup in the response. ### Additional Information This is using Ollama as the embedding model. Is it the embedding model that is sending the second markdown that isn't getting stripped? I have just stripped it myself using a filter for now. But it would be nice to be stripped automatically.

GiteaMirror added the bug label 2025-11-11 16:19:58 -06:00

GiteaMirror closed this issue

2025-11-11 16:19:59 -06:00

GiteaMirror referenced this issue

2025-11-11 17:57:35 -06:00

[PR #5400] [MERGED] fix: if cuda is not available fallback to cpu #8475

GiteaMirror referenced this issue

2026-04-20 03:39:00 -05:00

[PR #5400] [MERGED] fix: if cuda is not available fallback to cpu #21679

GiteaMirror referenced this issue

2026-04-25 10:50:51 -05:00

[PR #5400] [MERGED] fix: if cuda is not available fallback to cpu #37309