issue: Conversation with multiple models incorrectly reuses the context of only one model #5405

Closed
opened 2025-11-11 16:20:07 -06:00 by GiteaMirror · 5 comments

Originally created by @Jefferderp on GitHub (May 30, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

v0.6.12

Ollama Version (if applicable)

No response

Operating System

Debian 12

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When chatting with multiple models side-by-side, it's expected that each model will maintain its own separate context history.

Actual Behavior

When chatting with multiple models side-by-side, I'm observing that only one model's previous replies are used as the context for all models: sometimes the leftmost model's, sometimes the rightmost's. See screenshot:

![Image](https://github.com/user-attachments/assets/052b51af-2ecc-4081-bd53-29ed581ab7dc)

Steps to Reproduce

  1. Start with a clean Debian 12 install.
  2. Install Python 3.11 and pip.
  3. Install open-webui via pip.
  4. Start open-webui.
  5. Open Chrome 136 in incognito mode.
  6. Go to http://192.168.1.103:8080/ and log in.
  7. Set up an API connection to OpenRouter.
  8. Start a new chat, picking any two models.
  9. Send a chat to each model. Receive unique replies.
  10. Send a follow-up reply to both models.
  11. Observe that each model's reply was based on the context of only one model's chat history (see previous screenshot).

Logs & Screenshots

N/A

Additional Information

To my knowledge, this behavior was not exhibited prior to a more recent version. I can look into my update history if helpful to you.

If this is intended behavior, I'd be interested to learn why.

GiteaMirror added the bug label 2025-11-11 16:20:07 -06:00

@tjbck commented on GitHub (May 31, 2025):

Intended behaviour: you're choosing one of the responses and then continuing the conversation.


@devdev999 commented on GitHub (Jun 1, 2025):

In this case, what is the logic to select the response that is used for continuation?


@RodolfoCastanheira commented on GitHub (Jun 3, 2025):

There is a subtle indicator of the selected answer. You select it by clicking it.


@Jefferderp commented on GitHub (Jun 3, 2025):

This behavior wasn't evident to me until now. The highlighting effect in dark mode is way too subtle, though I can see it now that I know to look.

I previously observed when clicking on different answers that sometimes my follow-up replies would disappear until I refreshed the page. This is because I was switching conversation branches. So what I thought was a UI glitch was actually a feature...

I think this should be better illustrated for the sake of all casual users. Until now, I was under the impression that prompting multiple models was for holding parallel conversations, not for picking the best answer and continuing from there. While the current workflow makes more sense, it's not visually intuitive.
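The branch-switching described above can be sketched as a message tree in which each user turn may have several model replies as children, and only the clicked (selected) reply is walked when building the next request's context. This is an illustrative model only, not Open WebUI's actual code; the `Message` class, `selected` field, and `build_context` function are all hypothetical names:

```python
# Illustrative sketch (NOT Open WebUI's implementation) of why two
# side-by-side models end up sharing one model's history: the chat is a
# tree, and only the user-selected branch is walked to build context.

class Message:
    def __init__(self, role, content, model=None):
        self.role = role          # "user" or "assistant"
        self.content = content
        self.model = model        # which model produced an assistant reply
        self.children = []        # alternative continuations, one per model
        self.selected = 0         # index of the reply the user clicked

    def add_child(self, msg):
        self.children.append(msg)
        return msg

def build_context(root):
    """Walk only the selected branch, as a branching chat UI would."""
    context, node = [], root
    while node is not None:
        context.append((node.role, node.content))
        node = node.children[node.selected] if node.children else None
    return context

# One user prompt answered by two models side by side:
root = Message("user", "What is 2+2?")
root.add_child(Message("assistant", "4", model="model-a"))
root.add_child(Message("assistant", "Four.", model="model-b"))

# Whichever reply is selected becomes the shared context for BOTH models
# on the next turn; the unselected branch is simply never walked.
root.selected = 1   # user clicked model-b's answer
print(build_context(root))
```

Under this model, clicking a different answer switches branches, which also explains why follow-up replies on the other branch seem to "disappear" until the selection is changed back.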


@Classic298 commented on GitHub (Jun 3, 2025):

Ah yes, I see it, but it is way too subtle. There is maybe a 1% color/brightness difference at most.

Reference: github-starred/open-webui#5405