[GH-ISSUE #23176] issue: Placeholder LLM message should be created with a pending/incomplete status to handle interrupted generation gracefully #19909

Closed
opened 2026-04-20 02:27:45 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @ShirasawaSama on GitHub (Mar 28, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23176

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Git Clone

Open WebUI Version

v0.8.12

Ollama Version (if applicable)

No response

Operating System

Mac

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

The placeholder message should be created with a pending/incomplete status (e.g., status: "pending" or done: false). This way:

  • On page reload, the frontend can detect unfinished messages and either display an appropriate error state or automatically retry the generation.
  • The user is never left confused by a silent empty message.
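The proposed fix could look roughly like the following sketch. The field name `done` and the helper `needsRecovery` are illustrative assumptions, not Open WebUI's actual schema or API:

```typescript
// Hypothetical message shape with an explicit completion marker.
// `done: false` would be written together with the placeholder and
// flipped to true only once the /completions stream finishes.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
  done: boolean; // assumed field; not the actual Open WebUI schema
}

// On page reload, the frontend could scan the history for interrupted
// assistant messages and surface an error state or a retry button.
function needsRecovery(msg: ChatMessage): boolean {
  return msg.role === "assistant" && !msg.done;
}
```

With such a marker, a legitimately empty-but-finished reply (`done: true`) and an interrupted placeholder (`done: false`) become distinguishable on reload.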

Actual Behavior

The placeholder assistant message is saved to the database without any pending/incomplete status marker. When the /completions request fails or never reaches the backend (due to page refresh, network drop, or conversation switch), the backend is completely unaware of the failure.

Upon revisiting the conversation, the empty placeholder message is rendered as if the assistant intentionally returned an empty response — no error state, no loading indicator, no retry option, and no way for the user to understand what went wrong or recover from it.
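To illustrate why the frontend cannot recover today: without any status marker, an interrupted placeholder is stored identically to a deliberately empty reply, so no reload-time check can tell them apart. A minimal sketch (names are hypothetical):

```typescript
// Without a pending marker, these two stored rows are identical.
type StoredMessage = { role: string; content: string };

const interruptedPlaceholder: StoredMessage = { role: "assistant", content: "" };
const intentionallyEmptyReply: StoredMessage = { role: "assistant", content: "" };

// Any predicate the frontend could write over the stored fields
// necessarily returns the same answer for both messages.
function sameStoredState(a: StoredMessage, b: StoredMessage): boolean {
  return a.role === b.role && a.content === b.content;
}
```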

Steps to Reproduce

  1. Start a conversation and send a message.
  2. Immediately refresh the page before the LLM response starts streaming back.
  3. Navigate back to the original conversation.
  4. Observe that the placeholder assistant message is displayed as if it completed normally — empty or with partial content, no error state, no retry option.

Alternative reproduction:

  • Simulate a network failure (e.g., disconnect Wi-Fi, kill the backend, or block the /completions request via DevTools) after the placeholder message is created but before the LLM response streams back, then refresh the page.


Logs & Screenshots

Screenshot: https://github.com/user-attachments/assets/551d320f-82b9-4f4a-9390-edce75ce9109

Additional Information

This issue becomes even more problematic when starting a new conversation by sending the first message. In this scenario:

  1. The user sends a message, which triggers POST /api/v1/chats/new to create a new chat — this succeeds.
  2. The frontend then calls POST /api/v1/chats/:id to create the placeholder assistant message, but the user refreshes the page before this call completes.
  3. The /completions request is also never sent.

The result is a conversation containing only the user's message with no assistant response whatsoever — not even an empty placeholder. The user sees a dead-end conversation with just their own message, no loading state, no error, and no indication that the assistant was supposed to respond. There is no retry button or any way to recover other than manually resending the message or deleting the conversation entirely.
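Until a status field exists, one stopgap the frontend could apply for this new-conversation case is a heuristic check on load: if the stored history ends with a user message, the assistant reply was never created, so a retry affordance could be shown instead of a dead end. A sketch under that assumption (all names are hypothetical):

```typescript
// Heuristic dead-end detection: a conversation whose last stored
// message is from the user has no assistant reply, not even an
// empty placeholder, so the generation was never started.
type HistoryMessage = { role: "user" | "assistant"; content: string };

function endsInDeadEnd(history: HistoryMessage[]): boolean {
  const last = history[history.length - 1];
  return last !== undefined && last.role === "user";
}
```

This heuristic would also cover the simpler repro above only partially (there, an empty assistant placeholder does exist), which is why a persisted pending status remains the more complete fix.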

Screenshot: https://github.com/user-attachments/assets/2d366f74-78e6-4928-b452-a7a57fca67f7
GiteaMirror added the bug label 2026-04-20 02:27:45 -05:00

@tjbck commented on GitHub (Apr 14, 2026):

Addressed in dev.

Reference: github-starred/open-webui#19909