mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-10 07:43:10 -05:00
issue: Bug: Lack of Data Integrity Validation on Chat Model Causes Silent Failure and Frontend Deadlock #5606
Originally created by @robotmikhro on GitHub (Jun 21, 2025).
Check Existing Issues
Installation Method
Git Clone
Open WebUI Version
0.6.15
Ollama Version (if applicable)
No response
Operating System
Ubuntu 22.04.5 LTS (Jammy Jellyfish)
Browser (if applicable)
Firefox 139.0.4 (64-bit)
Confirmation
Expected Behavior
Area: Backend, Frontend, Data Integrity
Description
The application's backend does not perform sufficient data integrity checks on the chat object model before persisting it to the database. This allows structurally inconsistent or corrupted chat data (e.g., messages with missing `parentId` links) to be saved without any errors or warnings being logged.
When the frontend subsequently attempts to render this structurally flawed data, its rendering logic enters an unrecoverable state, such as an infinite loop or a state management deadlock. This results in the UI hanging indefinitely on a loading spinner, preventing the user from ever viewing the chat.
This is a fundamental data handling issue, not just an import/export problem. The import feature is simply the most direct way to introduce this kind of corrupted data and reproduce the bug.
Expected Behavior
The application should be resilient to data inconsistencies:
Reasoning: Root Cause Analysis
The core issue is a lack of data validation and sanitization in the backend logic that handles the `Chat` model, possibly within `backend/open_webui/models/chats.py`.
The Vulnerability: The backend implicitly trusts the structural integrity of the `chat` JSON object it receives. It does not validate the parent-child relationships within the message tree. In the provided example, the message `23067407-327f-4167-bac6-5804080aed15` is an "orphan" message: it has no `parentId` key, and no other message lists it as a child. Saving such a chat completes without any exceptions or warnings, as the backend logic does not currently validate the internal consistency of the chat tree.
The Consequence: When the frontend fetches this chat, its rendering logic receives an inconsistent state. It may, for example, iterate over one list of messages while trying to look up relationships in a different, out-of-sync map. When it encounters an inconsistency like the orphan message, the logic to determine the next message to render or to build the conversational tree can enter an unrecoverable state. This results in a logical deadlock or an infinite loop, which prevents the rendering process from ever completing, causing the persistent loading indicator.
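To make the missing check concrete, here is a minimal sketch of the kind of validation the backend could run before saving. It is not taken from the Open WebUI code base: the function name `find_tree_problems` is hypothetical, and it assumes a root message carries `parentId: null` and that every entry of `history.messages` should also appear in the top-level `messages` array (the second assumption follows this report and may not hold for every chat).

```python
from typing import Any


def find_tree_problems(chat: dict[str, Any]) -> list[str]:
    """Return human-readable descriptions of structural problems in a chat."""
    problems: list[str] = []
    history = chat.get("history", {}).get("messages", {})

    for msg_id, msg in history.items():
        # An "orphan" in the sense of this report: the key itself is absent.
        if "parentId" not in msg:
            problems.append(f"{msg_id}: missing parentId key")
            continue
        parent_id = msg["parentId"]
        if parent_id is None:
            continue  # assumed: a root message has parentId set to null
        parent = history.get(parent_id)
        if parent is None:
            problems.append(f"{msg_id}: parentId {parent_id} not in history.messages")
        elif msg_id not in parent.get("childrenIds", []):
            problems.append(f"{msg_id}: not listed in childrenIds of {parent_id}")

    # Assumed from the report: the messages array should cover the history map.
    listed = {m.get("id") for m in chat.get("messages", [])}
    for msg_id in history:
        if msg_id not in listed:
            problems.append(f"{msg_id}: absent from top-level messages array")
    return problems
```

A caller could log the returned list as warnings and refuse to save when it is non-empty, which would at least make the failure visible instead of silent.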
Actual Behavior
The application hangs indefinitely on a loading spinner.
A critical diagnostic detail is that this failure is silent. No errors are generated in the browser's developer console, and no relevant warnings or errors appear in the Open-WebUI backend logs during the import or subsequent access of the chat. This points to a logical deadlock in the frontend rendering process, triggered by data that was silently accepted as "valid" by the backend.
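The report does not identify the frontend code that hangs, so the following is only an illustration of the general idea, written in Python rather than in the frontend's own language: a traversal over `parentId` links that raises on malformed data instead of spinning or stalling. The function name `branch_to_root` is hypothetical.

```python
def branch_to_root(messages: dict, leaf_id: str) -> list[str]:
    """Follow parentId links upward from leaf_id and return ids root-first."""
    chain: list[str] = []
    seen: set[str] = set()
    current = leaf_id
    while current is not None:
        if current in seen:
            # Without this guard a parentId cycle would loop forever.
            raise ValueError(f"cycle detected at message {current}")
        if current not in messages:
            raise ValueError(f"message {current} not found")
        seen.add(current)
        chain.append(current)
        # .get() treats a missing parentId key like a root; a stricter
        # caller may prefer to reject such a message instead.
        current = messages[current].get("parentId")
    chain.reverse()
    return chain
```

Raising an error here is a design choice: it turns a silent hang into something that shows up in a console or log.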
Steps to Reproduce
Example of Malformed Data to Reproduce the Bug
The following JSON data, with its full content, reliably triggers the bug. The key structural flaws are:
`23067407-327f-4167-bac6-5804080aed15` is an "orphan" message, as it is missing the `parentId` key.
The `messages` array is out of sync with the `history.messages` map (it is missing the orphan message).
Screenshots
Additional Information
This bug was identified through a detailed debugging process that revealed several key insights which may be helpful for the development team:
Specific Nature of Data Corruption: The issue was traced back to a specific type of structural inconsistency in the chat's JSON data. The primary flaw was an "orphan" message object that completely lacked a `parentId` key, effectively detaching it from the conversational tree. This was compounded by the top-level `messages` array being out of sync with the `history.messages` map, as it did not contain the orphan message.
Frontend Failure Mode is a Logical Deadlock, Not a Crash: A crucial observation is that this issue does not produce any errors in the browser's developer console. This indicates the problem is not a simple runtime `TypeError` from accessing an `undefined` property. Instead, it suggests the frontend's rendering logic enters an unrecoverable state (likely an infinite loop or a state management deadlock) when it tries to process the inconsistent message tree. It cannot determine the correct sequence of messages to render, and thus hangs.
@usrlocalben commented on GitHub (Aug 10, 2025):
Related to #11536
@usrlocalben commented on GitHub (Aug 10, 2025):
Users of llama-swap may get corrupted chats due to the way it construes backend errors as API messages.
Example:
This gives the Loading... spinner in chat UI despite successful retrieval at all other levels (no errors in console, no network errors, no noise on server logs etc.)
edit: I may have been a bit too hasty to link it to llama-swap. Glancing over the code there, I don't see what I expected wrt. the error block, so it may have been caused by open-webui itself.
@YifengChenGeotab commented on GitHub (Sep 12, 2025):
I have encountered another example of a corrupted chat that can lead to this kind of silent failure and frontend deadlock. Hope someone will take a look at this bug since it has been months. @tjbck
`.history.messages[]:
"xxxxx-xxxxxx": {
"content": "......"
}`
however a normal message should look like this:
"yyy-yyyyyyy": {
"parentId": "zzz-zzzzzz",
"id": "...",
"childrenIds": [],
"role": "assistant",
"content": "...",
"model": "gpt-5",
"modelName": "GPT 5",
"modelIdx": 1,
"userContext": null,
"timestamp": 1756845175,
"lastSentence": "...",
"done": true
}
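The difference between the two entries above can be checked mechanically. The sketch below is only an illustration: which keys count as required is an assumption based on the "normal" example, and `REQUIRED_KEYS` and `missing_keys` are hypothetical names, not part of Open WebUI.

```python
# Assumed minimal set of keys, taken from the "normal" message shown above.
REQUIRED_KEYS = ("id", "parentId", "childrenIds", "role", "content")


def missing_keys(message: dict) -> list[str]:
    """Return the required keys that are absent from a message entry."""
    return [key for key in REQUIRED_KEYS if key not in message]


# The corrupted entry from this comment only carries "content".
print(missing_keys({"content": "......"}))
```

For the corrupted entry this reports every key except `content`, while the normal entry yields an empty list.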
@DaRacci commented on GitHub (Sep 25, 2025):
I'm experiencing the same. This happens seemingly at random, maybe once a week. Restarting the web server seems to fix it, but it's very annoying.
Similarly, I've found that having a user-configured mcpo server which is unreachable can also cause this.
@rgaricano commented on GitHub (Sep 25, 2025):
The existing basic structural checks are quite limited, and since implementing full validation is delicate,
I made a Chat Integrity Checker to help manage these issues. It can be used to check and repair chat history. It has a dry run config valve (to check without making changes) and a debug level.
There are 2 versions: an action function and a tool.
As I made it for testing, I added a log entry for every chat processed, so the logging may be excessive.
Action Function: Check_Chats_Action_Function.py
Tool:
Valves:
v0.1.1 Edited, fixed some minor issues and added emitters:
Check_Chats_Tool.py
v0.1.2 Added param skip in tool (for processing chats starting at skip number), e.g. `use Check Chats tool skip 200`
Check_Chats_Tool_v0.1.2.py
v0.1.3 Added UserValves for easier config.
Check_Chats_Tool_v0.1.3.py
Example:
I check the chats with dry run enabled and detect a broken one:
I ask it to check only this chat:
I test it and it does indeed fail... loading
I turn off the dry run valve (to allow changes) and ask it to check and fix only this chat
Now I test the chat again... Fixed, I can see it correctly!
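For readers who want to see the check-then-fix flow in code, here is a minimal sketch of a dry-run style repair. It is not the code of the Check Chats tool linked above. The function name `repair_history` and the repair policy are assumptions: a missing `parentId` becomes `None`, a missing `childrenIds` becomes an empty list, and child links are rebuilt from parent links.

```python
import copy


def repair_history(messages: dict, dry_run: bool = True) -> tuple[dict, list[str]]:
    """Return (messages, changes); with dry_run=True the input is returned untouched."""
    fixed = copy.deepcopy(messages)  # never mutate the caller's data
    changes: list[str] = []

    # Pass 1: make sure the linking keys exist on every message.
    for msg_id, msg in fixed.items():
        if "parentId" not in msg:
            msg["parentId"] = None
            changes.append(f"{msg_id}: set missing parentId to None")
        if "childrenIds" not in msg:
            msg["childrenIds"] = []
            changes.append(f"{msg_id}: added empty childrenIds")

    # Pass 2: make sure each parent lists its children.
    for msg_id, msg in fixed.items():
        parent_id = msg["parentId"]
        parent = fixed.get(parent_id) if parent_id is not None else None
        if parent is not None and msg_id not in parent["childrenIds"]:
            parent["childrenIds"].append(msg_id)
            changes.append(f"{parent_id}: added {msg_id} to childrenIds")

    return (messages if dry_run else fixed), changes
```

Running it first with `dry_run=True` and reading the list of planned changes mirrors the workflow above: inspect what would be fixed, then rerun with `dry_run=False` to get the repaired copy.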