mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-22 17:42:18 -05:00
[GH-ISSUE #23787] issue: file content update can silently lose KB embeddings when reindex fails #107067
Originally created by @shaun0927 on GitHub (Apr 16, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23787
Check Existing Issues
Installation Method
Git Clone
Open WebUI Version
latest main as of 2026-04-16 (latest release also checked: v0.8.12)
Ollama Version (if applicable)
No response
Operating System
macOS Sequoia
Browser (if applicable)
No response
Confirmation
Expected Behavior
Updating a file's content should not leave any knowledge collection in a worse state than before.
If the KB reindex step fails after the file content has been updated, Open WebUI should either:
Actual Behavior
/files/{id}/data/content/update currently deletes the old KB vectors first and only then tries to rebuild them. If the rebuild fails, the exception is reduced to a warning and the route still returns success. That can leave the knowledge collection without any vectors for that file.
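The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not the Open WebUI code: `ToyVectorStore`, `failing_reindex`, and `update_content_delete_first` are hypothetical names, and the in-memory dict only stands in for a real vector collection.

```python
import logging

log = logging.getLogger(__name__)


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a KB vector collection."""

    def __init__(self):
        self.rows = {}  # vector id -> metadata dict

    def insert(self, vector_id, file_id):
        self.rows[vector_id] = {"file_id": file_id}

    def delete_by_file(self, file_id):
        # Drop every vector whose metadata points at this file.
        self.rows = {k: v for k, v in self.rows.items() if v["file_id"] != file_id}

    def count_for_file(self, file_id):
        return sum(1 for v in self.rows.values() if v["file_id"] == file_id)


def failing_reindex(store, file_id):
    """Simulates a reindex step that fails before inserting anything."""
    raise RuntimeError("embedding backend unavailable")


def update_content_delete_first(store, file_id, reindex):
    """Assumed ordering from the report: delete old vectors, then rebuild."""
    store.delete_by_file(file_id)  # old vectors are gone from here on
    try:
        reindex(store, file_id)
    except Exception as exc:
        # The failure is reduced to a warning, as the report describes.
        log.warning("reindex failed: %s", exc)
    return {"status": "success"}  # the caller still sees success


store = ToyVectorStore()
store.insert("v1", "file-1")
store.insert("v2", "file-1")

result = update_content_delete_first(store, "file-1", failing_reindex)
remaining = store.count_for_file("file-1")
print(result, remaining)
```

In this toy run the caller receives a success result while the store holds no vectors for the file, which is the silent-loss condition the report is about.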
This is different from #20558:
#20558 = stale old embeddings remained after edits
Steps to Reproduce
The current main code in backend/open_webui/routers/files.py does this during content updates:
1. process_file(..., content=...)
2. delete(... filter={'file_id': id})
3. process_file(..., collection_name=knowledge.id)
A deterministic local reproduction is:
Actual output:
The route-level call sequence is effectively:
Logs & Screenshots
Relevant current code path (backend/open_webui/routers/files.py):
and on failure:
Additional Information
Related but not exact duplicates:
#20558 fixed the stale-embedding case
#6311 is about failed embeddings handling in general
I have a narrow fix ready that rebuilds first and only deletes the stale vector ids after the new insert succeeds, plus a focused regression test.
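The rebuild-first ordering mentioned here can be sketched as follows. This is an illustration under stated assumptions rather than the actual patch: `ToyVectorStore`, `ids_for_file`, `delete_ids`, and `update_content_rebuild_first` are hypothetical names, and the sketch assumes the replacement vectors get ids distinct from the stale ones.

```python
class ToyVectorStore:
    """Hypothetical in-memory stand-in for a KB vector collection."""

    def __init__(self):
        self.rows = {}  # vector id -> file id

    def insert(self, vector_id, file_id):
        self.rows[vector_id] = file_id

    def ids_for_file(self, file_id):
        return [vid for vid, fid in self.rows.items() if fid == file_id]

    def delete_ids(self, vector_ids):
        for vid in vector_ids:
            self.rows.pop(vid, None)


def update_content_rebuild_first(store, file_id, reindex):
    """Insert replacement vectors first; delete stale ids only on success."""
    stale_ids = store.ids_for_file(file_id)  # snapshot before any change
    reindex(store, file_id)  # may raise; the old vectors are still in place
    store.delete_ids(stale_ids)  # reached only if the reindex succeeded


def good_reindex(store, file_id):
    # Assumption: new vectors use fresh ids that do not collide with old ones.
    store.insert("new-1", file_id)


def failing_reindex(store, file_id):
    raise RuntimeError("embedding backend unavailable")


store = ToyVectorStore()
store.insert("old-1", "file-1")
store.insert("old-2", "file-1")

# Failure path: the error reaches the caller and the old vectors survive.
try:
    update_content_rebuild_first(store, "file-1", failing_reindex)
    failure_surfaced = False
except RuntimeError:
    failure_surfaced = True
after_failure = sorted(store.ids_for_file("file-1"))

# Success path: only the replacement vectors remain.
update_content_rebuild_first(store, "file-1", good_reindex)
after_success = sorted(store.ids_for_file("file-1"))
print(failure_surfaced, after_failure, after_success)
```

Snapshotting the stale ids before the reindex is what makes the later delete safe: a filter on file id alone would also remove the freshly inserted vectors. A reindex that fails partway through could still leave some new vectors behind, which this sketch does not handle.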
@shaun0927 commented on GitHub (Apr 16, 2026):
I opened a narrow fix PR for this report: #23789. The PR preserves the old KB vector ids until the replacement reindex succeeds and includes a focused regression test.