Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-07 19:38:46 -05:00)
[PR #23736] feat: resumable WS streaming via Redis log with seq-based replay #50391
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/23736
Author: @Classic298
Created: 4/14/2026
Status: 🔄 Open
Base: dev ← Head: feat/stream-resume-redis

📝 Commits (10+)
8289ac7 feat(stream): resumable WS streaming via Redis log with seq-based replay
ff48c99 fix(stream): address code review findings on resume-stream
dec9176 fix(stream): resume every in-flight assistant message, not just the current
7dc4d84 fix(stream): harden done-detection and isolate resume seq from persisted state
860bde8 fix(stream): close replay race, cursor-efficient resume, bounded client bookkeeping
d9e2ffc fix(stream): truncate stale resume log at emitter creation
46d54c6 fix(stream): replace delayed-truncate task with TTL shortening on done
5533368 style+fix(stream): tighten comments and address two review findings
21ef755 fix(stream): close replay/live race and drop taskIds gated
655395 fix(stream): unconditional completion, fence timeout, single-batch replay

📊 Changes
2 files changed (+567 additions, -5 deletions)
View changed files
📝 backend/open_webui/socket/main.py (+347 -5)
📝 src/lib/components/chat/Chat.svelte (+220 -0)

📄 Description
This PR should NOT be merged without also merging https://github.com/open-webui/open-webui/pull/23859
Adds a bounded Redis stream log for every in-flight assistant message so clients can reconnect mid-stream (page refresh, network drop, device switch) and catch up on frames they missed without re-fetching the full chat from the database.
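The core of the mechanism is a bounded, seq-numbered per-message log. A minimal sketch of that idea, using an in-memory stand-in for the Redis stream (the StreamLog class, its method names, and the envelope shape are illustrative assumptions, not the PR's code; in the PR this is a Redis stream with MAXLEN trimming and a TTL):

```python
import itertools
from collections import deque

MAXLEN = 2000  # cap from the PR description: ~2000 entries per message log


class StreamLog:
    """In-memory stand-in for the per-message Redis stream
    ({REDIS_KEY_PREFIX}:stream:{message_id}); illustrative only."""

    def __init__(self):
        self._logs = {}      # message_id -> deque of (seq, envelope)
        self._counters = {}  # message_id -> monotonically increasing counter

    def append(self, message_id, envelope):
        counter = self._counters.setdefault(message_id, itertools.count(1))
        seq = next(counter)
        log = self._logs.setdefault(message_id, deque(maxlen=MAXLEN))
        log.append((seq, envelope))  # oldest frames drop once MAXLEN is hit
        return seq

    def read_after(self, message_id, last_seq):
        """Everything a resuming client missed: entries with seq > last_seq."""
        return [(s, e) for s, e in self._logs.get(message_id, ()) if s > last_seq]
```

Because seq is strictly increasing per message, "what did I miss since last_seq" is a single range read, which is what makes resume cheap compared with re-fetching the whole chat.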
Problem this solves
With ENABLE_REALTIME_CHAT_SAVE=False (the default), the backend does not write the assistant message to the DB until the stream finishes. If the client refreshes the page mid-stream, the chat load from the DB returns nothing for the in-progress message, and the response appears to vanish until the stream eventually completes: the user is left staring at an empty chat while the backend quietly keeps emitting tokens into the void.
Design
Every emitted event gets a monotonically increasing seq inside get_event_emitter and is appended to a bounded Redis stream keyed {REDIS_KEY_PREFIX}:stream:{message_id}. MAXLEN ~ 2000 entries, TTL 1h as a safety net.
Clients track message.lastSeq in Chat.svelte as events arrive. On chat load (mid-stream refresh) and on socket reconnect they emit resume-stream {chat_id, message_id, last_seq}.
The resume handler reads log entries with seq > last_seq and emits the missed envelopes to THAT session only (via to=sid), so live listeners in the user room keep receiving their normal live stream unchanged.
The client drops any frame with seq <= message.lastSeq, making replay idempotent against live frames that race the replay after a reconnect.
Once done: True fires, a background task truncates the log after a 30s grace window so late reconnects still catch the finalization; anything beyond that resumes from the now-up-to-date DB.
Orthogonality
Zero touches to middleware.py or the streaming hot path. The log stores whatever gets emitted; any future change to emit shape (chat:message:delta, per-block ops, JSON Patch, ...) is logged and replayed verbatim with no coupling.
Graceful degradation
No-op when Redis is not configured (WEBSOCKET_MANAGER != 'redis'). In that deployment mode, refresh during streaming retains the current behavior of waiting for the stream to complete.
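The no-op gate can be as simple as a config check before any append or replay path runs (a sketch only; the actual flag lookup in socket/main.py may differ, though the env var name comes from the PR text):

```python
import os


def resume_log_enabled() -> bool:
    """Resume logging only engages when the websocket manager is
    Redis-backed; otherwise callers skip append/replay entirely,
    preserving today's wait-for-completion behavior."""
    return os.environ.get("WEBSOCKET_MANAGER", "").lower() == "redis"
```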
Auth model
The log is keyed by message_id only. The resume handler must do a chat ownership check (Chats.get_chat_by_id_and_user_id) before replaying, so a malicious client cannot read another user's stream by guessing a message_id.
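Putting the ownership check and the targeted replay together, a resume-stream handler along the lines the PR describes might look like this (FakeSio and FakeChats are stand-ins for python-socketio's AsyncServer and open-webui's Chats model, and the handler signature is a hypothetical sketch, not the PR's actual code):

```python
import asyncio


class FakeChats:
    """Stand-in for the Chats table model (ownership lookup only)."""

    def __init__(self, owned):
        self.owned = owned  # set of (chat_id, user_id) pairs

    def get_chat_by_id_and_user_id(self, id, user_id):
        return {"id": id} if (id, user_id) in self.owned else None


class FakeSio:
    """Stand-in for python-socketio's AsyncServer (records emits)."""

    def __init__(self):
        self.sent = []

    async def emit(self, event, payload, to=None):
        self.sent.append((event, payload, to))


async def handle_resume_stream(sio, chats, sid, user_id, data, entries):
    """entries: (seq, envelope) pairs read from the per-message log."""
    # Ownership check first: a guessed message_id is not an auth token.
    if chats.get_chat_by_id_and_user_id(data["chat_id"], user_id) is None:
        return
    for seq, envelope in entries:
        if seq <= data["last_seq"]:
            continue  # client already has this frame
        # Replay to this session only; the user-room live stream is untouched.
        await sio.emit("chat-events", {**envelope, "seq": seq}, to=sid)
```

A request for a chat the user does not own replays nothing, while a legitimate reconnect receives exactly the frames after its last_seq, addressed to its own sid.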
Contributor License Agreement
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.