[GH-ISSUE #13786] issue: Performance: UI is extremely slow with large chat history (recommend Virtual Scrolling, Lazy Loading, and Compression) #17033

Closed
opened 2026-04-19 22:48:55 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @chalitbkb on GitHub (May 11, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/13786

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.9

Ollama Version (if applicable)

No response

Operating System

Windows 11

Browser (if applicable)

136.0.7103.93

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

When the chat history grows to hundreds of messages, refreshing or loading the page should display the existing messages within 1–5 seconds, and the UI should remain smooth without any long loading delays.

Actual Behavior

Currently, with a chat history of ≥ 100–200 messages, refreshing the browser causes the loading icon to spin for 5–15 minutes before the messages finally appear. The more messages there are, the longer you wait.

Steps to Reproduce

  1. In any chat session, exchange messages until the history exceeds 100–200 entries.
  2. Press F5 (refresh) in your browser.
  3. Observe that the loading icon continues for 5–15 minutes before the full history appears.

Logs & Screenshots

Screenshot: https://github.com/user-attachments/assets/a28fd435-f052-4491-a947-ca4967c775f8

Additional Information

Framework / Stack

  • Frontend: Svelte + TypeScript/JavaScript + CSS/Sass
  • Some components use Python via Pyodide

Root Causes

  • Full-history fetching: Every refresh loads the entire JSON chat history, even though only the latest two or three messages are visible at a time.
  • DOM overload: Creating hundreds or thousands of <div> elements at once in the DOM slows down rendering.
  • Reactivity patterns: Pushing into a large Svelte array triggers big diffs and repeated re-renders.
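A minimal sketch of the reactivity point above (plain JavaScript, not a full Svelte component; the helper names are hypothetical). Svelte's assignment-based reactivity only notices a *reassignment*, so an in-place `push` leaves the reference unchanged and can force workarounds that re-render more than needed, while building a new array makes the update explicit:

```javascript
// Hypothetical message-list helpers illustrating the two update patterns.
function appendMutating(messages, msg) {
  messages.push(msg);        // mutates in place: same array reference
  return messages;
}

function appendImmutable(messages, msg) {
  return [...messages, msg]; // new array: a plain assignment triggers the update
}

const a = [];
console.log(appendMutating(a, 'hi') === a);   // true: reference unchanged
console.log(appendImmutable(a, 'hi') === a);  // false: fresh reference
```

For very large histories, pairing this with virtual scrolling matters more than the copy itself, since the spread still costs O(n) per append.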

Why Compression Helps

  • Smaller payloads: Compressing the chat history on the server side (via gzip or Brotli) reduces the amount of data sent over the network—often by 70–90%. That means a JSON payload that might be 1 MB uncompressed can shrink to 100–300 KB compressed.
  • Faster transfer times: Less data over the wire means the browser gets the history faster.
  • Quicker parsing: Even when the browser has to decompress and parse JSON, modern JavaScript engines and Web Workers can handle that in milliseconds for payloads of hundreds of kilobytes—far faster than rendering thousands of DOM nodes.
  • Client-side storage: If you cache compressed history in LocalStorage (e.g. with lz-string), subsequent page loads can skip the network entirely and decompress locally in a Web Worker, making repeat visits nearly instant.

Suggested Fixes

  1. Virtual Scrolling: Use a library like svelte-virtual-list to render only the messages visible on screen, not the entire history.
  2. Lazy Loading / Pagination: Fetch chat history in chunks (e.g. 10–20 messages at a time), loading older messages only when the user scrolls up.
  3. Server-side Compression: Enable gzip or Brotli on your API endpoints so the payload size drops dramatically.
  4. Client-side Compression: If you store history locally, compress it with a library such as lz-string (UTF-16) or pako (zlib) before saving, and decompress in a Web Worker on load.
  5. Optimize Reactivity: Avoid array.push(...); instead use controlled updates like messages = [...messages, newMessage] or manage large arrays with Svelte stores.
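To make fix #2 concrete, here is a small pagination sketch (the function name and parameters are hypothetical, not Open WebUI's actual API): page 0 is the newest slice of the history, and higher page numbers reach further back as the user scrolls up.

```javascript
// Hypothetical helper: compute the [start, end) index window for one page
// of chat history, counting pages backward from the newest message.
function pageBounds(total, pageSize, page) {
  const end = Math.max(total - page * pageSize, 0);
  const start = Math.max(end - pageSize, 0);
  return { start, end };
}

console.log(pageBounds(105, 20, 0)); // { start: 85, end: 105 } → newest 20
console.log(pageBounds(105, 20, 5)); // { start: 0, end: 5 }   → oldest remainder
```

A scroll handler would request the next page when the user nears the top, prepend the fetched messages, and stop once `start` reaches 0.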
GiteaMirror added the bug label 2026-04-19 22:48:55 -05:00

@yoobaring1528 commented on GitHub (May 11, 2025):

Same problem

Reference: github-starred/open-webui#17033