In the streaming process, a crash may occur when the output tokens reach thousands or tens of thousands #2863

Closed
opened 2025-11-11 15:15:59 -06:00 by GiteaMirror · 0 comments

Originally created by @Nikoyyy on GitHub (Nov 28, 2024).

Installation Method

[docker run -d -p 3001:8080 --security-opt=seccomp=unconfined --privileged --mount=type=bind,source=/sys/fs/cgroup,target=/sys/fs/cgroup,readonly=false --mount=type=bind,source=/proc,target=/proc2,readonly=false,bind-recursive=disabled -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main]

Environment

  • Open WebUI Version: [v0.4.6]

  • Ollama (if applicable): []

  • Operating System: [Windows 10]

  • Browser (if applicable): [Version 131.0.6778.86 (Official Build) (64-bit)]

Confirmation:

  • [x] I have read and followed all the instructions provided in the README.md.
  • [x] I am on the latest version of both Open WebUI and Ollama.
  • [x] I have included the browser console logs.
  • [x] I have included the Docker container logs.
  • [x] I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

The streaming output should reach thousands or tens of thousands of tokens without crashing.

Actual Behavior:

When streaming reaches thousands or tens of thousands of tokens, crashes may occur.

Description

Bug Summary:
When streaming reaches thousands or tens of thousands of tokens, crashes may occur.

Reproduction Details

Steps to Reproduce:
Using the QwQ-32B-Preview model, input: There exist real numbers x and y, both greater than 1, such that log_x(y^x) = log_y(x^(4y)) = 10. Find xy.
The output is very likely to reach thousands or tens of thousands of tokens; as it grows, the interface becomes noticeably slower until the tab eventually crashes.
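Since the slowdown grows with output length, it can help to check whether the backend stream itself survives past a few thousand tokens, which would isolate the crash to the web UI's rendering rather than the model or server. Below is a minimal sketch that streams a deliberately long generation directly from Ollama's `/api/generate` NDJSON endpoint, bypassing Open WebUI; the base URL and model name are placeholders for the local setup, not part of this report.

```python
import json
import urllib.request


def accumulate_stream(ndjson_lines):
    """Accumulate Ollama-style NDJSON stream chunks into one string.

    Each line is expected to be a JSON object carrying a "response"
    text fragment, with "done": true on the final chunk.
    Returns (full_text, chunk_count).
    """
    parts = []
    count = 0
    for line in ndjson_lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        count += 1
        if chunk.get("done"):
            break
    return "".join(parts), count


def probe_long_stream(base_url="http://localhost:11434", model="qwq"):
    """Stream a long generation directly from Ollama (placeholder
    base_url/model) to check whether the backend stream completes."""
    req = urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps({
            "model": model,
            "prompt": "Count from 1 to 5000, one number per line.",
            "stream": True,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The response body is newline-delimited JSON, one chunk per line.
        return accumulate_stream(raw.decode() for raw in resp)
```

If `probe_long_stream` finishes cleanly while the browser tab still crashes on the same prompt, the problem is likely the frontend re-rendering an ever-growing message, not the stream itself.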

Logs and Screenshots

Screenshots/Screen Recordings (if applicable):
![44bb4023fd3db5cffebd2a50b00c5f54](https://github.com/user-attachments/assets/0daabdd1-d9f1-49a5-b9fb-9077102d4cd6)


Reference: github-starred/open-webui#2863