Slow chat completion for subsequent prompts #4040

Closed
opened 2025-11-11 15:45:01 -06:00 by GiteaMirror · 0 comments
Owner

Originally created by @vexvec on GitHub (Feb 22, 2025).

Bug Report

Installation Method

k8s (following the Docker deployment guideline)

Environment

  • Open WebUI Version: 0.5.16

  • Ollama (if applicable): 0.5.11

  • Operating System: Debian

  • Browser (if applicable): Chrome, Chromium 132

Confirmation:

  • [x] I have read and followed all the instructions provided in the README.md.
  • [x] I am on the latest version of both Open WebUI and Ollama.
  • [-] I have included the browser console logs.
  • [-] I have included the Docker container logs.
  • [x] I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

A smooth chat experience, like chatting directly in the Ollama console.

Actual Behavior:

Usually the first prompt executes normally; subsequent requests have a significant delay before any text appears.

Description

Bug Summary:
Open WebUI introduces a significant delay in answering chat prompts after the first response.

Reproduction Details

Steps to Reproduce:
1. Make a prompt
2. Wait for the answer
3. Make another prompt in the same chat
4. Wait minutes until the response appears

Logs and Screenshots

Browser Console Logs:
[Include relevant browser console logs, if applicable]

Docker Container Logs:
[Include relevant Docker container logs, if applicable]

Screenshots/Screen Recordings (if applicable):
[Attach any relevant screenshots to help illustrate the issue]

Additional Information

If I do the same in the Ollama console chat, this delay does not occur.
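To make the comparison between the Ollama console and Open WebUI more concrete, the delay can be quantified as time-to-first-token per prompt. Below is a minimal, hedged sketch of such a measurement helper; it assumes you can obtain the streamed response as an iterable of chunks (for example, the lines yielded by a streaming HTTP client talking to Ollama's `/api/chat` endpoint), and the function name is purely illustrative:

```python
import time

def time_to_first_token(chunks):
    """Measure latency of a streamed chat response.

    `chunks` is any iterable of response fragments. Returns a tuple
    (seconds until the first chunk arrived, total seconds until the
    stream was exhausted). The first element is None if the stream
    yielded nothing.
    """
    start = time.monotonic()
    first = None
    for _ in chunks:
        if first is None:
            # First fragment observed: this is the delay the user
            # perceives before any text is written to the chat.
            first = time.monotonic() - start
    total = time.monotonic() - start
    return first, total
```

Running this against the first and the second prompt of a chat, both directly against Ollama and through Open WebUI, would show whether the extra minutes are spent before the first token (pointing at pre-generation work in the WebUI request path) rather than in generation itself.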


Reference: github-starred/open-webui#4040