Network errors with multiple streams, ERR_INCOMPLETE_CHUNKED_ENCODING in debug mode #1841

Closed
opened 2025-11-11 14:54:37 -06:00 by GiteaMirror · 0 comments
Owner

Originally created by @crizCraig on GitHub (Aug 21, 2024).

Bug Report

Installation Method

git clone

Environment

  • Open WebUI Version: dev branch, ~3.14

  • Operating System: Mac Sonoma

  • Browser (if applicable): Chrome 127.0.6533.73

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

Both responses stream in without error when multiple models are selected, exercising `MultiResponseMessages.svelte`.

Actual Behavior:

For responses longer than around a paragraph, one or both of the outputs ends with a network error.

(Screenshots: Google Chrome 2024-08-21 13 58 16 and 2024-08-21 13 57 04, showing the network error in the response panels.)

Description

Bug Summary:

Since this only happens in debug mode (I'm using PyCharm), it's not super high priority, but I wanted to document it in case someone else runs into it. There seems to be a race condition, perhaps due to the limited concurrency available while debugging, that causes the error. My guess is that data from multiple models is making it into one HTTP chunk.
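For context on what the browser is complaining about: `ERR_INCOMPLETE_CHUNKED_ENCODING` means a chunked HTTP/1.1 response ended without the zero-length terminal chunk. A minimal, self-contained sketch (no Open WebUI code; `flaky_model_stream` is a made-up stand-in for one model's token generator) shows how a generator that dies mid-stream, as a stalled worker under a debugger might, produces exactly that kind of truncated body:

```python
def chunked_frames(chunks):
    """Frame byte chunks using HTTP/1.1 chunked transfer encoding.
    A well-formed stream must end with the terminal "0\r\n\r\n"."""
    for chunk in chunks:
        yield f"{len(chunk):x}\r\n".encode() + chunk + b"\r\n"
    yield b"0\r\n\r\n"  # terminal chunk: only reached if the source completes

def flaky_model_stream():
    """Hypothetical token stream that dies mid-response."""
    yield b"data: Once upon a time\n\n"
    raise RuntimeError("worker stalled under the debugger")

def render(stream):
    out = b""
    try:
        for frame in chunked_frames(stream):
            out += frame
    except RuntimeError:
        pass  # connection drops; the terminal chunk was never written
    return out

body = render(flaky_model_stream())
# body contains the first framed chunk but does not end with b"0\r\n\r\n",
# which is what the browser reports as ERR_INCOMPLETE_CHUNKED_ENCODING.
```

This doesn't pinpoint the race itself, but it matches the symptom: any exception or stall inside the streaming generator before it finishes yields an incomplete chunked body.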

Reproduction Details

Steps to Reproduce:

  1. Start the FastAPI backend server under the PyCharm debugger
  2. Select multiple models
  3. Ask a question with a long answer, like "write a long story about x"

I tried to reproduce this by requesting from two models in three different windows in "Run" mode, but everything worked normally. So again, I'd say this is low priority; just be aware that if you are developing in debug mode, multiple-model responses don't work well.
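The guess above, that output from multiple models ends up in one chunk, can be illustrated with a small deterministic sketch (again, no Open WebUI code; the stream contents and round-robin scheduler are made up). If each SSE event is emitted as multiple non-atomic writes, a scheduler that switches producers between those writes merges events from different models into frames the browser can't parse:

```python
def model_stream(name):
    """Hypothetical token stream: each SSE event is emitted as two
    separate writes (the payload, then the blank-line terminator)."""
    for i in range(2):
        yield f"data: {name} token {i}"
        yield "\n\n"

def interleave(*streams):
    """Round-robin scheduler standing in for reduced concurrency
    under a debugger: it may switch streams between the two writes
    that make up a single event."""
    iters = [iter(s) for s in streams]
    while iters:
        for it in list(iters):
            try:
                yield next(it)
            except StopIteration:
                iters.remove(it)

body = "".join(interleave(model_stream("llama3"), model_stream("phi3")))
# The two payload writes land back to back before either terminator,
# so one "event" now contains data from both models.
```

This is only a plausibility sketch of the race, not a diagnosis of where in the backend the writes interleave.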


Reference: github-starred/open-webui#1841