mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[GH-ISSUE #24169] issue: STT halts the server until finished #58885
Originally created by @Mastersomy on GitHub (Apr 27, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24169
Check Existing Issues
Installation Method
Docker
Open WebUI Version
0.9.2
Ollama Version (if applicable)
No response
Operating System
server is on Ubuntu 24.04 client is on Windows 11
Browser (if applicable)
Firefox 148.0 and Edge 147.0.3912.86
Confirmation
Expected Behavior
When someone on the instance uses STT, the transcription gets processed and nobody else is disrupted while using OWUI.
Actual Behavior
When STT is being used, the server process halts until processing is finished (or, more likely, the server process is the one doing the STT processing).
This produces timeouts for all users of the instance whenever someone submits an STT request that takes long to process (CPU inference or inference overload).
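The "server process halts" behavior is exactly what a blocking call inside a single asyncio event loop looks like. Here is a minimal stdlib sketch (all names hypothetical; `fake_transcribe` stands in for the synchronous STT work) that measures how long the event loop goes unresponsive with and without offloading:

```python
import asyncio
import time

def fake_transcribe(seconds: float) -> str:
    """Stand-in for a synchronous STT call (hypothetical name)."""
    time.sleep(seconds)  # blocks whichever thread runs it
    return "transcript"

async def heartbeat(ticks: list) -> None:
    """Plays the role of every other request the single event loop must serve."""
    for _ in range(6):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())

async def worst_gap(route) -> float:
    """Run `route` alongside the heartbeat and return the longest pause
    between heartbeat ticks, i.e. how long the loop was unresponsive."""
    ticks = [time.monotonic()]
    await asyncio.gather(route(), heartbeat(ticks))
    return max(b - a for a, b in zip(ticks, ticks[1:]))

async def blocking_route() -> str:
    # The failure mode: a sync call inside `async def` pins the event loop.
    return fake_transcribe(0.3)

async def offloaded_route() -> str:
    # The fix: run the sync work on a worker thread so the loop stays free.
    return await asyncio.to_thread(fake_transcribe, 0.3)

if __name__ == "__main__":
    print(f"blocking:  loop frozen for {asyncio.run(worst_gap(blocking_route)):.2f}s at worst")
    print(f"offloaded: loop frozen for {asyncio.run(worst_gap(offloaded_route)):.2f}s at worst")
```

With the blocking route the heartbeat stalls for the full duration of the fake transcription; with `asyncio.to_thread` it keeps ticking.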
Steps to Reproduce
Set up OWUI with only a CPU.
Set up HTTPS to be able to use STT.
Set up STT with a larger model (large-v3, for example).
Make a dictation that is 30 or so seconds long.
While it is being transcribed, the server will not answer any web requests.
Logs & Screenshots
browser logs:
websocket.js:119 WebSocket connection to 'wss://xxxxxxxxxxx/ws/socket.io/?EIO=4&transport=websocket' failed:
  createSocket @ websocket.js:119, doOpen @ websocket.js:24, open @ transport.js:47, _open @ socket.js:197, constructor @ socket.js:150, constructor @ socket.js:565, constructor @ socket.js:725, open @ manager.js:111, (anonymous) @ manager.js:337
(the same WebSocket failure and stack trace repeats two more times)
fetcher.js:77 GET https://xxxxxxxxxxxxxx/_app/version.json net::ERR_HTTP2_PROTOCOL_ERROR
  (anonymous) @ fetcher.js:77, check @ utils.js:272, setTimeout, create_updated_store @ utils.js:298, (anonymous) @ client.js:129
index.ts:75 POST https://xxxxxxxxxxxxxxx/api/v1/audio/transcriptions net::ERR_CONNECTION_REFUSED
  (anonymous) @ fetcher.js:77, (anonymous) @ index.ts:75, (anonymous) @ VoiceRecording.svelte:186, await in (anonymous), (anonymous) @ VoiceRecording.svelte:274
websocket.js:119 WebSocket connection to 'wss://xxxxxxxxxxxxx/ws/socket.io/?EIO=4&transport=websocket' failed:
Docker logs corresponding to the error:
2026-04-27 05:22:36.867 | INFO | open_webui.routers.audio:transcription:1234 - file.content_type: audio/webm;codecs=opus
2026-04-27 05:22:36.874 | INFO | open_webui.routers.audio:transcribe:1102 - transcribe: /app/backend/data/cache/audio/transcriptions/23fe9bf1-3566-4043-a427-2657f6b13cbd.webm None
2026-04-27 05:22:38.241 | INFO | open_webui.routers.audio:convert_audio_to_mp3:123 - Converted /app/backend/data/cache/audio/transcriptions/23fe9bf1-3566-4043-a427-2657f6b13cbd.webm to /app/backend/data/cache/audio/transcriptions/23fe9bf1-3566-4043-a427-2657f6b13cbd.mp3
Chunk paths: ['/app/backend/data/cache/audio/transcriptions/23fe9bf1-3566-4043-a427-2657f6b13cbd.mp3']
2026-04-27 05:22:48.199 | INFO | faster_whisper.transcribe:transcribe:881 - Processing audio with duration 00:05.160
2026-04-27 05:23:06.158 | INFO | faster_whisper.transcribe:transcribe:948 - Detected language 'de' with probability 0.97
2026-04-27 05:23:06.177 | INFO | open_webui.routers.audio:transcription_handler:660 - Detected language 'de' with probability 0.968905
Additional Information
We are currently evaluating Open WebUI, and because of that we are using large-v3 to gauge the quality that will be produced once we deploy for real.
@Classic298 commented on GitHub (May 1, 2026):
In backend/open_webui/routers/audio.py the `transcription` endpoint at line 1221 is `async def` but calls the synchronous `transcribe(...)` at line 1269 directly, which then gathers ThreadPoolExecutor chunk results via blocking `future.result()` at line 1138. With `UVICORN_WORKERS=1` (the default in start.sh), one event loop serves HTTP and Socket.IO, so while faster-whisper runs uvicorn cannot accept anything else, which is exactly what your ECONNREFUSED and dropped websockets show. Your own log has 18s of inference for 5.16s of audio, so a 30s clip is roughly two minutes of total stall.
Quick patch worth trying before anything else. In backend/open_webui/routers/audio.py:
Add to the imports near the top:
`import asyncio`
Replace line 1250
`contents = file.file.read()`
with
`contents = await file.read()`
Replace line 1269
`result = transcribe(request, file_path, metadata, user)`
with
`result = await asyncio.to_thread(transcribe, request, file_path, metadata, user)`
Rebuild the image or bind-mount the patched file into the container and reproduce. If other users can keep using the UI while a transcription is still running, then the blocking route was the cause and STT itself is fine, just slow on CPU.
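The `asyncio.to_thread` change keeps the route itself from blocking; the chunk gathering via blocking `future.result()` (line 1138) could likewise be made awaitable. A stdlib sketch of both patterns, with `transcribe_chunk` as a hypothetical stand-in for the per-chunk work, not the real open-webui function:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def transcribe_chunk(path: str) -> str:
    """Hypothetical stand-in for transcribing one audio chunk."""
    time.sleep(0.05)
    return f"text for {path}"

def gather_blocking(paths: list) -> list:
    """Blocking pattern: future.result() stalls the calling thread, and if
    that thread is the event loop, the whole server stalls with it."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(transcribe_chunk, p) for p in paths]
        return [f.result() for f in futures]

async def gather_awaitable(paths: list) -> list:
    """Awaitable pattern: hand the same work to the executor but await it,
    so the event loop keeps serving other requests in the meantime."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as pool:
        tasks = [loop.run_in_executor(pool, transcribe_chunk, p) for p in paths]
        return list(await asyncio.gather(*tasks))

if __name__ == "__main__":
    chunks = ["chunk-0.mp3", "chunk-1.mp3"]
    print(gather_blocking(chunks))
    print(asyncio.run(gather_awaitable(chunks)))
```

Both return the same results in the same order; only the awaitable version leaves the event loop free while the pool works.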
If it still misbehaves please share
UVICORN_WORKERS, host CPU count and RAM, `findmnt -T /app/backend/data` from inside the container, your `DATABASE_URL` setup (especially whether the data dir lives on NFS or another network mount), and ideally a `py-spy dump --pid <uvicorn-pid>` captured during a freeze. That should be enough to nail down whatever is left.
@Mastersomy commented on GitHub (May 3, 2026):
I tested it and there were no more disconnections. Previously, long transcriptions that caused Open WebUI to notice the connection was dropped wouldn't even appear in the chat UI, but with those changes I can't find a problem anymore.
As a test, after opening this issue we tried using an OpenAI-compatible server for STT, and it had the same problem. I retested that now with positive results: it works for that as well.
@Classic298 commented on GitHub (May 3, 2026):
@Mastersomy which of the options exactly did you test successfully?
@Mastersomy commented on GitHub (May 3, 2026):
OK, my test setup and the tests I performed:
I run two Open WebUI instances; the one used in this testing is on Unraid, the other is a VM at work with Docker.
The one on Unraid is super simple: just the data path mounted to a local folder and a port exposed, using Nginx Proxy Manager to get HTTPS.
I used a Docker volume mount to patch the file with the lines @Classic298 mentioned.
After that I opened the instance with two devices (my iPad and my desktop),
set STT to local Whisper using the large-v3 model,
then opened a new chat on my desktop, clicked the microphone, and talked for a short while.
After clicking the check mark to end the transcription: before the patch I could not do anything on the iPad; after the patch the iPad was unaffected (maybe a little slower to load, but that would be expected when the CPU on the server is at 100%).
At work we tested whether it would make a difference if we used a separate OpenAI-compatible STT server.
The short version is that it behaves exactly like the local version: before the changes from @Classic298 you can't even load the site on a separate device; after the change I can't even tell whether another device has a transcription in progress.
So the changes @Classic298 mentioned work to fix my issue.
@Classic298 commented on GitHub (May 3, 2026):
OK, thanks for testing. I'll PR it then.