[PR #24302] [CLOSED] fix: prevent STT from blocking the uvicorn event loop #66447

opened 2026-05-06 12:48:53 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/24302
Author: @gaurav0107
Created: 5/1/2026
Status: Closed

Base: dev ← Head: fix/24169-stt-blocks-event-loop


📝 Commits (1)

  • 49fc6fd fix: prevent STT from blocking the uvicorn event loop

📊 Changes

2 files changed (+4 additions, -3 deletions)


📝 backend/open_webui/routers/audio.py (+3 -2)
📝 backend/open_webui/routers/files.py (+1 -1)

📄 Description

Fixes #24169.

When a user uploads audio to POST /api/v1/audio/transcriptions, the whole Open WebUI server becomes unresponsive until transcription finishes. Other users see ECONNREFUSED errors and dropped Socket.IO WebSocket connections.

Root cause (already diagnosed in detail by @Classic298 on #24169):

  • backend/open_webui/routers/audio.py::transcription is async def but
    • reads the upload synchronously via file.file.read(), and
    • calls the sync helper transcribe(...) directly.
  • transcribe(...) uses its own ThreadPoolExecutor and blocks the calling thread on future.result() until every chunk is transcribed by faster-whisper.

With the default UVICORN_WORKERS=1, one event loop serves both HTTP and Socket.IO, so a multi-minute CPU transcription stalls every other user.
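The failure mode can be sketched with a minimal, self-contained example. This is illustrative only, not the actual router code: `transcribe` here is a stand-in that sleeps instead of calling faster-whisper, and the handler skips FastAPI entirely.

```python
# Minimal sketch of the anti-pattern: an `async def` handler that does
# blocking work directly on the event loop thread. Illustrative names;
# the real code lives in backend/open_webui/routers/audio.py.
import asyncio
import time


def transcribe(contents: bytes) -> dict:
    # Stand-in for the sync faster-whisper helper: it blocks the
    # calling thread until "inference" finishes.
    time.sleep(0.1)
    return {"text": "transcript"}


async def transcription(upload: bytes) -> dict:
    contents = upload            # real code: file.file.read(), a sync read
    return transcribe(contents)  # sync call — the event loop is stuck here


result = asyncio.run(transcription(b"fake-audio-bytes"))
print(result)  # {'text': 'transcript'}
```

While `transcribe(contents)` runs, no other coroutine on that loop can be scheduled, which is exactly the stall described above.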

The exact same anti-pattern exists in backend/open_webui/routers/files.py::process_uploaded_file._process_handler, which also calls transcribe(...) synchronously from an async function; this PR fixes that call site as well.

What this PR does

backend/open_webui/routers/audio.py:

  • Add import asyncio.
  • Replace contents = file.file.read() with contents = await file.read() (FastAPI's async UploadFile.read()).
  • Wrap transcribe(...) in await asyncio.to_thread(...).
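The shape of the change can be sketched as follows. This is a hedged simplification: `transcribe` is again a sleeping stand-in, and the real handler reads from a FastAPI `UploadFile` rather than taking bytes directly.

```python
# Simplified before/after of the handler change (stand-in helper, no
# FastAPI dependency):
import asyncio
import time


def transcribe(contents: bytes) -> dict:
    time.sleep(0.1)  # CPU-bound inference stand-in
    return {"text": "transcript"}


async def transcription(upload: bytes) -> dict:
    # Before: contents = file.file.read()   — sync read on the loop
    # After:  contents = await file.read()  — FastAPI's async UploadFile.read()
    contents = upload
    # Before: result = transcribe(contents) — blocks the loop
    # After: run the unchanged sync helper on a worker thread:
    return await asyncio.to_thread(transcribe, contents)


result = asyncio.run(transcription(b"fake-audio-bytes"))
print(result)  # {'text': 'transcript'}
```

The return value is identical either way; only the thread the blocking work runs on changes.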

backend/open_webui/routers/files.py:

  • Wrap the transcribe(...) call in await asyncio.to_thread(...). asyncio is already imported, and the preceding line already uses the same asyncio.to_thread pattern for Storage.get_file.

transcribe() itself is untouched — it is a sync function that correctly uses a ThreadPoolExecutor for chunk parallelism. The fix is just to stop running it on the event loop.
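Why leaving `transcribe()` sync is correct can be sketched as below. This is a hypothetical simplification of the chunking logic (the real function's chunking and inference differ): the key point is that `future.result()` blocks whichever thread calls it, which is harmless on a worker thread but disastrous on the event loop thread.

```python
# Hypothetical simplification of a sync transcribe() that parallelises
# chunks on its own ThreadPoolExecutor and blocks joining the futures.
import asyncio
from concurrent.futures import ThreadPoolExecutor


def transcribe_chunk(chunk: bytes) -> str:
    return chunk.decode()  # per-chunk inference stand-in


def transcribe(chunks: list[bytes]) -> str:
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(transcribe_chunk, c) for c in chunks]
        # f.result() blocks *this* thread until each chunk is done —
        # fine on a worker thread, a stall on the event loop thread.
        return " ".join(f.result() for f in futures)


# The fix keeps transcribe() unchanged and just moves the blocking
# join off the loop:
text = asyncio.run(asyncio.to_thread(transcribe, [b"hello", b"world"]))
print(text)  # hello world
```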

Net diff: +4 / -3 across 2 files, no new tests, no new dependencies, no env-var changes, no public-API changes.

Before / after (manual verification reasoning)

Before: a single STT request holds the sole uvicorn event loop until faster-whisper finishes; a concurrent GET / or Socket.IO frame is queued behind the transcription and times out. Reporter's log shows ~18s of inference for 5s of audio; a 30s clip produces ~2 minutes of server freeze.

After: await file.read() releases the loop during upload I/O; await asyncio.to_thread(transcribe, ...) schedules the CPU-bound work on the default thread pool so the event loop stays free for chat streaming, WebSocket heartbeats, and other users. ENABLE_WEBSOCKET_SUPPORT=false is no longer needed as a workaround.
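The before/after difference is easy to measure with stand-in sleeps (no real STT involved). The sketch below times how long a concurrent coroutine — playing the role of another user's request or a WebSocket heartbeat — waits for its first turn on the loop; all names and durations are illustrative.

```python
# Rough before/after measurement: latency of a concurrent coroutine's
# first turn on the loop while a "transcription" is in flight.
import asyncio
import time


def fake_transcribe() -> None:
    time.sleep(0.3)  # CPU-bound inference stand-in


async def first_response_latency(blocking: bool) -> float:
    start = time.monotonic()
    seen: list[float] = []

    async def other_user() -> None:
        await asyncio.sleep(0.02)
        seen.append(time.monotonic() - start)

    task = asyncio.create_task(other_user())
    if blocking:
        fake_transcribe()  # before: handler blocks the loop directly
    else:
        await asyncio.to_thread(fake_transcribe)  # after: loop stays free
    await task
    return seen[0]


before = asyncio.run(first_response_latency(blocking=True))
after = asyncio.run(first_response_latency(blocking=False))
print(f"before={before:.2f}s  after={after:.2f}s")
```

In the blocking case the other coroutine's first response arrives only after the full "inference" sleep; with `asyncio.to_thread` it arrives on schedule.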

Testing

What I have verified:

  1. ruff format --check passes on both modified files (the repo's only blocking backend CI check).
  2. Python AST parse succeeds on both files.
  3. Traced every caller of transcribe() in the backend: exactly two call sites exist (audio.py::transcription and files.py::process_uploaded_file._process_handler), both fixed here.
  4. Behavior-preserving swap confirmed by reading the source of each replaced call: UploadFile.read() returns the same bytes as the raw SpooledTemporaryFile.read() underneath file.file, and await asyncio.to_thread(fn, *args) invokes fn(*args) on the default thread pool and returns its return value unchanged.
  5. The pattern await asyncio.to_thread(...) is already used on the preceding line of files.py::_process_handler (for Storage.get_file), so this fix follows an idiom already in the file.
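Point 4 above is easy to spot-check in isolation: `asyncio.to_thread(fn, *args, **kwargs)` invokes `fn` on a worker thread and hands back its return value unchanged (names below are purely illustrative).

```python
# Spot-check: asyncio.to_thread returns the wrapped function's result
# unchanged, positional and keyword arguments included.
import asyncio


def fn(a: int, b: int, *, scale: int = 1) -> int:
    return (a + b) * scale


sync_result = fn(2, 3, scale=10)
async_result = asyncio.run(asyncio.to_thread(fn, 2, 3, scale=10))
assert sync_result == async_result == 50
print(async_result)  # 50
```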

What I have NOT personally verified (please gate on this):

  • I have not set up a faster-whisper model and run a live STT request with concurrent traffic. @Classic298 correctly pointed out that the comment in #24169 was a "could be a fix", not a tested patch, and that it needs verification. I am keeping this PR in DRAFT until someone with an STT setup (reporter @Mastersomy or @Classic298) can confirm the concurrent-request behavior under a real transcription. Happy to iterate on findings.

Changelog

  • Fixed: Speech-to-text (STT) transcription no longer blocks the server event loop. Other users can continue using chat and Socket.IO while a transcription is in flight. (#24169)

Checklist

  • Target branch: dev.
  • Description and changelog included above.
  • Dependencies: none added. asyncio.to_thread has been available since Python 3.9; the project requires >=3.11.
  • Testing: static/CI checks pass locally. I have NOT run a real faster-whisper transcription; keeping as DRAFT pending maintainer or reporter verification of the concurrent-request behavior.
  • Code review: self-reviewed; follows the asyncio.to_thread pattern already used on the preceding line of files.py.
  • Design & Architecture: minimal, local fix; no new settings, no refactor of transcribe().
  • Git Hygiene: one atomic commit on top of dev.
  • Title prefix: fix:.

Contributor License Agreement

  • By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-06 12:48:53 -05:00

Reference: github-starred/open-webui#66447