[PR #23661] [CLOSED] feat(audio): add AUDIO_STT_SKIP_PREPROCESSING to skip pydub preprocessing #42937

opened 2026-04-25 14:41:53 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/23661
Author: @runixer
Created: 4/13/2026
Status: Closed

Base: dev ← Head: feat/skip-audio-preprocessing


📝 Commits (1)

  • b802da4 feat: add AUDIO_STT_SKIP_PREPROCESSING to bypass audio conversion/compression/splitting

📊 Changes

3 files changed (+41 additions, -21 deletions)

📝 backend/open_webui/config.py (+6 -0)
📝 backend/open_webui/main.py (+2 -0)
📝 backend/open_webui/routers/audio.py (+33 -21)

📄 Description

Description

When uploading large audio files, pydub loads the entire file into RAM (3-5× expansion for AAC → PCM decoding), causing OOM in containers with normal memory limits. See #21515.
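The expansion comes from decoding compressed audio to raw PCM, and the factor depends on the source bitrate. A rough sketch of the arithmetic follows; the 320 kbps bitrate, 44.1 kHz sample rate, stereo layout, and 16-bit samples are illustrative assumptions, not figures measured in this PR, and the function names are hypothetical:

```python
def pcm_bytes_per_second(sample_rate_hz: int, channels: int, sample_width_bytes: int) -> int:
    """Raw PCM data rate: one sample per channel per tick of the sample clock."""
    return sample_rate_hz * channels * sample_width_bytes


def expansion_factor(
    compressed_kbps: float,
    sample_rate_hz: int = 44_100,   # assumed sample rate
    channels: int = 2,              # assumed stereo
    sample_width_bytes: int = 2,    # assumed 16-bit samples
) -> float:
    """Ratio of decoded PCM size to compressed size for a constant-bitrate source."""
    pcm_kbps = pcm_bytes_per_second(sample_rate_hz, channels, sample_width_bytes) * 8 / 1000
    return pcm_kbps / compressed_kbps


if __name__ == "__main__":
    # 44.1 kHz stereo 16-bit PCM is 176,400 bytes per second.
    print(pcm_bytes_per_second(44_100, 2, 2))
    # Expansion for an assumed 320 kbps source; lower bitrates expand more.
    print(round(expansion_factor(320), 2))
```

This only estimates the decoded buffer itself; any extra copies made during conversion or splitting would add to it.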

Self-hosted STT backends (vLLM Whisper with [audio] extras, faster-whisper servers, etc.) handle all formats natively via ffmpeg/PyAV and have no file size limit, so preprocessing is unnecessary overhead for them.

This PR adds an env var AUDIO_STT_SKIP_PREPROCESSING (default: false) to skip convert_audio_to_mp3 / compress_audio / split_audio and send the file as-is to the STT backend. Fully backward-compatible. pydub imports are now lazy (inside the functions that use them).
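The gating logic described above could look roughly like the following. This is a minimal sketch, not the code from the PR: `env_flag`, `pydub_preprocess`, and `files_for_stt` are hypothetical names, and the real change lives in `backend/open_webui/routers/audio.py`. It shows the two ideas together: the flag short-circuits preprocessing, and `pydub` is imported only inside the function that needs it.

```python
import os
from typing import Callable, List


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean env var; only the string 'true' (any case) enables it."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() == "true"


def pydub_preprocess(path: str) -> List[str]:
    """Hypothetical stand-in for the convert/compress/split pipeline."""
    # Lazy import: pydub is loaded only when preprocessing actually runs.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(path)  # decodes the whole file into memory
    out_path = path + ".mp3"
    audio.export(out_path, format="mp3")
    return [out_path]


def files_for_stt(
    path: str,
    skip_preprocessing: bool,
    preprocess: Callable[[str], List[str]] = pydub_preprocess,
) -> List[str]:
    """Return the file(s) to send to the STT backend."""
    if skip_preprocessing:
        # Send the upload as-is; the backend is trusted to decode it.
        return [path]
    return preprocess(path)


if __name__ == "__main__":
    skip = env_flag("AUDIO_STT_SKIP_PREPROCESSING")  # default: False
    print(files_for_stt("talk.m4a", skip_preprocessing=True))
```

With the flag unset, `files_for_stt` falls through to the preprocessing path, which matches the backward-compatible default described in the PR.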

Added

  • AUDIO_STT_SKIP_PREPROCESSING env var / admin config option

Fixed

  • OOM when uploading large audio files (#21515)

Breaking Changes

  • None. Default false preserves current behavior.

Testing

Kubernetes deployment with vLLM 0.19.0 Whisper:

| File | Before | After |
|------|--------|-------|
| 73 MB .m4a | OOMKill | OK, ~60s |
| 81 MB .m4a | OOMKill | OK, ~65s |
| Both simultaneously | OOMKill in 10s | OK, no issues |

Pod memory during processing: ~640 MiB (vs 4-6 GB spike → OOMKill before).

Running in production for ~5 days. No regressions, no OOMs, users uploading large audio files daily.

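Contributor License Agreement

  • [x] By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA) (https://github.com/open-webui/open-webui/blob/main/CONTRIBUTOR_LICENSE_AGREEMENT), and I am providing my contributions under its terms.
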
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-25 14:41:53 -05:00
Reference: github-starred/open-webui#42937