[GH-ISSUE #21281] feat: Audio OpenAI-compatible config parity (auth modes + custom headers) for STT/TTS #34959

Open
opened 2026-04-25 09:08:10 -05:00 by GiteaMirror · 0 comments

Originally created by @tariktalay on GitHub (Feb 9, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/21281

Check Existing Issues

  • I have searched for all existing open AND closed issues and discussions for similar requests. I have found none that is comparable to my request.

Verify Feature Scope

  • I have read through and understood the scope definition for feature requests in the Issues section. I believe my feature request meets the definition and belongs in the Issues section instead of the Discussions.

Problem Description

Audio OpenAI-compatible connections (STT/TTS) currently lack the flexibility that Open WebUI’s OpenAI integration layer already offers for LLM connections.

My concrete use case is a self-hosted OpenAI-compatible vLLM endpoint behind auth/proxy controls.
For audio routes, I need parity for:

  • auth-mode-based behavior (not only fixed bearer assumptions),
  • custom request headers,
  • consistent config handling between LLM connections and Audio connections.

Related discussion:
https://github.com/open-webui/open-webui/discussions/20953

Desired Solution you'd like

Bring Audio OpenAI-compatible configuration to parity with existing OpenAI integration patterns:

  1. Add Audio config support for OpenAI API config objects
     • TTS_OPENAI_API_CONFIG
     • STT_OPENAI_API_CONFIG
  2. Persist and expose these via config/state and Audio config API
     • include these fields in config get/update payloads.
  3. Use shared auth/header handling path in Audio OpenAI routes
     • support auth mode + custom headers with the same architecture used for OpenAI-compatible integrations.
  4. Add Audio Settings UI controls
     • auth mode selector for TTS/STT OpenAI settings,
     • custom headers JSON input for TTS/STT OpenAI settings.
  5. Maintain backward compatibility
     • existing default behavior should continue to work for standard setups.
     • compatibility safeguards for multipart STT uploads should remain stable across OpenAI-compatible backends.
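
To make the proposal above more concrete, here is a minimal sketch of what a shared header-building path for the audio routes could look like. It is not Open WebUI's actual implementation: the function names `parse_custom_headers` and `build_audio_headers`, the config keys `auth_type` and `headers`, and the mode names `bearer`, `session`, and `none` are all assumptions made for illustration. Only `system_oauth` is taken from the validation notes in this issue.

```python
import json
from typing import Optional


def parse_custom_headers(raw: str) -> dict[str, str]:
    """Parse the custom-headers JSON text from a settings field (hypothetical helper)."""
    if not raw or not raw.strip():
        return {}
    parsed = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(parsed, dict):
        raise ValueError("custom headers must be a JSON object")
    # Header names and values are sent as strings, so coerce both.
    return {str(name): str(value) for name, value in parsed.items()}


def build_audio_headers(
    api_key: str,
    api_config: dict,
    session_token: Optional[str] = None,
    oauth_token: Optional[str] = None,
) -> dict[str, str]:
    """Build request headers for an audio route from a per-connection config.

    The "auth_type" and "headers" keys are assumed names, not a confirmed schema.
    """
    headers: dict[str, str] = {}
    auth_type = api_config.get("auth_type", "bearer")  # assumed default mode

    if auth_type == "bearer":
        token = api_key
    elif auth_type == "session":
        token = session_token
    elif auth_type == "system_oauth":
        token = oauth_token
    elif auth_type == "none":
        token = None
    else:
        raise ValueError(f"unsupported auth_type: {auth_type!r}")

    if token:
        headers["Authorization"] = f"Bearer {token}"

    # Custom headers are applied last so an operator can override defaults.
    headers.update(api_config.get("headers") or {})
    return headers
```

The sketch deliberately sets no `Content-Type` header. For multipart STT uploads, an HTTP client such as `requests` generates that header itself, including the boundary, so a fixed value here would risk breaking the upload.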

Alternatives Considered

  • Keep separate hardcoded auth/header handling in audio router:
    • duplicates logic and increases maintenance cost.
  • Solve via external proxy-only workarounds:
    • less portable, harder to support and document for users.

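Additional Context

  • Discussion: https://github.com/open-webui/open-webui/discussions/20953
  • Tested with my own OpenAI-compatible vLLM service behind a gateway.
  • I will attach evidence in this issue and PR:
    • Open WebUI screenshots,
    • output logs.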

Implementation Note (Reliability / Resource Safety)

In the STT OpenAI transcription upload path, file handling is performed via a context manager:

with open(file_path, "rb") as audio_file:
    r = requests.post(
        url=f"{request.app.state.config.STT_OPENAI_API_BASE_URL}/audio/transcriptions",
        headers=request_headers,
        cookies=cookies,
        files={"file": (filename, audio_file)},
        data=payload,
    )

Reason:

  • guarantees deterministic file descriptor cleanup,
  • avoids descriptor leaks under retries/chunked transcription flows,
  • keeps behavior unchanged while improving stability for long-running/self-hosted deployments.
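
The cleanup guarantee under retries can be illustrated with a small, network-free sketch. `upload_with_retries` and `flaky_send` are hypothetical names used only for this demonstration. `flaky_send` stands in for the `requests.post` call and fails twice before succeeding, so the example can check that every file handle opened along the way ends up closed.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def upload_with_retries(
    file_path: str,
    send: Callable[[BinaryIO], int],
    attempts: int = 3,
) -> int:
    """Call send(file) up to `attempts` times, reopening the file each time."""
    last_error = None
    for _ in range(attempts):
        # The handle is closed when the with-block exits, even if send() raises.
        with open(file_path, "rb") as audio_file:
            try:
                return send(audio_file)
            except OSError as exc:  # stand-in for a transient network error
                last_error = exc
    raise RuntimeError(f"upload failed after {attempts} attempts") from last_error


# Demo: a fake sender that fails twice, then succeeds.
seen_handles = []


def flaky_send(audio_file: BinaryIO) -> int:
    seen_handles.append(audio_file)
    if len(seen_handles) < 3:
        raise OSError("simulated connection reset")
    return len(audio_file.read())


with tempfile.NamedTemporaryFile(delete=False, suffix=".mp3") as tmp:
    tmp.write(b"fake-audio")

result = upload_with_retries(tmp.name, flaky_send)
all_closed = all(handle.closed for handle in seen_handles)
os.remove(tmp.name)
```

Each attempt gets a fresh handle positioned at the start of the file, and no handle outlives its attempt. That is the property the list above relies on.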

Validation

  • vLLM endpoint type: OpenAI-compatible
  • Model: Whisper large-v3
  • Auth mode tested: system_oauth
  • Custom headers used: ``
  • Screenshot(s) and logs:

Image: https://github.com/user-attachments/assets/dadf9721-7d49-4fd1-9bfa-d886e9e27453

2026-02-10 00:04:39.522 | DEBUG | python_multipart.multipart:callback:627 - Calling on_end with no data
2026-02-10 00:04:39.530 | INFO | open_webui.routers.files:upload_file_handler:225 - file.content_type: audio/mpeg True
2026-02-10 00:04:39.572 | INFO | uvicorn.protocols.http.httptools_impl:send:483 - 127.0.0.1:60033 - "POST /api/v1/files/?process=true HTTP/1.1" 200
2026-02-10 00:04:39.604 | INFO | open_webui.routers.audio:transcribe:1073 - transcribe: C:\Users\xxxx\Desktop\myprojects\mychat\backend\data\uploads/71661df1-a62d-46aa-a6be-79c933650210_ElevenLabs_2025-12-31T09_54_16_Mia - Clear, Steady and Warm_pvc_sp98_s50_sb75_se0_b_m2.mp3 {}
2026-02-10 00:04:39.637 | INFO | uvicorn.protocols.http.httptools_impl:send:483 - 127.0.0.1:60033 - "GET /api/v1/files/71661df1-a62d-46aa-a6be-79c933650210/process/status?stream=true HTTP/1.1" 200
Chunk paths: ['C:\Users\xxxx\Desktop\myprojects\mychat\backend\data\uploads/71661df1-a62d-46aa-a6be-79c933650210_ElevenLabs_2025-12-31T09_54_16_Mia - Clear, Steady and Warm_pvc_sp98_s50_sb75_se0_b_m2.mp3']
2026-02-10 00:04:40.695 | DEBUG | asyncio.proactor_events:__init__:633 - Using proactor: IocpProactor
2026-02-10 00:04:40.705 | DEBUG | open_webui.utils.oauth:get_oauth_token:982 - Token refresh needed for user f5c93653-d562-4c70-833f-xxxxx, provider oidc
2026-02-10 00:04:41.279 | DEBUG | open_webui.utils.oauth:_perform_token_refresh:1108 - Token refresh successful for provider oidc
2026-02-10 00:04:41.319 | INFO | open_webui.utils.oauth:_refresh_token:1020 - Successfully refreshed token for session bd787730-f089-4e22-8157-xxxxx
2026-02-10 00:04:41.358 | DEBUG | urllib3.connectionpool:_new_conn:1049 - Starting new HTTPS connection (1): mygateway:443
2026-02-10 00:04:51.651 | DEBUG | urllib3.connectionpool:_make_request:544 - https://mygateway:443 "POST /xxx/stt/v1/audio/transcriptions HTTP/1.1" 200 996


Reference: github-starred/open-webui#34959