mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[GH-ISSUE #24143] issue: TTS treats PCM audio responses as MP3 #58874
Originally created by @daradib on GitHub (Apr 26, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24143
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.9.2
Ollama Version (if applicable)
No response
Operating System
Ubuntu 24.04
Browser (if applicable)
No response
Confirmation
Expected Behavior
I can use the OpenAI-compatible API from OpenRouter or LiteLLM to generate speech with Gemini-TTS. Open WebUI transcodes the PCM audio returned by the Text-to-Speech engine to MP3 before sending it to the client.
Actual Behavior
OpenRouter and LiteLLM respond with PCM and do not support any other format for Gemini-TTS. Open WebUI passes the raw PCM audio to the client with an MP3 content type, so no audio is heard.
Steps to Reproduce
Configure Text-to-Speech in Admin Panel
Logs & Screenshots
No errors. The browser network log shows an audio/mpeg response even though the payload is actually PCM.
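The mismatch described above can be confirmed by sniffing the first bytes of the response body. This is a rough sketch, not part of Open WebUI: `looks_like_mp3` and `content_type_mismatch` are hypothetical helper names, and the check is only a heuristic, since raw PCM samples could in rare cases begin with bytes that resemble an MPEG frame sync.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic: True if the payload starts with an ID3 tag or an MPEG frame sync."""
    if data[:3] == b"ID3":
        return True
    # MPEG audio frames begin with 11 set bits: 0xFF followed by 0b111xxxxx.
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0


def content_type_mismatch(content_type: str, data: bytes) -> bool:
    """True when the header claims MP3 but the body does not look like MP3."""
    # Ignore any parameters after ';' and compare the bare media type.
    claimed_mp3 = content_type.split(";")[0].strip().lower() == "audio/mpeg"
    return claimed_mp3 and not looks_like_mp3(data)
```

Saving the response body from the browser network log and passing its bytes to `content_type_mismatch("audio/mpeg", body)` would flag the case reported here.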
Additional Information
I will open a PR to transcode the audio when the TTS engine returns Content-Type audio/pcm.
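As an illustration of branching on the upstream Content-Type, here is a minimal sketch that wraps raw PCM in a WAV container using only the Python standard library. Note that this is a swapped-in technique, not the MP3 transcoding proposed above, which would need an encoder such as ffmpeg. The function names are hypothetical, and the sample format (24 kHz, 16-bit, mono) is an assumption about the Gemini-TTS output that should be verified against the provider's response.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV header (defaults are assumed, not confirmed)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def normalize_tts_response(content_type: str, body: bytes) -> tuple[str, bytes]:
    """Return a (content_type, body) pair that a browser can play."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "audio/pcm":
        # Hypothetical handling: relabel and wrap instead of passing PCM through as MP3.
        return "audio/wav", pcm_to_wav(body)
    # Any other type is passed through unchanged.
    return content_type, body
```

The key point is that the response's Content-Type is derived from what the engine actually returned, so the client never receives PCM labeled as audio/mpeg.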