mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #14101] issue: Speech-To-Text failure when using OpenAPI Compatible Endpoint #32670
Originally created by @andrefecto on GitHub (May 20, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/14101
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.6.10
Ollama Version (if applicable)
N/A
Operating System
Red Hat Enterprise Linux 9
Browser (if applicable)
Edge, latest
Confirmation
Expected Behavior
Speech-To-Text, when clicking the record button, should generate a file that is compatible with OpenAPI/OpenAI, but it does not.
Actual Behavior
When you use the record button, it generates a file in the webm format. It says it converts the file to wav/mp3, but it seems to keep it as webm (or it converts it to mp3 but puts the wrong file extension on it). This causes gpt-4o-mini-transcribe to respond that the file is corrupted.
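For anyone hitting a similar symptom, one way to check whether a file's extension matches its real container is to look at its leading bytes. The following is a minimal sketch, not Open WebUI code: the function names `sniff_audio_format` and `extension_matches` are hypothetical, and only the well-known signatures for EBML (WebM/Matroska), RIFF/WAVE, and MP3 are covered.

```python
import os


def sniff_audio_format(data: bytes) -> str:
    """Guess the container from leading bytes (hypothetical helper)."""
    # EBML header, shared by WebM and Matroska; reported as "webm" here.
    if data[:4] == b"\x1a\x45\xdf\xa3":
        return "webm"
    # RIFF container with a WAVE form type at offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 with an ID3v2 tag, or a bare MPEG audio frame sync (11 set bits).
    if data[:3] == b"ID3":
        return "mp3"
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def extension_matches(filename: str, data: bytes) -> bool:
    """True when the file extension agrees with the sniffed format."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    return ext == sniff_audio_format(data)
```

Running `extension_matches` on the first few bytes of the uploaded recording would show whether a `.wav` or `.mp3` name is hiding webm data.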
Steps to Reproduce
Logs & Screenshots
Docker logs:
Browser response:
Additional Information
2.1: Open WebUI can successfully use LiteLLM for audio transcription requests
2.2: Open WebUI is somehow failing to pass the front-end recorded audio file to the back-end properly. See the log line "open_webui.routers.audio:convert_audio_to_wav:98": it says it converted the file from webm to webm, yet the line right above it is the ffmpeg command that converts it to a wav.
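The mismatch described in 2.2 is the kind of thing that happens when the converted file's path is not the one logged or passed on. As a rough illustration only (this is not the actual `convert_audio_to_wav` implementation, and the helper names are hypothetical), a conversion step can derive the output path once and hand that same path to both ffmpeg and the caller:

```python
import os


def wav_path_for(src: str) -> str:
    """Derive the output path by swapping the extension for .wav."""
    root, _ext = os.path.splitext(src)
    return root + ".wav"


def build_ffmpeg_command(src: str) -> tuple[list[str], str]:
    """Return the ffmpeg argument list and the path downstream code should use.

    -y overwrites an existing output file; -i names the input file.
    """
    dst = wav_path_for(src)
    return ["ffmpeg", "-y", "-i", src, dst], dst


# The command would be run with subprocess.run(cmd, check=True); after that,
# only dst (never src) should be logged as the result or sent for transcription.
cmd, dst = build_ffmpeg_command("/tmp/recording.webm")
```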
@andrefecto commented on GitHub (May 20, 2025):
I am closing this as it's resolved in v0.6.10. I thought I had updated my container, and I apparently was looking at the wrong server. (I have a few instances for dev/test/prod running.)