enh: reset requested voice model after switching TTS Engine #1456

Closed
opened 2025-11-11 14:45:35 -06:00 by GiteaMirror · 8 comments

Originally created by @ahmedsaed on GitHub (Jul 6, 2024).

Bug Report

Description

Bug Summary:
Before setting up openedai-speech, I was using the webapi Google UK English Female voice. After I completed the setup for openedai-speech, I started getting these errors:

Open webui logs

INFO:     192.168.1.2:0 - "POST /audio/api/v1/speech HTTP/1.1" 400 Bad Request
ERROR:apps.audio.main:400 Client Error: Bad Request for url: http://openedai-speech:8000/v1/audio/speech
Traceback (most recent call last):
  File "/app/backend/apps/audio/main.py", line 219, in speech
    r.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://openedai-speech:8000/v1/audio/speech

Openedai-speech logs

INFO:     192.168.80.3:60494 - "POST /v1/audio/speech HTTP/1.1" 400 Bad Request
2024-07-06 21:26:38.527 | INFO     | openedai:openai_statuserror_handler:106 - BadRequestError(message="Error loading voice: Google UK English Female, KeyError: 'Google UK English Female'", code=400, param=voice)

The error message clearly shows that the requested voice was not reset or overridden by the value chosen in Admin settings. I had to update the voice in my personal user settings (the Settings button) as well.

I believe the admin settings and the user settings should be in sync, at least for the admin user.

Steps to Reproduce:

  1. Configure audio settings using webapi and select a voice.
  2. Switch the audio settings to any OpenAI-compatible API.
  3. Try to play any message; the request fails with the error above.

Expected Behavior:
When the engine is changed through admin settings, users' audio settings should be updated or reset.

Actual Behavior:
The voice model chosen in settings was being used despite it being invalid.
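
One way the expected behavior could work is sketched below. This is only an illustration, not Open WebUI's actual code: the `UserAudioSettings` type, the `effective_voice` function, and the idea of storing the engine next to the user's voice are all assumptions made for the example. The sketch falls back to the admin-configured voice whenever the user's saved voice was chosen under a different engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserAudioSettings:
    """Hypothetical per-user settings: the voice plus the engine it was chosen for."""
    engine: Optional[str]
    voice: Optional[str]


def effective_voice(user: UserAudioSettings, admin_engine: str, admin_voice: str) -> str:
    """Use the user's voice only if it was selected under the currently active engine."""
    if user.voice and user.engine == admin_engine:
        return user.voice
    # The saved voice belongs to another engine (or is unset): use the admin default.
    return admin_voice


# A voice saved under the browser engine is ignored once the admin switches engines.
stale = UserAudioSettings(engine="webapi", voice="Google UK English Female")
print(effective_voice(stale, admin_engine="openai", admin_voice="alloy"))
```

Comparing engines rather than checking the voice against a fixed list means a manually entered custom voice would still be accepted, as long as it was saved under the active engine.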

Environment

  • Open WebUI Version: v0.3.7

  • Operating System: Ubuntu 20.04

  • Browser (if applicable): Chrome latest

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Installation Method

Both open webui and openedai-speech were installed through docker.


@ahmedsaed commented on GitHub (Jul 6, 2024):

Also the settings page doesn't display the full list of available voices when a voice is already selected. I have to delete the selected voice for the complete list to show.

It feels more like an autocomplete than a dropdown menu.


@ahmedsaed commented on GitHub (Jul 7, 2024):

I've noticed a bug related to caching.

To explain it, I need to understand where the voices list in the dropdown menu comes from. Specifically, I'm referring to the list that includes voices like alloy, echo, fable, etc.

I assumed this list was populated with the available voices, but when I added my custom voice to the openedai-speech config, the list did not include the new voice, even after restarting.

Interestingly, I can manually enter the name of the custom voice and it works, except when I try to play a message that has already been played with one of the predefined voices.

To elaborate, I initially used the alloy voice to play a message. After switching to my custom voice, the play button didn't work, despite the logs showing no errors.

This issue only occurs with custom models. If I select another voice from the predefined list, it plays normally.

Playing the custom voice on a new message, however, works without any problems.
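
The thread does not confirm where the audio cache lives or how it is keyed, so the following is only a sketch under that caveat. It shows a hypothetical `speech_cache_key` helper that includes the engine and voice in the key, so that a message already played with one voice would not be served from the cache when a different voice is selected.

```python
import hashlib
import json


def speech_cache_key(text: str, engine: str, voice: str) -> str:
    """Hypothetical cache key covering every input that changes the generated audio."""
    payload = json.dumps(
        {"engine": engine, "voice": voice, "input": text},
        sort_keys=True,  # stable ordering so equal inputs always hash the same
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same text, different voice: the keys differ, so the cached alloy audio is not reused.
key_alloy = speech_cache_key("Hello there", "openai", "alloy")
key_custom = speech_cache_key("Hello there", "openai", "my-custom-voice")
print(key_alloy != key_custom)
```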


@tjbck commented on GitHub (Jul 8, 2024):

PR welcome!


@jason-e-gross commented on GitHub (Jul 11, 2024):

I can confirm that this is an issue. I stood up my own container of openedai-speech, and I can exec into Open WebUI's container, curl the speech container, and get it to generate audio (so I know the endpoint is reachable from within Open WebUI), but Open WebUI keeps throwing 400 Bad Request errors when I watch Open WebUI's Docker logs.

It also has nothing to do with custom voices: I can't get it to work with the default settings.

Docker logs from Open WebUI
image: https://github.com/open-webui/open-webui/assets/16501884/2de3471a-075f-4205-949a-7f80fde58ac8

Docker logs from OpenedAI-Speech
image: https://github.com/open-webui/open-webui/assets/16501884/6e48985c-52e3-46dd-b8b7-a854472799a1

It looks like it's passing in some other setting from elsewhere? "Microsoft Zira"?

Oh, huh - I gotta go into my own personal settings also and change it. Odd.


@jason-e-gross commented on GitHub (Jul 11, 2024):

Follow-up, tried a custom - and can't seem to get custom voices to work at all. only the standard ones included in openedai-speech.


@ahmedsaed commented on GitHub (Jul 11, 2024):

> Follow-up, tried a custom - and can't seem to get custom voices to work at all. only the standard ones included in openedai-speech.

Have you noticed the third message I wrote about a potential bug with caching and invalidation?

What exactly is the problem you are facing?

Have you tested the custom voice through a curl command?

If the model is working through other methods then the issue is probably related to the caching

Don't test the custom model on a message that had been played with the standard ones
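
For anyone who wants to run that check from a script instead of curl, here is a minimal sketch using only the Python standard library. The URL is the one shown in the logs above; the `tts-1` model name and the `my-custom-voice` name are assumptions and should be replaced with whatever your openedai-speech config defines. The function names are made up for this example.

```python
import json
import urllib.request

# Endpoint taken from the Open WebUI logs earlier in this thread.
SPEECH_URL = "http://openedai-speech:8000/v1/audio/speech"


def build_speech_request(voice: str, text: str, model: str = "tts-1",
                         url: str = SPEECH_URL) -> urllib.request.Request:
    """Build an OpenAI-style speech request; the model name is an assumption."""
    body = json.dumps({"model": model, "voice": voice, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(voice: str, text: str, out_path: str) -> int:
    """Send the request and save the returned audio; returns the number of bytes written."""
    request = build_speech_request(voice, text)
    # urlopen raises urllib.error.HTTPError on a 400, e.g. for an unknown voice.
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

Calling `synthesize("my-custom-voice", "Testing the custom voice.", "test.mp3")` from a host that can reach the speech container should either write an audio file or fail with an HTTP error, which helps separate a backend problem from a caching problem in the UI.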


@jason-e-gross commented on GitHub (Jul 11, 2024):

Finally, for me - it's a custom file that just won't work. But I don't know that it's open-web-ui's issue - because when I curl the file, opened-ai-speech generates the mp3, but it's corrupt - so I'm probably doing something wrong there.


@ZA5542 commented on GitHub (Aug 11, 2024):

> Finally, for me - it's a custom file that just won't work. But I don't know that it's open-web-ui's issue - because when I curl the file, opened-ai-speech generates the mp3, but it's corrupt - so I'm probably doing something wrong there.

I had the same issue with the corrupt mp3 file. Change it to a .wav file, and it will work.


Reference: github-starred/open-webui#1456