Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-11 08:22:09 -05:00)
[PR #12944] [CLOSED] Introducing Custom TTS Engine Support! Better than using OPENAI endpoint #23057
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/12944
Author: @RedsAnalysis
Created: 4/16/2025
Status: ❌ Closed
Base: dev ← Head: customtts_v1
📝 Commits (4)
- 9b9fc90: Added a front-end element to show Custom TTS on the admin settings page, and changed the getVoices function to call the /audio/voices endpoint
- 74e8df6: This commit allows users to integrate their own external TTS provider
- d42fd9a: END OF THIS BRANCH: Added Custom TTS support with a user-friendly interface, allowing users to select a model and voice from a dropdown
- 241fcce: Added CORS_ALLOW_ORIGIN=http://localhost:5173 to the backend dev.sh script since I was running into a CORS error
📊 Changes
6 files changed (+450 additions, -46 deletions)
📝 backend/dev.sh (+1 -1)
📝 backend/open_webui/config.py (+15 -0)
📝 backend/open_webui/main.py (+4 -0)
📝 backend/open_webui/routers/audio.py (+197 -0)
📝 src/lib/apis/audio/index.ts (+47 -23)
📝 src/lib/components/admin/Settings/Audio.svelte (+186 -22)
📄 Description
Approach 1: Using the Existing "OpenAI" Engine Setting (with Custom URL)
Pros:
No Backend Code Change Needed (Initially): For basic synthesis, if the custom server perfectly mimics the OpenAI /audio/speech endpoint and payload, it may work without any modification to Open WebUI's backend code.
Simple Setup (if compatible): Only requires changing the API Base URL field.
Cons:
❌ No Dynamic Voice/Model Discovery: Open WebUI won't attempt to fetch voice or model lists from the custom URL. Users see hardcoded OpenAI defaults (alloy, tts-1, etc.) or nothing in dropdowns/datalists.
❌ Manual Input Required: Users must manually type the exact Voice ID and Model ID required by the custom server into the text fields, without any validation or selection assistance. Highly error-prone.
❌ Poor User Experience: Difficult configuration, lack of guidance, potential for using incorrect/non-existent voices/models.
❌ Misleading Configuration: The UI indicates "OpenAI" is selected, even though it's pointing to a different service, causing confusion.
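To make the compatibility requirement of Approach 1 concrete, here is an illustrative sketch (the function name is hypothetical, not Open WebUI's actual code) of the OpenAI-style request shape a custom server would have to accept at exactly this path for Approach 1 to "just work":

```python
# Illustrative helper, not the actual Open WebUI implementation.
# It shows the OpenAI-compatible /audio/speech request that Approach 1 relies on.

def build_speech_request(base_url: str, model: str, voice: str, text: str) -> tuple[str, dict]:
    """Build the URL and JSON payload for an OpenAI-compatible TTS call.

    A custom server only works with Approach 1 if it accepts exactly this
    payload shape at exactly this relative path.
    """
    url = f"{base_url.rstrip('/')}/audio/speech"
    payload = {
        "model": model,   # must be typed by hand; no list to pick from
        "voice": voice,   # no validation: a typo only fails at request time
        "input": text,
    }
    return url, payload


url, payload = build_speech_request("http://localhost:8880/v1", "kokoro", "af_bella", "Hello")
# url -> "http://localhost:8880/v1/audio/speech"
```

Because the model and voice strings are free-form text fields in this approach, nothing catches a wrong ID until the synthesis request itself fails.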
Approach 2: Using the New "Custom TTS" Engine Setting (Your Implementation)
Pros:
✅ Dynamic Voice Discovery: Actively fetches the voice list from the configured custom server's /audio/voices endpoint and populates a dropdown/select list.
✅ Dynamic Model Discovery: Actively fetches the model list from the configured custom server's /models endpoint and populates a dropdown/select list.
✅ Improved User Experience: Users can easily see and select the actual available voices and models from their specific custom server via intuitive dropdowns.
✅ Accurate Configuration: Clearly indicates that a custom, non-standard engine is being used.
✅ Reduced Errors: Selecting from a list prevents typos and ensures valid voice/model IDs are sent.
✅ Clear Separation of Logic: Keeps the specific logic for handling custom/external servers separate from the standard OpenAI implementation.
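As a rough sketch of the discovery flow (these helpers are illustrative, not the PR's actual code in routers/audio.py), the backend can normalize the custom server's responses into dropdown options, using the response shapes this PR assumes: {"voices": [...]} from /audio/voices and {"data": [{"id": ..., "name": ...}]} from /models:

```python
# Hypothetical normalization helpers for the Custom TTS discovery endpoints.
# Response shapes follow the formats assumed by this PR (see TODO below).

def parse_voices(response_json: dict) -> list[str]:
    """Extract a flat list of voice IDs for the admin-settings voice dropdown."""
    return [str(v) for v in response_json.get("voices", [])]


def parse_models(response_json: dict) -> list[dict]:
    """Extract {id, name} pairs for the model dropdown; fall back to id as name."""
    return [
        {"id": m["id"], "name": m.get("name", m["id"])}
        for m in response_json.get("data", [])
        if "id" in m
    ]


parse_voices({"voices": ["af_bella", "am_adam"]})
# -> ["af_bella", "am_adam"]
parse_models({"data": [{"id": "kokoro"}]})
# -> [{"id": "kokoro", "name": "kokoro"}]
```

Normalizing on the backend keeps the Svelte admin UI simple: it only ever renders a list of IDs and names, regardless of what the external server returns.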
TODO / Future Plans:
The current implementation assumes fixed response formats ({"voices": ["string", ...]} for the voices list and {"data": [{"id": ..., "name": ...}]} for the models list) and fixed relative API paths (/models, /audio/voices, /audio/speech). A future improvement is to make these paths configurable: instead of /audio/speech, a user could input /generate/speech/v2 or /speak if that's what their specific external service requires.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.