[PR #12940] [CLOSED] Introducing Custom TTS Engine Support! (OpenAPI Compatible) #46104

Closed
opened 2026-04-29 20:46:31 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/12940
Author: @RedsAnalysis
Created: 4/16/2025
Status: Closed

Base: dev ← Head: customtts_v1


📝 Commits (4)

  • 9b9fc90 added a front end element to show Customtts on admin settings page and also changed the getvoices function to respond to /audio/voices endpoint
  • 74e8df6 This commit allows users to integrate their own external TTS provider by:
  • d42fd9a END OF THIS BRANCH: Added CustomTTS support with user friendly interface, allowing users to select models and voice from a drop down.
  • 241fcce ADDED CORS_ALLOW_ORIGIN=http://localhost:5173 to the backend dev.sh script since i was running into a CORS error

📊 Changes

6 files changed (+450 additions, -46 deletions)

📝 backend/dev.sh (+1 -1)
📝 backend/open_webui/config.py (+15 -0)
📝 backend/open_webui/main.py (+4 -0)
📝 backend/open_webui/routers/audio.py (+197 -0)
📝 src/lib/apis/audio/index.ts (+47 -23)
📝 src/lib/components/admin/Settings/Audio.svelte (+186 -22)

📄 Description

Pull Request

  • [x] Target branch: Please verify that the pull request targets the dev branch. (Assuming this is correct)
  • [x] Description: Provide a concise description of the changes made in this pull request. (Provided below)
  • [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog (https://keepachangelog.com/) is added at the bottom of the PR description. (Provided below)
  • [ ] Documentation: Have you updated relevant documentation (Open WebUI Docs or other documentation sources)? (-> You need to check/update this)
  • [ ] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation? (-> Added httpx, need to check docs)
  • [x] Testing: Have you written and run sufficient tests to validate the changes? (Assuming you've tested manually)
  • [x] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards? (Self-review implied)
  • [x] Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following: feat

Changelog Entry

Description

feat(audio): Add support for user-configurable Custom TTS engine

This pull request introduces a new Text-to-Speech engine option called "CustomTTS". This feature allows users to integrate external TTS services beyond the currently supported options by providing a custom API Base URL and an optional API Key directly within the Open WebUI audio settings. The backend now proxies requests for voice lists and speech synthesis to the configured custom endpoints.

Added

  • Custom TTS Engine Option: Introduced "Custom TTS" in the TTS Engine dropdown list in the Audio Settings UI.
  • Configuration Fields: Added UI input fields for "Custom TTS API Base URL" and "Custom TTS API Key" (optional), displayed conditionally when the "Custom TTS" engine is selected.
  • Backend Configuration: Added new persistent configuration settings (AUDIO_TTS_CUSTOM_TTS_OPEN_API_BASE_URL, AUDIO_TTS_CUSTOM_TTS_OPEN_API_KEY) to store user-defined custom TTS endpoint details.
  • Custom Voices Endpoint: Created a new backend API endpoint /api/v1/audio/voices dedicated to fetching the voice list from the configured custom TTS service. This endpoint handles the external API call and transforms the response (expected: list of strings) into the standard [{"id": ..., "name": ...}] format for the frontend.
  • Custom Models Logic: Added logic to the existing backend /api/v1/audio/models endpoint to fetch models from the configured custom TTS service URL (<base_url>/models) when the "Custom TTS" engine is active, transforming the response (expected: {"data": [...]}) into the standard format.
  • Custom Speech Synthesis: Added logic to the backend /api/v1/audio/speech endpoint to proxy TTS requests to the configured custom TTS service URL (<base_url>/audio/speech) when the "Custom TTS" engine is selected.
  • Dependency: Added httpx for making asynchronous HTTP requests in the backend /audio/voices endpoint.
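
The two response transformations described in the list above could look roughly like the following sketch. The function names normalize_voices and normalize_models are hypothetical and are not taken from the PR's code. The shape of each model entry (a plain string or a dict with an "id" key) and the use of the same id/name shape for models are assumptions; the description only specifies the outer {"data": [...]} wrapper and the [{"id": ..., "name": ...}] target format for voices.

```python
from typing import Any


def normalize_voices(payload: Any) -> list[dict[str, str]]:
    """Turn a list of voice-name strings into [{"id": ..., "name": ...}].

    Also accepts {"voices": [...]}, the wrapper mentioned under
    Additional Information (assumption: either shape may be returned).
    """
    if isinstance(payload, dict):
        payload = payload.get("voices", [])
    return [{"id": str(v), "name": str(v)} for v in payload]


def normalize_models(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Turn {"data": [...]} into [{"id": ..., "name": ...}].

    Assumption: each entry is either a string or a dict with an "id" key.
    """
    models = []
    for entry in payload.get("data", []):
        model_id = entry["id"] if isinstance(entry, dict) else str(entry)
        models.append({"id": model_id, "name": model_id})
    return models
```

Keeping the transformation in small pure functions like these would make it testable without calling the external service.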

Changed

  • Frontend API Client ($lib/apis/audio/index.ts): Consolidated voice fetching logic. The getVoices function now accepts an engineType and calls the appropriate backend endpoint (/voices or /audio/voices).
  • Backend Config Registration (main.py): Updated application startup to register the new custom TTS configuration settings with the AppConfig instance.
  • Backend Config Endpoints (routers/audio.py): Modified /config (GET) and /config/update (POST) endpoints to handle reading and writing the new custom TTS configuration values, ensuring correct key mapping between frontend payload and backend storage.
  • Frontend Settings Component (Audio.svelte):
    • Replaced <input list>/<datalist> with <select> dropdowns for TTS Voice and Model selection for the customtts engine (and potentially others) for improved user experience.
    • Updated onMount logic to correctly initialize TTS_VOICE and TTS_MODEL state based on the loaded TTS_ENGINE to prevent incorrect defaults from showing on page reload.
    • Ensured consistent variable naming for custom TTS settings between component state, API payload keys, and backend config access.
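
One way to keep the key mapping between the frontend payload and backend storage consistent is to drive both the read and the write path from a single table. The sketch below is illustrative only: the payload key names and both helper functions are hypothetical, and only the two AUDIO_TTS_CUSTOM_TTS_OPEN_API_* setting names come from this PR.

```python
# Hypothetical payload keys mapped to the config setting names from this PR.
PAYLOAD_TO_CONFIG = {
    "CUSTOM_TTS_OPEN_API_BASE_URL": "AUDIO_TTS_CUSTOM_TTS_OPEN_API_BASE_URL",
    "CUSTOM_TTS_OPEN_API_KEY": "AUDIO_TTS_CUSTOM_TTS_OPEN_API_KEY",
}


def apply_tts_payload(config, payload: dict) -> None:
    """Copy known payload keys onto the config object; ignore unknown keys."""
    for payload_key, config_attr in PAYLOAD_TO_CONFIG.items():
        if payload_key in payload:
            setattr(config, config_attr, payload[payload_key])


def read_tts_payload(config) -> dict:
    """Build the config response from the same table used for updates."""
    return {
        payload_key: getattr(config, config_attr, "")
        for payload_key, config_attr in PAYLOAD_TO_CONFIG.items()
    }
```

Because both directions share one table, a renamed setting only has to be changed in one place, which is the kind of mismatch behind the KeyError listed under Fixed.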

Fixed

  • Corrected KeyError in backend config update (/config/update) by aligning variable names used for accessing AppConfig state during startup (main.py) and update (routers/audio.py).
  • Resolved UnboundLocalError in backend speech synthesis (/speech) endpoint's error handling by initializing the response variable (r = None) and adding checks (if r is not None) before accessing it in except blocks.
  • Fixed NameError: name 'httpx' is not defined in /audio/voices endpoint by adding the necessary import httpx.
  • Addressed issue where voice/model dropdowns were not populating for customtts by implementing the necessary backend logic in /models and ensuring correct data transformation in /audio/voices.
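
The UnboundLocalError fix described above follows a common pattern: bind the response variable before the try block, then check it before use in the except block. The sketch below shows the pattern in isolation. FakeResponse and the send callable are hypothetical stand-ins for the real HTTP call, not code from the PR.

```python
from typing import Callable, Optional


class FakeResponse:
    """Stand-in for an HTTP response object (hypothetical, for illustration)."""

    def __init__(self, status: int, body: str):
        self.status = status
        self.body = body


def synthesize(send: Callable[[], FakeResponse]) -> str:
    r: Optional[FakeResponse] = None  # bound before the try block
    try:
        r = send()
        if r.status >= 400:
            raise RuntimeError("upstream error")
        return r.body
    except Exception as exc:
        detail = f"request failed: {exc}"
        # r is still None when send() itself raised, so check before use.
        if r is not None:
            detail += f" (status {r.status})"
        return detail
```

Without the initial r = None, a failure inside send() would leave r unbound and the except block would raise UnboundLocalError instead of reporting the original error.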

Security

  • N/A

Breaking Changes

  • N/A

Additional Information

  • This feature requires users to provide the correct Base URL for their external TTS provider. The expected API paths on the external service are /models (for models list, returning {"data": [...]}) and /audio/voices (for voices list, returning {"voices": [...]}). The speech synthesis path is assumed to be /audio/speech.
  • Error handling has been added for calls to the external custom TTS service.
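
As a rough illustration of how the three expected paths relate to the user-supplied Base URL, the sketch below joins a base URL and a path while tolerating a trailing slash, and builds request headers with the optional API key. The helper names are hypothetical, and sending the key as a Bearer token is an assumption; the PR description does not state the authentication scheme.

```python
from typing import Optional


def custom_tts_url(base_url: str, path: str) -> str:
    """Join the base URL and a path without doubling or dropping slashes."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


def custom_tts_headers(api_key: Optional[str]) -> dict[str, str]:
    """Build request headers; the API key is optional."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the external service expects a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```

For a base URL such as http://localhost:8880/v1 (an example value, not from the PR), the three calls would then go to /v1/models, /v1/audio/voices and /v1/audio/speech.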

Video Preview

  • A full video preview of the feature and then some: https://youtu.be/KkbjXabHX7Q

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 20:46:31 -05:00
Reference: github-starred/open-webui#46104