[PR #16378] [CLOSED] feat: add SSML "Voice Note" speech blocks #62969

Closed
opened 2026-05-06 07:27:00 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/16378
Author: @jcbyte
Created: 8/8/2025
Status: Closed

Base: dev ← Head: ssml-blocks


📝 Commits (7)

  • 9c7492b fix: create tts queue
  • f652222 feat: add ssml blocks
  • 521a481 renamed override call friendly name to be shorter
  • 6f36083 add admin panel default voice block settings
  • 94a66ea fix missing imports
  • 12b2970 interpret environment variables correctly as booleans
  • 3df30e5 apply prettier formatting standard

📊 Changes

16 files changed (+904 additions, -199 deletions)

📝 backend/open_webui/config.py (+18 -0)
📝 backend/open_webui/main.py (+12 -0)
📝 backend/open_webui/routers/audio.py (+33 -2)
📝 src/lib/apis/audio/index.ts (+4 -2)
📝 src/lib/components/admin/Settings/Audio.svelte (+44 -0)
📝 src/lib/components/chat/MessageInput/CallOverlay.svelte (+7 -0)
📝 src/lib/components/chat/Messages/Markdown.svelte (+2 -0)
📝 src/lib/components/chat/Messages/Markdown/MarkdownTokens.svelte (+3 -0)
➕ src/lib/components/chat/Messages/Markdown/SsmlRenderer.svelte (+177 -0)
➕ src/lib/components/chat/Messages/Markdown/VoiceVisualiser.svelte (+61 -0)
📝 src/lib/components/chat/Messages/ResponseMessage.svelte (+57 -194)
📝 src/lib/components/chat/Settings/Audio.svelte (+90 -0)
📝 src/lib/components/chat/SettingsModal.svelte (+12 -1)
📝 src/lib/stores/index.ts (+1 -0)
➕ src/lib/utils/marked/ssml-extension.ts (+38 -0)
➕ src/lib/utils/tts.ts (+345 -0)

📄 Description

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.

Before submitting, make sure you've checked the following:

  • [x] Target branch: Please verify that the pull request targets the dev branch.
  • [x] Description: Provide a concise description of the changes made in this pull request.
  • [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • [x] Documentation: Have you updated the relevant documentation (Open WebUI Docs or other documentation sources)?
  • [x] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • [x] Testing: Have you written and run sufficient tests to validate the changes?
  • [x] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • [x] Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

Changelog Entry

Description

Created "Voice Note" style blocks that display the contents of <speak> tags; these can be played, cancelled, or queued (queuing builds on my previous PR #16152). The motivation is to let models produce text and audio together in a single response.
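The play, cancel, and queue behaviour described above can be sketched as a small sequential queue. This is a minimal sketch, not the code in src/lib/utils/tts.ts: `SpeechQueue`, `QueueItem`, and `PlayFn` are hypothetical names, and the real implementation may track state differently.

```typescript
// Hypothetical sketch: SpeechQueue, QueueItem and PlayFn are illustrative names,
// not identifiers taken from the PR's src/lib/utils/tts.ts.
interface QueueItem {
  id: number;
  text: string;
  cancelled: boolean; // a player can poll this flag to stop mid-playback
}

type PlayFn = (item: QueueItem) => Promise<void>;

class SpeechQueue {
  private play: PlayFn;
  private pending: QueueItem[] = [];
  private current: QueueItem | null = null;
  private nextId = 1;

  constructor(play: PlayFn) {
    this.play = play;
  }

  // Queue a block; playback starts immediately when nothing else is playing.
  enqueue(text: string): number {
    const item: QueueItem = { id: this.nextId++, text, cancelled: false };
    this.pending.push(item);
    if (this.current === null) void this.drain();
    return item.id;
  }

  // Cancel a block: flag it if it is playing, drop it if it is still waiting.
  cancel(id: number): void {
    if (this.current !== null && this.current.id === id) this.current.cancelled = true;
    this.pending = this.pending.filter((item) => item.id !== id);
  }

  // Texts still waiting to be played, in order.
  get queued(): string[] {
    return this.pending.map((item) => item.text);
  }

  // Play waiting blocks one at a time until the queue is empty.
  private async drain(): Promise<void> {
    let item: QueueItem | undefined;
    while ((item = this.pending.shift()) !== undefined) {
      this.current = item;
      try {
        await this.play(item);
      } catch {
        // A failed block should not stall the blocks behind it.
      }
    }
    this.current = null;
  }
}
```

In this sketch the caller supplies the `play` function, so the queue stays independent of whichever TTS engine actually produces the audio.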

Added three new settings:

  • Auto-play Speech Blocks - Automatically play speech blocks as they appear.
  • Override Conversation Response with Speech Blocks - Suppress the model's regular response in call mode and only read from the speech blocks.
  • Show Speech Blocks - Control whether speech blocks are shown in the UI.
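As a rough illustration of the override setting, the sketch below chooses what call mode would read aloud. The function and field names are hypothetical, and the fallback to the full response when a message contains no `<speak>` blocks is an assumption, not behaviour confirmed by the PR.

```typescript
// Hypothetical sketch: names and the no-blocks fallback are assumptions.
const SPEAK_BLOCK = /<speak>([\s\S]*?)<\/speak>/g;

// Collect the inner text of every <speak>...</speak> block, in order.
function speechBlocks(content: string): string[] {
  const blocks: string[] = [];
  SPEAK_BLOCK.lastIndex = 0; // reset the shared global regex before scanning
  let match: RegExpExecArray | null;
  while ((match = SPEAK_BLOCK.exec(content)) !== null) {
    blocks.push(match[1].trim());
  }
  return blocks;
}

// With the override enabled, read only the speech blocks; otherwise read the
// response unchanged. Falling back when no blocks exist is an assumption.
function textForCall(content: string, overrideWithSpeechBlocks: boolean): string {
  const blocks = speechBlocks(content);
  if (overrideWithSpeechBlocks && blocks.length > 0) return blocks.join(" ");
  return content;
}
```

For example, with the override on, a response containing two blocks would be reduced to the two block texts joined by a space.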

Added three new admin panel configs (and corresponding environment variables) to set the default settings for new users.
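One of the commits is titled "interpret environment variables correctly as booleans", which presumably concerns the usual pitfall that a non-empty string such as "false" is truthy. The sketch below shows the parsing rule only; the PR's backend is Python (backend/open_webui/config.py), `envFlag` and `speechBlockDefaults` are hypothetical helpers, and the fallback values are assumptions.

```typescript
// Hypothetical helpers for illustration only; the PR's backend reads these
// variables in Python (backend/open_webui/config.py), not in TypeScript.
type Env = Record<string, string | undefined>;

function envFlag(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name];
  // Unset or empty variables fall back to the built-in default.
  if (raw === undefined || raw.trim() === "") return fallback;
  // Compare the text explicitly: Boolean("false") would be true.
  return raw.trim().toLowerCase() === "true";
}

// The variable names come from the changelog; the fallback values are assumptions.
function speechBlockDefaults(env: Env) {
  return {
    autoPlay: envFlag(env, "DEFAULT_AUTOPLAY_SSML", false),
    overrideCall: envFlag(env, "DEFAULT_SSML_OVERRIDE_CALL", false),
    show: envFlag(env, "DEFAULT_SHOW_SSML", true)
  };
}
```

Passing the environment in as a plain record keeps the helper easy to test without touching the real process environment.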

Added

  • Added a marked extension to detect <speak> tags in content.
  • Created "Voice Note" style blocks to display the contents of the <speak> tags; these can be played, cancelled or queued.
  • Added "Auto-play Speech Blocks" boolean setting to automatically play speech blocks as they appear, allowing a mixed narration output.
  • Added "Override Conversation Response with Speech Blocks" boolean setting to suppress the model's regular response in call mode and only read from the speech blocks.
  • Added "Show Speech Blocks" boolean setting to control whether speech blocks are shown in the UI.
  • Added "Default Auto-play Speech Blocks" boolean config and corresponding "DEFAULT_AUTOPLAY_SSML" environment variable to set the default value for the "Auto-play Speech Blocks" setting for new users.
  • Added "Default Override Call Response with Speech Blocks" boolean config and corresponding "DEFAULT_SSML_OVERRIDE_CALL" environment variable to set the default value for the "Override Conversation Response with Speech Blocks" setting for new users.
  • Added "Default Show Speech Blocks" boolean config and corresponding "DEFAULT_SHOW_SSML" environment variable to set the default value for the "Show Speech Blocks" setting for new users.
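The first item above, a marked extension that detects `<speak>` tags, might look roughly like the following. This is a sketch under assumptions: the object follows the shape marked documents for custom extensions (`name`, `level`, `start`, `tokenizer`), but the token fields, the name `ssmlExtension`, and the choice of a block-level token are hypothetical and may differ from src/lib/utils/marked/ssml-extension.ts.

```typescript
// Token produced by the sketch; the field names are assumptions.
interface SsmlToken {
  type: "ssml";
  raw: string; // the full matched source, which the lexer consumes
  text: string; // the content between the tags
}

// Object laid out like a marked custom extension (name, level, start, tokenizer).
const ssmlExtension = {
  name: "ssml",
  level: "block" as const,
  // Hint for the lexer: index of the next possible <speak> tag, if any.
  start(src: string): number | undefined {
    const index = src.indexOf("<speak>");
    return index === -1 ? undefined : index;
  },
  // Match only when a complete block begins at the current position.
  tokenizer(src: string): SsmlToken | undefined {
    const match = /^<speak>([\s\S]*?)<\/speak>/.exec(src);
    if (!match) return undefined;
    return { type: "ssml", raw: match[0], text: match[1].trim() };
  }
};
```

An object like this would normally be registered with `marked.use({ extensions: [ssmlExtension] })`. No `renderer` is included here because the PR appears to render the token through a Svelte component (SsmlRenderer.svelte) instead.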

Changed

None

Deprecated

None

Removed

None

Fixed

None

Security

None

Breaking Changes

None


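Additional Information

  • Previous PR which this builds upon: #16152
  • Docs update: https://github.com/open-webui/docs/pull/646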

Screenshots or Videos

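Speech Blocks:
https://github.com/user-attachments/assets/14526cce-2b84-41a2-af51-41ee61884795 (animated recording)

https://github.com/user-attachments/assets/4094d8a5-7dae-41a9-b460-6b549603d290

Settings:
https://github.com/user-attachments/assets/b4989b34-202d-46ca-a4e0-843fcb035650 (screenshot)

Admin Configs:
https://github.com/user-attachments/assets/3cd37b93-5b09-49be-b1e6-a397de74f5b3 (screenshot)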

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-06 07:27:00 -05:00

Reference: github-starred/open-webui#62969