[GH-ISSUE #6911] Markdown text is being skipped or mistakenly read during TTS "Read Aloud" #14530

Closed
opened 2026-04-19 20:52:10 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @danielj23 on GitHub (Nov 13, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/6911

Bug Report


Installation Method

Docker / windows

Environment

v0.3.35
TTS via matatonic/openedai-speech

Win 10 / FF latest

Confirmation:

  • [x] I have read and followed all the instructions provided in the README.md.
  • [x] I am on the latest version of both Open WebUI and Ollama.
  • [ ] I have included the browser console logs.
  • [ ] I have included the Docker container logs.
  • [x] I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

All visible alphanumeric "text" that makes up the visible markdown should be sent to the TTS service during TTS "Read Aloud" playback.
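
As an illustration of this expectation, the following is a minimal Python sketch of preprocessing that drops inline markdown markers but keeps the words inside them. It is not Open WebUI's actual code: `visible_text` and its regex rules are hypothetical names chosen for this example, and a real fix would more likely use a proper Markdown parser than regular expressions.

```python
import re

# Hypothetical rules, applied in order. Each one removes the markers and
# keeps the wrapped text, so nothing visible is lost before TTS.
_INLINE_RULES = [
    # `code` -> code
    (re.compile(r"`([^`]+)`"), r"\1"),
    # **bold** or __bold__ -> bold
    (re.compile(r"(\*\*|__)(.+?)\1"), r"\2"),
    # *italic* or _italic_ -> italic. The lookarounds leave snake_case
    # names and spaced operators such as "2 * 3 * 4" untouched.
    (re.compile(r"(?<!\w)([*_])(?=\S)(.+?)(?<=\S)\1(?!\w)"), r"\2"),
]


def visible_text(markdown: str) -> str:
    """Return the text with inline emphasis and code markers removed."""
    text = markdown
    for pattern, replacement in _INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    sample = "you can use _italic_ or __bold__ formatting: **_Bold and Italic_**."
    print(visible_text(sample))
```

With rules like these, the words "documents", "italic" and "Bold and Italic" from the example markdown would all still reach the TTS service.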

Actual Behavior:

Some text, such as italicized text, is not played back during TTS "Read Aloud" playback.

Description

Bug Summary:
I noticed during Read Aloud playback that some words were skipped. I noticed a pattern: it's almost always italicized markdown text that is not played back, though it can be other markdown text too. Sometimes it also reads hash symbols (#) that are not visible in the rendered text.

Reproduction Details

Steps to Reproduce:
(see attached sample of json and wav output)

  1. Configure a working TTS server so you can properly use the "Read Aloud" feature.
  2. Start a chat and edit the response to the example markdown provided below.
  3. Press "Read Aloud"
  4. Listen
  5. Notice it skipped some words entirely... sometimes
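
Step 5 can also be checked without listening, by comparing the words that should be spoken against the text actually sent to the TTS server. The sketch below assumes that each attached `.json` file is a request body with an `input` string, as in an OpenAI-style speech request; that layout is an assumption and has not been verified against the attachments. `words`, `missing_words` and `load_inputs` are hypothetical helper names.

```python
import json
import re
from pathlib import Path


def words(text: str) -> list[str]:
    """Lowercased alphanumeric words, in order of appearance."""
    return re.findall(r"[a-z0-9]+", text.lower())


def missing_words(expected_text: str, sent_texts: list[str]) -> list[str]:
    """Words from expected_text that appear in none of the sent chunks."""
    sent = set()
    for chunk in sent_texts:
        sent.update(words(chunk))
    missing = []
    for word in words(expected_text):
        if word not in sent and word not in missing:
            missing.append(word)
    return missing


def load_inputs(directory: str) -> list[str]:
    """Read the "input" field of every *.json file in a directory.

    Assumption: each file is a JSON object holding the TTS request body.
    """
    texts = []
    for path in sorted(Path(directory).glob("*.json")):
        body = json.loads(path.read_text(encoding="utf-8"))
        texts.append(str(body.get("input", "")))
    return texts


if __name__ == "__main__":
    expected = "It is often used to style documents, web pages, and even readme files."
    sent = ["It is often used to style , , and even readme files."]
    print(missing_words(expected, sent))
```

Passing the visible text of the message as `expected_text` and the output of `load_inputs` as `sent_texts` would list any words that never reached the TTS service.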

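firefox_07FyjMoCSi: https://github.com/user-attachments/assets/721b5e35-34f3-4bbd-8249-dcff48b69623
speech.zip: https://github.com/user-attachments/files/17735289/speech.zip
3c319a9970cb8c555490978176a74f47eb7becc04f1fde10ad8064c8be0b4154.json: https://github.com/user-attachments/files/17735295/3c319a9970cb8c555490978176a74f47eb7becc04f1fde10ad8064c8be0b4154.json
7eccaee9ca14bc649740f954181476df390fb368fcf88cab73972d7abb9f8a41.json: https://github.com/user-attachments/files/17735296/7eccaee9ca14bc649740f954181476df390fb368fcf88cab73972d7abb9f8a41.json
13b3be85f1ff2dc7eb8364c872d4bb92d2b3ca5f3402b45184fc753412d6c01e.json: https://github.com/user-attachments/files/17735297/13b3be85f1ff2dc7eb8364c872d4bb92d2b3ca5f3402b45184fc753412d6c01e.json
a8a0ec24b1676c26f84e86a5ad17eb550c9fb7d795babc197b84930d010e5c64.json: https://github.com/user-attachments/files/17735298/a8a0ec24b1676c26f84e86a5ad17eb550c9fb7d795babc197b84930d010e5c64.json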

Example markdown to attempt playback.

 # Sample Markdown Paragraph

Markdown is a lightweight markup language that you can use to format text. It is often used to style *documents*, **web pages**, and even readme files. For example, you can create:

- A bulleted list
- **Bold text**
- *Italic text*
- `Inline code`

To emphasize certain words, you can use _italic_ or __bold__ formatting, or both together like this: **_Bold and Italic_**. If you want to create a block of code, you can wrap it in triple backticks:
Author
Owner

@danielj23 commented on GitHub (Nov 13, 2024):

Adding here that I don't know the correct answer about which text should be sent to TTS.

For example, the TTS service is reading the # hashtag symbol that is presented to it in the JSON.
However, because it's not shown to the reader (it's just a header markdown symbol), it likely shouldn't be sent to the TTS service.

But, if I write "Hey Carl, this is a hashtag #", open-webui needs to send that hashtag to be read by TTS.
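
One possible way to draw that line, sketched in Python under stated assumptions: treat a run of one to six `#` characters as a heading marker only when it starts a line and is followed by a space or tab, and leave every other `#` alone. `strip_heading_markers` is a hypothetical name, and this regex only approximates how a Markdown renderer recognizes headings.

```python
import re

# A heading marker here means: up to 3 leading spaces, 1-6 '#' characters,
# then at least one space or tab. Anything else containing '#' is kept.
_HEADING_MARKER = re.compile(r"^ {0,3}#{1,6}[ \t]+", re.MULTILINE)


def strip_heading_markers(markdown: str) -> str:
    """Remove heading markers at line starts, keeping the heading text."""
    return _HEADING_MARKER.sub("", markdown)


if __name__ == "__main__":
    print(strip_heading_markers("# Sample Markdown Paragraph"))
    print(strip_heading_markers("Hey Carl, this is a hashtag #"))
```

Under this rule the heading text is still read aloud, the marker is not, and the `#` in "Hey Carl, this is a hashtag #" is passed through to TTS unchanged.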

Reference: github-starred/open-webui#14530