enhancement: TTS skip code sections #877

Closed
opened 2025-11-11 14:32:52 -06:00 by GiteaMirror · 3 comments
Owner

Originally created by @pcapazzi on GitHub (May 9, 2024).

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
Text to Speech always reads aloud code snippets (python, sql, etc)

Describe the solution you'd like
A clear and concise description of what you want to happen.
It would be great if in settings there is an option to have TTS skip code sections.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
I asked the LLM to not read aloud the code but it continued to do so.

Additional context
Add any other context or screenshots about the feature request here.
Nothing else to add. Lots to love about this app though... great work!

Originally created by @pcapazzi on GitHub (May 9, 2024). **Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] Text to Speech always reads aloud code snippets (python, sql, etc) **Describe the solution you'd like** A clear and concise description of what you want to happen. It would be great if in settings there is an option to have TTS skip code sections. **Describe alternatives you've considered** A clear and concise description of any alternative solutions or features you've considered. I asked the LLM to not read aloud the code but it continued to do so. **Additional context** Add any other context or screenshots about the feature request here. Nothing else to add. Lots to love about this app though... great work!
GiteaMirror added the good first issue label 2025-11-11 14:32:52 -06:00
Author
Owner

@lee-b commented on GitHub (May 10, 2024):

Personally, I think this would help, but would be better as part of a more mode-based UX, where you tell the system that you're now in voice conversation mode, and it adds something to the prompt along the lines of "You are communicating with the user via voice, so try to interpret spelling errors (from the speech recognition engine), keep answers brief and conversational, and avoid code listings and other things which are best presented visually. The user cannot hear any code blocks." THEN, if the model still outputs code blocks, they could be ignored, and the LLM would understand why you ignore them in your responses.

@lee-b commented on GitHub (May 10, 2024): Personally, I think this would help, but would be better as part of a more mode-based UX, where you tell the system that you're now in voice conversation mode, and it adds something to the prompt along the lines of "You are communicating with the user via voice, so try to interpret spelling errors (from the speech recognition engine), keep answers brief and conversational, and avoid code listings and other things which are best presented visually. The user cannot hear any code blocks." THEN, if the model still outputs code blocks, they could be ignored, and the LLM would understand why you ignore them in your responses.
Author
Owner

@silentoplayz commented on GitHub (May 11, 2024):

Related comment

@silentoplayz commented on GitHub (May 11, 2024): [Related comment](https://github.com/open-webui/open-webui/issues/1331#issuecomment-2080224728)
Author
Owner
@thiswillbeyourgithub commented on GitHub (Aug 23, 2024): By the way, [openedai-speech (the dropin openai tts replacement) supports regex specification and it seems like it could perfectly handle your issue](https://github.com/matatonic/openedai-speech/blob/main/pre_process_map.default.yaml)
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#877