Read aloud response only for Thinking models #3610

Closed
opened 2025-11-11 15:35:13 -06:00 by GiteaMirror · 0 comments
Originally created by @grizzlycode on GitHub (Feb 5, 2025).

Feature Request

Currently, when you have a response read aloud on a thinking model (e.g., DeepSeek), it reads both the thoughts and the response. It would be nice to have an option to read only the response aloud.


Is your feature request related to a problem? Please describe.

Currently, when using text-to-speech with thinking models (e.g., DeepSeek), both the "thoughts" and the final "response" are read aloud. This is cumbersome and distracting when the user only wants to hear the final answer.

Describe the solution you'd like

Implement an option to allow users to select whether they want only the final "response" read aloud by the text-to-speech engine. This would provide a cleaner and more focused listening experience.
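
To make the request concrete, here is a minimal sketch of how such an option might filter a message before it reaches the text-to-speech engine. It assumes the model wraps its reasoning in `<think>...</think>` tags, as DeepSeek-R1 style models commonly do; the function name `text_for_speech` and the `response_only` flag are hypothetical and are not part of Open WebUI.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think> tags.
# Other thinking models may use a different delimiter.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)

# An unterminated <think> (for example, mid-stream) runs to the end of the text.
OPEN_THINK = re.compile(r"<think>.*\Z", re.DOTALL | re.IGNORECASE)


def text_for_speech(message: str, response_only: bool = True) -> str:
    """Return the text to pass to a TTS engine (hypothetical helper)."""
    if not response_only:
        # Option disabled: keep the current behavior and read everything.
        return message.strip()
    # Drop every completed reasoning block first.
    without_closed = THINK_BLOCK.sub("", message)
    # Then drop any reasoning block that was opened but never closed.
    without_open = OPEN_THINK.sub("", without_closed)
    return without_open.strip()
```

With the option enabled, only the text outside the reasoning block would be spoken; with it disabled, the message is passed through unchanged, matching the behavior described above.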

Describe alternatives you've considered

The current workaround involves manually extracting the "response" text and using a separate text-to-speech application. This is inefficient and adds unnecessary steps.

Additional context

This feature would improve the usability of thinking models, particularly in situations where users prefer to listen to the output rather than read it. It would allow for a more streamlined and less cluttered audio experience.

Reference: github-starred/open-webui#3610