mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-05 18:38:17 -05:00
[GH-ISSUE #21403] feat: stream TTS as soon as text generation starts in voice call #19464
Originally created by @iChristGit on GitHub (Feb 14, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/21403
Problem Description
- Pressing the TTS button in regular chat starts playback very quickly, even on very long responses.
- In the voice chat option, open-webui waits until the response is complete and only then starts TTS.
Playback could be almost instant if TTS were allowed to start at the first sentence. The delay is very noticeable on large pieces of text.
Another closed issue that got no attention:
https://github.com/open-webui/open-webui/issues/14278
Desired Solution you'd like
Allow streaming of TTS (Kokoro TTS, which is in the docs and works really well).
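To illustrate the request, here is a minimal sketch of sentence-level streaming: buffer the LLM output as it arrives and hand each complete sentence to TTS right away, instead of waiting for the whole response. This is not open-webui's actual code. `stream_sentences` is a hypothetical helper, and the sentence-boundary regex is a deliberately naive assumption (it will split on abbreviations such as "e.g. ").

```python
import re
from typing import Iterable, Iterator

# Naive boundary (assumption): whitespace that follows '.', '!' or '?'.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def stream_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing, so keep it buffered.
        buffer = parts[-1]
    # Flush whatever is left once the LLM stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    # Hypothetical chunks standing in for a streamed LLM response.
    llm_chunks = ["Hello", " world.", " How are", " you? I am", " fine"]
    for sentence in stream_sentences(llm_chunks):
        # In a real voice call this would be a request to the TTS engine.
        print("TTS <-", sentence)
```

With something like this, the first sentence can be sent to the TTS engine while the model is still generating the rest of the response.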
Alternatives Considered
No response
Additional Context
No response
@iChristGit commented on GitHub (Feb 14, 2026):
This was the case in all previous versions. It works with the TTS button in regular chat, but voice mode always waits for the full response, however long it is.
Still an issue in v0.8.1.
We won't get the benefit of starting TTS early in open-webui: in chat, the button only shows after the full response (though once pressed, it can split on punctuation or on paragraphs), and voice mode defaults to waiting until the end of the LLM response before starting TTS.
@iChristGit commented on GitHub (Feb 14, 2026):
It only happens with Voice Call Emoji. Solved as far as I am concerned.