mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-17 12:31:06 -05:00
feat: voice input #19
Originally created by @honeyspoon on GitHub (Nov 2, 2023).
I would like to be able to use my voice as an input.
I don't really need text-to-speech from the AI.
I just want to be able to talk to it.
The use case is language learning.
Interface with a local Whisper model.
Add a microphone button next to the input box.
When clicked, you would hear a sound to start recording.
It would then live transcribe your text in the chatbox.
After 2 seconds of silence it would send the prompt to ollama.
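The flow described above (transcribe live, then send after a pause) could be sketched roughly as follows. This is only an illustration of the silence-timeout idea, not how the web UI implements it: `SilenceAutoSender`, `on_transcript`, and `poll` are hypothetical names, and timestamps are passed in explicitly so the logic stays independent of any particular speech engine.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: the 2-second pause mentioned in the request.
SILENCE_TIMEOUT_S = 2.0


@dataclass
class SilenceAutoSender:
    """Hypothetical helper: buffers transcribed text and releases it after a pause."""

    timeout_s: float = SILENCE_TIMEOUT_S
    _parts: list = field(default_factory=list)
    _last_speech_at: Optional[float] = None

    def on_transcript(self, text: str, now: float) -> None:
        """Record a chunk of transcribed speech heard at time `now` (in seconds)."""
        text = text.strip()
        if text:
            self._parts.append(text)
            self._last_speech_at = now

    def poll(self, now: float) -> Optional[str]:
        """Return the buffered prompt once the silence has lasted long enough, else None."""
        if self._last_speech_at is None:
            return None  # nothing has been said yet
        if now - self._last_speech_at < self.timeout_s:
            return None  # the speaker may still be mid-sentence
        prompt = " ".join(self._parts)
        # Reset so the next utterance starts a fresh prompt.
        self._parts.clear()
        self._last_speech_at = None
        return prompt
```

In use, the transcription callback would call `on_transcript(text, time.monotonic())`, and a periodic timer would call `poll(time.monotonic())` and forward any non-`None` result to the model.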
This project interfaces with a local Whisper model to do voice-to-text in a web app:
https://github.com/mayeaux/generate-subtitles
@tjbck commented on GitHub (Nov 3, 2023):
Looks interesting, I'll think of ways to incorporate into the web UI when I have more time as it seems like it might take some time to get the implementation right. Thanks for the idea.
@tjbck commented on GitHub (Nov 11, 2023):
Hi, I just merged #90 into main, so voice recognition support should now be turned on by default.
For your specific use case, you can enable the speech auto-send function by going to Settings > Addons and clicking on the button right next to the 'Speech Auto-Send' label to toggle.
Let me know if you encounter any issues with the feature. Thanks!
@honeyspoon commented on GitHub (Nov 20, 2023):
I did not know the browser had an integrated speech API.
How does it compare to Whisper?
Now that this feature is in, I might try to look at running it against a local server running Whisper.
I wonder if something like ollama exists for Whisper.
@0x07CB commented on GitHub (Apr 27, 2024):
Whisper is great, and Vosk can help too (Vosk can also return SRT data, which is useful for generating subtitles for audio/video media files). Whisper and Vosk can both be used from Python. (I have not read the code yet, but I see that the open-webui/open-webui repo is partially written in Python.)
So, if the backend uses Python, you have options for building a good STT feature. (I may be wrong; I have not checked this repo closely and have only just started trying it.)
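As a rough illustration of the subtitle idea mentioned above, timed transcript segments can be turned into SRT text with plain Python. This is a sketch under assumptions: the segment shape (dicts with `start`, `end` in seconds, and `text`) is assumed for the example and would need to be adapted to whatever the chosen STT library actually returns; `format_timestamp` and `segments_to_srt` are hypothetical helper names.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like {"start": s, "end": s, "text": str}."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT cue: a 1-based index, a time range, then the caption text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```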