[GH-ISSUE #5451] Speech-To-Text Transcription #65444

Closed
opened 2026-05-03 21:18:10 -05:00 by GiteaMirror · 1 comment

Originally created by @HerroHK on GitHub (Jul 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5451

Issue: our company has audio recordings that are confidential in nature. We have set up a Linux server (Ubuntu) running Ollama with both Open-WebUI and AnythingLLM as interfaces. However, neither seems able to transcribe long (up to 8 hours) audio recordings; we only get back snippets. It is also unclear where the cutoffs fall in terms of time, as some parts do seem to get transcribed.

It would make a great addition to Ollama if we could make use of Whisper or other models locally to do this.

I am pretty sure this is a very common use case for AI, with plenty of "commercially available options". The problem is that we can't easily use commercial services and still fulfill both our needs and our contracts.

Thanks for considering this.

GiteaMirror added the feature request label 2026-05-03 21:18:10 -05:00

@pdevine commented on GitHub (Jul 3, 2024):

@HerroHK Thanks for the issue. Definitely something we'd like to add at some point. I'm going to close this though as a dupe of #3265. There's also #1234 for Text-to-Speech.

Reference: github-starred/ollama#65444