[GH-ISSUE #5424] Supports voice recognition and text-to-speech capabilities, with customizable extension abilities #29154

Open
opened 2026-04-22 07:49:53 -05:00 by GiteaMirror · 3 comments

Originally created by @skytodmoon on GitHub (Jul 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5424

Feature Request: Support for Voice Recognition and Text-to-Speech with Custom Extension Capabilities

I would like to propose the addition of voice recognition and text-to-speech functionalities to the project. These features would greatly enhance the user experience by allowing for hands-free interaction and accessibility.

Additionally, I suggest implementing a customizable extension framework that would enable developers to integrate their own voice commands or speech synthesis options, thereby expanding the project's versatility and adaptability to various use cases.

Thank you for considering this enhancement to the project. I believe these features would be a valuable addition and open up new possibilities for users and developers alike.

GiteaMirror added the feature request label 2026-04-22 07:49:53 -05:00

@CrazyBoyM commented on GitHub (Jan 21, 2025):

need it too.


@v-byte-cpu commented on GitHub (Jun 10, 2025):

Hi all — following up from my earlier issue (#11021, closed as dup), here’s a concise roadmap for adding native TTS support to Ollama:
👉 https://gist.github.com/v-byte-cpu/28d402ba5601a25432c7e18a99d3725f

It summarizes concrete ideas around API, testing, voice handling, and future extensions — based on the community discussion here and related TTS models.

I’d be happy to contribute a design document or an initial PR (Python + Go) and help move this forward.
If there are any internal plans or ongoing discussions, I’d be glad to align with them first.
Please advise what would be the most useful way to start — thanks!


@kekko7072 commented on GitHub (Oct 12, 2025):

@v-byte-cpu Seems interesting, any update? I have an open application that desperately needs better TTS than the default macOS ones: [Opra App](https://github.com/kekko7072/opra). It's just a PDF reader app, but very cool though.

Reference: github-starred/ollama#29154