[PR #2755] [CLOSED] Add audiomaker #15758

Closed
opened 2026-05-02 07:54:42 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/vinta/awesome-python/pull/2755
Author: @AnkushRathour
Created: 9/3/2025
Status: Closed

Base: master ← Head: patch-1


📝 Commits (2)

  • e78513c Add audiomaker
  • 6af3ed2 Update README.md

📊 Changes

1 file changed (+1 additions, -0 deletions)

📝 README.md (+1 -0)

📄 Description

What is this Python project?

audiomaker is a Python package for local text-to-speech generation, designed specifically to handle long-form input by automatically splitting it into manageable chunks and seamlessly merging the resulting audio files.

Unlike many other TTS tools, audiomaker is:

  • 🆓 Free and open source
  • 💻 Runs locally with no API keys or paid cloud account (the edge-tts backend does contact Microsoft's online speech service, so network access is needed)
  • 🔊 Capable of generating hours-long audio from large .txt files (e.g., books, scripts)

This makes it ideal for creating:

  • Audiobooks and narrated blog posts
  • Podcast episodes and interviews
  • Long-form video voiceovers or lectures

PyPI: https://pypi.org/project/audiomaker/
GitHub: https://github.com/AnkushRathour/AudioMaker

🔧 Features

  • Uses Microsoft Edge TTS (edge-tts) for high-quality neural voices, supporting multiple languages and voice styles
  • Automatically chunks large input texts based on configurable parameters (e.g., chunk size, pause duration)
  • Merges generated audio chunks into a single smooth audio file without audible gaps
  • Provides both CLI and Python API interfaces for flexible usage
  • Supports output in common audio formats like .mp3 and .wav
  • Includes built-in progress tracking and error handling for robust long-form synthesis

Successfully tested on very large .txt files, producing over 4 hours of continuous audio output
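
The chunking step described in the features above can be illustrated with a minimal sketch. This is not audiomaker's actual implementation: the function name `chunk_text` and the `max_chars` parameter are hypothetical, and the sentence-boundary heuristic is an assumption about one reasonable way to split long text before synthesis.

```python
import re


def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Hypothetical helper for illustration; not part of audiomaker's API.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries rather than at a fixed character offset keeps each synthesized segment ending on a natural pause, which should make the joins between segments less noticeable.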

What's the difference between this Python project and similar ones?

  • Unlike cloud-based TTS services (e.g., Google TTS, AWS Polly), audiomaker needs no API keys or cloud billing; synthesis is driven from the local machine
  • Compared to gTTS, it is built for longer inputs; note that edge-tts, like gTTS, sends text to an online service, so an internet connection is still required
  • Uses Microsoft Edge TTS with advanced neural voices, unlike basic TTS engines like pyttsx3
  • Automates chunking and merging to handle arbitrarily long texts without manual intervention
  • Provides an end-to-end pipeline minimizing user effort for long-form audio generation
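
The synthesize-and-merge loop with progress tracking and error handling might look roughly like the sketch below. Everything here is an assumption for illustration: `synthesize_all`, `max_retries`, and `on_progress` are hypothetical names, the TTS backend is passed in as a plain callable rather than calling edge-tts directly, and byte-wise concatenation is a simplification that tends to work for MP3 streams but not for WAV, which would need a real audio library to merge.

```python
from typing import Callable, Optional


def synthesize_all(
    chunks: list[str],
    synthesize: Callable[[str], bytes],
    max_retries: int = 2,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> bytes:
    """Synthesize each chunk in order, retrying failures, and join the audio bytes.

    Hypothetical pipeline for illustration; `synthesize` stands in for any
    TTS backend that maps text to encoded audio bytes.
    """
    total = len(chunks)
    parts: list[bytes] = []
    for index, chunk in enumerate(chunks, start=1):
        last_error: Optional[Exception] = None
        for _attempt in range(max_retries + 1):
            try:
                parts.append(synthesize(chunk))
                break
            except Exception as exc:  # retry on any backend failure
                last_error = exc
        else:
            # Every attempt failed; report which chunk was responsible.
            raise RuntimeError(
                f"chunk {index}/{total} failed after {max_retries + 1} attempts"
            ) from last_error
        if on_progress is not None:
            on_progress(index, total)
    # Simplified merge: plain concatenation of the encoded segments.
    return b"".join(parts)
```

Injecting the backend as a callable keeps the retry and progress logic testable without network access, and reporting the failing chunk index lets a long run be resumed from that point rather than restarted.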

Anyone who agrees with this pull request may submit an Approve review.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-02 07:54:42 -05:00
Reference: github-starred/awesome-python#15758