[PR #2755] [CLOSED] Add audiomaker #15758

GiteaMirror · 2026-05-02T07:54:42-05:00

📋 Pull Request Information

Original PR: https://github.com/vinta/awesome-python/pull/2755
Author: @AnkushRathour
Created: 9/3/2025
Status: ❌ Closed

Base: master ← Head: patch-1

📝 Commits (2)

e78513c Add audiomaker
6af3ed2 Update README.md

📊 Changes

1 file changed (+1 additions, -0 deletions)

📝 README.md (+1 -0)

📄 Description

What is this Python project?

audiomaker is a Python package for local text-to-speech generation, designed specifically to handle long-form input by automatically splitting it into manageable chunks and seamlessly merging the resulting audio files.

Unlike many other TTS tools, audiomaker is:

🆓 Free and open source
💻 Runs locally with no API keys or paid cloud accounts (synthesis itself goes through edge-tts, which calls Microsoft's online speech service)
🔊 Capable of generating hours-long audio from large .txt files (e.g., books, scripts)

This makes it ideal for creating:

Audiobooks and narrated blog posts
Podcast episodes and interviews
Long-form video voiceovers or lectures

PyPI: https://pypi.org/project/audiomaker/
GitHub: https://github.com/AnkushRathour/AudioMaker

🔧 Features

Uses Microsoft Edge TTS (edge-tts) for high-quality neural voices, supporting multiple languages and voice styles
Automatically chunks large input texts based on configurable parameters (e.g., chunk size, pause duration)
Merges generated audio chunks into a single smooth audio file without audible gaps
Provides both CLI and Python API interfaces for flexible usage
Supports output in common audio formats like .mp3 and .wav
Includes built-in progress tracking and error handling for robust long-form synthesis

Successfully tested on very large .txt files, producing over 4 hours of continuous audio output
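
The chunking step listed under Features can be pictured with a minimal sketch. This is not audiomaker's actual code or API: the function name `chunk_text` and the `max_chars` parameter are hypothetical, and the sketch only assumes that chunks should stay under a size limit and break at sentence ends where possible.

```python
import re


def chunk_text(text: str, max_chars: int = 3000) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper for illustration; audiomaker's real chunking
    logic and parameter names may differ.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Otherwise, pack sentences together until the limit is reached.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be passed to a TTS backend one at a time, which keeps every individual request small regardless of how long the input file is.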

What's the difference between this Python project and similar ones?

Unlike paid cloud TTS services (e.g., Google TTS, AWS Polly), audiomaker requires no API keys or cloud billing
Compared to gTTS, it adds automatic chunking and merging so that longer inputs are handled reliably; like gTTS, its edge-tts backend still needs an internet connection
Uses Microsoft Edge TTS with advanced neural voices, unlike basic TTS engines like pyttsx3
Automates chunking and merging to handle arbitrarily long texts without manual intervention
Provides an end-to-end pipeline minimizing user effort for long-form audio generation
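
The merging half of the pipeline described above can be sketched in the same spirit. This is an illustration under stated assumptions, not audiomaker's implementation: `merge_wav_chunks` and `pause_seconds` are hypothetical names, and the sketch handles only uncompressed WAV chunks that share one format, because the standard library can read those directly. Merging compressed formats such as MP3 generally needs a dedicated audio tool.

```python
import wave


def merge_wav_chunks(chunk_paths, output_path, pause_seconds=0.0):
    """Concatenate WAV files into one file, with optional silence between them.

    Hypothetical helper for illustration. All chunks must share the same
    channel count, sample width, and frame rate.
    """
    paths = [str(p) for p in chunk_paths]
    if not paths:
        raise ValueError("at least one chunk is required")

    expected = None
    with wave.open(str(output_path), "wb") as out:
        for index, path in enumerate(paths):
            with wave.open(path, "rb") as chunk:
                fmt = (
                    chunk.getnchannels(),
                    chunk.getsampwidth(),
                    chunk.getframerate(),
                )
                if expected is None:
                    # The first chunk defines the output format.
                    expected = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError(f"{path} has a different audio format")

                if index > 0 and pause_seconds > 0:
                    # 8-bit WAV is unsigned, so silence is 0x80; wider
                    # sample widths are signed, so silence is zero bytes.
                    silent_byte = b"\x80" if fmt[1] == 1 else b"\x00"
                    n_frames = int(pause_seconds * fmt[2])
                    out.writeframes(silent_byte * (n_frames * fmt[0] * fmt[1]))

                out.writeframes(chunk.readframes(chunk.getnframes()))
```

Writing the frames back to back is what avoids audible gaps in this sketch; the optional pause is inserted only between chunks, never before the first or after the last.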

Anyone who agrees with this pull request may submit an Approve review.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-02 07:54:42 -05:00

GiteaMirror closed this issue 2026-05-02 07:54:44 -05:00

Reference: github-starred/awesome-python#15758