Voicebox

Open source voice cloning powered by Qwen3-TTS.

Freemium | Audio & Music | voice cloning

About Voicebox

Voicebox is an open-source desktop application for voice cloning, powered by Qwen3-TTS. It generates natural-sounding speech from text in a voice that closely replicates a reference speaker. It is positioned as a local-first voice cloning studio that aims for synthesis quality comparable to commercial software while keeping user data private: voice models are downloaded, voices are cloned, and speech is generated entirely on the local machine, with no cloud services or subscriptions required.

The application is cross-platform, running on macOS, Windows, and Linux. Inference is accelerated with Metal on Mac and CUDA on Windows and Linux, and users can run GPU inference locally or connect to a remote machine. Multi-sample support lets several reference samples be combined for one voice, which improves the quality and naturalness of the clone.

Voicebox also includes a stories editor for building multi-voice narratives on a timeline, where users can arrange tracks, trim clips, and mix conversations. Audio transcription powered by Whisper provides speech-to-text, which allows reference text to be extracted automatically from voice samples.
