Open source AI voice cloning arrives with MyShell’s new OpenVoice model

Researchers at MIT, Tsinghua University, and Canadian AI startup MyShell have developed an open-source voice cloning model called OpenVoice. Unlike other platforms, OpenVoice offers granular controls and near-instantaneous cloning: from a small audio clip, users can clone a voice and then control its tone, emotion, accent, rhythm, pauses, and intonation. The developers have published a research paper that has not yet been peer reviewed, along with links for accessing and trying out OpenVoice. OpenVoice comprises a text-to-speech model and a tone converter, trained on audio samples from various speakers.

MyShell aims to support the open-source research community by providing grants, datasets, and computing power. The company believes that voice is a crucial modality for Artificial General Intelligence (AGI), which is why it chose to focus on open-source voice cloning. MyShell, a decentralized platform, offers OpenVoice alongside other AI characters, bots, and features, and earns revenue from monthly subscriptions and from fees for AI training data.
