Researchers at MIT, Tsinghua University, and Canadian AI startup MyShell have developed an open-source voice cloning tool called OpenVoice. Unlike other voice cloning platforms, OpenVoice offers granular controls and near-instantaneous cloning: from a short audio clip, users can reproduce a voice and then adjust its tone, emotion, accent, rhythm, pauses, and intonation. The developers have published a research paper that has not yet been peer reviewed, along with links for accessing and trying out OpenVoice. MyShell says it aims to support the open-source research community by providing grants, datasets, and computing power. The company believes that voice is a crucial modality for Artificial General Intelligence (AGI), which is why it chose to focus on open-source voice cloning. OpenVoice comprises two components, a text-to-speech model and a tone converter, both trained on audio samples from a variety of speakers. MyShell, which describes itself as a decentralized platform, offers OpenVoice alongside other AI characters, bots, and features, and earns revenue from a monthly subscription and from charges for AI training data.
