Voicebox: Meta’s Mindblowing AI Tool For Speech Generation

June 19, 2023

Meta (formerly Facebook) has just announced Voicebox, a state-of-the-art generative AI model for speech generation.

It is a multilingual text-to-speech tool, and the quality is unbelievably good.

What can you do with Voicebox?

1. In-context text-to-speech synthesis

Think of this like a parrot that’s learned to mimic your voice. All it needs is a clip of your speech. Then, you can type anything you want, and it will read it out in your voice.

2. Speech editing and noise reduction

Imagine you’ve recorded a beautiful birthday message for a friend, but a car honked loudly in the background. Instead of re-recording the whole thing, Voicebox can simply ‘erase’ that car honk from your message.

Similarly, if you stumble on a word or say something wrong, you don’t need to start over. Voicebox can fix those mistakes in your original voice.

3. Cross-lingual style transfer

Suppose you speak English, but you want to surprise your Spanish-speaking friend with a birthday message in their language. You can type your message in Spanish, and Voicebox will read it out loud in your voice, even though the original recording you provided was in English.

4. Diverse speech sampling

People all around the world talk differently, right? With different accents, tones, and styles. Voicebox learns from a wide range of these speech patterns in six languages.

So, it can generate realistic speech that sounds like a native speaker in English, French, Spanish, German, Polish, or Portuguese. This could make things like your GPS or virtual assistant sound much more natural and familiar.

Who could use this tool?

The applications of Voicebox are wide-ranging and extend to various audiences.

  • Content creators: Voicebox can be a powerful tool for audio editing and creation. It can help creators produce high-quality audio tracks for videos without needing to re-record entire segments due to minor disturbances or errors.
  • Visually impaired individuals: Voicebox can transform written messages from friends into high-quality audio read in their voices, making digital communication more accessible.
  • Podcasters: With its speech editing and noise reduction capabilities, podcasters can seamlessly edit their recorded episodes. Whether it’s removing background noise or correcting mispronounced words, Voicebox can ensure a clean, professional-sounding podcast without the need for re-recording.


Is Voicebox available to the public?

As of now, Meta has not made the Voicebox model or code publicly available.

This is primarily due to concerns about the potential misuse of the technology. Can you imagine what prank calls are going to be like in the future?

I want to learn more about AI

If you want to stay up to date with the latest AI tools and updates (and how to use them to your advantage), make sure you are subscribed to the WGMI newsletter.
