Meta boasted Friday that it has produced ‘the most versatile AI for speech generation’ in existence. 

But the company added that it would not be making its AI model public, citing grave concerns over the advanced tech’s ‘potential risks of misuse.’

In recent months, scammers have become adept at employing AI-generated speech to perpetrate eerie and shocking crimes, including an April attempt at faking the kidnapping of a teenage girl in Arizona, terrorizing the young girl’s distraught mother with realistic AI-generated pleas.

But Meta proposed a variety of more optimistic use cases in their press release, stating that Voicebox could be used to help the visually impaired hear messages from their friends and loved ones, or to allow non-native speakers to play translations of their own words, in their own voice, but in a foreign tongue.

Meta called their new Voicebox generative AI model 'the most versatile AI for speech generation' in existence. But the company added that it would not be making the AI public, due to the firm's own grave concerns regarding the advanced tech's 'potential risks of misuse'

The announcement comes a little over a month after Zuckerberg (pictured) was snubbed by the White House, which explicitly told reporters that Meta representatives had not been invited to a West Wing summit that was exclusively for firms at the 'forefront of AI innovation'

At the moment, the company said its AI model is capable of speaking six languages: English, French, Spanish, German, Polish and Portuguese.

Meta also offered a few more business-centric use cases for the technology, including deploying Voicebox as a means for audio creators to more easily edit unwanted background noises or errors from their audio or video tracks. 

It also suggested that Voicebox could be used to create more comforting, naturalistic voices for virtual assistants and more realistic sounding characters in video games. 

But all these brave new opportunities will not yet be made available to developers hoping to play in Meta’s Voicebox sandbox, the company said in a press release.   

A promotional video for Voicebox released Friday showed off the AI's ability to convert text-to-speech in a wide variety of voices

‘There are many exciting use cases for generative speech models,’ the company said in a research post, ‘but because of the potential risks of misuse, we are not making the Voicebox model or code publicly available at this time.’ 

‘While we believe it is important to be open with the AI community and to share our research to advance the state of the art in AI,’ the firm added, ‘it’s also necessary to strike the right balance between openness with responsibility.’

Meta’s deep learning AI researchers noted in their post introducing Voicebox that their system uses a method called Flow Matching, which they said improves on diffusion models, and that it outperformed VALL-E, the current state-of-the-art English model, on zero-shot text-to-speech.

Voicebox, they said, produced artificial audio that was more intelligible, scoring a word error rate of 1.9 percent compared with VALL-E’s 5.9 percent.

It also scored higher on audio similarity (0.681 versus VALL-E’s 0.580) while being, according to Meta, nearly 20 times faster.

When translating across languages, Voicebox outperformed a well-regarded multilingual text-to-speech AI, YourTTS, reducing the average word error rate from 10.9 percent to 5.2 percent, and upping the ratio for audio similarity from 0.335 to 0.481.
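For readers unfamiliar with the metric, word error rate is conventionally the number of word-level substitutions, insertions and deletions needed to turn a transcript into the intended text, divided by the number of words in that text. For speech generators, the transcript typically comes from running the synthetic audio through a speech recognizer. The Python sketch below illustrates only that textbook definition. The function name and the example sentences are illustrative assumptions, and it is not Meta's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch of the standard definition; the name and the
    lowercase/whitespace normalization are assumptions, not Meta's method.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edits needed to turn the first j hypothesis words
    # into the reference words processed so far.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    intended = "the cat sat on the mat"
    transcript = "the cat sat on a mat"
    print(f"{word_error_rate(intended, transcript):.1%}")
```

Under this definition, a 1.9 percent rate corresponds to roughly two wrong words per hundred intended words, while 5.9 percent corresponds to roughly six.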

The announcement comes a little over a month after Zuckerberg was snubbed by the Biden White House — which explicitly told reporters that Meta representatives had not been invited to a West Wing summit that was exclusively for firms at the ‘forefront of AI innovation.’

This post first appeared on Dailymail.co.uk
