Microsoft is developing AI that can imitate human voices in just three seconds » Leadersnet

Alexander Schöpf | 10.01.2023

To prevent manipulation, work is being done on another piece of software that can recognize audio clips created with “VALL-E”.

VALL-E is the name of a new artificial intelligence (AI) developed by Microsoft that can imitate human voices deceptively well. As the Austrian daily newspaper Der Standard reports, a three-second snippet of sound is enough to imitate a voice, including the emotional coloring of the speaker and the acoustics of the room in which the voice sample was recorded.

Wide field of application and fear of manipulation

The technology company sees a wide range of applications for the groundbreaking technology. On the one hand, high-quality text-to-speech functions would be conceivable: for example, a text message could be read aloud in the sender's voice. On the other hand, slips of the tongue in a recording could be corrected afterwards.

Of course, this also opens the door to manipulation. For example, people's statements could be altered afterwards or fabricated entirely without anyone noticing. To prevent this, Microsoft wants to develop software that recognizes when an audio clip was created with VALL-E.

First sound clips released

However, the AI will not be available to the general public for the time being, as it is still a research project. To illustrate the potential of VALL-E, the research team has released some sound clips that show the AI in action.

www.microsoft.com
