2024-04-19 15:02:03
Microsoft has a new artificial intelligence (AI) system that can generate hyperrealistic avatars of people from a single photo and an audio clip. VASA-1 is a research project, not available to the general public, that demonstrates the reach of the technology and revives the debate around deepfakes.
Researchers at the technology company say that “VASA-1 is able not only to produce lip movements synchronized with sound, but also to capture a wide range of facial nuances and natural head movements that contribute to the perception of authenticity and liveliness”.
The tool generates videos with a resolution of 512 x 512 pixels at 45 frames per second. In addition, it allows users to control aspects such as movement sequence, gaze direction and facial expression. The system can handle images and audio tracks of kinds that were not included in its training data, such as artistic works, singing and dialogue in languages other than English.
Unchecked deepfakes are a disinformation threat. But some politicians, executives and academics see them as a way to expand their reach.
How was VASA-1 developed?
Microsoft’s new AI model was trained on a huge collection of videos of people talking (the origin of the database is unknown). It can analyze faces and understand different aspects of them individually. The researchers assigned a code to each attribute so that movements can be added or removed at will.
“We consider all possible facial dynamics, including lip movements, facial expressions, gaze and blinking, as a single latent variable. We model their probability distribution in a unified way. This holistic modeling of facial dynamics, together with jointly learned head movement patterns, leads to the generation of a wide range of emotional and realistic conversational behaviors,” they explain.
The development team used a 3D approach to capture facial details and train the algorithm to understand head and neck movements in three-dimensional space.
Microsoft’s new AI arrives amid concerns about deepfakes
Microsoft clarified that it has no intention of releasing VASA-1 as a product or an API. It explained that all of the demonstration videos included in its research are based on content generated with DALL-E 3 and StyleGAN2.
The company hopes its findings will facilitate the creation of virtual AI avatars with the aim of improving accessibility for people with communication difficulties and offering therapeutic and educational support to those who need it.
Last month, more than four hundred experts in AI, cybersecurity, digital ethics and global politics signed an open letter demanding that governments worldwide take urgent mandatory measures against deepfakes. In the document, titled ‘Disrupting the Deepfake Supply Chain’, they warned that current laws do not sufficiently limit the production and spread of this content. They described this as a potential danger in a year in which more than half of the world’s population will participate in democratic processes.
“For a modern society to function, people need access to credible and authentic information. Misleading the public through the use of artificial intelligence should be regulated and enforced through specific and formalized laws. It is increasingly difficult to identify what is real on the internet. It is necessary to draw lines to protect our ability to recognize real people,” the text reads.
The spread of fake and misleading videos produced with artificial intelligence grew by 550% between 2019 and 2023, according to the report “State of Deepfakes 2023” from Home Security Heroes, an online security organization.