Could the story be taking a wrong turn? In this month of April 2023, doubts are mounting. In recent years, artificial intelligence (AI) has made bold promises: invaluable help for medicine, accelerated drug research, automation of tedious work… But today, it is the dark side of AI that is coming to the fore, or rather its use for malicious purposes. Artificial photos that look real, disturbing synthetic voices, fake videos made from simple texts: the worst of the technology is on display, raising fears of a wave of manipulation of all kinds.
1. Ultra-realistic artificial photos
Fake images are nothing new. One of a shark supposedly swimming on a flooded highway, created with Photoshop, dates from 2012. But back then, more than ten years ago, social networks were in their infancy, and Photoshop cost a fortune and required solid skills. This week, it may have taken just a few seconds for an ardent Donald Trump fan, who goes by Brick Suit on Twitter, to create a photo of the former president parading through the streets of New York with a huge crowd following him. The image, reposted by the former president's son Eric Trump, has been viewed more than 6 million times.
From Elon Musk hand in hand with General Motors CEO Mary Barra to Emmanuel Macron amid a mountain of trash, fake images are on the rise thanks to the democratization of the tools used to create them. Midjourney, Dall-E and Stable Diffusion are increasingly accessible, and Microsoft has just integrated Dall-E into its Bing search engine.
According to Sébastien Marcel, head of the biometric security and privacy protection research group at the Idiap institute in Martigny, this democratization comes as no surprise. “It is a technology that we have been following for several years; it was inevitable.” The specialist predicts: “Just as the quality of synthetic images will improve, detection methods will adapt, provided research funding in this area follows.”
It is a view shared by Geovani Rizk, a postdoctoral fellow specializing in AI who works in the EPFL Distributed Computing Laboratory, directed by Rachid Guerraoui: “Over the past nine years, thousands of research papers have taken up an image generation model and proposed improvements. For many of these works, the source code is open source and therefore available to anyone. It thus becomes easier for anyone to take models that have already been built and launch the training process to obtain a result similar to those presented in the research papers, without having a thorough understanding of them.”
According to Geovani Rizk, “the availability of new tools such as generative models will undeniably lead to an increase in these images on the various networks. It will therefore be necessary to remain vigilant about what can be seen in an image and always try to cross-check the information: for example, where does the image come from? Who is publishing it? And so on.” If we go back to the image of Donald Trump in New York, we notice that several of the people accompanying him have deformed faces and hands. Image generation is improving. But it is not perfect.
2. The explosion of fake videos
Creating videos from a few lines of text is possible. In 2022, Meta, Facebook's parent company, unveiled such a tool, called Make-A-Video. In recent days, the start-up Runway has made it possible to create mini-clips of a few seconds: the scenes are chaotic and the characters strange, but it works. For its part, ModelScope has launched a system that Internet users have used, for example, to create videos of Will Smith or Scarlett Johansson stuffing themselves with spaghetti by hand. We have also seen Emmanuel Macron, him again, running through piles of trash.
According to Sébastien Marcel, “we can also expect progress in the generation of synthetic videos, but it will not be as fast because it is much more complex to generate a sequence of coherent synthetic images.” For Geovani Rizk, “for video, taking on the appearance of another person requires a lot of data for it to be realistic, at least for now. Most of the deepfakes generated involve celebrities or politicians, since it is easy to collect several dozen hours of video showing their faces. Taking on the appearance of a person with less media exposure will give a less realistic result.” Note that the threat of fake pornographic videos is growing, with the risk of considerable damage.
3. Voices already usurped
This is arguably the area where AI has come closest to perfection. “It is possible to synthesize a voice from a text or from another voice,” says Sébastien Marcel. The researcher foresees another phenomenon: “We are very close to seeing audiovisual deepfakes generated in real time to, for example, convert the face and voice of one person into the face and voice of another. What is at stake is identity theft for the purposes of fraud, blackmail or misinformation.”
Geovani Rizk also believes that “it is already possible to convincingly imitate someone's voice when you have enough data. As with the previous point on video, in the absence of this data, it will be much more complicated to produce a convincing result.”
In the end, will detecting fake images, videos, voices (or anything else) very quickly become impossible? “For images and videos, I think it will very quickly become impossible to detect this with the naked eye,” says Geovani Rizk. Sébastien Marcel is more optimistic: “Detection technologies will follow. The most important thing is to stay in the race, to keep analyzing new generation techniques in order to understand them and be able to anticipate.”