OpenAI rates its latest AI model as a ‘medium’ risk

2024-08-09 15:50:00

When will we really have to worry about the dangers of artificial intelligence? That’s the question the entire industry is asking. And according to the companies creating these AIs, the answer remains “not now.” As spotted by The Verge, OpenAI released on Thursday a research paper on the safety measures and risk assessments it put in place before releasing its latest AI model, GPT-4o. The result: it poses only a “medium” risk. In March, its competitor Anthropic drew the same conclusion when it released its AI, Claude 3.

At OpenAI, AI risk assessment falls into four categories: the model’s potential to become a cybersecurity threat, its ability to aid in the development of a biological threat, its persuasive capabilities, and its potential to act autonomously, without human control, like Skynet in the movie Terminator. This framework focuses only on existential and financial risks, and ignores a whole range of others, such as the technology’s potential effects on inequality.


GPT-4o deemed harmless to humanity

In detail, OpenAI assessed its GPT-4o model as low risk in every category except persuasion, which scored slightly higher. The researchers concede that the AI can write opinion-influencing texts that are more effective than those written by humans, but they specify that this finding is limited to specific cases.

As a final step in the risk assessment process, OpenAI called on “red teams,” whose role is to try to derail the model and use it for malicious purposes so that the flaws they exploit can be fixed before release. For example, they tried to use GPT-4o to clone voices, to generate violent or erotic content, and to reproduce copyright-protected material. Despite these precautions, which are widespread across the industry, AI models are routinely hijacked as soon as they are released.

The most famous example remains Google’s AI, Gemini, which started generating images of Black Nazis. Before that, malicious users had made Bing (Microsoft’s search engine) generate images of Kirby (an iconic character from Nintendo, a Japanese company that is very strict about intellectual property law) at the controls of a plane heading toward two towers… In short: beyond research papers, the industry still has to prove it can produce AI models and tools that are robust against hijacking. That is a difficult task, given that generative AI is inherently intended for a very wide range of use cases.

A next generation of AI under closer scrutiny

OpenAI, due to the popularity of ChatGPT and its position as a pioneer in the ecosystem, is receiving particular attention from public authorities on the subject. While AI is not dangerous for the moment (at least in the sense of the existential threat the industry has chosen to focus on), some experts and politicians are worried about the short-term future, as the technology is progressing rapidly.


The most catastrophist voices, known as “doomers” in the sector, are even calling for research to be paused in order to develop robust safeguards. This is obviously not the opinion of the AI companies, which believe they can reconcile the need for safety with a frantic race for performance, financed by the tech giants’ billions.

But the question of risk will keep coming back to the table. By the end of the year, the leaders of the ecosystem (OpenAI first among them) plan to release a new generation of models, synonymous with a technological leap. Public authorities are increasingly demanding organizational guarantees from AI creators, and the saga around Sam Altman’s leadership of OpenAI has exposed the fragility of these startups, which are supposed to control an overwhelmingly powerful technology in the years to come. To avoid pressure from regulators, they will have to show their credentials quickly.