2024-01-15 14:11:12
1/15/2024-|Last update: 1/15/202407:04 PM (Mecca time)
AI chatbots like GBT Chat receive prompts or series of instructions from human users, but are instructed not to respond to unethical, questionable, or illegal requests. For example, when asking how to create malware to hack bank accounts, you will receive a flat rejection of this request.
Despite these ethical restrictions, researchers from Nanyang Technological University in Singapore demonstrated in a study published on the pre-print website (arXiv), that it is possible to manipulate the mind of these robots through a robot they created called “Master Key,” which enabled them to hack them and produce content that violates the instructions of their developers, a result known as “jailbreaking.”
“Jailbreaking” is a term in the field of computer security that refers to hackers finding flaws in a system’s software, and exploiting these flaws to make the system do something that its developers deliberately prevented.
How did scientists manipulate the brain of “GPT chat”?
Robots’ brains are a large language model (LLM) that helps them process human input, creating text that is almost indistinguishable from text that a human can generate. These brains are filled with vast amounts of textual data to understand, generate, and process human language.
What the researchers from Nanyang Technological University did – as they revealed in their study – was that they “reverse engineered” to find out how the brains of robots detect “large language models” such as “GPT chat” for unethical requests.
From the information they obtained, they trained their own large language model to produce requests that bypass the defenses of the large language models that underpin famous chatbots. Then they created their own chatbot capable of automatically generating more requests to break the protection of other chatbots, and they called it “Master Qi.”
Just as the master key “Master Key” opens multiple locks, the name the researchers chose for their robot indicates that it is a powerful and versatile tool that can penetrate the security measures of various automated chat systems.
Professor Liu Yang from Nanyang University’s School of Computer Science and Engineering, who led the study, revealed statement A journalist published on the university’s website, regarding one of the most prominent methods of fraud used by “Master Key”.
For example, chatbot developers rely on keyword monitoring tools that pick up certain words that might indicate potentially suspicious activity and refuse to answer if such words are detected.
One strategy researchers used to get around keyword censorship was to submit prompts that simply contained spaces following each letter, circumventing censorship that might operate through a list of banned words.
One strategy for researchers to get around keyword censorship is to submit claims that simply contain spaces following each letter (Archyde.com)
Muscle flex or warning message?
This study raises a group of questions, the most prominent of which is related to its main goal. Is it a “flexing muscle” and demonstrating hacking ability, or is it an attempt to send a warning message, and how can the continued development and expansion of large language models affect the ability to discover and address vulnerabilities? Inside AI chatbots, and what measures can be taken to counter potential threats?
In an email interview with Al Jazeera Net, Professor Liu Yang denies that their penetration of chatbot security systems is an attempt to show off, stressing that it is a warning message that can be summarized in the following points:
- First, it draws attention to a fundamental weakness inherent in the design of artificial intelligence models, which when requested in certain ways can deviate from ethical guidelines. These deviations occur due to gaps in the training data and explanatory logic of the model.
- Second, our MasterKey can be a valuable tool for developers to proactively identify vulnerabilities in chatbots, and its usefulness lies in its systematic method that can be integrated into regular testing and development.
- Third: Our research can benefit regulatory frameworks, as it indicates the importance of focusing on the need for strict security standards and ethical compliance in the deployment of AI-powered chatbots, including guidelines for responsible use and ongoing monitoring.
As for how the continuous development and expansion of large language models affects the ability to discover and address weaknesses, Liu Yang stresses the importance of committing to further research and continuous development of large language models, because as they become more advanced, identifying weak points may become more complex.
In this context, he says, “Developers use a combination of automated and manual processes to discover weak points, and often rely on continuous monitoring and feedback loops. The challenge lies in the evolving nature of artificial intelligence, as new weak points appear, which requires continuous monitoring.”
1705341187
#Master #Key #reveals #private #parts #GBT #Chat #technology