Reliability of ChatGPT questioned by new study

2023-07-20 15:28:56

Posted Jul 20, 2023, 12:37 p.m. Updated Jul 20, 2023, 5:28 p.m.

The quality of responses from the latest versions of ChatGPT changes quickly, and not always for the better. That is the conclusion of a new study published Tuesday by researchers at Stanford and Berkeley. They compared the answers given by ChatGPT-4 and ChatGPT-3.5 in March and again in June, three months apart, judging them on accuracy and relevance. To make the comparison more representative, they used four different types of prompts: solving mathematical problems, answering sensitive or dangerous questions, generating code, and visual reasoning.

In all four cases, the quality of responses varied, sometimes drastically. The mathematical prompts are a striking example. Between March and June, the ability of ChatGPT-4 to recognize prime numbers collapsed: its share of correct answers fell from 97.6% to 2.4%. Over the same period, ChatGPT-3.5 moved in the opposite direction, going from 7.4% of correct answers to 86.8%.
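To make the percentages above concrete, here is a minimal sketch of how a share of correct answers on a prime-recognition task could be computed. It is not the study's code: the function names `is_prime` and `accuracy` and the sample data are hypothetical, chosen only for illustration.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check, used here as the ground truth."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def accuracy(numbers, answers) -> float:
    """Fraction of yes/no answers that match the ground truth."""
    correct = sum(is_prime(n) == a for n, a in zip(numbers, answers))
    return correct / len(numbers)


# Hypothetical data: four numbers and a model's yes/no answers.
numbers = [7, 9, 11, 15]
answers = [True, True, True, False]  # the model wrongly calls 9 prime
print(f"{accuracy(numbers, answers):.1%}")  # prints 75.0%
```

Running the same set of questions against two snapshots of a model and comparing the two scores is one simple way to detect the kind of drift the article describes.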

