In the school year that just ended, one group of students stood out as an oddity. They are hardworking, steadily improving, and remarkably eloquent. Yet, curiously, these students (chatbots powered by artificial intelligence) often struggle with mathematics.
Chatbots like OpenAI’s ChatGPT can write poetry, summarize books, and answer questions, often with human-like fluency. But when it comes to calculation, these systems answer based on what they have learned, so the results vary and can be wrong. That is because they are designed to estimate probabilities, not to perform rule-based calculations. Probability is not exactness, and language is more flexible and forgiving than math.
“Artificial intelligence (AI) chatbots have a hard time with math because they were never designed for that,” said Kristian Hammond, a computer science professor and AI researcher at Northwestern University.
It seems that the world’s smartest computer scientists have created an artificial intelligence that is better suited to the humanities than to mathematics.
At first glance, this marks a break with computing’s past. Ever since the first computers appeared in the 1940s, a good summary definition of computing has been “mathematics on steroids.” Computers have been tireless, fast, and accurate calculating machines. For a long time, they have been really good at numbers, far surpassing human performance.
Traditionally, computers were programmed to follow step-by-step rules and to store and retrieve information in structured databases. They were powerful but brittle, which is why earlier attempts to produce artificial intelligence failed.
But more than a decade ago, a different strategy emerged that began to yield surprising results. The underlying technology, called a neural network, is loosely modeled on the human brain.
This type of AI is not programmed with rigid rules, but instead learns from analyzing a large amount of data. It generates language based on all the information it has absorbed and predicts which word or phrase is most likely to come next, just like humans do.
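The next-word prediction described above can be illustrated with a deliberately tiny sketch: a bigram counter that picks the most frequent continuation seen in its training text. This toy is nothing like a real neural network, but it shows the core point that the answer comes from statistics over past text, not from rules of arithmetic.

```python
from collections import Counter, defaultdict

# Toy "training data" in which "two plus two is four" appears twice
# and "two plus three is five" appears once.
corpus = ("two plus two is four . two plus three is five . "
          "two plus two is four").split()

# Count which word follows which: a bigram frequency table.
next_words = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_words[prev][nxt] += 1

def predict(word):
    # Return the statistically most likely continuation, with no
    # notion of what the numbers mean.
    return next_words[word].most_common(1)[0][0]

print(predict("is"))  # prints "four" purely because it was seen more often
```

If the training text had contained more wrong sums than right ones, the same code would confidently produce the wrong answer, which is the failure mode the article describes.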
“This technology does brilliant things, but it doesn’t do everything,” Hammond said. “Everyone wants the answer to AI to be one thing. That’s nonsense.”
AI chatbots have occasionally struggled with simple arithmetic and with math word problems that require multiple steps to reach a solution, as some technology reviewers have recently documented. The technology is improving, but math remains a weakness.
During her presentation at a recent symposium, Kristen DiCerbo, chief learning officer at Khan Academy, a nonprofit education organization that is experimenting with an AI tutor chatbot and teaching assistant, introduced the topic of math accuracy. “It’s a problem, as many of you know,” DiCerbo told educators.
A few months ago, Khan Academy introduced a significant change to its AI-powered tutor, called Khanmigo. It now sends many number problems to a calculator program instead of asking the AI to solve the math. While the calculator program works, students see the word “calculating” on their screens along with a bobbing Khanmigo icon.
“We’re actually using tools that are meant to do computations,” said DiCerbo, who remains optimistic that conversational chatbots will play a major role in education.
For over a year now, ChatGPT has been using a similar solution for some math problems. For tasks such as division and multiplication of large numbers, the chatbot asks a calculator program for help.
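Neither Khan Academy nor OpenAI has published its routing code, but the idea behind handing arithmetic to a calculator can be sketched. The minimal example below (names and routing rule are my own, hypothetical choices) parses a question as an arithmetic expression; if that succeeds, an exact evaluator computes the answer, and anything else would be deferred to the language model.

```python
import ast
import operator

# Map arithmetic AST nodes to exact Python operations.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(node):
    # Safely evaluate a parsed arithmetic expression tree; reject
    # anything that is not pure arithmetic over number literals.
    if isinstance(node, ast.BinOp):
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    raise ValueError("not pure arithmetic")

def answer(question):
    """Route arithmetic to the exact evaluator; defer everything else."""
    try:
        result = evaluate(ast.parse(question, mode="eval").body)
        return f"calculating... {result}"
    except (ValueError, SyntaxError, KeyError):
        # In a real system this branch would call the language model.
        return "(handed off to the language model)"

print(answer("123456789 * 987654321"))
print(answer("Why is the sky blue?"))
```

The design choice mirrors the article: the probabilistic model never touches the numbers it is bad at, while the calculator never has to understand free-form language.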
Math is an “important and developing area of research,” OpenAI said in a statement, and a field in which scientists have made steady progress. The company added that its new version of GPT achieved nearly 64 percent accuracy on a public database of thousands of problems requiring visual perception and mathematical reasoning. That’s up from 58 percent in the previous version.
AI-powered chatbots tend to excel when they’ve consumed large amounts of relevant training data — textbooks, exercises, and standardized tests. The effect is that the chatbots have already seen and analyzed very similar, if not the same, questions. According to the company, a recent version of the technology behind ChatGPT scored in the 89th percentile on the high school math SAT exam.
The technology’s erratic performance in mathematics contributes to a lively debate in the AI community about how best to move forward in the field. Broadly speaking, there are two camps.
On one side are those who believe that the advanced neural networks that power AI chatbots, known as large language models (LLMs), are essentially the only path to continued progress and, ultimately, to artificial general intelligence (AGI) — a computer capable of doing everything the human brain can do. That is the dominant view in much of Silicon Valley.
On the other hand, there are skeptics who question whether simply adding more data and computing power to large language models is enough. Among them is Yann LeCun, chief AI scientist at Meta.
According to LeCun, large language models are weak on logic and lack common-sense reasoning. He argues that what is needed is a broader approach, which he calls “world modeling”: systems that learn how the world works much as humans do. He estimates it may take a decade to get there.
Meanwhile, Meta has incorporated an AI-powered assistant based on its large language model LLaMA into its social networking services, including Facebook, Instagram, and WhatsApp. Current models may be flawed, but they already do a great deal.