ChatGPT now talks like a human, is a translator and even tells stories: this is GPT-4o

2024-05-13 17:48:00

ChatGPT could already hear and speak, but not like this. OpenAI has revealed its new artificial intelligence model, GPT-4o, which stands out not only for being more powerful, but also for its ability to hold conversations in real time, complete with voice intonations, much as a human would.

During its Spring Update event, OpenAI showed what the new model is capable of: solving equations in real time, analyzing code, telling stories on the fly (changing intonation to suit the user), serving as an instant translator, and even analyzing a person’s face.

How GPT-4o works

According to OpenAI, GPT-4o is a new multimodal model that natively handles different types of content (audio, vision, and text) in real time, making interaction “much more natural” and its responses faster.

This is due to a new end-to-end form of training, in which the AI processes all text, vision, and audio inputs and outputs within a single neural network.

Simply put, this changes the way the model analyzes content. Until now, the AI had to perform three steps: transcribe the audio input into text, generate a text response, and convert that response back into audio for the user.

This process caused ChatGPT to lose information, since it could not analyze details such as tone of voice or whether there were multiple speakers, and it also limited the model’s ability to laugh, sing, or express emotion.
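To make that concrete, here is a minimal sketch of what such a three-step pipeline looks like when built with OpenAI’s public Python client. The specific model names (whisper-1, gpt-4-turbo, tts-1) and file names are illustrative assumptions, not a confirmed description of ChatGPT’s internals:

```python
# Illustrative sketch of the legacy three-step voice pipeline using
# OpenAI's public Python client. Model and file names are assumptions
# for illustration, not a description of ChatGPT's internals.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Step 1: transcribe the spoken question to text. Tone, laughter, and
# speaker identity are discarded here; this is the information loss
# the article describes.
with open("user_question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Step 2: generate a text reply from the transcript alone.
reply = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": transcript.text}],
)

# Step 3: synthesize speech from the reply text.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=reply.choices[0].message.content,
)
speech.write_to_file("assistant_reply.mp3")
```

GPT-4o replaces all three calls with a single model that consumes and produces audio directly, which is why tone and emotion can survive the round trip.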

This is precisely one of the areas GPT-4o improves: in addition to holding conversations in real time, the AI can now vary its tone, convey different emotions in its voice, and interact live with visual content thanks to a camera function in the smartphone app.

According to the company, GPT-4o performs on par with GPT-4 Turbo in text, reasoning, and coding benchmarks, but uses fewer tokens to process content, making it “more economical”, something especially useful for developers working with the API.

The GPT-4o demonstrations

Some of these abilities were shown during the presentation: presenters asked the model to tell a story and then requested changes to the way it was narrated, asking it to sound “more emotional” or even to switch to a robotic voice.

The GPT-4o demo tells a story

The new model was also tested in other settings: helping presenters solve a linear equation step by step in real time while the smartphone camera was pointed at the problem, and translating a conversation between two people, from Italian to English, practically instantly.

Like previous models, GPT-4o can also analyze code step by step and give the user feedback on how it works, as well as explain information presented in graphic form, such as charts.

GPT-4o availability

One of the most notable aspects of the new model is that it will be available to virtually all users: both free users, with some limitations, and those who subscribe to any of the paid plans.

According to OpenAI, as of Monday, May 13, the model’s text and image capabilities are rolling out in ChatGPT for free, while Plus users get message limits up to five times higher.

The new voice mode, one of the model’s star features, is another matter: it will arrive in alpha, for subscribers only, in the coming weeks.

For developers using the API there are also benefits: OpenAI says GPT-4o is twice as fast, costs half as much, and has higher rate limits than GPT-4 Turbo.
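In practice, adopting it through the API is typically just a model-name change. A minimal sketch using the public chat completions endpoint (the prompt is ours, purely for illustration):

```python
# Minimal sketch: selecting GPT-4o through OpenAI's public API.
# Pricing and rate limits are applied server-side per model, so
# moving from GPT-4 Turbo is just a model-name change here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # previously e.g. "gpt-4-turbo"
    messages=[{"role": "user", "content": "Summarize GPT-4o in one sentence."}],
)
print(response.choices[0].message.content)
```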

Alongside GPT-4o, the company also showed a new desktop application that lets you use a specific keyboard command to immediately ask questions about any content on screen, whether by selecting text or taking a screenshot.

For now, this app will be exclusive to macOS and to Plus users, but the company plans to launch a Windows version in late 2024.
