ChatGPT can now ‘see, hear and speak’

According to OpenAI, the artificial intelligence research organization behind the computer program ChatGPT, a new upgrade has been added that makes the popular artificial intelligence program able to ‘see, hear and speak’.

The update to OpenAI’s artificial intelligence chatbot will allow users to have voice conversations with it and to interact with it using images, the company said in a blog post on Monday.

The company also said in a post on X (formerly Twitter): ‘ChatGPT can now see, hear and speak.’

These features will be rolled out over the next two weeks, and thanks to them users will be able to communicate with the artificial intelligence assistant by speaking.

According to the company, with the new features ChatGPT can be used to ‘request a bedtime story for your family or wrap up a discussion at the dinner table’. In this way, ChatGPT will come close to the services offered by Amazon’s Alexa or Apple’s Siri AI assistants.

Illustrating how the feature works, OpenAI shared a demo in which a user asks ChatGPT to tell a story about ‘a super-duper sunflower hedgehog named Larry’.

The chatbot answers the request in a human-like voice and also responds to follow-up questions such as ‘How was his house?’ and ‘Who is his best friend?’


OpenAI says the voice capability is supported by a new text-to-speech model that produces human-like audio from just text and a few seconds of sampled speech.

According to the company: ‘We worked closely with professional voice actors to create each of the voices. We also use Whisper, our open-source speech recognition system, to convert your spoken words into text.’
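Whisper, mentioned in the quote above, has been released by OpenAI as open-source software. As a rough illustration of speech-to-text with that package (not the exact pipeline ChatGPT uses internally), the sketch below transcribes a local audio file; the model size and file name are placeholder assumptions:

```python
# Minimal sketch using OpenAI's open-source Whisper package (pip install openai-whisper).
# It shows speech recognition in general, not ChatGPT's internal voice pipeline.
import whisper

# "base" is one of the published model sizes; "voice_note.mp3" is a placeholder file name.
model = whisper.load_model("base")
result = model.transcribe("voice_note.mp3")

print(result["text"])  # the recognized speech as plain text
```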

The AI firm believes the new audio technology can generate realistic synthetic voices from just a few seconds of real speech and could open the door to many creative applications.

However, the company also warned that the new capabilities bring new risks, ‘such as malicious actors impersonating public figures or the possibility of fraud’.

Another major update to the AI chatbot allows users to upload a photo and ask ChatGPT about it.

OpenAI explained that the feature can help with tasks such as ‘solving the problem of your grill not working, reviewing the ingredients in your fridge in preparation for cooking, or analyzing complex graphs of work-related data’.

According to the company, this new feature also allows users to focus on a specific part of an image using the drawing tool in the ChatGPT mobile app.

Such multimodal capability in chatbots had been anticipated for some time, and the new image recognition feature is powered by multimodal versions of GPT-3.5 and GPT-4.

These models apply their understanding of language to different types of images, including photographs, computer screenshots and documents.
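The consumer feature described in the article lives in the ChatGPT app, but for readers curious what an image-plus-text prompt looks like programmatically, here is a hedged sketch using OpenAI’s Python client with a vision-capable model. The model name, image URL and question are illustrative placeholders, and API availability is separate from the app rollout described here:

```python
# Hypothetical sketch of asking a vision-capable GPT-4 model about an image
# via the OpenAI Python client (pip install openai). Model name, image URL
# and prompt are illustrative placeholders, not details from the article.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Why might this grill not be lighting?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photos/grill.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```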

OpenAI says the new features will be rolled out to the app within the next two weeks for users of ChatGPT’s Plus and Enterprise services.

“We are excited to introduce these capabilities to other groups of users, including developers, soon after,” the AI firm said.

