Google's AI model that can process long texts, video and audio – 2024-02-20 13:47:19

Google has released Gemini 1.5 Pro to a select group of developers. The artificial intelligence (AI) model can process large amounts of information at once, including one hour of video, eleven hours of audio, 30 thousand lines of code or more than 700 thousand words.

“A few years ago, memorizing or grasping the context of hundreds of words was quite difficult, and if we look back to the 1950s, when Shannon (the mathematician who invented information theory) dreamed of language models, he was analyzing two words of context,” Oriol Vinyals, vice president of research at Google DeepMind and co-lead of Gemini, told reporters.

To illustrate the capabilities of Gemini 1.5 Pro, Vinyals showed, using a pre-recorded video, that the model was capable of analyzing a 402-page transcript of Apollo 11, the first mission to land a human being on the moon, and finding three funny quotes in it.

In addition to text, users will be able to interact with the model using photos or drawings. In the example from the presentation video, the user gave Gemini 1.5 Pro a very simple drawing of a boot hitting the ground and asked: “What moment is this? Please respond with an exact quote.”

The machine’s response was astronaut Neil A. Armstrong’s famous quote: “That’s one small step for man.”

Vinyals showed other similar examples in which a 45-minute silent film by Buster Keaton, rather than a text, served as the input.

Regarding programming, the company points out in a statement: “It can perform more relevant problem-solving tasks across longer blocks of code. When given a prompt with more than 100 thousand lines of code, it can reason better across examples, suggest useful modifications, and provide explanations of how different parts of the code work.”

“In some ways, it works very similarly to how our brain does,” Vinyals explained.

Gemini 1.5 Pro performs at a similar level to 1.0 Ultra, Google’s most sophisticated model to date.

According to a statement from Google and Alphabet CEO Sundar Pichai, Gemini 1.5 Pro will help developers create much more useful models and applications.

“We are pleased to offer a limited preview of this experimental feature to developers and enterprise customers,” emphasizes Pichai.

Starting today, some developers and cloud customers will also be able to build with 1.0 Ultra through the Gemini application programming interface (API) in AI Studio and Vertex AI.

Regarding ‘hallucinations’ – well-structured but incorrect responses – Vinyals pointed out that they remain a problem across AI in general, and one that is still being worked on.

Last week, Google renamed its artificial intelligence (AI) chatbot from Bard to Gemini, announced that the technology will be available in a new Gemini app for Android and through the Google app on iOS, and also launched a paid “advanced” version, which uses Gemini 1.0 Ultra.
