Google has launched Gemini 1.5 Pro for a select group of developers, an artificial intelligence (AI) model that can process large amounts of information at once, including one hour of video, eleven hours of audio, 30,000 lines of code, or more than 700,000 words.
“A few years ago, memorizing or keeping the context of hundreds of words was quite difficult, and even if we look back to the 1950s, when Shannon (the mathematician who invented information theory) dreamed of language models, he was analyzing two words of context,” Oriol Vinyals, vice president of research at Google DeepMind and co-lead of Gemini, told reporters.
To illustrate the capabilities of Gemini 1.5 Pro, Vinyals showed, using a pre-recorded video, that the model was able to analyze the 402-page transcript of Apollo 11, the first mission to land a human being on the Moon, and find three humorous quotes.
In addition to text, users will be able to interact with the model through photos or drawings. In the example from the presentation video, the user gave Gemini 1.5 Pro a very simple drawing of a boot touching the ground and asked: “What moment is this? Please respond with an exact quote.”
The machine’s response was astronaut Neil A. Armstrong’s famous quote: “That’s one small step for man.”
Today we’re introducing Gemini 1.5, our next-generation AI model. It shows dramatically enhanced performance, including long-context understanding across modalities, which opens up new possibilities for people to create and build with AI → #Gemini pic.twitter.com/043FGirXB0
— Google (@Google) February 15, 2024
Vinyals showed other, similar examples in which a 45-minute silent film by Buster Keaton, rather than a text, served as the source material.
Regarding programming, the company notes in a statement: “It can perform more relevant problem-solving tasks across longer blocks of code. When given a prompt with more than 100,000 lines of code, it can better reason across examples, suggest useful modifications, and provide explanations of how different parts of the code work.”
✨ Introducing Gemini 1.5: Our next-generation model with a context window of 1M tokens. ➡️
Explore the latest Gemini models, including Gemini 1.5 Pro, in Google AI Studio. #BuildWithGemini pic.twitter.com/B85fBFmPF1
— Google for Developers (@googledevs) February 15, 2024
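To get a rough sense of what a 1-million-token context window means in practice, the sketch below converts it into approximate word and code-line counts. The ratios used (tokens per word and tokens per line of code) are common rough heuristics assumed for illustration, not figures from Google's announcement.

```python
# Back-of-the-envelope arithmetic for a 1,000,000-token context window.
# TOKENS_PER_WORD and TOKENS_PER_CODE_LINE are assumed averages, chosen
# only to illustrate the order of magnitude of the figures in the article.
CONTEXT_WINDOW = 1_000_000  # tokens

TOKENS_PER_WORD = 1.4       # assumed average for English prose
TOKENS_PER_CODE_LINE = 30   # assumed average for a line of source code

words_that_fit = int(CONTEXT_WINDOW / TOKENS_PER_WORD)
code_lines_that_fit = int(CONTEXT_WINDOW / TOKENS_PER_CODE_LINE)

print(f"~{words_that_fit:,} words")       # on the order of the 700,000+ words cited
print(f"~{code_lines_that_fit:,} lines")  # on the order of the 30,000 lines cited
```

Under these assumptions, the window holds on the order of 700,000 words or 30,000 lines of code, consistent with the capacities described above.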
“In some ways, it works very similarly to how our brain does,” Vinyals explained.
Gemini 1.5 performs at a level similar to 1.0 Ultra, Google’s most sophisticated model to date.
According to a statement from Google and Alphabet CEO Sundar Pichai, Gemini 1.5 Pro will help developers create much more useful models and applications.
“We are pleased to offer a limited preview of this experimental feature to developers and enterprise customers,” emphasizes Pichai.
Starting today, some developers and cloud customers will be able to begin building with 1.0 Ultra through the Gemini application programming interface (API) in AI Studio and Vertex AI.
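As a sketch of what a call through that API might look like, the snippet below assembles the JSON body for the REST `generateContent` method without sending anything over the network. The endpoint path and model name follow Google's public API conventions; the prompt text is invented for illustration.

```python
import json

# Hypothetical example: building (not sending) a request body for the
# Gemini REST API's generateContent method. The endpoint and model name
# follow Google's public API conventions; the prompt is illustrative.
MODEL = "gemini-1.5-pro"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent"
)

payload = {
    "contents": [
        {
            "role": "user",
            "parts": [
                {"text": "Find three humorous quotes in this transcript."}
            ],
        }
    ]
}

print(ENDPOINT)
print(json.dumps(payload, indent=2))
```

Sending this payload as a POST request with a valid API key would return the model's generated response; here only the request shape is shown.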
Gemini 1.5 Pro can understand tasks and questions across different modalities because of its long context understanding. When given a 44-minute Buster Keaton film, it’s able to find small details in the film and understand plot points. #Gemini pic.twitter.com/FHMAfeKU0h
— Google (@Google) February 15, 2024
Regarding ‘hallucinations’ (well-structured responses that are nevertheless incorrect), Vinyals pointed out that this remains a problem for AI in general and is still being worked on.
Last week, Google renamed its artificial intelligence (AI) chatbot from Bard to Gemini, announced that the technology will be available in a new Gemini app for Android and through the Google app on iOS, and also launched a paid “advanced” version, which uses Gemini 1.0 Ultra.