Gemini is what Google is calling its “most capable” artificial intelligence model and was trained to recognize, understand, and combine different types of information including text, images, audio, video, and code.
Google’s new AI model is described as its most flexible yet and is able to run on a handheld device like the Google Pixel 8 Pro as well as whole data centers and everything in between. Gemini 1.0 is broken into three sizes to allow it to fit these different needs.
Ultra is the largest and most capable of these models and is meant for highly complex tasks. It’s what would likely be deployed in the aforementioned data centers. Pro is the mid-tier version and is made to scale across a wide range of tasks. Nano is Google’s most efficient version of Gemini and is made to run on devices like smartphones.
“We’ve been rigorously testing our Gemini models and evaluating their performance on a wide variety of tasks. From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development,” Google writes on its blog.
“With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities.”
Google says that Gemini is more advanced than typical AI models because while those can be good at performing specific tasks, they aren’t good at more conceptual and complex reasoning.
Gemini is multimodal, which means it was built from the ground up and trained from the get-go on multiple modalities, allowing it to, as Google claims, understand and reason based on various inputs.
“Gemini 1.0’s sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information. This makes it uniquely skilled at uncovering knowledge that can be difficult to discern amid vast amounts of data,” Google claims.
“Its remarkable ability to extract insights from hundreds of thousands of documents through reading, filtering and understanding information will help deliver new breakthroughs at digital speeds in many fields from science to finance.”
Gemini can understand text, images, audio, and more, and it can analyze multiple types of input at the same time so that it better understands context and nuanced information.
Google Gemini launches today and will immediately be deployed in Bard (Gemini Pro) as well as on the Google Pixel 8 Pro (Gemini Nano). In Bard, Gemini will let the chatbot reason, plan, and understand more data, and Google calls it the biggest update to Bard since it launched.
Gemini Ultra will launch early next year. Gemini Pro will be available to developers in Google AI Studio and Google Cloud Vertex AI on December 13.
Pixel 8 Pro Updates, Including Gemini
On the Pixel 8 Pro, Gemini joins a set of new updates including the ability to capture timelapses at night, which was announced yesterday. That feature is joined by what Google calls Video Boost, which uploads videos captured on device to the cloud where computational models adjust color, lighting, stabilization, and noise to make the footage look more “true to life.”
An improvement to Portrait Light in Google Photos removes harsh shadows on photos, even if they were taken on older devices. Photo Unblur also got an upgrade and is better at sharpening images of dogs and cats even if they were in motion.
The update also improves call quality on computers, adds the ability to literally clean up documents quickly (such as removing a coffee stain from a receipt), introduces a new Repair Mode that keeps personal data private while the device is out of a user’s hands, and brings smarter contextual replies to Call Screen in case a user doesn’t want to answer a call.
Gemini Nano doesn’t power all of these changes, but it does arrive on Pixel 8 Pro devices along with them. Google specifically calls out that Gemini Nano will power new features on the phone, like Summarize in the Recorder app and Smart Reply in Gboard. Smart Reply will be available in WhatsApp straightaway and will come to other apps next year.
Image credits: Google