Google’s New Gemini Omni AI Video Model Can Do Crazy Things
Google’s new Gemini Omni artificial intelligence (AI) model can do some wild things. The model’s key promise is to create anything from, well, anything.
Google says its new Gemini Omni model can “create anything from any input,” including audio, video, photos, and text. The model starts with video generation, which users can then edit via conversational text with Gemini. This first model, Gemini Omni Flash, is launching now in the Gemini app, Google Flow, and YouTube Shorts.
As Google explains, editing AI-generated video using text is straightforward. The company also promises that the model keeps elements such as characters consistent across edits and that Omni can remember what was visible in previous scenes.
The company even promises that Gemini Omni can use its “intuitive understanding of physics,” effectively “bridging the gap from photorealism to meaningful storytelling.”
Users have already achieved impressive results with Gemini Omni. For example, ex-Google product manager Bilawal Sidhu gave Gemini Omni a photo with a sketched drone path on it and had the AI generate drone POV footage.
Gave google omni a sketched camera path and asked it to generate drone POV footage. pic.twitter.com/cQZFMtOkEi
— Bilawal Sidhu (@bilawalsidhu) May 26, 2026
The Verge’s Allison Johnson calls Omni “wild” and had the AI bring her child’s stuffed animal, Buddy, to life. Buddy went on exciting AI adventures, including white-water rafting and snowboarding.
“The results are such a mixed bag they’re baffling. Some were very good — much more consistent and true to my prompt than when I was testing out Veo five months ago,” Johnson writes. “But even the best clips Omni cooked up for me still have certain AI jump scares, like when Buddy suddenly switches orientation while he’s skydiving.”
As Johnson’s testing showed, Omni’s biggest claim to fame, the ability to combine a wide variety of input media with AI-generated video, veers from technologically impressive to potentially hazardous. One of her deepfakes even convinced her husband, “a man who has looked at me in real life basically every single day for the last decade.”
Whether this is neat or terrifying depends on whom you ask.
“I can’t be the only one to think, that this just has no reason to exist,” writes near_photography on Threads in response to Johnson’s post. “There is no net benefit to society from this capability.”
As Google notes, all videos generated using Omni include its “imperceptible SynthID digital watermark,” which lets users check from within Gemini, Gemini in Chrome, and Google Search whether something was made with Google’s AI. But what if someone isn’t using those platforms?
Google is bringing this technology directly into YouTube Shorts and YouTube Create, for example, and it’s impossible to predict what people will do with it there.
Image credits: Google