Google’s Gemini AI Has New Photo-to-Video Feature

Jul 11, 2025

Matt Growcoot

A green alien figure with large black eyes sits inside an open cardboard box on a table in a home setting. The photo-to-video feature lets users transform ordinary scenes into extraordinary ones.

Google has launched a new feature within its Gemini AI assistant that allows users to convert photos into short video clips, expanding access to a tool previously limited to its standalone filmmaking platform, Flow.

Available to Gemini Advanced subscribers on the Ultra and Pro plans in select regions, the feature began rolling out on Thursday and will arrive on mobile devices throughout the week. It enables users to upload an image and generate an eight-second video based on accompanying text and audio prompts. The clips include AI-generated sound effects, ambient noise, and speech, and are delivered in MP4 format at 720p resolution with a 16:9 landscape aspect ratio.

The tool is powered by Google’s Veo 3 AI video model, which was first announced in May. Users can access the feature by selecting the “tools” option in the Gemini interface, choosing “video,” and uploading a photo with a description of how it should move. Audio inputs can also be added for synchronized dialogue and sound effects.

“You can get creative by animating everyday objects, bringing your drawings and paintings to life, or adding movement to nature scenes,” Google says in a blog post. All generated videos include a visible watermark indicating they were created with AI, as well as an invisible SynthID digital watermark.

Bloomberg notes that the new integration brings Gemini closer in line with offerings from competitors such as OpenAI, Runway AI, and Pika, as well as Chinese companies including Alibaba and Kuaishou, all of which have developed AI video tools.

While the feature builds on capabilities already available in Flow, Gemini now offers a more accessible, chat-based experience. Flow itself is also expanding availability, launching in an additional 75 countries, according to Google.

The company emphasized safeguards intended to prevent misuse of the video generation tool. It restricts the use of images of public figures and prohibits outputs that promote harmful or violent content. The technology is still developing, however: Google says it is currently more effective at animating non-human subjects, such as plants, animals, and artwork.