Hot on the heels of Meta’s text-to-video generator, Google has announced its own artificial intelligence (AI) video generator.
Google’s Imagen Video is still in development, but the company says it will be capable of producing 1280×768 videos at 24 frames per second from a written prompt.
According to Google’s research paper, Imagen Video will have stylistic abilities, such as generating videos in the style of famous artists like Vincent van Gogh. It will also generate rotating 3D objects while preserving their structure, and render text in various animation styles.
Google hopes that its AI-video model can “significantly decrease the difficulty of high-quality content generation.” Imagen Video builds on Google’s Imagen, a text-to-image program similar to OpenAI’s DALL-E.
As described by Google’s research team, Imagen Video will take a text description and generate a 16-frame, three-frames-per-second video at 24×48 pixel resolution. The system then upscales and “predicts” additional frames, producing a final 128-frame, 24-frames-per-second video at 720p.
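The frame counts and frame rates quoted above imply that the clip's running time stays the same while the number of frames grows. The short Python sketch below only checks that arithmetic. The dictionary layout and function names are illustrative assumptions and are not taken from Google's paper or code.

```python
from fractions import Fraction

# Figures as quoted in the article; the names below are hypothetical.
BASE = {"frames": 16, "fps": 3}     # initial low-frame-rate clip
FINAL = {"frames": 128, "fps": 24}  # clip after the added frames are "predicted"


def duration_seconds(clip):
    """Running time of a clip: frame count divided by frame rate."""
    return Fraction(clip["frames"], clip["fps"])


def temporal_factors(base, final):
    """How many times more frames, and how many times faster playback."""
    return (
        Fraction(final["frames"], base["frames"]),
        Fraction(final["fps"], base["fps"]),
    )


if __name__ == "__main__":
    print("base duration (s):", float(duration_seconds(BASE)))
    print("final duration (s):", float(duration_seconds(FINAL)))
    print("frame and fps multipliers:", temporal_factors(BASE, FINAL))
```

Both clips work out to 16/3 of a second, or about 5.3 seconds: the frame count and the frame rate each rise eightfold, so the extra frames make motion smoother without making the video longer.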
Google says that Imagen Video has been trained on 14 million video-text pairs and 60 million image-text pairs, as well as the LAION image-text dataset, which was used to train Stable Diffusion.
Among the examples provided by Google are a panda chewing on bamboo, a shot zooming into a choppy sea filled with pirate ships, and an astronaut riding a horse.
It is worth noting that all the results from Imagen Video were selected by Google itself, and no independent testers have yet tried the program.
That said, the research paper claims that Imagen Video can render text properly, something that DALL-E and Stable Diffusion both struggle with. The text that those programs generate is barely readable.
It also claims that Imagen Video has demonstrated an understanding of depth and three-dimensionality, allowing drone flythrough videos to be created that rotate around and capture objects from different angles without distortion.
Google has voiced its concerns over “problematic data” used to train its AI-image generator programs. The company has attempted to filter out sexually explicit or violent content, as well as social stereotypes and cultural biases. It is concerned that the tool may be used “to generate fake, hateful, explicit, or harmful content.”
“We have decided not to release the Imagen Video model or its source code until these concerns are mitigated,” adds Google.