Midjourney Arrives in the AI Video Space With V1 Model

A man on a motorcycle speeds through a forest, looking back in fear as a large, roaring bear chases closely behind him, both kicking up dirt from the forest floor.

Midjourney, best known for its AI image generator, has announced its first video generation model, V1 Video, which animates still images into short video clips.

Users can generate a five-second clip from a written prompt and an image, either one created with Midjourney’s image generator or one they upload themselves. Videos can then be extended in four-second increments up to four times, for a maximum duration of 21 seconds. Animations can be set to low or high motion, depending on whether just the subject should move or both the subject and the camera.
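The 21-second ceiling follows directly from those numbers: a five-second base clip plus up to four extensions of four seconds each. A minimal sketch of that arithmetic is below; the constant and function names are illustrative and are not part of any Midjourney API.

```python
# Figures taken from the article; names are hypothetical, for illustration only.
BASE_SECONDS = 5        # length of the initial clip
EXTENSION_SECONDS = 4   # length added by each extension
MAX_EXTENSIONS = 4      # maximum number of extensions allowed


def clip_duration(extensions: int) -> int:
    """Return the total clip length in seconds after a number of extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return BASE_SECONDS + EXTENSION_SECONDS * extensions


if __name__ == "__main__":
    for n in range(MAX_EXTENSIONS + 1):
        print(f"{n} extension(s): {clip_duration(n)} seconds")
    # The last line printed is "4 extension(s): 21 seconds"
```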

The new tool is accessible via Midjourney’s website and Discord server and is part of a subscription plan starting at $10 per month. That plan provides 3.3 hours of “fast” GPU time, roughly equivalent to 200 image generations. Video jobs, however, are significantly more resource-intensive, costing about eight times as much as a single image generation, or approximately “one image worth of cost per second of video,” according to the company.
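As a rough back-of-the-envelope reading of those figures, the sketch below estimates how far the $10 plan stretches if it were spent entirely on video. It assumes the eight-times multiplier applies uniformly and ignores extensions; the names are illustrative, and the company has said pricing may change.

```python
# Figures quoted in the article; this is an estimate, not official pricing logic.
FAST_GPU_HOURS = 3.3         # "fast" GPU time included in the $10 plan
IMAGE_JOBS_PER_PLAN = 200    # approximate image generations that time covers
VIDEO_COST_MULTIPLIER = 8    # a video job costs about 8x an image generation


def gpu_minutes_per_image_job() -> float:
    """Approximate fast-GPU minutes consumed by one image generation."""
    return FAST_GPU_HOURS * 60 / IMAGE_JOBS_PER_PLAN


def gpu_minutes_per_video_job() -> float:
    """Approximate fast-GPU minutes consumed by one video job."""
    return gpu_minutes_per_image_job() * VIDEO_COST_MULTIPLIER


def video_jobs_per_plan() -> int:
    """Approximate number of video jobs if the plan is spent only on video."""
    return IMAGE_JOBS_PER_PLAN // VIDEO_COST_MULTIPLIER


if __name__ == "__main__":
    print(f"Image job: ~{gpu_minutes_per_image_job():.2f} GPU minutes")  # ~0.99
    print(f"Video job: ~{gpu_minutes_per_video_job():.2f} GPU minutes")  # ~7.92
    print(f"Video jobs per month: ~{video_jobs_per_plan()}")             # ~25
```

On these assumptions, the entry-level plan covers roughly 25 video jobs a month instead of about 200 image generations.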

An automatic mode generates basic movement with a default prompt, while a manual option allows users to describe motion in more detail. Midjourney founder David Holz describes the release as “a stepping stone” toward more advanced models, such as “real-time open-world simulations.”

“This means that you still make images in Midjourney, as normal, but now you can press ‘Animate’ to make them move,” Holz says in a blog post. “Properly utilized, it’s not just fun, it can also be really useful, or even profound to make old and new worlds suddenly alive.”

The rollout comes at a sensitive time for the company. Last week, Disney and NBCUniversal filed a lawsuit against Midjourney, citing concerns about the potential misuse of their copyrighted content. The companies allege that the platform functions as a “virtual vending machine, generating endless unauthorized copies of Disney’s and Universal’s copyrighted work.” The suit also raises alarms about Midjourney’s model training methods, arguing they are infringing.

Despite trailing other companies like OpenAI, Google, and Adobe in the text-to-video space, Midjourney’s move into animation signals its intent to remain competitive. Tools like Google’s Veo, OpenAI’s Sora, and Adobe’s Firefly Video have already introduced more sophisticated prompt-to-video capabilities.

Holz acknowledges the challenges of scaling the video tool, stating, “The actual costs to produce these models and the prices we charge for them are challenging to predict.” He adds that pricing and availability may shift in the coming weeks as usage patterns emerge.

Midjourney has urged users: “Please use these technologies responsibly.”
