Runway Trained Its Video AI By Scraping Popular Photography YouTubers

A person wearing a dark jacket and a blue beanie takes a photo with a camera while standing in an icy, mountainous landscape at dusk. The scene includes towering ice formations and a vast, frozen expanse.
An AI video generated on Runway in the style of photography YouTuber Benjamin Hardman. It is strikingly similar to the real-life Hardman’s video. | Credit: 404 Media

The AI video generator Runway used hundreds of YouTube videos to train its latest Gen-3 model, according to a 404 Media investigation.

The online publication obtained a spreadsheet used by Runway employees to gather YouTube channels, YouTube videos, and even pirated video websites to put into its latest Gen-3 model, which was codenamed Jupiter.

404 Media got hold of the spreadsheet from an ex-employee who says the document was a “company-wide effort to find good quality videos to build the model with.” The spreadsheet, which was edited to protect employee names, can be found here.

There are dozens of popular photography YouTube channels in there including Kai Wong, Peter McKinnon, Michael Shainblum, and many more.

A spreadsheet featuring columns of YouTube URLs linking to different users, topics related to wilderness living videos, and photography. Topics mentioned include photography reviews, videos, tutorials, studio photography, landscape, blacksmith, and running.
Some of the photography YouTube channels included on Runway’s spreadsheet.

404 Media used the names of some of the YouTubers on the list as prompts in Runway and found that it generates videos that look suspiciously like those belonging to the specific creators.

For example, 404 Media entered the name of Benjamin Hardman, who makes films about photography expeditions in Iceland, into Runway with the prompt: “Youtuber Benjamin Hardman in the style of his travel videos.” Lo and behold, Runway produced an AI video of a photographer who looks a lot like Hardman taking photos in Iceland.

There are also videos in there that seemingly target specific cameras such as the Sony a7 IV and FX3. This might be to support prompts that name a camera, say, if someone wanted an AI video “shot on a Fujifilm camera.”

There is no proof that every video and channel listed on the spreadsheet was used to train Gen-3; some might have been filtered out or deleted at a later stage.

Runway did not respond to 404 Media’s request for comment but did apparently start blocking the names of several YouTubers in prompts. PetaPixel has also reached out to Runway for comment but did not receive a response.

Runway Gen-3 was released last month, with the company boasting that it can make AI videos of photorealistic humans and that it features a director mode with camera controls.

Runway has raised $141 million from investors, including Google, and is valued at $1.5 billion.

Google, which owns YouTube, has made clear that using the platform’s videos as AI training data is a violation of its terms of service.

“From a creator’s perspective, when a creator uploads their hard work to our platform they have certain expectations. One of those expectations is that the terms of service are going to be abided by,” YouTube CEO Neal Mohan told Bloomberg.

“It does not allow for things like transcripts or video bits to be downloaded, and that is a clear violation of our terms of service. Those are the rules of the road in terms of content on our platform.”

The former employee who leaked the Runway spreadsheet says they hope that by sharing this information, “people will have a better understanding of the scale of these companies and what they’re doing to make ‘cool’ videos.”
