Nvidia Used Videos From Netflix and YouTube to Build AI Model


Nvidia has been caught using videos from YouTube, Netflix, and other sources to build an AI model, according to a leak obtained by 404 Media.

Nvidia says its AI training methods are “in full compliance with the letter and the spirit of copyright law.” However, no court has definitively ruled on whether using artists’ work to train AI models is legal.

Nvidia has not released the model, which is known internally as Cosmos. It is intended to be a state-of-the-art video foundation model that could power products such as world generators and digital humans.

404 Media obtained Slack messages showing employees using YouTube video downloaders to grab content while also discussing the legal and ethical considerations of the practice.

“We respect the rights of all content creators and are confident that our models and our research efforts are in full compliance with the letter and the spirit of copyright law,” an Nvidia spokesperson tells 404.

“Copyright law protects particular expressions but not facts, ideas, data, or information. Anyone is free to learn facts, ideas, data, or information from another source and use it to make their own expressions. Fair use also protects the ability to use a work for a transformative purpose, such as model training.”

However, YouTube views this practice as a violation of its policies, and Netflix says scraping is against its terms of service.

According to the leak, Nvidia downloaded 100,000 videos from YouTube in just two weeks and compiled over 38.5 million video URLs. The links include videos from creators such as Marques Brownlee and channels such as Architectural Digest.

The documents also show that Nvidia trained its model on a dataset called HD-VG-130M, which contains 130 million YouTube videos and is explicitly designated for academic research only. The leak makes clear that Nvidia’s work was for commercial gain.

When an employee raised legal and ethical concerns, Ming-Yu Liu, Nvidia’s VP of Research and a leader on the Cosmos project, told them that the decision to acquire data in this way had been made at the top of the company.

Nvidia has become a trillion-dollar company because its computer chips underpin the booming AI market. OpenAI, Microsoft, Meta, and Google are all Nvidia customers and rely on its graphics processing units (GPUs).

But Nvidia does more than make hardware. Last week, Getty Images announced it was deepening its relationship with the company after releasing an updated AI image model based on the Nvidia Picasso model architecture.
