AI Image Generators Could Be the Next Frontier of Photo Copyright Theft

Robot Thief

Artificial intelligence (AI)-powered image generators have exploded in popularity, and apps like DALL-E, Midjourney, and, more recently, Stable Diffusion are exciting and tantalizing technology enthusiasts.

Each of these AI tools is trained on millions of images. DALL-E 2, for example, was trained on approximately 650 million image-text pairs that its creator, OpenAI, scraped from the internet.

The companies behind these technologies haven’t said as much, but it seems highly likely that millions of copyrighted images were used to inform the AI’s learning.

PetaPixel reached out to OpenAI and a company spokesperson said that “DALL-E is being used by more than 3,000 artists from more than 118 countries in their creative workflows today, and we have actively sought their feedback from day one.”

“DALL-E produces unique, original images that have never existed before. Hundreds of millions of images in DALL-E’s training data were licensed by OpenAI, and others came from publicly available sources,” the spokesperson adds.

“Copyright law has adapted to new technology in the past and will need to do the same with AI-generated content. We continue to seek artists’ perspectives and look forward to working with them and policymakers to help protect the rights of creators.”

Does This Constitute Copyright Infringement?

It seems very doubtful that companies like OpenAI have fed only public domain and Creative Commons images into their algorithms. More likely, the process involves image-text pairs scraped from Google searches. That means photographers’ images have presumably been used in a way that the owners never intended or consented to.

Earlier this week, PetaPixel published an article about beautiful landscape photos that do not exist. In that case, Aurel Manea instructed Stable Diffusion to create images with the prompt “landscape photography by Marc Adamus, Glacial lake, sunset, dramatic lighting, mountains, clouds, beautiful.”

A quick look at the work of Adamus, a well-known landscape photographer, confirms that the AI produced digitally created photorealistic results that bear more than a passing resemblance to his photos.

AI image generated by Stable Diffusion using Marc Adamus’s name

To create these very similar images, Stable Diffusion likely used Adamus’s photos scraped from the internet to learn what his work looks like. AI is not able to make something out of nothing, at least not yet, and can only reference existing images to create new ones. It’s not much of a leap to think Stable Diffusion used sections of Adamus’s photographs in some of the generated results.

While this is not copyright “theft” in the traditional sense, like a website running a photo without permission, it does raise all sorts of legal questions about whether Adamus could theoretically sue a person using the generated photos for commercial purposes.

OpenAI addressed the issue in a blog post prior to the beta release, stating that it “will evaluate different approaches to handle potential copyright and trademark” issues.

“[This] may include allowing such generations as part of ‘fair use’ or similar concepts, filtering specific types of content, and working directly with copyright [and] trademark owners on these issues,” the company writes.

Jonathan Low, CEO of the stock photo website JumpStory, tells PetaPixel about the legal grey area.

“These data sets are basically not allowed to be used for commercial purposes,” he says.

“This means that there is no problem with people generating fun works of art that don’t resemble current art, but the second they start generating realistic-looking photos of realistic-looking people, they aren’t allowed to use these for commercial purposes.”

Low believes that users of the AI image generators run the risk of being sued.

“For OpenAI, they are probably not running very huge legal risks, because they can always put the blame on the customers,” he explains.

“So the real problem is now for the millions of users who are going to pay to use DALL-E, because they think they can generate images of realistic looking people and use these for their newsletters, ads, etc.”

AI Theft

Photographers need to ask themselves a couple of questions: how would you feel if you saw an AI image that was based on your photo? And what are you going to do about it?

No group of creators regularly has their rights trampled over in the internet age quite like photographers. AI image generators are poised to be the next chapter in that ongoing saga.


Update 8/24: This article was amended with OpenAI’s statement.


Image credits: Header photo licensed via Depositphotos.
