As it dawns on photographers and artists alike that their images have been used on a monumental scale to train AI image generators, the backlash is growing stronger.
Recent protests against text-to-image generators have mainly emanated from digital artists, but photographers should be up in arms too.
I would bet that any photographer whose pictures have been published anywhere on the internet has had their work swept into the datasets that feed these AI models.
Apps like Lensa AI, along with many other wildly popular AI selfie generators, are built on these same datasets.
How are multi-billion-dollar companies casually taking the work of artists and photographers for their products without asking or compensating them?
Murky Waters Ahead
The phrase “class action lawsuit” has been thrown around more and more on Twitter recently.
This issue will inevitably end up in front of a judge, and it will be interesting to see what the courts make of it all.
This is not a typical copyright infringement case, as Midjourney founder David Holz told Forbes: “It would be cool if images had metadata embedded in them about the copyright owner or something. But that’s not a thing; there’s not a registry. There’s no way to find a picture on the Internet, and then automatically trace it to an owner and then have any way of doing anything to authenticate it.”
That’s not strictly true. As someone who has spent hours of his life tracking down the sources of pictures on the internet, I can attest that finding the copyright owner is both possible and the right thing to do.
However, when you’re using “hundreds of millions” of images, I can see why the task seems impossible.
Where is Google?
Google’s absence from the AI image generator space is conspicuous. Where is its Imagen model? Why hasn’t it been released to the public? Could it be that Google is nervous about the obvious copyright implications and is waiting to see what happens to the likes of OpenAI and Stable Diffusion before releasing its own program?
Imagen was supposed to appear in Google’s AI Test Kitchen app, which is available to selected users. I managed to get access, and guess what: there’s no text-to-image program there.
I’ve heard the training datasets referred to as “publicly available” or as “open datasets.” This is complete nonsense. It’s like a website publishing a photographer’s picture it found on Google Images with the defense: “it’s publicly available.” Everyone knows that excuse doesn’t fly.
You may argue that AI image generators represent technological progress, vital to growth, and that copyright concerns shouldn’t stand in the way.
But I don’t believe it’s fair to photographers who have spent money and time perfecting their craft, only for large companies to effectively use their work as the basis of a product.
I implore photographers to run their photos through the Have I Been Trained website to see whether their work is included in the LAION-5B dataset.
This is a complex subject and the best way forward is unclear. But one thing is obvious: creators should have the option to opt out of these datasets. It’s simply not right that artists’ work is being used in ways they never consented to. At the very least, they must be given a choice.
Image credits: Header photo licensed via Depositphotos.