Meta is Using Your Instagram Photos to Train its ‘Amazing’ AI Image Generator

Meta’s chief product officer Chris Cox | World Economic Forum/CC

During an interview at Bloomberg’s Tech Summit last Thursday, Meta’s chief product officer Chris Cox said that Instagram and Facebook have an advantage in the generative AI space because of all the “public” photos available to them.

While Meta is not a major player in the AI image generator market, the company continues to build its text-to-image model, Emu. With other companies struggling to find enough training data to keep up with the demands of AI, Cox thinks Meta has an advantage.

“We don’t train on private stuff, we don’t train on stuff that people share with their friends, we do train on things that are public,” he tells Bloomberg.

Cox said that Emu can make “really amazing quality images” thanks to “Instagram being the data set that was used to train it” which he described as “one of the great repositories of incredible imagery.”

He goes on to say that Instagram’s varied content — including photos of “art, fashion, culture, and also just images of people” — is what makes it a useful tool for building AI image generators.

Photographers Have Little Choice

For many photographers, Instagram is the most important social media platform. Feelings of mistrust and outrage over the way AI image generators were built are widespread in the creative community, so this may be something of a conundrum for creators.

Cox says Meta is not training on photos from private accounts, but photographers generally need public accounts to get their work and names out into the world; the platform is a vital marketing tool.

“What’s most galling about hearing that Instagram (Meta) are harvesting and commercially exploiting hundreds of thousands of professional photographer’s original works, for their GenAI program ‘Imagine with Meta AI’, is that they don’t appear bothered in the slightest about the impact this has on the very community of photographers and image-makers that helped build their visual social media empire,” Isabelle Doran of The Association of Photographers in the U.K. tells PetaPixel.

“It seems photographers’ creative works are simply there for the taking, irrespective of the repercussions on the community, and just looks like pure corporate greed. Together with Stable Diffusion, Midjourney, DALL-E and ImageFX, these text-to-image programs are now actively competing with professional photographers, displacing years of photographers, and the teams they work with, cultivating skills and talent and investing their creative souls into their works.

“It’s one of the reasons we ask for transparency about which copyright-protected photographs are scraped, ingested and used by these programs, in addition to requesting for permitted use and compensation. The big tech firms need to play fair and pay fair.”

Back in February, Meta CEO Mark Zuckerberg made clear that the company uses images posted on Facebook and Instagram to train its generative AI tools.

“When people think about data, they typically think about the corpus that you might use to train a model up front,” Zuckerberg said in an earnings call.

“On Facebook and Instagram, there are hundreds of billions of publicly shared images and tens of billions of public videos, which we estimate is greater than the Common Crawl dataset and people share large numbers of public text posts in comments across our services as well.”

Update 5/14: Updated with the comment from The AOP.