Adobe Firefly is Way Behind: Is the Commitment to Ethics to Blame?
Last week, Adobe made waves by announcing the beta release of its new text-to-image generative artificial intelligence (AI) model, Firefly. Adobe says its new platform wasn’t built using stolen images, but rather, as Adobe boasts, Firefly has been trained using Adobe Stock images, openly licensed content, and public domain content.
Adobe is Building Its AI Model in The Right Way
It’s an admirable way to build an AI platform, especially in the face of competing models that are built using stolen and unauthorized content. Midjourney’s founder David Holz recently admitted that his company didn’t have permission to use the hundreds of millions of images used to train its AI image generator.
Adobe is also committed to countering biases prevalent within AI image generators. Last October, Hugging Face began hosting the “Stable Diffusion Bias Explorer.” This tool lets users see firsthand how AI models struggle with racial and gender stereotypes.
It’s impressive that Adobe is conscious of these biases and working hard to ensure Firefly is socially responsible. Adobe argues that, “any company building generative AI tools should start with an AI Ethics framework.” Adobe believes its ethical structure will ensure that its AI technologies, including generative AI like Firefly, will respect users and align with the company’s core values.
“Mitigating harmful outputs starts with building and training on safe and inclusive datasets. For example, Adobe’s first model in our Firefly family of creative generative AI models is trained on Adobe Stock images, openly licensed content, and public domain content where the copyright has expired. Training on curated, diverse datasets inherently gives your model a competitive edge when it comes to producing commercially safe and ethical results,” Adobe continues.
These goals are not just ambitious; they also deserve respect. I want to give Adobe major kudos for how it has developed and trained Firefly.
But the ‘Right Way’ Delivers Wrong Results
However, early results suggest that the ethically valuable limits Adobe has placed on itself may have hurt the performance of its new generative AI model. At this point, Adobe Firefly is a long way behind the extremely impressive Midjourney version 5 and has a lot of catching up to do.
Got access to Adobe Firefly. The image generator is good, but not great, certainly when using photorealism. Compare it with Midjourney v5 (left); Firefly (right) feels much more like stock photography, less visually dramatic. pic.twitter.com/W6lAtbKTIU
— Peter Gasston (@stopsatgreen) March 23, 2023
I ran the same prompt for @adobe firefly & @midjourney_ai v5.
This was the prompt that I ran:
"A long beaked magnificent colorful bird drinking nectar from a flower while flying in the mid air, highly realistic"
MidJourney Firefly pic.twitter.com/Za5swKzTY1
— Nasim Uddin (@nasimuddin01) March 23, 2023
Adobe Firefly isn’t bad, but it clearly isn’t as effective as the much more mature Midjourney platform. Midjourney is older, which is a significant advantage concerning the efficacy of generative AI, but it’s also worth thinking about how Midjourney was made.
Unsettled Regulatory Situation Lends Credence to Adobe’s Methods
Not only is building an AI model using stolen content ethically dubious, at best, but it’s also legally murky. Adobe’s ethical framework doesn’t exist solely because the company thinks it’s the right thing to do; it also impacts the commercial viability of its product.
“The law has to catch up to technology,” Mickey H. Osterreicher, the General Counsel for the National Press Photographers Association (NPPA) told PetaPixel.
Thomas Maddrey, Chief Legal Officer for the American Society of Media Photographers (ASMP), added, “Copyright law is not prepared and is not set up to protect the artists or the users at this moment in time. A lot of it is not going to be determined under copyright statute, but rather, unfortunately, in litigation.”
By building Firefly expressly around fair use and licensed content, Adobe’s new AI model can avoid what feels like an imminent legal catastrophe. Adobe’s primary purpose is, of course, to be financially viable. If operating ethically aligns with this overriding goal and improves Adobe’s business practices, too, all the better.
However, as I alluded to, after using Firefly, it’s hard not to wonder whether its limitations are due in part to the substantially smaller dataset Adobe has allowed itself.
I Only Wanted an Image of a Woman Taking a Photo of a Man, is That Too Much to Ask?
For example, when I typed the text prompt, “Woman taking a portrait of a man,” the results were disappointing. Women are criminally underrepresented in the photography industry, so I wondered if Adobe’s limited dataset could account for what seems like a normal situation to me, but one that is disappointingly unusual to actually see.
The results could have been better. In only one case was a woman taking a photo of a man; in that case, the woman was taking a picture of the back of a man’s head.
I thought I needed to be more specific, so I tried, “A woman taking a photo of a man’s face.” That’s different from how I’d typically describe what I wanted to see: a woman using a camera to take a typical portrait of a man as her subject. This prompt wasn’t much more fruitful, although at least three of the four women had a camera.
Well, maybe “A woman using a camera to take a typical portrait of a man as her subject” will work? Definitely not.
My favorite is the image in the top right, with the woman pressing her face against the end of a camera lens. A close second is whatever happened to that poor man in the top left; it looks painful.
Firefly Can’t Yet Navigate AI’s Problem with Hands
Firefly could do a better job with hands, which is admittedly a common issue for generative AI models. Midjourney v5 finally delivers mostly realistic human hands, at least for paying customers.
Midjourney does a great job of creating photorealistic portraits of people overall. I wanted to try something similar in Firefly, using specific phrases, which has proved helpful for prior AI models.
Stylized Portraits Aren’t Much Better
I chalk these results up to user error. Using “Wet plate portrait of a woman” delivered significantly better results.
Aiming for more specificity, I tried “Full body portrait of a man in the 1800s, wet collodion photography.” I’m not sure what to say about these results, but they aren’t what I expected.
Let’s Talk About Race: Diversity in a Vacuum
I decided to try one more attempt at portraits of people. I went for, “Portrait of a happy couple in love 85mm bokeh.” I ran this prompt repeatedly to view different results, and this quartet was the best. Some results had deformed faces, while others featured strange limbs. None of them looked incredibly natural or realistic.
I also noticed that while the results were constantly generating people of different races and diverse skin tones, every couple included people of the same race.
I thought that maybe a broader prompt, such as “portrait of a couple,” would help. It didn’t. To achieve the expected results, I needed to specify a “mixed-race couple.” On the one hand, I understand why, with only four visible results, I need to be specific to see certain representations of diversity. However, on the other weird AI-generated hand, I don’t think I should need to ask for diversity to see it.
Firefly Requires More Work to Deal with Gender
Something else immediately struck me — not one portrait of a couple, or people in love, showed subjects that appeared to be in a same-sex relationship.
I tried, “wedding photo, outside, daytime.” Not only did the generated images show just white heterosexual weddings, but one of the photos showed the groom grabbing the bride’s breast. I haven’t attended many weddings, so maybe I attended relatively dull, straightforward celebrations, but that seems very out of place.
I searched with terms like “mixed-race couple” and “same-sex wedding,” and the results weren’t great.
Why does a “wedding” require one person to wear a wedding dress? Of course, it’s not just any wedding dress — it’s a very traditional western-style wedding dress.
To get a like-for-like comparison against Midjourney v5, which prohibits the term “sex,” I asked Firefly to generate a “wedding of two men.” Midjourney puts Firefly to shame here, even though both AI models seem fixated on white people with that prompt.
Photorealism Often Challenges Firefly
Overall, during my time using Adobe Firefly, my quest for photorealistic AI-generated images of people proved challenging. I’m willing to chalk some of the struggles up to its beta status and that Firefly is a relative infant in the rapidly evolving generative AI space, as well as Firefly’s small, ethically sourced training dataset.
My confidence that Firefly will evolve is tempered by its limited dataset. Adobe isn’t going to suddenly go the route of Midjourney and start stealing images, and Firefly is already at least six to eight months behind some competitors regarding its results. If the model improves slowly due to a reduced dataset, will it ever catch up?
Specificity Helps in a Big Way
While I find it frustrating that I needed to be specific to see mixed-race couples, and I have no idea what Firefly is doing with same-sex couples, I think that part of the fault lies with me. Vague prompts will rarely meet my expectations.
However, precise prompts can. Consider “Portrait of a young black woman with natural hair wearing fashionable clothes.” I’m impressed that Firefly respected “natural hair” and did a nice job with it. It also did an excellent job with “young” and “fashionable clothes,” in my opinion.
Removing People Delivers Better but Monotonous Results
When I fed Firefly different prompts without people, the results improved. However, not only does Firefly show just four results at a time, but those results also look very similar to one another.
This issue persisted when I ditched the “Photo” content type. When I opted for “Art” and added modifiers including “hyper realistic” and “fantasy” with “warm tone” and “golden hour” lighting, the results still lacked diversity. Still, admittedly, they’re quite lovely.
Ultimately, Adobe Firefly is a well-intentioned generative AI model that, in pursuing ethical and commercial goals, is currently limited in its performance and practicality. In important, moral ways, Firefly is better than its competitors. When it comes to the results, though, it is undoubtedly worse.
That said, Adobe Firefly just launched, and it will improve over time. Besides, I had a lot of fun generating a “close-up portrait of a sloth astronaut in space eating a taco” in a “cartoon” art style with a blurry background, cool tones, and dramatic lighting. Firefly ticked only some of my requested boxes, but I wonder how much I care — I love my space sloth.