Why AI Image Generators Can’t Get Hands Right
AI images have shocked the photography world with their hyper-realistic output. But there is seemingly one thing they keep stumbling over — hands.
AI image generators such as DALL-E, Midjourney, and Stable Diffusion are notorious for adding one too many fingers or morphing digits together, making them look nightmarish.
Midjourney is getting crazy powerful—none of these are real photos, and none of the people in them exist. pic.twitter.com/XXV6RUrrAv
— Miles (@mileszim) January 13, 2023
Earlier this year, PetaPixel reported on realistic party pictures generated by AI. But on closer inspection, the hands were the giveaway: one girl was holding a camera with eight fingers.
Why Is AI So Bad at Hands?
Part of the reason AI image generators do so badly with hands is that in the datasets used to train the image synthesizers, humans display their extremities less visibly than their faces, a Stability AI spokesperson tells BuzzFeed News.
“Hands also tend to be much smaller in the source images, as they are relatively rarely visible in large form.”
The 2D image generators also struggle to conceptualize the 3D geometry of a hand, according to Professor Peter Bentley, a computer scientist and author based at University College London.
“They’ve got the hang of the general idea of a hand. It has a palm, fingers and nails but none of these models actually understand what the full thing is,” he tells the BBC.
In PetaPixel’s tests, we asked Stable Diffusion and DALL-E to generate “two hands clasped together” and the results were typically monstrous.
DALL-E produced comically bad images, while Stable Diffusion's results were more convincing at first glance, which arguably made them worse: they looked more believable while still being all wrong.
Professor of AI and the Arts at the University of Florida, Amelia Winger-Bearskin, explains that generative AI simply doesn’t understand what a hand is and what its function is.
“It’s just looking at how hands are represented in the images that it has been trained on,” she tells BuzzFeed. “Hands, in images, are quite nuanced. They’re usually holding on to something. Or sometimes, they’re holding on to another person.”
It isn’t just AI that struggles with hands. Artists throughout history have avoided drawing hands, or found ways around it, because they are so difficult to illustrate. It wasn’t until the Renaissance that artists like Leonardo da Vinci started studying and sketching hands.
“Da Vinci was actually quite obsessed with hands and did many, many studies of hands,” adds Winger-Bearskin.
“Meanwhile, when AI is trained on an image it’s just looking at that and saying, ‘Oh, in this case, there’s only half of a thumb,’ because the rest of it is hidden under fabric or grabbing onto something, and so when it reproduces it, it’s somewhat deformed.”