Stable Diffusion 3 AI Makes Eldritch Body Horror Abominations

The image is split into two sections. On the left, a woman with long brown hair extends her hands toward the camera. On the right, a person lies on their back on the grass, dressed in black, with their arms at their sides and feet slightly apart.

Stability AI, a fading name in the increasingly crowded generative AI space, released Stable Diffusion 3 Medium (SD3M) this week, calling it “our most sophisticated image generation model to date.” However, real-world users are finding more terror than sophistication, with the text-to-image model consistently producing Lovecraftian monsters.

“Stable Diffusion 3 Medium represents a major milestone in the evolution of generative AI, continuing our commitment to democratizing this powerful technology,” Stability AI writes, setting expectations relatively high.

Stability AI continues, promising a model that “delivers images with exceptional detail, color, and lighting, enabling photorealistic outputs as well as high-quality outputs in flexible styles,” with improved performance concerning “common pitfalls of other models, such as realism in hands and faces…”

However, a highly upvoted post on the r/StableDiffusion subreddit asks, “Is this release supposed to be a joke?” suggesting that Stability AI has thoroughly missed the mark with its latest release.

“Right now it’s really bad,” writes one user, Coyotewld.

A person is lying on a gray sofa in a living room with beige walls, a shaggy beige rug, and light brown sofas. A plant is placed near a window with sheer white curtains, and a framed abstract art piece is on the wall. The room is bright with natural light.
‘Photograph of a person napping in a living room,’ created by Reddit user quill18.

“I haven’t been able to generate a single decent image at all outside of the example prompts. I’ve tried highly descriptive prompts with no luck. Even an absolutely basic one like ‘photograph of a person napping in a living room’ leads to Cronenberg-esque monstrosities. That’s using the example ComfyUI workflows provided,” adds user quill18.

“Look on the bright side, at least hand rendering has improved considerably,” user –Dave-AI– jokes.

A woman with long dark hair and a serious expression is extending her arms forward with palms open and fingers spread. She is wearing a black top with a maroon sweater. Gray background.
Unknown prompt. Image created by Reddit user –Dave-AI–

Quips aside, AI models have historically been pretty terrible with hands, although many have improved drastically in this area.

The model isn’t looking great on X, formerly Twitter, either.

Although some results are awful, users on both Reddit and X have noted that SD3M performs pretty well with text, which has long been a challenge for text-to-image models.

Nearly all AI models, as improved as they are, fall short at times for various reasons, but users report that the issues go far beyond a few cherry-picked examples. One user claims that all their typical prompts are worse in the latest version, while another adds that they get “one decent image out of 20 generations by going through random seeds with the same prompt and hoping for the best.”

That is not the kind of performance users have come to expect from Stable Diffusion, especially not from a model the developer says is its best yet.

As Ars Technica reports, some Stable Diffusion users are blaming Stability AI’s censorship for the model’s poor performance.

While Stable Diffusion’s open nature gives some users hope that the latest model will be significantly improved by community-generated fine-tuning, its openness has also led to issues, including companies building illegal and unethical porn generators on Stable Diffusion models. This controversy, among others, has brought significant scrutiny to AI image generators, and Stable Diffusion users believe this is at least partly to blame for SD3M’s woeful handling of human anatomy.

A woman with long blonde hair is sitting on a sandy beach, wearing a sleeveless beige dress. The ocean and clear blue sky are in the background. She looks over her shoulder towards the camera with a relaxed expression.
‘Woman wearing a dress on the beach,’ created by Reddit user Perfect-Campaign9551.

A woman with long hair, wearing a white crop top and bottoms, lies on her back on a sandy beach near the ocean. She is smiling and her arms are stretched above her head. The clear blue water gently washes onto the shore in the background.
‘Woman laying on a beach,’ created by Reddit user Operation_Fluffy.

The company has also lost vital members in recent months, which undoubtedly doesn’t help. Around the same time as the exodus, Midjourney accused the company of attempted data theft.

Add the newest model’s propensity for crafting monstrous abominations in response to basic prompts, and Stability AI is having a tough 2024 thus far.

While the new AI base model is demonstrably a mixed bag, some Stable Diffusion users hold out hope that the community will deliver fine-tuning in short order. Sometimes, AI models must crawl before they can walk and may even take steps backward during development. On the plus side, it might be easier to walk with extra legs.

Image credits: This article includes AI-generated images created inside Stable Diffusion 3 Medium. Individual creators are credited in the captions.