ChatGPT Adds Realism to Its Powerful Image Generator

Matt Growcoot
Two mechanics in a garage check the oil in car engines. Each man, wearing overalls, stands by an open hood, wiping a dipstick with a cloth under bright workshop lighting.
The same prompt entered into the previous model, left, and the new model, right.

ChatGPT has announced a new AI image generator, which can not only create pictures from scratch but can also make precise edits to photos.

Called ChatGPT Images, the model generates images up to four times faster. There is also a new Images feature, “designed to make image generation delightful—to spark inspiration and make creative exploration effortless.”

OpenAI, the company that makes ChatGPT, is trumpeting the photo editing powers of the new model.

“When you ask for edits to an uploaded image, the model adheres to your intent more reliably — down to the small details — changing only what you ask for while keeping elements like lighting, composition, and people’s appearance consistent across inputs, outputs, and subsequent edits,” OpenAI says in a blog post.

Whether this model or a similar one could pose a threat to established photo editing apps like Adobe Photoshop remains to be seen. Google has also been pushing similar AI editing tools.

A busy city street scene with a red double-decker bus, a crowd of people talking and walking, and storefronts in the background, including one labeled "Chelsea Drugstore."
ChatGPT Images

OpenAI also highlights the model’s ability to try on different clothing, test different hairstyles, and perform conceptual transformations. “Together, these improvements mean ChatGPT can act as a creative studio in your pocket, capable of both practical edits and expressive reimaginings,” it adds.

In one example, ChatGPT Images is asked to create a skateboarding photo taken on a 35mm Leica M with a Kodak Portra 400 color palette in L.A. in the 1990s. OpenAI then requests a series of edits, including changing the skateboarder’s t-shirt, adding a blimp flying in the sky, and eventually printing the final image on a t-shirt worn by the skateboarder.

A skateboarder in a white T-shirt and cap jumps off a ledge, with two people watching nearby. Skyscrapers and palm trees fill the background, and a speed limit sign reads 25. The city skyline suggests Los Angeles.

A skateboarder in a red shirt and yellow cap performs a trick on a ledge overlooking downtown Los Angeles, with a fire truck, two people, palm trees, and the Hollywood sign visible in the background.

A skateboarder in a red shirt and yellow cap performs a trick in front of a crowd, with a fire truck, speed limit sign, bald eagle, blimp, and city skyline in the background.

A T-shirt with a graphic showing a skateboarder performing a trick in a city scene, with onlookers, a fire truck, a bald eagle, palm trees, and skyscrapers in the background, hanging on a clothesline.

A skateboarder in a colorful shirt grinds on a ledge with a city skyline in the background. Two people watch nearby, and a speed limit sign reads 25. Palm trees and buildings are visible under a hazy sky.

According to TechCrunch, a memo by OpenAI CEO Sam Altman that leaked last month declared a “code red” over the escalating competition the company faces from Google’s Gemini AI model.

Google’s Nano Banana has shocked many with its remarkable realism, producing AI images that are virtually indistinguishable from real photos. It does this by imitating some of the flaws associated with smartphone photography.

ChatGPT Images is rolling out this week to all users and is available in the API as GPT Image 1.5.

