DALL-E Creator is ‘Surprised’ at AI Image Generator’s Impact

The creator of artificial intelligence (AI) image generator DALL-E says that he’s “surprised” at the technology’s huge impact.

In an interview with VentureBeat, Aditya Ramesh expresses his astonishment at the pace of development in the generative AI space.

“It doesn’t feel like so long ago that we were first trying this research direction to see what could be done,” Ramesh says.

“I knew that the technology was going to get to a point where it would be impactful to consumers and useful for many different applications, but I was still surprised by how quickly.”

iPhone Moment

At the beginning of 2022, AI image generators barely existed. They ended the year as arguably the biggest thing to happen to images since the invention of photography.

OpenAI, the company behind DALL-E, announced the then little-known program only two years ago. Now, the company is in talks to sell existing shares in a tender offer that would value it at around $29 billion.

“There’ll be some kind of iPhone-like moment for image generation and other modalities,” Ramesh tells VentureBeat. “I’m excited to be able to build something that will be used for all of these applications that will emerge.”

Understanding the Tech

Ramesh believes that there is a misunderstanding of how DALL-E works. The technology has not been without controversy regarding the rights of photographers and artists.

“People think that the way the model works is that it sort of has a database of images somewhere, and the way it generates images is by cutting and pasting together pieces of these images to create something new,” he tells VentureBeat.

“But actually, the way it works is a lot closer to a human where, when the model is trained on the images, it learns an abstract representation of what all of these concepts are.”

AI image generators such as DALL-E learn to interpret written text prompts only after being trained on hundreds of millions of images scraped from the internet.

“The training data isn’t used anymore when we generate an image from scratch,” Ramesh explains.

“Diffusion models start with a blurry approximation of what they’re trying to generate, and then over many steps, progressively add details to it, like how an artist would start off with a rough sketch and then slowly flesh it out over time.”
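Ramesh's rough-sketch analogy can be illustrated with a toy loop. This is an illustration only, not how DALL-E is actually implemented: in a real diffusion model, the step that nudges the noise toward the result is a learned denoising neural network, not a simple blend toward a known target.

```python
import random

def toy_diffusion_sample(target, steps=50, seed=0):
    """Toy sketch of diffusion-style sampling: start from pure noise
    and, over many steps, progressively remove a fraction of the
    remaining noise, revealing detail a little at a time."""
    rng = random.Random(seed)
    # Begin with a random "blurry approximation" (pure noise).
    x = [rng.uniform(0.0, 1.0) for _ in target]
    for step in range(steps):
        # Each step cancels part of the remaining noise; in a real
        # model this blend would be predicted by a trained network.
        blend = 1.0 / (steps - step)
        x = [xi + blend * (ti - xi) for xi, ti in zip(x, target)]
    return x
```

After the final step the "image" matches the target exactly, mirroring the artist analogy of a rough sketch that is slowly fleshed out.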

Ramesh tells VentureBeat that his goal has always been for DALL-E to be a tool for artists, in the same way Codex is a helpful tool for programmers.

“We found that some artists find it really useful for prototyping ideas — whereas they would normally spend several hours or even several days exploring some concept before deciding to go with it, DALL-E could allow them to get to the same place in just a few hours or a few minutes.”


Image credits: Header photo licensed via Depositphotos.