Hands-on with DALL-E 2, the AI Image Generator Taking the Web by Storm

There has been a lot of talk about AI-generated images and visual art. One of the projects that I found most impressive is DALL-E 2, a new AI system that can create realistic images and art from a description in natural language.

DALL-E 2 is currently invite-only, so I added my email to the waitlist. A few days later, the invite arrived. Instead of focusing on more meaningful work, I fooled around with the system. Here’s what I got…

Natural Language Descriptions

I am primarily a landscape photographer, so I wanted to test how soon AI will replace me. The first description I typed in was “snow covered mountains with a rainbow and a flying unicorn”, and DALL-E 2 created this:

“SNOW COVERED MOUNTAINS WITH A RAINBOW AND A FLYING UNICORN”

Clearly, I can keep my job for now! Next, I tried something more realistic, “beautiful waterfall with a lot of green moss and rocks in the foreground.”

“BEAUTIFUL WATERFALL WITH A LOT OF GREEN MOSS AND ROCKS IN THE FOREGROUND”

If the images above are really AI-generated, I am impressed! They look real. So I continued with “autumn forest in the mist.”

“AUTUMN FOREST IN THE MIST”

A bit dull, so I added mushrooms: “autumn forest in the mist with mushrooms on the ground.”

“AUTUMN FOREST IN THE MIST WITH MUSHROOMS ON THE GROUND”

Impressive again! Let’s try something harder, “autumn forest in the mist with mushrooms on the ground and a deer in the back.”

“AUTUMN FOREST IN THE MIST WITH MUSHROOMS ON THE GROUND AND A DEER IN THE BACK”

Less impressive… I do like how the AI added the birds and how the mushrooms started to levitate.

Uploading My Own Images

DALL-E 2 also lets you upload your own images, from which the system generates variations. This sounds interesting too and might be an even greater test of the AI’s power, since the AI has to “read” the image and then create meaningful alternatives. How did it work? Let’s see.

Lake Bled, Slovenia – The original source image.

My image of Lake Bled in autumn seemed like a perfect example to test. The composition, the colors, and the subject are all distinct and should be easy to replicate.

Here are the resulting variations, created by AI:

Next, I wanted to test how DALL-E 2 generates people in the image. I uploaded one of my travel portraits and got this message:

Fair enough, it makes sense. So I tried with a more silhouetted person, without a recognizable face. This time it worked.

View from the top of Taljanka, Albanian mountains – The original source image.

Here are the AI-generated variations:

The photographer and the camera in his hand are both unnaturally skewed and distorted. The mountain layers, on the other hand, all look very realistic, even though they are clearly different from those in the original.

Editing Existing Images

Once you upload your own images, you can also edit them: brush over a part of the image and describe in text what you want to create in that area. Here are a few results…

“BLACK BIRDS FLYING”
“LAND ROVER DEFENDER PARKED IN THE GRASS WITH A ROOFTOP TENT”
“FLOCK OF SHEEP GRAZING IN THE GRASS”

As you can see, not very impressive. It’s fun, but not realistic by any means. The Land Rovers are quite okay, but the sheep and the birds are clearly fake. Below are the original images I used to create the images above:

Conclusion

AI has definitely made huge progress in the last few years. We’ve seen music composed by computers, deepfake videos of famous dead people, and now images being created by simply typing in text descriptions! I am not sure if AI is the right term, as this is a set of algorithms that learn and improve as more and more data is fed to them. Anyway, the results I got from playing around were better than expected. There are many completely silly results, but sometimes the algorithm gets it really right.

I can see how useful this can become in the future, and how easy it will be to abuse it. The technology, however, is still in its infancy. At the moment I see it more or less as a toy to experiment with. Maybe in a few years this will change and we will see a flood of amazing images that have nothing to do with reality. Then, I believe, the craft of true photography will only become more valued.


About the author: Luka Esenko is a photographer based in Ljubljana, Slovenia. The opinions expressed in this article are solely those of the author. Esenko teaches photography workshops in and around Slovenia and he is also a co-founder of Photohound. You can find more of his work on his website, Twitter, and Instagram. This article was also published here.
