The Indecisive Moment: Street Photography and AI

I’ve spent countless hours walking streets both near and far from home with My Precious in my hand and a muscle memory for it in my fingers that made it effectively part of me — an extension of my arm. My Precious is, of course, my camera.

I realized I wanted to photograph time. You can’t see time, and you can’t touch it. That’s why it’s the furthest thing from our five senses. But with photography, I can capture time. —Miyako Ishiuchi, What Photographers Go Up Against 6: Time

Years ago I wrote about the “Zen” of street photography, by which I meant its meditative nature for me — how its practice enlivened my senses, made me walk more slowly and observantly, excruciatingly (and often exhaustingly) mindful of the world around me. I wrote somewhere once that of every street photo I take — no matter how long ago, no matter how many thousands of photos I’ve taken — I remember exactly where I was standing and what the scene sounded like around me. A synaesthetic frisson generated by a sea of sensory data in the depths of my physiology.

“The decisive moment” is a phrase often used to describe the essence of street photography as a practice and genre. It comes from a canonical 1952 work by Henri Cartier-Bresson:

Photography implies the recognition of a rhythm in the world of real things. What the eye does is to find and focus on the particular subject within the mass of reality; what the camera does is simply to register upon film the decision made by the eye […] Photography must seize upon this moment and hold immobile the equilibrium of it. […] Photography is the simultaneous recognition, in a fraction of a second, of the significance of an event as well as of a precise organization of forms that give that event its proper expression. —Henri Cartier-Bresson, The Decisive Moment

What is this “mass of reality” and how is this decision made? Consider all the variables of environment and motion at play — the light at that exact time of day, the street corner that is turned (or not), the speed at which human bodies pass each other, the direction the eye is looking for a split second because a sudden sound was heard. The height of the standing photographer. The bend of a knee or wrist.

Bowie ponders a passerby.

Somewhere in all that mass, that chaos — something in the mind and finger of the photographer makes a wordless, instant decision to take a photo — capturing a moment in time that never existed before and never will again. What is that recognition of a rhythm? It’s a profoundly human algorithm.

I once hacked my own photography practice by taking away almost all my own agency around it. I built a tiny camera that I could pin to my clothes and programmed it to automatically take a picture every 60 seconds, then simply went about my day. Later, it felt deeply weird to see the results for the first time — as though they were taken of my life by a stranger, a portable paparazzo. It was a fascinating experiment in loss of control, discerning where my artistic eye begins and ends, and feeling intimately witnessed by something outside of myself.
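
For the tinkerers: I won’t reproduce my actual build here, but the core loop is simple enough to sketch. Something like this, assuming a Raspberry Pi and the picamera2 library (stand-ins for whatever hardware you have on hand):

```python
# A minimal sketch of a wearable interval camera, assuming a
# Raspberry Pi running the picamera2 library. Hardware, library,
# and filenames here are illustrative, not the original build.
import time
from datetime import datetime
from picamera2 import Picamera2

INTERVAL_SECONDS = 60  # one frame per minute, as described above

cam = Picamera2()
cam.configure(cam.create_still_configuration())
cam.start()

try:
    while True:
        # Timestamped filename so the day's frames sort chronologically.
        name = datetime.now().strftime("frame_%Y%m%d_%H%M%S.jpg")
        cam.capture_file(name)
        time.sleep(INTERVAL_SECONDS)
except KeyboardInterrupt:
    cam.stop()
```

The interesting part was never the code; it was surrendering the shutter to a clock.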

Lately, I’ve been noticing that announcements for street photography competitions increasingly include rules like “no AI-generated images.” The first time I saw one, it struck me as odd because it seemed obvious… like, of course, [by its very nature and definition] it can’t be AI-generated; it is necessarily of the physical world, the messy people in it, and time itself. AI does not exist in time or space.

Right?

Something of a meta street moment.

A fellow street photographer I’ve been following for years recently and suddenly became an exclusively generative AI artist. My heart sank as I’d really loved his poetic, ethereal, and unusual work that had inspired me over and over again. His new style was glossy and almost wholly disconnected from his earlier aesthetic and point-of-view, seemingly (to my eye, anyway) entirely text-prompt-based and not using any of his own original work as material. It was still beautiful and imaginative work, but it told me nothing about what he’d seen or lived. There was still much to appreciate about it artistically and technically, but I realized that what I’d loved was the stories he told about the world — and how he, specifically and as a fragile and imperfect being, moved in it.

Above is legendary animator Hayao Miyazaki on why computer-generated characters are not interesting to him.

While I can be a romantic about more traditional practices — and was initially tempted to draw a hard line between “real” and evolving tech in street photography — I’ve come to think that any difference is ultimately semantic vis-à-vis how we categorize art. I think it’s a lot more interesting to find intersections of human and machine, and all the literal and figurative shades of gray between them. What is a camera anyway but a mechanized eye invented by puny mortals to try to capture time?

During the early years (!) of the pandemic, I grieved the loss of the freedom to travel through my own neighborhood and favorite places in the world. I began to digitally composite photographs I’d taken — over different years and places — to create worlds both familiar and new, like Mt. Rainier of my Washington home emerging over Tokyo. It felt like time travel and it felt true to my soul. I could feel where I was. It felt like memories and testimony of places I’d walked… that don’t actually exist. Like fusion and bottled magic. And it was all real… just remixed, reimagined.

Simple example of a composite. I combined a Tokyo street scene with textures of a Pacific Northwest forest.
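
For the technically curious, the mechanics of a basic composite come down to a few lines of Python with the Pillow library. This is a minimal sketch rather than my actual workflow; the filenames and blend strength are illustrative stand-ins:

```python
# A minimal two-image composite using Pillow. The source files and
# blend settings are hypothetical examples, not an exact recipe.
from PIL import Image, ImageChops

street = Image.open("tokyo_street.jpg").convert("RGB")
forest = Image.open("pnw_forest.jpg").convert("RGB")

# Match the texture layer to the base image's dimensions.
forest = forest.resize(street.size)

# Multiply darkens the street scene with the forest texture;
# blending the result back in controls how strongly it shows.
multiplied = ImageChops.multiply(street, forest)
composite = Image.blend(street, multiplied, alpha=0.4)

composite.save("composite.jpg")
```

Multiply-then-blend is only one of many ways to marry two frames; the artistry is in choosing which frames deserve each other.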

In my day job as a product designer in tech, I started working in AI in 2017 and am as ever in awe of it, in all the senses of that word. I mostly worked in natural language processing and virtual assistants like Alexa or Siri, and have some exposure to its use in computer vision (CV), consumer robotics, and smart wearables. It both delights me and leaves me agape to imagine how we might deploy such tools in photographic art and craft. I think of the tiny wearable camera I made and wonder what it would be like to have glasses that did the same. And I think about how much I’d love to train a custom AI model (for my use only!) on the tens of thousands of photos I’ve taken over the years to generate wild new versions of my own work, free from temporal and geographical bounds, soaring creatively in a world of my own curation. All that visual data! My eyes gleam with the possibilities.

I used to spend hours editing and cropping and tinkering with my photos, something I no longer do because now I just program My Precious’ settings to take photos to my exact artistic aesthetic: the exact lighting and color effects, the exact level of grain, my preferred aspect ratio, and so on. I now just compose and take the shot, and it’s done right out of the camera. It’s not a huge leap to imagine that I could train it over time to detect which scenes I find most compelling, what angles I use, how I compose and frame, and ultimately how I see the world and share my life with others.
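
What does “programming” a camera look like? The menus vary by manufacturer, but conceptually a recipe is just data. Here is a hedged sketch in Python; every field name and value is a hypothetical stand-in, loosely modeled on the film-simulation recipes some mirrorless cameras offer, not my actual settings:

```python
# A hypothetical in-camera "recipe" expressed as plain data. Field
# names and values are illustrative assumptions; real cameras expose
# these through their own menus and terminology.
from dataclasses import dataclass

@dataclass(frozen=True)
class FilmRecipe:
    name: str
    film_simulation: str   # base color/tone profile
    grain: str             # strength of simulated film grain
    highlight_tone: int    # compresses or lifts highlights
    shadow_tone: int       # deepens or softens shadows
    white_balance_shift: tuple[int, int]  # (red, blue) bias
    aspect_ratio: str

# One possible street recipe, dialed in once and then forgotten:
# every frame comes out of the camera already "edited."
STREET = FilmRecipe(
    name="street-default",
    film_simulation="classic-chrome",
    grain="strong",
    highlight_tone=-1,
    shadow_tone=2,
    white_balance_shift=(2, -4),
    aspect_ratio="3:2",
)
```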

But, what would be the fun in that?


About the author: Jill Corral is an author, designer, and photographer who works at the intersection of visuals, sound, and transformative spaces. You can find more of her work on her website. This article was also published here.
