NY Times Using Google AI to Digitize 5M+ Photos and Find ‘Untold Stories’

The New York Times has teamed up with Google Cloud to digitize the five to seven million old photos in its archive. Google’s AI will also be tasked with unearthing “untold stories” in the massive trove of historical images.

“For over 100 years, The Times has archived approximately five to seven million of its old photos in hundreds of file cabinets three stories below street level near their Times Square offices in a location called the ‘morgue’,” Google writes. “Many of the photos have been stored in folders and not seen in years. Although a card catalog provides an overview of the archive’s contents, there are many details in the photos that are not captured in an indexed form.”

The morgue contains photos from as far back as the late 1800s — it’s “a history of the world through the eyes of the New York Times.” After the archive was threatened by a broken pipe that flooded the building in 2015, the Times began to look for a better way to safely store the photos.

“The morgue is a treasure trove of perishable documents that are a priceless chronicle of not just The Times’s history, but of nearly more than a century of global events that have shaped our modern world,” Times CTO Nick Rockwell tells Google.

In addition to storing the photos at high resolution in the cloud, the system will recognize text, handwriting, and other metadata found on the physical prints. These details will power a search engine for scouring over a century of imagery.

For example, when provided this old photo of Penn Station…

…Google’s Cloud Vision API was able to extract this text unaided from the front and back:

NOV 27 1985
JUL 28 1992
Clock hanging above an entrance to the main concourse of Pennsylvania Station in 1942, and, right, exterior of the station before it was demolished in 1963.
PUBLISHED IN NYC
RESORT APR 30 ‘72
The New York Time THE WAY IT WAS – Crowded Penn Station in 1942, an era “when only the brave flew – to Washington, Miami and assorted way stations.”
Penn Station’s Good Old Days | A Buff’s Journey into Nostalgia
( OCT 3194
RAPR 20072
PHOTOGRAPH BY The New York Times Crowds, top, streaming into the old Pennsylvania Station in New Yorker collegamalan for City in 1942. The former glowegoyercaptouwd a powstation at what is now the General Postadigesikha designay the firm of Hellmuth, Obata & Kassalariare accepted and financed.
Pub NYT Sun 5/2/93 Metro
THURSDAY EARLY RUN o cos x ET RESORT
EB 11 1988
RECEIVED DEC 25 1942 + ART DEPT. FILES
The New York Times Business at rail terminals is reflected in the hotels
OUTWARD BOUND FOR THE CHRISTMAS HOLIDAYS The scene in Pennsylvania Station yesterday afternoor afternoothe New York Times (Greenhaus)
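
As the output above shows, some stamps come through cleanly ("NOV 27 1985") while others are damaged ("( OCT 3194", "EB 11 1988"). The following is a minimal sketch, not the Times's or Google's actual pipeline, of how clean date stamps in OCR text like this could be turned into sortable dates for a search index. The function name `extract_date_stamps` and the stamp pattern are assumptions made for illustration; damaged stamps are simply skipped rather than guessed at.

```python
import re
from datetime import date

# Month abbreviations as they appear on the archive stamps quoted above.
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}

# Assumed stamp format: MON D(D) YYYY, e.g. "NOV 27 1985".
STAMP = re.compile(
    r"\b(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC) (\d{1,2}) (\d{4})\b"
)


def extract_date_stamps(ocr_text):
    """Return the well-formed date stamps found in OCR text, oldest first."""
    stamps = []
    for month, day, year in STAMP.findall(ocr_text):
        try:
            stamps.append(date(int(year), MONTHS[month], int(day)))
        except ValueError:
            # Skip impossible dates (e.g. day 31 in a 30-day month).
            continue
    return sorted(stamps)


# A few lines copied from the OCR output above.
ocr_lines = [
    "NOV 27 1985",
    "JUL 28 1992",
    "( OCT 3194",
    "EB 11 1988",
    "RECEIVED DEC 25 1942 + ART DEPT. FILES",
]
stamps = extract_date_stamps("\n".join(ocr_lines))
for stamp in stamps:
    print(stamp.isoformat())
```

Only the three intact stamps are returned here. A real system would likely need fuzzier matching to recover partially misread stamps.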

The Vision API will also be able to identify objects, places, and other content in the photos. For example, Google AI was able to determine that the train station above was Pennsylvania Station by recognizing the logo in the frame.

The hope is that the AI system will be able to uncover new stories and perspectives by grouping previously unseen photos that lack captions and identifiable information with other photos that are already labeled.

“Helping The New York Times transform its photo archive fits perfectly with Google’s mission to organize the world’s information and make it universally accessible and useful,” Google says.
