Google Generative AI Brings Visuals to Zork, a 1977 Text-Based Video Game

Matt Walsh, a Google principal engineer, has brought visuals to Zork — the text-only video game released in 1977 — using Google Imagen, the company’s text-to-image AI system.

Zork works much like a modern “choose your own adventure” game or book: it describes a situation and responds to typed commands stating what the player wants to do next. It was originally developed at the Massachusetts Institute of Technology (MIT) and is among the earliest well-known examples of an interactive fiction adventure game.

The game was popular among computer enthusiasts at the time — all versions collectively selling more than 680,000 copies through 1986 — and, as Walsh says, inspired many of them to pursue careers in programming.

But as impressive as the concept and execution of Zork were, the game was still limited by the technology of its time. It could describe its fictional landscapes in text, but it could not render them visually, so players had to imagine them. Walsh has changed that by integrating Google Imagen, the company’s in-progress generative artificial intelligence (AI) system, into the game.

“Some adventure game fans at Google wondered what would happen if you used the output from the classic text adventure game ‘Zork’ as the input to Imagen, Google’s text-to-image diffusion model,” Walsh says. “To support the results we wanted we made some additions to Zork itself, which was a fascinating journey through an arcane language and a lost (but re-engineered) toolchain.”

Walsh describes the system as using seven diffusion models, which first produce a low-resolution draft and then repeatedly upscale it. The team also applied Google’s LaMDA AI to compress the game’s descriptions so that Imagen could process them, and had the game remember the result, so that returning to a location wouldn’t produce an entirely new image and confuse the player.
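
The idea of remembering an image per location can be illustrated with a small sketch. This is not Walsh's code, and neither Imagen nor LaMDA is called here: `summarize` and `fake_image_model` are hypothetical stand-ins for the language-model and text-to-image steps, and `LocationImageCache` is an illustrative name. The sketch assumes only that each room has a stable name that can serve as a lookup key.

```python
from typing import Callable, Dict


def summarize(description: str, max_words: int = 12) -> str:
    """Hypothetical stand-in for the language-model step: shorten the room text."""
    return " ".join(description.split()[:max_words])


def fake_image_model(prompt: str) -> str:
    """Hypothetical stand-in for a text-to-image call; returns a label, not pixels."""
    return f"<image for: {prompt}>"


class LocationImageCache:
    """Generates an image the first time a location is seen, then reuses it."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._images: Dict[str, str] = {}
        self.calls = 0  # number of times the image model was actually invoked

    def image_for(self, location: str, description: str) -> str:
        # Only call the (slow, non-deterministic) model for unseen locations.
        if location not in self._images:
            self.calls += 1
            self._images[location] = self._generate(summarize(description))
        return self._images[location]


cache = LocationImageCache(fake_image_model)
first = cache.image_for(
    "West of House",
    "You are standing in an open field west of a white house, "
    "with a boarded front door.",
)
# A return visit reuses the stored image even if the text differs slightly.
again = cache.image_for("West of House", "You are west of the house again.")
```

Keying the cache on the location name rather than on the description text means small wording changes on a return visit do not trigger a fresh, different-looking picture.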

“This is an example of the potential of these types of generative AI integrations, but also the limitations,” Voicebot and Synthedia founder Bret Kinsella says. “It is also not clear that any game maker would allow this much randomness into their game, but it certainly could be used in the creative process.”

Google’s Imagen is the company’s take on a text-to-image AI generator, similar to Stable Diffusion, DALL-E 2, or Midjourney. Imagen was announced last May but is still available only internally at the company. Last November, Google showed off an updated version that could generate longer and more coherent AI-generated videos.