Scientists Figure Out How to Record Audio by Seeing Vibrations with a Camera


Here’s something that will blow your mind: scientists have figured out how to extract audio from images captured with a camera. By analyzing the extremely small vibrations captured by a high-speed camera, researchers have been able to recreate music and speech from nothing but visual information.

The project was conducted by scientists at MIT, Microsoft, and Adobe, and has pretty crazy implications: in the future, people might be able to eavesdrop on your conversations by pointing a microphone-less camera at a potato chip bag near your feet.

That’s an actual example given by the researchers in the video below, which explains how this technology works:

The basic idea behind the experiment is that sound causes objects to vibrate, and those vibrations can be captured by a camera. They are so small that the human eye cannot detect them, much like the subtle changes a heartbeat causes in a person’s face.

“People didn’t realize that this information was there,” says MIT grad student Abe Davis.

Even though humans can’t see this data, computers can detect and process it. Throw in some fancy algorithms, and the magic begins to happen.


By pointing a high-speed camera at aluminum foil, a glass of water, earbuds lying on a desk, the leaves of a potted plant, and a potato chip bag (through soundproof glass from 15 feet away), scientists were able to recover music and intelligible speech.

Doing this generally requires a camera whose frame rate is higher than the frequency of the audio signal (e.g. 2,000 to 6,000 frames per second), but the scientists were also able to recover usable audio using an ordinary DSLR camera shooting at 60 frames per second.
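
For a rough sense of why frame rate matters, here is a minimal Python sketch of the standard sampling rule of thumb: a signal sampled N times per second can only faithfully represent frequencies up to N/2, and anything higher "folds" down to a lower apparent frequency. This is an illustration only, not the researchers’ algorithm. The function names are made up for this example, and this simple model does not explain how usable audio was recovered at 60 frames per second.

```python
def max_recoverable_hz(frame_rate_fps: float) -> float:
    """Highest frequency a plain sampler can represent (the Nyquist limit)."""
    return float(frame_rate_fps) / 2.0


def apparent_frequency_hz(tone_hz: float, frame_rate_fps: float) -> float:
    """Frequency a pure tone appears to have after being sampled once per frame.

    Tones above the Nyquist limit alias (fold) into the range 0..fps/2.
    Assumes an idealized camera that samples the whole frame at one instant.
    """
    fps = float(frame_rate_fps)
    folded = float(tone_hz) % fps
    return min(folded, fps - folded)


if __name__ == "__main__":
    # Frame rates mentioned in the article: 60, 2,000 and 6,000 fps.
    for fps in (60, 2000, 6000):
        print(fps, "fps -> limit", max_recoverable_hz(fps), "Hz,",
              "a 440 Hz tone appears as", apparent_frequency_hz(440, fps), "Hz")
```

Under these assumptions, a 440 Hz tone survives intact at 2,000 or 6,000 frames per second, but at 60 frames per second it would show up as a 20 Hz wobble, which is why the high frame rates are generally needed.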

The audio recovered with the ordinary camera wasn’t very good, but it was enough to reveal the gender of a speaker and the number of speakers in the area.

Davis says that this research is paving the way for what he calls “a new kind of imaging” — the capturing of sound through the camera lens.

(via MIT via Gizmodo)

  • ninpou_kobanashi

    Read my lips.

  • Rob S

    This has been done with windows for a long time. The exterior window of a building is essentially a speaker.

  • docholliday666

    Aw c’mon, stop stealing the technology (60’s tech that is) from the CIA! Big brother’s gonna be pissed!

  • OtterMatt

    It’s only a matter of time before Maxwell Smart is packing a Phantom Flex in his shoe instead of a phone.

  • Trinity Groves

    Makes me have a newfound appreciation for bats, dolphins and Geordi LaForge.

  • kassim

    Wizardry does exist.

  • Rob Elliott

    that is kinda nuts.

  • Lois Bryan

    Cool but creepy … or is it … creepy but cool …???

  • kassim

    Crool? Hahaha…

  • disqus_6a5r4sGw5d

    I was looking for it but couldn’t find it….so I will just ask. Why?

  • Joshua Reagan

    Because it is easier to outfit a drone with optics that can zoom in on small objects than it is to make a microphone that would do the same thing.

  • OtterMatt

    No it isn’t, a drone is nowhere near a stable enough platform to do this from. Just the motors alone would generate enough vibration to render this unusable.

  • etegration

    very CIA.

  • tedtrent

    WOW. Keep it up. Maybe we will be able to hear what people are saying on other planets one day.

  • Derek

    I remember they did this in the movie Eagle Eye with Shia LaBeouf. I was blown away and thought “That could never happen”. The future is now, people!

  • Joshua Reagan

    For now. Before learning of this project most people would have thought that recreating audio in this way was not possible.

  • OtterMatt

    But it’s still /been/ possible since nearly the 60s, regardless of what people might think. It absolutely is not easier to modify this tech to be flight-capable. I’ll bet my understanding of physics that there will never be a flight system that produces less vibration than is produced by sound waves on physical objects.

    Given that the tech involved isn’t new and isn’t particularly advancing (this kinda seems like an MIT senior design project to me), this won’t end up being used in large scale by anyone outside of a movie.