Facebook just expanded 3D photo posting to phones that don’t actually capture depth data. Using the magic of machine learning, Facebook taught a neural network to “infer 3D structures from 2D photos,” even if those photos were taken with a single-lens camera.
The announcement was made moments ago on the Facebook AI blog, where the company’s engineers went into depth on how exactly they pulled this off.
“This advance makes 3D photo technology easily accessible for the first time to the many millions of people who use single-lens camera phones or tablets,” reads the announcement. “It also allows everyone to experience decades-old family photos and other treasured images in a new way, by converting them to 3D.”
Facebook is obviously not the first to use AI to infer 3D data from a 2D image. Google has been doing it with the Pixel phones for years, and the LucidPix app we wrote about last month does much the same thing. The difference is that Facebook is bringing this to more users than ever before, and it’s available for free—no need to buy one of Google’s phones or pay for the premium version of an app to get all the features.
It should also do a better job than some of the more accessible implementations out there, simply because this technology has the full weight of Facebook’s engineering muscle behind it.
The technical details are complicated, but according to FB, their convolutional neural network can estimate the distance of every single pixel from the camera thanks to four key techniques:
- A network architecture built with a set of parameterizable, mobile-optimized neural building blocks.
- Automated architecture search to find an effective configuration of these blocks, enabling the system to perform the task in under a second on a wide range of devices.
- Quantization-aware training to leverage high-performance INT8 quantization on mobile while minimizing potential quality degradation from the quantization process.
- Large amounts of training data derived from public 3D photos.
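The third bullet, quantization-aware training for INT8 inference, is the easiest to illustrate. A minimal sketch of the underlying idea (plain Python, with made-up example weights—not Facebook's actual implementation): floating-point values are mapped to 8-bit integers via a scale and zero point, and the round-trip error per value stays within half the scale.

```python
def quantize_int8(values):
    """Affine (asymmetric) INT8 quantization: map floats into [-128, 127]."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0  # guard against a constant input
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float values from the INT8 codes."""
    return [(qi - zero_point) * scale for qi in q]

# Hypothetical network weights, just for demonstration
weights = [-0.42, 0.0, 0.17, 0.91]
q, scale, zp = quantize_int8(weights)
recovered = dequantize_int8(q, scale, zp)
print(max(abs(w - r) for w, r in zip(weights, recovered)))  # small, bounded by scale/2
```

“Quantization-aware” training simulates this rounding during training itself, so the network learns weights that lose as little accuracy as possible once they are stored and executed as INT8 on a phone.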
The result is an estimated depth map that looks like this:
Of course, if all of the above sounds like gibberish, take comfort in the fact that it seems to work very well. As you can see from the examples below—which were all converted from standard 2D photos—the results are quite convincing.
Even when challenged by a very complex image like the Trevi Fountain in Rome, the amount of depth data the system infers is staggering:
If you’re curious about how this technology actually works, you can dive much deeper on the Facebook AI blog.
If you’re more interested in how this impacts you, just know this: whether you want to convert “decades-old family photos” or your professional portraits into 3D-like creations, FB now lets you do that. Just upload a photo using the Facebook app on “an iPhone 7 or higher, or any recent midrange or better Android device.”
Image credits: All photos and animations provided by Facebook and used with permission.