Researchers Take Aim at Automatically Detecting Photo Fakes on Twitter


You might remember the photo above from last year. For a while, it circulated the web like mad, claiming to show Hurricane Sandy bearing down menacingly on the Statue of Liberty. But if you’ve read our previous coverage on the photo, you’ll know that it is, in fact, a fake — a composite of a Statue of Liberty picture and a well-known photo by weather photographer Mike Hollingshead.

Photo fakes like this wind up going viral online all the time, often helped along by Twitter, where retweet upon retweet puts them in front of thousands of unsuspecting people. Having had enough, a group of researchers from the University of Maryland, IBM Research Labs and the Indraprastha Institute of Information Technology are trying to do something about it.

The team, led by Ph.D. student Aditi Gupta, is trying to find a way to automatically detect fakes and prevent their spread or, at the very least, quickly alert viewers that what they’re looking at isn’t real.

The team used photos that were spread on Twitter during Hurricane Sandy to test their methods. The goal is to empower journalists with the tools necessary to weed out fakes in favor of real photos that can be tracked down, licensed, and used to illustrate their stories.

This photo showing a North Korean military exercise is another infamous fake.

You can read the entire research paper here, but there were two areas where the research turned up interesting results: where fakes come from, and how well we can detect them automatically.

As for where they come from, it turns out that 86% are spread by retweet, which is no surprise. The surprise came when the researchers found that a tiny percentage of influential tweeters (only 0.3%) were indirectly responsible for 90% of the retweets of fake images.

This information came in handy when they tried to develop an automated system for detecting these fakes, and the effort proved extremely successful. Using information about the Twitter user’s account and the false tweet’s content, they developed algorithms that detected 97% of fakes without any manual help.
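
To give a rough idea of what "information about the account and the tweet's content" can look like to a classifier, here is a minimal Python sketch of the feature-extraction step. It is not the researchers' code: the `Tweet` record, its field names, and the particular features chosen are all assumptions made for illustration, and the resulting numbers would still need to be fed into a trained classifier.

```python
import re
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical record; field names are illustrative, not Twitter's API schema.
    text: str
    followers: int
    following: int
    account_age_days: int


def extract_features(tweet: Tweet) -> dict:
    """Turn one tweet into numeric features a classifier could consume."""
    text = tweet.text
    return {
        # Content-based features (assumed examples, not the paper's list)
        "length_chars": len(text),
        "num_words": len(text.split()),
        "has_question_mark": int("?" in text),
        "has_exclamation": int("!" in text),
        "num_uppercase": sum(1 for ch in text if ch.isupper()),
        "num_mentions": len(re.findall(r"@\w+", text)),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_urls": len(re.findall(r"https?://\S+", text)),
        "is_retweet": int(text.startswith("RT @")),
        # Account-based features (assumed examples, not the paper's list)
        "followers": tweet.followers,
        "following": tweet.following,
        "account_age_days": tweet.account_age_days,
    }


# Made-up tweet used only to exercise the function.
example = Tweet(
    text="RT @someone: WOW! Is this real? #Sandy http://example.com/pic",
    followers=120,
    following=300,
    account_age_days=45,
)
features = extract_features(example)
print(features)
```

Each tweet becomes a small dictionary of numbers, which is the form most off-the-shelf classifiers expect once the values are arranged into rows.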

“Our results showed that automated techniques can be used in identifying real images from fake images posted on Twitter,” says the team. “Content and property analysis of tweets can help us in identifying real image URLs being shared on Twitter with a high accuracy,” eventually leading to “a browser plug-in that can detect fake images being shared on Twitter in real-time.”

For more info, check out the full research paper by clicking here.