Posts Tagged ‘ai’
Cameras these days are smart enough to recognize the faces found inside photographs and label them with names. What if the same kind of recognition could be done for the locations of photographs? What if, instead of using satellite geodata, the camera could simply recognize where it is by the contents of the photographs?
That’s what research being done at Carnegie Mellon University and INRIA/Ecole Normale Supérieure in Paris may one day lead to. A group of researchers has created a computer program that can identify the distinctive architectural elements of major cities by processing street-level photos.
What if all advertising photos came with a number that revealed the degree to which they were Photoshopped? We might not be very far off, especially with recent advertising controversies and efforts to get “anti-Photoshop laws” passed. Researchers Hany Farid and Eric Kee at Dartmouth have developed a software tool that detects how much fashion and beauty photos have been altered compared to the original image, grading each photo on a scale of 1-5. The program may eventually be used as a tool for regulation: both publications and models could require that retouchers stay within a certain threshold when editing images.
Here’s the current state of imagery: still cameras can shoot HD video, video cameras can capture high quality stills, and data storage costs continue to fall. In the future, it might become commonplace for people to make photos by shooting uber-high quality video and then selecting the best still. However, as any photographer knows, picking the best photograph from a series captured in burst mode is already a challenge, so picking a single still from 30fps footage would be far more daunting.
To make the future easier for us humans, researchers at Adobe and the University of Washington are working on training computers to do the grunt work for us. One research project currently being done involves training a computer to automatically select candid portraits when given video of a person. The video above is a demo of the artificial intelligence in action.
Robots might not be able to convey emotions or tell stories through photographs, but one thing they’re theoretically better than humans at is calculating proportions in a scene, and that’s exactly what one robot at India’s IIT Hyderabad has been taught to do. Computer scientist Raghudeep Gadde programmed a humanoid robot with a head-mounted camera to perfectly obey the rule of thirds and the golden ratio. New Scientist writes:
The robot is also programmed to assess the quality of its photos by rating focus, lighting and colour. The researchers taught it what makes a great photo by analysing the top and bottom 10 per cent of 60,000 images from a website hosting a photography contest, as rated by humans.
Armed with this knowledge, the robot can take photos when told to, then determine their quality. If the image scores below a certain quality threshold, the robot automatically makes another attempt. It improves on the first shot by working out the photo’s deviation from the guidelines and making the appropriate correction to its camera’s orientation.
It’s definitely a step up from Lewis, a wedding photography robot built in the early 2000s that was taught to recognize faces.
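To make the "work out the deviation and correct it" step concrete, here is a minimal sketch of the rule-of-thirds part only. It is not the IIT Hyderabad code: the function names, the pixel coordinates, and the 5% tolerance are all assumptions made for illustration, and it ignores the golden ratio, focus, lighting and colour scoring.

```python
from math import hypot

def thirds_points(width, height):
    """Return the four rule-of-thirds intersections of a frame."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    return [(x, y) for x in xs for y in ys]

def thirds_correction(subject, width, height):
    """Return the (dx, dy) shift that puts the subject on the nearest intersection."""
    sx, sy = subject
    tx, ty = min(
        thirds_points(width, height),
        key=lambda p: hypot(p[0] - sx, p[1] - sy),
    )
    return tx - sx, ty - sy

def reshoot_needed(subject, width, height, tolerance=0.05):
    """True if the subject is further than `tolerance` (fraction of frame) from an intersection."""
    dx, dy = thirds_correction(subject, width, height)
    return hypot(dx / width, dy / height) > tolerance
```

A subject in the dead center of a 900x600 frame would trigger a reshoot, with the correction telling the camera which way to turn; a subject already sitting on an intersection would not.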
What if you could take perfect group photographs by first shooting multiple frames and then selecting the best portions of each one? Microsoft amazed us with this concept last year with its Photo Fuse technology, and now we may soon be seeing something similar coming to mobile phone cameras (and hopefully compact cameras as well). Imaging technology company Scalado gave the above demonstration at a conference earlier this month showing off Rewind, a super-useful feature that shoots a burst of full-res photos, then lets you select the best faces for each person in the image. Next up on our wishlist: Content Aware Fill.
When there’s something in the news regarding photography, like Stanford’s open source camera, I’m usually not the first to post about it. However, since I have a background in both photography and computer science, hopefully I can provide some unique insight into certain news stories.
The big story this past week has been PhotoSketch, a research project out of China’s prestigious Tsinghua University. The claim is that this program can take your rough, labeled sketches of various scenes, and automatically turn them into photo montages by combining the appropriate photographs obtained from the web. The following video posted to Vimeo demonstrating the technology has gotten over half a million views over the past week.
There are two main features that allow PhotoSketch to work. The first is filtering out undesirable images to obtain suitable ones, and the second is a novel blending algorithm that creates a seamless composition.
The key idea is that the user of the program actually does a lot of the hard work, making the job of the program a lot simpler. What’s great is that the user doesn’t even realize they’re doing a lot of work. A similar example might be CAPTCHAs, those security keys you type in to verify you’re a human. Solving one is pretty trivial for a human, but (currently) very difficult for a computer.
Likewise, labeling the semantics of a photo is something very difficult for computers to do. If you gave the program unlabeled photographs, how would the program distinguish between a man reaching for something and a man throwing a ball, if both have similar shape and form? A computer can determine shapes and colors, but has a very hard time figuring out the meaning of photographs without human participation.
Since the user provides both a shape and a label, the problem becomes a shape matching problem, which isn’t nearly as difficult. The program only has to search through images that humans have previously labeled as being suitable.
In order to make it easier to extract the desired subjects from photographs, the filtering process actually throws away images that don’t have clear, uncluttered backdrops. For example, a tiger that blends into grass would be discarded, as would a lego piece among many lego pieces. This makes sense, since we all know an object is much easier to isolate from a photo when it’s very distinct from the background. In Photoshop you can simply use the magic wand or quick selection tools to eliminate the background.
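As a rough illustration of what "clear, uncluttered backdrop" could mean to a program, here is a small sketch. The measure used (variance of the grayscale values around the edge of the image) is my own stand-in, not the test PhotoSketch actually uses, and the function names, `margin`, and `max_variance` threshold are all hypothetical.

```python
from statistics import pvariance

def border_pixels(gray, margin):
    """Collect grayscale values within `margin` pixels of the image edge."""
    h, w = len(gray), len(gray[0])
    return [
        gray[r][c]
        for r in range(h)
        for c in range(w)
        if r < margin or r >= h - margin or c < margin or c >= w - margin
    ]

def has_clean_backdrop(gray, margin=1, max_variance=100.0):
    """Keep an image only if the area around the subject is roughly uniform."""
    return pvariance(border_pixels(gray, margin)) <= max_variance
```

A tiger in tall grass would have a noisy border and get thrown out, while the same tiger against a plain wall would pass.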
Now I’ll briefly describe the various steps that go into making the program work.
Obtaining the Background
The main observation for selecting a background is that if you find all the images with a certain label (e.g. beach, mountain, meadow), you can group them by similarity. The researchers assume that the largest “cluster” of similar images is probably what the user is looking for, so they choose the 100 background images that are most similar to the characteristics of this cluster.
Next, they take these 100 images, and throw out the ones that don’t have the horizon line in the correct place. With the remaining images, they filter out images that have non-uniform backgrounds in order to have clean, open spaces on top of which the item images can be placed. At the end of this stage, they keep about 20 background images as possible candidates.
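The two background steps above can be sketched roughly as follows. This is a toy version under stated assumptions: each image is a dict with a made-up `features` vector and a `horizon` value (fraction of image height), the "largest cluster" is approximated by the densest neighborhood within a fixed `radius`, and the uniformity filter is left out. None of these names or parameters come from the paper.

```python
from math import dist

def largest_cluster_center(features, radius):
    """Approximate the biggest cluster as the densest neighborhood and return its mean."""
    best = []
    for f in features:
        members = [g for g in features if dist(f, g) <= radius]
        if len(members) > len(best):
            best = members
    dims = len(best[0])
    return tuple(sum(m[i] for m in best) / len(best) for i in range(dims))

def select_backgrounds(images, sketch_horizon, radius=0.5, top_k=100, horizon_tol=0.1):
    """Keep the top_k images nearest the cluster center, then drop mismatched horizons."""
    center = largest_cluster_center([img["features"] for img in images], radius)
    ranked = sorted(images, key=lambda img: dist(img["features"], center))[:top_k]
    return [img for img in ranked if abs(img["horizon"] - sketch_horizon) <= horizon_tol]
```

The point of ranking by distance to the cluster center first is that outliers (a "beach" photo that is really a close-up of a towel) never reach the horizon check at all.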
Selecting Scene Items
Once candidate background images have been obtained, the program searches for images that match the labels of the items in the scene. As with background selection, images that are too complicated or too cluttered are filtered out. The items need to be very distinct from the background in order for the program to isolate them.
The program then compares the extracted items with the shape the user drew, if a shape was provided. Images that don’t match are discarded, and the ones that do match are clustered together, just like in background selection. Images that both match the shape well and are part of a popular cluster are selected as candidate images.
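Here is a small sketch of the shape comparison step. I don't describe the actual shape measure PhotoSketch uses, so this substitutes a simple intersection-over-union score between binary masks; the function names, the mask representation (nested lists of 0/1), and the 0.5 threshold are assumptions for illustration only.

```python
def mask_cells(mask):
    """Return the set of (row, col) cells that are filled in a binary mask."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}

def iou(mask_a, mask_b):
    """Intersection over union of two binary masks of the same size."""
    a, b = mask_cells(mask_a), mask_cells(mask_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def filter_by_shape(sketch, candidates, threshold=0.5):
    """Return candidate names whose silhouette matches the sketch, best match first."""
    scored = [(iou(sketch, mask), name) for name, mask in candidates.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]
```

In the real system the surviving candidates would then be clustered, as with backgrounds, so that a well-matching shape from a popular cluster wins over a well-matching oddball.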
Blending the Images
The novel methods used to blend the candidate images together are actually one of the main areas of research for this project. Everything I’ve explained prior to this section isn’t very groundbreaking, while everything related to this section is too complicated and technical to be easily explained. I’ll just say a lot of work goes into making the images not look completely absurd against the selected backgrounds.
Real or Fake?
What I find funny is how many of the comments found around the web regarding PhotoSketch claim that it’s fake. If it were fake, it would be one of the greatest hoaxes of all time, since the research was done at a prestigious university and will also be presented at the ACM SIGGRAPH Asia conference in December.
However, this doesn’t mean the program is as perfect as the video demonstrations and examples published make it seem. Here are some examples from the paper of when the program generates a semantically ridiculous photo montage:
Anything automatically generated will have semantic flaws that create absurd and nonsensical images every so often. The examples provided by the PhotoSketch group are simply examples of when the program successfully does what it’s supposed to do (which is hopefully quite often). Does it always create images that look as nice or make as much sense as the examples? No, but the examples provide a good demonstration of the technology.
PhotoSketch is a pretty amazing idea that deserves all the attention it’s getting. It’s also a taste of what’s to come with regard to computer graphics technologies. I’m sure we’re going to see more and more mind-boggling research projects and commercial products in the coming years.