Posts Tagged ‘ai’

Google is Developing a Photo Recognition Program That Can Describe Exactly What’s in Your Photos

Scientists at Google Research and Stanford University have teamed up to develop an artificial intelligence program designed to automatically produce captions based on the content of the image.

That’s right, not just tags, but full-on captions like “A person riding a motorcycle on a dirt road.” Read more…

Canon Patent Suggests Feature That Would Choose Between JPEG and RAW For You

In the future, “Auto” may be an option for choosing between JPEG and RAW on Canon DSLRs. A recently published patent reveals that the company has tinkered with the idea of a feature that could automatically choose which photos to save in RAW and which ones to save as JPEG only.
Read more…

How Artificial Intelligence Reconstructs Our Minds and Lives Using Our Photos

Data is embedded in our environment, in our behavior, and in our genes. Over the past two years, the world has generated 90% of all the data we have today. The information has always been there, but now we can extract and collect massive amounts of it.

Given the explosion of mobile photography, social media based photo sharing, and video streaming, it’s likely that a large portion of the data we collect and create comes in the form of digital images. Read more…

photoBot: A Photog Robot That Scans the Room for Pictures, R2-D2-Style

photoBot is a new photography robot designed by Tommy Dykes, a designer and PhD student at Northumbria University. It constantly scans a room for photo ops by turning its head in a manner reminiscent of R2-D2 from Star Wars (which, in case you haven’t heard, is now owned by Disney).
Read more…

Location Recognition for Photographs by Looking at Architecture

Cameras these days are smart enough to recognize the faces found inside photographs and label them with names. What if the same kind of recognition could be done for the locations of photographs? What if, instead of using satellite geodata, the camera could simply recognize where it is by the contents of the photographs?

That’s what research being done at Carnegie Mellon University and INRIA/Ecole Normale Supérieure in Paris may one day lead to. A group of researchers has created a computer program that can identify the distinctive architectural elements of major cities by processing street-level photos.
Read more…

Researchers Create Program That Can Quantify How Fake Photos Are

What if all advertising photos came with a number that revealed the degree to which they were Photoshopped? We might not be very far off, especially with recent advertising controversies and efforts to get “anti-Photoshop laws” passed. Researchers Hany Farid and Eric Kee at Dartmouth have developed a software tool that detects how much fashion and beauty photos have been altered compared to the original image, grading each photo on a scale of 1-5. The program may eventually be used as a tool for regulation: both publications and models could require that retouchers stay within a certain threshold when editing images.

(via Dartmouth via NYTimes)

Computer Trained to Select the Best Candid Portrait Photos from Videos

Here’s the current state of imagery: still cameras can shoot HD video, video cameras can capture high quality stills, and data storage costs continue to fall. In the future, it might become commonplace for people to make photos by shooting uber-high quality video and then selecting the best still. However, as any photographer knows, selecting the best photograph from a series captured in burst mode is already a challenge, so selecting a still from 30fps footage would be far more daunting.

To make the future easier for us humans, researchers at Adobe and the University of Washington are working on training computers to do the grunt work for us. One research project currently being done involves training a computer to automatically select candid portraits when given video of a person. The video above is a demo of the artificial intelligence in action.

Candid Portrait Selection From Video (via John Nack)

Robot Photographer Programmed to Obey the Rule of Thirds

Robots might not be able to convey emotions or tell stories through photographs, but one thing they’re theoretically better than humans at is calculating proportions in a scene, and that’s exactly what one robot at India’s IIT Hyderabad has been taught to do. Computer scientist Raghudeep Gadde programmed a humanoid robot with a head-mounted camera to perfectly obey the rule of thirds and the golden ratio. New Scientist writes,

The robot is also programmed to assess the quality of its photos by rating focus, lighting and colour. The researchers taught it what makes a great photo by analysing the top and bottom 10 per cent of 60,000 images from a website hosting a photography contest, as rated by humans.

Armed with this knowledge, the robot can take photos when told to, then determine their quality. If the image scores below a certain quality threshold, the robot automatically makes another attempt. It improves on the first shot by working out the photo’s deviation from the guidelines and making the appropriate correction to its camera’s orientation.

It’s definitely a step up from Lewis, a wedding photography robot built in the early 2000s that was taught to recognize faces.
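
To make the “deviation from the guidelines” idea a bit more concrete, here is a minimal Python sketch of a rule-of-thirds check. It is only an illustration, not the robot’s actual code: the function names, the pixel-based subject position, and the 5% tolerance are all assumptions of mine, and the golden ratio and the quality scoring are left out.

```python
from itertools import product


def thirds_points(width, height):
    """Return the four rule-of-thirds intersection points of a frame."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    return list(product(xs, ys))


def thirds_correction(subject, width, height):
    """Offset (dx, dy) in pixels from the subject to the nearest thirds point."""
    sx, sy = subject
    nearest = min(
        thirds_points(width, height),
        key=lambda p: (p[0] - sx) ** 2 + (p[1] - sy) ** 2,
    )
    return nearest[0] - sx, nearest[1] - sy


def needs_retake(subject, width, height, tolerance=0.05):
    """True when the subject sits too far from every thirds point.

    `tolerance` is a hypothetical threshold, expressed as a fraction
    of the frame diagonal.
    """
    dx, dy = thirds_correction(subject, width, height)
    diagonal = (width ** 2 + height ** 2) ** 0.5
    return (dx ** 2 + dy ** 2) ** 0.5 > tolerance * diagonal
```

A dead-centered subject fails the check, and the offset from `thirds_correction` is the sort of value a robot could turn into a change of camera orientation before trying again.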

(via New Scientist via DVICE)

Rewind: An Awesome Camera Feature for Perfect Group Photos

What if you could take perfect group photographs by first shooting multiple frames and then selecting the best portions of each one? Microsoft amazed us with this concept last year with its Photo Fuse technology, and now we may soon be seeing something similar coming to mobile phone cameras (and hopefully compact cameras as well). Imaging technology company Scalado gave the above demonstration at a conference earlier this month showing off Rewind, a super-useful feature that shoots a burst of full-res photos, then lets you select the best faces for each person in the image. Next up on our wishlist: Content Aware Fill.

Rewind (via GigaOM)

PhotoSketch Turns Your Sketches into Photo Montages

When there’s something in the news regarding photography, like Stanford’s open source camera, I’m usually not the first to post about it. However, since I have a background in both photography and computer science, hopefully I can provide some unique insight into certain news stories.

The big story this past week has been PhotoSketch, a research project out of China’s prestigious Tsinghua University. The claim is that this program can take your rough, labeled sketches of various scenes, and automatically turn them into photo montages by combining the appropriate photographs obtained from the web. The following video posted to Vimeo demonstrating the technology has gotten over half a million views over the past week.

Key Ideas

There are two main features that allow PhotoSketch to work. The first is filtering out undesirable images to obtain suitable ones, and the second is a novel blending algorithm that creates a seamless composition.

The key idea is that the user of the program actually does a lot of the hard work, making the job of the program a lot simpler. What’s great is that the user doesn’t even realize they’re doing a lot of work. A similar example might be CAPTCHAs, those security keys you type in to verify you’re a human. It’s pretty trivial for a human to do, but (currently) very difficult for a computer.

Likewise, labeling the semantics of a photo is something very difficult for computers to do. If you gave the program unlabeled photographs, how would the program distinguish between a man reaching for something and a man throwing a ball, if both have similar shape and form? A computer can determine shapes and colors, but without human participation it finds working out the meaning of a photograph nearly impossible.

Since the user provides both a shape and a label, the problem becomes a shape matching problem, which isn’t nearly as difficult. The program only has to search through images that humans have previously labeled as being suitable.

In order to make it easier to extract the desired subjects from photographs, the filtering process actually throws away images that don’t have clear, uncluttered backdrops. For example, a tiger that blends into grass would be discarded, as would a lego piece among many lego pieces. This makes sense, since we all know an object is much easier to isolate from a photo when it’s very distinct from the background. In Photoshop you can simply use the magic wand or quick selection tools to eliminate the background.
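
Here is a rough Python sketch of what a “clean backdrop” filter could look like. This is my own simplification, not the test used in the paper: I assume a grayscale image stored as a list of rows, and I treat low variance along the image border as a stand-in for an uncluttered background. The names and the threshold are hypothetical.

```python
from statistics import pvariance


def border_pixels(image, margin=1):
    """Collect the pixels lying within `margin` of the image edge."""
    height, width = len(image), len(image[0])
    return [
        image[y][x]
        for y in range(height)
        for x in range(width)
        if y < margin or y >= height - margin
        or x < margin or x >= width - margin
    ]


def has_clean_backdrop(image, max_variance=100.0, margin=1):
    """Keep an image only if its border is nearly uniform.

    `max_variance` is an assumed cutoff; a real system would tune it.
    """
    return pvariance(border_pixels(image, margin)) <= max_variance
```

An object on a plain wall passes, while a tiger in tall grass (lots of variation right up to the edges) would be thrown away.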

Now I’ll briefly describe the various steps that go into making the program work.

Obtaining the Background

The main observation for selecting a background is that if you find all the images with a certain label (e.g. beach, mountain, or meadow), you can group them by similarity. The researchers assume that the largest “cluster” of similar images is probably what the user is looking for, so they choose the 100 background images that are most similar to the characteristics of this cluster.

Next, they take these 100 images, and throw out the ones that don’t have the horizon line in the correct place. With the remaining images, they filter out images that have non-uniform backgrounds in order to have clean, open spaces on top of which the item images can be placed. At the end of this stage, they keep about 20 background images as possible candidates.
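
The first two steps of background selection can be sketched in a few lines of Python. Treat this as an illustration under assumptions: each image is a dict with a precomputed feature vector, a cluster id from some clustering algorithm, and an estimated horizon height between 0 and 1. Those field names, the Euclidean distance, and the tolerance are mine, not the paper’s, and the uniform-region filter is left out.

```python
from collections import Counter
from math import dist


def select_backgrounds(images, sketch_horizon, top_k=100, horizon_tol=0.1):
    """Pick background candidates for one label.

    images: list of dicts with 'name', 'features', 'cluster', 'horizon'
            (all hypothetical fields, assumed to be computed elsewhere).
    """
    # Find the largest cluster and average its members' features.
    largest = Counter(img["cluster"] for img in images).most_common(1)[0][0]
    members = [img["features"] for img in images if img["cluster"] == largest]
    centroid = [sum(col) / len(members) for col in zip(*members)]

    # Rank every image by distance to that cluster's centroid.
    ranked = sorted(images, key=lambda img: dist(img["features"], centroid))

    # Keep the closest top_k, then drop those with a misplaced horizon.
    return [
        img["name"]
        for img in ranked[:top_k]
        if abs(img["horizon"] - sketch_horizon) <= horizon_tol
    ]
```

With `top_k=100` this mirrors the numbers mentioned above; the remaining filter for open, uniform regions would then trim the list further.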

Selecting Scene Items

Once candidate background images have been obtained, the program searches for images that match the labels of the items in the scene. As with background selection, images that are too complicated or too cluttered are filtered out. The items need to be very distinct from the background in order for the program to isolate them.

The program then compares the extracted items with the shape the user drew, if a shape was provided. Images that don’t match are discarded, and the ones that do match are clustered together, just like in background selection. Images that both match the shape well and are part of a popular cluster are selected as candidate images.
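
As a toy version of the shape comparison, here is a Python sketch that scores each extracted item against the user’s drawing using intersection-over-union on coarse grids. This is an assumed stand-in for whatever shape measure the researchers actually use; the function names and the 0.5 cutoff are hypothetical, and the clustering step is omitted.

```python
def shape_iou(mask_a, mask_b):
    """Intersection-over-union of two shapes given as sets of (x, y) cells."""
    union = mask_a | mask_b
    return len(mask_a & mask_b) / len(union) if union else 0.0


def pick_candidates(sketch, items, min_iou=0.5):
    """Return the names of items that match the sketch, best match first.

    sketch: set of (x, y) cells covered by the user's drawing.
    items:  dict mapping an item name to its extracted mask.
    """
    scored = [(shape_iou(sketch, mask), name) for name, mask in items.items()]
    return [name for score, name in sorted(scored, reverse=True)
            if score >= min_iou]
```

Because the label has already narrowed the search, the program only has to rank a modest set of masks like this, which is a far easier problem than understanding the photo.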

Blending the Images

The novel methods used to blend the candidate images together are actually one of the main areas of research for this project. Everything I’ve explained prior to this section isn’t very groundbreaking, while everything related to this section is too complicated and technical to be easily explained. I’ll just say a lot of work goes into making the images not look completely absurd against the selected backgrounds.

Real or Fake?

What I find funny is how many of the comments found around the web regarding PhotoSketch claim that it’s fake. If it were fake, it would be one of the greatest hoaxes of all time, since the research was done at a prestigious university and will also be presented at the ACM SIGGRAPH Asia conference in December.

However, this doesn’t mean the program is as perfect as the video demonstrations and examples published make it seem. Here are some examples from the paper of when the program generates a semantically ridiculous photo montage:

Screen shot 2009-10-09 at 1.49.56 PM

Anything automatically generated will have semantic flaws that create absurd and nonsensical images every so often. The examples provided by the PhotoSketch group are simply examples of when the program successfully does what it’s supposed to do (which is hopefully quite often). Does it always create images that look as nice or make as much sense as the examples? No, but the examples provide a good demonstration of the technology.

Conclusion

PhotoSketch is a pretty amazing idea that deserves all the attention it’s getting. It’s also a taste of what’s to come with regard to computer graphics technology. I’m sure we’re going to see more and more mind-boggling research projects and commercial products in the coming years.

Though the group is still working on an online demonstration, the research group’s website contains the user studies and the research paper.


Image credits: The images used in this article were obtained from the research website and their paper.