PetaPixel

Using Romeo and Juliet to Illustrate the Pitfalls of JPEG Compression

It’s common knowledge that JPEG compression leads to a loss of data, but it’s difficult to really visualize the extent of that loss in a photo. A keen eye will be able to tell a difference, but it’s still hard to quantify it.

Tom Scott wanted to bring the reality home to those who don’t already understand it. So he took the pitfalls of JPEG compression and transferred them from the world of photos to the world of Shakespeare.

As an experiment, he decided to apply different levels of JPEG compression to the text of Shakespeare’s Romeo and Juliet. In short order, “The Tragedy of Romeo and Juliet by William Shakespeare” became “She Uragedy!of Romeo Anb!kulies by Vilmibm Shakfrpease” — and that was only the first level of compression.

On his blog, Scott reprints the famous balcony scene post-compression at “maximum” quality in Photoshop:

O Romep+ Rpldo wiepffnre arr!riov Romep@
Dgoy thz gatggr `me tefusf sgx n`me!

He achieved this by importing the text as a RAW file, then exporting the compressed result as plain text. As you can see, even “maximum” quality isn’t exactly great. Because we can only take in a small number of bytes of information in text form per second, mistakes are obvious. Scott maintains that, in photo form, the errors that arose above would translate into “a minuscule change in colour, undetectable to the eye.”
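
To get a feel for why even a gentle setting nudges letters to their neighbours, here is a minimal Python sketch. It is not Scott’s Photoshop workflow and not a real JPEG codec: it assumes a simplified one-dimensional version of JPEG’s central step (a discrete cosine transform over blocks of 8 values, followed by rounding each coefficient to a multiple of a step size) and leaves out the 2-D blocks, per-frequency quantization tables, colour handling and entropy coding. The names `lossy_roundtrip` and `step` are made up for this illustration.

```python
import math

BLOCK = 8  # JPEG works on 8x8 pixel blocks; this sketch uses 8 bytes in a row


def dct(block):
    """Orthonormal DCT-II of a short list of numbers."""
    n = len(block)
    out = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(block))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * total)
    return out


def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1 / n)
        total += sum(coeffs[k] * math.sqrt(2 / n)
                     * math.cos(math.pi * (i + 0.5) * k / n)
                     for k in range(1, n))
        out.append(total)
    return out


def lossy_roundtrip(text, step):
    """Treat text as byte values, quantize its DCT coefficients, rebuild it.

    `step` is a hypothetical stand-in for a quality setting:
    a larger step means coarser rounding and more damage.
    """
    data = list(text.encode("latin-1", errors="replace"))
    original_length = len(data)
    data += [32] * ((-len(data)) % BLOCK)  # pad the last block with spaces
    rebuilt = []
    for start in range(0, len(data), BLOCK):
        coeffs = dct(data[start:start + BLOCK])
        # The lossy step: snap every coefficient to a multiple of `step`.
        quantized = [round(c / step) * step for c in coeffs]
        # Back to byte values, rounded and clamped to the valid range.
        rebuilt.extend(min(255, max(0, round(v))) for v in idct(quantized))
    return bytes(rebuilt[:original_length]).decode("latin-1")


if __name__ == "__main__":
    title = "The Tragedy of Romeo and Juliet by William Shakespeare"
    for step in (1, 10, 50):
        print(step, repr(lossy_roundtrip(title, step)))
```

With a very small step the text comes back unchanged; as the step grows, byte values drift further from the originals, and in text every drifted byte is a visibly wrong character.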

It began as an experiment he attempted on a whim, but when he was done he decided to bind all of the final products in book form, from “maximum” to “minimum” quality.

For more info on how he performed the compression and how he got around the issues that arose in the process, head over to his blog and read about the experiment from the man himself.

(via Laughing Squid)


Image credits: Photographs by Tom Scott.


 
  • Clayton

    And then he posted jpegs. Fun

  • chubbs

    But how does Shakespeare pronounce “gif”?

    THAT IS THE QUESTION…

  • Ralph Hightower

    One speaker at the local camera club said that every time one opens a JPEG image, the image degrades.
    As a computer programmer, I find that hard to believe. No decent programmer or programming team would rewrite a file if there were no changes to the file contents.
    Now, if the file was changed and the changes were saved, that statement would be believable. But just opening a JPEG to view a photo and then closing it? Nah, the JPEG will not degrade.

  • Jnjy

    JPEG is supposed to work on digital images, not the English language.
    If I were to create something JPEG-like for the English language, I’d remove more and more vowels as compression increases. That would seem to be a better analogy.

  • wickerprints

    The analogy is not apt. Quantized images are information that has some intrinsic noise level; there is shot noise even if the sensor and lens are ideal and the raw data is lossless. A lossy compression algorithm would not necessarily discard information that is perceptually more significant than the noise originally present in the recorded data. By contrast, the works of Shakespeare are noiseless data: the words he wrote are what they are, just as this comment’s words are what they are. Any modification, however slight, is a distinguishable degradation of the original signal.

    The problem of lossy compression is not that the data is irrevocably degraded. The problem is that repeated decompression, modification, and recompression has a multiplicative effect on the error-to-signal ratio. The higher the quality setting you use, the more such cycles it takes before artifacts become noticeable.

  • Opie

    Funny… a few years ago, I did the exact same thing with audio files opened in Photoshop. An MP3 behaves more like a RAW file than text does, so the difference in compression was similarly negligible.

    Still, “remixing” a song in photoshop sure is fun.

  • Jpegger

    Shooting raw equates to spending 5 to 15 mins per shot in Lightroom or similar, trying to get your oversized file to look as good as the jpegs did straight out of camera, then saving it as a jpeg scaled down to 700-ish pixels wide for use on the web. Raw nazis are always arguing about rescuing detail from under- and overexposed shots. Your camera has metering and a highlight clipping warning, so rather than feel like you are a superior being because you shoot raw, just expose correctly in the first place. Believe it or not, jpegs can be photoshopped, retouched, and processed with any number of effects AND still printed large (which most people don’t do anyway) without anybody being able to discern the difference in ANY practical application. Why do photographers insist on looking at their shots under a microscope?

  • Zos Xavius

    because we can! ps: raw is better.

  • Opie

    Sometimes it’s pointlessly satisfying to eke out that extra bit of detail. Other times it’s legitimately beneficial, and might even save your hide once in a blue moon. There are too many reasons TO shoot raw and only one not to: space. Well, that and the pride you seem to take in NOT having shot raw (ironically, the same as that which you accuse us of displaying, only in reverse).

    Either way, if you’re so pleased with your decisions, why take to a public forum to have them corroborated?

    I wonder if you realize you’re subverting our freedom of choice in the same paragraph you’ve used to call attention to the subversion of your freedom of choice.

  • KewlDewd

    5-15 minutes? Really? You clearly have no idea how a RAW workflow works. Working in Lightroom I can take a single photo from untouched RAW to a standard-looking, camera-processed jpeg in about 10-20 seconds. Then, assuming most or all of the shots from a shoot were done in similar lighting (as is common in my shoots), I can copy those settings to the whole shoot in seconds. So you can go from 100 flat RAW files to sharp, contrasty, saturated photos in 30 seconds, easy. For you, apparently, that would take over 16 hours if you average 10 minutes per shot. I think it’s safe to say you’re doing it wrong.

    Also, good luck making a drastic look change to your photos if you’ve shot jpeg. If you shot a photo in RAW and processed it to be very blue, then you decide to change it 180 degrees and make a very warm version you lose ZERO image quality in doing so. Can’t say the same about jpeg.

  • 9inchnail

    Definitely not. That speaker has no idea what he’s talking about. I mean, there are idiots that click the save icon every time they open a file even if they haven’t changed anything.

  • Jpegger

    Of course I don’t have any idea about RAW workflow, I shoot Jpeg. I simply made an opposing point to the article. There is no such thing as the ‘right’ way, just different ways. My point was obviously exaggerated.

  • Jpegger

    Offering an opinion is a far cry from subversion of freedom. And having your opinions corroborated is one of the two main purposes people take to public forums; the other is to argue with people who have the audacity to disagree with you.

  • Malenkov

    So “of course” you’re commenting on a subject about which you know nothing. What productive discourse.

  • Norshan Nusi

    Interesting point here. Basic photo editing software does not provide an image quality setting when saving jpegs, but what about PS or GIMP, which do?

    Does resaving them at 100% (which means the least or no compression) degrade the file?

  • Jnjy

    Hi
    I beg to disagree. JPEG has got nothing to do with sensor noise. Loss in JPEG is indeed due to quantization, which does discard information irrevocably. Signal and noise are treated equally during JPEG compression. You can even argue that SNR will improve after JPEG compression (on a case-by-case basis). Of course, noise and artifacts are different concepts. You can have a sensor-noise-free image that is nevertheless full of artifacts.

    Ever wonder why raster computer-generated graphics (a rasterized Illustrator file, perhaps) still get degraded after a low-quality JPEG encode? I bet the original file ought to be free of sensor noise, as it is not produced by a sensor in the first place.

  • Jnjy

    Quick answer: Yes, it still does degrade the file. It still does lossy compression.

    Long answer: Most (if not all) JPEG encoders do lossy encoding even at 100%. Some have an option to do “JPEG lossless”, which does not degrade the file. When going lossless, though, people usually just use the PNG format.

  • cy_leow

    What a load of BS!

  • Jpegger

    The comments section of an internet blog has never been the epitome of productive discourse. This discussion reminds me of a quote: “He who takes offence when no offence is intended is a fool, and he who takes offence when offence is intended is a greater fool.”
    Especially from anonymous strangers on the internet.

  • Bart Willems

    What you should do is look at the overall structure: lines, paragraphs. Does the character of the type change after several jpg passes? Does a printed page have more color on it? Or less? Do we end up with more or fewer pages?
    Of course the letters (individual pixels) get altered; that’s the point of lossy compression. But the idea is that those changes are not discernible in the overall picture.

  • AutomaticPython

    Anyone shooting JPG must be a MASTER of photography, getting it right 100% at the moment of capture. Incredible! I’m glad they created Raw for the rest of us monkeys then!

  • Opie

    Sure, offering an opinion does not inherently pose a threat to the opposing viewpoint…if it’s properly stated.

    If you’d posed your opinion as a simple case of “RAW’s great for you, but I prefer JPEG,” I don’t think you’d be catching all of this flak. Unfortunately, your statement attempted to belittle RAW users in the same way you’ve apparently felt belittled for using JPEG.

  • http://www.flickr.com/photos/zackhuggins/ Zack

    As with many things, it gets easier with practice. I used to shoot jpeg only, largely because of space and speed, but I would often come across shots that I wanted to “fix” but couldn’t. Eventually I had to photograph an event that I couldn’t afford to botch so I shot that event with RAW. After seeing the amount of latitude I had with editing vs. jpeg, I never went back. And over time, I’ve gotten speedier with edits, (and also better at photography, less to fix in post ;P). I would guess I spend no more than 3 minutes on editing a photo, and average more like 1-2.

    I’m not telling you that you’re wrong to shoot jpeg, because I once thought the same, but it’s nice to have a file that, maybe in a few years I can go back and re-edit if I want to. Assuming I’m better at editing then/have better software/new editing tricks up my sleeve, of course.

  • lidocaineus

    Sure, you can spout off whatever you want in any internet forum. That’s your right, subject to the rules of whatever forum you’re using.

    We also have the right to call you an idiot for discussing something you don’t know both sides to. And by the way, I’m totally for JPEG shooting when appropriate, just like you should shoot RAW when appropriate; knowing when to use one or the other is part of the technical skills of a digital photographer.

  • http://www.flickr.com/photos/zackhuggins/ Zack

    B.S. = Bill Shakespeare?

  • wickerprints

    You obviously have missed the point.

    Quantization as it applies to photographic imaging is simply the means of recording the incoming light from a scene, as opposed to an analog medium such as film. It’s discretized data. My point is that this data has an intrinsic level of noise, and that under PRACTICAL circumstances, a JPEG compression of the raw source data captured by a sensor will not yield any noticeable difference.

    Yes, lossy compression “loses” information. But if applied appropriately, you can’t necessarily TELL that the information has been lost, because the noise level inherent in the image may be greater than losses due to compression. Now do you understand?

    I have pointed this out because I am trying to explain why doing the same thing to text data is a poor way to illustrate the information loss of JPEG encoding, because text data doesn’t have noise. If I take a 1-megapixel photograph and change the value of a single pixel by a slight amount, could you ever find out which pixel was changed if you did not have the source image to compare against? Of course not. If, on the other hand, you changed one letter in a novel or play, you can easily tell simply by reading it, because words have canonical forms.

  • Norshan Nusi

    Thank you for the explanation! ^^

  • Jnjy

    “the noise level inherent in the image may be greater than losses due to compression”

    Hmm. I still don’t get that noise argument. I do understand ADC quantization errors and shot noise (very physical, eh?). But a question in my mind still lingers. If inherent noise is greater than compression noise, what in the world does that say about how JPEG works?

    If a 33-word paragraph loses one vowel, I’d think it’s a typo.
    I’d most likely still b able to understand what the paragraph says. And this paragraph just happens to illustrate that effectively.