processing with High Dynamic Range (2)

This comprehensive and in-depth article about HDR imaging was written by Sven Bontinck, a professional photographer and hobby musician.

A matter of perception.

To be able to use HDR in imaging, we must first understand what dynamic range actually means. Sometimes I notice people mistake the contrast in a picture for its dynamic range. The two concepts are related, but they are not the same. Let me start by briefly explaining how we humans receive information with our eyes and ears. This is important because it influences how we perceive what we see and hear and how we interpret that information.

We all know about the retina in our eyes, where we find the light-sensitive sensors, the rods and cones. The cones provide our daytime vision and the perception of colours. The rods allow us to see at low light levels and provide black-and-white vision. However, there is a third kind of photoreceptor, the so-called photosensitive ganglion cells. These cells give our brain information about the length of day versus the length of night, but they also play an important role in pupillary control. Every sensor needs a minimum amount of stimulation to be able to react. At the same time, every kind of sensor has a maximum amount it may be exposed to. Above that limit, certain protection mechanisms kick in to prevent damage to the sensors.

With our eyes we can see in very low light conditions thanks to the rods. We all recognize what happens when we stay in the dark for some time: our pupils dilate as much as possible and, after a short while, we start to see more where at first we saw almost nothing, although we can hardly distinguish any colours. This is because the low-light-sensitive rods are doing the work at that point. The dilating of the pupils is also an adaptive process that lets more light enter our eyes and lowers our minimum sensitivity threshold.

On the other hand, when we return to normal or brighter light conditions, our pupils constrict again to lower the amount of light that can enter our eyes. When we try to look at the sun without protection, everybody knows it is almost impossible to do so without squinting. Again, this is an attempt to protect our retina by reducing the light even further than our pupils can. The photosensitive ganglion cells help regulate the width of our pupils and work independently of the other photosensitive cells. They are directly connected via the optic nerve to the central nervous system and bypass the visual systems in our brain. Pupil constriction is about three times faster than dilation, which shows how important this protective system must be. It acts very fast and responds to the brightest light that enters our eyes, but it does so all by itself; it is not something we can control consciously, the way we move an arm or a leg.

Protecting our eyes is linked with seeing black and white.

This may seem like a weird statement, but let me explain how it works. Now that we understand the basics of how our eyes and brain work together to sense light, and how they both help to protect our sensors, we can understand what differences in light level will cause.

Since the most important and fastest protection mechanism exists to prevent us from being blinded by a strong light source, the brightest light we see is perceived as bleached-out white, while everything with less intensity or luminance is perceived as darker, depending on its level. Our eyes can see intensity differences of about 1000 to 1 in daylight (compared with digital values, this corresponds to roughly 10 bits of information, because 2^10 equals 1024, which is about the same maximum value). A value of 1 is the darkest value and is perceived as black without any detail, whilst a value of 1000 is seen as pure white. This is the dynamic range we can see at one single moment.
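As a small aside, a minimal Python sketch (purely illustrative, not from the article) shows how a contrast ratio translates into the number of bits needed to cover it:

    import math

    def bits_for_contrast_ratio(ratio):
        # Number of bits needed to represent a brightest-to-darkest ratio
        # as discrete levels: just a base-2 logarithm, rounded up.
        return math.ceil(math.log2(ratio))

    print(bits_for_contrast_ratio(1000))       # 10 -> daylight vision, approx. 1000:1
    print(bits_for_contrast_ratio(1000000))    # 20 -> rod (night) vision, approx. 1,000,000:1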

Everything outside that range is perceived either as too-bright white or as black without any detail. As soon as the light conditions change, our pupils react accordingly and adapt. Of course we have to realise that, if an object looks black at one moment, we can see more detail in it by coming closer and progressively eliminating the influence of the brightest parts, because our pupils can then dilate and let more light enter our eyes. Even holding your hand in the line of sight of the light source, to shield your eyes from that direct light, can reveal details in parts that seemed completely black before.

So it is clear that our perception of light intensity is not absolute, but relative to the amount of light that is reflected from objects or emitted by light sources. At the same time, that perception depends on whether or not a light source is shining directly into our eyes. All of this makes it rather hard to put precise numbers on the dynamic range of human perception. The fact that our rods have a much larger dynamic range than our cones, roughly 1,000,000 to 1, only adds to the complexity of this subject.

The more we look in low-light circumstances, the more the rods take over from the cones, so the dynamic range we see gradually increases as the light becomes darker, but only if there is no light source shining directly into our eyes. The most important things to remember are: one, the perception of light is constantly adapted to what we look at and to the amount of light entering our eyes; two, direct versus indirect light makes a difference, because our rods cannot increase the dynamic range when a direct light source is entering our eyes; and finally, the protection mechanism against too-bright light always dictates the adaptation process and the perception of what our brain will label as white.

Different dynamic ranges to take into account.

This long introduction is necessary to understand that, when using HDR in imaging, different dynamic ranges influence each other and need some matching or translation. Just as we have a human dynamic range, a camera also has a certain range that it can capture within one exposure. Modern cameras measure light with their CCD or CMOS sensor chip by evaluating the differences between the light levels that enter the lens and hit the individual light-sensitive pixels on the chip. These levels are digitized into values that can be stored and manipulated. Comparing with music, this process is the same as what happens in an analogue-to-digital converter (ADC) when an analogue sound enters the microphone, reaches the soundcard and is transformed by the ADC into digital numbers.
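As a minimal sketch of that digitization step (the general idea only, not any camera's or soundcard's actual implementation), quantizing a normalised level into an integer code of a given bit depth looks like this:

    def quantize(level, bits):
        # Map a normalised level (0.0 .. 1.0) onto an integer code of the
        # given bit depth, the way an ADC or a sensor readout digitizes it.
        codes = 2 ** bits
        return max(0, min(codes - 1, round(level * (codes - 1))))

    print(quantize(0.5, 8))    # 128 out of 0..255
    print(quantize(0.5, 14))   # 8192 out of 0..16383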

The kind of image file everybody knows is the standard jpg file. Simply put, this file contains the light information of the three primary colours: red, green and blue. These three separate colours contain nothing but individual series of numbers representing the digitized brightness of each colour pixel; they are essentially three separate grey-scale images. Combined, they form a colour image with information about colour, brightness and, as an indirect consequence of combining those two types of information, saturation.

The well-known jpg file stores that information with a dynamic range of 8 bits per colour, which combined makes a 24-bit file. Since each colour has 8 bits to reproduce everything from black to white, there are 256 different values (2^8 = 256) that can be used for each primary colour level. The combination of the three colours produces 256 x 256 x 256 = 16,777,216 possible colours. At first sight this seems a lot, but we must not forget that this number of combinations does not stand for a very high dynamic range as well. The dynamic range is dictated only by the number of bits per colour, not by the combination of the three primary colours.
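The arithmetic above can be restated in a few lines (illustrative only):

    bits_per_channel = 8
    levels = 2 ** bits_per_channel    # 256 values from black to white per channel
    colours = levels ** 3             # red x green x blue combinations
    print(levels, colours)            # 256 16777216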

Back in the years when I was studying photography, I learned that a trained eye can distinguish about 130 to 140 different values from black to white on high-quality photo paper under normal, good lighting conditions. Anything with fewer values is seen as separate bands in a gradient from black to white; any number of values higher than that is sufficient to create a fluid gradient without visible banding.

This means that a dynamic range of 8 bits (per colour) is sufficient to create images that contain all information from black to white: 8 bits means 256 level differences that can be stored or reproduced, enough to cover the 130 levels we can differentiate between. However, let's not forget that this is the range a jpg file can contain, not the camera's capture range, nor our human dynamic range. The jpg range is determined by the bit depth the standard is set to: 8 bits. As I wrote before, a modern camera always captures a (much) bigger difference between the brightest and the darkest value that reaches the pixels of the sensor. The reason this is important is that, although 8 bits at first seems sufficient to reproduce everything, it is no longer sufficient once we have to do calculations with that range.

Bits do make a big difference.

Working with 8-bit images causes problems when we have to shift the data for specific reasons. Shifting the colour (brightness) data upwards to brighten details in the shadows can cause banding, because the information with low values is recalculated to higher values. This way we start seeing more detail in the darkest parts; the curves function in Photoshop is the most obvious example of that technique. Compare this with what a compressor does to music: it raises low-level sounds to a higher level so we can hear them more clearly. Back to imaging, there is one problem when we do this on 8-bit images: banding can occur if we go too far with our correction. The reason is simple. If we think again of the roughly 130 to 140 level differences we can differentiate in a picture, the number of distinct levels left after stretching the low values towards the higher ones can fall into the range where banding becomes visible. If, for example, a dark value of 13 is recalculated to a value of 15, we have just created a gap of two values between them. Let's compare this with a 7-bit image to explain the problem.

To create the same minimum and maximum perceived light intensity with 7 bits per colour, the steps between the levels have to become bigger. This is because we only have 128 levels left to fill a gradient from black to white, not enough to make it look fluid to our eyes. Now that we understand the importance of having enough bits, it is easier to understand why HDR images are useful when we want to shift colour and/or brightness data to manipulate or correct images afterwards. The higher the dynamic range a camera can capture and store, the more we can stretch the levels before banding occurs.
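A minimal sketch of that stretching, assuming an arbitrary linear gain of 1.5 on an 8-bit channel, makes the gaps visible:

    def stretch_shadows(value, gain=1.5):
        # Brighten the shadows of an 8-bit channel with a simple linear gain.
        # Only 256 integer codes exist, so consecutive inputs no longer map to
        # consecutive outputs; the skipped codes are what we see as banding.
        return min(255, round(value * gain))

    for v in (10, 11, 12, 13, 14):
        print(v, "->", stretch_shadows(v))
    # 10 -> 15, 11 -> 16, 12 -> 18, 13 -> 20, 14 -> 21  (17 and 19 are skipped)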

We are very sensitive to banding and, when it occurs, people see it immediately. There is a second big advantage that HDR images give us: if a higher range of levels is captured, we can easily manipulate the image afterwards, provided we actually use the whole bit depth in our imaging program. With classic 8-bit images, the internal processor of the camera decides how the captured range is recalculated into those 8 bits (tone mapping), according to your own manual exposure settings or to some automatic or semi-automatic program settings. The camera stores that already reduced bit-depth range in the file on your memory card or computer. You cannot go back to the camera's original, higher bit depth after that.

If you use RAW files, you get the complete internal dynamic range the camera is capable of capturing. To compare this again with music: think about recording in 16-bit CD quality and afterwards trying to manipulate it as 24 bit in a DAW. You gain no extra information by doing so. Interpolation will fill in the gaps, but information that is not there cannot be (re)calculated; it can only be smoothed to prevent stepping. If instead you record your music at 24 bit, you have access to a much bigger dynamic range and a more detailed file.
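A minimal sketch of that audio analogy (using an arbitrary sample value and ignoring dithering) shows that promoting a 16-bit capture to 24 bit does not bring back the detail a true 24-bit capture would have had:

    def quantize_sample(x, bits):
        # Quantize a sample in the range -1.0 .. 1.0 to a signed integer code.
        max_code = 2 ** (bits - 1) - 1
        return round(x * max_code)

    x = 0.123456789                        # some arbitrary analogue level
    captured_16 = quantize_sample(x, 16)   # what a 16-bit recording stored
    promoted_24 = captured_16 * 256        # re-expressed as 24 bit by bit-shifting
    captured_24 = quantize_sample(x, 24)   # what a true 24-bit recording would store
    print(promoted_24, captured_24)        # they differ: the lost detail is gone for good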

Matching those different dynamic ranges.

Now that we understand a few bits about bits, pun intended, we start to see that each of these ranges, be it the camera's dynamic range, the dynamic range of the medium we use to look at the image, or the constantly changing range of our human vision, needs some sort of standardisation to be matched to the others. The most important thing to remember is that our human way of processing light determines how things are perceived, no matter what clever calculations or conversions are done to match the different ranges.

If, for example, we use an LED screen with a contrast ratio of 1,000,000 to 1 (some modern screens have such a dynamic range), this far exceeds the roughly 1000 to 1 we can see in daylight, and we will lose information because the image information is spread over a wider range than we can take in at once. At first sight this will look like a screen with very high contrast, but if you lose information because of it, the ranges are not matched very well. On the other hand, if we capture an image with more than 8 bits and tone-map that information into the 8-bit range, it will contain more information than we would normally see. This can be a good thing, as long as we don't exaggerate.

Pushing 14 bits of information into an 8-bit range gives a very greyish, unsaturated image, albeit one with much more information about the intensity differences that were present in the scene. Such an image often has to be enhanced by raising the contrast and the saturation; otherwise it no longer looks natural, because we are used to seeing certain contrast ratios between the objects we look at. What we would normally see as a black shadow still shows a lot of detail, and that can look weird.
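A minimal sketch of the crudest possible (purely linear) tone map from 14 bits down to 8 bits shows why the shadows flatten out; a real camera or raw converter uses far more sophisticated curves:

    def tone_map_linear(value_14bit):
        # Squeeze a 14-bit value (0..16383) into the 8-bit range (0..255) linearly.
        return round(value_14bit * 255 / 16383)

    # Two shadow levels that clearly differ in the 14-bit capture collapse onto
    # the same 8-bit code, which is why the result looks flat and greyish.
    print(tone_map_linear(40), tone_map_linear(80))   # 1 1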

If used in moderation, however, it can enrich a picture to some extent. The effect is most visible in highlights, such as bright cloud formations, and in shadow areas, where many more nuances and details become visible that would otherwise be bleached out in the white parts or vanish into black in the dark parts.


Comments

  1. This was really waste of time. Come on Bootsie, there is a ton of articles explaining HDR images. Above article did not bring anything new and mentions audio in just one sentence. I thought you are about audio in this blog and your work.

  2. Interesting read. I’m still wondering how this adapts to audio and in what form the SlickHDR will manifest itself. Will it be comparable to a super adaptive limiter/compressor? And what makes HDR processing so special that it’s effect can’t be achieved in traditional ways?
    I’m thinking the part where he talks about having a big dynamic range to begin with (RAW file) doesn’t really translate into audio mixing. In most cases we have 24bit files and thats it. So it basically would have to be a generated/”fake” high dynamic range, wouldn’t it?
    With the photoshop curves I’m always thinking about waveshaping in audio. – Obviously you have the time-component in your saturation approach so that’s not the same. I’m always having an audio version of the shadow/highlight tool in photoshop in mind, with your stateful approach being the smart “radius” control. No idea if this even makes sense…..
    But how all this will work together in the end – and more importantly what sonic effect will be – is still a mystery to me.
    Anyway… looking forward to hearing more about this. Keep up the very good work!

  3. While it may not seem new to you, it does explain what this new project is attempting to do. The fact that you are here means that you are using Bootsie’s plugins. Since you got all those amazing plugins for free, at the very least show some respect and keep your negative comments to yourself.

    • I deeply respect the work of the man. I appreciate even more the in-depth analysis of problems that he tries to overcome. The articles are really profound and show that it takes a good amount of thinking outside the box to even grab some concepts. But in this article, which is NOT written by Bootsie, there is nothing directly related to audio. Instead it’s quite boring since the concept of HDR photos is (excusez le mot) sooo 2012 ;) and I’ve read far shorter explanations with more content. My point is: VOS is about sound, so even if some explanation from other field is needed, does it really need 2 parts already? Of course this is personal blog run by independant man, so he’s free to publish whatever. But then again, so is the comments section- to express one’s opinions, even if they are not always positive.
      Once again- never enough respect for the whole VOS team, never enough thankyou’s for the best plugins. I guess I’ll just wait for the new plugin and skip the articles :)

  4. Reblogged this on How to Produce Electronic Music.

Trackbacks

  1. […] all about balancing the local vs. the global contrast – as being perceived by human. And as this comprehensive article about HDR imaging (written by Sven Bontinck) already explained – that is a complex matter of perception within […]

  2. […] processing with High Dynamic Range (2) […]
