processing with High Dynamic Range (3)

This article explores how several techniques from HDR imaging can be carried over into the audio domain.

The early adopters – game developers

The recently cross-linked article “Finding Your Way With High Dynamic Range Audio In Wwise” gives a good overview of how game developers have adopted the HDR concept in recent years. Mixing in-game audio has a challenge of its own: arbitrary audio events occur at unpredictable times and must be mixed in real time while the game is actually being played. When we mix off-line instead (as in a typical song production), we work towards a static output format and of course don’t have this problem.

So it comes as no surprise that the game developers’ approach turned out to be a rather automatic, adaptive in-game mixing system. It gates quieter sources depending on the overall volume of the entire audio and also applies some overall compression and limiting. The “off-line mixing audio engineer” can always do better, and if a mix is really too difficult, even the arrangement can be fixed by hand during the mixing stage.
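To make the gating idea more concrete, here is a minimal sketch of a sliding "HDR window" in Python. It is my own simplification and not the algorithm used by Wwise or any other engine: the function name `hdr_window`, the window width and the `min_top_db` floor are all assumptions chosen for illustration. The loudest source sets the top of the window, anything that falls below the bottom of the window is gated, and everything else is shifted so that the window top maps to full scale.

```python
def hdr_window(levels_db, window_db=40.0, min_top_db=60.0):
    """Map source loudness values (dB) to output levels (dB relative to full scale).

    levels_db  -- dict of source name -> current loudness in dB (hypothetical scale)
    window_db  -- width of the audible window below the loudest source
    min_top_db -- the window top never drops below this value, so that a very
                  quiet scene is not pushed all the way up to full scale

    Returns a dict of source name -> output level in dB, or None if the
    source is gated because it lies below the window.
    """
    if not levels_db:
        return {}
    # The window top follows the loudest active source.
    top = max(max(levels_db.values()), min_top_db)
    bottom = top - window_db
    return {
        name: (level - top if level >= bottom else None)
        for name, level in levels_db.items()
    }


scene = {"explosion": 100.0, "dialogue": 70.0, "footsteps": 50.0}
print(hdr_window(scene))
```

While the explosion is sounding, the footsteps fall out of the window and are dropped, and the dialogue ends up 30 dB below full scale. A real system would additionally smooth the movement of the window over time and apply the compression and limiting mentioned above.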

There is a further shortcoming. From my point of view, the translation from “image brightness” to “audio loudness” is too simplistic and reductive. It might work to some extent, but since the audio loudness race emerged we already have clear proof of how utterly bad the result can sound. At the very least, many more details and effects have to be taken into account to do better in terms of dynamic range perception.

The plain 1:1 translation – focusing on bit resolution

In their paper “High Dynamic Range Simultaneous Signal Compositing, Applied To Audio”, R. Janzen and S. Mann presented a sort of 1:1 translation of the HDR concept, which deals solely with the intermediate extension of the effective bit depth in an ADC stage during recording. They also presented the compositing technique known from combining different LDR exposures. In audio, however, the exposures must be captured simultaneously and can’t be shot one after another, because our hearing is very sensitive to out-of-phase recordings and the resulting equalization.

With such bit resolution extensions one can imagine, just as an example, sampling a signal with 200 dB of dynamic range using four 16-bit ADCs in parallel. This might appear rather esoteric in a well-controlled audio recording situation, where we can just stick to a typical 24-bit audio converter, but there seem to be serious applications in RADAR or SONAR processing. One could also imagine a mobile audio recorder with such a high dynamic range that it no longer needs the typically ugly-sounding automatic input gain leveler, even in difficult recording situations.
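The following Python sketch illustrates the basic idea of compositing simultaneous, gain-staged captures. It is a deliberately naive "pick the highest-gain reading that did not clip" scheme and not the method from the Janzen and Mann paper. The function names, the two gain stages (0 dB and 40 dB) and the clip margin are assumptions made for illustration, and the ADC is simulated as an ideal quantizer. The helper `ideal_dynamic_range_db` uses the standard rule of thumb of about 6.02 dB per bit plus 1.76 dB for a full-scale sine.

```python
def ideal_dynamic_range_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer driven by a full-scale sine."""
    return 6.02 * bits + 1.76


def quantize(x, bits=16):
    """Simulate an ideal ADC: round to the bit grid and clip at full scale."""
    scale = 2 ** (bits - 1)
    code = round(x * scale)
    code = max(-scale, min(scale - 1, code))
    return code / scale


def composite(sample, gains_db=(0.0, 40.0), bits=16, clip_margin=0.98):
    """Capture one input sample through several gain stages at the same instant
    and keep the highest-gain reading that stays below the clip margin."""
    stages = sorted(gains_db)
    # Fall back to the lowest-gain stage if every stage clips.
    lowest_gain = 10 ** (stages[0] / 20)
    best = quantize(sample * lowest_gain, bits) / lowest_gain
    for gain_db in stages:
        gain = 10 ** (gain_db / 20)
        reading = quantize(sample * gain, bits)
        if abs(reading) < clip_margin:
            best = reading / gain  # undo the analog gain digitally
    return best


print(ideal_dynamic_range_db(16))   # single 16-bit converter
print(quantize(1e-5))               # a very quiet sample vanishes in one ADC
print(composite(1e-5))              # the 40 dB stage still resolves it
print(composite(0.5))               # a loud sample comes from the 0 dB stage
```

A very quiet sample that a single 16-bit converter rounds to zero is still resolved by the high-gain stage, while a loud sample that clips there is taken from the low-gain stage. In practice the gain stages would have to be matched very precisely in level and phase, which is exactly why the captures need to happen simultaneously.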

A perceptual approach – dealing with perceived dynamics

If I had to boil down the HDR imaging concept into just one short sentence, I would say it’s all about balancing local against global contrast as perceived by humans. And as this comprehensive article about HDR imaging (written by Sven Bontinck) already explained, this is a complex matter of perception within our visual system, which includes both the eye and the brain. Within our hearing system this is also a matter of perception, and we usually call that a “psychoacoustic” effect.

Psychoacoustics heavily determines how we actually perceive transient vs. steady-state signals, how our hearing depends on frequency, why ear fatigue and masking effects occur, and how our hearing copes with and takes advantage of overtones, just to name some of the dimensions. As a side note, this is also the basis for designing lossy audio encoders (such as MPEG-1 Layer 3), which are capable of discarding certain audio content without noticeable artifacts. In order to present a well-balanced dynamic range impression to the ear, it must be figured out how these different dimensions affect and interact with each other.

Today, the basic DSP building blocks and patterns one needs to realize such a perceptual approach are not only available but also well understood: overtone generation, transient management, upward and downward compression/expansion, parallel processing and look-ahead techniques, all while taking frequency dependency into account. Some limited and specific combinations have already been implemented in broadcast processors and audio exciters, just to name two.
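As a small illustration of how two of these building blocks combine, the Python sketch below models parallel processing with a downward compressor purely at the level of static gain curves. It is only a sketch under my own assumptions (threshold, ratio and make-up gain are arbitrary, the dry and compressed paths are assumed to sum coherently, and attack/release behaviour is ignored), not a description of any particular product.

```python
import math


def compressor_gain_db(level_db, threshold_db=-30.0, ratio=4.0):
    """Static downward-compressor curve: gain change in dB (always <= 0)."""
    over = level_db - threshold_db
    if over <= 0:
        return 0.0
    return -over * (1.0 - 1.0 / ratio)


def parallel_compress_level_db(level_db, threshold_db=-30.0, ratio=4.0,
                               makeup_db=12.0):
    """Output level (dB) of the dry signal summed with a compressed copy
    that has make-up gain applied (coherent, in-phase sum assumed)."""
    dry = 10 ** (level_db / 20)
    wet_db = level_db + compressor_gain_db(level_db, threshold_db, ratio) + makeup_db
    wet = 10 ** (wet_db / 20)
    return 20 * math.log10(dry + wet)


for level in (-60.0, -30.0, 0.0):
    lift = parallel_compress_level_db(level) - level
    print(f"input {level:6.1f} dB -> lifted by {lift:4.1f} dB")
```

With these settings a quiet passage at -60 dB is lifted by about 14 dB, while a full-scale peak is lifted by only about 2 dB. In other words, the parallel path acts like an upward compressor that raises low-level detail and leaves the peaks largely alone.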

Comments

  1. sounds pretty exciting-I can’t wait till Bootsy produces the first semi-automatic mixing plug-in
    -meaning that within parameters guided by the Mixer the plug-in has some Artificial Intelligence
    to produce certain “psychoacoustic” effects as determined by the mixer.

    At one certain point-
    many of the optimizations/choices that a mixer must perform can be done by software-

    For example- tighten bass with kick by analyzing spectrum frequency of kick and bass and eqing different
    complementary areas of the frequency to have them lock themselves into a puzzle.

    further adding of harmonics/distortion in different areas to have them not overlap but rather complement each other in the frequency spectrum

    adding a chain of compressors/expanders etc. in series and/or parallel to side chain the kick and bass-either and/or relative to their transients or the body.

    Most mixers have a certain approach to their workflow which can be automated eventually with software such as
    first working on the kick bass then vocals
    balancing then sweetening
    working on highlighting certain parts or tracks for certain parts of the song
    final automation -like side chaining certain frequencies to make room for what one wants to emphasize at that particular point in the music or some effect splash etc.

    The software can present different options or ‘presets’ for each defined scenario- finetuning will
    always boil down to the mixer’s taste

    All these routine tasks can be automated leaving the mixer with more time to be creative in his choices.

    Therefore if all these procedures are already well understood and available like Bootsy says-

    “Today, the basic DSP building blocks and patterns one needs to realize such a perceptual approach not only are already available but also well-understood: Whether that’s overtone generation, transient management, up- and downward compression/expansion, parallel processing, look-ahead techniques while dealing with its overall frequency dependency. Some limited and specific combinations were already combined and implemented into broadcast processors or audio exciters, just to name the two”

    Therefore the next step will be implementation of all these techniques each one mutually shaping
    and influencing each other semi-automatically (the mixer will always choose the final effect/sound/product)

    The future is lookin’ bright!

  2. bootsy please dont make a semi-automatic-mixing plugin whatsoever, that whole thing is nonsense

    • Why do you think it is nonsense?
      Every aspect of music has changed through advancements in technology. More powerful tools
      will only help improve one’s creativity and productivity.

      Using the modern DAW tools, soft synths, plug ins etc. has allowed music production to grow leaps and bounds.
      If one has purist tendencies one can always choose not to use the technology (people old enough
      still remember how much effort was needed to cut tape to make edits and to calibrate the machines-today we, arguably, have the ‘tape’ sound without the hassle.)

      Look at it this way-if one has 5 hours to mix a song and 4 hours are taken to adjust frequency and phase, balance the tracks and tune and quantize them and or enhance them to present day professional standards…then with the last hour one is left for creative mixing automation, wouldn’t it be better to have almost all the 5 hours to try different flavors and styles and arrangements?

      This would be like having a trustworthy assistant engineer who would offer his opinions on the mix. (One
      always has the final say how to use abuse and creatively ignore or implement any software ‘automatic’ mix.)

      The world is heading towards BIG changes and we, in the music community, should be on the forefront of these changes to raise the bar and expand the quality of art.

      respectfully,
      Rachel

      • Well Said Rachel!

      • Kyle McComb says:

        Personally, I don’t want an automatic mixer plugin, mostly because I don’t think it would work great. To me, stuff like that never does. I am a bit of a purist, to be fair, but my experience is having a bunch of “special” plugins doesn’t do much to improve the mix. You can have as many enhancers/”loudness” plugins/whatever as you want, but it doesn’t make mixing any easier. (I have to admit I have a weakness for analog-modeling plugins, though. I like grit.) You will always have a crowd that just wants it to be easy, and I think marketing anything as an “automatic mixer” is a disservice to that crowd. Mixing will never happen without effort; if it does, we’ll face the unique problems of A. no jobs for audio engineers and B. every mix sounding the same, which scares me even more than the first.

        As a side note, I disagree with your tape-cutting to modern editing analogy, only because that’s making a process less time consuming, not making a process unnecessary.

        Do not get me wrong, I would love to see some interesting transient-aware limiters or multiband warmers, things to balance out frequency bands are fine. Just they’re not going to happen automatically AND work well. If they do, I applaud Bootsy, but I doubt it.

        Basically, read kohugaly’s comment, because I agree with that 1000%. I am certainly excited about this as a new technology, but I think you ask too much of it.

    • It is a semiautomatic mixer in the same way a compressor is a semiautomatic volume leveler. I mean… although compressors are (technically speaking) made to automate volume of an input, they are not really an alternative to manual volume automation. I think with HDR audio it will be the same – not a real substitution, but a different new “color”. I believe there will be many situations where HDR will be counterproductive and vice versa…

  3. I think it could work to use 2 mics – one with very high max SPL, and one with very low noise floor. Of course the capsules would be placed close as possible, perhaps with some further phase adjustment if necessary. Then use software to automatically mix between the two mics to obtain a recording which captures both the very loud and very soft.

  4. rachel and everyone else, thanks for your reply. with all due respect, it’s just that semi automatic plugins always sound like a one-knob-trick thing.
    i would rather see something REALLY NEW out of an HDR bootsy design than something that is just BETTER
    All VoS plugins are great but not quite unique at the end of it; i bet he can come up with something that no other plugins or hardware really really does.

  5. I think that I would not spend any energy hoping for Bootsy to NOT do something. There is a track record here that says to me, “just wait and see. This is not someone who tends towards frivolous behaviors.” So, while he may be teasing us, he is probably “in process” too; a sort of “mixing pot of thought” if you will.
