artificial reverberation: from mono to true stereo

“True stereo” is a term used in audio processing to describe a recording, processing or playback technique that accurately represents the spatial location of sound sources in the stereo field. In true stereo, the left and right channels carry distinct, separate audio information that reflects where the sound sources actually were in the recording environment.

This is in contrast to fake or pseudo stereo, where the stereo image is created through artificial means, such as phase-shifting techniques that merely create the impression of stereo. True stereo is generally considered superior, as it provides a more natural and immersive listening experience and allows the listener to better locate and identify sound sources within the stereo field. In the domain of acoustic reverberation, this is essential for the perception of envelopment.
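To make the pseudo-stereo idea concrete, here is a minimal sketch (my own illustration, not any particular product’s method): two channels are derived from a mono signal by mixing in a short delayed copy at opposite polarity per channel, a classic complementary comb-filter trick. All names and values are hypothetical.

    import numpy as np

    def pseudo_stereo(mono, sr, delay_ms=12.0, mix=0.5):
        """Fake a stereo image from mono: each channel gets the same short
        delayed copy, but with opposite sign, yielding complementary combs."""
        d = int(sr * delay_ms / 1000.0)
        delayed = np.zeros_like(mono)
        delayed[d:] = mono[:len(mono) - d]      # short delay line
        left = mono + mix * delayed             # comb peaks on the left ...
        right = mono - mix * delayed            # ... complementary notches right
        return left, right

Summing left and right back together cancels the delayed part again, which keeps the trick reasonably mono-compatible – but note that it provides colouration rather than genuine localisation cues.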

Artificial reverberation has come a long way since its early beginnings. The first mechanical devices for generating artificial reverberation, such as spring or plate reverbs, were initially only available as mono devices. Even when two-channel variants emerged, they usually summed to mono internally or processed left and right in entirely separate signal paths, known as dual mono processing. In a plate reverb, for example, a two-channel output was typically achieved simply by mounting two pickup transducers on the very same plate.

The first digital implementations of artificial reverberation did not differ much from the mechanical ones in this respect. Quite common was summing the inputs to mono and independently tapping two signals from a single reverb tank to obtain a two-channel output. Then, explicit early reflection models were added, typically processed separately for left and right and merged into the outputs later to preserve a basic representation of spatial information. Sometimes even the first reflections were just taken from a (summed) mono signal. The Ursa Major 8×32 from 1981 is a good example of this design pattern. Later designs became more sophisticated, and even today it is common to distinguish between early and late reverberation in order to create a convincing impression of immersion.
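A minimal sketch of that design pattern may help. In the toy code below (my own illustration – a single feedback comb stands in for the real multi-delay tank, and all delay times and gains are made up), the inputs are summed to mono, the tank is tapped at two different points for left and right, and separately rendered early reflections are merged into the outputs:

    import numpy as np

    def early_digital_reverb(in_l, in_r, sr):
        """Mono-summed tank with two taps plus per-channel early reflections."""
        n = len(in_l)
        mono = 0.5 * (in_l + in_r)                 # sum inputs to mono
        dly, fb = int(0.057 * sr), 0.7             # feedback comb as toy "tank"
        buf = np.zeros(n + dly)
        for i in range(n):
            buf[i + dly] = mono[i] + fb * buf[i]   # y[n] = x[n] + fb * y[n-dly]
        off = int(0.013 * sr)
        tap_l = buf[dly:]                          # tap 1 feeds the left output
        tap_r = buf[dly - off:dly - off + n]       # tap 2, elsewhere, the right
        d1, d2 = int(0.011 * sr), int(0.017 * sr)  # separate early reflections
        er_l = np.zeros(n)
        er_l[d1:] = 0.5 * in_l[:n - d1]
        er_r = np.zeros(n)
        er_r[d2:] = 0.4 * in_r[:n - d2]
        return er_l + 0.5 * tap_l, er_r + 0.5 * tap_r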

However, ensuring proper sound localisation through early reflection models is a delicate matter. First and foremost, a real room does not have a single reflection pattern but a vast variety of them, depending on the actual location of the sound source and the listening position in that room. A true-to-life representation would therefore require a whole set of individual reflection patterns per sound source and listening position in the virtual room. As far as I know, the VSL MIR solution is the only one that currently takes advantage of this, and at enormous technical effort.

Another problem is that first reflections can also be detrimental to the sound experience. Depending on their frequency content and their delay relative to the direct signal, they can mask the direct signal and degrade its phase coherence, so that the overall sound becomes muddy and lacks clarity. This is one of the reasons why a real plate reverb is loved so much for its clarity and immediacy: it simply has no initial reflections in this range. As a side note, the epicPLATE implementation accurately models this behaviour by utilizing a reverberation technique that completely avoids discrete reflections (delays).

Last but not least, in a real room there is no clear separation between the first reflections and the late reverberation. It is all part of the same reverberation that gradually develops over time, starting with just an auditory event. This also means that there is no clear distinction between events that can be located in space and those that can no longer be identified – this, too, evolves continuously over time.

A good example of how to realise digital reverb without this kind of separation between early and late reverberation, and in “true stereo” at the same time, was impressively demonstrated by the Quantec QRS back in the early 80s. Its ability to accurately reproduce stereo was one of the reasons why it became an all-time favourite not only in music production but also in post-production and broadcasting.

Artificial reverberation is full of subtleties and details, and one might wonder why we can perceive them at all. In the end, it comes down to the fact that in the course of evolution there was a need for such fine-tuning of our sensory system. It was a matter of survival, important for all animal species, to immediately recognise at all times: what is it, and where is it? The entire sensory system is designed for this and even combines the different sensory channels to answer these two questions at all times. Fun fact: this is exactly why visual cues can have a significant impact on what is heard, and why blind tests (in both senses of the word) are so important for assessing certain audio qualities. See also the “McGurk Effect” if you are interested.

Have fun listening!

The TesslaSE Remake

There were so many requests to revive the old and rusty TesslaSE, which I had already moved into the legacy folder some time ago. In this article I’m going to talk a little bit about the history of the plugin and its upcoming remake.

The original TesslaSE audio plugin was one of my first DSP designs aiming at a convincing analog signal path emulation, and it was created 15 years ago! Its release info claimed to “model pleasant sounding ‘electric effects’ coming from transformer coupled tube circuits in a digital controlled fashion”, which basically refers to adding harmonic content, some subtle saturation and spatial effects to the incoming audio. In contrast to the static waveshaping approaches quite common at that time, those effects were already inherently frequency dependent and managed within a mid/side matrix underneath.

(Later on, this approach evolved into a true stateful saturation framework capable of modeling more than just memoryless circuits, and the TesslaPro version took advantage of audio transient management as well.)
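The actual TesslaSE code has never been published, but the basic idea can be sketched as follows: encode to mid/side, treat each component with a frequency-dependent non-linearity (here a one-pole band split with a tanh shaper as placeholder), and decode back. Everything below is illustrative, not the plugin’s algorithm:

    import numpy as np

    def onepole_lowpass(x, sr, fc):
        """One-pole low-pass, used here to split the band before saturation."""
        a = np.exp(-2.0 * np.pi * fc / sr)
        y = np.zeros_like(x)
        for i in range(len(x)):
            y[i] = (1.0 - a) * x[i] + a * (y[i - 1] if i else 0.0)
        return y

    def ms_frequency_dependent_sat(left, right, sr, drive=2.0):
        """Saturate mid and side separately, and only their low bands hard."""
        mid, side = 0.5 * (left + right), 0.5 * (left - right)  # M/S encode
        for sig in (mid, side):
            lo = onepole_lowpass(sig, sr, 600.0)   # lows hit the shaper ...
            hi = sig - lo                          # ... highs stay clean
            sig[:] = np.tanh(drive * lo) / drive + hi
        return mid + side, mid - side              # M/S decode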

This design was also utilized to suppress unwanted aliasing artifacts, since flawless oversampling was still computationally expensive at that time. Offering zero latency on top, TesslaSE always had a clear focus on being applied over the entire mixing stage, providing all those analog signal path subtleties here and there. All later revisions stuck to the very same concept.

The 2021 remake, TesslaSE mkII, won’t change that either, but just polishes what’s already there. The internal gain staging has been reworked so that everything appears gain compensated to the outside and is dead easy to operate within a slick, modernized user interface. The transformer/tube circuit modeling also got some updates to appear more detailed and vibrant, while all non-linear algorithms are now oversampled for additional aliasing suppression.
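Oversampling a non-linear stage is a standard recipe and easy to sketch. The version below is a generic illustration using scipy, not the plugin’s actual filter design: upsample, apply a static shaper, and band-limit on the way back down:

    import numpy as np
    from scipy.signal import resample_poly

    def oversampled_shaper(x, factor=4, drive=1.5):
        """Run the waveshaper at factor-times the rate so newly created
        harmonics land below the raised Nyquist instead of aliasing,
        then filter and decimate back to the original rate."""
        up = resample_poly(x, factor, 1)               # upsample + anti-imaging
        shaped = np.tanh(drive * up) / np.tanh(drive)  # non-linear stage
        return resample_poly(shaped, 1, factor)        # anti-alias + decimate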

Personally, I really enjoy the elegant sound of the update!

TesslaSE mkII will be released by the end of November for PC/VST under a freeware license.

released: SlickHDR

SlickHDR is a “Psychoacoustic Dynamic Processor” which:

  • balances the perceived global vs. local micro dynamics of any incoming audio.
  • creates a rich-in-contrast, detailed and clearly perceived image which translates far better across different listening environments.
  • provides a convenient workflow: simply adjust the three dynamic processors to show roughly the same load.
  • offers further detailed control over overall tone and release time behavior.

The stunning UI artwork and all renders were done by Patrick once again. Made with love in Switzerland – as he said!

SlickHDR is a freeware VST audio plug-in for Windows x32 and you can download a copy right here: >>> DOWNLOAD <<<


SlickHDR – final teaser & release info

SlickHDR is a “Psychoacoustic Dynamic Processor” which:

  • balances the perceived global vs. local micro dynamics of any incoming audio.
  • creates a rich-in-contrast, detailed and clearly perceived image which translates far better across different listening environments.
  • provides a convenient workflow: simply adjust the three dynamic processors to show roughly the same load.
  • offers further detailed control over overall tone and release time behavior.

Technically speaking, SlickHDR contains a coupled network of three dynamic processors, two of them running in a “stateful saturation” configuration and one based on look-ahead processing.

Fixed amounts of the unprocessed signal are injected into the network at several specific points and also mixed back into the network’s output. Being networked, all processors interact heavily with each other, and this is utilized to cope with a wide variety of sound (sic!) and to balance the perceived audio dynamic range.
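The exact topology isn’t disclosed, so the following sketch is purely illustrative: three simple feed-forward compressors chained together, with fixed amounts of the dry signal injected between the stages and into the output. All constants and connection points here are guesses:

    import numpy as np

    def env_follower(x, sr, att_ms=5.0, rel_ms=120.0):
        """Peak envelope with separate attack/release times."""
        att = np.exp(-1.0 / (sr * att_ms / 1000.0))
        rel = np.exp(-1.0 / (sr * rel_ms / 1000.0))
        env, e = np.zeros_like(x), 0.0
        for i, v in enumerate(np.abs(x)):
            a = att if v > e else rel
            e = a * e + (1.0 - a) * v
            env[i] = e
        return env

    def compress(x, sr, thresh=0.3, ratio=3.0):
        """Plain feed-forward compressor: gain = (T/env)^(1 - 1/ratio)."""
        env = np.maximum(env_follower(x, sr), 1e-9)
        gain = np.where(env > thresh, (thresh / env) ** (1.0 - 1.0 / ratio), 1.0)
        return gain * x

    def slick_network(x, sr, dry=0.25):
        """Three coupled stages with fixed dry-signal injection points."""
        s1 = compress(x, sr)                    # stage 1
        s2 = compress(s1 + dry * x, sr)         # inject dry before stage 2
        s3 = compress(s2 + dry * x, sr)         # ... and before stage 3
        return s3 + dry * x                     # dry also mixed into output

The real plugin will differ in every detail, but the point stands: because each stage sees the previous stage’s output plus dry signal, the three processors constantly rebalance each other.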

The stunning UI artwork and render were done by Patrick once again. Made with love in Switzerland – as he said.

SlickHDR will be available around the end of January 2014 as a freeware VST audio plug-in for Windows x32.

that’s why they call me slick


UI artwork and render done by Patrick.

what I’m currently working on – Vol. 11

I’m currently assembling all the UI components designed by Patrick for our brand new SlickHDR processor. After that, a manual has to be written and the usual final polishing needs to be done: include some presets, double-check and sort all parameter names, and so on. Beta testing was already closed some time ago. To sum it up, a release seems possible in the not-too-distant future.

processing with High Dynamic Range (3)

This article explores how some techniques akin to HDR imaging can be adopted right into the audio domain.

The early adopters – game developers

The recently cross-linked article “Finding Your Way With High Dynamic Range Audio In Wwise” gave a good overview of how the HDR concept has already been adopted by game developers over recent years. Mixing in-game audio has its very own challenge: different, arbitrarily occurring audio events have to be mixed in real time while the game is actually played. Opposed to that, when we mix off-line (as in a typical song production) we have a static output format and, of course, no such issues.

So it comes as no surprise that the game developers’ approach turned out to be a rather automatic/adaptive in-game mixing system, capable of gating quieter sources depending on the overall volume of the entire audio, plus performing some overall compression and limiting. The “off-line mixing audio engineer” can always do better, and if a mix is really too difficult, even the arrangement can be fixed by hand during the mixing stage.
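As a toy model of such a system (my own simplification, not the actual Wwise algorithm), think of a sliding loudness window: the loudest active source defines the top of the window, and everything falling below its floor gets gated:

    import numpy as np

    def hdr_window_mix(sources, window_db=24.0):
        """Mix per-frame source levels HDR-style: the loudest source defines
        the window top; sources below (top - window_db) are gated, the rest
        are shifted down so the window fits the output range. Toy model."""
        levels = np.array([s["level_db"] for s in sources])
        top = levels.max()                       # loudest source sets the top
        floor = top - window_db
        out = []
        for s, lvl in zip(sources, levels):
            if lvl < floor:
                continue                         # gate what the window hides
            out.append({"name": s["name"], "mix_db": lvl - top})
        return out

    # e.g. hdr_window_mix([{"name": "explosion", "level_db": 0},
    #                      {"name": "footsteps", "level_db": -30}])
    # -> footsteps gated, explosion placed at the window top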

There is a further shortcoming, and from my point of view that is the overly simplistic and reduced translation from “image brightness” into “audio loudness”, which might work to some extent – but ever since the audio loudness race emerged, we have clear proof of how utterly bad that can sound in the end. At the very least, many more details and effects have to be taken into account to do better in terms of dynamic range perception.

processing with High Dynamic Range (2)

This comprehensive and in-depth article about HDR imaging was written by Sven Bontinck, a professional photographer and hobby musician.

A matter of perception.

To be able to use HDR in imaging, we must first understand what dynamic range actually means. Sometimes I notice people mistaking contrast in pictures for dynamic range. The two concepts have some sort of relationship, but they are not the same. Let me start by explaining briefly how humans receive information with our eyes and ears. This is important because it influences the way we perceive what we see and hear and how we interpret that information.

We all know about the retina in our eyes, where we find the light-sensitive sensors, the rods and cones. The cones provide us with daytime vision and the perception of colours. The rods allow us to see at low light levels and provide us with black-and-white vision. However, there is a third kind of photoreceptor, the so-called photosensitive ganglion cells. These cells give our brain information about length-of-day versus length-of-night duration, but they also play an important role in pupillary control. Every sensor needs a minimum amount of stimulus to be able to react. At the same time, all kinds of sensors have a maximum amount that they may be exposed to. Above that limit, certain protection mechanisms kick in to prevent damage to the sensors.

processing with High Dynamic Range (1)

Back when I was at university, my very first DSP lectures were actually not about audio but image processing. Due to my interest in photography, I followed this amazing and ever-evolving domain over time. Later on, High Dynamic Range (HDR) image processing emerged, and besides its high impact on digital photography, I immediately started to ask myself how such techniques could be translated into the audio domain. And to be honest, for quite some time I didn’t have a clue.

[image: MM]

This image shows a typical problem digital photography still suffers from: the highlights are completely washed out and the lowlights turn abruptly into black without containing further nuances – the dynamic range performance is rather poor, and this is not what the human eye would perceive, since the eye features both a higher dynamic range per se and a better adaptation to different (and maybe difficult) lighting conditions.

On top of that, we have to expect severe dynamic range limitations in the output media, whether that’s a cheap digital print, a crappy TFT display or the limited JPG file format, just as examples. Analog film and prints do have such problems in principle as well, but not to the same extent, since they typically offer more dynamic resolution and their saturation behavior is rather soft, unlike digital hard clipping. And this is where HDR image processing chimes in.

HDR image processing typically distinguishes between single- and multi-image processing. In multi-image processing, a series of Low Dynamic Range (LDR) images is taken at different exposures and combined into one new image containing an extended dynamic range (thanks to some clever processing). Afterwards, this version is rendered back into an LDR image by utilizing special “tone mapping” operators, which perform a sort of dynamic range compression to obtain a better dynamic range impression, but now in an LDR file.
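In code, the multi-image pipeline is surprisingly compact. The sketch below merges exposure-bracketed shots using a simple well-exposedness weighting and then applies the classic global Reinhard operator L/(1+L) as the tone-mapping step; both operator choices are mine, for illustration only:

    import numpy as np

    def merge_exposures(images, exposures):
        """Combine LDR shots (float arrays in 0..1) into one HDR radiance map,
        weighting pixels that are neither under- nor over-exposed."""
        acc, wsum = 0.0, 0.0
        for img, t in zip(images, exposures):
            w = 1.0 - (2.0 * img - 1.0) ** 4     # favour mid-grey pixels
            acc += w * img / t                   # back to (relative) radiance
            wsum += w
        return acc / np.maximum(wsum, 1e-6)

    def tone_map(hdr):
        """Reinhard global operator: compresses highlights, keeps shadows."""
        return hdr / (1.0 + hdr)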

In single-image processing, a single HDR image must already be available, and then just tone mapping is applied. As an example, the picture below takes advantage of single-image processing from a RAW file, which typically has a much higher bit depth (12 or even 14 bits with today’s sensor tech) as opposed to JPG (8 bits). As a result, a lot of dynamic information can be preserved even if the output file is still just a JPG. As an added bonus, such a processed image also translates far better across a wide variety of output devices, displays and viewing light conditions.

[image: MM-HDR]

what I’m currently working on – Vol. 10

Right now, I’m extending my “compressor aficionados” interview series with a couple of outstanding developers I’ve always been interested in and wanted to talk with. Beside that, I’m pretty much delved into research and development, plus testing some brand new prototypes. If/when another mkII version will appear is currently not clear – but methinks there will be one or another surprise during Q4.

In recent history, I’ve constantly extended and improved my Stateful Saturation approach, and with ThrillseekerVBL I’ve managed to introduce authentic analog-style sounding distortion right into VST land, which is what I’ve always had in mind and dreamed of. And there has been so much overwhelming feedback on that – thank you very much!

And for quite a while now, I’ve been dreaming about a brand new series of plug-ins which will combine the strengths of both worlds: analog modelling on the one side and pure digital techniques on the other – incorporating techniques such as look-ahead, FIR filtering, or even stuff that comes from the digital image processing domain, such as HDR (High Dynamic Range) imaging.

Expect an exciting announcement quite soon …