what is a “box tone”?

“Box tone” is a term often used to describe the characteristic sound of a specific piece of audio equipment, especially classic analog effects devices such as equalizers and compressors.

The box tone of an effect is often described as the unique timbre or tonal coloration that the device imparts to the audio signal as it passes through. This can be due to a variety of factors, including the type and quality of the components used in the device, the design of the circuitry, and the way the device processes the signal.

Some audio engineers and producers may seek out specific box tones for their recordings and mixes, as they can add character and depth to the sound. Others may prefer a more neutral or transparent sound, in which case they may choose equipment that has a more subtle or less noticeable box tone.

It’s important to note that the term “box tone” is often used informally and can be somewhat subjective, as different people may have different opinions on what constitutes a distinctive or desirable box tone.

the history of Cubase

When Cubase 3.0 came out in 1996 and introduced VST for the first time with all its new and fascinating possibilities, that was the point where I decided to get more involved in music production and set up a small (home) recording studio. VST was the basis for all this and for how I imagined a modern, computer-based studio production. What a revolution that was. Watching this video today brings up a lot of nostalgic feelings …

TesslaPRO mkIII released

the magic is where the transient happens

The Tessla audio plugin series once started as a reminiscence of classic transformer-based circuit designs of the 50s and 60s, but without just being a clone stuck in the past. The PRO version has been made for mixing and mastering engineers who work in the digital domain but always miss that extra vibe delivered by some high-end analog devices.

TesslaPRO brings back the subtle artifacts from the analog domain right into the digital one. It slightly colors the sound, polishes transients and creates depth and dimension in the stereo field to get that cohesive sound we’re after. All the analog goodness in subtle doses: it’s a mixing effect intended to be used here and there, wherever the mix demands it.

The mkIII version is a technical redesign, further refined to capture all those sonic details while reducing audible distortions at the same time. It further blurs the line between compression and saturation and also takes effects based on aural perception into account.

Available for Windows VST in 32 and 64 bit as freeware. Download your copy here.

how I listen to audio today

Developing audio effect plugins involves quite a lot of testing. While this appears to be an easy task as long as it’s all about measurable criteria, it gets way more tricky beyond that. Then there is no way around (extensive) listening tests, which must be structured and follow a systematic approach to avoid ending up in fluffy “wine tasting” categories.

I’ve spent quite some time with such listening tests over the years, and some of the insights and principles are distilled in this brief article. They are not only useful for checking mix qualities or judging device capabilities in general but also give some essential hints about developing our hearing.

No matter what specific audio assessment task one is up to, it’s always about judging the dynamic response of the audio (dynamics) versus its distribution across the frequency spectrum (tonality). Both dimensions are best tested with transient-rich program material, such as mixes containing several acoustic instruments – e.g. guitars, percussion and so on – that also have sustaining elements and room information.

Drums are also a good starting point, but they do not offer enough variety to cover both aspects we are talking about or, just as an example, to easily spot modulation artifacts (IMD). A rough but decent mix should do the job. Personally, I prefer raw mixes that have not been processed much yet, to minimize the influence of flaws already burned into the audio content – but more on that later.

Having such content in place allows us to focus our hearing and to listen along a) the instrument transients – instrument by instrument – and b) the changes and impact within particular frequency ranges. Let’s have a look at both aspects in more detail.

a) The transient information is crucial for our hearing because it is used not only to identify instruments but also to perform stereo localization. Transients basically determine how well we can separate different sources and how they are positioned in the stereo field. So if something “lacks definition”, it might simply be caused by not having enough transient information available and not necessarily by flaws in equalization. Transients tend to mask other audio events for a very short period of time, and when a transient decays and the signal sustains, it unveils its pitch information to our hearing.

b) For the sustaining signal phases it is more relevant to focus on frequency ranges, since our hearing is organized in bands across the entire spectrum and is not able to distinguish different events within the very same band. For most comparison tasks it is already sufficient to consciously distinguish between the low, low-mid, high-mid and high frequency ranges, drilling down further only if necessary, e.g. to identify specific resonances. Assigning specific attributes to the corresponding ranges is the key to improving our conscious hearing abilities. As an example, one might spot something “boxy sounding” reflected only in the mid frequency range at first sight. But focusing on the very low frequency range might also expose effects contributing to the overall impression of “boxiness”. This reveals further, previously unseen strategies to properly manage such kinds of effects.
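
If you want to practice this kind of band-focused listening outside the DAW, a simple band-solo script will do. The following is only a minimal sketch under my own assumptions – the crossover points (250Hz, 2kHz, 6kHz) are arbitrary picks for the four ranges mentioned above, and it relies on the third-party soundfile and SciPy packages:

```python
# Minimal sketch: solo one of four listening bands (low / low-mid / high-mid / high).
# The band edges (250 Hz, 2 kHz, 6 kHz) are assumptions for illustration only.
import soundfile as sf
from scipy import signal

BANDS = {
    "low":      (None, 250.0),
    "low-mid":  (250.0, 2000.0),
    "high-mid": (2000.0, 6000.0),
    "high":     (6000.0, None),
}

def solo_band(infile, outfile, band="low-mid", order=4):
    x, fs = sf.read(infile)
    lo, hi = BANDS[band]
    if lo is None:
        sos = signal.butter(order, hi, btype="lowpass", fs=fs, output="sos")
    elif hi is None:
        sos = signal.butter(order, lo, btype="highpass", fs=fs, output="sos")
    else:
        sos = signal.butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    y = signal.sosfilt(sos, x, axis=0)   # filter every channel along the time axis
    sf.write(outfile, y, fs)

# Example (hypothetical file names):
# solo_band("rough_mix.wav", "rough_mix_lowmid.wav", band="low-mid")
```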

Overall, I cannot recommend highly enough educating the hearing in both dimensions to enable a more detailed listening experience and to become more confident in assessing certain audio qualities. Most kinds of compression/distortion/saturation effects present a good learning challenge since they can impact both audio dimensions very deeply. On the other hand, using already mixed material to assess the qualities of, say, a new audio device turns out to be a very delicate matter.

Let’s say an additional HF boost now sounds unpleasant and harsh: is this a flaw of the added effect, or was it already there and just pulled out of the mix? During all the listening tests I’ve done so far, a lot of tainted mixes unveiled such flaws that were not visible at first sight. In the case of the given example you might find root causes like too much mid-frequency distortion (coming from compression IMD or saturation artifacts) mirrored in the HF, or just inferior de-essing attempts. The recent trend to grind away each and every frequency resonance is also prone to unwanted side effects, but that’s another story.

Further psychoacoustic hearing effects need to be taken into account when we perform A/B testing. While comparing content at equal loudness is a well-known subject (nonetheless ignored by lots of reviewers out there), it is also crucial to switch back and forth between sources instantaneously and not after a break. This is due to the fact that our hearing system is not able to memorize a full audio profile for much longer than a second. Then there is confirmation bias, which basically means that we always tend to be biased concerning the test result: just having pressed that button or knowing the brand name already has to be seen as an influence in this regard. The only solution for this is blind testing.
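
As a side note, the equal-loudness part of such an A/B setup is easy to script before the clips go into a blind-test tool. This is just a rough sketch under the assumption that a plain RMS match is good enough for the material at hand (a LUFS-based match would be more accurate); the file names are placeholders:

```python
# Minimal sketch: gain-match clip B to clip A by overall RMS before a blind A/B test.
# RMS matching is an approximation; LUFS-based matching would be more accurate.
import numpy as np
import soundfile as sf

def rms_db(x):
    return 20.0 * np.log10(np.sqrt(np.mean(np.square(x))) + 1e-12)

a, fs_a = sf.read("clip_a.wav")
b, fs_b = sf.read("clip_b.wav")
assert fs_a == fs_b, "compare at the same sample rate"

gain_db = rms_db(a) - rms_db(b)            # how much to raise or lower clip B
b_matched = b * 10.0 ** (gain_db / 20.0)

sf.write("clip_b_matched.wav", b_matched, fs_b)
print(f"applied {gain_db:+.2f} dB to clip B before the blind test")
```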

Most of the time I listen through nearfield speakers and only rarely through cans. I’ve been sticking to my speakers for more than 15 years now, and it was very important for me to get used to them over time. Before that I “upgraded” speakers several times, unnecessarily. Having said that, using a coaxial speaker design is key for nearfield listening environments. After ditching digital room correction here in my studio, the signal path is now fully analog right after the converter. The converter itself is high-end, but today I think proper room acoustics right from the start would have been a better investment.

a brilliant interview

sustaining trends in audio land, 2022 edition

Forecasts are difficult, especially when they concern the future – Mark Twain

In last year’s edition about sustaining trends in audio land I covered pretty much everything from mobile and modular, DAW and DAW-less, up to retro outboard and ITB production trends. From my point of view, all points made so far are still valid. However, I neglected one or another topic, which I’ll now just add to that list.

The emergence of AI in audio production

What we can already see in the market is the emergence of some clever mixing tools aiming to solve very specific mixing tasks, e.g. resonance smoothing and spectral balancing. Tools like that might be based on deep learning or other smart and sophisticated algorithms. There is no common or strict definition of “AI”, and we will see increasing use of the “AI” badge purely as a marketing claim of superiority.

Some other markets are ahead in this area, so it might be a good idea to just look into them. For example, AI applications in the digital photography domain already range from smart assistance while taking the photo itself up to completely automated post-processing. There is in-camera AI eye/face detection, skin retouching, sky replacement and even complete picture development – available for all kinds of devices, assisted or fully automated, and in all shades of quality and pricing.

Such technology not only shapes the production itself but also the market and business as a whole. For example, traditional gatekeepers might disappear because they are no longer necessary to create, edit and distribute things, but the market might also get flooded with mediocre content. To some extent we can see this already in the audio domain, and the emergence of AI within our production will just be an accelerator for all that.

The future of audio mastering

Audio mastering demands have already shifted slightly over recent years. We’ve seen new requirements coming from streaming services, the album concept has become less relevant, and there was (and still is) a strong demand for an increased loudness target. Also, the CD has been losing relevance, but vinyl is back and has, surprisingly, become a sustaining trend again. Currently Dolby Atmos is gaining some momentum, but the actual consumer market acceptance remains to be proven. I would not place my bet on that, since it has far more implications (from a consumer point of view) than just introducing UHD as a new display standard.

Concerning the technical production, a complete ITB shift – as we’ve seen in the mixing domain – has not happened yet, but new digital possibilities like dynamic equalizing or full spectrum balancing are slowly being adopted. All in all, audio mastering slowly evolves along with the ever-changing demands but remains surprisingly stable and sustainable as a business, and this will probably continue for the next few years.

Social Media, your constant source of misinformation

How To Make Vocals Sound Analog? Using Clippers For Clean Transparent Loudness. Am I on drugs now? No, I’ve just entered the twisted realm of social media. The place where noobs advise you on pro mixing tips and the reviews are paid. Everyone is an engineer here, but it’s sooo entertaining. Only purpose: attention. Currency: clicks & subs. TikTok has surpassed YT in reach. Content half-life is measured in hours. The DISLIKE button is gone. THERE IS NO HOPE.

The (over-) saturated audio plugin market and the future of DSP

Over the years, a vast variety of vendors and products has flooded the audio plugin market, offering literally hundreds of options to choose from. While this appears to be a good thing at first glance (increased competition leads to lower retail prices), it does have a number of implications to look at. The issues we should be most concerned about are the lack of innovation and the drop in quality. We will continue to see a lot of “me too” products as well as retro brands gilding their HW legacy with yesterday’s SW tech.

Also, we can expect a trend of market consolidation, which might appear in different shapes. Traditionally this is about mergers and acquisitions, but today it is far more prominently about successfully establishing a leading business platform. And this is why HW DSP will be dead in the long run: those vendors simply failed to create competitive business platforms, and other players have already stepped in.

Dynamic 1073/84 EQ curves?

Yes we can! The 1073 and 84 high shelving filters feature that classic frequency dip right before the HF boost itself. Technically speaking they are not shelves but bell curves with a very wide Q. But anyway, wouldn’t it be great if that were program dependent – expanding and compressing according to the curve shape and giving a dynamic frequency response to the program material?
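
Just to illustrate the static shape we are talking about, the little sketch below cascades two RBJ-style wide-Q bells – a small dip followed by a broad HF boost – and prints the resulting curve. The frequencies, gains and Q values are rough assumptions for illustration only, not measured 1073/84 data:

```python
# Minimal sketch: approximate a static "dip then wide HF bell" curve with two
# RBJ cookbook peaking filters. All frequency/gain/Q values are assumptions.
import numpy as np
from scipy import signal

def peaking_biquad(f0, gain_db, q, fs):
    """RBJ cookbook peaking EQ coefficients (b, a), normalized to a0 = 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

fs = 48000
dip   = peaking_biquad(3500.0, -1.5, 0.5, fs)    # assumed dip ahead of the boost
boost = peaking_biquad(12000.0, 6.0, 0.4, fs)    # assumed very wide HF bell

w, h_dip   = signal.freqz(*dip, worN=4096, fs=fs)
_, h_boost = signal.freqz(*boost, worN=4096, fs=fs)
mag_db = 20.0 * np.log10(np.abs(h_dip * h_boost) + 1e-12)

for f in (1000, 3500, 8000, 12000, 16000):
    print(f"{f:>6} Hz: {mag_db[np.argmin(np.abs(w - f))]:+6.2f} dB")
```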

Again, dynamic EQs make this an easy task today, and I’ve created some presets for the TDR Nova EQ which you can copy right from here (see below after the break). Instructions: choose one of the 3 presets (one for each original frequency setting – 10/12/16kHz) and just tune the Threshold parameter for band IV (dip operation) and band V (boost operation) to fit the actual mix situation.

They sound pretty much awesome! See also my Nova presets for the mixbus over here and the Pultec ones here.


Dynamic Pultec EQ curves?

Wouldn’t it be great if the Pultec boost/cut behaviour were program dependent? Sort of expanding and compressing according to the boost/cut settings and giving a dynamic frequency response to the program material.

Well, dynamic EQs make this an easy task today, and I’ve created some presets for the TDR Nova EQ which you can copy right from here (see below after the break). Instructions: choose one of the 4 presets (one for each original frequency setting – 20/30/60/100Hz) and tune the Threshold parameter for band II (boost operation) and band III (cut operation) to fit the actual mix situation.

See also my presets for the mixbus over here.


What loudspeakers and audio transformers do have in common

Or: WTF is “group delay”?

Imagine a group of people visiting an exhibition on a guided tour. One might expect the group to reach the exhibition’s exit as a whole, but in reality part of that group might just be lagging behind a little bit (e.g. simply taking their time).

Speaking in terms of frequency response within audio systems, this sort of delay is referred to as “group delay” and is measured in seconds. If parts of the frequency range do not reach the listener’s ear at the very same time, the group delay is referred to as no longer being constant.

A flat frequency response does not tell us anything about this phenomenon, and group delay must always be measured separately. Just for reference, delays above 1-4ms (depending on the actual frequency) can actually be perceived by human hearing.
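
If you want to check this yourself, group delay can be computed directly from a filter’s coefficients, e.g. with SciPy. The sketch below uses a 2nd-order Butterworth high-pass at 40Hz purely as an assumed stand-in for a bass-limited playback or transformer-coupled path:

```python
# Minimal sketch: compute the (non-constant) group delay of a simple high-pass.
# The 2nd-order Butterworth at 40 Hz is just an illustrative stand-in.
import numpy as np
from scipy import signal

fs = 48000
b, a = signal.butter(2, 40.0, btype="highpass", fs=fs)

w, gd_samples = signal.group_delay((b, a), w=8192)   # w in rad/sample, gd in samples
freqs = w * fs / (2.0 * np.pi)
gd_ms = gd_samples / fs * 1000.0

for f in (20, 40, 100, 1000):
    i = np.argmin(np.abs(freqs - f))
    print(f"{f:>5} Hz: {gd_ms[i]:6.2f} ms group delay")
```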

This has always turned out to be a real issue in loudspeaker design, because certain audio events can no longer be perceived as a single event in time but are spread across a certain window of time. The root cause for this anomaly typically lies in electrical components like frequency splitters, amplifiers or filter circuits in general, but also in physical loudspeaker construction choices like bass reflex ports or transmission line designs.

Especially the latter actually do change the group delay in the lower frequency department very prominently, which can be seen as a design flaw. On the other hand, lots of hi-fi enthusiasts actually like this low-end behaviour, which is able to deliver a very round and full bass experience even within a quite small speaker design. In such cases one can measure more than 20ms of group delay in the frequency content below 100Hz, and I’ve seen plots from real designs featuring 70ms at 40Hz, which is huge.

Such speaker designs should be avoided in mixing or mastering situations where precision and accuracy are required. This is also one of the reasons why we can still find single-driver speaker designs as primary or additional monitoring options in studios around the world: they have a constant group delay by design and do not mess around with some frequency parts while leaving others intact.

As mentioned before, several analog circuit designs are also able to distort the constant group delay, and we can see very typical low-end group delay shifts in transformer-coupled circuit designs. Interestingly, even mastering engineers utilize such devices – whether found in a compressor, EQ or tape machine – in their analog mastering chains.

Lets talk about mixing levels (again)

Some years ago we had lots of discussions about proper mixing levels in the digital domain – with mixed (sic!) results, IIRC. Meanwhile, more and more influencers are claiming that targeting -18dBFS on a VU meter readout is the “digital audio sweet spot” and the way forward in terms of plugin gain staging. In practice that would imply mixing digital peak levels at around 0dBFS again, but maybe I’ve missed something during my absence in recent years. So, what mixing levels are you working at in your DAW today?
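
Just for reference, such a VU-style readout essentially boils down to an RMS measurement referenced to -18dBFS. The sketch below simplifies things a lot – no real VU ballistics, just a plain RMS over the whole clip, and the file name is a placeholder – but it shows the arithmetic:

```python
# Minimal sketch: a crude "VU-style" readout referenced to -18 dBFS RMS.
# Real VU meters have defined ballistics (~300 ms integration); this simply
# takes the RMS over the whole clip as a simplification.
import numpy as np
import soundfile as sf

VU_REFERENCE_DBFS = -18.0   # assumed calibration: 0 VU = -18 dBFS RMS

x, fs = sf.read("my_mix_bus.wav")
rms_dbfs = 20.0 * np.log10(np.sqrt(np.mean(np.square(x))) + 1e-12)
peak_dbfs = 20.0 * np.log10(np.max(np.abs(x)) + 1e-12)

print(f"RMS : {rms_dbfs:7.2f} dBFS  ->  {rms_dbfs - VU_REFERENCE_DBFS:+6.2f} VU")
print(f"Peak: {peak_dbfs:7.2f} dBFS")
```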