A short compendium on digital audio compression techniques.
Basic compressor configurations
Compression vs. limiting
Technically speaking, audio limiters and compressors rely on the same principles; only the transfer curves and the envelope follower settings differ. Limiting uses ultra-fast attack times and high ratios, so that only very few peaks pass a given threshold.
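To make the difference in transfer curves concrete, here is a minimal sketch of a static hard-knee curve in the dB domain. The function name, the threshold and the ratio values are assumptions chosen for illustration, and an infinite ratio stands in for an ideal limiter.

```python
import math

def gain_reduction_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static hard-knee curve: gain change in dB (<= 0) for a detector level in dB."""
    overshoot = level_db - threshold_db
    if overshoot <= 0.0:
        return 0.0  # below threshold: unity gain
    # Above threshold the output rises by only 1/ratio dB per input dB.
    return overshoot / ratio - overshoot

# Same input level, 12 dB above the (assumed) threshold:
compressor_gain = gain_reduction_db(-8.0, ratio=4.0)       # moderate ratio
limiter_gain = gain_reduction_db(-8.0, ratio=math.inf)     # ideal limiter
```

With these assumed settings the compressor removes 9 dB of the 12 dB overshoot, while the infinite ratio removes all 12 dB.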
In digital implementations, limiters can be stricter thanks to look-ahead and clever gain prediction functions, which guarantee that no peak passes the threshold. This is called brickwall limiting.
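The look-ahead idea can be sketched as follows. This is a deliberately crude offline version under stated assumptions: the ceiling and the window length are made-up values, and the gain is not smoothed at all, which a usable limiter would certainly do. In a real-time setting the same scheme means delaying the audio path by the look-ahead length.

```python
def brickwall_limit(samples, ceiling=0.5, lookahead=4):
    """Scale each sample by the gain required for the largest peak
    in the window from the current sample to `lookahead` samples ahead."""
    out = []
    for i in range(len(samples)):
        window = samples[i:i + lookahead + 1]
        peak = max(abs(s) for s in window)
        # Gain is already reduced before the peak itself arrives.
        gain = 1.0 if peak <= ceiling else ceiling / peak
        out.append(samples[i] * gain)
    return out

limited = brickwall_limit([0.1, 0.2, 0.9, -1.2, 0.3, 0.1])
```

Because the window always contains the current sample, no output sample can exceed the ceiling, and the gain starts dropping before the peak is reached.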
The audio signal path on which the amount of gain reduction is actually computed is called the sidechain. The sidechain input can be obtained in several ways. In a feed-forward (FF) design the sidechain input is taken from the incoming signal, whereas in a feedback (FB) design it is taken from the output of the VCA, where some gain reduction has already been applied. In FB mode the compressor circuit in a sense “knows” that some compression work has already been done and therefore lowers the amount of further compression within the feedback loop. This changes the behaviour of the compressor transfer curve and leads to lower effective compression ratios and a usually gentle sounding compression. In contrast, FF compression can have a more direct and drastic impact on the actual gain reduction.
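The effect of the detector position can be illustrated with a small sketch. Everything here is an assumption made for the example: a hard-knee curve computed in the linear domain, a one-pole smoother on the gain, and a one-sample delay in the feedback path (the previous output sample feeds the detector).

```python
def compress(samples, mode="ff", threshold=0.25, ratio=4.0, smooth=0.9):
    """Toy compressor; mode 'ff' detects on the input, 'fb' on the previous output."""
    gain, prev_out, out = 1.0, 0.0, []
    for x in samples:
        detector = abs(x) if mode == "ff" else abs(prev_out)
        if detector > threshold:
            # Linear-domain form of the hard-knee curve.
            target = (detector / threshold) ** (1.0 / ratio - 1.0)
        else:
            target = 1.0
        gain = smooth * gain + (1.0 - smooth) * target  # one-pole smoothing
        prev_out = x * gain
        out.append(prev_out)
    return out

steady_input = [1.0] * 500
ff_level = compress(steady_input, mode="ff")[-1]
fb_level = compress(steady_input, mode="fb")[-1]
```

For the same nominal ratio the feedback variant settles at a higher output level than the feed-forward one, which matches the gentler behaviour described above.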
FF and FB sidechains can be combined or mixed if the delay between them is properly compensated. More importantly, the source that is fed into the sidechain input can be different from the actual audio that is fed into the compressor. The compressor provides a separate input for this, and the method is usually called external sidechaining, as opposed to internal sidechaining.
Sidechain filtering can significantly change the behaviour of the subsequent gain reduction processing. Its main purpose is to focus the compression on a certain part of the audio frequency spectrum and to keep unwanted distortion under control. More information on that topic can be found here.
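As a rough illustration of sidechain filtering, the sketch below runs the detector signal through a simple one-pole highpass so that low frequencies trigger less gain reduction. The filter type, the 200 Hz cutoff and the 48 kHz sample rate are assumptions for the example; only the sidechain is filtered, the audio path is untouched.

```python
import math

def highpass(samples, cutoff_hz, sample_rate=48000.0):
    """One-pole highpass: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / sample_rate)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out

def sine(freq_hz, count=4800, sample_rate=48000.0):
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

def detector_peak(freq_hz):
    filtered = highpass(sine(freq_hz), cutoff_hz=200.0)
    return max(abs(v) for v in filtered[2400:])  # skip the start-up transient

bass_peak = detector_peak(50.0)      # strongly attenuated in the sidechain
treble_peak = detector_peak(5000.0)  # passes almost unchanged
```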
When compressing more than one channel of audio, the handling of the sidechain signal can be even more sophisticated. The sidechain processing can be completely independent for each channel (unlinked), or one single signal is generated which is then used for all channels (linked). In general, DSP maths allows seamless blending between both options. When working on dual channel audio where the two channels are related to each other (i.e. stereo), a common technique found in some compressors is mid/side compression, where the incoming audio is mid/side encoded before the processing (and decoded again afterwards). If the compressor allows separate control of the compression parameters per channel, the mid content can be treated entirely differently from the side content. This can be useful to alter and increase the perceived transparency, depth and width of the compressed signal and to handle dynamically difficult audio material more easily.
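Both ideas can be sketched in a few lines. The 0.5 scaling in the encoder is one common convention among several, and using the maximum of both detector levels as the linked signal is an assumption (a sum or average would be equally plausible); the function names are made up for the example.

```python
def ms_encode(left, right):
    """Mid/side encoding with 0.5 scaling so that decoding needs no gain."""
    return 0.5 * (left + right), 0.5 * (left - right)

def ms_decode(mid, side):
    return mid + side, mid - side

def blend_link(level_l, level_r, link=1.0):
    """Blend per-channel detector levels between unlinked (0.0) and linked (1.0)."""
    linked = max(level_l, level_r)  # assumed choice for the shared signal
    return ((1.0 - link) * level_l + link * linked,
            (1.0 - link) * level_r + link * linked)

mid, side = ms_encode(0.75, -0.25)
left, right = ms_decode(mid, side)       # round trip restores the input
half_linked = blend_link(0.2, 0.8, link=0.5)
```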
Finally, a common technique, especially with heavy compression, is to mix the original (dry) signal back into the compressed (wet) signal, which can give more body and consistency to the result.
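A minimal sketch of such a dry/wet blend, assuming a simple linear crossfade controlled by a hypothetical `mix` parameter. Both signals must be time-aligned, so any latency of the compressor has to be compensated on the dry path first.

```python
def parallel_mix(dry, wet, mix=0.5):
    """Linear crossfade: mix=0.0 is fully dry, mix=1.0 is fully compressed."""
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, wet)]

blended = parallel_mix([1.0, -1.0], [0.5, -0.5], mix=0.5)
```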
The envelope follower
One crucial part of the gain control circuit is the envelope follower algorithm (or algorithms). After the sidechain signal has been full-wave rectified (turning the AC signal into a DC one) and the compressor transfer curve has been applied (which was already discussed over here), the envelope follower usually comes in to smooth out the gain control signal and thereby avoid excessive intermodulation distortion, which would otherwise occur when the gain control is applied to the original audio path via the VCA of the compressor.
Distortion and sound
In short: the envelope follower is all about (minimized) distortion and (perceived) sound. This is also the place where some modeling of certain analog circuits (such as photo-resistor based ones) might be done, because this mostly influences the shape of the release curve.
In fact, the shape of the resulting envelope curves dramatically affects how the compression sounds once the gain riding takes place. Major differences in sound can already be perceived depending on whether the algorithms are computed on a linear or a logarithmic scale. Simply put, staying with the linear representation of the incoming digital audio is computationally much cheaper, but it tends to sound cheaper as well. Conversely, log-encoding the sidechain signal yields different and better curve shapes, at the expense of higher computational cost.
There are lots of different follower concepts and implementations, each with different characteristics, and this is probably where most of the effort should be spent when designing a new compressor for a specific task. A basic envelope follower consists of attack and release mechanics, which determine how fast the resulting curve changes as the signal at the sidechain input rises or decays. Other implementations might add further stages (e.g. an envelope hold stage) or improve the envelope detection as a whole, for example by using a Hilbert transform phase shift (again at higher computational cost).
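A basic attack/release follower of this kind might look as follows. This is a sketch under assumptions: it smooths the rectified signal directly in the linear domain, it uses the common one-pole coefficient exp(-1 / (time * sample rate)), and the time constants and sample rate are example values.

```python
import math

def time_to_coeff(time_ms, sample_rate=48000.0):
    """One-pole coefficient that reaches about 63% of a step after time_ms."""
    return math.exp(-1.0 / (0.001 * time_ms * sample_rate))

def envelope(samples, attack_ms=5.0, release_ms=100.0, sample_rate=48000.0):
    att = time_to_coeff(attack_ms, sample_rate)
    rel = time_to_coeff(release_ms, sample_rate)
    env, out = 0.0, []
    for x in samples:
        rectified = abs(x)  # full-wave rectification
        # Fast coefficient while the signal rises, slow one while it decays.
        coeff = att if rectified > env else rel
        env = coeff * env + (1.0 - coeff) * rectified
        out.append(env)
    return out

# 10 ms burst followed by 10 ms of silence at 48 kHz
curve = envelope([1.0] * 480 + [0.0] * 480)
```

The curve rises quickly during the burst and falls only slowly afterwards, which is the asymmetry that the attack and release settings control.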
Multi envelope techniques
A much more common practice is to construct an envelope out of several simple ones. These individual envelopes are tuned differently so that they respond faster or slower to signal changes (and maybe at different threshold levels) and are then recombined into one single curve. Such approaches can lead to more responsive curves and overall compression behaviour without introducing that much distortion from the gain modulation. A further strategy could be to let the individual envelopes respond only to different parts of the frequency spectrum before they are recombined. This is not yet multi-band compression, because the frequency separation is only performed on the sidechain path and not on the audio path, and each envelope ultimately affects the whole spectrum. Therefore the audio path remains unaffected by filtering side effects such as phase shifting (as opposed to true multi-band compression).
The recombination of the envelopes is another part that deserves careful investigation. Simply adding all envelopes might not work well, because the modulation information of every envelope is then present all the time. Taking the maximum of all envelopes might perform better, since only one modulation source is active at any one time, but this does not necessarily guarantee a better sound in the end. Lowpass filtering afterwards might help to suppress excessive high frequency (HF) modulation and to keep the resulting curves mathematically C2 continuous, which avoids distortion when the generated gain reduction curve is applied to the VCA.
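The maximum-based recombination with a subsequent lowpass can be sketched like this. The one-pole smoother and its coefficient are assumptions for the example; a single pole only softens the corners of the curve and does not by itself provide C2 continuity, so a real design would likely need a higher-order smoother.

```python
def combine_max(envelopes):
    """Pick, per sample, the largest value among several envelope curves."""
    return [max(values) for values in zip(*envelopes)]

def smooth(curve, coeff=0.9):
    """One-pole lowpass over the combined curve to reduce HF modulation."""
    state = curve[0] if curve else 0.0
    out = []
    for value in curve:
        state = coeff * state + (1.0 - coeff) * value
        out.append(state)
    return out

fast = [0.1, 0.5, 0.2]   # hypothetical fast envelope
slow = [0.3, 0.4, 0.6]   # hypothetical slow envelope
combined = combine_max([fast, slow])
softened = smooth([0.0, 1.0, 1.0], coeff=0.5)
```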
Applying the gain reduction
The gain reduction is applied by the voltage controlled amplifier (VCA), which in the DSP world is simply a multiplication, as long as no analog amplifier side effects such as non-linearities are modeled. The computed gain control signal can be shaped further according to such effects if desired, but a more modern and common use of this stage is dynamic range control. This can easily be done by, e.g., hard- or soft-limiting the control signal before feeding it into the VCA circuit.
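A minimal sketch of this stage, reading "dynamic range control" as a cap on the maximum amount of gain reduction (an interpretation, not something the text spells out). The control signal is assumed to be a gain curve in dB, and `max_reduction_db` is a hypothetical parameter.

```python
def apply_gain(samples, gain_db_curve, max_reduction_db=12.0):
    """Digital 'VCA': hard-limit the control signal, then multiply."""
    out = []
    for x, gain_db in zip(samples, gain_db_curve):
        gain_db = max(gain_db, -max_reduction_db)  # cap the gain reduction
        out.append(x * 10.0 ** (gain_db / 20.0))   # dB to linear, then multiply
    return out

shaped = apply_gain([1.0, 1.0, 1.0], [0.0, -20.0, -40.0], max_reduction_db=20.0)
```

In this example the requested 40 dB of reduction is capped at 20 dB, so the last two samples end up at the same level.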
True multi-band compression splits the incoming audio into several non-overlapping frequency bands and then applies the whole compression process to each band individually. Usually most compressor controls are available for each band separately. This makes it tough for the user to handle (steep learning curve, lots of experience needed), but the final results can be excellent if properly applied, even on difficult material. However, the filtering process introduces unpleasant side effects: either phase shifting or pre-ringing occurs (depending on the selected filter algorithms).
Besides the common peak compression, RMS (signal averaging) driven approaches can be used instead or in combination, giving much smoother results on program material at the cost of a weaker response to detail. In some implementations gain prediction algorithms are used instead. These can be a simple look-ahead or a much more involved signal analysis. Most gain prediction algorithms introduce overall latency. To avoid this, some implementations only perform local peak estimation, e.g. by using velocity detectors.
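The difference between peak and RMS detection shows up even in a tiny example. The sliding rectangular window below is just one way to average; the window length is an assumed value, and practical detectors often use a recursive average instead.

```python
import math

def sliding_rms(samples, window=4):
    """RMS over the most recent `window` samples (shorter at the start)."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1):i + 1]
        out.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    return out

single_peak = [0.0, 0.0, 0.0, 2.0]
peak_level = max(abs(s) for s in single_peak)   # peak detector sees 2.0
rms_level = sliding_rms(single_peak)[-1]        # RMS detector sees 1.0
```

The isolated peak is only half as prominent for the RMS detector, which illustrates both the smoother behaviour and the reduced response to detail.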
Program dependency is still a large field for research today, because one could think of making more or less everything that happens under a compressor's hood adjust itself automatically to the actual characteristics of the incoming audio. The most common place to implement program dependency is the attack and release time computation of the envelope(s). This might be a good starting point, but be aware that some common and simple practices, such as extending the release time during a heavy compression duty cycle, keep modulation distortion low (wanted) but reduce the compressor's ability to respond to dynamic details (unwanted). There are always such tradeoffs to be handled in the actual circuit design.
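One simple form of such a release extension could look like the sketch below. The linear stretch rule and all parameter names and values are hypothetical; they only illustrate the idea of tying the release time to the current amount of gain reduction.

```python
def program_release_ms(base_release_ms, reduction_db,
                       stretch_per_db=0.1, max_factor=4.0):
    """Lengthen the release time as the current gain reduction grows."""
    factor = 1.0 + stretch_per_db * max(0.0, reduction_db)
    return base_release_ms * min(factor, max_factor)

idle = program_release_ms(100.0, 0.0)     # no reduction: base release
heavy = program_release_ms(100.0, 10.0)   # 10 dB reduction: doubled release
```

The cap on the stretch factor is one way to limit the loss of responsiveness mentioned above.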
Some entirely alternative approaches to compression are, e.g., pulse width modulation (PWM) based compression and “tape style” compression. PWM based compression is a sort of automatic gain control mechanism which uses the average output audio level in a feedback structure to regulate the pulse width of the (sliced) audio signal. A larger average level increases the width of the slices, which in turn lowers the average due to the feedback structure. The sliced signal must be reconstructed with a filter, which requires extensive (!) amounts of oversampling when done in the digital domain.
Of course there is no envelope driven gain control in magnetic tape based systems, but they are capable of reshaping some of the dynamic characteristics of audio signals as well, and there is much more going on than plain saturation (which is quite often suggested). Combining saturation and compression techniques might be a good starting point here.
There is a huge variety of methods available for assembling a concrete gain control circuit, and even more parameters, variations and combinations are possible, all of which can influence the computed gain reduction and, more importantly, the sound. Designing and testing such concepts is an almost entirely empirical process, and there are no math formulas which guarantee optimal behaviour and sound. This is mostly because “optimal” is rather undefined here and has to be specified as far as possible before designing a new schematic. It is not that surprising that there is such a large number of specific devices available out there, and most of the successful ones are dedicated to or established in certain niches where “optimal” can be defined better.