Understanding Compression 1
In the first instalment of this four-part series Greg Simmons explores the interactions between threshold, ratio and gain reduction. Grab a pen and paper because you’re probably going to need it…
Compressors are part of the family of dynamic processors that includes limiters, levellers, maximisers, expanders, gates and more. Anything that is designed to alter the shape of the output signal’s envelope based on the level of the input signal can be considered a dynamic processor.
Most dynamic processors are based on the concept of applying gain reduction, which simply means reducing the gain – and therefore the output signal level – depending on whether the input signal level is above or below a threshold level. Compressors and limiters apply gain reduction when the input signal level is above the threshold level, while expanders and gates apply gain reduction when the input signal level is below the threshold level.
For the purposes of this explanation we’re going to focus on compression, but variations of the principles discussed here apply to most other dynamic processors.
USES OF COMPRESSION
Most uses of compression in recording and mixing can be divided into two categories: corrective compression and enhancing compression.
Corrective compression is primarily used to reduce a sound’s dynamic range. This includes peak limiting to rein in unwanted transients and prevent clipping, and matching dynamics so that a sound can sit alongside other sounds in a mix without requiring excessive amounts of fader automation.
Enhancing compression is primarily used to sculpt a sound’s envelope. This includes making a sound punchier or more impactful, and increasing its perceived level.
Each of these applications of compression requires a specific approach that we’ll explore at the conclusion of this four-part series. None of those approaches will make sufficient sense without a proper understanding of how compression and compressors work. Let’s get started…
STARTING ASSUMPTIONS
There are many types of compressors and ways of achieving gain reduction, but for this series we’re focusing on VCA compressors because they are commonplace and offer control over more parameters than most other dynamic processors. The standard compressor plug-ins found in most DAWs mimic the features offered by most VCA compressors.
We’re going to assume our VCA compressor is using peak sensing because it simplifies the mathematics, and we’re going to focus on downward compression because it is the most commonly used form of compression and is fundamental to dynamic processing applications such as peak limiting and matching dynamics. Understanding how downward compression works in a peak-sensing VCA compressor makes it easier to understand how other forms of dynamic processing work, so we can transfer our knowledge from one to the other with relative ease.
We will also assume we’re working in digital systems where 0dBFS is the maximum possible signal level and therefore all signal levels are represented as negative values because they are below 0dBFS. When applying the forthcoming mathematical formulae to analogue processors – where 0dBVU is the nominal operating level and signal levels can have positive or negative values – we need to be vigilant about the calculations we make and where we put the + and – signs.
We need to be even more vigilant when we’re using a digital emulation of a classic/vintage analogue compressor that retains the original compressor’s VU metering and uses 0dBVU as its nominal operating level. To get the best results from the emulation we need to know what dBFS level in our DAW is equivalent to the emulation’s nominal operating level of 0dBVU. In most cases it will be -20dBFS or thereabouts (i.e. 0dBVU in the emulation = -20dBFS in the DAW), and that’s the nominal signal level that the emulation expects to work with. If the signal has an RMS level somewhere around -20dBFS, it will be close enough to the emulation’s nominal operating level and we should get the classic vintage behaviour and sound that we are expecting from it. However, if the signal has an RMS level that is much higher than the emulation’s 0dBVU level (e.g. around -12dBFS rather than -20dBFS), it will be constantly pushing the emulation harder than it was designed for, and we won’t get the classic vintage behaviour and sound we were expecting – just as we wouldn’t if we pushed the original compressor that hard in the analogue world.
Let’s put that example into perspective… If 0dBVU = -20dBFS, then -12dBFS = +8dBVU. If the signal’s RMS level is sitting around +8dBVU it means the VU meter’s needle will be pegging well above its maximum indicated level of +3dBVU, the compressor will be slammed 8dB harder than it was designed for, and we won’t be getting the classic vintage behaviour or sound we were expecting. If we lower the level entering the emulation so that its RMS level is somewhere around -20dBFS then everything should start working as expected.
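The conversion used above is simple enough to write as a small helper. This is only a minimal sketch: the function name `dbfs_to_dbvu` and its `ref_dbfs` parameter are hypothetical, and the -20dBFS default is just the typical alignment mentioned above, so check the emulation’s documentation for its actual calibration.

```python
def dbfs_to_dbvu(level_dbfs, ref_dbfs=-20.0):
    """Convert a DAW level in dBFS to the emulation's VU scale.

    ref_dbfs is the dBFS level that the emulation treats as 0dBVU.
    The default of -20dBFS is an assumption taken from the text.
    """
    return level_dbfs - ref_dbfs


print(dbfs_to_dbvu(-20.0))  # 0.0 -> sitting at the nominal operating level
print(dbfs_to_dbvu(-12.0))  # 8.0 -> 8dB hotter than the emulation expects
```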
TRANSFER CHARACTERISTIC
The illustration below is known as a transfer characteristic, and shows us how changes in the input signal level are transferred through a dynamic processor to become changes in the output signal level. For any given input signal level on the horizontal axis, we can project down to the transfer characteristic and then across to the vertical axis to see the corresponding output signal level. In this case, we can see that an input level of -12dBFS results in an output level of -12dBFS.
The example above shows a linear transfer characteristic, whereby a change to the input signal level results in a directly proportional change to the output signal level. If the input signal level increases by, say, 3dB, so does the output signal level. Likewise, if the input signal level decreases by, say, 3dB, so does the output signal level.
We’ll be using transfer characteristic graphs throughout this discussion because they are fundamental to understanding how dynamic processors work. Note that, as stated earlier, this discussion is focused primarily on digital systems where 0dBFS is the maximum possible value and therefore all signal levels are represented as negative values. This means the transfer characteristic graphs seen throughout this discussion will look different to the transfer characteristic graphs seen in older texts that were written in the days of analogue processing, and also in newer texts that have mindlessly reproduced the illustrations seen in older texts despite their numerical irrelevance.
BUILDING BLOCKS
The illustration below shows the conceptual building blocks of most dynamic processors, whether they are analogue circuits or digital algorithms coded to produce the same end result.
The signal to be compressed enters the input and passes through the Variable Gain Cell, which, as the name implies, is able to vary its gain and thereby affect the level of the signal coming out of it.
For a traditional VCA compressor the Variable Gain Cell is a Voltage Controlled Amplifier – hence ‘VCA’. It is conceptually the same as the VCAs found in analogue synthesisers: an amplifier whose gain is determined by an applied control voltage. In other types of compressors the Variable Gain Cell could be an electronic component known as a Field Effect Transistor (hence ‘FET’), it could be a light source used in conjunction with a phototransistor (hence ‘opto’ for ‘optical’), or any other method that allows the level of one signal to control the level of another signal.
Most digital compressors are coded to emulate their analogue predecessors; some also include cool features that weren’t possible in the analogue world, and some take entirely new approaches that were simply unviable in the analogue world.
For downward compression we can think of the Variable Gain Cell as an amplifier with a maximum gain of 0dB (i.e. x1), therefore it is only capable of reducing gain – hence ‘gain reduction’. If the compressor is applying 3dB of gain reduction it means the Variable Gain Cell has a gain of -3dB, which is equivalent to an amplification of approximately x0.71, which is obviously less than a gain of x1.
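The relationship between a gain in dB and the equivalent amplification factor can be sketched in a couple of lines. The helper name `db_to_linear` is hypothetical; the formula is the standard conversion for amplitude (voltage) gain.

```python
def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear multiplier."""
    return 10 ** (gain_db / 20)


for gr_db in (0, 3, 6, 12):
    # gr_db of gain reduction means the gain cell has a gain of -gr_db.
    print(f"{gr_db}dB of gain reduction -> x{db_to_linear(-gr_db):.4f}")
```

Strictly speaking, -3dB works out to roughly x0.708; the familiar figure of x0.7071 corresponds to about -3.01dB. The difference is far too small to matter in practice.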
The Envelope Follower monitors, or senses, the input signal’s peak or RMS level (as determined by the switch setting) and tells the Variable Gain Cell how much gain reduction to apply based on the settings for threshold, ratio and knee, and how fast to apply and remove that gain reduction based on the settings for attack and release.
The compressor shown here has three helpful meters: one to show the input signal level, one to show the gain reduction (GR), and one to show the output signal level.
The levels shown on the Input and Output meters move upwards with increasing levels and downwards with decreasing levels, as with most vertically-oriented meters. However, note that the level indicated on the Gain Reduction (GR) meter moves from top to bottom as the gain reduction increases – essentially showing how much the compressor is reducing the signal level.
The relationship between these three meters is simple:
Output Signal Level = Input Signal Level – GR
In the example shown below the input signal level is -6dBFS and the compressor is applying 4dB of gain reduction, therefore the output signal level will be -10dBFS (i.e. -6dBFS – 4dB = -10dBFS).
Dynamic processors such as compressors and limiters, which use downward compression to reduce a signal’s dynamic range, ultimately reduce the signal’s overall amplitude. To compensate for this they often include an additional amplifier on the output to make up for the loss of level. This is typically referred to as make-up gain, output gain or simply output. We’ll look at this amplifier and its effect on the output signal level in the next instalment of this four-part series.
COMPRESSION & SENSING
As mentioned above, the compressor senses and responds to either the peak level or the RMS level of the input signal. What’s the difference?
Peak sensing means the compressor responds to the peak levels of the input signal, as if it was reacting to the signal levels shown by a peak meter or following the outline of the signal’s envelope as it is typically displayed on a DAW. Peak sensing is useful when working with peaks and transients – such as using peak limiting to rein in peaks and prevent clipping, or using enhancing compression to creatively shape a sound’s envelope.
RMS sensing means the compressor responds to the RMS level of the input signal, as if it was reacting to the signal levels shown by an RMS metering system or, perhaps more appropriately, a VU meter. It’s considerably slower than peak sensing; it ignores sudden transients and short-term peaks – as might be seen on the waveform display of a DAW – and focuses instead on a level that is more like the human perception of loudness. It is very useful for corrective compression when we want to bring different instruments into the same dynamic perspective without dramatically re-shaping their envelopes, or when we want to increase the overall perceived loudness of a sound or a mix.
The illustration above shows the difference between peak and RMS levels. Both waveforms are from the same section of the same stereo interleaved wav file, containing a mix of highly percussive music, and no level changes have been made between them. The first waveform is based on the mix’s peak levels and provides an indication of the signal level’s extremities, which is useful for limiting and for preventing clipping distortion. The second is based on the mix’s RMS levels and provides a closer representation of its perceived level, which is useful when we want to maintain or change a signal’s perceived loudness. Obviously, a compressor that is sensing the peak waveform shown above would behave very differently if it was sensing the RMS waveform instead.
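To make the distinction concrete, here is a rough sketch of how the two levels could be measured from a block of samples. The function names and the test signal are hypothetical, samples are assumed to be floating point values where 1.0 represents full scale, and a real envelope follower would measure over a sliding window rather than a whole block.

```python
import math


def peak_dbfs(samples):
    """Level of the largest absolute sample, in dBFS (full scale = 1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    """Root-mean-square level in dB relative to a full-scale sample value."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


# Hypothetical test signal: a quiet sustained tone with one loud transient.
tone = [0.1 * math.sin(2 * math.pi * n / 100) for n in range(1000)]
tone[500] = 0.9  # single-sample spike standing in for a percussive hit

print(f"peak: {peak_dbfs(tone):.1f}dBFS")
print(f"RMS:  {rms_dbfs(tone):.1f}dBFS")
```

The single spike dominates the peak reading but barely moves the RMS reading, which is why the two sensing modes make a compressor behave so differently. Note that this sketch puts a full-scale sine wave at about -3dB RMS; some meters add 3dB so that a full-scale sine wave reads 0dBFS.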
Some compressors offer a choice between Peak and RMS sensing, and some combine an RMS-sensing compressor with a peak-sensing limiter, making them useful for a wide range of compression applications. Some compressors take this one step further by continually sensing the peak and the RMS levels, deciding which is more relevant at any point in time and altering their behaviour accordingly.
Feedback & Feedforward
There are two points in the signal path where the compressor can ‘sense’ the signal: the input and the output.
Most compressors sense the signal at the input, before compression is applied. These are known as feedforward compressors because the signal being sensed is fed forward from the input. This means the compressor is being controlled by the uncompressed version of the signal, as shown in the illustration below.
Some compressors sense the signal at the output, after compression has been applied. These are known as feedback compressors because the signal being sensed is fed back from the output. This means the compressor is being controlled by the compressed version of the signal, as shown in the illustration below.
It may seem counter-intuitive to use the compressed signal to control the compressor, but this approach has some interesting sonic characteristics. Because the compression is being controlled by the compressed signal, the feedback compressor is instantaneously fine-tuning its actions based on the effects of its previous actions. For this reason we can think of it as retrospective compression. It is often described as a ‘sonic glue’ that helps to bind together the dynamics of the individual components within a sound.
As a convenient rule of thumb, feedforward compression works well for sculpting individual components within a mix, while feedback compression works well on mixes and submixes. It’s worth noting that some of the most popular compressors in history have been feedback compressors, including SSL’s bus compressor and Urei’s 1176.
For the remainder of this series we will focus on the use of feedforward compression unless mentioned otherwise. Understanding how feedforward compression works makes it easier to understand and appreciate how feedback compression works.
Sidechain Input
Some compressors offer an additional input that’s typically called ‘Sense’, ‘Key’ or ‘Sidechain’. This input allows the compressor to be controlled by a different signal than the one passing through it, and is primarily used for applications such as ducking.
Two common examples of ducking are a) when the kick drum’s signal is used to create a momentary dip in the bass guitar’s signal to allow the kick drum to punch through the mix without making it louder than it should be, and b) when an announcer’s voice is used to automatically turn the music down underneath the announcement. We’ll take a closer look at some typical uses of the sidechain input at the end of this instalment (scroll down to ‘Ducking & Sidechaining’), but to understand how it works we need to understand the interaction between the threshold control and the ratio control, and how they determine the amount of gain reduction that will be applied to the signal passing through the compressor. Read on…
COMPRESSION & THRESHOLD
Downward compression, which we are focusing on here, is the process of applying gain reduction only when the input signal level exceeds the threshold level. All signal levels that are below the threshold level pass through the Variable Gain Cell unaffected. Gain reduction is only applied when the input signal level exceeds the threshold level, therefore choosing the correct threshold is fundamental to determining how the compressor affects the signal.
In the illustration above the threshold has been set to -15dBFS, as indicated with the vertical blue line. All input signal levels that exceed -15dBFS (shaded in red) will have gain reduction applied in accordance with the setting of the ratio control. The settings of the knee, attack and release controls will affect how the compressor applies the gain reduction – we’ll look at those in detail in the second instalment of this series.
At this point in the discussion it is important to address a common misconception. As stated earlier, gain reduction is only applied when the input signal level exceeds the threshold level, but this doesn’t mean it only affects the part of the envelope that is above the threshold level. Gain reduction is simply a controlled reduction of level, and it affects the entire envelope whenever the input signal level goes above the threshold. For example, consider an electric guitar amplifier that has some hum and buzz in the background. If a part of the guitar’s envelope goes above the threshold and causes 4dB of gain reduction, that gain reduction will also reduce the hum and buzz by 4dB in that part of the envelope.
COMPRESSION & RATIO
As stated earlier, when the input signal level rises above the threshold level, the Envelope Follower instructs the Variable Gain Cell to apply gain reduction. How much gain reduction it applies is determined by the ratio control, which expresses the ratio between changes of input signal level above threshold and changes of output signal level above threshold as follows:
Ratio = ΔInput : ΔOutput
Where:
ΔInput represents change of input signal level above the threshold level
ΔOutput represents change of output signal level above the threshold level
In mathematics it is common to use the Greek symbol delta (δ or Δ) to represent a relative change in a variable, rather than its absolute values. For example, if a signal’s level changed from -5dBFS to -2dBFS we could represent that level change as ΔLevel = 3dB. In this discussion about dynamic processors, ‘ΔInput’ means “change of input signal level above the threshold level”, and ‘ΔOutput’ means “change of output signal level above the threshold level”.
We can mathematically determine ΔInput and ΔOutput as follows:
ΔInput = Input – Threshold
ΔOutput = Output – Threshold
Where:
Input = input signal level in dBFS
Output = output signal level in dBFS
Threshold = threshold level in dBFS
Let’s say a peak-sensing compressor’s threshold level was set to -10dBFS, the input signal level was -4dBFS and the output signal level was -7dBFS. Here’s the maths:
ΔInput = Input – Threshold
ΔInput = -4dBFS – -10dBFS = 6dB
ΔOutput = Output – Threshold
ΔOutput = -7dBFS – -10dBFS = 3dB
Note that the values for ΔInput and ΔOutput are not specified as dBFS values because they are not absolute values; they represent the relative difference between two dBFS values. For the same reason, they do not have a positive or negative indicator.
From these figures we can calculate the compressor’s ratio using the formula given earlier:
Ratio = ΔInput : ΔOutput
Ratio = 6dB : 3dB
Which can be mathematically simplified to 2:1 by dividing both sides of the ratio by the lower of the two ratio values, which in this example is 3. So 6/3 = 2 and 3/3 = 1, hence the 6:3 ratio is simplified to 2:1.
A ratio of 2:1 means that if the input signal level increases by 2dB above the threshold level, the output signal level will increase by 1dB above the threshold level.
ΔInput = 2dB
ΔOutput = 1dB
Ratio = 2dB : 1dB = 2:1
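The steps above can be sketched as a short function that works out the ratio from three measured levels. The name `ratio_from_levels` is hypothetical, and the sketch assumes a peak-sensing compressor with both levels sitting above the threshold.

```python
def ratio_from_levels(threshold, input_level, output_level):
    """Return n for a ratio of n:1, given levels in dBFS.

    Only meaningful when both levels are above the threshold.
    """
    delta_input = input_level - threshold
    delta_output = output_level - threshold
    if delta_input <= 0 or delta_output <= 0:
        raise ValueError("levels must be above the threshold")
    return delta_input / delta_output


n = ratio_from_levels(threshold=-10.0, input_level=-4.0, output_level=-7.0)
print(f"{n:g}:1")  # 2:1
```

Dividing ΔInput by ΔOutput performs the simplification to n:1 in one step.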
Using transposition, if we know ΔInput and the ratio we can calculate ΔOutput as follows:
ΔOutput = ΔInput / Ratio
Note that this formula assumes the ratio has already been converted into a ratio of n : 1, in which case we can use n as the ratio value, e.g. if the ratio is 2:1, then n = 2.
So, if ΔInput = 2dB and the ratio was 2:1, then:
ΔOutput = ΔInput / Ratio = 2dB / 2 = 1dB
The input signal level goes 2dB above threshold, and the corresponding output signal level goes 1dB above threshold.
THRESHOLD, RATIO & GAIN REDUCTION
The illustration below shows the relationship between threshold and ratio, along with a number of different ratios that we’ll be using in the following examples. As discussed earlier (scroll up to ‘Transfer Characteristic’), the graph shows how input signal levels will be transferred to output signal levels; for any given input signal level we can project downwards to the appropriate ratio’s curve and then to the right to find the corresponding output signal level.
Note that all input signal levels below the threshold level have a ratio of 1:1, in other words, the output level is directly proportional to the input level. If the input signal level increases by 1dB, the output signal level also increases by 1dB. Gain reduction is only applied when the input signal level exceeds the threshold level, at which point the ratio determines the output signal level as we’ve seen earlier.
Note also that for any given threshold and ratio on this graph we can see the maximum possible output signal level. For example, the graph above has a threshold of -24dBFS and we can see that with a ratio of 2:1 the maximum output signal level will be -12dBFS. Why? When the input signal level reaches the maximum of 0dBFS it has exceeded the threshold by 24dB, meaning ΔInput = 24dB. Applying a 2:1 ratio means ΔOutput = 24/2 = 12dB. Therefore the output signal level will be 12dB above the threshold of -24dBFS, which is -12dBFS. Similarly, we can see that the maximum output signal level will be -16dBFS at 3:1, -20dBFS at 6:1 and -22dBFS at 12:1. All of these figures assume the threshold is set to -24dBFS, and the compressor’s attack time is set as fast as possible. As we’ll see in the second instalment of this four-part series, slowing down the attack time will allow sudden peak levels to sneak through before the compressor has time to apply the appropriate gain reduction. If those peaks exceed 0dBFS on the input they will ultimately clip, of course.
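The maximum output figures quoted above can be checked with a few lines of code. This is a sketch only: `max_output_level` is a hypothetical name, and it assumes an instantaneous attack and an input that tops out at exactly 0dBFS.

```python
def max_output_level(threshold, ratio):
    """Output level in dBFS when the input reaches 0dBFS."""
    delta_input = 0.0 - threshold  # how far 0dBFS sits above the threshold
    return threshold + delta_input / ratio


for ratio in (2, 3, 6, 12):
    print(f"{ratio}:1 -> {max_output_level(-24.0, ratio):g}dBFS")
```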
If we know the input signal level and the output signal level we can determine how much gain reduction has been applied by the compressor, as follows:
Gain Reduction = Input – Output
Similarly if we know the threshold level and we know ΔOutput we can determine the compressor’s output signal level, as follows:
Output = Threshold + ΔOutput
Knowing these things allows us to understand what a downward compressor is doing.
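Putting these formulae together gives a compact model of a downward compressor’s static behaviour. The sketch below uses a hypothetical function name, `compress`, and assumes a hard knee with instantaneous attack and release; knee, attack and release are ignored entirely.

```python
def compress(input_level, threshold, ratio):
    """Static downward compression: return (output_level, gain_reduction).

    Levels are in dBFS, gain reduction is in dB, ratio is n for n:1.
    """
    if input_level <= threshold:
        return input_level, 0.0  # below threshold: 1:1, no gain reduction
    delta_output = (input_level - threshold) / ratio
    output_level = threshold + delta_output
    return output_level, input_level - output_level


print(compress(-16.0, threshold=-10.0, ratio=2.0))  # (-16.0, 0.0)
print(compress(-4.0, threshold=-10.0, ratio=2.0))   # (-7.0, 3.0)
```

The second call reproduces the earlier example, where a threshold of -10dBFS turned an input of -4dBFS into an output of -7dBFS.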
SOME TYPICAL EXAMPLES
Here are some examples of how the threshold and the ratio interact to determine the gain reduction in a downward compressor. Unless otherwise stated, the following examples assume a peak-sensing downward compressor with the theoretically fastest possible attack and release times of 0s (zero seconds).
Let’s start with a ratio of 2:1. If the threshold level was set to -20dBFS and the input signal level reached -14dBFS, the output signal level would only reach -17dBFS. Why? Because the input signal level exceeds the threshold level by 6dB, and the 2:1 ratio means a 6dB increase above threshold at the input will result in a 3dB increase above threshold at the output. It also coincidentally means a gain reduction of 3dB has been applied by the compressor. Here’s the maths that ties it all together:
Threshold = -20dBFS
Input = -14dBFS
ΔInput = -14dBFS – -20dBFS = 6dB
Ratio = 2:1
ΔOutput = ΔInput / Ratio = 6dB / 2 = 3dB
Output = Threshold + ΔOutput = -20dBFS + 3dB = -17dBFS
Gain Reduction = Input – Output = -14dBFS – -17dBFS = 3dB
In the example above we see that a threshold level of -20dBFS and a ratio of 2:1 reduces an input signal level of -14dBFS to an output signal level of -17dBFS. The compressor does this by applying up to 3dB of gain reduction to the parts of the signal’s envelope that exceed the threshold level, as shown in the illustration below:
What if we keep the threshold at -20dBFS but increase the ratio to 3:1?
Threshold = -20dBFS
Input = -14dBFS
ΔInput = -14dBFS – -20dBFS = 6dB
Ratio = 3:1
ΔOutput = ΔInput / Ratio = 6dB / 3 = 2dB
Output = Threshold + ΔOutput = -20dBFS + 2dB = -18dBFS
Gain Reduction = Input – Output = -14dBFS – -18dBFS = 4dB
In the example above we see that a threshold level of -20dBFS and a ratio of 3:1 reduces an input signal level of -14dBFS to an output signal level of -18dBFS. The compressor does this by applying up to 4dB of gain reduction to the parts of the signal’s envelope that exceed the threshold level, as shown in the illustration below:
Let’s do it one more time, keeping the threshold at -20dBFS but pushing the ratio up to 6:1…
Threshold = -20dBFS
Input = -14dBFS
ΔInput = -14dBFS – -20dBFS = 6dB
Ratio = 6:1
ΔOutput = ΔInput / Ratio = 6dB / 6 = 1dB
Output = Threshold + ΔOutput = -20dBFS + 1dB = -19dBFS
Gain Reduction = Input – Output = -14dBFS – -19dBFS = 5dB
In the example above the compressor converted a 6dB increase of input signal level above threshold into a 1dB increase of output signal level above threshold, as shown in the illustration below:
Despite -20dBFS being a relatively low value for the threshold, it is only 6dB lower than the input signal’s maximum level of -14dBFS and therefore we can consider it to be a high threshold for this example because it is relatively close to the input signal’s maximum level without exceeding it. Similarly, a ratio of 6:1 can be considered a high ratio when the compressor only has 6dB of input signal above the threshold to work with (ΔInput = 6dB, from -20dBFS to -14dBFS). Any ratio that equals or exceeds ΔInput can be considered high for the application because it will be attempting to compress all of ΔInput into a ΔOutput value of 1dB or less. Remember:
Ratio = ΔInput : ΔOutput
and:
ΔOutput = ΔInput / Ratio
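Combining these two formulae with Gain Reduction = Input – Output lets us calculate the gain reduction directly: Gain Reduction = ΔInput × (1 – 1/Ratio). This rearrangement is not given in the text but follows from its formulae. The sketch below, using a hypothetical function name and assuming the same instantaneous peak-sensing compressor, applies it to the three ratios explored in this section.

```python
def gain_reduction(input_level, threshold, ratio):
    """Gain reduction in dB for a level in dBFS; zero at or below threshold."""
    delta_input = max(input_level - threshold, 0.0)
    return delta_input * (1 - 1 / ratio)


for ratio in (2, 3, 6):
    gr = gain_reduction(-14.0, threshold=-20.0, ratio=ratio)
    print(f"{ratio}:1 -> {gr:.1f}dB of gain reduction, output {-14.0 - gr:.1f}dBFS")
```

With ΔInput fixed at 6dB, each increase in ratio pushes the gain reduction closer to ΔInput itself (3dB, then 4dB, then 5dB) while leaving less and less room for ΔOutput.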
This type of ‘high threshold/high ratio’ processing would normally require hard peak limiting; it would be acceptable for reining in one or two short and fast attack transients, but might not be acceptable on sustained and/or non-transient sounds. Which brings us to…
AN EXTREMELY COMMON PROBLEM
Let’s finish this instalment with an extreme example of this ‘high threshold/high ratio’ combination that appears all-too-often on mixing and mastering forums. A confused novice engineer will post a question that reads something like this:
“I have a peak limiter over the mix bus with a threshold of -1dBFS to prevent any clipping. The signal never goes over 0dBFS and the clipping indicators do not light up, but I can still hear clipping. Why?”
The simplest one-word answer is ‘ignorance’; this harsh response will raise a laugh but to the novice engineer it’s as unhelpful as ‘just trust your ears’ because a) it doesn’t explain or solve the problem, and b) they are trusting their ears, which is why they are asking this question.
What is happening here?
As we know, 0dBFS is the maximum possible signal level in a digital audio system and going above 0dBFS causes clipping distortion. Let’s say the mix in this example sits comfortably below the threshold of -1dBFS at all times except for attack transients on the snare that would theoretically reach up to +3dBFS but in practice will ultimately clip at 0dBFS. The peak limiter has been placed over the mix bus with a threshold of -1dBFS to catch those unwanted peaks and thereby prevent clipping. A peak limiter uses a very high ratio; typically anywhere between 10:1 and infinity:1. For this example we’ll assume it has a ratio of 20:1, which creates an excellent example of high threshold/high ratio processing. We’ll also assume that, being a peak limiter, it has been set to its fastest attack and release times.
Why is the engineer hearing clipping? Here’s the maths:
Threshold = -1dBFS
Input = +3dBFS
ΔInput = +3dBFS – -1dBFS = 4dB
Ratio = 20:1
ΔOutput = ΔInput / Ratio = 4dB / 20 = 0.2dB
Output = Threshold + ΔOutput = -1dBFS + 0.2dB = -0.8dBFS
Gain Reduction = Input – Output = +3dBFS – -0.8dBFS = 3.8dB
The compressor settings and metering for this example are shown in the illustration above. Note that, in addition to showing levels above 0dBFS, the meter scales in this example have been zoomed in so that each segment represents 0.2dB rather than 1dB as seen in the earlier illustrations. Also note that we are looking at signal levels on the mix bus; at this point the DAW is most likely operating in 64-bit floating point precision and can internally process a peak level of +3dBFS as shown here without clipping, but clipping will occur when the mix is bounced down to the 24-bit linear PCM format as used throughout the audio industry. Why? Because 0dBFS on the DAW’s mix bus meters is 0dBFS in the linear PCM format, and going above it will ultimately cause clipping in the bounced mix.
So what is happening here? The peak limiter is applying 3.8dB of gain reduction to compress the top 4dB of the snare’s envelope down to just 0.2dB (between -1dBFS and -0.8dBFS). This ‘high threshold/high ratio’ peak limiting reduces ΔOutput so much that the signal might as well be clipping: it looks like clipping and it sounds like clipping, but it is not exceeding 0dBFS and therefore it is not indicated as clipping on the meters. The peak limiter has saved the DAW from clipping by doing the clipping itself. It’s maths, not magic.
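The flat-topping effect is easy to see if we run a few different peak levels through the same settings. In this sketch the function name `limit` and the +1dBFS and +2dBFS peaks are hypothetical; only the +3dBFS peak, the -1dBFS threshold and the 20:1 ratio come from the example, and instantaneous attack and release are assumed.

```python
def limit(input_level, threshold=-1.0, ratio=20.0):
    """Static output level in dBFS for the limiter settings in this example."""
    if input_level <= threshold:
        return input_level
    return threshold + (input_level - threshold) / ratio


# Hypothetical peak levels in dBFS; only the +3dBFS value comes from the text.
for peak in (1.0, 2.0, 3.0):
    out = limit(peak)
    print(f"in {peak:+.1f}dBFS -> out {out:.2f}dBFS (GR {peak - out:.2f}dB)")
```

Peaks that arrive 2dB apart leave the limiter just 0.1dB apart, all squeezed into a sliver immediately above the threshold.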
There is nothing wrong with the peak limiter; it is doing exactly what the user asked it to do, which is to apply a very large ratio of 20:1 to a very small dynamic range of 4dB (i.e. ΔInput). The unhelpful and harsh one-word answer given previously is correct: the cause of the audible but unmetered ‘clipping’ is ignorance – or perhaps a belief in magic, which is essentially the same thing.
Let’s take a closer look. The illustration below shows the peak levels of a short excerpt of the above-mentioned mix showing four of the offending snare peaks labelled A, B, C and D. Each peak’s level on the mix bus is shown next to its label.
To prevent the peaks from clipping the engineer has placed a compressor over the mix bus and configured it as a peak limiter with a threshold of -1dBFS, a ratio of 20:1, and the fastest possible attack and release times. As shown in the illustration below, each peak has essentially been ‘flat-topped’ or clipped to a level somewhere between -0.9dBFS and -0.8dBFS, as calculated with the same maths used earlier. The end result looks and sounds like clipping, but it’s all happening below 0dBFS and therefore won’t be indicated as clipping on the mix bus meters.
Note also that the differences between the individual peak levels above threshold have been reduced by a factor of 20 in accordance with the 20:1 ratio setting. That’s what compression does: in this example it has taken a worst-case difference of 2.7dB (between peaks B and C) and reduced it to about 0.14dB (2.7dB / 20). It’s a powerful tool when used correctly, but not in this example…
A Smarter Solution
It would be smarter to solve this problem by applying a ‘low threshold/low ratio’ compressor directly to the snare channel. The goal for this example is to spread the 3.8dB of gain reduction achieved in the previous example over as much of the snare’s envelope as possible, rather than crushing the top 4dB into 0.2dB and creating what looks and sounds like clipping. If done correctly this ‘low threshold/low ratio’ approach will probably have little effect on the snare sound as it is heard within the mix, but will replace the heavy-handed use of peak limiting over the mix bus with the strategically lighter touch of corrective compression applied directly to the sound that is causing the problem.
Because we are now inserting a compressor directly into the snare channel, rather than over the mix bus, we have to consider the levels of the snare track itself. The illustration below shows the snare track during the same excerpt of the mix shown in the examples above. We can see peaks between -6dBFS and -8.7dBFS on the attack transients, with spill from the rest of the drum kit sitting just below -13dBFS.
Our goal is to spread up to 3.8dB of gain reduction (as achieved in the previous example) over as much of the snare’s envelope as possible. This will minimize the effect of the gain reduction at any point on the snare’s envelope, and avoid creating what looks and sounds like clipping. We want to set the threshold as low as possible without compressing the entire snare sound. Why? Because downward compression often has the side-effect of enhancing a sound’s impact, i.e. making it ‘punchier’, and this is especially noticeable on percussive sounds. We don’t want to add impact to the spill and noise, so for this example we’ll set the threshold to -13dBFS to keep it just above the level of the spill. The gain reduction will therefore only be applied to the snare hits and won’t be enhancing the impact of spill, snare buzz and other unwanted lower level sounds on the snare track that occur when the snare is not being hit.
Having established the threshold level of -13dBFS, we now need to determine the ratio. For this we’ll use the highest peak of the snare hits as a ‘worst-case scenario’, because this is the point where we need to reach 3.8dB of gain reduction in order to match the ‘high threshold/high ratio’ example given earlier. The highest snare peak in this example is -6dBFS. Therefore:
ΔInput = -6dBFS – -13dBFS = 7dB
To reach our target of 3.8dB of gain reduction, a ΔInput of 7dB means the corresponding ΔOutput must be 3.2dB (i.e. 7dB – 3.8dB). What ratio will achieve this?
Ratio = ΔInput : ΔOutput
Ratio = 7 : 3.2 = 2.19 : 1
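The ratio calculation above can be checked with a few lines of Python. This is only a minimal sketch, assuming an idealised hard-knee downward compressor with no attack or release behaviour; the function name ratio_for_gain_reduction and its parameter names are illustrative and do not come from any particular compressor or library.

```python
def ratio_for_gain_reduction(threshold_db, peak_db, target_gr_db):
    """Return N (as in N:1) needed to apply target_gr_db of gain reduction
    to a peak at peak_db, for a given threshold.

    Assumes an idealised hard-knee downward compressor (hypothetical model).
    """
    delta_input = peak_db - threshold_db        # how far the peak exceeds the threshold
    if delta_input <= 0:
        raise ValueError("the peak must be above the threshold")
    delta_output = delta_input - target_gr_db   # what remains above the threshold afterwards
    if delta_output <= 0:
        raise ValueError("the target gain reduction must be smaller than delta_input")
    return delta_input / delta_output


# Values from the snare example: threshold -13dBFS, peak -6dBFS, 3.8dB target.
ratio = ratio_for_gain_reduction(threshold_db=-13.0, peak_db=-6.0, target_gr_db=3.8)
print(f"Ratio = {ratio:.2f} : 1")
```

With the snare values this gives roughly 2.19:1. The two checks in the function reflect the limits of the model: a peak at or below the threshold receives no gain reduction, and no finite ratio can remove more level than the peak's excess over the threshold.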
If we use a threshold of -13dBFS and a ratio of 2.19:1, when the highest level going into the compressor reaches -6dBFS the highest level coming out of the compressor will reach -9.8dBFS. The snare’s peak level has been reduced by 3.8dB, as in the previous example, but without clipping the envelope. Here’s the maths:
Threshold = -13dBFS
Input = -6dBFS
ΔInput = -6dBFS – -13dBFS = 7dB
Ratio = 2.19 : 1
ΔOutput = ΔInput / Ratio = 7dB / 2.19 = 3.2dB
Output = Threshold + ΔOutput = -13dBFS + 3.2dB = -9.8dBFS
Gain Reduction = Input – Output = -6dBFS – -9.8dBFS = 3.8dB
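The same steps can be sketched in Python as a static input/output calculation. This assumes an idealised hard-knee downward compressor working on a single level in dB, with no attack, release or make-up gain; the function name compress is hypothetical.

```python
def compress(input_db, threshold_db, ratio):
    """Static hard-knee downward compression of one level, in dB.

    Returns (output_db, gain_reduction_db). Idealised model: no attack,
    release, knee softening or make-up gain.
    """
    if input_db <= threshold_db:
        # At or below the threshold the signal passes unchanged.
        return input_db, 0.0
    delta_input = input_db - threshold_db    # excess over the threshold at the input
    delta_output = delta_input / ratio       # excess over the threshold at the output
    output_db = threshold_db + delta_output
    return output_db, input_db - output_db


# Highest snare peak from the example.
output_db, gr_db = compress(input_db=-6.0, threshold_db=-13.0, ratio=2.19)
print(f"Output = {output_db:.1f}dBFS, Gain Reduction = {gr_db:.1f}dB")

# A level just below the threshold (e.g. spill at -14dBFS) is left alone.
spill_out, spill_gr = compress(input_db=-14.0, threshold_db=-13.0, ratio=2.19)
print(f"Spill output = {spill_out:.1f}dBFS, Gain Reduction = {spill_gr:.1f}dB")
```

Because 2.19 is a rounded ratio, the results land very slightly away from -9.8dBFS and 3.8dB before rounding to one decimal place.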
In this example we have created the same 3.8dB of gain reduction seen in the previous example, except we did it by applying corrective compression to the snare sound itself rather than applying hard peak limiting over the whole mix. This approach allowed us to spread the required 3.8dB of gain reduction across the top 7dB of the snare’s peak rather than the top 4dB of the mix. The result is a less extreme effect on the shape of the snare’s attack transient that does not look or sound like clipping. The hard peak limiting problem described earlier has been solved with the judicious use of corrective compression.
The illustration above shows how each of the snare envelopes has been altered by the compression. The compressor has reduced the peaks of the transients as intended, applying 3.8dB of gain reduction to the worst-case peak without dramatically altering the shapes of the envelopes or introducing what looks and sounds like clipping. As an added bonus, the overall snare levels and envelopes are now more consistent from hit to hit, helping to maintain dynamic perspective within the mix.
The illustration below overlays the compressed version with the uncompressed version. Although the differences are visually significant, we have to consider two factors when pondering whether the differences will be audibly significant: a) the maximum gain reduction is only 3.8dB, and b) most of the gain reduction is applied to fast transient peaks that contribute very little to the human perception of loudness – until they’re clipped. In this example, the original and compressed versions will probably have similar integrated LUFS values. It might be necessary to tweak the fader level to bring the compressed snare back to the desired level in the mix, but the change in fader level will be relatively small.
PUMPING
When gain reduction is applied it doesn’t only affect the snare sound, it also affects the spill and noise that exists within the snare track. At the bottom of the illustration below we can see the sections of the spill and noise that will be affected:
Note that the effect of the gain reduction on the spill and noise as shown in this illustration is not to scale; the illustration only aims to show the parts of the spill and noise that will be affected, and provide an indication of how they are affected. Each part gets the gain reduction applied as soon as the snare signal’s level exceeds the threshold level, and returns to 0dB gain reduction when the snare signal has fallen back to the threshold level. It looks as though the spill and noise were instantly turned down then faded up to their normal level, which is essentially what happens. If the compressor’s attack and release times are not set correctly this ‘fade up’ becomes audible, creating a pumping or breathing effect where we can hear the spill and noise fading back up as the gain reduction is removed. More about pumping and breathing when we look at attack and release times in the next instalment…
COMPRESSING ON…
Throughout this instalment we have explored the dynamic processing parameters of threshold, ratio and gain reduction as they apply to downward compression and peak limiting. We can determine the values of these parameters mathematically, as demonstrated throughout this instalment. Maybe the results will sound good, maybe they won’t, but nonetheless these calculations provide a mathematically-informed starting point that we can work from. If we understand how the parameters interact we can make changes accordingly to optimise the resulting sound quality.
The interactions between threshold, ratio and gain reduction are encapsulated in the following four statements:
1: For any given threshold, increasing the ratio will increase the gain reduction and decreasing the ratio will decrease the gain reduction.
2: For any given ratio, increasing the threshold will decrease the gain reduction and decreasing the threshold will increase the gain reduction.
3: For any given gain reduction, increasing the threshold will require a higher ratio and decreasing the threshold will require a lower ratio.
4: For any given gain reduction, increasing the ratio will require a higher threshold, and decreasing the ratio will require a lower threshold.
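The four statements can be spot-checked numerically. The sketch below assumes an idealised hard-knee downward compressor and a fixed input peak of -6dBFS (the statements only hold when comparing settings against the same input level); all function and variable names are hypothetical.

```python
def gain_reduction(input_db, threshold_db, ratio):
    """Gain reduction in dB for an idealised hard-knee downward compressor."""
    delta_input = max(0.0, input_db - threshold_db)
    return delta_input * (1.0 - 1.0 / ratio)


def ratio_for(input_db, threshold_db, target_gr_db):
    """Ratio needed to reach target_gr_db at input_db for a given threshold."""
    delta_input = input_db - threshold_db
    return delta_input / (delta_input - target_gr_db)


def threshold_for(input_db, ratio, target_gr_db):
    """Threshold needed to reach target_gr_db at input_db for a given ratio."""
    return input_db - target_gr_db / (1.0 - 1.0 / ratio)


PEAK_DB = -6.0  # fixed input peak used for every comparison

# Statement 1: same threshold, higher ratio -> more gain reduction.
gr_ratio_2 = gain_reduction(PEAK_DB, -13.0, 2.0)
gr_ratio_4 = gain_reduction(PEAK_DB, -13.0, 4.0)

# Statement 2: same ratio, higher threshold -> less gain reduction.
gr_thresh_low = gain_reduction(PEAK_DB, -13.0, 2.0)
gr_thresh_high = gain_reduction(PEAK_DB, -10.0, 2.0)

# Statement 3: same gain reduction (3dB), higher threshold -> higher ratio.
ratio_thresh_low = ratio_for(PEAK_DB, -13.0, 3.0)
ratio_thresh_high = ratio_for(PEAK_DB, -10.0, 3.0)

# Statement 4: same gain reduction (3dB), higher ratio -> higher threshold.
thresh_ratio_2 = threshold_for(PEAK_DB, 2.0, 3.0)
thresh_ratio_4 = threshold_for(PEAK_DB, 4.0, 3.0)

print(f"1: {gr_ratio_2:.2f}dB at 2:1, {gr_ratio_4:.2f}dB at 4:1")
print(f"2: {gr_thresh_low:.2f}dB at -13dBFS, {gr_thresh_high:.2f}dB at -10dBFS")
print(f"3: {ratio_thresh_low:.2f}:1 at -13dBFS, {ratio_thresh_high:.2f}:1 at -10dBFS")
print(f"4: {thresh_ratio_2:.1f}dBFS at 2:1, {thresh_ratio_4:.1f}dBFS at 4:1")
```

Each pair of values moves in the direction the corresponding statement predicts. The test values (2:1, 4:1, -13dBFS, -10dBFS and a 3dB target) were chosen for easy arithmetic and are not taken from the examples in this instalment.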
In the following instalments of this four-part series we’ll look at the parameters that allow us to optimise the compressor so it is controlling the dynamic range as demonstrated above while also sounding good and serving the music…
Kick Ducking Bass
In this sidechaining technique the kick drum momentarily ducks (i.e. reduces) the level of the bass guitar, which has the effect of giving the kick drum more impact and clarity in the mix without turning it up louder than it should be and without EQing it inappropriately. The bass guitar signal passes through the compressor’s Variable Gain Cell, but the kick drum controls the compressor’s Envelope Follower (and therefore the gain reduction applied to the bass guitar) via a split or an auxiliary send connected to the sidechain input – as shown below.
Note that a fourth meter has been added [far left side of the illustration] to represent the level of the signal entering the sidechain input to control the compressor. In this example the bass guitar’s level is ducked by 2dB with the beat of the kick drum. How? The compressor’s threshold is set to -18dBFS and the kick reaches a peak of -14dBFS, creating a ΔInput of 4dB. With a ratio of 2:1, a ΔInput of 4dB results in a ΔOutput of 2dB. The compressor has turned a 4dB change of input into a 2dB change of output, therefore it has applied 4 – 2 = 2dB of gain reduction to the Variable Gain Cell. The bass guitar’s level is therefore reduced from -10dBFS at the input to -12dBFS at the output – but only when the kick drum is played.
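The arithmetic in this example can be sketched in Python. The key point is that the gain reduction is calculated from the sidechain (key) level but subtracted from the programme level. This is a static, idealised hard-knee model with no attack or release behaviour, and the function name sidechain_duck is hypothetical.

```python
def sidechain_duck(program_db, key_db, threshold_db, ratio):
    """Return (program_output_db, gain_reduction_db).

    The gain reduction is derived from the key (sidechain) level, then
    applied to the programme signal passing through the gain cell.
    Idealised static model: hard knee, no attack or release.
    """
    delta_input = max(0.0, key_db - threshold_db)   # key's excess over the threshold
    delta_output = delta_input / ratio
    gain_reduction_db = delta_input - delta_output
    return program_db - gain_reduction_db, gain_reduction_db


# Kick (key) peaks at -14dBFS, threshold -18dBFS, ratio 2:1, bass at -10dBFS.
bass_out_db, gr_db = sidechain_duck(program_db=-10.0, key_db=-14.0,
                                    threshold_db=-18.0, ratio=2.0)
print(f"Gain Reduction = {gr_db:.1f}dB, bass output = {bass_out_db:.1f}dBFS")

# When the kick is silent (key far below threshold) the bass is untouched.
bass_idle_db, gr_idle_db = sidechain_duck(program_db=-10.0, key_db=-60.0,
                                          threshold_db=-18.0, ratio=2.0)
print(f"Gain Reduction = {gr_idle_db:.1f}dB, bass output = {bass_idle_db:.1f}dBFS")
```

Note that the bass guitar's own level never enters the gain reduction calculation; it only determines where the output ends up once the gain reduction has been applied.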
Note that this type of mixing trick needs to be transparent; we don’t want the listener to notice any changes in the bass guitar’s level. To achieve this we need to use small amounts of gain reduction, typically no more than a dB or two. We also need the gain reduction to follow the kick drum’s envelope as closely as possible so that the kick drum’s sound masks the changes in the bass guitar’s level; this requires the use of a hard knee, a very fast attack time and a very fast release time.
Announcer Talking Over Music
In this sidechaining technique the music is ducked every time the announcer speaks, creating the effect that someone is quickly fading down the music whenever the announcer speaks and then skilfully fading it back up to the proper level afterwards. The music passes through the compressor’s Variable Gain Cell, but the announcer’s voice controls the compressor’s Envelope Follower (and therefore the gain reduction applied to the music) via the sidechain input – as shown below.
In this example the announcer’s voice ducks the level of the music by 4dB. How? The compressor’s threshold is set to -15dBFS and the voice reaches a peak of -9dBFS, creating a ΔInput of 6dB. With a ratio of 3:1, a ΔInput of 6dB results in a ΔOutput of 2dB. The compressor has turned a 6dB change of input into a 2dB change of output, therefore it has applied 6 – 2 = 4dB of gain reduction to the Variable Gain Cell. The music’s level is therefore reduced by 4dB, from -12dBFS at the input to -16dBFS at the output.
Note that this type of processing needs to sound smooth and controlled. To achieve this we need to use appropriate amounts of gain reduction to allow the voice to be heard clearly over the music, typically between 3dB and 6dB. We want that gain reduction to come in quickly but smoothly at the start of the announcement, and we want the music to gently fade back in after the announcement; this requires a moderately fast attack time and a moderately slow release time, combined with RMS sensing and a soft knee to create a sense of natural ‘finger on a fader’ ballistics rather than electronic precision.
De-Essing et al
This sidechaining technique places a filter or EQ in the sidechain path, which allows us to apply more gain reduction to some frequencies than others. The voice to be de-essed passes through the compressor’s Variable Gain Cell, while a filtered or equalised version of the voice enters the compressor’s Envelope Follower via the sidechain input – as shown below.
In this example the voice’s level is ducked by 6dB when a strong ‘ess’ sound, aka sibilance, occurs. How? The voice signal enters the sidechain input via a high pass filter (HPF) that has been adjusted to filter out everything below 4kHz; therefore the gain reduction is controlled only by the parts of the voice that exist above 4kHz, which is where sibilance typically exists. The compressor’s threshold is set to -24dBFS and the sibilance reaches a peak of -16dBFS, creating a ΔInput of 8dB. With a ratio of 4:1, a ΔInput of 8dB results in a ΔOutput of 2dB. The compressor has turned an 8dB change of input into a 2dB change of output, therefore it has applied 8 – 2 = 6dB of gain reduction to the Variable Gain Cell. The voice’s level is therefore reduced from -10dBFS at the input to -16dBFS at the output during the sibilance.
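To see why the sidechain filter matters, the sketch below compares the gain reduction triggered by a low-frequency component of the voice with and without the filter. It rests on several assumptions that are not part of the example above: the filter is modelled as an analogue first-order high-pass (much gentler than a filter that removes 'everything below 4kHz'), the voice is reduced to two hypothetical components (a 200Hz vowel component at -10dBFS and 8kHz sibilance at -16dBFS, both measured before the filter), and the compressor is an idealised hard-knee model. All names are illustrative.

```python
import math


def hpf_gain_db(freq_hz, cutoff_hz):
    """Gain in dB of an analogue first-order high-pass filter at freq_hz."""
    x = freq_hz / cutoff_hz
    return 20.0 * math.log10(x / math.sqrt(1.0 + x * x))


def gain_reduction(key_db, threshold_db, ratio):
    """Gain reduction in dB for an idealised hard-knee downward compressor."""
    delta_input = max(0.0, key_db - threshold_db)
    return delta_input - delta_input / ratio


THRESHOLD_DB = -24.0
RATIO = 4.0
CUTOFF_HZ = 4000.0

# Hypothetical pre-filter levels of two components of the voice.
VOWEL_DB, VOWEL_HZ = -10.0, 200.0
ESS_DB, ESS_HZ = -16.0, 8000.0

# Levels seen by the Envelope Follower after the sidechain filter.
vowel_key_db = VOWEL_DB + hpf_gain_db(VOWEL_HZ, CUTOFF_HZ)
ess_key_db = ESS_DB + hpf_gain_db(ESS_HZ, CUTOFF_HZ)

gr_vowel_unfiltered = gain_reduction(VOWEL_DB, THRESHOLD_DB, RATIO)
gr_vowel_filtered = gain_reduction(vowel_key_db, THRESHOLD_DB, RATIO)
gr_ess_filtered = gain_reduction(ess_key_db, THRESHOLD_DB, RATIO)

print(f"Vowel, no sidechain filter:   {gr_vowel_unfiltered:.1f}dB of gain reduction")
print(f"Vowel, with sidechain filter: {gr_vowel_filtered:.1f}dB of gain reduction")
print(f"Sibilance, with filter:       {gr_ess_filtered:.1f}dB of gain reduction")
```

Without the filter, the louder vowel component would drive the compressor harder than the sibilance does. With the filter in the sidechain path the vowel falls below the threshold and only the sibilance triggers gain reduction, which is the behaviour a de-esser relies on.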
Note that this type of processing needs to be done very carefully because we don’t want the listener to notice any changes in the voice’s level, tonality or intelligibility, but we also don’t want the listener to be distracted by excessive sibilance. Once sibilance is heard, it cannot be unheard! To achieve this we should use no more gain reduction than necessary – the 6dB shown in the example is probably too much for all but the most extreme sibilance problems. We also need the gain reduction to follow the sibilance’s envelope as closely as possible to minimise the effect on the vocal’s level, tonality and intelligibility. This requires the use of a peak-sensing compressor with a very fast attack time and a very fast release time, combined with a knee that works best for the vocalist’s pronunciation (this could be hard, soft, or anywhere in between). The 4kHz HPF used here is purely for the purposes of the example – to do it well we should study the frequency spectrum of the sibilance of the particular voice (it is different for each vocalist) and adjust the filter appropriately to isolate the sibilance from the voice as much as possible, ideally so that only the sibilance enters the sidechain input.
Finally, when doing this sort of processing we should always be careful to avoid ‘throwing out the baby with the bathwater’, i.e. becoming so obsessed with removing sibilance that we fail to notice how badly the de-essing is affecting what remains of the voice sound. A good voice sound with slightly excessive sibilance is better than a bad voice sound with no sibilance.