Indistinguishable From Magic
Greg Simmons is the Founding Editor of AudioTechnology magazine, and looks forward to a time when recording musicians will look at an old interface and ask, “What does that knob marked ‘gain’ do?”
In the early ’60s Arthur C. Clarke wrote: “Any sufficiently advanced technology is indistinguishable from magic.” Thirty years later I was baby-sitting a fully-configured CEDAR (Computer Enhanced Digital Audio Restoration) system; a great opportunity because my work involved restoring audio from tape, vinyl and 78rpm discs. Pre-dating iZotope’s RX and native processing by at least a decade, CEDAR was a PC-based system loaded with DSP chips that offered real-time de-noising, de-clicking, de-crackling, etc. It was amazing, and I wondered how to recreate it using the audio building blocks I was familiar with: amplifiers, filters and dynamics processing. I didn’t even know where to start. It was, essentially, indistinguishable from magic.
I asked CEDAR’s Gordon Reid to explain how it works in terms of the audio building blocks I was familiar with. After some failed attempts, he said, “I can’t explain it in those terms because CEDAR doesn’t ‘see’ an audio signal that way. It just sees information; it looks for patterns in that information and responds to them.” It was my first experience of ‘post-analogue lateral thinking’…
Fast forward to early 2017. I’m poring over a thread that’s unjustly ridiculing the latest field recorders. Supposedly well-informed professionals were dismissing them outright because the limiters were implemented in the digital domain, meaning they were after the converters and presumably too late to prevent overloading. “Ridiculous!” they scoffed. I’d been using these newer-generation field recorders with no limiter problems, and felt compelled to wade in with a dose of ‘post-analogue lateral thinking’.
OUTSTANDING IN THE FIELD
A single channel of a field recorder consists of a mic preamp, a limiter and an AD converter. In a contemporary field recorder these components form an integrated system with no user-adjustable level controls between them. Therefore, with no limiting taking place, monitoring the converter’s output is as valid as monitoring the preamp’s output. So far, so good…
There are numerous ways to make the digital limiter work, but I’m going to focus on the one I find the most interesting. In contemporary field recorders the preamp’s gain is digitally controlled by a rotary encoder or up/down buttons. The digital limiter monitors the output of the converter but applies the gain reduction directly to the digitally-controlled mic preamp. It protects the converter from clipping without any extra processing in the signal path. It is superior to the analogue equivalent in many ways, with one caveat: latency. If the internal processing is fast enough to match or better the attack time of an analogue limiter, it’s a superior solution and another good example of ‘post-analogue lateral thinking’.
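The feedback arrangement described above can be sketched in a few lines of Python. This is a simplified model, not any manufacturer's implementation: the threshold value, the block-based timing and the function name simulate_feedback_limiter are assumptions made purely for illustration, and release behaviour is left out. The limiter reads the level at the converter's output and lowers the preamp gain, so any correction arrives one block late.

```python
THRESHOLD_DBFS = -3.0   # assumed limiter threshold at the converter output
CLIP_DBFS = 0.0         # converter full scale


def simulate_feedback_limiter(input_peaks_db, preamp_gain_db):
    """Model a limiter that watches the converter and adjusts the preamp.

    input_peaks_db: peak level of each block at the preamp input, in dB
    relative to converter full scale at 0 dB of preamp gain.
    Returns one (gain_db, recorded_peak_dbfs, clipped) tuple per block.
    """
    gain_db = preamp_gain_db
    history = []
    for peak_db in input_peaks_db:
        converter_db = peak_db + gain_db          # level the converter sees
        clipped = converter_db > CLIP_DBFS
        history.append((gain_db, min(converter_db, CLIP_DBFS), clipped))
        overshoot = converter_db - THRESHOLD_DBFS
        if overshoot > 0:
            # Gain reduction is applied to the preamp itself, but it only
            # takes effect on the next block: this is the latency.
            gain_db -= overshoot
    return history


if __name__ == "__main__":
    # Quiet ambience followed by a sudden 20 dB jump (made-up levels).
    blocks = [-40.0, -40.0, -20.0, -20.0, -20.0]
    for gain, recorded, clipped in simulate_feedback_limiter(blocks, 30.0):
        print(f"gain {gain:5.1f} dB  recorded {recorded:6.1f} dBFS  clipped={clipped}")
```

In this toy model the first block after the jump clips because the gain reduction has not yet arrived, and the blocks that follow sit at the threshold. The shorter the block (in other words, the lower the latency), the closer the behaviour gets to the attack time of an analogue limiter.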
For years, conventional thinking about AD conversion said there was no point going beyond 24 bits because we couldn’t make a converter that delivered 24-bit performance in terms of dynamic range. The rule of thumb for the dynamic range of a linear PCM system is simple: 6dB-per-bit. Not so long ago we were struggling to get 20-bit performance out of a 24-bit converter. At 6dB per bit a converter with 20-bit performance offers 120dB of dynamic range, which is less than a typical condenser microphone (e.g. 125dB for a Neumann KM184). One of the goals for manufacturers whose products are intended for situations where you only get one chance to record it (e.g. live performances, location recording) is to make a converter with a dynamic range that exceeds that of the microphones. It would be one less bottleneck in the signal path, and bring us closer to a situation where setting mic gain before recording will be an option rather than a necessity.
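The 6dB-per-bit arithmetic is easy to check. The short Python sketch below compares the rule of thumb with the ratio between full scale and one quantisation step, 20·log10(2^bits), and works out how many bits of real-world performance a converter would need to match the 125dB microphone figure quoted above. The function names are my own, chosen for illustration.

```python
import math


def rule_of_thumb_db(bits):
    """Dynamic range using the 6 dB-per-bit rule of thumb."""
    return 6.0 * bits


def dynamic_range_db(bits):
    """Ratio of full scale to one quantisation step, in dB."""
    return 20 * math.log10(2 ** bits)


KM184_DYNAMIC_RANGE_DB = 125  # microphone figure quoted in the article

# Smallest whole number of bits whose rule-of-thumb range covers the mic.
bits_needed = math.ceil(KM184_DYNAMIC_RANGE_DB / 6.0)

if __name__ == "__main__":
    for bits in (16, 20, 24):
        print(f"{bits}-bit: rule of thumb {rule_of_thumb_db(bits):.0f} dB, "
              f"calculated {dynamic_range_db(bits):.1f} dB")
    print(f"Bits of performance needed to cover the mic: {bits_needed}")
```

By this reckoning a converter needs at least 21 bits of genuine performance before it stops being the bottleneck for that microphone.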
GAIN FIRING RANGE
In this issue, Stephan Schutze reviews Zoom’s F6 field recorder which, like Sound Devices’ MixPre II series, offers the ability to record in ‘32-bit float’ format. It’s a different format to linear PCM so we can’t apply the 6dB-per-bit rule, but the 32-bit float format offers huge dynamic range, way more than a microphone could ever deliver. To achieve this, these recorders use a technique known as ‘gain ranging’. Instead of using one preamp and one converter on each mic input, they use two. Both preamps have fixed gain. One preamp/converter combo is optimised for low level signals, the other for high level signals. The outputs of both converters are combined through DSP to create a 32-bit floating point signal that is essentially impossible for any contemporary microphone to drive into clipping. Gain-ranging isn’t new (you’ll find it in the AES42 digital microphones from a decade ago), but it’s another example of ‘post-analogue lateral thinking’.
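To make the gain-ranging idea concrete, here is a minimal Python sketch of two fixed-gain paths feeding two idealised 24-bit converters, with the DSP choosing between them sample by sample. The gain values, the switching threshold and all of the names are hypothetical; neither Zoom nor Sound Devices publishes its combining method in this form, and a real design would blend the two paths more carefully than this hard switch does.

```python
FULL_SCALE = 2 ** 23 - 1   # largest code of a 24-bit converter
HIGH_GAIN_DB = 30.0        # hypothetical path optimised for quiet signals
LOW_GAIN_DB = -10.0        # hypothetical path optimised for loud signals
SWITCH_THRESHOLD = 0.9     # fraction of full scale where we stop trusting the high-gain path


def db_to_linear(db):
    return 10 ** (db / 20)


def adc_24bit(voltage):
    """Idealised converter: clip to full scale, then quantise to an integer code."""
    clipped = max(-1.0, min(1.0, voltage))
    return round(clipped * FULL_SCALE)


def gain_ranged_sample(mic_signal):
    """Combine both converter outputs into one floating point sample."""
    high_code = adc_24bit(mic_signal * db_to_linear(HIGH_GAIN_DB))
    low_code = adc_24bit(mic_signal * db_to_linear(LOW_GAIN_DB))
    if abs(high_code) < SWITCH_THRESHOLD * FULL_SCALE:
        # Quiet signal: use the high-gain path and undo its fixed gain.
        return high_code / FULL_SCALE / db_to_linear(HIGH_GAIN_DB)
    # Loud signal: the high-gain path is at or near clipping, so use the other one.
    return low_code / FULL_SCALE / db_to_linear(LOW_GAIN_DB)


if __name__ == "__main__":
    # 1.0 represents the full scale of a single converter at 0 dB of gain.
    for level in (0.001, 0.5, 2.0):
        print(f"input {level}: combined output {gain_ranged_sample(level):.6f}")
```

In this model an input of 2.0, twice what a single unity-gain converter could handle, comes out intact because the low-gain path still has headroom, while the quiet input benefits from the finer resolution of the high-gain path.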
The F6 offers recording in 24-bit or 16-bit linear PCM format, along with the option of recording the 32-bit float signal. I’m willing to bet that the linear PCM signal is derived digitally from the 32-bit float signal, and that the knob used to control the recording gain is placed after the converters — a gamble backed by the fact that it affects the recorded level of the linear PCM signals but not the 32-bit float signal. Whatever the case, if you’re recording in 32-bit float mode from a microphone source there is no need to worry about gain anyway because you cannot overload the preamps or converters.
POST ANALOGUE WORLD
Marcel Gnauk of Free To Use Sounds (see ‘Making Free Pay’) recently shared an F6 recording of a passing train, along with screen dumps of waveforms. The whistle blows and drives the 24-bit signal well into clipping. Thankfully, he was also capturing the 32-bit float signal and was able to recover the recording. It gets a bit gritty at the peak when the mic’s diaphragm reaches its maximum SPL, but that has nothing to do with the F6. Yet another example of ‘post-analogue lateral thinking’.
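The recovery step can be sketched as follows, assuming the 32-bit float samples are already available as a list of numbers in which 1.0 corresponds to full scale of the 24-bit file. The sample values, the -1dBFS target and the function names are made up for illustration; this is not a description of what Marcel actually did or of any particular editor’s workflow.

```python
FULL_SCALE = 2 ** 23 - 1   # largest code in a 24-bit linear PCM file


def to_24bit(samples):
    """Convert float samples to 24-bit codes; anything beyond +/-1.0 clips."""
    return [round(max(-1.0, min(1.0, s)) * FULL_SCALE) for s in samples]


def recover_from_float(samples, target_peak_dbfs=-1.0):
    """Scale the float signal so its peak fits, then convert to 24-bit.

    Returns the 24-bit codes and the linear gain that was applied.
    """
    peak = max(abs(s) for s in samples)
    target = 10 ** (target_peak_dbfs / 20)
    # Only ever turn the signal down; leave quiet recordings alone.
    gain = min(1.0, target / peak) if peak > 0 else 1.0
    return to_24bit([s * gain for s in samples]), gain


if __name__ == "__main__":
    # Hypothetical float samples: the 'whistle' peaks well above full scale.
    take = [0.1, 0.8, 2.5, -3.0, 0.4]
    naive = to_24bit(take)
    recovered, gain = recover_from_float(take)
    print("clipped samples in direct conversion:",
          sum(1 for code in naive if abs(code) == FULL_SCALE))
    print("clipped samples after scaling first:",
          sum(1 for code in recovered if abs(code) == FULL_SCALE))
```

The point is simply that the float file still holds the shape of the peaks, so turning it down before converting to 24-bit keeps the relationships between samples that a direct conversion flattens.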
So where is this leading? Zoom’s tiny F6 contains six ‘gain-free’ inputs. Imagine eight of them in a desktop interface, perfect for the non-technical recording musician. Imagine 24 in a rack-mounting interface, ideal for recording live performances. Worrying about recording levels, pad switches and limiters will be a thing of the past. For those who don’t understand the technology behind this ‘post-analogue lateral thinking’, it will be indistinguishable from magic.
First of all – thank you for a (finally!) good explanation of how 32-bit float recorders work. Reading marketing materials saying “no need to set the gain” makes me suspicious and worried. Reading yours, “preamps have fixed gain”, explains everything and I instantly know how it works.
Sure, not having to worry about clipping is wonderful. No doubt.
But then, reading “you set the levels in post” makes me wonder whether the whole process turns a five-minute adjustment of gain in the field into half an hour of work in post.
For example – the natural sounds: birds, rain, leaves rustling. I like to record these at more or less natural levels. To an extent, of course. This means not normalising everything to, say, -23 LUFS. No, if the sound is quiet, I want to preserve the feeling of quietness, softness. So, in the field, I set my desired level and I then keep it while in post. Recording at a set gain and then trying to remember how loud or quiet the sound was doesn’t seem appealing.
I’m still learning sound recording and its post-processing but from my long experience in photography I know that setting levels (lights) properly before tripping the shutter saves A LOT of time in post. Sure, one can fix many things in post but it takes a huge amount of time.
I don’t know. Haven’t used a 32-bit recorder yet, I’m just interpolating from my other experiences with fixing stuff on a computer. It’s doable but usually takes longer.
I would appreciate some insight, purely out of curiosity, as a 32-bit float recorder is not something I will buy soon 🙂
Regards
Let me start by apologising for the delay in responding, Pawel! Here’s a reply that is two months late, based on your comment about an article that was published over three years ago. I hope it’s still relevant to you.
I am glad that you found my brief explanation of how 32-bit float conversion works to be helpful. It’s a clever technology, and brings great benefits to those who don’t have the luxury of a soundcheck – especially those making field recordings for atmos, sound effects, sound libraries or personal enjoyment. When the mic preamps and AD converter have a greater dynamic range than the microphone itself, we don’t have to worry about setting gain on the spot.
I don’t know where you read “you set the levels in post”; it doesn’t appear in my article, but I have no doubt it has appeared often in the marketing materials you refer to earlier in your comment. Nonetheless, recording only in 32-bit float does turn the gain setting into something to be done after the recording rather than before. Whether or not that’s a problem depends on your intention: if you’re part of the workflow on a film shoot and there are people downstream expecting dailies immediately after the shoot, you might not have time to convert the 32-bit float audio data into the 24-bit linear PCM that they are expecting. In that case, it is wise to capture in both 32-bit float and 24-bit linear PCM, and try to get the gain as good as you can for the 24-bit linear PCM version as you normally would. It will only affect the level of the 24-bit linear PCM signal, of course, which means the 32-bit float version becomes a safety copy. If something surprising happened on set that clipped the 24-bit linear PCM signal BUT it was the only good take or the director insisted on using it, at least you can go back to the 32-bit float version and recover a 24-bit linear PCM version that does not clip. It’s going to take a bit longer, but you have ultimately ‘saved the day’ by being able to recover an otherwise unusable take.
The big advantage of 32-bit float becomes evident when you are recording in unpredictable circumstances where anything could happen. The train recording I described in my article is a good example of that. Set the gain where you instinctively think it should be set to suit the 24-bit linear PCM files, knowing that you’ve got the 32-bit float signal to fall back on if necessary.
What you’ve said about recording at more or less natural levels is something I spend a lot of time doing. I like situations where I don’t have to change anything; same microphones, same stereo technique, same gains. That keeps things consistent, and is especially helpful when I am making a number of recordings from the same mics in the same placement in the same environment. If I don’t change the gain or anything else, then all of those files should just flow from one to another – perhaps with a simple cross-fade if I’m intending to blend them all into one composition.
I take a different approach to normalisation when making nature recordings or other field recordings where I want to match a number of different takes from the same place but might’ve altered the mic gain from take to take for whatever reason. There might be a number of different birds calling, some rain, some leaves rustling, or whatever, and I want all of those sounds to be in the correct perspective with each other. From each take I will extract a segment of the background atmos where nothing specific is taking place. I will then do a LUFS analysis on those extracted parts and, based on their LUFS, adjust the levels of the original files to ensure they all have the same perceived levels of background sounds. Then, when I play them from one file to another, or crossfade between them, the perceived background levels remain the same and the individual sounds will stand out at their natural levels relative to the background and relative to each other.
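The arithmetic behind that background-matching step is simple enough to sketch in Python. The LUFS readings below are invented, and matching everything to the quietest background (so that files are only ever turned down) is an assumption of this sketch rather than a rule; any common reference would do. The measurements themselves would come from whatever loudness meter you normally use.

```python
def db_to_linear(db):
    """Convert a gain in dB to a linear multiplier for the samples."""
    return 10 ** (db / 20)


def matching_gains_db(background_lufs, reference_lufs=None):
    """Gain (in dB) to apply to each whole take so the backgrounds match.

    background_lufs: take name -> loudness of its extracted background segment.
    reference_lufs: loudness to match; defaults to the quietest background.
    """
    if reference_lufs is None:
        reference_lufs = min(background_lufs.values())
    return {take: reference_lufs - lufs for take, lufs in background_lufs.items()}


if __name__ == "__main__":
    # Hypothetical measurements of the 'nothing happening' segment of each take.
    measured = {"birds": -42.0, "rain": -38.5, "leaves": -45.0}
    for take, gain_db in matching_gains_db(measured).items():
        print(f"{take}: apply {gain_db:+.1f} dB (x{db_to_linear(gain_db):.3f})")
```

Each gain is applied to the entire original file, not just the extracted segment, which is what keeps the birds, rain and leaves at their natural levels relative to a now-consistent background.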
For me, the benefit of 32-bit float becomes evident when something unexpected happens that pushes the 24-bit linear PCM signal into clipping. The 32-bit float signal gives me a ‘second chance’, allowing me to re-create the 24-bit signal without clipping. I will have to do a little bit of extra work and also some level matching (and possibly some limiting or similar) but I have that option available.
Regarding your photographic lighting analogy… I’m a purist in sound and photography and always aim to get everything right at the source capture. However, I have to remind myself that, as a digital photographer, making adjustments to the overall exposure level is a simple ‘one knob’ adjustment that blurs the distinction between objective and subjective decisions. Similarly, altering the exposure in specific parts of the image is very fast and easy to do in Lightroom et al compared to dodging and burning in a darkroom.
A slightly related topic in photography is the use of sensors with ISO invariance, which has some parallels to 32-bit float recording because ISO invariance means we don’t have to get the overall exposure value right during capture. Such sensors don’t save you from clipping the highlights or losing detail in the shadows, of course, but they do mean that the amount of noise in the image won’t be significantly different whether you increase the ISO before shooting or do it later in post (hence ‘invariance’). This is similarly helpful to 32-bit float recording because it means you don’t have to worry too much about ISO and noise while shooting. Being a purist I like to capture images that are ready to go SOOC (Straight Out Of Camera), which means I strive to get my ISO right on the capture, but, as with 32-bit float recording, it’s nice to know that if I accidentally under-expose an image (say I’m doing some street photography at night and have to capture a moment super fast) I won’t be penalised for it with more noise if I increase the ISO in post rather than getting it right before pressing the shutter.
I hope some of that was helpful, Pawel, or, at least interesting…