A Standard Mix Up
Getting to know level standards across the audio disciplines so we can all speak the same language.
Tutorial: Brent Heber
I’m often asked by engineers how easy it is to transfer knowledge between the four traditional mass media industries involving audio editing and mixing: music, TV, film and radio. Short answer: think of it like dialects. While many audio skills are translatable, our cousins in the broadcast suite aren’t necessarily speaking the same language as those native to cinema, or engineers brought up in the music studio. For instance, an understanding of levels in the digital realm and how to meet delivery specifications can be universal, but without knowing each industry’s specific standards you could end up with a mix of gobbledygook. In this article I’m going to dive into the key differences in approach when managing levels between the various formats.
ON THE LEVEL
The key when approaching a project destined for mixed media is to remember the target listening environments, because that’s how the various standards have been developed. If we look at each format and compare how closely each approaches zero dBFS (decibels relative to full scale, zero being the maximum possible digital level measured inside your DAW), then a format that’s generally listened to as background sound with speakers turned down (i.e., music) will want to be the hottest and most compressed in order to compete and be heard. In contrast, a format with a listening environment where speakers are up nice and loud in a soundproof environment, like the cinema, will want the ‘softest’ delivery, far away from clipping and with the most dynamic range. Film mixing engineers are constantly provided music hitting close to digital zero, and the first thing we have to do is drop the level by 20dB so it doesn’t blow up our speakers, which are set much louder than in a comparably-sized music studio.
MUSIC MASTERING
Most readers can relate to delivery for Red Book audio CD: 16-bit/44.1k, get the peaks as close to zero as possible without clipping and job done. This sort of simplistic overview does nothing to illuminate how loud the material is, and loudness is often the first thing new engineers notice when comparing their home recordings with a commercial release; the mastered retail disc is massively louder and ‘fuller’ across the frequency spectrum. Of course, a quality mix destined for CD can’t simply be mastered into shape with Ozone or similar; there needs to be a loud, punchy, clean mix going into the mastering process for it to really shine. Loudness is a key measurement in all mixing and it’s only in recent years that an objective measurement scale has been developed to shed light on what has traditionally been the ‘art’ of mastering. With loudness meters an engineer can now say with certainty this track is exactly as loud as that track and standardise across a release. More about loudness meters later.
With the advent of digital delivery and online music sales we are increasingly moving away from the CD as a delivery format and this has changed the game when it comes to mastering. The MP3 format, love it or hate it, has to be a big factor when mastering a commercial release and given iTunes’ domination, the Apple AAC format is as important if not more so.
Two tools have been released in recent years to assist here. The first, the Sonnox Fraunhofer Pro-Codec plug-in, is the ‘pro’ solution. The plug-in allows real time auditioning of various delivery codecs, so it can be inserted over your mix and you can hear exactly what each adjustment you’re making will sound like once the uncompressed wav has been converted into an MP3 or AAC. The plug-in allows multiple formats to be auditioned easily and if you have a selection you use regularly, you can speed up deliveries by simultaneously bouncing your mix out to an uncompressed master as well as your common formats. At US$295 it’s quite an affordable mastering tool.
The second option is a suite of applications called the Apple Audio Mastering Tools, freely provided by Apple. The tools contain reference guides, droplets, and command line instructions to assist engineers converting uncompressed wav files into the Apple iTunes Plus format (VBR 256kbps AAC with an .m4a extension). It’s a bit more labour intensive (but free!): you simply batch process your wavs to AAC and a roundtrip plug-in will convert them back to wav for you to A/B the results. The suite also includes tools to check for clipping. Interestingly, Apple suggests that 24-bit/96k masters are the submission format of choice for highest quality encoding, so it seems someone has been talking to the industry. Their algorithms and filters are optimised for ‘HD’ files and won’t create aliasing artifacts upon sample rate conversion.
THE AIRWAVES
Much like the music industry, the radio industry is another one locally without any specific delivery requirements. Each station seems to run its own preference for ingest, although the largest radio networks do standardise across their stations for obvious reasons. One of the biggest networks that I’ve had a bit to do with over the years is DMG and their peak level is -6dBFS for content submitted or created in-house. Other networks I’ve worked with have been at -3dBFS. VU is king in the radio networks; you’ll see VU meters in every studio. They have been discussed in detail in AT in the past, so be sure to dig back through if you have any niggling questions on how to best use average/RMS meters like VUs.
In my (albeit limited) research, it seems the glacial shift to digital radio would be a great time to develop some specs across the Australian radio industry, but so far nothing firm has emerged. The closest thing to a standard simply states that advertising should be clearly recognisable by the listener. By comparison, the US specs are for “average levels at -15dB RMS and peaks at -3dBFS typically with speakers set to a reference level of 83dB SPL of pink noise at 0dB.”
SPEAKER CALIBRATION
Moving on from Music and Radio to Film and TV we start using calibrated speaker levels in studios. How does this approach differ? Simply, the engineer is working with his speakers set to a fixed level for all mixing decisions and often a large chunk of the editing process (since mixing and editing are so indelibly linked these days). This means that a freelance engineer should be able to walk into any TV/film room in the country, fire up the system and make informed mixing decisions from a sense of ‘audio memory’ rather than watching meters compulsively. If you mix at the same level all the time, you get a feel for where dialogue should sit, or in Music terms, where the drums/vocal balance should be. This approach has been standardised in the sound-for-picture industries for many years.
Reference levels for speaker calibration are described in terms of how loud each speaker is when you pump pink noise through them individually. Pink noise is chosen because it carries equal power in every octave across the frequency spectrum. The pink noise is set to -20dB RMS inside your DAW of choice and then, with a handheld SPL meter, you turn your speakers up/down until they are sitting at the required dB SPL (acoustic level, C-weighted, slow response), typically 79dB SPL for small edit suites scaling up to 85dB SPL for film dubstages. If all this sounds a bit confusing, there are plenty of tutorials online that will walk the uninitiated through a first speaker calibration to industry levels. Best practice is to use an analogue SPL meter rather than digital for your measurement at the mix position.
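As a rough illustration of the numbers involved, the sketch below measures the RMS level of a test signal, scales it to -20dB RMS, and works out how far to trim the monitors once the SPL meter has been read. It is only a sketch under stated assumptions: the helper names are made up for this example, white noise from Python's random module stands in for real pink noise, and levels are referenced to a full-scale sample value of 1.0 (some meters instead read a full-scale sine as 0dB RMS, which shifts the figure by about 3dB).

```python
import math
import random


def rms_dbfs(samples):
    """RMS level in dB relative to a full-scale sample value of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def scale_to_rms(samples, target_dbfs=-20.0):
    """Apply one static gain so the signal measures target_dbfs RMS."""
    gain = 10 ** ((target_dbfs - rms_dbfs(samples)) / 20.0)
    return [s * gain for s in samples]


def monitor_trim_db(measured_spl, target_spl=79.0):
    """How far to turn the monitor controller up (+) or down (-), in dB."""
    return target_spl - measured_spl


# ASSUMPTION: white noise is used purely as a stand-in test signal.
# A real calibration uses band-limited pink noise from a proper generator.
random.seed(1)
noise = [random.gauss(0.0, 0.1) for _ in range(48000)]
calibration_noise = scale_to_rms(noise, -20.0)

print(f"test signal level: {rms_dbfs(calibration_noise):.2f} dB RMS")
print(f"trim if the meter reads 82dB SPL in a 79dB room: {monitor_trim_db(82.0):+.1f} dB")
```

The trim is a simple difference because both the meter reading and the target are in dB: a speaker reading 82dB SPL in a room that should sit at 79dB SPL comes down by 3dB.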
TELEVISION BROADCAST LEVELS
TV has been going through a fairly serious amount of change of late. Transmitted picture sizes have increased massively from standard definition to 1080 vertical pixels and we have online streaming services putting pressure on local channels to fast-track shows from The States. On the audio side of things a common complaint in years gone by has been that advertising was massively louder than the program it was inserted into, causing listener fatigue and channel surfing to avoid ads. In answer to this a bunch of research was done and a new way of monitoring sound has evolved. Loudness meters are modelled on the behaviour of the human ear (building on Fletcher-Munson et al.) and our level of listening fatigue. This is in part due to specific frequency ranges where our ears are most sensitive but mostly due to intensity of audio over time. This is a key difference in how we’ve worked in the past.
Here’s a simple analogy of how we perceive loudness. Think of a loud car passing once, it’s a bit annoying but you can live with it. Think of the same car parked in your driveway creating the same dB SPL for 30 minutes — less friendly to our ears and more likely to be described as ‘louder’ even when measuring the same peak dB SPL in both instances.
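The car analogy can be put into rough numbers by averaging sound energy over time, which is the idea behind an equivalent continuous level (Leq). The sketch below is illustrative only: the 80dB car, the 40dB background and the one-reading-per-second logging are assumptions invented for the example, and a real loudness meter also applies frequency weighting and gating that are left out here.

```python
import math


def leq(levels_db):
    """Energy-average a series of equally spaced level readings (in dB)."""
    mean_energy = sum(10 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_energy)


# ASSUMED figures: one reading per second over 30 minutes (1800 readings).
passing_car = [80.0] * 10 + [40.0] * 1790   # car drives past for 10 seconds
parked_car = [80.0] * 1800                  # car idles in the driveway throughout

print(f"peak, passing: {max(passing_car):.1f} dB   peak, parked: {max(parked_car):.1f} dB")
print(f"Leq,  passing: {leq(passing_car):.1f} dB   Leq,  parked: {leq(parked_car):.1f} dB")
```

Both scenarios share the same 80dB peak, but the time-averaged figure for the passing car falls to about 57.5dB while the parked car stays at 80dB, which is much closer to how the two experiences would be described.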
Consequently we’re (very slowly) moving into a new world driven by loudness specifications. Overseas specification documents with impressive names like ITU-R BS.1771, EBU R128 and, my personal favourite, the USA’s CALM Act (Commercial Advertisement Loudness Mitigation), embodied in ATSC A/85, are ratifying new scales of measurement and metering for us.
MEASURING LOUDNESS
Various metering salespeople would have you believe that loudness measurement is a brand new thing, but that’s not the case. Empirical loudness measurement has been around for some time in various forms, although the development and understanding of loudness has certainly escalated in the last five years. Dolby has used Leq(M), an equivalent continuous level weighted for movie loudness/annoyance, in their cinema ad program for some years. Leq(A), the same kind of measurement with A-weighting, was the earliest of the recent crop of loudness specs to be ratified. It was rapidly replaced by LKFS (Loudness, K-weighted, relative to Full Scale) as a measurement of loudness as the human ear perceives it. Many broadcast specs currently use the LKFS model, including the Australian FreeTV document Operational Practice 59 (OP-59), which dictates programs should be mixed to measure an overall loudness of -24LKFS. Where the US and Australia refer to loudness in terms of LKFS, the Europeans use LUFS (Loudness Units relative to Full Scale). Originally these measured slightly differently, but LKFS was modified slightly in ITU-R BS.1770-2 so it should now read the same as a LUFS measurement.
MIXING TO A LOUDNESS SPEC
In mixing for TV and incorporating loudness measurement, the simplest thing to do is download a piece of dialogue that measures -24LKFS, play it on your system and modify your speaker level to suit for comfortable mixing by ear as you’ve always done. In doing this most engineers may move from the traditional 79dB speaker level to as low as 76dB or as high as 82dB depending on the acoustics of the space, the type of speaker, how big the room is and how loud the engineer usually mixes.
Another approach would be to simply do what you have always done. At the end of the mix, run a loudness measurement over the program and due to dBFS and loudness being inter-related specs, you can simply normalise up or down by the number you are out by — for example your program ends up measuring -21 LKFS so you drop the overall gain by 3dB and it will read -24LKFS on next measurement, ready for submission. This obviously drops your peak levels by 3dB as well so your takeaway would be to compress a little less in future to meet specs, opening up your dynamic range a little.
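The arithmetic behind that normalise step is small enough to sketch. The function names and the -7dBFS starting peak below are assumptions for illustration, and the sketch takes it as given that a single static gain change moves the measured programme loudness and the peak level by the same number of dB.

```python
def normalisation_gain_db(measured_lkfs, target_lkfs=-24.0):
    """Static gain (dB) that moves a measured programme loudness onto the target."""
    return target_lkfs - measured_lkfs


def db_to_linear(gain_db):
    """Convert a gain in dB to the linear multiplier applied to each sample."""
    return 10 ** (gain_db / 20.0)


measured = -21.0        # programme loudness from the end-of-mix measurement
peak_before = -7.0      # ASSUMED peak level of the same mix, in dBFS

gain = normalisation_gain_db(measured)
peak_after = peak_before + gain

print(f"apply {gain:+.1f} dB (x{db_to_linear(gain):.3f} linear)")
print(f"peaks move from {peak_before:.1f} dBFS to {peak_after:.1f} dBFS")
```

Because the gain is applied to the whole programme, a mix that was already inside the peak limit only gets safer when turned down; a mix that has to be turned up needs its peaks checked again afterwards.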
TWIN PEAKS
Peak levels have been our main reference for many years, measured in various ways. As broadcast moves into LKFS specifications and loudness measurement an opportunity arose to deal with another problem that has haunted transmission for some years — intersample peaks. An intersample peak refers to a digital signal that is technically within spec, perhaps measuring -10dBFS precisely when it’s hitting the limit, but on conversion from digital to analogue for play out through a transmission chain, the digital signal actually peaks over -10dB in the analogue realm. This problem has largely been created by heavy use of limiters in the digital domain. Consequently we now have meters for use inside our DAWs to measure ‘true peak’ or intersample peak, so we are setting our digital limiters to accurately meet specs in the analogue domain. There are also a few intersample peak limiters now appearing on the market for this exact reason — which would replace a traditional limiter coupled with monitoring a true peak meter.
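To see how a signal can sit exactly on a sample-peak limit and still exceed it once reconstructed, the sketch below builds a sine wave at a quarter of the sample rate whose samples all land either side of the waveform crests. It then estimates the level between samples by 4x oversampling with a simple windowed-sinc interpolator. The interpolator is an assumption chosen for brevity, not the filter any particular true-peak meter uses, and the function names are invented for this example.

```python
import math


def dbfs(amplitude):
    """Level of a linear amplitude in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(amplitude)


def sample_peak(samples):
    """Largest absolute sample value, which is what a sample-peak meter reads."""
    return max(abs(s) for s in samples)


def true_peak_estimate(samples, oversample=4, half_width=16):
    """Estimate the peak between samples with a Hann-windowed sinc interpolator.

    Only interior points with full filter support are evaluated, so the
    ends of the signal are skipped.
    """
    peak = 0.0
    for i in range(half_width, len(samples) - half_width):
        for k in range(oversample):
            t = i + k / oversample              # fractional sample position
            acc = 0.0
            for j in range(i - half_width + 1, i + half_width + 1):
                d = t - j
                if d == 0.0:
                    weight = 1.0
                else:
                    sinc = math.sin(math.pi * d) / (math.pi * d)
                    window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                    weight = sinc * window
                acc += samples[j] * weight
            peak = max(peak, abs(acc))
    return peak


# Sine at fs/4 with a 45 degree phase offset: every sample lands at
# 0.707 of the crest. Scale it so the samples read exactly -10dBFS.
amplitude = 10 ** (-10 / 20) / math.sin(math.pi / 4)
signal = [amplitude * math.sin(math.pi * n / 2 + math.pi / 4) for n in range(128)]

print(f"sample peak:         {dbfs(sample_peak(signal)):.2f} dBFS")
print(f"estimated true peak: {dbfs(true_peak_estimate(signal)):.2f} dBFS")
```

In this deliberately awkward case the sample-peak reading is -10dBFS while the reconstructed waveform crests roughly 3dB higher. Real programme material is rarely this extreme, but heavily limited mixes can show the same effect.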
Unfortunately Australia has yet to embrace the intersample peak side of the new specs and we are left with half of the revolution — loudness specs but without the dynamic range that they promise, as most broadcasters are still insisting on peaks under -10dBFS rather than -3 True Peak (TP) as is the trend overseas. This is due to our legacy analogue transmission chain which has yet to be switched off. Hopefully this will change in the new year with a fully digital network.
It’s also important to note that these specifications for -10dBFS peaks and -24LKFS loudness only relate to stereo material at present. If you are submitting a surround program for broadcast there currently is no specification for this in Australia. What we hear on the Dolby Digital stream should be the original 5.1 mix for the program, which may have cinema-style dynamic range.
MIXING FOR CINEMA
Cinema is the most controlled listening environment and consequently has the most allowed dynamic range, which in a way means the content is the least controlled. Peaks can reach as high as 0dB and dialogue will sit 15-20dB below that. Cinema standards are governed by reproduction level, by the level of the replay system, as opposed to the content of the audio signal. This has led to a loudness war of sorts in the cinema of late. A properly calibrated cinema will set its Dolby Cinema Processor (a box that decodes Dolby signals and provides a room EQ curve) to volume level ‘7’, which equates to 85dB SPL when situated two-thirds of the way back into the room. In Australia, most multiplexes have pulled that level down as low as level ‘5’, giving a reduction of nearly 7dB across the board. This means that quiet scenes of dialogue are 7dB lower than intended, to compensate for explosions that are too loud in US mixes. The effect of turning the level down on mixes also strips those moments of their punch. Aussie films often suffer most from this problem as interesting detail in ambiences and surrounds that make such a difference for the typical Aussie drama are lost when the film is played too quietly.
In the past, Dolby were the arbiters of quality control on the mix levels but that is disappearing as Digital Cinema Packages (DCP) become the standard method of film distribution — a DCP replays six or eight uncompressed wav files (one per channel) as opposed to a Dolby proprietary stream which would have been encoded onto film in the past.
Word is that Dolby is trying to remedy the problem. Their new format, Dolby Atmos, is a groundbreaking system of object-based reproduction. This means a mix can be played in a traditional 5.1 cinema, in a full-blown, 64-channel Dolby Atmos room, or anything in between, as Dolby’s new range of cinema processors will assess each room’s specific speaker layout and format, and use that information to intelligently position each element of the audio content into the room. There’s a rumour that the new range of cinema processors may incorporate some loudness control as well to standardise reproduction levels, but this isn’t confirmed as yet. Fingers crossed!