Audiokinetic's Simon Ashby details what developers must realise to get the most from HDR audio
When a games developer asked us to add high dynamic range (HDR) audio capabilities in Wwise through a feature development contract, we started by reviewing the existing literature on the subject.
The motivation was simple: how to reconcile reality with what can be reproduced on a typical sound system.
While the human ear has an incredible dynamic range, greater than 130 dB, in a typical listening environment we can only expect to reproduce about 40 to 50 dB.
The tricky part, though, was this: how can we offer a compelling mixing experience for the sound designer?
After several months of work and many prototypes, we finally arrived at a point where game developers will benefit from a dynamic mixing system that really helps to clean up the mix and bring the focus to the most important sounds. There are no magic tricks, and it’s very effective.
HOW TO SET UP HDR AUDIO
Sounds that benefit from HDR audio are most likely emitted from the 3D world, which usually excludes UI sounds, non-diegetic music and narrator commentary.
Chatter dialogue may be included in the HDR mix, but many games will prefer to keep critical dialogue outside of the HDR mix to retain better control of its volume.
To enable HDR, at least one bus in the hierarchy must be flagged as ‘HDR’ and a dynamic range set for the game. The dynamic range is called ‘the window’, which spans the range between two boundaries – the window top and window bottom. These boundaries are set on the HDR bus and in the project settings, respectively.
At runtime, sounds playing at volumes higher than the HDR bus threshold push the window up, which in turn reduces the volumes of the quieter sounds.
Your amp goes to 11. My HDR goes to 100.
We’re expecting games to set the HDR threshold around 0 dB. Consequently, the loudest sounds will be assigned volumes above 0 dB to activate the volume ducking mechanism. At this point, as an audio specialist, you might wonder how volumes above 0 dB can play without clipping. This is possible because the loudest sound, when it is above the threshold, is played at the volume level of the HDR bus.
All other, quieter sounds are ducked down proportionally. This automatic mechanism, which momentarily reduces the volume of the quieter sounds, gives a perception of a higher dynamic range. It also provides space in the mix and brings the focus to the most important sounds, which, in the HDR paradigm, are the loudest sounds.
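To make the mechanism concrete, here is a minimal Python sketch of the window logic described above. It is a simplified model rather than Wwise's actual implementation: it assumes the loudest sound is brought all the way down to the threshold (no ratio or release time), and it assumes sounds falling below the window bottom are simply treated as inaudible. The function name, parameter names and example volumes are all hypothetical.

```python
from typing import Dict, Optional


def hdr_output_volumes(
    inputs_db: Dict[str, float],
    threshold_db: float,
    dynamic_range_db: float,
) -> Dict[str, Optional[float]]:
    """Map per-sound input volumes (dB) to output volumes (dB).

    Simplified model: the window top follows the loudest sound once it
    exceeds the threshold, and every sound is shifted down by the same
    amount. Sounds below the window bottom map to None (assumed inaudible).
    """
    if not inputs_db:
        return {}

    loudest_db = max(inputs_db.values())
    # The window only moves up when something is louder than the threshold.
    window_top_db = max(threshold_db, loudest_db)
    window_bottom_db = window_top_db - dynamic_range_db
    # How far the window has been pushed above its resting position.
    shift_db = window_top_db - threshold_db

    outputs: Dict[str, Optional[float]] = {}
    for name, volume_db in inputs_db.items():
        if volume_db < window_bottom_db:
            outputs[name] = None  # below the window: assumed inaudible
        else:
            outputs[name] = volume_db - shift_db
    return outputs


# Hypothetical scene with a 0 dB threshold and a 50 dB window.
scene = {"explosion": 30.0, "gunfire": 10.0, "footsteps": -10.0, "ambience": -30.0}
print(hdr_output_volumes(scene, threshold_db=0.0, dynamic_range_db=50.0))
# {'explosion': 0.0, 'gunfire': -20.0, 'footsteps': -40.0, 'ambience': None}
```

In this example the explosion, authored at +30 dB, plays at 0 dB, and everything else is pulled down by the same 30 dB. When nothing exceeds the threshold, the shift is zero and all volumes pass through unchanged.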
The table and graphics above illustrate the effective volumes applied at the output of the HDR bus for various input values in a scenario where four sounds, distinguished by colours, are played. As seen, the HDR bus scales down the input volumes and plays them at ‘normal’ values relative to the hardware’s full scale (where 0 dBFS represents the max amplitude before clipping).
REGION OF INTEREST
After research and several iterations, we discovered that it’s typically just the transient part of the loudest sounds that should move the window upwards, and by excluding the sound’s tail we can achieve transparent volume attenuations.
To achieve this result automatically, an option to track the envelope of HDR sounds has been added, with a setting to specify, in dB, the active range (region of interest) for each sound. The active range property defines, for each voice, the range in decibels below the sound’s peak within which it will move the window.
Control over the active range and the points of the HDR envelope lets you craft exactly how the quieter sounds are ducked down over time.
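As a rough illustration of the active range, the Python sketch below takes a sound's envelope as a list of levels in dB and keeps only the points that lie within the active range of the peak. It is an assumption-laden simplification, not the envelope tracking Wwise performs: the function name, the sampled-envelope representation and the example values are hypothetical.

```python
from typing import List, Optional


def window_contribution(
    envelope_db: List[float],
    active_range_db: float,
) -> List[Optional[float]]:
    """Return the envelope points that are allowed to move the HDR window.

    A point contributes only if it lies within `active_range_db` of the
    envelope's peak; points in the quieter tail map to None.
    """
    if not envelope_db:
        return []

    peak_db = max(envelope_db)
    cutoff_db = peak_db - active_range_db
    return [level if level >= cutoff_db else None for level in envelope_db]


# Hypothetical gunshot envelope: a sharp transient followed by a decaying tail.
envelope = [0.0, -3.0, -9.0, -15.0, -24.0]
print(window_contribution(envelope, active_range_db=12.0))
# [0.0, -3.0, -9.0, None, None]
```

With a 12 dB active range, only the transient pushes the window upwards. The tail still plays, but it no longer ducks the quieter sounds around it.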
In the end, we’re glad we’ve had the opportunity to work and iterate directly with a games developer to create this feature, which is introduced in Wwise version 2013.1.
We’re now confident that developers using HDR audio will benefit from a transparent, automated mix that always brings the loudest sound into focus and creates space in the mix.
There are more exciting HDR features available in Wwise 2013.1, such as the ratio and release time at the bus level, make-up gain, and loudness normalisation at the source level. A more detailed article would be needed for a full review of the HDR functionality.