Audio loudness for gaming: The battle against 'ear fatigue'

By Simon Pressey, Crytek

December 10th 2014 at 1:00PM

Crytek's audio whizz Simon Pressey discusses why it's important for devs to measure the loudness of their game

With the growth of global loudness standards and regulations, audio loudness has lately been a hot topic in the broader video and media arenas.

It might not be as self-evident, but loudness management is also a key factor for game developers such as my company, Crytek – especially as consumers adopt the new generation of consoles that integrate games with other media sources such as broadcast TV, Blu-ray discs, and programs streamed from the Internet.

When viewers are switching back and forth between a game they’re playing and another video program, they expect the game’s audio levels to be compatible with the other media types.

It hasn’t always been this way. In fact, loudness for gaming has historically followed the approach of music and advertising production, as summed up in four words: the louder, the better. The emergence of digital television introduced an expanded aural dynamic range (over 100 dB), which has been pushed to its boundaries by producers who want to make sure their content is louder than their competitors’ – a situation affectionately known as the Loudness Wars.

Of course, from the consumer’s viewpoint louder is not better, and it’s one of the main reasons viewers reach for the remote when a commercial comes on.

It's about the consumers

On average, consumers prefer electronic audio content to be no louder than 69 dB, which is not much louder than normal conversation. They don’t expect large changes in loudness from program to interstitials, and they want to be able to smoothly switch from channel to channel, or from TV to video game, without having to constantly adjust the volume.

At Crytek, we consider it part of the art form to use all the dynamic range we need to give players the best-possible experience, while not ruining the experience with too much volume. The ability to find that balance is one of the telltale traits of a good game developer.

The emergence of loudness regulations is actually good news for game developers. As recommended by console manufacturers, the gaming industry has begun to adopt standards that are similar to those for the broadcasting industry. By providing a standard by which we can normalise the audio to actual perceived loudness, new regulations such as EBU R128 are giving us latitude to increase dynamic range and contrast while still staying compliant.

Game-centric Loudness Considerations

Of course, mixing sound for a game is very different from mixing for TV or cinema. Since every game creates its own unique atmosphere and experience, loudness requirements vary widely from title to title. Also, games are the very definition of a non-linear video experience, since you never know when someone will be playing your game, what type of console they’ll be using, or whether they’ve been watching other media.

Since it’s hard to gauge the playing habits of our average customer, nailing down loudness measurements can be a challenge.

One technique is to use anchor sounds as a benchmark around which the other audio can be measured. For example, at Crytek we have developed a tentative set of rules around how loud we will play dialogue for all games that are dialogue-centric. The loudness value for dialogue establishes an anchor point that guides the rest of the mix.
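
As a rough illustration of the anchor idea, the sketch below expresses each part of the mix as an offset from the dialogue level. The category names, the offsets, and the anchor value itself are assumptions made up for this example; they are not Crytek's actual rules.

```python
# Hypothetical example: none of these names or numbers come from a real pipeline.
DIALOGUE_ANCHOR_LUFS = -23.0  # assumed loudness at which dialogue is played

# Assumed offsets, in loudness units (LU), relative to the dialogue anchor.
CATEGORY_OFFSETS_LU = {
    "dialogue": 0.0,
    "music": -6.0,
    "ambience": -12.0,
    "explosions": 6.0,
}


def target_loudness(category, anchor_lufs=DIALOGUE_ANCHOR_LUFS):
    """Return the target loudness in LUFS for a category, relative to the anchor."""
    return anchor_lufs + CATEGORY_OFFSETS_LU[category]


def mix_targets(anchor_lufs=DIALOGUE_ANCHOR_LUFS):
    """Return the target loudness for every category at a given anchor level."""
    return {name: target_loudness(name, anchor_lufs) for name in CATEGORY_OFFSETS_LU}


if __name__ == "__main__":
    for name, lufs in mix_targets().items():
        print(f"{name:<10} {lufs:6.1f} LUFS")
```

Because everything is defined relative to the anchor, moving the dialogue level moves the whole mix with it while the internal balance stays the same.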

A critical tool on our mixing desk is NUGEN Audio’s VisLM, a software metering tool that offers detailed and objective loudness measurement, history, and logging facilities. One key feature of VisLM is its highly visual and simple interface that tells us at a glance whether the levels are within acceptable limits. Most important, VisLM can sit anywhere in the audio chain – as a plug-in on a digital audio workstation or a standalone tool for metering audio output as it’s being played back.

When used in the anchor-sound approach mentioned above, a metering tool like VisLM enables our mixers to monitor reference levels and train their ears to create consistent mixes. Because we trust the feedback from the meter, we know we’re monitoring at a fixed loudness level without having to listen on different systems. Also, with standardised loudness listening, mixers can develop a better perception of the loudness curve of sounds. They can determine the contour of the curve, how much bass it should have, and other characteristics.

This means we no longer have to overcompensate by designing each sound to be as loud as possible, as we might have done before. In that case, we would typically turn down the monitor only to discover that the loudness curve was over-enhanced and lacking the desired qualities.

In addition to VisLM, we use another NUGEN Audio tool, LMB Processor – an offline file-based batch loudness analysis and correction program that is extremely useful for making after-mix adjustments to some components of dialogue. Since LMB is a batch processor, we simply run our dialogue files through it to produce a normalised set of levels for spoken dialogue. LMB gives us consistent dialogue levels across a range of characters so their interactions will seem normal and natural.
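
The arithmetic behind this kind of batch normalisation can be sketched in a few lines. This is not LMB Processor's interface; the file names, the measured values, and the -23 LUFS target below are hypothetical, and the loudness figures are assumed to come from a separate meter.

```python
# Hypothetical sketch: file names and measured values are invented for illustration.

def normalisation_gains(measured_lufs, target_lufs=-23.0):
    """Map each file to the static gain in dB that brings it to the target loudness."""
    return {name: target_lufs - loudness for name, loudness in measured_lufs.items()}


def db_to_linear(gain_db):
    """Convert a gain in dB to the linear factor applied to each sample."""
    return 10.0 ** (gain_db / 20.0)


# Assumed loudness readings (LUFS) for three dialogue files.
measured = {
    "hero_line_01.wav": -19.5,
    "villain_line_07.wav": -27.0,
    "narrator_intro.wav": -23.0,
}

gains = normalisation_gains(measured)

if __name__ == "__main__":
    for name, gain_db in gains.items():
        print(f"{name}: {gain_db:+.1f} dB (x{db_to_linear(gain_db):.3f})")
```

A static gain shifts a file's measured loudness by essentially the same number of LU, which is why one offset per file is enough to line up characters recorded at different levels.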

Avoiding "Ear Fatigue"

Another concern for game production is the peak-level audio artefacts that are introduced by various compression codecs. The user might not notice distortion introduced by a single codec for a single sound, but multiply that by 250 sounds being played at once that have all been compressed by that codec, and you’ve added a large number of artefacts. These might not be perceptibly audible but they compound “ear fatigue” – a subliminal “playing this game is hard work” feeling. Subconsciously, the brain is trying to make sense of the sounds, which can be mentally and physically taxing.
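
To put a rough number on how such artefacts add up, the sketch below assumes that each sound contributes artefacts of equal power and that they are uncorrelated with one another. Both are simplifying assumptions for illustration; in a real mix the sounds play at different levels, so the figure is only indicative.

```python
import math


def summed_level_increase_db(num_sources):
    """Level rise, in dB, when uncorrelated equal-power signals are summed.

    Uncorrelated signals add in power, so N of them sit 10*log10(N) dB
    above a single one.
    """
    return 10.0 * math.log10(num_sources)


if __name__ == "__main__":
    rise = summed_level_increase_db(250)
    print(f"250 sounds: artefact floor roughly {rise:.1f} dB above one sound")
```

Under those assumptions, 250 simultaneous sounds raise the combined artefact level by about 24 dB compared with a single sound, which is how distortion that is negligible in isolation can become a real burden in a busy scene.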

Once again, a loudness metering tool like VisLM is the best weapon against this problem. To avoid the distortion and over-modulation caused by exceeding true-peak limits, we use VisLM to meter all of our assets.
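
Why true peaks matter can be shown with a small experiment. The sketch below is a simplified stand-in for a true-peak meter, not VisLM's algorithm and not the exact filter from the ITU standard: it estimates the waveform between samples using 4x oversampling with a Hann-windowed sinc interpolator. The function names and the test tone are assumptions chosen for the example.

```python
import math


def _sinc(x):
    """Normalised sinc function, sin(pi*x) / (pi*x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def sample_peak(samples):
    """Largest absolute sample value (what a plain peak meter reports)."""
    return max(abs(s) for s in samples)


def oversampled_peak(samples, factor=4, taps=16):
    """Estimate the inter-sample peak via Hann-windowed sinc interpolation."""
    n = len(samples)
    peak = 0.0
    for i in range(n):
        lo, hi = max(0, i - taps + 1), min(n, i + taps + 1)
        for k in range(factor):
            t = i + k / factor  # fractional position between samples
            value = 0.0
            for j in range(lo, hi):
                x = t - j
                window = 0.5 * (1.0 + math.cos(math.pi * x / taps))
                value += samples[j] * _sinc(x) * window
            peak = max(peak, abs(value))
    return peak


# A full-scale sine at a quarter of the sample rate, phase-shifted so that
# every sample lands at about 0.707 and the real crests fall between samples.
tone = [math.sin(math.pi / 2 * n + math.pi / 4) for n in range(64)]

if __name__ == "__main__":
    print(f"sample peak:      {sample_peak(tone):.3f}")
    print(f"oversampled peak: {oversampled_peak(tone):.3f}")
```

The sample values of this tone never exceed about 0.707, yet the reconstructed waveform reaches close to full scale. An asset that looks safe on a sample-peak meter can therefore still clip once it is decoded or converted, which is the case true-peak metering is meant to catch.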

Loudness for Marketing Assets

Although compliance with international loudness standards is not as much of a concern for the games themselves, we have to consider all of the ancillary promotional items that are produced in conjunction with each game title – assets like video trailers and ads that will be broadcast or streamed.

For broadcasting, we typically mix these items to ITU and EBU levels (with audio normalised to -23 LUFS). VisLM operates in full compliance with ITU, ATSC, and EBU standards, which basically covers every corner of the world. Also, VisLM’s built-in presets make mixing audio to a specific country’s loudness standard as easy as dialling up the standard on the meter.

For streaming to the Internet, where loudness standards have not yet taken hold, we have adopted a standard level of -17 LUFS to bring a new level of consistency and quality to our trailers. In another trick of the trade, we mix the content to -23 LUFS to allow for maximum dynamic range and then raise the finished audio to -17 LUFS.

We’ve discovered that a well-mixed -23 LUFS trailer will not be dramatically degraded when brought up to -17 LUFS, and the results are far more pleasing to the ear than producing a trailer that is as loud as possible and then turning it down to -23 LUFS.
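
The headroom question behind this trick comes down to simple arithmetic. The sketch below is only an illustration: the function names, the example peak readings, and the -1 dBTP ceiling are assumptions, not values taken from our delivery specifications.

```python
def gain_to_target(measured_lufs, target_lufs):
    """Static gain in dB needed to move a mix from its measured loudness to the target."""
    return target_lufs - measured_lufs


def peak_after_gain(true_peak_dbtp, gain_db):
    """True-peak level after a static gain is applied; peaks shift by the same dB."""
    return true_peak_dbtp + gain_db


def fits_under_ceiling(true_peak_dbtp, gain_db, ceiling_dbtp=-1.0):
    """True if the raised mix stays at or below the assumed true-peak ceiling."""
    return peak_after_gain(true_peak_dbtp, gain_db) <= ceiling_dbtp


if __name__ == "__main__":
    gain = gain_to_target(-23.0, -17.0)  # 6 dB, roughly doubling the amplitude
    for peak in (-8.0, -4.0):  # hypothetical true-peak readings of the -23 LUFS mix
        verdict = "fits" if fits_under_ceiling(peak, gain) else "needs limiting"
        print(f"peak {peak:+.1f} dBTP -> {peak_after_gain(peak, gain):+.1f} dBTP: {verdict}")
```

Going from -23 to -17 LUFS is a 6 dB lift, so a mix whose loudest peaks already leave that much room can be raised with plain gain and keep its dynamics intact. Only the peaks that would cross the ceiling need limiting.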

The Results: No News is Good News

The first game we mixed to our new -23 LUFS standard was an Xbox One title, Ryse: Son of Rome. The game received highly positive reviews for its sound, but some of our designers were a bit concerned: would the game be loud enough?  

The short answer is yes. The user feedback we’ve gotten is that Ryse is a highly engaging, visceral experience. But interestingly, no one has ever complained that the game audio is too soft! We attribute Ryse’s success in large part to the dynamic headroom we could work with, even at the lower loudness level. Because we were using VisLM to mix to dynamic range, we were able to add louder sounds for dramatic effect without making them seem too jarring or out-of-place.

The loudness standards we’ve adopted for Crytek are based on statistical analysis of best-selling video games, showing that those mixed to a lower and more consistent loudness level were actually selling better. That points to a correlation between popular games and well-mixed, less-distorted audio that makes good use of dynamic range.

Since ours is a young industry that has not emphasised sound in the past, we believe these loudness techniques represent a real breakthrough. There’s a growing understanding that sound plays a very significant role in the way people enjoy and experience video games.

From a business perspective, the emergence of tools like VisLM and LMB Processor means that we have a highly reliable and effective way to measure and correct loudness to ensure consistently high-quality audio – and that’s a win-win for Crytek as well as its legions of gaming consumers.  

Simon Pressey is director of audio for Crytek, based in Frankfurt, Germany. Gamers around the world know Crytek as the independent developer behind popular games such as Ryse: Son of Rome, Far Cry, the Crysis series and Warface. The company is also the creator of 3D game technology, CryEngine.