Mixing for immersive experiences
Mixing is the process of blending sounds together. In linear media such as music and movies, mixing involves finding the right volume level and panning for each track, as well as setting up reverb.
For interactive experiences the soundscape is more dynamic: the volume level and panning of each sound depend on its direction and distance from the listener. The typical tools for mixing interactive non-immersive experiences are automatic panning based on direction, volume controlled by distance-based curves, and a variety of dynamic reverb solutions.
In immersive experiences, panning is replaced with a head-related transfer function (HRTF), which provides more accurate directional cues. It’s also possible to achieve more accurate distance cues with careful consideration of how the volume changes over distance and how reverb is treated.
Audio for traditional non-immersive games may be played on anything from low-quality desktop speakers to full surround hi-fi systems or headphones of varying quality. Because such a broad range of audio systems must be supported, audio reproduction is very inconsistent, and a primary concern for sound design and mixing is making sure the result sounds decent across all systems.
Immersive devices, on the other hand, all have headphones, which provide much more consistent audio reproduction. This, combined with head tracking, allows for much more immersive spatial audio. We recommend the following best practices:
- Properly spatialize sound sources.
- Create soundscapes that are neither too dense nor too sparse.
- Avoid user fatigue.
- Use suitable volume levels comfortable for long-term listening.
- Design with appropriate room and environmental effects.
Distance attenuation curves
Mixing is a complex subject and there are many factors to consider. Controlling the relative levels of each sound is a critical component of mixing, along with the way volume attenuates based on the distance between the source and the listener. In non-immersive applications this is usually controlled by distance-based attenuation curves whose shape is tailored by the sound designer.
Make sure important sounds are clearly heard even at a distance, and that unimportant sounds don’t clutter the mix. As an example, you don’t want to lose important character dialogue because the user is too far away. It may be better for this sort of dialogue to attenuate more slowly, while less essential footsteps from a character in the background should attenuate more quickly, or potentially be inaudible beyond a certain distance.
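As a rough illustration of tailoring curves per sound, the sketch below gives dialogue a slow rolloff and background footsteps a fast rolloff with a cutoff distance. The `AttenuationCurve` class, its parameter names, and the example values are all hypothetical; the gain formula is a common inverse-distance form with a rolloff factor, not the API of any particular engine.

```python
from dataclasses import dataclass


@dataclass
class AttenuationCurve:
    """Hypothetical per-sound distance curve (illustrative only)."""
    reference_m: float = 1.0               # distance at which gain is 1.0
    rolloff: float = 1.0                   # >1 fades faster, <1 fades slower
    max_audible_m: float = float("inf")    # beyond this the sound is silent

    def gain(self, distance_m: float) -> float:
        """Linear gain (0.0 to 1.0) for a source at distance_m meters."""
        if distance_m > self.max_audible_m:
            return 0.0
        # Inside the reference distance the gain stays at full volume.
        d = max(distance_m, self.reference_m)
        return self.reference_m / (
            self.reference_m + self.rolloff * (d - self.reference_m)
        )


# Assumed example settings: dialogue fades slowly, footsteps fade quickly
# and are dropped entirely past 15 meters.
dialogue = AttenuationCurve(reference_m=1.0, rolloff=0.5)
footsteps = AttenuationCurve(reference_m=1.0, rolloff=2.0, max_audible_m=15.0)

for distance in (1.0, 10.0, 20.0):
    print(distance, round(dialogue.gain(distance), 3), round(footsteps.gain(distance), 3))
```

The hard cutoff here is abrupt; a real mix would more likely fade the sound out as it approaches the maximum distance.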
When mixing for immersive experiences there is an opportunity for heightened immersion if we provide the correct audio cues, so it’s important to consider how these attenuation curves impact perception of distance. If you have a sound that is loud even when it’s far from the listener, it may feel closer than intended and negatively impact the user’s sense of immersion.
The rule of thumb for physically accurate distance attenuation is that each doubling of distance halves the sound’s amplitude, a drop of about 6dB (the intensity itself falls to a quarter). For example, if a sound is set to full volume (0dB) when it’s 5 meters away, it would be -6dB when it’s 10 meters away, -12dB when it’s 20 meters away, and so on. Sometimes this attenuation model does not produce the desired result; in these cases it’s necessary to bend the laws of physics a little to achieve the desired experience.
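The 6dB-per-doubling rule can be written as a gain of -20·log10(d / d_ref) dB, where d_ref is the distance at which the sound plays at full volume. The following is a minimal sketch of that formula; the function names and the choice to clamp gain at 0dB inside the reference distance are assumptions made for illustration.

```python
import math


def attenuation_db(distance_m: float, reference_m: float = 5.0) -> float:
    """Gain in dB relative to the level at reference_m (inverse-distance law)."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    # Assumption: never boost above full volume inside the reference distance.
    d = max(distance_m, reference_m)
    return -20.0 * math.log10(d / reference_m)


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


for distance in (5.0, 10.0, 20.0, 40.0):
    gain_db = attenuation_db(distance)
    print(f"{distance:5.1f} m  {gain_db:7.2f} dB  x{db_to_linear(gain_db):.3f}")
```

With the 5 meter reference from the example, 10 meters comes out at roughly -6dB (half amplitude) and 20 meters at roughly -12dB (quarter amplitude).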
Apart from volume, another essential distance cue is reverb. When a sound is very far away we hear a lot more reverb relative to the direct sound, whereas when a sound is very close we hear mostly the direct sound and very little reverb. Controlling the amount of reverb per sound is a critical component of creating the perception of distance.
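One way to picture this cue is as a direct-to-reverb ratio that falls as the source moves away. The sketch below assumes a deliberately simplified model in which the direct sound follows the inverse-distance law while the reverberant level stays constant throughout the room; the function names and the default reverb gain of 0.25 are hypothetical values chosen for illustration.

```python
import math


def direct_and_reverb_gains(distance_m: float,
                            reference_m: float = 1.0,
                            reverb_gain: float = 0.25) -> tuple[float, float]:
    """Return (direct_gain, reverb_gain) as linear multipliers.

    Assumption: the direct path falls off as 1/distance, while the
    reverberant level is treated as independent of distance.
    """
    d = max(distance_m, reference_m)
    direct_gain = reference_m / d
    return direct_gain, reverb_gain


def direct_to_reverb_db(distance_m: float,
                        reference_m: float = 1.0,
                        reverb_gain: float = 0.25) -> float:
    """Direct-to-reverb ratio in dB; lower values suggest a more distant source."""
    direct, reverb = direct_and_reverb_gains(distance_m, reference_m, reverb_gain)
    return 20.0 * math.log10(direct / reverb)


for distance in (1.0, 4.0, 16.0):
    print(f"{distance:5.1f} m  direct/reverb = {direct_to_reverb_db(distance):6.2f} dB")
```

Under these assumed values the direct sound dominates up close, the two are equal at 4 meters, and the reverb dominates beyond that.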
If you’re ready to kick off the technical side of immersive audio design and engineering, be sure to review the following documentation: