Meeting Topic: Immersive Audio
Moderator Name: John Musgrave
Speaker Name: Lon Neumann, Immersive Audio Alliance
Meeting Location: Zoom
The February meeting of the AES Los Angeles Section featured a presentation (via Zoom) on immersive audio by Lon Neumann of the Immersive Audio Alliance.
According to Lon, immersive audio is perhaps best thought of as "surround sound on steroids": it builds on (and includes) conventional surround sound but adds a further dimension of sound, height. Although immersive audio has already been widely deployed in cinemas, this presentation focused on near-field immersive audio as enjoyed in home theaters, as well as the facilities that deliver to that market. The applications for immersive sound are much broader, though, encompassing not only film and television but also music, live concerts, video games, virtual reality, live theater and even conferences.
The history of immersive audio begins with a mouse. In 1933 Leopold Stokowski was involved in a 3-channel stereo experiment in Philadelphia, so when he met Walt Disney five years later (coincidentally, at legendary Hollywood hangout Chasen's) and they chatted about Walt's vision for what would become Fantasia, Stokowski agreed to score the film for free, on the condition that the recording be stereophonic. This project drove some aggressive technological development: by the time of Fantasia's November 1940 release, the sound engineers had deployed pan pots, multi-track recording, overdubbing and automation, all to create a soundtrack that could be played back only in specially equipped venues, in a performance that required 9 operators. The birth, then, of surround sound.
Effective summing localization requires precise calibration of channel levels and speaker placement. Phase coherence is also important: speakers must be equidistant from the listening position within a very small margin of error, or, if delays are used to compensate for distance variation, the delay timing must have a resolution of 0.1 ms.
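The arithmetic behind that delay compensation is straightforward: each nearer speaker is delayed until its sound arrives at the same time as the farthest one. A minimal sketch (the speaker distances and the 343 m/s speed of sound are illustrative assumptions, not figures from the talk):

```python
# Speed of sound at roughly 20 degrees C, in m/s (an assumed round figure).
SPEED_OF_SOUND = 343.0

def alignment_delays_ms(distances_m, resolution_ms=0.1):
    """Delay (ms) to apply to each speaker so every arrival matches the
    farthest speaker, quantized to the 0.1 ms resolution mentioned above."""
    farthest = max(distances_m)
    delays = [(farthest - d) / SPEED_OF_SOUND * 1000.0 for d in distances_m]
    # Quantize to the processor's timing resolution.
    return [round(d / resolution_ms) * resolution_ms for d in delays]

# Hypothetical layout: L/R at 2.0 m, center at 1.8 m, surrounds at 2.5 m.
print(alignment_delays_ms([2.0, 1.8, 2.0, 2.5, 2.5]))
```

Note that a 0.1 ms error corresponds to only about 3.4 cm of path length, which is why the placement tolerance is so tight.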
When creating a fully immersive audio mix, it is important to monitor all derivative versions (7.1, 5.1, stereo, mono, soundbar, headphones, etc.) in order to ensure the mix translates.
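One common way such derivatives are produced is by fold-down. As a minimal sketch (using standard ITU-R BS.775-style coefficients, which the talk did not necessarily specify), a 5.1-to-stereo check might look like:

```python
import math

def downmix_51_to_stereo(L, R, C, LFE, Ls, Rs):
    """Fold a 5.1 sample down to stereo for translation checks.
    Center and surrounds are attenuated by 1/sqrt(2) (about -3 dB),
    per the common ITU-R BS.775 convention; LFE is discarded."""
    k = 1.0 / math.sqrt(2.0)
    left = L + k * C + k * Ls
    right = R + k * C + k * Rs
    return left, right
```

Auditioning the fold-down rather than only the native mix reveals problems such as center-channel buildup or surround content disappearing in stereo.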
Additional calibration considerations for facilities creating immersive audio include reference level (78 dB SPL for most home theater-sized mixing rooms; calibration level is related to room size), with sufficient headroom above reference. Identical speakers all around are recommended to help ease level calibration efforts.
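Once the chain is calibrated, signal level in dBFS maps directly to monitor SPL. A small sketch, assuming the common -20 dBFS pink-noise alignment convention (the talk gave the 78 dB figure; the -20 dBFS reference is an assumption here):

```python
REFERENCE_SPL = 78.0    # dB SPL for a home theater-sized room, per the talk
REFERENCE_DBFS = -20.0  # assumed pink-noise calibration level (common practice)

def spl_for_dbfs(level_dbfs):
    """SPL the monitors produce for a signal at `level_dbfs`, given a chain
    calibrated so that REFERENCE_DBFS pink noise plays at REFERENCE_SPL."""
    return REFERENCE_SPL + (level_dbfs - REFERENCE_DBFS)

# Under these assumptions, full scale (0 dBFS) leaves 20 dB of headroom
# above the 78 dB reference, i.e. peaks at 98 dB SPL.
peak_spl = spl_for_dbfs(0.0)
```

This is why the calibration level matters: a room aligned too hot leaves mixers with less usable headroom before the monitors (or the listeners) run out.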
Object-based audio is a natural complement to immersive audio. An object-based soundtrack can contain beds (analogous to the 5.1 or 7.1 channel systems we all know and love) along with objects: sound files accompanied by metadata instructing the renderer on how to play back the object, including its level, size and motion through 3D space. Dolby Atmos, DTS:X and MPEG-H all utilize an object-based mixing and playback approach, and there are some very interesting applications here. For example, Sony is using MPEG-H as the codec to deliver 360 Reality Audio on the PlayStation 5, rendering a full hemisphere of immersive sound through headphones. On the television front, US deployment of ATSC 3.0 broadcast television will use Dolby's AC-4 codec, which incorporates the Atmos object-based approach along with efficient data compression to deliver a whole slew of features over the air, including alternate languages or alternate announcers, dialogue enhancement, advanced dynamic range control and more.
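The object-plus-metadata idea can be illustrated with a toy sketch. The field names and the renderer below are entirely hypothetical (no real Atmos, DTS:X or MPEG-H schema is being reproduced); they only show how position metadata lets a renderer derive speaker gains at playback time rather than baking them into channels:

```python
from dataclasses import dataclass

@dataclass
class AudioObject:
    """Illustrative object metadata: a mono sound plus instructions for the
    renderer. Field names are invented for this example."""
    name: str
    x: float          # -1.0 = fully left, +1.0 = fully right
    z: float          #  0.0 = listener level, 1.0 = overhead
    gain: float = 1.0

def render_gains(obj):
    """Toy renderer: distribute the object's gain across a left/right pair at
    listener level and a left/right pair of height speakers, using simple
    linear panning weights. Real renderers are far more sophisticated."""
    lr = (obj.x + 1.0) / 2.0  # 0 = all left, 1 = all right
    return {
        "L":     obj.gain * (1.0 - lr) * (1.0 - obj.z),
        "R":     obj.gain * lr         * (1.0 - obj.z),
        "L_top": obj.gain * (1.0 - lr) * obj.z,
        "R_top": obj.gain * lr         * obj.z,
    }

# A centered object at listener height splits evenly between L and R.
gains = render_gains(AudioObject("helicopter", x=0.0, z=0.0))
```

Because the panning is computed at playback, the same object metadata can drive a 7.1.4 room, a soundbar, or a binaural headphone renderer.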
Immersive audio is definitely fast becoming a standard for film, television, streaming, gaming and music, and proper setup and calibration of stages is critical for ensuring a good experience for the audience. We're very grateful to Lon Neumann for his work in advancing this paradigm, and for taking the time to update AES-LA on the present and future of immersive audio.
Written By: Daniel Schulz