AES New York 2019
Poster Session P16
P16 - Posters: Spatial Audio
Saturday, October 19, 10:30 am — 12:00 pm (South Concourse A)
P16-1 Calibration Approaches for Higher Order Ambisonic Microphone Arrays—Charles Middlicott, University of Derby - Derby, UK; Sky Labs Brentwood - Essex, UK; Bruce Wiggins, University of Derby - Derby, Derbyshire, UK
Recent years have seen an increase in the capture and production of ambisonic material, driven by companies such as YouTube and Facebook adopting ambisonics for spatial audio playback. Consequently, there is now a greater need for affordable higher-order microphone arrays. This work details the development of a five-channel circular horizontal ambisonic microphone intended as a tool to explore various optimization techniques, focusing on capsule calibration and pre-processing approaches for unmatched capsules.
Convention Paper 10301
P16-2 A Qualitative Investigation of Soundbar Theory—Julia Perla, Belmont University - Nashville, TN, USA; Wesley Bulla, Belmont University - Nashville, TN, USA
This study investigated basic acoustic principles and assumptions that form the foundation of soundbar technology. A qualitative listening test compared 12 original soundscape scenes each comprising five stationary and two moving auditory elements. Subjects listened to a 5.1 reference scene and were asked to rate “spectral clarity and richness of sound,” “width and height,” and “immersion and envelopment” of stereophonic, soundbar, and 5.1 versions of each scene. ANOVA revealed a significant effect for all three systems. In all three attribute groups, stereophonic was rated lowest, followed by soundbar, then surround. Results suggest waveguide-based “soundbar technology” might provide a more immersive experience than stereo but will not likely be as immersive as true surround reproduction.
Convention Paper 10302
P16-3 The Effect of the Grid Resolution of Binaural Room Acoustic Auralization on Spatial and Timbral Fidelity—Dale Johnson, University of Huddersfield - Huddersfield, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK
This paper investigates the effect of the grid resolution of binaural room acoustic auralization on spatial and timbral fidelity. Binaural concert hall stimuli were generated using a virtual acoustics program utilizing image source and ray tracing techniques. Each image source and ray was binaurally synthesized using Lebedev grids of increasing resolution from 6 to 5810 (reference) points. A MUSHRA test was performed in which subjects rated the magnitudes of the spatial and timbral differences between each stimulus and the reference. Overall, on the MUSHRA grading scale, 6 points were perceived to be "Fair," 14 points "Good," and 26 points and above all "Excellent," for both spatial and timbral fidelity.
Convention Paper 10303
P16-4 A Compact Loudspeaker Matrix System to Create 3D Sounds for Personal Uses—Aya Saito, University of Aizu - Aizuwakamatsu City, Japan; Takahiro Nemoto, University of Aizu - Aizuwakamatsu, Japan; Akira Saji, University of Aizu - Aizuwakamatsu City, Japan; Jie Huang, University of Aizu - Aizuwakamatsu City, Japan
In this paper we propose a new 3D sound system arranged as a two-layer matrix with five loudspeakers on each side of the listener. The system is effective for sound localization and compact for personal use. Sound images in this system are created by an extended amplitude panning method, with the effect of head-related transfer functions (HRTFs). Performance evaluation of the system for sound localization was made by auditory experiments with listeners. As a result, listeners could identify the direction of sound images localized at any azimuth and at high elevations with small biases.
Convention Paper 10304
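The abstract of P16-4 refers to an extended amplitude panning method without giving its details. As background only, the sketch below shows conventional pairwise amplitude panning with the tangent law and constant-power normalization. It is not the authors' extended method; the function name `tangent_law_gains` and its parameters are hypothetical.

```python
import math


def tangent_law_gains(source_deg, half_span_deg):
    """Constant-power gains for a loudspeaker pair at +/- half_span_deg.

    Assumption: positive source angles point toward the first loudspeaker,
    and half_span_deg lies strictly between 0 and 90 degrees.
    """
    if not 0.0 < half_span_deg < 90.0:
        raise ValueError("half_span_deg must be between 0 and 90")
    if abs(source_deg) > half_span_deg:
        raise ValueError("source must lie between the two loudspeakers")
    # Tangent law: tan(source) / tan(half_span) = (g1 - g2) / (g1 + g2)
    ratio = math.tan(math.radians(source_deg)) / math.tan(math.radians(half_span_deg))
    g1, g2 = 1.0 + ratio, 1.0 - ratio
    # Normalize so that g1^2 + g2^2 = 1 (constant power).
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

A source at 0 degrees receives equal gains on both loudspeakers, and a source at the loudspeaker angle is reproduced by that loudspeaker alone.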
P16-5 Evaluation of Spatial Audio Quality of the Synthesis of Binaural Room Impulse Responses for New Object Positions—Stephan Werner, Technische Universität Ilmenau - Ilmenau, Germany; Florian Klein, Technische Universität Ilmenau - Ilmenau, Germany; Clemens Müller, Technical University of Ilmenau - Ilmenau, Germany
The aim of auditory augmented reality is to create an auditory illusion combining virtual audio objects and scenarios with the perceived real acoustic surrounding. A suitable system like position-dynamic binaural synthesis is needed to minimize perceptual conflicts with the perceived real world. The needed binaural room impulse responses (BRIRs) have to fit the acoustics of the listening room. One approach to minimize the large number of BRIRs for all source-receiver relations is the synthesis of BRIRs using only one measurement in the listening room. The focus of the paper is the evaluation of the spatial audio quality. In most conditions, differences in direct-to-reverberant energy ratio between a reference and the synthesis are below the just noticeable difference. Furthermore, small differences are found for perceived overall difference, distance, and direction perception. Perceived externalization is comparable to the usage of measured BRIRs. Challenges arise when synthesizing more distant sources from a source position that is closer to the listening position.
Convention Paper 10305
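The direct-to-reverberant energy ratio mentioned in P16-5 can be estimated from a single impulse response. The following is a minimal sketch under stated assumptions: the direct sound is taken to be everything up to a short window after the largest peak, and the 2.5 ms window length is an illustrative choice, not a value from the paper. The function name `drr_db` is hypothetical.

```python
import math


def drr_db(impulse_response, sample_rate, direct_ms=2.5):
    """Direct-to-reverberant energy ratio in dB for one impulse response.

    Assumption: samples up to direct_ms after the largest peak count as
    direct sound; everything later counts as reverberant energy.
    """
    peak = max(range(len(impulse_response)), key=lambda n: abs(impulse_response[n]))
    window = int(round(direct_ms * 1e-3 * sample_rate))
    split = min(len(impulse_response), peak + window + 1)
    e_direct = sum(x * x for x in impulse_response[:split])
    e_reverb = sum(x * x for x in impulse_response[split:])
    if e_reverb == 0.0:
        return math.inf  # no energy after the direct-sound window
    return 10.0 * math.log10(e_direct / e_reverb)
```

Comparing the value for a measured BRIR with the value for a synthesized one gives the kind of difference that the abstract relates to the just noticeable difference.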
P16-6 Withdrawn—N/A
P16-7 An Adaptive Crosstalk Cancellation System Using Microphones at the Ears—Tobias Kabzinski, RWTH Aachen University - Aachen, Germany; Peter Jax, RWTH Aachen University - Aachen, Germany
For the reproduction of binaural signals via loudspeakers, crosstalk cancellation systems are necessary. To compute the crosstalk cancellation filters, the transfer functions between loudspeakers and ears must be given. If the listener moves, the filters are usually updated based on a model or previously measured transfer functions. We propose a novel architecture in which microphones are placed close to the listener’s ears to continuously estimate the true transfer functions, which are then used to adapt the crosstalk cancellation filters. A fast frequency-domain state-space approach is employed for multichannel system tracking. For simulations of slow listener rotations, it is demonstrated by objective and subjective means that the proposed system successfully attenuates crosstalk of the direct sound components.
Convention Paper 10307
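P16-7 states that crosstalk cancellation filters are computed from the transfer functions between loudspeakers and ears. One common way to do this for two loudspeakers and two ears is a regularized matrix inversion per frequency bin, sketched below. This illustrates only the filter computation step; it does not implement the paper's state-space tracking, and the regularization parameter `beta` and the function names are assumptions made for illustration.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def ctc_filters(h, beta=0.0):
    """Crosstalk cancellation matrix for one frequency bin.

    h[i][j] is the complex transfer function from loudspeaker j to ear i.
    Returns C = (H^H H + beta*I)^-1 H^H, so that H C is close to identity.
    """
    # Hermitian (conjugate) transpose of H.
    hh = [[complex(h[j][i]).conjugate() for j in range(2)] for i in range(2)]
    a = matmul2(hh, h)
    a[0][0] += beta
    a[1][1] += beta
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular; increase beta")
    a_inv = [[a[1][1] / det, -a[0][1] / det],
             [-a[1][0] / det, a[0][0] / det]]
    return matmul2(a_inv, hh)
```

With `beta=0` this reduces to the plain inverse of the transfer matrix; a small positive `beta` limits filter gain at frequencies where the matrix is poorly conditioned, at the cost of some residual crosstalk.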
P16-8 Immersive Sound Reproduction in Real Environments Using a Linear Loudspeaker Array—Valeria Bruschi, Università Politecnica delle Marche - Ancona, Italy; Nicola Ortolani, Università Politecnica delle Marche - Ancona (AN), Italy; Stefania Cecchi, Università Politecnica delle Marche - Ancona, Italy; Francesco Piazza, Università Politecnica delle Marche - Ancona (AN), Italy
In this paper an immersive sound reproduction system capable of improving the overall listening experience is presented and tested using a linear loudspeaker array. The system aims at providing channel separation over a broadband spectrum by implementing the RACE (Recursive Ambiophonic Crosstalk Elimination) algorithm and a beamforming algorithm based on a pressure matching approach. A real-time implementation of the algorithm has been developed and its performance has been evaluated by comparing it with the state of the art. Objective and subjective measurements have confirmed the effectiveness of the proposed approach.
Convention Paper 10308
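The recursive structure behind the RACE algorithm named in P16-8 can be sketched as follows: each output channel is its input minus an attenuated, delayed copy of the opposite output channel. This is a simplified full-band illustration only. The band limiting usually applied in practice and the paper's pressure-matching beamformer are omitted, and the `gain` and `delay` values are hypothetical placeholders, not values from the paper.

```python
def race(left, right, gain=0.7, delay=3):
    """Full-band recursive crosstalk elimination on two sample lists.

    out_l[n] = left[n]  - gain * out_r[n - delay]
    out_r[n] = right[n] - gain * out_l[n - delay]
    Assumptions: 0 <= gain < 1 for stability, delay in whole samples.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    if not 0.0 <= gain < 1.0:
        raise ValueError("gain must be in [0, 1)")
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    out_l = [0.0] * len(left)
    out_r = [0.0] * len(right)
    for n in range(len(left)):
        # Delayed outputs are zero before the recursion has started.
        prev_l = out_l[n - delay] if n >= delay else 0.0
        prev_r = out_r[n - delay] if n >= delay else 0.0
        out_l[n] = left[n] - gain * prev_r
        out_r[n] = right[n] - gain * prev_l
    return out_l, out_r
```

Feeding an impulse into the left channel yields a decaying train of alternating-sign pulses that bounces between the two outputs, which is the cancellation pattern the recursion is meant to produce.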
P16-9 The Influences of Microphone System, Video, and Listening Position on the Perceived Quality of Surround Recording for Sport Content—Aimee Moulson, University of Huddersfield - Huddersfield, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK
This paper investigates the influences of the recording/reproduction format, video, and listening position on the quality perception of surround ambience recordings for sporting events. Two microphone systems—First Order Ambisonics (FOA) and Equal Segment Microphone Array (ESMA)—were compared in both 4-channel (2D) and 8-channel (3D) loudspeaker reproductions. One subject group tested audio-only conditions while the other group was presented with video as well as audio. Overall, the ESMA was rated significantly higher than the FOA for all quality attributes tested regardless of the presence of video. The 2D and 3D reproductions did not have a significant difference within each microphone system. Video had a significant interaction with the microphone system and listening position depending on the attribute.
Convention Paper 10309
P16-10 Sound Design and Reproduction Techniques for Co-Located Narrative VR Experiences—Marta Gospodarek, New York University - New York, NY, USA; Andrea Genovese, New York University - New York, NY, USA; Dennis Dembeck, New York University - New York, NY, USA; Flavorlab; Corinne Brenner, New York University - New York, NY, USA; Agnieszka Roginska, New York University - New York, NY, USA; Ken Perlin, New York University - New York, NY, USA
Immersive co-located theatre aims to bring the social aspects of traditional cinematic and theatrical experience into Virtual Reality (VR). Within these VR environments, participants can see and hear each other, while their virtual seating location corresponds to their actual position in the physical space. These elements create a realistic sense of presence and communication, which enables an audience to create a cognitive impression of a shared virtual space. This article presents a theoretical framework behind the design principles, challenges and factors involved in the sound production of co-located VR cinematic productions, followed by a case-study discussion examining the implementation of an example system for a 6-minute cinematic experience for 30 simultaneous users. A hybrid reproduction system is proposed for the delivery of an effective sound design for shared cinematic VR.
Winner of the 147th AES Convention Best Peer-Reviewed Paper Award
Convention Paper 10287