AES New York 2018
Poster Session P16
P16 - Spatial Audio
Saturday, October 20, 10:30 am — 12:00 pm (Poster Area)
P16-1 Spatial Audio Coding with Backward-Adaptive Singular Value Decomposition—Sina Zamani, University of California Santa Barbara - Santa Barbara, CA, USA; Kenneth Rose, University of California Santa Barbara - Santa Barbara, CA, USA
The MPEG-H 3D Audio standard applies singular value decomposition (SVD) to higher-order ambisonics data, and divides the outcome into prominent and ambient sound components, which are then separately encoded. We recently showed that significant compression gains are achievable by moving the SVD to the frequency domain, and ensuring smooth transition between frames. Frequency domain SVD also enables SVD adaptation to frequency, but the increase in side information, to specify additional basis vectors, compromises the gains. This paper overcomes this shortcoming by introducing backward adaptive estimation of SVD basis vectors, at no cost in side information, thereby approaching the full potential of frequency domain SVD. Objective and subjective tests show considerable gains that validate the effectiveness of the proposed approach.
Convention Paper 10119
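As a rough illustration of the SVD split that the P16-1 abstract attributes to MPEG-H 3D Audio, the sketch below separates one multichannel frame into a rank-one predominant part and an ambient residual. It is a minimal sketch and not the codec's or the authors' method: it uses power iteration in place of a full SVD, keeps only the strongest component, works in the time domain, and ignores the backward-adaptive estimation that the paper proposes. The names `split_frame`, `frame`, and `iters` are hypothetical.

```python
import math

def split_frame(frame, iters=100):
    """Split a frame (list of channels, each a list of samples) into a
    rank-one predominant part and an ambient residual.

    Assumption: power iteration stands in for a full SVD and only the
    strongest component is kept.
    """
    n_ch, n_smp = len(frame), len(frame[0])
    # Channel covariance R = X X^T; its top eigenvector is the first
    # left singular vector of X.
    cov = [[sum(a * b for a, b in zip(frame[i], frame[j]))
            for j in range(n_ch)] for i in range(n_ch)]
    u = [1.0 / math.sqrt(n_ch)] * n_ch  # arbitrary starting vector
    for _ in range(iters):
        v = [sum(cov[i][j] * u[j] for j in range(n_ch)) for i in range(n_ch)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:  # all-zero frame: nothing to separate
            break
        u = [x / norm for x in v]
    # Predominant signal: projection of every sample onto u.
    s = [sum(u[i] * frame[i][t] for i in range(n_ch)) for t in range(n_smp)]
    predominant = [[u[i] * s[t] for t in range(n_smp)] for i in range(n_ch)]
    ambient = [[frame[i][t] - predominant[i][t] for t in range(n_smp)]
               for i in range(n_ch)]
    return u, predominant, ambient


# Toy frame: one source spread over four channels with fixed gains.
gains = [1.0, 0.5, -0.5, 0.25]
source = [math.sin(0.1 * t) for t in range(64)]
frame = [[g * x for x in source] for g in gains]
u, predominant, ambient = split_frame(frame)
```

For this toy frame the ambient residual is numerically zero, because a single source with fixed channel gains is exactly rank one. A real sound field would leave a nonzero residual to be coded separately.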
P16-2 Virtual Source Reproduction Using Two Rigid Circular Loudspeaker Arrays—Yi Ren, University of Electro-Communications - Tokyo, Japan; Yoichi Haneda, The University of Electro-Communications - Chofu-shi, Tokyo, Japan
In this paper a virtual sound source reproduction method is proposed using two circular loudspeaker arrays with rigid baffles. This study aims to reproduce virtual sources in front of or outside the loudspeaker arrays, with each array modeled as an infinite-length rigid cylinder with loudspeakers attached to its surface. Transfer functions that account for the reflections between the two arrays are introduced, and the appropriate number of reflections to include in the transfer function is discussed. Using the pressure-matching method and circular harmonic expansion, several methods are proposed and compared via computer simulation.
Convention Paper 10120
P16-3 Design and Implementation of a Binaural Reproduction Controller Applying Output Tracking Control—Atsuro Ito, NHK Science & Technology Research Laboratories - Tokyo, Japan; Kentaro Matsui, NHK Science & Technology Research Laboratories - Setagaya, Tokyo, Japan; Kazuho Ono, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Hisao Hattori, Sharp Corporation - Japan; Takeaki Suenaga, Sharp Corporation - Japan; Kenichi Iwauchi, Sharp Corporation - Japan; Shuichi Adachi, Keio University - Yokohama-shi, Kanagawa, Japan
We have been studying a design method for a controller for binaural reproduction with loudspeakers. The gain of the controller amplifies errors due to external disturbances and system perturbations, which degrades the sound quality. The gain should therefore be kept as low as possible. For this purpose, we formulate the design of the controller as a gain minimization problem, in which the H∞ norm of the controller is adopted as the measure of the gain. In this article we also introduce a binaural reproduction system as an implementation example. This system virtually reproduces multichannel audio, such as 22.2 multichannel audio, using line array loudspeakers.
Convention Paper 10121
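To make the gain measure in the P16-3 abstract concrete: for a single-input, single-output discrete-time filter, the H∞ norm is the peak of the magnitude response over frequency. The sketch below estimates it on a frequency grid for an FIR filter. This is a simplified illustration, not the authors' controller design. The function name `hinf_norm_fir`, the grid size, and the example coefficients are assumptions, and a multichannel controller would need the largest singular value of the response matrix at each frequency instead.

```python
import cmath
import math

def hinf_norm_fir(coeffs, n_grid=4096):
    """Approximate the H-infinity norm of a SISO FIR filter as the peak
    magnitude of its frequency response on [0, pi].

    Assumption: real coefficients and a uniform grid that is dense
    enough; the result is a lower bound on the true supremum.
    """
    peak = 0.0
    for k in range(n_grid + 1):
        w = math.pi * k / n_grid
        response = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(coeffs))
        peak = max(peak, abs(response))
    return peak


low_gain = hinf_norm_fir([1.0, 0.5])    # peak occurs at w = 0
high_gain = hinf_norm_fir([2.0, -1.5])  # peak occurs at w = pi
```

A design procedure that minimizes this quantity would prefer the first filter over the second, since any disturbance passing through it is amplified less in the worst case.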
P16-4 Horizontal Binaural Signal Generation at Semi-Arbitrary Positions Using a Linear Microphone Array—Asuka Yamazato, University of Electro-Communications - Tokyo, Japan; Yoichi Haneda, The University of Electro-Communications - Chofu-shi, Tokyo, Japan
Binaural technology using a dummy head is a powerful technique for providing realistic sound reproduction through headphones. However, to obtain binaural signals as if a listener were moving around in the sound field, the dummy head itself must be moved. A promising way to overcome this problem is to convert signals observed by a microphone array into binaural signals at arbitrary positions. In this paper we aim to reproduce horizontal binaural signals at semi-arbitrary listener positions from linear microphone array signals, based on the inverse wave propagation method with spatial oversampling and a simulated head-related transfer function (HRTF) directivity pattern. We perform computer simulations and listening experiments in a reverberant room. A listening test is performed for two cases to verify the performance of sound localization (case-I) and distance perception (case-II). We confirm that the binaural signals obtained by the proposed method closely match the HRTF directivity pattern. From the results of case-I, we find that the sound localization angle errors range from 1.2° to 4.2°. According to the results of case-II, the subjects can perceive a distance change of the virtual sound image when the auditory stimulus is white noise.
Convention Paper 10122
P16-5 Near-Field Compensated Higher-Order Ambisonics Using a Virtual Source Panning Method—Tong Wei, Institute of Acoustics, Chinese Academy of Sciences - Beijing, China; Jinqiu Sang, Institute of Acoustics, Chinese Academy of Science - Beijing, China; Chengshi Zheng, Institute of Acoustics, Chinese Academy of Sciences - Beijing, China; Xiaodong Li, Chinese Academy of Sciences - Beijing, China; Chinese Academy of Sciences - Shanghai, China
The commonly adopted higher order ambisonics (HOA) mainly concentrates on far-field sources and neglects the rendering of near-field sources. Some studies have introduced near-field compensated HOA (NFC-HOA) to preserve the original spherical wave front curvature, but this requires many loudspeakers. It is worthwhile to combine the advantages of a physical reproduction approach with those of a hearing-related model approach to avoid using many loudspeakers in a regular arrangement. In this paper an all-around virtual source panning method is proposed to improve the driving functions of NFC-HOA with panning functions. In this way, a near-field sound source encoded in HOA can be rendered to an arbitrary arrangement of only a few loudspeakers. Both the simulation and experimental results show the validity of the proposed method.
Convention Paper 10123
P16-6 Subjective Evaluation of Virtual Room Auralization System Based on the Ambisonics Matching Projection Decoding Method—Zhongshu Ge, Peking University - Beijing, China; Yue Qiao, Peking University - Beijing, China; Shusen Wang, AES (Beijing) Science & Technology Co., Ltd. - Beijing, China; Xihong Wu, Peking University - Beijing, China; Tianshu Qu, Peking University - Beijing, China
Based on higher order Ambisonics theory, a loudspeaker-based room auralization system was implemented in this paper in combination with a room acoustics computer model. In the decoding part of the Ambisonics technique, the generally used mode-matching decoding method requires a uniformly arranged loudspeaker array, a condition that cannot always be satisfied. A recently proposed method, the matching projection decoding method, which can solve this problem, was introduced in the room auralization system to reproduce room reverberation with non-uniform loudspeaker arrays. Moreover, the performance of the matching projection method was evaluated objectively through room impulse response reconstruction analysis. In addition, the room auralization system was validated through subjective experiments.
Convention Paper 10124
P16-7 A Study of the Effect of Head Rotation on Transaural Reproduction—Marcos Simón, University of Southampton - Southampton, UK; Eric Hamdan, University of Southampton - Southampton, UK; Dylan Menzies, University of Southampton - Southampton, UK; Filippo Maria Fazi, University of Southampton - Southampton, Hampshire, UK
The reproduction of binaural audio through loudspeakers, also commonly referred to as Transaural audio, allows for the rendering of immersive virtual acoustic images when the original binaural signal is accurately delivered to the listener’s ears. Such accurate reproduction is generally achieved by using a network of cross-talk-cancellation filters designed for a given listener’s position and orientation. This work studies the effect of small rotational movements of the listener’s head on the perceived location of a virtual sound source when the binaural signal is reproduced using an array of loudspeakers. The results of numerical simulations presented in this paper describe how the perceived virtual source position is affected by the variation of the head orientation.
Convention Paper 10125
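The cross-talk-cancellation idea in the P16-7 abstract can be sketched for two loudspeakers at a single frequency: the filter matrix is the inverse of the 2×2 matrix of loudspeaker-to-ear transfer functions, and a head rotation makes the true matrix differ from the one the filters were designed for. The plant model below (unit direct paths, a cross path with gain `g` and a pure delay, and a rotation modeled as opposite delay changes at the two ears) is a deliberately crude free-field assumption, not the model used in the paper. All function names and numbers are hypothetical.

```python
import cmath
import math

def plant(freq_hz, g=0.8, tau_left=2.0e-4, tau_right=2.0e-4):
    """2x2 loudspeaker-to-ear matrix at one frequency (rows: ears).

    Assumption: direct paths are 1; each cross path is a gain g and a
    pure delay. Unequal delays stand in for a rotated head.
    """
    w = 2.0 * math.pi * freq_hz
    return [[1.0, g * cmath.exp(-1j * w * tau_left)],
            [g * cmath.exp(-1j * w * tau_right), 1.0]]

def inverse_2x2(m):
    """Closed-form inverse of a 2x2 complex matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul_2x2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def crosstalk_db(ears):
    """Level of the unwanted path relative to the wanted path, left ear."""
    return 20.0 * math.log10(abs(ears[0][1]) / abs(ears[0][0]))


freq = 1000.0
filters = inverse_2x2(plant(freq))          # designed for the nominal pose
nominal = matmul_2x2(plant(freq), filters)  # ideally the identity matrix
rotated = matmul_2x2(plant(freq, tau_left=1.8e-4, tau_right=2.2e-4), filters)
leak_db = crosstalk_db(rotated)             # residual cross-talk after rotation
```

With the nominal plant the product is the identity, so each ear receives only its own binaural channel. With the perturbed delays the off-diagonal terms no longer cancel, which is the kind of degradation whose perceptual effect the paper studies.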
P16-8 A Parametric Spatial Audio Coding Method Based on Convolutional Neural Networks—Qingbo Huang, Peking University - Beijing, China; Xihong Wu, Peking University - Beijing, China; Tianshu Qu, Peking University - Beijing, China
Channel-based 3D audio can be compressed to a down-mix signal with side information. In this paper the inter-channel transfer functions (ITF) are estimated by training convolutional neural networks (CNNs) to overfit a specific frame. The target of the estimation is to reconstruct the original channels perfectly while keeping the spatial cues the same. By taking this approach, more accurate spatial cues are maintained. Subjective evaluation experiments were carried out on stereo signals to evaluate the proposed method.
Convention Paper 10126