AES London 2010
Poster Session P3
P3 - Recording, Production, and Reproduction—Multichannel and Spatial Audio
Saturday, May 22, 10:30 — 12:00 (Room C4-Foyer)
P3-1 21-Channel Surround System Based on Physical Reconstruction of a Three Dimensional Target Sound Field—Jeongil Seo, Jae-hyoun Yoo, Kyeongok Kang, ETRI - Daejeon, Korea; Filippo M. Fazi, University of Southampton - Southampton, UK
This paper presents a 21-channel sound field reconstruction system based on the physical reconstruction of a three-dimensional target sound field over a predefined control volume. According to the virtual sound source position and intensity, each loudspeaker signal is estimated by convolution with an appropriate FIR filter so as to reconstruct the target sound field. The FIR filter gain is applied only to the mid-frequency band of the source signal, both to prevent aliasing effects and to reduce computational complexity at high frequencies. The entire filtering is carried out in the frequency domain to suit real-time operation. In subjective listening tests the proposed system showed better localization in the horizontal plane than a conventional panning method.
Convention Paper 7973 (Purchase now)
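The frequency-domain FIR filtering described in the abstract can be sketched as block-wise overlap-add convolution of one source signal against per-loudspeaker filters. This is an illustrative reconstruction, not the authors' implementation; the function name, block size, and filter array are all hypothetical:

```python
import numpy as np

def render_loudspeaker_signals(source, filters, block=1024):
    """Filter one source signal through per-loudspeaker FIR filters
    using FFT-based overlap-add convolution (frequency-domain processing).
    source: (n_samples,); filters: (n_speakers, n_taps)."""
    n_spk, taps = filters.shape
    nfft = 1
    while nfft < block + taps - 1:      # FFT size covering block + filter tail
        nfft *= 2
    H = np.fft.rfft(filters, nfft, axis=1)          # precomputed filter spectra
    out = np.zeros((n_spk, len(source) + taps - 1))
    for start in range(0, len(source), block):
        seg = source[start:start + block]
        S = np.fft.rfft(seg, nfft)
        # multiply once, filter every loudspeaker channel in parallel
        y = np.fft.irfft(S[None, :] * H, nfft, axis=1)
        end = start + len(seg) + taps - 1
        out[:, start:end] += y[:, :len(seg) + taps - 1]  # overlap-add the tails
    return out
```

Precomputing the filter spectra once and reusing them per block is what makes the frequency-domain form attractive for real-time use.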
P3-2 Real-Time Implementation of Wave Field Synthesis on NU-Tech Framework Using CUDA Technology—Ariano Lattanzi, Emanuele Ciavattini, Leaff Engineering - Ancona, Italy; Stefania Cecchi, Laura Romoli, Università Politecnica delle Marche - Ancona, Italy; Fabrizio Ferrandi, Politecnico di Milano - Milan, Italy
In this paper we present a novel implementation of a Wave Field Synthesis application based on the emerging NVIDIA Compute Unified Device Architecture (CUDA) technology, using the NU-Tech Framework. CUDA unlocks the processing power of Graphics Processing Units (GPUs), which are characterized by a highly parallel architecture, and a wide range of complex algorithms are being rewritten to benefit from this approach. Wave Field Synthesis is a relatively new spatial audio rendering technique that is highly demanding in terms of computational power. We present results and comparisons between a NU-Tech Plug-In (NUTS) implementing real-time WFS using CUDA libraries and the same algorithm implemented using the Intel Integrated Performance Primitives (IPP) library.
Convention Paper 7974 (Purchase now)
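Independently of whether it runs on CUDA or IPP, the core of a WFS renderer is a per-loudspeaker delay-and-weight computation. A much-simplified sketch of that core (assumed names; stationary-phase prefactors and array tapering are omitted):

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def wfs_delays_gains(source_pos, speaker_pos):
    """Simplified WFS driving parameters for a virtual point source:
    per-loudspeaker delay (s) and amplitude weight ~ 1/sqrt(distance).
    speaker_pos: (n_speakers, 3); source_pos: (3,)."""
    d = np.linalg.norm(speaker_pos - source_pos, axis=1)  # source-to-speaker distances
    delays = d / C
    gains = 1.0 / np.sqrt(np.maximum(d, 1e-6))            # avoid divide-by-zero
    return delays, gains

def wfs_render(source, fs, delays, gains):
    """Apply integer-sample delays and gains to produce driving signals."""
    n = np.round(delays * fs).astype(int)
    out = np.zeros((len(delays), len(source) + n.max()))
    for k, (nk, gk) in enumerate(zip(n, gains)):
        out[k, nk:nk + len(source)] = gk * source
    return out
```

With many loudspeakers this inner loop is embarrassingly parallel, which is exactly why a GPU port is attractive.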
P3-3 Investigation of 3-D Audio Rendering with Parametric Array Loudspeakers—Reuben Johannes, Jia-Wei Beh, Woon-Seng Gan, Ee-Leng Tan, Nanyang Technological University - Singapore
This paper investigates the applicability of parametric array loudspeakers to render 3-D audio. Unlike conventional loudspeakers, parametric array loudspeakers are able to produce sound in a highly directional manner, thereby reducing interaural crosstalk and room reflections. The investigation is carried out by an objective evaluation and comparison of parametric array loudspeakers, conventional loudspeakers, and headphones. The objective evaluation includes crosstalk measurement and binaural cue analysis using a binaural hearing model. The paper also investigates how the positioning of the parametric array loudspeakers affects 3-D audio rendering.
Convention Paper 7975 (Purchase now)
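The binaural cues such an evaluation inspects are essentially interaural level and time differences. The following is a deliberately simplified stand-in for the paper's binaural hearing model, estimating broadband ILD and ITD from the two ear signals (function name and convention are assumptions):

```python
import numpy as np

def interaural_cues(left, right, fs):
    """Estimate broadband binaural cues from ear signals:
    ILD in dB (positive = left louder) and ITD in seconds from the
    cross-correlation peak (negative lag = right channel delayed)."""
    ild = 10.0 * np.log10(np.sum(left**2) / np.sum(right**2))
    xc = np.correlate(left, right, mode="full")
    lag = np.argmax(xc) - (len(right) - 1)   # sample lag at correlation peak
    itd = lag / fs
    return ild, itd
```

A real binaural model would first split the signals into auditory frequency bands and apply envelope extraction; this broadband version only illustrates the cue definitions.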
P3-4 Robust Representation of Spatial Sound in Stereo-to-Multichannel Upmix—Se-Woon Jeon, Yonsei University - Seoul, Korea; Young-Cheol Park, Yonsei University - Gangwon, Korea; Seok-Pil Lee, Korea Electronics Technology Institute (KETI) - Seoul, Korea; Dae-Hee Youn, Yonsei University - Seoul, Korea
This paper presents a stereo-to-multichannel upmix algorithm based on a source separation method. In conventional upmix algorithms, panned-source and ambient components are decomposed or separated by an adaptive algorithm such as least-squares (LS) or least-mean-squares (LMS). The separation performance of these algorithms is easily influenced by the primary-to-ambient energy ratio (PAR); since the PAR is time-varying, it causes energy fluctuation in the separated sound sources. To prevent this problem, we propose a robust separation algorithm using a pseudo-inverse matrix, together with a novel post-scaling algorithm that compensates for the influence of interference while taking the desired multichannel format into account. The performance of the proposed upmix algorithm is confirmed by subjective listening tests in the ITU 3/2 format.
Convention Paper 7976 (Purchase now)
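The pseudo-inverse idea can be illustrated on a single stereo frame: project the frame onto an assumed panning direction to get the primary component, and take the residual as ambience. This is a minimal sketch of the general technique, not the paper's algorithm (its post-scaling stage is not reproduced), and all names are hypothetical:

```python
import numpy as np

def separate_primary(stereo, pan_gains):
    """Least-squares primary/ambient decomposition of a stereo frame.
    stereo: (2, N) frame; pan_gains: (2,) assumed primary mixing vector.
    Returns (primary, ambient), each (2, N)."""
    a = np.asarray(pan_gains, float).reshape(2, 1)
    s_hat = np.linalg.pinv(a) @ stereo   # pseudo-inverse projection -> (1, N) primary estimate
    primary = a @ s_hat                  # primary images in the L/R channels
    ambient = stereo - primary           # residual serves as the ambient estimate
    return primary, ambient
```

Because `pinv(a)` depends only on the panning direction, not on the frame's primary-to-ambient energy ratio, the projection itself does not fluctuate with a time-varying PAR, which is the robustness argument sketched in the abstract.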
P3-5 Ambisonic Decoders; Is Historical Hardware the Future?—Andrew J. Horsburgh, D. Fraser Clark, University of the West of Scotland - Paisley, UK
Ambisonic recordings aim to capture full-sphere audio fields using a multi-capsule microphone and algorithms based on the “metatheory” proposed by Gerzon. Until recently, Ambisonic decoding was implemented solely in hardware; advances in computing power now allow software decoders to supersede hardware units. It is therefore of interest to determine whether hardware or software decoders provide the more accurate decoding of Ambisonic B-format signals. In this paper we present a comparison between hardware and software decoders with respect to their frequency and phase relationships to determine the most accurate reproduction. Results show that software is able to decode the files with little coloration compared to hardware circuits, and indicate which of the analog and digital decoder implementations match the behavioral characteristics of an “ideal” decoder.
Convention Paper 7977 (Purchase now)
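For reference, first-order horizontal B-format encoding and one textbook “basic” sampling decode to a regular loudspeaker ring look like this (one common convention, with the usual 1/√2 weighting on W; real decoders add shelf filtering and other refinements the paper's hardware units implement in analog circuitry):

```python
import numpy as np

def encode_bformat(signal, azimuth):
    """Encode a mono signal at a horizontal azimuth (rad) into
    first-order B-format (W, X, Y), with W carrying a 1/sqrt(2) weight."""
    w = signal / np.sqrt(2.0)
    x = signal * np.cos(azimuth)
    y = signal * np.sin(azimuth)
    return np.stack([w, x, y])

def decode_basic(bformat, speaker_azimuths):
    """'Basic' sampling decode of horizontal B-format to a regular ring:
    s_k = (sqrt(2)*W + 2*(X*cos(phi_k) + Y*sin(phi_k))) / N."""
    w, x, y = bformat
    n = len(speaker_azimuths)
    outs = [(np.sqrt(2.0) * w + 2.0 * (x * np.cos(phi) + y * np.sin(phi))) / n
            for phi in speaker_azimuths]
    return np.stack(outs)
```

For a source on one loudspeaker of a square array this yields the familiar (1 + 2cosΔ)/N gain pattern, including the small anti-phase feed from the opposite loudspeaker.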
P3-6 Localization Curves in Stereo Microphone Techniques—Comparison of Calculations and Listening Tests Results—Magdalena Plewa, Grzegorz Pyda, AGH University of Science and Technology - Kraków, Poland
Stereo microphone techniques are a simple and practical way to record a music scene while preserving the directions of the sound sources. To choose a proper microphone setup one needs to know the “recording angle,” which can easily be determined from localization curves. Localization curves for different stereo microphone techniques can be calculated from the interchannel level and time differences. Subjective listening tests were carried out to verify these calculations, and here we present and discuss the comparison between the calculations and the listening test results.
Convention Paper 7978 (Purchase now)
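One point on such a calculated localization curve can be obtained from the interchannel level difference via the stereophonic tangent law. A sketch under stated assumptions (a standard ±30° loudspeaker base; the time-difference contribution, which also shapes real localization curves, is ignored here):

```python
import numpy as np

def predicted_azimuth(icld_db, base_angle_deg=30.0):
    """Predicted phantom-source azimuth (deg) for a given interchannel
    level difference (dB) using the tangent law:
    tan(theta) / tan(theta_0) = (g_L - g_R) / (g_L + g_R).
    Positive ICLD shifts the image toward the left loudspeaker."""
    g_ratio = 10.0 ** (icld_db / 20.0)           # g_L / g_R
    t0 = np.tan(np.radians(base_angle_deg))
    t = t0 * (g_ratio - 1.0) / (g_ratio + 1.0)   # tangent law
    return np.degrees(np.arctan(t))
```

Sweeping `icld_db` over the level differences a given microphone pair produces versus source angle traces out the localization curve; where the predicted azimuth reaches ±30° marks the recording angle.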
P3-7 Investigation of Robust Panning Functions for 3-D Loudspeaker Setups—Johann-Markus Batke, Florian Keiler, Technicolor Research & Innovation - Hannover, Germany
Accurate localization is a key goal for a spatial audio reproduction system. This paper discusses different approaches to audio playback with full spatial information in three dimensions (3-D). Problems of established methods for 3-D audio playback, such as the Ambisonics mode-matching approach and Vector Base Amplitude Panning (VBAP), are discussed. A new approach is presented with special attention to the treatment of irregular loudspeaker setups, as are to be expected in real-world scenarios such as living rooms. This new approach leads to better localization of virtual acoustic sources. Listening tests comparing the new approach with standard mode matching and VBAP will be described in a companion paper.
Convention Paper 7979 (Purchase now)
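For context, the VBAP baseline the paper compares against computes gains for one loudspeaker triplet by inverting the matrix of loudspeaker direction vectors. A minimal sketch (assumed names; triplet selection across a full setup is not shown):

```python
import numpy as np

def vbap_gains(source_dir, triplet):
    """Vector Base Amplitude Panning over one loudspeaker triplet:
    solve g @ L = p, where the rows of L are the unit direction
    vectors of the three loudspeakers and p is the unit source
    direction, then normalize for constant power. A negative gain
    means the source lies outside this triplet."""
    L = np.array(triplet, float)
    L /= np.linalg.norm(L, axis=1, keepdims=True)  # unit loudspeaker vectors
    p = np.asarray(source_dir, float)
    p /= np.linalg.norm(p)
    g = np.linalg.solve(L.T, p)   # g @ L = p  <=>  L.T @ g = p
    g /= np.linalg.norm(g)        # constant-power normalization
    return g
```

The robustness problem the abstract raises shows up here directly: for irregular setups some triplets become nearly coplanar with the origin, `L` becomes ill-conditioned, and the solved gains grow large and sign-alternating.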