AES Warsaw 2015
Poster Session P4: Spatial Audio
Thursday, May 7, 16:00 — 18:00 (Foyer)
P4-1 Variation of Interaural Time Difference Caused by Head Offset Relative to Coordinate Origin—Guangzheng Yu, South China University of Technology - Guangzhou, Guangdong, China; Yuye Wu, South China University of Technology - Guangzhou, China
Interaural time difference (ITD) is related to the spatial position (distance and direction) of the sound source and to head size. Assuming the sound source and the coordinate system are fixed, the positional relationship between the sound source and the head center is influenced by any head offset relative to the coordinate origin, which may distort the spatial distribution of ITD when measuring head-related transfer functions (HRTFs). In this paper the variation of ITD caused by head offset is analyzed using the conventional Woodworth ITD model, consisting of a spherical head and a point sound source. Results show that forward (or backward) head offsets cause only small variations in ITD, whereas the spatial distribution distortion of ITD introduced by rightward (or leftward) head offsets is unacceptable.
Convention Paper 9240
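The conventional Woodworth model the abstract refers to admits a compact sketch. The head radius (0.0875 m), the speed of sound, and the x = front / y = left axis convention below are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def woodworth_itd(azimuth, a=0.0875, c=343.0):
    """Far-field Woodworth ITD for a rigid spherical head of radius a (m):
    ITD = (a / c) * (theta + sin(theta)), theta = lateral angle in radians."""
    return (a / c) * (azimuth + np.sin(azimuth))

def itd_with_offset(source_xy, offset_xy, a=0.0875, c=343.0):
    """ITD when the head center is displaced from the coordinate origin:
    the source direction is recomputed relative to the true head center."""
    rel = np.asarray(source_xy, float) - np.asarray(offset_xy, float)
    azimuth = np.arctan2(rel[1], rel[0])  # assumed convention: x = front, y = left
    return woodworth_itd(azimuth, a, c)
```

For a frontal source, a purely frontal offset leaves the lateral angle at zero and the ITD unchanged, while even a small leftward or rightward offset rotates the source off the median plane and produces a spurious ITD, consistent with the asymmetry the abstract reports.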
P4-2 Functional Representation for Efficient Interpolations of Head Related Transfer Functions in Mobile Headphone Listening—Joseph Sinker, University of Salford - Salford, UK; Jamie Angus, University of Salford - Salford, Greater Manchester, UK
In this paper two common methods of HRTF/HRIR dataset interpolation, namely simple linear interpolation in the time and frequency domains, are assessed using a Normalized Mean Square Error metric. Frequency-domain linear interpolation is shown to be the superior of the two methods, but both suffer from poor behavior and inconsistency over interpolated regions. An alternative interpolation approach based upon Principal Component Analysis (PCA) of the dataset is offered; the method uses a novel application of the Discrete Cosine Transform (DCT) to obtain a functional representation of the PCA weight vectors that may be queried for any angle on a continuous scale. The PCA/DCT method is shown to compare favorably with the simple time-domain method, even when applied to a dataset that has been heavily compressed during both the PCA and DCT analysis.
Convention Paper 9241
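Two ingredients named in the abstract, the NMSE metric and frequency-domain interpolation between two measured HRIRs, can be sketched as follows. Mixing magnitude and unwrapped phase separately is one common variant of frequency-domain interpolation; the paper's exact formulation is not specified here:

```python
import numpy as np

def nmse(ref, est):
    """Normalized mean square error between reference and estimate (lower is better)."""
    ref, est = np.asarray(ref, float), np.asarray(est, float)
    return np.sum((ref - est) ** 2) / np.sum(ref ** 2)

def interp_hrir_freq(h1, h2, w):
    """Interpolate two HRIRs in the frequency domain by mixing magnitude and
    unwrapped phase separately; w = 0 returns h1, w = 1 returns h2."""
    H1, H2 = np.fft.rfft(h1), np.fft.rfft(h2)
    mag = (1 - w) * np.abs(H1) + w * np.abs(H2)
    ph = (1 - w) * np.unwrap(np.angle(H1)) + w * np.unwrap(np.angle(H2))
    return np.fft.irfft(mag * np.exp(1j * ph), n=len(h1))
```

The PCA/DCT method the paper proposes replaces this pairwise scheme with a continuous functional model of the PCA weight vectors, so any angle can be queried without choosing neighboring measurements.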
P4-3 Binaural Hearing Aids with Wireless Microphone Systems including Speaker Localization and Spatialization—Gilles Courtois, Swiss Federal Institute of Technology (EPFL) - Lausanne, Switzerland; Patrick Marmaroli, Swiss Federal Institute of Technology (EPFL) - Lausanne, Switzerland; Hervé Lissek, Swiss Federal Institute of Technology (EPFL) - Lausanne, Switzerland; Yves Oesch, Phonak Communications AG - Murten, Switzerland; William Balande, Phonak Communications AG - Murten, Switzerland
Digital wireless microphone systems for hearing aids have been developed to provide a clean and intelligible speech signal to hearing-impaired listeners in, e.g., school or teleconference applications. In this technology the voice of the speaker is picked up by a body-worn microphone, wirelessly transmitted to the hearing aids, and rendered diotically (same signal at both ears), preventing any speaker localization cues from being provided. The reported algorithm performs real-time binaural localization and tracking of the talker so that the clean speech signal can then be spatialized according to its estimated position relative to the aided listener. This feature is expected to increase comfort, sense of immersion, and intelligibility for the users of such wireless microphone systems.
Convention Paper 9242
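The contrast between diotic and spatialized rendering can be illustrated with a deliberately crude ITD/ILD sketch (a Woodworth delay plus a hypothetical ~3 dB maximum level difference; the paper's actual binaural rendering algorithm is not specified):

```python
import numpy as np

def spatialize(mono, az, fs=16000, a=0.0875, c=343.0):
    """Crude ITD/ILD spatialization of a clean mono signal.
    az in radians, positive = source to the listener's right (assumed)."""
    itd = (a / c) * (abs(az) + np.sin(abs(az)))   # Woodworth far-field delay
    n = int(round(itd * fs))                      # delay in whole samples
    ild = 10 ** (-3.0 * np.sin(abs(az)) / 20)     # hypothetical 3 dB max ILD
    near = np.asarray(mono, float)
    far = np.concatenate([np.zeros(n), near[:len(near) - n]]) * ild
    return (far, near) if az > 0 else (near, far)  # (left ear, right ear)
```

With az = 0 this degenerates to the diotic case the abstract describes; with the tracked azimuth it restores the localization cues that diotic transmission removes.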
P4-4 On the Development of a Matlab-Based Tool for Real-Time Spatial Audio Rendering—Gabriel Moreno, Universitat de Valencia - Burjassot, Spain; Maximo Cobos, Universitat de Valencia - Burjassot, Spain; Jesus Lopez-Ballester, Universitat de Valencia - Burjassot, Spain; Pablo Gutierrez-Parera, Universitat de Valencia - Valencia, Spain; Jaume Segura-Garcia, Universitat de Valencia - Burjassot, Valencia, Spain; Polytechnic University of Valencia; Ana Torres, Polytechnic University School of Cuenca - Cuenca, Spain
Spatial audio has been a topic of intensive research in recent decades. Although there are many tools available for developing real-time spatial sound systems, most of them work under audio-oriented frameworks. However, although a significant number of signal processing researchers and engineers develop their algorithms in MATLAB, there is currently no MATLAB-based tool for rapid spatial audio system prototyping and algorithm testing. This paper presents a tool for spatial audio research and education under this framework. The presented tool provides the user with a friendly graphical user interface (GUI) that allows a number of sound sources to be moved freely in 3D and specific functions to be developed for use during their reproduction.
Convention Paper 9243
P4-5 Psychoacoustic Investigation on the Auralization of Spherical Microphone Array Processing with Wave Field Synthesis—Gyan Vardhan Singh, Technische Universität Ilmenau - Ilmenau, Germany
In the present work we investigate the perceptual effects induced by various errors and artifacts that arise when spherical microphone arrays are used on the recording side. For spatial audio it is very important to characterize the acoustic scene in three-dimensional space, and spherical microphone arrays are employed to achieve this three-dimensional characterization of a sonic scene. The use of these spherical arrays has some inherent issues, owing both to measurement errors and to the mathematics involved in the processing. In this paper we analyze these recording-side issues that degrade audio quality on the rendering side, and we conduct a psychoacoustic investigation to assess the extent to which the errors and artifacts produce a perceivable effect during auralization when the acoustic scene is reproduced using wave field synthesis.
Convention Paper 9244
P4-6 Evaluation of a Frequency-Domain Source Position Estimator for VBAP-Panned Recordings—Alexander Adami, International Audio Laboratories Erlangen - Erlangen, Germany; Jürgen Herre, International Audio Laboratories Erlangen - Erlangen, Germany; Fraunhofer IIS - Erlangen, Germany
A frequency-domain source position estimator is presented that extracts the position of a VBAP-panned directional source by means of a direct-ambience signal decomposition. The directional signal components are used to derive an estimate of the panning gains, from which the source position is estimated. Using simulations, we evaluated the mean estimated source positions as a function of the ideal source position as well as of different ambience energy levels. Additionally, we analyzed the influence of a second directional source on the estimated source positions.
Convention Paper 9245
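The forward VBAP mapping and its inversion back to a source position can be sketched in 2D using Pulkki's gain formulation (the speaker azimuths and unit-energy normalization below are illustrative choices, not taken from the paper):

```python
import numpy as np

def vbap_gains_2d(az, az_l, az_r):
    """2D VBAP gains for a source at azimuth az between loudspeakers at
    azimuths az_l and az_r: solve L^T g = p, then normalize to unit energy."""
    L = np.array([[np.cos(az_l), np.sin(az_l)],
                  [np.cos(az_r), np.sin(az_r)]])   # rows = speaker unit vectors
    p = np.array([np.cos(az), np.sin(az)])         # source unit vector
    g = np.linalg.solve(L.T, p)
    return g / np.linalg.norm(g)

def estimate_azimuth(g, az_l, az_r):
    """Invert (estimated) panning gains back to a source azimuth."""
    L = np.array([[np.cos(az_l), np.sin(az_l)],
                  [np.cos(az_r), np.sin(az_r)]])
    p = g @ L            # direction = g_l * l_l + g_r * l_r
    return np.arctan2(p[1], p[0])
```

In the paper's setting the gains are first estimated from the direct-ambience decomposition; here they are taken as given, and round-tripping a panned source through both functions recovers the original azimuth.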
P4-7 A Listener Position Adaptive Stereo System for Object-Based Reproduction—Marcos F. Simón Gálvez, University of Southampton - Southampton, UK; Dylan Menzies, University of Southampton - Southampton, UK; Filippo Maria Fazi, University of Southampton - Southampton, Hampshire, UK; Teofilo de Campos, University of Surrey - Guildford, Surrey, UK; Adrian Hilton, University of Surrey - Guildford, Surrey, UK
Stereo reproduction of spatial audio allows the creation of stable acoustic images when the listener is placed in the sweet spot, a small region in the vicinity of the axis of symmetry between the two loudspeakers. If the listener moves slightly toward one of the sources, however, the images collapse to the loudspeaker the listener is leaning toward. To overcome this limitation, a stereo reproduction technique that adapts the sweet spot to the listener position is presented here. This strategy introduces a new approach that maximizes listener immersion by rendering object-based audio, in which several audio objects or sources are placed at virtual locations within the stereo span. By using a video tracking device, the listener is allowed to move freely within the loudspeaker span, while the loudspeaker outputs are compensated using conventional panning algorithms so that the positions of the different audio objects are kept independent of that of the listener.
Convention Paper 9246
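One way to keep an object's perceived direction fixed while the listener moves is to recompute tangent-law stereo gains from the tracked listener position. The sketch below assumes a 2D x = right / y = front plane and is a generic illustration, not the authors' compensation algorithm:

```python
import numpy as np

def adaptive_pan_gains(listener, spk_l, spk_r, obj):
    """Tangent-law stereo gains recomputed from the tracked listener
    position so the object keeps its direction as seen by the listener."""
    def azimuth(p):  # angle of point p as seen from the listener
        d = np.asarray(p, float) - np.asarray(listener, float)
        return np.arctan2(d[0], d[1])                 # x = right, y = front
    a_l, a_r, a_o = azimuth(spk_l), azimuth(spk_r), azimuth(obj)
    base = (a_l + a_r) / 2                            # bisector of the pair
    t = np.tan(a_o - base) / np.tan((a_r - a_l) / 2)  # tangent panning law
    g = np.array([1 - t, 1 + t])                      # (left, right) gains
    return g / np.linalg.norm(g)                      # unit-energy normalization
```

As the listener approaches one loudspeaker, the speaker and object azimuths recomputed from the new position shift the gains so that the image no longer collapses onto the nearer unit.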
P4-8 Optimization of Reproduced Wave Surface for Three-Dimensional Panning—Akio Ando, University of Toyama - Toyama, Japan; Hiro Furuya, University of Toyama - Toyama, Japan; Masafumi Fujii, University of Toyama - Toyama, Japan; Minoru Tahara, University of Toyama - Toyama, Japan
Three-dimensional panning is an essential tool for the production of 3D sound material. The typical method is amplitude panning, which generates weighting coefficients on the basis of the direction of the virtual sound source (the desired direction) and the directions of the loudspeakers, or the distances between the virtual source and each loudspeaker, and then distributes the weighted signal of the corresponding sound to the loudspeakers. Amplitude panning sometimes produces a blurred image and deteriorates the timbre of the sound. In this paper we propose a new method that optimizes the shape of the wave surface synthesized by multiple loudspeakers. A computer simulation with a frontal six-loudspeaker system showed that the new method improved both the reproduced wave surface of the sound and its frequency response.
Convention Paper 9247
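The idea of optimizing the reproduced wave surface can be sketched as a least-squares match of the loudspeakers' summed field to that of a virtual point source on a grid of control points. The free-field Green's function and this generic least-squares formulation are assumptions for illustration, not the authors' exact optimization:

```python
import numpy as np

def optimize_wavefront(spk_pos, virt_pos, ctrl_pos, k):
    """Least-squares complex loudspeaker weights that reproduce, on the
    control points, the wave surface of a virtual source at wavenumber k."""
    def green(src, pts):  # free-field Green's function exp(-jkr) / (4 pi r)
        r = np.linalg.norm(np.asarray(pts, float) - np.asarray(src, float), axis=-1)
        return np.exp(-1j * k * r) / (4 * np.pi * r)
    G = np.stack([green(s, ctrl_pos) for s in spk_pos], axis=1)  # points x speakers
    p = green(virt_pos, ctrl_pos)                                # target field
    w, *_ = np.linalg.lstsq(G, p, rcond=None)
    return w
```

When the virtual source coincides with one loudspeaker the optimum trivially puts all the weight on that unit; for sources between loudspeakers the weights play the role of amplitude-panning coefficients while also shaping the curvature of the reproduced wavefront.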
P4-9 Estimation of the Radiation Pattern of a Violin During the Performance Using Plenacoustic Methods—Antonio Canclini, Politecnico di Milano - Milan, Italy; Luca Mucci, Politecnico di Milano - Milan, Italy; Fabio Antonacci, Politecnico di Milano - Milan, Italy; Augusto Sarti, Politecnico di Milano - Milan, Italy; Stefano Tubaro, Politecnico di Milano - Milan, Italy
We propose a method for estimating the 3D radiation pattern of violins during the performance of a musician. A rectangular array of 32 microphones is adopted for measuring the energy radiated by the violin in the observed directions. In order to gather measurements from all the 3D angular directions, the musician is free to move and rotate in front of the array. The position and orientation of the violin are estimated through a tracking system. As the adopted hardware is very compact and non-invasive, the musician plays in a natural fashion, thus replicating the radiation conditions of a real scenario. The experimental results demonstrate the accuracy and effectiveness of the method.
Convention Paper 9248
P4-10 An Evaluation of the IDHOA Ambisonics Decoder in Irregular Planar Layouts—Davide Scaini, Universitat Pompeu Fabra - Barcelona, Spain; Dolby Iberia S.L. - Barcelona, Spain; Daniel Arteaga, Dolby Iberia S.L. - Barcelona, Spain; Universitat Pompeu Fabra - Barcelona, Spain
In previous papers we presented an algorithm for decoding higher order Ambisonics for irregular real-world 3D loudspeaker arrays, implemented in the form of IDHOA, an open source project. IDHOA has many features tailored for the reproduction of Ambisonics in real audio venues. In order to benchmark the performance of the decoder against other decoding solutions, we restrict the decoder to 2D layouts, and in particular to the well-studied stereo, 5.1, and 7.1 surround layouts. We report on the results of the objective evaluation of the IDHOA decoder in these layouts and of the subjective evaluation in 5.1, benchmarking IDHOA against different decoding solutions.
Convention Paper 9249
P4-11 A General Purpose Modular Microphone Array for Spatial Audio Acquisition—Jesus Lopez-Ballester, Universitat de Valencia - Burjassot, Spain; Maximo Cobos, Universitat de Valencia - Burjassot, Spain; Juan J. Perez-Solano, Universitat de Valencia - Burjassot, Spain; Gabriel Moreno, Universitat de Valencia - Burjassot, Spain; Jaume Segura-Garcia, Universitat de Valencia - Burjassot, Valencia, Spain; Polytechnic University of Valencia
Sound acquisition for spatial audio applications usually requires the use of microphone arrays. Surround recording and advanced reproduction techniques such as Ambisonics or Wave-Field Synthesis usually require multi-capsule microphones. In this context a proper sound acquisition system is necessary for achieving the desired effect. Besides spatial audio reproduction, other applications such as source localization, speech enhancement, and acoustic monitoring using distributed microphone arrays are becoming increasingly important. In this paper we present the design of a general-purpose modular microphone array to be used in the above application contexts. The presented system allows multichannel recordings to be performed using multiple capsules arranged in different 2D and 3D geometries.
Convention Paper 9250
P4-12 Immersive Content in Three Dimensional Recording Techniques for Single Instruments in Popular Music—Bryan Martin, McGill University - Montreal, QC, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, QC, Canada; Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Brett Leonard, University of Nebraska at Omaha - Omaha, NE, USA; David Benson, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Will Howie, McGill University - Montreal, QC, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, Quebec, Canada
“3D Audio” has become a popular topic in recent years. A great deal of research is underway in spatial sound reproduction through computer modeling and signal processing, while less focus is being placed on actual recording practice. This study is a preliminary test in establishing effective levels of height-channel information based on the results of a listening test. In this case an acoustic guitar was used as the source. Eight discrete channels of height information were combined with an eight-channel surround sound mix reproduced at the listener’s ear height. Data from the resulting listening test suggest that while substantial levels of height-channel information increase the sense of immersion, more subtle levels fail to provide increased immersion over the conventional multichannel mix.
Convention Paper 9251