AES Paris 2016
Engineering Brief EB4
EB4 - eBriefs 4: Lectures
Tuesday, June 7, 12:00 — 13:45 (Room 353)
Chair:
Thomas Görne, Hamburg University of Applied Sciences - Hamburg, Germany
EB4-1 A Survey of Suggested Techniques for Height Channel Capture in Multichannel Recording—Richard King, McGill University - Montreal, Quebec, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, Quebec, Canada; Will Howie, McGill University - Montreal, Quebec, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, Quebec, Canada; Jack Kelly, McGill University - Montreal, Quebec, Canada
Capturing audio in three dimensions is becoming a required skill for many recording engineers. Playback formats and systems now exist that take advantage of height channels, which introduce the aspect of elevation into the experience. In this engineering brief several exploratory techniques in height channel capture are reviewed and compared. Techniques optimized for conventional 5.1 surround sound are employed, and additional microphones are added to increase the immersive experience. Methods that have proven successful in 5.1 recordings are modified for 3D audio capture and the results are discussed. This case study presents an overview of the groundwork currently underway.
Engineering Brief 266
EB4-2 Perceptually Significant Parameters in Stereo and Binaural Mixing with Logic Pro Binaural Panner—Blas Payri, Universitat Politècnica de València - Valencia, Spain; Juan-Manuel Sanchis-Rico, Universitat Politècnica de València - Valencia, Spain
We conducted a perception experiment using organ chords recorded with 6 microphones and mixed in stereo and binaural, varying in maximum angle distribution (0°, 87°, 174°) and, for the binaural mixes, in elevation, front-rear distribution, and post-processing. N=51 participants (audio-related students) listened to the 20 stimuli over headphones, classified them by similarity, and rated their valence and immersion. Results show high agreement on similarity (Cronbach's alpha = .93) but very low agreement on valence and immersion ratings. The parameters that were perceived are the binaural/stereo difference, the binaural post-processing style, and, to a lesser degree, the angle. Elevation and rear distribution of sources did not yield any significant response.
Engineering Brief 267
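The inter-rater agreement figure quoted in the abstract (Cronbach's alpha = .93) follows a standard formula. As a minimal sketch — not the authors' actual analysis code — assuming a raters-by-items matrix of scores:

```python
import numpy as np

def cronbach_alpha(ratings):
    """Cronbach's alpha for an (n_raters x n_items) matrix of scores."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                          # number of items
    item_vars = ratings.var(axis=0, ddof=1)       # per-item variances
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of per-rater totals
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Raters who differ only by a constant offset are perfectly consistent:
consistent = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(round(cronbach_alpha(consistent), 3))  # 1.0
```

Values near 1 (as in the study's similarity data) indicate that raters order the stimuli almost identically; the low agreement reported for valence and immersion would correspond to much smaller alphas.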
EB4-3 3D Tune-In: The Use of 3D Sound and Gamification to Aid Better Adoption of Hearing Aid Technologies—Yuli Levtov, Reactify - London, UK; Lorenzo Picinali, Imperial College London - London, UK; Mirabelle D'Cruz, Reactify Music LLP - London, UK; Luca Simeone, 3D Tune-In consortium
3D Tune-In is an EU-funded project with the primary aim of improving the quality of life of hearing aid users. This is an introductory paper outlining the project’s innovative approach to achieving this goal, namely via the 3D Tune-In Toolkit and a suite of accompanying games and applications. The 3D Tune-In Toolkit is a flexible, cross-platform library of code and guidelines that gives traditional game and software developers access to high-quality sound spatialization algorithms. The accompanying suite of games and applications will then make thorough use of the 3D Tune-In Toolkit in order to address, among other issues, the under-exploitation of advanced hearing aid features.
Engineering Brief 268
EB4-4 Binaural Auditory Feature Classification for Stereo Image Evaluation in Listening Rooms—Gavriil Kamaris, University of Patras - Rion Campus, Greece; Stamatis Karlos, University of Patras - Patras, Greece; Nikos Fazakis, University of Patras - Patras, Greece; Stergios Terpinas, University of Patras - Patras, Greece; John Mourjopoulos, University of Patras - Patras, Greece
Two aspects of stereo imaging accuracy in audio system listening have been investigated: (i) panned phantom image localization accuracy at 5° resolution and (ii) sweet spot spatial spread relative to the ideal anechoic reference. The simulated study used loudspeakers of different directivity under ideal anechoic or realistically varying reverberant room conditions and extracted binaural auditory features (ILDs, ITDs, and ICs) from the received audio signals. For evaluation, a Decision Tree classifier was used under a sparse-data self-training scheme, achieving localization accuracy ranging from 92% (ideal anechoic conditions, with training and test data from the same audio category) down to 55% (high reverberation, with training and test data from different music segments). Sweet spot accuracy was defined and evaluated as a spatial-spread statistical distribution function.
Engineering Brief 269
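The binaural features named in the abstract (ILDs and ITDs) can be estimated from a left/right signal pair in a few lines. The sketch below is a simplified broadband illustration — the study presumably used auditory-filterbank (per-band) versions, and the function name is hypothetical:

```python
import numpy as np

def binaural_features(left, right, fs):
    """Rough broadband ILD (dB) and ITD (seconds) from a binaural pair."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # ILD: ratio of channel energies, in dB
    ild = 10 * np.log10(np.sum(left**2) / np.sum(right**2))
    # ITD: lag of the cross-correlation peak; a positive lag means the
    # left-ear signal arrives later (source toward the right)
    xcorr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(xcorr)) - (len(right) - 1)
    return ild, lag / fs
```

For example, a left channel that is a copy of the right delayed by 5 samples at fs = 48 kHz yields ILD = 0 dB and ITD = 5/48000 ≈ 104 µs. Feature vectors of this kind, computed per auditory band, are what a Decision Tree classifier such as the one in the study would consume.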
EB4-5 Elevation Control in Binaural Rendering—Aleksandr Karapetyan, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Felix Fleischmann, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Jan Plogsties, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Most binaural audio algorithms render the sound image solely on the horizontal plane. Recently, immersive and object-based audio applications such as VR and games have come to require control of the sound image position in the height dimension; however, measuring elevated loudspeakers requires a 3D loudspeaker setup. By analyzing early reflections of BRIRs with a fixed elevation, spectral cues for height perception are extracted and applied to HRTFs. Parametrizing these cues allows height perception to be controlled: the sound image can be moved to higher as well as lower positions. The performance of this method has been evaluated by means of a listening test.
Engineering Brief 270
EB4-6 Headphone Virtualization for Immersive Audio Monitoring—Michael Smyth, Smyth Research Ltd. - Bangor, UK; Stephen Smyth, Smyth Research Ltd. - Bangor, UK
There are a number of competing immersive audio encoding formats, such as Dolby Atmos, Auro-3D, DTS:X, and MPEG-H, but, to date, no single loudspeaker format for monitoring them all. While one benefit of using audio objects in immersive audio is that they allow rendering to different, even competing, loudspeaker formats, the documented native formats of each immersive codec must nevertheless be considered the reference point when monitoring each immersive audio system. This implies that the ability to switch between different loudspeaker layouts will be important when monitoring different immersive audio formats. The solution outlined here generates virtual loudspeakers within DSP hardware and reproduces them over normal stereo headphones. The integrated system is designed to allow accurate monitoring of any immersive audio system of up to 32 loudspeaker sources, with the ability to switch almost instantly between formats.
Engineering Brief 271
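At its core, generating virtual loudspeakers amounts to convolving each loudspeaker feed with that loudspeaker's binaural room impulse response (BRIR) pair and summing into two headphone channels. A minimal time-domain sketch under that assumption — a real-time system like the one described would use optimized block/FFT convolution and head tracking, and the names here are illustrative:

```python
import numpy as np

def virtualize(feeds, brirs_left, brirs_right):
    """Render N loudspeaker feeds to a stereo headphone signal by
    convolving each feed with its left/right BRIR and summing."""
    n = max(len(f) + max(len(hl), len(hr)) - 1
            for f, hl, hr in zip(feeds, brirs_left, brirs_right))
    out_l, out_r = np.zeros(n), np.zeros(n)
    for feed, hl, hr in zip(feeds, brirs_left, brirs_right):
        cl, cr = np.convolve(feed, hl), np.convolve(feed, hr)
        out_l[:len(cl)] += cl
        out_r[:len(cr)] += cr
    return out_l, out_r
```

Switching loudspeaker layouts then reduces to swapping the BRIR set, which is consistent with the near-instant format switching the brief describes.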
EB4-7 Temporal Envelope for Audio Classification—Ewa Lukasik, Poznan University of Technology - Poznan, Poland; Cong Yang, University of Siegen - Siegen, Germany; Lukasz Kurzawski, RecArt - Poznan, Poland; Polish Radio Poznan - Poznan, Poland
The paper reviews some applications of the temporal envelope of an audio signal from the perspective of a sound engineer. It contrasts the parametric representation of the temporal envelope (e.g., temporal centroid, attack time, attack slope) with a more global representation based on envelope shape descriptors. Such an approach would mimic the sound engineer's expertise and could be useful for classification tasks such as music genre recognition, speech/music discrimination, musical instrument classification, and others.
Engineering Brief 272
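Two of the parametric envelope descriptors mentioned in the abstract (temporal centroid and attack time) are straightforward to compute. The sketch below uses a crude rectified envelope and a common 10%-to-90%-of-peak rise convention — assumptions for illustration, not the authors' exact definitions:

```python
import numpy as np

def temporal_descriptors(x, fs):
    """Temporal centroid (s) and attack time (s) from a signal's envelope."""
    x = np.asarray(x, dtype=float)
    env = np.abs(x)                    # crude amplitude envelope (no smoothing)
    t = np.arange(len(env)) / fs
    centroid = np.sum(t * env) / np.sum(env)   # amplitude-weighted mean time
    peak = np.argmax(env)
    # attack time: 10%-to-90%-of-peak rise, a common convention (assumption)
    rise10 = np.argmax(env >= 0.1 * env[peak])
    rise90 = np.argmax(env >= 0.9 * env[peak])
    return centroid, (rise90 - rise10) / fs
```

A percussive onset yields a short attack time and an early centroid, while a bowed or swelled note yields the opposite — exactly the kind of contrast such descriptors exploit for genre, speech/music, and instrument classification.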