AES New York 2017
Paper Session P05
P05 - Perception—Part 1
Wednesday, October 18, 2:00 pm — 5:00 pm (Rm 1E12)
Chair: Sean Olive, Harman International - Northridge, CA, USA
P05-1 Direct and Indirect Listening Test Methods—A Discussion Based on Audio-Visual Spatial Coherence Experiments—Cleopatra Pike, University of St Andrews - Fife, Scotland, UK; Hanne Stenzel, University of Surrey - Guildford, Surrey, UK
This paper reviews the pros and cons of using direct measures (e.g., preference, annoyance) and indirect measures (e.g., “subconscious” EEG measures and reaction times, “RTs”) to determine how viewers perceive audio and audio-visual attributes. The methodologies are discussed in relation to spatial coherence testing (whether audio and visual signals are perceived to arrive from the same direction). Experimental results in coherence testing are described to illustrate problems with direct measures and improvements seen with RTs. Suggestions are made for the use of indirect measures in testing, including more sophisticated uses of RTs. It is concluded that indirect measures offer novel insights into listener evaluations of audio-visual experiences but are not always suitable.
Convention Paper 9829 (Purchase now)
P05-2 Identification of Perceived Sound Quality Attributes of 360° Audiovisual Recordings in VR Using a Free Verbalization Method—Marta Olko, New York University - New York, NY, USA; Dennis Dembeck, New York University - New York, NY, USA, and Flavorlab; Yun-Han Wu, New York University - New York, NY, USA; Andrea Genovese, New York University - New York, NY, USA; Agnieszka Roginska, New York University - New York, NY, USA
Recent advances in Virtual Reality (VR) technology have led to rapid development of 3D binaural sound rendering methods that work in conjunction with head-tracking technology. As the production of 360° media grows, new subjective experiments are required that can appropriately evaluate and compare the sound quality of VR production tools. In this preliminary study a Free Verbalization Method is employed to uncover auditory features within 360° audio-video experiences presented with a 3-degrees-of-freedom head-tracking VR device and binaural sound over headphones. Subjects were first asked to identify perceived differences and similarities between different versions of audiovisual stimuli. In a second stage, subjects developed bipolar scales based on the verbal descriptions obtained previously. The verbal constructs created during the experiment were then combined by the authors and experts into parent attributes by means of semantic analysis, similar to previous research on sound quality attributes. Analysis of the results indicated three main groups of sound quality attributes: attributes describing the general impression of the 360° sound environment, attributes describing sound in relation to head movement, and attributes describing audio and video congruency. Overall, the consistency of sound between different positions in the 360° environment appears to constitute a new fundamental aspect of sound evaluation for VR and AR multimedia content.
Convention Paper 9830 (Purchase now)
P05-3 Tonal Component Coding in MPEG-H 3D Audio Standard—Tomasz Zernicki, Zylia sp. z o.o. - Poznan, Poland; Lukasz Januszkiewicz, Zylia sp. z o.o. - Poznan, Poland; Andrzej Ruminski, Zylia sp. z o.o. - Poznan, Poland; Marcin Chryszczanowicz, Zylia sp. z o.o. - Poznan, Poland
This paper describes a Tonal Component Coding (TCC) technique that is an extension tool for the MPEG-H 3D Audio standard. The method is used to enhance the perceptual quality of audio signals with strong and time-varying high-frequency (HF) tonal components. At the MPEG-H 3D Audio Core Coder, the TCC tool exploits sinusoidal modeling to detect substantial HF tonal content and transform it into so-called sinusoidal trajectories. A novel parametric coding scheme is applied and the additional data are multiplexed into the bitstream. At the decoder side, the trajectories are reconstructed and merged with the output of the 3D Audio Core Decoder. The TCC was tested as an extension to the MPEG-H Audio Reference Quality Encoder in the low-bitrate (enhanced Spectral Band Replication) and low-complexity (Intelligent Gap Filling) operating modes. Subjective listening tests show a statistically significant improvement in the perceptual quality of signals encoded with the proposed technique. (An illustrative sketch of the trajectory-formation idea follows this listing.)
Convention Paper 9831 (Purchase now)
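As a rough illustration of the sinusoidal-modeling step the abstract describes, the Python sketch below picks spectral peaks in each STFT frame and links them frame-to-frame into trajectories. All parameters here (STFT size, 6 kHz HF cutoff, peak threshold, linking tolerance) are assumptions chosen for the example and are not taken from the MPEG-H TCC specification.

    # Minimal sketch of sinusoidal trajectory formation for HF tonal content.
    # Not the MPEG-H TCC reference code; thresholds are illustrative.
    import numpy as np
    from scipy.signal import stft, find_peaks

    def track_tonal_components(x, fs, f_min=6000.0, max_jump_hz=100.0):
        """Return trajectories as lists of (frame, freq_hz, magnitude)."""
        f, t, X = stft(x, fs=fs, nperseg=1024, noverlap=768)
        mag = np.abs(X)
        trajectories, active = [], []   # active: trajectories open for linking
        for n in range(mag.shape[1]):
            # Peak picking: local maxima well above the frame's median level.
            peaks, _ = find_peaks(mag[:, n], height=10.0 * np.median(mag[:, n]))
            peaks = [p for p in peaks if f[p] >= f_min]   # keep HF peaks only
            next_active = []
            for p in peaks:
                # Link to the nearest active trajectory within max_jump_hz.
                cand = [tr for tr in active if abs(tr[-1][1] - f[p]) < max_jump_hz]
                if cand:
                    tr = min(cand, key=lambda tr: abs(tr[-1][1] - f[p]))
                    active.remove(tr)             # a trajectory links once per frame
                else:
                    tr = []                       # birth of a new trajectory
                    trajectories.append(tr)
                tr.append((n, float(f[p]), float(mag[p, n])))
                next_active.append(tr)
            active = next_active                  # unmatched trajectories die
        return trajectories

In the actual codec the resulting trajectories would then be parametrically encoded and multiplexed into the bitstream; this sketch stops at trajectory formation.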
P05-4 Lead-Signal Localization Accuracy for Inter-Channel Time Difference in Higher and Lower Vertical, Side, and Diagonal Loudspeaker Configurations—Paul Mayo, Belmont University - Nashville, TN, USA; Wesley Bulla, Belmont University - Nashville, TN, USA
The effects of inter-channel time difference (ICTD) on a sound source’s perceived location are well understood for horizontal loudspeaker configurations. This experiment tested the effect of novel loudspeaker configurations on a listener’s ability to localize the leading signal in ICTD scenarios. The experiment was designed as a comparison to standard horizontal precedence-effect experiments but with non-traditional loudspeaker arrangements, such as vertical, elevated, and lowered configurations. Data will be analyzed using sign tests and ANOVAs, with listeners’ responses visualized graphically. Outcomes are expected to follow a predicted precedence-based suppression model, with localization concentrated at the leading loudspeaker. (A minimal sketch of ICTD stimulus generation follows this listing.)
Convention Paper 9832 (Purchase now)
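To illustrate the stimulus side of such an experiment, the sketch below duplicates a test signal into a lead/lag channel pair separated by a given ICTD; routing the two channels to a vertical, side, or diagonal loudspeaker pair is left to the playback system. The signal choice and delay value are illustrative assumptions, not the authors’ stimuli.

    # Sketch: build a two-channel lead/lag stimulus for an ICTD trial.
    import numpy as np

    def ictd_pair(signal, fs, ictd_ms):
        """Duplicate `signal` into lead/lag channels separated by ictd_ms."""
        lag_samples = int(round(abs(ictd_ms) * 1e-3 * fs))
        lead = np.concatenate([signal, np.zeros(lag_samples)])
        lag = np.concatenate([np.zeros(lag_samples), signal])
        # Positive ictd_ms: channel 0 leads; negative: channel 1 leads.
        return np.stack([lead, lag] if ictd_ms >= 0 else [lag, lead])

    fs = 48000
    n = int(0.05 * fs)                         # 50 ms noise burst
    burst = np.random.randn(n) * np.hanning(n) # windowed to avoid clicks
    stim = ictd_pair(burst, fs, ictd_ms=1.0)   # 1 ms lead in channel 0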
P05-5 Non-Intrusive Polar Pattern Estimation in Diffuse Noise Conditions for Time Variant Directional Binaural Hearing Aids—Changxue Ma, GN Resound Inc. - Glenview, IL, USA; Andrew B. Dittberner, GN Resound Inc. - Glenview, IL, USA; Rob de Vries, GN Resound Inc. - Eindhoven, The Netherlands
The directivity index is often used to represent the performance of beamforming algorithms on hearing aids. For modeling binaural listening, the directivity patterns themselves must also be measured. The common method of estimating directivity patterns uses a single rotating sound source. With this method it is difficult to obtain a good directivity pattern when the performance of the system depends adaptively on the acoustic environment. The directivity pattern can also be confounded by other signal processing components, such as direction-of-arrival (DOA) based nonlinear post-filtering. This paper proposes a method to extract the directivity patterns of a beamforming algorithm under diffuse noise conditions using a rotating probe signal. The directivity pattern is obtained from the probe signal by spectral subtraction of the diffuse noise sound field, in a non-intrusive manner. (A simplified sketch of the subtraction step follows this listing.)
Convention Paper 9833 (Purchase now)
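The sketch below illustrates the spectral-subtraction idea under an assumed interface, not the authors’ implementation: given device-output recordings made with the probe active at each angle in a diffuse field, plus one noise-only recording of equal length, subtracting the noise power spectrum leaves an estimate of the probe’s per-angle contribution, from which a relative polar pattern follows.

    # Sketch: non-intrusive polar pattern estimate via power spectral
    # subtraction. All recordings are assumed equal-length 1-D arrays.
    import numpy as np

    def polar_pattern_db(output_per_angle, output_noise_only):
        """output_per_angle: dict angle_deg -> device output, probe active.
        output_noise_only: output in the same diffuse field, probe off.
        Returns dict angle_deg -> gain in dB, 0 dB at the maximum."""
        noise_psd = np.abs(np.fft.rfft(output_noise_only)) ** 2
        probe_power = {}
        for angle, y in output_per_angle.items():
            psd = np.abs(np.fft.rfft(y)) ** 2
            # Spectral subtraction: remove the diffuse-noise floor, clamp at 0.
            probe_power[angle] = float(np.maximum(psd - noise_psd, 0.0).mean())
        ref = max(probe_power.values()) or 1.0   # guard against all-zero case
        return {a: 10.0 * np.log10(p / ref + 1e-12)
                for a, p in probe_power.items()}

In practice one would average the power spectra over many frames (e.g., Welch’s method) rather than taking a single long FFT; the single-FFT form keeps the sketch short.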
P05-6 Training on the Acoustical Identification of the Listening Position in a Virtual Environment—Florian Klein, Technische Universität Ilmenau - Ilmenau, Germany; Annika Neidhardt, Technische Universität Ilmenau - Ilmenau, Germany; Marius Seipel, Technische Universität Ilmenau - Ilmenau, Germany; Thomas Sporer, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany
This paper presents an investigation of the training effect on the perception of position-dependent room acoustics. Listeners are trained to distinguish the acoustics at different listening positions and to detect mismatches between the visual and the acoustical representations. In virtual acoustic environments, simplified representations of room acoustics are often used. This works well when fictive or unknown rooms are auralized, but may be critical for real rooms. The results show that 10 out of 20 participants significantly increased their accuracy in choosing the correct combinations after training. The paper investigates the underlying processes of the adaptation effect and the reasons for the individual differences. The relevance of these findings for acoustic virtual/augmented reality applications is discussed.
Convention Paper 9834 (Purchase now)