AES Los Angeles 2014
Paper Session P14
P14 - Perception: Part 2
Saturday, October 11, 2:00 pm — 5:30 pm (Room 308 AB)
Chair: Sungyoung Kim, Rochester Institute of Technology - Rochester, NY, USA
P14-1 Revision of Rec. ITU-R BS.1534—Judith Liebetrau, Fraunhofer IDMT - Ilmenau, Germany; Frederik Nagel, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; International Audio Laboratories - Erlangen, Germany; Nick Zacharov, DELTA SenseLab - Iisalmi, Finland; Kaoru Watanabe, NHK Science and Technology Research Labs. - Setagaya-ku, Tokyo, Japan; Catherine Colomes, Orange Labs - Cesson Sevigné, France; Poppy Crum, Dolby Laboratories - San Francisco, CA, USA; Thomas Sporer, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany; Ilmenau University of Technology - Ilmenau, Germany; Andrew Mason, BBC Research and Development - London, UK
In audio quality evaluation, ITU-R BS.1534-1, commonly known as MUSHRA, is widely used for the subjective assessment of intermediate audio quality. Studies have identified limitations of the MUSHRA methodology [1][2] that can reduce its robustness to biases and errors introduced during the testing process. Therefore, ITU-R BS.1534 was revised to reduce the potential for introducing systematic errors and biases in the resulting data. These modifications improve the validity and the reliability of data collected with the MUSHRA method. The main changes affect the post-screening of listeners, the inclusion of a mandatory mid-range anchor, the number and length of test items, and the statistical analysis. This paper presents the changes and the reasons for them.
Convention Paper 9172
P14-2 Movement Perception of Risset Tones with and without Artificial Spatialization—Julian Villegas, University of Aizu - Aizu Wakamatsu, Fukushima, Japan
The apparent radial movement (approaching or receding) of Risset tones was studied for sources in front, above, and to the right of listeners. Besides regular Risset tones, two kinds of spatialization were included: global (regarding the tone as a whole) and individual (spatializing each of its spectral components). The results suggest that regardless of the direction of the glissando, subjects tend to judge them as approaching. The effect of spatialization type was complex: For upward Risset tones, judgments were, in general, aligned with the direction of the spatialization, but this was not observed in the downward Risset tones. Furthermore, individual spatialization yielded judgments comparable to those of non-spatialized stimuli, whereas spatializing the stimuli as a whole yielded judgments more aligned with the treatment.
Convention Paper 9173
P14-3 The Audibility of Typical Digital Audio Filters in a High-Fidelity Playback System—Helen M. Jackson, Meridian Audio Ltd. - Huntingdon, UK; Michael D. Capp, Meridian Audio Ltd. - Huntingdon, UK; J. Robert Stuart, Meridian Audio Ltd. - Huntingdon, UK
This paper describes listening tests investigating the audibility of various filters applied in high-resolution wideband digital playback systems. Discrimination between filtered and unfiltered signals was compared directly in the same subjects using a double-blind psychophysical test. Filter responses tested were representative of anti-alias filters used in A/D (analog-to-digital) converters or mastering processes. Further tests probed the audibility of 16-bit quantization with or without a rectangular dither. Results suggest that listeners are sensitive to the small signal alterations introduced by these filters and quantization. Two main conclusions are offered: first, there exist audible signals that cannot be encoded transparently by a standard CD; and second, an audio chain used for such experiments must be capable of high-fidelity reproduction.
Convention Paper 9174
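As a rough illustration of the 16-bit quantization conditions mentioned in the P14-3 abstract, the sketch below quantizes a signal with and without rectangular (uniform) dither. It is not the authors' test procedure: the function name `quantize_16bit`, the one-LSB peak-to-peak dither width, the fixed seed, and the test tone are all assumptions chosen for the example.

```python
import math
import random


def quantize_16bit(samples, dither=False, seed=0):
    """Quantize floats in [-1.0, 1.0) to signed 16-bit integer codes.

    With dither=True, rectangular (uniform) dither is added before rounding.
    The dither width of one LSB peak-to-peak is an assumption for this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    codes = []
    for x in samples:
        v = x * 32768.0  # express the sample in LSB units
        if dither:
            v += rng.uniform(-0.5, 0.5)
        q = math.floor(v + 0.5)  # round to the nearest integer code
        codes.append(max(-32768, min(32767, q)))  # clip to the 16-bit range
    return codes


# Hypothetical test signal: a 1 kHz tone at 48 kHz whose peak is only 0.4 LSB.
tone = [0.4 / 32768.0 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]

plain = quantize_16bit(tone)
dithered = quantize_16bit(tone, dither=True)

# Without dither the tone falls below the rounding threshold and is lost.
print(all(q == 0 for q in plain))
# With dither some samples toggle between neighboring codes.
print(sorted(set(dithered)))
```

In this toy case the undithered output is all zeros, while the dithered output toggles among the codes nearest zero, so the tone survives in the average of the output at the cost of added noise.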
P14-4 Evaluation Criteria for Live Loudness Meters—Jon Allan, Luleå University of Technology - Piteå, Sweden; Jan Berg, Luleå University of Technology - Piteå, Sweden
As a response to discrepancies in loudness levels in broadcast, the recommendations of the International Telecommunication Union and the European Broadcasting Union state that audio levels should be regulated based on loudness measurement. These recommendations differ regarding the definition of meter ballistics for live loudness meters, and this paper seeks to identify possible additional information needed to attain a higher conformity between the recommendations. This work suggests that the qualities we seek in a live loudness meter could be differentiated by time scale (i.e., momentary and short-term, which are defined by two different integration times) and should therefore also be assessed with different evaluation criteria.
Convention Paper 9175
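To make the two time scales in the P14-4 abstract concrete, the sketch below computes a sliding mean-square level over two window lengths. The 400 ms (momentary) and 3 s (short-term) values follow the integration times commonly used in EBU loudness metering. Everything else is an assumption for illustration: the function name `sliding_level_db`, the rectangular window, the test signal, and the omission of the K-weighting filter that a real loudness meter applies first.

```python
import math


def sliding_level_db(samples, sample_rate, window_s):
    """Mean-square level in dB over a sliding rectangular window.

    Returns one value per sample once the window is full.
    No K-weighting is applied, so this is not a compliant loudness measure.
    """
    n = int(round(window_s * sample_rate))
    levels = []
    acc = 0.0
    for i, x in enumerate(samples):
        acc += x * x
        if i >= n:
            acc -= samples[i - n] ** 2  # drop the sample leaving the window
        if i >= n - 1:
            mean_square = max(acc / n, 1e-12)  # floor avoids log10(0)
            levels.append(10.0 * math.log10(mean_square))
    return levels


# Hypothetical signal: 5 s of silence at 1 kHz sample rate with a 0.2 s full-scale burst.
rate = 1000
signal = [0.0] * (5 * rate)
for i in range(3500, 3700):
    signal[i] = 1.0

momentary = sliding_level_db(signal, rate, 0.4)   # 400 ms window
short_term = sliding_level_db(signal, rate, 3.0)  # 3 s window

print(round(max(momentary), 2), round(max(short_term), 2))
```

The short burst peaks at about -3.0 dB on the 400 ms window but only about -11.8 dB on the 3 s window, which is one way to see why the two time scales might call for different evaluation criteria.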
P14-5 Factors Influencing Listener Preference for Dynamic Range Compression—Malachy Ronan, University of Limerick - Limerick, Ireland; Robert Sazdov, University of Limerick - Limerick, Ireland; Nicholas Ward, University of Limerick - Limerick, Ireland
The introduction of loudness normalization has led some commentators to declare that the loudness wars are over. However, factors contributing to a preference for dynamic range compression have not been removed. The research presented here investigates the role of long-term memory in sound quality judgments. Factors influencing preference judgments of dynamic range compression are discussed along with suggestions of further research areas. Research is presented that indicates that an objective measure of dynamic range will facilitate a greater understanding of how dynamic range compression affects individual sound quality attributes.
Convention Paper 9176
P14-6 The Influence of Listeners’ Experience, Age, and Culture on Headphone Sound Quality Preferences—Sean Olive, Harman International - Northridge, CA, USA; Todd Welti, Harman International - Northridge, CA, USA; Elisabeth McMullin, Harman International - Northridge, CA, USA
Double-blind headphone listening tests were conducted in four different countries (Canada, USA, China, and Germany) involving 238 listeners of different ages, genders, and levels of listening experience. Listeners gave comparative preference ratings for three popular headphones and a new reference headphone that were all virtually presented through a common replicator headphone equalized to match their measured frequency responses. In this way, biases related to headphone brand, price, visual appearance, and comfort were removed from listeners’ judgment of sound quality. On average, listeners preferred the reference headphone that was based on the in-room frequency response of an accurate loudspeaker in a reference listening room. This was generally true regardless of the listeners’ experience, age, gender, and culture. This new evidence suggests a headphone standard based on this new target response would satisfy the tastes of most listeners.
Convention Paper 9177
P14-7 A Hierarchical Approach to Archiving and Distribution—J. Robert Stuart, Meridian Audio Ltd. - Huntingdon, UK; Peter Craven, Algol Applications Ltd. - London, UK
When recording, the ideal is to capture a performance so that the highest possible sound quality can be recovered from the archive. While an archive has no hard limit on the quantity of data assignable to that information, in distribution the data deliverable depends on application-specific factors such as storage, bandwidth or legacy compatibility. Recent interest in high-resolution digital audio has been accompanied by a trend to higher and higher sampling rates and bit depths, yet the sound quality improvements show diminishing returns and so fail to reconcile human auditory capability with the information capacity of the channel. By bringing together advances in sampling theory with recent findings in human auditory science, our approach aims to deliver extremely high sound quality through a hierarchical distribution chain where sample rate and bit depth can vary at each link but where the overall system is managed from end-to-end, including the converters. Our aim is an improved time/frequency balance in a high-performance chain whose errors, from the perspective of the human listener, are equivalent to no more than those introduced by sound traveling a short distance through air.
Convention Paper 9178