AES San Francisco 2012
Poster Session P13

P13 - Auditory Perception and Evaluation


Saturday, October 27, 4:00 pm — 5:30 pm (Foyer)

P13-1 Real-Time Implementation of Glasberg and Moore's Loudness Model for Time-Varying Sounds
Elvira Burdiel, Queen Mary University of London - London, UK; Lasse Vetter, Queen Mary University of London - London, UK; Andrew J. R. Simpson, Queen Mary University of London - London, UK; Michael J. Terrell, Queen Mary University of London - London, UK; Andrew McPherson, Queen Mary University of London - London, UK; Mark B. Sandler, Queen Mary University of London - London, UK
In this paper, a real-time implementation of the loudness model of Glasberg and Moore [J. Audio Eng. Soc. 50, 331–342 (2002)] for time-varying sounds is presented. This real-time implementation embodies several approximations to the model that are necessary to reduce computational costs, in both the time and frequency domains. A quantitative analysis is given that shows the effect of the parametric time- and frequency-domain approximations by comparison with the loudness predictions of the original model. Using real-world music, both the errors introduced as a function of the optimization parameters and the corresponding reductions in computational cost are quantified. Thus, this work provides an informed, contextual approach to approximating the loudness model for practical use.
Convention Paper 8769
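
The Glasberg and Moore model derives short-term and long-term loudness from instantaneous loudness using attack/release smoothing, and this temporal stage is one place where approximation parameters trade accuracy against cost. Below is a minimal Python sketch of such a one-pole smoother; the coefficient values, frame handling, and function name are illustrative assumptions, not the published parameters or the authors' optimized code.

```python
import numpy as np

def smooth_short_term(instantaneous, alpha_attack=0.045, alpha_release=0.02):
    """One-pole attack/release smoothing of a frame-by-frame instantaneous
    loudness series, in the spirit of the short-term stage of a time-varying
    loudness model. Coefficients are illustrative placeholders."""
    smoothed = np.empty(len(instantaneous))
    previous = 0.0
    for i, value in enumerate(instantaneous):
        # track attacks quickly, decay slowly on releases
        alpha = alpha_attack if value > previous else alpha_release
        previous = alpha * value + (1.0 - alpha) * previous
        smoothed[i] = previous
    return smoothed
```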

P13-2 Subjective Selection of Head-Related Transfer Functions (HRTF) Based on Spectral Coloration and Interaural Time Differences (ITD) Cues
Kyla McMullen, Clemson University - Clemson, SC, USA; Agnieszka Roginska, New York University - New York, NY, USA; Gregory H. Wakefield, University of Michigan - Ann Arbor, MI, USA
The present study describes a subjective HRTF individualization procedure in which a listener selects from a database those HRTFs that pass several perceptual criteria. Earlier work has demonstrated that listeners are as likely to select a database HRTF as their own when judging externalization, elevation, and front/back discriminability. The procedure employed in that original study requires individually measured ITDs. The present study modifies the original procedure so that individually measured ITDs are unnecessary. Specifically, a standardized ITD is used, in place of the listener's ITD, to identify those database minimum-phase HRTFs with desirable perceptual properties. The selection procedure is then repeated for one of the preferred minimum-phase HRTFs, searching over a database of ITDs. Consistent with the original study, listeners prefer a small subset of HRTFs; in contrast, while individual listeners show clear preferences for some ITDs over others, no small subset of ITDs appears to satisfy all listeners.
Convention Paper 8770
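
The idea of pairing a database minimum-phase HRTF with a separately chosen (standardized or database) ITD can be illustrated with a short Python sketch: convolve with the minimum-phase HRIRs, then re-insert the ITD as a broadband delay. The sign convention, whole-sample delay, and function name are assumptions for illustration only; the study's actual rendering chain is not reproduced here.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right, itd_seconds, fs=44100):
    """Convolve a mono signal with minimum-phase HRIRs, then re-insert the
    interaural time difference as a whole-sample broadband delay on the
    lagging ear (positive ITD here means the left ear leads)."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    lag = int(round(abs(itd_seconds) * fs))
    pad = np.zeros(lag)
    if itd_seconds >= 0:      # left leads: delay the right ear
        right = np.concatenate([pad, right])
        left = np.concatenate([left, pad])
    else:                     # right leads: delay the left ear
        left = np.concatenate([pad, left])
        right = np.concatenate([right, pad])
    return left, right
```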

P13-3 Does Understanding of Test Items Help or Hinder Subjective Assessment of Basic Audio Quality?
Nadja Schinkel-Bielefeld, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; International Audio Laboratories - Erlangen, Germany; Netaya Lotze, Leibniz Universität Hannover - Hannover, Germany; Frederik Nagel, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; International Audio Laboratories - Erlangen, Germany
In listening tests for the subjective evaluation of intermediate audio quality, test items in various foreign languages are sometimes presented. The perception of basic audio quality may therefore vary with the listeners' native language. This study investigated the role of understanding in quality assessment, employing regular German sentences and sentences consisting of semi-automatically generated German-sounding pseudo words. Less experienced listeners in particular rated pseudo words slightly higher than German sentences of matching prosody. While the reference was listened to for longer in the pseudo-word items, for the other conditions listeners tended to listen to the German items for longer. Though the effects of understanding in our study were small, they may play a role for foreign languages that are less intelligible than our pseudo sentences and differ in phoneme inventory.
Convention Paper 8771

P13-4 Subjective Assessments of Higher Order Ambisonic Sound Systems in Varying Acoustical Conditions
Andrew J. Horsburgh, University of the West of Scotland - Paisley, Scotland, UK; Robert E. Davis, University of the West of Scotland - Paisley, Scotland, UK; Martyn Moffat, University of the West of Scotland - Paisley, Scotland, UK; D. Fraser Clark, University of the West of Scotland - Paisley, Scotland, UK
Results of subjective assessments of source perception using higher order Ambisonics are presented in this paper. Test stimuli include multiple synthetic and naturally recorded sources presented in various horizontal and mixed-order Ambisonic listening tests. Using a small group of trained and untrained listening participants, the materials were evaluated over various Ambisonic orders, each scrutinized for localization accuracy, apparent source width, and realistic impression. The results show a general preference for 3rd-order systems in each of the three test categories: speech, pure tone, and music. Localization results for 7th order show a trend toward stable imagery with complex stimulus sources, with pure tones in the anechoic environment providing the highest accuracy.
Convention Paper 8772
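
For reference, horizontal-only (circular harmonic) Ambisonic encoding of a mono source up to a given order can be sketched in a few lines of Python. The channel ordering and normalization below are assumptions; the decoders, mixed-order configurations, and playback rooms used in the tests are not modeled.

```python
import numpy as np

def encode_horizontal(signal, azimuth_rad, order=3):
    """Encode a mono signal into horizontal-only Ambisonic channels up to
    `order`: the omnidirectional component followed by cos/sin circular
    harmonics of the source azimuth (2*order + 1 channels in total)."""
    sig = np.asarray(signal, dtype=float)
    channels = [sig]                                   # 0th order (W)
    for m in range(1, order + 1):
        channels.append(sig * np.cos(m * azimuth_rad))
        channels.append(sig * np.sin(m * azimuth_rad))
    return np.stack(channels)                          # shape: (2*order+1, N)
```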

P13-5 A Viewer-Centered Revision of Audiovisual Content Classifiers
Katrien De Moor, Ghent University - Ghent, Belgium; Ulrich Reiter, Norwegian University of Science and Technology - Trondheim, Norway
There is a growing interest in the potential value of content-driven requirements for increasing the perceived quality of audiovisual material and optimizing the underlying performance processes. However, the categorization of content and the identification of content-driven requirements are still largely based on technical characteristics, and there is a gap in the literature when it comes to including viewer- and content-related aspects. In this paper we go beyond purely technical features as content classifiers and contribute to a deeper understanding of viewer preferences and requirements. We present results from a qualitative study using semi-structured interviews, aimed at exploring content-driven associations from a bottom-up perspective. The results show that users’ associations, requirements, and expectations differ across content types, and that these differences should be taken into account when selecting stimulus material for subjective quality assessments. We also relate these results to previous research on content classification.
Convention Paper 8773

P13-6 Perception of Time-Varying Signals: Timbre and Phonetic JND of Diphthong
Arthi Subramaniam, Indian Institute of Science - Bangalore, India; Thippur V. Sreenivas, Indian Institute of Science - Bangalore, India
In this paper we propose a linear time-varying model for diphthong synthesis based on linear interpolation of formant frequencies. We then determine the timbre just-noticeable difference (JND) for the diphthong /aI/ (as in ‘buy’) with constant-pitch excitation through a perception experiment involving four listeners, and we explore the phonetic JND of the diphthong. JND responses are determined using a 1-up-3-down procedure. Using the experimental data, we map the timbre JND and phonetic JND onto a 2-D region of percentage change of the formant glides. The timbre and phonetic JND contours for constant pitch show that the phonetic JND region encloses the timbre JND region and also varies across listeners. The JND is observed to be more sensitive to the ending vowel /I/ than to the starting vowel /a/ in some listeners, and to depend on the direction of perturbation of the starting and ending vowels.
Convention Paper 8774
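
The proposed synthesis model interpolates formant frequencies linearly from the starting vowel to the ending vowel. A minimal Python sketch of such a formant glide is given below; the vowel formant targets are generic textbook values used purely for illustration, not the paper's synthesis parameters, and the formant synthesizer and the 1-up-3-down adaptive procedure themselves are not reproduced.

```python
import numpy as np

def formant_glide(f_start, f_end, duration_s, fs=16000):
    """Linearly interpolate a set of formant frequencies from the starting
    vowel to the ending vowel over the diphthong's duration.
    Returns an (n_samples, n_formants) trajectory in Hz."""
    n = int(duration_s * fs)
    t = np.linspace(0.0, 1.0, n)
    return np.outer(1.0 - t, f_start) + np.outer(t, f_end)

# e.g., three formants gliding from an /a/-like to an /I/-like configuration
glides = formant_glide(np.array([730.0, 1090.0, 2440.0]),
                       np.array([390.0, 1990.0, 2550.0]),
                       duration_s=0.3)
```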

P13-7 Employing Supercomputing Cluster to Acoustic Noise Map Creation
Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland; Jozef Kotus, Gdansk University of Technology - Gdansk, Poland; Maciej Szczodrak, Gdansk University of Technology - Gdansk, Poland; Bozena Kostek, Gdansk University of Technology - Gdansk, Poland
A system is presented for determining the acoustic noise distribution inside large urban areas and assessing its adverse effects over short time periods, made possible by the employment of a supercomputing cluster. A unique feature of the system is the psychoacoustic noise dosimetry implemented to inform interested citizens about the predicted auditory fatigue that may be caused by exposure to excessive noise. The noise level computation is based on an engineered Noise Prediction Model (NPM) derived from the Harmonoise model. The sound level distribution in the urban area can be viewed by users through the accompanying web service. An example map is presented for consecutive time periods to show the capability of the supercomputing cluster to update noise level maps frequently.
Convention Paper 8775
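
Each map cell can be computed independently from the set of sources, which is why the grid parallelizes naturally across processes or cluster nodes and the map can be refreshed frequently. The Python sketch below uses a crude free-field point-source propagation term as a stand-in for the Harmonoise-based Noise Prediction Model actually used by the system; the data layout, constants, and function names are assumptions.

```python
import numpy as np
from multiprocessing import Pool

def cell_level(args):
    """Energetically sum source contributions at one map cell, using a
    simple free-field term L_w - 20*log10(r) - 11 dB as a stand-in for a
    full propagation model."""
    (cx, cy), sources = args
    energy = 0.0
    for sx, sy, lw in sources:                  # source position and power level
        r = max(np.hypot(cx - sx, cy - sy), 1.0)
        energy += 10.0 ** ((lw - 20.0 * np.log10(r) - 11.0) / 10.0)
    return 10.0 * np.log10(energy) if energy > 0.0 else -np.inf

def noise_map(cells, sources, workers=8):
    """Split the independent cells across worker processes, mirroring how a
    cluster can split the map to refresh it in short time periods."""
    with Pool(workers) as pool:
        return pool.map(cell_level, [(cell, sources) for cell in cells])
```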

P13-8 Objective and Subjective Evaluations of Digital Audio Workstation Summing
Brett Leonard, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Scott Levine, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Padraig Buttner-Schnirer, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada
Many recording professionals attest to a perceivable difference in sound quality between different digital audio workstations (DAWs), yet there is little quantifiable evidence to support these claims. To examine these assertions, the internal summing of five different DAWs is tested. Multitrack stems are recorded into each DAW and summed to a single stereo mix. This mix is evaluated objectively against a purely mathematical sum generated in Matlab, avoiding any system-specific limitations in the summing process. The stereo sums are also evaluated by highly trained listeners through a three-alternative forced-choice test focusing on three of the DAWs. Results indicate that when panning is excluded from the mixing process, minimal objective and subjective differences exist between workstations.
Convention Paper 8776
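
The objective comparison described above amounts to a null test against a sample-aligned reference sum. A minimal Python sketch is shown below; the reference is assumed to be a plain float64 addition of the stems, and the alignment, trimming, and gain-matching details are assumptions rather than the authors' exact procedure.

```python
import numpy as np

def null_test(daw_mix, reference_mix):
    """Subtract a reference sum from a DAW's stereo bounce and report the
    residual's peak and RMS levels in dB relative to full scale."""
    n = min(len(daw_mix), len(reference_mix))
    diff = (np.asarray(daw_mix[:n], dtype=np.float64)
            - np.asarray(reference_mix[:n], dtype=np.float64))
    eps = np.finfo(np.float64).tiny          # avoid log10(0) for a perfect null
    peak_db = 20.0 * np.log10(np.max(np.abs(diff)) + eps)
    rms_db = 20.0 * np.log10(np.sqrt(np.mean(diff ** 2)) + eps)
    return peak_db, rms_db

# the reference itself is just the unweighted sum of the stems:
# reference_mix = np.sum(np.stack(stems), axis=0)
```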

P13-9 Hong Kong Film Score Production: A Hollywood Informed Approach
Robert Jay Ellis-Geiger, City University of Hong Kong - Hong Kong, SAR China
This paper presents a Hollywood-informed approach to film score production, with special attention given to the impact of dialogue on the recording and mixing of the music score. The author reveals his process for creating a hybrid (real and MIDI) orchestral film score that was recorded, mixed, and produced in Hong Kong for the English-language feature film New York November (2011). The film was shot in New York and directed by Austrian filmmakers Gerhard Fillei and Joachim Krenn in collaboration with film composer and fellow Austrian Sascha Selke. Additional instruments were remotely recorded in Singapore, and the final soundtrack was mixed at a dubbing theater in Berlin. The author acted as score producer, conductor, co-orchestrator, MIDI arranger, musician, and composer of additional music.
Convention Paper 8777

P13-10 Investigation into Electric Vehicles Exterior Noise Generation
Stefania Cecchi, Università Politecnica delle Marche - Ancona, Italy; Andrea Primavera, Università Politecnica delle Marche - Ancona, Italy; Laura Romoli, Università Politecnica delle Marche - Ancona, Italy; Francesco Piazza, Università Politecnica delle Marche - Ancona, Italy; Ferruccio Bettarelli, Leaff Engineering - Ancona, Italy; Ariano Lattanzi, Leaff Engineering - Ancona, Italy
Electric vehicles have received increasing interest in recent years for their well-known benefits. However, electric cars do not produce noise the way internal combustion engine vehicles do, which leads to safety issues for pedestrians and cyclists. It is therefore necessary to create an external warning sound for electric cars while maintaining users’ sound quality expectations. In this context, several sounds generated with different techniques are proposed here, taking into consideration aspects of real engine characteristics. Furthermore, a subjective investigation is performed in order to determine users’ preferences within the wide range of possible synthetic sounds.
Convention Paper 8778
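
As a flavor of the kind of synthetic warning sounds being compared, the Python sketch below builds an engine-like tone by additively summing harmonics of the firing frequency of a virtual four-cylinder, four-stroke engine. The harmonic count, roll-off, and engine parameters are illustrative assumptions; the paper's actual generation techniques and sound set are not reproduced here.

```python
import numpy as np

def engine_tone(rpm, duration_s=2.0, fs=44100, n_harmonics=8):
    """Additive sketch of an engine-like exterior warning sound: harmonics
    of the firing frequency, with amplitude rolling off as 1/k."""
    t = np.arange(int(duration_s * fs)) / fs
    f0 = rpm / 60.0 * 2.0    # firing frequency of a 4-cylinder, 4-stroke engine
    tone = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        tone += (1.0 / k) * np.sin(2.0 * np.pi * k * f0 * t)
    return tone / np.max(np.abs(tone))
```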


