AES Los Angeles 2014
Paper Session P15
P15 - Signal Processing: Part 1
Saturday, October 11, 2:00 pm — 5:00 pm (Room 309)
Chair:
Jayant Datta, THX - San Francisco, CA, USA; Syracuse University - Syracuse, NY, USA
P15-1 MATLAB Program for Calculating the Parameters of Autocorrelation and Interaural Cross-Correlation Functions Based on a Model of the Signal Processing Performed in the Auditory Pathways—Shin-ichi Sato, Universidad Nacional de Tres de Febrero - Caseros, Buenos Aires, Argentina; Alejandro Bidondo, Universidad Nacional de Tres de Febrero - UNTREF - Caseros, Buenos Aires, Argentina; Yoshiharu Soeta, National Institute of Advanced Industrial Science and Technology (AIST) - Ikeda, Japan
This paper describes a MATLAB program with a graphical user interface (GUI) for signal processing based on the Auditory Image Model [S. Bleeck et al., Acta Acustica united with Acustica, 90 (2004) 781-787], followed by analyses of the summary autocorrelation function (SACF) and the summary interaural cross-correlation function (SIACF), and the calculation of the SACF and SIACF parameters. The effects of the number of channels and the frequency range of the filterbanks on the SACF parameters are investigated.
Convention Paper 9179
P15-2 An Investigation of Temporal Feature Integration for a Low-Latency Classification with Application to Speech/Music/Mix Classification—Joachim Flocon-Cholet, Orange Labs - Lannion, France; Julien Faure, Orange Labs - Lannion, France; Alexandre Guérin, Orange Labs - Lannion, France; Pascal Scalart, INRIA/IRISA, Université de Rennes - Rennes, France
In this paper we propose several methodologies for the use of feature integration and evaluate them in a low-latency classification framework. These general methodologies are based on three key aspects that are assessed in this study: the selection of the features to be temporally integrated; the choice of the integration technique, i.e., how the temporal information is extracted; and the size of the integration window. The experiments carried out for the speech/music/mix classification task show that the different methodologies have a significant impact on the global performance. Compared to state-of-the-art procedures, the proposed methodologies achieved the best performance, even under the low-latency constraints.
Convention Paper 9180
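The temporal integration step discussed in P15-2 can be illustrated with a minimal Python sketch. It assumes one of the simplest integration techniques, the mean and standard deviation of each frame-level feature over a sliding window of past frames. The function name `integrate_features` and the choice of statistics are illustrative assumptions, not the authors' method.

```python
import numpy as np

def integrate_features(frames, window):
    """Integrate frame-level features over a sliding window (hypothetical helper).

    frames: array of shape (n_frames, n_features)
    window: number of consecutive frames per integration window
    Returns an array of shape (n_frames - window + 1, 2 * n_features)
    holding the per-window mean followed by the per-window standard deviation.
    """
    frames = np.asarray(frames, dtype=float)
    # Resulting shape: (n_frames - window + 1, n_features, window)
    windows = np.lib.stride_tricks.sliding_window_view(frames, window, axis=0)
    return np.concatenate([windows.mean(axis=-1), windows.std(axis=-1)], axis=1)
```

In a sketch like this, the window size sets the trade-off the abstract refers to: a shorter window reduces latency but yields noisier statistics.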
P15-3 MATLAB Program for Calculating the Parameters of the Autocorrelation and Interaural Cross-Correlation Functions Based on Ando's Auditory-Brain Model—Shin-ichi Sato, Universidad Nacional de Tres de Febrero - Caseros, Buenos Aires, Argentina
This paper describes a MATLAB program with a graphical user interface (GUI) that calculates the parameters of the autocorrelation and interaural cross-correlation functions of a binaural signal, based on the auditory-brain model proposed by Ando [Y. Ando (1998) Architectural Acoustics: Sound Source, Sound Fields, and Listeners, Springer-Verlag, New York, Chap. 5]. The model can describe various subjective attributes such as pitch, timbre, and spatial impression.
Convention Paper 9181
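As a rough illustration of the quantities named in P15-1 and P15-3, the following Python sketch computes a normalized interaural cross-correlation function for a binaural signal and reads off two commonly used parameters: its peak magnitude within ±1 ms (IACC) and the lag of that peak. It is not the MATLAB program described in the papers. It assumes a single whole-signal estimate with no auditory filterbank, weighting, or running window, and the function names are hypothetical.

```python
import numpy as np

def iacf(left, right, fs, max_lag_ms=1.0):
    """Normalized interaural cross-correlation for lags within +/- max_lag_ms.

    Positive lags mean the right-ear signal is delayed relative to the left.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    # Zero lag sits at index len(left) - 1 of the full cross-correlation.
    full = np.correlate(right, left, mode="full")
    zero = len(left) - 1
    segment = full[zero - max_lag: zero + max_lag + 1]
    # Normalize by the geometric mean of the two signal energies.
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    lags_ms = np.arange(-max_lag, max_lag + 1) * 1000.0 / fs
    return lags_ms, segment / norm

def iacc_and_tau(left, right, fs):
    """Return the peak magnitude of the IACF and the lag (ms) at which it occurs."""
    lags_ms, phi = iacf(left, right, fs)
    peak = np.argmax(np.abs(phi))
    return np.abs(phi[peak]), lags_ms[peak]
```

For identical ear signals offset by a pure delay shorter than 1 ms, this sketch returns a peak value of 1 at a lag equal to that delay.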
P15-4 Perceptual Quality of Audio Separated Using Sigmoidal Masks—Toby Stokes, University of Surrey - Guildford, Surrey, UK; Christopher Hummersone, University of Surrey - Guildford, Surrey, UK; Tim Brookes, University of Surrey - Guildford, Surrey, UK; Andrew Mason, BBC Research and Development - London, UK
Separation of underdetermined audio mixtures is often performed in the Time-Frequency (TF) domain by masking each TF element according to its target-to-mixture ratio. This work uses sigmoidal functions to map the target-to-mixture ratio to mask values. The series of functions used encompasses the ratio mask and an approximation of the binary mask. Mixtures are chosen to represent a range of different amounts of TF overlap, then separated and evaluated using objective measures. PEASS results show improved interferer suppression and artifact scores can be achieved using softer masking than that applied by binary or ratio masks. The improvement in these scores gives an improved overall perceptual score; this observation is repeated at multiple TF resolutions.
Convention Paper 9182
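The idea in P15-4 of a family of sigmoidal masks that spans the ratio mask and an approximation of the binary mask can be sketched in Python as follows. The parametrization is an assumption made for illustration, not necessarily the one used in the paper: a logistic function of the log magnitude ratio between target and interferer, with a hypothetical `slope` parameter. Under this assumption, `slope=1` gives the magnitude ratio mask, large slopes approach a binary mask, and slopes below 1 give softer masks.

```python
import numpy as np

def sigmoidal_mask(target_mag, interferer_mag, slope=1.0, eps=1e-12):
    """Map per-element TF magnitudes to mask values in [0, 1] (illustrative only).

    target_mag, interferer_mag: non-negative arrays of TF magnitudes
    slope: steepness of the logistic function (assumed parameter)
    eps: small constant that keeps the logarithms finite
    """
    target_mag = np.asarray(target_mag, dtype=float)
    interferer_mag = np.asarray(interferer_mag, dtype=float)
    log_ratio = np.log(target_mag + eps) - np.log(interferer_mag + eps)
    # Logistic function written with tanh so that large slopes do not overflow.
    return 0.5 * (1.0 + np.tanh(0.5 * slope * log_ratio))
```

The separated target estimate would then be obtained by multiplying the mixture's TF representation element-wise by the mask before resynthesis.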
P15-5 A New Approach to Impulse Response Measurements at High Sampling Rates—Joseph G. Tylka, 3D3A Lab, Princeton University - Princeton, NJ, USA; Rahulram Sridhar, 3D3A Lab, Princeton University - Princeton, NJ, USA; Braxton B. Boren, Princeton University - Princeton, NJ, USA; Edgar Choueiri, Princeton University - Princeton, NJ, USA
High sampling rates are required to fully characterize some acoustical systems, but capturing the system's high-frequency roll-off decreases the signal-to-noise ratio (SNR). Band-pass filtering can improve the SNR but may create an undesirable pre-response. An iterative procedure is developed to measure impulse responses (IRs) with an improved SNR and a constrained pre-response. First, a quick measurement provides information about the system and ambient noise. A second, longer measurement is then performed, and a suitable band-pass filter is applied to the recorded signal. Experimental results show that the proposed procedure achieves an SNR of 37 dB with a peak pre-response amplitude of <0.2% of the IR peak, whereas a conventional technique achieves an SNR of 32 dB with a peak pre-response amplitude of 16%.
Convention Paper 9183
P15-6 IIR Filters for Audio Test and Measurement: Design, Implementation, and Optimization—Thomas Kite, Audio Precision, Inc. - Beaverton, OR, USA
Audio analyzers use filters for many reasons: to define the measurement bandwidth, to isolate tones for measurement, to remove fundamental signals, and so on. In modern instruments, the majority of this filtering is done digitally, following analog-to-digital conversion if the signal is not already digital. Digital filter design is a mature field that encompasses a broad range of techniques, from classical analog filter design to advanced iterative design methods. However, the filter design considerations and techniques unique to audio analyzers do not seem to occupy much space in the published literature. This paper aims to correct this with a discussion of filter design, implementation, and optimization for modern Intel x86 architectures.
Convention Paper 9184