AES Berlin 2014
Paper Session P4
P4 - Signal Processing—Part 2
Sunday, April 27, 13:30 — 16:30 (Room Paris)
Chair:
Grzegorz Sikora, Bang & Olufsen Deutschland GmbH - Munich, Germany; McGill University - Montreal, Canada
P4-1 Creation of New Virtual Patterns for Emotion Recognition through PSOLA—Inma Mohino-Herranz, Universidad de Alcalá - Alcalá de Henares, Madrid, Spain; Héctor A. Sánchez-Hevia, University of Alcalá - Alcalá de Henares, Madrid, Spain; Roberto Gil-Pita, University of Alcalá - Alcalá de Henares, Madrid, Spain; Manuel Rosa-Zurera, University of Alcalá - Alcalá de Henares, Madrid, Spain
Human emotions can be recognized through speech analysis. One main problem in this discipline is the lack of databases with enough patterns for effective learning, which makes generalization in the learning process more difficult. One possible solution is to create new virtual patterns that enlarge the training set. To carry out this enlargement, we modify the average pitch using the technique known as Pitch Synchronous Overlap and Add (PSOLA) combined with resampling, which allows the average pitch to be changed without altering either the pitch variations or the speech rate. Therefore, the emotion in the utterance is unaltered. Results over the original test set show that it is possible to achieve a significant reduction in the generalization effects with the proposed creation of new virtual training patterns.
Convention Paper 9037
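The resampling half of the pitch modification described in the P4-1 abstract can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract, that resampling by a ratio r scales the pitch by r and the duration by 1/r, and that a PSOLA time-scaling stage (not implemented here) then stretches the result by r to restore the original speech rate. The function names, the linear interpolator, and the dictionary keys are hypothetical and are not taken from the paper.

```python
import math


def plan_pitch_shift(semitones):
    """Return the factors for a pitch shift of `semitones` (assumed scheme)."""
    ratio = 2.0 ** (semitones / 12.0)
    return {
        "resample_ratio": ratio,        # pitch is multiplied by this factor
        "duration_scale": 1.0 / ratio,  # duration after resampling alone
        "psola_stretch": ratio,         # time stretch needed to restore duration
    }


def resample_linear(x, ratio):
    """Read x at `ratio` times the original speed with linear interpolation."""
    n_out = int((len(x) - 1) / ratio) + 1
    out = []
    for n in range(n_out):
        pos = n * ratio
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out


def count_rising_zero_crossings(x):
    """Count sign changes from negative to non-negative (crude pitch cue)."""
    return sum(1 for a, b in zip(x, x[1:]) if a < 0.0 <= b)
```

Resampling a 100 Hz test tone with the ratio for +2 semitones yields a shorter signal whose zero-crossing rate is about 12% higher. A real system would use a proper anti-aliasing resampler rather than linear interpolation.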
P4-2 Extended Subtractive Synthesis of Harmonic Musical Tones—Rémi Mignot, Aalto University - Espoo, Finland; Vesa Välimäki, Aalto University - Espoo, Finland
A new approach is presented for the digital sound analysis-synthesis of musical tones. Based on the Source-Filter principle, the Extended Subtractive Synthesis roughly consists of the real-time filtering of a source signal by a digital filter. Starting from the recorded notes of a given instrument, the time-varying fundamental frequency and the digital filters are jointly analyzed. First, a key point of this work is the use of new advanced tools for filter identification, which allow a relatively low-order approximation of the spectral envelopes with a perceptually based criterion. Second, we propose a particular filter chain for the separate sine and noise parts, which significantly reduces the simulation cost in the case of polyphonic synthesis and facilitates the time-varying filtering.
Convention Paper 9038
P4-3 A Virtual Acoustic Environment for Automated Parameter Optimization of a Multichannel Downmix Algorithm—Fabian Knappe, Hamburg University of Applied Sciences - Hamburg, Germany; Robert Mores, Hamburg University of Applied Sciences - Hamburg, Germany; Christian Hartmann, Institut für Rundfunktechnik - Munich, Germany
This paper presents an environment for automated parameter optimization of a multichannel downmix algorithm. Manual optimization of multiple parameters in audio signal algorithms is likely to deliver poor results, especially if many parameters mutually interfere with each other. Even professionals fail to adjust all the parameters correctly. At the same time, broadcast environments call for automated and efficient handling. This paper approaches automated optimization of a 5.0 to 2.0 channel downmix algorithm by defining a virtual acoustic environment and using an optimization process based on the Levenberg-Marquardt algorithm. The aim of the study is to determine recommendations for the parameterization of the downmix algorithm that enable mixing engineers to employ the algorithm’s potential without knowledge of all the parameters’ dependencies. A listening test validates the results across various genres.
Convention Paper 9039
P4-4 An Approach to the Generation of Subharmonic Frequencies in Audio Applications—Dieter Leckschat, University of Applied Science Düsseldorf - Düsseldorf, Germany; Christian Epe, University of Applied Sciences Düsseldorf - Duesseldorf, Germany
In recording studios it is common to use equipment and algorithms to enhance audio productions in the low frequency range. Today’s methods either use frequency-selective dynamic compression or focus on psychoacoustics to take advantage of the residuum effect. The basic subject of this paper is a method to generate sub-harmonics of an audio signal. The most interesting sub-harmonic is one octave below a signal's fundamental frequency. By implementing a mathematical formula it is possible to produce an oscillation at half the frequency of a given harmonic oscillation. The method works in either the analog or the digital domain and is instantaneous, which makes it suited to real-time use by musicians. Depending on the design, the process can be optimized for stationary signals or for signals with transient components.
Convention Paper 9040
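The P4-4 abstract does not state the formula it uses. As a minimal sketch of one way to obtain an oscillation at half the frequency of a harmonic oscillation, the example below halves the unwrapped phase sample by sample, which corresponds to the half-angle identity cos^2(phi/2) = (1 + cos(phi))/2 with the sign resolved by phase tracking. It assumes the input is available as a quadrature pair (cosine and sine parts). The function name `octave_down` and the quadrature assumption are illustrative only and should not be read as the authors' method.

```python
import math


def octave_down(cos_part, sin_part):
    """Return cos(phi/2) for an oscillation given as (cos(phi), sin(phi))."""
    two_pi = 2.0 * math.pi
    prev = math.atan2(sin_part[0], cos_part[0])
    total = prev  # unwrapped phase accumulated so far
    out = []
    for c, s in zip(cos_part, sin_part):
        phase = math.atan2(s, c)
        step = phase - prev
        # Wrap the phase increment into roughly (-pi, pi] before accumulating.
        step -= two_pi * round(step / two_pi)
        total += step
        prev = phase
        # Halving the unwrapped phase gives the sub-harmonic one octave below.
        out.append(math.cos(total / 2.0))
    return out
```

Each output sample depends only on the current and previous input samples, so the sketch adds no block latency. Real audio signals would first need a quadrature component, for example from a Hilbert transformer, which is outside the scope of this example.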
P4-5 True Peak Metering—A Tutorial Review—Ian Dash, Consultant - Marrickville, NSW, Australia
Along with the loudness algorithm, ITU-R Recommendation BS.1770 specifies a true-peak metering method using oversampling and interpolation. The need for such metering is discussed, along with considerations on its implementation and usage. Implementation issues include the oversampling factor, the tradeoff between accuracy and processor load, and the proportion of total processor load when combined with loudness measurement. Four sources of error are examined: timing of interpolated samples, ripple in the passband response of the interpolation filter, incomplete alias suppression in the stopband response of the filter, and departure from linear phase response. Implications of filter topology and filter order are discussed. An example implementation is given along with performance parameters.
Convention Paper 9041
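The need for true-peak metering mentioned in the P4-5 abstract can be shown with a short sketch: a sine at one quarter of the sampling rate, sampled 45 degrees off its crests, has a sample peak of about 0.707 although the underlying waveform reaches 1.0. The example estimates the inter-sample peak by 4x oversampling with a Hann-windowed sinc interpolator. That interpolator, the `half_width` parameter, and the function names are assumptions chosen for brevity; they are not the filter given in BS.1770 nor the implementation from the paper.

```python
import math


def sample_peak(x):
    """Largest absolute sample value."""
    return max(abs(v) for v in x)


def true_peak(x, oversample=4, half_width=16):
    """Estimate the inter-sample peak by windowed-sinc interpolation."""
    peak = 0.0
    for n in range(len(x)):
        lo = max(0, n - half_width + 1)
        hi = min(len(x), n + half_width + 1)
        for k in range(oversample):
            t = n + k / oversample  # fractional sample position
            acc = 0.0
            for m in range(lo, hi):
                d = t - m
                if d == 0.0:
                    w = 1.0
                else:
                    sinc = math.sin(math.pi * d) / (math.pi * d)
                    hann = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                    w = sinc * hann
                acc += x[m] * w
            peak = max(peak, abs(acc))
    return peak


def to_db(peak):
    """Convert a linear peak value to decibels relative to full scale."""
    return 20.0 * math.log10(peak)
```

For the quarter-rate test tone the two readings differ by roughly 3 dB. A shorter interpolation filter lowers the processor load but increases passband ripple and aliasing error, which is the tradeoff the abstract refers to.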
P4-6 Drift and Wow Correction of Analogue Magnetic Tape Recordings in the Analogue Domain Using HF-Bias Signals—Nadja Wallaszkovits, Phonogrammarchiv, Austrian Academy of Science - Vienna, Austria; Tobias Hetzer, University of Applied Sciences FH Technikum Wien - Vienna, Austria; Heinrich Pichler, Audio Consultant - Vienna, Austria
Based on various existing ideas of using the high frequency (HF) bias signal of analogue magnetic tape recordings as a reference signal for irregular speed deviations, this paper discusses an approach that preprocesses the bias signal in the analogue domain so that it can control the playback speed of the replay machine in the form of a servo loop. To this end, the bias signal is captured via a sensor head and specifically preprocessed to match the reference frequency of the capstan motor of the tape machine. The machine is set to external vari-speed control mode, so that deviations of the bias signal act as an external vari-speed reference, allowing automatic speed correction in real time. The paper discusses the possibilities, problems, and limits of the technical implementation of such a prototype.
Convention Paper 9042