AES London 2010
Paper Session P8
P8 - Music Analysis and Processing
Sunday, May 23, 09:00 — 13:00 (Room C3)
Chair: David Malham, University of York - York, UK
P8-1 Automatic Detection of Audio Effects in Guitar and Bass Recordings—Michael Stein, Jakob Abeßer, Christian Dittmar, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany; Gerald Schuller, Ilmenau University of Technology - Ilmenau, Germany
This paper presents a novel method to detect and distinguish 10 frequently used audio effects in recordings of electric guitar and bass. It is based on spectral analysis of audio segments located in the sustain part of previously detected guitar tones. Overall, 541 spectral, cepstral and harmonic features are extracted from short time spectra of the audio segments. Support Vector Machines are used in combination with feature selection and transform techniques for automatic classification based on the extracted feature vectors. With correct classification rates up to 100% for the detection of single effects and 98% for the simultaneous distinction of 10 different effects, the method has successfully proven its capability—performing on isolated sounds as well as on multitimbral, stereophonic musical recordings.
Convention Paper 8013
P8-2 Time Domain Emulation of the Clavinet—Stefan Bilbao, University of Edinburgh - Edinburgh, UK; Matthias Rath, Technische Universität Berlin - Berlin, Germany
The simulation of classic electromechanical musical instruments and audio effects has seen a great deal of activity in recent years, due in part to great recent increases in computing power. It is now possible to perform full emulations of relatively complex musical instruments in real time, or near real time. In this paper time domain finite difference schemes are applied to the emulation of the Hohner Clavinet, an electromechanical stringed instrument exhibiting special features such as sustained hammer/string contact, pinning of the string to a metal stop, and a distributed damping mechanism. Various issues, including numerical stability, implementation details, and computational cost will be discussed. Simulation results and sound examples will be presented.
Convention Paper 8014
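As a rough illustration of the time domain finite difference approach named in the P8-2 abstract, the sketch below steps the ideal 1-D wave equation with fixed ends. It is not the authors' Clavinet model: it omits hammer/string contact, the metal stop, and damping. The function name, the sine-shaped initial displacement, and all parameter values are assumptions made for this example. The check on the Courant number reflects the standard stability condition for this explicit scheme.

```python
import math


def simulate_ideal_string(num_steps=200, num_segments=50, courant=1.0):
    """Explicit finite difference scheme for the ideal wave equation.

    courant is c*k/h (wave speed * time step / grid spacing); the scheme
    is stable only when courant <= 1. Returns the displacement at the
    string midpoint for each time step. All names are hypothetical.
    """
    if courant > 1.0:
        raise ValueError("unstable: Courant number must not exceed 1")

    n = num_segments
    lam2 = courant ** 2

    # Initial displacement: first string mode, zero at both fixed ends.
    u_prev = [math.sin(math.pi * i / n) for i in range(n + 1)]
    u_prev[0] = u_prev[n] = 0.0
    # Approximately zero initial velocity: the first two time levels match.
    u_curr = list(u_prev)

    midpoint = []
    for _ in range(num_steps):
        u_next = [0.0] * (n + 1)  # end points stay at zero (fixed ends)
        for i in range(1, n):
            u_next[i] = (2.0 * u_curr[i] - u_prev[i]
                         + lam2 * (u_curr[i + 1] - 2.0 * u_curr[i] + u_curr[i - 1]))
        u_prev, u_curr = u_curr, u_next
        midpoint.append(u_curr[n // 2])
    return midpoint


if __name__ == "__main__":
    out = simulate_ideal_string()
    print(f"peak midpoint displacement: {max(abs(x) for x in out):.4f}")
```

The cost per time step grows linearly with the number of grid points, which gives a feel for why computational cost is one of the issues the abstract lists.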
P8-3 Polyphony Number Estimator for Piano Recordings Using Different Spectral Patterns—Ana M. Barbancho, Isabel Barbancho, Javier Fernandez, Lorenzo J. Tardón, Universidad de Málaga - Málaga, Spain
One of the main tasks of a polyphonic transcription system is the estimation of the number of voices, i.e., the polyphony number. Although the correct estimation of this parameter is very important for polyphonic transcription systems, the task has not been discussed in depth for known transcription systems. The aim of this paper is to propose a novel method for estimating the polyphony number of piano recordings. The new method is based on two different types of spectral patterns: single-note patterns and composed-note patterns. The use of composed-note patterns in the estimation of the polyphony number and in the polyphonic detection process has not been previously reported in the literature.
Convention Paper 8015
P8-4 String Ensemble Vibrato: A Spectroscopic Study—Stijn Mattheij, AVANS University - Breda, The Netherlands
A systematic observation of the presence of ensemble vibrato on early twentieth century recordings of orchestral works has been carried out by studying spectral line shapes of individual musical notes. Broadening of line shapes was detected in recordings of Beethoven’s Fifth Symphony and Brahms’s Hungarian Dance no. 5; this effect was attributed to ensemble vibrato. From these observations it may be concluded that string ensemble vibrato was common practice in orchestras from the continent throughout the twentieth century. British orchestras did not use much vibrato before 1940.
Convention Paper 8016
P8-5 Influence of Psychoacoustic Roughness on Musical Intonation Preference—Julián Villegas, Michael Cohen, Ian Wilson, University of Aizu - Aizu, Japan; William Martens, University of Sydney - Sydney, NSW, Australia
An experiment to compare the acceptability of three different music fragments rendered with three different intonations is presented. These preference results were contrasted with those of isolated chords also rendered with the same three intonations. The least rough renditions were found to be those using Twelve-Tone Equal-Temperament (12-tet). Just Intonation (ji) renditions were the roughest. A negative correlation between preference and psychoacoustic roughness was also found.
Convention Paper 8017
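To make the difference between the intonations in the P8-5 abstract concrete, the sketch below compares a few 12-tet intervals with commonly quoted just intonation ratios, expressed in cents. It does not model psychoacoustic roughness and does not reproduce the experiment. The choice of intervals and ratios (5:4, 3:2, 6:5) and all function names are assumptions made for this example; the abstract does not specify the tunings used beyond their names.

```python
import math

# Commonly quoted just intonation ratios (assumed for illustration).
JUST_RATIOS = {
    "minor third": (6, 5),
    "major third": (5, 4),
    "perfect fifth": (3, 2),
}

# Size of the same intervals in 12-tet, in semitones of 100 cents each.
TET_SEMITONES = {
    "minor third": 3,
    "major third": 4,
    "perfect fifth": 7,
}


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def tet_deviation_cents(interval):
    """Return how far the 12-tet interval lies above the just interval."""
    numerator, denominator = JUST_RATIOS[interval]
    return 100.0 * TET_SEMITONES[interval] - cents(numerator / denominator)


if __name__ == "__main__":
    for name in JUST_RATIOS:
        print(f"{name}: 12-tet differs from just by {tet_deviation_cents(name):+.2f} cents")
```

Under these assumed ratios the thirds differ by roughly 14 to 16 cents between the two tunings, while the fifth differs by about 2 cents.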
P8-6 Music Emotion and Genre Recognition Toward New Affective Music Taxonomy—Jonghwa Kim, Lars Larsen, University of Augsburg - Augsburg, Germany
Exponentially increasing electronic music distribution creates a natural pressure for fine-grained musical metadata. On the basis of the fact that a primary motive for listening to music is its emotional effect, diversion, and the memories it awakens, we propose a novel affective music taxonomy that combines the global music genre taxonomy, e.g., classical, jazz, rock/pop, and rap, with emotion categories such as joy, sadness, anger, and pleasure, in a complementary way. In this paper we deal with all essential stages of an automatic genre/emotion recognition system, i.e., from reasonable music data collection up to performance evaluation of various machine learning algorithms. In particular, a novel classification scheme, called the consecutive dichotomous decomposition tree (CDDT), is presented, which is specifically parameterized for multi-class classification problems with an extremely high number of classes, e.g., sixteen music categories in our case. The average recognition accuracy of 75% for the 16 music categories shows that the proposed affective music taxonomy is realistically feasible.
Convention Paper 8018
P8-7 Perceptually-Motivated Audio Morphing: Warmth—Duncan Williams, Tim Brookes, University of Surrey - Guildford, UK
A system for morphing the warmth of a sound independently from its other timbral attributes was coded, building on previous work morphing brightness only, and morphing brightness and softness. The new warmth-softness-brightness morpher was perceptually validated using a series of listening tests. A multidimensional scaling analysis of listener responses to paired-comparisons showed perceptually orthogonal movement in two dimensions within a warmth-morphed and everything-else-morphed stimulus set. A verbal elicitation experiment showed that listeners’ descriptive labeling of these dimensions was as intended. A further “quality control” experiment provided evidence that no “hidden” timbral attributes were altered in parallel with the intended ones. A complete timbre morpher can now be considered for further work and evaluated using the tri-stage procedure documented here.
Convention Paper 8019
P8-8 A Novel Envelope-Based Generic Dynamic Range Compression Model—Adam Weisser, Oticon A/S - Smørum, Denmark
A mathematical model is presented that reproduces typical dynamic range compression when given the nominal input envelope of the signal and the compression constants. The model is derived geometrically in a qualitative approach, and the governing differential equation for an arbitrary input and an arbitrary compressor is found. Step responses compare well to those of the commercial compressors tested. The compression effect on speech using the general equation in its discrete version is also demonstrated. The model's applicability is especially relevant to hearing aids, where the input-output curve and time constants of the nonlinear instrument are frequently consulted and the qualitative theoretical effect of compression may be crucial for speech perception.
Convention Paper 8020
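For readers unfamiliar with the quantities mentioned in the P8-8 abstract (input envelope, input-output curve, time constants), the sketch below applies a generic textbook compressor to an envelope given in dB. It is not the model derived in the paper, whose differential equation is not stated in the abstract. The static curve with a hard knee, the one-pole attack/release smoothing, the function name, and every parameter value are assumptions made for this example.

```python
import math


def compress_envelope(env_db, threshold_db=-20.0, ratio=4.0,
                      attack_ms=5.0, release_ms=50.0, sample_rate=1000.0):
    """Apply a generic compressor to an input envelope given in dB.

    Static curve: above the threshold, output level rises by 1/ratio dB
    per input dB. Dynamics: the gain is smoothed with separate one-pole
    attack and release time constants. Hypothetical illustration only.
    """
    attack_coeff = math.exp(-1.0 / (attack_ms * 1e-3 * sample_rate))
    release_coeff = math.exp(-1.0 / (release_ms * 1e-3 * sample_rate))

    gain_db = 0.0  # smoothed gain, always <= 0 dB
    out_db = []
    for level in env_db:
        overshoot = max(level - threshold_db, 0.0)
        target_gain = -overshoot * (1.0 - 1.0 / ratio)
        # Gain falling means the compressor is attacking; otherwise releasing.
        coeff = attack_coeff if target_gain < gain_db else release_coeff
        gain_db = coeff * gain_db + (1.0 - coeff) * target_gain
        out_db.append(level + gain_db)
    return out_db


if __name__ == "__main__":
    # Step response: envelope jumps from -40 dB to 0 dB.
    step = [-40.0] * 100 + [0.0] * 1000
    response = compress_envelope(step)
    print(f"just after step: {response[100]:.2f} dB, settled: {response[-1]:.2f} dB")
```

With these assumed settings, a step to 0 dB first overshoots and then settles near -15 dB, the value given by the static curve (threshold plus 20 dB of excess divided by the ratio of 4).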