AES Budapest 2012
Paper Session P2
P2 - Emerging and Innovative Audio
Thursday, April 26, 09:30 — 11:00 (Room: Liszt)
Chair: Francis Rumsey
P2-1 Virtual Microphones: Using Ultrasonic Sound to Receive Audio Waves—Tobias Merkel, Beuth Hochschule für Technik - Berlin, Germany; Hans-G. Lühmann, Lütronik Elektroakustik GmbH - Berlin, Germany; Tom Ritter, Beuth Hochschule für Technik - Berlin, Germany
A highly focused ultrasound beam was sent through the room and received again, at a distance of several meters, with an ultrasonic microphone. The wave field of a common audio source was overlaid with the ultrasonic beam. It was found that the phase shift of the received ultrasonic signal carries the audio information of the overlaid field. Since the ultrasonic beam itself acts as the sound receiver, no technical device such as a membrane is needed in the direct vicinity of the point of sound reception. Because this kind of sound receiver is neither visible nor touchable, we call it a “Virtual Microphone.”
Convention Paper 8587
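To illustrate the principle described in the abstract, the following sketch (not from the paper) simulates an ultrasonic carrier whose phase is modulated by an audio signal and recovers the audio by quadrature demodulation and phase extraction. The 40 kHz carrier, 192 kHz sample rate, and moving-average low-pass filter are illustrative assumptions only.

```python
# Illustrative sketch (not the authors' implementation): recover an audio
# signal from the phase modulation of an ultrasonic carrier, assuming the
# received beam behaves like a phase-modulated 40 kHz tone.
import numpy as np

fs = 192_000          # sample rate high enough for the ultrasonic carrier (assumed)
fc = 40_000           # ultrasonic carrier frequency (assumed)
t = np.arange(int(fs * 0.1)) / fs

# Simulated audio field that modulates the phase (path length) of the beam
audio = 0.5 * np.sin(2 * np.pi * 440 * t)

# Received ultrasonic signal: carrier whose phase carries the audio information
received = np.cos(2 * np.pi * fc * t + audio)

# I/Q demodulation: mix down with quadrature carriers, low-pass with a moving average
i = received * np.cos(2 * np.pi * fc * t)
q = received * -np.sin(2 * np.pi * fc * t)
kernel = np.ones(96) / 96                      # crude ~2 kHz low-pass filter
i_lp = np.convolve(i, kernel, mode="same")
q_lp = np.convolve(q, kernel, mode="same")

# The instantaneous phase of the baseband signal is the recovered audio
recovered = np.unwrap(np.arctan2(q_lp, i_lp))
print(np.corrcoef(audio[1000:-1000], recovered[1000:-1000])[0, 1])
```

The final line prints the correlation between the original and recovered audio, which should be close to 1 for this idealized, noise-free simulation.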
P2-2 Implementation and Evaluation of Autonomous Multi-Track Fader Control—Stuart Mansbridge, Saoirse Finn, Joshua D. Reiss, Queen Mary University of London - London, UK
A new approach to the autonomous control of faders for multi-track audio mixing is presented. The algorithm is designed to generate an automatic mix from an arbitrary number of monaural or stereo audio tracks at any sample rate and to be suitable for both live and postproduction use. Mixing levels are determined using the EBU R-128 loudness measure, with a cross-adaptive process that brings each track to a time-varying average. A hysteresis loudness gate and selective smoothing prevent the adjustment of intentional dynamics in the music. Real-time and off-line software implementations have been created. Subjective evaluation is provided in the form of listening tests, where the method is compared against the results of a human mix and a previous automatic fader implementation.
Convention Paper 8588
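The cross-adaptive idea can be sketched as follows; this is a hypothetical Python illustration, not the authors' implementation. A frame-wise RMS level in dB stands in for the EBU R-128 loudness measure, and the gate threshold, frame length, and smoothing constant are assumed values.

```python
# Minimal sketch of cross-adaptive fader control: per-frame gains pull each
# track's loudness toward the mean loudness of the active tracks.
import numpy as np

def frame_loudness_db(track, frame_len):
    """Frame-wise RMS level in dB (a stand-in for a true R-128 loudness measure)."""
    n_frames = len(track) // frame_len
    frames = track[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1)) + 1e-12
    return 20 * np.log10(rms)

def autonomous_faders(tracks, frame_len=4096, gate_db=-40.0, smooth=0.9):
    """Return per-frame gain curves (one per track) for a simple automatic mix."""
    loudness = np.stack([frame_loudness_db(t, frame_len) for t in tracks])
    gains = np.ones_like(loudness)
    prev = np.zeros(len(tracks))                      # smoothed gain offsets in dB
    for f in range(loudness.shape[1]):
        active = loudness[:, f] > gate_db             # loudness gate: ignore quiet frames
        if not np.any(active):
            gains[:, f] = 10 ** (prev / 20)
            continue
        target = np.mean(loudness[active, f])         # cross-adaptive target level
        offset = np.where(active, target - loudness[:, f], prev)
        prev = smooth * prev + (1 - smooth) * offset  # smooth gains to avoid pumping
        gains[:, f] = 10 ** (prev / 20)
    return gains

# Example: two noise tracks at different levels converge toward a common loudness
rng = np.random.default_rng(0)
tracks = [0.5 * rng.standard_normal(48000), 0.05 * rng.standard_normal(48000)]
print(autonomous_faders(tracks)[:, -1])               # final gain per track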
P2-3 A Voice Classification System for Younger Children with Applications to Content Navigation—Christopher Lowis, Christopher Pike, Yves Raimond, BBC R&D - UK
A speech classification system is proposed with applications to content accessibility for younger children. To allow a young child to access online content, where typical interfaces such as search engines or hierarchical navigation would be inappropriate, we propose a voice classification system trained to recognize a range of sounds and vocabulary typical of younger children. As an example we designed a system for classifying animal noises. Acoustic features are extracted from a corpus of animal noises made by a class of young children, and a Support Vector Machine is trained to classify the sounds into one of 12 corresponding animals. We investigated the precision and recall of the classifier for various classification parameters, as well as an appropriate choice of features to extract from the audio, comparing the performance of mean Mel-Frequency Cepstral Coefficients (MFCCs), a single-Gaussian model fitted to the MFCCs, and a range of temporal features. To investigate the real-world applicability of the system we paid particular attention to the difference between training a generic classifier on a collected corpus of examples and training one on a particular voice.
Convention Paper 8589
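A minimal Python sketch of such a mean-MFCC-plus-SVM pipeline, assuming librosa and scikit-learn, is shown below. Synthetic noisy tones stand in for the corpus of children's animal noises, and the 12 animal classes are reduced to 3 for brevity; none of this reflects the paper's actual data or parameters.

```python
# Toy MFCC + SVM classification pipeline with per-class precision and recall.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

sr = 22050
rng = np.random.default_rng(0)

def mean_mfcc(y):
    """Mean MFCC vector over time, one of the feature sets compared in the paper."""
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

# Build a toy corpus: 3 "animal" classes as noisy tones at different pitches
X, labels = [], []
for cls, f0 in enumerate([200, 500, 900]):
    for _ in range(30):
        t = np.arange(sr) / sr
        y = np.sin(2 * np.pi * f0 * t) + 0.3 * rng.standard_normal(sr)
        X.append(mean_mfcc(y))
        labels.append(cls)

X_train, X_test, y_train, y_test = train_test_split(
    np.array(X), np.array(labels), test_size=0.3, random_state=0)

clf = SVC(kernel="rbf", C=1.0)        # Support Vector Machine classifier
clf.fit(X_train, y_train)

# Precision and recall per class, the metrics investigated in the paper
print(classification_report(y_test, clf.predict(X_test)))
```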