AES San Francisco 2010
Poster Session P5
P5 - Emerging Applications
Thursday, November 4, 3:00 pm — 4:30 pm (Room 226)
P5-1 A Robust Audio Feature Extraction Algorithm for Music Identification—Jiajun Wang, Beijing University of Posts and Telecommunications - Beijing, China; Marie-Luce Bourguet, Queen Mary University of London - London, UK
In this paper we describe a novel audio feature extraction method that can effectively improve the performance of music identification in noisy conditions. It is based on a dual-box approach that extracts, from the sound spectrogram, point clusters with significant energy variation. This approach was tested in a song finder application that can identify music from samples recorded by microphone in the presence of dominant noise. A series of experiments shows that, in noisy conditions, our system outperforms current state-of-the-art music identification algorithms and provides very good precision, scalability, and query efficiency.
Convention Paper 8180
P5-2 The Low Complexity MP3-Multichannel Audio Decoding System—Hyun Wook Kim, Han Gil Moon, Samsung Electronics - Suwon, Korea
In this paper a low-complexity MP3 multichannel audio decoding system is proposed. With the proposed system, the advanced multichannel MP3 decoder can play high-quality multichannel audio as well as legacy stereo audio with low processing power. The system consists mainly of two parts: an MP3 decoding part and a parametric multichannel decoding part. A transform-domain convolution-synthesis method replaces the PQMF module in the MP3 decoding part, and several small-point DFT modules replace the large-point DFT module used in the multichannel decoding part. This combination can reduce computing power dramatically without any loss in the decoded audio signal.
Convention Paper 8181
P5-3 The hArtes CarLab: A New Approach to Advanced Algorithms Development for Automotive Audio—Stefania Cecchi, Andrea Primavera, Francesco Piazza, Università Politecnica delle Marche - Ancona (AN), Italy; Ferruccio Bettarelli, Emanuele Ciavattini, Leaff Engineering - Ancona (AN), Italy; Romolo Toppi, FAITAL S.p.a. - Milano, Italy; Jose Gabriel De Figueiredo Coutinho, Wayne Luk, Imperial College London - London, UK; Christian Pilato, Fabrizio Ferrandi, Politecnico di Milano - Milano, Italy; Vlad M. Sima, Koen Bertels, Delft University of Technology - Delft, The Netherlands
In the last decade automotive audio has been gaining great attention from the scientific and industrial communities. In this context, a new approach to testing and developing advanced audio algorithms for a heterogeneous embedded platform has been proposed within the European hArtes project. A real audio laboratory installed in a real car (hArtes CarLab) has been developed using professional audio equipment. The algorithms can be tested and validated on a PC, with each application running as a plug-in of a real-time framework. A set of tools (the hArtes Toolchain) can then be used to generate code for the embedded platform, starting from the plug-in implementation. An overview of the entire system is presented here, showing its effectiveness.
Convention Paper 8182
P5-4 Real-Time Speech Visualization System for Speech Training and Diagnosis—Yuichi Ueda, Tadashi Sakata, Akira Watanabe, Kumamoto University - Kumamoto-shi, Japan
We have been interested in visualizing speech information in order to observe speech phenomena, analyze speech signals, and compensate for hearing or speech disorders. To realize such speech visualization, we developed a software tool, Speech-ART, and used it in investigating speech. Although the system's functions have been effective in offline operation, its use as a speech training tool or for real-time observation of speech sounds has been restricted. Consequently, we have increased the efficiency of analyzing speech parameters and displaying speech images, and have developed a real-time speech visualization system. In this paper we describe the background of speech visualization, the characteristics of our system, and its future applications.
Convention Paper 8184
P5-5 Underdetermined Binaural 3-D Sound Localization of Simultaneously Active Sources—Martin Rothbucher, David Kronmüller, Hao Shen, Klaus Diepold, Technische Universität München - München, Germany
Mobile robotic platforms are equipped with multimodal human-like sensing, e.g., haptics, vision, and audition, in order to collect data from the environment. Recently, robotic binaural hearing approaches based on Head-Related Transfer Functions (HRTFs) have become a promising technique for localizing sounds in a three-dimensional environment with only two microphones. Usually, HRTF-based sound localization approaches are restricted to one sound source. To cope with this difficulty, Blind Source Separation (BSS) algorithms have been used to separate the sound sources before applying HRTF localization. However, those approaches are usually computationally expensive and, in the underdetermined case, restricted to sparse and statistically independent signals. In this paper we present an underdetermined sound localization method that uses a superpositioned HRTF database. Our algorithm is capable of localizing sparse as well as broadband signals, even when the signals are not statistically independent.
Convention Paper 8185
P5-6 Wireless Multisensor Monitoring of the Florida Everglades: A Pilot Project—Colby Leider, University of Miami - FL, USA; Doug Mann, Peavey Electronics Corporation - Meridian, MS, USA; Daniel P. Dickinson, University of Miami - FL, USA
Prior work (e.g., Calahan 1984; Havstad and Herrick 2003) describes the need for long-term ecological monitoring of environmental data such as surface temperature and water quality. Newer studies by Maher, Gregoire, and Chen (2005) and Maher (2009, 2010) demonstrate the value of similarly documenting natural sound environments in U.S. national parks over periods on the order of a year. Building on these ideas, we describe a new system, currently being tested in the Florida Everglades, that is capable of combined remote audio and environmental monitoring over periods on the order of multiple years.
Convention Paper 8186