Our challenge is to analyze and classify video soundtrack content for indexing purposes. To this end, we compare the performance of MPEG-7 Audio Spectrum Projection (ASP) features, based on several basis decomposition algorithms, with Mel-scale Frequency Cepstral Coefficients (MFCC). For the basis decomposition step of the feature extraction we evaluate three approaches: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Audio features are computed from these reduced vectors and fed into a continuous hidden Markov model (CHMM) classifier. We conclude that the established MFCC features yield better performance than MPEG-7 ASP in general sound recognition under practical constraints.
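As a rough illustration of the PCA variant of the basis decomposition step, the sketch below normalizes log-spectrum frames, derives a reduced basis by singular value decomposition, and projects each frame onto it. It is a minimal sketch under stated assumptions, not the authors' implementation or the normative MPEG-7 procedure: the function name `asp_features`, the mean-centering before the SVD, and the random input frames are all hypothetical choices made for this example.

```python
import numpy as np


def asp_features(spectrogram_db, n_basis):
    """Project L2-normalized log-spectrum frames onto a PCA basis.

    spectrogram_db: array of shape (n_frames, n_bins), log-scale spectrum.
    Returns (features, basis); features has shape (n_frames, n_basis + 1),
    where column 0 holds the per-frame L2 norm.
    Hypothetical helper: name and details are assumptions for illustration.
    """
    X = np.asarray(spectrogram_db, dtype=float)

    # Per-frame L2 norm, kept as an extra feature alongside the projection.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    safe_norms = np.where(norms == 0.0, 1.0, norms)  # avoid division by zero
    X_unit = X / safe_norms

    # PCA basis from the SVD of the mean-centered frames (assumed step).
    centered = X_unit - X_unit.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_basis].T  # shape (n_bins, n_basis), orthonormal columns

    # Reduced-dimension projection of each normalized frame.
    projection = X_unit @ basis
    return np.hstack([norms, projection]), basis


# Usage with synthetic frames standing in for a real log spectrogram.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 16))
features, basis = asp_features(frames, n_basis=4)
print(features.shape, basis.shape)
```

The reduced vectors returned here would then be passed to a classifier; the ICA and NMF variants would replace only the step that computes `basis`.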
Authors: Kim, Hyoung-Gook; Sikora, Thomas
Affiliation:
Communication Systems Group, Technical University of Berlin, Germany
(See document for exact affiliation information.)
Publication Date:
2004-06-06
Session subject:
Metadata for Audio

Kim, Hyoung-Gook; Sikora, Thomas; 2004; How Efficient is MPEG-7 for General Sound Recognition? [PDF]; Communication Systems Group, Technical University of Berlin, Germany; Paper 5-2; Available from: https://aes.org/publications/elibrary-page/?id=12814
Kim, Hyoung-Gook; Sikora, Thomas; How Efficient is MPEG-7 for General Sound Recognition? [PDF]; Communication Systems Group, Technical University of Berlin, Germany; Paper 5-2; 2004; Available: https://aes.org/publications/elibrary-page/?id=12814
@inproceedings{Kim2004how,
title={{How Efficient is MPEG-7 for General Sound Recognition?}},
author={Kim, Hyoung-Gook and Sikora, Thomas},
year={2004},
month={jun},
booktitle={AES 25th International Conference: Metadata for Audio},
publisher={Audio Engineering Society},
number={5-2},
organization={AES},
}