Timbral analysis of popular music sub-genres (or micro-genres) from spectrographic features presents unique challenges to the field of computational auditory scene analysis. These challenges arise from the adjacencies among sub-genres and from the complex sonic scenes created by sophisticated musical textures and production processes. This paper presents a timbral modeling tool based on a modified deep learning natural language processing model. It treats the time frames in spectrograms as words in natural languages to explore the temporal dependencies. The modeling performance metrics obtained from the fine-tuned classifier of the modified Deep Bidirectional Encoder Representations from Transformers (BERT) model show strong semantic modeling performance with different temporal settings. Designed as an automatic feature engineering tool, the proposed framework provides a unique solution to the semantic modeling and representation tasks for objectively understanding subtle musical timbral patterns in highly similar musical genres.
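The idea of treating spectrogram time frames as "words" can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length, the hop size, the sequence length, and the random stand-in signal are all assumptions chosen for illustration. It only shows how audio might be turned into fixed-length sequences of frame vectors that a BERT-style sequence model could consume in place of word embeddings.

```python
import numpy as np

def spectrogram_frames(signal, frame_len=256, hop=128):
    """Return a magnitude spectrogram with one row per time frame.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Each row is the magnitude spectrum of one windowed frame.
    return np.abs(np.fft.rfft(frames, axis=1))

def to_sequences(frames, seq_len=16):
    """Group consecutive frames into fixed-length sequences.

    Each sequence plays the role of a "sentence" and each frame the role
    of a "word"; leftover frames at the end are dropped.
    """
    n_seq = frames.shape[0] // seq_len
    return frames[: n_seq * seq_len].reshape(n_seq, seq_len, frames.shape[1])

# Hypothetical stand-in for an audio excerpt (random noise, not real music).
rng = np.random.default_rng(0)
audio = rng.standard_normal(8192)

frames = spectrogram_frames(audio)   # shape: (time frames, frequency bins)
sequences = to_sequences(frames)     # shape: (sequences, seq_len, frequency bins)
```

Under these assumptions, changing `seq_len` is one way to vary the temporal context seen by the model, which loosely corresponds to the "different temporal settings" mentioned in the abstract.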
Author(s): Geng, Shijia; Ren, Gang; Pan, Xu; Zysman, Joel; Ogihara, Mitsu
Affiliation: University of Miami, FL, USA (See document for exact affiliation information.)
Publication Date: 2020-08-06
Session subject: Music Analysis
DOI:

Geng, Shijia; Ren, Gang; Pan, Xu; Zysman, Joel; Ogihara, Mitsu; 2020; Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers [PDF]; University of Miami, FL, USA; Paper 10470; Available from: https://aes.org/publications/elibrary-page/?id=21147
Geng, Shijia; Ren, Gang; Pan, Xu; Zysman, Joel; Ogihara, Mitsu; Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers [PDF]; University of Miami, FL, USA; Paper 10470; 2020 Available: https://aes.org/publications/elibrary-page/?id=21147
@inproceedings{Geng2020sequential,
title={{Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers}},
author={Geng, Shijia and Ren, Gang and Pan, Xu and Zysman, Joel and Ogihara, Mitsu},
year={2020},
month={aug},
booktitle={Journal of the Audio Engineering Society},
publisher={Paper 10470; AES Conference: 2020 AES International Conference on Audio for Virtual and Augmented Reality (August 2020); August 2020},
number={10470},
organization={AES},
}
TY – paper
TI – Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers
AU – Geng, Shijia
AU – Ren, Gang
AU – Pan, Xu
AU – Zysman, Joel
AU – Ogihara, Mitsu
PY – 2020
JO – Journal of the Audio Engineering Society
VL – 10470
Y1 – August 2020