Prediction of timbral, spatial, and overall audio quality with independent auditory feature mapping

Qiao, Yue and Zacharov, Nick and Hoffmann, Pablo F.

AES E-Library

Listening tests are regarded as the “gold standard” in evaluating the perceptual quality of audio systems. With the surge of applications in virtual and augmented reality, the demand for audio quality evaluations that are more efficient than listening tests has greatly increased. Auditory models are an attractive tool for this purpose and can usefully complement listening tests. A machine-learning-based model for predicting timbral, spatial, and overall audio quality is presented. When both timbral and spatial attributes are considered, existing models (e.g., MoBi-Q [1]) often assume minimal interaction between the two attributes and combine their respective quality predictions into a single overall quality judgement. To test this assumption, a listening test with various timbral and spatial distortions was conducted. Results revealed a strong correlation between the two quality attributes when moderate distortion was present. Based on this observation, the proposed model preserves the original front-end of MoBi-Q for feature extraction and uses a simple neural network as the decision module, which independently maps auditory features to timbral, spatial, and overall quality scores without explicit assumptions about how the attributes interact. On available third-party datasets, the proposed model showed a significantly higher correlation with subjective scores than MoBi-Q for timbral and overall quality. The assessment of spatial audio quality is still ongoing.
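
The decision module described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The network size, the tanh activation, the shared hidden layer, and the names init_mlp, predict, and pearson_per_attribute are all assumptions made for illustration. The sketch shows one way a small network could map a feature vector directly to three scores (timbral, spatial, overall), so that the overall score is not computed by combining the other two, and how predictions could be compared with subjective scores using a per-attribute Pearson correlation.

```python
import numpy as np

# Hypothetical sketch: layer sizes, activation, and all names are assumptions,
# not details taken from the paper.

def init_mlp(n_features, n_hidden, n_outputs=3, seed=0):
    """Create random weights for a one-hidden-layer network.

    The three outputs stand for timbral, spatial, and overall quality.
    """
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_features, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_outputs)),
        "b2": np.zeros(n_outputs),
    }

def predict(params, features):
    """Map features of shape (n_items, n_features) to scores of shape (n_items, 3).

    Each score has its own linear readout of the hidden layer; the overall
    score is not derived from the timbral and spatial scores.
    """
    hidden = np.tanh(features @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]

def pearson_per_attribute(predicted, subjective):
    """Pearson correlation between predicted and subjective scores, per column."""
    p = predicted - predicted.mean(axis=0)
    s = subjective - subjective.mean(axis=0)
    denom = np.sqrt((p ** 2).sum(axis=0) * (s ** 2).sum(axis=0))
    return (p * s).sum(axis=0) / denom

# Usage with placeholder data (random numbers, not real auditory features).
params = init_mlp(n_features=8, n_hidden=16)
features = np.random.default_rng(1).normal(size=(20, 8))
scores = predict(params, features)  # columns: timbral, spatial, overall
```

In practice the weights would be fitted to listening-test scores rather than left random, and the input features would come from an auditory front-end; both steps are outside the scope of this sketch.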

Author(s): Qiao, Yue; Zacharov, Nick; Hoffmann, Pablo F.
Affiliation: Princeton University, NJ, USA; Reality Labs, Meta, Redmond, WA, USA; Reality Labs, Meta, Redmond, WA, USA (See document for exact affiliation information.)
AES Convention: 153; Paper Number: 48
Publication Date: 2022-10-06
Session subject: Spatial Audio

Type: Express Paper
