AES E-Library

Audio Quality Prediction of Non–Waveform Preserving Distortions in Realistic Complex Scenes

Intrusive audio quality models typically compare “internal representations” of a reference signal and a test signal. These models are often optimized to predict small signal degradations, where the test and reference signals are still highly correlated (waveform preserving distortions). However, uncorrelated signals, such as two Gaussian-noise tokens, or more complex, realistic signals in spatial audio reproduction schemes that show only partial correlation (non–waveform preserving distortions), are not necessarily easy for listeners to distinguish. Despite this, current audio quality models typically predict large perceptual differences between such signals. Here, the decision back-end of a reference-based audio quality model was modified to account for this overestimation of signal quality differences. The suggested modifications were intended to mimic short-term memory limitations by analyzing similarities in the differences between the internal representations of the reference and test signals across time frames, auditory channels, and modulation channels. The modified model was evaluated with data based on different audio reproduction and room simulation methods and was compared with other state-of-the-art audio quality models. The results support the need to modify state-of-the-art audio quality models so that they accurately predict the perceptual effects of non–waveform preserving distortions.

 

Type:
16938