AES E-Library

Perception and Automated Assessment of Audio Quality in User Generated Content

Many of us now carry devices that can record sound, whether that is our child's first music concert captured on a digital camera or a practical joke recorded on a mobile phone. However, the production quality of the sound in user-generated content is often very poor: distorted, noisy, with garbled speech or indistinct music. This paper reports the outcomes of a three-year research project on assessing the quality of user-generated recordings. Our interest lies in the causes of poor recordings, especially what happens between the sound source and the electronic signal emerging from the microphone. We have investigated typical problems: distortion, wind noise, microphone handling noise, and frequency response. From subjective tests on the perceived quality of such errors, and from signal features extracted from the audio files, we developed perceptual models to automatically predict the perceived quality of audio streams unknown to the model. It is shown that perceived quality is more strongly associated with distortion and frequency response, with wind and handling noise being just slightly less important. The work presented here has applications in areas such as the perception and measurement of audio quality, signal processing, feature detection, and machine learning.

 



