AES Dublin 2019
Paper Session P12

P12 - Speech

Friday, March 22, 09:00 — 11:00 (Meeting Room 3)

Chair:
Yuxuan Ke, University of Chinese Academy of Sciences - Beijing, China

P12-1 Background Ducking to Produce Esthetically Pleasing Audio for TV with Clear Speech—Matteo Torcoli, Fraunhofer IIS - Erlangen, Germany; Alex Freke-Morin, University of Salford - Salford, UK; Jouni Paulus, Fraunhofer IIS - Erlangen, Germany; International Audio Laboratories Erlangen - Erlangen, Germany; Christian Simon, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Ben Shirley, University of Salford - Salford, Greater Manchester, UK; Salsa Sound Ltd - Salford, Greater Manchester, UK
In audio production, background ducking facilitates speech intelligibility while keeping the background track enjoyable. Technical details for recommendable ducking practices are not currently documented in the literature. Hence, we first analyze common practices found in TV documentaries. Second, a subjective test investigates the preferences of 22 normal-hearing listeners on the Loudness Difference (LD) between commentary and background during ducking. Highly personal preferences are observed, highlighting the importance of object-based personalization. A statistically significant difference is found between non-expert and expert listeners. On average, non-experts prefer LDs that are 4 LU higher than the ones preferred by experts. Based on the test results, we recommend a difference of at least 10 LU between commentary and music and at least 15 LU between commentary and ambience.
Convention Paper 10175

P12-2 Factors Influencing the Spectral Clarity of Vocals in Music Mixes—Kirsten Hermes, University of Westminster - London, UK
Vocal clarity is one of the most important quality parameters of music mixes. The clarity of isolated sounds depends heavily on spectral factors and can therefore be manipulated with EQ. Spectrum is also an important factor in determining vocal timbral and quality parameters. An experiment in which listeners rate the spectral clarity of equalized vocals within a noise backing track can provide insight into spectral predictors of vocal clarity. Overall, higher frequencies contribute to vocal clarity more positively than lower ones, but the relationship is program-item-dependent. Changes in harmonic centroid (or dimensionless spectral centroid) correlate well with changes in clarity, and so does the vocal-to-backing-track ratio.
Convention Paper 10174

P12-3 High-Resolution Analysis of the Directivity Factor and Directivity Index Functions of Human Speech—Samuel Bellows, Brigham Young University - Provo UT, USA; Timothy Leishman, Brigham Young University - Provo, UT, USA
The detailed directivity of a sound source is a powerful tool with broad applications in modeling of sound radiation into various acoustic environments, ideal microphone positioning, and other areas. While the directivity of human speech has been assessed previously, the results have lacked the resolution necessary to accurately model radiation in three dimensions. In this work, high-resolution measurements were taken using a multiple-capture spherical-scanning system. The frequency-dependent directivity factors and indices of speech were then calculated from the data and their spherical-harmonic expansions. Although past models have represented these measures in simple terms, high-resolution measurements demonstrate that over the audible range they have more variation than previously known, with important ramifications for three-dimensional modeling and audio.
Convention Paper 10173

P12-4 Poster Introductions 7—N/A
The purpose of Poster Introductions at the end of certain paper sessions is to give the poster authors a chance to briefly outline what is in their paper and encourage people to come to their poster session and ask questions.
• Quantitative Analysis of Streaming Protocols for Enabling Internet of Things (IoT) Audio Hardware - Marques Hardin; Rob Toulson
• Automatic Detection of Audio Problems for Quality Control in Digital Music Distribution - Pablo Alonso Jiménez; Luis Joglar Ongay; Xavier Serra; Dmitry Bogdanov
• A High Power Switch-Mode Power Audio Amplifier - Niels Ekljær Iveresen; Jóhann Björnsson; Patrik Boström; Lars Petersen
