Session O: SPATIAL AUDIO - PART 2
Monday, May 13, 14:00-17:30 h

A controlled subjective experiment was undertaken to evaluate the relative merits of objective measurement techniques for predicting selected perceived spatial attributes of reproduced sound. The stimuli consisted of a number of anechoic recordings of single sound sources that were reproduced in a simulated concert hall and captured using a number of simulated multichannel microphone techniques. These were reproduced in a listening room and the subjects were asked to judge the perceived source width and perceived environment width of each stimulus. A number of objective measurements were made at the listening position and these were then compared with the subjective judgments. The results showed that a perceptually-grouped measurement of the experimental stimuli using a technique based on the interaural cross-correlation coefficient matched the subjective judgments most accurately, though the difference between this measurement and a number of other types was small.

Human sound localization of speech stimuli was tested using an accurate head-pointing task in three sound conditions: (i) broadband (22 Hz to 16 kHz); (ii) low-pass (22 Hz to 8 kHz); (iii) spectrally-smeared broadband. The experiments were conducted in virtual auditory space (VAS) so that reduced frequency selectivity (a consequence of cochlear hearing loss that has effects similar to spectral smearing) could be simulated in normally-hearing listeners. Broadband noise localization provided a control. Results show that broadband speech is not localized as accurately as broadband noise and that there is a significant reduction in localization accuracy for both the low-pass and spectrally-smeared sound conditions. The data show that accurate high-frequency spectral information is important for speech localization.

Assessment of the spatial quality of reproduced sound is becoming more important as the number of techniques and systems affecting such quality increases.
The presence of dimensions forming spatial quality has been indicated in earlier experiments by using attributes as descriptors for the dimensions. These attributes have been found relevant for describing the spatial quality of stimuli subjected to different modes of reproduction. In this paper, new attributes are elicited and the applicability of these and previously encountered attributes for assessment of spatial quality is tested in the context of new stimuli, recorded by means of 5-channel microphone techniques and reproduced through a 5.0 system.

Differential pressure synthesis (DPS) estimates the free-field acoustic pressure on the boundary of an object from its geometry. The DPS method pre-calculates a database of pressure changes caused by introducing orthogonal shape deformations to a template shape. Pressures on the object are synthesized by summing weighted pressure components from the database. Given an appropriate template, the accuracy of DPS approaches that of the proven boundary element method (BEM), yet its computational cost is much closer to that of simple spherical head approximations. Pressures are synthesized for a 2D shape and a 3D KEMAR head model and are compared with those from direct application of the BEM. Applications and constraints of DPS are discussed, including its potential for the rapid estimation of head-related transfer functions.

The aim of the present work is the reproduction of a five-channel signal over headphones. Informal listening tests show that a large number of different HRTFs do not have the desired level of quality. The frontal localization was either elevated or completely undefined. The coloration in all directions, even with correct IPTFs, was far too strong for high-quality reproduction. In order to overcome this problem, two HRTF sets together with IPTFs were selected from a large database. These transfer functions were subsequently tuned by a tuning expert.
The main methods used for tuning were smoothing and parametric equalizing of amplitude and phase with individual settings for every direction and for the left and right ear. Listening experiments that confirm the tuning results for a panel of listeners are presented and discussed. The resulting transfer functions have clearly reduced coloration and improved global localization, although with only modest improvements in the frontal position.

The assessment of multichannel audio systems is often based on the ability of a particular loudspeaker configuration to generate the appropriate interaural cues for a listener to localize a stable virtual image between the loudspeakers. A common method used to judge the performance of a system is a subjective listening test where listeners are asked to indicate the direction of the virtual image. This work investigates an alternative method where a dummy head can be used to capture the cues, and analysis of these signals allows a third-octave-band prediction of where the image is most likely to be perceived. The MATLAB implementation is tested using two common multichannel systems and the results are compared with subjective results.
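
As an illustration of the kind of interaural cue analysis mentioned above, the following is a minimal Python sketch, not the authors' MATLAB implementation. It assumes two equal-length ear signals from a dummy head and estimates the interaural cross-correlation coefficient, the interaural time difference (ITD) and the interaural level difference (ILD) over the whole band. The function names, the 1 ms lag window and the sign conventions are assumptions made for this example. A per-band prediction would first pass both signals through a third-octave filter bank, which is omitted here.

```python
import math


def interaural_cross_correlation(left, right, sample_rate, max_lag_ms=1.0):
    """Return (iacc, itd_seconds) for two equal-length ear signals.

    Lags are searched within +/- max_lag_ms (assumed window). A positive
    ITD means the right-ear signal lags the left-ear signal.
    """
    if len(left) != len(right):
        raise ValueError("ear signals must have equal length")
    n = len(left)
    max_lag = int(round(max_lag_ms * 1e-3 * sample_rate))
    # Normalise by the geometric mean of the two signal energies.
    energy = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if energy == 0.0:
        raise ValueError("signals must not be silent")
    best_value, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        start, stop = max(0, -lag), min(n, n - lag)
        value = sum(left[i] * right[i + lag] for i in range(start, stop)) / energy
        if abs(value) > abs(best_value):
            best_value, best_lag = value, lag
    return abs(best_value), best_lag / sample_rate


def interaural_level_difference(left, right):
    """Return the level difference in dB (positive when the left ear is louder)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(y * y for y in right)
    if energy_left == 0.0 or energy_right == 0.0:
        raise ValueError("signals must not be silent")
    return 10.0 * math.log10(energy_left / energy_right)
```

Mapping such cue values to a perceived image direction requires a reference set of cues measured for known source directions, which this sketch does not provide.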