AES E-Library

Subspace-Based HRTF Synthesis from Sparse Data: A Joint PCA and ML-Based Approach

Head-related transfer functions (HRTFs) are used to create the perception of a virtual sound source at an arbitrary azimuth and elevation. Publicly available databases cover only a subset of these directions, owing to physical constraints (viz., the loudspeakers that generate the stimuli are not point sources) and to the time required to acquire and deconvolve responses for a large number of spatial directions. In this paper we present a subspace-based technique for reconstructing HRTFs at arbitrary directions for the IRCAM-Listen HRTF database, which comprises a set of HRTFs sampled every 15 deg along the azimuth direction. The presented technique first augments the sparse IRCAM dataset using the concept of auditory localization blur, then applies principal component analysis (PCA) to the original and augmented HRTFs to derive a set of P = 6 principal components, and finally trains an artificial neural network (ANN) with these directional principal components. The HRTF for an arbitrary direction is reconstructed by post-multiplying the ANN output, which comprises the six estimated principal components, with a frequency weighting matrix. The advantage of the subspace approach, which involves only six principal components, is a lower-complexity ANN-based HRTF synthesis model than one in which the ANN is trained to output the HRTF over all frequencies. Objective results indicate that the presented approach interpolates reasonably well.
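The subspace step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the data are synthetic, the names `fit_pca` and `reconstruct` are hypothetical, and the mean subtraction and the use of an SVD to obtain the PCA basis are assumptions. The ANN itself is omitted. Here the six weights per direction come directly from projecting the data, whereas in the paper an ANN estimates them for directions that were not measured.

```python
import numpy as np


def fit_pca(H, P=6):
    """Fit a P-component PCA to HRTF data H of shape (n_directions, n_freqs).

    Returns the mean spectrum, a (P, n_freqs) frequency weighting matrix
    (the basis), and the (n_directions, P) per-direction weights.
    """
    mean = H.mean(axis=0)
    centered = H - mean
    # Rows of Vt are orthonormal principal directions over frequency.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:P]
    scores = centered @ basis.T
    return mean, basis, scores


def reconstruct(scores, basis, mean):
    """Post-multiply per-direction weights with the frequency weighting matrix."""
    return scores @ basis + mean


# Synthetic stand-in for measured data (assumption): 24 azimuths spaced
# 15 degrees apart, 128 frequency bins, generated from 6 smooth latent factors.
rng = np.random.default_rng(0)
azimuths = np.arange(0, 360, 15)
theta = np.deg2rad(azimuths)
latent = np.stack(
    [np.cos(k * theta) for k in range(1, 4)]
    + [np.sin(k * theta) for k in range(1, 4)],
    axis=1,
)  # shape (24, 6)
mixing = rng.standard_normal((6, 128))
H = latent @ mixing + 0.01 * rng.standard_normal((24, 128))

mean, basis, scores = fit_pca(H, P=6)
H_hat = reconstruct(scores, basis, mean)
rel_err = np.linalg.norm(H - H_hat) / np.linalg.norm(H)
```

In this sketch a model that predicts `scores` only has to output six values per direction instead of 128, which is the complexity reduction the abstract refers to.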

 


