AES E-Library

Head-Related Transfer Function Upsampling Using an Autoencoder-Based Generative Adversarial Network With Evaluation Framework

Accurate head-related transfer functions (HRTFs) are essential for delivering realistic 3D audio experiences. However, obtaining personalized, high-resolution HRTFs for individual users is a time-consuming and costly process, typically requiring extensive acoustic measurements. To address this, spatial upsampling techniques have been developed to estimate high-resolution HRTFs from sparse, low-resolution acoustic measurements. This paper presents a novel approach that leverages the spherical harmonic domain and an autoencoder generative adversarial network to tackle the HRTF upsampling problem. Comprehensive evaluations are conducted using both perceptual models and objective spectral metrics to validate the accuracy and realism of the upsampled HRTFs. The results show that the proposed approach outperforms traditional barycentric interpolation in terms of log-spectral distortion, particularly in extreme sparsity scenarios involving fewer than 12 measurements. These results suggest that the proposed autoencoder generative adversarial network can create high-quality, high-resolution HRTFs from only a few acoustic measurements, helping to pave the way for more accessible personalized spatial audio across a range of applications.
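For readers unfamiliar with the log-spectral distortion (LSD) metric named in the abstract, the following is a minimal sketch of one commonly used definition: the root-mean-square, over frequency bins, of the dB difference between a reference and an estimated magnitude spectrum, averaged over measurement positions. The abstract does not state the exact formulation used in the paper, so this definition, the function names, and the input layout (lists of linear magnitudes) are assumptions made for illustration only.

```python
import math


def log_spectral_distortion(ref_mags, est_mags):
    """LSD in dB between two magnitude spectra (hypothetical helper).

    ref_mags, est_mags: equal-length sequences of positive linear magnitudes,
    one value per frequency bin.
    """
    if len(ref_mags) != len(est_mags) or len(ref_mags) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    # Squared dB difference for each frequency bin.
    squared_db = [
        (20.0 * math.log10(ref / est)) ** 2
        for ref, est in zip(ref_mags, est_mags)
    ]
    # Root of the mean squared dB difference across bins.
    return math.sqrt(sum(squared_db) / len(squared_db))


def mean_lsd(ref_set, est_set):
    """Average LSD over a set of positions (assumed aggregation)."""
    if len(ref_set) != len(est_set) or len(ref_set) == 0:
        raise ValueError("sets must be non-empty and of equal length")
    values = [log_spectral_distortion(r, e) for r, e in zip(ref_set, est_set)]
    return sum(values) / len(values)
```

Under this definition, a lower value means the upsampled spectrum is closer to the measured one, and identical spectra give an LSD of 0 dB.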

 

Author(s):
Affiliation: (See document for exact affiliation information.)
Publication Date:

DOI:

