UP-WGAN: Upscaling Ambisonic Sound Scenes Using Wasserstein Generative Adversarial Networks

Wang, Yiwen and Wu, Xihong and Qu, Tianshu

AES E-Library

Sound field reconstruction using spherical harmonics (SH) is widely used. However, truncating the SH expansion at a finite order leads to inaccurate reconstruction of the sound pressure when the reconstruction region is large, and reconstruction performance also degrades at high frequencies. Upscaling ambisonic sound scenes is one way to overcome these limitations. This work proposes a deep-learning-based upscaling method built on a generative adversarial network (GAN). Instead of estimating the SH coefficients, a U-Net-based fully convolutional generator directly outputs the two-dimensional sound pressure. Results show that the proposed method significantly improves the upscaling results compared with the previous deep-learning-based method.
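
The order limitation mentioned in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a unit-amplitude plane wave and uses the standard plane-wave expansion in spherical Bessel functions and Legendre polynomials, truncated at a given order. The function names (`spherical_jn`, `legendre`, `truncation_error`) and the chosen values of `kr` are hypothetical, picked only for illustration. Since `kr` is wavenumber times radius, a larger region and a higher frequency both increase it.

```python
import cmath
import math


def spherical_jn(n, x, terms=60):
    """Spherical Bessel function j_n(x) from its power series.

    Adequate for the small and moderate arguments used in this sketch.
    """
    if x == 0.0:
        return 1.0 if n == 0 else 0.0
    # Leading term x^n / (2n+1)!!
    term = 1.0
    for m in range(1, n + 1):
        term *= x / (2 * m + 1)
    total = term
    # Each further term multiplies by (-x^2/2) / (k * (2n + 2k + 1)).
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return total


def legendre(n, c):
    """Legendre polynomial P_n(c) via the three-term recurrence."""
    p_prev, p = 1.0, c
    if n == 0:
        return p_prev
    for m in range(1, n):
        p_prev, p = p, ((2 * m + 1) * c * p - m * p_prev) / (m + 1)
    return p


def truncation_error(order, kr, n_angles=181):
    """Worst-case pressure error over angle for an order-limited expansion.

    Compares exp(i*kr*cos(theta)) with
    sum_{n=0}^{order} (2n+1) * i^n * j_n(kr) * P_n(cos(theta)).
    """
    radial = [(2 * n + 1) * (1j ** n) * spherical_jn(n, kr)
              for n in range(order + 1)]
    worst = 0.0
    for a in range(n_angles):
        cos_theta = math.cos(math.pi * a / (n_angles - 1))
        exact = cmath.exp(1j * kr * cos_theta)
        approx = sum(radial[n] * legendre(n, cos_theta)
                     for n in range(order + 1))
        worst = max(worst, abs(approx - exact))
    return worst


if __name__ == "__main__":
    # Illustrative values only: first-order vs. tenth-order expansion.
    for kr in (0.5, 2.0, 4.0):
        print(f"kr={kr}: order 1 -> {truncation_error(1, kr):.3g}, "
              f"order 10 -> {truncation_error(10, kr):.3g}")
```

In this sketch the first-order error stays small for small `kr` and grows to the order of the wave amplitude itself as `kr` increases, while a higher-order expansion remains accurate over the same range. That gap is what upscaling a low-order scene aims to close.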

Author(s): Wang, Yiwen; Wu, Xihong; Qu, Tianshu
Affiliation: Peking University, Beijing, China (See document for exact affiliation information.)
AES Convention: 152; Paper Number: 10577
Publication Date: 2022-05-06
Session subject: Spatial Audio

Type: Convention Paper