We investigate applying audio manipulations using pretrained neural network-based autoencoders as an alternative to traditional signal processing methods, since the former may provide greater semantic or perceptual organization. To assess the potential of this approach, we first establish whether representations from these models encode information about manipulations. We carry out experiments and produce visualizations using representations from two different pretrained autoencoders. Our findings indicate that, while some information about audio manipulations is encoded, this information is both limited and encoded in a non-trivial way. This is supported by our attempts to visualize these representations, which demonstrate that trajectories of representations for common manipulations are typically nonlinear and content dependent, even for linear signal manipulations. As a result, it is not yet clear how these pretrained autoencoders can be used to manipulate audio signals; however, our results indicate this may be due to a lack of disentanglement with respect to common audio manipulations.
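The observation that a linear signal manipulation can trace a nonlinear, content-dependent path in representation space can be illustrated with a minimal sketch. This is not the paper's method or either of its pretrained autoencoders: `toy_encoder`, its random weights, the test signals, and the `trajectory_nonlinearity` score are all hypothetical stand-ins chosen for illustration. The sketch scales a signal by a range of gains (a linear manipulation), encodes each version, and measures how far the resulting latent path strays from the straight line joining its endpoints.

```python
import math
import random


def toy_encoder(signal, weights, activation=math.tanh):
    """Hypothetical stand-in encoder: a single dense layer plus a nonlinearity.
    It is NOT one of the pretrained autoencoders studied in the paper."""
    return [activation(sum(w * x for w, x in zip(row, signal))) for row in weights]


def trajectory_nonlinearity(signal, weights, gains, activation=math.tanh):
    """Encode gain-scaled copies of `signal` and return the largest distance
    between the latent path and the straight chord joining its endpoints,
    divided by the chord length (0.0 means a perfectly linear trajectory)."""
    path = [toy_encoder([g * x for x in signal], weights, activation) for g in gains]
    start, end = path[0], path[-1]
    chord = math.dist(start, end)
    if chord == 0.0:
        return 0.0
    span = gains[-1] - gains[0]
    worst = 0.0
    for g, z in zip(gains, path):
        t = (g - gains[0]) / span  # position along the chord for this gain
        on_chord = [s + t * (e - s) for s, e in zip(start, end)]
        worst = max(worst, math.dist(z, on_chord))
    return worst / chord


# Assumed toy setup: 16-sample signals, 4-dimensional latent space.
rng = random.Random(0)
weights = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(4)]
tone = [math.sin(2 * math.pi * 3 * n / 16) for n in range(16)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(16)]
gains = [0.1 + 0.1 * k for k in range(20)]

for name, sig in (("tone", tone), ("noise", noise)):
    score = trajectory_nonlinearity(sig, weights, gains)
    print(f"{name}: nonlinearity score = {score:.3f}")
```

With an identity activation the score is zero up to rounding, since a purely linear encoder maps a gain sweep onto a straight line. With the nonlinear encoder the score is nonzero and generally differs between the two input signals, which is the kind of content dependence the abstract describes.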
Author(s): Hawley, Scott H.; Steinmetz, Christian J.
Affiliation:
Belmont University, Nashville, TN, USA and Harmonai, USA; Queen Mary University of London, UK
(See document for exact affiliation information.)
AES Convention: 154
Paper Number: 96
Publication Date:
2023-05-06
Session subject:
Neural Networks
DOI:

Hawley, Scott H.; Steinmetz, Christian J.; 2023; Leveraging Neural Representations for Audio Manipulation [PDF]; Belmont University, Nashville, TN, USA and Harmonai, USA; Queen Mary University of London, UK; Paper 96; Available from: https://aes.org/publications/elibrary-page/?id=22121
Hawley, Scott H.; Steinmetz, Christian J.; Leveraging Neural Representations for Audio Manipulation [PDF]; Belmont University, Nashville, TN, USA and Harmonai, USA; Queen Mary University of London, UK; Paper 96; 2023 Available: https://aes.org/publications/elibrary-page/?id=22121
@inproceedings{Hawley2023leveraging,
title={{Leveraging Neural Representations for Audio Manipulation}},
author={Hawley, Scott H. and Steinmetz, Christian J.},
year={2023},
month={may},
booktitle={Journal of the Audio Engineering Society},
publisher={Express Paper 96; AES Convention 154; May 2023},
number={96},
organization={AES},
}