AES Budapest 2012
Paper Session P16
P16 - High Resolution and Low Bit Rate
Saturday, April 28, 09:00 — 12:30 (Room: Liszt)
Chair: Vesa Välimäki
P16-1 Time Domain Performance of Decimation Filter Architectures for High Resolution Sigma-Delta Analog to Digital Conversion—Yonghao Wang, Queen Mary University of London - London, UK, Birmingham City University, Birmingham, UK; Joshua D. Reiss, Queen Mary University of London - London, UK
We present the results of a comparison of different decimation architectures for high resolution sigma-delta analog to digital conversion in terms of passband and transition band performance, simulated signal to noise ratio, and computational cost. In particular, we focus on comparing the time domain group delay response of different filter architectures, including classic multistage FIR filters, cascaded integrator-comb (CIC) filters with FIR compensation filters, multistage polyphase IIR filters, cascaded halfband minimum phase FIR filters, and multistage minimum phase FIR filter designs. The analysis shows that the multistage minimum phase FIR filter and the multistage polyphase IIR filter are the most promising for low group delay audio applications.
Convention Paper 8648
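The group delay comparison described in the P16-1 abstract can be illustrated with a small calculation. This is a minimal sketch, not the authors' method: it assumes every stage is a linear-phase FIR filter (group delay of (N - 1) / 2 samples at that stage's input rate) followed by downsampling. The function name, tap counts, and sample rate are hypothetical values chosen for illustration.

```python
def multistage_group_delay(stages, input_rate_hz):
    """Total group delay in seconds of a multistage linear-phase FIR decimator.

    stages: list of (num_taps, decimation_factor) tuples, first stage first.
    Assumption: each filter runs at its stage's input rate, then the signal
    is downsampled by decimation_factor before the next stage.
    """
    rate = float(input_rate_hz)
    total_seconds = 0.0
    for num_taps, factor in stages:
        # A linear-phase FIR with N taps delays the signal by (N - 1) / 2 samples.
        total_seconds += (num_taps - 1) / 2.0 / rate
        rate /= factor
    return total_seconds


# Hypothetical two-stage decimator: 9 taps then /4, 17 taps then /2.
delay = multistage_group_delay([(9, 4), (17, 2)], input_rate_hz=1024)
```

The later stages run at lower rates, so each of their taps costs more time. This is one reason a minimum phase or IIR design may be attractive when low latency matters.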
P16-2 A Delta-Sigma Modulator Using Dual NTF for 1-Bit Digital Switching Amplifier—Jungmin Choi, Jaeyong Cho, Haekwang Park, Samsung Electronics - Suwon, Korea
In this paper a fifth-order single-loop single-bit delta-sigma modulator (DSM), constructed in cascade-of-integrators, feedback (CIFB) form, is proposed for a 1-bit digital audio switching amplifier. A high-order DSM can achieve a high signal-to-noise ratio (SNR), but there is a risk that oscillation occurs. To achieve high SNR and improve the stability of the modulator over a large input range, we propose a DSM with dual noise transfer functions (NTFs). One is a high-SNR mode that maximizes the SNR of the DSM, and the other is a stable mode that enhances its stability. The proposed architecture is simulated at the register transfer level (RTL) and implemented on an FPGA board.
Convention Paper 8649
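For readers unfamiliar with single-bit delta-sigma modulation as mentioned in the P16-2 abstract, the basic loop can be sketched as follows. This is a deliberately simplified first-order modulator, not the fifth-order CIFB dual-NTF design proposed in the paper. The function name and test signal are hypothetical.

```python
def first_order_dsm(samples):
    """Simplified first-order, single-bit delta-sigma modulator.

    samples: input values, assumed to lie within [-1.0, 1.0].
    Returns a list of +1.0 / -1.0 output values whose local average
    tracks the input.
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Integrate the error between the input and the previous 1-bit output.
        integrator += x - feedback
        # 1-bit quantizer.
        out = 1.0 if integrator >= 0.0 else -1.0
        bits.append(out)
        feedback = out
    return bits


# Constant input of 0.5: the mean of the bitstream should approach 0.5.
bits = first_order_dsm([0.5] * 400)
mean_output = sum(bits) / len(bits)
```

A first-order loop like this one stays stable for inputs within the quantizer range. Higher-order loops shape noise more aggressively but can oscillate at large inputs, which is the trade-off the abstract addresses.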
P16-3 9 Years HE AAC—Technical Challenges Using an Open Standard in Real-World Applications—Martin Wolters, Dolby Germany GmbH - Nuremberg, Germany; Gregory McGarry, Dolby Australia Pty Ltd., - Sydney, Australia; Andreas Schneider, Robin Thesing, Dolby Germany GmbH - Nuremberg, Germany
The technical work on creating the MPEG HE AAC standard was finished nine years ago. Since then the format has become very popular in specific markets and devices such as PCs and mobile phones, mainly due to its high compression efficiency. However, creating a reliable ecosystem based on this open standard remains a technical challenge. This paper presents the results of several compatibility tests conducted over the last two years on both mobile phones and broadcast receivers. The problems encountered and recommended solutions are described.
Convention Paper 8650
P16-4 Subjective Tests on Audio Mix Dedicated to MP3 Coding—Szymon Piotrowski, AGH University of Science and Technology - Krakow, Poland; Magdalena Plewa, Gdansk University of Technology - Gdansk, Poland
Over the past years the Internet has become very popular as a means of distributing audio. MP3 coded audio is present on the Internet, on buses, and in broadcast. Sound engineers agree that there is often a lack of control over the downstream processing that is applied to the final material. The aim of the presented work is to compare audio mixes dedicated to the CD format and converted to MP3 with MP3-dedicated productions, and to evaluate them.
Convention Paper 8651
P16-5 New Enhancements for Improved Image Quality and Channel Separation in the Immersive Sound Field Rendition (ISR) Parametric Multichannel Audio Coding System—Hari Om Aggrawal, ATC Labs - Noida, India; Deepen Sinha, ATC Labs - NJ, USA
Consumer audio applications such as satellite broadcasts, HDTV, multichannel audio streaming, gaming, and playback systems are highlighting newer challenges in low-bit-rate parametric multichannel audio coding. This paper describes the continuation of our research related to the Immersive Sound Field Rendition (ISR) parametric multichannel encoding system. We focus on the recent enhancements to the surround and center channel generation components of the ISR system. The emphasis is on improving the fidelity and quality of reconstructed 5/5.1-channel audio so that it achieves a level of transparency desirable for high-end applications. Furthermore, we attempt to improve the robustness of the coding scheme to various difficult signals and listening environments by reducing inter-channel leakage to a minimum. We describe challenging cases and various improvements to the ISR algorithm that address them, and also discuss the subjective impact of these algorithmic improvements.
Convention Paper 8652
P16-6 Novel Decimation-Whitening Filter in Spectral Band Replication—Han-Wen Hsu, Chi-Min Liu, National Chiao Tung University - Hsinchu, Taiwan
MPEG-4 high-efficiency advanced audio coding (HE-AAC) has adopted spectral band replication (SBR) to efficiently compress the high-frequency parts of the audio. In SBR, linear prediction is applied to the low-frequency subbands to clip the tonal components and smooth the associated spectrum for replication to the high-frequency bands. This process is referred to as whitening filtering. In SBR, to avoid alias artifacts from spectral adjustment, a complex filterbank is adopted instead of a real filterbank. For the QMF subbands, this paper shows that the linear prediction defined in the SBR standard results in predictive bias. A new whitening filter, called the decimation-whitening filter, is proposed to eliminate the predictive bias and provide advantages in terms of noise-to-signal ratio, frequency resolution, energy leakage, and computational complexity for SBR.
Convention Paper 8653
P16-7 MPEG Unified Speech and Audio Coding—The ISO/MPEG Standard for High-Efficiency Audio Coding of All Content Types—Max Neuendorf, Markus Multrus, Nikolaus Rettelbach, Guillaume Fuchs, Julien Robilliard, Jérémie Lecomte, Stephan Wilde, Stefan Bayer, Sascha Disch, Christian Helmrich, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Roch Lefebvre, Philippe Gournay, Bruno Bessette, Jimmy Lapierre, Université de Sherbrooke - Sherbrooke, Quebec, Canada; Kristofer Kjörling, Heiko Purnhagen, Lars Villemoes, Dolby Sweden AB - Stockholm, Sweden; Werner Oomen, Erik Schuijers, Philips Research Laboratories - Eindhoven, The Netherlands; Kei Kikuiri, NTT DOCOMO, INC. - Yokosuka, Kanagawa, Japan; Toru Chinen, Sony Corporation - Shinagawa, Tokyo, Japan; Takeshi Norimatsu, Chong Kok Seng, Panasonic Corporation; Eumi Oh, Miyoung Kim, Samsung Electronics - Suwon, Korea; Schuyler Quackenbush, Audio Research Labs - Scotch Plains, NJ, USA; Bernhard Grill, Fraunhofer IIS, Erlangen, Germany
In early 2012 the ISO/IEC JTC1/SC29/WG11 (MPEG) finalized the new MPEG-D Unified Speech and Audio Coding standard. The new codec brings together the previously separated worlds of general audio coding and speech coding. It does so by integrating elements from audio coding and speech coding into a unified system. The present publication outlines all aspects of this standardization effort, starting with the history and motivation of the MPEG work item, describing all technical features of the final system, and further discussing listening test results and performance numbers that show the advantages of the new system over current state-of-the-art codecs.
Convention Paper 8654