Session J: LOW BIT RATE AUDIO CODING - PART 1
Sunday, May 12, 09:00-13:00 h

Spectral Band Replication (SBR) is a novel technology that significantly improves the compression efficiency of perceptual audio codecs. SBR reconstructs the high-frequency components of an audio signal on the receiver side. It thus takes the burden of encoding and transmitting high-frequency components off the encoder, allowing for much higher audio quality at low data rates. In December 2001, SBR was chosen as the Reference Model for the MPEG standardization process for bandwidth extension. The paper will highlight the underlying technical ideas and the achievable efficiency improvements. A second focus will be a description of current and future applications of SBR.

Compared to the traditional waveform coding standards employing subband or transform coding, parametric coding is regarded as a technique that allows even further reduction of bit rates. An example of this is the MPEG-4 V2 HILN parametric coder. This coder targets medium quality at very low bit rates, a level of quality that was expected to be typical for parametric coders. In response to a call for proposals by MPEG, Philips has submitted a parametric coder targeting a significantly higher quality level. Currently, this coder is being further improved as a collaborative development in the course of the "MPEG-4 Extension I" standardization process.

This paper describes an error-resilient source coding approach for variable length codes. The scheme has been designed to make the Huffman-coded scale factor data of MPEG-2/4 Advanced Audio Coding (AAC) more resilient against transmission errors and has been adopted by the ISO/IEC 14496-3 (MPEG-4 Audio) standard. The scale factor coding scheme enables bi-directional decoding and enhanced error detection, taking into account the differential coding approach used for scale factor data.
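The bi-directional decoding idea can be illustrated with a small sketch. It is not the scheme adopted in the standard: it skips the variable length codes entirely and works on the raw differences, and it assumes that the decoder knows both the first and the last scale factor and is told which difference is corrupted (in a real decoder that position would have to come from the codeword decoding itself). All function names are hypothetical.

```python
def dpcm_encode(scale_factors):
    """Differentially code scale factors; also keep the last value as a backward anchor."""
    diffs = [b - a for a, b in zip(scale_factors, scale_factors[1:])]
    return scale_factors[0], diffs, scale_factors[-1]


def decode_forward(first, diffs):
    """Rebuild values from the first scale factor by accumulating differences."""
    values = [first]
    for d in diffs:
        values.append(values[-1] + d)
    return values


def decode_backward(last, diffs):
    """Rebuild values from the last scale factor by undoing differences in reverse order."""
    values = [last]
    for d in reversed(diffs):
        values.append(values[-1] - d)
    values.reverse()
    return values


def reconstruct(first, diffs, last, bad_index):
    """Recover all values when diffs[bad_index] is known to be corrupted (assumed input).

    Values up to the error are decoded forward, values after it are decoded backward.
    """
    head = decode_forward(first, diffs[:bad_index])
    tail = decode_backward(last, diffs[bad_index + 1:])
    return head + tail


if __name__ == "__main__":
    first, diffs, last = dpcm_encode([60, 62, 61, 65, 64])
    corrupted = list(diffs)
    corrupted[2] = 9  # simulate a transmission error in one difference
    # Forward decoding alone no longer ends on the known last value: error detected.
    print(decode_forward(first, corrupted)[-1] != last)
    print(reconstruct(first, corrupted, last, 2))
```

With forward-only decoding, a single corrupted difference shifts every later scale factor. Decoding from both ends confines the damage to the corrupted position, which is why a second anchor value helps with differentially coded data.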
Along with the basic ideas associated with the coding scheme, the paper presents appropriate error concealment strategies, which allow for sophisticated scale factor reconstruction. Subjective test results are presented to illustrate the performance of the discussed approach under adverse channel conditions.

This paper discusses a new DSP-based real-time implementation of the MPEG-2/4 AAC standards using a set of non-optimal solutions to reduce the computational load. These solutions comprise a novel implementation of an MDCT-based psychoacoustic model, managing a birth/death scheme that aims to overcome false power spectrum estimates due to the cosine transform; a smoothing of the scale factors guided by an estimate of the increase in quantization noise; and a loop-free bit allocation method based on an overestimate of the noise level. These techniques make the scheme both robust and efficient. Other fixed-point DSP programming considerations required to achieve the real-time implementation, including the choice of a fast MDCT algorithm, are presented.

In this paper, we present quantization techniques to improve the low-rate performance of a scalable audio coder. We show that a conditional enhancement-layer quantizer is effective in exploiting the statistical dependence of the enhancement-layer signal on the base-layer quantization parameters. It fundamentally extends our prior work on compander domain scalability, which was shown to be asymptotically optimal in the context of entropy-coded uniform scalar quantization, to systems with non-uniform base-layer quantization. Moreover, in the important case that the source is well modeled by the Laplacian density, we show that the optimal conditional quantizer is implementable with only two distinct switchable quantizers. Hence, major savings in bit rate are recouped at virtually no additional computational cost.
Further improvement in performance is achieved at the expense of computational complexity when the proposed quantization scheme is incorporated within an efficient "trellis-based" search for the quantization parameters. For example, we implemented the proposed scalable coder within the MPEG-AAC framework with four 16 kbps layers and achieved performance approximately equal to that of a 56 kbps non-scalable coder on the standard test database of 44.1 kHz audio.

This document seeks to determine the cost of audio coding in terms of its potential impact on the listener base, to identify the attributes and effects of competing digital audio compression algorithms, and finally to propose methodologies for maintaining the integrity of audio program material through the DAB delivery chain.

MP3PRO combines the advantages of a well-known worldwide standard with the potential of Spectral Band Replication (SBR). Using SBR as the key technology avoids the bandwidth limitation usually observed at low bit rates with traditional audio codecs such as MP3. Enhancing MP3 with SBR results in full audio bandwidth without annoying artifacts, even at bit rates below 64 kbps. In addition, the specific features of SBR allow MP3PRO to be forward and backward compatible with MP3. MP3PRO single-chip solutions from two major chip manufacturers will be available in the first quarter of 2002.
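As a rough illustration of the bandwidth extension idea behind SBR, the sketch below rebuilds a high band on the receiver side by copying decoded low-band coefficients upward and rescaling each copied sub-band to a transmitted target energy. This is a toy model under stated assumptions, not the SBR algorithm itself: the function name, the equal-width sub-bands, and the representation of the side information as one energy value per sub-band are all hypothetical.

```python
import math


def reconstruct_high_band(low_band, band_energies):
    """Toy bandwidth extension (assumed model, not the standardized SBR tool).

    low_band      -- coefficients decoded by the core codec
    band_energies -- one target energy per equal-width high-band sub-band,
                     standing in for the low-rate side information
    Any low-band coefficients left over after the equal split are ignored.
    """
    width = len(low_band) // len(band_energies)
    high_band = []
    for b, target in enumerate(band_energies):
        # Copy a slice of the low band up into the high band.
        patch = low_band[b * width:(b + 1) * width]
        energy = sum(c * c for c in patch)
        # Scale the copy so its energy matches the transmitted envelope value.
        gain = math.sqrt(target / energy) if energy > 0 else 0.0
        high_band.extend(gain * c for c in patch)
    return high_band


if __name__ == "__main__":
    high = reconstruct_high_band([4.0, 3.0, 2.0, 1.0], [2.5, 0.5])
    print([round(c, 3) for c in high])
```

Only the per-band energies need to be transmitted for the high band in this model, which is the intuition for why the approach saves bit rate compared with coding the high-frequency coefficients directly.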