Session C Saturday, May 12 13:30 - 18:00 hr Room B
Low-Bit Rate Audio Coding
Chair: Marina Bosi, MPEG LA, Denver, CO, USA

13:30 hr C-1 This paper describes a
processor-efficient implementation of a high quality MPEG-2 AAC encoder
employing fast psycho-acoustic analysis, efficient encoding of side
information, and SIMD instructions. A psycho-acoustic analysis in the MDCT
domain reduces computational costs. Smoothing of scale factors and optimized
selection of Huffman tables are introduced to efficiently encode the side
information. SIMD instructions are used heavily in the MDCT and quantization processes
to improve the encoding speed. Seven-grade comparison MOS test results show
that the AAC encoder at 96 kbps/stereo achieves sound quality equivalent to
that of MP3 at 128 kbps/stereo. The encoder runs 13 times faster than
real-time for stereo encoding on an 800 MHz Pentium III processor.

14:00 hr C-2 This paper presents a novel
lossless multichannel audio coding algorithm to remove inter-channel
redundancy. We employ an Integer-to-Integer Discrete Cosine Transform (INT-DCT)
to perform inter-channel decorrelation after quantization of Modified Discrete
Cosine Transform (MDCT) coefficients of individual channels. When compared with
a Karhunen-Loeve Transform (KLT) based approach, our new method has three major
advantages: 1) it avoids spreading quantization noise to other channels; 2) it
is computationally simple; 3) it uses less overhead information (a quantized
covariance matrix or eigenvector is not needed in our algorithm), while having a
similar decorrelation capability.

14:30 hr C-3 We propose a new approach to achieve efficient
scalability in audio coders, and demonstrate its performance using the MPEG-4
Advanced Audio Coder (AAC). In conventional scalable coding, the
enhancement layer performs straightforward re-quantization of the base-layer
reconstruction error. This coding scheme implicitly discards useful information
from the base layer, and does not truly minimize a perceptually meaningful
distortion criterion such as the noise-to-mask ratio. We reformulate the problem
of scalable coding within a companding framework, and show that re-quantization
in the compander's compressed domain achieves, in the asymptotic sense, optimal
scalability. Based on this observation, we develop a scalable AAC coder which
performs enhancement-layer quantization while exploiting all the information
available at that layer. Simulation results for a two-layer scalable coder on
the standard test database of audio sampled at 44.1 kHz show that the proposed
approach yields substantial savings in bit rate for a given reproduction
quality.

15:00 hr C-4 We propose an MPEG-1 Layer III
conforming audio codec for multiple-generation (cascaded) compression without
loss of perceptual quality. Previous research addressing this topic mainly
focused on less complex coding schemes like MPEG-1 Layer II. In our paper, the
techniques proposed in those approaches are extended to comply with Layer III's
advanced features such as hybrid filtering, block switching, and
bit-reservoir-based processing. A prototype implementation, together with
extensive listening tests, shows the feasibility of perceptually stable cascaded Layer III
compression.

15:30 hr C-5 An adaptive wavelet packet (WP) tree derived via dynamic algorithm
transforms (DATs) is presented. The DAT defines parameters of the input audio
signals (subband entropy) and of the output coded sequences (subband rate) for
the given embedded system architecture. A DAT-based pipeline processor that
implements the WP tree analysis (encoder) and synthesis (decoder) algorithms on
reconfigurable hardware (such as SRAM-FPGA plus distributed arithmetic) is described.

16:00 hr C-6 This paper deals with the design and implementation of a
scheme for CD-quality audio coding that introduces a delay as low as 6 ms and
provides near-transparent coding. It is implemented with a uniform filter bank that
decomposes the audio signal into thirty-two bands. For such a delay, the filters
in the filter bank must have a short impulse response, so the filters'
non-ideal frequency responses should be taken into account. Near-transparent
coding is achieved at 96 kbps, which is a very good result for such a low delay.

16:30 hr C-7 The HILN (Harmonic and Individual Lines plus Noise)
MPEG-4 parametric audio coding tool allows efficient representation of general
audio signals at very low bit rates. Possible applications therefore include
transmission over IP or wireless channels, both of which are characterized by
specific transmission error models. However, since parametric audio
coding is a relatively new technique compared to transform coding and CELP
speech coding, there have been only very limited investigations of HILN's
behavior in error-prone environments. In this paper we present an analysis of
error sensitivities and approaches to error protection and concealment.

17:00 hr C-8 A new algorithm that reduces the block effect while avoiding the
overlapping of adjacent frames is presented. The algorithm is based
on forward and backward prediction at the frame borders and has been applied
successfully to an audio coder based on time-varying wavelet-packet
decompositions that uses symmetrical and periodic extension as its method for
processing frames in isolation.

17:30 hr C-9 MPEG natural audio coders, such as MPEG-1/2 Layer III and
MPEG-2/4 AAC, require a large amount of
computation, mainly due to the iterative bit allocation processes proposed by
the ISO/IEC technical documentation. This complexity makes a real-time
implementation of the normative algorithms difficult. To address this difficulty,
this paper discusses a set of non-optimal solutions that reduce the computing
load and, based on these solutions, presents a real-time implementation of MPEG-1/2
Layer III on a single fixed-point DSP. In addition, techniques
to achieve good audio performance and methods for the adaptation of
parameters are discussed.
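
The iterative bit allocation mentioned in C-9 can be illustrated with a minimal Python sketch of a rate loop: the quantizer step is coarsened one notch at a time, and the whole frame is re-quantized and re-counted on every pass until it fits the bit budget. This is an illustration under stated assumptions, not the method of the paper. The power-law quantizer is simplified, `estimate_bits` is a hypothetical stand-in for real Huffman table lookups, and the names `quantize`, `rate_loop`, and `bit_budget` are invented for this example.

```python
def quantize(coeffs, gain):
    """Simplified power-law quantizer: a larger gain means a coarser step."""
    step = 2.0 ** (gain / 4.0)
    out = []
    for c in coeffs:
        q = int((abs(c) / step) ** 0.75 + 0.5)
        out.append(q if c >= 0 else -q)
    return out


def estimate_bits(quantized):
    """Hypothetical bit-cost proxy (NOT a real Huffman table lookup).

    Zeros cost 1 bit; other values cost roughly twice their bit length
    plus a sign bit.
    """
    return sum(1 if v == 0 else 2 * abs(v).bit_length() + 1 for v in quantized)


def rate_loop(coeffs, bit_budget, max_gain=255):
    """Coarsen the quantizer until the frame fits into bit_budget.

    Returns (gain, quantized values, bits used, number of passes).
    """
    for gain in range(max_gain + 1):
        quantized = quantize(coeffs, gain)   # full re-quantization each pass
        bits = estimate_bits(quantized)      # full bit count each pass
        if bits <= bit_budget:
            return gain, quantized, bits, gain + 1
    raise ValueError("bit budget too small even at the coarsest step")


# Made-up spectral coefficients, for illustration only.
coeffs = [1000.0, -500.0, 250.0, 0.0, 10.0, -3.0, 800.0, 1.5]
gain, quantized, bits, passes = rate_loop(coeffs, bit_budget=40)
print("gain:", gain, "bits:", bits, "passes:", passes)
```

Each pass repeats quantization and bit counting for the entire frame, so the cost grows with the number of passes. This is the kind of computing load that the non-optimal solutions discussed in the paper aim to reduce.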