Session Q Monday, December 3 2:00 pm-5:00 pm

2:00 pm Markus Erne, Scopein Research, Aarau, Switzerland and AES Technical Committee on Audio Coding

Low bit-rate audio coding has
become a widely used technology in recent years. Because it uses
sophisticated signal processing techniques that exploit psychoacoustic
phenomena, nontransparent coding results in artifacts that sound very different
from traditional distortions and are frequently not obvious at all to the
untrained listener. The AES Technical Committee on Audio Coding therefore has
started an activity to produce a CD-ROM which presents some of the most common
coding artifacts in more detail. The CD-ROM not only explains and comments on each
of the coding artifacts separately but also presents audio examples for each artifact,
using different degrees of distortion, ranging from "subtle"
to "obvious". Convention Paper 5489

2:30 pm Michael J. Smithers and Matt C. Fellers, Dolby Laboratories, Inc., San Francisco, CA, USA

Presented are modifications to the MPEG-2 AAC encoder
that significantly increase computational efficiency while maintaining high
sound quality. These modifications include changes to the perceptual model,
block-switching control, pre-estimation of quantizer scale-factors, and changes
to the quantizer rate/distortion loop. These changes result in an overall
speed-up (when combined with processor-specific optimizations) of approximately
250% compared to the reference low-complexity professional MPEG-2 AAC encoder.
Tests show a mean degradation of 0.2 on the ITU-R 5-point audio impairment
scale. Convention Paper 5490

3:00 pm Sang-Wook Kim, Sung-Hee Park and Yeon-Bae Kim, Samsung Advanced Institute of Technology, Suwon, Korea

The previous MPEG-1 and MPEG-2 Audio
standards provided a single bitrate, single bandwidth tool set, with different
configurations of that tool set specified for use in various applications.
MPEG-4 provides several bitrate and bandwidth options within a single
bitstream, providing a scalability functionality that permits a given bitstream
to scale to the requirements of different channels and applications or to be
responsive to a given channel that has dynamic throughput characteristics. Many
of the tools specified in MPEG-4 are state-of-the-art tools providing scalable
compression of speech and audio signals. In this paper, we present the
fine-grain scalability tool in MPEG-4 Audio. Convention Paper 5491

3:30 pm Chris Dunn, Scala Technology, London, UK

A comparison of audio coder quantisation schemes that
offer fine-grain bitrate scalability is made with reference to fixed-rate
quantisation. Coding efficiency is assessed in terms of the number of bits
allocated to significant transform coefficients, and the average number of
significant coefficients coded. A new method of arranging the transform
hierarchy for SPIHT zero tree algorithms is shown to result in significantly
improved performance relative to previously reported SPIHT implementations.
Results for a new quantisation algorithm are presented which suggest
low-complexity fine-grain scalable coding is possible with no coding efficiency
penalty relative to fixed-rate coding. Convention Paper 5492

4:00 pm Michael Truman, John White and Michael Smithers, Dolby Laboratories, Inc., San Francisco, CA, USA

A key requirement of interactive
applications is that they respond quickly to user input. This demands that the
audio signal processing be performed with minimal delay. Generally, perceptual
audio coders are in conflict with this requirement because they process data in
long blocks to improve compression performance. This paper describes a
real-time, multi-channel audio encoder that is designed to minimize delay and is
compatible with current consumer home theatre decoding technology. Convention Paper 5493

4:30 pm Shyh-shiaw Kuo and James D. Johnston, AT&T Labs - Research, Florham Park, NJ, USA

We have studied time-domain cross-channel prediction and concluded
that it is generally not applicable to
perceptual audio coding. Convention Paper 5494