Session M
Tuesday, May 15, 8:30 - 12:30 hr, Room B
Signal Processing for Audio, Part 2
Chair: John Mourjopoulos, University of Patras, Patras, Greece

8:30 hr M-1
The primary aim of this paper is to show that it is
possible to localize the direction of an incoming acoustical signal using a
neural network trained for that purpose. Consequently, the automatically
localized acoustical signal may be attenuated if it obscures the desired target
sound. A set of parameters was formulated in order to localize the target source
and unwanted signals. A neural network-based system was designed and
implemented to process acoustical signals arriving from various directions at
the same time. The feature extraction method is discussed thoroughly, the
training process is described, and recently obtained results are presented.

9:00 hr M-2
Recently introduced, the ISO/MPEG-2 Advanced Audio Coding
(AAC) coding technology provides a powerful framework which covers almost any
application from simple monophonic compression to full-featured multi-channel
coding. This paper discusses approaches and results of an implementation effort
of a 5.1 AAC multi-channel encoder on a high performance floating point catalog
DSP platform. Based on a new AAC coding strategy, this leads to a very
efficient encoder on a single DSP, thus enabling cost-effective high-quality
multi-channel encoding also for consumer type applications.

9:30 hr M-3
This paper presents practical recipes for the processing
of DSD-Wide [64FS 8-bit] signals which are fully compatible with the DSD [64FS
1-bit] signals used by the SACD consumer audio format. The designs are
presented in a schematic form compatible with implementation by interested
engineers in either FPGA or (with some modification) by traditional DSP
methods. This is intended to open up the processing of such Super High Fidelity
signals to a wider audience.

10:00 hr M-4
Frequency-warped filters have recently been applied
successfully to a number of audio applications. Replacing the unit delays in
digital filters with all-pass delay elements focuses enhanced frequency
resolution on the lowest (or highest) frequencies and enables a good match to
the psycho-acoustic Bark scale. Kautz filters can be seen as a
further generalization where each transversal element may be different,
including complex conjugate poles. This enables arbitrary allocation of
frequency resolution for filter design, such as modeling and equalization
(inverse modeling) of linear systems. In this paper we formulate strategies for
using Kautz filters in audio applications. Case studies of loudspeaker
equalization, room response modeling, and guitar body modeling for sound
synthesis are presented.

10:30 hr M-5
In this paper, we present a segmentation algorithm for
acoustic musical signals, using a hidden Markov model. Through unsupervised
learning, we discover regions in the music that present steady statistical
properties: textures. We investigate different front-ends for the system, and
compare their performance. We then show that the obtained segmentation often
corresponds to a structure explained by musicology: chorus and verse, different
instrumental sections, etc. Finally, we discuss the necessity of the HMM and
conclude that an efficient segmentation of music is more than a static
clustering and should make use of the dynamics of the data.

11:00 hr M-6
Fuelled by the digital revolution, efficient data
reduction schemes and the break-through of the Internet, an ever-increasing
amount of audio material has become available in digital format recently.
Efficient handling and identification of these items are
becoming extremely important for managing such amounts of content. This paper
describes a prototype system for content-based identification of audio
material based on a database of registered works. The technical approach is
outlined and the system's current performance and the range of possible
applications are discussed.

11:30 hr M-7
We developed an artificial reverberation device based on
a novel, time-variant orthogonal matrix feedback delay network topology. Our
novel topology uses multiple, time-variant output taps for each delay line, and
therefore simultaneously reduces the amount of delay memory required without
introducing coloration, and increases the echo density. Furthermore, the system
is guaranteed to be stable provided that certain constraints on the delay line
lengths and tap weights are fulfilled. Our implementation on a 24-bit digital
signal processor requires only 16384 words of delay line memory for a four
channel input / four channel output reverb.

12:00 hr M-8
Equivalent masking noise estimation could be introduced
in conventional broad-band acoustic noise reduction to provide a new class of
modified techniques. The psycho-acoustical facts exploited in this paper
result in a frequency-dependent parametric Wiener filter. A discussion of
classical spectral subtraction, and a proof of its equivalence to Wiener
filtering under certain conditions, is given first. The concept of the
parametric Wiener filter is then examined, and a frequency dependence based on
the model of pure tones masked by broad-band noise is introduced. Finally,
filter-bank, STFT, and wavelet implementations of the new approach are compared
to classical spectral subtraction for background noise reduction in old 78 rpm
music disc recordings and noisy speech tape recordings.
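
Editorial illustration (not taken from the paper): the relationship between spectral subtraction and Wiener filtering mentioned in M-8 can be sketched with the textbook parametric Wiener gain G = [Ps / (Ps + alpha * Pn)] ** beta, where the clean-signal power Ps is estimated as the noisy power minus the noise power. With alpha = 1 and beta = 0.5 this gain coincides with classical power spectral subtraction. The function names, the parameters `alpha`, `beta`, and `floor`, and the sample values are assumptions chosen for illustration; the frequency-dependent, masking-based parameter choice proposed in the paper is not reproduced here.

```python
import math


def parametric_wiener_gain(noisy_power, noise_power, alpha=1.0, beta=1.0, floor=0.0):
    """Per-bin gain [Ps / (Ps + alpha * Pn)] ** beta (hypothetical helper).

    Ps is estimated as max(Px - Pn, 0), where Px is the noisy-signal power
    and Pn the noise power estimate in the same frequency bin.
    """
    gains = []
    for px, pn in zip(noisy_power, noise_power):
        ps = max(px - pn, 0.0)              # estimated clean-signal power
        denom = ps + alpha * pn
        g = (ps / denom) ** beta if denom > 0.0 else 0.0
        gains.append(max(g, floor))         # optional spectral floor
    return gains


def power_subtraction_gain(noisy_power, noise_power):
    """Classical power spectral subtraction written as a magnitude gain."""
    return [
        math.sqrt(max(px - pn, 0.0) / px) if px > 0.0 else 0.0
        for px, pn in zip(noisy_power, noise_power)
    ]


# Illustrative per-bin powers (made-up values, not measured data).
noisy = [4.0, 2.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0, 1.0]

wiener_like = parametric_wiener_gain(noisy, noise, alpha=1.0, beta=0.5)
subtraction = power_subtraction_gain(noisy, noise)
```

In this sketch the two gain lists agree bin by bin, and bins where the noise estimate exceeds the noisy power are set to zero. A frequency-dependent variant would pass a different `alpha` for each bin instead of a single constant.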