Session G
Sunday, May 13, 13:30 - 18:00 hr, Room B
Analysis and Synthesis of Sound
Chair: Matti Karjalainen, Helsinki University of Technology, Espoo, Finland

13:30 hr G-1
This paper presents a novel approach for pitch
detection of musical sound signals using the signal-adapted wavelet transform
(WT). As the effectiveness of the wavelet transform for a particular
application depends on the choice of the wavelet function, a wavelet function
derived from the input power spectral density (PSD) is designed to concentrate
the signal energy in the low-frequency region. Based on the corresponding
wavelet transform, a time-based event detection method is proposed to extract
the pitch period information from the wavelet coefficients. Because the
wavelet is signal-adapted and presents the characteristics of the signal,
the pitch detector is suitable for
sound signals with fundamental frequencies ranging from 50 to 4000 Hz. The
simulation results and experiments on real music demonstrate that the main
features of this method are higher accuracy than comparable methods for pitch
period estimation and robustness to noise.

14:00 hr G-2
Two generic mechanisms are proposed that facilitate the
efficient integration of audio content analysis algorithms. The first
mechanism, priority-rule-based interleaving of algorithms, allows the
simultaneous interoperation of several bottom-up analysis modules by
interleaving their atomic steps. It aims at increased accuracy through
controlled manipulation of common data. The second mechanism, top-down routing
of requests for data, allows high-level predictions to direct the bottom-up
analysis towards verifying the predicted hypotheses by observations. Examples
from automatic music transcription are presented to clarify the use of the
proposed methods.

14:30 hr G-3
A real-time synthesis engine is presented which
models and predicts the "timbre" of different acoustic instruments
based on perceptual features. The paper describes the modeling sequence
including the analysis of natural sounds, the inference step that finds the
mapping between control and output parameters, the timbre prediction step, and
the sound synthesis. Demonstrations include the timbre synthesis of stringed
instruments and the singing voice, as well as the cross-synthesis and timbre
morphing between these instruments.

15:00 hr G-4
There are several different audio frequency scales in
common use, each having its own particular merits. The speech-based frequency
scale derived here, from vowel formant frequency difference limens, has a
markedly different shape from the others and attaches more relative weight to
the range of frequencies associated with vowel perception, making it
potentially well suited to speech analysis applications.

15:30 hr G-5
A novel music analysis/synthesis method is proposed. The
basic structure consists of a delay line, a feedback filter, and a short
wavetable as the excitation signal. Because most musical tones are
quasi-periodic, the feedback filter predicts the next input data to the delay
line based on the signal in the delay buffer. The filter coefficients are
obtained in the analysis process performed by using source signals as the
teaching vector and a recurrent neural network learning procedure. Because the
basic architecture is identical to that of Digital Waveguide Filters (DWF),
most of the efficient supplemental processing and implementation techniques for
DWF can be applied. Instead of a fixed-length delay line, a variable-length
delay line and a control method are embedded when a wide-range portamento is
required. The proposed method is currently applied to synthesize
plucked-string instruments.

16:00 hr G-6
This paper presents new proposals for audio restoration
and enhancement based on Sound Source Modeling. We describe a case based on the
commuted waveguide synthesis algorithm for plucked string tones. The main motivation
is to take advantage of prior information from generative models of sound
sources when restoring or enhancing musical signals.

16:30 hr G-7
This paper presents a technique for synthesizing prosody
based upon information extracted from spoken utterances. We are interested in
designing systems that learn how to speak autonomously, by interacting with
humans. Our motivation for an in-depth investigation of prosody is prompted by
the fact that infants seem to exhibit acute prosodic listening during the first
months of life. We presume that any system aimed at learning some form of
speaking skills should display this fundamental capacity. This paper addresses
two fundamental components for the development of such systems: prosody
listening and prosody production. It begins with a brief introduction to the
problem within the context of our research objectives. Then it introduces the
system and presents some commented examples. The paper concludes with final
remarks and a brief discussion of future developments.

17:00 hr G-8
A method is described with which two stable sinusoids can
be represented by a single sinusoid with time-varying parameters and, under
some conditions, approximated by a stable sinusoid. The method is utilized in
an iterative sinusoidal analysis algorithm, which combines the components
obtained in different iteration steps using the described method. The proposed algorithm
improves the quality of the analysis at the expense of an increased number of
components.

17:30 hr G-9
The discussion of retrieving musical data from the Internet or from multimedia
databases, which has been going on for some time now, has not yet reached the
final stage of practical application. There are still
many problems related to the automatic recognition of music or musical
instrument sounds that cannot be solved easily. It is especially important to
find adequate parameters of the musical signal based on time, frequency,
and/or wavelet analyses. The proposed feature vectors were derived from
constructed databases of recorded musical sounds. The presented study shows
some methods for the automatic identification of musical instruments based on
both classical statistical and soft-computing approaches, which were then used
to classify musical instruments. The results of these investigations are
presented and analyzed, leading to some specific and some more general
conclusions.
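The feature-vector-plus-statistical-classifier pipeline described in G-9 can be sketched as follows. This is a minimal illustration, not the paper's actual method: the two features (zero-crossing rate and spectral centroid, one time-domain and one frequency-domain parameter), the synthetic "instrument" tones, and the nearest-centroid rule are all assumed stand-ins chosen to show the general shape of such a system.

```python
# Hypothetical sketch of instrument identification: extract a small feature
# vector with time- and frequency-domain analysis, then classify it with a
# nearest-centroid rule (a simple classical statistical approach).
import numpy as np

def feature_vector(x, sr):
    """Two illustrative features: zero-crossing rate and spectral centroid."""
    zcr = np.mean(np.abs(np.diff(np.signbit(x).astype(float))))
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    centroid = np.sum(freqs * mag) / np.sum(mag)
    return np.array([zcr, centroid / (sr / 2)])  # centroid normalized to [0, 1]

def nearest_centroid(train, labels, query):
    """Label `query` with the class whose mean feature vector is closest."""
    classes = sorted(set(labels))
    centroids = {c: np.mean([f for f, l in zip(train, labels) if l == c], axis=0)
                 for c in classes}
    return min(classes, key=lambda c: np.linalg.norm(query - centroids[c]))

sr = 8000
t = np.arange(sr) / sr

def tone(f0, brightness):
    """Toy harmonic tone; `brightness` sets how strongly harmonics roll off."""
    return sum(brightness ** k * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k in range(8))

# Toy stand-ins for two instrument classes: dark low tones vs. bright high tones.
train = [feature_vector(tone(f0, b), sr)
         for f0, b in [(110, 0.3), (130, 0.3), (440, 0.9), (520, 0.9)]]
labels = ["dark", "dark", "bright", "bright"]

query = feature_vector(tone(120, 0.3), sr)
print(nearest_centroid(train, labels, query))  # a low, dull tone -> "dark"
```

A real system of the kind the abstract describes would of course use a much richer feature vector (and possibly wavelet-based parameters) and databases of recorded instrument sounds rather than synthetic tones.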