Meeting Topic: Annual General Meeting - 2010
Moderator Name: Géza Balogh Dr.
Speaker Name: Géza Balogh Dr. - Sándor Steinbach - Kristóf Aczél PhD (NOKIA) - Miklós Alexy, Dr. - János Gyori
Other business or activities at the meeting: Election of a new committee member
Meeting Location: BME Informatics Building (Informatikai épület), B110
Kristóf Aczél PhD introduced his work:
'Separation of polyphonic recordings using instrument prints'
1. Overview and goals
Many open questions are still being investigated in the field of sound processing. One of the main research areas is the analysis, interpretation, modification and correction of polyphonic music. When a musical event is recorded, post-processing is often needed because the recording contains misplayed notes, whether out of tune or incorrectly intonated. Several methods are used to keep such imperfections out of the recording that finally reaches the listeners.
One of the most popular methods is cutting the musical material. The same musical content (or parts of it) is performed and recorded multiple times. Afterwards it is the task of the editor to choose, from the many takes of each part, the one that sounds best. The chosen parts are then pieced together with the help of sound editing software. Sometimes a segment as short as a single note is cut out of the recording and substituted with a copy of the same note from another take. In this case the other notes in the same time segment are inevitably replaced with their copies from the other take as well, which may not be the intention.
The other commonly used method, multitrack recording, records the sound of each instrument (or at least the most important ones) and the voice of the singer to separate channels using dedicated microphones. This approach allows each channel to be modified independently of the others. Multitrack recording is very popular in pop music, where the voice of the singer is often improved by automatic tone correction methods or manipulated in other ways. This makes the recorded music in tune even if the real performance of the artist was out of tune.
However, the most interesting cases of sound manipulation are those polyphonic recordings that for some reason cannot be retaken (e.g. old recordings). Neither are there any musical fragments available to be used in place of the incorrect notes in these recordings, nor is the multichannel representation available, only the original mono or stereo source signal. In these cases altering the musical structure in any way requires the decomposition of the signal. The multichannel representation of the signal needs to be generated algorithmically from the original signal, or at least the note to be fixed must be isolated to a separate channel. After this step we can choose from a vast set of methods to alter the note signal to fit our needs, after which the separated signal can be mixed back to the recording.
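As a rough illustration of the separate, alter and remix workflow described above, the sketch below assumes that an estimate of the isolated note signal is already available. It is not the separation method of the thesis. The function names, the toy sample values and the simple gain change used as the "correction" are all hypothetical placeholders chosen for illustration.

```python
from typing import Callable, List, Sequence


def remix_with_corrected_note(
    mixture: Sequence[float],
    note: Sequence[float],
    correct: Callable[[Sequence[float]], Sequence[float]],
) -> List[float]:
    """Replace an isolated note inside a mixture with a corrected version.

    mixture: samples of the original mono recording
    note:    estimate of the isolated note (assumed given, same length)
    correct: any length-preserving alteration of the note signal
    """
    if len(mixture) != len(note):
        raise ValueError("mixture and note must have the same length")

    # Everything in the recording except the isolated note.
    residual = [m - n for m, n in zip(mixture, note)]

    # Alter the isolated note (hypothetical placeholder for a real correction).
    fixed = list(correct(note))
    if len(fixed) != len(residual):
        raise ValueError("correction must preserve the signal length")

    # Mix the altered note back into the rest of the recording.
    return [r + f for r, f in zip(residual, fixed)]


def scale_gain(factor: float) -> Callable[[Sequence[float]], List[float]]:
    """Toy 'correction': multiply every sample of the note by a constant."""
    return lambda samples: [factor * s for s in samples]


# Toy data, not taken from any real recording.
mixture = [0.5, 0.25, -0.5, 0.0]
note = [0.25, 0.25, -0.25, 0.0]

unchanged = remix_with_corrected_note(mixture, note, scale_gain(1.0))
muted = remix_with_corrected_note(mixture, note, scale_gain(0.0))
```

With a gain of 1.0 the remix reproduces the original mixture, and with a gain of 0.0 the note is removed and only the rest of the recording remains. In practice the quality of the result depends on how well the note was isolated in the first place.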
The goal of my research was to investigate the problem of sound source separation and propose methods in this field that are particularly applicable to polyphonic music signals. I considered the separation quality the most important property of the proposed separation method; therefore I set the following goals:
a. separation of any user-specified notes from the polyphonic recording,
b. keeping the quality of the separated note signals as high as possible.
I considered the following issues less important:
a. separation of each and every note in the recording, that is, generating the complete multitrack representation of the music signal, as it is not needed for fixing incorrect notes, but brings unnecessary complexity to the separation system or may even lower the separation quality;
b. creating a fully automated separation system, where the multichannel representation is generated 'by one click', as actively involving the user may allow for higher quality output;
c. handling the noise, as there are many capable methods in that area;
d. real-time operation.
My contribution comprises the proposal of a global architecture for a polyphonic sound source separation system, the elaboration of its algorithms and modules, and the fine-tuning of the proposed methods.
More here: http://www.aes-hu.org/docs/AczelKristofTezisfuzet12_Eng.pdf
and here: http://www.aes-hu.org/docs/KristofAczelPhDThesis.pdf
Samples for listening are available here: http://avalon.aut.bme.hu/~aczelkri/separation/
Written By: János Gyori