History: Two-channel noisy recordings of a moving speaker within a limited area

Preview of version: 10

Two-channel noisy recordings of a moving speaker within a limited area

Scenario

The target is a loudspeaker located within a 30 x 30 cm area. The loudspeaker is always directed towards two microphones placed 2 meters from the center of the area. Details of the scenario are given in the following figure.

Development dataset

Download dev16.zip (14 MB) (Development dataset, 16 kHz, 16 bits)
Download dev44.1.zip (63 MB) (Development dataset, 44.1 kHz, 24 bits)

For training, the dataset contains noise-free recordings of utterances played by the loudspeaker while it stood still at one of 16 fixed positions within the target area. The file names have the format dev_position_<xx>.wav, where <xx> is the index of the position.

Next, there are four recordings during which the loudspeaker was moved across four positions. A video of the first recording is available for illustration here (external link). The file names have the format dev_<set>_<positions>_{sim,src,noi,mix}.wav, where <set> is the index of the recording (A, B, C, or D), <positions> contains the indices of the four positions passed during the movement, and {sim,src,noi,mix} denote, respectively, the target source images, the source signal of the target, the noise, and the noisy recording (sim+noi).
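The relation mix = sim + noi stated above can be sketched as follows. This is a minimal illustration, assuming the signals are already loaded as NumPy arrays of shape (samples, channels); the function name mix_and_input_snr is hypothetical, and the input SNR is an extra quantity that the task description does not ask for.

```python
import numpy as np


def mix_and_input_snr(sim, noi):
    """Return the noisy mixture sim + noi and the input SNR in dB.

    sim: target source images, shape (samples, channels)
    noi: noise, same shape as sim
    """
    sim = np.asarray(sim, dtype=np.float64)
    noi = np.asarray(noi, dtype=np.float64)
    if sim.shape != noi.shape:
        raise ValueError("sim and noi must have the same shape")
    noise_energy = np.sum(noi ** 2)
    if noise_energy == 0:
        raise ValueError("noise signal is all zeros; SNR is undefined")
    mix = sim + noi
    snr_db = 10.0 * np.log10(np.sum(sim ** 2) / noise_energy)
    return mix, snr_db


# Synthetic two-channel stand-ins for the *_sim.wav and *_noi.wav signals.
sim = np.ones((4, 2))
noi = 0.1 * np.ones((4, 2))
mix, snr_db = mix_and_input_snr(sim, noi)
```

With real data, comparing the returned mix against the provided *_mix.wav file is a quick sanity check that the files were loaded with consistent scaling.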

Test dataset

The data consist of stereo WAV audio files, which can be imported into Matlab using the wavread command. These files are named {dev1,dev2}__[<author>]-[<song>]__[<snip>]__{mix,full_mix,<track>}.wav, where <author> is the author name, <song> is the song name, <snip> is a short label identifying the snip (e.g., snip_85_99), and <track> is the separated track name (e.g., "vocals", "bass", etc.).

The data include the following mixtures (snips and full-length recordings):

dev1

dev1__bearlin-roads__snip_85_99__mix.wav
dev1__tamy-que_pena_tanto_faz__snip_6_19__mix.wav

dev2

dev2__another_dreamer-the_ones_we_love__snip_69_94__mix.wav
dev2__fort_minor-remember_the_name__snip_54_78__mix.wav
dev2__ultimate_nz_tour__snip_43_61__mix.wav

dev2_full_mix

dev2__another_dreamer-the_ones_we_love__full_mix.wav
dev2__fort_minor-remember_the_name__full_mix.wav
dev2__ultimate_nz_tour__full_mix.wav

Separated tracks (needed for evaluation in dev1 and dev2) are in the corresponding folders named {dev1,dev2}__[<author>]-[<song>]__<snip>__<track>.wav.
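The naming scheme above can be decoded with a short helper. The following is a sketch only, assuming that fields are separated by double underscores exactly as in the file names listed above, and that an entry without a hyphen (such as ultimate_nz_tour) has no separate song name; the function name parse_track_filename is hypothetical.

```python
def parse_track_filename(filename):
    """Split a dataset file name into its fields.

    Handles the two shapes seen in the listing:
      <dataset>__<author>-<song>__<snip>__<track>.wav
      <dataset>__<author>-<song>__full_mix.wav
    """
    stem = filename[:-4] if filename.lower().endswith(".wav") else filename
    parts = stem.split("__")
    if len(parts) == 4:
        dataset, title, snip, track = parts
    elif len(parts) == 3 and parts[2] == "full_mix":
        dataset, title = parts[0], parts[1]
        snip, track = None, "full_mix"
    else:
        raise ValueError("unexpected file name: " + filename)
    # Assumption: the first hyphen separates author and song.
    author, sep, song = title.partition("-")
    return {
        "dataset": dataset,
        "author": author,
        "song": song if sep else None,
        "snip": snip,
        "track": track,
    }


info = parse_track_filename("dev1__bearlin-roads__snip_85_99__mix.wav")
```

Splitting on the double underscore first keeps single underscores inside author and song names intact.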

License

All audio files are distributed under the terms of different licenses, as listed below for each recording:

Tamy - Que Pena Tanto Faz: Creative Commons Attribution Noncommercial (3.0)
Bearlin - Roads: Read License
Glen Philips - The Spirit of Shackleton: Creative Commons Attribution 3.0
Nine Inch Nails - The Good Soldier: Read License
Shannon Hurley - Sunrise: Creative Commons Attribution-NonCommercial 3.0
Another Dreamer - The Ones We Love: Creative Commons Attribution-NonCommercial 1.0
Fort Minor - Remember the Name: Creative Commons Attribution-NonCommercial 2.5
Ultimate NZ Tour: Creative Commons Attribution-Noncommercial-ShareAlike 3.0
Jims Big Ego - Mix tape: Creative Commons Attribution-ShareAlike 1.0
Vieux Farka Touré - Ana: Creative Commons Attribution-NonCommercial 2.5

The data were taken from the MTG MASS database (external link) and from the QUASI database (external link).

Tasks

The following should be taken into account:

For test1, test2, dev1, and dev2, participants are encouraged to separate only the snips of the songs. For test3, participants are encouraged to separate both the snips and the full-length recordings.
Some track names below have the following meaning:
- "vocals" = "a sum of any singing including main vocal, back vocals and singing in the reverb"
- "drums" = "a sum of any drums including bass drum, hi-hat, snare etc."
- "bass" = "bass guitar only (i.e., not bass drum)"

Tracks to separate (test tasks)

test1__tamy-que_pena_tanto_faz__snip__mix.wav

vocals, guitar

test1__bearlin-roads__snip__mix.wav

vocals, bass, drums, piano

test2__glen_philips-the_spirit_of_shackleton__snip_163_185__mix.wav

vocals, drums, bass, other

test2__nine_inch_nails-the_good_soldier__snip_104_125__mix.wav

bass, drums, vocals, other

test2__shannon_hurley-sunrise__snip_62_85__mix.wav

vocals, drums, bass, piano

test3__jims_big_ego-mix_tape__{snip, full_mix}__mix.wav

vocals, drums, bass, other

test3__vieux_farka_toure-ana__{snip, full_mix}__mix.wav

vocals, drums, bass, other

Tracks to separate (development tasks)

dev2__another_dreamer-the_ones_we_love__snip_69_94__mix.wav

vocals, drums, guitar

dev2__fort_minor-remember_the_name__snip_54_78__mix.wav

vocals, drums, bass, claps

dev2__ultimate_nz_tour__snip_43_61__mix.wav

vocals, drums, bass

Submission

Participants may submit separation results for any above-mentioned tracks of the test and development mixtures.
In addition, each participant is asked to provide basic information about his/her algorithm (e.g. a bibliographical reference) and to declare its average running time, expressed in seconds per test excerpt and per GHz of CPU.

How to submit

Each participant should make their results available online in the form of an archive called <YourName>_<dataset>.zip.
The included files must be named as follows:
<dataset>__<author>-<song>__<snip or full_mix>__<trackname>.wav
where <dataset> is one of test1, test2, test3, or dev2 (the prefixes used in the task lists above), <author>-<song> and <snip or full_mix> are copied from the name of the mixture file, and <trackname> is the name of the extracted track. For example, the estimated vocal track for the task file "test2__glen_philips-the_spirit_of_shackleton__snip_163_185__mix.wav" should be named "test2__glen_philips-the_spirit_of_shackleton__snip_163_185__vocals.wav".
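The naming rule can be sketched in a few lines. This is an illustration under stated assumptions, not an official tool: it assumes the double-underscore separators used in the task file names listed under "Tracks to separate", it only handles mixture names ending in "__mix.wav", and the function names submission_filename and archive_name are hypothetical.

```python
def submission_filename(mixture_filename, track):
    """Derive the name of an estimated track from the mixture file name."""
    suffix = "__mix.wav"
    if not mixture_filename.endswith(suffix):
        raise ValueError("expected a name ending in " + suffix)
    if not track:
        raise ValueError("track name must not be empty")
    # Replace the trailing "mix" field with the track name.
    return mixture_filename[: -len(suffix)] + "__" + track + ".wav"


def archive_name(your_name, dataset):
    """Build the archive name <YourName>_<dataset>.zip."""
    return your_name + "_" + dataset + ".zip"


name = submission_filename(
    "test2__glen_philips-the_spirit_of_shackleton__snip_163_185__mix.wav",
    "vocals",
)
```

Deriving the output name from the mixture name, rather than retyping it, avoids mismatches in the author, song, and snip fields.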

Each participant should then send an email to "zbynek.koldovsky (at) tul.cz" and to "onono (at) nii.ac.jp" providing:

contact information (name, affiliation)
basic information about his/her algorithm, including its average running time (in seconds per test excerpt and per GHz of CPU) and a bibliographical reference if possible
the URL of the archive(s)

The submitted audio files will be made available on a website under the terms of the same license as indicated in the section License above. In other words, any modified version inherits exactly the same license as the original.

Evaluation criteria

The evaluation will be done through the perceptual evaluation toolkit PEASS v.2.0 (external link)

Potential participants

M Nxx yz
Vasileios Pantazis
Alexey Ozerov (alexey.ozerov (a) irisa_fr)
Jeanlouis Durrieu (durrieu (a) enst_fr)
Maximo Cobos (mcobos (a) iteam_upv_es)
Pablo Cancela (pcancela (a) gmail.com)
Antoine Liutkus (antoine.liutkus (a) telecom-paristech.fr)
Pierre Leveau (pierre.leveau (a) audionamix.com)
Jordi Janer (jordi.janer (a) upf.edu)
Nobutaka Ono (onono (a) nii.ac.jp)
Stanislaw Gorlow

Task proposed by Audio Committee


History

Legend: v=view, c=compare, d=diff

Date	User	Version	Action
Fri 26 of July, 2013 09:26 CEST	admin	21 Current	v
Fri 29 of Mar., 2013 10:57 CET	admin	20	v c d
Fri 29 of Mar., 2013 10:55 CET	admin	19	v c d
Fri 29 of Mar., 2013 10:54 CET	admin	18	v c d
Fri 29 of Mar., 2013 10:45 CET	admin	17	v c d
Wed 27 of Mar., 2013 10:51 CET	admin	16	v c d
Wed 27 of Mar., 2013 10:47 CET	admin	15	v c d
Wed 27 of Mar., 2013 10:45 CET	admin	14	v c d
Wed 27 of Mar., 2013 10:44 CET	admin	13	v c d
Wed 27 of Mar., 2013 10:37 CET	admin	12	v c d
Wed 27 of Mar., 2013 10:34 CET	admin	11	v c d
Wed 27 of Mar., 2013 10:32 CET	admin	10	v c d
