Two-channel noisy recordings of a moving speaker within a limited area

Scenario

The target is a loudspeaker that moves within a 30x30 cm area. The loudspeaker is always directed towards two microphones placed 2 m from the center of the area. Details of the scenario are given in the following figure.

Development dataset

Download dev16.zip (14 MB) (Development dataset, 16 kHz, 16 bits)
Download dev44.1.zip (63 MB) (Development dataset, 44.1 kHz, 24 bits)

For training, the dataset contains noise-free recordings of utterances played by the loudspeaker while it stood still in one of 16 fixed positions within the target area. The file names have the format dev_position_<xx>.wav, where <xx> is the index of the position.

In addition, there are four recordings during which the loudspeaker was moved through four positions. A video of the first recording is available for illustration (external link). The file names have the format dev_<set>_<positions>_{sim,src,noi,mix}.wav, where <set> is the index of the recording (A, B, C, or D), <positions> contains the indices of the four positions passed during the movement, and {sim,src,noi,mix} denote, respectively, the target source images, the source signal of the target, the noise, and the noisy recording (sim+noi).
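Because the noisy recording is defined as sim+noi, the development files allow the input signal-to-noise ratio to be computed and the additive relation to be checked. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and each signal is assumed to be already loaded as a sequence of float samples for one channel (loading the WAV files is not shown).

```python
import math
from typing import Sequence


def energy(x: Sequence[float]) -> float:
    """Sum of squared samples."""
    return sum(v * v for v in x)


def input_snr_db(sim: Sequence[float], noi: Sequence[float]) -> float:
    """Energy ratio of the target image (sim) to the noise (noi), in dB."""
    return 10.0 * math.log10(energy(sim) / energy(noi))


def mixing_error_db(sim: Sequence[float], noi: Sequence[float],
                    mix: Sequence[float]) -> float:
    """Energy of mix - (sim + noi) relative to the energy of mix, in dB.

    Returns -inf when mix equals sim + noi sample by sample.
    """
    err = energy([m - (s + n) for s, n, m in zip(sim, noi, mix)])
    if err == 0.0:
        return float("-inf")
    return 10.0 * math.log10(err / energy(mix))


# Synthetic stand-ins for one channel of a dev_<set>_..._{sim,noi,mix}.wav triple.
sim = [1.0, -1.0, 1.0, -1.0]
noi = [0.1, 0.1, -0.1, -0.1]
mix = [s + n for s, n in zip(sim, noi)]

print(input_snr_db(sim, noi))          # close to 20 dB for these values
print(mixing_error_db(sim, noi, mix))  # -inf, since mix was built as sim + noi
```

With real recordings the same functions would be applied per channel. A small nonzero mixing error could appear there because of quantization, so a threshold rather than an exact comparison would be appropriate.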

Test dataset

Download test16.zip (3 MB) (Test dataset, 16 kHz, 16 bits)
Download test44.1.zip (12 MB) (Test dataset, 44.1 kHz, 24 bits)

The dataset contains five noisy recordings of the loudspeaker moving within the area. The file names have the format test_<set>_x_x_x_x_mix.wav, where <set> is the index of the recording (A, B, C, D, or E). Here, the trajectory of the movement is not revealed.
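The file-name formats of the development and test datasets can be parsed mechanically. The sketch below is one possible way to do so. It assumes that the four position indices are decimal numbers separated by underscores, as the x_x_x_x placeholder suggests. The class and function names, and the example file names, are hypothetical.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Recordings of the moving loudspeaker (dev or test).
_MOVING = re.compile(
    r"^(?P<dataset>dev|test)_(?P<set>[A-E])_"
    r"(?P<positions>(?:\d+|x)(?:_(?:\d+|x)){3})_"
    r"(?P<kind>sim|src|noi|mix)\.wav$"
)
# Noise-free recordings from one fixed position (dev only).
_FIXED = re.compile(r"^dev_position_(?P<index>\d+)\.wav$")


class MovingRecording(NamedTuple):
    dataset: str                          # "dev" or "test"
    set_id: str                           # "A" to "E"
    positions: Tuple[Optional[int], ...]  # None where the name has "x"
    kind: str                             # "sim", "src", "noi", or "mix"


def parse_moving(name: str) -> MovingRecording:
    """Split a moving-loudspeaker file name into its fields."""
    m = _MOVING.match(name)
    if m is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    positions = tuple(
        None if p == "x" else int(p) for p in m["positions"].split("_")
    )
    return MovingRecording(m["dataset"], m["set"], positions, m["kind"])


def parse_fixed_position(name: str) -> int:
    """Return the position index from a dev_position_<xx>.wav file name."""
    m = _FIXED.match(name)
    if m is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    return int(m["index"])


# Hypothetical names that follow the documented patterns.
print(parse_moving("dev_A_1_2_6_5_mix.wav"))
print(parse_moving("test_E_x_x_x_x_mix.wav"))
print(parse_fixed_position("dev_position_07.wav"))
```

Representing a hidden position as None keeps the development and test names in one data structure, so downstream code can tell a known trajectory from an unrevealed one.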

Tasks

The participants are encouraged to submit:

Enhanced (denoised) test and development recordings
Estimated trajectories of the loudspeaker, expressed as sequences of position indices

Submissions

Each participant should make their results available online in the form of an archive named <YourName>_<dataset>.zip.
The files containing the enhanced utterances should be named: <dataset>_<set>_x_x_x_x_enh.wav
where <dataset> is either dev or test, <set> is A, B, C, D, or E, and x_x_x_x are the estimated positions of the target during the movement.
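The naming rules above can be applied programmatically when preparing a submission. The following is a small sketch under stated assumptions: the helper names are hypothetical, each trajectory is assumed to consist of exactly four position indices, and the development set is assumed to contain only recordings A to D, as described for the development dataset.

```python
from typing import Sequence

# Recording sets available in each dataset, per the dataset descriptions.
_SETS = {"dev": "ABCD", "test": "ABCDE"}


def enhanced_file_name(dataset: str, set_id: str,
                       positions: Sequence[int]) -> str:
    """Build <dataset>_<set>_x_x_x_x_enh.wav from an estimated trajectory."""
    if dataset not in _SETS:
        raise ValueError("dataset must be 'dev' or 'test'")
    if len(set_id) != 1 or set_id not in _SETS[dataset]:
        raise ValueError(f"unknown set {set_id!r} for dataset {dataset!r}")
    if len(positions) != 4:
        raise ValueError("expected exactly four estimated positions")
    trajectory = "_".join(str(p) for p in positions)
    return f"{dataset}_{set_id}_{trajectory}_enh.wav"


def archive_name(participant: str, dataset: str) -> str:
    """Build <YourName>_<dataset>.zip."""
    if dataset not in _SETS:
        raise ValueError("dataset must be 'dev' or 'test'")
    return f"{participant}_{dataset}.zip"


# Hypothetical participant name and trajectory estimate.
print(enhanced_file_name("test", "E", [3, 4, 8, 7]))
print(archive_name("JaneDoe", "test"))
```

Encoding the estimated trajectory in the file name means that a single enhanced file answers both tasks for that recording.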

Each participant should then send an email to "zbynek.koldovsky (at) tul.cz" providing:

contact information (name, affiliation)
basic information about the algorithm, including its average running time (in seconds per test excerpt and per GHz of CPU) and a bibliographical reference if possible
the URL of the archive(s)

The submitted audio files will be made available on this website.

Evaluation criteria

The evaluation will be done using the perceptual evaluation toolkit PEASS v.2.0 (external link).
