History: Asynchronous recordings of speech mixtures
Comparing version 6 with version 17
@@ -Lines: 1-5 changed to +Lines: 1-8 @@
- !::Two-channel mixtures of speech and real-world background noise::
+ !::Asynchronous recordings of speech mixtures::
!! Introduction
In recent years, many recording devices, such as voice recorders, smartphones, tablets, and laptop PCs, have become readily available in our surroundings, and exploiting them for array signal processing is an attractive scenario. However, in most cases, signals recorded with different devices are not synchronous: they include unknown time offsets between recording starts and sampling frequency mismatches. The aim of this task is to evaluate source separation on such asynchronous channels.
+
+ !!Results
+ Results for [http://www.onn.nii.ac.jp/sisec13/evaluation_result/ASY/ASY2013.html|development and test dataset]
!! Description of the datasets
@@ -Lines: 7-15 changed to +Lines: 10-18 @@
The data has two different recording environments:
- *__150ms__: all the microphone elements are spaced in a linear arrangement. The spacing of each stereo microphone pair is about 2.15 cm. The reverberation time is about 150 ms.
*__300ms__: all the microphone elements are spaced in a radial fashion. The spacing of each stereo microphone pair is about 7.65 cm. The reverberation time is about 300 ms.
+ *''150ms'': all the microphone elements are spaced in a linear arrangement. The spacing of each stereo microphone pair is about 2.15 cm. The reverberation time is about 150 ms.
*''300ms'': all the microphone elements are spaced in a radial fashion. The spacing of each stereo microphone pair is about 7.65 cm. The reverberation time is about 300 ms.
!!! Test data
- __Download [http://www.mmlab.cs.tsukuba.ac.jp/~miyabe/sisec13/test.zip|test.zip] (18.8 MB)
+ __Download [http://corpus-search.nii.ac.jp/sisec/2013/async/test.zip|test.zip] (18.8 MB)__
The data consist of 18 stereo WAV audio files that can be imported in Matlab using the wavread command. These files are named ''test_<srcset>_<cond>_mix_<ch>.wav'', where
* ''<srcset>'': source sets ''male2'', ''male3'' and ''male4'', which correspond to mixtures of two, three and four male speakers' utterances, respectively.
@@ -Lines: 20-24 changed to +Lines: 23-27 @@
!!! Development data
- __Download [http://www.mmlab.cs.tsukuba.ac.jp/~miyabe/sisec13/dev.zip|dev.zip] (75.5 MB)
+ __Download [http://corpus-search.nii.ac.jp/sisec/2013/async/dev.zip|dev.zip] (75.5 MB)__
The development data consist of 66 stereo WAV audio files and 6 Matlab MAT files, which can be imported in Matlab using the commands wavread and load, respectively. These files are named as follows:
* ''dev_src_<src>.wav'': single-channel speech signal, shared across the whole development data.
@@ -Lines: 27-31 changed to +Lines: 30-34 @@
* ''dev_<srcset>_<cond>_src_<src>.wav'': MAT file including the variable A of the room impulse responses, whose size is [[number of the channels, number of the sources, number of samples]. Note that the recording time offset is included in the impulse responses.
Here, the placeholders in the file names are defined as follows.
- * ''<srcset>'': source set ''<male2>'', ''<male3>'' and ''<male4>'', which correspond to the mixture of two, three and four male speakers' utterances.
+ * ''<srcset>'': source set ''male2'', ''male3'' and ''male4'', which correspond to the mixture of two, three and four male speakers' utterances.
* ''<cond>'': recording conditions ''150ms'' and ''300ms''.
* ''<ch>'': indexes of the stereo channels ''ch12'', ''ch34'' and ''ch56''. The channels are synchronized within each file, but the files are not synchronized with each other.
@@ -Lines: 41-51 changed to +Lines: 44-54 @@
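The naming pattern of the test mixtures, ''test_<srcset>_<cond>_mix_<ch>.wav'', can be expanded with a short sketch. It is written in Python purely for illustration (the page itself refers to Matlab), and it assumes that the test set contains every combination of the listed source sets, conditions and channel pairs. That assumption is consistent with the stated count of 18 files. The names SRCSETS, CONDS, CHANNELS and list_test_mixtures are hypothetical helpers, not part of the dataset.

```python
from itertools import product

# Placeholder values as listed in the task description.
SRCSETS = ["male2", "male3", "male4"]  # mixtures of two, three and four male speakers
CONDS = ["150ms", "300ms"]             # recording conditions
CHANNELS = ["ch12", "ch34", "ch56"]    # stereo pairs, one file per pair


def list_test_mixtures():
    """Return every file name implied by test_<srcset>_<cond>_mix_<ch>.wav.

    Assumption: all combinations of the placeholder values exist.
    """
    return [
        f"test_{srcset}_{cond}_mix_{ch}.wav"
        for srcset, cond, ch in product(SRCSETS, CONDS, CHANNELS)
    ]


names = list_test_mixtures()
print(len(names))   # number of expected mixture files
print(names[0])     # first name in the enumeration
```

Since only the two channels inside one file are synchronized, the three files sharing a source set and condition would need to be aligned before they are treated as a six-channel array.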
!! Evaluation criteria
- We plan to use the criteria defined in the BSS_EVAL toolbox. The submitted results will be evaluated with SDR, SIR, SAR, ISR, using original sources at the first channel as "i" in bss_eval_images_nosort.m.
+ We plan to use the criteria defined in the BSS_EVAL toolbox. The submitted results will be evaluated with SDR, SIR, and SAR using original sources at the first channel as "i" in bss_eval_sources.m.
The criteria and benchmarks are respectively implemented in
- * [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=1|bss_eval_images_nosort.m]
+ * [http://bass-db.gforge.inria.fr/bss_eval/bss_eval_sources.m|bss_eval_sources.m]
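As a rough illustration of what a distortion ratio measures, the sketch below computes a simplified SDR in Python. It is not the BSS_EVAL implementation: it assumes the target component is the estimate projected onto the reference with a single scalar gain, and it lumps all remaining error together instead of separating interference and artifacts as the SIR and SAR criteria require. The function name simple_sdr is hypothetical. Official scores should come from bss_eval_sources.m.

```python
import math


def simple_sdr(reference, estimate):
    """Simplified signal-to-distortion ratio in dB (gain-only projection).

    Assumption: the target is the orthogonal projection of the estimate
    onto the reference signal. Everything else counts as distortion.
    This does not reproduce the decomposition used by BSS_EVAL.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        raise ValueError("reference must not be all zeros")

    # Least-squares gain that best maps the reference onto the estimate.
    gain = sum(r * e for r, e in zip(reference, estimate)) / ref_energy
    target = [gain * r for r in reference]

    target_energy = sum(t * t for t in target)
    error_energy = sum((e - t) ** 2 for e, t in zip(estimate, target))

    if error_energy == 0.0:
        return math.inf       # estimate is a scaled copy of the reference
    if target_energy == 0.0:
        return -math.inf      # estimate is orthogonal to the reference
    return 10.0 * math.log10(target_energy / error_energy)


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 0.1, -1.0, -0.1]
print(simple_sdr(reference, estimate))  # roughly 20 dB for this toy signal
```

In this toy case the error energy is about one hundredth of the target energy, which is why the ratio comes out near 20 dB.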
!! Licensing issues
All files are distributed under the terms of the Creative Commons Attribution-Noncommercial-ShareAlike 3.0 license. The files to be submitted by participants will be made available on a website under the terms of the same license. The authors are Yuya Sugimoto and Shigeki Miyabe.
- Task proposed by the Audio Committee.
+ Task proposed by Shigeki Miyabe and Nobutaka Ono.