
History: Asynchronous recordings of speech mixtures

Comparing version 6 with version 17

@@ -Lines: 1-5 changed to +Lines: 1-8 @@

- !::Two-channel mixtures of speech and real-world background noise::

+ !::Asynchronous recordings of speech mixtures::

!! Introduction
In recent years, recording devices such as voice recorders, smartphones, tablets, and laptop PCs have become readily available in our everyday environment, and exploiting them for array signal processing is an attractive scenario. However, in most cases, signals recorded with different devices are not synchronized: they contain unknown offsets in recording start time and sampling frequency mismatches. The aim of this task is to evaluate source separation on such asynchronous channels.

+
+ !!Results
+ Results for [http://www.onn.nii.ac.jp/sisec13/evaluation_result/ASY/ASY2013.html|development and test dataset]

!! Description of the datasets

@@ -Lines: 7-15 changed to +Lines: 10-18 @@

The data has two different recording environments:

- *__150ms__: all the microphone elements are spaced in a linear arrangement. The spacing of each stereo microphone pair is about 2.15 cm. The reverberation time is about 150 ms.
*__300ms__: all the microphone elements are spaced in a radial fashion. The spacing of each stereo microphone pair is about 7.65 cm. The reverberation time is about 300 ms.

+ *''150ms'': all the microphone elements are spaced in a linear arrangement. The spacing of each stereo microphone pair is about 2.15 cm. The reverberation time is about 150 ms.
*''300ms'': all the microphone elements are spaced in a radial fashion. The spacing of each stereo microphone pair is about 7.65 cm. The reverberation time is about 300 ms.

!!! Test data

- __Download [http://www.mmlab.cs.tsukuba.ac.jp/~miyabe/sisec13/test.zip|test.zip] (18.8 MB)

+ __Download [http://corpus-search.nii.ac.jp/sisec/2013/async/test.zip|test.zip] (18.8 MB)__

The data consist of 18 stereo WAV audio files that can be imported in Matlab using the wavread command. These files are named ''test_<srcset>_<cond>_mix_<ch>.wav'', where
* ''<srcset>'': source sets ''male2'', ''male3'' and ''male4'', which correspond to mixtures of two, three and four male speakers' utterances, respectively.

@@ -Lines: 20-24 changed to +Lines: 23-27 @@

!!! Development data

- __Download [http://www.mmlab.cs.tsukuba.ac.jp/~miyabe/sisec13/dev.zip|dev.zip] (75.5 MB)

+ __Download [http://corpus-search.nii.ac.jp/sisec/2013/async/dev.zip|dev.zip] (75.5 MB)__

The development data consist of 66 stereo WAV audio files and 6 Matlab MAT files, which can be imported in Matlab using the commands wavread and load, respectively. These files are named as follows:
* ''dev_src_<src>.wav'': single-channel speech signal, shared across the whole development data.

@@ -Lines: 27-31 changed to +Lines: 30-34 @@

* ''dev_<srcset>_<cond>_src_<src>.wav'': MAT file including the variable A of the room impulse responses, whose size is [[number of the channels, number of the sources, number of samples]. Note that the recording time offset is included in the impulse responses.
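As a rough illustration of how an impulse response array shaped like A relates to the recorded mixtures, the following minimal Python sketch convolves each source with its impulse response and sums the results per channel. The function names ''convolve'' and ''mix'', the nested-list layout and the toy values are assumptions made for this example only; they are not part of the dataset or of any toolbox. Leading zeros in an impulse response stand in for a recording time offset.

```python
def convolve(h, x):
    """Full linear convolution of two sequences (direct form, pure Python)."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            y[i + j] += hi * xj
    return y


def mix(A, sources):
    """Build one mixture per channel from impulse responses and sources.

    A is indexed as A[channel][source][sample], mirroring the
    [channels, sources, samples] layout described above (assumed layout
    for this sketch). sources[source] is a list of samples.
    """
    mixtures = []
    for responses in A:
        # Source image at this channel = impulse response * source signal.
        images = [convolve(h, x) for h, x in zip(responses, sources)]
        mixture = [0.0] * max(len(img) for img in images)
        for img in images:
            for n, value in enumerate(img):
                mixture[n] += value
        mixtures.append(mixture)
    return mixtures


# Toy example: 2 channels, 2 sources. The second channel's responses start
# with zeros, which delays the sources as a recording offset would.
A = [
    [[1.0], [0.5]],
    [[0.0, 1.0], [0.0, 0.0, 0.25]],
]
sources = [[1.0, 2.0], [4.0, 0.0]]
mixtures = mix(A, sources)
```

For real data the direct-form loop would be far too slow; an FFT-based convolution from a numerical library would be the usual choice.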
Here the variables are determined as follows.

- * ''<srcset>'': source set ''<male2>'', ''<male3>'' and ''<male4>'', which correspond to the mixture of two, three and four male speakers' utterances.

+ * ''<srcset>'': source set ''male2'', ''male3'' and ''male4'', which correspond to the mixture of two, three and four male speakers' utterances.

* ''<cond>'': recording conditions ''150ms'' and ''300ms''.
* ''<ch>'': indexes of the stereo channels ''ch12'', ''ch34'' and ''ch56''. The channels are synchronized within each file, but the files are not synchronized with each other.
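
The naming scheme can be checked by enumerating it. The short Python sketch below builds the test mixture file names from the pattern ''test_<srcset>_<cond>_mix_<ch>.wav'' and the variable values listed above; the helper name ''test_mixture_names'' is made up for this example. Three source sets, two conditions and three stereo pairs give the 18 files stated for the test data.

```python
from itertools import product

# Values taken from the variable definitions on this page.
SRCSETS = ("male2", "male3", "male4")
CONDS = ("150ms", "300ms")
CHANNEL_PAIRS = ("ch12", "ch34", "ch56")


def test_mixture_names():
    """Return every test mixture file name implied by the naming pattern."""
    return [
        f"test_{srcset}_{cond}_mix_{ch}.wav"
        for srcset, cond, ch in product(SRCSETS, CONDS, CHANNEL_PAIRS)
    ]


names = test_mixture_names()
print(len(names))  # 18
print(names[0])    # test_male2_150ms_mix_ch12.wav
```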

@@ -Lines: 41-51 changed to +Lines: 44-54 @@

!! Evaluation criteria

- We plan to use the criteria defined in the BSS_EVAL toolbox. The submitted results will be evaluated with SDR, SIR, SAR, ISR, using original sources at the first channel as "i" in bss_eval_images_nosort.m.

+ We plan to use the criteria defined in the BSS_EVAL toolbox. The submitted results will be evaluated with SDR, SIR, and SAR using original sources at the first channel as "i" in bss_eval_sources.m.

The criteria and benchmarks are respectively implemented in

- * [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=1|bss_eval_images_nosort.m]

+ * [http://bass-db.gforge.inria.fr/bss_eval/bss_eval_sources.m|bss_eval_sources.m]
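
For readers unfamiliar with these measures, the following Python sketch gives a deliberately simplified idea of a signal-to-distortion ratio. It is not the BSS_EVAL implementation and will not reproduce its numbers: it assumes a single scalar gain between the reference and the estimate, and it lumps interference and artifacts into one error term, so it cannot separate SIR from SAR. The function name ''simplified_sdr'' is made up for this example.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def simplified_sdr(reference, estimate):
    """Energy ratio (dB) between the part of the estimate explained by the
    reference and everything else. Simplification: scalar gain only."""
    # Project the estimate onto the reference to get the "target" part.
    gain = dot(estimate, reference) / dot(reference, reference)
    target = [gain * r for r in reference]
    # Whatever the projection does not explain counts as distortion.
    error = [e - t for e, t in zip(estimate, target)]
    error_energy = dot(error, error)
    if error_energy == 0.0:
        return math.inf
    return 10.0 * math.log10(dot(target, target) / error_energy)


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 0.1, -1.0, -0.1]
sdr_db = simplified_sdr(reference, estimate)
```

Because of the projection step, a purely rescaled estimate is not penalized in this sketch.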

!! Licensing issues
All files are distributed under the terms of the Creative Commons Attribution-Noncommercial-ShareAlike 3.0 license. The files to be submitted by participants will be made available on a website under the terms of the same license. The authors are Yuya Sugimoto and Shigeki Miyabe.

- Task proposed by the Audio Committee.

+ Task proposed by Shigeki Miyabe and Nobutaka Ono.

History

Legend: v=view, c=compare, d=diff

Date	User	Edit Comment	Version	Action
Thu 01 of Aug., 2013 01:57 CEST	admin		17 Current	v
Tue 30 of July, 2013 06:48 CEST	admin	results removed temporarily ( !! Results Results for [http://www.onn.nii.ac.jp/sisec13/evaluation_result/ASY/ASY2013.html\|development and test dataset])	16	v c d
Tue 30 of July, 2013 04:47 CEST	admin		15	v c d
Sun 31 of Mar., 2013 09:11 CEST	admin	link correction	14	v c d
Sat 30 of Mar., 2013 18:41 CET	admin	ISR removed	13	v c d
Sat 30 of Mar., 2013 12:16 CET	admin	evaluation criteria changed from images to sources	12	v c d
Sat 30 of Mar., 2013 06:08 CET	admin	link to data, proposer	11	v c d
Fri 29 of Mar., 2013 14:54 CET	admin		10	v c d
Fri 29 of Mar., 2013 14:52 CET	admin	title corrected	9	v c d
Fri 29 of Mar., 2013 14:51 CET	admin	error corrected	8	v c d
Fri 29 of Mar., 2013 14:47 CET	admin	errors corrected	7	v c d
Fri 29 of Mar., 2013 14:45 CET	admin	Pre-release version completed.	6	v c d
