CHiME2 WSJ0

Item Name:	CHiME2 WSJ0
Author(s):	Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, Marco Matassoni
LDC Catalog No.:	LDC2017S10
ISBN:	1-58563-801-3
ISLRN:	071-714-384-459-0
DOI:	https://doi.org/10.35111/cxwc-kb75
Release Date:	June 15, 2017
Member Year(s):	2017
DCMI Type(s):	Sound
Sample Type:	pcm
Sample Rate:	16000
Data Source(s):	microphone speech
Project(s):	CHiME
Application(s):	speech recognition
Language(s):	English
Language ID(s):	eng
License(s):	LDC User Agreement for Non-Members
Online Documentation:	LDC2017S10 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Vincent, Emmanuel, et al. CHiME2 WSJ0 LDC2017S10. Web Download. Philadelphia: Linguistic Data Consortium, 2017.
Related Works:	isOutcomeOf: LDC93S6A CSR-I (WSJ0) Complete; isSimilarWith: LDC2017S07 CHiME2 Grid, LDC2017S24 CHiME3

Introduction

CHiME2 WSJ0 was developed as part of the 2nd CHiME Speech Separation and Recognition Challenge and contains approximately 166 hours of English speech from a noisy living room environment. The CHiME Challenges focus on distant-microphone automatic speech recognition (ASR) in real-world environments.

CHiME2 WSJ0 reflects the medium vocabulary track of the CHiME2 Challenge. The target utterances were taken from CSR-I (WSJ0) Complete (LDC93S6A), specifically the 5,000-word subset of read speech from Wall Street Journal news text.

LDC also released CHiME2 Grid (LDC2017S07) and CHiME3 (LDC2017S24).

Data

Data is divided into training, development and test sets. All data is provided as 16-bit WAV files sampled at 16 kHz. The noisy utterances are provided in both isolated form and embedded form. The embedded form includes five seconds of background noise before and after the utterance. Seven hours of background noise that are not part of the training set are also included.
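As an illustration of the format described above, the following minimal Python sketch checks that a file is 16-bit audio sampled at 16 kHz and trims the noise context from an embedded utterance. It is not part of the corpus tools. The function names are hypothetical, and the trimming step assumes that an embedded file carries exactly five seconds of context on each side, which should be verified against the corpus documentation.

```python
import wave

EXPECTED_RATE = 16000   # 16 kHz, as stated in the data description
EXPECTED_WIDTH = 2      # 16-bit samples are 2 bytes wide
CONTEXT_SECONDS = 5     # assumed context before and after an embedded utterance


def describe_wav(path):
    """Return (sample_rate, sample_width_bytes, channels, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, width, channels, duration


def matches_corpus_format(path):
    """Check the sample rate and sample width against the stated format."""
    rate, width, _, _ = describe_wav(path)
    return rate == EXPECTED_RATE and width == EXPECTED_WIDTH


def strip_context(src_path, dst_path, context_seconds=CONTEXT_SECONDS):
    """Copy src to dst without the leading and trailing context.

    Returns the duration of the trimmed audio in seconds.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        skip = int(context_seconds * rate)
        if total <= 2 * skip:
            raise ValueError("file is too short to contain the assumed context")
        src.setpos(skip)                       # jump past the leading context
        frames = src.readframes(total - 2 * skip)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)                  # frame count is corrected on close
        dst.writeframes(frames)
    return (total - 2 * skip) / rate
```

Calling `matches_corpus_format` on a file before further processing is a cheap way to catch files that were resampled or converted after download.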

Also included are baseline scoring, decoding and retraining tools based on Cambridge University's HTK (the Hidden Markov Model Toolkit) and related recipes. These tools include a number of scripts and three baseline speaker-independent recognition systems, trained on clean, reverberated and noisy data, respectively.

Updates

None at this time.

Copyright

Portions © 1987-1989 Dow Jones & Company, Inc., © 2017 Inria Nancy - Grand Est, University of Sheffield, Mitsubishi Electric Research Labs, Fondazione Bruno Kessler, © 1992, 1993, 1996, 2017 Trustees of the University of Pennsylvania