CHiME3
Item Name: | CHiME3 |
Author(s): | Jon Barker, Ricard Marxer, Emmanuel Vincent, Shinji Watanabe |
LDC Catalog No.: | LDC2017S24 |
ISBN: | 1-58563-826-9 |
ISLRN: | 857-070-463-285-8 |
DOI: | https://doi.org/10.35111/v154-hj21 |
Release Date: | December 15, 2017 |
Member Year(s): | 2017 |
DCMI Type(s): | Sound, Text |
Sample Type: | pcm |
Sample Rate: | 16000 |
Data Source(s): | microphone speech |
Project(s): | CHiME |
Application(s): | speech recognition |
Language(s): | English |
Language ID(s): | eng |
License(s): | LDC User Agreement for Non-Members |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Barker, Jon, et al. CHiME3 LDC2017S24. USB Flash Drive. Philadelphia: Linguistic Data Consortium, 2017. |
Introduction
CHiME3 was developed as part of the 3rd CHiME Speech Separation and Recognition Challenge and contains approximately 342 hours of English speech and transcripts from noisy environments and 50 hours of noisy environment audio. The CHiME Challenges focus on distant-microphone automatic speech recognition (ASR) in real-world environments. See the CHiME3 home page for more information.
The task in CHiME3 was similar to the medium vocabulary track of the CHiME2 Challenge in that the target utterances were taken from CSR-I (WSJ0) Complete (LDC93S6A), specifically the 5,000-word subset of read speech from Wall Street Journal news text. CHiME3 involved two types of data: speech recorded in very noisy environments (on a bus, in a cafe, in a pedestrian area, and at a street junction) and noisy utterances generated by artificially mixing clean speech with noisy backgrounds.
LDC has also released two CHiME2 corpora: CHiME2 Grid (LDC2017S07) and CHiME2 WSJ0 (LDC2017S10).
Data
Data is divided into training, development, and test sets. All data is provided as 16-bit WAV files sampled at 16 kHz. The audio data consists of background noise recordings, speech data enhanced with the baseline speech enhancement technique, unsegmented noisy speech data, and segmented noisy speech data.
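As a quick sanity check after copying the corpus, the audio format stated above (16-bit samples, 16 kHz) can be verified per file. The following is a minimal sketch using only the Python standard library. The function name `check_wav_format` and the shape of its return value are illustrative choices, not part of the corpus or its tools, and any file path passed to it is supplied by the user.

```python
import wave

EXPECTED_RATE_HZ = 16000    # 16 kHz, as stated in the corpus description
EXPECTED_SAMPLE_WIDTH = 2   # 16 bits = 2 bytes per sample


def check_wav_format(path):
    """Return (ok, details) for one WAV file.

    `ok` is True when the file has the documented sample rate and
    sample width. The channel count is reported but not checked,
    because the description above does not state it.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        frames = wav.getnframes()

    details = {
        "sample_rate_hz": rate,
        "sample_width_bytes": width,
        "channels": channels,
        "duration_s": frames / rate if rate else 0.0,
    }
    ok = rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH
    return ok, details
```

Calling `check_wav_format` on each file found under the corpus directory and collecting the paths where `ok` is False gives a simple report of files that do not match the documented format.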
Annotation files are in JSON (JavaScript Object Notation) format. Transcripts are plain text in either DOT or TRN format. Also included are three software tools for acoustic simulation, speech enhancement, and ASR.
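Because the annotation files are JSON, they can be inspected with any standard JSON parser. The sketch below, using only the Python standard library, loads one file and counts which keys appear across its records. It assumes that the top level of the file is a list of objects; that layout is an assumption made here for illustration and is not stated in this description. The function name `summarize_annotations` is likewise hypothetical, and the code hard-codes no field names.

```python
import json
from collections import Counter


def summarize_annotations(path):
    """Return (record_count, key_counts) for one JSON annotation file.

    Assumption: the file holds a top-level list of objects. A
    ValueError is raised if the top level is anything else, so a
    different layout is noticed rather than silently misread.
    """
    with open(path, encoding="utf-8") as fh:
        records = json.load(fh)

    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON list in " + path)

    key_counts = Counter()
    for record in records:
        if isinstance(record, dict):
            # Count every key present in this record.
            key_counts.update(record.keys())

    return len(records), dict(key_counts)
```

Listing the keys first, rather than assuming particular field names, is a cautious way to learn the actual schema before writing code that depends on it.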
Updates
None at this time.