CHiME3

Item Name: CHiME3
Author(s): Jon Barker, Ricard Marxer, Emmanuel Vincent, Shinji Watanabe
LDC Catalog No.: LDC2017S24
ISBN: 1-58563-826-9
ISLRN: 857-070-463-285-8
DOI: https://doi.org/10.35111/v154-hj21
Release Date: December 15, 2017
Member Year(s): 2017
DCMI Type(s): Sound, Text
Sample Type: pcm
Sample Rate: 16000
Data Source(s): microphone speech
Project(s): CHiME
Application(s): speech recognition
Language(s): English
Language ID(s): eng
License(s): LDC User Agreement for Non-Members
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Barker, Jon, et al. CHiME3 LDC2017S24. USB Flash Drive. Philadelphia: Linguistic Data Consortium, 2017.

Introduction

CHiME3 was developed as part of the 3rd CHiME Speech Separation and Recognition Challenge. It contains approximately 342 hours of English speech and transcripts from noisy environments, plus 50 hours of noisy environment audio. The CHiME Challenges focus on distant-microphone automatic speech recognition (ASR) in real-world environments. See the CHiME3 home page for more information.

The task in CHiME3 was similar to the medium vocabulary track of the CHiME2 Challenge in that the target utterances were taken from CSR-I (WSJ0) Complete (LDC93S6A), specifically the 5,000-word subset of read speech from Wall Street Journal news text. CHiME3 involved two types of data: speech recorded in very noisy environments (on a bus, in a cafe, in a pedestrian area, and at a street junction) and noisy utterances generated by artificially mixing clean speech with noisy backgrounds.

LDC has also released two CHiME2 corpora -- CHiME2 Grid (LDC2017S07) and CHiME2 WSJ0 (LDC2017S10).

Data

Data is divided into training, development, and test sets. All data is provided as 16-bit WAV files sampled at 16 kHz. The audio data consists of background noises, speech data enhanced with the baseline speech enhancement technique, unsegmented noisy speech data, and segmented noisy speech data.
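
The stated audio format (16-bit samples at 16 kHz) can be checked on any file with a short script. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav_format` and its parameters are illustrative assumptions and are not part of the corpus tools.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_bytes=2):
    """Read a WAV header and compare it with the expected format.

    Hypothetical helper: defaults reflect the 16-bit, 16 kHz format
    described in the corpus documentation (2 bytes per sample).
    Returns (ok, info), where info holds the header values found.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        info = {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }
    ok = (
        info["sample_rate_hz"] == expected_rate
        and info["sample_width_bytes"] == expected_bytes
    )
    return ok, info
```

Calling `check_wav_format("some_file.wav")` on a file from the corpus would be expected to return `True` as the first element if the file matches the description above; the file name here is a placeholder.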

Annotation files are in JSON (JavaScript Object Notation) format. Transcripts are plain text in either DOT or TRN format. Also included are three software tools for acoustic simulation, speech enhancement, and ASR.
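
Because the annotation files are JSON, they can be inspected with a standard JSON parser. The sketch below loads one file and reports its top-level shape without assuming any particular field names, since the annotation schema is not described here; the function name `summarize_annotations` is an illustrative assumption.

```python
import json


def summarize_annotations(path):
    """Load a JSON file and describe its top-level structure.

    Hypothetical helper: makes no assumption about the corpus schema.
    It only reports the container type, the number of entries, and
    the keys that are present.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    if isinstance(data, list):
        # Collect every key used by any dictionary entry in the list.
        keys = sorted({k for entry in data if isinstance(entry, dict) for k in entry})
        return {"type": "list", "entries": len(data), "keys": keys}
    if isinstance(data, dict):
        return {"type": "dict", "entries": len(data), "keys": sorted(data)}
    # Scalars (string, number, boolean, null) have no keys.
    return {"type": type(data).__name__, "entries": 1, "keys": []}
```

Running this on an annotation file first shows which fields are actually available before any code is written against them.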

Updates

None at this time.
