Multi-Channel WSJ Audio
Item Name: | Multi-Channel WSJ Audio |
Author(s): | Mike Lincoln, Erich Zwyssig, Iain McCowan |
LDC Catalog No.: | LDC2014S03 |
ISBN: | 1-58563-674-6 |
ISLRN: | 766-428-479-143-5 |
DOI: | https://doi.org/10.35111/zd7f-qr83 |
Release Date: | April 15, 2014 |
Member Year(s): | 2014 |
DCMI Type(s): | Sound |
Sample Type: | pcm |
Sample Rate: | 16000 |
Data Source(s): | newswire |
Application(s): | speech recognition, speaker segmentation and tracking, speaker identification |
Language(s): | English |
Language ID(s): | eng |
License(s): | Multi-Channel WSJ Audio |
Online Documentation: | LDC2014S03 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Lincoln, Mike, Erich Zwyssig, and Iain McCowan. Multi-Channel WSJ Audio LDC2014S03. Web Download. Philadelphia: Linguistic Data Consortium, 2014. |
Related Works: | View |
Introduction
Multi-Channel WSJ Audio (MCWSJ) was developed by the Centre for Speech Technology Research at The University of Edinburgh and contains approximately 100 hours of recorded speech from 45 British English speakers. Participants read Wall Street Journal texts published in 1987-1989 in three recording scenarios: a single stationary speaker, two stationary overlapping speakers and a single moving speaker.
This corpus was designed to address the challenges of speech recognition in meetings, which often occur in rooms with non-ideal acoustic conditions and significant background noise, and may contain large sections of overlapping speech. Using headset microphones represents one approach, but meeting participants may be reluctant to wear them. Microphone arrays are another option. MCWSJ supports research in large vocabulary tasks using microphone arrays. The news sentences read by speakers are taken from WSJCAM0 Cambridge Read News, a corpus originally developed for large vocabulary continuous speech recognition experiments, which in turn was based on CSR-1 (WSJ0) Complete, made available by LDC to support large vocabulary continuous speech recognition initiatives.
Data
Speakers reading news text from prompts were recorded using a headset microphone, a lapel microphone and an eight-channel microphone array. In the single speaker scenario, participants read from six fixed positions. In the overlapping scenario, fixed positions were assigned for the entire recording. In the moving scenario, participants moved from one position to the next while reading.
Fifteen speakers were recorded for the single scenario, nine pairs for the overlapping scenario and nine individuals for the moving scenario. Each read approximately 90 sentences.
The audio data are presented as single-channel 16 kHz FLAC-compressed WAV files.
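As a rough illustration of the format described above, the following sketch checks that a file declares one channel at 16000 Hz by parsing the STREAMINFO block at the start of a FLAC stream. It assumes the files are native FLAC streams beginning with the `fLaC` marker and carrying no prepended tags, which this page does not confirm. The function names `read_flac_streaminfo`, `read_flac_file` and `matches_corpus_format` are hypothetical and are not part of any corpus tooling.

```python
def read_flac_streaminfo(data: bytes) -> dict:
    """Parse the STREAMINFO metadata block from the first bytes of a FLAC stream."""
    # Assumption: native FLAC stream, so the first four bytes are the 'fLaC' marker.
    if data[:4] != b"fLaC":
        raise ValueError("not a native FLAC stream (missing 'fLaC' marker)")
    if len(data) < 42:
        raise ValueError("need at least 42 bytes to read STREAMINFO")

    # Metadata block header: 1 bit last-block flag, 7 bits type, 24 bits length.
    block_type = data[4] & 0x7F
    block_len = int.from_bytes(data[5:8], "big")
    if block_type != 0 or block_len < 34:
        raise ValueError("first metadata block is not a valid STREAMINFO")

    body = data[8:42]
    # Bytes 10..17 pack four fields into 64 bits:
    # 20-bit sample rate, 3-bit (channels - 1), 5-bit (bits per sample - 1),
    # 36-bit total sample count.
    packed = int.from_bytes(body[10:18], "big")
    sample_rate = packed >> 44
    channels = ((packed >> 41) & 0x7) + 1
    bits_per_sample = ((packed >> 36) & 0x1F) + 1
    total_samples = packed & ((1 << 36) - 1)

    return {
        "sample_rate": sample_rate,
        "channels": channels,
        "bits_per_sample": bits_per_sample,
        "total_samples": total_samples,
        # Duration in seconds; 0.0 if the header carries no usable sample rate.
        "duration_s": total_samples / sample_rate if sample_rate else 0.0,
    }


def read_flac_file(path: str) -> dict:
    """Read only the 42 header bytes needed for STREAMINFO from a file on disk."""
    with open(path, "rb") as handle:
        return read_flac_streaminfo(handle.read(42))


def matches_corpus_format(info: dict) -> bool:
    """True if the header matches the single-channel 16 kHz format described above."""
    return info["sample_rate"] == 16000 and info["channels"] == 1
```

Calling `matches_corpus_format(read_flac_file(path))` on a downloaded file would be one way to spot files with an unexpected channel count or sample rate before further processing. Decoding the audio itself needs a FLAC decoder, which this sketch does not provide.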
Updates
None at this time.