CSLU: Yes/No Version 1.2

Item Name: CSLU: Yes/No Version 1.2
Author(s): Mike Noel
LDC Catalog No.: LDC2007S05
ISBN: 1-58563-445-X
ISLRN: 910-955-859-747-8
DOI: https://doi.org/10.35111/18ns-0a21
Release Date: July 17, 2007
Member Year(s): 2007
DCMI Type(s): Sound
Sample Type: pcm
Sample Rate: 8000
Data Source(s): telephone speech
Application(s): speech synthesis, speech recognition, speaker verification, speaker identification, pronunciation modeling
Language(s): English
Language ID(s): eng
License(s): CSLU Agreement
Online Documentation: LDC2007S05 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Noel, Mike. CSLU: Yes/No Version 1.2 LDC2007S05. Web Download. Philadelphia: Linguistic Data Consortium, 2007.

Introduction

This file contains documentation for CSLU: Yes/No Version 1.2, Linguistic Data Consortium (LDC) catalog number LDC2007S05 and ISBN 1-58563-445-X.

CSLU: Yes/No Version 1.2 is a collection of answers to yes/no questions from various telephone speech corpora created by the Center for Spoken Language Understanding, Oregon Health and Science University (CSLU). The corpus contains approximately 20,000 examples of roughly 18,000 speakers saying "yes" or "no" in response to various questions.

Each speech file in the corpus has a corresponding orthographic transcription following the CSLU Labeling Conventions. Where a transcription did not already exist, the utterance was run through a speech recognizer to obtain one automatically.

The data were collected from both analog and digital phone lines. The analog data were recorded using a Gradient Technologies analog-to-digital conversion box. These files were recorded at 8 kHz with 16-bit samples and stored in a linear format. The digital data were recorded with the CSLU T1 digital data collection system. These files were sampled at 8 kHz with 8-bit samples and stored as ulaw files. All of the data use the RIFF standard file format, which is 16-bit linearly encoded.
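
The format described above can be checked on a downloaded file with a short script. The following is a minimal sketch in Python, assuming the speech files open as ordinary RIFF/WAVE files with the standard-library wave module. The function names are hypothetical and are not part of the corpus distribution. The ulaw_to_linear helper shows the standard G.711 expansion of one 8-bit ulaw byte to a 16-bit linear value; it illustrates the encoding in general and is not necessarily the conversion CSLU applied.

```python
import wave

# Values taken from the corpus description: 8 kHz, 16-bit linear samples.
EXPECTED_RATE_HZ = 8000
EXPECTED_SAMPLE_BYTES = 2


def describe_wav(path):
    """Return (sample_rate_hz, sample_width_bytes, channels, n_frames)."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate(),
            wav.getsampwidth(),
            wav.getnchannels(),
            wav.getnframes(),
        )


def matches_documented_format(path):
    """True if the file is 8 kHz with 16-bit samples, as documented."""
    rate, width, _channels, _frames = describe_wav(path)
    return rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_BYTES


def ulaw_to_linear(byte):
    """Expand one 8-bit ulaw code (0..255) to a signed linear sample (G.711)."""
    u = ~byte & 0xFF                 # ulaw bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude
```

For example, matches_documented_format("example.wav") would report whether a file (the name here is a placeholder) has the documented sample rate and sample width.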

Samples

For a sample of the audio in this corpus, please listen to this sample.
