Yes/No Corpus
Release 1.2
Center for Spoken Language Understanding

UPDATED: 23 August 2002

Overview
--------

The Yes/No corpus is a collection of approximately 20000 examples of
people saying "yes" or "no". This corpus should be useful for developing
high-accuracy yes/no speech recognizers. All of the speech in this corpus
is telephone speech.

The utterances in this corpus were taken from many other telephone speech
data collections that have been completed at the CSLU. In many of these
data collections, callers were asked to answer various yes/no questions.
For example, in the Census data collection, we asked each caller if they
had ever been married. These yes/no utterances have been gathered
together to create this corpus.

Each file in the Yes/No corpus has an orthographic transcription
following the CSLU Labeling Conventions. In some cases where a
transcription did not already exist, we ran the utterance through a
speech recognizer to obtain the transcription automatically. Since yes/no
recognition isn't perfect, we expect some of these automatic
transcriptions to contain errors. However, the recognizer's error rate is
extremely low, so we don't expect many. We have spot-checked the
recognizer's output and are happy with the results.

Release 1.2 of this corpus contains about 20000 files, including those
transcribed automatically.