Yes/No Corpus
Release 1.2
Center for Spoken Language Understanding

UPDATED: 23 August 2002


Corpus Description
------------------

This file describes some of the overall characteristics of the corpus
and gives statistics on the number of utterances of each type.

Callers sometimes answered a yes/no question with a response other than
a simple "yes" or "no". For example, many people said "yes there is"
when we asked if there was an "a" in their last name. We eliminated
these utterances from the corpus. Likewise, when callers said a word
other than "yes" or "no" (e.g. "sure"), we eliminated those files from
the corpus.

The corpus consists of about 7 hours of speech.

The following table gives the number of files for each utterance type.

Type            Number
----------------------
cellular           270
ever_married      4004
have_home_phone   3923
hispanic          3943
letterA           1568
no                2070
results           1234
yes               2040
yorn               476

The following table shows the total number of "yes" and "no" responses
in the corpus.

Word            Number
----------------------
no                8157
yes              11371
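
As a quick consistency check on the two tables above, the per-type file counts and the per-word response counts should add up to the same total, since every retained file holds a single "yes" or "no". The short Python sketch below is not part of the corpus distribution; the variable names are illustrative, the numbers are copied by hand from the tables, and the per-file duration is only a rough estimate derived from the "about 7 hours" figure.

```python
# File counts per utterance type, copied from the first table.
files_by_type = {
    "cellular": 270,
    "ever_married": 4004,
    "have_home_phone": 3923,
    "hispanic": 3943,
    "letterA": 1568,
    "no": 2070,
    "results": 1234,
    "yes": 2040,
    "yorn": 476,
}

# Response counts per word, copied from the second table.
responses_by_word = {"no": 8157, "yes": 11371}

total_by_type = sum(files_by_type.values())
total_by_word = sum(responses_by_word.values())

# Both tables should describe the same set of files.
tables_agree = total_by_type == total_by_word

# Share of "yes" responses among all responses.
yes_fraction = responses_by_word["yes"] / total_by_word

# Rough mean file length, assuming exactly 7 hours of speech
# (the description only says "about 7 hours").
approx_seconds_per_file = 7 * 3600 / total_by_type

print(f"files (by type): {total_by_type}")
print(f"files (by word): {total_by_word}")
print(f"tables agree:    {tables_agree}")
print(f"yes fraction:    {yes_fraction:.3f}")
print(f"approx s/file:   {approx_seconds_per_file:.2f}")
```

Both sums come to 19528 files, so the two tables are consistent with each other.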