CSLU: Apple Words and Phrases
|Item Name:||CSLU: Apple Words and Phrases|
|LDC Catalog No.:||LDC2007S13|
|Release Date:||September 17, 2007|
|Data Source(s):||telephone speech|
|Application(s):||speech recognition, speaker verification, speaker identification|
|Online Documentation:||LDC2007S13 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Noel, Mike. CSLU: Apple Words and Phrases LDC2007S13. Web Download. Philadelphia: Linguistic Data Consortium, 2007.|
Apple Words and Phrases Version 1.3 contains approximately 69.5 hours of speech from 3008 telephone calls placed on analog and digital phone systems. Apple Computer, Inc. supported the development of this data and also supplied the list of words and phrases collected. Callers responded to questions and repeated a list of phrases as they were prompted.
Subjects calling the analog system (998 callers) were employees of Apple Computer, Inc. and were solicited through interoffice email within the company. Subjects calling the digital system (2010 callers) were responding to Usenet postings or newspaper advertisements placed in several cities across the United States. Each subject called the CSLU data collection system by dialing a toll-free number. The analog data were collected via a Worldport Pod on an Apple Quadra A/V. The digital data were collected with the CSLU T1 digital data collection system.
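As a quick consistency check on the figures quoted above, the short Python sketch below adds the analog and digital caller counts and derives a rough average amount of speech per call. It assumes one call per caller, which the description implies but does not state outright, and the 69.5 hour total is itself approximate, so the average is only indicative. The function and constant names are illustrative and are not part of the corpus distribution.

```python
# Figures quoted in the corpus description.
ANALOG_CALLERS = 998
DIGITAL_CALLERS = 2010
TOTAL_CALLS = 3008
TOTAL_HOURS = 69.5  # approximate, per the description


def summarize():
    """Return basic derived statistics for the corpus figures."""
    # Assumption: each caller placed exactly one call.
    callers = ANALOG_CALLERS + DIGITAL_CALLERS
    avg_seconds_per_call = TOTAL_HOURS * 3600 / TOTAL_CALLS
    digital_share = DIGITAL_CALLERS / callers
    return {
        "callers": callers,
        "counts_match": callers == TOTAL_CALLS,
        "avg_seconds_per_call": avg_seconds_per_call,
        "digital_share": digital_share,
    }


if __name__ == "__main__":
    stats = summarize()
    print(f"callers: {stats['callers']} (matches call count: {stats['counts_match']})")
    print(f"average speech per call: {stats['avg_seconds_per_call']:.1f} s")
    print(f"share of digital calls: {stats['digital_share']:.1%}")
```

Under these assumptions the caller counts sum to the stated 3008 calls, and each call contributes a little under a minute and a half of speech on average.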
Callers were prompted to answer certain questions, including "What is your native language?", "In which city and state did you spend most of your childhood?", "What time is it now?" and "What day is today?" Callers were also instructed to repeat various command-and-control phrases, including "play previous message again", "make a meeting for today", "quit", "who is at work", "what is the area code for this state", "hello, what are my messages", "help", "please send a car from the city", "delete my email tomorrow", "read this text", "erase all information", "record extended phonebook", "transfer all calls to home at twelve o'clock", "record urgent message" and "find the operator".
Each recorded utterance was listened to by a human verifier to determine if the speaker adequately followed the directions. If an utterance contained extraneous words or excessive noise, it was not included in the corpus.