CSLU: Apple Words and Phrases


Item Name: CSLU: Apple Words and Phrases
Authors: Mike Noel
LDC Catalog No.: LDC2007S13
ISBN: 1-58563-456-5
Release Date: Sep 17, 2007
Data Type: speech
Data Source(s): telephone speech
Application(s): speaker identification, speaker verification, speech recognition
Language(s): English
Language ID(s): eng
Distribution: 1 DVD
Member fee: $0 for 2007 members
Non-member Fee: US $150.00
Reduced-License Fee: US $150.00
Extra-Copy Fee: US $150.00
Non-member License: yes
Member License: yes
Online documentation: yes
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Mike Noel
2007
CSLU: Apple Words and Phrases
Linguistic Data Consortium, Philadelphia

Introduction

Apple Words and Phrases Version 1.3 contains approximately 69.5 hours of speech from 3008 telephone calls placed on analog and digital phone systems. Apple Computer, Inc. supported the development of this data and also supplied the list of words and phrases collected. Callers responded to questions and repeated a list of phrases as they were prompted.

Data

Subjects calling the analog system (998 callers) were employees of Apple Computer, Inc. and were solicited through interoffice email within the company. Subjects calling the digital system (2010 callers) were responding to USEnet postings or newspaper advertisements placed in several cities across the United States. Each subject called the CSLU data collection system by dialing a toll-free number. The analog data were collected via a Worldport Pod on an Apple Quadra A/V. The digital data were collected with the CSLU T1 digital data collection system.

Callers were prompted to answer certain questions including, What is your native language? In which city and state did you spend most of your childhood? What time is it now? What day is today? Callers were also instructed to repeat various comnand and control type phrases, including "play previous message again", "make a meeting for today", "quit", "who is at work", "what is the area code for this state", "hello, what are my messages", "help", "please send a car from the city", "delete my email tomorrow", "read this text", "erase all information", "record extended phonebook", "transfer all calls to home at twelve o'clock", "record urgent message" and "find the operator".

Each recorded utterance was listened to by a human verifier to determine if the speaker adequately followed the directions. If an utterance contained extraneous words or excessive noise, it was not included in the corpus.

Samples

Content Copyright

Portions 2000-2002 Center for Spoken Language Understanding, Oregon Health & Science University, 2007 Trustees of the University of Pennsylvania