CSLU: Portland Cellular Telephone Speech Version 1.3, LDC2008S01, ISBN 1-58563-463-8 was created by the Center for Spoken Language Understanding (CSLU) at OGI School of Science and Engineering, Oregon Health and Science University, Beaverton, Oregon. It consists of cellular telephone speech and corresponding transcripts, specifically, 7,571 utterances from 515 speakers who made calls in the Portland, Oregon area using cellular telephones.
Speakers called the CSLU data collection system on cellular telephones, and they were asked to repeat certain phrases and to respond to other prompts. Two prompt protocols were used: an In Vehicle Protocol for speakers calling from inside a vehicle and a Not in Vehicle Protocol for those calling from outside a vehicle. The protocols shared several questions, but each protocol contained distinct queries designed to probe the conditions of the caller's in vehicle/not in vehicle surroundings. Not every caller provided a response to each prompt.
docs/ | The documentation directory. This directory contains further documentation for CSLU: Portland Cellular Telephone Speech Version 1.3. |
labels/ | Phonetic labeling directory. This directory contains phonetic labels and phonetic transcriptions for corresponding speech files. |
misc/ | Miscellaneous directory. This directory contains software tools and scripts. |
speech/ | Speech directory. This directory contains the actual .wav files; there are subdirectories within this directory based on the speaker's ID number. |
trans/ | Transcriptions directory. This directory contains orthographic transcriptions for most of the speech files. |
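As a rough illustration of navigating this layout, the following Python sketch groups the .wav files under speech/ by the name of the directory that holds them. It assumes that the immediate parent directory of each file corresponds to a speaker ID; the exact nesting depth is not specified here, and the function name `index_speech_files` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def index_speech_files(corpus_root):
    """Map each directory name under speech/ to the sorted .wav paths it holds."""
    speech_dir = Path(corpus_root) / "speech"
    if not speech_dir.is_dir():
        return {}
    index = defaultdict(list)
    for wav_path in sorted(speech_dir.rglob("*.wav")):
        # Assumption: the immediate parent directory name identifies the speaker.
        index[wav_path.parent.name].append(wav_path)
    return dict(index)
```

Calling `index_speech_files("/path/to/corpus")` returns a dictionary keyed by directory name, which can then be matched against the files in labels/ and trans/ once their naming scheme has been checked in docs/.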
The speech data was captured digitally from CSLU's T1 connection and saved as 8 kHz, 16-bit linear.
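The stated format can be spot-checked per file. The sketch below uses Python's standard `wave` module and assumes the .wav files carry standard RIFF headers that the module can read; if the files use a different header, a different reader would be needed. The function name `check_wav_format` is hypothetical.

```python
import wave


def check_wav_format(path, expected_rate=8000, expected_width_bytes=2):
    """Return (matches, details) for one file against 8 kHz, 16-bit audio."""
    with wave.open(str(path), "rb") as wav_file:
        rate = wav_file.getframerate()
        width = wav_file.getsampwidth()   # bytes per sample; 2 means 16-bit
        channels = wav_file.getnchannels()
        frames = wav_file.getnframes()
    details = {
        "sample_rate_hz": rate,
        "sample_width_bytes": width,
        "channels": channels,
        "duration_s": frames / rate if rate else 0.0,
    }
    matches = rate == expected_rate and width == expected_width_bytes
    return matches, details
```

The function returns a flag together with the header values it read, so a file that does not match can be reported with its actual parameters.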
The text transcriptions in this corpus were produced using the non time-aligned word-level conventions described in The CSLU Labeling Guide, which is included in the documentation for this release. CSLU: Portland Cellular Telephone Speech contains orthographic and phonetic transcriptions of corresponding speech files. Non time-aligned orthographic transcriptions provide quick access to the content of an utterance; they may contain markers for word boundaries to support access and retrieval at the lexical level. Phonetic/phonemic transcriptions represent the phonetic content of an utterance at a given level of detail that is made explicit by the use of diacritics. The phonetic phenomena transcribed include excessive nasalization, glottalization, frication on a stop, centralization, lateralization, rounding and palatalization.
Each caller was asked the same initial set of questions set forth below. The string in {} after the prompt is a "type key" used to identify those utterances that are responses to the corresponding prompt.
Thank you for calling the Oregon Graduate Institute. In collaboration with Cellular One we are collecting speech samples from cellular phones. The speech samples you provide will allow us to perform basic research that may lead to improved services for cellular phone users. We will share the speech data you provide with other researchers, but your identity will be kept confidential.
Are you calling from within a vehicle? Please say yes or no. {yorn}
The first four questions provide us with background information. Please wait for the beep before speaking. {iv_inst1}
Are you male or female? {morf}
What is your native language? {nlang}
What city and state did you grow up in? {growup}
What is your date of birth? {dob}
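The prompt listings in this document end with the type key in braces, so the key can be separated from the prompt text mechanically. The following is a minimal sketch of that split for lines written in this document's notation; the function name `split_prompt` is hypothetical, and nothing here is taken from the corpus tools in misc/.

```python
import re

# A prompt line as written in this document: prompt text, then "{type_key}".
TYPE_KEY_PATTERN = re.compile(r"^(?P<prompt>.*?)\s*\{(?P<key>\w+)\}\s*$")


def split_prompt(line):
    """Return (prompt_text, type_key); type_key is None when no key is present."""
    stripped = line.strip()
    match = TYPE_KEY_PATTERN.match(stripped)
    if match is None:
        return stripped, None
    return match.group("prompt"), match.group("key")


print(split_prompt("Are you male or female? {morf}"))
```

Lines without a trailing key, such as the opening greeting, come back unchanged with `None` as the key.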
In vehicle protocol
Those calling from within a vehicle were asked the following specific questions once the background portion was complete:
The answers to the next set of questions will provide information about driving conditions and the recording conditions in your car.
Please tell us if your window is open, or if you are using the windshield wipers, heater or radio? {environ}
Briefly describe the traffic conditions. {traffic}
About how fast are you traveling right now? {fast}
Are you using a digital or analog phone? {dora}
If you know the brand and model of your cellular phone, please tell us now. {brand}
Are you using your phone's handset or a mounted microphone? {horm}
The answers to the next questions will provide us with some background information about you and some examples of spoken digits and letters.
Please say your last name. {lastname}
Please spell your last name. {spelllastname}
Please say a familiar license plate number. {flpnum}
Please say a familiar phone number. {fphone}
What time is it now? {time}
Please say another phone number. {phone2}
What is today's date? {date}
Please say the days of the week. {week}
The last question is designed to provide samples of natural continuous speech. When you hear the beep, we would like you to talk for about half a minute and:
tell us something about yourself. {story1}
describe a typical day in your life. {story2}
tell us what you like most about where you live. {story3}
tell us about your family. {story4}
tell us about your dream home. {story5}
tell us something about the town where you grew up. {story6}
tell us about your favorite restaurant. {story7}
tell us about your favorite sport or hobby. {story8}
tell us about your favorite movie or television show. {story9}
We would like you to keep talking until you hear two beeps. We will give you a moment to collect your thoughts. Please begin speaking at the beep.
Thanks again for your help. If you would like to receive a gift certificate to McDonalds, TCBY, B. Dalton books, Baskin-Robbins, or Blockbuster video, please let us know which one, and leave your name and address.
Hang up when you are finished. {address}
Not in vehicle protocol
Those calling from outside a vehicle were asked the following questions once the background portion was complete:
The answers to the next questions will provide information about the source of background noise during your call.
Please describe your location. {location}
Please identify any background noises that we may be hearing while you speak. For example, is the radio or TV on? Are there other people speaking nearby? {bnoise}
Are you using a digital or analog phone?
If you know the brand and model of your cellular phone, please tell us now.
Are you speaking directly into your phone's handset or a speaker phone? {horm_niv}
The answers to the next questions will provide us with some background information about you and some examples of spoken digits and letters.
Please say your last name.
Please spell your last name.
Please say a familiar license plate number.
Please say a familiar phone number.
What time is it now?
Please say another phone number.
What is today's date?
Please say the days of the week.
The last question is designed to provide samples of natural continuous speech. When you hear the beep, we would like you to talk for about half a minute and:
tell us something about yourself.
describe a typical day in your life.
tell us what you like most about where you live.
tell us about your family.
tell us about your dream home.
tell us something about the town where you grew up.
tell us about your favorite restaurant.
tell us about your favorite sport or hobby.
tell us about your favorite movie or television show.
We would like you to keep talking until you hear two beeps. We will give you a moment to collect your thoughts. Please begin speaking at the beep.
Thanks again for your help. If you would like to receive a gift certificate to McDonalds, TCBY, B. Dalton books, Baskin-Robbins, or Blockbuster video, please let us know which one, and leave your name and address. Hang up when you are finished.
Statistics
The total number of utterances per type key is set forth below:
bnoise | 234 |
brand | 380 |
date | 359 |
dob | 405 |
dora | 396 |
environ | 167 |
fast | 165 |
flpnum | 370 |
fphone | 386 |
growup | 405 |
horm | 165 |
horm_niv | 226 |
lastname | 749 |
location | 232 |
morf | 417 |
nlang | 411 |
phone2 | 364 |
spelllastname | 382 |
story1 | 38 |
story2 | 38 |
story3 | 34 |
story4 | 37 |
story5 | 43 |
story6 | 43 |
story7 | 41 |
story8 | 44 |
story9 | 38 |
time | 357 |
traffic | 167 |
week | 360 |
yorn | 500 |
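For working with these figures programmatically, the counts above can be loaded into a dictionary. The sketch below copies the per-key counts from this section as whitespace-separated pairs and, as one example, adds up the responses to the nine story prompts. The names `STATS_TEXT` and `parse_counts` are hypothetical.

```python
# Per-type-key utterance counts, copied from the statistics listed above.
STATS_TEXT = """
bnoise 234 brand 380 date 359 dob 405 dora 396 environ 167 fast 165
flpnum 370 fphone 386 growup 405 horm 165 horm_niv 226 lastname 749
location 232 morf 417 nlang 411 phone2 364 spelllastname 382
story1 38 story2 38 story3 34 story4 37 story5 43 story6 43
story7 41 story8 44 story9 38 time 357 traffic 167 week 360 yorn 500
"""


def parse_counts(text):
    """Parse alternating 'key count' tokens into a {key: count} dictionary."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of tokens (key/count pairs)")
    return {key: int(count) for key, count in zip(tokens[0::2], tokens[1::2])}


counts = parse_counts(STATS_TEXT)

# Each caller was given one story prompt, so the story keys are summed together.
story_total = sum(n for key, n in counts.items() if key.startswith("story"))
print(story_total)  # prints 356
```

The same dictionary can be used to compare protocol-specific keys, for example `counts["traffic"]` for the In Vehicle Protocol against `counts["location"]` for the Not in Vehicle Protocol.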
Additional information, updates, and bug fixes may be available in the LDC catalog entry for this corpus, LDC2008S01.
Portions © 1995, 1998, 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2008 Trustees of the University of Pennsylvania