CSLU: Alphadigit Version 1.3
Item Name: | CSLU: Alphadigit Version 1.3 |
Author(s): | Ronald Allan Cole, M. Noel, T. Lander, T. Durham |
LDC Catalog No.: | LDC2008S06 |
ISBN: | 1-58563-478-6 |
ISLRN: | 569-415-930-320-1 |
DOI: | https://doi.org/10.35111/eayh-nv69 |
Release Date: | July 16, 2008 |
Member Year(s): | 2008 |
DCMI Type(s): | Sound |
Sample Type: | ulaw |
Sample Rate: | 8000 |
Data Source(s): | telephone conversations |
Application(s): | speech recognition |
Language(s): | English |
Language ID(s): | eng |
License(s): | CSLU Agreement |
Online Documentation: | LDC2008S06 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Cole, Ronald Allan, et al. CSLU: Alphadigit Version 1.3 LDC2008S06. Web Download. Philadelphia: Linguistic Data Consortium, 2008. |
Introduction
This file contains documentation for CSLU: Alphadigit Version 1.3, Linguistic Data Consortium (LDC) catalog number LDC2008S06 and ISBN 1-58563-478-6.
Alphadigit Version 1.3 is a collection of 78,044 utterances from 3,025 speakers saying six-digit strings of letters and digits over the telephone for a total of approximately 82 hours of speech. Each speech file has corresponding orthographic and phonemic transcriptions. This corpus was created by the Center for Spoken Language Understanding (CSLU), Oregon Health & Science University, Beaverton, Oregon.
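As a quick sanity check on these figures, the per-utterance and per-speaker averages can be derived from the totals quoted above. This is only a back-of-the-envelope sketch: the 82-hour figure is approximate, so the results are approximate too, and the variable names are illustrative.

```python
# Totals as stated in the corpus description (82 hours is approximate).
TOTAL_HOURS = 82
UTTERANCES = 78_044
SPEAKERS = 3_025

# Average length of one recorded string, in seconds.
avg_utterance_seconds = TOTAL_HOURS * 3600 / UTTERANCES

# Average number of strings recorded per speaker.
utterances_per_speaker = UTTERANCES / SPEAKERS

print(f"average utterance length: {avg_utterance_seconds:.2f} s")
print(f"average utterances per speaker: {utterances_per_speaker:.2f}")
```

The averages come out to a little under 4 seconds per utterance and roughly 26 utterances per speaker.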
Data
Speakers were recruited using Usenet postings. Respondents registered for the collection by completing an online form. Once registered, they received a list of 18-29 six-digit strings (e.g., "a 2 b 4 5 g") and participation instructions. Speakers called the CSLU data collection system by dialing a toll-free number and were prompted for each string; 1,102 different strings were used over the course of the data collection. The lists were set up to balance phonetic context across all letter and digit pairs.
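The prompt format described above (six space-separated items, each a single letter or digit) can be checked with a few lines of code. The sketch below is illustrative only: `parse_prompt` is a hypothetical helper, and it assumes prompts are written exactly like the example "a 2 b 4 5 g". The transcription files in the corpus may use a different notation.

```python
import string

# Assumption: each item is one ASCII letter or one digit, as in "a 2 b 4 5 g".
VALID_TOKENS = set(string.ascii_lowercase) | set(string.digits)


def parse_prompt(prompt: str) -> list[str]:
    """Split a prompt into its six items and validate each one."""
    tokens = prompt.lower().split()
    if len(tokens) != 6:
        raise ValueError(f"expected 6 items, got {len(tokens)}: {prompt!r}")
    for token in tokens:
        if token not in VALID_TOKENS:
            raise ValueError(f"not a single letter or digit: {token!r}")
    return tokens


print(parse_prompt("a 2 b 4 5 g"))
```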
The data were recorded directly from a digital phone line, without digital-to-analog or analog-to-digital conversion at the recording end, using the CSLU T1 digital data collection system. The sampling rate was 8 kHz and the files were stored in 8-bit mu-law format on a UNIX file system. The files have been converted to the RIFF standard file format with 16-bit linear encoding.
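To illustrate what the step from 8-bit mu-law to 16-bit linear samples involves, here is a minimal sketch of the standard G.711 mu-law expansion. It is not the tool CSLU used for the conversion, which is not documented here, and the function names are illustrative. Because the distributed files are already 16-bit linear, this is useful mainly for understanding the original storage format.

```python
def ulaw_to_linear(byte: int) -> int:
    """Expand one 8-bit mu-law code (0-255) to a signed 16-bit linear sample."""
    byte = ~byte & 0xFF              # mu-law codes are stored bit-inverted
    sign = byte & 0x80               # top bit carries the sign
    exponent = (byte >> 4) & 0x07    # 3-bit segment number
    mantissa = byte & 0x0F           # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw(data: bytes) -> list[int]:
    """Expand a buffer of mu-law bytes to a list of linear samples."""
    return [ulaw_to_linear(b) for b in data]


# 0xFF and 0x7F both encode silence; 0x00 and 0x80 are the two extremes.
print(decode_ulaw(bytes([0xFF, 0x7F, 0x00, 0x80])))
```

At the 8 kHz sampling rate stated above, one second of audio is 8,000 mu-law bytes, which expand to 8,000 samples of two bytes each.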
Transcription
All of the files included in this corpus have corresponding non-time-aligned word-level transcriptions and time-aligned phoneme-level transcriptions (produced by automatic forced alignment) that comply with the conventions in the CSLU Labeling Guide. Non-time-aligned orthographic transcriptions provide quick access to the content of an utterance; they may contain markers for word boundaries to support access and retrieval at the lexical level. Phonetic/phonemic transcriptions represent the phonetic content of an utterance at a given level of detail that is made explicit by the use of diacritics. Phonetic phenomena transcribed include excessive nasalization, glottalization, frication on a stop, centralization, lateralization, rounding and palatalization.
Samples
For an example of the speech contained in this corpus, please listen to this audio sample (MS wave).