CSLU: Stories v 1.2

Item Name: CSLU: Stories v 1.2
Authors: Yeshwant Muthusamy, Ron Cole and Beatrice Oshika
LDC Catalog No.: LDC2006S14
ISBN: 1-58563-366-6
Release Date: Oct 25, 2006
Data Type: speech
Sample Rate: 8000 Hz
Sampling Format: pcm
Data Source(s): telephone speech
Language(s): English
Language ID(s): eng
Distribution: 1 CD
Member Fee: $0 for 2006 members
Non-member Fee: US $150.00
Reduced-License Fee: US $150.00
Extra-Copy Fee: US $150.00
Non-member License: yes
Member License: yes
Online documentation: yes
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Yeshwant Muthusamy, Ron Cole and Beatrice Oshika
CSLU: Stories v 1.2
Linguistic Data Consortium, Philadelphia


This file contains documentation for CSLU: Stories v 1.2, Linguistic Data Consortium (LDC) catalog number LDC2006S14, ISBN 1-58563-366-6.

CSLU: Stories contains extemporaneous speech collected from English speakers in the CSLU Multilanguage Telephone Speech data collection. Each speaker was asked to speak for one minute on a topic of his or her choice, and the resulting utterances make up the Stories corpus.


The Stories corpus comprises:

  1. Speech files for the 702 calls
  2. Time-aligned word-level transcriptions (and corresponding comment files) for approximately 322 stories
  3. Word transcriptions (not time aligned) for 702 stories
  4. Time-aligned phonetic labels for 702 stories
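
The counts above can be turned into a rough sense of scale. The following is a minimal sketch, not part of the corpus documentation: it assumes every story runs the full nominal minute (actual recordings may be shorter), and the function names are hypothetical. The 8000 Hz sample rate and the story counts are taken from this description.

```python
# Figures stated in the corpus description.
TOTAL_STORIES = 702
TIME_ALIGNED_WORD_STORIES = 322  # described as approximate
SAMPLE_RATE_HZ = 8000

# Assumption: each story lasts the full minute the speaker was asked for.
# Real recordings may be shorter, so treat results as an upper bound.
NOMINAL_SECONDS_PER_STORY = 60


def coverage(subset: int, total: int) -> float:
    """Fraction of stories that belong to a given annotated subset."""
    return subset / total


def nominal_hours(stories: int, seconds_per_story: int) -> float:
    """Upper-bound audio duration in hours under the one-minute assumption."""
    return stories * seconds_per_story / 3600


def nominal_samples_per_story(seconds: int, sample_rate_hz: int) -> int:
    """Number of audio samples in one story of the given nominal length."""
    return seconds * sample_rate_hz


if __name__ == "__main__":
    share = coverage(TIME_ALIGNED_WORD_STORIES, TOTAL_STORIES)
    hours = nominal_hours(TOTAL_STORIES, NOMINAL_SECONDS_PER_STORY)
    samples = nominal_samples_per_story(NOMINAL_SECONDS_PER_STORY, SAMPLE_RATE_HZ)
    print(f"Time-aligned word transcriptions cover about {share:.1%} of stories")
    print(f"Nominal audio: at most {hours:.1f} hours")
    print(f"Nominal samples per story: {samples}")
```

Under these assumptions, somewhat fewer than half of the stories carry time-aligned word-level transcriptions, which matters when planning experiments that need word boundaries rather than phonetic labels alone.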



Content Copyright

Portions © 2002 Center for Spoken Language Understanding, Oregon Health and Science University, © 2006 Trustees of the University of Pennsylvania