                         Alphadigit Corpus
                            Release 1.3

              Center for Spoken Language Understanding


UPDATED: 23 August 2002

Use of this corpus is permitted only under the conditions
of the signed license agreement. Use or redistribution of
this corpus outside the agreement is prohibited by law.

Overview
--------
The Alphadigit Corpus includes recorded utterances from
3025 different callers and a transcription of each
utterance.  There are a total of 78044 speech files.  Every
file in this corpus has a corresponding non-time-aligned
word-level transcription and a time-aligned phoneme-level
transcription (automatic forced alignment).  The
transcriptions comply with the conventions in the CSLU
Labeling Guide.

Distribution Directory Structure
--------------------------------
This is the distribution for Release 1.3 of the Alphadigit
corpus.  This corpus is distributed by the Center for 
Spoken Language Understanding of the Oregon Graduate
Institute.  Following is a description of the directory
structure in this release:

  readme.txt	General information regarding the corpus.

  docs/		The documentation directory. This
		directory contains further documentation for
		the Alphadigit corpus.

  labels/	Phonetic labeling directory. This directory
		contains time-aligned phoneme-level
		transcriptions (automatic forced alignment).

  misc/		Miscellaneous directory, possibly
		containing software tools and scripts.

  speech/	The speech directory contains the actual 
		.wav files. There are many labeled
		subdirectories within the speech directory.

  trans/	The transcriptions directory. This directory
		contains non-time-aligned word-level
		transcriptions for each of the speech files.

This corpus requires approximately 5.5 GB of disk space.
Please see the docs/ directory for further documentation.
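
As a quick sanity check after unpacking, the number of .wav
files under speech/ can be compared with the total of 78044
given in the Overview.  The following Python sketch is not
part of the distribution; the function names are hypothetical,
and it assumes only that every speech file ends in ".wav"
(matched case-insensitively) and sits somewhere below speech/.

```python
from pathlib import Path

# Total number of speech files stated in the Overview section.
EXPECTED_SPEECH_FILES = 78044


def count_wav_files(corpus_root):
    """Count .wav files under <corpus_root>/speech, including subdirectories."""
    speech_dir = Path(corpus_root) / "speech"
    return sum(
        1
        for path in speech_dir.rglob("*")
        # Assumption: the extension is ".wav"; case is ignored to be safe.
        if path.is_file() and path.suffix.lower() == ".wav"
    )


def check_install(corpus_root):
    """Return (files_found, matches_expected_total) for an unpacked corpus."""
    found = count_wav_files(corpus_root)
    return found, found == EXPECTED_SPEECH_FILES


if __name__ == "__main__":
    import sys

    # Corpus root is the directory that contains readme.txt and speech/.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    found, ok = check_install(root)
    status = "OK" if ok else "MISMATCH"
    print(f"{found} .wav files found (expected {EXPECTED_SPEECH_FILES}): {status}")
```

Run it with the corpus root as the only argument.  A mismatch
may indicate an incomplete copy rather than a problem with the
corpus itself.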

Contact Information
-------------------
Further information about this corpus can be found on our
web site: <http://www.cslu.ogi.edu>.

Refer specific questions to:

- Center for Spoken Language Understanding
- Oregon Health & Science University
- email   : corpora@cslu.ogi.edu
- Address : 20000 NW Walker Road
            Beaverton, OR 97006 USA

Constructive feedback about this corpus is appreciated.