Alphadigit Corpus
                            Release 1.3

              Center for Spoken Language Understanding


UPDATED: 23 August 2002


Directory Structure
-------------------
This document describes the directory structure of this
release. The top-level entries are:

  readme.txt	General information regarding the corpus.

  docs/		The documentation directory. This
		directory contains further documentation for
		the Alphadigit corpus.

  labels/	Phonetic labeling directory. This directory
                contains time aligned phoneme-level
                transcriptions (automatic forced alignment).

  misc/		Miscellaneous directory, possibly
		containing software tools and scripts.

  speech/	The speech directory. This directory contains
		the .wav files, divided into subdirectories by
		speaker ID number.

  trans/	The transcriptions directory. This directory
		contains non-time-aligned word-level
		transcriptions for each of the speech files.

This corpus requires approximately 5.5 GB of disk space.

Visually, the directory structure looks something like this:

			 alphadigit
		             |
   --------------------------------------------------
   |           |        |        |        |         |
readme.txt   /docs   /labels   /misc   /speech   /trans

The /speech directory contains the speech data.  The files
are divided into sub-directories based on the speaker's ID
number.

The /trans directory contains non-time-aligned word-level
transcriptions for each of the utterances. As with the
speech files, the transcription files are divided into 
sub-directories based on the speaker's ID number. 
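
As a quick check on this layout, the following Python sketch counts
the .wav files found under each top-level sub-directory of /speech.
The function name count_wavs_per_subdir and the path passed to it are
illustrative only and are not part of this release. The sketch makes
no assumption about how the speaker sub-directories are named.

```python
from collections import Counter
from pathlib import Path


def count_wavs_per_subdir(speech_root):
    """Count .wav files under each top-level sub-directory of speech_root.

    Files sitting directly in speech_root (not in any sub-directory)
    are counted under the key ".".
    """
    root = Path(speech_root)
    counts = Counter()
    for wav in root.rglob("*.wav"):
        rel = wav.relative_to(root)
        # rel.parts[0] is the first directory below speech_root,
        # unless the file sits directly in speech_root itself.
        top = rel.parts[0] if len(rel.parts) > 1 else "."
        counts[top] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the corpus is installed.
    for subdir, n in sorted(count_wavs_per_subdir("alphadigit/speech").items()):
        print(subdir, n)
```

The same function can be pointed at /trans with a different glob
pattern once the transcription file extension is known.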

Files are named according to the following convention:

     AD-1.p22.wav

The first field ("AD") is the prefix indicating the corpus
to which this data belongs, the second field ("1") represents 
a unique ID number for the speaker, and the
third field ("p22") indicates the prompt to which the
speaker was responding.
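
The naming convention above can be split into its fields
mechanically. The following Python sketch is one way to do so. The
pattern assumes that every file name has exactly the shape of the
example (letters, a hyphen, a numeric speaker ID, a prompt of the form
"p" plus digits, and an extension); this is an assumption drawn from
the single example and may not cover every file in the release. The
names UtteranceName and parse_utterance_name are illustrative only.

```python
import re
from typing import NamedTuple


class UtteranceName(NamedTuple):
    corpus: str      # corpus prefix, e.g. "AD"
    speaker_id: int  # unique speaker ID number, e.g. 1
    prompt: str      # prompt identifier, e.g. "p22"
    extension: str   # file extension without the dot, e.g. "wav"


# Assumed shape: <prefix>-<speaker>.<prompt>.<extension>
_NAME_RE = re.compile(
    r"^(?P<corpus>[A-Za-z]+)-(?P<speaker>\d+)\.(?P<prompt>p\d+)\.(?P<ext>\w+)$"
)


def parse_utterance_name(filename):
    """Split a file name such as "AD-1.p22.wav" into its fields."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError("unexpected file name: %r" % (filename,))
    return UtteranceName(
        corpus=match.group("corpus"),
        speaker_id=int(match.group("speaker")),
        prompt=match.group("prompt"),
        extension=match.group("ext"),
    )


if __name__ == "__main__":
    print(parse_utterance_name("AD-1.p22.wav"))
```

Names that do not match the assumed shape raise ValueError rather
than being silently skipped, so unexpected files are noticed early.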