                  Foreign Accented English Corpus
                            Release 1.2

              Center for Spoken Language Understanding


UPDATED: 3 June 2002

Use of this corpus is permitted only under the conditions
of the signed license agreement. Use or redistribution of
this corpus outside the agreement is prohibited by law.

Overview
--------
The Foreign Accented English (FAE) Corpus is a subset of
the CSLU 22 Language Corpus, and consists of continuous
speech in English by native speakers of 22 different
languages.

The FAE Corpus consists of 4925 utterances, information
about each speaker's linguistic background, and perceptual
judgements of the degree of accent in each utterance.  The
callers were asked to speak about themselves in English
for 20 seconds.

Our goals in developing and releasing the FAE Corpus were
to support the study of the underlying characteristics of
foreign accent and to enable research, development, and
evaluation of algorithms for the identification and
understanding of accented speech.

Distribution Directory Structure
--------------------------------
This is the distribution for Release 1.2 of the Foreign
Accented English Corpus.  This corpus is distributed by the
Center for Spoken Language Understanding of the Oregon
Graduate Institute.  Following is a description of the
directory structure in this release:

  readme.txt	General information regarding the corpus.

  docs/		The documentation directory. This
		directory contains further documentation for
		the Foreign Accented English corpus.

  labels/	Phonetic labeling directory. This directory
		is empty for this corpus.

  misc/		Miscellaneous directory, possibly
		containing software tools and scripts.
		This directory contains info files for 
		many of the speech files. A description of 
		info files can be found in the overview.txt 
		file in the /docs directory.

  speech/	The speech directory contains the actual 
		.wav files. There are many labeled
		subdirectories within the speech directory.

  trans/	The transcriptions directory. For this corpus, 
		there are no transcriptions in the directory.

This corpus requires approximately 1.4GB of disk space.
Please see the /docs directory for further documentation.
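
After unpacking the distribution, a quick sanity check is to count the
.wav files under speech/ and compare the total with the 4925 utterances
quoted in the Overview. The following Python sketch is one possible way
to do that and is not a tool shipped with the corpus. It assumes one
.wav file per utterance, which this readme does not state explicitly;
the function names find_wav_files and summarize are illustrative only.

```python
from pathlib import Path

# Utterance count quoted in the Overview section of this readme.
EXPECTED_UTTERANCES = 4925


def find_wav_files(corpus_root):
    """Return a sorted list of .wav paths found under <corpus_root>/speech."""
    speech_dir = Path(corpus_root) / "speech"
    # rglob descends into the subdirectories of speech/;
    # the extension is matched case-insensitively.
    return sorted(
        p for p in speech_dir.rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )


def summarize(corpus_root):
    """Count .wav files per immediate subdirectory of speech/."""
    speech_dir = Path(corpus_root) / "speech"
    counts = {}
    for wav in find_wav_files(corpus_root):
        rel = wav.relative_to(speech_dir)
        # Files sitting directly in speech/ are grouped under ".".
        top = rel.parts[0] if len(rel.parts) > 1 else "."
        counts[top] = counts.get(top, 0) + 1
    return counts


if __name__ == "__main__":
    import sys

    # Assumption: the corpus root is given as the first argument,
    # defaulting to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    files = find_wav_files(root)
    print(f"{len(files)} .wav files found "
          f"(readme states {EXPECTED_UTTERANCES} utterances)")
    for name, n in sorted(summarize(root).items()):
        print(f"  {name}: {n}")
```

Run it with the directory that contains readme.txt as the argument. A
total that differs from 4925 may point to an incomplete copy, or may
simply mean that the one-file-per-utterance assumption does not hold.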

Contact Information
-------------------
Further information about this corpus can be found on our
web site: <http://www.cslu.ogi.edu>.

Refer specific questions to:

- Alena Tkacova
- Linguistic Data Services Manager
- Center for Spoken Language Understanding
- Oregon Health & Science University
- email   : alca@asp.ogi.edu
- Phone   : 503 748-1600    
- FAX     : 503 748-7038
- Address : 20000 NW Walker Road
            Beaverton, OR 97006 USA

Constructive feedback about this corpus is appreciated.