Multilanguage Telephone Speech
Release Version 1.2
Center for Spoken Language Understanding
UPDATED: 3 June 2002

Use of this corpus is permitted only under the conditions of the signed
license agreement. Use or redistribution of this corpus outside the
agreement is prohibited by law.

Overview
--------

The Multilanguage Telephone Speech Corpus consists of telephone speech
from eleven languages: English, Farsi, French, German, Hindi, Japanese,
Korean, Mandarin, Spanish, Tamil, and Vietnamese. The corpus contains
fixed-vocabulary utterances (e.g., days of the week) as well as fluent
continuous speech. The current release includes recorded utterances from
about 2052 speakers, for a total of about 38.5 hours of speech.
Time-aligned phonetic transcriptions for 619 of the utterances are also
included.

Distribution Directory Structure
--------------------------------

This corpus is distributed by the Center for Spoken Language
Understanding of the Oregon Health & Science University. The directory
structure of this release is as follows:

readme.txt   This file.

docs/        The documentation directory. This directory contains
             further documentation for the Multilanguage corpus.

labels/      The labels directory. This directory contains the .ptlola
             phonetic labels for 619 of the speech files.

speech/      The speech directory contains the actual .wav files.
             There are many further subdirectories within the speech
             directory.

misc/        Miscellaneous directory, possibly containing software
             tools and scripts.

trans/       Phonetic labeling directory. This directory is empty for
             this corpus.

This corpus requires approximately 2.2GB of disk space.

Please see the docs/ directory for further documentation.

Contact Information
-------------------

Further information about this corpus can be found on our web site.
Refer specific questions to:

- Center for Spoken Language Understanding
- Oregon Health & Science University
- email   : corpora@cslu.ogi.edu
- Address : 20000 NW Walker Road
            Beaverton, OR 97006 USA

Constructive feedback about this corpus is appreciated.
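
Example: Checking an Installed Copy
-----------------------------------

The layout listed under Distribution Directory Structure can be checked
after unpacking with a short script. The following is a minimal sketch
and is not part of the distribution. The function names (count_files,
check_layout) and the corpus_root argument are hypothetical, and the
script assumes only the top-level entry names and the .wav and .ptlola
extensions given in this file. It makes no assumption about how the
subdirectories of speech/ are organized.

```python
import os

# Top-level entries as listed in this readme.
EXPECTED_ENTRIES = ["readme.txt", "docs", "labels", "speech", "misc", "trans"]


def count_files(directory, extension):
    """Count files under `directory` (recursively) whose name ends with
    `extension`, ignoring case. A missing directory counts as zero."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(directory):
        total += sum(1 for name in filenames if name.lower().endswith(extension))
    return total


def check_layout(corpus_root):
    """Return (missing_entries, wav_count, label_count) for a corpus copy.

    `corpus_root` is the directory that contains readme.txt (an assumption
    about where the distribution was unpacked).
    """
    missing = [entry for entry in EXPECTED_ENTRIES
               if not os.path.exists(os.path.join(corpus_root, entry))]
    wav_count = count_files(os.path.join(corpus_root, "speech"), ".wav")
    label_count = count_files(os.path.join(corpus_root, "labels"), ".ptlola")
    return missing, wav_count, label_count
```

Calling check_layout on the unpacked directory returns the list of
missing top-level entries along with the number of .wav and .ptlola
files found. If the labels are stored as one file per labeled utterance,
which this file does not state, the label count would be expected to be
619.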