TIMIT Corpus

Data type: Speech
Speech type: Read
Domain(s): NA
Text available: Time-aligned orthographic and phonetic transcripts
Language: American English

General description:
The TIMIT corpus of read speech is designed to provide speech data for acoustic-phonetic studies and for the development and evaluation of automatic speech recognition systems. TIMIT contains broadband recordings of 630 speakers of 8 major dialects of American English, each reading 10 phonetically rich sentences. Corpus design was a joint effort among the Massachusetts Institute of Technology (MIT), SRI International (SRI), and Texas Instruments, Inc. (TI). The speech was recorded at TI, transcribed at MIT, and verified and prepared for CD-ROM production by the National Institute of Standards and Technology (NIST). The TIMIT Corpus transcriptions have been hand-verified. Test and training subsets, balanced for phonetic and dialectal coverage, are specified. Tabular computer-searchable information is included, as well as written documentation.

Useful links:
SRI International
Spoken Language Systems Group Homepage - MIT
Texas Instruments
NIST Homepage

Availability: LDC-Online and CD-ROM

Garofolo, John S., Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, and Nancy L. Dahlgren, "The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM" (printed documentation; available on request from the LDC).

Online Documentation:
Corpus description
Table of phone symbols used in phonemic dictionary and phonetic transcriptions
Description of phonemic lexicon
Description of suggested train/test subdivision
Table of speaker attributes
Table of sentence-ID numbers for each speaker
Table of sentence prompts and sentence-ID numbers
Phonemic dictionary of all orthographic words in prompts

Related Corpora:
NTIMIT: NYNEX Telephone Version of TIMIT Corpus
CTIMIT: Cellular TIMIT Speech Corpus
FFMTIMIT: Far Field Microphone Recording Version

Institution of Origin:
MIT, SRI International, Texas Instruments

Publisher & Place of Publication: NIST, Gaithersburg, MD

Collection Time Span: January-May 1986

Description of File Organization: Files are sorted by usage, dialect region, gender/speaker_id, sentence_id, file_type
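
The ordering above can be read as a directory path with one component per level. The following Python sketch splits such a path back into its fields. It assumes a layout of the form usage/dialect/speaker/sentence.type (for example TRAIN/DR1/FCJF0/SA1.WAV) and assumes that the first letter of the speaker directory encodes gender; the names TimitFile and parse_timit_path are hypothetical helpers introduced here for illustration, not part of the corpus distribution.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class TimitFile(NamedTuple):
    usage: str        # e.g. "train" or "test"
    dialect: str      # dialect region directory, e.g. "dr1"
    gender: str       # assumed: first letter of the speaker directory
    speaker_id: str   # full speaker directory name, e.g. "fcjf0"
    sentence_id: str  # file name without extension, e.g. "sa1"
    file_type: str    # file extension, e.g. "wav"


def parse_timit_path(path: str) -> TimitFile:
    """Split an assumed usage/dialect/speaker/sentence.type path into fields."""
    parts = PurePosixPath(path.lower()).parts
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 path components: {path!r}")
    # Only the last four components matter, so a leading corpus root is allowed.
    usage, dialect, speaker, filename = parts[-4:]
    stem, _, ext = filename.rpartition(".")
    if not stem:
        raise ValueError(f"file name has no extension: {filename!r}")
    return TimitFile(usage, dialect, speaker[0], speaker, stem, ext)


if __name__ == "__main__":
    print(parse_timit_path("TRAIN/DR1/FCJF0/SA1.WAV"))
```

Lowercasing the path is a convenience so that distributions using upper-case and lower-case directory names parse to the same fields.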

Number of Files: 630 speakers x 10 sentences per speaker x 4 files per sentence = 25200 files
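
The file count can be checked directly from the figures in this entry. The short Python sketch below only restates that arithmetic; the constant names are illustrative.

```python
# Figures taken from the corpus description above.
SPEAKERS = 630
SENTENCES_PER_SPEAKER = 10
FILES_PER_SENTENCE = 4

# One utterance per (speaker, sentence) pair.
utterances = SPEAKERS * SENTENCES_PER_SPEAKER

# Each utterance is stored as several files of different types.
total_files = utterances * FILES_PER_SENTENCE

print(f"{utterances} utterances, {total_files} files")
```

A count like this can serve as a quick completeness check after copying the corpus to local storage.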

Total Size: approx. 600 MB

Tagging Description: Time stamps for words and phonetic segments.
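
As an illustration of how such time stamps might be consumed, the Python sketch below reads alignment lines and converts them to seconds. It assumes each line has the form "start end label" with start and end given as sample offsets, and it assumes a 16 kHz sampling rate; both should be checked against the corpus documentation. The names Segment and parse_alignment, and the example lines, are hypothetical.

```python
from typing import List, NamedTuple

SAMPLE_RATE_HZ = 16_000  # assumption: adjust if the documentation says otherwise


class Segment(NamedTuple):
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds
    label: str      # word or phone label


def parse_alignment(text: str, sample_rate: int = SAMPLE_RATE_HZ) -> List[Segment]:
    """Parse assumed 'start end label' lines, with offsets given in samples."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        start, end, label = line.split(maxsplit=2)
        segments.append(
            Segment(int(start) / sample_rate, int(end) / sample_rate, label)
        )
    return segments


if __name__ == "__main__":
    # Made-up example lines in the assumed format.
    example = "0 16000 h#\n16000 24000 sh\n"
    for seg in parse_alignment(example):
        print(f"{seg.start_s:.3f}-{seg.end_s:.3f} s  {seg.label}")
```

The same function would apply to word-level and phone-level files if both use this line format.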