Data type: Speech
Speech type: Read
Text available: Time-aligned orthographic and phonetic transcripts
Language: American English
The TIMIT corpus of read speech is designed to provide speech data for acoustic-phonetic studies and for the development and evaluation of automatic speech recognition systems. TIMIT contains broadband recordings of 630 speakers of 8 major dialects of American English, each reading 10 phonetically rich sentences. Corpus design was a joint effort among the Massachusetts Institute of Technology (MIT), SRI International (SRI), and Texas Instruments, Inc. (TI). The speech was recorded at TI, transcribed at MIT, and verified and prepared for CD-ROM production by the National Institute of Standards and Technology (NIST). The TIMIT corpus transcriptions have been hand-verified. Test and training subsets, balanced for phonetic and dialectal coverage, are specified. Tabular, computer-searchable information is included, as well as written documentation.
Spoken Language Systems Group Homepage - MIT
Availability: LDC-Online and CD-ROM
Garofolo, John S., Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, and Nancy L. Dahlgren, "The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM" (printed documentation; available on request from the LDC).
Contact for Questions or to Report Errors: email@example.com
Table of phone symbols used in phonemic dictionary and phonetic transcriptions
Description of phonemic lexicon
Description of suggested train/test subdivision
Table of speaker attributes
Table of sentence-ID numbers for each speaker
Table of sentence prompts and sentence-ID numbers
Phonemic dictionary of all orthographic words in prompts
NTIMIT: NYNEX Telephone Version of TIMIT Corpus
CTIMIT: Cellular TIMIT Speech Corpus
FFMTIMIT: Far Field Microphone Recording Version
Institution of Origin: MIT, SRI International, Texas Instruments
Publisher & Place of Publication: NIST, Gaithersburg, MD
Collection Time Span: January-May 1986
Description of File Organization: Files are sorted by usage, dialect region, gender/speaker_id, sentence_id, file_type
Number of Files: 630 speakers x 10 sentences per speaker x 4 files per sentence = 25200 files
Total Size: approx. 600 MB
Tagging Description: Time stamps for words and phonetic segments.
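The file organization, file count, and time-stamp tagging described above can be illustrated with a minimal Python sketch. Several details are assumptions not stated in this entry: the four file-type extensions (wav, txt, wrd, phn), the 16 kHz sampling rate, the "start end label" line layout with offsets counted in samples, and the example values (train, dr1, f, cjf0, sa1), which are placeholders. Check all of them against the corpus documentation before relying on them.

```python
from pathlib import PurePosixPath
from typing import List, NamedTuple

# Assumptions (not stated in this entry): the four per-sentence file types
# and a 16 kHz sampling rate for converting sample offsets to seconds.
FILE_TYPES = ("wav", "txt", "wrd", "phn")
SAMPLE_RATE_HZ = 16000


def utterance_path(usage, dialect_region, gender, speaker_id, sentence_id, file_type):
    """Compose a relative path in the sort order given under File Organization:
    usage, dialect region, gender/speaker_id, sentence_id, file_type."""
    return PurePosixPath(
        usage, dialect_region, f"{gender}{speaker_id}", f"{sentence_id}.{file_type}"
    )


def expected_file_count(speakers=630, sentences_per_speaker=10,
                        files_per_sentence=len(FILE_TYPES)):
    """Reproduce the file-count arithmetic from the entry."""
    return speakers * sentences_per_speaker * files_per_sentence


class Segment(NamedTuple):
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds
    label: str      # word or phone symbol


def parse_alignment(text, sample_rate=SAMPLE_RATE_HZ) -> List[Segment]:
    """Parse time-stamped lines of the assumed form 'start end label',
    where start and end are sample offsets."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        start, end, label = line.split(maxsplit=2)
        segments.append(
            Segment(int(start) / sample_rate, int(end) / sample_rate, label)
        )
    return segments


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(utterance_path("train", "dr1", "f", "cjf0", "sa1", "phn"))
    print(expected_file_count())
    for seg in parse_alignment("0 3200 h#\n3200 4800 sh\n"):
        print(f"{seg.label}: {seg.start_s:.3f}-{seg.end_s:.3f} s")
```

Keeping the sampling rate as a parameter means the same parser can be reused if the time stamps turn out to be expressed against a different rate.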