TIDIGITS

Item Name:	TIDIGITS
Author(s):	R. Gary Leonard, George R. Doddington
LDC Catalog No.:	LDC93S10
ISBN:	1-58563-018-7
ISLRN:	177-353-807-744-3
DOI:	https://doi.org/10.35111/72xz-6x59
Member Year(s):	1993
DCMI Type(s):	Sound
Sample Type:	pcm
Sample Rate:	20000
Data Source(s):	microphone speech
Application(s):	speech recognition
Language(s):	English
Language ID(s):	eng
License(s):	LDC User Agreement for Non-Members
Online Documentation:	LDC93S10 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Leonard, R. Gary, and George Doddington. TIDIGITS LDC93S10. Web Download. Philadelphia: Linguistic Data Consortium, 1993.
Related Works:	isSimilarWith LDC93S9 TI 46-Word; LDC2008S06 CSLU: Alphadigit Version 1.3; LDC2009S01 CSLU: Numbers Version 1.3. relatesTo LDC2004S02 ICSI Meeting Speech.

Introduction

TIDIGITS was developed by Texas Instruments, Inc. (TI) and consists of approximately 13 hours of English digit sequences spoken by over 300 men, women, and children. The corpus was designed and collected at TI for designing and evaluating algorithms for speaker-independent recognition of connected digit sequences.

Data

The corpus was collected at TI in 1982 in a quiet acoustic enclosure using an Electro-Voice RE-16 dynamic cardioid microphone and digitized at 20 kHz. The waveform files are single-channel, 16-bit files in NIST SPHERE format. There are 326 speakers (111 men, 114 women, 50 boys, and 51 girls), each pronouncing 77 digit sequences. Each speaker group is partitioned into test and training subsets. Speaker metadata includes gender, age, and dialect.
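The figures above imply a few derived quantities that are useful when planning storage or training time. The following is a rough sketch in Python, not an official LDC calculation: it takes the "approximately 13 hours" from the Introduction at face value and ignores SPHERE header overhead, so the duration and size results are estimates only. All variable names are illustrative.

```python
# Back-of-the-envelope arithmetic from the catalog figures.
# Assumption: "approximately 13 hours" is treated as exactly 13 hours.

SPEAKERS = {"men": 111, "women": 114, "boys": 50, "girls": 51}
SEQUENCES_PER_SPEAKER = 77
SAMPLE_RATE_HZ = 20_000      # digitized at 20 kHz
BYTES_PER_SAMPLE = 2         # 16-bit samples
CHANNELS = 1                 # single channel
APPROX_HOURS = 13            # approximate total duration

total_speakers = sum(SPEAKERS.values())                    # 326
total_utterances = total_speakers * SEQUENCES_PER_SPEAKER  # 25102

total_seconds = APPROX_HOURS * 3600
mean_utterance_seconds = total_seconds / total_utterances

# Uncompressed PCM payload, excluding file headers.
raw_bytes = total_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS

print(f"speakers:        {total_speakers}")
print(f"utterances:      {total_utterances}")
print(f"mean utterance:  {mean_utterance_seconds:.2f} s (estimate)")
print(f"raw PCM payload: {raw_bytes / 1e9:.2f} GB (estimate)")
```

The speaker total confirms that the four group counts sum to the stated 326. The mean utterance length and payload size depend entirely on the approximate 13-hour figure.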

Samples

Audio (sphere)
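A NIST SPHERE file begins with a plain-text header: the line "NIST_1A", a line giving the header size in bytes, then one "name -type value" field per line, ending with "end_head". The sketch below shows one way to read such a header in Python using only the standard library. The function name parse_sphere_header is hypothetical, and the example header is synthetic: its values mirror the description above (20 kHz, single channel, 16-bit) but are not copied from an actual TIDIGITS file.

```python
def parse_sphere_header(data: bytes) -> dict:
    """Parse the text header at the start of a NIST SPHERE file."""
    first, size_line, _ = data.split(b"\n", 2)
    if first.strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE file")
    header_size = int(size_line.strip())  # usually 1024

    text = data[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        name, type_tag, value = parts
        if type_tag == "-i":      # integer field
            fields[name] = int(value)
        elif type_tag == "-r":    # real-valued field
            fields[name] = float(value)
        else:                     # "-sN": string of N bytes
            fields[name] = value
    return fields


# Synthetic header for illustration only, padded to the declared size.
example = (
    b"NIST_1A\n   1024\n"
    b"sample_rate -i 20000\n"
    b"channel_count -i 1\n"
    b"sample_n_bytes -i 2\n"
    b"end_head\n"
).ljust(1024, b" ")

fields = parse_sphere_header(example)
print(fields)
```

For a real file, the same function could be given the first few kilobytes read in binary mode. Decoding the audio samples that follow the header is a separate step and is not covered here.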

Updates

As of April 2015, TIDIGITS is also available as FLAC-compressed WAV files. This package is available to licensees as an additional download. It does not include the folders related to handling the shortened SPHERE files of the original corpus.
