Tactical Speaker Identification Speech Corpus (TSID)

Item Name:	Tactical Speaker Identification Speech Corpus (TSID)
Author(s):	David Graff, Douglas Reynolds, Gerald C O'Leary
LDC Catalog No.:	LDC99S83
ISBN:	1-58563-154-X
ISLRN:	389-320-759-767-3
DOI:	https://doi.org/10.35111/8e9y-za80
Member Year(s):	1999
DCMI Type(s):	Sound
Data Source(s):	microphone speech
Application(s):	speech recognition, speaker identification
Language(s):	English
Language ID(s):	eng
License(s):	LDC User Agreement for Non-Members
Online Documentation:	LDC99S83 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Graff, David, Douglas Reynolds, and Gerald O'Leary. Tactical Speaker Identification Speech Corpus (TSID) LDC99S83. Web Download. Philadelphia: Linguistic Data Consortium, 1999.
Related Works:	isSimilarWith: LDC93S11 Road Rally; LDC93S12 HCRC Map Task Corpus. relatesTo: LDC93S1 TIMIT Acoustic-Phonetic Continuous Speech Corpus

Introduction

The Tactical Speaker Identification Corpus (TSID), which was collected by Douglas Reynolds and Gerald C. O'Leary of MIT Lincoln Labs, contains recordings of 35 speakers (four female, 31 male) made using a variety of radio transmitters and receivers.

Data

The recording sessions were conducted by assembling the speakers into seven groups of five, then having each speaker perform the following tasks:
- read a list of TIMIT sentences
- read a list of digit strings
- give directions for traveling from one point to another using a map (unscripted map task)

Each speaker performed this set of tasks on each of three transmitters (xmtr1-3), and the utterances were recorded simultaneously on DAT recorders attached to each of six receivers (rcvr1-6), which were located at some distance (well out of earshot) from the transmitter. Recordings were also made at the same time on a DAT recorder near the speaker using a head-mounted microphone to provide a reference wide-band recording of the speech (refwb).

As a result, the corpus is organized along four dimensions: speaker, transmitter, receiver, and speaking task. This organization can be viewed as a four-dimensional matrix with 35 x 3 x 7 x 3 cells, where the receiver dimension counts the six radio receivers (rcvr1-6) plus the reference wide-band recording (refwb). Due to occasional mishaps and malfunctions during the collection, some cells in this matrix are either empty or only partially full.
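
The matrix described above can be sketched in a few lines of Python. This is only an illustration of the cell arithmetic, not a description of the corpus's actual file layout: the speaker labels (spk01-spk35) and the task labels (timit, digits, map) are hypothetical names chosen for this example, while xmtr1-3, rcvr1-6 and refwb are the labels used in the text.

```python
from itertools import product

# Hypothetical speaker labels; the corpus's real speaker IDs may differ.
SPEAKERS = [f"spk{i:02d}" for i in range(1, 36)]
# Transmitter and receiver labels as named in the description.
TRANSMITTERS = ["xmtr1", "xmtr2", "xmtr3"]
RECEIVERS = [f"rcvr{i}" for i in range(1, 7)] + ["refwb"]
# Hypothetical task labels for the three speaking tasks.
TASKS = ["timit", "digits", "map"]


def all_cells():
    """Return every (speaker, transmitter, receiver, task) combination."""
    return list(product(SPEAKERS, TRANSMITTERS, RECEIVERS, TASKS))


def missing_cells(present):
    """Return the cells of the full matrix that are absent from `present`."""
    present = set(present)
    return [cell for cell in all_cells() if cell not in present]


if __name__ == "__main__":
    cells = all_cells()
    print(len(cells))  # 35 * 3 * 7 * 3 = 2205
```

Given a list of the cells actually found in a local copy of the corpus, `missing_cells` would list the empty ones; how such a list is derived from the distributed files is left open here.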

In addition to the tasks listed above, three pairs of speakers also participated in a two-way map task using xmtr3. In this task, one speaker gives directions to the other for tracing a route on a map, and both speakers are recorded on a single audio channel at each of the receivers. The "refwb" recording is the exception: the two speakers were separated by some distance and used radio communication to perform the task, and only one of them used a head-mounted microphone and local DAT recorder for wide-band recording.

Updates

There are no updates at this time.
