REMIX Telephone Collection

Item Name: REMIX Telephone Collection
Author(s): David Graff, Karen Jones, Stephanie Strassel, Kevin Walker
LDC Catalog No.: LDC2023S09
ISLRN: 602-562-840-191-7
DOI: https://doi.org/10.35111/600z-f268
Release Date: November 15, 2023
Member Year(s): 2023
DCMI Type(s): Sound
Sample Type: mulaw
Sample Rate: 8000
Data Source(s): telephone conversations
Project(s): MIXER, NIST SRE
Application(s): speaker identification
Language(s): English
Language ID(s): eng
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2023S09 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Graff, David, et al. REMIX Telephone Collection LDC2023S09. Web Download. Philadelphia: Linguistic Data Consortium, 2023.

Introduction

REMIX Telephone Collection was developed by the Linguistic Data Consortium (LDC). It contains 320 hours of English conversational telephone speech from 358 speakers, each of whom had completed all tasks in one of the earlier LDC Mixer collections (Mixers 4-7). The data was collected in 2012, and recordings in this corpus were used to support the NIST 2012 Speaker Recognition Evaluation.

Data

The audio recordings were generated using LDC's computer telephony system, which collects speech from the telephone network. Recruited speakers were connected through a robot operator to carry on casual conversations on suggested topics lasting up to 10 minutes. Subjects were asked to complete 12 calls, half of them in a "noisy" environment. Examples of proposed noisy environments included using a speakerphone, calling from a busy street or from a noisy store or office, and calling from a room with loud background noise.

The documentation for this release includes call topics, the number of calls per subject, the number of noisy calls and certain speaker demographic information (e.g., year of birth, education level, occupation).

The REMIX collection contains 1917 telephone recordings. The files are formatted as 2-channel, 8-bit, mu-law encoded sample data recorded at 8000 samples/second, with a NIST SPHERE-format header on each file.
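
The file layout described above can be illustrated with a short Python sketch. It is not part of the LDC release or its documentation. It assumes the usual SPHERE conventions: a text header that begins with the line NIST_1A, followed by a line giving the header size in bytes, then "name -type value" lines ending with end_head. It also assumes that the two channels are stored as interleaved bytes and that the sample data is uncompressed mu-law; files that use embedded compression would first need converting with a tool such as sph2pipe. The function names and the demo file below are hypothetical, and the field names (sample_rate, channel_count, sample_count, sample_coding) are the conventional SPHERE ones rather than values confirmed for this corpus.

```python
def parse_sphere_header(raw):
    """Return (fields, header_size) parsed from the start of a SPHERE file."""
    if not raw.startswith(b"NIST_1A\n"):
        raise ValueError("missing NIST_1A magic line")
    # The second line is an 8-byte field holding the total header size.
    header_size = int(raw[8:16].decode("ascii").strip())
    text = raw[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3 or line.startswith(";"):
            continue  # skip blank, malformed, or comment lines
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:  # "-sN" marks a string of N characters
            fields[name] = value
    return fields, header_size


def ulaw_to_linear(byte):
    """Expand one G.711 mu-law byte to a signed 16-bit range PCM value."""
    u = ~byte & 0xFF
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if u & 0x80 else magnitude


def split_channels(raw, header_size, channel_count):
    """De-interleave the sample bytes, assuming one byte per sample."""
    body = raw[header_size:]
    return [body[c::channel_count] for c in range(channel_count)]


def make_demo_file():
    """Build a tiny synthetic file so the sketch runs without corpus data."""
    head = ("NIST_1A\n   1024\n"
            "channel_count -i 2\n"
            "sample_rate -i 8000\n"
            "sample_n_bytes -i 1\n"
            "sample_coding -s4 ulaw\n"
            "sample_count -i 4\n"
            "end_head\n")
    body = bytes([0xFF, 0x00, 0xFF, 0x80, 0xFF, 0x00, 0xFF, 0x80])
    return head.ljust(1024).encode("ascii") + body


demo = make_demo_file()
fields, header_size = parse_sphere_header(demo)
chan_a, chan_b = split_channels(demo, header_size, fields["channel_count"])
pcm_a = [ulaw_to_linear(b) for b in chan_a]
pcm_b = [ulaw_to_linear(b) for b in chan_b]
duration_s = fields["sample_count"] / fields["sample_rate"]
print(fields["sample_rate"], fields["channel_count"], duration_s)
print(pcm_a, pcm_b)
```

To try this on a real recording, the bytes of a file read in binary mode would replace the synthetic demo data. Each channel holds one side of the conversation, so for speaker identification work the two channels are normally processed separately.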

Samples

SPH file

Updates

None at this time.
