
2010 NIST Speaker Recognition Evaluation Test Set

Item Name:	2010 NIST Speaker Recognition Evaluation Test Set
Author(s):	Craig Greenberg, Alvin Martin, David Graff, Linda Brandschain, Kevin Walker
LDC Catalog No.:	LDC2017S06
ISBN:	1-58563-795-5
ISLRN:	429-091-121-265-4
DOI:	https://doi.org/10.35111/fjsq-a117
Release Date:	April 17, 2017
Member Year(s):	2017
DCMI Type(s):	Sound
Sample Type:	ulaw
Sample Rate:	8000 Hz
Data Source(s):	microphone speech, telephone speech
Project(s):	NIST SRE
Application(s):	speaker identification
Language(s):	English
Language ID(s):	eng
License(s):	LDC User Agreement for Non-Members
Online Documentation:	LDC2017S06 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Greenberg, Craig, et al. 2010 NIST Speaker Recognition Evaluation Test Set LDC2017S06. Web Download. Philadelphia: Linguistic Data Consortium, 2017.
Related Works:
isOutcomeOf
	LDC2013S03 Mixer 6 Speech
isSimilarWith
	LDC96S61 1996 Speaker Recognition Benchmark
	LDC98S76 1998 Speaker Recognition Benchmark
	LDC99S80 1997 Speaker Recognition Benchmark
	LDC99S81 1999 Speaker Recognition Benchmark
	LDC2001S97 2000 NIST Speaker Recognition Evaluation
	LDC2002S34 2001 NIST Speaker Recognition Evaluation Corpus
	LDC2004S04 2002 NIST Speaker Recognition Evaluation
	LDC2006S44 2004 NIST Speaker Recognition Evaluation
	LDC2010S03 2003 NIST Speaker Recognition Evaluation
	LDC2011S05 2008 NIST Speaker Recognition Evaluation Training Set Part 1
	LDC2011S01 2005 NIST Speaker Recognition Evaluation Training Data
	LDC2011S04 2005 NIST Speaker Recognition Evaluation Test Data
	LDC2011S07 2008 NIST Speaker Recognition Evaluation Training Set Part 2
	LDC2011S08 2008 NIST Speaker Recognition Evaluation Test Set
	LDC2011S09 2006 NIST Speaker Recognition Evaluation Training Set
	LDC2011S10 2006 NIST Speaker Recognition Evaluation Test Set Part 1
	LDC2011S11 2008 NIST Speaker Recognition Evaluation Supplemental Set
	LDC2012S01 2006 NIST Speaker Recognition Evaluation Test Set Part 2
	LDC2019S20 2016 NIST Speaker Recognition Evaluation Test Set
	LDC2020S04 2018 NIST Speaker Recognition Evaluation Test Set
	LDC2023V01 2019 NIST Speaker Recognition Evaluation Test Set -- Audio-Visual
relatesTo
	LDC2017S16 LDC Spoken Language Sampler - Fourth Release

Introduction

2010 NIST Speaker Recognition Evaluation Test Set was developed by the Linguistic Data Consortium (LDC) and the National Institute of Standards and Technology (NIST). It contains 2,255 hours of American English telephone speech and of microphone-channel speech recorded in an interview scenario, all used as test data in the NIST-sponsored 2010 Speaker Recognition Evaluation (SRE).

The ongoing series of yearly SRE evaluations conducted by NIST is intended to be of interest to researchers working on the general problem of text-independent speaker recognition. To this end, the evaluations are designed to be simple, to focus on core technology issues, to be fully supported, and to be accessible to those wishing to participate.

Like the 2008 evaluation, the 2010 evaluation included in the training and test conditions for the core test not only conversational telephone speech (CTS) recorded over ordinary telephone channels, but also CTS and conversational interview speech recorded over a room microphone channel. Unlike prior evaluations, some of the conversational telephone-style speech was collected in a way intended to elicit particularly high or particularly low vocal effort from the speaker of interest.

Data

The speech recordings in this release were collected in 2009 and 2010 by LDC at its Human Subjects Collection facility in Philadelphia. This collection was part of the Mixer 6 project, which was designed to support the development of robust speaker recognition technology by providing carefully collected and audited speech from a large pool of speakers recorded simultaneously across numerous microphones.

The telephone speech segments include two-channel excerpts of two approximate lengths, 5 minutes and 10 seconds. There are also summed-channel excerpts of about 5 minutes. The microphone excerpts are 3-15 minutes in duration. As in prior evaluations, intervals of silence were not removed. The data included in this release is 8-bit ulaw with a sample rate of 8 kHz.
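For readers who want to work with the samples directly, the following is a minimal sketch of decoding 8-bit ulaw (G.711 mu-law) bytes to 16-bit linear PCM and estimating duration at the stated 8 kHz rate. It assumes a raw, headerless sample payload; any file header or container format used in the release is not described here and would need to be stripped first. The function names are hypothetical and are not part of the release or of any LDC tooling.

```python
# Illustrative sketch only: assumes raw 8-bit mu-law samples with no file header.
SAMPLE_RATE_HZ = 8000  # sample rate stated in the catalog entry
BIAS = 0x84            # bias used by the standard G.711 mu-law expansion


def ulaw_byte_to_pcm16(code: int) -> int:
    """Expand one mu-law byte to a signed 16-bit linear sample."""
    code = ~code & 0xFF                  # mu-law bytes are stored bit-inverted
    mantissa = (code & 0x0F) << 3        # low 4 bits
    exponent = (code & 0x70) >> 4        # next 3 bits select the segment
    magnitude = (mantissa + BIAS) << exponent
    # After inversion, a set top bit marks a negative sample.
    return BIAS - magnitude if code & 0x80 else magnitude - BIAS


def decode_ulaw(payload: bytes) -> list[int]:
    """Decode a raw mu-law byte string into a list of PCM samples."""
    return [ulaw_byte_to_pcm16(b) for b in payload]


def duration_seconds(payload: bytes, channels: int = 1) -> float:
    """Duration of a raw payload at one byte per sample per channel."""
    return len(payload) / (SAMPLE_RATE_HZ * channels)


if __name__ == "__main__":
    print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))       # [0, 32124, -32124]
    print(duration_seconds(bytes(8000 * 10)))           # 10.0
```

For a two-channel excerpt, pass `channels=2` to `duration_seconds`; how the two channels are laid out in the payload (interleaved or otherwise) is an assumption to check against the release documentation.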

In addition to evaluation data, this package includes answer keys, trial and train files, development data, and evaluation documentation.
