LDC2025S11
2021 NIST Speaker Recognition Evaluation Development and
Test Set

Authors: 
Omid Sadjadi, Craig Greenberg (NIST) 
Kevin Walker, Karen Jones, Christopher Caruso, Stephanie Strassel (LDC)

1.0 Overview

The 2021 NIST Speaker Recognition Evaluation Development and Test Set
was developed by NIST (National Institute of Standards and Technology)
using data collected by the Linguistic Data Consortium (LDC).

NIST SRE is part of an ongoing series of evaluations focusing on
text-independent speaker recognition. This corpus contains development
and evaluation data used in SRE21, along with answer keys, trial files,
other metadata, and associated documentation.

SRE21 data consists of conversational telephone speech (CTS) and audio
from video (AfV) in three languages: Cantonese, Mandarin and
English. The data was obtained from consented human subjects in Hong
Kong as part of the WeCanTalk Corpus collected by LDC. Subjects
contributed multiple conversational telephone speech recordings and
video recordings in which they were talking, plus a single selfie
image. Recordings were manually audited to verify speaker, language
and quality, and short segments were extracted for use as enrollment
and test segments for the SRE21 evaluation.

The 2021 NIST Speaker Recognition Evaluation Plan (see 
https://www.nist.gov/system/files/documents/2021/07/12/2021_SRE_Evaluation_Plan_V5.pdf) 
contains detailed information about the evaluation task, training 
conditions, performance metric, data and evaluation requirements. 
Since this package combines the SRE21 development and test sets and
includes the answer keys, the directory structure differs slightly
from that described in the NIST 2021 Speaker Recognition Evaluation
Plan.

2.0 Data Formats

The CTS segments in this release are A-law encoded SPHERE files with
a sample rate of 8 kHz.

The AfV segments are 16-bit FLAC files sampled at 16 kHz, and the
videos are in MP4 format.
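
As a quick way to confirm the encoding and sample rate of a CTS segment,
the plain-text header at the start of a SPHERE file can be read without
any audio library. The sketch below is not part of the release; it
assumes the standard NIST_1A header layout (a magic line, a header-size
line, then "name -type value" lines ending with "end_head"). The
function name read_sphere_header is hypothetical.

```python
def read_sphere_header(path):
    """Return the header fields of a NIST SPHERE file as a dict.

    Assumes the standard NIST_1A layout: line 1 is the magic string,
    line 2 is the header size in bytes, and the remaining lines are
    "name -type value" triples terminated by "end_head".
    """
    with open(path, "rb") as f:
        magic = f.readline().strip()
        if magic != b"NIST_1A":
            raise ValueError("not a SPHERE file: %r" % magic)
        header_size = int(f.readline().strip())
        f.seek(0)
        header = f.read(header_size).decode("ascii", errors="replace")

    fields = {}
    for line in header.splitlines()[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, ftype, value = parts
        if ftype == "-i":        # integer field
            fields[name] = int(value)
        elif ftype == "-r":      # real-valued field
            fields[name] = float(value)
        else:                    # string field, e.g. -s4
            fields[name] = value
    return fields
```

Calling read_sphere_header on a .sph file from /data and inspecting
fields such as sample_rate and sample_coding should show whether the
file matches the description above. The exact header values used in
this release have not been verified here.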

3.0 Corpus Contents

The NIST 2021 Speaker Recognition Evaluation Plan in
/docs/2021_SRE_Evaluation_Plan_V5.pdf contains detailed information
about the SRE21 evaluation task, training conditions, performance metric,
data and evaluation requirements.

3.1 Directory Structure

Since this package combines the SRE21 development and test sets and
includes the answer keys, the directory structure differs slightly
from that described in the NIST 2021 Speaker Recognition Evaluation
Plan.

The directory structure in this release is as follows:

/data
	/dev
		/audio/enrollment
		/audio/test
		/image/enrollment
		/video/test

	/eval
		/audio/enrollment
		/audio/test
		/image/enrollment
		/video/test
/docs
	/2021_SRE_Evaluation_Plan_V5.pdf
	/dev/sre21_audio_dev_enrollment.tsv
	/dev/sre21_audio_dev_trial_key.tsv
	/dev/sre21_audio_dev_trials.tsv
	/dev/sre21_visual_dev_trial_key.tsv
	/dev/sre21_visual_dev_trials.tsv
	/dev/sre21_audio-visual_dev_trial_key.tsv
	/dev/sre21_audio-visual_dev_trials.tsv
	/dev/sre21_dev_segment_key.tsv
	/eval/sre21_audio_eval_enrollment.tsv
	/eval/sre21_audio_eval_trial_key.tsv
	/eval/sre21_audio_eval_trials.tsv
	/eval/sre21_audio-visual_eval_trial_key.tsv
	/eval/sre21_audio-visual_eval_trials.tsv
	/eval/sre21_visual_eval_trial_key.tsv
	/eval/sre21_visual_eval_trials.tsv
	/eval/sre21_eval_segment_key.tsv
	
/README.txt (this file)

3.2 Data Volume

The number of files in this release is summarized below.

genre     partition   purpose      count
audio     dev         enrollment     198
audio     dev         test          2001
audio     eval        enrollment    1793
audio     eval        test         17037
audio     total                    21029

image     dev         enrollment      20
image     eval        enrollment     182
image     total                      202

video     dev         test           388
video     eval        test          3177
video     total                     3565
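
The counts above can be used to check that a copy of the release is
complete. The following is a minimal sketch, assuming the files sit
under the directories listed in section 3.1; the names EXPECTED_COUNTS
and count_mismatches are hypothetical and not part of the release.

```python
from pathlib import Path

# File counts copied from the table above, keyed by path relative to /data.
EXPECTED_COUNTS = {
    "dev/audio/enrollment": 198,
    "dev/audio/test": 2001,
    "eval/audio/enrollment": 1793,
    "eval/audio/test": 17037,
    "dev/image/enrollment": 20,
    "eval/image/enrollment": 182,
    "dev/video/test": 388,
    "eval/video/test": 3177,
}


def count_mismatches(data_root, expected=EXPECTED_COUNTS):
    """Return {subdir: (expected, found)} for directories whose count differs."""
    mismatches = {}
    for subdir, want in expected.items():
        directory = Path(data_root) / subdir
        if directory.is_dir():
            # Count regular files at any depth below the directory.
            found = sum(1 for p in directory.rglob("*") if p.is_file())
        else:
            found = 0
        if found != want:
            mismatches[subdir] = (want, found)
    return mismatches
```

An empty result from count_mismatches("/path/to/data") means every
directory holds the number of files given in the table.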

3.3 Enrollment Files

There are two tab-delimited enrollment files in the release:
 /dev/sre21_audio_dev_enrollment.tsv
 /eval/sre21_audio_eval_enrollment.tsv

These files list each segment (denoted by segmentid) provided to build
a model for each target speaker (denoted by modelid).  Example rows in
sre21_audio_dev_enrollment.tsv are given below:

modelid 	segmentid
1025_sre21      dddjaovr_sre21.sph
1025_sre21      prfdwtpv_sre21.sph
1025_sre21      bhwjsxus_sre21.sph

3.4 Segment Key Files

There are two tab-delimited segment key files in the release:
 /dev/sre21_dev_segment_key.tsv
 /eval/sre21_eval_segment_key.tsv

These files include the following information about each segment:
 segment_id
 subjectid
 gender (male/female)
 source_type (cts/afv)
 language (cantonese, mandarin, english)
 partition (enrollment, test)
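
The segment key makes it straightforward to summarize the data by
condition. The sketch below tallies segments by source type and
language; it assumes the file has a header row whose column names match
the field names listed above, which has not been verified against the
files themselves. tally_segments is a hypothetical helper name.

```python
import csv
from collections import Counter


def tally_segments(path, fields=("source_type", "language")):
    """Count segments for each combination of the given key fields."""
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        # Normalize column names in case of stray whitespace.
        reader.fieldnames = [name.strip() for name in reader.fieldnames]
        for row in reader:
            counts[tuple(row[field].strip() for field in fields)] += 1
    return counts
```

Passing fields=("gender",) or fields=("partition", "language") gives
other breakdowns from the same file.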

3.5 Trial Files

There are six tab-delimited trial files in this release:
 /dev/sre21_audio_dev_trials.tsv
 /dev/sre21_visual_dev_trials.tsv
 /dev/sre21_audio-visual_dev_trials.tsv
 /eval/sre21_audio_eval_trials.tsv
 /eval/sre21_audio-visual_eval_trials.tsv
 /eval/sre21_visual_eval_trials.tsv

These files list the enrollment identifier (modelid) and/or the image
identifier (imageid), and the test segment identifier (segmentid). An
example row in sre21_audio-visual_dev_trials.tsv is:

modelid 	imageid 		segmentid
1001_sre21      vvwpuyuw_sre21.jpg      aalowzys_sre21.mp4
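
To process a trial, the identifiers have to be resolved to files under
/data. The sketch below reads a trial file and builds paths from the
directory layout in section 3.1. It assumes that test segments ending
in .mp4 live under video/test and all other test segments under
audio/test, that images live under image/enrollment, and that the file
has a header row. All function names are hypothetical.

```python
import csv
from pathlib import PurePosixPath


def segment_path(root, partition, segmentid):
    """Path of a test segment; .mp4 goes to video/test, the rest to audio/test."""
    kind = "video" if segmentid.lower().endswith(".mp4") else "audio"
    return PurePosixPath(root, "data", partition, kind, "test", segmentid)


def image_path(root, partition, imageid):
    """Path of an enrollment image."""
    return PurePosixPath(root, "data", partition, "image", "enrollment", imageid)


def read_trials(path):
    """Return the trials as a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        reader.fieldnames = [name.strip() for name in reader.fieldnames]
        return [
            {key: (value or "").strip() for key, value in row.items() if key}
            for row in reader
        ]
```

For the example row above, segment_path(root, "dev",
"aalowzys_sre21.mp4") points into /data/dev/video/test and
image_path(root, "dev", "vvwpuyuw_sre21.jpg") into
/data/dev/image/enrollment.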

3.6 Trial Key Files

There are six tab-delimited trial key files in this release:
 /dev/sre21_audio_dev_trial_key.tsv
 /dev/sre21_visual_dev_trial_key.tsv
 /dev/sre21_audio-visual_dev_trial_key.tsv
 /eval/sre21_audio_eval_trial_key.tsv
 /eval/sre21_audio-visual_eval_trial_key.tsv
 /eval/sre21_visual_eval_trial_key.tsv

These files indicate whether each trial is a target or non-target
trial. Additional information, such as the number of enrollment
segments, whether there is a phone number match, gender, source type
match and language match, is also provided. An example record from
sre21_audio_dev_trial_key.tsv is provided below:

modelid 		1001_sre21
segmentid 		aaedgerr_sre21.sph		
targettype 		nontarget
num_enroll_segs		3 
phone_num_match		N 
gender  		female
source_type_match	Y
language_match		N	
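
When scoring a system, the trial key is typically joined with the
system's scores so that target and non-target trials can be separated.
The sketch below assumes the key file is tab-delimited with a header
row containing the field names shown above (the example record is
displayed one field per line for readability). The in-memory scores
dictionary is a hypothetical stand-in for a system's output; the
submission format is defined in the evaluation plan, not here.

```python
import csv
from collections import defaultdict


def load_trial_labels(path):
    """Map (modelid, segmentid) to the targettype value from a trial key."""
    labels = {}
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        reader.fieldnames = [name.strip() for name in reader.fieldnames]
        for row in reader:
            key = (row["modelid"].strip(), row["segmentid"].strip())
            labels[key] = row["targettype"].strip()
    return labels


def group_scores_by_label(scores, labels):
    """Group scores by targettype; scores maps (modelid, segmentid) to a float."""
    grouped = defaultdict(list)
    for key, score in scores.items():
        grouped[labels[key]].append(score)
    return dict(grouped)
```

Grouping by whatever label values appear in the file, rather than
hard-coding them, avoids relying on the exact spelling of the
targettype values.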

4.0 Known Issues

There are 7 pairs of duplicate FLAC files in the
/data/eval/audio/enrollment directory.  The duplicate pairs are:

iitugfnz_sre21.flac
yqwdaldz_sre21.flac
 
hrmlidpi_sre21.flac
vvhzxbsu_sre21.flac

qvcaskfi_sre21.flac
sndiilcu_sre21.flac

obcfxfmy_sre21.flac
pdplbfcu_sre21.flac

shhxfyix_sre21.flac
yahvicrh_sre21.flac

dwlgroce_sre21.flac
zlrbyrln_sre21.flac

gllqioio_sre21.flac
vbdxbkbe_sre21.flac
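
One way to locate duplicates such as these is to group files by a hash
of their contents. The sketch below is only an illustration: it assumes
the duplicates are byte-identical, which this README does not state.
Files with the same audio but different FLAC metadata would not be
detected. find_duplicate_files is a hypothetical helper name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(directory, pattern="*.flac"):
    """Return sorted groups of file names whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(directory).glob(pattern)):
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            # Hash in 1 MiB chunks so large files are not loaded at once.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        by_digest[digest.hexdigest()].append(path.name)
    return sorted(names for names in by_digest.values() if len(names) > 1)
```

Each returned group lists the names of files that share the same
SHA-256 digest; files with unique contents are omitted.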

--
README created by Karen Jones 1/19/2024
       updated by Stephanie Strassel 12/9/2024