LDC2025S11
2021 NIST Speaker Recognition Evaluation Development and Test Set

Authors: Omid Sadjadi, Craig Greenberg (NIST)
         Kevin Walker, Karen Jones, Christopher Caruso,
         Stephanie Strassel (LDC)

1.0 Overview

The 2021 NIST Speaker Recognition Evaluation Development and Test Set
was developed by NIST (National Institute of Standards and Technology)
using data collected by the Linguistic Data Consortium (LDC). NIST SRE
is part of an ongoing series of evaluations focusing on text-independent
speaker recognition. This corpus contains development and evaluation
data used in SRE21, along with answer keys, trial files, other metadata
and associated documentation.

SRE21 data consists of conversational telephone speech (CTS) and audio
from video (AfV) in three languages: Cantonese, Mandarin and English.
The data was obtained from consented human subjects in Hong Kong as part
of the WeCanTalk Corpus collected by LDC. Subjects contributed multiple
conversational telephone speech recordings and video recordings in which
they were talking, plus a single selfie image. Recordings were manually
audited to verify speaker, language and quality, and short segments were
extracted for use as enrollment and test segments for the SRE21
evaluation.

The 2021 NIST Speaker Recognition Evaluation Plan (see
https://www.nist.gov/system/files/documents/2021/07/12/2021_SRE_Evaluation_Plan_V5.pdf)
contains detailed information about the evaluation task, training
conditions, performance metric, data and evaluation requirements.

Since this package combines the SRE21 development and test sets and
includes the answer keys, the directory structure differs slightly from
that described in the NIST 2021 Speaker Recognition Evaluation Plan.

2.0 Data Formats

The CTS segments in this release are a-law encoded SPHERE files with a
sample rate of 8 kHz. The AfV segments are encoded as 16-bit FLAC files
sampled at 16 kHz, and the videos are in mp4 format.

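As an illustration of the SPHERE format mentioned above, the following
Python sketch reads the plain-text header that begins a NIST SPHERE
file: a "NIST_1A" line, a header-size line, then "name -type value"
fields up to "end_head". The function name read_sphere_header and the
synthetic header used in the demo are assumptions made for
illustration; the fields actually present in this release's .sph files
have not been checked here.

```python
import io


def read_sphere_header(fh):
    """Return the header fields of a NIST SPHERE file as a dict.

    `fh` is a binary file object positioned at the start of the file.
    """
    first = fh.readline()
    if first.strip() != b"NIST_1A":
        raise ValueError("missing NIST_1A magic line; not a SPHERE file")
    second = fh.readline()
    header_size = int(second.strip())  # total header length in bytes
    rest = fh.read(header_size - len(first) - len(second))

    fields = {}
    for raw in rest.split(b"\n"):
        line = raw.decode("ascii", errors="replace").strip()
        if line == "end_head":
            break
        if line.startswith(";"):
            continue  # skip comment lines
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, ftype, value = parts
        if ftype == "-i":
            value = int(value)
        elif ftype == "-r":
            value = float(value)
        fields[name] = value  # "-sN" string fields are kept as text
    return fields


# Synthetic header for illustration only. The values are assumptions
# chosen to match the a-law, 8 kHz description; they are not copied
# from a corpus file.
demo = (b"NIST_1A\n   1024\n"
        b"sample_rate -i 8000\n"
        b"sample_coding -s4 alaw\n"
        b"channel_count -i 1\n"
        b"end_head\n").ljust(1024, b" ")
demo_fields = read_sphere_header(io.BytesIO(demo))
print(demo_fields)
```

For a file from the release, open it in binary mode (for example
open(path, "rb")) and pass the file object to read_sphere_header; the
audio samples begin immediately after the header.
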
3.0 Corpus Contents

The NIST 2021 Speaker Recognition Evaluation Plan in
/docs/2021_SRE_Evaluation_Plan_V5.pdf contains detailed information
about the SRE21 evaluation task, training conditions, performance
metric, data and evaluation requirements.

3.1 Directory Structure

Since this package combines the SRE21 development and test sets and
includes the answer keys, the directory structure differs slightly from
that described in the NIST 2021 Speaker Recognition Evaluation Plan.

The directory structure in this release is as follows:

  /data
    /dev
      /audio/enrollment
      /audio/test
      /image/enrollment
      /video/test
    /eval
      /audio/enrollment
      /audio/test
      /image/enrollment
      /video/test
  /docs
    /2021_SRE_Evaluation_Plan_V5.pdf
    /dev/sre21_audio_dev_enrollment.tsv
    /dev/sre21_audio_dev_trial_key.tsv
    /dev/sre21_audio_dev_trials.tsv
    /dev/sre21_visual_dev_trial_key.tsv
    /dev/sre21_visual_dev_trials.tsv
    /dev/sre21_audio-visual_dev_trial_key.tsv
    /dev/sre21_audio-visual_dev_trials.tsv
    /dev/sre21_dev_segment_key.tsv
    /eval/sre21_audio_eval_enrollment.tsv
    /eval/sre21_audio_eval_trial_key.tsv
    /eval/sre21_audio_eval_trials.tsv
    /eval/sre21_audio-visual_eval_trial_key.tsv
    /eval/sre21_audio-visual_eval_trials.tsv
    /eval/sre21_visual_eval_trial_key.tsv
    /eval/sre21_visual_eval_trials.tsv
    /eval/sre21_eval_segment_key.tsv
    /README.txt (this file)

3.2 Data Volume

The number of files in this release is summarized below.

  genre   partition   purpose      count
  -----   ---------   ----------   -----
  audio   dev         enrollment     198
  audio   dev         test          2001
  audio   eval        enrollment    1793
  audio   eval        test         17037
  audio   total                    21029

  image   dev         enrollment      20
  image   eval        enrollment     182
  image   total                      202

  video   dev         test           388
  video   eval        test          3177
  video   total                     3565

3.3 Enrollment Files

There are two tab-delimited enrollment files in the release:

  /dev/sre21_audio_dev_enrollment.tsv
  /eval/sre21_audio_eval_enrollment.tsv

These files list each segment (denoted by segmentid) provided to build
a model for each target speaker (denoted by modelid).

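A minimal Python sketch of how an enrollment file could be read into a
mapping from modelid to its enrollment segments. It assumes each file
starts with a "modelid<TAB>segmentid" header row; the function name
load_enrollment is hypothetical, and the in-memory sample reuses the
three rows this README gives as an example for
sre21_audio_dev_enrollment.tsv.

```python
import csv
import io
from collections import defaultdict


def load_enrollment(fh):
    """Group segmentid values by modelid from a tab-delimited text stream."""
    models = defaultdict(list)
    # Assumption: the first row is a header naming the two columns.
    for row in csv.DictReader(fh, delimiter="\t"):
        models[row["modelid"]].append(row["segmentid"])
    return dict(models)


# In-memory stand-in for the enrollment file.
sample = (
    "modelid\tsegmentid\n"
    "1025_sre21\tdddjaovr_sre21.sph\n"
    "1025_sre21\tprfdwtpv_sre21.sph\n"
    "1025_sre21\tbhwjsxus_sre21.sph\n"
)
models = load_enrollment(io.StringIO(sample))
print(len(models["1025_sre21"]))  # prints 3
```

To read a file from the release instead, open it in text mode with
newline="" (as the csv module recommends) and pass the file object to
load_enrollment.
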
Example rows in sre21_audio_dev_enrollment.tsv are given below:

  modelid      segmentid
  1025_sre21   dddjaovr_sre21.sph
  1025_sre21   prfdwtpv_sre21.sph
  1025_sre21   bhwjsxus_sre21.sph

3.4 Segment Key Files

There are two tab-delimited segment key files in the release:

  /dev/sre21_dev_segment_key.tsv
  /eval/sre21_eval_segment_key.tsv

These files include the following information about each segment:

  segment_id
  subjectid
  gender (male/female)
  source_type (cts/afv)
  language (cantonese, mandarin, english)
  partition (enrollment, test)

3.5 Trial Files

There are six tab-delimited trial files in this release:

  /dev/sre21_audio_dev_trials.tsv
  /dev/sre21_visual_dev_trials.tsv
  /dev/sre21_audio-visual_dev_trials.tsv
  /eval/sre21_audio_eval_trials.tsv
  /eval/sre21_audio-visual_eval_trials.tsv
  /eval/sre21_visual_eval_trials.tsv

These files list the enrollment identifier (modelid) and/or the
imageid, and the test segment identifier (segmentid).

An example row in sre21_audio-visual_dev_trials.tsv is:

  modelid      imageid              segmentid
  1001_sre21   vvwpuyuw_sre21.jpg   aalowzys_sre21.mp4

3.6 Trial Key Files

There are six tab-delimited trial key files in this release:

  /dev/sre21_audio_dev_trial_key.tsv
  /dev/sre21_visual_dev_trial_key.tsv
  /dev/sre21_audio-visual_dev_trial_key.tsv
  /eval/sre21_audio_eval_trial_key.tsv
  /eval/sre21_audio-visual_eval_trial_key.tsv
  /eval/sre21_visual_eval_trial_key.tsv

These files indicate, for each trial, whether it is a target or
non-target trial. Additional information such as the number of
enrollment segments, whether there is a phone number match, gender,
source type and language is also provided.

An example record from sre21_audio_dev_trial_key.tsv is provided below:

  modelid            1001_sre21
  segmentid          aaedgerr_sre21.sph
  targettype         nontarget
  num_enroll_segs    3
  phone_num_match    N
  gender             female
  source_type_match  Y
  language_match     N

4.0 Known Issues

There are 7 cases of duplicate flac files in the
/data/eval/audio/enrollment directory.
The duplicates are:

  iitugfnz_sre21.flac
  yqwdaldz_sre21.flac
  hrmlidpi_sre21.flac
  vvhzxbsu_sre21.flac
  qvcaskfi_sre21.flac
  sndiilcu_sre21.flac
  obcfxfmy_sre21.flac
  pdplbfcu_sre21.flac
  shhxfyix_sre21.flac
  yahvicrh_sre21.flac
  dwlgroce_sre21.flac
  zlrbyrln_sre21.flac
  gllqioio_sre21.flac
  vbdxbkbe_sre21.flac

--
README created by Karen Jones 1/19/2024
updated by Stephanie Strassel 12/9/2024