
RATS Low Speech Density

Item Name:	RATS Low Speech Density
Author(s):	Kevin Walker, David Graff, Xiaoyi Ma, Stephanie Strassel, Karen Jones
LDC Catalog No.:	LDC2024S03
ISLRN:	670-178-409-396-6
DOI:	https://doi.org/10.35111/4ena-fg30
Release Date:	March 15, 2024
Member Year(s):	2024
DCMI Type(s):	Sound, Text
Sample Type:	pcm
Sample Rate:	16000
Data Source(s):	telephone conversations
Project(s):	RATS
Application(s):	speech activity detection
Language(s):	English, Persian, Pushto, Urdu, Levantine Arabic
Language ID(s):	eng, fas, pus, urd, qal
License(s):	LDC User Agreement for Non-Members
Online Documentation:	LDC2024S03 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Walker, Kevin, et al. RATS Low Speech Density LDC2024S03. Web Download. Philadelphia: Linguistic Data Consortium, 2024.
Related Works:	isSimilarWith LDC2015S02 RATS Speech Activity Detection; LDC2017S20 RATS Keyword Spotting; LDC2018S10 RATS Language Identification; LDC2021S08 RATS Speaker Identification

Introduction

RATS Low Speech Density was developed by the Linguistic Data Consortium (LDC) and comprises approximately 87 hours of English, Levantine Arabic, Farsi, Pashto and Urdu speech and non-speech samples. The recordings were assembled by concatenating a randomized selection of speech, communications systems sounds, and silence. This corpus was created to measure false alarm performance in RATS speech activity detection systems.
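To illustrate what false alarm performance means for speech activity detection, the sketch below computes a false alarm rate as the fraction of reference non-speech time that a system labels as speech. The function names, the (start, end) segment representation in seconds, and this particular rate definition are assumptions made for illustration; they are not taken from the RATS evaluation plan or from the corpus documentation.

```python
from typing import List, Tuple

# Hypothetical representation: a speech segment is (start_sec, end_sec).
Segment = Tuple[float, float]


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Sort segments and merge any that overlap or touch."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_duration(a: List[Segment], b: List[Segment]) -> float:
    """Total time covered by both lists (each list must already be merged)."""
    total, i, j = 0.0, 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever segment ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def false_alarm_rate(ref_speech: List[Segment],
                     hyp_speech: List[Segment],
                     file_duration: float) -> float:
    """Fraction of reference non-speech time that the system marked as speech.

    Assumes all segments lie within [0, file_duration].
    """
    ref = merge_segments(ref_speech)
    hyp = merge_segments(hyp_speech)
    ref_speech_time = sum(end - start for start, end in ref)
    nonspeech_time = file_duration - ref_speech_time
    if nonspeech_time <= 0:
        raise ValueError("file contains no reference non-speech time")
    hyp_time = sum(end - start for start, end in hyp)
    false_alarm_time = hyp_time - overlap_duration(ref, hyp)
    return false_alarm_time / nonspeech_time
```

In a recording with little speech, nearly all of the file counts as non-speech, so even short spurious detections on tones or radio signals show up directly in this rate.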

The goal of the RATS (Robust Automatic Transcription of Speech) program was to develop human language technology systems capable of performing speech detection, language identification, speaker identification and keyword spotting on the severely degraded audio signals that are typical of various radio communication channels, especially those employing various types of handheld portable transceiver systems. To support that goal, LDC assembled a system for the transmission, reception and digital capture of audio data that allowed a single source audio signal to be distributed and recorded over eight distinct transceiver configurations simultaneously. Those configurations included three frequency bands (high, very high and ultra high) variously combined with amplitude modulation, frequency hopping spread spectrum, narrow-band frequency modulation, single-side-band or wide-band frequency modulation. Annotations on the clear source audio signal, e.g., time boundaries for the duration of speech activity, were projected onto the corresponding eight channels recorded from the radio receivers.

Data

The source audio was extracted from RATS development and progress speech activity detection sets and from RATS keyword spotting development data. It consists of conversational telephone speech recordings collected by LDC: (1) data collected for the RATS program from Levantine Arabic, Farsi, Pashto and Urdu speakers; and (2) material from the Fisher English (LDC2004S13, LDC2005S13) and Fisher Levantine Arabic telephone studies (LDC2007S02), Levantine Arabic QT Training Data Set 5, Speech (LDC2006S29), and CALLFRIEND Farsi Second Edition Speech (LDC2014S01).

Non-speech samples were selected from communications systems sounds, including telephone network special information tones, radio selective calling signals, HF/VHF/UHF digital mode radio traffic, radio network control channel signals, two-way radio traffic containing roger beeps, and short duration shift-key modulated handset data transmissions.

The data is divided into development, progress, and train sets, each with its own subdirectories.

All audio files are presented as single-channel, 16-bit PCM, 16000 samples per second; lossless FLAC compression is used on all files. When uncompressed, the files have "MS-WAV" (RIFF) file headers.
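Once a file has been decompressed to WAV (for example with the reference decoder, `flac -d`), the stated format can be confirmed from its RIFF header. The sketch below uses only Python's standard `wave` module; the function name is hypothetical, and the expected values are simply those given in the paragraph above.

```python
import wave

# Format stated for this release: mono, 16-bit PCM, 16000 samples per second.
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample (16 bits)
EXPECTED_RATE = 16000


def check_wav_format(path: str) -> float:
    """Check a decompressed WAV file against the expected format.

    Returns the duration in seconds, or raises ValueError on a mismatch.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        rate = wav.getframerate()
        frames = wav.getnframes()
    found = (channels, width, rate)
    expected = (EXPECTED_CHANNELS, EXPECTED_SAMPLE_WIDTH, EXPECTED_RATE)
    if found != expected:
        raise ValueError(
            f"{path}: found (channels, bytes/sample, rate) = {found}, "
            f"expected {expected}"
        )
    return frames / rate
```

At this format each second of audio occupies 32,000 bytes uncompressed (16000 samples times 2 bytes), so the returned duration can also be cross-checked against the file size.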

A collection of tables describing the design and assembly of the source audio files is included in the documentation accompanying this release.

Sponsorship

This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. D10PC20016. The content does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.
