CHAracterizing INdividual Speakers (CHAINS)

Item Name: CHAracterizing INdividual Speakers (CHAINS)
Author(s): Fred Cummins, Marco Grimaldi, Thomas Leonard, Juraj Simko
LDC Catalog No.: LDC2008S09
ISBN: 1-58563-497-2
ISLRN: 726-472-023-584-8
Release Date: November 18, 2008
Member Year(s): 2008
DCMI Type(s): Sound
Sample Type: 16 bit linear PCM
Data Source(s): microphone speech
Application(s): speech recognition
Language(s): English
Language ID(s): eng
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2008S09 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Cummins, Fred, et al. CHAracterizing INdividual Speakers (CHAINS) LDC2008S09. Web Download. Philadelphia: Linguistic Data Consortium, 2008.

Introduction

CHAINS was created by researchers at University College Dublin and contains recordings of thirty-six English speakers reading fables and selected sentences in different speaking styles. The data was obtained in two different sessions with a time separation of about two months. The goal of the corpus is to provide a range of speaking styles and voice modifications for speakers sharing the same accent. Other existing corpora, in particular CSLU Speaker Recognition Version 1.1, TIMIT and the IViE corpus (English Intonation in the British Isles), served as referents in the selection of material. This design decision was made to ensure that methods designed and evaluated on the CHAINS corpus might be directly testable on these other corpora, which were recorded using quite different dialects and channel characteristics.

Additional documentation about the corpus and its methodology is available at the CHAINS website.

Data

The data was collected across two recording sessions, covering six different speaking styles in total. The first recording session was carried out in a professional recording studio in December 2005. Speakers were recorded in a sound-attenuated booth reading text in the solo, synchronous and retell styles using a Neumann U87 condenser microphone. Additional tracks using other microphones (near- and far-field) were also recorded and may be made available upon request to the authors. The second recording session took place from March 2006 to May 2006 in a quiet office environment, using an AKG C420 headset condenser microphone. Speakers read text in the rsi, whisper and fast modes. The six different speaking styles were:

  • solo reading
  • synchronous reading
  • spontaneous speech (retell)
  • repetitive synchronous imitation (rsi)
  • whispered reading
  • fast speech reading
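
The session layout described above can be summarized in a small lookup table. The following is a minimal Python sketch, not part of the corpus distribution: the short style labels are taken from the wording of this description (solo, synchronous, retell, rsi, whisper, fast), and the names Session, STUDIO, OFFICE, STYLE_TO_SESSION and styles_in are hypothetical. The actual file and directory naming is defined in the documentation released with the corpus and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One recording session, as described in the Data section."""
    period: str
    environment: str
    microphone: str


# Session details copied from the corpus description above.
STUDIO = Session(
    period="December 2005",
    environment="sound-attenuated booth in a professional recording studio",
    microphone="Neumann U87 condenser microphone",
)
OFFICE = Session(
    period="March 2006 to May 2006",
    environment="quiet office",
    microphone="AKG C420 headset condenser microphone",
)

# Assumption: these labels are informal shorthand from the description,
# not necessarily the names used in the corpus file tree.
STYLE_TO_SESSION = {
    "solo": STUDIO,
    "synchronous": STUDIO,
    "retell": STUDIO,
    "rsi": OFFICE,
    "whisper": OFFICE,
    "fast": OFFICE,
}


def styles_in(session: Session) -> list[str]:
    """Return the speaking styles recorded in the given session, sorted by name."""
    return sorted(
        style for style, recorded in STYLE_TO_SESSION.items() if recorded == session
    )
```

A table like this makes it easy to keep channel conditions in view when comparing styles, since a comparison between, say, solo and whisper recordings also crosses microphones and recording environments.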

In two of the speaking conditions, speakers modified their speech in a constrained fashion towards a known target: in the synchronous condition, the speech of the co-speaker served as the target, while in rsi there was an explicit, known static target. The presence of a known target which speakers aim to copy raises the bar in the discovery and design of procedures for automatic speaker identification, as the target speech provides a potentially highly confusing foil. The whisper and fast speech conditions are also well-defined speaking styles which require substantial voice modification by the speaker.

Participants were recruited through University College Dublin and were paid for their participation. No participant had any known speech or hearing deficit. The speakers were from the United Kingdom, the eastern part of Ireland (Dublin and adjacent counties) and the United States. Further information about the speakers, their gender and dialect is available in the documentation released with this corpus.

Samples

For an example of the data in this corpus, please examine this sound file of the fast reading type.
