This file contains documentation on the 2002 Emotional Prosody Speech and Transcripts, Linguistic Data Consortium (LDC) catalog number LDC2002S28 and ISBN 1-58563-237-6.
This publication contains audio recordings and corresponding transcripts, collected over an eight-month period in 2000-2001 and designed to support research in emotional prosody. The recordings consist of professional actors reading a series of semantically neutral utterances (dates and numbers) spanning fourteen distinct emotional categories, selected after Banse & Scherer's study of vocal emotional expression in German. (Banse, R. & Scherer, K. R. 1996. Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70, 614-636.)
Actor participants were provided with descriptions of each emotional context, including situational examples adapted from those used in the original German study. Flashcards were used to display series of four-syllable dates and numbers to be uttered in the appropriate emotional category.
The Prosody Recordings Project is interested in capturing the aspects of speech (emotion, intonation) that are left out of the written form of a message. In these experiments, simple phrases are expressed in ways that reflect varied contexts. The same phrase might be used to answer different questions, address listeners at different distances from the speaker, or express different emotional states. Actors were used because they are experts at producing this kind of contextual variation in a natural and convincing way.
More information about this project can be found at http://www.ldc.upenn.edu/Projects/Prosody/.
There are 30 data files: 15 recordings in sphere format and their transcripts.
The sphere files are encoded in two-channel interleaved 16-bit PCM, high-byte-first (big-endian) format, for a total of 2,912,067,980 bytes (2777 Mbytes) or nine hours of sphere data.
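As an illustration of the sample layout described above, the following is a minimal sketch of how interleaved two-channel, 16-bit, big-endian PCM data could be split into one sample list per channel. It assumes the sphere header at the start of the file has already been skipped, so that the input holds raw sample bytes only. The function name split_channels and the demo bytes are hypothetical and are not part of this publication.

```python
import struct


def split_channels(pcm_bytes, num_channels=2):
    """Decode interleaved big-endian signed 16-bit PCM into per-channel lists.

    Assumption: pcm_bytes contains sample data only (no file header).
    """
    bytes_per_frame = 2 * num_channels
    # Ignore any trailing partial frame.
    usable = len(pcm_bytes) - (len(pcm_bytes) % bytes_per_frame)
    count = usable // 2
    # ">" selects big-endian (high-byte-first), "h" is a signed 16-bit integer.
    samples = struct.unpack(">%dh" % count, pcm_bytes[:usable])
    # Samples alternate between channels, so take every num_channels-th value.
    return [list(samples[ch::num_channels]) for ch in range(num_channels)]


# Hand-built demo data: two frames of two channels each.
demo = bytes([0x00, 0x01, 0xFF, 0xFF, 0x01, 0x00, 0x00, 0x02])
channels = split_channels(demo)
print(channels)  # [[1, 256], [-1, 2]]
```

Decoding with the wrong byte order still produces numbers, but they will sound like noise, so a quick listen or plot of a short segment is a reasonable sanity check.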
The utterances were recorded directly into WAVES+ datafiles, on two channels with a sampling rate of 22.05 kHz. The two microphones used were a stand-mounted boom Shure SN94 and a headset Sennheiser HMD 410.
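The stated total size and the recording parameters can be cross-checked with a short calculation. The sketch below assumes a sampling rate of 22050 Hz, two channels, and two bytes per sample, and it ignores the small amount of space taken by the file headers. The variable names are illustrative only.

```python
SAMPLE_RATE_HZ = 22050      # 22.05 kHz
NUM_CHANNELS = 2            # two microphones, one per channel
BYTES_PER_SAMPLE = 2        # 16-bit PCM
TOTAL_BYTES = 2912067980    # total size stated for the 15 sphere files

# Bytes consumed by one second of two-channel audio.
bytes_per_second = SAMPLE_RATE_HZ * NUM_CHANNELS * BYTES_PER_SAMPLE

# Header bytes are ignored here, so this is a slight overestimate.
total_seconds = TOTAL_BYTES / bytes_per_second
total_hours = total_seconds / 3600

print(f"{bytes_per_second} bytes/s, about {total_hours:.2f} hours")
```

The result is a little over nine hours, which agrees with the figure given for the sphere data.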
The original session recordings are provided in their entirety, including informal chit-chat and discussion between each emotion category elicitation task. Time alignment is limited to utterances within the formal elicitation tasks, and miscellaneous regions have been marked as such.
There are no updates at this time.
Portions © 2000-2002 Trustees of the University of Pennsylvania.