Mandarin Affective Speech
|Item Name:||Mandarin Affective Speech|
|Author(s):||Yingchun Yang, Zhaohui Wu, Tian Wu, Dongdong Li|
|LDC Catalog No.:||LDC2007S09|
|Release Date:||July 17, 2007|
|Data Source(s):||microphone speech|
|Application(s):||prosody, pronunciation modeling, speech recognition|
|License(s):||Mandarin Affective Speech Agreement|
|Online Documentation:||LDC2007S09 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Yang, Yingchun, et al. Mandarin Affective Speech LDC2007S09. Web Download. Philadelphia: Linguistic Data Consortium, 2007.|
Mandarin Affective Speech is a database of emotional speech consisting of audio recordings and corresponding transcripts collected in 2005 at the Advanced Computing and System Laboratory, College of Computer Science and Technology, Zhejiang University, Hangzhou, People's Republic of China. The corpus was designed with two goals: first, to serve as a tool for investigating the linguistic and prosodic features of emotional expression in Mandarin Chinese; and second, to provide training and test data for research on speaker recognition with affective speech.
The recordings were made by prompting speakers to express different emotional states in response to stimuli. The speakers read scenarios designed to elicit an emotional response, such as a colleague's mistake for anger, a pleasant trip for elation, a hurry-up scene for panic, and a puppy's death for sadness. The five emotional states recorded are characterized as follows:
- Neutral - Simple statements without any emotion.
- Anger - A strong feeling of displeasure or hostility.
- Elation - Being glad or happy because of praise.
- Panic - A sudden, overpowering terror, often affecting many people at once.
- Sadness - Affected or characterized by sorrow or unhappiness.
Over 100 speakers participated in the data collection. After screening, recordings from 68 speakers (23 females, 45 males) were used in this corpus. Most of the speakers were in their twenties at the time of collection. Information about the speakers is contained in "SpeakerInfo.doc."
Subjects were given a text to read that consisted of five phrases, twenty sentences and two paragraphs designed to generate the emotional speech. The material included all the phonemes in Mandarin. Each subject read the phrases, sentences, and paragraphs portraying the five emotional states: neutral (unemotional), anger, elation, panic and sadness. Altogether this database contains 25,636 utterances. The read material was constructed as follows:
- 5 phrases - "yes", "no" and three nouns: "apple", "train" and "tennis ball". In Chinese, these words contain many different basic vowels and consonants.
- 20 sentences - These sentences include all the phonemes and most common consonant clusters in Mandarin. The types of sentences are: simple statements, declarative sentences with an enumeration, general questions (yes/no questions), alternative questions, imperative sentences, exclamatory sentences, and special questions (wh-questions).
- 2 paragraphs - Two readings, one selected from a famous Chinese novel, and the other stating a normal fact.
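The stated total of 25,636 utterances can be reproduced from the counts above under one plausible recording scheme. The sketch below is only an illustration: it assumes that each phrase and sentence was recorded three times in each of the five emotional states, and that each paragraph was recorded once per speaker. Neither assumption is stated on this page, so the authors' paper ("MASC.pdf") should be consulted for the actual protocol.

```python
# Counts taken from the corpus description.
SPEAKERS = 68
EMOTIONS = 5
PHRASES = 5
SENTENCES = 20
PARAGRAPHS = 2

# ASSUMPTION (not stated on this page): each phrase and sentence is
# repeated three times per emotion, and each paragraph is read once.
REPETITIONS = 3


def expected_utterances(speakers=SPEAKERS, repetitions=REPETITIONS):
    """Return the utterance count implied by the assumed recording scheme."""
    short_items = (PHRASES + SENTENCES) * EMOTIONS * repetitions
    per_speaker = short_items + PARAGRAPHS
    return speakers * per_speaker


if __name__ == "__main__":
    print(expected_utterances())
```

Under these assumptions each speaker contributes 377 utterances, which matches the documented total for 68 speakers.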
All the data were recorded in a quiet office on an OLYMPUS DM-20 digital voice recorder at a sampling rate of 22,050 Hz. The recorded voice files were then transferred to a personal computer over USB (Universal Serial Bus) and converted into monophonic Windows PCM format at an 8 kHz sampling frequency with 16-bit resolution.
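As a quick sanity check, a downloaded file can be compared against the delivery format described above (mono, 16-bit, 8 kHz PCM). The following is a minimal sketch using Python's standard `wave` module. The helper names `describe_wav` and `matches_corpus_format` and the path `example.wav` are hypothetical, and the sketch assumes the audio is stored as standard RIFF/WAV files.

```python
import wave

# Delivery format as described in the text: mono, 16-bit (2 bytes), 8 kHz.
EXPECTED_FORMAT = {
    "channels": 1,
    "sample_width_bytes": 2,
    "sample_rate_hz": 8000,
}


def describe_wav(path):
    """Read the basic header parameters of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }


def matches_corpus_format(info):
    """Return True if the header matches the documented delivery format."""
    return all(info[key] == value for key, value in EXPECTED_FORMAT.items())


if __name__ == "__main__":
    info = describe_wav("example.wav")  # hypothetical file name
    print(info, matches_corpus_format(info))
```

Checking only the header keeps the sketch dependency-free; it does not verify the audio content itself.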
Further information about the data and methodology in this corpus is contained in the authors' paper, "MASC: A Speech Corpus in Mandarin for Emotional Analysis and Affective Speaker Recognition," in "MASC.pdf."
For an example of the data in this corpus, please listen to the following samples: