JEIDA/JCSD-Channel 0 Complete
Item Name: | JEIDA/JCSD-Channel 0 Complete |
Author(s): | Jonathan Hamaker, Richard J. Duncan, Joe Picone, Shuichi Itahashi. The original data was provided by the Japan Electronic Industry Development Association (JEIDA) |
LDC Catalog No.: | LDC96S64 |
ISBN: | 1-58563-093-4 |
ISLRN: | 734-085-058-125-6 |
DOI: | https://doi.org/10.35111/hs5b-e580 |
Member Year(s): | 1996, 1997 |
DCMI Type(s): | Sound |
Sample Type: | 1-channel pcm |
Sample Rate: | 16000 |
Data Source(s): | microphone speech |
Application(s): | speech recognition |
Language(s): | Japanese |
Language ID(s): | jpn |
License(s): | LDC User Agreement for Non-Members |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Hamaker, Jonathan, et al. JEIDA/JCSD-Channel 0 Complete LDC96S64. Web Download. Philadelphia: Linguistic Data Consortium, 1996. |
Introduction
The Japan Electronic Industry Development Association (JEIDA) Common Speech Data Corpus (JCSD) was prepared by Jonathan Hamaker, Richard J. Duncan, and Joe Picone of the Institute for Signal and Information Processing at Mississippi State University, from data originally provided by JEIDA.
Data
This collection consists of high-fidelity recordings of 150 native speakers of Japanese. Each speaker produced four repetitions of 323 short prompts: city names, control words, monosyllabic words, isolated digits, and strings of four digits. Each reading session was recorded with two microphones simultaneously, yielding two channels of differing audio quality for each utterance. Channel 0 (LDC96S64) contains the data recorded with a standard dynamic microphone, a Sanken MU-2C. Channel 1 (LDC96S65), available separately, contains the data recorded with a condenser microphone, which presumably varied from site to site.
A summary of the size and content of the corpus is given below:
Number of Speakers: 150 speakers - 75 males, 75 females
Range of Speaker Age: 10 yrs. to 70 yrs.
Number of Items per Speaker: 323 items - 15 isolated digits, 35 four digit sequences, 100 city names, 110 monosyllables, 13 control words (set A), 24 control words (set B), 26 control words (set C)
Number of Repetitions per Item: 4 repetitions
Total Number of Utterances: 193,763 utterances (per channel)
Sample Frequency: 16 kHz
Sample Type: 16-bit linear
Number of Microphones: 2 (dynamic and condenser)
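The figures in this summary can be cross-checked with a little arithmetic. The following is a minimal sketch in Python that uses only the numbers listed above; the variable names are illustrative and not part of the corpus documentation. It shows that the per-speaker item counts add up to 323, that the nominal utterance count (speakers × items × repetitions) is slightly higher than the reported per-channel total, and what raw data rate the stated sample format implies. The summary does not say why the reported total is lower, so the sketch only quantifies the difference.

```python
# Item counts per speaker, copied from the corpus summary.
ITEMS_PER_SPEAKER = {
    "isolated digits": 15,
    "four-digit sequences": 35,
    "city names": 100,
    "monosyllables": 110,
    "control words (set A)": 13,
    "control words (set B)": 24,
    "control words (set C)": 26,
}
SPEAKERS = 150
REPETITIONS = 4
REPORTED_UTTERANCES = 193_763  # per channel, as listed in the summary

SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit linear PCM, single channel

items = sum(ITEMS_PER_SPEAKER.values())
nominal = SPEAKERS * items * REPETITIONS
shortfall = nominal - REPORTED_UTTERANCES
bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE

print(f"items per speaker:        {items}")
print(f"nominal utterances:       {nominal}")
print(f"reported utterances:      {REPORTED_UTTERANCES}")
print(f"difference:               {shortfall}")
print(f"raw audio bytes per sec.: {bytes_per_second}")
```

The difference of 37 utterances per channel is small relative to the nominal total, but anyone building train/test splits may want to count the files actually delivered rather than assume every speaker has a complete set.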
Description: Number of Items
Control Words: 13 Banking Services, 24 Word Processors, 26 Home Electronic Equipment
Digits: 15 Isolated Digits, 35 Four Digit Sequences
City Names: 100
Monosyllables: 110
JEIDA/JCSD-Channel 0 and JEIDA/JCSD-Channel 1 can each be ordered as complete sets. Components of the corpus can also be purchased as outlined below:
JEIDA/JCSD-Channel 0 (Complete) LDC96S64
JEIDA/JCSD-Channel 0 City Names LDC96S64-1
JEIDA/JCSD-Channel 0 Control Words LDC96S64-2
JEIDA/JCSD-Channel 0 Isolated Digits LDC96S64-3
JEIDA/JCSD-Channel 0 Four Digit Seq. LDC96S64-4
JEIDA/JCSD-Channel 0 Monosyllables LDC96S64-5
JEIDA/JCSD-Channel 1 (Complete) LDC96S65
JEIDA/JCSD-Channel 1 City Names LDC96S65-1
JEIDA/JCSD-Channel 1 Control Words LDC96S65-2
JEIDA/JCSD-Channel 1 Isolated Digits LDC96S65-3
JEIDA/JCSD-Channel 1 Four Digit Seq. LDC96S65-4
JEIDA/JCSD-Channel 1 Monosyllables LDC96S65-5
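The catalog numbers above follow a regular pattern: a base number per channel, plus a numeric suffix per component. As a convenience for scripts that track orders or local copies, the pattern can be captured in a small lookup. This is a sketch only; the function name `catalog_number` and the dictionary names are hypothetical, and the values are copied from the list above.

```python
# Base catalog number for each channel, from the list above.
CHANNEL_BASE = {0: "LDC96S64", 1: "LDC96S65"}

# Suffix for each component; the complete set has no suffix.
COMPONENT_SUFFIX = {
    "complete": "",
    "city names": "-1",
    "control words": "-2",
    "isolated digits": "-3",
    "four digit seq.": "-4",
    "monosyllables": "-5",
}


def catalog_number(channel: int, component: str = "complete") -> str:
    """Return the LDC catalog number for a channel and component (hypothetical helper)."""
    return CHANNEL_BASE[channel] + COMPONENT_SUFFIX[component.lower()]


print(catalog_number(0))                   # LDC96S64
print(catalog_number(1, "Monosyllables"))  # LDC96S65-5
```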
Updates
There are no updates at this time.