Magic Data Chinese Mandarin Conversational Speech
Item Name: | Magic Data Chinese Mandarin Conversational Speech |
Author(s): | Beijing Magic Data Technology Co. |
LDC Catalog No.: | LDC2019S23 |
ISBN: | 1-58563-911-7 |
ISLRN: | 636-430-467-703-3 |
DOI: | https://doi.org/10.35111/zrz3-fw98 |
Release Date: | December 05, 2019 |
Member Year(s): | 2019 |
DCMI Type(s): | Sound, Text |
Sample Type: | pcm |
Sample Rate: | 16000 |
Data Source(s): | microphone conversation |
Application(s): | speech recognition |
Language(s): | Mandarin Chinese |
Language ID(s): | cmn |
License(s): | LDC User Agreement for Non-Members |
Online Documentation: | LDC2019S23 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Beijing Magic Data Technology Co. Magic Data Chinese Mandarin Conversational Speech LDC2019S23. Web Download. Philadelphia: Linguistic Data Consortium, 2019. |
Introduction
Magic Data Chinese Mandarin Conversational Speech was developed by Beijing Magic Data Technology Co., Ltd. and consists of approximately 10 hours of Mandarin conversational speech from 60 speakers. Each conversation was recorded on multiple devices and is presented in multiple forms, resulting in a total of approximately 60 hours of audio with corresponding transcripts.
Data
All participants were native Mandarin speakers from Mainland China, drawn from accent regions across the country. Speakers were paired for conversations on a range of topics, including travel, fitness, games, sports and pets.
Speech data was recorded on mobile devices and is presented as 16 kHz, 16-bit PCM WAV audio with FLAC compression. Most files are single channel; however, a stereo version of each conversation is also included.
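As a quick sanity check on the audio format described above, the following minimal sketch reads the sample rate, channel count and bit depth from the STREAMINFO block at the start of a FLAC stream, using only the Python standard library. It assumes the files are native FLAC streams that begin with the `fLaC` marker (it does not handle a prepended ID3 tag), and the names `StreamInfo`, `parse_flac_streaminfo` and `read_flac_streaminfo` are illustrative, not part of this release.

```python
import struct
from typing import NamedTuple


class StreamInfo(NamedTuple):
    sample_rate: int      # Hz
    channels: int         # 1 = mono, 2 = stereo
    bits_per_sample: int
    total_samples: int    # per channel; 0 means "unknown" in FLAC


def parse_flac_streaminfo(header: bytes) -> StreamInfo:
    """Decode the mandatory STREAMINFO block from the first 42 bytes of a FLAC stream."""
    if len(header) < 42 or header[:4] != b"fLaC":
        raise ValueError("not a native FLAC stream")
    # Byte 4: 1-bit "last block" flag followed by a 7-bit block type (0 = STREAMINFO).
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # Bytes 18..25 pack: 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    (packed,) = struct.unpack(">Q", header[18:26])
    return StreamInfo(
        sample_rate=packed >> 44,
        channels=((packed >> 41) & 0x7) + 1,
        bits_per_sample=((packed >> 36) & 0x1F) + 1,
        total_samples=packed & ((1 << 36) - 1),
    )


def read_flac_streaminfo(path: str) -> StreamInfo:
    """Read only the stream header from disk; `path` is a hypothetical file name."""
    with open(path, "rb") as f:
        return parse_flac_streaminfo(f.read(42))
```

Calling `read_flac_streaminfo` on each file and checking that `sample_rate` is 16000 and `bits_per_sample` is 16 would confirm the stated format, while `channels` separates the single-channel files from the stereo versions. Dividing `total_samples` by `sample_rate` gives the duration in seconds when the sample count is recorded.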
Transcripts are provided as UTF-8 encoded plain-text TextGrid files. Metadata such as topic, collection date, mobile device and speaker demographic information is found in the documentation accompanying this release.
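To illustrate how such transcripts might be read, here is a minimal sketch that extracts time-aligned intervals from a TextGrid. It assumes the files use Praat's long ("full") text layout, with `intervals [n]:` entries followed by `xmin`, `xmax` and `text` fields; the release documentation should be checked for the actual layout and tier names. `Interval`, `parse_textgrid_intervals` and `load_textgrid` are hypothetical names introduced for this example.

```python
import re
from typing import List, NamedTuple


class Interval(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


# Matches one "intervals [n]:" entry in the long TextGrid layout (an assumption).
# Inside quoted text, a literal double quote is written as two double quotes.
_INTERVAL_RE = re.compile(
    r'intervals\s*\[\d+\]:?\s*'
    r'xmin\s*=\s*([0-9.eE+-]+)\s*'
    r'xmax\s*=\s*([0-9.eE+-]+)\s*'
    r'text\s*=\s*"((?:[^"]|"")*)"'
)


def parse_textgrid_intervals(content: str) -> List[Interval]:
    """Return every interval in file order, flattened across all tiers."""
    return [
        Interval(float(xmin), float(xmax), text.replace('""', '"'))
        for xmin, xmax, text in _INTERVAL_RE.findall(content)
    ]


def load_textgrid(path: str) -> List[Interval]:
    """Read a UTF-8 TextGrid; 'utf-8-sig' also tolerates a leading byte order mark."""
    with open(path, encoding="utf-8-sig") as f:
        return parse_textgrid_intervals(f.read())
```

Because this sketch flattens all tiers into one list, it suits quick inspection, for example filtering out empty intervals to list the spoken segments. Per-speaker processing would need to track the enclosing `item [n]:` tier as well.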
Samples
Please view this stereo speech sample and transcript sample.
Updates
None at this time.