UCLA High-Speed Laryngeal Video and Audio

Item Name: UCLA High-Speed Laryngeal Video and Audio
Author(s): Gang Chen, Juergen Neubauer, Marc Garellek, Robin Samlan, Bruce R. Gerratt, Jody Kreiman, Abeer Alwan
LDC Catalog No.: LDC2017V01
ISBN: 1-58563-803-X
ISLRN: 810-731-329-467-5
Release Date: June 15, 2017
Member Year(s): 2017
DCMI Type(s): MovingImage, Sound
Sample Type: pcm
Sample Rate: 16000
Data Source(s): video, microphone speech
Application(s): speech synthesis
Language(s): English, Chinese
Language ID(s): eng, zho
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2017V01 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Chen, Gang, et al. UCLA High-Speed Laryngeal Video and Audio LDC2017V01. Hard Drive. Philadelphia: Linguistic Data Consortium, 2017.

Introduction

UCLA High-Speed Laryngeal Video and Audio was developed by the UCLA Speech Processing and Auditory Perception Laboratory. It consists of high-speed laryngeal video recordings of the vocal folds and synchronized audio recordings from nine subjects, collected between April 2012 and April 2013. Speakers were asked to sustain the vowel /i/ for approximately ten seconds while holding voice quality, fundamental frequency, and loudness as steady as possible.

In speech production research, data such as those in this release can be used to study the relationship between vocal fold vibration and the resulting voice quality.

Data

None of the subjects had a history of a voice disorder. There was no native language requirement for recruiting subjects; participants were native speakers of various languages, including English, Mandarin Chinese, Taiwanese Mandarin, Cantonese and German.

Audio data is presented as 16 kHz, 16-bit FLAC, and video is in AVI format at 5 fps (frames per second).
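
As a rough illustration of what the audio format above implies, the following Python sketch computes the sample count and uncompressed PCM size of one recording. It assumes a duration of exactly ten seconds (the description says "approximately") and mono audio, which this entry does not state; the function name pcm_size is hypothetical and not part of any corpus tooling. FLAC files on disk will be smaller because FLAC is compressed.

```python
# Figures taken from the catalog description; see the labeled assumptions.
SAMPLE_RATE_HZ = 16_000   # "Sample Rate: 16000"
BITS_PER_SAMPLE = 16      # 16-bit audio
DURATION_S = 10           # assumption: "approximately ten seconds" taken as exactly 10
CHANNELS = 1              # assumption: mono; the channel count is not stated


def pcm_size(duration_s, sample_rate_hz=SAMPLE_RATE_HZ,
             bits_per_sample=BITS_PER_SAMPLE, channels=CHANNELS):
    """Return (sample_count, byte_count) for uncompressed PCM audio."""
    samples = int(duration_s * sample_rate_hz) * channels
    byte_count = samples * bits_per_sample // 8
    return samples, byte_count


samples, size_bytes = pcm_size(DURATION_S)
print(f"{samples} samples, {size_bytes} bytes uncompressed")
```

Under these assumptions, one sustained vowel amounts to 160,000 samples, or 320,000 bytes of PCM data before FLAC compression.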

Samples

Please view this video sample and this audio sample.

Updates

None at this time.
