Korean Telephone Conversations Speech

Item Name: Korean Telephone Conversations Speech
Author(s): Eon-Suk Ko, Na-Rae Han, Alexandra Caravan, George Zipperlen
LDC Catalog No.: LDC2003S03
ISBN: 1-58563-263-5
ISLRN: 977-452-139-220-6
DOI: https://doi.org/10.35111/6bre-jj92
Release Date: May 16, 2003
Member Year(s): 2003
DCMI Type(s): Sound
Sample Type: 2-channel ulaw
Sample Rate: 8000
Data Source(s): telephone conversations
Application(s): speech recognition
Language(s): Korean
Language ID(s): kor
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2003S03 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Ko, Eon-Suk, et al. Korean Telephone Conversations Speech LDC2003S03. Web Download. Philadelphia: Linguistic Data Consortium, 2003.

Introduction

Korean Telephone Conversations Speech was produced by the Linguistic Data Consortium (LDC) and is published under catalog number LDC2003S03 and ISBN 1-58563-263-5.

The telephone conversations in this corpus were originally recorded as part of the CALLFRIEND project. The CALLFRIEND Korean telephone speech was collected by Linguistic Data Consortium primarily in support of the Language Identification (LID) project, sponsored by the U.S. Department of Defense. The calls were later transcribed for use in other projects.

This publication consists of 100 telephone conversations, 49 of which were published in 1996 as CALLFRIEND Korean, while the remaining 51 are previously unexposed calls.

All 100 conversations have been transcribed and are published as Korean Telephone Conversations Transcripts.

The recorded conversations are between native speakers of Korean and last up to 30 minutes, of which between 15 and 18 minutes are transcribed. All speakers were aware that they were being recorded. They were given no guidelines concerning what they should talk about. Once a caller was recruited to participate, he/she was given a free choice of whom to call. Most participants called family members or close friends. All calls originated in either the United States or Canada.

Data

There are 100 speech files, totalling approximately 44 hours of audio. All speech files are in SPHERE format (shorten-compressed), recorded as two-channel ulaw at a sampling rate of 8 kHz.
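
As a rough illustration of the file format described above, the following Python sketch parses the plain-text header that begins a SPHERE file and derives the uncompressed data rate from it. The header contents shown are a hand-written example, not copied from this corpus, and the function names are hypothetical. The sketch only reads header fields; it does not decompress shorten-encoded audio, which needs a separate tool.

```python
def parse_sphere_header(raw: bytes) -> dict:
    """Parse the ASCII header of a NIST SPHERE file into a dict.

    Each header line has the form "<name> -<type> <value>", where the type
    is "-i" (integer), "-r" (real) or "-sN" (string of N characters).
    """
    lines = raw.decode("ascii", errors="replace").split("\n")
    if lines[0].strip() != "NIST_1A":
        raise ValueError("not a SPHERE header")
    fields = {}
    # lines[1] holds the header size in bytes; fields start on the third line.
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        name, type_tag, value = parts
        if type_tag == "-i":
            fields[name] = int(value)
        elif type_tag == "-r":
            fields[name] = float(value)
        else:
            fields[name] = value
    return fields


def uncompressed_bytes_per_second(fields: dict) -> int:
    """Data rate of the decoded audio implied by the header fields."""
    return (fields["sample_rate"]
            * fields["channel_count"]
            * fields["sample_n_bytes"])


# Illustrative header only (an assumption, not taken from the corpus):
# two-channel ulaw at 8000 samples per second, one byte per sample.
EXAMPLE_HEADER = (
    b"NIST_1A\n"
    b"   1024\n"
    b"sample_rate -i 8000\n"
    b"channel_count -i 2\n"
    b"sample_coding -s4 ulaw\n"
    b"sample_n_bytes -i 1\n"
    b"end_head\n"
).ljust(1024)

fields = parse_sphere_header(EXAMPLE_HEADER)
rate = uncompressed_bytes_per_second(fields)
print(fields["sample_rate"], fields["channel_count"], rate)
```

Under these assumptions the decoded audio runs at 16,000 bytes per second, so the approximately 44 hours in this release would occupy roughly 2.5 GB once decompressed.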

Samples

Please listen to this audio sample.

Updates

There are no updates available at this time.
