The CALLFRIEND project was designed to support the development of language identification technology.
The corpus consists of 60 telephone conversations, each lasting between 5 and 30 minutes. The corpus also includes documentation describing speaker information (sex, age, education, callee telephone number) and call information (channel quality, number of speakers).
For each conversation, both the caller and callee are native speakers of Korean. All calls are domestic and were placed inside the continental United States and Canada.
Updates
Transcripts for 49 of the 60 calls are now available as CALLFRIEND Korean Transcripts (LDC2003T08). An additional 51 calls have been published as CALLFRIEND Korean Speech Supplement (LDC2003S03).
Portions © 1996 Trustees of the University of Pennsylvania