The CALLFRIEND project supports the development of language identification technology.
The corpus consists of 60 unscripted telephone conversations, each lasting between 5 and 30 minutes. The corpus also includes documentation describing speaker information (sex, age, education, callee telephone number) and call information (channel quality, number of speakers).
For each conversation, both the caller and callee are native speakers of Southern American English. All calls are domestic and were placed inside the continental United States, Canada, Puerto Rico or the Dominican Republic.
Callers in the "Southern" collection of CALLFRIEND American English were identified primarily on the basis of vowel quality patterns that are common among native speakers raised in the southeastern United States (from Texas eastward to the Atlantic coast and from Virginia and Kentucky southward to the Gulf of Mexico). This category also includes a small number of African-American speakers, whose geographic origins may be more dispersed, but who share some of the vowel quality patterns distinctive of Southern white speakers. (Of course, other dialect features involving phonology, syntax and prosody serve to differentiate these two subgroups within the "Southern" collection.)
Updates
There are no updates at this time.