The CALLFRIEND project supports the development of language identification technology.
The corpus consists of 60 unscripted telephone conversations, each lasting between 5 and 30 minutes. The corpus also includes documentation describing speaker information (sex, age, education, callee telephone number) and call information (channel quality, number of speakers).
For each conversation, both the caller and callee are native speakers of non-Southern dialects of American English. All calls are domestic and were placed inside the continental United States, Canada, Puerto Rico, or the Dominican Republic.
Callers in the "non-Southern" (or "general") collection of CALLFRIEND American English appear to come from a wide geographic range, based on their own reports of where they were raised (some identified their origins as being in the southeastern U.S.). Regardless of their geographic or ethnic backgrounds, the feature they share is the clear absence of a vowel quality pattern that would distinguish them as speakers of a "Southern" dialect.
Some information was inadvertently left out of the speaker information table and the call information table. Copies of these files are available as CALLINFO.TBL and SPKRINFO.TBL.
Updates

There are no updates at this time.