The CALLFRIEND project supports the development of language identification technology.
The corpus consists of 60 unscripted telephone conversations, lasting between 5 and 30 minutes. The corpus also includes documentation describing speaker information (sex, age, education, callee telephone number) and call information (channel quality, number of speakers).
For each conversation, both the caller and callee are native speakers of Spanish from Caribbean countries. All calls are domestic and were placed inside the continental United States, Canada, Puerto Rico, or the Dominican Republic.
Conversations were labeled as either "Caribbean" or "non-Caribbean" based on particular attributes in the speech of the participants. Callers in the "Caribbean" and "non-Caribbean" collections of CALLFRIEND Spanish were identified primarily on the basis of consonant quality patterns, specifically word-final "s."
Updates: There are no updates at this time.