	-----------------------------------------------------------
	Description of the CallFriend telephone speech
			  corpus for Spanish
	-----------------------------------------------------------

	July, 1997

CONTENTS

	1. Summary abstract
	2. Data acquisition
	3. Data verification
	4. Speaker demographics
	5. Dialect Audit

-----------------------------------------------------------------------
1.  Summary abstract

	The CallFriend Spanish corpus of telephone speech was collected
by the Linguistic Data Consortium primarily in support of the project on 
Language Identification (LID), sponsored by the U.S. Department of Defense.

	This release of the CallFriend Spanish corpus consists of 60
unscripted telephone conversations for each dialect group, all between
native speakers of Spanish.  The recorded conversations last up to 30
minutes. All speakers were aware that they were being recorded. They
were given no guidelines concerning what they should talk about.  Once
a caller was recruited to participate, he/she was given a free choice
of whom to call.  Most participants called family members or close
friends.  All calls originated in the United States.


-----------------------------------------------------------------------
2.  Data acquisition

	Speakers were solicited by the LDC to participate in this
telephone speech collection effort via the internet, publications
(advertisements), and personal contacts.  A total of 100 call
originators were found per dialect, each of whom placed a telephone
call via a toll-free robot operator maintained by the LDC.  Access to
the robot
operator was possible via a unique Personal Identification Number
(PIN) issued by the recruiting staff at the LDC when the caller
enrolled in the project.  The participants were made aware that their
telephone call would be recorded, as were the call recipients.  The
call was allowed only if both parties agreed to be recorded.  Each
caller was allowed to talk for up to 30 minutes.  Upon successful
completion of the call, the caller was paid $20 (in addition to making
a free long-distance telephone call).  Each caller was allowed to
place only one telephone call.


-----------------------------------------------------------------------
3.  Data verification

	After a successful call was completed, a human audit of each
telephone call was conducted to verify that the proper language was
spoken, and to check the quality of the recording.  The information from
this audit may be found in the file "callinfo.tbl", and its contents
are described in greater detail in "callinfo.doc".

-----------------------------------------------------------------------
4.  Speaker demographics

	Information on speaker demographics can be found in the file
"spkrinfo.tbl", whose contents are described in the file "spkrinfo.doc".

-----------------------------------------------------------------------
5.  Dialect Audit

	A second audit was conducted by a native speaker familiar
with dialect variation in Spanish.  Conversations were
labeled as either "Caribbean" or "non-Caribbean" based on particular
attributes in the speech of the participants.

	Callers in the "Caribbean" and "non-Caribbean" collections of
CallFriend Spanish were identified primarily on the basis of consonant
quality patterns, specifically the pronunciation of word-final "s".

-------------