CALLHOME American English Speech
|Item Name:||CALLHOME American English Speech|
|Author(s):||Alexandra Canavan, David Graff, George Zipperlen|
|LDC Catalog No.:||LDC97S42|
|Sample Type:||2-channel ulaw|
|Data Source(s):||telephone conversations|
|Project(s):||Hub5-LVCSR, GALE, EARS|
|License(s):||LDC User Agreement for Non-Members|
|Online Documentation:||LDC97S42 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Canavan, Alexandra, David Graff, and George Zipperlen. CALLHOME American English Speech LDC97S42. DVD. Philadelphia: Linguistic Data Consortium, 1997.|
CALLHOME American English Speech was developed by the Linguistic Data Consortium (LDC) and consists of 120 unscripted 30-minute telephone conversations between native speakers of English.
All calls originated in North America; 90 of the 120 calls were placed to various locations outside of North America, while the remaining 30 calls were made within North America. Most participants called family members or close friends.
This corpus contains speech data files with documentation describing their contents and format along with the software packages needed to uncompress the speech data. Corresponding transcripts and documentation (LDC97T14) are available separately, as is an associated lexicon (LDC97L20).
The "shorten" and "sphere" directories have been removed.
The sphere directory contained NIST "SPeech HEader REsources" (SPHERE): C-language source code libraries and utilities for manipulating NIST SPHERE-format waveform files.
The shorten directory contained files for the "shorten" software for speech compression.
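Since the SPHERE utilities are no longer bundled, the plain-text header of a SPHERE-format file can still be inspected with a few lines of Python. The following is a minimal sketch, not the NIST library: it assumes the usual SPHERE layout (a `NIST_1A` magic line, a header-size line, then `name -type value` lines terminated by `end_head`). The function name `parse_sphere_header` and the field values in the sample are illustrative assumptions, not taken from this corpus. It reads only the header and does not decompress or decode the audio samples.

```python
def parse_sphere_header(data: bytes) -> dict:
    """Parse the plain-text header at the start of a NIST SPHERE file.

    Hypothetical helper for illustration; it is not part of the NIST tools.
    """
    # Line 1 is the magic string, line 2 is the header size in bytes.
    first_lines = data.split(b"\n", 2)
    if len(first_lines) < 3 or first_lines[0].strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE header")
    header_size = int(first_lines[1].strip())

    text = data[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        key, type_flag, value = parts
        if type_flag == "-i":        # integer field
            fields[key] = int(value)
        elif type_flag == "-r":      # real-valued field
            fields[key] = float(value)
        else:                        # string field, e.g. -s4
            fields[key] = value
    fields["header_size"] = header_size
    return fields


# Synthetic header built for this example (values are assumed, not corpus data).
sample = (
    b"NIST_1A\n   1024\n"
    b"channel_count -i 2\n"
    b"sample_rate -i 8000\n"
    b"sample_coding -s4 ulaw\n"
    b"end_head\n"
).ljust(1024, b" ")

fields = parse_sphere_header(sample)
print(fields)
```

To apply this to a real file, read at least the first 1024 bytes in binary mode and pass them to the function; if the reported `header_size` is larger, read that many bytes and parse again.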